Regression task to predict house sale prices for King County, including Seattle, between May 2014 and May 2015.
Contains 19 features and 21613 observations.
Target column is "price".
Pre-processing
Id column has been removed.
Dates in column "date" have been converted from strings to POSIXct.
Values 0 in feature "yr_renovated" have been replaced with NA.
Values 0 in feature "sqft_basement" have been replaced with NA.
Feature "waterfront" has been converted to logical.
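The pre-processing steps listed above can be sketched in base R roughly as follows. This is an illustration on a small hypothetical data frame, not the script used to build the packaged data: the `raw` object, its three rows, the 0/1 coding of "waterfront", and the date string format `"%Y%m%dT%H%M%S"` are all assumptions.

```r
# Hypothetical raw rows; column layout and date format are assumptions.
raw <- data.frame(
  id = c(1, 2, 3),
  date = c("20141013T000000", "20141209T000000", "20150225T000000"),
  price = c(221900, 538000, 180000),
  yr_renovated = c(0L, 1991L, 0L),
  sqft_basement = c(0L, 400L, 0L),
  waterfront = c(0L, 0L, 1L),
  stringsAsFactors = FALSE
)

kc <- raw

# Remove the id column.
kc$id <- NULL

# Convert date strings to POSIXct (format and time zone are assumed).
kc$date <- as.POSIXct(kc$date, format = "%Y%m%dT%H%M%S", tz = "UTC")

# Treat 0 as "not applicable" and replace it with NA.
kc$yr_renovated[kc$yr_renovated == 0L] <- NA
kc$sqft_basement[kc$sqft_basement == 0L] <- NA

# Convert the assumed 0/1 indicator to logical.
kc$waterfront <- as.logical(kc$waterfront)

str(kc)
```

Replacing 0 with NA keeps "never renovated" and "no basement" distinct from genuine numeric values, so a learner or imputation step can handle them explicitly.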
Examples
data("kc_housing", package = "mlr3data")
str(kc_housing)
#> 'data.frame': 21613 obs. of 20 variables:
#> $ date : POSIXct, format: "2014-10-13" "2014-12-09" ...
#> $ price : num 221900 538000 180000 604000 510000 ...
#> $ bedrooms : int 3 3 2 4 3 4 3 3 3 3 ...
#> $ bathrooms : num 1 2.25 1 3 2 4.5 2.25 1.5 1 2.5 ...
#> $ sqft_living : int 1180 2570 770 1960 1680 5420 1715 1060 1780 1890 ...
#> $ sqft_lot : int 5650 7242 10000 5000 8080 101930 6819 9711 7470 6560 ...
#> $ floors : num 1 2 1 1 1 1 2 1 1 2 ...
#> $ waterfront : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
#> $ view : int 0 0 0 0 0 0 0 0 0 0 ...
#> $ condition : int 3 3 3 5 3 3 3 3 3 3 ...
#> $ grade : int 7 7 6 7 8 11 7 7 7 7 ...
#> $ sqft_above : int 1180 2170 770 1050 1680 3890 1715 1060 1050 1890 ...
#> $ sqft_basement: int NA 400 NA 910 NA 1530 NA NA 730 NA ...
#> $ yr_built : int 1955 1951 1933 1965 1987 2001 1995 1963 1960 2003 ...
#> $ yr_renovated : int NA 1991 NA NA NA NA NA NA NA NA ...
#> $ zipcode : int 98178 98125 98028 98136 98074 98053 98003 98198 98146 98038 ...
#> $ lat : num 47.5 47.7 47.7 47.5 47.6 ...
#> $ long : num -122 -122 -122 -122 -122 ...
#> $ sqft_living15: int 1340 1690 2720 1360 1800 4760 2238 1650 1780 2390 ...
#> $ sqft_lot15 : int 5650 7639 8062 5000 7503 101930 6819 9711 8113 7570 ...
#> - attr(*, "index")= int(0)
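
Because the target column is "price", the data can be wrapped as a regression task. The following is a minimal sketch that assumes the mlr3 and mlr3data packages are installed; the task id "kc_housing" is an arbitrary choice for illustration, and the code is skipped if either package is missing.

```r
# Sketch only: requires the mlr3 and mlr3data packages (an assumption).
if (requireNamespace("mlr3", quietly = TRUE) &&
    requireNamespace("mlr3data", quietly = TRUE)) {
  data("kc_housing", package = "mlr3data")

  # Build a regression task with "price" as the target column.
  task <- mlr3::as_task_regr(kc_housing, target = "price", id = "kc_housing")
  print(task)

  # Count missing values per column, e.g. those introduced by replacing 0 with NA.
  print(colSums(is.na(kc_housing)))
}
```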