XGBoost Baseline — NJ Housing Price Prediction
An XGBoost regressor trained on 7 structured features from NJ housing data. It serves as a baseline for comparison with a QLoRA fine-tuned Qwen2.5-0.5B LLM.
Metrics (held-out test set, n=1,050)
| Metric | XGBoost | QLoRA (Qwen2.5-0.5B) |
|---|---|---|
| MAE | $128,013 | $140,141 |
| RMSE | $168,135 | $190,172 |
| R² | 0.7154 | 0.6359 |
| MAPE | 22.7% | 23.0% |
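The card does not say how these metrics were computed. As a rough sketch, assuming the standard definitions of MAE, RMSE, R² and MAPE, they can be reproduced from paired true and predicted prices as follows. The function name `regression_metrics` and the sample prices are illustrative, not taken from the original evaluation code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, R^2 and MAPE (in percent) for paired price lists."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot

    # MAPE is undefined when a true price is zero; sale prices are assumed positive.
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    return {"mae": mae, "rmse": rmse, "r2": r2, "mape": mape}


# Made-up prices for illustration only, not rows from the test set.
y_true = [300_000, 450_000, 600_000, 750_000]
y_pred = [330_000, 420_000, 630_000, 720_000]
metrics = regression_metrics(y_true, y_pred)
```

Because MAPE weights each error by the true price, a model can have a noticeably lower MAE and still a similar MAPE if its larger errors fall on expensive homes.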
Best Hyperparameters
```json
{
  "learning_rate": 0.01,
  "max_depth": 4,
  "n_estimators": 500
}
```
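For anyone retraining the baseline, a minimal sketch of passing these values to `XGBRegressor` might look like this. The helper `build_regressor` is hypothetical, and settings the card does not report (random seed, objective, early stopping) are left at the library defaults.

```python
import json

# The tuned values reported above, copied verbatim.
BEST_PARAMS = json.loads(
    '{"learning_rate": 0.01, "max_depth": 4, "n_estimators": 500}'
)


def build_regressor(overrides=None):
    """Create an unfitted XGBRegressor from the reported hyperparameters.

    xgboost is imported lazily so the parameters can be inspected
    even where the library is not installed.
    """
    from xgboost import XGBRegressor

    params = {**BEST_PARAMS, **(overrides or {})}
    return XGBRegressor(**params)


# Hypothetical usage; X_train and y_train are not defined in this card.
# model = build_regressor({"random_state": 0})
# model.fit(X_train, y_train)
```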
Features
| Feature | Type |
|---|---|
| bedrooms | int |
| bathrooms | float |
| sqft | int |
| lot_size | float |
| year_built | int |
| zip_code | int (ordinal) |
| property_type | one-hot encoded |
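The table gives the feature types but not the exact column order or the set of `property_type` categories used in training. The sketch below shows one way to turn a raw listing into a numeric row under those assumptions. `PROPERTY_TYPES`, `NUMERIC_COLUMNS` and `encode_listing` are placeholders; check them against the training dataset before using the output for prediction.

```python
# Hypothetical category list and column order: the card publishes neither.
PROPERTY_TYPES = ["condo", "single_family", "townhouse"]
NUMERIC_COLUMNS = ["bedrooms", "bathrooms", "sqft", "lot_size", "year_built", "zip_code"]


def encode_listing(listing):
    """Turn one listing dict into a flat numeric feature row."""
    kind = listing["property_type"]
    if kind not in PROPERTY_TYPES:
        raise ValueError(f"unknown property_type: {kind!r}")

    # Numeric columns pass through unchanged; zip_code is treated as a plain integer.
    row = [float(listing[name]) for name in NUMERIC_COLUMNS]
    # One-hot: a single 1.0 in the slot matching the property type.
    row.extend(1.0 if kind == candidate else 0.0 for candidate in PROPERTY_TYPES)
    return row


# Made-up listing for illustration only.
example = {
    "bedrooms": 3,
    "bathrooms": 2.5,
    "sqft": 1800,
    "lot_size": 0.25,
    "year_built": 1995,
    "zip_code": 7030,
    "property_type": "single_family",
}
row = encode_listing(example)
```

Note that storing ZIP codes as integers drops the leading zero that NJ ZIP codes carry (07030 becomes 7030), and an ordinal encoding treats numerically close ZIP codes as similar, which may or may not reflect actual price patterns.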
Usage
```python
from xgboost import XGBRegressor
from huggingface_hub import hf_hub_download

path = hf_hub_download("rajkumar4466/nj-housing-xgboost-baseline", "xgboost_baseline.json")
model = XGBRegressor()
model.load_model(path)

# Predict (features must be encoded the same way as training)
# model.predict(X)
```
Dataset
Trained on the rajkumar4466/nj-housing-prices-tabular dataset.