Building a House Price Prediction Model Using Random Forest in Python

March 19, 2025 | By touch@creative-moon.com

1. Setting Up the Environment

To build a house price prediction model, we first set up the Python environment and installed necessary libraries.

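# Install required libraries (the leading "!" runs a shell command from a Jupyter or Colab cell;
# drop it when running pip in a regular terminal)
!pip install numpy pandas scikit-learn matplotlib seaborn joblib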

2. Loading and Exploring the Dataset

We used a housing dataset from Toronto and performed initial data exploration.

import pandas as pd

# Load the dataset
df = pd.read_csv("houses.csv")

# Display dataset structure
print(df.info())
print(df.head())

3. Data Preprocessing

We handled missing values and transformed categorical features.

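# Fill missing values with the column median
# (assigning back avoids the chained inplace call, which newer pandas versions warn about)
df["sqft"] = df["sqft"].fillna(df["sqft"].median())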

# Convert categorical variables
df = pd.get_dummies(df, columns=["type", "city_district"], drop_first=True)

# Transform price into logarithmic scale
import numpy as np
df["final_price_log"] = np.log(df["final_price"])
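
To see what these three steps do, here is a minimal sketch on a made-up four-row table. The column names mirror the ones above, but the values are invented for illustration and are not taken from the Toronto dataset.

```python
import numpy as np
import pandas as pd

# Toy data (invented values) using the same column names as above
toy = pd.DataFrame({
    "sqft": [700.0, None, 1200.0, 900.0],
    "type": ["Condo", "Detached", "Condo", "Semi-Detached"],
    "city_district": ["Downtown", "Annex", "Downtown", "Annex"],
    "final_price": [500000, 1200000, 750000, 900000],
})

# The missing sqft value is replaced by the median of the known values
toy["sqft"] = toy["sqft"].fillna(toy["sqft"].median())

# drop_first=True drops the first level of each column so the dummies are not redundant
toy = pd.get_dummies(toy, columns=["type", "city_district"], drop_first=True)

# The log transform compresses the long right tail of prices
toy["final_price_log"] = np.log(toy["final_price"])

print(toy.columns.tolist())
```

Applying np.exp to final_price_log recovers the original price, which is what the prediction step relies on later.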

4. Splitting the Dataset

We split the data into training and testing sets.

from sklearn.model_selection import train_test_split

X = df.drop(columns=["final_price", "final_price_log"])
y = df["final_price_log"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

5. Training the Random Forest Model

from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Train the model
rf_model = RandomForestRegressor(n_estimators=100, random_state=42)
rf_model.fit(X_train, y_train)

# Predictions
y_pred = rf_model.predict(X_test)

# Evaluate the model (y_test holds log prices, so both metrics are on the log scale)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Squared Error: {mse:.4f}")
print(f"R² Score: {r2:.4f}")
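
Because the model is trained on final_price_log, the MSE above is in squared log units. A rough sketch of how the same predictions could be scored in dollars follows. The helper name dollar_scale_errors is hypothetical, and the arrays in the usage lines are made-up numbers standing in for y_test and y_pred.

```python
import numpy as np

def dollar_scale_errors(y_true_log, y_pred_log):
    """Return (MAE in dollars, mean absolute percentage error) for log-price predictions."""
    # Undo the log transform to get back to dollar amounts
    y_true = np.exp(np.asarray(y_true_log, dtype=float))
    y_pred = np.exp(np.asarray(y_pred_log, dtype=float))
    abs_err = np.abs(y_true - y_pred)
    mae = abs_err.mean()
    mape = (abs_err / y_true).mean() * 100
    return mae, mape

# Made-up example values, not results from the real model
demo_true = np.log([500_000, 800_000, 1_200_000])
demo_pred = np.log([550_000, 760_000, 1_200_000])
mae, mape = dollar_scale_errors(demo_true, demo_pred)
print(f"MAE: ${mae:,.0f}  MAPE: {mape:.1f}%")
```

With the real model, the call would be dollar_scale_errors(y_test, y_pred).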

6. Model Performance

Our Random Forest model achieved:

• Mean Squared Error (MSE): 0.0253
• R² Score: 0.8859
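
Since the target is the natural log of the price, the MSE is in log units rather than dollars. One rough way to read it, sketched below, is to take the square root and exponentiate, which gives a typical multiplicative error factor. This is an interpretation aid that assumes errors are roughly symmetric in log space, not a metric reported by the original run.

```python
import math

mse_log = 0.0253  # MSE reported above, computed on log(price)

rmse_log = math.sqrt(mse_log)        # typical error in log units
typical_factor = math.exp(rmse_log)  # the same error expressed as a multiplicative factor

print(f"RMSE (log units): {rmse_log:.3f}")
print(f"Typical error factor: x{typical_factor:.2f}")
```

Under that assumption, a typical prediction is off by a factor of about 1.17, that is, roughly 17% above or below the actual price.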

7. Saving the Model

To reuse the trained model, we saved it using joblib.

import joblib

# Save the model
joblib.dump(rf_model, "random_forest_model.pkl")

print("Model saved successfully")
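
Predictions made later need the input columns in the same order as at training time. One option, sketched here on synthetic data, is to store the column list next to the model in a single file. The file name model_bundle.pkl, the dictionary keys, and the toy feature names are all hypothetical; this is not how the model above was saved.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the training data (invented features and prices)
rng = np.random.default_rng(42)
X_demo = pd.DataFrame({
    "sqft": rng.uniform(400, 3000, size=50),
    "bedrooms": rng.integers(1, 6, size=50),
})
y_demo = np.log(200_000 + 300 * X_demo["sqft"])

model = RandomForestRegressor(n_estimators=10, random_state=42).fit(X_demo, y_demo)

# Save the model together with the exact column order used for training
bundle = {"model": model, "columns": list(X_demo.columns)}
path = os.path.join(tempfile.mkdtemp(), "model_bundle.pkl")
joblib.dump(bundle, path)

# Reload and confirm the column list survived the round trip
restored = joblib.load(path)
print(restored["columns"])
```

Keeping both pieces in one file means the loading code does not depend on X_train still being in memory.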

8. Loading the Model and Making Predictions

We reloaded the saved model and made predictions.

# Load the saved model
rf_model_loaded = joblib.load("random_forest_model.pkl")

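# Sample input (example house): the first five values fill the first five feature columns,
# and every remaining column (mostly one-hot dummies) is set to 0.
# The padding length is computed so the row always matches the number of training columns.
sample_house = pd.DataFrame(
    [[800, 2, 1, 2, 60000] + [0] * (len(X_train.columns) - 5)],
    columns=X_train.columns,
)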

# Predict the price
predicted_price_log = rf_model_loaded.predict(sample_house)
predicted_price = np.exp(predicted_price_log)[0]

print(f"Predicted House Price: ${predicted_price:,.2f}")
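
Building the input by position is easy to get wrong when there are many dummy columns. A sketch of an alternative: name only the features you know and let reindex fill the rest with zeros. The column names below are hypothetical stand-ins for X_train.columns, since the real feature list is not shown in this post.

```python
import pandas as pd

# Hypothetical training columns (stand-in for X_train.columns)
train_columns = ["sqft", "bedrooms", "bathrooms", "type_Detached", "city_district_Downtown"]

# Describe the house by feature name; columns not listed here default to 0
house = {"sqft": 800, "bedrooms": 2, "bathrooms": 1, "city_district_Downtown": 1}
sample = pd.DataFrame([house]).reindex(columns=train_columns, fill_value=0)

print(sample)
```

The resulting one-row DataFrame has the training column order regardless of the order of keys in the dictionary, so it can be passed to predict directly.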
