3:00

Linear Regression

The Foundation of Machine Learning

3-Minute Complete Guide

🎯 What is Linear Regression?

Linear regression finds the best straight line through data points to predict continuous values.

Formula: y = mx + b

💻 Quick Implementation

import numpy as np from sklearn.linear_model import LinearRegression import matplotlib.pyplot as plt # Sample data: Hours studied vs Exam score X = np.array([[1], [2], [3], [4], [5]]) # Hours y = np.array([50, 60, 70, 80, 90]) # Scores # Create and train model model = LinearRegression() model.fit(X, y) # Make prediction prediction = model.predict([[6]]) # 6 hours print(f"Predicted score: {prediction[0]:.1f}")

📊 Real Example

Problem: Predict house prices based on size

# House data sizes = np.array([[1000], [1500], [2000], [2500]]) # sq ft prices = np.array([200000, 300000, 400000, 500000]) # dollars # Train model house_model = LinearRegression() house_model.fit(sizes, prices) # Predict price for 1800 sq ft house new_price = house_model.predict([[1800]]) print(f"1800 sq ft house price: ${new_price[0]:,.0f}") # Output: 1800 sq ft house price: $360,000

⚡ Key Points

✅ When to Use:

  • Predicting continuous values
  • Linear relationships
  • Simple baseline model

❌ Limitations:

  • Only linear relationships
  • Sensitive to outliers
  • Assumes normal distribution

🚀 Complete Working Example

import pandas as pd from sklearn.model_selection import train_test_split from sklearn.metrics import mean_squared_error, r2_score # Create sample dataset data = { 'experience': [1, 2, 3, 4, 5, 6, 7, 8], 'salary': [30000, 35000, 40000, 45000, 50000, 55000, 60000, 65000] } df = pd.DataFrame(data) # Split data X = df[['experience']] y = df['salary'] X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3) # Train and evaluate model = LinearRegression() model.fit(X_train, y_train) predictions = model.predict(X_test) print(f"R² Score: {r2_score(y_test, predictions):.2f}") print(f"Equation: Salary = {model.coef_[0]:.0f} * Experience + {model.intercept_:.0f}")

🎓 Summary

Linear Regression is the simplest ML algorithm that draws a line through data to make predictions. Perfect for beginners and often surprisingly effective!

Next Steps: Try polynomial regression, multiple features, or regularization techniques.