Logistic Regression in Python – Testing

After building and training a Logistic Regression classifier, the next crucial step is testing the model. Testing allows us to evaluate how well the model performs on unseen data and whether it is reliable for real-world predictions.

A machine learning model is only useful if it performs well on data it has never seen before. That is why testing is a key part of the machine learning workflow.

In this tutorial, you will learn how to test a Logistic Regression model in Python using Scikit-Learn and interpret the results effectively.

Why Testing is Important

Testing helps to:

Measure model accuracy
Detect overfitting or underfitting
Evaluate real-world performance
Compare different models
Improve decision-making

Without testing, we cannot trust model predictions.

Testing Workflow Overview

A typical Logistic Regression testing process includes:

1. Load Trained Model
2. Prepare and Scale Test Data
3. Make Predictions
4. Compare with Actual Values
5. Evaluate Metrics
6. Interpret Results

Step 1: Load Trained Model

If the model is already trained, you can reuse it.

import joblib

model = joblib.load("models/logistic_model.pkl")
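Loading only works if the model was saved at training time. The following is a minimal sketch of that save-and-reload round trip, assuming synthetic training data and a temporary directory standing in for the "models/" folder; the file name "scaler.pkl" is a hypothetical choice, not something the training tutorial necessarily used.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Hypothetical training data: two numeric features and a binary label.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)

# Fit the scaler and the model on the training data only.
scaler = StandardScaler().fit(X_train)
model = LogisticRegression().fit(scaler.transform(X_train), y_train)

# Persist both objects; a temporary directory stands in for "models/".
model_dir = Path(tempfile.mkdtemp())
joblib.dump(model, model_dir / "logistic_model.pkl")
joblib.dump(scaler, model_dir / "scaler.pkl")

# Later, at test time: load them back and check the predictions are unchanged.
loaded_model = joblib.load(model_dir / "logistic_model.pkl")
loaded_scaler = joblib.load(model_dir / "scaler.pkl")
same = np.array_equal(
    model.predict(scaler.transform(X_train)),
    loaded_model.predict(loaded_scaler.transform(X_train)),
)
print("Predictions identical after reload:", same)
```

Saving the scaler next to the model means the test script can apply exactly the preprocessing the model was trained with.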

Step 2: Prepare Test Data

Ensure test data is properly preprocessed.

import pandas as pd

data = pd.read_csv("data/customers_test.csv")

X_test = data[['Age', 'Salary']]
y_test = data['Purchased']
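Before predicting, it can help to confirm that the test data has the expected columns and no missing values. Here is a small sketch of such a check, using an in-memory DataFrame with made-up rows in place of data/customers_test.csv.

```python
import pandas as pd

# Made-up rows standing in for data/customers_test.csv.
data = pd.DataFrame({
    "Age": [25, 47, 35, 52],
    "Salary": [30000, 82000, 56000, 91000],
    "Purchased": [0, 1, 0, 1],
})

# Fail early if a column the model depends on is absent.
required = ["Age", "Salary", "Purchased"]
missing_cols = [col for col in required if col not in data.columns]
if missing_cols:
    raise ValueError(f"Missing columns: {missing_cols}")

# Count missing values across the required columns.
n_missing = int(data[required].isna().sum().sum())
print("Missing values:", n_missing)

X_test = data[["Age", "Salary"]]
y_test = data["Purchased"]
print(X_test.shape, y_test.shape)
```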

Step 3: Feature Scaling

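Use the same scaler that was fitted on the training data. Refitting a scaler on the test set computes a different mean and standard deviation, so the model would receive inputs on a different scale than the one it was trained on.

import joblib

# Assumes the training scaler was saved alongside the model;
# "models/scaler.pkl" is a hypothetical path.
scaler = joblib.load("models/scaler.pkl")

# Only transform the test data; never fit on it.
X_test = scaler.transform(X_test)

Important: Call transform(), not fit_transform(), on test data.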
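To see why the choice of scaler matters, the sketch below scales the same test rows twice: once with a scaler fitted on the training data and once with a scaler refitted on the test data. The numbers are invented for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Invented Age/Salary values; the test rows are older and better paid.
X_train = np.array([[20.0, 20000.0], [30.0, 40000.0],
                    [40.0, 60000.0], [50.0, 80000.0]])
X_test = np.array([[60.0, 90000.0], [70.0, 100000.0]])

# Correct: reuse the statistics learned from the training data.
train_scaler = StandardScaler().fit(X_train)
X_test_correct = train_scaler.transform(X_test)

# Incorrect: refitting centers the test set on its own mean.
X_test_refit = StandardScaler().fit_transform(X_test)

print("Scaled with training scaler:\n", X_test_correct)
print("Scaled with refitted scaler:\n", X_test_refit)
```

With the training scaler, both test rows come out well above zero, which reflects that they lie above the training mean. The refitted scaler maps them to -1 and 1, hiding that information from the model.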

Step 4: Make Predictions

Use the trained model to predict outcomes.

y_pred = model.predict(X_test)

print(y_pred)

Example output:

[0 0 1 1 0 1]

Step 5: Compare Predictions

Compare predicted values with actual values.

comparison = pd.DataFrame({
    "Actual": y_test,
    "Predicted": y_pred
})

print(comparison)
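On a larger test set it is usually more practical to look only at the rows where the prediction was wrong. A minimal sketch, using made-up labels and predictions; the "Correct" column is an addition for illustration:

```python
import pandas as pd

# Made-up actual labels and predictions.
y_test = pd.Series([0, 0, 1, 1, 0, 1], name="Purchased")
y_pred = [0, 0, 1, 0, 0, 1]

comparison = pd.DataFrame({"Actual": y_test, "Predicted": y_pred})

# Flag each row, then keep only the misclassified ones.
comparison["Correct"] = comparison["Actual"] == comparison["Predicted"]
errors = comparison[~comparison["Correct"]]

print(errors)
print("Misclassified rows:", len(errors))
```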

Step 6: Calculate Accuracy

Accuracy is the fraction of predictions that match the actual labels.

from sklearn.metrics import accuracy_score

accuracy = accuracy_score(y_test, y_pred)

print("Accuracy:", accuracy)

Example output:

Accuracy: 0.89
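Accuracy can also be computed by hand as the mean of the element-wise matches, which makes clear what accuracy_score reports. A short sketch with made-up labels:

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Made-up labels: 6 of the 8 predictions match.
y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
y_hat = np.array([0, 0, 1, 0, 0, 1, 1, 1])

# Fraction of positions where prediction and label agree.
manual_accuracy = (y_true == y_hat).mean()
library_accuracy = accuracy_score(y_true, y_hat)

print("Manual:", manual_accuracy)    # 0.75
print("Library:", library_accuracy)  # 0.75
```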

Step 7: Confusion Matrix

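The confusion matrix counts predictions by actual class (rows) and predicted class (columns). For binary labels 0 and 1, Scikit-Learn arranges it as [[TN, FP], [FN, TP]].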

from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_test, y_pred)

print(cm)

Example:

[[50  5]
 [ 6 39]]

Interpretation:

True Negatives: 50
False Positives: 5
False Negatives: 6
True Positives: 39
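The four counts are enough to derive the main metrics by hand. The sketch below does so for the example matrix above; the variable names are illustrative.

```python
import numpy as np

# Example confusion matrix from the text: rows = actual, columns = predicted.
cm = np.array([[50, 5],
               [6, 39]])

# ravel() flattens the 2x2 matrix in the order TN, FP, FN, TP.
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()   # 89 / 100
precision = tp / (tp + fp)        # 39 / 44
recall = tp / (tp + fn)           # 39 / 45
f1 = 2 * precision * recall / (precision + recall)

print("Accuracy:", round(accuracy, 3))
print("Precision:", round(precision, 3))
print("Recall:", round(recall, 3))
print("F1-score:", round(f1, 3))
```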

Step 8: Classification Report

from sklearn.metrics import classification_report

print(classification_report(y_test, y_pred))

Output includes:

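Precision: of the samples predicted as a class, the fraction that actually belong to it
Recall: of the samples that actually belong to a class, the fraction predicted as that class
F1-score: the harmonic mean of precision and recall
Support: the number of actual samples of each class in the test set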
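When individual numbers are needed rather than a printed table, the same metrics are available as separate functions, and classification_report can return a dictionary. A minimal sketch with made-up labels:

```python
from sklearn.metrics import (
    classification_report,
    f1_score,
    precision_score,
    recall_score,
)

# Made-up labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_hat = [0, 0, 0, 1, 1, 1, 1, 0]

# By default these functions score the positive class (label 1).
precision = precision_score(y_true, y_hat)
recall = recall_score(y_true, y_hat)
f1 = f1_score(y_true, y_hat)
print(precision, recall, f1)

# output_dict=True returns nested dictionaries keyed by class label (as strings).
report = classification_report(y_true, y_hat, output_dict=True)
print(report["1"])
```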

Step 9: Predict Probability

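Logistic Regression also provides class probabilities. predict_proba returns one row per sample and one column per class, in the order given by model.classes_. For labels 0 and 1, the second column is the probability of class 1.

y_prob = model.predict_proba(X_test)

print(y_prob[:3])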

Example:

[[0.80 0.20]
 [0.30 0.70]
 [0.15 0.85]]
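Probabilities make it possible to pick a decision threshold other than 0.5, for example when false positives are costlier than false negatives. The sketch below applies three thresholds to a hand-written probability array; the threshold values are arbitrary choices for illustration.

```python
import numpy as np

# Hand-written probabilities in the same layout as predict_proba output:
# column 0 = P(class 0), column 1 = P(class 1).
y_prob = np.array([[0.80, 0.20],
                   [0.30, 0.70],
                   [0.15, 0.85],
                   [0.55, 0.45]])

p_positive = y_prob[:, 1]

# A 0.5 cutoff is the usual default for a binary classifier.
default_pred = (p_positive >= 0.5).astype(int)
# A stricter cutoff predicts class 1 only when the model is very confident.
strict_pred = (p_positive >= 0.8).astype(int)
# A more lenient cutoff catches more positives at the cost of more false alarms.
lenient_pred = (p_positive >= 0.4).astype(int)

print(default_pred)  # [0 1 1 0]
print(strict_pred)   # [0 0 1 0]
print(lenient_pred)  # [0 1 1 1]
```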

Understanding Testing Results

Good Model Indicators:

High accuracy (above 80% is a common rule of thumb, but the right bar depends on the problem and the class balance)
Balanced precision and recall
Low false positives and false negatives

Poor Model Indicators:

Low accuracy
High error rates
Overfitting or underfitting
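One common way to spot overfitting or underfitting is to compare accuracy on the training set with accuracy on the test set. The following is a sketch on synthetic data, assuming a simple 75/25 split; a large gap between the two scores suggests overfitting, while low scores on both suggest underfitting.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data: two features, with noise added to the label rule.
rng = np.random.default_rng(42)
X = rng.normal(size=(400, 2))
noise = rng.normal(scale=0.5, size=400)
y = (X[:, 0] + X[:, 1] + noise > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = LogisticRegression().fit(X_train, y_train)

# score() returns mean accuracy for classifiers.
train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
gap = train_acc - test_acc

print("Train accuracy:", round(train_acc, 3))
print("Test accuracy:", round(test_acc, 3))
print("Gap:", round(gap, 3))
```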

Visualizing Test Results

import matplotlib.pyplot as plt

plt.scatter(range(len(y_test)), y_test, color='blue', label='Actual')
plt.scatter(range(len(y_pred)), y_pred, color='red', label='Predicted')

plt.title("Actual vs Predicted Results")
plt.legend()
plt.show()

Full Testing Code Example

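import pandas as pd
import joblib
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Load model and the scaler fitted during training
# ("models/scaler.pkl" is an assumed path; use wherever you saved the scaler)
model = joblib.load("models/logistic_model.pkl")
scaler = joblib.load("models/scaler.pkl")

# Load test data
data = pd.read_csv("data/customers_test.csv")

X_test = data[['Age', 'Salary']]
y_test = data['Purchased']

# Scale with the training scaler (transform only, no refitting)
X_test = scaler.transform(X_test)

# Predict
y_pred = model.predict(X_test)

# Evaluation
print("Accuracy:", accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))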

Best Practices for Testing

Always use unseen data
Load saved scaler and model
Avoid data leakage
Evaluate multiple metrics
Test on real-world data
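One way to follow several of these practices at once is to bundle the scaler and the model into a single Scikit-Learn Pipeline, so the scaler is fitted only on training data and travels with the model. This is a sketch of an alternative to loading two separate files, not the approach used earlier in this tutorial; the synthetic Age and Salary values and the label rule are invented.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Invented data: Age and Salary, with a label driven by both.
rng = np.random.default_rng(1)
age = rng.uniform(18, 65, size=300)
salary = rng.uniform(20000, 120000, size=300)
X = np.column_stack([age, salary])
y = (salary + 1000 * age > 110000).astype(int)

# Hold out unseen data before any fitting takes place.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# fit() fits the scaler on X_train only, then trains the model on scaled data.
pipeline = make_pipeline(StandardScaler(), LogisticRegression())
pipeline.fit(X_train, y_train)

# score() applies the already-fitted scaler to X_test before predicting.
test_accuracy = pipeline.score(X_test, y_test)
print("Test accuracy:", round(test_accuracy, 3))
```

Saving this one pipeline object with joblib.dump would then persist the preprocessing and the model together.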

Common Mistakes

Avoid:

Testing on training data
Refitting scaler on test set
Ignoring evaluation metrics
Relying only on accuracy
Not saving preprocessing steps
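Relying only on accuracy is especially risky when the classes are imbalanced. The sketch below uses invented labels where only 5% of samples are positive, and a "model" that always predicts the negative class.

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# Invented imbalanced labels: 95 negatives and 5 positives.
y_true = np.array([0] * 95 + [1] * 5)

# A useless classifier that predicts class 0 for every sample.
y_hat = np.zeros(100, dtype=int)

acc = accuracy_score(y_true, y_hat)
rec = recall_score(y_true, y_hat)

print("Accuracy:", acc)  # 0.95
print("Recall:", rec)    # 0.0
```

The accuracy looks strong, yet recall shows that not a single positive case was found, which is why several metrics should be checked together.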

Real-World Applications

Testing Logistic Regression models is essential in:

Fraud detection systems
Healthcare prediction models
Customer behavior analysis
Credit scoring systems
Marketing analytics

Conclusion

Testing is a critical step in the machine learning lifecycle. It ensures that your Logistic Regression model performs well on unseen data and can be trusted in real-world applications.

By evaluating predictions with accuracy, the confusion matrix, and the classification report, you can judge whether your model is ready for practical use.

Posted by: Roger John Williams
