Use Case: Customer Churn Prediction

Predict which customers are likely to cancel their subscription or stop using your service.

When to Use This

  • Subscription businesses (SaaS, media, telecom)
  • Customer retention programs
  • Proactive support targeting
  • Marketing budget allocation

Complete Implementation

from featrixsphere.api import FeatrixSphere

featrix = FeatrixSphere()

# 1. Create Foundational Model from customer data
fm = featrix.create_foundational_model(
    name="customer_churn_model",
    data_file="customers.csv",
    ignore_columns=["customer_id", "signup_date", "email"]  # Exclude IDs and PII
)
fm.wait_for_training()

print(f"Foundational Model trained: {fm.dimensions} dimensions")

# 2. Create cost-sensitive binary classifier
#    - Missing a churner costs $500 (lost customer value)
#    - False alarm costs $50 (wasted retention effort)
predictor = fm.create_binary_classifier(
    target_column="churned",
    name="churn_predictor_v1",
    rare_label_value="yes",           # "yes" is the positive class
    cost_false_negative=500,          # Cost of missing a churner
    cost_false_positive=50            # Cost of false alarm
)
predictor.wait_for_training()

print(f"Accuracy: {predictor.accuracy:.4f}")
print(f"AUC: {predictor.auc:.4f}")
print(f"F1: {predictor.f1:.4f}")

# 3. Make predictions
customer = {
    "tenure_months": 8,
    "monthly_charges": 85.0,
    "contract": "month-to-month",
    "payment_method": "credit_card",
    "total_charges": 680.0,
    "support_tickets": 3
}

result = predictor.predict(customer)
print(f"Prediction: {result.predicted_class}")
print(f"Probability: {result.probability:.2%}")
print(f"Confidence: {result.confidence:.2%}")

# 4. Batch predict for risk scoring
import pandas as pd

customers_df = pd.read_csv("active_customers.csv")
results = predictor.batch_predict(customers_df, show_progress=True)

# Find high-risk customers
high_risk = []
for i, result in enumerate(results):
    if result.predicted_class == "yes" and result.confidence > 0.7:
        high_risk.append({
            "customer_id": customers_df.iloc[i]["customer_id"],
            "churn_probability": result.probability,
            "confidence": result.confidence
        })

print(f"Found {len(high_risk)} high-risk customers")

# 5. Create production endpoint
endpoint = predictor.create_api_endpoint(
    name="churn_api_v1",
    description="Production churn prediction endpoint"
)
print(f"Endpoint URL: {endpoint.url}")
print(f"API Key: {endpoint.api_key}")

# 6. Configure monitoring webhooks
predictor.configure_webhooks(
    alert_drift="https://your-slack-webhook.com/drift",
    alert_performance_degradation="https://your-slack-webhook.com/perf"
)

# 7. Publish to production
fm.publish(org_id="my_org", name="churn_model_v1")

Key Parameters

Cost-Sensitive Classification

Set costs based on business impact:

Parameter             Description                  Example
cost_false_negative   Cost of missing a churner    $500 (customer lifetime value)
cost_false_positive   Cost of false churn alert    $50 (retention offer cost)

The model optimizes the decision threshold using Bayes-optimal selection.
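
For intuition: with calibrated probabilities and fixed misclassification costs, the cost-minimizing cutoff is cost_false_positive / (cost_false_positive + cost_false_negative). The sketch below illustrates that textbook formula only; the threshold Sphere actually reports is derived from your model and data, so it may differ from this back-of-the-envelope number.

# Illustration only: Bayes-optimal cutoff for fixed misclassification costs,
# assuming calibrated churn probabilities.
cost_false_negative = 500   # missing a churner (lost customer value)
cost_false_positive = 50    # false alarm (wasted retention offer)

bayes_threshold = cost_false_positive / (cost_false_positive + cost_false_negative)
print(f"Cost-optimal cutoff: {bayes_threshold:.3f}")   # ~0.091 with these costs

# Expected cost of flagging vs. ignoring a customer with churn probability p
p = 0.20
cost_if_flagged = (1 - p) * cost_false_positive   # offer wasted if they were going to stay
cost_if_ignored = p * cost_false_negative         # customer lost if they churn
print(f"Flag: ${cost_if_flagged:.0f}, Ignore: ${cost_if_ignored:.0f}")   # flagging is cheaper here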

Class Imbalance

If your production data has a different class distribution than the training data:

predictor = fm.create_binary_classifier(
    target_column="churned",
    class_imbalance={"yes": 0.15, "no": 0.85}  # 15% churn rate in production
)
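
The production churn rate passed to class_imbalance is an estimate you supply. Assuming you have a recently labeled cohort on hand (the file name below is hypothetical), a quick way to derive it with pandas:

import pandas as pd

# Hypothetical file: customers whose outcome for the last period is already known
recent = pd.read_csv("recent_customers_labeled.csv")

churn_rate = (recent["churned"] == "yes").mean()
print(f"Observed churn rate: {churn_rate:.1%}")

class_imbalance = {"yes": round(churn_rate, 2), "no": round(1 - churn_rate, 2)}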

Understanding Predictions

PredictionResult Fields

result = predictor.predict(customer)

# Classification result
result.predicted_class   # "yes" or "no"
result.probability       # Raw probability for predicted class (0.87)
result.confidence        # Normalized confidence from threshold (0.74)
result.probabilities     # {"yes": 0.87, "no": 0.13}
result.threshold         # Decision threshold (0.35 after cost optimization)

# Tracking
result.prediction_uuid   # UUID for feedback

# Warnings
result.guardrails        # Per-column warnings for unusual values

Confidence vs Probability

  • probability: Raw softmax output (e.g., 87% chance of churn)
  • confidence: How far the prediction sits from the decision boundary (0 = uncertain, 1 = very certain); see the sketch below
  • threshold: Optimized cutoff point based on costs (may not be 0.5)
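
A minimal sketch of one common way to normalize "distance from the decision boundary" into a 0-1 score, shown purely for intuition. This is not necessarily the formula Sphere uses, so it will not reproduce the exact confidence values in the examples above.

# Illustration only: scale the gap between probability and threshold to 0-1.
def boundary_confidence(probability: float, threshold: float) -> float:
    if probability >= threshold:
        return (probability - threshold) / (1.0 - threshold)   # above the cutoff
    return (threshold - probability) / threshold               # below the cutoff

print(boundary_confidence(0.87, 0.35))   # ~0.80: far above the cutoff, confident "yes"
print(boundary_confidence(0.36, 0.35))   # ~0.02: right at the boundary, uncertain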

Feature Importance

Understand why customers are predicted to churn:

result = predictor.predict(customer, feature_importance=True)

# Top factors driving this prediction
for feature, importance in sorted(
    result.feature_importance.items(),
    key=lambda x: abs(x[1]),
    reverse=True
)[:5]:
    print(f"{feature}: {importance:+.3f}")

Example output:

contract: +0.45           # Month-to-month increases churn risk
support_tickets: +0.23    # More tickets = higher risk
tenure_months: -0.18      # Longer tenure = lower risk
monthly_charges: +0.12    # Higher charges = higher risk
payment_method: -0.05     # Credit card = slightly lower risk
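
To go from per-customer explanations to a global picture of churn drivers, you can average the absolute importances over a sample of customers. The sketch below reuses predictor.predict(..., feature_importance=True) from above; the sample size and the customers_df DataFrame from the batch-scoring step are assumptions for illustration.

from collections import defaultdict

totals = defaultdict(float)
sample = customers_df.drop(columns=["customer_id"]).head(200)

# Accumulate absolute importance per feature across the sample
for record in sample.to_dict(orient="records"):
    explained = predictor.predict(record, feature_importance=True)
    for feature, importance in explained.feature_importance.items():
        totals[feature] += abs(importance)

for feature, total in sorted(totals.items(), key=lambda x: x[1], reverse=True)[:5]:
    print(f"{feature}: {total / len(sample):.3f}")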

Sending Feedback

Track actual outcomes to improve future models:

# After customer actually churns or stays
actual_outcome = "yes"  # Customer did churn

# Option 1: From the result object
result.send_feedback(ground_truth=actual_outcome)

# Option 2: Using stored prediction UUID
featrix.prediction_feedback(
    prediction_uuid=stored_uuid,    # prediction_uuid saved from an earlier result
    ground_truth=actual_outcome
)
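
In practice the prediction_uuid is what ties a stored prediction to the outcome you learn weeks later. A minimal sketch of a backfill job, assuming you persisted each prediction_uuid alongside the customer record and later collect actual outcomes (the outcomes file and its column names are hypothetical):

import pandas as pd

# Hypothetical file: one row per scored customer, with the saved prediction_uuid
# and the outcome observed at the end of the billing period ("yes" / "no")
outcomes = pd.read_csv("churn_outcomes.csv")

for row in outcomes.itertuples():
    featrix.prediction_feedback(
        prediction_uuid=row.prediction_uuid,
        ground_truth=row.actual_churned
    )

print(f"Sent feedback for {len(outcomes)} predictions")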

Production API Usage

Python

result = endpoint.predict(
    {"tenure_months": 8, "monthly_charges": 85.0, "contract": "month-to-month"},
    api_key=endpoint.api_key
)

HTTP

curl -X POST "https://sphere-api.featrix.com/endpoint/churn_api_v1/predict" \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"tenure_months": 8, "monthly_charges": 85.0, "contract": "month-to-month"}'

Response

{
  "predicted_class": "yes",
  "probability": 0.87,
  "confidence": 0.74,
  "probabilities": {"yes": 0.87, "no": 0.13},
  "threshold": 0.35,
  "prediction_uuid": "550e8400-e29b-41d4-a716-446655440000"
}
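
If a service can't use the SDK, the same call works with any HTTP client. This sketch mirrors the curl request above using the requests library; the URL and header names come directly from that example.

import requests

response = requests.post(
    "https://sphere-api.featrix.com/endpoint/churn_api_v1/predict",
    headers={"X-API-Key": "your-api-key"},
    json={"tenure_months": 8, "monthly_charges": 85.0, "contract": "month-to-month"},
    timeout=30,
)
response.raise_for_status()

prediction = response.json()
print(prediction["predicted_class"], prediction["probability"])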

Best Practices

  1. Exclude ID columns - Customer IDs, emails, and timestamps don't help prediction
  2. Set appropriate costs - False negatives (missed churners) often cost more than false positives
  3. Monitor for drift - Customer behavior changes over time; see the drift-check sketch after this list
  4. Send feedback - Real outcomes improve future model versions
  5. Version your models - Use clear naming: churn_model_v1_2024_01
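
Beyond the webhook alerts configured in step 6, a lightweight manual drift check is to compare the distribution of key features in recent traffic against the training data, for example with a two-sample Kolmogorov-Smirnov test. The sketch below uses scipy and a hypothetical recent snapshot file; the p-value cutoff for flagging drift is a judgment call.

import pandas as pd
from scipy.stats import ks_2samp

train = pd.read_csv("customers.csv")                 # original training data
recent = pd.read_csv("last_30_days_customers.csv")   # hypothetical recent snapshot

for column in ["tenure_months", "monthly_charges", "support_tickets"]:
    stat, p_value = ks_2samp(train[column].dropna(), recent[column].dropna())
    flag = "possible drift" if p_value < 0.01 else "ok"
    print(f"{column}: KS={stat:.3f}, p={p_value:.4f} -> {flag}")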