Use Case: Customer Churn Prediction¶
Predict which customers are likely to cancel their subscription or stop using your service.
When to Use This¶
- Subscription businesses (SaaS, media, telecom)
- Customer retention programs
- Proactive support targeting
- Marketing budget allocation
Complete Implementation¶
from featrixsphere.api import FeatrixSphere
featrix = FeatrixSphere()
# 1. Create Foundational Model from customer data
fm = featrix.create_foundational_model(
    name="customer_churn_model",
    data_file="customers.csv",
    ignore_columns=["customer_id", "signup_date", "email"]  # Exclude IDs and PII
)
fm.wait_for_training()
print(f"Foundational Model trained: {fm.dimensions} dimensions")
# 2. Create cost-sensitive binary classifier
# - Missing a churner costs $500 (lost customer value)
# - False alarm costs $50 (wasted retention effort)
predictor = fm.create_binary_classifier(
    target_column="churned",
    name="churn_predictor_v1",
    rare_label_value="yes",   # "yes" is the positive class
    cost_false_negative=500,  # Cost of missing a churner
    cost_false_positive=50    # Cost of a false alarm
)
predictor.wait_for_training()
print(f"Accuracy: {predictor.accuracy:.4f}")
print(f"AUC: {predictor.auc:.4f}")
print(f"F1: {predictor.f1:.4f}")
# 3. Make predictions
customer = {
    "tenure_months": 8,
    "monthly_charges": 85.0,
    "contract": "month-to-month",
    "payment_method": "credit_card",
    "total_charges": 680.0,
    "support_tickets": 3
}
result = predictor.predict(customer)
print(f"Prediction: {result.predicted_class}")
print(f"Probability: {result.probability:.2%}")
print(f"Confidence: {result.confidence:.2%}")
# 4. Batch predict for risk scoring
import pandas as pd
customers_df = pd.read_csv("active_customers.csv")
results = predictor.batch_predict(customers_df, show_progress=True)
# Find high-risk customers
high_risk = []
for i, result in enumerate(results):
    if result.predicted_class == "yes" and result.confidence > 0.7:
        high_risk.append({
            "customer_id": customers_df.iloc[i]["customer_id"],
            "churn_probability": result.probability,
            "confidence": result.confidence
        })
print(f"Found {len(high_risk)} high-risk customers")
# 5. Create production endpoint
endpoint = predictor.create_api_endpoint(
    name="churn_api_v1",
    description="Production churn prediction endpoint"
)
print(f"Endpoint URL: {endpoint.url}")
print(f"API Key: {endpoint.api_key}")
# 6. Configure monitoring webhooks
predictor.configure_webhooks(
    alert_drift="https://your-slack-webhook.com/drift",
    alert_performance_degradation="https://your-slack-webhook.com/perf"
)
# 7. Publish to production
fm.publish(org_id="my_org", name="churn_model_v1")
Key Parameters¶
Cost-Sensitive Classification¶
Set costs based on business impact:
| Parameter | Description | Example |
|---|---|---|
| `cost_false_negative` | Cost of missing a churner | $500 (customer lifetime value) |
| `cost_false_positive` | Cost of a false churn alert | $50 (retention offer cost) |
The model optimizes the decision threshold using Bayes-optimal selection.
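For intuition, the textbook Bayes-optimal threshold for asymmetric costs can be sketched as below. This is the standard formula, not FeatrixSphere's internal implementation, which also accounts for the validation data (hence the 0.35 threshold shown later rather than the naive value):

```python
# Sketch of the standard Bayes-optimal decision threshold for asymmetric costs.

def bayes_threshold(cost_fp: float, cost_fn: float) -> float:
    """Predict positive when p(churn) exceeds this cutoff."""
    # The expected cost of predicting "no" (p * cost_fn) exceeds the
    # expected cost of predicting "yes" ((1 - p) * cost_fp) exactly when
    # p > cost_fp / (cost_fp + cost_fn).
    return cost_fp / (cost_fp + cost_fn)

t = bayes_threshold(cost_fp=50, cost_fn=500)
print(f"{t:.3f}")  # 0.091 -- far below 0.5, so even borderline customers get flagged
```

With symmetric costs the formula recovers the familiar 0.5 cutoff; the more expensive a missed churner is, the lower the cutoff drops.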
Class Imbalance¶
If your production data has a different class distribution than your training data:
predictor = fm.create_binary_classifier(
    target_column="churned",
    class_imbalance={"yes": 0.15, "no": 0.85}  # 15% churn rate in production
)
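To see why this matters, here is the standard prior-shift correction, shown for intuition only (the `class_imbalance` parameter lets the library handle this internally; the exact mechanism may differ):

```python
# Hypothetical illustration: a model trained on 50% churners will emit
# probabilities that are too high when production churn is only 15%.
# The standard fix reweights the odds by the ratio of the two priors.

def adjust_for_prior(p: float, train_pos: float, prod_pos: float) -> float:
    w = (prod_pos / train_pos) * ((1 - train_pos) / (1 - prod_pos))
    return p * w / (p * w + (1 - p))

print(f"{adjust_for_prior(0.60, train_pos=0.50, prod_pos=0.15):.3f}")  # 0.209
```

A raw 60% score drops to roughly 21% under the rarer production prior, which is why declaring the production distribution up front changes both probabilities and the decision threshold.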
Understanding Predictions¶
PredictionResult Fields¶
result = predictor.predict(customer)
# Classification result
result.predicted_class # "yes" or "no"
result.probability # Raw probability for predicted class (0.87)
result.confidence # Normalized confidence from threshold (0.74)
result.probabilities # {"yes": 0.87, "no": 0.13}
result.threshold # Decision threshold (0.35 after cost optimization)
# Tracking
result.prediction_uuid # UUID for feedback
# Warnings
result.guardrails # Per-column warnings for unusual values
Confidence vs Probability¶
- probability: Raw softmax output (e.g., 87% chance of churn)
- confidence: How far from the decision boundary (0 = uncertain, 1 = very certain)
- threshold: Optimized cutoff point based on costs (may not be 0.5)
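One plausible way to turn distance-from-threshold into a [0, 1] confidence is sketched below. Note this is an illustration, not FeatrixSphere's exact formula: it gives 0.80 for the example above, whereas the library reports 0.74, so the real calibration evidently differs.

```python
# Illustrative normalization of distance from the decision boundary.

def confidence_from_threshold(p: float, threshold: float) -> float:
    if p >= threshold:
        return (p - threshold) / (1 - threshold)  # how far into "yes" territory
    return (threshold - p) / threshold            # how far into "no" territory

print(f"{confidence_from_threshold(0.87, threshold=0.35):.2f}")  # 0.80
```

Either way, the key property holds: a probability sitting exactly on the threshold has confidence 0, and confidence grows as the probability moves away from the cutoff in either direction.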
Feature Importance¶
Understand why customers are predicted to churn:
result = predictor.predict(customer, feature_importance=True)
# Top factors driving this prediction
for feature, importance in sorted(
    result.feature_importance.items(),
    key=lambda x: abs(x[1]),
    reverse=True
)[:5]:
    print(f"{feature}: {importance:+.3f}")
Example output:
contract: +0.45 # Month-to-month increases churn risk
support_tickets: +0.23 # More tickets = higher risk
tenure_months: -0.18 # Longer tenure = lower risk
monthly_charges: +0.12 # Higher charges = higher risk
payment_method: -0.05 # Credit card = slightly lower risk
Sending Feedback¶
Track actual outcomes to improve future models:
# After customer actually churns or stays
actual_outcome = "yes" # Customer did churn
# Option 1: From the result object
result.send_feedback(ground_truth=actual_outcome)
# Option 2: Using stored prediction UUID
featrix.prediction_feedback(
prediction_uuid=stored_uuid,
ground_truth=actual_outcome
)
Production API Usage¶
Python¶
result = endpoint.predict(
    {"tenure_months": 8, "monthly_charges": 85.0, "contract": "month-to-month"},
    api_key=endpoint.api_key
)
HTTP¶
curl -X POST "https://sphere-api.featrix.com/endpoint/churn_api_v1/predict" \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"tenure_months": 8, "monthly_charges": 85.0, "contract": "month-to-month"}'
Response¶
{
  "predicted_class": "yes",
  "probability": 0.87,
  "confidence": 0.74,
  "probabilities": {"yes": 0.87, "no": 0.13},
  "threshold": 0.35,
  "prediction_uuid": "550e8400-e29b-41d4-a716-446655440000"
}
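Consuming this response from client code mirrors the batch risk-scoring logic earlier. A minimal sketch (the response body is hardcoded here for illustration; in practice it comes from the HTTP call above):

```python
import json

# The JSON body returned by the endpoint, hardcoded for this example.
body = '''{
  "predicted_class": "yes",
  "probability": 0.87,
  "confidence": 0.74,
  "probabilities": {"yes": 0.87, "no": 0.13},
  "threshold": 0.35,
  "prediction_uuid": "550e8400-e29b-41d4-a716-446655440000"
}'''

result = json.loads(body)

# Same high-risk rule as the batch scoring step: predicted churner
# with confidence above 0.7 triggers retention outreach.
needs_outreach = result["predicted_class"] == "yes" and result["confidence"] > 0.7
print(needs_outreach)  # True

# Store the UUID so the actual outcome can be sent back as feedback later.
print(result["prediction_uuid"])
```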
Best Practices¶
- Exclude ID columns - Customer IDs, emails, timestamps don't help prediction
- Set appropriate costs - False negatives (missed churners) often cost more than false positives
- Monitor for drift - Customer behavior changes over time
- Send feedback - Real outcomes improve future model versions
- Version your models - Use clear naming: churn_model_v1_2024_01
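The naming convention in the last bullet can be generated consistently with a small helper (hypothetical, not part of the FeatrixSphere API):

```python
from datetime import date

def versioned_name(base: str, major: int, trained: date) -> str:
    """Build a model name like churn_model_v1_2024_01 from its parts."""
    return f"{base}_v{major}_{trained:%Y_%m}"

print(versioned_name("churn_model", 1, date(2024, 1, 15)))  # churn_model_v1_2024_01
```

Deriving names from the training date rather than typing them by hand keeps versions sortable and unambiguous across retrains.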