Cost-Optimal Threshold Implementation
Summary
Implemented improvements [1] and [2] from the feedback to use Bayes-optimal decision thresholds when costs are specified.
The Problem
Previous behavior:
- Cost parameters (cost_false_positive, cost_false_negative) were computed and logged but NOT connected to the training objective or decision threshold
- self.optimal_threshold was always set to the F1-optimal threshold (maximizes F1 score)
- The cost-optimal threshold (tau_cost) was computed but only used for logging/metrics, not for predictions
- This meant the "default cost" was informational only, not actually driving decisions
Why this matters: For cost-sensitive classification, the Bayes-optimal decision threshold is:
$$\text{threshold} = \frac{C_{FP}}{C_{FP} + C_{FN}}$$
With cost_false_negative=2.33 and cost_false_positive=1.0:
$$\text{threshold} = \frac{1.0}{1.0 + 2.33} \approx 0.30$$
So you should predict "bad" when P(bad|x) >= 0.30, not at the default 0.50.
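The threshold quoted above can be reproduced with a few lines of Python. This is a minimal sketch; `bayes_threshold` is a hypothetical helper name used only for illustration and is not part of single_predictor.py.

```python
def bayes_threshold(cost_fp: float, cost_fn: float) -> float:
    """Theoretical cost-optimal threshold on P(positive | x).

    Hypothetical helper for illustration; assumes both costs are positive.
    """
    if cost_fp <= 0 or cost_fn <= 0:
        raise ValueError("costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


tau = bayes_threshold(cost_fp=1.0, cost_fn=2.33)
print(f"{tau:.4f}")  # 0.3003
```

With equal costs the same function returns 0.50, which is the usual default.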
The Solution
New behavior:
1. When costs are specified, the system now uses the cost-optimal threshold for predictions
2. The cost-optimal threshold is computed using best_threshold_for_cost() which searches for the threshold that minimizes: cost_fp * FP + cost_fn * FN
3. self.optimal_threshold is now set to the cost-optimal threshold when costs are provided
4. All logging clearly indicates whether we're using "cost-optimal (Bayes)" or "F1-optimal" threshold
5. The Bayes-optimal formula is documented throughout the code
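To make step 2 concrete, here is a rough sketch of an empirical threshold search that minimizes cost_fp * FP + cost_fn * FN. The function name, signature, and candidate set below are assumptions for illustration; the real best_threshold_for_cost() may be implemented differently.

```python
from typing import Sequence, Tuple


def sketch_best_threshold_for_cost(
    y_true: Sequence[int],
    pos_probs: Sequence[float],
    cost_fp: float,
    cost_fn: float,
) -> Tuple[float, float]:
    """Return (threshold, total_cost) minimizing cost_fp * FP + cost_fn * FN.

    Hypothetical stand-in, not the project's implementation.
    A sample is predicted positive when its probability is >= threshold.
    """
    # Candidates: every distinct score, plus +inf ("never predict positive").
    candidates = sorted(set(pos_probs)) + [float("inf")]
    best_tau, best_cost = candidates[0], float("inf")
    for tau in candidates:
        fp = sum(1 for y, p in zip(y_true, pos_probs) if p >= tau and y == 0)
        fn = sum(1 for y, p in zip(y_true, pos_probs) if p < tau and y == 1)
        cost = cost_fp * fp + cost_fn * fn
        if cost < best_cost:  # strict '<' keeps the lowest threshold on ties
            best_tau, best_cost = tau, cost
    return best_tau, best_cost


y_true = [0, 0, 1, 0, 1, 1]
pos_probs = [0.10, 0.25, 0.35, 0.40, 0.60, 0.90]
tau, cost = sketch_best_threshold_for_cost(y_true, pos_probs, cost_fp=1.0, cost_fn=2.33)
print(tau, cost)  # 0.35 1.0
```

The quadratic loop is kept for readability. On a finite validation set the searched threshold will generally sit near, but not exactly at, C_FP / (C_FP + C_FN).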
Changes Made
1. Threshold Selection Logic (lines ~9890-9940)
# BAYES-OPTIMAL DECISION THRESHOLD:
# When costs are specified, use the cost-optimal threshold instead of F1-optimal.
# This is the theoretically correct Bayes-optimal decision rule that minimizes expected cost.
use_cost_optimal_threshold = True
# Compute Bayes-optimal theoretical threshold for comparison
bayes_threshold_theory = self.cost_false_positive / (self.cost_false_positive + self.cost_false_negative)
logger.info(f"💰 Cost-optimal threshold: {tau_cost:.4f} (Bayes-optimal theory: {bayes_threshold_theory:.4f})")
# DECISION: Which threshold to use for predictions?
if use_cost_optimal_threshold and tau_cost is not None:
    threshold_for_prediction = tau_cost
    threshold_source = "cost-optimal (Bayes)"
    # Recompute metrics at cost-optimal threshold
    ...
else:
    threshold_for_prediction = optimal_threshold
    threshold_source = "F1-optimal"
2. Updated Threshold Tracking (lines ~9944-9989)
# THRESHOLD TRACKING POLICY:
# - When costs are specified, we use cost-optimal threshold (Bayes-optimal decision rule)
# - Otherwise, we use F1-optimal threshold
# - self.optimal_threshold is set to the best epoch's threshold (for final predictions)
# Save threshold_for_prediction instead of optimal_threshold
self.optimal_threshold = threshold_for_prediction
# Log with source indicator
best_epoch_msg = f"⭐ New best epoch: AUC={auc:.3f}, F1={f1:.3f}, threshold={threshold_for_prediction:.4f} ({threshold_source})"
3. Enhanced Logging (lines ~9992-10007)
# B: Enhanced logging showing what we're optimizing and tradeoffs
logger.debug(f"{log_prefix}📊 Threshold Optimization Summary:")
logger.info(f"{log_prefix} Using {threshold_source} threshold: {threshold_for_prediction:.4f}")
logger.info(f"{log_prefix} F1: argmax={argmax_f1:.3f}, optimal={f1:.3f}, ΔF1={delta_f1:+.3f}")
logger.info(f"{log_prefix} Accuracy: argmax={argmax_accuracy:.3f}, optimal={accuracy_at_optimal:.3f}, ΔAcc={delta_accuracy:+.3f}")
# Cost metrics with savings percentage
if cost_metrics:
    cost_savings_pct = ((baseline_cost - cost_min) / baseline_cost * 100) if baseline_cost > 0 else 0
    logger.info(f"{log_prefix}💰 Cost metrics - Min cost: {cost_min:.2f} (baseline: {baseline_cost:.2f}, savings: {cost_savings_pct:.1f}%)")
4. Updated Prediction Code (lines ~10622-10641)
if use_optimal_threshold:
    # USE OPTIMAL THRESHOLD: For binary classification, use either:
    # 1. Cost-optimal threshold (Bayes-optimal) when costs are specified
    # 2. F1-optimal threshold as fallback when costs not specified
    # This threshold was computed during training and stored in self.optimal_threshold
    # Apply optimal threshold: predict positive if prob >= threshold
    # For cost-based: threshold ≈ C_FP / (C_FP + C_FN) (Bayes-optimal decision rule)
    # For F1-based: threshold maximizes F1 score
    if pos_prob >= self.optimal_threshold:
        # Predict positive class
        ...
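As a standalone illustration of the decision step, the sketch below applies a stored threshold to a single positive-class probability. `predict_label` is a hypothetical helper, and the fallback to 0.50 when no threshold is stored is an assumption made for this example, not something taken from the prediction code.

```python
from typing import Optional


def predict_label(pos_prob: float, optimal_threshold: Optional[float]) -> int:
    """Return 1 (positive) when pos_prob >= threshold, else 0.

    Hypothetical helper; the 0.5 fallback is an assumption for illustration.
    """
    threshold = 0.5 if optimal_threshold is None else optimal_threshold
    return 1 if pos_prob >= threshold else 0


# The same probability is classified differently once a cost-optimal
# threshold (here the 0.2987 from the sample log output) is in effect.
print(predict_label(0.35, optimal_threshold=0.2987))  # 1
print(predict_label(0.35, optimal_threshold=None))    # 0
```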
5. Documentation Updates
Updated optimal_threshold attribute documentation (lines ~530-540):
# Optimal threshold for binary classification (computed during training, used during prediction)
# When costs are specified: Uses cost-optimal threshold (Bayes-optimal decision rule)
# - Formula: threshold ≈ C_FP / (C_FP + C_FN)
# - Example: With C_FN=2.33 and C_FP=1.0, threshold ≈ 0.30 (not 0.50)
# - This minimizes expected cost for the given false positive and false negative costs
# When costs not specified: Uses F1-optimal threshold (maximizes F1 score on validation set)
# The threshold is saved from the best AUC epoch during training
self.optimal_threshold = None
Enhanced best_threshold_for_cost docstring (lines ~8573-8596):
"""
Find the optimal threshold that minimizes cost: cost_fp * FP + cost_fn * FN.
This implements the Bayes-optimal decision rule for cost-sensitive binary classification.
The theoretical optimal threshold is: threshold = C_FP / (C_FP + C_FN)
For example:
- If C_FN = 2.33 and C_FP = 1.0 (false negatives cost 2.33x more than false positives)
- Then optimal threshold ≈ 1.0 / (1.0 + 2.33) ≈ 0.30
- This means: predict positive if P(positive|x) >= 0.30 (not the default 0.50)
This threshold balances the asymmetric costs to minimize expected cost.
...
"""
Impact
When costs are specified:
- ✅ The system now makes Bayes-optimal decisions that minimize expected cost
- ✅ The threshold is theoretically correct (approximately C_FP / (C_FP + C_FN))
- ✅ All metrics are computed at the cost-optimal threshold
- ✅ Logs clearly show "cost-optimal (Bayes)" threshold with comparison to theoretical value
- ✅ Cost savings percentage is prominently displayed
When costs are NOT specified:
- ✅ Falls back to F1-optimal threshold (unchanged behavior)
- ✅ Logs clearly show "F1-optimal" threshold
Theoretical Background
The Bayes-optimal decision rule for binary classification with asymmetric costs states:
Predict positive class if:
$$P(y=1 \mid x) \ge \frac{C_{FP}}{C_{FP} + C_{FN}}$$
Proof sketch:
- Expected cost = P(y=1|x) * C_FN * I(predict 0) + P(y=0|x) * C_FP * I(predict 1)
- Minimize cost by predicting positive when: P(y=0|x) * C_FP < P(y=1|x) * C_FN
- Rearrange: P(y=1|x) / P(y=0|x) > C_FP / C_FN
- Since P(y=0|x) = 1 - P(y=1|x), solve for P(y=1|x): P(y=1|x) > C_FP / (C_FP + C_FN)
- Result: threshold = C_FP / (C_FP + C_FN)
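The argument can be checked numerically by comparing the expected cost of each decision on either side of the threshold. This is a small sketch; `expected_costs` is a hypothetical helper, and it assumes P(y=1|x) is the true posterior probability.

```python
def expected_costs(p_pos: float, cost_fp: float, cost_fn: float):
    """Return (expected cost of predicting 1, expected cost of predicting 0)."""
    cost_if_predict_pos = (1.0 - p_pos) * cost_fp  # wrong only when y = 0
    cost_if_predict_neg = p_pos * cost_fn          # wrong only when y = 1
    return cost_if_predict_pos, cost_if_predict_neg


C_FP, C_FN = 1.0, 2.33
tau = C_FP / (C_FP + C_FN)

for p in (0.20, tau, 0.40):
    pos_cost, neg_cost = expected_costs(p, C_FP, C_FN)
    cheaper = "predict 1" if pos_cost <= neg_cost else "predict 0"
    print(f"p={p:.4f}  predict1={pos_cost:.4f}  predict0={neg_cost:.4f}  -> {cheaper}")
```

Below the threshold, predicting 0 has the lower expected cost; above it, predicting 1 does; at the threshold the two costs coincide up to floating-point rounding.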
Example Output
With cost_false_negative=2.33 and cost_false_positive=1.0, you'll now see:
💰 Cost-optimal threshold: 0.2987 (Bayes-optimal theory: 0.3003)
💰 Cost at threshold: 145.23 (baseline: 233.00), F1: 0.682
📊 Threshold Optimization Summary:
Using cost-optimal (Bayes) threshold: 0.2987
F1: argmax=0.645, optimal=0.682, ΔF1=+0.037
Accuracy: argmax=0.823, optimal=0.817, ΔAcc=-0.006
💰 Cost metrics - Min cost: 145.23 (baseline: 233.00, savings: 37.7%)
⭐ New best epoch: AUC=0.741, F1=0.682, threshold=0.2987 (cost-optimal (Bayes))
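The savings percentage in the sample output follows directly from the two logged cost values. The sketch below mirrors the expression used in the logging code; the wrapper function name is an assumption for illustration.

```python
def cost_savings_pct(baseline_cost: float, cost_min: float) -> float:
    """Percentage reduction of cost_min relative to baseline_cost (0 if baseline is not positive)."""
    if baseline_cost <= 0:
        return 0.0
    return (baseline_cost - cost_min) / baseline_cost * 100


print(f"{cost_savings_pct(233.00, 145.23):.1f}%")  # 37.7%
```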
What About the Loss Function?
The feedback noted that ideally the loss function should also be connected to costs. This is a valid point, but harder to implement:
Current approach (still valid):
- Use FocalLoss with class weights for training stability
- Apply cost-optimal threshold at inference time
Why this works:
- The model is trained to estimate P(positive|x); the formula C_FP / (C_FP + C_FN) applies exactly only if those estimates are well calibrated
- The cost-optimal threshold is found by search rather than taken directly from the formula, so it still converts the model's probabilities into cost-minimizing decisions when calibration is imperfect
- This is a common and theoretically sound approach (train for probability estimation, optimize threshold separately)
Future improvement (optional):
- Modify FocalLoss to use cost-sensitive class weights: weight = [C_FN/C_FP, 1.0]
- Or use a fully cost-sensitive loss function
- However, the current approach is already Bayes-optimal at the decision level
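For reference, one possible shape of a cost-weighted loss is sketched below using plain binary cross-entropy rather than FocalLoss. The function names and the weighting choice (positives weighted by C_FN, negatives by C_FP) are assumptions for illustration, not the project's design.

```python
import math


def cost_weighted_bce(y: int, q: float, cost_fp: float, cost_fn: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy with positives weighted by cost_fn and negatives by cost_fp."""
    q = min(max(q, eps), 1.0 - eps)  # keep log() finite
    return -(cost_fn * y * math.log(q) + cost_fp * (1 - y) * math.log(1.0 - q))


def loss_minimizing_output(p: float, cost_fp: float, cost_fn: float) -> float:
    """Output q that minimizes the expected weighted loss when the true P(y=1|x) is p."""
    return cost_fn * p / (cost_fn * p + cost_fp * (1.0 - p))


C_FP, C_FN = 1.0, 2.33
tau = C_FP / (C_FP + C_FN)
print(f"{loss_minimizing_output(tau, C_FP, C_FN):.3f}")  # 0.500
```

Under this weighting, the loss-minimizing output equals 0.50 exactly where the true probability equals C_FP / (C_FP + C_FN). A model trained this way would therefore already encode the costs at a 0.50 cut, so applying the theoretical threshold on top of it would count the costs twice. A threshold searched empirically on validation data does not have this problem.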
Files Modified
src/lib/featrix/neural/single_predictor.py:
- Lines ~530-540: Updated optimal_threshold attribute documentation
- Lines ~8573-8596: Enhanced best_threshold_for_cost docstring
- Lines ~9890-9940: Added cost-optimal threshold selection logic
- Lines ~9944-9989: Updated threshold tracking to use cost-optimal when available
- Lines ~9992-10007: Enhanced logging to show threshold source
- Lines ~10622-10641: Updated prediction code documentation
Testing Recommendations
- Verify threshold values: Check that with cost_fn=2.33 and cost_fp=1.0, the threshold is around 0.30
- Verify predictions: Ensure predictions change when costs are specified vs. not specified
- Check logging: Confirm logs show "cost-optimal (Bayes)" vs. "F1-optimal" appropriately
- Cost savings: Verify that cost is lower at the cost-optimal threshold than at the F1-optimal threshold
- Backward compatibility: Ensure models without costs still work correctly (F1-optimal threshold)
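The cost-savings check can be prototyped on a toy dataset before running it against a trained model. Everything in the sketch below (helper names, data, candidate thresholds) is hypothetical and self-contained; it does not call into single_predictor.py.

```python
def confusion(y_true, probs, tau):
    """Return (TP, FP, FN) when predicting positive for prob >= tau."""
    tp = sum(1 for y, p in zip(y_true, probs) if p >= tau and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= tau and y == 0)
    fn = sum(1 for y, p in zip(y_true, probs) if p < tau and y == 1)
    return tp, fp, fn


def total_cost(y_true, probs, tau, cost_fp, cost_fn):
    _, fp, fn = confusion(y_true, probs, tau)
    return cost_fp * fp + cost_fn * fn


def f1_at(y_true, probs, tau):
    tp, fp, fn = confusion(y_true, probs, tau)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy validation set (made-up values for illustration only).
y_true = [0, 1, 0, 0, 1, 1, 1]
probs = [0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.90]
cost_fp, cost_fn = 1.0, 2.33

candidates = sorted(set(probs))
tau_cost = min(candidates, key=lambda t: total_cost(y_true, probs, t, cost_fp, cost_fn))
tau_f1 = max(candidates, key=lambda t: f1_at(y_true, probs, t))

cost_at_tau_cost = total_cost(y_true, probs, tau_cost, cost_fp, cost_fn)
cost_at_tau_f1 = total_cost(y_true, probs, tau_f1, cost_fp, cost_fn)
print(tau_cost, tau_f1)                  # 0.2 0.6
print(cost_at_tau_cost, cost_at_tau_f1)  # 2.0 2.33
assert cost_at_tau_cost <= cost_at_tau_f1
```

On this data the two thresholds differ: the F1-optimal cut misses one positive, which costs more than the two extra false positives accepted at the cost-optimal cut.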