Featrix API Reference¶
Complete reference for the FeatrixSphere Python API.
Installation¶
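Install the client with pip (this assumes the package is published under the same name as its import path):

```bash
pip install featrixsphere
```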
FeatrixSphere (Client)¶
The main client for interacting with Featrix.
Constructor¶
```python
from featrixsphere.api import FeatrixSphere

featrix = FeatrixSphere(
    api_url: str = None,         # API URL (default: auto-detect)
    compute_cluster: str = None  # Target compute cluster
)
```
Methods¶
create_foundational_model¶
Create and train a new Foundational Model.
```python
fm = featrix.create_foundational_model(
    name: str = None,                      # Model name (optional)
    data_file: str = None,                 # Path to CSV/Parquet/JSON or S3 URL
    df: pd.DataFrame = None,               # Pandas DataFrame (alternative to data_file)
    ignore_columns: List[str] = None,      # Columns to exclude from training
    epochs: int = None,                    # Training epochs (None = automatic)
    webhooks: Dict[str, str] = None,       # Webhook URLs for training events
    user_metadata: Dict[str, Any] = None,  # Custom metadata (max 32KB)
    foundation_mode: bool = False          # Force foundation training for large datasets
) -> FoundationalModel
```
Returns: FoundationalModel instance (training starts immediately)
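For example, a minimal end-to-end run (the model name and file path are illustrative):

```python
featrix = FeatrixSphere()

fm = featrix.create_foundational_model(
    name="customer-churn",
    data_file="customers.csv",
)
fm.wait_for_training()  # blocks until training completes
print(fm.id, fm.status, fm.final_loss)
```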
foundational_model¶
Get an existing Foundational Model by ID.
```python
fm = featrix.foundational_model(
    session_id: str  # Session ID of the Foundational Model
) -> FoundationalModel
```
predictor¶
Get an existing Predictor by ID.
```python
predictor = featrix.predictor(
    predictor_id: str,          # Predictor ID
    foundational_model_id: str  # Parent Foundational Model ID
) -> Predictor
```
vector_database¶
Get an existing Vector Database.
```python
vdb = featrix.vector_database(
    foundational_model_id: str  # Foundational Model ID
) -> VectorDatabase
```
api_endpoint¶
Get an existing API Endpoint.
```python
endpoint = featrix.api_endpoint(
    endpoint_id: str,           # Endpoint ID
    foundational_model_id: str  # Parent Foundational Model ID
) -> APIEndpoint
```
list_sessions¶
List Foundational Model sessions by name prefix.
Returns: List of session IDs
prediction_feedback¶
Send ground truth feedback for a prediction.
```python
featrix.prediction_feedback(
    prediction_uuid: str,  # UUID from PredictionResult
    ground_truth: Any      # Actual outcome (string or numeric)
)
```
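A typical flow pairs the `prediction_uuid` from a `PredictionResult` with the observed outcome (the record fields and outcome value are illustrative):

```python
result = predictor.predict({"age": 42, "plan": "pro"})

# Later, once the real outcome is known:
featrix.prediction_feedback(
    prediction_uuid=result.prediction_uuid,
    ground_truth="churned",
)
```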
health_check¶
Check API connectivity.
get_notebook¶
Get Jupyter notebook visualization helper.
set_compute_cluster¶
Change the target compute cluster.
FoundationalModel¶
Represents a trained neural embedding space.
Attributes¶
| Attribute | Type | Description |
|---|---|---|
| `id` | `str` | Session ID |
| `name` | `str` | Model name |
| `status` | `str` | `"training"`, `"done"`, or `"error"` |
| `dimensions` | `int` | Embedding dimensions (d_model) |
| `epochs` | `int` | Training epochs completed |
| `final_loss` | `float` | Final training loss |
| `created_at` | `str` | Creation timestamp (ISO format) |
| `columns` | `List[str]` | Column names in the model |
Methods¶
wait_for_training¶
Block until training completes.
```python
fm.wait_for_training(
    max_wait_time: int = 3600,  # Maximum wait in seconds
    poll_interval: int = 10,    # Polling frequency in seconds
    show_progress: bool = True  # Print progress updates
)
```
Raises: `TimeoutError` if `max_wait_time` is exceeded; `RuntimeError` if training fails
is_ready¶
Check if training is complete.
refresh¶
Refresh model state from server.
create_binary_classifier¶
Create a binary classifier predictor.
```python
predictor = fm.create_binary_classifier(
    target_column: str,                       # Column to predict
    name: str = None,                         # Predictor name
    rare_label_value: str = None,             # Positive/minority class label
    epochs: int = 0,                          # Training epochs (0 = auto)
    labels_file: str = None,                  # Separate labels file path
    labels_df: pd.DataFrame = None,           # Separate labels DataFrame
    cost_false_positive: float = None,        # Cost per false positive
    cost_false_negative: float = None,        # Cost per false negative
    class_imbalance: Dict[str, float] = None  # Production class distribution
) -> Predictor
```
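For example, a cost-sensitive classifier (the column name, label, and costs are illustrative):

```python
predictor = fm.create_binary_classifier(
    target_column="churned",
    rare_label_value="yes",
    cost_false_positive=5.0,
    cost_false_negative=50.0,
)
predictor.wait_for_training()
print(predictor.accuracy, predictor.auc)
```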
create_multi_classifier¶
Create a multiclass classifier predictor.
```python
predictor = fm.create_multi_classifier(
    target_column: str,                       # Column to predict
    name: str = None,                         # Predictor name
    epochs: int = 0,                          # Training epochs (0 = auto)
    labels_file: str = None,                  # Separate labels file path
    labels_df: pd.DataFrame = None,           # Separate labels DataFrame
    class_imbalance: Dict[str, float] = None  # Production class distribution
) -> Predictor
```
create_regressor¶
Create a regression predictor.
```python
predictor = fm.create_regressor(
    target_column: str,             # Column to predict
    name: str = None,               # Predictor name
    epochs: int = 0,                # Training epochs (0 = auto)
    labels_file: str = None,        # Separate labels file path
    labels_df: pd.DataFrame = None  # Separate labels DataFrame
) -> Predictor
```
list_predictors¶
List all predictors for this Foundational Model.
create_vector_database¶
Create a vector database for similarity search.
```python
vdb = fm.create_vector_database(
    name: str = None,           # Database name
    records: List[Dict] = None  # Initial records to add
) -> VectorDatabase
```
create_reference_record¶
Create a reference record for positive-only matching.
```python
ref = fm.create_reference_record(
    record: Dict,     # The reference record
    name: str = None  # Reference name
) -> ReferenceRecord
```
encode¶
Convert records to embedding vectors.
```python
embeddings = fm.encode(
    records: List[Dict],  # Records to encode
    short: bool = False   # If True, return only 3D vectors
) -> List[Dict]
```
Returns: List of dicts with `embedding` (3D) and `embedding_long` (full) keys
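A short sketch of reading the returned vectors (the record fields are illustrative):

```python
embeddings = fm.encode([{"age": 42, "plan": "pro"}])

vec3 = embeddings[0]["embedding"]       # 3D projection
full = embeddings[0]["embedding_long"]  # full-dimensional vector
```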
get_training_metrics¶
Get training metrics and loss history.
Returns: Dict with `loss_history`, `lr_history`, etc.
get_projections¶
Get 2D/3D projection data for visualization.
get_sphere_preview¶
Get PNG preview image of embedding space.
get_model_card¶
Get the model card with all training decisions.
Returns: Dict with `columns`, `excluded_columns`, `training_config`, `quality_metrics`, `calibration`, `data_summary`, and `warnings`
publish¶
Publish to production directory (protected from garbage collection).
publish_checkpoint¶
Publish a training checkpoint as a new Foundational Model.
```python
checkpoint_fm = fm.publish_checkpoint(
    name: str,                    # New model name
    org_id: str,                  # Organization ID
    checkpoint_epoch: int = None  # Epoch to snapshot (None = latest)
) -> FoundationalModel
```
unpublish¶
Remove from published directory.
deprecate¶
Mark as deprecated with warning and expiration.
```python
fm.deprecate(
    warning_message: str,  # Warning shown to users
    expiration_date: str   # ISO format expiration date
)
```
delete¶
Mark for garbage collection.
clone¶
Clone to a different compute cluster.
```python
cloned_fm = fm.clone(
    target_compute_cluster: str = None,  # Target cluster
    new_name: str = None                 # Name for cloned model
) -> FoundationalModel
```
Predictor¶
Represents a trained classifier or regressor.
Attributes¶
| Attribute | Type | Description |
|---|---|---|
| `id` | `str` | Predictor ID |
| `session_id` | `str` | Parent Foundational Model ID |
| `target_column` | `str` | Target column name |
| `target_type` | `str` | `"set"` (classification) or `"numeric"` (regression) |
| `status` | `str` | `"training"`, `"done"`, or `"error"` |
| `accuracy` | `float` | Training accuracy |
| `auc` | `float` | ROC-AUC score (classification) |
| `f1` | `float` | F1 score (classification) |
Methods¶
wait_for_training¶
Block until training completes.
```python
predictor.wait_for_training(
    max_wait_time: int = 3600,  # Maximum wait in seconds
    poll_interval: int = 10,    # Polling frequency
    show_progress: bool = True  # Print progress
)
```
predict¶
Make a single prediction.
```python
result = predictor.predict(
    record: Dict,                       # Record to predict
    feature_importance: bool = False,   # Compute feature importance
    best_metric_preference: str = None  # Checkpoint selection: "roc_auc" or "pr_auc"
) -> PredictionResult
```
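For example (the record fields are illustrative):

```python
result = predictor.predict(
    {"age": 42, "plan": "pro"},
    feature_importance=True,
)

print(result.predicted_class, result.confidence)
print(result.probabilities)       # full class distribution
print(result.feature_importance)  # present because it was requested
```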
batch_predict¶
Make batch predictions.
```python
results = predictor.batch_predict(
    records: List[Dict] | pd.DataFrame,  # Records to predict
    show_progress: bool = False          # Show progress bar
) -> List[PredictionResult]
```
predict_csv_file¶
Predict from a CSV file.
```python
results = predictor.predict_csv_file(
    file_path: str,              # Path to CSV file
    show_progress: bool = False  # Show progress bar
) -> List[PredictionResult]
```
explain¶
Get feature attributions for a prediction.
train_more¶
Continue training for more epochs.
create_api_endpoint¶
Create a production API endpoint.
```python
endpoint = predictor.create_api_endpoint(
    name: str,               # Endpoint name
    description: str = None  # Endpoint description
) -> APIEndpoint
```
configure_webhooks¶
Configure webhook URLs for events.
```python
predictor.configure_webhooks(
    training_started: str = None,               # Training start URL
    training_finished: str = None,              # Training complete URL
    alert_drift: str = None,                    # Data drift alert URL
    alert_performance_degradation: str = None,  # Performance drop URL
    alert_error_rate: str = None,               # Error rate URL
    alert_quota_threshold: str = None,          # Quota warning URL
    prediction_error: str = None,               # Prediction error URL
    batch_job_completed: str = None,            # Batch complete URL
    webhook_secret: str = None                  # Secret for signature verification
)
```
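For example, to wire up a completion hook and a drift alert (the URLs and secret are placeholders):

```python
predictor.configure_webhooks(
    training_finished="https://example.com/hooks/trained",
    alert_drift="https://example.com/hooks/drift",
    webhook_secret="replace-with-a-shared-secret",
)
```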
get_webhooks¶
Get current webhook configuration.
disable_webhook¶
Disable a specific webhook.
get_metrics¶
Get predictor training metrics.
predict_grid¶
Create a prediction grid for parameter sweeps.
```python
grid = predictor.predict_grid(
    degrees_of_freedom: int = 2,  # Number of varying dimensions
    grid_shape: tuple = (10, 10)  # Grid dimensions
) -> PredictionGrid
```
PredictionResult¶
Result from a prediction call.
Attributes¶
| Attribute | Type | Description |
|---|---|---|
| `prediction_uuid` | `str` | UUID for feedback tracking |
| `predicted_class` | `str` | Predicted class (classification) |
| `prediction` | `Any` | Raw prediction (float for regression) |
| `probability` | `float` | Raw softmax probability |
| `confidence` | `float` | Normalized confidence (0-1) |
| `probabilities` | `Dict[str, float]` | Full probability distribution |
| `threshold` | `float` | Decision threshold used |
| `guardrails` | `Dict[str, str]` | Per-column warnings |
| `feature_importance` | `Dict[str, float]` | Feature importance scores (if requested) |
| `ignored_query_columns` | `List[str]` | Unknown columns in the input |
| `available_query_columns` | `List[str]` | Valid columns for this model |
Methods¶
send_feedback¶
Create a `PredictionFeedback` object for this result; submit it with its `send()` method.
PredictionFeedback¶
Feedback for a prediction.
Methods¶
send¶
Submit the feedback to Featrix.
VectorDatabase¶
Similarity search database.
Methods¶
add_records¶
Add records to the database.
```python
vdb.add_records(
    records: List[Dict],   # Records to add
    batch_size: int = 500  # Batch size for upload
)
```
similarity_search¶
Find similar records.
```python
results = vdb.similarity_search(
    query: Dict,  # Query record
    k: int = 10   # Number of results
) -> List[Dict]
```
Returns: List of dicts with `similarity` (float) and `record` (dict) keys
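A minimal search flow (the records are illustrative):

```python
vdb = fm.create_vector_database(name="customers")
vdb.add_records([
    {"age": 42, "plan": "pro"},
    {"age": 29, "plan": "free"},
])

for hit in vdb.similarity_search({"age": 40, "plan": "pro"}, k=5):
    print(hit["similarity"], hit["record"])
```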
encode¶
Encode records to vectors.
size¶
Get number of records.
remove_records¶
Remove records from the database.
ReferenceRecord¶
Reference for positive-only matching.
Methods¶
find_similar¶
Find records similar to this reference.
```python
similar = ref.find_similar(
    k: int = 10,                            # Number of results
    vector_database: VectorDatabase = None  # Database to search
) -> List[Dict]
```
get_embedding¶
Get the embedding vector for this reference.
APIEndpoint¶
Production prediction endpoint.
Attributes¶
| Attribute | Type | Description |
|---|---|---|
| `id` | `str` | Endpoint ID |
| `url` | `str` | Endpoint URL |
| `api_key` | `str` | API key for authentication |
Methods¶
predict¶
Make a prediction via the endpoint.
```python
result = endpoint.predict(
    record: Dict,        # Record to predict
    api_key: str = None  # API key (uses stored key if None)
) -> PredictionResult
```
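For example, creating an endpoint from a trained predictor and calling it (the name and record are illustrative):

```python
endpoint = predictor.create_api_endpoint(name="churn-prod")

result = endpoint.predict({"age": 42, "plan": "pro"})
print(result.predicted_class, result.confidence)
```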
regenerate_api_key¶
Generate a new API key.
revoke_api_key¶
Revoke the API key, making the endpoint public or disabled.
get_usage_stats¶
Get usage statistics.
Returns: Dict with `usage_count`, `last_used_at`, etc.
delete¶
Delete the endpoint.
PredictionGrid¶
Grid for parameter sweeps and surface exploration.
Methods¶
set_axis_labels¶
Set labels for the grid axes.
predict¶
Queue a prediction at a grid position.
process_batch¶
Process all queued predictions.
plot_heatmap¶
Generate a heatmap visualization.
get_optimal_position¶
Find the grid position with the best prediction.
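A sketch of the intended workflow. Only `predict_grid` has a documented signature above, so the arguments passed to the grid methods below are assumptions:

```python
grid = predictor.predict_grid(degrees_of_freedom=2, grid_shape=(10, 10))

grid.set_axis_labels("price", "discount")  # hypothetical arguments
# ... queue predictions with grid.predict() at each grid position ...
grid.process_batch()                       # run all queued predictions
grid.plot_heatmap()                        # visualize the surface
best = grid.get_optimal_position()         # cell with the best prediction
```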
NotebookHelper¶
Jupyter notebook visualization utilities.
Methods¶
training_loss¶
Visualize training loss curve.
```python
fig = notebook.training_loss(
    fm: FoundationalModel,
    style: str = 'notebook',  # 'notebook', 'paper', 'presentation'
    show_learning_rate: bool = True,
    smooth: bool = True,
    figsize: tuple = (12, 6)
) -> Figure
```
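For example, after fetching the helper with `get_notebook` (assuming it takes no arguments, which is not documented above):

```python
notebook = featrix.get_notebook()
fig = notebook.training_loss(fm, style="paper", smooth=True)
```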
embedding_space_3d¶
Visualize embedding space in 3D.
```python
fig = notebook.embedding_space_3d(
    fm: FoundationalModel,
    sample_size: int = 2000,
    interactive: bool = True,  # True = plotly, False = matplotlib
    color_by: str = None,      # Column name for coloring
    figsize: tuple = (800, 600)
) -> Figure
```
training_movie¶
Create animated visualization of training.
```python
movie = notebook.training_movie(
    fm: FoundationalModel,
    notebook_mode: bool = True,  # True = Jupyter widget
    fps: int = 2
)
```
training_comparison¶
Compare multiple models.
```python
fig = notebook.training_comparison(
    models: List[FoundationalModel],
    labels: List[str],
    figsize: tuple = (12, 6)
) -> Figure
```
embedding_space_training¶
Visualize embedding space training metrics.
single_predictor_training¶
Visualize predictor training metrics.
Error Handling¶
Common Exceptions¶
| Exception | When Raised |
|---|---|
| `TimeoutError` | `wait_for_training()` exceeds `max_wait_time` |
| `RuntimeError` | Training fails or the model is in an error state |
| `ValueError` | Invalid input parameters |
| `requests.HTTPError` | API request fails |
Example¶
```python
try:
    fm = featrix.create_foundational_model(data_file="data.csv")
    fm.wait_for_training(max_wait_time=3600)
except TimeoutError:
    print("Training took too long")
except RuntimeError as e:
    print(f"Training failed: {e}")
except ValueError as e:
    print(f"Invalid input: {e}")
```
Webhook Events¶
| Event | Trigger |
|---|---|
| `training_started` | Predictor training begins |
| `training_finished` | Predictor training completes |
| `alert_drift` | Data drift detected in predictions |
| `alert_performance_degradation` | Model performance is dropping |
| `alert_error_rate` | Prediction error rate is increasing |
| `alert_quota_threshold` | Approaching usage quota |
| `prediction_error` | An individual prediction fails |
| `batch_job_completed` | Batch prediction job finishes |
Webhook Payload Format¶
```json
{
  "event": "training_finished",
  "timestamp": "2025-01-13T10:30:00Z",
  "foundational_model_id": "abc123",
  "predictor_id": "pred_456",
  "data": {
    "accuracy": 0.94,
    "auc": 0.97,
    "epochs_trained": 150
  }
}
```
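A minimal receiver sketch using Flask (the framework choice is an assumption, and the signature scheme for `webhook_secret` is not documented here, so verification is omitted):

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/hooks/featrix", methods=["POST"])
def featrix_webhook():
    payload = request.get_json()  # parse the JSON payload shown above
    if payload["event"] == "training_finished":
        print("Predictor", payload["predictor_id"], "finished:", payload["data"])
    return "", 204
```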