ModelTask
API reference for ModelTask
ModelTask
Machine learning task types supported by Fiddler.
This enum defines the different types of ML tasks that Fiddler can monitor. The task type determines which metrics are calculated, how performance is measured, and what monitoring capabilities are available.
Task-Specific Features:
Classification: Accuracy, precision, recall, F1, AUC, confusion matrix
Regression: MAE, MSE, RMSE, R², residual analysis
Ranking: NDCG, MAP, precision@k, ranking-specific metrics
LLM: Token-based metrics, response quality, safety metrics
Examples
Configuring models for different tasks:
import fiddler as fdl

# Binary classification (fraud detection)
fraud_model = fdl.Model.from_data(
    name='fraud_detector',
    source=fraud_data,
    spec=model_spec,
    task=fdl.ModelTask.BINARY_CLASSIFICATION,
    task_params=fdl.ModelTaskParams(
        binary_classification_threshold=0.5
    )
)

# Multiclass classification (sentiment analysis)
sentiment_model = fdl.Model.from_data(
    name='sentiment_analyzer',
    source=sentiment_data,
    spec=model_spec,
    task=fdl.ModelTask.MULTICLASS_CLASSIFICATION,
    task_params=fdl.ModelTaskParams(
        target_class_order=['negative', 'neutral', 'positive']
    )
)

# Regression (price prediction)
price_model = fdl.Model.from_data(
    name='price_predictor',
    source=price_data,
    spec=model_spec,
    task=fdl.ModelTask.REGRESSION
)

# Ranking (recommendation system)
ranking_model = fdl.Model.from_data(
    name='recommender',
    source=ranking_data,
    spec=model_spec,
    task=fdl.ModelTask.RANKING,
    task_params=fdl.ModelTaskParams(
        group_by='user_id',
        top_k=10
    )
)

# LLM (language model)
llm_model = fdl.Model.from_data(
    name='chatbot',
    source=conversation_data,
    spec=model_spec,
    task=fdl.ModelTask.LLM
)

BINARY_CLASSIFICATION = 'binary_classification'
Two-class classification tasks.
Used for models that predict one of two possible outcomes or classes. Enables binary classification metrics and threshold-based analysis.
Available metrics:
Accuracy, Precision, Recall, F1-score
AUC-ROC, AUC-PR curves
Confusion matrix analysis
Threshold optimization tools
Typical use cases:
Fraud detection (fraud/legitimate)
Email spam filtering (spam/ham)
Medical diagnosis (positive/negative)
Credit approval (approve/deny)
Churn prediction (churn/retain)
Required outputs: Single probability score or binary prediction
Task parameters: binary_classification_threshold
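To make the threshold concrete, here is a minimal sketch, using plain pandas rather than the Fiddler API, of how a 0.5 decision threshold turns the model's probability output into the two class labels that the metrics above are computed over:

# Illustrative only (not Fiddler API): apply the decision cut-off that
# binary_classification_threshold configures to a probability score column.
import pandas as pd

scores = pd.Series([0.12, 0.48, 0.51, 0.97], name='fraud_score')
threshold = 0.5
predicted_labels = (scores >= threshold).astype(int)
print(predicted_labels.tolist())  # [0, 0, 1, 1]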
MULTICLASS_CLASSIFICATION = 'multiclass_classification'
Multi-class classification tasks.
Used for models that predict one of multiple possible classes or categories. Supports comprehensive multiclass performance analysis and class-specific metrics.
Available metrics:
Per-class precision, recall, F1-score
Macro and micro-averaged metrics
Confusion matrix with multiple classes
Class distribution analysis
Typical use cases:
Document categorization (multiple topics)
Image classification (multiple objects)
Sentiment analysis (positive/neutral/negative)
Product categorization
Intent classification in chatbots
Required outputs: Class probabilities or single class prediction
Task parameters: target_class_order, class_weights
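A minimal sketch of the multiclass task parameters; how the ordering lines up with a particular model's probability outputs depends on your ModelSpec:

import fiddler as fdl

# Sketch: target_class_order fixes the label ordering used when interpreting the
# model's multiclass outputs; class_weights (where supported) follows the same order.
task_params = fdl.ModelTaskParams(
    target_class_order=['negative', 'neutral', 'positive']
)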
REGRESSION = 'regression'
Continuous value prediction tasks.
Used for models that predict numerical values on a continuous scale. Enables regression-specific metrics and residual analysis.
Available metrics:
Mean Absolute Error (MAE)
Mean Squared Error (MSE)
Root Mean Squared Error (RMSE)
R-squared (coefficient of determination)
Residual distribution analysis
Typical use cases:
Price prediction
Sales forecasting
Risk scoring (continuous scores)
Demand forecasting
Performance rating prediction
Required outputs: Single continuous numerical value
Task parameters: None (uses standard regression metrics)
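For reference, the listed metrics can be computed by hand with plain NumPy; this sketch is independent of Fiddler and only illustrates what each number measures:

# Illustrative only: standard regression metrics computed by hand.
import numpy as np

y_true = np.array([100.0, 150.0, 200.0, 250.0])
y_pred = np.array([110.0, 140.0, 205.0, 240.0])

mae = np.mean(np.abs(y_true - y_pred))   # Mean Absolute Error
mse = np.mean((y_true - y_pred) ** 2)    # Mean Squared Error
rmse = np.sqrt(mse)                      # Root Mean Squared Error
r2 = 1 - mse / np.var(y_true)            # R-squared
residuals = y_true - y_pred              # basis for residual distribution analysis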
RANKING = 'ranking'
Ranking and recommendation tasks.
Used for models that rank items or provide ordered recommendations. Supports ranking-specific metrics and list-wise evaluation.
Available metrics:
Normalized Discounted Cumulative Gain (NDCG)
Mean Average Precision (MAP)
Precision@K, Recall@K
Mean Reciprocal Rank (MRR)
Hit Rate analysis
Typical use cases:
Search result ranking
Product recommendations
Content recommendation systems
Information retrieval
Personalized ranking
Required outputs: Ranked list of items with scores
Task parameters: group_by (session/user ID), top_k
Special data format: Grouped by query/session identifier
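Because ranking events are grouped, the data typically looks like the sketch below; the user_id column matches the group_by setting in the example above, while the other column names are illustrative assumptions:

# Sketch of grouped ranking data: all rows sharing a user_id form one ranked list.
import pandas as pd

ranking_data = pd.DataFrame({
    'user_id':   ['u1', 'u1', 'u1', 'u2', 'u2'],  # group_by identifier
    'item_id':   ['a', 'b', 'c', 'a', 'd'],
    'score':     [0.91, 0.64, 0.20, 0.83, 0.47],  # model's ranking score
    'relevance': [1, 0, 0, 1, 1],                 # observed label for NDCG/MAP/precision@k
})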
LLM = 'llm'
Large language model and generative AI tasks.
Used for language models, chatbots, and generative AI applications. Enables LLM-specific monitoring including safety, quality, and performance metrics.
Available metrics:
Response quality metrics
Safety and toxicity detection
Hallucination detection
Token-based analysis
Latency and throughput metrics
Typical use cases:
Chatbots and conversational AI
Text generation models
Question-answering systems
Code generation models
Content creation assistants
Special features:
Guardrails integration
Safety monitoring
Prompt and response analysis
Token usage tracking
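The exact event schema depends on the application; the sketch below only illustrates the kind of prompt/response events that token usage tracking and latency metrics operate on, and every column name is an assumption rather than a fixed Fiddler schema:

# Sketch: one prompt/response event for an LLM task (column names are assumptions).
import pandas as pd

conversation_data = pd.DataFrame({
    'prompt':      ['What is your return policy?'],
    'response':    ['You can return most items within 30 days of delivery.'],
    'token_count': [14],   # input for token-based analysis
    'latency_ms':  [412],  # input for latency/throughput metrics
})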
NOT_SET = 'not_set'
Placeholder for undefined or unspecified tasks.
Used as a default value when the model task has not been explicitly defined. Should be replaced with an appropriate task type during model configuration.
This value should not be used for production models as it limits available monitoring capabilities and metrics.
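A simple guard can catch the placeholder before onboarding:

import fiddler as fdl

# Sketch: fail fast if a task was left at its placeholder value (my_task is hypothetical).
my_task = fdl.ModelTask.NOT_SET
if my_task is fdl.ModelTask.NOT_SET:
    raise ValueError('Choose an explicit ModelTask before configuring the model.')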
is_classification()
Check if the task is a classification type.
Returns
True if task is binary or multiclass classification
Return type: bool
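For example, a minimal sketch assuming the helper is called on an enum member as documented here:

import fiddler as fdl

assert fdl.ModelTask.BINARY_CLASSIFICATION.is_classification()      # True
assert fdl.ModelTask.MULTICLASS_CLASSIFICATION.is_classification()  # True
assert not fdl.ModelTask.REGRESSION.is_classification()             # False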
is_regression()
Check if the task is regression.
Returns
True if task is regression
Return type: bool
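And the regression counterpart, under the same assumption:

import fiddler as fdl

assert fdl.ModelTask.REGRESSION.is_regression()   # True
assert not fdl.ModelTask.RANKING.is_regression()  # False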