LogoLogo
👨‍💻 API Reference📣 Release Notes📺 Request a Demo
  • Introduction to Fiddler
    • Monitor, Analyze, and Protect your ML Models and Gen AI Applications
  • Fiddler Doc Chatbot
  • First Steps
    • Getting Started With Fiddler Guardrails
    • Getting Started with LLM Monitoring
    • Getting Started with ML Model Observability
  • Tutorials & Quick Starts
    • LLM and GenAI
      • LLM Evaluation - Compare Outputs
      • LLM Monitoring - Simple
    • Fiddler Free Guardrails
      • Guardrails - Quick Start Guide
      • Guardrails - Faithfulness
      • Guardrails - Safety
      • Guardrails FAQ
    • ML Observability
      • ML Monitoring - Simple
      • ML Monitoring - NLP Inputs
      • ML Monitoring - Class Imbalance
      • ML Monitoring - Model Versions
      • ML Monitoring - Ranking
      • ML Monitoring - Regression
      • ML Monitoring - Feature Impact
      • ML Monitoring - CV Inputs
  • Glossary
    • Product Concepts
      • Baseline
      • Custom Metric
      • Data Drift
      • Embedding Visualization
      • Fiddler Guardrails
      • Fiddler Trust Service
      • LLM and GenAI Observability
      • Metric
      • Model Drift
      • Model Performance
      • ML Observability
      • Trust Score
  • Product Guide
    • LLM Application Monitoring & Protection
      • LLM-Based Metrics
      • Embedding Visualizations for LLM Monitoring and Analysis
      • Selecting Enrichments
      • Enrichments (Private Preview)
      • Guardrails for Proactive Application Protection
    • Optimize Your ML Models and LLMs with Fiddler's Comprehensive Monitoring
      • Alerts
      • Package-Based Alerts (Private Preview)
      • Class Imbalanced Data
      • Enhance ML and LLM Insights with Custom Metrics
      • Data Drift: Monitor Model Performance Changes with Fiddler's Insights
      • Ensuring Data Integrity in ML Models And LLMs
      • Embedding Visualization With UMAP
      • Fiddler Query Language
      • Model Versions
      • How to Effectively Use the Monitoring Chart UI
      • Performance Tracking
      • Model Segments: Analyze Cohorts for Performance Insights and Bias Detection
      • Statistics
      • Monitoring ML Model and LLM Traffic
      • Vector Monitoring
    • Enhance Model Insights with Fiddler's Slice and Explain
      • Events Table in RCA
      • Feature Analytics Creation
      • Metric Card Creation
      • Performance Charts Creation
      • Performance Charts Visualization
    • Master AI Monitoring: Create, Customize, and Compare Dashboards
      • Creating Dashboards
      • Dashboard Interactions
      • Dashboard Utilities
    • Adding and Editing Models in the UI
      • Model Editor UI
      • Model Schema Editing Guide
    • Fairness
    • Explainability
      • Model: Artifacts, Package, Surrogate
      • Global Explainability: Visualize Feature Impact and Importance in Fiddler
      • Point Explainability
      • Flexible Model Deployment
        • On Prem Manual Flexible Model Deployment XAI
  • Technical Reference
    • Python Client API Reference
    • Python Client Guides
      • Installation and Setup
      • Model Onboarding
        • Create a Project and Onboard a Model for Observation
        • Model Task Types
        • Customizing your Model Schema
        • Specifying Custom Missing Value Representations
      • Publishing Inference Data
        • Creating a Baseline Dataset
        • Publishing Batches Of Events
        • Publishing Ranking Events
        • Streaming Live Events
        • Updating Already Published Events
        • Deleting Events From Fiddler
      • Creating and Managing Alerts
      • Explainability Examples
        • Adding a Surrogate Model
        • Uploading Model Artifacts
        • Updating Model Artifacts
        • ML Framework Examples
          • Scikit Learn
          • Tensorflow HDF5
          • Tensorflow Savedmodel
          • Xgboost
        • Model Task Examples
          • Binary Classification
          • Multiclass Classification
          • Regression
          • Uploading A Ranking Model Artifact
    • Integrations
      • Data Pipeline Integrations
        • Airflow Integration
        • BigQuery Integration
        • Integration With S3
        • Kafka Integration
        • Sagemaker Integration
        • Snowflake Integration
      • ML Platform Integrations
        • Integrate Fiddler with Databricks for Model Monitoring and Explainability
        • Datadog Integration
        • ML Flow Integration
      • Alerting Integrations
        • PagerDuty Integration
    • Comprehensive REST API Reference
      • Projects REST API Guide
      • Model REST API Guide
      • File Upload REST API Guide
      • Custom Metrics REST API Guide
      • Segments REST API Guide
      • Baselines REST API Guide
      • Jobs REST API Guide
      • Alert Rules REST API Guide
      • Environments REST API Guide
      • Explainability REST API Guide
      • Server Info REST API Guide
      • Events REST API Guide
      • Fiddler Trust Service REST API Guide
    • Fiddler Free Guardrails Documentation
  • Configuration Guide
    • Authentication & Authorization
      • Adding Users
      • Overview of Role-Based Access Control
      • Email Authentication
      • Okta OIDC SSO Integration
      • Azure AD OIDC SSO Integration
      • Ping Identity SAML SSO Integration
      • Mapping LDAP Groups & Users to Fiddler Teams
    • Application Settings
    • Supported Browsers
  • History
    • Release Notes
    • Python Client History
    • Compatibility Matrix
    • Product Maturity Definitions
Powered by GitBook

© 2024 Fiddler Labs, Inc.

On this page
  • Detecting Drift in Multi-Dimensional ML and GenAI Model Data
  • Defining Custom Features
  • Passing Custom Features List to ModelSpec

Was this helpful?

  1. Product Guide
  2. Optimize Your ML Models and LLMs with Fiddler's Comprehensive Monitoring

Vector Monitoring

Detecting Drift in Multi-Dimensional ML and GenAI Model Data

Many modern machine learning systems use input features that cannot be represented as a single number (e.g., text or image data). Such complex features are usually rather represented by high-dimensional vectors which are obtained by applying a vectorization method (e.g., text embeddings generated by NLP models). Furthermore, Fiddler users might be interested in monitoring a group of univariate features together and detecting data drift in multi-dimensional feature spaces.

In order to address the above needs, Fiddler provides a vector monitoring capability which involves enabling users to define "custom features", and a novel method for monitoring data drift in multi-dimensional spaces.

Custom features can be defined by grouping columns together in the baseline and inference data. Or, in the case of NLP or image data, custom features can be defined using columns containing embedding vectors.

Defining Custom Features

Users can use the Fiddler client to define one or more custom features. Custom features can be specified by:

  1. a group of dataset columns that need to be monitored together as a vector. (CF1, CF2)

  2. a column containing an existing embedding vector along with the source column (CF3, CF4)

  3. defining an enrichment that will instruct Fiddler to generate the embedding vector within the product itself (CF5, CF6).

Once a list of custom features is defined and passed to Fiddler (the details of how to use the Fiddler client to define custom features are provided below), Fiddler runs a clustering-based data drift detection algorithm for each custom feature and calculates a corresponding drift value between the baseline and the published events at the selected time period.

CF1 = fdl.CustomFeature.from_columns(['f1','f2','f3'], custom_name = 'vector1')
CF2 = fdl.CustomFeature.from_columns(['f1','f2','f3'], n_clusters=5, custom_name = 'vector2')
CF3 = fdl.TextEmbedding(name='text_embedding',column='embedding_col',source_column='text')
CF4 = fdl.ImageEmbedding(name='image_embedding',column='embedding_col2',source_column='image_url')

CF5 = fdl.Enrichment(
        name='Enrichment Unstructured Embedding',
        enrichment='embedding',
        columns=['doc_col'],
    )
CF6 = fdl.TextEmbedding(
        name='Document TextEmbedding',
        source_column='doc_col',
        column='Enrichment Unstructured Embedding',
        n_tags=10
    )

Passing Custom Features List to ModelSpec

model_spec = fdl.ModelSpec(
    inputs=[
        'creditscore',
        'geography',
        'gender',
        'age',
        'tenure',
        'balance',
        'numofproducts',
        'hascrcard',
        'isactivemember',
        'estimatedsalary',
        'doc_col'
    ],
    outputs=['predicted_churn'],
    targets=['churn'],
    metadata=['customer_id', 'timestamp']
    custom_features = [CF1,CF2,CF3,CF4,C5,C6]
)

MODEL_NAME = 'my_model'

model = fdl.Model.from_data(
    name=MODEL_NAME,
    project_id=fdl.Project.from_name(PROJECT_NAME).id,
    source=sample_df,
    spec=model_spec,
    task=model_task,
    task_params=task_params,
    event_id_col=id_column,
    event_ts_col=timestamp_column
)

model.create()

📘 Quick Start for NLP Monitoring

PreviousMonitoring ML Model and LLM TrafficNextEnhance Model Insights with Fiddler's Slice and Explain

Last updated 2 months ago

Was this helpful?

Once the custom features are defined for vector monitoring, they are then defined as a part of the and onboarded to Fiddler.

Check out our for a fully functional notebook example where we instruct Fiddler to generate the embeddings for unstructured inputs.

Quick Start guide for NLP monitoring

Questions? to a product expert or a demo.

Need help? Contact us at .

❓
💡
Talk
request
help@fiddler.ai
fdl.ModelSpec