
© 2024 Fiddler Labs, Inc.


Flexible Model Deployment




Fiddler supports explainability for models with varying dependencies. This is achieved by running each model in its own dedicated container, which provides the resources and dependencies unique to that model. For example, if your team has two models developed with the same libraries but different versions, you can run both models by specifying the exact versions they were built with.

📘 Note

For models that require monitoring features only, there is no need to upload your model artifact or create a surrogate model as these are only used to support explainability features.


When adding a model artifact to your Fiddler model (see add_artifact), you specify the deployment configuration needed to run it using the DeploymentParams argument. Fiddler provides a set of starter images from which to select the configuration most appropriate for running your model. These images vary by included libraries and Python version. Note that you can also customize an image by including your own requirements.txt file along with the model artifact package.

DeploymentParams Arguments

  • image_uri: This is the Docker image used to create a new runtime to serve the model. You can choose a base image from the following list, with the matching requirements for your model:

| Image URI | Dependencies |
| --- | --- |
| md-base/python/python-39:2.0.2 | fiddler-client==3.0.3, flask==2.2.5, gevent==23.9.0, gunicorn==22.0.0, prometheus-flask-exporter==0.21.0, pyarrow==14.0.1, pydantic==1.10.13 |
| md-base/python/python-310:1.0.0 | fiddler-client==3.0.3, flask==2.2.5, gevent==23.9.0, gunicorn==22.0.0, prometheus-flask-exporter==0.21.0, pyarrow==14.0.1, pydantic==1.10.13 |
| md-base/python/python-311:1.0.0 | fiddler-client==3.0.3, flask==2.2.5, gevent==23.9.0, gunicorn==22.0.0, prometheus-flask-exporter==0.21.0, pyarrow==14.0.1, pydantic==1.10.13 |
| md-base/python/java:2.1.0 | fiddler-client==3.0.3, flask==2.2.5, gevent==23.9.0, gunicorn==22.0.0, h2o==3.46.0.5, prometheus-flask-exporter==0.21.0, pyarrow==14.0.1, pydantic==1.10.13 |
| md-base/python/rpy2:2.0.2 | fiddler-client==3.0.3, flask==2.2.5, gevent==23.9.0, gunicorn==22.0.0, prometheus-flask-exporter==0.21.0, pyarrow==14.0.1, pydantic==1.10.13, rpy2==3.5.1 |

📘 Image upgrades

These Docker images are routinely upgraded to resolve security vulnerabilities, and the image tag is updated accordingly. Images are not provided for unsupported Python versions.

🚧 Be aware

Model version features are supported with the image versions listed above. Images below 2.x for python-39, java, and rpy2 will continue to work for existing models that use a single version. From release 24.5 onward, first-class support for model versions is available, and it requires the newer base image tag versions listed above.

Each base image comes with a few pre-installed libraries. These can be overridden or added to by including a requirements.txt file inside the model artifact directory where package.py is defined.

Note that the old deep-learning and machine-learning images are deprecated: all current versions still work, but they are no longer maintained or upgraded. We encourage users to select a plain Python image and add any necessary libraries in requirements.txt.

🚧 Be aware

Installing new dependencies at runtime will take time and is prone to network errors.
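For illustration, a model artifact directory with custom dependencies might be laid out as follows. The package.py and requirements.txt names are the files this page describes; the artifact file name is hypothetical:

```
model_dir/
├── package.py          # defines how the model is loaded and called
├── requirements.txt    # extra libraries layered on top of the base image
└── model.pkl           # serialized model artifact (name is illustrative)
```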

  • replicas: The number of Docker image replicas running the model.

  • memory: The amount of memory (in mebibytes) reserved per replica. NLP models may need more memory, so be sure to allocate sufficient resources.

🚧 Be aware

Your model might require more memory than the default setting, so be sure to allocate a sufficient amount of resources. If you see a ModelServeError when adding a model, the current settings were not enough to run your model.

  • cpu: The amount of CPU (in millicpus) reserved per replica. Both the number of features and the model's complexity can call for more CPU allocation.

```python
# Specify deployment parameters
deployment_params = fdl.DeploymentParams(
    image_uri="md-base/python/python-311:1.0.0",
    cpu=250,
    memory=512,
    replicas=1,
)

# Add the model artifact; add_artifact returns an async job
job = model.add_artifact(
    model_dir="path/to/your/model_dir",  # directory with model artifacts and package.py
    deployment_params=deployment_params,
)
job.wait()
```
  • Horizontal scaling: Model deployments support horizontal scaling via the replicas parameter, which creates multiple Kubernetes pods internally to handle concurrent requests.

  • Vertical scaling: Model deployments support vertical scaling via cpu and memory parameters. Some models might need more memory to load the artifacts into memory or process the requests.

  • Scale down: You may want to scale down a model deployment to avoid allocating resources while the model is not in use. Set the active parameter to False to scale down the deployment.

  • Scale up: To scale a model deployment back up, set the active parameter to True.
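As an illustrative sketch of scaling down and back up (assuming the 3.x Python client, where a model's deployment settings are exposed via model.deployment; exact attribute and method names may differ in your client version):

```python
import fiddler as fdl

# Placeholder connection details
fdl.init(url="https://your-instance.fiddler.ai", token="YOUR_TOKEN")
model = fdl.Model.from_name(name="my_model", project_id="my_project_id")

# Scale down: stop allocating compute while the model is idle
model.deployment.active = False
model.deployment.update()

# Scale up: restore serving before running explainability jobs again
model.deployment.active = True
model.deployment.update()
```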

Both the add_artifact and update_artifact methods support passing deployment_params.

Once the model has been added to Fiddler, you can fine-tune the model deployment to match your scaling requirements using update_model_deployment, which supports the scaling options described above.
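For example, updating a model artifact while also changing its deployment configuration might look like the following sketch (paths and resource values are placeholders):

```python
# Request more resources for an updated, heavier model artifact
deployment_params = fdl.DeploymentParams(
    image_uri="md-base/python/python-311:1.0.0",
    cpu=500,       # millicpus per replica
    memory=1024,   # mebibytes per replica
    replicas=2,
)

job = model.update_artifact(
    model_dir="path/to/updated_model_dir",
    deployment_params=deployment_params,
)
job.wait()
```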


Questions? Talk to a product expert or request a demo.

Need help? Contact us at help@fiddler.ai.
