CustomFeatureType

API reference for CustomFeatureType

Types of custom features for advanced model monitoring.

This enum defines the types of custom features you can create for advanced monitoring. Custom features let you monitor complex data types, embeddings, and multi-column relationships.

Feature Categories:

  • Multi-column: Features derived from multiple input columns

  • Vector-based: Features from embedding or vector columns

  • Embedding-specific: Specialized monitoring for text and image embeddings

  • Enrichment: Features from data enrichment processes
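
The grouping above can be sketched in plain Python. This is a stand-in that mirrors the five member values documented on this page, not the library's own class definition; the `CATEGORY` mapping and the `category_of` helper are hypothetical names introduced only for illustration.

```python
from enum import Enum


class CustomFeatureType(str, Enum):
    """Stand-in mirror of the documented members (not the library's class)."""

    FROM_COLUMNS = 'FROM_COLUMNS'
    FROM_VECTOR = 'FROM_VECTOR'
    FROM_TEXT_EMBEDDING = 'FROM_TEXT_EMBEDDING'
    FROM_IMAGE_EMBEDDING = 'FROM_IMAGE_EMBEDDING'
    ENRICHMENT = 'ENRICHMENT'


# Hypothetical mapping from each member to the category it falls under.
CATEGORY = {
    CustomFeatureType.FROM_COLUMNS: 'Multi-column',
    CustomFeatureType.FROM_VECTOR: 'Vector-based',
    CustomFeatureType.FROM_TEXT_EMBEDDING: 'Embedding-specific',
    CustomFeatureType.FROM_IMAGE_EMBEDDING: 'Embedding-specific',
    CustomFeatureType.ENRICHMENT: 'Enrichment',
}


def category_of(value: str) -> str:
    """Look up the category for a raw string value such as 'FROM_VECTOR'."""
    return CATEGORY[CustomFeatureType(value)]
```

Looking up a member by its string value, as `category_of` does, raises `ValueError` for strings that are not one of the five documented values.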

Examples

Creating different types of custom features:

import fiddler as fdl

# Multi-column feature for monitoring column interactions
multivariate_feature = fdl.Multivariate(
    name='user_profile',
    columns=['age', 'income', 'location'],
    monitor_components=True
)

# Vector feature for embedding monitoring
vector_feature = fdl.VectorFeature(
    name='product_embedding',
    column='product_vector',
    n_clusters=10
)

# Text embedding feature with clustering
text_embedding = fdl.TextEmbedding(
    name='review_sentiment',
    column='review_embedding',
    n_clusters=5,
    n_tags=10
)

# Image embedding feature
image_embedding = fdl.ImageEmbedding(
    name='image_features',
    column='image_embedding',
    n_clusters=8
)

# Enrichment feature for data validation
enrichment_feature = fdl.Enrichment(
    name='email_validation',
    enrichment='email_validation',
    columns=['email_address'],
    config={'strict': True}
)

Custom features enable advanced monitoring, but each one must be configured to match your use case and data structure, starting with the column names and cluster counts it refers to.

FROM_COLUMNS = 'FROM_COLUMNS'

Multi-column derived features (Multivariate).

Used for creating custom features that monitor relationships and interactions between multiple input columns. Enables detection of drift patterns across column combinations.

Characteristics:

  • Monitors multiple columns as a single feature

  • Detects multi-dimensional drift patterns

  • Can monitor individual components separately

  • Supports complex feature interactions

Use cases:

  • Geographic coordinates (latitude, longitude)

  • User profiles (age, income, location)

  • Product specifications (dimensions, weight, price)

  • Time series components (trend, seasonality)

Configuration:

  • Specify list of columns to monitor together

  • Optional component monitoring

  • Clustering for dimensionality reduction
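
Conceptually, monitoring several columns as a single feature means treating each record's values as one vector. This page does not describe how the platform does that internally, so the following is only a minimal sketch of the idea; `rows_as_vectors` is a hypothetical helper, and the sample table is made up.

```python
from typing import Dict, List, Sequence


def rows_as_vectors(
    table: Dict[str, Sequence[float]], columns: List[str]
) -> List[List[float]]:
    """Combine the named columns row by row into one vector per record."""
    lengths = {len(table[name]) for name in columns}
    if len(lengths) != 1:
        raise ValueError('select at least one column; all must have equal length')
    n_rows = lengths.pop()
    return [[float(table[name][i]) for name in columns] for i in range(n_rows)]


# Illustrative data: two records with geographic coordinates.
table = {
    'latitude': [37.77, 40.71],
    'longitude': [-122.42, -74.01],
}
vectors = rows_as_vectors(table, ['latitude', 'longitude'])
```

Each entry of `vectors` now holds one record's latitude and longitude together, which is the shape a clustering step would need to detect drift across the combination rather than in each column separately.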

FROM_VECTOR = 'FROM_VECTOR'

Single vector column features (VectorFeature).

Used for monitoring embedding vectors or other high-dimensional numerical arrays as single features. Enables clustering-based drift detection and embedding analysis.

Characteristics:

  • Monitors single vector/embedding column

  • Clustering-based drift detection

  • Dimensionality reduction visualization

  • Vector similarity analysis

Use cases:

  • Word embeddings (Word2Vec, GloVe)

  • Neural network hidden layer outputs

  • Feature vectors from autoencoders

  • Learned representations

Configuration:

  • Specify vector column name

  • Set number of clusters for monitoring

  • Optional source column reference
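
One common way to implement clustering-based drift detection is to assign each vector to its nearest cluster centroid, build a histogram of cluster membership for the baseline and for production data, and compare the two histograms with a divergence measure. The sketch below assumes that approach and uses Jensen-Shannon divergence; this page does not state which algorithm or metric the platform uses, and all function names and data here are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def nearest_centroid(vec: Vector, centroids: List[Vector]) -> int:
    """Return the index of the centroid closest to vec (squared Euclidean)."""
    distances = [sum((a - b) ** 2 for a, b in zip(vec, c)) for c in centroids]
    return distances.index(min(distances))


def cluster_histogram(vectors: List[Vector], centroids: List[Vector]) -> List[float]:
    """Return the fraction of vectors assigned to each cluster."""
    if not vectors:
        raise ValueError('need at least one vector')
    counts = [0] * len(centroids)
    for vec in vectors:
        counts[nearest_centroid(vec, centroids)] += 1
    return [count / len(vectors) for count in counts]


def js_divergence(p: List[float], q: List[float]) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    def kl(a: List[float], b: List[float]) -> float:
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    mid = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


# Hypothetical centroids, assumed to have been fitted on baseline data.
centroids = [[0.0, 0.0], [10.0, 10.0]]
baseline = [[0.1, 0.2], [9.8, 10.1], [0.0, -0.3], [10.2, 9.9]]
production = [[9.9, 10.0], [10.1, 10.3], [9.7, 9.8], [0.2, 0.1]]

baseline_hist = cluster_histogram(baseline, centroids)      # [0.5, 0.5]
production_hist = cluster_histogram(production, centroids)  # [0.25, 0.75]
drift = js_divergence(baseline_hist, production_hist)
```

In this setup the number of clusters fixes the number of histogram bins, so a larger value gives a finer-grained comparison at the cost of noisier per-bin counts.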

FROM_TEXT_EMBEDDING = 'FROM_TEXT_EMBEDDING'

Text embedding features (TextEmbedding).

Specialized for monitoring text embeddings with text-specific analysis capabilities. Includes TF-IDF summarization and text-aware clustering.

Characteristics:

  • Text-specific embedding analysis

  • TF-IDF token summarization

  • Text-aware clustering

  • Semantic drift detection

Use cases:

  • BERT, GPT embeddings

  • Document embeddings

  • Sentence transformers

  • Text classification features

Configuration:

  • Specify embedding column

  • Set number of clusters

  • Configure TF-IDF tags per cluster
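
To illustrate what TF-IDF tags per cluster could look like, the sketch below treats each cluster's source texts as one document and keeps the highest-scoring tokens. It is an assumption-laden illustration, not the platform's implementation: the tokenizer is a plain whitespace split, the IDF formula is the textbook `log(N / df)`, and `top_tfidf_tokens` and the sample reviews are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def top_tfidf_tokens(
    cluster_texts: Dict[int, List[str]], n_tags: int
) -> Dict[int, List[str]]:
    """Return the n_tags highest-scoring TF-IDF tokens for each cluster."""
    # Treat each cluster as one document and count its whitespace-split tokens.
    counts = {
        cid: Counter(tok for text in texts for tok in text.lower().split())
        for cid, texts in cluster_texts.items()
    }
    # Document frequency: in how many clusters does each token occur?
    doc_freq = Counter(tok for c in counts.values() for tok in c)
    n_docs = len(counts)

    summary = {}
    for cid, c in counts.items():
        total = sum(c.values())
        scores = {
            tok: (n / total) * math.log(n_docs / doc_freq[tok])
            for tok, n in c.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda tok: (-scores[tok], tok))
        summary[cid] = ranked[:n_tags]
    return summary


# Illustrative source texts grouped by the cluster of their embedding.
cluster_texts = {
    0: ['great product great price', 'great value'],
    1: ['late delivery', 'delivery was late and slow'],
}
tags = top_tfidf_tokens(cluster_texts, n_tags=2)
```

With this formula a token that appears in every cluster scores zero, so the tags emphasize vocabulary that distinguishes one cluster from the others.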

FROM_IMAGE_EMBEDDING = 'FROM_IMAGE_EMBEDDING'

Image embedding features (ImageEmbedding).

Specialized for monitoring image embeddings and visual features extracted from images. Optimized for computer vision model monitoring.

Characteristics:

  • Image-specific embedding analysis

  • Visual feature clustering

  • Image-aware drift detection

  • Computer vision optimizations

Use cases:

  • CNN feature extractions

  • Image classification embeddings

  • Object detection features

  • Visual similarity vectors

Configuration:

  • Specify embedding column

  • Set clustering parameters

  • Image-specific preprocessing
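
This page does not say which preprocessing is applied to image embeddings. A step often used before clustering or comparing embeddings is L2 normalization, after which the dot product of two vectors equals their cosine similarity. The following sketch shows that generic technique only; both helper names are hypothetical and nothing here is claimed to be what the platform does.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale vec to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError('cannot normalize a zero vector')
    return [x / norm for x in vec]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, computed as the dot product of unit vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


# Illustrative embeddings: same direction, different magnitude.
same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Normalizing first makes the comparison depend on the direction of an embedding rather than its magnitude, which is usually the property of interest for visual similarity vectors.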

ENRICHMENT = 'ENRICHMENT'

Enrichment-derived features (Enrichment).

Used for features created through data enrichment processes such as validation, transformation, or external data augmentation. Enables monitoring of enriched data quality and consistency.

Characteristics:

  • Derived from enrichment processes

  • Data quality monitoring

  • Validation result tracking

  • Transformation monitoring

Use cases:

  • Email validation results

  • Address standardization

  • Data quality scores

  • External API enrichments

Configuration:

  • Specify enrichment type

  • Configure enrichment parameters

  • Set input columns for enrichment
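
As a rough picture of what an enrichment contributes, the sketch below derives a validity flag from an input column and summarizes it as a rate that could be tracked over time. The regular expression is a deliberately loose check chosen for illustration, and `validate_emails` and `valid_rate` are hypothetical helpers; the actual validation rules and configuration options of the `email_validation` enrichment are not documented on this page.

```python
import re
from typing import List, Optional

# Loose illustrative pattern: something@something.something, no whitespace.
_EMAIL_RE = re.compile(r'^[^@\s]+@[^@\s]+\.[^@\s]+$')


def validate_emails(values: List[Optional[str]]) -> List[bool]:
    """Return one validity flag per input value; missing values are invalid."""
    return [bool(v) and _EMAIL_RE.match(v) is not None for v in values]


def valid_rate(flags: List[bool]) -> float:
    """Fraction of records that passed validation."""
    if not flags:
        raise ValueError('need at least one record')
    return sum(flags) / len(flags)


# Illustrative input column and the derived enrichment output.
email_address = ['a@example.com', 'not-an-email', None, 'b@example.org']
is_valid = validate_emails(email_address)
rate = valid_rate(is_valid)
```

A drop in `rate` between a baseline window and a production window is the kind of data quality signal that validation result tracking is meant to surface.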
