client.get_feature_impact

Get global feature impact for a model over a dataset or a slice.

| Input Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | str | None | A unique identifier for the project. |
| model_id | str | None | A unique identifier for the model. |
| data_source | Union[fdl.DatasetDataSource, fdl.SqlSliceQueryDataSource] | None | The data source to compute feature impact on: either a DatasetDataSource or a SqlSliceQueryDataSource. |
| num_iterations | Optional[int] | 10000 | The maximum number of ablated model inferences per feature. Used for TABULAR data only. |
| num_refs | Optional[int] | 10000 | The number of reference points used in the explanation. Used for TABULAR data only. |
| ci_level | Optional[float] | 0.95 | The confidence level (between 0 and 1). Used for TABULAR data only. |
| output_columns | Optional[List[str]] | None | Used for NLP (TEXT input) models only. The output column names to compute feature impact on. Useful for multi-class classification models. If None, feature impact is computed for all output columns. |
| min_support | Optional[int] | 15 | Used for NLP (TEXT input) models only. The minimum support (the number of times a specific word must appear in the sample data) required to retrieve top words. |
| overwrite_cache | Optional[bool] | False | Whether to overwrite the cached feature impact values. |

import fiddler as fdl

# `client` is assumed to be an already-initialized Fiddler API client.
PROJECT_ID = 'example_project'
MODEL_ID = 'example_model'
DATASET_ID = 'example_dataset'

# Feature Impact for TABULAR data - Dataset Data Source
feature_impact = client.get_feature_impact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=200),
    num_iterations=300,
    num_refs=200,
    ci_level=0.90,
)

# Feature Impact for TABULAR data - Slice Query data source
query = f'SELECT * FROM {DATASET_ID}.{MODEL_ID} WHERE CreditScore > 700'
feature_impact = client.get_feature_impact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.SqlSliceQueryDataSource(query=query, num_samples=80),
    num_iterations=300,
    num_refs=200,
    ci_level=0.90,
)

# Feature Impact for TEXT data
feature_impact = client.get_feature_impact(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    data_source=fdl.DatasetDataSource(dataset_id=DATASET_ID, num_samples=50),
    output_columns=['probability_A', 'probability_B'],
    min_support=30,
)

| Return Type | Description |
| --- | --- |
| tuple | A named tuple with the feature impact results. |
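
Because the result is a named tuple, it can be inspected with the standard named-tuple helpers even when its exact fields are not known in advance. The sketch below uses a stand-in object so that it runs without a Fiddler deployment: the type name `FeatureImpactResult`, the fields `feature_names` and `mean_abs_impact`, and the values are all hypothetical and are not taken from the Fiddler client. Only `_fields` and `_asdict()` are standard Python.

```python
from collections import namedtuple

# Stand-in for the object returned by client.get_feature_impact(...).
# The type name, field names, and values are hypothetical placeholders.
FeatureImpactResult = namedtuple(
    "FeatureImpactResult", ["feature_names", "mean_abs_impact"]
)
feature_impact = FeatureImpactResult(
    feature_names=["CreditScore", "Age"],
    mean_abs_impact=[0.12, 0.03],
)

# Every named tuple exposes its field names through `_fields`.
print(feature_impact._fields)

# `_asdict()` converts the tuple to a dict, which is handy for logging
# or for serializing the results.
result = feature_impact._asdict()
for name, value in result.items():
    print(f"{name}: {value}")
```

With a real result, printing `feature_impact._fields` first is a quick way to discover which attributes are available before accessing them by name.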