fdl.ModelInfo.from_dataset_info

Constructs a ModelInfo object from a DatasetInfo object.

| Input Parameters | Type | Default | Description |
| --- | --- | --- | --- |
| dataset_info | fdl.DatasetInfo() | | The DatasetInfo object from which to construct the ModelInfo object. |
| target | str | | The column to be used as the target (ground truth). |
| model_task | fdl.ModelTask | None | A ModelTask object containing the model task. |
| dataset_id | Optional[str] | None | The unique identifier for the dataset. |
| features | Optional[list] | None | A list of columns to be used as features. |
| custom_features | Optional[List[CustomFeature]] | None | List of Custom Features definitions for a model. Objects of type Multivariate, Vector, ImageEmbedding or TextEmbedding derived from CustomFeature can be provided. |
| metadata_cols | Optional[list] | None | A list of columns to be used as metadata fields. |
| decision_cols | Optional[list] | None | A list of columns to be used as decision fields. |
| display_name | Optional[str] | None | A display name for the model. |
| description | Optional[str] | None | A description of the model. |
| input_type | Optional[fdl.ModelInputType] | fdl.ModelInputType.TABULAR | A ModelInputType object containing the input type of the model. |
| outputs | Optional[list] | | A list of Column objects corresponding to the outputs (predictions) of the model. |
| targets | Optional[list] | None | A list of Column objects corresponding to the targets (ground truth) of the model. |
| model_deployment_params | Optional[fdl.ModelDeploymentParams] | None | A ModelDeploymentParams object containing information about model deployment. |
| framework | Optional[str] | None | A string providing information about the software library and version used to train and run this model. |
| datasets | Optional[list] | None | A list of the dataset IDs used by the model. |
| mlflow_params | Optional[fdl.MLFlowParams] | None | A MLFlowParams object containing information about MLFlow parameters. |
| preferred_explanation_method | Optional[fdl.ExplanationMethod] | None | An ExplanationMethod object that specifies the default explanation algorithm to use for the model. |
| custom_explanation_names | Optional[list] | [] | A list of names that can be passed to the explanation_name argument of the optional user-defined explain_custom method of the model object defined in package.py. |
| binary_classification_threshold | Optional[float] | 0.5 | The threshold used for classifying inferences for binary classifiers. |
| ranking_top_k | Optional[int] | 50 | Used only for ranking models. Sets the top k results to take into consideration when computing performance metrics like MAP and NDCG. |
| group_by | Optional[str] | None | Used only for ranking models. The column by which to group events for certain performance metrics like MAP and NDCG. |
| fall_back | Optional[dict] | None | A dictionary mapping a column name to custom missing value encodings for that column. |
| categorical_target_class_details | Optional[Union[list, int, str]] | None | A list denoting the order of classes in the target. This parameter is required in the following cases (listed after the table): |

- Binary classification tasks: If the target is of type string, you must tell Fiddler which class is the positive class for your output column. If you provide a single element, it is treated as the positive class. Alternatively, you can provide a list of two elements: by convention, the 0th element is the negative class and the 1st element is the positive class. If your target is boolean, you don't need to specify this argument; by default, Fiddler treats True as the positive class. If your target is numerical, you don't need to specify it either; by default, Fiddler treats the higher of the two possible values as the positive class.

- Multi-class classification tasks: You must tell Fiddler which class corresponds to which output by giving an ordered list of classes. This order should be the same as the order of the outputs.

- Ranking tasks: If the target is of type string, you must provide a list of all possible target values in order of relevance. The first element is treated as the least relevant grade and the last element as the most relevant grade. If your target is numerical, Fiddler treats the smallest value as the least relevant grade and the largest value as the most relevant grade.

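
The ordering rules above for binary classification and ranking targets can be sketched in plain Python. This is only an illustration of the conventions as described here, not Fiddler's implementation: the helpers resolve_binary_classes and relevance_grades are hypothetical names that do not exist in the Fiddler client, and numbering grades from 0 is an assumption made for the example.

```python
from typing import Dict, List, Optional, Sequence, Tuple, Union

Label = Union[str, bool, int, float]


def resolve_binary_classes(
    observed: Sequence[Label],
    class_details: Optional[Union[Label, List[Label]]] = None,
) -> Tuple[Label, Label]:
    """Return (negative_class, positive_class) for a binary target.

    Hypothetical helper: mirrors the documented convention only.
    """
    values = set(observed)
    if len(values) != 2:
        raise ValueError("a binary target needs exactly two distinct values")

    if class_details is None:
        # Boolean target: True is the positive class by default.
        if all(isinstance(v, bool) for v in values):
            return False, True
        # Numerical target: the higher value is the positive class.
        if all(isinstance(v, (int, float)) for v in values):
            return min(values), max(values)
        raise ValueError("string targets require categorical_target_class_details")

    if isinstance(class_details, list):
        # Two-element list: [negative_class, positive_class].
        if len(class_details) != 2 or set(class_details) != values:
            raise ValueError("expected [negative_class, positive_class]")
        return class_details[0], class_details[1]

    # Single element: it names the positive class.
    if class_details not in values:
        raise ValueError("positive class does not occur in the target")
    (negative,) = values - {class_details}
    return negative, class_details


def relevance_grades(
    observed: Sequence[Label],
    class_details: Optional[List[Label]] = None,
) -> Dict[Label, int]:
    """Map each ranking target value to a grade (0 = least relevant).

    Hypothetical helper; the integer grade numbering is an assumption.
    """
    if class_details is not None:
        ordered = list(class_details)  # given least to most relevant
    else:
        if any(isinstance(v, str) for v in observed):
            raise ValueError("string targets require an ordered list of values")
        ordered = sorted(set(observed))  # smallest value is least relevant
    return {value: grade for grade, value in enumerate(ordered)}


print(resolve_binary_classes(["yes", "no", "yes"], "yes"))      # ('no', 'yes')
print(resolve_binary_classes([0, 1, 1]))                        # (0, 1)
print(relevance_grades(["bad", "good"], ["bad", "good"]))       # {'bad': 0, 'good': 1}
```

For a multi-class task no such inference applies: the list passed as categorical_target_class_details must simply follow the same order as the model's outputs.
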
import pandas as pd
import fiddler as fdl

df = pd.read_csv('example_dataset.csv')

dataset_info = fdl.DatasetInfo.from_dataframe(
    df=df
)

model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    features=[
        'feature_1',
        'feature_2',
        'feature_3'
    ],
    outputs=[
        'output_column'
    ],
    target='target_column',
    input_type=fdl.ModelInputType.TABULAR,
    model_task=fdl.ModelTask.BINARY_CLASSIFICATION
)

| Return Type | Description |
| --- | --- |
| fdl.ModelInfo | A fdl.ModelInfo() object constructed from the fdl.DatasetInfo() object provided. |