Customizing your Model Schema
It's common to need to modify your fdl.ModelSchema object when fdl.Model.from_data infers something incorrectly.
Let's walk through an example of how to do this.
Suppose you've loaded in a dataset as a pandas DataFrame.
Below is an example of what is displayed upon inspection.
Suppose you create an fdl.Model object by inferring the details from this DataFrame.
Below is an example of what is displayed upon inspection of model.schema.
But upon inspection, you notice a few things are wrong.
The value range of output_column is set to [0.01, 0.99], when it should really be [0.0, 1.0].
There are no possible values set for feature_3.
The data type of feature_3 is set to fdl.DataType.STRING, when it should really be fdl.DataType.CATEGORY.
Let's see how we can address these issues.
Modifying a column’s value range
Let's say we want to modify the range of output_column in the above fdl.Model object to be [0.0, 1.0].
You can do this by setting the min and max of the output_column column.
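As a rough sketch of this step: the helper below, set_value_range, is a hypothetical convenience function (not part of the Fiddler client), and the schema object is a plain-Python stand-in so the snippet runs without a Fiddler connection. It assumes only that columns are looked up by name and expose min and max attributes, as described above.

```python
from types import SimpleNamespace


def set_value_range(schema, column_name, new_min, new_max):
    """Set min and max on one schema column (hypothetical helper)."""
    if new_min > new_max:
        raise ValueError("new_min must not exceed new_max")
    column = schema[column_name]
    column.min = new_min
    column.max = new_max
    return column


# Stand-in for model.schema; a real schema would come from fdl.Model.from_data.
schema = {"output_column": SimpleNamespace(min=0.01, max=0.99)}

updated = set_value_range(schema, "output_column", 0.0, 1.0)
print(updated.min, updated.max)
```

With a real model, the same assignments would presumably be made directly on model.schema['output_column'].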
Modifying a column’s possible values
Let's say we want to modify the possible values of feature_3 to be ['Yes', 'No'].
You can do this by setting the categories of the feature_3 column.
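A minimal sketch of the assignment, using a plain-Python stand-in for the schema so it runs on its own. The stand-in is an assumption for illustration only; with a real model the target would be model.schema['feature_3'].

```python
from types import SimpleNamespace

# Stand-in for model.schema: feature_3 starts with no possible values set.
schema = {"feature_3": SimpleNamespace(categories=None)}

# Set the possible values for the column.
schema["feature_3"].categories = ["Yes", "No"]

print(schema["feature_3"].categories)
```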
Modifying a column’s data type
Let's say we want to modify the data type of feature_3 to be fdl.DataType.CATEGORY.
You can do this by setting the data_type of the feature_3 column.
🚧 Note when modifying a column's data type to Category
When changing a column's data type to Category, you must also set the column's categories to the list of unique values for that column.
model.schema['feature_3'].data_type = fdl.DataType.CATEGORY
model.schema['feature_3'].categories = ['Yes', 'No']
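If the list of unique values is not known up front, it can be derived from the data. The sketch below uses a hypothetical helper, distinct_values, on a plain Python list standing in for the feature_3 column; it drops nulls and keeps first-seen order. With a pandas DataFrame, something like df['feature_3'].dropna().unique().tolist() should give an equivalent list.

```python
import math


def distinct_values(values):
    """Return distinct non-null values in first-seen order (hypothetical helper)."""
    kept = {}
    for value in values:
        # Skip missing entries: None and float NaN.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue
        kept[value] = None  # dict keys preserve insertion order
    return list(kept)


# Stand-in for the raw values of feature_3.
raw_values = ["Yes", "No", None, "Yes", float("nan"), "No"]

unique_values = distinct_values(raw_values)
print(unique_values)
```

The resulting list can then be assigned to the column alongside the data type change.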