Fiddler Query Language (FQL)
Overview
Custom Metrics and Segments are defined using the Fiddler Query Language (FQL), a flexible set of constants, operators, and functions which can accommodate a large variety of metrics.
Definitions
Term | Definition |
---|---|
Row-level function | A function which executes row-wise for a set of data. Returns a value for each row. |
Aggregate function | A function which executes across rows. Returns a single value for a given set of rows. |
FQL Rules
- Column names can be referenced by name either with double quotes ("my_column") or with no quotes (my_column).
- Single quotes (') are used to represent string values.
Data Types
FQL distinguishes between three data types:
Data type | Supported values | Examples | Supported Model Schema Data Types |
---|---|---|---|
Number | Any numeric value (integers and floats are both included) | 10, 2.34 | DataType.INTEGER, DataType.FLOAT |
Boolean | Only true and false | true, false | DataType.BOOLEAN |
String | Any value wrapped in single quotes (') | 'This is a string.', '200.0' | DataType.CATEGORY, DataType.STRING |
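The quoting rules and data types above can be illustrated with a small Python sketch. This is not Fiddler's parser; `classify_token` is a hypothetical helper that only shows how a single token would be interpreted under the stated rules (single quotes for strings, double quotes or no quotes for column names).

```python
import re


def classify_token(token: str) -> str:
    """Classify one FQL token (illustrative only, not Fiddler's parser)."""
    if len(token) >= 2 and token[0] == "'" and token[-1] == "'":
        return "string"   # single quotes mark a string value
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return "column"   # double quotes mark a column reference
    if token in ("true", "false"):
        return "boolean"  # the two Boolean constants
    if re.fullmatch(r"\d+(\.\d+)?", token):
        return "number"   # integers and floats are both Numbers
    return "column"       # assumption: any other bare name is a column


for tok in ["'200.0'", "200.0", '"my_column"', "my_column", "true"]:
    print(tok, "->", classify_token(tok))
```

Note that '200.0' is a String while 200.0 is a Number, so the two cannot be compared directly with the numeric operators.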
Constants
Symbol | Description |
---|---|
true | Boolean constant for true expressions |
false | Boolean constant for false expressions |
Operators
Symbol | Description | Syntax | Returns | Examples |
---|---|---|---|---|
^ | Exponentiation | Number ^ Number | Number | 2.5 ^ 4, (column1 - column2)^2 |
- | Unary negation | -Number | Number | -column1 |
* | Multiplication | Number * Number | Number | 2 * 10, 2 * column1, column1 * column2, sum(column1) * 10 |
/ | Division | Number / Number | Number | 2 / 10, 2 / column1, column1 / column2, sum(column1) / 10 |
% | Modulo | Number % Number | Number | 2 % 10, 2 % column1, column1 % column2, sum(column1) % 10 |
+ | Addition | Number + Number | Number | 2 + 2, 2 + column1, column1 + column2, average(column1) + 2 |
- | Subtraction | Number - Number | Number | 2 - 2, 2 - column1, column1 - column2, average(column1) - 2 |
< | Less than | Number < Number | Boolean | 10 < 20, column1 < 10, column1 < column2, average(column2) < 5 |
<= | Less than or equal to | Number <= Number | Boolean | 10 <= 20, column1 <= 10, column1 <= column2, average(column2) <= 5 |
> | Greater than | Number > Number | Boolean | 10 > 20, column1 > 10, column1 > column2, average(column2) > 5 |
>= | Greater than or equal to | Number >= Number | Boolean | 10 >= 20, column1 >= 10, column1 >= column2, average(column2) >= 5 |
== | Equals | Number == Number | Boolean | 10 == 20, column1 == 10, column1 == column2, average(column2) == 5 |
!= | Does not equal | Number != Number | Boolean | 10 != 20, column1 != 10, column1 != column2, average(column2) != 5 |
not | Logical NOT | not Boolean | Boolean | not true, not column1 |
and | Logical AND | Boolean and Boolean | Boolean | true and false, column1 and column2 |
or | Logical OR | Boolean or Boolean | Boolean | true or false, column1 or column2 |
Constant functions
Symbol | Description | Syntax | Returns | Examples |
---|---|---|---|---|
e() | Base of the natural logarithm | e() | Number | e() == 2.718281828459045 |
pi() | The ratio of a circle's circumference to its diameter | pi() | Number | pi() == 3.141592653589793 |
Row-level functions
Row-level functions can be applied either to a single value or to a column/row expression (in which case they are mapped element-wise to each value in the column/row expression).
Symbol | Description | Syntax | Returns | Examples |
---|---|---|---|---|
if(condition, value1, value2) | Evaluates condition and returns value1 if true, otherwise returns value2. value1 and value2 must have the same type. | if(Boolean, Any, Any) | Any | if(false, 'yes', 'no') == 'no' if(column1 == 1, 'yes', 'no') |
length(x) | Returns the length of string x. | length(String) | Number | length('Hello world') == 11 |
to_string(x) | Converts a value x to a string. | to_string(Any) | String | to_string(42) == '42' to_string(true) == 'true' |
is_null(x) | Returns true if x is null, otherwise returns false. | is_null(Any) | Boolean | is_null('') == true is_null("column1") |
abs(x) | Returns the absolute value of number x. | abs(Number) | Number | abs(-3) == 3 |
exp(x) | Returns e^x, where e is the base of the natural logarithm. | exp(Number) | Number | exp(1) == 2.718281828459045 |
log(x) | Returns the natural logarithm (base e) of number x. | log(Number) | Number | log(e()) == 1 |
log2(x) | Returns the binary logarithm (base 2) of number x. | log2(Number) | Number | log2(16) == 4 |
log10(x) | Returns the common logarithm (base 10) of number x. | log10(Number) | Number | log10(1000) == 3 |
sqrt(x) | Returns the positive square root of number x. | sqrt(Number) | Number | sqrt(144) == 12 |
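To make the element-wise behavior of row-level functions concrete, here is a minimal Python sketch that models a column as a list and maps a function over each row. It only imitates the semantics described above; `map_rows`, the sample `column1`, and the use of `None` for null are assumptions for illustration, not part of FQL or the Fiddler client.

```python
import math
from typing import Any, Callable, List


def map_rows(fn: Callable[[Any], Any], column: List[Any]) -> List[Any]:
    """Apply a row-level function to every row, returning one value per row."""
    return [fn(value) for value in column]


# Hypothetical column with one null row (modeled as None).
column1 = [1, 4, None, 16]

# Models is_null(column1): one Boolean per row.
null_flags = map_rows(lambda v: v is None, column1)

# Models if(is_null(column1), 0, sqrt(column1)).
# Both branches return a Number, matching the same-type rule for if().
roots = map_rows(lambda v: 0.0 if v is None else math.sqrt(v), column1)

print(null_flags)  # [False, False, True, False]
print(roots)       # [1.0, 2.0, 0.0, 4.0]
```

The output always has the same number of rows as the input, which is what distinguishes a row-level function from an aggregate function.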
Aggregate functions
Every Custom Metric must be wrapped in an aggregate function or be a combination of aggregate functions.
Symbol | Description | Syntax | Returns | Examples |
---|---|---|---|---|
sum(x) | Returns the sum of a numeric column or row expression x. | sum(Number) | Number | sum(column1 + column2) |
average(x) | Returns the arithmetic mean/average value of a numeric column or row expression x. | average(Number) | Number | average(2 * column1) |
count(x) | Returns the number of non-null rows of a column or row expression x. | count(Any) | Number | count(column1) |
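The sketch below models, in plain Python, how a Custom Metric can be built as a combination of aggregate functions over a row-level expression. The FQL expression in the comment is composed only of functions listed on this page; the Python helpers, the sample columns, and the use of `None` for null are illustrative assumptions rather than Fiddler's implementation.

```python
from typing import List, Optional


def count(column: List[Optional[float]]) -> int:
    """Models count(x): the number of non-null rows."""
    return sum(1 for value in column if value is not None)


def share_above(column: List[float], limit: float) -> float:
    """Models sum(if(column1 > limit, 1, 0)) / count(column1)."""
    flags = [1 if value > limit else 0 for value in column]  # row-level step
    return sum(flags) / count(column)                        # aggregate step


# Hypothetical data: column1 has no nulls, column2 has two null rows.
column1 = [5.0, 12.0, 20.0, 3.0]
column2 = [1.0, None, 2.0, None]

print(share_above(column1, 10))  # 0.5
print(count(column2))            # 2
```

The row-level `if` produces one value per row, and the surrounding `sum` and `count` each reduce those rows to a single Number, so the final expression satisfies the rule that a Custom Metric must be built from aggregate functions.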
Built-in metric functions
Symbol | Description | Syntax | Returns | Examples |
---|---|---|---|---|
jsd(column, baseline) | The Jensen-Shannon distance of column with respect to baseline. | jsd(Any, String) | Number | jsd(column1, 'my_baseline') |
psi(column, baseline) | The population stability index of column with respect to baseline. | psi(Any, String) | Number | psi(column1, 'my_baseline') |
null_violation_count(column) | Number of rows with null values in column. | null_violation_count(Any) | Number | null_violation_count(column1) |
range_violation_count(column) | Number of rows with out-of-range values in column. | range_violation_count(Any) | Number | range_violation_count(column1) |
type_violation_count(column) | Number of rows with invalid data types in column. | type_violation_count(Any) | Number | type_violation_count(column1) |
any_violation_count(column) | Number of rows with at least one Data Integrity violation in column. | any_violation_count(Any) | Number | any_violation_count(column1) |
traffic() | Total row count. Includes null rows. | traffic() | Number | traffic() |
tp(class) | True positive count. Available for binary classification and multiclass classification models. For multiclass, class is used to specify the positive class. | tp(class=Optional[String]) | Number | tp() tp(class='class1') |
tn(class) | True negative count. Available for binary classification and multiclass classification models. For multiclass, class is used to specify the positive class. | tn(class=Optional[String]) | Number | tn() tn(class='class1') |
fp(class) | False positive count. Available for binary classification and multiclass classification models. For multiclass, class is used to specify the positive class. | fp(class=Optional[String]) | Number | fp() fp(class='class1') |
fn(class) | False negative count. Available for binary classification and multiclass classification models. For multiclass, class is used to specify the positive class. | fn(class=Optional[String]) | Number | fn() fn(class='class1') |
precision(target, threshold) | Precision between target and output. Available for binary classification model tasks. If target is specified, it will be used in place of the default target column. | precision(target=Optional[Any], threshold=Optional[Number]) | Number | precision() precision(target=column1) precision(threshold=0.5) precision(target=column1, threshold=0.5) |
recall(target, threshold) | Recall between target and output. Available for binary classification model tasks. If target is specified, it will be used in place of the default target column. | recall(target=Optional[Any], threshold=Optional[Number]) | Number | recall() recall(target=column1) recall(threshold=0.5) recall(target=column1, threshold=0.5) |
f1_score(target, threshold) | F1 score between target and output. Available for binary classification model tasks. If target is specified, it will be used in place of the default target column. | f1_score(target=Optional[Any], threshold=Optional[Number]) | Number | f1_score() f1_score(target=column1) f1_score(threshold=0.5) f1_score(target=column1, threshold=0.5) |
fpr(target, threshold) | False positive rate between target and output. Available for binary classification model tasks. If target is specified, it will be used in place of the default target column. | fpr(target=Optional[Any], threshold=Optional[Number]) | Number | fpr() fpr(target=column1) fpr(threshold=0.5) fpr(target=column1, threshold=0.5) |
auroc(target) | Area under the ROC curve between target and output. Available for binary classification model tasks. If target is specified, it will be used in place of the default target column. | auroc(target=Optional[Any]) | Number | auroc() auroc(target=column1) |
geometric_mean(target, threshold) | Geometric mean score between target and output. Available for binary classification model tasks. If target is specified, it will be used in place of the default target column. | geometric_mean(target=Optional[Any], threshold=Optional[Number]) | Number | geometric_mean() geometric_mean(target=column1) geometric_mean(threshold=0.5) geometric_mean(target=column1, threshold=0.5) |
expected_calibration_error(target) | Expected calibration error between target and output. Available for binary classification model tasks. If target is specified, it will be used in place of the default target column. | expected_calibration_error(target=Optional[Any]) | Number | expected_calibration_error() expected_calibration_error(target=column1) |
log_loss(target) | Log loss (binary cross entropy) between target and output. Available for binary classification model tasks. If target is specified, it will be used in place of the default target column. | log_loss(target=Optional[Any]) | Number | log_loss() log_loss(target=column1) |
calibrated_threshold(target) | Optimal threshold value for a high TPR and a low FPR. Available for binary classification model tasks. If target is specified, it will be used in place of the default target column. | calibrated_threshold(target=Optional[Any]) | Number | calibrated_threshold() calibrated_threshold(target=column1) |
accuracy(target, threshold) | Accuracy score between target and outputs. Available for multiclass classification model tasks. If target is specified, it will be used in place of the default target column. | accuracy(target=Optional[Any], threshold=Optional[Number]) | Number | accuracy() accuracy(target=column1) accuracy(threshold=0.5) accuracy(target=column1, threshold=0.5) |
log_loss(target) | Log loss score between target and outputs. Available for multiclass classification model tasks. If target is specified, it will be used in place of the default target column. | log_loss(target=Optional[Any]) | Number | log_loss() log_loss(target=column1) |
r2(target) | R-squared score between target and output. Available for regression model tasks. If target is specified, it will be used in place of the default target column. | r2(target=Optional[Any]) | Number | r2() r2(target=column1) |
mse(target) | Mean squared error between target and output. Available for regression model tasks. If target is specified, it will be used in place of the default target column. | mse(target=Optional[Any]) | Number | mse() mse(target=column1) |
mae(target) | Mean absolute error between target and output. Available for regression model tasks. If target is specified, it will be used in place of the default target column. | mae(target=Optional[Any]) | Number | mae() mae(target=column1) |
mape(target) | Mean absolute percentage error between target and output. Available for regression model tasks. If target is specified, it will be used in place of the default target column. | mape(target=Optional[Any]) | Number | mape() mape(target=column1) |
wmape(target) | Weighted mean absolute percentage error between target and output. Available for regression model tasks. If target is specified, it will be used in place of the default target column. | wmape(target=Optional[Any]) | Number | wmape() wmape(target=column1) |
map(target) | Mean average precision score. Available for ranking model tasks. If target is specified, it will be used in place of the default target column. | map(target=Optional[Any]) | Number | map() map(target=column1) |
ndcg_mean(target) | Mean normalized discounted cumulative gain score. Available for ranking model tasks. If target is specified, it will be used in place of the default target column. | ndcg_mean(target=Optional[Any]) | Number | ndcg_mean() ndcg_mean(target=column1) |
query_count(target) | Count of ranking queries. Available for ranking model tasks. If target is specified, it will be used in place of the default target column. | query_count(target=Optional[Any]) | Number | query_count() query_count(target=column1) |
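For readers who want to see how the binary classification metrics relate to the confusion-matrix counts, the following Python sketch computes precision, recall, false positive rate, and F1 from tp, tn, fp, and fn using their standard textbook definitions. It is not Fiddler's implementation: the helper names, the sample data, and the choice to treat a score greater than or equal to the threshold as a positive prediction are all assumptions.

```python
from typing import List, Tuple


def confusion_counts(
    targets: List[int], scores: List[float], threshold: float = 0.5
) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary targets and predicted scores."""
    tp = tn = fp = fn = 0
    for target, score in zip(targets, scores):
        predicted = 1 if score >= threshold else 0  # assumption: >= is positive
        if predicted == 1 and target == 1:
            tp += 1
        elif predicted == 0 and target == 0:
            tn += 1
        elif predicted == 1 and target == 0:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


# Hypothetical targets and model scores.
targets = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.6, 0.4, 0.8, 0.2, 0.1]

tp, tn, fp, fn = confusion_counts(targets, scores, threshold=0.5)
precision = safe_div(tp, tp + fp)
recall = safe_div(tp, tp + fn)
fpr = safe_div(fp, fp + tn)
f1 = safe_div(2 * precision * recall, precision + recall)

print(tp, tn, fp, fn)
print(precision, recall, fpr, f1)
```

With this sample data the counts are tp=2, tn=2, fp=1, fn=1. Changing `threshold` shifts rows between the positive and negative predictions, which is why the threshold argument affects precision, recall, f1_score, and fpr but is not accepted by threshold-free metrics such as auroc.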