This script uses the BigML anomaly detection functions to assess covariate shift between a dataset used to train a model and a production dataset.
In brief, the method uses the average anomaly score of the production dataset, measured relative to the model's training dataset, as an indicator of covariate shift. An anomaly detector is trained on the same dataset used to train the model. That detector then produces a batch anomaly score for the production dataset, which assigns an anomaly score to each production row. Finally, those scores are averaged: the higher the average, the further the production data departs from the training data.
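The three steps above (train a detector on the training data, score the production rows, average the scores) can be sketched in plain Python. This is only an illustration under stated assumptions: the z-score based scorer below is a toy stand-in for the BigML anomaly detector, not its algorithm, and the names `train_detector`, `anomaly_score`, and `average_batch_score` are hypothetical helpers rather than BigML functions.

```python
import math
import statistics

def train_detector(rows):
    """Fit a toy detector: per-feature mean and standard deviation.

    Assumption: this stands in for training a BigML anomaly detector.
    """
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    # Fall back to 1.0 when a feature has zero spread, so scores stay finite.
    stdevs = [statistics.pstdev(c) or 1.0 for c in columns]
    return means, stdevs

def anomaly_score(detector, row):
    """Map the mean absolute z-score of one row into the range [0, 1)."""
    means, stdevs = detector
    z = statistics.fmean(
        [abs(x - m) / s for x, m, s in zip(row, means, stdevs)]
    )
    return 1.0 - math.exp(-z)

def average_batch_score(detector, rows):
    """Score every row (the 'batch anomaly score') and average the result."""
    return statistics.fmean([anomaly_score(detector, r) for r in rows])

# Made-up numeric rows, for illustration only.
training = [(1.0, 10.0), (2.0, 12.0), (3.0, 11.0), (2.0, 9.0)]
production_similar = [(2.0, 10.0), (1.5, 11.0)]
production_shifted = [(8.0, 30.0), (9.0, 28.0)]

detector = train_detector(training)
similar = average_batch_score(detector, production_similar)
shifted = average_batch_score(detector, production_shifted)
print(f"similar production data: {similar:.3f}")
print(f"shifted production data: {shifted:.3f}")
```

With this toy scorer, the production rows that lie far from the training data yield the higher average, which is the signal the method relies on.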
In practice, one might repeat this computation for several pairs of subsets sampled from the training and production datasets, and then assess covariate shift from the mean and variance of the resulting average batch anomaly scores.
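A minimal sketch of that repeated-sampling idea follows, again with a toy z-score scorer standing in for the BigML anomaly detector. The helper names, the 80% sampling fraction, the ten iterations, and the synthetic Gaussian data are all assumptions chosen for illustration, not values taken from the script.

```python
import math
import random
import statistics

def average_batch_score(train_rows, prod_rows):
    """Toy stand-in: fit per-feature mean/stdev on train_rows, then average
    a [0, 1) anomaly score over prod_rows."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(c) for c in columns]
    stdevs = [statistics.pstdev(c) or 1.0 for c in columns]
    scores = []
    for row in prod_rows:
        z = statistics.fmean(
            [abs(x - m) / s for x, m, s in zip(row, means, stdevs)]
        )
        scores.append(1.0 - math.exp(-z))
    return statistics.fmean(scores)

def shift_summary(training, production, iterations=10, fraction=0.8, seed=0):
    """Return (mean, variance) of the average score over sampled subset pairs.

    Needs at least two iterations, since the sample variance is undefined
    for a single value.
    """
    rng = random.Random(seed)
    n_train = max(2, int(fraction * len(training)))
    n_prod = max(1, int(fraction * len(production)))
    averages = []
    for _ in range(iterations):
        train_sample = rng.sample(training, n_train)
        prod_sample = rng.sample(production, n_prod)
        averages.append(average_batch_score(train_sample, prod_sample))
    return statistics.fmean(averages), statistics.variance(averages)

# Synthetic data: one production set drawn like the training set, one shifted.
data_rng = random.Random(42)
training = [(data_rng.gauss(0, 1), data_rng.gauss(5, 2)) for _ in range(200)]
same_dist = [(data_rng.gauss(0, 1), data_rng.gauss(5, 2)) for _ in range(200)]
shifted = [(data_rng.gauss(3, 1), data_rng.gauss(9, 2)) for _ in range(200)]

mean_same, var_same = shift_summary(training, same_dist)
mean_shifted, var_shifted = shift_summary(training, shifted)
print(f"same distribution: mean={mean_same:.3f} variance={var_same:.6f}")
print(f"shifted:           mean={mean_shifted:.3f} variance={var_shifted:.6f}")
```

Comparing the mean against a baseline such as the same-distribution run, and checking that the variance across iterations is small relative to the gap, gives a rough sense of whether an elevated score reflects real shift or sampling noise.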