This script takes as inputs a cluster identifier, an instance (i.e., a map with values for all fields used by the cluster), and a positive count n. It then:
Finds the centroid in the cluster closest to the given instance p.
Selects within that centroid's dataset the n instances that are closest to p.
If there are fewer than n rows in the centroid's dataset, the missing instances are read from the next closest centroid.
This workflow uses Flatline to compute the distance between the input instance p and the rows of the centroid datasets (via the row-distance-squared Flatline function), adds that distance as an extra column to the dataset, and then creates a sample of the result, ordered by the computed distance.
The input instance can be specified using either field identifiers or field names.
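The steps above can be sketched in plain Python. This is only an illustration of the logic, not the script itself: the names closest_instances and squared_distance are hypothetical, the centroids and per-centroid datasets are passed in as in-memory dictionaries, and the distance is a plain squared Euclidean distance over numeric fields, which is an assumption and may differ from what row-distance-squared computes.

```python
from typing import Dict, List

Row = Dict[str, float]


def squared_distance(a: Row, b: Row) -> float:
    """Squared Euclidean distance over the fields of `a` (numeric fields only)."""
    return sum((a[f] - b[f]) ** 2 for f in a)


def closest_instances(centroids: Dict[str, Row],
                      datasets: Dict[str, List[Row]],
                      p: Row, n: int) -> List[Row]:
    """Return up to n rows nearest to p, starting from the closest centroid.

    `centroids` maps a centroid id to its coordinates and `datasets` maps
    the same ids to the rows assigned to that centroid (hypothetical layout).
    """
    # Rank centroids from nearest to farthest from p.
    ranked = sorted(centroids, key=lambda c: squared_distance(p, centroids[c]))
    result: List[Row] = []
    for cid in ranked:
        if len(result) >= n:
            break
        # Order this centroid's rows by distance to p and take what is missing.
        rows = sorted(datasets[cid], key=lambda r: squared_distance(p, r))
        result.extend(rows[: n - len(result)])
    return result
```

When the closest centroid's dataset holds fewer than n rows, the loop moves on to the next closest centroid and fills the remainder from there.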
Find the global field importance across a cluster
Please see the readme for more information.
Given a dataset and a categorical field, finds the minimum scale required to create class purity in the cluster with k = number of classes.
A variation on the k-means-- algorithm proposed by Sanjay Chawla and Aristides Gionis in their paper "k-means--: A unified approach to clustering and outlier detection".
Given a dataset, a number of clusters k and a number of anomalies l, this script creates a BigML k-means cluster. The l instances that are the farthest from their centroids are removed and another BigML k-means cluster is created. This process is repeated until the Jaccard index between successive sets of anomalies reaches the given threshold, or until the maximum number of iterations is reached.
Inputs:
dataset: the dataset of interest
k: the number of clusters desired
l: the number of anomalies to be removed at each step
threshold: the minimum desired Jaccard index between iterations
maximum: the maximum number of desired iterations
Outputs:
cluster: the cluster id of the final cluster
dataset-id: the original dataset appended with fields for cluster membership and distance to centroid
anomalies: a list of the anomalous instances
similarities: a list of the similarity coefficients from each step
Please see our readme for more information.
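The iteration described for the k-means-- variation can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the script's implementation: kmeans here is a small Lloyd-style loop seeded with the first k points (a hypothetical stand-in for creating a BigML cluster), anomalies are re-ranked over all original points at every step, and the function and variable names are invented for the example. It returns the final centroids, the indices of the anomalous points, and the list of Jaccard similarities; the appended dataset output is not modelled.

```python
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Point], k: int, iters: int = 20) -> List[Point]:
    """Toy Lloyd iteration; the first k points are the initial centroids."""
    centroids = list(points[:k])
    for _ in range(iters):
        groups: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its group (keep it if empty).
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def jaccard(a: Set[int], b: Set[int]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def kmeans_minus_minus(points: Sequence[Point], k: int, l: int,
                       threshold: float, maximum: int):
    anomalies: Set[int] = set()
    similarities: List[float] = []
    centroids: List[Point] = []
    for step in range(maximum):
        # Cluster without the current anomalies.
        kept = [p for i, p in enumerate(points) if i not in anomalies]
        centroids = kmeans(kept, k)
        # Rank every original point by distance to its nearest centroid.
        dist = [min(sq_dist(p, c) for c in centroids) for p in points]
        ranked = sorted(range(len(points)), key=lambda i: dist[i], reverse=True)
        current = set(ranked[:l])
        if step > 0:
            similarities.append(jaccard(anomalies, current))
        anomalies = current
        # Stop once two successive anomaly sets are similar enough.
        if similarities and similarities[-1] >= threshold:
            break
    return centroids, sorted(anomalies), similarities
```

A Jaccard index of 1 means two successive iterations flagged exactly the same instances, so a threshold close to 1 asks for a stable set of anomalies before stopping.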