×
BigML is working hard to support a wide range of browsers. Your experience will be better with:
English
Sign up / in
Sign up
Sign in
or Sign in with...
Amazon
Github
GitLab
Google
Toggle navigation
Australia or New Zealand users Sign in
PRODUCT
Features
Releases
What's New
GETTING STARTED
PRICING
Subscriptions
Private Deployments
Training
Certifications
BigML for Education
SUPPORT
Toggle navigation
PRODUCT
Features
Releases
What's New
GETTING STARTED
PRICING
Subscriptions
Private Deployments
Training
Certifications
BigML for Education
SUPPORT
Australia or New Zealand users Sign in
License
Oops!
We could not process your request!
Cluster classification | BigML.com
×
Embed this Script in your web site
Copy & Paste this code into your HTML code:
akashenfelter
Public
akashenfelter's
Scripts
5
Models
0
Datasets
1
195
FREE
BUY
Cluster Classification
Details:
Created
Mon, 1 Aug 2016 18:02:56 +0000
Published
Fri, 12 Aug 2016 02:30:08 +0000
Description:
Find the global field importance across a cluster
Please see the
readme
for more information.
Tags:
Clusters
Importance
URL:
https://bigml.com/user/akashenfelter/gallery/script/579f8ed0af447f2234000ddb
Script
FREE
Source code
;; Given a cluster, creates a batch centroid and uses the output ;; dataset where each instance is labeled by centroid to create an ;; ensemble (random forest), and returns the importance of each field ;; averaged over the ensemble. ;; Given a cluster, returns a batchcentroid id created from that ;; cluster whose output dataset is the original cluster dataset ;; appended with the centroids the instances are part of. (define (label-by-cluster cl-id) (let (cl (fetch cl-id) cl-ds (cl "dataset") cl-fields (cl "input_fields")) (create-and-wait-batchcentroid {"cluster" cl-id "dataset" cl-ds "output_dataset" true "output_fields" cl-fields}))) ;; Given a list of pairs, creates the equivalent map (define (list-to-map nested-list) (let (key-list (map head nested-list) value-list (map (lambda (x) (last x)) nested-list)) (make-map key-list value-list))) ;; Given a function of two values and two maps, returns a new map ;; where if a key appears in only one of the original maps its value ;; is unchanged, but if it appears in both its value is the function ;; of the two original values. (define (merge-with fn map1 map2) (let (subset-keys (filter (lambda (x) (contains? map1 x)) (keys map2)) subset-values (map (lambda (x) (fn (get map1 x) (get map2 x))) subset-keys) subset (make-map subset-keys subset-values)) (merge map1 (merge map2 subset)))) ;; Given a ensemble id and a list of field ids, returns a map of those ;; field ids and average importances (define (make-importance-map en-id ids) (let (models ((fetch en-id) ["models"]) len-models (count models) imp-lst (map (lambda (md-id) ((fetch md-id) ["model" "importance"])) models) imp-map (map list-to-map imp-lst) imp-sum (reduce (lambda (x y) (merge-with + x y)) {} imp-map)) (make-map ids (map (lambda (x) (if (contains? imp-sum x) (/ (get imp-sum x) len-models) 0)) ids)))) ;; Given a cluster-id, builds a random forest on centroid membership, ;; and returns a paired list of field importance averaged across all ;; models in the forest (define (cluster-classification cl-id) (let (bc-id (label-by-cluster cl-id) bc (fetch bc-id) old-ids (bc "output_fields") labeled (bc "output_dataset_resource") new-ids (butlast ((fetch labeled) "input_fields")) cl-en (create-and-wait-ensemble labeled {"randomize" true}) ds-fields ((fetch labeled) "fields") imp-map (make-importance-map cl-en new-ids) name-map (make-map new-ids (map (lambda (x) (ds-fields [x "name"])) new-ids)) id-map (make-map new-ids old-ids) total-map (map (lambda (x) (make-map ["field id", "name", "importance"] [(get id-map x), (get name-map x), (get imp-map x)])) (keys imp-map))) (delete* [labeled bc-id cl-en]) (reverse (sort-by-key "importance" total-map)))) ;; Putting it all together... (define importance-list (cluster-classification cluster-id))
Description
Find the global field importance across a cluster
Please see the
readme
for more information.
Find the global field importance across a cluster
Please see the
readme
for more information.
Show more
Inputs
Outputs
License:
Creative Commons Zero V1.0
Script
FREE
Cluster classification | BigML.com
Sending Request...
Sending Request...