Scripts Gallery: Miscellaneous

FREE

Logistic regression's k-fold cross-validation whizzml

The objective of this script is to perform a k-fold cross-validation of a logistic regression built from a dataset. The algorithm:

1. Divides the dataset into k parts.
2. Holds out the data in one of the parts and builds a logistic regression with the rest of the data.
3. Evaluates the logistic regression with the held-out data.
4. Repeats the second and third steps for each of the k parts, so that k evaluations are generated.
5. Averages the evaluation metrics to provide the cross-validation metrics.

The output of the script will be an evaluation ID. This evaluation is a cross-validation, meaning that its metrics are averages of the k evaluations created in the cross-validation process.
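
The procedure described above can be sketched in a few lines of Python. This is a minimal illustration of k-fold cross-validation in general, not the WhizzML script itself: `k_fold_indices`, `cross_validate`, `train_fn`, and `evaluate_fn` are hypothetical names, and the model and metrics are whatever the caller supplies.

```python
import random
from statistics import mean


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle the row indices and deal them into k roughly equal parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(rows, k, train_fn, evaluate_fn, seed=0):
    """Hold out each part in turn, train on the rest, and average the metrics.

    train_fn(rows) -> model and evaluate_fn(model, rows) -> dict of metrics
    are caller-supplied placeholders (assumptions for this sketch).
    """
    evaluations = []
    for held_out in k_fold_indices(len(rows), k, seed):
        held = set(held_out)
        train = [row for i, row in enumerate(rows) if i not in held]
        test = [rows[i] for i in held_out]
        model = train_fn(train)
        evaluations.append(evaluate_fn(model, test))
    # One averaged value per metric, taken over the k evaluations.
    return {name: mean(e[name] for e in evaluations) for name in evaluations[0]}
```

Each row ends up in exactly one held-out part, so every row is used for evaluation once and for training k - 1 times.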

For more information, please see the readme.

15.2 KB

101

FREE

Deepnet's k-fold cross-validation whizzml

The objective of this script is to perform a k-fold cross-validation of a deepnet built from a dataset. The algorithm:

1. Divides the dataset into k parts.
2. Holds out the data in one of the parts and builds a deepnet with the rest of the data.
3. Evaluates the deepnet with the held-out data.
4. Repeats the second and third steps for each of the k parts, so that k evaluations are generated.
5. Averages the evaluation metrics to provide the cross-validation metrics.

The output of the script will be an evaluation ID. This evaluation is a cross-validation, meaning that its metrics are averages of the k evaluations created in the cross-validation process.

For more information, please see the readme.

23.5 KB

99

FREE

Best-first feature selection with cross-validation pgonzalezcarrizo

Finds the best features for modeling using a greedy algorithm. Extends the earlier best-first feature selection script, which only worked with models and used split evaluation.
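
As a rough illustration of greedy feature selection, the sketch below adds one feature at a time, keeping the candidate that most improves a score and stopping when no candidate helps. It is one common reading of "best-first" and is not taken from the script; `best_first_selection` and `score_fn` are hypothetical names, and in practice `score_fn` would return a cross-validation metric for a model trained on the given features.

```python
def best_first_selection(features, score_fn, max_features=None):
    """Greedily grow a feature set while the score keeps improving.

    score_fn(list_of_features) -> float is a caller-supplied placeholder
    (assumption): higher is better.
    """
    selected = []
    best_score = float("-inf")
    remaining = list(features)
    while remaining and (max_features is None or len(selected) < max_features):
        # Score every one-feature extension of the current selection.
        scored = [(score_fn(selected + [f]), f) for f in remaining]
        score, feature = max(scored, key=lambda pair: pair[0])
        if score <= best_score:
            break  # no extension improves on the current selection
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score
```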

28.7 KB

48

FREE

Model per category whizzml

Creating a model per category

Generates models for a dataset, simulating a root split on a given field.

Given an input dataset and one of its categorical fields, create a dataset for each field category.

For each category dataset, create a model of the given kind.

If the provided field has missing values in the input dataset, a model for instances with missing as their "category" will also be created.

Returns a map with the lists of datasets, models, and categories for use during predictions.

The map is available as the execution's result and, for later convenience, in the output named "result".
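
The idea can be sketched in Python as follows. This is an illustration under assumptions, not the script's code: rows are plain dictionaries, a missing field value is represented by `None`, and `train_fn` and `predict_fn` are hypothetical caller-supplied functions.

```python
from collections import defaultdict


def split_by_category(rows, field):
    """Group rows by the value of `field`; rows lacking it go under None."""
    groups = defaultdict(list)
    for row in rows:
        groups[row.get(field)].append(row)
    return dict(groups)


def models_per_category(rows, field, train_fn):
    """Train one model per category and return parallel lists in a map."""
    groups = split_by_category(rows, field)
    categories = list(groups)
    datasets = [groups[c] for c in categories]
    models = [train_fn(d) for d in datasets]
    return {"categories": categories, "datasets": datasets, "models": models}


def predict(result, row, field, predict_fn):
    """Route a row to the model trained on its category (the forced split)."""
    index = result["categories"].index(row.get(field))
    return predict_fn(result["models"][index], row)
```

Keeping the three lists in the same order lets a prediction look up the category's position once and reuse it for the matching model.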

forced split

3.7 KB

31

FREE

Ordinal encoder kenbaldwin

Given a dataset, encodes categorical fields using ordinal encoding, which uses a single column of integers to represent field classes (levels). It then creates a new dataset, with additional fields containing ordinal encodings of the categorical fields.

If classes have a known order (such as Like, Somewhat Like, Neutral, Somewhat Dislike, and Dislike), the integer mapping can be supplied; otherwise, integers are assigned by class count, in descending order (in the case of ties, classes are ordered alphabetically).
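
The mapping rule described above can be sketched in Python. The function names are hypothetical, and starting the integers at 0 is an assumption; the description does not say which integer the script assigns first.

```python
from collections import Counter


def ordinal_mapping(values, order=None):
    """Map each class to an integer.

    If `order` is given, classes are numbered in that order. Otherwise they
    are numbered by descending count, with ties broken alphabetically.
    """
    if order is not None:
        return {cls: i for i, cls in enumerate(order)}
    counts = Counter(values)
    ranked = sorted(counts, key=lambda cls: (-counts[cls], cls))
    return {cls: i for i, cls in enumerate(ranked)}


def encode(values, mapping):
    """Replace each class with its integer code."""
    return [mapping[v] for v in values]
```

For example, with values `["b", "a", "c", "c", "a", "b", "c"]`, class `c` is the most frequent and gets 0, while `a` and `b` tie and are ordered alphabetically.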

For more information, please see the readme.

2.5 KB

10

FREE

Lat/Long Distance from a reference point petersen

Extends a dataset with the distance in meters between lat/long fields and a reference point.
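
The description does not say which distance formula the script uses. As an illustration only, the sketch below uses the haversine great-circle formula with a mean Earth radius of 6,371,000 m; the function names and the `lat`, `lon`, and `distance` field names are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (assumed constant)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def add_distance(rows, ref_lat, ref_lon, lat_field="lat", lon_field="lon"):
    """Return copies of the rows extended with a `distance` field in meters."""
    return [
        {**row, "distance": haversine_m(row[lat_field], row[lon_field], ref_lat, ref_lon)}
        for row in rows
    ]
```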

925 bytes

9

FREE

Pick random row in group mmartin

Selects one row at random from among the rows grouped by a list of fields.
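
A minimal Python sketch of this kind of selection follows, assuming that rows are dictionaries and that one row is wanted from every group; `pick_random_row_per_group` is a hypothetical name and does not come from the script.

```python
import random
from collections import defaultdict


def pick_random_row_per_group(rows, group_fields, seed=None):
    """Group rows by the values of `group_fields` and pick one row per group."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row.get(field) for field in group_fields)
        groups[key].append(row)
    return [rng.choice(members) for members in groups.values()]
```

Passing a fixed `seed` makes the selection reproducible.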

1.2 KB

8

FREE

Image dataset to source mprats

The script is meant for datasets that contain images. It transforms the information in the dataset into an editable source, where users can inspect entire images and see or update their labels and regions. For datasets created by a batch prediction, the prediction fields are also added, but their names might be changed to avoid duplicate field names.

A score_threshold parameter allows users to filter predicted regions. Regions whose score is below the score_threshold are discarded.
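
The filtering rule amounts to a simple comparison. In the sketch below, the representation of a region as a dictionary with a `score` key is an assumption made for illustration; the script's actual region format is not described here.

```python
def filter_regions(regions, score_threshold):
    """Keep regions whose score is at least the threshold; discard the rest."""
    return [region for region in regions if region["score"] >= score_threshold]
```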

5.7 KB

2

COMPANY

PRODUCT

BUSINESS

TRAINING

GALLERY
