BigML Certifications

The breadth of intelligent applications that the BigML platform can support spawns many new opportunities for BigML partners to get involved in delivering Machine Learning-based solutions. Our certifications are perfect for software developers, system integrators, and technology and strategic consulting firms that want to get up to speed rapidly with Machine Learning and the BigML platform as they acquire and grow their customer base.

To be eligible to enroll in the BigML Certified courses, you must show a certain level of proficiency in Machine Learning, the BigML Dashboard, the BigML API, and WhizzML. The following getting-started assets will get you up and running in no time: ML 101, Tutorials, API documentation, and WhizzML.

Certified Engineer

This certification track prepares analysts, scientists, and software developers to become BigML Certified Engineers.

The certification process consists of 8 online classes of 1.5 hours each. Evaluation will be based on solving a set of theoretical questions and exercises presented during the course. The courses listed below will consist of 2 sessions each to complete the 8 online classes.

Courses

Advanced Modeling

Objective

  • Understand how to parameterize supervised and unsupervised methods to achieve better performance.
  • Learn how to compose multiple methods together to better solve modeling problems.

Pre-requisites

Syllabus

  • Modeling vs. Prediction
  • Supervised Learning

    Decision Trees: Node threshold, Weights, Statistical Pruning, Modeling Missing Values.

    Ensemble Classifiers: Bagging (Sample Rates, Number of Models), Random Decision Forests (Random Candidates), Boosting.

    Logistic Regression: L1 Normalization, L2 Normalization, Field Encodings, Scales.

    Evaluation: How to Properly Evaluate a Predictive Model, Cross-Validation, ROC Spaces and Curves.

  • Unsupervised Learning

    Clustering: Number of Clusters, Dealing with Missing Values, Modeling Clusters, Scaling Fields, Weights, Summary Fields, K-means vs. G-means.

    Association Discovery: Measures (Support, Confidence, Leverage, Significance Level, Lift), Search Strategies (Confidence, Coverage, Leverage, Lift, Support), Missing Items, Discretization.

    Topic Modeling: Topics, Terms, Text analysis.

    Anomaly Detection: Forest Size, Constraints, ID Fields.

  • Combination and Automation

    Stacking.
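
As an illustration of the Association Discovery measures listed in the syllabus, the following is a minimal sketch in plain Python. It uses the standard textbook definitions of support, coverage, confidence, lift, and leverage, which may differ in detail from the way BigML computes them. The function name `association_measures` and the sample transactions are hypothetical and are not part of the course material.

```python
def association_measures(transactions, antecedent, consequent):
    """Compute common measures for the rule antecedent -> consequent.

    transactions: list of sets of items (hypothetical sample data).
    antecedent, consequent: sets of items.
    """
    n = len(transactions)
    if n == 0:
        raise ValueError("at least one transaction is required")

    # Count transactions containing each side of the rule, and both sides.
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)

    support = n_both / n                      # P(A and C)
    coverage = n_a / n                        # P(A)
    confidence = n_both / n_a if n_a else 0.0 # P(C | A)
    lift = confidence / (n_c / n) if n_c else 0.0
    leverage = support - (n_a / n) * (n_c / n)

    return {
        "support": support,
        "coverage": coverage,
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
    }


baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
measures = association_measures(baskets, {"bread"}, {"milk"})
```

In this toy data the lift is below 1 and the leverage is negative, meaning bread and milk appear together slightly less often than they would if they were independent.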

Advanced API

Objective

  • Proficiency in using BigML's API and client-side tools to create ML resources.
  • Integration and automation of the workflows needed to put an ML solution into production.

Pre-requisites

  • Basic knowledge of BigML and its resources (UI-level familiarity is enough).
  • Basic programming skills (some examples are in Python, so knowledge of the language will be a plus).
  • Familiarity with REST APIs.

Syllabus

  • API description

    Domains (bigml.io vs. Private Deployments).

    Authentication.

    Inputs and outputs.

    Resources: Common information, Specifics, Listing and filtering.

  • First level wrappers

    Bindings.

    Methods mapping.

    Field management.

    Local resources.

  • Second level wrappers

    BigMLer.

    Resource management.

    Field management.

    Workflow automation.

    Automated feature engineering.

  • Modeling strategies
  • Predicting strategies

Advanced Data Transformations

Objective

  • Data is typically scattered, unclean, and imperfect. How to make it ML-Ready.
  • Once data is ML-Ready, why/how to make better features.
  • Not all features are good. How to choose and what to watch out for.

Pre-requisites

  • Advanced Modeling Class.
  • Familiarity with: SQL, Python / Pandas, CSV formatting.

Syllabus

  • ML-Ready Data

    What is it?

    Formats.

    Structures for ML tasks.

    Automating Labeling.

  • Data Transformations

    Cleansing Missing Data, Cleaning Data, Better Data.

    Transformations outside BigML: Denormalizing, Aggregating, Pivoting, Time windows, Updates, Streaming Data, Images.

    Transformations inside BigML.

  • Feature Engineering

    Auto Transformations: Date-time parsing, LR/cluster missing, LR/cluster auto-scaling, Bag-of-words (Language, Tokenization, etc.).

    Manual - Flatline: DSL for feature engineering, Basics (s-expressions, Literals, Counters, Field Values / Properties, Strings, Regex, Operators), Limitations.

    Numerics: Discretization, Normalization, Z-score, Built-in math functions, Type-casting, Random, Shocks, Moving averages.

    Date-times: UI timestamp, Epoch, Moon phase.

    Text: JSON key/val, Topic distributions.

  • Feature Selection

    Correlations.

    Leakage.

    Field Importance (ensembles).

    Advanced Selection: Best-First, Boruta.
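
Two of the numeric transformations named in the syllabus, Z-score and moving averages, can be sketched outside BigML in plain Python as follows. This is only an illustration under stated assumptions: it uses the population standard deviation and a trailing window that shrinks at the start of the series, and the function names `z_scores` and `moving_average` are hypothetical. The course itself covers these transformations with Flatline inside BigML.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant field carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def moving_average(values, window):
    """Trailing moving average; early rows use the rows available so far."""
    if window < 1:
        raise ValueError("window must be at least 1")
    averaged = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


standardized = z_scores([2, 4, 4, 4, 5, 5, 7, 9])
smoothed = moving_average([1, 2, 3, 4], window=2)
```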

Advanced WhizzML

Objective

  • Proficiency in using WhizzML, BigML's domain-specific language (DSL), as a server-side tool to automate ML workflows in a scalable, replicable, and shareable way.

Pre-requisites

  • Basic knowledge of BigML and its resources (UI-level familiarity is enough).
  • Familiarity with ML workflows.
  • Basic programming skills (knowledge of a Lisp-family language and/or WhizzML will be a plus).

Syllabus

  • WhizzML directives
  • Directives mappings
  • Simple workflows in WhizzML

    Batch Anomaly Score.

    Evaluation.

    Clustered dataset generation.

  • Advanced workflows in WhizzML

    Cross-validation.

    Covariate shift.

    Stacked generalization.
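
For readers unfamiliar with the cross-validation workflow named above, the splitting step can be sketched in plain Python. This is not WhizzML and not the course's implementation. It is a minimal sketch, assuming rows are identified by integer indices, and the function name `k_fold_indices` is hypothetical.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and the number of rows")

    # Shuffle once with a fixed seed so the split is reproducible.
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)

    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]

    for i, test in enumerate(folds):
        # Every fold except the i-th one forms the training set.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(n_rows=10, k=5))
```

Each row appears in exactly one test fold, so a model is trained and evaluated k times and the k evaluations are then averaged.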

Certifications calendar

       Starts               Certification by     Registered by
1st    November 3, 2016     December 1, 2016     October 31, 2016
2nd    December 22, 2016    January 26, 2017     December 20, 2016
3rd    February 14, 2017    March 9, 2017        February 13, 2017
4th    March 14, 2017       April 6, 2017        March 13, 2017

Certified Architect

This certification track prepares BigML Certified Engineers to become BigML Certified Architects. Once you have passed the BigML Certified Engineer Exam, you are eligible to enroll in the BigML Certified Architect courses.

The certification process consists of 8 online classes of 1.5 hours each. Evaluation will be based on solving a set of theoretical questions and exercises presented during the course. The courses listed below will consist of 2 sessions each to complete the 8 online classes.

Courses

Designing Large-Scale Machine Learning Solutions
Measuring the Impact of Machine Learning Solutions
Using Machine Learning to Solve Machine Learning Problems
Lessons Learned Implementing Machine Learning Solutions