Every Machine Learning project starts with data, and data can come from many sources. This is especially true for complex enterprise computing environments. In response to the need to import data directly from external databases to streamline Machine Learning workflows, BigML now supports MySQL, SQL Server, and Elasticsearch in addition to PostgreSQL.
BigML brings Linear Regression to the platform, a well-known algorithm for supervised learning. This technique is widely applied across industries and is simple to understand, allowing for high interpretability. It assumes a linear relationship between the input fields and the objective field, enabling you to summarize relationships between quantitative, continuous variables.
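To make the linear assumption concrete, here is a minimal sketch of ordinary least squares with a single input field, written in plain Python. It is an illustration only and not BigML's implementation; the function name `fit_line` and the sample data are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for one input field; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the input field needs at least two distinct values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical data that follows y = 2x + 1 exactly
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
prediction = slope * 5 + intercept  # predicted objective value for input 5
```

The fitted slope and intercept are the whole model, which is why the technique is so easy to interpret: the slope says how much the objective field changes per unit of the input field.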
BigML brings Principal Component Analysis (PCA) to the platform, a key unsupervised learning technique used to transform a given dataset in order to yield uncorrelated features and reduce dimensionality.
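As a rough illustration of the idea, the sketch below finds the first principal component of a small dataset by running power iteration on its covariance matrix. It is a teaching example under simplifying assumptions (plain Python, a fixed number of iterations, an all-ones starting vector that must not be orthogonal to the leading direction) and does not describe how BigML computes PCA. The function name and data are hypothetical.

```python
from math import sqrt

def first_principal_component(rows, iterations=100):
    """Return the unit vector along which the centered data varies the most."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the fields
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # starting guess for the power iteration
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("data has no variance along the starting direction")
        v = [x / norm for x in w]
    return v

# Hypothetical dataset with two perfectly correlated fields
pc = first_principal_component([(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)])
```

Because the two fields move together, one component along the diagonal captures all of the variance, so the dataset can be reduced from two dimensions to one without losing information.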
BigML brings Data Transformations to the BigML platform, a key part of any Machine Learning workflow. The new capabilities include the ability to perform SQL-style queries, Flatline editor improvements, and more ways to do feature engineering.
Fusions solve classification and regression problems using the combination of several supervised models (models, ensembles, logistic regressions, and deepnets) to provide better performance than any of the individual components.
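One simple way to combine several classifiers is to average the class probabilities they each produce and pick the class with the highest average. The sketch below assumes that strategy for illustration; the function name, the class labels, and the probabilities are hypothetical, and the code is not taken from BigML.

```python
def fuse_probabilities(per_model_probs):
    """Average class probabilities across models; return (best_class, averaged)."""
    classes = per_model_probs[0].keys()
    n = len(per_model_probs)
    averaged = {c: sum(p[c] for p in per_model_probs) / n for c in classes}
    best = max(averaged, key=averaged.get)
    return best, averaged

# Hypothetical outputs from three component models for one instance
component_outputs = [
    {"churn": 0.6, "stay": 0.4},  # e.g. a single model
    {"churn": 0.4, "stay": 0.6},  # e.g. a logistic regression
    {"churn": 0.8, "stay": 0.2},  # e.g. a deepnet
]
best_class, averaged = fuse_probabilities(component_outputs)
```

Here the second component disagrees with the other two, but the averaged probabilities still favor "churn", which shows how a combination can smooth over the weaknesses of an individual model.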
OptiML is an optimization process for model selection and parameterization that automatically finds the best supervised model to help you solve classification and regression problems. OptiML is available from the Dashboard, API, and WhizzML.
Operating thresholds are a key feature that lets you fine-tune the performance of your classification models. Organizations provide a convenient collaborative space that breaks down barriers for companies adopting ML across their corporate structure.
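The effect of an operating threshold can be sketched in a few lines: the positive class is predicted only when its probability reaches the threshold, so moving the threshold trades precision against recall. The function name, labels, and probabilities below are hypothetical and serve only to illustrate the trade-off.

```python
def precision_recall(probs, labels, threshold, positive="yes"):
    """Precision and recall when the positive class needs prob >= threshold."""
    predicted = [p >= threshold for p in probs]
    actual = [label == positive for label in labels]
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical positive-class probabilities and true labels
probs = [0.9, 0.7, 0.4, 0.2]
labels = ["yes", "no", "yes", "no"]

strict = precision_recall(probs, labels, threshold=0.8)   # fewer positives
lenient = precision_recall(probs, labels, threshold=0.3)  # more positives
```

With the strict threshold every positive prediction is correct but one true positive is missed; with the lenient threshold every true positive is found at the cost of a false alarm.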
Deepnets are an optimized version of Deep Neural Networks, the machine-learned models loosely inspired by the neural circuitry of the human brain. With Deepnets you can solve classification and regression problems such as speech recognition, text classification, image classification, or object detection tasks, among other use cases.
Time Series is a supervised learning method for analyzing time-based data when historical patterns can explain future behavior. It is available from the BigML Dashboard, the API, and WhizzML for automation. Time Series is commonly used for predicting stock prices, sales forecasting, website traffic, and production and inventory analysis, among many other use cases.
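To show how historical values can drive a forecast, here is a minimal sketch of simple exponential smoothing, one classic forecasting technique. It is offered as an illustration and is not a description of the exact models BigML fits; the function name, the smoothing factor, and the sales figures are hypothetical.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast after smoothing the whole series."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for value in series[1:]:
        # Blend the newest observation with the running level
        level = alpha * value + (1 - alpha) * level
    return level

# Hypothetical monthly sales
forecast = simple_exponential_smoothing([10, 12, 14], alpha=0.5)
```

A larger `alpha` makes the forecast react faster to recent observations, while a smaller one leans more heavily on older history.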
BigML launches Boosted Trees, the third ensemble-based strategy that helps you easily solve classification and regression problems. This Machine Learning technique allows each tree model to concentrate on the wrong predictions of the previously grown tree to correct and improve on any mistakes made in those previous iterations.
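The idea of each tree correcting its predecessors can be sketched with a tiny regression example in which every round fits a one-split tree (a stump) to the errors left by the previous rounds. This is a simplified illustration under assumptions of our own (one input field, squared error, a fixed learning rate), not BigML's implementation; all names and data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Best single split (threshold, left_value, right_value) by squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # rows with x <= t go to the left leaf
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    if best is None:
        raise ValueError("the input field needs at least two distinct values")
    return best[1:]

def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Fit stumps sequentially, each one to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        preds = [p + learning_rate * (lv if x <= t else rv)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps

def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(lv if x <= t else rv for t, lv, rv in stumps)

def mse(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Hypothetical training data
xs, ys = [1, 2, 3, 4, 5, 6], [1, 1, 2, 4, 5, 5]
baseline_mse = mse(boost(xs, ys, rounds=0), xs, ys)  # mean-only model
boosted_mse = mse(boost(xs, ys, rounds=20), xs, ys)
```

Comparing `baseline_mse` with `boosted_mse` shows the training error shrinking as later stumps concentrate on what earlier ones got wrong.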
Topic Models is the resource that helps you easily find thematically related terms in your text data. Discover BigML’s implementation of the underlying Latent Dirichlet Allocation (LDA) technique, one of the most popular probabilistic methods for topic modeling tasks.
BigML brings Logistic Regression to your Dashboard, one of the most popular methods used to solve classification problems. Now you can build a Logistic Regression with a single click, introspect it by using intuitive visualizations, evaluate it like any other classification model, fine-tune it via handy configuration options, and create individual or batch predictions from it with ease.
WhizzML is a new domain-specific language for automating Machine Learning workflows, implementing high-level Machine Learning algorithms, and easily sharing them with others. WhizzML offers out-of-the-box scalability, abstracts away the complexity of underlying infrastructure, and helps analysts, developers, and scientists reduce the burden of repetitive and time-consuming analytics tasks.
BigML is the first Machine Learning service offering Association Discovery on the cloud. With Association Discovery you can pinpoint hidden relations between values of your variables in high-dimensional datasets with just one click. It is very useful for market basket analysis, web usage patterns, intrusion detection, fraud detection, or bioinformatics.
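For market basket analysis, the strength of a relation such as "bread implies butter" is usually summarized with support, confidence, and lift. The sketch below computes these three standard measures for one rule; the function name and the baskets are hypothetical, and the code illustrates the measures rather than BigML's discovery algorithm.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    n = len(transactions)
    with_a = sum(1 for t in transactions if antecedent <= t)
    with_c = sum(1 for t in transactions if consequent <= t)
    with_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    if with_a == 0 or with_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = with_both / n            # how often the rule applies overall
    confidence = with_both / with_a    # how often it holds when it applies
    lift = confidence / (with_c / n)   # above 1 means a positive association
    return support, confidence, lift

# Hypothetical shopping baskets
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
]
support, confidence, lift = rule_metrics(baskets, {"bread"}, {"butter"})
```

In this toy data the lift is above 1, meaning butter shows up with bread more often than it would if the two were independent.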
The enhanced version of BigML includes: the Sample Service, which gives fast access to datasets kept in an in-memory cache and is very convenient for filtering and correlation techniques; the Dynamic Scatterplot, a graph that lets you visualize your samples in different ways and helps you detect interesting patterns in your data, correlations among your fields, or anomalous data points amidst other observations; the G-Means algorithm, ideal for creating clusters when you do not know in advance how many clusters to build from your dataset; Projects, which help you organize the Machine Learning resources you create in BigML; and more integrations you do not want to miss.