Association Discovery

Reveal statistically significant association rules in your data

Association Discovery is a rule-based, unsupervised Machine Learning method for discovering relations between variables in high-dimensional datasets. Its goal is to find statistically significant rules, ranked by a chosen measure of interestingness. Associations go beyond simple variable correlations: they reveal sets of rules stating which particular values of some variables imply specific values of other variables in your dataset.
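To make the idea of a rule and its interestingness measure concrete, here is a minimal sketch that scores one rule on a toy dataset using three common measures: support, confidence, and lift. The transactions and the function name `rule_measures` are made up for illustration, and this is not BigML's implementation, only the textbook definitions of those measures.

```python
from typing import Dict, FrozenSet, List

# Hypothetical toy data: each transaction is the set of items in one basket.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "jam"}),
    frozenset({"milk", "jam"}),
    frozenset({"bread", "butter", "jam"}),
]


def count(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> int:
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def rule_measures(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    transactions: List[FrozenSet[str]],
) -> Dict[str, float]:
    """Score the rule `antecedent -> consequent` with standard measures."""
    n = len(transactions)
    n_a = count(antecedent, transactions)
    n_c = count(consequent, transactions)
    n_both = count(antecedent | consequent, transactions)
    if n == 0 or n_a == 0 or n_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    return {
        # Fraction of all transactions containing both sides of the rule.
        "support": n_both / n,
        # Of the transactions matching the antecedent, the share that also
        # match the consequent.
        "confidence": n_both / n_a,
        # How much more often both sides co-occur than expected if they
        # were independent; values above 1 suggest a positive association.
        "lift": (n_both * n) / (n_a * n_c),
    }


measures = rule_measures(frozenset({"bread"}), frozenset({"butter"}), TRANSACTIONS)
print(measures)
```

In this toy data, a lift above 1 for "bread implies butter" means the two items appear together more often than their individual frequencies alone would suggest.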

Applications of Association Discovery

Statistically significant association rules can help answer questions like "Which products are purchased together?" or "What is the next most likely action of a user?" In addition to the widely known market basket analysis and next best offer use cases, Associations are used in recommender systems, cross-sell/upsell analysis, marketing campaign analysis, Web usage mining, digital forensics, continuous production, bioinformatics, and numerous scientific applications ranging from health data mining and cancer mortality studies to controlling robots and improving e-learning.

Best-in-class algorithm

Association Discovery concentrates on discovering relationships between values rather than variables. BigML's implementation is a highly scalable method that routinely handles high-dimensional datasets containing thousands of distinct values. BigML is the first Machine Learning platform to offer this unsupervised method, which was originally developed by a world-renowned research scientist who currently serves as a BigML Scientific Advisor.

Highly interpretable results

Associations are easily expressed as rules that can be understood by non-experts. By default, BigML presents the Top 100 associations in your dataset, ranked by your chosen association measure, in a table view. Each rule can be analyzed further using the detailed Venn diagram presented in the same view. Additionally, the network chart visualization gives you a bird's-eye view of how all the identified associations relate to one another. Depending on your case, you may be less interested in the strongest relationships than in the rules that meet certain conditions. You can use the filters in either view to declutter your rules and iteratively unearth the most useful ones for your context. For example, you can filter by the field of interest in the consequent (the right-hand side of the rule) and by one or more allowed values for that field, so that only associations whose consequent states Diabetes = 'TRUE' are shown.
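The consequent filter described above can be sketched in a few lines. The `Rule` class, the `filter_rules` helper, and every rule and number below are hypothetical and exist only to illustrate the filtering logic; they do not reflect BigML's internal data model or any real dataset.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Rule:
    antecedent: Tuple[str, ...]  # conditions on the left-hand side
    consequent: Tuple[str, str]  # (field, value) on the right-hand side
    lift: float                  # stand-in for the chosen association measure


# Made-up rules, for illustration only.
RULES: List[Rule] = [
    Rule(("BMI > 30",), ("Diabetes", "TRUE"), 2.1),
    Rule(("Age <= 30",), ("Diabetes", "FALSE"), 1.4),
    Rule(("Glucose > 140", "BMI > 30"), ("Diabetes", "TRUE"), 3.2),
    Rule(("Exercise = weekly",), ("Blood pressure", "normal"), 1.7),
]


def filter_rules(
    rules: List[Rule], field: str, allowed_values: Set[str], top_n: int = 100
) -> List[Rule]:
    """Keep rules whose consequent is `field` with an allowed value,
    then return the top `top_n` by the chosen measure (here, lift)."""
    matching = [
        r for r in rules
        if r.consequent[0] == field and r.consequent[1] in allowed_values
    ]
    return sorted(matching, key=lambda r: r.lift, reverse=True)[:top_n]


diabetes_rules = filter_rules(RULES, "Diabetes", {"TRUE"})
for rule in diabetes_rules:
    print(" AND ".join(rule.antecedent), "->", rule.consequent, rule.lift)
```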

Real-time or customizable Batch Association Sets

Once you create an Association model, you can use the Association Sets capability to predict the items that are most strongly associated with your input data. For example, given a set of products purchased by a person, what other products are most likely to be ordered? All the predicted items will be ranked according to a similarity score, and they will be displayed in a table view. You can also visualize each predicted rule in a Venn diagram to get a sense of the correlation strength between the input data and the predicted items.
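As a rough illustration of how such a ranking could work, the sketch below matches rules against the input items and ranks the items they predict. It is an assumption-laden toy: the rules are made up, the `association_set` function is hypothetical, and a generic per-rule `score` stands in for the similarity score, whose actual definition in BigML is not given here.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Tuple


class Rule(NamedTuple):
    antecedent: FrozenSet[str]  # items that must all be present in the input
    consequent: str             # item the rule predicts
    score: float                # hypothetical stand-in for the ranking score


# Made-up rules, for illustration only.
RULES: List[Rule] = [
    Rule(frozenset({"bread"}), "butter", 0.75),
    Rule(frozenset({"bread", "jam"}), "butter", 0.50),
    Rule(frozenset({"milk"}), "cereal", 0.60),
    Rule(frozenset({"bread", "milk"}), "eggs", 0.40),
    Rule(frozenset({"coffee"}), "sugar", 0.90),
]


def association_set(
    rules: List[Rule], input_items: FrozenSet[str], top_n: int = 10
) -> List[Tuple[str, float]]:
    """Rank items predicted by rules whose antecedent is covered by the input."""
    best: Dict[str, float] = {}
    for rule in rules:
        applies = rule.antecedent <= input_items
        is_new = rule.consequent not in input_items
        if applies and is_new:
            # Keep the best score seen for each predicted item.
            best[rule.consequent] = max(best.get(rule.consequent, 0.0), rule.score)
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


ranked = association_set(RULES, frozenset({"bread", "milk"}))
print(ranked)
```

Items already in the input are excluded from the result, since the question being asked is what else is likely to be ordered.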

Fully programmable Associations

In addition to the point-and-click mode on the BigML Dashboard, you can programmatically create, list, delete, and use your Associations through BigML's REST API and its bindings for popular languages. You can use BigML with Python, Node.js, Java, Swift, C#, or other languages. Associations are also supported by WhizzML, our domain-specific language for automating Machine Learning workflows, implementing high-level Machine Learning algorithms, and sharing them with others.
