Model that predicts the credit risk of a loan application. The dataset was originally created by Dr. Hans Hofmann at the Institut für Statistik und Ökonometrie, Universität Hamburg.
Model to predict survival on the Titanic, based on Class/Department, Age, Group, City of Embarkment, and Job, with Survived as the objective field. The data comes from http://www.encyclopedia-titanica.org. The Fare was indexed to current prices, based on a retail index.
This predictive model distinguishes between bad connections (intrusions or attacks) and good, normal connections. The data is part of the data created for the Third International Knowledge Discovery and Data Mining Tools Competition, which was held in conjunction with KDD-99, the Fifth International Conference on Knowledge Discovery and Data Mining.
Model with word counts for about 47,000 text documents, of which roughly 32,000 are novels, 7,500 are Supreme Court opinions, and 7,500 are webpages from universities. The features are word counts for the top 3,000 words by TF*IDF, with stopwords removed.
A model that predicts miles per gallon for a series of older automobiles, based on other information about the car, such as acceleration and horsepower. The data is taken from the UCI Machine Learning Repository at http://archive.ics.uci.edu/ml/datasets/Auto+MPG.
768 instances of medical information on females of Pima Indian heritage. The data was originally owned by the National Institute of Diabetes and Digestive and Kidney Diseases.