A model that predicts the risk quality of a loan application. The dataset was originally created by Dr. Hans Hofmann at the Institut für Statistik und Ökonometrie, Universität Hamburg.
Data on loan delinquency for loans issued through LendingClub.com, based on about 50,000 loans. The data is available at http://lendingclub.com/info
768 instances of medical information on females of Pima Indian heritage. Originally owned by the National Institute of Diabetes and Digestive and Kidney Diseases.
A model that predicts the output of a solar power system installed in Berkeley, CA. The data was compiled by Ph.D. candidate Alexandra Constantin and is available at www.eecs.berkeley.edu/~alexacon/.
Dataset with 3,333 instances of customer behavior and churn indicator.
Data on fuel economy (miles per gallon) for a series of older automobiles, together with other attributes of each car, such as acceleration and horsepower. Taken from the UCI Machine Learning Repository at http://archive.ics.uci.edu/ml/datasets/Auto+MPG.
Dataset with results from 4,500 hospital patient surveys. The data comes from a list of hospital ratings for the Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS), a national, standardized survey of hospital patients about their experiences during a recent inpatient hospital stay. Available at https://data.medicare.gov/dataset/Survey-of-Patients-Hospital-Experiences-HCAHPS-/rj76-22dk
Model with word counts for about 47,000 text documents, of which roughly 32,000 are novels, 7,500 are Supreme Court opinions, and 7,500 are webpages from universities. The features are word counts for the top 3,000 words by TF*IDF, with stopwords removed.
A dataset for distinguishing between good and bad connections (intrusions). It was part of the data created for the Third International Knowledge Discovery and Data Mining Tools Competition, which was held in conjunction with KDD-99, the Fifth International Conference on Knowledge Discovery and Data Mining.