A model that predicts the risk quality of a loan application. This dataset was originally created by Dr. Hans Hofmann at the Institut für Statistik und Ökonometrie, Universität Hamburg.
The data is related to the direct marketing campaigns of a Portuguese banking institution. The campaigns were based on phone calls. Often, more than one contact with the same client was required in order to assess whether or not the product (a bank term deposit) would be subscribed to.
The classification goal is to predict whether the client will subscribe to a term deposit.
Dataset aimed at improving credit scoring by predicting the probability that somebody will experience financial distress in the next two years. The goal is to build a model that borrowers can use to help make the best financial decisions.
• 150,000 borrowers
Dataset structure:
• ID: ID of borrower.
• SeriousDlqin2yrs: Person experienced 90 days past due delinquency or worse (Type: Y/N).
• RevolvingUtilizationOfUnsecuredLines: Total balance on credit cards and personal lines of credit, except real estate and no installment debt like car loans, divided by the sum of credit limits (Type: percentage).
• Age: Age of borrower in years (Type: integer).
• NumberOfTime30-59DaysPastDueNotWorse: Number of times borrower has been 30-59 days past due but no worse in the last 2 years (Type: integer).
• DebtRatio: Monthly debt payments, alimony, living costs divided by monthly gross income (Type: integer).
• MonthlyIncome: Monthly income (Type: real).
• NumberOfOpenCreditLinesAndLoans: Number of open loans (installment, like a car loan or mortgage) and lines of credit (e.g. credit cards) (Type: integer).
• NumberOfTimes90DaysLate: Number of times borrower has been 90 days or more past due (Type: integer).
• NumberRealEstateLoansOrLines: Number of mortgage and real estate loans, including home equity lines of credit (Type: integer).
• NumberOfTime60-89DaysPastDueNotWorse: Number of times borrower has been 60-89 days past due but no worse in the last 2 years (Type: integer).
• NumberOfDependents: Number of dependents in family excluding themselves (spouse, children etc.) (Type: integer).
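The structure above can be sketched as a typed record in Python. This is only an illustration under stated assumptions: that the data is available as CSV whose header matches the field names listed, that missing values appear as an empty string or "NA", and that the target is encoded as Y/N or 1/0. The names Borrower, parse_borrower and load_borrowers, and the two sample rows, are hypothetical and not part of the dataset.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Borrower:
    id: int
    serious_dlq_in_2yrs: bool            # target: 90+ days past due or worse
    revolving_utilization: float
    age: int
    times_30_59_days_past_due: int
    debt_ratio: float                    # parsed as float so fractional ratios are accepted
    monthly_income: Optional[float]      # may be missing (assumption)
    open_credit_lines_and_loans: int
    times_90_days_late: int
    real_estate_loans_or_lines: int
    times_60_89_days_past_due: int
    dependents: Optional[int]            # may be missing (assumption)


def _optional(text, cast):
    """Return None for an empty or 'NA' cell (assumed encoding), else cast it."""
    text = text.strip()
    return None if text in ("", "NA") else cast(text)


def parse_borrower(row: dict) -> Borrower:
    """Convert one CSV row (a dict keyed by column name) into a Borrower."""
    return Borrower(
        id=int(row["ID"]),
        serious_dlq_in_2yrs=row["SeriousDlqin2yrs"].strip().upper() in ("1", "Y"),
        revolving_utilization=float(row["RevolvingUtilizationOfUnsecuredLines"]),
        age=int(row["Age"]),
        times_30_59_days_past_due=int(row["NumberOfTime30-59DaysPastDueNotWorse"]),
        debt_ratio=float(row["DebtRatio"]),
        monthly_income=_optional(row["MonthlyIncome"], float),
        open_credit_lines_and_loans=int(row["NumberOfOpenCreditLinesAndLoans"]),
        times_90_days_late=int(row["NumberOfTimes90DaysLate"]),
        real_estate_loans_or_lines=int(row["NumberRealEstateLoansOrLines"]),
        times_60_89_days_past_due=int(row["NumberOfTime60-89DaysPastDueNotWorse"]),
        dependents=_optional(row["NumberOfDependents"], int),
    )


def load_borrowers(text: str) -> List[Borrower]:
    """Parse CSV text with a header row into a list of Borrower records."""
    return [parse_borrower(row) for row in csv.DictReader(io.StringIO(text))]


# Two made-up rows, for illustration only; they are not taken from the dataset.
SAMPLE_CSV = "\n".join([
    "ID,SeriousDlqin2yrs,RevolvingUtilizationOfUnsecuredLines,Age,"
    "NumberOfTime30-59DaysPastDueNotWorse,DebtRatio,MonthlyIncome,"
    "NumberOfOpenCreditLinesAndLoans,NumberOfTimes90DaysLate,"
    "NumberRealEstateLoansOrLines,NumberOfTime60-89DaysPastDueNotWorse,"
    "NumberOfDependents",
    "1,Y,0.75,45,2,0.8,9000,13,0,6,0,2",
    "2,N,0.10,40,0,0.12,NA,4,0,0,0,NA",
])

borrowers = load_borrowers(SAMPLE_CSV)
```

Keeping MonthlyIncome and NumberOfDependents optional makes missing values explicit rather than silently turning them into zeros, which matters if the records are later fed to a scoring model.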
BBVA contest. Credit card transaction statistics for Barcelona and Madrid between Nov-2012 and Apr-2013.
Zipcodes: Given a zone identifier and a commercial category, it returns the top postal codes from which the clients with the most payments, unique cards and total spending originate.
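The kind of lookup described for Zipcodes can be illustrated with a small Python sketch. It does not call the BBVA service; it only mimics the ranking over a handful of made-up in-memory records. The Stat record, the top_zipcodes function, its parameters and all values below are hypothetical.

```python
from collections import defaultdict
from typing import List, NamedTuple


class Stat(NamedTuple):
    zone: str            # zone identifier where the purchases took place
    category: str        # commercial category
    client_zipcode: str  # postal code the clients originate from
    payments: int        # number of payments
    cards: int           # number of distinct cards within this record
    spent: float         # total amount spent


METRICS = ("payments", "cards", "spent")


def top_zipcodes(stats, zone, category, by="payments", limit=3) -> List[str]:
    """Rank client postal codes for one zone and category by the chosen metric."""
    if by not in METRICS:
        raise ValueError(f"by must be one of {METRICS}")
    totals = defaultdict(float)
    for s in stats:
        if s.zone == zone and s.category == category:
            # Summing 'cards' across records is a simplification: the same
            # card could appear in more than one record.
            totals[s.client_zipcode] += getattr(s, by)
    # Highest total first; ties broken by postal code for a stable order.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [zipcode for zipcode, _ in ranked[:limit]]


# Made-up records, for illustration only.
STATS = [
    Stat("08001", "food", "08002", 120, 80, 3500.0),
    Stat("08001", "food", "08003", 90, 85, 4100.0),
    Stat("08001", "food", "08002", 30, 20, 400.0),
    Stat("08001", "fashion", "08004", 500, 300, 20000.0),
    Stat("28001", "food", "28002", 200, 150, 6000.0),
]

by_payments = top_zipcodes(STATS, "08001", "food", by="payments")
by_spent = top_zipcodes(STATS, "08001", "food", by="spent")
```

With these sample records, ranking by payments and ranking by total spent give different orders, which is why the metric is an explicit argument in this sketch.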
The predicted field "Incomes" is the total income in a location (Seller Zipcode) and month, grouped by Commercial Category and Buyer Zipcode.
This is the dataset that was used for the BigML Webinar on January 28, 2014 for the Winter 2014 Release. For an explanation of how this dataset was created (and what to do with it), see the first few minutes of the webinar.