2011 Reading habits
This dataset consists of 14 demographic attributes of shopping mall customers in the San Francisco Bay area.
The goal is to predict the Annual Income of Household from the other 13 demographic attributes.
ANNUAL INCOME OF HOUSEHOLD (PERSONAL INCOME IF SINGLE) - 1 = Less than $10,000 - 2 = $10,000 to $14,999 - 3 = $15,000 to $19,999 - 4 = $20,000 to $24,999 - 5 = $25,000 to $29,999 - 6 = $30,000 to $39,999 - 7 = $40,000 to $49,999 - 8 = $50,000 to $74,999 - 9 = $75,000 or more
SEX
- 1 = Male
- 2 = Female
MARITAL STATUS - 1 = Married - 2 = Living together, not married - 3 = Divorced or separated - 4 = Widowed - 5 = Single, never married
AGE - 1 = 14 thru 17 - 2 = 18 thru 24 - 3 = 25 thru 34 - 4 = 35 thru 44 - 5 = 45 thru 54 - 6 = 55 thru 64 - 7 = 65 and Over
EDUCATION - 1 = Grade 8 or less - 2 = Grades 9 to 11 - 3 = Graduated high school - 4 = 1 to 3 years of college - 5 = College graduate - 6 = Grad Study
OCCUPATION
- 1 = Professional/Managerial
- 2 = Sales Worker
- 3 = Factory Worker/Laborer/Driver
- 4 = Clerical/Service Worker
- 5 = Homemaker
- 6 = Student, HS or College
- 7 = Military
- 8 = Retired
- 9 = Unemployed
HOW LONG HAVE YOU LIVED IN THE SAN FRAN./OAKLAND/SAN JOSE AREA? - 1 = Less than one year - 2 = One to three years - 3 = Four to six years - 4 = Seven to ten years - 5 = More than ten years
DUAL INCOMES (IF MARRIED) - 1 = Not Married - 2 = Yes - 3 = No
PERSONS IN YOUR HOUSEHOLD - 1 = One - 2 = Two - 3 = Three - 4 = Four - 5 = Five - 6 = Six - 7 = Seven - 8 = Eight - 9 = Nine or more
PERSONS IN HOUSEHOLD UNDER 18 - 0 = None - 1 = One - 2 = Two - 3 = Three - 4 = Four - 5 = Five - 6 = Six - 7 = Seven - 8 = Eight - 9 = Nine or more
HOUSEHOLDER STATUS - 1 = Own - 2 = Rent - 3 = Live with Parents/Family
TYPE OF HOME - 1 = House - 2 = Condominium - 3 = Apartment - 4 = Mobile Home - 5 = Other
ETHNIC CLASSIFICATION
- 1 = American Indian
- 2 = Asian
- 3 = Black
- 4 = East Indian
- 5 = Hispanic
- 6 = Pacific Islander
- 7 = White
- 8 = Other
WHAT LANGUAGE IS SPOKEN MOST OFTEN IN YOUR HOME? - 1 = English - 2 = Spanish - 3 = Other
This dataset is an extract from a survey performed by Impact Resources, Inc., Columbus, OH (1987).
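The coded values in the codebook above can be turned back into readable labels with a small lookup table. The following is a minimal sketch for the income attribute only; the names `INCOME_BRACKETS` and `decode_income` are hypothetical and not part of the dataset, and the open-ended top bracket is represented here with `None` as an assumption.

```python
# Income codes from the codebook: code -> (lower bound, upper bound) in dollars.
# None marks the open-ended top bracket. All names are illustrative only.
INCOME_BRACKETS = {
    1: (0, 9_999),
    2: (10_000, 14_999),
    3: (15_000, 19_999),
    4: (20_000, 24_999),
    5: (25_000, 29_999),
    6: (30_000, 39_999),
    7: (40_000, 49_999),
    8: (50_000, 74_999),
    9: (75_000, None),
}


def decode_income(code):
    """Return the codebook label for an income code between 1 and 9."""
    low, high = INCOME_BRACKETS[code]
    if high is None:
        return f"${low:,} or more"
    if low == 0:
        return f"Less than ${high + 1:,}"
    return f"${low:,} to ${high:,}"
```

Because the codes are ordinal but the brackets have unequal widths, a model that treats the code as a plain number is implicitly assuming equal spacing between brackets; keeping the bounds available makes that choice explicit.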
Based on an omnibus survey containing questions on people's Facebook habits and attitudes.
Happiness survey based on several factors.
Abstract: Predict whether income exceeds $50K/yr based on census data. Also known as "Census Income" dataset.
Ronny Kohavi and Barry Becker
Data Mining and Visualization
e-mail: firstname.lastname@example.org for questions.
Data Set Information: Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the following conditions: ((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0))
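The extraction condition quoted above can be read as a simple record filter. The sketch below assumes each record is a dictionary keyed by the four field names that appear in the condition; the sample records are made up for illustration and are not taken from the Census database.

```python
def is_clean_record(record):
    """Apply the four extraction conditions: AAGE>16, AGI>100, AFNLWGT>1, HRSWK>0."""
    return (
        record["AAGE"] > 16
        and record["AGI"] > 100
        and record["AFNLWGT"] > 1
        and record["HRSWK"] > 0
    )


# Hypothetical records, invented for this example only.
records = [
    {"AAGE": 39, "AGI": 40000, "AFNLWGT": 77516, "HRSWK": 40},  # passes all four
    {"AAGE": 16, "AGI": 5000, "AFNLWGT": 1200, "HRSWK": 20},    # fails: age is not > 16
    {"AAGE": 52, "AGI": 30000, "AFNLWGT": 2000, "HRSWK": 0},    # fails: no hours worked
]

clean = [r for r in records if is_clean_record(r)]
```

All four comparisons are strict, so a 16-year-old or a record with zero weekly hours is excluded.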
Prediction task is to determine whether a person makes over 50K a year.
Listing of attributes:
workclass: Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked.
education: Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool.
marital-status: Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse.
occupation: Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces.
relationship: Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried.
race: White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black.
sex: Female, Male.
native-country: United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands.
Data based on administrative records (individual income tax returns) from the Internal Revenue Service's Individual Master File (IMF) system, which includes a record for every Form 1040, 1040A, and 1040EZ filed with the IRS. The records included in this study were returns that were filed between January 1, 2009 and December 31, 2009. Generally, these are Tax Year 2008 returns although a limited number of late-filed returns for tax years before 2008 were also filed during this period. If a taxpayer filed returns for multiple years during this period, only the most recent return was included.
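The selection rule described above (returns filed during calendar year 2009, keeping one return per taxpayer) can be sketched as follows. The field names `taxpayer_id`, `tax_year`, and `filed_on` are hypothetical, and "most recent return" is interpreted here as the return with the latest tax year, which is an assumption rather than something the description states.

```python
from datetime import date

# Filing window stated in the description.
WINDOW_START = date(2009, 1, 1)
WINDOW_END = date(2009, 12, 31)


def select_returns(returns):
    """Keep returns filed inside the window, one per taxpayer (latest tax year)."""
    latest = {}
    for ret in returns:
        # Skip anything filed outside January 1 to December 31, 2009.
        if not (WINDOW_START <= ret["filed_on"] <= WINDOW_END):
            continue
        current = latest.get(ret["taxpayer_id"])
        # Assumption: "most recent" means the highest tax year.
        if current is None or ret["tax_year"] > current["tax_year"]:
            latest[ret["taxpayer_id"]] = ret
    return list(latest.values())
```

A taxpayer who filed a late Tax Year 2007 return and a Tax Year 2008 return in 2009 would contribute only the 2008 return under this reading.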
Princeton Survey Research Associates International for The Pew Research Center's Internet & American Life Project
Source: http://pewinternet.org