Nine-month academic salaries for 2008-09 for Assistant Professors, Associate Professors and Professors at a college in the U.S. The data were collected as part of the ongoing effort of the college's administration to monitor salary differences between male and female faculty members.
Fox, J. and Weisberg, S. (2011). An R Companion to Applied Regression, Second Edition. Sage.
Predict average SAT scores for math given the school demographic and performance characteristics. Data taken from the NYC Dept. of Education.
Number-of-views prediction for the Kaggle See Click Predict Fix competition.
Kaggle Higgs Boson Machine Learning Challenge
American Colleges and Universities
Playmate of the Year stats 1957-2012. Bust prediction.
Credits: Based on the Nathany dataset (1960-2007)
Library visit patterns based on 8,000+ public libraries in the USA.
The objective is to identify each of a large number of black-and-white rectangular pixel displays as one of the 26 capital letters in the English alphabet. The character images were based on 20 different fonts and each letter within these 20 fonts was randomly distorted to produce a file of 20,000 unique stimuli. Each stimulus was converted into 16 primitive numerical attributes (statistical moments and edge counts) which were then scaled to fit into a range of integer values from 0 through 15. We typically train on the first 16000 items and then use the resulting model to predict the letter category for the remaining 4000. See the article cited above for more details.
letter - capital letter (26 values from A to Z)
x-box - horizontal position of box (integer)
y-box - vertical position of box (integer)
width - width of box (integer)
high - height of box (integer)
onpix - total # on pixels (integer)
x-bar - mean x of on pixels in box (integer)
y-bar - mean y of on pixels in box (integer)
x2bar - mean x variance (integer)
y2bar - mean y variance (integer)
xybar - mean x y correlation (integer)
x2ybr - mean of x * x * y (integer)
xy2br - mean of x * y * y (integer)
x-ege - mean edge count left to right (integer)
xegvy - correlation of x-ege with y (integer)
y-ege - mean edge count bottom to top (integer)
yegvx - correlation of y-ege with x (integer)
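The train/predict protocol described for the letter data (fit on the first 16000 items, predict the remaining 4000) can be sketched as follows. This is only an illustration under stated assumptions: it assumes each record is a comma-separated line with the letter first and the 16 integer attributes after it, in the order listed above, and it uses a plain 1-nearest-neighbour rule as a stand-in classifier, which is not a method named in the description. The function names are hypothetical.

```python
# Column order as listed above: the class label, then the 16 integer attributes.
COLUMNS = ["letter", "x-box", "y-box", "width", "high", "onpix", "x-bar", "y-bar",
           "x2bar", "y2bar", "xybar", "x2ybr", "xy2br", "x-ege", "xegvy", "y-ege", "yegvx"]


def parse_row(line):
    """Parse one record into (letter, [16 ints]).

    Assumption: records are comma-separated, letter first.
    """
    fields = line.strip().split(",")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    return fields[0], [int(value) for value in fields[1:]]


def split_rows(rows, n_train=16000):
    """Keep file order: the first n_train rows train, the rest are held out."""
    return rows[:n_train], rows[n_train:]


def predict_1nn(train, features):
    """Return the letter of the closest training row (squared Euclidean distance).

    1-NN is an illustrative stand-in, not the classifier used by the source.
    """
    nearest = min(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[1], features)),
    )
    return nearest[0]


def accuracy(train, test):
    """Fraction of held-out rows whose predicted letter matches the true one."""
    hits = sum(predict_1nn(train, features) == letter for letter, features in test)
    return hits / len(test)
```

With the real file loaded as a list of parsed rows, `train, test = split_rows(rows)` followed by `accuracy(train, test)` would reproduce the 16000/4000 protocol. A brute-force 1-NN pass over that many rows is slow in pure Python, so an indexed or vectorised classifier may be preferable in practice.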