This is a simplified dataset for predicting inventory demand from historical sales data. The objective is to forecast the demand for a product in a given week at a particular store. The dataset consists of 9 weeks of sales transactions in Mexico.
Every week, delivery trucks deliver products to the vendors. Each transaction consists of sales and returns. Returns are the products that are unsold and expired. The demand for a product in a given week is defined as the sales this week minus the returns next week.
Things to note: The adjusted demand (Demanda_uni_equil) is always >= 0, since demand should be either 0 or a positive value. Venta_uni_hoy - Dev_uni_proxima is sometimes negative because return records sometimes carry over a few weeks.
Semana — Week number (From Thursday to Wednesday)
Agencia_ID — Sales Depot ID
Town — Town of the Agencia
State — State of the Agencia
Canal_ID — Sales Channel ID
Ruta_SAK — Route ID (Several routes = Sales Depot)
Cliente_ID — Client ID
NombreCliente — Client name
Producto_ID — Product ID
NombreProducto — Product Name
Venta_uni_hoy — Sales unit this week (integer)
Venta_hoy — Sales this week (unit: pesos)
Dev_uni_proxima — Returns unit next week (integer)
Dev_proxima — Returns next week (unit: pesos)
Demanda_uni_equil — Adjusted Demand (integer) (This is the target you will predict)
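The demand definition above can be sketched in a few lines of Python. This is only an illustration: the description says that the adjusted demand is never negative while the raw difference sometimes is, so flooring the difference at 0 is an assumption here, not a documented rule of the dataset. The function name and the sample rows are hypothetical.

```python
def adjusted_demand(venta_uni_hoy: int, dev_uni_proxima: int) -> int:
    """Units sold this week minus units returned next week.

    Assumption: a negative difference is floored at 0, so the result
    satisfies the stated constraint Demanda_uni_equil >= 0.
    """
    return max(venta_uni_hoy - dev_uni_proxima, 0)


# Hypothetical rows using the column names from the field list.
rows = [
    {"Venta_uni_hoy": 5, "Dev_uni_proxima": 1},
    {"Venta_uni_hoy": 2, "Dev_uni_proxima": 3},  # returns carried over
    {"Venta_uni_hoy": 0, "Dev_uni_proxima": 0},
]

demands = [adjusted_demand(r["Venta_uni_hoy"], r["Dev_uni_proxima"]) for r in rows]
print(demands)
```

The second row shows the carry-over case: the raw difference is negative, and the sketch reports a demand of 0.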
Source: Grupo Bimbo Inventory Demand competition at Kaggle.
The data featured here are mainly from Harbour City Ferries (formerly Sydney Ferries). We have also added statistics and publications about ferry usage from other BTS collections such as the Household Travel Survey.
Harbour City Ferries compiles data on patronage and ticket validations. The Ferry Load Census data are now available and are provided by wharf and route. These are based on a week-long census of ferry services conducted in May and November of each year. Users should note that these two periods do not represent the lowest and highest figures, which are known to occur in June and December.
Ticket Validations Data, to be made available later, are based on actual validations at Circular Quay. Ferry Patronage data, estimated from both the Ferry Load Census and Ticket Validations data, are now available.
Based on the original passenger list, this dataset contains all Titanic passengers and crew.
This dataset, downloaded from Buzzdata, contains bicycle collision data for 31,480 accidents in Toronto, Canada, between 1986 and 2010.
There were several fields that had corrupted data (latitude/longitude with text, dates with random text), inconsistent data (different street name abbreviations, different ways of marking missing data) and improper data (time of day formatted as HHMM instead of HH:MM, age with leading zeros) which I cleaned up.
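As an illustration of two of the fixes mentioned above (time of day and age), here is a minimal Python sketch. It is not the cleanup code that was actually used. The function names, the set of missing-value markers, and the choice to return None for unparseable values are all assumptions.

```python
from typing import Optional

# Hypothetical markers for missing values; the real data may use others.
MISSING_MARKERS = {"", "na", "n/a", "unknown", "-"}


def normalize_time(raw: str) -> Optional[str]:
    """Convert a time of day written as HHMM into HH:MM.

    Values already written as HH:MM are accepted. Anything that is
    missing or not a valid time returns None.
    """
    text = raw.strip()
    if text.lower() in MISSING_MARKERS:
        return None
    if len(text) == 5 and text[2] == ":":
        text = text[:2] + text[3:]  # already HH:MM, drop the colon
    if len(text) != 4 or not (text.isascii() and text.isdigit()):
        return None
    hours, minutes = int(text[:2]), int(text[2:])
    if hours > 23 or minutes > 59:
        return None
    return f"{hours:02d}:{minutes:02d}"


def normalize_age(raw: str) -> Optional[int]:
    """Parse an age such as '035' into the integer 35."""
    text = raw.strip()
    if not (text.isascii() and text.isdigit()):
        return None
    return int(text)


print(normalize_time("0930"), normalize_age("035"))
```

Returning None for bad values keeps corrupted entries distinguishable from real ones, so they can be counted or dropped in a later step.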
Titanic Survival Simplified.
Passenger survival based only on age range, class/dept and boarding.
Data taken from http://www.encyclopedia-titanica.org/
One instance per hour has been added, along with a high-traffic flag.
The dataset has 2 objective fields: car counts (numeric) or high traffic (boolean). It can be used to predict traffic volume on New York street segments.
The dataset is sorted by date in ascending order.