Datasets for Data Analytics and Visualization

This page pulls together a collection of reliable data sources for visualization and analysis projects. Feel free to use this list as a jumping-off point for homework and course projects! It is by no means exhaustive; there are countless other great data sources available on the internet. If you’d like to suggest an addition to this page, you can do so with this form.

Repositories

Responsible Datasets in Context

https://www.responsible-datasets-in-context.com

UCI Machine Learning Repository

https://archive.ics.uci.edu/

Data is Plural

https://www.data-is-plural.com/archive/

DataHub

https://datahub.io/collections

Awesome Public Datasets

https://github.com/awesomedata/awesome-public-datasets

Data for existing articles

FiveThirtyEight

https://github.com/fivethirtyeight/data

New York Times

https://github.com/nytimes

https://github.com/theupshot

Washington Post

https://github.com/washingtonpost

ProPublica

https://github.com/propublica

CityLab

https://github.com/theatlantic/citylab-data

Society

Stanford open policing dataset

https://openpolicing.stanford.edu/data/

Zillow house prices

https://www.zillow.com/research/data/

Gun Violence Archive

https://www.gunviolencearchive.org/

Gapminder data

https://www.gapminder.org/data/

Our World in Data

https://ourworldindata.org/

World Inequality Database

https://wid.world/

World Bank Open Data

https://datacatalog.worldbank.org/home

Emerson College Polling

https://emersoncollegepolling.com/

U.S. Economy

Federal Reserve data

https://fred.stlouisfed.org/

Bureau of Economic Analysis

https://www.bea.gov/data

Congressional Budget Office

https://www.cbo.gov/data/budget-economic-data

Campaign Finance Data

https://www.fec.gov/data/

Bureau of Labor Statistics

https://www.bls.gov/

Other economic data resources

https://www.aeaweb.org/resources/data

City/State data (housing, health, transportation, etc.)

California Open Data

https://data.ca.gov/

NYC Open Data

https://opendata.cityofnewyork.us/

NYS Open Data

https://data.ny.gov/

LA Open Data

https://data.lacity.org/

US Census Bureau

https://data.census.gov/

Public health

CDC Data Portal

https://data.cdc.gov/

Sports and games

Data sources for video games

https://github.com/leomaurodesenv/game-datasets

Lahman Baseball database

http://seanlahman.com/

Kaggle NBA Database

https://www.kaggle.com/datasets/wyattowalsh/basketball

Sports Reference (paid for query access)

https://www.sports-reference.com

FanGraphs (paid)

https://www.fangraphs.com/

Climate

NOAA

https://psl.noaa.gov/data/index.html

Google Earth Engine

https://developers.google.com/earth-engine/datasets/

Transportation

NHTSA Crash Data

https://cdan.dot.gov/

Bureau of Transportation Statistics

https://www.bts.gov/A-Z-Index

Entertainment

IMDb Data

https://developer.imdb.com/non-commercial-datasets/

Media APIs

https://developer.spotify.com/documentation/web-api

https://developers.google.com/youtube/v3/docs

https://developer.apple.com/documentation/applemusicapi

Network data

Stanford Network Analysis

https://snap.stanford.edu/data/index.html