Uber trip data from a freedom of information request to NYC's Taxi & Limousine Commission
updated at May 6, 2024, 1:50 a.m.
Datasets of the daily Twitter output of Congress.
updated at May 5, 2024, 4:24 a.m.
Large medical text dataset curated for abbreviation disambiguation, designed for natural language understanding pre-training in the medical domain
updated at May 4, 2024, 4:36 a.m.
MOVED - The project is still under development but this page is deprecated.
updated at May 3, 2024, 9:37 p.m.
The Washington Post's analysis of NOAA climate change data for the contiguous United States
updated at April 29, 2024, 8:08 a.m.
All-Age-Faces (AAF) Database.
updated at April 26, 2024, 6:52 a.m.
This data set includes Landsat 8 images and their manually extracted pixel-level ground truths for cloud detection.
updated at April 12, 2024, 8:25 a.m.
This repo contains a set of Arabic newspaper articles along with metadata, extracted from various Saudi newspapers.
updated at April 10, 2024, 1:19 a.m.
An air travel dataset consisting of user reviews from Skytrax (www.airlinequality.com)
updated at April 2, 2024, 5:40 p.m.
The tracebase appliance-level power consumption data set
updated at April 2, 2024, 5:40 p.m.
Proton Exchange Membrane (PEM) Fuel Cell Dataset
updated at April 1, 2024, 8:02 a.m.
A repo containing various data (demographics, employment, etc.) in JSON format.
updated at March 29, 2024, 10 a.m.
Cube++ is a novel dataset collected for the illumination estimation problem. It has 4890 raw 18-megapixel images, each containing a SpyderCube color target in the scene, with manually labelled categories and ground truth illumination chromaticities.
updated at March 15, 2024, 6:44 a.m.
Simple but fast reverse geocoding up to city-level granularity
updated at Feb. 4, 2024, 7:35 a.m.