medal by McGill-NLP

A large medical text dataset curated for abbreviation disambiguation, designed for natural language understanding pre-training in the medical domain.

updated at April 28, 2024, 4:27 p.m.

Python

10 +0

208 +4

36 +0

GitHub
pudl by catalyst-cooperative

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.

updated at April 27, 2024, 3:31 p.m.

Python

18 +0

443 +2

101 +4

GitHub
transfermarkt-datasets by dcaribou

⚽️ Extract, prepare and publish Transfermarkt datasets.

updated at April 27, 2024, 12:21 a.m.

Python

9 +0

174 +3

45 +0

GitHub
TCPD by alan-turing-institute

The Turing Change Point Dataset - A collection of time series for the evaluation and development of change point detection algorithms

updated at April 19, 2024, 11:45 a.m.

Python

8 +0

130 +0

28 +1

GitHub
collection by tategallery

Tate Collection metadata

updated at April 13, 2024, 6:16 p.m.

Python

59 +0

505 +0

187 +0

GitHub
skytrax-reviews-dataset by quankiquanki

An air travel dataset consisting of user reviews from Skytrax (www.airlinequality.com)

updated at April 2, 2024, 5:40 p.m.

Python

1 +0

70 +0

39 +0

GitHub
JsonOfCounties by evangambit

A repo containing various data (demographics, employment, etc.) in JSON form.

updated at March 29, 2024, 10 a.m.

Python

7 +0

55 +0

10 +0

GitHub
CubePlusPlus by Visillect

Cube++ is a novel dataset collected for the illumination estimation problem. It has 4890 raw 18-megapixel images, each containing a SpyderCube color target in its scene, along with manually labelled categories and ground truth illumination chromaticities.

updated at March 15, 2024, 6:44 a.m.

Python

13 +0

49 +0

5 +0

GitHub
mcafp by google

updated at Jan. 4, 2024, 4:09 p.m.

Python

2 +0

39 +0

17 +0

GitHub
gun-violence-data by jamesqo

A comprehensive, accessible database that contains records of over 260k US gun violence incidents from January 2013 to March 2018.

updated at Dec. 14, 2023, 4:48 p.m.

Python

0 +0

3 +0

2 +0

GitHub