open-data by freeCodeCamp


created at Nov. 25, 2015, 10:15 p.m.

HTML

16 +0

152 +0

41 +0

GitHub
collection by artsmia

Mia collection metadata

created at Jan. 31, 2014, 7:47 p.m.

Unknown languages

13 +0

72 +0

10 +0

GitHub
CubePlusPlus by Visillect

Cube++ is a novel dataset collected for the illumination estimation problem. It has 4890 raw 18-megapixel images, each containing a SpyderCube color target in its scene, along with manually labelled categories and ground-truth illumination chromaticities.

created at July 21, 2020, 1 p.m.

Python

13 +0

49 +0

5 +0

GitHub
3w_dataset by ricardovvargas

The first realistic and public dataset with rare undesirable real events in oil wells.

created at Jan. 19, 2019, 12:30 a.m.

Jupyter Notebook

13 +0

104 +0

56 +0

GitHub
medal by McGill-NLP

Large medical text dataset curated for abbreviation disambiguation, designed for natural language understanding pre-training in the medical domain

created at April 22, 2020, 3:13 a.m.

Python

10 +0

208 +4

36 +0

GitHub
ecuacovid by andrab

Raw data extracted, cleaned, and normalized from the national situation reports on the SARS-CoV2 (COVID-19) Health Emergency published by SNGRE, MSP, Registro Civil, and INEC.

created at March 29, 2020, 7:57 a.m.

Ruby

10 +0

78 +0

58 +0

GitHub
transfermarkt-datasets by dcaribou

⚽️ Extract, prepare and publish Transfermarkt datasets.

created at Dec. 26, 2020, 5:33 p.m.

Python

9 +0

174 +3

45 +0

GitHub
TCPD by alan-turing-institute

The Turing Change Point Dataset - A collection of time series for the evaluation and development of change point detection algorithms

created at Nov. 28, 2019, 4:07 p.m.

Python

8 +0

130 +0

28 +1

GitHub
coin_registry by Blockmodo

A global registry of JSON formatted files on 1500+ cryptocurrency tokens. Provides information like chat rooms, communities, explorers, and contact information on each coin. Used by https://blockmodo.com, DEXs, developers, and exchanges.

created at June 20, 2018, 6:15 a.m.

Unknown languages

8 +0

113 +0

34 +0

GitHub
JsonOfCounties by evangambit

A repo containing various data (demographics, employment, etc.) in JSON form.

created at June 23, 2020, 3:20 a.m.

Python

7 +0

55 +0

10 +0

GitHub
reversegeocode by kno10

Simple but fast reverse geocoding up to city granularity level

created at Feb. 2, 2015, 5:48 p.m.

Java

7 +0

55 +0

7 +0

GitHub
SaudiNewsNet by ParallelMazen

This repo contains a set of Arabic newspaper articles along with metadata, extracted from various Saudi newspapers.

created at July 21, 2015, 8:40 p.m.

Unknown languages

7 +0

66 +0

16 +0

GitHub
caption-contest-data by nextml

Data from the caption contest.

created at Nov. 23, 2015, 11:46 p.m.

HTML

7 +0

5 +0

2 +0

GitHub
congresstweets by alexlitel

Datasets of the daily Twitter output of Congress.

created at May 3, 2017, 10:47 p.m.

SCSS

7 +0

98 +1

38 +0

GitHub
awesome-citygml by OloOcki

The ultimate list of open data semantic 3D city models

created at Jan. 7, 2021, 4:48 p.m.

Unknown languages

7 +0

183 +0

24 +0

GitHub
pem-dataset1 by ECSIM

Proton Exchange Membrane (PEM) Fuel Cell Dataset

created at Jan. 4, 2020, 8:57 a.m.

Jupyter Notebook

6 +0

76 +0

23 +0

GitHub
38-Cloud-A-Cloud-Segmentation-Dataset by SorourMo

This data set includes Landsat 8 images and their manually extracted pixel-level ground truths for cloud detection.

created at Feb. 6, 2019, 12:11 a.m.

MATLAB

6 +0

138 +0

37 +0

GitHub
shopper-intent-prediction-nature-2020 by coveooss

🏟

created at Oct. 29, 2020, 1:52 p.m.

Unknown languages

6 +0

24 +0

5 +0

GitHub
lemon-dataset by softwaremill

Lemons quality control dataset

created at July 28, 2020, 6:42 a.m.

Unknown languages

5 +0

98 +0

12 +0

GitHub
usa-soccer by gavinr

USA soccer teams - location and metadata

created at Nov. 27, 2018, 3:34 p.m.

JavaScript

5 +0

14 +0

12 +0

GitHub