An orchestration platform for the development, production, and observation of data assets.
updated at May 11, 2024, 7:31 p.m.
What's in your data? Extract schema, statistics and entities from datasets
updated at May 10, 2024, 10:33 p.m.
Utils for streaming large files (S3, HDFS, gzip, bz2...)
updated at May 10, 2024, 4:57 p.m.
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatch Logs, DynamoDB, EMR, Secrets Manager, PostgreSQL, MySQL, SQL Server and S3 (Parquet, CSV, JSON and Excel).
updated at May 10, 2024, 3:44 p.m.
Pandas and Spark DataFrame comparison for humans and more!
updated at May 6, 2024, 11:14 p.m.