An orchestration platform for the development, production, and observation of data assets.
created at April 30, 2018, 4:30 p.m.
pandas on AWS: easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatch Logs, DynamoDB, EMR, Secrets Manager, PostgreSQL, MySQL, SQL Server, and S3 (Parquet, CSV, JSON, and Excel).
created at Feb. 26, 2019, 1:39 a.m.
Utilities for streaming large files (S3, HDFS, gzip, bz2...).
created at Jan. 2, 2015, 1:05 p.m.
What's in your data? Extract schema, statistics, and entities from datasets.
created at Nov. 9, 2020, 3:20 p.m.
Pandas, Polars, Spark, and Snowpark DataFrame comparison for humans and more!
created at March 23, 2018, 1:16 p.m.