Neo4j Connector for Apache Spark, which provides bi-directional read/write access to Neo4j from Spark using the Spark DataSource APIs.
created at March 3, 2016, 4:01 p.m.
Spark Gotchas. A subjective compilation of Apache Spark tips and tricks.
created at June 2, 2016, 10:21 p.m.
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
created at July 25, 2016, 9:47 a.m.
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max). A PyTorch LLM library that seamlessly integrates with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, etc.
created at Aug. 29, 2016, 7:59 a.m.
Apache Spark datasource for OrientDB
created at Oct. 31, 2016, 2:51 p.m.
Apache (Py)Spark type annotations (stub files).
created at Jan. 31, 2017, 1:13 a.m.
Essential Spark extensions and helper methods ✨😲
created at Feb. 16, 2017, 3:41 p.m.
Apache Spark testing helpers (dependency-free; works with Scalatest, uTest, and MUnit)
created at April 6, 2017, 9:40 p.m.
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
created at June 25, 2017, 7 a.m.
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
created at July 6, 2017, 10:13 a.m.
State of the Art Natural Language Processing
created at Sept. 24, 2017, 7:36 p.m.
C4E, a JVM-friendly library written in Scala for both local and distributed (Spark) clustering.
created at March 26, 2018, 7:58 p.m.