-
microsoft/synapseml 1.0.8
Simple and Distributed Machine Learning
Scala versions: 2.12
-
johnsnowlabs/spark-nlp 5.5.1
State of the Art Natural Language Processing
Scala versions: 2.12
-
salesforce/transmogrifai 0.7.0
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
Scala versions: 2.11
-
almond-sh/almond 0.13.14
A Scala kernel for Jupyter
Scala versions: 3.x 2.13 2.12
-
combust/mleap 0.23.2
MLeap: Deploy ML Pipelines to Production
Scala versions: 2.12
-
tibcosoftware/snappydata 0.5
Project SnappyData: a memory-optimized analytics database based on Apache Spark™ and Apache Geode™. Stream, transact, analyze, and predict in one cluster.
Scala versions: 2.10
-
lucacanali/sparkmeasure 0.24
sparkMeasure is a tool and library for analyzing and troubleshooting Apache Spark jobs. It simplifies collecting and examining Spark metrics, for both developers and data engineers.
Scala versions: 2.13 2.12
-
h2oai/sparkling-water 2.4.13
Sparkling Water provides H2O functionality inside a Spark cluster
Scala versions: 2.11
-
delta-io/delta-sharing 1.2.2
An open protocol for secure data sharing
Scala versions: 2.13 2.12