-
microsoft/mobius
C# and F# language binding and extensions to Apache Spark
Scala (JVM): 2.10 2.11 -
lucacanali/sparkmeasure
This is the development repository of SparkMeasure, a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task metrics data.
Scala (JVM): 2.11 2.12 -
hydrospheredata/mist
Serverless proxy for Spark cluster
Scala (JVM): 2.10 2.11 2.12 -
azure/azure-cosmosdb-spark
Apache Spark Connector for Azure Cosmos DB
Scala (JVM): 2.10 2.11 -
azure/azure-event-hubs-spark
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Scala (JVM): 2.11 2.12 -
whylabs/whylogs-java
Profile and monitor your ML data pipeline end-to-end
Scala (JVM): 2.11 2.12 -
chermenin/spark-states
Custom state store providers for Apache Spark
Scala (JVM): 2.11 2.12 -
uosdmlab/spark-nkp
Natural Korean Processor for Apache Spark
Scala (JVM): 2.11 -
bizreach/aws-kinesis-scala
Scala client for Amazon Kinesis. Also provides the ability to write to Kinesis from Apache Spark or Spark Streaming.
Scala (JVM): 2.11 2.12 2.13 -
heartsavior/spark-state-tools
Spark Structured Streaming State Tools
Scala (JVM): 2.11 2.12 -
flipkart-incubator/spark-transformers
Spark-Transformers: Library for exporting Apache Spark MLlib models so they can be used in any Java application with no other dependencies.
Scala (JVM): 2.10 2.11 -
isarn/isarn-sketches-spark
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Scala (JVM): 2.10 2.11 2.12 -
itspawanbhardwaj/spark-fuzzy-matching
Fuzzy matching function in Spark (https://spark-packages.org/package/itspawanbhardwaj/spark-fuzzy-matching)
Scala (JVM): 2.10 2.11 -
tupol/spark-utils
Basic framework utilities to quickly start writing production ready Apache Spark applications
Scala (JVM): 2.11 2.12 -
liquidsvm/liquidsvm
Support vector machines (SVMs) and related kernel-based learning algorithms are a well-known class of machine learning algorithms, for non-parametric classification and regression. liquidSVM is an implementation of SVMs whose key features are: fully integrated hyper-parameter selection, extreme speed on both small and large data sets, full flexibility for experts, and inclusion of a variety of different learning scenarios: multi-class classification, ROC, and Neyman-Pearson learning, and least-squares, quantile, and expectile regression.
Scala (JVM): 2.11 -
tupol/spark-tools
Executable Apache Spark Tools: Format Converter & SQL Processor
Scala (JVM): 2.11 2.12 -
astrolabsoftware/spark3d
Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteorology, …
Scala (JVM): 2.11 -
emcecs/spark-ecs-connector
ECS connector for Apache Spark
Scala (JVM): 2.11