-
s22s/pre-lt-raster-frames
Spark DataFrames for earth observation data
Scala versions: 2.11 -
romans-weapon/spear-framework
Rapid development of ETL/ELT connectors and pipelines on top of Apache Spark
Scala versions: 2.12 2.11 -
arcizon/spark-filetransfer
API for reading and writing data via various file transfer protocols from Apache Spark.
Scala versions: 2.12 2.11 -
qubole/streaminglens
Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines
Scala versions: 2.11 -
florentf9/sparkml-som
Spark ML implementation of the SOM algorithm (Kohonen self-organizing map)
Scala versions: 2.11 -
piotr-kalanski/data-quality-monitoring
Data Quality Monitoring Tool
Scala versions: 2.11 -
getsentry/sentry-spark
Apache Spark Sentry Integration
Scala versions: 2.11 -
qubole/s3-sqs-connector
A library for reading data from Amazon S3 with optimised listing using Amazon SQS, via Spark SQL Streaming (or Structured Streaming).
Scala versions: 2.11 -
jtnystrom/discount
Very large scale k-mer counting and analysis on Apache Spark.
Scala versions: 2.13 2.12 -
zuinnote/spark-hadoopcryptoledger-ds
A Spark datasource for the HadoopCryptoLedger library
Scala versions: 2.12 2.11 2.10 -
qubole/spark-state-store
Rocksdb state storage implementation for Structured Streaming.
Scala versions: 2.11 -
derrickburns/generalized-kmeans-clustering
Spark library for generalized K-Means clustering. Supports general Bregman divergences. Suitable for clustering probabilistic data, time series data, high dimensional data, and very large data.
Scala versions: 2.10