-
florentf9/sparkml-som
Spark ML implementation of the SOM algorithm (Kohonen self-organizing map)
Scala versions: 2.11 -
piotr-kalanski/data-quality-monitoring
Data Quality Monitoring Tool
Scala versions: 2.11 -
qubole/s3-sqs-connector
A library for reading data from Amazon S3 with optimised listing via Amazon SQS, using Spark SQL Streaming (or Structured Streaming).
Scala versions: 2.11 -
arcizon/spark-filetransfer
API for reading and writing data via various file transfer protocols from Apache Spark.
Scala versions: 2.12 2.11 -
getsentry/sentry-spark
Apache Spark Sentry Integration
Scala versions: 2.11 -
absaoss/pramen
Resilient data pipeline framework running on Apache Spark
Scala versions: 2.13 2.12 2.11 -
qubole/spark-state-store
RocksDB state storage implementation for Structured Streaming.
Scala versions: 2.11 -
zuinnote/spark-hadoopcryptoledger-ds
A Spark datasource for the HadoopCryptoLedger library
Scala versions: 2.12 2.11 2.10 -
exasol/spark-connector
A connector for Apache Spark to access Exasol
Scala versions: 2.13 2.12 2.11 -
jtnystrom/discount
Very large scale k-mer counting and analysis on Apache Spark.
Scala versions: 2.13 2.12 -
tupol/spark-tools
Executable Apache Spark Tools: Format Converter & SQL Processor
Scala versions: 2.12 2.11 -
phymbert/spark-search
Spark Search - high performance advanced search features based on Apache Lucene
Scala versions: 2.12 2.11 -
data-tools/big-data-types
A library to transform Scala product types and schemas from different databases into other schemas. Any implemented type automatically gets methods to convert it into the rest of the types and vice versa. For example, a Spark Schema can be transformed into a BigQuery table.
Scala versions: 3.x 2.13 2.12