-
azure/azure-cosmosdb-spark
Apache Spark Connector for Azure Cosmos DB
Scala versions: 2.11 2.10 -
leobenkel/zparkio
Boilerplate framework to use Spark and ZIO together.
Scala versions: 2.11 -
projectglow/glow
An open-source toolkit for large-scale genomic analysis
Scala versions: 2.12 2.11 -
sparkling-graph/sparkling-graph
SparklingGraph provides an easy-to-use set of features for processing large-scale graphs using Spark and GraphX.
Scala versions: 2.11 2.10 -
qbeast-io/qbeast-spark
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
Scala versions: 2.12 -
zouzias/spark-lucenerdd
Spark RDD with Lucene's query and entity linkage capabilities
Scala versions: 2.11 2.10 -
clustering4ever/clustering4ever
C4E, a JVM-friendly library written in Scala for both local and distributed (Spark) clustering.
Scala versions: 2.11 -
aliyun/aliyun-emapreduce-datasources
Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.
Scala versions: 2.11 2.10 -
indix/schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
Scala versions: 2.11 -
streamnative/pulsar-spark
When Apache Pulsar meets Apache Spark
Scala versions: 2.12 2.11 -
helgeho/archivespark
An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed at Internet Archive.
Scala versions: 2.11 -
microsoft/mobius
C# and F# language binding and extensions to Apache Spark
Scala versions: 2.11 2.10 -
chermenin/spark-states
Custom state store providers for Apache Spark
Scala versions: 2.12 2.11 -
housepower/spark-clickhouse-connector
Spark ClickHouse Connector built on the DataSourceV2 API
Scala versions: 2.13 2.12