Connect Spark to HBase for reading and writing data with ease
An open-source toolkit for large-scale genomic analysis
A library for querying Binlog with Apache Spark Structured Streaming, for Spark SQL, DataFrames and [MLSQL](https://www.mlsql.tech).
[RETIRED] Jupyter Declarative Widget Extension
Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.
Big Spatial Data Processing using Spark
BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
Big Data Toolkit for the JVM
Avro SerDe for Apache Spark structured APIs.
Mirror of Apache Livy (Incubating)
Spark metrics related custom classes and sinks (e.g. Prometheus)
Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
An Apache Spark framework for easy data processing, extraction, and derivation for web archives and archival collections, developed at the Internet Archive.
Approximate Nearest Neighbors in Spark
Read and write TensorFlow TFRecord data from Apache Spark.
General Vectorization Lib for Machine Learning Tools
Google BigQuery support for Spark, SQL, and DataFrames
Spark RDD with Lucene's query and entity linkage capabilities
Snowflake Data Source for Apache Spark.
Boilerplate framework to use Spark and ZIO together.