An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
DataStax Connector for Apache Spark to Apache Cassandra
Alpakka Kafka connector - Alpakka is a Reactive Enterprise Integration library for Java and Scala, based on Reactive Streams and Akka.
Alpakka is a Reactive Enterprise Integration library for Java and Scala, based on Reactive Streams and Akka.
A connector for Spark that allows reading from and writing to a Redis cluster
CSV Data Source for Apache Spark 1.x
Redshift data source for Apache Spark
XML data source for Spark SQL and DataFrames
Avro Data Source for Apache Spark
Apache Spark Connector for SQL Server and Azure SQL
A connector for MemSQL and Spark
This library allows Scala and Java-based projects (including Apache Flink, Apache Hive, Apache Beam, and PrestoDB) to read from and write to Delta Lake.
The Official Couchbase Spark Connector
Implementation of akka-persistence storage plugins for MongoDB
Schema registry for CSV, TSV, JSON, Avro, and Parquet schemas. Supports schema inference and a GraphQL API.
Kafka Connect Cassandra Connector. This project includes source/sink connectors for Cassandra to/from Kafka.
Spark connector for SFTP
BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
A set of connectors for Monix. 🔛
HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)