Spark Library for Bulk Loading into Cassandra
JSON schema parser for Apache Spark
An Apache Spark framework for easy data processing, extraction, and derivation for web archives and archival collections, developed at the Internet Archive.
CSV data source for Spark SQL and DataFrames
Schema registry for CSV, TSV, JSON, Avro, and Parquet schemas. Supports schema inference and a GraphQL API.
A library for querying Google AdWords data with Apache Spark, for Spark SQL and DataFrames
A refreshing treatment for all quality control ailments. Apache 2 licensed.
A library that converts between nested Datasets and flattened DataFrames
Hadoop Crypto Ledger - Analyzing crypto ledgers, such as the Bitcoin blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
NetFlow data source for Spark SQL and DataFrames
Big Spatial Data Processing using Spark
Google BigQuery support for Spark, SQL, and DataFrames
A library for Spark DataFrames using the MinIO Select API
Apache Spark test helper functions with pretty error messages
spark-cassandra-sink is a Spark Structured Streaming sink for Cassandra
Spark NetSuite Connector