Big Data RDF Processing and Analytics Stack built on Apache Spark and Apache Jena
Schema registry for CSV, TSV, JSON, Avro and Parquet schemas. Supports schema inference and a GraphQL API.
Run Spark calculations from Ammonite
A COBOL parser and Mainframe/EBCDIC data source for Apache Spark
sbt plugin for spark-submit
Custom state store providers for Apache Spark
A framework for writing Spark 2.x applications in a pretty way
A library for Spark DataFrame using MinIO Select API
When Apache Pulsar meets Apache Spark
Writing application logic for Spark jobs that can be unit-tested without a SparkContext
Building Annoy Index on Apache Spark
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.
Framework to quickly build and maintain Smart Data Lakes
Bulletproof Apache Spark jobs with fast root cause analysis of failures.
sbt plugin for Apache Spark on AWS EMR
Scala and Spark library focused on reading OpenStreetMap PBF files.
HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Natural Korean Processor for Apache Spark
Google Spreadsheets data source for Spark SQL and DataFrames