A tool for data sampling, data generation, and data diffing
Big Data Toolkit for the JVM
Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Read Spark SQL Parquet files as RDD[Protobuf]
Schema registry for CSV, TSV, JSON, Avro, and Parquet schemas. Supports schema inference and a GraphQL API.
Scala macros for generating Parquet schema projections and filter predicates
A collection of Apache Parquet add-on modules