Spark Structured Streaming State Tools
Offline Recommender System Evaluation for Spark
Functional, composable library in Scala based on ZIO for writing ETL jobs in AWS and GCP https://tharwaninitin.github.io/etlflow/site/
ECS connector for Apache Spark
A library for Spark DataFrame using MinIO Select API
A library to query heterogeneous data sources uniformly using SPARQL
Fuzzy matching function in Spark (https://spark-packages.org/package/itspawanbhardwaj/spark-fuzzy-matching)
Arc-Jupyter is an interactive Jupyter Notebook extension for building Arc data pipelines.
YugabyteDB Spark Connector for YCQL, based on the DataStax Connector
Spark data source for Cognite Data Fusion
Read well-known ML datasets in Apache Spark
A Spark package to approximate the diameter of large graphs
Map Algebra Modeling Language: It's what we and whales are.
A library for reading social data from Instagram using Spark Streaming.
Apache Spark Data Source for ROOT File Format
Scala-Spark port of https://github.com/bmabey/pyLDAvis for Apache Spark LDA Topic Modelling Visualisation
Power a Spark Stream from anywhere in your Akka Stream Flow
A Spark datasource for the HadoopCryptoLedger library
Another A/B test library