-
fsanaulla/chronicler
Scala toolchain for InfluxDB
Scala versions: 2.13 2.12 2.11
-
sansa-stack/archived-sansa-inference
A general Inference API based on two of the most popular Big Data processing engines: Apache Spark and Apache Flink
Scala versions: 2.11
-
sansa-stack/archived-sansa-owl
SANSA Stack OWL (Web Ontology Language) API
Scala versions: 2.11
-
agile-lab-dev/wasp
WASP is a framework for building complex real-time big data applications. It relies on a Kappa/Lambda-style architecture that mainly leverages Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.
Scala versions: 2.12 2.11
-
isarn/isarn-sketches-spark
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Scala versions: 2.12 2.11 2.10
-
whylabs/whylogs-java
Profile and monitor your ML data pipeline end-to-end
Scala versions: 2.12 2.11
-
locationtech/rasterframes
Geospatial Raster support for Spark DataFrames
Scala versions: 2.12 2.11
-
s22s/pre-lt-raster-frames
Spark DataFrames for earth observation data
Scala versions: 2.11
-
romans-weapon/spear-framework
Rapid development of ETL/ELT connectors and pipelines on top of Apache Spark
Scala versions: 2.12 2.11
-
qubole/streaminglens
Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines
Scala versions: 2.11
-
florentf9/sparkml-som
✨ Spark ML implementation of the SOM algorithm (Kohonen self-organizing map)
Scala versions: 2.11
-
piotr-kalanski/data-quality-monitoring
Data Quality Monitoring Tool
Scala versions: 2.11