Spark DataFrames for earth observation data
SANSA Machine Learning Layer
Spark-based implementation of the Topological Mapper algorithm
BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
Secondary sort and streaming reduce for Apache Spark
Plug-and-play implementation of an Apache Spark custom data source for AWS DynamoDB.
Creating reusable workflows for Apache Spark
Optics for Spark DataFrames
A library you can include in your Spark job to validate the counters and perform operations on success. The goal is Scala/Java/Python support.
Building Annoy Index on Apache Spark
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
InfluxDB connector to Apache Spark on top of Chronicler
Basic framework utilities to quickly start writing production-ready Apache Spark applications
The Almaren Framework provides a simplified, consistent, minimalistic layer over Apache Spark while still allowing you to take advantage of native Apache Spark features. You can also combine it with standard Spark code.
A Variant Caller, Distributed. Apache 2 licensed.
General utility code used across BDG products. Apache 2 licensed.
Apache Spark test helper functions with pretty error messages
Natural Korean Processor for Apache Spark
Read Spark SQL Parquet files as RDD[Protobuf]
Code Less, Build More. Clean, automated Feature Generation and Selection for Apache Spark!