General utility code used across BDG products. Apache 2 licensed.
Serializers (input and output) for the phone call-related models
Spark package to "plug" holes in data using SQL-based rules ⚡️ 🔌
Generation of a few sample data sets, for instance a feature set derived from CDR
Mirror of Apache Bahir
InfluxDB connector to Apache Spark on top of Chronicler
Spark connector for RSS and HTML sources.
A monadic design pattern that can be used to construct data processing pipelines. It also provides several monads implemented using Apache Spark.
Provides KafkaExtract, KafkaLoad and KafkaCommitExecute stages
Collection of Spark SQL helpers: UDFs, UDAFs, …
Generates Scala case classes based on a Spark DataFrame schema
A Cluster Computing System for Processing Large-Scale Spatial Data
A Spark plugin for reading Excel files via Apache POI
Optics for Spark DataFrames
API for switching between the Spark execution engine and a fast local implementation based on Scala collections.
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.