Spark library for easy MongoDB access
General Vectorization Lib for Machine Learning Tools
Distributed exome CNV analyzer. Apache 2 licensed.
An extension to the amazing Spark framework for better functional programming.
Building Annoy Index on Apache Spark
Machine learning for genomic variants
Scala Library/REPL for Machine Learning Research
Implementation of Random Ferns for Apache Spark
Spark-Transformers: Library for exporting Apache Spark MLlib models so they can be used in any Java application with no other dependencies.
ETL Library for Machine Learning - data pipelines, data munging and wrangling
Spark Marketo Connector
A Play module for running Livy jobs, which run code on a remote Spark session.
Google Spreadsheets datasource for SparkSQL and DataFrames
Deriving Spark DataFrame schemas from case classes
Apache Spark Data Source for ROOT File Format
Provides the MongoDBExtract and MongoDBLoad stages
A library you can include in your Spark job to validate the counters and perform operations on success. The goal is Scala/Java/Python support.