-
iaja/scalaldavis
Scala-Spark port of https://github.com/bmabey/pyLDAvis for Apache Spark LDA Topic Modelling Visualisation
Scala versions: 2.11 -
ozancicek/artan
Online latent state estimation with Spark
Scala versions: 2.12 2.11 -
absaoss/spark-data-standardization
A library for Spark that helps to standardize any input data (DataFrame) to adhere to the provided schema.
Scala versions: 2.12 2.11 -
piotr-kalanski/spark-local
API for switching between the Spark execution engine and a fast local implementation based on Scala collections.
Scala versions: 2.11 -
eto-ai/rikai
Parquet-based ML data format optimized for working with unstructured data
Scala versions: 2.13 2.12 -
zuinnote/hadoopoffice
HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Scala versions: 2.12 2.11 -
raistlintao/sparkmodelhelper
Scala library for extracting useful information from a trained Spark model (DecisionTreeClassificationModel)
Scala versions: 2.12 -
eclipse/deeplearning4j
Suite of tools for deploying and training deep learning models on the JVM. Highlights include model import for Keras, TensorFlow, and ONNX/PyTorch, a small, modular C++ library for running math code, and a Java-based math library on top of the core C++ library. Also includes SameDiff: a PyTorch/TensorFlow-like library for running deep learning using automatic differentiation.
Scala versions: 2.12 2.11 2.10 -
chitralverma/sparkml-extensions
Scala versions: 2.11 -
h2oai/h2o-3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Scala versions: 2.11 2.10 -
cdapio/cdap
An open source framework for building data analytic applications.
Scala versions: 2.12 2.11 2.10