combust/mleap 0.24.0
MLeap: Deploy ML Pipelines to Production
Scala versions: 2.13
apache/sedona 1.8.1
A cluster computing framework for processing large-scale geospatial data
Scala versions: 2.13 2.12
lucacanali/sparkmeasure 0.27
This repository contains the development code for sparkMeasure, an Apache Spark performance analysis and troubleshooting library. It simplifies collecting, aggregating, and exporting Spark task/stage metrics, and is designed for practical use by developers and data engineers in interactive analysis, testing, and production monitoring workflows.
Scala versions: 2.13 2.12
scalapy/scalapy 0.5.3
Use the world of Python from the comfort of Scala!
Scala versions: 3.x 2.13 2.12
Scala Native versions: 0.4
aws/sagemaker-spark spark_2.4.0-1.4.2.dev0
A Spark library for Amazon SageMaker.
Scala versions: 2.11
catboost/catboost 1.2.10
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Scala versions: 2.13 2.12
locationtech-labs/geopyspark 0.3.0
GeoTrellis for PySpark
Scala versions: 2.11
isarn/isarn-sketches-spark 0.6.0-sp3.2
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Scala versions: 2.12
salmon-brain/dead-salmon-brain 0.0.8
Apache Spark-based framework for analyzing A/B experiments
Scala versions: 2.12
ozancicek/artan 0.5.1
Online latent state estimation with Spark
Scala versions: 2.12
timvw/adobe-analytics-datafeed-datasource 0.1.0
Apache Spark data source for Adobe Analytics Data Feed
Scala versions: 2.12