- 
    
      combust/mleap 0.23.3
      MLeap: Deploy ML Pipelines to Production
      Scala versions: 2.12
- 
    
      apache/sedona 1.8.0
      A cluster computing framework for processing large-scale geospatial data
      Scala versions: 2.13, 2.12
- 
    
      lucacanali/sparkmeasure 0.27
      This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination of Spark metrics, making it a practical choice for both developers and data engineers.
      Scala versions: 2.13, 2.12
- 
    
      scalapy/scalapy 0.5.3
      Use the world of Python from the comfort of Scala!
      Scala versions: 3.x, 2.13, 2.12
      Scala Native versions: 0.4
- 
    
      aws/sagemaker-spark spark_2.4.0-1.4.2.dev0
      A Spark library for Amazon SageMaker.
      Scala versions: 2.11
- 
    
      catboost/catboost 1.2.8
      A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
      Scala versions: 2.13, 2.12
- 
    
      locationtech-labs/geopyspark 0.3.0
      GeoTrellis for PySpark
      Scala versions: 2.11
- 
    
      isarn/isarn-sketches-spark 0.6.0-sp3.2
      Routines and data structures for using isarn-sketches idiomatically in Apache Spark
      Scala versions: 2.12
- 
    
      salmon-brain/dead-salmon-brain 0.0.8
      Apache Spark based framework for analysis A/B experiments
      Scala versions: 2.12
- 
    
      ozancicek/artan 0.5.1
      Online latent state estimation with Spark
      Scala versions: 2.12
- 
    
      timvw/adobe-analytics-datafeed-datasource 0.1.0
      Apache Spark data source for Adobe Analytics Data Feed
      Scala versions: 2.12