emergentorder / onnx-scala

An ONNX (Open Neural Network eXchange) API, Code Generator and Backend(s) for Typeful, Functional Deep Learning in Scala

Version Matrix



Getting Started

Add this to the build.sbt in your project:

libraryDependencies += "com.github.EmergentOrder" %% "onnx-scala-backends" % "0.8.0"

As of v0.1.0, artifacts are published to Sonatype OSS / Maven Central. For the latest, build and publish locally from master.

Full ONNX model inference - quick start

First, download the model file for SqueezeNet. You can use get_models.sh.

Using the console, from this project root:

sbt
project backendsJVM
console 

or from your project:

sbt console

Note that all code snippets are written in Scala 3 (Dotty).

Run SqueezeNet image classification inference on an "image" composed entirely of pixel value 42:

import java.nio.file.{Files, Paths}
import org.emergentorder.onnx.Tensors._
import org.emergentorder.onnx.backends.ORTOperatorBackendAll
import org.emergentorder.onnx.backends.ORTModelBackend
import org.emergentorder.compiletime._
import io.kjaer.compiletime._

val squeezenetBytes = Files.readAllBytes(Paths.get("squeezenet1.1.onnx"))
val squeezenet = new ORTModelBackend(squeezenetBytes)

val data = Array.fill(1*3*224*224){42f}
val shape = 1 #: 3 #: 224 #: 224 #: SNil

val tensorDenotation: String & Singleton = "Image"
//In NCHW tensor image format
val tensorShapeDenotation = "Batch" ##: "Channel" ##: "Height" ##: "Width" ##: TSNil

val imageTens = Tensor(data,tensorDenotation,tensorShapeDenotation,shape)

//or as a shorthand if you aren't concerned with enforcing denotations
val imageTensDefaultDenotations = Tensor(data,shape)

Note that ONNX Tensor content is in row-major order.
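To make the row-major layout concrete, here is a minimal sketch of how a 4-D NCHW index maps to an offset in the flat data array. The helper name `flatIndex` and the value `dims` are illustrative only and are not part of ONNX-Scala.

```scala
// Flat offset of element (n, c, h, w) in a row-major NCHW buffer.
// `dims` holds the axis sizes (N, C, H, W); the last axis varies fastest.
// `flatIndex` is a hypothetical helper, not an ONNX-Scala function.
def flatIndex(dims: (Int, Int, Int, Int))(n: Int, c: Int, h: Int, w: Int): Int =
  val (_, cs, hs, ws) = dims
  ((n * cs + c) * hs + h) * ws + w

val dims = (1, 3, 224, 224)
val nextRow     = flatIndex(dims)(0, 0, 1, 0)     // 224
val nextChannel = flatIndex(dims)(0, 1, 0, 0)     // 224 * 224 = 50176
val lastElement = flatIndex(dims)(0, 2, 223, 223) // 1*3*224*224 - 1 = 150527
```

Moving one step along the width axis advances the offset by 1, one step along height by 224, and one step along the channel axis by 224 * 224.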

val out = squeezenet.fullModel[Float, 
                               "ImageNetClassification",
                               "Batch" ##: "Class" ##: TSNil,
                               1 #: 1000 #: SNil](Tuple(imageTens))
// val out:
//  Tensor[Float,("ImageNetClassification", 
//                "Batch" ##: "Class" ##: TSNil,
//                1 #: 1000 #: SNil)] = (Array(0.8230729,
// ...

//The output shape
out.shape
// val res0: Array[Int] = Array(1, 1000)


//The highest probability (predicted) class
out.data.indices.maxBy(out.data)
// val res1: Int = 418

Referring to the ImageNet 1000 class labels, we see that the predicted class is "ballpoint pen".
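If you want more than the single best class, a top-k lookup over the output scores can be sketched as follows. This operates on a plain Array[Float] such as out.data; the helper `topK` and the sample scores are made up for illustration.

```scala
// Indices of the k largest scores, best first.
// `topK` is a hypothetical helper, not part of ONNX-Scala.
def topK(scores: Array[Float], k: Int): Seq[Int] =
  scores.indices.sortBy(i => -scores(i)).take(k)

// Made-up scores standing in for a model output
val demoScores = Array(0.1f, 0.7f, 0.05f, 0.15f)
val best2 = topK(demoScores, 2) // indices 1 and 3
```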

Based on a simple benchmark of 100000 iterations of SqueezeNet inference (run on my laptop), it is roughly on par with (within 10% of) ONNX Runtime (via Python).

The resulting output values also match ONNX Runtime/Python.

When using this API, we load the provided ONNX model file and pass it as-is to the underlying ONNX backend. This is the most performant execution mode, and is recommended for off-the-shelf models / performance-critical scenarios.

This full-model API is untyped in its inputs, so it can fail at runtime. This is inevitable because we load models from disk at runtime. Feel free to wrap your calls to it in a facade with typed inputs.

Operator-level (Fine-grained) API - quick start

You can call individual operators:

val onnxBackend = new ORTOperatorBackendAll()

val longTens = Tensor(Array.fill(1*3*224*224){-42L},tensorDenotation,tensorShapeDenotation,shape)
// longTens:
//  org.emergentorder.onnx.Tensors.Tensor[Long, 
//                                         ("Image", 
//                                          "Batch" ##: "Channel" ##: "Height" ##: "Width" ##:
//    org.emergentorder.compiletime.TSNil
//  , 1 #: 3 #: 224 #: 224 #: io.kjaer.compiletime.SNil)] = (
//   Array(
//     -42L,
//     -42L,
// ...

onnxBackend.AbsV6("abs", longTens)
// res2:
//  org.emergentorder.onnx.Tensors.Tensor[Long, 
//                                          ("Image", 
//                                           "Batch" ##: "Channel" ##: "Height" ##: "Width" ##:
//    org.emergentorder.compiletime.TSNil
//  , 1 #: 3 #: 224 #: 224 #: io.kjaer.compiletime.SNil)] = ( 
//   Array(
//     42L,
//     42L,
// ...

Sqrt will fail to compile because it's not defined for Long:

onnxBackend.SqrtV6("sqrt", longTens)
// ...
//Required: org.emergentorder.onnx.Tensors.Tensor[T, (
//...
//where:    T            is a type variable with constraint <: org.emergentorder.onnx.Float16 | Float | Double

Note that in real use backends should be closed to prevent native memory leaks.
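One way to make sure a backend is closed is scala.util.Using from the standard library. The sketch below uses a stand-in class, `FakeBackend`, so that it runs without native libraries; it assumes the real backend classes implement java.lang.AutoCloseable, which you should verify against the version you use.

```scala
import scala.util.Using

// Hypothetical stand-in for a backend that owns native memory.
// Assumption: the real backends can be used the same way via AutoCloseable.
final class FakeBackend extends AutoCloseable:
  var closed = false
  def infer(x: Float): Float = x * 2
  override def close(): Unit =
    closed = true

val fake = new FakeBackend
// Using.resource closes the resource when the block exits, even if it throws
val doubled = Using.resource(fake)(_.infer(21f))
```

After the block, `fake.closed` is true, so the resource cannot leak past its intended scope.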

Project Details

Automatic differentiation to enable training is under consideration (ONNX currently provides facilities for training as a tech preview only).

The ONNX-Scala core (fine-grained) API is cross-built against Scala JVM (for Scala 2.13 and Dotty/3.0) and Scala.js / JavaScript (for Scala 2.13 and Dotty/3.0).

Currently at ONNX 1.7.0, ONNX Runtime 1.5.2.

A) Fine-grained API

A complete*, versioned, numerically generic, type-safe / typeful API to ONNX (Open Neural Network eXchange, an open format to represent deep learning and classical machine learning models), derived from the Protobuf definitions and the operator schemas (defined in C++) via the JavaCPP Preset for ONNX. We also generate implementations for each operator in terms of core methods to be implemented by the backend.

* Up to roughly the intersection of supported ops in ONNX Runtime and ONNX.js

This API is expressed via traits, with version-named methods. For example, Abs, the absolute value operator (defined here for operator set 6):

import scala.{specialized => sp}
import spire.math._
import spire.implicits._
import org.emergentorder.onnx._

trait AbsV6 extends Operator {
  def AbsV6[
      @sp T <: UByte | UShort | UInt |
               ULong | Byte | Short | Int |
               Long | Float16 | Float | Double: Numeric,
      Tt <: TensorTypeDenotation,
      Td <: TensorShapeDenotation,
      S <: Shape]
      (name: String, X: Tensor[T, Tuple3[Tt, Td, S]])
      (using tt: ValueOf[Tt],
             td: TensorShapeDenotationOf[Td],
             s: ShapeOf[S]): Tensor[T, Tuple3[Tt, Td, S]] = {
    val map: Map[String, Any] = Map()
    val allInputs             = Tuple1(X)
    callOp(name, "Abs", allInputs, map)
  }
}

Using this API, each ONNX operation is executed on the underlying backend individually. As a result, you can write your own models from scratch in Scala using ONNX-Scala operations, injecting parameters from outside sources as need be. This allows for dynamic graph structure, in which the execution itself defines the graph, similar to PyTorch and TensorFlow Eager. The trade-off made for this flexibility is that the underlying ONNX backend can no longer optimize the full graph, and the JNI boundary-crossing and ONNX graph construction at each operation result in additional overhead.

Type-safe Tensors

Tensors carry type-level tensor and axis labels/denotations which, together with literal types for dimension sizes, allow tensors to be typed by denotation, axes, shape and data type. Type constraints, as per the ONNX spec, are implemented at the operation level on inputs and outputs, using union types, match types and compiletime singleton ops (thanks to @MaximeKjaer for getting the latter into dotty). The design uses the ONNX docs for dimension and type denotation, as well as the operators doc, as a reference, and is inspired by Nexus, Neurocat and Named Tensors.
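The Scala 3 features mentioned above can be illustrated in isolation. The following is a minimal sketch, not ONNX-Scala code: `Vec`, `concat` and `describe` are hypothetical names, and the only library used is the Scala 3 standard library.

```scala
import scala.compiletime.ops.int.*

// Hypothetical stand-in for a typed tensor:
// element type T, with the length N tracked as a literal type.
final case class Vec[T, N <: Int](data: Vector[T])

// The result length A + B is computed at the type level.
def concat[T, A <: Int, B <: Int](x: Vec[T, A], y: Vec[T, B]): Vec[T, A + B] =
  Vec(x.data ++ y.data)

// A union type as an upper bound restricts the accepted element types,
// in the same spirit as the ONNX type constraints.
def describe[T <: Float | Double](x: T): String = x match
  case _: Float  => "Float"
  case _: Double => "Double"

val a = Vec[Float, 2](Vector(1f, 2f))
val b = Vec[Float, 3](Vector(3f, 4f, 5f))
val joined = concat(a, b)
val checked: Vec[Float, 5] = joined // compiles: 2 + 3 reduces to 5
// val wrong: Vec[Float, 4] = joined // does not compile: lengths differ
// describe(1L)                      // does not compile: Long is not Float | Double
```

The two commented-out lines are the point of the exercise: shape and element-type mistakes are rejected by the compiler rather than surfacing at runtime.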

B) Backend

Currently there is one supported backend, based on ONNX Runtime via its official Java API. An alternate backend to enable Scala.js support, based on ONNX.js, is coming soon (blocked on new Scala.js bundler / ScalaPB releases for dotty support).

Supported ONNX input and output tensor data types:

  • Byte
  • Short
  • Int
  • Long
  • Float
  • Double
  • Boolean

Supported ONNX ops:

  • ONNX Runtime: 145/154 total.
  • ONNX JS: 72/154 total.
  • ONNX-Scala: 82/154 total.

See the ONNX backend scoreboard.

Example execution

TODO: T5 example

Build / Publish

You'll need sbt.

To build and publish locally:

sbt publishLocal

or

sbt +publishLocal

to build against Scala 2.13 and Dotty/3.0, where possible.

Built With

Core

  • ONNX via ScalaPB - Open Neural Network Exchange / The missing bridge between Java and native C++ libraries (For access to Protobuf definitions, used in the fine-grained API to create ONNX models in memory to send to the backend)

  • Spire - Powerful new number types and numeric abstractions for Scala. (For support for unsigned ints, complex numbers and the Numeric type class in the core API)

  • Dotty - The Scala 3 compiler, also known as Dotty. (For union types (used here to express ONNX type constraints), match types, compiletime singleton ops, ...)

Backends

Inspiration

Scala

  • Neurocat - From neural networks to the Category of composable supervised learning algorithms in Scala with compile-time matrix checking based on singleton-types

  • Nexus - Experimental typesafe tensors & deep learning in Scala

  • Lantern - Machine learning framework prototype in Scala. The design of Lantern is built on two important and well-studied programming language concepts, delimited continuations (for automatic differentiation) and multi-stage programming (staging for short).

  • DeepLearning.scala - A simple library for creating complex neural networks

  • tf-dotty - Shape-safe TensorFlow in Dotty