"What I cannot create, I do not understand." - Richard Feynman.

# Numsca: Numpy for Scala

Numsca is Numpy for Scala.

Here's the famous neural network in 11 lines of Python, translated to Numsca:

```scala
import botkop.{numsca => ns}
val x = ns.array(0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1).reshape(4, 3)
val y = ns.array(0, 1, 1, 0).T
val w0 = 2 * ns.rand(3, 4) - 1
val w1 = 2 * ns.rand(4, 1) - 1
for (j <- 0 until 60000) {
  val l1 = 1 / (1 + ns.exp(-ns.dot(x, w0)))
  val l2 = 1 / (1 + ns.exp(-ns.dot(l1, w1)))
  val l2Delta = (y - l2) * (l2 * (1 - l2))
  val l1Delta = l2Delta.dot(w1.T) * (l1 * (1 - l1))
  w1 += l1.T.dot(l2Delta)
  w0 += x.T.dot(l1Delta)
}
```

Another example: a Scala translation of Andrej Karpathy's 'Minimal character-level language model with a Vanilla Recurrent Neural Network'. (Compare with Andrej Karpathy's original post.)

Also have a look at Scorch, a neural net framework in the spirit of PyTorch, which uses Numsca.

## Why?

I love Scala, and I am teaching myself deep learning, where nearly everything is written in Python. This library helps me quickly translate Python and Numpy code to my favorite language.

I hope you find it useful.

Pull requests welcome.

## Disclaimer

This is far from an exhaustive copy of Numpy's functionality. I'm adding functionality as I go. That being said, I think many of the most interesting aspects of Numpy, like slicing, broadcasting and indexing, have been successfully implemented.

## Under the hood

Numsca piggybacks on Nd4j. Thanks, people!

## Dependency

For Scala 2.13:

`libraryDependencies += "be.botkop" %% "numsca" % "0.1.7"`

For Scala 2.11 and 2.12:

`libraryDependencies += "be.botkop" %% "numsca" % "0.1.5"`

## Importing Numsca

```scala
import botkop.{numsca => ns}
import ns.Tensor
```

## Creating a Tensor

```scala
scala> Tensor(3, 2, 1, 0)
[3.00,  2.00,  1.00,  0.00]

scala> ns.zeros(3, 3)
[[0.00,  0.00,  0.00],
[0.00,  0.00,  0.00],
[0.00,  0.00,  0.00]]

scala> ns.ones(3, 2)
[[1.00,  1.00],
[1.00,  1.00],
[1.00,  1.00]]

scala> val ta: Tensor = ns.arange(10)
[0.00,  1.00,  2.00,  3.00,  4.00,  5.00,  6.00,  7.00,  8.00,  9.00]

scala> val tb: Tensor = ns.reshape(ns.arange(9), 3, 3)
[[0.00,  1.00,  2.00],
[3.00,  4.00,  5.00],
[6.00,  7.00,  8.00]]

scala> val tc: Tensor = ns.reshape(ns.arange(2 * 3 * 4), 2, 3, 4)
[[[0.00,  1.00,  2.00,  3.00],
[4.00,  5.00,  6.00,  7.00],
[8.00,  9.00,  10.00,  11.00]],

[[12.00,  13.00,  14.00,  15.00],
[16.00,  17.00,  18.00,  19.00],
[20.00,  21.00,  22.00,  23.00]]]
```

## Access

Single element

```scala
scala> ta(0)
res10: botkop.numsca.Tensor = 0.00

scala> tc(0, 1, 2)
res14: botkop.numsca.Tensor = 6.00
```

Get the value of a single element Tensor:

```scala
scala> ta(0).squeeze()
res11: Double = 0.0
```

Slice

```scala
scala> tc(0)
res7: botkop.numsca.Tensor =
[[0.00,  1.00,  2.00,  3.00],
[4.00,  5.00,  6.00,  7.00],
[8.00,  9.00,  10.00,  11.00]]

scala> tc(0, 1)
res8: botkop.numsca.Tensor = [4.00,  5.00,  6.00,  7.00]
```

## Update

In place

```scala
scala> val t = ta.copy()
t: botkop.numsca.Tensor = [0.00,  1.00,  2.00,  3.00,  4.00,  5.00,  6.00,  7.00,  8.00,  9.00]

scala> t(3) := -5
scala> t
res16: botkop.numsca.Tensor = [0.00,  1.00,  2.00,  -5.00,  4.00,  5.00,  6.00,  7.00,  8.00,  9.00]

scala> t(0) += 7
scala> t
res18: botkop.numsca.Tensor = [7.00,  1.00,  2.00,  -5.00,  4.00,  5.00,  6.00,  7.00,  8.00,  9.00]
```

Array-wise

```scala
scala> val a2 = 2 * ta
a2: botkop.numsca.Tensor = [0.00,  2.00,  4.00,  6.00,  8.00,  10.00,  12.00,  14.00,  16.00,  18.00]
```

## Slicing

Note:

• negative indexing is supported
• Python notation `t[:3]` must be written as `t(0 :> 3)` or `t(:>(3))`

Not supported (yet):

• step size
• ellipsis
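
To make the notes above concrete, here is a minimal plain-Scala sketch of how Python-style slice bounds, including negative ones and open ends, can be resolved to concrete positions. `SliceSketch` and `resolve` are hypothetical names used only for illustration. This is not Numsca's implementation, and it assumes Numsca follows Python's conventions for negative bounds.

```scala
object SliceSketch {
  // Resolve a half-open slice [from, to) against a dimension of the given
  // length. A negative bound counts from the end, and a missing upper bound
  // means "up to the end". Hypothetical helper, not part of Numsca.
  def resolve(from: Int, to: Option[Int], length: Int): Range = {
    def norm(i: Int): Int = if (i < 0) i + length else i
    val start = math.max(0, math.min(norm(from), length))
    val end = math.max(start, math.min(to.map(norm).getOrElse(length), length))
    start until end
  }

  def main(args: Array[String]): Unit = {
    println(resolve(0, Some(3), 10).toList)  // like t(0 :> 3):  List(0, 1, 2)
    println(resolve(-3, None, 10).toList)    // like t(-3 :>):   List(7, 8, 9)
    println(resolve(0, Some(-1), 10).toList) // like t(0 :> -1): positions 0 to 8
  }
}
```

Out-of-range bounds are clamped here, as Python does for slices. Whether Numsca clamps or throws in that case is not covered by this sketch.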

### Single dimension

#### Slice over a single dimension

```scala
scala> val a0 = ta.copy().reshape(10, 1)
a0: botkop.numsca.Tensor = [0.00,  1.00,  2.00,  3.00,  4.00,  5.00,  6.00,  7.00,  8.00,  9.00]

scala> val a1 = a0(1 :>)
a1: botkop.numsca.Tensor = [1.00,  2.00,  3.00,  4.00,  5.00,  6.00,  7.00,  8.00,  9.00]

scala> val a2 = a0(0 :> -1)
a2: botkop.numsca.Tensor = [0.00,  1.00,  2.00,  3.00,  4.00,  5.00,  6.00,  7.00,  8.00]

scala> val a3 = a1 - a2
a3: botkop.numsca.Tensor = [1.00,  1.00,  1.00,  1.00,  1.00,  1.00,  1.00,  1.00,  1.00]

scala> ta(:>, 5 :>)
res19: botkop.numsca.Tensor = [5.00,  6.00,  7.00,  8.00,  9.00]

scala> ta(:>, -3 :>)
res4: botkop.numsca.Tensor = [7.00,  8.00,  9.00]
```

#### Update single dimension slice

```scala
scala> val t = ta.copy()
t: botkop.numsca.Tensor = [0.00,  1.00,  2.00,  3.00,  4.00,  5.00,  6.00,  7.00,  8.00,  9.00]
```

Assign another tensor

```scala
scala> t(2 :> 5) := -ns.ones(3)
scala> t
res6: botkop.numsca.Tensor = [0.00,  1.00,  -1.00,  -1.00,  -1.00,  5.00,  6.00,  7.00,  8.00,  9.00]
```

Assign a value

```scala
scala> t(2 :> 5) := 33
scala> t
res8: botkop.numsca.Tensor = [0.00,  1.00,  33.00,  33.00,  33.00,  5.00,  6.00,  7.00,  8.00,  9.00]
```

Update in place

```scala
scala> t(2 :> 5) -= 1
scala> t
res10: botkop.numsca.Tensor = [0.00,  1.00,  32.00,  32.00,  32.00,  5.00,  6.00,  7.00,  8.00,  9.00]
```

### Multidimensional slices

```scala
scala> tb
res11: botkop.numsca.Tensor =
[[0.00,  1.00,  2.00],
[3.00,  4.00,  5.00],
[6.00,  7.00,  8.00]]

scala> tb(2:>, :>)
res15: botkop.numsca.Tensor = [6.00,  7.00,  8.00]
```

Mixed range/integer indexing. Note that integers are implicitly translated to ranges (an index `i` is treated like a range of length one), which differs from Python, where an integer index removes that dimension.

```scala
scala> tb(1, 0 :> -1)
res1: botkop.numsca.Tensor = [3.00,  4.00]
```

## Fancy indexing

### Boolean indexing

```scala
scala> val c = ta < 5 && ta > 1
c: botkop.numsca.Tensor = [0.00,  0.00,  1.00,  1.00,  1.00,  0.00,  0.00,  0.00,  0.00,  0.00]
```

Indexing with a boolean Tensor returns a TensorSelection:

```scala
scala> val d = ta(c)
d: botkop.numsca.TensorSelection = TensorSelection([0.00,  1.00,  2.00,  3.00,  4.00,  5.00,  6.00,  7.00,  8.00,  9.00],[[I@153ea1aa,None)
```

A TensorSelection is implicitly converted to a Tensor when needed:

```scala
scala> val d: Tensor = ta(c)
d: botkop.numsca.Tensor = [2.00,  3.00,  4.00]
```

Or you can force it to become a Tensor:

```scala
scala> ta(c).asTensor
res10: botkop.numsca.Tensor = [2.00,  3.00,  4.00]
```

Updating:

```scala
scala> val t = ta.copy()
scala> t(ta < 5 && ta > 1) := -7
res6: botkop.numsca.Tensor = [0.00,  1.00,  -7.00,  -7.00,  -7.00,  5.00,  6.00,  7.00,  8.00,  9.00]
```
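
For readers new to boolean indexing, the semantics of selecting and updating through a 0/1 mask can be sketched in plain Scala on ordinary arrays. `MaskSketch`, `select` and `updated` are hypothetical names for illustration only. Unlike the in-place `:=` shown above, this sketch returns a new array, and it says nothing about how Numsca implements the operation.

```scala
object MaskSketch {
  // Keep the elements whose mask entry is non-zero. The mask is a 0/1 array,
  // like the Tensor produced by `ta < 5 && ta > 1`.
  def select(data: Array[Double], mask: Array[Double]): Array[Double] =
    data.zip(mask).collect { case (v, m) if m != 0.0 => v }

  // Return a copy of `data` with every masked position replaced by `value`.
  def updated(data: Array[Double], mask: Array[Double], value: Double): Array[Double] =
    data.zip(mask).map { case (v, m) => if (m != 0.0) value else v }

  def main(args: Array[String]): Unit = {
    val ta = Array.tabulate(10)(_.toDouble)
    val mask = ta.map(v => if (v < 5 && v > 1) 1.0 else 0.0)
    println(select(ta, mask).mkString("[", ", ", "]")) // [2.0, 3.0, 4.0]
    println(updated(ta, mask, -7.0).mkString("[", ", ", "]"))
  }
}
```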

Selection over multiple dimensions:

```scala
scala> val c: Tensor = tc(tc % 5 == 0)
c: botkop.numsca.Tensor = [0.00,  5.00,  10.00,  15.00,  20.00]
```

Updating over multiple dimensions:

```scala
scala> val t1 = tc.copy()
t1: botkop.numsca.Tensor =
[[[0.00,  1.00,  2.00,  3.00],
[4.00,  5.00,  6.00,  7.00],
[8.00,  9.00,  10.00,  11.00]],

[[12.00,  13.00,  14.00,  15.00],
[16.00,  17.00,  18.00,  19.00],
[20.00,  21.00,  22.00,  23.00]]]

scala> t1(t1 > 5 && t1 < 15) *= 2
res21: botkop.numsca.Tensor =
[[[0.00,  1.00,  2.00,  3.00],
[4.00,  5.00,  12.00,  14.00],
[16.00,  18.00,  20.00,  22.00]],

[[24.00,  26.00,  28.00,  15.00],
[16.00,  17.00,  18.00,  19.00],
[20.00,  21.00,  22.00,  23.00]]]
```

### List of location indexing

```scala
scala> val primes = Tensor(2, 3, 5, 7, 11, 13, 17, 19, 23)

scala> val idx = Tensor(3, 4, 1, 2, 2)

scala> primes(idx).asTensor
res23: botkop.numsca.Tensor = [7.00,  11.00,  3.00,  5.00,  5.00]
```

The result takes the shape of the index:

```scala
scala> tb
res25: botkop.numsca.Tensor =
[[0.00,  1.00,  2.00],
[3.00,  4.00,  5.00],
[6.00,  7.00,  8.00]]

scala> primes(tb).asTensor
res24: botkop.numsca.Tensor =
[[2.00,  3.00,  5.00],
[7.00,  11.00,  13.00],
[17.00,  19.00,  23.00]]
```

Use as a look-up table:

```scala
scala> val numSamples = 4
scala> val numClasses = 3
scala> val x = ns.arange(numSamples * numClasses).reshape(numSamples, numClasses)
scala> val y = Tensor(0, 1, 2, 1)
scala> val z: Tensor = x(ns.arange(numSamples), y)
z: botkop.numsca.Tensor = [0.00,  4.00,  8.00,  10.00]
```

Update along a single dimension:

```scala
scala> val primes = Tensor(2, 3, 5, 7, 11, 13, 17, 19, 23)
primes: botkop.numsca.Tensor = [2.00,  3.00,  5.00,  7.00,  11.00,  13.00,  17.00,  19.00,  23.00]

scala> val idx = Tensor(3, 4, 1, 2, 2)
idx: botkop.numsca.Tensor = [3.00,  4.00,  1.00,  2.00,  2.00]

scala> primes(idx) := 0

scala> primes
res1: botkop.numsca.Tensor = [2.00,  0.00,  0.00,  0.00,  0.00,  13.00,  17.00,  19.00,  23.00]
```

Multiple dimensions

```scala
scala> val a = ns.arange(6).reshape(3, 2) + 1
a: botkop.numsca.Tensor =
[[1.00,  2.00],
[3.00,  4.00],
[5.00,  6.00]]

scala> val s1 = Tensor(0, 1, 2)
s1: botkop.numsca.Tensor = [0.00,  1.00,  2.00]

scala> val s2 = Tensor(0, 1, 0)
s2: botkop.numsca.Tensor = [0.00,  1.00,  0.00]

scala> val r1: Tensor = a(s1, s2)
r1: botkop.numsca.Tensor = [1.00,  4.00,  5.00]
```
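
The result above follows NumPy's convention that index lists are walked in lockstep: element `i` of the result is `a(s1(i), s2(i))`, not a cross product of rows and columns. A minimal plain-Scala sketch of that pairing, using nested arrays and the hypothetical names `PairedIndexSketch` and `lookup` (not Numsca code, and without index broadcasting):

```scala
object PairedIndexSketch {
  // Element i of the result is a(rows(i))(cols(i)). The two index lists are
  // zipped together rather than combined as a cross product.
  def lookup(a: Array[Array[Double]], rows: Seq[Int], cols: Seq[Int]): Seq[Double] = {
    require(rows.length == cols.length, "index lists must have the same length")
    rows.zip(cols).map { case (r, c) => a(r)(c) }
  }

  def main(args: Array[String]): Unit = {
    val a = Array(Array(1.0, 2.0), Array(3.0, 4.0), Array(5.0, 6.0))
    println(lookup(a, Seq(0, 1, 2), Seq(0, 1, 0))) // List(1.0, 4.0, 5.0)
  }
}
```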

An index will be broadcast if needed:

```scala
scala> val y = ns.arange(35).reshape(5, 7)
y: botkop.numsca.Tensor =
[[0.00,  1.00,  2.00,  3.00,  4.00,  5.00,  6.00],
[7.00,  8.00,  9.00,  10.00,  11.00,  12.00,  13.00],
[14.00,  15.00,  16.00,  17.00,  18.00,  19.00,  20.00],
[21.00,  22.00,  23.00,  24.00,  25.00,  26.00,  27.00],
[28.00,  29.00,  30.00,  31.00,  32.00,  33.00,  34.00]]

scala> val r5: Tensor = y(Tensor(0, 2, 4), Tensor(1))
r5: botkop.numsca.Tensor = [1.00,  15.00,  29.00]
```

Update along multiple dimensions:

```scala
scala> val a = ns.arange(6).reshape(3, 2) + 1
a: botkop.numsca.Tensor =
[[1.00,  2.00],
[3.00,  4.00],
[5.00,  6.00]]

scala> val s1 = Tensor(1, 1, 2)
s1: botkop.numsca.Tensor = [1.00,  1.00,  2.00]

scala> val s2 = Tensor(0, 1, 0)
s2: botkop.numsca.Tensor = [0.00,  1.00,  0.00]

scala> a(s1, s2) := 0
res1: botkop.numsca.Tensor =
[[1.00,  2.00],
[0.00,  0.00],
[0.00,  6.00]]
```

## Broadcasting

```scala
scala> val x = ns.arange(4)
x: botkop.numsca.Tensor = [0.00,  1.00,  2.00,  3.00]

scala> val xx = x.reshape(4, 1)
xx: botkop.numsca.Tensor = [0.00,  1.00,  2.00,  3.00]

scala> val y = ns.ones(5)
y: botkop.numsca.Tensor = [1.00,  1.00,  1.00,  1.00,  1.00]

scala> val z = ns.ones(3, 4)
[[1.00,  1.00,  1.00,  1.00],
[1.00,  1.00,  1.00,  1.00],
[1.00,  1.00,  1.00,  1.00]]

scala> (xx + y)
[[1.00,  1.00,  1.00,  1.00,  1.00],
[2.00,  2.00,  2.00,  2.00,  2.00],
[3.00,  3.00,  3.00,  3.00,  3.00],
[4.00,  4.00,  4.00,  4.00,  4.00]]

scala> x + z
[[1.00,  2.00,  3.00,  4.00],
[1.00,  2.00,  3.00,  4.00],
[1.00,  2.00,  3.00,  4.00]]
```
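
The result shapes in these examples follow NumPy's broadcasting rule: shapes are aligned from the right, and two sizes are compatible when they are equal or one of them is 1. Here is a minimal plain-Scala sketch of that rule. `BroadcastSketch` and `broadcastShape` are hypothetical names, and the sketch assumes Numsca mirrors NumPy here. It only computes the result shape and is not how Numsca or Nd4j implement broadcasting.

```scala
object BroadcastSketch {
  // NumPy's rule: pad the shorter shape with leading 1s, then compare sizes
  // pairwise. Equal sizes pass through, a size of 1 stretches to the other
  // size, and anything else is incompatible (None).
  def broadcastShape(a: Seq[Int], b: Seq[Int]): Option[Seq[Int]] = {
    val rank = math.max(a.length, b.length)
    val pa = Seq.fill(rank - a.length)(1) ++ a
    val pb = Seq.fill(rank - b.length)(1) ++ b
    val dims = pa.zip(pb).map {
      case (x, y) if x == y => Some(x)
      case (1, y)           => Some(y)
      case (x, 1)           => Some(x)
      case _                => None
    }
    if (dims.forall(_.isDefined)) Some(dims.flatten) else None
  }

  def main(args: Array[String]): Unit = {
    println(broadcastShape(Seq(4, 1), Seq(5))) // xx + y: Some(List(4, 5))
    println(broadcastShape(Seq(4), Seq(3, 4))) // x + z:  Some(List(3, 4))
    println(broadcastShape(Seq(4), Seq(5)))    // incompatible: None
  }
}
```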

Outer sum:

```scala
scala> val a = Tensor(0.0, 10.0, 20.0, 30.0).reshape(4, 1)
a: botkop.numsca.Tensor = [0.00,  10.00,  20.00,  30.00]

scala> val b = Tensor(1.0, 2.0, 3.0)
b: botkop.numsca.Tensor = [1.00,  2.00,  3.00]

scala> a + b
res6: botkop.numsca.Tensor =
[[1.00,  2.00,  3.00],
[11.00,  12.00,  13.00],
[21.00,  22.00,  23.00],
[31.00,  32.00,  33.00]]
```

Finding the code closest to an observation:

```scala
scala> val observation = Tensor(111.0, 188.0)

scala> val codes = Tensor( 102.0, 203.0, 132.0, 193.0, 45.0, 155.0, 57.0, 173.0).reshape(4, 2)
codes: botkop.numsca.Tensor =
[[102.00,  203.00],
[132.00,  193.00],
[45.00,  155.00],
[57.00,  173.00]]

scala> val diff = codes - observation
diff: botkop.numsca.Tensor =
[[-9.00,  15.00],
[21.00,  5.00],
[-66.00,  -33.00],
[-54.00,  -15.00]]

scala> val dist = ns.sqrt(ns.sum(ns.square(diff), axis = -1))
dist: botkop.numsca.Tensor = [17.49,  21.59,  73.79,  56.04]

scala> val nearest = ns.argmin(dist).squeeze()
nearest: Double = 0.0
```