Blake3 for scala

This is a highly optimized blake3 implementation for scala, scala-js, and scala-native, without any dependencies. It has a constant memory footprint (about 5 KB) that does not depend on the size of the hashed data or the size of the output hash.

If you're looking for the fastest possible hash function for scala.js, I suggest using this one instead of SHA, because this implementation uses only 32-bit numbers, which JS supports natively.
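To illustrate why 32-bit arithmetic is enough, here is a minimal sketch of the quarter-round mixing step as described in the BLAKE3 specification. It is not taken from this library's source, and the names `Blake3MixSketch` and `g` are chosen only for illustration. Every operation is an `Int` addition, XOR, or rotation, which Scala.js can map onto JS 32-bit integer operations, whereas `Long` arithmetic has to be emulated there.

```scala
object Blake3MixSketch {
  /** One BLAKE3 quarter-round over four state words (a, b, c, d)
    * and two message words (mx, my).
    * Int addition wraps modulo 2^32, which is what the spec requires. */
  def g(a0: Int, b0: Int, c0: Int, d0: Int, mx: Int, my: Int): (Int, Int, Int, Int) = {
    var a = a0; var b = b0; var c = c0; var d = d0
    a = a + b + mx
    d = Integer.rotateRight(d ^ a, 16)
    c = c + d
    b = Integer.rotateRight(b ^ c, 12)
    a = a + b + my
    d = Integer.rotateRight(d ^ a, 8)
    c = c + d
    b = Integer.rotateRight(b ^ c, 7)
    (a, b, c, d)
  }
}
```

No 64-bit value appears anywhere in the step, so nothing in it needs `Long` support from the target platform.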

To use it, add the dependency to your sbt build:

libraryDependencies += "pt.kcry" %%% "blake3" % "x.x.x"

The latest version is available on Maven Central.
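A note on the operator: `%%%` is provided by the sbt-platform-deps plugin, which comes with the Scala.js, Scala Native, and sbt-crossproject plugins. In a plain JVM build without those plugins, the usual `%%` form should work instead. The fragment below is a sketch that assumes a JVM-only project; `x.x.x` remains a placeholder for the actual version.

```scala
// build.sbt (JVM-only project, no Scala.js / Scala Native plugins loaded)
// %% appends only the Scala binary version to the artifact name.
libraryDependencies += "pt.kcry" %% "blake3" % "x.x.x"
```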

The API is pretty simple:

scala> import pt.kcry.blake3.Blake3

scala> Blake3.newHasher().update("Some string").doneHex(64)
val res1: String = 2e5524f3481046587080604ae4b4ceb44b721f3964ce0764627dee2c171de4c2

scala> Blake3.newDeriveKeyHasher("whats the Elvish word for friend").update("Some string").doneHex(64)
val res2: String = c2e79fe73dde16a13b4aa5a947b0e9cd7277ea8e68da250759de3ae62372b340

scala> Blake3.newKeyedHasher("whats the Elvish word for friend").update("Some string").doneHex(64)
val res3: String = 79943402309f9bb05338193f21fb57d98ab848bdcac67e5e097340f116ff90ba

scala> Blake3.hex("Some string", 64)
val res4: String = 2e5524f3481046587080604ae4b4ceb44b721f3964ce0764627dee2c171de4c2

scala> Blake3.bigInt("Some string", 32)
val res5: BigInt = 777331955


Hasher.update mutates the hasher, whereas Hasher.done does not.
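A small sketch of what this means in practice, using only calls that appear in the session above (`newHasher`, `update` with a string, `doneHex`, and `Blake3.hex`). It assumes that feeding a string in two pieces is equivalent to feeding it at once, which is the usual contract for a streaming hasher; the object name `UpdateVsDone` is chosen only for illustration.

```scala
import pt.kcry.blake3.Blake3

object UpdateVsDone {
  def run(): (Boolean, Boolean, Boolean) = {
    val hasher = Blake3.newHasher()

    // update changes the hasher's internal state, so inputs accumulate.
    hasher.update("Some ")
    hasher.update("string")

    // done does not change the state, so it can be called repeatedly.
    val first  = hasher.doneHex(64)
    val second = hasher.doneHex(64)
    val sameOnRepeat = first == second

    // Compare against the one-shot helper for the whole string.
    val sameAsOneShot = first == Blake3.hex("Some string", 64)

    // The hasher is still usable after done: further input changes the result.
    hasher.update("!")
    val changedAfterUpdate = hasher.doneHex(64) != first

    (sameOnRepeat, sameAsOneShot, changedAfterUpdate)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Because done leaves the state untouched, one hasher can produce intermediate digests while the input is still being consumed.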

Hasher.update supports different inputs, such as a byte array, a part of a byte array, a single byte, or a string, and many others like OutputStream or ByteBuffer.

Hasher.done supports different outputs, such as:

  • done(out: Array[Byte]) fills the entire provided array;
  • done(out: Array[Byte], offset: Int, len: Int) fills the specified part of the provided array;
  • done(out: OutputStream, len: Int) writes the hash to the specified OutputStream;
  • done(out: ByteBuffer) fills the specified ByteBuffer;
  • done() returns a single-byte hash value;
  • doneShort(), doneInt() and doneLong() return a single short, int or long hash value;
  • doneBigInt(bitLength: Int) returns a positive BigInt with the specified length in bits;
  • doneHex(resultLength: Int) returns a hex-encoded string with the specified output length in characters;
  • doneBaseXXX(len: Int) returns a string in the XXX encoding as defined in RFC 4648, without padding;
  • doneXor(...) applies the hash to an existing value via XOR;
  • doneCallBack(...) and doneXorCallBack(...) invoke a callback for each produced byte.
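To make the XOR variant more concrete, here is a self-contained sketch of what "applying a hash to an existing value via XOR" amounts to, assuming each output byte is combined with the byte already in the buffer. The helper `xorInto` is hypothetical and stands in for the library call; the `stream` argument plays the role of the hash output.

```scala
object XorSketch {
  /** Hypothetical helper, not part of the library:
    * XORs `stream` into `target` in place, byte by byte. */
  def xorInto(target: Array[Byte], stream: Array[Byte]): Unit = {
    require(stream.length >= target.length, "stream must cover the target")
    var i = 0
    while (i < target.length) {
      target(i) = (target(i) ^ stream(i)).toByte
      i += 1
    }
  }
}
```

Applying the same stream twice restores the original bytes, which is why an XOR-style output is convenient for masking and unmasking data in place.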

This implementation is thread-safe, so you can use it in a multithreaded environment. However, it does not currently include any multithreading optimizations.

As a baseline for benchmarks, I've used the original C version c-0.3.7 via the JNI interface that was implemented as part of BLAKE3jni.

All benchmarks were performed on two machines:

  • Zulu11.56+19-CA (build 11.0.15+10-LTS) on an Intel® Core™ i7-8700B, with AVX2 assembly optimization inside the baseline,
  • Zulu11.56+19-CA (build 11.0.15+10-LTS) on an Apple M1, without any assembly optimization inside the baseline.

Short summary:

  • it is about 4 times slower than the AVX2 assembly version via JNI, which is expected,
  • it is about 30% slower than the original C version via JNI,
  • it has a constant memory footprint (yeah, no GC on hashing!),
  • increasing the size of the result hash has the same performance impact as hashing more data.

The full results are available as: