A JNA wrapper around spotify/annoy which calls the C++ library of annoy directly from Scala/JVM.
For linux-x86-64 or Mac users, just add the library directly as:
libraryDependencies += "net.pishen" %% "annoy4s" % "0.10.0"
If you encounter an error like the one below when using annoy4s, you may have to compile the native library yourself.
java.lang.UnsatisfiedLinkError: Unable to load library 'annoy': Native library
To compile the native library and install annoy4s on your local machine:
- Clone this repository.
- Check the values of organization and version in build.sbt; you may change them to whatever values you want.
- Run compileNative in sbt (note that g++ must be installed).
- Run test in sbt to check that the native library was compiled successfully.
- Run publishLocal in sbt to install annoy4s on your machine.
Now you can add the library dependency as (organization and version may be different according to your settings):
libraryDependencies += "net.pishen" %% "annoy4s" % "0.10.0-SNAPSHOT"
The library file generated by the g++ command in compileNative can also be installed independently on your machine. Please refer to JNA's library search paths for more details on how to make JNA able to load the library.
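One way to point JNA at an independently installed library file is the jna.library.path system property, which JNA consults when searching for native libraries. The following is a minimal sketch, not part of annoy4s: the JnaPath helper and the /opt/annoy/lib directory are hypothetical names chosen for illustration.

```scala
import java.io.File

object JnaPath {
  // Prepend `dir` to the jna.library.path system property, keeping any
  // entries that are already there. Returns the updated property value.
  // This has to run before the native library is loaded for the first time.
  def prepend(dir: String): String = {
    val key = "jna.library.path"
    val existing = Option(System.getProperty(key)).filter(_.nonEmpty)
    val updated = (dir +: existing.toSeq).mkString(File.pathSeparator)
    System.setProperty(key, updated)
    updated
  }

  def main(args: Array[String]): Unit =
    // Hypothetical install location of the compiled library file.
    println(prepend("/opt/annoy/lib"))
}
```

The same property can be passed on the command line with -Djna.library.path=..., which avoids having to order the call before the first use of the library.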
Create and query the index in memory mode:
import annoy4s._
val annoy = Annoy.create[Int]("./input_vectors", numOfTrees = 10, metric = Euclidean, verbose = true)
val result: Option[Seq[(Int, Float)]] = annoy.query(itemId, maxReturnSize = 30)
- The format of the input file is <item id> <vector> for each line, here is an example:
  3 0.2 -1.5 0.3
  5 0.4 0.01 -0.5
  0 1.1 0.9 -0.1
  2 1.2 0.8 0.2
- <item id> does not have to be an Int; it could be, for example, a UUID. Just change the type parameter at Annoy.create[T]. You can also implement a KeyConverter[T] yourself to support your own type.
- result is a list of (id, distance) tuples, which includes the query item itself.
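The input format described above can be produced with plain JVM file I/O. The sketch below is only an illustration: the InputVectors object and its methods are hypothetical helpers, not part of annoy4s. It assumes that the id's toString is the form the index expects, and that vector components render as plain decimals (very small or very large floats print in scientific notation on the JVM, which may or may not be accepted).

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

object InputVectors {
  // Render one item as "<item id> <vector>", components separated by single spaces.
  def formatLine[T](id: T, vector: Seq[Float]): String =
    (id.toString +: vector.map(_.toString)).mkString(" ")

  // Write all items to `path`, one item per line.
  def write[T](path: String, items: Seq[(T, Seq[Float])]): Unit = {
    // Annoy indexes vectors of one fixed dimension, so reject mixed lengths early.
    require(items.map(_._2.size).distinct.size <= 1, "all vectors must have the same length")
    val content = items.map { case (id, v) => formatLine(id, v) }.mkString("", "\n", "\n")
    Files.write(Paths.get(path), content.getBytes(StandardCharsets.UTF_8))
  }

  def main(args: Array[String]): Unit =
    write("./input_vectors", Seq(
      3 -> Seq(0.2f, -1.5f, 0.3f),
      5 -> Seq(0.4f, 0.01f, -0.5f),
      0 -> Seq(1.1f, 0.9f, -0.1f),
      2 -> Seq(1.2f, 0.8f, 0.2f)
    ))
}
```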
To use the index in disk mode, one needs to provide an outputDir:
val annoy = Annoy.create[Int]("./input_vectors", 10, outputDir = "./annoy_result/", Euclidean)
val result: Option[Seq[(Int, Float)]] = annoy.query(itemId, maxReturnSize = 30)
annoy.close()

// load a created index
val reloadedAnnoy = Annoy.load[Int]("./annoy_result/")
val reloadedResult: Option[Seq[(Int, Float)]] = reloadedAnnoy.query(itemId, 30)
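Since the query result is an Option and the returned list contains the query item itself, callers will often want to unwrap it and drop that entry. A minimal sketch of one way to do this follows. QueryResults and neighborsOf are hypothetical names, not part of annoy4s, and a None result is simply treated as "no neighbors" (assumed here to mean the id was not found in the index).

```scala
object QueryResults {
  // Drop the query item from its own neighbor list.
  // A missing result (None) becomes an empty list.
  def neighborsOf[T](itemId: T, result: Option[Seq[(T, Float)]]): Seq[(T, Float)] =
    result.getOrElse(Seq.empty).filterNot { case (id, _) => id == itemId }

  def main(args: Array[String]): Unit = {
    // Hand-written stand-in for the value returned by annoy.query(3, maxReturnSize = 30);
    // the distances are made up for illustration.
    val result: Option[Seq[(Int, Float)]] = Some(Seq(3 -> 0.0f, 5 -> 1.7f, 2 -> 2.5f))
    neighborsOf(3, result).foreach { case (id, dist) => println(s"$id $dist") }
  }
}
```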