
Fuzzy matching library for scientific names with emphasis on performance and scalability


Global Names Matcher

Global Names Matcher, or gnmatcher, is a Scala 2.10.3+ library for very fast fuzzy matching of a query string against a given set of strings.
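
To illustrate what fuzzy matching a query against a set of strings means, here is a minimal sketch based on Levenshtein edit distance. It is not gnmatcher's implementation or API: the object `FuzzyMatchSketch` and the functions `editDistance` and `fuzzyMatch` are hypothetical names chosen for this example, and a linear scan like this one makes no attempt at the performance the library aims for.

```scala
// Illustrative sketch only; these names are NOT part of gnmatcher.
object FuzzyMatchSketch {

  /** Levenshtein edit distance between `a` and `b`, computed with two rolling rows. */
  def editDistance(a: String, b: String): Int = {
    // prev(j) holds the distance between the first i-1 chars of `a` and the first j chars of `b`
    var prev = Array.range(0, b.length + 1)
    for (i <- 1 to a.length) {
      val curr = new Array[Int](b.length + 1)
      curr(0) = i // deleting all i characters of `a`
      for (j <- 1 to b.length) {
        val cost = if (a(i - 1) == b(j - 1)) 0 else 1
        curr(j) = math.min(
          math.min(curr(j - 1) + 1, prev(j) + 1), // insertion, deletion
          prev(j - 1) + cost                      // substitution or match
        )
      }
      prev = curr
    }
    prev(b.length)
  }

  /** Candidates within `maxDistance` edits of `query`, closest first. */
  def fuzzyMatch(query: String,
                 candidates: Seq[String],
                 maxDistance: Int): Seq[(String, Int)] =
    candidates
      .map(c => (c, editDistance(query, c)))
      .filter(_._2 <= maxDistance)
      .sortBy(_._2)
}
```

With this sketch, `FuzzyMatchSketch.fuzzyMatch("Homo sapins", Seq("Homo sapiens", "Homo habilis", "Pan troglodytes"), 2)` keeps only `("Homo sapiens", 1)`, because the misspelled query is one insertion away from that candidate and more than two edits away from the others.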


The artifacts for gnmatcher live on Maven Central.

To install the dependency, add the following line to your SBT build:

libraryDependencies += "org.globalnames" %% "gnmatcher" % "0.1.0"

Corresponding Maven code:




gnmatcher implements heuristic algorithms to match the semantic parts of scientific biological names:

  • authors matching answers the question: how similar is the authors string "Linnaeus, Muller 1767" to "Muller and Linnaeus"?

Authors Matching

The entire algorithm is ported from the Ruby implementation developed by Patrick Leary of uBio and EOL fame. To answer the question above, run the following code:

$ sbt matcher/console
scala> import org.globalnames._
scala> AuthorsMatcher.score(Seq(Author("Linnaeus"), Author("Muller")), Some(1767),
     |                      Seq(Author("Muller"), Author("Linnaeus")), None)
res0: Double = 0.5



Released under the MIT license.