BlackLab

This is a fork of BlackLab (https://github.com/INL/BlackLab), the open source Java corpus retrieval engine. The fork has been ported to build with SBT and includes some fixes that reduce memory usage.

It is published to Maven Central under the package name `org.allenai.blacklab`. All package references in the Java source code have been changed from `nl.inl` to `org.allenai`.
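Code written against the upstream project needs its imports adjusted to match the rename. The sketch below shows one way to rewrite a line of source; the `PackageRename` class and `toForkPackage` method are hypothetical helpers invented for illustration and are not part of BlackLab, and the import in `main` is only sample input.

```java
import java.util.regex.Pattern;

public class PackageRename {
    // Matches the upstream `nl.inl` prefix as whole package segments only,
    // so identifiers such as `nl.inline` are left untouched.
    private static final Pattern UPSTREAM_PREFIX = Pattern.compile("\\bnl\\.inl\\b");

    // Hypothetical helper: rewrites upstream package references in one line of source.
    static String toForkPackage(String line) {
        return UPSTREAM_PREFIX.matcher(line).replaceAll("org.allenai");
    }

    public static void main(String[] args) {
        // Sample input line; any reference starting with nl.inl is handled the same way.
        System.out.println(toForkPackage("import nl.inl.blacklab.search.Searcher;"));
    }
}
```

Applying the same replacement across a project's source tree (for example with an IDE-wide search and replace) should be enough for most code, assuming it uses only the package names and no reflection on string literals.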

What is BlackLab?

BlackLab is a corpus retrieval engine built on top of Apache Lucene. It allows fast, complex searches with accurate hit highlighting on large bodies of tagged and annotated text. It was developed at the Institute of Dutch Lexicology (INL) to provide a fast and feature-rich search interface on our historical and contemporary text corpora.

We're also working on BlackLab Server, a web service interface to BlackLab that lets you access it from any programming language. See the alpha version here: https://github.com/INL/BlackLab-server

BlackLab is licensed under the Apache License 2.0.

More information: