Spark-based implementation of pDC3, a linear-time parallel suffix-array-construction algorithm.
This repo contains:
- an implementation of the sequential algorithm DC3 (paper) under org.hammerlab.suffixes.dc3.
- an Apache-Spark-based implementation of its parallel counterpart, pDC3 (paper), under org.hammerlab.suffixes.pdc3.
The tests verify that both implementations produce the same suffix arrays on a variety of inputs.
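
For readers new to the output being compared: a suffix array lists the starting indices of a string's suffixes in lexicographic order. The sketch below is a naive reference construction of the kind one might use to sanity-check small inputs. It is an illustration only: `NaiveSuffixArray` and `naiveSuffixArray` are hypothetical names, not part of this repo's API, and the approach is a plain sort of suffixes rather than DC3 or pDC3.

```scala
// Hypothetical helper for illustration; not part of org.hammerlab.suffixes.
object NaiveSuffixArray {
  // Returns the starting index of every suffix of `s`, ordered so that the
  // corresponding suffixes are in lexicographic order.
  // This sorts full suffix strings, so it is far slower than linear time
  // and only suitable as a reference on small inputs.
  def naiveSuffixArray(s: String): Array[Int] =
    s.indices.sortBy(i => s.substring(i)).toArray
}
```

For example, `NaiveSuffixArray.naiveSuffixArray("banana")` yields `5, 3, 1, 0, 4, 2`, corresponding to the sorted suffixes `a`, `ana`, `anana`, `banana`, `na`, `nana`.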