An Apache Spark framework for easy data processing, extraction, and derivation for web archives and archival collections, developed at the Internet Archive.
Scala versions: 2.11
[Latest version](https://index.scala-lang.org/helgeho/archivespark/archivespark)