navicore / navilake   1.3.0

MIT License GitHub

An Akka Streams source of Azure Data Lake data

Scala versions: 2.11 2.12 2.13

Read Azure Data Lake Storage into Akka Streams

Replay historical data at rest into an existing code base that was designed for streaming.

Current Storage Sources

  1. Gzip files of UTF-8, newline-delimited (\n) strings
  2. Other storage implementations TBD
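To make the storage format in item 1 concrete, here is a minimal sketch that writes and reads gzip-compressed, newline-delimited UTF-8 strings using only the JDK and the Scala standard library. It does not use navilake or Azure; the `GzipLines` object and its method names are hypothetical and exist only for illustration.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.nio.charset.StandardCharsets
import java.util.zip.{GZIPInputStream, GZIPOutputStream}
import scala.io.Source

// Hypothetical helper, not part of navilake: illustrates the file format only.
object GzipLines {

  /** Compress records into an in-memory gzip payload, one UTF-8 line per record. */
  def gzip(records: Seq[String]): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new GZIPOutputStream(buffer)
    try out.write(records.map(_ + "\n").mkString.getBytes(StandardCharsets.UTF_8))
    finally out.close() // closing flushes the gzip trailer into the buffer
    buffer.toByteArray
  }

  /** Decompress a gzip payload and return one string per \n-delimited line. */
  def readLines(payload: Array[Byte]): List[String] = {
    val in = new GZIPInputStream(new ByteArrayInputStream(payload))
    try Source.fromInputStream(in, "UTF-8").getLines().toList
    finally in.close()
  }
}
```

For example, `GzipLines.readLines(GzipLines.gzip(Seq("a", "b")))` returns `List("a", "b")`.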

Uses the adslapi.

USAGE

Update your build.sbt dependencies with:

// https://mvnrepository.com/artifact/tech.navicore/navilake
libraryDependencies += "tech.navicore" %% "navilake" % "1.3.0"

This example reads gzip data from Azure Data Lake.

Create a config, a connector, and a source as shown in the example below.

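    val consumer = ... // some Sink that accepts the elements emitted by the source
    ...
    // credentials and location of the data in the lake
    implicit val cfg: LakeConfig = LakeConfig(ACCOUNTFQDN, CLIENTID, AUTHEP, CLIENTKEY, Some(PATH))
    // connector actor for gzip files
    val connector: ActorRef = actorSystem.actorOf(GzipConnector.props)
    // Akka Streams source backed by the connector
    val src = NaviLake(connector)
    ...
    // run the stream: read from the lake and push each element into the sink
    src.runWith(consumer)
    ...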
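As an illustration of what the `consumer` sink might look like, the sketch below defines a `Sink` that prints each element and runs it against a stand-in in-memory source. It assumes the stream elements are `String`s, which this README does not state, and it requires `akka-stream` on the classpath. `ConsumerSketch` and `standIn` are hypothetical names; the real source would be the `src` built from `NaviLake(connector)`. The explicit `ActorMaterializer` is there for Akka 2.5; on Akka 2.6 and later it is deprecated, because a materializer is derived from the implicit `ActorSystem`.

```scala
import akka.Done
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Hypothetical sketch, not part of navilake.
object ConsumerSketch {

  // A sink that prints every element; String elements are an assumption.
  val consumer: Sink[String, Future[Done]] =
    Sink.foreach[String](line => println(line))

  def main(args: Array[String]): Unit = {
    implicit val actorSystem: ActorSystem = ActorSystem("navilake-demo")
    implicit val materializer: ActorMaterializer = ActorMaterializer()

    // Stand-in for the NaviLake source so the sketch runs without Azure credentials.
    val standIn = Source(List("record-1", "record-2"))

    try {
      // The materialized Future completes once the stand-in source is exhausted.
      Await.result(standIn.runWith(consumer), 10.seconds)
      ()
    } finally {
      actorSystem.terminate()
    }
  }
}
```

Because `runWith` on a `Sink.foreach` materializes a `Future[Done]`, the same pattern lets the calling code detect when a finite replay has finished.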