schemer

A schema registry with support for CSV, TSV, Avro, JSON and Parquet schemas. It can also infer a schema from a given data source.

Schemer Core

schemer-core is the core library. It implements most of the logic needed to understand the supported schema types, along with schema inference. To use schemer-core directly, add it to your dependencies:

libraryDependencies += "com.indix" %% "schemer" % "v0.2.3"

Schemer Registry

schemer-registry is a schema registry for storing metadata about schemas and schema versions. It provides a GraphQL API for adding, viewing and inferring schemas.

Schemer Registry is available as a Docker image on Docker Hub.

Running Locally

A local Docker-based PostgreSQL instance can be started as follows:

docker run -e POSTGRES_USER=schemer -e POSTGRES_PASSWORD=schemer -e PGDATA=/var/lib/postgresql/data/pgdata -e POSTGRES_DB=schemer -v $(pwd)/schemer_db:/var/lib/postgresql/data/pgdata -p 5432:5432 postgres:9.5.0

Remove the schemer_db folder to clear all data and start from scratch.
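
The reset step can be sketched as a small shell function. The name reset_schemer_db is hypothetical and not part of the project. The sketch assumes the PostgreSQL container has already been stopped and that the data directory is the schemer_db folder created by the volume mount in the docker run command above.

```shell
# Hypothetical helper (not part of schemer): delete the local PostgreSQL
# data directory so the next `docker run` starts with an empty database.
# Assumes the container that uses this directory has been stopped first.
reset_schemer_db() {
  # Default to the folder used by the -v mount in the docker run command.
  data_dir="${1:-$(pwd)/schemer_db}"
  if [ -d "$data_dir" ]; then
    rm -rf -- "$data_dir"
    echo "removed $data_dir"
  else
    echo "nothing to remove at $data_dir"
  fi
}
```

Calling reset_schemer_db with no argument targets ./schemer_db in the current directory. On Linux the files may be owned by the container's user, in which case the removal may need elevated permissions.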

The registry service can be run using sbt:

sbt "project registry" ~reStart
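
Once the registry is running, one way to explore its GraphQL API is a standard GraphQL introspection query, which asks the server to list the types it exposes. The sketch below is an assumption-laden example: this README does not state the port or path of the GraphQL endpoint, so the URL is a placeholder to be replaced, and it assumes the service accepts GraphQL queries as JSON over HTTP POST with introspection enabled. The function names are hypothetical.

```shell
# Assumed endpoint: the port and path are placeholders, not taken from
# the schemer documentation. Override with the real URL of your instance.
SCHEMER_GRAPHQL_URL="${SCHEMER_GRAPHQL_URL:-http://localhost:9000/graphql}"

# Build the JSON body for a standard GraphQL introspection query that
# lists the names of all types known to the server.
introspection_payload() {
  printf '%s' '{"query":"{ __schema { types { name } } }"}'
}

# POST the introspection query to the (assumed) endpoint and print the
# raw JSON response.
list_schema_types() {
  curl -sS -X POST \
    -H 'Content-Type: application/json' \
    --data "$(introspection_payload)" \
    "$SCHEMER_GRAPHQL_URL"
}
```

Running list_schema_types against a live instance should return a JSON document describing the available types, which is a reasonable starting point for finding the queries and mutations for adding, viewing and inferring schemas.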