Introduction

A collection of Scala and Java classes for basic character-level processing of Sanskrit and other Indic languages (Kannada, Telugu, etc.), contributed by the open-source sanskrit-coders projects and friends. Some notable facilities:

  • Transliterate text from one script or encoding scheme to another.
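
To give a feel for what script-to-script transliteration involves, here is a minimal sketch. It is not this library's API: the object and method names are hypothetical, and it covers only one script pair. It relies on the fact that the Unicode Devanagari block (starting at U+0900) and the Kannada block (starting at U+0C80) are laid out in parallel, so most letters, vowel signs and digits can be converted by adding a fixed offset.

```scala
// Hypothetical illustration only; not part of this library's API.
object OffsetTransliterationSketch {
  // Start of the Unicode Devanagari and Kannada blocks.
  private val DevanagariBase = 0x0900
  private val KannadaBase = 0x0C80
  private val Offset = KannadaBase - DevanagariBase

  // Shift only letters, signs and the virama (U+0901..U+094D) and the
  // digits (U+0966..U+096F). Dandas (U+0964, U+0965) are shared between
  // scripts and are deliberately left unchanged.
  private def isShiftable(c: Char): Boolean =
    (c >= 0x0901 && c <= 0x094D) || (c >= 0x0966 && c <= 0x096F)

  // Caveat: a few Devanagari code points in these ranges have no Kannada
  // counterpart; a real transliterator needs explicit handling for them.
  def devanagariToKannada(text: String): String =
    text.map(c => if (isShiftable(c)) (c + Offset).toChar else c)
}
```

With this sketch, `OffsetTransliterationSketch.devanagariToKannada("संस्कृत")` yields "ಸಂಸ್ಕೃತ". Conversions to and from romanization schemes cannot be done by a fixed offset, since those must also deal with the inherent vowel of Indic consonants.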

Users

Library users

  • Maven repository here.
  • Last update: 2017-05-??

Built output

  • Final jar files
    • out/*.jar [all modules in the IntelliJ project]
    • target/*.jar [sources and javadocs are in separate jars; indic-transliteration module only]
  • Classes
    • out/production/*/ [modules other than indic-transliteration]
    • target/ [sanskritnlp module output]

Some known users

Libraries in other languages

Contributors

Deployment

SBT:

  • Use the sbt command release to publish to the Maven repositories.
  • Use the sbt commands test and testOnly to run tests.
  • A published release should be usable almost immediately; after many hours, it should also appear in the Maven repository listings here.

Building a jar.

  • The simplest way is to set up a build artifact in IntelliJ IDEA.

Technical choices

Scala

  • One can write much more concise code (1/4th to 1/3rd the size of equivalent Java, and 3/4ths to 5/6ths the size of equivalent Python, according to this).
    • For example, it is easy to iterate in Scala using the higher-order functions (maps, filters and zips) available in Scala's excellent collections library.
  • This comes without sacrificing the ability to use Java libraries, or the readability and speed of Java.
  • It is increasing in popularity relative to competitors: Scala vs Clojure (Google Trends), Scala vs Julia (Google Trends).
  • Here is a good series of blog posts that provides an introduction to Scala.
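
The conciseness point about higher-order functions can be illustrated with a small sketch. It uses only the standard Scala collections library; the object name and the sample words are made up for illustration and do not come from this project.

```scala
// Illustrative only: map, zip and filter from the standard collections library.
object CollectionsSketch {
  // Sample romanized words (arbitrary example data).
  val words: List[String] = List("rAma", "sItA", "lakShmaNa")

  // map: transform every element, here into its length.
  val lengths: List[Int] = words.map(_.length)

  // zip: pair each word with its length, element by element.
  val wordsWithLengths: List[(String, Int)] = words.zip(lengths)

  // filter, then map: keep the pairs whose length exceeds 4, then take the word.
  val longWords: List[String] =
    wordsWithLengths.filter { case (_, n) => n > 4 }.map(_._1)
}
```

Here `CollectionsSketch.longWords` evaluates to `List("lakShmaNa")`. The equivalent Java written with explicit loops and temporary lists would typically take several times as many lines.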