A Scala library for intelligent character-by-character reading with automatic indentation tracking.
char_reader provides a powerful abstraction for parsing text with significant whitespace. It automatically generates INDENT and DEDENT tokens when indentation levels change, making it ideal for parsing languages like Python, YAML, or any custom DSL that uses indentation for structure.
Key features include:
- Automatic indentation tracking with configurable indentation styles
- Precise position tracking (line and column numbers)
- Cross-platform support (JVM, JavaScript via Scala.js, and Native)
- Rich error reporting with contextual information
- Flexible iteration over characters with lookahead capabilities
- Comment line detection and handling
Add the dependency to your build.sbt:
libraryDependencies += "io.github.edadma" %%% "char_reader" % "0.1.25"

For cross-platform projects, use %%% to automatically select the appropriate artifact.
import io.github.edadma.char_reader.CharReader
// Read from string without indentation tracking
// Declared as var because each step rebinds it to the next reader
var reader = CharReader.fromString("Hello\nWorld")
while (!reader.eoi) {
  println(s"Char: '${reader.ch}' at line ${reader.line}, column ${reader.col}")
  reader = reader.next
}

import io.github.edadma.char_reader.CharReader
val text = """
|1
| a
| b
| c
|2
""".stripMargin
// Configure comment syntax (prefix, middle, suffix)
val reader = CharReader.fromString(text, indentation = Some(("#", "", "")))
reader.iterator.foreach { r =>
  r.ch match {
    case CharReader.INDENT => println("Indentation increased")
    case CharReader.DEDENT => println("Indentation decreased")
    case CharReader.EOI => println("End of input")
    case '\n' => println("Newline")
    case c => println(s"Character: '$c'")
  }
}

val reader = CharReader.fromFile("input.txt", indentation = Some(("#", "", "")))

val reader = CharReader.fromString("function hello() {")
reader.matches("function") match {
  case Some(nextReader) => println("Found 'function' keyword")
  case None => println("Pattern not found")
}

// Consume until whitespace
val (consumed, rest) = reader.consume(_.ch.isWhitespace)
println(s"Consumed: '$consumed'")
// Consume past a delimiter
reader.consumePastDelimiter("*/") match {
  case Some((content, rest)) => println(s"Comment content: '$content'")
  case None => println("Delimiter not found")
}

val reader = CharReader.fromString("error here")
reader.error("Unexpected token") // Throws with context

Output:
Unexpected token (line 1, column 1):
error here
^
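
The message layout shown above (message, position, source line, caret) can be reproduced with a few lines of plain Scala. The following is only a sketch of that layout, not char_reader's own code: `ErrorContextSketch` and `formatError` are hypothetical names, and the 1-based line and column convention is assumed from the sample output.

```scala
object ErrorContextSketch {
  // Hypothetical helper, not part of char_reader: renders
  // "message (line L, column C):", the offending source line,
  // and a caret placed under the reported column.
  def formatError(msg: String, source: String, line: Int, col: Int): String = {
    // A limit of -1 keeps trailing empty lines, so line numbers stay valid
    val lines = source.split("\n", -1)
    val text = lines(line - 1)        // line is assumed to be 1-based
    val caret = " " * (col - 1) + "^" // col is assumed to be 1-based
    s"$msg (line $line, column $col):\n$text\n$caret"
  }

  def main(args: Array[String]): Unit =
    println(formatError("Unexpected token", "error here", 1, 1))
}
```

This sketch pads the caret with spaces only, so it would misalign on lines containing tab characters; a reader that tracks tab width would need to expand tabs first.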
- CharReader.EOI: End of input
- CharReader.INDENT: Indentation level increased
- CharReader.DEDENT: Indentation level decreased
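
To make the INDENT and DEDENT markers more concrete, here is a minimal line-based sketch of the usual indentation-stack technique. It is an illustration under stated assumptions, not char_reader's implementation: `IndentSketch`, `Token`, and `tokenize` are hypothetical names, indentation is measured in leading spaces only, and blank lines are skipped.

```scala
object IndentSketch {
  sealed trait Token
  case object Indent extends Token
  case object Dedent extends Token
  final case class Line(text: String) extends Token

  // Hypothetical tokenizer: compares each line's leading-space count
  // with a stack of currently open indentation levels.
  def tokenize(src: String): List[Token] = {
    val out = List.newBuilder[Token]
    var stack = List(0) // head is the current indentation level

    for (raw <- src.split("\n") if raw.trim.nonEmpty) {
      val width = raw.takeWhile(_ == ' ').length
      if (width > stack.head) {
        // Deeper than the current level: open one new level
        stack = width :: stack
        out += Indent
      } else {
        // Shallower: close levels until the line fits.
        // This sketch does not report inconsistent dedents.
        while (width < stack.head) {
          stack = stack.tail
          out += Dedent
        }
      }
      out += Line(raw.trim)
    }

    // Close any levels still open at end of input
    while (stack.head > 0) {
      stack = stack.tail
      out += Dedent
    }
    out.result()
  }

  def main(args: Array[String]): Unit =
    tokenize("1\n a\n  b\n c\n2").foreach(println)
}
```

A character-level reader would emit these markers inline in the character stream rather than as separate line tokens, but the stack bookkeeping is the same idea.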
CharReader.fromString(text, tabs = 4) // Default tab width

// Configure comment syntax: (prefix, middle, suffix)
val pythonStyle = Some(("#", "", ""))
val cStyle = Some(("/*", "", "*/"))
val reader = CharReader.fromString(text, indentation = pythonStyle)

This project uses sbt with cross-compilation:
# Test all platforms
sbt test
# Test specific platform
sbt charReaderJVM/test
sbt charReaderJS/test
sbt charReaderNative/test
# Publish
sbt publishSigned