A Scala library for intelligent character-by-character reading with automatic indentation tracking.
char-reader provides a powerful abstraction for parsing text with significant whitespace. It automatically generates INDENT and DEDENT tokens when indentation levels change, making it ideal for parsing languages like Python, YAML, or any custom DSL that uses indentation for structure.
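To make the INDENT/DEDENT idea concrete, here is a minimal, self-contained sketch of the general technique: a stack of indentation widths, as used by Python-style tokenizers. It is not char-reader's implementation. `IndentSketch`, `Tok`, `Line`, and `tokenize` are hypothetical names chosen for illustration, and the sketch assumes space-only indentation and does not report inconsistent dedents.

```scala
object IndentSketch {
  sealed trait Tok
  case object Indent extends Tok
  case object Dedent extends Tok
  final case class Line(text: String) extends Tok

  /** Turn indented lines into a flat token stream using a stack of indentation widths. */
  def tokenize(src: String): List[Tok] = {
    val out = List.newBuilder[Tok]
    var levels = List(0) // stack of open indentation widths, innermost first

    for (raw <- src.linesIterator if raw.trim.nonEmpty) {
      val width = raw.takeWhile(_ == ' ').length

      if (width > levels.head) {
        // deeper than the current level: open a new one
        levels = width :: levels
        out += Indent
      } else {
        // shallower: close levels until the width fits
        // (a width that matches no open level is not reported in this sketch)
        while (width < levels.head) {
          levels = levels.tail
          out += Dedent
        }
      }
      out += Line(raw.trim)
    }

    // close any levels still open at end of input
    while (levels.head > 0) {
      levels = levels.tail
      out += Dedent
    }
    out.result()
  }
}
```

For the input `"1\n a\n  b\n c\n2"`, `IndentSketch.tokenize` yields `Line("1"), Indent, Line("a"), Indent, Line("b"), Dedent, Line("c"), Dedent, Line("2")`.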
Key features include:
- Automatic indentation tracking with configurable indentation styles
- Precise position tracking (line and column numbers)
- Cross-platform support (JVM, JavaScript via Scala.js, and Native)
- Rich error reporting with contextual information
- Flexible iteration over characters with lookahead capabilities
- Comment line detection and handling
Add the dependency to your build.sbt:
libraryDependencies += "io.github.edadma" %%% "char-reader" % "0.1.20"
For cross-platform projects, use %%% to automatically select the appropriate artifact.
import io.github.edadma.char_reader.CharReader
// Read from string without indentation tracking
// Declared as `var` because the loop reassigns it to the reader returned by `next`
var reader = CharReader.fromString("Hello\nWorld")
while (!reader.eoi) {
  println(s"Char: '${reader.ch}' at line ${reader.line}, column ${reader.col}")
  reader = reader.next
}
import io.github.edadma.char_reader.CharReader
val text = """
|1
| a
| b
| c
|2
""".stripMargin
// Configure comment syntax (prefix, middle, suffix)
val reader = CharReader.fromString(text, indentation = Some(("#", "", "")))
reader.iterator.foreach { r =>
  r.ch match {
    case CharReader.INDENT => println("Indentation increased")
    case CharReader.DEDENT => println("Indentation decreased")
    case CharReader.EOI => println("End of input")
    case '\n' => println("Newline")
    case c => println(s"Character: '$c'")
  }
}
val reader = CharReader.fromFile("input.txt", indentation = Some(("#", "", "")))
val reader = CharReader.fromString("function hello() {")
reader.matches("function") match {
  case Some(nextReader) => println("Found 'function' keyword")
  case None => println("Pattern not found")
}
// Consume until whitespace
val (consumed, rest) = reader.consume(_.ch.isWhitespace)
println(s"Consumed: '$consumed'")
// Consume past a delimiter
reader.consumePastDelimiter("*/") match {
  case Some((content, rest)) => println(s"Comment content: '$content'")
  case None => println("Delimiter not found")
}
val reader = CharReader.fromString("error here")
reader.error("Unexpected token") // Throws with context
Output:
Unexpected token (line 1, column 1):
error here
^
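A message with this shape can be assembled from a 1-based line and column. The sketch below is an assumption about the layout only, inferred from the output above. `ErrorSketch.format` is a hypothetical helper and not part of char-reader.

```scala
object ErrorSketch {
  /** Build "msg (line L, column C)", the offending source line, and a caret under column C.
    * `line` and `col` are 1-based.
    */
  def format(msg: String, input: String, line: Int, col: Int): String = {
    val lines = input.split("\n", -1)
    // fall back to an empty line if the position is out of range
    val text = if (line >= 1 && line <= lines.length) lines(line - 1) else ""
    // pad with spaces so the caret sits under the reported column
    val caret = " " * math.max(0, col - 1) + "^"
    s"$msg (line $line, column $col):\n$text\n$caret"
  }
}
```

Calling `ErrorSketch.format("Unexpected token", "error here", 1, 1)` reproduces the three lines shown above.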
- CharReader.EOI: End of input
- CharReader.INDENT: Indentation level increased
- CharReader.DEDENT: Indentation level decreased
CharReader.fromString(text, tabs = 4) // Default tab width
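How a tab width typically feeds into column tracking can be sketched as follows. This assumes the common convention that a tab advances to the next tab stop; whether char-reader computes columns exactly this way is not stated here. `TabSketch.columnAfter` is a hypothetical helper.

```scala
object TabSketch {
  /** 1-based column reached after reading `prefix`, with tab stops every `tabs` columns. */
  def columnAfter(prefix: String, tabs: Int = 4): Int =
    prefix.foldLeft(1) { (col, c) =>
      if (c == '\t') col + (tabs - (col - 1) % tabs) // jump to the next tab stop
      else col + 1                                   // ordinary character: one column
    }
}
```

Under this convention, with a tab width of 4, both `"\t"` and `"ab\t"` leave the reader at column 5.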
// Configure comment syntax: (prefix, middle, suffix)
val pythonStyle = Some(("#", "", ""))
val cStyle = Some(("/*", "", "*/"))
val reader = CharReader.fromString(text, indentation = pythonStyle)
This project uses sbt with cross-compilation:
# Test all platforms
sbt test
# Test specific platform
sbt charReaderJVM/test
sbt charReaderJS/test
sbt charReaderNative/test
# Publish
sbt publishSigned
This project welcomes contributions from the community. Contributions are accepted using GitHub pull requests.
For a good pull request, please provide:
- Clear description: Include the "what" and "why" of your changes
- Passing tests: Ensure existing tests pass and add new tests for new features
- Test coverage: Use sbt coverage test to generate coverage reports
- Documentation: Update README if adding new features
- Code style: Run scalafmt to maintain consistent formatting
Run the test suite:
sbt clean test
Generate coverage report:
sbt clean coverage test coverageReport
This project is licensed under the ISC License.