A table-driven precedence climbing parser for Scala 3, designed for parsing Prolog and other languages with configurable operator precedence.
- Configurable operator precedence and associativity - all Prolog fixity types (xfx, xfy, yfx, fx, fy, xf, yf)
- Configurable bracket syntax - lists, curly braces, or custom delimiters with optional tail separators
- Modular lexer extensions - composable number format readers
- Embeddable as a sublanguage - `parseExprPartial` with stop-sets lets you use the precedence climber inside a larger parser
- Cross-platform - JVM, JavaScript (Scala.js), and Native (Scala Native)
```scala
libraryDependencies += "io.github.edadma" %%% "recursive_descent_parser" % "0.0.2"
```

Use the included `PrologConfig` for parsing Prolog expressions:
```scala
import io.github.edadma.recursive_descent_parser.*

val lexer  = PrologConfig.lexer
val parser = PrologConfig.parser

val result = parser.parse("foo(X, Y) :- bar(X), baz(Y)", lexer)
println(result) // :-(foo(X,Y),,(bar(X),baz(Y)))
```

```scala
parser.parse("1 + 2 * 3", lexer)    // +(1,*(2,3))
parser.parse("(1 + 2) * 3", lexer)  // *(+(1,2),3)
```

```scala
parser.parse("[1, 2, 3]", lexer)    // .(1,.(2,.(3,[])))
parser.parse("[H|T]", lexer)        // .(H,T)
parser.parse("[1, 2|Rest]", lexer)  // .(1,.(2,Rest))
```

```scala
parser.parse("-X", lexer)           // -(X)       prefix
parser.parse("X = Y", lexer)        // =(X,Y)     infix
parser.parse("\\+ foo(X)", lexer)   // \+(foo(X)) prefix negation
parser.parse("a :- b, c", lexer)    // :-(a,,(b,c))
```

```scala
parser.parse("0'a", lexer)        // 97      character code
parser.parse("0xFF", lexer)       // 255     hexadecimal
parser.parse("0b1010", lexer)     // 10      binary
parser.parse("0o77", lexer)       // 63      octal
parser.parse("1_000_000", lexer)  // 1000000 underscores
```

Create your own parser configuration for different languages:
```scala
import io.github.edadma.recursive_descent_parser.*

// Define operators with precedence and fixity
val myOperators: Map[String, List[OpEntry]] = Map(
  "+" -> List(OpEntry(500, Fixity.Yfx)),                          // left-associative
  "*" -> List(OpEntry(400, Fixity.Yfx)),                          // higher precedence, i.e. binds tighter (lower number)
  "^" -> List(OpEntry(200, Fixity.Xfy)),                          // right-associative
  "-" -> List(OpEntry(500, Fixity.Yfx), OpEntry(200, Fixity.Fy)), // infix and prefix
)

// Define bracket syntax
val myBrackets: List[BracketConfig] = List(
  BracketConfig("[", "]", ",", Some("|"), myListBuilder), // myListBuilder: your term-construction callback
)

val myLexer = Lexer(
  symbols = myOperators.keySet ++ Set("(", ")", "[", "]", ",", "|"),
  numberReaders = List(HexReader, BinaryReader), // pick what you need
)

val myParser = Parser(myOperators, myBrackets)
```

Use `parseExprPartial` to embed the precedence climber inside a larger recursive descent parser. It returns the parsed term and the remaining token stream, stopping at tokens specified in the stop-set:
```scala
val exprParser = Parser(myOperators, myBrackets)

def parseIf(tokens: LazyList[Token]): (IfExpr, LazyList[Token]) =
  val toks = tokens.tail // consume 'if'
  val (cond, rest1) = exprParser.parseExprPartial(toks, stopAt = Set("then"))
  val rest2 = consumeKeyword(rest1, "then")
  val (body, rest3) = exprParser.parseExprPartial(rest2, stopAt = Set("else", ";"))
  // ...
```

The stop-set is context-dependent: each call site specifies which tokens delimit the expression in that syntactic position. Stop tokens are not consumed, so the outer parser can inspect them. Tokens inside parentheses, brackets, and functor arguments are unaffected by the stop-set, since their delimiters naturally stop expression parsing.
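The nesting-awareness in that last point can be sketched in isolation. The following is a hypothetical illustration, not the library's implementation (which parses as it climbs rather than pre-splitting tokens): a helper that slices a flat token list at the first *top-level* stop token, leaving the stop token itself unconsumed.

```scala
// Sketch: split a token list at the first stop token that occurs at
// nesting depth 0. Delimiters inside ( ) or [ ] never terminate the
// expression; the stop token is left at the head of the remainder.
def splitAtStop(tokens: List[String], stopAt: Set[String]): (List[String], List[String]) =
  val opens  = Set("(", "[")
  val closes = Set(")", "]")
  var depth  = 0
  val expr   = List.newBuilder[String]
  var rest   = tokens
  var done   = false
  while !done && rest.nonEmpty do
    val t = rest.head
    if depth == 0 && stopAt(t) then done = true // stop token not consumed
    else
      if opens(t) then depth += 1
      else if closes(t) then depth -= 1
      expr += t
      rest = rest.tail
  (expr.result(), rest)
```

Note how `then` inside the parentheses of `f ( a then b )` is passed through, while the same token at depth 0 ends the expression.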
| Fixity | Type | Associativity | Example |
|---|---|---|---|
| `Xfx` | infix | non-associative | `a = b` (but not `a = b = c`) |
| `Xfy` | infix | right-associative | `a ; b ; c` → `a ; (b ; c)` |
| `Yfx` | infix | left-associative | `a + b + c` → `(a + b) + c` |
| `Fx` | prefix | non-associative | `not x` |
| `Fy` | prefix | associative | `- - x` → `- (- x)` |
| `Xf` | postfix | non-associative | `x !` |
| `Yf` | postfix | associative | `x ! !` → `(x !) !` |
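Under the Prolog priority convention the examples above follow (lower number binds tighter), each infix fixity translates into argument-priority bounds in the climbing loop: an operator of priority `p` admits a left argument of priority up to `p` for `yfx` but only `p - 1` for `xfx`/`xfy`, and a right argument up to `p` for `xfy` but `p - 1` otherwise. A minimal self-contained sketch of that rule (hypothetical names, not the library's internals):

```scala
// Sketch: infix-only precedence climbing over pre-split tokens,
// using Prolog-style priorities (lower number = binds tighter).
enum Fix:
  case Xfx, Xfy, Yfx

case class Op(prio: Int, fix: Fix)

val ops = Map("=" -> Op(700, Fix.Xfx), "+" -> Op(500, Fix.Yfx), "^" -> Op(200, Fix.Xfy))

def climb(toks: List[String], maxPrio: Int): (String, List[String]) =
  var left     = toks.head // primary: any bare token is an operand
  var leftPrio = 0         // operands have priority 0
  var rest     = toks.tail
  var done     = false
  while !done do
    rest match
      case op :: more if ops.contains(op) =>
        val Op(p, fix) = ops(op)
        val leftMax  = if fix == Fix.Yfx then p else p - 1 // 'y' side allows p, 'x' side p - 1
        val rightMax = if fix == Fix.Xfy then p else p - 1
        if p <= maxPrio && leftPrio <= leftMax then
          val (right, rest2) = climb(more, rightMax)
          left = s"$op($left,$right)"
          leftPrio = p
          rest = rest2
        else done = true
      case _ => done = true
  (left, rest)
```

With this table, `climb` left-associates `+`, right-associates `^`, and refuses to chain the non-associative `=` (the second `=` is left unconsumed for the caller to report).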
Modular number format readers can be composed as needed:
```scala
val lexer = Lexer(
  symbols = ...,
  numberReaders = List(
    CharCodeReader, // 0'a    -> 97
    HexReader,      // 0xFF   -> 255
    BinaryReader,   // 0b1010 -> 10
    OctalReader,    // 0o77   -> 63
  ),
  allowUnderscoreInNumbers = true, // 1_000_000
)
```

The parser produces `Term` values:
```scala
enum Term:
  case Atom(pos: CharReader, name: String)
  case Var(pos: CharReader, name: String)
  case Integer(pos: CharReader, value: Long)
  case Float(pos: CharReader, value: Double)
  case Str(pos: CharReader, value: String)
  case Compound(pos: CharReader, functor: String, args: List[Term])
```

Each term carries its source position for error reporting.
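The canonical-form strings shown in the examples earlier (e.g. `+(1,*(2,3))`) can be produced by a straightforward recursive printer. Here is a simplified sketch with a position-free term type; the names `T`, `Num`, and `show` are illustrative, and the real `Term` additionally carries its `pos` field.

```scala
// Sketch: a position-free term type and a canonical-form printer
// matching the functor(arg,...,arg) notation used in this README.
enum T:
  case Atom(name: String)
  case Num(value: Long)
  case Compound(functor: String, args: List[T])

def show(t: T): String = t match
  case T.Atom(n)        => n
  case T.Num(v)         => v.toString
  case T.Compound(f, a) => a.map(show).mkString(s"$f(", ",", ")")
```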
ISC License - see LICENSE for details.