Macro PEG extends Parsing Expression Grammars with macro-like rules and is implemented in Scala 3. It supports lambda-style macros so you can build higher-order grammars.
Whitespace is omitted in the grammar below.
Grammar <- Definition* ";"
Definition <- Identifier ("(" Arg ("," Arg)* ")")? "=" Expression ";"
Arg <- Identifier (":" Type)?
Type <- RuleType / "?"
RuleType <- ("(" Type ("," Type)* ")" "->" Type)
/ (Type "->" Type)
Expression <- Sequence ("/" Sequence)*
Sequence <- Prefix+
Prefix <- ("&" / "!") Suffix
/ Suffix
Suffix <- Primary "?"
/ Primary "*"
/ Primary "+"
/ Primary
Primary <- "(" Expression ")"
/ Call
/ Debug
/ Identifier
/ StringLiteral
/ CharacterClass
/ Lambda
Call <- Identifier "(" Expression ("," Expression)* ")"
Debug <- "Debug" "(" Expression ")"
Lambda <- "(" Identifier ("," Identifier)* "->" Expression ")"
StringLiteral <- '"' (!'"' .)* '"'
CharacterClass <- '[' '^'? (!']' .)+ ']'
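
To illustrate what the Call and Lambda rules enable, here is a minimal sketch of a macro `Double(f, s) = f(f(s))` applied to the lambda `(x -> x x)` and the literal `"aa"`. It is a hand-rolled model, not the macro_peg API: the names `LambdaMacroSketch`, `P`, `lit`, `seq`, `eof`, `twice`, and `double` are all hypothetical, and a parser is assumed to be a plain function from the remaining input to the input left after a match.

```scala
// Hypothetical model, not the macro_peg API.
object LambdaMacroSketch:
  // A parser takes the remaining input and returns what is left after a match,
  // or None on failure.
  type P = String => Option[String]

  // Match a literal string at the start of the input.
  def lit(s: String): P = in => if in.startsWith(s) then Some(in.drop(s.length)) else None

  // Sequence: run a, then b on whatever a left over.
  def seq(a: P, b: P): P = in => a(in).flatMap(b)

  // Models `!.` (succeeds only at end of input).
  val eof: P = in => if in.isEmpty then Some(in) else None

  // The lambda macro `x -> x x`.
  val twice: P => P = x => seq(x, x)

  // The macro rule `Double(f, s) = f(f(s))`.
  def double(f: P => P, s: P): P = f(f(s))

  // S = Double((x -> x x), "aa") !.
  val s: P = seq(double(twice, lit("aa")), eof)

  def accepts(input: String): Boolean = s(input).isDefined
```

Expanding by hand, `f(s)` is `"aa" "aa"` and `f(f(s))` is that sequence repeated twice, so in this model `S` matches exactly eight `a` characters.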
- Macro rules with parameters
- Lambda macros for higher-order grammars
- Type annotations for macro parameters
- Multiple evaluation strategies (call by name, call by value sequential/parallel)
- Parser combinator library (MacroParsers)
- Scala 3 inline macro API InlineMacroParsers.mpeg (compile-time grammar validation, strategy selection)
- Rich diagnostics via Diagnostic (parse, well-formedness, type-check, evaluation, generation)
- Static grammar validation (GrammarValidator) for undefined references, nullable repetition, and left recursion
- Packrat-style memoization in evaluator (evaluateWithDiagnostics)
- Parser generator backend (codegen.ParserGenerator) for first-order grammars, with interpreter-backed fallback for higher-order grammars
- Combinator ergonomics: label, cut, recover, trace, and formatted failures
- Debug expressions for inspecting matches
- Ruby parser (ruby.RubyParser) achieving 100% parse success on the upstream Ruby test corpus (302/302 files), with full AST (ruby.RubyAst)
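
The difference between call by name and sequential call by value can be sketched with the rule `F(Open, Delim, NL) = Delim`. This is a simplified model built on an assumed reading of the two strategies, not the library's evaluator: under call by name the body re-runs the argument expression, while under sequential call by value the arguments are matched in order at the call site and each parameter then stands for the text its argument matched. All names (`StrategySketch`, `P`, `lit`, `upper1`, `fByName`, `fByValueSeq`) are hypothetical.

```scala
// Hypothetical model of two evaluation strategies; not the macro_peg evaluator.
object StrategySketch:
  // A parser returns (matched text, remaining input), or None on failure.
  type P = String => Option[(String, String)]

  // Match a literal string at the start of the input.
  def lit(s: String): P = in =>
    if in.startsWith(s) then Some((s, in.drop(s.length))) else None

  // Models the character class repetition [A-Z]+.
  val upper1: P = in =>
    val m = in.takeWhile(c => c >= 'A' && c <= 'Z')
    if m.nonEmpty then Some((m, in.drop(m.length))) else None

  // Call by name: the body `Delim` is the argument expression itself,
  // so Open and NL are never run.
  def fByName(openP: P, delimP: P, nlP: P): P = delimP

  // Call by value (sequential): match the arguments in order, consuming input,
  // then run the body with Delim bound to the exact text it matched.
  def fByValueSeq(openP: P, delimP: P, nlP: P): P = in =>
    for
      (_, r1) <- openP(in)
      (d, r2) <- delimP(r1)
      (_, r3) <- nlP(r2)
      res     <- lit(d)(r3) // the body `Delim` now means "the same text again"
    yield res
```

In this model, `fByValueSeq(lit("<<"), upper1, lit("\n"))` accepts `"<<EOS\nEOS"` but rejects `"<<EOS\nEND"`, which is the kind of delimiter capture a heredoc needs.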
Add the library to your build.sbt:
libraryDependencies += "com.github.kmizu" %% "macro_peg" % "0.1.1-SNAPSHOT"

Then parse and evaluate a grammar:
import com.github.kmizu.macro_peg._
val grammar = Parser.parse("""
S = Double((x -> x x), "aa") !.;
Double(f: ?, s: ?) = f(f(s));
""")
val evaluator = Evaluator(grammar)
val result = evaluator.evaluate("aaaaaaaa", Symbol("S"))
println(result)

For typed diagnostics and safe construction:
val interpreterEither = Interpreter.fromSourceEither("""S = "ab";""")
val resultEither = interpreterEither.flatMap(_.evaluateEither("ac"))

For compile-time checked grammar (Scala 3 inline macro):
import com.github.kmizu.macro_peg.InlineMacroParsers._
import com.github.kmizu.macro_peg.EvaluationStrategy
val parser = mpeg("""S = "ab" !.;""")
assert(parser.accepts("ab"))
// Useful for dynamic delimiter capture patterns (scannerless, no external lexer state).
val parser2 = mpeg(
  """S = F("<<", [A-Z]+, "\n") !.; F(Open, Delim, NL) = Delim;""",
  strategy = EvaluationStrategy.CallByValueSeq
)

For generated parser source code from a first-order grammar:
import com.github.kmizu.macro_peg.codegen.ParserGenerator
val source = ParserGenerator.generateFromSource("""S = "a" "b";""")

| Language | Coverage | Approach |
|---|---|---|
| Ruby 3.x | 302/302 files | Combinator (RubyParser, full AST) + Generated (GeneratedRubyParser, error reporting) |
| Python | planned |
Ruby
Full AST (ruby.RubyAst), covering classes, modules, methods, blocks, pattern matching (case/in), heredocs, string interpolation, regex, percent literals, operator precedence, assignment variants, and more.
import com.github.kmizu.macro_peg.ruby.RubyParser
val astEither = RubyParser.parse("""class User; def greet(name); "hi"; end; end""")

The generated parser (GeneratedRubyParser) accepts or rejects Ruby source with structured error reporting (line:col, expected token, and rule stack).
import com.github.kmizu.macro_peg.ruby.GeneratedRubyParser
GeneratedRubyParser.parseAll("x = 1") match {
  case Right(_)  => println("ok")
  case Left(msg) => println(msg) // e.g. "parse error at 1:5\nexpected: ..."
}

To fetch the upstream Ruby test corpus with a sparse checkout:

mkdir -p third_party/ruby3/upstream
git clone --depth 1 --filter=blob:none --sparse https://github.com/ruby/ruby.git third_party/ruby3/upstream/ruby
cd third_party/ruby3/upstream/ruby
git sparse-checkout set test/ruby bootstraptest test/prism

# Combinator parser
sbt "runMain com.github.kmizu.macro_peg.ruby.RubyCorpusRunner"
# Generated parser
sbt "runMain com.github.kmizu.macro_peg.ruby.GeneratedRubyCorpusRunner"

Optional environment variables:
- RUBY_CORPUS_TIMEOUT_MS (default: 5000)
- RUBY_CORPUS_FAIL_SAMPLES (default: 20)
- RUBY_CORPUS_FULL_ERROR (1 to print full formatted failures; default: first line only)
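
As a rough sketch of how a runner might interpret these variables, the following reads them from an environment map and falls back to the documented defaults. It is an illustration only: `CorpusEnvSketch`, `Settings`, and `fromEnv` are hypothetical names, and the actual runners may parse the values differently (for example, this sketch silently falls back to the default when a value is not a valid number).

```scala
// Hypothetical helper; not part of the macro_peg corpus runners.
object CorpusEnvSketch:
  final case class Settings(timeoutMs: Long, failSamples: Int, fullError: Boolean)

  // Build settings from an environment map, using the documented defaults
  // when a variable is missing or not a valid number.
  def fromEnv(env: Map[String, String]): Settings =
    Settings(
      timeoutMs   = env.get("RUBY_CORPUS_TIMEOUT_MS").flatMap(_.toLongOption).getOrElse(5000L),
      failSamples = env.get("RUBY_CORPUS_FAIL_SAMPLES").flatMap(_.toIntOption).getOrElse(20),
      // Only the exact value "1" turns on full formatted failures.
      fullError   = env.get("RUBY_CORPUS_FULL_ERROR").contains("1")
    )
```

Passing `sys.env` to `fromEnv` would pick up the values from the real process environment; taking a map as a parameter keeps the function easy to test.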
Python (planned)
Coming soon.
- Introduce backreference as evalCC method.
- pfun -> delayedParser, which is better naming than before (breaking change)
- More accurate ParseException
- EvaluationException is thrown when the arity of a function does not match the number of arguments passed.
- Improved Parser
Execute the following command:
sbt test

This project is released under the MIT License.