Parsley is a fast and modern parser combinator library for Scala based loosely on a Haskell-style parsec API.
Parsley is distributed on Maven Central, and can be added to your project via:
```scala
// SBT
libraryDependencies += "com.github.j-mie6" %% "parsley" % "4.6.0"

// scala-cli
--dependency com.github.j-mie6::parsley:4.6.0
// or in file
//> using dep com.github.j-mie6::parsley:4.6.0

// mill
ivy"com.github.j-mie6::parsley:4.6.0"
```

Documentation can be found here
If you're a cats user, you may also be interested in using parsley-cats to augment
parsley with instances for various cats typeclasses:
```scala
libraryDependencies += "com.github.j-mie6" %% "parsley-cats" % "1.5.0"
```

```scala
scala> import parsley.Parsley
scala> import parsley.syntax.character.{charLift, stringLift}

scala> val hello: Parsley[Unit] = ('h' ~> ("ello" | "i") ~> " world!").void
scala> hello.parse("hello world!")
val res0: parsley.Result[String,Unit] = Success(())
scala> hello.parse("hi world!")
val res1: parsley.Result[String,Unit] = Success(())
scala> hello.parse("hey world!")
val res2: parsley.Result[String,Unit] =
Failure((line 1, column 2):
  unexpected "ey"
  expected "ello"
  >hey world!
    ^^)

scala> import parsley.character.digit
scala> val natural: Parsley[Int] = digit.foldLeft1(0)((n, d) => n * 10 + d.asDigit)
scala> natural.parse("0")
val res3: parsley.Result[String,Int] = Success(0)
scala> natural.parse("123")
val res4: parsley.Result[String,Int] = Success(123)
```

For more see the Wiki!
Compared with Haskell's parsec, this library is mostly quite similar. However, due to Scala's differences in operator characters, a few operators are changed:
- `(<$>)` is known as `map`
- `try` is known as `attempt`
- `($>)` is known as either `#>` or `as`
In addition, lift2 and lift3 are uncurried in this library: this is to provide better performance and easier usage with
Scala's traditionally uncurried functions. There are also a few new operators in general to be found here!
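
As a rough illustration of the renamings above, the sketch below pairs a few Scala definitions with their approximate Haskell counterparts in comments. It assumes the `//> using dep` line is run with scala-cli, that `lift2` lives in `parsley.lift`, and that its function argument comes first; the object and value names are hypothetical.

```scala
//> using dep com.github.j-mie6::parsley:4.6.0
import parsley.Parsley
import parsley.character.{char, digit}
import parsley.lift.lift2

object HaskellStyle {
  // Haskell: digitToInt <$> digit
  val digitValue: Parsley[Int] = digit.map(_.asDigit)

  // Haskell: char 't' $> True
  val yes: Parsley[Boolean] = char('t').as(true)

  // Haskell (curried): liftA2 (+) digitValue (char '+' *> digitValue)
  // Here the function and both parsers are passed together, uncurried.
  val sum: Parsley[Int] =
    lift2[Int, Int, Int](_ + _, digitValue, char('+') ~> digitValue)
}
```

The explicit type arguments on `lift2` are there only so that `_ + _` type-checks without further annotations.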
Parsley is a modern parser combinator library, which strives to be on the bleeding-edge of parser combinator library design. This means that improvements will come naturally over time. Feel free to suggest improvements for consideration, as well as high-level problems you commonly encounter that we may be able to find a way to mitigate (see the Design Patterns for Parser Combinators paper for example!).
Part of innovation is being willing to admit
design mistakes and rectify them: when a binary-breaking release is made, the
opportunity may be taken to polish parts of the library's API that are clunky, or
could be better organised or improved. For example, see the differences between
parsley-3.3.10 and parsley-4.0.0! However, constant breaking changes are
not a good way to encourage the use of a library as users often want stability:
to that end, annoyances and bugbears with the API are only addressed
approximately yearly, and the frequency of these will decrease over time.
For future major releases, care will be taken to, wherever possible, publish
all patch-level changes in a final version to the previous major.minor
version, and then all minor-level changes as a final major.(minor+1).0
version before releasing the major-level changes as (major+1).0.0: this will
allow users stuck on the old version to benefit as much as possible from the
fixes and new functionality.
As of 4.0.0, parsley is strictly committed to early-semver, which means
that the version numbers are significant:
- Two versions `x._._` and `y._._` with `x != y` are incompatible with each other at a binary level: having `x._._` on the classpath with code compiled against `y._._` will most likely result in a linkage error at runtime.
- Two versions `a.x._` and `a.y._` with `x <= y` are binary compatible, which means that code compiled against `a.x._` will still work with `a.y._` on the classpath. A "source" component `y > x` indicates that `a.y._` has added or deprecated functionality since `a.x._`.
- Two versions `a.b.x` and `a.b.y` are binary and source compatible, which means there are no compatibility concerns between the two versions. Code compiled against `a.b.x` will run with `a.b.y` on the classpath and vice versa. A "patch" component `y > x` indicates that `a.b.y` fixes issues (bugs or poor performance) with `a.b.x`.
In short, if you are on version `a.x.y`, you can upgrade to version `a.x.z`
with `z > y` without worry, and upgrade to `a.z._` with `z > x` with a
possible (but rare) need to make minor updates to your code. Occasionally,
a "source" component bump may deprecate functionality, but it will provide a
migration to tell you how to avoid the deprecation warning. Altered or deprecated
functionality may be hidden from the public API in a binary backwards-compatible
way in a "source" bump, and may therefore require updating when
recompiled; this will be done sparingly and with minimal disruption so as not to
discourage updating the library, and any immediate migration changes to user
code from `a.x._` to any `a.y._` with `y > x` will be documented in
`a.y._`'s release.
Note: all functionality marked as `private[parsley]` or within
the `parsley.internal` package does not adhere to early-semver and may be
removed or changed at will with no impact on regular/intended use of the
library.
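
The compatibility rules above can be sketched as a pair of predicates. This is only an illustration of the rules as stated, not part of Parsley: `Version` and `EarlySemVer` are hypothetical names, and the sketch assumes a major version of at least 1 and ignores pre-release suffixes.

```scala
// Hypothetical model of a major.minor.patch version number.
final case class Version(major: Int, minor: Int, patch: Int)

object EarlySemVer {
  // Code compiled against `built` is expected to link when `onClasspath`
  // is the version present at runtime: same major, and minor not older.
  def binaryCompatible(built: Version, onClasspath: Version): Boolean =
    built.major == onClasspath.major && built.minor <= onClasspath.minor

  // Same major and minor: binary and source compatible in both directions,
  // so the patch component plays no part in the check.
  def sourceCompatible(a: Version, b: Version): Boolean =
    a.major == b.major && a.minor == b.minor
}
```

For instance, code built against `4.5.3` passes `binaryCompatible` with `4.6.0` on the classpath, but not the other way round, and nothing built against `3.3.10` passes with any `4._._`.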
Occasionally, a minor (source) release will contain either a significant body of new work, or a significant rework of some internal machinery. In these cases additional versioning may be employed:
- Experimental (and volatile) new functionality may be iterated with `a.b.0-Mn` versions: these are (hopefully) working pre-release versions of the functionality, subject to even binary-incompatible changes between `M` versions. When the new API and behaviour become stable, the release graduates to the `a.b.0-RC1` release candidate.
- Release candidates are used to iron out any lingering issues with a minor release and potentially alter the finer points of the new functionality's behaviour. Binary compatibility will be preserved between `RCx` and `RCy` with `y > x` except within truly exceptional circumstances.
- Finally, the release makes it to `a.b.0` and is hopefully truly stable.
Old versions of the library may still be given important bug fixes after they have been made obsolete by a new release. In exceptional circumstances, performance problems may be addressed for old versions. The lifetime policy is as follows:
- A major (binary) version reaches EoL a minimum of 6 months after its successor is released, unless an extension to its life is requested via an issue.
- A minor (source) version reaches EoL immediately on the release of its successor, unless its successor issued deprecations, in which case it reaches EoL after a minimum of 3 months.
Some more minor bug fixes may not be ported to previous versions if (a) the bug does not appear in that version or (b) the code has changed too much internally to make porting feasible.
An exception to this policy is made for any version 3.x.y, which reaches EoL effective immediately (December 2022) excluding exceptional circumstances.
| Version | Released On | EoL Status |
|---|---|---|
| `3.3.0` | 7th January 2022 | EoL reached (`3.3.10`) |
| `4.0.0` | 30th November 2022 | EoL reached (`4.0.4`) |
| `4.1.0` | 18th January 2023 | EoL reached (`4.1.8`) |
| `4.2.0` | 22nd January 2023 | EoL reached (`4.2.14`) |
| `4.3.0` | 8th July 2023 | EoL reached (`4.3.1`) |
| `4.4.0` | 6th October 2023 | EoL reached (`4.4.1`) |
| `4.5.0` | 6th January 2024 | EoL reached (`4.5.3`) |
| `4.6.0` | 15th February 2025 | Enjoying indefinite support |
If you encounter a bug when using Parsley, try to minimise the example of the parser (and the input) that triggers the bug. If possible, make a self-contained example: this will help to identify the issue without too much difficulty.
Parsley represents parsers as an abstract syntax tree (AST), which is constructed lazily. As a result, Parsley is able to perform analysis and optimisations on your parsers, which helps reduce the burden on you, the programmer. This representation is then compiled into a lightweight stack-based instruction set designed to run fast on the JVM. This is what gives Parsley its competitive performance, but for best effect a parser should be compiled once and used many times (so-called hot execution).
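
One way to follow the "compile once, use many times" advice is to keep each parser in a `val` and reuse that same instance for every input, rather than rebuilding it per call. The sketch below assumes scala-cli for the dependency line; `Numbers` and `parseAll` are hypothetical names, and `natural` is the parser from the examples earlier.

```scala
//> using dep com.github.j-mie6::parsley:4.6.0
import parsley.{Parsley, Result}
import parsley.character.digit

object Numbers {
  // Defined once as a val, so every call below shares the same parser
  // instance and any work done to prepare it need not be repeated.
  val natural: Parsley[Int] =
    digit.foldLeft1(0)((n, d) => n * 10 + d.asDigit)

  // Reuses `natural` for each input instead of constructing a new parser.
  def parseAll(inputs: List[String]): List[Result[String, Int]] =
    inputs.map(in => natural.parse(in))
}
```

Defining `natural` with `def` instead would build a fresh parser on every access, which is the pattern this advice is meant to avoid.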
To make recursive parsers work in this AST format, you must ensure that recursion is done by knot-tying: you should define all
recursive parsers with val and introduce lazy val where necessary for the compiler to accept the definition.
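
A minimal sketch of knot-tying, assuming scala-cli for the dependency line: the parser below refers to itself by name, which works because the reference appears as an argument to a combinator and is expected to be evaluated lazily. `Nesting` and `depth` are hypothetical names; `lazy val` is used here so the definition is also accepted where a plain `val` would be rejected, for example as a local definition.

```scala
//> using dep com.github.j-mie6::parsley:4.6.0
import parsley.Parsley
import parsley.Parsley.pure
import parsley.character.char

object Nesting {
  // Counts how deeply a run of balanced parentheses is nested.
  // The recursive use of `depth` is a reference to the same value,
  // not a call that builds a new parser each time.
  lazy val depth: Parsley[Int] =
    (char('(') ~> depth <~ char(')')).map(_ + 1) | pure(0)
}
```

With this definition, `Nesting.depth.parse("(())")` should succeed with `2`, while `"(()"` fails because the closing parenthesis is missing after input has already been consumed.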
- This work is based on my Master's Thesis (2018) which can be found here
- This work spawned a paper at the Scala Symposium at ICFP 2018: Garnishing Parsec with Parsley
- This work supports the patterns introduced at the Scala Symposium in 2022: Design Patterns for Parser Combinators in Scala