s_mach.string is an open-source Scala library that provides primitives for string manipulation beyond the standard library. It supports token-level string manipulation using an arbitrary token definition (e.g. tokens delimited by whitespace, underscores, or camel-case transitions) and provides a variety of find-replace methods.
Add to build.sbt:
libraryDependencies += "net.s_mach" %% "string" % "2.1.0"
Note: s_mach.string is cross compiled for Scala 2.11/JDK6 and 2.12/JDK8.
s_mach.string uses semantic versioning (http://semver.org/). s_mach.string does not use the package private modifier. Instead, all code files outside of the s_mach.string.impl package form the public interface and are governed by the rules of semantic versioning. Code files inside the s_mach.string.impl package may be used by downstream applications and libraries. However, no guarantees are made as to the stability or interface of code in the s_mach.string.impl package between versions.
- Separate tokens and delimiters using Lexers
  - Import one of the provided implicit Lexers, such as:
    - import Lexer.Whitespace
    - import Lexer.Underscore
    - import Lexer.CamelCase
  - Or create a custom RegexCharLexer or RegexCharTransitionLexer
  - Or provide a completely custom token-separating implementation
  - Separate but preserve the delimiters between tokens so the original string's structure can be reconstructed (mapTokens, LexResult)
- Find-replace using a variety of input types
  - Search by regex, String, or Seq[String]
  - Replace by function (Match => String) or String literal
  - Match only tokens
  - Match case sensitive or insensitive
- Adds utility String methods:
  - String.ensureSuffix
  - String.mapTokens
  - String.collapseDelims
  - String.collapseWhitespace
  - String.toProperCase
  - String.toTitleCase
  - String.toCamelCase
  - String.toPascalCase
  - String.toSnakeCase
  - String.toTokens
  - String.lex
  - String.indent
  - String.toOption
  - String.toDoubleOpt
  - String.toLongOpt
  - String.toIntOpt
  - String.convert
  - IndexedSeq[IndexedSeq[String]].printGrid
- Extract and create char group regex patterns using CharGroup, CharGroupPattern & CharGroupRegex
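To make the "separate but preserve the delimiters" idea concrete, here is a minimal sketch that uses only the Scala standard library. It is not s_mach.string's implementation: the names LexSketch, Piece, Token, Delim and the lex/mapTokens signatures below are hypothetical stand-ins chosen for illustration, and the delimiter is assumed to be a regex that never matches the empty string.

```scala
import scala.util.matching.Regex

// Hypothetical illustration only; NOT the s_mach.string API or implementation.
object LexSketch {
  // A lexed piece of the input: either a token or the delimiter text between tokens.
  sealed trait Piece { def text: String }
  final case class Token(text: String) extends Piece
  final case class Delim(text: String) extends Piece

  // Split `s` into tokens and delimiters, in input order.
  // `delim` is assumed to match non-empty delimiter runs (e.g. "\\s+").
  def lex(s: String, delim: Regex): List[Piece] = {
    val out = List.newBuilder[Piece]
    var last = 0
    for (m <- delim.findAllMatchIn(s)) {
      if (m.start > last) out += Token(s.substring(last, m.start))
      out += Delim(m.matched)
      last = m.end
    }
    if (last < s.length) out += Token(s.substring(last))
    out.result()
  }

  // Transform tokens only; delimiters are copied through unchanged,
  // so the original spacing survives the transformation.
  def mapTokens(s: String, delim: Regex)(f: String => String): String =
    lex(s, delim).map {
      case Token(t) => f(t)
      case Delim(d) => d
    }.mkString

  def main(args: Array[String]): Unit = {
    // The double space between "the" and "rain" is preserved.
    println(mapTokens("the  rain in spain", "\\s+".r)(_.capitalize)) // The  Rain In Spain
  }
}
```

Because every delimiter is kept as its own piece, concatenating the pieces always reproduces the input exactly, which is what makes a token-only transformation safe.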
[info] Starting scala interpreter...
[info]
Welcome to Scala version 2.11.1 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_72).
Type in expressions to have them evaluated.
Type :help for more information.
scala> import s_mach.string._
import s_mach.string._
scala> :paste
// Entering paste mode (ctrl-D to finish)
val sentence = "the_rain_in__spain falls on the Plain"
// Split on whitespace
val example1 = {
  import Lexer.Whitespace
  sentence.toTokens.toList
}
// Split on underscore
val example2 = {
  import Lexer.Underscore
  sentence.toTokens.toList
}
// Split on whitespace or underscore
val example3 = {
  import Lexer.WhitespaceOrUnderscore
  sentence.toTokens.toList
}
// Exiting paste mode, now interpreting.
sentence: String = the_rain_in__spain falls on the Plain
example1: List[String] = List(the_rain_in__spain, falls, on, the, Plain)
example2: List[String] = List(the, rain, in, spain falls on the Plain)
example3: List[String] = List(the, rain, in, spain, falls, on, the, Plain)
scala> :paste
// Entering paste mode (ctrl-D to finish)
// Find-replace on tokens (delimited by whitespace), case insensitive
val example4 = {
  import Lexer.Whitespace
  sentence.findReplaceTokens(Seq(("spain", "france"), ("plain", "savanna")), caseSensitive = false)
}
// Find-replace on tokens (delimited by whitespace or underscore), case sensitive
val example5 = {
  import Lexer.WhitespaceOrUnderscore
  sentence.findReplaceTokens(Seq(("spain", "france"), ("plain", "savanna")), caseSensitive = true)
}
// Exiting paste mode, now interpreting.
example4: String = the_rain_in__spain falls on the savanna
example5: String = the_rain_in__france falls on the Plain
scala> :paste
// Entering paste mode (ctrl-D to finish)
// Find each regex match and append '!' to it
val example6 = {
  sentence.findRegexReplaceMatch(Seq(("[a-z]*ain".r, { m => m.toString + "!" })))
}
// Exiting paste mode, now interpreting.
example6: String = the_rain!_in__spain! falls on the Plain!
scala>
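For comparison, the replace-by-function idea in example6 can be approximated with the standard library alone. The sketch below uses scala.util.matching.Regex directly rather than s_mach.string; the object name FindReplaceSketch and its members are hypothetical. Note that the second replacement matches substrings, not whole tokens, so it is only a rough analogue of findReplaceTokens.

```scala
import scala.util.matching.Regex

// Standard-library-only illustration; NOT the s_mach.string API.
object FindReplaceSketch {
  val sentence = "the_rain_in__spain falls on the Plain"

  // Replace by function (Match => String): append '!' to every match.
  // quoteReplacement keeps '$' and '\' in the result from being
  // interpreted as group references.
  val shouted: String =
    "[a-z]*ain".r.replaceAllIn(sentence, m => Regex.quoteReplacement(m.matched + "!"))

  // Replace by String literal, case insensitive. This matches substrings,
  // not whole tokens.
  val relocated: String =
    ("(?i)" + Regex.quote("plain")).r.replaceAllIn(sentence, Regex.quoteReplacement("savanna"))

  def main(args: Array[String]): Unit = {
    println(shouted)   // the_rain!_in__spain! falls on the Plain!
    println(relocated) // the_rain_in__spain falls on the savanna
  }
}
```

The token-aware methods shown in the session above avoid the substring caveat by matching against lexed tokens rather than raw character positions.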
