s_mach.string is an open-source Scala library that provides primitives for string manipulation beyond the standard library. s_mach.string supports word-level string manipulations using an arbitrary word definition (e.g. delimited by whitespace, camel case, etc). s_mach.string also provides a variety of find-replace methods.

s_mach.string: String utility library

Build Status Test Coverage Codacy Badge Scaladocs: 2.11 2.12

s_mach.string is an open-source Scala library that provides primitives for string manipulation beyond the standard library. s_mach.string supports token-level string manipulations using an arbitrary token definition (e.g. delimited by whitespace, camel case, etc). s_mach.string also provides a variety of find-replace methods.

Include in SBT

  1. Add to build.sbt

    libraryDependencies += "net.s_mach" %% "string" % "2.1.0"
    Note
    s_mach.string is cross compiled for Scala 2.11/JDK6 and 2.12/JDK8

Versioning

s_mach.string uses semantic versioning (http://semver.org/). s_mach.string does not use the package private modifier. Instead, all code files outside of the s_mach.string.impl package form the public interface and are governed by the rules of semantic versioning. Code files inside the s_mach.string.impl package may be used by downstream applications and libraries. However, no guarantees are made as to the stability or interface of code in the s_mach.string.impl package between versions.

Features

  • Separate tokens and delimiters using Lexers

    • Import any of 3 provided implicit Lexers:

      • import Lexer.Whitespace

      • import Lexer.Underscore

      • import Lexer.CamelCase

    • Or create a custom RegexCharLexer or RegexCharTransitionLexer

    • Or provide a completely custom token separating implementation

    • Separate but preserve the delimiters between tokens for reconstructing the original string’s structure (mapTokens, LexResult)

  • Find-replace using a variety of input types

    • Search by regex, String, or Seq[String]

    • Replace by function (Match => String) or String literal

    • Match only tokens

    • Match case sensitive or insensitive

  • Adds utility String methods:

    • String.ensureSuffix

    • String.mapTokens

    • String.collapseDelims

    • String.collapseWhitespace

    • String.toProperCase

    • String.toTitleCase

    • String.toCamelCase

    • String.toPascalCase

    • String.toSnakeCase

    • String.toTokens

    • String.lex

    • String.indent

    • String.toOption

    • String.toDoubleOpt

    • String.toLongOpt

    • String.toIntOpt

    • String.convert

    • Indexed[[IndexedSeq[String]].printGrid

  • Extract and create char group regex patterns using CharGroup, CharGroupPattern & CharGroupRegex

Examples

[info] Starting scala interpreter...
[info]
Welcome to Scala version 2.11.1 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_72).
Type in expressions to have them evaluated.
Type :help for more information.

scala> import s_mach.string._
import s_mach.string._

scala> :paste
// Entering paste mode (ctrl-D to finish)

val sentence = "the_rain_in__spain falls  on the Plain"

// Split on whitespace
val example1 = {
  import Lexer.Whitespace
  sentence.toTokens.toList
}

// Split on underscore
val example2 = {
  import Lexer.Underscore
  sentence.toTokens.toList
}

// Split on whitespace or underscore
val example3 = {
  import Lexer.WhitespaceOrUnderscore
  sentence.toTokens.toList
}

// Exiting paste mode, now interpreting.

sentence: String = the_rain_in__spain falls  on the Plain
example1: List[String] = List(the_rain_in__spain, falls, on, the, Plain)
example2: List[String] = List(the, rain, in, spain falls  on the Plain)
example3: List[String] = List(the, rain, in, spain, falls, on, the, Plain)

scala> :paste
// Entering paste mode (ctrl-D to finish)

// find replace on tokens (delimited by whitespace)
val example4 = {
  import Lexer.Whitespace
  sentence.findReplaceTokens(Seq(("spain", "france"),("plain","savanna")), caseSensitive = false)
}

// find replace on tokens (delimited by whitespace or underscore)
val example5 = {
  import Lexer.WhitespaceOrUnderscore
  sentence.findReplaceTokens(Seq(("spain", "france"),("plain","savanna")), caseSensitive = true)
}

// Exiting paste mode, now interpreting.

example4: String = the_rain_in__spain falls  on the savanna
example5: String = the_rain_in__france falls  on the Plain

scala> :paste

// find matching regex and append '!' to each match
val example6 = {
  sentence.findRegexReplaceMatch(Seq(("[a-z]*ain".r,{ m => m.toString + "!" })))
}

// Exiting paste mode, now interpreting.

example6: String = the_rain!_in__spain! falls  on the Plain!

scala>