polentino / redacted   0.7.1

License: Do What The F*ck You Want To Public License

Scala library and compiler plugin that prevent inadvertent leakage of sensitive fields in `case classes` (such as credentials, personal data, and other confidential information)


Redacted

Prevents leaking sensitive fields defined inside case classes.


Introduction

In Scala, case classes are omnipresent: they are the building blocks of complex business domain models, because they are so easy to define and instantiate. On top of that, the Scala compiler generates a convenient toString method that prints their content to the console or a log. For example:

case class UserPreferences(useDarkTheme: Boolean, maxHistoryItems: Int)

val id = 123
val up = store.getUserPreferencesByID(id)
log.info(s"user preferences for user $id are $up")

will print

user preferences for user 123 are UserPreferences(true,5)

However, this becomes a double-edged sword when handling sensitive data. Assume you're writing an HTTP server and you have a case class to pass its headers around, e.g.

case class HttpHeaders(userId: UUID, apiKey: String, languages: Seq[Locale], correlationId: String)

or a case class representing a user in a DB

case class User(id: UUID, nickname: String, email: String)

you probably wouldn't want to leak an apiKey (for security reasons) or an email (for PII/GDPR reasons) by mistake.

Sure, you can get creative and define middleware layers, utility methods and so on to work around the issue, but wouldn't it be better to simply say "when I dump the whole object, I don't want this field to be printed out"?
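For comparison, here is a minimal sketch of what such a manual workaround might look like. Secret is a hypothetical wrapper type introduced only for this illustration (it is not part of redacted), and LeakyUser and SafeUser are made-up names. The sketch hides the value when printing, but it changes the field's type and forces an explicit unwrap at every use.

```scala
import java.util.UUID

object ManualRedactionSketch {
  // Hypothetical wrapper, NOT part of the redacted library:
  // it hides its content whenever it is converted to a string.
  final case class Secret[A](value: A) {
    override def toString: String = "***"
  }

  // Plain case class: the compiler-generated toString prints every field.
  final case class LeakyUser(id: UUID, nickname: String, email: String)

  // Manual workaround: the sensitive field changes type from String to Secret[String].
  final case class SafeUser(id: UUID, nickname: String, email: Secret[String])

  def main(args: Array[String]): Unit = {
    val id = UUID.fromString("8b2d4570-d043-473b-a56d-fe98105ccc2b")
    // The email appears in the output.
    println(LeakyUser(id, "polentino911", "polentino911@somemail.com"))
    val safe = SafeUser(id, "polentino911", Secret("polentino911@somemail.com"))
    // The email is replaced by ***.
    println(safe)
    // Reading the real value now needs an explicit unwrap.
    println(safe.email.value)
  }
}
```

The cost of this approach is that every sensitive field in the domain model has to be wrapped, and every call site that reads it has to be touched.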

@redacted to the rescue!

Usage

Regardless of the Scala version you use (redacted is available for Scala 2.12.x, 2.13.x and all 3.x LTS versions), all you have to do is open your build.sbt file and add the following lines

val redactedVersion = "x.y.z" // use latest version of the library
// resolvers += DefaultMavenRepository

libraryDependencies ++= Seq(
  "io.github.polentino" %% "redacted" % redactedVersion cross CrossVersion.full,
  compilerPlugin("io.github.polentino" %% "redacted-plugin" % redactedVersion cross CrossVersion.full)
)

and then, in your case class definitions

import java.util.{Locale, UUID}
import io.github.polentino.redacted.redacted

case class HttpHeaders(userId: UUID, @redacted apiKey: String, languages: Seq[Locale], correlationId: String)

case class User(id: UUID, nickname: String, @redacted email: String)

That's all!

From now on, every time you dump the whole object or invoke its toString method

val headers: HttpHeaders = HttpHeaders(
  userId = UUID.randomUUID(),
  apiKey = "abcdefghijklmnopqrstuvwxyz",
  languages = Seq(Locale.ITALY, Locale.US),
  correlationId = "corr-id-123"
)
val user: User = User(
  id = UUID.randomUUID(),
  nickname = "polentino911",
  email = "polentino911@somemail.com"
)
println(headers)
println(user)

this will actually be printed

$ HttpHeaders(d58b6a78-5411-4bd4-a0d3-e1ed38b579c4,***,List(it_IT, en_US),corr-id-123)
$ User(8b2d4570-d043-473b-a56d-fe98105ccc2b,polentino911,***)

But, of course, accessing the field itself will return its content, i.e.

println(headers.apiKey)
println(user.email)

will still print the real values:

$ abcdefghijklmnopqrstuvwxyz
$ polentino911@somemail.com

Nested case class

It also works with nested case classes:

case class Wrapper(id: String, user: User)

val wrapper = Wrapper("id-1", user) // user is the same object defined above
println(wrapper)

will print

Wrapper(id-1,User(8b2d4570-d043-473b-a56d-fe98105ccc2b,polentino911,***))
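One way to see why nesting needs no extra effort: Wrapper has no annotated field, so its default toString simply calls toString on each of its fields, including the nested User. The sketch below emulates the plugin by hand with an explicit override. That override is an assumption made for illustration; with the plugin you would annotate the field instead.

```scala
import java.util.UUID

object NestedSketch {
  // Hand-written stand-in for the toString that the plugin is described to generate for
  // `case class User(id: UUID, nickname: String, @redacted email: String)`.
  final case class User(id: UUID, nickname: String, email: String) {
    override def toString: String = "User(" + id + "," + nickname + ",***)"
  }

  // No annotation and no override here: the compiler-generated toString
  // delegates to User.toString for the nested field.
  final case class Wrapper(id: String, user: User)

  def main(args: Array[String]): Unit = {
    val user = User(
      UUID.fromString("8b2d4570-d043-473b-a56d-fe98105ccc2b"),
      "polentino911",
      "polentino911@somemail.com"
    )
    println(Wrapper("id-1", user))
  }
}
```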

Nested case class with upper level annotation

If the nested field itself is annotated, the whole nested object is redacted:

case class Wrapper(id: String, @redacted user: User)

val wrapper = Wrapper("id-1", user) // user is the same object defined above
println(wrapper)

will print

Wrapper(id-1,***)

Value case classes

@redacted plays nicely with value case classes too, e.g.

case class Password(@redacted value: String) extends AnyVal

val p = Password("somepassword")
println(p)

will print on console

Password(***)

Note on curried case classes

While it is possible to write something like

case class Curried(id: String, @redacted name: String)(@redacted email: String)

the toString method that the Scala compiler generates by default prints only the parameters in the first parameter list, meaning that, without the plugin,

val c = Curried("0", "Berfu")("berfu@gmail.com")
println(c)

will display

Curried(0,Berfu)

Therefore, the customized toString implementation keeps the same behavior: only fields in the first parameter list are printed.
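The compiler's default behavior for curried case classes can be checked with plain Scala, with no plugin or annotation involved. In this sketch the second parameter is declared as a val (an addition for illustration) so that it can be read back.

```scala
object CurriedSketch {
  // Plain Scala only: this shows the compiler's default behavior, not the plugin's.
  // Parameters in a second parameter list are not fields unless declared as `val`.
  final case class Curried(id: String, name: String)(val email: String)

  def main(args: Array[String]): Unit = {
    val c = Curried("0", "Berfu")("berfu@gmail.com")
    // Only the first parameter list is printed.
    println(c)
    // The second parameter list is not counted as a product element.
    println(c.productArity)
    // The value is still reachable because it was declared as a val.
    println(c.email)
  }
}
```

The second parameter list is also left out of the generated equals and hashCode, so two instances that differ only in email compare as equal.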

Supported Scala Versions

redacted supports all Scala versions listed in the table below. However, it is advised to use the ones marked with a green checkmark (✅), since those are the Long Term Support versions listed on the Scala website.

Scala Version   LTS?
3.6.4           -
3.5.2           -
3.4.3           -
3.3.5           ✅
3.3.4           ✅
3.3.3           ✅
3.3.1           ✅
3.3.0           ✅
3.2.2           -
3.1.3           -
2.13.16         -
2.12.20         -

How it works

Given a case class with at least one field annotated with @redacted, e.g.

final case class User(id: UUID, @redacted name: String)

the compiler plugin will replace the default implementation of its toString method with this

final case class User(id: UUID, @redacted name: String) {
  override def toString(): String = "User(" + this.id + ",***" + ")"
}

This is how it is done:

The compiler plugin inspects each type definition and checks whether it is a case class with at least one field annotated with @redacted. If so, it rewrites the default toString implementation so that each field contributes either the *** string or its actual value, depending on whether the field is annotated with @redacted. The resulting implementation looks like this:

def toString(): String =
  "<class name>(" + this.<field not redacted> + "," + "***" + ... + ")"
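The selection logic described above can be modelled at runtime with a few lines of plain Scala. This is only an illustrative sketch: Field and render are hypothetical names, and the real plugin performs the equivalent rewrite on the compiler's syntax tree at compile time rather than on values at runtime.

```scala
object ToStringSketch {
  // Hypothetical runtime model of one case class field:
  // its value plus whether it carries the @redacted annotation.
  final case class Field(value: Any, redacted: Boolean)

  // Builds "<class name>(f1,f2,...)" and substitutes *** for every redacted field.
  def render(className: String, fields: Seq[Field]): String =
    fields
      .map(f => if (f.redacted) "***" else s"${f.value}")
      .mkString(className + "(", ",", ")")

  def main(args: Array[String]): Unit = {
    val fields = Seq(
      Field("8b2d4570", redacted = false), // printed as-is
      Field("Berfu", redacted = true)      // replaced by ***
    )
    println(render("User", fields))
  }
}
```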

Improvements

  • create sbt plugin
  • add some benchmarks with jmh

Credits