guardian / etag-caching   3.0.7

Apache License 2.0

Library for in-memory ETag-aware caching of services like AWS S3, saving CPU & bandwidth

Scala versions: 3.x 2.13 2.12


Only fetch what's needed, only parse what you don't already have


Many services (e.g. Amazon S3) include the ETag HTTP response header in their replies. The ETag is a service-generated hash of the content requested. If the client retains the ETag, it can send it in an If-None-Match HTTP request header in subsequent requests. If the service knows the content still has the same ETag, the content hasn't changed, and the service responds with an HTTP 304 Not Modified status code and no body, as the service knows you already have the content. This saves network bandwidth and, as the client has no new content to parse, lowers the client's CPU requirements as well!
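The exchange described above can be sketched without any network access. This is a minimal illustration only: FakeService, Response and the MD5-based ETag are hypothetical stand-ins invented for this sketch, not part of etag-caching or of any AWS SDK, and real services choose their own ETag scheme.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest

object ConditionalRequestSketch {
  // Hypothetical response model: a 304 response carries no body.
  final case class Response(status: Int, etag: String, body: Option[String])

  // Hypothetical in-memory stand-in for an ETag-aware service such as S3.
  final class FakeService(var content: String) {
    // Assumption for this sketch: the ETag is the MD5 hex digest of the content.
    def etag: String =
      MessageDigest.getInstance("MD5")
        .digest(content.getBytes(UTF_8))
        .map(b => "%02x".format(b))
        .mkString

    // ifNoneMatch models the If-None-Match request header.
    def get(ifNoneMatch: Option[String]): Response =
      if (ifNoneMatch.contains(etag)) Response(304, etag, None) // unchanged: no body sent
      else Response(200, etag, Some(content))                   // new or changed: full body
  }

  def main(args: Array[String]): Unit = {
    val service = new FakeService("apple")
    val first  = service.get(ifNoneMatch = None)             // client has nothing yet
    val second = service.get(ifNoneMatch = Some(first.etag)) // content unchanged
    service.content = "banana"
    val third  = service.get(ifNoneMatch = Some(first.etag)) // content changed
    println(Seq(first, second, third).map(_.status).mkString(", ")) // prints 200, 304, 200
  }
}
```

Only the first and third requests transfer a body; the second costs a header round-trip and nothing else.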

To make use of this as a client, you need an ETagCache - one where the latest ETags are stored. Although the cache could simply be storing the raw content sent by the service, for an in-memory cache it's usually optimal to store a parsed representation of the data - to save having to parse the data multiple times. Consequently, ETagCache has a Loader that holds the two concerns of fetching & parsing.
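To illustrate why storing the parsed value next to its ETag saves work, here is a simplified, synchronous sketch. SketchFetching, Fetched and SimpleETagCache are hypothetical names made up for this example; the library's own types differ (for instance, its cache returns a Future), so treat this as an outline of the idea rather than the actual implementation.

```scala
import scala.collection.mutable

object ParsedCacheSketch {
  // Hypothetical: a fresh response together with its ETag.
  final case class Fetched[R](etag: String, body: R)

  // Hypothetical fetching concern: returns None when the service reports
  // "not modified" for the ETag the caller already holds.
  trait SketchFetching[K, R] {
    def fetch(key: K, knownETag: Option[String]): Option[Fetched[R]]
  }

  // Loading = fetching + parsing; the cache keeps (ETag, parsed value) per key.
  final class SimpleETagCache[K, R, V](fetching: SketchFetching[K, R], parse: R => V) {
    private val entries = mutable.Map.empty[K, (String, V)]

    def get(key: K): V = {
      val known = entries.get(key)
      fetching.fetch(key, known.map(_._1)) match {
        case Some(Fetched(etag, body)) =>
          val value = parse(body) // parse only when the content is new
          entries.update(key, (etag, value))
          value
        case None =>
          known.map(_._2).getOrElse(
            throw new IllegalStateException("service reported 'not modified' but nothing is cached"))
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val store = mutable.Map("fruit" -> "apple pear plum")
    var parses = 0

    val fetching = new SketchFetching[String, String] {
      def fetch(key: String, knownETag: Option[String]): Option[Fetched[String]] = {
        val body = store(key)
        val etag = body.hashCode.toString // toy ETag, for this sketch only
        if (knownETag.contains(etag)) None else Some(Fetched(etag, body))
      }
    }

    def parse(body: String): List[String] = { parses += 1; body.split(" ").toList }

    val cache = new SimpleETagCache[String, String, List[String]](fetching, parse)
    cache.get("fruit")
    cache.get("fruit")         // unchanged: served from the cache, no re-parse
    store("fruit") = "kiwi"
    println(cache.get("fruit")) // List(kiwi)
    println(parses)             // 2
  }
}
```

Three lookups trigger only two parses, because the second lookup finds the ETag unchanged and reuses the stored parsed value.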


The main API entry point is the ETagCache class.


This example (taken from S3ObjectFetchingTest) shows an ETagCache setup for fetching-and-parsing compressed XML from S3:

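import java.io.InputStream
import java.util.zip.GZIPInputStream
import scala.util.Using
import software.amazon.awssdk.services.s3.S3AsyncClient

val s3Client: S3AsyncClient = ???
def parseFruit(is: InputStream): Fruit = ???

val fruitCache = new ETagCache[ObjectId, Fruit](
  S3ObjectFetching(s3Client, Bytes).thenParsing {
    // Decompress the fetched bytes, then parse the resulting XML stream
    bytes => Using(new GZIPInputStream(bytes.asInputStream()))(parseFruit).get
  }
  // any further constructor arguments are not shown in this excerpt
)

fruitCache.get(exampleS3id) // Future[Fruit]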

Loading = Fetching + Parsing

flowchart LR
subgraph "Loading[K,V]"
direction LR
fetching("fetching: Fetching[K, Response]")
parsing("parsing: Response => V")
fetching --"response"--> parsing