Non-intrusive native Prometheus collectors for Pekko internals, with negligible performance overhead, suitable for production use.
- Are you running (or about to run) Pekko in production, full-throttle, and want to see what happens inside? Did your load tests produce ask timeouts? Thread starvation? Threads behaving non-reactively? Old code doing nasty blocking I/O?

- Already familiar with the Prometheus/Grafana observability stack?
 
If you answer 'yes' to most of the questions above, Pekko Sensors may be the right choice for you:
- Comprehensive feature set to make the internals of your Pekko application visible in any environment, including high-load production.

- It is OSS/free, as in MIT license, and uses explicit, very lightweight instrumentation, yet is a treasure trove for a busy observability engineer.

- Its negligible overhead won't noticeably add to CPU costs when running in a public cloud.
 
- time of runnable waiting in queue (histogram)
 - time of runnable run (histogram)
 - implementation-specific ForkJoinPool and ThreadPool stats (gauges)
 - thread states, as seen from JMX ThreadInfo (histogram, updated once every X seconds)
 - active worker threads (histogram, updated on each runnable)
 
- thread watcher, keeping an eye on threads that run suspiciously long and reporting their stack traces, to help you find blocking code quickly
 
- number of actors (gauge)
 - time of actor 'receive' run (histogram)
 - actor activity time (histogram)
 - unhandled messages (counter)
 - exceptions (counter)
 
- recovery time (histogram)
 - number of recovery events (histogram)
 - persist time (histogram)
 - recovery failures (counter)
 - persist failures (counter)
 
- cluster events, per type/member (counter)
 
- number of instances
 - start time / uptime
 - JVM version
 - memory pools
 - garbage collector
 
We assume you have Docker and docker-compose installed and running.
Prepare the sample app:
sbt "compile; project app; Docker/publishLocal"
Start observability stack:
docker-compose -f examples/observability/docker-compose.yml up
Send some events:
for z in {1..100}; do curl -X POST http://localhost:8080/api/ping/$z/100; done
for z in {101..200}; do curl -X POST http://localhost:8080/api/ping-tp/$z/100; done
for z in {3001..3300}; do curl -X POST http://localhost:8080/api/ping-persistence/$z/300 ; done
Open Grafana at http://localhost:3000.
Go to http://localhost:3000/plugins/sensors-prometheus-app, click Enable. Sensors' bundled dashboards will be imported.
libraryDependencies ++= Seq(
  "nl.pragmasoft.pekko" %% "sensors" % "1.0.3"
)
If you already have a Prometheus exporter in your application, CollectorRegistry.defaultRegistry is used by default.
For finer control, override PekkoSensors.prometheusRegistry.
For an example of an HTTP exporter service, see the MetricService implementation in the example application (app) module.
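As a rough illustration of what such an exporter has to do, the sketch below renders a registry in the Prometheus text format. It assumes the Prometheus Java simpleclient (`simpleclient` and `simpleclient_common`) is on the classpath. `MetricsRendering` and `render` are hypothetical names, not part of Pekko Sensors, and the example app's MetricService may well be implemented differently.

```scala
import java.io.StringWriter

import io.prometheus.client.CollectorRegistry
import io.prometheus.client.exporter.common.TextFormat

// Hypothetical helper, not part of Pekko Sensors.
object MetricsRendering {

  /** Renders every metric in the given registry using Prometheus text format 0.0.4.
    * Defaults to the shared default registry mentioned above.
    */
  def render(registry: CollectorRegistry = CollectorRegistry.defaultRegistry): String = {
    val writer = new StringWriter()
    TextFormat.write004(writer, registry.metricFamilySamples())
    writer.toString
  }
}
```

The resulting string can be returned from a `/metrics` route of whatever HTTP server the application already runs, with the content type set to `TextFormat.CONTENT_TYPE_004`.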
Override the type and executor of each dispatcher you want to observe with Sensors' instrumented implementations.
Add nl.pragmasoft.pekko.sensors.PekkoSensorsExtension to extensions.
pekko {
  actor {
    # main/global/default dispatcher
    default-dispatcher {
      type = "nl.pragmasoft.pekko.sensors.dispatch.InstrumentedDispatcherConfigurator"
      executor = "nl.pragmasoft.pekko.sensors.dispatch.InstrumentedExecutor"
      instrumented-executor {
        delegate = "fork-join-executor" 
        measure-runs = true
        watch-long-runs = true
        watch-check-interval = 1s
        watch-too-long-run = 3s
      }
    }
    # some other dispatcher used in your app
    default-blocking-io-dispatcher {
      type = "nl.pragmasoft.pekko.sensors.dispatch.InstrumentedDispatcherConfigurator"
      executor = "nl.pragmasoft.pekko.sensors.dispatch.InstrumentedExecutor"
      instrumented-executor {
        delegate = "thread-pool-executor"
        measure-runs = true
        watch-long-runs = false
      }
    }
  }
  extensions = [
    nl.pragmasoft.pekko.sensors.PekkoSensorsExtension
  ]
}
The delegate executor's own settings (here, fork-join-executor) go in the same dispatcher block:
  default-dispatcher {
    type = "nl.pragmasoft.pekko.sensors.dispatch.InstrumentedDispatcherConfigurator"
    executor = "nl.pragmasoft.pekko.sensors.dispatch.InstrumentedExecutor"
    instrumented-executor {
      delegate = "fork-join-executor"
      measure-runs = true
      watch-long-runs = false
    }
    fork-join-executor {
      parallelism-min = 6
      parallelism-factor = 1
      parallelism-max = 6
    }
  }  
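Once a dispatcher is instrumented in the configuration, actors are assigned to it in the usual Pekko way. The sketch below assumes the `default-blocking-io-dispatcher` settings shown above are on the classpath. `BlockingWorker` and `DispatcherUsage` are hypothetical names used only for illustration.

```scala
import org.apache.pekko.actor.{Actor, ActorSystem, Props}

// Hypothetical actor doing blocking work; not part of Pekko Sensors.
class BlockingWorker extends Actor {
  def receive: Receive = {
    case millis: Long =>
      Thread.sleep(millis) // stand-in for blocking I/O
      println(s"blocked for $millis ms on ${Thread.currentThread().getName}")
  }
}

object BlockingWorker {
  // Id of the dispatcher instrumented in the configuration above
  val DispatcherId = "pekko.actor.default-blocking-io-dispatcher"

  // Props that pin the actor to that dispatcher
  def props: Props = Props(new BlockingWorker).withDispatcher(DispatcherId)
}

object DispatcherUsage {
  def main(args: Array[String]): Unit = {
    val system = ActorSystem("sensors-demo")
    system.actorOf(BlockingWorker.props, "blocking-worker") ! 100L
    Thread.sleep(1000) // give the message time to be processed
    system.terminate()
  }
}
```

Keeping blocking actors on their own instrumented dispatcher makes their run and queue times show up separately from the default dispatcher's.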
// Non-persistent actors
class MyImportantActor extends Actor with ActorMetrics {
  // This becomes the 'actor' label; the default is the simple class name,
  // but you may segment it further.
  // Just make sure the cardinality stays sane (<100).
  override protected def actorTag: String = ...

  ... // your implementation
}

// Persistent actors
class MyImportantPersistentActor extends Actor with PersistentActorMetrics {
  ...
}
// Typed actors
val behavior = BehaviorMetrics[Command]("ActorLabel") // basic actor metrics
  .withReceiveTimeoutMetrics(TimeoutCmd) // counts received timeout commands
  .withPersistenceMetrics // collects persistence metrics if the inner behavior is event sourced
  .setup { ctx: ActorContext[Command] =>
    ... // your implementation
  }
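The advice about keeping the cardinality of the 'actor' label sane can be followed by mapping any free-form segment onto a small, fixed set of values. The helper below is a minimal sketch of that idea. `ActorTags` and `bounded` are hypothetical names and not part of Pekko Sensors.

```scala
// Hypothetical helper for building bounded 'actor' label values.
object ActorTags {

  /** Returns "<base>-<segment>" for known segments and "<base>-other" for everything else,
    * so the label can never take more than allowed.size + 1 distinct values.
    */
  def bounded(base: String, segment: String, allowed: Set[String]): String =
    if (allowed.contains(segment)) s"$base-$segment" else s"$base-other"
}
```

An actor could then define its tag as, for example, `override protected def actorTag: String = ActorTags.bounded("OrderActor", region, Set("eu", "us"))`, where `region` is whatever field the actor already knows about itself.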
Some parameters of the Sensors library itself that you may want to tune:
pekko.sensors {
  thread-state-snapshot-period = 5s
  cluster-watch-enabled = false
}
To measure anything else in actors, extend ActorMetrics (or PersistentActorMetrics) in your own trait.
trait CustomActorMetrics extends ActorMetrics with MetricsBuilders {
  val importantEvents: Counter = counter
    .name("important_events_total")
    .help("Important events")
    .labelNames("actor")
    .register(metrics.registry)
}

