This library augments kamon-spray to make it provide more useful metrics. In particular, it consists of two independent parts:
- `KamonHttp` – a drop-in replacement for Spray can's `Http` IO extension that will automatically gather Spray server metrics on a periodic basis and publish them to Kamon
- `TracingHttpService` – a drop-in replacement for Spray routing's `HttpService` trait that will provide more useful trace metrics
In order to use this library, you will need to add dependencies to your project:
```scala
libraryDependencies ++= Seq(
  "com.monsanto.arch" %% "spray-kamon-metrics" % "0.1.5",
  // optional: needed for KamonHttp
  "io.spray" %% "spray-can" % "1.3.4",
  // optional: needed for TracingHttpService
  "io.spray" %% "spray-routing" % "1.3.4"
)
```
Note that each of the Spray dependencies is optional and only required when you are using the corresponding functionality.
Additionally, you will need to add JCenter to your resolver chain:

```scala
resolvers += Resolver.jcenterRepo
```
When you use `KamonHttp`, you will be able to retrieve the Spray can server's metrics from Kamon. To use `KamonHttp`, just use it instead of Spray's `Http` extension when binding a new server port. For example, instead of:
```scala
import akka.io.IO
import spray.can.Http

IO(Http) ! Http.Bind(myService, interface = "localhost", port = 80)
```
Do this:

```scala
import akka.io.IO
import com.monsanto.arch.kamon.spray.can.KamonHttp
import spray.can.Http

IO(KamonHttp) ! Http.Bind(myService, interface = "localhost", port = 80)
```
Everything else should work just as it did before. If the service successfully binds to a port, `KamonHttp` will begin periodically polling it to gather the server's metrics.
All Spray can server metrics are published to the `spray-can-server` category with a name generated from the socket address and port, e.g. `localhost:80`. The metrics that are published include:
| Metric | Description |
|--------|-------------|
| `connections` | a counter tracking the number of connections to the server |
| `open-connections` | a histogram tracking the number of open connections at different points in time |
| `max-open-connections` | a counter tracking the maximum number of connections over the life of the server |
| `requests` | a counter tracking the number of requests to the server |
| `open-requests` | a histogram tracking the number of open requests at different points in time |
| `max-open-requests` | a counter tracking the maximum number of requests over the life of the server |
| `uptime` | a counter tracking the server uptime in nanoseconds |
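Once these metrics are being gathered, any Kamon backend can report them. As a rough sketch (the exact keys depend on your Kamon 0.x version and reporting backend, so treat the `kamon.statsd.subscriptions` block below as an assumption), forwarding the `spray-can-server` category to the StatsD reporter might look like this in `application.conf`:

```
kamon.statsd {
  # Hypothetical example: forward every entity in the spray-can-server
  # category (e.g. localhost:80) to the StatsD reporter.
  subscriptions {
    spray-can-server = [ "**" ]
  }
}
```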
Note that `max-open-connections`, `max-open-requests`, and `uptime` are not published as time-series data.
`kamon-spray` provides some valuable help in instrumenting Spray services, but falls short in a few areas:

- It does not support putting things like the request method or path in tags
- It does not properly track request timeouts
To use this part of the library, simply replace any use of `HttpService`, `HttpServiceActor`, or `HttpServiceBase` with a corresponding use of `TracingHttpService`, `TracingHttpServiceActor`, or `TracingHttpServiceBase`. For example, instead of:
```scala
import spray.routing.HttpServiceActor

class MyService extends HttpServiceActor {
  def receive = runRoute {
    path("ping") {
      get {
        complete("pong")
      }
    }
  }
}
```
Do this:

```scala
import com.monsanto.arch.kamon.spray.routing.TracingHttpServiceActor

class MyService extends TracingHttpServiceActor {
  def receive = runRoute {
    path("ping") {
      get {
        complete("pong")
      }
    }
  }
}
```
It's that easy. Now, each request that is processed by your server will add to a histogram called `spray-service-response-duration`. The following tags are added to each record:
| Tag | Description |
|-----|-------------|
| `method` | the method from the request |
| `path` | the path from the request |
| `status_code` | the integer value of the status code sent in the response |
| `timed_out` | whether or not a particular response is considered to have timed out |
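To make the tagging concrete, here is a small, self-contained sketch in plain Scala. The `TaggedDuration` case class and `countByPath` helper are purely illustrative stand-ins, not the library's types; the sketch only shows how records carrying the tags above could be grouped, e.g. counting requests per path:

```scala
// Hypothetical stand-in for one histogram record: a duration plus its tags.
case class TaggedDuration(tags: Map[String, String], durationNanos: Long)

object TagDemo {
  // Group records by their "path" tag and count them.
  def countByPath(records: Seq[TaggedDuration]): Map[String, Int] =
    records.groupBy(_.tags("path")).map { case (path, rs) => (path, rs.size) }

  def main(args: Array[String]): Unit = {
    val records = Seq(
      TaggedDuration(Map("method" -> "GET", "path" -> "/ping",
        "status_code" -> "200", "timed_out" -> "false"), 1000000L),
      TaggedDuration(Map("method" -> "GET", "path" -> "/ping",
        "status_code" -> "200", "timed_out" -> "false"), 1200000L),
      TaggedDuration(Map("method" -> "GET", "path" -> "/health",
        "status_code" -> "200", "timed_out" -> "false"), 900000L)
    )
    println(countByPath(records))
  }
}
```

In a real deployment you would do this kind of aggregation in your Kamon backend rather than in application code; the point is only that the tags let you slice the single `spray-service-response-duration` histogram by method, path, status, and timeout status.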
> **Note: about timeouts**
>
> The way that Spray handles timeouts is somewhat annoying. When a particular request times out, Spray creates a new request that gets processed specially. This means that in the server, the original request still runs to completion. Meanwhile, the request that actually goes out to the client is void of any context from the original request. As a result, we rely on a couple of heuristics to try to generate the most useful data.
>
> In summary, any request that times out should result in two values: one for the response that times out (marked