A Solr client built on Akka
The goal of akka-solr is to provide a high-performance, non-blocking, Akka-and-Scala-native interface to Apache Solr. The initial implementation provides an interface similar to spray-client's, with an Akka extension that allows requests to be sent to an actor, or an interface to request a connection actor and send requests to it. Optional builders for requests are provided, but are not required; results from Solr are returned as wrapper objects that provide easier access from Scala to SolrJ objects. Some SolrJ objects are used in the interest of maintainability.
To avoid reinventing the wheel and then maintaining said wheel, the SolrJ library is used for generating update (add/delete) requests (which could easily be replaced, and actually is buggy) and for parsing results (XMLResponseParser, BinaryResponseParser, StreamingBinaryResponseParser). Since the SolrJ ResponseParsers read from java.io.InputStreams, and akka-solr uses reactive/non-blocking response chunking, blocking calls were added to bridge the parsers' InputStream reads to Akka messages. All SolrJ response parsing with the ActorInputStream class runs on a dedicated, runtime-configurable executor, set by the akkasolr.response-parser-dispatcher config. Any improvements / alternate implementations are welcome. Along the same lines, SolrJ's ZkStateReader class is used for ZooKeeper/SolrCloud support; it has a configurable dispatcher that will be created upon first usage.
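The bridging idea can be sketched with nothing but the standard library. This is not akka-solr's ActorInputStream; it is a minimal illustration of the general technique, assuming a single reader thread, with a hypothetical class name (QueueInputStream) and hypothetical methods (enqueue, complete) standing in for whatever delivers chunks in the real code.

```scala
import java.io.InputStream
import java.util.concurrent.LinkedBlockingQueue

// Hypothetical stand-in, for illustration only: a producer (for example an actor
// receiving HTTP chunks) pushes byte arrays in, and a blocking parser reads them out.
final class QueueInputStream extends InputStream {
  // An empty array is used as the end-of-stream marker.
  private val queue = new LinkedBlockingQueue[Array[Byte]]()
  private var current: Array[Byte] = Array.emptyByteArray
  private var pos = 0
  private var finished = false

  // Called by the producer for every received chunk; empty chunks are ignored.
  def enqueue(chunk: Array[Byte]): Unit = if (chunk.nonEmpty) queue.put(chunk)

  // Called by the producer once the response is complete.
  def complete(): Unit = queue.put(Array.emptyByteArray)

  // Called by the parser thread; blocks until a byte or end-of-stream is available.
  override def read(): Int = {
    while (!finished && pos >= current.length) {
      current = queue.take() // this is the blocking call
      pos = 0
      if (current.isEmpty) finished = true
    }
    if (finished) -1
    else {
      val b = current(pos) & 0xff
      pos += 1
      b
    }
  }
}

object QueueInputStreamDemo {
  def main(args: Array[String]): Unit = {
    val in = new QueueInputStream
    in.enqueue("<response>".getBytes("UTF-8"))
    in.enqueue("</response>".getBytes("UTF-8"))
    in.complete()
    println(scala.io.Source.fromInputStream(in, "UTF-8").mkString)
  }
}
```

Because read() blocks, a parser using a stream like this ties up a thread while it waits, which is the reason for running parsing on its own executor rather than on the default dispatcher.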
akka-solr depends on SolrJ for request generation and response parsing, but the dependency is marked as "provided", so the end user is required to pull the dependency in. After cursory inspections, akka-solr is expected to work with SolrJ versions 4.5 through 4.10. Akka and spray-can are not pulled in as "provided"; this can be changed if feedback demands.
sbt:
libraryDependencies ++= Seq(
  "com.codemettle.akka-solr" %% "akka-solr" % "3.0.2",
  "org.apache.solr" % "solr-solrj" % "7.2.1" // later versions should work but are untested
)
Maven:
<dependency>
  <groupId>com.codemettle.akka-solr</groupId>
  <artifactId>akka-solr</artifactId>
  <version>3.0.2</version>
</dependency>
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-solrj</artifactId>
  <version>7.2.1</version>
</dependency>
If you've used the spray-can HTTP client library, you're already familiar with akka-solr's philosophy.
Some scaladocs are provided; open a ticket for anything unclear (or submit a pull request!).
In lieu of detailed documentation, here is a list of examples (using ask/? syntax for clarity and brevity, even though the library is meant to be used from Actors with message passing).
A builder is provided as part of akka-solr; any improvements are welcome.
import com.codemettle.akkasolr.querybuilder.SolrQueryStringBuilder.Methods._
val qs = rawQuery("my custom query")
val qs = defaultField() := "wantthis"
val qs = defaultField() :!= "dontwantthis"
val qs = field("myfield") := "requiredvalue"
val qs = field("myfield") isAnyOf Seq("1", "2")
val qs = field("mylong") isInRange (12345, 98765)
val qs = field("requiredField") exists()
val qs = field("illegalField") doesNotExist()

import com.codemettle.akkasolr.querybuilder.SolrQueryStringBuilder.Methods._
val qs = AND (
  field ("x") := "y",
  defaultField() :!= "3",
  NOT(field ("z") := 2),
  OR (
    rawQuery ("<my custom query>"),
    field("aa") := "2",
    field("bb") isAnyOf Seq("1", "2")
  )
)

import com.codemettle.akkasolr.querybuilder.SolrQueryStringBuilder.Methods._
// if both lower and upper are nonempty, then the "time" field will be in the query (Some(field("time").isInRange(lo, hi)))
// if one or both are empty, then the for comprehension yields a None, and will be dropped at query render time
def buildQuery(lower: Option[Long], upper: Option[Long]) = {
  AND(
    defaultField() := "xyz",
    for (lo <- lower; hi <- upper) yield field("time") isInRange (lo, hi)
  )
}

import com.codemettle.akkasolr.querybuilder.SolrQueryStringBuilder.Methods._
import com.codemettle.akkasolr.querybuilder.SolrQueryBuilder.FieldStrToSort
val qs: QueryPart = ???
val query = qs start 50 rows 25 facets ("a", "b") fields ("f1", "f2") sortBy "f1".desc

val req = Solr.Ping(action = None, options = Solr.RequestOptions(method = RequestMethods.GET, responseType = SolrResponseTypes.XML, requestTimeout = 5.seconds))
val req = Solr.Ping()
val req = Solr.Ping(Solr.Ping.Enable)
val req = Solr.Commit(waitForSearcher = false, softCommit = true)
val req = Solr.Optimize(waitForSearcher = false, maxSegments = 2)
val req = Solr.Rollback(options = Solr.RequestOptions(actorSystem).copy(method = RequestMethods.POST, responseType = SolrResponseTypes.Binary))
val req = Solr.Select(qs)
val req = Solr.Select.Streaming(qs)
val req = Solr.Select(qs).streaming
val req = Solr.Select(qs).streaming withOptions Solr.RequestOptions(actorSystem).copy(requestTimeout = 15.seconds)
val req = Solr.Update DeleteById ("id1", "id2")
val req = Solr.Update() deleteById "id1" deleteByQuery (Solr.queryStringBuilder defaultField() := "blah")
val docs: Seq[SolrInputDocument] = ???
val req = Solr.Update AddSolrDocs (docs: _*)
val docs: Seq[Map[String, AnyRef]] = ???
val req = Solr.Update AddDocs (docs: _*)
val doc: Map[String, AnyRef] = ???
val req = Solr.Update() addDoc doc
val req = Solr.Update() addDoc doc overwrite false
val req = Solr.Update() addDoc doc commit true
val req = Solr.Update() addDoc doc commitWithin 42.seconds
A Connection actor is requested with Solr.Client.clientTo(). The connection accepts Solr.SolrOperation messages. Solr.Client.manager can also accept Solr.Request objects, creating connections as needed.
val req: Solr.SolrOperation = ???
val resp: Future[SolrQueryResponse] =
  (Solr.Client.manager ? Solr.Request("http://mysolrserver:8983/solr/core1", req)).mapTo[SolrQueryResponse]
Actor:
Solr.Client.clientTo("http://mysolrserver")
def receive = {
  case Solr.SolrConnection("http://mysolrserver", connection) => // sender() is the same actor as `connection`
}
Future:
val connF: Future[ActorRef] = Solr.Client.clientFutureTo("http://mysolrserver")
The connection actor accepts requests and sends back SolrQueryResponse objects:
val responseF: Future[SolrQueryResponse] = (connectionActor ? req).mapTo[SolrQueryResponse]
Errors can be raised from Spray (which should be Http.ConnectionException errors) or from akka-solr (which should be Solr.AkkaSolrErrors - ParseError, RequestTimedOut, etc).
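When requests are sent with ask, such errors surface as a failed Future, so they can be handled with recover. The sketch below only shows the shape of that handling: every exception type in it (ConnectionFailure, ClientError, TimedOut, Unparseable) is a stand-in defined for the example, not a class from spray or akka-solr, and the response is modelled as a plain String.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Stand-in exception types, for illustration only. Real code would match on
// spray's Http.ConnectionException and on akka-solr's Solr.AkkaSolrError subtypes.
class ConnectionFailure(msg: String) extends RuntimeException(msg)
sealed abstract class ClientError(msg: String) extends RuntimeException(msg)
final case class TimedOut(after: FiniteDuration) extends ClientError(s"no response within $after")
final case class Unparseable(detail: String) extends ClientError(detail)

object ErrorHandlingSketch {
  // Converts a failed response future into a readable Left instead of an exception.
  // Failures that are not listed here stay failures.
  def outcome(response: Future[String]): Future[Either[String, String]] =
    response.map[Either[String, String]](Right(_)).recover {
      case TimedOut(after)      => Left(s"timed out after $after")
      case e: ClientError       => Left(s"client error: ${e.getMessage}")
      case e: ConnectionFailure => Left(s"connection error: ${e.getMessage}")
    }

  def main(args: Array[String]): Unit = {
    val failed = Future.failed[String](TimedOut(5.seconds))
    println(Await.result(outcome(failed), 1.second))
  }
}
```

The more specific case (TimedOut) is listed before the general ClientError case so that it gets its own message.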
akka-solr provides an ImperativeWrapper class that can be wrapped around the client ActorRef or requested with:
val connF: Future[ImperativeWrapper] = Solr.Client.imperativeClientTo("http://mysolrserver")
The purpose of ImperativeWrapper is to provide a vaguely SolrServer-ish interface to akka-solr using Akka asks. This can be helpful to transition from SolrJ or other imperative clients. All ImperativeWrapper methods return Future[SolrQueryResponse]s.
akka-solr can use Solr's chunking/streaming mechanism to send query results to an actor as they are received and parsed. The behavior is similar to SolrJ's SolrServer.queryAndStreamResponse.
val req = Solr.Select(qs).streaming
connection ! req
// or
Solr.Client.manager ! Solr.Request(solrUrl, req)
def receive = {
  case SolrResultInfo(numFound, start, maxScore) => // received first
  case doc: AkkaSolrDocument => // documents are received
  case res: SolrQueryResponse => // response is sent last and has no documents; (connection ? req) returns Future[SolrQueryResponse]
}
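The message order above (result info first, then documents, then the final response) lends itself to a small accumulating state. The following is a sketch only: the message types are simplified stand-ins defined for the example rather than akka-solr's classes, their field types are guesses, and a plain fold takes the place of an actor's receive.

```scala
// Stand-in message types, for illustration only; field types are assumptions.
sealed trait StreamMessage
final case class ResultInfo(numFound: Long, start: Long, maxScore: Option[Float]) extends StreamMessage
final case class Document(fields: Map[String, Any]) extends StreamMessage
case object ResponseComplete extends StreamMessage

// Immutable state that a receiving actor could carry between messages.
final case class StreamState(info: Option[ResultInfo], docs: Vector[Document], done: Boolean) {
  def handle(msg: StreamMessage): StreamState = msg match {
    case i: ResultInfo    => copy(info = Some(i))   // arrives first
    case d: Document      => copy(docs = docs :+ d) // one message per document
    case ResponseComplete => copy(done = true)      // arrives last, carries no documents
  }
}

object StreamState {
  val empty: StreamState = StreamState(None, Vector.empty, done = false)
}

object StreamingSketch {
  def main(args: Array[String]): Unit = {
    val messages: Seq[StreamMessage] = Seq(
      ResultInfo(numFound = 2L, start = 0L, maxScore = None),
      Document(Map("id" -> "1")),
      Document(Map("id" -> "2")),
      ResponseComplete
    )
    val result = messages.foldLeft(StreamState.empty)(_ handle _)
    println(s"received ${result.docs.size} docs, done = ${result.done}")
  }
}
```

Inside an actor, the same handle step could be applied per message, with the state kept in a var or passed through context.become.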
We have many unit tests that use Solr's EmbeddedSolrServer to fire up temporary Solr instances that are loaded, queried, updated, and destroyed during testing. I'm sure there are better ways to go about that, but in the interest of maintaining our test setup I've made akka-solr customizable with different connection actors at runtime.
To use a different connection actor, extend the com.codemettle.akkasolr.ext.ConnectionProvider trait and configure the akkasolr.connectionProvider config to point to your implementation. akka-solr provides an HttpSolrServerConnectionProvider in the "tests" jar as an example, which uses the akka-solr-provided SolrServerClientConnection actor to run queries against a SolrJ SolrServer. A simple ConnectionProvider can be created in your test code which uses the same SolrServerClientConnection actor with an EmbeddedSolrServer (example uri: "solr://embedded?options=that&you=need").
- Tests - I'm not a unit-testing expert; any help is appreciated
- Add more Scala-friendly request/response wrappers as requested.
- Sub-queries? (can provide queries without using builders)
- document field weights? (can provide queries without using builders)
- facet dates/ranges, limiting facets (can use original non-Scala-ish response)
- spellcheck, highlighting, stats, terms, etc (can use original non-Scala-ish response)
- ???
- 3.0.2
- Build for Scala 2.13
- 3.0.0
- Add update-defaults.fail-on-non-zero-status, set to true by default, which causes UpdateRequests to fail with a Solr.UpdateError if a non-zero status is returned to add/delete/commit requests
- Upgrade SolrJ to version 7.2.1
- Add support for Scala 2.12
- Upgrade Akka to 2.5
- 2.1.0
- Upgrade Akka to 2.4
- 2.0.1
- Same changes as 1.5.1
- 2.0.0
- Drop support for Solr4, move to Solr 5.1
- Drop support for Scala 2.10
- Build with Java8
- API Change: change isAnyOf/isNoneOf to take iterables instead of being varargs methods (to cut down on WrappedArray bugs from forgetting : _* vararg conversions)
- 1.5.1
- Add support to SolrQueryBuilder for shards
- Changed SolrQueryBuilder around as its fields were getting out of control, left the API intact, but this could affect users in rare cases - not enough of an issue to bump the minor version number.
- Add support to SolrQueryBuilder for Filter Queries
- 1.5.0
- Version change due to breaking API
- Add support for authentication on regular connections, although not supported (yet?) for LoadBalanced/SolrCloud connections
- 1.0.1
- SolrQueryBuilder now supports facet pivots, stats, and grouping
- isAnyOf now generates more concise queries (key:(v1 OR v2 OR v3) vs (key:v1 OR key:v2 OR key:v3))
- Solr.(RequestOptions|UpdateOptions|LBConnectionOptions|SolrCloudConnectionOptions) now all have a .materialize(implicit ActorRefFactory) method to create instances from the ether inside of any actor
- SolrQueryStringBuilder now has an implicit conversion from Option[QueryPart]s to QueryParts
- Bug Fix - nested AND/ORs: a query like AND(defaultField() := "*", OR(Seq.empty[QueryPart]: _*)) would generate (* AND ), now correctly generates *
- 1.0.0
- Update build to build against 2.10.5 and 2.11.6
- No code changes, but the project has been in production long enough to mark it as 1.0.
- 0.10.2
- Support for cursorMark in SolrQueryBuilder and nextCursorMark in SolrQueryResponse. Cursors require Solr 4.7.1, but akka-solr hard-codes the constants from SolrJ's CursorMarkParams to maintain compatibility with SolrJ < 4.7.1
- SolrQueryBuilder.withSortIfNewField(SortClause)
- 0.10.1
- Bugfix - asking for a SolrCloud/LoadBalanced connection with different options but same address as existing would return the existing connection instead of creating a new connection with different connection options (especially visible for SolrCloud connections with different defaultCollection settings)
- 0.10.0
- SolrQueryBuilder.query is no longer a String, it is a SolrQueryStringBuilder.QueryPart for easier modification of queries
- Responses to operations should now come from the Connection actor instead of the transient Request Handler actors
- Add the LBClientConnection class that behaves pretty much exactly the same as SolrJ's org.apache.solr.client.solrj.impl.LBHttpSolrServer class
- LBHttpSolrServer attempts to cycle through servers in order (but the order is changed at runtime when failures happen), LBClientConnection uses a random order on every request
- LBHttpSolrServer.Req (lets the user specify a list of servers to try per request that don't necessarily have to be servers that the connection was configured to handle) is reproduced by sending LBClientConnection.ExtendedRequest messages to the LBClientConnection (LBHttpSolrServer.Rsp -> LBClientConnection.ExtendedResponse)
- Add the SolrCloudConnection class that behaves pretty much exactly the same as SolrJ's org.apache.solr.client.solrj.impl.CloudSolrServer class
- CloudSolrServer has a setDefaultCollection() method to set default collections for requests, SolrCloudConnections can be created with a default collection by providing a Solr.SolrCloudConnectionOptions instance with defaultCollection set; no runtime changes are currently supported
- CloudSolrServer looks for a "collection" parameter in requests to override the default (or provide this required piece of data if setDefaultCollection() hasn't been called); SolrCloudConnection accepts SolrCloudConnection.OperateOnCollection messages that provide a per-request collection parameter
- Solr.Client.solrCloudClientTo and its brethren (imperative client, client future) take a host string in the form "host:port,host:port" and accept SolrCloudConnectionOptions to set the default collection and other configuration
- Solr.Client.clientTo and its brethren accept connection strings in the form "zk://host:port,host:port" and create a SolrCloudConnection; using this method requires that every request be sent in a SolrCloudConnection.OperateOnCollection message
- 0.9.2
- Add support in SolrQueryBuilder for facetLimit, facetMinCount, and facetPrefix
- 0.9.1
- Bug fixes for empty NOT and IsAnyOf query builders
- add isNoneOf query builder method
- 0.9.0
- Initial Release
- Support for major Solr operations in an actor+message passing interface
- Support for building immutable messages that represent Solr operations in a Scala-ish manner
- Testability support through runtime-configurable connection providers, with a provided implementation that can use EmbeddedSolrServers
- Easier access to Solr output objects through Scala wrappers
- Authored by @codingismy11to7 for @CodeMettle
- We've used @takezoe's solr-scala-client library extensively in production, and submitted features. akka-solr has no code from solr-scala-client, but there are some superficial similarities. It's a fine library if you need an asynchronous Solr client but don't use Akka.
- Facet pivoting, stats, and grouping SolrQueryBuilder support by @compfix