A library for reading and writing data in Redis using Apache Spark.
Spark-Redis provides access to all of Redis' data structures - String, Hash, List, Set and Sorted Set - from Spark as RDDs. It also supports reading and writing with DataFrames and Spark SQL syntax.
The library works with both stand-alone and clustered Redis databases. When used with Redis cluster, Spark-Redis is aware of the cluster's partitioning scheme and adjusts in response to resharding and node failure events.
Spark-Redis also supports Spark Streaming (DStreams) and Structured Streaming.
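As a rough illustration of the RDD and DataFrame access described above, the following sketch writes a few String keys and a small DataFrame to Redis and reads them back. It assumes a Redis server on `localhost:6379` and the Spark-Redis jar on the classpath. The key prefix `sketch:`, the table name `person`, and the object name are hypothetical, and the exact API may differ between branches, so check the documentation for the branch you use.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}
import com.redislabs.provider.redis._ // adds the Redis methods to SparkContext

object SparkRedisSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local Redis instance on the default port.
    val spark = SparkSession.builder()
      .appName("spark-redis-sketch")
      .master("local[*]")
      .config("spark.redis.host", "localhost")
      .config("spark.redis.port", "6379")
      .getOrCreate()
    val sc = spark.sparkContext

    // RDD API: store key/value pairs as Redis Strings, then read them by key pattern.
    val kv = sc.parallelize(Seq(("sketch:k1", "v1"), ("sketch:k2", "v2")))
    sc.toRedisKV(kv)
    val readBack = sc.fromRedisKV("sketch:*")
    readBack.collect().foreach(println)

    // DataFrame API: each row is stored under a key derived from the table name
    // and the key column.
    import spark.implicits._
    val people = Seq(("John", 30), ("Peter", 45)).toDF("name", "age")
    people.write
      .format("org.apache.spark.sql.redis")
      .option("table", "person")
      .option("key.column", "name")
      .mode(SaveMode.Overwrite) // avoids failing when the table already exists
      .save()

    val loaded = spark.read
      .format("org.apache.spark.sql.redis")
      .option("table", "person")
      .option("key.column", "name")
      .load()
    loaded.show()

    spark.stop()
  }
}
```

The connection settings are passed through the Spark configuration, so the same code should run against a clustered database by pointing `spark.redis.host` at one of the cluster nodes.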
Version compatibility and branching
The library has several branches, each corresponding to a different supported Spark version. For example, 'branch-2.3' works with any Spark 2.3.x version. The master branch contains the latest development for the next release.
| Spark-Redis | Spark | Redis | Supported Scala Versions |
|---|---|---|---|
| 2.4, 2.5, 2.6 | 2.4.x | >=2.9.0 | 2.11, 2.12 |
- Java, Python and R API bindings are not provided at this time
This library is a work in progress, so the API may change before the official release.
Please make sure you use the documentation from the branch that matches your version (2.4, 2.3, etc.).
- Getting Started
- Structured Streaming
- Dev environment
You're encouraged to contribute to the Spark-Redis project.
There are two ways you can do so:
If you encounter an issue while using the library, please report it via the project's issue tracker.
Author Pull Requests
Code contributions to the Spark-Redis project can be made using pull requests. To submit a pull request:
- Fork this project.
- Make and commit your changes.
- Submit your changes as a pull request.