The Scala standard library provides the classes Tuple1 through Tuple22, which let programmers hold a fixed number of items together so they can be passed as a single object. While all the elements of an Array have the same type, a TupleN can mix element types, e.g.
scala> val mytuple = ((2, "Be"), "Or", "Not", (2, "Be"))
mytuple: ((Int, String), String, String, (Int, String)) = ((2,Be),Or,Not,(2,Be))
scala> mytuple._1
res1: (Int, String) = (2,Be)

In this example, mytuple is a Tuple4 that mixes String elements with nested (Int, String) tuples.
The same code using Avro tuples looks like:
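As a further illustration of mixed element types, here is a minimal sketch that uses only standard Scala tuples. The object and value names (TupleBasics, record, and so on) are made up for this example.

```scala
object TupleBasics {
  // A Tuple3 that mixes three element types: (Int, String, Double).
  val record: (Int, String, Double) = (2, "Be", 0.5)

  // Positional accessors are 1-based: _1, _2, _3.
  val first: Int = record._1

  // Pattern matching destructures all elements in one step.
  val (count, word, weight) = record

  // Tuples nest, so an element can itself be a tuple.
  val nested: ((Int, String), String) = ((2, "Be"), "Or")
  val innerWord: String = nested._1._2
}
```

Each element keeps its own static type, so `TupleBasics.first` is an Int and `TupleBasics.word` is a String without any casts.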
scala> val mytuple = AvroTuple4(AvroTuple2(2, "Be"), "Or", "Not", AvroTuple2(2, "Be"))
mytuple: com.github.massie.avrotuples.AvroTuple4[com.github.massie.avrotuples.AvroTuple2[Int,String],String,String,com.github.massie.avrotuples.AvroTuple2[Int,String]] = ((2,Be),Or,Not,(2,Be))
scala> mytuple._1
res0: com.github.massie.avrotuples.AvroTuple2[Int,String] = (2,Be)

Avro tuples is published to Maven Central.
In Maven, use
<dependency>
<groupId>com.github.massie</groupId>
<artifactId>avrotuples_**SCALA_VERSION**</artifactId>
<version>**AVROTUPLES_VERSION**</version>
</dependency>

In sbt, add the line
libraryDependencies += "com.github.massie" %% "avrotuples" % "**AVROTUPLES_VERSION**"
Note that for sbt you don't need to specify the Scala version: the %% operator in the line above appends the project's Scala binary version to the artifact name automatically.
- Avro tuples can serve as a drop-in replacement for Scala tuples
- AvroTuple2 has a swap method just like Tuple2
- All Avro tuples extend ProductN, e.g. AvroTuple1[T1] extends Product1[T1]
- Avro tuples implement Externalizable, making them Java serializable
- Avro tuples can be nested
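
Because Avro tuples extend ProductN and implement Externalizable, generic code written against scala.Product or Java serialization should accept them. The sketch below uses plain Scala tuples as stand-ins so that it runs without the Avro tuples dependency. The object and helper names (ProductAndSerialization, arity, elements, javaRoundTrip) are hypothetical and not part of the library.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object ProductAndSerialization {
  // Helpers that rely only on scala.Product, the common supertype of ProductN.
  def arity(p: Product): Int = p.productArity
  def elements(p: Product): List[Any] = p.productIterator.toList

  // Java serialization round trip. Externalizable extends Serializable,
  // so this bound also admits Externalizable classes.
  def javaRoundTrip[T <: java.io.Serializable](value: T): T = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(value) finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  // Plain Scala tuple used as a stand-in for an Avro tuple (assumption).
  val pair: (Int, String) = (2, "Be")
  val swapped: (String, Int) = pair.swap
  val restored: (Int, String) = javaRoundTrip(pair)
}
```

Passing an AvroTuple2 instead of a plain tuple would be expected to behave the same way, given the features listed above, though that is not verified here.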
Avro tuples implement an interface that allows Avro to (de)serialize them. An Avro serialize/deserialize round trip looks like:
val tuple = AvroTuple2("This", AvroTuple4("That", "and", "the", "other"))
val outTuple = AvroTuple2.fromBytes(tuple.toBytes)
assert(tuple == outTuple)

If you pass Avro tuples to Kryo, the tuple will be (de)serialized in Avro format using the Avro tuple schema.
You can update the values for an Avro tuple without needing to create a new tuple, e.g.
val tuple = AvroTuple2("One", 1L)
assert(tuple._1 == "One")
assert(tuple._2 == 1L)
tuple.update("Two", 2L)
assert(tuple._1 == "Two")
assert(tuple._2 == 2L)

Scala provides syntactic sugar that Avro tuples do not. In Scala, you don't need to write Tuple2("a", "b"); you can just write ("a", "b"). Avro tuples have no such shorthand, so code that uses them is more verbose.
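
To make the sugar concrete, this small sketch shows three spellings of the same standard Scala Tuple2. It uses no Avro tuple code, and the object and value names are illustrative only.

```scala
object TupleSugar {
  // Explicit constructor call with the full type name.
  val explicit: Tuple2[String, String] = Tuple2("a", "b")

  // Parenthesized literal; the type (String, String) is shorthand for Tuple2[String, String].
  val literal: (String, String) = ("a", "b")

  // The -> operator from Predef also builds a Tuple2.
  val arrow: (String, String) = "a" -> "b"
}
```

All three values are equal, which is why the shorter forms are usually preferred for plain Scala tuples.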
For now, Avro tuples can contain null values, strings, booleans, floats, doubles, ints, and longs. Support for more types, such as Option, is planned.
There is a known issue with Avro/Parquet and recursive schemas. AvroTuples use a recursive schema in order to support nesting. If you are using AvroTuples with Parquet, you will need to use the AvroFlatTupleX types, since they have flat schemas.
Avro tuples is released under an Apache 2.0 license.
Pull requests are welcome.