A library for reading social data from Instagram using Spark Streaming.
Run a demo via:
# set up all the requisite environment variables
export INSTAGRAM_AUTH_TOKEN="..."
# compile scala, run tests, build fat jar
sbt assembly
# run locally
java -cp target/scala-2.11/streaming-instagram-assembly-0.0.7.jar InstagramDemo standalone
# run on spark
spark-submit --class InstagramDemo --master "local[2]" target/scala-2.11/streaming-instagram-assembly-0.0.7.jar spark
Instagram doesn't expose a firehose API, so we resort to polling. The InstagramReceiver pings the Instagram API every few seconds and pushes any new images into Spark Streaming for further processing.
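The poll-and-forward idea can be sketched in plain Scala as below. This is a minimal illustration, not the library's actual InstagramReceiver: the Image case class, the DedupPoller class, and the fetch and emit parameters are all hypothetical names. In a real Spark Streaming receiver, emit would roughly correspond to calling Receiver.store.

```scala
import scala.collection.mutable

// Hypothetical image record; the library's real model is not shown here.
final case class Image(id: String, url: String)

// Polls `fetch` and forwards only images whose id has not been seen before.
// `fetch` and `emit` are illustrative parameters, not part of the library's API.
final class DedupPoller(fetch: () => Seq[Image], emit: Image => Unit) {
  // Ids already forwarded downstream (grows without bound in this sketch).
  private val seen = mutable.Set.empty[String]

  // One polling round: returns how many new images were emitted.
  def pollOnce(): Int = {
    // mutable.Set.add returns true only when the id was not yet present.
    val fresh = fetch().filter(img => seen.add(img.id))
    fresh.foreach(emit)
    fresh.size
  }

  // Repeats pollOnce, sleeping between rounds, until `keepRunning` returns false.
  def run(intervalMs: Long, keepRunning: () => Boolean): Unit =
    while (keepRunning()) {
      pollOnce()
      Thread.sleep(intervalMs)
    }
}
```

Tracking seen ids is one simple way to decide what counts as "new"; a long-running job would likely need to bound that set, for example by keeping only recent ids.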
Currently, the following ways to read images are supported:
- by location (sample data)
- by tag (sample data)
- by user (sample data)
To publish a release to Sonatype:
- Configure your credentials via the SONATYPE_USER and SONATYPE_PASSWORD environment variables.
- Update version.sbt
- Enter the SBT shell: sbt
- Run sonatypeOpen "enter staging description here"
- Run publishSigned
- Run sonatypeRelease