Spark Google AdWords Library
A library for querying Google AdWords data with Apache Spark, for Spark SQL and DataFrames.
This library is tested with Spark 2.1+. It might work on older versions, but those are not supported.
You can link against this library in your program at the following coordinates:
groupId: com.crealytics
artifactId: spark-google-adwords_2.11
version: 0.9.2
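For an sbt build, these coordinates would translate to roughly the following dependency line. This is a sketch that assumes your build sets scalaVersion to a 2.11.x release, so that %% resolves to the _2.11 artifact.

```scala
// build.sbt (sketch): assumes scalaVersion is a 2.11.x release,
// so that %% appends the _2.11 suffix to the artifact name.
libraryDependencies += "com.crealytics" %% "spark-google-adwords" % "0.9.2"
```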
Using with Spark shell
This package can be added to Spark using the --packages command line option. For example, to include it when starting the Spark shell:
Spark compiled with Scala 2.11
$SPARK_HOME/bin/spark-shell --packages com.crealytics:spark-google-adwords_2.11:0.9.2
The data source accepts the following options:
clientId, clientSecret: an OAuth2 client identifier and secret that you can generate for your Google API project.
developerToken: a token that identifies your API activity
refreshToken: a token that you can generate using the AdWordsAuthHelper as shown below. This token represents the user consent to grant access to a certain set of APIs and will be used to generate further, more short-lived access tokens which are actually used to authenticate calls to the AdWords API. For more information also see the official documentation.
reportType: The report type you want to query. Use the same CAPITALS_WITH_UNDERSCORE spelling as in the listing.
clientCustomerId: ID of the account for which you want to query data.
userAgent (optional, default = Spark): An arbitrary user agent that will be used when querying the API.
during (optional, default = LAST_30_DAYS): The time range for which you want to query data. Check the official documentation for allowed values or use StartDate,EndDate for a custom date range.
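A custom date range for the during option could be built with a small helper like the one below. It is only a sketch: DuringRange is a hypothetical name, the yyyyMMdd date format is the one AdWords reports use for custom ranges, and it assumes the library forwards the option value to the API unchanged.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Hypothetical helper, not part of the library.
object DuringRange {
  // BASIC_ISO_DATE renders a LocalDate as yyyyMMdd, e.g. 20170131.
  private val fmt = DateTimeFormatter.BASIC_ISO_DATE

  /** Returns "StartDate,EndDate" for use as the value of the during option. */
  def custom(start: LocalDate, end: LocalDate): String = {
    require(!end.isBefore(start), "end date must not be before start date")
    s"${start.format(fmt)},${end.format(fmt)}"
  }
}
```

The result can then be passed as .option("during", DuringRange.custom(start, end)) in place of a predefined value such as LAST_30_DAYS.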
Generate a refresh token (if you don't have one yet):
import com.crealytics.google.adwords._

val clientId = "123456789123-yourclientid.apps.googleusercontent.com"
val clientSecret = "yourclientsecret-1"
val authHelper = new AdWordsAuthHelper(clientId, clientSecret)
// The next line prints a URL that you have to open in the browser and copy the displayed authentication code
println(authHelper.authorizationUrl)
// Paste the authentication code from the browser window here to get the refresh token
println(authHelper.getRefreshToken("TheAuthenticationTokenFromTheBrowser"))
Create a DataFrame from an AdWords report:
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
val df = sqlContext.read
  .format("com.crealytics.google.adwords")
  .option("clientId", clientId)
  .option("clientSecret", clientSecret)
  .option("developerToken", "YourDeveloperToken")
  .option("refreshToken", "1/YourRefreshToken")
  .option("reportType", "SHOPPING_PERFORMANCE_REPORT")
  .option("clientCustomerId", "1234567890")
  .option("userAgent", "Spark")
  .option("during", "LAST_30_DAYS")
  .load()
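To keep credentials out of source code, the option values could be collected from environment variables before they are handed to the reader. The sketch below is one way to do that. AdWordsOptions and the ADWORDS_* variable names are hypothetical; only the option keys come from the list above.

```scala
// Hypothetical helper, not part of the library.
object AdWordsOptions {
  // Option key -> environment variable name (the variable names are assumptions).
  private val credentialVars = Seq(
    "clientId"       -> "ADWORDS_CLIENT_ID",
    "clientSecret"   -> "ADWORDS_CLIENT_SECRET",
    "developerToken" -> "ADWORDS_DEVELOPER_TOKEN",
    "refreshToken"   -> "ADWORDS_REFRESH_TOKEN"
  )

  // CAPITALS_WITH_UNDERSCORE spelling, e.g. SHOPPING_PERFORMANCE_REPORT.
  private val reportTypePattern = "[A-Z0-9]+(_[A-Z0-9]+)*"

  /** Returns the option map, or a message describing what is missing or malformed. */
  def fromEnv(
      env: Map[String, String],
      reportType: String,
      clientCustomerId: String): Either[String, Map[String, String]] = {
    val missing = credentialVars.collect { case (_, name) if !env.contains(name) => name }
    if (missing.nonEmpty)
      Left(s"missing environment variables: ${missing.mkString(", ")}")
    else if (!reportType.matches(reportTypePattern))
      Left(s"reportType must use CAPITALS_WITH_UNDERSCORE spelling: $reportType")
    else {
      val credentials = credentialVars.map { case (key, name) => key -> env(name) }.toMap
      Right(credentials ++ Map("reportType" -> reportType, "clientCustomerId" -> clientCustomerId))
    }
  }
}
```

With sys.env as the first argument, a successful result can be passed to the reader in one call through DataFrameReader's options(Map) method, instead of chaining individual option calls.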
Building From Source
This library is built with SBT. To build a JAR file, simply run sbt assembly from the project root. The build configuration includes support for Scala 2.11.