Spark NetSuite Library

Spark Connector for NetSuite is a SOAP web service wrapper around the NetSuite web service published here.


This library requires Spark 2.x or later.

For Spark 1.x support, see the spark1.x branch.


You can link against this library in your program in the following ways:

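Maven Dependency

<dependency>
    <groupId>com.springml</groupId>
    <artifactId>spark-netsuite_2.11</artifactId>
    <version>1.1.0</version>
</dependency>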


SBT Dependency

libraryDependencies += "com.springml" % "spark-netsuite_2.11" % "1.1.0"

Using with Spark shell

This package can be added to Spark using the --packages command line option. For example, to include it when starting the spark shell:

$ bin/spark-shell --packages com.springml:spark-netsuite_2.11:1.1.0


  • Construct a Spark DataFrame from NetSuite data - The user provides a NetSuite web service search request and a list of XPath expressions. The connector runs the search, fetches any further pages of records with searchMoreWithId, evaluates the XPath expressions against each web service response, and builds the DataFrame from the results.
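
The search-then-page flow described above can be sketched as follows. This is only an illustration and not the connector's actual code: the Page case class, the SearchClient trait, and the InMemoryClient stand-in are hypothetical names, and the sketch assumes 1-based page indexes and a search result that reports a search id and a total page count.

```scala
object PagingSketch {
  // Hypothetical shape of one page of search results.
  final case class Page(searchId: String, pageIndex: Int, totalPages: Int, records: Seq[String])

  // Hypothetical client interface standing in for the SOAP operations.
  trait SearchClient {
    def search(request: String, pageSize: Int): Page
    def searchMoreWithId(searchId: String, pageIndex: Int): Page
  }

  // Runs the initial search, then fetches pages 2..totalPages by search id.
  def fetchAll(client: SearchClient, request: String, pageSize: Int = 100): Seq[String] = {
    require(pageSize >= 1 && pageSize <= 1000, "pageSize must be between 1 and 1000")
    val first = client.search(request, pageSize)
    val rest = (2 to first.totalPages).flatMap { index =>
      client.searchMoreWithId(first.searchId, index).records
    }
    first.records ++ rest
  }

  // In-memory stand-in so the sketch can run without a NetSuite account.
  final class InMemoryClient(all: Seq[String]) extends SearchClient {
    private var size = 100

    private def page(index: Int): Page = {
      val pages = all.grouped(size).toVector
      val records = if (pages.isEmpty) Seq.empty[String] else pages(index - 1)
      Page("search-1", index, pages.length, records)
    }

    def search(request: String, pageSize: Int): Page = { size = pageSize; page(1) }
    def searchMoreWithId(searchId: String, pageIndex: Int): Page = page(pageIndex)
  }
}
```

With 250 records and a page size of 100, fetchAll issues one search call followed by two searchMoreWithId calls.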


  • email: NetSuite account user Id
  • password: NetSuite account password
  • account: NetSuite account Id
  • applicationId: NetSuite application Id
  • role: (Optional) NetSuite Role Id. Default value is 3.
  • pageSize: (Optional) Number of records to pull in a single request. Max pageSize is 1000. Default value is 100.
  • request: NetSuite Web Service search request, used to search for records in NetSuite. A sample request is available here
  • recordTagPath: (Optional) XPath of the response element to treat as a record. Default value is //platformCore:record
  • xpathMap: Location of a CSV file containing each fieldName and its XPath. A sample file is available here
  • namespacePrefixMap: Location of a CSV file containing each prefix and its corresponding namespace. A sample file is available here
  • schema: (Optional) Schema to be used for constructing the DataFrame. If not provided, all fields will be of type String.
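
To make the roles of recordTagPath, xpathMap, and namespacePrefixMap more concrete, here is a minimal sketch that uses only the JDK XPath API. It is not the connector's implementation. The sample response, the field names, and the urn:example:* namespace URIs are made up for illustration; real NetSuite responses use their own namespaces. The two maps stand in for the contents of the CSV files.

```scala
import java.io.StringReader
import javax.xml.namespace.NamespaceContext
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.{XPathConstants, XPathFactory}
import org.w3c.dom.NodeList
import org.xml.sax.InputSource

object XPathRecordSketch {
  // Hypothetical prefix -> namespace pairs (the role of namespacePrefixMap).
  val samplePrefixes: Map[String, String] = Map(
    "platformCore" -> "urn:example:core",
    "listRel" -> "urn:example:relationships")

  // Hypothetical fieldName -> XPath pairs (the role of xpathMap),
  // each evaluated relative to one record element.
  val sampleFields: Seq[(String, String)] = Seq(
    "Id" -> "@internalId",
    "Entity" -> "listRel:entityId",
    "Email" -> "listRel:email")

  // Made-up response with two records; the second has no email element.
  val sampleResponse: String = """
    <platformCore:recordList xmlns:platformCore="urn:example:core"
                             xmlns:listRel="urn:example:relationships">
      <platformCore:record internalId="1">
        <listRel:entityId>Acme</listRel:entityId>
        <listRel:email>a@example.com</listRel:email>
      </platformCore:record>
      <platformCore:record internalId="2">
        <listRel:entityId>Globex</listRel:entityId>
      </platformCore:record>
    </platformCore:recordList>""".trim

  // Returns one Map(fieldName -> value) per element matched by recordTagPath.
  def extract(xml: String, recordTagPath: String, fields: Seq[(String, String)],
              prefixes: Map[String, String]): Seq[Map[String, String]] = {
    val factory = DocumentBuilderFactory.newInstance()
    factory.setNamespaceAware(true)
    val doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)))

    val xpath = XPathFactory.newInstance().newXPath()
    xpath.setNamespaceContext(new NamespaceContext {
      def getNamespaceURI(prefix: String): String = prefixes.getOrElse(prefix, "")
      def getPrefix(uri: String): String = null
      def getPrefixes(uri: String): java.util.Iterator[String] =
        java.util.Collections.emptyIterator[String]()
    })

    val records = xpath.evaluate(recordTagPath, doc, XPathConstants.NODESET)
      .asInstanceOf[NodeList]
    (0 until records.getLength).map { i =>
      val record = records.item(i)
      // A path that matches nothing evaluates to an empty string.
      fields.map { case (name, path) => name -> xpath.evaluate(path, record) }.toMap
    }
  }
}
```

Calling XPathRecordSketch.extract(sampleResponse, "//platformCore:record", sampleFields, samplePrefixes) yields one map per record, which corresponds to one row with all-String columns when no schema is supplied.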

Scala API

import org.apache.spark.sql.SQLContext

// Construct Dataframe from NetSuite records
// Search request to be executed against NetSuite Web Service
// Here Customers are fetched
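// The namespace URIs are left empty here; fill in the ones for your NetSuite WSDL version
val request = """
<search xmlns="" xmlns:xsi="">
    <searchRecord xmlns:ns7="" xsi:type="ns7:CustomerSearch">
        <ns7:basic xmlns:ns8="" xsi:type="ns8:CustomerSearchBasic"></ns7:basic>
    </searchRecord>
</search>
"""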

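// sc is the SparkContext provided by spark-shell
val sqlContext = new SQLContext(sc)

// Constructs a DataFrame by executing the search and searchMoreWithId operations.
// The two CSV paths are placeholders; point them at your own files.
val df = sqlContext.read.
    format("com.springml.spark.netsuite").
    option("email", "netsuite_email").
    option("password", "netsuite_password").
    option("account", "netsuite_account").
    option("applicationId", "netsuite_application_id").
    option("request", request).
    option("xpathMap", "/path/to/xpathMap.csv").
    option("namespacePrefixMap", "/path/to/namespacePrefixMap.csv").
    load()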


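R API

# Search request to be executed against NetSuite Web Service
# Here Customers are fetched
netsuite_request <- "<search xmlns=\"\" xmlns:xsi=\"\"><searchRecord xmlns:ns7=\"\" xsi:type=\"ns7:CustomerSearch\"><ns7:basic xmlns:ns8=\"\" xsi:type=\"ns8:CustomerSearchBasic\"></ns7:basic></searchRecord></search>"

# Constructs a DataFrame by executing the search and searchMoreWithId operations (CSV paths are placeholders)
df <- read.df(source="com.springml.spark.netsuite",
              email="netsuite_email", password="netsuite_password",
              account="netsuite_account", applicationId="netsuite_application_id",
              request=netsuite_request,
              xpathMap="/path/to/xpathMap.csv",
              namespacePrefixMap="/path/to/namespacePrefixMap.csv")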

Building From Source

This library is built with SBT, which is automatically downloaded by the included shell script. To build a JAR file, run sbt/sbt package from the project root.