Generation of Data Sets
- This GitHub repository is a component of the BOM4V project, which aims to demonstrate end-to-end, Spark-based examples of Machine Learning (ML) pipelines, for instance churn detection in the telecom and transport industries.
- Central Maven repository with BOM4V Jar artefacts
- Docker cloud with ready-to-use images
- Generation of Call Detail Records (CDR): https://github.com/RealImpactAnalytics/cdr-generator/tree/master/src/main/scala/Model
Add a dependency on ti-spark-data-generation to the SBT project configuration (typically build.sbt in the project root directory):
libraryDependencies += "org.bom4v.ti" %% "ti-spark-data-generation" % "0.0.1-spark2.3"
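
For context, a minimal build.sbt around that line might look like the sketch below. The project name, version and scalaVersion are assumptions made for illustration, not values taken from this repository. The `%%` operator makes SBT append the Scala binary version of the project to the artefact name, so scalaVersion has to match a Scala version for which the artefact was actually published. Spark 2.3 is built against Scala 2.11, which is why 2.11 is assumed here.

```scala
// build.sbt (illustrative sketch; name, version and scalaVersion are assumptions)
name := "my-data-generation-app"   // hypothetical project name
version := "0.1.0"                 // hypothetical project version

// Assumed: the artefact is published for Scala 2.11, the Scala line of Spark 2.3.
scalaVersion := "2.11.12"

// With %%, SBT resolves this as ti-spark-data-generation_2.11
libraryDependencies += "org.bom4v.ti" %% "ti-spark-data-generation" % "0.0.1-spark2.3"
```

If dependency resolution fails, checking which Scala suffixes are available for the artefact on the Maven repository is a reasonable first step.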