Normalized Database Schema Tables for JPL's Ontological Modeling Language (OML)
This project specifies a set of normalized schema tables for JPL's Ontological Modeling Language. By normalized schema tables, we mean precisely a database schema in 4th Normal Form.
This schema is intended to be a single source of truth for technology-neutral data interchange of OMF models. By technology-neutral data interchange, we mean the separation between:
- the specification of the data to be exchanged among tools,
- the representation of this data in a particular technology stack.
Normalized schema tables specify the shape of the data to be exchanged in terms of tables with single-valued columns. This means that for each table:
- each column specifies a simple attribute typed by a scalar datatype (e.g. string, integer, boolean, ...)
- there are no "multiple values" in a given table row; instead, multiple values are represented as multiple rows.
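The two rules above can be sketched in Scala. This is only an illustration: `ConceptWithSynonyms`, `ConceptRow`, `SynonymRow` and `Normalize` are hypothetical names and are not part of the OML schema. The sketch shows a multi-valued attribute being replaced by one row per value in a separate table.

```scala
// Hypothetical example; none of these names come from the OML schema.

// Denormalized shape: one object carries a multi-valued attribute.
final case class ConceptWithSynonyms(uuid: String, name: String, synonyms: List[String])

// Normalized shape: every column is a scalar value.
final case class ConceptRow(uuid: String, name: String)
// One row per (concept, synonym) pair instead of a list-valued column.
final case class SynonymRow(conceptUUID: String, synonym: String)

object Normalize {
  // Split the denormalized objects into two tables with single-valued columns.
  def normalize(cs: Seq[ConceptWithSynonyms]): (Seq[ConceptRow], Seq[SynonymRow]) = {
    val concepts = cs.map(c => ConceptRow(c.uuid, c.name))
    val synonyms = for { c <- cs; s <- c.synonyms } yield SynonymRow(c.uuid, s)
    (concepts, synonyms)
  }
}
```

A concept with two synonyms yields one `ConceptRow` and two `SynonymRow` entries; a concept without synonyms yields no `SynonymRow` at all.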
The representation of these normalized schema tables is deliberately left open to leverage various technologies and serializations. In particular, representation technologies include but are not limited to:
In particular, serializations include but are not limited to:
- RDF (RDF/XML, RDF/JSON, N-Triples, ...)
- OWL (OWL/XML, RDF/XML, Manchester, ...)
The reference serialization for OMF normalized schema tables data is JSON in the following format:
- Each row of a table is a single-line JSON tuple of name/value pairs, one pair for each table column.
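As a rough illustration of this format, the sketch below writes one table row as a single JSON line. It is hand-rolled to stay self-contained and is not the project's actual serializer; `JsonLine` and the column names used in the usage note are hypothetical.

```scala
object JsonLine {
  // Escape the characters that JSON requires to be escaped inside a string.
  def escape(s: String): String = {
    val sb = new StringBuilder
    s.foreach {
      case '"'          => sb.append("\\\"")
      case '\\'         => sb.append("\\\\")
      case '\n'         => sb.append("\\n")
      case '\r'         => sb.append("\\r")
      case '\t'         => sb.append("\\t")
      case c if c < ' ' => sb.append("\\u").append("%04x".format(c.toInt))
      case c            => sb.append(c)
    }
    sb.toString
  }

  // Only scalar values are accepted: no arrays, no nested objects.
  def value(v: Any): String = v match {
    case null       => "null"
    case s: String  => "\"" + escape(s) + "\""
    case b: Boolean => b.toString
    case n: Int     => n.toString
    case n: Long    => n.toString
    case n: Double  => n.toString
    case other      => throw new IllegalArgumentException(s"Not a scalar value: $other")
  }

  // One row becomes one line: an ordered tuple of name/value pairs.
  def row(columns: Seq[(String, Any)]): String =
    columns.map { case (k, v) => "\"" + escape(k) + "\":" + value(v) }.mkString("{", ",", "}")
}
```

For instance, `JsonLine.row(Seq("uuid" -> "u1", "isAbstract" -> true))` produces a single line with no indentation and no nested structure, which is what keeps the format line-oriented.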
GIT and *.omlzip archives
For change management purposes, the *.omlzip serialization format yields the following benefits:
Minimal, local formatting
Every serialization involves some kind of formatting. XMI and XText serializations of OML involve global formatting: the serialization of OML ModuleElements is indented in the serialization of the containing OML Module.
*.omlzip uses local formatting because each OML object is serialized in a single JSON line. This local formatting is minimal because it involves a tuple of name/value pairs where a value can be a string, a number, a boolean or null (no array values, no nested JSON objects).
Minimal, local formatting speeds up serialization because it eliminates the overhead of indentation inherent in global formatting.
Simple & precise comparisons
Global formatting complicates comparison because there are two sources of differences to consider:
- differences in the internal representation of an object
- differences in the global context where an object is serialized
The computational complexity of comparing globally formatted representations varies with the particular format. For OML, the XMI and XText serializations are inherently tree-based; that is, they are labeled, ordered trees. Many algorithms exist for comparing such trees; depending on the properties of the tree, their worst-case time complexity can be as high as O(n^4). The Robust Tree Edit Distance (RTED) algorithm achieves an optimal worst-case complexity of O(n^3).
With *.omlzip, comparison has linear worst-case complexity of O(n), since each JSON-lines file is sorted and each line is a flat, ordered list of name/value pairs. Furthermore, each of the 66 JSON-lines files corresponds to one of the 66 concrete OML metaclasses. Therefore, additions/deletions in a particular JSON-lines file correspond to creating/deleting instances of the corresponding OML concrete metaclass.
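The linear comparison can be sketched as a single merge-style pass over two sorted files. This is a minimal sketch, not the project's tooling: `SortedLineDiff` is a hypothetical name, and it assumes both files are sorted by plain string order, which may differ from the sort key *.omlzip actually uses.

```scala
import scala.annotation.tailrec

object SortedLineDiff {
  final case class Diff(removed: Vector[String], added: Vector[String])

  // Both inputs must already be sorted by String.compareTo (an assumption of this sketch).
  // Each step advances at least one index, so the number of line comparisons
  // is at most oldLines.length + newLines.length.
  def diff(oldLines: IndexedSeq[String], newLines: IndexedSeq[String]): Diff = {
    @tailrec
    def loop(i: Int, j: Int, removed: Vector[String], added: Vector[String]): Diff =
      if (i >= oldLines.length) Diff(removed, added ++ newLines.drop(j))
      else if (j >= newLines.length) Diff(removed ++ oldLines.drop(i), added)
      else {
        val c = oldLines(i).compareTo(newLines(j))
        if (c == 0) loop(i + 1, j + 1, removed, added)                 // same line in both files
        else if (c < 0) loop(i + 1, j, removed :+ oldLines(i), added)  // only in the old file
        else loop(i, j + 1, removed, added :+ newLines(j))             // only in the new file
      }
    loop(0, 0, Vector.empty, Vector.empty)
  }
}
```

Because each line is one serialized object, every entry in `removed` or `added` maps directly to a deleted or created instance of the metaclass that the file represents.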
To configure GIT for simple & precise comparison of *.omlzip files in a GIT project:
- Add the following to
[diff "zip"]
    textconv = unzip -c -q
- Add the following to
*.omlzip archives may have the same contents (as seen by unzip -c -q) but the timestamps in the ZIP archive may differ.
For example, suppose a GIT repository has a file, example.omlzip. If nothing has changed in the OML contents and a new archive overwrites the existing one, then GIT may see a modification (due to the difference in timestamps in the ZIP metadata) but diffing the contents should confirm there is no significant change.
$ git status
modified: *.omlzip
$ git diff
$
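The same effect can be reproduced outside GIT. The sketch below uses the JDK's java.util.zip classes to build two in-memory archives that differ only in their entry timestamps and then compares them by content, which is roughly what the unzip -c -q text conversion achieves. `ZipContents` and the entry name used in the usage note are hypothetical.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.util.zip.{ZipEntry, ZipInputStream, ZipOutputStream}

object ZipContents {
  // Build an in-memory ZIP archive; `time` (epoch milliseconds) is stamped on every entry.
  def archive(entries: Seq[(String, String)], time: Long): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val zip = new ZipOutputStream(bytes)
    entries.foreach { case (name, text) =>
      val e = new ZipEntry(name)
      e.setTime(time)
      zip.putNextEntry(e)
      zip.write(text.getBytes("UTF-8"))
      zip.closeEntry()
    }
    zip.close()
    bytes.toByteArray
  }

  // Read entry names and uncompressed contents, ignoring timestamps and other ZIP metadata.
  def contents(archive: Array[Byte]): Map[String, String] = {
    val zip = new ZipInputStream(new ByteArrayInputStream(archive))
    val result = Iterator
      .continually(zip.getNextEntry)
      .takeWhile(_ != null)
      .map { e =>
        val out = new ByteArrayOutputStream()
        val buf = new Array[Byte](4096)
        Iterator.continually(zip.read(buf)).takeWhile(_ != -1).foreach(n => out.write(buf, 0, n))
        e.getName -> out.toString("UTF-8")
      }
      .toMap
    zip.close()
    result
  }
}
```

Two archives built from the same entries with different `time` values differ as byte arrays, yet `ZipContents.contents` returns equal maps for both, mirroring the empty git diff shown above.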
- Add the following to
Scala as a single-source of truth
The OML normalized schema tables are specified in the Scala programming language:
- each table is a Scala case class
- each table column is an immutable field of a Scala case class
The source code for these normalized schema tables was generated from the OML Specification.
Cross-compiling this project results in three distinct libraries:
A JVM library for writing pure Java applications, mixed Java/Scala applications, or pure Scala applications.
Polyglot interoperability of the OMF Schema tables.
Java & Scala seem to be OK.
IntelliJ IDEA (2016 & later)
- Import the GitHub project from existing sources as an SBT project.
IntelliJ will import the root project (tablesRoot) and the two cross-build variants (tablesJS, tablesJVM). Since all the IntelliJ-specific metadata can be re-created by simply importing the project, it is unnecessary to store this metadata in GitHub.
- It is possible to work using both IntelliJ IDEA and the SBT CLI in a terminal.
Eclipse Neon.3 Installation
Use the Eclipse Marketplace and search for 'Scala'. Install all components of Scala IDE 4.2.x
Install the following components by going to Install New Software and searching for...
- Eclipse EMF SDK
- Eclipse EMF XCore SDK
- Eclipse Xtend SDK
- Eclipse SDK
- Eclipse EMF Parsley CDO
- Eclipse EMF/MWE2 runtime & language
There are several projects related to the OMF Schema Tables:
This is the Eclipse XCore specification of the OMF Schema Tables. This project needs to be checked out.
This is a set of 3 Eclipse XTend code generators: 1 for the OMF Schema Tables and 2 for the OMF Schema Resolvers.
Why is Eclipse so complicated?
Unfortunately, Eclipse lacks good support for SBT projects of any kind. The Eclipse-specific metadata was initially generated with sbt eclipse and subsequently edited as follows:
Fix the Eclipse resource links in jvm/.project to use location-neutral paths:
<linkedResources>
  <link>
    <name>jpl.omf.schema.tables-shared-src-main-scala</name>
    <type>2</type>
    <location>PARENT-1-PROJECT_LOC/shared/src/main/scala</location>
  </link>
  <link>
    <name>jpl.omf.schema.tables-shared-src-test-scala</name>
    <type>2</type>
    <locationURI>PARENT-1-PROJECT_LOC/shared/src/test/scala</locationURI>
  </link>
</linkedResources>
Define an Eclipse Classpath variable, IVY_CACHE, for the location of the Ivy cache used by SBT (typically,
Fix the Eclipse library paths in jvm/.classpath to use the IVY_CACHE variable:
<classpathentry kind="var" path="IVY_CACHE/org.scala-js/scalajs-library_2.11/jars/scalajs-library_2.11-0.6.12.jar"/>
<classpathentry kind="var" path="IVY_CACHE/com.lihaoyi/upickle_sjs0.6_2.11/jars/upickle_sjs0.6_2.11-0.4.1.jar"/>
...

<classpathentry kind="var" path="IVY_CACHE/org.scala-lang.modules/scala-xml_2.11/bundles/scala-xml_2.11-1.0.2.jar"/>
<classpathentry kind="var" path="IVY_CACHE/com.lihaoyi/upickle_2.11/jars/upickle_2.11-0.4.1.jar"/>
...
The Eclipse metadata files should be properly generated with sbt eclipse; the above is a workaround! Do not update this project's dependencies by editing the Eclipse metadata files alone; instead, update the SBT configuration and then either use sbt eclipse + post-editing or update the Eclipse metadata files accordingly.
Go to Configure Contents and then tick the checkbox for Error/Warnings on Project
Publishing to & resolving from bintray.com as a scoped NPM package.
Publishing a scoped NPM package is important for using a combination of multiple NPM repositories for resolving NPM packages:
- Unscoped packages are resolved against the default NPM repository.
- Scoped packages are resolved against a scoped entry in the project or user's
@imce:registry=https://api.bintray.com/npm/jpl-imce/gov.nasa.jpl.imce.npm/
//api.bintray.com/npm/jpl-imce/gov.nasa.jpl.imce.npm/:username=nrouquette
//api.bintray.com/npm/jpl-imce/gov.nasa.jpl.imce.npm/:_authToken=<base64 API key>
//api.bintray.com/npm/jpl-imce/gov.nasa.jpl.imce.npm/:firstname.lastname@example.org
//api.bintray.com/npm/jpl-imce/gov.nasa.jpl.imce.npm/:always-auth=true
Testing JS library
sbt fullOptJS
node shared/test/js/index