Digital Substrate Model¶
Introduction¶
Data modeling is carried out using the DSM (Digital Substrate Model) language. This DSL (Domain-Specific Language) lets you define the elements involved in the design of a data model. With the DSM, you define aggregates of information that are linked by unique keys to build complex structured data.
To define a structured, flexible data model, the DSM introduces five fundamental notions.
- Namespace: a space where types are defined.
- Concept: an abstract thing (like an abstract Type)
- Key: a way to identify the instantiation of a Concept (like a UUID<Concept>)
- Document: a piece of information expressible in the type system (ex: struct Document { ... })
- Attachment: a way to associate a Document with a Key (like a map<Key, Document>)
Identifier terminology:
- Namespace ID: the UUID of the namespace, used as the seed to generate the Runtime IDs.
- Runtime ID: the UUID that the runtime computes for a concept, a club, a struct, an enum or an attachment from its definition.
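To illustrate the seeding idea, the sketch below derives a deterministic identifier from a namespace UUID and the text of a definition. It relies on Python's standard `uuid.uuid5` as a stand-in: the source does not say how the runtime actually computes a Runtime ID, and `runtime_id` is a hypothetical helper introduced only for this example.

```python
import uuid

# Namespace ID taken from the examples in this document.
NAMESPACE_ID = uuid.UUID("f529bc42-0618-4f54-a3fb-d55f95c5ad03")


def runtime_id(namespace_id: uuid.UUID, definition: str) -> uuid.UUID:
    """Hypothetical helper: derive a stable ID from a namespace seed and a definition.

    uuid5 is an assumption; the real runtime may use a different scheme.
    """
    return uuid.uuid5(namespace_id, definition)


user_id = runtime_id(NAMESPACE_ID, "concept User")
material_id = runtime_id(NAMESPACE_ID, "concept Material")

# The same seed and definition always give the same ID.
assert user_id == runtime_id(NAMESPACE_ID, "concept User")
# Different definitions give different IDs.
assert user_id != material_id
```

The point of the sketch is only that an identifier derived this way is reproducible on every machine that shares the same namespace and definition.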
Concept and key¶
A concept is generally an abstract term from the data domain to be modeled. It is an abstraction that only takes a concrete form in a specific use.
Material, Surface, Light are concepts.
Concepts can also be strongly related to each other by abstraction: a Standard Material is a Material, and a Matte Material is also a Material.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
// The concept of a user
concept User;
// The concept of a material and related concepts
concept Material;
concept MaterialStandard is a Material;
concept MaterialMatte is a Material;
};
A concrete example of a key (an instance of a concept), in pseudocode:
UserKey user_key = UserKey.create()
MaterialStandardKey ms_key = MaterialStandardKey.create()
MaterialKey m_key(ms_key) # always valid: a MaterialStandard is a Material
optional<MaterialStandardKey> o_ms_key = m_key.asMaterialStandardKey() # has a value only if m_key refers to a MaterialStandard
optional<MaterialMatteKey> o_mm_key = m_key.asMaterialMatteKey() # has a value only if m_key refers to a MaterialMatte
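The conversions above can be sketched in Python as follows. This is a minimal illustration, not the generated API: the `Key` class, the `PARENT` table and `as_key_of` are hypothetical names, and the key simply remembers the concept it was created for.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical table mirroring "concept X is a Y" from the DSM example.
PARENT = {"MaterialStandard": "Material", "MaterialMatte": "Material"}


def is_a(concept: str, ancestor: str) -> bool:
    """Return True if `concept` is `ancestor` or derives from it."""
    current: Optional[str] = concept
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False


@dataclass(frozen=True)
class Key:
    """An instance ID tagged with the concept it was created for."""
    concept: str
    instance_id: uuid.UUID = field(default_factory=uuid.uuid4)

    def as_key_of(self, concept: str) -> Optional["Key"]:
        """Return this key if its concept is compatible with `concept`, else None."""
        return self if is_a(self.concept, concept) else None


ms_key = Key("MaterialStandard")
m_key = ms_key.as_key_of("Material")          # upcast: always succeeds
o_ms_key = m_key.as_key_of("MaterialStandard")  # downcast: succeeds here
o_mm_key = m_key.as_key_of("MaterialMatte")     # downcast: fails, returns None
```

The upcast never fails because the relation is declared in the model, while a downcast has to be checked against the concept the key was actually created for.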
Structure¶
A Structure is an aggregate of fields.
The type of a field can be another Structure. However, this composition mechanism cannot result in a recursive definition.
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct Login {
string nickname;
string password;
};
};
Attachment¶
An attachment associates a document with a key.
The attachment can be seen as a map<Key, Document> for a specific usage.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
concept User;
struct Login {
string nickname;
string password;
};
struct Identity {
string firstName;
string lastName;
};
attachment<User, Login> login;
attachment<User, Identity> identity;
// The document can be of any type
attachment<User, vector<string>> comments;
attachment<User, set<string>> tags;
};
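The map<Key, Document> view of an attachment can be sketched with plain Python dictionaries. This is only an analogy under stated assumptions: a raw UUID stands in for a User key, and the dataclasses are hand-written stand-ins for the documents, not generated code.

```python
import uuid
from dataclasses import dataclass
from typing import Dict


@dataclass
class Login:
    nickname: str = ""
    password: str = ""


@dataclass
class Identity:
    firstName: str = ""
    lastName: str = ""


# One mapping per attachment; a plain UUID stands in for a User key here.
login: Dict[uuid.UUID, Login] = {}
identity: Dict[uuid.UUID, Identity] = {}

user_key = uuid.uuid4()
login[user_key] = Login("ada", "secret")
identity[user_key] = Identity("Ada", "Lovelace")

# The same key gives access to independent documents.
print(login[user_key].nickname, identity[user_key].lastName)
```

Because each attachment is its own mapping, a new kind of document can be associated with User later without touching Login or Identity.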
Definitions¶
The data model is defined by writing one or more .dsm files that declare concepts, documents and attachments.
Usually, a schema seals all the definitions, so the evolution of the data model relies on schema versions.
In our approach, each definition is sealed by its declaration in a namespace, and the schema is replaced by a set of immutable definitions. There is therefore no schema version.
The data model is therefore ultimately a set of sealed definitions.
Feature Associated with a Data Model¶
We have defined a data model, but we need to use it in various environments. One environment requires a compiled language like C++, another environment requires an interpreted language like Python.
We know that the data model needs a concrete implementation in each environment. For an imperative programming language, we can map the data model to classes and containers. But we also need other features such as serialization, persistence, binary representation and JSON representation, and we know that writing this code is repetitive, boring, error-prone and costly to maintain.
We can drastically reduce this pain if we have access to a technology where:
repetitive_code_for_a_feature = feature(data_model)
For a C++ environment and some required features:
- cpp_code_for_classes = cpp_feature_data(data_model)
- cpp_code_for_serialization = cpp_feature_serialization(data_model)
- cpp_code_for_persistence = cpp_feature_persistence(data_model)
- ...
But we can do better if we can abstract the feature as well.
repetitive_code_for_a_feature = generate_code(feature, data_model)
For a C++ environment and some required features:
- cpp_code_for_classes = generate(cpp_feature_data, data_model)
- cpp_code_for_serialization = generate(cpp_feature_serialization, data_model)
- cpp_code_for_persistence = generate(cpp_feature_persistence, data_model)
- ...
And for a Python environment and some required features:
- python_code_for_classes = generate(python_feature_data, data_model)
- python_code_for_persistence = generate(python_feature_persistence, data_model)
- ...
This technology is kibo, which uses the StringTemplate engine to implement a templated feature and the DSM to describe the data model.
code_for_a_feature = renderer(templated_feature, dsm_definitions)
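The idea of rendering a templated feature against definitions can be sketched in a few lines. Note the assumptions: Python's standard `string.Template` is used here in place of StringTemplate, and the `definition` dictionary and both templates are hypothetical simplifications, not the kibo template model.

```python
from string import Template

# Hypothetical, simplified description of one DSM struct.
definition = {
    "name": "Login",
    "fields": [("std::string", "nickname"), ("std::string", "password")],
}

# Hypothetical templates for a "C++ struct" feature.
STRUCT_TEMPLATE = Template("struct $name {\n$fields\n};\n")
FIELD_TEMPLATE = Template("    $type $name;")


def render(struct_template: Template, field_template: Template, definition: dict) -> str:
    """Render source code for one feature from one definition."""
    fields = "\n".join(
        field_template.substitute(type=cpp_type, name=name)
        for cpp_type, name in definition["fields"]
    )
    return struct_template.substitute(name=definition["name"], fields=fields)


cpp_code = render(STRUCT_TEMPLATE, FIELD_TEMPLATE, definition)
print(cpp_code)
```

Swapping the templates changes the feature (for example a serializer instead of a struct) while the definitions stay untouched, which is the separation the formula above expresses.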
Here is a partial list of templated features for C++:
./templates/cpp/Data/ # Type Implementation for concept, club, enum and struct
./templates/cpp/Model/ # The definitions of the model
./templates/cpp/Database/ # The persistence layer based on sqlite3
./templates/cpp/Stream/ # The encoder/decoder for types
./templates/cpp/Commit/ # The Commit API
...
See the documentation of kibo to generate code and the documentation of Kibo Template Model to create your own Templated Feature.
In a dynamic environment like Python, definitions are used as metadata to implement a feature without code generation.
See Getting Started With DSM for a concrete use of DSM for templated features and Getting Started With Viper to use Definitions in a dynamic environment.
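As a rough picture of the dynamic approach, the sketch below implements a JSON feature once, driven by a definition that is available at run time. The metadata layout (`LOGIN_DEFINITION`) and the `to_json` function are hypothetical; the actual Viper Python API is not shown in this document.

```python
import json

# Hypothetical metadata for the Login struct: field names in declaration order.
LOGIN_DEFINITION = {"name": "Login", "fields": ["nickname", "password"]}


def to_json(definition: dict, document: dict) -> str:
    """Serialize a document by walking its definition instead of per-type code."""
    payload = {name: document[name] for name in definition["fields"]}
    return json.dumps({definition["name"]: payload})


print(to_json(LOGIN_DEFINITION, {"nickname": "ada", "password": "secret"}))
```

The same function works for any other definition passed to it, so no code has to be generated per type.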
Type System for Modeling a Data Model¶
The modeling of the concrete data is carried out by the definition of a structure.
The type system exposes standard primitive types like boolean, integer, real, string, uuid, enumeration, structure and generic containers like vector, set, map, optional...
When defining a structure, it is possible to declare the default value for certain types. If the value is not specified, the field is initialized with the zero of the type.
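The default-value rule can be pictured with a hand-written Python dataclass. This is an analogy, not generated code, and it assumes that the zero of bool is false and the zero of string is the empty string; the field names are borrowed from the examples in the following sections.

```python
from dataclasses import dataclass


@dataclass
class S:
    # Mirrors DSM fields such as: uint16 f_ui16 = 42; int32 f_i32; bool f_b1; ...
    f_ui16: int = 42         # explicit default from the definition
    f_i32: int = 0           # no initializer: zero of the type
    f_b1: bool = False       # no initializer: zero of the type (assumed false)
    f_s1: str = ""           # no initializer: zero of the type (assumed empty)
    f_double: float = 42.0   # explicit default from the definition


s = S()
print(s)
```

Every field therefore has a well-defined value as soon as the structure is created, whether or not an initializer was written.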
To introduce the DSM language and illustrate all available types, we specify the mapping for C++ generated code.
Primitive Types¶
Boolean¶
The type bool is mapped to bool.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct S {
bool f_b1;
bool f_b2 = true;
};
};
Integers¶
Integer types uint8, uint16... are mapped to std::uint8_t, std::uint16_t ...
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct S {
uint8 f_ui8;
uint16 f_ui16 = 42;
uint32 f_ui32;
uint64 f_ui64;
int8 f_i8;
int16 f_i16 = -42;
int32 f_i32;
int64 f_i64;
};
};
Real¶
Real types float, double are mapped to float, double.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct S {
float f_float;
double f_double = 42.0;
};
};
String¶
The type string is mapped to std::string and the content must be UTF-8 encoded.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct S {
string f_s1;
string f_s2 = "42";
};
};
UUID¶
The type uuid is mapped to viper::UUId, which is a platform-independent UUID4
implementation.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct S {
uuid f_uuid1;
uuid f_uuid2 = {8f2586fc-735b-48ca-8d32-3b7545f65cd6};
};
};
BlobId¶
The type blob_id is mapped to Viper::BlobId, which is like a uuid but has a
specific meaning for the persistence layer: the value of a blob_id must reference an
existing blob in the persistence layer.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct S {
blob_id vertices;
blob_id normals;
};
};
Blob¶
The type blob is mapped to std::vector<std::uint8_t> and is used to encode small
binary data.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct Image {
uint16 width;
uint16 height;
blob thumbnail; // Small data (32x32 ARGB)
blob_id pixels; // Big data
};
};
Enumeration¶
The type enum is mapped to an enum class without explicit values or a specified storage size.
An enum is limited to 256 cases.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
enum MaterialMirrorOutboundSceneColor {
black,
background,
environment
};
struct MaterialMirrorProperties {
MaterialMirrorOutboundSceneColor outboundSceneColor = .black;
};
};
Structure¶
The type struct is mapped to a struct.
Structures are composable, and each field can have an optional initializer.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct Vector {
float x; // initialized to 0
float y = 0;
float z;
};
struct Transform {
Vector translation; // initialized to {0.0, 0.0, 0.0};
Vector orientation = {0.0, 0.0, 0.0};
Vector scaling = {1.0, 1.0, 1.0};
};
};
Since a structure cannot be recursive, we must use an optional<key<T>> to create a relation
in a recursive data model.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct ConfigurationExpressionProperties {
ConfigurationExpressionOperationType operationType = .defined;
optional<key<ConfigurationExpression>> leftExpressionKey;
optional<key<ConfigurationExpression>> rightExpressionKey;
string symbol;
};
};
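The sketch below shows why referencing children by key is enough to model a tree. It is a simplified Python analogy: raw UUIDs stand in for key<ConfigurationExpression>, a dictionary stands in for an attachment, and `add` and `render` are hypothetical helpers.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ExpressionProperties:
    symbol: str = ""
    # Children are referenced by key, never embedded, so the type is not recursive.
    left: Optional[uuid.UUID] = None
    right: Optional[uuid.UUID] = None


# Stand-in for an attachment: expression key -> properties.
expressions: Dict[uuid.UUID, ExpressionProperties] = {}


def add(symbol: str, left: Optional[uuid.UUID] = None,
        right: Optional[uuid.UUID] = None) -> uuid.UUID:
    """Store a new expression and return its key."""
    key = uuid.uuid4()
    expressions[key] = ExpressionProperties(symbol, left, right)
    return key


def render(key: Optional[uuid.UUID]) -> str:
    """Follow the keys to rebuild the tree as text."""
    if key is None:
        return ""
    node = expressions[key]
    if node.left is None and node.right is None:
        return node.symbol
    return f"({render(node.left)} {node.symbol} {render(node.right)})"


root = add("and", add("A"), add("or", add("B"), add("C")))
print(render(root))  # (A and (B or C))
```

The recursion lives in the data (keys pointing to other entries), not in the type definition, which stays a flat structure.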
Mathematical types¶
Vec¶
The type vec<T, n> is mapped to std::array<T, n> and T must be a numeric type
(integers or reals).
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct Mesh {
vector<vec<float, 3>> positions;
vector<vec<float, 2>> uvs;
};
};
Mat¶
The type mat<T, columns, rows> is mapped to std::array<std::array<T, rows>, columns>
and T must be a numeric type (integers or reals).
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct S {
mat<float, 4, 4> transform;
};
};
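The nesting in this mapping means that the outer index selects a column and the inner index selects a row. The short Python sketch below mirrors that layout with nested lists; the dimensions and values are arbitrary and chosen only for illustration.

```python
COLUMNS, ROWS = 2, 3

# Outer list = columns, inner list = rows, as in an array of `columns` arrays of `rows` values.
m = [[0.0 for _ in range(ROWS)] for _ in range(COLUMNS)]

# Set the element at column 1, row 2.
m[1][2] = 5.0

# Flattening visits each column in turn (column-major order).
flat = [value for column in m for value in column]
print(flat)  # [0.0, 0.0, 0.0, 0.0, 0.0, 5.0]
```

In the flattened sequence, element (column, row) sits at offset column * ROWS + row.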
Generic Types¶
Generic containers are mapped to STL containers.
Vector¶
The vector<T> is mapped to std::vector<T>.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct CameraGroupProperties {
string name = "Camera Group";
vector<CameraGroup> groups;
vector<Camera> cameras;
};
};
Set¶
The set<T> is mapped to std::set<T>.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct SurfaceProperties {
...
set<string> tags;
...
};
};
Map¶
The map<K, V> is mapped to std::map<K, V>.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct EnvironmentLayerProperties {
...
map<Surface, Environment> environmentAssignments;
...
};
};
Optional¶
The optional<T> is mapped to std::optional<T>.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct S {
string firstname;
string lastname;
optional<string> nickname;
};
};
Tuple¶
The tuple<T0, ...> is mapped to std::tuple<T0, ...>.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct S {
tuple<float, float> location;
};
};
Variant¶
The variant<T0, ...> is mapped to std::variant<T0, ...>.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct MaterialMultilayer {
...
vector<variant<LayerIllumination, LayerDiffuse, LayerSpecular>> layers;
...
};
};
XArray¶
An xarray<T> is like a vector<T>, but it is designed to preserve the order of elements during
collaborative mutations by addressing elements by position rather than by index.
The xarray<T> is mapped to Viper::xarray<T>.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct S {
...
xarray<string> comments;
...
};
};
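To show why positions behave better than indexes under concurrent edits, the sketch below uses fractional positions. This is a substitute technique chosen for illustration: the source does not describe how Viper::xarray assigns positions, and `position_between` and `values` are hypothetical helpers.

```python
from fractions import Fraction
from typing import List, Optional, Tuple

# Each element carries a position; the visible order is the sort order of positions.
Element = Tuple[Fraction, str]


def position_between(before: Optional[Fraction], after: Optional[Fraction]) -> Fraction:
    """Pick a position strictly between two neighbours (None means no neighbour)."""
    if before is None and after is None:
        return Fraction(0)
    if before is None:
        return after - 1
    if after is None:
        return before + 1
    return (before + after) / 2


def values(elements: List[Element]) -> List[str]:
    """Return the values ordered by position."""
    return [value for _, value in sorted(elements)]


a = (position_between(None, None), "a")
c = (position_between(a[0], None), "c")
shared = [a, c]

# Two collaborators edit concurrently; each edit is expressed by position, not by index.
edit_1 = (position_between(a[0], c[0]), "b")  # insert between "a" and "c"
edit_2 = (position_between(c[0], None), "d")  # append after "c"

# The merged result does not depend on the order in which the edits arrive.
merged_1 = values(shared + [edit_1, edit_2])
merged_2 = values(shared + [edit_2, edit_1])
print(merged_1)  # ['a', 'b', 'c', 'd']
```

With index-based edits, "insert at index 1" and "insert at index 2" would have to be rebased against each other; with positions, each edit keeps its meaning whatever else was applied first.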
Any¶
The type any is a special type mapped to Viper::Any. It requires the dynamic aspect of the
Viper runtime to be used in C++.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
attachment<User, any> documents;
};
Key¶
key<T> is a built-in type that distinguishes a key (for a concept, a club
or any_concept) from the other concrete data types.
Concept¶
A concept is generally an abstract term from the Data Domain to be modeled. Given its
totally abstract nature, there is no concrete implementation of a concept. Only a
concrete implementation of a <Concept>Key exists for an instance of the concept.
<Concept>Key is a sort of UUID<Concept>.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
concept Material;
concept MaterialMatte is a Material;
concept MaterialEnvironment is a Material;
struct MaterialAssignment {
optional<key<Material>> material;
Transform transform;
int8 uvSet = 0;
};
struct MaterialGroupProperties {
string name = "Material Group";
vector<MaterialGroup> groups;
vector<key<Material>> materials;
};
};
Club¶
A club groups concepts that are not related by abstraction.
A concept becomes a member of a club through the membership keyword.
A concept can be a member of several clubs.
Given its totally abstract nature, there is no concrete implementation of a
club. Only the concrete implementation of a <Club>Key exists to represent an
instance of a concept in a club.
<Club>Key is a sort of UUID<Concept,...> over all the members of the club.
// DSM
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
// unrelated abstraction
concept GeometryLayer;
concept AspectLayer;
concept PositionLayer;
concept EnvironmentLayer;
club ConfigurationTargetElement;
membership ConfigurationTargetElement GeometryLayer;
membership ConfigurationTargetElement AspectLayer;
membership ConfigurationTargetElement PositionLayer;
membership ConfigurationTargetElement EnvironmentLayer;
struct ConfigurationTarget {
...
optional<key<ConfigurationTargetElement>> element;
...
};
};
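Club membership can be pictured as a check performed when a concept key is used where a club key is expected. The Python sketch below is a hand-written illustration: `CLUBS`, `ConceptKey`, `ClubKey` and `as_club_key` are hypothetical names, not the generated API.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

# Mirrors the membership statements of the DSM example.
CLUBS: Dict[str, Set[str]] = {
    "ConfigurationTargetElement": {
        "GeometryLayer", "AspectLayer", "PositionLayer", "EnvironmentLayer",
    },
}


@dataclass(frozen=True)
class ConceptKey:
    concept: str
    instance_id: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass(frozen=True)
class ClubKey:
    club: str
    member: ConceptKey


def as_club_key(key: ConceptKey, club: str) -> Optional[ClubKey]:
    """Wrap a concept key as a club key if the concept is a member of the club."""
    return ClubKey(club, key) if key.concept in CLUBS[club] else None


layer = ConceptKey("AspectLayer")
element = as_club_key(layer, "ConfigurationTargetElement")   # accepted
rejected = as_club_key(ConceptKey("Material"), "ConfigurationTargetElement")  # None
```

A field typed with the club key can then hold any member concept while still rejecting keys of unrelated concepts.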
Attachment¶
An attachment associates a key with a document for a specific usage.
The concrete implementation of an attachment is like a Mapping API.
namespace Tuto {f529bc42-0618-4f54-a3fb-d55f95c5ad03} {
struct Mipmap {
int32 width = 0;
int32 height = 0;
blob_id blob_image;
};
concept BumpMap;
struct BumpMapProperties {
string name = "BumpMap";
optional<key<Thumbnail>> thumbnailKey;
vector<Mipmap> mipmaps;
};
attachment<BumpMap, BumpMapProperties> properties;
};
See Getting Started with DSM for a concrete exploration.
What Next ...¶
Install the Visual Studio Code Extension and examine the definitions we used for our demo applications.
- ./dsm_samples/RE contains the definitions extracted from Lumiscaphe P3D to implement the Raptor Editor Demo, based on a legacy model with only one attachment per concept.
- ./dsm_samples/ge contains the definitions used for the Graph Editor Demo, with many attachments per concept, which is the recommended design for a fresh project. The folder also contains definitions for function pools and commit function pools that expose the C++ business logic to an embedded Python interpreter or a service.