# Architecture

To illustrate how the services are tied together, it is important to first know
where the data is stored, what it looks like, which service owns it, and
whether the data is considered public. This is because the ties between
services can be interpreted as the flow of data.
## Data model

All data is stored in PostgreSQL. The reasons for this are simplicity and a
low maintenance burden. We have simplicity because all data is in one place
where it can be cross-referenced easily and transformed using SQL (which is
also very capable of dealing with semi-structured data nowadays). We have a
low maintenance burden because the PostgreSQL instance is shared with other
Blender services, so we only pay that cost once.

All models are owned by `/website`, meaning that all models are defined by this
service. This implies that we consider PostgreSQL to be part of this service.
### Main models

There are currently two main models around which the whole system is built.
These models are called `RawBenchmarks` and `Benchmarks`.
#### RawBenchmarks

`RawBenchmarks` is a semi-structured model which consists of system information
and benchmark times. It is only semi-structured because we want the data
samples to be immutable while still allowing the schema to change over time.
Immutability is important for transparency and reproducibility. It is
[defined](website/opendata_main/models/benchmarks.py) using Django. This model
is considered **public**, and all users can download daily snapshots of the
whole table.
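To make the idea concrete, here is a minimal sketch of what one immutable,
semi-structured sample might look like. The field names and the
`schema_version` field are illustrative assumptions, not the actual schema
(which lives in `website/opendata_main/models/benchmarks.py`):

```python
import json

# Illustrative only: a schema version field is one common way to let
# immutable samples coexist with a schema that evolves over time.
raw_benchmark = {
    "schema_version": 3,
    "system_info": {
        "os": "Linux-5.15",
        "devices": [{"name": "Example GPU", "type": "CUDA"}],
    },
    "benchmark_times": [
        {"scene": "classroom", "render_time_seconds": 142.7},
    ],
}

# Samples are stored as-is and never mutated; serializing to JSON mirrors
# how a semi-structured column would hold them.
serialized = json.dumps(raw_benchmark, sort_keys=True)
```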
#### Benchmarks

`Benchmarks` is the structured and indexed model derived from `RawBenchmarks`,
used for display and querying purposes. It contains the subset of information
we need to visualize the results, as well as the verified and anonymity status
of the corresponding user. You might ask yourself why we do not just use
`RawBenchmarks` directly. The answer is that working with semi-structured data
becomes complex very quickly, so we isolate this complexity by centralizing
the data normalization. This model is
[defined](website/opendata_main/migrations/0002_benchmarks.sql) using SQL.
Since this model is derived directly from `RawBenchmarks`, it is also
considered **public**.
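The normalization itself is done in SQL (see the migration linked above), but
the idea can be sketched in a few lines of Python. The field names below are
illustrative assumptions, not the actual schema:

```python
def normalize(raw: dict) -> list[dict]:
    """Flatten one semi-structured RawBenchmark into flat, indexable rows.

    A sketch of the kind of transformation the SQL migration performs;
    field names here are hypothetical.
    """
    device_names = [d["name"] for d in raw["system_info"]["devices"]]
    return [
        {
            "scene": t["scene"],
            "render_time_seconds": t["render_time_seconds"],
            "device_name": ", ".join(device_names),
        }
        for t in raw["benchmark_times"]
    ]

rows = normalize({
    "system_info": {"devices": [{"name": "Example GPU"}]},
    "benchmark_times": [{"scene": "classroom", "render_time_seconds": 142.7}],
})
```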
### User/authentication models

Because benchmarks are run and owned by users, we inevitably need to model
them. All models pertaining to users are considered **private** unless the
user explicitly wants their data to be public.
#### Users

`Users` are provided by Django together with Blender ID.
#### RawBenchmarkOwnership

`RawBenchmarkOwnership` is used to track ownership of benchmarks. It contains
nothing more than a pointer to a `User` and a `RawBenchmark`. A natural
question to ask is: why not just make a `RawBenchmark` point to a `User`
directly? The answer is that this information is **private**. We want to be
able to accommodate anonymous submissions, and in order to do so we keep
private and public information separate on a data level. It is
[defined](website/opendata_main/models/benchmarks.py) using Django.
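The effect of this separation can be sketched as follows: the public table
carries no user pointer at all, and a display name is only resolved through
the private link table when the owner has opted out of anonymity. All names
and data shapes below are hypothetical:

```python
# Hypothetical data shapes: public benchmark rows carry no user pointer;
# the private ownership table links benchmark ids to user ids.
public_benchmarks = [{"id": 1, "scene": "classroom"}, {"id": 2, "scene": "bmw27"}]
private_ownership = {1: "alice", 2: "bob"}          # benchmark id -> user id
user_is_anonymous = {"alice": False, "bob": True}   # mirrors a UserSettings flag

def display_name(benchmark_id: int) -> str:
    """Resolve a name for display; anonymous or unowned benchmarks stay hidden."""
    user = private_ownership.get(benchmark_id)
    if user is None or user_is_anonymous[user]:
        return "Anonymous"
    return user
```

Publishing `public_benchmarks` wholesale never leaks ownership, because the
link lives in a separate, private structure.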
#### UserSettings

`UserSettings` is the model which contains the _Open Data specific_ settings
for a given `User`. It contains a flag signalling whether a user wants their
data to be anonymous and a flag signalling whether we (Blender) trust the data
of that specific user. If we trust a user's data, that user is called a
**verified user**. It is
[defined](website/opendata_main/models/user_settings.py) using Django. The
anonymity and verified flags, when provided decoupled from the user, are
considered **public**.
#### LauncherAuthenticationTokens

`LauncherAuthenticationTokens` are tokens that are used to authenticate the
`/launcher` for a specific user. They contain a pointer to a `User` and a
secret value, knowledge of which implies valid authentication for that user.
It is [defined](website/opendata_main/models/tokens.py) using Django.
### Metadata models

We are dealing with multiple Blender versions, scenes, benchmark scripts and
launchers. To simplify dealing with all of these, we store some centralized
metadata. All metadata consists of at least a URL where the artifact can be
obtained and its checksum. All metadata is considered to be **public**.
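Since every metadata record carries a URL and a checksum, a client can verify
any downloaded artifact before using it. A minimal sketch, assuming SHA-256
checksums (the actual hash algorithm used is not specified here):

```python
import hashlib

def verify_artifact(data: bytes, expected_sha256: str) -> bool:
    """Check downloaded bytes against the checksum from a metadata record."""
    return hashlib.sha256(data).hexdigest() == expected_sha256

# Example: compute a checksum the way the metadata record would store it.
payload = b"example artifact bytes"
checksum = hashlib.sha256(payload).hexdigest()
```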
#### Launchers

`Launchers` contains metadata for a specific version of `/launcher`. In
addition to a URL and a checksum, it also contains a flag signalling whether
this launcher is still supported. This allows us to enforce a minimum version
and to point the user to where to get the latest version. It is
[defined](website/opendata_main/models/metadata.py) using Django.
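How the `supported` flag enables minimum-version enforcement can be sketched
as follows. The record shape, URLs, and version strings are hypothetical, and
real code would parse versions rather than compare them as strings:

```python
# Hypothetical records mirroring the Launchers model: URL, checksum omitted,
# and a `supported` flag per launcher version.
launchers = [
    {"version": "1.0.0", "supported": False, "url": "https://example.org/1.0.0"},
    {"version": "1.1.0", "supported": True, "url": "https://example.org/1.1.0"},
]

def check_launcher(version: str) -> tuple[bool, str]:
    """Return whether `version` is still supported, plus the URL of the
    latest supported launcher to point the user at."""
    # String comparison only works for these toy version numbers.
    latest = max((l for l in launchers if l["supported"]),
                 key=lambda l: l["version"])
    supported = any(l["version"] == version and l["supported"]
                    for l in launchers)
    return supported, latest["url"]
```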
#### BenchmarkScripts

`BenchmarkScripts` contains metadata for a specific version of
`/benchmark_script`. It is [defined](website/opendata_main/models/metadata.py)
using Django.
#### Scenes

`Scenes` contains metadata for a specific Blender scene for which a benchmark
can be run. It is [defined](website/opendata_main/models/metadata.py) using
Django.
#### BlenderVersion

`BlenderVersion` contains metadata about a specific version of Blender.
Besides information about the Blender version itself, it also points to a
required `BenchmarkScript` and one or more available `Scenes`. It is
[defined](website/opendata_main/models/metadata.py) using Django.
## Data flow

Now that we know what the data looks like, we can talk about how and where it
is exchanged between services.
### User flow

`Users` are created and authenticated by deferring to Blender ID. Once a
`User` is created, a corresponding `UserSettings` instance is
[created](website/opendata_main/signals.py) using a Django signal.
### Launcher authentication flow

When authenticating, the following steps take place:

1. The `/launcher` connects to `/launcher_authenticator` using a WebSocket and
   asks for a new token.
2. In response, the `/launcher` receives a new (unverified) token and a URL
   pointing to `/website` at which the user can verify the token.
3. After sending the response, the `/launcher_authenticator` waits for the
   token to be verified using `LISTEN` in PostgreSQL.
4. The `/launcher` directs the user to the verification URL and starts waiting
   on a verification signal from the `/launcher_authenticator`.
5. After the user verifies the token, `/website` updates the token as
   belonging to the user and being verified, and notifies the
   `/launcher_authenticator` of this fact using `NOTIFY` in PostgreSQL.
6. As soon as the `/launcher_authenticator` receives this signal, it forwards
   it to the `/launcher` together with the name and email of the user.
7. To protect against the possibility of the verification URL being leaked,
   the `/launcher` asks the user to confirm the name and email.
8. If the user confirms, the token is saved locally and can be used for
   authenticating the `/launcher` when submitting benchmarks.
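The core of this handshake can be sketched as a toy in-process simulation. In
the real system the pieces are separate services communicating over a
WebSocket and PostgreSQL `LISTEN`/`NOTIFY`; here a `threading.Event` stands in
for `NOTIFY`, and all function names are hypothetical:

```python
import secrets
import threading

tokens: dict[str, dict] = {}
verified = threading.Event()  # stands in for NOTIFY waking the LISTEN-er

def launcher_authenticator_issue() -> tuple[str, str]:
    """Issue a new unverified token plus the URL where the user verifies it."""
    token = secrets.token_hex(16)
    tokens[token] = {"user": None}
    return token, f"https://example.org/verify/{token}"

def website_verify(token: str, user: str) -> None:
    """What /website does when the user verifies the token in the browser."""
    tokens[token]["user"] = user   # token now belongs to the user
    verified.set()                 # "NOTIFY": wake the waiting authenticator

token, url = launcher_authenticator_issue()
website_verify(token, "alice")     # the user follows `url` and confirms
verified.wait(timeout=1)           # "LISTEN": authenticator sees verification
owner = tokens[token]["user"]      # forwarded to the launcher for confirmation
```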
### Benchmark flow

When the user starts the `/launcher`, all metadata is fetched from `/website`.
After the user chooses the Blender versions and scenes, all required assets
are downloaded by the `/launcher` according to this metadata. Once all assets
are in place, the `/benchmark_script` is invoked within the requested Blender
version. The `/benchmark_script` gathers the required system information and
starts the benchmark while reporting progress to the `/launcher`. After the
benchmark is complete, the `/benchmark_script` sends all gathered information
to the `/launcher`. The `/launcher` then submits the resulting `RawBenchmark`
to `/website` using the token obtained in the
[launcher authentication flow](#launcher-authentication-flow). As soon as
`/website` inserts the `RawBenchmark` into PostgreSQL, a
[trigger](website/opendata_main/migrations/0002_benchmarks.sql) fires which
creates the corresponding indexed `Benchmark`.
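The trigger pattern at the end of this flow can be illustrated in miniature.
The actual trigger is PostgreSQL SQL in the migration linked above; SQLite is
used here only so the sketch is runnable, and the table and column names are
hypothetical:

```python
import sqlite3

# Inserting a raw row automatically derives an indexed row, mirroring how the
# PostgreSQL trigger creates a Benchmark from each new RawBenchmark.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE raw_benchmarks (id INTEGER PRIMARY KEY, scene TEXT, payload TEXT);
CREATE TABLE benchmarks (id INTEGER PRIMARY KEY, scene TEXT);
CREATE TRIGGER derive_benchmark AFTER INSERT ON raw_benchmarks
BEGIN
    INSERT INTO benchmarks (id, scene) VALUES (NEW.id, NEW.scene);
END;
""")
con.execute("INSERT INTO raw_benchmarks (scene, payload) VALUES ('classroom', '{}')")
scene = con.execute("SELECT scene FROM benchmarks").fetchone()[0]
```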
### UserSettings anonymity/verified flow

If the anonymity/verified flag is toggled on a `UserSettings` instance, a
[trigger](website/opendata_main/migrations/0002_benchmarks.sql) fires in
PostgreSQL which updates all corresponding `Benchmarks`.