# Architecture

To illustrate how the services are tied together, it is important to first know where the data is stored, what it looks like, which service owns it, and whether it is considered public. This is because the ties between the services can be interpreted as the flow of data.
## Data model

All data is stored in PostgreSQL. The reasons for this are simplicity and a low maintenance burden. We get simplicity from the fact that all data is in one place, where it can be cross-referenced easily and transformed using SQL (which nowadays is also very capable of dealing with semi-structured data). We get a low maintenance burden because the PostgreSQL instance is shared with other Blender services, so we only pay that cost once.
All models are owned by `/website`, meaning that all models are defined by this service. This implies that we consider PostgreSQL to be part of this service.
### Main models

There are currently two main models around which the whole system is built. These models are called `RawBenchmarks` and `Benchmarks`.
#### RawBenchmarks

`RawBenchmarks` is a semi-structured model which consists of system information and benchmark times. The reason it is only semi-structured is that we want the data samples to be immutable, but still allow us to change the schema over time. Immutability is important for transparency and reproducibility. It is defined using Django. This model is considered public, and all users can download daily snapshots of the whole table.
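As a rough sketch (not the actual schema; field names here are assumptions), such a model could combine a free-form JSON payload with application-level immutability:

```python
from django.db import models


class RawBenchmark(models.Model):
    # Semi-structured payload: system information plus benchmark times.
    # A JSON column lets the schema evolve without rewriting stored rows.
    data = models.JSONField()
    created_at = models.DateTimeField(auto_now_add=True)

    def save(self, *args, **kwargs):
        # Enforce immutability at the application level: inserts only.
        if self.pk is not None:
            raise ValueError("RawBenchmark rows are immutable")
        super().save(*args, **kwargs)
```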
#### Benchmarks

`Benchmarks` is the structured and indexed model derived from `RawBenchmarks`, which is used for display and querying purposes. It contains the subset of information we need to visualize the results, as well as the verified and anonymity status of the corresponding user. You might ask yourself why we do not just use `RawBenchmarks` directly. The answer is that working with semi-structured data becomes complex very quickly; we isolate this complexity by centralizing the data normalization. This model is defined using SQL. Since this model is derived directly from `RawBenchmarks`, it is also considered public.
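Since the model is defined using SQL rather than Django, one plausible way to set it up is a raw-SQL migration along these lines (the column names, and the `website_rawbenchmark` table name Django would generate by default, are illustrative assumptions):

```python
from django.db import migrations

CREATE_BENCHMARKS = """
CREATE TABLE benchmarks (
    id BIGSERIAL PRIMARY KEY,
    raw_benchmark_id BIGINT NOT NULL REFERENCES website_rawbenchmark (id),
    device_name TEXT,                 -- extracted from the raw JSON payload
    render_time_seconds DOUBLE PRECISION,
    is_verified BOOLEAN NOT NULL DEFAULT FALSE,
    is_anonymous BOOLEAN NOT NULL DEFAULT TRUE
);
CREATE INDEX benchmarks_device_name_idx ON benchmarks (device_name);
"""


class Migration(migrations.Migration):
    dependencies = [("website", "0001_initial")]
    operations = [migrations.RunSQL(CREATE_BENCHMARKS, "DROP TABLE benchmarks;")]
```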
### User/authentication models

Because benchmarks are run and owned by users, we inevitably need to model them. All models pertaining to users are considered private unless the user explicitly wants their data to be public.
#### Users

`Users` are provided by Django together with Blender ID.
#### RawBenchmarkOwnership

`RawBenchmarkOwnership` is used to track ownership of benchmarks. It contains nothing more than a pointer to a `User` and a `RawBenchmark`. A natural question to ask is: why not just make a `RawBenchmark` point to a `User` directly? The answer is that this information is private. We want to be able to accommodate anonymous submissions, and in order to do so we keep private and public information separate at the data level. It is defined using Django.
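A minimal sketch of such a link table, assuming the field names used here:

```python
from django.conf import settings
from django.db import models


class RawBenchmarkOwnership(models.Model):
    # Keeping this pointer in a separate, private table is what makes
    # anonymous submissions possible: RawBenchmark itself never names a user.
    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    raw_benchmark = models.OneToOneField("RawBenchmark", on_delete=models.CASCADE)
```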
#### UserSettings

`UserSettings` is the model which contains the Open Data specific settings for a given `User`. It contains a flag signalling whether the user wants their data to be anonymous and a flag signalling whether we (Blender) trust the data of that specific user. If we trust their data, the user is called a verified user. It is defined using Django. The anonymity and verified flags are considered public, as long as they are provided decoupled from the user.
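A sketch of what this model could look like (the flag names are assumptions):

```python
from django.conf import settings
from django.db import models


class UserSettings(models.Model):
    user = models.OneToOneField(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    is_anonymous = models.BooleanField(default=True)   # hide identity in public data
    is_verified = models.BooleanField(default=False)   # Blender trusts this user's data
```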
#### LauncherAuthenticationTokens

`LauncherAuthenticationTokens` are tokens that are used to authenticate the `/launcher` for a specific user. They contain a pointer to a `User` and a secret value, knowledge of which implies valid authentication for that user. The model is defined using Django.
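A sketch, assuming the secret is generated server-side; note the nullable user pointer, since a token is created before anyone has verified it (see the launcher authentication flow below):

```python
import secrets

from django.conf import settings
from django.db import models


def generate_secret():
    # Hypothetical helper: 32 random bytes, URL-safe encoded.
    return secrets.token_urlsafe(32)


class LauncherAuthenticationToken(models.Model):
    # Filled in only once the user has verified the token on /website.
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL, null=True, on_delete=models.CASCADE
    )
    secret = models.CharField(max_length=64, unique=True, default=generate_secret)
    is_verified = models.BooleanField(default=False)
```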
### Metadata models

We are dealing with multiple Blender versions, scenes, benchmark scripts, and launchers. To simplify dealing with all of these, we store some centralized metadata. All metadata consists of at least a URL of where to obtain the artifact and its checksum. All metadata is considered to be public.
#### Launchers

`Launchers` contains metadata for a specific version of `/launcher`. In addition to a URL and a checksum, it also contains a flag signalling whether this launcher is still supported. This allows us to enforce a minimum version and to point the user to where to get the latest version. It is defined using Django.
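As a sketch, the shared URL-plus-checksum pattern could look like this (`BenchmarkScripts` and `Scenes` below would follow the same shape, minus the support flag):

```python
from django.db import models


class Launcher(models.Model):
    version = models.CharField(max_length=32)
    url = models.URLField()                      # where to download this version
    checksum = models.CharField(max_length=64)   # e.g. a SHA-256 hex digest
    is_supported = models.BooleanField(default=True)
```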
#### BenchmarkScripts

`BenchmarkScripts` contains metadata for a specific version of `/benchmark_script`. It is defined using Django.
#### Scenes

`Scenes` contains metadata for a specific Blender scene for which a benchmark can be run. It is defined using Django.
#### BlenderVersions

`BlenderVersions` contains metadata about a specific version of Blender. Besides information about the Blender version, it also points to a required `BenchmarkScript` and one or more available `Scenes`. It is defined using Django.
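A sketch of those relationships, reusing the hypothetical metadata fields from above:

```python
from django.db import models


class BlenderVersion(models.Model):
    version = models.CharField(max_length=32)
    url = models.URLField()
    checksum = models.CharField(max_length=64)
    # Every Blender version requires one script but can offer several scenes.
    benchmark_script = models.ForeignKey("BenchmarkScript", on_delete=models.PROTECT)
    scenes = models.ManyToManyField("Scene")
```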
## Data flow

Now that we know what the data looks like, we can talk about how and where it is exchanged between services.
### User flow

`Users` are created and authenticated by deferring to Blender ID. Once a `User` is created, a corresponding `UserSettings` instance is created using a Django signal.
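The signal hookup could look roughly like this (the receiver name and module layout are assumptions):

```python
from django.conf import settings
from django.db.models.signals import post_save
from django.dispatch import receiver

from .models import UserSettings  # the settings model described above


@receiver(post_save, sender=settings.AUTH_USER_MODEL)
def create_user_settings(sender, instance, created, **kwargs):
    # Every freshly created User gets default (anonymous, unverified) settings.
    if created:
        UserSettings.objects.create(user=instance)
```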
### Launcher authentication flow

When authenticating, a `/launcher` connects to `/launcher_authenticator` using a WebSocket and asks for a new token. In response the launcher receives a new (unverified) token and a URL pointing to `/website` at which the user can verify the token. After sending the response, the `/launcher_authenticator` waits for the token to be verified using `LISTEN` in PostgreSQL. The `/launcher` directs the user to the verification URL and starts waiting on a verification signal from the `/launcher_authenticator`. After the user verifies the token, `/website` updates the token as belonging to the user and being verified, and notifies the `/launcher_authenticator` of this fact using `NOTIFY` in PostgreSQL. As soon as the `/launcher_authenticator` receives this signal, it forwards it to the `/launcher` together with the name and email of the user. Next, to protect against the possibility of the verification URL being leaked, the `/launcher` asks for confirmation of the name and email. If the user confirms, the token is saved locally and can be used for authenticating the `/launcher` when submitting benchmarks.
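On the `/launcher_authenticator` side, the `LISTEN` half of this handshake could be implemented with psycopg2 roughly as follows; the channel name `token_verified`, the DSN, and the payload format are assumptions:

```python
import select

import psycopg2

conn = psycopg2.connect("dbname=opendata")  # hypothetical DSN
conn.autocommit = True  # LISTEN/NOTIFY should run outside a transaction

cur = conn.cursor()
cur.execute("LISTEN token_verified;")

while True:
    # Block until the connection's socket becomes readable, then drain
    # any notifications that /website raised with NOTIFY.
    if select.select([conn], [], [], 60.0) == ([], [], []):
        continue  # timeout, keep waiting
    conn.poll()
    while conn.notifies:
        notification = conn.notifies.pop(0)
        token_id = notification.payload  # sent along with the NOTIFY
        # ...look up the token, then signal the waiting /launcher over
        # its WebSocket with the user's name and email.
```

`/website` would then fire the matching `NOTIFY token_verified, '<id>'` (or `SELECT pg_notify(...)`) inside the transaction that marks the token as verified.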
### Benchmark flow

When the user starts the `/launcher`, all metadata is fetched from `/website`. After the user chooses the Blender versions and scenes, all required assets are downloaded by the `/launcher` according to this metadata. Once all assets are in place, the `/benchmark_script` is invoked within the requested Blender version. The `/benchmark_script` gathers the required system information and starts the benchmark while reporting progress to the `/launcher`. After the benchmark is complete, the `/benchmark_script` sends all gathered information to the `/launcher`. The `/launcher` then submits the resulting `RawBenchmark` to `/website` using the token obtained in the launcher authentication flow. As soon as `/website` inserts the `RawBenchmark` into PostgreSQL, a trigger fires which creates the corresponding indexed `Benchmark`.
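That trigger could be installed through another raw-SQL migration; the extraction paths into the JSON payload below are purely illustrative, standing in for the centralized normalization logic:

```python
from django.db import migrations

CREATE_TRIGGER = """
CREATE FUNCTION create_benchmark() RETURNS trigger AS $$
BEGIN
    INSERT INTO benchmarks (raw_benchmark_id, device_name, render_time_seconds)
    VALUES (
        NEW.id,
        NEW.data #>> '{system_info,device_name}',
        (NEW.data #>> '{timing,render_time_seconds}')::double precision
    );
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER rawbenchmark_insert
    AFTER INSERT ON website_rawbenchmark
    FOR EACH ROW EXECUTE FUNCTION create_benchmark();
"""


class Migration(migrations.Migration):
    dependencies = [("website", "0002_benchmarks")]
    operations = [migrations.RunSQL(CREATE_TRIGGER)]
```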
### UserSettings anonymity/verified flow

If the anonymity or verified flag is toggled on a `UserSettings` instance, a trigger fires in PostgreSQL which updates all corresponding `Benchmarks`.
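A sketch of that trigger, installed via `migrations.RunSQL` just like the one above and reusing the hypothetical table and column names from the earlier sketches (the ownership table is what joins the user's flags to their derived rows):

```python
PROPAGATE_FLAGS = """
CREATE FUNCTION propagate_user_flags() RETURNS trigger AS $$
BEGIN
    UPDATE benchmarks b
       SET is_verified = NEW.is_verified,
           is_anonymous = NEW.is_anonymous
      FROM website_rawbenchmarkownership o
     WHERE o.user_id = NEW.user_id
       AND b.raw_benchmark_id = o.raw_benchmark_id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER usersettings_update
    AFTER UPDATE ON website_usersettings
    FOR EACH ROW EXECUTE FUNCTION propagate_user_flags();
"""
```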