GPencil: Update-on-write using an update cache #95401

New Issue

Falk David · 2022-02-01T16:56:26+01:00

Falk David commented

2022-02-01 16:56:26 +01:00

This is a follow-up design of our prototype D13534.

What is the update cache?

In short: The update cache stores what elements have changed since the last time the eval object was updated. Additionally, the update cache can now store multiple updates to the data and minimizes the number of elements that need to be copied.

Motivation

In our previous proposal (#93934) the elements that changed were tagged in the original data-block and had to be searched for when the update of the eval object took place. This means that in the worst case, the entire data-block had to be iterated through (into all the layers, into all frames and over all the strokes). Our profiling showed that close to 100% of the time was simply spent iterating over the linked-lists checking if something had been tagged. The update cache is our attempt at solving this problem.

Tagging

To make use of the update cache, we need to tag elements that need to be updated. This is done using dedicated BKE functions.

Full update tag (BKE_gpencil_tag_full_update)
The full update tag will update the element itself, but also all of its content. E.g. a full update tag on a layer will not only copy the layer, but also the entire frames and strokes in that layer.

Light update tag (BKE_gpencil_tag_light_update)
The light update is a shallow type of update: it will only update the properties of an element, leaving its underlying content unchanged. E.g a light copy can be used to indicate that a layer's name has been modified or that a keyframe selection status has changed.

Explanation of the Update Cache

Here is a diagram of what an update cache might look like after some operation on the grease pencil data happend:

{F12841726, width=100%}

As we can see, the cache resembles a tree. This is because the grease pencil data can be divided up into four distinct “levels”. The upper-most level is the whole data-block, then below are layers and then frames and finally strokes. Each one of those levels fully contains the others below, hence the tree structure.

An update cache node is composed of 4 things:

The index of the cached element in the linked-list. E.g. when a node caches an update of a frame, the index is where that frame is in the linked-list of all the frames in its layer. In the diagram, this is the number inside the node.
A flag indicating the type of update: full, light, or no update.
A data pointer to the element that is cached. The pointer might be null if the node does not cache an update for this element, but it will still store its index. These nodes are indicated using dashed lines in the diagram. Nodes that do have a valid data pointer are indicated with a bold circle for light updates and a square for full updates. These are the nodes that were tagged for an update.
The children nodes. This is an ordered map with the key being the index and the value being an update cache node. This also means that an update cache node can have 0 or more children (except if it is on the lowest level, e.g. a stroke).

To minimize the number of updates needed, the update cache does the following optimization: If a node in the update cache tree indicates a full update it mustn't have any children.

This can be done, because the element that a node represents must contain all the data that is represented by the children nodes. So for example, during the update_on_write, if a layer was tagged and needs to be copied, all of the data contained in that layer will of course be copied as well. So there is no need to e.g. tag a frame in that layer, because it would be copied twice.

So to summarize our update cache tree has these properties:

Nodes can have 0 or more children.
Only the leaf nodes cache elements that need a full update.
Any node may cache elements that require a light update.

We use an ordered map for the children of the nodes to be able to do a single iteration over the eval data-block. Since the indices are sorted, we can always find the next element by iterating forwards.

Updating the eval object

With the update cache, the original object is no longer copied to replace the current eval object (copy-on-write). Instead, the eval object is updated, meaning that only parts will be copied over from the original (we call this update-on-write). The assumption here is that these parts were correctly tagged and nothing else changed. This then guarantees that the indices in our cache are the same on the eval object so we can use them to find the elements that need to be replaced. Updating then becomes almost trivial. We just iterate over the levels in the update cache tree, iterate over the children nodes, check if they need to be replaced and then do so. All the information needed to perform the replacement is stored in the node.

Implementation

Update cache patch: D13984

Further improvements

A lot of the lag in heavier files is due to the global undo. The proposal #95450 is addressing this.

Co-authored-by: @yann-lty

This is a follow-up design of our prototype [D13534](https://archive.blender.org/developer/D13534). ## What is the update cache? In short: The update cache stores what elements have changed since the last time the eval object was updated. Additionally, the update cache can now store multiple updates to the data and minimizes the number of elements that need to be copied. ## Motivation In our previous proposal (#93934) the elements that changed were tagged in the original data-block and had to be searched for when the update of the eval object took place. This means that in the worst case, the entire data-block had to be iterated through (into all the layers, into all frames and over all the strokes). Our profiling showed that close to 100% of the time was simply spent iterating over the linked-lists checking if something had been tagged. The update cache is our attempt at solving this problem. ## Tagging To make use of the update cache, we need to tag elements that need to be updated. This is done using dedicated BKE functions. **Full update tag (`BKE_gpencil_tag_full_update`)** The full update tag will update the element itself, but also all of its content. E.g. a full update tag on a layer will not only copy the layer, but also the entire frames and strokes in that layer. **Light update tag (`BKE_gpencil_tag_light_update`)** The light update is a shallow type of update: it will only update the properties of an element, leaving its underlying content unchanged. E.g a light copy can be used to indicate that a layer's name has been modified or that a keyframe selection status has changed. ## Explanation of the Update Cache Here is a diagram of what an update cache might look like after some operation on the grease pencil data happend: {[F12841726](https://archive.blender.org/developer/F12841726/GPencil_Update_Cache_Diagram.drawio__2_.png), width=100%} As we can see, the cache resembles a tree. This is because the grease pencil data can be divided up into four distinct “levels”. The upper-most level is the whole data-block, then below are layers and then frames and finally strokes. Each one of those levels fully contains the others below, hence the tree structure. An update cache node is composed of 4 things: - The **index** of the cached element in the linked-list. E.g. when a node caches an update of a frame, the index is where that frame is in the linked-list of all the frames in its layer. In the diagram, this is the number inside the node. - A **flag** indicating the type of update: full, light, or no update. - A **data pointer** to the element that is cached. The pointer might be null if the node does not cache an update for this element, but it will still store its index. These nodes are indicated using dashed lines in the diagram. Nodes that do have a valid data pointer are indicated with a bold circle for light updates and a square for full updates. These are the nodes that were tagged for an update. - The **children** nodes. This is an ordered map with the key being the index and the value being an update cache node. This also means that an update cache node can have 0 or more children (except if it is on the lowest level, e.g. a stroke). To minimize the number of updates needed, the update cache does the following optimization: If a node in the update cache tree indicates a full update it mustn't have any children. This can be done, because the element that a node represents must contain all the data that is represented by the children nodes. So for example, during the `update_on_write`, if a layer was tagged and needs to be copied, all of the data contained in that layer will of course be copied as well. So there is no need to e.g. tag a frame in that layer, because it would be copied twice. So to summarize our update cache tree has these properties: - Nodes can have 0 or more children. - Only the leaf nodes cache elements that need a full update. - Any node may cache elements that require a light update. We use an ordered map for the children of the nodes to be able to do a single iteration over the eval data-block. Since the indices are sorted, we can always find the next element by iterating forwards. ## Updating the eval object With the update cache, the original object is no longer copied to replace the current eval object (copy-on-write). Instead, the eval object is updated, meaning that only parts will be copied over from the original (we call this **update-on-write**). The assumption here is that these parts were correctly tagged and nothing else changed. This then guarantees that the indices in our cache are the same on the eval object so we can use them to find the elements that need to be replaced. Updating then becomes almost trivial. We just iterate over the levels in the update cache tree, iterate over the children nodes, check if they need to be replaced and then do so. All the information needed to perform the replacement is stored in the node. ## Implementation Update cache patch: [D13984](https://archive.blender.org/developer/D13984) ## Further improvements A lot of the lag in heavier files is due to the global undo. The proposal #95450 is addressing this. Co-authored-by: @yann-lty

Falk David commented

2022-02-01 16:56:26 +01:00

Added subscriber: @filedescriptor

Falk David commented

2022-02-01 17:03:18 +01:00

Added subscriber: @yann-lty

Hans Goudey commented

2022-02-02 06:44:00 +01:00

Added subscriber: @HooglyBoogly

Hans Goudey commented

2022-02-02 06:44:00 +01:00

Great job on the detailed proposal. It's nice to see more granular updates and ways to avoid copying.

However, I get the impression that at this point we're really fighting against the data structure. Four levels of linked lists means everything is all over the place. The lack of O(1) indexing means that complexity is always moved elsewhere, and even down at the leaves, data is stored inefficiently (you can't just update the positions of strokes, because they're stored together with the color, etc.)

Recently there has been discussion about unifying the curves and hair data structures: #94193. We're now starting to implement this here: #95355
The patch linked there contains a DNA struct that can be embedded in other DNA (like grease pencil structs) as a common storage for curve data.
In the hair/curves data structure, all of the curves are stored contiguously, and a struct of arrays representation makes dealing with attributes individually much simpler.
That also brings the benefit of possibly sharing code procedural and even interactive operations.

I can't help but wonder if changing the data structures would help avoid a lot of this complexity. Obviously storing multiple frames in the same bGPdata means there needs to be some extra work to avoid copies, but beyond that, sharing the data structure seems like an attractive option in my opinion.

Great job on the detailed proposal. It's nice to see more granular updates and ways to avoid copying. However, I get the impression that at this point we're really fighting against the data structure. Four levels of linked lists means everything is all over the place. The lack of O(1) indexing means that complexity is always moved elsewhere, and even down at the leaves, data is stored inefficiently (you can't just update the positions of strokes, because they're stored together with the color, etc.) Recently there has been discussion about unifying the curves and hair data structures: #94193. We're now starting to implement this here: #95355 The patch linked there contains a DNA struct that can be embedded in other DNA (like grease pencil structs) as a common storage for curve data. In the hair/curves data structure, all of the curves are stored contiguously, and a struct of arrays representation makes dealing with attributes individually much simpler. That also brings the benefit of possibly sharing code procedural and even interactive operations. I can't help but wonder if changing the data structures would help avoid a lot of this complexity. Obviously storing multiple frames in the same `bGPdata` means there needs to be some extra work to avoid copies, but beyond that, sharing the data structure seems like an attractive option in my opinion.

Antonio Vazquez commented

2022-02-02 09:45:10 +01:00

Added subscriber: @antoniov

Antonio Vazquez commented

2022-02-02 09:45:10 +01:00

I don't know the details of the implementation of arrays you are talking about, but there are several points that seem important to me and that we must include before thinking about this solution.

First of all, the code for grease pencil is quite extensive and there are currently over 130 operators, plus its own draw engine and other changes in various places in Blender.

Due to the nature of grease pencil, on many occasions it is necessary to move elements in the lists, that is, to include new elements in the middle of the list, at the beginning or at the end (this case would not be a problem). Examples of these changes could be moving a frame, adding strokes in the middle, etc.

I'm sure that accessing elements in the list would be much faster using an array and the element's index, but I don't know if the position changes of elements in the list would benefit as much.

Another element to consider is the resources needed to make the change in the code, tests, versioning of old files, etc.

I don't know the details of the implementation of arrays you are talking about, but there are several points that seem important to me and that we must include before thinking about this solution. First of all, the code for grease pencil is quite extensive and there are currently over 130 operators, plus its own draw engine and other changes in various places in Blender. Due to the nature of grease pencil, on many occasions it is necessary to move elements in the lists, that is, to include new elements in the middle of the list, at the beginning or at the end (this case would not be a problem). Examples of these changes could be moving a frame, adding strokes in the middle, etc. I'm sure that accessing elements in the list would be much faster using an array and the element's index, but I don't know if the position changes of elements in the list would benefit as much. Another element to consider is the resources needed to make the change in the code, tests, versioning of old files, etc.

Falk David commented

2022-02-02 10:38:37 +01:00

@HooglyBoogly I agree with you - the linked-lists need to go. But as Antonio said, we just didn't have the time budget right now to implement that change.

The other thing to consider is that the copy-on-write was only half of the problem. The other issue is the undo. Every operation that triggers the encode step of the undo takes like a second in the heavy test file. And this is where the update cache becomes useful again, because we can use the same system to implement a grease pencil undo type (will also be published as another patch). So one of the (if not the most important) reason to go down the path of the update cache was that we could hit two birds with one stone.

Additionally, the update cache could also use the array data structures. It would require some small adjustments, but the same idea would still work I believe. So then we would still benefit from not having to copy the whole data block, every time you select a key-frame for example.

@HooglyBoogly I agree with you - the linked-lists need to go. But as Antonio said, we just didn't have the time budget right now to implement that change. The other thing to consider is that the copy-on-write was only half of the problem. The other issue is the undo. Every operation that triggers the encode step of the undo takes like a second in the heavy test file. And this is where the update cache becomes useful again, because we can use the same system to implement a grease pencil undo type (will also be published as another patch). So one of the (if not the most important) reason to go down the path of the update cache was that we could hit two birds with one stone. Additionally, the update cache could also use the array data structures. It would require some small adjustments, but the same idea would still work I believe. So then we would still benefit from not having to copy the whole data block, every time you select a key-frame for example.

Antonio Vazquez commented

2022-02-03 12:54:18 +01:00

Changed status from 'Needs Triage' to: 'Confirmed'

Falk David was assigned by Antonio Vazquez

2022-02-03 12:55:16 +01:00

blender-admin commented

2022-02-10 11:34:12 +01:00

This issue was referenced by e2befa425a

This issue was referenced by e2befa425a84c9e4ec715442e85624a5d3669a4f

Philipp Oeser removed the

Download

What's New

Blender Studio

Manual

Developers Blog

Documentation

Benchmark

Blender Conference

Development Fund

One-time Donations

GPencil: Update-on-write using an update cache #95401

What is the update cache?

Motivation

Tagging

Explanation of the Update Cache

Updating the eval object

Implementation

Further improvements