Cycles: add instancing support in light tree #106683

Closed
Weizhen Huang wants to merge 10 commits from weizhen:light_tree_instance into main

When changing the target branch, be careful to rebase the branch in your fork to match. See documentation.
Member

Build a subtree for each unique mesh light.

Implementation details:

  1. The top-level light tree contains two types of emitters: lamps (local or distant) and mesh lights (emissive objects). Each unique mesh light, regardless of whether it's instanced or not, has a subtree. The leaf node of such subtrees only contains triangles. During traversal, we first traverse the top-level tree until we pick an emitter; if this emitter is a mesh light, we transform it to its local space and continue traversing its subtree until we pick a triangle.
  2. Excluding triangles from the top-level tree regardless of whether they are instanced simplifies many logics. During traversal, the transformation is only performed if !SD_OBJECT_TRANSFORM_APPLIED, so the cost is comparable with the previous implementation where we needed to transform three vertices of the triangle when computing the importance.
  3. While the recursive_build() function is parallelized for meshes with many triangles, the process of iterating through mesh lights and building subtrees is not. This means that if there are numerous small unique mesh lights, the building process could be slower than before. However, it's possible to parallelize by splitting the loop into two.

Performance

Tested with Apple M1

  • the scene from https://devtalk.blender.org/t/gsoc-2022-many-lights-sampling-in-cycles-x-feedback-thread/24773/173 has one unique mesh consisting of 5052 triangles, instanced 8848 times. Light tree build time was 65s before and 1.6 0.34s after, with 99% of the time being spent on computing LightTreeMeasure, since the instances have non-uniform scalings, resulting in non-trivial transformations. Some side notes about this transformation are in the comment (light_tree.cpp:340).
  • Ladder.blend from #77889 (comment), with 205 mesh lights, 5 of which are unique. There are slight differences in the noise present in the two renderings.
    32 spp Before After
    Light tree build time (ms) 6.6 0.86
    Number of nodes 4115 720
    Rendering befor_32.png after_32.png
  • Also tested with benchmark scenes, almost no visible difference in the renderings, but we also don't have any mesh light instances in the benchmark scenes. The bistro scene is 1.7x slower when buiding the light tree, probably because there are 79 mesh lights with less than 4096 triangles (and 12 mesh lights with 10824 triangles). As mentioned before, the process of iterating through the mesh lights and building subtrees is not yet parallelized. After implementing parallelization, the time required for building the light tree in the bistro scene is now half of the time in main.
Build a subtree for each unique mesh light. ### Implementation details: 1. The top-level light tree contains two types of emitters: lamps (local or distant) and mesh lights (emissive objects). Each unique mesh light, regardless of whether it's instanced or not, has a subtree. The leaf node of such subtrees only contains triangles. During traversal, we first traverse the top-level tree until we pick an emitter; if this emitter is a mesh light, we transform it to its local space and continue traversing its subtree until we pick a triangle. 2. Excluding triangles from the top-level tree regardless of whether they are instanced simplifies many logics. During traversal, the transformation is only performed if `!SD_OBJECT_TRANSFORM_APPLIED`, so the cost is comparable with the previous implementation where we needed to transform three vertices of the triangle when computing the importance. 3. ~~While the `recursive_build()` function is parallelized for meshes with many triangles, the process of iterating through mesh lights and building subtrees is not. This means that if there are numerous small unique mesh lights, the building process could be slower than before. However, it's possible to parallelize by splitting the loop into two.~~ ### Performance Tested with Apple M1 - the scene from https://devtalk.blender.org/t/gsoc-2022-many-lights-sampling-in-cycles-x-feedback-thread/24773/173 has one unique mesh consisting of 5052 triangles, instanced 8848 times. Light tree build time was **65s** before and ~~**1.6**~~ **0.34s** after, with 99% of the time being spent on computing `LightTreeMeasure`, since the instances have non-uniform scalings, resulting in non-trivial transformations. Some side notes about this transformation are in the comment (`light_tree.cpp:340`). - `Ladder.blend` from https://projects.blender.org/blender/blender/issues/77889#issuecomment-315413, with 205 mesh lights, 5 of which are unique. There are slight differences in the noise present in the two renderings. | 32 spp | Before | After | | -------- | -------- | -------- | |Light tree build time (ms) | 6.6 | 0.86 | |Number of nodes | 4115 | 720 | |Rendering | ![befor_32.png](/attachments/152b5b54-51cd-4969-80ed-b6a3ae65c3b8) | ![after_32.png](/attachments/b105a976-73ae-4877-bbcc-ae02542fbab0) | - Also tested with benchmark scenes, almost no visible difference in the renderings, but we also don't have any mesh light instances in the benchmark scenes. ~~The bistro scene is **1.7x slower** when buiding the light tree, probably because there are 79 mesh lights with less than 4096 triangles (and 12 mesh lights with 10824 triangles). As mentioned before, the process of iterating through the mesh lights and building subtrees is not yet parallelized.~~ After implementing parallelization, the time required for building the light tree in the bistro scene is now half of the time in main.
Weizhen Huang added 1 commit 2023-04-07 18:41:37 +02:00
Cycles: speed up light tree build with many instanced mesh lights
All checks were successful
buildbot/vexp-code-patch-coordinator Build done.
dec90cfcf9
by building a subtree for each unique mesh light.
The top-level light tree contains two types of emitters: lamps (local or distant) and mesh lights (emissive objects). Each unique mesh light, regardless of whether it's instanced or not, has a subtree. The leaf node of such subtrees only contain triangles. During traversal, we first traverse the top-level tree until we pick an emitter. If this emitter is a mesh light, we transform it to its local space and continue traversing its subtree until we pick a triangle.
Author
Member

@blender-bot build

@blender-bot build
Weizhen Huang changed title from Cycles: speed up light tree build with many instanced mesh lights to Cycles: add instancing support in light tree 2023-04-07 23:49:09 +02:00
Weizhen Huang added the
Module
Render & Cycles
label 2023-04-08 00:47:00 +02:00
Weizhen Huang requested review from Brecht Van Lommel 2023-04-08 00:49:35 +02:00
Weizhen Huang requested review from Sergey Sharybin 2023-04-08 00:49:53 +02:00
Sergey Sharybin reviewed 2023-04-11 13:27:02 +02:00
Sergey Sharybin left a comment
Owner

This sounds like an impressive speedup!

Only did a quick pass so far. Didn't spot obvious logic mistakes, just some stlying/naming comments.

This sounds like an impressive speedup! Only did a quick pass so far. Didn't spot obvious logic mistakes, just some stlying/naming comments.
@ -64,3 +64,3 @@
KERNEL_DATA_ARRAY(KernelLightTreeNode, light_tree_nodes)
KERNEL_DATA_ARRAY(KernelLightTreeEmitter, light_tree_emitters)
KERNEL_DATA_ARRAY(uint, light_to_tree)
KERNEL_DATA_ARRAY(uint, lamp_to_tree)

What is the difference between lamp and light?

This is something confusing in the current state of kernel. Is it possible to stick to light? Or somewhere have a comment about whet the difference is?

What is the difference between lamp and light? This is something confusing in the current state of kernel. Is it possible to stick to `light`? Or somewhere have a comment about whet the difference is?

I think we should get rid of the term "lamp". It was used in Blender before and renamed to light in 2.8, but some usage of lamp stuck around in Cycles.

Maybe best to use "light" for area/spot/sun/point, and then "emitter" for lights and emissive triangles. Though not as part of this patch of course. I don't mind it being called lamp as long as that is still used in other places.

I think we should get rid of the term "lamp". It was used in Blender before and renamed to light in 2.8, but some usage of lamp stuck around in Cycles. Maybe best to use "light" for area/spot/sun/point, and then "emitter" for lights and emissive triangles. Though not as part of this patch of course. I don't mind it being called lamp as long as that is still used in other places.
weizhen marked this conversation as resolved
@ -1370,6 +1370,13 @@ using BoundingCone = struct BoundingCone {
float theta_e;
};
using LightTreeNodeType = enum LightTreeNodeType {

Not sure why such syntax is needed.

For being C-style (which matches the legacy roots of the OpenCL days in the file) it is typedef LightTreeNodeType { ... } LightTreeNodeType;.

For the C++ you can simply do enum LightTreeNodeType{ ... };

Not sure why such syntax is needed. For being C-style (which matches the legacy roots of the OpenCL days in the file) it is `typedef LightTreeNodeType { ... } LightTreeNodeType;`. For the C++ you can simply do `enum LightTreeNodeType{ ... };`
weizhen marked this conversation as resolved
@ -1404,3 +1413,2 @@
/* The location in the lights or triangles array. */
int prim_id;
union {

Such things I always get wrong. It is easy to run into an extension created by MSVC. It seems fine here, but worth double-checking it does not generate a warning about it (in the case I've missed something).

Such things I always get wrong. It is easy to run into an extension created by MSVC. It seems fine here, but worth double-checking it does not generate a warning about it (in the case I've missed something).
weizhen marked this conversation as resolved
@ -156,3 +190,1 @@
it is an inner node. */
int first_emitter_index; /* Leaf nodes contain an index to first emitter. */
unique_ptr<LightTreeNode> children[2]; /* Inner node has two children. */
int type = LIGHT_TREE_INNER;

Add a note that this is a bitmask of LightTreeNodeType.

Add a note that this is a bitmask of `LightTreeNodeType`.
weizhen marked this conversation as resolved
Weizhen Huang reviewed 2023-04-11 14:20:18 +02:00
@ -64,3 +64,3 @@
KERNEL_DATA_ARRAY(KernelLightTreeNode, light_tree_nodes)
KERNEL_DATA_ARRAY(KernelLightTreeEmitter, light_tree_emitters)
KERNEL_DATA_ARRAY(uint, light_to_tree)
KERNEL_DATA_ARRAY(uint, lamp_to_tree)
Author
Member

This array stores indices of the build-in lights (area light, spot light, sun light, point light), these lights are often called "lamps" throughout the code. Besides that we have mesh lights, which are kept in a separate array called mesh_to_light. Because both types are called "lights", I thought it would be less confusing to rename the other one to lamp_to_tree.
I can add a comment explaining this.

This array stores indices of the build-in lights (area light, spot light, sun light, point light), these lights are often called "lamps" throughout the code. Besides that we have mesh lights, which are kept in a separate array called `mesh_to_light`. Because both types are called "lights", I thought it would be less confusing to rename the other one to `lamp_to_tree`. I can add a comment explaining this.
Brecht Van Lommel requested changes 2023-04-13 18:34:03 +02:00
Brecht Van Lommel left a comment
Owner

Looks generally fine, only minor comments.

Would be nice to parallelize building of multiple mesh lights, but also not a blocker for landing this.

Looks generally fine, only minor comments. Would be nice to parallelize building of multiple mesh lights, but also not a blocker for landing this.
@ -1385,3 +1390,1 @@
* Otherwise, it's an index to the node's second child. */
int child_index;
int num_emitters; /* leaf nodes need to know the number of emitters stored. */
LightTreeNodeType type;

We don't use enum types in struct like this and instead use an integer type, because we don't know that the size is the same on the CPU and GPU.

Maybe we can start using them with an explicit size in the declaration like enum LightTreeNodeType : int { or enum LightTreeNodeType : uint8_t {? Would need to be tested on the buildbot, not sure it works for all GPU compilers.

We don't use enum types in struct like this and instead use an integer type, because we don't know that the size is the same on the CPU and GPU. Maybe we can start using them with an explicit size in the declaration like `enum LightTreeNodeType : int {` or `enum LightTreeNodeType : uint8_t {`? Would need to be tested on the buildbot, not sure it works for all GPU compilers.
@ -1388,0 +1395,4 @@
union {
int first_emitter; /* Leaf node stores the index of the first emitter. */
int right_child; /* Inner node stores the index of the right child. */
int reference; /* Instance node refers to the node with the subtree. */

I think it would be a bit more clear like this:

union {
   struct { int first_emitter; } leaf;
   struct { int right_child; } inner;
   struct { int reference; } instance;
};
I think it would be a bit more clear like this: ``` union { struct { int first_emitter; } leaf; struct { int right_child; } inner; struct { int reference; } instance; }; ```
weizhen marked this conversation as resolved
@ -1407,0 +1422,4 @@
struct {
int object_id;
int node_id;
}; /* Mesh Light. */

Same suggestion as for KernelLightTreeNode, could be more explicit about which emitter type the data belongs to.

Same suggestion as for `KernelLightTreeNode`, could be more explicit about which emitter type the data belongs to.
weizhen marked this conversation as resolved
Sergey Sharybin reviewed 2023-04-13 19:59:45 +02:00
Sergey Sharybin left a comment
Owner

I forgot to publish the comments.

I forgot to publish the comments.
@ -64,3 +64,3 @@
KERNEL_DATA_ARRAY(KernelLightTreeNode, light_tree_nodes)
KERNEL_DATA_ARRAY(KernelLightTreeEmitter, light_tree_emitters)
KERNEL_DATA_ARRAY(uint, light_to_tree)
KERNEL_DATA_ARRAY(uint, lamp_to_tree)

The code indeed often refers to the light sources as lamps. And this is exactly what causes quite some confusion.

Like, KernelLight *klight = &kernel_data_fetch(lights, lamp). So, is the lamp different from light, how can they be used interchangeably.

We can probably also ignore the problem until the lights are actually a real objects in Cycles, and then everything is just an object.

The code indeed often refers to the light sources as lamps. And this is exactly what causes quite some confusion. Like, `KernelLight *klight = &kernel_data_fetch(lights, lamp)`. So, is the lamp different from light, how can they be used interchangeably. We can probably also ignore the problem until the lights are actually a real objects in Cycles, and then everything is just an object.
weizhen marked this conversation as resolved
@ -65,2 +65,3 @@
KERNEL_DATA_ARRAY(KernelLightTreeEmitter, light_tree_emitters)
KERNEL_DATA_ARRAY(uint, light_to_tree)
KERNEL_DATA_ARRAY(uint, lamp_to_tree)
KERNEL_DATA_ARRAY(uint, mesh_to_tree)

Isn't it more like object_to_tree, as it is indexed by object?

Also, maybe object_to_light_tree?`

Isn't it more like `object_to_tree`, as it is indexed by object? Also, maybe `object_to_light_tree`?`
weizhen marked this conversation as resolved
Weizhen Huang added 4 commits 2023-04-14 00:36:26 +02:00
`lamp` -> `light`
`mesh_to_tree` -> `object_to_tree`
Explicit naming of union structs (`leaf`, `inner`, `instance`,
`triangle`, `light`, `mesh`)
`enum LightTreeNodeType` -> `enum LightTreeNodeType : uint8_t`
Fix transform zero vector in volume
Some checks failed
buildbot/vexp-code-patch-coordinator Build done.
13a876dff7

@blender-bot package

@blender-bot package
Member

Package build started. Download here when ready.

Package build started. [Download here](https://builder.blender.org/download/patch/PR106683) when ready.

Code looks fine, but not sure what is going on with those crashes on the buildbot on Linux and Windows. I guess a debug + asan build on macOS might crash too.

Code looks fine, but not sure what is going on with those crashes on the buildbot on Linux and Windows. I guess a debug + asan build on macOS might crash too.

The crash is caused by use of non-POD in an union without its proper construction.
It is possible to ensure the expected lifetime of those unique pointers by manually initializing them, but it is going to be fragile. Doing so is highly discouraged in the modern C++, and I would not suggest going that route.

Alternatives are:

  • Remove the union all together, inline the parameters into the node itself. I do not think it will lead to a measurable peak memory usage increase.
  • Use std::variant which is a type-safe alternative of the unions, which also works for non-POD. Using it will free you from doing all the object life-time management, in exchange of slightly bigger memory usage (the variant stores type of the value it is holding).

Attached the PoC of the std::variant for the reference. I did not have time to build it to scale or paint it. Surely some of the boiler plate could be reduced somewhat.

The crash is caused by use of non-POD in an union without its proper construction. It is possible to ensure the expected lifetime of those unique pointers by manually initializing them, but it is going to be fragile. Doing so is highly discouraged in the modern C++, and I would not suggest going that route. Alternatives are: - Remove the union all together, inline the parameters into the node itself. I do not think it will lead to a measurable peak memory usage increase. - Use `std::variant` which is a type-safe alternative of the unions, which also works for non-POD. Using it will free you from doing all the object life-time management, in exchange of slightly bigger memory usage (the variant stores type of the value it is holding). Attached the PoC of the `std::variant` for the reference. I did not have time to build it to scale or paint it. Surely some of the boiler plate could be reduced somewhat.
Weizhen Huang added 3 commits 2023-04-14 16:48:16 +02:00
Author
Member

@blender-bot package

@blender-bot package
Member

Package build started. Download here when ready.

Package build started. [Download here](https://builder.blender.org/download/patch/PR106683) when ready.
Weizhen Huang added 1 commit 2023-04-14 17:00:55 +02:00
Add #include <variant>
All checks were successful
buildbot/vexp-code-patch-coordinator Build done.
c10023b8e3
Author
Member

@blender-bot package

@blender-bot package
Member

Package build started. Download here when ready.

Package build started. [Download here](https://builder.blender.org/download/patch/PR106683) when ready.
Author
Member

Thanks for the patch. The two proposed alternatives require the same amount of memory (16 bytes more compared to the version with union), but with variant the intention of the code is clearer and looks safer, so I used variant in the end.

Thanks for the patch. The two proposed alternatives require the same amount of memory (16 bytes more compared to the version with `union`), but with `variant` the intention of the code is clearer and looks safer, so I used `variant` in the end.
Brecht Van Lommel approved these changes 2023-04-14 18:07:19 +02:00
Weizhen Huang added 1 commit 2023-04-14 18:49:07 +02:00
Author
Member

committed as bfd1836861

committed as bfd1836861
Weizhen Huang closed this pull request 2023-04-14 19:14:34 +02:00
Weizhen Huang deleted branch light_tree_instance 2023-04-14 19:21:57 +02:00

Pull request closed

Sign in to join this conversation.
No Label
Interest
Alembic
Interest
Animation & Rigging
Interest
Asset System
Interest
Audio
Interest
Automated Testing
Interest
Blender Asset Bundle
Interest
BlendFile
Interest
Code Documentation
Interest
Collada
Interest
Compatibility
Interest
Compositing
Interest
Core
Interest
Cycles
Interest
Dependency Graph
Interest
Development Management
Interest
EEVEE
Interest
Freestyle
Interest
Geometry Nodes
Interest
Grease Pencil
Interest
ID Management
Interest
Images & Movies
Interest
Import Export
Interest
Line Art
Interest
Masking
Interest
Metal
Interest
Modeling
Interest
Modifiers
Interest
Motion Tracking
Interest
Nodes & Physics
Interest
OpenGL
Interest
Overlay
Interest
Overrides
Interest
Performance
Interest
Physics
Interest
Pipeline, Assets & IO
Interest
Platforms, Builds & Tests
Interest
Python API
Interest
Render & Cycles
Interest
Render Pipeline
Interest
Sculpt, Paint & Texture
Interest
Text Editor
Interest
Translations
Interest
Triaging
Interest
Undo
Interest
USD
Interest
User Interface
Interest
UV Editing
Interest
VFX & Video
Interest
Video Sequencer
Interest
Viewport & EEVEE
Interest
Virtual Reality
Interest
Vulkan
Interest
Wayland
Interest
Workbench
Interest: X11
Legacy
Asset Browser Project
Legacy
Blender 2.8 Project
Legacy
Milestone 1: Basic, Local Asset Browser
Legacy
OpenGL Error
Meta
Good First Issue
Meta
Papercut
Meta
Retrospective
Meta
Security
Module
Animation & Rigging
Module
Core
Module
Development Management
Module
Grease Pencil
Module
Modeling
Module
Nodes & Physics
Module
Pipeline, Assets & IO
Module
Platforms, Builds & Tests
Module
Python API
Module
Render & Cycles
Module
Sculpt, Paint & Texture
Module
Triaging
Module
User Interface
Module
VFX & Video
Module
Viewport & EEVEE
Platform
FreeBSD
Platform
Linux
Platform
macOS
Platform
Windows
Severity
High
Severity
Low
Severity
Normal
Severity
Unbreak Now!
Status
Archived
Status
Confirmed
Status
Duplicate
Status
Needs Info from Developers
Status
Needs Information from User
Status
Needs Triage
Status
Resolved
Type
Bug
Type
Design
Type
Known Issue
Type
Patch
Type
Report
Type
To Do
No Milestone
No project
No Assignees
4 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: blender/blender#106683
No description provided.