Cycles: reduce mesh memory usage by unflattening #105173

Merged
William Leeson merged 13 commits from mesh_unflattening into main 2023-02-27 10:39:31 +01:00
Member

What?

The meshes in Cycles are currently stored as a flattened meshes, where each triangle is stored as a set of 3 vertices. This uses a lot of memory and for current hardware does not provide a noticeable benefit. This allows for larger scenes and also the reduction in the sizes of the meshes also results in a decrease the amount of time it takes to upload the data to a GPU. This is especially important for when multiple GPUs are used in a single machine. The below images shows a graph for the reduction in vertex buffer sizes Vertex Buffer Sizes
Also the next graph shows that the render time is unaffected by the change
Path Trace Times
Finally the upload times are shown below Mesh Upload Times
Here is a link to the spreadsheet with all the data.

Why?

To improve mesh upload speeds and reduce the size of the scene data which allows larger scenes to be rendered.

## What? The meshes in Cycles are currently stored as a flattened meshes, where each triangle is stored as a set of 3 vertices. This uses a lot of memory and for current hardware does not provide a noticeable benefit. This allows for larger scenes and also the reduction in the sizes of the meshes also results in a decrease the amount of time it takes to upload the data to a GPU. This is especially important for when multiple GPUs are used in a single machine. The below images shows a graph for the reduction in vertex buffer sizes ![Vertex Buffer Sizes](https://docs.google.com/spreadsheets/d/e/2PACX-1vSwltj-ZOK7Tth_-lt1G3WKrYytqgWNU6QmYFIZHxMxpP6pv1sX6dCt8HYUYdAk9OovxH76GIyXvlCX/pubchart?oid=1489818539&format=image) Also the next graph shows that the render time is unaffected by the change ![Path Trace Times](https://docs.google.com/spreadsheets/d/e/2PACX-1vSwltj-ZOK7Tth_-lt1G3WKrYytqgWNU6QmYFIZHxMxpP6pv1sX6dCt8HYUYdAk9OovxH76GIyXvlCX/pubchart?oid=52208238&format=interactive) Finally the upload times are shown below ![Mesh Upload Times](https://docs.google.com/spreadsheets/d/e/2PACX-1vSwltj-ZOK7Tth_-lt1G3WKrYytqgWNU6QmYFIZHxMxpP6pv1sX6dCt8HYUYdAk9OovxH76GIyXvlCX/pubchart?oid=565630684&format=interactive) [Here](https://docs.google.com/spreadsheets/d/1XUyFyXczVKrbkKgGRkSbyniSbzU_w_BVjrCCogfQhiI/edit?usp=sharing) is a link to the spreadsheet with all the data. ## Why? To improve mesh upload speeds and reduce the size of the scene data which allows larger scenes to be rendered.
William Leeson added 6 commits 2023-02-24 13:00:35 +01:00
0fc5386289 Unflattens the vertex and index buffers
Vertex buffers and index buffers are now stored as a set of vertices
and the corresponding triangle indices that make up the meshes.
Previously the vertex buffers were flattened meaning the vertices
were repeated for each triangle storing 3 vertices for each triangle.
Storing the meshes this way reduces the amount of space required
to store the mesh and thus reduces the cost of transferring the
data to the GPU and improves caching. The patch also stores the
vertices as packed_float3 and the index buffers are packed_uint3.
c929f9d6b0 Logs the sizes of the DeviceScene buffers that hold the scene data
Adds a simple size log for recording the size of the individual
buffers and also calculates the total size.
William Leeson requested review from Brecht Van Lommel 2023-02-24 13:07:39 +01:00

@blender-bot package

@blender-bot package
Member

Package build started. Download here when ready.

Package build started. [Download here](https://builder.blender.org/download/patch/PR105173) when ready.
Brecht Van Lommel changed title from WIP:mesh_unflattening to WIP: Cycles: reduce mesh memory usage by unflattening 2023-02-24 13:54:01 +01:00
Brecht Van Lommel requested changes 2023-02-24 14:04:31 +01:00
Brecht Van Lommel left a comment
Owner

Overall looks like a very nice improvement.

Overall looks like a very nice improvement.
@ -232,3 +235,3 @@
for (int i = 0; i < std::min(num_points, b_points_num); i++) {
const float3 co = get_float3(b_attr_position.data[i].vector());
float3 P = get_float3(b_attr_position.data[i].vector());

Add const.

Add const.
leesonw marked this conversation as resolved
@ -334,0 +318,4 @@
num_verts + 1);
}
}
// FRL_CGR END

Remove this comment.

Remove this comment.
Author
Member

Removed this.

Removed this.
leesonw marked this conversation as resolved
@ -566,3 +587,3 @@
if (mesh->num_triangles() > 0) {
RTCGeometry geom = rtcGetGeometry(scene, geom_id);
set_tri_vertex_buffer(geom, mesh, true);
// FIXME: Update only works if the buffer has the same pointer

Still to be fixed before this is ready to commit I guess.

Still to be fixed before this is ready to commit I guess.
leesonw marked this conversation as resolved
@ -113,1 +111,3 @@
static_assert(sizeof(uint3) == num_elements * datatype_size(data_type));
// static const DataType data_type = TYPE_UINT;
// static const size_t num_elements = 3;
// static_assert(sizeof(uint3) == num_elements * datatype_size(data_type));

This looks like work in progress too, should not leave commented code.

This looks like work in progress too, should not leave commented code.
Author
Member

I put in a comment and removed the commented out code.

I put in a comment and removed the commented out code.
leesonw marked this conversation as resolved
@ -167,6 +167,12 @@ size_t Attribute::data_sizeof() const
return sizeof(float2);
else if (type == TypeDesc::TypeMatrix)
return sizeof(Transform);
// WL: The float3 type is not interchangeable with float4

No need for WL: prefix now, it's a general informative comment.

No need for WL: prefix now, it's a general informative comment.
Author
Member

I removed all the WL:'s from all changed code.

I removed all the WL:'s from all changed code.
leesonw marked this conversation as resolved
@ -2122,0 +2135,4 @@
&(dscene->attributes_float4), &(dscene->attributes_uchar4)};
std::for_each(memoryItems.begin(), memoryItems.end(), [update_stats](const device_memory *dm) {
update_stats->device.buffers.add_entry({dm->name, dm->memory_size()});
});

This looks useful, but I prefer to leave the stats functionality out of this patch and review it separately.

This looks useful, but I prefer to leave the stats functionality out of this patch and review it separately.
Author
Member

Ok I have reverted this and I will submit it in another pull request.

Ok I have reverted this and I will submit it in another pull request.
leesonw marked this conversation as resolved
@ -385,3 +406,3 @@
if (use_motion_blur && curve_attr) {
size_t steps_size = curve_keys.size() * (motion_steps - 1);
float3 *key_steps = curve_attr->data_float3();
// WL: Attribute data is stored as a float4 and is not

Remove WL prefix.

Remove WL prefix.
leesonw marked this conversation as resolved
@ -764,0 +784,4 @@
tri_patch[i] = -1;
off += 3;
}
// Store extra index set that does not have the offset for motion blur

This looks like work in progress still.

This looks like work in progress still.
Author
Member

I removed this code. It is from another change that is coming later that I had commented out because it is not needed for this change.

I removed this code. It is from another change that is coming later that I had commented out because it is not needed for this change.
leesonw marked this conversation as resolved
@ -155,2 +155,2 @@
memcpy(
attr_mP->data_float3() + motion_step * numpoints, points_data, sizeof(float3) * numpoints);
float *radius_data = radius.data();
// FRL_CGR BEGIN

Leave out comments like this.

Leave out comments like this.
Author
Member

I removed all the // FRL... comments I could find. Sorry about that I thought I had removed them all.

I removed all the // FRL... comments I could find. Sorry about that I thought I had removed them all.
leesonw marked this conversation as resolved
William Leeson added 4 commits 2023-02-24 15:54:08 +01:00
William Leeson added 1 commit 2023-02-24 16:09:33 +01:00
buildbot/vexp-code-patch-coordinator Build done. Details
48cb3821f8
Merge branch 'upstream_main' into mesh_unflattening
William Leeson requested review from Brecht Van Lommel 2023-02-24 16:10:57 +01:00

@blender-bot build

@blender-bot build
Brecht Van Lommel approved these changes 2023-02-24 17:47:12 +01:00
Brecht Van Lommel left a comment
Owner

Looks good now besides one minor whitespace comment, but no need to re-review for that.

Looks good now besides one minor whitespace comment, but no need to re-review for that.

Also if you do a squash merge, you can edit the commit message before it goes through, and should turn the description into plaintext then.
https://wiki.blender.org/wiki/Tools/Pull_Requests#Merge_a_Pull_Request

Also if you do a squash merge, you can edit the commit message before it goes through, and should turn the description into plaintext then. https://wiki.blender.org/wiki/Tools/Pull_Requests#Merge_a_Pull_Request
Brecht Van Lommel changed title from WIP: Cycles: reduce mesh memory usage by unflattening to Cycles: reduce mesh memory usage by unflattening 2023-02-24 17:55:08 +01:00
Jesse Yurkovich reviewed 2023-02-24 21:16:50 +01:00
@ -53,0 +56,4 @@
/* CUDA int3 is already packed. */
typedef int3 packed_int3;
#else
/* HIP int3 is not packed (https://github.com/ROCm-Developer-Tools/HIP/issues/706). */

For follow-up, probably after the initial commit: This HIP issue seems to have been addressed last Sept. Does it still apply?

For follow-up, probably after the initial commit: This HIP issue seems to have been addressed last Sept. Does it still apply?
Author
Member

Thanks for the input. I checked the HIP docs and int3 is packed. Do you have a commit I could look at for the HIP issue?

Thanks for the input. I checked the HIP docs and [int3](https://docs.amd.com/bundle/HIP-API-Guide-v5.3/page/a01604.html) is packed. Do you have a commit I could look at for the HIP issue?

Hmm, the commit that's listed at the bottom of that issue is:
e27f64582c

Which was early 2022, but maybe it wasn't actually released until very recently. I'm not sure how quickly their code-flow is over there...

Hmm, the commit that's listed at the bottom of that issue is: https://github.com/ROCm-Developer-Tools/hipamd/commit/e27f64582cae9ae1f8c341018e8379a6025872bb Which was early 2022, but maybe it wasn't actually released until very recently. I'm not sure how quickly their code-flow is over there...

We are on HIP 5.3 so that should be new enough. But I think it's best to leave this in, and then afterwards do a change that removes it for both float3 and int3.

That way if there is a problem with it, it's isolated from this optimization.

We are on HIP 5.3 so that should be new enough. But I think it's best to leave this in, and then afterwards do a change that removes it for both float3 and int3. That way if there is a problem with it, it's isolated from this optimization.

Also this code is used for oneAPI even though the comment does not mention it. I don't know if oneAPI actually needs it but again that's best left for a separate PR.

Also this code is used for oneAPI even though the comment does not mention it. I don't know if oneAPI actually needs it but again that's best left for a separate PR.

For future pull requests, please create them from your fork rather than a branch in the main Blender repository:
https://wiki.blender.org/wiki/Tools/Pull_Requests

For future pull requests, please create them from your fork rather than a branch in the main Blender repository: https://wiki.blender.org/wiki/Tools/Pull_Requests
William Leeson added 1 commit 2023-02-27 09:49:08 +01:00
William Leeson added 1 commit 2023-02-27 10:01:37 +01:00
William Leeson merged commit 6c03339e48 into main 2023-02-27 10:39:31 +01:00
William Leeson deleted branch mesh_unflattening 2023-02-27 10:39:32 +01:00
First-time contributor

This is a great improvement. For me, in a scene with lots of trees and cars, memory usage went down from 5700mb to 4760mb. And no drop in performance.👌

Would love to see more memory optimizations like this, since GPUs with more vram are still very expensive.

This is a great improvement. For me, in a scene with lots of trees and cars, memory usage went down from 5700mb to 4760mb. And no drop in performance.👌 Would love to see more memory optimizations like this, since GPUs with more vram are still very expensive.
Sign in to join this conversation.
No reviewers
No Label
Interest
Alembic
Interest
Animation & Rigging
Interest
Asset Browser
Interest
Asset Browser Project Overview
Interest
Audio
Interest
Automated Testing
Interest
Blender Asset Bundle
Interest
BlendFile
Interest
Collada
Interest
Compatibility
Interest
Compositing
Interest
Core
Interest
Cycles
Interest
Dependency Graph
Interest
Development Management
Interest
EEVEE
Interest
EEVEE & Viewport
Interest
Freestyle
Interest
Geometry Nodes
Interest
Grease Pencil
Interest
ID Management
Interest
Images & Movies
Interest
Import Export
Interest
Line Art
Interest
Masking
Interest
Metal
Interest
Modeling
Interest
Modifiers
Interest
Motion Tracking
Interest
Nodes & Physics
Interest
OpenGL
Interest
Overlay
Interest
Overrides
Interest
Performance
Interest
Physics
Interest
Pipeline, Assets & IO
Interest
Platforms, Builds & Tests
Interest
Python API
Interest
Render & Cycles
Interest
Render Pipeline
Interest
Sculpt, Paint & Texture
Interest
Text Editor
Interest
Translations
Interest
Triaging
Interest
Undo
Interest
USD
Interest
User Interface
Interest
UV Editing
Interest
VFX & Video
Interest
Video Sequencer
Interest
Virtual Reality
Interest
Vulkan
Interest
Wayland
Interest
Workbench
Interest: X11
Legacy
Blender 2.8 Project
Legacy
Milestone 1: Basic, Local Asset Browser
Legacy
OpenGL Error
Meta
Good First Issue
Meta
Papercut
Meta
Retrospective
Meta
Security
Module
Animation & Rigging
Module
Core
Module
Development Management
Module
EEVEE & Viewport
Module
Grease Pencil
Module
Modeling
Module
Nodes & Physics
Module
Pipeline, Assets & IO
Module
Platforms, Builds & Tests
Module
Python API
Module
Render & Cycles
Module
Sculpt, Paint & Texture
Module
Triaging
Module
User Interface
Module
VFX & Video
Platform
FreeBSD
Platform
Linux
Platform
macOS
Platform
Windows
Priority
High
Priority
Low
Priority
Normal
Priority
Unbreak Now!
Status
Archived
Status
Confirmed
Status
Duplicate
Status
Needs Info from Developers
Status
Needs Information from User
Status
Needs Triage
Status
Resolved
Type
Bug
Type
Design
Type
Known Issue
Type
Patch
Type
Report
Type
To Do
No Milestone
No project
No Assignees
5 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: blender/blender#105173
No description provided.