Document performance profiling Blender #70016

Closed
opened 2019-09-18 12:57:43 +02:00 by Brecht Van Lommel · 37 comments

We are missing documentation on how to use profiling tools with Blender:
https://wiki.blender.org/wiki/Tools

Ideally for each platform we would document one recommended, user-friendly profiler and exact steps to use it.

Author
Owner

Added subscriber: @brecht

Member

Added subscriber: @ankitm

Member

For Mac.

Is the OpenGL Profiler relevant? https://developer.apple.com/library/archive/documentation/GraphicsImaging/Conceptual/OpenGLProfilerUserGuide/Introduction/Introduction.html#//apple_ref/doc/uid/TP40006475

Also, Instruments.app seems to be a fine choice for Mac since it comes with the compulsory Xcode.

Author
Owner

I'm not sure OpenGL Profiler is still available with Xcode. But Instruments is indeed the profiling tool to recommend on macOS.

Member

Added subscriber: @JacquesLucke

Member

I can document how I profile with this tool next week: https://github.com/KDAB/hotspot.


Added subscriber: @Ramanathi


Hello,

I'm a GSoC aspirant. I want to contribute to the project. As a beginner, I want to fix this bug. Can I get some help?

Thank you

Member

Added subscriber: @robbott

Member

This is not really a bug in Blender, but a documentation TODO about profiling. Start at https://wiki.blender.org/wiki/GSoC

@JacquesLucke I have tried twice, unsuccessfully, to build hotspot on Mac. https://github.com/brendangregg/FlameGraph is much more convenient. You can see many more tools on his website/blog.

For Instruments.app, I have used several of its "modes": zombies, metal, time, etc. It would be nice if I could also make proper sense of the traces, not just "oh something's wrong there". @robbott could you write something?

Author
Owner

Hotspot can't work on macOS, it's only designed to work with perf. But there's also not much point since there is Instruments, which is pretty easy to get started with?
https://help.apple.com/instruments/mac/10.0/#/dev44b2b437

If there's a good tutorial or docs we can link to that, and we can add Blender specific advice, like using RelWithDebInfo builds.


Added subscriber: @harald.reingruber


I would like to document how profiling works on Windows. This is a good preparation step for my GSoC "Improving Compositor Performance" proposal.
Does anybody already have experience using Visual Studio or any other native profiler on Windows? Visual Studio even offers a GPU profiler, but maybe other tools (e.g. from Nvidia) allow going even deeper?

Member

Added subscriber: @LazyDodo

Added subscriber: @LazyDodo
Member

I think this ticket is mostly aimed at CPU profiling, not GPU. That being said:

I'm struggling to combine "good" and "user friendly". Both Windows Performance Analyzer (https://docs.microsoft.com/en-us/windows-hardware/test/wpt/windows-performance-analyzer) and Intel's VTune are rather amazing profilers, but the learning curve is nearly a sheer vertical wall. The built-in profiler in VS is easy to use, but not that useful. Profilers are no magic bullets; it's best to figure out what you want to measure and what that will tell you, and then pick a product that best matches your requirements.


Awesome, thanks for the feedback.

So, for this ticket, I would try the different options and create a simple and advanced section in the wiki.
@LazyDodo: What do you think?

Shall we create separate wiki pages about different platforms?

Author
Owner

We can start by documenting the simple setup all on one page. Then if needed we can split it up or extend it.

The important thing is that a developer tests profiling Blender with the tool, and documents any Blender specific steps, configuration or pitfalls. We need just the minimal information to get started.

For example with VTune on Windows that might be something like:

  • Install VTune with Visual Studio integration
  • Set Blender build configuration to RelWithDebInfo (or Release?)
  • Configure VTune like this so it works correctly with Blender
  • Run profile like this and you will get results that look like this
  • To isolate results for one operation in Blender (without .blend load time etc.), do this

@brecht: Thanks for the valuable input. Will let you know once I have the first draft ready.


@brecht, @LazyDodo:

Here is a draft for the instructions to start profiling on Windows: https://docs.google.com/document/d/1s15X1qZ8iRC2zc3zkEwDboJCARqhqpECXZ2PRLKG7QI/edit?usp=sharing
It would be awesome if you could give me some feedback on the content, and how to improve the document.

I discovered that there is a problem with the CMake RelWithDebInfo configuration, which leads to VS not generating the debug information (unless it is activated manually for each project). I will try to figure out how to solve this in the next few days.

Member

> I discovered that there is a problem with the CMake RelWithDebInfo configuration, which leads to VS not generating the debug information (if not activated manually for each project). I will try to figure out how to solve this in the next days.

I fixed that yesterday.

Member

I think it's important to note that when you're profiling, you're going to compare performance between builds, and you have to be very careful to compare apples to apples: there can be up to a 20% difference in performance between a Release and a RelWithDebInfo build. A fun thread where I miserably failed at remembering this is #70463.
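
To make the apples-to-apples point concrete, here is a minimal Python sketch of how one might compare timings of the same operation measured on two builds. The function name and the sample timings are made up for illustration; they are not measurements taken from Blender.

```python
from statistics import median


def relative_difference(baseline_times, candidate_times):
    """Return how much slower (positive) or faster (negative) the candidate
    build is relative to the baseline, as a fraction of the baseline median."""
    baseline = median(baseline_times)
    candidate = median(candidate_times)
    return (candidate - baseline) / baseline


# Hypothetical timings in seconds for the same operation on two builds.
release_times = [10.0, 10.2, 9.9]
relwithdebinfo_times = [12.0, 12.1, 11.9]

slowdown = relative_difference(release_times, relwithdebinfo_times)
print(f"RelWithDebInfo is {slowdown:.0%} slower than Release in this made-up sample")
```

A gap of this size between build types says nothing about the code change under test, so both sides of a before/after comparison should come from the same build configuration.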

As for VTune vs WPA, they answer different questions. VTune is focused on measuring whether every instruction is living up to its true potential, by recording whether it's executing as fast as it could have and what is holding it back, while throwing tons and tons of data at you. Things get overwhelming fast, but it's great if you're looking to squeeze every last bit of performance out of a piece of code.

WPA is less concerned with this low-level detail, and it's great for answering questions like "When am I running? Am I running as often as I can, and if not, what is preventing me? What are the threads waiting for?" or just generic questions of "where is time being spent?" D6267 (https://archive.blender.org/developer/D6267) is a great example where analysis of both those things can be leveraged to deal with performance issues. Bruce Dawson from Google has written a wonderful tool called UIforETW (https://github.com/google/UIforETW) that makes recording WPA traces as easy as a one-button click. It also takes care of installing the required tools; it's highly recommended. Learning to use it, yeah... let's say there is a steep learning curve. However, Bruce has an excellent blog where he shows how to use the tool to analyze various problems; he keeps a list of them over here: https://randomascii.wordpress.com/2015/09/24/etw-central/

Now for the fun part: WPA and VTune do not coexist peacefully on a single machine. You install VTune, WPA appears to be working, but the traces you end up with are virtually empty...

There is a workaround to disable VTune temporarily, but it is only documented in this obscure post on the Intel forums from 2013 (https://software.intel.com/en-us/comment/1727169#comment-1727169): `amplxe-sepreg.exe -u pax` removes the problematic driver and (after a reboot) restores WPA, but now VTune is broken; `amplxe-sepreg.exe -i` reinstalls the driver.

Personally, I find myself using WPA more often than VTune. That being said, for the longest time VTune was not available for free, so that may have influenced that behavior.


> i fixed that yesterday

Thanks. I saw in the VS docs that the /ZI option (Program Database for Edit and Continue) might disable many optimizations. Maybe for profiling it's better to use /Zi (Program Database without the Edit and Continue feature) for the RelWithDebInfo configuration.
https://docs.microsoft.com/en-us/cpp/build/reference/z7-zi-zi-debug-information-format?view=vs-2019

> most optimizations are incompatible with Edit and Continue

Here is a patch in case you would like to change it for RelWithDebInfo: 0001-Change-Debug-Info-Format-to-Programm-Database-withou.patch (https://archive.blender.org/developer/F8507177/0001-Change-Debug-Info-Format-to-Programm-Database-withou.patch)
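
Since /ZI and /Zi differ only in letter case, they are easy to mix up when reading a long flag list. The following is a small Python sketch for checking which debug information format a flag string selects. The helper names and the example flag strings are hypothetical (they are not Blender's actual build flags), and the sketch assumes that the last such switch on the command line is the one that takes effect.

```python
from typing import Optional

# The three MSVC debug information format switches named in the linked docs.
DEBUG_INFO_SWITCHES = {"Z7", "Zi", "ZI"}


def debug_info_format(flags: str) -> Optional[str]:
    """Return the last debug-info switch found in a compiler flag string."""
    found = None
    for token in flags.split():
        # Switches may start with '/' or '-'; the switch name is case-sensitive.
        if token[0] in "/-" and token[1:] in DEBUG_INFO_SWITCHES:
            found = "/" + token[1:]
    return found


def uses_edit_and_continue(flags: str) -> bool:
    """True if the flag string ends up selecting /ZI (Edit and Continue)."""
    return debug_info_format(flags) == "/ZI"


# Hypothetical flag strings, for illustration only.
print(uses_edit_and_continue("/O2 /ZI /MD"))
print(uses_edit_and_continue("/O2 /Zi /MD"))
```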


In #70016#921806, @LazyDodo wrote:
> I think it's important to note that when you're profiling, you're going to compare performance between builds, and you have to be very careful to compare apples to apples: there can be up to a 20% difference in performance between a Release and a RelWithDebInfo build. A fun thread where I miserably failed at remembering this is #70463.

Very good point, I've added this to the Tips and Tricks section in the draft.

> As for VTune vs WPA, they answer different questions. VTune is focused on measuring whether every instruction is living up to its true potential, by recording whether it's executing as fast as it could have and what is holding it back, while throwing tons and tons of data at you. Things get overwhelming fast, but it's great if you're looking to squeeze every last bit of performance out of a piece of code.
>
> WPA is less concerned with this low-level detail, and it's great for answering questions like "When am I running? Am I running as often as I can, and if not, what is preventing me? What are the threads waiting for?" or just generic questions of "where is time being spent?" D6267 (https://archive.blender.org/developer/D6267) is a great example where analysis of both those things can be leveraged to deal with performance issues. Bruce Dawson from Google has written a wonderful tool called UIforETW (https://github.com/google/UIforETW) that makes recording WPA traces as easy as a one-button click. It also takes care of installing the required tools; it's highly recommended. Learning to use it, yeah... let's say there is a steep learning curve. However, Bruce has an excellent blog where he shows how to use the tool to analyze various problems; he keeps a list of them over here: https://randomascii.wordpress.com/2015/09/24/etw-central/
>
> Now for the fun part: WPA and VTune do not coexist peacefully on a single machine. You install VTune, WPA appears to be working, but the traces you end up with are virtually empty...
>
> There is a workaround to disable VTune temporarily, but it is only documented in this obscure post on the Intel forums from 2013 (https://software.intel.com/en-us/comment/1727169#comment-1727169): `amplxe-sepreg.exe -u pax` removes the problematic driver and (after a reboot) restores WPA, but now VTune is broken; `amplxe-sepreg.exe -i` reinstalls the driver.
>
> Personally, I find myself using WPA more often than VTune. That being said, for the longest time VTune was not available for free, so that may have influenced that behavior.

Okay, now I better understand how WPA can be useful. I will try it out, document the setup as well, and add a summary of when WPA can be useful.
Thanks for the input!

Member

In #70016#922201, @harald.reingruber wrote:

> > most optimizations are incompatible with Edit and Continue
>
> Here is a patch in case you would like to change it for RelWithDebInfo

Yeah, there's no reason for RelWithDebInfo to be using the E&C flags. I'll get that fixed up, thanks for the patch!

Member

Added subscriber: @EAW

Member
Just thought I should mention @LazyDodo's [D8126: Speed up saving when rendering movies / Profiling walk-though](https://archive.blender.org/developer/D8126) here.

Added subscriber: @sayak_adak


Hello everyone, I am totally new to open source contribution. I have been using Blender for some time now and have a basic idea about Python and C++. Can anybody kindly tell me which skills to acquire, and from where, so I can start contributing?

Author
Owner

@sayak_adak this isn't really the place to ask for advice on contributing, but see here and for further questions see the communication channels listed:
https://wiki.blender.org/wiki/Developer_Intro/Advice


Added subscriber: @Antriksh-Misri


Added subscriber: @rotoglup


Added subscriber: @Zhen-Dai


I am very late to this, but I would love to be able to find a manual page for draw manager profiling (or just documentation on how to enable it).

It has been there since 2017, but I cannot find a single reference on Google.

https://github.com/blender/blender/blob/master/source/blender/draw/intern/draw_manager_profiling.c

Member

Added subscriber: @Jeroen-Bakker

Member

@Zhen-Dai Draw manager profiling can be enabled by setting the debug value to a value between 21 and 29. It was introduced in Blender 2.80, but isn't used by the main GPU developers anymore since the --debug-gpu flag was implemented. This flag allows debugging/profiling with external tools like RenderDoc (which is what we mostly use).
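
As a rough illustration of the debug value setting described above, here is a minimal Python sketch. It assumes the debug value is set through `bpy.app.debug_value` from Blender's Python console; the helper name is hypothetical, and the default of 21 is an arbitrary pick from the stated range, since the comment does not say what the individual values do.

```python
DRAW_PROFILING_RANGE = range(21, 30)  # 21 through 29, as described above


def enable_draw_manager_profiling(app, value=21):
    """Set app.debug_value to a value in the draw manager profiling range.

    `app` is expected to be `bpy.app` when run inside Blender. The default
    of 21 is an arbitrary choice, not a documented recommendation.
    """
    if value not in DRAW_PROFILING_RANGE:
        raise ValueError(f"debug value must be between 21 and 29, got {value}")
    app.debug_value = value
    return value


# Inside Blender's Python console one would call (not run here):
#   import bpy
#   enable_draw_manager_profiling(bpy.app)
```

Setting the debug value back to 0 should turn the overlay off again.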


In #70016#1340341, @Jeroen-Bakker wrote:
> @Zhen-Dai Draw manager profiling can be enabled by setting the debug value to a value between 21 and 29. It was introduced in Blender 2.80, but isn't used by the main GPU developers anymore since the --debug-gpu flag was implemented. This flag allows debugging/profiling with external tools like RenderDoc (which is what we mostly use).

Thanks, I was troubleshooting an EEVEE viewport performance issue on macOS. Having this overlay info helps a lot.

I will definitely try Intel GPA and Xcode Instruments with the --debug-gpu flag once I have isolated the problem.

Julian Eisel added this to the Developer Documentation project 2023-02-20 14:52:36 +01:00
Author
Owner

Closing, as I don't think this is something that can be documented well by anyone who is not an active developer. Better wait for there to be an organic need for this than try to get it done this way.

Blender Bot added Status Archived and removed Status Confirmed labels 2024-01-29 16:05:38 +01:00
No Milestone
No Assignees
12 Participants
Reference: blender/blender#70016