1
1

Compare commits

...

441 Commits

Author SHA1 Message Date
b40787115b GHOST: Device discovery API.
Still wip.
2021-11-23 15:29:11 +01:00
fc502a8710 Merge branch 'temp-ghost-vulkan-backend' into tmp-vulkan 2021-11-23 14:01:01 +01:00
d944b969f0 Merge branch 'master' into temp-ghost-vulkan-backend 2021-11-23 13:23:50 +01:00
cf299bee80 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-23 13:02:25 +01:00
3e65bb86f9 Cleanup: fix clang-tidy warnings
Fix clang-tidy warnings introduced by a recent commit on the release
branch.

No functional changes.
2021-11-23 13:02:00 +01:00
f392ce50c4 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-23 12:39:26 +01:00
cd2849c89b Asset Browser: add operator for installing asset bundles
Add an operator "Copy to Asset Library" for installing asset bundles
into already-existing asset libraries.

The operator is shown when:
- the "Current File" library is selected,
- the blend file name matches `*_bundle.blend`, and
- the file is not already located in an asset library.

The user can select a target asset library, then gets a "Save As"
dialogue box to select where in that library the file should be saved.
This allows for renaming, saving in a subdirectory, etc.

The Asset Catalogs from the asset bundle are merged into the target
asset library.

The operator will refuse to run when external files are referenced. This
is not done in its poll function, as it's quite an extensive operator
(it loops over all ID datablocks).

Reviewed by: Severin

Differential Revision: https://developer.blender.org/D13312
2021-11-23 12:29:44 +01:00
605cdc4346 BKE LibLink/Append: Add mechanism for external code to loop over link/append context items.
Will be required for python's `bpy.data.libraries.load()` refactor.
2021-11-23 12:18:37 +01:00
0452a04f1a BKE link/append: Add optional blendfile handle to libraries.
This enables calling code to deal with the blendfile handle themselves,
BKE_blendfile_link then just borrows, uses this handle and does not
release it.

Needed e.g. for python's libcontext system to use new
BKE_blendfile_link_append code.

Part of T91414: Unify link/append between WM operators and BPY context
manager API, and cleanup usages of `BKE_library_make_local`.
2021-11-23 12:17:16 +01:00
6c8f73b220 Cleanup: remove unnecessary "extern" 2021-11-23 12:05:50 +01:00
b9b98448a6 Cleanup: add missing pragma once 2021-11-23 12:00:11 +01:00
fbb4a7eb43 BKE link/append: Add optional blendfile handle to libraries.
This enables calling code to deal with the blendfile handle themselves,
BKE_blendfile_link then just borrows, uses this handle and does not
release it.

Needed e.g. for python's libcontext system to use new
BKE_blendfile_link_append code.

Part of T91414: Unify link/append between WM operators and BPY context
manager API, and cleanup usages of `BKE_library_make_local`.
2021-11-23 10:38:51 +01:00
f657356062 Merge branch 'blender-v3.0-release' 2021-11-23 09:44:04 +01:00
b02ac2d8be Fix T93092: incomplete animation rendering of multi-layer exr composition
This was broken by rB0c3b215e7d5456878b155d13440864f49ad1f230.
The caching of loaded exr files needed some special treatment.

Differential Revision: https://developer.blender.org/D13313
2021-11-23 09:43:00 +01:00
84be741329 Fix T92654: missing padding in spreadsheet cells
This was probably broken by rB5c2330203e11e0d916960218b07d88d2193bf526.
For now just add the padding back in a spreadsheet specific way.

Differential Revision: https://developer.blender.org/D13315
2021-11-23 09:35:47 +01:00
6987060f70 Fix T93090: crash with data transfer modifier and geometry nodes
There was a missing normals layer that was requested by the data transfer
modifier from the target object. The normal layer was correctly added to
the target object. However, it never reached the data transfer modifier
because the mesh was copied in `BKE_object_get_evaluated_mesh`
(in the call to `get_mesh_for_write`) and the copy does not include the normals
layer.

The solution is to not use `get_mesh_for_write` here which was only used
because `BKE_object_get_evaluated_mesh` returns a non-const `Mesh *`.
Mid term, it should actually return a `const Mesh *` to avoid the confusion.

Differential Revision: https://developer.blender.org/D13319
2021-11-23 09:32:12 +01:00
28870a8f89 Cleanup: Use new CollectionRef::empty() method
Use the new CollectionRef::empty() method in all locations where
appropriate.

Differential Revision: https://developer.blender.org/D13276
2021-11-22 23:47:26 -08:00
59754ef0b2 Cleanup: silent clang-tidy warning NULL vs nullptr. 2021-11-23 08:39:26 +01:00
Peter Fog
34370d9fdf VSE: Add drag and drop handler for preview area
For some users, dropping assets into preview area may be more practical
due to space constraints or it may be just more intuitive.

Reviewed By: ISS

Differential Revision: https://developer.blender.org/D13311
2021-11-23 05:40:08 +01:00
Peter Fog
b42494cf6b VSE: Support drag and drop for datablocks
For using the Outliner and/or the Asset Browser as scene independent
tools to organize a/v source material is necessary for the users to be
able to drag and drop data blocks into the VSE. This was also an
unfulfilled design target for the Outliner Gsoc project.

Datablocks won't be used directly. Path to file will be passed to strip
add operator instead.

Reviewed By: ISS

Differential Revision: https://developer.blender.org/D13304
2021-11-23 05:39:10 +01:00
d1a4e043bd Merge branch 'blender-v3.0-release' 2021-11-23 00:57:15 +01:00
Sayak Biswas
f749506163 Fix T93244: Cycles HIP not working with multi GPU rendering
Use the correct device function (hipDeviceGet) for multi GPU setups, instead
of hipGetDevice which just returns the default device.

Differential Revision: https://developer.blender.org/D13323
2021-11-23 00:55:56 +01:00
8600d4491f Fix: Const warning in editmesh_knife.c
Fixes a warning caused by freeing a const pointer.
This commit removes the const modifier.

Differential Revision: https://developer.blender.org/D13321
2021-11-22 23:40:21 +01:00
456d5e14b8 Merge branch 'blender-v3.0-release' 2021-11-22 21:26:39 +01:00
481f032f5c Assets: Generate light preview when making light object an asset
We already supported previews for lights, just didn't actually use them
when making a light object an asset. They were only used when making the
light data itself an asset.
2021-11-22 21:14:11 +01:00
34615cd269 Fix grayed out preview generation button for light objects
Mistake in e7bea3fb6e.

We should only skip preview generation for objects that don't support
preview rendering, not completely forbid accessing preview data of such
IDs.
2021-11-22 21:12:36 +01:00
48c2b4012f Merge branch 'blender-v3.0-release' 2021-11-22 21:06:10 +01:00
29681f186e Fix T93283: Cycles render error with CUDA CPU + GPU after recent optimization
BVH2 triangle intersection was broken on the GPU since packed floats can't
be loaded directly into SSE. The better long term solution for performance
would be to build a BVH2 for GPU and Embree for CPU, similar to what we do
for OptiX.
2021-11-22 21:02:46 +01:00
e2b736aa40 Fix part of T93278: transparent glass option not working with environment pass 2021-11-22 20:58:09 +01:00
Takahiro Shizuki
73b1ad1920 IME: Fix Multi-Window Duplicated First Character
Fix problem with duplicated initial character when initiating or
switching to new windows. This is done by updating our copies of state
and modes from the new window when it receives WM_IME_SETCONTEXT
message. This problem and fix are only for the Windows platform.
2021-11-22 10:44:34 -08:00
bfff9ca5f1 Fix confusing new Cycles UI terminology
* Rename "Auto Tiles" to "Use Tiling", it's not really automatic and
  confusing with the old auto tile size add-on.
* Rename "Adaptive" scrambling distance to "Automatic", to avoid confusion
  with adaptive sampling.
2021-11-22 10:44:34 -08:00
Takahiro Shizuki
ee0277271c IME: Fix Multi-Window Duplicated First Character
Fix problem with duplicated initial character when initiating or
switching to new windows. This is done by updating our copies of state
and modes from the new window when it receives WM_IME_SETCONTEXT
message. This problem and fix are only for the Windows platform.
2021-11-22 10:41:11 -08:00
cc6bcb53b2 Geometry Nodes: Rename legacy node files
This will be useful to solve some issues with unity builds,
which compiles different files together to improve build times.
2021-11-22 13:16:54 -05:00
5ad4ca4e02 CurveEval: Add total_length() and total_control_point_size() methods
Add the following methods to the CurveEval class:
total_length() : returns the total length of the curve without needing to
                    allocate a new array
total_control_point_size() : returns the total number of control points without
                    needing to allocate a new array
2021-11-22 12:07:59 -06:00
8a84a61f6b Fix confusing new Cycles UI terminology
* Rename "Auto Tiles" to "Use Tiling", it's not really automatic and
  confusing with the old auto tile size add-on.
* Rename "Adaptive" scrambling distance to "Automatic", to avoid confusion
  with adaptive sampling.
2021-11-22 18:37:47 +01:00
0129178376 UI: Use a map for block name lookups
Use a map to speed up search for UI block names.
Time to redraw the node editor was decreased from
around 75-120ms to 40-70ms in a tree with many
Geometry Nodes.

Differential Revision: https://developer.blender.org/D13225
2021-11-22 18:23:54 +01:00
55c82d8380 Fix T84493 issue with selection after boolean.
According to Blender selection rules, selections should be flushed
to containing elements. Added an EDMB_select_flush() after edit
mode booleans or intersects are done. Hopefully this doesn't break
any scripts that might have been depending on the old (broken) behavior.
2021-11-22 11:47:46 -05:00
1706bf7780 Merge branch 'blender-v3.0-release' 2021-11-22 17:32:23 +01:00
336ca6796a Fix T90308: Cycles crash copying memory from device to host
Happens when device runs out of memory and Cycles is moving some
textures to the host memory.

The delayed memory free for OptiX BVH was moving data from one
device_memory to another, leaving the original device memory in
an invalid state. This was ruining the allocation map in the CUDA
device which is using pointer to the device_memory.

This change makes it so the memory pointer is stolen from BVH
into the delayed memory free list.

Additionally, forbid copying and moving instances of device_memory
and added sanity checks in the device implementation.

Differential Revision: https://developer.blender.org/D13316
2021-11-22 17:26:59 +01:00
25c83c217b Cleanup: Clang-format of the HIP device implementation 2021-11-22 17:26:52 +01:00
059da44fbc BKE Link/Append: Use BLO's LibraryLink_Params.
This allows to reduce signature of several functions, and make it eaiser
to integrate more higher-level usages later on.

This should be a non-behavioral-change commit.

Part of T91414: Unify link/append between WM operators and BPY context
manager API, and cleanup usages of `BKE_library_make_local`.
2021-11-22 17:14:21 +01:00
d7cf7d4048 Link/Append: Move main linking code (from WM) to use new instantiation code in BKE.
This removes the last main usage of BLO's instantiation code.

Also required some limited refactoring of BKE_blendfile_link_append's
instantiation to make it more modular, and usable by both linking and
appending code paths.

NOTE: This should be a non-behavioral change commit.

Part of T91414: Unify link/append between WM operators and BPY context
manager API, and cleanup usages of `BKE_library_make_local`.
2021-11-22 16:55:36 +01:00
fe274d91a1 Link/Append: Move most of core link/append code from WM to new BKE_blendflie_link_append module.
This will allow to expose all those advanced features of the WM
operators to other parts of the code, like the python library context
manager, copy/paste code, etc.

This is expected to be a strictly no-behavioral-change commit.

Part of T91414: Unify link/append between WM operators and BPY context
manager API, and cleanup usages of `BKE_library_make_local`.

Maniphest Tasks: T91414

Differential Revision: https://developer.blender.org/D13222
2021-11-22 16:52:17 +01:00
9c2a4d158c Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-22 16:47:50 +01:00
875f24352a Material Preview: Fix Sphere object (squared UV and poles)
A good sphere preview material has a 1:1 UV ratio (so we see squares as
least distorted as possible), as well as good poles.

Square UV:
The original sphere expected a 2:1 panorama to be mapped to it. This
patch changes that (I scaled Y by 0.5) so that square textures look ok.

Poles:
The original sphere had a low initial resolution, so no ammount of
subdivision would fix the poles.

The sphere has a subdivision modifier with 0 resolution. Later (3.1?) I
want to try to change the resolution on-the-fly based on whether the material
has a displacement map.

Old sphere (1.9K vertices):
{F11845752, size=500px}

New sphere (2.0K vertices):
{F11845710, size=500px}

Differential Revision: https://developer.blender.org/D13309
2021-11-22 16:46:49 +01:00
Jeroen Bakker
0624235900 Moviecache: Fix potential memory corruption.
`IMB_moviecache` is implemented as a singleton. When destructing the
singleton via `IMB_moviecache_destruct` it will not be created anymore
resulting inusage of unallocated memory and potentional memory
corruption.

When running blender this doesn't happen, but when creating images in
test cases the moviecache should be able to be recreated after it is
destroyed.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D13287
2021-11-22 15:52:30 +01:00
be876b8db6 Painting: Performance curve masks.
This patch separates the static-part from the dynamic-part when
generate brush masks. This makes the generation of brush masks 2-5 times
faster depending on the size of the brush.

More improvements can be done, this was just low hanging fruit.
2021-11-22 15:36:03 +01:00
178947dec3 Merge branch 'blender-v3.0-release' 2021-11-22 09:34:51 -05:00
Henrik Dick
819b9bdfa1 Fix T92631: Fix negative thickness regression in complex solidify
This regression was introduced by D11832, but there was problems before
that as well. I seem to have missed it in review. See the differential
revision for a screenshot of the difference.

Differential Revision: https://developer.blender.org/D13216
2021-11-22 09:33:49 -05:00
77ddc6e350 Fix T93250: Crash in spreadsheet tree view after recent commit
It seems that update_from_old assumed there would be an old
tree view available in the old block. This works for the asset browser
because the tree is always drawn, but for the spreadsheet that depends
on having an active object, which isn't necessarily always true.

Differential Revision: https://developer.blender.org/D13301
2021-11-22 08:47:02 -05:00
7e8f9213e9 Merge branch 'blender-v3.0-release' 2021-11-22 13:49:37 +01:00
7b09213f2f Fix T93198: Frame Selected in greasepencil curve editing does not work
Was not taking into account curve points at all.

Maniphest Tasks: T93198

Differential Revision: https://developer.blender.org/D13281
2021-11-22 13:42:00 +01:00
9b6f3d2d0a Cleanup: Removed unused code in curve mask creation.
Generating curve mask for 2d texture painting had some hard-coded
parameters that eventually weren't used in the algorithm (hardness and
rotation of the brush). This patch removes these parameters.
2021-11-22 11:22:58 +01:00
9bbb5f5a6a Painting: migrated curve mask generation to CPP.
Curve mask generation is done during 2d texture painting. There are some
performance issues in this part of the code. Before addressing those we
move the code to CPP.
2021-11-22 10:46:33 +01:00
29f6ec56e6 Cleanup: Make parameter const (BKE_brush_curve_strength_clamped). 2021-11-22 10:46:33 +01:00
0b246ed813 Revert "Fix (unreported) broken handling of constraints reordering with liboverride."
This reverts commit 6eaa69c66c.

Committed by nistake, sorry for the noise.
2021-11-22 10:39:20 +01:00
31864a40ba Cleanup: use simple data member instead of callback
This really doesn't have to be a callback currently, since it is always
the same `CPPType` for a socket type.
2021-11-22 10:18:08 +01:00
c850189adf Cleanup: make naming more consistent 2021-11-22 09:48:36 +01:00
db20837c3a Merge branch 'blender-v3.0-release' 2021-11-22 09:32:59 +01:00
6c16bb2706 Fix broken NLA RNA code after own rBfa6a913ef19c.
Thanks to @scurest for noticing this mistake!
2021-11-22 09:31:59 +01:00
6eaa69c66c Fix (unreported) broken handling of constraints reordering with liboverride.
New drag&drop reordering code would call constraints reordering operator
with the generic context, and not the one from the panel's layout.
missing the "constraint" member which is mandatory for poll function to
properly deal with override vs. local constraints.

This commit fixes it by generating a temp bContextStore in the panel
re-ordering callback.

NOTE: this fix will have to be extended to modifiers (which happen to
work currently because they have an 'active' status), and gpencil
modifiers (which are also broken currently).

Differential Revision: https://developer.blender.org/D13291
2021-11-22 09:28:27 +01:00
fb470c256a Fix T93256: Instances to points node broken after recent commit
When 97533eede4 added the instance domain, it didn't change
the domain that instance attributes are read from in this node.
2021-11-21 11:37:35 -05:00
873f6148ad Functions: remove test for dynamic name
This was broken in rB6ee2abde82ef121cd6e927995053ac33afdbb438.
2021-11-21 13:08:23 +01:00
940e6525c7 Functions: fix compile error in tests 2021-11-21 13:06:05 +01:00
15011e0b70 Functions: use static string for parameter names
The idea behind this change is the same as in
rB6ee2abde82ef121cd6e927995053ac33afdbb438.

A `MultiFunction::debug_parameter_name` method could be
added separately when necessary.
2021-11-21 12:57:34 +01:00
6ee2abde82 Functions: use static names for multi-functions
Previously, the function names were stored in `std::string` and were often
created dynamically (especially when the function just output a constant).
This resulted in a lot of overhead.

Now the function name is just a `const char *` that should be statically
allocated. This is good enough for the majority of cases. If a multi-function
needs a more dynamic name, it can override the `MultiFunction::debug_name`
method.

In my test file with >400,000 simple math nodes, the execution time improves from
3s to 1s.
2021-11-21 12:48:07 +01:00
d455eadcd8 Cleanup: Move menu search template to C++
This allows the use of better/more readable data structures,
and will also make some refactors to the search button easier.
The build completed on the buildbot for all platforms.
2021-11-20 16:30:53 -05:00
b3ee9f44cf Merge branch 'blender-v3.0-release' 2021-11-20 18:00:46 +01:00
1b2ee3cf20 Fix T92090: Eevee crash with Intel HD 4000 and macOS 10.15.7
A recent security update to macOS 10.15.7 causes crashes when using Eevee and
various other 3D viewport features. It appears that glGenerateMipmap is
broken, causing a crash whenever its commands are flushed/submitted to the GPU.

Ideally this would be fixed in a driver update, however it's unlikely this will
happen. Earlier macOS versions have been receiving security updates for 2 years,
and that window has just passed for 10.15. Further, computers with these GPUs
can't upgrade to a newer macOS version.

As a workaround, disable mipmaps on these GPUs, by setting the mipmap max level
to 0 and not calling glGenerateMipmaps. Effects like depth of field also use
mipmaps, but fill in the mip levels by other means. In those cases we keep the
mipmap level.

Differential Revision: https://developer.blender.org/D13295
2021-11-20 17:50:05 +01:00
e949ac0bfc Cleanup: unset 'FILE_ENTRY_PREVIEW_LOADING' at the end
Although this function only runs on the main thread, it seems safer to
clear the flag only after setting the result.
2021-11-20 10:34:29 -03:00
59ffe1c5b1 Merge branch 'blender-v3.0-release' 2021-11-20 14:05:26 +01:00
Sayak Biswas
f2bb42a095 Fix T92984: Cycles HIP crash with smoke volumes
This fixes the the app crash happening when trying to render smoke as a dense
3D texture. The changes are related to matching up hipew with the actual HIP
headers.

Differential Revision: https://developer.blender.org/D13296
2021-11-20 14:02:38 +01:00
3d447b6335 Merge branch 'blender-v3.0-release' 2021-11-20 12:47:31 +01:00
b20997cb34 Fix T93194: greasepencil channel lists ignoring collection visibility
Same fix as rB0a3b4d4c64f1, but this time for greasepencil.

To repeat: dopesheet in greasepencil mode was ignoring the temporariy
visibility flag of collections. As a result, even though the dopesheet
was supposed to show animation data of visible greasepencils only was
still showing such data of greasepencils that were hidden by hiding
their collection.
2021-11-20 12:31:33 +01:00
3baaab15fc Cleanup: Else after return 2021-11-19 23:48:51 -05:00
411261fb32 Merge branch 'blender-v3.0-release' 2021-11-20 01:45:06 -03:00
092df87534 Fix error in rBfb0ea9
There is no need to multiply the "dash_width" by `UM.pixel size` since the "viewport_size" is already being divided by the DPI.

Ref {rBfb0ea9}
2021-11-20 01:44:35 -03:00
15ecd47b96 Geometry Nodes: Instance attributes in Transfer/Capture nodes
Updates the Transfer Attributes and Capture Attributes nodes
to support attributes from instances.

Differential Revision: https://developer.blender.org/D13292
2021-11-19 23:40:08 +01:00
01df48a983 Refactor: Port spreadsheet data set to UI tree view
This patch removes a bunch of specific code for drawing the spreadsheet
data set region, which was an overly specific solution for a generic UI.
Nowadays, the UI tree view API used for asset browser catalogs is a much
better way to implement this behavior.

To make this possible, the tree view API is extended in a few ways.
Collapsibility can now be turned off, and whether an item should
be active is moved to a separate virtual function.

The only visual change is that the items are now drawn in a box,
just like the asset catalog.

Differential Revision: https://developer.blender.org/D13198
2021-11-19 17:36:11 -05:00
a0780ad625 Fix: Exception in tree view code in new case
This isn't a problem in 3.0 or master, but I'm porting the spreadsheet
data set region to a tree view and ran into this. This line needs to
whether the function is empty before calling it.

Differential Revision: https://developer.blender.org/D13197
2021-11-19 17:28:08 -05:00
cfbc9df60e Cleanup: Use canonical variable name
All `ARegion` variables should be called `region` unless there is
a good reason not to, since that is the convention.
2021-11-19 16:10:37 -05:00
217d0a1524 Cleanup: Use vector instead of linked list 2021-11-19 15:57:31 -05:00
51a7961e09 Cleanup: Simplify node editor link dragging storage
Now that `node_intern.hh` is a C++ header, we can use C++ types
there. This patch replaces the linked list of dragged links with a
vector. Also, the list of drag operator custom data, `nldrag`, doesn't
seem to need to be a list at all, so I just made it a unique pointer.

Differential Revision: https://developer.blender.org/D13252
2021-11-19 15:36:32 -05:00
c3fed4d463 Merge branch 'blender-v3.0-release' 2021-11-19 16:57:33 -03:00
fb0ea94c63 Fix T85855: F-curves too thin on Mac
Use the `GPU_SHADER_3D_POLYLINE_UNIFORM_COLOR` shader instead of `GPU_SHADER_2D_LINE_DASHED_UNIFORM_COLOR`.
This is just a partial solution as "protected" fcurves still use the dashed shader.

Differential Revision: https://developer.blender.org/D13290
2021-11-19 16:57:05 -03:00
50ad0e15fe Merge branch 'blender-v3.0-release' 2021-11-19 19:24:55 +01:00
00e4d665f4 Fix T91838 Crash when toggling edit mode on object with geometry node modifier, but only if the instanced objects material has a normal map assigned.
This is only a workaround to avoid the crash. The underlying issue is left
unfixed.

New report for tracking the underlying issue is T93223.
2021-11-19 19:24:22 +01:00
fabd088067 Update splash for Blender 3.x development series
CC-BY Blender Studio https://studio.blender.org

Update the splash artwork for the daily builds (`master`) to
celebrate the beginning of a new major series in Blender.

The badges of the Development Fund membership levels have been removed for
a simpler design, and the font matches the one used elsewhere in blender.org
2021-11-19 18:39:14 +01:00
1222c45544 Fix: Use the instances domain in instance transform nodes
Instance attributes exist on the instance domain after rB97533eede44421,
so these fields should be evaluated on that domain.
2021-11-19 12:26:48 -05:00
ba8dd0f24f Spreadsheet: Support instances component viewer node columns
After rB97533eede444217b, instances have their own attribute domain,
but the spreadsheet code worked under the assumption that the component
used the point domain. Old files have to re-select the instances data
source to make it properly active
2021-11-19 12:21:54 -05:00
06a2e2b28c Merge branch 'blender-v3.0-release' 2021-11-19 18:05:17 +01:00
ef687bd7c2 Merge branch 'blender-v3.0-release' 2021-11-19 12:00:35 -05:00
97533eede4 Geometry Nodes: Support custom instance attributes
Adds an attribute provider for instance attributes.
A new domain `ATTR_DOMAIN_INSTANCE` is implemented.
Instance attributes are not yet realized correctly.

Differential Revision: D13149
2021-11-19 17:53:48 +01:00
1b686c60b5 Fix T93046: Cycles world volume rendering very slow in OptiX with some scenes
With very long ray distance, OptiX ends up traversing many BVH nodes due to
a feature that improves precision. However this causes very slow rendering.

We now avoid generating such long rays by rejecting the few samples that have
long ray distances and very low probability of being generated. This should not
meaningfully affect render results.

Thanks to Sergey and Patrick for the investigation.
2021-11-19 17:42:22 +01:00
0f1a200a67 Fix T92682: EEVEE motion blur crash with curve objects
After rBb9febb54a492, the evaluated mesh from a curve is now presented
to render engines as a separate mesh object, but some code still assumed
that a curve object itself could have an evaluated mesh. However, this is
still true for surface objects and metaballs, which don't
use geometry sets yet.

Differential Revision: https://developer.blender.org/D13272
2021-11-19 11:36:29 -05:00
9e3a913b35 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-19 16:39:28 +01:00
1a1ddcb5e2 Asset Browser: don't display linked-in asset datablocks
Datablocks marked as asset, linked from another file, were shown in the
"Current File" asset library. This is now resolved.
2021-11-19 16:29:25 +01:00
ec71054a9b Merge branch 'blender-v3.0-release'
Conflicts:
	source/blender/blenkernel/BKE_blender_version.h
	source/blender/blenloader/intern/versioning_300.c
2021-11-19 16:10:28 +01:00
06ead314b6 Asset Preferences: disallow single file as asset library
Asset libraries should be directories on disk. By manually entering a
file path it was possible to have a single blend file as asset library,
but that was not a designed-for situation, and it doesn't play well
with the asset catalog system.
2021-11-19 16:08:55 +01:00
330290d2a8 Cleanup: typos in comments 2021-11-19 15:55:47 +01:00
33c5e7bcd5 LibOverrides: Refactor how diffing of RNA collections is handled.
Original implementation was a quick prototype which should have never
landed as-is in master. It had very limiting constraints and did not
allow for any real further development.

This commit fixes the internal implementation to make more sensible,
maintainable and evolutive.

NOTE: This commit introduces another forward-incompatibility in the
Blender file format: Files saved after this commit won't open properly
in older versions of blender regarding local inserted constraints or
modifiers into overrides of linked data.

NOTE: Technical details: The 'anchor' item name/index is now stored in
`subitem_reference_` members, and the actual 'source' item name/index is
stored in `subitem_local_` members of the override property operation
data.
Previously, only the `subitem_local_` members were used, storing the
anchor item name/index, and assuming the 'source' item was always the
next in the list.

Milestone I of T82160.

Maniphest Tasks: T82160

Differential Revision: https://developer.blender.org/D13282
2021-11-19 15:41:53 +01:00
d6ea881a74 BLI_listbase: Add utils to search from string or index.
If a valid matching string is found, return that item, otherwise
fallback to the item matching the given index, if any.

This will be useful in RNA override code, and potentially other
areas where data in lists can be referenced by their names or indices.
2021-11-19 15:41:36 +01:00
d7aaa145c6 Merge branch 'blender-v3.0-release' 2021-11-19 15:30:54 +01:00
04ec36f677 Fix T87912: use session id instead of name to identify dropped object
The old code did not work when there were multiple ids with
the same name (which can happen when ids are linked in).
The solution is to use the session ids instead. Those are different
even when two ids have the same name.

Differential Revision: https://developer.blender.org/D11116
2021-11-19 15:28:44 +01:00
0852805ed7 Merge branch 'blender-v3.0-release' 2021-11-19 15:58:37 +02:00
a20e703d1a Fix T93184: Link color not used for custom sockets
D13044 allowed the link color overlay to be used with custom sockets.
This no longer works due to a condition that checks if the socket is
standard or not, which was in place to avoid bad indexing of the
std_node_socket_colors array. Since that array is no longer used, this
condition needs to be removed.

Differential Revision: https://developer.blender.org/D13274

Reviewed By: Hans Goudey
2021-11-19 15:55:58 +02:00
06691d1b21 Tests: disable Cycles volume test when WITH_MOD_FLUID is off 2021-11-19 13:21:48 +01:00
1b94c53aa6 Cleanup: fix typos in comments and docs
Contributed by luzpaz.

Differential Revision: https://developer.blender.org/D10447
2021-11-19 13:02:16 +01:00
7d5ef64bfb Cleanup: fix typos in comments and docs
Contributed by luzpaz.

Differential Revision: https://developer.blender.org/D13264
2021-11-19 12:46:49 +01:00
fa6a913ef1 LibOverride: Add read-only flags accessors for 'local override' status.
Constraints, modifiers and NLA tracks can now report from RNA whether
they are defined as comming from the override's reference linked data,
or are local to the override.
2021-11-19 12:09:28 +01:00
83e245023c Fix (unreported) wrong behavior of constraints in liboverrides.
All constraints were 'made local', including the ones comming from the
reference linked object.
2021-11-19 12:09:28 +01:00
eb071c9ff4 Merge branch 'blender-v3.0-release' 2021-11-19 10:16:30 +01:00
de3fda29c7 Fix T93054: crash when deleting a missed linked file
This is a bit similar to rBb7260ca4c9f4b7618c9c214f1270e31d6ed9886b.
Sometimes a group node may not reference a node group
because it was linked and can't be found.
2021-11-19 10:15:58 +01:00
48e64a5fb5 Merge branch 'blender-v3.0-release' 2021-11-19 10:09:29 +01:00
4ea6b4ba84 Fix crash in VSE versioning code from recent commit
Caused by {rB4d09a692e22a}.
Greenlit by @sergey in chat.
2021-11-19 10:05:18 +01:00
992634427e Nodes: add bf_nodes_geometry library
Separating geometry nodes into a new library will make it
easier to improve compile times with features like unity
builds and precompiled headers.

Differential Revision: https://developer.blender.org/D13261
2021-11-19 09:09:14 +01:00
b8dc845e57 Merge branch 'blender-v3.0-release' 2021-11-19 06:22:47 +01:00
4d09a692e2 Fix T92847: Meta-strip corrupt
Offsets for meta strip were invalid. No steps to reproduce the issue are
available, but it is quite possible that there are files with incorrect
state after issues with meta strips were fixed.

Ensure correct offsets for meta strips in versioning code.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D13257
2021-11-19 06:12:09 +01:00
5ed3a5d023 Cleanup: Add an empty() method to RNA's CollectionRef class
The existing RNA CollectionRef class only offers a length() operation
which is sometimes used for checking if the collection is empty. This is
inefficient for certain collection types which do not have a native
length member; the entire list is iterated to find the count.

This patch creates an explicit empty() method to be used in such cases
for better semantics. Additionally, many collection types will behave
more efficiently when using the new method instead of checking length.

Making use of the new method will follow separately.

Differential Revision: https://developer.blender.org/D12314
2021-11-18 14:32:56 -08:00
3f288e9414 Node Editor: Change minimum zoom
Now with Geometry Nodes in Blender,
trees can become much bigger than before.
This changes the minimum zoom value in the node editor.

Differential Revision: https://developer.blender.org/D13254
2021-11-18 21:24:55 +01:00
b2d37c35d0 Node Editor: Skip socket drawing on low zoom
Socket drawing can be heavy with many nodes.
This patch skips drawing them on scale < 0.2
when they are barely visible anyway.

Differential Revision: https://developer.blender.org/D13255
2021-11-18 21:21:10 +01:00
167ee8f2c7 Merge branch 'blender-v3.0-release' 2021-11-18 19:37:48 +01:00
fd2a155d06 Fix T91797: Cycles volume rendering artifact with overlapping volumes
With the new volume rendering code this was no longer accurate, we always
need to use a new dimension for the next volume segment.
2021-11-18 19:27:37 +01:00
f190f2d267 Revert fixes 2021-11-18 14:19:59 -03:00
0c33411bdd Revert "Revert "Revert "Revert "Allow navigating while transforming""""
This reverts commit 717a971035.
2021-11-18 14:15:08 -03:00
ea42c1a22e Revert "Revert "Revert "Revert "Adjust snap source drawing when adding multiple snap points""""
This reverts commit b8bf40ed4b.
2021-11-18 14:14:57 -03:00
f61a73093b Revert "Revert "Revert "Revert "Transform: interactive mode for editing a 'Snap Source'""""
This reverts commit 701f2dfd5b.
2021-11-18 14:14:51 -03:00
ada6742601 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-18 17:58:26 +01:00
701f2dfd5b Revert "Revert "Revert "Transform: interactive mode for editing a 'Snap Source'"""
This reverts commit 25fa6c74b9.
2021-11-18 13:55:24 -03:00
b8bf40ed4b Revert "Revert "Revert "Adjust snap source drawing when adding multiple snap points"""
This reverts commit c7f9a782aa.
2021-11-18 13:55:21 -03:00
717a971035 Revert "Revert "Revert "Allow navigating while transforming"""
This reverts commit 2a9cfdac7e.
2021-11-18 13:55:17 -03:00
Germano Cavalcante
2a9cfdac7e Revert "Revert "Allow navigating while transforming""
This reverts commit 5e6fdaa07f.
2021-11-18 13:52:39 -03:00
c7f9a782aa Revert "Revert "Adjust snap source drawing when adding multiple snap points""
This reverts commit 77df32548b.
2021-11-18 13:52:30 -03:00
Germano Cavalcante
25fa6c74b9 Revert "Revert "Transform: interactive mode for editing a 'Snap Source'""
This reverts commit 805181bffa.
2021-11-18 13:52:18 -03:00
7c4e4d605c Pose Library: clarify apply/blend operator names
The old names had "{verb} Pose Library Pose"; they are now named
"{verb} Pose Asset", which is shorter & clearer.
2021-11-18 17:52:17 +01:00
3531021d1b Cleanup: Simplify declarations in C++ header
Using `struct` everywhere is unnecessary in C++, and the typedefs
are also unnecessary.
2021-11-18 11:46:44 -05:00
12fc395436 Fix T93152: Cycles baking multiple selected object to active not working 2021-11-18 17:43:27 +01:00
Germano Cavalcante
805181bffa Revert "Transform: interactive mode for editing a 'Snap Source'"
This reverts commit f19bd637e2.
2021-11-18 13:42:45 -03:00
77df32548b Revert "Adjust snap source drawing when adding multiple snap points"
This reverts commit cb3ba68ec4.
2021-11-18 13:42:36 -03:00
Germano Cavalcante
5e6fdaa07f Revert "Allow navigating while transforming"
This reverts commit 1d1855e95f.
2021-11-18 13:42:31 -03:00
Germano Cavalcante
1d1855e95f Allow navigating while transforming
This feature has been desired for some time:
- https://rightclickselect.com/p/ui/Tqbbbc/allow-navigating-while-transforming (See comments);
- D1583;
- T37427;

In short, blocking navigation during transform limits the user to move the object only to visible areas within the screen and hinders the allocation of objects within closed meshes.

The node editor is also impaired because some nodes are far between them and the connectors are too small.

The only disadvantage of this patch (as I see it) is the conflict with the existing key map:
MIDDLEMOUSE:
- enable axis constrain in 3D view;

WHEELDOWNMOUSE, WHEELUPMOUSE, PAGEUPKEY, PAGEDOWNKEY:
- change the threshold of the proportional edit;

So the patch solution was to change these keymaps:
- MIDDLEMOUSE to Alt+MIDDLEMOUSE;
- WHEELDOWNMOUSE, WHEELUPMOUSE, PAGEUPKEY, PAGEDOWNKEY to Alt+(corresponding key);

When you use this new keymap for the first time in the proportional edit, it may seem strange due to the custom of using it (both in View2D and View3D).
But quickly the user gets used to it.

Alternatively we can add an option to the user preferences ([] Allow navigating while transforming). (I'm not much fan of this option).

The patch was done on branch2.8. But maybe it's a good idea to apply it to 2.79

Differential Revision: https://developer.blender.org/D2624
2021-11-18 13:14:18 -03:00
cb3ba68ec4 Adjust snap source drawing when adding multiple snap points 2021-11-18 13:14:18 -03:00
Germano Cavalcante
f19bd637e2 Transform: interactive mode for editing a 'Snap Source'
This patch implements part of what was stated in {T66484}, with respect to `Base Point`.

## Introduction

The snapping feature of the transform tools has a variety of applications:
- Organization of nodes.
- Positioning of frames in precise time units.
- Retopology with snap to face
- Creation of armatures with bone positioning through the snap to volume
- Precise positioning of 3D or 2D objects in the surrounding geometry (CAD modeling)

The goal of this document is to make it more powerful for precision modeling and still supporting the old use cases without extra complexity.
The main topic addressed here is the introduction of a **interactive mode for setting a snap source** (See terminology).

## Terminology

* **Snap Source**: 3d coordinate * we want to snap from. (Currently defined by the `Snap With` options: `Closest`, `Center`, `Median` and `Active`).
* **Snap Target**: 3d coordinate*  we want to snap to. (Vertices, Edges, Faces, Grid...)

## Interactive Mode for Editing a Snap Source

Currently the fixed snap point can only be obtained through the `Snap With` options. So it's a little tricky for the user to define a snap source point having so much geometry on an object.
Because of this, the user needs to resort to impractical solutions to get a point in the geometry.
See example of an impractical use:
{F11714181, layout=left, width=960, alt="The user used the cursor (which can be snapped) to choose the snap origin point."}
The user used the cursor (which can be snapped) to choose the snap source point.

While it is possible to work around this current limitation, it is important to reduce the number of steps and allow the user to set a snap source point through an optional interactive mode during a transformation.

The proposed solution is to be able to move the current snap source point through a modal modifier activated with a key (eg. B).
The snap source point can thus "snap" to the elements in the scene (vertex, mid-edge, Lamp, …) during this mode.
{F9122814, layout=left, width=960, alt="Base Point Snap, example of transform operation via the shortcut (not the tool). After pressing g and the snap base change shortcut (e.g., shift + ctrl) the user set the base point. The base point is then visible until the end of the operation. The z axis constrains the final position."}

## Implementation Details

- The feature will only be available in 3D View.
- The feature will only be available for `Move`, `Rotate` and `Scale` transform modes.
- The snap source editing will be enabled with a single click on the modifier key (B).
- Having a snap point indicated, the new snap origin point will be confirmed with the same buttons that confirms the transformation (but the transformation will not be concluded).
- The snap source editing can be canceled with the same key that activated it (B).
- If the transformation is done with "release_confirm" (common for gizmos), the new feature cannot be enabled.
- During the transformation, when enabling the feature, if the snap option is turned off in the scene, the snap will be forced on throughout the rest of the transformation (unless interactive mode is canceled).
- During a transformation, if no snap target is set for an element in the scene (Vertex, Grid...), the snap targets to geometry Vertex, Edge, Face, Center of Edge and Perpendicular of Edge will be set automatically.
- Snap cannot be turned off during the snap source editing.
- Constraint or similar modification features will not be available during the snap source editing.
- Text input will not be available during the snap source editing.
- When adding multiple snap points (A) the new prone snap source point will be indicated with an "X" drawing.
{F11817267}

Maniphest Tasks: T66484

Differential Revision: https://developer.blender.org/D9415
2021-11-18 13:14:18 -03:00
f0be276514 Fix T93082: Cycles baking not handling transparency correctly
For baking, replace transparent BSDF with holdout for baking. This ensure no
objects behind are baked, and that the baked image has alpha.
2021-11-18 17:13:16 +01:00
ed91e759d1 Fix Cryptomatte accurate option showing for Cycles, only needed for Eevee 2021-11-18 17:13:16 +01:00
67b4eecac9 Cleanup: Use const arguments 2021-11-18 11:03:18 -05:00
Nikhil Shringarpurey
dd31b8bd50 UI: Use full word "Start" instead of "Sta"
Differential Revision: https://developer.blender.org/D13098
2021-11-18 10:45:10 -05:00
5816eb4c56 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-18 16:36:11 +01:00
8d1357ea6b Pose Library: add conversion button to old poselib UI
Add the "Convert Old Pose Library" operator to the old pose library (in
the Armature properties editor). This makes it more discoverable; before
it only was available in the Action editor.
2021-11-18 16:35:56 +01:00
beb9e332ca Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-18 16:21:17 +01:00
31afa1bb9a Asset tags: include partial tag matches
When filtering the asset browser, also include results that have partial
tag matches. So searching for "xite" will include results tagged with
"excited".

This brings the tag filtering in line with other search boxes in
Blender. Later we might want to provide users with more options for
prefix-only ("excite" would match "excited", but "xited" would not) or
only exact matches.
2021-11-18 16:21:00 +01:00
0624acf088 Fix T92442: undo removal of Eevee cryptomatte accurate mode option
The Cycles accurate mode was removed, but the Eevee option for this has
a different meaning and should not have been removed. The Eevee accurate
makes cryptomatte accumulate for every sample, which Cycles has always
done regardless of any option.
2021-11-18 15:59:08 +01:00
b926f54f3c Fix: error when materializing curve point attribute
Differential Revision: https://developer.blender.org/D13271
2021-11-18 15:28:04 +01:00
d1f944c186 Cycles: declare constants at program scope on Metal
MSL requires that constant address space literals be declared at program
scope. This patch moves the `blackbody_table_r/g/b` and `cie_colour_match`
constants into separate files so they can be declared at the appropriate scope.

Ref T92212

Differential Revision: https://developer.blender.org/D13241
2021-11-18 14:38:05 +01:00
d19e35873f Cycles: several small fixes and additions for MSL
This patch contains many small leftover fixes and additions that are
required for Metal-enablement:

- Address space fixes and a few other small compile fixes
- Addition of missing functionality to the Metal adapter headers
- Addition of various scattered `__KERNEL_METAL__` blocks (e.g. for
  atomic support & maths functions)

Ref T92212

Differential Revision: https://developer.blender.org/D13263
2021-11-18 14:38:02 +01:00
c0d52db783 Merge branch 'blender-v3.0-release' 2021-11-18 14:33:43 +01:00
f71813204c Cycles: Don't tile if image area fits into tile area
Previously the check was done based on dimension of image and if any
of dimensions were larger than tile size tiling was used.

This change makes it so that if image does not exceed number of pixels
in the tile no tile will be used. Allows to render widescreen images
without tiling.

Differential Revision: https://developer.blender.org/D13206
2021-11-18 14:27:45 +01:00
3ad2bf1327 Cycles: Fix command line render overshooting time limit
The calculation based on preserving device occupancy was conflicting
with the fact that time limit needs to render less samples at the last
round of render work.

For example, rendering BMW27 for 30sec on i9-11900k was actually
rendering for almost a minute. Now the render time limit is respected
much more close.

Differential Revision: https://developer.blender.org/D13269
2021-11-18 14:27:45 +01:00
bd2e3bb7bd Fix T93045: Cycles HIP not rendering OpenVDB volumes
Build HIP kernels with NanoVDB, and patch NanoVDB to work with HIP.

This is a header only library so no rebuild is needed. The changes are being
submitted upstream to openvdb, so this patch should be temporary.

Thanks Thomas for help testing this.
2021-11-18 13:24:56 +01:00
2b63a76041 Merge branch 'blender-v3.0-release' 2021-11-18 11:53:46 +01:00
e5774282b9 Fix asset preview not showing up for current file data-blocks
For data-blocks from the current file, the image-buffer for dragging
wasn't set at all. This wasn't intentional, dragging things in the Asset
Browser should just always show the preview.
2021-11-18 11:47:21 +01:00
William Leeson
e1a3b697ec Merge branch 'blender-v3.0-release' to pick up D13262. 2021-11-18 09:41:11 +01:00
William Leeson
8c0370ef7b Fix T93102: Principled hair shader stack was built incorrectly
stack_assign_if was used in the middle of creating the shader value blocks.
Which caused stack variables to be inserted in the middle of the shader value data.
This resulted in the shader node data no being in sequential order. This was also
the case for the wave texture wave node.

Reviewed By: brecht

Maniphest Tasks: T93102

Differential Revision: https://developer.blender.org/D13262
2021-11-18 09:31:29 +01:00
Félix
9cf3d841a8 VSE: Add meta.separate() Python API function
This function can be used to "dissolve" meta strip anywhere in strip
hierarchy. This has same effect as `meta_separate` operator.

Reviewed By: ISS

Differential Revision: https://developer.blender.org/T91005
2021-11-18 03:55:48 +01:00
f1f7a8b018 Merge branch 'blender-v3.0-release' 2021-11-18 03:03:13 +01:00
a182b05f07 Fix T90390: VSE Jump operator don't work during playback
Tag dependency graph `ID_RECALC_AUDIO_SEEK` update.
2021-11-18 02:59:50 +01:00
032ab0270d Merge branch 'blender-v3.0-release' 2021-11-18 02:23:17 +01:00
Andrea Beconcini
daaa43232d Fix T92445: Thumbnail height without overlay text
This patch changes the thumbnails' height used for image and movie
strips by choosing the proper size according to the VSE's text overlay
settings: i.e. thumbnails use the whole strip's height when no overlay
text is displayed; otherwise, some space is left for the overlay.

Reviewed By: ISS

Differential Revision: https://developer.blender.org/D13043
2021-11-18 02:14:11 +01:00
ceec400975 Merge branch 'blender-v3.0-release' 2021-11-18 01:40:37 +01:00
d8fd575af9 Fix T93154: Crash adding multiple movie strips
Some when adding multiple movies at once and only some of them have
audio track, this causes crash on NULL dereference. Issue was introduced
in bdbc7e12a0 to align sound and video properly.

Check if sound is present in movie file. If it's not, don't try to align
sound with video.
2021-11-18 01:32:06 +01:00
d6b5251572 Merge branch 'blender-v3.0-release' 2021-11-18 01:13:22 +01:00
b071083496 Fix T93166: Division by zero when drawing thumbnails
Caused by incorrect step calculation fo too short strips. For these,
step should be equal to strip length not 0.
2021-11-18 01:11:51 +01:00
fa7a6d67a8 Fix Cycles CUDA/HIP compiler error after recent changes 2021-11-17 19:56:18 +01:00
Sebastian Herholz
d9bc8f189c Cycles: add build option to enable a debugging feature for MIS
This patch adds a CMake option "WITH_CYCLES_DEBUG" which builds cycles with
a feature that allows debugging/selecting the direct-light sampling strategy.
The same option may later be used to add other debugging features that could
affect performance in release builds.

The three options are:
* Forward path tracing (e.g., via BSDF or phase function)
* Next-event estimation
* Multiple importance sampling combination of the previous two methods

Such a feature is useful for debugging light different sampling, evaluation,
and pdf methods (e.g., for light sources and BSDFs).

Differential Revision: https://developer.blender.org/D13152
2021-11-17 18:03:56 +01:00
063ad8635e Cycles: reduce triangle memory usage with packed_float3
Depends on D13243

Differential Revision: https://developer.blender.org/D13244
2021-11-17 17:29:41 +01:00
9937d5379c Cycles: add packed_float3 type for storage
Introduce a packed_float3 type for smaller storage that is exactly 3
floats, instead of 4. For computation float3 is still used since it can
use SIMD instructions.

Ref T92212

Differential Revision: https://developer.blender.org/D13243
2021-11-17 17:29:41 +01:00
89d5714d8f Build: match GCC and Clang float conversion warnings in Cycles 2021-11-17 17:29:41 +01:00
ea7efa5569 Merge branch 'blender-v3.0-release' 2021-11-17 15:41:32 +01:00
00a9617f92 Fix: wrong assert in geometry nodes evaluator
It only makes sense to check if all required outputs have been computed
if the node was executed at all.
2021-11-17 15:40:53 +01:00
c3422c48ad Cleanup: remove dummy multi function 2021-11-17 12:33:18 +01:00
e5f05bc7a6 Cleanup: Painting - reduce reallocation of same memory.
Curve mask is freed/allocated every time, but could still reuse the
previous allocated buffer when the diameter of the brush doesn't change.
2021-11-17 12:30:42 +01:00
f5dde382af Cleanup: use same function for updating internal links for all nodes
Previously, node types had a callback that creates internal links. Pretty
much all nodes used the same callback though. The exceptions are the
reroute node (which probably shouldn't be mutable anyway) and some
input/output nodes that are not mutable.

Removing the callback helps with D13246, because it makes it easier
to reason about which internal links are created and when they change.
In the future, the internal links should be part of the node declaration.
2021-11-17 11:52:54 +01:00
83a4d51997 Cleanup: Remove unused show_samples() device code in Cycles. 2021-11-17 11:16:48 +01:00
c2ab47e729 Cleanup: change node socket availability in a single place
This cleans up part of the code that still set the flag manually. Also, this
change helps with D13246 because it makes it easier to tag the node
tree as changed when the availability of a socket changed.
2021-11-17 11:11:28 +01:00
51b8e34fb7 LineArt: Improve certain edge cases in occlusion
This patch includes:
View vector fix for ortho back face.
Point on segment logic correction.
Better handling of boundary cases.

See review page for detailed description.

Reviewed By: Sebastian Parborg (zeddb)

Differential Revision: https://developer.blender.org/D13143
2021-11-17 14:31:12 +08:00
473be239c3 LineArt: Improve certain edge cases in occlusion
This patch includes:
View vector fix for ortho back face.
Point on segment logic correction.
Better handling of boundary cases.

See review page for detailed description.

Reviewed By: Sebastian Parborg (zeddb)

Differential Revision: https://developer.blender.org/D13143
2021-11-17 14:30:08 +08:00
e3c974b7e4 Merge branch 'blender-v3.0-release' 2021-11-17 05:43:34 +01:00
6e6123b40f Fix T93080: Crash on scrubbing with snapping
Sequencer wasn't initialized, snapping crashed on NULL dereference.
Add Null check.
2021-11-17 05:40:25 +01:00
ecad33f214 UI: Use ampersand instad of 'and' in labels
- When and is used in labels use ampsand
- When used in description use 'and'
2021-11-16 21:38:03 -05:00
61bffa565e Fix T90412: Inconsistency in mask strip color
This seems to be oversight in 271231f58e where strip color was
defined only for light theme.
2021-11-17 03:27:13 +01:00
f72dc00569 Merge branch 'blender-v3.0-release' 2021-11-17 03:12:19 +01:00
9bdf3fa5f0 Fix T91724: Strip height is too limited
This change was introduced in 997b5fe45d, to not display pixelated
thumbnails. However when VSE timeline height is made smaller, this
limits strip height.

Change limit, so one strip can occupy full height of VSE timeline
2021-11-17 02:45:53 +01:00
4c988eb3e1 Fix error with makefiles compilation
Use 'template' keyword to treat 'is' as a dependent template name
2021-11-16 20:46:33 -03:00
luzpaz
dea26253a0 cleanup: fix typos in comments and docs
Followup to https://developer.blender.org/D10288

Reviewed By: Blendify

Differential Revision: https://developer.blender.org/D10346
2021-11-16 18:45:10 -05:00
8290edefad Cleanup: Use bool instead of int 2021-11-16 17:20:31 -06:00
2f39b45e8c Merge branch 'blender-v3.0-release' 2021-11-16 17:50:24 -05:00
f829b86039 Cleanup: Correct copy paste error in comment
Mistake from rB2743d746ea4f38c098512f6dd6fc33d5a62429d3
2021-11-16 17:49:51 -05:00
93f26d652e Merge branch 'blender-v3.0-release' 2021-11-16 23:21:13 +01:00
fbf4fe6963 Fix missing Cycles implicit float/double conversion error with Clang
Since these are errors with GCC and Visual Studio.
2021-11-16 23:18:25 +01:00
1e4d1eb398 Fix another Linux build error with double and float comparison 2021-11-16 23:12:50 +01:00
59da22c309 Merge branch 'blender-v3.0-release' 2021-11-16 22:40:08 +01:00
b496c1c721 Cleanup: compiler warnings 2021-11-16 22:29:50 +01:00
3189171a94 Fix build error with strict double to float conversion 2021-11-16 22:25:24 +01:00
cf83719761 Geometry Nodes: Small improvements to object info node
This commit contains a few mostly-related changes to this node:
 -  Add a warning when retrieving the geometry from the modifier object.
 - Only create the output geometry when it is necessary.
 - Decompose transform matrices in a more friendly way.
 - Use default return callbacks like other newer nodes.

Differential Revision: https://developer.blender.org/D13232
2021-11-16 15:09:00 -06:00
c9fb08e075 Merge branch 'blender-v3.0-release' 2021-11-16 14:55:13 -06:00
f30e1fd2f0 Fix T93085: Incorrect geometry nodes modifier warning
It's valid for a node group connected to the modifier not to
have a geometry input, but I didn't consider that case
with the last change I made here, f3bdabbe24.

Differential Revision: https://developer.blender.org/D13231
2021-11-16 14:51:03 -06:00
25d30e6c99 Fix T92857: Deadlock in geometry nodes curve multi-threading
The spline code, especially Bezier splines, often make use of lazily
evaluation and caching. In order to do that, they use mutex locks.
When multi-threading, this can lead to problems. Further detail
can be found in rBfcc844f8fbd0d1.

To fix the deadlock, isolate the task before multi-threading
when holding a lock.

Differential Revision: https://developer.blender.org/D13229
2021-11-16 14:49:58 -06:00
cfd0e96e47 Fix T93125: Cycles wrong remaining render time with high number of samples
Avoid integer overflow.
2021-11-16 20:49:32 +01:00
7293c1b357 Fix T93106: Cycles SSS not working with normals pointing inside 2021-11-16 19:44:45 +01:00
Germano Cavalcante
9d7422b817 File Browser: Improve usage of threads in the creation of thumbnails
Due to asynchronous process, the preview for a given image may be
generated several times.

This regenerates many thumbs unnecessarily.

The solution is to add the `FILE_ENTRY_PREVIEW_LOADING` flag for file
entries that are still in the thread queue.

So this flag is checked not to redraw the thumb when it is still being
created on a different thread.

Differential Revision: https://developer.blender.org/D11150
2021-11-16 14:10:24 -03:00
917218269e Merge branch 'blender-v3.0-release' 2021-11-16 13:59:28 -03:00
1572c4d3d3 Fix T93011: Individual origins being used when pivot point is override
There should be a special `t->around` for this case, but for now let's
just avoid having the individual origins overlap.
2021-11-16 13:56:11 -03:00
bd37553850 Cleanup: better delimit member initialization
The initialization of `t->around` and `t->view` was scattered and with
duplicate code
2021-11-16 13:55:30 -03:00
0335df9384 Transform: better contextualize the status bar
`Remove Last Snap Point` should only be displayed when there is a Snap Point to be removed.
2021-11-16 13:39:10 -03:00
ba6427adfa Merge branch 'blender-v3.0-release' 2021-11-16 17:25:48 +01:00
b3529ecf0e Fix CUDA error when using tiny border in viewport
Need to clamp scaled render buffers window to be above zero
when applying resolution divider.
2021-11-16 17:25:18 +01:00
72ee62e0da Fix crash on freeing hair system
Fix a crash when a hair system's `ParticleSettings` ID datablock was
linked from another file but couldn't be found. This results in default
settings, with `type = PART_EMITTER`, where the particle data still has
a non-NULL `hair` pointer. Previously, copies of such a particle system
would NOT copy hair data for non-hair particle systems, hence the
pointer of the copy pointed to the original data, which got freed (at
least) twice upon closing the blend file.

This is now fixed by always copying the hair data, regardless of the
particle system type.

Reviewed by: mont29

Differential Revision: https://developer.blender.org/D13245
2021-11-16 17:18:01 +01:00
bee7a56687 Cleanup: document that MEM_dupallocN is NULL-safe
Add comment explaining `MEM_dupallocN` is NULL-safe, in that it returns
NULL when it receives a NULL pointer. This is currently true for both
implementations of the function (`MEM_lockfree_dupallocN` and
`MEM_guarded_dupallocN`), and will be expected of other implementations
as well.

No functional changes.
2021-11-16 17:11:45 +01:00
07af45eec5 Asset Browser: hide catalog debug info behind debug option
Add a new "experimental" debug option `show_asset_debug_info`, and use
that to determine the visibility of the active asset's catalog UUID and
simple name. Previously this was only determined by the "Developer
Extras" option, which meant it was visible in too many situations. It's
not really a "developer extra", and really just a debugging tool, so the
new option is more in line with its purpose.

Reviewed by: Severin

Differential Revision: https://developer.blender.org/D13242
2021-11-16 16:49:27 +01:00
Michael Jones
64003fa4b0 Cycles: Adapt volumetric lambda functions to work on MSL
This patch adapts the existing volumetric read/write lambda functions for Metal. Lambda expressions are not supported on MSL, so two new macros `VOLUME_READ_LAMBDA` and `VOLUME_WRITE_LAMBDA` have been defined with a default implementation which, on Metal, is overridden to use inline function objects.

This patch also removes the last remaining mention of the now-unused `ccl_addr_space`.

Ref T92212

Reviewed By: leesonw

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D13234
2021-11-16 13:42:23 +00:00
85ac9b8584 Merge branch 'blender-v3.0-release' 2021-11-16 14:39:51 +01:00
ce0d817bb7 UI: Fix hard to read text for drag disabled hints
In 499dbb626a, the background color of drag tooltips were changed so
text becomes more readable. But multiple people were touching the same
code, so the disabled hint tooltips didn't get the same tweak. They
would benefit from them even more, since the red text is even harder to
read on the transparent background than the regular, white text.
2021-11-16 14:37:20 +01:00
12a986c9b5 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-16 13:10:35 +01:00
c7a1e115b5 Tests: fix memory leak of GHOST system paths
Dispose of GHOST system paths when tearing down `BlendfileLoadingBaseTest`
and some other test cases. This prevents a memory leak.

A better solution would be to rework Blender's initialisation & teardown
structure, but that's outside the scope of this fix.

No functional changes to Blender.
2021-11-16 13:07:11 +01:00
faa8aa3bb9 Asset Browser: Forbid dragging catalogs into themselves
While there is nothing technically that would cause issues when moving a
catalog into itself (it just changes the path of the catalog, and the
missing parent catalogs will be created), it seems broken to the user.
So disable this in the drag & drop code for asset catalogs.
2021-11-16 13:01:57 +01:00
052c22199d Asset Browser: use one more refresh operator
Refreshing the assets requires `file_OT_asset_library_refresh` in the
asset browser, and `asset_OT_list_refresh` for the asset view. Both
are now done from `ASSET_OT_open_containing_blend_file`.
2021-11-16 12:35:39 +01:00
7da714f387 Merge branch 'blender-v3.0-release' 2021-11-16 10:57:26 +01:00
Diptangshu Dey
da14a482f2 Fix T90866: Python operator templates are not accessible from menus
Python Operator templates made accessible from respective menus
(required to also use F3 search for quick access)
Also fixed Modal Draw Operator id_name (had duplicate name from other template)

Maniphest Tasks: T90866

Differential Revision: https://developer.blender.org/D13182
2021-11-16 10:45:23 +01:00
d4c868da9f Geometry Nodes: refactor virtual array system
Goals of this refactor:
* Simplify creating virtual arrays.
* Simplify passing virtual arrays around.
* Simplify converting between typed and generic virtual arrays.
* Reduce memory allocations.

As a quick reminder, a virtual arrays is a data structure that behaves like an
array (i.e. it can be accessed using an index). However, it may not actually
be stored as array internally. The two most important implementations
of virtual arrays are those that correspond to an actual plain array and those
that have the same value for every index. However, many more
implementations exist for various reasons (interfacing with legacy attributes,
unified iterator over all points in multiple splines, ...).

With this refactor the core types (`VArray`, `GVArray`, `VMutableArray` and
`GVMutableArray`) can be used like "normal values". They typically live
on the stack. Before, they were usually inside a `std::unique_ptr`. This makes
passing them around much easier. Creation of new virtual arrays is also
much simpler now due to some constructors. Memory allocations are
reduced by making use of small object optimization inside the core types.

Previously, `VArray` was a class with virtual methods that had to be overridden
to change the behavior of a the virtual array. Now,`VArray` has a fixed size
and has no virtual methods. Instead it contains a `VArrayImpl` that is
similar to the old `VArray`. `VArrayImpl` should rarely ever be used directly,
unless a new virtual array implementation is added.

To support the small object optimization for many `VArrayImpl` classes,
a new `blender::Any` type is added. It is similar to `std::any` with two
additional features. It has an adjustable inline buffer size and alignment.
The inline buffer size of `std::any` can't be relied on and is usually too
small for our use case here. Furthermore, `blender::Any` can store
additional user-defined type information without increasing the
stack size.

Differential Revision: https://developer.blender.org/D12986
2021-11-16 10:16:30 +01:00
6d35972b06 Merge branch 'blender-v3.0-release' 2021-11-16 09:58:47 +01:00
7d985d6b69 Fix T93066: Alembic export ignores Mantaflow particles
`ABCPointsWriter::is_supported` already checked for valid particle
system types (liquid, spray, foam, bubbles, ...).

`AbstractHierarchyIterator::make_writers_particle_systems` did not
create a writer for these though, so now bring these in line and also
create writers for these.
2021-11-16 09:41:09 +01:00
57ed435def Cleanup: Use C++ matrix identity constructor 2021-11-15 23:24:16 -06:00
7e42ae7c1a Cleanup: Typo in comments 2021-11-15 22:08:28 +01:00
165cacc6f0 VSE: Use alpha over as default blend mode
With transform tools, it is expected to see backgroud image when overlay
is transformed.

Alpha over caused performance to be not optimal when used with opaque
media. This should be addressed with D12914 at least partially. There
may be some corner cases not addressed.

Differential Revision: https://developer.blender.org/D12952
2021-11-15 21:07:25 +01:00
62da6ffe08 VSE: Use early out for aplha over blending
When scaling down image, users expect to see background, which doesn't
currently happen in VSE. This is because strips use cross blend mode by
default, because alpha over is much slower. Reason is, because any area
of image can be transparent, and therefore it can't have early out
implemented in a way that cross blend mode can.

Flag images rendered by codecs that don't support transparency as fully
opaque and implement a form of early out for alpha over blend mode.

When rendering image stack, 2-input effects are ignored on the
"way down". Alpha over needs rendered overlay image to decide whether it
will use only overlay or background too. Therefore overlay can be
rendered safely before it is used. Image flags can be checked and it can
be freed if needed. Freeing doesn't cause any performance degradation,
because image is always stored in cache.

This feature does not improve blend mode performance. In summary, it
only allowes for having alpha over blend mode on background images
without suffering from lower performance.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D12914
2021-11-15 21:03:43 +01:00
46f5f60c13 Merge branch 'blender-v3.0-release' 2021-11-15 20:33:34 +01:00
a040d2a93a Fix T90592: Incorrect scrollbar range with backdrop
`v2d->tot` rect was set for backdrop drawing. Set range before drawing
scrollbars.

Reviewed By: Severin

Differential Revision: https://developer.blender.org/D13099
2021-11-15 20:28:11 +01:00
7e148c45c8 Fix T90415: Missing cache invalidation
Some RNA properties and operators did not invalidate cache or did it
incorrectly.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D13101
2021-11-15 20:23:57 +01:00
d3c45e1c39 Fix T91405: Block artefacts in WEBM video
Issue was caused by incorrect FFmpeg asynchronous decoding API. In most
cases, decoder returns 1 frame each time it is fed by 1 packet. Here
decoder wanted to return more frames, but our code always expected only
one.

Before sending new packets to decoder, check if there are frames to
receive. If there are, process them, otherwise continue decoding as
usual.

Reviewed By: zeddb, sergey

Differential Revision: https://developer.blender.org/D13079
2021-11-15 20:20:33 +01:00
ef8240e64c Fix T91992: Incorrect clip strip image size
When proxy size lower than 100% is used, clip strips are rendered with
incorrect image size.

This is because if proxies aren't enabled in movieclip, it automatically
falls back on rendering original media. Sequencer doesn't have knowledge
about this and since 9c99292a16 it assumes that image is proxy,
because it explicitly requested this size.

Check movieclip flag to see if proxies are enabled.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D13080
2021-11-15 20:07:46 +01:00
10a6a540af Cleanup: remove unnecessary functions
Those functions were more useful when `FieldInferencingInterface`
was still declared further down in `node.cc`.
2021-11-15 18:04:03 +01:00
8976b72843 Merge branch 'blender-v3.0-release' 2021-11-15 09:10:19 -06:00
62da41d63d Fix: Incorrect socket identifier versioning
There were two issues:
 - The third math node socket does not exist in old enough files.
 - The comment incorrectly referred to the vector math node.

Differential Revision: https://developer.blender.org/D13219
2021-11-15 08:52:58 -06:00
c3472cb11c Fix T93074: Gpencil cutter not using flat caps in middle cuts
When cut an stroke using the option Flat Caps, the falt was not done if the cut was done in the middle of the stroke.

Now the flat is applied to the segments created and also some cleanup of the code done.
2021-11-15 12:17:11 +01:00
Jeroen Bakker
a5c59fb90e Fix T89260: Eevee crashes with custom node sockets.
Cause of this issue is that Custom Node Sockets info type was
initialized as SOCK_FLOAT when registering. Areas within the core that
would ignore custom socket types by checking its type would use the
socket as being a float type.

When custom node sockets have a property called default_value blender
tries to store it as an internal default value what failed in debug
builds.

This patch will set the socket type to SOCK_CUSTOM when registering a
custom socket type and allow, but skip storage of custom default values.
In this case the default values should already be stored as custom
properies.

Reviewed By: campbellbarton, JacquesLucke

Maniphest Tasks: T89260

Differential Revision: https://developer.blender.org/D13174
2021-11-15 08:13:22 +01:00
c2c65cc4bf Merge branch 'blender-v3.0-release' 2021-11-15 02:39:33 +01:00
622e6f05f1 Fix T92750: sculpt vertex colors missing in object mode
The layers were not aliased properly for usage in the shaders.

Regression caused by rB03013d19d167.
2021-11-15 02:38:52 +01:00
7e82c840b7 Fix text editor auto-close with quotes
Back-spacing a quote from the beginning of a line
would delete the quote in-front instead of doing nothing.
2021-11-14 11:26:06 +11:00
2549384baa Cleanup: minor tweaks to auto-close
Spelling and failure to reuse variable missed in review.
2021-11-14 11:11:20 +11:00
73047c69ea BLF: Use Floats for Font Point Sizes
Allow the use of floating-point values for font point sizes, which
allows greater precision and flexibility for text output.

See D8960 for more information, details, and justification.

Differential Revision: https://developer.blender.org/D8960

Reviewed by Campbell Barton
2021-11-13 09:39:18 -08:00
e8a8bb67fc Cleanup: Correct order of guard and lock in moviecache_valfree
Fix own mistake in rB7061d1e39fe

In my attempt to quickly address T92838, along with the original bug, I
made a nonsensical choice to use the limiter lock to guard the check
against the cache item itself. While harmless, it is not necessary and
semantically wrong / potentially confusing to future readers of the code.

Differential Revision: https://developer.blender.org/D13122
2021-11-12 20:47:26 -08:00
ab9ec193c3 Fix splash screen showing on startup with files loaded by scripts
Suppressing the splash was only done when passing in an argument from
the command line.

Remove G.file_loaded, as it is misleading, only set once on startup,
replace with G.relbase_valid which is used everywhere else to check
if the file path should be used.
2021-11-13 14:05:27 +11:00
Matheus Santos
c4ea5cb1a3 Text Editor: Auto close relevant characters
Support the ability to close relevant characters like '(', '[' and '{'.

It will also delete the pair character if they're empty.

Ref D13119

Reviewed By: campbellbarton
2021-11-13 13:56:31 +11:00
1143bf281a Cleanup: spelling in comments, comment block formatting 2021-11-13 13:07:13 +11:00
acc800d24d Cleanup: clang-format 2021-11-13 12:47:18 +11:00
dc378bf1a4 Merge branch 'blender-v3.0-release' 2021-11-12 21:57:44 -03:00
0a6f428be7 Fix snapping not performing on selected bones or splines
This is an old issue but never reported as it is only visible in the measure tool snapping.
2021-11-12 21:56:42 -03:00
60b8eb30bb Merge branch 'blender-v3.0-release' 2021-11-12 21:36:35 -03:00
eed48a7322 Cleanup: reference some snap parameters in the snap context itself
This decreases the number of parameters in functions and makes important variables available in more places.
2021-11-12 21:34:19 -03:00
738f4fbc5e Revert "Fix T92636: Vector math node link disconnects when loading old file"
This reverts commit 6b4ca78108.

A simpler fix was used for 3.0, but rBd845ba481c6d2ef already contained
a more complete solution to the problem of inconsistent socket ids.
2021-11-12 17:14:27 -06:00
ebb4aba325 Merge branch 'blender-v3.0-release' 2021-11-12 17:12:39 -06:00
6b4ca78108 Fix T92636: Vector math node link disconnects when loading old file
Previously, unique identifiers for sockets were created automatically,
sometimes using .001 and sometimes _001. Now they are created
manually with the second format, but some files were saved with .001
format. I think this was only an issue in the vector math node.

rBd845ba481c6d fixed this problem in 3.1, but in a more general way.
After I merge this patch to 3.1, I will revert it, since the versioning
added in that commit will make this redundant.

Differential Revision: https://developer.blender.org/D13209
2021-11-12 17:07:38 -06:00
ec432ae998 Merge branch 'blender-v3.0-release' 2021-11-12 14:06:26 -06:00
8d3a771574 Fix: Node Class Type for Select by Handle Type
Change the node class type for Node Select by Handle Type to
NODE_CLASS_INPUT
2021-11-12 14:05:40 -06:00
55c69373e8 Cleanup: split 'initSnappingMode' into more specific functions
This helps to reuse small regions of a function's code elsewhere.

The logic had to be reorganized, theoretically it should behave the same way.
2021-11-12 16:30:01 -03:00
8b13cf5667 Cleanup: move 'imm_drawcircball' to 'gpu_immediate_util.c' 2021-11-12 16:30:01 -03:00
5941c39fbf Cleanup: fix some comments in the transform code 2021-11-12 16:30:01 -03:00
30f9034182 Cleanup: use 't->tsnap.mode' in transform code
This also prevents different snap modes from being used at the same time in the code.
2021-11-12 16:30:01 -03:00
e5a7dd8ab6 Cleanup: unify snap modes to geometry in a single flag
This combination was being repeated in some places.
2021-11-12 16:30:01 -03:00
5b787c24fb Merge branch 'blender-v3.0-release' 2021-11-12 13:26:26 -06:00
888b879f5f Fix: Incorrect transfer attribute error message with curves
The node does support curves, but only in index mode (see T88630)
So add a specific error message for the nearest mode, and let the
node support curves in the declaration.

Differential Revision: https://developer.blender.org/D13205
2021-11-12 13:25:22 -06:00
1b55b911f2 Merge branch 'blender-v3.0-release' 2021-11-12 20:04:05 +01:00
71131b4969 Tests: disable audio system for performance tests 2021-11-12 20:03:46 +01:00
b4d9b8b7f8 Fix T91893, T92455: wrong transmission pass with hair and multiscatter glass
We need to increase GPU memory usage a bit. Unfortunately we can't get away
with writing either reflection or transmission passes because these BSDFs may
scatter in either direction but still must be in a fixed reflection or
transmission category to match up with the color passes.
2021-11-12 20:03:46 +01:00
ef0b8d6306 Fix T92002: no Cycles combined baking support for filter settings 2021-11-12 20:03:46 +01:00
d845ba481c Fix T91826: Inconsistent node socket name identifier separator
Previously both `.` and `_` were used as separators when finding
a unique name for a socket. This removes the use of `.`, since `_`
was more common. It also does versioning for all of a file's node
trees to make sure that they all use the `_` convention.

Differential Revision: https://developer.blender.org/D13181
2021-11-12 12:22:43 -06:00
cbca71a7cf Cleanup: Move remaning node editor files to C++
Differential Revision: https://developer.blender.org/D13200
2021-11-12 12:12:27 -06:00
809ae823b7 Merge branch 'blender-v3.0-release' 2021-11-12 19:00:23 +01:00
9d0d4b8601 Fix T93029: OptiX denoising artifacts at high sample counts in specific scenes
Partially reverts commit rB440a3475b8f5410e5c41bfbed5ce82771b41356f because
"optixDenoiserComputeIntensity" does not currently support input images that are not packed (the
"pixelStrideInBytes" field is not zero). As a result the intensity calculation would take into account
data from other passes in the image, some of which was scaled by the number of samples still and
therefore produce widely incorrect results that then caused artifacts in the denoised image.

Maniphest Tasks: T93029
2021-11-12 18:59:50 +01:00
8a8bf99717 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-12 18:23:44 +01:00
2b394e1108 UI: (Performance) Avoid drawing buttons out of view
The UI was always drawing all buttons in a layout, no matter if they
were scrolled out of view (as in, outside of the visible part of the
region) or not. This means it's doing quite some work that can be
avoided.
UI drawing generally isn't a big bottleneck in Blender, so I don't
expect huge speedups from this. But while playing back animation, we do
redraw a fair bit of the UI, so in cases where there are many buttons
out of view, it may bring a little FPS boost. E.g. say in complex node
trees (the node editor is redrawn on animation playback in case there
are animated values that need updated UI feedback). This also mitigates
the issue in T92922 significantly.

Differential Revision: https://developer.blender.org/T92922

Reviewed by: Brecht Van Lommel
2021-11-12 18:21:12 +01:00
Yevgeny Makarov
a89529d8db UI: Do not shade alpha when blending colors
UI_GetThemeColorBlendShade4fv incorrectly changing alpha by the amount
of the shading offset.

See D9944 for more details.

Differential Revision: https://developer.blender.org/D9944

Reviewed by Hans Goudey
2021-11-12 08:55:20 -08:00
9f5290e3bc Fix T93007: Cycles not updating for animated Object properties like color 2021-11-12 08:55:20 -08:00
c671b5eee4 Fix Cycles ray visibility panel missing for volume objects 2021-11-12 08:55:20 -08:00
Yevgeny Makarov
896d3f1ce5 UI: Do not shade alpha when blending colors
UI_GetThemeColorBlendShade4fv incorrectly changing alpha by the amount
of the shading offset.

See D9944 for more details.

Differential Revision: https://developer.blender.org/D9944

Reviewed by Hans Goudey
2021-11-12 08:51:31 -08:00
76105eb752 Fix T93007: Cycles not updating for animated Object properties like color 2021-11-12 17:09:21 +01:00
d48523cb4d Fix Cycles ray visibility panel missing for volume objects 2021-11-12 16:18:07 +01:00
ddf66cd060 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-12 14:36:55 +01:00
b4cfe80547 Assets: Store Action sub-type in asset metadata
Blender 3.0 will only support single-frame Actions in the pose library.
The goal of this patch is to lay the groundwork for making it possible
for the Asset Browser to reject/hide "animation snippet" Action assets.

Determining whether an Action has one or more frames (i.e. whether it
has a single pose or animation) requires inspecting the Action itself,
and thus loading the data-block itself. This would make it impossible to
quickly determine from the asset browser.

To solve this, the Action is inspected before saving, and a
`"is_single_frame"` boolean (well, 0/1 integer) IDProperty is added.

Reviewed by: Severin

Differential Revision: https://developer.blender.org/D13202
2021-11-12 14:36:22 +01:00
c7a88cf91a Merge branch 'blender-v3.0-release' 2021-11-13 00:31:03 +11:00
5f7d5c0809 Gizmo: adjust when gizmo protection flags are displayed
Restore behavior reverted in
0ea60cf6b8 but only for location
as it makes sense to use protection flags in global mode in that case.
2021-11-13 00:28:13 +11:00
a470e3c9d1 Merge branch 'blender-v3.0-release' 2021-11-13 00:20:07 +11:00
0ea60cf6b8 Fix gizmo protection flag use in object mode
Channel protection flags were only used in global mode,
this doesn't make any sense, especially for rotation and scale.

Follow pose-bones, only using protection flags for
local & gimbal orientation.
2021-11-13 00:19:36 +11:00
3fe735d371 Geometry Nodes: Add Outer Points Selection to Star
Adds a boolean field output containing a selection of the
points of the star that are controlled by the outer radius
of the star.

Differential Revision: https://developer.blender.org/D13097
2021-11-12 06:44:41 -06:00
a47359ff36 Merge branch 'blender-v3.0-release' 2021-11-12 11:52:49 +01:00
de8a46c6ad Fix T93008: Cycles: Huge memory spike when saving tile to disk
The root of the problem lies in bug in OIIO which we can work around
from our side (which does not affect pack memory usage).

Thanks Brecht for finding the root cause!

Differential Revision: https://developer.blender.org/D13186
2021-11-12 11:20:01 +01:00
26502f3d89 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-12 10:52:12 +01:00
2b633f12ad Fix crash in asset browser when switching from file browser
When the file browser is in asset browser mode, it sets the callback
`filelist->prepare_filter_fn` to an asset browser specific function. This
function will segfault if there is no current asset library. Switching back
from asset browser to file browser would not reset that callback to
`NULL`, causing it to be called and crash Blender. This is now fixed.
2021-11-12 10:51:09 +01:00
86ca206db8 Cleanup/document BKE_blender_copybuffer.
* Rename the 'copy' functions to make it clear they belong to the same
  'group' and are to be used together.
* Fix `flag` parameter of `BKE_copybuffer_paste` being a short instead
  of an int.
* Improve documentation.
2021-11-12 10:20:49 +01:00
William Leeson
456876208b Fix T92601: Disable profiling when the profiler is deemed not active.
Adds a method to profiler that can be used to check if it is active.
This is used to determine if stop_profiling and start_profiling
should be called.

| patch | Juans Scene UI 256 samples | Juans Scene bg 256 samples | junkshop UI | junkshop bg |
| No patch | 6:16.59 | 4:05.37 | 2:08.48 | 1:59.7 |
| D13187   | 4:12.15 | 3:57.36 | 2:07.25 | 1:58.16 |
| D13185   | 4.11.18 |3:54.74 | 2:07.44 | 1:58.03 |
| D13190   | 4:12.39 | 3:55.42 | 2:07.62 | 1:58.68 |

UI - means rendered from within Blender
bg - means rendered from the command line using ##blender -b scene.blend -f 1##

Reviewed By: sergey, brecht

Maniphest Tasks: T92601

Differential Revision: https://developer.blender.org/D13190
2021-11-12 10:16:01 +01:00
William Leeson
32c7687859 Fix T92601: Disable profiling when the profiler is deemed not active.
Adds a method to profiler that can be used to check if it is active.
This is used to determine if stop_profiling and start_profiling
should be called.

| patch | Juans Scene UI 256 samples | Juans Scene bg 256 samples | junkshop UI | junkshop bg |
| No patch | 6:16.59 | 4:05.37 | 2:08.48 | 1:59.7 |
| D13187   | 4:12.15 | 3:57.36 | 2:07.25 | 1:58.16 |
| D13185   | 4.11.18 |3:54.74 | 2:07.44 | 1:58.03 |
| D13190   | 4:12.39 | 3:55.42 | 2:07.62 | 1:58.68 |

UI - means rendered from within Blender
bg - means rendered from the command line using ##blender -b scene.blend -f 1##

Reviewed By: sergey, brecht

Maniphest Tasks: T92601

Differential Revision: https://developer.blender.org/D13190
2021-11-12 10:01:48 +01:00
a87253942d Cleanup: Remove GHOST_isUpsideDownContext.
GHOST API only has a header definition. No implementation or usage.
2021-11-12 09:38:25 +01:00
8e43757834 GHOST: Vulkan Backend.
This adds a vulkan backend to GHOST. Still WIP, we should decide what
would be the minimum requirement to land in master.

Differential Revision: https://developer.blender.org/D13155
2021-11-12 09:22:56 +01:00
ae74ad191c Merge branch 'blender-v3.0-release' 2021-11-12 18:35:22 +11:00
aa1c44a113 Merge branch 'blender-v3.0-release' 2021-11-12 18:35:19 +11:00
5c0d4753cf Merge branch 'blender-v3.0-release' 2021-11-12 18:35:16 +11:00
02333544d1 Merge branch 'blender-v3.0-release' 2021-11-12 18:35:13 +11:00
1061f5a1ba Cleanup: remove redundant loop-index access
Use a counter for loop indices as they're being iterated in order.
2021-11-12 18:33:05 +11:00
1a7757b0bc Fix T91444: Edge Loop Preview fails with two Mirror Modifiers
The mirror modifiers merge option caused unnecessary re-ordering
to the vertex array with original vertices merging into their copies.

While this wasn't an error, it meant creating a 1:1 mapping from input
vertices to their final output wasn't reliable (when looping over
vertices first to last) as is done in
BKE_editmesh_vert_coords_when_deformed.

As merging in either direction is supported, keep the source meshes
vertices in-order since it allows the vertex coordinates to be extracted.
2021-11-12 18:27:47 +11:00
f133c6b094 Fix crash saving blend files in background mode
Reading the windows pixels was attempted in background mode.
2021-11-12 18:05:27 +11:00
d612d92630 Cleanup: use term sequence_strip instead of vse_strip 2021-11-12 17:38:58 +11:00
1e1c870001 Cleanup: Improve comment 2021-11-11 21:57:40 -06:00
0533f2851e Geometry Nodes: change selection output order in Cylinder node
This new order is a bit more intuitive.
2021-11-11 19:53:02 +01:00
d6e682a7b0 Merge branch 'blender-v3.0-release' 2021-11-11 19:50:24 +01:00
bd734cc441 Fix: Attribute Transfer node does not work with a single index
Differential Revision: https://developer.blender.org/D13194
2021-11-11 19:49:20 +01:00
e61da8e4fb Merge branch 'blender-v3.0-release' 2021-11-11 11:47:43 -06:00
f3bdabbe24 Fix: Incorrect modifier warning with non-geometry input first
The code assumed that any geometry input that wasn't the first input
was a second geometry input. Fix by separating the warning for the
first input and for the number of geometry inputs.
2021-11-11 11:47:19 -06:00
50f32025ac Merge branch 'blender-v3.0-release' 2021-11-11 18:27:31 +01:00
393879f30c Fix: wrong field inferencing with unavailable sockets 2021-11-11 18:26:40 +01:00
3d9c8397fc Fix T93005: Cycles shadow catcher not inherited by instances 2021-11-11 18:12:05 +01:00
c9c7658926 Geometry Nodes: Add Offset to Handle Position Node
Adds a vector offset field to the "Curve Handle Position Node".
This vector is added to the incoming position (which is the
implicit handle position if not connected) which will set the
position of the handle. Default is (0,0,0)

Differential Revision: https://developer.blender.org/D13035
2021-11-11 10:58:23 -06:00
3ca41b7312 Merge branch 'blender-v3.0-release' 2021-11-11 10:16:35 -06:00
Peter Sergay
4bc08b79aa UI: Updated Blender Light theme for Spreadsheet
This patch adds list colors to the light theme for the spreadsheet,
which are needed for the data set region. Addresses T92492.

Differential Revision: https://developer.blender.org/D13090
2021-11-11 10:10:28 -06:00
25e7365d0d Cleanup CUDA / HIP comments
Remove outdated CUDA comments for bindless textures and cleanup some HIP comments that still mentioned CUDA.

Differential Revision: https://developer.blender.org/D13189
2021-11-11 16:37:29 +01:00
52c617802f Merge branch 'blender-v3.0-release' 2021-11-11 09:27:42 -06:00
Martijn Versteegh
7aa39b40f4 Fix: Prevent use of uninitialized memory when creating Bezier spline
When Constructing bezier splines from dna, the positions of the
left/right handles were set directly in the internal vectors, by
requesting a reference to them. The problem is that
BezierSpline::handle_positions_left() calls ensure_auto_handles()
before returning the reference. That function does some calculations on
uninitialized memory if the positions array is not yet filled.

Differential Revision: https://developer.blender.org/D13107
2021-11-11 09:25:10 -06:00
ce395c84a3 Merge branch 'blender-v3.0-release' 2021-11-11 15:29:35 +01:00
d26d3cfe19 Fix T92868: Cycles catcher with transparency crashes
The issue was caused by splitting happening twice.

Fixed by checking for split flag which is assigned to the both states
during split.

The tricky part was to write catcher data at the moment of split: the
transparency and shadow catcher sample count is to be accumulated at
that point. Now it is happening in the `intersect_closest` kernel.
The downside is that render buffer is to be passed to the kernel, but
the benefit is that extra split bounce check is not needed now.

Had to move the passes write to shadow catcher header, since include
of `film/passes.h` causes all the fun of requirement to have BSDF
data structures available.

Differential Revision: https://developer.blender.org/D13177
2021-11-11 15:21:35 +01:00
06a74e7816 LibLink/Append tests: Add basic testing of bpy.data.libraries.load code. 2021-11-11 14:54:26 +01:00
9f31b9b7d3 Merge branch 'blender-v3.0-release' 2021-11-11 14:33:28 +01:00
9be6880d02 Fix (unreported) bad handling of reports in libraries.load code.
rB60fee69682ac39 only partially fixed the issue, `BlendFileReadReport
bf_reports` was now properly stored in `BPy_Library` `self` for the
lifetime of the context, but its `reports` member was still referencing
local variable to `bpy_lib_enter` function.
2021-11-11 14:29:14 +01:00
4a98faf9f1 Merge branch 'blender-v3.0-release' 2021-11-11 21:34:05 +11:00
8c240f50b2 Merge branch 'blender-v3.0-release' 2021-11-11 21:34:02 +11:00
b7e2408ea4 Fix T92867: Gimbal rotation broken when used for multiple objects
Support gimbal orientation for objects & bones.
2021-11-11 21:33:09 +11:00
bb64155c63 Cleanup: split gimbal_axis into pose and object
Allow access to a single bones gimbal matrix.
2021-11-11 21:14:08 +11:00
Andrii
c63e735f6b Cycles: Add sample offset option
This patch exposes the sampling offset option to Blender. It is located in the "Sampling > Advanced" panel.
For example, this can be useful to parallelize rendering and distribute different chunks of samples for each computer to render.

---

I also had to add this option to `RenderWork` and `RenderScheduler` classes so that the sample count in the status string can be calculated correctly.

Reviewed By: leesonw

Differential Revision: https://developer.blender.org/D13086
2021-11-11 09:39:25 +01:00
b8d53b703a Merge branch 'blender-v3.0-release' 2021-11-11 15:00:23 +11:00
9787b46f09 Merge branch 'blender-v3.0-release' 2021-11-11 15:00:20 +11:00
03f0be35d6 Merge branch 'blender-v3.0-release' 2021-11-11 15:00:17 +11:00
d753ebd40a Cleanup: spelling in comments 2021-11-11 14:59:30 +11:00
ddf0bacaa9 Cleanup: typo in function name 2021-11-11 14:58:55 +11:00
3929db265f Fix T92954: Loop Cut Tool preview line visible during operation
Apply a local-workaround instead of adding support for this use-case
since pre-selection isn't the intended purpose of gizmos.

This also resolves a glitch where poly-build and loop cut would
briefly show loop-cut or poly-build pre-selection after transforming.

See gizmo_preselect_poll_for_draw note for more details.
2021-11-11 14:58:55 +11:00
e1bd4bbb66 UI: Introduce View pie in more editors
#### Motivation

The View pie menu is a convenient way to access operators such as `Frame Selected` and `Frame All` which are usually mapped to `PERIOD` or `HOME` keys on the right side of most keyboard, making it hard hard to reach with the left hand.

The motivation for this patch comes from working with a 75% keyboard (no numpad). Most laptops face a similar problem.

#### Implementation

The View pie menu has been added to the following editors and sub-modes where applicable:

* Node Editor
* Video Sequencer
* Dopesheet
* Graph
* NLA
* Image
* Clip
* Outliner

More options could definitely be added to this menu for convenience, as long as it maintains the common options in the same place (Frame Selected on the left, Frame All on the right).

For positioning I went with the following layout:
{F11791186, size=full}

I've added `Zoom 1:1`to the Image Editor and the VSE Preview since there is no way to reset the zoom on keyboards without numpad (unless Emulate Numpad is turned on).

The Outliner uses `Show Active` and `Show Hierarchy` which are the closest ones to the equivalent in other editors. Should `Show Active` be renamed to `Frame Selected`?

The shortcut assigned is the same as the 3D Viewport (`ACCENT_GRAVE`).

#### Screenshots

Node Editor
{F11778387, size=full}

Dopesheet
{F11778400, size=full}

Graph
{F11778403, size=full}

Image Editor (Paint and View)
{F11791113, size=full}

Image Editor (Mask)
{F11791114, size=full}

UV Editor
{F11791119, size=full}

Clip Editor (Tracking)
{F11791137, size=full}

Clip Editor (Mask)
{F11791140, size=full}

Clip Editor (Graph)
{F11791151, size=full}
View operators are not yet implemented in Clip Editor Dopesheet mode (left a note about this in the menu poll).

Reviewed By: #user_interface, campbellbarton

Differential Revision: https://developer.blender.org/D13169
2021-11-11 01:18:50 +01:00
f1a8644121 Cleanup: Move interface_region_search.c to C++
This will be helpful for solving a bug with search during animation
playback, T89313. I tested this on all platforms on the buildbot.
2021-11-10 15:49:49 -06:00
1ec7075ff2 Merge branch 'blender-v3.0-release' 2021-11-10 15:44:13 -06:00
22ffd69a91 Fix T89313: Attribute search crash with animation playback
rBc473b2ce8bdbf8fa42 improved the situation somewhat, but
attribute search still crashes during animation playback, because
the UI search data references stale memory. The proper solution
is to allow the search to own data rather than just referencing it,
but I would prefer not to do that for 3.0. In the meantime, just
disable attribute search when animation is playing.

Differential Revision: https://developer.blender.org/D13179
2021-11-10 15:43:18 -06:00
9ca8bf0b29 Merge branch 'blender-v3.0-release' 2021-11-10 22:28:03 +01:00
040630bb9a Fix wrong device check in HIP kernel compile.
Also cleanup some related code, that was falsely copied from CUDA.

Differential Revision: https://developer.blender.org/D13180
2021-11-10 22:24:53 +01:00
3fa86f4b28 Merge branch 'blender-v3.0-release' 2021-11-10 20:19:09 +01:00
7689f501e2 Cycles: enable HIP device and binaries on Windows
We've now done testing to confirm this works with RDNA and RDNA2 AMD GPUs
on Windows. The AMD driver needed for this will soon be released publicly.
2021-11-10 20:16:44 +01:00
Thomas Dinges
e507a789b3 Cycles: disable graphics interop for HIP devices
This is due to a driver bug, so disable it for now until it gets resolved
in a future driver release.

Ref T92972

Differential Revision: https://developer.blender.org/D13167
2021-11-10 20:16:44 +01:00
6b0008129e Fix T92972: Cycles HIP wrong render display after a recent refactor
It's unclear why this fails. Maybe the size of half4 is not the expected
8 bytes and adjacent pixels are overwritten. Or there is some bug in the
HIP compiler writing a struct into global memory, which we probably don't
do elsewhere in the kernel.

Thanks to Thomas, William and Jeroen for helping investigate this.
2021-11-10 20:03:07 +01:00
c8e93da0a7 Fix Cycles assert in denoising fallback to OIDN 2021-11-10 19:56:30 +01:00
b65df10346 Merge branch 'master' into tmp-vulkan 2021-11-09 10:34:41 +01:00
8bf8db8ca2 Vulkan: Add support for wayland. 2021-07-06 10:42:45 +02:00
5bd7a2c416 Fix merge conflict. Missing parameter when creating EGLContext. 2021-07-06 09:02:29 +02:00
8c3e8d0eb6 Vulkan: Clean-up platform build support
This polishes up the build support for WITH_VULKAN
the find_package calls were moved to the appropriate
place in the platform cmake files.

For windows the required binaries are copied to
the binary folder.

Windows support relies on D11678 landing and
libraries being added to SVN and will automatically
disable WITH_VULKAN when they are not found.
2021-06-30 18:01:36 -06:00
8bfdec76c0 Vulkan: Fix uninitialized ram in VKTexture
The ram being uninitialised caused issues in
VKTexture's destructor which tests for
VK_NULL_HANDLE. The nvidia driver did not
enjoy being fed a bogus pointer and crashed.
2021-06-30 10:53:00 -06:00
9cc5075680 Ghost: Fix unresolved merge conflict
Introduced by rBddd64ba0
2021-06-30 10:48:59 -06:00
2ea8725339 Vulkan: Retrieve platform information.
Retrieve device platform information for `GPUPlatformGlobal`.
As vulkan needs a physical device it was moved from the GPUBackend
constructor to an init method that is called when an initial context
is created.

Still need to find out how to extract the driver type.
2021-06-29 16:51:10 +02:00
a55bbedc06 Merge branch 'master' into tmp-vulkan 2021-06-29 09:59:34 +02:00
5d52da2e65 Vulkan: Compiler log parsing.
For the vulkan backend we control the compiler and therefore the log
parsing is much simpler then the OpenGL parsers.

This patch formats the log to a better readable/debuggable format so
developers can quickly detect what needs to be done.
2021-06-28 16:59:11 +02:00
25b341a4ae Vulkan: Fix double free of shader interface.
interface is allocated by a subclass but freed by the parent class.
Might consider a uniqueptr here.
2021-06-28 15:56:46 +02:00
dfa5bc25ae Vulkan: Enabled GPU tests. 2021-06-28 15:51:16 +02:00
ddd64ba03a Merge branch 'master' into tmp-vulkan 2021-06-28 15:46:45 +02:00
0b3caae8a7 Merge commit 'master@{1-day-ago}' into tmp-vulkan 2021-06-28 15:17:51 +02:00
b1d5950471 Merge commit 'master@{1-weeks-ago}' into tmp-vulkan 2021-06-28 15:04:27 +02:00
99886f91a9 Merge commit 'master@{2-weeks-ago}' into tmp-vulkan 2021-06-28 15:04:21 +02:00
fa00caba88 Merge commit 'master@{3-weeks-ago}' into tmp-vulkan 2021-06-28 14:57:20 +02:00
f853163a54 Merge commit 'master@{4-weeks-ago}' into tmp-vulkan 2021-06-28 14:47:30 +02:00
87492f93a4 Merge commit 'master@{5-weeks-ago}' into tmp-vulkan 2021-06-28 14:27:59 +02:00
6152113b45 Merge commit 'master@{6-weeks-ago}' into tmp-vulkan 2021-06-28 14:06:11 +02:00
2741a4106a Merge commit 'master@{7-weeks-ago}' into tmp-vulkan 2021-06-28 13:51:33 +02:00
3420792ea8 Merge commit 'master@{8-weeks-ago}' into tmp-vulkan 2021-06-28 13:35:48 +02:00
3eaddb4da8 Merge commit 'master@{9-weeks-ago}' into tmp-vulkan 2021-06-28 13:35:39 +02:00
1a85a68a5e Merge commit 'master@{10-weeks-ago}' into tmp-vulkan 2021-06-28 13:25:24 +02:00
adba79a04f Merge commit 'master@{11-weeks-ago}' into tmp-vulkan 2021-06-28 13:15:10 +02:00
c73894ea5b Merge commit 'master@{12-weeks-ago}' into tmp-vulkan 2021-06-28 13:05:21 +02:00
48360afb10 Merge commit 'master@{3-months-ago}' into tmp-vulkan 2021-06-28 12:37:05 +02:00
0493a32803 Initial SpirV log parsing.
Still work in progress.
2021-06-25 16:53:07 +02:00
79346fd72e Moved shader compilation flag to cmake.
Migration of the shaders can take a long time. During this time it
you won't be able to start most GPU tests or Blender. This switch
will enable the compilation of the shader to other developments
can still happen.

`WITH_VULKAN_SHADER_COMPILATION` Default is OFF. Enable when migrating
shaders.
2021-06-25 16:01:05 +02:00
44c875f59a Vulkan: Patched gl_VertexID to gl_VertexIndex. 2021-06-25 15:52:02 +02:00
a68be3b3c2 ShaderCompiler: Use CPP strings to improve API. 2021-06-25 15:28:19 +02:00
c124606d70 Vulkan: Compile Shader Modules.
Shaders are expected to fail as they need to be converted to Vulkan. Dy default the compilation is turned off. Enable VULKAN_SHADER_COMPILATION to activate shader module compilation.
2021-06-25 14:38:50 +02:00
975b09e3e7 ShaderCompiler: Split code per target type.
No functional changes.
2021-06-25 11:01:59 +02:00
15ac620204 Added Vulkan test cases.
Added counterparts of the shader test cases for vulkan.
Fixing an issue where the OpenGL context was created when using a vulkan
backend.
2021-06-25 10:48:07 +02:00
983f61e9c5 Add test case to compile builtin shaders.
When using the vulkan backend it crashes. Will look into this later this
week.
2021-06-23 17:07:33 +02:00
ed959cd912 Cleanup: Added license headers. 2021-06-23 14:14:14 +02:00
193a17474e Added shader compiler abstration.
The shader_compiler is an abstration layer for in case we need to switch
between other shader compilers in the future.

The shader compiler uses a CPP Api as it will only be accessed from GPU
module.

```
  const char *source = "#version 450\nvoid main() {}";
  shader_compiler::Compiler *compiler = shader_compiler::Compiler::create_default();
  shader_compiler::Job job;
  job.source = source;
  job.name = __func__;
  job.source_type = shader_compiler::SourceType::GlslComputeShader;
  job.compilation_target = shader_compiler::TargetType::SpirV;

  shader_compiler::Result *result = compiler->compile(job);
  EXPECT_EQ(result->type, shader_compiler::TargetType::SpirV);
  EXPECT_EQ(result->status_code, shader_compiler::StatusCode::Ok);
  EXPECT_GT(result->bin.size(), 0);
  EXPECT_EQ(result->error_log, "");
  delete result;
  delete compiler;
```

The CMakeFiles need some attention as it currently works when linking to
the shaderlib_combined. On linux it picks up the shaderlib from the
vulkan SDK, eventually it needs to pick up from our prebuild libraries.
2021-06-23 14:04:44 +02:00
5b246fd4b3 Vulkan: Fix offscreen context creation.
On linux both the display and window should be available to create a
window surface. Only the display was checked making it fail later on
when creating a present queue.
2021-06-22 09:22:28 +02:00
f7d7a5aad7 Cleanup: Silence compilation warning. 2021-06-21 15:13:09 +02:00
8b68ee3a17 Vulkan: Fix compilation error. 2021-06-18 15:12:26 +02:00
230b24159e Vulkan: remove VK_ERROR_INCOMPATIBLE_VERSION_KHR
It has been removed from the SDK since version 1.1.18.
2021-06-18 15:10:56 +02:00
a5a01cc0c3 Performance: GPU Batch Baseline TestCase.
When using a dense mesh and transforming a small number of verts the mesh received a copy-on-write signal. This will free all GPU batches. No reuse is currently possible.
This patch adds a test case that can be used as a baseline to optimize the cache construction in the draw module.

Differential Revision: https://developer.blender.org/D11255
2021-05-17 11:16:27 +02:00
e7b3a7028b Merge branch 'master' into tmp-vulkan 2021-03-11 20:17:32 +01:00
fced6f19be Attempt 2 to silence warning. 2021-03-12 00:41:13 +05:30
402e19ddc8 Attempt to silence warning. 2021-03-12 00:37:55 +05:30
af2bc8be40 Vulkan: Implementation of VKTexture
Implementation is a first draft and needs to be refined over time.
2021-03-11 16:56:33 +01:00
72bce1be8e Vulkan: Bypass BGL when running with the vulkan backend
This avoids a crash at startup.
2021-03-11 16:55:12 +01:00
d390f01e48 Vulkan: Fix crash because of null active framebuffer 2021-03-11 16:02:01 +01:00
0956a5134a Vulkan: Add allocator and command pool to VKContext
This allows memory backed object (Textures, VertexBuffers, ...) to be created.
Also adds functions for one time use command buffers.
2021-03-11 16:01:27 +01:00
9a08daae92 GPUContext: Pass GHOST context handle for GPUContext creation
This is because we need to have a direct reference to the context to get
the vulkan resources.
2021-03-11 15:38:39 +01:00
dd4405121c GHOST: Window: Add drawing context handle getter
This is in order to retrieve the vulkan/gl context from a window handle.
2021-03-11 14:53:06 +01:00
b996cc6440 GHOST: Vulkan: Add compile time option to bypass required features
This makes it possible to test the vulkan implementation on all platforms
(mainly MacOS) even if all required features are not supported.
2021-03-11 14:14:14 +01:00
0bcd24c0c4 Vulkan: Add basic implementation of textures
Adds basic texture creation and update as well as texture view support
for attachements.
2021-03-11 13:41:05 +01:00
12e88ee722 Vulkan: Fix compilation after merging 2021-03-11 13:37:39 +01:00
b908c3fa0a VulkanMemoryAllocator: Add new extern library
This library is used to simplify memory allocation / management for Vulkan.
2021-03-11 13:02:02 +01:00
ef8e53c15b GHOST: Vulkan: Fix compilation error 2021-03-07 17:50:43 +01:00
639829ea1a Merge branch 'master' into tmp-vulkan 2021-03-07 17:28:39 +01:00
0ef5c14de0 Merge branch 'master' into tmp-vulkan
# Conflicts:
#	source/blender/blenkernel/BKE_global.h
#	source/blender/gpu/intern/gpu_context.cc
#	source/blender/windowmanager/intern/wm_window.c
#	source/creator/creator_args.c
2021-03-07 17:22:48 +01:00
26fd1c71e1 VK: Initial Base implementation 2020-11-13 22:30:40 +01:00
de062ffd10 Merge branch 'master' into tmp-vulkan
# Conflicts:
#	CMakeLists.txt
2020-09-16 22:05:35 +02:00
fc0f409911 VK: Avoid asserts in debug builds. 2020-09-16 22:04:03 +02:00
c72e6c25d7 GHOST: X11: Enable debug context for vulkan in offscreen contexts 2020-09-16 22:03:34 +02:00
816bd8bed6 GHOST: Vulkan: Rework support for MSVC
This removes designated initializer because not supported by MSVC for our
current C++ version.

Fixes some other compile errors and add usage of the `ContextVK` context in
the Win32 backend.
2020-09-16 22:01:04 +02:00
47613a40f5 VK: OSX: Use debug context and use METAL_SURFACE instead of MACOS_SURFACE 2020-09-13 03:37:35 +02:00
ea5a70973d VK: Make the placeholder backend work
Now we get a green window without a crash
2020-09-13 03:36:22 +02:00
70c0f652de Cleanup: Vulkan: Unused header and avoid duplicated code 2020-09-13 01:53:43 +02:00
e70dcde88b Vulkan: OSX: Use vulkan context for offscreen context 2020-09-13 01:52:57 +02:00
6d82ae3581 Vulkan: Fix compilation on OSX
This was caused by an internal metal type. The fix is a
workaround since I could not find the right solution.
2020-09-13 01:52:03 +02:00
99fc68bf6a GPU: Add Vulkan empty implementation
This does not even builds! WHY?
2020-09-13 00:44:15 +02:00
3efdb3f232 GHOST: Vulkan: Add MoltenVK Backend on osx
This is still highly experimental.
2020-09-13 00:43:43 +02:00
193cebd3c2 Merge branch 'master' into tmp-vulkan 2020-09-12 19:50:22 +02:00
bdeca237ff Merge branch 'master' into tmp-vulkan
# Conflicts:
#	CMakeLists.txt
#	intern/ghost/GHOST_C-api.h
#	intern/ghost/GHOST_ISystem.h
#	intern/ghost/intern/GHOST_C-api.cpp
#	intern/ghost/intern/GHOST_System.h
#	intern/ghost/intern/GHOST_SystemCocoa.h
#	intern/ghost/intern/GHOST_SystemCocoa.mm
#	intern/ghost/intern/GHOST_SystemNULL.h
#	intern/ghost/intern/GHOST_SystemSDL.cpp
#	intern/ghost/intern/GHOST_SystemSDL.h
#	intern/ghost/intern/GHOST_SystemWayland.cpp
#	intern/ghost/intern/GHOST_SystemWayland.h
#	intern/ghost/intern/GHOST_SystemWin32.cpp
#	intern/ghost/intern/GHOST_SystemWin32.h
#	intern/ghost/intern/GHOST_SystemX11.cpp
#	intern/ghost/intern/GHOST_SystemX11.h
#	source/blender/blenkernel/BKE_global.h
#	source/blender/windowmanager/intern/wm_window.c
2020-09-12 19:05:59 +02:00
a08d242acd GPU: Vulkan: Add creator arg to enable vulkan drawing
The creator arg is meant to support more than vulkan or opengl.

This currently fails because there is no GPU backend separation.

Note that we use G.debug with a debug flag to know what context type to
create. This will need to be cleaned later.
2020-07-24 22:08:37 +02:00
5629ff5f69 GHOST: Vulkan: Support window resizing 2020-07-24 19:53:01 +02:00
eb23ecd1b3 GHOST: Vulkan: Style Cleanup 2020-07-24 18:51:08 +02:00
a02da40732 GHOST: Vulkan: Better debug messages 2020-07-24 18:30:05 +02:00
bd0c1d8c53 GHOST: Vulkan: Remove swap interval functions 2020-07-24 17:48:31 +02:00
73b585b0b2 GHOST: Vulkan: Add Test command buffers and sync primitives 2020-07-24 17:00:06 +02:00
b549b1f477 GHOST: Vulkan: Add ImageView, Framebuffer and default RenderPass 2020-07-24 02:13:40 +02:00
c155c10888 GHOST: Vulkan: Add swapchain creation 2020-07-24 00:42:44 +02:00
bc86701b58 Merge branch 'master' into tmp-vulkan 2020-07-23 16:41:42 +02:00
6dc1d823ba Ghost: Vulkan: Add surface creation 2020-07-23 17:05:17 +02:00
392e744ed5 Merge branch 'master' into tmp-vulkan 2020-07-23 15:11:46 +02:00
f56af0b899 GHOST: Vulkan: Add logical device creation 2020-07-20 14:02:05 +02:00
78cbcccf34 GHOST: Vulkan: Add physical device & queue family selection 2020-07-20 13:23:41 +02:00
b671b33871 GHOST: Vulkan: Add instance creation 2020-07-19 22:59:57 +02:00
be9dc493a4 GHOST: Add barebone vulkan context for X11
The context does nothing. For now it always fail and will always
fallback to opengl.
2020-07-19 02:53:07 +02:00
8dec7e5219 Vulkan: Add CMake inclusion
We need to bump minimum version to 3.7 for FindVulkan.cmake to be
available.
2020-07-19 02:53:06 +02:00
b79e7f92bf Fix T78977 GPU: blf fonts are not gamma corrected 2020-07-18 04:25:56 +02:00
816 changed files with 38497 additions and 8974 deletions

View File

@@ -411,6 +411,7 @@ option(WITH_CYCLES "Enable Cycles Render Engine" ON)
option(WITH_CYCLES_OSL "Build Cycles with OpenShadingLanguage support" ON)
option(WITH_CYCLES_EMBREE "Build Cycles with Embree support" ON)
option(WITH_CYCLES_LOGGING "Build Cycles with logging support" ON)
option(WITH_CYCLES_DEBUG "Build Cycles with options useful for debugging (e.g., MIS)" OFF)
option(WITH_CYCLES_STANDALONE "Build Cycles standalone application" OFF)
option(WITH_CYCLES_STANDALONE_GUI "Build Cycles standalone with GUI" OFF)
@@ -440,7 +441,11 @@ mark_as_advanced(WITH_CYCLES_CUDA_BUILD_SERIAL)
mark_as_advanced(WITH_CUDA_DYNLOAD)
# AMD HIP
option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" OFF)
if(WIN32)
option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" ON)
else()
option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" OFF)
endif()
option(WITH_CYCLES_HIP_BINARIES "Build Cycles AMD HIP binaries" OFF)
set(CYCLES_HIP_BINARIES_ARCH gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1034 CACHE STRING "AMD HIP architectures to build binaries for")
mark_as_advanced(WITH_CYCLES_DEVICE_HIP)
@@ -509,9 +514,13 @@ if(UNIX AND NOT APPLE)
endif()
# Vulkan
option(WITH_VULKAN "Enable Vulkan backend (Experimental)" OFF)
option(WITH_VULKAN_SHADER_COMPILATION "Temporary flag to enable vulkan shader compilation needed to continue development during the migration of GLSL to Vulkan." OFF)
# OpenGL
option(WITH_OPENGL "When off limits visibility of the opengl headers to just bf_gpu and gawain (temporary option for development purposes)" ON)
option(WITH_OPENGL "When off limits visibility of the opengl headers to just bf_gpu (temporary option for development purposes)" ON)
option(WITH_GLEW_ES "Switches to experimental copy of GLEW that has support for OpenGL ES. (temporary option for development purposes)" OFF)
option(WITH_GL_EGL "Use the EGL OpenGL system library instead of the platform specific OpenGL system library (CGL, glX, or WGL)" OFF)
option(WITH_GL_PROFILE_ES20 "Support using OpenGL ES 2.0. (through either EGL or the AGL/WGL/XGL 'es20' profile)" OFF)
@@ -521,6 +530,7 @@ mark_as_advanced(
WITH_GLEW_ES
WITH_GL_EGL
WITH_GL_PROFILE_ES20
WITH_VULKAN_SHADER_COMPILATION
)
if(WIN32)
@@ -1065,7 +1075,7 @@ if(MSVC)
add_definitions(-D__LITTLE_ENDIAN__)
# OSX-Note: as we do cross-compiling with specific set architecture,
# endianess-detection and auto-setting is counterproductive
# endianness-detection and auto-setting is counterproductive
# so we just set endianness according CMAKE_OSX_ARCHITECTURES
elseif(CMAKE_OSX_ARCHITECTURES MATCHES i386 OR CMAKE_OSX_ARCHITECTURES MATCHES x86_64 OR CMAKE_OSX_ARCHITECTURES MATCHES arm64)
@@ -1121,6 +1131,18 @@ if(WITH_OPENVDB)
list(APPEND OPENVDB_LIBRARIES ${BOOST_LIBRARIES} ${TBB_LIBRARIES})
endif()
#-----------------------------------------------------------------------------
# Configure Vulkan.
if(WITH_VULKAN)
list(APPEND BLENDER_GL_LIBRARIES ${Vulkan_LIBRARY})
add_definitions(-DWITH_VULKAN)
if(WITH_VULKAN_SHADER_COMPILATION)
add_definitions(-DWITH_VULKAN_SHADER_COMPILATION)
endif()
endif()
#-----------------------------------------------------------------------------
# Configure OpenGL.
@@ -1755,7 +1777,7 @@ endif()
set(CMAKE_CXX_STANDARD 17)
# If C++17 is not available, downgrading to an earlier standard is NOT OK.
set(CMAKE_CXX_STANDARD_REQUIRED ON)
# Do not enable compiler specific language extentions.
# Do not enable compiler specific language extensions.
set(CMAKE_CXX_EXTENSIONS OFF)
# Make MSVC properly report the value of the __cplusplus preprocessor macro

View File

@@ -51,7 +51,7 @@ Other Convenience Targets
* config: Run cmake configuration tool to set build options.
* deps: Build library dependencies (intended only for platform maintainers).
The existance of locally build dependancies overrides the pre-built dependencies from subversion.
The existance of locally build dependencies overrides the pre-built dependencies from subversion.
These must be manually removed from '../lib/' to go back to using the pre-compiled libraries.
Project Files

View File

@@ -17,7 +17,7 @@
# ***** END GPL LICENSE BLOCK *****
########################################################################
# Copy all generated files to the proper strucure as blender prefers
# Copy all generated files to the proper structure as blender prefers
########################################################################
if(NOT DEFINED HARVEST_TARGET)

View File

@@ -42,6 +42,7 @@ ExternalProject_Add(nanovdb
URL_HASH ${NANOVDB_HASH_TYPE}=${NANOVDB_HASH}
PREFIX ${BUILD_DIR}/nanovdb
SOURCE_SUBDIR nanovdb
PATCH_COMMAND ${PATCH_CMD} -p 1 -d ${BUILD_DIR}/nanovdb/src/nanovdb < ${PATCH_DIR}/nanovdb.diff
CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${LIBDIR}/nanovdb ${DEFAULT_CMAKE_FLAGS} ${NANOVDB_EXTRA_ARGS}
INSTALL_DIR ${LIBDIR}/nanovdb
)

View File

@@ -39,7 +39,7 @@ endif()
set(DOWNLOAD_DIR "${CMAKE_CURRENT_BINARY_DIR}/downloads" CACHE STRING "Path for downloaded files")
# This path must be hard-coded like this, so that the GNUmakefile knows where it is and can pass it to make_source_archive.py:
set(PACKAGE_DIR "${CMAKE_CURRENT_BINARY_DIR}/packages")
option(PACKAGE_USE_UPSTREAM_SOURCES "Use soures upstream to download the package sources, when OFF the blender mirror will be used" ON)
option(PACKAGE_USE_UPSTREAM_SOURCES "Use sources upstream to download the package sources, when OFF the blender mirror will be used" ON)
file(TO_CMAKE_PATH ${DOWNLOAD_DIR} DOWNLOAD_DIR)
file(TO_CMAKE_PATH ${PACKAGE_DIR} PACKAGE_DIR)

View File

@@ -24,7 +24,7 @@ if(MSVC)
add_custom_command(
OUTPUT ${PYTARGET}/bin/python${PYTHON_POSTFIX}.exe
COMMAND echo packaging python
COMMAND echo this should ouput at ${PYTARGET}/bin/python${PYTHON_POSTFIX}.exe
COMMAND echo this should output at ${PYTARGET}/bin/python${PYTHON_POSTFIX}.exe
COMMAND ${CMAKE_COMMAND} -E make_directory ${PYTARGET}/libs
COMMAND ${CMAKE_COMMAND} -E copy ${PYSRC}/libs/python${PYTHON_SHORT_VERSION_NO_DOTS}.lib ${PYTARGET}/libs/python${PYTHON_SHORT_VERSION_NO_DOTS}.lib
COMMAND ${CMAKE_COMMAND} -E copy ${PYSRC}/python.exe ${PYTARGET}/bin/python.exe
@@ -43,7 +43,7 @@ if(MSVC)
add_custom_command(
OUTPUT ${PYTARGET}/bin/python${PYTHON_POSTFIX}.exe
COMMAND echo packaging python
COMMAND echo this should ouput at ${PYTARGET}/bin/python${PYTHON_POSTFIX}.exe
COMMAND echo this should output at ${PYTARGET}/bin/python${PYTHON_POSTFIX}.exe
COMMAND ${CMAKE_COMMAND} -E make_directory ${PYTARGET}/libs
COMMAND ${CMAKE_COMMAND} -E copy ${PYSRC}/libs/python${PYTHON_SHORT_VERSION_NO_DOTS}${PYTHON_POSTFIX}.lib ${PYTARGET}/libs/python${PYTHON_SHORT_VERSION_NO_DOTS}${PYTHON_POSTFIX}.lib
COMMAND ${CMAKE_COMMAND} -E copy ${PYSRC}/python${PYTHON_POSTFIX}.exe ${PYTARGET}/bin/python${PYTHON_POSTFIX}.exe

View File

@@ -1826,7 +1826,7 @@ compile_OCIO() {
# Force linking against static libs
#rm -f $_inst/lib/*.so*
# Additional depencencies
# Additional dependencies
#cp ext/dist/lib/libtinyxml.a $_inst/lib
#cp ext/dist/lib/libyaml-cpp.a $_inst/lib

View File

@@ -0,0 +1,374 @@
Index: nanovdb/nanovdb/NanoVDB.h
===================================================================
--- a/nanovdb/nanovdb/NanoVDB.h (revision 62751)
+++ b/nanovdb/nanovdb/NanoVDB.h (working copy)
@@ -152,8 +152,8 @@
#endif // __CUDACC_RTC__
-#ifdef __CUDACC__
-// Only define __hostdev__ when using NVIDIA CUDA compiler
+#if defined(__CUDACC__) || defined(__HIP__)
+// Only define __hostdev__ when using NVIDIA CUDA or HIP compiler
#define __hostdev__ __host__ __device__
#else
#define __hostdev__
@@ -461,7 +461,7 @@
/// Maximum floating-point values
template<typename T>
struct Maximum;
-#ifdef __CUDA_ARCH__
+#if defined(__CUDA_ARCH__) || defined(__HIP__)
template<>
struct Maximum<int>
{
@@ -1006,10 +1006,10 @@
using Vec3i = Vec3<int>;
/// @brief Return a single precision floating-point vector of this coordinate
-Vec3f Coord::asVec3s() const { return Vec3f(float(mVec[0]), float(mVec[1]), float(mVec[2])); }
+inline __hostdev__ Vec3f Coord::asVec3s() const { return Vec3f(float(mVec[0]), float(mVec[1]), float(mVec[2])); }
/// @brief Return a double precision floating-point vector of this coordinate
-Vec3d Coord::asVec3d() const { return Vec3d(double(mVec[0]), double(mVec[1]), double(mVec[2])); }
+inline __hostdev__ Vec3d Coord::asVec3d() const { return Vec3d(double(mVec[0]), double(mVec[1]), double(mVec[2])); }
// ----------------------------> Vec4 <--------------------------------------
@@ -1820,7 +1820,7 @@
}; // Map
template<typename Mat4T>
-void Map::set(const Mat4T& mat, const Mat4T& invMat, double taper)
+__hostdev__ void Map::set(const Mat4T& mat, const Mat4T& invMat, double taper)
{
float * mf = mMatF, *vf = mVecF;
float* mif = mInvMatF;
@@ -2170,7 +2170,7 @@
}; // Class Grid
template<typename TreeT>
-int Grid<TreeT>::findBlindDataForSemantic(GridBlindDataSemantic semantic) const
+__hostdev__ int Grid<TreeT>::findBlindDataForSemantic(GridBlindDataSemantic semantic) const
{
for (uint32_t i = 0, n = blindDataCount(); i < n; ++i)
if (blindMetaData(i).mSemantic == semantic)
@@ -2328,7 +2328,7 @@
}; // Tree class
template<typename RootT>
-void Tree<RootT>::extrema(ValueType& min, ValueType& max) const
+__hostdev__ void Tree<RootT>::extrema(ValueType& min, ValueType& max) const
{
min = this->root().valueMin();
max = this->root().valueMax();
@@ -2336,7 +2336,7 @@
template<typename RootT>
template<typename NodeT>
-const NodeT* Tree<RootT>::getNode(uint32_t i) const
+__hostdev__ const NodeT* Tree<RootT>::getNode(uint32_t i) const
{
static_assert(is_same<TreeNodeT<NodeT::LEVEL>, NodeT>::value, "Tree::getNode: unvalid node type");
NANOVDB_ASSERT(i < DataType::mCount[NodeT::LEVEL]);
@@ -2345,7 +2345,7 @@
template<typename RootT>
template<int LEVEL>
-const typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i) const
+__hostdev__ const typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i) const
{
NANOVDB_ASSERT(i < DataType::mCount[LEVEL]);
return reinterpret_cast<const TreeNodeT<LEVEL>*>(reinterpret_cast<const uint8_t*>(this) + DataType::mBytes[LEVEL]) + i;
@@ -2353,7 +2353,7 @@
template<typename RootT>
template<typename NodeT>
-NodeT* Tree<RootT>::getNode(uint32_t i)
+__hostdev__ NodeT* Tree<RootT>::getNode(uint32_t i)
{
static_assert(is_same<TreeNodeT<NodeT::LEVEL>, NodeT>::value, "Tree::getNode: invalid node type");
NANOVDB_ASSERT(i < DataType::mCount[NodeT::LEVEL]);
@@ -2362,7 +2362,7 @@
template<typename RootT>
template<int LEVEL>
-typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i)
+__hostdev__ typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i)
{
NANOVDB_ASSERT(i < DataType::mCount[LEVEL]);
return reinterpret_cast<TreeNodeT<LEVEL>*>(reinterpret_cast<uint8_t*>(this) + DataType::mBytes[LEVEL]) + i;
@@ -2370,7 +2370,7 @@
template<typename RootT>
template<typename NodeT>
-uint32_t Tree<RootT>::getNodeID(const NodeT& node) const
+__hostdev__ uint32_t Tree<RootT>::getNodeID(const NodeT& node) const
{
static_assert(is_same<TreeNodeT<NodeT::LEVEL>, NodeT>::value, "Tree::getNodeID: invalid node type");
const NodeT* first = reinterpret_cast<const NodeT*>(reinterpret_cast<const uint8_t*>(this) + DataType::mBytes[NodeT::LEVEL]);
@@ -2380,7 +2380,7 @@
template<typename RootT>
template<typename NodeT>
-uint32_t Tree<RootT>::getLinearOffset(const NodeT& node) const
+__hostdev__ uint32_t Tree<RootT>::getLinearOffset(const NodeT& node) const
{
return this->getNodeID(node) + DataType::mPFSum[NodeT::LEVEL];
}
@@ -3366,7 +3366,7 @@
}; // LeafNode class
template<typename ValueT, typename CoordT, template<uint32_t> class MaskT, uint32_t LOG2DIM>
-inline void LeafNode<ValueT, CoordT, MaskT, LOG2DIM>::updateBBox()
+inline __hostdev__ void LeafNode<ValueT, CoordT, MaskT, LOG2DIM>::updateBBox()
{
static_assert(LOG2DIM == 3, "LeafNode::updateBBox: only supports LOGDIM = 3!");
if (!this->isActive()) return;
Index: nanovdb/nanovdb/util/SampleFromVoxels.h
===================================================================
--- a/nanovdb/nanovdb/util/SampleFromVoxels.h (revision 62751)
+++ b/nanovdb/nanovdb/util/SampleFromVoxels.h (working copy)
@@ -22,7 +22,7 @@
#define NANOVDB_SAMPLE_FROM_VOXELS_H_HAS_BEEN_INCLUDED
// Only define __hostdev__ when compiling as NVIDIA CUDA
-#ifdef __CUDACC__
+#if defined(__CUDACC__) || defined(__HIP__)
#define __hostdev__ __host__ __device__
#else
#include <cmath> // for floor
@@ -136,7 +136,7 @@
template<typename TreeOrAccT>
template<typename Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const Vec3T& xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const Vec3T& xyz) const
{
const CoordT ijk = Round<CoordT>(xyz);
if (ijk != mPos) {
@@ -147,7 +147,7 @@
}
template<typename TreeOrAccT>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const CoordT& ijk) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const CoordT& ijk) const
{
if (ijk != mPos) {
mPos = ijk;
@@ -158,7 +158,7 @@
template<typename TreeOrAccT>
template<typename Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, false>::operator()(const Vec3T& xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, false>::operator()(const Vec3T& xyz) const
{
return mAcc.getValue(Round<CoordT>(xyz));
}
@@ -195,7 +195,7 @@
}; // TrilinearSamplerBase
template<typename TreeOrAccT>
-void TrilinearSampler<TreeOrAccT>::stencil(CoordT& ijk, ValueT (&v)[2][2][2]) const
+__hostdev__ void TrilinearSampler<TreeOrAccT>::stencil(CoordT& ijk, ValueT (&v)[2][2][2]) const
{
v[0][0][0] = mAcc.getValue(ijk); // i, j, k
@@ -224,7 +224,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType TrilinearSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
+__hostdev__ typename TreeOrAccT::ValueType TrilinearSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
{
#if 0
auto lerp = [](ValueT a, ValueT b, ValueT w){ return fma(w, b-a, a); };// = w*(b-a) + a
@@ -239,7 +239,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-Vec3T<typename TreeOrAccT::ValueType> TrilinearSampler<TreeOrAccT>::gradient(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
+__hostdev__ Vec3T<typename TreeOrAccT::ValueType> TrilinearSampler<TreeOrAccT>::gradient(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
{
static_assert(std::is_floating_point<ValueT>::value, "TrilinearSampler::gradient requires a floating-point type");
#if 0
@@ -270,7 +270,7 @@
}
template<typename TreeOrAccT>
-bool TrilinearSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[2][2][2])
+__hostdev__ bool TrilinearSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[2][2][2])
{
static_assert(std::is_floating_point<ValueT>::value, "TrilinearSampler::zeroCrossing requires a floating-point type");
const bool less = v[0][0][0] < ValueT(0);
@@ -363,7 +363,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::sample(xyz, mVal);
@@ -370,7 +370,7 @@
}
template<typename TreeOrAccT>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(const CoordT &ijk) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(const CoordT &ijk) const
{
return ijk == mPos ? mVal[0][0][0] : BaseT::mAcc.getValue(ijk);
}
@@ -377,7 +377,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, true>::gradient(Vec3T<RealT> xyz) const
+__hostdev__ Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, true>::gradient(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::gradient(xyz, mVal);
@@ -393,7 +393,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-void SampleFromVoxels<TreeOrAccT, 1, true>::cache(Vec3T<RealT>& xyz) const
+__hostdev__ void SampleFromVoxels<TreeOrAccT, 1, true>::cache(Vec3T<RealT>& xyz) const
{
CoordT ijk = Floor<CoordT>(xyz);
if (ijk != mPos) {
@@ -406,7 +406,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
{
ValueT val[2][2][2];
CoordT ijk = Floor<CoordT>(xyz);
@@ -418,7 +418,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
{
auto lerp = [](ValueT a, ValueT b, RealT w) { return a + ValueT(w) * (b - a); };
@@ -463,7 +463,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-inline Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, false>::gradient(Vec3T<RealT> xyz) const
+inline __hostdev__ Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, false>::gradient(Vec3T<RealT> xyz) const
{
ValueT val[2][2][2];
CoordT ijk = Floor<CoordT>(xyz);
@@ -473,7 +473,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-bool SampleFromVoxels<TreeOrAccT, 1, false>::zeroCrossing(Vec3T<RealT> xyz) const
+__hostdev__ bool SampleFromVoxels<TreeOrAccT, 1, false>::zeroCrossing(Vec3T<RealT> xyz) const
{
ValueT val[2][2][2];
CoordT ijk = Floor<CoordT>(xyz);
@@ -510,7 +510,7 @@
}; // TriquadraticSamplerBase
template<typename TreeOrAccT>
-void TriquadraticSampler<TreeOrAccT>::stencil(const CoordT &ijk, ValueT (&v)[3][3][3]) const
+__hostdev__ void TriquadraticSampler<TreeOrAccT>::stencil(const CoordT &ijk, ValueT (&v)[3][3][3]) const
{
CoordT p(ijk[0] - 1, 0, 0);
for (int dx = 0; dx < 3; ++dx, ++p[0]) {
@@ -526,7 +526,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType TriquadraticSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[3][3][3])
+__hostdev__ typename TreeOrAccT::ValueType TriquadraticSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[3][3][3])
{
auto kernel = [](const ValueT* value, double weight)->ValueT {
return weight * (weight * (0.5f * (value[0] + value[2]) - value[1]) +
@@ -545,7 +545,7 @@
}
template<typename TreeOrAccT>
-bool TriquadraticSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[3][3][3])
+__hostdev__ bool TriquadraticSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[3][3][3])
{
static_assert(std::is_floating_point<ValueT>::value, "TrilinearSampler::zeroCrossing requires a floating-point type");
const bool less = v[0][0][0] < ValueT(0);
@@ -624,7 +624,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::sample(xyz, mVal);
@@ -631,7 +631,7 @@
}
template<typename TreeOrAccT>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(const CoordT &ijk) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(const CoordT &ijk) const
{
return ijk == mPos ? mVal[1][1][1] : BaseT::mAcc.getValue(ijk);
}
@@ -646,7 +646,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-void SampleFromVoxels<TreeOrAccT, 2, true>::cache(Vec3T<RealT>& xyz) const
+__hostdev__ void SampleFromVoxels<TreeOrAccT, 2, true>::cache(Vec3T<RealT>& xyz) const
{
CoordT ijk = Floor<CoordT>(xyz);
if (ijk != mPos) {
@@ -657,7 +657,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, false>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, false>::operator()(Vec3T<RealT> xyz) const
{
ValueT val[3][3][3];
CoordT ijk = Floor<CoordT>(xyz);
@@ -667,7 +667,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-bool SampleFromVoxels<TreeOrAccT, 2, false>::zeroCrossing(Vec3T<RealT> xyz) const
+__hostdev__ bool SampleFromVoxels<TreeOrAccT, 2, false>::zeroCrossing(Vec3T<RealT> xyz) const
{
ValueT val[3][3][3];
CoordT ijk = Floor<CoordT>(xyz);
@@ -710,7 +710,7 @@
}; // TricubicSampler
template<typename TreeOrAccT>
-void TricubicSampler<TreeOrAccT>::stencil(const CoordT& ijk, ValueT (&C)[64]) const
+__hostdev__ void TricubicSampler<TreeOrAccT>::stencil(const CoordT& ijk, ValueT (&C)[64]) const
{
auto fetch = [&](int i, int j, int k) -> ValueT& { return C[((i + 1) << 4) + ((j + 1) << 2) + k + 1]; };
@@ -929,7 +929,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 3, true>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 3, true>::operator()(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::sample(xyz, mC);
@@ -937,7 +937,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-void SampleFromVoxels<TreeOrAccT, 3, true>::cache(Vec3T<RealT>& xyz) const
+__hostdev__ void SampleFromVoxels<TreeOrAccT, 3, true>::cache(Vec3T<RealT>& xyz) const
{
CoordT ijk = Floor<CoordT>(xyz);
if (ijk != mPos) {

View File

@@ -0,0 +1,66 @@
# - Find SHADERC library
# Find the native Haru includes and library
# This module defines
# SHADERC_INCLUDE_DIRS, where to find hpdf.h, set when
# SHADERC_INCLUDE_DIR is found.
# SHADERC_LIBRARIES, libraries to link against to use Haru.
# SHADERC_ROOT_DIR, The base directory to search for Haru.
# This can also be an environment variable.
# SHADERC_FOUND, If false, do not try to use Haru.
#
# also defined, but not for general use are
# SHADERC_LIBRARY, where to find the Haru library.
#=============================================================================
# Copyright 2021 Blender Foundation.
#
# Distributed under the OSI-approved BSD 3-Clause License,
# see accompanying file BSD-3-Clause-license.txt for details.
#=============================================================================
# If SHADERC_ROOT_DIR was defined in the environment, use it.
if(NOT SHADERC_ROOT_DIR AND NOT $ENV{SHADERC_ROOT_DIR} STREQUAL "")
set(SHADERC_ROOT_DIR $ENV{SHADERC_ROOT_DIR})
endif()
set(_shaderc_SEARCH_DIRS
${SHADERC_ROOT_DIR}
/opt/lib/haru
)
find_path(SHADERC_INCLUDE_DIR
NAMES
shaderc.hpp
HINTS
${_shaderc_SEARCH_DIRS}
PATH_SUFFIXES
include/shaderc
include
)
find_library(SHADERC_LIBRARY
NAMES
shaderc_combined
shaderc
HINTS
${_shaderc_SEARCH_DIRS}
PATH_SUFFIXES
lib64 lib
)
# Handle the QUIETLY and REQUIRED arguments and set SHADERC_FOUND to TRUE if
# all listed variables are TRUE.
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(ShaderC DEFAULT_MSG SHADERC_LIBRARY SHADERC_INCLUDE_DIR)
if(SHADERC_FOUND)
set(SHADERC_LIBRARIES ${SHADERC_LIBRARY})
set(SHADERC_INCLUDE_DIRS ${SHADERC_INCLUDE_DIR})
endif()
mark_as_advanced(
SHADERC_INCLUDE_DIR
SHADERC_LIBRARY
)
unset(_shaderc_SEARCH_DIRS)

View File

@@ -180,7 +180,7 @@ def create_nb_project_main():
f.write(' </logicalFolder>\n')
f.write(' </logicalFolder>\n')
# default, but this dir is infact not in blender dir so we can ignore it
# default, but this dir is in fact not in blender dir so we can ignore it
# f.write(' <sourceFolderFilter>^(nbproject)$</sourceFolderFilter>\n')
f.write(r' <sourceFolderFilter>^(nbproject|__pycache__|.*\.py|.*\.html|.*\.blend)$</sourceFolderFilter>\n')

View File

@@ -81,4 +81,5 @@ if(NOT APPLE)
set(WITH_CYCLES_DEVICE_OPTIX ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_CUDA_BINARIES ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_CUBIN_COMPILER OFF CACHE BOOL "" FORCE)
set(WITH_CYCLES_HIP_BINARIES ON CACHE BOOL "" FORCE)
endif()

View File

@@ -529,7 +529,7 @@ function(SETUP_LIBDIRS)
# NOTE: For all new libraries, use absolute library paths.
# This should eventually be phased out.
# APPLE plaform uses full paths for linking libraries, and avoids link_directories.
# APPLE platform uses full paths for linking libraries, and avoids link_directories.
if(NOT MSVC AND NOT APPLE)
link_directories(${JPEG_LIBPATH} ${PNG_LIBPATH} ${ZLIB_LIBPATH} ${FREETYPE_LIBPATH})

View File

@@ -102,6 +102,11 @@ find_package_wrapper(ZLIB REQUIRED)
find_package_wrapper(Zstd REQUIRED)
find_package_wrapper(Freetype REQUIRED)
if(WITH_VULKAN)
find_package_wrapper(Vulkan REQUIRED)
find_package(ShaderC REQUIRED)
endif()
if(WITH_PYTHON)
# No way to set py35, remove for now.
# find_package(PythonLibs)

View File

@@ -874,5 +874,32 @@ if(WITH_HARU)
endif()
endif()
if(WITH_VULKAN)
if(EXISTS ${LIBDIR}/vulkan)
set(Vulkan_FOUND On)
set(Vulkan_ROOT_DIR ${LIBDIR}/vulkan)
set(Vulkan_INCLUDE_DIR ${Vulkan_ROOT_DIR}/include)
set(Vulkan_INCLUDE_DIRS ${Vulkan_INCLUDE_DIR})
set(Vulkan_LIBRARY ${Vulkan_ROOT_DIR}/lib/vulkan-1.lib)
set(Vulkan_LIBRARIES ${Vulkan_LIBRARY})
else()
message(WARNING "vulkan was not found, disabling WITH_VULKAN")
set(WITH_VULKAN OFF)
endif()
endif()
if(WITH_VULKAN)
if(EXISTS ${LIBDIR}/shaderc)
set(SHADERC_ROOT_DIR ${LIBDIR}/shaderc)
set(SHADERC_INCLUDE_DIR ${SHADERC_ROOT_DIR}/include)
set(SHADERC_INCLUDE_DIRS ${SHADERC_INCLUDE_DIR})
set(SHADERC_LIBRARY optimized ${SHADERC_ROOT_DIR}/lib/shaderc_shared.lib debug ${SHADERC_ROOT_DIR}/lib/shaderc_shared_d.lib)
set(SHADERC_LIBRARIES ${SHADERC_LIBRARY})
else()
message(WARNING "shaderc was not found, disabling WITH_VULKAN")
set(WITH_VULKAN OFF)
endif()
endif()
set(ZSTD_INCLUDE_DIRS ${LIBDIR}/zstd/include)
set(ZSTD_LIBRARIES ${LIBDIR}/zstd/lib/zstd_static.lib)

View File

@@ -27,7 +27,7 @@ if(WITH_WINDOWS_BUNDLE_CRT)
# Install the CRT to the blender.crt Sub folder.
install(FILES ${CMAKE_INSTALL_SYSTEM_RUNTIME_LIBS} DESTINATION ./blender.crt COMPONENT Libraries)
# Generating the manifest is a relativly expensive operation since
# Generating the manifest is a relatively expensive operation since
# it is collecting an sha1 hash for every file required. so only do
# this work when the libs have either changed or the manifest does
# not exist yet.

View File

@@ -11,7 +11,7 @@ import queue
execution_queue = queue.Queue()
# This function can savely be called in another thread.
# This function can safely be called in another thread.
# The function will be executed when the timer runs the next time.
def run_in_main_thread(function):
execution_queue.put(function)

View File

@@ -42,8 +42,13 @@ class SimpleMouseOperator(bpy.types.Operator):
self.y = event.mouse_y
return self.execute(context)
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(SimpleMouseOperator.bl_idname, text="Simple Mouse Operator")
# Register and add to the view menu (required to also use F3 search "Simple Mouse Operator" for quick access)
bpy.utils.register_class(SimpleMouseOperator)
bpy.types.VIEW3D_MT_view.append(menu_func)
# Test call to the newly defined operator.
# Here we call the operator and invoke it, meaning that the settings are taken

View File

@@ -43,7 +43,7 @@ def menu_func(self, context):
self.layout.operator(ExportSomeData.bl_idname, text="Text Export Operator")
# Register and add to the file selector
# Register and add to the file selector (required to also use F3 search "Text Export Operator" for quick access)
bpy.utils.register_class(ExportSomeData)
bpy.types.TOPBAR_MT_file_export.append(menu_func)

View File

@@ -27,8 +27,14 @@ class DialogOperator(bpy.types.Operator):
wm = context.window_manager
return wm.invoke_props_dialog(self)
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(DialogOperator.bl_idname, text="Dialog Operator")
# Register and add to the object menu (required to also use F3 search "Dialog Operator" for quick access)
bpy.utils.register_class(DialogOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# Test call.
bpy.ops.object.dialog_operator('INVOKE_DEFAULT')

View File

@@ -41,8 +41,13 @@ class CustomDrawOperator(bpy.types.Operator):
col.prop(self, "my_string")
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(CustomDrawOperator.bl_idname, text="Custom Draw Operator")
# Register and add to the object menu (required to also use F3 search "Custom Draw Operator" for quick access)
bpy.utils.register_class(CustomDrawOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# test call
bpy.ops.object.custom_draw('INVOKE_DEFAULT')

View File

@@ -55,8 +55,13 @@ class ModalOperator(bpy.types.Operator):
context.window_manager.modal_handler_add(self)
return {'RUNNING_MODAL'}
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(ModalOperator.bl_idname, text="Modal Operator")
# Register and add to the object menu (required to also use F3 search "Modal Operator" for quick access)
bpy.utils.register_class(ModalOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# test call
bpy.ops.object.modal_operator('INVOKE_DEFAULT')

View File

@@ -31,8 +31,13 @@ class SearchEnumOperator(bpy.types.Operator):
context.window_manager.invoke_search_popup(self)
return {'RUNNING_MODAL'}
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(SearchEnumOperator.bl_idname, text="Search Enum Operator")
# Register and add to the object menu (required to also use F3 search "Search Enum Operator" for quick access)
bpy.utils.register_class(SearchEnumOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# test call
bpy.ops.object.search_enum_operator('INVOKE_DEFAULT')

View File

@@ -22,8 +22,13 @@ class HelloWorldOperator(bpy.types.Operator):
print("Hello World")
return {'FINISHED'}
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(HelloWorldOperator.bl_idname, text="Hello World Operator")
# Register and add to the view menu (required to also use F3 search "Hello World Operator" for quick access)
bpy.utils.register_class(HelloWorldOperator)
bpy.types.VIEW3D_MT_view.append(menu_func)
# test call to the newly defined operator
bpy.ops.wm.hello_world()

View File

@@ -728,7 +728,7 @@ Abusing RNA property callbacks
------------------------------
Python-defined RNA properties can have custom callbacks. Trying to perform complex operations
from there, like calling an operator, may work, but is not officialy recommended nor supported.
from there, like calling an operator, may work, but is not officially recommended nor supported.
Main reason is that those callback should be very fast, but additionally, it may for example
create issues with undo/redo system (most operators store an history step, and editing an RNA

View File

@@ -116,3 +116,7 @@ endif()
if (WITH_COMPOSITOR)
add_subdirectory(smaa_areatex)
endif()
if(WITH_VULKAN)
add_subdirectory(vulkan_memory_allocator)
endif()

View File

@@ -804,31 +804,29 @@ typedef enum hipDeviceP2PAttr {
} hipDeviceP2PAttr;
typedef struct HIP_MEMCPY3D {
size_t srcXInBytes;
size_t srcY;
size_t srcZ;
size_t srcLOD;
unsigned int srcXInBytes;
unsigned int srcY;
unsigned int srcZ;
unsigned int srcLOD;
hipMemoryType srcMemoryType;
const void* srcHost;
hipDeviceptr_t srcDevice;
hArray * srcArray;
void* reserved0;
size_t srcPitch;
size_t srcHeight;
size_t dstXInBytes;
size_t dstY;
size_t dstZ;
size_t dstLOD;
hArray srcArray;
unsigned int srcPitch;
unsigned int srcHeight;
unsigned int dstXInBytes;
unsigned int dstY;
unsigned int dstZ;
unsigned int dstLOD;
hipMemoryType dstMemoryType;
void* dstHost;
hipDeviceptr_t dstDevice;
hArray * dstArray;
void* reserved1;
size_t dstPitch;
size_t dstHeight;
size_t WidthInBytes;
size_t Height;
size_t Depth;
hArray dstArray;
unsigned int dstPitch;
unsigned int dstHeight;
unsigned int WidthInBytes;
unsigned int Height;
unsigned int Depth;
} HIP_MEMCPY3D;
typedef struct HIP_MEMCPY3D_PEER_st {
@@ -879,7 +877,7 @@ typedef struct HIP_RESOURCE_DESC_st {
hipResourceType resType;
union {
struct {
hArray * h_Array;
hArray h_Array;
} array;
struct {
hipMipmappedArray_t hMipmappedArray;
@@ -1074,9 +1072,10 @@ typedef enum hiprtcResult {
typedef hipError_t HIPAPI thipGetErrorName(hipError_t error, const char** pStr);
typedef hipError_t HIPAPI thipInit(unsigned int Flags);
typedef hipError_t HIPAPI thipDriverGetVersion(int* driverVersion);
typedef hipError_t HIPAPI thipGetDevice(hipDevice_t* device, int ordinal);
typedef hipError_t HIPAPI thipGetDevice(int* device);
typedef hipError_t HIPAPI thipGetDeviceCount(int* count);
typedef hipError_t HIPAPI thipGetDeviceProperties(hipDeviceProp_t* props, int deviceId);
typedef hipError_t HIPAPI thipDeviceGet(hipDevice_t* device, int ordinal);
typedef hipError_t HIPAPI thipDeviceGetName(char* name, int len, hipDevice_t dev);
typedef hipError_t HIPAPI thipDeviceGetAttribute(int* pi, hipDeviceAttribute_t attrib, hipDevice_t dev);
typedef hipError_t HIPAPI thipDeviceComputeCapability(int* major, int* minor, hipDevice_t dev);
@@ -1209,6 +1208,7 @@ extern thipDriverGetVersion *hipDriverGetVersion;
extern thipGetDevice *hipGetDevice;
extern thipGetDeviceCount *hipGetDeviceCount;
extern thipGetDeviceProperties *hipGetDeviceProperties;
extern thipDeviceGet* hipDeviceGet;
extern thipDeviceGetName *hipDeviceGetName;
extern thipDeviceGetAttribute *hipDeviceGetAttribute;
extern thipDeviceComputeCapability *hipDeviceComputeCapability;

View File

@@ -71,6 +71,7 @@ thipDriverGetVersion *hipDriverGetVersion;
thipGetDevice *hipGetDevice;
thipGetDeviceCount *hipGetDeviceCount;
thipGetDeviceProperties *hipGetDeviceProperties;
thipDeviceGet* hipDeviceGet;
thipDeviceGetName *hipDeviceGetName;
thipDeviceGetAttribute *hipDeviceGetAttribute;
thipDeviceComputeCapability *hipDeviceComputeCapability;
@@ -255,6 +256,7 @@ static int hipewHipInit(void) {
HIP_LIBRARY_FIND_CHECKED(hipGetDevice);
HIP_LIBRARY_FIND_CHECKED(hipGetDeviceCount);
HIP_LIBRARY_FIND_CHECKED(hipGetDeviceProperties);
HIP_LIBRARY_FIND_CHECKED(hipDeviceGet);
HIP_LIBRARY_FIND_CHECKED(hipDeviceGetName);
HIP_LIBRARY_FIND_CHECKED(hipDeviceGetAttribute);
HIP_LIBRARY_FIND_CHECKED(hipDeviceComputeCapability);

View File

@@ -0,0 +1,42 @@
# ***** BEGIN GPL LICENSE BLOCK *****
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
#
# The Original Code is Copyright (C) 2012, Blender Foundation
# All rights reserved.
# ***** END GPL LICENSE BLOCK *****
set(INC
.
)
set(INC_SYS
${Vulkan_INCLUDE_DIRS}
)
set(SRC
vk_mem_alloc_impl.cc
vk_mem_alloc.h
)
blender_add_lib(extern_vulkan_memory_allocator "${SRC}" "${INC}" "${INC_SYS}" "${LIB}")
if(CMAKE_COMPILER_IS_GNUCC OR CMAKE_C_COMPILER_ID MATCHES "Clang")
target_compile_options(extern_vulkan_memory_allocator
PRIVATE "-Wno-nullability-completeness"
)
endif()

View File

@@ -0,0 +1,19 @@
Copyright (c) 2017-2021 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

View File

@@ -0,0 +1,5 @@
Project: VulkanMemoryAllocator
URL: https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator
License: MIT
Upstream version: 4b047fd
Local modifications: None

134
extern/vulkan_memory_allocator/README.md vendored Normal file
View File

@@ -0,0 +1,134 @@
# Vulkan Memory Allocator
Easy to integrate Vulkan memory allocation library.
**Documentation:** See [Vulkan Memory Allocator](https://gpuopen-librariesandsdks.github.io/VulkanMemoryAllocator/html/) (generated from Doxygen-style comments in [src/vk_mem_alloc.h](src/vk_mem_alloc.h))
**License:** MIT. See [LICENSE.txt](LICENSE.txt)
**Changelog:** See [CHANGELOG.md](CHANGELOG.md)
**Product page:** [Vulkan Memory Allocator on GPUOpen](https://gpuopen.com/gaming-product/vulkan-memory-allocator/)
**Build status:**
- Windows: [![Build status](https://ci.appveyor.com/api/projects/status/4vlcrb0emkaio2pn/branch/master?svg=true)](https://ci.appveyor.com/project/adam-sawicki-amd/vulkanmemoryallocator/branch/master)
- Linux: [![Build Status](https://travis-ci.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.svg?branch=master)](https://travis-ci.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator)
# Problem
Memory allocation and resource (buffer and image) creation in Vulkan is difficult (comparing to older graphics API-s, like D3D11 or OpenGL) for several reasons:
- It requires a lot of boilerplate code, just like everything else in Vulkan, because it is a low-level and high-performance API.
- There is additional level of indirection: `VkDeviceMemory` is allocated separately from creating `VkBuffer`/`VkImage` and they must be bound together.
- Driver must be queried for supported memory heaps and memory types. Different IHVs provide different types of it.
- It is recommended practice to allocate bigger chunks of memory and assign parts of them to particular resources.
# Features
This library can help game developers to manage memory allocations and resource creation by offering some higher-level functions:
1. Functions that help to choose correct and optimal memory type based on intended usage of the memory.
- Required or preferred traits of the memory are expressed using higher-level description comparing to Vulkan flags.
2. Functions that allocate memory blocks, reserve and return parts of them (`VkDeviceMemory` + offset + size) to the user.
- Library keeps track of allocated memory blocks, used and unused ranges inside them, finds best matching unused ranges for new allocations, respects all the rules of alignment and buffer/image granularity.
3. Functions that can create an image/buffer, allocate memory for it and bind them together - all in one call.
Additional features:
- Well-documented - description of all functions and structures provided, along with chapters that contain general description and example code.
- Thread-safety: Library is designed to be used in multithreaded code. Access to a single device memory block referred by different buffers and textures (binding, mapping) is synchronized internally.
- Configuration: Fill optional members of CreateInfo structure to provide custom CPU memory allocator, pointers to Vulkan functions and other parameters.
- Customization: Predefine appropriate macros to provide your own implementation of all external facilities used by the library, from assert, mutex, and atomic, to vector and linked list.
- Support for memory mapping, reference-counted internally. Support for persistently mapped memory: Just allocate with appropriate flag and you get access to mapped pointer.
- Support for non-coherent memory. Functions that flush/invalidate memory. `nonCoherentAtomSize` is respected automatically.
- Support for resource aliasing (overlap).
- Support for sparse binding and sparse residency: Convenience functions that allocate or free multiple memory pages at once.
- Custom memory pools: Create a pool with desired parameters (e.g. fixed or limited maximum size) and allocate memory out of it.
- Linear allocator: Create a pool with linear algorithm and use it for much faster allocations and deallocations in free-at-once, stack, double stack, or ring buffer fashion.
- Support for Vulkan 1.0, 1.1, 1.2.
- Support for extensions (and equivalent functionality included in new Vulkan versions):
- VK_EXT_memory_budget: Used internally if available to query for current usage and budget. If not available, it falls back to an estimation based on memory heap sizes.
- VK_KHR_dedicated_allocation: Just enable it and it will be used automatically by the library.
- VK_AMD_device_coherent_memory
- VK_KHR_buffer_device_address
- Defragmentation of GPU and CPU memory: Let the library move data around to free some memory blocks and make your allocations better compacted.
- Lost allocations: Allocate memory with appropriate flags and let the library remove allocations that are not used for many frames to make room for new ones.
- Statistics: Obtain detailed statistics about the amount of memory used, unused, number of allocated blocks, number of allocations etc. - globally, per memory heap, and per memory type.
- Debug annotations: Associate string with name or opaque pointer to your own data with every allocation.
- JSON dump: Obtain a string in JSON format with detailed map of internal state, including list of allocations and gaps between them.
- Convert this JSON dump into a picture to visualize your memory. See [tools/VmaDumpVis](tools/VmaDumpVis/README.md).
- Debugging incorrect memory usage: Enable initialization of all allocated memory with a bit pattern to detect usage of uninitialized or freed memory. Enable validation of a magic number before and after every allocation to detect out-of-bounds memory corruption.
- Record and replay sequence of calls to library functions to a file to check correctness, measure performance, and gather statistics.
# Prequisites
- Self-contained C++ library in single header file. No external dependencies other than standard C and C++ library and of course Vulkan. STL containers are not used by default.
- Public interface in C, in same convention as Vulkan API. Implementation in C++.
- Error handling implemented by returning `VkResult` error codes - same way as in Vulkan.
- Interface documented using Doxygen-style comments.
- Platform-independent, but developed and tested on Windows using Visual Studio. Continuous integration setup for Windows and Linux. Used also on Android, MacOS, and other platforms.
# Example
Basic usage of this library is very simple. Advanced features are optional. After you created global `VmaAllocator` object, a complete code needed to create a buffer may look like this:
```cpp
VkBufferCreateInfo bufferInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
bufferInfo.size = 65536;
bufferInfo.usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;
VmaAllocationCreateInfo allocInfo = {};
allocInfo.usage = VMA_MEMORY_USAGE_GPU_ONLY;
VkBuffer buffer;
VmaAllocation allocation;
vmaCreateBuffer(allocator, &bufferInfo, &allocInfo, &buffer, &allocation, nullptr);
```
With this one function call:
1. `VkBuffer` is created.
2. `VkDeviceMemory` block is allocated if needed.
3. An unused region of the memory block is bound to this buffer.
`VmaAllocation` is an object that represents memory assigned to this buffer. It can be queried for parameters like Vulkan memory handle and offset.
# Binaries
The release comes with precompiled binary executables for "VulkanSample" application which contains test suite and "VmaReplay" tool. They are compiled using Visual Studio 2019, so they require appropriate libraries to work, including "MSVCP140.dll", "VCRUNTIME140.dll", "VCRUNTIME140_1.dll". If their launch fails with error message telling about those files missing, please download and install [Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017 and 2019](https://support.microsoft.com/en-us/help/2977003/the-latest-supported-visual-c-downloads), "x64" version.
# Read more
See **[Documentation](https://gpuopen-librariesandsdks.github.io/VulkanMemoryAllocator/html/)**.
# Software using this library
- **[Detroit: Become Human](https://gpuopen.com/learn/porting-detroit-3/)**
- **[Vulkan Samples](https://github.com/LunarG/VulkanSamples)** - official Khronos Vulkan samples. License: Apache-style.
- **[Anvil](https://github.com/GPUOpen-LibrariesAndSDKs/Anvil)** - cross-platform framework for Vulkan. License: MIT.
- **[Filament](https://github.com/google/filament)** - physically based rendering engine for Android, Windows, Linux and macOS, from Google. Apache License 2.0.
- **[Atypical Games - proprietary game engine](https://developer.samsung.com/galaxy-gamedev/gamedev-blog/infinitejet.html)**
- **[Flax Engine](https://flaxengine.com/)**
- **[Lightweight Java Game Library (LWJGL)](https://www.lwjgl.org/)** - includes binding of the library for Java. License: BSD.
- **[PowerVR SDK](https://github.com/powervr-graphics/Native_SDK)** - C++ cross-platform 3D graphics SDK, from Imagination. License: MIT.
- **[Skia](https://github.com/google/skia)** - complete 2D graphic library for drawing Text, Geometries, and Images, from Google.
- **[The Forge](https://github.com/ConfettiFX/The-Forge)** - cross-platform rendering framework. Apache License 2.0.
- **[VK9](https://github.com/disks86/VK9)** - Direct3D 9 compatibility layer using Vulkan. Zlib lincese.
- **[vkDOOM3](https://github.com/DustinHLand/vkDOOM3)** - Vulkan port of GPL DOOM 3 BFG Edition. License: GNU GPL.
- **[vkQuake2](https://github.com/kondrak/vkQuake2)** - vanilla Quake 2 with Vulkan support. License: GNU GPL.
- **[Vulkan Best Practice for Mobile Developers](https://github.com/ARM-software/vulkan_best_practice_for_mobile_developers)** from ARM. License: MIT.
- **[RPCS3](https://github.com/RPCS3/rpcs3)** - PlayStation 3 emulator/debugger. License: GNU GPLv2.
[Many other projects on GitHub](https://github.com/search?q=AMD_VULKAN_MEMORY_ALLOCATOR_H&type=Code) and some game development studios that use Vulkan in their games.
# See also
- **[D3D12 Memory Allocator](https://github.com/GPUOpen-LibrariesAndSDKs/D3D12MemoryAllocator)** - equivalent library for Direct3D 12. License: MIT.
- **[Awesome Vulkan](https://github.com/vinjn/awesome-vulkan)** - a curated list of awesome Vulkan libraries, debuggers and resources.
- **[VulkanMemoryAllocator-Hpp](https://github.com/malte-v/VulkanMemoryAllocator-Hpp)** - C++ binding for this library. License: CC0-1.0.
- **[PyVMA](https://github.com/realitix/pyvma)** - Python wrapper for this library. Author: Jean-Sébastien B. (@realitix). License: Apache 2.0.
- **[vk-mem](https://github.com/gwihlidal/vk-mem-rs)** - Rust binding for this library. Author: Graham Wihlidal. License: Apache 2.0 or MIT.
- **[Haskell bindings](https://hackage.haskell.org/package/VulkanMemoryAllocator)**, **[github](https://github.com/expipiplus1/vulkan/tree/master/VulkanMemoryAllocator)** - Haskell bindings for this library. Author: Joe Hermaszewski (@expipiplus1). License BSD-3-Clause.
- **[vma_sample_sdl](https://github.com/rextimmy/vma_sample_sdl)** - SDL port of the sample app of this library (with the goal of running it on multiple platforms, including MacOS). Author: @rextimmy. License: MIT.
- **[vulkan-malloc](https://github.com/dylanede/vulkan-malloc)** - Vulkan memory allocation library for Rust. Based on version 1 of this library. Author: Dylan Ede (@dylanede). License: MIT / Apache 2.0.

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,3 @@
#define VMA_IMPLEMENTATION
#include "vk_mem_alloc.h"

View File

@@ -85,3 +85,7 @@ endif()
if(UNIX AND NOT APPLE)
add_subdirectory(libc_compat)
endif()
if(WITH_VULKAN)
add_subdirectory(shader_compiler)
endif()

View File

@@ -226,6 +226,9 @@ add_definitions(
-DCCL_NAMESPACE_END=}
)
if(WITH_CYCLES_DEBUG)
add_definitions(-DWITH_CYCLES_DEBUG)
endif()
if(WITH_CYCLES_STANDALONE_GUI)
add_definitions(-DWITH_CYCLES_STANDALONE_GUI)
endif()
@@ -334,7 +337,7 @@ else()
endif()
# Warnings
if(CMAKE_COMPILER_IS_GNUCXX)
if(CMAKE_COMPILER_IS_GNUCXX OR CMAKE_C_COMPILER_ID MATCHES "Clang")
ADD_CHECK_CXX_COMPILER_FLAG(CMAKE_CXX_FLAGS _has_cxxflag_float_conversion "-Werror=float-conversion")
ADD_CHECK_CXX_COMPILER_FLAG(CMAKE_CXX_FLAGS _has_cxxflag_double_promotion "-Werror=double-promotion")
ADD_CHECK_CXX_COMPILER_FLAG(CMAKE_CXX_FLAGS _has_no_error_unused_macros "-Wno-error=unused-macros")

View File

@@ -218,6 +218,12 @@ enum_denoising_prefilter = (
('ACCURATE', "Accurate", "Prefilter noisy guiding passes before denoising color. Improves quality when guiding passes are noisy using extra processing time", 3),
)
enum_direct_light_sampling_type = (
('MULTIPLE_IMPORTANCE_SAMPLING', "Multiple Importance Sampling", "Multiple importance sampling is used to combine direct light contributions from next-event estimation and forward path tracing", 0),
('FORWARD_PATH_TRACING', "Forward Path Tracing", "Direct light contributions are only sampled using forward path tracing", 1),
('NEXT_EVENT_ESTIMATION', "Next-Event Estimation", "Direct light contributions are only sampled using next-event estimation", 2),
)
def update_render_passes(self, context):
scene = context.scene
view_layer = context.view_layer
@@ -325,6 +331,13 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
default=1024,
)
sample_offset: IntProperty(
name="Sample Offset",
description="Number of samples to skip when starting render",
min=0, max=(1 << 24),
default=0,
)
time_limit: FloatProperty(
name="Time Limit",
description="Limit the render time (excluding synchronization time)."
@@ -346,7 +359,7 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
name="Scrambling Distance",
default=1.0,
min=0.0, max=1.0,
description="Lower values give faster rendering with GPU rendering and less noise with all devices at the cost of possible artifacts if set too low. Only works when not using adaptive sampling",
description="Reduce randomization between pixels to improve GPU rendering performance, at the cost of possible rendering artifacts if set too low. Only works when not using adaptive sampling",
)
preview_scrambling_distance: BoolProperty(
name="Scrambling Distance viewport",
@@ -354,10 +367,10 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
description="Uses the Scrambling Distance value for the viewport. Faster but may flicker",
)
adaptive_scrambling_distance: BoolProperty(
name="Adaptive Scrambling Distance",
auto_scrambling_distance: BoolProperty(
name="Automatic Scrambling Distance",
default=False,
description="Uses a formula to adapt the scrambling distance strength based on the sample count",
description="Automatically reduce the randomization between pixels to improve GPU rendering performance, at the cost of possible rendering artifacts. Only works when not using adaptive sampling",
)
use_layer_samples: EnumProperty(
@@ -415,6 +428,13 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
default=0,
)
direct_light_sampling_type: EnumProperty(
name="Direct Light Sampling Type",
description="The type of strategy used for sampling direct light contributions",
items=enum_direct_light_sampling_type,
default='MULTIPLE_IMPORTANCE_SAMPLING',
)
min_light_bounces: IntProperty(
name="Min Light Bounces",
description="Minimum number of light bounces. Setting this higher reduces noise in the first bounces, "
@@ -770,8 +790,8 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
)
use_auto_tile: BoolProperty(
name="Auto Tiles",
description="Automatically render high resolution images in tiles to reduce memory usage, using the specified tile size. Tiles are cached to disk while rendering to save memory",
name="Using Tiling",
description="Render high resolution images in tiles to reduce memory usage, using the specified tile size. Tiles are cached to disk while rendering to save memory",
default=True,
)
tile_size: IntProperty(

View File

@@ -290,15 +290,18 @@ class CYCLES_RENDER_PT_sampling_advanced(CyclesButtonsPanel, Panel):
col.active = not (cscene.use_adaptive_sampling and cscene.use_preview_adaptive_sampling)
col.prop(cscene, "sampling_pattern", text="Pattern")
col = layout.column(align=True)
col.prop(cscene, "sample_offset")
layout.separator()
col = layout.column(align=True)
col.active = not (cscene.use_adaptive_sampling and cscene.use_preview_adaptive_sampling)
col.prop(cscene, "scrambling_distance", text="Scrambling Distance")
col.prop(cscene, "adaptive_scrambling_distance", text="Adaptive")
sub = col.row(align=True)
heading = layout.column(align=True, heading="Scrambling Distance")
heading.active = not (cscene.use_adaptive_sampling and cscene.use_preview_adaptive_sampling)
heading.prop(cscene, "auto_scrambling_distance", text="Automatic")
sub = heading.row()
sub.active = not cscene.use_preview_adaptive_sampling
sub.prop(cscene, "preview_scrambling_distance", text="Viewport")
heading.prop(cscene, "scrambling_distance", text="Multiplier")
layout.separator()
@@ -1051,7 +1054,7 @@ class CYCLES_OBJECT_PT_motion_blur(CyclesButtonsPanel, Panel):
def has_geometry_visibility(ob):
return ob and ((ob.type in {'MESH', 'CURVE', 'SURFACE', 'FONT', 'META', 'LIGHT'}) or
return ob and ((ob.type in {'MESH', 'CURVE', 'SURFACE', 'FONT', 'META', 'LIGHT', 'VOLUME', 'POINTCLOUD', 'HAIR'}) or
(ob.instance_type == 'COLLECTION' and ob.instance_collection))

View File

@@ -199,7 +199,7 @@ static bool ObtainCacheParticleUV(Hair *hair,
b_mesh->uv_layers.begin(l);
float2 uv = zero_float2();
if (b_mesh->uv_layers.length())
if (!b_mesh->uv_layers.empty())
b_psys.uv_on_emitter(psmd, *b_pa, pa_no, uv_num, &uv.x);
CData->curve_uv.push_back_slow(uv);
@@ -261,7 +261,7 @@ static bool ObtainCacheParticleVcol(Hair *hair,
b_mesh->vertex_colors.begin(l);
float4 vcol = make_float4(0.0f, 0.0f, 0.0f, 1.0f);
if (b_mesh->vertex_colors.length())
if (!b_mesh->vertex_colors.empty())
b_psys.mcol_on_emitter(psmd, *b_pa, pa_no, vcol_num, &vcol.x);
CData->curve_vcol.push_back_slow(vcol);

View File

@@ -334,7 +334,7 @@ bool BlenderDisplayDriver::update_begin(const Params &params,
/* Update PBO dimensions if needed.
*
* NOTE: Allocate the PBO for the the size which will fit the final render resolution (as in,
* NOTE: Allocate the PBO for the size which will fit the final render resolution (as in,
* at a resolution divider 1. This was we don't need to recreate graphics interoperability
* objects which are costly and which are tied to the specific underlying buffer size.
* The downside of this approach is that when graphics interoperability is not used we are

View File

@@ -555,7 +555,7 @@ static void attr_create_vertex_color(Scene *scene, Mesh *mesh, BL::Mesh &b_mesh,
/* Create uv map attributes. */
static void attr_create_uv_map(Scene *scene, Mesh *mesh, BL::Mesh &b_mesh)
{
if (b_mesh.uv_layers.length() != 0) {
if (!b_mesh.uv_layers.empty()) {
for (BL::MeshUVLoopLayer &l : b_mesh.uv_layers) {
const bool active_render = l.active_render();
AttributeStandard uv_std = (active_render) ? ATTR_STD_UV : ATTR_STD_NONE;
@@ -619,7 +619,7 @@ static void attr_create_uv_map(Scene *scene, Mesh *mesh, BL::Mesh &b_mesh)
static void attr_create_subd_uv_map(Scene *scene, Mesh *mesh, BL::Mesh &b_mesh, bool subdivide_uvs)
{
if (b_mesh.uv_layers.length() != 0) {
if (!b_mesh.uv_layers.empty()) {
BL::Mesh::uv_layers_iterator l;
int i = 0;
@@ -951,7 +951,7 @@ static void create_mesh(Scene *scene,
N = attr_N->data_float3();
/* create generated coordinates from undeformed coordinates */
const bool need_default_tangent = (subdivision == false) && (b_mesh.uv_layers.length() == 0) &&
const bool need_default_tangent = (subdivision == false) && (b_mesh.uv_layers.empty()) &&
(mesh->need_attribute(scene, ATTR_STD_UV_TANGENT));
if (mesh->need_attribute(scene, ATTR_STD_GENERATED) || need_default_tangent) {
Attribute *attr = attributes.add(ATTR_STD_GENERATED);

View File

@@ -294,7 +294,7 @@ Object *BlenderSync::sync_object(BL::Depsgraph &b_depsgraph,
object->set_visibility(visibility);
object->set_is_shadow_catcher(b_ob.is_shadow_catcher());
object->set_is_shadow_catcher(b_ob.is_shadow_catcher() || b_parent.is_shadow_catcher());
float shadow_terminator_shading_offset = get_float(cobject, "shadow_terminator_offset");
object->set_shadow_terminator_shading_offset(shadow_terminator_shading_offset);

View File

@@ -129,7 +129,7 @@ void BlenderSession::create_session()
/* reset status/progress */
last_status = "";
last_error = "";
last_progress = -1.0f;
last_progress = -1.0;
start_resize_time = 0.0;
/* create session */
@@ -615,6 +615,24 @@ void BlenderSession::bake(BL::Depsgraph &b_depsgraph_,
sync->sync_camera(b_render, b_camera_override, width, height, "");
sync->sync_data(
b_render, b_depsgraph, b_v3d, b_camera_override, width, height, &python_thread_state);
/* Filtering settings for combined pass. */
if (pass->get_type() == PASS_COMBINED) {
Integrator *integrator = scene->integrator;
integrator->set_use_direct_light((bake_filter & BL::BakeSettings::pass_filter_DIRECT) != 0);
integrator->set_use_indirect_light((bake_filter & BL::BakeSettings::pass_filter_INDIRECT) !=
0);
integrator->set_use_diffuse((bake_filter & BL::BakeSettings::pass_filter_DIFFUSE) != 0);
integrator->set_use_glossy((bake_filter & BL::BakeSettings::pass_filter_GLOSSY) != 0);
integrator->set_use_transmission(
(bake_filter & BL::BakeSettings::pass_filter_TRANSMISSION) != 0);
integrator->set_use_emission((bake_filter & BL::BakeSettings::pass_filter_EMIT) != 0);
}
/* Always use transpanent background for baking. */
scene->background->set_transparent(true);
/* Load built-in images from Blender. */
builtin_images_load();
}
@@ -841,7 +859,7 @@ void BlenderSession::get_status(string &status, string &substatus)
session->progress.get_status(status, substatus);
}
void BlenderSession::get_progress(float &progress, double &total_time, double &render_time)
void BlenderSession::get_progress(double &progress, double &total_time, double &render_time)
{
session->progress.get_time(total_time, render_time);
progress = session->progress.get_progress();
@@ -849,10 +867,10 @@ void BlenderSession::get_progress(float &progress, double &total_time, double &r
void BlenderSession::update_bake_progress()
{
float progress = session->progress.get_progress();
double progress = session->progress.get_progress();
if (progress != last_progress) {
b_engine.update_progress(progress);
b_engine.update_progress((float)progress);
last_progress = progress;
}
}
@@ -861,7 +879,7 @@ void BlenderSession::update_status_progress()
{
string timestatus, status, substatus;
string scene_status = "";
float progress;
double progress;
double total_time, remaining_time = 0, render_time;
float mem_used = (float)session->stats.mem_used / 1024.0f / 1024.0f;
float mem_peak = (float)session->stats.mem_peak / 1024.0f / 1024.0f;
@@ -905,7 +923,7 @@ void BlenderSession::update_status_progress()
last_status_time = current_time;
}
if (progress != last_progress) {
b_engine.update_progress(progress);
b_engine.update_progress((float)progress);
last_progress = progress;
}

View File

@@ -82,7 +82,7 @@ class BlenderSession {
void tag_redraw();
void tag_update();
void get_status(string &status, string &substatus);
void get_progress(float &progress, double &total_time, double &render_time);
void get_progress(double &progress, double &total_time, double &render_time);
void test_cancel();
void update_status_progress();
void update_bake_progress();
@@ -108,7 +108,7 @@ class BlenderSession {
string last_status;
string last_error;
float last_progress;
double last_progress;
double last_status_time;
int width, height;

View File

@@ -365,8 +365,8 @@ void BlenderSync::sync_integrator(BL::ViewLayer &b_view_layer, bool background)
int samples = get_int(cscene, "samples");
float scrambling_distance = get_float(cscene, "scrambling_distance");
bool adaptive_scrambling_distance = get_boolean(cscene, "adaptive_scrambling_distance");
if (adaptive_scrambling_distance) {
bool auto_scrambling_distance = get_boolean(cscene, "auto_scrambling_distance");
if (auto_scrambling_distance) {
scrambling_distance *= 4.0f / sqrtf(samples);
}
@@ -392,6 +392,12 @@ void BlenderSync::sync_integrator(BL::ViewLayer &b_view_layer, bool background)
integrator->set_ao_bounces(0);
}
#ifdef WITH_CYCLES_DEBUG
DirectLightSamplingType direct_light_sampling_type = (DirectLightSamplingType)get_enum(
cscene, "direct_light_sampling_type", DIRECT_LIGHT_SAMPLING_NUM, DIRECT_LIGHT_SAMPLING_MIS);
integrator->set_direct_light_sampling_type(direct_light_sampling_type);
#endif
const DenoiseParams denoise_params = get_denoise_params(b_scene, b_view_layer, background);
integrator->set_use_denoise(denoise_params.use);
@@ -835,18 +841,25 @@ SessionParams BlenderSync::get_session_params(BL::RenderEngine &b_engine,
/* samples */
int samples = get_int(cscene, "samples");
int preview_samples = get_int(cscene, "preview_samples");
int sample_offset = get_int(cscene, "sample_offset");
if (background) {
params.samples = samples;
params.sample_offset = sample_offset;
}
else {
params.samples = preview_samples;
if (params.samples == 0)
if (params.samples == 0) {
params.samples = INT_MAX;
}
params.sample_offset = 0;
}
/* Clamp sample offset. */
params.sample_offset = clamp(params.sample_offset, 0, Integrator::MAX_SAMPLES);
/* Clamp samples. */
params.samples = min(params.samples, Integrator::MAX_SAMPLES);
params.samples = clamp(params.samples, 0, Integrator::MAX_SAMPLES - params.sample_offset);
/* Viewport Performance */
params.pixel_size = b_engine.get_preview_pixel_size(b_scene);
@@ -865,7 +878,7 @@ SessionParams BlenderSync::get_session_params(BL::RenderEngine &b_engine,
/* Time limit. */
if (background) {
params.time_limit = get_float(cscene, "time_limit");
params.time_limit = (double)get_float(cscene, "time_limit");
}
else {
/* For the viewport it kind of makes more sense to think in terms of the noise floor, which is

View File

@@ -303,7 +303,7 @@ static inline string image_user_file_path(BL::ImageUser &iuser,
string filepath_str = string(filepath);
if (load_tiled && ima.source() == BL::Image::source_TILED) {
string udim;
if (ima.tiles.length() > 0) {
if (!ima.tiles.empty()) {
udim = to_string(ima.tiles[0].number());
}
string_replace(filepath_str, udim, "<UDIM>");
@@ -647,7 +647,7 @@ static inline Mesh::SubdivisionType object_subdivision_type(BL::Object &b_ob,
{
PointerRNA cobj = RNA_pointer_get(&b_ob.ptr, "cycles");
if (cobj.data && b_ob.modifiers.length() > 0 && experimental) {
if (cobj.data && !b_ob.modifiers.empty() && experimental) {
BL::Modifier mod = b_ob.modifiers[b_ob.modifiers.length() - 1];
bool enabled = preview ? mod.show_viewport() : mod.show_render();

View File

@@ -303,7 +303,7 @@ static void rtc_error_func(void *, enum RTCError, const char *str)
VLOG(1) << str;
}
static double progress_start_time = 0.0f;
static double progress_start_time = 0.0;
static bool rtc_progress_func(void *user_ptr, const double n)
{

View File

@@ -153,7 +153,7 @@ void BVHNode::update_time()
namespace {
struct DumpTraversalContext {
/* Descriptor of wile where writing is happening. */
/* Descriptor of while where writing is happening. */
FILE *stream;
/* Unique identifier of the node current. */
int id;

View File

@@ -178,7 +178,7 @@ class InnerNode : public BVHNode {
reset_unused_children();
}
/* NOTE: This function is only used during binary BVH builder, and it
/* NOTE: This function is only used during binary BVH builder, and it's
* supposed to be configured to have 2 children which will be filled-in in a
* bit. But this is important to have children reset to NULL. */
explicit InnerNode(const BoundBox &bounds) : BVHNode(bounds), num_children_(0)

View File

@@ -30,15 +30,17 @@ BVHOptiX::BVHOptiX(const BVHParams &params_,
: BVH(params_, geometry_, objects_),
device(device),
traversable_handle(0),
as_data(device, params_.top_level ? "optix tlas" : "optix blas", false),
motion_transform_data(device, "optix motion transform", false)
as_data(make_unique<device_only_memory<char>>(
device, params.top_level ? "optix tlas" : "optix blas", false)),
motion_transform_data(
make_unique<device_only_memory<char>>(device, "optix motion transform", false))
{
}
BVHOptiX::~BVHOptiX()
{
// Acceleration structure memory is delayed freed on device, since deleting the
// BVH may happen while still being used for rendering.
/* Acceleration structure memory is delayed freed on device, since deleting the
* BVH may happen while still being used for rendering. */
device->release_optix_bvh(this);
}

View File

@@ -25,14 +25,16 @@
# include "device/memory.h"
# include "util/unique_ptr.h"
CCL_NAMESPACE_BEGIN
class BVHOptiX : public BVH {
public:
Device *device;
uint64_t traversable_handle;
device_only_memory<char> as_data;
device_only_memory<char> motion_transform_data;
unique_ptr<device_only_memory<char>> as_data;
unique_ptr<device_only_memory<char>> motion_transform_data;
protected:
friend class BVH;

View File

@@ -88,7 +88,7 @@ endmacro()
function(cycles_link_directories)
if(APPLE)
# APPLE plaform uses full paths for linking libraries, and avoids link_directories.
# APPLE platform uses full paths for linking libraries, and avoids link_directories.
return()
endif()

View File

@@ -93,11 +93,6 @@ CPUDevice::~CPUDevice()
texture_info.free();
}
bool CPUDevice::show_samples() const
{
return (info.cpu_threads == 1);
}
BVHLayoutMask CPUDevice::get_bvh_layout_mask() const
{
BVHLayoutMask bvh_layout_mask = BVH_LAYOUT_BVH2;

View File

@@ -60,8 +60,6 @@ class CPUDevice : public Device {
CPUDevice(const DeviceInfo &info_, Stats &stats_, Profiler &profiler_);
~CPUDevice();
virtual bool show_samples() const override;
virtual BVHLayoutMask get_bvh_layout_mask() const override;
/* Returns true if the texture info was copied to the device (meaning, some more

View File

@@ -42,7 +42,7 @@ class CPUKernels {
IntegratorInitFunction integrator_init_from_camera;
IntegratorInitFunction integrator_init_from_bake;
IntegratorFunction integrator_intersect_closest;
IntegratorShadeFunction integrator_intersect_closest;
IntegratorFunction integrator_intersect_shadow;
IntegratorFunction integrator_intersect_subsurface;
IntegratorFunction integrator_intersect_volume_stack;

View File

@@ -46,12 +46,6 @@ bool CUDADevice::have_precompiled_kernels()
return path_exists(cubins_path);
}
bool CUDADevice::show_samples() const
{
/* The CUDADevice only processes one tile at a time, so showing samples is fine. */
return true;
}
BVHLayoutMask CUDADevice::get_bvh_layout_mask() const
{
return BVH_LAYOUT_BVH2;
@@ -242,6 +236,10 @@ string CUDADevice::compile_kernel_get_common_cflags(const uint kernel_features)
cflags += " -DWITH_NANOVDB";
# endif
# ifdef WITH_CYCLES_DEBUG
cflags += " -DWITH_CYCLES_DEBUG";
# endif
return cflags;
}
@@ -777,6 +775,7 @@ void CUDADevice::generic_free(device_memory &mem)
if (mem.device_pointer) {
CUDAContextScope scope(this);
thread_scoped_lock lock(cuda_mem_map_mutex);
DCHECK(cuda_mem_map.find(&mem) != cuda_mem_map.end());
const CUDAMem &cmem = cuda_mem_map[&mem];
/* If cmem.use_mapped_host is true, reference counting is used
@@ -931,7 +930,6 @@ void CUDADevice::tex_alloc(device_texture &mem)
{
CUDAContextScope scope(this);
/* General variables for both architectures */
string bind_name = mem.name;
size_t dsize = datatype_size(mem.data_type);
size_t size = mem.memory_size();
@@ -1094,7 +1092,6 @@ void CUDADevice::tex_alloc(device_texture &mem)
if (mem.info.data_type != IMAGE_DATA_TYPE_NANOVDB_FLOAT &&
mem.info.data_type != IMAGE_DATA_TYPE_NANOVDB_FLOAT3) {
/* Kepler+, bindless textures. */
CUDA_RESOURCE_DESC resDesc;
memset(&resDesc, 0, sizeof(resDesc));
@@ -1145,6 +1142,7 @@ void CUDADevice::tex_free(device_texture &mem)
if (mem.device_pointer) {
CUDAContextScope scope(this);
thread_scoped_lock lock(cuda_mem_map_mutex);
DCHECK(cuda_mem_map.find(&mem) != cuda_mem_map.end());
const CUDAMem &cmem = cuda_mem_map[&mem];
if (cmem.texobject) {

View File

@@ -76,8 +76,6 @@ class CUDADevice : public Device {
static bool have_precompiled_kernels();
virtual bool show_samples() const override;
virtual BVHLayoutMask get_bvh_layout_mask() const override;
void set_error(const string &error) override;

View File

@@ -149,10 +149,6 @@ class Device {
fprintf(stderr, "%s\n", error.c_str());
fflush(stderr);
}
virtual bool show_samples() const
{
return false;
}
virtual BVHLayoutMask get_bvh_layout_mask() const = 0;
/* statistics */

View File

@@ -47,12 +47,6 @@ bool HIPDevice::have_precompiled_kernels()
return path_exists(fatbins_path);
}
bool HIPDevice::show_samples() const
{
/* The HIPDevice only processes one tile at a time, so showing samples is fine. */
return true;
}
BVHLayoutMask HIPDevice::get_bvh_layout_mask() const
{
return BVH_LAYOUT_BVH2;
@@ -99,7 +93,7 @@ HIPDevice::HIPDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
}
/* Setup device and context. */
result = hipGetDevice(&hipDevice, hipDevId);
result = hipDeviceGet(&hipDevice, hipDevId);
if (result != hipSuccess) {
set_error(string_printf("Failed to get HIP device handle from ordinal (%s)",
hipewErrorString(result)));
@@ -222,7 +216,6 @@ string HIPDevice::compile_kernel_get_common_cflags(const uint kernel_features)
const string include_path = source_path;
string cflags = string_printf(
"-m%d "
"--ptxas-options=\"-v\" "
"--use_fast_math "
"-DHIPCC "
"-I\"%s\"",
@@ -234,10 +227,7 @@ string HIPDevice::compile_kernel_get_common_cflags(const uint kernel_features)
return cflags;
}
string HIPDevice::compile_kernel(const uint kernel_features,
const char *name,
const char *base,
bool force_ptx)
string HIPDevice::compile_kernel(const uint kernel_features, const char *name, const char *base)
{
/* Compute kernel name. */
int major, minor;
@@ -247,7 +237,7 @@ string HIPDevice::compile_kernel(const uint kernel_features,
hipGetDeviceProperties(&props, hipDevId);
/* gcnArchName can contain tokens after the arch name with features, ie.
"gfx1010:sramecc-:xnack-" so we tokenize it to get the first part. */
* `gfx1010:sramecc-:xnack-` so we tokenize it to get the first part. */
char *arch = strtok(props.gcnArchName, ":");
if (arch == NULL) {
arch = props.gcnArchName;
@@ -255,13 +245,11 @@ string HIPDevice::compile_kernel(const uint kernel_features,
/* Attempt to use kernel provided with Blender. */
if (!use_adaptive_compilation()) {
if (!force_ptx) {
const string fatbin = path_get(string_printf("lib/%s_%s.fatbin", name, arch));
VLOG(1) << "Testing for pre-compiled kernel " << fatbin << ".";
if (path_exists(fatbin)) {
VLOG(1) << "Using precompiled kernel.";
return fatbin;
}
const string fatbin = path_get(string_printf("lib/%s_%s.fatbin", name, arch));
VLOG(1) << "Testing for pre-compiled kernel " << fatbin << ".";
if (path_exists(fatbin)) {
VLOG(1) << "Using precompiled kernel.";
return fatbin;
}
}
@@ -298,9 +286,9 @@ string HIPDevice::compile_kernel(const uint kernel_features,
# ifdef _WIN32
if (!use_adaptive_compilation() && have_precompiled_kernels()) {
if (major < 3) {
if (!hipSupportsDevice(hipDevId)) {
set_error(
string_printf("HIP backend requires compute capability 3.0 or up, but found %d.%d. "
string_printf("HIP backend requires compute capability 10.1 or up, but found %d.%d. "
"Your GPU is not supported.",
major,
minor));
@@ -380,10 +368,9 @@ string HIPDevice::compile_kernel(const uint kernel_features,
bool HIPDevice::load_kernels(const uint kernel_features)
{
/* TODO(sergey): Support kernels re-load for CUDA devices adaptive compile.
/* TODO(sergey): Support kernels re-load for HIP devices adaptive compile.
*
* Currently re-loading kernel will invalidate memory pointers,
* causing problems in cuCtxSynchronize.
* Currently re-loading kernels will invalidate memory pointers.
*/
if (hipModule) {
if (use_adaptive_compilation()) {
@@ -751,6 +738,7 @@ void HIPDevice::generic_free(device_memory &mem)
if (mem.device_pointer) {
HIPContextScope scope(this);
thread_scoped_lock lock(hip_mem_map_mutex);
DCHECK(hip_mem_map.find(&mem) != hip_mem_map.end());
const HIPMem &cmem = hip_mem_map[&mem];
/* If cmem.use_mapped_host is true, reference counting is used
@@ -904,7 +892,6 @@ void HIPDevice::tex_alloc(device_texture &mem)
{
HIPContextScope scope(this);
/* General variables for both architectures */
string bind_name = mem.name;
size_t dsize = datatype_size(mem.data_type);
size_t size = mem.memory_size();
@@ -994,16 +981,16 @@ void HIPDevice::tex_alloc(device_texture &mem)
<< string_human_readable_number(mem.memory_size()) << " bytes. ("
<< string_human_readable_size(mem.memory_size()) << ")";
hip_assert(hipArray3DCreate(&array_3d, &desc));
hip_assert(hipArray3DCreate((hArray *)&array_3d, &desc));
if (!array_3d) {
return;
}
HIP_MEMCPY3D param;
memset(&param, 0, sizeof(param));
memset(&param, 0, sizeof(HIP_MEMCPY3D));
param.dstMemoryType = hipMemoryTypeArray;
param.dstArray = &array_3d;
param.dstArray = array_3d;
param.srcMemoryType = hipMemoryTypeHost;
param.srcHost = mem.host_pointer;
param.srcPitch = src_pitch;
@@ -1069,13 +1056,13 @@ void HIPDevice::tex_alloc(device_texture &mem)
if (mem.info.data_type != IMAGE_DATA_TYPE_NANOVDB_FLOAT &&
mem.info.data_type != IMAGE_DATA_TYPE_NANOVDB_FLOAT3) {
/* Kepler+, bindless textures. */
/* Bindless textures. */
hipResourceDesc resDesc;
memset(&resDesc, 0, sizeof(resDesc));
if (array_3d) {
resDesc.resType = hipResourceTypeArray;
resDesc.res.array.h_Array = &array_3d;
resDesc.res.array.h_Array = array_3d;
resDesc.flags = 0;
}
else if (mem.data_height > 0) {
@@ -1120,6 +1107,7 @@ void HIPDevice::tex_free(device_texture &mem)
if (mem.device_pointer) {
HIPContextScope scope(this);
thread_scoped_lock lock(hip_mem_map_mutex);
DCHECK(hip_mem_map.find(&mem) != hip_mem_map.end());
const HIPMem &cmem = hip_mem_map[&mem];
if (cmem.texobject) {
@@ -1160,6 +1148,8 @@ bool HIPDevice::should_use_graphics_interop()
* possible, but from the empiric measurements it can be considerably slower than using naive
* pixels copy. */
/* Disable graphics interop for now, because of driver bug in 21.40. See T92972 */
# if 0
HIPContextScope scope(this);
int num_all_devices = 0;
@@ -1178,6 +1168,7 @@ bool HIPDevice::should_use_graphics_interop()
return true;
}
}
# endif
return false;
}

View File

@@ -75,8 +75,6 @@ class HIPDevice : public Device {
static bool have_precompiled_kernels();
virtual bool show_samples() const override;
virtual BVHLayoutMask get_bvh_layout_mask() const override;
void set_error(const string &error) override;
@@ -93,10 +91,7 @@ class HIPDevice : public Device {
virtual string compile_kernel_get_common_cflags(const uint kernel_features);
string compile_kernel(const uint kernel_features,
const char *name,
const char *base = "hip",
bool force_ptx = false);
string compile_kernel(const uint kernel_features, const char *name, const char *base = "hip");
virtual bool load_kernels(const uint kernel_features) override;
void reserve_local_memory(const uint kernel_features);

View File

@@ -48,7 +48,7 @@ class HIPDeviceGraphicsInterop : public DeviceGraphicsInterop {
HIPDeviceQueue *queue_ = nullptr;
HIPDevice *device_ = nullptr;
/* OpenGL PBO which is currently registered as the destination for the CUDA buffer. */
/* OpenGL PBO which is currently registered as the destination for the HIP buffer. */
uint opengl_pbo_id_ = 0;
/* Buffer area in pixels of the corresponding PBO. */
int64_t buffer_area_ = 0;

View File

@@ -23,7 +23,7 @@ CCL_NAMESPACE_BEGIN
device_memory::device_memory(Device *device, const char *name, MemoryType type)
: data_type(device_type_traits<uchar>::data_type),
data_elements(device_type_traits<uchar>::num_elements_cpu),
data_elements(device_type_traits<uchar>::num_elements),
data_size(0),
device_size(0),
data_width(0),
@@ -44,45 +44,6 @@ device_memory::device_memory(Device *device, const char *name, MemoryType type)
{
}
device_memory::device_memory(device_memory &&other) noexcept
: data_type(other.data_type),
data_elements(other.data_elements),
data_size(other.data_size),
device_size(other.device_size),
data_width(other.data_width),
data_height(other.data_height),
data_depth(other.data_depth),
type(other.type),
name(other.name),
device(other.device),
device_pointer(other.device_pointer),
host_pointer(other.host_pointer),
shared_pointer(other.shared_pointer),
shared_counter(other.shared_counter),
original_device_ptr(other.original_device_ptr),
original_device_size(other.original_device_size),
original_device(other.original_device),
need_realloc_(other.need_realloc_),
modified(other.modified)
{
other.data_elements = 0;
other.data_size = 0;
other.device_size = 0;
other.data_width = 0;
other.data_height = 0;
other.data_depth = 0;
other.device = 0;
other.device_pointer = 0;
other.host_pointer = 0;
other.shared_pointer = 0;
other.shared_counter = 0;
other.original_device_ptr = 0;
other.original_device_size = 0;
other.original_device = 0;
other.need_realloc_ = false;
other.modified = false;
}
device_memory::~device_memory()
{
assert(shared_pointer == 0);

View File

@@ -81,155 +81,140 @@ static constexpr size_t datatype_size(DataType datatype)
template<typename T> struct device_type_traits {
static const DataType data_type = TYPE_UNKNOWN;
static const size_t num_elements_cpu = sizeof(T);
static const size_t num_elements_gpu = sizeof(T);
static const size_t num_elements = sizeof(T);
};
template<> struct device_type_traits<uchar> {
static const DataType data_type = TYPE_UCHAR;
static const size_t num_elements_cpu = 1;
static const size_t num_elements_gpu = 1;
static_assert(sizeof(uchar) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 1;
static_assert(sizeof(uchar) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uchar2> {
static const DataType data_type = TYPE_UCHAR;
static const size_t num_elements_cpu = 2;
static const size_t num_elements_gpu = 2;
static_assert(sizeof(uchar2) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 2;
static_assert(sizeof(uchar2) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uchar3> {
static const DataType data_type = TYPE_UCHAR;
static const size_t num_elements_cpu = 3;
static const size_t num_elements_gpu = 3;
static_assert(sizeof(uchar3) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 3;
static_assert(sizeof(uchar3) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uchar4> {
static const DataType data_type = TYPE_UCHAR;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 4;
static_assert(sizeof(uchar4) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 4;
static_assert(sizeof(uchar4) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uint> {
static const DataType data_type = TYPE_UINT;
static const size_t num_elements_cpu = 1;
static const size_t num_elements_gpu = 1;
static_assert(sizeof(uint) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 1;
static_assert(sizeof(uint) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uint2> {
static const DataType data_type = TYPE_UINT;
static const size_t num_elements_cpu = 2;
static const size_t num_elements_gpu = 2;
static_assert(sizeof(uint2) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 2;
static_assert(sizeof(uint2) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uint3> {
static const DataType data_type = TYPE_UINT;
static const size_t num_elements_cpu = 3;
static const size_t num_elements_gpu = 3;
static_assert(sizeof(uint3) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 3;
static_assert(sizeof(uint3) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uint4> {
static const DataType data_type = TYPE_UINT;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 4;
static_assert(sizeof(uint4) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 4;
static_assert(sizeof(uint4) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<int> {
static const DataType data_type = TYPE_INT;
static const size_t num_elements_cpu = 1;
static const size_t num_elements_gpu = 1;
static_assert(sizeof(int) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 1;
static_assert(sizeof(int) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<int2> {
static const DataType data_type = TYPE_INT;
static const size_t num_elements_cpu = 2;
static const size_t num_elements_gpu = 2;
static_assert(sizeof(int2) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 2;
static_assert(sizeof(int2) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<int3> {
static const DataType data_type = TYPE_INT;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 3;
static_assert(sizeof(int3) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 4;
static_assert(sizeof(int3) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<int4> {
static const DataType data_type = TYPE_INT;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 4;
static_assert(sizeof(int4) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 4;
static_assert(sizeof(int4) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<float> {
static const DataType data_type = TYPE_FLOAT;
static const size_t num_elements_cpu = 1;
static const size_t num_elements_gpu = 1;
static_assert(sizeof(float) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 1;
static_assert(sizeof(float) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<float2> {
static const DataType data_type = TYPE_FLOAT;
static const size_t num_elements_cpu = 2;
static const size_t num_elements_gpu = 2;
static_assert(sizeof(float2) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 2;
static_assert(sizeof(float2) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<float3> {
/* float3 has different size depending on the device, can't use it for interchanging
* memory between CPU and GPU.
*
* Leave body empty to trigger a compile error if used. */
};
template<> struct device_type_traits<packed_float3> {
static const DataType data_type = TYPE_FLOAT;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 3;
static_assert(sizeof(float3) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 3;
static_assert(sizeof(packed_float3) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<float4> {
static const DataType data_type = TYPE_FLOAT;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 4;
static_assert(sizeof(float4) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 4;
static_assert(sizeof(float4) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<half> {
static const DataType data_type = TYPE_HALF;
static const size_t num_elements_cpu = 1;
static const size_t num_elements_gpu = 1;
static_assert(sizeof(half) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 1;
static_assert(sizeof(half) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<ushort4> {
static const DataType data_type = TYPE_UINT16;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 4;
static_assert(sizeof(ushort4) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 4;
static_assert(sizeof(ushort4) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uint16_t> {
static const DataType data_type = TYPE_UINT16;
static const size_t num_elements_cpu = 1;
static const size_t num_elements_gpu = 1;
static_assert(sizeof(uint16_t) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 1;
static_assert(sizeof(uint16_t) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<half4> {
static const DataType data_type = TYPE_HALF;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 4;
static_assert(sizeof(half4) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 4;
static_assert(sizeof(half4) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uint64_t> {
static const DataType data_type = TYPE_UINT64;
static const size_t num_elements_cpu = 1;
static const size_t num_elements_gpu = 1;
static_assert(sizeof(uint64_t) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 1;
static_assert(sizeof(uint64_t) == num_elements * datatype_size(data_type));
};
/* Device Memory
@@ -281,11 +266,16 @@ class device_memory {
/* Only create through subclasses. */
device_memory(Device *device, const char *name, MemoryType type);
device_memory(device_memory &&other) noexcept;
/* No copying allowed. */
/* No copying and allowed.
*
* This is because device implementation might need to register device memory in an allocation
* map of some sort and use pointer as a key to identify blocks. Moving data from one place to
* another bypassing device allocation routines will make those maps hard to maintain. */
device_memory(const device_memory &) = delete;
device_memory(device_memory &&other) noexcept = delete;
device_memory &operator=(const device_memory &) = delete;
device_memory &operator=(device_memory &&) = delete;
/* Host allocation on the device. All host_pointer memory should be
* allocated with these functions, for devices that support using
@@ -320,9 +310,7 @@ template<typename T> class device_only_memory : public device_memory {
: device_memory(device, name, allow_host_memory_fallback ? MEM_READ_WRITE : MEM_DEVICE_ONLY)
{
data_type = device_type_traits<T>::data_type;
data_elements = max(device_is_cpu() ? device_type_traits<T>::num_elements_cpu :
device_type_traits<T>::num_elements_gpu,
1);
data_elements = max(device_type_traits<T>::num_elements, 1);
}
device_only_memory(device_only_memory &&other) noexcept : device_memory(std::move(other))
@@ -378,15 +366,11 @@ template<typename T> class device_only_memory : public device_memory {
template<typename T> class device_vector : public device_memory {
public:
/* Can only use this for types that have the same size on CPU and GPU. */
static_assert(device_type_traits<T>::num_elements_cpu ==
device_type_traits<T>::num_elements_gpu);
device_vector(Device *device, const char *name, MemoryType type)
: device_memory(device, name, type)
{
data_type = device_type_traits<T>::data_type;
data_elements = device_type_traits<T>::num_elements_cpu;
data_elements = device_type_traits<T>::num_elements;
modified = true;
need_realloc_ = true;

View File

@@ -109,14 +109,6 @@ class MultiDevice : public Device {
return error_msg;
}
virtual bool show_samples() const override
{
if (devices.size() > 1) {
return false;
}
return devices.front().device->show_samples();
}
virtual BVHLayoutMask get_bvh_layout_mask() const override
{
BVHLayoutMask bvh_layout_mask = BVH_LAYOUT_ALL;

View File

@@ -886,8 +886,7 @@ bool OptiXDevice::denoise_configure_if_needed(DenoiseContext &context)
denoiser_.scratch_offset = sizes.stateSizeInBytes;
/* Allocate denoiser state if tile size has changed since last setup. */
denoiser_.state.alloc_to_device(denoiser_.scratch_offset + denoiser_.scratch_size +
sizeof(float));
denoiser_.state.alloc_to_device(denoiser_.scratch_offset + denoiser_.scratch_size);
/* Initialize denoiser state for the current tile size. */
const OptixResult result = optixDenoiserSetup(
@@ -971,16 +970,6 @@ bool OptiXDevice::denoise_run(DenoiseContext &context, const DenoisePass &pass)
/* Finally run denoising. */
OptixDenoiserParams params = {}; /* All parameters are disabled/zero. */
params.hdrIntensity = denoiser_.state.device_pointer + denoiser_.scratch_offset +
denoiser_.scratch_size;
optix_assert(
optixDenoiserComputeIntensity(denoiser_.optix_denoiser,
denoiser_.queue.stream(),
&color_layer,
params.hdrIntensity,
denoiser_.state.device_pointer + denoiser_.scratch_offset,
denoiser_.scratch_size));
OptixDenoiserLayer image_layers = {};
image_layers.input = color_layer;
@@ -1043,7 +1032,7 @@ bool OptiXDevice::build_optix_bvh(BVHOptiX *bvh,
return false;
}
device_only_memory<char> &out_data = bvh->as_data;
device_only_memory<char> &out_data = *bvh->as_data;
if (operation == OPTIX_BUILD_OPERATION_BUILD) {
assert(out_data.device == this);
out_data.alloc_to_device(sizes.outputSizeInBytes);
@@ -1134,7 +1123,7 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
operation = OPTIX_BUILD_OPERATION_UPDATE;
}
else {
bvh_optix->as_data.free();
bvh_optix->as_data->free();
bvh_optix->traversable_handle = 0;
}
@@ -1355,9 +1344,9 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
unsigned int num_instances = 0;
unsigned int max_num_instances = 0xFFFFFFFF;
bvh_optix->as_data.free();
bvh_optix->as_data->free();
bvh_optix->traversable_handle = 0;
bvh_optix->motion_transform_data.free();
bvh_optix->motion_transform_data->free();
optixDeviceContextGetProperty(context,
OPTIX_DEVICE_PROPERTY_LIMIT_MAX_INSTANCE_ID,
@@ -1390,8 +1379,8 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
}
}
assert(bvh_optix->motion_transform_data.device == this);
bvh_optix->motion_transform_data.alloc_to_device(total_motion_transform_size);
assert(bvh_optix->motion_transform_data->device == this);
bvh_optix->motion_transform_data->alloc_to_device(total_motion_transform_size);
}
for (Object *ob : bvh->objects) {
@@ -1452,7 +1441,7 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
motion_transform_offset = align_up(motion_transform_offset,
OPTIX_TRANSFORM_BYTE_ALIGNMENT);
CUdeviceptr motion_transform_gpu = bvh_optix->motion_transform_data.device_pointer +
CUdeviceptr motion_transform_gpu = bvh_optix->motion_transform_data->device_pointer +
motion_transform_offset;
motion_transform_offset += motion_transform_size;

View File

@@ -23,6 +23,7 @@
# include "device/optix/queue.h"
# include "device/optix/util.h"
# include "kernel/types.h"
# include "util/unique_ptr.h"
CCL_NAMESPACE_BEGIN
@@ -76,7 +77,7 @@ class OptiXDevice : public CUDADevice {
device_only_memory<KernelParamsOptiX> launch_params;
OptixTraversableHandle tlas_handle = 0;
vector<device_only_memory<char>> delayed_free_bvh_memory;
vector<unique_ptr<device_only_memory<char>>> delayed_free_bvh_memory;
thread_mutex delayed_free_bvh_mutex;
class Denoiser {

View File

@@ -73,7 +73,8 @@ bool OptiXDeviceQueue::enqueue(DeviceKernel kernel, const int work_size, void *a
sizeof(device_ptr),
cuda_stream_));
if (kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE) {
if (kernel == DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST ||
kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE) {
cuda_device_assert(
cuda_device_,
cuMemcpyHtoDAsync(launch_params_ptr + offsetof(KernelParamsOptiX, render_buffer),

View File

@@ -3,7 +3,7 @@ This program uses code from various sources, the default license is Apache 2.0
for all code, with the following exceptions.
Modified BSD License
* Code adapated from Open Shading Language
* Code adapted from Open Shading Language
* Sobol direction vectors
* Matrix inversion code from OpenEXR
* MD5 Hash code

View File

@@ -33,7 +33,10 @@ unique_ptr<Denoiser> Denoiser::create(Device *path_trace_device, const DenoisePa
return make_unique<OptiXDenoiser>(path_trace_device, params);
}
return make_unique<OIDNDenoiser>(path_trace_device, params);
/* Always fallback to OIDN. */
DenoiseParams oidn_params = params;
oidn_params.type = DENOISER_OPENIMAGEDENOISE;
return make_unique<OIDNDenoiser>(path_trace_device, oidn_params);
}
Denoiser::Denoiser(Device *path_trace_device, const DenoiseParams &params)

View File

@@ -296,13 +296,13 @@ static BufferParams scale_buffer_params(const BufferParams &params, int resoluti
scaled_params.window_x = params.window_x / resolution_divider;
scaled_params.window_y = params.window_y / resolution_divider;
scaled_params.window_width = params.window_width / resolution_divider;
scaled_params.window_height = params.window_height / resolution_divider;
scaled_params.window_width = max(1, params.window_width / resolution_divider);
scaled_params.window_height = max(1, params.window_height / resolution_divider);
scaled_params.full_x = params.full_x / resolution_divider;
scaled_params.full_y = params.full_y / resolution_divider;
scaled_params.full_width = params.full_width / resolution_divider;
scaled_params.full_height = params.full_height / resolution_divider;
scaled_params.full_width = max(1, params.full_width / resolution_divider);
scaled_params.full_height = max(1, params.full_height / resolution_divider);
scaled_params.update_offset_stride();
@@ -380,7 +380,10 @@ void PathTrace::path_trace(RenderWork &render_work)
PathTraceWork *path_trace_work = path_trace_works_[i].get();
PathTraceWork::RenderStatistics statistics;
path_trace_work->render_samples(statistics, render_work.path_trace.start_sample, num_samples);
path_trace_work->render_samples(statistics,
render_work.path_trace.start_sample,
num_samples,
render_work.path_trace.sample_offset);
const double work_time = time_dt() - work_start_time;
work_balance_infos_[i].time_spent += work_time;
@@ -847,9 +850,11 @@ void PathTrace::progress_update_if_needed(const RenderWork &render_work)
{
if (progress_ != nullptr) {
const int2 tile_size = get_render_tile_size();
const int num_samples_added = tile_size.x * tile_size.y * render_work.path_trace.num_samples;
const uint64_t num_samples_added = uint64_t(tile_size.x) * tile_size.y *
render_work.path_trace.num_samples;
const int current_sample = render_work.path_trace.start_sample +
render_work.path_trace.num_samples;
render_work.path_trace.num_samples -
render_work.path_trace.sample_offset;
progress_->add_samples(num_samples_added, current_sample);
}

View File

@@ -76,7 +76,7 @@ class PathTraceDisplay {
/* Copy buffer of rendered pixels of a given size into a given position of the texture.
*
* This function does not acquire a lock. The reason for this is is to allow use of this function
* This function does not acquire a lock. The reason for this is to allow use of this function
* for partial updates from different devices. In this case the caller will acquire the lock
* once, update all the slices and release
* the lock once. This will ensure that draw() will never use partially updated texture. */

View File

@@ -75,7 +75,10 @@ class PathTraceWork {
/* Render given number of samples as a synchronous blocking call.
* The samples are added to the render buffer associated with this work. */
virtual void render_samples(RenderStatistics &statistics, int start_sample, int samples_num) = 0;
virtual void render_samples(RenderStatistics &statistics,
int start_sample,
int samples_num,
int sample_offset) = 0;
/* Copy render result from this work to the corresponding place of the GPU display.
*

View File

@@ -71,14 +71,17 @@ void PathTraceWorkCPU::init_execution()
void PathTraceWorkCPU::render_samples(RenderStatistics &statistics,
int start_sample,
int samples_num)
int samples_num,
int sample_offset)
{
const int64_t image_width = effective_buffer_params_.width;
const int64_t image_height = effective_buffer_params_.height;
const int64_t total_pixels_num = image_width * image_height;
for (CPUKernelThreadGlobals &kernel_globals : kernel_thread_globals_) {
kernel_globals.start_profiling();
if (device_->profiler.active()) {
for (CPUKernelThreadGlobals &kernel_globals : kernel_thread_globals_) {
kernel_globals.start_profiling();
}
}
tbb::task_arena local_arena = local_tbb_arena_create(device_);
@@ -97,6 +100,7 @@ void PathTraceWorkCPU::render_samples(RenderStatistics &statistics,
work_tile.w = 1;
work_tile.h = 1;
work_tile.start_sample = start_sample;
work_tile.sample_offset = sample_offset;
work_tile.num_samples = 1;
work_tile.offset = effective_buffer_params_.offset;
work_tile.stride = effective_buffer_params_.stride;
@@ -106,9 +110,10 @@ void PathTraceWorkCPU::render_samples(RenderStatistics &statistics,
render_samples_full_pipeline(kernel_globals, work_tile, samples_num);
});
});
for (CPUKernelThreadGlobals &kernel_globals : kernel_thread_globals_) {
kernel_globals.stop_profiling();
if (device_->profiler.active()) {
for (CPUKernelThreadGlobals &kernel_globals : kernel_thread_globals_) {
kernel_globals.stop_profiling();
}
}
statistics.occupancy = 1.0f;

View File

@@ -48,7 +48,8 @@ class PathTraceWorkCPU : public PathTraceWork {
virtual void render_samples(RenderStatistics &statistics,
int start_sample,
int samples_num) override;
int samples_num,
int sample_offset) override;
virtual void copy_to_display(PathTraceDisplay *display,
PassMode pass_mode,

View File

@@ -250,7 +250,8 @@ void PathTraceWorkGPU::init_execution()
void PathTraceWorkGPU::render_samples(RenderStatistics &statistics,
int start_sample,
int samples_num)
int samples_num,
int sample_offset)
{
/* Limit number of states for the tile and rely on a greedy scheduling of tiles. This allows to
* add more work (because tiles are smaller, so there is higher chance that more paths will
@@ -261,6 +262,7 @@ void PathTraceWorkGPU::render_samples(RenderStatistics &statistics,
work_tile_scheduler_.reset(effective_buffer_params_,
start_sample,
samples_num,
sample_offset,
device_scene_->data.integrator.scrambling_distance);
enqueue_reset();
@@ -437,7 +439,15 @@ void PathTraceWorkGPU::enqueue_path_iteration(DeviceKernel kernel, const int num
DCHECK_LE(work_size, max_num_paths_);
switch (kernel) {
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST:
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST: {
/* Closest ray intersection kernels with integrator state and render buffer. */
void *d_render_buffer = (void *)buffers_->buffer.device_pointer;
void *args[] = {&d_path_index, &d_render_buffer, const_cast<int *>(&work_size)};
queue_->enqueue(kernel, work_size, args);
break;
}
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW:
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE:
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK: {

View File

@@ -46,7 +46,8 @@ class PathTraceWorkGPU : public PathTraceWork {
virtual void render_samples(RenderStatistics &statistics,
int start_sample,
int samples_num) override;
int samples_num,
int sample_offset) override;
virtual void copy_to_display(PathTraceDisplay *display,
PassMode pass_mode,

View File

@@ -88,6 +88,16 @@ int RenderScheduler::get_num_samples() const
return num_samples_;
}
void RenderScheduler::set_sample_offset(int sample_offset)
{
sample_offset_ = sample_offset;
}
int RenderScheduler::get_sample_offset() const
{
return sample_offset_;
}
void RenderScheduler::set_time_limit(double time_limit)
{
time_limit_ = time_limit;
@@ -110,13 +120,15 @@ int RenderScheduler::get_num_rendered_samples() const
return state_.num_rendered_samples;
}
void RenderScheduler::reset(const BufferParams &buffer_params, int num_samples)
void RenderScheduler::reset(const BufferParams &buffer_params, int num_samples, int sample_offset)
{
buffer_params_ = buffer_params;
update_start_resolution_divider();
set_num_samples(num_samples);
set_start_sample(sample_offset);
set_sample_offset(sample_offset);
/* In background mode never do lower resolution render preview, as it is not really supported
* by the software. */
@@ -171,7 +183,7 @@ void RenderScheduler::reset(const BufferParams &buffer_params, int num_samples)
void RenderScheduler::reset_for_next_tile()
{
reset(buffer_params_, num_samples_);
reset(buffer_params_, num_samples_, sample_offset_);
}
bool RenderScheduler::render_work_reschedule_on_converge(RenderWork &render_work)
@@ -317,6 +329,7 @@ RenderWork RenderScheduler::get_render_work()
render_work.path_trace.start_sample = get_start_sample_to_path_trace();
render_work.path_trace.num_samples = get_num_samples_to_path_trace();
render_work.path_trace.sample_offset = get_sample_offset();
render_work.init_render_buffers = (render_work.path_trace.start_sample == get_start_sample());
@@ -827,6 +840,26 @@ int RenderScheduler::get_num_samples_to_path_trace() const
num_samples_to_occupy = lround(state_.occupancy_num_samples * 0.7f / state_.occupancy);
}
/* When time limit is used clamp the calculated number of samples to keep occupancy.
* This is because time limit causes the last render iteration to happen with less number of
* samples, which conflicts with the occupancy (lower number of samples causes lower
* occupancy, also the calculation is based on number of previously rendered samples).
*
* When time limit is not used the number of samples per render iteration is either increasing
* or stays the same, so there is no need to clamp number of samples calculated for occupancy.
*/
if (time_limit_ != 0.0 && state_.start_render_time != 0.0) {
const double remaining_render_time = max(
0.0, time_limit_ - (time_dt() - state_.start_render_time));
const double time_per_sample_average = path_trace_time_.get_average();
const double predicted_render_time = num_samples_to_occupy * time_per_sample_average;
if (predicted_render_time > remaining_render_time) {
num_samples_to_occupy = lround(num_samples_to_occupy *
(remaining_render_time / predicted_render_time));
}
}
num_samples_to_render = max(num_samples_to_render,
min(num_samples_to_occupy, max_num_samples_to_render));
}

View File

@@ -39,6 +39,7 @@ class RenderWork {
struct {
int start_sample = 0;
int num_samples = 0;
int sample_offset = 0;
} path_trace;
struct {
@@ -125,6 +126,9 @@ class RenderScheduler {
void set_num_samples(int num_samples);
int get_num_samples() const;
void set_sample_offset(int sample_offset);
int get_sample_offset() const;
/* Time limit for the path tracing tasks, in minutes.
* Zero disables the limit. */
void set_time_limit(double time_limit);
@@ -150,7 +154,7 @@ class RenderScheduler {
/* Reset scheduler, indicating that rendering will happen from scratch.
* Resets current rendered state, as well as scheduling information. */
void reset(const BufferParams &buffer_params, int num_samples);
void reset(const BufferParams &buffer_params, int num_samples, int sample_offset);
/* Reset scheduler upon switching to a next tile.
* Will keep the same number of samples and full-frame render parameters, but will reset progress
@@ -419,6 +423,8 @@ class RenderScheduler {
int start_sample_ = 0;
int num_samples_ = 0;
int sample_offset_ = 0;
/* Limit in seconds for how long path tracing is allowed to happen.
* Zero means no limit is applied. */
double time_limit_ = 0.0;

View File

@@ -36,6 +36,7 @@ void WorkTileScheduler::set_max_num_path_states(int max_num_path_states)
void WorkTileScheduler::reset(const BufferParams &buffer_params,
int sample_start,
int samples_num,
int sample_offset,
float scrambling_distance)
{
/* Image buffer parameters. */
@@ -51,6 +52,7 @@ void WorkTileScheduler::reset(const BufferParams &buffer_params,
/* Samples parameters. */
sample_start_ = sample_start;
samples_num_ = samples_num;
sample_offset_ = sample_offset;
/* Initialize new scheduling. */
reset_scheduler_state();
@@ -111,6 +113,7 @@ bool WorkTileScheduler::get_work(KernelWorkTile *work_tile_, const int max_work_
work_tile.h = tile_size_.height;
work_tile.start_sample = sample_start_ + start_sample;
work_tile.num_samples = min(tile_size_.num_samples, samples_num_ - start_sample);
work_tile.sample_offset = sample_offset_;
work_tile.offset = offset_;
work_tile.stride = stride_;

View File

@@ -41,6 +41,7 @@ class WorkTileScheduler {
void reset(const BufferParams &buffer_params,
int sample_start,
int samples_num,
int sample_offset,
float scrambling_distance);
/* Get work for a device.
@@ -79,6 +80,7 @@ class WorkTileScheduler {
* (splitting into a smaller work tiles). */
int sample_start_ = 0;
int samples_num_ = 0;
int sample_offset_ = 0;
/* Tile size which be scheduled for rendering. */
TileSize tile_size_;

View File

@@ -273,6 +273,7 @@ set(SRC_KERNEL_UTIL_HEADERS
)
set(SRC_KERNEL_TYPES_HEADERS
tables.h
textures.h
types.h
)
@@ -410,12 +411,8 @@ if(WITH_CYCLES_CUDA_BINARIES)
-I ${CMAKE_CURRENT_SOURCE_DIR}/..
-I ${CMAKE_CURRENT_SOURCE_DIR}/device/cuda
--use_fast_math
-o ${CMAKE_CURRENT_BINARY_DIR}/${cuda_file})
if(${experimental})
set(cuda_flags ${cuda_flags} -D __KERNEL_EXPERIMENTAL__)
set(name ${name}_experimental)
endif()
-o ${CMAKE_CURRENT_BINARY_DIR}/${cuda_file}
-Wno-deprecated-gpu-targets)
if(WITH_NANOVDB)
set(cuda_flags ${cuda_flags}
@@ -423,6 +420,10 @@ if(WITH_CYCLES_CUDA_BINARIES)
-I "${NANOVDB_INCLUDE_DIR}")
endif()
if(WITH_CYCLES_DEBUG)
set(cuda_flags ${cuda_flags} -D WITH_CYCLES_DEBUG)
endif()
if(WITH_CYCLES_CUBIN_COMPILER)
string(SUBSTRING ${arch} 3 -1 CUDA_ARCH)
@@ -571,13 +572,14 @@ if(WITH_CYCLES_HIP_BINARIES AND WITH_CYCLES_DEVICE_HIP)
-ffast-math
-o ${CMAKE_CURRENT_BINARY_DIR}/${hip_file})
if(${experimental})
set(hip_flags ${hip_flags} -D __KERNEL_EXPERIMENTAL__)
set(name ${name}_experimental)
if(WITH_NANOVDB)
set(hip_flags ${hip_flags}
-D WITH_NANOVDB
-I "${NANOVDB_INCLUDE_DIR}")
endif()
if(WITH_CYCLES_DEBUG)
set(hip_flags ${hip_flags} -D __KERNEL_DEBUG__)
set(hip_flags ${hip_flags} -D WITH_CYCLES_DEBUG)
endif()
add_custom_command(
@@ -618,6 +620,10 @@ if(WITH_CYCLES_DEVICE_OPTIX AND WITH_CYCLES_CUDA_BINARIES)
-I "${NANOVDB_INCLUDE_DIR}")
endif()
if(WITH_CYCLES_DEBUG)
set(cuda_flags ${cuda_flags} -D WITH_CYCLES_DEBUG)
endif()
if(WITH_CYCLES_CUBIN_COMPILER)
# Needed to find libnvrtc-builtins.so. Can't do it from inside
# cycles_cubin_cc since the env variable is read before main()
@@ -706,7 +712,7 @@ if(WITH_COMPILER_ASAN)
string(APPEND CMAKE_CXX_FLAGS_RELWITHDEBINFO " -fno-sanitize=all")
string(APPEND CMAKE_CXX_FLAGS_DEBUG " -fno-sanitize=vptr")
elseif(CMAKE_C_COMPILER_ID MATCHES "Clang")
# With OSL, Cycles disables rtti in some modules, wich then breaks at linking
# With OSL, Cycles disables rtti in some modules, which then breaks at linking
# when trying to use vptr sanitizer (included into 'undefined' general option).
string(APPEND CMAKE_CXX_FLAGS_RELWITHDEBINFO " -fno-sanitize=vptr")
string(APPEND CMAKE_CXX_FLAGS_DEBUG " -fno-sanitize=vptr")

View File

@@ -97,7 +97,7 @@ ccl_device_inline void sort_intersections_and_normals(ccl_private Intersection *
swapped = false;
for (int j = 0; j < num_hits - 1; ++j) {
if (hits[j].t > hits[j + 1].t) {
struct Intersection tmp_hit = hits[j];
Intersection tmp_hit = hits[j];
float3 tmp_Ng = Ng[j];
hits[j] = hits[j + 1];
Ng[j] = Ng[j + 1];

View File

@@ -438,7 +438,7 @@ ccl_device_inline int bsdf_sample(KernelGlobals kg,
if (label & LABEL_TRANSMIT) {
float threshold_squared = kernel_data.background.transparent_roughness_squared_threshold;
if (threshold_squared >= 0.0f) {
if (threshold_squared >= 0.0f && !(label & LABEL_DIFFUSE)) {
if (bsdf_get_specular_roughness_squared(sc) <= threshold_squared) {
label |= LABEL_TRANSMIT_TRANSPARENT;
}

View File

@@ -18,6 +18,7 @@
#pragma once
#include "kernel/tables.h"
#include "kernel/types.h"
#include "kernel/util/profiling.h"

View File

@@ -37,7 +37,7 @@
KERNEL_INTEGRATOR_INIT_FUNCTION(init_from_camera);
KERNEL_INTEGRATOR_INIT_FUNCTION(init_from_bake);
KERNEL_INTEGRATOR_FUNCTION(intersect_closest);
KERNEL_INTEGRATOR_SHADE_FUNCTION(intersect_closest);
KERNEL_INTEGRATOR_FUNCTION(intersect_shadow);
KERNEL_INTEGRATOR_FUNCTION(intersect_subsurface);
KERNEL_INTEGRATOR_FUNCTION(intersect_volume_stack);

View File

@@ -112,7 +112,7 @@ CCL_NAMESPACE_BEGIN
DEFINE_INTEGRATOR_INIT_KERNEL(init_from_camera)
DEFINE_INTEGRATOR_INIT_KERNEL(init_from_bake)
DEFINE_INTEGRATOR_KERNEL(intersect_closest)
DEFINE_INTEGRATOR_SHADE_KERNEL(intersect_closest)
DEFINE_INTEGRATOR_KERNEL(intersect_subsurface)
DEFINE_INTEGRATOR_KERNEL(intersect_volume_stack)
DEFINE_INTEGRATOR_SHADE_KERNEL(shade_background)

View File

@@ -52,8 +52,9 @@ typedef unsigned long long uint64_t;
#endif
#define ccl_device_noinline __device__ __noinline__
#define ccl_device_noinline_cpu ccl_device
#define ccl_device_inline_method ccl_device
#define ccl_global
#define ccl_static_constant __constant__
#define ccl_inline_constant __constant__
#define ccl_device_constant __constant__ __device__
#define ccl_constant const
#define ccl_gpu_shared __shared__
@@ -85,7 +86,6 @@ typedef unsigned long long uint64_t;
#define ccl_gpu_syncthreads() __syncthreads()
#define ccl_gpu_ballot(predicate) __ballot_sync(0xFFFFFFFF, predicate)
#define ccl_gpu_shfl_down_sync(mask, var, detla) __shfl_down_sync(mask, var, detla)
#define ccl_gpu_popc(x) __popc(x)
/* GPU texture objects */

View File

@@ -21,6 +21,9 @@
#include "kernel/device/gpu/parallel_sorted_index.h"
#include "kernel/device/gpu/work_stealing.h"
/* Include constant tables before entering Metal's context class scope (context_begin.h) */
#include "kernel/tables.h"
#ifdef __KERNEL_METAL__
# include "kernel/device/metal/context_begin.h"
#endif
@@ -131,13 +134,14 @@ ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
ccl_gpu_kernel_signature(integrator_intersect_closest,
ccl_global const int *path_index_array,
ccl_global float *render_buffer,
const int work_size)
{
const int global_index = ccl_gpu_global_id_x();
if (global_index < work_size) {
const int state = (path_index_array) ? path_index_array[global_index] : global_index;
ccl_gpu_kernel_call(integrator_intersect_closest(NULL, state));
ccl_gpu_kernel_call(integrator_intersect_closest(NULL, state, render_buffer));
}
}
@@ -463,7 +467,7 @@ ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
const auto num_active_pixels_mask = ccl_gpu_ballot(!converged);
const int lane_id = ccl_gpu_thread_idx_x % ccl_gpu_warp_size;
if (lane_id == 0) {
atomic_fetch_and_add_uint32(num_active_pixels, ccl_gpu_popc(num_active_pixels_mask));
atomic_fetch_and_add_uint32(num_active_pixels, popcount(num_active_pixels_mask));
}
}
@@ -523,6 +527,26 @@ ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
* Film.
*/
ccl_device_inline void kernel_gpu_film_convert_half_write(ccl_global uchar4 *rgba,
const int rgba_offset,
const int rgba_stride,
const int x,
const int y,
const half4 half_pixel)
{
/* Work around HIP issue with half float display, see T92972. */
#ifdef __KERNEL_HIP__
ccl_global half *out = ((ccl_global half *)rgba) + (rgba_offset + y * rgba_stride + x) * 4;
out[0] = half_pixel.x;
out[1] = half_pixel.y;
out[2] = half_pixel.z;
out[3] = half_pixel.w;
#else
ccl_global half4 *out = ((ccl_global half4 *)rgba) + rgba_offset + y * rgba_stride + x;
*out = half_pixel;
#endif
}
#define KERNEL_FILM_CONVERT_VARIANT(variant, input_channel_count) \
ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS) \
ccl_gpu_kernel_signature(film_convert_##variant, \
@@ -588,8 +612,9 @@ ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
\
film_apply_pass_pixel_overlays_rgba(&kfilm_convert, buffer, pixel); \
\
ccl_global half4 *out = ((ccl_global half4 *)rgba) + rgba_offset + y * rgba_stride + x; \
*out = float4_to_half4_display(make_float4(pixel[0], pixel[1], pixel[2], pixel[3])); \
const half4 half_pixel = float4_to_half4_display( \
make_float4(pixel[0], pixel[1], pixel[2], pixel[3])); \
kernel_gpu_film_convert_half_write(rgba, rgba_offset, rgba_stride, x, y, half_pixel); \
}
/* 1 channel inputs */
@@ -870,6 +895,6 @@ ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
const auto can_split_mask = ccl_gpu_ballot(can_split);
const int lane_id = ccl_gpu_thread_idx_x % ccl_gpu_warp_size;
if (lane_id == 0) {
atomic_fetch_and_add_uint32(num_possible_splits, ccl_gpu_popc(can_split_mask));
atomic_fetch_and_add_uint32(num_possible_splits, popcount(can_split_mask));
}
}

View File

@@ -85,8 +85,8 @@ __device__ void gpu_parallel_active_index_array(const uint num_states,
const uint is_active = (state_index < num_states) ? is_active_op(state_index) : 0;
/* For each thread within a warp compute how many other active states precede it. */
const uint thread_offset = ccl_gpu_popc(ccl_gpu_ballot(is_active) &
ccl_gpu_thread_mask(thread_warp));
const uint thread_offset = popcount(ccl_gpu_ballot(is_active) &
ccl_gpu_thread_mask(thread_warp));
/* Last thread in warp stores number of active states for each warp. */
if (thread_warp == ccl_gpu_warp_size - 1) {

View File

@@ -45,8 +45,9 @@ typedef unsigned long long uint64_t;
#define ccl_device_forceinline __device__ __forceinline__
#define ccl_device_noinline __device__ __noinline__
#define ccl_device_noinline_cpu ccl_device
#define ccl_device_inline_method ccl_device
#define ccl_global
#define ccl_static_constant __constant__
#define ccl_inline_constant __constant__
#define ccl_device_constant __constant__ __device__
#define ccl_constant const
#define ccl_gpu_shared __shared__
@@ -84,7 +85,6 @@ typedef unsigned long long uint64_t;
#define ccl_gpu_syncthreads() __syncthreads()
#define ccl_gpu_ballot(predicate) __ballot(predicate)
#define ccl_gpu_shfl_down_sync(mask, var, detla) __shfl_down(var, detla)
#define ccl_gpu_popc(x) __popc(x)
/* GPU texture objects */
typedef hipTextureObject_t ccl_gpu_tex_object;

View File

@@ -34,6 +34,7 @@ using namespace metal;
#pragma clang diagnostic ignored "-Wunused-variable"
#pragma clang diagnostic ignored "-Wsign-compare"
#pragma clang diagnostic ignored "-Wuninitialized"
/* Qualifiers */
@@ -42,8 +43,9 @@ using namespace metal;
#define ccl_device_forceinline ccl_device
#define ccl_device_noinline ccl_device __attribute__((noinline))
#define ccl_device_noinline_cpu ccl_device
#define ccl_device_inline_method ccl_device
#define ccl_global device
#define ccl_static_constant static constant constexpr
#define ccl_inline_constant static constant constexpr
#define ccl_device_constant constant
#define ccl_constant const device
#define ccl_gpu_shared threadgroup
@@ -64,7 +66,7 @@ using namespace metal;
#define ccl_gpu_thread_mask(thread_warp) uint64_t((1ull << thread_warp) - 1)
#define ccl_gpu_ballot(predicate) ((uint64_t)((simd_vote::vote_t)simd_ballot(predicate)))
#define ccl_gpu_popc(x) popcount(x)
#define ccl_gpu_syncthreads() threadgroup_barrier(mem_flags::mem_threadgroup);
// clang-format off
@@ -73,7 +75,8 @@ using namespace metal;
#define ccl_gpu_kernel(block_num_threads, thread_num_registers)
#define ccl_gpu_kernel_threads(block_num_threads)
/* convert a comma-separated list into a semicolon-separated list (so that we can generate a struct based on kernel entrypoint parameters) */
/* Convert a comma-separated list into a semicolon-separated list
* (so that we can generate a struct based on kernel entry-point parameters). */
#define FN0()
#define FN1(p1) p1;
#define FN2(p1, p2) p1; p2;
@@ -94,7 +97,8 @@ using namespace metal;
#define GET_LAST_ARG(p0, p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13, p14, p15, p16, ...) p16
#define PARAMS_MAKER(...) GET_LAST_ARG(__VA_ARGS__, FN16, FN15, FN14, FN13, FN12, FN11, FN10, FN9, FN8, FN7, FN6, FN5, FN4, FN3, FN2, FN1, FN0)
/* generate a struct containing the entrypoint parameters and a "run" method which can access them implicitly via this-> */
/* Generate a struct containing the entry-point parameters and a "run"
* method which can access them implicitly via this-> */
#define ccl_gpu_kernel_signature(name, ...) \
struct kernel_gpu_##name \
{ \
@@ -121,7 +125,6 @@ kernel void kernel_metal_##name(device const kernel_gpu_##name *params_struct, \
uint simd_group_index [[simdgroup_index_in_threadgroup]], \
uint num_simd_groups [[simdgroups_per_threadgroup]]) { \
MetalKernelContext context(_launch_params_metal, _metal_ancillaries); \
INIT_DEBUG_BUFFER \
params_struct->run(context, simdgroup_offset, metal_global_id, metal_local_id, metal_local_size, simdgroup_size, simd_lane_index, simd_group_index, num_simd_groups); \
} \
void kernel_gpu_##name::run(thread MetalKernelContext& context, \
@@ -148,6 +151,31 @@ void kernel_gpu_##name::run(thread MetalKernelContext& context, \
// clang-format on
/* volumetric lambda functions - use function objects for lambda-like functionality */
#define VOLUME_READ_LAMBDA(function_call) \
struct FnObjectRead { \
KernelGlobals kg; \
ccl_private MetalKernelContext *context; \
int state; \
\
VolumeStack operator()(const int i) const \
{ \
return context->function_call; \
} \
} volume_read_lambda_pass{kg, this, state};
#define VOLUME_WRITE_LAMBDA(function_call) \
struct FnObjectWrite { \
KernelGlobals kg; \
ccl_private MetalKernelContext *context; \
int state; \
\
void operator()(const int i, VolumeStack entry) const \
{ \
context->function_call; \
} \
} volume_write_lambda_pass{kg, this, state};
/* make_type definitions with Metal style element initializers */
#ifdef make_float2
# undef make_float2
@@ -202,6 +230,7 @@ void kernel_gpu_##name::run(thread MetalKernelContext& context, \
#define sinhf(x) sinh(float(x))
#define coshf(x) cosh(float(x))
#define tanhf(x) tanh(float(x))
#define saturatef(x) saturate(float(x))
/* Use native functions with possibly lower precision for performance,
* no issues found so far. */
@@ -215,6 +244,8 @@ void kernel_gpu_##name::run(thread MetalKernelContext& context, \
#define NULL 0
#define __device__
/* texture bindings and sampler setup */
struct Texture2DParamsMetal {
@@ -229,6 +260,9 @@ struct MetalAncillaries {
device Texture3DParamsMetal *textures_3d;
};
#include "util/half.h"
#include "util/types.h"
enum SamplerType {
SamplerFilterNearest_AddressRepeat,
SamplerFilterNearest_AddressClampEdge,

View File

@@ -76,4 +76,4 @@ class MetalKernelContext {
}
# include "kernel/device/gpu/image.h"
// clang-format on
// clang-format on

View File

@@ -17,7 +17,7 @@
; /* end of MetalKernelContext class definition */
/* Silently redirect into the MetalKernelContext instance */
/* NOTE: These macros will need maintaining as entrypoints change */
/* NOTE: These macros will need maintaining as entry-points change. */
#undef kernel_integrator_state
#define kernel_integrator_state context.launch_params_metal.__integrator_state

View File

@@ -25,7 +25,7 @@ CCL_NAMESPACE_BEGIN
typedef struct KernelParamsMetal {
#define KERNEL_TEX(type, name) ccl_constant type *name;
#define KERNEL_TEX(type, name) ccl_global const type *name;
#include "kernel/textures.h"
#undef KERNEL_TEX

View File

@@ -49,10 +49,11 @@ typedef unsigned long long uint64_t;
__device__ __forceinline__ // Function calls are bad for OptiX performance, so inline everything
#define ccl_device_inline ccl_device
#define ccl_device_forceinline ccl_device
#define ccl_device_inline_method ccl_device
#define ccl_device_noinline __device__ __noinline__
#define ccl_device_noinline_cpu ccl_device
#define ccl_global
#define ccl_static_constant __constant__
#define ccl_inline_constant __constant__
#define ccl_device_constant __constant__ __device__
#define ccl_constant const
#define ccl_gpu_shared __shared__
@@ -86,7 +87,6 @@ typedef unsigned long long uint64_t;
#define ccl_gpu_syncthreads() __syncthreads()
#define ccl_gpu_ballot(predicate) __ballot_sync(0xFFFFFFFF, predicate)
#define ccl_gpu_shfl_down_sync(mask, var, detla) __shfl_down_sync(mask, var, detla)
#define ccl_gpu_popc(x) __popc(x)
/* GPU texture objects */

View File

@@ -21,6 +21,8 @@
#include "kernel/device/gpu/image.h" /* Texture lookup uses normal CUDA intrinsics. */
#include "kernel/tables.h"
#include "kernel/integrator/state.h"
#include "kernel/integrator/state_flow.h"
#include "kernel/integrator/state_util.h"
@@ -44,7 +46,7 @@ template<typename T> ccl_device_forceinline T *get_payload_ptr_2()
ccl_device_forceinline int get_object_id()
{
#ifdef __OBJECT_MOTION__
/* Always get the the instance ID from the TLAS
/* Always get the instance ID from the TLAS
* There might be a motion transform node between TLAS and BLAS which does not have one. */
return optixGetInstanceIdFromHandle(optixGetTransformListHandle(0));
#else
@@ -57,7 +59,7 @@ extern "C" __global__ void __raygen__kernel_optix_integrator_intersect_closest()
const int global_index = optixGetLaunchIndex().x;
const int path_index = (__params.path_index_array) ? __params.path_index_array[global_index] :
global_index;
integrator_intersect_closest(nullptr, path_index);
integrator_intersect_closest(nullptr, path_index, __params.render_buffer);
}
extern "C" __global__ void __raygen__kernel_optix_integrator_intersect_shadow()
@@ -159,9 +161,9 @@ extern "C" __global__ void __anyhit__kernel_optix_local_hit()
/* Record geometric normal. */
const uint tri_vindex = kernel_tex_fetch(__tri_vindex, prim).w;
const float3 tri_a = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex + 0));
const float3 tri_b = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex + 1));
const float3 tri_c = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex + 2));
const float3 tri_a = kernel_tex_fetch(__tri_verts, tri_vindex + 0);
const float3 tri_b = kernel_tex_fetch(__tri_verts, tri_vindex + 1);
const float3 tri_c = kernel_tex_fetch(__tri_verts, tri_vindex + 2);
local_isect->Ng[hit] = normalize(cross(tri_b - tri_a, tri_c - tri_a));
/* Continue tracing (without this the trace call would return after the first hit). */

View File

@@ -33,62 +33,72 @@ CCL_NAMESPACE_BEGIN
* them separately. */
ccl_device_inline void bsdf_eval_init(ccl_private BsdfEval *eval,
const bool is_diffuse,
const ClosureType closure_type,
float3 value)
{
eval->diffuse = zero_float3();
eval->glossy = zero_float3();
if (is_diffuse) {
if (CLOSURE_IS_BSDF_DIFFUSE(closure_type)) {
eval->diffuse = value;
}
else {
else if (CLOSURE_IS_BSDF_GLOSSY(closure_type)) {
eval->glossy = value;
}
eval->sum = value;
}
ccl_device_inline void bsdf_eval_accum(ccl_private BsdfEval *eval,
const bool is_diffuse,
float3 value,
float mis_weight)
const ClosureType closure_type,
float3 value)
{
value *= mis_weight;
if (is_diffuse) {
if (CLOSURE_IS_BSDF_DIFFUSE(closure_type)) {
eval->diffuse += value;
}
else {
else if (CLOSURE_IS_BSDF_GLOSSY(closure_type)) {
eval->glossy += value;
}
eval->sum += value;
}
ccl_device_inline bool bsdf_eval_is_zero(ccl_private BsdfEval *eval)
{
return is_zero(eval->diffuse) && is_zero(eval->glossy);
return is_zero(eval->sum);
}
ccl_device_inline void bsdf_eval_mul(ccl_private BsdfEval *eval, float value)
{
eval->diffuse *= value;
eval->glossy *= value;
eval->sum *= value;
}
ccl_device_inline void bsdf_eval_mul3(ccl_private BsdfEval *eval, float3 value)
{
eval->diffuse *= value;
eval->glossy *= value;
eval->sum *= value;
}
ccl_device_inline float3 bsdf_eval_sum(ccl_private const BsdfEval *eval)
{
return eval->diffuse + eval->glossy;
return eval->sum;
}
ccl_device_inline float3 bsdf_eval_diffuse_glossy_ratio(ccl_private const BsdfEval *eval)
ccl_device_inline float3 bsdf_eval_pass_diffuse_weight(ccl_private const BsdfEval *eval)
{
/* Ratio of diffuse and glossy to recover proportions for writing to render pass.
/* Ratio of diffuse weight to recover proportions for writing to render pass.
* We assume reflection, transmission and volume scatter to be exclusive. */
return safe_divide_float3_float3(eval->diffuse, eval->diffuse + eval->glossy);
return safe_divide_float3_float3(eval->diffuse, eval->sum);
}
ccl_device_inline float3 bsdf_eval_pass_glossy_weight(ccl_private const BsdfEval *eval)
{
/* Ratio of glossy weight to recover proportions for writing to render pass.
* We assume reflection, transmission and volume scatter to be exclusive. */
return safe_divide_float3_float3(eval->glossy, eval->sum);
}
/* --------------------------------------------------------------------
@@ -141,7 +151,8 @@ ccl_device_forceinline ccl_global float *kernel_accum_pixel_render_buffer(
ccl_device_inline int kernel_accum_sample(KernelGlobals kg,
ConstIntegratorState state,
ccl_global float *ccl_restrict render_buffer,
int sample)
int sample,
int sample_offset)
{
if (kernel_data.film.pass_sample_count == PASS_UNUSED) {
return sample;
@@ -149,7 +160,9 @@ ccl_device_inline int kernel_accum_sample(KernelGlobals kg,
ccl_global float *buffer = kernel_accum_pixel_render_buffer(kg, state, render_buffer);
return atomic_fetch_and_add_uint32((uint *)(buffer) + kernel_data.film.pass_sample_count, 1);
return atomic_fetch_and_add_uint32(
(ccl_global uint *)(buffer) + kernel_data.film.pass_sample_count, 1) +
sample_offset;
}
ccl_device void kernel_accum_adaptive_buffer(KernelGlobals kg,
@@ -351,37 +364,47 @@ ccl_device_inline void kernel_accum_emission_or_background_pass(KernelGlobals kg
/* Directly visible, write to emission or background pass. */
pass_offset = pass;
}
else if (path_flag & (PATH_RAY_REFLECT_PASS | PATH_RAY_TRANSMISSION_PASS)) {
/* Indirectly visible through reflection. */
const int glossy_pass_offset = (path_flag & PATH_RAY_REFLECT_PASS) ?
((INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_glossy_direct :
kernel_data.film.pass_glossy_indirect) :
((INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_transmission_direct :
kernel_data.film.pass_transmission_indirect);
else if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
if (path_flag & PATH_RAY_SURFACE_PASS) {
/* Indirectly visible through reflection. */
const float3 diffuse_weight = INTEGRATOR_STATE(state, path, pass_diffuse_weight);
const float3 glossy_weight = INTEGRATOR_STATE(state, path, pass_glossy_weight);
if (glossy_pass_offset != PASS_UNUSED) {
/* Glossy is a subset of the throughput, reconstruct it here using the
* diffuse-glossy ratio. */
const float3 ratio = INTEGRATOR_STATE(state, path, diffuse_glossy_ratio);
const float3 glossy_contribution = (one_float3() - ratio) * contribution;
kernel_write_pass_float3(buffer + glossy_pass_offset, glossy_contribution);
}
/* Glossy */
const int glossy_pass_offset = ((INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_glossy_direct :
kernel_data.film.pass_glossy_indirect);
if (glossy_pass_offset != PASS_UNUSED) {
kernel_write_pass_float3(buffer + glossy_pass_offset, glossy_weight * contribution);
}
/* Reconstruct diffuse subset of throughput. */
pass_offset = (INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_diffuse_direct :
kernel_data.film.pass_diffuse_indirect;
if (pass_offset != PASS_UNUSED) {
contribution *= INTEGRATOR_STATE(state, path, diffuse_glossy_ratio);
/* Transmission */
const int transmission_pass_offset = ((INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_transmission_direct :
kernel_data.film.pass_transmission_indirect);
if (transmission_pass_offset != PASS_UNUSED) {
/* Transmission is what remains if not diffuse and glossy, not stored explicitly to save
* GPU memory. */
const float3 transmission_weight = one_float3() - diffuse_weight - glossy_weight;
kernel_write_pass_float3(buffer + transmission_pass_offset,
transmission_weight * contribution);
}
/* Reconstruct diffuse subset of throughput. */
pass_offset = (INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_diffuse_direct :
kernel_data.film.pass_diffuse_indirect;
if (pass_offset != PASS_UNUSED) {
contribution *= diffuse_weight;
}
}
else if (path_flag & PATH_RAY_VOLUME_PASS) {
/* Indirectly visible through volume. */
pass_offset = (INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_volume_direct :
kernel_data.film.pass_volume_indirect;
}
}
else if (path_flag & PATH_RAY_VOLUME_PASS) {
/* Indirectly visible through volume. */
pass_offset = (INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_volume_direct :
kernel_data.film.pass_volume_indirect;
}
/* Single write call for GPU coherence. */
@@ -426,49 +449,60 @@ ccl_device_inline void kernel_accum_light(KernelGlobals kg,
#ifdef __PASSES__
if (kernel_data.film.light_pass_flag & PASS_ANY) {
const uint32_t path_flag = INTEGRATOR_STATE(state, shadow_path, flag);
int pass_offset = PASS_UNUSED;
if (path_flag & (PATH_RAY_REFLECT_PASS | PATH_RAY_TRANSMISSION_PASS)) {
/* Indirectly visible through reflection. */
const int glossy_pass_offset = (path_flag & PATH_RAY_REFLECT_PASS) ?
((INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_glossy_direct :
kernel_data.film.pass_glossy_indirect) :
((INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_transmission_direct :
kernel_data.film.pass_transmission_indirect);
if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
int pass_offset = PASS_UNUSED;
if (glossy_pass_offset != PASS_UNUSED) {
/* Glossy is a subset of the throughput, reconstruct it here using the
* diffuse-glossy ratio. */
const float3 ratio = INTEGRATOR_STATE(state, shadow_path, diffuse_glossy_ratio);
const float3 glossy_contribution = (one_float3() - ratio) * contribution;
kernel_write_pass_float3(buffer + glossy_pass_offset, glossy_contribution);
if (path_flag & PATH_RAY_SURFACE_PASS) {
/* Indirectly visible through reflection. */
const float3 diffuse_weight = INTEGRATOR_STATE(state, shadow_path, pass_diffuse_weight);
const float3 glossy_weight = INTEGRATOR_STATE(state, shadow_path, pass_glossy_weight);
/* Glossy */
const int glossy_pass_offset = ((INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_glossy_direct :
kernel_data.film.pass_glossy_indirect);
if (glossy_pass_offset != PASS_UNUSED) {
kernel_write_pass_float3(buffer + glossy_pass_offset, glossy_weight * contribution);
}
/* Transmission */
const int transmission_pass_offset = ((INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_transmission_direct :
kernel_data.film.pass_transmission_indirect);
if (transmission_pass_offset != PASS_UNUSED) {
/* Transmission is what remains if not diffuse and glossy, not stored explicitly to save
* GPU memory. */
const float3 transmission_weight = one_float3() - diffuse_weight - glossy_weight;
kernel_write_pass_float3(buffer + transmission_pass_offset,
transmission_weight * contribution);
}
/* Reconstruct diffuse subset of throughput. */
pass_offset = (INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_diffuse_direct :
kernel_data.film.pass_diffuse_indirect;
if (pass_offset != PASS_UNUSED) {
contribution *= diffuse_weight;
}
}
else if (path_flag & PATH_RAY_VOLUME_PASS) {
/* Indirectly visible through volume. */
pass_offset = (INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_volume_direct :
kernel_data.film.pass_volume_indirect;
}
/* Reconstruct diffuse subset of throughput. */
pass_offset = (INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_diffuse_direct :
kernel_data.film.pass_diffuse_indirect;
/* Single write call for GPU coherence. */
if (pass_offset != PASS_UNUSED) {
contribution *= INTEGRATOR_STATE(state, shadow_path, diffuse_glossy_ratio);
kernel_write_pass_float3(buffer + pass_offset, contribution);
}
}
else if (path_flag & PATH_RAY_VOLUME_PASS) {
/* Indirectly visible through volume. */
pass_offset = (INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_volume_direct :
kernel_data.film.pass_volume_indirect;
}
/* Single write call for GPU coherence. */
if (pass_offset != PASS_UNUSED) {
kernel_write_pass_float3(buffer + pass_offset, contribution);
}
/* Write shadow pass. */
if (kernel_data.film.pass_shadow != PASS_UNUSED && (path_flag & PATH_RAY_SHADOW_FOR_LIGHT) &&
(path_flag & PATH_RAY_CAMERA)) {
(path_flag & PATH_RAY_TRANSPARENT_BACKGROUND)) {
const float3 unshadowed_throughput = INTEGRATOR_STATE(
state, shadow_path, unshadowed_throughput);
const float3 shadowed_throughput = INTEGRATOR_STATE(state, shadow_path, throughput);
@@ -519,7 +553,7 @@ ccl_device_inline void kernel_accum_background(KernelGlobals kg,
const bool is_transparent_background_ray,
ccl_global float *ccl_restrict render_buffer)
{
float3 contribution = INTEGRATOR_STATE(state, path, throughput) * L;
float3 contribution = float3(INTEGRATOR_STATE(state, path, throughput)) * L;
kernel_accum_clamp(kg, &contribution, INTEGRATOR_STATE(state, path, bounce) - 1);
ccl_global float *buffer = kernel_accum_pixel_render_buffer(kg, state, render_buffer);

View File

@@ -160,40 +160,6 @@ ccl_device_forceinline void kernel_write_denoising_features_volume(KernelGlobals
}
#endif /* __DENOISING_FEATURES__ */
#ifdef __SHADOW_CATCHER__
/* Write shadow catcher passes on a bounce from the shadow catcher object. */
ccl_device_forceinline void kernel_write_shadow_catcher_bounce_data(
KernelGlobals kg,
IntegratorState state,
ccl_private const ShaderData *sd,
ccl_global float *ccl_restrict render_buffer)
{
if (!kernel_data.integrator.has_shadow_catcher) {
return;
}
kernel_assert(kernel_data.film.pass_shadow_catcher_sample_count != PASS_UNUSED);
kernel_assert(kernel_data.film.pass_shadow_catcher_matte != PASS_UNUSED);
if (!kernel_shadow_catcher_is_path_split_bounce(kg, state, sd->object_flag)) {
return;
}
ccl_global float *buffer = kernel_pass_pixel_render_buffer(kg, state, render_buffer);
/* Count sample for the shadow catcher object. */
kernel_write_pass_float(buffer + kernel_data.film.pass_shadow_catcher_sample_count, 1.0f);
/* Since the split is done, the sample does not contribute to the matte, so accumulate it as
* transparency to the matte. */
const float3 throughput = INTEGRATOR_STATE(state, path, throughput);
kernel_write_pass_float(buffer + kernel_data.film.pass_shadow_catcher_matte + 3,
average(throughput));
}
#endif /* __SHADOW_CATCHER__ */
ccl_device_inline size_t kernel_write_id_pass(ccl_global float *ccl_restrict buffer,
size_t depth,
float id,
@@ -211,7 +177,7 @@ ccl_device_inline void kernel_write_data_passes(KernelGlobals kg,
#ifdef __PASSES__
const uint32_t path_flag = INTEGRATOR_STATE(state, path, flag);
if (!(path_flag & PATH_RAY_CAMERA)) {
if (!(path_flag & PATH_RAY_TRANSPARENT_BACKGROUND)) {
return;
}

View File

@@ -27,7 +27,12 @@ CCL_NAMESPACE_BEGIN
* Lookup of attributes is different between OSL and SVM, as OSL is ustring
* based while for SVM we use integer ids. */
ccl_device_inline uint subd_triangle_patch(KernelGlobals kg, ccl_private const ShaderData *sd);
/* Patch index for triangle, -1 if not subdivision triangle */
ccl_device_inline uint subd_triangle_patch(KernelGlobals kg, ccl_private const ShaderData *sd)
{
return (sd->prim != PRIM_NONE) ? kernel_tex_fetch(__tri_patch, sd->prim) : ~0;
}
ccl_device_inline uint attribute_primitive_type(KernelGlobals kg, ccl_private const ShaderData *sd)
{
@@ -106,9 +111,9 @@ ccl_device Transform primitive_attribute_matrix(KernelGlobals kg,
{
Transform tfm;
tfm.x = kernel_tex_fetch(__attributes_float3, desc.offset + 0);
tfm.y = kernel_tex_fetch(__attributes_float3, desc.offset + 1);
tfm.z = kernel_tex_fetch(__attributes_float3, desc.offset + 2);
tfm.x = kernel_tex_fetch(__attributes_float4, desc.offset + 0);
tfm.y = kernel_tex_fetch(__attributes_float4, desc.offset + 1);
tfm.z = kernel_tex_fetch(__attributes_float4, desc.offset + 2);
return tfm;
}

View File

@@ -126,8 +126,8 @@ ccl_device float3 curve_attribute_float3(KernelGlobals kg,
int k0 = curve.first_key + PRIMITIVE_UNPACK_SEGMENT(sd->type);
int k1 = k0 + 1;
float3 f0 = float4_to_float3(kernel_tex_fetch(__attributes_float3, desc.offset + k0));
float3 f1 = float4_to_float3(kernel_tex_fetch(__attributes_float3, desc.offset + k1));
float3 f0 = kernel_tex_fetch(__attributes_float3, desc.offset + k0);
float3 f1 = kernel_tex_fetch(__attributes_float3, desc.offset + k1);
# ifdef __RAY_DIFFERENTIALS__
if (dx)
@@ -149,7 +149,7 @@ ccl_device float3 curve_attribute_float3(KernelGlobals kg,
if (desc.element & (ATTR_ELEMENT_CURVE | ATTR_ELEMENT_OBJECT | ATTR_ELEMENT_MESH)) {
const int offset = (desc.element == ATTR_ELEMENT_CURVE) ? desc.offset + sd->prim :
desc.offset;
return float4_to_float3(kernel_tex_fetch(__attributes_float3, offset));
return kernel_tex_fetch(__attributes_float3, offset);
}
else {
return make_float3(0.0f, 0.0f, 0.0f);
@@ -168,8 +168,8 @@ ccl_device float4 curve_attribute_float4(KernelGlobals kg,
int k0 = curve.first_key + PRIMITIVE_UNPACK_SEGMENT(sd->type);
int k1 = k0 + 1;
float4 f0 = kernel_tex_fetch(__attributes_float3, desc.offset + k0);
float4 f1 = kernel_tex_fetch(__attributes_float3, desc.offset + k1);
float4 f0 = kernel_tex_fetch(__attributes_float4, desc.offset + k0);
float4 f1 = kernel_tex_fetch(__attributes_float4, desc.offset + k1);
# ifdef __RAY_DIFFERENTIALS__
if (dx)
@@ -191,7 +191,7 @@ ccl_device float4 curve_attribute_float4(KernelGlobals kg,
if (desc.element & (ATTR_ELEMENT_CURVE | ATTR_ELEMENT_OBJECT | ATTR_ELEMENT_MESH)) {
const int offset = (desc.element == ATTR_ELEMENT_CURVE) ? desc.offset + sd->prim :
desc.offset;
return kernel_tex_fetch(__attributes_float3, offset);
return kernel_tex_fetch(__attributes_float4, offset);
}
else {
return make_float4(0.0f, 0.0f, 0.0f, 0.0f);

View File

@@ -48,8 +48,8 @@ ccl_device_inline void motion_curve_keys_for_step_linear(KernelGlobals kg,
offset += step * numkeys;
keys[0] = kernel_tex_fetch(__attributes_float3, offset + k0);
keys[1] = kernel_tex_fetch(__attributes_float3, offset + k1);
keys[0] = kernel_tex_fetch(__attributes_float4, offset + k0);
keys[1] = kernel_tex_fetch(__attributes_float4, offset + k1);
}
}
@@ -106,10 +106,10 @@ ccl_device_inline void motion_curve_keys_for_step(KernelGlobals kg,
offset += step * numkeys;
keys[0] = kernel_tex_fetch(__attributes_float3, offset + k0);
keys[1] = kernel_tex_fetch(__attributes_float3, offset + k1);
keys[2] = kernel_tex_fetch(__attributes_float3, offset + k2);
keys[3] = kernel_tex_fetch(__attributes_float3, offset + k3);
keys[0] = kernel_tex_fetch(__attributes_float4, offset + k0);
keys[1] = kernel_tex_fetch(__attributes_float4, offset + k1);
keys[2] = kernel_tex_fetch(__attributes_float4, offset + k2);
keys[3] = kernel_tex_fetch(__attributes_float4, offset + k3);
}
}

Some files were not shown because too many files have changed in this diff Show More