Compare commits

..

336 Commits

Author SHA1 Message Date
b40787115b GHOST: Device discovery API.
Still wip.
2021-11-23 15:29:11 +01:00
fc502a8710 Merge branch 'temp-ghost-vulkan-backend' into tmp-vulkan 2021-11-23 14:01:01 +01:00
d944b969f0 Merge branch 'master' into temp-ghost-vulkan-backend 2021-11-23 13:23:50 +01:00
cf299bee80 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-23 13:02:25 +01:00
3e65bb86f9 Cleanup: fix clang-tidy warnings
Fix clang-tidy warnings introduced by a recent commit on the release
branch.

No functional changes.
2021-11-23 13:02:00 +01:00
f392ce50c4 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-23 12:39:26 +01:00
cd2849c89b Asset Browser: add operator for installing asset bundles
Add an operator "Copy to Asset Library" for installing asset bundles
into already-existing asset libraries.

The operator is shown when:
- the "Current File" library is selected,
- the blend file name matches `*_bundle.blend`, and
- the file is not already located in an asset library.

The user can select a target asset library, then gets a "Save As"
dialogue box to select where in that library the file should be saved.
This allows for renaming, saving in a subdirectory, etc.

The Asset Catalogs from the asset bundle are merged into the target
asset library.

The operator will refuse to run when external files are referenced. This
is not done in its poll function, as it's quite an extensive operator
(it loops over all ID datablocks).

Reviewed by: Severin

Differential Revision: https://developer.blender.org/D13312
2021-11-23 12:29:44 +01:00
605cdc4346 BKE LibLink/Append: Add mechanism for external code to loop over link/append context items.
Will be required for python's `bpy.data.libraries.load()` refactor.
2021-11-23 12:18:37 +01:00
0452a04f1a BKE link/append: Add optional blendfile handle to libraries.
This enables calling code to deal with the blendfile handle themselves,
BKE_blendfile_link then just borrows, uses this handle and does not
release it.

Needed e.g. for python's libcontext system to use new
BKE_blendfile_link_append code.

Part of T91414: Unify link/append between WM operators and BPY context
manager API, and cleanup usages of `BKE_library_make_local`.
2021-11-23 12:17:16 +01:00
6c8f73b220 Cleanup: remove unnecessary "extern" 2021-11-23 12:05:50 +01:00
b9b98448a6 Cleanup: add missing pragma once 2021-11-23 12:00:11 +01:00
fbb4a7eb43 BKE link/append: Add optional blendfile handle to libraries.
This enables calling code to deal with the blendfile handle themselves,
BKE_blendfile_link then just borrows, uses this handle and does not
release it.

Needed e.g. for python's libcontext system to use new
BKE_blendfile_link_append code.

Part of T91414: Unify link/append between WM operators and BPY context
manager API, and cleanup usages of `BKE_library_make_local`.
2021-11-23 10:38:51 +01:00
f657356062 Merge branch 'blender-v3.0-release' 2021-11-23 09:44:04 +01:00
b02ac2d8be Fix T93092: incomplete animation rendering of multi-layer exr composition
This was broken by rB0c3b215e7d5456878b155d13440864f49ad1f230.
The caching of loaded exr files needed some special treatment.

Differential Revision: https://developer.blender.org/D13313
2021-11-23 09:43:00 +01:00
84be741329 Fix T92654: missing padding in spreadsheet cells
This was probably broken by rB5c2330203e11e0d916960218b07d88d2193bf526.
For now just add the padding back in a spreadsheet specific way.

Differential Revision: https://developer.blender.org/D13315
2021-11-23 09:35:47 +01:00
6987060f70 Fix T93090: crash with data transfer modifier and geometry nodes
There was a missing normals layer that was requested by the data transfer
modifier from the target object. The normal layer was correctly added to
the target object. However, it never reached the data transfer modifier
because the mesh was copied in `BKE_object_get_evaluated_mesh`
(in the call to `get_mesh_for_write`) and the copy does not include the normals
layer.

The solution is to not use `get_mesh_for_write` here which was only used
because `BKE_object_get_evaluated_mesh` returns a non-const `Mesh *`.
Mid term, it should actually return a `const Mesh *` to avoid the confusion.

Differential Revision: https://developer.blender.org/D13319
2021-11-23 09:32:12 +01:00
28870a8f89 Cleanup: Use new CollectionRef::empty() method
Use the new CollectionRef::empty() method in all locations where
appropriate.

Differential Revision: https://developer.blender.org/D13276
2021-11-22 23:47:26 -08:00
59754ef0b2 Cleanup: silent clang-tidy warning NULL vs nullptr. 2021-11-23 08:39:26 +01:00
Peter Fog
34370d9fdf VSE: Add drag and drop handler for preview area
For some users, dropping assets into preview area may be more practical
due to space constraints or it may be just more intuitive.

Reviewed By: ISS

Differential Revision: https://developer.blender.org/D13311
2021-11-23 05:40:08 +01:00
Peter Fog
b42494cf6b VSE: Support drag and drop for datablocks
For using the Outliner and/or the Asset Browser as scene independent
tools to organize a/v source material is necessary for the users to be
able to drag and drop data blocks into the VSE. This was also an
unfulfilled design target for the Outliner Gsoc project.

Datablocks won't be used directly. Path to file will be passed to strip
add operator instead.

Reviewed By: ISS

Differential Revision: https://developer.blender.org/D13304
2021-11-23 05:39:10 +01:00
d1a4e043bd Merge branch 'blender-v3.0-release' 2021-11-23 00:57:15 +01:00
Sayak Biswas
f749506163 Fix T93244: Cycles HIP not working with multi GPU rendering
Use the correct device function (hipDeviceGet) for multi GPU setups, instead
of hipGetDevice which just returns the default device.

Differential Revision: https://developer.blender.org/D13323
2021-11-23 00:55:56 +01:00
8600d4491f Fix: Const warning in editmesh_knife.c
Fixes a warning caused by freeing a const pointer.
This commit removes the const modifier.

Differential Revision: https://developer.blender.org/D13321
2021-11-22 23:40:21 +01:00
456d5e14b8 Merge branch 'blender-v3.0-release' 2021-11-22 21:26:39 +01:00
481f032f5c Assets: Generate light preview when making light object an asset
We already supported previews for lights, just didn't actually use them
when making a light object an asset. They were only used when making the
light data itself an asset.
2021-11-22 21:14:11 +01:00
34615cd269 Fix grayed out preview generation button for light objects
Mistake in e7bea3fb6e.

We should only skip preview generation for objects that don't support
preview rendering, not completely forbid accessing preview data of such
IDs.
2021-11-22 21:12:36 +01:00
48c2b4012f Merge branch 'blender-v3.0-release' 2021-11-22 21:06:10 +01:00
29681f186e Fix T93283: Cycles render error with CUDA CPU + GPU after recent optimization
BVH2 triangle intersection was broken on the GPU since packed floats can't
be loaded directly into SSE. The better long term solution for performance
would be to build a BVH2 for GPU and Embree for CPU, similar to what we do
for OptiX.
2021-11-22 21:02:46 +01:00
e2b736aa40 Fix part of T93278: transparent glass option not working with environment pass 2021-11-22 20:58:09 +01:00
Takahiro Shizuki
73b1ad1920 IME: Fix Multi-Window Duplicated First Character
Fix problem with duplicated initial character when initiating or
switching to new windows. This is done by updating our copies of state
and modes from the new window when it receives WM_IME_SETCONTEXT
message. This problem and fix are only for the Windows platform.
2021-11-22 10:44:34 -08:00
bfff9ca5f1 Fix confusing new Cycles UI terminology
* Rename "Auto Tiles" to "Use Tiling", it's not really automatic and
  confusing with the old auto tile size add-on.
* Rename "Adaptive" scrambling distance to "Automatic", to avoid confusion
  with adaptive sampling.
2021-11-22 10:44:34 -08:00
Takahiro Shizuki
ee0277271c IME: Fix Multi-Window Duplicated First Character
Fix problem with duplicated initial character when initiating or
switching to new windows. This is done by updating our copies of state
and modes from the new window when it receives WM_IME_SETCONTEXT
message. This problem and fix are only for the Windows platform.
2021-11-22 10:41:11 -08:00
cc6bcb53b2 Geometry Nodes: Rename legacy node files
This will be useful to solve some issues with unity builds,
which compiles different files together to improve build times.
2021-11-22 13:16:54 -05:00
5ad4ca4e02 CurveEval: Add total_length() and total_control_point_size() methods
Add the following methods to the CurveEval class:
total_length() : returns the total length of the curve without needing to
                    allocate a new array
total_control_point_size() : returns the total number of control points without
                    needing to allocate a new array
2021-11-22 12:07:59 -06:00
8a84a61f6b Fix confusing new Cycles UI terminology
* Rename "Auto Tiles" to "Use Tiling", it's not really automatic and
  confusing with the old auto tile size add-on.
* Rename "Adaptive" scrambling distance to "Automatic", to avoid confusion
  with adaptive sampling.
2021-11-22 18:37:47 +01:00
0129178376 UI: Use a map for block name lookups
Use a map to speed up search for UI block names.
Time to redraw the node editor was decreased from
around 75-120ms to 40-70ms in a tree with many
Geometry Nodes.

Differential Revision: https://developer.blender.org/D13225
2021-11-22 18:23:54 +01:00
55c82d8380 Fix T84493 issue with selection after boolean.
According to Blender selection rules, selections should be flushed
to containing elements. Added an EDMB_select_flush() after edit
mode booleans or intersects are done. Hopefully this doesn't break
any scripts that might have been depending on the old (broken) behavior.
2021-11-22 11:47:46 -05:00
1706bf7780 Merge branch 'blender-v3.0-release' 2021-11-22 17:32:23 +01:00
336ca6796a Fix T90308: Cycles crash copying memory from device to host
Happens when device runs out of memory and Cycles is moving some
textures to the host memory.

The delayed memory free for OptiX BVH was moving data from one
device_memory to another, leaving the original device memory in
an invalid state. This was ruining the allocation map in the CUDA
device which is using pointer to the device_memory.

This change makes it so the memory pointer is stolen from BVH
into the delayed memory free list.

Additionally, forbid copying and moving instances of device_memory
and added sanity checks in the device implementation.

Differential Revision: https://developer.blender.org/D13316
2021-11-22 17:26:59 +01:00
25c83c217b Cleanup: Clang-format of the HIP device implementation 2021-11-22 17:26:52 +01:00
059da44fbc BKE Link/Append: Use BLO's LibraryLink_Params.
This allows to reduce signature of several functions, and make it eaiser
to integrate more higher-level usages later on.

This should be a non-behavioral-change commit.

Part of T91414: Unify link/append between WM operators and BPY context
manager API, and cleanup usages of `BKE_library_make_local`.
2021-11-22 17:14:21 +01:00
d7cf7d4048 Link/Append: Move main linking code (from WM) to use new instantiation code in BKE.
This removes the last main usage of BLO's instantiation code.

Also required some limited refactoring of BKE_blendfile_link_append's
instantiation to make it more modular, and usable by both linking and
appending code paths.

NOTE: This should be a non-behavioral change commit.

Part of T91414: Unify link/append between WM operators and BPY context
manager API, and cleanup usages of `BKE_library_make_local`.
2021-11-22 16:55:36 +01:00
fe274d91a1 Link/Append: Move most of core link/append code from WM to new BKE_blendflie_link_append module.
This will allow to expose all those advanced features of the WM
operators to other parts of the code, like the python library context
manager, copy/paste code, etc.

This is expected to be a strictly no-behavioral-change commit.

Part of T91414: Unify link/append between WM operators and BPY context
manager API, and cleanup usages of `BKE_library_make_local`.

Maniphest Tasks: T91414

Differential Revision: https://developer.blender.org/D13222
2021-11-22 16:52:17 +01:00
9c2a4d158c Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-22 16:47:50 +01:00
875f24352a Material Preview: Fix Sphere object (squared UV and poles)
A good sphere preview material has a 1:1 UV ratio (so we see squares as
least distorted as possible), as well as good poles.

Square UV:
The original sphere expected a 2:1 panorama to be mapped to it. This
patch changes that (I scaled Y by 0.5) so that square textures look ok.

Poles:
The original sphere had a low initial resolution, so no ammount of
subdivision would fix the poles.

The sphere has a subdivision modifier with 0 resolution. Later (3.1?) I
want to try to change the resolution on-the-fly based on whether the material
has a displacement map.

Old sphere (1.9K vertices):
{F11845752, size=500px}

New sphere (2.0K vertices):
{F11845710, size=500px}

Differential Revision: https://developer.blender.org/D13309
2021-11-22 16:46:49 +01:00
Jeroen Bakker
0624235900 Moviecache: Fix potential memory corruption.
`IMB_moviecache` is implemented as a singleton. When destructing the
singleton via `IMB_moviecache_destruct` it will not be created anymore
resulting inusage of unallocated memory and potentional memory
corruption.

When running blender this doesn't happen, but when creating images in
test cases the moviecache should be able to be recreated after it is
destroyed.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D13287
2021-11-22 15:52:30 +01:00
be876b8db6 Painting: Performance curve masks.
This patch separates the static-part from the dynamic-part when
generate brush masks. This makes the generation of brush masks 2-5 times
faster depending on the size of the brush.

More improvements can be done, this was just low hanging fruit.
2021-11-22 15:36:03 +01:00
178947dec3 Merge branch 'blender-v3.0-release' 2021-11-22 09:34:51 -05:00
Henrik Dick
819b9bdfa1 Fix T92631: Fix negative thickness regression in complex solidify
This regression was introduced by D11832, but there was problems before
that as well. I seem to have missed it in review. See the differential
revision for a screenshot of the difference.

Differential Revision: https://developer.blender.org/D13216
2021-11-22 09:33:49 -05:00
77ddc6e350 Fix T93250: Crash in spreadsheet tree view after recent commit
It seems that update_from_old assumed there would be an old
tree view available in the old block. This works for the asset browser
because the tree is always drawn, but for the spreadsheet that depends
on having an active object, which isn't necessarily always true.

Differential Revision: https://developer.blender.org/D13301
2021-11-22 08:47:02 -05:00
7e8f9213e9 Merge branch 'blender-v3.0-release' 2021-11-22 13:49:37 +01:00
7b09213f2f Fix T93198: Frame Selected in greasepencil curve editing does not work
Was not taking into account curve points at all.

Maniphest Tasks: T93198

Differential Revision: https://developer.blender.org/D13281
2021-11-22 13:42:00 +01:00
9b6f3d2d0a Cleanup: Removed unused code in curve mask creation.
Generating curve mask for 2d texture painting had some hard-coded
parameters that eventually weren't used in the algorithm (hardness and
rotation of the brush). This patch removes these parameters.
2021-11-22 11:22:58 +01:00
9bbb5f5a6a Painting: migrated curve mask generation to CPP.
Curve mask generation is done during 2d texture painting. There are some
performance issues in this part of the code. Before addressing those we
move the code to CPP.
2021-11-22 10:46:33 +01:00
29f6ec56e6 Cleanup: Make parameter const (BKE_brush_curve_strength_clamped). 2021-11-22 10:46:33 +01:00
0b246ed813 Revert "Fix (unreported) broken handling of constraints reordering with liboverride."
This reverts commit 6eaa69c66c.

Committed by nistake, sorry for the noise.
2021-11-22 10:39:20 +01:00
31864a40ba Cleanup: use simple data member instead of callback
This really doesn't have to be a callback currently, since it is always
the same `CPPType` for a socket type.
2021-11-22 10:18:08 +01:00
c850189adf Cleanup: make naming more consistent 2021-11-22 09:48:36 +01:00
db20837c3a Merge branch 'blender-v3.0-release' 2021-11-22 09:32:59 +01:00
6c16bb2706 Fix broken NLA RNA code after own rBfa6a913ef19c.
Thanks to @scurest for noticing this mistake!
2021-11-22 09:31:59 +01:00
6eaa69c66c Fix (unreported) broken handling of constraints reordering with liboverride.
New drag&drop reordering code would call constraints reordering operator
with the generic context, and not the one from the panel's layout.
missing the "constraint" member which is mandatory for poll function to
properly deal with override vs. local constraints.

This commit fixes it by generating a temp bContextStore in the panel
re-ordering callback.

NOTE: this fix will have to be extended to modifiers (which happen to
work currently because they have an 'active' status), and gpencil
modifiers (which are also broken currently).

Differential Revision: https://developer.blender.org/D13291
2021-11-22 09:28:27 +01:00
fb470c256a Fix T93256: Instances to points node broken after recent commit
When 97533eede4 added the instance domain, it didn't change
the domain that instance attributes are read from in this node.
2021-11-21 11:37:35 -05:00
873f6148ad Functions: remove test for dynamic name
This was broken in rB6ee2abde82ef121cd6e927995053ac33afdbb438.
2021-11-21 13:08:23 +01:00
940e6525c7 Functions: fix compile error in tests 2021-11-21 13:06:05 +01:00
15011e0b70 Functions: use static string for parameter names
The idea behind this change is the same as in
rB6ee2abde82ef121cd6e927995053ac33afdbb438.

A `MultiFunction::debug_parameter_name` method could be
added separately when necessary.
2021-11-21 12:57:34 +01:00
6ee2abde82 Functions: use static names for multi-functions
Previously, the function names were stored in `std::string` and were often
created dynamically (especially when the function just output a constant).
This resulted in a lot of overhead.

Now the function name is just a `const char *` that should be statically
allocated. This is good enough for the majority of cases. If a multi-function
needs a more dynamic name, it can override the `MultiFunction::debug_name`
method.

In my test file with >400,000 simple math nodes, the execution time improves from
3s to 1s.
2021-11-21 12:48:07 +01:00
d455eadcd8 Cleanup: Move menu search template to C++
This allows the use of better/more readable data structures,
and will also make some refactors to the search button easier.
The build completed on the buildbot for all platforms.
2021-11-20 16:30:53 -05:00
b3ee9f44cf Merge branch 'blender-v3.0-release' 2021-11-20 18:00:46 +01:00
1b2ee3cf20 Fix T92090: Eevee crash with Intel HD 4000 and macOS 10.15.7
A recent security update to macOS 10.15.7 causes crashes when using Eevee and
various other 3D viewport features. It appears that glGenerateMipmap is
broken, causing a crash whenever its commands are flushed/submitted to the GPU.

Ideally this would be fixed in a driver update, however it's unlikely this will
happen. Earlier macOS versions have been receiving security updates for 2 years,
and that window has just passed for 10.15. Further, computers with these GPUs
can't upgrade to a newer macOS version.

As a workaround, disable mipmaps on these GPUs, by setting the mipmap max level
to 0 and not calling glGenerateMipmaps. Effects like depth of field also use
mipmaps, but fill in the mip levels by other means. In those cases we keep the
mipmap level.

Differential Revision: https://developer.blender.org/D13295
2021-11-20 17:50:05 +01:00
e949ac0bfc Cleanup: unset 'FILE_ENTRY_PREVIEW_LOADING' at the end
Although this function only runs on the main thread, it seems safer to
clear the flag only after setting the result.
2021-11-20 10:34:29 -03:00
59ffe1c5b1 Merge branch 'blender-v3.0-release' 2021-11-20 14:05:26 +01:00
Sayak Biswas
f2bb42a095 Fix T92984: Cycles HIP crash with smoke volumes
This fixes the the app crash happening when trying to render smoke as a dense
3D texture. The changes are related to matching up hipew with the actual HIP
headers.

Differential Revision: https://developer.blender.org/D13296
2021-11-20 14:02:38 +01:00
3d447b6335 Merge branch 'blender-v3.0-release' 2021-11-20 12:47:31 +01:00
b20997cb34 Fix T93194: greasepencil channel lists ignoring collection visibility
Same fix as rB0a3b4d4c64f1, but this time for greasepencil.

To repeat: dopesheet in greasepencil mode was ignoring the temporariy
visibility flag of collections. As a result, even though the dopesheet
was supposed to show animation data of visible greasepencils only was
still showing such data of greasepencils that were hidden by hiding
their collection.
2021-11-20 12:31:33 +01:00
3baaab15fc Cleanup: Else after return 2021-11-19 23:48:51 -05:00
411261fb32 Merge branch 'blender-v3.0-release' 2021-11-20 01:45:06 -03:00
092df87534 Fix error in rBfb0ea9
There is no need to multiply the "dash_width" by `UM.pixel size` since the "viewport_size" is already being divided by the DPI.

Ref {rBfb0ea9}
2021-11-20 01:44:35 -03:00
15ecd47b96 Geometry Nodes: Instance attributes in Transfer/Capture nodes
Updates the Transfer Attributes and Capture Attributes nodes
to support attributes from instances.

Differential Revision: https://developer.blender.org/D13292
2021-11-19 23:40:08 +01:00
01df48a983 Refactor: Port spreadsheet data set to UI tree view
This patch removes a bunch of specific code for drawing the spreadsheet
data set region, which was an overly specific solution for a generic UI.
Nowadays, the UI tree view API used for asset browser catalogs is a much
better way to implement this behavior.

To make this possible, the tree view API is extended in a few ways.
Collapsibility can now be turned off, and whether an item should
be active is moved to a separate virtual function.

The only visual change is that the items are now drawn in a box,
just like the asset catalog.

Differential Revision: https://developer.blender.org/D13198
2021-11-19 17:36:11 -05:00
a0780ad625 Fix: Exception in tree view code in new case
This isn't a problem in 3.0 or master, but I'm porting the spreadsheet
data set region to a tree view and ran into this. This line needs to
whether the function is empty before calling it.

Differential Revision: https://developer.blender.org/D13197
2021-11-19 17:28:08 -05:00
cfbc9df60e Cleanup: Use canonical variable name
All `ARegion` variables should be called `region` unless there is
a good reason not to, since that is the convention.
2021-11-19 16:10:37 -05:00
217d0a1524 Cleanup: Use vector instead of linked list 2021-11-19 15:57:31 -05:00
51a7961e09 Cleanup: Simplify node editor link dragging storage
Now that `node_intern.hh` is a C++ header, we can use C++ types
there. This patch replaces the linked list of dragged links with a
vector. Also, the list of drag operator custom data, `nldrag`, doesn't
seem to need to be a list at all, so I just made it a unique pointer.

Differential Revision: https://developer.blender.org/D13252
2021-11-19 15:36:32 -05:00
c3fed4d463 Merge branch 'blender-v3.0-release' 2021-11-19 16:57:33 -03:00
fb0ea94c63 Fix T85855: F-curves too thin on Mac
Use the `GPU_SHADER_3D_POLYLINE_UNIFORM_COLOR` shader instead of `GPU_SHADER_2D_LINE_DASHED_UNIFORM_COLOR`.
This is just a partial solution as "protected" fcurves still use the dashed shader.

Differential Revision: https://developer.blender.org/D13290
2021-11-19 16:57:05 -03:00
50ad0e15fe Merge branch 'blender-v3.0-release' 2021-11-19 19:24:55 +01:00
00e4d665f4 Fix T91838 Crash when toggling edit mode on object with geometry node modifier, but only if the instanced objects material has a normal map assigned.
This is only a workaround to avoid the crash. The underlying issue is left
unfixed.

New report for tracking the underlying issue is T93223.
2021-11-19 19:24:22 +01:00
fabd088067 Update splash for Blender 3.x development series
CC-BY Blender Studio https://studio.blender.org

Update the splash artwork for the daily builds (`master`) to
celebrate the beginning of a new major series in Blender.

The badges of the Development Fund membership levels have been removed for
a simpler design, and the font matches the one used elsewhere in blender.org
2021-11-19 18:39:14 +01:00
1222c45544 Fix: Use the instances domain in instance transform nodes
Instance attributes exist on the instance domain after rB97533eede44421,
so these fields should be evaluated on that domain.
2021-11-19 12:26:48 -05:00
ba8dd0f24f Spreadsheet: Support instances component viewer node columns
After rB97533eede444217b, instances have their own attribute domain,
but the spreadsheet code worked under the assumption that the component
used the point domain. Old files have to re-select the instances data
source to make it properly active
2021-11-19 12:21:54 -05:00
06a2e2b28c Merge branch 'blender-v3.0-release' 2021-11-19 18:05:17 +01:00
ef687bd7c2 Merge branch 'blender-v3.0-release' 2021-11-19 12:00:35 -05:00
97533eede4 Geometry Nodes: Support custom instance attributes
Adds an attribute provider for instance attributes.
A new domain `ATTR_DOMAIN_INSTANCE` is implemented.
Instance attributes are not yet realized correctly.

Differential Revision: D13149
2021-11-19 17:53:48 +01:00
1b686c60b5 Fix T93046: Cycles world volume rendering very slow in OptiX with some scenes
With very long ray distance, OptiX ends up traversing many BVH nodes due to
a feature that improves precision. However this causes very slow rendering.

We now avoid generating such long rays by rejecting the few samples that have
long ray distances and very low probability of being generated. This should not
meaningfully affect render results.

Thanks to Sergey and Patrick for the investigation.
2021-11-19 17:42:22 +01:00
0f1a200a67 Fix T92682: EEVEE motion blur crash with curve objects
After rBb9febb54a492, the evaluated mesh from a curve is now presented
to render engines as a separate mesh object, but some code still assumed
that a curve object itself could have an evaluated mesh. However, this is
still true for surface objects and metaballs, which don't
use geometry sets yet.

Differential Revision: https://developer.blender.org/D13272
2021-11-19 11:36:29 -05:00
9e3a913b35 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-19 16:39:28 +01:00
1a1ddcb5e2 Asset Browser: don't display linked-in asset datablocks
Datablocks marked as asset, linked from another file, were shown in the
"Current File" asset library. This is now resolved.
2021-11-19 16:29:25 +01:00
ec71054a9b Merge branch 'blender-v3.0-release'
Conflicts:
	source/blender/blenkernel/BKE_blender_version.h
	source/blender/blenloader/intern/versioning_300.c
2021-11-19 16:10:28 +01:00
06ead314b6 Asset Preferences: disallow single file as asset library
Asset libraries should be directories on disk. By manually entering a
file path it was possible to have a single blend file as asset library,
but that was not a designed-for situation, and it doesn't play well
with the asset catalog system.
2021-11-19 16:08:55 +01:00
330290d2a8 Cleanup: typos in comments 2021-11-19 15:55:47 +01:00
33c5e7bcd5 LibOverrides: Refactor how diffing of RNA collections is handled.
Original implementation was a quick prototype which should have never
landed as-is in master. It had very limiting constraints and did not
allow for any real further development.

This commit fixes the internal implementation to make more sensible,
maintainable and evolutive.

NOTE: This commit introduces another forward-incompatibility in the
Blender file format: Files saved after this commit won't open properly
in older versions of blender regarding local inserted constraints or
modifiers into overrides of linked data.

NOTE: Technical details: The 'anchor' item name/index is now stored in
`subitem_reference_` members, and the actual 'source' item name/index is
stored in `subitem_local_` members of the override property operation
data.
Previously, only the `subitem_local_` members were used, storing the
anchor item name/index, and assuming the 'source' item was always the
next in the list.

Milestone I of T82160.

Maniphest Tasks: T82160

Differential Revision: https://developer.blender.org/D13282
2021-11-19 15:41:53 +01:00
d6ea881a74 BLI_listbase: Add utils to search from string or index.
If a valid matching string is found, return that item, otherwise
fallback to the item matching the given index, if any.

This will be useful in RNA override code, and potentially other
areas where data in lists can be referenced by their names or indices.
2021-11-19 15:41:36 +01:00
d7aaa145c6 Merge branch 'blender-v3.0-release' 2021-11-19 15:30:54 +01:00
04ec36f677 Fix T87912: use session id instead of name to identify dropped object
The old code did not work when there were multiple ids with
the same name (which can happen when ids are linked in).
The solution is to use the session ids instead. Those are different
even when two ids have the same name.

Differential Revision: https://developer.blender.org/D11116
2021-11-19 15:28:44 +01:00
0852805ed7 Merge branch 'blender-v3.0-release' 2021-11-19 15:58:37 +02:00
a20e703d1a Fix T93184: Link color not used for custom sockets
D13044 allowed the link color overlay to be used with custom sockets.
This no longer works due to a condition that checks if the socket is
standard or not, which was in place to avoid bad indexing of the
std_node_socket_colors array. Since that array is no longer used, this
condition needs to be removed.

Differential Revision: https://developer.blender.org/D13274

Reviewed By: Hans Goudey
2021-11-19 15:55:58 +02:00
06691d1b21 Tests: disable Cycles volume test when WITH_MOD_FLUID is off 2021-11-19 13:21:48 +01:00
1b94c53aa6 Cleanup: fix typos in comments and docs
Contributed by luzpaz.

Differential Revision: https://developer.blender.org/D10447
2021-11-19 13:02:16 +01:00
7d5ef64bfb Cleanup: fix typos in comments and docs
Contributed by luzpaz.

Differential Revision: https://developer.blender.org/D13264
2021-11-19 12:46:49 +01:00
fa6a913ef1 LibOverride: Add read-only flags accessors for 'local override' status.
Constraints, modifiers and NLA tracks can now report from RNA whether
they are defined as comming from the override's reference linked data,
or are local to the override.
2021-11-19 12:09:28 +01:00
83e245023c Fix (unreported) wrong behavior of constraints in liboverrides.
All constraints were 'made local', including the ones comming from the
reference linked object.
2021-11-19 12:09:28 +01:00
eb071c9ff4 Merge branch 'blender-v3.0-release' 2021-11-19 10:16:30 +01:00
de3fda29c7 Fix T93054: crash when deleting a missed linked file
This is a bit similar to rBb7260ca4c9f4b7618c9c214f1270e31d6ed9886b.
Sometimes a group node may not reference a node group
because it was linked and can't be found.
2021-11-19 10:15:58 +01:00
48e64a5fb5 Merge branch 'blender-v3.0-release' 2021-11-19 10:09:29 +01:00
4ea6b4ba84 Fix crash in VSE versioning code from recent commit
Caused by {rB4d09a692e22a}.
Greenlit by @sergey in chat.
2021-11-19 10:05:18 +01:00
992634427e Nodes: add bf_nodes_geometry library
Separating geometry nodes into a new library will make it
easier to improve compile times with features like unity
builds and precompiled headers.

Differential Revision: https://developer.blender.org/D13261
2021-11-19 09:09:14 +01:00
b8dc845e57 Merge branch 'blender-v3.0-release' 2021-11-19 06:22:47 +01:00
4d09a692e2 Fix T92847: Meta-strip corrupt
Offsets for meta strip were invalid. No steps to reproduce the issue are
available, but it is quite possible that there are files with incorrect
state after issues with meta strips were fixed.

Ensure correct offsets for meta strips in versioning code.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D13257
2021-11-19 06:12:09 +01:00
5ed3a5d023 Cleanup: Add an empty() method to RNA's CollectionRef class
The existing RNA CollectionRef class only offers a length() operation
which is sometimes used for checking if the collection is empty. This is
inefficient for certain collection types which do not have a native
length member; the entire list is iterated to find the count.

This patch creates an explicit empty() method to be used in such cases
for better semantics. Additionally, many collection types will behave
more efficiently when using the new method instead of checking length.

Making use of the new method will follow separately.

Differential Revision: https://developer.blender.org/D12314
2021-11-18 14:32:56 -08:00
3f288e9414 Node Editor: Change minimum zoom
Now with Geometry Nodes in Blender,
trees can become much bigger than before.
This changes the minimum zoom value in the node editor.

Differential Revision: https://developer.blender.org/D13254
2021-11-18 21:24:55 +01:00
b2d37c35d0 Node Editor: Skip socket drawing on low zoom
Socket drawing can be heavy with many nodes.
This patch skips drawing them on scale < 0.2
when they are barely visible anyway.

Differential Revision: https://developer.blender.org/D13255
2021-11-18 21:21:10 +01:00
167ee8f2c7 Merge branch 'blender-v3.0-release' 2021-11-18 19:37:48 +01:00
fd2a155d06 Fix T91797: Cycles volume rendering artifact with overlapping volumes
With the new volume rendering code this was no longer accurate, we always
need to use a new dimension for the next volume segment.
2021-11-18 19:27:37 +01:00
f190f2d267 Revert fixes 2021-11-18 14:19:59 -03:00
0c33411bdd Revert "Revert "Revert "Revert "Allow navigating while transforming""""
This reverts commit 717a971035.
2021-11-18 14:15:08 -03:00
ea42c1a22e Revert "Revert "Revert "Revert "Adjust snap source drawing when adding multiple snap points""""
This reverts commit b8bf40ed4b.
2021-11-18 14:14:57 -03:00
f61a73093b Revert "Revert "Revert "Revert "Transform: interactive mode for editing a 'Snap Source'""""
This reverts commit 701f2dfd5b.
2021-11-18 14:14:51 -03:00
ada6742601 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-18 17:58:26 +01:00
701f2dfd5b Revert "Revert "Revert "Transform: interactive mode for editing a 'Snap Source'"""
This reverts commit 25fa6c74b9.
2021-11-18 13:55:24 -03:00
b8bf40ed4b Revert "Revert "Revert "Adjust snap source drawing when adding multiple snap points"""
This reverts commit c7f9a782aa.
2021-11-18 13:55:21 -03:00
717a971035 Revert "Revert "Revert "Allow navigating while transforming"""
This reverts commit 2a9cfdac7e.
2021-11-18 13:55:17 -03:00
Germano Cavalcante
2a9cfdac7e Revert "Revert "Allow navigating while transforming""
This reverts commit 5e6fdaa07f.
2021-11-18 13:52:39 -03:00
c7f9a782aa Revert "Revert "Adjust snap source drawing when adding multiple snap points""
This reverts commit 77df32548b.
2021-11-18 13:52:30 -03:00
Germano Cavalcante
25fa6c74b9 Revert "Revert "Transform: interactive mode for editing a 'Snap Source'""
This reverts commit 805181bffa.
2021-11-18 13:52:18 -03:00
7c4e4d605c Pose Library: clarify apply/blend operator names
The old names had "{verb} Pose Library Pose"; they are now named
"{verb} Pose Asset", which is shorter & clearer.
2021-11-18 17:52:17 +01:00
3531021d1b Cleanup: Simplify declarations in C++ header
Using `struct` everywhere is unnecessary in C++, and the typedefs
are also unnecessary.
2021-11-18 11:46:44 -05:00
12fc395436 Fix T93152: Cycles baking multiple selected object to active not working 2021-11-18 17:43:27 +01:00
Germano Cavalcante
805181bffa Revert "Transform: interactive mode for editing a 'Snap Source'"
This reverts commit f19bd637e2.
2021-11-18 13:42:45 -03:00
77df32548b Revert "Adjust snap source drawing when adding multiple snap points"
This reverts commit cb3ba68ec4.
2021-11-18 13:42:36 -03:00
Germano Cavalcante
5e6fdaa07f Revert "Allow navigating while transforming"
This reverts commit 1d1855e95f.
2021-11-18 13:42:31 -03:00
Germano Cavalcante
1d1855e95f Allow navigating while transforming
This feature has been desired for some time:
- https://rightclickselect.com/p/ui/Tqbbbc/allow-navigating-while-transforming (See comments);
- D1583;
- T37427;

In short, blocking navigation during transform limits the user to move the object only to visible areas within the screen and hinders the allocation of objects within closed meshes.

The node editor is also impaired because some nodes are far between them and the connectors are too small.

The only disadvantage of this patch (as I see it) is the conflict with the existing key map:
MIDDLEMOUSE:
- enable axis constrain in 3D view;

WHEELDOWNMOUSE, WHEELUPMOUSE, PAGEUPKEY, PAGEDOWNKEY:
- change the threshold of the proportional edit;

So the patch solution was to change these keymaps:
- MIDDLEMOUSE to Alt+MIDDLEMOUSE;
- WHEELDOWNMOUSE, WHEELUPMOUSE, PAGEUPKEY, PAGEDOWNKEY to Alt+(corresponding key);

When you use this new keymap for the first time in the proportional edit, it may seem strange due to the custom of using it (both in View2D and View3D).
But quickly the user gets used to it.

Alternatively we can add an option to the user preferences ([] Allow navigating while transforming). (I'm not much fan of this option).

The patch was done on branch2.8. But maybe it's a good idea to apply it to 2.79

Differential Revision: https://developer.blender.org/D2624
2021-11-18 13:14:18 -03:00
cb3ba68ec4 Adjust snap source drawing when adding multiple snap points 2021-11-18 13:14:18 -03:00
Germano Cavalcante
f19bd637e2 Transform: interactive mode for editing a 'Snap Source'
This patch implements part of what was stated in {T66484}, with respect to `Base Point`.

## Introduction

The snapping feature of the transform tools has a variety of applications:
- Organization of nodes.
- Positioning of frames in precise time units.
- Retopology with snap to face
- Creation of armatures with bone positioning through the snap to volume
- Precise positioning of 3D or 2D objects in the surrounding geometry (CAD modeling)

The goal of this document is to make it more powerful for precision modeling and still supporting the old use cases without extra complexity.
The main topic addressed here is the introduction of a **interactive mode for setting a snap source** (See terminology).

## Terminology

* **Snap Source**: 3d coordinate * we want to snap from. (Currently defined by the `Snap With` options: `Closest`, `Center`, `Median` and `Active`).
* **Snap Target**: 3d coordinate*  we want to snap to. (Vertices, Edges, Faces, Grid...)

## Interactive Mode for Editing a Snap Source

Currently the fixed snap point can only be obtained through the `Snap With` options. So it's a little tricky for the user to define a snap source point having so much geometry on an object.
Because of this, the user needs to resort to impractical solutions to get a point in the geometry.
See example of an impractical use:
{F11714181, layout=left, width=960, alt="The user used the cursor (which can be snapped) to choose the snap origin point."}
The user used the cursor (which can be snapped) to choose the snap source point.

While it is possible to work around this current limitation, it is important to reduce the number of steps and allow the user to set a snap source point through an optional interactive mode during a transformation.

The proposed solution is to be able to move the current snap source point through a modal modifier activated with a key (eg. B).
The snap source point can thus "snap" to the elements in the scene (vertex, mid-edge, Lamp, …) during this mode.
{F9122814, layout=left, width=960, alt="Base Point Snap, example of transform operation via the shortcut (not the tool). After pressing g and the snap base change shortcut (e.g., shift + ctrl) the user set the base point. The base point is then visible until the end of the operation. The z axis constrains the final position."}

## Implementation Details

- The feature will only be available in 3D View.
- The feature will only be available for `Move`, `Rotate` and `Scale` transform modes.
- The snap source editing will be enabled with a single click on the modifier key (B).
- Having a snap point indicated, the new snap origin point will be confirmed with the same buttons that confirms the transformation (but the transformation will not be concluded).
- The snap source editing can be canceled with the same key that activated it (B).
- If the transformation is done with "release_confirm" (common for gizmos), the new feature cannot be enabled.
- During the transformation, when enabling the feature, if the snap option is turned off in the scene, the snap will be forced on throughout the rest of the transformation (unless interactive mode is canceled).
- During a transformation, if no snap target is set for an element in the scene (Vertex, Grid...), the snap targets to geometry Vertex, Edge, Face, Center of Edge and Perpendicular of Edge will be set automatically.
- Snap cannot be turned off during the snap source editing.
- Constraint or similar modification features will not be available during the snap source editing.
- Text input will not be available during the snap source editing.
- When adding multiple snap points (A) the new prone snap source point will be indicated with an "X" drawing.
{F11817267}

Maniphest Tasks: T66484

Differential Revision: https://developer.blender.org/D9415
2021-11-18 13:14:18 -03:00
f0be276514 Fix T93082: Cycles baking not handling transparency correctly
For baking, replace transparent BSDF with holdout for baking. This ensure no
objects behind are baked, and that the baked image has alpha.
2021-11-18 17:13:16 +01:00
ed91e759d1 Fix Cryptomatte accurate option showing for Cycles, only needed for Eevee 2021-11-18 17:13:16 +01:00
67b4eecac9 Cleanup: Use const arguments 2021-11-18 11:03:18 -05:00
Nikhil Shringarpurey
dd31b8bd50 UI: Use full word "Start" instead of "Sta"
Differential Revision: https://developer.blender.org/D13098
2021-11-18 10:45:10 -05:00
5816eb4c56 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-18 16:36:11 +01:00
8d1357ea6b Pose Library: add conversion button to old poselib UI
Add the "Convert Old Pose Library" operator to the old pose library (in
the Armature properties editor). This makes it more discoverable; before
it only was available in the Action editor.
2021-11-18 16:35:56 +01:00
beb9e332ca Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-18 16:21:17 +01:00
31afa1bb9a Asset tags: include partial tag matches
When filtering the asset browser, also include results that have partial
tag matches. So searching for "xite" will include results tagged with
"excited".

This brings the tag filtering in line with other search boxes in
Blender. Later we might want to provide users with more options for
prefix-only ("excite" would match "excited", but "xited" would not) or
only exact matches.
2021-11-18 16:21:00 +01:00
0624acf088 Fix T92442: undo removal of Eevee cryptomatte accurate mode option
The Cycles accurate mode was removed, but the Eevee option for this has
a different meaning and should not have been removed. The Eevee accurate
makes cryptomatte accumulate for every sample, which Cycles has always
done regardless of any option.
2021-11-18 15:59:08 +01:00
b926f54f3c Fix: error when materializing curve point attribute
Differential Revision: https://developer.blender.org/D13271
2021-11-18 15:28:04 +01:00
d1f944c186 Cycles: declare constants at program scope on Metal
MSL requires that constant address space literals be declared at program
scope. This patch moves the `blackbody_table_r/g/b` and `cie_colour_match`
constants into separate files so they can be declared at the appropriate scope.

Ref T92212

Differential Revision: https://developer.blender.org/D13241
2021-11-18 14:38:05 +01:00
d19e35873f Cycles: several small fixes and additions for MSL
This patch contains many small leftover fixes and additions that are
required for Metal-enablement:

- Address space fixes and a few other small compile fixes
- Addition of missing functionality to the Metal adapter headers
- Addition of various scattered `__KERNEL_METAL__` blocks (e.g. for
  atomic support & maths functions)

Ref T92212

Differential Revision: https://developer.blender.org/D13263
2021-11-18 14:38:02 +01:00
c0d52db783 Merge branch 'blender-v3.0-release' 2021-11-18 14:33:43 +01:00
f71813204c Cycles: Don't tile if image area fits into tile area
Previously the check was done based on dimension of image and if any
of dimensions were larger than tile size tiling was used.

This change makes it so that if image does not exceed number of pixels
in the tile no tile will be used. Allows to render widescreen images
without tiling.

Differential Revision: https://developer.blender.org/D13206
2021-11-18 14:27:45 +01:00
3ad2bf1327 Cycles: Fix command line render overshooting time limit
The calculation based on preserving device occupancy was conflicting
with the fact that time limit needs to render less samples at the last
round of render work.

For example, rendering BMW27 for 30sec on i9-11900k was actually
rendering for almost a minute. Now the render time limit is respected
much more close.

Differential Revision: https://developer.blender.org/D13269
2021-11-18 14:27:45 +01:00
bd2e3bb7bd Fix T93045: Cycles HIP not rendering OpenVDB volumes
Build HIP kernels with NanoVDB, and patch NanoVDB to work with HIP.

This is a header only library so no rebuild is needed. The changes are being
submitted upstream to openvdb, so this patch should be temporary.

Thanks Thomas for help testing this.
2021-11-18 13:24:56 +01:00
2b63a76041 Merge branch 'blender-v3.0-release' 2021-11-18 11:53:46 +01:00
e5774282b9 Fix asset preview not showing up for current file data-blocks
For data-blocks from the current file, the image-buffer for dragging
wasn't set at all. This wasn't intentional, dragging things in the Asset
Browser should just always show the preview.
2021-11-18 11:47:21 +01:00
William Leeson
e1a3b697ec Merge branch 'blender-v3.0-release' to pick up D13262. 2021-11-18 09:41:11 +01:00
William Leeson
8c0370ef7b Fix T93102: Principled hair shader stack was built incorrectly
stack_assign_if was used in the middle of creating the shader value blocks.
Which caused stack variables to be inserted in the middle of the shader value data.
This resulted in the shader node data no being in sequential order. This was also
the case for the wave texture wave node.

Reviewed By: brecht

Maniphest Tasks: T93102

Differential Revision: https://developer.blender.org/D13262
2021-11-18 09:31:29 +01:00
Félix
9cf3d841a8 VSE: Add meta.separate() Python API function
This function can be used to "dissolve" meta strip anywhere in strip
hierarchy. This has same effect as `meta_separate` operator.

Reviewed By: ISS

Differential Revision: https://developer.blender.org/T91005
2021-11-18 03:55:48 +01:00
f1f7a8b018 Merge branch 'blender-v3.0-release' 2021-11-18 03:03:13 +01:00
a182b05f07 Fix T90390: VSE Jump operator don't work during playback
Tag dependency graph `ID_RECALC_AUDIO_SEEK` update.
2021-11-18 02:59:50 +01:00
032ab0270d Merge branch 'blender-v3.0-release' 2021-11-18 02:23:17 +01:00
Andrea Beconcini
daaa43232d Fix T92445: Thumbnail height without overlay text
This patch changes the thumbnails' height used for image and movie
strips by choosing the proper size according to the VSE's text overlay
settings: i.e. thumbnails use the whole strip's height when no overlay
text is displayed; otherwise, some space is left for the overlay.

Reviewed By: ISS

Differential Revision: https://developer.blender.org/D13043
2021-11-18 02:14:11 +01:00
ceec400975 Merge branch 'blender-v3.0-release' 2021-11-18 01:40:37 +01:00
d8fd575af9 Fix T93154: Crash adding multiple movie strips
Some when adding multiple movies at once and only some of them have
audio track, this causes crash on NULL dereference. Issue was introduced
in bdbc7e12a0 to align sound and video properly.

Check if sound is present in movie file. If it's not, don't try to align
sound with video.
2021-11-18 01:32:06 +01:00
d6b5251572 Merge branch 'blender-v3.0-release' 2021-11-18 01:13:22 +01:00
b071083496 Fix T93166: Division by zero when drawing thumbnails
Caused by incorrect step calculation fo too short strips. For these,
step should be equal to strip length not 0.
2021-11-18 01:11:51 +01:00
fa7a6d67a8 Fix Cycles CUDA/HIP compiler error after recent changes 2021-11-17 19:56:18 +01:00
Sebastian Herholz
d9bc8f189c Cycles: add build option to enable a debugging feature for MIS
This patch adds a CMake option "WITH_CYCLES_DEBUG" which builds cycles with
a feature that allows debugging/selecting the direct-light sampling strategy.
The same option may later be used to add other debugging features that could
affect performance in release builds.

The three options are:
* Forward path tracing (e.g., via BSDF or phase function)
* Next-event estimation
* Multiple importance sampling combination of the previous two methods

Such a feature is useful for debugging light different sampling, evaluation,
and pdf methods (e.g., for light sources and BSDFs).

Differential Revision: https://developer.blender.org/D13152
2021-11-17 18:03:56 +01:00
063ad8635e Cycles: reduce triangle memory usage with packed_float3
Depends on D13243

Differential Revision: https://developer.blender.org/D13244
2021-11-17 17:29:41 +01:00
9937d5379c Cycles: add packed_float3 type for storage
Introduce a packed_float3 type for smaller storage that is exactly 3
floats, instead of 4. For computation float3 is still used since it can
use SIMD instructions.

Ref T92212

Differential Revision: https://developer.blender.org/D13243
2021-11-17 17:29:41 +01:00
89d5714d8f Build: match GCC and Clang float conversion warnings in Cycles 2021-11-17 17:29:41 +01:00
ea7efa5569 Merge branch 'blender-v3.0-release' 2021-11-17 15:41:32 +01:00
00a9617f92 Fix: wrong assert in geometry nodes evaluator
It only makes sense to check if all required outputs have been computed
if the node was executed at all.
2021-11-17 15:40:53 +01:00
c3422c48ad Cleanup: remove dummy multi function 2021-11-17 12:33:18 +01:00
e5f05bc7a6 Cleanup: Painting - reduce reallocation of same memory.
Curve mask is freed/allocated every time, but could still reuse the
previous allocated buffer when the diameter of the brush doesn't change.
2021-11-17 12:30:42 +01:00
f5dde382af Cleanup: use same function for updating internal links for all nodes
Previously, node types had a callback that creates internal links. Pretty
much all nodes used the same callback though. The exceptions are the
reroute node (which probably shouldn't be mutable anyway) and some
input/output nodes that are not mutable.

Removing the callback helps with D13246, because it makes it easier
to reason about which internal links are created and when they change.
In the future, the internal links should be part of the node declaration.
2021-11-17 11:52:54 +01:00
83a4d51997 Cleanup: Remove unused show_samples() device code in Cycles. 2021-11-17 11:16:48 +01:00
c2ab47e729 Cleanup: change node socket availability in a single place
This cleans up part of the code that still set the flag manually. Also, this
change helps with D13246 because it makes it easier to tag the node
tree as changed when the availability of a socket changed.
2021-11-17 11:11:28 +01:00
51b8e34fb7 LineArt: Improve certain edge cases in occlusion
This patch includes:
View vector fix for ortho back face.
Point on segment logic correction.
Better handling of boundary cases.

See review page for detailed description.

Reviewed By: Sebastian Parborg (zeddb)

Differential Revision: https://developer.blender.org/D13143
2021-11-17 14:31:12 +08:00
473be239c3 LineArt: Improve certain edge cases in occlusion
This patch includes:
View vector fix for ortho back face.
Point on segment logic correction.
Better handling of boundary cases.

See review page for detailed description.

Reviewed By: Sebastian Parborg (zeddb)

Differential Revision: https://developer.blender.org/D13143
2021-11-17 14:30:08 +08:00
e3c974b7e4 Merge branch 'blender-v3.0-release' 2021-11-17 05:43:34 +01:00
6e6123b40f Fix T93080: Crash on scrubbing with snapping
Sequencer wasn't initialized, snapping crashed on NULL dereference.
Add Null check.
2021-11-17 05:40:25 +01:00
ecad33f214 UI: Use ampersand instad of 'and' in labels
- When and is used in labels use ampsand
- When used in description use 'and'
2021-11-16 21:38:03 -05:00
61bffa565e Fix T90412: Inconsistency in mask strip color
This seems to be oversight in 271231f58e where strip color was
defined only for light theme.
2021-11-17 03:27:13 +01:00
f72dc00569 Merge branch 'blender-v3.0-release' 2021-11-17 03:12:19 +01:00
9bdf3fa5f0 Fix T91724: Strip height is too limited
This change was introduced in 997b5fe45d, to not display pixelated
thumbnails. However when VSE timeline height is made smaller, this
limits strip height.

Change limit, so one strip can occupy full height of VSE timeline
2021-11-17 02:45:53 +01:00
4c988eb3e1 Fix error with makefiles compilation
Use 'template' keyword to treat 'is' as a dependent template name
2021-11-16 20:46:33 -03:00
luzpaz
dea26253a0 cleanup: fix typos in comments and docs
Followup to https://developer.blender.org/D10288

Reviewed By: Blendify

Differential Revision: https://developer.blender.org/D10346
2021-11-16 18:45:10 -05:00
8290edefad Cleanup: Use bool instead of int 2021-11-16 17:20:31 -06:00
2f39b45e8c Merge branch 'blender-v3.0-release' 2021-11-16 17:50:24 -05:00
f829b86039 Cleanup: Correct copy paste error in comment
Mistake from rB2743d746ea4f38c098512f6dd6fc33d5a62429d3
2021-11-16 17:49:51 -05:00
93f26d652e Merge branch 'blender-v3.0-release' 2021-11-16 23:21:13 +01:00
fbf4fe6963 Fix missing Cycles implicit float/double conversion error with Clang
Since these are errors with GCC and Visual Studio.
2021-11-16 23:18:25 +01:00
1e4d1eb398 Fix another Linux build error with double and float comparison 2021-11-16 23:12:50 +01:00
59da22c309 Merge branch 'blender-v3.0-release' 2021-11-16 22:40:08 +01:00
b496c1c721 Cleanup: compiler warnings 2021-11-16 22:29:50 +01:00
3189171a94 Fix build error with strict double to float conversion 2021-11-16 22:25:24 +01:00
cf83719761 Geometry Nodes: Small improvements to object info node
This commit contains a few mostly-related changes to this node:
 -  Add a warning when retrieving the geometry from the modifier object.
 - Only create the output geometry when it is necessary.
 - Decompose transform matrices in a more friendly way.
 - Use default return callbacks like other newer nodes.

Differential Revision: https://developer.blender.org/D13232
2021-11-16 15:09:00 -06:00
c9fb08e075 Merge branch 'blender-v3.0-release' 2021-11-16 14:55:13 -06:00
f30e1fd2f0 Fix T93085: Incorrect geometry nodes modifier warning
It's valid for a node group connected to the modifier not to
have a geometry input, but I didn't consider that case
with the last change I made here, f3bdabbe24.

Differential Revision: https://developer.blender.org/D13231
2021-11-16 14:51:03 -06:00
25d30e6c99 Fix T92857: Deadlock in geometry nodes curve multi-threading
The spline code, especially Bezier splines, often make use of lazily
evaluation and caching. In order to do that, they use mutex locks.
When multi-threading, this can lead to problems. Further detail
can be found in rBfcc844f8fbd0d1.

To fix the deadlock, isolate the task before multi-threading
when holding a lock.

Differential Revision: https://developer.blender.org/D13229
2021-11-16 14:49:58 -06:00
cfd0e96e47 Fix T93125: Cycles wrong remaining render time with high number of samples
Avoid integer overflow.
2021-11-16 20:49:32 +01:00
7293c1b357 Fix T93106: Cycles SSS not working with normals pointing inside 2021-11-16 19:44:45 +01:00
Germano Cavalcante
9d7422b817 File Browser: Improve usage of threads in the creation of thumbnails
Due to asynchronous process, the preview for a given image may be
generated several times.

This regenerates many thumbs unnecessarily.

The solution is to add the `FILE_ENTRY_PREVIEW_LOADING` flag for file
entries that are still in the thread queue.

So this flag is checked not to redraw the thumb when it is still being
created on a different thread.

Differential Revision: https://developer.blender.org/D11150
2021-11-16 14:10:24 -03:00
917218269e Merge branch 'blender-v3.0-release' 2021-11-16 13:59:28 -03:00
1572c4d3d3 Fix T93011: Individual origins being used when pivot point is override
There should be a special `t->around` for this case, but for now let's
just avoid having the individual origins overlap.
2021-11-16 13:56:11 -03:00
bd37553850 Cleanup: better delimit member initialization
The initialization of `t->around` and `t->view` was scattered and with
duplicate code
2021-11-16 13:55:30 -03:00
0335df9384 Transform: better contextualize the status bar
`Remove Last Snap Point` should only be displayed when there is a Snap Point to be removed.
2021-11-16 13:39:10 -03:00
ba6427adfa Merge branch 'blender-v3.0-release' 2021-11-16 17:25:48 +01:00
b3529ecf0e Fix CUDA error when using tiny border in viewport
Need to clamp scaled render buffers window to be above zero
when applying resolution divider.
2021-11-16 17:25:18 +01:00
72ee62e0da Fix crash on freeing hair system
Fix a crash when a hair system's `ParticleSettings` ID datablock was
linked from another file but couldn't be found. This results in default
settings, with `type = PART_EMITTER`, where the particle data still has
a non-NULL `hair` pointer. Previously, copies of such a particle system
would NOT copy hair data for non-hair particle systems, hence the
pointer of the copy pointed to the original data, which got freed (at
least) twice upon closing the blend file.

This is now fixed by always copying the hair data, regardless of the
particle system type.

Reviewed by: mont29

Differential Revision: https://developer.blender.org/D13245
2021-11-16 17:18:01 +01:00
bee7a56687 Cleanup: document that MEM_dupallocN is NULL-safe
Add comment explaining `MEM_dupallocN` is NULL-safe, in that it returns
NULL when it receives a NULL pointer. This is currently true for both
implementations of the function (`MEM_lockfree_dupallocN` and
`MEM_guarded_dupallocN`), and will be expected of other implementations
as well.

No functional changes.
2021-11-16 17:11:45 +01:00
07af45eec5 Asset Browser: hide catalog debug info behind debug option
Add a new "experimental" debug option `show_asset_debug_info`, and use
that to determine the visibility of the active asset's catalog UUID and
simple name. Previously this was only determined by the "Developer
Extras" option, which meant it was visible in too many situations. It's
not really a "developer extra", and really just a debugging tool, so the
new option is more in line with its purpose.

Reviewed by: Severin

Differential Revision: https://developer.blender.org/D13242
2021-11-16 16:49:27 +01:00
Michael Jones
64003fa4b0 Cycles: Adapt volumetric lambda functions to work on MSL
This patch adapts the existing volumetric read/write lambda functions for Metal. Lambda expressions are not supported on MSL, so two new macros `VOLUME_READ_LAMBDA` and `VOLUME_WRITE_LAMBDA` have been defined with a default implementation which, on Metal, is overridden to use inline function objects.

This patch also removes the last remaining mention of the now-unused `ccl_addr_space`.

Ref T92212

Reviewed By: leesonw

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D13234
2021-11-16 13:42:23 +00:00
85ac9b8584 Merge branch 'blender-v3.0-release' 2021-11-16 14:39:51 +01:00
ce0d817bb7 UI: Fix hard to read text for drag disabled hints
In 499dbb626a, the background color of drag tooltips were changed so
text becomes more readable. But multiple people were touching the same
code, so the disabled hint tooltips didn't get the same tweak. They
would benefit from them even more, since the red text is even harder to
read on the transparent background than the regular, white text.
2021-11-16 14:37:20 +01:00
12a986c9b5 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-16 13:10:35 +01:00
c7a1e115b5 Tests: fix memory leak of GHOST system paths
Dispose of GHOST system paths when tearing down `BlendfileLoadingBaseTest`
and some other test cases. This prevents a memory leak.

A better solution would be to rework Blender's initialisation & teardown
structure, but that's outside the scope of this fix.

No functional changes to Blender.
2021-11-16 13:07:11 +01:00
faa8aa3bb9 Asset Browser: Forbid dragging catalogs into themselves
While there is nothing technically that would cause issues when moving a
catalog into itself (it just changes the path of the catalog, and the
missing parent catalogs will be created), it seems broken to the user.
So disable this in the drag & drop code for asset catalogs.
2021-11-16 13:01:57 +01:00
052c22199d Asset Browser: use one more refresh operator
Refreshing the assets requires `file_OT_asset_library_refresh` in the
asset browser, and `asset_OT_list_refresh` for the asset view. Both
are now done from `ASSET_OT_open_containing_blend_file`.
2021-11-16 12:35:39 +01:00
7da714f387 Merge branch 'blender-v3.0-release' 2021-11-16 10:57:26 +01:00
Diptangshu Dey
da14a482f2 Fix T90866: Python operator templates are not accessible from menus
Python Operator templates made accessible from respective menus
(required to also use F3 search for quick access)
Also fixed Modal Draw Operator id_name (had duplicate name from other template)

Maniphest Tasks: T90866

Differential Revision: https://developer.blender.org/D13182
2021-11-16 10:45:23 +01:00
d4c868da9f Geometry Nodes: refactor virtual array system
Goals of this refactor:
* Simplify creating virtual arrays.
* Simplify passing virtual arrays around.
* Simplify converting between typed and generic virtual arrays.
* Reduce memory allocations.

As a quick reminder, a virtual arrays is a data structure that behaves like an
array (i.e. it can be accessed using an index). However, it may not actually
be stored as array internally. The two most important implementations
of virtual arrays are those that correspond to an actual plain array and those
that have the same value for every index. However, many more
implementations exist for various reasons (interfacing with legacy attributes,
unified iterator over all points in multiple splines, ...).

With this refactor the core types (`VArray`, `GVArray`, `VMutableArray` and
`GVMutableArray`) can be used like "normal values". They typically live
on the stack. Before, they were usually inside a `std::unique_ptr`. This makes
passing them around much easier. Creation of new virtual arrays is also
much simpler now due to some constructors. Memory allocations are
reduced by making use of small object optimization inside the core types.

Previously, `VArray` was a class with virtual methods that had to be overridden
to change the behavior of a the virtual array. Now,`VArray` has a fixed size
and has no virtual methods. Instead it contains a `VArrayImpl` that is
similar to the old `VArray`. `VArrayImpl` should rarely ever be used directly,
unless a new virtual array implementation is added.

To support the small object optimization for many `VArrayImpl` classes,
a new `blender::Any` type is added. It is similar to `std::any` with two
additional features. It has an adjustable inline buffer size and alignment.
The inline buffer size of `std::any` can't be relied on and is usually too
small for our use case here. Furthermore, `blender::Any` can store
additional user-defined type information without increasing the
stack size.

Differential Revision: https://developer.blender.org/D12986
2021-11-16 10:16:30 +01:00
6d35972b06 Merge branch 'blender-v3.0-release' 2021-11-16 09:58:47 +01:00
7d985d6b69 Fix T93066: Alembic export ignores Mantaflow particles
`ABCPointsWriter::is_supported` already checked for valid particle
system types (liquid, spray, foam, bubbles, ...).

`AbstractHierarchyIterator::make_writers_particle_systems` did not
create a writer for these though, so now bring these in line and also
create writers for these.
2021-11-16 09:41:09 +01:00
57ed435def Cleanup: Use C++ matrix identity constructor 2021-11-15 23:24:16 -06:00
7e42ae7c1a Cleanup: Typo in comments 2021-11-15 22:08:28 +01:00
165cacc6f0 VSE: Use alpha over as default blend mode
With transform tools, it is expected to see backgroud image when overlay
is transformed.

Alpha over caused performance to be not optimal when used with opaque
media. This should be addressed with D12914 at least partially. There
may be some corner cases not addressed.

Differential Revision: https://developer.blender.org/D12952
2021-11-15 21:07:25 +01:00
62da6ffe08 VSE: Use early out for aplha over blending
When scaling down image, users expect to see background, which doesn't
currently happen in VSE. This is because strips use cross blend mode by
default, because alpha over is much slower. Reason is, because any area
of image can be transparent, and therefore it can't have early out
implemented in a way that cross blend mode can.

Flag images rendered by codecs that don't support transparency as fully
opaque and implement a form of early out for alpha over blend mode.

When rendering image stack, 2-input effects are ignored on the
"way down". Alpha over needs rendered overlay image to decide whether it
will use only overlay or background too. Therefore overlay can be
rendered safely before it is used. Image flags can be checked and it can
be freed if needed. Freeing doesn't cause any performance degradation,
because image is always stored in cache.

This feature does not improve blend mode performance. In summary, it
only allowes for having alpha over blend mode on background images
without suffering from lower performance.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D12914
2021-11-15 21:03:43 +01:00
46f5f60c13 Merge branch 'blender-v3.0-release' 2021-11-15 20:33:34 +01:00
a040d2a93a Fix T90592: Incorrect scrollbar range with backdrop
`v2d->tot` rect was set for backdrop drawing. Set range before drawing
scrollbars.

Reviewed By: Severin

Differential Revision: https://developer.blender.org/D13099
2021-11-15 20:28:11 +01:00
7e148c45c8 Fix T90415: Missing cache invalidation
Some RNA properties and operators did not invalidate cache or did it
incorrectly.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D13101
2021-11-15 20:23:57 +01:00
d3c45e1c39 Fix T91405: Block artefacts in WEBM video
Issue was caused by incorrect FFmpeg asynchronous decoding API. In most
cases, decoder returns 1 frame each time it is fed by 1 packet. Here
decoder wanted to return more frames, but our code always expected only
one.

Before sending new packets to decoder, check if there are frames to
receive. If there are, process them, otherwise continue decoding as
usual.

Reviewed By: zeddb, sergey

Differential Revision: https://developer.blender.org/D13079
2021-11-15 20:20:33 +01:00
ef8240e64c Fix T91992: Incorrect clip strip image size
When proxy size lower than 100% is used, clip strips are rendered with
incorrect image size.

This is because if proxies aren't enabled in movieclip, it automatically
falls back on rendering original media. Sequencer doesn't have knowledge
about this and since 9c99292a16 it assumes that image is proxy,
because it explicitly requested this size.

Check movieclip flag to see if proxies are enabled.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D13080
2021-11-15 20:07:46 +01:00
10a6a540af Cleanup: remove unnecessary functions
Those functions were more useful when `FieldInferencingInterface`
was still declared further down in `node.cc`.
2021-11-15 18:04:03 +01:00
8976b72843 Merge branch 'blender-v3.0-release' 2021-11-15 09:10:19 -06:00
62da41d63d Fix: Incorrect socket identifier versioning
There were two issues:
 - The third math node socket does not exist in old enough files.
 - The comment incorrectly referred to the vector math node.

Differential Revision: https://developer.blender.org/D13219
2021-11-15 08:52:58 -06:00
c3472cb11c Fix T93074: Gpencil cutter not using flat caps in middle cuts
When cut an stroke using the option Flat Caps, the falt was not done if the cut was done in the middle of the stroke.

Now the flat is applied to the segments created and also some cleanup of the code done.
2021-11-15 12:17:11 +01:00
Jeroen Bakker
a5c59fb90e Fix T89260: Eevee crashes with custom node sockets.
Cause of this issue is that Custom Node Sockets info type was
initialized as SOCK_FLOAT when registering. Areas within the core that
would ignore custom socket types by checking its type would use the
socket as being a float type.

When custom node sockets have a property called default_value blender
tries to store it as an internal default value what failed in debug
builds.

This patch will set the socket type to SOCK_CUSTOM when registering a
custom socket type and allow, but skip storage of custom default values.
In this case the default values should already be stored as custom
properies.

Reviewed By: campbellbarton, JacquesLucke

Maniphest Tasks: T89260

Differential Revision: https://developer.blender.org/D13174
2021-11-15 08:13:22 +01:00
c2c65cc4bf Merge branch 'blender-v3.0-release' 2021-11-15 02:39:33 +01:00
622e6f05f1 Fix T92750: sculpt vertex colors missing in object mode
The layers were not aliased properly for usage in the shaders.

Regression caused by rB03013d19d167.
2021-11-15 02:38:52 +01:00
8e43757834 GHOST: Vulkan Backend.
This adds a vulkan backend to GHOST. Still WIP, we should decide what
would be the minimum requirement to land in master.

Differential Revision: https://developer.blender.org/D13155
2021-11-12 09:22:56 +01:00
b65df10346 Merge branch 'master' into tmp-vulkan 2021-11-09 10:34:41 +01:00
8bf8db8ca2 Vulkan: Add support for wayland. 2021-07-06 10:42:45 +02:00
5bd7a2c416 Fix merge conflict. Missing parameter when creating EGLContext. 2021-07-06 09:02:29 +02:00
8c3e8d0eb6 Vulkan: Clean-up platform build support
This polishes up the build support for WITH_VULKAN
the find_package calls were moved to the appropriate
place in the platform cmake files.

For windows the required binaries are copied to
the binary folder.

Windows support relies on D11678 landing and
libraries being added to SVN and will automatically
disable WITH_VULKAN when they are not found.
2021-06-30 18:01:36 -06:00
8bfdec76c0 Vulkan: Fix uninitialized ram in VKTexture
The ram being uninitialised caused issues in
VKTexture's destructor which tests for
VK_NULL_HANDLE. The nvidia driver did not
enjoy being fed a bogus pointer and crashed.
2021-06-30 10:53:00 -06:00
9cc5075680 Ghost: Fix unresolved merge conflict
Introduced by rBddd64ba0
2021-06-30 10:48:59 -06:00
2ea8725339 Vulkan: Retrieve platform information.
Retrieve device platform information for `GPUPlatformGlobal`.
As vulkan needs a physical device it was moved from the GPUBackend
constructor to an init method that is called when an initial context
is created.

Still need to find out how to extract the driver type.
2021-06-29 16:51:10 +02:00
a55bbedc06 Merge branch 'master' into tmp-vulkan 2021-06-29 09:59:34 +02:00
5d52da2e65 Vulkan: Compiler log parsing.
For the vulkan backend we control the compiler and therefore the log
parsing is much simpler then the OpenGL parsers.

This patch formats the log to a better readable/debuggable format so
developers can quickly detect what needs to be done.
2021-06-28 16:59:11 +02:00
25b341a4ae Vulkan: Fix double free of shader interface.
interface is allocated by a subclass but freed by the parent class.
Might consider a uniqueptr here.
2021-06-28 15:56:46 +02:00
dfa5bc25ae Vulkan: Enabled GPU tests. 2021-06-28 15:51:16 +02:00
ddd64ba03a Merge branch 'master' into tmp-vulkan 2021-06-28 15:46:45 +02:00
0b3caae8a7 Merge commit 'master@{1-day-ago}' into tmp-vulkan 2021-06-28 15:17:51 +02:00
b1d5950471 Merge commit 'master@{1-weeks-ago}' into tmp-vulkan 2021-06-28 15:04:27 +02:00
99886f91a9 Merge commit 'master@{2-weeks-ago}' into tmp-vulkan 2021-06-28 15:04:21 +02:00
fa00caba88 Merge commit 'master@{3-weeks-ago}' into tmp-vulkan 2021-06-28 14:57:20 +02:00
f853163a54 Merge commit 'master@{4-weeks-ago}' into tmp-vulkan 2021-06-28 14:47:30 +02:00
87492f93a4 Merge commit 'master@{5-weeks-ago}' into tmp-vulkan 2021-06-28 14:27:59 +02:00
6152113b45 Merge commit 'master@{6-weeks-ago}' into tmp-vulkan 2021-06-28 14:06:11 +02:00
2741a4106a Merge commit 'master@{7-weeks-ago}' into tmp-vulkan 2021-06-28 13:51:33 +02:00
3420792ea8 Merge commit 'master@{8-weeks-ago}' into tmp-vulkan 2021-06-28 13:35:48 +02:00
3eaddb4da8 Merge commit 'master@{9-weeks-ago}' into tmp-vulkan 2021-06-28 13:35:39 +02:00
1a85a68a5e Merge commit 'master@{10-weeks-ago}' into tmp-vulkan 2021-06-28 13:25:24 +02:00
adba79a04f Merge commit 'master@{11-weeks-ago}' into tmp-vulkan 2021-06-28 13:15:10 +02:00
c73894ea5b Merge commit 'master@{12-weeks-ago}' into tmp-vulkan 2021-06-28 13:05:21 +02:00
48360afb10 Merge commit 'master@{3-months-ago}' into tmp-vulkan 2021-06-28 12:37:05 +02:00
0493a32803 Initial SpirV log parsing.
Still work in progress.
2021-06-25 16:53:07 +02:00
79346fd72e Moved shader compilation flag to cmake.
Migration of the shaders can take a long time. During this time it
you won't be able to start most GPU tests or Blender. This switch
will enable the compilation of the shader to other developments
can still happen.

`WITH_VULKAN_SHADER_COMPILATION` Default is OFF. Enable when migrating
shaders.
2021-06-25 16:01:05 +02:00
44c875f59a Vulkan: Patched gl_VertexID to gl_VertexIndex. 2021-06-25 15:52:02 +02:00
a68be3b3c2 ShaderCompiler: Use CPP strings to improve API. 2021-06-25 15:28:19 +02:00
c124606d70 Vulkan: Compile Shader Modules.
Shaders are expected to fail as they need to be converted to Vulkan. Dy default the compilation is turned off. Enable VULKAN_SHADER_COMPILATION to activate shader module compilation.
2021-06-25 14:38:50 +02:00
975b09e3e7 ShaderCompiler: Split code per target type.
No functional changes.
2021-06-25 11:01:59 +02:00
15ac620204 Added Vulkan test cases.
Added counterparts of the shader test cases for vulkan.
Fixing an issue where the OpenGL context was created when using a vulkan
backend.
2021-06-25 10:48:07 +02:00
983f61e9c5 Add test case to compile builtin shaders.
When using the vulkan backend it crashes. Will look into this later this
week.
2021-06-23 17:07:33 +02:00
ed959cd912 Cleanup: Added license headers. 2021-06-23 14:14:14 +02:00
193a17474e Added shader compiler abstration.
The shader_compiler is an abstration layer for in case we need to switch
between other shader compilers in the future.

The shader compiler uses a CPP Api as it will only be accessed from GPU
module.

```
  const char *source = "#version 450\nvoid main() {}";
  shader_compiler::Compiler *compiler = shader_compiler::Compiler::create_default();
  shader_compiler::Job job;
  job.source = source;
  job.name = __func__;
  job.source_type = shader_compiler::SourceType::GlslComputeShader;
  job.compilation_target = shader_compiler::TargetType::SpirV;

  shader_compiler::Result *result = compiler->compile(job);
  EXPECT_EQ(result->type, shader_compiler::TargetType::SpirV);
  EXPECT_EQ(result->status_code, shader_compiler::StatusCode::Ok);
  EXPECT_GT(result->bin.size(), 0);
  EXPECT_EQ(result->error_log, "");
  delete result;
  delete compiler;
```

The CMakeFiles need some attention as it currently works when linking to
the shaderlib_combined. On linux it picks up the shaderlib from the
vulkan SDK, eventually it needs to pick up from our prebuild libraries.
2021-06-23 14:04:44 +02:00
5b246fd4b3 Vulkan: Fix offscreen context creation.
On linux both the display and window should be available to create a
window surface. Only the display was checked making it fail later on
when creating a present queue.
2021-06-22 09:22:28 +02:00
f7d7a5aad7 Cleanup: Silence compilation warning. 2021-06-21 15:13:09 +02:00
8b68ee3a17 Vulkan: Fix compilation error. 2021-06-18 15:12:26 +02:00
230b24159e Vulkan: remove VK_ERROR_INCOMPATIBLE_VERSION_KHR
It has been removed from the SDK since version 1.1.18.
2021-06-18 15:10:56 +02:00
a5a01cc0c3 Performance: GPU Batch Baseline TestCase.
When using a dense mesh and transforming a small number of verts the mesh received a copy-on-write signal. This will free all GPU batches. No reuse is currently possible.
This patch adds a test case that can be used as a baseline to optimize the cache construction in the draw module.

Differential Revision: https://developer.blender.org/D11255
2021-05-17 11:16:27 +02:00
e7b3a7028b Merge branch 'master' into tmp-vulkan 2021-03-11 20:17:32 +01:00
fced6f19be Attempt 2 to silence warning. 2021-03-12 00:41:13 +05:30
402e19ddc8 Attempt to silence warning. 2021-03-12 00:37:55 +05:30
af2bc8be40 Vulkan: Implementation of VKTexture
Implementation is a first draft and needs to be refined over time.
2021-03-11 16:56:33 +01:00
72bce1be8e Vulkan: Bypass BGL when running with the vulkan backend
This avoids a crash at startup.
2021-03-11 16:55:12 +01:00
d390f01e48 Vulkan: Fix crash because of null active framebuffer 2021-03-11 16:02:01 +01:00
0956a5134a Vulkan: Add allocator and command pool to VKContext
This allows memory backed object (Textures, VertexBuffers, ...) to be created.
Also adds functions for one time use command buffers.
2021-03-11 16:01:27 +01:00
9a08daae92 GPUContext: Pass GHOST context handle for GPUContext creation
This is because we need to have a direct reference to the context to get
the vulkan resources.
2021-03-11 15:38:39 +01:00
dd4405121c GHOST: Window: Add drawing context handle getter
This is in order to retrieve the vulkan/gl context from a window handle.
2021-03-11 14:53:06 +01:00
b996cc6440 GHOST: Vulkan: Add compile time option to bypass required features
This makes it possible to test the vulkan implementation on all platforms
(mainly MacOS) even if all required features are not supported.
2021-03-11 14:14:14 +01:00
0bcd24c0c4 Vulkan: Add basic implementation of textures
Adds basic texture creation and update as well as texture view support
for attachements.
2021-03-11 13:41:05 +01:00
12e88ee722 Vulkan: Fix compilation after merging 2021-03-11 13:37:39 +01:00
b908c3fa0a VulkanMemoryAllocator: Add new extern library
This library is used to simplify memory allocation / management for Vulkan.
2021-03-11 13:02:02 +01:00
ef8e53c15b GHOST: Vulkan: Fix compilation error 2021-03-07 17:50:43 +01:00
639829ea1a Merge branch 'master' into tmp-vulkan 2021-03-07 17:28:39 +01:00
0ef5c14de0 Merge branch 'master' into tmp-vulkan
# Conflicts:
#	source/blender/blenkernel/BKE_global.h
#	source/blender/gpu/intern/gpu_context.cc
#	source/blender/windowmanager/intern/wm_window.c
#	source/creator/creator_args.c
2021-03-07 17:22:48 +01:00
26fd1c71e1 VK: Initial Base implementation 2020-11-13 22:30:40 +01:00
de062ffd10 Merge branch 'master' into tmp-vulkan
# Conflicts:
#	CMakeLists.txt
2020-09-16 22:05:35 +02:00
fc0f409911 VK: Avoid asserts in debug builds. 2020-09-16 22:04:03 +02:00
c72e6c25d7 GHOST: X11: Enable debug context for vulkan in offscreen contexts 2020-09-16 22:03:34 +02:00
816bd8bed6 GHOST: Vulkan: Rework support for MSVC
This removes designated initializer because not supported by MSVC for our
current C++ version.

Fixes some other compile errors and add usage of the `ContextVK` context in
the Win32 backend.
2020-09-16 22:01:04 +02:00
47613a40f5 VK: OSX: Use debug context and use METAL_SURFACE instead of MACOS_SURFACE 2020-09-13 03:37:35 +02:00
ea5a70973d VK: Make the placeholder backend work
Now we get a green window without a crash
2020-09-13 03:36:22 +02:00
70c0f652de Cleanup: Vulkan: Unused header and avoid duplicated code 2020-09-13 01:53:43 +02:00
e70dcde88b Vulkan: OSX: Use vulkan context for offscreen context 2020-09-13 01:52:57 +02:00
6d82ae3581 Vulkan: Fix compilation on OSX
This was caused by an internal metal type. The fix is a
workaround since I could not find the right solution.
2020-09-13 01:52:03 +02:00
99fc68bf6a GPU: Add Vulkan empty implementation
This does not even builds! WHY?
2020-09-13 00:44:15 +02:00
3efdb3f232 GHOST: Vulkan: Add MoltenVK Backend on osx
This is still highly experimental.
2020-09-13 00:43:43 +02:00
193cebd3c2 Merge branch 'master' into tmp-vulkan 2020-09-12 19:50:22 +02:00
bdeca237ff Merge branch 'master' into tmp-vulkan
# Conflicts:
#	CMakeLists.txt
#	intern/ghost/GHOST_C-api.h
#	intern/ghost/GHOST_ISystem.h
#	intern/ghost/intern/GHOST_C-api.cpp
#	intern/ghost/intern/GHOST_System.h
#	intern/ghost/intern/GHOST_SystemCocoa.h
#	intern/ghost/intern/GHOST_SystemCocoa.mm
#	intern/ghost/intern/GHOST_SystemNULL.h
#	intern/ghost/intern/GHOST_SystemSDL.cpp
#	intern/ghost/intern/GHOST_SystemSDL.h
#	intern/ghost/intern/GHOST_SystemWayland.cpp
#	intern/ghost/intern/GHOST_SystemWayland.h
#	intern/ghost/intern/GHOST_SystemWin32.cpp
#	intern/ghost/intern/GHOST_SystemWin32.h
#	intern/ghost/intern/GHOST_SystemX11.cpp
#	intern/ghost/intern/GHOST_SystemX11.h
#	source/blender/blenkernel/BKE_global.h
#	source/blender/windowmanager/intern/wm_window.c
2020-09-12 19:05:59 +02:00
a08d242acd GPU: Vulkan: Add creator arg to enable vulkan drawing
The creator arg is meant to support more than vulkan or opengl.

This currently fails because there is no GPU backend separation.

Note that we use G.debug with a debug flag to know what context type to
create. This will need to be cleaned later.
2020-07-24 22:08:37 +02:00
5629ff5f69 GHOST: Vulkan: Support window resizing 2020-07-24 19:53:01 +02:00
eb23ecd1b3 GHOST: Vulkan: Style Cleanup 2020-07-24 18:51:08 +02:00
a02da40732 GHOST: Vulkan: Better debug messages 2020-07-24 18:30:05 +02:00
bd0c1d8c53 GHOST: Vulkan: Remove swap interval functions 2020-07-24 17:48:31 +02:00
73b585b0b2 GHOST: Vulkan: Add Test command buffers and sync primitives 2020-07-24 17:00:06 +02:00
b549b1f477 GHOST: Vulkan: Add ImageView, Framebuffer and default RenderPass 2020-07-24 02:13:40 +02:00
c155c10888 GHOST: Vulkan: Add swapchain creation 2020-07-24 00:42:44 +02:00
bc86701b58 Merge branch 'master' into tmp-vulkan 2020-07-23 16:41:42 +02:00
6dc1d823ba Ghost: Vulkan: Add surface creation 2020-07-23 17:05:17 +02:00
392e744ed5 Merge branch 'master' into tmp-vulkan 2020-07-23 15:11:46 +02:00
f56af0b899 GHOST: Vulkan: Add logical device creation 2020-07-20 14:02:05 +02:00
78cbcccf34 GHOST: Vulkan: Add physical device & queue family selection 2020-07-20 13:23:41 +02:00
b671b33871 GHOST: Vulkan: Add instance creation 2020-07-19 22:59:57 +02:00
be9dc493a4 GHOST: Add barebone vulkan context for X11
The context does nothing. For now it always fail and will always
fallback to opengl.
2020-07-19 02:53:07 +02:00
8dec7e5219 Vulkan: Add CMake inclusion
We need to bump minimum version to 3.7 for FindVulkan.cmake to be
available.
2020-07-19 02:53:06 +02:00
b79e7f92bf Fix T78977 GPU: blf fonts are not gamma corrected 2020-07-18 04:25:56 +02:00
692 changed files with 36153 additions and 7850 deletions

View File

@@ -411,6 +411,7 @@ option(WITH_CYCLES "Enable Cycles Render Engine" ON)
option(WITH_CYCLES_OSL "Build Cycles with OpenShadingLanguage support" ON)
option(WITH_CYCLES_EMBREE "Build Cycles with Embree support" ON)
option(WITH_CYCLES_LOGGING "Build Cycles with logging support" ON)
option(WITH_CYCLES_DEBUG "Build Cycles with options useful for debugging (e.g., MIS)" OFF)
option(WITH_CYCLES_STANDALONE "Build Cycles standalone application" OFF)
option(WITH_CYCLES_STANDALONE_GUI "Build Cycles standalone with GUI" OFF)
@@ -513,9 +514,13 @@ if(UNIX AND NOT APPLE)
endif()
# Vulkan
option(WITH_VULKAN "Enable Vulkan backend (Experimental)" OFF)
option(WITH_VULKAN_SHADER_COMPILATION "Temporary flag to enable vulkan shader compilation needed to continue development during the migration of GLSL to Vulkan." OFF)
# OpenGL
option(WITH_OPENGL "When off limits visibility of the opengl headers to just bf_gpu and gawain (temporary option for development purposes)" ON)
option(WITH_OPENGL "When off limits visibility of the opengl headers to just bf_gpu (temporary option for development purposes)" ON)
option(WITH_GLEW_ES "Switches to experimental copy of GLEW that has support for OpenGL ES. (temporary option for development purposes)" OFF)
option(WITH_GL_EGL "Use the EGL OpenGL system library instead of the platform specific OpenGL system library (CGL, glX, or WGL)" OFF)
option(WITH_GL_PROFILE_ES20 "Support using OpenGL ES 2.0. (through either EGL or the AGL/WGL/XGL 'es20' profile)" OFF)
@@ -525,6 +530,7 @@ mark_as_advanced(
WITH_GLEW_ES
WITH_GL_EGL
WITH_GL_PROFILE_ES20
WITH_VULKAN_SHADER_COMPILATION
)
if(WIN32)
@@ -1069,7 +1075,7 @@ if(MSVC)
add_definitions(-D__LITTLE_ENDIAN__)
# OSX-Note: as we do cross-compiling with specific set architecture,
# endianess-detection and auto-setting is counterproductive
# endianness-detection and auto-setting is counterproductive
# so we just set endianness according CMAKE_OSX_ARCHITECTURES
elseif(CMAKE_OSX_ARCHITECTURES MATCHES i386 OR CMAKE_OSX_ARCHITECTURES MATCHES x86_64 OR CMAKE_OSX_ARCHITECTURES MATCHES arm64)
@@ -1125,6 +1131,18 @@ if(WITH_OPENVDB)
list(APPEND OPENVDB_LIBRARIES ${BOOST_LIBRARIES} ${TBB_LIBRARIES})
endif()
#-----------------------------------------------------------------------------
# Configure Vulkan.
if(WITH_VULKAN)
list(APPEND BLENDER_GL_LIBRARIES ${Vulkan_LIBRARY})
add_definitions(-DWITH_VULKAN)
if(WITH_VULKAN_SHADER_COMPILATION)
add_definitions(-DWITH_VULKAN_SHADER_COMPILATION)
endif()
endif()
#-----------------------------------------------------------------------------
# Configure OpenGL.
@@ -1759,7 +1777,7 @@ endif()
set(CMAKE_CXX_STANDARD 17)
# If C++17 is not available, downgrading to an earlier standard is NOT OK.
set(CMAKE_CXX_STANDARD_REQUIRED ON)
# Do not enable compiler specific language extentions.
# Do not enable compiler specific language extensions.
set(CMAKE_CXX_EXTENSIONS OFF)
# Make MSVC properly report the value of the __cplusplus preprocessor macro

View File

@@ -51,7 +51,7 @@ Other Convenience Targets
* config: Run cmake configuration tool to set build options.
* deps: Build library dependencies (intended only for platform maintainers).
The existance of locally build dependancies overrides the pre-built dependencies from subversion.
The existance of locally build dependencies overrides the pre-built dependencies from subversion.
These must be manually removed from '../lib/' to go back to using the pre-compiled libraries.
Project Files

View File

@@ -17,7 +17,7 @@
# ***** END GPL LICENSE BLOCK *****
########################################################################
# Copy all generated files to the proper strucure as blender prefers
# Copy all generated files to the proper structure as blender prefers
########################################################################
if(NOT DEFINED HARVEST_TARGET)

View File

@@ -42,6 +42,7 @@ ExternalProject_Add(nanovdb
URL_HASH ${NANOVDB_HASH_TYPE}=${NANOVDB_HASH}
PREFIX ${BUILD_DIR}/nanovdb
SOURCE_SUBDIR nanovdb
PATCH_COMMAND ${PATCH_CMD} -p 1 -d ${BUILD_DIR}/nanovdb/src/nanovdb < ${PATCH_DIR}/nanovdb.diff
CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${LIBDIR}/nanovdb ${DEFAULT_CMAKE_FLAGS} ${NANOVDB_EXTRA_ARGS}
INSTALL_DIR ${LIBDIR}/nanovdb
)

View File

@@ -39,7 +39,7 @@ endif()
set(DOWNLOAD_DIR "${CMAKE_CURRENT_BINARY_DIR}/downloads" CACHE STRING "Path for downloaded files")
# This path must be hard-coded like this, so that the GNUmakefile knows where it is and can pass it to make_source_archive.py:
set(PACKAGE_DIR "${CMAKE_CURRENT_BINARY_DIR}/packages")
option(PACKAGE_USE_UPSTREAM_SOURCES "Use soures upstream to download the package sources, when OFF the blender mirror will be used" ON)
option(PACKAGE_USE_UPSTREAM_SOURCES "Use sources upstream to download the package sources, when OFF the blender mirror will be used" ON)
file(TO_CMAKE_PATH ${DOWNLOAD_DIR} DOWNLOAD_DIR)
file(TO_CMAKE_PATH ${PACKAGE_DIR} PACKAGE_DIR)

View File

@@ -24,7 +24,7 @@ if(MSVC)
add_custom_command(
OUTPUT ${PYTARGET}/bin/python${PYTHON_POSTFIX}.exe
COMMAND echo packaging python
COMMAND echo this should ouput at ${PYTARGET}/bin/python${PYTHON_POSTFIX}.exe
COMMAND echo this should output at ${PYTARGET}/bin/python${PYTHON_POSTFIX}.exe
COMMAND ${CMAKE_COMMAND} -E make_directory ${PYTARGET}/libs
COMMAND ${CMAKE_COMMAND} -E copy ${PYSRC}/libs/python${PYTHON_SHORT_VERSION_NO_DOTS}.lib ${PYTARGET}/libs/python${PYTHON_SHORT_VERSION_NO_DOTS}.lib
COMMAND ${CMAKE_COMMAND} -E copy ${PYSRC}/python.exe ${PYTARGET}/bin/python.exe
@@ -43,7 +43,7 @@ if(MSVC)
add_custom_command(
OUTPUT ${PYTARGET}/bin/python${PYTHON_POSTFIX}.exe
COMMAND echo packaging python
COMMAND echo this should ouput at ${PYTARGET}/bin/python${PYTHON_POSTFIX}.exe
COMMAND echo this should output at ${PYTARGET}/bin/python${PYTHON_POSTFIX}.exe
COMMAND ${CMAKE_COMMAND} -E make_directory ${PYTARGET}/libs
COMMAND ${CMAKE_COMMAND} -E copy ${PYSRC}/libs/python${PYTHON_SHORT_VERSION_NO_DOTS}${PYTHON_POSTFIX}.lib ${PYTARGET}/libs/python${PYTHON_SHORT_VERSION_NO_DOTS}${PYTHON_POSTFIX}.lib
COMMAND ${CMAKE_COMMAND} -E copy ${PYSRC}/python${PYTHON_POSTFIX}.exe ${PYTARGET}/bin/python${PYTHON_POSTFIX}.exe

View File

@@ -1826,7 +1826,7 @@ compile_OCIO() {
# Force linking against static libs
#rm -f $_inst/lib/*.so*
# Additional depencencies
# Additional dependencies
#cp ext/dist/lib/libtinyxml.a $_inst/lib
#cp ext/dist/lib/libyaml-cpp.a $_inst/lib

View File

@@ -0,0 +1,374 @@
Index: nanovdb/nanovdb/NanoVDB.h
===================================================================
--- a/nanovdb/nanovdb/NanoVDB.h (revision 62751)
+++ b/nanovdb/nanovdb/NanoVDB.h (working copy)
@@ -152,8 +152,8 @@
#endif // __CUDACC_RTC__
-#ifdef __CUDACC__
-// Only define __hostdev__ when using NVIDIA CUDA compiler
+#if defined(__CUDACC__) || defined(__HIP__)
+// Only define __hostdev__ when using NVIDIA CUDA or HIP compiler
#define __hostdev__ __host__ __device__
#else
#define __hostdev__
@@ -461,7 +461,7 @@
/// Maximum floating-point values
template<typename T>
struct Maximum;
-#ifdef __CUDA_ARCH__
+#if defined(__CUDA_ARCH__) || defined(__HIP__)
template<>
struct Maximum<int>
{
@@ -1006,10 +1006,10 @@
using Vec3i = Vec3<int>;
/// @brief Return a single precision floating-point vector of this coordinate
-Vec3f Coord::asVec3s() const { return Vec3f(float(mVec[0]), float(mVec[1]), float(mVec[2])); }
+inline __hostdev__ Vec3f Coord::asVec3s() const { return Vec3f(float(mVec[0]), float(mVec[1]), float(mVec[2])); }
/// @brief Return a double precision floating-point vector of this coordinate
-Vec3d Coord::asVec3d() const { return Vec3d(double(mVec[0]), double(mVec[1]), double(mVec[2])); }
+inline __hostdev__ Vec3d Coord::asVec3d() const { return Vec3d(double(mVec[0]), double(mVec[1]), double(mVec[2])); }
// ----------------------------> Vec4 <--------------------------------------
@@ -1820,7 +1820,7 @@
}; // Map
template<typename Mat4T>
-void Map::set(const Mat4T& mat, const Mat4T& invMat, double taper)
+__hostdev__ void Map::set(const Mat4T& mat, const Mat4T& invMat, double taper)
{
float * mf = mMatF, *vf = mVecF;
float* mif = mInvMatF;
@@ -2170,7 +2170,7 @@
}; // Class Grid
template<typename TreeT>
-int Grid<TreeT>::findBlindDataForSemantic(GridBlindDataSemantic semantic) const
+__hostdev__ int Grid<TreeT>::findBlindDataForSemantic(GridBlindDataSemantic semantic) const
{
for (uint32_t i = 0, n = blindDataCount(); i < n; ++i)
if (blindMetaData(i).mSemantic == semantic)
@@ -2328,7 +2328,7 @@
}; // Tree class
template<typename RootT>
-void Tree<RootT>::extrema(ValueType& min, ValueType& max) const
+__hostdev__ void Tree<RootT>::extrema(ValueType& min, ValueType& max) const
{
min = this->root().valueMin();
max = this->root().valueMax();
@@ -2336,7 +2336,7 @@
template<typename RootT>
template<typename NodeT>
-const NodeT* Tree<RootT>::getNode(uint32_t i) const
+__hostdev__ const NodeT* Tree<RootT>::getNode(uint32_t i) const
{
static_assert(is_same<TreeNodeT<NodeT::LEVEL>, NodeT>::value, "Tree::getNode: unvalid node type");
NANOVDB_ASSERT(i < DataType::mCount[NodeT::LEVEL]);
@@ -2345,7 +2345,7 @@
template<typename RootT>
template<int LEVEL>
-const typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i) const
+__hostdev__ const typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i) const
{
NANOVDB_ASSERT(i < DataType::mCount[LEVEL]);
return reinterpret_cast<const TreeNodeT<LEVEL>*>(reinterpret_cast<const uint8_t*>(this) + DataType::mBytes[LEVEL]) + i;
@@ -2353,7 +2353,7 @@
template<typename RootT>
template<typename NodeT>
-NodeT* Tree<RootT>::getNode(uint32_t i)
+__hostdev__ NodeT* Tree<RootT>::getNode(uint32_t i)
{
static_assert(is_same<TreeNodeT<NodeT::LEVEL>, NodeT>::value, "Tree::getNode: invalid node type");
NANOVDB_ASSERT(i < DataType::mCount[NodeT::LEVEL]);
@@ -2362,7 +2362,7 @@
template<typename RootT>
template<int LEVEL>
-typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i)
+__hostdev__ typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i)
{
NANOVDB_ASSERT(i < DataType::mCount[LEVEL]);
return reinterpret_cast<TreeNodeT<LEVEL>*>(reinterpret_cast<uint8_t*>(this) + DataType::mBytes[LEVEL]) + i;
@@ -2370,7 +2370,7 @@
template<typename RootT>
template<typename NodeT>
-uint32_t Tree<RootT>::getNodeID(const NodeT& node) const
+__hostdev__ uint32_t Tree<RootT>::getNodeID(const NodeT& node) const
{
static_assert(is_same<TreeNodeT<NodeT::LEVEL>, NodeT>::value, "Tree::getNodeID: invalid node type");
const NodeT* first = reinterpret_cast<const NodeT*>(reinterpret_cast<const uint8_t*>(this) + DataType::mBytes[NodeT::LEVEL]);
@@ -2380,7 +2380,7 @@
template<typename RootT>
template<typename NodeT>
-uint32_t Tree<RootT>::getLinearOffset(const NodeT& node) const
+__hostdev__ uint32_t Tree<RootT>::getLinearOffset(const NodeT& node) const
{
return this->getNodeID(node) + DataType::mPFSum[NodeT::LEVEL];
}
@@ -3366,7 +3366,7 @@
}; // LeafNode class
template<typename ValueT, typename CoordT, template<uint32_t> class MaskT, uint32_t LOG2DIM>
-inline void LeafNode<ValueT, CoordT, MaskT, LOG2DIM>::updateBBox()
+inline __hostdev__ void LeafNode<ValueT, CoordT, MaskT, LOG2DIM>::updateBBox()
{
static_assert(LOG2DIM == 3, "LeafNode::updateBBox: only supports LOGDIM = 3!");
if (!this->isActive()) return;
Index: nanovdb/nanovdb/util/SampleFromVoxels.h
===================================================================
--- a/nanovdb/nanovdb/util/SampleFromVoxels.h (revision 62751)
+++ b/nanovdb/nanovdb/util/SampleFromVoxels.h (working copy)
@@ -22,7 +22,7 @@
#define NANOVDB_SAMPLE_FROM_VOXELS_H_HAS_BEEN_INCLUDED
// Only define __hostdev__ when compiling as NVIDIA CUDA
-#ifdef __CUDACC__
+#if defined(__CUDACC__) || defined(__HIP__)
#define __hostdev__ __host__ __device__
#else
#include <cmath> // for floor
@@ -136,7 +136,7 @@
template<typename TreeOrAccT>
template<typename Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const Vec3T& xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const Vec3T& xyz) const
{
const CoordT ijk = Round<CoordT>(xyz);
if (ijk != mPos) {
@@ -147,7 +147,7 @@
}
template<typename TreeOrAccT>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const CoordT& ijk) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const CoordT& ijk) const
{
if (ijk != mPos) {
mPos = ijk;
@@ -158,7 +158,7 @@
template<typename TreeOrAccT>
template<typename Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, false>::operator()(const Vec3T& xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, false>::operator()(const Vec3T& xyz) const
{
return mAcc.getValue(Round<CoordT>(xyz));
}
@@ -195,7 +195,7 @@
}; // TrilinearSamplerBase
template<typename TreeOrAccT>
-void TrilinearSampler<TreeOrAccT>::stencil(CoordT& ijk, ValueT (&v)[2][2][2]) const
+__hostdev__ void TrilinearSampler<TreeOrAccT>::stencil(CoordT& ijk, ValueT (&v)[2][2][2]) const
{
v[0][0][0] = mAcc.getValue(ijk); // i, j, k
@@ -224,7 +224,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType TrilinearSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
+__hostdev__ typename TreeOrAccT::ValueType TrilinearSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
{
#if 0
auto lerp = [](ValueT a, ValueT b, ValueT w){ return fma(w, b-a, a); };// = w*(b-a) + a
@@ -239,7 +239,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-Vec3T<typename TreeOrAccT::ValueType> TrilinearSampler<TreeOrAccT>::gradient(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
+__hostdev__ Vec3T<typename TreeOrAccT::ValueType> TrilinearSampler<TreeOrAccT>::gradient(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
{
static_assert(std::is_floating_point<ValueT>::value, "TrilinearSampler::gradient requires a floating-point type");
#if 0
@@ -270,7 +270,7 @@
}
template<typename TreeOrAccT>
-bool TrilinearSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[2][2][2])
+__hostdev__ bool TrilinearSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[2][2][2])
{
static_assert(std::is_floating_point<ValueT>::value, "TrilinearSampler::zeroCrossing requires a floating-point type");
const bool less = v[0][0][0] < ValueT(0);
@@ -363,7 +363,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::sample(xyz, mVal);
@@ -370,7 +370,7 @@
}
template<typename TreeOrAccT>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(const CoordT &ijk) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(const CoordT &ijk) const
{
return ijk == mPos ? mVal[0][0][0] : BaseT::mAcc.getValue(ijk);
}
@@ -377,7 +377,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, true>::gradient(Vec3T<RealT> xyz) const
+__hostdev__ Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, true>::gradient(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::gradient(xyz, mVal);
@@ -393,7 +393,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-void SampleFromVoxels<TreeOrAccT, 1, true>::cache(Vec3T<RealT>& xyz) const
+__hostdev__ void SampleFromVoxels<TreeOrAccT, 1, true>::cache(Vec3T<RealT>& xyz) const
{
CoordT ijk = Floor<CoordT>(xyz);
if (ijk != mPos) {
@@ -406,7 +406,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
{
ValueT val[2][2][2];
CoordT ijk = Floor<CoordT>(xyz);
@@ -418,7 +418,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
{
auto lerp = [](ValueT a, ValueT b, RealT w) { return a + ValueT(w) * (b - a); };
@@ -463,7 +463,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-inline Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, false>::gradient(Vec3T<RealT> xyz) const
+inline __hostdev__ Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, false>::gradient(Vec3T<RealT> xyz) const
{
ValueT val[2][2][2];
CoordT ijk = Floor<CoordT>(xyz);
@@ -473,7 +473,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-bool SampleFromVoxels<TreeOrAccT, 1, false>::zeroCrossing(Vec3T<RealT> xyz) const
+__hostdev__ bool SampleFromVoxels<TreeOrAccT, 1, false>::zeroCrossing(Vec3T<RealT> xyz) const
{
ValueT val[2][2][2];
CoordT ijk = Floor<CoordT>(xyz);
@@ -510,7 +510,7 @@
}; // TriquadraticSamplerBase
template<typename TreeOrAccT>
-void TriquadraticSampler<TreeOrAccT>::stencil(const CoordT &ijk, ValueT (&v)[3][3][3]) const
+__hostdev__ void TriquadraticSampler<TreeOrAccT>::stencil(const CoordT &ijk, ValueT (&v)[3][3][3]) const
{
CoordT p(ijk[0] - 1, 0, 0);
for (int dx = 0; dx < 3; ++dx, ++p[0]) {
@@ -526,7 +526,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType TriquadraticSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[3][3][3])
+__hostdev__ typename TreeOrAccT::ValueType TriquadraticSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[3][3][3])
{
auto kernel = [](const ValueT* value, double weight)->ValueT {
return weight * (weight * (0.5f * (value[0] + value[2]) - value[1]) +
@@ -545,7 +545,7 @@
}
template<typename TreeOrAccT>
-bool TriquadraticSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[3][3][3])
+__hostdev__ bool TriquadraticSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[3][3][3])
{
static_assert(std::is_floating_point<ValueT>::value, "TrilinearSampler::zeroCrossing requires a floating-point type");
const bool less = v[0][0][0] < ValueT(0);
@@ -624,7 +624,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::sample(xyz, mVal);
@@ -631,7 +631,7 @@
}
template<typename TreeOrAccT>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(const CoordT &ijk) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(const CoordT &ijk) const
{
return ijk == mPos ? mVal[1][1][1] : BaseT::mAcc.getValue(ijk);
}
@@ -646,7 +646,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-void SampleFromVoxels<TreeOrAccT, 2, true>::cache(Vec3T<RealT>& xyz) const
+__hostdev__ void SampleFromVoxels<TreeOrAccT, 2, true>::cache(Vec3T<RealT>& xyz) const
{
CoordT ijk = Floor<CoordT>(xyz);
if (ijk != mPos) {
@@ -657,7 +657,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, false>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, false>::operator()(Vec3T<RealT> xyz) const
{
ValueT val[3][3][3];
CoordT ijk = Floor<CoordT>(xyz);
@@ -667,7 +667,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-bool SampleFromVoxels<TreeOrAccT, 2, false>::zeroCrossing(Vec3T<RealT> xyz) const
+__hostdev__ bool SampleFromVoxels<TreeOrAccT, 2, false>::zeroCrossing(Vec3T<RealT> xyz) const
{
ValueT val[3][3][3];
CoordT ijk = Floor<CoordT>(xyz);
@@ -710,7 +710,7 @@
}; // TricubicSampler
template<typename TreeOrAccT>
-void TricubicSampler<TreeOrAccT>::stencil(const CoordT& ijk, ValueT (&C)[64]) const
+__hostdev__ void TricubicSampler<TreeOrAccT>::stencil(const CoordT& ijk, ValueT (&C)[64]) const
{
auto fetch = [&](int i, int j, int k) -> ValueT& { return C[((i + 1) << 4) + ((j + 1) << 2) + k + 1]; };
@@ -929,7 +929,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 3, true>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 3, true>::operator()(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::sample(xyz, mC);
@@ -937,7 +937,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-void SampleFromVoxels<TreeOrAccT, 3, true>::cache(Vec3T<RealT>& xyz) const
+__hostdev__ void SampleFromVoxels<TreeOrAccT, 3, true>::cache(Vec3T<RealT>& xyz) const
{
CoordT ijk = Floor<CoordT>(xyz);
if (ijk != mPos) {

View File

@@ -0,0 +1,66 @@
# - Find SHADERC library
# Find the native Haru includes and library
# This module defines
# SHADERC_INCLUDE_DIRS, where to find hpdf.h, set when
# SHADERC_INCLUDE_DIR is found.
# SHADERC_LIBRARIES, libraries to link against to use Haru.
# SHADERC_ROOT_DIR, The base directory to search for Haru.
# This can also be an environment variable.
# SHADERC_FOUND, If false, do not try to use Haru.
#
# also defined, but not for general use are
# SHADERC_LIBRARY, where to find the Haru library.
#=============================================================================
# Copyright 2021 Blender Foundation.
#
# Distributed under the OSI-approved BSD 3-Clause License,
# see accompanying file BSD-3-Clause-license.txt for details.
#=============================================================================
# If SHADERC_ROOT_DIR was defined in the environment, use it.
if(NOT SHADERC_ROOT_DIR AND NOT $ENV{SHADERC_ROOT_DIR} STREQUAL "")
set(SHADERC_ROOT_DIR $ENV{SHADERC_ROOT_DIR})
endif()
set(_shaderc_SEARCH_DIRS
${SHADERC_ROOT_DIR}
/opt/lib/haru
)
find_path(SHADERC_INCLUDE_DIR
NAMES
shaderc.hpp
HINTS
${_shaderc_SEARCH_DIRS}
PATH_SUFFIXES
include/shaderc
include
)
find_library(SHADERC_LIBRARY
NAMES
shaderc_combined
shaderc
HINTS
${_shaderc_SEARCH_DIRS}
PATH_SUFFIXES
lib64 lib
)
# Handle the QUIETLY and REQUIRED arguments and set SHADERC_FOUND to TRUE if
# all listed variables are TRUE.
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(ShaderC DEFAULT_MSG SHADERC_LIBRARY SHADERC_INCLUDE_DIR)
if(SHADERC_FOUND)
set(SHADERC_LIBRARIES ${SHADERC_LIBRARY})
set(SHADERC_INCLUDE_DIRS ${SHADERC_INCLUDE_DIR})
endif()
mark_as_advanced(
SHADERC_INCLUDE_DIR
SHADERC_LIBRARY
)
unset(_shaderc_SEARCH_DIRS)

View File

@@ -180,7 +180,7 @@ def create_nb_project_main():
f.write(' </logicalFolder>\n')
f.write(' </logicalFolder>\n')
# default, but this dir is infact not in blender dir so we can ignore it
# default, but this dir is in fact not in blender dir so we can ignore it
# f.write(' <sourceFolderFilter>^(nbproject)$</sourceFolderFilter>\n')
f.write(r' <sourceFolderFilter>^(nbproject|__pycache__|.*\.py|.*\.html|.*\.blend)$</sourceFolderFilter>\n')

View File

@@ -529,7 +529,7 @@ function(SETUP_LIBDIRS)
# NOTE: For all new libraries, use absolute library paths.
# This should eventually be phased out.
# APPLE plaform uses full paths for linking libraries, and avoids link_directories.
# APPLE platform uses full paths for linking libraries, and avoids link_directories.
if(NOT MSVC AND NOT APPLE)
link_directories(${JPEG_LIBPATH} ${PNG_LIBPATH} ${ZLIB_LIBPATH} ${FREETYPE_LIBPATH})

View File

@@ -102,6 +102,11 @@ find_package_wrapper(ZLIB REQUIRED)
find_package_wrapper(Zstd REQUIRED)
find_package_wrapper(Freetype REQUIRED)
if(WITH_VULKAN)
find_package_wrapper(Vulkan REQUIRED)
find_package(ShaderC REQUIRED)
endif()
if(WITH_PYTHON)
# No way to set py35, remove for now.
# find_package(PythonLibs)

View File

@@ -874,5 +874,32 @@ if(WITH_HARU)
endif()
endif()
if(WITH_VULKAN)
if(EXISTS ${LIBDIR}/vulkan)
set(Vulkan_FOUND On)
set(Vulkan_ROOT_DIR ${LIBDIR}/vulkan)
set(Vulkan_INCLUDE_DIR ${Vulkan_ROOT_DIR}/include)
set(Vulkan_INCLUDE_DIRS ${Vulkan_INCLUDE_DIR})
set(Vulkan_LIBRARY ${Vulkan_ROOT_DIR}/lib/vulkan-1.lib)
set(Vulkan_LIBRARIES ${Vulkan_LIBRARY})
else()
message(WARNING "vulkan was not found, disabling WITH_VULKAN")
set(WITH_VULKAN OFF)
endif()
endif()
if(WITH_VULKAN)
if(EXISTS ${LIBDIR}/shaderc)
set(SHADERC_ROOT_DIR ${LIBDIR}/shaderc)
set(SHADERC_INCLUDE_DIR ${SHADERC_ROOT_DIR}/include)
set(SHADERC_INCLUDE_DIRS ${SHADERC_INCLUDE_DIR})
set(SHADERC_LIBRARY optimized ${SHADERC_ROOT_DIR}/lib/shaderc_shared.lib debug ${SHADERC_ROOT_DIR}/lib/shaderc_shared_d.lib)
set(SHADERC_LIBRARIES ${SHADERC_LIBRARY})
else()
message(WARNING "shaderc was not found, disabling WITH_VULKAN")
set(WITH_VULKAN OFF)
endif()
endif()
set(ZSTD_INCLUDE_DIRS ${LIBDIR}/zstd/include)
set(ZSTD_LIBRARIES ${LIBDIR}/zstd/lib/zstd_static.lib)

View File

@@ -27,7 +27,7 @@ if(WITH_WINDOWS_BUNDLE_CRT)
# Install the CRT to the blender.crt Sub folder.
install(FILES ${CMAKE_INSTALL_SYSTEM_RUNTIME_LIBS} DESTINATION ./blender.crt COMPONENT Libraries)
# Generating the manifest is a relativly expensive operation since
# Generating the manifest is a relatively expensive operation since
# it is collecting an sha1 hash for every file required. so only do
# this work when the libs have either changed or the manifest does
# not exist yet.

View File

@@ -11,7 +11,7 @@ import queue
execution_queue = queue.Queue()
# This function can savely be called in another thread.
# This function can safely be called in another thread.
# The function will be executed when the timer runs the next time.
def run_in_main_thread(function):
execution_queue.put(function)

View File

@@ -42,8 +42,13 @@ class SimpleMouseOperator(bpy.types.Operator):
self.y = event.mouse_y
return self.execute(context)
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(SimpleMouseOperator.bl_idname, text="Simple Mouse Operator")
# Register and add to the view menu (required to also use F3 search "Simple Mouse Operator" for quick access)
bpy.utils.register_class(SimpleMouseOperator)
bpy.types.VIEW3D_MT_view.append(menu_func)
# Test call to the newly defined operator.
# Here we call the operator and invoke it, meaning that the settings are taken

View File

@@ -43,7 +43,7 @@ def menu_func(self, context):
self.layout.operator(ExportSomeData.bl_idname, text="Text Export Operator")
# Register and add to the file selector
# Register and add to the file selector (required to also use F3 search "Text Export Operator" for quick access)
bpy.utils.register_class(ExportSomeData)
bpy.types.TOPBAR_MT_file_export.append(menu_func)

View File

@@ -27,8 +27,14 @@ class DialogOperator(bpy.types.Operator):
wm = context.window_manager
return wm.invoke_props_dialog(self)
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(DialogOperator.bl_idname, text="Dialog Operator")
# Register and add to the object menu (required to also use F3 search "Dialog Operator" for quick access)
bpy.utils.register_class(DialogOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# Test call.
bpy.ops.object.dialog_operator('INVOKE_DEFAULT')

View File

@@ -41,8 +41,13 @@ class CustomDrawOperator(bpy.types.Operator):
col.prop(self, "my_string")
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(CustomDrawOperator.bl_idname, text="Custom Draw Operator")
# Register and add to the object menu (required to also use F3 search "Custom Draw Operator" for quick access)
bpy.utils.register_class(CustomDrawOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# test call
bpy.ops.object.custom_draw('INVOKE_DEFAULT')

View File

@@ -55,8 +55,13 @@ class ModalOperator(bpy.types.Operator):
context.window_manager.modal_handler_add(self)
return {'RUNNING_MODAL'}
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(ModalOperator.bl_idname, text="Modal Operator")
# Register and add to the object menu (required to also use F3 search "Modal Operator" for quick access)
bpy.utils.register_class(ModalOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# test call
bpy.ops.object.modal_operator('INVOKE_DEFAULT')

View File

@@ -31,8 +31,13 @@ class SearchEnumOperator(bpy.types.Operator):
context.window_manager.invoke_search_popup(self)
return {'RUNNING_MODAL'}
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(SearchEnumOperator.bl_idname, text="Search Enum Operator")
# Register and add to the object menu (required to also use F3 search "Search Enum Operator" for quick access)
bpy.utils.register_class(SearchEnumOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# test call
bpy.ops.object.search_enum_operator('INVOKE_DEFAULT')

View File

@@ -22,8 +22,13 @@ class HelloWorldOperator(bpy.types.Operator):
print("Hello World")
return {'FINISHED'}
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(HelloWorldOperator.bl_idname, text="Hello World Operator")
# Register and add to the view menu (required to also use F3 search "Hello World Operator" for quick access)
bpy.utils.register_class(HelloWorldOperator)
bpy.types.VIEW3D_MT_view.append(menu_func)
# test call to the newly defined operator
bpy.ops.wm.hello_world()

View File

@@ -728,7 +728,7 @@ Abusing RNA property callbacks
------------------------------
Python-defined RNA properties can have custom callbacks. Trying to perform complex operations
from there, like calling an operator, may work, but is not officialy recommended nor supported.
from there, like calling an operator, may work, but is not officially recommended nor supported.
Main reason is that those callback should be very fast, but additionally, it may for example
create issues with undo/redo system (most operators store an history step, and editing an RNA

View File

@@ -116,3 +116,7 @@ endif()
if (WITH_COMPOSITOR)
add_subdirectory(smaa_areatex)
endif()
if(WITH_VULKAN)
add_subdirectory(vulkan_memory_allocator)
endif()

View File

@@ -804,31 +804,29 @@ typedef enum hipDeviceP2PAttr {
} hipDeviceP2PAttr;
typedef struct HIP_MEMCPY3D {
size_t srcXInBytes;
size_t srcY;
size_t srcZ;
size_t srcLOD;
unsigned int srcXInBytes;
unsigned int srcY;
unsigned int srcZ;
unsigned int srcLOD;
hipMemoryType srcMemoryType;
const void* srcHost;
hipDeviceptr_t srcDevice;
hArray * srcArray;
void* reserved0;
size_t srcPitch;
size_t srcHeight;
size_t dstXInBytes;
size_t dstY;
size_t dstZ;
size_t dstLOD;
hArray srcArray;
unsigned int srcPitch;
unsigned int srcHeight;
unsigned int dstXInBytes;
unsigned int dstY;
unsigned int dstZ;
unsigned int dstLOD;
hipMemoryType dstMemoryType;
void* dstHost;
hipDeviceptr_t dstDevice;
hArray * dstArray;
void* reserved1;
size_t dstPitch;
size_t dstHeight;
size_t WidthInBytes;
size_t Height;
size_t Depth;
hArray dstArray;
unsigned int dstPitch;
unsigned int dstHeight;
unsigned int WidthInBytes;
unsigned int Height;
unsigned int Depth;
} HIP_MEMCPY3D;
typedef struct HIP_MEMCPY3D_PEER_st {
@@ -879,7 +877,7 @@ typedef struct HIP_RESOURCE_DESC_st {
hipResourceType resType;
union {
struct {
hArray * h_Array;
hArray h_Array;
} array;
struct {
hipMipmappedArray_t hMipmappedArray;
@@ -1074,9 +1072,10 @@ typedef enum hiprtcResult {
typedef hipError_t HIPAPI thipGetErrorName(hipError_t error, const char** pStr);
typedef hipError_t HIPAPI thipInit(unsigned int Flags);
typedef hipError_t HIPAPI thipDriverGetVersion(int* driverVersion);
typedef hipError_t HIPAPI thipGetDevice(hipDevice_t* device, int ordinal);
typedef hipError_t HIPAPI thipGetDevice(int* device);
typedef hipError_t HIPAPI thipGetDeviceCount(int* count);
typedef hipError_t HIPAPI thipGetDeviceProperties(hipDeviceProp_t* props, int deviceId);
typedef hipError_t HIPAPI thipDeviceGet(hipDevice_t* device, int ordinal);
typedef hipError_t HIPAPI thipDeviceGetName(char* name, int len, hipDevice_t dev);
typedef hipError_t HIPAPI thipDeviceGetAttribute(int* pi, hipDeviceAttribute_t attrib, hipDevice_t dev);
typedef hipError_t HIPAPI thipDeviceComputeCapability(int* major, int* minor, hipDevice_t dev);
@@ -1209,6 +1208,7 @@ extern thipDriverGetVersion *hipDriverGetVersion;
extern thipGetDevice *hipGetDevice;
extern thipGetDeviceCount *hipGetDeviceCount;
extern thipGetDeviceProperties *hipGetDeviceProperties;
extern thipDeviceGet* hipDeviceGet;
extern thipDeviceGetName *hipDeviceGetName;
extern thipDeviceGetAttribute *hipDeviceGetAttribute;
extern thipDeviceComputeCapability *hipDeviceComputeCapability;

View File

@@ -71,6 +71,7 @@ thipDriverGetVersion *hipDriverGetVersion;
thipGetDevice *hipGetDevice;
thipGetDeviceCount *hipGetDeviceCount;
thipGetDeviceProperties *hipGetDeviceProperties;
thipDeviceGet* hipDeviceGet;
thipDeviceGetName *hipDeviceGetName;
thipDeviceGetAttribute *hipDeviceGetAttribute;
thipDeviceComputeCapability *hipDeviceComputeCapability;
@@ -255,6 +256,7 @@ static int hipewHipInit(void) {
HIP_LIBRARY_FIND_CHECKED(hipGetDevice);
HIP_LIBRARY_FIND_CHECKED(hipGetDeviceCount);
HIP_LIBRARY_FIND_CHECKED(hipGetDeviceProperties);
HIP_LIBRARY_FIND_CHECKED(hipDeviceGet);
HIP_LIBRARY_FIND_CHECKED(hipDeviceGetName);
HIP_LIBRARY_FIND_CHECKED(hipDeviceGetAttribute);
HIP_LIBRARY_FIND_CHECKED(hipDeviceComputeCapability);

View File

@@ -0,0 +1,42 @@
# ***** BEGIN GPL LICENSE BLOCK *****
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
#
# The Original Code is Copyright (C) 2012, Blender Foundation
# All rights reserved.
# ***** END GPL LICENSE BLOCK *****
set(INC
.
)
set(INC_SYS
${Vulkan_INCLUDE_DIRS}
)
set(SRC
vk_mem_alloc_impl.cc
vk_mem_alloc.h
)
blender_add_lib(extern_vulkan_memory_allocator "${SRC}" "${INC}" "${INC_SYS}" "${LIB}")
if(CMAKE_COMPILER_IS_GNUCC OR CMAKE_C_COMPILER_ID MATCHES "Clang")
target_compile_options(extern_vulkan_memory_allocator
PRIVATE "-Wno-nullability-completeness"
)
endif()

View File

@@ -0,0 +1,19 @@
Copyright (c) 2017-2021 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

View File

@@ -0,0 +1,5 @@
Project: VulkanMemoryAllocator
URL: https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator
License: MIT
Upstream version: 4b047fd
Local modifications: None

134
extern/vulkan_memory_allocator/README.md vendored Normal file
View File

@@ -0,0 +1,134 @@
# Vulkan Memory Allocator
Easy to integrate Vulkan memory allocation library.
**Documentation:** See [Vulkan Memory Allocator](https://gpuopen-librariesandsdks.github.io/VulkanMemoryAllocator/html/) (generated from Doxygen-style comments in [src/vk_mem_alloc.h](src/vk_mem_alloc.h))
**License:** MIT. See [LICENSE.txt](LICENSE.txt)
**Changelog:** See [CHANGELOG.md](CHANGELOG.md)
**Product page:** [Vulkan Memory Allocator on GPUOpen](https://gpuopen.com/gaming-product/vulkan-memory-allocator/)
**Build status:**
- Windows: [![Build status](https://ci.appveyor.com/api/projects/status/4vlcrb0emkaio2pn/branch/master?svg=true)](https://ci.appveyor.com/project/adam-sawicki-amd/vulkanmemoryallocator/branch/master)
- Linux: [![Build Status](https://travis-ci.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.svg?branch=master)](https://travis-ci.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator)
# Problem
Memory allocation and resource (buffer and image) creation in Vulkan is difficult (comparing to older graphics API-s, like D3D11 or OpenGL) for several reasons:
- It requires a lot of boilerplate code, just like everything else in Vulkan, because it is a low-level and high-performance API.
- There is additional level of indirection: `VkDeviceMemory` is allocated separately from creating `VkBuffer`/`VkImage` and they must be bound together.
- Driver must be queried for supported memory heaps and memory types. Different IHVs provide different types of it.
- It is recommended practice to allocate bigger chunks of memory and assign parts of them to particular resources.
# Features
This library can help game developers to manage memory allocations and resource creation by offering some higher-level functions:
1. Functions that help to choose correct and optimal memory type based on intended usage of the memory.
- Required or preferred traits of the memory are expressed using higher-level description comparing to Vulkan flags.
2. Functions that allocate memory blocks, reserve and return parts of them (`VkDeviceMemory` + offset + size) to the user.
- Library keeps track of allocated memory blocks, used and unused ranges inside them, finds best matching unused ranges for new allocations, respects all the rules of alignment and buffer/image granularity.
3. Functions that can create an image/buffer, allocate memory for it and bind them together - all in one call.
Additional features:
- Well-documented - description of all functions and structures provided, along with chapters that contain general description and example code.
- Thread-safety: Library is designed to be used in multithreaded code. Access to a single device memory block referred by different buffers and textures (binding, mapping) is synchronized internally.
- Configuration: Fill optional members of CreateInfo structure to provide custom CPU memory allocator, pointers to Vulkan functions and other parameters.
- Customization: Predefine appropriate macros to provide your own implementation of all external facilities used by the library, from assert, mutex, and atomic, to vector and linked list.
- Support for memory mapping, reference-counted internally. Support for persistently mapped memory: Just allocate with appropriate flag and you get access to mapped pointer.
- Support for non-coherent memory. Functions that flush/invalidate memory. `nonCoherentAtomSize` is respected automatically.
- Support for resource aliasing (overlap).
- Support for sparse binding and sparse residency: Convenience functions that allocate or free multiple memory pages at once.
- Custom memory pools: Create a pool with desired parameters (e.g. fixed or limited maximum size) and allocate memory out of it.
- Linear allocator: Create a pool with linear algorithm and use it for much faster allocations and deallocations in free-at-once, stack, double stack, or ring buffer fashion.
- Support for Vulkan 1.0, 1.1, 1.2.
- Support for extensions (and equivalent functionality included in new Vulkan versions):
- VK_EXT_memory_budget: Used internally if available to query for current usage and budget. If not available, it falls back to an estimation based on memory heap sizes.
- VK_KHR_dedicated_allocation: Just enable it and it will be used automatically by the library.
- VK_AMD_device_coherent_memory
- VK_KHR_buffer_device_address
- Defragmentation of GPU and CPU memory: Let the library move data around to free some memory blocks and make your allocations better compacted.
- Lost allocations: Allocate memory with appropriate flags and let the library remove allocations that are not used for many frames to make room for new ones.
- Statistics: Obtain detailed statistics about the amount of memory used, unused, number of allocated blocks, number of allocations etc. - globally, per memory heap, and per memory type.
- Debug annotations: Associate string with name or opaque pointer to your own data with every allocation.
- JSON dump: Obtain a string in JSON format with detailed map of internal state, including list of allocations and gaps between them.
- Convert this JSON dump into a picture to visualize your memory. See [tools/VmaDumpVis](tools/VmaDumpVis/README.md).
- Debugging incorrect memory usage: Enable initialization of all allocated memory with a bit pattern to detect usage of uninitialized or freed memory. Enable validation of a magic number before and after every allocation to detect out-of-bounds memory corruption.
- Record and replay sequence of calls to library functions to a file to check correctness, measure performance, and gather statistics.
# Prequisites
- Self-contained C++ library in single header file. No external dependencies other than standard C and C++ library and of course Vulkan. STL containers are not used by default.
- Public interface in C, in same convention as Vulkan API. Implementation in C++.
- Error handling implemented by returning `VkResult` error codes - same way as in Vulkan.
- Interface documented using Doxygen-style comments.
- Platform-independent, but developed and tested on Windows using Visual Studio. Continuous integration setup for Windows and Linux. Used also on Android, MacOS, and other platforms.
# Example
Basic usage of this library is very simple. Advanced features are optional. After you created global `VmaAllocator` object, a complete code needed to create a buffer may look like this:
```cpp
VkBufferCreateInfo bufferInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
bufferInfo.size = 65536;
bufferInfo.usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;
VmaAllocationCreateInfo allocInfo = {};
allocInfo.usage = VMA_MEMORY_USAGE_GPU_ONLY;
VkBuffer buffer;
VmaAllocation allocation;
vmaCreateBuffer(allocator, &bufferInfo, &allocInfo, &buffer, &allocation, nullptr);
```
With this one function call:
1. `VkBuffer` is created.
2. `VkDeviceMemory` block is allocated if needed.
3. An unused region of the memory block is bound to this buffer.
`VmaAllocation` is an object that represents memory assigned to this buffer. It can be queried for parameters like Vulkan memory handle and offset.
# Binaries
The release comes with precompiled binary executables for "VulkanSample" application which contains test suite and "VmaReplay" tool. They are compiled using Visual Studio 2019, so they require appropriate libraries to work, including "MSVCP140.dll", "VCRUNTIME140.dll", "VCRUNTIME140_1.dll". If their launch fails with error message telling about those files missing, please download and install [Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017 and 2019](https://support.microsoft.com/en-us/help/2977003/the-latest-supported-visual-c-downloads), "x64" version.
# Read more
See **[Documentation](https://gpuopen-librariesandsdks.github.io/VulkanMemoryAllocator/html/)**.
# Software using this library
- **[Detroit: Become Human](https://gpuopen.com/learn/porting-detroit-3/)**
- **[Vulkan Samples](https://github.com/LunarG/VulkanSamples)** - official Khronos Vulkan samples. License: Apache-style.
- **[Anvil](https://github.com/GPUOpen-LibrariesAndSDKs/Anvil)** - cross-platform framework for Vulkan. License: MIT.
- **[Filament](https://github.com/google/filament)** - physically based rendering engine for Android, Windows, Linux and macOS, from Google. Apache License 2.0.
- **[Atypical Games - proprietary game engine](https://developer.samsung.com/galaxy-gamedev/gamedev-blog/infinitejet.html)**
- **[Flax Engine](https://flaxengine.com/)**
- **[Lightweight Java Game Library (LWJGL)](https://www.lwjgl.org/)** - includes binding of the library for Java. License: BSD.
- **[PowerVR SDK](https://github.com/powervr-graphics/Native_SDK)** - C++ cross-platform 3D graphics SDK, from Imagination. License: MIT.
- **[Skia](https://github.com/google/skia)** - complete 2D graphic library for drawing Text, Geometries, and Images, from Google.
- **[The Forge](https://github.com/ConfettiFX/The-Forge)** - cross-platform rendering framework. Apache License 2.0.
- **[VK9](https://github.com/disks86/VK9)** - Direct3D 9 compatibility layer using Vulkan. Zlib lincese.
- **[vkDOOM3](https://github.com/DustinHLand/vkDOOM3)** - Vulkan port of GPL DOOM 3 BFG Edition. License: GNU GPL.
- **[vkQuake2](https://github.com/kondrak/vkQuake2)** - vanilla Quake 2 with Vulkan support. License: GNU GPL.
- **[Vulkan Best Practice for Mobile Developers](https://github.com/ARM-software/vulkan_best_practice_for_mobile_developers)** from ARM. License: MIT.
- **[RPCS3](https://github.com/RPCS3/rpcs3)** - PlayStation 3 emulator/debugger. License: GNU GPLv2.
[Many other projects on GitHub](https://github.com/search?q=AMD_VULKAN_MEMORY_ALLOCATOR_H&type=Code) and some game development studios that use Vulkan in their games.
# See also
- **[D3D12 Memory Allocator](https://github.com/GPUOpen-LibrariesAndSDKs/D3D12MemoryAllocator)** - equivalent library for Direct3D 12. License: MIT.
- **[Awesome Vulkan](https://github.com/vinjn/awesome-vulkan)** - a curated list of awesome Vulkan libraries, debuggers and resources.
- **[VulkanMemoryAllocator-Hpp](https://github.com/malte-v/VulkanMemoryAllocator-Hpp)** - C++ binding for this library. License: CC0-1.0.
- **[PyVMA](https://github.com/realitix/pyvma)** - Python wrapper for this library. Author: Jean-Sébastien B. (@realitix). License: Apache 2.0.
- **[vk-mem](https://github.com/gwihlidal/vk-mem-rs)** - Rust binding for this library. Author: Graham Wihlidal. License: Apache 2.0 or MIT.
- **[Haskell bindings](https://hackage.haskell.org/package/VulkanMemoryAllocator)**, **[github](https://github.com/expipiplus1/vulkan/tree/master/VulkanMemoryAllocator)** - Haskell bindings for this library. Author: Joe Hermaszewski (@expipiplus1). License BSD-3-Clause.
- **[vma_sample_sdl](https://github.com/rextimmy/vma_sample_sdl)** - SDL port of the sample app of this library (with the goal of running it on multiple platforms, including MacOS). Author: @rextimmy. License: MIT.
- **[vulkan-malloc](https://github.com/dylanede/vulkan-malloc)** - Vulkan memory allocation library for Rust. Based on version 1 of this library. Author: Dylan Ede (@dylanede). License: MIT / Apache 2.0.

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,3 @@
#define VMA_IMPLEMENTATION
#include "vk_mem_alloc.h"

View File

@@ -85,3 +85,7 @@ endif()
if(UNIX AND NOT APPLE)
add_subdirectory(libc_compat)
endif()
if(WITH_VULKAN)
add_subdirectory(shader_compiler)
endif()

View File

@@ -226,6 +226,9 @@ add_definitions(
-DCCL_NAMESPACE_END=}
)
if(WITH_CYCLES_DEBUG)
add_definitions(-DWITH_CYCLES_DEBUG)
endif()
if(WITH_CYCLES_STANDALONE_GUI)
add_definitions(-DWITH_CYCLES_STANDALONE_GUI)
endif()
@@ -334,7 +337,7 @@ else()
endif()
# Warnings
if(CMAKE_COMPILER_IS_GNUCXX)
if(CMAKE_COMPILER_IS_GNUCXX OR CMAKE_C_COMPILER_ID MATCHES "Clang")
ADD_CHECK_CXX_COMPILER_FLAG(CMAKE_CXX_FLAGS _has_cxxflag_float_conversion "-Werror=float-conversion")
ADD_CHECK_CXX_COMPILER_FLAG(CMAKE_CXX_FLAGS _has_cxxflag_double_promotion "-Werror=double-promotion")
ADD_CHECK_CXX_COMPILER_FLAG(CMAKE_CXX_FLAGS _has_no_error_unused_macros "-Wno-error=unused-macros")

View File

@@ -218,6 +218,12 @@ enum_denoising_prefilter = (
('ACCURATE', "Accurate", "Prefilter noisy guiding passes before denoising color. Improves quality when guiding passes are noisy using extra processing time", 3),
)
enum_direct_light_sampling_type = (
('MULTIPLE_IMPORTANCE_SAMPLING', "Multiple Importance Sampling", "Multiple importance sampling is used to combine direct light contributions from next-event estimation and forward path tracing", 0),
('FORWARD_PATH_TRACING', "Forward Path Tracing", "Direct light contributions are only sampled using forward path tracing", 1),
('NEXT_EVENT_ESTIMATION', "Next-Event Estimation", "Direct light contributions are only sampled using next-event estimation", 2),
)
def update_render_passes(self, context):
scene = context.scene
view_layer = context.view_layer
@@ -353,7 +359,7 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
name="Scrambling Distance",
default=1.0,
min=0.0, max=1.0,
description="Lower values give faster rendering with GPU rendering and less noise with all devices at the cost of possible artifacts if set too low. Only works when not using adaptive sampling",
description="Reduce randomization between pixels to improve GPU rendering performance, at the cost of possible rendering artifacts if set too low. Only works when not using adaptive sampling",
)
preview_scrambling_distance: BoolProperty(
name="Scrambling Distance viewport",
@@ -361,10 +367,10 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
description="Uses the Scrambling Distance value for the viewport. Faster but may flicker",
)
adaptive_scrambling_distance: BoolProperty(
name="Adaptive Scrambling Distance",
auto_scrambling_distance: BoolProperty(
name="Automatic Scrambling Distance",
default=False,
description="Uses a formula to adapt the scrambling distance strength based on the sample count",
description="Automatically reduce the randomization between pixels to improve GPU rendering performance, at the cost of possible rendering artifacts. Only works when not using adaptive sampling",
)
use_layer_samples: EnumProperty(
@@ -422,6 +428,13 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
default=0,
)
direct_light_sampling_type: EnumProperty(
name="Direct Light Sampling Type",
description="The type of strategy used for sampling direct light contributions",
items=enum_direct_light_sampling_type,
default='MULTIPLE_IMPORTANCE_SAMPLING',
)
min_light_bounces: IntProperty(
name="Min Light Bounces",
description="Minimum number of light bounces. Setting this higher reduces noise in the first bounces, "
@@ -777,8 +790,8 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
)
use_auto_tile: BoolProperty(
name="Auto Tiles",
description="Automatically render high resolution images in tiles to reduce memory usage, using the specified tile size. Tiles are cached to disk while rendering to save memory",
name="Using Tiling",
description="Render high resolution images in tiles to reduce memory usage, using the specified tile size. Tiles are cached to disk while rendering to save memory",
default=True,
)
tile_size: IntProperty(

View File

@@ -295,13 +295,13 @@ class CYCLES_RENDER_PT_sampling_advanced(CyclesButtonsPanel, Panel):
layout.separator()
col = layout.column(align=True)
col.active = not (cscene.use_adaptive_sampling and cscene.use_preview_adaptive_sampling)
col.prop(cscene, "scrambling_distance", text="Scrambling Distance")
col.prop(cscene, "adaptive_scrambling_distance", text="Adaptive")
sub = col.row(align=True)
heading = layout.column(align=True, heading="Scrambling Distance")
heading.active = not (cscene.use_adaptive_sampling and cscene.use_preview_adaptive_sampling)
heading.prop(cscene, "auto_scrambling_distance", text="Automatic")
sub = heading.row()
sub.active = not cscene.use_preview_adaptive_sampling
sub.prop(cscene, "preview_scrambling_distance", text="Viewport")
heading.prop(cscene, "scrambling_distance", text="Multiplier")
layout.separator()

View File

@@ -199,7 +199,7 @@ static bool ObtainCacheParticleUV(Hair *hair,
b_mesh->uv_layers.begin(l);
float2 uv = zero_float2();
if (b_mesh->uv_layers.length())
if (!b_mesh->uv_layers.empty())
b_psys.uv_on_emitter(psmd, *b_pa, pa_no, uv_num, &uv.x);
CData->curve_uv.push_back_slow(uv);
@@ -261,7 +261,7 @@ static bool ObtainCacheParticleVcol(Hair *hair,
b_mesh->vertex_colors.begin(l);
float4 vcol = make_float4(0.0f, 0.0f, 0.0f, 1.0f);
if (b_mesh->vertex_colors.length())
if (!b_mesh->vertex_colors.empty())
b_psys.mcol_on_emitter(psmd, *b_pa, pa_no, vcol_num, &vcol.x);
CData->curve_vcol.push_back_slow(vcol);

View File

@@ -334,7 +334,7 @@ bool BlenderDisplayDriver::update_begin(const Params &params,
/* Update PBO dimensions if needed.
*
* NOTE: Allocate the PBO for the the size which will fit the final render resolution (as in,
* NOTE: Allocate the PBO for the size which will fit the final render resolution (as in,
* at a resolution divider 1. This was we don't need to recreate graphics interoperability
* objects which are costly and which are tied to the specific underlying buffer size.
* The downside of this approach is that when graphics interoperability is not used we are

View File

@@ -555,7 +555,7 @@ static void attr_create_vertex_color(Scene *scene, Mesh *mesh, BL::Mesh &b_mesh,
/* Create uv map attributes. */
static void attr_create_uv_map(Scene *scene, Mesh *mesh, BL::Mesh &b_mesh)
{
if (b_mesh.uv_layers.length() != 0) {
if (!b_mesh.uv_layers.empty()) {
for (BL::MeshUVLoopLayer &l : b_mesh.uv_layers) {
const bool active_render = l.active_render();
AttributeStandard uv_std = (active_render) ? ATTR_STD_UV : ATTR_STD_NONE;
@@ -619,7 +619,7 @@ static void attr_create_uv_map(Scene *scene, Mesh *mesh, BL::Mesh &b_mesh)
static void attr_create_subd_uv_map(Scene *scene, Mesh *mesh, BL::Mesh &b_mesh, bool subdivide_uvs)
{
if (b_mesh.uv_layers.length() != 0) {
if (!b_mesh.uv_layers.empty()) {
BL::Mesh::uv_layers_iterator l;
int i = 0;
@@ -951,7 +951,7 @@ static void create_mesh(Scene *scene,
N = attr_N->data_float3();
/* create generated coordinates from undeformed coordinates */
const bool need_default_tangent = (subdivision == false) && (b_mesh.uv_layers.length() == 0) &&
const bool need_default_tangent = (subdivision == false) && (b_mesh.uv_layers.empty()) &&
(mesh->need_attribute(scene, ATTR_STD_UV_TANGENT));
if (mesh->need_attribute(scene, ATTR_STD_GENERATED) || need_default_tangent) {
Attribute *attr = attributes.add(ATTR_STD_GENERATED);

View File

@@ -129,7 +129,7 @@ void BlenderSession::create_session()
/* reset status/progress */
last_status = "";
last_error = "";
last_progress = -1.0f;
last_progress = -1.0;
start_resize_time = 0.0;
/* create session */
@@ -606,19 +606,6 @@ void BlenderSession::bake(BL::Depsgraph &b_depsgraph_,
pass->set_type(bake_type_to_pass(bake_type, bake_filter));
pass->set_include_albedo((bake_filter & BL::BakeSettings::pass_filter_COLOR));
if (pass->get_type() == PASS_COMBINED) {
/* Filtering settings for combined pass. */
Integrator *integrator = scene->integrator;
integrator->set_use_direct_light((bake_filter & BL::BakeSettings::pass_filter_DIRECT) != 0);
integrator->set_use_indirect_light((bake_filter & BL::BakeSettings::pass_filter_INDIRECT) !=
0);
integrator->set_use_diffuse((bake_filter & BL::BakeSettings::pass_filter_DIFFUSE) != 0);
integrator->set_use_glossy((bake_filter & BL::BakeSettings::pass_filter_GLOSSY) != 0);
integrator->set_use_transmission((bake_filter & BL::BakeSettings::pass_filter_TRANSMISSION) !=
0);
integrator->set_use_emission((bake_filter & BL::BakeSettings::pass_filter_EMIT) != 0);
}
session->set_display_driver(nullptr);
session->set_output_driver(make_unique<BlenderOutputDriver>(b_engine));
@@ -628,6 +615,24 @@ void BlenderSession::bake(BL::Depsgraph &b_depsgraph_,
sync->sync_camera(b_render, b_camera_override, width, height, "");
sync->sync_data(
b_render, b_depsgraph, b_v3d, b_camera_override, width, height, &python_thread_state);
/* Filtering settings for combined pass. */
if (pass->get_type() == PASS_COMBINED) {
Integrator *integrator = scene->integrator;
integrator->set_use_direct_light((bake_filter & BL::BakeSettings::pass_filter_DIRECT) != 0);
integrator->set_use_indirect_light((bake_filter & BL::BakeSettings::pass_filter_INDIRECT) !=
0);
integrator->set_use_diffuse((bake_filter & BL::BakeSettings::pass_filter_DIFFUSE) != 0);
integrator->set_use_glossy((bake_filter & BL::BakeSettings::pass_filter_GLOSSY) != 0);
integrator->set_use_transmission(
(bake_filter & BL::BakeSettings::pass_filter_TRANSMISSION) != 0);
integrator->set_use_emission((bake_filter & BL::BakeSettings::pass_filter_EMIT) != 0);
}
/* Always use transpanent background for baking. */
scene->background->set_transparent(true);
/* Load built-in images from Blender. */
builtin_images_load();
}
@@ -854,7 +859,7 @@ void BlenderSession::get_status(string &status, string &substatus)
session->progress.get_status(status, substatus);
}
void BlenderSession::get_progress(float &progress, double &total_time, double &render_time)
void BlenderSession::get_progress(double &progress, double &total_time, double &render_time)
{
session->progress.get_time(total_time, render_time);
progress = session->progress.get_progress();
@@ -862,10 +867,10 @@ void BlenderSession::get_progress(float &progress, double &total_time, double &r
void BlenderSession::update_bake_progress()
{
float progress = session->progress.get_progress();
double progress = session->progress.get_progress();
if (progress != last_progress) {
b_engine.update_progress(progress);
b_engine.update_progress((float)progress);
last_progress = progress;
}
}
@@ -874,7 +879,7 @@ void BlenderSession::update_status_progress()
{
string timestatus, status, substatus;
string scene_status = "";
float progress;
double progress;
double total_time, remaining_time = 0, render_time;
float mem_used = (float)session->stats.mem_used / 1024.0f / 1024.0f;
float mem_peak = (float)session->stats.mem_peak / 1024.0f / 1024.0f;
@@ -918,7 +923,7 @@ void BlenderSession::update_status_progress()
last_status_time = current_time;
}
if (progress != last_progress) {
b_engine.update_progress(progress);
b_engine.update_progress((float)progress);
last_progress = progress;
}

View File

@@ -82,7 +82,7 @@ class BlenderSession {
void tag_redraw();
void tag_update();
void get_status(string &status, string &substatus);
void get_progress(float &progress, double &total_time, double &render_time);
void get_progress(double &progress, double &total_time, double &render_time);
void test_cancel();
void update_status_progress();
void update_bake_progress();
@@ -108,7 +108,7 @@ class BlenderSession {
string last_status;
string last_error;
float last_progress;
double last_progress;
double last_status_time;
int width, height;

View File

@@ -365,8 +365,8 @@ void BlenderSync::sync_integrator(BL::ViewLayer &b_view_layer, bool background)
int samples = get_int(cscene, "samples");
float scrambling_distance = get_float(cscene, "scrambling_distance");
bool adaptive_scrambling_distance = get_boolean(cscene, "adaptive_scrambling_distance");
if (adaptive_scrambling_distance) {
bool auto_scrambling_distance = get_boolean(cscene, "auto_scrambling_distance");
if (auto_scrambling_distance) {
scrambling_distance *= 4.0f / sqrtf(samples);
}
@@ -392,6 +392,12 @@ void BlenderSync::sync_integrator(BL::ViewLayer &b_view_layer, bool background)
integrator->set_ao_bounces(0);
}
#ifdef WITH_CYCLES_DEBUG
DirectLightSamplingType direct_light_sampling_type = (DirectLightSamplingType)get_enum(
cscene, "direct_light_sampling_type", DIRECT_LIGHT_SAMPLING_NUM, DIRECT_LIGHT_SAMPLING_MIS);
integrator->set_direct_light_sampling_type(direct_light_sampling_type);
#endif
const DenoiseParams denoise_params = get_denoise_params(b_scene, b_view_layer, background);
integrator->set_use_denoise(denoise_params.use);
@@ -872,7 +878,7 @@ SessionParams BlenderSync::get_session_params(BL::RenderEngine &b_engine,
/* Time limit. */
if (background) {
params.time_limit = get_float(cscene, "time_limit");
params.time_limit = (double)get_float(cscene, "time_limit");
}
else {
/* For the viewport it kind of makes more sense to think in terms of the noise floor, which is

View File

@@ -303,7 +303,7 @@ static inline string image_user_file_path(BL::ImageUser &iuser,
string filepath_str = string(filepath);
if (load_tiled && ima.source() == BL::Image::source_TILED) {
string udim;
if (ima.tiles.length() > 0) {
if (!ima.tiles.empty()) {
udim = to_string(ima.tiles[0].number());
}
string_replace(filepath_str, udim, "<UDIM>");
@@ -647,7 +647,7 @@ static inline Mesh::SubdivisionType object_subdivision_type(BL::Object &b_ob,
{
PointerRNA cobj = RNA_pointer_get(&b_ob.ptr, "cycles");
if (cobj.data && b_ob.modifiers.length() > 0 && experimental) {
if (cobj.data && !b_ob.modifiers.empty() && experimental) {
BL::Modifier mod = b_ob.modifiers[b_ob.modifiers.length() - 1];
bool enabled = preview ? mod.show_viewport() : mod.show_render();

View File

@@ -303,7 +303,7 @@ static void rtc_error_func(void *, enum RTCError, const char *str)
VLOG(1) << str;
}
static double progress_start_time = 0.0f;
static double progress_start_time = 0.0;
static bool rtc_progress_func(void *user_ptr, const double n)
{

View File

@@ -153,7 +153,7 @@ void BVHNode::update_time()
namespace {
struct DumpTraversalContext {
/* Descriptor of wile where writing is happening. */
/* Descriptor of while where writing is happening. */
FILE *stream;
/* Unique identifier of the node current. */
int id;

View File

@@ -178,7 +178,7 @@ class InnerNode : public BVHNode {
reset_unused_children();
}
/* NOTE: This function is only used during binary BVH builder, and it
/* NOTE: This function is only used during binary BVH builder, and it's
* supposed to be configured to have 2 children which will be filled-in in a
* bit. But this is important to have children reset to NULL. */
explicit InnerNode(const BoundBox &bounds) : BVHNode(bounds), num_children_(0)

View File

@@ -30,15 +30,17 @@ BVHOptiX::BVHOptiX(const BVHParams &params_,
: BVH(params_, geometry_, objects_),
device(device),
traversable_handle(0),
as_data(device, params_.top_level ? "optix tlas" : "optix blas", false),
motion_transform_data(device, "optix motion transform", false)
as_data(make_unique<device_only_memory<char>>(
device, params.top_level ? "optix tlas" : "optix blas", false)),
motion_transform_data(
make_unique<device_only_memory<char>>(device, "optix motion transform", false))
{
}
BVHOptiX::~BVHOptiX()
{
// Acceleration structure memory is delayed freed on device, since deleting the
// BVH may happen while still being used for rendering.
/* Acceleration structure memory is delayed freed on device, since deleting the
* BVH may happen while still being used for rendering. */
device->release_optix_bvh(this);
}

View File

@@ -25,14 +25,16 @@
# include "device/memory.h"
# include "util/unique_ptr.h"
CCL_NAMESPACE_BEGIN
class BVHOptiX : public BVH {
public:
Device *device;
uint64_t traversable_handle;
device_only_memory<char> as_data;
device_only_memory<char> motion_transform_data;
unique_ptr<device_only_memory<char>> as_data;
unique_ptr<device_only_memory<char>> motion_transform_data;
protected:
friend class BVH;

View File

@@ -88,7 +88,7 @@ endmacro()
function(cycles_link_directories)
if(APPLE)
# APPLE plaform uses full paths for linking libraries, and avoids link_directories.
# APPLE platform uses full paths for linking libraries, and avoids link_directories.
return()
endif()

View File

@@ -93,11 +93,6 @@ CPUDevice::~CPUDevice()
texture_info.free();
}
bool CPUDevice::show_samples() const
{
return (info.cpu_threads == 1);
}
BVHLayoutMask CPUDevice::get_bvh_layout_mask() const
{
BVHLayoutMask bvh_layout_mask = BVH_LAYOUT_BVH2;

View File

@@ -60,8 +60,6 @@ class CPUDevice : public Device {
CPUDevice(const DeviceInfo &info_, Stats &stats_, Profiler &profiler_);
~CPUDevice();
virtual bool show_samples() const override;
virtual BVHLayoutMask get_bvh_layout_mask() const override;
/* Returns true if the texture info was copied to the device (meaning, some more

View File

@@ -46,12 +46,6 @@ bool CUDADevice::have_precompiled_kernels()
return path_exists(cubins_path);
}
bool CUDADevice::show_samples() const
{
/* The CUDADevice only processes one tile at a time, so showing samples is fine. */
return true;
}
BVHLayoutMask CUDADevice::get_bvh_layout_mask() const
{
return BVH_LAYOUT_BVH2;
@@ -242,6 +236,10 @@ string CUDADevice::compile_kernel_get_common_cflags(const uint kernel_features)
cflags += " -DWITH_NANOVDB";
# endif
# ifdef WITH_CYCLES_DEBUG
cflags += " -DWITH_CYCLES_DEBUG";
# endif
return cflags;
}
@@ -777,6 +775,7 @@ void CUDADevice::generic_free(device_memory &mem)
if (mem.device_pointer) {
CUDAContextScope scope(this);
thread_scoped_lock lock(cuda_mem_map_mutex);
DCHECK(cuda_mem_map.find(&mem) != cuda_mem_map.end());
const CUDAMem &cmem = cuda_mem_map[&mem];
/* If cmem.use_mapped_host is true, reference counting is used
@@ -1143,6 +1142,7 @@ void CUDADevice::tex_free(device_texture &mem)
if (mem.device_pointer) {
CUDAContextScope scope(this);
thread_scoped_lock lock(cuda_mem_map_mutex);
DCHECK(cuda_mem_map.find(&mem) != cuda_mem_map.end());
const CUDAMem &cmem = cuda_mem_map[&mem];
if (cmem.texobject) {

View File

@@ -76,8 +76,6 @@ class CUDADevice : public Device {
static bool have_precompiled_kernels();
virtual bool show_samples() const override;
virtual BVHLayoutMask get_bvh_layout_mask() const override;
void set_error(const string &error) override;

View File

@@ -149,10 +149,6 @@ class Device {
fprintf(stderr, "%s\n", error.c_str());
fflush(stderr);
}
virtual bool show_samples() const
{
return false;
}
virtual BVHLayoutMask get_bvh_layout_mask() const = 0;
/* statistics */

View File

@@ -47,12 +47,6 @@ bool HIPDevice::have_precompiled_kernels()
return path_exists(fatbins_path);
}
bool HIPDevice::show_samples() const
{
/* The HIPDevice only processes one tile at a time, so showing samples is fine. */
return true;
}
BVHLayoutMask HIPDevice::get_bvh_layout_mask() const
{
return BVH_LAYOUT_BVH2;
@@ -99,7 +93,7 @@ HIPDevice::HIPDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
}
/* Setup device and context. */
result = hipGetDevice(&hipDevice, hipDevId);
result = hipDeviceGet(&hipDevice, hipDevId);
if (result != hipSuccess) {
set_error(string_printf("Failed to get HIP device handle from ordinal (%s)",
hipewErrorString(result)));
@@ -744,6 +738,7 @@ void HIPDevice::generic_free(device_memory &mem)
if (mem.device_pointer) {
HIPContextScope scope(this);
thread_scoped_lock lock(hip_mem_map_mutex);
DCHECK(hip_mem_map.find(&mem) != hip_mem_map.end());
const HIPMem &cmem = hip_mem_map[&mem];
/* If cmem.use_mapped_host is true, reference counting is used
@@ -986,16 +981,16 @@ void HIPDevice::tex_alloc(device_texture &mem)
<< string_human_readable_number(mem.memory_size()) << " bytes. ("
<< string_human_readable_size(mem.memory_size()) << ")";
hip_assert(hipArray3DCreate(&array_3d, &desc));
hip_assert(hipArray3DCreate((hArray *)&array_3d, &desc));
if (!array_3d) {
return;
}
HIP_MEMCPY3D param;
memset(&param, 0, sizeof(param));
memset(&param, 0, sizeof(HIP_MEMCPY3D));
param.dstMemoryType = hipMemoryTypeArray;
param.dstArray = &array_3d;
param.dstArray = array_3d;
param.srcMemoryType = hipMemoryTypeHost;
param.srcHost = mem.host_pointer;
param.srcPitch = src_pitch;
@@ -1061,12 +1056,13 @@ void HIPDevice::tex_alloc(device_texture &mem)
if (mem.info.data_type != IMAGE_DATA_TYPE_NANOVDB_FLOAT &&
mem.info.data_type != IMAGE_DATA_TYPE_NANOVDB_FLOAT3) {
/* Bindless textures. */
hipResourceDesc resDesc;
memset(&resDesc, 0, sizeof(resDesc));
if (array_3d) {
resDesc.resType = hipResourceTypeArray;
resDesc.res.array.h_Array = &array_3d;
resDesc.res.array.h_Array = array_3d;
resDesc.flags = 0;
}
else if (mem.data_height > 0) {
@@ -1111,6 +1107,7 @@ void HIPDevice::tex_free(device_texture &mem)
if (mem.device_pointer) {
HIPContextScope scope(this);
thread_scoped_lock lock(hip_mem_map_mutex);
DCHECK(hip_mem_map.find(&mem) != hip_mem_map.end());
const HIPMem &cmem = hip_mem_map[&mem];
if (cmem.texobject) {

View File

@@ -75,8 +75,6 @@ class HIPDevice : public Device {
static bool have_precompiled_kernels();
virtual bool show_samples() const override;
virtual BVHLayoutMask get_bvh_layout_mask() const override;
void set_error(const string &error) override;

View File

@@ -23,7 +23,7 @@ CCL_NAMESPACE_BEGIN
device_memory::device_memory(Device *device, const char *name, MemoryType type)
: data_type(device_type_traits<uchar>::data_type),
data_elements(device_type_traits<uchar>::num_elements_cpu),
data_elements(device_type_traits<uchar>::num_elements),
data_size(0),
device_size(0),
data_width(0),
@@ -44,45 +44,6 @@ device_memory::device_memory(Device *device, const char *name, MemoryType type)
{
}
device_memory::device_memory(device_memory &&other) noexcept
: data_type(other.data_type),
data_elements(other.data_elements),
data_size(other.data_size),
device_size(other.device_size),
data_width(other.data_width),
data_height(other.data_height),
data_depth(other.data_depth),
type(other.type),
name(other.name),
device(other.device),
device_pointer(other.device_pointer),
host_pointer(other.host_pointer),
shared_pointer(other.shared_pointer),
shared_counter(other.shared_counter),
original_device_ptr(other.original_device_ptr),
original_device_size(other.original_device_size),
original_device(other.original_device),
need_realloc_(other.need_realloc_),
modified(other.modified)
{
other.data_elements = 0;
other.data_size = 0;
other.device_size = 0;
other.data_width = 0;
other.data_height = 0;
other.data_depth = 0;
other.device = 0;
other.device_pointer = 0;
other.host_pointer = 0;
other.shared_pointer = 0;
other.shared_counter = 0;
other.original_device_ptr = 0;
other.original_device_size = 0;
other.original_device = 0;
other.need_realloc_ = false;
other.modified = false;
}
device_memory::~device_memory()
{
assert(shared_pointer == 0);

View File

@@ -81,155 +81,140 @@ static constexpr size_t datatype_size(DataType datatype)
template<typename T> struct device_type_traits {
static const DataType data_type = TYPE_UNKNOWN;
static const size_t num_elements_cpu = sizeof(T);
static const size_t num_elements_gpu = sizeof(T);
static const size_t num_elements = sizeof(T);
};
template<> struct device_type_traits<uchar> {
static const DataType data_type = TYPE_UCHAR;
static const size_t num_elements_cpu = 1;
static const size_t num_elements_gpu = 1;
static_assert(sizeof(uchar) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 1;
static_assert(sizeof(uchar) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uchar2> {
static const DataType data_type = TYPE_UCHAR;
static const size_t num_elements_cpu = 2;
static const size_t num_elements_gpu = 2;
static_assert(sizeof(uchar2) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 2;
static_assert(sizeof(uchar2) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uchar3> {
static const DataType data_type = TYPE_UCHAR;
static const size_t num_elements_cpu = 3;
static const size_t num_elements_gpu = 3;
static_assert(sizeof(uchar3) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 3;
static_assert(sizeof(uchar3) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uchar4> {
static const DataType data_type = TYPE_UCHAR;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 4;
static_assert(sizeof(uchar4) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 4;
static_assert(sizeof(uchar4) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uint> {
static const DataType data_type = TYPE_UINT;
static const size_t num_elements_cpu = 1;
static const size_t num_elements_gpu = 1;
static_assert(sizeof(uint) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 1;
static_assert(sizeof(uint) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uint2> {
static const DataType data_type = TYPE_UINT;
static const size_t num_elements_cpu = 2;
static const size_t num_elements_gpu = 2;
static_assert(sizeof(uint2) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 2;
static_assert(sizeof(uint2) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uint3> {
static const DataType data_type = TYPE_UINT;
static const size_t num_elements_cpu = 3;
static const size_t num_elements_gpu = 3;
static_assert(sizeof(uint3) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 3;
static_assert(sizeof(uint3) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uint4> {
static const DataType data_type = TYPE_UINT;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 4;
static_assert(sizeof(uint4) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 4;
static_assert(sizeof(uint4) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<int> {
static const DataType data_type = TYPE_INT;
static const size_t num_elements_cpu = 1;
static const size_t num_elements_gpu = 1;
static_assert(sizeof(int) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 1;
static_assert(sizeof(int) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<int2> {
static const DataType data_type = TYPE_INT;
static const size_t num_elements_cpu = 2;
static const size_t num_elements_gpu = 2;
static_assert(sizeof(int2) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 2;
static_assert(sizeof(int2) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<int3> {
static const DataType data_type = TYPE_INT;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 3;
static_assert(sizeof(int3) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 4;
static_assert(sizeof(int3) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<int4> {
static const DataType data_type = TYPE_INT;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 4;
static_assert(sizeof(int4) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 4;
static_assert(sizeof(int4) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<float> {
static const DataType data_type = TYPE_FLOAT;
static const size_t num_elements_cpu = 1;
static const size_t num_elements_gpu = 1;
static_assert(sizeof(float) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 1;
static_assert(sizeof(float) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<float2> {
static const DataType data_type = TYPE_FLOAT;
static const size_t num_elements_cpu = 2;
static const size_t num_elements_gpu = 2;
static_assert(sizeof(float2) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 2;
static_assert(sizeof(float2) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<float3> {
/* float3 has different size depending on the device, can't use it for interchanging
* memory between CPU and GPU.
*
* Leave body empty to trigger a compile error if used. */
};
template<> struct device_type_traits<packed_float3> {
static const DataType data_type = TYPE_FLOAT;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 3;
static_assert(sizeof(float3) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 3;
static_assert(sizeof(packed_float3) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<float4> {
static const DataType data_type = TYPE_FLOAT;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 4;
static_assert(sizeof(float4) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 4;
static_assert(sizeof(float4) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<half> {
static const DataType data_type = TYPE_HALF;
static const size_t num_elements_cpu = 1;
static const size_t num_elements_gpu = 1;
static_assert(sizeof(half) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 1;
static_assert(sizeof(half) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<ushort4> {
static const DataType data_type = TYPE_UINT16;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 4;
static_assert(sizeof(ushort4) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 4;
static_assert(sizeof(ushort4) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uint16_t> {
static const DataType data_type = TYPE_UINT16;
static const size_t num_elements_cpu = 1;
static const size_t num_elements_gpu = 1;
static_assert(sizeof(uint16_t) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 1;
static_assert(sizeof(uint16_t) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<half4> {
static const DataType data_type = TYPE_HALF;
static const size_t num_elements_cpu = 4;
static const size_t num_elements_gpu = 4;
static_assert(sizeof(half4) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 4;
static_assert(sizeof(half4) == num_elements * datatype_size(data_type));
};
template<> struct device_type_traits<uint64_t> {
static const DataType data_type = TYPE_UINT64;
static const size_t num_elements_cpu = 1;
static const size_t num_elements_gpu = 1;
static_assert(sizeof(uint64_t) == num_elements_cpu * datatype_size(data_type));
static const size_t num_elements = 1;
static_assert(sizeof(uint64_t) == num_elements * datatype_size(data_type));
};
/* Device Memory
@@ -281,11 +266,16 @@ class device_memory {
/* Only create through subclasses. */
device_memory(Device *device, const char *name, MemoryType type);
device_memory(device_memory &&other) noexcept;
/* No copying allowed. */
/* No copying and allowed.
*
* This is because device implementation might need to register device memory in an allocation
* map of some sort and use pointer as a key to identify blocks. Moving data from one place to
* another bypassing device allocation routines will make those maps hard to maintain. */
device_memory(const device_memory &) = delete;
device_memory(device_memory &&other) noexcept = delete;
device_memory &operator=(const device_memory &) = delete;
device_memory &operator=(device_memory &&) = delete;
/* Host allocation on the device. All host_pointer memory should be
* allocated with these functions, for devices that support using
@@ -320,9 +310,7 @@ template<typename T> class device_only_memory : public device_memory {
: device_memory(device, name, allow_host_memory_fallback ? MEM_READ_WRITE : MEM_DEVICE_ONLY)
{
data_type = device_type_traits<T>::data_type;
data_elements = max(device_is_cpu() ? device_type_traits<T>::num_elements_cpu :
device_type_traits<T>::num_elements_gpu,
1);
data_elements = max(device_type_traits<T>::num_elements, 1);
}
device_only_memory(device_only_memory &&other) noexcept : device_memory(std::move(other))
@@ -378,15 +366,11 @@ template<typename T> class device_only_memory : public device_memory {
template<typename T> class device_vector : public device_memory {
public:
/* Can only use this for types that have the same size on CPU and GPU. */
static_assert(device_type_traits<T>::num_elements_cpu ==
device_type_traits<T>::num_elements_gpu);
device_vector(Device *device, const char *name, MemoryType type)
: device_memory(device, name, type)
{
data_type = device_type_traits<T>::data_type;
data_elements = device_type_traits<T>::num_elements_cpu;
data_elements = device_type_traits<T>::num_elements;
modified = true;
need_realloc_ = true;

View File

@@ -109,14 +109,6 @@ class MultiDevice : public Device {
return error_msg;
}
virtual bool show_samples() const override
{
if (devices.size() > 1) {
return false;
}
return devices.front().device->show_samples();
}
virtual BVHLayoutMask get_bvh_layout_mask() const override
{
BVHLayoutMask bvh_layout_mask = BVH_LAYOUT_ALL;

View File

@@ -1032,7 +1032,7 @@ bool OptiXDevice::build_optix_bvh(BVHOptiX *bvh,
return false;
}
device_only_memory<char> &out_data = bvh->as_data;
device_only_memory<char> &out_data = *bvh->as_data;
if (operation == OPTIX_BUILD_OPERATION_BUILD) {
assert(out_data.device == this);
out_data.alloc_to_device(sizes.outputSizeInBytes);
@@ -1123,7 +1123,7 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
operation = OPTIX_BUILD_OPERATION_UPDATE;
}
else {
bvh_optix->as_data.free();
bvh_optix->as_data->free();
bvh_optix->traversable_handle = 0;
}
@@ -1344,9 +1344,9 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
unsigned int num_instances = 0;
unsigned int max_num_instances = 0xFFFFFFFF;
bvh_optix->as_data.free();
bvh_optix->as_data->free();
bvh_optix->traversable_handle = 0;
bvh_optix->motion_transform_data.free();
bvh_optix->motion_transform_data->free();
optixDeviceContextGetProperty(context,
OPTIX_DEVICE_PROPERTY_LIMIT_MAX_INSTANCE_ID,
@@ -1379,8 +1379,8 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
}
}
assert(bvh_optix->motion_transform_data.device == this);
bvh_optix->motion_transform_data.alloc_to_device(total_motion_transform_size);
assert(bvh_optix->motion_transform_data->device == this);
bvh_optix->motion_transform_data->alloc_to_device(total_motion_transform_size);
}
for (Object *ob : bvh->objects) {
@@ -1441,7 +1441,7 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
motion_transform_offset = align_up(motion_transform_offset,
OPTIX_TRANSFORM_BYTE_ALIGNMENT);
CUdeviceptr motion_transform_gpu = bvh_optix->motion_transform_data.device_pointer +
CUdeviceptr motion_transform_gpu = bvh_optix->motion_transform_data->device_pointer +
motion_transform_offset;
motion_transform_offset += motion_transform_size;

View File

@@ -23,6 +23,7 @@
# include "device/optix/queue.h"
# include "device/optix/util.h"
# include "kernel/types.h"
# include "util/unique_ptr.h"
CCL_NAMESPACE_BEGIN
@@ -76,7 +77,7 @@ class OptiXDevice : public CUDADevice {
device_only_memory<KernelParamsOptiX> launch_params;
OptixTraversableHandle tlas_handle = 0;
vector<device_only_memory<char>> delayed_free_bvh_memory;
vector<unique_ptr<device_only_memory<char>>> delayed_free_bvh_memory;
thread_mutex delayed_free_bvh_mutex;
class Denoiser {

View File

@@ -3,7 +3,7 @@ This program uses code from various sources, the default license is Apache 2.0
for all code, with the following exceptions.
Modified BSD License
* Code adapated from Open Shading Language
* Code adapted from Open Shading Language
* Sobol direction vectors
* Matrix inversion code from OpenEXR
* MD5 Hash code

View File

@@ -296,13 +296,13 @@ static BufferParams scale_buffer_params(const BufferParams &params, int resoluti
scaled_params.window_x = params.window_x / resolution_divider;
scaled_params.window_y = params.window_y / resolution_divider;
scaled_params.window_width = params.window_width / resolution_divider;
scaled_params.window_height = params.window_height / resolution_divider;
scaled_params.window_width = max(1, params.window_width / resolution_divider);
scaled_params.window_height = max(1, params.window_height / resolution_divider);
scaled_params.full_x = params.full_x / resolution_divider;
scaled_params.full_y = params.full_y / resolution_divider;
scaled_params.full_width = params.full_width / resolution_divider;
scaled_params.full_height = params.full_height / resolution_divider;
scaled_params.full_width = max(1, params.full_width / resolution_divider);
scaled_params.full_height = max(1, params.full_height / resolution_divider);
scaled_params.update_offset_stride();
@@ -850,7 +850,8 @@ void PathTrace::progress_update_if_needed(const RenderWork &render_work)
{
if (progress_ != nullptr) {
const int2 tile_size = get_render_tile_size();
const int num_samples_added = tile_size.x * tile_size.y * render_work.path_trace.num_samples;
const uint64_t num_samples_added = uint64_t(tile_size.x) * tile_size.y *
render_work.path_trace.num_samples;
const int current_sample = render_work.path_trace.start_sample +
render_work.path_trace.num_samples -
render_work.path_trace.sample_offset;

View File

@@ -76,7 +76,7 @@ class PathTraceDisplay {
/* Copy buffer of rendered pixels of a given size into a given position of the texture.
*
* This function does not acquire a lock. The reason for this is is to allow use of this function
* This function does not acquire a lock. The reason for this is to allow use of this function
* for partial updates from different devices. In this case the caller will acquire the lock
* once, update all the slices and release
* the lock once. This will ensure that draw() will never use partially updated texture. */

View File

@@ -840,6 +840,26 @@ int RenderScheduler::get_num_samples_to_path_trace() const
num_samples_to_occupy = lround(state_.occupancy_num_samples * 0.7f / state_.occupancy);
}
/* When time limit is used clamp the calculated number of samples to keep occupancy.
* This is because time limit causes the last render iteration to happen with less number of
* samples, which conflicts with the occupancy (lower number of samples causes lower
* occupancy, also the calculation is based on number of previously rendered samples).
*
* When time limit is not used the number of samples per render iteration is either increasing
* or stays the same, so there is no need to clamp number of samples calculated for occupancy.
*/
if (time_limit_ != 0.0 && state_.start_render_time != 0.0) {
const double remaining_render_time = max(
0.0, time_limit_ - (time_dt() - state_.start_render_time));
const double time_per_sample_average = path_trace_time_.get_average();
const double predicted_render_time = num_samples_to_occupy * time_per_sample_average;
if (predicted_render_time > remaining_render_time) {
num_samples_to_occupy = lround(num_samples_to_occupy *
(remaining_render_time / predicted_render_time));
}
}
num_samples_to_render = max(num_samples_to_render,
min(num_samples_to_occupy, max_num_samples_to_render));
}

View File

@@ -273,6 +273,7 @@ set(SRC_KERNEL_UTIL_HEADERS
)
set(SRC_KERNEL_TYPES_HEADERS
tables.h
textures.h
types.h
)
@@ -410,12 +411,8 @@ if(WITH_CYCLES_CUDA_BINARIES)
-I ${CMAKE_CURRENT_SOURCE_DIR}/..
-I ${CMAKE_CURRENT_SOURCE_DIR}/device/cuda
--use_fast_math
-o ${CMAKE_CURRENT_BINARY_DIR}/${cuda_file})
if(${experimental})
set(cuda_flags ${cuda_flags} -D __KERNEL_EXPERIMENTAL__)
set(name ${name}_experimental)
endif()
-o ${CMAKE_CURRENT_BINARY_DIR}/${cuda_file}
-Wno-deprecated-gpu-targets)
if(WITH_NANOVDB)
set(cuda_flags ${cuda_flags}
@@ -423,6 +420,10 @@ if(WITH_CYCLES_CUDA_BINARIES)
-I "${NANOVDB_INCLUDE_DIR}")
endif()
if(WITH_CYCLES_DEBUG)
set(cuda_flags ${cuda_flags} -D WITH_CYCLES_DEBUG)
endif()
if(WITH_CYCLES_CUBIN_COMPILER)
string(SUBSTRING ${arch} 3 -1 CUDA_ARCH)
@@ -571,13 +572,14 @@ if(WITH_CYCLES_HIP_BINARIES AND WITH_CYCLES_DEVICE_HIP)
-ffast-math
-o ${CMAKE_CURRENT_BINARY_DIR}/${hip_file})
if(${experimental})
set(hip_flags ${hip_flags} -D __KERNEL_EXPERIMENTAL__)
set(name ${name}_experimental)
if(WITH_NANOVDB)
set(hip_flags ${hip_flags}
-D WITH_NANOVDB
-I "${NANOVDB_INCLUDE_DIR}")
endif()
if(WITH_CYCLES_DEBUG)
set(hip_flags ${hip_flags} -D __KERNEL_DEBUG__)
set(hip_flags ${hip_flags} -D WITH_CYCLES_DEBUG)
endif()
add_custom_command(
@@ -618,6 +620,10 @@ if(WITH_CYCLES_DEVICE_OPTIX AND WITH_CYCLES_CUDA_BINARIES)
-I "${NANOVDB_INCLUDE_DIR}")
endif()
if(WITH_CYCLES_DEBUG)
set(cuda_flags ${cuda_flags} -D WITH_CYCLES_DEBUG)
endif()
if(WITH_CYCLES_CUBIN_COMPILER)
# Needed to find libnvrtc-builtins.so. Can't do it from inside
# cycles_cubin_cc since the env variable is read before main()
@@ -706,7 +712,7 @@ if(WITH_COMPILER_ASAN)
string(APPEND CMAKE_CXX_FLAGS_RELWITHDEBINFO " -fno-sanitize=all")
string(APPEND CMAKE_CXX_FLAGS_DEBUG " -fno-sanitize=vptr")
elseif(CMAKE_C_COMPILER_ID MATCHES "Clang")
# With OSL, Cycles disables rtti in some modules, wich then breaks at linking
# With OSL, Cycles disables rtti in some modules, which then breaks at linking
# when trying to use vptr sanitizer (included into 'undefined' general option).
string(APPEND CMAKE_CXX_FLAGS_RELWITHDEBINFO " -fno-sanitize=vptr")
string(APPEND CMAKE_CXX_FLAGS_DEBUG " -fno-sanitize=vptr")

View File

@@ -97,7 +97,7 @@ ccl_device_inline void sort_intersections_and_normals(ccl_private Intersection *
swapped = false;
for (int j = 0; j < num_hits - 1; ++j) {
if (hits[j].t > hits[j + 1].t) {
struct Intersection tmp_hit = hits[j];
Intersection tmp_hit = hits[j];
float3 tmp_Ng = Ng[j];
hits[j] = hits[j + 1];
Ng[j] = Ng[j + 1];

View File

@@ -438,7 +438,7 @@ ccl_device_inline int bsdf_sample(KernelGlobals kg,
if (label & LABEL_TRANSMIT) {
float threshold_squared = kernel_data.background.transparent_roughness_squared_threshold;
if (threshold_squared >= 0.0f) {
if (threshold_squared >= 0.0f && !(label & LABEL_DIFFUSE)) {
if (bsdf_get_specular_roughness_squared(sc) <= threshold_squared) {
label |= LABEL_TRANSMIT_TRANSPARENT;
}

View File

@@ -18,6 +18,7 @@
#pragma once
#include "kernel/tables.h"
#include "kernel/types.h"
#include "kernel/util/profiling.h"

View File

@@ -52,8 +52,9 @@ typedef unsigned long long uint64_t;
#endif
#define ccl_device_noinline __device__ __noinline__
#define ccl_device_noinline_cpu ccl_device
#define ccl_device_inline_method ccl_device
#define ccl_global
#define ccl_static_constant __constant__
#define ccl_inline_constant __constant__
#define ccl_device_constant __constant__ __device__
#define ccl_constant const
#define ccl_gpu_shared __shared__
@@ -85,7 +86,6 @@ typedef unsigned long long uint64_t;
#define ccl_gpu_syncthreads() __syncthreads()
#define ccl_gpu_ballot(predicate) __ballot_sync(0xFFFFFFFF, predicate)
#define ccl_gpu_shfl_down_sync(mask, var, detla) __shfl_down_sync(mask, var, detla)
#define ccl_gpu_popc(x) __popc(x)
/* GPU texture objects */

View File

@@ -21,6 +21,9 @@
#include "kernel/device/gpu/parallel_sorted_index.h"
#include "kernel/device/gpu/work_stealing.h"
/* Include constant tables before entering Metal's context class scope (context_begin.h) */
#include "kernel/tables.h"
#ifdef __KERNEL_METAL__
# include "kernel/device/metal/context_begin.h"
#endif
@@ -464,7 +467,7 @@ ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
const auto num_active_pixels_mask = ccl_gpu_ballot(!converged);
const int lane_id = ccl_gpu_thread_idx_x % ccl_gpu_warp_size;
if (lane_id == 0) {
atomic_fetch_and_add_uint32(num_active_pixels, ccl_gpu_popc(num_active_pixels_mask));
atomic_fetch_and_add_uint32(num_active_pixels, popcount(num_active_pixels_mask));
}
}
@@ -892,6 +895,6 @@ ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
const auto can_split_mask = ccl_gpu_ballot(can_split);
const int lane_id = ccl_gpu_thread_idx_x % ccl_gpu_warp_size;
if (lane_id == 0) {
atomic_fetch_and_add_uint32(num_possible_splits, ccl_gpu_popc(can_split_mask));
atomic_fetch_and_add_uint32(num_possible_splits, popcount(can_split_mask));
}
}

View File

@@ -85,8 +85,8 @@ __device__ void gpu_parallel_active_index_array(const uint num_states,
const uint is_active = (state_index < num_states) ? is_active_op(state_index) : 0;
/* For each thread within a warp compute how many other active states precede it. */
const uint thread_offset = ccl_gpu_popc(ccl_gpu_ballot(is_active) &
ccl_gpu_thread_mask(thread_warp));
const uint thread_offset = popcount(ccl_gpu_ballot(is_active) &
ccl_gpu_thread_mask(thread_warp));
/* Last thread in warp stores number of active states for each warp. */
if (thread_warp == ccl_gpu_warp_size - 1) {

View File

@@ -45,8 +45,9 @@ typedef unsigned long long uint64_t;
#define ccl_device_forceinline __device__ __forceinline__
#define ccl_device_noinline __device__ __noinline__
#define ccl_device_noinline_cpu ccl_device
#define ccl_device_inline_method ccl_device
#define ccl_global
#define ccl_static_constant __constant__
#define ccl_inline_constant __constant__
#define ccl_device_constant __constant__ __device__
#define ccl_constant const
#define ccl_gpu_shared __shared__
@@ -84,7 +85,6 @@ typedef unsigned long long uint64_t;
#define ccl_gpu_syncthreads() __syncthreads()
#define ccl_gpu_ballot(predicate) __ballot(predicate)
#define ccl_gpu_shfl_down_sync(mask, var, detla) __shfl_down(var, detla)
#define ccl_gpu_popc(x) __popc(x)
/* GPU texture objects */
typedef hipTextureObject_t ccl_gpu_tex_object;

View File

@@ -34,6 +34,7 @@ using namespace metal;
#pragma clang diagnostic ignored "-Wunused-variable"
#pragma clang diagnostic ignored "-Wsign-compare"
#pragma clang diagnostic ignored "-Wuninitialized"
/* Qualifiers */
@@ -42,8 +43,9 @@ using namespace metal;
#define ccl_device_forceinline ccl_device
#define ccl_device_noinline ccl_device __attribute__((noinline))
#define ccl_device_noinline_cpu ccl_device
#define ccl_device_inline_method ccl_device
#define ccl_global device
#define ccl_static_constant static constant constexpr
#define ccl_inline_constant static constant constexpr
#define ccl_device_constant constant
#define ccl_constant const device
#define ccl_gpu_shared threadgroup
@@ -64,7 +66,7 @@ using namespace metal;
#define ccl_gpu_thread_mask(thread_warp) uint64_t((1ull << thread_warp) - 1)
#define ccl_gpu_ballot(predicate) ((uint64_t)((simd_vote::vote_t)simd_ballot(predicate)))
#define ccl_gpu_popc(x) popcount(x)
#define ccl_gpu_syncthreads() threadgroup_barrier(mem_flags::mem_threadgroup);
// clang-format off
@@ -123,7 +125,6 @@ kernel void kernel_metal_##name(device const kernel_gpu_##name *params_struct, \
uint simd_group_index [[simdgroup_index_in_threadgroup]], \
uint num_simd_groups [[simdgroups_per_threadgroup]]) { \
MetalKernelContext context(_launch_params_metal, _metal_ancillaries); \
INIT_DEBUG_BUFFER \
params_struct->run(context, simdgroup_offset, metal_global_id, metal_local_id, metal_local_size, simdgroup_size, simd_lane_index, simd_group_index, num_simd_groups); \
} \
void kernel_gpu_##name::run(thread MetalKernelContext& context, \
@@ -150,6 +151,31 @@ void kernel_gpu_##name::run(thread MetalKernelContext& context, \
// clang-format on
/* volumetric lambda functions - use function objects for lambda-like functionality */
#define VOLUME_READ_LAMBDA(function_call) \
struct FnObjectRead { \
KernelGlobals kg; \
ccl_private MetalKernelContext *context; \
int state; \
\
VolumeStack operator()(const int i) const \
{ \
return context->function_call; \
} \
} volume_read_lambda_pass{kg, this, state};
#define VOLUME_WRITE_LAMBDA(function_call) \
struct FnObjectWrite { \
KernelGlobals kg; \
ccl_private MetalKernelContext *context; \
int state; \
\
void operator()(const int i, VolumeStack entry) const \
{ \
context->function_call; \
} \
} volume_write_lambda_pass{kg, this, state};
/* make_type definitions with Metal style element initializers */
#ifdef make_float2
# undef make_float2
@@ -204,6 +230,7 @@ void kernel_gpu_##name::run(thread MetalKernelContext& context, \
#define sinhf(x) sinh(float(x))
#define coshf(x) cosh(float(x))
#define tanhf(x) tanh(float(x))
#define saturatef(x) saturate(float(x))
/* Use native functions with possibly lower precision for performance,
* no issues found so far. */
@@ -217,6 +244,8 @@ void kernel_gpu_##name::run(thread MetalKernelContext& context, \
#define NULL 0
#define __device__
/* texture bindings and sampler setup */
struct Texture2DParamsMetal {
@@ -231,6 +260,9 @@ struct MetalAncillaries {
device Texture3DParamsMetal *textures_3d;
};
#include "util/half.h"
#include "util/types.h"
enum SamplerType {
SamplerFilterNearest_AddressRepeat,
SamplerFilterNearest_AddressClampEdge,

View File

@@ -25,7 +25,7 @@ CCL_NAMESPACE_BEGIN
typedef struct KernelParamsMetal {
#define KERNEL_TEX(type, name) ccl_constant type *name;
#define KERNEL_TEX(type, name) ccl_global const type *name;
#include "kernel/textures.h"
#undef KERNEL_TEX

View File

@@ -49,10 +49,11 @@ typedef unsigned long long uint64_t;
__device__ __forceinline__ // Function calls are bad for OptiX performance, so inline everything
#define ccl_device_inline ccl_device
#define ccl_device_forceinline ccl_device
#define ccl_device_inline_method ccl_device
#define ccl_device_noinline __device__ __noinline__
#define ccl_device_noinline_cpu ccl_device
#define ccl_global
#define ccl_static_constant __constant__
#define ccl_inline_constant __constant__
#define ccl_device_constant __constant__ __device__
#define ccl_constant const
#define ccl_gpu_shared __shared__
@@ -86,7 +87,6 @@ typedef unsigned long long uint64_t;
#define ccl_gpu_syncthreads() __syncthreads()
#define ccl_gpu_ballot(predicate) __ballot_sync(0xFFFFFFFF, predicate)
#define ccl_gpu_shfl_down_sync(mask, var, detla) __shfl_down_sync(mask, var, detla)
#define ccl_gpu_popc(x) __popc(x)
/* GPU texture objects */

View File

@@ -21,6 +21,8 @@
#include "kernel/device/gpu/image.h" /* Texture lookup uses normal CUDA intrinsics. */
#include "kernel/tables.h"
#include "kernel/integrator/state.h"
#include "kernel/integrator/state_flow.h"
#include "kernel/integrator/state_util.h"
@@ -44,7 +46,7 @@ template<typename T> ccl_device_forceinline T *get_payload_ptr_2()
ccl_device_forceinline int get_object_id()
{
#ifdef __OBJECT_MOTION__
/* Always get the the instance ID from the TLAS
/* Always get the instance ID from the TLAS
* There might be a motion transform node between TLAS and BLAS which does not have one. */
return optixGetInstanceIdFromHandle(optixGetTransformListHandle(0));
#else
@@ -159,9 +161,9 @@ extern "C" __global__ void __anyhit__kernel_optix_local_hit()
/* Record geometric normal. */
const uint tri_vindex = kernel_tex_fetch(__tri_vindex, prim).w;
const float3 tri_a = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex + 0));
const float3 tri_b = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex + 1));
const float3 tri_c = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex + 2));
const float3 tri_a = kernel_tex_fetch(__tri_verts, tri_vindex + 0);
const float3 tri_b = kernel_tex_fetch(__tri_verts, tri_vindex + 1);
const float3 tri_c = kernel_tex_fetch(__tri_verts, tri_vindex + 2);
local_isect->Ng[hit] = normalize(cross(tri_b - tri_a, tri_c - tri_a));
/* Continue tracing (without this the trace call would return after the first hit). */

View File

@@ -160,7 +160,8 @@ ccl_device_inline int kernel_accum_sample(KernelGlobals kg,
ccl_global float *buffer = kernel_accum_pixel_render_buffer(kg, state, render_buffer);
return atomic_fetch_and_add_uint32((uint *)(buffer) + kernel_data.film.pass_sample_count, 1) +
return atomic_fetch_and_add_uint32(
(ccl_global uint *)(buffer) + kernel_data.film.pass_sample_count, 1) +
sample_offset;
}
@@ -501,7 +502,7 @@ ccl_device_inline void kernel_accum_light(KernelGlobals kg,
/* Write shadow pass. */
if (kernel_data.film.pass_shadow != PASS_UNUSED && (path_flag & PATH_RAY_SHADOW_FOR_LIGHT) &&
(path_flag & PATH_RAY_CAMERA)) {
(path_flag & PATH_RAY_TRANSPARENT_BACKGROUND)) {
const float3 unshadowed_throughput = INTEGRATOR_STATE(
state, shadow_path, unshadowed_throughput);
const float3 shadowed_throughput = INTEGRATOR_STATE(state, shadow_path, throughput);
@@ -552,7 +553,7 @@ ccl_device_inline void kernel_accum_background(KernelGlobals kg,
const bool is_transparent_background_ray,
ccl_global float *ccl_restrict render_buffer)
{
float3 contribution = INTEGRATOR_STATE(state, path, throughput) * L;
float3 contribution = float3(INTEGRATOR_STATE(state, path, throughput)) * L;
kernel_accum_clamp(kg, &contribution, INTEGRATOR_STATE(state, path, bounce) - 1);
ccl_global float *buffer = kernel_accum_pixel_render_buffer(kg, state, render_buffer);

View File

@@ -177,7 +177,7 @@ ccl_device_inline void kernel_write_data_passes(KernelGlobals kg,
#ifdef __PASSES__
const uint32_t path_flag = INTEGRATOR_STATE(state, path, flag);
if (!(path_flag & PATH_RAY_CAMERA)) {
if (!(path_flag & PATH_RAY_TRANSPARENT_BACKGROUND)) {
return;
}

View File

@@ -27,7 +27,12 @@ CCL_NAMESPACE_BEGIN
* Lookup of attributes is different between OSL and SVM, as OSL is ustring
* based while for SVM we use integer ids. */
ccl_device_inline uint subd_triangle_patch(KernelGlobals kg, ccl_private const ShaderData *sd);
/* Patch index for triangle, -1 if not subdivision triangle */
ccl_device_inline uint subd_triangle_patch(KernelGlobals kg, ccl_private const ShaderData *sd)
{
return (sd->prim != PRIM_NONE) ? kernel_tex_fetch(__tri_patch, sd->prim) : ~0;
}
ccl_device_inline uint attribute_primitive_type(KernelGlobals kg, ccl_private const ShaderData *sd)
{
@@ -106,9 +111,9 @@ ccl_device Transform primitive_attribute_matrix(KernelGlobals kg,
{
Transform tfm;
tfm.x = kernel_tex_fetch(__attributes_float3, desc.offset + 0);
tfm.y = kernel_tex_fetch(__attributes_float3, desc.offset + 1);
tfm.z = kernel_tex_fetch(__attributes_float3, desc.offset + 2);
tfm.x = kernel_tex_fetch(__attributes_float4, desc.offset + 0);
tfm.y = kernel_tex_fetch(__attributes_float4, desc.offset + 1);
tfm.z = kernel_tex_fetch(__attributes_float4, desc.offset + 2);
return tfm;
}

View File

@@ -126,8 +126,8 @@ ccl_device float3 curve_attribute_float3(KernelGlobals kg,
int k0 = curve.first_key + PRIMITIVE_UNPACK_SEGMENT(sd->type);
int k1 = k0 + 1;
float3 f0 = float4_to_float3(kernel_tex_fetch(__attributes_float3, desc.offset + k0));
float3 f1 = float4_to_float3(kernel_tex_fetch(__attributes_float3, desc.offset + k1));
float3 f0 = kernel_tex_fetch(__attributes_float3, desc.offset + k0);
float3 f1 = kernel_tex_fetch(__attributes_float3, desc.offset + k1);
# ifdef __RAY_DIFFERENTIALS__
if (dx)
@@ -149,7 +149,7 @@ ccl_device float3 curve_attribute_float3(KernelGlobals kg,
if (desc.element & (ATTR_ELEMENT_CURVE | ATTR_ELEMENT_OBJECT | ATTR_ELEMENT_MESH)) {
const int offset = (desc.element == ATTR_ELEMENT_CURVE) ? desc.offset + sd->prim :
desc.offset;
return float4_to_float3(kernel_tex_fetch(__attributes_float3, offset));
return kernel_tex_fetch(__attributes_float3, offset);
}
else {
return make_float3(0.0f, 0.0f, 0.0f);
@@ -168,8 +168,8 @@ ccl_device float4 curve_attribute_float4(KernelGlobals kg,
int k0 = curve.first_key + PRIMITIVE_UNPACK_SEGMENT(sd->type);
int k1 = k0 + 1;
float4 f0 = kernel_tex_fetch(__attributes_float3, desc.offset + k0);
float4 f1 = kernel_tex_fetch(__attributes_float3, desc.offset + k1);
float4 f0 = kernel_tex_fetch(__attributes_float4, desc.offset + k0);
float4 f1 = kernel_tex_fetch(__attributes_float4, desc.offset + k1);
# ifdef __RAY_DIFFERENTIALS__
if (dx)
@@ -191,7 +191,7 @@ ccl_device float4 curve_attribute_float4(KernelGlobals kg,
if (desc.element & (ATTR_ELEMENT_CURVE | ATTR_ELEMENT_OBJECT | ATTR_ELEMENT_MESH)) {
const int offset = (desc.element == ATTR_ELEMENT_CURVE) ? desc.offset + sd->prim :
desc.offset;
return kernel_tex_fetch(__attributes_float3, offset);
return kernel_tex_fetch(__attributes_float4, offset);
}
else {
return make_float4(0.0f, 0.0f, 0.0f, 0.0f);

View File

@@ -48,8 +48,8 @@ ccl_device_inline void motion_curve_keys_for_step_linear(KernelGlobals kg,
offset += step * numkeys;
keys[0] = kernel_tex_fetch(__attributes_float3, offset + k0);
keys[1] = kernel_tex_fetch(__attributes_float3, offset + k1);
keys[0] = kernel_tex_fetch(__attributes_float4, offset + k0);
keys[1] = kernel_tex_fetch(__attributes_float4, offset + k1);
}
}
@@ -106,10 +106,10 @@ ccl_device_inline void motion_curve_keys_for_step(KernelGlobals kg,
offset += step * numkeys;
keys[0] = kernel_tex_fetch(__attributes_float3, offset + k0);
keys[1] = kernel_tex_fetch(__attributes_float3, offset + k1);
keys[2] = kernel_tex_fetch(__attributes_float3, offset + k2);
keys[3] = kernel_tex_fetch(__attributes_float3, offset + k3);
keys[0] = kernel_tex_fetch(__attributes_float4, offset + k0);
keys[1] = kernel_tex_fetch(__attributes_float4, offset + k1);
keys[2] = kernel_tex_fetch(__attributes_float4, offset + k2);
keys[3] = kernel_tex_fetch(__attributes_float4, offset + k3);
}
}

View File

@@ -43,9 +43,9 @@ ccl_device_inline void motion_triangle_verts_for_step(KernelGlobals kg,
{
if (step == numsteps) {
/* center step: regular vertex location */
verts[0] = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 0));
verts[1] = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 1));
verts[2] = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 2));
verts[0] = kernel_tex_fetch(__tri_verts, tri_vindex.w + 0);
verts[1] = kernel_tex_fetch(__tri_verts, tri_vindex.w + 1);
verts[2] = kernel_tex_fetch(__tri_verts, tri_vindex.w + 2);
}
else {
/* center step not store in this array */
@@ -54,9 +54,9 @@ ccl_device_inline void motion_triangle_verts_for_step(KernelGlobals kg,
offset += step * numverts;
verts[0] = float4_to_float3(kernel_tex_fetch(__attributes_float3, offset + tri_vindex.x));
verts[1] = float4_to_float3(kernel_tex_fetch(__attributes_float3, offset + tri_vindex.y));
verts[2] = float4_to_float3(kernel_tex_fetch(__attributes_float3, offset + tri_vindex.z));
verts[0] = kernel_tex_fetch(__attributes_float3, offset + tri_vindex.x);
verts[1] = kernel_tex_fetch(__attributes_float3, offset + tri_vindex.y);
verts[2] = kernel_tex_fetch(__attributes_float3, offset + tri_vindex.z);
}
}
@@ -70,9 +70,9 @@ ccl_device_inline void motion_triangle_normals_for_step(KernelGlobals kg,
{
if (step == numsteps) {
/* center step: regular vertex location */
normals[0] = float4_to_float3(kernel_tex_fetch(__tri_vnormal, tri_vindex.x));
normals[1] = float4_to_float3(kernel_tex_fetch(__tri_vnormal, tri_vindex.y));
normals[2] = float4_to_float3(kernel_tex_fetch(__tri_vnormal, tri_vindex.z));
normals[0] = kernel_tex_fetch(__tri_vnormal, tri_vindex.x);
normals[1] = kernel_tex_fetch(__tri_vnormal, tri_vindex.y);
normals[2] = kernel_tex_fetch(__tri_vnormal, tri_vindex.z);
}
else {
/* center step is not stored in this array */
@@ -81,9 +81,9 @@ ccl_device_inline void motion_triangle_normals_for_step(KernelGlobals kg,
offset += step * numverts;
normals[0] = float4_to_float3(kernel_tex_fetch(__attributes_float3, offset + tri_vindex.x));
normals[1] = float4_to_float3(kernel_tex_fetch(__attributes_float3, offset + tri_vindex.y));
normals[2] = float4_to_float3(kernel_tex_fetch(__attributes_float3, offset + tri_vindex.z));
normals[0] = kernel_tex_fetch(__attributes_float3, offset + tri_vindex.x);
normals[1] = kernel_tex_fetch(__attributes_float3, offset + tri_vindex.y);
normals[2] = kernel_tex_fetch(__attributes_float3, offset + tri_vindex.z);
}
}

View File

@@ -163,19 +163,7 @@ ccl_device_inline bool motion_triangle_intersect(KernelGlobals kg,
motion_triangle_vertices(kg, fobject, prim, time, verts);
/* Ray-triangle intersection, unoptimized. */
float t, u, v;
if (ray_triangle_intersect(P,
dir,
tmax,
#if defined(__KERNEL_SSE2__) && defined(__KERNEL_SSE__)
(ssef *)verts,
#else
verts[0],
verts[1],
verts[2],
#endif
&u,
&v,
&t)) {
if (ray_triangle_intersect(P, dir, tmax, verts[0], verts[1], verts[2], &u, &v, &t)) {
#ifdef __VISIBILITY_FLAG__
/* Visibility flag test. we do it here under the assumption
* that most triangles are culled by node flags.
@@ -229,19 +217,7 @@ ccl_device_inline bool motion_triangle_intersect_local(KernelGlobals kg,
motion_triangle_vertices(kg, local_object, prim, time, verts);
/* Ray-triangle intersection, unoptimized. */
float t, u, v;
if (!ray_triangle_intersect(P,
dir,
tmax,
# if defined(__KERNEL_SSE2__) && defined(__KERNEL_SSE__)
(ssef *)verts,
# else
verts[0],
verts[1],
verts[2],
# endif
&u,
&v,
&t)) {
if (!ray_triangle_intersect(P, dir, tmax, verts[0], verts[1], verts[2], &u, &v, &t)) {
return false;
}

View File

@@ -380,7 +380,7 @@ ccl_device float3 patch_eval_float3(KernelGlobals kg,
*dv = make_float3(0.0f, 0.0f, 0.0f);
for (int i = 0; i < num_control; i++) {
float3 v = float4_to_float3(kernel_tex_fetch(__attributes_float3, offset + indices[i]));
float3 v = kernel_tex_fetch(__attributes_float3, offset + indices[i]);
val += v * weights[i];
if (du)
@@ -417,7 +417,7 @@ ccl_device float4 patch_eval_float4(KernelGlobals kg,
*dv = make_float4(0.0f, 0.0f, 0.0f, 0.0f);
for (int i = 0; i < num_control; i++) {
float4 v = kernel_tex_fetch(__attributes_float3, offset + indices[i]);
float4 v = kernel_tex_fetch(__attributes_float4, offset + indices[i]);
val += v * weights[i];
if (du)

View File

@@ -284,18 +284,33 @@ ccl_device_inline float4 primitive_motion_vector(KernelGlobals kg,
int numverts, numkeys;
object_motion_info(kg, sd->object, NULL, &numverts, &numkeys);
/* lookup attributes */
motion_pre = primitive_surface_attribute_float3(kg, sd, desc, NULL, NULL);
desc.offset += (sd->type & PRIMITIVE_ALL_TRIANGLE) ? numverts : numkeys;
motion_post = primitive_surface_attribute_float3(kg, sd, desc, NULL, NULL);
#ifdef __HAIR__
if (is_curve_primitive && (sd->object_flag & SD_OBJECT_HAS_VERTEX_MOTION) == 0) {
object_position_transform(kg, sd, &motion_pre);
object_position_transform(kg, sd, &motion_post);
if (is_curve_primitive) {
motion_pre = float4_to_float3(curve_attribute_float4(kg, sd, desc, NULL, NULL));
desc.offset += numkeys;
motion_post = float4_to_float3(curve_attribute_float4(kg, sd, desc, NULL, NULL));
/* Curve */
if ((sd->object_flag & SD_OBJECT_HAS_VERTEX_MOTION) == 0) {
object_position_transform(kg, sd, &motion_pre);
object_position_transform(kg, sd, &motion_post);
}
}
else
#endif
if (sd->type & PRIMITIVE_ALL_TRIANGLE) {
/* Triangle */
if (subd_triangle_patch(kg, sd) == ~0) {
motion_pre = triangle_attribute_float3(kg, sd, desc, NULL, NULL);
desc.offset += numverts;
motion_post = triangle_attribute_float3(kg, sd, desc, NULL, NULL);
}
else {
motion_pre = subd_triangle_attribute_float3(kg, sd, desc, NULL, NULL);
desc.offset += numverts;
motion_post = subd_triangle_attribute_float3(kg, sd, desc, NULL, NULL);
}
}
}
/* object motion. note that depending on the mesh having motion vectors, this

View File

@@ -20,13 +20,6 @@
CCL_NAMESPACE_BEGIN
/* Patch index for triangle, -1 if not subdivision triangle */
ccl_device_inline uint subd_triangle_patch(KernelGlobals kg, ccl_private const ShaderData *sd)
{
return (sd->prim != PRIM_NONE) ? kernel_tex_fetch(__tri_patch, sd->prim) : ~0;
}
/* UV coords of triangle within patch */
ccl_device_inline void subd_triangle_patch_uv(KernelGlobals kg,
@@ -443,8 +436,8 @@ ccl_device_noinline float3 subd_triangle_attribute_float3(KernelGlobals kg,
if (dy)
*dy = make_float3(0.0f, 0.0f, 0.0f);
return float4_to_float3(
kernel_tex_fetch(__attributes_float3, desc.offset + subd_triangle_patch_face(kg, patch)));
return kernel_tex_fetch(__attributes_float3,
desc.offset + subd_triangle_patch_face(kg, patch));
}
else if (desc.element == ATTR_ELEMENT_VERTEX || desc.element == ATTR_ELEMENT_VERTEX_MOTION) {
float2 uv[3];
@@ -452,10 +445,10 @@ ccl_device_noinline float3 subd_triangle_attribute_float3(KernelGlobals kg,
uint4 v = subd_triangle_patch_indices(kg, patch);
float3 f0 = float4_to_float3(kernel_tex_fetch(__attributes_float3, desc.offset + v.x));
float3 f1 = float4_to_float3(kernel_tex_fetch(__attributes_float3, desc.offset + v.y));
float3 f2 = float4_to_float3(kernel_tex_fetch(__attributes_float3, desc.offset + v.z));
float3 f3 = float4_to_float3(kernel_tex_fetch(__attributes_float3, desc.offset + v.w));
float3 f0 = kernel_tex_fetch(__attributes_float3, desc.offset + v.x);
float3 f1 = kernel_tex_fetch(__attributes_float3, desc.offset + v.y);
float3 f2 = kernel_tex_fetch(__attributes_float3, desc.offset + v.z);
float3 f3 = kernel_tex_fetch(__attributes_float3, desc.offset + v.w);
if (subd_triangle_patch_num_corners(kg, patch) != 4) {
f1 = (f1 + f0) * 0.5f;
@@ -484,10 +477,10 @@ ccl_device_noinline float3 subd_triangle_attribute_float3(KernelGlobals kg,
float3 f0, f1, f2, f3;
f0 = float4_to_float3(kernel_tex_fetch(__attributes_float3, corners[0] + desc.offset));
f1 = float4_to_float3(kernel_tex_fetch(__attributes_float3, corners[1] + desc.offset));
f2 = float4_to_float3(kernel_tex_fetch(__attributes_float3, corners[2] + desc.offset));
f3 = float4_to_float3(kernel_tex_fetch(__attributes_float3, corners[3] + desc.offset));
f0 = kernel_tex_fetch(__attributes_float3, corners[0] + desc.offset);
f1 = kernel_tex_fetch(__attributes_float3, corners[1] + desc.offset);
f2 = kernel_tex_fetch(__attributes_float3, corners[2] + desc.offset);
f3 = kernel_tex_fetch(__attributes_float3, corners[3] + desc.offset);
if (subd_triangle_patch_num_corners(kg, patch) != 4) {
f1 = (f1 + f0) * 0.5f;
@@ -513,7 +506,7 @@ ccl_device_noinline float3 subd_triangle_attribute_float3(KernelGlobals kg,
if (dy)
*dy = make_float3(0.0f, 0.0f, 0.0f);
return float4_to_float3(kernel_tex_fetch(__attributes_float3, desc.offset));
return kernel_tex_fetch(__attributes_float3, desc.offset);
}
else {
if (dx)
@@ -590,7 +583,7 @@ ccl_device_noinline float4 subd_triangle_attribute_float4(KernelGlobals kg,
if (dy)
*dy = make_float4(0.0f, 0.0f, 0.0f, 0.0f);
return kernel_tex_fetch(__attributes_float3,
return kernel_tex_fetch(__attributes_float4,
desc.offset + subd_triangle_patch_face(kg, patch));
}
else if (desc.element == ATTR_ELEMENT_VERTEX || desc.element == ATTR_ELEMENT_VERTEX_MOTION) {
@@ -599,10 +592,10 @@ ccl_device_noinline float4 subd_triangle_attribute_float4(KernelGlobals kg,
uint4 v = subd_triangle_patch_indices(kg, patch);
float4 f0 = kernel_tex_fetch(__attributes_float3, desc.offset + v.x);
float4 f1 = kernel_tex_fetch(__attributes_float3, desc.offset + v.y);
float4 f2 = kernel_tex_fetch(__attributes_float3, desc.offset + v.z);
float4 f3 = kernel_tex_fetch(__attributes_float3, desc.offset + v.w);
float4 f0 = kernel_tex_fetch(__attributes_float4, desc.offset + v.x);
float4 f1 = kernel_tex_fetch(__attributes_float4, desc.offset + v.y);
float4 f2 = kernel_tex_fetch(__attributes_float4, desc.offset + v.z);
float4 f3 = kernel_tex_fetch(__attributes_float4, desc.offset + v.w);
if (subd_triangle_patch_num_corners(kg, patch) != 4) {
f1 = (f1 + f0) * 0.5f;
@@ -642,10 +635,10 @@ ccl_device_noinline float4 subd_triangle_attribute_float4(KernelGlobals kg,
color_uchar4_to_float4(kernel_tex_fetch(__attributes_uchar4, corners[3] + desc.offset)));
}
else {
f0 = kernel_tex_fetch(__attributes_float3, corners[0] + desc.offset);
f1 = kernel_tex_fetch(__attributes_float3, corners[1] + desc.offset);
f2 = kernel_tex_fetch(__attributes_float3, corners[2] + desc.offset);
f3 = kernel_tex_fetch(__attributes_float3, corners[3] + desc.offset);
f0 = kernel_tex_fetch(__attributes_float4, corners[0] + desc.offset);
f1 = kernel_tex_fetch(__attributes_float4, corners[1] + desc.offset);
f2 = kernel_tex_fetch(__attributes_float4, corners[2] + desc.offset);
f3 = kernel_tex_fetch(__attributes_float4, corners[3] + desc.offset);
}
if (subd_triangle_patch_num_corners(kg, patch) != 4) {
@@ -672,7 +665,7 @@ ccl_device_noinline float4 subd_triangle_attribute_float4(KernelGlobals kg,
if (dy)
*dy = make_float4(0.0f, 0.0f, 0.0f, 0.0f);
return kernel_tex_fetch(__attributes_float3, desc.offset);
return kernel_tex_fetch(__attributes_float4, desc.offset);
}
else {
if (dx)

View File

@@ -29,9 +29,9 @@ ccl_device_inline float3 triangle_normal(KernelGlobals kg, ccl_private ShaderDat
{
/* load triangle vertices */
const uint4 tri_vindex = kernel_tex_fetch(__tri_vindex, sd->prim);
const float3 v0 = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 0));
const float3 v1 = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 1));
const float3 v2 = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 2));
const float3 v0 = kernel_tex_fetch(__tri_verts, tri_vindex.w + 0);
const float3 v1 = kernel_tex_fetch(__tri_verts, tri_vindex.w + 1);
const float3 v2 = kernel_tex_fetch(__tri_verts, tri_vindex.w + 2);
/* return normal */
if (sd->object_flag & SD_OBJECT_NEGATIVE_SCALE_APPLIED) {
@@ -54,9 +54,9 @@ ccl_device_inline void triangle_point_normal(KernelGlobals kg,
{
/* load triangle vertices */
const uint4 tri_vindex = kernel_tex_fetch(__tri_vindex, prim);
float3 v0 = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 0));
float3 v1 = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 1));
float3 v2 = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 2));
float3 v0 = kernel_tex_fetch(__tri_verts, tri_vindex.w + 0);
float3 v1 = kernel_tex_fetch(__tri_verts, tri_vindex.w + 1);
float3 v2 = kernel_tex_fetch(__tri_verts, tri_vindex.w + 2);
/* compute point */
float t = 1.0f - u - v;
*P = (u * v0 + v * v1 + t * v2);
@@ -78,9 +78,9 @@ ccl_device_inline void triangle_point_normal(KernelGlobals kg,
ccl_device_inline void triangle_vertices(KernelGlobals kg, int prim, float3 P[3])
{
const uint4 tri_vindex = kernel_tex_fetch(__tri_vindex, prim);
P[0] = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 0));
P[1] = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 1));
P[2] = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 2));
P[0] = kernel_tex_fetch(__tri_verts, tri_vindex.w + 0);
P[1] = kernel_tex_fetch(__tri_verts, tri_vindex.w + 1);
P[2] = kernel_tex_fetch(__tri_verts, tri_vindex.w + 2);
}
/* Triangle vertex locations and vertex normals */
@@ -91,12 +91,12 @@ ccl_device_inline void triangle_vertices_and_normals(KernelGlobals kg,
float3 N[3])
{
const uint4 tri_vindex = kernel_tex_fetch(__tri_vindex, prim);
P[0] = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 0));
P[1] = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 1));
P[2] = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 2));
N[0] = float4_to_float3(kernel_tex_fetch(__tri_vnormal, tri_vindex.x));
N[1] = float4_to_float3(kernel_tex_fetch(__tri_vnormal, tri_vindex.y));
N[2] = float4_to_float3(kernel_tex_fetch(__tri_vnormal, tri_vindex.z));
P[0] = kernel_tex_fetch(__tri_verts, tri_vindex.w + 0);
P[1] = kernel_tex_fetch(__tri_verts, tri_vindex.w + 1);
P[2] = kernel_tex_fetch(__tri_verts, tri_vindex.w + 2);
N[0] = kernel_tex_fetch(__tri_vnormal, tri_vindex.x);
N[1] = kernel_tex_fetch(__tri_vnormal, tri_vindex.y);
N[2] = kernel_tex_fetch(__tri_vnormal, tri_vindex.z);
}
/* Interpolate smooth vertex normal from vertices */
@@ -106,9 +106,9 @@ triangle_smooth_normal(KernelGlobals kg, float3 Ng, int prim, float u, float v)
{
/* load triangle vertices */
const uint4 tri_vindex = kernel_tex_fetch(__tri_vindex, prim);
float3 n0 = float4_to_float3(kernel_tex_fetch(__tri_vnormal, tri_vindex.x));
float3 n1 = float4_to_float3(kernel_tex_fetch(__tri_vnormal, tri_vindex.y));
float3 n2 = float4_to_float3(kernel_tex_fetch(__tri_vnormal, tri_vindex.z));
float3 n0 = kernel_tex_fetch(__tri_vnormal, tri_vindex.x);
float3 n1 = kernel_tex_fetch(__tri_vnormal, tri_vindex.y);
float3 n2 = kernel_tex_fetch(__tri_vnormal, tri_vindex.z);
float3 N = safe_normalize((1.0f - u - v) * n2 + u * n0 + v * n1);
@@ -120,9 +120,9 @@ ccl_device_inline float3 triangle_smooth_normal_unnormalized(
{
/* load triangle vertices */
const uint4 tri_vindex = kernel_tex_fetch(__tri_vindex, prim);
float3 n0 = float4_to_float3(kernel_tex_fetch(__tri_vnormal, tri_vindex.x));
float3 n1 = float4_to_float3(kernel_tex_fetch(__tri_vnormal, tri_vindex.y));
float3 n2 = float4_to_float3(kernel_tex_fetch(__tri_vnormal, tri_vindex.z));
float3 n0 = kernel_tex_fetch(__tri_vnormal, tri_vindex.x);
float3 n1 = kernel_tex_fetch(__tri_vnormal, tri_vindex.y);
float3 n2 = kernel_tex_fetch(__tri_vnormal, tri_vindex.z);
/* ensure that the normals are in object space */
if (sd->object_flag & SD_OBJECT_TRANSFORM_APPLIED) {
@@ -145,9 +145,9 @@ ccl_device_inline void triangle_dPdudv(KernelGlobals kg,
{
/* fetch triangle vertex coordinates */
const uint4 tri_vindex = kernel_tex_fetch(__tri_vindex, prim);
const float3 p0 = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 0));
const float3 p1 = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 1));
const float3 p2 = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex.w + 2));
const float3 p0 = kernel_tex_fetch(__tri_verts, tri_vindex.w + 0);
const float3 p1 = kernel_tex_fetch(__tri_verts, tri_vindex.w + 1);
const float3 p2 = kernel_tex_fetch(__tri_verts, tri_vindex.w + 2);
/* compute derivatives of P w.r.t. uv */
*dPdu = (p0 - p2);
@@ -267,15 +267,15 @@ ccl_device float3 triangle_attribute_float3(KernelGlobals kg,
if (desc.element & (ATTR_ELEMENT_VERTEX | ATTR_ELEMENT_VERTEX_MOTION)) {
const uint4 tri_vindex = kernel_tex_fetch(__tri_vindex, sd->prim);
f0 = float4_to_float3(kernel_tex_fetch(__attributes_float3, desc.offset + tri_vindex.x));
f1 = float4_to_float3(kernel_tex_fetch(__attributes_float3, desc.offset + tri_vindex.y));
f2 = float4_to_float3(kernel_tex_fetch(__attributes_float3, desc.offset + tri_vindex.z));
f0 = kernel_tex_fetch(__attributes_float3, desc.offset + tri_vindex.x);
f1 = kernel_tex_fetch(__attributes_float3, desc.offset + tri_vindex.y);
f2 = kernel_tex_fetch(__attributes_float3, desc.offset + tri_vindex.z);
}
else {
const int tri = desc.offset + sd->prim * 3;
f0 = float4_to_float3(kernel_tex_fetch(__attributes_float3, tri + 0));
f1 = float4_to_float3(kernel_tex_fetch(__attributes_float3, tri + 1));
f2 = float4_to_float3(kernel_tex_fetch(__attributes_float3, tri + 2));
f0 = kernel_tex_fetch(__attributes_float3, tri + 0);
f1 = kernel_tex_fetch(__attributes_float3, tri + 1);
f2 = kernel_tex_fetch(__attributes_float3, tri + 2);
}
#ifdef __RAY_DIFFERENTIALS__
@@ -298,7 +298,7 @@ ccl_device float3 triangle_attribute_float3(KernelGlobals kg,
if (desc.element & (ATTR_ELEMENT_FACE | ATTR_ELEMENT_OBJECT | ATTR_ELEMENT_MESH)) {
const int offset = (desc.element == ATTR_ELEMENT_FACE) ? desc.offset + sd->prim :
desc.offset;
return float4_to_float3(kernel_tex_fetch(__attributes_float3, offset));
return kernel_tex_fetch(__attributes_float3, offset);
}
else {
return make_float3(0.0f, 0.0f, 0.0f);
@@ -318,16 +318,16 @@ ccl_device float4 triangle_attribute_float4(KernelGlobals kg,
if (desc.element & (ATTR_ELEMENT_VERTEX | ATTR_ELEMENT_VERTEX_MOTION)) {
const uint4 tri_vindex = kernel_tex_fetch(__tri_vindex, sd->prim);
f0 = kernel_tex_fetch(__attributes_float3, desc.offset + tri_vindex.x);
f1 = kernel_tex_fetch(__attributes_float3, desc.offset + tri_vindex.y);
f2 = kernel_tex_fetch(__attributes_float3, desc.offset + tri_vindex.z);
f0 = kernel_tex_fetch(__attributes_float4, desc.offset + tri_vindex.x);
f1 = kernel_tex_fetch(__attributes_float4, desc.offset + tri_vindex.y);
f2 = kernel_tex_fetch(__attributes_float4, desc.offset + tri_vindex.z);
}
else {
const int tri = desc.offset + sd->prim * 3;
if (desc.element == ATTR_ELEMENT_CORNER) {
f0 = kernel_tex_fetch(__attributes_float3, tri + 0);
f1 = kernel_tex_fetch(__attributes_float3, tri + 1);
f2 = kernel_tex_fetch(__attributes_float3, tri + 2);
f0 = kernel_tex_fetch(__attributes_float4, tri + 0);
f1 = kernel_tex_fetch(__attributes_float4, tri + 1);
f2 = kernel_tex_fetch(__attributes_float4, tri + 2);
}
else {
f0 = color_srgb_to_linear_v4(
@@ -359,7 +359,7 @@ ccl_device float4 triangle_attribute_float4(KernelGlobals kg,
if (desc.element & (ATTR_ELEMENT_FACE | ATTR_ELEMENT_OBJECT | ATTR_ELEMENT_MESH)) {
const int offset = (desc.element == ATTR_ELEMENT_FACE) ? desc.offset + sd->prim :
desc.offset;
return kernel_tex_fetch(__attributes_float3, offset);
return kernel_tex_fetch(__attributes_float4, offset);
}
else {
return make_float4(0.0f, 0.0f, 0.0f, 0.0f);

View File

@@ -37,27 +37,11 @@ ccl_device_inline bool triangle_intersect(KernelGlobals kg,
{
const int prim = kernel_tex_fetch(__prim_index, prim_addr);
const uint tri_vindex = kernel_tex_fetch(__tri_vindex, prim).w;
#if defined(__KERNEL_SSE2__) && defined(__KERNEL_SSE__)
const ssef *ssef_verts = (ssef *)&kg->__tri_verts.data[tri_vindex];
#else
const float4 tri_a = kernel_tex_fetch(__tri_verts, tri_vindex + 0),
const float3 tri_a = kernel_tex_fetch(__tri_verts, tri_vindex + 0),
tri_b = kernel_tex_fetch(__tri_verts, tri_vindex + 1),
tri_c = kernel_tex_fetch(__tri_verts, tri_vindex + 2);
#endif
float t, u, v;
if (ray_triangle_intersect(P,
dir,
tmax,
#if defined(__KERNEL_SSE2__) && defined(__KERNEL_SSE__)
ssef_verts,
#else
float4_to_float3(tri_a),
float4_to_float3(tri_b),
float4_to_float3(tri_c),
#endif
&u,
&v,
&t)) {
if (ray_triangle_intersect(P, dir, tmax, tri_a, tri_b, tri_c, &u, &v, &t)) {
#ifdef __VISIBILITY_FLAG__
/* Visibility flag test. we do it here under the assumption
* that most triangles are culled by node flags.
@@ -106,27 +90,11 @@ ccl_device_inline bool triangle_intersect_local(KernelGlobals kg,
const int prim = kernel_tex_fetch(__prim_index, prim_addr);
const uint tri_vindex = kernel_tex_fetch(__tri_vindex, prim).w;
# if defined(__KERNEL_SSE2__) && defined(__KERNEL_SSE__)
const ssef *ssef_verts = (ssef *)&kg->__tri_verts.data[tri_vindex];
# else
const float3 tri_a = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex + 0)),
tri_b = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex + 1)),
tri_c = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex + 2));
# endif
const float3 tri_a = kernel_tex_fetch(__tri_verts, tri_vindex + 0),
tri_b = kernel_tex_fetch(__tri_verts, tri_vindex + 1),
tri_c = kernel_tex_fetch(__tri_verts, tri_vindex + 2);
float t, u, v;
if (!ray_triangle_intersect(P,
dir,
tmax,
# if defined(__KERNEL_SSE2__) && defined(__KERNEL_SSE__)
ssef_verts,
# else
tri_a,
tri_b,
tri_c,
# endif
&u,
&v,
&t)) {
if (!ray_triangle_intersect(P, dir, tmax, tri_a, tri_b, tri_c, &u, &v, &t)) {
return false;
}
@@ -178,11 +146,6 @@ ccl_device_inline bool triangle_intersect_local(KernelGlobals kg,
isect->t = t;
/* Record geometric normal. */
# if defined(__KERNEL_SSE2__) && defined(__KERNEL_SSE__)
const float3 tri_a = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex + 0)),
tri_b = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex + 1)),
tri_c = float4_to_float3(kernel_tex_fetch(__tri_verts, tri_vindex + 2));
# endif
local_isect->Ng[hit] = normalize(cross(tri_b - tri_a, tri_c - tri_a));
return false;
@@ -223,9 +186,9 @@ ccl_device_inline float3 triangle_refine(KernelGlobals kg,
P = P + D * t;
const uint tri_vindex = kernel_tex_fetch(__tri_vindex, isect_prim).w;
const float4 tri_a = kernel_tex_fetch(__tri_verts, tri_vindex + 0),
tri_b = kernel_tex_fetch(__tri_verts, tri_vindex + 1),
tri_c = kernel_tex_fetch(__tri_verts, tri_vindex + 2);
const packed_float3 tri_a = kernel_tex_fetch(__tri_verts, tri_vindex + 0),
tri_b = kernel_tex_fetch(__tri_verts, tri_vindex + 1),
tri_c = kernel_tex_fetch(__tri_verts, tri_vindex + 2);
float3 edge1 = make_float3(tri_a.x - tri_c.x, tri_a.y - tri_c.y, tri_a.z - tri_c.z);
float3 edge2 = make_float3(tri_b.x - tri_c.x, tri_b.y - tri_c.y, tri_b.z - tri_c.z);
float3 tvec = make_float3(P.x - tri_c.x, P.y - tri_c.y, P.z - tri_c.z);
@@ -280,9 +243,9 @@ ccl_device_inline float3 triangle_refine_local(KernelGlobals kg,
# ifdef __INTERSECTION_REFINE__
const uint tri_vindex = kernel_tex_fetch(__tri_vindex, isect_prim).w;
const float4 tri_a = kernel_tex_fetch(__tri_verts, tri_vindex + 0),
tri_b = kernel_tex_fetch(__tri_verts, tri_vindex + 1),
tri_c = kernel_tex_fetch(__tri_verts, tri_vindex + 2);
const packed_float3 tri_a = kernel_tex_fetch(__tri_verts, tri_vindex + 0),
tri_b = kernel_tex_fetch(__tri_verts, tri_vindex + 1),
tri_c = kernel_tex_fetch(__tri_verts, tri_vindex + 2);
float3 edge1 = make_float3(tri_a.x - tri_c.x, tri_a.y - tri_c.y, tri_a.z - tri_c.z);
float3 edge2 = make_float3(tri_b.x - tri_c.x, tri_b.y - tri_c.y, tri_b.z - tri_c.z);
float3 tvec = make_float3(P.x - tri_c.x, P.y - tri_c.y, P.z - tri_c.z);

View File

@@ -75,7 +75,7 @@ ccl_device float4 volume_attribute_float4(KernelGlobals kg,
const AttributeDescriptor desc)
{
if (desc.element & (ATTR_ELEMENT_OBJECT | ATTR_ELEMENT_MESH)) {
return kernel_tex_fetch(__attributes_float3, desc.offset);
return kernel_tex_fetch(__attributes_float4, desc.offset);
}
else if (desc.element == ATTR_ELEMENT_VOXEL) {
/* todo: optimize this so we don't have to transform both here and in

View File

@@ -71,14 +71,16 @@ ccl_device bool integrator_init_from_bake(KernelGlobals kg,
/* Setup render buffers. */
const int index = INTEGRATOR_STATE(state, path, render_pixel_index);
const int pass_stride = kernel_data.film.pass_stride;
render_buffer += index * pass_stride;
ccl_global float *buffer = render_buffer + index * pass_stride;
ccl_global float *primitive = render_buffer + kernel_data.film.pass_bake_primitive;
ccl_global float *differential = render_buffer + kernel_data.film.pass_bake_differential;
ccl_global float *primitive = buffer + kernel_data.film.pass_bake_primitive;
ccl_global float *differential = buffer + kernel_data.film.pass_bake_differential;
const int seed = __float_as_uint(primitive[0]);
int prim = __float_as_uint(primitive[1]);
if (prim == -1) {
/* Accumulate transparency for empty pixels. */
kernel_accum_transparent(kg, state, 0, 1.0f, buffer);
return false;
}

View File

@@ -122,7 +122,7 @@ ccl_device_inline void path_state_next(KernelGlobals kg, IntegratorState state,
/* volume scatter */
flag |= PATH_RAY_VOLUME_SCATTER;
flag &= ~PATH_RAY_TRANSPARENT_BACKGROUND;
if (bounce == 1) {
if (!(flag & PATH_RAY_ANY_PASS)) {
flag |= PATH_RAY_VOLUME_PASS;
}
@@ -184,7 +184,7 @@ ccl_device_inline void path_state_next(KernelGlobals kg, IntegratorState state,
}
/* Render pass categories. */
if (bounce == 1) {
if (!(flag & PATH_RAY_ANY_PASS) && !(flag & PATH_RAY_TRANSPARENT_BACKGROUND)) {
flag |= PATH_RAY_SURFACE_PASS;
}
}
@@ -208,9 +208,7 @@ ccl_device_inline bool path_state_volume_next(IntegratorState state)
}
/* Random number generator next bounce. */
if (volume_bounds_bounce > 1) {
INTEGRATOR_STATE_WRITE(state, path, rng_offset) += PRNG_BOUNCE_NUM;
}
INTEGRATOR_STATE_WRITE(state, path, rng_offset) += PRNG_BOUNCE_NUM;
return true;
}

View File

@@ -20,7 +20,6 @@
#include "kernel/integrator/shader_eval.h"
#include "kernel/light/light.h"
#include "kernel/light/sample.h"
#include "kernel/sample/mis.h"
CCL_NAMESPACE_BEGIN
@@ -81,8 +80,7 @@ ccl_device float3 integrator_eval_background_shader(KernelGlobals kg,
/* multiple importance sampling, get background light pdf for ray
* direction, and compute weight with respect to BSDF pdf */
const float pdf = background_light_pdf(kg, ray_P - ray_D * mis_ray_t, ray_D);
const float mis_weight = power_heuristic(mis_ray_pdf, pdf);
const float mis_weight = light_sample_mis_weight_forward(kg, mis_ray_pdf, pdf);
L *= mis_weight;
}
# endif
@@ -169,7 +167,7 @@ ccl_device_inline void integrate_distant_lights(KernelGlobals kg,
/* multiple importance sampling, get regular light pdf,
* and compute weight with respect to BSDF pdf */
const float mis_ray_pdf = INTEGRATOR_STATE(state, path, mis_ray_pdf);
const float mis_weight = power_heuristic(mis_ray_pdf, ls.pdf);
const float mis_weight = light_sample_mis_weight_forward(kg, mis_ray_pdf, ls.pdf);
light_eval *= mis_weight;
}

View File

@@ -84,7 +84,7 @@ ccl_device_inline void integrate_light(KernelGlobals kg,
/* multiple importance sampling, get regular light pdf,
* and compute weight with respect to BSDF pdf */
const float mis_ray_pdf = INTEGRATOR_STATE(state, path, mis_ray_pdf);
const float mis_weight = power_heuristic(mis_ray_pdf, ls.pdf);
const float mis_weight = light_sample_mis_weight_forward(kg, mis_ray_pdf, ls.pdf);
light_eval *= mis_weight;
}

View File

@@ -95,8 +95,8 @@ ccl_device_inline void integrate_transparent_volume_shadow(KernelGlobals kg,
shader_setup_from_volume(kg, shadow_sd, &ray);
const float step_size = volume_stack_step_size(
kg, [=](const int i) { return integrator_state_read_shadow_volume_stack(state, i); });
VOLUME_READ_LAMBDA(integrator_state_read_shadow_volume_stack(state, i));
const float step_size = volume_stack_step_size(kg, volume_read_lambda_pass);
volume_shadow_heterogeneous(kg, state, &ray, shadow_sd, throughput, step_size);
}

View File

@@ -27,8 +27,6 @@
#include "kernel/light/light.h"
#include "kernel/light/sample.h"
#include "kernel/sample/mis.h"
CCL_NAMESPACE_BEGIN
ccl_device_forceinline void integrate_surface_shader_setup(KernelGlobals kg,
@@ -95,8 +93,7 @@ ccl_device_forceinline void integrate_surface_emission(KernelGlobals kg,
/* Multiple importance sampling, get triangle light pdf,
* and compute weight with respect to BSDF pdf. */
float pdf = triangle_light_pdf(kg, sd, t);
float mis_weight = power_heuristic(bsdf_pdf, pdf);
float mis_weight = light_sample_mis_weight_forward(kg, bsdf_pdf, pdf);
L *= mis_weight;
}
@@ -155,7 +152,7 @@ ccl_device_forceinline void integrate_surface_direct_light(KernelGlobals kg,
bsdf_eval_mul3(&bsdf_eval, light_eval / ls.pdf);
if (ls.shader & SHADER_USE_MIS) {
const float mis_weight = power_heuristic(ls.pdf, bsdf_pdf);
const float mis_weight = light_sample_mis_weight_nee(kg, ls.pdf, bsdf_pdf);
bsdf_eval_mul(&bsdf_eval, mis_weight);
}
@@ -195,12 +192,13 @@ ccl_device_forceinline void integrate_surface_direct_light(KernelGlobals kg,
const float3 throughput = INTEGRATOR_STATE(state, path, throughput) * bsdf_eval_sum(&bsdf_eval);
if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
const float3 pass_diffuse_weight = (bounce == 0) ?
bsdf_eval_pass_diffuse_weight(&bsdf_eval) :
INTEGRATOR_STATE(state, path, pass_diffuse_weight);
const float3 pass_glossy_weight = (bounce == 0) ?
bsdf_eval_pass_glossy_weight(&bsdf_eval) :
INTEGRATOR_STATE(state, path, pass_glossy_weight);
const packed_float3 pass_diffuse_weight =
(bounce == 0) ? packed_float3(bsdf_eval_pass_diffuse_weight(&bsdf_eval)) :
INTEGRATOR_STATE(state, path, pass_diffuse_weight);
const packed_float3 pass_glossy_weight = (bounce == 0) ?
packed_float3(
bsdf_eval_pass_glossy_weight(&bsdf_eval)) :
INTEGRATOR_STATE(state, path, pass_glossy_weight);
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, pass_diffuse_weight) = pass_diffuse_weight;
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, pass_glossy_weight) = pass_glossy_weight;
}

View File

@@ -27,8 +27,6 @@
#include "kernel/light/light.h"
#include "kernel/light/sample.h"
#include "kernel/sample/mis.h"
CCL_NAMESPACE_BEGIN
#ifdef __VOLUME__
@@ -78,9 +76,8 @@ ccl_device_inline bool shadow_volume_shader_sample(KernelGlobals kg,
ccl_private ShaderData *ccl_restrict sd,
ccl_private float3 *ccl_restrict extinction)
{
shader_eval_volume<true>(kg, state, sd, PATH_RAY_SHADOW, [=](const int i) {
return integrator_state_read_shadow_volume_stack(state, i);
});
VOLUME_READ_LAMBDA(integrator_state_read_shadow_volume_stack(state, i))
shader_eval_volume<true>(kg, state, sd, PATH_RAY_SHADOW, volume_read_lambda_pass);
if (!(sd->flag & SD_EXTINCTION)) {
return false;
@@ -98,9 +95,8 @@ ccl_device_inline bool volume_shader_sample(KernelGlobals kg,
ccl_private VolumeShaderCoefficients *coeff)
{
const uint32_t path_flag = INTEGRATOR_STATE(state, path, flag);
shader_eval_volume<false>(kg, state, sd, path_flag, [=](const int i) {
return integrator_state_read_volume_stack(state, i);
});
VOLUME_READ_LAMBDA(integrator_state_read_volume_stack(state, i))
shader_eval_volume<false>(kg, state, sd, path_flag, volume_read_lambda_pass);
if (!(sd->flag & (SD_EXTINCTION | SD_SCATTER | SD_EMISSION))) {
return false;
@@ -263,6 +259,12 @@ ccl_device void volume_shadow_heterogeneous(KernelGlobals kg,
/* Equi-angular sampling as in:
* "Importance Sampling Techniques for Path Tracing in Participating Media" */
/* Below this pdf we ignore samples, as they tend to lead to very long distances.
* This can cause performance issues with BVH traversal in OptiX, leading it to
* traverse many nodes. Since these contribute very little to the image, just ignore
* those samples. */
# define VOLUME_SAMPLE_PDF_CUTOFF 1e-8f
ccl_device float volume_equiangular_sample(ccl_private const Ray *ccl_restrict ray,
const float3 light_P,
const float xi,
@@ -437,7 +439,8 @@ ccl_device_forceinline void volume_integrate_step_scattering(
/* Equiangular sampling for direct lighting. */
if (vstate.direct_sample_method == VOLUME_SAMPLE_EQUIANGULAR && !result.direct_scatter) {
if (result.direct_t >= vstate.start_t && result.direct_t <= vstate.end_t) {
if (result.direct_t >= vstate.start_t && result.direct_t <= vstate.end_t &&
vstate.equiangular_pdf > VOLUME_SAMPLE_PDF_CUTOFF) {
const float new_dt = result.direct_t - vstate.start_t;
const float3 new_transmittance = volume_color_transmittance(coeff.sigma_t, new_dt);
@@ -474,26 +477,28 @@ ccl_device_forceinline void volume_integrate_step_scattering(
const float3 new_transmittance = volume_color_transmittance(coeff.sigma_t, new_dt);
const float distance_pdf = dot(channel_pdf, coeff.sigma_t * new_transmittance);
/* throughput */
result.indirect_scatter = true;
result.indirect_t = new_t;
result.indirect_throughput *= coeff.sigma_s * new_transmittance / distance_pdf;
shader_copy_volume_phases(&result.indirect_phases, sd);
if (vstate.distance_pdf * distance_pdf > VOLUME_SAMPLE_PDF_CUTOFF) {
/* throughput */
result.indirect_scatter = true;
result.indirect_t = new_t;
result.indirect_throughput *= coeff.sigma_s * new_transmittance / distance_pdf;
shader_copy_volume_phases(&result.indirect_phases, sd);
if (vstate.direct_sample_method != VOLUME_SAMPLE_EQUIANGULAR) {
/* If using distance sampling for direct light, just copy parameters
* of indirect light since we scatter at the same point then. */
result.direct_scatter = true;
result.direct_t = result.indirect_t;
result.direct_throughput = result.indirect_throughput;
shader_copy_volume_phases(&result.direct_phases, sd);
if (vstate.direct_sample_method != VOLUME_SAMPLE_EQUIANGULAR) {
/* If using distance sampling for direct light, just copy parameters
* of indirect light since we scatter at the same point then. */
result.direct_scatter = true;
result.direct_t = result.indirect_t;
result.direct_throughput = result.indirect_throughput;
shader_copy_volume_phases(&result.direct_phases, sd);
/* Multiple importance sampling. */
if (vstate.use_mis) {
const float equiangular_pdf = volume_equiangular_pdf(ray, equiangular_light_P, new_t);
const float mis_weight = power_heuristic(vstate.distance_pdf * distance_pdf,
equiangular_pdf);
result.direct_throughput *= 2.0f * mis_weight;
/* Multiple importance sampling. */
if (vstate.use_mis) {
const float equiangular_pdf = volume_equiangular_pdf(ray, equiangular_light_P, new_t);
const float mis_weight = power_heuristic(vstate.distance_pdf * distance_pdf,
equiangular_pdf);
result.direct_throughput *= 2.0f * mis_weight;
}
}
}
}
@@ -761,7 +766,7 @@ ccl_device_forceinline void integrate_volume_direct_light(
const float phase_pdf = shader_volume_phase_eval(kg, sd, phases, ls->D, &phase_eval);
if (ls->shader & SHADER_USE_MIS) {
float mis_weight = power_heuristic(ls->pdf, phase_pdf);
float mis_weight = light_sample_mis_weight_nee(kg, ls->pdf, phase_pdf);
bsdf_eval_mul(&phase_eval, mis_weight);
}
@@ -794,9 +799,10 @@ ccl_device_forceinline void integrate_volume_direct_light(
const float3 throughput_phase = throughput * bsdf_eval_sum(&phase_eval);
if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
const float3 pass_diffuse_weight = (bounce == 0) ?
one_float3() :
INTEGRATOR_STATE(state, path, pass_diffuse_weight);
const packed_float3 pass_diffuse_weight = (bounce == 0) ?
packed_float3(one_float3()) :
INTEGRATOR_STATE(
state, path, pass_diffuse_weight);
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, pass_diffuse_weight) = pass_diffuse_weight;
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, pass_glossy_weight) = zero_float3();
}
@@ -921,8 +927,8 @@ ccl_device VolumeIntegrateEvent volume_integrate(KernelGlobals kg,
VOLUME_SAMPLE_DISTANCE;
/* Step through volume. */
const float step_size = volume_stack_step_size(
kg, [=](const int i) { return integrator_state_read_volume_stack(state, i); });
VOLUME_READ_LAMBDA(integrator_state_read_volume_stack(state, i))
const float step_size = volume_stack_step_size(kg, volume_read_lambda_pass);
/* TODO: expensive to zero closures? */
VolumeIntegrateResult result = {};

View File

@@ -122,23 +122,20 @@ ccl_device_inline void shader_prepare_surface_closures(KernelGlobals kg,
for (int i = 0; i < sd->num_closure; i++) {
ccl_private ShaderClosure *sc = &sd->closure[i];
if (CLOSURE_IS_BSDF_DIFFUSE(sc->type)) {
if (kernel_data.integrator.filter_closures & FILTER_CLOSURE_DIFFUSE) {
sc->type = CLOSURE_NONE_ID;
sc->sample_weight = 0.0f;
}
if ((CLOSURE_IS_BSDF_DIFFUSE(sc->type) &&
(kernel_data.integrator.filter_closures & FILTER_CLOSURE_DIFFUSE)) ||
(CLOSURE_IS_BSDF_GLOSSY(sc->type) &&
(kernel_data.integrator.filter_closures & FILTER_CLOSURE_GLOSSY)) ||
(CLOSURE_IS_BSDF_TRANSMISSION(sc->type) &&
(kernel_data.integrator.filter_closures & FILTER_CLOSURE_TRANSMISSION))) {
sc->type = CLOSURE_NONE_ID;
sc->sample_weight = 0.0f;
}
else if (CLOSURE_IS_BSDF_GLOSSY(sc->type)) {
if (kernel_data.integrator.filter_closures & FILTER_CLOSURE_GLOSSY) {
sc->type = CLOSURE_NONE_ID;
sc->sample_weight = 0.0f;
}
}
else if (CLOSURE_IS_BSDF_TRANSMISSION(sc->type)) {
if (kernel_data.integrator.filter_closures & FILTER_CLOSURE_TRANSMISSION) {
sc->type = CLOSURE_NONE_ID;
sc->sample_weight = 0.0f;
}
else if ((CLOSURE_IS_BSDF_TRANSPARENT(sc->type) &&
(kernel_data.integrator.filter_closures & FILTER_CLOSURE_TRANSPARENT))) {
sc->type = CLOSURE_HOLDOUT_ID;
sc->sample_weight = 0.0f;
sd->flag |= SD_HOLDOUT;
}
}
}

View File

@@ -40,15 +40,15 @@ KERNEL_STRUCT_MEMBER(shadow_path, uint16_t, queued_kernel, KERNEL_FEATURE_PATH_T
/* enum PathRayFlag */
KERNEL_STRUCT_MEMBER(shadow_path, uint32_t, flag, KERNEL_FEATURE_PATH_TRACING)
/* Throughput. */
KERNEL_STRUCT_MEMBER(shadow_path, float3, throughput, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(shadow_path, packed_float3, throughput, KERNEL_FEATURE_PATH_TRACING)
/* Throughput for shadow pass. */
KERNEL_STRUCT_MEMBER(shadow_path,
float3,
packed_float3,
unshadowed_throughput,
KERNEL_FEATURE_SHADOW_PASS | KERNEL_FEATURE_AO_ADDITIVE)
/* Ratio of throughput to distinguish diffuse / glossy / transmission render passes. */
KERNEL_STRUCT_MEMBER(shadow_path, float3, pass_diffuse_weight, KERNEL_FEATURE_LIGHT_PASSES)
KERNEL_STRUCT_MEMBER(shadow_path, float3, pass_glossy_weight, KERNEL_FEATURE_LIGHT_PASSES)
KERNEL_STRUCT_MEMBER(shadow_path, packed_float3, pass_diffuse_weight, KERNEL_FEATURE_LIGHT_PASSES)
KERNEL_STRUCT_MEMBER(shadow_path, packed_float3, pass_glossy_weight, KERNEL_FEATURE_LIGHT_PASSES)
/* Number of intersections found by ray-tracing. */
KERNEL_STRUCT_MEMBER(shadow_path, uint16_t, num_hits, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_END(shadow_path)
@@ -56,8 +56,8 @@ KERNEL_STRUCT_END(shadow_path)
/********************************** Shadow Ray *******************************/
KERNEL_STRUCT_BEGIN(shadow_ray)
KERNEL_STRUCT_MEMBER(shadow_ray, float3, P, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(shadow_ray, float3, D, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(shadow_ray, packed_float3, P, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(shadow_ray, packed_float3, D, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(shadow_ray, float, t, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(shadow_ray, float, time, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(shadow_ray, float, dP, KERNEL_FEATURE_PATH_TRACING)

View File

@@ -59,12 +59,12 @@ KERNEL_STRUCT_MEMBER(path, float, min_ray_pdf, KERNEL_FEATURE_PATH_TRACING)
/* Continuation probability for path termination. */
KERNEL_STRUCT_MEMBER(path, float, continuation_probability, KERNEL_FEATURE_PATH_TRACING)
/* Throughput. */
KERNEL_STRUCT_MEMBER(path, float3, throughput, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(path, packed_float3, throughput, KERNEL_FEATURE_PATH_TRACING)
/* Ratio of throughput to distinguish diffuse / glossy / transmission render passes. */
KERNEL_STRUCT_MEMBER(path, float3, pass_diffuse_weight, KERNEL_FEATURE_LIGHT_PASSES)
KERNEL_STRUCT_MEMBER(path, float3, pass_glossy_weight, KERNEL_FEATURE_LIGHT_PASSES)
KERNEL_STRUCT_MEMBER(path, packed_float3, pass_diffuse_weight, KERNEL_FEATURE_LIGHT_PASSES)
KERNEL_STRUCT_MEMBER(path, packed_float3, pass_glossy_weight, KERNEL_FEATURE_LIGHT_PASSES)
/* Denoising. */
KERNEL_STRUCT_MEMBER(path, float3, denoising_feature_throughput, KERNEL_FEATURE_DENOISING)
KERNEL_STRUCT_MEMBER(path, packed_float3, denoising_feature_throughput, KERNEL_FEATURE_DENOISING)
/* Shader sorting. */
/* TODO: compress as uint16? or leave out entirely and recompute key in sorting code? */
KERNEL_STRUCT_MEMBER(path, uint32_t, shader_sort_key, KERNEL_FEATURE_PATH_TRACING)
@@ -73,8 +73,8 @@ KERNEL_STRUCT_END(path)
/************************************** Ray ***********************************/
KERNEL_STRUCT_BEGIN(ray)
KERNEL_STRUCT_MEMBER(ray, float3, P, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(ray, float3, D, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(ray, packed_float3, P, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(ray, packed_float3, D, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(ray, float, t, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(ray, float, time, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(ray, float, dP, KERNEL_FEATURE_PATH_TRACING)
@@ -96,10 +96,10 @@ KERNEL_STRUCT_END(isect)
/*************** Subsurface closure state for subsurface kernel ***************/
KERNEL_STRUCT_BEGIN(subsurface)
KERNEL_STRUCT_MEMBER(subsurface, float3, albedo, KERNEL_FEATURE_SUBSURFACE)
KERNEL_STRUCT_MEMBER(subsurface, float3, radius, KERNEL_FEATURE_SUBSURFACE)
KERNEL_STRUCT_MEMBER(subsurface, packed_float3, albedo, KERNEL_FEATURE_SUBSURFACE)
KERNEL_STRUCT_MEMBER(subsurface, packed_float3, radius, KERNEL_FEATURE_SUBSURFACE)
KERNEL_STRUCT_MEMBER(subsurface, float, anisotropy, KERNEL_FEATURE_SUBSURFACE)
KERNEL_STRUCT_MEMBER(subsurface, float3, Ng, KERNEL_FEATURE_SUBSURFACE)
KERNEL_STRUCT_MEMBER(subsurface, packed_float3, Ng, KERNEL_FEATURE_SUBSURFACE)
KERNEL_STRUCT_END(subsurface)
/********************************** Volume Stack ******************************/

Some files were not shown because too many files have changed in this diff Show More