Compare commits


716 Commits

Author SHA1 Message Date
c81d830ba0 Merge branch 'master' into temp-image-engine 2022-12-07 15:04:39 +01:00
3255ddc2f8 Rename some methods to represent their internal flag. 2022-12-07 15:03:36 +01:00
8db9d6bd6b Add back UDIM support. 2022-12-07 15:01:17 +01:00
2dc51fccb8 Fix T101787, T102786. Cycles: Improved out-of-memory messaging on Metal
This patch adds a new `max_working_set_exceeded()` check on Metal so that we can display a "System is out of GPU memory" message to the user. Without this, we get obtuse "CommandBuffer failed" errors at render time due to exceeding the size limit of resident resources.

Likely fix for T101787 & T102786.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16713
2022-12-07 13:56:21 +00:00
4d05a000cb Fix light tree header file included while feature disabled 2022-12-07 14:47:11 +01:00
f8ddf16e5a Cleanup: Typo in comment 2022-12-07 14:44:45 +01:00
6df6e537a3 Recalculate bounds when textures go offscreen. 2022-12-07 14:28:39 +01:00
cc1ba74ce2 Fix T102966: Curves editmode selection drawing not stable / flickering
Issue was the lifetime of GPUVertFormat & GPUVertAttr.
Both need to be static in the function to be persistent here (and
handled appropriately).

Was an error in rB319ee296fd0c.

Maniphest Tasks: T102966

Differential Revision: https://developer.blender.org/D16704
2022-12-07 09:40:54 +01:00
38f793d349 Remove visible flag. all textures are visible at all moments. 2022-12-07 08:53:26 +01:00
0ba5e6a8dd Rename dirty to need_full_update. 2022-12-07 08:45:48 +01:00
ff280aba1b Cleanup: renamed ensure_float_buffer to cached_float_buffer and remove wrapper function. 2022-12-07 08:41:43 +01:00
7ccb299fb7 Make sure textures are drawn in the right place. 2022-12-07 08:28:26 +01:00
0b85d6a030 3D View: support canceling viewport operations
Support Esc / RMB to cancel dolly, move, rotate & zoom.
Previously only roll could be canceled.

This can be useful to temporarily orbit away from the camera or an
orthographic view without having to manually set it back.
2022-12-07 17:45:24 +11:00
dcd4eb4c25 Cleanup: minor refactor to view operator event handling
- Add VIEW_CANCEL event_code.
- De-duplicate operator freeing logic for the roll operator.
- Structure checks so adding cancel is simplified.
- Split event checks into two blocks, one for model events, another
  for all other events.
2022-12-07 17:45:24 +11:00
0e90896cba Cleanup: simplify udim parameters when uv packing
Migrate some of the UDIM offset calculation from inside one
of the packing engines (where it's consumed) to the packing
operator (where it's produced).

This change (and others) will help simplify the future migration
of the packing engine inside editors/uvedit/uvedit_islands.cc
to the Geometry module, so it can eventually replace the other
packing engine in geometry/intern/uv_parametrizer.cc
2022-12-07 15:30:13 +13:00
a5f9f7e2fc OBJ: Avoid retrieving mesh arrays, improve const correctness
Store the potentially owned mesh separately from the original/evaluated
mesh which is now stored with a const pointer. Also store mesh spans
separately in the class so they don't have to be retrieved for every
index.
2022-12-06 15:26:42 -06:00
a459018a99 Cleanup: Simplify naming in UV sphere primitive
It's obvious that these are indices, no need for it to be part of names.
2022-12-06 14:25:43 -06:00
c1d4105005 Cleanup: Remove unnecessary indentation in cone primitive
The loop is skipped if there are zero iterations anyway.
2022-12-06 14:25:43 -06:00
ff324ab716 Cleanup: Make mesh primitive topology building more parallel
Avoid using an incremented "loop index" variable which makes the whole
data-filling necessarily sequential. No functional changes expected,
this just simplifies some refactors to face corner storage.
2022-12-06 14:25:43 -06:00
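The idea of dropping the incremented "loop index" can be illustrated with a small sketch. This is not the Blender code; the function names and the fixed four corners per face are assumptions made for illustration. The first version depends on a running counter, while the second derives each face's start offset from the face index, so faces can be filled in any order (or in parallel).

```python
def fill_corner_faces_sequential(faces_num, corners_per_face=4):
    """Fill a corner -> face map with a running counter (order dependent)."""
    corner_faces = [-1] * (faces_num * corners_per_face)
    loop_index = 0
    for face in range(faces_num):
        for _ in range(corners_per_face):
            corner_faces[loop_index] = face
            loop_index += 1
    return corner_faces


def fill_corner_faces_independent(faces_num, corners_per_face=4):
    """Same result, but each face derives its own offset from its index."""
    corner_faces = [-1] * (faces_num * corners_per_face)
    # Reverse order on purpose: no iteration relies on an earlier one.
    for face in reversed(range(faces_num)):
        start = face * corners_per_face
        for i in range(corners_per_face):
            corner_faces[start + i] = face
    return corner_faces
```

Because the second version has no state carried between faces, the outer loop could be handed to a parallel-for without changing the result.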
fd9b197226 GPU: Fix using FLOAT_2D_ARRAY and FLOAT_3D textures via Python.
Translation from Python enum values was incorrect and textures created
in Python using those types would result in faulty textures. In
RenderDoc those textures would not bind.
2022-12-06 20:16:39 +01:00
3124241256 Fix Cycles SSE4 define for fast math rint function.
Differential Revision: https://developer.blender.org/D16708
2022-12-06 19:06:43 +01:00
48b5dcdbe8 Animation: Removal of most of the old pose library
Remove most of the old (pre-3.0) pose library:

- Remove the entire `editors/armature/pose_lib.c` file
- Deprecate `Object::poselib` in DNA
- Remove Operators marked as deprecated in T93405
- Remove RNA property `Object.pose_library`
- Add comment to clarify that the call `BLO_read_id_address(reader,
  ob->id.lib, &ob->poselib);` handles deprecated data.

Note that this functionality has been documented as deprecated since
Blender 3.2.

What remains of the old pose library: The DNA for action markers
(`bAction::markers`) and the corresponding Python API. This will allow
future versions of Blender to still convert old pose libraries to new
ones (via the Pose Library panel in the Action editor).

Maniphest task: T93406
2022-12-06 18:37:10 +01:00
de9f32a666 Merge branch 'blender-v3.4-release' 2022-12-06 18:27:44 +01:00
7d99c51e17 Cycles: enable light tree again
Bugs that caused wrong renders should be fixed now, and tests that showed minor
floating point differences on platforms were tweaked to sidestep the problem.

Ref T77889
2022-12-06 18:18:53 +01:00
bc9548da80 Merge branch 'blender-v3.4-release' 2022-12-06 18:05:08 +01:00
658220e815 Fix T101245: Allow Thumbnails of > 256:1 Images
Ensure that thumbnails of images with aspect greater than 256:1 have
dimensions of at least one pixel.

See D16707 for more details

Differential Revision: https://developer.blender.org/D16707

Reviewed by Brecht Van Lommel
2022-12-06 08:17:50 -08:00
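A minimal sketch of the clamping described in this fix, assuming a 256-pixel maximum thumbnail side. The function name and the truncating arithmetic are illustrative assumptions, not the actual thumbnail code from D16707.

```python
def thumbnail_size(width, height, max_size=256):
    """Scale (width, height) to fit within max_size, keeping each side >= 1."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Only shrink; images that already fit keep their size.
    scale = min(1.0, max_size / max(width, height))
    # Without the max(1, ...) clamp, a 2048x2 image would get a height of 0.
    thumb_width = max(1, int(width * scale))
    thumb_height = max(1, int(height * scale))
    return thumb_width, thumb_height
```

For example, a 2048x2 image has an aspect ratio of 1024:1; scaling its height by 256/2048 truncates to zero, so the clamp is what keeps the thumbnail valid.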
64541b242a Fix T102940: "Mask by Color" sculpt tool crash
We need to ensure the mask layer exists before running the operator.
I made the order of function calls here the same as in newer code later
on in this file for consistency.

Differential Revision: https://developer.blender.org/D16696
2022-12-06 08:17:50 -08:00
16b6116b9d Fix Cycles light tree render errors on Windows
Due to a mistake in the popcount implementation. Thanks to Weizhen for help
figuring this out.
2022-12-06 16:52:15 +01:00
8f213e7436 Merge branch 'blender-v3.4-release' 2022-12-06 16:39:51 +01:00
adea6681c0 Merge branch 'blender-v3.4-release' 2022-12-06 16:35:44 +01:00
7a6630e5ce Enable other textures as well. 2022-12-06 16:23:16 +01:00
44748e6fe3 Initial transform from tile to texture. 2022-12-06 15:40:53 +01:00
6428c847fd Cycles oneAPI: clarify Linux Driver requirements in GUI
"Linux Driver" wasn't precise enough for users, the actual driver
requirement is on "Intel® Graphics Compute Runtime for oneAPI Level Zero
and OpenCL™ Driver", ie. https://github.com/intel/compute-runtime /
intel-level-zero-gpu package.

This follows-up the discussion on
https://developer.blender.org/rBff89c1793d8c75615ed43248def25812ec13e6e3
2022-12-06 13:21:19 +01:00
67dd652557 Tests: anim keylist test, avoid interaction between checks
Simplify checks so that one check doesn't influence the following one.

Checks no longer pass the last-visited frame number into the "start frame"
parameter of the next check. This way all test values are hard-coded and
easy to read, without having to understand how all the checks fit together.

No functional changes.
2022-12-06 12:27:16 +01:00
979930f8b6 Test: animation, avoid segfault in keylist unit tests
Replace `EXPECT_NE(column, nullptr)` with `ASSERT_NE(column, nullptr)` to
abort the test on failure. With `EXPECT_NE`, the test would continue onto
the next line, which accesses `column->cfra` and would segfault.

No functional changes to the tests. Just better reporting of failures.
2022-12-06 12:27:16 +01:00
997ff54d30 Fix: UI: broken texpaintslot/color attributes/attributes name filtering
rB8b7cd1ed2a17 broke this for the paint slots
rB4669178fc378 broke this for regular attributes

Name filtering in UI Lists works when:
- [one] the items to be filtered have a name property
-- see how `uilist_filter_items_default` gets the `namebuf`
- [two] custom python filter functions (`filter_items`) implement it
themselves
-- if you use `filter_items` and don't do name filtering there, the default
name filtering won't be used

So, two problems with rB8b7cd1ed2a17:
- [1] items to be listed changed from `texture_paint_images` to
`texture_paint_slots`
-- the former has name_property defined, the latter lacks this
- [2] the new `ColorAttributesListBase` defined a `filter_items` function,
but did not implement name filtering

And the problem with rB4669178fc378:
- it added `filter_items` functions, but did not implement name filtering.

These are all corrected now.

Fixes T102878

Maniphest Tasks: T102878

Differential Revision: https://developer.blender.org/D16676
2022-12-06 11:09:28 +01:00
31943d1313 Fix T102937: "view3d.view_roll" operator conflicts with RMB invocation
When RMB is used to start the operator, don't use it for canceling.
2022-12-06 20:50:51 +11:00
ea14c48c09 Fix T102276: Hotkey conflict Alt D in Node Editor with Duplicate Linked and Detach
This unassigns the Alt+D shortcut from the detach operator. Right now the
operator has to be accessed via the menu.

Alt+D is left for Duplicate Linked, following the other editors.
2022-12-06 09:45:20 +01:00
979b295154 Fix incorrect cursor motion coordinates for WIN32
Cursor motion events on Windows read the position from GetCursorPos()
which wasn't always the same location stored in `lParam`.

In situations where events were handled immediately this wasn't often a
problem, but for heavier scenes or when updates between event handling were
slow, many in-between cursor events would be incorrect.

This behavior dates back to the initial commit, there doesn't seem to be
a good reason not to use the cursor coordinates from the event.

Noticed when investigating T102346.
2022-12-06 17:18:01 +11:00
d486f33d63 GHOST: OpenGL errors now use "file:line: " convention for errors
Make OpenGL errors match formatting used by GCC & clang
(as well as Blender's logging), so utilities that recognize this
convention can be used to quickly access this location.
2022-12-06 14:48:38 +11:00
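As a rough illustration of the "file:line: " convention mentioned here, the sketch below formats a message in that style and parses it back, which is roughly what editors and terminal tools do to jump to a location. The helper names and the sample message are hypothetical and are not part of GHOST.

```python
import re

# Non-greedy file part so Windows drive letters ("C:\...") still parse.
LOCATION_RE = re.compile(r"^(?P<file>.+?):(?P<line>\d+): (?P<message>.*)$")


def format_error(file, line, message):
    """Format an error in the GCC/clang style: 'file:line: message'."""
    return f"{file}:{line}: {message}"


def parse_error(text):
    """Return (file, line, message), or None if the text does not match."""
    match = LOCATION_RE.match(text)
    if match is None:
        return None
    return match["file"], int(match["line"]), match["message"]
```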
f68e50a263 WM: operators that add their own undo pushes now clear the redo panel
Detect when the operator adds its own undo step and clear the panel.

An alternative fix for [0] which caused T101743.

Needed to prevent changing values in the last operator panel from
destructively undoing brush steps.

[0]: 11bdc321a2.

Reviewed By: mont29, joeedh

Ref D16523
2022-12-06 14:01:36 +11:00
7465aa8965 Merge branch 'blender-v3.4-release' 2022-12-06 13:47:40 +11:00
db54b99ee1 BLI_path_util: support both forward and back slashes for WIN32
The following functions only supported back slashes on WIN32,
which can use both forward and back slashes.

- BLI_path_append
- BLI_path_append_dir
- BLI_path_slash_ensure
- BLI_path_slash_rstrip

Follow up to [0] which is a more limited bug-fix.

[0]: a16ef95ff6
2022-12-06 13:28:39 +11:00
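The behavior described above can be sketched in a few lines. The Python functions below only borrow their names from the C functions listed in the commit; their bodies are assumptions for illustration and do not reproduce the `BLI_path_util` implementation.

```python
SEPARATORS = ("/", "\\")


def path_slash_ensure(path, sep="\\"):
    """Append `sep` unless the path already ends with either kind of slash."""
    if path.endswith(SEPARATORS):
        return path
    return path + sep


def path_slash_rstrip(path):
    """Remove trailing forward and back slashes."""
    return path.rstrip("/\\")


def path_append(base, name, sep="\\"):
    """Join `name` onto `base` without doubling an existing separator."""
    if not base:
        return name
    return path_slash_ensure(base, sep) + name
```

The key point is that the "ends with a separator" check accepts both characters, so a path such as `C:/dir/` is not given a second, mismatched slash.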
ed0125afe5 Merge branch 'blender-v3.4-release' 2022-12-06 13:19:04 +11:00
f450d39ada Fix T84078: improve UV unwrapping for quads with an internal reflex angle
When triangulating meshes, the UV unwrapper was previously using a
heuristic to split quads into triangles. If one of the internal angles
is greater than 180 degrees, a so-called "reflex angle", the heuristic
was giving a poor choice of split.

Instead of using a special case for quads, this change routes everything
through the generic n-gon `BLI_polyfill_beautify` method instead.

Reviewed By: Brecht Van Lommel

Differential Revision: https://developer.blender.org/D16505
2022-12-06 13:56:02 +13:00
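To make the term "reflex angle" concrete, here is a small sketch that finds reflex corners of a 2D polygon using the sign of a cross product. It assumes a simple polygon with counter-clockwise winding, and it only detects such corners; the commit itself does not add a detection step but routes all faces through `BLI_polyfill_beautify`.

```python
def cross_z(origin, a, b):
    """Z component of the cross product (a - origin) x (b - origin)."""
    return (a[0] - origin[0]) * (b[1] - origin[1]) - (a[1] - origin[1]) * (b[0] - origin[0])


def reflex_corners(polygon):
    """Indices of corners whose interior angle exceeds 180 degrees.

    Assumes a simple polygon given as (x, y) tuples in counter-clockwise order.
    """
    count = len(polygon)
    result = []
    for i in range(count):
        prev_co = polygon[i - 1]
        cur_co = polygon[i]
        next_co = polygon[(i + 1) % count]
        # A right turn in a counter-clockwise polygon marks a reflex corner.
        if cross_z(prev_co, cur_co, next_co) < 0:
            result.append(i)
    return result
```

For a dart-shaped quad, splitting along the diagonal that does not pass through the reflex corner produces a triangle lying outside the face, which is the kind of poor split the commit message refers to.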
644afda7eb Fix T102543: improve uv unwrapping with n-gons and shared vertices
When n-gons share vertices, their triangulation can be non-manifold,
even if the original mesh is manifold.

The UV Unwrapper does not currently work with non-manifold meshes.

This workaround attempts to modify the triangulation of n-gons in
the UV unwrapper to preserve the manifold property.

This change replaces the previous fix for quads, and extends it
to all n-gons.

See T84078 as motivation for this change.

Differential Revision: https://developer.blender.org/D16521
2022-12-06 13:42:24 +13:00
7a12934f1e Merge branch 'blender-v3.4-release' 2022-12-06 11:12:53 +11:00
Ramil Roosileht
587a1b16ae Attributes: Autofill for attribute conversion operators
Make "Convert Attribute" and "Convert Color Attribute" operators
auto-fill their initial settings with active attribute's domain
and data type if it wasn't already set explicitly.

Differential Revision: https://developer.blender.org/D16550
2022-12-05 16:30:50 -06:00
294e41477b Fix T102961: mirrored vertices sometimes get locked in transform
Two vertices within the threshold can mirror each other causing neither
to be transformed.
2022-12-05 19:11:11 -03:00
0808eaf44e Cycles: temporarily disable light tree again due to platform differences
Regression tests are failing with some platform/compiler combinations, and
fixing this is taking some time.

Ref T77889
2022-12-05 21:57:43 +01:00
5270610b29 Fix Cycles uninitialized variables in mesh light sampling
Causing wrong renders and differences between platforms.
2022-12-05 20:20:51 +01:00
1af8ddf69f Merge branch 'blender-v3.4-release' 2022-12-05 12:47:25 -06:00
2ce6ac462b Cleanup: Const correctness for node find functions
You shouldn't be able to retrieve a mutable node from a const node tree
or a mutable socket from a const node. Use const_cast in one place in
order to correct this without duplicating the function, which is still
awkward in the C-API.
2022-12-05 11:37:55 -06:00
ca2ca0ce5d Geometry Nodes: add instance test category 2022-12-05 17:56:47 +01:00
f646a4f22c Cleanup: renaming tan_spread to cot_half_spread to avoid ambiguity
Differential Revision: https://developer.blender.org/D16695
2022-12-05 17:04:04 +01:00
ee89f213de Cycles: improve many lights sampling using light tree
Uses a light tree to more effectively sample scenes with many lights. This can
significantly reduce noise, at the cost of a somewhat longer render time per
sample.

Light tree sampling is enabled by default. It can be disabled in the Sampling >
Lights panel. Scenes using light clamping or ray visibility tricks may render
differently as these are biased techniques that depend on the sampling strategy.

The implementation is currently disabled on AMD HIP. This is planned to be fixed
before the release.

Implementation by Jeffrey Liu, Weizhen Huang, Alaska and Brecht Van Lommel.

Ref T77889
2022-12-05 16:09:03 +01:00
0731d78d00 Cycles: remove shadow pass
This was not working well in non-trivial scenes before the light tree, and now
it is even harder to make it work well with the light tree. It would average
with equal weight for every light object regardless of intensity or distance, and
be quite noisy due to not working with multiple importance sampling.

We may restore this if there were enough good use cases for the previous implementation,
but let's wait and see what the feedback is.

Some use cases for this have been replaced by the shadow catcher passes, which
did not exist when this was added.

Ref T77889
2022-12-05 15:52:10 +01:00
ccae00c9e2 Fix: memory leak in curve circle primitive node 2022-12-05 15:34:54 +01:00
83077d3683 Fix: wrong pivot point output in String to Curves node
The issue was that using `curves.bounds_min_max` included the radius
which does not make sense in this context.
2022-12-05 13:20:30 +01:00
44ab02fc5c Geometry Nodes: add texture regression test category 2022-12-05 12:06:18 +01:00
9cb061f4f0 Cleanup: spelling in comments 2022-12-05 12:58:18 +11:00
0dee238c8c Cleanup: remove duplicate doc-strings
Duplicating doc-strings in both header & implementation
should be avoided as they often diverge & maintaining them is more work.
2022-12-05 12:54:04 +11:00
2b914a2ecb Cleanup: correct misspelling of occurrence 2022-12-05 12:54:02 +11:00
cc6bdac921 Cleanup: format 2022-12-05 12:54:00 +11:00
997e143a50 Cleanup: quiet compiler warnings 2022-12-05 12:53:56 +11:00
9719fd6964 Cleanup: format 2022-12-03 10:53:44 +13:00
Iliya Katueshenock
18e386613c Attributes: Remove asserts for DefaultMixer negative weight
The attribute smoothing node asks for the ability to have a factor
outside the range of 0 and 1. The problem with this is that there is a
negative weight assertion for some of the mixers. If mixing between 0
and 1, then at a factor of 2, one of the elements will be negative.

Differential Revision: https://developer.blender.org/D16351
2022-12-02 14:44:54 -06:00
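The arithmetic behind the negative weight is easy to check. The sketch below uses a plain linear mix with hypothetical helper names; it is not the `DefaultMixer` code.

```python
def mix_weights(factor):
    """Weights applied to the two inputs of a linear mix."""
    return 1.0 - factor, factor


def mix(a, b, factor):
    """Linear mix of a and b; factors outside [0, 1] extrapolate."""
    weight_a, weight_b = mix_weights(factor)
    return weight_a * a + weight_b * b
```

With a factor of 2 the weights are -1 and 2, so the first input contributes with a negative weight even though the result (2 when mixing between 0 and 1) is a perfectly reasonable extrapolation.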
ce16fa0f4c Fix: Node Editor: Hide compositor-specific menu items
Previews and the "Read Viewlayers" operator are specific to the
compositor and shouldn't show in other node editor types.
2022-12-02 14:31:44 -06:00
2155bdd500 Cleanup: Remove "done" variable from node runtime
The runtime storage is meant for more persistent things. These local
states for an algorithm are much better handled by an array now.
2022-12-02 14:14:14 -06:00
1c26341464 Cleanup: Grammar in BMesh mesh conversion comment 2022-12-02 13:28:30 -06:00
ab4926bcff Fix: Various mishandling of node identifiers and vector
In a few places, nodes were added without updating the Identifiers and
vector. In other places nodes were removed without removing from and
rebuilding the vector. This is solved in a few ways. First I exposed
a function to rebuild the vector from scratch, and added unique ID
finding to a few places.

The changes to node group building and separating are more involved,
mostly because it was hard to see the correct behavior without some
refactoring. Now `VectorSet` is used to store nodes involved in the
operation. Some things are handled more simply with the topology
cache and by passing a span of nodes.
2022-12-02 13:28:30 -06:00
948f13a8e7 Cleanup: compiler warning 2022-12-02 19:13:38 +01:00
71071a25a0 Fix crash on File > Link or Append
Would attempt to destruct memory of a null pointer. Use `MEM_delete()`
instead of manual destruction, which allows this case (NOP then).
2022-12-02 19:09:52 +01:00
2a33875065 Fix link error after recent changes to use span for iterating over nodes 2022-12-02 18:51:38 +01:00
0302ab4e02 Fix link error on Linux + Clang due to missing atomic symbols
The new atomic disjoint set uses additional atomics which are not supported
as intrinsics on all architectures and require linking to libatomic.

Now always link to libatomic on Linux when it is available, instead of only
checking if atomic add for int64_t requires linking to this library.

Thanks to Sergey for the help fixing this.
2022-12-02 18:27:07 +01:00
6b7119f9ed Merge branch 'blender-v3.4-release' 2022-12-02 11:24:18 -06:00
5b8e2ebd97 Cleanup: Use Span to iterate over nodes instead of ListBase
Since 90ea1b7643, there is always a span of nodes
available at runtime. This is easier to read and write.
2022-12-02 11:13:00 -06:00
c5e71cebaa Cycles: Remove OpenGL header
It is not really used from any of the sources, including the
standalone app. Since we are moving to a more backend-independent
drawing it makes sense to remove header which was specific to
how Blender integrates Cycles into viewport.

Some cleanup in the CMake files is probably possible, but
there is some inter-dependency with USD.

Differential Revision: https://developer.blender.org/D16681
2022-12-02 17:19:00 +01:00
ab8946f957 Merge branch 'blender-v3.4-release' 2022-12-02 16:48:09 +01:00
3d9f4012dc Cycles: Fixes for viewport render on Metal drawing backend
This change fixes issues with viewport rendering when Metal
GPU backend is used for drawing. This is not a default build
configuration and requires the following tweaks:

- Enable WITH_METAL_BACKEND CMake option (set it to on)
- Use `--gpu-backend metal` command line arguments

It also helps using the `--factory-startup` command line
argument to ensure Eevee is not used (it is not ported and
will crash).

The root of the problem was in the use of glViewport().
It is replaced with GPU_viewport_size_get_i(), which
is supposed to be the portable equivalent from the GPU module.
Without this change the viewport size is detected to be 0,
which backfired in a few places.

The rest of the changes were to make the code more robust
in the extreme conditions instead of asserting or crashing.

Simplified and streamlined GPU resources creation in the
display driver. It was a bit convoluted mix of creation of
the GPU resources and resizing them to the proper size. It
even seemed to be done in the reverse order. Now it is as
simple as "just ensure GPU resources are there for the
given texture or buffer size".

Also avoid division by zero in the tile manager.

Differential Revision: https://developer.blender.org/D16679
2022-12-02 16:46:43 +01:00
2bce3c0ac4 Fix: don't allow node identifiers to be zero
Was missing in rB88c6d824e78ebe40b891.
2022-12-02 15:42:15 +01:00
e028662f78 Cycles: store axis and length of an area light instead of their product 2022-12-02 15:23:09 +01:00
6a7917162c Fix asset index only generating empty entries since 1efc94bb2f
Steps to reproduce were:
- Open a .blend file that is located inside of an asset library and
  contains assets.
- Save and close the file.
- Open a new file (Ctrl+N -> General).
- Open asset browser and load the asset library from above.
- If the assets from the file above still show up, press refresh button.
- -> Assets from the file above don't appear.

Likely fixes the underlying issue for T102610. A followup will be needed
to correct the empty asset index files written because of this bug.

We're in the process of moving responsibilities from the file/asset
browser backend to the asset system. 1efc94bb2f introduces a new
representation for asset, which would own the asset metadata now instead
of the file data.

Since the file-list code still does the loading of asset libraries,
ownership of the asset metadata has to be transferred to the asset
system. However, the asset indexing still requires it to be available,
so it can update the index with latest data. So transfer the ownership,
but still keep a non-owning pointer set.

Differential Revision: https://developer.blender.org/D16665

Reviewed by: Bastien Montagne
2022-12-02 14:48:51 +01:00
79498d4463 Cleanup: Silenced unused parameter in pbvh.c 2022-12-02 13:43:22 +01:00
ea86ec200a GPU: Added VkVertexBuffer alloc/release data.
This makes sure that the GPU_batch_init will not crash on an assert
where the data of vertex buffer needs to be allocated.
2022-12-02 13:41:23 +01:00
b8c7e93a65 Add experimental option to force all linked data as directly linked.
This is a workaround required to get BAT reliably working again after
recent rB133dde41bb5b, which fixed many indirectly linked IDs being
tagged as directly linked, and therefore having their reference written
in .blend file.

It seems that BAT is still missing proper handling of some ID pointers.

Required for the end of the Heist production here at Blender Studio.
2022-12-02 13:39:28 +01:00
d57f68616a Fix: bump minimum version
rB9fa4ceb340951 caused a forward compatibility issue.
Going forward, when changing socket names, only the name should be
changed and not the identifier if possible.
2022-12-02 13:18:54 +01:00
6d22aa2f84 Cleanup: simplify access to cached mesh normals 2022-12-02 13:12:06 +01:00
198460f6a4 Cleanup: fix compiler warning about using %u with int value
`but->type` is an `enum`, which maps to `int`, so `%d` should be used for
printing its value with `printf()`.
2022-12-02 12:54:11 +01:00
caac5686c5 GPU: Add vulkan to GPU_backend_get_type().
Vulkan backend detection wasn't added to GPU_backend_get_type.
This change will add support for vulkan to the function.
2022-12-02 12:51:11 +01:00
3d5a4fbcc2 Cleanup: move some files that use normals to C++
Doing this to help with T102858.
2022-12-02 12:34:26 +01:00
88c6d824e7 Nodes: ensure that node identifiers are larger than zero
Zero should not be a valid identifier to make it easier to detect when
the identifier has not been set after a node has been allocated.
2022-12-02 11:59:20 +01:00
39615cd3b7 BLI: add atomic disjoint set data structure
The existing `DisjointSet` data structure only supports single
threaded access, which limits performance severely in some cases.

This patch implements `AtomicDisjointSet` based on
"Wait-free Parallel Algorithms for the Union-Find Problem"
by Richard J. Anderson and Heather Woll.

The Mesh Island node also got updated to make use of the new data
structure. In my tests it got 2-5 times faster. More details are in D16653.

Differential Revision: https://developer.blender.org/D16653
2022-12-02 10:39:19 +01:00
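For readers unfamiliar with the data structure, the following is a minimal sequential union-find sketch, used here to count mesh islands from an edge list. It is single threaded and is not the wait-free algorithm from the paper or the `AtomicDisjointSet` implementation; the class, method, and function names are assumptions for illustration.

```python
class DisjointSet:
    """Plain (non-atomic) union-find with path halving and union by rank."""

    def __init__(self, size):
        self.parent = list(range(size))
        self.rank = [0] * size

    def find_root(self, x):
        while self.parent[x] != x:
            # Path halving: point x at its grandparent while walking up.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def join(self, a, b):
        root_a, root_b = self.find_root(a), self.find_root(b)
        if root_a == root_b:
            return
        if self.rank[root_a] < self.rank[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        if self.rank[root_a] == self.rank[root_b]:
            self.rank[root_a] += 1


def count_islands(verts_num, edges):
    """Number of connected vertex groups given (v1, v2) edge pairs."""
    disjoint_set = DisjointSet(verts_num)
    for v1, v2 in edges:
        disjoint_set.join(v1, v2)
    return len({disjoint_set.find_root(v) for v in range(verts_num)})
```

In this sequential form every `join` mutates shared arrays, which is why a thread-safe variant needs atomic operations before the edge loop can be split across threads.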
5f0120cd35 Merge branch 'blender-v3.4-release' 2022-12-02 08:47:05 +01:00
0197b524e4 Update THIRD-PARTY-LICENSES.txt for Blender 3.4. 2022-12-02 00:38:23 -08:00
46f991dbae Sculpt: Fix broken pivots when entering paint modes
When entering paint modes the paint pivot was cleared,
which broke rotate around pivot.  Fixed for all paint modes.
PBVH modes set the pivot to the PBVH bounding box
while texture paint uses the evaluated mesh bounding box.
2022-12-02 00:37:50 -08:00
6b0e769d14 Nodes: Restrict viewer key tree updates to compositor
The active viewer key is only used by the compositor, so only tag the
node tree for update if it is a compositor node tree.
2022-12-02 10:32:50 +02:00
09ee781a67 GPU: Add placeholders for PixelBuffer to vulkan backend.
PixelBuffer was recently introduced. This change adds empty placeholders to the
vulkan backend and other related API tweaks.
2022-12-02 08:35:17 +01:00
587b213fe1 Fix: Node sorting broken after node identifier commit
90ea1b7643 broke the sorting that happens as nodes are selected.
The compare function for stable sort had different requirements than
the previous implementation.
2022-12-01 17:55:33 -06:00
4d5e8b7caa Cleanup: Use new node identifiers when copying tree
We can avoid creating a new map and use the node vector set that
must be built anyway when updating pointers in the new tree.
2022-12-01 15:40:46 -06:00
e78cd27565 Fix T102895: Grammar in apply scale operator
"Fonts" are referred to as "Text objects" now.
2022-12-01 15:40:46 -06:00
b768a2bf2f Cleanup: Remove unnecessary list clearing in node tree reading
The lists were cleared a few lines below already.
2022-12-01 15:40:46 -06:00
8842a8c4c3 Cleanup: format 2022-12-02 10:14:50 +13:00
90ea1b7643 Nodes: Use persistent integer to identify nodes
This patch adds an integer identifier to nodes that doesn't change when
the node name changes. This identifier can be used by different systems
to reference a node. This may be important to store caches and simulation
states per node, because otherwise those would always be invalidated
when a node name changes.

Additionally, this kind of identifier could make some things more efficient,
because with it an integer is enough to identify a node and one does not
have to store the node name.

I observed a 10% improvement in evaluation time in a file with an extreme
number of simple math nodes, due to reduced logging overhead: from
0.226s to 0.205s.

Differential Revision: https://developer.blender.org/D15775
2022-12-01 15:08:12 -06:00
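A small sketch of why a name-independent identifier helps with caching. The `Node` and `NodeTree` classes are hypothetical stand-ins, and handing out identifiers from a counter starting at 1 is an assumption for illustration; the sketch only relies on the identifier being a positive integer that does not change on rename.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Node:
    identifier: int  # Stable for the lifetime of the node.
    name: str        # May be changed by the user at any time.


class NodeTree:
    def __init__(self):
        # Start at 1 so that 0 can mean "identifier not set yet".
        self._next_identifier = itertools.count(1)
        self.nodes = []

    def add_node(self, name):
        node = Node(next(self._next_identifier), name)
        self.nodes.append(node)
        return node
```

A cache keyed by `node.identifier` stays valid after `node.name` changes, whereas a cache keyed by name would have to be invalidated or re-keyed on every rename.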
fefe7ddf39 BLI: Add math::orthogonal and math::compare
Port of C BLI API.
2022-12-01 21:46:06 +01:00
2466b2e43c Cleanup: BLI: Rename arguments of math::atan2 2022-12-01 21:46:06 +01:00
730fd0a257 BLI: Add math::sqrt
Allows other number types to overload this function without polluting std
namespace.
2022-12-01 21:46:06 +01:00
4c1b250e17 Fix T102893: Assert Opening File Browser (Win32)
Fix debug assert opening File Browser on Windows platform.

See D16672 for more details.

Differential Revision: https://developer.blender.org/D16672

Reviewed by Julian Eisel
2022-12-01 12:25:49 -08:00
25501983bb Cleanup: Spelling mistake in comment 2022-12-01 16:29:38 +01:00
009f7de619 Cleanup: use better matching integer types for graphics interop handle
Ref D16042
2022-12-01 15:55:48 +01:00
Jason Fielder
b132e3b3ce Cycles: use GPU module for viewport display
To make GPU backends other than OpenGL work. Adds required pixel buffer and
fence objects to GPU module.

Authored by Apple: Michael Parkin-White

Ref T96261
Ref T92212

Reviewed By: fclem, brecht

Differential Revision: https://developer.blender.org/D16042
2022-12-01 15:55:48 +01:00
b5ebc9bb24 Fix T101996: merge fcurve keyframes on the same frame after snapping
Use recently introduced BKE_fcurve_merge_duplicate_keys (that was moved
from the transform system to BKE) to merge keyframes on the same frame
after snapping (same as what would happen with the transform system).

This makes behavior consistent and prevents a state after snapping that
cannot be reproduced in any other way.

NOTE: same probably has to be done for greasepencil, but that is for
another commit.
2022-12-01 15:41:55 +01:00
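The snap-then-merge behavior can be sketched as follows. Keys are modeled as `(frame, value)` tuples, and duplicates are combined by averaging their values; both the data layout and the averaging rule are assumptions for illustration and may differ from what `BKE_fcurve_merge_duplicate_keys` does.

```python
import math


def snap_to_nearest_frame(keys):
    """Snap each (frame, value) key to the nearest whole frame."""
    return [(math.floor(frame + 0.5), value) for frame, value in keys]


def merge_duplicate_keys(keys):
    """Collapse keys that share a frame into one key per frame."""
    values_by_frame = {}
    for frame, value in keys:
        values_by_frame.setdefault(frame, []).append(value)
    # Assumption: duplicates are merged by averaging their values.
    return [
        (frame, sum(values) / len(values))
        for frame, values in sorted(values_by_frame.items())
    ]
```

Snapping keys at frames 1.2 and 0.9 puts both on frame 1; without the merge step the curve would hold two keys on the same frame, which is the state the commit describes as not reproducible in any other way.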
a179246e1f Move fcurve cleanup from transform system to BKE
This exposes the fcurve cleanup from transform system to other callers
in anticipation to use it in the snapping operators.

It has been renamed from `posttrans_fcurve_clean` to
`BKE_fcurve_merge_duplicate_keys` to better describe what it does.
No functional change expected.

Ref. T101996

NOTE: same probably has to be done for greasepencil, but that is for
another commit.

Maniphest Tasks: T101996

Differential Revision: https://developer.blender.org/D16663
2022-12-01 15:41:50 +01:00
5e4dcb8cf0 Cleanup: use OB_MODE_ALL_PAINT_GPENCIL in more places
This just replaces the combined usage of OB_MODE_PAINT_GPENCIL
OB_MODE_SCULPT_GPENCIL
OB_MODE_WEIGHT_GPENCIL
OB_MODE_VERTEX_GPENCIL.

Differential Revision: https://developer.blender.org/D16652
2022-12-01 12:15:36 +01:00
1a2e2dcddc Cleanup: Improve function name for asset identifier creation
I find this a bit more explanatory/clear.
2022-12-01 11:42:27 +01:00
5c580ff457 Fix asset-only loading optimization not working as intended
Introduced in fc7beac8d6, but I think this never worked because the
`asset_library_ref` of the temporary file-list used for reading in a
background thread is nulled. Now there's a different pointer that we can
use that works properly.
2022-12-01 11:42:27 +01:00
9f3b0e41bb Fix T102887: crash deleting plane track
Two things here:
- fix ghash lookup from rB4d497721ecd1
-- this was looking in the wrong map (causing an assert on file load)
- set MovieTrackingObject active_plane_track to NULL upon deletion (same
as for regular tracks)
-- rBfe38715600c introduced a crash because `draw_tracking_tracks` would
still get an active plane track (logic for getting these changed)

Maniphest Tasks: T102887

Differential Revision: https://developer.blender.org/D16660
2022-12-01 10:59:59 +01:00
3cebc58936 Fix: Assert in subdivide curves node after span slicing change
a5e7657cee missed this call where clamped slicing is necessary.
The subdivision of a segment purposefully modifies the handle types of
the other side of the following control point, but that didn't work for
the final cyclic segment.
2022-11-30 21:21:58 -06:00
4aac5b92c1 Sculpt: Fix T102824: broken face primitive partitioning in pbvh nodes
The code I wrote to group triangles or multires quads that
belong to single faces into single PBVH nodes had edge
cases that failed.  The code is now much simpler and simply
assigns groups of primitives to nodes.
2022-11-30 13:55:08 -08:00
65393add52 Sculpt: Fix broken pivots when entering paint modes
When entering paint modes the paint pivot was cleared,
which broke rotate around pivot.  Fixed for all paint modes.
PBVH modes set the pivot to the PBVH bounding box
while texture paint uses the evaluated mesh bounding box.
2022-11-30 13:54:56 -08:00
918282d391 Sculpt: fix crash when no brush
If no brush exists the stroke operator
falls through to the grab transform
op in the global view3d keymap.

This now works.  It would be nice if
we could get rid of that keymap entry
though and add it manually to the edit/paint
modes that need it.
2022-11-30 13:54:03 -08:00
7151c2dc3e Cleanup: Unused variable, RNA description warning 2022-11-30 15:34:08 -06:00
222b64fcdc Fix Cycles CUDA crash when building kernels without optimizations (for debug)
In this case the blocksize may not be the one we requested, which was assumed to be
the case. Instead get the effective block size from the compiler as was already
done for Metal and OneAPI.
2022-11-30 21:46:17 +01:00
b25c301c15 Build: make CUDA kernel compilation output not verbose
Unless using WITH_CYCLES_DEBUG.

This is convenient for investigating kernel performance, but too verbose to
always have in the buildbot logs especially now that we are also compiling HIP
and OneAPI kernels.
2022-11-30 21:19:51 +01:00
396b407c7d Cycles: new setting and heuristics for mesh light importance sampling
Materials now have an enum to set the emission sampling method, to be
either None, Auto, Front, Back or Front & Back. This replaces the
previous "Multiple Importance Sample" option.

Auto is the new default, and uses a heuristic to estimate the emitted
light intensity to determine if the mesh should be considered as a light
for sampling. Shaders sometimes have a bit of emission but treating them
as a light source is not worth the memory/performance overhead.

The Front/Back settings are not important yet, but will help when a
light tree is added. In that case setting emission to Front only on
closed meshes can help ignore emission from inside the mesh interior that
does not contribute anything.

Includes contributions by Brecht Van Lommel and Alaska.

Ref T77889
2022-11-30 21:19:51 +01:00
ac51d331df Refactor: Cycles light sampling code reorganization
* Split light types into own files, move light type specific code from
  light tree and MNEE.
* Move flat light distribution code into own kernel file and host side
  building function, in preparation of light tree addition. Add light/sample.h
  as main entry point to kernel light sampling.
* Better separate calculation of pdf for selecting a light, and pdf for
  sampling a point on the light. The selection pdf is now also stored in
  LightSampling for MNEE to correctly recalculate the full pdf when the
  shading position changes but the point on the light remains fixed.
* Improvement to kernel light storage, using packed_float3, better variable
  names, etc.

Includes contributions by Brecht Van Lommel and Weizhen Huang.

Ref T77889
2022-11-30 21:19:51 +01:00
db1728096a Cleanup: Remove unused node socket cache handling
This cache was never written to, only "copied" between sockets in one
case, it dates back at least a decade. It doesn't make sense to store
caches on node trees directly anyway, since they can be used in
multiple places.
2022-11-30 13:25:06 -06:00
31b3b07ad7 Cleanup: Remove useless comments in node.cc
Also remove unnecessary `struct` keywords.
2022-11-30 13:11:12 -06:00
f37e8c2e96 Merge branch 'blender-v3.4-release' 2022-11-30 20:08:31 +01:00
b582028b12 Correct previously missed case of manual path building in file browser
Missed in 39c9164ea1. Also adds a comment to point at the function
that should be used instead.
2022-11-30 20:02:09 +01:00
39c9164ea1 File/Asset Browser: Get full asset path from asset representation
No user visible changes expected.

Add a function to query the full path for a file, so that asset files
can get the path via the asset representation and its new asset
identifier. This is designed to be a reliable way to locate an asset,
and using it is yet another step to rely less on the problematic file
browser code.
Also, previous code would build the full path manually in a few places,
which is good to deduplicate anyway.
2022-11-30 19:44:34 +01:00
ccc9eef1b9 Assets: Get asset path via new identifier (not via file browser hacks)
With the asset identifier introduced in the previous commit, we can now
locate an asset just from its `AssetRepresentation`, without requiring
information from the asset library and the file browser storage. With
this we can remove some hacks and function parameters. A RNA/BPY
function is also affected, but I didn't remove the parameter to keep
compatibility. It's simply ignored and not required anymore, noted this
in the parameter description (noted for T102877).
2022-11-30 19:44:34 +01:00
f68da703a5 Asset system: Initial asset identifier type
No user visible changes expected.

`AssetIdentifier` holds information to uniquely identify and locate an
asset. More information:
https://wiki.blender.org/wiki/Source/Architecture/Asset_System/Back_End#Asset_Identifier

For the start this is tied quite a bit to file paths, so that external
assets are assumed to be in the file system.

This is needed to support an "All" asset library (see T102879), which
would contain assets from different locations. Currently the location of
an asset is queried via the file browser backend, which however requires
a common root location. It also moves us further away from the file
browser towards the asset system (see T87235) and allows us to remove
some hacks (see following commit).
2022-11-30 19:44:34 +01:00
cfaca0d9ab Asset System: Store root path in asset library data
No user visible changes expected.

If an asset library is located on disk, store the path to it in the
asset library data. This is called the "root path" now.
With this we can construct an asset identifier, which is introduced in
the following commit.
2022-11-30 19:44:34 +01:00
2165136740 File/Asset Browser: Refactor how recursive paths are set
When reading directories recursively, the code would first only set the
file name as the relative path and then later iterate over the read files
and prepend the recursed into path, to get the complete path relative to
the recursed into directory. This isn't clear and confused me quite a
bit. And it is not compatible with what we need for creating asset
identifiers, which are introduced in the 2nd following commit.

Instead properly determine the complete relative path when initially
adding the file, and don't change it after. The asset identifier can then
be constructed properly at the time needed.
2022-11-30 19:44:34 +01:00
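The approach described in the commit above (compute the complete relative path once, when the file is first added) can be illustrated with a minimal sketch. `DirNode` and `collect_relative_paths` are hypothetical names for this note and are not taken from the file browser code, which reads real directories rather than an in-memory tree.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory directory tree, used only for this sketch.
struct DirNode {
  std::string name;
  std::vector<std::string> files;
  std::vector<DirNode> subdirs;
};

// Walk the tree and store each file with its complete path relative to the
// root. The prefix is known when the entry is created, so nothing has to be
// prepended in a later pass.
inline void collect_relative_paths(const DirNode &dir,
                                   const std::string &prefix,
                                   std::vector<std::string> &r_paths)
{
  for (const std::string &file : dir.files) {
    r_paths.push_back(prefix + file);
  }
  for (const DirNode &sub : dir.subdirs) {
    collect_relative_paths(sub, prefix + sub.name + "/", r_paths);
  }
}

// Small self-check of the traversal.
inline void relative_paths_self_check()
{
  const DirNode root{"", {"a.blend"}, {DirNode{"props", {"chair.blend"}, {}}}};
  std::vector<std::string> paths;
  collect_relative_paths(root, "", paths);
  assert(paths.size() == 2);
  assert(paths[1] == "props/chair.blend");
}
```

Because every stored path is final at insertion time, anything derived from it later (such as an identifier) does not depend on a second fix-up pass having run.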
b78b6e3cd7 Cleanup: Correct comment in hash description
We use the blender namespace now rather than BLI.
2022-11-30 11:49:32 -06:00
7cdcb76815 Cleanup: Remove node tree runtime fields
`done` was only used in one place, and `is_updating` was never read.
Generally we should avoid adding this sort of temporary data to longer
lived structs.
2022-11-30 11:41:01 -06:00
507b724056 Cleanup: Remove unnecessary BMesh unique pointer in OBJ code
This is only used once, it's simpler to just free it in that case and
wait for further RAII improvements from elsewhere in the codebase.
2022-11-30 10:46:37 -06:00
0b13e7ce0f Cleanup: Remove unnecessary use of deprecated DNA define
This was solved by `dna::shallow_copy`
2022-11-30 10:27:33 -06:00
Christoph Lendenfeld
c17d7ddabe Merge branch 'blender-v3.4-release' 2022-11-30 17:26:01 +01:00
429771fed5 Add 'work around' to accessing data from volatile iterators in py API.
Re T102550.
2022-11-30 17:04:37 +01:00
Christoph Lendenfeld
18de712257 Fix T100879: Bake Action fails with "Nothing to Bake"
When applying the "Bake Action" operator in pose mode,
it could throw an error saying "Nothing to Bake"
even though bones are selected.

That is because the code was looking for a selected armature.
But in Pose Mode, clicking into empty space to de-select would also
deselect the armature.
Then box selecting would not make the armature selected again.

Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D16593
2022-11-30 16:57:21 +01:00
313c2e9105 Fix a test after recent changes to lib (in)directly linked ID handling.
rB133dde41bb5b changed the handling of the (in)directly linked status
for IDs: IDs that are not directly linked now get proper status and
handling on file save. This broke parts of the `pyapi_idprop_datablock`
tests.
2022-11-30 15:08:11 +01:00
19dc2157cd BLI: Add trigonometric functions to `BLI_math_base.hh`
This is needed for the upcoming matrix library.
2022-11-30 12:59:47 +01:00
249acdf529 Cleanup: Unused variable warning in release build 2022-11-30 12:49:33 +01:00
Bastien Montagne
133dde41bb Improve handling of (in)directly linked status for linked IDs.
This commit essentially ensures before writing .blend file that only
actually locally used linked IDs are tagged as `LIB_TAG_EXTERN` (and
therefore get a reference stored in the .blend file).

Previously, a linked ID tagged as directly linked would never get back
to the indirectly linked status. Consequence was a lot of 'needless'
references to linked data in .blend files.

This commit also introduces a new flag for lib_query ID usage types,
`IDWALK_CB_DIRECT_WEAK_LINK`, used currently for base's Object
pointer, and for LayerCollection's Collection pointer.

NOTE: A side-effect of this patch is that even IDs explicitly linked by
the user won't be kept anymore when writing files, i.e. they will not be
there anymore after a file reload, if they are not actually used.

The overhead of the new process is below 0.1% of the whole file saving process,
e.g. in a Heist production file.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16605
2022-11-30 11:16:14 +01:00
ae081b2de1 Cleanup: reduce variable scope in uv parametrizer
Also improve const correctness and update comments.

Simplify future fix for T78101
2022-11-30 12:01:40 +13:00
d602d4f15f Cleanup: reduce variable scope in uv parametrizer
Simplify future fix for T78101
2022-11-30 11:56:00 +13:00
5ce72bba7e Cleanup: simplify flush/blend logic in uv parametrizer
Simplify future fix for T78101
2022-11-30 11:50:12 +13:00
b99cdf7472 Cleanup: use blenlib geometry functions in uv parametrizer
Simplify future fix for T78101
2022-11-30 11:44:13 +13:00
cb7e36cfa5 Merge branch 'blender-v3.4-release' 2022-11-29 21:05:45 +01:00
68a0846021 Fix T102657: Unable to add strip to new channel
This is a usability improvement, rather than a bugfix. By default, height of
VSE timeline is clamped to 7 channels, unless more are added. But adding
new strip is not intuitive, since user can't scroll up due to clamping.

Clamp timeline height to n+1 used channels, so there is always 1 free
channel visible.
2022-11-29 19:49:40 +01:00
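The clamping rule from the commit above can be sketched as a one-line function. This is an illustration under assumptions: the constant and function names are hypothetical, and the default of seven channels is taken from the commit message rather than from the sequencer code.

```cpp
#include <algorithm>
#include <cassert>

// Assumed default from the commit message: seven channels are shown when
// fewer are in use. The constant name is hypothetical.
constexpr int DEFAULT_VISIBLE_CHANNELS = 7;

// Returns how many channels the timeline height should cover so that one
// empty channel always remains above the highest used one.
inline int timeline_channel_limit(int highest_used_channel)
{
  return std::max(DEFAULT_VISIBLE_CHANNELS, highest_used_channel + 1);
}

// Small self-check of the rule.
inline void timeline_channel_limit_self_check()
{
  assert(timeline_channel_limit(3) == 7);
  assert(timeline_channel_limit(7) == 8);
}
```

With three used channels the default of seven still applies. Once channel seven is occupied the limit grows to eight, so a strip can always be dropped onto a free channel.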
3f9febcabf Merge branch 'blender-v3.4-release' 2022-11-29 19:09:42 +01:00
9f753e5649 Fix build error after PyGPU changes 2022-11-29 18:46:59 +01:00
8fa8cea8e0 Fix PyGPU: return NULL instead of PyNone on error
Error in d7f124f06f
2022-11-29 13:59:52 -03:00
d7f124f06f Fix T102845: GPU python crash in background mode
`BPYGPU_IS_INIT_OR_ERROR_OBJ` is not implemented in all pygpu functions.

Instead of copying and pasting that call across the API when it has no
gpu context, override the methods with one that always reports error.
2022-11-29 13:55:46 -03:00
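The idea in the commit above, replacing every method with one that always reports an error when no GPU context exists, can be shown with a plain C++ dispatch table. This is only an analogy under assumptions: `MethodTable`, `report_no_context` and `disable_without_context` are hypothetical names, and the real change operates on Python method definitions, not on `std::function`. An empty `std::optional` stands in for the NULL error return mentioned in 8fa8cea8e0.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>

using Method = std::function<std::optional<int>(int)>;
using MethodTable = std::map<std::string, Method>;

// Stub that always reports failure. The empty optional plays the role of a
// NULL return that signals an error to the caller.
inline std::optional<int> report_no_context(int /*unused*/)
{
  return std::nullopt;
}

// Instead of adding a context check to every method, swap all of them for
// the failing stub once, when it is known that no context is available.
inline void disable_without_context(MethodTable &table, bool has_gpu_context)
{
  if (has_gpu_context) {
    return;
  }
  for (auto &item : table) {
    item.second = report_no_context;
  }
}

// Small self-check of the override.
inline void method_table_self_check()
{
  MethodTable table;
  table["double_it"] = [](int v) { return std::optional<int>(v * 2); };
  disable_without_context(table, false);
  assert(!table["double_it"](4).has_value());
}
```

The trade-off is that the check lives in one place, so newly added methods cannot forget it.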
71f9fbcf35 Merge branch 'blender-v3.4-release' 2022-11-29 16:46:07 +01:00
85d9f12339 BLI: increase default inline buffer capacity in BitVector
Using 32 does not make much sense, because there will be 4 remaining
padding bytes in the struct anyway. Using 64 instead does not actually
increase the size of the struct, but makes allocations less likely.
2022-11-29 16:35:59 +01:00
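The padding argument in the commit above can be checked with two stand-in structs. The member layout here is an assumption made for illustration and is not the actual `BitVector` definition. The equality holds on typical 64-bit platforms where 8-byte members are 8-byte aligned.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layouts, not the actual BitVector members.
struct BitsWith32BitInline {
  std::uint8_t *data;
  std::int64_t size_in_bits;
  std::int64_t capacity_in_bits;
  std::uint32_t inline_buffer; /* 4 bytes, then 4 bytes of tail padding. */
};

struct BitsWith64BitInline {
  std::uint8_t *data;
  std::int64_t size_in_bits;
  std::int64_t capacity_in_bits;
  std::uint64_t inline_buffer; /* 8 bytes, uses the former padding. */
};

// True when doubling the inline buffer does not grow the struct.
inline bool inline_buffer_growth_is_free()
{
  return sizeof(BitsWith32BitInline) == sizeof(BitsWith64BitInline);
}

// Small self-check, restricted to platforms with 8-byte pointers.
inline void inline_buffer_self_check()
{
  assert(sizeof(void *) != 8 || inline_buffer_growth_is_free());
}
```

On such platforms both structs occupy 32 bytes, so the larger inline capacity costs no extra memory per vector.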
7b08298927 BLI: use no_unique_address in BitVector
This allows the vector to be smaller when it has no inline buffer (32 -> 24 bytes).
2022-11-29 16:34:57 +01:00
57613630c7 BLO: use blender::Map in OldNewMap
`OldNewMap` used to have its own map implementation. Given that
the file uses C++ now, it is easy to use a C++ map implementation
instead. This simplifies the code a lot.

Going forward it might make sense to remove the `OldNewMap`
abstraction or to split it up in two (currently, `NewAddress.nr` has
two different meanings in different contexts which is confusing).

No functional changes are expected.

Differential Revision: https://developer.blender.org/D16546
2022-11-29 16:24:45 +01:00
fdf1837120 Merge branch 'blender-v3.4-release' 2022-11-29 13:17:33 +01:00
2b639f671f GPencil: Create Keyframe using Eraser if Auto-key is ON
Before, the frame was not created, but now if there is
a previous stroke and the frame changed, the new keyframe
is created.

This is related to T102623
2022-11-29 12:15:10 +01:00
2b85151a32 Cleanup: Braces around initialization of subobject 2022-11-29 11:03:48 +01:00
863cd1ea8e Merge branch 'blender-v3.4-release' 2022-11-29 09:33:41 +01:00
4067e6bc41 Cleanup: format 2022-11-29 17:33:07 +13:00
3d1594417b UV: support constrain-to-bounds for uv shear operator
For uv rotation operator, see rBd527aa4dd53d4.
2022-11-29 12:04:36 +13:00
6fdddae2b0 Fix T102804: Click & Drag on toggles no longer possible
Typo in 136ea84d9a
2022-11-28 16:29:37 -06:00
01a38c2be9 Fix T102827: 3D View header layout broken after C++ conversion
I missed this flag when removing designated initializers.
2022-11-28 16:29:37 -06:00
0ed4865fd0 Sculpt: Fix T102337: Null pointer error circle (tube) brush test code 2022-11-28 13:11:52 -08:00
0ad8f3ff58 Sculpt: fix T102348: Don't fold area normal automasking into cache
Certain automasking modes build a factor cache.  Modes
that rely on the mirror symmetry pass should not fold into
this pass.
2022-11-28 13:11:51 -08:00
1fc5dc3bf3 Sculpt: fix T102664: Broken multires solid shading
Calculate quad normal directly instead of averaging
the vertex normals.
2022-11-28 13:11:51 -08:00
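One common way to compute a quad normal directly is the normalized cross product of the two diagonals. The sketch below shows that technique. It is an assumption that the fix above uses this exact formula, and `Vec3f` and the helper names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3f = std::array<float, 3>;

inline Vec3f sub(const Vec3f &a, const Vec3f &b)
{
  return {a[0] - b[0], a[1] - b[1], a[2] - b[2]};
}

inline Vec3f cross(const Vec3f &a, const Vec3f &b)
{
  return {a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]};
}

inline Vec3f normalized(const Vec3f &v)
{
  const float len = std::sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
  if (len == 0.0f) {
    return {0.0f, 0.0f, 0.0f};
  }
  return {v[0] / len, v[1] / len, v[2] / len};
}

// Normal of the quad (v0, v1, v2, v3) from the cross product of its
// diagonals. Counter-clockwise winding gives a normal facing the viewer.
inline Vec3f quad_normal(const Vec3f &v0, const Vec3f &v1, const Vec3f &v2, const Vec3f &v3)
{
  return normalized(cross(sub(v2, v0), sub(v3, v1)));
}

// Small self-check with a unit quad in the XY plane.
inline void quad_normal_self_check()
{
  const Vec3f n = quad_normal({0, 0, 0}, {1, 0, 0}, {1, 1, 0}, {0, 1, 0});
  assert(std::fabs(n[2] - 1.0f) < 1e-6f);
}
```

Unlike an average of vertex normals, this value depends only on the four corner positions, so every pixel of the quad shades with the same flat normal.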
d1d2946f59 Merge branch 'blender-v3.4-release' 2022-11-28 14:12:04 -06:00
07200eaa85 Merge branch 'blender-v3.4-release' 2022-11-28 21:07:13 +01:00
719ad4f93f Fix T100537: wrong depth pass for background after recent fix for gaps
Also have to write if we hit the background and have not written any valid
value for the pass yet.
2022-11-28 21:03:07 +01:00
9bdde6ca96 Python Module: add source code and credits to project description 2022-11-28 21:03:07 +01:00
6d52975019 Cleanup: remove Cycles standalone repository lib detection
This is only needed in the Cycles repo and having it in the Blender repo
is making merging more complicated than it is helping.
2022-11-28 21:03:07 +01:00
7540842ca7 Fix T99592: Exact Boolean: Skip empty materials, add index-based option
**Empty Slot Fix**
Currently the boolean modifier transfers the default material from
meshes with no materials and empty material slots to the faces on the
base mesh. I added this in a2d59b2dac for the sake of consistency,
but the behavior is actually not useful at all. The default empty
material isn't chosen by users, it just signifies "nothing," so when
it replaces a material chosen by users, it feels like a bug.

This commit corrects that behavior by only transferring materials from
non-empty material slots. The implementation is now consistent between
exact mode of the boolean modifier and the geometry node.

**Index-Based Option**

"Index-based" is the new default material method for the boolean
modifier, to access the old behavior from before the breaking commit.

a2d59b2dac actually broke some Boolean workflows fundamentally, since
it was important to set up matching slot indices on each operand. That
isn't the cleanest workflow, and it breaks when materials change
procedurally, but historically that hasn't been a problem. The
"transfer" behavior transfers all materials except for empty slots,
but the fundamental problem is that there isn't a good way to specify
the result materials besides using the slot indices.

Even then, the transfer option is a bit more intuitive and useful for
some simpler situations, and it allows accessing the behavior that has
been in 3.2 and 3.3 for a long time, so it's also left in as an option.
The geometry node doesn't get this new option, in the hope that we'll
find a better solution in the future.

Differential Revision: https://developer.blender.org/D16187
2022-11-28 21:03:07 +01:00
da363d831b Fix assert when calling transform operators in python handles
In these cases `t->spacetype` is `SPACE_EMPTY`.

Returning 0 is not problematic as this space does not support snapping
anyway.
2022-11-28 15:51:01 -03:00
56ae4089eb GPencil: Allow interpolation to use breakdown keyframe as extremes
Currently, interpolation can only be done between keyframes that are not of breakdown type,
but in some cases this is not convenient.

Now, a new option is displayed to allow the interpolation using breakdown keyframes
as interpolation extremes.

Reviewed By: mendio, pepeland

Differential Revision: https://developer.blender.org/D16515
2022-11-28 19:32:18 +01:00
7e7c6bc468 Cleanup: Use spans and lambdas for mesh normal calculation
Makes code safer, easier to understand, and less verbose. I detected
negligible performance differences, only a slight improvement for the
normalize step where the function call overhead was probably more
of a bottleneck.

I kept `memset` instead of `.fill(float3(0))` because that gave
better performance in my tests. In the future that stage could be
parallelized, or we could make sure new arrays are allocated with
`calloc`.
2022-11-28 12:28:27 -06:00
b0bf10889b Merge branch 'blender-v3.4-release' 2022-11-28 18:26:21 +01:00
33b3645d97 Merge branch 'blender-v3.4-release' 2022-11-28 17:11:46 +01:00
Colin Basnett
c47b6978e3 Animation: Make Bake Animation operator use preview range when enabled
This patch makes the Bake Action operator fill the Start Frame & End Frame with those of the Preview Range if "Use Preview Range" is enabled.

{F13973619}

Reviewed By: sybren

Differential Revision: https://developer.blender.org/D16630
2022-11-28 08:04:21 -08:00
19bb30baf6 Fix T102735: Knife tool does not work properly in perspective viewport
Use `ED_view3d_win_to_3d` to unproject the first click coords.

This is the same function used in other tools like Draw Curve.

Differential revision: https://developer.blender.org/D16617
2022-11-28 12:38:47 -03:00
5758d114c1 Dual Mesh: Avoid transferring position attribute twice
The node transferred position once as a generic attribute, and then set
the values again manually. This wastes processing during the attribute
transfer step. On a 1 million face grid, I observed roughly an 8%
improvement, from 231.5 to 217.1 ms average and 225.4 to 209.6 ms min.
2022-11-28 08:19:33 -06:00
d96859c5b1 Cleanup: Move dual mesh topology map to blenkernel
It's helpful to have these topology maps standardized and organized
a bit better so they can be optimized and considered for future caching
together. Also use a more standard name for the map for that purpose.
2022-11-28 08:19:33 -06:00
7a9fce28c0 Cleanup: Pass spans by value in cone mesh primitive
Also use more typical ordering for the arguments.
2022-11-28 08:19:33 -06:00
70041ced14 Cleanup: Remove unused mesh array variables and arguments 2022-11-28 08:19:33 -06:00
6e26d0645e Cleanup: Use spans for voxel remesh mesh data 2022-11-28 08:19:33 -06:00
0940719b5a Line Art: Use local spans for mesh arrays
Avoid accessing arrays from the mesh for every element and add safety
by using Span instead of raw pointers. Similar to previous commits.
2022-11-28 08:19:33 -06:00
baba5d2214 Multires: Avoid retrieving mesh arrays for every element
Based on the surrounding code this probably wasn't a
bottleneck, but it's nice to avoid in principle anyway.
2022-11-28 08:19:33 -06:00
653e3e2689 Subdiv: Avoid repeatedly accessing mesh arrays
Fix a performance regression from 05952aa94d by storing pointers
to mesh arrays locally in the subdiv foreach context. In a simple test
of a 1 million face grid, this improved performance by 5% (from 0.31
to 0.295 seconds).
2022-11-28 08:19:33 -06:00
bcabd04e32 Mesh: Avoid retrieving edge and loop arrays repeatedly
A utility function retrieved mesh arrays for every element after
05952aa94d which can be easily avoided. This was used when
building the GPU indices for sculpt mode drawing. In my tests this
saves 0.1ms per PBVH node. There may be very slight improvements
in line art and shrinkwrap as well.
2022-11-28 08:19:33 -06:00
a059b1b0f1 Merge branch 'blender-v3.4-release' 2022-11-28 12:52:32 +01:00
4ed649352f 3D Texturing: Fix seam bleeding.
{F13294314}
# Process

In the pixel extraction process a larger domain will be extracted than the input mesh.
The borders of uv islands are extended with connected geometry of the input mesh.
The extended mesh is then fed into the pixel extraction process.
A mask is used to limit the extraction so UV islands will not overlap.

Input UV islands.
{F13206401}

Extended UV Island (only one showing).
{F13288764}

This patch doesn't include fixing uv seams at non-manifold edges (like Suzanne's eyes) as that
would require a different approach (edge extending or pixel copying). The latter has already been
implemented in D14702, but should be revisited to only do the non-manifold edge fixing.

This patch supports fixing UV seams across UDIM textures.
There might be an issue when using a single texture on multiple uv maps.

Reviewed By: brecht, joeedh, JulienKaspar

Maniphest Tasks: T97352

Differential Revision: https://developer.blender.org/D14970
2022-11-28 08:32:06 +01:00
c02ec74405 Cleanup: format 2022-11-28 13:17:59 +13:00
95003c99d9 GPU: Change inheritance of depth write and default values
This new inheritance behavior is more beneficial for the metal Backend.
Also change the default depth write behavior of shaders to be unchanged.
This makes fragment shader depth amendment more explicit.

This also adds the missing depth_write for metal kernels.
2022-11-27 23:58:55 +01:00
d961119563 Python API Docs: document when fields use mathutils types.
When accessing certain structure fields from Python, they return
mathutils types instead of generic arrays (this is based on subtype).

This exposes this information in the Python API documentation.

Differential Revision: https://developer.blender.org/D16626
2022-11-28 00:33:41 +02:00
57a20b6d52 DRW: Add missing depth_write to certain shader create info
These are required by the Metal backend.
2022-11-27 22:58:10 +01:00
5b4efaeeb3 Merge branch 'blender-v3.4-release' 2022-11-27 21:42:37 +01:00
ea384fc096 Merge branch 'blender-v3.4-release' 2022-11-27 14:41:36 +01:00
3ccf4a3944 Merge branch 'blender-v3.4-release' 2022-11-27 12:41:42 +01:00
Iliya Katueshenock
9fa4ceb340 Geometry Nodes: Change Collection Info output socket name to Instances
As described in T101948, this commit changes socket name to be more
consistent with other nodes that generate instances output.

Differential Revision: https://developer.blender.org/D16394
2022-11-26 18:11:01 -06:00
Iliya Katueshenock
42485b01d2 Geometry Nodes: Rename Transform node to Transform Geometry
Change name to make navigation easier for beginner users. This should
more clearly hint at the use of this node to change the full geometry,
and not work with fields, and makes the name more consistent.

Differential Revision: https://developer.blender.org/D16396
2022-11-26 18:05:41 -06:00
Iliya Katueshenock
beeeb6da30 Cleanup: Integer types, references in geometry node image texture node
While implementing T102289, I noticed that this node has
several solutions that are different from other, newer nodes.
 - Explicitly set default values
 - Use references
 - Reduce the size of the node settings structure

Differential Revision: https://developer.blender.org/D16548
2022-11-26 18:00:47 -06:00
3f5dfbf681 Geometry Nodes: Modify existing mesh in split edges node
Instead of creating a new mesh from scratch, modify an existing mesh.
This allows us to keep derived caches for triangulation and bounds
alive more easily, and allows keeping materials and non-generic
attributes like vertex groups alive on the mesh.

It also has other performance benefits, since face and face corner
attributes aren't affected at all, and because of reduced overhead
from not allocating a new mesh.

Updating edge attributes is a bit more complicated now, since we
have to completely replace the arrays but keep the existing attribute
IDs around. The new mesh update tag is also slightly too specific IMO.
But I think both of those things will improve in the future because
of existing plans for further refactoring these areas:
- New attribute storage that gives pointer stability
- Further use and granularity of mesh update tagging that will
  make the correct API more clear

Fixes T102711

Differential Revision: https://developer.blender.org/D16615
2022-11-26 17:54:05 -06:00
3a41e0f611 Tests: Automated geometry nodes benchmark
Add a script for a very simple object evaluation benchmark.
There could be more advanced ways of measuring the time
per-node or per modifier, but this just loads the file, tags
the active object for a reevaluation, and times how long
that takes.

Differential Revision: https://developer.blender.org/D16604
2022-11-26 17:15:55 -06:00
828525b268 Fix: MSVC build error without TBB
windows.h once more providing min/max macros when you least want them.
2022-11-26 11:44:08 -07:00
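A well-known way to sidestep the `min`/`max` macros from windows.h is to wrap the function name in parentheses, which prevents function-like macro expansion. Defining `NOMINMAX` before the include is another. The sketch below shows the first option. Which one the commit above actually used is not stated, and `clamp_to_range` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>

// When windows.h is included without NOMINMAX, `min` and `max` may be
// function-like macros. A macro of that kind only expands when its name is
// directly followed by `(`, so `(std::min)(a, b)` always calls the function.
inline int clamp_to_range(int value, int low, int high)
{
  return (std::max)(low, (std::min)(value, high));
}

// Small self-check of the helper.
inline void clamp_to_range_self_check()
{
  assert(clamp_to_range(5, 0, 3) == 3);
  assert(clamp_to_range(-2, 0, 3) == 0);
}
```

The parenthesized form compiles identically on platforms where no such macros exist, so it needs no platform guards.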
4ecc7cf14a Cleanup: Move interface_intern.hh
The entire interface directory is now compiled as C++ files.
2022-11-26 10:12:58 -06:00
e47c75aa6e Cleanup: Move interface eyedroppers directory to C++ 2022-11-26 10:12:58 -06:00
136ea84d9a Cleanup: Move interface_handlers.c to C++ 2022-11-26 10:12:58 -06:00
162f0dcb2f Cleanup: Move six more interface files to C++ 2022-11-26 10:12:58 -06:00
1aff91b1a7 GPencil: Add Vertex Opacity Overlay parameter in Sculpt
This option was missing in overlay panel.
2022-11-26 15:39:59 +01:00
b43bdd8ba2 Fix T102751: missing tree update with muted nodes
This was accidentally caused by removing too much code in
{rBb4c3ea264439158df70e837e20f8dd9ec548de2d}.
2022-11-26 13:46:39 +01:00
86ade3df56 Nodes: move node registration to nodes module
The main goal here is to move towards more self contained node
definitions. Previously, one would have to change `blenkernel` to
add a new node which is not necessary anymore. There is no need
for all these register functions to "leak out" of the nodes module.

Differential Revision: https://developer.blender.org/D16612
2022-11-26 13:20:18 +01:00
8d269a2488 PyDocs: Fix incorrect data type of bmesh.types.BMFaceSeq.new
The API documentation of [[ https://docs.blender.org/api/current/bmesh.types.html?highlight=faces#bmesh.types.BMFaceSeq.new | bmesh.types.BMFaceSeq.new ]] indicates that the argument is only `BMVert`.
But the correct one is a sequence of `BMVert`.
This patch fixes this mismatch.

Contributed by @Nutti

Differential Revision: https://developer.blender.org/D15668
2022-11-25 19:55:04 -05:00
ed6e1381dc Cleanup: Avoid using macros to refer to theme global variables
Prefer slightly more verbose code to the use of macros where
they aren't really necessary and just add indirection.
2022-11-25 17:11:44 -06:00
248ee6270b Cleanup: Remove unnecessary includes 2022-11-25 17:10:28 -06:00
afd16c487d Cleanup: Move four interface files to C++ 2022-11-25 17:09:47 -06:00
4029cdee7b Merge branch 'blender-v3.4-release' 2022-11-25 15:28:48 -06:00
4a0e19e608 Cleanup: Group deprecated mesh DNA fields, improve comments 2022-11-25 12:54:22 -06:00
5ca6965273 Merge branch 'blender-v3.4-release' 2022-11-25 18:54:49 +01:00
f07b09da27 Cycles: Improve oneAPI backend support for non-Intel platforms 2022-11-25 17:46:59 +01:00
f83aa1ae08 Fix T102764: Slow change of active material slot
The issue is caused by the combination of the following factors:

- There is a driver from custom property to the subdivision surface
  modifier.
- Active material index tags the ID for the copy-on-write update.

Dependency graph currently does not fully distinguish between
copy-on-write tag and properties-update tag, so the copy-on-write tag
makes the dependency graph believe that a property which actually
affects evaluation has been changed.

The simple solution is to treat the active material slot index as an
interface data which does not need to trigger copy-on-write tag.

The possible downside of this solution is that if someone has a driver
from this property the driver will stop working. Whether there is such
a real-life setup or not is not clear. It is not something advisable to do
anyway.

Possible alternative would be to introduce more granularity into the
way how property tagging is done. This is something that would be nice
to implement eventually, but it is a much bigger refactor.

Differential Revision: https://developer.blender.org/D16613
2022-11-25 17:44:54 +01:00
d1c21d5673 Fix T102470: Make material LineArt properties animatable.
MaterialLineArt didn't have a path func, now corrected.
2022-11-25 23:18:41 +08:00
994e3c6ac5 Clarify depsgraph API usage in the libraries code
Basically copy the information from the commit message of the
03e2f11d48 directly to the code.

This makes the information easier to find when working on the
code.
2022-11-25 15:25:55 +01:00
118afe1de3 Fix T101824: Line art flickers when light object has scaling.
Line art doesn't expect light or camera objects to have scaling.
2022-11-25 22:00:21 +08:00
ae6e35279f Clarify comment about ID_RECALC_COPY_ON_WRITE
The copy-on-write is really an implementation detail of the
dependency graph. While there are still cases where there is
no better tag to be used, the ID_RECALC_COPY_ON_WRITE should
not be used in combination with a dedicated tag.

For example if location of object changes the proper tag is
`ID_RECALC_TRANSFORM`. Tagging with `ID_RECALC_TRANSFORM |
ID_RECALC_COPY_ON_WRITE` will seemingly work, but this is
not an intended usage.
2022-11-25 14:55:50 +01:00
043673ca70 Cleanup: Alembic, deduplicate CacheObjectPath creation
No functional changes.
2022-11-25 14:37:48 +01:00
3cf803cf3c Cleanup: Alembic, use MEM_cnew
Avoids extra cast. No functional change.
2022-11-25 14:27:18 +01:00
6422f75088 Merge branch 'blender-v3.4-release' 2022-11-25 12:44:05 +01:00
6bc3311410 Fix: Missing node warning when compositor is enabled
If the compositor is enabled or disabled, the node warnings for
unsupported nodes are not updated because of a missing redraw. This patch
adds that missing redraw in order to make the change immediate.
2022-11-25 13:19:13 +02:00
60ad5f49fa Cleanup: move C++ declarations to separate .hh header 2022-11-25 12:18:10 +01:00
826535979a Nodes: add non-const utility to find socket by identifier
This does the same as the corresponding const method.
2022-11-25 12:11:13 +01:00
32690cafd1 Fix: Missing compositor warning for Render Layers
The Render Layers node didn't draw a warning in the viewport when an
unsupported pass is used. This patch adds that warning.
2022-11-25 12:59:50 +02:00
64c26d2862 Merge branch 'blender-v3.4-release' 2022-11-25 11:45:01 +01:00
b9c358392d BLI: Fix error in vector library and add more test for operators
The operator was wrongly returning a reference to a local temp variable.

Add tests for all uncovered operators.
2022-11-25 11:28:04 +01:00
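The class of bug named in the commit above, an operator returning a reference to a local temporary, can be illustrated with a small vector type. `Float2` is a hypothetical stand-in and not the actual BLI vector. Which operator was affected is not stated in the message.

```cpp
#include <cassert>

// Hypothetical two-component vector, not the actual BLI type.
struct Float2 {
  float x, y;

  // Compound assignment mutates *this, so returning a reference is safe.
  Float2 &operator+=(const Float2 &other)
  {
    x += other.x;
    y += other.y;
    return *this;
  }

  // The binary operator builds a new value, so it must return by value.
  // Declaring the return type as `Float2 &` here would hand back a
  // reference to `result`, which is destroyed when the function returns.
  friend Float2 operator+(const Float2 &a, const Float2 &b)
  {
    Float2 result = a;
    result += b;
    return result;
  }
};

// Small self-check of both operators.
inline void float2_self_check()
{
  Float2 a{1.0f, 2.0f};
  const Float2 b{3.0f, 4.0f};
  const Float2 sum = a + b;
  assert(sum.x == 4.0f && sum.y == 6.0f);
  a += b;
  assert(a.x == 4.0f && a.y == 6.0f);
}
```

A dangling reference of this kind often appears to work in debug builds, which is one reason tests covering every operator are useful.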
0ce18561bc Fix (unreported) uv unwrap selected was splitting selection
Add support for `pin_unselected` in new UV Packing API.

Regression introduced by API change in rBe3075f3cf7ce.
2022-11-25 15:52:04 +13:00
851906744e Merge branch 'blender-v3.4-release' 2022-11-24 15:14:12 -06:00
0710ec485e Cleanup: Remove unused IMB tile cache code (part 2)
Missed in the first commit[1].

Initially it was reported that the `flags` parameter was unused on
`imb_cache_filename` but it turns out another swath of code was unused
related to that same function. Clean this up now too.

[1] 38573d515e
2022-11-24 12:41:05 -08:00
49129180b2 Merge branch 'blender-v3.4-release' 2022-11-24 17:30:54 -03:00
58c8c4fde3 Animation: Improve performance of Bake Action operator
This dramatically improves baking performance by batch-adding keyframes
instead of adding them individually, reducing unnecessary overhead.

Testing indicates an approximate 4x performance uplift.

Reviewed By: sybren, RiggingDojo

Differential Revision: https://developer.blender.org/D8808
2022-11-24 11:26:17 -08:00
81754a0fc5 Cleanup: remove else after return. 2022-11-24 09:56:05 -08:00
412642865d Cleanup: Resolve a warning for the ambiguity on the parenthesis in oneAPI code
No functional changes.
2022-11-24 18:05:02 +01:00
14a0fb0cc6 Merge branch 'blender-v3.4-release' 2022-11-24 09:02:23 -08:00
de27925aea Fix (unreported) inconsistent name_map during file reading.
Swapping some ID lists between Mains must invalidate the name_map cache.

Note that in theory, at least WM type could be ignored by name_map
cache, since it is a singleton. However, don't think it's worth adding
extra complication here, for really marginal benefits. The overhead of
rebuilding the name cache here is extremely small.

For some reason, this issue did not show so far in master, only appeared
in some branch work on improving (in)direct status of linked IDs... Go
figure.
2022-11-24 17:08:18 +01:00
d6d5089c65 Metal: Fix a warning and compilation errors
These were oversights from developing without testing on MacOS.
2022-11-24 16:57:46 +01:00
20c1ce3d9b BLI: Make scalar vector constructor more generic
This makes it possible to use any type of scalar to init a vector and
reduce code duplication.
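
A rough sketch of the idea, assuming a hypothetical `Vec3` template (not the BLI vector type): a single constructor template accepts any arithmetic scalar and broadcasts it, instead of one overload per scalar type.

```cpp
#include <cassert>
#include <type_traits>

/* Hypothetical vector type, for illustration only. */
template<typename T> struct Vec3 {
  T x, y, z;

  Vec3(T x, T y, T z) : x(x), y(y), z(z) {}

  /* Accept any arithmetic scalar type and copy it to every component.
   * The enable_if keeps this from matching non-scalar arguments. */
  template<typename U,
           typename = typename std::enable_if<std::is_arithmetic<U>::value>::type>
  explicit Vec3(U value) : x(T(value)), y(T(value)), z(T(value))
  {
  }
};
```

With this, `Vec3<float>(2)` and `Vec3<int>(1.5)` both compile; the second truncates to 1 in each component.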
2022-11-24 16:16:42 +01:00
f47daa7ec9 Merge branch 'blender-v3.4-release' 2022-11-24 15:33:25 +01:00
Christoph Lendenfeld
bb665ef8d2 Merge branch 'blender-v3.4-release' 2022-11-24 15:20:30 +01:00
1b7b996e16 Merge branch 'blender-v3.4-release' 2022-11-24 13:36:48 +01:00
5f626ac331 Cleanup: use more concise function names in function nodes
This is the same naming convention that's used for geometry nodes.
2022-11-24 12:49:17 +01:00
369914e7b1 Liblink: Add test over direct vs indirect link status.
Some checks are currently commented out, since Blender will never 'clear'
the directly linked status of an ID once it has been used by local data.
2022-11-24 10:52:27 +01:00
bbf09eb59c Merge branch 'blender-v3.4-release' 2022-11-24 10:15:36 +01:00
2dcdfab94c Realtime Compositor: Warn about unsupported MacOS
This patch warns the user that MacOS is not supported for the viewport
compositor in the shading panel.

See T102353.
2022-11-24 09:25:44 +02:00
38573d515e Cleanup: Remove unused IMB tile cache code
This removes the unused code for the IMB tile cache APIs.  These have
been unused for as far back as I could manage to search.

Since TIFF was used for the cached images, this removal will allow for
an easier review when it comes time to move TIFF to OIIO as part of
T101413.

Differential Revision: https://developer.blender.org/D16587
2022-11-23 19:31:10 -08:00
f4e1f62c62 Merge branch 'blender-v3.4-release' 2022-11-23 19:35:39 +01:00
584089879c BLI: Follow up and fix recent span slicing change
a5e7657cee didn't account for slices of zero sizes, and the asserts
were slightly incorrect otherwise. Also, the change didn't apply to
`Span`, only `MutableSpan`, which was a mistake. This also adds "safe"
methods to `IndexMask`, and switches function calls where necessary.
2022-11-23 11:36:06 -06:00
38cf48f62b Fix: Missing caches in curves bounds evaluation 2022-11-23 11:36:06 -06:00
50aad904b3 Merge branch 'blender-v3.4-release' 2022-11-23 14:15:22 -03:00
f13160d188 Cleanup: quiet deprecation warnings
This fixes these warnings: P3340.
2022-11-23 17:15:33 +01:00
583f19d692 Merge branch 'blender-v3.4-release' 2022-11-23 17:03:17 +01:00
c1eeb38f7c Cleanup: Move poly normal calculation to mesh_normals.cc 2022-11-23 09:49:18 -06:00
c3d6f5ecf3 Merge branch 'blender-v3.4-release' 2022-11-23 16:39:09 +01:00
737d363e02 Cleanup: remove unused node type
This wasn't used for backwards compatibility, because Blender does not
read from the `nodetype` anywhere. It also wasn't used for forward
compatibility, because it was not initialized for new node groups.
2022-11-23 16:15:25 +01:00
460f7ec7aa Windows: Run blender-launcher.exe instead of blender.exe
With this change Blender, delivered via the Microsoft store, will launch without the console window flashing.

Ref T88613

Differential Revision: https://developer.blender.org/D16589
2022-11-23 15:14:13 +01:00
Jeroen Bakker
a819523dff Vulkan: Add VK memory allocator 3.0.1 to extern.
Vulkan doesn't have a memory allocator builtin. The application should
provide the memory allocator at runtime. Vulkan Memory Allocator is a
widely used implementation.

Vulkan Memory Allocator is a header-only implementation, but the
application using it must compile the implementation part in a C++
compile unit. The file `vk_mem_alloc_impl.cc` and the
`extern_vulkan_memory_allocator` library are therefore introduced.

Reviewed By: fclem

Differential Revision: https://developer.blender.org/D16572
2022-11-23 14:42:27 +01:00
68a450cbe4 Cleanup: Remove unused parameter in node draw 2022-11-23 15:06:40 +02:00
aa0c2c0f47 Cleanup: move some data from bNodeTree to run-time data
No functional changes are expected.
2022-11-23 14:05:30 +01:00
4f02817367 Nodes: remove bNodeTree->interface_type
This is not used for anything in practice currently. The original intention
was probably to generate different socket subtypes, but that is solved
differently now (e.g. using `NodeSocketFloatDistance`). It's possible
that an addon tried to use this but it's rather unlikely.

Differential Revision: https://developer.blender.org/D13188
2022-11-23 13:49:07 +01:00
247d75d2b1 Realtime Compositor: Warn about unsupported setups
This patch warns the user that the compositor setup is not fully
supported when an unsupported node is used. The warning is displayed as
an engine warning overlay and in the node header itself.

See T102353.

Differential Revision: https://developer.blender.org/D16508

Reviewed By: Clement Foucault
2022-11-23 14:35:26 +02:00
6396d29779 Merge branch 'blender-v3.4-release' 2022-11-23 13:25:12 +01:00
1c00b2ef70 Cleanup: move paint_cursor.c and paint_image_proj.c to C++
This makes it easier to use c++ when improving the internal node api.
2022-11-23 12:56:34 +01:00
106277be43 Merge branch 'blender-v3.4-release' 2022-11-23 12:54:15 +01:00
7d44676b5f Realtime Compositor: Disable on MacOS
This patch disables the realtime compositor on MacOS until Metal is
supported. This is because MacOS doesn't support the necessary GPU
features to make it work.

An engine error overlay is displayed if it is enabled and the option
itself is greyed out.

See T102353.

Differential Revision: https://developer.blender.org/D16510

Reviewed By: Clement Foucault
2022-11-23 13:34:31 +02:00
11275b7363 Realtime Compositor: Extend option to enable compositor
This patch turns the checkbox option to enable the viewport compositor
into a 3-option enum that allows:

- Disabled.
- Enabled.
- Enabled only in camera view.

See T102353.

Differential Revision: https://developer.blender.org/D16509

Reviewed By: Clement Foucault
2022-11-23 13:27:47 +02:00
80249ce6e4 Asset Browser: Allow changing active catalog from Python
The active catalog ID (UUID) was a read only property. From a studio I
got the request to make this editable, so their pipeline tooling can
make certain assets visible.

Differential Revision: https://developer.blender.org/D16356

Reviewed by: Sybren Stüvel
2022-11-23 12:05:16 +01:00
e0c5ff87b7 Realtime Compositor: Implement Track Position node
This patch implements the Track Position node for the realtime
compositor.

Differential Revision: https://developer.blender.org/D16387

Reviewed By: Clement Foucault
2022-11-23 12:56:24 +02:00
571f373155 UI: Don't render missing linked material previews, avoids UI freezing
Opening the material selector after reloading files could cause long UI
freezes, because some linked in materials don't have the preview stored
in the source file. So Blender would keep rerendering it after every
file load, which may involve compiling OpenGL shaders, which again
freezes the UI typically. This was reported as quite an issue for the
Heist Production by the Blender Studio.

Don't render these missing material previews from linked data-blocks
anymore.

Differential Revision: https://developer.blender.org/D16538

Reviewed by: Brecht Van Lommel, Jeroen Bakker
2022-11-23 11:39:53 +01:00
c464fd724b Fix T102697: Gpencil Subdiv modifier level increased
The old hard limit was 5, but now it can be set to a maximum
value of 16. The UI limit remains 5.

This extreme value is only used in some corner case, but it
was a request by some artists.

Warning: Using very high values could produce a long calculation time, especially in strokes with a high density of points.
2022-11-23 11:23:38 +01:00
356373ff7a Cleanup: move some data from bNodeSocket to run-time data
No functional changes are expected.
2022-11-23 10:42:17 +01:00
5938e97a24 Merge branch 'blender-v3.4-release' 2022-11-23 10:36:11 +01:00
01e479b790 Cleanup: simplify removing asset code
Differential Revision: https://developer.blender.org/D16570
2022-11-23 10:04:24 +01:00
63ae0939ed Merge branch 'blender-v3.4-release' 2022-11-23 10:22:42 +09:00
cdcbf05ea8 BLI: Make Report Missing Files display message when no files are missing
Before this, if there were no missing files, the operator would run
successfully but there would be no user feedback at all, making the
user wonder if the operator was even run.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16585
2022-11-22 16:48:51 -08:00
2cb6b0b4eb Merge branch 'blender-v3.4-release' 2022-11-22 15:52:33 -06:00
e3567aad0a Merge branch 'blender-v3.4-release' 2022-11-22 12:49:48 -08:00
0f3b3ee679 Merge branch 'blender-v3.4-release' 2022-11-22 12:48:29 -08:00
40ac3776db Merge branch 'blender-v3.4-release' 2022-11-22 12:41:54 -08:00
a777c09d5f Sculpt: Standardize face set undo steps, optimize memory usage
Currently the face set of every single face is saved for every sculpt undo step.
When only changing the face sets of a small section of the mesh, this can be quite
wasteful. It also makes face sets a special case compared to all other sculpt undo step
types, which makes the whole system more complex and harder to improve.

Fixes T101203.

Reviewed By: Hans Goudey
Differential Revision: https://developer.blender.org/D16224
Ref D16224
2022-11-22 12:13:33 -08:00
41f58fadae Cleanup: Decrease variable scope, change names in BMesh layer handling 2022-11-22 14:03:39 -06:00
aa1a51ff9f Merge branch 'blender-v3.4-release' 2022-11-22 13:52:30 -06:00
55db44ca2c Merge branch 'blender-v3.4-release' 2022-11-22 11:17:30 -08:00
c3e919b317 Sculpt: Fix broken multires hidden undo
Wrote a new API method, BKE_pbvh_sync_visibility_from_verts
that flushes vertex hidden flags to edges & faces.

Fixes not being able to sculpt outside a face set after
undoing the fkey hide-all-but-this operator.
2022-11-22 11:11:03 -08:00
14d0b57be7 Cleanup: Use array_utils to copy evaluated field array 2022-11-22 12:49:51 -06:00
8e535ee9b4 Cleanup: Remove unused math min max utility
This has a friendlier multithreaded implementation in BLI_bounds.hh now.
2022-11-22 12:31:31 -06:00
772696a1a5 Geometry Nodes: Parallelize bounds computation in points to volume
On my computer this saves a few milliseconds when there are
over 1 million points.
2022-11-22 12:13:52 -06:00
8990983b07 Cleanup: Use const arguments for custom normals 2022-11-22 11:48:16 -06:00
a5e7657cee BLI: Remove clamping from span slicing
Currently slicing a span clamped the final size so that it would be
within bounds of the input. However, in the vast majority of cases
that is already the case anyway, and we can use asserts to detect
when that assumption fails.

The clamping had a performance cost. On a test interpolating a boolean
attribute from 1 million curves to 4 million points, removing the
clamping saved about 10% of the time. That's an extreme case but
this probably slightly improves performance in other cases too.
Slicing is used a lot in the new curve code.

This commit introduces `slice_safe` which still does the clamping,
and uses it in the few places that needed it or where I wasn't
sure.
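
The difference between the two variants can be sketched roughly as follows. `SimpleSpan` is a hypothetical stand-in, not the BLI `Span`; the exact asserts in the real code may differ (see the follow-up fix in 584089879c).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

/* Hypothetical minimal span, for illustration only. */
template<typename T> struct SimpleSpan {
  const T *data;
  int64_t size;

  SimpleSpan(const T *data, int64_t size) : data(data), size(size) {}

  /* Unchecked slice: the caller guarantees the range is in bounds.
   * Violations are only caught by asserts in debug builds. */
  SimpleSpan slice(int64_t start, int64_t slice_size) const
  {
    assert(start >= 0 && slice_size >= 0);
    assert(start + slice_size <= size || slice_size == 0);
    return {data + start, slice_size};
  }

  /* Clamped slice: the result always lies inside the original span,
   * at the cost of a few extra comparisons per call. */
  SimpleSpan slice_safe(int64_t start, int64_t slice_size) const
  {
    assert(start >= 0 && slice_size >= 0);
    const int64_t clamped_start = std::min(start, size);
    const int64_t new_size = std::min(slice_size, size - clamped_start);
    return {data + clamped_start, new_size};
  }
};
```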
2022-11-22 11:29:24 -06:00
822dab9a42 Cleanup: Strict compiler warnings
Mark function as local to the translation unit.
2022-11-22 17:52:44 +01:00
67194fb247 Merge branch 'blender-v3.4-release' 2022-11-22 16:35:26 +01:00
ed82bbfc2c Cleanup: use designated initializers in C 2022-11-22 10:48:16 -03:00
dd5fdb9370 GPU: Add MoltenVK as dependencies to Vulkan SDK.
MoltenVK is part of the vulkan SDK. Blender requires the vulkan SDK
to compile. This patch adds the MoltenVK includes and libraries to
the Vulkan includes and libraries.
2022-11-22 14:07:34 +01:00
51f56e71cb Merge branch 'blender-v3.4-release' 2022-11-22 13:41:07 +01:00
02c6136958 Fix compile error with msvc
error C2059: syntax error: '}'
2022-11-22 09:13:16 -03:00
b79e5ae4f2 GHOST: Add missing C_API function to header file.
- GHOST_GetDrawingContext was missing.
2022-11-22 12:45:49 +01:00
7dea18b3aa Tracking: Store lens principal point in normalized space
This avoids need to do special trickery detecting whether the principal
point is to be changed when reloading movie clip. This also allows to
transfer the optical center from high-res footage to possibly its lower
resolution proxy without manual adjustment.

On a user level the difference is that the principal point is exposed in
the normalized coordinates: frame center has coordinate of (0, 0), left
bottom corner of a frame has coordinate of (-1, -1) and the right top
corner has coordinate of (1, 1).
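
A sketch of the mapping described above, assuming pixel (0, 0) is the left bottom corner and (width, height) is the right top corner. The function and type names are made up for illustration and are not the ones used in the tracking code.

```cpp
#include <cassert>

struct Vec2f {
  float x, y;
};

/* Pixel space -> normalized space: the frame center maps to (0, 0),
 * the left bottom corner to (-1, -1), the right top corner to (1, 1). */
inline Vec2f principal_point_pixel_to_normalized(Vec2f pixel, int width, int height)
{
  assert(width > 0 && height > 0);
  const float center_x = width / 2.0f;
  const float center_y = height / 2.0f;
  return {(pixel.x - center_x) / center_x, (pixel.y - center_y) / center_y};
}

/* Normalized space -> pixel space for a frame of the given size. */
inline Vec2f principal_point_normalized_to_pixel(Vec2f normalized, int width, int height)
{
  assert(width > 0 && height > 0);
  const float center_x = width / 2.0f;
  const float center_y = height / 2.0f;
  return {center_x + normalized.x * center_x, center_y + normalized.y * center_y};
}
```

Because the normalized value does not depend on resolution, the same value can be converted back to pixels with the size of a proxy, which is what makes the transfer from high-res footage possible without manual adjustment.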

Another user-visible change is that there is no more operator for setting
the principal point to center: using backspace on the center sliders
resets the values to 0, which corresponds to the center.

The code implements versioning in both directions, so it should be
possible to open the file in older Blender versions without losing
configuration.

For the Python API there are two ways to access the property:
- `tracking.camera.principal_point` which is measured in the normalized
  space.
- `tracking.camera.principal_point_pixels` to access the pixel-space
  principal point.

Neither property is animatable, so there will be no conflict.

Differential Revision: https://developer.blender.org/D16573
2022-11-22 11:49:56 +01:00
04dc58df83 Tracking: Mark more deprecated RNA access as such
Also clarify a bit the new way of accessing the data.
2022-11-22 11:49:56 +01:00
294ff0de43 Cleanup: More clear function name
Make it explicit in the name that track index is the one used for
selection (previously it read as if the track index is within its
list).
2022-11-22 11:49:56 +01:00
f383dfabf7 Cleanup: Simplify public tracking API
Remove functions which are a trivial accessor.
2022-11-22 11:49:56 +01:00
d37efe332c Tracking: Mark deprecated active tracks access as such 2022-11-22 11:49:56 +01:00
ceaf4779da Fix active track always assigned for the active object
A missing part from the storage split refactor.
2022-11-22 11:49:56 +01:00
88d9ed3c1c Fix missing update on active motion track change from Python 2022-11-22 11:49:56 +01:00
0b251493c8 Tracking: Raise python exception when assigning wrong active track
Before this, an attempt to assign a track from another object was
silently setting the active track to null. Silencing errors like
that is not really a good approach.
2022-11-22 11:49:56 +01:00
ea969ccc02 Refactor: Replace marker visibility macro with function
Also optimize sub-optimal request for active object for every
call of the check.

Should be no functional changes.
2022-11-22 11:49:56 +01:00
b864397201 Refactor: Clip editor tracking selection operator
De-duplicate selection logic and threshold between various
operators (selection and sliding).

The user measurable difference is that regular selection
threshold now matches sliding threshold: it is more strict
now. The possible downside of this is that it might be more
tricky to select tracks, but this is what needs to happen
for tools support. Also, this matches object selection in
viewport.
2022-11-22 11:49:56 +01:00
4d1a116cdf Refactor: Simplify tracking active element accessor API
Use active object accessor, and then access data from the
object. There is no need to have an API call for shortcut
of all object fields.

Should be no functional change.
2022-11-22 11:49:56 +01:00
1300da6d39 Cleanup: More clear function name in tracking module
Make it more obvious in the name that an operation is not
cheap, and that the function operates on tracks from an
object and does not need a global tracking structure.
2022-11-22 11:49:56 +01:00
016f9c2cf5 Cleanup: Variable naming in tracking files
Make it obvious that the object is the motion tracking one, and
not the ID_OB type.
2022-11-22 11:49:56 +01:00
953f719e58 Cleanup: Variable scope in tracking files 2022-11-22 11:49:56 +01:00
1168665653 Refactor: Remove trivial accessor functions from tracking 2022-11-22 11:49:56 +01:00
fe38715600 Refactor: Unify storage for motion tracking camera and objects
Historically tracks and reconstruction for motion tracking camera
object were stored in the motion tracking structure. This is because
the data structures pre-dates object tracking support, and it was
never changed to preserve compatibility.

Now the compatibility code supports more tricks and allows to change
the ownership without breaking any compatibility. This is what this
change does: it moves tracks from motion tracking structure to the
motion tracking camera object, and does it in a way that no
compatibility is broken.

One of the side-effects of this change is that the active track is
now stored on motion tracking object level, which allows to change
active motion tracking object without losing active track. Other
than that there are no expected user-level changes.
2022-11-22 11:49:56 +01:00
4d497721ec Refactor: Streamline tracking data copying a bit
Prepare the code to more easily support pointers remapping
for tracking objects.

Should be no functional changes.
2022-11-22 11:49:56 +01:00
3c479b9823 Movie clip: Remove special selection synchronization function
The function is already doing a lot of memory indirections and
sub-optimal lookups, so for the simplicity and robustness of the
system might as well just do copy-on-write update.
2022-11-22 11:49:56 +01:00
7411fa4e0d Cleanup: Better const-correctness in tracking code 2022-11-22 11:49:56 +01:00
44d7ec7e80 Clip Editor: Migrate orientation file to C++
Should be no functional changes.

Some of the code is suboptimal from C++ syntax point of view.
It will be worked on as a further development.
2022-11-22 11:49:56 +01:00
Jeroen Bakker
6dac345a64 GHOST: Vulkan Backend.
This adds a vulkan backend to GHOST. The code was extracted from the
tmp-vulkan branch. The main difference with the original code is that
GHOST isn't responsible for fallback. For Metal backend there is already
an idea that the GPU module is responsible for the fallback, not the system.

For Blender we target Vulkan 1.2 at the time of this patch.
MoltenVK (needed to convert Vulkan calls to Metal) has been added as
a separate package.

This patch isn't useful for end-users, currently when starting blender with
`--gpu-backend vulkan` it would crash as the `VKBackend` doesn't initialize
the expected global structs in the GPU module.

Validated to be working on Windows and Apple. Linux still needs to be tested.

Reviewed By: fclem

Differential Revision: https://developer.blender.org/D13155
2022-11-22 11:29:09 +01:00
efa87ad5eb Merge branch 'blender-v3.4-release' 2022-11-22 10:48:55 +01:00
89349067b6 Merge branch 'blender-v3.4-release' 2022-11-21 18:01:29 -06:00
25ddb576ff Cleanup: add ATTR_FALLTHROUGH 2022-11-21 10:43:12 -08:00
02e045ffbe Merge branch 'blender-v3.4-release' 2022-11-21 19:19:58 +01:00
be1745425c Nodes: Remove "level" building pass on update
The node level was an indication of how deep the node was in the tree.
It was only used for detecting link cycles. Now that the node topology
cache from 25e307d725 exists, this calculation can be removed
completely.

The level calculation was quadratic and very slow on larger node trees.
In the mouse house file with a few thousand nodes, it took 23ms on
every single update. Another benefit is storing slightly less runtime
data, though this was only 2 bytes per node.
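
For context, link cycles can be detected in time linear in the number of nodes and links with a topological sort. The sketch below uses Kahn's algorithm on plain integer node indices; it is a generic illustration and not necessarily how the topology cache is implemented.

```cpp
#include <cassert>
#include <utility>
#include <vector>

/* Returns true when the directed graph contains a cycle.
 * Each link is a (from_node, to_node) pair of node indices. */
inline bool has_link_cycle(int node_count, const std::vector<std::pair<int, int>> &links)
{
  std::vector<std::vector<int>> outputs(node_count);
  std::vector<int> pending_inputs(node_count, 0);
  for (const std::pair<int, int> &link : links) {
    assert(link.first >= 0 && link.first < node_count);
    assert(link.second >= 0 && link.second < node_count);
    outputs[link.first].push_back(link.second);
    pending_inputs[link.second]++;
  }

  /* Start with the nodes that have no incoming links. */
  std::vector<int> ready;
  for (int i = 0; i < node_count; i++) {
    if (pending_inputs[i] == 0) {
      ready.push_back(i);
    }
  }

  /* Repeatedly "remove" a ready node; each link is visited once. */
  int sorted_count = 0;
  while (!ready.empty()) {
    const int node = ready.back();
    ready.pop_back();
    sorted_count++;
    for (const int target : outputs[node]) {
      if (--pending_inputs[target] == 0) {
        ready.push_back(target);
      }
    }
  }

  /* Nodes that never became ready are part of (or behind) a cycle. */
  return sorted_count != node_count;
}
```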

Differential Revision: https://developer.blender.org/D16566
2022-11-21 11:34:22 -06:00
b391037424 Nodes: Use topology cache for node exec node list
Instead of generating a dependency sorted node list whenever evaluating
texture or EEVEE/viewport shader nodes, use the existing sorted array
from the topology cache. This may be more efficient because the
algorithm isn't quadratic. It's also the second-to-last place to
use `node.runtime->level`, which can be removed soon.

Differential Revision: https://developer.blender.org/D16565
2022-11-21 11:30:49 -06:00
b0e38f4d04 Cleanup: Strict compiler warnings
Remove `private:` from the PBVHFaceIter. This is not really a C++
class, and the C++ code generates a lot of warnings about unused
fields.

Also mark function static and run clang-format.
2022-11-21 16:57:46 +01:00
Iliya Katueshenock
2f69266c64 Cleanup: use topology cache for frame timings overlay
Differential Revision: https://developer.blender.org/D16571
2022-11-21 16:10:05 +01:00
5d08396970 Merge branch 'blender-v3.4-release' 2022-11-21 15:53:42 +01:00
Jarrett Johnson
1c0f5e79fa DRW: Fix pointcloud selection
Fixes point cloud selection by using new draw call.

Reviewed By: fclem

Maniphest Tasks: T102659

Differential Revision: https://developer.blender.org/D16501
2022-11-21 15:50:32 +01:00
2910be8f19 Cleanup: Correct semantics for .blend listing in file/asset browser
When attempting to load contents of a .blend, the code would just assume
if the number of added items is 0, that means it's not a .blend (but a
directory, although the previous commit fixed that part already).
However there may be situations where a .blend file simply doesn't
contain anything of interest to be added (e.g. when listing assets
only), so have a proper "none" value for this.
2022-11-21 12:29:18 +01:00
c58e7da43e Asset Browser: Avoid non-existent directory prints
When loading asset libraries, there would be a bunch of "non-existent
directory" prints because we were calling a function to list directory
contents on .blend file paths. Make sure the path actually points to a
directory.
2022-11-21 12:29:18 +01:00
9f83ef2149 BLI: make different pointer types compatible in hash tables
For example, this allows doing a lookup using a raw pointer in a
hash table that uses `std::unique_ptr` as key.
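
The standard library calls the same idea heterogeneous lookup. As an analogy only (this is not the BLI implementation, and it uses an ordered `std::set` rather than a hash table because transparent lookup there is available since C++14), a comparator that understands both pointer types could look like this. `PointerLess` is a made-up name.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <set>

/* Comparator that can compare std::unique_ptr<T> and raw T* in any
 * combination. The `is_transparent` tag enables `find` with other key types. */
struct PointerLess {
  using is_transparent = void;

  template<typename T> static const void *raw(const std::unique_ptr<T> &ptr)
  {
    return ptr.get();
  }
  template<typename T> static const void *raw(T *ptr)
  {
    return ptr;
  }

  template<typename A, typename B> bool operator()(const A &a, const B &b) const
  {
    /* std::less gives a total order for pointers, unlike a plain `<`. */
    return std::less<const void *>()(raw(a), raw(b));
  }
};

/* Set that owns its elements but can be queried with a raw pointer. */
using OwnedIntSet = std::set<std::unique_ptr<int>, PointerLess>;
```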
2022-11-21 12:02:10 +01:00
85990c877c Merge branch 'blender-v3.4-release' 2022-11-21 11:45:12 +01:00
71efb7805b Merge branch 'blender-v3.4-release' 2022-11-21 11:23:18 +01:00
7a6cdeb242 Merge remote-tracking branch 'origin/blender-v3.4-release' 2022-11-20 18:31:53 -07:00
Wannes Malfait
e83f46ea76 Geometry Nodes: Use Mesh instead of BMesh in split edges node
Rewrite the edge split code to operate directly on Mesh instead
of BMesh. This allows for the use of multi-threading and makes
the node around 2 times faster. Around 15% of the time is spent
just on the creation of the topology maps, so these being cached
on the mesh could cause an even greater speedup. The new node
gave identical results compared to the BMesh version on all the
meshes I tested it on (up to permutation of the indices).

Here are some of the results on a few simple test cases:
(Intel i7-7700HQ (8 cores) @ 2.800GHz , with 50% of edges selected)
|       | 370x370 UV Sphere | 400x400 Grid | Suzanne 4 subdiv levels |
| ----- | ----------------- | ------------ | ----------------------- |
| Mesh  | 89ms              | 111ms        | 76ms                    |
| BMesh | 200ms             | 276ms        | 208ms                   |

Differential Revision: https://developer.blender.org/D16399
2022-11-20 15:42:10 -06:00
97e0cc41ca Nodes: Simplify view panning when selecting node
The "Activate Same Type Next/Prev" and "Find Node" operators pan
the view to the newly selected node if it's outside of the view. This
simplifies that check and improves it in the case where the node
is only partially visible-- now it pans in while it didn't before.
2022-11-20 14:45:58 -06:00
984edb2c4e Nodes: Replace implementation of select next/prev type operator
The previous code was quadratic; it looped over every link for every
node. For one large node tree I tested the operator took 20ms. On the
same node tree it now takes less than 1ms.

The change replaces the current building of the "dependency list"
on every call with a use of the topology cache from 25e307d725.
2022-11-20 14:45:58 -06:00
012895e8a1 Cleanup: Remove unused node operator property 2022-11-20 14:45:58 -06:00
47c92bf8de Cleanup: Remove unused boolean in node select function 2022-11-20 14:45:58 -06:00
afb7da5538 Sculpt: Add comment explaining PaintMaskFloodMode 2022-11-20 09:53:29 -08:00
b041678028 Merge branch 'blender-v3.4-release' 2022-11-20 09:44:53 -08:00
42c30f3b9c Sculpt: Fix missing const in recent commit 2022-11-20 08:17:02 -08:00
6b3cee2538 Sculpt: Face iterator API
This patch adds basic face iterators to the sculpt API.  The interface is similar to the existing vertex iterators.  It's not C++ (though it does mark private fields in PBVHFaceIter as private if compiling under C++).

Example:

```
PBVHFaceIter fd;

BKE_pbvh_face_iter_begin(pbvh, node, fd) {

  /* Face reference and face index. */
  PBVHFaceRef face = fd.face;
  int face_index = fd.index;

  /* Can read and modify the hide flag if it exists (it may not). */
  if (fd.hide) {
    *fd.hide ^= true; /* Toggle hide. */
  }

  /* Can read and modify the face set if it exists. */
  if (fd.face_set) {
    *fd.face_set = something;
  }

  /* Can read vertices. */
  for (int i = 0; i < fd.verts_num; i++) {
    const float *co = SCULPT_vertex_co_get(ss, fd.verts[i]);
  }
}
BKE_pbvh_face_iter_end(fd);
```

Reviewed By: Brecht Van Lommel and Hans Goudey
Differential Revision: https://developer.blender.org/D16225
Ref D16225
2022-11-20 08:08:26 -08:00
5097105b3c Sculpt: Fix T102567: multires crash with high subdivision levels
Previous fix did not work for ngons.
The pbvh leaf limit minimum is now set
to the maximum ngon's vertex count.
2022-11-20 07:32:21 -08:00
41d29a1603 Merge branch 'blender-v3.4-release' 2022-11-20 07:18:45 -08:00
d24c0011cf GPencil: Make Ignore Transparent option more consistent
The code was doing the opposite of the UI option.

Related to T102625
2022-11-20 11:22:20 +01:00
0fd94a1f5e Tests: enable element-wise multiplication for mathutils API tests 2022-11-20 10:42:17 +11:00
9100cc0f39 Merge branch 'blender-v3.4-release' 2022-11-20 10:41:19 +11:00
4d17301ba4 Merge branch 'blender-v3.4-release' 2022-11-19 20:47:16 +01:00
dfb157f9c4 Merge branch 'blender-v3.4-release' 2022-11-19 19:09:49 +01:00
2654c523c1 Cleanup: use nullptr in C++ 2022-11-19 11:51:42 +01:00
7ca9fb9865 Merge branch 'blender-v3.4-release' 2022-11-19 21:15:28 +11:00
Aaron Carlisle
bfb3d78902 Cleanup: Remove disabled edge slide keymap feature
This feature has been disabled since 2.80 but the feature description was still visible in the UI.

Addresses part of T101429

Breaking changes:

- Removes `EDGESLIDE_EDGE_NEXT`
- Removes `EDGESLIDE_PREV_NEXT`

Reviewed By: mano-wii

Maniphest Tasks: T101429

Differential Revision: https://developer.blender.org/D16430
2022-11-18 21:13:33 -05:00
511ac66dab Mesh: Use shared cache for derived triangulation
Use the shared cache system introduced in e8f4010611 for the
"looptris" triangulation cache. This avoids recalculation when meshes
are copied but the positions or topology don't change. The most obvious
improvement is for cases like a large mesh being adjusted slightly
with a simple geometry nodes modifier. In a basic test with a transform
node with a 1 million point grid I observed an improvement of 13%, from
9.75 to 11 FPS, which shows that we avoid spending 6ms recalculating
the triangulation of every update.

This also makes the thread safety for the triangulation data use a
more standard double-checked lock pattern, which is nice because we
can avoid holding a lock whenever the cached data is retrieved.
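
The double-checked lock pattern mentioned here can be sketched as below. `LazyCache` is a hypothetical helper written for this note, not the shared cache type from e8f4010611.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

/* Hypothetical lazily computed cache, for illustration only. */
template<typename T> class LazyCache {
  std::mutex mutex_;
  std::atomic<bool> is_valid_{false};
  T data_{};

 public:
  /* Returns the cached data, computing it with `compute` on first use. */
  template<typename Fn> const T &ensure(Fn &&compute)
  {
    /* First check, without the lock: the common case once data is cached. */
    if (is_valid_.load(std::memory_order_acquire)) {
      return data_;
    }
    std::lock_guard<std::mutex> lock(mutex_);
    /* Second check, with the lock held: another thread may have filled
     * the cache while this one was waiting for the mutex. */
    if (!is_valid_.load(std::memory_order_relaxed)) {
      data_ = compute();
      /* Release pairs with the acquire above, publishing `data_`. */
      is_valid_.store(true, std::memory_order_release);
    }
    return data_;
  }
};
```

Readers that arrive after the cache is filled only pay for one atomic load and never touch the mutex.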

Split from https://developer.blender.org/D16530
2022-11-18 17:29:24 -06:00
12d7994a48 Cleanup: Improve comment about copying mesh shared caches 2022-11-18 17:02:30 -06:00
c83e33b661 Cleanup: Sort includes in mesh header 2022-11-18 17:01:38 -06:00
05f93b58d3 Fix: Crash when writing mesh after previous commit
Runtime data was accessed after it was explicitly set to null.
2022-11-18 16:24:15 -06:00
1ea169d90e Mesh: Move loose edge flag to a separate cache
As part of T95966, this patch moves loose edge information out of the
flag on each edge and into a new lazily calculated cache in mesh
runtime data. The number of loose edges is also cached, so further
processing can be skipped completely when there are no loose edges.
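
As a rough illustration of what such a cache holds: an edge is loose when no face uses it, so both the per-edge tag and the count can be derived from the edge indices referenced by face corners. The names below are hypothetical and the real implementation may differ.

```cpp
#include <cassert>
#include <vector>

/* Hypothetical cache contents: one tag per edge plus the total count. */
struct LooseEdgeInfo {
  std::vector<bool> is_loose;
  int count = 0;
};

/* `corner_edges` holds the edge index used by every face corner. */
inline LooseEdgeInfo calc_loose_edges(int edges_num, const std::vector<int> &corner_edges)
{
  LooseEdgeInfo info;
  /* Start by assuming every edge is loose. */
  info.is_loose.assign(edges_num, true);
  info.count = edges_num;
  for (const int edge : corner_edges) {
    assert(edge >= 0 && edge < edges_num);
    /* The first face corner that references an edge clears its tag. */
    if (info.is_loose[edge]) {
      info.is_loose[edge] = false;
      info.count--;
    }
  }
  return info;
}
```

When `count` is zero, callers can skip any loose-edge processing without looking at the per-edge tags.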

Previously the `ME_LOOSEEDGE` flag was updated on a "best effort"
basis. In order to be sure that it was correct, you had to be sure
to call `BKE_mesh_calc_edges_loose` first. Now the loose edge tag
is always correct. It also doesn't have to be calculated eagerly
in various places like the screw modifier where the complexity
wasn't worth the theoretical performance benefit.

The patch also adds a function to eagerly set the number of loose
edges to zero to avoid building the cache. This is used by various
primitive nodes, with the goal of improving drawing performance.
This results in a few ms shaved off extracting draw data for some
large meshes in my tests.

In the Python API, `MeshEdge.is_loose` is no longer editable.
No built-in addons set the value anyway. The upside is that
addons can be sure the data is correct based on the mesh.

**Tests**
There is one test failure in the Python OBJ exporter: `export_obj_cube`
that happens because of existing incorrect versioning. Opening the
file in master, all the edges were set to "loose", which is fixed
by this patch.

Differential Revision: https://developer.blender.org/D16504
2022-11-18 16:05:06 -06:00
c0f33814c1 Fix T102611: Unable to change file output node sockets from python
Similar to 84c66fe9db
2022-11-18 14:01:40 -06:00
ab819517fc Fix: Crash when deleting node
Caused by b4c3ea2644 not removing the
dangling pointers to the freed internal links from the vector.
2022-11-18 13:58:36 -06:00
21adf2ec89 Cleanup: Split UV sample geometry node into two functions
This separates the UV reverse sampling and the barycentric mixing of
the mesh attribute into separate multi-functions. This separates
concerns and allows for future de-duplication of the UV sampling
function if that is implemented as an optimization pass. That would
be helpful since it's the much more expensive operation.

This was simplified by returning the triangle index in the reverse
UV sampler rather than a pointer to the triangle, which required
passing a span of triangles separately in a few places.
2022-11-18 13:38:55 -06:00
8fa69dafdd Cleanup: Remove unnecessary using keyword and namespace 2022-11-18 13:38:55 -06:00
b3b00be34e UI: Simplify description for geometry node socket
Simplify wording of "Output true" to a noun.
2022-11-18 13:38:55 -06:00
1c0cd50472 Curves: Add descriptions for normal mode RNA enum
This is only exposed in the "Set Normal Node" now but will be used in
more places in the future.
2022-11-18 13:38:55 -06:00
f5128f219f Cleanup: Use simpler check for Bezier curves 2022-11-18 13:38:55 -06:00
c6e4953719 Fix use-after-free of asset catalog data in node add menu
(Probably requires ASan for a reliable crash.)

Steps to reproduce were:
* Enter Geometry Nodes Workspace
* Press "New" button in the geometry nodes editor header
* Right-click the data-block selector -> "Mark as Asset"
* Change 3D View to Asset Browser
* Create a catalog
* Drag new Geometry Nodes asset into the catalog
* Save the file
* Press Shift+A in the geometry nodes editor

There was a general issue here with keeping catalog pointers around
during the add menu building. The way it does things, catalogs may be
reloaded in between.
Since the Current File asset library isn't loaded in a separate thread,
the use-after-free would always happen in between. For other libraries
it could still happen, but apparently didn't by chance.
2022-11-18 17:52:59 +01:00
b211266226 Merge branch 'blender-v3.4-release' 2022-11-18 16:05:10 +01:00
6c0a5461f7 Fix build error when not using unity build 2022-11-18 16:04:56 +01:00
0151d846e8 Fix MSVC warnings from recent asset system changes
* Mismatching class vs struct forward declaration (one forward
  declaration wasn't needed anymore)
* Unused member warning (`on_load_callback_store_`)
2022-11-18 15:20:16 +01:00
4e38771d5c Merge branch 'blender-v3.4-release' 2022-11-18 13:56:43 +01:00
b4c3ea2644 Cleanup: move internal links of nodes to runtime data
No functional changes are expected.
2022-11-18 13:46:35 +01:00
40b63bbf5b Merge branch 'blender-v3.4-release' 2022-11-18 12:50:21 +01:00
7b82d8f029 Nodes: move most runtime data out of bNode
* This patch just moves runtime data to the runtime struct to cleanup
  the dna struct. Arguably, some of this data should not even be there
  because it's very use case specific. This can be cleaned up separately.
* `miniwidth` was removed completely, because it was not used anywhere.
  The corresponding rna property `width_hidden` is kept to avoid
  script breakage, but does not do anything (e.g. node wrangler sets it).
* Since rna is in C, some helper functions were added to access the
  C++ runtime data from rna.
* The size of `bNode` decreases from 432 to 368 bytes.
2022-11-18 12:47:02 +01:00
754f674977 Cleanup: Missing trailing underscore in private asset system member vars
See style guide:
https://wiki.blender.org/wiki/Style_Guide/C_Cpp#Class_data_member_names
2022-11-18 12:45:56 +01:00
d5c8d3e661 Cleanup: Avoid unnecessary/annoying type alias in asset system
A `using FooPtr = std::unique_ptr<Foo>` isn't that useful usually, just
saves a few keystrokes. It obfuscates the underlying type, which
is usually relevant information. Plus, `Ptr` for a unique pointer is
misleading (should be `UPtr` or similar).
2022-11-18 12:45:56 +01:00
61d0f77810 Cleanup: Better follow class layout style guide in asset headers
Move "using" declarations and member variables to the top of the class.
See https://wiki.blender.org/wiki/Style_Guide/C_Cpp#Class_Layout.

Changes access specifiers of some variables from public/protected to
private, there was no point in not having them private.
2022-11-18 12:45:56 +01:00
e31f282917 Cleanup: Minor cleanups in asset system headers
- Move main comment on class to header comment where it's more visible.
- Improve comment.
- Move stdlib includes first, like we do it usually
- Separate includes by code module
- Remove unnecessary forward declarations
2022-11-18 12:45:56 +01:00
7c0cecfd00 Asset system: Move catalog tree code to own files
The catalog code is already quite complex, I rather keep the tree stuff
separate in a more focused unit.
2022-11-18 12:45:56 +01:00
Iliya Katueshenock
6239e089cf Nodes: cache children of frame nodes
This allows for optimizations because one does not have to iterate
over all nodes anymore to find all nodes within a frame.

Differential Revision: https://developer.blender.org/D16106
2022-11-18 11:20:13 +01:00
dec459e424 Cleanup: move some files that use nodes to C++ 2022-11-18 11:08:52 +01:00
bc886bc8d7 Cleanup: Use int64_t for size methods.
- BKE_pbvh_pixels.hh
2022-11-18 10:45:06 +01:00
Pablo Vazquez
2c096f17a6 UI: Refactor Node Context Menu
The Node Context Menu contains options that are not always available for
the selected nodes, and misses important entries for accessibility.

This patch covers the following:
* Add operators to join and remove nodes from frames.
* Sort and group entries more logically and follow Blender conventions.
* Add `Insert into Group`
* Show group actions only on nodes that support it.
* Move all toggles to a sub-menu called `Show/Hide`.
* When nothing is selected, show Add menu, links actions, and paste.

Inspired by RightClickSelect proposals and community feedback.

See D16216 for images.

Reviewed By: HooglyBoogly

Differential Revision: https://developer.blender.org/D16216
2022-11-17 23:27:12 +01:00
8d77973dd7 Fix T99125: Curve mapping widget removes all vector points
Add a new flag value `CUMA_REMOVE` to explicitly tag duplicate points
for removal. This prevents a bug where all curve points with vector
handles were deleted, when removing duplicate curve points while
updating the widget. This happened, because the flag value used to tag
points for removal was the same as the value of `CUMA_HANDLE_VECTOR`
used to store the handle type of the curve point.
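
The collision described above can be reproduced in a minimal sketch. The
names `Point`, `HANDLE_VECTOR`, `REMOVE` and `remove_duplicates` and the bit
values are illustrative assumptions, not the actual curve mapping code:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

/* Illustrative flag values (assumed); the real ones live in Blender's headers. */
enum : uint8_t {
  HANDLE_VECTOR = 1 << 1, /* Handle type stored on the point. */
  REMOVE = 1 << 3,        /* Dedicated tag that shares no bit with HANDLE_VECTOR. */
};

struct Point {
  float x;
  uint8_t flag;
};

/* Tag every point whose x equals its predecessor's x with `remove_tag`,
 * then erase all points that carry that bit. */
inline void remove_duplicates(std::vector<Point> &points, const uint8_t remove_tag)
{
  assert(remove_tag != 0);
  for (std::size_t i = 1; i < points.size(); i++) {
    if (points[i].x == points[i - 1].x) {
      points[i].flag |= remove_tag;
    }
  }
  points.erase(std::remove_if(points.begin(),
                              points.end(),
                              [&](const Point &p) { return (p.flag & remove_tag) != 0; }),
               points.end());
}
```

Passing `HANDLE_VECTOR` as the tag also erases every point that merely has a
vector handle, while a dedicated `REMOVE` bit only erases the duplicates.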

Reviewed By: Hans Goudey

Differential Revision: http://developer.blender.org/D16463
2022-11-17 22:00:17 +01:00
609a681fb5 Merge branch 'blender-v3.4-release' 2022-11-17 17:51:23 +01:00
ca253df623 Cleanup: move transform_snap to C++ 2022-11-17 17:13:49 +01:00
f74234895a Merge branch 'blender-v3.4-release' 2022-11-17 17:05:03 +01:00
780b29109c Merge branch 'blender-v3.4-release' 2022-11-17 16:05:01 +01:00
6bf13d0734 Fix crash when loading different file with asset browser open
Steps to reproduce were:
- Open an asset browser
- Open an asset library with assets in it
- Load a different file (e.g. File -> New -> General)

Didn't see a nice way to fix this with the current pre file load handler
callback we use for freeing asset libraries. Using this is cleaner, but
for now, the relationship between UI and asset system is too close
still, so better do explicit freeing at the right point in time.
2022-11-17 15:50:08 +01:00
c7bd508766 Cleanup outliner instancing collection code.
Remove needless call to `id_lib_extern`, this is already part of
`id_us_plus` code.
2022-11-17 15:46:31 +01:00
53f401ea63 Merge remote-tracking branch 'origin/blender-v3.4-release' 2022-11-17 07:29:58 -07:00
576d99e59a Cleanup: move texture nodes to C++
No functional changes are expected. The goal here is to make
further refactorings to the nodes system easier.
2022-11-17 13:04:45 +01:00
dad8f4ac09 Merge branch 'blender-v3.4-release' 2022-11-17 12:21:10 +01:00
59f8061a34 Assets: Refactor asset representation storage
- Move code to manage storage to own class in own file, separates
  concerns and different levels of abstraction better.
- Store local ID assets separately in the storage class for more
  efficient lookups (e.g. for ID remapping).
- Make API function names and comments more complete.
2022-11-17 11:55:38 +01:00
67869432f2 Asset system: Remap local asset ID pointers as part of UI remapping
After checking with @mont29, this is much preferred over calling this in
BKE directly.
2022-11-17 11:55:38 +01:00
dd260d2f03 Fix T102554: Crash when Use Nodes is enabled
Blender crashes when enabling Use Nodes after the viewport compositor is
already enabled.

This happens because the active viewer key is not yet initialized for the
node tree at this point, which eventually leads to a nullptr.

This patch fixes that by returning the root context in case the active
viewer key is not yet initialized.
2022-11-17 11:12:41 +02:00
9a09adb7af Fix: Missing bounding box dirty tag when clearing mesh geometry
Similar to 801451c459.
2022-11-16 18:30:04 -06:00
bf0180d206 Sculpt: Remove some normal calculation with deformed sculpting
Remove unnecessary (and no-op) normal calculation when sculpting on top
of deformed coordinates. Examples are shape keys and deform modifiers.
On a 1 million face mesh, this saved 100ms per stroke update.
This function actually did nothing since cfa53e0fbe,
so that large improvement comes for free.

Conceptually this is correct because when sculpting on deformed
coordinates, we don't change the positions of the base mesh directly.
In the future it might be better to allocate a separate array for
normals when using deformed coordinates, but it's not clear that's
necessary yet.
2022-11-16 18:22:09 -06:00
c481549870 Cleanup: Remove unused node clipboard type handling 2022-11-16 18:01:34 -06:00
845a3573f5 Cleanup: Improve curves comments 2022-11-16 17:54:51 -06:00
Yann Doersam
b7a4f79748 Nodes: Allow pasting common nodes between editor types
Ignore difference between source and target tree type. When copying
nodes from clipboard to target tree compatibility is checked. After
pasting nodes only the links between nodes that are existing in the
node tree are added.

See Task T95033.

Differential Revision: https://developer.blender.org/D16349
2022-11-16 17:36:30 -06:00
0d3a33e45e Geometry Nodes: Add "Exists" output to Named Attribute input node
As described in T100004, add an output socket that returns true if the
attribute accessed by the node was already present in that context.

Initial patch by Edward (@edward88).

Differential Revision: https://developer.blender.org/D16316
2022-11-16 17:27:28 -06:00
145839aa42 Fix T102365: Wireframe skips edges after recent cleanup
10131a6f62 replaced use of the `ME_EDGERENDER` flag with
`ME_EDGEDRAW`. However, left over from previous refactors, code
for leaving edit mode set that flag based on the edge angle. Edge angle
wireframe hiding is currently supposed to be adjustable with the
wireframe overlay settings. This patch restores the previous behavior
from before the cleanup commit.

Differential Revision: https://developer.blender.org/D16451
2022-11-16 16:49:02 -06:00
cacfaaa9a5 Fix T92416: First render with unknown image colorspace looks different
The issue here was that the Barbershop benchmark scene was saved with a
custom OCIO config, which leads to some textures having an unknown
colorspace when loading with a default installation.

This is automatically fixed by Blender during image loading, but since
Cycles queried the colorspace before actually loading the image, it
didn't get the updated value in the first render.

To fix this, just re-query the colorspace after the image is loaded.

Note that non-packed images still get treated as raw data if the
colorspace is unknown, but this is at least consistent and doesn't
magically change when you press F12 a second time.

Differential Revision: https://developer.blender.org/D16427
2022-11-16 23:42:23 +01:00
1677ddb7ee Sculpt: Avoid retrieving vertices attribute when flushing positions
Currently the positions are retrieved again for every vertex. This is
slow, and will get slower when positions are stored as a named
attribute. Saves around 0.5ms per stroke update when a modifier
is active in my test with a 1 million face mesh.
2022-11-16 14:54:20 -06:00
87ace7207d Revert "Sculpt/Paint: Use cached triangulation when building PBVH"
This reverts commit 676137f043.

This change worked locally with a specific test file and local changes,
but didn't work in general, since we don't reliably retrieve the new
looptris after setting them the first time. This can be improved again
in the future, but probably along with a more general look at how ownership
is handled with PBVH.
2022-11-16 14:29:12 -06:00
676137f043 Sculpt/Paint: Use cached triangulation when building PBVH
This avoids recalculation of looptri derived triangulation whenever
switching to sculpt mode or whenever the PBVH is rebuilt, which can
happen after strokes in some situations. In my tests actually building
the PBVH is much more expensive (300ms), but this saves 6ms when
switching to sculpt mode and in other situations.

The cost is the possibility of higher memory usage because the cache
will live in the original main database mesh. However, the impact of
that will be smaller when the shared cache concept from D16204 is
applied to this data too.
2022-11-16 13:35:27 -06:00
25b3515324 Cleanup: Remove unnecessary mesh normals debugging function
This assertion function came from when derived normal data was stored
as custom data layers, which made it harder to keep track of whether
it was allocated and propagated. Nowadays it's all relatively easy to
predict, so there's no point in keeping this function around-- it only
makes code longer and more complex looking.
2022-11-16 13:07:49 -06:00
6cf4999e50 Cleanup: Slightly improve mesh normals and runtime comments
Also resolved an unused variable warning caused by an earlier cleanup.
2022-11-16 12:54:48 -06:00
89ca298210 Cleanup: Don't set mesh normals directly for metaballs
At the cost of a memory copy, this allows using a C++ type to store
normals in mesh runtime data in upcoming patches.
2022-11-16 12:37:35 -06:00
99c970a94d UI: Allow Joining of Tiny Screen Areas
Allow joining of areas that are below our minimum sizes.

See D16522 for more details.

Differential Revision: https://developer.blender.org/D16522

Reviewed by Campbell Barton
2022-11-16 09:53:06 -08:00
Rateeb Riyasat
cce4271b31 Add regression test for triangulate faces
Basic test for `quads_convert_to_tris`

As the operator name in blender is different from the bpy name, the operator
name in blender was opted in terms of the blend file collection name as well
as the test name. This was done so that new developers in the future can
more easily understand which operator this corresponds to. Although it might be
better to change this to the bpy name so as to be consistent with the rest
of the codebase.

Updated blend file `lib/tests/modeling/operators.blend` has been
committed as rBL63101.

Reviewed By: zazizizou, mont29

Differential Revision: https://developer.blender.org/D16072
2022-11-16 18:34:11 +01:00
801451c459 Fix: Missing clearing of mesh triangulation data
Missed in e412fe1798
2022-11-16 11:25:54 -06:00
a81abbbb8f Fix T100772: Joins with Interfering Tiny Areas
Detect unlikely situation of an area (that is smaller than our allowed
minimums) sharing an edge that will be moved during a join.

See D16519 for more details.

Differential Revision: https://developer.blender.org/D16519

Reviewed by Campbell Barton
2022-11-16 09:08:15 -08:00
6077fe7fae Merge branch 'blender-v3.4-release' 2022-11-16 18:00:48 +01:00
f7ca0ecfff Cleanup: Cognitive complexity in mask animation filtering
No functional changes expected.
2022-11-16 15:29:14 +01:00
4f2ce8d8d3 DrawManager: Remove experimental draw lock.
The draw locking was implemented for project Heist and moved behind an experimental
feature after it became clear there were issues with it. Nowadays it isn't used,
and the idea is to replace it with a different solution after all draw engines have
been ported to the new draw manager API. {T102180}

This patch will remove the experimental feature as it isn't used, or useful.
2022-11-16 15:18:39 +01:00
5a05fa8f74 Merge branch 'blender-v3.4-release' 2022-11-16 11:01:29 -03:00
e4871b2835 EEVEE/Viewport: Make info text when compiling shaders more clear
The N in `Compiling Shaders N` in Text Info is the number of shaders
left in the queue. It's a countdown, but this wasn't mentioned
and led to confusion.

Ideally this text would be like Cycles' "Samples 50/100", but in EEVEE it's
not easy to guess how many shaders are left (this number could even go
up mid-compilation).

In the past there used to be a progress bar but it's also confusing because
it could be 90/100 shaders done, but the remaining 10 are slow to compile.

Change the text to "Compiling Shaders (N remaining)" so it's easier to
understand what is going on. Similar to how some game engines do.
2022-11-16 14:28:21 +01:00
0ebb7ab41f Geometry Nodes: disable unreachable nodes in evaluator
Nodes that were not connected to any output could still impact performance.
While they were never executed, sometimes their inputs could keep references
to geometries that other nodes want to modify. That caused unnecessary geometry
copies, because a geometry can only be modified if it is not shared.

Now, inputs that will never be used are tagged accordingly and they will never
have references to geometries that others might want to modify.
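
Why a stale reference forces a copy can be shown with a minimal
copy-on-write sketch. `std::shared_ptr` is used here as a stand-in for the
evaluator's own user counting, and `Geometry` and `get_for_write` are
hypothetical names:

```cpp
#include <cassert>
#include <memory>
#include <vector>

/* Hypothetical stand-in for a geometry component. */
struct Geometry {
  std::vector<float> positions;
};

/* Return a geometry that is safe to modify. A copy is only made when
 * another owner still references the same data. */
inline Geometry &get_for_write(std::shared_ptr<Geometry> &geometry)
{
  assert(geometry != nullptr);
  if (geometry.use_count() > 1) {
    geometry = std::make_shared<Geometry>(*geometry);
  }
  return *geometry;
}
```

As long as an unused input holds a second reference, `get_for_write` has to
copy. Once that reference is dropped, the data is modified in place.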
2022-11-16 14:26:11 +01:00
edcce2c073 Cleanup: correct inverted variable name 2022-11-16 13:19:23 +01:00
1e88fc251f Cleanup: remove unused data member 2022-11-16 13:19:23 +01:00
71ce178b3e Merge branch 'blender-v3.4-release' 2022-11-16 12:55:44 +01:00
06e9d40c33 Merge branch 'blender-v3.4-release' 2022-11-16 12:32:51 +01:00
ec9acdeac2 Merge branch 'blender-v3.4-release' 2022-11-16 21:37:01 +11:00
91d3cc51c3 Merge branch 'blender-v3.4-release' 2022-11-16 21:36:55 +11:00
1aa851e939 Mesh: Don't tag normals and triangulation dirty when translating
This only applies to procedural operations rather than edit mode
operations, but it might save some recalculations of these caches
for the transform geometry node in some cases.
2022-11-15 23:58:34 -06:00
c8c14d1681 Cleanup: Remove unnecessary clearing of mesh runtime data
The calls in the remesh operator were unnecessary because the mesh is
about to be replaced anyway, and nothing invalidates the caches, and
the call in BMesh -> Mesh conversion was unnecessary because the caches
are cleared at the top of the function already.
2022-11-15 23:43:22 -06:00
90fb1cc4e6 Cleanup: Remove unnecessary dirty normal tags
These were redundant for one of a few reasons:
- A call to `BKE_mesh_tag_coords_changed` was correct instead
- A mesh has dirty normals when created from scratch anyway
- The call was redundant with `BKE_mesh_runtime_clear_geometry`
2022-11-15 20:28:39 -06:00
192cd76b7c Fix: Build error on MSVC with mismatched struct/class keywords 2022-11-15 20:26:33 -06:00
e412fe1798 Cleanup: Simplify freeing and clearing mesh runtime data
Separate freeing and clearing mesh runtime data in a more obvious way.
This makes it easier to see what data is meant to be cleared on certain
changes, rather than conflating it with freeing all of the runtime
caches.

Also comment and reduce the surface area of the "mesh runtime" API.
The redundancy in some functions made it confusing which one should
be used, resulting in subtle bugs or unnecessary boilerplate code.

Also, now bke::MeshRuntime is able to free all the data it owns by
itself, which makes this area easier to reason about. That required
changing the interface of a few functions to avoid passing Mesh when
they really just dealt with some runtime struct.

With more RAII semantics in the future, more of this manual freeing
will become unnecessary.
2022-11-15 20:26:33 -06:00
550c51b08b Merge branch 'blender-v3.4-release' 2022-11-16 12:33:59 +11:00
71067a58ec Merge branch 'blender-v3.4-release' 2022-11-16 12:33:55 +11:00
6940c4b602 Merge branch 'blender-v3.4-release' 2022-11-16 12:33:53 +11:00
0ac19425d4 Merge branch 'blender-v3.4-release' 2022-11-16 12:33:49 +11:00
e3ddfedbb6 Merge branch 'blender-v3.4-release' 2022-11-16 12:33:46 +11:00
5e203c4f4b Merge branch 'blender-v3.4-release' 2022-11-16 12:33:43 +11:00
25630ab2a1 Merge branch 'blender-v3.4-release' 2022-11-16 12:33:38 +11:00
87ba0dcaca Merge branch 'blender-v3.4-release' 2022-11-15 18:55:18 -06:00
60523ea523 Cleanup: format 2022-11-16 12:59:47 +13:00
9f2f9dbca6 Merge branch 'blender-v3.4-release' 2022-11-16 11:28:57 +13:00
da82d46a5a Cleanup: simplify asserts in uv unwrapper 2022-11-16 10:36:09 +13:00
65944e7e84 Merge branch 'blender-v3.4-release' 2022-11-15 22:13:08 +01:00
e8f4010611 Geometry: Cache bounds min and max, share between data-blocks
Bounding box calculation can be a large cost in some situations, especially
instancing. This patch caches the min and max of the bounding box in
runtime data of meshes, point clouds, and curves, implementing part of
T96968.

Bounds are now calculated lazily-- only after they are tagged dirty.
Cached bounds are also shared when copying geometry data-blocks
that have equivalent data. When bounds are calculated on an evaluated
data-block, they are also accessible on the original, and the next
evaluated ID will also share them. A geometry will stop sharing bounds
as soon as its positions (or radii) are changed.

Just caching the bounds gave a 2-3x speedup with thousands of mesh
geometry instances in the viewport. Sharing the bounds can eliminate
recalculations entirely in cases like copying meshes in geometry nodes
or the selection paint brush in curves sculpt mode, which causes a
reevaluation but doesn't change the positions.

**Implementation**
The sharing is achieved with a `shared_ptr` that points to a cache mutex
(from D16419) and the cached bounds data. When geometries are copied,
the bounds are shared by default, and only "un-shared" when the bounds
are tagged dirty.

Point clouds have a new runtime struct to store this data. Functions
for tagging the data dirty are added for point clouds
and improved for curves. A missing tag has also been fixed for mesh
sculpt mode.
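
The sharing scheme can be sketched roughly as follows. This is a simplified
illustration using one-dimensional positions. `BoundsCache`, `PointSet` and
`tag_positions_changed` are hypothetical names, and a plain `std::mutex`
stands in for the cache mutex from D16419:

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <mutex>
#include <vector>

struct Bounds {
  float min = 0.0f;
  float max = 0.0f;
};

/* Hypothetical shared cache: a mutex plus the lazily computed bounds. */
struct BoundsCache {
  std::mutex mutex;
  bool is_valid = false;
  Bounds bounds;
};

struct PointSet {
  std::vector<float> positions;
  /* Copying a PointSet copies this pointer, so copies share the cache. */
  std::shared_ptr<BoundsCache> cache = std::make_shared<BoundsCache>();

  /* Call after positions change: stop sharing by switching to a fresh cache. */
  void tag_positions_changed()
  {
    cache = std::make_shared<BoundsCache>();
  }

  /* Compute the bounds on first access; every sharer sees the result. */
  Bounds bounds() const
  {
    assert(cache != nullptr);
    std::lock_guard<std::mutex> lock(cache->mutex);
    if (!cache->is_valid) {
      if (!positions.empty()) {
        const auto range = std::minmax_element(positions.begin(), positions.end());
        cache->bounds.min = *range.first;
        cache->bounds.max = *range.second;
      }
      cache->is_valid = true;
    }
    return cache->bounds;
  }
};
```

Bounds computed on a copy become visible on the original because both point
at the same cache. Tagging one of them dirty replaces only that geometry's
pointer, so the other keeps its valid cached value.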

**Future**
There are further improvements which can be worked on next:
- Apply changes to volume objects and other types where it makes sense
- Continue cleanup changes described in T96968
- Apply shared cache design to more expensive data like triangulation
  or normals

Differential Revision: https://developer.blender.org/D16204
2022-11-15 13:48:00 -06:00
b2d9716b4a Cleanup: Remove DerivedMesh use in UV unwrapping
Previously the UV unwrapping handling for subsurf modifiers used
`DerivedMesh` to implement the subdivision. Since we're trying to remove
`DerivedMesh` in general, and since this just made use of the `Mesh`
data anyway, it's relatively simple to remove it here. Combined with
D15939, this makes it possible to remove more `DerivedMesh` code.

Differential Revision: https://developer.blender.org/D16487
2022-11-15 13:48:00 -06:00
d775995dc3 DRW: Manager: Add possibility to bind UBO and VBO as SSBO through commands
This exposes `GPU_uniformbuf_bind_as_ssbo` and `GPU_vertbuf_bind_as_ssbo`
through the `draw::Pass` API.
2022-11-15 20:16:25 +01:00
a9a5f7ce17 GPU: UniformBuf: Add GPU_uniformbuf_clear_to_zero
This allows clearing the entire buffer directly on GPU.
2022-11-15 20:16:25 +01:00
b8fd474bb1 Merge branch 'blender-v3.4-release' 2022-11-15 18:41:24 +01:00
50b257715f Assets: Avoid quadratic complexity when freeing asset libraries
Using a vector to store assets means we have to lookup the position of
the asset to be able to remove/free it. Use a `blender::Set` instead for
(nearly?) constant time removal.
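
The difference in removal cost can be sketched as below. `Asset`,
`remove_from_vector` and `remove_from_map` are hypothetical names, and
`std::unordered_map` keyed by the raw pointer is used as a stand-in for
`blender::Set`:

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

struct Asset {
  int id;
};

/* Vector storage: removal needs a linear search first, so freeing all
 * assets one by one is quadratic in the number of assets. */
inline bool remove_from_vector(std::vector<std::unique_ptr<Asset>> &assets, const Asset *asset)
{
  assert(asset != nullptr);
  auto it = std::find_if(assets.begin(),
                         assets.end(),
                         [&](const std::unique_ptr<Asset> &p) { return p.get() == asset; });
  if (it == assets.end()) {
    return false;
  }
  assets.erase(it);
  return true;
}

/* Hashed storage keyed by address: lookup and removal are constant time on average. */
using AssetMap = std::unordered_map<const Asset *, std::unique_ptr<Asset>>;

inline Asset *add_to_map(AssetMap &assets, const int id)
{
  std::unique_ptr<Asset> asset(new Asset{id});
  Asset *raw = asset.get();
  assets.emplace(raw, std::move(asset));
  return raw;
}

inline bool remove_from_map(AssetMap &assets, const Asset *asset)
{
  assert(asset != nullptr);
  return assets.erase(asset) == 1;
}
```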
2022-11-15 17:44:47 +01:00
2f442234e7 Merge branch 'blender-v3.4-release' 2022-11-15 08:42:55 -08:00
2d251478bb Sculpt: Fix mask from cavity settings issues
Mask from cavity can now pull settings from three
places: the operator properties, scene tool settings
or the brush.  This is needed to make the "create mask"
button work as expected.
2022-11-15 08:39:06 -08:00
277b2fcbfa GPU: Update Vulkan backend with latest API changes.
UniformBuf::bind_as_ssbo has been introduced.
2022-11-15 16:24:30 +01:00
4fb02d7f8e Fix T102482: Crash loading geometry nodes assets after file load
Asset library data is destructed on file load. Asset lists (weak and
hopefully temporary design) contain pointers into it that would dangle
then. Make sure the asset lists are destructed before the asset library
data.
2022-11-15 16:21:06 +01:00
cff78860ac Merge branch 'blender-v3.4-release' 2022-11-15 12:08:05 -03:00
23d0b5dcd2 Merge branch 'blender-v3.4-release' 2022-11-15 16:04:35 +01:00
a859837cde Cleanup: Move OptiX denoiser code from device into denoiser class
Cycles already treats denoising fairly separate in its code, with a
dedicated `Denoiser` base class used to describe denoising
behavior. That class has been fully implemented for OIDN
(`denoiser_oidn.cpp`), but for OptiX was mostly empty
(`denoiser_optix.cpp`) and denoising was instead implemented in
the OptiX device. That meant denoising code was split over various
files and directories, making it a bit awkward to work with. This
patch moves the OptiX denoising implementation into the existing
`OptiXDenoiser` class, so that everything is in one place. There are
no functional changes, code has been mostly moved as-is. To
retain support for potential other denoiser implementations based
on a GPU device in the future, the `DeviceDenoiser` base class was
kept and slightly extended (and its file renamed to
`denoiser_gpu.cpp` to follow similar naming rules as
`path_trace_work_*.cpp`).

Differential Revision: https://developer.blender.org/D16502
2022-11-15 15:50:01 +01:00
a94c3aafe5 Merge branch 'blender-v3.4-release' 2022-11-15 14:53:37 +01:00
ff40b90f99 GPU: UniformBuffer: Add possibility to bind as SSBO
This way UBOs can be modified directly in shader just like VBOs and IBOs.
2022-11-15 14:41:38 +01:00
5db84d0ef1 GPU: State: Add GPU_BARRIER_UNIFORM
This allows synchronising uniform buffer writes from compute shaders
when a UBO is bound as SSBO.
2022-11-15 14:41:38 +01:00
f0ce95b7b9 GPU: Enabled Metal test cases.
This commit enables the Metal GPU backend test cases. These test cases
currently fail, but are disabled by default.
2022-11-15 13:14:05 +01:00
d2728868c0 GPU: Improve Codegen variable names
Include the node name and parameter index in the variable name for easier debugging.
(Enabled for debug builds only)

Reviewed By: fclem, jbakker

Differential Revision: https://developer.blender.org/D16496
2022-11-15 13:06:58 +01:00
191a3bf2ad Merge branch 'blender-v3.4-release' 2022-11-15 13:05:49 +01:00
66939d47b1 Merge branch 'blender-v3.4-release' 2022-11-15 12:44:39 +01:00
84dce9c1fa Merge branch 'blender-v3.4-release' 2022-11-15 12:11:12 +01:00
1b5ceb9a75 Merge branch 'blender-v3.4-release' 2022-11-15 12:05:11 +01:00
d95216b94c Merge branch 'blender-v3.4-release' 2022-11-15 11:35:29 +01:00
d46317cf3c Fix T101775: grease pencil keyframe not filtered in dopesheet summary
Grease pencil data keyframes were listed twice in the summary.

First by the generic object data listing,
which did not handle properly grease pencil objects,
and did not account for the grease pencil filter.
Second by the specific grease pencil function.

Now only the second call is made,
and the filter hides keyframes in summary as well.

Reviewed By: Jeroen Bakker, Falk David

Differential Revision: https://developer.blender.org/D16369
2022-11-15 09:49:39 +01:00
7f80b5e675 Animation: rearrange grease pencil channels in the main dopesheet
Operations to rearrange channels in the main dopesheet
did not cover grease pencil layer channels.
Now grease pencil layer channels can be moved up and down
in the main dopesheet just like other channels.

Reviewed By: Sybren A. Stüvel

Differential Revision: https://developer.blender.org/D15542
2022-11-15 09:09:48 +01:00
91215ace72 Merge branch 'blender-v3.4-release' 2022-11-15 08:08:17 +01:00
2a41cd46ba Cleanup: format 2022-11-15 16:43:18 +11:00
f0f97e18c1 Cleanup: quiet unused variable warning 2022-11-15 16:41:50 +11:00
f396ab236a Merge branch 'blender-v3.4-release' 2022-11-15 16:37:56 +11:00
be024ee7b7 Merge branch 'blender-v3.4-release' 2022-11-15 15:32:52 +11:00
435c824a5f Merge branch 'blender-v3.4-release' 2022-11-15 15:32:47 +11:00
2205e5f63f Merge branch 'blender-v3.4-release' 2022-11-15 15:32:34 +11:00
Iliya Katueshenock
efcd587bc2 Geometry Nodes: Image Info Node
This commit adds a new "Image Info" node to retrieve various
information from an image like its width, height, and whether
it has an alpha channel. It is also possible to retrieve the FPS
and frame count of video files.

Differential Revision: https://developer.blender.org/D15042
2022-11-14 18:55:51 -06:00
b64042b482 Merge branch 'blender-v3.4-release' 2022-11-14 18:21:35 -06:00
d158db475b Merge branch 'blender-v3.4-release' 2022-11-14 17:53:25 -06:00
e3ee913932 Cleanup: Resolve unused variable warning in draw module 2022-11-14 14:50:10 -06:00
103fe4d1d1 Merge branch 'blender-v3.4-release' 2022-11-14 20:09:02 +01:00
6873aabf93 Cleanup: BVH utils: Remove print, use spans instead of pointers 2022-11-14 11:59:40 -06:00
b0e2e45496 Cycles: Enable MetalRT pointclouds & other fixes
Code authored by Marco Giordano.

This fixes pointcloud rendering on MetalRT and some other subtle MetalRT bugs:
- Incorrect kernel hashing
- Missing specialisation constants
- Incorrect visibility filtering
- Missing null pointer check

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16499
2022-11-14 16:39:18 +00:00
85bbcb32eb Merge branch 'blender-v3.4-release' 2022-11-14 16:46:01 +01:00
276d7f7c19 Mesh: Avoid calculating normals when building BVH tree
Though they are sometimes used by users of the BVH tree, calculating
vertex normals when building the BVH tree is mostly unnecessary. Skip it
instead and avoid storing the vertex normals in the BVH tree cache.
They are just calculated in the few places they are actually needed.
This should save at least a few percent of the runtime in some cases
where the normals weren't needed otherwise.
2022-11-14 07:56:53 -06:00
cc12d2a5a0 Cleanup: Reduce indentation and variable scope in BVH utils 2022-11-14 07:56:53 -06:00
1aaf4ce0a4 Cleanup: Use C++ BitVector instead of BLI_bitmap for BVH utils
This gives a friendlier interface, an inline buffer, RAII, etc.
Also switch some BMesh functions that were only used by the snap
system's use of BVH utils.
2022-11-14 07:56:53 -06:00
6192695a94 Cleanup: Allow using C++ features in BMesh header functions
Generally the `extern "C" {` brackets shouldn't be added around other
headers since it causes problems when using C++ features in them.
Follow that convention for the "bmesh.h" header.
2022-11-14 07:56:53 -06:00
28dc3b0b84 Cleanup: Move bmesh_iterators.c to C++ 2022-11-14 07:56:53 -06:00
37f50ffdbc Cleanup: Braces around initialization warning
There is no need to initialize the bounding box as it is fully
initialized by the BKE_boundbox_init_from_minmax().
2022-11-14 14:43:41 +01:00
Ejner Fergo
b557e4317d Gizmo toggle for Movie Clip Editor
This patch adds a "Show Gizmo" toggle to the Movie Clip Editor header, for consistency with other editors.

{F13892765}

Differential Revision: https://developer.blender.org/D16437
2022-11-14 14:35:32 +01:00
Mikhail Matrosov
857bb1b5ec Cycles: improve adaptive sampling for overexposed scenes
Render time is reduced for overexposed scenes, by taking into account absolute
light intensity for adaptive sampling.

This can negatively affect some scenes where compositing or color management
are used to make the scene much darker or lighter. For best results adjust the
Film > Exposure setting to bring the intensity into a good range, and then do
further compositing and color management on top of that. Note that this setting
is different than color management exposure.

Previously Cycles' adaptive sampling used sqrt(I) to normalize noise level to
conform to a viewer's eye sensitivity. It is great for darker regions of the
image, but also requests too many samples in bright regions, sometimes several
times more than needed. Highlights can tolerate more noise because in most
examples it is still less noticeable than the noise in darker areas in the same
render.

Differential Revision: https://developer.blender.org/D16392
2022-11-14 14:14:31 +01:00
d7f1430f11 Merge branch 'blender-v3.4-release' 2022-11-14 14:04:13 +01:00
Colin Basnett
4dd19a1ad6 UI: disable curve map & profile zoom buttons at max/min zoom level
Disable the zoom in and out buttons when they would have no effect.

This also removes an incorrect comment that indicates the maximum zoom level
was 20x when in fact it was 25x.

Differential Revision: https://developer.blender.org/D16252
2022-11-14 14:03:50 +01:00
187bce103b DRW: Fix compilation issues in inline functions 2022-11-14 14:01:23 +01:00
0fe21fe7b9 Merge branch 'blender-v3.4-release' 2022-11-14 13:20:21 +01:00
ea2dda306c Asset system: New asset system code module (with files from BKE)
Adds a new `source/blender/asset_system` directory and moves asset
related files from BKE to it. More asset related code can follow
(e.g. asset indexing, ED_assetlist stuff) but needs further work to
untangle it. I also kept `BKE_asset.h` and `asset.cc` as is, since they
deal with asset DNA data mostly, thus make sense in BKE.

Motivation:
- Makes the asset system design more present (term wasn't even used in
  code before).
- An `asset_system` directory is quite descriptive (trivial to identify
  core asset system features) and makes it easy to find asset code.
- Asset system is mostly runtime data, with little relation to other
  `Main`/BKE/DNA types.
- There's a lot of stuff in BKE already. It shouldn't be just a dump for
  all stuff that seems core enough.
- Being its own directory helps us be more mindful about encapsulating
  the module well, and avoiding dependencies on other modules.
- We can be more free with splitting files here than in BKE.
- In future there might be an asset system BPY module, which would then
  map quite nicely to the `asset_system` directory.

Checked with some other core devs, consensus seems that this makes
sense.
2022-11-14 12:46:34 +01:00
6a96edce2e Merge branch 'blender-v3.4-release' 2022-11-14 12:24:45 +01:00
bfb6ea898b DRW: View: Add base for multi-view support
This implements the base needed for supporting multiple views concurrently
inside the same drawcall.

The view used by common macros and view related functions is indexed using
a global variable `drw_view_id` which can be set arbitrarily or read
from the `drw_ResourceID`.

This is needed for EEVEE-Next shadow but can be used for other purposes
in the future.

Note that a shader specialization is needed for it to work. `DRW_VIEW_LEN`
needs to be defined to the number of views the shader will access.

The number of views contained in a `draw::View` is set at construction
time.

Note that the maximum number of objects correctly drawn by the shaders
using multiple views will be lower than for those that don't.
2022-11-14 11:17:38 +01:00
ab3fcd62cc Cleanup: DRW: Remove two clang-tidy warnings 2022-11-14 11:17:38 +01:00
5fe146e505 Cleanup: Move mesh_remap.c to C++
To facilitate further mesh data structure refactoring.
2022-11-13 20:27:37 -06:00
909f47e0e1 UV: fix crash with uv copy on empty selection
Introduced in 721fc9c1c9
2022-11-14 13:37:12 +13:00
e0cb3e0a39 Merge branch 'blender-v3.4-release' 2022-11-14 10:38:01 +11:00
b8d1022dff Merge branch 'blender-v3.4-release' 2022-11-14 10:37:57 +11:00
dc513a0af8 Cleanup: Disable mesh normal debug time printing
Left enabled mistakenly by d63ada602d.
2022-11-13 14:19:20 -06:00
3eb2bc2c3f Cleanup: Move lineart_cpu.c to C++
To enable further mesh data structure refactoring-- access to loose
edges in particular.
2022-11-13 14:16:24 -06:00
fba7461e1a UV: fix compile on windows
Remove VLAs for compiling on windows.
Regression from 721fc9c1c9
2022-11-14 08:22:36 +13:00
d17f5bcd8f Fix T95335 Bevel operator Loop Slide overshoot.
If the edge you are going to slide along is very close to in line
with the adjacent beveled edge, then there will be sharp overshoots.
There is an epsilon comparison to just abandon loop slide if this
situation is happening. That epsilon used to be 0.25 radians, but
bug T86768 complained that that value was too high, so it was changed
to .0001 radians (5 millidegrees). Now this current bug shows that
that was too aggressively small, so this change ups it by a factor
of 10, to .001 radians (5 centidegrees). All previous bug reports
remained fixed.
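
As a rough check of the magnitudes quoted above, the following sketch converts
the three epsilon values to degrees and shows a threshold test of the kind
described. The function name `should_abandon_loop_slide` and the idea that the
angle is already available in radians are assumptions for illustration; this is
not the bevel code itself.

```python
import math

# Epsilon values quoted in the commit message, in radians.
EPS_ORIGINAL = 0.25
EPS_T86768 = 0.0001
EPS_CURRENT = 0.001


def should_abandon_loop_slide(angle_rad, eps=EPS_CURRENT):
    """Hypothetical check: True when the slide edge is within `eps`
    radians of being in line with the adjacent beveled edge."""
    return abs(angle_rad) < eps


# Print each threshold in degrees to compare against the quoted figures.
for eps in (EPS_ORIGINAL, EPS_T86768, EPS_CURRENT):
    print(f"{eps} rad = {math.degrees(eps):.4f} deg")
```

An angle of 0.0005 radians would pass the older 0.0001 threshold but is
rejected by the new 0.001 one, which is the kind of case the larger epsilon is
meant to catch.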
2022-11-13 14:09:27 -05:00
7419e291e8 Merge branch 'blender-v3.4-release' 2022-11-13 22:54:38 +05:30
ce9fcb15a3 DRW: Manager: Fix ClearMulti breaking compilation on Mac
The error was:
`draw_pass.hh:1055:16: error: call to implicitly-deleted default constructor of 'blender::draw::command::Undetermined [3]'`
2022-11-13 18:02:17 +01:00
bd622aef3c BLI: Fix ListBaseWrapper::get wrong return type 2022-11-13 16:48:30 +01:00
c255be2d02 DRW: Manager: Add bind_texture command for vertex buffer
This allows the same behavior as with `DRW_shgroup_buffer_texture`.
2022-11-13 16:47:43 +01:00
cd64615425 DRW: Manager: Add ClearMulti command
Allows recording `GPU_framebuffer_multi_clear` inside `draw::Pass`.
2022-11-13 16:23:22 +01:00
930d14cc62 DRW: Manager: Finish / change implementation of framebuffer_set command
Use reference instead of direct pointer. This is because framebuffers
often use temp textures and are configured later just before submission.
2022-11-13 16:16:26 +01:00
f1466ce9a8 DRW: Wrappers: Allow taking reference of the framebuffer object
This is in order to make it work with the new `framebuffer_set` command
which requires a `GPUFrameBuffer **`.
2022-11-13 16:02:57 +01:00
0e4bdd428c DRW: Wrappers: Allow trivial types inside draw::SwapChain
This allows using pointers and other trivial types which cannot
implement the `swap` method.
2022-11-13 16:00:58 +01:00
67dfb61700 DRW: Wrappers: Avoid default vector length of 0 if sizeof(T) is large
This increases the default size to some reasonable value (>512 bytes) and
allocates at least 1 element.
2022-11-13 15:59:23 +01:00
d0f05ba915 Cleanup: fix compiler error/warnings 2022-11-13 11:29:04 +01:00
721fc9c1c9 UV: implement copy and paste for uv
Implement a new topology-based copy and paste solution for UVs.

Usage notes:

* Open the UV Editor

* Use the selection tools to select a Quad joined to a Triangle joined to another Quad.
* From the menu, choose UV > UV Copy
 * The UV co-ordinates for your quad<=>tri<=>quad are now stored internally

* Use the selection tools to select a different Quad joined to a Triangle joined to a Quad.
* (Optional) From the menu, choose UV > Split > Selection

* From the menu, choose UV > UV Paste
 * The UV co-ordinates for the new selection will be moved to match the stored UVs.

Repeat selection / UV Paste steps as many times as desired.
For performance considerations, see https://en.wikipedia.org/wiki/Graph_isomorphism_problem

In theory, UV Copy and Paste should work with all UV selection modes.
Please report any problems.

A copy has been made of the Graph Isomorphism code from https://github.com/stefanoquer/graphISO
Copyright (c) 2019 Stefano Quer stefano.quer@polito.it GPL v3 or later.

Additional integration code Copyright (c) 2022 by Blender Foundation, GPL v2 or later.

Maniphest Tasks: T77911
Differential Revision: https://developer.blender.org/D16278
2022-11-13 12:48:17 +13:00
533c396898 Merge remote-tracking branch 'origin/blender-v3.4-release' 2022-11-12 13:32:26 -07:00
115cf5ef98 Cleanup: Move cloth.c to C++
To support further mesh data structure refactoring.
2022-11-12 12:14:09 -06:00
a6c822733a BLI: improve CPPType system
* Support bidirectional type lookups. E.g. finding the base type of a
  field was supported, but not the other way around. This also removes
  the todo in `get_vector_type`. To achieve this, types have to be
  registered up-front.
* Separate `CPPType` from other "type traits". For example, previously
  `ValueOrFieldCPPType` adds additional behavior on top of `CPPType`.
  Previously, it was a subclass, now it just contains a reference to the
  `CPPType` it corresponds to. This follows the composition-over-inheritance
  idea. This makes it easier to have self-contained "type traits" without
  having to put everything into `CPPType`.
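
The two points above can be sketched in a few lines. This is a Python analogy
under stated assumptions, not the BLI code: `TypeRegistry`, `base_of` and
`wrapper_of` are hypothetical names, and the classes only mirror the idea that
the wrapper type holds a reference to its base type instead of subclassing it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CPPType:
    """Stand-in for a runtime type descriptor."""
    name: str


@dataclass(frozen=True)
class ValueOrFieldType:
    """Holds a reference to its base type instead of subclassing it."""
    value: CPPType


class TypeRegistry:
    """Hypothetical registry: lookups in both directions only work for
    types that were registered up-front."""

    def __init__(self):
        self._by_base = {}

    def register(self, base):
        wrapper = ValueOrFieldType(value=base)
        self._by_base[base] = wrapper
        return wrapper

    def base_of(self, wrapper):
        # Wrapper -> base: follow the contained reference.
        return wrapper.value

    def wrapper_of(self, base):
        # Base -> wrapper: needs the registration done earlier.
        return self._by_base.get(base)
```

The reverse lookup returns `None` for a base type that was never registered,
which illustrates why up-front registration is needed for the bidirectional
case.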

Differential Revision: https://developer.blender.org/D16479
2022-11-12 18:33:31 +01:00
a145b96396 Merge branch 'blender-v3.4-release' 2022-11-12 16:51:53 +01:00
Iliya Katueshenock
99fe17f52d BLI: use templates for disjoint set data structure
Differential Revision: https://developer.blender.org/D16472
2022-11-12 14:26:47 +01:00
db25e64f6a Merge branch 'blender-v3.4-release' 2022-11-12 14:23:32 +01:00
Iliya Katueshenock
b5e82ff93d Cleanup: remove unused variable
Differential Revision: https://developer.blender.org/D16350
2022-11-12 14:20:19 +01:00
5a37724455 Merge branch 'blender-v3.4-release' 2022-11-12 14:19:32 +01:00
Iliya Katueshenock
3534c2b4ad Cleanup: make GArray declarations more explicit
Differential Revision: https://developer.blender.org/D16064
2022-11-12 14:14:42 +01:00
0fc27536fb Merge branch 'blender-v3.4-release' 2022-11-12 19:52:11 +11:00
4737f9cff2 Merge branch 'blender-v3.4-release' 2022-11-12 17:10:42 +11:00
935d6a965a Merge branch 'blender-v3.4-release' 2022-11-12 17:10:39 +11:00
b973e27327 Merge branch 'blender-v3.4-release' 2022-11-12 17:10:36 +11:00
cd659f7bbf Merge branch 'blender-v3.4-release' 2022-11-12 17:10:32 +11:00
41137eb7a5 Merge branch 'blender-v3.4-release' 2022-11-12 17:10:29 +11:00
e87b99d7f3 Merge branch 'blender-v3.4-release' 2022-11-12 17:10:25 +11:00
fcfa9ac219 Merge branch 'blender-v3.4-release' 2022-11-12 17:10:21 +11:00
1a8516163f Cleanup: Simplify handling of loop to poly map in normal calculation
A loop to poly map was passed as an optional output to the loop normal
calculation. That meant it was often recalculated more than necessary.
Instead, treat it as an optional argument. This also helps relieve
unnecessary responsibilities from the already-complicated loop normal
calculation code.
2022-11-11 23:27:36 -06:00
d63ada602d Cleanup: Use simpler timers for mesh normals debug timing 2022-11-11 23:26:56 -06:00
7c519aa5d8 Cleanup: Make loop normal calculation function static 2022-11-11 23:26:49 -06:00
78bfb74743 Cleanup: Decrease variable scope in mesh loop normal calculation 2022-11-11 21:56:17 -06:00
d9e5a3e6ad Cleanup: Use spans for loop normal calculation input data 2022-11-11 21:49:43 -06:00
d0522d4ef1 Cleanup: Remove unnecessary struct keywords 2022-11-11 19:05:22 -06:00
03ccf37162 Cleanup: Rename curves sculpt selection variable
It's a bit simpler to skip the "indices" in the name, that can be
assumed from the type.
2022-11-11 15:32:51 -06:00
5465aa63d5 Merge branch 'blender-v3.4-release' 2022-11-11 13:49:55 -07:00
9d827a1834 Fix OSL object matrix with Cycles on the GPU
The OSL GPU services implementation of "osl_get_matrix" and
"osl_get_inverse_matrix" was missing support for the "common",
"shader" and "object" matrices and thus any matrix operations in OSL
shaders using these would not work. This patch adds the proper
implementation copied from the OSL CPU services.

Maniphest Tasks: T101222
2022-11-11 20:21:08 +01:00
1fdaf748bf Add poll messages for marker operators
A number of operators were missing poll messages when disabled.

These are the following new error messages:

1. "No markers are selected"
2. "Markers are locked"

Reviewed By: sybren

Differential Revision: https://developer.blender.org/D16403
2022-11-11 11:13:24 -08:00
5671e7a92c Cleanup: Fixing anti-patterns in fcurve.c
This is a clean-up pass that eliminates a few problematic patterns:

* Eliminating redundant parentheses around simple expressions.
* Combining declaration and assignment of variables where appropriate.
* Moving variable declarations closer to their first use.
* Many variables and arguments have been marked as `const`.
* Using `LISTBASE_FOREACH_*` variants where applicable instead of
  manually managing loop control flow.

There are no functional changes.

Reviewed By: sybren

Differential Revision: https://developer.blender.org/D16459
2022-11-11 11:07:30 -08:00
abbbf9f002 Merge branch 'blender-v3.4-release' 2022-11-11 12:41:04 -06:00
Michael Jones
2c596319a4 Cycles: Cache only up to 5 kernels of each type on Metal
This patch adapts D14754 for the Metal backend. Kernels of the same type are already organised into subdirectories which simplifies type matching.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16469
2022-11-11 18:10:29 +00:00
6f6a0185f2 Merge branch 'blender-v3.4-release' 2022-11-11 19:05:44 +01:00
097a13f5be Fix broken Cycles rendering with recent OSL versions
Commit c8dd33f5a37b6a6db0b6950d24f9a7cff5ceb799 in OSL
changed behavior of shader parameters that reference each other
and are also overwritten with an instance value.
This is causing the "NormalIn" parameter of a few OSL nodes in
Cycles to be set to zero somehow, which should instead have
received the value from a "node_geometry" node Cycles generates
and connects automatically. I am not entirely sure why that is
happening, but these parameters are superfluous anyway, since
OSL already provides the necessary data in the global variable "N".
So this patch simply removes those parameters (which mimics
SVM, where these parameters do not exist either), which also
fixes the rendering artifacts that occurred with recent OSL.

Maniphest Tasks: T101222

Differential Revision: https://developer.blender.org/D16470
2022-11-11 17:10:30 +01:00
dc8a1d38b7 Merge branch 'blender-v3.4-release' 2022-11-11 09:09:28 -06:00
a304dfdb69 Cleanup: Improve curves sculpt code section names 2022-11-11 08:50:28 -06:00
111974234c Addons submodule version bump
(Previous attempt was accidentally 4 days outdated)
2022-11-11 15:10:30 +01:00
fe2be36510 Merge branch 'blender-v3.4-release' 2022-11-11 13:27:27 +01:00
5f35e7f12a Addons submodule version bump 2022-11-11 12:48:14 +01:00
a3877d8fe4 Merge branch 'blender-v3.4-release' 2022-11-11 11:49:54 +01:00
824d5984aa Merge branch 'blender-v3.4-release' 2022-11-11 10:10:15 +01:00
6f1b5e1081 Merge branch 'blender-v3.4-release' 2022-11-11 08:48:43 +01:00
c967aab4ef Cleanup: Remove unused navigation widget struct members
The `region_size[2]` was set to -1 but was never accessed.
2022-11-10 22:38:38 -05:00
ca1642cd0c Cleanup: Use string argument for attribute API function
Instead of CustomDataLayer, which exposes the internal implementation
more than necessary, and requires that the layer is always available,
which isn't always true.
2022-11-10 15:29:21 -06:00
34f4646786 Cleanup: Clarify and deduplicate attribute convert implementation
The ED level function is used for more code paths now, and it has been
cleaned up. Handling of the active attribute is slightly improved too.
2022-11-10 15:29:21 -06:00
Ramil Roosileht
9800312590 Mesh: Convert color attribute operator
Implements an operator to convert color attributes in
available domains and types, as described in T97106.

Differential Revision: https://developer.blender.org/D15596
2022-11-10 15:29:21 -06:00
f613c504c4 Merge branch 'blender-v3.4-release' 2022-11-10 11:52:04 -08:00
3c089c0a88 Sculpt: Fix T102209: Multiresolution levels greater than 6 crashes
pbvh->leaf_limit needs to be at least 4 to split nodes
along original face boundaries properly.
2022-11-10 11:30:04 -08:00
Edward
59618c7646 Sculpt: Fix T101914: Wpaint gradient tool doesn't work with vertex mask
Reviewed by: Julien Kaspar & Joseph Eagar
Differential Revision: https://developer.blender.org/D16293
Ref D16293
2022-11-10 11:03:59 -08:00
366796bbbe Merge branch 'blender-v3.4-release' 2022-11-10 19:59:47 +01:00
969aa7bbfc Sculpt: Change symmetrize merge threshold and expose in workspace panel
The sculpt symmetrize operator's merge threshold now defaults
to 0.0005 instead of 0.001, which tends to be a bit too big for
metric scale.

Also changed its step and precision a bit to be more usable.
2022-11-10 10:55:09 -08:00
Julien Kaspar
659de90a32 Sculpt: Rename Show/Hide operators for consistency
This is a minor naming update to make the box hide and show operators in sculpt mode follow current naming conventions.

Reviewed by: Joseph Eagar
Differential Revision: https://developer.blender.org/D16413
Ref D16413
2022-11-10 10:47:08 -08:00
cc2b5959bb Sculpt: Fix inconsistent naming for cavity_from_mask operator
With db40b62252 there have been various UI adjustments and improved renaming.
The Mask From Cavity menu operator didn't follow this new naming yet.

Reviewed By: Joseph Eagar
Differential Revision: https://developer.blender.org/D16409
Ref D16409
2022-11-10 10:40:54 -08:00
6a8ce5ec1c Fix abort when rendering with OSL and OptiX in Cycles
LLVM could kill the process during OSL PTX code generation because
generated symbols contained invalid characters in their names.
Those names are generated by Cycles and were not properly filtered:

- If the locale was set to something other than the minimal locale
  (when Blender was built with WITH_INTERNATIONAL), pointers
  may be printed with grouping characters, like commas or dots,
  added to them.
- Material names from Blender may contain the full range of UTF8
  characters.

This fixes those cases by forcing the locale used in the symbol name
generation to the minimal locale and using the material name hash
instead of the actual material name string.
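
A minimal sketch of the idea, assuming nothing about the real Cycles code: the
helper name `symbol_name`, the `osl_` prefix and the use of SHA-1 are all
hypothetical choices made for illustration. The point is only that hashing the
material name and formatting the pointer as plain hex yields an ASCII-only
name with no locale-dependent grouping characters.

```python
import hashlib


def symbol_name(material_name, pointer_value):
    """Hypothetical helper: build an ASCII-only symbol name.

    The material name is replaced by a hash of its UTF-8 bytes, and the
    pointer is formatted as plain hex, which does not depend on the
    current locale.
    """
    name_hash = hashlib.sha1(material_name.encode("utf-8")).hexdigest()[:16]
    return f"osl_{name_hash}_{pointer_value:x}"


# A material name with non-ASCII characters and punctuation.
print(symbol_name("Matériau, 木", 0x7FFE12345678))
```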
2022-11-10 19:31:59 +01:00
2688d7200a Sculpt: Fix T102379: Crash in dyntopo 2022-11-10 10:28:10 -08:00
a5b2a3041f Fix const-correctness for a number of F-Curve functions
Reviewed By: sybren

Differential Revision: https://developer.blender.org/D16445
2022-11-10 10:22:03 -08:00
df9ab1c922 Merge branch 'blender-v3.4-release' 2022-11-10 18:07:34 +01:00
acaa736037 Fix: GPU: Set the last enum in ENUM_OPERATORS 2022-11-10 17:43:53 +01:00
04eab3fd39 Merge branch 'blender-v3.4-release' 2022-11-10 16:18:16 +01:00
e7a3454f5f Cleanup: Fix strict compiler warning 2022-11-10 16:11:02 +01:00
b85fe57887 Merge branch 'blender-v3.4-release' 2022-11-10 15:54:16 +01:00
598bb9065c Cleanup: Move sculpt.c to C++ 2022-11-10 07:40:41 -06:00
55e86f94a0 EEVEE Next: Fix wrong DoF when a non-camera object is the active camera
Related to T101533.

Reviewed By: fclem

Differential Revision: https://developer.blender.org/D16412
2022-11-10 13:01:06 +01:00
bc672e76eb Merge branch 'blender-v3.4-release' 2022-11-10 12:41:57 +01:00
a4668ecf17 Merge branch 'blender-v3.4-release' 2022-11-10 12:39:01 +01:00
Jason Schleifer
0b4bd3ddc0 NLA: Update context menu to include meta strip operators
The meta strip operator is available in the Add menu, but not in the
context menu.

This patch adds these two operators to the context menu:
* nla.meta_add
* nla.meta_remove

Reviewed By: sybren, RiggingDojo

Differential Revision: https://developer.blender.org/D16353
2022-11-10 12:07:29 +01:00
8ef092d2d8 Fix T102151: Output nodes don't work inside node groups
Using output nodes inside node groups in compositor node trees doesn't
work for the realtime compositor.

Currently, the realtime compositor only considers top level output
nodes. That means if a user edits a node group and adds an output node
in the group, the output node outside of the node group will still be
used, which breaks the temporary viewers workflow where users debug
results inside a node group.

This patch fixes that by first considering the output nodes in the
active context, then consider the root context as a fallback. This is
mostly consistent with the CPU compositor, but the realtime compositor
allows viewing node group output nodes even if no output nodes exist at
the top level context.
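
The lookup order described above can be sketched as follows. This is an
illustration only: nodes are modeled as plain dicts with hypothetical `name`
and `is_output` keys, and `find_output_node` is not a function from the
realtime compositor.

```python
def find_output_node(active_context_nodes, root_context_nodes):
    """Return the first output node in the active context, falling back
    to the root context. Returns None if neither contains one."""
    for nodes in (active_context_nodes, root_context_nodes):
        for node in nodes:
            if node["is_output"]:
                return node
    return None


# Editing a group that has its own viewer: the group's node wins.
group = [{"name": "Blur", "is_output": False},
         {"name": "Group Viewer", "is_output": True}]
root = [{"name": "Composite", "is_output": True}]
print(find_output_node(group, root)["name"])
```

With this ordering, an output node inside the edited group is still found when
the root context has no output node at all.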

Differential Revision: https://developer.blender.org/D16446

Reviewed By: Clement Foucault
2022-11-10 13:02:15 +02:00
ec1ab6310a Merge branch 'blender-v3.4-release' 2022-11-10 11:05:32 +01:00
baabac5909 Merge branch 'blender-v3.4-release' 2022-11-10 11:34:43 +11:00
7d606ad3b8 Merge branch 'blender-v3.4-release' 2022-11-10 11:22:08 +11:00
1bf3069912 Merge branch 'blender-v3.4-release' 2022-11-10 11:22:02 +11:00
9b646cfae5 Merge branch 'blender-v3.4-release' 2022-11-10 11:21:18 +11:00
2630fdb787 Cleanup: format 2022-11-10 11:17:16 +11:00
d49dec896a Attempt to fix build error on Windows
Was failing since 1efc94bb2f, probably because some include uses
`std::min()`/`std::max()` which messes with the windows min/max defines.
2022-11-09 23:09:56 +01:00
32757b2429 Fix uninitialized variable use in asset metadata test
Wasn't an issue until 1efc94bb2f added a destructor, which would
attempt to destruct variables at uninitialized memory.
2022-11-09 23:04:24 +01:00
95077549c1 Fix failure in recently added asset library tests
Mistake in 1efc94bb2f.
2022-11-09 22:51:32 +01:00
1cdc6381cf Merge branch 'blender-v3.4-release' 2022-11-09 14:40:34 -07:00
c932fd79ac Merge branch 'blender-v3.4-release' 2022-11-09 22:38:25 +01:00
b8fc7ed994 Fix incorrect forward declarations, causing warnings on Windows 2022-11-09 22:37:49 +01:00
4cd9e9991c Merge branch 'blender-v3.4-release' 2022-11-09 22:03:56 +01:00
7a2827ee99 Merge branch 'blender-v3.4-release' 2022-11-09 13:55:08 -06:00
501036faae Merge branch 'blender-v3.4-release' 2022-11-09 20:42:19 +01:00
5f169fdfdc Merge branch 'blender-v3.4-release' 2022-11-09 19:45:19 +01:00
ad227e73f3 Cleanup: Link to documentation page for asset representation type 2022-11-09 19:37:05 +01:00
5291e4c358 Cleanup: Remove unused class variable, added in previous commit 2022-11-09 19:35:47 +01:00
1efc94bb2f Asset System: New core type to represent assets (AssetRepresentation)
Introduces a new `AssetRepresentation` type, as a runtime only container
to hold asset information. It is supposed to become _the_ main way to
represent and refer to assets in the asset system, see T87235. It can
store things like the asset name, asset traits, preview and other asset
metadata.

Technical documentation:
https://wiki.blender.org/wiki/Source/Architecture/Asset_System/Back_End#Asset_Representation.

By introducing a proper asset representation type, we take an important
step away from the previous, non-optimal representation of assets as
files in the file browser backend, and towards the asset system as
backend. It should replace the temporary & hacky `AssetHandle` design in
the near future. Note that the loading of asset data still happens
through the file browser backend, see the linked Wiki page for more
information on that.

As a side-effect, asset metadata isn't stored in file browser file
entries when browsing with link/append anymore. Don't think this was
ever used, but scripts may have accessed this. Can be brought back if
there's a need for it.
2022-11-09 19:30:47 +01:00
7395062480 Cleanup: Miscellaneous cleanups to trim curves node
- Fix braces initialization warning
- Fixed missing static specifier
- Removed two unused functions
2022-11-09 12:13:59 -06:00
eea3913348 Merge branch 'blender-v3.4-release' 2022-11-09 10:56:28 -06:00
a43053a00a Improved Korean Font Sample
Small change to the text sample used for Korean font previews

See D16428 for details.

Differential Revision: https://developer.blender.org/D16428

Reviewed by Brecht Van Lommel
2022-11-09 08:51:00 -08:00
ce68367969 Fix T102140: Replacement of Noto Sans CJK Font
Replace our Noto Sans CJK with a version that has Simplified Chinese
set as the default script.

See D16426 for details and examples

Differential Revision: https://developer.blender.org/D16426

Reviewed by Brecht Van Lommel
2022-11-09 08:22:38 -08:00
Germano Cavalcante
edc00429e8 Fix T102257: Crash when making an Object as Effector set to Guide and trying to scrub the timeline
rB67e23b4b2967 revealed the bug. But the bug already existed before,
it just wasn't triggered.

Apparently the problem happens because the python code generated in
`initGuiding()` cannot be executed twice.

The second time the `initGuiding()` code is executed, the local python
variables are removed to make way for the others, but the reference to
one of the grids in a `Solver` object (name='solver_guiding2') is still
being used somewhere. So an error is raised and a crash is forced.

The solution is to prevent the python code in `initGuiding()` from being
executed twice.

When `FLUID_DOMAIN_ACTIVE_GUIDE` is in `fds->active_fields` this
indicates that the pointer in `mPhiGuideIn` has been set and the guiding
is already computed (does not need to be computed again).

Maniphest Tasks: T102257

Differential Revision: https://developer.blender.org/D16416
2022-11-09 12:11:20 -03:00
e6b38deb9d Cycles: Add basic support for using OSL with OptiX
This patch generalizes the OSL support in Cycles to include GPU
device types and adds an implementation for that in the OptiX
device. There are some caveats still, including simplified texturing
due to lack of OIIO on the GPU and a few missing OSL intrinsics.

Note that this is incomplete and missing an update to the OSL
library before being enabled! The implementation is already
committed now to simplify further development.

Maniphest Tasks: T101222

Differential Revision: https://developer.blender.org/D15902
2022-11-09 15:30:21 +01:00
efe073f57c Merge branch 'blender-v3.4-release' 2022-11-09 14:43:05 +01:00
477faffd78 Cleanup: unused argument warning 2022-11-09 21:08:31 +11:00
683b945917 Cleanup: format 2022-11-09 21:07:09 +11:00
024bec85f6 Depsgraph: simplify scheduling in depsgraph evaluator
No functional or performance changes are expected.

Differential Revision: https://developer.blender.org/D16423
2022-11-09 09:58:05 +01:00
59e69fc2bd Fix strict compiler warnings
Functions which are local to a translation unit should either be
marked as static, or be in an anonymous namespace.
2022-11-09 09:47:24 +01:00
aba0d01b78 Fix T102278: Compositor transforms apply locally
When using two transformed compositor results, the transformation of one
of them is apparently in the local space of the other, while it should
be applied in the global space instead.

In order to realize a compositor result on a certain operation domain,
the domain of the result is projected on the operation domain and later
realized. This is done by multiplying by the inverse of the operation
domain. However, the order of multiplication was inverted, so the
transformation was applied in the local space of the operation domain.

This patch fixes that by inverting the order of multiplication in domain
realization.
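
Why the order matters can be shown with two small 2D affine matrices. The
sketch below is not the compositor code: the helper names are hypothetical,
and the choice of a 90 degree rotation for the operation domain and a
translation for the result is arbitrary. It only demonstrates that multiplying
by the inverse on the left and on the right give different transformations.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def affine_inverse(m):
    """Invert a 2D affine matrix [[a, b, tx], [c, d, ty], [0, 0, 1]]."""
    a, b, tx = m[0]
    c, d, ty = m[1]
    det = a * d - b * c
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return [[ia, ib, -(ia * tx + ib * ty)],
            [ic, id_, -(ic * tx + id_ * ty)],
            [0.0, 0.0, 1.0]]


# Operation domain: rotated by 90 degrees. Result: translated by (2, 0).
operation = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
result = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

inverse_operation = affine_inverse(operation)
left = matmul(inverse_operation, result)   # inverse applied on the left
right = matmul(result, inverse_operation)  # inverse applied on the right
```

Here `left` ends up with a translation of (0, -2) while `right` keeps (2, 0),
so swapping the operands changes the space in which the transformation is
applied.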
2022-11-09 10:35:41 +02:00
3836b6ff8c Cancel Equalize Handles & Snap Keys when no control points are selected
The Equalize Handles and Snap Keys operators would allow the user to
invoke them successfully even when they would have no effect due to
there not being any selected control points.

This patch makes it so that an error is displayed when these operators
are invoked with no control points selected.

This check is in the `invoke` function because it would be too
expensive to run in the `poll` function, since it requires a
linear search through all the keys of all the visible F-Curves.

Reviewed By: sybren

Differential Revision: https://developer.blender.org/D16390
2022-11-08 21:04:47 -08:00
800b025518 Merge branch 'blender-v3.4-release' 2022-11-09 14:34:08 +11:00
baee7ce4a5 Fix T102306: buildtime shader compilation option fails under Wayland
libdecor (for window decorations) was crashing on exit with the shader
builder, avoid the crash by calling the "background" system creation
function which doesn't initialize window management under Wayland.
2022-11-09 14:01:14 +11:00
76c308e45d Merge branch 'blender-v3.4-release' 2022-11-09 13:57:27 +11:00
801db0d429 Revert "Fix T102306: buildtime shader compilation option fails under Wayland"
This reverts commit 6fa05e2c29.
2022-11-09 13:54:19 +11:00
756538b4a1 BLI_math: remove normalize from mat3_normalized_to_quat_fast
The quaternions calculated are unit length unless the input matrix is
degenerate. Detect degenerate cases and remove the normalize_qt call.
2022-11-09 13:25:11 +11:00
c6612da1e6 Merge branch 'blender-v3.4-release' 2022-11-09 13:06:05 +11:00
2d9d08677e Cleanup: fix types from f04f9cc3d0 2022-11-09 14:54:37 +13:00
335082dcd3 Merge branch 'blender-v3.4-release' 2022-11-08 15:31:33 -08:00
f0b5f94cb5 Cleanup: format 2022-11-09 11:59:51 +13:00
f04f9cc3d0 Cleanup: add unique_index_table to UvElementMap
In anticipation of UV Copy+Paste, we need fast access to indices
of unique UvElements. Can also be used to improve performance and
simplify code for UV Sculpt tools and UV Stitch.

No user visible changes expected.

Maniphest Tasks: T77911

See also: D16278
2022-11-09 11:47:16 +13:00
75265f27da Merge branch 'blender-v3.4-release' 2022-11-08 21:21:02 +01:00
Lukas Stockner
1eca437197 Color Management: Parallelize ImBuf conversion to float
Motivated by long loading times in T101969, reduces render preparation time from 14sec to 6sec.

Another possible improvement would be to use C++ and template based on OCIO vs. sRGB,
but moving the file to C++ seems nontrivial (and opens up the question whether ocio_capi
makes any sense then or we should just use OCIO directly) so I left it at a direct 1:1
parallelization of the existing code for now.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16317
2022-11-08 20:59:41 +01:00
c6aacd718a Cleanup: Improve precision during UV packing.
Simplify API and improve accuracy of uv packing placement
by using pre-translation and double precision internally.

Will protect against future precision problems with UDIM.

No user visible changes expected.

Maniphest Tasks: T68889
Differential Revision: https://developer.blender.org/D16362
2022-11-09 08:50:53 +13:00
96d8e5e66b Merge branch 'blender-v3.4-release' 2022-11-08 13:39:45 -06:00
4b57bc4e5d Cleanup: format 2022-11-09 08:30:18 +13:00
b539d425f0 Merge branch 'blender-v3.4-release' 2022-11-08 19:47:55 +01:00
35eb37c60d Merge branch 'blender-v3.4-release' 2022-11-08 19:36:06 +01:00
ad5814a2a7 Merge branch 'blender-v3.4-release' 2022-11-08 12:33:31 -06:00
8eab23bc66 Geometry Nodes: Fix alignment of exposed properties in the modifier
The spacing and alignment of the properties in the geometry nodes
modifier could vary depending on the type of the socket or
whether the input can accept attributes.
Wrapping each property in its own `row` layout allows us to make
the spacing and alignment between them consistent.

Reviewed By: Hans Goudey

Differential Revision: http://developer.blender.org/D16417
2022-11-08 19:14:29 +01:00
95d36a31b6 Merge branch 'blender-v3.4-release' 2022-11-08 18:23:49 +01:00
4c182aef7c GPencil: Make Sculpt Auto-masking Global and not by Brush
Auto-masking was configured per brush, which was very
inconvenient because the options had to be set for each
brush. Now the options are global and can be set once.

Also, auto-masking now works with `and` logic
and not with `or` as before. That means that a stroke
must meet all the conditions of the masking.
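
The difference between the two behaviors can be sketched as below. The
function `is_stroke_affected`, the dict-based stroke and the two example
conditions are assumptions made for illustration, and treating "no masks
enabled" as "always affected" is also an assumption of this sketch rather than
something stated above.

```python
def is_stroke_affected(stroke, enabled_masks, use_and=True):
    """Evaluate every enabled masking condition for a stroke.

    With use_and=True all conditions must hold (new behavior);
    with use_and=False a single one is enough (old behavior).
    """
    results = [mask(stroke) for mask in enabled_masks]
    if not results:
        return True  # Assumption: no masks enabled means no restriction.
    return all(results) if use_and else any(results)


# Hypothetical conditions: same layer "A" and same material "M1".
on_layer = lambda s: s["layer"] == "A"
on_material = lambda s: s["material"] == "M1"

stroke = {"layer": "A", "material": "M2"}
print(is_stroke_affected(stroke, [on_layer, on_material], use_and=True))
print(is_stroke_affected(stroke, [on_layer, on_material], use_and=False))
```

A stroke on the right layer but with a different material is masked out under
the `and` logic, while the previous `or` logic would still have affected it.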

Added new Layer and Material options to mask the
strokes using the same Layer/Material as the selected stroke.
Before, only Active Layer and Active Material could be masked.

The masking options have been moved to the top-bar using
the same design of Mesh Sculpt masking.

As a result of the changes above, the following props changed:

Removed:

`brush.gpencil_settings.use_automasking_strokes`
`brush.gpencil_settings.use_automasking_layer`
`brush.gpencil_settings.use_automasking_material`

Added:

`tool_settings.gpencil_sculpt.use_automasking_stroke`
`tool_settings.gpencil_sculpt.use_automasking_layer_stroke`
`tool_settings.gpencil_sculpt.use_automasking_material_stroke`
`tool_settings.gpencil_sculpt.use_automasking_layer_active`
`tool_settings.gpencil_sculpt.use_automasking_material_active`


Reviewed by: Julien Kaspar, Matias Mendiola, Daniel Martinez Lara
2022-11-08 16:55:59 +01:00
bbb1d3e5e7 Merge branch 'blender-v3.4-release' 2022-11-08 16:28:11 +01:00
6a14ca18d0 Merge branch 'blender-v3.4-release' 2022-11-08 16:26:09 +01:00
c73ae711bf BLI: new basic CacheMutex
This patch introduces a new `CacheMutex` which makes it easy to implement
lazily computed caches in e.g. `Curves`. For more details see `BLI_cache_mutex.hh`.
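
For readers unfamiliar with the pattern, a lazily computed cache guarded by a
mutex can be sketched as follows. This is a Python analogy, not the C++ class:
the names `LazyCache`, `ensure` and `tag_dirty` are chosen for illustration,
and the real interface is the one documented in `BLI_cache_mutex.hh`.

```python
import threading


class LazyCache:
    """Computes a value on first use and reuses it until tagged dirty."""

    def __init__(self):
        self._lock = threading.Lock()
        self._valid = False
        self._value = None

    def tag_dirty(self):
        """Mark the cached value as outdated."""
        with self._lock:
            self._valid = False

    def ensure(self, compute):
        """Return the cached value, calling `compute` only if needed."""
        with self._lock:
            if not self._valid:
                self._value = compute()
                self._valid = True
            return self._value
```

This simplified version holds the lock while computing, so concurrent callers
wait for the first computation instead of repeating it.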

Differential Revision: https://developer.blender.org/D16419
2022-11-08 15:50:49 +01:00
Edward
1d71f82033 Texture Paint: sync adding a new texture slot to the Image Editor
When changing the texture paint slot index or activating a Texture Node, the texture displayed in the Image Editor changes accordingly.
This patch syncs the Image Editor when a new texture paint slot was added, which currently is not the case.

Also deduplicates some code.
2022-11-08 14:28:44 +01:00
77c4d3154b Merge branch 'blender-v3.4-release' 2022-11-08 13:47:43 +01:00
Brecht Van Lommel
e1b3d91127 Refactor: replace Cycles sse/avx types by vectorized float4/int4/float8/int8
The distinction existed for legacy reasons, to easily port Embree
intersection code without affecting the main vector types. However we are now
using SIMD for these types as well, so no good reason to keep the distinction.

Also more consistently pass these vector types by value in inline functions.
Previously it was partially changed for functions used by Metal to avoid having
to add address space qualifiers, simple to do it everywhere.

Also removes function declarations for vector math headers, which serve no real
purpose.

Differential Revision: https://developer.blender.org/D16146
2022-11-08 12:28:40 +01:00
32ec0521c5 Merge branch 'blender-v3.4-release' 2022-11-08 12:18:12 +01:00
c047042adf Merge branch 'blender-v3.4-release' 2022-11-08 12:03:07 +01:00
f12236d1e3 Merge branch 'blender-v3.4-release' 2022-11-08 11:34:38 +01:00
871375f222 PyAPI: add invalid objects check for RNA struct keys()/values()/items() 2022-11-08 17:17:30 +11:00
8b151982fe Merge branch 'blender-v3.4-release' 2022-11-08 17:00:46 +11:00
fddcdcc20c Merge branch 'blender-v3.4-release' 2022-11-08 12:18:52 +11:00
9d1380e0a9 Merge branch 'blender-v3.4-release' 2022-11-08 12:18:48 +11:00
452865c80d Merge branch 'blender-v3.4-release' 2022-11-08 12:18:44 +11:00
2257a9bfb1 Cleanup: correct type of RNA struct methods
Some BPy_StructRNA methods used BPy_PropertyRNA in their function
signatures. While this didn't cause any bugs, it could lead to issues
in the future.
2022-11-08 11:26:33 +11:00
4eb9322eda Cleanup: PyMethodDef formatting
Missed these changes in [0].

Also replace designated initializers in some C code, as it's not used
often and would need to be removed when converting to C++.

[0] e555ede626
2022-11-08 11:13:58 +11:00
3e71220efc Fix support for building with ffmpeg < 5.0
It seems the new audio channel API was not as backwards compatible as we thought.
Therefore we need to reintroduce the old API so that Blender can be built against older FFmpeg versions.

This change is only intended to stick around for two releases or so. After that we hope that most Linux distros ship
ffmpeg >=5.0 so we can switch to it.

Reviewed By: Sergey

Differential Revision: http://developer.blender.org/D16408
2022-11-07 17:46:13 +01:00
95631c94c4 Merge branch 'blender-v3.4-release' 2022-11-07 15:30:49 +01:00
8473b5a592 Fix strict compiler warnings 2022-11-07 14:20:50 +01:00
403fc9a3f1 Merge branch 'blender-v3.4-release' 2022-11-07 08:46:15 -03:00
e555ede626 Cleanup: unify struct declaration style for Python types, update names
Use struct identifiers in comments before the value.
This has some advantages:

- The struct identifiers didn't mix well with other code-comments,
  where other comments were wrapped onto the next line.
- Minor changes could re-align all other comments in the struct.
- PyVarObject_HEAD_INIT & tp_name are no longer placed on the same line.

Remove overly verbose comments copied from PyTypeObject (Python v2.x),
these aren't especially helpful and get outdated.

Also corrected some outdated names:

- PyTypeObject.tp_print -> tp_vectorcall_offset
- PyTypeObject.tp_reserved -> tp_as_async
2022-11-07 22:38:32 +11:00
719332d120 Cleanup: remove unused variable 2022-11-07 22:38:32 +11:00
688b408bbb Fix 'ED_transform_snap_object_project_ray_all' not returning 'hit_list'
Missed in rBff4f14b21a42.
2022-11-07 08:38:09 -03:00
888fb0b395 Merge branch 'blender-v3.4-release' 2022-11-07 12:32:36 +01:00
ff4f14b21a Fix T102053: snap fails with instances of geometry nodes
As instances are often generated geometries, we cannot rely on the data
provided by `DupliObject::ob`.

Use `DupliObject::ob_data` when possible.

This required a major refactor in the code as the output variables are
now gathered in context and easier to access.
2022-11-07 08:27:54 -03:00
b2db324f60 Fix potentially uninitialized memory usage
`nearest_world_tree_co` allows null parameter, so the `index` variable
isn't really needed and doesn't even need to be initialized.
2022-11-07 08:27:54 -03:00
cad897de16 Transform: remove SnapData cache for meshes
All cache needed is already stored in `Mesh.runtime`.
2022-11-07 08:27:54 -03:00
129197f20d Merge branch 'blender-v3.4-release' 2022-11-07 21:33:24 +11:00
74140d41b1 Cycles: Apple GPU threadgroup tuning
This patch tunes maximum threads-per-threadgroup and threads-per-block for faster renders on Apple GPUs. Appropriate tuning is selected based on the GPU architecture (M1 or M2). We see a benchmark uplift of around 5-10% on M1 family chips. Similar uplift is expected on M2 with upcoming OS changes. (Ref T101931)

Reviewed By: brecht

Maniphest Tasks: T101931

Differential Revision: https://developer.blender.org/D16299
2022-11-07 10:00:46 +00:00
671c3e1fa4 Fix File Browser Move Bookmark malfunction if no item is selected
The operator was acting on non-selected items (it wasn't checking whether
SpaceFile bookmarknr was -1), which could even end up removing items.

Now sanitize this by introducing a proper poll (which returns false if
nothing is selected).

Fixes T102014.

Maniphest Tasks: T102014

Differential Revision: https://developer.blender.org/D16385
2022-11-07 10:28:37 +01:00
37ca6e4fd1 Merge branch 'blender-v3.4-release' 2022-11-06 15:45:45 +01:00
a8865f3402 Fix: Missing initialization of curves bounds in set origin operator
It could be changed, but currently curves.bounds_min_max
relies on the initial value of its arguments. Split from D16331.
2022-11-06 10:23:14 +01:00
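
Why an accumulating min/max function depends on the initial value of its arguments can be shown with a small sketch. The names `Float3`, `expand_bounds` and `bounds_center` are hypothetical and only mirror the behaviour described in the commit message; they are not the Blender API.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

struct Float3 {
  float x, y, z;
};

// Expands `min` and `max` in place so they also cover `points`.
// The result is only meaningful when the caller initialized both arguments.
bool expand_bounds(const std::vector<Float3> &points, Float3 &min, Float3 &max)
{
  if (points.empty()) {
    return false;
  }
  for (const Float3 &p : points) {
    min = {std::min(min.x, p.x), std::min(min.y, p.y), std::min(min.z, p.z)};
    max = {std::max(max.x, p.x), std::max(max.y, p.y), std::max(max.z, p.z)};
  }
  return true;
}

// Caller side: start from an "empty" box (min = +max float, max = -max float),
// so the first point always replaces the initial values.
Float3 bounds_center(const std::vector<Float3> &points)
{
  const float big = std::numeric_limits<float>::max();
  Float3 min = {big, big, big};
  Float3 max = {-big, -big, -big};
  if (!expand_bounds(points, min, max)) {
    return {0.0f, 0.0f, 0.0f};
  }
  assert(min.x <= max.x && min.y <= max.y && min.z <= max.z);
  return {(min.x + max.x) * 0.5f, (min.y + max.y) * 0.5f, (min.z + max.z) * 0.5f};
}
```

If `min` and `max` were left uninitialized in `bounds_center`, the comparisons in `expand_bounds` would read indeterminate values, which is the kind of problem the fix above addresses.
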
3852094b35 Cleanup: Nodes: Use const arguments, avoid recursive iteration
Use the node topology cache and avoid modifying the node tree
in a non-threadsafe way to improve the predictability of using
the helper function. Replaces the implementation from
e0d4047136.
2022-11-06 10:23:14 +01:00
Pablo Vazquez
28e952dacd UI: Sort items in Weight Locks menu
Reorder the items in the `Locks` menu:
* Split into three groups: Lock, Unlock, Invert.
* Use icon only in the first item of each group, following the HIG.

Reviewed By: Severin

Differential Revision: https://developer.blender.org/D16383
2022-11-06 02:28:48 +01:00
6ef6778215 Fix: Broken debug build after recent cleanup commit
5060f26f50
2022-11-05 21:37:37 +01:00
8b29d6cd75 Cleanup: Remove unused node function 2022-11-05 21:37:37 +01:00
c6725dc507 Cleanup: Use Vector in group input/output node update functions
Also reduce the scope of variables and use ListBase macros
2022-11-05 21:37:37 +01:00
db3bf36770 Sculpt: Fix T102253: Missing call to SCULPT_automasking_node_update 2022-11-05 11:57:39 -07:00
5060f26f50 Cleanup: Move function to legacy mesh conversion file 2022-11-05 18:14:26 +01:00
455d195d55 OBJ Export: Remove edge recalculation
The removed function call removes all attributes from mesh edges
and rebuilds the mesh edge topology. This isn't necessary because
meshes always have edges in the first place.

Exporting a 4 million face grid, this saved 1.5 seconds out of 4
seconds total for the whole export.

Test files have to be updated, since the edge calculation could
potentially change the order of elements. This is also a fix, since
previously the exporter would delete all attributes on the evaluated
mesh edges.

Differential Revision: https://developer.blender.org/D16391
2022-11-05 16:28:13 +01:00
4ec5a8cbc2 Cleanup: Remove unnecessary node type registration functions
These functions provided little benefit compared to simply setting
the function pointers directly.
2022-11-05 16:10:27 +01:00
e673f3ba24 Cleanup: Remove redundant assignment of loose edge flag
This is assigned by `BKE_mesh_calc_edges_loose` a few lines below.
2022-11-05 13:26:52 +01:00
38086dcfdc Merge branch 'blender-v3.4-release' 2022-11-05 20:03:46 +11:00
412e7d3771 Merge branch 'blender-v3.4-release' 2022-11-05 17:10:23 +11:00
b3e1540c50 Cleanup: use bools and typed enums for WM_job type & flag
Also use typed enum for the event handler flag.
2022-11-05 14:14:39 +11:00
ae3073323e Cleanup: use bool instead of short for job stop & do_update arguments
Since these values are only ever 0/1, use bool type.
2022-11-05 13:47:01 +11:00
4a313b8252 Cleanup: Move legacy mesh conversions to proper file 2022-11-04 23:28:10 +01:00
23dafa4ad6 Cleanup: OBJ: Simplify access to loose edges
Implementing this with a separate function just added extra code,
there wasn't much benefit to it.
2022-11-04 21:58:15 +01:00
59af0fba9d Merge branch 'blender-v3.4-release' 2022-11-04 21:06:39 +01:00
4b200b491c Cleanup: Use mesh API functions 2022-11-04 20:37:17 +01:00
10131a6f62 Cleanup: Mesh: Remove redundant edge render flag
Currently there are both "EDGERENDER" and "EDGEDRAW" flags, which are
almost always used together. Both are runtime data and not exposed to
RNA, used to skip drawing some edges after the subdivision surface
modifier. The render flag is a relic of the Blender internal renderer.
This commit removes the render flag and replaces its uses with the
draw flag.
2022-11-04 20:19:52 +01:00
85ce488298 Realtime Compositor: Implement static cache manager
This patch introduces the concept of a Cached Resource that can be
cached across compositor evaluations as well as used by multiple
operations in the same evaluation. Additionally, this patch implements a
new structure for the realtime compositor, the Static Cache Manager,
that manages all the cached resources and deletes them when they are no
longer needed.

This improves responsiveness while adjusting compositor node trees and
also conserves memory usage.

Differential Revision: https://developer.blender.org/D16357

Reviewed By: Clement Foucault
2022-11-04 16:14:22 +02:00
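
The life cycle described in this commit (resources shared within an evaluation, kept across evaluations, and deleted once unused) can be sketched as follows. This is an illustrative assumption about how such a manager might work, not the realtime compositor's implementation; `CachedResource`, `StaticCacheSketch`, `get` and `reset` are names made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Hypothetical cached resource: a payload plus a flag recording whether it was
// requested during the current evaluation.
struct CachedResource {
  int payload;
  bool needed;
};

class StaticCacheSketch {
 public:
  // Returns the cached resource for `key`, computing it only on the first request.
  CachedResource &get(const std::string &key, const std::function<int()> &compute)
  {
    assert(compute);
    auto it = resources_.find(key);
    if (it == resources_.end()) {
      it = resources_.emplace(key, CachedResource{compute(), false}).first;
    }
    it->second.needed = true;
    return it->second;
  }

  // Call once at the start of every evaluation: resources that were not
  // requested during the previous evaluation are freed, the rest are kept
  // and their flags cleared.
  void reset()
  {
    for (auto it = resources_.begin(); it != resources_.end();) {
      if (!it->second.needed) {
        it = resources_.erase(it);
      }
      else {
        it->second.needed = false;
        ++it;
      }
    }
  }

  std::size_t size() const
  {
    return resources_.size();
  }

 private:
  std::map<std::string, CachedResource> resources_;
};
```

With this scheme a resource survives as long as some operation keeps asking for it, and disappears one evaluation after its last use, which is one way to get both the responsiveness and the memory savings mentioned above.
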
943d574185 Merge branch 'blender-v3.4-release' 2022-11-04 21:59:38 +11:00
4b2458b457 Merge branch 'blender-v3.4-release' 2022-11-04 19:19:23 +11:00
d4f0ccb6b4 Cleanup: pass const view_area in sequencer functions 2022-11-04 18:50:31 +11:00
7a7055c186 Cleanup: use bool for render types ok/result_ok 2022-11-04 18:50:31 +11:00
d6b2f4ad8e Merge branch 'blender-v3.4-release' 2022-11-04 00:03:47 -07:00
624c11d69f BLI_path: remove use of BLI_path_normalize in BLI_path_parent_dir
Normalize is no longer necessary as BLI_path_name_at_index skips
redundant path components such as "//" and "/./".

This has the advantage that the path length isn't limited to FILE_MAX.
2022-11-04 17:22:11 +11:00
3d34b8c901 Merge branch 'blender-v3.4-release' 2022-11-04 16:51:18 +11:00
ecda9483f1 Merge branch 'blender-v3.4-release' 2022-11-03 23:41:11 -04:00
e36db7018d Merge branch 'blender-v3.4-release' 2022-11-04 14:00:14 +11:00
9dfc134c9d DRW: Fix incorrect logic in state redundancy check
Error introduced by rB3c39a3affee7.
2022-11-03 19:41:36 +01:00
c2a99cb0b6 Merge branch 'blender-v3.4-release' 2022-11-03 19:38:36 +01:00
5401e68a61 Merge branch 'blender-v3.4-release' 2022-11-03 12:20:44 -05:00
3c39a3affe DRW: Add support for clip plane count as part of the draw state.
This moves the implementation from the View to the draw manager itself.

However, this is not its final place and should be moved to the shader
create info at some point in the future.
For now it is not possible because of possible interaction with the
old draw manager codebase.
2022-11-03 17:03:22 +01:00
dcfe4a302c Merge branch 'blender-v3.4-release' 2022-11-03 16:52:39 +01:00
41c692ee2f Fix deprecation warnings in FFmpeg related code
The non-deprecated API dates back to 2017, so it should be safe
to simply migrate to it.

Fixes verbose error prints, making it easier to see actual issues.

Differential Revision: https://developer.blender.org/D16370
2022-11-03 15:18:02 +01:00
74c293863d Cycles: Remove use of sprintf() in MD5 code
The new Xcode declares the `sprintf()` function deprecated and
suggests using `snprintf()` as a safer alternative.

This change actually moves away from any formatted printing and
uses inlined byte-to-hex-string conversion which is also safe
and is (unmeasurably) faster.

Differential Revision: https://developer.blender.org/D16378
2022-11-03 15:10:37 +01:00
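
A byte-to-hex-string conversion that avoids formatted printing can look roughly like the sketch below. The function name `bytes_to_hex` is an assumption for illustration; the actual Cycles code may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Converts raw bytes (for example an MD5 digest) to a lowercase hex string
// using a lookup table instead of sprintf()/snprintf().
std::string bytes_to_hex(const uint8_t *bytes, size_t size)
{
  assert(bytes != nullptr || size == 0);
  static const char digits[] = "0123456789abcdef";
  std::string result;
  result.reserve(size * 2);
  for (size_t i = 0; i < size; i++) {
    result.push_back(digits[bytes[i] >> 4]);   /* High nibble. */
    result.push_back(digits[bytes[i] & 0x0f]); /* Low nibble. */
  }
  return result;
}
```

Because every output character is an index into a fixed 16-entry table, there is no format string to parse and no fixed-size buffer that could overflow.
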
90805c9943 Merge branch 'blender-v3.4-release' 2022-11-03 14:32:27 +01:00
a16dd407b3 Revert "Blender 3.4 - Beta"
This reverts commit 666135c32a.
2022-11-03 10:12:46 +01:00
31c52bc34e Merge branch 'blender-v3.4-release' 2022-11-03 10:12:17 +01:00
ba8754cf11 Blender 3.5 Alpha: Start of new release cycle. 2022-11-03 10:11:13 +01:00
1488 changed files with 68431 additions and 37591 deletions


@@ -1239,12 +1239,11 @@ if(WITH_OPENGL)
add_definitions(-DWITH_OPENGL)
endif()
# -----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Configure Vulkan.
if(WITH_VULKAN_BACKEND)
add_definitions(-DWITH_VULKAN_BACKEND)
list(APPEND BLENDER_GL_LIBRARIES ${VULKAN_LIBRARIES})
endif()
# -----------------------------------------------------------------------------


@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-or-later
## Update and uncomment this in the release branch
set(BLENDER_VERSION 3.4)
# set(BLENDER_VERSION 3.1)
function(download_source dep)
set(TARGET_FILE ${${dep}_FILE})


@@ -0,0 +1,59 @@
# SPDX-License-Identifier: BSD-3-Clause
# Copyright 2022 Blender Foundation.
# - Find MoltenVK libraries
# Find the MoltenVK includes and libraries
# This module defines
# MOLTENVK_INCLUDE_DIRS, where to find MoltenVK headers, Set when
# MOLTENVK_INCLUDE_DIR is found.
# MOLTENVK_LIBRARIES, libraries to link against to use MoltenVK.
# MOLTENVK_ROOT_DIR, The base directory to search for MoltenVK.
# This can also be an environment variable.
# MOLTENVK_FOUND, If false, do not try to use MoltenVK.
#
# If MOLTENVK_ROOT_DIR was defined in the environment, use it.
IF(NOT MOLTENVK_ROOT_DIR AND NOT $ENV{MOLTENVK_ROOT_DIR} STREQUAL "")
SET(MOLTENVK_ROOT_DIR $ENV{MOLTENVK_ROOT_DIR})
ENDIF()
SET(_moltenvk_SEARCH_DIRS
${MOLTENVK_ROOT_DIR}
${LIBDIR}/vulkan/MoltenVK
)
FIND_PATH(MOLTENVK_INCLUDE_DIR
NAMES
MoltenVK/vk_mvk_moltenvk.h
HINTS
${_moltenvk_SEARCH_DIRS}
PATH_SUFFIXES
include
)
FIND_LIBRARY(MOLTENVK_LIBRARY
NAMES
MoltenVK
HINTS
${_moltenvk_SEARCH_DIRS}
PATH_SUFFIXES
dylib/macOS
)
# handle the QUIETLY and REQUIRED arguments and set MOLTENVK_FOUND to TRUE if
# all listed variables are TRUE
INCLUDE(FindPackageHandleStandardArgs)
FIND_PACKAGE_HANDLE_STANDARD_ARGS(MoltenVK DEFAULT_MSG MOLTENVK_LIBRARY MOLTENVK_INCLUDE_DIR)
IF(MOLTENVK_FOUND)
SET(MOLTENVK_LIBRARIES ${MOLTENVK_LIBRARY})
SET(MOLTENVK_INCLUDE_DIRS ${MOLTENVK_INCLUDE_DIR})
ENDIF()
MARK_AS_ADVANCED(
MOLTENVK_INCLUDE_DIR
MOLTENVK_LIBRARY
)
UNSET(_moltenvk_SEARCH_DIRS)


@@ -100,6 +100,23 @@ if(WITH_USD)
find_package(USD REQUIRED)
endif()
if(WITH_VULKAN_BACKEND)
find_package(MoltenVK REQUIRED)
if(EXISTS ${LIBDIR}/vulkan)
set(VULKAN_FOUND On)
set(VULKAN_ROOT_DIR ${LIBDIR}/vulkan/macOS)
set(VULKAN_INCLUDE_DIR ${VULKAN_ROOT_DIR}/include)
set(VULKAN_LIBRARY ${VULKAN_ROOT_DIR}/lib/libvulkan.1.dylib)
set(VULKAN_INCLUDE_DIRS ${VULKAN_INCLUDE_DIR} ${MOLTENVK_INCLUDE_DIRS})
set(VULKAN_LIBRARIES ${VULKAN_LIBRARY} ${MOLTENVK_LIBRARIES})
else()
message(WARNING "Vulkan SDK was not found, disabling WITH_VULKAN_BACKEND")
set(WITH_VULKAN_BACKEND OFF)
endif()
endif()
if(WITH_OPENSUBDIV)
find_package(OpenSubdiv)
endif()


@@ -108,6 +108,10 @@ find_package_wrapper(ZLIB REQUIRED)
find_package_wrapper(Zstd REQUIRED)
find_package_wrapper(Epoxy REQUIRED)
if(WITH_VULKAN_BACKEND)
find_package_wrapper(Vulkan REQUIRED)
endif()
function(check_freetype_for_brotli)
include(CheckSymbolExists)
set(CMAKE_REQUIRED_INCLUDES ${FREETYPE_INCLUDE_DIRS})
@@ -322,9 +326,10 @@ if(WITH_CYCLES AND WITH_CYCLES_DEVICE_ONEAPI)
file(GLOB _sycl_runtime_libraries
${SYCL_ROOT_DIR}/lib/libsycl.so
${SYCL_ROOT_DIR}/lib/libsycl.so.*
${SYCL_ROOT_DIR}/lib/libpi_level_zero.so
${SYCL_ROOT_DIR}/lib/libpi_*.so
)
list(FILTER _sycl_runtime_libraries EXCLUDE REGEX ".*\.py")
list(REMOVE_ITEM _sycl_runtime_libraries "${SYCL_ROOT_DIR}/lib/libpi_opencl.so")
list(APPEND PLATFORM_BUNDLED_LIBRARIES ${_sycl_runtime_libraries})
unset(_sycl_runtime_libraries)
endif()
@@ -965,16 +970,9 @@ if(WITH_COMPILER_CCACHE)
endif()
endif()
# On some platforms certain atomic operations are not possible with assembly and/or intrinsics and
# they are emulated in software with locks. For example, on armel there is no intrinsics to grant
# 64 bit atomic operations and STL library uses libatomic to offload software emulation of atomics
# to.
# This function will check whether libatomic is required and if so will configure linker flags.
# If atomic operations are possible without libatomic then linker flags are left as-is.
function(CONFIGURE_ATOMIC_LIB_IF_NEEDED)
# Source which is used to enforce situation when software emulation of atomics is required.
# Assume that using 64bit integer gives a definitive answer (as in, if 64bit atomic operations
# are possible using assembly/intrinsics 8, 16, and 32 bit operations will also be possible.
# Always link with libatomic if available, as it is required for data types
# which don't have intrinsics.
function(configure_atomic_lib_if_needed)
set(_source
"#include <atomic>
#include <cstdint>
@@ -985,25 +983,12 @@ function(CONFIGURE_ATOMIC_LIB_IF_NEEDED)
)
include(CheckCXXSourceCompiles)
check_cxx_source_compiles("${_source}" ATOMIC_OPS_WITHOUT_LIBATOMIC)
set(CMAKE_REQUIRED_LIBRARIES atomic)
check_cxx_source_compiles("${_source}" ATOMIC_OPS_WITH_LIBATOMIC)
unset(CMAKE_REQUIRED_LIBRARIES)
if(NOT ATOMIC_OPS_WITHOUT_LIBATOMIC)
# Compilation of the test program has failed.
# Try it again with -latomic to see if this is what is needed, or whether something else is
# going on.
set(CMAKE_REQUIRED_LIBRARIES atomic)
check_cxx_source_compiles("${_source}" ATOMIC_OPS_WITH_LIBATOMIC)
unset(CMAKE_REQUIRED_LIBRARIES)
if(ATOMIC_OPS_WITH_LIBATOMIC)
set(PLATFORM_LINKFLAGS "${PLATFORM_LINKFLAGS} -latomic" PARENT_SCOPE)
else()
# Atomic operations are required part of Blender and it is not possible to process forward.
# We expect that either standard library or libatomic will make atomics to work. If both
# cases has failed something fishy o na bigger scope is going on.
message(FATAL_ERROR "Failed to detect required configuration for atomic operations")
endif()
if(ATOMIC_OPS_WITH_LIBATOMIC)
set(PLATFORM_LINKFLAGS "${PLATFORM_LINKFLAGS} -latomic" PARENT_SCOPE)
endif()
endfunction()


@@ -419,7 +419,7 @@ if(WITH_IMAGE_OPENEXR)
warn_hardcoded_paths(OpenEXR)
set(OPENEXR ${LIBDIR}/openexr)
set(OPENEXR_INCLUDE_DIR ${OPENEXR}/include)
set(OPENEXR_INCLUDE_DIRS ${OPENEXR_INCLUDE_DIR} ${IMATH_INCLUDE_DIRS} ${OPENEXR}/include/OpenEXR)
set(OPENEXR_INCLUDE_DIRS ${OPENEXR_INCLUDE_DIR} ${IMATH_INCLUDE_DIRS} ${OPENEXR_INCLUDE_DIR}/OpenEXR)
set(OPENEXR_LIBPATH ${OPENEXR}/lib)
# Check if the 3.x library name exists
# if not assume this is a 2.x library folder
@@ -568,7 +568,8 @@ if(WITH_OPENIMAGEIO)
if(NOT OpenImageIO_FOUND)
set(OPENIMAGEIO ${LIBDIR}/OpenImageIO)
set(OPENIMAGEIO_LIBPATH ${OPENIMAGEIO}/lib)
set(OPENIMAGEIO_INCLUDE_DIRS ${OPENIMAGEIO}/include)
set(OPENIMAGEIO_INCLUDE_DIR ${OPENIMAGEIO}/include)
set(OPENIMAGEIO_INCLUDE_DIRS ${OPENIMAGEIO_INCLUDE_DIR})
set(OIIO_OPTIMIZED optimized ${OPENIMAGEIO_LIBPATH}/OpenImageIO.lib optimized ${OPENIMAGEIO_LIBPATH}/OpenImageIO_Util.lib)
set(OIIO_DEBUG debug ${OPENIMAGEIO_LIBPATH}/OpenImageIO_d.lib debug ${OPENIMAGEIO_LIBPATH}/OpenImageIO_Util_d.lib)
set(OPENIMAGEIO_LIBRARIES ${OIIO_OPTIMIZED} ${OIIO_DEBUG})
@@ -785,6 +786,14 @@ if(WITH_CYCLES AND WITH_CYCLES_OSL)
endif()
find_path(OSL_INCLUDE_DIR OSL/oslclosure.h PATHS ${CYCLES_OSL}/include)
find_program(OSL_COMPILER NAMES oslc PATHS ${CYCLES_OSL}/bin)
file(STRINGS "${OSL_INCLUDE_DIR}/OSL/oslversion.h" OSL_LIBRARY_VERSION_MAJOR
REGEX "^[ \t]*#define[ \t]+OSL_LIBRARY_VERSION_MAJOR[ \t]+[0-9]+.*$")
file(STRINGS "${OSL_INCLUDE_DIR}/OSL/oslversion.h" OSL_LIBRARY_VERSION_MINOR
REGEX "^[ \t]*#define[ \t]+OSL_LIBRARY_VERSION_MINOR[ \t]+[0-9]+.*$")
string(REGEX REPLACE ".*#define[ \t]+OSL_LIBRARY_VERSION_MAJOR[ \t]+([.0-9]+).*"
"\\1" OSL_LIBRARY_VERSION_MAJOR ${OSL_LIBRARY_VERSION_MAJOR})
string(REGEX REPLACE ".*#define[ \t]+OSL_LIBRARY_VERSION_MINOR[ \t]+([.0-9]+).*"
"\\1" OSL_LIBRARY_VERSION_MINOR ${OSL_LIBRARY_VERSION_MINOR})
endif()
if(WITH_CYCLES AND WITH_CYCLES_EMBREE)
@@ -917,6 +926,20 @@ if(WITH_HARU)
set(HARU_LIBRARIES ${HARU_ROOT_DIR}/lib/libhpdfs.lib)
endif()
if(WITH_VULKAN_BACKEND)
if(EXISTS ${LIBDIR}/vulkan)
set(VULKAN_FOUND On)
set(VULKAN_ROOT_DIR ${LIBDIR}/vulkan)
set(VULKAN_INCLUDE_DIR ${VULKAN_ROOT_DIR}/include)
set(VULKAN_INCLUDE_DIRS ${VULKAN_INCLUDE_DIR})
set(VULKAN_LIBRARY ${VULKAN_ROOT_DIR}/lib/vulkan-1.lib)
set(VULKAN_LIBRARIES ${VULKAN_LIBRARY})
else()
message(WARNING "Vulkan SDK was not found, disabling WITH_VULKAN_BACKEND")
set(WITH_VULKAN_BACKEND OFF)
endif()
endif()
if(WITH_CYCLES AND WITH_CYCLES_PATH_GUIDING)
find_package(openpgl QUIET)
if(openpgl_FOUND)
@@ -949,7 +972,13 @@ if(WITH_CYCLES AND WITH_CYCLES_DEVICE_ONEAPI)
endforeach()
unset(_sycl_runtime_libraries_glob)
list(APPEND _sycl_runtime_libraries ${SYCL_ROOT_DIR}/bin/pi_level_zero.dll)
file(GLOB _sycl_pi_runtime_libraries_glob
${SYCL_ROOT_DIR}/bin/pi_*.dll
)
list(REMOVE_ITEM _sycl_pi_runtime_libraries_glob "${SYCL_ROOT_DIR}/bin/pi_opencl.dll")
list (APPEND _sycl_runtime_libraries ${_sycl_pi_runtime_libraries_glob})
unset(_sycl_pi_runtime_libraries_glob)
list(APPEND PLATFORM_BUNDLED_LIBRARIES ${_sycl_runtime_libraries})
unset(_sycl_runtime_libraries)
endif()


@@ -5,38 +5,38 @@
update-code:
git:
submodules:
- branch: blender-v3.4-release
- branch: master
commit_id: HEAD
path: release/scripts/addons
- branch: blender-v3.4-release
- branch: master
commit_id: HEAD
path: release/scripts/addons_contrib
- branch: blender-v3.4-release
- branch: master
commit_id: HEAD
path: release/datafiles/locale
- branch: blender-v3.4-release
- branch: master
commit_id: HEAD
path: source/tools
svn:
libraries:
darwin-arm64:
branch: tags/blender-3.4-release
branch: trunk
commit_id: HEAD
path: lib/darwin_arm64
darwin-x86_64:
branch: tags/blender-3.4-release
branch: trunk
commit_id: HEAD
path: lib/darwin
linux-x86_64:
branch: tags/blender-3.4-release
branch: trunk
commit_id: HEAD
path: lib/linux_centos7_x86_64
windows-amd64:
branch: tags/blender-3.4-release
branch: trunk
commit_id: HEAD
path: lib/win64_vc15
tests:
branch: tags/blender-3.4-release
branch: trunk
commit_id: HEAD
path: lib/tests
benchmarks:


@@ -69,6 +69,7 @@ Thanks to Tyler Alden Gubala for maintaining the original version of this packag
# ------------------------------------------------------------------------------
# Generic Functions
def find_dominating_file(
path: str,
search: Sequence[str],


@@ -38,7 +38,7 @@ PROJECT_NAME = Blender
# could be handy for archiving the generated documentation or if some version
# control system is used.
PROJECT_NUMBER = V3.4
PROJECT_NUMBER = V3.5
# Using the PROJECT_BRIEF tag one can provide an optional one line description
# for a project that appears at the top of each page and should give viewer a


@@ -870,6 +870,26 @@ an issue but, due to internal implementation details, currently are:
thus breaking any current iteration over ``Collection.all_objects``.
.. rubric:: Do not:
.. code-block:: python
# `all_objects` is an iterator. Using it directly while performing operations on its members that will update
# the memory accessed by the `all_objects` iterator will lead to invalid memory accesses and crashes.
for object in bpy.data.collections["Collection"].all_objects:
object.hide_viewport = True
.. rubric:: Do:
.. code-block:: python
# `all_objects[:]` is an independent list generated from the iterator. As long as no objects are deleted,
# its content will remain valid even if the data accessed by the `all_objects` iterator is modified.
for object in bpy.data.collections["Collection"].all_objects[:]:
object.hide_viewport = True
sys.exit
========


@@ -1294,6 +1294,7 @@ def pycontext2sphinx(basepath):
type_descr = prop.get_type_description(
class_fmt=":class:`bpy.types.%s`",
mathutils_fmt=":class:`mathutils.%s`",
collection_id=_BPY_PROP_COLLECTION_ID,
enum_descr_override=enum_descr_override,
)
@@ -1446,6 +1447,7 @@ def pyrna2sphinx(basepath):
identifier = " %s" % prop.identifier
kwargs["class_fmt"] = ":class:`%s`"
kwargs["mathutils_fmt"] = ":class:`mathutils.%s`"
kwargs["collection_id"] = _BPY_PROP_COLLECTION_ID
@@ -1565,6 +1567,7 @@ def pyrna2sphinx(basepath):
type_descr = prop.get_type_description(
class_fmt=":class:`%s`",
mathutils_fmt=":class:`mathutils.%s`",
collection_id=_BPY_PROP_COLLECTION_ID,
enum_descr_override=enum_descr_override,
)
@@ -1631,6 +1634,7 @@ def pyrna2sphinx(basepath):
type_descr = prop.get_type_description(
as_ret=True, class_fmt=":class:`%s`",
mathutils_fmt=":class:`mathutils.%s`",
collection_id=_BPY_PROP_COLLECTION_ID,
enum_descr_override=enum_descr_override,
)


@@ -91,3 +91,7 @@ endif()
if(WITH_COMPOSITOR_CPU)
add_subdirectory(smaa_areatex)
endif()
if(WITH_VULKAN_BACKEND)
add_subdirectory(vulkan_memory_allocator)
endif()


@@ -0,0 +1,24 @@
# SPDX-License-Identifier: GPL-2.0-or-later
# Copyright 2022 Blender Foundation. All rights reserved.
set(INC
.
)
set(INC_SYS
${VULKAN_INCLUDE_DIRS}
)
set(SRC
vk_mem_alloc_impl.cc
vk_mem_alloc.h
)
blender_add_lib(extern_vulkan_memory_allocator "${SRC}" "${INC}" "${INC_SYS}" "${LIB}")
if(CMAKE_COMPILER_IS_GNUCC OR CMAKE_C_COMPILER_ID MATCHES "Clang")
target_compile_options(extern_vulkan_memory_allocator
PRIVATE "-Wno-nullability-completeness"
)
endif()


@@ -0,0 +1,19 @@
Copyright (c) 2017-2022 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.


@@ -0,0 +1,5 @@
Project: VulkanMemoryAllocator
URL: https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator
License: MIT
Upstream version: a6bfc23
Local modifications: None

extern/vulkan_memory_allocator/README.md vendored Normal file

@@ -0,0 +1,175 @@
# Vulkan Memory Allocator
Easy to integrate Vulkan memory allocation library.
**Documentation:** Browse online: [Vulkan Memory Allocator](https://gpuopen-librariesandsdks.github.io/VulkanMemoryAllocator/html/) (generated from Doxygen-style comments in [include/vk_mem_alloc.h](include/vk_mem_alloc.h))
**License:** MIT. See [LICENSE.txt](LICENSE.txt)
**Changelog:** See [CHANGELOG.md](CHANGELOG.md)
**Product page:** [Vulkan Memory Allocator on GPUOpen](https://gpuopen.com/gaming-product/vulkan-memory-allocator/)
**Build status:**
- Windows: [![Build status](https://ci.appveyor.com/api/projects/status/4vlcrb0emkaio2pn/branch/master?svg=true)](https://ci.appveyor.com/project/adam-sawicki-amd/vulkanmemoryallocator/branch/master)
- Linux: [![Build Status](https://app.travis-ci.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.svg?branch=master)](https://app.travis-ci.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator)
[![Average time to resolve an issue](http://isitmaintained.com/badge/resolution/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.svg)](http://isitmaintained.com/project/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator "Average time to resolve an issue")
# Problem
Memory allocation and resource (buffer and image) creation in Vulkan is difficult (compared to older graphics APIs, like D3D11 or OpenGL) for several reasons:
- It requires a lot of boilerplate code, just like everything else in Vulkan, because it is a low-level and high-performance API.
- There is an additional level of indirection: `VkDeviceMemory` is allocated separately from creating `VkBuffer`/`VkImage` and they must be bound together.
- Driver must be queried for supported memory heaps and memory types. Different GPU vendors provide different types of it.
- It is recommended to allocate bigger chunks of memory and assign parts of them to particular resources, as there is a limit on maximum number of memory blocks that can be allocated.
# Features
This library can help game developers to manage memory allocations and resource creation by offering some higher-level functions:
1. Functions that help to choose correct and optimal memory type based on intended usage of the memory.
   - Required or preferred traits of the memory are expressed using a higher-level description compared to Vulkan flags.
2. Functions that allocate memory blocks, reserve and return parts of them (`VkDeviceMemory` + offset + size) to the user.
   - Library keeps track of allocated memory blocks, used and unused ranges inside them, finds best matching unused ranges for new allocations, respects all the rules of alignment and buffer/image granularity.
3. Functions that can create an image/buffer, allocate memory for it and bind them together - all in one call.
Additional features:
- Well-documented - description of all functions and structures provided, along with chapters that contain general description and example code.
- Thread-safety: Library is designed to be used in multithreaded code. Access to a single device memory block referred by different buffers and textures (binding, mapping) is synchronized internally. Memory mapping is reference-counted.
- Configuration: Fill optional members of `VmaAllocatorCreateInfo` structure to provide custom CPU memory allocator, pointers to Vulkan functions and other parameters.
- Customization and integration with custom engines: Predefine appropriate macros to provide your own implementation of all external facilities used by the library like assert, mutex, atomic.
- Support for memory mapping, reference-counted internally. Support for persistently mapped memory: Just allocate with appropriate flag and access the pointer to already mapped memory.
- Support for non-coherent memory. Functions that flush/invalidate memory. `nonCoherentAtomSize` is respected automatically.
- Support for resource aliasing (overlap).
- Support for sparse binding and sparse residency: Convenience functions that allocate or free multiple memory pages at once.
- Custom memory pools: Create a pool with desired parameters (e.g. fixed or limited maximum size) and allocate memory out of it.
- Linear allocator: Create a pool with linear algorithm and use it for much faster allocations and deallocations in free-at-once, stack, double stack, or ring buffer fashion.
- Support for Vulkan 1.0, 1.1, 1.2, 1.3.
- Support for extensions (and equivalent functionality included in new Vulkan versions):
  - VK_KHR_dedicated_allocation: Just enable it and it will be used automatically by the library.
  - VK_KHR_buffer_device_address: Flag `VK_MEMORY_ALLOCATE_DEVICE_ADDRESS_BIT_KHR` is automatically added to memory allocations where needed.
  - VK_EXT_memory_budget: Used internally if available to query for current usage and budget. If not available, it falls back to an estimation based on memory heap sizes.
  - VK_EXT_memory_priority: Set `priority` of allocations or custom pools and it will be set automatically using this extension.
  - VK_AMD_device_coherent_memory
- Defragmentation of GPU and CPU memory: Let the library move data around to free some memory blocks and make your allocations better compacted.
- Statistics: Obtain brief or detailed statistics about the amount of memory used, unused, number of allocated blocks, number of allocations etc. - globally, per memory heap, and per memory type.
- Debug annotations: Associate custom `void* pUserData` and debug `char* pName` with each allocation.
- JSON dump: Obtain a string in JSON format with detailed map of internal state, including list of allocations, their string names, and gaps between them.
  - Convert this JSON dump into a picture to visualize your memory. See [tools/GpuMemDumpVis](tools/GpuMemDumpVis/README.md).
- Debugging incorrect memory usage: Enable initialization of all allocated memory with a bit pattern to detect usage of uninitialized or freed memory. Enable validation of a magic number after every allocation to detect out-of-bounds memory corruption.
- Support for interoperability with OpenGL.
- Virtual allocator: Interface for using core allocation algorithm to allocate any custom data, e.g. pieces of one large buffer.
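
The "linear" and "virtual" allocation ideas from the list above can be illustrated with a small, library-independent sketch. It hands out offsets inside one large block in stack fashion and frees everything at once. The class `LinearOffsetAllocator` is hypothetical and is not part of the VMA API; it only shows the kind of bookkeeping such an allocator performs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bump allocator over a block of `capacity` bytes. It manages
// offsets only, so the block itself could be a GPU buffer or any other storage.
class LinearOffsetAllocator {
 public:
  explicit LinearOffsetAllocator(std::size_t capacity) : capacity_(capacity) {}

  // Reserves `size` bytes aligned to `alignment` (a power of two).
  // Returns false and leaves `r_offset` untouched when the block is full.
  bool allocate(std::size_t size, std::size_t alignment, std::size_t &r_offset)
  {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::size_t aligned = (top_ + alignment - 1) & ~(alignment - 1);
    if (aligned > capacity_ || size > capacity_ - aligned) {
      return false;
    }
    r_offset = aligned;
    top_ = aligned + size;
    return true;
  }

  // Frees all allocations at once, which is what makes this scheme cheap.
  void free_all()
  {
    top_ = 0;
  }

  std::size_t used() const
  {
    return top_;
  }

 private:
  std::size_t capacity_;
  std::size_t top_ = 0;
};
```

Each allocation is a single addition and comparison, so this scheme suits per-frame scratch data that is released together. The trade-off is that individual allocations cannot be freed out of order.
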
# Prerequisites
- Self-contained C++ library in single header file. No external dependencies other than standard C and C++ library and of course Vulkan. Some features of C++14 used. STL containers, RTTI, or C++ exceptions are not used.
- Public interface in C, in same convention as Vulkan API. Implementation in C++.
- Error handling implemented by returning `VkResult` error codes - same way as in Vulkan.
- Interface documented using Doxygen-style comments.
- Platform-independent, but developed and tested on Windows using Visual Studio. Continuous integration is set up for Windows and Linux. Also used on Android, macOS, and other platforms.
# Example
Basic usage of this library is very simple. Advanced features are optional. After you have created a global `VmaAllocator` object, the complete code needed to create a buffer may look like this:
```cpp
VkBufferCreateInfo bufferInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
bufferInfo.size = 65536;
bufferInfo.usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;
VmaAllocationCreateInfo allocInfo = {};
allocInfo.usage = VMA_MEMORY_USAGE_AUTO;
VkBuffer buffer;
VmaAllocation allocation;
vmaCreateBuffer(allocator, &bufferInfo, &allocInfo, &buffer, &allocation, nullptr);
```
With this one function call:
1. `VkBuffer` is created.
2. `VkDeviceMemory` block is allocated if needed.
3. An unused region of the memory block is bound to this buffer.
`VmaAllocation` is an object that represents memory assigned to this buffer. It can be queried for parameters like `VkDeviceMemory` handle and offset.
# How to build
On Windows it is recommended to use [CMake UI](https://cmake.org/runningcmake/). Alternatively, you can generate a Visual Studio project map using CMake from the command line: `cmake -B./build/ -DCMAKE_BUILD_TYPE=Debug -G "Visual Studio 16 2019" -A x64 ./`
On Linux:
```
mkdir build
cd build
cmake ..
make
```
The following targets are available:
| Target | Description | CMake option | Default setting |
| ------------- | ------------- | ------------- | ------------- |
| VmaSample | VMA sample application | `VMA_BUILD_SAMPLE` | `OFF` |
| VmaBuildSampleShaders | Shaders for VmaSample | `VMA_BUILD_SAMPLE_SHADERS` | `OFF` |
Please note that while the VulkanMemoryAllocator library is supported on platforms other than Windows, VmaSample is not.
These CMake options are available:
| CMake option | Description | Default setting |
| ------------- | ------------- | ------------- |
| `VMA_RECORDING_ENABLED` | Enable VMA memory recording for debugging | `OFF` |
| `VMA_USE_STL_CONTAINERS` | Use C++ STL containers instead of VMA's containers | `OFF` |
| `VMA_STATIC_VULKAN_FUNCTIONS` | Link statically with Vulkan API | `OFF` |
| `VMA_DYNAMIC_VULKAN_FUNCTIONS` | Fetch pointers to Vulkan functions internally (no static linking) | `ON` |
| `VMA_DEBUG_ALWAYS_DEDICATED_MEMORY` | Every allocation will have its own memory block | `OFF` |
| `VMA_DEBUG_INITIALIZE_ALLOCATIONS` | Automatically fill new allocations and destroyed allocations with some bit pattern | `OFF` |
| `VMA_DEBUG_GLOBAL_MUTEX` | Enable single mutex protecting all entry calls to the library | `OFF` |
| `VMA_DEBUG_DONT_EXCEED_MAX_MEMORY_ALLOCATION_COUNT` | Never exceed [VkPhysicalDeviceLimits::maxMemoryAllocationCount](https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#limits-maxMemoryAllocationCount) and return error | `OFF` |
# Binaries
The release comes with a precompiled binary executable of the "VulkanSample" application, which contains the test suite. It is compiled using Visual Studio 2019, so it requires the appropriate libraries to run, including "MSVCP140.dll", "VCRUNTIME140.dll", and "VCRUNTIME140_1.dll". If the launch fails with an error message about those files missing, please download and install the "x64" version of [Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017 and 2019](https://support.microsoft.com/en-us/help/2977003/the-latest-supported-visual-c-downloads).
# Read more
See **[Documentation](https://gpuopen-librariesandsdks.github.io/VulkanMemoryAllocator/html/)**.
# Software using this library
- **[X-Plane](https://x-plane.com/)**
- **[Detroit: Become Human](https://gpuopen.com/learn/porting-detroit-3/)**
- **[Vulkan Samples](https://github.com/LunarG/VulkanSamples)** - official Khronos Vulkan samples. License: Apache-style.
- **[Anvil](https://github.com/GPUOpen-LibrariesAndSDKs/Anvil)** - cross-platform framework for Vulkan. License: MIT.
- **[Filament](https://github.com/google/filament)** - physically based rendering engine for Android, Windows, Linux and macOS, from Google. License: Apache 2.0.
- **[Atypical Games - proprietary game engine](https://developer.samsung.com/galaxy-gamedev/gamedev-blog/infinitejet.html)**
- **[Flax Engine](https://flaxengine.com/)**
- **[Godot Engine](https://github.com/godotengine/godot/)** - multi-platform 2D and 3D game engine. License: MIT.
- **[Lightweight Java Game Library (LWJGL)](https://www.lwjgl.org/)** - includes binding of the library for Java. License: BSD.
- **[PowerVR SDK](https://github.com/powervr-graphics/Native_SDK)** - C++ cross-platform 3D graphics SDK, from Imagination. License: MIT.
- **[Skia](https://github.com/google/skia)** - complete 2D graphic library for drawing Text, Geometries, and Images, from Google.
- **[The Forge](https://github.com/ConfettiFX/The-Forge)** - cross-platform rendering framework. License: Apache 2.0.
- **[VK9](https://github.com/disks86/VK9)** - Direct3D 9 compatibility layer using Vulkan. License: Zlib.
- **[vkDOOM3](https://github.com/DustinHLand/vkDOOM3)** - Vulkan port of GPL DOOM 3 BFG Edition. License: GNU GPL.
- **[vkQuake2](https://github.com/kondrak/vkQuake2)** - vanilla Quake 2 with Vulkan support. License: GNU GPL.
- **[Vulkan Best Practice for Mobile Developers](https://github.com/ARM-software/vulkan_best_practice_for_mobile_developers)** from ARM. License: MIT.
- **[RPCS3](https://github.com/RPCS3/rpcs3)** - PlayStation 3 emulator/debugger. License: GNU GPLv2.
- **[PPSSPP](https://github.com/hrydgard/ppsspp)** - PlayStation Portable emulator/debugger. License: GNU GPLv2+.
[Many other projects on GitHub](https://github.com/search?q=AMD_VULKAN_MEMORY_ALLOCATOR_H&type=Code) and some game development studios that use Vulkan in their games.
# See also
- **[D3D12 Memory Allocator](https://github.com/GPUOpen-LibrariesAndSDKs/D3D12MemoryAllocator)** - equivalent library for Direct3D 12. License: MIT.
- **[Awesome Vulkan](https://github.com/vinjn/awesome-vulkan)** - a curated list of awesome Vulkan libraries, debuggers and resources.
- **[VulkanMemoryAllocator-Hpp](https://github.com/malte-v/VulkanMemoryAllocator-Hpp)** - C++ binding for this library. License: CC0-1.0.
- **[PyVMA](https://github.com/realitix/pyvma)** - Python wrapper for this library. Author: Jean-Sébastien B. (@realitix). License: Apache 2.0.
- **[vk-mem](https://github.com/gwihlidal/vk-mem-rs)** - Rust binding for this library. Author: Graham Wihlidal. License: Apache 2.0 or MIT.
- **[Haskell bindings](https://hackage.haskell.org/package/VulkanMemoryAllocator)**, **[github](https://github.com/expipiplus1/vulkan/tree/master/VulkanMemoryAllocator)** - Haskell bindings for this library. Author: Ellie Hermaszewska (@expipiplus1). License: BSD-3-Clause.
- **[vma_sample_sdl](https://github.com/rextimmy/vma_sample_sdl)** - SDL port of the sample app of this library (with the goal of running it on multiple platforms, including macOS). Author: @rextimmy. License: MIT.
- **[vulkan-malloc](https://github.com/dylanede/vulkan-malloc)** - Vulkan memory allocation library for Rust. Based on version 1 of this library. Author: Dylan Ede (@dylanede). License: MIT / Apache 2.0.

File diff suppressed because it is too large.

View File

@@ -0,0 +1,12 @@
/* SPDX-License-Identifier: GPL-2.0-or-later
* Copyright 2022 Blender Foundation. All rights reserved. */
#ifdef __APPLE__
# include <MoltenVK/vk_mvk_moltenvk.h>
#else
# include <vulkan/vulkan.h>
#endif
#define VMA_IMPLEMENTATION
#include "vk_mem_alloc.h"

View File

@@ -253,6 +253,33 @@ if(WITH_CYCLES_OSL)
)
endif()
if(WITH_CYCLES_DEVICE_CUDA OR WITH_CYCLES_DEVICE_OPTIX)
add_definitions(-DWITH_CUDA)
if(WITH_CUDA_DYNLOAD)
include_directories(
../../extern/cuew/include
)
add_definitions(-DWITH_CUDA_DYNLOAD)
else()
include_directories(
SYSTEM
${CUDA_TOOLKIT_INCLUDE}
)
endif()
endif()
if(WITH_CYCLES_DEVICE_HIP)
add_definitions(-DWITH_HIP)
if(WITH_HIP_DYNLOAD)
include_directories(
../../extern/hipew/include
)
add_definitions(-DWITH_HIP_DYNLOAD)
endif()
endif()
if(WITH_CYCLES_DEVICE_OPTIX)
find_package(OptiX 7.3.0)
@@ -261,12 +288,16 @@ if(WITH_CYCLES_DEVICE_OPTIX)
include_directories(
SYSTEM
${OPTIX_INCLUDE_DIR}
)
)
else()
set_and_warn_library_found("OptiX" OPTIX_FOUND WITH_CYCLES_DEVICE_OPTIX)
endif()
endif()
if(WITH_CYCLES_DEVICE_METAL)
add_definitions(-DWITH_METAL)
endif()
if (WITH_CYCLES_DEVICE_ONEAPI)
add_definitions(-DWITH_ONEAPI)
endif()

View File

@@ -58,7 +58,7 @@ class CyclesRender(bpy.types.RenderEngine):
if not self.session:
if self.is_preview:
cscene = bpy.context.scene.cycles
use_osl = cscene.shading_system and cscene.device == 'CPU'
use_osl = cscene.shading_system
engine.create(self, data, preview_osl=use_osl)
else:

View File

@@ -156,6 +156,11 @@ def with_osl():
return _cycles.with_osl
def osl_version():
import _cycles
return _cycles.osl_version
def with_path_guiding():
import _cycles
return _cycles.with_path_guiding
@@ -199,7 +204,6 @@ def list_render_passes(scene, srl):
if crl.use_pass_volume_indirect: yield ("VolumeInd", "RGB", 'COLOR')
if srl.use_pass_emit: yield ("Emit", "RGB", 'COLOR')
if srl.use_pass_environment: yield ("Env", "RGB", 'COLOR')
if srl.use_pass_shadow: yield ("Shadow", "RGB", 'COLOR')
if srl.use_pass_ambient_occlusion: yield ("AO", "RGB", 'COLOR')
if crl.use_pass_shadow_catcher: yield ("Shadow Catcher", "RGB", 'COLOR')
# autopep8: on

View File

@@ -86,6 +86,29 @@ enum_sampling_pattern = (
('PROGRESSIVE_MULTI_JITTER', "Progressive Multi-Jitter", "Use Progressive Multi-Jitter random sampling pattern", 1),
)
enum_emission_sampling = (
('NONE',
'None',
"Do not use this surface as a light for sampling",
0),
('AUTO',
'Auto',
"Automatically determine if the surface should be treated as a light for sampling, based on estimated emission intensity",
1),
('FRONT',
'Front',
"Treat only front side of the surface as a light, usually for closed meshes whose interior is not visible",
2),
('BACK',
'Back',
"Treat only back side of the surface as a light for sampling",
3),
('FRONT_BACK',
'Front and Back',
"Treat surface as a light for sampling, emitting from both the front and back side",
4),
)
enum_volume_sampling = (
('DISTANCE',
"Distance",
@@ -147,7 +170,6 @@ enum_view3d_shading_render_pass = (
('EMISSION', "Emission", "Show the Emission render pass"),
('BACKGROUND', "Background", "Show the Background render pass"),
('AO', "Ambient Occlusion", "Show the Ambient Occlusion render pass"),
('SHADOW', "Shadow", "Show the Shadow render pass"),
('SHADOW_CATCHER', "Shadow Catcher", "Show the Shadow Catcher render pass"),
('', "Light", ""),
@@ -290,7 +312,7 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
)
shading_system: BoolProperty(
name="Open Shading Language",
description="Use Open Shading Language (CPU rendering only)",
description="Use Open Shading Language",
)
preview_pause: BoolProperty(
@@ -481,6 +503,12 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
default='MULTIPLE_IMPORTANCE_SAMPLING',
)
use_light_tree: BoolProperty(
name="Light Tree",
description="Sample multiple lights more efficiently based on estimated contribution at every shading point",
default=True,
)
min_light_bounces: IntProperty(
name="Min Light Bounces",
description="Minimum number of light bounces. Setting this higher reduces noise in the first bounces, "
@@ -622,7 +650,7 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
transparent_max_bounces: IntProperty(
name="Transparent Max Bounces",
description="Maximum number of transparent bounces. This is independent of maximum number of other bounces ",
description="Maximum number of transparent bounces. This is independent of maximum number of other bounces",
min=0, max=1024,
default=8,
)
@@ -1043,13 +1071,13 @@ class CyclesCameraSettings(bpy.types.PropertyGroup):
class CyclesMaterialSettings(bpy.types.PropertyGroup):
sample_as_light: BoolProperty(
name="Multiple Importance Sample",
description="Use multiple importance sampling for this material, "
"disabling may reduce overall noise for large "
"objects that emit little light compared to other light sources",
default=True,
emission_sampling: EnumProperty(
name="Emission Sampling",
description="Sampling strategy for emissive surfaces",
items=enum_emission_sampling,
default="AUTO",
)
use_transparent_shadow: BoolProperty(
name="Transparent Shadows",
description="Use transparent shadows for this material if it contains a Transparent BSDF, "
@@ -1642,7 +1670,7 @@ class CyclesPreferences(bpy.types.AddonPreferences):
col.label(text="and Windows driver version 101.3430 or newer", icon='BLANK1')
elif sys.platform.startswith("linux"):
col.label(text="Requires Intel GPU with Xe-HPG architecture and", icon='BLANK1')
col.label(text=" - Linux driver version xx.xx.23904 or newer", icon='BLANK1')
col.label(text=" - intel-level-zero-gpu version 1.3.23904 or newer", icon='BLANK1')
col.label(text=" - oneAPI Level-Zero Loader", icon='BLANK1')
elif device_type == 'METAL':
col.label(text="Requires Apple Silicon with macOS 12.2 or newer", icon='BLANK1')
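Note on the `enum_emission_sampling` items added earlier in this file: a minimal sketch of how they can be read, assuming the usual Blender item layout `(identifier, name, description, number)`. The names `IDENTIFIER_TO_NUMBER` and `label_for` are hypothetical helpers for illustration and are not part of the add-on.

```python
# Items copied from the hunk above: (identifier, name, description, number).
ENUM_EMISSION_SAMPLING = (
    ('NONE', 'None',
     "Do not use this surface as a light for sampling", 0),
    ('AUTO', 'Auto',
     "Automatically determine if the surface should be treated as a light "
     "for sampling, based on estimated emission intensity", 1),
    ('FRONT', 'Front',
     "Treat only front side of the surface as a light, usually for closed "
     "meshes whose interior is not visible", 2),
    ('BACK', 'Back',
     "Treat only back side of the surface as a light for sampling", 3),
    ('FRONT_BACK', 'Front and Back',
     "Treat surface as a light for sampling, emitting from both the front "
     "and back side", 4),
)

# Hypothetical helper: map each identifier to its stored integer value.
IDENTIFIER_TO_NUMBER = {item[0]: item[3] for item in ENUM_EMISSION_SAMPLING}


def label_for(identifier):
    """Return the UI name for an identifier (hypothetical helper)."""
    for ident, name, _description, _number in ENUM_EMISSION_SAMPLING:
        if ident == identifier:
            return name
    raise KeyError(identifier)
```

The numbers are what the C++ side receives through `get_enum`, so the order of the Python items is expected to match the `EmissionSampling` enum.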

View File

@@ -383,7 +383,6 @@ class CYCLES_RENDER_PT_sampling_advanced(CyclesButtonsPanel, Panel):
col = layout.column(align=True)
col.prop(cscene, "min_light_bounces")
col.prop(cscene, "min_transparent_bounces")
col.prop(cscene, "light_sampling_threshold", text="Light Threshold")
for view_layer in scene.view_layers:
if view_layer.samples > 0:
@@ -392,6 +391,31 @@ class CYCLES_RENDER_PT_sampling_advanced(CyclesButtonsPanel, Panel):
break
class CYCLES_RENDER_PT_sampling_lights(CyclesButtonsPanel, Panel):
bl_label = "Lights"
bl_parent_id = "CYCLES_RENDER_PT_sampling"
bl_options = {'DEFAULT_CLOSED'}
def draw_header(self, context):
layout = self.layout
scene = context.scene
cscene = scene.cycles
def draw(self, context):
layout = self.layout
layout.use_property_split = True
layout.use_property_decorate = False
scene = context.scene
cscene = scene.cycles
col = layout.column(align=True)
col.prop(cscene, "use_light_tree")
sub = col.row()
sub.prop(cscene, "light_sampling_threshold", text="Light Threshold")
sub.active = not cscene.use_light_tree
class CYCLES_RENDER_PT_subdivision(CyclesButtonsPanel, Panel):
bl_label = "Subdivision"
bl_options = {'DEFAULT_CLOSED'}
@@ -954,7 +978,6 @@ class CYCLES_RENDER_PT_passes_light(CyclesButtonsPanel, Panel):
col = layout.column(heading="Other", align=True)
col.prop(view_layer, "use_pass_emit", text="Emission")
col.prop(view_layer, "use_pass_environment")
col.prop(view_layer, "use_pass_shadow")
col.prop(view_layer, "use_pass_ambient_occlusion", text="Ambient Occlusion")
col.prop(cycles_view_layer, "use_pass_shadow_catcher")
@@ -1832,9 +1855,9 @@ class CYCLES_MATERIAL_PT_settings_surface(CyclesButtonsPanel, Panel):
cmat = mat.cycles
col = layout.column()
col.prop(cmat, "sample_as_light", text="Multiple Importance")
col.prop(cmat, "use_transparent_shadow")
col.prop(cmat, "displacement_method", text="Displacement")
col.prop(cmat, "emission_sampling")
col.prop(cmat, "use_transparent_shadow")
def draw(self, context):
self.draw_shared(self, context.material)
@@ -2307,7 +2330,10 @@ def draw_device(self, context):
col.prop(cscene, "device")
from . import engine
if engine.with_osl() and use_cpu(context):
if engine.with_osl() and (
use_cpu(context) or
(use_optix(context) and (engine.osl_version()[1] >= 13 or engine.osl_version()[0] > 1))
):
col.prop(cscene, "shading_system")
@@ -2363,6 +2389,7 @@ classes = (
CYCLES_RENDER_PT_sampling_render_denoise,
CYCLES_RENDER_PT_sampling_path_guiding,
CYCLES_RENDER_PT_sampling_path_guiding_debug,
CYCLES_RENDER_PT_sampling_lights,
CYCLES_RENDER_PT_sampling_advanced,
CYCLES_RENDER_PT_light_paths,
CYCLES_RENDER_PT_light_paths_max_bounces,
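Note on the new condition in `draw_device`: the sketch below restates it as a standalone function so the cases are easier to check. It assumes `osl_version()` returns a `(major, minor, patch)` tuple, which the indexing in the hunk suggests but does not confirm. The function name `osl_option_visible` and its boolean parameters are hypothetical.

```python
def osl_option_visible(with_osl, use_cpu, use_optix, osl_version):
    """Mirror of the condition guarding the "shading_system" property.

    osl_version is assumed to be a (major, minor, patch) tuple.
    """
    major, minor = osl_version[0], osl_version[1]
    # Same expression as in the hunk: minor >= 13, or any major above 1.
    optix_ok = use_optix and (minor >= 13 or major > 1)
    return bool(with_osl and (use_cpu or optix_ok))
```

With this reading, CPU rendering shows the option for any OSL version, while OptiX needs OSL 1.13 or newer.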

View File

@@ -99,7 +99,7 @@ def do_versions(self):
library_versions.setdefault(library.version, []).append(library)
# Do versioning per library, since they might have different versions.
max_need_versioning = (3, 0, 25)
max_need_versioning = (3, 5, 2)
for version, libraries in library_versions.items():
if version > max_need_versioning:
continue
@@ -297,3 +297,8 @@ def do_versions(self):
cmat = mat.cycles
if not cmat.is_property_set("displacement_method"):
cmat.displacement_method = 'DISPLACEMENT'
if version <= (3, 5, 3):
cmat = mat.cycles
if not cmat.get("sample_as_light", True):
cmat.emission_sampling = 'NONE'
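Note on the versioning hunks: a minimal sketch of the two checks, using a plain `dict` as a stand-in for the material's Cycles settings. The function names are hypothetical. Version tuples compare element by element, so with the values as they read in this diff, a library saved at `(3, 5, 3)` is already skipped by the `max_need_versioning` test and never reaches the `version <= (3, 5, 3)` branch.

```python
MAX_NEED_VERSIONING = (3, 5, 2)  # value from the first hunk


def needs_versioning(version):
    """Libraries newer than MAX_NEED_VERSIONING are skipped entirely."""
    return not (version > MAX_NEED_VERSIONING)


def migrate_material(version, cmat):
    """Map the old boolean to the new enum, as in the second hunk.

    cmat is a dict standing in for mat.cycles. A missing
    "sample_as_light" key is treated as True, the old default.
    """
    if version <= (3, 5, 3):
        if not cmat.get("sample_as_light", True):
            cmat["emission_sampling"] = 'NONE'
    return cmat
```

Materials that never disabled the old option keep the new default (`AUTO`), since nothing is written for them.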

File diff suppressed because it is too large.

View File

@@ -15,6 +15,10 @@
#include "util/unique_ptr.h"
#include "util/vector.h"
typedef struct GPUContext GPUContext;
typedef struct GPUFence GPUFence;
typedef struct GPUShader GPUShader;
CCL_NAMESPACE_BEGIN
/* Base class of shader used for display driver rendering. */
@@ -29,7 +33,7 @@ class BlenderDisplayShader {
BlenderDisplayShader() = default;
virtual ~BlenderDisplayShader() = default;
virtual void bind(int width, int height) = 0;
virtual GPUShader *bind(int width, int height) = 0;
virtual void unbind() = 0;
/* Get attribute location for position and texture coordinate respectively.
@@ -40,7 +44,7 @@ class BlenderDisplayShader {
protected:
/* Get program of this display shader.
* NOTE: The shader needs to be bound to have access to this. */
virtual uint get_shader_program() = 0;
virtual GPUShader *get_shader_program() = 0;
/* Cached values of various OpenGL resources. */
int position_attribute_location_ = -1;
@@ -51,16 +55,16 @@ class BlenderDisplayShader {
* display space shader. */
class BlenderFallbackDisplayShader : public BlenderDisplayShader {
public:
virtual void bind(int width, int height) override;
virtual GPUShader *bind(int width, int height) override;
virtual void unbind() override;
protected:
virtual uint get_shader_program() override;
virtual GPUShader *get_shader_program() override;
void create_shader_if_needed();
void destroy_shader();
uint shader_program_ = 0;
GPUShader *shader_program_ = 0;
int image_texture_location_ = -1;
int fullscreen_location_ = -1;
@@ -73,17 +77,17 @@ class BlenderDisplaySpaceShader : public BlenderDisplayShader {
public:
BlenderDisplaySpaceShader(BL::RenderEngine &b_engine, BL::Scene &b_scene);
virtual void bind(int width, int height) override;
virtual GPUShader *bind(int width, int height) override;
virtual void unbind() override;
protected:
virtual uint get_shader_program() override;
virtual GPUShader *get_shader_program() override;
BL::RenderEngine b_engine_;
BL::Scene &b_scene_;
/* Cached values of various OpenGL resources. */
uint shader_program_ = 0;
GPUShader *shader_program_ = nullptr;
};
/* Display driver implementation which is specific for Blender viewport integration. */
@@ -122,6 +126,9 @@ class BlenderDisplayDriver : public DisplayDriver {
void gpu_context_lock();
void gpu_context_unlock();
/* Create GPU resources used by the display driver. */
bool gpu_resources_create();
/* Destroy all GPU resources which are being used by this object. */
void gpu_resources_destroy();
@@ -137,8 +144,8 @@ class BlenderDisplayDriver : public DisplayDriver {
struct Tiles;
unique_ptr<Tiles> tiles_;
void *gl_render_sync_ = nullptr;
void *gl_upload_sync_ = nullptr;
GPUFence *gpu_render_sync_ = nullptr;
GPUFence *gpu_upload_sync_ = nullptr;
float2 zoom_ = make_float2(1.0f, 1.0f);
};

View File

@@ -18,7 +18,6 @@
#include "util/guiding.h"
#include "util/log.h"
#include "util/md5.h"
#include "util/opengl.h"
#include "util/openimagedenoise.h"
#include "util/path.h"
#include "util/string.h"
@@ -26,6 +25,8 @@
#include "util/tbb.h"
#include "util/types.h"
#include "GPU_state.h"
#ifdef WITH_OSL
# include "scene/osl.h"
@@ -337,7 +338,7 @@ static PyObject *view_draw_func(PyObject * /*self*/, PyObject *args)
if (PyLong_AsVoidPtr(pyrv3d)) {
/* 3d view drawing */
int viewport[4];
glGetIntegerv(GL_VIEWPORT, viewport);
GPU_viewport_size_get_i(viewport);
session->view_draw(viewport[2], viewport[3]);
}

View File

@@ -559,11 +559,6 @@ static bool bake_setup_pass(Scene *scene, const string &bake_type_str, const int
0);
integrator->set_use_emission((bake_filter & BL::BakeSettings::pass_filter_EMIT) != 0);
}
/* Shadow pass. */
else if (strcmp(bake_type, "SHADOW") == 0) {
type = PASS_SHADOW;
use_direct_light = true;
}
/* Light component passes. */
else if (strcmp(bake_type, "DIFFUSE") == 0) {
if ((bake_filter & BL::BakeSettings::pass_filter_DIRECT) &&

View File

@@ -61,6 +61,12 @@ static DisplacementMethod get_displacement_method(PointerRNA &ptr)
ptr, "displacement_method", DISPLACE_NUM_METHODS, DISPLACE_BUMP);
}
static EmissionSampling get_emission_sampling(PointerRNA &ptr)
{
return (EmissionSampling)get_enum(
ptr, "emission_sampling", EMISSION_SAMPLING_NUM, EMISSION_SAMPLING_AUTO);
}
static int validate_enum_value(int value, int num_values, int default_value)
{
if (value >= num_values) {
@@ -1559,7 +1565,7 @@ void BlenderSync::sync_materials(BL::Depsgraph &b_depsgraph, bool update_all)
/* settings */
PointerRNA cmat = RNA_pointer_get(&b_mat.ptr, "cycles");
shader->set_use_mis(get_boolean(cmat, "sample_as_light"));
shader->set_emission_sampling_method(get_emission_sampling(cmat));
shader->set_use_transparent_shadow(get_boolean(cmat, "use_transparent_shadow"));
shader->set_heterogeneous_volume(!get_boolean(cmat, "homogeneous_volume"));
shader->set_volume_sampling_method(get_volume_sampling(cmat));

View File

@@ -26,7 +26,6 @@
#include "util/foreach.h"
#include "util/hash.h"
#include "util/log.h"
#include "util/opengl.h"
#include "util/openimagedenoise.h"
CCL_NAMESPACE_BEGIN
@@ -348,7 +347,14 @@ void BlenderSync::sync_integrator(BL::ViewLayer &b_view_layer, bool background)
integrator->set_motion_blur(view_layer.use_motion_blur);
}
integrator->set_light_sampling_threshold(get_float(cscene, "light_sampling_threshold"));
bool use_light_tree = get_boolean(cscene, "use_light_tree");
integrator->set_use_light_tree(use_light_tree);
integrator->set_light_sampling_threshold(
(use_light_tree) ? 0.0f : get_float(cscene, "light_sampling_threshold"));
if (integrator->use_light_tree_is_modified()) {
scene->light_manager->tag_update(scene, LightManager::UPDATE_ALL);
}
SamplingPattern sampling_pattern = (SamplingPattern)get_enum(
cscene, "sampling_pattern", SAMPLING_NUM_PATTERNS, SAMPLING_PATTERN_PMJ);
@@ -617,7 +623,6 @@ static bool get_known_pass_type(BL::RenderPass &b_pass, PassType &type, PassMode
MAP_PASS("Emit", PASS_EMISSION, false);
MAP_PASS("Env", PASS_BACKGROUND, false);
MAP_PASS("AO", PASS_AO, false);
MAP_PASS("Shadow", PASS_SHADOW, false);
MAP_PASS("BakePrimitive", PASS_BAKE_PRIMITIVE, false);
MAP_PASS("BakeDifferential", PASS_BAKE_DIFFERENTIAL, false);
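Note on the `sync_integrator` hunk in this file: when the light tree is enabled, the user's light threshold is ignored and `0.0` is passed to the integrator. A small Python restatement of that rule follows. The function name is hypothetical and this is only a sketch of the selection logic, not of the Cycles API.

```python
def effective_light_sampling_threshold(use_light_tree, light_sampling_threshold):
    """Return the threshold the integrator would receive.

    Mirrors: (use_light_tree) ? 0.0f : get_float(cscene, "light_sampling_threshold")
    """
    return 0.0 if use_light_tree else light_sampling_threshold
```

This matches the UI change in the same diff, where the "Light Threshold" row is made inactive while `use_light_tree` is on.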

View File

@@ -8,28 +8,13 @@ set(INC
set(INC_SYS )
if(WITH_CYCLES_DEVICE_OPTIX OR WITH_CYCLES_DEVICE_CUDA)
if(WITH_CUDA_DYNLOAD)
list(APPEND INC
../../../extern/cuew/include
)
add_definitions(-DWITH_CUDA_DYNLOAD)
else()
list(APPEND INC_SYS
${CUDA_TOOLKIT_INCLUDE}
)
if(NOT WITH_CUDA_DYNLOAD)
add_definitions(-DCYCLES_CUDA_NVCC_EXECUTABLE="${CUDA_NVCC_EXECUTABLE}")
endif()
add_definitions(-DCYCLES_RUNTIME_OPTIX_ROOT_DIR="${CYCLES_RUNTIME_OPTIX_ROOT_DIR}")
endif()
if(WITH_CYCLES_DEVICE_HIP AND WITH_HIP_DYNLOAD)
list(APPEND INC
../../../extern/hipew/include
)
add_definitions(-DWITH_HIP_DYNLOAD)
endif()
set(SRC_BASE
device.cpp
denoise.cpp
@@ -168,24 +153,15 @@ if(WITH_CYCLES_DEVICE_HIP AND WITH_HIP_DYNLOAD)
)
endif()
if(WITH_CYCLES_DEVICE_CUDA)
add_definitions(-DWITH_CUDA)
endif()
if(WITH_CYCLES_DEVICE_HIP)
add_definitions(-DWITH_HIP)
endif()
if(WITH_CYCLES_DEVICE_OPTIX)
add_definitions(-DWITH_OPTIX)
endif()
if(WITH_CYCLES_DEVICE_METAL)
list(APPEND LIB
${METAL_LIBRARY}
)
add_definitions(-DWITH_METAL)
list(APPEND SRC
${SRC_METAL}
)
endif()
if (WITH_CYCLES_DEVICE_ONEAPI)
if(WITH_CYCLES_ONEAPI_BINARIES)
set(cycles_kernel_oneapi_lib_suffix "_aot")
@@ -203,7 +179,6 @@ if (WITH_CYCLES_DEVICE_ONEAPI)
else()
list(APPEND LIB ${SYCL_LIBRARY})
endif()
add_definitions(-DWITH_ONEAPI)
list(APPEND SRC
${SRC_ONEAPI}
)

View File

@@ -38,7 +38,7 @@ class CUDADeviceGraphicsInterop : public DeviceGraphicsInterop {
CUDADevice *device_ = nullptr;
/* OpenGL PBO which is currently registered as the destination for the CUDA buffer. */
uint opengl_pbo_id_ = 0;
int64_t opengl_pbo_id_ = 0;
/* Buffer area in pixels of the corresponding PBO. */
int64_t buffer_area_ = 0;

View File

@@ -78,24 +78,4 @@ class DenoiseParams : public Node {
}
};
/* All the parameters needed to perform buffer denoising on a device.
* Is not really a task in its canonical terms (as in, is not an asynchronous running task). Is
* more like a wrapper for all the arguments and parameters needed to perform denoising. Is a
* single place where they are all listed, so that it's not required to modify all device methods
* when these parameters do change. */
class DeviceDenoiseTask {
public:
DenoiseParams params;
int num_samples;
RenderBuffers *render_buffers;
BufferParams buffer_params;
/* Allow to do in-place modification of the input passes (scaling them down i.e.). This will
* lower the memory footprint of the denoiser but will make input passes "invalid" (from path
* tracer) point of view. */
bool allow_inplace_modification;
};
CCL_NAMESPACE_END

View File

@@ -351,6 +351,7 @@ DeviceInfo Device::get_multi_device(const vector<DeviceInfo> &subdevices,
info.num = 0;
info.has_nanovdb = true;
info.has_light_tree = true;
info.has_osl = true;
info.has_guiding = true;
info.has_profiling = true;
@@ -399,6 +400,7 @@ DeviceInfo Device::get_multi_device(const vector<DeviceInfo> &subdevices,
/* Accumulate device info. */
info.has_nanovdb &= device.has_nanovdb;
info.has_light_tree &= device.has_light_tree;
info.has_osl &= device.has_osl;
info.has_guiding &= device.has_guiding;
info.has_profiling &= device.has_profiling;
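Note on `get_multi_device`: the combined device starts with every capability set to true and then ANDs in each sub-device, so a feature survives only if all sub-devices support it. The sketch below shows that pattern in Python. `DeviceCaps` and `combine_capabilities` are hypothetical names, and the defaults are taken from the `DeviceInfo` constructor as it appears elsewhere in this diff.

```python
from dataclasses import dataclass, fields


@dataclass
class DeviceCaps:
    """Stand-in for the capability flags of DeviceInfo."""
    has_nanovdb: bool = False
    has_light_tree: bool = True
    has_osl: bool = False
    has_guiding: bool = False
    has_profiling: bool = False


def combine_capabilities(subdevices):
    """Start from all-True, then AND in every sub-device's flags."""
    combined = DeviceCaps(**{f.name: True for f in fields(DeviceCaps)})
    for device in subdevices:
        for f in fields(DeviceCaps):
            value = getattr(combined, f.name) and getattr(device, f.name)
            setattr(combined, f.name, value)
    return combined
```

One consequence, given that the HIP device in this diff sets `has_light_tree = false`: a multi-device that includes a HIP sub-device would report no light tree support.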

View File

@@ -65,6 +65,7 @@ class DeviceInfo {
int num;
bool display_device; /* GPU is used as a display device. */
bool has_nanovdb; /* Support NanoVDB volumes. */
bool has_light_tree; /* Support light tree. */
bool has_osl; /* Support Open Shading Language. */
bool has_guiding; /* Support path guiding. */
bool has_profiling; /* Supports runtime collection of profiling info. */
@@ -84,6 +85,7 @@ class DeviceInfo {
cpu_threads = 0;
display_device = false;
has_nanovdb = false;
has_light_tree = true;
has_osl = false;
has_guiding = false;
has_profiling = false;
@@ -160,6 +162,11 @@ class Device {
return true;
}
virtual bool load_osl_kernels()
{
return true;
}
/* GPU device only functions.
* These may not be used on CPU or multi-devices. */
@@ -228,21 +235,6 @@ class Device {
return nullptr;
}
/* Buffer denoising. */
/* Returns true if task is fully handled. */
virtual bool denoise_buffer(const DeviceDenoiseTask & /*task*/)
{
LOG(ERROR) << "Request buffer denoising from a device which does not support it.";
return false;
}
virtual DeviceQueue *get_denoise_queue()
{
LOG(ERROR) << "Request denoising queue from a device which does not support it.";
return nullptr;
}
/* Sub-devices */
/* Run given callback for every individual device which will be handling rendering.

View File

@@ -137,6 +137,7 @@ void device_hip_info(vector<DeviceInfo> &devices)
info.num = num;
info.has_nanovdb = true;
info.has_light_tree = false;
info.denoisers = 0;
info.has_gpu_queue = true;

View File

@@ -36,7 +36,7 @@ class HIPDeviceGraphicsInterop : public DeviceGraphicsInterop {
HIPDevice *device_ = nullptr;
/* OpenGL PBO which is currently registered as the destination for the HIP buffer. */
uint opengl_pbo_id_ = 0;
int64_t opengl_pbo_id_ = 0;
/* Buffer area in pixels of the corresponding PBO. */
int64_t buffer_area_ = 0;

View File

@@ -7,6 +7,30 @@
CCL_NAMESPACE_BEGIN
bool device_kernel_has_shading(DeviceKernel kernel)
{
return (kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND ||
kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_LIGHT ||
kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE ||
kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE ||
kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE ||
kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_VOLUME ||
kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SHADOW ||
kernel == DEVICE_KERNEL_SHADER_EVAL_DISPLACE ||
kernel == DEVICE_KERNEL_SHADER_EVAL_BACKGROUND ||
kernel == DEVICE_KERNEL_SHADER_EVAL_CURVE_SHADOW_TRANSPARENCY);
}
bool device_kernel_has_intersection(DeviceKernel kernel)
{
return (kernel == DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST ||
kernel == DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW ||
kernel == DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE ||
kernel == DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK ||
kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE ||
kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE);
}
const char *device_kernel_as_string(DeviceKernel kernel)
{
switch (kernel) {

View File

@@ -11,6 +11,9 @@
CCL_NAMESPACE_BEGIN
bool device_kernel_has_shading(DeviceKernel kernel);
bool device_kernel_has_intersection(DeviceKernel kernel);
const char *device_kernel_as_string(DeviceKernel kernel);
std::ostream &operator<<(std::ostream &os, DeviceKernel kernel);
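Note on the two predicates declared here and defined in the previous file: each is a membership test over a fixed list of kernels. The Python sketch below uses strings in place of the `DeviceKernel` enum values, purely for illustration. It also makes it easy to see that the ray-tracing and MNEE surface kernels belong to both groups.

```python
SHADING_KERNELS = frozenset({
    "DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND",
    "DEVICE_KERNEL_INTEGRATOR_SHADE_LIGHT",
    "DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE",
    "DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE",
    "DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE",
    "DEVICE_KERNEL_INTEGRATOR_SHADE_VOLUME",
    "DEVICE_KERNEL_INTEGRATOR_SHADE_SHADOW",
    "DEVICE_KERNEL_SHADER_EVAL_DISPLACE",
    "DEVICE_KERNEL_SHADER_EVAL_BACKGROUND",
    "DEVICE_KERNEL_SHADER_EVAL_CURVE_SHADOW_TRANSPARENCY",
})

INTERSECTION_KERNELS = frozenset({
    "DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST",
    "DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW",
    "DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE",
    "DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK",
    "DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE",
    "DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE",
})


def device_kernel_has_shading(kernel):
    """True if the kernel evaluates shaders."""
    return kernel in SHADING_KERNELS


def device_kernel_has_intersection(kernel):
    """True if the kernel performs ray intersection."""
    return kernel in INTERSECTION_KERNELS
```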

View File

@@ -117,6 +117,8 @@ class MetalDevice : public Device {
/* ------------------------------------------------------------------ */
/* low-level memory management */
bool max_working_set_exceeded(size_t safety_margin = 8 * 1024 * 1024) const;
MetalMem *generic_alloc(device_memory &mem);
void generic_copy_to(device_memory &mem);

View File

@@ -446,6 +446,14 @@ void MetalDevice::erase_allocation(device_memory &mem)
}
}
bool MetalDevice::max_working_set_exceeded(size_t safety_margin) const
{
/* We're allowed to allocate beyond the safe working set size, but then if all resources are made
* resident we will get command buffer failures at render time. */
size_t available = [mtlDevice recommendedMaxWorkingSetSize] - safety_margin;
return (stats.mem_used > available);
}
MetalDevice::MetalMem *MetalDevice::generic_alloc(device_memory &mem)
{
size_t size = mem.memory_size();
@@ -523,6 +531,11 @@ MetalDevice::MetalMem *MetalDevice::generic_alloc(device_memory &mem)
mmem->use_UMA = false;
}
if (max_working_set_exceeded()) {
set_error("System is out of GPU memory");
return nullptr;
}
return mmem;
}
@@ -921,9 +934,8 @@ void MetalDevice::tex_alloc(device_texture &mem)
<< string_human_readable_size(mem.memory_size()) << ")";
mtlTexture = [mtlDevice newTextureWithDescriptor:desc];
assert(mtlTexture);
if (!mtlTexture) {
set_error("System is out of GPU memory");
return;
}
@@ -955,7 +967,10 @@ void MetalDevice::tex_alloc(device_texture &mem)
<< string_human_readable_size(mem.memory_size()) << ")";
mtlTexture = [mtlDevice newTextureWithDescriptor:desc];
assert(mtlTexture);
if (!mtlTexture) {
set_error("System is out of GPU memory");
return;
}
[mtlTexture replaceRegion:MTLRegionMake2D(0, 0, mem.data_width, mem.data_height)
mipmapLevel:0
@@ -1017,6 +1032,10 @@ void MetalDevice::tex_alloc(device_texture &mem)
need_texture_info = true;
texture_info[slot].data = uint64_t(slot) | (sampler_index << 32);
if (max_working_set_exceeded()) {
set_error("System is out of GPU memory");
}
}
void MetalDevice::tex_free(device_texture &mem)
@@ -1077,6 +1096,10 @@ void MetalDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
}
}
}
if (max_working_set_exceeded()) {
set_error("System is out of GPU memory");
}
}
CCL_NAMESPACE_END
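Note on `max_working_set_exceeded` in this file: the check compares the memory in use against the recommended working set minus a safety margin (8 MiB by default). The Python sketch below restates the arithmetic with plain integers. The parameter `recommended_max_working_set` stands in for the value of `[mtlDevice recommendedMaxWorkingSetSize]` and is an assumption of this sketch. Python integers do not wrap, unlike the unsigned subtraction in the original.

```python
DEFAULT_SAFETY_MARGIN = 8 * 1024 * 1024  # 8 MiB, as in the declaration


def max_working_set_exceeded(mem_used, recommended_max_working_set,
                             safety_margin=DEFAULT_SAFETY_MARGIN):
    """Return True once usage goes past the recommended size minus the margin."""
    available = recommended_max_working_set - safety_margin
    return mem_used > available
```

Usage exactly equal to the available amount does not count as exceeded, because the comparison is strict.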

View File

@@ -45,6 +45,36 @@ bool kernel_has_intersection(DeviceKernel device_kernel)
struct ShaderCache {
ShaderCache(id<MTLDevice> _mtlDevice) : mtlDevice(_mtlDevice)
{
/* Initialize occupancy tuning LUT. */
if (MetalInfo::get_device_vendor(mtlDevice) == METAL_GPU_APPLE) {
switch (MetalInfo::get_apple_gpu_architecture(mtlDevice)) {
default:
case APPLE_M2:
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_COMPACT_SHADOW_STATES] = {32, 32};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_INIT_FROM_CAMERA] = {832, 32};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST] = {64, 64};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW] = {64, 64};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE] = {704, 32};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_QUEUED_PATHS_ARRAY] = {1024, 256};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND] = {64, 32};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_SHADE_SHADOW] = {256, 256};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE] = {448, 384};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY] = {1024, 1024};
break;
case APPLE_M1:
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_COMPACT_SHADOW_STATES] = {256, 128};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_INIT_FROM_CAMERA] = {768, 32};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST] = {512, 128};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW] = {384, 128};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE] = {512, 64};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_QUEUED_PATHS_ARRAY] = {512, 256};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND] = {512, 128};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_SHADE_SHADOW] = {384, 32};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE] = {576, 384};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY] = {832, 832};
break;
}
}
}
~ShaderCache();
@@ -73,6 +103,11 @@ struct ShaderCache {
std::function<void(MetalKernelPipeline *)> completionHandler;
};
struct OccupancyTuningParameters {
int threads_per_threadgroup = 0;
int num_threads_per_block = 0;
} occupancy_tuning[DEVICE_KERNEL_NUM];
std::mutex cache_mutex;
PipelineCollection pipelines[DEVICE_KERNEL_NUM];
@@ -230,6 +265,13 @@ void ShaderCache::load_kernel(DeviceKernel device_kernel,
request.pipeline->device_kernel = device_kernel;
request.pipeline->threads_per_threadgroup = device->max_threads_per_threadgroup;
if (occupancy_tuning[device_kernel].threads_per_threadgroup) {
request.pipeline->threads_per_threadgroup =
occupancy_tuning[device_kernel].threads_per_threadgroup;
request.pipeline->num_threads_per_block =
occupancy_tuning[device_kernel].num_threads_per_block;
}
/* metalrt options */
request.pipeline->use_metalrt = device->use_metalrt;
request.pipeline->metalrt_features = device->use_metalrt ?
@@ -384,13 +426,6 @@ void MetalKernelPipeline::compile()
const std::string function_name = std::string("cycles_metal_") +
device_kernel_as_string(device_kernel);
int threads_per_threadgroup = this->threads_per_threadgroup;
if (device_kernel > DEVICE_KERNEL_INTEGRATOR_MEGAKERNEL &&
device_kernel < DEVICE_KERNEL_INTEGRATOR_RESET) {
/* Always use 512 for the sorting kernels */
threads_per_threadgroup = 512;
}
NSString *entryPoint = [@(function_name.c_str()) copy];
NSError *error = NULL;
@@ -601,7 +636,9 @@ void MetalKernelPipeline::compile()
metalbin_path = path_cache_get(path_join("kernels", metalbin_name));
path_create_directories(metalbin_path);
if (path_exists(metalbin_path) && use_binary_archive) {
/* Retrieve shader binary from disk, and update the file timestamp for LRU purging to work as
* intended. */
if (use_binary_archive && path_cache_kernel_exists_and_mark_used(metalbin_path)) {
if (@available(macOS 11.0, *)) {
MTLBinaryArchiveDescriptor *archiveDesc = [[MTLBinaryArchiveDescriptor alloc] init];
archiveDesc.url = [NSURL fileURLWithPath:@(metalbin_path.c_str())];
@@ -662,12 +699,14 @@ void MetalKernelPipeline::compile()
return;
}
int num_threads_per_block = round_down(computePipelineState.maxTotalThreadsPerThreadgroup,
computePipelineState.threadExecutionWidth);
num_threads_per_block = std::max(num_threads_per_block,
(int)computePipelineState.threadExecutionWidth);
if (!num_threads_per_block) {
num_threads_per_block = round_down(computePipelineState.maxTotalThreadsPerThreadgroup,
computePipelineState.threadExecutionWidth);
num_threads_per_block = std::max(num_threads_per_block,
(int)computePipelineState.threadExecutionWidth);
}
this->pipeline = computePipelineState;
this->num_threads_per_block = num_threads_per_block;
if (@available(macOS 11.0, *)) {
if (creating_new_archive || recreate_archive) {
@@ -676,6 +715,9 @@ void MetalKernelPipeline::compile()
metal_printf("Failed to save binary archive, error:\n%s\n",
[[error localizedDescription] UTF8String]);
}
else {
path_cache_kernel_mark_added_and_clear_old(metalbin_path);
}
}
}
};
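
The kernel changes above introduce a per-kernel occupancy table and fall back to a value derived from the pipeline when no tuning entry exists. The sketch below shows that selection logic in isolation. `LaunchConfig`, `resolve_launch_config`, and the three integer parameters are hypothetical names standing in for values the real code reads from the device and the compiled pipeline state.

```cpp
#include <algorithm>
#include <cassert>

// Mirrors the struct added in the diff: zero means "no tuning entry".
struct OccupancyTuningParameters {
  int threads_per_threadgroup = 0;
  int num_threads_per_block = 0;
};

// Hypothetical result type for this sketch.
struct LaunchConfig {
  int threads_per_threadgroup;
  int num_threads_per_block;
};

static int round_down(int x, int multiple)
{
  return (x / multiple) * multiple;
}

// Use the tuned values when present, otherwise derive the block size from
// the pipeline limits, never going below one execution width.
LaunchConfig resolve_launch_config(const OccupancyTuningParameters &tuning,
                                   int device_max_threads,
                                   int pipeline_max_threads,
                                   int execution_width)
{
  LaunchConfig config{device_max_threads, 0};
  if (tuning.threads_per_threadgroup) {
    config.threads_per_threadgroup = tuning.threads_per_threadgroup;
    config.num_threads_per_block = tuning.num_threads_per_block;
  }
  if (!config.num_threads_per_block) {
    config.num_threads_per_block = std::max(
        round_down(pipeline_max_threads, execution_width), execution_width);
  }
  return config;
}
```

A tuned entry such as `{448, 384}` is passed through unchanged. An empty entry with a pipeline limit of 1000 and an execution width of 32 resolves to a block size of 992.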

@@ -138,6 +138,15 @@ class MultiDevice : public Device {
return true;
}
bool load_osl_kernels() override
{
foreach (SubDevice &sub, devices)
if (!sub.device->load_osl_kernels())
return false;
return true;
}
void build_bvh(BVH *bvh, Progress &progress, bool refit) override
{
/* Try to build and share a single acceleration structure, if possible */
@@ -204,10 +213,12 @@ class MultiDevice : public Device {
virtual void *get_cpu_osl_memory() override
{
if (devices.size() > 1) {
/* Always return the OSL memory of the CPU device (this works since the constructor above
* guarantees that CPU devices are always added to the back). */
if (devices.size() > 1 && devices.back().device->info.type != DEVICE_CPU) {
return NULL;
}
return devices.front().device->get_cpu_osl_memory();
return devices.back().device->get_cpu_osl_memory();
}
bool is_resident(device_ptr key, Device *sub_device) override

@@ -31,6 +31,8 @@ bool device_oneapi_init()
* improves stability as of intel/LLVM SYCL-nightly/20220529.
* All these env variable can be set beforehand by end-users and
* will in that case -not- be overwritten. */
/* By default, enable only Level-Zero and if all devices are allowed, also CUDA and HIP.
* OpenCL backend isn't currently well supported. */
# ifdef _WIN32
if (getenv("SYCL_CACHE_PERSISTENT") == nullptr) {
_putenv_s("SYCL_CACHE_PERSISTENT", "1");
@@ -39,7 +41,12 @@ bool device_oneapi_init()
_putenv_s("SYCL_CACHE_THRESHOLD", "0");
}
if (getenv("SYCL_DEVICE_FILTER") == nullptr) {
_putenv_s("SYCL_DEVICE_FILTER", "level_zero");
if (getenv("CYCLES_ONEAPI_ALL_DEVICES") == nullptr) {
_putenv_s("SYCL_DEVICE_FILTER", "level_zero");
}
else {
_putenv_s("SYCL_DEVICE_FILTER", "level_zero,cuda,hip");
}
}
if (getenv("SYCL_ENABLE_PCI") == nullptr) {
_putenv_s("SYCL_ENABLE_PCI", "1");
@@ -50,7 +57,12 @@ bool device_oneapi_init()
# elif __linux__
setenv("SYCL_CACHE_PERSISTENT", "1", false);
setenv("SYCL_CACHE_THRESHOLD", "0", false);
setenv("SYCL_DEVICE_FILTER", "level_zero", false);
if (getenv("CYCLES_ONEAPI_ALL_DEVICES") == nullptr) {
setenv("SYCL_DEVICE_FILTER", "level_zero", false);
}
else {
setenv("SYCL_DEVICE_FILTER", "level_zero,cuda,hip", false);
}
setenv("SYCL_ENABLE_PCI", "1", false);
setenv("SYCL_PI_LEVEL_ZERO_USE_COPY_ENGINE_FOR_IN_ORDER_QUEUE", "0", false);
# endif
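
The oneAPI change above only sets `SYCL_DEVICE_FILTER` when the user has not set it, and widens the filter when `CYCLES_ONEAPI_ALL_DEVICES` is present. The sketch below shows both steps. `default_device_filter` and `set_env_default` are hypothetical helper names, not functions from the Cycles source.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// As in the diff, only the presence of the variable matters: any value,
// including "0", selects the wider filter.
std::string default_device_filter(const char *all_devices_env)
{
  return all_devices_env == nullptr ? "level_zero" : "level_zero,cuda,hip";
}

// Set `name` to `value` only if it is not already set, then return the
// value that is in effect afterwards.
std::string set_env_default(const char *name, const std::string &value)
{
  if (std::getenv(name) == nullptr) {
#ifdef _WIN32
    _putenv_s(name, value.c_str());
#else
    setenv(name, value.c_str(), 0); /* 0: do not overwrite */
#endif
  }
  const char *current = std::getenv(name);
  return current ? current : "";
}
```

A call such as `set_env_default("SYCL_DEVICE_FILTER", default_device_filter(std::getenv("CYCLES_ONEAPI_ALL_DEVICES")))` would then leave a user-provided filter untouched.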

@@ -430,9 +430,9 @@ void OneapiDevice::check_usm(SyclQueue *queue_, const void *usm_ptr, bool allow_
sycl::usm::alloc usm_type = get_pointer_type(usm_ptr, queue->get_context());
(void)usm_type;
assert(usm_type == sycl::usm::alloc::device ||
((device_type == sycl::info::device_type::cpu || allow_host) &&
usm_type == sycl::usm::alloc::host ||
usm_type == sycl::usm::alloc::unknown));
(usm_type == sycl::usm::alloc::host &&
(allow_host || device_type == sycl::info::device_type::cpu)) ||
usm_type == sycl::usm::alloc::unknown);
# else
/* Silence warning about unused arguments. */
(void)queue_;
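
The assertion rewrite in `check_usm` above regroups a mixed `&&` / `||` expression. Because `&&` binds tighter than `||`, the old form already parsed as "device, or (host with permission), or unknown", so the new form appears to make the grouping explicit without changing the result. The sketch below checks that over every input; the `Alloc` enum and both function names are hypothetical stand-ins for the SYCL types.

```cpp
#include <cassert>

// Hypothetical stand-in for sycl::usm::alloc.
enum class Alloc { host, device, shared, unknown };

// Old expression, written with the parentheses the compiler implies.
bool old_check(Alloc t, bool is_cpu, bool allow_host)
{
  return t == Alloc::device ||
         (((is_cpu || allow_host) && t == Alloc::host) || t == Alloc::unknown);
}

// New expression from the diff.
bool new_check(Alloc t, bool is_cpu, bool allow_host)
{
  return t == Alloc::device || (t == Alloc::host && (allow_host || is_cpu)) ||
         t == Alloc::unknown;
}

// Exhaustively compare both forms over all inputs.
bool checks_agree()
{
  const Alloc types[] = {Alloc::host, Alloc::device, Alloc::shared, Alloc::unknown};
  for (Alloc t : types) {
    for (int cpu = 0; cpu < 2; cpu++) {
      for (int host = 0; host < 2; host++) {
        if (old_check(t, cpu != 0, host != 0) != new_check(t, cpu != 0, host != 0)) {
          return false;
        }
      }
    }
  }
  return true;
}
```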

@@ -9,6 +9,10 @@
#include "util/log.h"
#ifdef WITH_OSL
# include <OSL/oslversion.h>
#endif
#ifdef WITH_OPTIX
# include <optix_function_table_definition.h>
#endif
@@ -65,6 +69,9 @@ void device_optix_info(const vector<DeviceInfo> &cuda_devices, vector<DeviceInfo
info.type = DEVICE_OPTIX;
info.id += "_OptiX";
# if defined(WITH_OSL) && (OSL_VERSION_MINOR >= 13 || OSL_VERSION_MAJOR > 1)
info.has_osl = true;
# endif
info.denoisers |= DENOISER_OPTIX;
devices.push_back(info);

File diff suppressed because it is too large

@@ -1,16 +1,14 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2019, NVIDIA Corporation.
* Copyright 2019-2022 Blender Foundation. */
* Copyright 2019, NVIDIA Corporation
* Copyright 2019-2022 Blender Foundation */
#pragma once
#ifdef WITH_OPTIX
# include "device/cuda/device_impl.h"
# include "device/optix/queue.h"
# include "device/optix/util.h"
# include "kernel/types.h"
# include "util/unique_ptr.h"
# include "kernel/osl/globals.h"
CCL_NAMESPACE_BEGIN
@@ -23,8 +21,16 @@ enum {
PG_RGEN_INTERSECT_SHADOW,
PG_RGEN_INTERSECT_SUBSURFACE,
PG_RGEN_INTERSECT_VOLUME_STACK,
PG_RGEN_SHADE_BACKGROUND,
PG_RGEN_SHADE_LIGHT,
PG_RGEN_SHADE_SURFACE,
PG_RGEN_SHADE_SURFACE_RAYTRACE,
PG_RGEN_SHADE_SURFACE_MNEE,
PG_RGEN_SHADE_VOLUME,
PG_RGEN_SHADE_SHADOW,
PG_RGEN_EVAL_DISPLACE,
PG_RGEN_EVAL_BACKGROUND,
PG_RGEN_EVAL_CURVE_SHADOW_TRANSPARENCY,
PG_MISS,
PG_HITD, /* Default hit group. */
PG_HITS, /* __SHADOW_RECORD_ALL__ hit group. */
@@ -40,14 +46,14 @@ enum {
};
static const int MISS_PROGRAM_GROUP_OFFSET = PG_MISS;
static const int NUM_MIS_PROGRAM_GROUPS = 1;
static const int NUM_MISS_PROGRAM_GROUPS = 1;
static const int HIT_PROGAM_GROUP_OFFSET = PG_HITD;
static const int NUM_HIT_PROGRAM_GROUPS = 8;
static const int CALLABLE_PROGRAM_GROUPS_BASE = PG_CALL_SVM_AO;
static const int NUM_CALLABLE_PROGRAM_GROUPS = 2;
/* List of OptiX pipelines. */
enum { PIP_SHADE_RAYTRACE, PIP_SHADE_MNEE, PIP_INTERSECT, NUM_PIPELINES };
enum { PIP_SHADE, PIP_INTERSECT, NUM_PIPELINES };
/* A single shader binding table entry. */
struct SbtRecord {
@@ -61,52 +67,35 @@ class OptiXDevice : public CUDADevice {
OptixModule optix_module = NULL; /* All necessary OptiX kernels are in one module. */
OptixModule builtin_modules[2] = {};
OptixPipeline pipelines[NUM_PIPELINES] = {};
OptixProgramGroup groups[NUM_PROGRAM_GROUPS] = {};
OptixPipelineCompileOptions pipeline_options = {};
bool motion_blur = false;
device_vector<SbtRecord> sbt_data;
device_only_memory<KernelParamsOptiX> launch_params;
OptixTraversableHandle tlas_handle = 0;
# ifdef WITH_OSL
OSLGlobals osl_globals;
vector<OptixModule> osl_modules;
vector<OptixProgramGroup> osl_groups;
# endif
private:
OptixTraversableHandle tlas_handle = 0;
vector<unique_ptr<device_only_memory<char>>> delayed_free_bvh_memory;
thread_mutex delayed_free_bvh_mutex;
class Denoiser {
public:
explicit Denoiser(OptiXDevice *device);
OptiXDevice *device;
OptiXDeviceQueue queue;
OptixDenoiser optix_denoiser = nullptr;
/* Configuration size, as provided to `optixDenoiserSetup`.
* If the `optixDenoiserSetup()` was never used on the current `optix_denoiser` the
* `is_configured` will be false. */
bool is_configured = false;
int2 configured_size = make_int2(0, 0);
/* OptiX denoiser state and scratch buffers, stored in a single memory buffer.
* The memory layout goes as following: [denoiser state][scratch buffer]. */
device_only_memory<unsigned char> state;
OptixDenoiserSizes sizes = {};
bool use_pass_albedo = false;
bool use_pass_normal = false;
bool use_pass_flow = false;
};
Denoiser denoiser_;
public:
OptiXDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler);
~OptiXDevice();
private:
BVHLayoutMask get_bvh_layout_mask() const override;
string compile_kernel_get_common_cflags(const uint kernel_features);
bool load_kernels(const uint kernel_features) override;
bool load_osl_kernels() override;
bool build_optix_bvh(BVHOptiX *bvh,
OptixBuildOperation operation,
const OptixBuildInput &build_input,
@@ -123,52 +112,7 @@ class OptiXDevice : public CUDADevice {
virtual unique_ptr<DeviceQueue> gpu_queue_create() override;
/* --------------------------------------------------------------------
* Denoising.
*/
class DenoiseContext;
class DenoisePass;
virtual bool denoise_buffer(const DeviceDenoiseTask &task) override;
virtual DeviceQueue *get_denoise_queue() override;
/* Read guiding passes from the render buffers, preprocess them in a way which is expected by
* OptiX and store in the guiding passes memory within the given context.
*
* Pre-processing of the guiding passes is to only happen once per context lifetime. Do not
* preprocess them for every pass which is being denoised. */
bool denoise_filter_guiding_preprocess(DenoiseContext &context);
/* Set fake albedo pixels in the albedo guiding pass storage.
* After this point only passes which do not need albedo for denoising can be processed. */
bool denoise_filter_guiding_set_fake_albedo(DenoiseContext &context);
void denoise_pass(DenoiseContext &context, PassType pass_type);
/* Read input color pass from the render buffer into the memory which corresponds to the noisy
* input within the given context. Pixels are scaled to the number of samples, but are not
* preprocessed yet. */
void denoise_color_read(DenoiseContext &context, const DenoisePass &pass);
/* Run corresponding filter kernels, preparing data for the denoiser or copying data from the
* denoiser result to the render buffer. */
bool denoise_filter_color_preprocess(DenoiseContext &context, const DenoisePass &pass);
bool denoise_filter_color_postprocess(DenoiseContext &context, const DenoisePass &pass);
/* Make sure the OptiX denoiser is created and configured. */
bool denoise_ensure(DenoiseContext &context);
/* Create OptiX denoiser descriptor if needed.
* Will do nothing if the current OptiX descriptor is usable for the given parameters.
* If the OptiX denoiser descriptor did re-allocate here it is left unconfigured. */
bool denoise_create_if_needed(DenoiseContext &context);
/* Configure existing OptiX denoiser descriptor for the use for the given task. */
bool denoise_configure_if_needed(DenoiseContext &context);
/* Run configured denoiser. */
bool denoise_run(DenoiseContext &context, const DenoisePass &pass);
void *get_cpu_osl_memory() override;
};
CCL_NAMESPACE_END

@@ -24,21 +24,33 @@ void OptiXDeviceQueue::init_execution()
CUDADeviceQueue::init_execution();
}
static bool is_optix_specific_kernel(DeviceKernel kernel)
static bool is_optix_specific_kernel(DeviceKernel kernel, bool use_osl)
{
return (kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE ||
kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE ||
kernel == DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST ||
kernel == DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW ||
kernel == DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE ||
kernel == DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK);
# ifdef WITH_OSL
/* OSL uses direct callables to execute, so shading needs to be done in OptiX if OSL is used. */
if (use_osl && device_kernel_has_shading(kernel)) {
return true;
}
# else
(void)use_osl;
# endif
return device_kernel_has_intersection(kernel);
}
bool OptiXDeviceQueue::enqueue(DeviceKernel kernel,
const int work_size,
DeviceKernelArguments const &args)
{
if (!is_optix_specific_kernel(kernel)) {
OptiXDevice *const optix_device = static_cast<OptiXDevice *>(cuda_device_);
# ifdef WITH_OSL
const bool use_osl = static_cast<OSLGlobals *>(optix_device->get_cpu_osl_memory())->use;
# else
const bool use_osl = false;
# endif
if (!is_optix_specific_kernel(kernel, use_osl)) {
return CUDADeviceQueue::enqueue(kernel, work_size, args);
}
@@ -50,8 +62,6 @@ bool OptiXDeviceQueue::enqueue(DeviceKernel kernel,
const CUDAContextScope scope(cuda_device_);
OptiXDevice *const optix_device = static_cast<OptiXDevice *>(cuda_device_);
const device_ptr sbt_data_ptr = optix_device->sbt_data.device_pointer;
const device_ptr launch_params_ptr = optix_device->launch_params.device_pointer;
@@ -62,9 +72,7 @@ bool OptiXDeviceQueue::enqueue(DeviceKernel kernel,
sizeof(device_ptr),
cuda_stream_));
if (kernel == DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST ||
kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE ||
kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE) {
if (kernel == DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST || device_kernel_has_shading(kernel)) {
cuda_device_assert(
cuda_device_,
cuMemcpyHtoDAsync(launch_params_ptr + offsetof(KernelParamsOptiX, render_buffer),
@@ -72,6 +80,15 @@ bool OptiXDeviceQueue::enqueue(DeviceKernel kernel,
sizeof(device_ptr),
cuda_stream_));
}
if (kernel == DEVICE_KERNEL_SHADER_EVAL_DISPLACE ||
kernel == DEVICE_KERNEL_SHADER_EVAL_BACKGROUND ||
kernel == DEVICE_KERNEL_SHADER_EVAL_CURVE_SHADOW_TRANSPARENCY) {
cuda_device_assert(cuda_device_,
cuMemcpyHtoDAsync(launch_params_ptr + offsetof(KernelParamsOptiX, offset),
args.values[2], // &d_offset
sizeof(int32_t),
cuda_stream_));
}
cuda_device_assert(cuda_device_, cuStreamSynchronize(cuda_stream_));
@@ -79,14 +96,35 @@ bool OptiXDeviceQueue::enqueue(DeviceKernel kernel,
OptixShaderBindingTable sbt_params = {};
switch (kernel) {
case DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND:
pipeline = optix_device->pipelines[PIP_SHADE];
sbt_params.raygenRecord = sbt_data_ptr + PG_RGEN_SHADE_BACKGROUND * sizeof(SbtRecord);
break;
case DEVICE_KERNEL_INTEGRATOR_SHADE_LIGHT:
pipeline = optix_device->pipelines[PIP_SHADE];
sbt_params.raygenRecord = sbt_data_ptr + PG_RGEN_SHADE_LIGHT * sizeof(SbtRecord);
break;
case DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE:
pipeline = optix_device->pipelines[PIP_SHADE];
sbt_params.raygenRecord = sbt_data_ptr + PG_RGEN_SHADE_SURFACE * sizeof(SbtRecord);
break;
case DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE:
pipeline = optix_device->pipelines[PIP_SHADE_RAYTRACE];
pipeline = optix_device->pipelines[PIP_SHADE];
sbt_params.raygenRecord = sbt_data_ptr + PG_RGEN_SHADE_SURFACE_RAYTRACE * sizeof(SbtRecord);
break;
case DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE:
pipeline = optix_device->pipelines[PIP_SHADE_MNEE];
pipeline = optix_device->pipelines[PIP_SHADE];
sbt_params.raygenRecord = sbt_data_ptr + PG_RGEN_SHADE_SURFACE_MNEE * sizeof(SbtRecord);
break;
case DEVICE_KERNEL_INTEGRATOR_SHADE_VOLUME:
pipeline = optix_device->pipelines[PIP_SHADE];
sbt_params.raygenRecord = sbt_data_ptr + PG_RGEN_SHADE_VOLUME * sizeof(SbtRecord);
break;
case DEVICE_KERNEL_INTEGRATOR_SHADE_SHADOW:
pipeline = optix_device->pipelines[PIP_SHADE];
sbt_params.raygenRecord = sbt_data_ptr + PG_RGEN_SHADE_SHADOW * sizeof(SbtRecord);
break;
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST:
pipeline = optix_device->pipelines[PIP_INTERSECT];
sbt_params.raygenRecord = sbt_data_ptr + PG_RGEN_INTERSECT_CLOSEST * sizeof(SbtRecord);
@@ -104,6 +142,20 @@ bool OptiXDeviceQueue::enqueue(DeviceKernel kernel,
sbt_params.raygenRecord = sbt_data_ptr + PG_RGEN_INTERSECT_VOLUME_STACK * sizeof(SbtRecord);
break;
case DEVICE_KERNEL_SHADER_EVAL_DISPLACE:
pipeline = optix_device->pipelines[PIP_SHADE];
sbt_params.raygenRecord = sbt_data_ptr + PG_RGEN_EVAL_DISPLACE * sizeof(SbtRecord);
break;
case DEVICE_KERNEL_SHADER_EVAL_BACKGROUND:
pipeline = optix_device->pipelines[PIP_SHADE];
sbt_params.raygenRecord = sbt_data_ptr + PG_RGEN_EVAL_BACKGROUND * sizeof(SbtRecord);
break;
case DEVICE_KERNEL_SHADER_EVAL_CURVE_SHADOW_TRANSPARENCY:
pipeline = optix_device->pipelines[PIP_SHADE];
sbt_params.raygenRecord = sbt_data_ptr +
PG_RGEN_EVAL_CURVE_SHADOW_TRANSPARENCY * sizeof(SbtRecord);
break;
default:
LOG(ERROR) << "Invalid kernel " << device_kernel_as_string(kernel)
<< " is attempted to be enqueued.";
@@ -112,7 +164,7 @@ bool OptiXDeviceQueue::enqueue(DeviceKernel kernel,
sbt_params.missRecordBase = sbt_data_ptr + MISS_PROGRAM_GROUP_OFFSET * sizeof(SbtRecord);
sbt_params.missRecordStrideInBytes = sizeof(SbtRecord);
sbt_params.missRecordCount = NUM_MIS_PROGRAM_GROUPS;
sbt_params.missRecordCount = NUM_MISS_PROGRAM_GROUPS;
sbt_params.hitgroupRecordBase = sbt_data_ptr + HIT_PROGAM_GROUP_OFFSET * sizeof(SbtRecord);
sbt_params.hitgroupRecordStrideInBytes = sizeof(SbtRecord);
sbt_params.hitgroupRecordCount = NUM_HIT_PROGRAM_GROUPS;
@@ -120,6 +172,12 @@ bool OptiXDeviceQueue::enqueue(DeviceKernel kernel,
sbt_params.callablesRecordCount = NUM_CALLABLE_PROGRAM_GROUPS;
sbt_params.callablesRecordStrideInBytes = sizeof(SbtRecord);
# ifdef WITH_OSL
if (use_osl) {
sbt_params.callablesRecordCount += static_cast<unsigned int>(optix_device->osl_groups.size());
}
# endif
/* Launch the ray generation program. */
optix_device_assert(optix_device,
optixLaunch(pipeline,
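
The queue change above replaces a hard-coded kernel list with two predicates: intersection kernels always go through OptiX, and shading kernels do so only when OSL is in use. The sketch below shows that routing decision with a reduced, hypothetical kernel list. The `Kernel` and `Backend` enums and the two predicates are assumptions for illustration; the real `device_kernel_has_shading` and `device_kernel_has_intersection` cover more kernels.

```cpp
#include <cassert>

// Hypothetical, reduced kernel list for this sketch.
enum class Kernel { IntersectClosest, ShadeSurface, ShadeSurfaceRaytrace, FilmConvert };

enum class Backend { CUDA, OptiX };

// Assumed classification: kernels that trace rays.
bool kernel_has_intersection(Kernel k)
{
  return k == Kernel::IntersectClosest || k == Kernel::ShadeSurfaceRaytrace;
}

// Assumed classification: kernels that evaluate shaders.
bool kernel_has_shading(Kernel k)
{
  return k == Kernel::ShadeSurface || k == Kernel::ShadeSurfaceRaytrace;
}

// Shading must run in OptiX when OSL is used (OSL executes through direct
// callables); otherwise only ray-tracing kernels need the OptiX path.
Backend pick_backend(Kernel k, bool use_osl)
{
  if (use_osl && kernel_has_shading(k)) {
    return Backend::OptiX;
  }
  return kernel_has_intersection(k) ? Backend::OptiX : Backend::CUDA;
}
```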

@@ -8,7 +8,7 @@ set(INC
set(SRC
adaptive_sampling.cpp
denoiser.cpp
denoiser_device.cpp
denoiser_gpu.cpp
denoiser_oidn.cpp
denoiser_optix.cpp
path_trace.cpp
@@ -30,7 +30,7 @@ set(SRC
set(SRC_HEADERS
adaptive_sampling.h
denoiser.h
denoiser_device.h
denoiser_gpu.h
denoiser_oidn.h
denoiser_optix.h
path_trace.h

@@ -16,9 +16,11 @@ unique_ptr<Denoiser> Denoiser::create(Device *path_trace_device, const DenoisePa
{
DCHECK(params.use);
#ifdef WITH_OPTIX
if (params.type == DENOISER_OPTIX && Device::available_devices(DEVICE_MASK_OPTIX).size()) {
return make_unique<OptiXDenoiser>(path_trace_device, params);
}
#endif
/* Always fallback to OIDN. */
DenoiseParams oidn_params = params;

@@ -1,27 +0,0 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#pragma once
#include "integrator/denoiser.h"
#include "util/unique_ptr.h"
CCL_NAMESPACE_BEGIN
/* Denoiser which uses device-specific denoising implementation, such as OptiX denoiser which are
* implemented as a part of a driver of specific device.
*
* This implementation makes sure the to-be-denoised buffer is available on the denoising device
* and invoke denoising kernel via device API. */
class DeviceDenoiser : public Denoiser {
public:
DeviceDenoiser(Device *path_trace_device, const DenoiseParams &params);
~DeviceDenoiser();
virtual bool denoise_buffer(const BufferParams &buffer_params,
RenderBuffers *render_buffers,
const int num_samples,
bool allow_inplace_modification) override;
};
CCL_NAMESPACE_END

@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#include "integrator/denoiser_device.h"
#include "integrator/denoiser_gpu.h"
#include "device/denoise.h"
#include "device/device.h"
@@ -13,27 +13,27 @@
CCL_NAMESPACE_BEGIN
DeviceDenoiser::DeviceDenoiser(Device *path_trace_device, const DenoiseParams &params)
DenoiserGPU::DenoiserGPU(Device *path_trace_device, const DenoiseParams &params)
: Denoiser(path_trace_device, params)
{
}
DeviceDenoiser::~DeviceDenoiser()
DenoiserGPU::~DenoiserGPU()
{
/* Explicit implementation, to allow forward declaration of Device in the header. */
}
bool DeviceDenoiser::denoise_buffer(const BufferParams &buffer_params,
RenderBuffers *render_buffers,
const int num_samples,
bool allow_inplace_modification)
bool DenoiserGPU::denoise_buffer(const BufferParams &buffer_params,
RenderBuffers *render_buffers,
const int num_samples,
bool allow_inplace_modification)
{
Device *denoiser_device = get_denoiser_device();
if (!denoiser_device) {
return false;
}
DeviceDenoiseTask task;
DenoiseTask task;
task.params = params_;
task.num_samples = num_samples;
task.buffer_params = buffer_params;
@@ -50,8 +50,6 @@ bool DeviceDenoiser::denoise_buffer(const BufferParams &buffer_params,
else {
VLOG_WORK << "Creating temporary buffer on denoiser device.";
DeviceQueue *queue = denoiser_device->get_denoise_queue();
/* Create buffer which is available by the device used by denoiser. */
/* TODO(sergey): Optimize data transfers. For example, only copy denoising related passes,
@@ -70,13 +68,13 @@ bool DeviceDenoiser::denoise_buffer(const BufferParams &buffer_params,
render_buffers->buffer.data(),
sizeof(float) * local_render_buffers.buffer.size());
queue->copy_to_device(local_render_buffers.buffer);
denoiser_queue_->copy_to_device(local_render_buffers.buffer);
task.render_buffers = &local_render_buffers;
task.allow_inplace_modification = true;
}
const bool denoise_result = denoiser_device->denoise_buffer(task);
const bool denoise_result = denoise_buffer(task);
if (local_buffer_used) {
local_render_buffers.copy_from_device();
@@ -90,4 +88,21 @@ bool DeviceDenoiser::denoise_buffer(const BufferParams &buffer_params,
return denoise_result;
}
Device *DenoiserGPU::ensure_denoiser_device(Progress *progress)
{
Device *denoiser_device = Denoiser::ensure_denoiser_device(progress);
if (!denoiser_device) {
return nullptr;
}
if (!denoiser_queue_) {
denoiser_queue_ = denoiser_device->gpu_queue_create();
if (!denoiser_queue_) {
return nullptr;
}
}
return denoiser_device;
}
CCL_NAMESPACE_END
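
`DenoiserGPU::ensure_denoiser_device()` above creates the denoiser queue on first use and reports failure by returning a null device. The sketch below isolates that lazy-initialization pattern. `Queue`, `Device`, and `GpuDenoiser` here are hypothetical minimal types, not the Cycles classes.

```cpp
#include <cassert>
#include <memory>

// Hypothetical minimal types for this sketch.
struct Queue {};

struct Device {
  int queues_created = 0;
  bool can_create = true;

  // Returns null when queue creation fails.
  std::unique_ptr<Queue> gpu_queue_create()
  {
    if (!can_create) {
      return nullptr;
    }
    ++queues_created;
    return std::make_unique<Queue>();
  }
};

struct GpuDenoiser {
  Device *device = nullptr;
  std::unique_ptr<Queue> queue;

  // Returns the device only when both the device and its queue are ready.
  // The queue is created once and reused on later calls.
  Device *ensure_device()
  {
    if (!device) {
      return nullptr;
    }
    if (!queue) {
      queue = device->gpu_queue_create();
      if (!queue) {
        return nullptr;
      }
    }
    return device;
  }
};
```

Calling `ensure_device()` repeatedly creates at most one queue, which is what lets the later `copy_to_device` call in the diff use the member queue directly.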

@@ -0,0 +1,52 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#pragma once
#include "integrator/denoiser.h"
CCL_NAMESPACE_BEGIN
/* Implementation of Denoiser which uses a device-specific denoising implementation, running on a
* GPU device queue. It makes sure the to-be-denoised buffer is available on the denoising device
* and invokes denoising kernels via the device queue API. */
class DenoiserGPU : public Denoiser {
public:
DenoiserGPU(Device *path_trace_device, const DenoiseParams &params);
~DenoiserGPU();
virtual bool denoise_buffer(const BufferParams &buffer_params,
RenderBuffers *render_buffers,
const int num_samples,
bool allow_inplace_modification) override;
protected:
/* All the parameters needed to perform buffer denoising on a device.
* Is not really a task in its canonical terms (as in, is not an asynchronous running task). Is
* more like a wrapper for all the arguments and parameters needed to perform denoising. Is a
* single place where they are all listed, so that it's not required to modify all device methods
* when these parameters do change. */
class DenoiseTask {
public:
DenoiseParams params;
int num_samples;
RenderBuffers *render_buffers;
BufferParams buffer_params;
/* Allow to do in-place modification of the input passes (scaling them down i.e.). This will
* lower the memory footprint of the denoiser but will make input passes "invalid" (from path
* tracer) point of view. */
bool allow_inplace_modification;
};
/* Returns true if task is fully handled. */
virtual bool denoise_buffer(const DenoiseTask & /*task*/) = 0;
virtual Device *ensure_denoiser_device(Progress *progress) override;
unique_ptr<DeviceQueue> denoiser_queue_;
};
CCL_NAMESPACE_END

@@ -1,16 +1,216 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#include "integrator/denoiser_optix.h"
#ifdef WITH_OPTIX
#include "device/denoise.h"
#include "device/device.h"
# include "integrator/denoiser_optix.h"
# include "integrator/pass_accessor_gpu.h"
# include "device/optix/device_impl.h"
# include "device/optix/queue.h"
# include <optix_denoiser_tiling.h>
CCL_NAMESPACE_BEGIN
OptiXDenoiser::OptiXDenoiser(Device *path_trace_device, const DenoiseParams &params)
: DeviceDenoiser(path_trace_device, params)
# if OPTIX_ABI_VERSION >= 60
using ::optixUtilDenoiserInvokeTiled;
# else
// A minimal copy of functionality `optix_denoiser_tiling.h` which allows to fix integer overflow
// issues without bumping SDK or driver requirement.
//
// The original code is Copyright NVIDIA Corporation, BSD-3-Clause.
static OptixResult optixUtilDenoiserSplitImage(const OptixImage2D &input,
const OptixImage2D &output,
unsigned int overlapWindowSizeInPixels,
unsigned int tileWidth,
unsigned int tileHeight,
std::vector<OptixUtilDenoiserImageTile> &tiles)
{
if (tileWidth == 0 || tileHeight == 0)
return OPTIX_ERROR_INVALID_VALUE;
unsigned int inPixelStride = optixUtilGetPixelStride(input);
unsigned int outPixelStride = optixUtilGetPixelStride(output);
int inp_w = std::min(tileWidth + 2 * overlapWindowSizeInPixels, input.width);
int inp_h = std::min(tileHeight + 2 * overlapWindowSizeInPixels, input.height);
int inp_y = 0, copied_y = 0;
do {
int inputOffsetY = inp_y == 0 ? 0 :
std::max((int)overlapWindowSizeInPixels,
inp_h - ((int)input.height - inp_y));
int copy_y = inp_y == 0 ? std::min(input.height, tileHeight + overlapWindowSizeInPixels) :
std::min(tileHeight, input.height - copied_y);
int inp_x = 0, copied_x = 0;
do {
int inputOffsetX = inp_x == 0 ? 0 :
std::max((int)overlapWindowSizeInPixels,
inp_w - ((int)input.width - inp_x));
int copy_x = inp_x == 0 ? std::min(input.width, tileWidth + overlapWindowSizeInPixels) :
std::min(tileWidth, input.width - copied_x);
OptixUtilDenoiserImageTile tile;
tile.input.data = input.data + (size_t)(inp_y - inputOffsetY) * input.rowStrideInBytes +
+(size_t)(inp_x - inputOffsetX) * inPixelStride;
tile.input.width = inp_w;
tile.input.height = inp_h;
tile.input.rowStrideInBytes = input.rowStrideInBytes;
tile.input.pixelStrideInBytes = input.pixelStrideInBytes;
tile.input.format = input.format;
tile.output.data = output.data + (size_t)inp_y * output.rowStrideInBytes +
(size_t)inp_x * outPixelStride;
tile.output.width = copy_x;
tile.output.height = copy_y;
tile.output.rowStrideInBytes = output.rowStrideInBytes;
tile.output.pixelStrideInBytes = output.pixelStrideInBytes;
tile.output.format = output.format;
tile.inputOffsetX = inputOffsetX;
tile.inputOffsetY = inputOffsetY;
tiles.push_back(tile);
inp_x += inp_x == 0 ? tileWidth + overlapWindowSizeInPixels : tileWidth;
copied_x += copy_x;
} while (inp_x < static_cast<int>(input.width));
inp_y += inp_y == 0 ? tileHeight + overlapWindowSizeInPixels : tileHeight;
copied_y += copy_y;
} while (inp_y < static_cast<int>(input.height));
return OPTIX_SUCCESS;
}
static OptixResult optixUtilDenoiserInvokeTiled(OptixDenoiser denoiser,
CUstream stream,
const OptixDenoiserParams *params,
CUdeviceptr denoiserState,
size_t denoiserStateSizeInBytes,
const OptixDenoiserGuideLayer *guideLayer,
const OptixDenoiserLayer *layers,
unsigned int numLayers,
CUdeviceptr scratch,
size_t scratchSizeInBytes,
unsigned int overlapWindowSizeInPixels,
unsigned int tileWidth,
unsigned int tileHeight)
{
if (!guideLayer || !layers)
return OPTIX_ERROR_INVALID_VALUE;
std::vector<std::vector<OptixUtilDenoiserImageTile>> tiles(numLayers);
std::vector<std::vector<OptixUtilDenoiserImageTile>> prevTiles(numLayers);
for (unsigned int l = 0; l < numLayers; l++) {
if (const OptixResult res = ccl::optixUtilDenoiserSplitImage(layers[l].input,
layers[l].output,
overlapWindowSizeInPixels,
tileWidth,
tileHeight,
tiles[l]))
return res;
if (layers[l].previousOutput.data) {
OptixImage2D dummyOutput = layers[l].previousOutput;
if (const OptixResult res = ccl::optixUtilDenoiserSplitImage(layers[l].previousOutput,
dummyOutput,
overlapWindowSizeInPixels,
tileWidth,
tileHeight,
prevTiles[l]))
return res;
}
}
std::vector<OptixUtilDenoiserImageTile> albedoTiles;
if (guideLayer->albedo.data) {
OptixImage2D dummyOutput = guideLayer->albedo;
if (const OptixResult res = ccl::optixUtilDenoiserSplitImage(guideLayer->albedo,
dummyOutput,
overlapWindowSizeInPixels,
tileWidth,
tileHeight,
albedoTiles))
return res;
}
std::vector<OptixUtilDenoiserImageTile> normalTiles;
if (guideLayer->normal.data) {
OptixImage2D dummyOutput = guideLayer->normal;
if (const OptixResult res = ccl::optixUtilDenoiserSplitImage(guideLayer->normal,
dummyOutput,
overlapWindowSizeInPixels,
tileWidth,
tileHeight,
normalTiles))
return res;
}
std::vector<OptixUtilDenoiserImageTile> flowTiles;
if (guideLayer->flow.data) {
OptixImage2D dummyOutput = guideLayer->flow;
if (const OptixResult res = ccl::optixUtilDenoiserSplitImage(guideLayer->flow,
dummyOutput,
overlapWindowSizeInPixels,
tileWidth,
tileHeight,
flowTiles))
return res;
}
for (size_t t = 0; t < tiles[0].size(); t++) {
std::vector<OptixDenoiserLayer> tlayers;
for (unsigned int l = 0; l < numLayers; l++) {
OptixDenoiserLayer layer = {};
layer.input = (tiles[l])[t].input;
layer.output = (tiles[l])[t].output;
if (layers[l].previousOutput.data)
layer.previousOutput = (prevTiles[l])[t].input;
tlayers.push_back(layer);
}
OptixDenoiserGuideLayer gl = {};
if (guideLayer->albedo.data)
gl.albedo = albedoTiles[t].input;
if (guideLayer->normal.data)
gl.normal = normalTiles[t].input;
if (guideLayer->flow.data)
gl.flow = flowTiles[t].input;
if (const OptixResult res = optixDenoiserInvoke(denoiser,
stream,
params,
denoiserState,
denoiserStateSizeInBytes,
&gl,
&tlayers[0],
numLayers,
(tiles[0])[t].inputOffsetX,
(tiles[0])[t].inputOffsetY,
scratch,
scratchSizeInBytes))
return res;
}
return OPTIX_SUCCESS;
}
# endif
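
The fallback `optixUtilDenoiserSplitImage` above walks the image in tiles whose input windows overlap, and casts offsets to `size_t` before multiplying by the stride. The sketch below reduces the same loop to one dimension so the window arithmetic is easier to follow. `Tile1D` and `split_1d` are hypothetical names, and `pixel_stride` is assumed to be positive.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One tile along a single axis (hypothetical type for this sketch).
struct Tile1D {
  int input_start;   // first input pixel read, including overlap
  int input_size;    // size of the input window
  int output_start;  // first output pixel written
  int output_size;   // number of output pixels written
  int input_offset;  // where the output begins inside the input window
  std::size_t input_byte_offset;  // computed in size_t to avoid int overflow
};

std::vector<Tile1D> split_1d(int width, int tile, int overlap, int pixel_stride)
{
  std::vector<Tile1D> tiles;
  if (width <= 0 || tile <= 0 || overlap < 0) {
    return tiles;
  }
  const int inp_w = std::min(tile + 2 * overlap, width);
  int inp_x = 0, copied = 0;
  do {
    // The first tile has no left overlap; later tiles shift their window
    // left so it never reads past the end of the image.
    const int offset = inp_x == 0 ? 0 : std::max(overlap, inp_w - (width - inp_x));
    const int copy = inp_x == 0 ? std::min(width, tile + overlap) :
                                  std::min(tile, width - copied);
    Tile1D t;
    t.input_start = inp_x - offset;
    t.input_size = inp_w;
    t.output_start = inp_x;
    t.output_size = copy;
    t.input_offset = offset;
    t.input_byte_offset = static_cast<std::size_t>(t.input_start) *
                          static_cast<std::size_t>(pixel_stride);
    tiles.push_back(t);
    inp_x += inp_x == 0 ? tile + overlap : tile;
    copied += copy;
  } while (inp_x < width);
  return tiles;
}
```

For a width of 100, a tile size of 40 and an overlap of 8, this produces three tiles with output ranges [0, 48), [48, 88) and [88, 100), each reading a 56-pixel input window that stays inside the image.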
OptiXDenoiser::OptiXDenoiser(Device *path_trace_device, const DenoiseParams &params)
: DenoiserGPU(path_trace_device, params), state_(path_trace_device, "__denoiser_state", true)
{
}
OptiXDenoiser::~OptiXDenoiser()
{
/* It is important that the OptixDenoiser handle is destroyed before the OptixDeviceContext
* handle, which is guaranteed since the local denoising device owning the OptiX device context
* is deleted as part of the Denoiser class destructor call after this. */
if (optix_denoiser_ != nullptr) {
optixDenoiserDestroy(optix_denoiser_);
}
}
uint OptiXDenoiser::get_device_type_mask() const
@@ -18,4 +218,569 @@ uint OptiXDenoiser::get_device_type_mask() const
return DEVICE_MASK_OPTIX;
}
class OptiXDenoiser::DenoiseContext {
public:
explicit DenoiseContext(OptiXDevice *device, const DenoiseTask &task)
: denoise_params(task.params),
render_buffers(task.render_buffers),
buffer_params(task.buffer_params),
guiding_buffer(device, "denoiser guiding passes buffer", true),
num_samples(task.num_samples)
{
num_input_passes = 1;
if (denoise_params.use_pass_albedo) {
num_input_passes += 1;
use_pass_albedo = true;
pass_denoising_albedo = buffer_params.get_pass_offset(PASS_DENOISING_ALBEDO);
if (denoise_params.use_pass_normal) {
num_input_passes += 1;
use_pass_normal = true;
pass_denoising_normal = buffer_params.get_pass_offset(PASS_DENOISING_NORMAL);
}
}
if (denoise_params.temporally_stable) {
prev_output.device_pointer = render_buffers->buffer.device_pointer;
prev_output.offset = buffer_params.get_pass_offset(PASS_DENOISING_PREVIOUS);
prev_output.stride = buffer_params.stride;
prev_output.pass_stride = buffer_params.pass_stride;
num_input_passes += 1;
use_pass_motion = true;
pass_motion = buffer_params.get_pass_offset(PASS_MOTION);
}
use_guiding_passes = (num_input_passes - 1) > 0;
if (use_guiding_passes) {
if (task.allow_inplace_modification) {
guiding_params.device_pointer = render_buffers->buffer.device_pointer;
guiding_params.pass_albedo = pass_denoising_albedo;
guiding_params.pass_normal = pass_denoising_normal;
guiding_params.pass_flow = pass_motion;
guiding_params.stride = buffer_params.stride;
guiding_params.pass_stride = buffer_params.pass_stride;
}
else {
guiding_params.pass_stride = 0;
if (use_pass_albedo) {
guiding_params.pass_albedo = guiding_params.pass_stride;
guiding_params.pass_stride += 3;
}
if (use_pass_normal) {
guiding_params.pass_normal = guiding_params.pass_stride;
guiding_params.pass_stride += 3;
}
if (use_pass_motion) {
guiding_params.pass_flow = guiding_params.pass_stride;
guiding_params.pass_stride += 2;
}
guiding_params.stride = buffer_params.width;
guiding_buffer.alloc_to_device(buffer_params.width * buffer_params.height *
guiding_params.pass_stride);
guiding_params.device_pointer = guiding_buffer.device_pointer;
}
}
pass_sample_count = buffer_params.get_pass_offset(PASS_SAMPLE_COUNT);
}
const DenoiseParams &denoise_params;
RenderBuffers *render_buffers = nullptr;
const BufferParams &buffer_params;
/* Previous output. */
struct {
device_ptr device_pointer = 0;
int offset = PASS_UNUSED;
int stride = -1;
int pass_stride = -1;
} prev_output;
/* Device-side storage of the guiding passes. */
device_only_memory<float> guiding_buffer;
struct {
device_ptr device_pointer = 0;
/* NOTE: These are only initialized when the corresponding guiding pass is enabled. */
int pass_albedo = PASS_UNUSED;
int pass_normal = PASS_UNUSED;
int pass_flow = PASS_UNUSED;
int stride = -1;
int pass_stride = -1;
} guiding_params;
/* Number of input passes. Including the color and extra auxiliary passes. */
int num_input_passes = 0;
bool use_guiding_passes = false;
bool use_pass_albedo = false;
bool use_pass_normal = false;
bool use_pass_motion = false;
int num_samples = 0;
int pass_sample_count = PASS_UNUSED;
/* NOTE: These are only initialized when the corresponding guiding pass is enabled. */
int pass_denoising_albedo = PASS_UNUSED;
int pass_denoising_normal = PASS_UNUSED;
int pass_motion = PASS_UNUSED;
/* For passes which don't need the albedo channel for denoising, the actual albedo is replaced
* with (0.5, 0.5, 0.5). This flag indicates that the real albedo pass has been replaced with
* the fake values, so passes which do need albedo can no longer be denoised. */
bool albedo_replaced_with_fake = false;
};
class OptiXDenoiser::DenoisePass {
public:
DenoisePass(const PassType type, const BufferParams &buffer_params) : type(type)
{
noisy_offset = buffer_params.get_pass_offset(type, PassMode::NOISY);
denoised_offset = buffer_params.get_pass_offset(type, PassMode::DENOISED);
const PassInfo pass_info = Pass::get_info(type);
num_components = pass_info.num_components;
use_compositing = pass_info.use_compositing;
use_denoising_albedo = pass_info.use_denoising_albedo;
}
PassType type;
int noisy_offset;
int denoised_offset;
int num_components;
bool use_compositing;
bool use_denoising_albedo;
};
bool OptiXDenoiser::denoise_buffer(const DenoiseTask &task)
{
OptiXDevice *const optix_device = static_cast<OptiXDevice *>(denoiser_device_);
const CUDAContextScope scope(optix_device);
DenoiseContext context(optix_device, task);
if (!denoise_ensure(context)) {
return false;
}
if (!denoise_filter_guiding_preprocess(context)) {
LOG(ERROR) << "Error preprocessing guiding passes.";
return false;
}
/* Passes which will use real albedo when it is available. */
denoise_pass(context, PASS_COMBINED);
denoise_pass(context, PASS_SHADOW_CATCHER_MATTE);
/* Passes which do not need albedo: if the real albedo is present, it is replaced with the fake one first. */
denoise_pass(context, PASS_SHADOW_CATCHER);
return true;
}
bool OptiXDenoiser::denoise_filter_guiding_preprocess(const DenoiseContext &context)
{
const BufferParams &buffer_params = context.buffer_params;
const int work_size = buffer_params.width * buffer_params.height;
DeviceKernelArguments args(&context.guiding_params.device_pointer,
&context.guiding_params.pass_stride,
&context.guiding_params.pass_albedo,
&context.guiding_params.pass_normal,
&context.guiding_params.pass_flow,
&context.render_buffers->buffer.device_pointer,
&buffer_params.offset,
&buffer_params.stride,
&buffer_params.pass_stride,
&context.pass_sample_count,
&context.pass_denoising_albedo,
&context.pass_denoising_normal,
&context.pass_motion,
&buffer_params.full_x,
&buffer_params.full_y,
&buffer_params.width,
&buffer_params.height,
&context.num_samples);
return denoiser_queue_->enqueue(DEVICE_KERNEL_FILTER_GUIDING_PREPROCESS, work_size, args);
}
bool OptiXDenoiser::denoise_filter_guiding_set_fake_albedo(const DenoiseContext &context)
{
const BufferParams &buffer_params = context.buffer_params;
const int work_size = buffer_params.width * buffer_params.height;
DeviceKernelArguments args(&context.guiding_params.device_pointer,
&context.guiding_params.pass_stride,
&context.guiding_params.pass_albedo,
&buffer_params.width,
&buffer_params.height);
return denoiser_queue_->enqueue(DEVICE_KERNEL_FILTER_GUIDING_SET_FAKE_ALBEDO, work_size, args);
}
void OptiXDenoiser::denoise_pass(DenoiseContext &context, PassType pass_type)
{
const BufferParams &buffer_params = context.buffer_params;
const DenoisePass pass(pass_type, buffer_params);
if (pass.noisy_offset == PASS_UNUSED) {
return;
}
if (pass.denoised_offset == PASS_UNUSED) {
LOG(DFATAL) << "Missing denoised pass " << pass_type_as_string(pass_type);
return;
}
if (pass.use_denoising_albedo) {
if (context.albedo_replaced_with_fake) {
LOG(ERROR) << "Pass which requires albedo is denoised after fake albedo has been set.";
return;
}
}
else if (context.use_guiding_passes && !context.albedo_replaced_with_fake) {
context.albedo_replaced_with_fake = true;
if (!denoise_filter_guiding_set_fake_albedo(context)) {
LOG(ERROR) << "Error replacing real albedo with the fake one.";
return;
}
}
/* Read and preprocess noisy color input pass. */
denoise_color_read(context, pass);
if (!denoise_filter_color_preprocess(context, pass)) {
LOG(ERROR) << "Error converting denoising passes to RGB buffer.";
return;
}
if (!denoise_run(context, pass)) {
LOG(ERROR) << "Error running OptiX denoiser.";
return;
}
/* Store result in the combined pass of the render buffer.
*
* This will scale the denoiser result up to match the number of, possibly per-pixel, samples. */
if (!denoise_filter_color_postprocess(context, pass)) {
LOG(ERROR) << "Error copying denoiser result to the denoised pass.";
return;
}
denoiser_queue_->synchronize();
}
void OptiXDenoiser::denoise_color_read(const DenoiseContext &context, const DenoisePass &pass)
{
PassAccessor::PassAccessInfo pass_access_info;
pass_access_info.type = pass.type;
pass_access_info.mode = PassMode::NOISY;
pass_access_info.offset = pass.noisy_offset;
/* Denoiser operates on passes which are used to calculate the approximation, and is never used
* on the approximation. The latter is not even possible because OptiX does not support
* denoising of semi-transparent pixels. */
pass_access_info.use_approximate_shadow_catcher = false;
pass_access_info.use_approximate_shadow_catcher_background = false;
pass_access_info.show_active_pixels = false;
/* TODO(sergey): Consider adding support of actual exposure, to avoid clamping in extreme cases.
*/
const PassAccessorGPU pass_accessor(
denoiser_queue_.get(), pass_access_info, 1.0f, context.num_samples);
PassAccessor::Destination destination(pass_access_info.type);
destination.d_pixels = context.render_buffers->buffer.device_pointer +
pass.denoised_offset * sizeof(float);
destination.num_components = 3;
destination.pixel_stride = context.buffer_params.pass_stride;
BufferParams buffer_params = context.buffer_params;
buffer_params.window_x = 0;
buffer_params.window_y = 0;
buffer_params.window_width = buffer_params.width;
buffer_params.window_height = buffer_params.height;
pass_accessor.get_render_tile_pixels(context.render_buffers, buffer_params, destination);
}
bool OptiXDenoiser::denoise_filter_color_preprocess(const DenoiseContext &context,
const DenoisePass &pass)
{
const BufferParams &buffer_params = context.buffer_params;
const int work_size = buffer_params.width * buffer_params.height;
DeviceKernelArguments args(&context.render_buffers->buffer.device_pointer,
&buffer_params.full_x,
&buffer_params.full_y,
&buffer_params.width,
&buffer_params.height,
&buffer_params.offset,
&buffer_params.stride,
&buffer_params.pass_stride,
&pass.denoised_offset);
return denoiser_queue_->enqueue(DEVICE_KERNEL_FILTER_COLOR_PREPROCESS, work_size, args);
}
bool OptiXDenoiser::denoise_filter_color_postprocess(const DenoiseContext &context,
const DenoisePass &pass)
{
const BufferParams &buffer_params = context.buffer_params;
const int work_size = buffer_params.width * buffer_params.height;
DeviceKernelArguments args(&context.render_buffers->buffer.device_pointer,
&buffer_params.full_x,
&buffer_params.full_y,
&buffer_params.width,
&buffer_params.height,
&buffer_params.offset,
&buffer_params.stride,
&buffer_params.pass_stride,
&context.num_samples,
&pass.noisy_offset,
&pass.denoised_offset,
&context.pass_sample_count,
&pass.num_components,
&pass.use_compositing);
return denoiser_queue_->enqueue(DEVICE_KERNEL_FILTER_COLOR_POSTPROCESS, work_size, args);
}
bool OptiXDenoiser::denoise_ensure(DenoiseContext &context)
{
if (!denoise_create_if_needed(context)) {
LOG(ERROR) << "OptiX denoiser creation has failed.";
return false;
}
if (!denoise_configure_if_needed(context)) {
LOG(ERROR) << "OptiX denoiser configuration has failed.";
return false;
}
return true;
}
bool OptiXDenoiser::denoise_create_if_needed(DenoiseContext &context)
{
const bool recreate_denoiser = (optix_denoiser_ == nullptr) ||
(use_pass_albedo_ != context.use_pass_albedo) ||
(use_pass_normal_ != context.use_pass_normal) ||
(use_pass_motion_ != context.use_pass_motion);
if (!recreate_denoiser) {
return true;
}
/* Destroy existing handle before creating new one. */
if (optix_denoiser_) {
optixDenoiserDestroy(optix_denoiser_);
}
/* Create OptiX denoiser handle on demand when it is first used. */
OptixDenoiserOptions denoiser_options = {};
denoiser_options.guideAlbedo = context.use_pass_albedo;
denoiser_options.guideNormal = context.use_pass_normal;
OptixDenoiserModelKind model = OPTIX_DENOISER_MODEL_KIND_HDR;
if (context.use_pass_motion) {
model = OPTIX_DENOISER_MODEL_KIND_TEMPORAL;
}
const OptixResult result = optixDenoiserCreate(
static_cast<OptiXDevice *>(denoiser_device_)->context,
model,
&denoiser_options,
&optix_denoiser_);
if (result != OPTIX_SUCCESS) {
denoiser_device_->set_error("Failed to create OptiX denoiser");
return false;
}
/* OptiX denoiser handle was created with the requested number of input passes. */
use_pass_albedo_ = context.use_pass_albedo;
use_pass_normal_ = context.use_pass_normal;
use_pass_motion_ = context.use_pass_motion;
/* OptiX denoiser has been created, but it needs configuration. */
is_configured_ = false;
return true;
}
bool OptiXDenoiser::denoise_configure_if_needed(DenoiseContext &context)
{
/* Limit maximum tile size denoiser can be invoked with. */
const int2 tile_size = make_int2(min(context.buffer_params.width, 4096),
min(context.buffer_params.height, 4096));
if (is_configured_ && (configured_size_.x == tile_size.x && configured_size_.y == tile_size.y)) {
return true;
}
optix_device_assert(
denoiser_device_,
optixDenoiserComputeMemoryResources(optix_denoiser_, tile_size.x, tile_size.y, &sizes_));
/* Allocate denoiser state if tile size has changed since last setup. */
state_.device = denoiser_device_;
state_.alloc_to_device(sizes_.stateSizeInBytes + sizes_.withOverlapScratchSizeInBytes);
/* Initialize denoiser state for the current tile size. */
const OptixResult result = optixDenoiserSetup(
optix_denoiser_,
0, /* Work around bug in r495 drivers that causes artifacts when denoiser setup is called
* on a stream that is not the default stream. */
tile_size.x + sizes_.overlapWindowSizeInPixels * 2,
tile_size.y + sizes_.overlapWindowSizeInPixels * 2,
state_.device_pointer,
sizes_.stateSizeInBytes,
state_.device_pointer + sizes_.stateSizeInBytes,
sizes_.withOverlapScratchSizeInBytes);
if (result != OPTIX_SUCCESS) {
denoiser_device_->set_error("Failed to set up OptiX denoiser");
return false;
}
cuda_device_assert(denoiser_device_, cuCtxSynchronize());
is_configured_ = true;
configured_size_ = tile_size;
return true;
}
bool OptiXDenoiser::denoise_run(const DenoiseContext &context, const DenoisePass &pass)
{
const BufferParams &buffer_params = context.buffer_params;
const int width = buffer_params.width;
const int height = buffer_params.height;
/* Set up input and output layer information. */
OptixImage2D color_layer = {0};
OptixImage2D albedo_layer = {0};
OptixImage2D normal_layer = {0};
OptixImage2D flow_layer = {0};
OptixImage2D output_layer = {0};
OptixImage2D prev_output_layer = {0};
/* Color pass. */
{
const int pass_denoised = pass.denoised_offset;
const int64_t pass_stride_in_bytes = context.buffer_params.pass_stride * sizeof(float);
color_layer.data = context.render_buffers->buffer.device_pointer +
pass_denoised * sizeof(float);
color_layer.width = width;
color_layer.height = height;
color_layer.rowStrideInBytes = pass_stride_in_bytes * context.buffer_params.stride;
color_layer.pixelStrideInBytes = pass_stride_in_bytes;
color_layer.format = OPTIX_PIXEL_FORMAT_FLOAT3;
}
/* Previous output. */
if (context.prev_output.offset != PASS_UNUSED) {
const int64_t pass_stride_in_bytes = context.prev_output.pass_stride * sizeof(float);
prev_output_layer.data = context.prev_output.device_pointer +
context.prev_output.offset * sizeof(float);
prev_output_layer.width = width;
prev_output_layer.height = height;
prev_output_layer.rowStrideInBytes = pass_stride_in_bytes * context.prev_output.stride;
prev_output_layer.pixelStrideInBytes = pass_stride_in_bytes;
prev_output_layer.format = OPTIX_PIXEL_FORMAT_FLOAT3;
}
/* Optional albedo, normal and flow guiding passes. */
if (context.num_input_passes > 1) {
const device_ptr d_guiding_buffer = context.guiding_params.device_pointer;
const int64_t pixel_stride_in_bytes = context.guiding_params.pass_stride * sizeof(float);
const int64_t row_stride_in_bytes = context.guiding_params.stride * pixel_stride_in_bytes;
if (context.use_pass_albedo) {
albedo_layer.data = d_guiding_buffer + context.guiding_params.pass_albedo * sizeof(float);
albedo_layer.width = width;
albedo_layer.height = height;
albedo_layer.rowStrideInBytes = row_stride_in_bytes;
albedo_layer.pixelStrideInBytes = pixel_stride_in_bytes;
albedo_layer.format = OPTIX_PIXEL_FORMAT_FLOAT3;
}
if (context.use_pass_normal) {
normal_layer.data = d_guiding_buffer + context.guiding_params.pass_normal * sizeof(float);
normal_layer.width = width;
normal_layer.height = height;
normal_layer.rowStrideInBytes = row_stride_in_bytes;
normal_layer.pixelStrideInBytes = pixel_stride_in_bytes;
normal_layer.format = OPTIX_PIXEL_FORMAT_FLOAT3;
}
if (context.use_pass_motion) {
flow_layer.data = d_guiding_buffer + context.guiding_params.pass_flow * sizeof(float);
flow_layer.width = width;
flow_layer.height = height;
flow_layer.rowStrideInBytes = row_stride_in_bytes;
flow_layer.pixelStrideInBytes = pixel_stride_in_bytes;
flow_layer.format = OPTIX_PIXEL_FORMAT_FLOAT2;
}
}
/* Denoise in-place of the noisy input in the render buffers. */
output_layer = color_layer;
OptixDenoiserGuideLayer guide_layers = {};
guide_layers.albedo = albedo_layer;
guide_layers.normal = normal_layer;
guide_layers.flow = flow_layer;
OptixDenoiserLayer image_layers = {};
image_layers.input = color_layer;
image_layers.previousOutput = prev_output_layer;
image_layers.output = output_layer;
/* Finally run denoising. */
OptixDenoiserParams params = {}; /* All parameters are disabled/zero. */
optix_device_assert(denoiser_device_,
ccl::optixUtilDenoiserInvokeTiled(
optix_denoiser_,
static_cast<OptiXDeviceQueue *>(denoiser_queue_.get())->stream(),
&params,
state_.device_pointer,
sizes_.stateSizeInBytes,
&guide_layers,
&image_layers,
1,
state_.device_pointer + sizes_.stateSizeInBytes,
sizes_.withOverlapScratchSizeInBytes,
sizes_.overlapWindowSizeInPixels,
configured_size_.x,
configured_size_.y));
return true;
}
CCL_NAMESPACE_END
#endif
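
As a reading aid for the non-in-place branch of `DenoiseContext` and the stride setup in `denoise_run`, here is a minimal standalone sketch of how a packed guiding buffer layout and the byte strides handed to an image descriptor could be computed. `DemoGuidingLayout`, `demo_compute_guiding_layout` and `kDemoPassUnused` are hypothetical names used only for illustration; they are not part of Cycles or OptiX. The sketch takes the enable flags as given and does not reproduce the nesting of the normal check inside the albedo check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for PASS_UNUSED; the real constant is defined by Cycles.
constexpr int kDemoPassUnused = -1;

// Hypothetical mirror of the guiding_params fields used in the diff.
struct DemoGuidingLayout {
  int pass_albedo = kDemoPassUnused;
  int pass_normal = kDemoPassUnused;
  int pass_flow = kDemoPassUnused;
  int pass_stride = 0; /* Floats per pixel. */
  int stride = 0;      /* Pixels per row. */
};

// Pack the enabled passes back to back: albedo (3 floats), normal (3 floats), flow (2 floats).
inline DemoGuidingLayout demo_compute_guiding_layout(bool use_albedo,
                                                     bool use_normal,
                                                     bool use_motion,
                                                     int width)
{
  assert(width > 0);
  DemoGuidingLayout layout;
  if (use_albedo) {
    layout.pass_albedo = layout.pass_stride;
    layout.pass_stride += 3;
  }
  if (use_normal) {
    layout.pass_normal = layout.pass_stride;
    layout.pass_stride += 3;
  }
  if (use_motion) {
    layout.pass_flow = layout.pass_stride;
    layout.pass_stride += 2;
  }
  layout.stride = width;
  return layout;
}

// Distance in bytes between two consecutive pixels of the same pass.
inline int64_t demo_pixel_stride_in_bytes(const DemoGuidingLayout &layout)
{
  return static_cast<int64_t>(layout.pass_stride) * static_cast<int64_t>(sizeof(float));
}

// Distance in bytes between two consecutive rows.
inline int64_t demo_row_stride_in_bytes(const DemoGuidingLayout &layout)
{
  return static_cast<int64_t>(layout.stride) * demo_pixel_stride_in_bytes(layout);
}

// Number of floats to allocate for the whole packed guiding buffer.
inline std::size_t demo_guiding_buffer_num_floats(const DemoGuidingLayout &layout,
                                                  int width,
                                                  int height)
{
  return static_cast<std::size_t>(width) * static_cast<std::size_t>(height) *
         static_cast<std::size_t>(layout.pass_stride);
}
```

With all three passes enabled, the passes interleave per pixel, so each image descriptor points at a different float offset inside the same buffer while sharing one pixel stride and one row stride.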


@@ -3,16 +3,84 @@
#pragma once
#include "integrator/denoiser_device.h"
#ifdef WITH_OPTIX
# include "integrator/denoiser_gpu.h"
# include "device/optix/util.h"
CCL_NAMESPACE_BEGIN
class OptiXDenoiser : public DeviceDenoiser {
/* Implementation of denoising API which uses the OptiX denoiser. */
class OptiXDenoiser : public DenoiserGPU {
public:
OptiXDenoiser(Device *path_trace_device, const DenoiseParams &params);
~OptiXDenoiser();
protected:
virtual uint get_device_type_mask() const override;
private:
class DenoiseContext;
class DenoisePass;
virtual bool denoise_buffer(const DenoiseTask &task) override;
/* Read guiding passes from the render buffers, preprocess them in a way which is expected by
* OptiX and store in the guiding passes memory within the given context.
*
* Pre-processing of the guiding passes must happen only once per context lifetime. Do not
* preprocess them for every pass which is being denoised. */
bool denoise_filter_guiding_preprocess(const DenoiseContext &context);
/* Set fake albedo pixels in the albedo guiding pass storage.
* After this point only passes which do not need albedo for denoising can be processed. */
bool denoise_filter_guiding_set_fake_albedo(const DenoiseContext &context);
void denoise_pass(DenoiseContext &context, PassType pass_type);
/* Read input color pass from the render buffer into the memory which corresponds to the noisy
* input within the given context. Pixels are scaled to the number of samples, but are not
* preprocessed yet. */
void denoise_color_read(const DenoiseContext &context, const DenoisePass &pass);
/* Run corresponding filter kernels, preparing data for the denoiser or copying data from the
* denoiser result to the render buffer. */
bool denoise_filter_color_preprocess(const DenoiseContext &context, const DenoisePass &pass);
bool denoise_filter_color_postprocess(const DenoiseContext &context, const DenoisePass &pass);
/* Make sure the OptiX denoiser is created and configured. */
bool denoise_ensure(DenoiseContext &context);
/* Create OptiX denoiser descriptor if needed.
* Will do nothing if the current OptiX descriptor is usable for the given parameters.
* If the OptiX denoiser descriptor was re-created here, it is left unconfigured. */
bool denoise_create_if_needed(DenoiseContext &context);
/* Configure the existing OptiX denoiser descriptor for use with the given task. */
bool denoise_configure_if_needed(DenoiseContext &context);
/* Run configured denoiser. */
bool denoise_run(const DenoiseContext &context, const DenoisePass &pass);
OptixDenoiser optix_denoiser_ = nullptr;
/* Configuration size, as provided to `optixDenoiserSetup`.
* If `optixDenoiserSetup()` was never called on the current `optix_denoiser_`, then
* `is_configured_` is false. */
bool is_configured_ = false;
int2 configured_size_ = make_int2(0, 0);
/* OptiX denoiser state and scratch buffers, stored in a single memory buffer.
* The memory layout is as follows: [denoiser state][scratch buffer]. */
device_only_memory<unsigned char> state_;
OptixDenoiserSizes sizes_ = {};
bool use_pass_albedo_ = false;
bool use_pass_normal_ = false;
bool use_pass_motion_ = false;
};
CCL_NAMESPACE_END
#endif
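
The header comments describe a single `[denoiser state][scratch buffer]` allocation and a configured size that is cached between calls. The sketch below shows that bookkeeping in isolation, assuming the same 4096-pixel tile limit used in `denoise_configure_if_needed`. All `Demo*` and `demo_*` names are hypothetical; the struct only mirrors the `OptixDenoiserSizes` fields referenced in this diff and makes no OptiX calls.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical mirror of the OptixDenoiserSizes fields referenced in the diff.
struct DemoDenoiserSizes {
  std::size_t state_size_in_bytes;
  std::size_t with_overlap_scratch_size_in_bytes;
  int overlap_window_size_in_pixels;
};

// Offsets into one allocation laid out as [denoiser state][scratch buffer].
struct DemoStateLayout {
  std::size_t total_size_in_bytes;
  std::size_t state_offset;
  std::size_t scratch_offset;
};

inline DemoStateLayout demo_state_layout(const DemoDenoiserSizes &sizes)
{
  DemoStateLayout layout;
  layout.state_offset = 0;
  layout.scratch_offset = sizes.state_size_in_bytes;
  layout.total_size_in_bytes = sizes.state_size_in_bytes +
                               sizes.with_overlap_scratch_size_in_bytes;
  return layout;
}

struct DemoSize2 {
  int x;
  int y;
};

// Limit the tile size the denoiser is invoked with.
inline DemoSize2 demo_clamp_tile_size(int width, int height, int max_tile = 4096)
{
  assert(width > 0 && height > 0 && max_tile > 0);
  return {std::min(width, max_tile), std::min(height, max_tile)};
}

// Size passed to setup: the tile plus the overlap window on both sides.
inline DemoSize2 demo_setup_size(DemoSize2 tile, int overlap)
{
  return {tile.x + overlap * 2, tile.y + overlap * 2};
}

// Reconfiguration is needed unless a setup with the same tile size already happened.
inline bool demo_needs_configure(bool is_configured, DemoSize2 configured, DemoSize2 requested)
{
  return !(is_configured && configured.x == requested.x && configured.y == requested.y);
}
```

Keeping both regions in one buffer means a single allocation per configuration; the scratch pointer is then just the base pointer plus the state size.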


@@ -37,6 +37,14 @@ set(SRC_KERNEL_DEVICE_OPTIX
device/optix/kernel_shader_raytrace.cu
)
if(WITH_CYCLES_OSL AND (OSL_LIBRARY_VERSION_MINOR GREATER_EQUAL 13 OR OSL_LIBRARY_VERSION_MAJOR GREATER 1))
set(SRC_KERNEL_DEVICE_OPTIX
${SRC_KERNEL_DEVICE_OPTIX}
osl/services_optix.cu
device/optix/kernel_osl.cu
)
endif()
set(SRC_KERNEL_DEVICE_ONEAPI
device/oneapi/kernel.cpp
)
@@ -181,6 +189,16 @@ set(SRC_KERNEL_SVM_HEADERS
svm/vertex_color.h
)
if(WITH_CYCLES_OSL)
set(SRC_KERNEL_OSL_HEADERS
osl/osl.h
osl/closures_setup.h
osl/closures_template.h
osl/services_gpu.h
osl/types.h
)
endif()
set(SRC_KERNEL_GEOM_HEADERS
geom/geom.h
geom/attribute.h
@@ -267,10 +285,17 @@ set(SRC_KERNEL_INTEGRATOR_HEADERS
)
set(SRC_KERNEL_LIGHT_HEADERS
light/light.h
light/area.h
light/background.h
light/common.h
light/distant.h
light/distribution.h
light/light.h
light/point.h
light/sample.h
light/spot.h
light/tree.h
light/triangle.h
)
set(SRC_KERNEL_SAMPLE_HEADERS
@@ -306,6 +331,7 @@ set(SRC_KERNEL_HEADERS
${SRC_KERNEL_GEOM_HEADERS}
${SRC_KERNEL_INTEGRATOR_HEADERS}
${SRC_KERNEL_LIGHT_HEADERS}
${SRC_KERNEL_OSL_HEADERS}
${SRC_KERNEL_SAMPLE_HEADERS}
${SRC_KERNEL_SVM_HEADERS}
${SRC_KERNEL_TYPES_HEADERS}
@@ -328,6 +354,7 @@ set(SRC_UTIL_HEADERS
../util/math_int2.h
../util/math_int3.h
../util/math_int4.h
../util/math_int8.h
../util/math_matrix.h
../util/projection.h
../util/rect.h
@@ -350,6 +377,8 @@ set(SRC_UTIL_HEADERS
../util/types_int3_impl.h
../util/types_int4.h
../util/types_int4_impl.h
../util/types_int8.h
../util/types_int8_impl.h
../util/types_spectrum.h
../util/types_uchar2.h
../util/types_uchar2_impl.h
@@ -444,6 +473,7 @@ if(WITH_CYCLES_CUDA_BINARIES)
if(WITH_CYCLES_DEBUG)
set(cuda_flags ${cuda_flags} -D WITH_CYCLES_DEBUG)
set(cuda_flags ${cuda_flags} --ptxas-options="-v")
endif()
set(_cuda_nvcc_args
@@ -451,7 +481,6 @@ if(WITH_CYCLES_CUDA_BINARIES)
${CUDA_NVCC_FLAGS}
--${format}
${CMAKE_CURRENT_SOURCE_DIR}${cuda_kernel_src}
--ptxas-options="-v"
${cuda_flags})
if(WITH_COMPILER_CCACHE AND CCACHE_PROGRAM)
@@ -660,6 +689,16 @@ if(WITH_CYCLES_DEVICE_OPTIX AND WITH_CYCLES_CUDA_BINARIES)
kernel_optix_shader_raytrace
"device/optix/kernel_shader_raytrace.cu"
"--keep-device-functions")
if(WITH_CYCLES_OSL AND (OSL_LIBRARY_VERSION_MINOR GREATER_EQUAL 13 OR OSL_LIBRARY_VERSION_MAJOR GREATER 1))
CYCLES_OPTIX_KERNEL_ADD(
kernel_optix_osl
"device/optix/kernel_osl.cu"
"--relocatable-device-code=true")
CYCLES_OPTIX_KERNEL_ADD(
kernel_optix_osl_services
"osl/services_optix.cu"
"--relocatable-device-code=true")
endif()
add_custom_target(cycles_kernel_optix ALL DEPENDS ${optix_ptx})
cycles_set_solution_folder(cycles_kernel_optix)
@@ -947,6 +986,7 @@ source_group("geom" FILES ${SRC_KERNEL_GEOM_HEADERS})
source_group("integrator" FILES ${SRC_KERNEL_INTEGRATOR_HEADERS})
source_group("kernel" FILES ${SRC_KERNEL_TYPES_HEADERS})
source_group("light" FILES ${SRC_KERNEL_LIGHT_HEADERS})
source_group("osl" FILES ${SRC_KERNEL_OSL_HEADERS})
source_group("sample" FILES ${SRC_KERNEL_SAMPLE_HEADERS})
source_group("svm" FILES ${SRC_KERNEL_SVM_HEADERS})
source_group("util" FILES ${SRC_KERNEL_UTIL_HEADERS})
@@ -983,6 +1023,7 @@ delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_FILM_HEADERS}" ${CYCLE
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_GEOM_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/geom)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_INTEGRATOR_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/integrator)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_LIGHT_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/light)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_OSL_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/osl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_SAMPLE_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/sample)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_SVM_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/svm)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_TYPES_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel)


@@ -297,8 +297,10 @@ ccl_device_inline void bsdf_roughness_eta(const KernelGlobals kg,
ccl_private float2 *roughness,
ccl_private float *eta)
{
#ifdef __SVM__
bool refractive = false;
float alpha = 1.0f;
#endif
switch (sc->type) {
case CLOSURE_BSDF_DIFFUSE_ID:
*roughness = one_float2();


@@ -69,7 +69,7 @@ ccl_device int bsdf_diffuse_sample(ccl_private const ShaderClosure *sc,
ccl_device int bsdf_translucent_setup(ccl_private DiffuseBsdf *bsdf)
{
bsdf->type = CLOSURE_BSDF_TRANSLUCENT_ID;
return SD_BSDF | SD_BSDF_HAS_EVAL;
return SD_BSDF | SD_BSDF_HAS_EVAL | SD_BSDF_HAS_TRANSMISSION;
}
ccl_device Spectrum bsdf_translucent_eval(ccl_private const ShaderClosure *sc,


@@ -34,7 +34,7 @@ ccl_device int bsdf_hair_transmission_setup(ccl_private HairBsdf *bsdf)
bsdf->type = CLOSURE_BSDF_HAIR_TRANSMISSION_ID;
bsdf->roughness1 = clamp(bsdf->roughness1, 0.001f, 1.0f);
bsdf->roughness2 = clamp(bsdf->roughness2, 0.001f, 1.0f);
return SD_BSDF | SD_BSDF_HAS_EVAL;
return SD_BSDF | SD_BSDF_HAS_EVAL | SD_BSDF_HAS_TRANSMISSION;
}
ccl_device Spectrum bsdf_hair_reflection_eval(ccl_private const ShaderClosure *sc,


@@ -196,7 +196,7 @@ ccl_device int bsdf_principled_hair_setup(ccl_private ShaderData *sd,
bsdf->extra->geom = make_float4(Y.x, Y.y, Y.z, h);
return SD_BSDF | SD_BSDF_HAS_EVAL | SD_BSDF_NEEDS_LCG;
return SD_BSDF | SD_BSDF_HAS_EVAL | SD_BSDF_NEEDS_LCG | SD_BSDF_HAS_TRANSMISSION;
}
#endif /* __HAIR__ */


@@ -346,7 +346,7 @@ ccl_device int bsdf_microfacet_ggx_refraction_setup(ccl_private MicrofacetBsdf *
bsdf->type = CLOSURE_BSDF_MICROFACET_GGX_REFRACTION_ID;
return SD_BSDF | SD_BSDF_HAS_EVAL;
return SD_BSDF | SD_BSDF_HAS_EVAL | SD_BSDF_HAS_TRANSMISSION;
}
ccl_device void bsdf_microfacet_ggx_blur(ccl_private ShaderClosure *sc, float roughness)
@@ -776,7 +776,7 @@ ccl_device int bsdf_microfacet_beckmann_refraction_setup(ccl_private MicrofacetB
bsdf->alpha_y = bsdf->alpha_x;
bsdf->type = CLOSURE_BSDF_MICROFACET_BECKMANN_REFRACTION_ID;
return SD_BSDF | SD_BSDF_HAS_EVAL;
return SD_BSDF | SD_BSDF_HAS_EVAL | SD_BSDF_HAS_TRANSMISSION;
}
ccl_device void bsdf_microfacet_beckmann_blur(ccl_private ShaderClosure *sc, float roughness)


@@ -559,7 +559,7 @@ ccl_device int bsdf_microfacet_multi_ggx_glass_setup(ccl_private MicrofacetBsdf
bsdf->type = CLOSURE_BSDF_MICROFACET_MULTI_GGX_GLASS_ID;
return SD_BSDF | SD_BSDF_HAS_EVAL | SD_BSDF_NEEDS_LCG;
return SD_BSDF | SD_BSDF_HAS_EVAL | SD_BSDF_NEEDS_LCG | SD_BSDF_HAS_TRANSMISSION;
}
ccl_device int bsdf_microfacet_multi_ggx_glass_fresnel_setup(ccl_private MicrofacetBsdf *bsdf,
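
The BSDF hunks above add `SD_BSDF_HAS_TRANSMISSION` to the flags returned by the setup functions of transmissive closures. How the flag is consumed is not shown in this diff. The sketch below only illustrates the general bitmask pattern, with hypothetical `DEMO_*` bit values that are not taken from Cycles.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit values; the real SD_* flags are defined in the Cycles kernel types.
enum DemoShaderFlag : uint32_t {
  DEMO_SD_BSDF = 1u << 0,
  DEMO_SD_BSDF_HAS_EVAL = 1u << 1,
  DEMO_SD_BSDF_NEEDS_LCG = 1u << 2,
  DEMO_SD_BSDF_HAS_TRANSMISSION = 1u << 3,
};

// A reflective-only closure reports no transmission.
inline uint32_t demo_diffuse_setup()
{
  return DEMO_SD_BSDF | DEMO_SD_BSDF_HAS_EVAL;
}

// A refractive closure additionally reports transmission, as in the updated setup functions.
inline uint32_t demo_refraction_setup()
{
  return DEMO_SD_BSDF | DEMO_SD_BSDF_HAS_EVAL | DEMO_SD_BSDF_HAS_TRANSMISSION;
}

// Combine the flags of all closures on a shading point.
inline uint32_t demo_accumulate_flags(const uint32_t *closure_flags, int num_closures)
{
  assert(num_closures >= 0);
  uint32_t flags = 0;
  for (int i = 0; i < num_closures; i++) {
    flags |= closure_flags[i];
  }
  return flags;
}

inline bool demo_has_transmission(uint32_t flags)
{
  return (flags & DEMO_SD_BSDF_HAS_TRANSMISSION) != 0;
}
```

Because the flags are OR-ed together, a single transmissive closure is enough to mark the whole shading point as transmissive.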


@@ -60,6 +60,13 @@ KERNEL_DATA_ARRAY(KernelLight, lights)
KERNEL_DATA_ARRAY(float2, light_background_marginal_cdf)
KERNEL_DATA_ARRAY(float2, light_background_conditional_cdf)
/* light tree */
KERNEL_DATA_ARRAY(KernelLightTreeNode, light_tree_nodes)
KERNEL_DATA_ARRAY(KernelLightTreeEmitter, light_tree_emitters)
KERNEL_DATA_ARRAY(uint, light_to_tree)
KERNEL_DATA_ARRAY(uint, object_lookup_offset)
KERNEL_DATA_ARRAY(uint, triangle_to_tree)
/* particles */
KERNEL_DATA_ARRAY(KernelParticle, particles)


@@ -23,24 +23,19 @@ KERNEL_STRUCT_MEMBER(background, int, volume_shader)
KERNEL_STRUCT_MEMBER(background, float, volume_step_size)
KERNEL_STRUCT_MEMBER(background, int, transparent)
KERNEL_STRUCT_MEMBER(background, float, transparent_roughness_squared_threshold)
/* Portal sampling. */
KERNEL_STRUCT_MEMBER(background, float, portal_weight)
KERNEL_STRUCT_MEMBER(background, int, num_portals)
KERNEL_STRUCT_MEMBER(background, int, portal_offset)
/* Sun sampling. */
KERNEL_STRUCT_MEMBER(background, float, sun_weight)
/* Importance map sampling. */
KERNEL_STRUCT_MEMBER(background, float, map_weight)
KERNEL_STRUCT_MEMBER(background, float, portal_weight)
KERNEL_STRUCT_MEMBER(background, int, map_res_x)
KERNEL_STRUCT_MEMBER(background, int, map_res_y)
/* Multiple importance sampling. */
KERNEL_STRUCT_MEMBER(background, int, use_mis)
/* Lightgroup. */
KERNEL_STRUCT_MEMBER(background, int, lightgroup)
/* Padding. */
KERNEL_STRUCT_MEMBER(background, int, pad1)
KERNEL_STRUCT_MEMBER(background, int, pad2)
KERNEL_STRUCT_MEMBER(background, int, pad3)
/* Light Index. */
KERNEL_STRUCT_MEMBER(background, int, light_index)
KERNEL_STRUCT_END(KernelBackground)
/* BVH: own BVH2 if no native device acceleration struct used. */
@@ -102,8 +97,6 @@ KERNEL_STRUCT_MEMBER(film, int, pass_emission)
KERNEL_STRUCT_MEMBER(film, int, pass_background)
KERNEL_STRUCT_MEMBER(film, int, pass_ao)
KERNEL_STRUCT_MEMBER(film, float, pass_alpha_threshold)
KERNEL_STRUCT_MEMBER(film, int, pass_shadow)
KERNEL_STRUCT_MEMBER(film, float, pass_shadow_scale)
KERNEL_STRUCT_MEMBER(film, int, pass_shadow_catcher)
KERNEL_STRUCT_MEMBER(film, int, pass_shadow_catcher_sample_count)
KERNEL_STRUCT_MEMBER(film, int, pass_shadow_catcher_matte)
@@ -137,9 +130,6 @@ KERNEL_STRUCT_MEMBER(film, int, use_approximate_shadow_catcher)
KERNEL_STRUCT_MEMBER(film, int, pass_guiding_color)
KERNEL_STRUCT_MEMBER(film, int, pass_guiding_probability)
KERNEL_STRUCT_MEMBER(film, int, pass_guiding_avg_roughness)
/* Padding. */
KERNEL_STRUCT_MEMBER(film, int, pad1)
KERNEL_STRUCT_MEMBER(film, int, pad2)
KERNEL_STRUCT_END(KernelFilm)
/* Integrator. */
@@ -147,10 +137,18 @@ KERNEL_STRUCT_END(KernelFilm)
KERNEL_STRUCT_BEGIN(KernelIntegrator, integrator)
/* Emission. */
KERNEL_STRUCT_MEMBER(integrator, int, use_direct_light)
KERNEL_STRUCT_MEMBER(integrator, int, use_light_mis)
KERNEL_STRUCT_MEMBER(integrator, int, use_light_tree)
KERNEL_STRUCT_MEMBER(integrator, int, num_lights)
KERNEL_STRUCT_MEMBER(integrator, int, num_distant_lights)
KERNEL_STRUCT_MEMBER(integrator, int, num_background_lights)
/* Portal sampling. */
KERNEL_STRUCT_MEMBER(integrator, int, num_portals)
KERNEL_STRUCT_MEMBER(integrator, int, portal_offset)
/* Flat light distribution. */
KERNEL_STRUCT_MEMBER(integrator, int, num_distribution)
KERNEL_STRUCT_MEMBER(integrator, int, num_all_lights)
KERNEL_STRUCT_MEMBER(integrator, float, pdf_triangles)
KERNEL_STRUCT_MEMBER(integrator, float, pdf_lights)
KERNEL_STRUCT_MEMBER(integrator, float, distribution_pdf_triangles)
KERNEL_STRUCT_MEMBER(integrator, float, distribution_pdf_lights)
KERNEL_STRUCT_MEMBER(integrator, float, light_inv_rr_threshold)
/* Bounces. */
KERNEL_STRUCT_MEMBER(integrator, int, min_bounce)
@@ -177,8 +175,6 @@ KERNEL_STRUCT_MEMBER(integrator, int, seed)
/* Clamp. */
KERNEL_STRUCT_MEMBER(integrator, float, sample_clamp_direct)
KERNEL_STRUCT_MEMBER(integrator, float, sample_clamp_indirect)
/* MIS. */
KERNEL_STRUCT_MEMBER(integrator, int, use_lamp_mis)
/* Caustics. */
KERNEL_STRUCT_MEMBER(integrator, int, use_caustics)
/* Sampling pattern. */
@@ -195,7 +191,6 @@ KERNEL_STRUCT_MEMBER(integrator, int, has_shadow_catcher)
KERNEL_STRUCT_MEMBER(integrator, int, filter_closures)
/* MIS debugging. */
KERNEL_STRUCT_MEMBER(integrator, int, direct_light_sampling_type)
/* Path Guiding */
KERNEL_STRUCT_MEMBER(integrator, float, surface_guiding_probability)
KERNEL_STRUCT_MEMBER(integrator, float, volume_guiding_probability)
@@ -210,7 +205,6 @@ KERNEL_STRUCT_MEMBER(integrator, int, use_guiding_mis_weights)
/* Padding. */
KERNEL_STRUCT_MEMBER(integrator, int, pad1)
KERNEL_STRUCT_MEMBER(integrator, int, pad2)
KERNEL_STRUCT_MEMBER(integrator, int, pad3)
KERNEL_STRUCT_END(KernelIntegrator)
/* SVM. For shader specialization. */
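
The `KERNEL_STRUCT_MEMBER` lists above describe struct members through macros, and this diff adds and removes explicit `pad` members as the member count changes. The sketch below shows one common X-macro variant of that idea. It is not the way Cycles expands these macros, and the 16-byte size check is an assumption made for illustration. All `DEMO_*` names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical member list; each entry is (type, name).
#define DEMO_INTEGRATOR_MEMBERS(X) \
  X(int, use_direct_light) \
  X(int, use_light_mis) \
  X(int, use_light_tree) \
  X(int, num_lights)

// Expand the list into struct fields.
struct DemoIntegrator {
#define DEMO_DECLARE_MEMBER(type, name) type name;
  DEMO_INTEGRATOR_MEMBERS(DEMO_DECLARE_MEMBER)
#undef DEMO_DECLARE_MEMBER
};

// Expand the same list again to count the members at compile time.
constexpr std::size_t demo_num_members()
{
#define DEMO_COUNT_MEMBER(type, name) +1
  return 0 DEMO_INTEGRATOR_MEMBERS(DEMO_COUNT_MEMBER);
#undef DEMO_COUNT_MEMBER
}

// Assumed alignment requirement: adding or removing a member would require adjusting padding.
static_assert(sizeof(DemoIntegrator) % 16 == 0,
              "DemoIntegrator is expected to be a multiple of 16 bytes");
```

With a compile-time size check like this, forgetting to update the padding after a member change fails the build instead of silently changing the layout.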


@@ -7,6 +7,7 @@
* one with SSE2 intrinsics.
*/
#if defined(__x86_64__) || defined(_M_X64)
# define __KERNEL_SSE__
# define __KERNEL_SSE2__
#endif
@@ -29,11 +30,15 @@
# define __KERNEL_SSE41__
# endif
# ifdef __AVX__
# define __KERNEL_SSE__
# ifndef __KERNEL_SSE__
# define __KERNEL_SSE__
# endif
# define __KERNEL_AVX__
# endif
# ifdef __AVX2__
# define __KERNEL_SSE__
# ifndef __KERNEL_SSE__
# define __KERNEL_SSE__
# endif
# define __KERNEL_AVX2__
# endif
#endif

View File

@@ -30,6 +30,7 @@ typedef unsigned long long uint64_t;
/* Qualifiers */
#define ccl_device __device__ __inline__
#define ccl_device_extern extern "C" __device__
#if __CUDA_ARCH__ < 500
# define ccl_device_inline __device__ __forceinline__
# define ccl_device_forceinline __device__ __forceinline__
@@ -109,14 +110,14 @@ ccl_device_forceinline T ccl_gpu_tex_object_read_3D(const ccl_gpu_tex_object_3D
typedef unsigned short half;
__device__ half __float2half(const float f)
ccl_device_forceinline half __float2half(const float f)
{
half val;
asm("{ cvt.rn.f16.f32 %0, %1;}\n" : "=h"(val) : "f"(f));
return val;
}
__device__ float __half2float(const half h)
ccl_device_forceinline float __half2float(const half h)
{
float val;
asm("{ cvt.f32.f16 %0, %1;}\n" : "=f"(val) : "h"(h));

View File

@@ -314,11 +314,7 @@ ccl_gpu_kernel_threads(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE)
int kernel_index);
ccl_gpu_kernel_lambda_pass.kernel_index = kernel_index;
gpu_parallel_active_index_array(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE,
num_states,
indices,
num_indices,
ccl_gpu_kernel_lambda_pass);
gpu_parallel_active_index_array(num_states, indices, num_indices, ccl_gpu_kernel_lambda_pass);
}
ccl_gpu_kernel_postfix
@@ -333,11 +329,7 @@ ccl_gpu_kernel_threads(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE)
int kernel_index);
ccl_gpu_kernel_lambda_pass.kernel_index = kernel_index;
gpu_parallel_active_index_array(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE,
num_states,
indices,
num_indices,
ccl_gpu_kernel_lambda_pass);
gpu_parallel_active_index_array(num_states, indices, num_indices, ccl_gpu_kernel_lambda_pass);
}
ccl_gpu_kernel_postfix
@@ -349,11 +341,7 @@ ccl_gpu_kernel_threads(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE)
{
ccl_gpu_kernel_lambda(INTEGRATOR_STATE(state, path, queued_kernel) != 0);
gpu_parallel_active_index_array(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE,
num_states,
indices,
num_indices,
ccl_gpu_kernel_lambda_pass);
gpu_parallel_active_index_array(num_states, indices, num_indices, ccl_gpu_kernel_lambda_pass);
}
ccl_gpu_kernel_postfix
@@ -366,11 +354,8 @@ ccl_gpu_kernel_threads(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE)
{
ccl_gpu_kernel_lambda(INTEGRATOR_STATE(state, path, queued_kernel) == 0);
gpu_parallel_active_index_array(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE,
num_states,
indices + indices_offset,
num_indices,
ccl_gpu_kernel_lambda_pass);
gpu_parallel_active_index_array(
num_states, indices + indices_offset, num_indices, ccl_gpu_kernel_lambda_pass);
}
ccl_gpu_kernel_postfix
@@ -383,11 +368,8 @@ ccl_gpu_kernel_threads(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE)
{
ccl_gpu_kernel_lambda(INTEGRATOR_STATE(state, shadow_path, queued_kernel) == 0);
gpu_parallel_active_index_array(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE,
num_states,
indices + indices_offset,
num_indices,
ccl_gpu_kernel_lambda_pass);
gpu_parallel_active_index_array(
num_states, indices + indices_offset, num_indices, ccl_gpu_kernel_lambda_pass);
}
ccl_gpu_kernel_postfix
@@ -431,11 +413,7 @@ ccl_gpu_kernel_threads(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE)
int num_active_paths);
ccl_gpu_kernel_lambda_pass.num_active_paths = num_active_paths;
gpu_parallel_active_index_array(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE,
num_states,
indices,
num_indices,
ccl_gpu_kernel_lambda_pass);
gpu_parallel_active_index_array(num_states, indices, num_indices, ccl_gpu_kernel_lambda_pass);
}
ccl_gpu_kernel_postfix
@@ -469,11 +447,7 @@ ccl_gpu_kernel_threads(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE)
int num_active_paths);
ccl_gpu_kernel_lambda_pass.num_active_paths = num_active_paths;
gpu_parallel_active_index_array(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE,
num_states,
indices,
num_indices,
ccl_gpu_kernel_lambda_pass);
gpu_parallel_active_index_array(num_states, indices, num_indices, ccl_gpu_kernel_lambda_pass);
}
ccl_gpu_kernel_postfix

View File

@@ -56,7 +56,7 @@ void gpu_parallel_active_index_array_impl(const uint num_states,
const uint is_active = (state_index < num_states) ? is_active_op(state_index) : 0;
#else /* !__KERNEL__ONEAPI__ */
# ifndef __KERNEL_METAL__
template<uint blocksize, typename IsActiveOp>
template<typename IsActiveOp>
__device__
# endif
void
@@ -79,6 +79,10 @@ __device__
{
extern ccl_gpu_shared int warp_offset[];
# ifndef __KERNEL_METAL__
const uint blocksize = ccl_gpu_block_dim_x;
# endif
const uint thread_index = ccl_gpu_thread_idx_x;
const uint thread_warp = thread_index % ccl_gpu_warp_size;
@@ -149,7 +153,7 @@ __device__
#ifdef __KERNEL_METAL__
# define gpu_parallel_active_index_array(dummy, num_states, indices, num_indices, is_active_op) \
# define gpu_parallel_active_index_array(num_states, indices, num_indices, is_active_op) \
const uint is_active = (ccl_gpu_global_id_x() < num_states) ? \
is_active_op(ccl_gpu_global_id_x()) : \
0; \
@@ -167,15 +171,13 @@ __device__
simdgroup_offset)
#elif defined(__KERNEL_ONEAPI__)
# define gpu_parallel_active_index_array( \
blocksize, num_states, indices, num_indices, is_active_op) \
# define gpu_parallel_active_index_array(num_states, indices, num_indices, is_active_op) \
gpu_parallel_active_index_array_impl(num_states, indices, num_indices, is_active_op)
#else
# define gpu_parallel_active_index_array( \
blocksize, num_states, indices, num_indices, is_active_op) \
gpu_parallel_active_index_array_impl<blocksize>(num_states, indices, num_indices, is_active_op)
# define gpu_parallel_active_index_array(num_states, indices, num_indices, is_active_op) \
gpu_parallel_active_index_array_impl(num_states, indices, num_indices, is_active_op)
#endif

View File

@@ -28,6 +28,7 @@ typedef unsigned long long uint64_t;
/* Qualifiers */
#define ccl_device __device__ __inline__
#define ccl_device_extern extern "C" __device__
#define ccl_device_inline __device__ __inline__
#define ccl_device_forceinline __device__ __forceinline__
#define ccl_device_noinline __device__ __noinline__

View File

@@ -38,6 +38,7 @@ using namespace metal::raytracing;
# define ccl_device_noinline ccl_device __attribute__((noinline))
#endif
#define ccl_device_extern extern "C"
#define ccl_device_noinline_cpu ccl_device
#define ccl_device_inline_method ccl_device
#define ccl_global device

View File

@@ -28,6 +28,7 @@
/* Qualifier wrappers for different names on different devices */
#define ccl_device
#define ccl_device_extern extern "C"
#define ccl_global
#define ccl_always_inline __attribute__((always_inline))
#define ccl_device_inline inline

View File

@@ -33,14 +33,16 @@ typedef unsigned long long uint64_t;
#endif
#define ccl_device \
__device__ __forceinline__ // Function calls are bad for OptiX performance, so inline everything
static __device__ \
__forceinline__ // Function calls are bad for OptiX performance, so inline everything
#define ccl_device_extern extern "C" __device__
#define ccl_device_inline ccl_device
#define ccl_device_forceinline ccl_device
#define ccl_device_inline_method ccl_device
#define ccl_device_noinline __device__ __noinline__
#define ccl_device_inline_method __device__ __forceinline__
#define ccl_device_noinline static __device__ __noinline__
#define ccl_device_noinline_cpu ccl_device
#define ccl_global
#define ccl_inline_constant __constant__
#define ccl_inline_constant static __constant__
#define ccl_device_constant __constant__ __device__
#define ccl_constant const
#define ccl_gpu_shared __shared__
@@ -57,23 +59,6 @@ typedef unsigned long long uint64_t;
#define kernel_assert(cond)
/* GPU thread, block, grid size and index */
#define ccl_gpu_thread_idx_x (threadIdx.x)
#define ccl_gpu_block_dim_x (blockDim.x)
#define ccl_gpu_block_idx_x (blockIdx.x)
#define ccl_gpu_grid_dim_x (gridDim.x)
#define ccl_gpu_warp_size (warpSize)
#define ccl_gpu_thread_mask(thread_warp) uint(0xFFFFFFFF >> (ccl_gpu_warp_size - thread_warp))
#define ccl_gpu_global_id_x() (ccl_gpu_block_idx_x * ccl_gpu_block_dim_x + ccl_gpu_thread_idx_x)
#define ccl_gpu_global_size_x() (ccl_gpu_grid_dim_x * ccl_gpu_block_dim_x)
/* GPU warp synchronization. */
#define ccl_gpu_syncthreads() __syncthreads()
#define ccl_gpu_ballot(predicate) __ballot_sync(0xFFFFFFFF, predicate)
/* GPU texture objects */
typedef unsigned long long CUtexObject;
@@ -101,14 +86,14 @@ ccl_device_forceinline T ccl_gpu_tex_object_read_3D(const ccl_gpu_tex_object_3D
typedef unsigned short half;
__device__ half __float2half(const float f)
ccl_device_forceinline half __float2half(const float f)
{
half val;
asm("{ cvt.rn.f16.f32 %0, %1;}\n" : "=h"(val) : "f"(f));
return val;
}
__device__ float __half2float(const half h)
ccl_device_forceinline float __half2float(const half h)
{
float val;
asm("{ cvt.f32.f16 %0, %1;}\n" : "=f"(val) : "h"(h));

View File

@@ -25,6 +25,7 @@ struct KernelParamsOptiX {
/* Kernel arguments */
const int *path_index_array;
float *render_buffer;
int offset;
/* Global scene data and textures */
KernelData data;
@@ -36,7 +37,11 @@ struct KernelParamsOptiX {
};
#ifdef __NVCC__
extern "C" static __constant__ KernelParamsOptiX kernel_params;
extern "C"
# ifndef __CUDACC_RDC__
static
# endif
__constant__ KernelParamsOptiX kernel_params;
#endif
/* Abstraction macros */

View File

@@ -0,0 +1,83 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#define WITH_OSL
/* Copy of the regular OptiX kernels with additional OSL support. */
#include "kernel/device/optix/kernel_shader_raytrace.cu"
#include "kernel/bake/bake.h"
#include "kernel/integrator/shade_background.h"
#include "kernel/integrator/shade_light.h"
#include "kernel/integrator/shade_shadow.h"
#include "kernel/integrator/shade_volume.h"
extern "C" __global__ void __raygen__kernel_optix_integrator_shade_background()
{
const int global_index = optixGetLaunchIndex().x;
const int path_index = (kernel_params.path_index_array) ?
kernel_params.path_index_array[global_index] :
global_index;
integrator_shade_background(nullptr, path_index, kernel_params.render_buffer);
}
extern "C" __global__ void __raygen__kernel_optix_integrator_shade_light()
{
const int global_index = optixGetLaunchIndex().x;
const int path_index = (kernel_params.path_index_array) ?
kernel_params.path_index_array[global_index] :
global_index;
integrator_shade_light(nullptr, path_index, kernel_params.render_buffer);
}
extern "C" __global__ void __raygen__kernel_optix_integrator_shade_surface()
{
const int global_index = optixGetLaunchIndex().x;
const int path_index = (kernel_params.path_index_array) ?
kernel_params.path_index_array[global_index] :
global_index;
integrator_shade_surface(nullptr, path_index, kernel_params.render_buffer);
}
extern "C" __global__ void __raygen__kernel_optix_integrator_shade_volume()
{
const int global_index = optixGetLaunchIndex().x;
const int path_index = (kernel_params.path_index_array) ?
kernel_params.path_index_array[global_index] :
global_index;
integrator_shade_volume(nullptr, path_index, kernel_params.render_buffer);
}
extern "C" __global__ void __raygen__kernel_optix_integrator_shade_shadow()
{
const int global_index = optixGetLaunchIndex().x;
const int path_index = (kernel_params.path_index_array) ?
kernel_params.path_index_array[global_index] :
global_index;
integrator_shade_shadow(nullptr, path_index, kernel_params.render_buffer);
}
extern "C" __global__ void __raygen__kernel_optix_shader_eval_displace()
{
KernelShaderEvalInput *const input = (KernelShaderEvalInput *)kernel_params.path_index_array;
float *const output = kernel_params.render_buffer;
const int global_index = kernel_params.offset + optixGetLaunchIndex().x;
kernel_displace_evaluate(nullptr, input, output, global_index);
}
extern "C" __global__ void __raygen__kernel_optix_shader_eval_background()
{
KernelShaderEvalInput *const input = (KernelShaderEvalInput *)kernel_params.path_index_array;
float *const output = kernel_params.render_buffer;
const int global_index = kernel_params.offset + optixGetLaunchIndex().x;
kernel_background_evaluate(nullptr, input, output, global_index);
}
extern "C" __global__ void __raygen__kernel_optix_shader_eval_curve_shadow_transparency()
{
KernelShaderEvalInput *const input = (KernelShaderEvalInput *)kernel_params.path_index_array;
float *const output = kernel_params.render_buffer;
const int global_index = kernel_params.offset + optixGetLaunchIndex().x;
kernel_curve_shadow_transparency_evaluate(nullptr, input, output, global_index);
}
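
Note on the integrator entry points above: each one resolves its path index the same way before calling into the integrator. A minimal sketch of that lookup in plain C++ follows, assuming a null index array means "use the launch index directly". The helper name `resolve_path_index` is hypothetical and does not appear in the patch.

```cpp
// Hypothetical helper, not part of the patch: mirrors the ternary used in the
// __raygen__ integrator entry points above.
inline int resolve_path_index(const int *path_index_array, const int global_index)
{
  /* With an index array, the launch index selects an entry in it;
   * without one, the launch index is used as the path index itself. */
  return path_index_array ? path_index_array[global_index] : global_index;
}
```

The three shader_eval entry points do not use this lookup. They add `kernel_params.offset` to the launch index instead.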

View File

@@ -58,13 +58,29 @@ ccl_device bool film_adaptive_sampling_convergence_check(KernelGlobals kg,
const float4 I = kernel_read_pass_float4(buffer + kernel_data.film.pass_combined);
const float sample = __float_as_uint(buffer[kernel_data.film.pass_sample_count]);
const float inv_sample = 1.0f / sample;
const float intensity_scale = kernel_data.film.exposure / sample;
/* The per pixel error as seen in section 2.1 of
* "A hierarchical automatic stopping condition for Monte Carlo global illumination" */
const float error_difference = (fabsf(I.x - A.x) + fabsf(I.y - A.y) + fabsf(I.z - A.z)) *
inv_sample;
const float error_normalize = sqrtf((I.x + I.y + I.z) * inv_sample);
intensity_scale;
const float intensity = (I.x + I.y + I.z) * intensity_scale;
/* Anything with R+G+B > 1 is highly exposed - even in sRGB it's a range that
* some displays aren't even able to display without significant losses in
* detail. Everything with R+G+B > 3 is overexposed and should receive
* even fewer samples. Filmic-like curves need maximum sampling rate at
* intensity near 0.1-0.2, so threshold of 1 for R+G+B leaves an additional
* fstop in case it is needed for compositing.
*/
float error_normalize;
if (intensity < 1.0f) {
error_normalize = sqrtf(intensity);
}
else {
error_normalize = intensity;
}
/* A small epsilon is added to the divisor to prevent division by zero. */
const float error = error_difference / (0.0001f + error_normalize);
const bool did_converge = (error < threshold);

View File

@@ -527,17 +527,6 @@ ccl_device_inline void film_write_direct_light(KernelGlobals kg,
film_write_pass_spectrum(buffer + pass_offset, contribution);
}
}
/* Write shadow pass. */
if (kernel_data.film.pass_shadow != PASS_UNUSED && (path_flag & PATH_RAY_SHADOW_FOR_LIGHT) &&
(path_flag & PATH_RAY_TRANSPARENT_BACKGROUND)) {
const Spectrum unshadowed_throughput = INTEGRATOR_STATE(
state, shadow_path, unshadowed_throughput);
const Spectrum shadowed_throughput = INTEGRATOR_STATE(state, shadow_path, throughput);
const Spectrum shadow = safe_divide(shadowed_throughput, unshadowed_throughput) *
kernel_data.film.pass_shadow_scale;
film_write_pass_spectrum(buffer + kernel_data.film.pass_shadow, shadow);
}
}
#endif
}

View File

@@ -24,8 +24,8 @@ ccl_device void displacement_shader_eval(KernelGlobals kg,
/* this will modify sd->P */
#ifdef __OSL__
if (kg->osl) {
OSLShader::eval_displacement(kg, state, sd);
if (kernel_data.kernel_features & KERNEL_FEATURE_OSL) {
osl_eval_nodes<SHADER_TYPE_DISPLACEMENT>(kg, state, sd, 0);
}
else
#endif

View File

@@ -11,10 +11,10 @@
#include "kernel/integrator/path_state.h"
#include "kernel/integrator/shadow_catcher.h"
#include "kernel/light/light.h"
#include "kernel/geom/geom.h"
#include "kernel/light/light.h"
#include "kernel/bvh/bvh.h"
CCL_NAMESPACE_BEGIN
@@ -387,7 +387,7 @@ ccl_device void integrator_intersect_closest(KernelGlobals kg,
#endif /* __MNEE__ */
/* Light intersection for MIS. */
if (kernel_data.integrator.use_lamp_mis) {
if (kernel_data.integrator.use_light_mis) {
/* NOTE: if we make lights visible to camera rays, we'll need to initialize
* these in the path_state_init. */
const int last_type = INTEGRATOR_STATE(state, isect, type);

View File

@@ -108,48 +108,6 @@ ccl_device_inline float mat22_inverse(const float4 m, ccl_private float4 &m_inve
return det;
}
/* Update light sample */
ccl_device_forceinline void mnee_update_light_sample(KernelGlobals kg,
const float3 P,
ccl_private LightSample *ls)
{
/* correct light sample position/direction and pdf
* NOTE: preserve pdf in area measure */
const ccl_global KernelLight *klight = &kernel_data_fetch(lights, ls->lamp);
if (ls->type == LIGHT_POINT || ls->type == LIGHT_SPOT) {
ls->D = normalize_len(ls->P - P, &ls->t);
ls->Ng = -ls->D;
float2 uv = map_to_sphere(ls->Ng);
ls->u = uv.x;
ls->v = uv.y;
float invarea = klight->spot.invarea;
ls->eval_fac = (0.25f * M_1_PI_F) * invarea;
ls->pdf = invarea;
if (ls->type == LIGHT_SPOT) {
/* spot light attenuation */
float3 dir = make_float3(klight->spot.dir[0], klight->spot.dir[1], klight->spot.dir[2]);
ls->eval_fac *= spot_light_attenuation(
dir, klight->spot.spot_angle, klight->spot.spot_smooth, ls->Ng);
}
}
else if (ls->type == LIGHT_AREA) {
float invarea = fabsf(klight->area.invarea);
ls->D = normalize_len(ls->P - P, &ls->t);
ls->pdf = invarea;
if (klight->area.tan_spread > 0.f) {
ls->eval_fac = 0.25f * invarea;
ls->eval_fac *= light_spread_attenuation(
ls->D, ls->Ng, klight->area.tan_spread, klight->area.normalize_spread);
}
}
ls->pdf *= kernel_data.integrator.pdf_lights;
}
/* Manifold vertex setup from ray and intersection data */
ccl_device_forceinline void mnee_setup_manifold_vertex(KernelGlobals kg,
ccl_private ManifoldVertex *vtx,
@@ -819,7 +777,7 @@ ccl_device_forceinline bool mnee_path_contribution(KernelGlobals kg,
/* Update light sample with new position / direction
* and keep pdf in vertex area measure */
mnee_update_light_sample(kg, vertices[vertex_count - 1].p, ls);
light_sample_update_position(kg, ls, vertices[vertex_count - 1].p);
/* Save state path bounce info in case a light path node is used in the refractive interface or
* light shader graph. */

View File

@@ -91,7 +91,10 @@ ccl_device_inline void path_state_init_integrator(KernelGlobals kg,
#endif
}
ccl_device_inline void path_state_next(KernelGlobals kg, IntegratorState state, int label)
ccl_device_inline void path_state_next(KernelGlobals kg,
IntegratorState state,
const int label,
const int shader_flag)
{
uint32_t flag = INTEGRATOR_STATE(state, path, flag);
@@ -120,12 +123,12 @@ ccl_device_inline void path_state_next(KernelGlobals kg, IntegratorState state,
flag |= PATH_RAY_TERMINATE_AFTER_TRANSPARENT;
}
flag &= ~(PATH_RAY_ALL_VISIBILITY | PATH_RAY_MIS_SKIP);
flag &= ~(PATH_RAY_ALL_VISIBILITY | PATH_RAY_MIS_SKIP | PATH_RAY_MIS_HAD_TRANSMISSION);
#ifdef __VOLUME__
if (label & LABEL_VOLUME_SCATTER) {
/* volume scatter */
flag |= PATH_RAY_VOLUME_SCATTER;
flag |= PATH_RAY_VOLUME_SCATTER | PATH_RAY_MIS_HAD_TRANSMISSION;
flag &= ~PATH_RAY_TRANSPARENT_BACKGROUND;
if (!(flag & PATH_RAY_ANY_PASS)) {
flag |= PATH_RAY_VOLUME_PASS;
@@ -188,6 +191,11 @@ ccl_device_inline void path_state_next(KernelGlobals kg, IntegratorState state,
flag |= PATH_RAY_GLOSSY | PATH_RAY_SINGULAR | PATH_RAY_MIS_SKIP;
}
/* Flag for consistent MIS weights with light tree. */
if (shader_flag & SD_BSDF_HAS_TRANSMISSION) {
flag |= PATH_RAY_MIS_HAD_TRANSMISSION;
}
/* Render pass categories. */
if (!(flag & PATH_RAY_ANY_PASS) && !(flag & PATH_RAY_TRANSPARENT_BACKGROUND)) {
flag |= PATH_RAY_SURFACE_PASS;

View File

@@ -69,9 +69,9 @@ ccl_device_inline void integrate_background(KernelGlobals kg,
bool eval_background = true;
float transparent = 0.0f;
int path_flag = INTEGRATOR_STATE(state, path, flag);
const bool is_transparent_background_ray = kernel_data.background.transparent &&
(INTEGRATOR_STATE(state, path, flag) &
PATH_RAY_TRANSPARENT_BACKGROUND);
(path_flag & PATH_RAY_TRANSPARENT_BACKGROUND);
if (is_transparent_background_ray) {
transparent = average(INTEGRATOR_STATE(state, path, throughput));
@@ -86,7 +86,7 @@ ccl_device_inline void integrate_background(KernelGlobals kg,
#ifdef __MNEE__
if (INTEGRATOR_STATE(state, path, mnee) & PATH_MNEE_CULL_LIGHT_CONNECTION) {
if (kernel_data.background.use_mis) {
for (int lamp = 0; lamp < kernel_data.integrator.num_all_lights; lamp++) {
for (int lamp = 0; lamp < kernel_data.integrator.num_lights; lamp++) {
/* This path should have been resolved with mnee, it will
* generate a firefly for small lights since it is improbable. */
const ccl_global KernelLight *klight = &kernel_data_fetch(lights, lamp);
@@ -113,17 +113,10 @@ ccl_device_inline void integrate_background(KernelGlobals kg,
/* Background MIS weights. */
float mis_weight = 1.0f;
/* Check if background light exists or if we should skip pdf. */
/* Check if background light exists or if we should skip PDF. */
if (!(INTEGRATOR_STATE(state, path, flag) & PATH_RAY_MIS_SKIP) &&
kernel_data.background.use_mis) {
const float3 ray_P = INTEGRATOR_STATE(state, ray, P);
const float3 ray_D = INTEGRATOR_STATE(state, ray, D);
const float mis_ray_pdf = INTEGRATOR_STATE(state, path, mis_ray_pdf);
/* multiple importance sampling, get background light pdf for ray
* direction, and compute weight with respect to BSDF pdf */
const float pdf = background_light_pdf(kg, ray_P, ray_D);
mis_weight = light_sample_mis_weight_forward(kg, mis_ray_pdf, pdf);
mis_weight = light_sample_mis_weight_forward_background(kg, state, path_flag);
}
guiding_record_background(kg, state, L, mis_weight);
@@ -142,8 +135,8 @@ ccl_device_inline void integrate_distant_lights(KernelGlobals kg,
const float3 ray_D = INTEGRATOR_STATE(state, ray, D);
const float ray_time = INTEGRATOR_STATE(state, ray, time);
LightSample ls ccl_optional_struct_init;
for (int lamp = 0; lamp < kernel_data.integrator.num_all_lights; lamp++) {
if (light_sample_from_distant_ray(kg, ray_D, lamp, &ls)) {
for (int lamp = 0; lamp < kernel_data.integrator.num_lights; lamp++) {
if (distant_light_sample_from_intersection(kg, ray_D, lamp, &ls)) {
/* Use visibility flag to skip lights. */
#ifdef __PASSES__
const uint32_t path_flag = INTEGRATOR_STATE(state, path, flag);
@@ -182,10 +175,7 @@ ccl_device_inline void integrate_distant_lights(KernelGlobals kg,
/* MIS weighting. */
float mis_weight = 1.0f;
if (!(path_flag & PATH_RAY_MIS_SKIP)) {
/* multiple importance sampling, get regular light pdf,
* and compute weight with respect to BSDF pdf */
const float mis_ray_pdf = INTEGRATOR_STATE(state, path, mis_ray_pdf);
mis_weight = light_sample_mis_weight_forward(kg, mis_ray_pdf, ls.pdf);
mis_weight = light_sample_mis_weight_forward_distant(kg, state, path_flag, &ls);
}
/* Write to render buffer. */

View File

@@ -61,10 +61,7 @@ ccl_device_inline void integrate_light(KernelGlobals kg,
/* MIS weighting. */
float mis_weight = 1.0f;
if (!(path_flag & PATH_RAY_MIS_SKIP)) {
/* multiple importance sampling, get regular light pdf,
* and compute weight with respect to BSDF pdf */
const float mis_ray_pdf = INTEGRATOR_STATE(state, path, mis_ray_pdf);
mis_weight = light_sample_mis_weight_forward(kg, mis_ray_pdf, ls.pdf);
mis_weight = light_sample_mis_weight_forward_lamp(kg, state, path_flag, &ls, ray_P);
}
/* Write to render buffer. */

View File

@@ -15,7 +15,6 @@
#include "kernel/integrator/surface_shader.h"
#include "kernel/integrator/volume_stack.h"
#include "kernel/light/light.h"
#include "kernel/light/sample.h"
CCL_NAMESPACE_BEGIN
@@ -113,20 +112,16 @@ ccl_device_forceinline void integrate_surface_emission(KernelGlobals kg,
Spectrum L = surface_shader_emission(sd);
float mis_weight = 1.0f;
const bool has_mis = !(path_flag & PATH_RAY_MIS_SKIP) &&
(sd->flag & ((sd->flag & SD_BACKFACING) ? SD_MIS_BACK : SD_MIS_FRONT));
#ifdef __HAIR__
if (!(path_flag & PATH_RAY_MIS_SKIP) && (sd->flag & SD_USE_MIS) &&
(sd->type & PRIMITIVE_TRIANGLE))
if (has_mis && (sd->type & PRIMITIVE_TRIANGLE))
#else
if (!(path_flag & PATH_RAY_MIS_SKIP) && (sd->flag & SD_USE_MIS))
if (has_mis)
#endif
{
const float bsdf_pdf = INTEGRATOR_STATE(state, path, mis_ray_pdf);
const float t = sd->ray_length;
/* Multiple importance sampling, get triangle light pdf,
* and compute weight with respect to BSDF pdf. */
float pdf = triangle_light_pdf(kg, sd, t);
mis_weight = light_sample_mis_weight_forward(kg, bsdf_pdf, pdf);
mis_weight = light_sample_mis_weight_forward_surface(kg, state, path_flag, sd);
}
guiding_record_surface_emission(kg, state, L, mis_weight);
@@ -154,8 +149,17 @@ ccl_device_forceinline void integrate_surface_direct_light(KernelGlobals kg,
const uint bounce = INTEGRATOR_STATE(state, path, bounce);
const float2 rand_light = path_state_rng_2D(kg, rng_state, PRNG_LIGHT);
if (!light_distribution_sample_from_position(
kg, rand_light.x, rand_light.y, sd->time, sd->P, bounce, path_flag, &ls)) {
if (!light_sample_from_position(kg,
rng_state,
rand_light.x,
rand_light.y,
sd->time,
sd->P,
sd->N,
sd->flag,
bounce,
path_flag,
&ls)) {
return;
}
}
@@ -322,10 +326,6 @@ ccl_device_forceinline void integrate_surface_direct_light(KernelGlobals kg,
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, throughput) = throughput;
if (kernel_data.kernel_features & KERNEL_FEATURE_SHADOW_PASS) {
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, unshadowed_throughput) = throughput;
}
/* Write Lightgroup, +1 as lightgroup is int but we need to encode into a uint8_t. */
INTEGRATOR_STATE_WRITE(
shadow_state, shadow_path, lightgroup) = (ls.type != LIGHT_BACKGROUND) ?
@@ -441,11 +441,12 @@ ccl_device_forceinline int integrate_surface_bsdf_bssrdf_bounce(
/* Update path state */
if (!(label & LABEL_TRANSPARENT)) {
INTEGRATOR_STATE_WRITE(state, path, mis_ray_pdf) = bsdf_pdf;
INTEGRATOR_STATE_WRITE(state, path, mis_origin_n) = sd->N;
INTEGRATOR_STATE_WRITE(state, path, min_ray_pdf) = fminf(
unguided_bsdf_pdf, INTEGRATOR_STATE(state, path, min_ray_pdf));
}
path_state_next(kg, state, label);
path_state_next(kg, state, label, sd->flag);
guiding_record_surface_bounce(kg,
state,

View File

@@ -685,14 +685,14 @@ ccl_device_forceinline void volume_integrate_heterogeneous(
# endif /* __DENOISING_FEATURES__ */
}
/* Path tracing: sample point on light and evaluate light shader, then
* queue shadow ray to be traced. */
ccl_device_forceinline bool integrate_volume_sample_light(
/* Path tracing: sample point on light for equiangular sampling. */
ccl_device_forceinline bool integrate_volume_equiangular_sample_light(
KernelGlobals kg,
IntegratorState state,
ccl_private const Ray *ccl_restrict ray,
ccl_private const ShaderData *ccl_restrict sd,
ccl_private const RNGState *ccl_restrict rng_state,
ccl_private LightSample *ccl_restrict ls)
ccl_private float3 *ccl_restrict P)
{
/* Test if there is a light or BSDF that needs direct light. */
if (!kernel_data.integrator.use_direct_light) {
@@ -704,15 +704,30 @@ ccl_device_forceinline bool integrate_volume_sample_light(
const uint bounce = INTEGRATOR_STATE(state, path, bounce);
const float2 rand_light = path_state_rng_2D(kg, rng_state, PRNG_LIGHT);
if (!light_distribution_sample_from_volume_segment(
kg, rand_light.x, rand_light.y, sd->time, sd->P, bounce, path_flag, ls)) {
LightSample ls ccl_optional_struct_init;
if (!light_sample_from_volume_segment(kg,
rand_light.x,
rand_light.y,
sd->time,
sd->P,
ray->D,
ray->tmax - ray->tmin,
bounce,
path_flag,
&ls)) {
return false;
}
if (ls->shader & SHADER_EXCLUDE_SCATTER) {
if (ls.shader & SHADER_EXCLUDE_SCATTER) {
return false;
}
if (ls.t == FLT_MAX) {
return false;
}
*P = ls.P;
return true;
}
@@ -728,8 +743,7 @@ ccl_device_forceinline void integrate_volume_direct_light(
# ifdef __PATH_GUIDING__
ccl_private const Spectrum unlit_throughput,
# endif
ccl_private const Spectrum throughput,
ccl_private LightSample *ccl_restrict ls)
ccl_private const Spectrum throughput)
{
PROFILING_INIT(kg, PROFILING_SHADE_VOLUME_DIRECT_LIGHT);
@@ -737,23 +751,38 @@ ccl_device_forceinline void integrate_volume_direct_light(
return;
}
/* Sample position on the same light again, now from the shading
* point where we scattered.
/* Sample position on the same light again, now from the shading point where we scattered.
*
* TODO: decorrelate random numbers and use light_sample_new_position to
* avoid resampling the CDF. */
* Note that this means we sample the light tree twice when equiangular sampling is used.
* We could consider sampling the light tree just once and use the same light position again.
*
* This would make the PDFs for MIS weights more complicated due to having to account for
* both distance/equiangular and direct/indirect light sampling, but could be more accurate.
* Additionally we could end up behind the light or outside a spot light cone, which might
* waste a sample. Though on the other hand it would be possible to prevent that with
* equiangular sampling restricted to a smaller sub-segment where the light has influence. */
LightSample ls ccl_optional_struct_init;
{
const uint32_t path_flag = INTEGRATOR_STATE(state, path, flag);
const uint bounce = INTEGRATOR_STATE(state, path, bounce);
const float2 rand_light = path_state_rng_2D(kg, rng_state, PRNG_LIGHT);
if (!light_distribution_sample_from_position(
kg, rand_light.x, rand_light.y, sd->time, P, bounce, path_flag, ls)) {
if (!light_sample_from_position(kg,
rng_state,
rand_light.x,
rand_light.y,
sd->time,
P,
zero_float3(),
SD_BSDF_HAS_TRANSMISSION,
bounce,
path_flag,
&ls)) {
return;
}
}
if (ls->shader & SHADER_EXCLUDE_SCATTER) {
if (ls.shader & SHADER_EXCLUDE_SCATTER) {
return;
}
@@ -765,32 +794,32 @@ ccl_device_forceinline void integrate_volume_direct_light(
* non-constant light sources. */
ShaderDataTinyStorage emission_sd_storage;
ccl_private ShaderData *emission_sd = AS_SHADER_DATA(&emission_sd_storage);
const Spectrum light_eval = light_sample_shader_eval(kg, state, emission_sd, ls, sd->time);
const Spectrum light_eval = light_sample_shader_eval(kg, state, emission_sd, &ls, sd->time);
if (is_zero(light_eval)) {
return;
}
/* Evaluate BSDF. */
BsdfEval phase_eval ccl_optional_struct_init;
float phase_pdf = volume_shader_phase_eval(kg, state, sd, phases, ls->D, &phase_eval);
float phase_pdf = volume_shader_phase_eval(kg, state, sd, phases, ls.D, &phase_eval);
if (ls->shader & SHADER_USE_MIS) {
float mis_weight = light_sample_mis_weight_nee(kg, ls->pdf, phase_pdf);
if (ls.shader & SHADER_USE_MIS) {
float mis_weight = light_sample_mis_weight_nee(kg, ls.pdf, phase_pdf);
bsdf_eval_mul(&phase_eval, mis_weight);
}
bsdf_eval_mul(&phase_eval, light_eval / ls->pdf);
bsdf_eval_mul(&phase_eval, light_eval / ls.pdf);
/* Path termination. */
const float terminate = path_state_rng_light_termination(kg, rng_state);
if (light_sample_terminate(kg, ls, &phase_eval, terminate)) {
if (light_sample_terminate(kg, &ls, &phase_eval, terminate)) {
return;
}
/* Create shadow ray. */
Ray ray ccl_optional_struct_init;
light_sample_to_volume_shadow_ray(kg, sd, ls, P, &ray);
const bool is_light = light_sample_is_light(ls);
light_sample_to_volume_shadow_ray(kg, sd, &ls, P, &ray);
const bool is_light = light_sample_is_light(&ls);
/* Branch off shadow kernel. */
IntegratorShadowState shadow_state = integrator_shadow_path_init(
@@ -849,14 +878,10 @@ ccl_device_forceinline void integrate_volume_direct_light(
state, path, transmission_bounce);
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, throughput) = throughput_phase;
if (kernel_data.kernel_features & KERNEL_FEATURE_SHADOW_PASS) {
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, unshadowed_throughput) = throughput;
}
/* Write Lightgroup, +1 as lightgroup is int but we need to encode into a uint8_t. */
INTEGRATOR_STATE_WRITE(
shadow_state, shadow_path, lightgroup) = (ls->type != LIGHT_BACKGROUND) ?
ls->group + 1 :
shadow_state, shadow_path, lightgroup) = (ls.type != LIGHT_BACKGROUND) ?
ls.group + 1 :
kernel_data.background.lightgroup + 1;
# ifdef __PATH_GUIDING__
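
Note on the "+1" in the lightgroup write above: the comment says the group is an int that must fit in a uint8_t. A minimal sketch of such an encoding follows, assuming that -1 marks "no light group" and that the reader subtracts one again. Both helper names are hypothetical and not taken from the patch.

```cpp
#include <cstdint>

// Assumption: -1 means "no light group". Names are illustrative only.
inline uint8_t encode_lightgroup(const int group)
{
  /* Shift by one so that -1 maps to 0 and valid groups start at 1. */
  return static_cast<uint8_t>(group + 1);
}

inline int decode_lightgroup(const uint8_t stored)
{
  /* Undo the shift; 0 decodes back to -1. */
  return static_cast<int>(stored) - 1;
}
```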
@@ -958,10 +983,11 @@ ccl_device_forceinline bool integrate_volume_phase_scatter(
/* Update path state */
INTEGRATOR_STATE_WRITE(state, path, mis_ray_pdf) = phase_pdf;
INTEGRATOR_STATE_WRITE(state, path, mis_origin_n) = zero_float3();
INTEGRATOR_STATE_WRITE(state, path, min_ray_pdf) = fminf(
unguided_phase_pdf, INTEGRATOR_STATE(state, path, min_ray_pdf));
path_state_next(kg, state, label);
path_state_next(kg, state, label, sd->flag);
return true;
}
@@ -983,12 +1009,11 @@ ccl_device VolumeIntegrateEvent volume_integrate(KernelGlobals kg,
/* Sample light ahead of volume stepping, for equiangular sampling. */
/* TODO: distant lights are ignored now, but could instead use even distribution. */
LightSample ls ccl_optional_struct_init;
const bool need_light_sample = !(INTEGRATOR_STATE(state, path, flag) & PATH_RAY_TERMINATE);
float3 equiangular_P = zero_float3();
const bool have_equiangular_sample = need_light_sample &&
integrate_volume_sample_light(
kg, state, &sd, &rng_state, &ls) &&
(ls.t != FLT_MAX);
integrate_volume_equiangular_sample_light(
kg, state, ray, &sd, &rng_state, &equiangular_P);
VolumeSampleMethod direct_sample_method = (have_equiangular_sample) ?
volume_stack_sample_method(kg, state) :
@@ -1018,7 +1043,7 @@ ccl_device VolumeIntegrateEvent volume_integrate(KernelGlobals kg,
render_buffer,
step_size,
direct_sample_method,
ls.P,
equiangular_P,
result);
/* Perform path termination. The intersect_closest will have already marked this path
@@ -1085,8 +1110,7 @@ ccl_device VolumeIntegrateEvent volume_integrate(KernelGlobals kg,
# ifdef __PATH_GUIDING__
unlit_throughput,
# endif
result.direct_throughput,
&ls);
result.direct_throughput);
}
/* Indirect light.

View File

@@ -32,7 +32,7 @@ KERNEL_STRUCT_MEMBER(shadow_path, PackedSpectrum, throughput, KERNEL_FEATURE_PAT
KERNEL_STRUCT_MEMBER(shadow_path,
PackedSpectrum,
unshadowed_throughput,
KERNEL_FEATURE_SHADOW_PASS | KERNEL_FEATURE_AO_ADDITIVE)
KERNEL_FEATURE_AO_ADDITIVE)
/* Ratio of throughput to distinguish diffuse / glossy / transmission render passes. */
KERNEL_STRUCT_MEMBER(shadow_path, PackedSpectrum, pass_diffuse_weight, KERNEL_FEATURE_LIGHT_PASSES)
KERNEL_STRUCT_MEMBER(shadow_path, PackedSpectrum, pass_glossy_weight, KERNEL_FEATURE_LIGHT_PASSES)

View File

@@ -41,6 +41,7 @@ KERNEL_STRUCT_MEMBER(path, uint8_t, mnee, KERNEL_FEATURE_PATH_TRACING)
* zero and distance. Note that transparency and volume attenuation increase
* the ray tmin but keep P unmodified so that this works. */
KERNEL_STRUCT_MEMBER(path, float, mis_ray_pdf, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(path, packed_float3, mis_origin_n, KERNEL_FEATURE_PATH_TRACING)
/* Filter glossy. */
KERNEL_STRUCT_MEMBER(path, float, min_ray_pdf, KERNEL_FEATURE_PATH_TRACING)
/* Continuation probability for path termination. */

View File

@@ -827,13 +827,8 @@ ccl_device void surface_shader_eval(KernelGlobals kg,
sd->num_closure_left = max_closures;
#ifdef __OSL__
if (kg->osl) {
if (sd->object == OBJECT_NONE && sd->lamp == LAMP_NONE) {
OSLShader::eval_background(kg, state, sd, path_flag);
}
else {
OSLShader::eval_surface(kg, state, sd, path_flag);
}
if (kernel_data.kernel_features & KERNEL_FEATURE_OSL) {
osl_eval_nodes<SHADER_TYPE_SURFACE>(kg, state, sd, path_flag);
}
else
#endif

View File

@@ -491,8 +491,8 @@ ccl_device_inline void volume_shader_eval(KernelGlobals kg,
/* evaluate shader */
# ifdef __OSL__
if (kg->osl) {
OSLShader::eval_volume(kg, state, sd, path_flag);
if (kernel_data.kernel_features & KERNEL_FEATURE_OSL) {
osl_eval_nodes<SHADER_TYPE_VOLUME>(kg, state, sd, path_flag);
}
else
# endif

View File

@@ -0,0 +1,387 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#pragma once
#include "kernel/light/common.h"
CCL_NAMESPACE_BEGIN
/* Importance sampling.
*
* An Area-Preserving Parametrization for Spherical Rectangles.
* Carlos Urena et al.
*
* NOTE: light_p is modified when sample_coord is true. */
ccl_device_inline float area_light_rect_sample(float3 P,
ccl_private float3 *light_p,
const float3 axis_u,
const float len_u,
const float3 axis_v,
const float len_v,
float randu,
float randv,
bool sample_coord)
{
/* In our name system we're using P for the center, which is o in the paper. */
float3 corner = *light_p - axis_u * len_u * 0.5f - axis_v * len_v * 0.5f;
/* Compute local reference system R. */
float3 x = axis_u;
float3 y = axis_v;
float3 z = cross(x, y);
/* Compute rectangle coords in local reference system. */
float3 dir = corner - P;
float z0 = dot(dir, z);
/* Flip 'z' to make it point against Q. */
if (z0 > 0.0f) {
z *= -1.0f;
z0 *= -1.0f;
}
float x0 = dot(dir, x);
float y0 = dot(dir, y);
float x1 = x0 + len_u;
float y1 = y0 + len_v;
/* Compute internal angles (gamma_i). */
float4 diff = make_float4(x0, y1, x1, y0) - make_float4(x1, y0, x0, y1);
float4 nz = make_float4(y0, x1, y1, x0) * diff;
nz = nz / sqrt(z0 * z0 * diff * diff + nz * nz);
float g0 = safe_acosf(-nz.x * nz.y);
float g1 = safe_acosf(-nz.y * nz.z);
float g2 = safe_acosf(-nz.z * nz.w);
float g3 = safe_acosf(-nz.w * nz.x);
/* Compute predefined constants. */
float b0 = nz.x;
float b1 = nz.z;
float b0sq = b0 * b0;
float k = M_2PI_F - g2 - g3;
/* Compute solid angle from internal angles. */
float S = g0 + g1 - k;
if (sample_coord) {
/* Compute cu. */
float au = randu * S + k;
float fu = (cosf(au) * b0 - b1) / sinf(au);
float cu = 1.0f / sqrtf(fu * fu + b0sq) * (fu > 0.0f ? 1.0f : -1.0f);
cu = clamp(cu, -1.0f, 1.0f);
/* Compute xu. */
float xu = -(cu * z0) / max(sqrtf(1.0f - cu * cu), 1e-7f);
xu = clamp(xu, x0, x1);
/* Compute yv. */
float z0sq = z0 * z0;
float y0sq = y0 * y0;
float y1sq = y1 * y1;
float d = sqrtf(xu * xu + z0sq);
float h0 = y0 / sqrtf(d * d + y0sq);
float h1 = y1 / sqrtf(d * d + y1sq);
float hv = h0 + randv * (h1 - h0), hv2 = hv * hv;
float yv = (hv2 < 1.0f - 1e-6f) ? (hv * d) / sqrtf(1.0f - hv2) : y1;
/* Transform (xu, yv, z0) to world coords. */
*light_p = P + xu * x + yv * y + z0 * z;
}
/* return pdf */
if (S != 0.0f)
return 1.0f / S;
else
return 0.0f;
}
/* Light spread. */
ccl_device float area_light_spread_attenuation(const float3 D,
const float3 lightNg,
const float cot_half_spread,
const float normalize_spread)
{
/* Model a soft-box grid, computing the ratio of light not hidden by the
* slats of the grid at a given angle. (see D10594). */
const float cos_a = -dot(D, lightNg);
const float sin_a = safe_sqrtf(1.0f - sqr(cos_a));
const float tan_a = sin_a / cos_a;
return max((1.0f - (cot_half_spread * tan_a)) * normalize_spread, 0.0f);
}
/* Compute subset of area light that actually has an influence on the shading point, to
* reduce noise with low spread. */
ccl_device bool area_light_spread_clamp_area_light(const float3 P,
const float3 lightNg,
ccl_private float3 *lightP,
const float3 axis_u,
ccl_private float *len_u,
const float3 axis_v,
ccl_private float *len_v,
const float cot_half_spread)
{
/* Closest point in area light plane and distance to that plane. */
const float3 closest_P = P - dot(lightNg, P - *lightP) * lightNg;
const float t = len(closest_P - P);
/* Radius of circle on area light that actually affects the shading point. */
const float radius = t / cot_half_spread;
/* Local uv coordinates of closest point. */
const float closest_u = dot(axis_u, closest_P - *lightP);
const float closest_v = dot(axis_v, closest_P - *lightP);
/* Compute rectangle encompassing the circle that affects the shading point,
* clamped to the bounds of the area light. */
const float min_u = max(closest_u - radius, -*len_u * 0.5f);
const float max_u = min(closest_u + radius, *len_u * 0.5f);
const float min_v = max(closest_v - radius, -*len_v * 0.5f);
const float max_v = min(closest_v + radius, *len_v * 0.5f);
/* Skip if rectangle is empty. */
if (min_u >= max_u || min_v >= max_v) {
return false;
}
/* Compute new area light center position and axes from rectangle in local
* uv coordinates. */
const float new_center_u = 0.5f * (min_u + max_u);
const float new_center_v = 0.5f * (min_v + max_v);
*len_u = max_u - min_u;
*len_v = max_v - min_v;
*lightP = *lightP + new_center_u * axis_u + new_center_v * axis_v;
return true;
}
/* Common API. */
template<bool in_volume_segment>
ccl_device_inline bool area_light_sample(const ccl_global KernelLight *klight,
const float randu,
const float randv,
const float3 P,
ccl_private LightSample *ls)
{
ls->P = klight->co;
const float3 axis_u = klight->area.axis_u;
const float3 axis_v = klight->area.axis_v;
const float len_u = klight->area.len_u;
const float len_v = klight->area.len_v;
float3 Ng = klight->area.dir;
float invarea = fabsf(klight->area.invarea);
bool is_round = (klight->area.invarea < 0.0f);
if (!in_volume_segment) {
if (dot(ls->P - P, Ng) > 0.0f) {
return false;
}
}
float3 inplane;
if (is_round || in_volume_segment) {
inplane = ellipse_sample(axis_u * len_u * 0.5f, axis_v * len_v * 0.5f, randu, randv);
ls->P += inplane;
ls->pdf = invarea;
}
else {
inplane = ls->P;
float sample_len_u = len_u;
float sample_len_v = len_v;
if (!in_volume_segment && klight->area.cot_half_spread > 0.0f) {
if (!area_light_spread_clamp_area_light(P,
Ng,
&ls->P,
axis_u,
&sample_len_u,
axis_v,
&sample_len_v,
klight->area.cot_half_spread)) {
return false;
}
}
ls->pdf = area_light_rect_sample(
P, &ls->P, axis_u, sample_len_u, axis_v, sample_len_v, randu, randv, true);
inplane = ls->P - inplane;
}
const float light_u = dot(inplane, axis_u) / len_u;
const float light_v = dot(inplane, axis_v) / len_v;
/* NOTE: Return barycentric coordinates in the same notation as Embree and OptiX. */
ls->u = light_v + 0.5f;
ls->v = -light_u - light_v;
ls->Ng = Ng;
ls->D = normalize_len(ls->P - P, &ls->t);
ls->eval_fac = 0.25f * invarea;
if (klight->area.cot_half_spread > 0.0f) {
/* Area Light spread angle attenuation */
ls->eval_fac *= area_light_spread_attenuation(
ls->D, ls->Ng, klight->area.cot_half_spread, klight->area.normalize_spread);
}
if (is_round) {
ls->pdf *= lamp_light_pdf(Ng, -ls->D, ls->t);
}
return true;
}
ccl_device_forceinline void area_light_update_position(const ccl_global KernelLight *klight,
ccl_private LightSample *ls,
const float3 P)
{
const float invarea = fabsf(klight->area.invarea);
ls->D = normalize_len(ls->P - P, &ls->t);
ls->pdf = invarea;
if (klight->area.cot_half_spread > 0.f) {
ls->eval_fac = 0.25f * invarea;
ls->eval_fac *= area_light_spread_attenuation(
ls->D, ls->Ng, klight->area.cot_half_spread, klight->area.normalize_spread);
}
}
ccl_device_inline bool area_light_intersect(const ccl_global KernelLight *klight,
const ccl_private Ray *ccl_restrict ray,
ccl_private float *t,
ccl_private float *u,
ccl_private float *v)
{
/* Area light. */
const float invarea = fabsf(klight->area.invarea);
const bool is_round = (klight->area.invarea < 0.0f);
if (invarea == 0.0f) {
return false;
}
const float3 inv_extent_u = klight->area.axis_u / klight->area.len_u;
const float3 inv_extent_v = klight->area.axis_v / klight->area.len_v;
const float3 Ng = klight->area.dir;
/* One sided. */
if (dot(ray->D, Ng) >= 0.0f) {
return false;
}
const float3 light_P = klight->co;
float3 P;
return ray_quad_intersect(ray->P,
ray->D,
ray->tmin,
ray->tmax,
light_P,
inv_extent_u,
inv_extent_v,
Ng,
&P,
t,
u,
v,
is_round);
}
ccl_device_inline bool area_light_sample_from_intersection(
const ccl_global KernelLight *klight,
ccl_private const Intersection *ccl_restrict isect,
const float3 ray_P,
const float3 ray_D,
ccl_private LightSample *ccl_restrict ls)
{
/* area light */
float invarea = fabsf(klight->area.invarea);
float3 Ng = klight->area.dir;
float3 light_P = klight->co;
ls->u = isect->u;
ls->v = isect->v;
ls->D = ray_D;
ls->Ng = Ng;
const bool is_round = (klight->area.invarea < 0.0f);
if (is_round) {
ls->pdf = invarea * lamp_light_pdf(Ng, -ray_D, ls->t);
}
else {
const float3 axis_u = klight->area.axis_u;
const float3 axis_v = klight->area.axis_v;
float sample_len_u = klight->area.len_u;
float sample_len_v = klight->area.len_v;
if (klight->area.cot_half_spread > 0.0f) {
if (!area_light_spread_clamp_area_light(ray_P,
Ng,
&light_P,
axis_u,
&sample_len_u,
axis_v,
&sample_len_v,
klight->area.cot_half_spread)) {
return false;
}
}
ls->pdf = area_light_rect_sample(
ray_P, &light_P, axis_u, sample_len_u, axis_v, sample_len_v, 0, 0, false);
}
ls->eval_fac = 0.25f * invarea;
if (klight->area.cot_half_spread > 0.0f) {
/* Area Light spread angle attenuation */
ls->eval_fac *= area_light_spread_attenuation(
ls->D, ls->Ng, klight->area.cot_half_spread, klight->area.normalize_spread);
if (ls->eval_fac == 0.0f) {
return false;
}
}
return true;
}
template<bool in_volume_segment>
ccl_device_forceinline bool area_light_tree_parameters(const ccl_global KernelLight *klight,
const float3 centroid,
const float3 P,
const float3 N,
const float3 bcone_axis,
ccl_private float &cos_theta_u,
ccl_private float2 &distance,
ccl_private float3 &point_to_centroid)
{
if (!in_volume_segment) {
/* TODO: a cheap substitute for minimal distance between point and primitive. Is it
* worth the overhead to compute the accurate minimal distance? */
float min_distance;
point_to_centroid = safe_normalize_len(centroid - P, &min_distance);
distance = make_float2(min_distance, min_distance);
}
cos_theta_u = FLT_MAX;
const float3 extentu = klight->area.axis_u * klight->area.len_u;
const float3 extentv = klight->area.axis_v * klight->area.len_v;
for (int i = 0; i < 4; i++) {
const float3 corner = ((i & 1) - 0.5f) * extentu + 0.5f * ((i & 2) - 1) * extentv + centroid;
float distance_point_to_corner;
const float3 point_to_corner = safe_normalize_len(corner - P, &distance_point_to_corner);
cos_theta_u = fminf(cos_theta_u, dot(point_to_centroid, point_to_corner));
if (!in_volume_segment) {
distance.x = fmaxf(distance.x, distance_point_to_corner);
}
}
const bool front_facing = dot(bcone_axis, point_to_centroid) < 0;
const bool shape_above_surface = dot(N, centroid - P) + fabsf(dot(N, extentu)) +
fabsf(dot(N, extentv)) >
0;
const bool in_volume = is_zero(N);
return (front_facing && shape_above_surface) || in_volume;
}
CCL_NAMESPACE_END
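
Note on area_light_spread_attenuation() in the file above: the soft-box model can be tried in isolation with a scalar sketch. This is not the kernel code. The name spread_attenuation is hypothetical, cos_a stands in for -dot(D, lightNg), and the assert on cos_a is an added assumption (the light is seen from its front side).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Scalar sketch of the soft-box spread model: the fraction of light that is
// not hidden by the grid slats when the light is seen at angle a from its
// normal. cos_a stands in for -dot(D, lightNg) in the kernel code.
inline float spread_attenuation(float cos_a, float cot_half_spread, float normalize_spread)
{
  // Assumption for this sketch: the shading point is in front of the light.
  assert(cos_a > 0.0f && cos_a <= 1.0f);
  const float sin_a = std::sqrt(std::max(1.0f - cos_a * cos_a, 0.0f));
  const float tan_a = sin_a / cos_a;
  // Falls off linearly in tan(a) and is clamped to zero past the spread angle.
  return std::max((1.0f - cot_half_spread * tan_a) * normalize_spread, 0.0f);
}
```

Looking straight down the light normal (cos_a = 1) returns normalize_spread unchanged. Once cot_half_spread * tan_a reaches 1 the result is zero, which is why area_light_sample_from_intersection() can reject the sample when eval_fac becomes 0.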

View File

@@ -3,6 +3,7 @@
#pragma once
#include "kernel/light/area.h"
#include "kernel/light/common.h"
CCL_NAMESPACE_BEGIN
@@ -130,11 +131,11 @@ ccl_device float background_map_pdf(KernelGlobals kg, float3 direction)
ccl_device_inline bool background_portal_data_fetch_and_check_side(
KernelGlobals kg, float3 P, int index, ccl_private float3 *lightpos, ccl_private float3 *dir)
{
int portal = kernel_data.background.portal_offset + index;
int portal = kernel_data.integrator.portal_offset + index;
const ccl_global KernelLight *klight = &kernel_data_fetch(lights, portal);
*lightpos = make_float3(klight->co[0], klight->co[1], klight->co[2]);
*dir = make_float3(klight->area.dir[0], klight->area.dir[1], klight->area.dir[2]);
*lightpos = klight->co;
*dir = klight->area.dir;
/* Check whether portal is on the right side. */
if (dot(*dir, P - *lightpos) > 1e-4f)
@@ -149,7 +150,7 @@ ccl_device_inline float background_portal_pdf(
float portal_pdf = 0.0f;
int num_possible = 0;
for (int p = 0; p < kernel_data.background.num_portals; p++) {
for (int p = 0; p < kernel_data.integrator.num_portals; p++) {
if (p == ignore_portal)
continue;
@@ -163,12 +164,16 @@ ccl_device_inline float background_portal_pdf(
}
num_possible++;
int portal = kernel_data.background.portal_offset + p;
int portal = kernel_data.integrator.portal_offset + p;
const ccl_global KernelLight *klight = &kernel_data_fetch(lights, portal);
float3 axisu = make_float3(
klight->area.axisu[0], klight->area.axisu[1], klight->area.axisu[2]);
float3 axisv = make_float3(
klight->area.axisv[0], klight->area.axisv[1], klight->area.axisv[2]);
const float3 axis_u = klight->area.axis_u;
const float len_u = klight->area.len_u;
const float3 axis_v = klight->area.axis_v;
const float len_v = klight->area.len_v;
const float3 inv_extent_u = axis_u / len_u;
const float3 inv_extent_v = axis_v / len_v;
bool is_round = (klight->area.invarea < 0.0f);
if (!ray_quad_intersect(P,
@@ -176,8 +181,8 @@ ccl_device_inline float background_portal_pdf(
1e-4f,
FLT_MAX,
lightpos,
axisu,
axisv,
inv_extent_u,
inv_extent_v,
dir,
NULL,
NULL,
@@ -189,10 +194,11 @@ ccl_device_inline float background_portal_pdf(
if (is_round) {
float t;
float3 D = normalize_len(lightpos - P, &t);
portal_pdf += fabsf(klight->area.invarea) * lamp_light_pdf(kg, dir, -D, t);
portal_pdf += fabsf(klight->area.invarea) * lamp_light_pdf(dir, -D, t);
}
else {
portal_pdf += rect_light_sample(P, &lightpos, axisu, axisv, 0.0f, 0.0f, false);
portal_pdf += area_light_rect_sample(
P, &lightpos, axis_u, len_u, axis_v, len_v, 0.0f, 0.0f, false);
}
}
@@ -207,7 +213,7 @@ ccl_device_inline float background_portal_pdf(
ccl_device int background_num_possible_portals(KernelGlobals kg, float3 P)
{
int num_possible_portals = 0;
for (int p = 0; p < kernel_data.background.num_portals; p++) {
for (int p = 0; p < kernel_data.integrator.num_portals; p++) {
float3 lightpos, dir;
if (background_portal_data_fetch_and_check_side(kg, P, p, &lightpos, &dir))
num_possible_portals++;
@@ -231,7 +237,7 @@ ccl_device float3 background_portal_sample(KernelGlobals kg,
/* TODO(sergey): Some smarter way of finding portal to sample
* is welcome.
*/
for (int p = 0; p < kernel_data.background.num_portals; p++) {
for (int p = 0; p < kernel_data.integrator.num_portals; p++) {
/* Search for the sampled portal. */
float3 lightpos, dir;
if (!background_portal_data_fetch_and_check_side(kg, P, p, &lightpos, &dir))
@@ -239,23 +245,24 @@ ccl_device float3 background_portal_sample(KernelGlobals kg,
if (portal == 0) {
/* p is the portal to be sampled. */
int portal = kernel_data.background.portal_offset + p;
int portal = kernel_data.integrator.portal_offset + p;
const ccl_global KernelLight *klight = &kernel_data_fetch(lights, portal);
float3 axisu = make_float3(
klight->area.axisu[0], klight->area.axisu[1], klight->area.axisu[2]);
float3 axisv = make_float3(
klight->area.axisv[0], klight->area.axisv[1], klight->area.axisv[2]);
const float3 axis_u = klight->area.axis_u;
const float3 axis_v = klight->area.axis_v;
const float len_u = klight->area.len_u;
const float len_v = klight->area.len_v;
bool is_round = (klight->area.invarea < 0.0f);
float3 D;
if (is_round) {
lightpos += ellipse_sample(axisu * 0.5f, axisv * 0.5f, randu, randv);
lightpos += ellipse_sample(axis_u * len_u * 0.5f, axis_v * len_v * 0.5f, randu, randv);
float t;
D = normalize_len(lightpos - P, &t);
*pdf = fabsf(klight->area.invarea) * lamp_light_pdf(kg, dir, -D, t);
*pdf = fabsf(klight->area.invarea) * lamp_light_pdf(dir, -D, t);
}
else {
*pdf = rect_light_sample(P, &lightpos, axisu, axisv, randu, randv, true);
*pdf = area_light_rect_sample(
P, &lightpos, axis_u, len_u, axis_v, len_v, randu, randv, true);
D = normalize(lightpos - P);
}
@@ -414,7 +421,7 @@ ccl_device float background_light_pdf(KernelGlobals kg, float3 P, float3 directi
float pdf_fac = (portal_method_pdf + sun_method_pdf + map_method_pdf);
if (pdf_fac == 0.0f) {
/* Use uniform as a fallback if we can't use any strategy. */
return kernel_data.integrator.pdf_lights / M_4PI_F;
return 1.0f / M_4PI_F;
}
pdf_fac = 1.0f / pdf_fac;
@@ -430,7 +437,21 @@ ccl_device float background_light_pdf(KernelGlobals kg, float3 P, float3 directi
pdf += background_map_pdf(kg, direction) * map_method_pdf;
}
return pdf * kernel_data.integrator.pdf_lights;
return pdf;
}
ccl_device_forceinline bool background_light_tree_parameters(const float3 centroid,
ccl_private float &cos_theta_u,
ccl_private float2 &distance,
ccl_private float3 &point_to_centroid)
{
/* Cover the whole sphere */
cos_theta_u = -1.0f;
distance = make_float2(1.0f, 1.0f);
point_to_centroid = -centroid;
return true;
}
CCL_NAMESPACE_END

View File

@@ -7,92 +7,26 @@
CCL_NAMESPACE_BEGIN
/* Area light sampling */
/* Light Sample Result */
/* Uses the following paper:
*
* Carlos Urena et al.
* An Area-Preserving Parametrization for Spherical Rectangles.
*
* https://www.solidangle.com/research/egsr2013_spherical_rectangle.pdf
*
* NOTE: light_p is modified when sample_coord is true.
*/
ccl_device_inline float rect_light_sample(float3 P,
ccl_private float3 *light_p,
float3 axisu,
float3 axisv,
float randu,
float randv,
bool sample_coord)
{
/* In our name system we're using P for the center,
* which is o in the paper.
*/
typedef struct LightSample {
float3 P; /* position on light, or direction for distant light */
float3 Ng; /* normal on light */
float3 D; /* direction from shading point to light */
float t; /* distance to light (FLT_MAX for distant light) */
float u, v; /* parametric coordinate on primitive */
float pdf; /* pdf for selecting light and point on light */
float pdf_selection; /* pdf for selecting light */
float eval_fac; /* intensity multiplier */
int object; /* object id for triangle/curve lights */
int prim; /* primitive id for triangle/curve lights */
int shader; /* shader id */
int lamp; /* lamp id */
int group; /* lightgroup */
LightType type; /* type of light */
} LightSample;
float3 corner = *light_p - axisu * 0.5f - axisv * 0.5f;
float axisu_len, axisv_len;
/* Compute local reference system R. */
float3 x = normalize_len(axisu, &axisu_len);
float3 y = normalize_len(axisv, &axisv_len);
float3 z = cross(x, y);
/* Compute rectangle coords in local reference system. */
float3 dir = corner - P;
float z0 = dot(dir, z);
/* Flip 'z' to make it point against Q. */
if (z0 > 0.0f) {
z *= -1.0f;
z0 *= -1.0f;
}
float x0 = dot(dir, x);
float y0 = dot(dir, y);
float x1 = x0 + axisu_len;
float y1 = y0 + axisv_len;
/* Compute internal angles (gamma_i). */
float4 diff = make_float4(x0, y1, x1, y0) - make_float4(x1, y0, x0, y1);
float4 nz = make_float4(y0, x1, y1, x0) * diff;
nz = nz / sqrt(z0 * z0 * diff * diff + nz * nz);
float g0 = safe_acosf(-nz.x * nz.y);
float g1 = safe_acosf(-nz.y * nz.z);
float g2 = safe_acosf(-nz.z * nz.w);
float g3 = safe_acosf(-nz.w * nz.x);
/* Compute predefined constants. */
float b0 = nz.x;
float b1 = nz.z;
float b0sq = b0 * b0;
float k = M_2PI_F - g2 - g3;
/* Compute solid angle from internal angles. */
float S = g0 + g1 - k;
if (sample_coord) {
/* Compute cu. */
float au = randu * S + k;
float fu = (cosf(au) * b0 - b1) / sinf(au);
float cu = 1.0f / sqrtf(fu * fu + b0sq) * (fu > 0.0f ? 1.0f : -1.0f);
cu = clamp(cu, -1.0f, 1.0f);
/* Compute xu. */
float xu = -(cu * z0) / max(sqrtf(1.0f - cu * cu), 1e-7f);
xu = clamp(xu, x0, x1);
/* Compute yv. */
float z0sq = z0 * z0;
float y0sq = y0 * y0;
float y1sq = y1 * y1;
float d = sqrtf(xu * xu + z0sq);
float h0 = y0 / sqrtf(d * d + y0sq);
float h1 = y1 / sqrtf(d * d + y1sq);
float hv = h0 + randv * (h1 - h0), hv2 = hv * hv;
float yv = (hv2 < 1.0f - 1e-6f) ? (hv * d) / sqrtf(1.0f - hv2) : y1;
/* Transform (xu, yv, z0) to world coords. */
*light_p = P + xu * x + yv * y + z0 * z;
}
/* return pdf */
if (S != 0.0f)
return 1.0f / S;
else
return 0.0f;
}
/* Utilities */
ccl_device_inline float3 ellipse_sample(float3 ru, float3 rv, float randu, float randv)
{
@@ -109,99 +43,7 @@ ccl_device float3 disk_light_sample(float3 v, float randu, float randv)
return ellipse_sample(ru, rv, randu, randv);
}
ccl_device float3 distant_light_sample(float3 D, float radius, float randu, float randv)
{
return normalize(D + disk_light_sample(D, randu, randv) * radius);
}
ccl_device float3
sphere_light_sample(float3 P, float3 center, float radius, float randu, float randv)
{
return disk_light_sample(normalize(P - center), randu, randv) * radius;
}
ccl_device float spot_light_attenuation(float3 dir, float spot_angle, float spot_smooth, float3 N)
{
float attenuation = dot(dir, N);
if (attenuation <= spot_angle) {
attenuation = 0.0f;
}
else {
float t = attenuation - spot_angle;
if (t < spot_smooth && spot_smooth != 0.0f)
attenuation *= smoothstepf(t / spot_smooth);
}
return attenuation;
}
ccl_device float light_spread_attenuation(const float3 D,
const float3 lightNg,
const float tan_spread,
const float normalize_spread)
{
/* Model a soft-box grid, computing the ratio of light not hidden by the
* slats of the grid at a given angle. (see D10594). */
const float cos_a = -dot(D, lightNg);
const float sin_a = safe_sqrtf(1.0f - sqr(cos_a));
const float tan_a = sin_a / cos_a;
return max((1.0f - (tan_spread * tan_a)) * normalize_spread, 0.0f);
}
/* Compute subset of area light that actually has an influence on the shading point, to
* reduce noise with low spread. */
ccl_device bool light_spread_clamp_area_light(const float3 P,
const float3 lightNg,
ccl_private float3 *lightP,
ccl_private float3 *axisu,
ccl_private float3 *axisv,
const float tan_spread)
{
/* Closest point in area light plane and distance to that plane. */
const float3 closest_P = P - dot(lightNg, P - *lightP) * lightNg;
const float t = len(closest_P - P);
/* Radius of circle on area light that actually affects the shading point. */
const float radius = t / tan_spread;
/* TODO: would be faster to store as normalized vector + length, also in rect_light_sample. */
float len_u, len_v;
const float3 u = normalize_len(*axisu, &len_u);
const float3 v = normalize_len(*axisv, &len_v);
/* Local uv coordinates of closest point. */
const float closest_u = dot(u, closest_P - *lightP);
const float closest_v = dot(v, closest_P - *lightP);
/* Compute rectangle encompassing the circle that affects the shading point,
* clamped to the bounds of the area light. */
const float min_u = max(closest_u - radius, -len_u * 0.5f);
const float max_u = min(closest_u + radius, len_u * 0.5f);
const float min_v = max(closest_v - radius, -len_v * 0.5f);
const float max_v = min(closest_v + radius, len_v * 0.5f);
/* Skip if rectangle is empty. */
if (min_u >= max_u || min_v >= max_v) {
return false;
}
/* Compute new area light center position and axes from rectangle in local
* uv coordinates. */
const float new_center_u = 0.5f * (min_u + max_u);
const float new_center_v = 0.5f * (min_v + max_v);
const float new_len_u = max_u - min_u;
const float new_len_v = max_v - min_v;
*lightP = *lightP + new_center_u * u + new_center_v * v;
*axisu = u * new_len_u;
*axisv = v * new_len_v;
return true;
}
ccl_device float lamp_light_pdf(KernelGlobals kg, const float3 Ng, const float3 I, float t)
ccl_device float lamp_light_pdf(const float3 Ng, const float3 I, float t)
{
float cos_pi = dot(Ng, I);

View File

@@ -0,0 +1,127 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#pragma once
#include "kernel/geom/geom.h"
#include "kernel/light/common.h"
CCL_NAMESPACE_BEGIN
ccl_device_inline bool distant_light_sample(const ccl_global KernelLight *klight,
const float randu,
const float randv,
ccl_private LightSample *ls)
{
/* distant light */
float3 lightD = klight->co;
float3 D = lightD;
float radius = klight->distant.radius;
float invarea = klight->distant.invarea;
if (radius > 0.0f) {
D = normalize(D + disk_light_sample(D, randu, randv) * radius);
}
ls->P = D;
ls->Ng = D;
ls->D = -D;
ls->t = FLT_MAX;
float costheta = dot(lightD, D);
ls->pdf = invarea / (costheta * costheta * costheta);
ls->eval_fac = ls->pdf;
return true;
}
ccl_device bool distant_light_sample_from_intersection(KernelGlobals kg,
const float3 ray_D,
const int lamp,
ccl_private LightSample *ccl_restrict ls)
{
ccl_global const KernelLight *klight = &kernel_data_fetch(lights, lamp);
const int shader = klight->shader_id;
const float radius = klight->distant.radius;
const LightType type = (LightType)klight->type;
if (type != LIGHT_DISTANT) {
return false;
}
if (!(shader & SHADER_USE_MIS)) {
return false;
}
if (radius == 0.0f) {
return false;
}
/* a distant light is infinitely far away, but equivalent to a disk
* shaped light exactly 1 unit away from the current shading point.
*
* radius t^2/cos(theta)
* <----------> t = sqrt(1^2 + tan(theta)^2)
* tan(th) area = radius*radius*pi
* <----->
* \ | (1 + tan(theta)^2)/cos(theta)
* \ | (1 + tan(acos(cos(theta)))^2)/cos(theta)
* t \th| 1 simplifies to
* \-| 1/(cos(theta)^3)
* \| magic!
* P
*/
float3 lightD = klight->co;
float costheta = dot(-lightD, ray_D);
float cosangle = klight->distant.cosangle;
/* Workaround to prevent a hang in the classroom scene with AMD HIP drivers 22.10.
* Remove when a compiler fix is available. */
#ifdef __HIP__
ls->shader = klight->shader_id;
#endif
if (costheta < cosangle)
return false;
ls->type = type;
#ifndef __HIP__
ls->shader = klight->shader_id;
#endif
ls->object = PRIM_NONE;
ls->prim = PRIM_NONE;
ls->lamp = lamp;
/* todo: missing texture coordinates */
ls->u = 0.0f;
ls->v = 0.0f;
ls->t = FLT_MAX;
ls->P = -ray_D;
ls->Ng = -ray_D;
ls->D = ray_D;
ls->group = lamp_lightgroup(kg, lamp);
/* compute pdf */
float invarea = klight->distant.invarea;
ls->pdf = invarea / (costheta * costheta * costheta);
ls->eval_fac = ls->pdf;
return true;
}
ccl_device_forceinline bool distant_light_tree_parameters(const float3 centroid,
const float theta_e,
ccl_private float &cos_theta_u,
ccl_private float2 &distance,
ccl_private float3 &point_to_centroid)
{
/* Treating it as a disk light 1 unit away */
cos_theta_u = fast_cosf(theta_e);
distance = make_float2(1.0f / cos_theta_u, 1.0f);
point_to_centroid = -centroid;
return true;
}
CCL_NAMESPACE_END
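
Note on the 1/cos(theta)^3 factor used for ls->pdf in the file above: the ASCII diagram treats the distant light as a disk one unit away, so an area-measure pdf of invarea is converted to solid angle by multiplying with t^2 / cos(theta). A small sketch, with hypothetical function names and a cos_theta > 0 precondition added for illustration, checks that the step-by-step form and the closed form agree.

```cpp
#include <cassert>
#include <cmath>

// Closed form, as written in the kernel: invarea / cos(theta)^3.
inline float distant_pdf(float invarea, float cos_theta)
{
  assert(cos_theta > 0.0f);
  return invarea / (cos_theta * cos_theta * cos_theta);
}

// The same quantity following the comment's derivation step by step:
// the sampled point on the unit-distance disk lies at distance
// t = sqrt(1 + tan(theta)^2), and the area-to-solid-angle factor is t^2 / cos(theta).
inline float distant_pdf_from_geometry(float invarea, float cos_theta)
{
  assert(cos_theta > 0.0f);
  const float sin_theta = std::sqrt(1.0f - cos_theta * cos_theta);
  const float tan_theta = sin_theta / cos_theta;
  const float t = std::sqrt(1.0f + tan_theta * tan_theta);
  return invarea * (t * t) / cos_theta;
}
```

Since 1 + tan(theta)^2 equals 1 / cos(theta)^2, both functions return the same value up to rounding.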

View File

@@ -0,0 +1,80 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#pragma once
#include "kernel/light/light.h"
#include "kernel/light/triangle.h"
CCL_NAMESPACE_BEGIN
/* Simple CDF based sampling over all lights in the scene, without taking into
* account shading position or normal. */
ccl_device int light_distribution_sample(KernelGlobals kg, ccl_private float &randu)
{
/* This is basically std::upper_bound as used by PBRT, to find a point light or
* triangle to emit from, proportional to area. A good improvement would be to
* also sample proportional to power, though it's not so well defined with
* arbitrary shaders. */
int first = 0;
int len = kernel_data.integrator.num_distribution + 1;
float r = randu;
do {
int half_len = len >> 1;
int middle = first + half_len;
if (r < kernel_data_fetch(light_distribution, middle).totarea) {
len = half_len;
}
else {
first = middle + 1;
len = len - half_len - 1;
}
} while (len > 0);
/* Clamping should not be needed but float rounding errors seem to
* make this fail on rare occasions. */
int index = clamp(first - 1, 0, kernel_data.integrator.num_distribution - 1);
/* Rescale to reuse random number. This helps the 2D samples within
* each area light be stratified as well. */
float distr_min = kernel_data_fetch(light_distribution, index).totarea;
float distr_max = kernel_data_fetch(light_distribution, index + 1).totarea;
randu = (r - distr_min) / (distr_max - distr_min);
return index;
}
ccl_device_noinline bool light_distribution_sample(KernelGlobals kg,
ccl_private float &randu,
const float randv,
const float time,
const float3 P,
const int bounce,
const uint32_t path_flag,
ccl_private int &emitter_object,
ccl_private int &emitter_prim,
ccl_private int &emitter_shader_flag,
ccl_private float &emitter_pdf_selection)
{
/* Sample light index from distribution. */
const int index = light_distribution_sample(kg, randu);
ccl_global const KernelLightDistribution *kdistribution = &kernel_data_fetch(light_distribution,
index);
emitter_object = kdistribution->mesh_light.object_id;
emitter_prim = kdistribution->prim;
emitter_shader_flag = kdistribution->mesh_light.shader_flag;
emitter_pdf_selection = kernel_data.integrator.distribution_pdf_lights;
return true;
}
ccl_device_inline float light_distribution_pdf_lamp(KernelGlobals kg)
{
return kernel_data.integrator.distribution_pdf_lights;
}
CCL_NAMESPACE_END
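
Note on light_distribution_sample() in the file above: the hand-written binary search behaves like std::upper_bound over the accumulated areas, followed by a rescale of the random number for reuse. The host-side sketch below uses a plain std::vector as a hypothetical stand-in for the light_distribution table. DistributionSample and sample_cdf are illustrative names, not part of the kernel.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the kernel table: cdf[i] is the normalized area
// accumulated before emitter i, so cdf.front() == 0, cdf.back() == 1 and
// cdf.size() == number of emitters + 1.
struct DistributionSample {
  int index;    // selected emitter
  float randu;  // random number rescaled into [0, 1) for reuse
};

inline DistributionSample sample_cdf(const std::vector<float> &cdf, float r)
{
  assert(cdf.size() >= 2);
  const int num = static_cast<int>(cdf.size()) - 1;
  // First entry strictly greater than r; the emitter is the one before it.
  const int upper = static_cast<int>(std::upper_bound(cdf.begin(), cdf.end(), r) - cdf.begin());
  // Clamp to guard against rounding at the ends, as the kernel does.
  const int index = std::min(std::max(upper - 1, 0), num - 1);
  // Rescale r within the selected interval so it can be reused for
  // sampling a point on the chosen emitter.
  const float lo = cdf[index];
  const float hi = cdf[index + 1];
  return {index, (r - lo) / (hi - lo)};
}
```

With the table {0, 0.25, 0.75, 1}, r = 0.5 falls in the second interval and is rescaled to 0.5 within it. The sketch assumes every interval has non-zero width.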

View File

@@ -3,31 +3,18 @@
#pragma once
#include "kernel/geom/geom.h"
#include "kernel/light/area.h"
#include "kernel/light/background.h"
#include "kernel/light/distant.h"
#include "kernel/light/point.h"
#include "kernel/light/spot.h"
#include "kernel/light/triangle.h"
#include "kernel/sample/mapping.h"
CCL_NAMESPACE_BEGIN
/* Light Sample result */
typedef struct LightSample {
float3 P; /* position on light, or direction for distant light */
float3 Ng; /* normal on light */
float3 D; /* direction from shading point to light */
float t; /* distance to light (FLT_MAX for distant light) */
float u, v; /* parametric coordinate on primitive */
float pdf; /* light sampling probability density function */
float eval_fac; /* intensity multiplier */
int object; /* object id for triangle/curve lights */
int prim; /* primitive id for triangle/curve lights */
int shader; /* shader id */
int lamp; /* lamp id */
int group; /* lightgroup */
LightType type; /* type of light */
} LightSample;
/* Regular Light */
/* Sample point on an individual light. */
template<bool in_volume_segment>
ccl_device_inline bool light_sample(KernelGlobals kg,
@@ -63,28 +50,15 @@ ccl_device_inline bool light_sample(KernelGlobals kg,
ls->Ng = zero_float3();
ls->D = zero_float3();
ls->pdf = 1.0f;
ls->eval_fac = 0.0f;
ls->t = FLT_MAX;
return true;
}
if (type == LIGHT_DISTANT) {
/* distant light */
float3 lightD = make_float3(klight->co[0], klight->co[1], klight->co[2]);
float3 D = lightD;
float radius = klight->distant.radius;
float invarea = klight->distant.invarea;
if (radius > 0.0f)
D = distant_light_sample(D, radius, randu, randv);
ls->P = D;
ls->Ng = D;
ls->D = -D;
ls->t = FLT_MAX;
float costheta = dot(lightD, D);
ls->pdf = invarea / (costheta * costheta * costheta);
ls->eval_fac = ls->pdf;
if (!distant_light_sample(klight, randu, randv, ls)) {
return false;
}
}
else if (type == LIGHT_BACKGROUND) {
/* infinite area light (e.g. light dome or env light) */
@@ -96,139 +70,28 @@ ccl_device_inline bool light_sample(KernelGlobals kg,
ls->t = FLT_MAX;
ls->eval_fac = 1.0f;
}
else if (type == LIGHT_SPOT) {
if (!spot_light_sample<in_volume_segment>(klight, randu, randv, P, ls)) {
return false;
}
}
else if (type == LIGHT_POINT) {
if (!point_light_sample<in_volume_segment>(klight, randu, randv, P, ls)) {
return false;
}
}
else {
ls->P = make_float3(klight->co[0], klight->co[1], klight->co[2]);
if (type == LIGHT_SPOT) {
const float3 center = make_float3(klight->co[0], klight->co[1], klight->co[2]);
const float radius = klight->spot.radius;
const float3 dir = make_float3(
klight->spot.dir[0], klight->spot.dir[1], klight->spot.dir[2]);
/* disk oriented normal */
const float3 lightN = normalize(P - center);
ls->P = center;
if (radius > 0.0f)
/* disk light */
ls->P += disk_light_sample(lightN, randu, randv) * radius;
const float invarea = klight->spot.invarea;
ls->pdf = invarea;
ls->D = normalize_len(ls->P - P, &ls->t);
/* we set the light normal to the outgoing direction to support texturing */
ls->Ng = -ls->D;
ls->eval_fac = (0.25f * M_1_PI_F) * invarea;
/* spot light attenuation */
ls->eval_fac *= spot_light_attenuation(
dir, klight->spot.spot_angle, klight->spot.spot_smooth, -ls->D);
if (!in_volume_segment && ls->eval_fac == 0.0f) {
return false;
}
float2 uv = map_to_sphere(ls->Ng);
ls->u = uv.x;
ls->v = uv.y;
ls->pdf *= lamp_light_pdf(kg, lightN, -ls->D, ls->t);
}
else if (type == LIGHT_POINT) {
float3 center = make_float3(klight->co[0], klight->co[1], klight->co[2]);
float radius = klight->spot.radius;
/* disk oriented normal */
const float3 lightN = normalize(P - center);
ls->P = center;
if (radius > 0.0f) {
ls->P += disk_light_sample(lightN, randu, randv) * radius;
}
ls->pdf = klight->spot.invarea;
ls->D = normalize_len(ls->P - P, &ls->t);
/* we set the light normal to the outgoing direction to support texturing */
ls->Ng = -ls->D;
ls->eval_fac = M_1_PI_F * 0.25f * klight->spot.invarea;
if (!in_volume_segment && ls->eval_fac == 0.0f) {
return false;
}
float2 uv = map_to_sphere(ls->Ng);
ls->u = uv.x;
ls->v = uv.y;
ls->pdf *= lamp_light_pdf(kg, lightN, -ls->D, ls->t);
}
else {
/* area light */
float3 axisu = make_float3(
klight->area.axisu[0], klight->area.axisu[1], klight->area.axisu[2]);
float3 axisv = make_float3(
klight->area.axisv[0], klight->area.axisv[1], klight->area.axisv[2]);
float3 Ng = make_float3(klight->area.dir[0], klight->area.dir[1], klight->area.dir[2]);
float invarea = fabsf(klight->area.invarea);
bool is_round = (klight->area.invarea < 0.0f);
if (!in_volume_segment) {
if (dot(ls->P - P, Ng) > 0.0f) {
return false;
}
}
float3 inplane;
if (is_round || in_volume_segment) {
inplane = ellipse_sample(axisu * 0.5f, axisv * 0.5f, randu, randv);
ls->P += inplane;
ls->pdf = invarea;
}
else {
inplane = ls->P;
float3 sample_axisu = axisu;
float3 sample_axisv = axisv;
if (!in_volume_segment && klight->area.tan_spread > 0.0f) {
if (!light_spread_clamp_area_light(
P, Ng, &ls->P, &sample_axisu, &sample_axisv, klight->area.tan_spread)) {
return false;
}
}
ls->pdf = rect_light_sample(P, &ls->P, sample_axisu, sample_axisv, randu, randv, true);
inplane = ls->P - inplane;
}
const float light_u = dot(inplane, axisu) * (1.0f / dot(axisu, axisu));
const float light_v = dot(inplane, axisv) * (1.0f / dot(axisv, axisv));
/* NOTE: Return barycentric coordinates in the same notation as Embree and OptiX. */
ls->u = light_v + 0.5f;
ls->v = -light_u - light_v;
ls->Ng = Ng;
ls->D = normalize_len(ls->P - P, &ls->t);
ls->eval_fac = 0.25f * invarea;
if (klight->area.tan_spread > 0.0f) {
/* Area Light spread angle attenuation */
ls->eval_fac *= light_spread_attenuation(
ls->D, ls->Ng, klight->area.tan_spread, klight->area.normalize_spread);
}
if (is_round) {
ls->pdf *= lamp_light_pdf(kg, Ng, -ls->D, ls->t);
}
/* area light */
if (!area_light_sample<in_volume_segment>(klight, randu, randv, P, ls)) {
return false;
}
}
ls->pdf *= kernel_data.integrator.pdf_lights;
return in_volume_segment || (ls->pdf > 0.0f);
}
/* Intersect ray with individual light. */
ccl_device bool lights_intersect(KernelGlobals kg,
IntegratorState state,
ccl_private const Ray *ccl_restrict ray,
@@ -238,7 +101,7 @@ ccl_device bool lights_intersect(KernelGlobals kg,
const int last_type,
const uint32_t path_flag)
{
for (int lamp = 0; lamp < kernel_data.integrator.num_all_lights; lamp++) {
for (int lamp = 0; lamp < kernel_data.integrator.num_lights; lamp++) {
const ccl_global KernelLight *klight = &kernel_data_fetch(lights, lamp);
if (path_flag & PATH_RAY_CAMERA) {
@@ -271,76 +134,17 @@ ccl_device bool lights_intersect(KernelGlobals kg,
float t = 0.0f, u = 0.0f, v = 0.0f;
if (type == LIGHT_SPOT) {
/* Spot/Disk light. */
const float3 lightP = make_float3(klight->co[0], klight->co[1], klight->co[2]);
const float radius = klight->spot.radius;
if (radius == 0.0f) {
continue;
}
/* disk oriented normal */
const float3 lightN = normalize(ray->P - lightP);
/* One sided. */
if (dot(ray->D, lightN) >= 0.0f) {
continue;
}
float3 P;
if (!ray_disk_intersect(
ray->P, ray->D, ray->tmin, ray->tmax, lightP, lightN, radius, &P, &t)) {
if (!spot_light_intersect(klight, ray, &t)) {
continue;
}
}
else if (type == LIGHT_POINT) {
/* Sphere light (aka, aligned disk light). */
const float3 lightP = make_float3(klight->co[0], klight->co[1], klight->co[2]);
const float radius = klight->spot.radius;
if (radius == 0.0f) {
continue;
}
/* disk oriented normal */
const float3 lightN = normalize(ray->P - lightP);
float3 P;
if (!ray_disk_intersect(
ray->P, ray->D, ray->tmin, ray->tmax, lightP, lightN, radius, &P, &t)) {
if (!point_light_intersect(klight, ray, &t)) {
continue;
}
}
else if (type == LIGHT_AREA) {
/* Area light. */
const float invarea = fabsf(klight->area.invarea);
const bool is_round = (klight->area.invarea < 0.0f);
if (invarea == 0.0f) {
continue;
}
const float3 axisu = make_float3(
klight->area.axisu[0], klight->area.axisu[1], klight->area.axisu[2]);
const float3 axisv = make_float3(
klight->area.axisv[0], klight->area.axisv[1], klight->area.axisv[2]);
const float3 Ng = make_float3(klight->area.dir[0], klight->area.dir[1], klight->area.dir[2]);
/* One sided. */
if (dot(ray->D, Ng) >= 0.0f) {
continue;
}
const float3 light_P = make_float3(klight->co[0], klight->co[1], klight->co[2]);
float3 P;
if (!ray_quad_intersect(ray->P,
ray->D,
ray->tmin,
ray->tmax,
light_P,
axisu,
axisv,
Ng,
&P,
&t,
&u,
&v,
is_round)) {
if (!area_light_intersect(klight, ray, &t, &u, &v)) {
continue;
}
}
@@ -362,78 +166,7 @@ ccl_device bool lights_intersect(KernelGlobals kg,
return isect->prim != PRIM_NONE;
}
ccl_device bool light_sample_from_distant_ray(KernelGlobals kg,
const float3 ray_D,
const int lamp,
ccl_private LightSample *ccl_restrict ls)
{
ccl_global const KernelLight *klight = &kernel_data_fetch(lights, lamp);
const int shader = klight->shader_id;
const float radius = klight->distant.radius;
const LightType type = (LightType)klight->type;
if (type != LIGHT_DISTANT) {
return false;
}
if (!(shader & SHADER_USE_MIS)) {
return false;
}
if (radius == 0.0f) {
return false;
}
/* a distant light is infinitely far away, but equivalent to a disk
* shaped light exactly 1 unit away from the current shading point.
*
* radius t^2/cos(theta)
* <----------> t = sqrt(1^2 + tan(theta)^2)
* tan(th) area = radius*radius*pi
* <----->
* \ | (1 + tan(theta)^2)/cos(theta)
* \ | (1 + tan(acos(cos(theta)))^2)/cos(theta)
* t \th| 1 simplifies to
* \-| 1/(cos(theta)^3)
* \| magic!
* P
*/
float3 lightD = make_float3(klight->co[0], klight->co[1], klight->co[2]);
float costheta = dot(-lightD, ray_D);
float cosangle = klight->distant.cosangle;
/* Workaround to prevent a hang in the classroom scene with AMD HIP drivers 22.10,
* Remove when a compiler fix is available. */
#ifdef __HIP__
ls->shader = klight->shader_id;
#endif
if (costheta < cosangle)
return false;
ls->type = type;
#ifndef __HIP__
ls->shader = klight->shader_id;
#endif
ls->object = PRIM_NONE;
ls->prim = PRIM_NONE;
ls->lamp = lamp;
/* todo: missing texture coordinates */
ls->u = 0.0f;
ls->v = 0.0f;
ls->t = FLT_MAX;
ls->P = -ray_D;
ls->Ng = -ray_D;
ls->D = ray_D;
ls->group = lamp_lightgroup(kg, lamp);
/* compute pdf */
float invarea = klight->distant.invarea;
ls->pdf = invarea / (costheta * costheta * costheta);
ls->eval_fac = ls->pdf;
ls->pdf *= kernel_data.integrator.pdf_lights;
return true;
}
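Review note on the removed `light_sample_from_distant_ray`: the ASCII diagram in its comment compresses the area-to-solid-angle conversion into a few lines. Written out, under the comment's own assumption that the distant light is a disk of radius `radius` placed one unit from the shading point, the derivation that ends in `invarea / (costheta * costheta * costheta)` is roughly the following (the symbols $p_A$ and $p_\omega$ are introduced here for illustration only):

```latex
% Point on the disk seen under angle \theta from the light axis:
t = \sqrt{1 + \tan^2\theta} = \frac{1}{\cos\theta}
% Uniform density over the disk area:
p_A = \frac{1}{\pi\,\mathrm{radius}^2} = \mathrm{invarea}
% Change of measure from area to solid angle:
p_\omega = p_A \, \frac{t^2}{\cos\theta}
         = \frac{\mathrm{invarea}}{\cos^3\theta}
```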
/* Setup light sample from intersection. */
ccl_device bool light_sample_from_intersection(KernelGlobals kg,
ccl_private const Intersection *ccl_restrict isect,
@@ -456,102 +189,18 @@ ccl_device bool light_sample_from_intersection(KernelGlobals kg,
ls->group = lamp_lightgroup(kg, lamp);
if (type == LIGHT_SPOT) {
const float3 center = make_float3(klight->co[0], klight->co[1], klight->co[2]);
const float3 dir = make_float3(klight->spot.dir[0], klight->spot.dir[1], klight->spot.dir[2]);
/* the normal of the oriented disk */
const float3 lightN = normalize(ray_P - center);
/* We set the light normal to the outgoing direction to support texturing. */
ls->Ng = -ls->D;
float invarea = klight->spot.invarea;
ls->eval_fac = (0.25f * M_1_PI_F) * invarea;
ls->pdf = invarea;
/* spot light attenuation */
ls->eval_fac *= spot_light_attenuation(
dir, klight->spot.spot_angle, klight->spot.spot_smooth, -ls->D);
if (ls->eval_fac == 0.0f) {
if (!spot_light_sample_from_intersection(klight, isect, ray_P, ray_D, ls)) {
return false;
}
float2 uv = map_to_sphere(ls->Ng);
ls->u = uv.x;
ls->v = uv.y;
/* compute pdf */
if (ls->t != FLT_MAX)
ls->pdf *= lamp_light_pdf(kg, lightN, -ls->D, ls->t);
else
ls->pdf = 0.f;
}
else if (type == LIGHT_POINT) {
const float3 center = make_float3(klight->co[0], klight->co[1], klight->co[2]);
const float3 lighN = normalize(ray_P - center);
/* We set the light normal to the outgoing direction to support texturing. */
ls->Ng = -ls->D;
float invarea = klight->spot.invarea;
ls->eval_fac = (0.25f * M_1_PI_F) * invarea;
ls->pdf = invarea;
if (ls->eval_fac == 0.0f) {
if (!point_light_sample_from_intersection(klight, isect, ray_P, ray_D, ls)) {
return false;
}
float2 uv = map_to_sphere(ls->Ng);
ls->u = uv.x;
ls->v = uv.y;
/* compute pdf */
if (ls->t != FLT_MAX)
ls->pdf *= lamp_light_pdf(kg, lighN, -ls->D, ls->t);
else
ls->pdf = 0.f;
}
else if (type == LIGHT_AREA) {
/* area light */
float invarea = fabsf(klight->area.invarea);
float3 axisu = make_float3(
klight->area.axisu[0], klight->area.axisu[1], klight->area.axisu[2]);
float3 axisv = make_float3(
klight->area.axisv[0], klight->area.axisv[1], klight->area.axisv[2]);
float3 Ng = make_float3(klight->area.dir[0], klight->area.dir[1], klight->area.dir[2]);
float3 light_P = make_float3(klight->co[0], klight->co[1], klight->co[2]);
ls->u = isect->u;
ls->v = isect->v;
ls->D = ray_D;
ls->Ng = Ng;
const bool is_round = (klight->area.invarea < 0.0f);
if (is_round) {
ls->pdf = invarea * lamp_light_pdf(kg, Ng, -ray_D, ls->t);
}
else {
float3 sample_axisu = axisu;
float3 sample_axisv = axisv;
if (klight->area.tan_spread > 0.0f) {
if (!light_spread_clamp_area_light(
ray_P, Ng, &light_P, &sample_axisu, &sample_axisv, klight->area.tan_spread)) {
return false;
}
}
ls->pdf = rect_light_sample(ray_P, &light_P, sample_axisu, sample_axisv, 0, 0, false);
}
ls->eval_fac = 0.25f * invarea;
if (klight->area.tan_spread > 0.0f) {
/* Area Light spread angle attenuation */
ls->eval_fac *= light_spread_attenuation(
ls->D, ls->Ng, klight->area.tan_spread, klight->area.normalize_spread);
if (ls->eval_fac == 0.0f) {
return false;
}
if (!area_light_sample_from_intersection(klight, isect, ray_P, ray_D, ls)) {
return false;
}
}
else {
@@ -559,411 +208,33 @@ ccl_device bool light_sample_from_intersection(KernelGlobals kg,
return false;
}
ls->pdf *= kernel_data.integrator.pdf_lights;
return true;
}
/* Triangle Light */
/* Update light sample for changed new position, for MNEE. */
/* Returns true if the triangle has motion blur or an instancing transform applied. */
ccl_device_inline bool triangle_world_space_vertices(
KernelGlobals kg, int object, int prim, float time, float3 V[3])
{
bool has_motion = false;
const int object_flag = kernel_data_fetch(object_flag, object);
if (object_flag & SD_OBJECT_HAS_VERTEX_MOTION && time >= 0.0f) {
motion_triangle_vertices(kg, object, prim, time, V);
has_motion = true;
}
else {
triangle_vertices(kg, prim, V);
}
if (!(object_flag & SD_OBJECT_TRANSFORM_APPLIED)) {
#ifdef __OBJECT_MOTION__
float object_time = (time >= 0.0f) ? time : 0.5f;
Transform tfm = object_fetch_transform_motion_test(kg, object, object_time, NULL);
#else
Transform tfm = object_fetch_transform(kg, object, OBJECT_TRANSFORM);
#endif
V[0] = transform_point(&tfm, V[0]);
V[1] = transform_point(&tfm, V[1]);
V[2] = transform_point(&tfm, V[2]);
has_motion = true;
}
return has_motion;
}
ccl_device_inline float triangle_light_pdf_area(KernelGlobals kg,
const float3 Ng,
const float3 I,
float t)
{
float pdf = kernel_data.integrator.pdf_triangles;
float cos_pi = fabsf(dot(Ng, I));
if (cos_pi == 0.0f)
return 0.0f;
return t * t * pdf / cos_pi;
}
ccl_device_forceinline float triangle_light_pdf(KernelGlobals kg,
ccl_private const ShaderData *sd,
float t)
{
/* A naive heuristic to decide between costly solid angle sampling
* and simple area sampling, comparing the distance to the triangle plane
* to the length of the edges of the triangle. */
float3 V[3];
bool has_motion = triangle_world_space_vertices(kg, sd->object, sd->prim, sd->time, V);
const float3 e0 = V[1] - V[0];
const float3 e1 = V[2] - V[0];
const float3 e2 = V[2] - V[1];
const float longest_edge_squared = max(len_squared(e0), max(len_squared(e1), len_squared(e2)));
const float3 N = cross(e0, e1);
const float distance_to_plane = fabsf(dot(N, sd->I * t)) / dot(N, N);
if (longest_edge_squared > distance_to_plane * distance_to_plane) {
/* sd contains the point on the light source
* calculate Px, the point that we're shading */
const float3 Px = sd->P + sd->I * t;
const float3 v0_p = V[0] - Px;
const float3 v1_p = V[1] - Px;
const float3 v2_p = V[2] - Px;
const float3 u01 = safe_normalize(cross(v0_p, v1_p));
const float3 u02 = safe_normalize(cross(v0_p, v2_p));
const float3 u12 = safe_normalize(cross(v1_p, v2_p));
const float alpha = fast_acosf(dot(u02, u01));
const float beta = fast_acosf(-dot(u01, u12));
const float gamma = fast_acosf(dot(u02, u12));
const float solid_angle = alpha + beta + gamma - M_PI_F;
/* pdf_triangles is calculated over triangle area, but we're not sampling over its area */
if (UNLIKELY(solid_angle == 0.0f)) {
return 0.0f;
}
else {
float area = 1.0f;
if (has_motion) {
/* get the center frame vertices, this is what the PDF was calculated from */
triangle_world_space_vertices(kg, sd->object, sd->prim, -1.0f, V);
area = triangle_area(V[0], V[1], V[2]);
}
else {
area = 0.5f * len(N);
}
const float pdf = area * kernel_data.integrator.pdf_triangles;
return pdf / solid_angle;
}
}
else {
float pdf = triangle_light_pdf_area(kg, sd->Ng, sd->I, t);
if (has_motion) {
const float area = 0.5f * len(N);
if (UNLIKELY(area == 0.0f)) {
return 0.0f;
}
/* scale the PDF.
* area = the area the sample was taken from
* area_pre = the area from which pdf_triangles was calculated */
triangle_world_space_vertices(kg, sd->object, sd->prim, -1.0f, V);
const float area_pre = triangle_area(V[0], V[1], V[2]);
pdf = pdf * area_pre / area;
}
return pdf;
}
}
template<bool in_volume_segment>
ccl_device_forceinline void triangle_light_sample(KernelGlobals kg,
int prim,
int object,
float randu,
float randv,
float time,
ccl_device_forceinline void light_update_position(KernelGlobals kg,
ccl_private LightSample *ls,
const float3 P)
{
/* A naive heuristic to decide between costly solid angle sampling
* and simple area sampling, comparing the distance to the triangle plane
* to the length of the edges of the triangle. */
const ccl_global KernelLight *klight = &kernel_data_fetch(lights, ls->lamp);
float3 V[3];
bool has_motion = triangle_world_space_vertices(kg, object, prim, time, V);
const float3 e0 = V[1] - V[0];
const float3 e1 = V[2] - V[0];
const float3 e2 = V[2] - V[1];
const float longest_edge_squared = max(len_squared(e0), max(len_squared(e1), len_squared(e2)));
const float3 N0 = cross(e0, e1);
float Nl = 0.0f;
ls->Ng = safe_normalize_len(N0, &Nl);
float area = 0.5f * Nl;
/* flip normal if necessary */
const int object_flag = kernel_data_fetch(object_flag, object);
if (object_flag & SD_OBJECT_NEGATIVE_SCALE_APPLIED) {
ls->Ng = -ls->Ng;
if (ls->type == LIGHT_POINT) {
point_light_update_position(klight, ls, P);
}
ls->eval_fac = 1.0f;
ls->shader = kernel_data_fetch(tri_shader, prim);
ls->object = object;
ls->prim = prim;
ls->lamp = LAMP_NONE;
ls->shader |= SHADER_USE_MIS;
ls->type = LIGHT_TRIANGLE;
ls->group = object_lightgroup(kg, object);
float distance_to_plane = fabsf(dot(N0, V[0] - P) / dot(N0, N0));
if (!in_volume_segment && (longest_edge_squared > distance_to_plane * distance_to_plane)) {
/* see James Arvo, "Stratified Sampling of Spherical Triangles"
* http://www.graphics.cornell.edu/pubs/1995/Arv95c.pdf */
/* project the triangle to the unit sphere
* and calculate its edges and angles */
const float3 v0_p = V[0] - P;
const float3 v1_p = V[1] - P;
const float3 v2_p = V[2] - P;
const float3 u01 = safe_normalize(cross(v0_p, v1_p));
const float3 u02 = safe_normalize(cross(v0_p, v2_p));
const float3 u12 = safe_normalize(cross(v1_p, v2_p));
const float3 A = safe_normalize(v0_p);
const float3 B = safe_normalize(v1_p);
const float3 C = safe_normalize(v2_p);
const float cos_alpha = dot(u02, u01);
const float cos_beta = -dot(u01, u12);
const float cos_gamma = dot(u02, u12);
/* calculate dihedral angles */
const float alpha = fast_acosf(cos_alpha);
const float beta = fast_acosf(cos_beta);
const float gamma = fast_acosf(cos_gamma);
/* the area of the unit spherical triangle = solid angle */
const float solid_angle = alpha + beta + gamma - M_PI_F;
/* precompute a few things
* these could be re-used to take several samples
* as they are independent of randu/randv */
const float cos_c = dot(A, B);
const float sin_alpha = fast_sinf(alpha);
const float product = sin_alpha * cos_c;
/* Select a random sub-area of the spherical triangle
* and calculate the third vertex C_ of that new triangle */
const float phi = randu * solid_angle - alpha;
float s, t;
fast_sincosf(phi, &s, &t);
const float u = t - cos_alpha;
const float v = s + product;
const float3 U = safe_normalize(C - dot(C, A) * A);
float q = 1.0f;
const float det = ((v * s + u * t) * sin_alpha);
if (det != 0.0f) {
q = ((v * t - u * s) * cos_alpha - v) / det;
}
const float temp = max(1.0f - q * q, 0.0f);
const float3 C_ = safe_normalize(q * A + sqrtf(temp) * U);
/* Finally, select a random point along the edge of the new triangle
* That point on the spherical triangle is the sampled ray direction */
const float z = 1.0f - randv * (1.0f - dot(C_, B));
ls->D = z * B + safe_sqrtf(1.0f - z * z) * safe_normalize(C_ - dot(C_, B) * B);
/* calculate intersection with the planar triangle */
if (!ray_triangle_intersect(
P, ls->D, 0.0f, FLT_MAX, V[0], V[1], V[2], &ls->u, &ls->v, &ls->t)) {
ls->pdf = 0.0f;
return;
}
ls->P = P + ls->D * ls->t;
/* pdf_triangles is calculated over triangle area, but we're sampling over solid angle */
if (UNLIKELY(solid_angle == 0.0f)) {
ls->pdf = 0.0f;
return;
}
else {
if (has_motion) {
/* get the center frame vertices, this is what the PDF was calculated from */
triangle_world_space_vertices(kg, object, prim, -1.0f, V);
area = triangle_area(V[0], V[1], V[2]);
}
const float pdf = area * kernel_data.integrator.pdf_triangles;
ls->pdf = pdf / solid_angle;
}
else if (ls->type == LIGHT_SPOT) {
spot_light_update_position(klight, ls, P);
}
else {
/* compute random point in triangle. From Eric Heitz's "A Low-Distortion Map Between Triangle
* and Square" */
float u = randu;
float v = randv;
if (v > u) {
u *= 0.5f;
v -= u;
}
else {
v *= 0.5f;
u -= v;
}
const float t = 1.0f - u - v;
ls->P = u * V[0] + v * V[1] + t * V[2];
/* compute incoming direction, distance and pdf */
ls->D = normalize_len(ls->P - P, &ls->t);
ls->pdf = triangle_light_pdf_area(kg, ls->Ng, -ls->D, ls->t);
if (has_motion && area != 0.0f) {
/* scale the PDF.
* area = the area the sample was taken from
* area_pre = the area from which pdf_triangles was calculated */
triangle_world_space_vertices(kg, object, prim, -1.0f, V);
const float area_pre = triangle_area(V[0], V[1], V[2]);
ls->pdf = ls->pdf * area_pre / area;
}
ls->u = u;
ls->v = v;
else if (ls->type == LIGHT_AREA) {
area_light_update_position(klight, ls, P);
}
}
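Review note on the removed area-sampling branch of `triangle_light_sample`: the square-to-triangle mapping credited to Eric Heitz in the comment can be isolated as below. This is a minimal standalone sketch, not kernel code; `Barycentric` and `square_to_triangle` are hypothetical names, and the arithmetic mirrors the `if (v > u)` branch in the diff.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for the three barycentric weights.
struct Barycentric {
  float u, v, w;
};

// Map a point of the unit square to barycentric coordinates of a triangle,
// following the branch structure of the removed kernel code.
inline Barycentric square_to_triangle(float randu, float randv)
{
  float u = randu;
  float v = randv;
  if (v > u) {
    u *= 0.5f;
    v -= u;
  }
  else {
    v *= 0.5f;
    u -= v;
  }
  // The third weight is whatever remains, so the weights sum to one.
  return {u, v, 1.0f - u - v};
}
```

The sampled position is then `u * V[0] + v * V[1] + w * V[2]`, as in the original code.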
/* Light Distribution */
ccl_device int light_distribution_sample(KernelGlobals kg, ccl_private float *randu)
{
/* This is basically std::upper_bound as used by PBRT, to find a point light or
* triangle to emit from, proportional to area. A good improvement would be to
* also sample proportional to power, though it's not so well defined with
* arbitrary shaders. */
int first = 0;
int len = kernel_data.integrator.num_distribution + 1;
float r = *randu;
do {
int half_len = len >> 1;
int middle = first + half_len;
if (r < kernel_data_fetch(light_distribution, middle).totarea) {
len = half_len;
}
else {
first = middle + 1;
len = len - half_len - 1;
}
} while (len > 0);
/* Clamping should not be needed but float rounding errors seem to
* make this fail on rare occasions. */
int index = clamp(first - 1, 0, kernel_data.integrator.num_distribution - 1);
/* Rescale to reuse random number. This helps the 2D samples within
* each area light be stratified as well. */
float distr_min = kernel_data_fetch(light_distribution, index).totarea;
float distr_max = kernel_data_fetch(light_distribution, index + 1).totarea;
*randu = (r - distr_min) / (distr_max - distr_min);
return index;
}
/* Generic Light */
/* Light info. */
ccl_device_inline bool light_select_reached_max_bounces(KernelGlobals kg, int index, int bounce)
{
return (bounce > kernel_data_fetch(lights, index).max_bounces);
}
template<bool in_volume_segment>
ccl_device_noinline bool light_distribution_sample(KernelGlobals kg,
float randu,
const float randv,
const float time,
const float3 P,
const int bounce,
const uint32_t path_flag,
ccl_private LightSample *ls)
{
/* Sample light index from distribution. */
const int index = light_distribution_sample(kg, &randu);
ccl_global const KernelLightDistribution *kdistribution = &kernel_data_fetch(light_distribution,
index);
const int prim = kdistribution->prim;
if (prim >= 0) {
/* Mesh light. */
const int object = kdistribution->mesh_light.object_id;
/* Exclude synthetic meshes from shadow catcher pass. */
if ((path_flag & PATH_RAY_SHADOW_CATCHER_PASS) &&
!(kernel_data_fetch(object_flag, object) & SD_OBJECT_SHADOW_CATCHER)) {
return false;
}
const int shader_flag = kdistribution->mesh_light.shader_flag;
triangle_light_sample<in_volume_segment>(kg, prim, object, randu, randv, time, ls, P);
ls->shader |= shader_flag;
return (ls->pdf > 0.0f);
}
const int lamp = -prim - 1;
if (UNLIKELY(light_select_reached_max_bounces(kg, lamp, bounce))) {
return false;
}
return light_sample<in_volume_segment>(kg, lamp, randu, randv, P, path_flag, ls);
}
ccl_device_inline bool light_distribution_sample_from_volume_segment(KernelGlobals kg,
float randu,
const float randv,
const float time,
const float3 P,
const int bounce,
const uint32_t path_flag,
ccl_private LightSample *ls)
{
return light_distribution_sample<true>(kg, randu, randv, time, P, bounce, path_flag, ls);
}
ccl_device_inline bool light_distribution_sample_from_position(KernelGlobals kg,
float randu,
const float randv,
const float time,
const float3 P,
const int bounce,
const uint32_t path_flag,
ccl_private LightSample *ls)
{
return light_distribution_sample<false>(kg, randu, randv, time, P, bounce, path_flag, ls);
}
ccl_device_inline bool light_distribution_sample_new_position(KernelGlobals kg,
const float randu,
const float randv,
const float time,
const float3 P,
ccl_private LightSample *ls)
{
/* Sample a new position on the same light, for volume sampling. */
if (ls->type == LIGHT_TRIANGLE) {
triangle_light_sample<false>(kg, ls->prim, ls->object, randu, randv, time, ls, P);
return (ls->pdf > 0.0f);
}
else {
return light_sample<false>(kg, ls->lamp, randu, randv, P, 0, ls);
}
}
CCL_NAMESPACE_END
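Review note on `light_distribution_sample(kg, randu)`: the comment describes the lookup as "basically std::upper_bound" followed by a rescale of the random number. A host-side sketch of that idea, assuming a hypothetical `cdf` vector that plays the role of the `totarea` entries (first entry 0, last entry 1), could look like this:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical result type: the chosen bin plus the reusable random number.
struct DistributionSample {
  int index;
  float randu;
};

// cdf has num + 1 entries; cdf[i] is the cumulative weight before bin i.
inline DistributionSample sample_distribution(const std::vector<float> &cdf, float r)
{
  assert(cdf.size() >= 2);
  const int num = static_cast<int>(cdf.size()) - 1;
  // First entry strictly greater than r; the bin containing r precedes it.
  const int upper = static_cast<int>(std::upper_bound(cdf.begin(), cdf.end(), r) - cdf.begin());
  // Clamp to guard against rounding at either end of the table.
  const int index = std::max(0, std::min(upper - 1, num - 1));
  // Rescale r into [0, 1) within the bin so it can be reused on the emitter.
  const float lo = cdf[index];
  const float hi = cdf[index + 1];
  return {index, (r - lo) / (hi - lo)};
}
```

Reusing the rescaled number keeps the two-dimensional samples on each emitter stratified, which is the motivation given in the original comment.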

View File

@@ -0,0 +1,136 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#pragma once
#include "kernel/light/common.h"
CCL_NAMESPACE_BEGIN
template<bool in_volume_segment>
ccl_device_inline bool point_light_sample(const ccl_global KernelLight *klight,
const float randu,
const float randv,
const float3 P,
ccl_private LightSample *ls)
{
float3 center = klight->co;
float radius = klight->spot.radius;
/* disk oriented normal */
const float3 lightN = normalize(P - center);
ls->P = center;
if (radius > 0.0f) {
ls->P += disk_light_sample(lightN, randu, randv) * radius;
}
ls->pdf = klight->spot.invarea;
ls->D = normalize_len(ls->P - P, &ls->t);
/* we set the light normal to the outgoing direction to support texturing */
ls->Ng = -ls->D;
ls->eval_fac = M_1_PI_F * 0.25f * klight->spot.invarea;
if (!in_volume_segment && ls->eval_fac == 0.0f) {
return false;
}
float2 uv = map_to_sphere(ls->Ng);
ls->u = uv.x;
ls->v = uv.y;
ls->pdf *= lamp_light_pdf(lightN, -ls->D, ls->t);
return true;
}
ccl_device_forceinline void point_light_update_position(const ccl_global KernelLight *klight,
ccl_private LightSample *ls,
const float3 P)
{
ls->D = normalize_len(ls->P - P, &ls->t);
ls->Ng = -ls->D;
float2 uv = map_to_sphere(ls->Ng);
ls->u = uv.x;
ls->v = uv.y;
float invarea = klight->spot.invarea;
ls->eval_fac = (0.25f * M_1_PI_F) * invarea;
ls->pdf = invarea;
}
ccl_device_inline bool point_light_intersect(const ccl_global KernelLight *klight,
const ccl_private Ray *ccl_restrict ray,
ccl_private float *t)
{
/* Sphere light (aka, aligned disk light). */
const float3 lightP = klight->co;
const float radius = klight->spot.radius;
if (radius == 0.0f) {
return false;
}
/* disk oriented normal */
const float3 lightN = normalize(ray->P - lightP);
float3 P;
return ray_disk_intersect(ray->P, ray->D, ray->tmin, ray->tmax, lightP, lightN, radius, &P, t);
}
ccl_device_inline bool point_light_sample_from_intersection(
const ccl_global KernelLight *klight,
ccl_private const Intersection *ccl_restrict isect,
const float3 ray_P,
const float3 ray_D,
ccl_private LightSample *ccl_restrict ls)
{
const float3 lighN = normalize(ray_P - klight->co);
/* We set the light normal to the outgoing direction to support texturing. */
ls->Ng = -ls->D;
float invarea = klight->spot.invarea;
ls->eval_fac = (0.25f * M_1_PI_F) * invarea;
ls->pdf = invarea;
if (ls->eval_fac == 0.0f) {
return false;
}
float2 uv = map_to_sphere(ls->Ng);
ls->u = uv.x;
ls->v = uv.y;
/* compute pdf */
if (ls->t != FLT_MAX) {
ls->pdf *= lamp_light_pdf(lighN, -ls->D, ls->t);
}
else {
ls->pdf = 0.f;
}
return true;
}
template<bool in_volume_segment>
ccl_device_forceinline bool point_light_tree_parameters(const ccl_global KernelLight *klight,
const float3 centroid,
const float3 P,
ccl_private float &cos_theta_u,
ccl_private float2 &distance,
ccl_private float3 &point_to_centroid)
{
if (in_volume_segment) {
cos_theta_u = 1.0f; /* Any value in [-1, 1], irrelevant since theta = 0 */
return true;
}
float min_distance;
point_to_centroid = safe_normalize_len(centroid - P, &min_distance);
const float radius = klight->spot.radius;
const float hypotenus = sqrtf(sqr(radius) + sqr(min_distance));
cos_theta_u = min_distance / hypotenus;
distance = make_float2(hypotenus, min_distance);
return true;
}
CCL_NAMESPACE_END
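Review note on `point_light_tree_parameters`: outside a volume segment, the function bounds the light by a cone whose half-angle cosine is `min_distance / hypotenus`. A standalone sketch of that geometry, with the hypothetical names `ConeBound` and `point_light_cone`, assuming the light is a disk of the given radius facing the shading point:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type mirroring the outputs written by the kernel function.
struct ConeBound {
  float cos_theta_u;   // cosine of the half-angle subtended by the light
  float hypotenuse;    // distance from the shading point to the disk rim
  float min_distance;  // distance from the shading point to the light centre
};

inline ConeBound point_light_cone(float radius, float min_distance)
{
  assert(radius >= 0.0f && min_distance > 0.0f);
  // Right triangle: legs are the radius and the distance to the centre.
  const float hypotenuse = std::sqrt(radius * radius + min_distance * min_distance);
  return {min_distance / hypotenuse, hypotenuse, min_distance};
}
```

For a radius of 3 at a distance of 4 this gives a hypotenuse of 5 and a cosine of 0.8.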

View File

@@ -6,8 +6,13 @@
#include "kernel/integrator/path_state.h"
#include "kernel/integrator/surface_shader.h"
#include "kernel/light/distribution.h"
#include "kernel/light/light.h"
#ifdef __LIGHT_TREE__
# include "kernel/light/tree.h"
#endif
#include "kernel/sample/mapping.h"
#include "kernel/sample/mis.h"
@@ -277,6 +282,8 @@ ccl_device_inline void light_sample_to_volume_shadow_ray(
shadow_ray_setup(sd, ls, P, ray, false);
}
/* Multiple importance sampling weights. */
ccl_device_inline float light_sample_mis_weight_forward(KernelGlobals kg,
const float forward_pdf,
const float nee_pdf)
@@ -309,4 +316,333 @@ ccl_device_inline float light_sample_mis_weight_nee(KernelGlobals kg,
return power_heuristic(nee_pdf, forward_pdf);
}
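Review note on the MIS weights: `light_sample_mis_weight_nee` returns `power_heuristic(nee_pdf, forward_pdf)`, whose definition is outside this diff. The sketch below assumes it is the usual power heuristic with exponent 2; `power_heuristic_sketch` is a hypothetical name, not the kernel function.

```cpp
#include <cassert>
#include <cmath>

// Power heuristic with exponent 2: weight for the strategy whose pdf is `a`
// when combined with a second strategy whose pdf is `b`.
inline float power_heuristic_sketch(float a, float b)
{
  assert(a >= 0.0f && b >= 0.0f && (a + b) > 0.0f);
  return (a * a) / (a * a + b * b);
}
```

Swapping the arguments gives the weight for the other strategy, and the two weights sum to one, which is what keeps the combined estimator unbiased.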
/* Next event estimation sampling.
*
* Sample a position on a light in the scene, from a position on a surface or
* from a volume segment.
*
* Uses either a flat distribution or light tree. */
ccl_device_inline bool light_sample_from_volume_segment(KernelGlobals kg,
float randu,
float randv,
const float time,
const float3 P,
const float3 D,
const float t,
const int bounce,
const uint32_t path_flag,
ccl_private LightSample *ls)
{
/* Select an emitter. */
int emitter_object = 0;
int emitter_prim = 0;
int emitter_shader_flag = 0;
float emitter_pdf_selection = 0.0f;
#ifdef __LIGHT_TREE__
if (kernel_data.integrator.use_light_tree) {
if (!light_tree_sample<true>(kg,
randu,
randv,
time,
P,
D,
t,
SD_BSDF_HAS_TRANSMISSION,
bounce,
path_flag,
emitter_object,
emitter_prim,
emitter_shader_flag,
emitter_pdf_selection)) {
return false;
}
}
else
#endif
{
if (!light_distribution_sample(kg,
randu,
randv,
time,
P,
bounce,
path_flag,
emitter_object,
emitter_prim,
emitter_shader_flag,
emitter_pdf_selection)) {
return false;
}
}
/* Set first, triangle light sampling from flat distribution will override. */
ls->pdf_selection = emitter_pdf_selection;
/* Sample a point on the chosen emitter. */
if (emitter_prim >= 0) {
/* Mesh light. */
/* Exclude synthetic meshes from shadow catcher pass. */
if ((path_flag & PATH_RAY_SHADOW_CATCHER_PASS) &&
!(kernel_data_fetch(object_flag, emitter_object) & SD_OBJECT_SHADOW_CATCHER)) {
return false;
}
if (!triangle_light_sample<true>(
kg, emitter_prim, emitter_object, randu, randv, time, ls, P)) {
return false;
}
}
else {
/* Light object. */
const int lamp = ~emitter_prim;
if (UNLIKELY(light_select_reached_max_bounces(kg, lamp, bounce))) {
return false;
}
if (!light_sample<true>(kg, lamp, randu, randv, P, path_flag, ls)) {
return false;
}
}
ls->pdf *= ls->pdf_selection;
ls->shader |= emitter_shader_flag;
return (ls->pdf > 0);
}
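Review note on the emitter encoding: the new code decodes a lamp with `~emitter_prim`, where the removed `light_distribution_sample` used `-prim - 1`. On two's-complement integers these are the same mapping, so non-negative values stay primitive indices and negative values carry lamp indices. A small sketch with hypothetical helper names:

```cpp
#include <cassert>

// Encode a lamp index (>= 0) as a negative "prim" value.
inline int encode_lamp(int lamp)
{
  assert(lamp >= 0);
  return ~lamp;
}

// Non-negative values are triangle primitives of a mesh light.
inline bool is_mesh_light(int prim)
{
  return prim >= 0;
}

// Recover the lamp index from a negative "prim" value.
inline int decode_lamp(int prim)
{
  assert(prim < 0);
  return ~prim;
}
```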
ccl_device bool light_sample_from_position(KernelGlobals kg,
ccl_private const RNGState *rng_state,
float randu,
float randv,
const float time,
const float3 P,
const float3 N,
const int shader_flags,
const int bounce,
const uint32_t path_flag,
ccl_private LightSample *ls)
{
/* Select an emitter. */
int emitter_object = 0;
int emitter_prim = 0;
int emitter_shader_flag = 0;
float emitter_pdf_selection = 0.0f;
#ifdef __LIGHT_TREE__
if (kernel_data.integrator.use_light_tree) {
if (!light_tree_sample<false>(kg,
randu,
randv,
time,
P,
N,
0,
shader_flags,
bounce,
path_flag,
emitter_object,
emitter_prim,
emitter_shader_flag,
emitter_pdf_selection)) {
return false;
}
}
else
#endif
{
if (!light_distribution_sample(kg,
randu,
randv,
time,
P,
bounce,
path_flag,
emitter_object,
emitter_prim,
emitter_shader_flag,
emitter_pdf_selection)) {
return false;
}
}
/* Set first, triangle light sampling from flat distribution will override. */
ls->pdf_selection = emitter_pdf_selection;
/* Sample a point on the chosen emitter.
* TODO: deduplicate code with light_sample_from_volume_segment? */
if (emitter_prim >= 0) {
/* Mesh light. */
/* Exclude synthetic meshes from shadow catcher pass. */
if ((path_flag & PATH_RAY_SHADOW_CATCHER_PASS) &&
!(kernel_data_fetch(object_flag, emitter_object) & SD_OBJECT_SHADOW_CATCHER)) {
return false;
}
if (!triangle_light_sample<false>(
kg, emitter_prim, emitter_object, randu, randv, time, ls, P)) {
return false;
}
}
else {
/* Light object. */
const int lamp = ~emitter_prim;
if (UNLIKELY(light_select_reached_max_bounces(kg, lamp, bounce))) {
return false;
}
if (!light_sample<false>(kg, lamp, randu, randv, P, path_flag, ls)) {
return false;
}
}
ls->pdf *= ls->pdf_selection;
ls->shader |= emitter_shader_flag;
return (ls->pdf > 0);
}
ccl_device_inline bool light_sample_new_position(KernelGlobals kg,
const float randu,
const float randv,
const float time,
const float3 P,
ccl_private LightSample *ls)
{
/* Sample a new position on the same light, for volume sampling. */
if (ls->type == LIGHT_TRIANGLE) {
if (!triangle_light_sample<false>(kg, ls->prim, ls->object, randu, randv, time, ls, P)) {
return false;
}
#ifdef __LIGHT_TREE__
if (kernel_data.integrator.use_light_tree) {
ls->pdf *= ls->pdf_selection;
}
else
#endif
{
/* Handled in triangle_light_sample for efficiency. */
}
return true;
}
else {
if (!light_sample<false>(kg, ls->lamp, randu, randv, P, 0, ls)) {
return false;
}
ls->pdf *= ls->pdf_selection;
return true;
}
}
ccl_device_forceinline void light_sample_update_position(KernelGlobals kg,
ccl_private LightSample *ls,
const float3 P)
{
/* Update light sample for new shading point position, while keeping
* position on the light fixed. */
/* NOTE : preserve pdf in area measure. */
light_update_position(kg, ls, P);
/* Re-apply already computed selection pdf. */
ls->pdf *= ls->pdf_selection;
}
/* Forward sampling.
*
* Multiple importance sampling weights for hitting surface, light or background
* through indirect light ray.
*
* The BSDF or phase pdf from the previous bounce was stored in mis_ray_pdf and
* is used for balancing with the light sampling pdf. */
ccl_device_inline float light_sample_mis_weight_forward_surface(KernelGlobals kg,
IntegratorState state,
const uint32_t path_flag,
const ccl_private ShaderData *sd)
{
const float bsdf_pdf = INTEGRATOR_STATE(state, path, mis_ray_pdf);
const float t = sd->ray_length;
float pdf = triangle_light_pdf(kg, sd, t);
/* Light selection pdf. */
#ifdef __LIGHT_TREE__
if (kernel_data.integrator.use_light_tree) {
float3 ray_P = INTEGRATOR_STATE(state, ray, P);
const float3 N = INTEGRATOR_STATE(state, path, mis_origin_n);
uint lookup_offset = kernel_data_fetch(object_lookup_offset, sd->object);
uint prim_offset = kernel_data_fetch(object_prim_offset, sd->object);
pdf *= light_tree_pdf(kg, ray_P, N, path_flag, sd->prim - prim_offset + lookup_offset);
}
else
#endif
{
/* Handled in triangle_light_pdf for efficiency. */
}
return light_sample_mis_weight_forward(kg, bsdf_pdf, pdf);
}
ccl_device_inline float light_sample_mis_weight_forward_lamp(KernelGlobals kg,
IntegratorState state,
const uint32_t path_flag,
const ccl_private LightSample *ls,
const float3 P)
{
const float mis_ray_pdf = INTEGRATOR_STATE(state, path, mis_ray_pdf);
float pdf = ls->pdf;
/* Light selection pdf. */
#ifdef __LIGHT_TREE__
if (kernel_data.integrator.use_light_tree) {
const float3 N = INTEGRATOR_STATE(state, path, mis_origin_n);
pdf *= light_tree_pdf(kg, P, N, path_flag, ~ls->lamp);
}
else
#endif
{
pdf *= light_distribution_pdf_lamp(kg);
}
return light_sample_mis_weight_forward(kg, mis_ray_pdf, pdf);
}
ccl_device_inline float light_sample_mis_weight_forward_distant(KernelGlobals kg,
IntegratorState state,
const uint32_t path_flag,
const ccl_private LightSample *ls)
{
const float3 ray_P = INTEGRATOR_STATE(state, ray, P);
return light_sample_mis_weight_forward_lamp(kg, state, path_flag, ls, ray_P);
}
ccl_device_inline float light_sample_mis_weight_forward_background(KernelGlobals kg,
IntegratorState state,
const uint32_t path_flag)
{
const float3 ray_P = INTEGRATOR_STATE(state, ray, P);
const float3 ray_D = INTEGRATOR_STATE(state, ray, D);
const float mis_ray_pdf = INTEGRATOR_STATE(state, path, mis_ray_pdf);
float pdf = background_light_pdf(kg, ray_P, ray_D);
/* Light selection pdf. */
#ifdef __LIGHT_TREE__
if (kernel_data.integrator.use_light_tree) {
const float3 N = INTEGRATOR_STATE(state, path, mis_origin_n);
pdf *= light_tree_pdf(kg, ray_P, N, path_flag, ~kernel_data.background.light_index);
}
else
#endif
{
pdf *= light_distribution_pdf_lamp(kg);
}
return light_sample_mis_weight_forward(kg, mis_ray_pdf, pdf);
}
CCL_NAMESPACE_END
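
Review note: the forward MIS helpers above all end in a call to light_sample_mis_weight_forward(kg, forward_pdf, nee_pdf), whose body is not part of this hunk. A minimal standalone sketch of what such a weight could look like follows, assuming the power heuristic with exponent 2. That choice of heuristic and the name mis_weight_power_heuristic are assumptions for illustration, not taken from the diff.

```cpp
#include <cassert>

// Hypothetical stand-in for light_sample_mis_weight_forward(); the real
// function is not shown in this diff and may use a different heuristic.
// forward_pdf: pdf of reaching the light by BSDF/phase sampling (mis_ray_pdf).
// nee_pdf:     pdf of sampling the same point by light sampling, including
//              the light selection pdf.
inline float mis_weight_power_heuristic(const float forward_pdf, const float nee_pdf)
{
  assert(forward_pdf >= 0.0f && nee_pdf >= 0.0f);
  const float a = forward_pdf * forward_pdf;
  const float b = nee_pdf * nee_pdf;
  const float sum = a + b;
  /* Guard against 0/0 when neither strategy can generate the sample. */
  return (sum > 0.0f) ? a / sum : 0.0f;
}
```

Under this assumption, a light that light sampling can never pick (nee_pdf == 0) gets the full weight on the forward path, which is why the selection pdf has to be folded into pdf before the call.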


@@ -0,0 +1,179 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#pragma once
#include "kernel/light/common.h"
CCL_NAMESPACE_BEGIN
ccl_device float spot_light_attenuation(float3 dir,
float cos_half_spot_angle,
float spot_smooth,
float3 N)
{
float attenuation = dot(dir, N);
if (attenuation <= cos_half_spot_angle) {
attenuation = 0.0f;
}
else {
float t = attenuation - cos_half_spot_angle;
if (t < spot_smooth && spot_smooth != 0.0f)
attenuation *= smoothstepf(t / spot_smooth);
}
return attenuation;
}
template<bool in_volume_segment>
ccl_device_inline bool spot_light_sample(const ccl_global KernelLight *klight,
const float randu,
const float randv,
const float3 P,
ccl_private LightSample *ls)
{
ls->P = klight->co;
const float3 center = klight->co;
const float radius = klight->spot.radius;
/* disk oriented normal */
const float3 lightN = normalize(P - center);
ls->P = center;
if (radius > 0.0f) {
/* disk light */
ls->P += disk_light_sample(lightN, randu, randv) * radius;
}
const float invarea = klight->spot.invarea;
ls->pdf = invarea;
ls->D = normalize_len(ls->P - P, &ls->t);
/* we set the light normal to the outgoing direction to support texturing */
ls->Ng = -ls->D;
ls->eval_fac = (0.25f * M_1_PI_F) * invarea;
/* spot light attenuation */
ls->eval_fac *= spot_light_attenuation(
klight->spot.dir, klight->spot.cos_half_spot_angle, klight->spot.spot_smooth, -ls->D);
if (!in_volume_segment && ls->eval_fac == 0.0f) {
return false;
}
float2 uv = map_to_sphere(ls->Ng);
ls->u = uv.x;
ls->v = uv.y;
ls->pdf *= lamp_light_pdf(lightN, -ls->D, ls->t);
return true;
}
ccl_device_forceinline void spot_light_update_position(const ccl_global KernelLight *klight,
ccl_private LightSample *ls,
const float3 P)
{
ls->D = normalize_len(ls->P - P, &ls->t);
ls->Ng = -ls->D;
float2 uv = map_to_sphere(ls->Ng);
ls->u = uv.x;
ls->v = uv.y;
float invarea = klight->spot.invarea;
ls->eval_fac = (0.25f * M_1_PI_F) * invarea;
ls->pdf = invarea;
/* spot light attenuation */
ls->eval_fac *= spot_light_attenuation(
klight->spot.dir, klight->spot.cos_half_spot_angle, klight->spot.spot_smooth, ls->Ng);
}
ccl_device_inline bool spot_light_intersect(const ccl_global KernelLight *klight,
const ccl_private Ray *ccl_restrict ray,
ccl_private float *t)
{
/* Spot/Disk light. */
const float3 lightP = klight->co;
const float radius = klight->spot.radius;
if (radius == 0.0f) {
return false;
}
/* disk oriented normal */
const float3 lightN = normalize(ray->P - lightP);
/* One sided. */
if (dot(ray->D, lightN) >= 0.0f) {
return false;
}
float3 P;
return ray_disk_intersect(ray->P, ray->D, ray->tmin, ray->tmax, lightP, lightN, radius, &P, t);
}
ccl_device_inline bool spot_light_sample_from_intersection(
const ccl_global KernelLight *klight,
ccl_private const Intersection *ccl_restrict isect,
const float3 ray_P,
const float3 ray_D,
ccl_private LightSample *ccl_restrict ls)
{
/* the normal of the oriented disk */
const float3 lightN = normalize(ray_P - klight->co);
/* We set the light normal to the outgoing direction to support texturing. */
ls->Ng = -ls->D;
float invarea = klight->spot.invarea;
ls->eval_fac = (0.25f * M_1_PI_F) * invarea;
ls->pdf = invarea;
/* spot light attenuation */
ls->eval_fac *= spot_light_attenuation(
klight->spot.dir, klight->spot.cos_half_spot_angle, klight->spot.spot_smooth, -ls->D);
if (ls->eval_fac == 0.0f) {
return false;
}
float2 uv = map_to_sphere(ls->Ng);
ls->u = uv.x;
ls->v = uv.y;
/* compute pdf */
if (ls->t != FLT_MAX) {
ls->pdf *= lamp_light_pdf(lightN, -ls->D, ls->t);
}
else {
ls->pdf = 0.f;
}
return true;
}
template<bool in_volume_segment>
ccl_device_forceinline bool spot_light_tree_parameters(const ccl_global KernelLight *klight,
const float3 centroid,
const float3 P,
ccl_private float &cos_theta_u,
ccl_private float2 &distance,
ccl_private float3 &point_to_centroid)
{
float min_distance;
const float3 point_to_centroid_ = safe_normalize_len(centroid - P, &min_distance);
const float radius = klight->spot.radius;
const float hypotenus = sqrtf(sqr(radius) + sqr(min_distance));
cos_theta_u = min_distance / hypotenus;
if (in_volume_segment) {
return true;
}
distance = make_float2(hypotenus, min_distance);
point_to_centroid = point_to_centroid_;
return true;
}
CCL_NAMESPACE_END
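
Review note: spot_light_attenuation() above can be exercised outside the kernel with a small standalone sketch. Vec3, dot3 and smoothstep01 are hypothetical stand-ins for float3, dot and smoothstepf; in particular, the cubic 3x^2 - 2x^3 used for the smoothstep is an assumption about smoothstepf, whose definition is not part of this diff.

```cpp
#include <cassert>

struct Vec3 {
  float x, y, z;
};

inline float dot3(const Vec3 a, const Vec3 b)
{
  return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Assumed cubic smoothstep on [0, 1]; stands in for smoothstepf(). */
inline float smoothstep01(const float f)
{
  return f * f * (3.0f - 2.0f * f);
}

/* Same control flow as spot_light_attenuation() in the hunk above:
 * zero outside the cone, cosine falloff inside, smoothed near the edge. */
inline float spot_attenuation(const Vec3 dir,
                              const float cos_half_spot_angle,
                              const float spot_smooth,
                              const Vec3 N)
{
  assert(spot_smooth >= 0.0f);
  float attenuation = dot3(dir, N);
  if (attenuation <= cos_half_spot_angle) {
    return 0.0f;
  }
  const float t = attenuation - cos_half_spot_angle;
  if (t < spot_smooth && spot_smooth != 0.0f) {
    attenuation *= smoothstep01(t / spot_smooth);
  }
  return attenuation;
}
```

With dir and N both pointing along -Z and a half angle of 60 degrees (cosine 0.5), the sketch returns 1 when spot_smooth is 0 and 0.5 when spot_smooth is 1, since t / spot_smooth is then 0.5.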


@@ -0,0 +1,691 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#pragma once
#include "kernel/light/area.h"
#include "kernel/light/common.h"
#include "kernel/light/light.h"
#include "kernel/light/spot.h"
#include "kernel/light/triangle.h"
CCL_NAMESPACE_BEGIN
/* TODO: this seems like a relatively expensive computation, and we can make it a lot cheaper
* by using a bounding sphere instead of a bounding box. This will be more inaccurate, but it
* might be fine when used along with the adaptive splitting. */
ccl_device float light_tree_cos_bounding_box_angle(const BoundingBox bbox,
const float3 P,
const float3 point_to_centroid)
{
if (P.x > bbox.min.x && P.y > bbox.min.y && P.z > bbox.min.z && P.x < bbox.max.x &&
P.y < bbox.max.y && P.z < bbox.max.z) {
/* If P is inside the bbox, `theta_u` covers the whole sphere */
return -1.0f;
}
float cos_theta_u = 1.0f;
/* Iterate through all 8 possible points of the bounding box. */
for (int i = 0; i < 8; ++i) {
const float3 corner = make_float3((i & 1) ? bbox.max.x : bbox.min.x,
(i & 2) ? bbox.max.y : bbox.min.y,
(i & 4) ? bbox.max.z : bbox.min.z);
/* Calculate the bounding box angle. */
float3 point_to_corner = normalize(corner - P);
cos_theta_u = fminf(cos_theta_u, dot(point_to_centroid, point_to_corner));
}
return cos_theta_u;
}
ccl_device_forceinline float sin_from_cos(const float c)
{
return safe_sqrtf(1.0f - sqr(c));
}
/* Compute vector v as in Fig. 8. P_v is the corresponding point along the ray. */
ccl_device float3 compute_v(
const float3 centroid, const float3 P, const float3 D, const float3 bcone_axis, const float t)
{
const float3 unnormalized_v0 = P - centroid;
float len_v0;
const float3 unnormalized_v1 = unnormalized_v0 + D * fminf(t, 1e12f);
const float3 v0 = normalize_len(unnormalized_v0, &len_v0);
const float3 v1 = normalize(unnormalized_v1);
const float3 o0 = v0;
float3 o1, o2;
make_orthonormals_tangent(o0, v1, &o1, &o2);
const float dot_o0_a = dot(o0, bcone_axis);
const float dot_o1_a = dot(o1, bcone_axis);
const float cos_phi0 = dot_o0_a / sqrtf(sqr(dot_o0_a) + sqr(dot_o1_a));
return (dot_o1_a < 0 || dot(v0, v1) > cos_phi0) ? (dot_o0_a > dot(v1, bcone_axis) ? v0 : v1) :
cos_phi0 * o0 + sin_from_cos(cos_phi0) * o1;
}
/* This is the general function for calculating the importance of either a cluster or an emitter.
* Both of the specialized functions obtain the necessary data before calling this function. */
template<bool in_volume_segment>
ccl_device void light_tree_importance(const float3 N_or_D,
const bool has_transmission,
const float3 point_to_centroid,
const float cos_theta_u,
const BoundingCone bcone,
const float max_distance,
const float min_distance,
const float t,
const float energy,
ccl_private float &max_importance,
ccl_private float &min_importance)
{
max_importance = 0.0f;
min_importance = 0.0f;
const float sin_theta_u = sin_from_cos(cos_theta_u);
/* cos(theta_i') in the paper, omitted for volume */
float cos_min_incidence_angle = 1.0f;
float cos_max_incidence_angle = 1.0f;
/* True when sampling the light tree for the second time in `shade_volume.h` and when querying
 * the pdf in `sample.h`. */
const bool in_volume = is_zero(N_or_D);
if (!in_volume_segment && !in_volume) {
const float3 N = N_or_D;
const float cos_theta_i = has_transmission ? fabsf(dot(point_to_centroid, N)) :
dot(point_to_centroid, N);
const float sin_theta_i = sin_from_cos(cos_theta_i);
/* cos_min_incidence_angle = cos(max{theta_i - theta_u, 0}) = cos(theta_i') in the paper */
cos_min_incidence_angle = cos_theta_i >= cos_theta_u ?
1.0f :
cos_theta_i * cos_theta_u + sin_theta_i * sin_theta_u;
/* If the node is guaranteed to be behind the surface we're sampling, and the surface is
* opaque, then we can give the node an importance of 0 as it contributes nothing to the
* surface. This is more accurate than the bbox test if we are calculating the importance of
* an emitter with radius */
if (!has_transmission && cos_min_incidence_angle < 0) {
return;
}
/* cos_max_incidence_angle = cos(min{theta_i + theta_u, pi}) */
cos_max_incidence_angle = fmaxf(cos_theta_i * cos_theta_u - sin_theta_i * sin_theta_u, 0.0f);
}
/* cos(theta - theta_u) */
const float cos_theta = dot(bcone.axis, -point_to_centroid);
const float sin_theta = sin_from_cos(cos_theta);
const float cos_theta_minus_theta_u = cos_theta * cos_theta_u + sin_theta * sin_theta_u;
float cos_theta_o, sin_theta_o;
fast_sincosf(bcone.theta_o, &sin_theta_o, &cos_theta_o);
/* minimum angle an emitter's axis would form with the direction to the shading point,
* cos(theta') in the paper */
float cos_min_outgoing_angle;
if ((cos_theta >= cos_theta_u) || (cos_theta_minus_theta_u >= cos_theta_o)) {
/* theta - theta_o - theta_u <= 0 */
kernel_assert((fast_acosf(cos_theta) - bcone.theta_o - fast_acosf(cos_theta_u)) < 5e-4f);
cos_min_outgoing_angle = 1.0f;
}
else if ((bcone.theta_o + bcone.theta_e > M_PI_F) ||
(cos_theta_minus_theta_u > cos(bcone.theta_o + bcone.theta_e))) {
/* theta' = theta - theta_o - theta_u < theta_e */
kernel_assert(
(fast_acosf(cos_theta) - bcone.theta_o - fast_acosf(cos_theta_u) - bcone.theta_e) < 5e-4f);
const float sin_theta_minus_theta_u = sin_from_cos(cos_theta_minus_theta_u);
cos_min_outgoing_angle = cos_theta_minus_theta_u * cos_theta_o +
sin_theta_minus_theta_u * sin_theta_o;
}
else {
/* cluster invisible */
return;
}
/* TODO: find a good approximation for f_a. */
const float f_a = 1.0f;
/* TODO: also consider t (or theta_a, theta_b) for volume */
max_importance = fabsf(f_a * cos_min_incidence_angle * energy * cos_min_outgoing_angle /
(in_volume_segment ? min_distance : sqr(min_distance)));
/* TODO: also min importance for volume? */
if (in_volume_segment) {
min_importance = max_importance;
return;
}
/* cos(theta + theta_o + theta_u) if theta + theta_o + theta_u < theta_e, 0 otherwise */
float cos_max_outgoing_angle;
const float cos_theta_plus_theta_u = cos_theta * cos_theta_u - sin_theta * sin_theta_u;
if (bcone.theta_e - bcone.theta_o < 0 || cos_theta < 0 || cos_theta_u < 0 ||
cos_theta_plus_theta_u < cos(bcone.theta_e - bcone.theta_o)) {
min_importance = 0.0f;
}
else {
const float sin_theta_plus_theta_u = sin_from_cos(cos_theta_plus_theta_u);
cos_max_outgoing_angle = cos_theta_plus_theta_u * cos_theta_o -
sin_theta_plus_theta_u * sin_theta_o;
min_importance = fabsf(f_a * cos_max_incidence_angle * energy * cos_max_outgoing_angle /
sqr(max_distance));
}
}
template<bool in_volume_segment>
ccl_device bool compute_emitter_centroid_and_dir(KernelGlobals kg,
ccl_global const KernelLightTreeEmitter *kemitter,
const float3 P,
ccl_private float3 &centroid,
ccl_private packed_float3 &dir)
{
const int prim_id = kemitter->prim_id;
if (prim_id < 0) {
const ccl_global KernelLight *klight = &kernel_data_fetch(lights, ~prim_id);
centroid = klight->co;
switch (klight->type) {
case LIGHT_SPOT:
dir = klight->spot.dir;
break;
case LIGHT_POINT:
/* Disk-oriented normal */
dir = safe_normalize(P - centroid);
break;
case LIGHT_AREA:
dir = klight->area.dir;
break;
case LIGHT_BACKGROUND:
/* Arbitrary centroid and direction */
centroid = make_float3(0.0f, 0.0f, 1.0f);
dir = make_float3(0.0f, 0.0f, -1.0f);
return !in_volume_segment;
case LIGHT_DISTANT:
dir = centroid;
return !in_volume_segment;
default:
return false;
}
}
else {
const int object = kemitter->mesh_light.object_id;
float3 vertices[3];
triangle_world_space_vertices(kg, object, prim_id, -1.0f, vertices);
centroid = (vertices[0] + vertices[1] + vertices[2]) / 3.0f;
if (kemitter->mesh_light.emission_sampling == EMISSION_SAMPLING_FRONT) {
dir = safe_normalize(cross(vertices[1] - vertices[0], vertices[2] - vertices[0]));
}
else if (kemitter->mesh_light.emission_sampling == EMISSION_SAMPLING_BACK) {
dir = -safe_normalize(cross(vertices[1] - vertices[0], vertices[2] - vertices[0]));
}
else {
/* Double sided: any vector in the plane. */
dir = safe_normalize(vertices[0] - vertices[1]);
}
}
return true;
}
template<bool in_volume_segment>
ccl_device void light_tree_emitter_importance(KernelGlobals kg,
const float3 P,
const float3 N_or_D,
const float t,
const bool has_transmission,
int emitter_index,
ccl_private float &max_importance,
ccl_private float &min_importance)
{
const ccl_global KernelLightTreeEmitter *kemitter = &kernel_data_fetch(light_tree_emitters,
emitter_index);
max_importance = 0.0f;
min_importance = 0.0f;
BoundingCone bcone;
bcone.theta_o = kemitter->theta_o;
bcone.theta_e = kemitter->theta_e;
float cos_theta_u;
float2 distance; /* distance.x = max_distance, distance.y = min_distance */
float3 centroid, point_to_centroid, P_c;
if (!compute_emitter_centroid_and_dir<in_volume_segment>(
kg, kemitter, P, centroid, bcone.axis)) {
return;
}
const int prim_id = kemitter->prim_id;
if (in_volume_segment) {
const float3 D = N_or_D;
/* Closest point */
P_c = P + dot(centroid - P, D) * D;
/* minimal distance of the ray to the cluster */
distance.x = len(centroid - P_c);
distance.y = distance.x;
point_to_centroid = -compute_v(centroid, P, D, bcone.axis, t);
}
else {
P_c = P;
}
bool is_visible;
if (prim_id < 0) {
const ccl_global KernelLight *klight = &kernel_data_fetch(lights, ~prim_id);
switch (klight->type) {
/* The function templates only modify cos_theta_u when in_volume_segment = true */
case LIGHT_SPOT:
is_visible = spot_light_tree_parameters<in_volume_segment>(
klight, centroid, P_c, cos_theta_u, distance, point_to_centroid);
break;
case LIGHT_POINT:
is_visible = point_light_tree_parameters<in_volume_segment>(
klight, centroid, P_c, cos_theta_u, distance, point_to_centroid);
bcone.theta_o = 0.0f;
break;
case LIGHT_AREA:
is_visible = area_light_tree_parameters<in_volume_segment>(
klight, centroid, P_c, N_or_D, bcone.axis, cos_theta_u, distance, point_to_centroid);
break;
case LIGHT_BACKGROUND:
is_visible = background_light_tree_parameters(
centroid, cos_theta_u, distance, point_to_centroid);
break;
case LIGHT_DISTANT:
is_visible = distant_light_tree_parameters(
centroid, bcone.theta_e, cos_theta_u, distance, point_to_centroid);
break;
default:
return;
}
}
else { /* mesh light */
is_visible = triangle_light_tree_parameters<in_volume_segment>(
kg, kemitter, centroid, P_c, N_or_D, bcone, cos_theta_u, distance, point_to_centroid);
}
is_visible |= has_transmission;
if (!is_visible) {
return;
}
light_tree_importance<in_volume_segment>(N_or_D,
has_transmission,
point_to_centroid,
cos_theta_u,
bcone,
distance.x,
distance.y,
t,
kemitter->energy,
max_importance,
min_importance);
}
template<bool in_volume_segment>
ccl_device void light_tree_node_importance(KernelGlobals kg,
const float3 P,
const float3 N_or_D,
const float t,
const bool has_transmission,
const ccl_global KernelLightTreeNode *knode,
ccl_private float &max_importance,
ccl_private float &min_importance)
{
max_importance = 0.0f;
min_importance = 0.0f;
if (knode->num_prims == 1) {
/* At a leaf node with only one emitter */
light_tree_emitter_importance<in_volume_segment>(
kg, P, N_or_D, t, has_transmission, -knode->child_index, max_importance, min_importance);
}
else if (knode->num_prims != 0) {
const BoundingCone bcone = knode->bcone;
const BoundingBox bbox = knode->bbox;
float3 point_to_centroid;
float cos_theta_u;
float distance;
if (knode->bit_trail == 1) {
/* distant light node */
if (in_volume_segment) {
return;
}
point_to_centroid = -bcone.axis;
cos_theta_u = fast_cosf(bcone.theta_o);
distance = 1.0f;
}
else {
const float3 centroid = 0.5f * (bbox.min + bbox.max);
if (in_volume_segment) {
const float3 D = N_or_D;
const float3 closest_point = P + dot(centroid - P, D) * D;
/* minimal distance of the ray to the cluster */
distance = len(centroid - closest_point);
point_to_centroid = -compute_v(centroid, P, D, bcone.axis, t);
cos_theta_u = light_tree_cos_bounding_box_angle(bbox, closest_point, point_to_centroid);
}
else {
const float3 N = N_or_D;
const float3 bbox_extent = bbox.max - centroid;
const bool bbox_is_visible = has_transmission |
(dot(N, centroid - P) + dot(fabs(N), fabs(bbox_extent)) > 0);
/* If the node is guaranteed to be behind the surface we're sampling, and the surface is
* opaque, then we can give the node an importance of 0 as it contributes nothing to the
* surface. */
if (!bbox_is_visible) {
return;
}
point_to_centroid = normalize_len(centroid - P, &distance);
cos_theta_u = light_tree_cos_bounding_box_angle(bbox, P, point_to_centroid);
}
/* clamp distance to half the radius of the cluster when splitting is disabled */
distance = fmaxf(0.5f * len(centroid - bbox.max), distance);
}
/* TODO: currently max_distance = min_distance, max_importance = min_importance for the
* nodes. Do we need better weights for complex scenes? */
light_tree_importance<in_volume_segment>(N_or_D,
has_transmission,
point_to_centroid,
cos_theta_u,
bcone,
distance,
distance,
t,
knode->energy,
max_importance,
min_importance);
}
}
ccl_device void sample_resevoir(const int current_index,
const float current_weight,
ccl_private int &selected_index,
ccl_private float &selected_weight,
ccl_private float &total_weight,
ccl_private float &rand)
{
if (current_weight == 0.0f) {
return;
}
total_weight += current_weight;
float thresh = current_weight / total_weight;
if (rand <= thresh) {
selected_index = current_index;
selected_weight = current_weight;
rand = rand / thresh;
}
else {
rand = (rand - thresh) / (1.0f - thresh);
}
kernel_assert(rand >= 0.0f && rand <= 1.0f);
return;
}
/* Pick an emitter from a leaf node using reservoir sampling, keeping two reservoirs for the
 * upper and lower bounds. */
template<bool in_volume_segment>
ccl_device int light_tree_cluster_select_emitter(KernelGlobals kg,
ccl_private float &rand,
const float3 P,
const float3 N_or_D,
const float t,
const bool has_transmission,
const ccl_global KernelLightTreeNode *knode,
ccl_private float *pdf_factor)
{
float selected_importance[2] = {0.0f, 0.0f};
float total_importance[2] = {0.0f, 0.0f};
int selected_index = -1;
/* Mark emitters with zero importance. Used for reservoir when total minimum importance = 0 */
kernel_assert(knode->num_prims <= sizeof(uint) * 8);
uint has_importance = 0;
const bool sample_max = (rand > 0.5f); /* sampling using the maximum importance */
rand = rand * 2.0f - float(sample_max);
for (int i = 0; i < knode->num_prims; i++) {
int current_index = -knode->child_index + i;
/* maximum importance = importance[0], minimum importance = importance[1] */
float importance[2];
light_tree_emitter_importance<in_volume_segment>(
kg, P, N_or_D, t, has_transmission, current_index, importance[0], importance[1]);
sample_resevoir(current_index,
importance[!sample_max],
selected_index,
selected_importance[!sample_max],
total_importance[!sample_max],
rand);
if (selected_index == current_index) {
selected_importance[sample_max] = importance[sample_max];
}
total_importance[sample_max] += importance[sample_max];
has_importance |= ((importance[0] > 0) << i);
}
if (total_importance[0] == 0.0f) {
return -1;
}
if (total_importance[1] == 0.0f) {
/* uniformly sample emitters with positive maximum importance */
if (sample_max) {
selected_importance[1] = 1.0f;
total_importance[1] = float(popcount(has_importance));
}
else {
selected_index = -1;
for (int i = 0; i < knode->num_prims; i++) {
int current_index = -knode->child_index + i;
sample_resevoir(current_index,
float(has_importance & 1),
selected_index,
selected_importance[1],
total_importance[1],
rand);
has_importance >>= 1;
}
float discard;
light_tree_emitter_importance<in_volume_segment>(
kg, P, N_or_D, t, has_transmission, selected_index, selected_importance[0], discard);
}
}
*pdf_factor = 0.5f * (selected_importance[0] / total_importance[0] +
selected_importance[1] / total_importance[1]);
return selected_index;
}
template<bool in_volume_segment>
ccl_device bool get_left_probability(KernelGlobals kg,
const float3 P,
const float3 N_or_D,
const float t,
const bool has_transmission,
const int left_index,
const int right_index,
ccl_private float &left_probability)
{
const ccl_global KernelLightTreeNode *left = &kernel_data_fetch(light_tree_nodes, left_index);
const ccl_global KernelLightTreeNode *right = &kernel_data_fetch(light_tree_nodes, right_index);
float min_left_importance, max_left_importance, min_right_importance, max_right_importance;
light_tree_node_importance<in_volume_segment>(
kg, P, N_or_D, t, has_transmission, left, max_left_importance, min_left_importance);
light_tree_node_importance<in_volume_segment>(
kg, P, N_or_D, t, has_transmission, right, max_right_importance, min_right_importance);
const float total_max_importance = max_left_importance + max_right_importance;
if (total_max_importance == 0.0f) {
return false;
}
const float total_min_importance = min_left_importance + min_right_importance;
/* average two probabilities of picking the left child node using lower and upper bounds */
const float probability_max = max_left_importance / total_max_importance;
const float probability_min = total_min_importance > 0 ?
min_left_importance / total_min_importance :
0.5f * (float(max_left_importance > 0) +
float(max_right_importance == 0.0f));
left_probability = 0.5f * (probability_max + probability_min);
return true;
}
template<bool in_volume_segment>
ccl_device_noinline bool light_tree_sample(KernelGlobals kg,
ccl_private float &randu,
ccl_private float &randv,
const float time,
const float3 P,
const float3 N_or_D,
const float t,
const int shader_flags,
const int bounce,
const uint32_t path_flag,
ccl_private int &emitter_object,
ccl_private int &emitter_prim,
ccl_private int &emitter_shader_flag,
ccl_private float &emitter_pdf_selection)
{
if (!kernel_data.integrator.use_direct_light) {
return false;
}
const bool has_transmission = (shader_flags & SD_BSDF_HAS_TRANSMISSION);
float pdf_leaf = 1.0f;
float pdf_emitter_from_leaf = 1.0f;
int selected_light = -1;
int node_index = 0; /* root node */
/* Traverse the light tree until a leaf node is reached. */
while (true) {
const ccl_global KernelLightTreeNode *knode = &kernel_data_fetch(light_tree_nodes, node_index);
if (knode->child_index <= 0) {
/* At a leaf node, we pick an emitter */
selected_light = light_tree_cluster_select_emitter<in_volume_segment>(
kg, randv, P, N_or_D, t, has_transmission, knode, &pdf_emitter_from_leaf);
break;
}
/* At an interior node, the left child is directly after the parent,
* while the right child is stored as the child index. */
const int left_index = node_index + 1;
const int right_index = knode->child_index;
float left_prob;
if (!get_left_probability<in_volume_segment>(
kg, P, N_or_D, t, has_transmission, left_index, right_index, left_prob)) {
return false; /* both child nodes have zero importance */
}
float discard;
float total_prob = left_prob;
node_index = left_index;
sample_resevoir(right_index, 1.0f - left_prob, node_index, discard, total_prob, randu);
pdf_leaf *= (node_index == left_index) ? left_prob : (1.0f - left_prob);
}
if (selected_light < 0) {
return false;
}
/* Return info about chosen emitter. */
ccl_global const KernelLightTreeEmitter *kemitter = &kernel_data_fetch(light_tree_emitters,
selected_light);
emitter_object = kemitter->mesh_light.object_id;
emitter_prim = kemitter->prim_id;
emitter_shader_flag = kemitter->mesh_light.shader_flag;
emitter_pdf_selection = pdf_leaf * pdf_emitter_from_leaf;
return true;
}
/* We need to be able to find the probability of selecting a given light for MIS. */
ccl_device float light_tree_pdf(
KernelGlobals kg, const float3 P, const float3 N, const int path_flag, const int prim)
{
const bool has_transmission = (path_flag & PATH_RAY_MIS_HAD_TRANSMISSION);
/* Target emitter info */
const int target_emitter = (prim >= 0) ? kernel_data_fetch(triangle_to_tree, prim) :
kernel_data_fetch(light_to_tree, ~prim);
ccl_global const KernelLightTreeEmitter *kemitter = &kernel_data_fetch(light_tree_emitters,
target_emitter);
const int target_leaf = kemitter->parent_index;
ccl_global const KernelLightTreeNode *kleaf = &kernel_data_fetch(light_tree_nodes, target_leaf);
uint bit_trail = kleaf->bit_trail;
int node_index = 0; /* root node */
float pdf = 1.0f;
/* Traverse the light tree until we reach the target leaf node */
while (true) {
const ccl_global KernelLightTreeNode *knode = &kernel_data_fetch(light_tree_nodes, node_index);
if (knode->child_index <= 0) {
break;
}
/* Interior node */
const int left_index = node_index + 1;
const int right_index = knode->child_index;
float left_prob;
if (!get_left_probability<false>(
kg, P, N, 0, has_transmission, left_index, right_index, left_prob)) {
return 0.0f;
}
const bool go_left = (bit_trail & 1) == 0;
bit_trail >>= 1;
pdf *= go_left ? left_prob : (1.0f - left_prob);
node_index = go_left ? left_index : right_index;
if (pdf == 0) {
return 0.0f;
}
}
kernel_assert(node_index == target_leaf);
/* Iterate through leaf node to find the probability of sampling the target emitter. */
float target_max_importance = 0.0f;
float target_min_importance = 0.0f;
float total_max_importance = 0.0f;
float total_min_importance = 0.0f;
int num_has_importance = 0;
for (int i = 0; i < kleaf->num_prims; i++) {
const int emitter = -kleaf->child_index + i;
float max_importance, min_importance;
light_tree_emitter_importance<false>(
kg, P, N, 0, has_transmission, emitter, max_importance, min_importance);
num_has_importance += (max_importance > 0);
if (emitter == target_emitter) {
target_max_importance = max_importance;
target_min_importance = min_importance;
}
total_max_importance += max_importance;
total_min_importance += min_importance;
}
if (target_max_importance > 0.0f) {
return pdf * 0.5f *
(target_max_importance / total_max_importance +
(total_min_importance > 0 ? target_min_importance / total_min_importance :
1.0f / num_has_importance));
}
return 0.0f;
}
CCL_NAMESPACE_END
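
Review note: sample_resevoir() above is a single-sample weighted reservoir step that reuses one random number by rescaling it after every decision. The standalone sketch below mirrors that step and wraps it in a small driver; reservoir_step and pick_weighted are hypothetical names for illustration and are not part of the patch.

```cpp
#include <cassert>

/* One reservoir step, following the logic of sample_resevoir() in the hunk
 * above: accept the candidate with probability weight / total, then remap
 * rand back to [0, 1] so it can be reused for the next candidate. */
inline void reservoir_step(const int current_index,
                           const float current_weight,
                           int &selected_index,
                           float &selected_weight,
                           float &total_weight,
                           float &rand)
{
  if (current_weight == 0.0f) {
    return;
  }
  total_weight += current_weight;
  const float thresh = current_weight / total_weight;
  if (rand <= thresh) {
    selected_index = current_index;
    selected_weight = current_weight;
    rand = rand / thresh;
  }
  else {
    rand = (rand - thresh) / (1.0f - thresh);
  }
  assert(rand >= 0.0f && rand <= 1.0f);
}

/* Hypothetical driver: stream all weights through the reservoir and return
 * the chosen index, or -1 when every weight is zero. */
inline int pick_weighted(const float *weights, const int count, float rand)
{
  int selected = -1;
  float selected_weight = 0.0f;
  float total = 0.0f;
  for (int i = 0; i < count; i++) {
    reservoir_step(i, weights[i], selected, selected_weight, total, rand);
  }
  return selected;
}
```

The first candidate with non-zero weight is always accepted (thresh is 1), and each later candidate replaces it with probability proportional to its share of the running total. This matches how light_tree_sample() seeds node_index with the left child and total_prob with left_prob before offering the right child.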


@@ -0,0 +1,329 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#pragma once
#include "kernel/geom/geom.h"
CCL_NAMESPACE_BEGIN
/* Returns true if the triangle has motion blur or an instancing transform applied. */
ccl_device_inline bool triangle_world_space_vertices(
KernelGlobals kg, int object, int prim, float time, float3 V[3])
{
bool has_motion = false;
const int object_flag = kernel_data_fetch(object_flag, object);
if (object_flag & SD_OBJECT_HAS_VERTEX_MOTION && time >= 0.0f) {
motion_triangle_vertices(kg, object, prim, time, V);
has_motion = true;
}
else {
triangle_vertices(kg, prim, V);
}
if (!(object_flag & SD_OBJECT_TRANSFORM_APPLIED)) {
#ifdef __OBJECT_MOTION__
float object_time = (time >= 0.0f) ? time : 0.5f;
Transform tfm = object_fetch_transform_motion_test(kg, object, object_time, NULL);
#else
Transform tfm = object_fetch_transform(kg, object, OBJECT_TRANSFORM);
#endif
V[0] = transform_point(&tfm, V[0]);
V[1] = transform_point(&tfm, V[1]);
V[2] = transform_point(&tfm, V[2]);
has_motion = true;
}
return has_motion;
}
ccl_device_inline float triangle_light_pdf_area_sampling(const float3 Ng, const float3 I, float t)
{
float cos_pi = fabsf(dot(Ng, I));
if (cos_pi == 0.0f)
return 0.0f;
return t * t / cos_pi;
}
ccl_device_forceinline float triangle_light_pdf(KernelGlobals kg,
ccl_private const ShaderData *sd,
float t)
{
/* A naive heuristic to decide between costly solid angle sampling
* and simple area sampling, comparing the distance to the triangle plane
* to the length of the edges of the triangle. */
float3 V[3];
bool has_motion = triangle_world_space_vertices(kg, sd->object, sd->prim, sd->time, V);
const float3 e0 = V[1] - V[0];
const float3 e1 = V[2] - V[0];
const float3 e2 = V[2] - V[1];
const float longest_edge_squared = max(len_squared(e0), max(len_squared(e1), len_squared(e2)));
const float3 N = cross(e0, e1);
const float distance_to_plane = fabsf(dot(N, sd->I * t)) / dot(N, N);
const float area = 0.5f * len(N);
float pdf;
if (longest_edge_squared > distance_to_plane * distance_to_plane) {
/* sd contains the point on the light source;
* calculate Px, the point that we're shading. */
const float3 Px = sd->P + sd->I * t;
const float3 v0_p = V[0] - Px;
const float3 v1_p = V[1] - Px;
const float3 v2_p = V[2] - Px;
const float3 u01 = safe_normalize(cross(v0_p, v1_p));
const float3 u02 = safe_normalize(cross(v0_p, v2_p));
const float3 u12 = safe_normalize(cross(v1_p, v2_p));
const float alpha = fast_acosf(dot(u02, u01));
const float beta = fast_acosf(-dot(u01, u12));
const float gamma = fast_acosf(dot(u02, u12));
const float solid_angle = alpha + beta + gamma - M_PI_F;
/* distribution_pdf_triangles is calculated over triangle area, but we're not sampling over
* its area */
if (UNLIKELY(solid_angle == 0.0f)) {
return 0.0f;
}
else {
pdf = 1.0f / solid_angle;
}
}
else {
if (UNLIKELY(area == 0.0f)) {
return 0.0f;
}
pdf = triangle_light_pdf_area_sampling(sd->Ng, sd->I, t) / area;
}
/* Belongs in distribution.h but can reuse computations here. */
if (!kernel_data.integrator.use_light_tree) {
float distribution_area = area;
if (has_motion && area != 0.0f) {
/* For motion blur need area of triangle at fixed time as used in the CDF. */
triangle_world_space_vertices(kg, sd->object, sd->prim, -1.0f, V);
distribution_area = triangle_area(V[0], V[1], V[2]);
}
pdf *= distribution_area * kernel_data.integrator.distribution_pdf_triangles;
}
return pdf;
}
template<bool in_volume_segment>
ccl_device_forceinline bool triangle_light_sample(KernelGlobals kg,
int prim,
int object,
float randu,
float randv,
float time,
ccl_private LightSample *ls,
const float3 P)
{
/* A naive heuristic to decide between costly solid angle sampling
* and simple area sampling, comparing the distance to the triangle plane
* to the length of the edges of the triangle. */
float3 V[3];
bool has_motion = triangle_world_space_vertices(kg, object, prim, time, V);
const float3 e0 = V[1] - V[0];
const float3 e1 = V[2] - V[0];
const float3 e2 = V[2] - V[1];
const float longest_edge_squared = max(len_squared(e0), max(len_squared(e1), len_squared(e2)));
const float3 N0 = cross(e0, e1);
float Nl = 0.0f;
ls->Ng = safe_normalize_len(N0, &Nl);
const float area = 0.5f * Nl;
/* flip normal if necessary */
const int object_flag = kernel_data_fetch(object_flag, object);
if (object_flag & SD_OBJECT_NEGATIVE_SCALE_APPLIED) {
ls->Ng = -ls->Ng;
}
ls->eval_fac = 1.0f;
ls->shader = kernel_data_fetch(tri_shader, prim);
ls->object = object;
ls->prim = prim;
ls->lamp = LAMP_NONE;
ls->shader |= SHADER_USE_MIS;
ls->type = LIGHT_TRIANGLE;
ls->group = object_lightgroup(kg, object);
float distance_to_plane = fabsf(dot(N0, V[0] - P) / dot(N0, N0));
if (!in_volume_segment && (longest_edge_squared > distance_to_plane * distance_to_plane)) {
/* see James Arvo, "Stratified Sampling of Spherical Triangles"
* http://www.graphics.cornell.edu/pubs/1995/Arv95c.pdf */
/* project the triangle to the unit sphere
* and calculate its edges and angles */
const float3 v0_p = V[0] - P;
const float3 v1_p = V[1] - P;
const float3 v2_p = V[2] - P;
const float3 u01 = safe_normalize(cross(v0_p, v1_p));
const float3 u02 = safe_normalize(cross(v0_p, v2_p));
const float3 u12 = safe_normalize(cross(v1_p, v2_p));
const float3 A = safe_normalize(v0_p);
const float3 B = safe_normalize(v1_p);
const float3 C = safe_normalize(v2_p);
const float cos_alpha = dot(u02, u01);
const float cos_beta = -dot(u01, u12);
const float cos_gamma = dot(u02, u12);
/* calculate dihedral angles */
const float alpha = fast_acosf(cos_alpha);
const float beta = fast_acosf(cos_beta);
const float gamma = fast_acosf(cos_gamma);
/* the area of the unit spherical triangle = solid angle */
const float solid_angle = alpha + beta + gamma - M_PI_F;
/* precompute a few things
* these could be re-used to take several samples
* as they are independent of randu/randv */
const float cos_c = dot(A, B);
const float sin_alpha = fast_sinf(alpha);
const float product = sin_alpha * cos_c;
/* Select a random sub-area of the spherical triangle
* and calculate the third vertex C_ of that new triangle */
const float phi = randu * solid_angle - alpha;
float s, t;
fast_sincosf(phi, &s, &t);
const float u = t - cos_alpha;
const float v = s + product;
const float3 U = safe_normalize(C - dot(C, A) * A);
float q = 1.0f;
const float det = ((v * s + u * t) * sin_alpha);
if (det != 0.0f) {
q = ((v * t - u * s) * cos_alpha - v) / det;
}
const float temp = max(1.0f - q * q, 0.0f);
const float3 C_ = safe_normalize(q * A + sqrtf(temp) * U);
/* Finally, select a random point along the edge of the new triangle.
* That point on the spherical triangle is the sampled ray direction. */
const float z = 1.0f - randv * (1.0f - dot(C_, B));
ls->D = z * B + safe_sqrtf(1.0f - z * z) * safe_normalize(C_ - dot(C_, B) * B);
/* calculate intersection with the planar triangle */
if (!ray_triangle_intersect(
P, ls->D, 0.0f, FLT_MAX, V[0], V[1], V[2], &ls->u, &ls->v, &ls->t)) {
ls->pdf = 0.0f;
return false;
}
ls->P = P + ls->D * ls->t;
/* distribution_pdf_triangles is calculated over triangle area, but we're sampling over solid
* angle */
if (UNLIKELY(solid_angle == 0.0f)) {
ls->pdf = 0.0f;
return false;
}
else {
ls->pdf = 1.0f / solid_angle;
}
}
else {
if (UNLIKELY(area == 0.0f)) {
return false;
}
/* compute random point in triangle. From Eric Heitz's "A Low-Distortion Map Between Triangle
* and Square" */
float u = randu;
float v = randv;
if (v > u) {
u *= 0.5f;
v -= u;
}
else {
v *= 0.5f;
u -= v;
}
const float t = 1.0f - u - v;
ls->P = u * V[0] + v * V[1] + t * V[2];
/* compute incoming direction, distance and pdf */
ls->D = normalize_len(ls->P - P, &ls->t);
ls->pdf = triangle_light_pdf_area_sampling(ls->Ng, -ls->D, ls->t) / area;
ls->u = u;
ls->v = v;
}
/* Belongs in distribution.h but can reuse computations here. */
if (!kernel_data.integrator.use_light_tree) {
float distribution_area = area;
if (has_motion && area != 0.0f) {
/* For motion blur need area of triangle at fixed time as used in the CDF. */
triangle_world_space_vertices(kg, object, prim, -1.0f, V);
distribution_area = triangle_area(V[0], V[1], V[2]);
}
ls->pdf_selection = distribution_area * kernel_data.integrator.distribution_pdf_triangles;
}
return (ls->pdf > 0.0f);
}
template<bool in_volume_segment>
ccl_device_forceinline bool triangle_light_tree_parameters(
KernelGlobals kg,
const ccl_global KernelLightTreeEmitter *kemitter,
const float3 centroid,
const float3 P,
const float3 N,
const BoundingCone bcone,
ccl_private float &cos_theta_u,
ccl_private float2 &distance,
ccl_private float3 &point_to_centroid)
{
if (!in_volume_segment) {
/* TODO: a cheap substitute for the minimal distance between point and primitive. Is it
* worth the overhead to compute the accurate minimal distance? */
float min_distance;
point_to_centroid = safe_normalize_len(centroid - P, &min_distance);
distance = make_float2(min_distance, min_distance);
}
cos_theta_u = FLT_MAX;
const int object = kemitter->mesh_light.object_id;
float3 vertices[3];
triangle_world_space_vertices(kg, object, kemitter->prim_id, -1.0f, vertices);
bool shape_above_surface = false;
for (int i = 0; i < 3; i++) {
const float3 corner = vertices[i];
float distance_point_to_corner;
const float3 point_to_corner = safe_normalize_len(corner - P, &distance_point_to_corner);
cos_theta_u = fminf(cos_theta_u, dot(point_to_centroid, point_to_corner));
shape_above_surface |= dot(point_to_corner, N) > 0;
if (!in_volume_segment) {
distance.x = fmaxf(distance.x, distance_point_to_corner);
}
}
const bool front_facing = bcone.theta_o != 0.0f || dot(bcone.axis, point_to_centroid) < 0;
const bool in_volume = is_zero(N);
return (front_facing && shape_above_surface) || in_volume;
}
CCL_NAMESPACE_END
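
Review note: both `triangle_light_pdf` and `triangle_light_sample` compute the solid angle subtended by the triangle as the sum of the three dihedral angles of the projected spherical triangle minus pi. The sketch below reproduces that computation in isolation so it can be checked by hand. It is only an illustration under stated assumptions: `Vec3`, `safe_normalize_sketch` and `triangle_solid_angle` are hypothetical names, and `std::acos` with an explicit clamp stands in for the kernel's `fast_acosf`.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type; the kernel uses float3.
struct Vec3 {
  float x, y, z;
};

inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b)
{
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Normalizes v, returning the zero vector for zero-length input.
inline Vec3 safe_normalize_sketch(Vec3 v)
{
  const float len = std::sqrt(dot(v, v));
  return (len > 0.0f) ? Vec3{v.x / len, v.y / len, v.z / len} : Vec3{0.0f, 0.0f, 0.0f};
}

// acos with the argument clamped to [-1, 1] to absorb rounding error.
inline float clamped_acos(float c) { return std::acos(std::fmin(1.0f, std::fmax(-1.0f, c))); }

// Solid angle of triangle (v0, v1, v2) as seen from P:
// alpha + beta + gamma - pi, with the angles taken between the planes
// through P and each pair of vertices.
inline float triangle_solid_angle(Vec3 v0, Vec3 v1, Vec3 v2, Vec3 P)
{
  const float pi = 3.14159265358979f;
  const Vec3 v0_p = v0 - P;
  const Vec3 v1_p = v1 - P;
  const Vec3 v2_p = v2 - P;

  /* Normals of the three planes spanned by P and a triangle edge. */
  const Vec3 u01 = safe_normalize_sketch(cross(v0_p, v1_p));
  const Vec3 u02 = safe_normalize_sketch(cross(v0_p, v2_p));
  const Vec3 u12 = safe_normalize_sketch(cross(v1_p, v2_p));

  const float alpha = clamped_acos(dot(u02, u01));
  const float beta = clamped_acos(-dot(u01, u12));
  const float gamma = clamped_acos(dot(u02, u12));

  const float solid_angle = alpha + beta + gamma - pi;
  assert(std::isfinite(solid_angle));
  return solid_angle;
}
```

For a triangle spanning the three unit axes seen from the origin, all three angles are pi/2, so the result is pi/2, one eighth of the full sphere. The sampling PDF in the solid-angle branch is then the reciprocal of this value.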

Some files were not shown because too many files have changed in this diff