Compare commits

...

303 Commits

Author SHA1 Message Date
3a97c4056f Proof of Concept: File Browser thumbnail mode using grid view
This was meant as an experiment to see how tangible it is to rewrite the
File Browser UI code to be based on views, starting with the grid view
for thumbnail mode. See T99890.
My initial conclusion is that porting to views is quite doable, but
we'll need some further UI code features to make certain things
possible. Like big "composed" icons, where a file type icon is displayed
on top of a big, generic file icon.

There is a fair bit of stuff here that I'm not happy with. Plus things
like selection, double clicking to open and renaming don't work yet.
It's a start, a proof of concept even :)
2022-07-21 17:16:10 +02:00
9dbcefb10e Merge branch 'asset-browser-grid-view' into file-browser-grid-view 2022-07-20 17:25:31 +02:00
85f0b2ef5d Merge branch 'master' into asset-browser-grid-view 2022-07-20 17:17:12 +02:00
c4d8e28aa7 UI: Remove redundant view reference in view items
The new view item base class already holds a reference to the view, no
need to have one in the derived class as well.
2022-07-20 17:16:15 +02:00
0c6ae51d9f Fix missing registration of grid view items in the view 2022-07-20 17:16:15 +02:00
712960cefd Cleanup: use BLI_strncpy instead of strcpy
Using `strcpy` resulted in `stringop-truncation` warnings for me.
2022-07-20 16:08:02 +02:00
bffc1fbf31 Apply changes to catalog tree view from master 2022-07-20 16:01:05 +02:00
3ea91cecc9 Cleanup: remove unused get_cage_mesh parameter
All callers passed `false` for this parameter, making it more confusing
than useful. If this functionality is needed again in the future, a separate
function should be added.

Differential Revision: https://developer.blender.org/D15401
2022-07-20 15:57:16 +02:00
Wannes Malfait
7561183830 Fix T99667: regression in Delete Geometry node
Differential Revision: https://developer.blender.org/D15445
2022-07-20 15:49:50 +02:00
6bae10ef45 Merge branch 'master' into asset-browser-grid-view 2022-07-20 15:42:58 +02:00
0f8e58e7b8 Merge commit 'c355be6faeac~1' into asset-browser-grid-view 2022-07-20 15:42:48 +02:00
29c68e2523 Fix T99869: Edge crease no longer working
Missed in d14c2d549b
2022-07-20 10:28:23 -03:00
a814c7091b Fix T99847: Dragging image from Image Editor to Node Editor broken
Oversight in b0da080c2c. The `session_uuid` operator property wouldn't
be checked by the invoke callback, and if neither the `filepath` nor the
`name` property were set, the File Browser would open.
2022-07-20 14:54:32 +02:00
7f8d05131a IDManagement: Speedup ID unique name assignment by tracking used names/basenames/suffixes
An implementation of T73412, roughly as outlined there:

Track the names that are in use, as well as base names (before
numeric suffix) plus a bit map for each base name, indicating which
numeric suffixes are already used. This is done per-Main/Library,
per-object-type.

Timings (Windows, VS2022 Release build, AMD Ryzen 5950X):

- Scene with 10k cubes, Shift+D to duplicate them all: 8.7s -> 1.9s.
  Name map memory usage for resulting 20k objects: 4.3MB.
- Importing a 2.5GB .obj file of exported Blender 3.0 splash scene
  (24k objects), using the new C++ importer: 34.2s-> 22.0s. Name map
  memory usage for resulting scene: 8.6MB.
- Importing Disney Moana USD scene (almost half a million objects):
  56min -> 10min. Name map usage: ~100MB. Blender crashes later on
  when trying to render it, in the same place in both cases, but
  that's for another day.

Reviewed By: Bastien Montagne
Differential Revision: https://developer.blender.org/D14162
2022-07-20 14:27:14 +03:00
8d69c6c4e7 Constraints: add checks to specially handle the custom space target.
- The custom space target never needs B-Bone data (used by depsgraph).
- When drawing the relationship lines use the space matrix directly.
- Don't use the custom target to control the target space type dropdown.

Differential Revision: https://developer.blender.org/D9732
2022-07-20 14:18:17 +03:00
e6855507a5 Constraints: add missing calls to initialize custom space.
Add calls to a few locations that look like they may need to
initialize the Custom Space matrix, i.e. generally any place
that computes target matrices.

Differential Revision: https://developer.blender.org/D9732
2022-07-20 14:18:17 +03:00
054a169be0 Use appropriate context for the DopeSheet Action Custom Properties panel.
Refactor D14646 to use context.active_action for the Action
Custom Properties panel, matching the already existing Action panel.

This has the advantage that it allows access to the properties of
any actions with channels visible in the Dope Sheet, e.g. Shape Keys,
Materials etc; while using just the active object is limited to just
the object animation.

Also move both panels from Item to the Action tab.

Differential Revision: https://developer.blender.org/D15288
2022-07-20 14:18:17 +03:00
ae49e0e8be Cleanup: unused parameter in own recent commit rB92ca920c52b9. 2022-07-20 11:09:16 +02:00
Damien Picard
c48dc61749 I18n: fixes to add-on message extraction
This commit fixes several issues in add-ons UI messages extraction code:

- In multi-file modules, the script would crash because it tried to write to
  the dir instead of a `translations.py` file;
- The add-on message extraction works by enabling the add-on, getting all
  messages; disabling the add-on, getting all messages; then comparing the
  two message sets. But often a bug happens where a class gets a
  description from somewhere else in memory. I couldn’t debug that, so a
  workaround is to check that the message isn’t a corrupted one before
  removing it;
- `printf()` doesn't exist in Python and would crash the script;
- `self.src[self.settings.PARSER_PY_ID]` can be replaced by `self.py_file`
  in class `I18n`, since a property exists to do that;
- At one point a generator was printed instead of its values, so let's
  unpack the generator to get the values. Maybe the print could be
  deleted entirely;
- Use SPDX license identifier instead of GPL license block, to be more in
  line with other scripts from the codebase.

Reviewed By: mont29

Differential Revision: https://developer.blender.org/D15474
2022-07-20 11:05:07 +02:00
Bastien Montagne
92ca920c52 Add a 'Apply and Delete All' operation to shapekeys.
Adds a new option to the 'Delete ShpaKeys' operator, which first applies
the current mix to the object data, before removing all shapekeys.

Request from @JulienKaspar from Blender studio.

Reviewed By: JulienKaspar

Differential Revision: https://developer.blender.org/D15443
2022-07-20 10:55:08 +02:00
ecf4f4a71f Curves: fix uninitialized curve type when adding new curves 2022-07-20 10:47:23 +02:00
c3b9a4e001 Cleanup: Access attributes with new API instead of geometry components
Remove the boilerplate of using a local geometry component just to use
the attribute API that was necessary before b876ce2a4a.
2022-07-19 22:34:32 -05:00
215f805ce6 Curves: Remove use of CurveEval in sculpt brushes
This commit removes the use of PolySpline for resampling curves and
replaces it with the length parameterization utility for that purpose.
I didn't test performance, but I would expect the shrinking to be
slightly faster because I reused some arrays to avoid allocating
them for every curve. I noted some potential improvements in
the "add curves" function.

Differential Revision: https://developer.blender.org/D15342
2022-07-19 21:48:32 -05:00
2551cf9087 Curves: Port fillet node to the new data-block
This commit ports the fillet curves node to the new curves data-block,
and moves the fillet node implementation to the geometry module to help
separate the implementation from the node.

The changes are similar to the subdivide node or resample node. I've
resused common utilities where it makes sense, though some things like
the iteration over attributes can be generalized further. The node
is now multi-threaded per-curve and inside each curve, and some buffers
are reused per curve to avoid many allocations.

The code is more explicit now, and though there is more boilerplate to
pass around many spans, the more complex logic should be more readable.

Differential Revision: https://developer.blender.org/D15346
2022-07-19 18:50:27 -05:00
5d6e4822d8 Cleanup: Combine geometry null checks in if statements
Testing if components or virtual arrays are null in the same line they
are retrieved can make this boilerplate code a bit easier to read.
2022-07-19 18:29:23 -05:00
40ffb94ab4 Cleanup: Use generic utility for copying compressed attribute
In the future, `materialize_compressed_to_uninitialized_threaded`
could be moved somewhere else and reused.
2022-07-19 18:16:12 -05:00
410a6efb74 Point Cloud: Remove redundant custom data pointers
Similar to e9f82d3dc7, but for point clouds instead.

Differential Revision: https://developer.blender.org/D15487
2022-07-19 18:06:56 -05:00
e9f82d3dc7 Curves: Remove redundant custom data pointers
These mutable pointers present problems with ownership in relation to
proper copy-on-write for attributes. The simplest solution is to just
remove them and retrieve the layers from  `CustomData` when they are
needed. This also removes the complexity and redundancy of having to
update the pointers as the curves change. A similar change will apply
to meshes and point clouds.

One downside of this change is that it makes random access with RNA
slower. However, it's simple to just use the RNA attribute API instead,
which is unaffected. In this patch I updated Cycles to do that. With
the future attribute CoW changes, this generic approach makes sense
because Cycles can just request ownership of the existing arrays.

Differential Revision: https://developer.blender.org/D15486
2022-07-19 18:01:04 -05:00
Pratik Borhade
28985ccc05 Fix T99583: Missing update for option in particle edit mode
Missing Ui refresh when property is accessed through shortcut.
Use RNA_def_property_update to solve this.

Differential Revision: https://developer.blender.org/D15431
2022-07-19 14:20:44 -05:00
43d04c7eeb UI Code Quality: Move eyedropper files to own folder
Part of T98518
2022-07-19 19:33:38 +02:00
Tomek Gubala
3b7ac10d62 UV: add Snap Cursor to Origin
Similar to snapping to the world origin in the 3D viewport. This can be found
in the Shift+S pie menu and UV > Snap menu.

Differential Revision: https://developer.blender.org/D15055
2022-07-19 19:29:03 +02:00
75e62df429 Cleanup: compiler warning 2022-07-19 19:22:13 +02:00
cd478fbfb3 Fix error printed in console when running Shade Flat operator
Only the Shade Smooth operator has autosmooth settings.
2022-07-19 19:22:13 +02:00
edc89c7f46 Fix build error when not using unity build 2022-07-19 19:22:13 +02:00
d14c2d549b Fix T99643: Vertex Crease fails with symmetry
Create a transform conversion type that only considers the Vertex
Custom Data.

This reduces the complexity of converting Meshes and slightly
optimizes the transformation.
2022-07-19 14:13:00 -03:00
597955d0a8 Cleanup: Remove compile option for curves object
After becb1530b1 the new curves object type isn't hidden
behind an experimental flag anymore, and other areas depend on this,
so disabling curves at compile time doesn't make sense anymore.
2022-07-19 12:10:45 -05:00
16b145bc62 Cleanup: Improve API and documentation of interface_view.cc
File documentation was outdated and could use general improvement.
Function names didn't really reflect the level they are operating on.
2022-07-19 18:04:03 +02:00
fb9dc810f1 UI Code Quality: Move view related files to own folder
Part of T98518.
2022-07-19 18:04:03 +02:00
Amélie Fondevilla
801513efa0 Fix T99732: Crash on F3 in the dopesheet
Adding the `FCURVESONLY` filter in the filter function of the Grease
Pencil dopesheet prevents calls to F-curves related functions to grease
pencil channels, thereby fixing the crash.

Reviewed By: antoniov, sybren

Maniphest Tasks: T99732

Differential Revision: https://developer.blender.org/D15490
2022-07-19 17:49:10 +02:00
87ae10a050 Curves: Hide snapping menu in curves sculpt mode
The menu/popover doesn't affect anything in the mode, and precision
operations that would use snapping are meant for edit mode anyway.
2022-07-19 10:45:01 -05:00
35b4e3a350 RNA: Don't allocate empty char pointer properties
Instead of allocating a single 0 char, set the `char *` DNA pointer to
null. This avoids the overhead of allocating and copying single-bytes.

rBeed45b655c9f didn't do this for safety reasons, but I checked the
existing uses of this behavior in DNA/RNA. Out of 43 total `char *`
members, this change only affects 7 recently added properties.
For a complete list, see the patch description.

Differential Revision: https://developer.blender.org/D14779
2022-07-19 10:30:19 -05:00
c771dd5e9c Depsgraph: Make animated properties API receive const ID
Semantically it is more correct as the cache does not modify the ID.

There is need to do couple of const casts since the BKE (which is in C)
does not easily allow to iterate into f-curves of const ID.

Should be no functional changes.
2022-07-19 17:22:53 +02:00
6a1ab4747b Geometry Nodes: Copy parameters when copying a curves data-block
Previously, things like materials, symmetry, and selection options
stored on `Curves` weren't copied to the result in nodes like the
subdivide and resample nodes. Now they are, which fixes some
unexpected behavior and allows visualization of the sculpt mode
selection.

In the realize instances and join nodes the behavior is the same as
for meshes, the parameters are taken from the first (top) input.

I also refactored some functions to return a `CurvesGeometry` by-value,
which makes it the responsibility of the node to copy the parameters.
That should make the algorithms more reusable in other situations.

Differential Revision: https://developer.blender.org/D15408
2022-07-19 10:16:30 -05:00
9246ff373a Metal: Add license header to new files 2022-07-19 17:14:39 +02:00
Jason Fielder
9835d5e58b Metal: MTLUniformBuffer module implementation
Initial implementation.

Authored by Apple: Michael Parkin-White
Ref T96261

Reviewed By: fclem
Differential Revision: https://developer.blender.org/D15357
2022-07-19 17:12:39 +02:00
Jason Fielder
6bba4d864e Metal: MTLQueryPool implementation adding support for occlusion queries.
When a query begins, the current visibility result buffer needs to be
associated with the currently active Render Pass. The MTLContext and
MTLCommandBuffer are responsible for ensuring new render pass objects are
created if the visibility state changes.

Authored by Apple: Michael Parkin-White
Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D15356
2022-07-19 17:02:15 +02:00
3370c1a8a7 Cleanup: Add comment for geometry nodes lazyness 2022-07-19 09:54:20 -05:00
ad5e3d30a2 Nishita sky: Increase elevation range
The Nishita sky texture currently only allows moving the sun to the zenith.
The problem is if you want to animate the passing of a full night-day-night
cycle. Currently it's not easy to do due to this limitation.

The patch makes it so users can easily animate the sun moving from sunrise
to sunset by expanding the max sun elevation to go 360° instead of 90°.

Reviewed By: fclem
Differential Revision: https://developer.blender.org/D13724
2022-07-19 16:36:57 +02:00
2e3fb58128 Cleanup: Apply clang-format 2022-07-19 16:33:09 +02:00
5bee991132 UI: Port view item features to base class, merge view item button types
No user visible changes expected.

Merges the tree row and grid tile button types, which were mostly doing
the same things. The idea is that there is a button type for
highlighting, as well as supporting general view item features (e.g.
renaming, drag/drop, etc.). So instead there is a view item button type
now. Also ports view item features like renaming, custom context menus,
drag controllers and drop controllers to `ui::AbstractViewItem` (the new
base class for all view items).

This should be quite an improvement because:
- Merges code that was duplicated over view items.
- Mentioned features (renaming, drag & drop, ...) are much easier to
  implement in new view types now. Most of it comes "for free".
- Further features will immediately become availalbe to all views (e.g.
  selection).
- Simplifies APIs, there don't have to be functions for individual view
  item types anymore.
- View item classes are split and thus less overwhelming visually.
- View item buttons now share all code (drawing, handling, etc.)
- We're soon running out of available button types, this commit merges
  two into one.

I was hoping I could do this in multiple smaller commits, but things
were quite intertwined so that would've taken quite some effort.
2022-07-19 16:31:23 +02:00
348ec37f52 UI: Add AbstractViewItem base class
No user visible changes expected.

Similar to rBc355be6faeac, but for view items now instead of the view.
Not much of the item code is ported to use it yet, it's actually a bit
tricky for the most part. But just introducing the base class already
allows me to start unifying the view item buttons (`uiButTreeRow` and
`uiButGridTile`). This would be a nice improvement.
2022-07-19 16:31:23 +02:00
Colin Basnett
2f834bfc14 Fix T97559: Undoing of NLA strip duplication requires two undo steps
Fix the issue where undoing a "duplicate NLA strip" operation would
require two undo steps.

The cause of this was that the operator was not using the operator macro
system to combine both the duplication and the translate operators into
one. Instead, the old code was simply manually invoking invoking the
translate operator after the duplicate operator had completed.

This patch requires the default keymap to be modified to include the two
new macro operators, `NLA_OT_duplicate_move` and
`NLA_OT_duplicate_linked_move` in favour of the old keymap that simply
called `NLA_OT_duplicate` and passed along a `linked` argument.

`duplicate_move` and `duplicate_move_linked` are two different enough
operations to justify having their own operators from user's
point-of-view, especially since we cannot yet have different tool-tips
based on an operator's settings.

Reviewed By: sybren, mont29

Differential Revision: https://developer.blender.org/D15086
2022-07-19 16:07:30 +02:00
Colin Basnett
4812eda3c5 Animation RNA: Add clear() method to FCurveKeyframePoints
Add `FCurveKeyframePoints.clear()` method to delete all keyframe points
from an FCurve.

Reviewed By: sybren

Differential Revision: https://developer.blender.org/D15283
2022-07-19 15:58:37 +02:00
2232855b50 Curves: align surface and curves object in Empty Hair operator
Without this, symmetry does not work by default when the surface
object was not at the same location as the 3d cursor.
2022-07-19 15:49:45 +02:00
95fd7c3679 Depsgraph: Cleanup, Make variable less ambiguous and more clear 2022-07-19 15:27:20 +02:00
bc6b612d8b Depsgraph: Make variable naming more clear
Disambiguate from nodes visibility flags.
2022-07-19 15:25:49 +02:00
Ethan-Hall
44f1495b57 EEVEE: use mipmaps of compressed textures (DDS)
Currently Blender generates mipmaps that override the existing ones.
This patch disables generating new mipmaps for compressed textures.

Reviewed By: fclem

Differential Revision: https://developer.blender.org/D14459
2022-07-19 15:20:53 +02:00
e8465f941c Depsgraph: Cleanup, use nested namespace definition 2022-07-19 14:44:07 +02:00
835203fde8 Depsgraph: Localize synchronization component visibility handling
Move it from generic visibility handling to the synchronization
component node implementation. Should be no functional changes.
2022-07-19 14:36:36 +02:00
73f8a7ca0a Depsgraph: Cleanup, comments wrapping and spacing between lines
Should make it easier to read. No functional changes expected.
2022-07-19 14:25:02 +02:00
6d2100f7de Depsgraph: Introduce operation code for visibility operation
No functional changes, just makes code more semantically clear.
2022-07-19 13:00:19 +02:00
99faebfca6 Depsgraph: Cleanup, don't mic static function and anonymous namespace 2022-07-19 12:51:24 +02:00
2280a71f90 Depsgraph: Refactor evaluation into smaller reusable functions
Should be no functional changes.
2022-07-19 12:41:47 +02:00
9b2b61a07b Depsgraph: Cleanup, use nested namespace definition 2022-07-19 11:40:18 +02:00
d3c063188e Depsgraph: Make name and name tag optional in component node
Matches the builder API, making some code less verbose.
2022-07-19 11:31:44 +02:00
ff98b5eaa8 Depsgraph: Clarify comment in the component node 2022-07-19 11:29:00 +02:00
e00a027c1e Depsgraph: Use single task pool during evaluation
Not sure why multiple pools were created: the pool should be able to
handle two sets of tasks.

Perhaps non-measurable improvement in terms of performance but this
change simplifies code a bit.
2022-07-19 11:18:30 +02:00
533a5a6a8c Fix T99785: Make Principled Hair IOR input behave like other IOR sliders
Was accidental regression in rBed9b21098dd27bf9364397357f89b4c2648f40c2

Remove the input slider's PROP_FACTOR subtype in favor of the default to
align with other IOR sliders. This provides much better control when
dragging the value with the mouse.

Differential Revision: https://developer.blender.org/D15477
2022-07-18 20:38:19 -07:00
bbf87c4f75 Fix T99737: Dropping files fails with Wayland
Drop events ignored the cursor coordinates, under the assumption that
cursor motion events would also be sent to update the cursor location.

This depended on the behavior of the compositor, it failed for Sway
but worked for Gnome-shell & River.

Resolve by making use of the drop events cursor coordinates.
2022-07-19 13:33:02 +10:00
7ebd1f4b79 Cleanup: quiet compiler warnings 2022-07-19 13:32:53 +10:00
37922eab90 Fix T99794: regression from uv unwrap selected
Restore only_selected_faces flag inadvertently changed by c0e4532331

Differential Revision: https://developer.blender.org/D15480
2022-07-19 13:26:24 +12:00
135e530356 Fix T99781: uv minimize stretch now unflips flipped faces
Add a small gradient to flipped faces proportional to length of edges.

Differential Revision: https://developer.blender.org/D15475
2022-07-19 10:26:28 +12:00
Henrik Dick
d175eb6c30 Fix Text Editor highlight of assert and async
Due to the ordering of the checks, assert and async were not highlighted
in the editor, even though they were in the list of keywords.

Differential Revision: http://developer.blender.org/D15483
2022-07-18 23:33:30 +02:00
8358cc7963 Fix wrong alpha for scene linear byte images during texture painting
Make the update logic consistent with the case where the initial texture is
created. Also fixes a wrong assert.

Thanks Clément for spotting this.
2022-07-18 17:40:48 +02:00
cd21022b78 Context: implement an active_action property that returns a single action.
Although e.g. in the dopesheet there is no specific concept of
active action, displaying panels requires singling out one action
reference. It is more efficient and clearer to implement this
natively in the context rather than using selected_visible_actions[0].

- In the Action Editor the action is taken from the header.
- In the Dope Sheet the first selected action is chosen, because
  there is no concept of an active channel or keyframe.
- In the Graph Editor the action associated with the active curve
  is used, which should also be associated with the active vertex.
  This case may be different from selected_visible_actions[0].

Differential Revision: https://developer.blender.org/D15412
2022-07-18 17:44:46 +03:00
935b7a6f65 Context: implement indexing for list properties in path_resolve.
The real RNA path_resolve method supports indexing lists,
but the python version on the Context object does not. This
patch adds the missing feature for completeness.

Differential Revision: https://developer.blender.org/D15413
2022-07-18 17:38:00 +03:00
1f8567ac68 Fix T99750: crash with file output node, after image colorspace saving changes 2022-07-18 15:50:39 +02:00
757041560f Build: update Embree to 3.13.4, enable Neon2x on Arm
* Allows Apple Silicon machines to use 8-wide BVH, which the release notes
  mention give an 8% performance boost.
* An update to this version is also required for OpenPGL.

This patch includes contributions from Jason Fielder and Sebastian Herholz.

Ref D15286, T98555

Differential Revision: https://developer.blender.org/D15482
2022-07-18 15:36:34 +02:00
3407ed5f9b Cleanup: change internal Cycles compact BVH default to match UI 2022-07-18 15:34:13 +02:00
Damien Picard
dec8854bf3 I18n: translate add node operator tooltips
The tooltips from the Add Node menu were extracted, but not translated.

Reviewed By: mont29

Differential Revision: https://developer.blender.org/D15467
2022-07-18 15:10:08 +02:00
Jun Mizutani
47f3b53756 Fix T99742: crash when generating rigify rigs, after recent changes
Differential Revision: https://developer.blender.org/D15473
2022-07-18 14:47:48 +02:00
c7f788b877 Subdiv: remove unused GPU device choice, fix crash with libepoxy on init
openSubdiv_init() would detect available evaluators before any OpenGL context
exists, causing a crash with libepoxy. This test however is redundant as we
already check the requirements on the Blender side through the GPU API.

To simplify things, completely remove the device detection in the opensubdiv
module and reduce the evaluators to just CPU and GPU. The plan here is to move
to the GPU module abstraction over OpenGL/Metal/Vulkan and so all these
different backends no longer make sense.

This also removes the user preference for OpenSubdiv compute device, which was
not used for the new GPU subdivision implementation.

Ref D15291

Differential Revision: https://developer.blender.org/D15470
2022-07-18 13:59:08 +02:00
3c016fbfd0 Fix error indenting new-lines in generated RST API docs
New-lines in RNA type descriptions caused invalid RST indentation.

This resolve the error noted by @nutti in D15481.
2022-07-18 20:00:21 +10:00
cd1e4ae448 Fix T99644: Anchored brush mode fails for negative brushes
The stroke code now supports raycasting the original mesh.
This fixes anchored mode not working for negative brushes,
which might move the mesh out of the initial mouse cursor
position.
2022-07-16 17:27:25 -07:00
9a14887905 Sculpt: Fix scene spacing mode (phase 1)
The scene spacing code was failing to
check if a raycast failed, which can happen
when sculpting the edges of objects in negative
mode.

Note I removed what I suspect was a hack put
in to fix this, spacing was clamped
to 0.001 scene units.

Scene spacing mode is actually quite broken,
so it will be fixed in a series of phases.
2022-07-16 16:45:41 -07:00
d136a996ca Audaspace: minor formatting fix for last commit. 2022-07-16 22:20:08 +02:00
Colin Basnett
1e4c557d82 Fix T99039: bpy.ops.sound.mixdown returns indecipherable error
Fix for {T99039}.

The problem was that `AUD_mixdown` and `AUD_mixdown_per_channel` were returning pointers to freed memory.

Two key changes are made:
1. The return value of those functions now simply return a bool as to whether the operation succeeded, instead of an optional error string pointer.
2. The error string buffer is now passed into the function to be filled in case an error occurs. In this way, the onus of memory ownership is unamibiguously on the caller.

Differential Revision: https://developer.blender.org/D15260
2022-07-16 22:14:19 +02:00
0a8d21e0c9 PyAPI: re-enable the "bgl" module for headless builds
Instead of removing the `bgl` module, set all it's functions to stubs
so importing `bgl` or any of it's members doesn't raise an error.

This avoids problems for scripts that import bgl but don't call it's
functions when running in background mode.
2022-07-16 17:30:17 +10:00
49babc7caa Cleanup: early exit MEM_lockfree_freeN when called with NULL pointer
Simplify logic for freeing a NULL pointer. While no null-pointer
de-reference was performed, this wasn't as so obvious as the pointer
was passed to MEM_lockfree_allocN_len before checking for NULL.

NOTE: T99744 claimed the a NULL pointer free was a vulnerability,
while I can't see evidence for this - exiting early makes it clearer
the memory isn't accessed.

*Details*

- Add MEMHEAD_LEN macro, avoids redundant NULL check.
- Use "UNLIKELY(..)" hint's for error cases
  (freeing NULL pointer and checking if `leak_detector_has_run`).
2022-07-16 17:23:42 +10:00
f76b537d48 Fix T99744: NULL pointer free with corrupt zSTD reading 2022-07-16 16:32:36 +10:00
bf49e6040c Fix error in assertion after 92a99c1496 2022-07-16 16:12:48 +10:00
b985437283 Fix workbench background render broken after recent changes from D15463
For Eevee the light baking can initialize OpenGL earlier, but for workbench we
can't assume the backend exists here already.
2022-07-15 20:10:42 +02:00
92a99c1496 Fix Eevee backround render crash after recent changes from D15463
Backend initialization needs to be delayed until after the OpenGL context
is created. This worked fine in foreground mode because the OpenGL context
already exists for the window at the point GPU_backend_init_once was called,
but not for background mode.

Create the backend just in time in GPU_context_create as before, and
automatically free it when the last context id discarded. But check if any
GPU backend is supported before creating the OpenGL context.

Ref D15463, D15465
2022-07-15 19:11:07 +02:00
5152c7c152 Cycles: refactor rays to have start and end distance, fix precision issues
For transparency, volume and light intersection rays, adjust these distances
rather than the ray start position. This way we increment the start distance
by the smallest possible float increment to avoid self intersections, and be
sure it works as the distance compared to be will be exactly the same as
before, due to the ray start position and direction remaining the same.

Fix T98764, T96537, hair ray tracing precision issues.

Differential Revision: https://developer.blender.org/D15455
2022-07-15 18:46:24 +02:00
bb376da6df Fix Cycles MetalRT error after recent specialization changes 2022-07-15 18:28:13 +02:00
03aeef64d5 Cleanup: compiler warnings 2022-07-15 16:57:04 +02:00
c505f19efe Fix compiler error in debug builds after 1cf465bbc3 2022-07-15 16:52:01 +02:00
1cf465bbc3 Fix GPU backend deleting resources without an active context
This causes an assert with libepoxy, but was wrong already regardless.

Refactor logic to work as follows:
* GPU_exit() deletes backend resources
* Destroy UI GPU resources with the context active
* Call GPU_backend_exit() after deleting the context

Ref D15291

Differential Revision: https://developer.blender.org/D15465
2022-07-15 16:31:28 +02:00
5e1229f253 Fix overly noisy surface deform warning message
An increased number of vertices is not a stopper for the surface
deform modifier anymore. It might still be useful to expose the
message in the UI, but printing error message to the console on
every modifier evaluation makes real errors to become almost
invisible.

Differential Revision: https://developer.blender.org/D15468
2022-07-15 16:30:11 +02:00
82f65d8971 Cleanup: VSE waveform drawing
No functional changes.
2022-07-15 15:52:37 +02:00
011d3c75a7 Cleanup: compiler warning 2022-07-15 15:20:53 +02:00
00dc747702 Fix T99706: Crash rendering with headless builds
When rendering with headless builds, show an error instead of crashing.

Previously GPU_backend_init was called indirectly from
DRW_opengl_context_create, a new function is now called from the window
manager (GPU_backend_init_once), so it's possible to check if the GPU
has a back-end.

This also disables the `bgl` Python module when building WITH_HEADLESS.

Reviewed By: fclem

Ref D15463
2022-07-15 22:16:44 +10:00
Damien Picard
180db0f752 UI: make many modifier strings translatable
This includes:
- new modifier names

It mostly uses `N_` because the strings are actually translated elsewhere.
The goal is simply to export them to .po files.

Most of the new translations were reported in T43295#1105335.

Reviewed By: mont29

Differential Revision: https://developer.blender.org/D15418
2022-07-15 14:15:40 +02:00
914617f8fd Fix (unreported) LibOverride: invalid behaviors when creating (partial) overrides.
The outliner would tagg all existing local IDs (for remap from linked
reference data to newly created overrides) when creating a new override.
This would become critical issue in case there is several existing
copies of the same override hierarchy (leading to several hierarchies
using the same override).

Further more, BKE override creation code would not systematically
properly remapp linked usages to new overrides one whithin the affected
override hierarchy, leading to potential undesired remaining usages of
linked data.
2022-07-15 14:15:40 +02:00
523bbf7065 Cycles: generalize shader sorting / locality heuristic to all GPU devices
This was added for Metal, but also gives good results with CUDA and OptiX.
Also enable it for future Apple GPUs instead of only M1 and M2, since this has
been shown to help across multiple GPUs so the better bet seems to enable
rather than disable it.

Also moves some of the logic outside of the Metal device code, and always
enables the code in the kernel since other devices don't do dynamic compile.

Time per sample with OptiX + RTX A6000:
                                         new                  old
barbershop_interior                      0.0730s              0.0727s
bmw27                                    0.0047s              0.0053s
classroom                                0.0428s              0.0464s
fishy_cat                                0.0102s              0.0108s
junkshop                                 0.0366s              0.0395s
koro                                     0.0567s              0.0578s
monster                                  0.0206s              0.0223s
pabellon                                 0.0158s              0.0174s
sponza                                   0.0088s              0.0100s
spring                                   0.1267s              0.1280s
victor                                   0.0524s              0.0531s
wdas_cloud                               0.0817s              0.0816s

Ref D15331, T87836
2022-07-15 13:42:47 +02:00
da4ef05e4d Cycles: Apple Silicon optimization to specialize intersection kernels
The Metal backend now compiles and caches a second set of kernels which are
optimized for scene contents, enabled for Apple Silicon.

The implementation supports doing this both for intersection and shading
kernels. However this is currently only enabled for intersection kernels that
are quick to compile, and already give a good speedup. Enabling this for
shading kernels would be faster still, however this also causes a long wait
times and would need a good user interface to control this.

M1 Max samples per minute (macOS 13.0):

                    PSO_GENERIC  PSO_SPECIALIZED_INTERSECT  PSO_SPECIALIZED_SHADE

barbershop_interior       83.4	            89.5                   93.7
bmw27                   1486.1	          1671.0                 1825.8
classroom                175.2	           196.8                  206.3
fishy_cat                674.2	           704.3                  719.3
junkshop                 205.4	           212.0                  257.7
koro                     310.1	           336.1                  342.8
monster                  376.7	           418.6                  424.1
pabellon                 273.5	           325.4                  339.8
sponza                   830.6	           929.6                 1142.4
victor                    86.7              96.4                   96.3
wdas_cloud               111.8	           112.7                  183.1

Code contributed by Jason Fielder, Morteza Mostajabodaveh and Michael Jones

Differential Revision: https://developer.blender.org/D14645
2022-07-15 13:40:04 +02:00
5653c5fcdd Cycles: keep track of SVM nodes used in kernels
To be used for specialization in Metal, to automatically leave out unused nodes
from the kernel.

Ref D14645
2022-07-15 13:40:04 +02:00
79da7f2a8f Cycles: refactor to move part of KernelData definition to template header
To be used for specialization on Metal in a following commit, turning these
members into compile time constants.

Ref D14645
2022-07-15 13:40:04 +02:00
Damien Picard
2e70d5cb98 Render: camera depth of field support for armature bone targets
This is useful when using an armature as a camera rig, to avoid creating and
targetting an empty object.

Differential Revision: https://developer.blender.org/D7012
2022-07-15 13:40:04 +02:00
5ddbc14bb2 Render: improve render border operator in image editor
* Snap border to pixels just outside the drawn border, to more easily select
  specific pixels by drawing a border inside them.
* Support cropped border renders.
2022-07-15 13:40:04 +02:00
9ea1b88f0f Cleanup: add utlity function to compute render resolution
Instead of duplicating logic many times.
2022-07-15 13:40:04 +02:00
b8ffd43bd2 Cleanup: make format 2022-07-15 13:40:04 +02:00
4d7c990180 Fix T98061: uv resize with individual origins could break constrain to bounds
Fix unreported: Resize with Constrain To Bounds will now limit one shared scale
value for both U and V instead of calculating separate scale values for each.

To fix T98061, the individual origins (transdata->center) is now used when
that mode is active.

See also: 0e9367fc29

Differential Revision: https://developer.blender.org/D15420
2022-07-15 23:18:13 +12:00
e69c4482f1 I18n: Add suport for labels from modifiers' subpanels.
Was a bit oif a struggle since those functions take a first string which
is not our label, but should work fine now.

Reported/detected as part of D15418.
2022-07-15 11:42:58 +02:00
Martijn Versteegh
8e1323f633 Fix: Move DRW_shgroup_add_material_resources(grp, mat) to after the null-check for grp.
Reviewed By: fclem

Maniphest Tasks: T99646

Differential Revision: https://developer.blender.org/D15436
2022-07-15 11:23:26 +02:00
f4d7ea2cf6 Fix T99606: Regression: TexCoordinate losing precision far away from origin
Same root cause as T99128. The fix also needed to be done in another place.
2022-07-15 11:16:14 +02:00
98f688ac42 GPU: Fix shader builder on hardware that does not have all features 2022-07-15 11:16:14 +02:00
862170c0b1 Cleanup: GPU: Replace NULL by nullptr from C++ files 2022-07-15 11:16:14 +02:00
ca1daf4cda Fix an increasing bottleneck when key press operator is too slow
The goal of this change is to fix an increasing bottleneck of the event
queue handling when there is an operator bound to a key press event and
is taking longer to finish than a key-repeat speed on the system.

Practical example of when it happens is the marker tracking operator in
a single-frame track mode. Quite often artists will hold down Alt-arrow
to track a segment of footage which seems trivial to track. The issue
arises when the Alt-arrow is released: prior to this change it is was
possible that more frames will be tracked. It also seems that redraws
are less smooth.

It is a bit hard to make easily shareable computer-independent test
case. Instead, a synthetic case can be reproduced by adding a 50 ms
sleep in the `text_move_exec()`. In such synthetic case open a long
text in the text editor and hold left/right arrow button to navigate
the cursor. The observed behavior is that seemingly redraws happen
less and less often and cursor travels longer and longer distances
between redraws. The cursor will also keep moving after the buttons
has been released.

The proposed solution is to ignore sequential key-press events from
being added to the event queue. This seems to be the least intrusive
and the most safe approach:

- If the operator is fast enough there will be no multiple press events
  in the queue in both prior and after of this change.

- If the operator is slow enough, clicking the button multiple times
  (i.e. clicking arrow button 3 times in a heavy shot will change the
  scene frame by exactly 3 frames because no events are ignored in
  this case).

- Only do it for key press events, keeping mouse and tabled behavior
  unchanged which is crucial for the paint mode.

Note that this is a bit different from the key repeat tracking and
filtering which is already implemented for keymaps as here we only want
to avoid the event queue build-up and do want to ignore all repeat
events. In other words: we do want to handle as many key presses as the
operator performance allows it without clogging anything.

A possible extension to this change could be a key press counter, so
that instead of ignoring the event we merge it into the last event in
the queue, incrementing some counter. This way if some operator really
needs to know exact number of key repeats it can still access it.

Differential Revision: https://developer.blender.org/D15444
2022-07-15 11:03:00 +02:00
c2715dc416 GPU: Remove USD dependency from shader_builder.
Dependency was added as shader builder depended to blenkernel as an
umbrella, in stead of adding the actual dependencies it required.
2022-07-15 10:54:10 +02:00
63ea0f7581 BLI_bitmap: fix _BITMAP_NUM_BLOCKS to not over-count by one block
For bit counts that were exact multiple of block size, the macro was
computing one block too much.

Reviewed By: Campbell Barton, Bastien Montagne
Differential Revision: https://developer.blender.org/D15454
2022-07-15 10:21:27 +03:00
8fd2b79ca1 BLI_bitmap: ability to declare by-value, and function to find lowest unset bit
In preparation for a larger change (D14162), some BLI_bitmap
functionality that could be submitted separately:

- Ability to declare a fixed size bitmap by-value, without extra
  memory allocation: BLI_BITMAP_DECLARE
- Function to find the index of lowest unset bit:
  BLI_bitmap_find_first_unset
- Test coverage of the above.

Reviewed By: Campbell Barton, Bastien Montagne
Differential Revision: https://developer.blender.org/D15454
2022-07-15 10:20:04 +03:00
d8094f9212 GHOST/Wayland: partial support for updating the UI scale
Partial support for changing the UI scale while Blender is open.

The scale is set but issues with the window size not updating remain.
2022-07-15 15:42:24 +10:00
60f260eb6a GHOST/Wayland: fix error setting the cursor scale
Calculate a scale that's compatible with the cursor size.
Needed so the cursor is always a multiple of scale.
2022-07-15 15:36:21 +10:00
d14d570580 blend_render_info: add check for negative BHead length (corrupt file)
Without this check, corrupt files would raise a Python exception,
now early exit with a useful error.
2022-07-15 14:53:38 +10:00
c8e8f107bf Fix T99711: Eternal loop reading blend file thumbnail
Account for negative BHead length (already handled by blend file loading).
2022-07-15 14:53:38 +10:00
675f6ef089 Cleanup: Use const pointers for ImageSaveOptions and ImageFormatData
Use const pointers to ImageSaveOptions and ImageFormatData for API
parameters where appropriate.

Differential Revision: https://developer.blender.org/D15400
2022-07-14 21:27:58 -07:00
178868cf42 Fix T79304: improve uv island calculation when in edge selection mode
Differential Revision: https://developer.blender.org/D15419
2022-07-15 14:19:48 +12:00
0e9367fc29 Cleanup: separate clipUVTransform into two different functions
No functional changes.

Prep for D15420 / T98061.
2022-07-15 10:44:11 +12:00
b1329d7eaa Fix T99705: fix integer overflow in thumbnail extractor
It was smart enough to check if the buffer had the right
size but neglected to cast to a 64 bit value so it
overflowed.

Differential Revision: https://developer.blender.org/D15457
Reviewed By: brecht
2022-07-14 12:18:35 -06:00
9fedcde750 Modifiers: fix mesh to volume modifier on non-volume objects 2022-07-14 20:05:23 +02:00
bdd0ac5bce Fix on_drag_start handler not getting ID when dragging from Outliner
We would first invoke the dragging, and then set the drag data (like the
ID or the dragged modifier), so the `wmDropBox.on_drag_start()` handler
wouldn't be able to access this. This broke dragging some IDs from the
Outliner, noticed in D15333.

It's now possible to first create/request drag data, extend it, and then
invoke the actual dragging. The normal function to start dragging
returns `void` now instead of `wmDrag *`, so the drag data can't easily
be modified after starting anymore.
2022-07-14 19:21:56 +02:00
1ef686bd26 UI: Tweak layout of File Browser Preferences
* Don't nest "Show Recent Locations" and "Show System Locations" under a
  "Defaults" heading. They are not just a default setting but completely
  hide panels from the UI.
* Use own "Show Locations" heading instead, and remove redundant words
  from labels.
* Move the options to the top of the panel, they are more general since
  they can't be toggled in a File Browser session, and thus have bigger
  impact.

We may want to remove these options in a future major release, I don't
think they are useful.

Agreed on with Pablo Vazquez.
2022-07-14 19:21:22 +02:00
b6de6da59a I18n: Fix regex for messages from BKE_modifier_set_error.
Signature of this function changed at some point, regex to extract
messages from it was no longer working.

Reported/detected as part of D15418.
2022-07-14 18:50:16 +02:00
3b15467e97 Fix T99702: Gpencil Flip strokes did not support multiframe edit
This was a missing feature and this commit solves this.
2022-07-14 16:42:26 +02:00
Olivier Maury
1b5db02a02 Fix Cycles MNEE wrong results with area light spread
When the solve is successful, the light sample needs to be updated since the
effective shading point is now on the last refractive interface. Spread was
not taken into account, creating false caustics.

Differential Revision: https://developer.blender.org/D15449
2022-07-14 16:36:38 +02:00
28c3739a9b Cleanup: replace state flow macros in the kernel with functions 2022-07-14 16:36:38 +02:00
5539fb3121 Cycles: add presets to the Performance panel
With choices Default, Lower Memory and Faster Render. For convenience, and
to help communicate what the various settings do.

Differential Revision: https://developer.blender.org/D15446
2022-07-14 16:36:38 +02:00
02ce29c6ee Improve Tool tip for Add-on search
Differential Revision: https://developer.blender.org/D15411
2022-07-14 16:33:04 +02:00
4b1d315017 Cycles: Improve cache usage on Apple GPUs by chunking active indices
This patch partitions the active indices into chunks prior to sorting by material in order to tradeoff some material coherence for better locality. On Apple Silicon GPUs (particularly higher end M1-family GPUs), we observe overall render time speedups of up to 15%. The partitioning is implemented by repeating the range of `shader_sort_key` for each partition, and encoding a "locator" key which distributes the indices into sorted chunks.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D15331
2022-07-14 14:26:18 +01:00
47d4ce498e Cleanup: minor changes to camera frame fitting
Use const vars & make order of min/max checks consistent.
2022-07-14 22:08:30 +10:00
2d04012e57 Cleanup: spelling in comments
Also remove duplicate comments in bmesh_log.h, caused by automated
comment relocation in [0].

[0]: c4e041da23
2022-07-14 22:02:52 +10:00
9dfabc1de3 Cleanup: remove redundant event->val check for 3D text insertion 2022-07-14 22:01:43 +10:00
93f74299f0 Cleanup: clang-tidy changes to GHOST_SystemX11
Also remove redundant check.
2022-07-14 21:55:46 +10:00
cdd8b96e3b GHOST: remove redundant ascii argument to GHOST_EventKey
Now only the utf8 buffer is used there is no reason to pass both.
2022-07-14 21:54:28 +10:00
eb3e56a36e Fix: Wrong output types for some compositor nodes
The Difference Matte and RGB To BW nodes have a wrong output type. They
should be floats but are of type color.

This is a regression that was introduced during the migration to the
socket builder API in D13266.

Reviewed By: Blendify, fclem

Differential Revision: https://developer.blender.org/D15232
2022-07-14 13:52:44 +02:00
db80cf6ad7 GHOST/X11: avoid redundant utf8 text lookups for release events
The text representation of release events is never used,
so only calculate this for press events.
2022-07-14 21:10:22 +10:00
64e196422e GHOST/X11: Quiet warning about key-release events having their utf8 set
Quiet warning from [0], no functional change as the this information
was always ignored.

Key release events shouldn't have associated text, this was cleared
for wmEvent's, so there is no reason to pass it from GHOST.

[0]: d6fef73ef1
2022-07-14 21:04:16 +10:00
7fa7722350 Cleanup: format, unused argument 2022-07-14 20:53:20 +10:00
6cd30d5ff0 IDManagement: add more ID naming tests
As part of a larger change (https://developer.blender.org/D14162),
adding more test coverage for existing functionality separately.

New tests:
- ids_sorted_by_default
- ids_sorted_by_default_with_libraries
- name_too_long_handling
- create_equivalent_numeric_suffixes
- zero_suffix_is_never_assigned
- remove_after_dup_get_original_name
- name_number_suffix_assignment
- renames_with_duplicates
- names_are_unique_per_id_type
2022-07-14 13:41:43 +03:00
9024ac31be Fix curve drawing crash after changing geometry nodes output.
Using geometry nodes with attributes on a curve object and changing the
output is crashing. This is because the `render_mutex` in the curve
drawing cache is cleared after changes in `curves_batch_cache_init` and
set to null. The cache isn't actually needed currently because all draw
updates are single-threaded, but the new `drw_attributes_merge` function
still tries to access it (this seems to be tolerated on Linux platforms
but crashes on Windows).

Make sure the render_mutex is always initialized after (re-)initializing
the cache.
2022-07-14 11:17:59 +01:00
cb62095c1c Correct error with IME from d6fef73ef1 2022-07-14 20:07:06 +10:00
96cc603037 Geometry Nodes: update curve type counts after realizing instances
The type counts have to be updated eagerly. This was missing from
the realize-instances code before, leading to bugs further down
the line.
2022-07-14 11:49:28 +02:00
c8a07ef663 BLI: fix finding indices from virtual array
The sorting of index vectors assumed that all vectors have
at least one element. Now this is checked for more explicitely.
2022-07-14 11:32:01 +02:00
bcdce4ffd8 Geometry Nodes: fix face corner to edge boolean interpolation
This is a follow up for rB44e530e1b107fd0d91f472f9a58642ab59fd5422
which did not fix the function that interpolates boolean attributes.
2022-07-14 10:47:26 +02:00
44e530e1b1 Geometry Nodes: fix face corner to edge attribute interpolation
Looks like this was wrong all the time.. Luckily, this conversion
is not very common.

Found when testing D15274.
2022-07-14 10:35:59 +02:00
Damien Picard
9d73bbd966 UI: translate tooltips coming from menu descriptions
Many menus get their labels exported to the .po file, but then are not actually translated in the UI.

Before:
{F13283752}

After:
{F13283750}

Reviewed By: mont29

Differential Revision: https://developer.blender.org/D15417
2022-07-14 10:29:10 +02:00
Colin Basnett
8e3879ab52 Cleanup: Rename & refactor several F-curve functions
Rename and refactor several F-curve key manipulation functions, and move
them from `editors` to `blenkernel`.

The functions formerly known as `delete_fcurve_key`,
`delete_fcurve_keys`, and `clear_fcurve_keys` have been moved from
`ED_keyframes_edit.h` to `BKE_fcurve.h` and have been renamed according
to hierarchical naming rules.

Below is a table of the naming changes.

| From | To |
| -- | -- |
| `delete_fcurve_key(fcu, index, do_recalc)` | `BKE_fcurve_delete_key(fcu, index)` |
| `delete_fcurve_keys(fcu)` | `BKE_fcurve_delete_keys_selected(fcu)` |
| `clear_fcurve_keys(fcu)` | `BKE_fcurve_delete_keys_all(fcu)` |
| `calchandles_fcurve()` | `BKE_fcurve_handles_recalc()` |
| `calchandles_fcurve_ex()`| `BKE_fcurve_handles_recalc_ex()` |

The function formerly known as `delete_fcurve_key` no longer takes a
`do_fast` parameter, which determined whether or not to call
`calchandles_fcurve`. Now, the responsibility is on the caller to run
the new `BKE_fcurve_handles_recalc` function if they have want to
recalculate the handles.

In addition, there is now a new static private function called
`fcurve_bezt_free` which sets the key count to zero and frees the key
array. This function is now used in couple of instances of functionally
equivalent code. Note that `BKE_fcurve_delete_keys_all` is just a
wrapper around `fcurve_bezt_free`.

This change was initially spurred by the fact that `delete_fcurve_keys`
was improperly named; this was a good opportunity to fix the location
and naming of a few of these functions.

Reviewed By: sybren

Differential Revision: https://developer.blender.org/D15282
2022-07-14 10:24:32 +02:00
Iliay Katueshenock
77df9d788a Fix T99239: weird behavior in Field on Domain node 2022-07-14 10:05:24 +02:00
Damien Picard
f48fadc953 UI: translate quick favorites menu operator names
Some operator titles were not translated in the quick favorites menu.

Before:
{F13283724}

After:
{F13283725}

Reviewed By: mont29

Differential Revision: https://developer.blender.org/D15416
2022-07-14 10:01:09 +02:00
Damien Picard
1571ee66b5 I18N: Allow translating newly added GP data names, and a missing Surface one. 2022-07-14 10:01:09 +02:00
3935bf255e Fix T99677: crash when convex hull node is used on empty mesh
Fundamental issue is that the attribute api returns none, because
the custom data api returns null for a layer when the size of 0.
This should be improved separately.
2022-07-14 09:57:08 +02:00
411bcf1fe7 Cleanup: simplify 3D text insert, remove checks for ascii escape codes
3D text included checks for escape codes such as \n\r\b which have not
been included in wmEvent.ascii since [0] (2009).

Remove these and use more straightforward logic for overriding the
events text input.

[0] 66437a62a7
2022-07-14 17:04:22 +10:00
b35e33317d Cleanup: update & correct comments for event handling
- Remove references to `ISTEXTINPUT` as any keyboard event with it's
  utf8_buf set can be handled as text input.

- Update references to the key repeat flag.
2022-07-14 16:10:13 +10:00
d6fef73ef1 WM: Remove ASCII members from wmEvent & GHOST_TEventKeyData
The `ascii` member was only kept for historic reason as some platforms
didn't support utf8 when it was first introduced.

Remove the `ascii` struct members since many checks used this as a
fall-back for utf8_buf not being set which isn't needed.
There are a few cases where it's convenient to access the ASCII value
of an event (or nil) so a function has been added to do that.

*Details*

- WM_event_utf8_to_ascii() has been added for the few cases an events
  ASCII value needs to be accessed, this just avoids having to do
  multi-byte character checks in-line.

- RNA Event.ascii remains, using utf8_buf[0] for single byte characters.

- GHOST_TEventKeyData.ascii has been removed.

- To avoid regressions non-ASCII Latin1 characters from GHOST are
  converted into multi-byte UTF8, when building X11 without
  XInput & X_HAVE_UTF8_STRING it seems like could still occur.
2022-07-14 15:59:19 +10:00
816a73891b GHOST/SDL: pass in utf8 buffer for keyboard events
While GHOST/SDL doesn't support non-ASCII text input,
use the UTF8 buffer to be consistent with all other back-ends.

Move the conversion from SDL_KeyboardEvent to ASCII into a function.

Also only lookup this value on key press (not release).
2022-07-14 15:58:46 +10:00
d3374e5337 Fix build and warnings from previous commit. 2022-07-14 16:19:21 +12:00
931779197a Fix T99684: Upgrade Averages Island Scale with options Scale UV and Shear
Differential Revision: https://developer.blender.org/D15421
2022-07-14 15:42:08 +12:00
09a74ff8b6 GHOST/SDL: add support for the key repeat flag
Now all ghost back-ends support the key repeat flag
(accessed as WM_EVENT_IS_REPEAT from wmEvent.flag).
2022-07-14 09:46:58 +10:00
50d832634e Docs: Fix out of order parameters
Fixes T99672
2022-07-13 16:25:57 -04:00
144d9f2b2e Cleanup: Do not use spaces in default data names.
Using white spaces in data names should not be encouraged in general,
better not give wrong example here.

Originally part of D15441.
2022-07-13 17:40:35 +02:00
88fbf0a8fc Fix (studio-reported) bad remapping of libraries.
New remapper code would also fail in some cases when remapping
libraries, similar to the issue yesterday, because ID_LI type had no
mask value.

That would fail to remap `parent` member of a library to NULL when
deleting that parent, leading to a crash e.g. in Outliner tree building
code.

Reported by @JulienKaspar from Blender studio.
2022-07-13 16:11:59 +02:00
ccdf189d3c Documentation: Update Docs for Gizmo
This patch updates the documentation for arguments regarding the `Gizmo`
type.

- Corrected `select_id` doc for draw_preset_ functions. `-1` indicates
  that no selection ID is to be written, but previous docs incorrectly
  specified `0` instead.
- Added missing doc for `target` argument for `target_set_handler`
  function.

Reviewed by: Aaron Carlisle (Blendify)

Differential Revision: https://developer.blender.org/D14834
2022-07-13 08:43:57 -04:00
441dd08dba Expose option for fallback tools keymap in GizmoGroup type
This patch allows new GizmoGroup classes to support tool fallback keymap.

With this patch, when new gizmo groups add `'TOOL_FALLBACK_KEYMAP'` to
its `bl_options`, the fallback tools are added to the group. This
allows a `WorkSpaceTool` (for example) to have selection be a fallback
tool if the user LeftMouse drags away from other gizmos in the group.

Reviewed by: Campbell Barton (campbellbarton)

Differential Revision: https://developer.blender.org/D15154
2022-07-13 08:30:40 -04:00
c484599687 Expose snap options in transform operators
This commit exposes snap options in transform operators. These options
are needed for Python tools to control snapping without requiring the
tool settings to be adjusted.

The newly exposed options are:

- `snap_elements` for choosing which element(s) of target the source
  geometry should snap to (ex: Face Raycast).
- `use_snap_self`, `use_snap_edit`, `use_snap_nonedit`,
  `use_snap_selectable_only` for controlling target selection.
- `use_snap_project` for controlling Face Raycast snapping.
- `use_snap_to_same_target` and `snap_face_nearest_steps` for
  controlling Face Nearest snapping.

Reviewed by: Campbell Barton (campbellbarton)

Differential Revision: https://developer.blender.org/D15398
2022-07-13 07:10:09 -04:00
c8be3d3b27 Fix T99654: Applying Mirror modifier breaks the erase tool
The problem was the new generated strokes were copied from original and the location was offset to mirror, but the internal geometry data was not updated and the collision check done by brushes was not working.

Now, the internal geometry data is recalculated when the modifier is applied.
2022-07-13 12:56:48 +02:00
74888cdbfd Fix (studio-reported) issue in remapping code.
Not clearing runtime remapping data for the new ID as well as the old
one can lead to false stale data there, wichi could e.g. make indirectly
linked data be tagged as directly linked.

This would generate an error report on file write when hapening on
ShapeKey ID, since that type is not allowed to be directly linked.
2022-07-13 11:15:19 +02:00
a084839605 Cleanup: logical order of axis defines, assign variables for readability 2022-07-13 16:42:06 +10:00
b3913d7551 Cleanup: use defines for camera axes for view frame fitting
The values used for axes weren't in any meaningful order, use defines
to improve readability.
2022-07-13 16:18:16 +10:00
9422627155 Fix T99653: "Align Active Camera to Selected" fails with ortho camera
There were two bugs, a regression in [0] and the object-data wasn't
tagged for depsgraph updating.

[0]: 19df0e3cfd
2022-07-13 16:18:14 +10:00
8f543a73ab Fix T99659: Improve UV Island calculation with hidden faces.
Simplify interface, regularize implementation and some light cleanup.

See also: T79304 and D15419.
2022-07-13 11:42:42 +12:00
d0a552b5c6 Fix: set dangling pointer to null
The data has been moved somewhere else, the span should not
keep a pointer to it.
2022-07-12 18:47:51 +02:00
5f09440d5a Cycles: Make not-compact BVH the default for embree
Measurements shown on average a 1.08x speedup for a 1.04x increase in
memory usage which is an acceptable trade off for a default setting,
although discoverability of such settings influencing memory usage could
be improved.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D15429
2022-07-12 18:40:14 +02:00
9f153949f9 Enable "copy to selected" for new Curves modifiers
The operator bpy.ops.object.modifier_copy_to_selected()
does not work for the new Curves objects.
This is because it isn't added to BKE_object_supports_modifiers.

Differential Revision: https://developer.blender.org/D15439
2022-07-12 18:31:33 +02:00
1c382a4940 Curves: improve error checking in deform curves on surface node 2022-07-12 17:10:23 +02:00
02aefa7659 Fix: wrong node name in menu 2022-07-12 17:10:23 +02:00
b767628173 Fix: Memory leaks in indexer code
Reviewed By: Richard Antalik

Differential Revision: http://developer.blender.org/D15376
2022-07-12 16:58:04 +02:00
Khoi Dau
93253d5dcc Fix T99103: crash when displaying or rendering Grease Pencil object
On some hardware/systems, blender may crash when adding, rendering or displaying Grease Pencil objects.

In `/source/blender/draw/engines/gpencil/shaders/gpencil_vert.glsl`, line 35:
```
gpMaterialFlag gp_flag = floatBitsToInt(gp_mat._flag);
```
`gpMaterialFlag` is of type `uint`. This is a mismatched-type assignment that can cause crashes on some hardware/systems with GLSL that do not support implicit type casting.

So use `floatBitsToUint` for type conversion.

Differential Revision: https://developer.blender.org/D15433
2022-07-12 11:34:41 -03:00
d58072caf4 Fix: missing geometry copy before modifying it
A geometry component may reference read-only geometry.
In this case it has to be copied before making changes to it.

This was caused by rBb876ce2a4a4638142.
2022-07-12 16:27:06 +02:00
47dd42485e Cycles: fix and enable JIT oneAPI CentOS7 builds for drivers 23570+
The current specific CentOS7 workaround we have for AoT, which is to
disable __FAST_MATH__ by using -fhonor-nans, now also fixes the
compilation issue for JIT as well since at least driver 23570.
2022-07-12 15:55:32 +02:00
4a445c8dc0 LibOverride: Fix some issues from.revealed by recent previous commit.
rB57097e9a8515 did not properly consider case where you have more than
one override for a same reference linked ID.

Also adds more security checks around shapekeys, since match between
override and its linked reference is never ensured either way (fixes a
crash reported by @Rik from Blender studio).
2022-07-12 15:34:40 +02:00
f72cedffb6 Cleanup: Use interpf instead of repeating the logic
This makes the code clearer.
2022-07-12 10:20:01 -03:00
2d1fe736fa Fix T96238: shrinking to zero doesn't actually scales the radius to zero
Probably to prevent the radius of a point from being stuck at zero,
the `Shrink/Fatten` curve operator sets a minimum value of `0.001`
for all points being transformed.

This is an inconvenience as these points may have been purposely set
to zero on the panel.

And it also doesn't follow the convention seen in other operators
(which keep the value zero).

So remove this limitation.
2022-07-12 10:11:05 -03:00
57097e9a85 Fix T99261: partial LibOverride creation would not properly remap all needed data.
When creating partial overrides, there may also be need to reamap usage
of linked data towards already existing overrides, in newly created
overrides.
2022-07-12 14:39:08 +02:00
RedMser
4344b2bf19 Markers: Make delete confirmation depend on key used
Add a 'Delete Confirmation' operator property to the 'Delete Marker'
operator. This determines whether the user is asked to confirm the
deletion or not.

Defaults so that only {key X} ({key Backspace} for industry compatible
keymap) prompts for deletion, whereas {key Del} does not show the
confirmation popup.

This also makes the default keymap for marker deletion consistent with
the common delete operators (such as objects and keyframes).

Reviewed By: sybren

Differential Revision: https://developer.blender.org/D13818
2022-07-12 13:00:56 +02:00
52b7f2b089 UV: Box and lasso selection for partially intersecting edges
In UV edge mode, box and lasso selections allow edge selections only
when the entire edge is contained within the selection area. This
doesn't consider any edges that partially overlap with the selection
area.

This is  now fixed by adding a second pass, similar to how these
operators work for edit-mesh selections. Now if both operators are
unable to find any edges contained within the selection area, then
they will perform a second pass which checks for edges that partially
intersect with the selection area.

Now edge selection in the UV editor matches edit-mesh edge-selection
when drawing wire-frame.

Resolves T99443.

Ref D15362
2022-07-12 19:50:40 +10:00
bb3a538843 Fix: Incorrect coordinates used in BLI_rct*_isect_segment functions
Ref D15330.
2022-07-12 19:28:26 +10:00
8bd32019ca Fix threading crash due to conflict in mesh wrapper type
A mesh wrapper might be being accessed for read-only from one thread
while another thread converts the wrapper type to something else.

The proposes solution is to defer assignment of the mesh wrapper
type until the wrapper is fully converted. The good thing about this
approach is that it does not introduce extra synchronization (and,
potentially, evaluation pipeline stalls). The downside is that it
might not work with all possible wrapper types in the future. If a
wrapper type which does not clear data separation is ever added in
the future we will re-consider the threading safety then.

Unfortunately, some changes outside of the mesh wrapper file are
to be made to allow "incremental" construction of the mesh prior
changing its wrapper type.

Unfortunately, there is no simplified file which demonstrates the
issue. It was investigated using Heist production file checked at
the revision 1228: `pro/lib/char/einar/einar.shading.blend`. The
repro case is simple: tab into edit mode, possibly few times.

The gist is that there several surface deform and shrinkwrap
modifiers which uses the same target. While one of them is building
BVH tree (which changes wrapper type) the other one accesses it for
read-only via `BKE_mesh_wrapper_vert_coords_copy_with_mat4()`.

Differential Revision: https://developer.blender.org/D15424
2022-07-12 10:26:52 +02:00
6e6da22eb0 Fix: crash when iterating over all attributes 2022-07-12 09:42:19 +02:00
b8d1e576bc Cleanup: use array for internal _bpy methods 2022-07-12 16:11:19 +10:00
ae6a4fcc7a Tests: add test to ensure restricted py-driver execution is working
Add internal function (only used for testing at the moment)
`_bpy._driver_secure_code_test`.

Add test `script_pyapi_bpy_driver_secure_eval` to serves two purposes:

- Ensure expressions that should be insecure remain so when upgrading
  Python or making any changes in this area.

- Ensure new versions of Python don't introduce new byte-codes that
  prevent existing expressions from being executed
  (happened when upgrading from 3.7, see [0]).

[0]: dfa5201763
2022-07-12 16:11:19 +10:00
00c7e760b3 Python: add opcodes for safe py-drivers
The following opcodes have been added, see [0] for details:

- LIST_TO_TUPLE: convert a list to a tuple,
  use for constructing lists/tuples in some cases.

- LIST_EXTEND: use for constructing lists with unpacking.

- SET_UPDATE: use for constructing sets with unpacking.

- CONTAINS_OP: check if `a in b` generally useful.

When writing tests these op-codes where needed for basic operations
and can be safely supported.

Add note why dictionary manipulation op-codes have been left out.

Also restrict namsepace access to anything with an underscore prefix
since these may be undocumented.

[0]: https://docs.python.org/3.10/library/dis.html
2022-07-12 16:05:13 +10:00
2a1d12d7a0 Fix (studio-reported) crash in ID remapping code on rare cases.
Some ID types did not have a filter value, even though they would be
used in remapping code, leading to missing remappings. In that specific
case, shape keys would actually never be properly remapped.

Reproducible in r1230 of
`Heist/pro/animation_test/einar/einar_new_expression_shapes2.blend`,
2022-07-11 19:16:04 +02:00
995c904d00 Fix (unreported) crash in liboverride code on rare cases.
When dealing with 'embedded' IDs (and the like, e.g. shape keys),
liboverride code could fail in case the reference linked data (e.g. a
mesh) would not have a shapekey anymore, while the override mesh would
still have one.

Found while investigating another issue in Heist production file
`Heist/pro/animation_test/einar/einar_new_expression_shapes2.blend`,
r1230.
2022-07-11 19:16:04 +02:00
8ca09e6c5e GPU: add BUIDTIME to WITH_GPU_SHADER_BUILDER
Adds a better name that describes when it is used.
The GPU_SHADER_BUILDER is a buildtime tool for developers
to pre-validate GLSL (and in the overseen future pre-compile to
SpirV). We don't see that this needs to become a required
step in the future so WITH_GPU_BUILDTIME_SHADER_BUILDER
is more descriptive name.
2022-07-11 16:45:07 +02:00
Jeroen Bakker
76d8614236 GPU: Do not allow GPU Shader builder when USD is enabled.
Linking GPU shader builder requires stubs for many functions of the USD library.
We don't want to rely on other modules to update the stubs for a tool that
is only used by GPU developers.

This patch raises an error when both WITH_GPU_SHADER_BUILDER and WITH_USD are
enabled. This reduces the maintenance of updating the stubs when USD API changes.

Reviewed By: LazyDodo

Differential Revision: https://developer.blender.org/D15422
2022-07-11 16:12:36 +02:00
6e426259b4 Fix T99218: light group add button should be disabled when name is empty
Previously it was inactive but still clickable.

Ref D15316
2022-07-11 14:02:38 +02:00
6ca5ac2084 GPU: Update shader builder stubs.
Fixes workflow when using WITH_GPU_SHADER_BUILDER=On.
2022-07-11 13:36:29 +02:00
275419f6fd Fix/Cleanup UI messages. 2022-07-11 12:46:22 +02:00
1c4c904786 Deps Builder: Disable TermInfo and ncurses for DPC++
They are not strictly needed for compilation and disabling them makes
the compiler more portable without any special trickery.

This change aimed to solve problem which currently happens on the API
documentation build which does not have terminfo installed, but needs
to compile Cycles.

Note that the DPC++ is to be re-compiled.
2022-07-11 12:09:09 +02:00
da101118d4 Cleanup: Remove unused operator name storage in UI lists 2022-07-11 11:17:08 +02:00
7357176b57 Fix T99383: Wrong origdata type in color filter 2022-07-11 00:19:53 -07:00
cb39058f2f Fix T94633: Sculpt mode missing check for hidden active object
Note there is a bug in BKE_object_is_visible_in_viewport, it
returns false when the object is in local mode.

The transform operator poll should do a similar test.  That
would allow us to move the test from sculpt_brush_strok_invoke
to SCULPT_mode_poll (at the moment we cannot do this due to
the brush operator falling through to the translate keymap
item in global view3d keymap).
2022-07-10 23:39:45 -07:00
133d398120 PyAPI: add Matrix.is_identity read-only attribute
Add a convenient way of checking if the matrix is an identity matrix.
2022-07-11 12:45:00 +10:00
d51bc8215f GPencil: Dot-dash modifier rename segment bug fix.
This patch fixes naming and renaming issue with dot-dash modifier segment list.

Before: when double clicking and exiting it would append
number at the end regardless of name being changed or not.

Now it works like in other areas.

Authored by: Aleš Jelovčan (frogstomp)

Reviewed By: YimingWu (NicksBest)

Differential Revision: https://developer.blender.org/D15359
2022-07-11 10:13:07 +08:00
d4a4691c0c Cleanup: spelling in comments 2022-07-11 10:38:04 +10:00
a83502f05f Cleanup: remove unused GHOST function getAnyModifiedState.
Remove unused GHOST_WindowManager::getAnyModifiedState()
2022-07-11 10:38:02 +10:00
7f4ee97b9e Revert "Fix an assert trip in boolean tickled by D11272 example."
This reverts commit 6543290116.
It broke tests and I don't know why, so reverting this while
figuring that out.
2022-07-10 18:50:11 -04:00
6543290116 Fix an assert trip in boolean tickled by D11272 example.
The face merging code in exact boolean made an assumption that
the tesselated original face was manifold except at the boundaries.
This should be true but sometimes (e.g., if the input faces have
self-intersection, as happens in the example), it is not.
This commit makes face merging tolerant of such a situation.
It might leave some stray edges from triangulation, but it should
only happen if the input is malformed.
Note: the input may be malformed if there were previous booleans
in the stack, since snapping the exact result to float coordinates
is not guaranteed to leave the mesh without defects.
2022-07-10 14:50:17 -04:00
fad857f473 Fix T99532: New OBJ importer in some cases fails to import faces
The importer code was written under incorrect assumption that vertex
data (v, vn, vt commands etc.) are grouped by object, i.e. follow the
o command, and that each object has its own vertex data commands. This
is not the case -- all the vertex data in the whole OBJ file is
"global", with no relation to any objects/groups; it's just that the
faces belong to the object, and then they pull in any vertices they
like.

This patch fixes this incorrect assumption in the importer:

- Vertex data is now properly global; no need to track some sort of
  "offsets" per object like it was doing before.
- For each object, face definitions track the minimum & maximum vertex
  indices referenced by the object, and then all that vertex range is
  created in the final Blender object. Note: it might be (unusual, but
  possible) that an object does not reference a sequential range of
  vertices, e.g. just a single face with vertex indices 1, 10, 100 --
  the resulting Blender mesh will have all the 100 vertices (some
  "loose" without belonging to a face). It should be possible to track
  the used vertices exactly (e.g. with a vector set), but I haven't
  done that for performance reasons.

Reviewed By: Howard Trickey
Differential Revision: https://developer.blender.org/D15410
2022-07-10 20:09:29 +03:00
4114ace616 Fix T99536: new 3.2 OBJ importer fails with trailing space after wrapped lines
Address the issue by re-working line continuation handling: stop
trying to parse sequences like "backslash, newline" (which is the
bug: it should also handle "backslash, possible whitespace, newline")
during parsing. Instead, fixup line continuations after reading chunks
of input file data - turn backslash and the following newline into
spaces. The rest of parsing code does not have to be aware of them
at all then.

Makes the file attached to T99536 load correctly now. Also will extend
one of the test files in subversion tests repo to contain backslashes
followed by newlines.
2022-07-10 18:27:38 +03:00
443690604f Fix cursor display size with tablet input in GHOST/Wayland
The scale for tablet cursor surfaces was never set, making them display
larger. Now the outputs scale is set for mouse & tablet cursors.
2022-07-09 22:46:53 +10:00
1de14061cb Cleanup: split out wl_buffer creation into a utility function
Simplify logic for initializing the wl_buffer, ensure the cursors
custom data is never heft in a half initialized state.
Also remove the need for multiple calls to close when handling errors.
2022-07-09 22:27:30 +10:00
9a1d772339 Cleanup: remove buffer_t in GHOST/Wayland
This was allocated and only used to store the custom cursor data.
Use a pointer & size member instead for simplicity.
2022-07-09 22:27:28 +10:00
ef970b7756 Cleanup: split memfd_create into it's own function for Wayland
Avoid ifdef's in cursor loading by creating a memfd_create_sealed
utility function that works irrespective of memfd_create availability.
2022-07-09 22:27:27 +10:00
b5d22a8134 Fix resource leaks setting custom cursors in Wayland
- Memory from the prior cursor was never un-mapped.
- posix_fallocate failure left a file handle open..
2022-07-09 22:27:26 +10:00
80f8b7cbbb UI: renaming fIle browser thumbnail sizes
Rename the thumbnail size from Regular to Medium since it's the typical
way to refer to sizing in American English

Reviewed By: Campbell Barton
Differential Revision: https://developer.blender.org/D15305
2022-07-09 02:09:07 -06:00
e3801a2bd4 Weight & Vertex Paint: always respect edit mode hiding on faces.
In some cases it is mandatory to be able to hide parts of the mesh
in order to paint certain areas. The Mask modifier doesn't work in
weight paint, and edit mode hiding requires using selection, which
is not always convenient.

This makes the weight and vertex paint modes always respect edit mode
hiding like sculpt mode. The change in behavior affects drawing and
building paint PBVH. Thus it affects brushes, but not menu operators
like Smooth or Normalize.

In addition, this makes the Alt-H shortcut available even without
any selection enabled, and implements Hide for vertex selection.

Differential Revision: https://developer.blender.org/D14163
2022-07-09 10:39:53 +03:00
d9e00fbbf6 Cleanup: quiet class-memaccess warning 2022-07-09 15:08:23 +10:00
bc2121147f Cleanup: Remove unused variable 2022-07-08 18:47:31 -05:00
2ee6891728 Fix T99494: Transition effects not working correctly
This was caused by strip content length and start position being
incorrect. Previously this was set from strip boundary by update
function, but it was removed.

Add back code to set effect strip start and length.

Previously content length was always 1 for effects, but now it must
correspond to strip length. Because of this workaround for speed effect
to get this apparent content length was removed.
2022-07-08 18:07:07 +02:00
d8e980a4a6 Fix bug in recently added MutableVArraySpan move constructor 2022-07-08 16:27:47 +02:00
8159e0a666 Curves: use consistent default radius for Cycles, Eevee, Set Curve Radius node
To avoid Cycles not showing any hair by default, and to avoid very slow render
due to many overlaps with the previous 1 meter default in the node.

Fixes T97584, T99319

Differential Revision: https://developer.blender.org/D15405
2022-07-08 16:21:32 +02:00
f639b59a29 Cleanup: convert brush.c to C++
In preparation of refactoring for texture nodes.
2022-07-08 16:21:32 +02:00
b876ce2a4a Geometry Nodes: new geometry attribute API
Currently, there are two attribute API. The first, defined in `BKE_attribute.h` is
accessible from RNA and C code. The second is implemented with `GeometryComponent`
and is only accessible in C++ code. The second is widely used, but only being
accessible through the `GeometrySet` API makes it awkward to use, and even impossible
for types that don't correspond directly to a geometry component like `CurvesGeometry`.

This patch adds a new attribute API, designed to replace the `GeometryComponent`
attribute API now, and to eventually replace or be the basis of the other one.

The basic idea is that there is an `AttributeAccessor` class that allows code to
interact with a set of attributes owned by some geometry. The accessor itself has
no ownership. `AttributeAccessor` is a simple type that can be passed around by
value. That makes it easy to return it from functions and to store it in containers.

For const-correctness, there is also a `MutableAttributeAccessor` that allows
changing individual and can add or remove attributes.

Currently, `AttributeAccessor` is composed of two pointers. The first is a pointer
to the owner of the attribute data. The second is a pointer to a struct with
function pointers, that is similar to a virtual function table. The functions
know how to access attributes on the owner.

The actual attribute access for geometries is still implemented with the `AttributeProvider`
pattern, which makes it easy to support different sources of attributes on a
geometry and simplifies dealing with built-in attributes.

There are different ways to get an attribute accessor for a geometry:
* `GeometryComponent.attributes()`
* `CurvesGeometry.attributes()`
* `bke::mesh_attributes(const Mesh &)`
* `bke::pointcloud_attributes(const PointCloud &)`

All of these also have a `_for_write` variant that returns a `MutabelAttributeAccessor`.

Differential Revision: https://developer.blender.org/D15280
2022-07-08 16:16:56 +02:00
f391e8f316 Linux: Move Mesa software OpenGL libraries to sub-directory
Allows to put libraries which are always needed by Blender into the
lib/ folder and not worry about OpenGL libraries picked up from there.

Currently no functional changes as we do not yet have dynamic libraries
which we load at startup. It allows to use direct linking of oneAPI
Cycles device (see D15397), also it is something which would need to
happen to support USD/Hydra/TBB compiler as dynamic libraries in the
future.

Differential Revision: https://developer.blender.org/D15403
2022-07-08 15:44:06 +02:00
0f50ae131f Cycles: enable oneAPI in Linux release builds
with a very high min-driver version requirement, placeholder until JIT
CentOS runtime compilation issue gets fixed in a defined version.
min-driver version check can be worked around by setting
CYCLES_ONEAPI_ALL_DEVICES environment variable.
2022-07-08 15:39:13 +02:00
5723bf926d Fix T99191: Boolean modifier creates invalid material indices
Similar to 1a6d0ec71c which changed the mesh boolean node (and
also caused this bug), this commit changes the material mapping for the
exact mode of the boolean modifier. Now the result should contain any
material on the faces of the input objects (including materials linked
to objects and meshes). The improvement is possible because materials
can be changed during evaluation (as of 1a81d268a1).

Differential Revision: https://developer.blender.org/D15365
2022-07-08 08:32:32 -05:00
becb1530b1 Hair Curves: The new curves object is now available
This commit doesn't implement any new feature but makes the new curves
object type no longer experimental.

Documentation:

* https://docs.blender.org/manual/en/3.3/modeling/curves/primitives.html#empty-hair
* https://docs.blender.org/manual/en/3.3/sculpt_paint/curves_sculpting/introduction.html

Note: This also makes the Selection Paint tool available. This tool
should have been moved out of the "New Curves Tool" flag when we got the
selection drawing to work.

Differential Revision: https://developer.blender.org/D15402
2022-07-08 15:11:32 +02:00
2c55d8c1cf Cleanup: make format 2022-07-08 15:11:32 +02:00
05b38ecc78 Curves: support deforming curves on surface
Curves that are attached to a surface can now follow the surface when
it is modified using shape keys or modifiers (but not when the original
surface is deformed in edit or sculpt mode).

The surface is allowed to be changed in any way that keeps uv maps
intact. So deformation is allowed, but also some topology changes like
subdivision.

The following features are added:
* A new `Deform Curves on Surface` node, which deforms curves with
  attachment information based on the surface object and uv map set
  in the properties panel.
* A new `Add Rest Position` checkbox in the shape keys panel. When checked,
  a new `rest_position` vector attribute is added to the mesh before shape
  keys and modifiers are applied. This is necessary to support proper
  deformation of the curves, but can also be used for other purposes.
* The `Add > Curve > Empty Hair` operator now sets up a simple geometry
  nodes setup that deforms the hair. It also makes sure that the rest
  position attribute is added to the surface.
* A new `Object (Attach Curves to Surface)` operator in the `Set Parent To`
  (ctrl+P) menu, which attaches existing curves to the surface and sets the
  surface object as parent.

Limitations:
* Sculpting the procedurally deformed curves will be implemented separately.
* The `Deform Curves on Surface` node is not generic and can only be used
  for one specific purpose currently. We plan to generalize this more in the
  future by adding support by exposing more inputs and/or by turning it into
  a node group.

Differential Revision: https://developer.blender.org/D14864
2022-07-08 14:47:10 +02:00
aa78278ef6 Fix build error without unity build, after recent changes 2022-07-08 14:38:21 +02:00
155bb95353 Fix Crash: Reading canvas tool settings.
Blender would crash when a file was saved where the tool settings is
set to paint on a single image (3d texture painting).

Reason is that the selected image memory address wasn't updated
when the new address.
2022-07-08 14:05:11 +02:00
Jeroen Bakker
a8f7d41d38 Draw: Curve outline drawing in object mode.
This patch adds (selected/active) outline around a curve object in object mode.

{F13270680}

In the past the draw bounds option was enabled for any curve objects. With this
patch it isn't needed and will be disabled.

In the future the curve outline could also be enabled to improve GPU selection.

Reviewed By: dfelinto, HooglyBoogly, fclem

Maniphest Tasks: T95933

Differential Revision: https://developer.blender.org/D15308
2022-07-08 12:08:31 +02:00
Damien Picard
2c4dfe3453 Add a few missing UI strings to translation.
Reviewed By: mont29

Differential Revision: https://developer.blender.org/D15392
2022-07-08 11:55:01 +02:00
754dae6c76 GHOST/Wayland: add logging for listener handlers
Add logging to all Wayland listener callbacks as it can be difficult
to detect the cause of problems.

Using break-points often isn't practical for debugging interactive
windowing / compositor issues

Logging needs to be enabled on the command line, e.g:

blender --log "ghost.wl.*" --log-level 2 --log-show-basename
2022-07-08 19:40:10 +10:00
47616992f8 GHOST: use ELEM/ARRAY_SIZE/UNPACK macros to avoid repetition
Also use UNLIKELY macro for checks for very unlikely scenarios.
2022-07-08 19:39:11 +10:00
418d82af28 GHOST: add GHOST_utildefines
Add macros from BLI_utildefines, mainly to avoid that avoid repetition
(ELEM, UNPACK*, CLAMP* & ARRAY_SIZE).

Also add macros LIKELY/UNLIKELY as there are quiet a lot of checks
for unlikely situations for GHOST/Wayland (not having a keyboard,
or mouse for e.g.).
2022-07-08 19:36:31 +10:00
6fc388743d Merge branch 'master' into asset-browser-grid-view 2022-06-15 20:20:21 +02:00
66204860ca Merge branch 'master' into asset-browser-grid-view 2022-06-02 13:01:29 +02:00
e1ced645fa Merge branch 'asset-browser-grid-view' into file-browser-grid-view 2022-05-27 11:33:26 +02:00
5162632a20 Merge branch 'master' into asset-browser-grid-view 2022-05-27 11:33:05 +02:00
3f7015a79f Merge branch 'asset-browser-grid-view' into file-browser-grid-view 2022-05-20 12:01:21 +02:00
560c20e067 Fix wrong data passed after recent changes in master 2022-05-20 12:00:52 +02:00
63cc972402 Fix wrong data passed after recent changes in master 2022-05-20 12:00:16 +02:00
c8c8783088 Merge branch 'master' into asset-browser-grid-view 2022-05-20 11:46:53 +02:00
1049b0c818 Merge branch 'master' into asset-browser-grid-view 2022-05-18 22:43:03 +02:00
46934eaf25 Merge branch 'master' into asset-browser-grid-view 2022-05-10 17:09:32 +02:00
c67a650718 Merge branch 'master' into asset-browser-grid-view 2022-05-10 17:09:13 +02:00
065bc42ce5 Merge branch 'master' into asset-browser-grid-view 2022-04-06 11:37:28 +02:00
063689b8a8 Cleanup: Use new license header convention 2022-03-28 00:14:49 +02:00
eaa58a1607 Adopt to RNA prototypes changes in master (fix compile error) 2022-03-28 00:13:49 +02:00
72e67691ed Merge branch 'master' into asset-browser-grid-view 2022-03-27 23:56:26 +02:00
e1a8a15945 Merge branch 'master' into asset-browser-grid-view 2022-03-02 15:07:28 +01:00
e91e4bf8e5 Merge branch 'master' into asset-browser-grid-view 2022-02-28 14:32:46 +01:00
9ae2259db5 Solve redraw performance issue in huge asset libraries
In a big asset library (>3000 assets in the one I'm testing with), the asset
browser would get a notable redraw lag due to the O(n^2) complexity of the tree
item matching (to recognize items over multiple redraws and keep state like
selection, highlights, renaming, ...).
2022-02-17 23:50:41 +01:00
9bac0894f6 Merge branch 'master' into asset-browser-grid-view 2022-02-17 21:43:07 +01:00
731c1be92a Speedup preview icon loading
Significantly speeds up loading of previews, not just for assets but also
Python loaded custom previews. Patch will be submitted for master.
2022-02-17 21:41:22 +01:00
3d31ad823a Fix missing redraw when switching asset libraries 2022-02-17 21:40:17 +01:00
fdc5301205 Fix crash when loading asset library takes multiple redraws 2022-02-17 21:39:43 +01:00
2ce7f02a06 Fix missing (re)load when changing asset libraries
Logic to compare asset library references was incorrect.
2022-02-17 21:35:34 +01:00
91853d95a9 Merge branch 'master' into asset-browser-grid-view 2022-02-16 18:13:12 +01:00
eafd98c920 Remove preview-caching from asset-list API
This isn't needed anymore, see previous commit.
2022-02-16 18:10:40 +01:00
39eab45c8e Let UI do lazy-loading of previews, rather than file-list cache
This is an important step to decouple the asset-views from the file browser
backend. One downside is that the UI preview loading is much slower than the
file browser one, however, I'm pretty sure I know how to address this.
2022-02-16 17:59:27 +01:00
ef58467594 Store active item, asset metadata sidebar
Also:
- Fix double-free bug when closing Blender
- Fix issues with identifying assets with the same name.
- Add functions to store asset handle as pointer in the asset list storage,
  asset browser exposes it via context.
2022-02-15 21:44:08 +01:00
8f48dd8f72 Merge branch 'master' into asset-browser-grid-view 2022-02-14 17:57:52 +01:00
d46357dd25 Merge branch 'master' into asset-browser-grid-view 2022-02-14 17:57:16 +01:00
696295f849 Support active item
Makes activating assets in the Asset Browser work. The active item is only
stored at the UI level now, so it's not stored in files, not accessible via
context and is lost when changing editors. The Asset Browser itself will have
to get a way to store that.
2022-02-14 17:48:51 +01:00
c9c332f422 Fix failed assert with small region size
If there's not enough space to draw at least one item per row, there would be a
failed assert in the code to skip drawing items scrolled out of view.
2022-02-14 16:15:29 +01:00
18f8749fb7 Only add items to layout that are visible on screen
Basically we skip adding buttons for items that aren't visible, but
scrolled out of view. This already makes scrolling in very large
libraries smoother. However this also prepares the next step, where we
only load previews for items that are currently in view, which should
make the experience with large asset libraries better.
2022-02-11 20:30:54 +01:00
947c73578c Merge branch 'master' into asset-browser-grid-view 2022-02-11 18:18:59 +01:00
f41368cf02 Fix null-reference use 2022-02-11 18:18:15 +01:00
f29fa9895f Bring back editor pulldown menus, get operators to work, T for nav-bar
- Adds the View, Edit & Select pulldown menus
- Makes asset operators work (e.g. catalog management)
- Correct notifiers (fixing missing redraws)
- Add T shortcut to toggle toolbar
2022-02-11 18:13:48 +01:00
ff213c802c Fix failing assert when splitting new asset browser in 2
This is an issue in master already, but for many layouts it just isn't
triggered. Pretty sure the assert just shouldn't be executed in extreme
cases, where there is no width for the layout and max-sizes become < 0.
2022-02-09 19:35:07 +01:00
5fcf6822dd Reduce brightness of grid item highlight
Also show theme settings for the view item in the Theme Preferences.
Called it "Data-View Item" for now, to be evaluated.
2022-02-09 19:33:58 +01:00
9530fb60ad Fix crash when loading files with asset browser open
Some fun with static memory. When loading a file, the asset-library
service was destructed since some while ago. For the old asset browser
that wasn't a problem, since its storage was recreated from scratch. But
the new asset browser accesses the global asset library storage of the
asset system which is static and thus stays alive if a different file is
loaded.

For now just destruct the global asset library storage when loading
a new file.
2022-02-09 19:01:12 +01:00
86ea1ad5df Support reading and writing asset browsers from/to files 2022-02-09 18:53:21 +01:00
2011d1f6d0 Various GUI tweaks
- Tweak tile size to match previous Asset Browser better
- Reduce margins between preview tiles (grid view was ignoring `align`
  parameter).
- Improve placement of text below previews (allowing previews to be
  slightly bigger).
- Tweak margins of main asset browser layout.
2022-02-09 17:59:01 +01:00
f17ea3da02 Add theme colors for view items 2022-02-09 15:04:27 +01:00
400d7235c3 Mouse hover highlight for grid items
Like the tree-view rows, grid items use an overlapping layout to draw
the background and a custom layout on top. There is a new dedicated
button type for the grid view items.
Adds some related bits needed for persistent view-item state storage.
Also a bunch of code for the grid item button type can be shared with
the tree-row one, but I prefer doing that separately.
2022-02-09 14:41:11 +01:00
29fdd43605 Get changes in catalog tree to update the UI correctly
The catalog tree-view now sends appropriate notifiers and messages when
changing data. Either itself or other UIs (like the main asset browser
region) can then listen to these and update.
I try to design this in a way that the views become independent on the
editor displaying them, so the views can be reused in multiple places.
2022-02-08 16:29:49 +01:00
eab2a8479a Move asset catalog filtering to editors/assets
Such general asset-view functionality can go to the general editors
level, I think that makes the most sense.
2022-02-07 18:58:25 +01:00
ee013f44b5 Show asset catalog tree-view in navigation sidebar
Selecting catalogs doesn't work yet.
Includes some temporary changes needed to avoid conflicts between old
File/Asset Browser and new Asset Browser code.
2022-02-07 17:18:08 +01:00
f20814892a Merge branch 'master' into asset-browser-grid-view 2022-02-07 12:20:07 +01:00
39e3580065 Enforce a fixed column number for the grid layout
Avoid stretching items to fit the layout, they should always have a
fixed size.
2022-02-07 12:17:17 +01:00
d2c4918d77 Merge branch 'master' into asset-browser-grid-view 2022-02-07 11:13:31 +01:00
11d0c91ba5 Merge branch 'master' into asset-browser-grid-view 2022-02-04 16:58:04 +01:00
925b82efb0 Basic grid layout with big preview tiles
Draws the asset previews in a grid layout, similar to the "old" Asset
Browser. No interactivity is supported yet.
The layout is managed through the grid-view.
2022-02-04 15:18:40 +01:00
3df2e4e888 Display a basic list of assets, set up scrolling and notifier listening
* Display a non-interactive list of assets, updates as assets get
  loaded.
* Notifier listening happens via the view, so the grid-view supports
  listening to notifiers. This is what triggers regular redraws as
  assets get loaded.
* Scrolling may need more fine tuning, so that scroll-bars are
  hidden if they don't apply (e.g. don't show horizontal scroll-bar for
  the vertically expanding grid-view layout).
2022-02-03 15:36:05 +01:00
a54bd5fe19 Merge branch 'master' into asset-browser-grid-view 2022-01-31 23:51:12 +01:00
4b43bd820e Merge branch 'master' into asset-browser-grid-view 2022-01-31 23:45:08 +01:00
b8b7b0af70 Add basic asset library loading, general cleanups
Adds an asset library selector and prints the list of assets when drawing the
Asset Browser main window.
2022-01-31 23:39:29 +01:00
b3597b1128 Add Asset Browser as own editor
This is part of a (tentative) plan to split off the Asset Browser from
the File Browser, while adding a new grid-view API that generalizes most
of the UI. Both editors can use this and only have to implement their
case-specific logic. This then allows us to add a proper asset system,
that's not tied to the file browser backend anymore.
2022-01-28 19:27:37 +01:00
810 changed files with 17615 additions and 10549 deletions

View File

@@ -565,15 +565,19 @@ endif()
option(WITH_OPENGL "When off limits visibility of the opengl headers to just bf_gpu and gawain (temporary option for development purposes)" ON)
option(WITH_GLEW_ES "Switches to experimental copy of GLEW that has support for OpenGL ES. (temporary option for development purposes)" OFF)
option(WITH_GL_PROFILE_ES20 "Support using OpenGL ES 2.0. (through either EGL or the AGL/WGL/XGL 'es20' profile)" OFF)
option(WITH_GPU_SHADER_BUILDER "Shader builder is a developer option enabling linting on GLSL during compilation" OFF)
option(WITH_GPU_BUILDTIME_SHADER_BUILDER "Shader builder is a developer option enabling linting on GLSL during compilation" OFF)
mark_as_advanced(
WITH_OPENGL
WITH_GLEW_ES
WITH_GL_PROFILE_ES20
WITH_GPU_SHADER_BUILDER
WITH_GPU_BUILDTIME_SHADER_BUILDER
)
if(WITH_HEADLESS)
set(WITH_OPENGL OFF)
endif()
# Metal
if (APPLE)

View File

@@ -63,6 +63,8 @@ set(DPCPP_EXTRA_ARGS
-DPython3_ROOT_DIR=${LIBDIR}/python/
-DPython3_EXECUTABLE=${PYTHON_BINARY}
-DPYTHON_EXECUTABLE=${PYTHON_BINARY}
-DLLDB_ENABLE_CURSES=OFF
-DLLVM_ENABLE_TERMINFO=OFF
)
if(WIN32)

View File

@@ -10,18 +10,12 @@ set(EMBREE_EXTRA_ARGS
-DEMBREE_RAY_MASK=ON
-DEMBREE_FILTER_FUNCTION=ON
-DEMBREE_BACKFACE_CULLING=OFF
-DEMBREE_MAX_ISA=AVX2
-DEMBREE_TASKING_SYSTEM=TBB
-DEMBREE_TBB_ROOT=${LIBDIR}/tbb
-DTBB_ROOT=${LIBDIR}/tbb
-DTBB_STATIC_LIB=${TBB_STATIC_LIBRARY}
)
if(BLENDER_PLATFORM_ARM)
set(EMBREE_EXTRA_ARGS
${EMBREE_EXTRA_ARGS}
-DEMBREE_MAX_ISA=NEON)
else()
if (NOT BLENDER_PLATFORM_ARM)
set(EMBREE_EXTRA_ARGS
${EMBREE_EXTRA_ARGS}
-DEMBREE_MAX_ISA=AVX2)
@@ -30,23 +24,10 @@ endif()
if(TBB_STATIC_LIBRARY)
set(EMBREE_EXTRA_ARGS
${EMBREE_EXTRA_ARGS}
-DEMBREE_TBB_LIBRARY_NAME=tbb_static
-DEMBREE_TBBMALLOC_LIBRARY_NAME=tbbmalloc_static
-DEMBREE_TBB_COMPONENT=tbb_static
)
endif()
if(WIN32)
set(EMBREE_BUILD_DIR ${BUILD_MODE}/)
if(BUILD_MODE STREQUAL Debug)
list(APPEND EMBREE_EXTRA_ARGS
-DEMBREE_TBBMALLOC_LIBRARY_NAME=tbbmalloc_debug
-DEMBREE_TBB_LIBRARY_NAME=tbb_debug
)
endif()
else()
set(EMBREE_BUILD_DIR)
endif()
ExternalProject_Add(external_embree
URL file://${PACKAGE_DIR}/${EMBREE_FILE}
DOWNLOAD_DIR ${DOWNLOAD_DIR}

View File

@@ -410,9 +410,9 @@ set(SQLITE_HASH fb558c49ee21a837713c4f1e7e413309aabdd9c7)
set(SQLITE_HASH_TYPE SHA1)
set(SQLITE_FILE sqlite-src-3240000.zip)
set(EMBREE_VERSION 3.13.3)
set(EMBREE_VERSION 3.13.4)
set(EMBREE_URI https://github.com/embree/embree/archive/v${EMBREE_VERSION}.zip)
set(EMBREE_HASH f62766ba54e48a2f327c3a22596e7133)
set(EMBREE_HASH 52d0be294d6c88ba7a6c9e046796e7be)
set(EMBREE_HASH_TYPE MD5)
set(EMBREE_FILE embree-v${EMBREE_VERSION}.zip)

View File

@@ -567,7 +567,7 @@ OPENCOLLADA_FORCE_BUILD=false
OPENCOLLADA_FORCE_REBUILD=false
OPENCOLLADA_SKIP=false
EMBREE_VERSION="3.13.3"
EMBREE_VERSION="3.13.4"
EMBREE_VERSION_SHORT="3.13"
EMBREE_VERSION_MIN="3.13"
EMBREE_VERSION_MEX="4.0"

View File

@@ -1,30 +1,37 @@
diff -Naur orig/common/sys/platform.h external_embree/common/sys/platform.h
--- orig/common/sys/platform.h 2020-05-13 23:08:53 -0600
+++ external_embree/common/sys/platform.h 2020-06-13 17:40:26 -0600
@@ -84,8 +84,8 @@
////////////////////////////////////////////////////////////////////////////////
diff -Naur org/kernels/rtcore_config.h.in embree-3.13.4/kernels/rtcore_config.h.in
--- org/kernels/rtcore_config.h.in 2022-06-14 22:13:52 -0600
+++ embree-3.13.4/kernels/rtcore_config.h.in 2022-06-24 15:20:12 -0600
@@ -14,6 +14,7 @@
#cmakedefine01 EMBREE_MIN_WIDTH
#define RTC_MIN_WIDTH EMBREE_MIN_WIDTH
+#cmakedefine EMBREE_STATIC_LIB
#cmakedefine EMBREE_API_NAMESPACE
#if defined(EMBREE_API_NAMESPACE)
diff --git a/kernels/CMakeLists.txt b/kernels/CMakeLists.txt
index 7c2f43d..106b1d5 100644
--- a/kernels/CMakeLists.txt
+++ b/kernels/CMakeLists.txt
@@ -201,6 +201,12 @@ embree_files(EMBREE_LIBRARY_FILES_AVX512 ${AVX512})
#message("AVX2: ${EMBREE_LIBRARY_FILES_AVX2}")
#message("AVX512: ${EMBREE_LIBRARY_FILES_AVX512}")
#ifdef __WIN32__
-#define dll_export __declspec(dllexport)
-#define dll_import __declspec(dllimport)
+#define dll_export
+#define dll_import
#else
#define dll_export __attribute__ ((visibility ("default")))
#define dll_import
diff --git orig/common/tasking/CMakeLists.txt external_embree/common/tasking/CMakeLists.txt
--- orig/common/tasking/CMakeLists.txt
+++ external_embree/common/tasking/CMakeLists.txt
@@ -27,7 +27,11 @@
else()
# If not found try getting older TBB via module (FindTBB.cmake)
unset(TBB_DIR CACHE)
- find_package(TBB 4.1 REQUIRED tbb)
+ if (TBB_STATIC_LIB)
+ find_package(TBB 4.1 REQUIRED tbb_static)
+ else()
+ find_package(TBB 4.1 REQUIRED tbb)
+ endif()
if (TBB_FOUND)
TARGET_LINK_LIBRARIES(tasking PUBLIC TBB)
TARGET_INCLUDE_DIRECTORIES(tasking PUBLIC "${TBB_INCLUDE_DIRS}")
+# Bundle Neon2x into the main static library.
+IF(EMBREE_ISA_NEON2X AND EMBREE_STATIC_LIB)
+ LIST(APPEND EMBREE_LIBRARY_FILES ${EMBREE_LIBRARY_FILES_AVX2})
+ LIST(REMOVE_DUPLICATES EMBREE_LIBRARY_FILES)
+ENDIF()
+
# replaces all .cpp files with a dummy file that includes that .cpp file
# this is to work around an ICC name mangling issue related to lambda functions under windows
MACRO (CreateISADummyFiles list isa)
@@ -277,7 +283,7 @@ IF (EMBREE_ISA_AVX AND EMBREE_LIBRARY_FILES_AVX)
ENDIF()
ENDIF()
-IF (EMBREE_ISA_AVX2 AND EMBREE_LIBRARY_FILES_AVX2)
+IF (EMBREE_ISA_AVX2 AND EMBREE_LIBRARY_FILES_AVX2 AND NOT (EMBREE_ISA_NEON2X AND EMBREE_STATIC_LIB))
DISABLE_STACK_PROTECTOR_FOR_INTERSECTORS(${EMBREE_LIBRARY_FILES_AVX2})
ADD_LIBRARY(embree_avx2 STATIC ${EMBREE_LIBRARY_FILES_AVX2})
TARGET_LINK_LIBRARIES(embree_avx2 PRIVATE tasking)

View File

@@ -71,7 +71,6 @@ if(NOT WIN32)
endif()
if(WIN32)
set(WITH_WASAPI ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_DEVICE_ONEAPI ON CACHE BOOL "" FORCE)
endif()
if(UNIX AND NOT APPLE)
set(WITH_DOC_MANPAGE ON CACHE BOOL "" FORCE)
@@ -92,6 +91,7 @@ if(NOT APPLE)
set(WITH_CYCLES_CUDA_BINARIES ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_CUBIN_COMPILER OFF CACHE BOOL "" FORCE)
set(WITH_CYCLES_HIP_BINARIES ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_DEVICE_ONEAPI ON CACHE BOOL "" FORCE)
# Disable AoT kernels compilations until buildbot can deliver them in a reasonabel time.
set(WITH_CYCLES_ONEAPI_BINARIES OFF CACHE BOOL "" FORCE)

View File

@@ -1529,7 +1529,8 @@ def pyrna2sphinx(basepath):
else:
fw(".. class:: %s\n\n" % struct_id)
fw(" %s\n\n" % struct.description)
write_indented_lines(" ", fw, struct.description, False)
fw("\n")
# Properties sorted in alphabetical order.
sorted_struct_properties = struct.properties[:]

View File

@@ -270,7 +270,7 @@ AUD_API int AUD_readSound(AUD_Sound* sound, float* buffer, int length, int sampl
return length;
}
AUD_API const char* AUD_mixdown(AUD_Sound* sound, unsigned int start, unsigned int length, unsigned int buffersize, const char* filename, AUD_DeviceSpecs specs, AUD_Container format, AUD_Codec codec, unsigned int bitrate, void(*callback)(float, void*), void* data)
AUD_API int AUD_mixdown(AUD_Sound* sound, unsigned int start, unsigned int length, unsigned int buffersize, const char* filename, AUD_DeviceSpecs specs, AUD_Container format, AUD_Codec codec, unsigned int bitrate, void(*callback)(float, void*), void* data, char* error, size_t errorsize)
{
try
{
@@ -282,15 +282,20 @@ AUD_API const char* AUD_mixdown(AUD_Sound* sound, unsigned int start, unsigned i
std::shared_ptr<IWriter> writer = FileWriter::createWriter(filename, convCToDSpec(specs), static_cast<Container>(format), static_cast<Codec>(codec), bitrate);
FileWriter::writeReader(reader, writer, length, buffersize, callback, data);
return nullptr;
return true;
}
catch(Exception& e)
{
return e.getMessage().c_str();
if(error && errorsize)
{
std::strncpy(error, e.getMessage().c_str(), errorsize);
error[errorsize - 1] = '\0';
}
return false;
}
}
AUD_API const char* AUD_mixdown_per_channel(AUD_Sound* sound, unsigned int start, unsigned int length, unsigned int buffersize, const char* filename, AUD_DeviceSpecs specs, AUD_Container format, AUD_Codec codec, unsigned int bitrate, void(*callback)(float, void*), void* data)
AUD_API int AUD_mixdown_per_channel(AUD_Sound* sound, unsigned int start, unsigned int length, unsigned int buffersize, const char* filename, AUD_DeviceSpecs specs, AUD_Container format, AUD_Codec codec, unsigned int bitrate, void(*callback)(float, void*), void* data, char* error, size_t errorsize)
{
try
{
@@ -328,11 +333,16 @@ AUD_API const char* AUD_mixdown_per_channel(AUD_Sound* sound, unsigned int start
reader->seek(start);
FileWriter::writeReader(reader, writers, length, buffersize, callback, data);
return nullptr;
return true;
}
catch(Exception& e)
{
return e.getMessage().c_str();
if(error && errorsize)
{
std::strncpy(error, e.getMessage().c_str(), errorsize);
error[errorsize - 1] = '\0';
}
return false;
}
}

View File

@@ -70,13 +70,15 @@ extern AUD_API int AUD_readSound(AUD_Sound* sound, float* buffer, int length, in
* \param bitrate The bitrate for encoding.
* \param callback A callback function that is called periodically during mixdown, reporting progress if length > 0. Can be NULL.
* \param data Pass through parameter that is passed to the callback.
* \return An error message or NULL in case of success.
* \param error String buffer to copy the error message to in case of failure.
* \param errorsize The size of the error buffer.
* \return Whether or not the operation succeeded.
*/
extern AUD_API const char* AUD_mixdown(AUD_Sound* sound, unsigned int start, unsigned int length,
extern AUD_API int AUD_mixdown(AUD_Sound* sound, unsigned int start, unsigned int length,
unsigned int buffersize, const char* filename,
AUD_DeviceSpecs specs, AUD_Container format,
AUD_Codec codec, unsigned int bitrate,
void(*callback)(float, void*), void* data);
void(*callback)(float, void*), void* data, char* error, size_t errorsize);
/**
* Mixes a sound down into multiple files.
@@ -91,13 +93,15 @@ extern AUD_API const char* AUD_mixdown(AUD_Sound* sound, unsigned int start, uns
* \param bitrate The bitrate for encoding.
* \param callback A callback function that is called periodically during mixdown, reporting progress if length > 0. Can be NULL.
* \param data Pass through parameter that is passed to the callback.
* \return An error message or NULL in case of success.
* \param error String buffer to copy the error message to in case of failure.
* \param errorsize The size of the error buffer.
* \return Whether or not the operation succeeded.
*/
extern AUD_API const char* AUD_mixdown_per_channel(AUD_Sound* sound, unsigned int start, unsigned int length,
extern AUD_API int AUD_mixdown_per_channel(AUD_Sound* sound, unsigned int start, unsigned int length,
unsigned int buffersize, const char* filename,
AUD_DeviceSpecs specs, AUD_Container format,
AUD_Codec codec, unsigned int bitrate,
void(*callback)(float, void*), void* data);
void(*callback)(float, void*), void* data, char* error, size_t errorsize);
/**
* Opens a read device and prepares it for mixdown of the sound scene.

View File

@@ -128,10 +128,6 @@ if(WITH_OPENIMAGEDENOISE)
)
endif()
if(WITH_EXPERIMENTAL_FEATURES)
add_definitions(-DWITH_NEW_CURVES_TYPE)
endif()
blender_add_lib(bf_intern_cycles "${SRC}" "${INC}" "${INC_SYS}" "${LIB}")
add_dependencies(bf_intern_cycles bf_rna)

View File

@@ -84,10 +84,36 @@ class AddPresetViewportSampling(AddPresetBase, Operator):
preset_subdir = "cycles/viewport_sampling"
class AddPresetPerformance(AddPresetBase, Operator):
'''Add an Performance Preset'''
bl_idname = "render.cycles_performance_preset_add"
bl_label = "Add Performance Preset"
preset_menu = "CYCLES_PT_performance_presets"
preset_defines = [
"render = bpy.context.scene.render"
"cycles = bpy.context.scene.cycles"
]
preset_values = [
"render.threads_mode",
"render.use_persistent_data",
"cycles.debug_use_spatial_splits",
"cycles.debug_use_compact_bvh",
"cycles.debug_use_hair_bvh",
"cycles.debug_bvh_time_steps",
"cycles.use_auto_tile",
"cycles.tile_size",
]
preset_subdir = "cycles/performance"
classes = (
AddPresetIntegrator,
AddPresetSampling,
AddPresetViewportSampling,
AddPresetPerformance,
)

View File

@@ -693,7 +693,7 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
debug_use_compact_bvh: BoolProperty(
name="Use Compact BVH",
description="Use compact BVH structure (uses less ram but renders slower)",
default=True,
default=False,
)
debug_bvh_time_steps: IntProperty(
name="BVH Time Steps",
@@ -1560,7 +1560,7 @@ class CyclesPreferences(bpy.types.AddonPreferences):
if sys.platform.startswith("win"):
col.label(text="and Windows driver version 101.1660 or newer", icon='BLANK1')
elif sys.platform.startswith("linux"):
col.label(text="and Linux driver version xx.xx.20066 or newer", icon='BLANK1')
col.label(text="and Linux driver version xx.xx.23570 or newer", icon='BLANK1')
elif device_type == 'METAL':
col.label(text="Requires Apple Silicon with macOS 12.2 or newer", icon='BLANK1')
col.label(text="or AMD with macOS 12.3 or newer", icon='BLANK1')

View File

@@ -43,6 +43,12 @@ class CYCLES_PT_integrator_presets(CyclesPresetPanel):
preset_add_operator = "render.cycles_integrator_preset_add"
class CYCLES_PT_performance_presets(CyclesPresetPanel):
bl_label = "Performance Presets"
preset_subdir = "cycles/performance"
preset_add_operator = "render.cycles_performance_preset_add"
class CyclesButtonsPanel:
bl_space_type = "PROPERTIES"
bl_region_type = "WINDOW"
@@ -624,6 +630,9 @@ class CYCLES_RENDER_PT_performance(CyclesButtonsPanel, Panel):
bl_label = "Performance"
bl_options = {'DEFAULT_CLOSED'}
def draw_header_preset(self, context):
CYCLES_PT_performance_presets.draw_panel_header(self.layout)
def draw(self, context):
pass
@@ -943,6 +952,8 @@ class CYCLES_CAMERA_PT_dof(CyclesButtonsPanel, Panel):
col = split.column()
col.prop(dof, "focus_object", text="Focus Object")
if dof.focus_object and dof.focus_object.type == 'ARMATURE':
col.prop_search(dof, "focus_subtarget", dof.focus_object.data, "bones", text="Focus Bone")
sub = col.row()
sub.active = dof.focus_object is None
@@ -1202,7 +1213,7 @@ class CYCLES_OBJECT_PT_lightgroup(CyclesButtonsPanel, Panel):
sub.prop_search(ob, "lightgroup", view_layer, "lightgroups", text="Light Group", results_are_suggestions=True)
sub = row.column(align=True)
sub.active = bool(ob.lightgroup) and not any(lg.name == ob.lightgroup for lg in view_layer.lightgroups)
sub.enabled = bool(ob.lightgroup) and not any(lg.name == ob.lightgroup for lg in view_layer.lightgroups)
sub.operator("scene.view_layer_add_lightgroup", icon='ADD', text="").name = ob.lightgroup
@@ -1640,7 +1651,7 @@ class CYCLES_WORLD_PT_settings_light_group(CyclesButtonsPanel, Panel):
)
sub = row.column(align=True)
sub.active = bool(world.lightgroup) and not any(lg.name == world.lightgroup for lg in view_layer.lightgroups)
sub.enabled = bool(world.lightgroup) and not any(lg.name == world.lightgroup for lg in view_layer.lightgroups)
sub.operator("scene.view_layer_add_lightgroup", icon='ADD', text="").name = world.lightgroup
@@ -2269,6 +2280,7 @@ classes = (
CYCLES_PT_sampling_presets,
CYCLES_PT_viewport_sampling_presets,
CYCLES_PT_integrator_presets,
CYCLES_PT_performance_presets,
CYCLES_RENDER_PT_sampling,
CYCLES_RENDER_PT_sampling_viewport,
CYCLES_RENDER_PT_sampling_viewport_denoise,

View File

@@ -143,11 +143,20 @@ static float blender_camera_focal_distance(BL::RenderEngine &b_engine,
if (!b_dof_object)
return b_camera.dof().focus_distance();
Transform dofmat = get_transform(b_dof_object.matrix_world());
string focus_subtarget = b_camera.dof().focus_subtarget();
if (b_dof_object.pose() && !focus_subtarget.empty()) {
BL::PoseBone b_bone = b_dof_object.pose().bones[focus_subtarget];
if (b_bone) {
dofmat = dofmat * get_transform(b_bone.matrix());
}
}
/* for dof object, return distance along camera Z direction */
BL::Array<float, 16> b_ob_matrix;
b_engine.camera_model_matrix(b_ob, bcam->use_spherical_stereo, b_ob_matrix);
Transform obmat = transform_clear_scale(get_transform(b_ob_matrix));
Transform dofmat = get_transform(b_dof_object.matrix_world());
float3 view_dir = normalize(transform_get_column(&obmat, 2));
float3 dof_dir = transform_get_column(&obmat, 3) - transform_get_column(&dofmat, 3);
return fabsf(dot(view_dir, dof_dir));

View File

@@ -613,8 +613,6 @@ void BlenderSync::sync_particle_hair(
}
}
#ifdef WITH_NEW_CURVES_TYPE
static std::optional<BL::FloatAttribute> find_curves_radius_attribute(BL::Curves b_curves)
{
for (BL::Attribute &b_attribute : b_curves.attributes) {
@@ -632,6 +630,25 @@ static std::optional<BL::FloatAttribute> find_curves_radius_attribute(BL::Curves
return std::nullopt;
}
static BL::FloatVectorAttribute find_curves_position_attribute(BL::Curves b_curves)
{
for (BL::Attribute &b_attribute : b_curves.attributes) {
if (b_attribute.name() != "position") {
continue;
}
if (b_attribute.domain() != BL::Attribute::domain_POINT) {
continue;
}
if (b_attribute.data_type() != BL::Attribute::data_type_FLOAT_VECTOR) {
continue;
}
return BL::FloatVectorAttribute{b_attribute};
}
/* The position attribute must exist. */
assert(false);
return BL::FloatVectorAttribute{b_curves.attributes[0]};
}
template<typename TypeInCycles, typename GetValueAtIndex>
static void fill_generic_attribute(BL::Curves &b_curves,
TypeInCycles *data,
@@ -795,16 +812,16 @@ static void attr_create_generic(Scene *scene,
}
}
static float4 hair_point_as_float4(BL::Curves b_curves,
static float4 hair_point_as_float4(BL::FloatVectorAttribute b_attr_position,
std::optional<BL::FloatAttribute> b_attr_radius,
const int index)
{
float4 mP = float3_to_float4(get_float3(b_curves.position_data[index].vector()));
float4 mP = float3_to_float4(get_float3(b_attr_position.data[index].vector()));
mP.w = b_attr_radius ? b_attr_radius->data[index].value() : 0.0f;
return mP;
}
static float4 interpolate_hair_points(BL::Curves b_curves,
static float4 interpolate_hair_points(BL::FloatVectorAttribute b_attr_position,
std::optional<BL::FloatAttribute> b_attr_radius,
const int first_point_index,
const int num_points,
@@ -814,8 +831,8 @@ static float4 interpolate_hair_points(BL::Curves b_curves,
const int point_a = clamp((int)curve_t, 0, num_points - 1);
const int point_b = min(point_a + 1, num_points - 1);
const float t = curve_t - (float)point_a;
return lerp(hair_point_as_float4(b_curves, b_attr_radius, first_point_index + point_a),
hair_point_as_float4(b_curves, b_attr_radius, first_point_index + point_b),
return lerp(hair_point_as_float4(b_attr_position, b_attr_radius, first_point_index + point_a),
hair_point_as_float4(b_attr_position, b_attr_radius, first_point_index + point_b),
t);
}
@@ -848,6 +865,7 @@ static void export_hair_curves(Scene *scene,
hair->reserve_curves(num_curves, num_keys);
BL::FloatVectorAttribute b_attr_position = find_curves_position_attribute(b_curves);
std::optional<BL::FloatAttribute> b_attr_radius = find_curves_radius_attribute(b_curves);
/* Export curves and points. */
@@ -866,9 +884,9 @@ static void export_hair_curves(Scene *scene,
/* Position and radius. */
for (int i = 0; i < num_points; i++) {
const float3 co = get_float3(b_curves.position_data[first_point_index + i].vector());
const float3 co = get_float3(b_attr_position.data[first_point_index + i].vector());
const float radius = b_attr_radius ? b_attr_radius->data[first_point_index + i].value() :
0.0f;
0.005f;
hair->add_curve_key(co, radius);
if (attr_intercept) {
@@ -923,6 +941,7 @@ static void export_hair_curves_motion(Hair *hair, BL::Curves b_curves, int motio
int num_motion_keys = 0;
int curve_index = 0;
BL::FloatVectorAttribute b_attr_position = find_curves_position_attribute(b_curves);
std::optional<BL::FloatAttribute> b_attr_radius = find_curves_radius_attribute(b_curves);
for (int i = 0; i < num_curves; i++) {
@@ -938,7 +957,7 @@ static void export_hair_curves_motion(Hair *hair, BL::Curves b_curves, int motio
int point_index = first_point_index + i;
if (point_index < num_keys) {
mP[num_motion_keys] = hair_point_as_float4(b_curves, b_attr_radius, point_index);
mP[num_motion_keys] = hair_point_as_float4(b_attr_position, b_attr_radius, point_index);
num_motion_keys++;
if (!have_motion) {
@@ -958,7 +977,7 @@ static void export_hair_curves_motion(Hair *hair, BL::Curves b_curves, int motio
for (int i = 0; i < curve.num_keys; i++) {
const float step = i * step_size;
mP[num_motion_keys] = interpolate_hair_points(
b_curves, b_attr_radius, first_point_index, num_points, step);
b_attr_position, b_attr_radius, first_point_index, num_points, step);
num_motion_keys++;
}
have_motion = true;
@@ -990,15 +1009,6 @@ void BlenderSync::sync_hair(Hair *hair, BObjectInfo &b_ob_info, bool motion, int
export_hair_curves(scene, hair, b_curves, need_motion, motion_scale);
}
}
#else
void BlenderSync::sync_hair(Hair *hair, BObjectInfo &b_ob_info, bool motion, int motion_step)
{
(void)hair;
(void)b_ob_info;
(void)motion;
(void)motion_step;
}
#endif
void BlenderSync::sync_hair(BL::Depsgraph b_depsgraph, BObjectInfo &b_ob_info, Hair *hair)
{
@@ -1010,14 +1020,11 @@ void BlenderSync::sync_hair(BL::Depsgraph b_depsgraph, BObjectInfo &b_ob_info, H
new_hair.set_used_shaders(used_shaders);
if (view_layer.use_hair) {
#ifdef WITH_NEW_CURVES_TYPE
if (b_ob_info.object_data.is_a(&RNA_Curves)) {
/* Hair object. */
sync_hair(&new_hair, b_ob_info, false);
}
else
#endif
{
else {
/* Particle hair. */
bool need_undeformed = new_hair.need_attribute(scene, ATTR_STD_GENERATED);
BL::Mesh b_mesh = object_to_mesh(
@@ -1064,15 +1071,12 @@ void BlenderSync::sync_hair_motion(BL::Depsgraph b_depsgraph,
/* Export deformed coordinates. */
if (ccl::BKE_object_is_deform_modified(b_ob_info, b_scene, preview)) {
#ifdef WITH_NEW_CURVES_TYPE
if (b_ob_info.object_data.is_a(&RNA_Curves)) {
/* Hair object. */
sync_hair(hair, b_ob_info, true, motion_step);
return;
}
else
#endif
{
else {
/* Particle hair. */
BL::Mesh b_mesh = object_to_mesh(
b_data, b_ob_info, b_depsgraph, false, Mesh::SUBDIVISION_NONE);

View File

@@ -18,11 +18,7 @@ CCL_NAMESPACE_BEGIN
static Geometry::Type determine_geom_type(BObjectInfo &b_ob_info, bool use_particle_hair)
{
#ifdef WITH_NEW_CURVES_TYPE
if (b_ob_info.object_data.is_a(&RNA_Curves) || use_particle_hair) {
#else
if (use_particle_hair) {
#endif
return Geometry::HAIR;
}
@@ -217,11 +213,7 @@ void BlenderSync::sync_geometry_motion(BL::Depsgraph &b_depsgraph,
if (progress.get_cancel())
return;
#ifdef WITH_NEW_CURVES_TYPE
if (b_ob_info.object_data.is_a(&RNA_Curves) || use_particle_hair) {
#else
if (use_particle_hair) {
#endif
Hair *hair = static_cast<Hair *>(geom);
sync_hair_motion(b_depsgraph, b_ob_info, hair, motion_step);
}

View File

@@ -1,8 +1,10 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#include "scene/pointcloud.h"
#include <optional>
#include "scene/attribute.h"
#include "scene/pointcloud.h"
#include "scene/scene.h"
#include "blender/sync.h"
@@ -138,6 +140,36 @@ static void copy_attributes(PointCloud *pointcloud,
}
}
static std::optional<BL::FloatAttribute> find_radius_attribute(BL::PointCloud b_pointcloud)
{
for (BL::Attribute &b_attribute : b_pointcloud.attributes) {
if (b_attribute.name() != "radius") {
continue;
}
if (b_attribute.data_type() != BL::Attribute::data_type_FLOAT) {
continue;
}
return BL::FloatAttribute{b_attribute};
}
return std::nullopt;
}
static BL::FloatVectorAttribute find_position_attribute(BL::PointCloud b_pointcloud)
{
for (BL::Attribute &b_attribute : b_pointcloud.attributes) {
if (b_attribute.name() != "position") {
continue;
}
if (b_attribute.data_type() != BL::Attribute::data_type_FLOAT_VECTOR) {
continue;
}
return BL::FloatVectorAttribute{b_attribute};
}
/* The position attribute must exist. */
assert(false);
return BL::FloatVectorAttribute{b_pointcloud.attributes[0]};
}
static void export_pointcloud(Scene *scene,
PointCloud *pointcloud,
BL::PointCloud b_pointcloud,
@@ -156,18 +188,18 @@ static void export_pointcloud(Scene *scene,
const int num_points = b_pointcloud.points.length();
pointcloud->reserve(num_points);
BL::FloatVectorAttribute b_attr_position = find_position_attribute(b_pointcloud);
std::optional<BL::FloatAttribute> b_attr_radius = find_radius_attribute(b_pointcloud);
/* Export points. */
BL::PointCloud::points_iterator b_point_iter;
for (b_pointcloud.points.begin(b_point_iter); b_point_iter != b_pointcloud.points.end();
++b_point_iter) {
BL::Point b_point = *b_point_iter;
const float3 co = get_float3(b_point.co());
const float radius = b_point.radius();
for (int i = 0; i < num_points; i++) {
const float3 co = get_float3(b_attr_position.data[i].vector());
const float radius = b_attr_radius ? b_attr_radius->data[i].value() : 0.0f;
pointcloud->add_point(co, radius);
/* Random number per point. */
if (attr_random != NULL) {
attr_random->add(hash_uint2_to_float(b_point.index(), 0));
attr_random->add(hash_uint2_to_float(i, 0));
}
}
@@ -195,14 +227,15 @@ static void export_pointcloud_motion(PointCloud *pointcloud,
int num_motion_points = 0;
const array<float3> &pointcloud_points = pointcloud->get_points();
BL::PointCloud::points_iterator b_point_iter;
for (b_pointcloud.points.begin(b_point_iter); b_point_iter != b_pointcloud.points.end();
++b_point_iter) {
BL::Point b_point = *b_point_iter;
BL::FloatVectorAttribute b_attr_position = find_position_attribute(b_pointcloud);
std::optional<BL::FloatAttribute> b_attr_radius = find_radius_attribute(b_pointcloud);
for (int i = 0; i < num_points; i++) {
if (num_motion_points < num_points) {
float3 P = get_float3(b_point.co());
P.w = b_point.radius();
const float3 co = get_float3(b_attr_position.data[i].vector());
const float radius = b_attr_radius ? b_attr_radius->data[i].value() : 0.0f;
float3 P = co;
P.w = radius;
mP[num_motion_points] = P;
have_motion = have_motion || (P != pointcloud_points[num_motion_points]);
num_motion_points++;

View File

@@ -129,7 +129,7 @@ class BVHParams {
top_level = false;
bvh_layout = BVH_LAYOUT_BVH2;
use_compact_structure = true;
use_compact_structure = false;
use_unaligned_nodes = false;
num_motion_curve_steps = 0;

View File

@@ -197,7 +197,7 @@ void CPUDevice::const_copy_to(const char *name, void *host, size_t size)
// Update scene handle (since it is different for each device on multi devices)
KernelData *const data = (KernelData *)host;
data->bvh.scene = embree_scene;
data->device_bvh = embree_scene;
}
#endif
kernel_const_copy(&kernel_globals, name, host, size);

View File

@@ -29,6 +29,7 @@ class DeviceQueue;
class Progress;
class CPUKernels;
class CPUKernelThreadGlobals;
class Scene;
/* Device Types */
@@ -186,6 +187,11 @@ class Device {
return 0;
}
/* Called after kernel texture setup, and prior to integrator state setup. */
virtual void optimize_for_scene(Scene * /*scene*/)
{
}
virtual bool is_resident(device_ptr /*key*/, Device *sub_device)
{
/* Memory is always resident if this is not a multi device, regardless of whether the pointer

View File

@@ -75,7 +75,8 @@ class MetalDevice : public Device {
std::vector<id<MTLTexture>> texture_slot_map;
bool use_metalrt = false;
bool use_function_specialisation = false;
MetalPipelineType kernel_specialization_level = PSO_GENERIC;
std::atomic_bool async_compile_and_load = false;
virtual BVHLayoutMask get_bvh_layout_mask() const override;
@@ -91,9 +92,7 @@ class MetalDevice : public Device {
bool use_adaptive_compilation();
string get_source(const uint kernel_features);
string compile_kernel(const uint kernel_features, const char *name);
void make_source(MetalPipelineType pso_type, const uint kernel_features);
virtual bool load_kernels(const uint kernel_features) override;
@@ -111,7 +110,9 @@ class MetalDevice : public Device {
virtual void build_bvh(BVH *bvh, Progress &progress, bool refit) override;
id<MTLLibrary> compile(string const &source);
virtual void optimize_for_scene(Scene *scene) override;
bool compile_and_load(MetalPipelineType pso_type);
/* ------------------------------------------------------------------ */
/* low-level memory management */

View File

@@ -6,6 +6,8 @@
# include "device/metal/device_impl.h"
# include "device/metal/device.h"
# include "scene/scene.h"
# include "util/debug.h"
# include "util/md5.h"
# include "util/path.h"
@@ -78,6 +80,10 @@ MetalDevice::MetalDevice(const DeviceInfo &info, Stats &stats, Profiler &profile
case METAL_GPU_APPLE: {
max_threads_per_threadgroup = 512;
use_metalrt = info.use_metalrt;
/* Specialize the intersection kernels on Apple GPUs by default as these can be built very
* quickly. */
kernel_specialization_level = PSO_SPECIALIZED_INTERSECT;
break;
}
}
@@ -90,6 +96,13 @@ MetalDevice::MetalDevice(const DeviceInfo &info, Stats &stats, Profiler &profile
capture_enabled = true;
}
if (auto envstr = getenv("CYCLES_METAL_SPECIALIZATION_LEVEL")) {
kernel_specialization_level = (MetalPipelineType)atoi(envstr);
}
metal_printf("kernel_specialization_level = %s\n",
kernel_type_as_string(
(MetalPipelineType)min((int)kernel_specialization_level, (int)PSO_NUM - 1)));
MTLArgumentDescriptor *arg_desc_params = [[MTLArgumentDescriptor alloc] init];
arg_desc_params.dataType = MTLDataTypePointer;
arg_desc_params.access = MTLArgumentAccessReadOnly;
@@ -209,61 +222,86 @@ bool MetalDevice::use_adaptive_compilation()
return DebugFlags().metal.adaptive_compile;
}
string MetalDevice::get_source(const uint kernel_features)
void MetalDevice::make_source(MetalPipelineType pso_type, const uint kernel_features)
{
string build_options;
string global_defines;
if (use_adaptive_compilation()) {
build_options += " -D__KERNEL_FEATURES__=" + to_string(kernel_features);
global_defines += "#define __KERNEL_FEATURES__ " + to_string(kernel_features) + "\n";
}
if (use_metalrt) {
build_options += "-D__METALRT__ ";
global_defines += "#define __METALRT__\n";
if (motion_blur) {
build_options += "-D__METALRT_MOTION__ ";
global_defines += "#define __METALRT_MOTION__\n";
}
}
# ifdef WITH_CYCLES_DEBUG
build_options += "-D__KERNEL_DEBUG__ ";
global_defines += "#define __KERNEL_DEBUG__\n";
# endif
switch (device_vendor) {
default:
break;
case METAL_GPU_INTEL:
build_options += "-D__KERNEL_METAL_INTEL__ ";
global_defines += "#define __KERNEL_METAL_INTEL__\n";
break;
case METAL_GPU_AMD:
build_options += "-D__KERNEL_METAL_AMD__ ";
global_defines += "#define __KERNEL_METAL_AMD__\n";
break;
case METAL_GPU_APPLE:
build_options += "-D__KERNEL_METAL_APPLE__ ";
global_defines += "#define __KERNEL_METAL_APPLE__\n";
break;
}
/* reformat -D defines list into compilable form */
vector<string> components;
string_replace(build_options, "-D", "");
string_split(components, build_options, " ");
string globalDefines;
for (const string &component : components) {
vector<string> assignments;
string_split(assignments, component, "=");
if (assignments.size() == 2)
globalDefines += string_printf(
"#define %s %s\n", assignments[0].c_str(), assignments[1].c_str());
else
globalDefines += string_printf("#define %s\n", assignments[0].c_str());
}
string source = globalDefines + "\n#include \"kernel/device/metal/kernel.metal\"\n";
string &source = this->source[pso_type];
source = "\n#include \"kernel/device/metal/kernel.metal\"\n";
source = path_source_replace_includes(source, path_get("source"));
metal_printf("Global defines:\n%s\n", globalDefines.c_str());
/* Perform any required specialization on the source.
* With Metal function constants we can generate a single variant of the kernel source which can
* be repeatedly respecialized.
*/
string baked_constants;
return source;
/* Replace specific KernelData "dot" dereferences with a Metal function_constant identifier of
* the same character length. Build a string of all active constant values which is then hashed
* in order to identify the PSO.
*/
if (pso_type != PSO_GENERIC) {
const double starttime = time_dt();
# define KERNEL_STRUCT_BEGIN(name, parent) \
string_replace_same_length(source, "kernel_data." #parent ".", "kernel_data_" #parent "_");
/* Add constants to md5 so that 'get_best_pipeline' is able to return a suitable match. */
# define KERNEL_STRUCT_MEMBER(parent, _type, name) \
baked_constants += string(#parent "." #name "=") + \
to_string(_type(launch_params.data.parent.name)) + "\n";
# include "kernel/data_template.h"
/* Opt in to all of available specializations. This can be made more granular for the
* PSO_SPECIALIZED_INTERSECT case in order to minimize the number of specialization requests,
* but the overhead should be negligible as these are very quick to (re)build and aren't
* serialized to disk via MTLBinaryArchives.
*/
global_defines += "#define __KERNEL_USE_DATA_CONSTANTS__\n";
metal_printf("KernelData patching took %.1f ms\n", (time_dt() - starttime) * 1000.0);
}
source = global_defines + source;
metal_printf("================\n%s================\n\%s================\n",
global_defines.c_str(),
baked_constants.c_str());
/* Generate an MD5 from the source and include any baked constants. This is used when caching
* PSOs. */
MD5Hash md5;
md5.append(baked_constants);
md5.append(source);
source_md5[pso_type] = md5.get_hex();
}
bool MetalDevice::load_kernels(const uint _kernel_features)
@@ -279,28 +317,22 @@ bool MetalDevice::load_kernels(const uint _kernel_features)
* active, but may still need to be rendered without motion blur if that isn't active as well. */
motion_blur = kernel_features & KERNEL_FEATURE_OBJECT_MOTION;
source[PSO_GENERIC] = get_source(kernel_features);
const double starttime = time_dt();
mtlLibrary[PSO_GENERIC] = compile(source[PSO_GENERIC]);
metal_printf("Front-end compilation finished in %.1f seconds (generic)\n",
time_dt() - starttime);
MD5Hash md5;
md5.append(source[PSO_GENERIC]);
source_md5[PSO_GENERIC] = md5.get_hex();
bool result = MetalDeviceKernels::load(this, false);
bool result = compile_and_load(PSO_GENERIC);
reserve_local_memory(kernel_features);
return result;
}
id<MTLLibrary> MetalDevice::compile(string const &source)
bool MetalDevice::compile_and_load(MetalPipelineType pso_type)
{
make_source(pso_type, kernel_features);
if (!MetalDeviceKernels::should_load_kernels(this, pso_type)) {
/* We already have a full set of matching pipelines which are cached or queued. */
metal_printf("%s kernels already requested\n", kernel_type_as_string(pso_type));
return true;
}
MTLCompileOptions *options = [[MTLCompileOptions alloc] init];
options.fastMathEnabled = YES;
@@ -308,19 +340,30 @@ id<MTLLibrary> MetalDevice::compile(string const &source)
options.languageVersion = MTLLanguageVersion2_4;
}
NSError *error = NULL;
id<MTLLibrary> mtlLibrary = [mtlDevice newLibraryWithSource:@(source.c_str())
options:options
error:&error];
if (getenv("CYCLES_METAL_PROFILING") || getenv("CYCLES_METAL_DEBUG")) {
path_write_text(path_cache_get(string_printf("%s.metal", kernel_type_as_string(pso_type))),
source[pso_type]);
}
if (!mtlLibrary) {
const double starttime = time_dt();
NSError *error = NULL;
mtlLibrary[pso_type] = [mtlDevice newLibraryWithSource:@(source[pso_type].c_str())
options:options
error:&error];
if (!mtlLibrary[pso_type]) {
NSString *err = [error localizedDescription];
set_error(string_printf("Failed to compile library:\n%s", [err UTF8String]));
}
metal_printf("Front-end compilation finished in %.1f seconds (%s)\n",
time_dt() - starttime,
kernel_type_as_string(pso_type));
[options release];
return mtlLibrary;
return MetalDeviceKernels::load(this, pso_type);
}
void MetalDevice::reserve_local_memory(const uint kernel_features)
@@ -627,6 +670,58 @@ device_ptr MetalDevice::mem_alloc_sub_ptr(device_memory &mem, size_t offset, siz
return 0;
}
void MetalDevice::optimize_for_scene(Scene *scene)
{
MetalPipelineType specialization_level = kernel_specialization_level;
if (specialization_level < PSO_SPECIALIZED_INTERSECT) {
return;
}
/* PSO_SPECIALIZED_INTERSECT kernels are fast to specialize, so we always load them
* synchronously. */
compile_and_load(PSO_SPECIALIZED_INTERSECT);
if (specialization_level < PSO_SPECIALIZED_SHADE) {
return;
}
if (!scene->params.background) {
/* Don't load PSO_SPECIALIZED_SHADE kernels during viewport rendering as they are slower to
* build. */
return;
}
/* PSO_SPECIALIZED_SHADE kernels are slower to specialize, so we load them asynchronously, and
* only if there isn't an existing load in flight.
*/
auto specialize_shade_fn = ^() {
compile_and_load(PSO_SPECIALIZED_SHADE);
async_compile_and_load = false;
};
bool async_specialize_shade = true;
/* Block if a per-kernel profiling is enabled (ensure steady rendering rate). */
if (getenv("CYCLES_METAL_PROFILING") != nullptr) {
async_specialize_shade = false;
}
if (async_specialize_shade) {
if (!async_compile_and_load) {
async_compile_and_load = true;
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0),
specialize_shade_fn);
}
else {
metal_printf(
"Async PSO_SPECIALIZED_SHADE load request already in progress - dropping request\n");
}
}
else {
specialize_shade_fn();
}
}
void MetalDevice::const_copy_to(const char *name, void *host, size_t size)
{
if (strcmp(name, "data") == 0) {
@@ -652,7 +747,7 @@ void MetalDevice::const_copy_to(const char *name, void *host, size_t size)
/* Update data storage pointers in launch parameters. */
if (strcmp(name, "integrator_state") == 0) {
/* IntegratorStateGPU is contiguous pointers */
const size_t pointer_block_size = sizeof(IntegratorStateGPU);
const size_t pointer_block_size = offsetof(IntegratorStateGPU, sort_partition_divisor);
update_launch_pointers(
offsetof(KernelParamsMetal, integrator_state), host, size, pointer_block_size);
}

View File

@@ -31,7 +31,7 @@ enum {
enum { METALRT_TABLE_DEFAULT, METALRT_TABLE_SHADOW, METALRT_TABLE_LOCAL, METALRT_TABLE_NUM };
/* Pipeline State Object types */
enum {
enum MetalPipelineType {
/* A kernel that can be used with all scenes, supporting all features.
* It is slow to compile, but only needs to be compiled once and is then
* cached for future render sessions. This allows a render to get underway
@@ -39,28 +39,33 @@ enum {
*/
PSO_GENERIC,
/* A kernel that is relatively quick to compile, but is specialized for the
* scene being rendered. It only contains the functionality and even baked in
* constants for values that means it needs to be recompiled whenever a
* dependent setting is changed. The render performance of this kernel is
* significantly faster though, and justifies the extra compile time.
/* A intersection kernel that is very quick to specialize and results in faster intersection
* kernel performance. It uses Metal function constants to replace several KernelData variables
* with fixed constants.
*/
/* METAL_WIP: This isn't used and will require more changes to enable. */
PSO_SPECIALISED,
PSO_SPECIALIZED_INTERSECT,
/* A shading kernel that is slow to specialize, but results in faster shading kernel performance
* rendered. It uses Metal function constants to replace several KernelData variables with fixed
* constants and short-circuit all unused SVM node case handlers.
*/
PSO_SPECIALIZED_SHADE,
PSO_NUM
};
const char *kernel_type_as_string(int kernel_type);
const char *kernel_type_as_string(MetalPipelineType pso_type);
struct MetalKernelPipeline {
void compile();
id<MTLLibrary> mtlLibrary = nil;
bool scene_specialized;
MetalPipelineType pso_type;
string source_md5;
size_t usage_count = 0;
KernelData kernel_data_;
bool use_metalrt;
bool metalrt_hair;
bool metalrt_hair_thick;
@@ -75,6 +80,8 @@ struct MetalKernelPipeline {
id<MTLComputePipelineState> pipeline = nil;
int num_threads_per_block = 0;
bool should_use_binary_archive() const;
string error_str;
API_AVAILABLE(macos(11.0))
@@ -85,7 +92,8 @@ struct MetalKernelPipeline {
/* Cache of Metal kernels for each DeviceKernel. */
namespace MetalDeviceKernels {
bool load(MetalDevice *device, bool scene_specialized);
bool should_load_kernels(MetalDevice *device, MetalPipelineType pso_type);
bool load(MetalDevice *device, MetalPipelineType pso_type);
const MetalKernelPipeline *get_best_pipeline(const MetalDevice *device, DeviceKernel kernel);
} /* namespace MetalDeviceKernels */

View File

@@ -5,6 +5,7 @@
# include "device/metal/kernel.h"
# include "device/metal/device_impl.h"
# include "kernel/device/metal/function_constants.h"
# include "util/md5.h"
# include "util/path.h"
# include "util/tbb.h"
@@ -16,13 +17,15 @@ CCL_NAMESPACE_BEGIN
/* limit to 2 MTLCompiler instances */
int max_mtlcompiler_threads = 2;
const char *kernel_type_as_string(int kernel_type)
const char *kernel_type_as_string(MetalPipelineType pso_type)
{
switch (kernel_type) {
switch (pso_type) {
case PSO_GENERIC:
return "PSO_GENERIC";
case PSO_SPECIALISED:
return "PSO_SPECIALISED";
case PSO_SPECIALIZED_INTERSECT:
return "PSO_SPECIALIZED_INTERSECT";
case PSO_SPECIALIZED_SHADE:
return "PSO_SPECIALIZED_SHADE";
default:
assert(0);
}
@@ -50,7 +53,11 @@ struct ShaderCache {
/* Non-blocking request for a kernel, optionally specialized to the scene being rendered by
* device. */
void load_kernel(DeviceKernel kernel, MetalDevice *device, bool scene_specialized);
void load_kernel(DeviceKernel kernel, MetalDevice *device, MetalPipelineType pso_type);
bool should_load_kernel(DeviceKernel device_kernel,
MetalDevice *device,
MetalPipelineType pso_type);
void wait_for_all();
@@ -139,9 +146,53 @@ void ShaderCache::compile_thread_func(int thread_index)
}
}
bool ShaderCache::should_load_kernel(DeviceKernel device_kernel,
MetalDevice *device,
MetalPipelineType pso_type)
{
if (device_kernel == DEVICE_KERNEL_INTEGRATOR_MEGAKERNEL) {
/* Skip megakernel. */
return false;
}
if (device_kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE) {
if ((device->kernel_features & KERNEL_FEATURE_NODE_RAYTRACE) == 0) {
/* Skip shade_surface_raytrace kernel if the scene doesn't require it. */
return false;
}
}
if (pso_type != PSO_GENERIC) {
/* Only specialize kernels where it can make an impact. */
if (device_kernel < DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST ||
device_kernel > DEVICE_KERNEL_INTEGRATOR_MEGAKERNEL) {
return false;
}
/* Only specialize shading / intersection kernels as requested. */
bool is_shade_kernel = (device_kernel >= DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND);
bool is_shade_pso = (pso_type == PSO_SPECIALIZED_SHADE);
if (is_shade_pso != is_shade_kernel) {
return false;
}
}
{
/* check whether the kernel has already been requested / cached */
thread_scoped_lock lock(cache_mutex);
for (auto &pipeline : pipelines[device_kernel]) {
if (pipeline->source_md5 == device->source_md5[pso_type]) {
return false;
}
}
}
return true;
}
void ShaderCache::load_kernel(DeviceKernel device_kernel,
MetalDevice *device,
bool scene_specialized)
MetalPipelineType pso_type)
{
{
/* create compiler threads on first run */
@@ -154,52 +205,21 @@ void ShaderCache::load_kernel(DeviceKernel device_kernel,
}
}
if (device_kernel == DEVICE_KERNEL_INTEGRATOR_MEGAKERNEL) {
/* skip megakernel */
if (!should_load_kernel(device_kernel, device, pso_type)) {
return;
}
if (scene_specialized) {
/* Only specialize kernels where it can make an impact. */
if (device_kernel < DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST ||
device_kernel > DEVICE_KERNEL_INTEGRATOR_MEGAKERNEL) {
return;
}
}
{
/* check whether the kernel has already been requested / cached */
thread_scoped_lock lock(cache_mutex);
for (auto &pipeline : pipelines[device_kernel]) {
if (scene_specialized) {
if (pipeline->source_md5 == device->source_md5[PSO_SPECIALISED]) {
/* we already requested a pipeline that is specialized for this kernel data */
metal_printf("Specialized kernel already requested (%s)\n",
device_kernel_as_string(device_kernel));
return;
}
}
else {
if (pipeline->source_md5 == device->source_md5[PSO_GENERIC]) {
/* we already requested a generic pipeline for this kernel */
metal_printf("Generic kernel already requested (%s)\n",
device_kernel_as_string(device_kernel));
return;
}
}
}
}
incomplete_requests++;
PipelineRequest request;
request.pipeline = new MetalKernelPipeline;
request.pipeline->scene_specialized = scene_specialized;
memcpy(&request.pipeline->kernel_data_,
&device->launch_params.data,
sizeof(request.pipeline->kernel_data_));
request.pipeline->pso_type = pso_type;
request.pipeline->mtlDevice = mtlDevice;
request.pipeline->source_md5 =
device->source_md5[scene_specialized ? PSO_SPECIALISED : PSO_GENERIC];
request.pipeline->mtlLibrary =
device->mtlLibrary[scene_specialized ? PSO_SPECIALISED : PSO_GENERIC];
request.pipeline->source_md5 = device->source_md5[pso_type];
request.pipeline->mtlLibrary = device->mtlLibrary[pso_type];
request.pipeline->device_kernel = device_kernel;
request.pipeline->threads_per_threadgroup = device->max_threads_per_threadgroup;
@@ -214,7 +234,24 @@ void ShaderCache::load_kernel(DeviceKernel device_kernel,
{
thread_scoped_lock lock(cache_mutex);
pipelines[device_kernel].push_back(unique_ptr<MetalKernelPipeline>(request.pipeline));
auto &collection = pipelines[device_kernel];
/* Cache up to 3 kernel variants with the same pso_type, purging oldest first. */
int max_entries_of_same_pso_type = 3;
for (int i = (int)collection.size() - 1; i >= 0; i--) {
if (collection[i]->pso_type == pso_type) {
max_entries_of_same_pso_type -= 1;
if (max_entries_of_same_pso_type == 0) {
metal_printf("Purging oldest %s:%s kernel from ShaderCache\n",
kernel_type_as_string(pso_type),
device_kernel_as_string(device_kernel));
collection.erase(collection.begin() + i);
break;
}
}
}
collection.push_back(unique_ptr<MetalKernelPipeline>(request.pipeline));
request_queue.push_back(request);
}
cond_var.notify_one();
@@ -248,8 +285,9 @@ MetalKernelPipeline *ShaderCache::get_best_pipeline(DeviceKernel kernel, const M
continue;
}
if (pipeline->scene_specialized) {
if (pipeline->source_md5 == device->source_md5[PSO_SPECIALISED]) {
if (pipeline->pso_type != PSO_GENERIC) {
if (pipeline->source_md5 == device->source_md5[PSO_SPECIALIZED_INTERSECT] ||
pipeline->source_md5 == device->source_md5[PSO_SPECIALIZED_SHADE]) {
best_pipeline = pipeline.get();
}
}
@@ -258,13 +296,65 @@ MetalKernelPipeline *ShaderCache::get_best_pipeline(DeviceKernel kernel, const M
}
}
if (best_pipeline->usage_count == 0 && best_pipeline->pso_type != PSO_GENERIC) {
metal_printf("Swapping in %s version of %s\n",
kernel_type_as_string(best_pipeline->pso_type),
device_kernel_as_string(kernel));
}
best_pipeline->usage_count += 1;
return best_pipeline;
}
bool MetalKernelPipeline::should_use_binary_archive() const
{
if (auto str = getenv("CYCLES_METAL_DISABLE_BINARY_ARCHIVES")) {
if (atoi(str) != 0) {
/* Don't archive if we have opted out by env var. */
return false;
}
}
if (pso_type == PSO_GENERIC) {
/* Archive the generic kernels. */
return true;
}
if (device_kernel >= DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND &&
device_kernel <= DEVICE_KERNEL_INTEGRATOR_SHADE_SHADOW) {
/* Archive all shade kernels - they take a long time to compile. */
return true;
}
/* The remaining kernels are all fast to compile. They may get cached by the system shader cache,
* but will be quick to regenerate if not. */
return false;
}
static MTLFunctionConstantValues *GetConstantValues(KernelData const *data = nullptr)
{
MTLFunctionConstantValues *constant_values = [MTLFunctionConstantValues new];
MTLDataType MTLDataType_int = MTLDataTypeInt;
MTLDataType MTLDataType_float = MTLDataTypeFloat;
MTLDataType MTLDataType_float4 = MTLDataTypeFloat4;
KernelData zero_data = {0};
if (!data) {
data = &zero_data;
}
# define KERNEL_STRUCT_MEMBER(parent, _type, name) \
[constant_values setConstantValue:&data->parent.name \
type:MTLDataType_##_type \
atIndex:KernelData_##parent##_##name];
# include "kernel/data_template.h"
return constant_values;
}
void MetalKernelPipeline::compile()
{
int pso_type = scene_specialized ? PSO_SPECIALISED : PSO_GENERIC;
const std::string function_name = std::string("cycles_metal_") +
device_kernel_as_string(device_kernel);
@@ -281,6 +371,17 @@ void MetalKernelPipeline::compile()
if (@available(macOS 11.0, *)) {
MTLFunctionDescriptor *func_desc = [MTLIntersectionFunctionDescriptor functionDescriptor];
func_desc.name = entryPoint;
if (pso_type == PSO_SPECIALIZED_SHADE) {
func_desc.constantValues = GetConstantValues(&kernel_data_);
}
else if (pso_type == PSO_SPECIALIZED_INTERSECT) {
func_desc.constantValues = GetConstantValues(&kernel_data_);
}
else {
func_desc.constantValues = GetConstantValues();
}
function = [mtlLibrary newFunctionWithDescriptor:func_desc error:&error];
}
@@ -427,10 +528,7 @@ void MetalKernelPipeline::compile()
MTLPipelineOption pipelineOptions = MTLPipelineOptionNone;
bool use_binary_archive = true;
if (auto str = getenv("CYCLES_METAL_DISABLE_BINARY_ARCHIVES")) {
use_binary_archive = (atoi(str) == 0);
}
bool use_binary_archive = should_use_binary_archive();
id<MTLBinaryArchive> archive = nil;
string metalbin_path;
@@ -608,17 +706,30 @@ void MetalKernelPipeline::compile()
}
}
bool MetalDeviceKernels::load(MetalDevice *device, bool scene_specialized)
bool MetalDeviceKernels::load(MetalDevice *device, MetalPipelineType pso_type)
{
const double starttime = time_dt();
auto shader_cache = get_shader_cache(device->mtlDevice);
for (int i = 0; i < DEVICE_KERNEL_NUM; i++) {
shader_cache->load_kernel((DeviceKernel)i, device, pso_type);
}
shader_cache->wait_for_all();
metal_printf("Back-end compilation finished in %.1f seconds (%s)\n",
time_dt() - starttime,
kernel_type_as_string(pso_type));
return true;
}
bool MetalDeviceKernels::should_load_kernels(MetalDevice *device, MetalPipelineType pso_type)
{
auto shader_cache = get_shader_cache(device->mtlDevice);
for (int i = 0; i < DEVICE_KERNEL_NUM; i++) {
shader_cache->load_kernel((DeviceKernel)i, device, scene_specialized);
if (shader_cache->should_load_kernel((DeviceKernel)i, device, pso_type)) {
return true;
}
}
if (!scene_specialized || getenv("CYCLES_METAL_PROFILING")) {
shader_cache->wait_for_all();
}
return true;
return false;
}
const MetalKernelPipeline *MetalDeviceKernels::get_best_pipeline(const MetalDevice *device,

View File

@@ -24,6 +24,7 @@ class MetalDeviceQueue : public DeviceQueue {
virtual int num_concurrent_states(const size_t) const override;
virtual int num_concurrent_busy_states() const override;
virtual int num_sort_partition_elements() const override;
virtual void init_execution() override;

View File

@@ -293,6 +293,11 @@ int MetalDeviceQueue::num_concurrent_busy_states() const
return result;
}
int MetalDeviceQueue::num_sort_partition_elements() const
{
return MetalInfo::optimal_sort_partition_elements(metal_device_->mtlDevice);
}
void MetalDeviceQueue::init_execution()
{
/* Synchronize all textures and memory copies before executing task. */
@@ -359,7 +364,7 @@ bool MetalDeviceQueue::enqueue(DeviceKernel kernel,
/* Prepare any non-pointer (i.e. plain-old-data) KernelParamsMetal data */
/* The plain-old-data is contiguous, continuing to the end of KernelParamsMetal */
size_t plain_old_launch_data_offset = offsetof(KernelParamsMetal, integrator_state) +
sizeof(IntegratorStateGPU);
offsetof(IntegratorStateGPU, sort_partition_divisor);
size_t plain_old_launch_data_size = sizeof(KernelParamsMetal) - plain_old_launch_data_offset;
memcpy(init_arg_buffer + globals_offsets + plain_old_launch_data_offset,
(uint8_t *)&metal_device_->launch_params + plain_old_launch_data_offset,
@@ -416,7 +421,7 @@ bool MetalDeviceQueue::enqueue(DeviceKernel kernel,
/* this relies on IntegratorStateGPU layout being contiguous device_ptrs */
const size_t pointer_block_end = offsetof(KernelParamsMetal, integrator_state) +
sizeof(IntegratorStateGPU);
offsetof(IntegratorStateGPU, sort_partition_divisor);
for (size_t offset = 0; offset < pointer_block_end; offset += sizeof(device_ptr)) {
int pointer_index = int(offset / sizeof(device_ptr));
MetalDevice::MetalMem *mmem = *(

View File

@@ -37,6 +37,7 @@ struct MetalInfo {
static int get_apple_gpu_core_count(id<MTLDevice> device);
static MetalGPUVendor get_device_vendor(id<MTLDevice> device);
static AppleGPUArchitecture get_apple_gpu_architecture(id<MTLDevice> device);
static int optimal_sort_partition_elements(id<MTLDevice> device);
static string get_device_name(id<MTLDevice> device);
};

View File

@@ -72,6 +72,21 @@ MetalGPUVendor MetalInfo::get_device_vendor(id<MTLDevice> device)
return METAL_GPU_UNKNOWN;
}
int MetalInfo::optimal_sort_partition_elements(id<MTLDevice> device)
{
if (auto str = getenv("CYCLES_METAL_SORT_PARTITION_ELEMENTS")) {
return atoi(str);
}
/* On M1 and M2 GPUs, we see better cache utilization if we partition the active indices before
* sorting each partition by material. Partitioning into chunks of 65536 elements results in an
* overall render time speedup of up to 15%. */
if (get_device_vendor(device) == METAL_GPU_APPLE) {
return 65536;
}
return 0;
}
vector<id<MTLDevice>> const &MetalInfo::get_usable_devices()
{
static vector<id<MTLDevice>> usable_devices;

View File

@@ -2047,7 +2047,7 @@ void OptiXDevice::const_copy_to(const char *name, void *host, size_t size)
/* Update traversable handle (since it is different for each device on multi devices). */
KernelData *const data = (KernelData *)host;
*(OptixTraversableHandle *)&data->bvh.scene = tlas_handle;
*(OptixTraversableHandle *)&data->device_bvh = tlas_handle;
update_launch_params(offsetof(KernelParamsOptiX, data), host, size);
return;

View File

@@ -105,6 +105,13 @@ class DeviceQueue {
* value. */
virtual int num_concurrent_busy_states() const = 0;
/* Number of elements in a partition of sorted shaders, that improves memory locality of
* integrator state fetch at the cost of decreased coherence for shader kernel execution. */
virtual int num_sort_partition_elements() const
{
return 65536;
}
/* Initialize execution of kernels on this queue.
*
* Will, for example, load all data required by the kernels from Device to global or path state.

View File

@@ -373,7 +373,7 @@ void PathTrace::path_trace(RenderWork &render_work)
work_balance_infos_[i].time_spent += work_time;
work_balance_infos_[i].occupancy = statistics.occupancy;
VLOG_WORK << "Rendered " << num_samples << " samples in " << work_time << " seconds ("
VLOG_INFO << "Rendered " << num_samples << " samples in " << work_time << " seconds ("
<< work_time / num_samples
<< " seconds per sample), occupancy: " << statistics.occupancy;
});

View File

@@ -181,27 +181,45 @@ void PathTraceWorkGPU::alloc_integrator_queue()
void PathTraceWorkGPU::alloc_integrator_sorting()
{
/* Compute sort partitions, to balance between memory locality and coherence.
* Sort partitioning becomes less effective when more shaders are in the wavefront. In lieu of a
* more sophisticated heuristic we simply disable sort partitioning if the shader count is high.
*/
num_sort_partitions_ = 1;
if (device_scene_->data.max_shaders < 300) {
const int num_elements = queue_->num_sort_partition_elements();
if (num_elements) {
num_sort_partitions_ = max(max_num_paths_ / num_elements, 1);
}
}
integrator_state_gpu_.sort_partition_divisor = (int)divide_up(max_num_paths_,
num_sort_partitions_);
/* Allocate arrays for shader sorting. */
const int max_shaders = device_scene_->data.max_shaders;
if (integrator_shader_sort_counter_.size() < max_shaders) {
integrator_shader_sort_counter_.alloc(max_shaders);
const int sort_buckets = device_scene_->data.max_shaders * num_sort_partitions_;
if (integrator_shader_sort_counter_.size() < sort_buckets) {
integrator_shader_sort_counter_.alloc(sort_buckets);
integrator_shader_sort_counter_.zero_to_device();
integrator_shader_raytrace_sort_counter_.alloc(max_shaders);
integrator_shader_raytrace_sort_counter_.zero_to_device();
integrator_shader_mnee_sort_counter_.alloc(max_shaders);
integrator_shader_mnee_sort_counter_.zero_to_device();
integrator_shader_sort_prefix_sum_.alloc(max_shaders);
integrator_shader_sort_prefix_sum_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE] =
(int *)integrator_shader_sort_counter_.device_pointer;
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE] =
(int *)integrator_shader_raytrace_sort_counter_.device_pointer;
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE] =
(int *)integrator_shader_mnee_sort_counter_.device_pointer;
if (device_scene_->data.kernel_features & KERNEL_FEATURE_NODE_RAYTRACE) {
integrator_shader_raytrace_sort_counter_.alloc(sort_buckets);
integrator_shader_raytrace_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE] =
(int *)integrator_shader_raytrace_sort_counter_.device_pointer;
}
if (device_scene_->data.kernel_features & KERNEL_FEATURE_MNEE) {
integrator_shader_mnee_sort_counter_.alloc(sort_buckets);
integrator_shader_mnee_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE] =
(int *)integrator_shader_mnee_sort_counter_.device_pointer;
}
integrator_shader_sort_prefix_sum_.alloc(sort_buckets);
integrator_shader_sort_prefix_sum_.zero_to_device();
}
}
@@ -333,8 +351,12 @@ void PathTraceWorkGPU::enqueue_reset()
queue_->enqueue(DEVICE_KERNEL_INTEGRATOR_RESET, max_num_paths_, args);
queue_->zero_to_device(integrator_queue_counter_);
queue_->zero_to_device(integrator_shader_sort_counter_);
queue_->zero_to_device(integrator_shader_raytrace_sort_counter_);
queue_->zero_to_device(integrator_shader_mnee_sort_counter_);
if (device_scene_->data.kernel_features & KERNEL_FEATURE_NODE_RAYTRACE) {
queue_->zero_to_device(integrator_shader_raytrace_sort_counter_);
}
if (device_scene_->data.kernel_features & KERNEL_FEATURE_MNEE) {
queue_->zero_to_device(integrator_shader_mnee_sort_counter_);
}
/* Tiles enqueue need to know number of active paths, which is based on this counter. Zero the
* counter on the host side because `zero_to_device()` is not doing it. */
@@ -486,9 +508,9 @@ void PathTraceWorkGPU::compute_sorted_queued_paths(DeviceKernel kernel,
/* Compute prefix sum of number of active paths with each shader. */
{
const int work_size = 1;
int max_shaders = device_scene_->data.max_shaders;
int sort_buckets = device_scene_->data.max_shaders * num_sort_partitions_;
DeviceKernelArguments args(&d_counter, &d_prefix_sum, &max_shaders);
DeviceKernelArguments args(&d_counter, &d_prefix_sum, &sort_buckets);
queue_->enqueue(DEVICE_KERNEL_PREFIX_SUM, work_size, args);
}

View File

@@ -156,6 +156,9 @@ class PathTraceWorkGPU : public PathTraceWork {
bool interop_use_checked_ = false;
bool interop_use_ = false;
/* Number of partitions to sort state indices into prior to material sort. */
int num_sort_partitions_;
/* Maximum number of concurrent integrator states. */
int max_num_paths_;

View File

@@ -79,6 +79,7 @@ set(SRC_KERNEL_DEVICE_METAL_HEADERS
device/metal/compat.h
device/metal/context_begin.h
device/metal/context_end.h
device/metal/function_constants.h
device/metal/globals.h
)
@@ -154,6 +155,7 @@ set(SRC_KERNEL_SVM_HEADERS
svm/math_util.h
svm/mix.h
svm/musgrave.h
svm/node_types_template.h
svm/noise.h
svm/noisetex.h
svm/normal.h
@@ -282,6 +284,7 @@ set(SRC_KERNEL_UTIL_HEADERS
set(SRC_KERNEL_TYPES_HEADERS
data_arrays.h
data_template.h
tables.h
types.h
)
@@ -844,10 +847,9 @@ if(WITH_CYCLES_DEVICE_ONEAPI)
else()
list(APPEND sycl_compiler_flags -fPIC)
# avoid getting __FAST_MATH__ to be defined for the graphics compiler on CentOS 7 until the compile-time issue it triggers gets fixed.
if(WITH_CYCLES_ONEAPI_BINARIES)
list(APPEND sycl_compiler_flags -fhonor-nans)
endif()
# We avoid getting __FAST_MATH__ to be defined when building on CentOS 7 until the compilation crash
# it triggers at either AoT or JIT stages gets fixed.
list(APPEND sycl_compiler_flags -fhonor-nans)
# add $ORIGIN to cycles_kernel_oneapi.so rpath so libsycl.so and
# libpi_level_zero.so can be placed next to it and get found.

View File

@@ -172,11 +172,11 @@ ccl_device_intersect bool scene_intersect(KernelGlobals kg,
ray_flags |= OPTIX_RAY_FLAG_TERMINATE_ON_FIRST_HIT;
}
optixTrace(scene_intersect_valid(ray) ? kernel_data.bvh.scene : 0,
optixTrace(scene_intersect_valid(ray) ? kernel_data.device_bvh : 0,
ray->P,
ray->D,
0.0f,
ray->t,
ray->tmin,
ray->tmax,
ray->time,
ray_mask,
ray_flags,
@@ -203,28 +203,28 @@ ccl_device_intersect bool scene_intersect(KernelGlobals kg,
#elif defined(__METALRT__)
if (!scene_intersect_valid(ray)) {
isect->t = ray->t;
isect->t = ray->tmax;
isect->type = PRIMITIVE_NONE;
return false;
}
# if defined(__KERNEL_DEBUG__)
if (is_null_instance_acceleration_structure(metal_ancillaries->accel_struct)) {
isect->t = ray->t;
isect->t = ray->tmax;
isect->type = PRIMITIVE_NONE;
kernel_assert(!"Invalid metal_ancillaries->accel_struct pointer");
return false;
}
if (is_null_intersection_function_table(metal_ancillaries->ift_default)) {
isect->t = ray->t;
isect->t = ray->tmax;
isect->type = PRIMITIVE_NONE;
kernel_assert(!"Invalid ift_default");
return false;
}
# endif
metal::raytracing::ray r(ray->P, ray->D, 0.0f, ray->t);
metal::raytracing::ray r(ray->P, ray->D, ray->tmin, ray->tmax);
metalrt_intersector_type metalrt_intersect;
if (!kernel_data.bvh.have_curves) {
@@ -263,7 +263,7 @@ ccl_device_intersect bool scene_intersect(KernelGlobals kg,
# endif
if (intersection.type == intersection_type::none) {
isect->t = ray->t;
isect->t = ray->tmax;
isect->type = PRIMITIVE_NONE;
return false;
@@ -295,14 +295,14 @@ ccl_device_intersect bool scene_intersect(KernelGlobals kg,
}
# ifdef __EMBREE__
if (kernel_data.bvh.scene) {
isect->t = ray->t;
if (kernel_data.device_bvh) {
isect->t = ray->tmax;
CCLIntersectContext ctx(kg, CCLIntersectContext::RAY_REGULAR);
IntersectContext rtc_ctx(&ctx);
RTCRayHit ray_hit;
ctx.ray = ray;
kernel_embree_setup_rayhit(*ray, ray_hit, visibility);
rtcIntersect1(kernel_data.bvh.scene, &rtc_ctx.context, &ray_hit);
rtcIntersect1(kernel_data.device_bvh, &rtc_ctx.context, &ray_hit);
if (ray_hit.hit.geomID != RTC_INVALID_GEOMETRY_ID &&
ray_hit.hit.primID != RTC_INVALID_GEOMETRY_ID) {
kernel_embree_convert_hit(kg, &ray_hit.ray, &ray_hit.hit, isect);
@@ -357,11 +357,11 @@ ccl_device_intersect bool scene_intersect_local(KernelGlobals kg,
if (local_isect) {
local_isect->num_hits = 0; /* Initialize hit count to zero. */
}
optixTrace(scene_intersect_valid(ray) ? kernel_data.bvh.scene : 0,
optixTrace(scene_intersect_valid(ray) ? kernel_data.device_bvh : 0,
ray->P,
ray->D,
0.0f,
ray->t,
ray->tmin,
ray->tmax,
ray->time,
0xFF,
/* Need to always call into __anyhit__kernel_optix_local_hit. */
@@ -405,7 +405,7 @@ ccl_device_intersect bool scene_intersect_local(KernelGlobals kg,
}
# endif
metal::raytracing::ray r(ray->P, ray->D, 0.0f, ray->t);
metal::raytracing::ray r(ray->P, ray->D, ray->tmin, ray->tmax);
metalrt_intersector_type metalrt_intersect;
metalrt_intersect.force_opacity(metal::raytracing::forced_opacity::non_opaque);
@@ -451,7 +451,7 @@ ccl_device_intersect bool scene_intersect_local(KernelGlobals kg,
}
# ifdef __EMBREE__
if (kernel_data.bvh.scene) {
if (kernel_data.device_bvh) {
const bool has_bvh = !(kernel_data_fetch(object_flag, local_object) &
SD_OBJECT_TRANSFORM_APPLIED);
CCLIntersectContext ctx(
@@ -470,13 +470,13 @@ ccl_device_intersect bool scene_intersect_local(KernelGlobals kg,
/* If this object has its own BVH, use it. */
if (has_bvh) {
RTCGeometry geom = rtcGetGeometry(kernel_data.bvh.scene, local_object * 2);
RTCGeometry geom = rtcGetGeometry(kernel_data.device_bvh, local_object * 2);
if (geom) {
float3 P = ray->P;
float3 dir = ray->D;
float3 idir = ray->D;
Transform ob_itfm;
rtc_ray.tfar = ray->t *
rtc_ray.tfar = ray->tmax *
bvh_instance_motion_push(kg, local_object, ray, &P, &dir, &idir, &ob_itfm);
/* bvh_instance_motion_push() returns the inverse transform but
* it's not needed here. */
@@ -496,7 +496,7 @@ ccl_device_intersect bool scene_intersect_local(KernelGlobals kg,
}
}
else {
rtcOccluded1(kernel_data.bvh.scene, &rtc_ctx.context, &rtc_ray);
rtcOccluded1(kernel_data.device_bvh, &rtc_ctx.context, &rtc_ray);
}
/* rtcOccluded1 sets tfar to -inf if a hit was found. */
@@ -539,11 +539,11 @@ ccl_device_intersect bool scene_intersect_shadow_all(KernelGlobals kg,
ray_mask = 0xFF;
}
optixTrace(scene_intersect_valid(ray) ? kernel_data.bvh.scene : 0,
optixTrace(scene_intersect_valid(ray) ? kernel_data.device_bvh : 0,
ray->P,
ray->D,
0.0f,
ray->t,
ray->tmin,
ray->tmax,
ray->time,
ray_mask,
/* Need to always call into __anyhit__kernel_optix_shadow_all_hit. */
@@ -582,7 +582,7 @@ ccl_device_intersect bool scene_intersect_shadow_all(KernelGlobals kg,
}
# endif
metal::raytracing::ray r(ray->P, ray->D, 0.0f, ray->t);
metal::raytracing::ray r(ray->P, ray->D, ray->tmin, ray->tmax);
metalrt_intersector_type metalrt_intersect;
metalrt_intersect.force_opacity(metal::raytracing::forced_opacity::non_opaque);
@@ -633,7 +633,7 @@ ccl_device_intersect bool scene_intersect_shadow_all(KernelGlobals kg,
}
# ifdef __EMBREE__
if (kernel_data.bvh.scene) {
if (kernel_data.device_bvh) {
CCLIntersectContext ctx(kg, CCLIntersectContext::RAY_SHADOW_ALL);
Intersection *isect_array = (Intersection *)state->shadow_isect;
ctx.isect_s = isect_array;
@@ -642,7 +642,7 @@ ccl_device_intersect bool scene_intersect_shadow_all(KernelGlobals kg,
IntersectContext rtc_ctx(&ctx);
RTCRay rtc_ray;
kernel_embree_setup_ray(*ray, rtc_ray, visibility);
rtcOccluded1(kernel_data.bvh.scene, &rtc_ctx.context, &rtc_ray);
rtcOccluded1(kernel_data.device_bvh, &rtc_ctx.context, &rtc_ray);
*num_recorded_hits = ctx.num_recorded_hits;
*throughput = ctx.throughput;
@@ -698,11 +698,11 @@ ccl_device_intersect bool scene_intersect_volume(KernelGlobals kg,
ray_mask = 0xFF;
}
optixTrace(scene_intersect_valid(ray) ? kernel_data.bvh.scene : 0,
optixTrace(scene_intersect_valid(ray) ? kernel_data.device_bvh : 0,
ray->P,
ray->D,
0.0f,
ray->t,
ray->tmin,
ray->tmax,
ray->time,
ray_mask,
/* Need to always call into __anyhit__kernel_optix_volume_test. */
@@ -744,7 +744,7 @@ ccl_device_intersect bool scene_intersect_volume(KernelGlobals kg,
}
# endif
metal::raytracing::ray r(ray->P, ray->D, 0.0f, ray->t);
metal::raytracing::ray r(ray->P, ray->D, ray->tmin, ray->tmax);
metalrt_intersector_type metalrt_intersect;
metalrt_intersect.force_opacity(metal::raytracing::forced_opacity::non_opaque);
@@ -825,7 +825,7 @@ ccl_device_intersect uint scene_intersect_volume_all(KernelGlobals kg,
}
# ifdef __EMBREE__
if (kernel_data.bvh.scene) {
if (kernel_data.device_bvh) {
CCLIntersectContext ctx(kg, CCLIntersectContext::RAY_VOLUME_ALL);
ctx.isect_s = isect;
ctx.max_hits = max_hits;
@@ -834,7 +834,7 @@ ccl_device_intersect uint scene_intersect_volume_all(KernelGlobals kg,
IntersectContext rtc_ctx(&ctx);
RTCRay rtc_ray;
kernel_embree_setup_ray(*ray, rtc_ray, visibility);
rtcOccluded1(kernel_data.bvh.scene, &rtc_ctx.context, &rtc_ray);
rtcOccluded1(kernel_data.device_bvh, &rtc_ctx.context, &rtc_ray);
return ctx.num_hits;
}
# endif /* __EMBREE__ */

View File

@@ -83,8 +83,8 @@ ccl_device_inline void kernel_embree_setup_ray(const Ray &ray,
rtc_ray.dir_x = ray.D.x;
rtc_ray.dir_y = ray.D.y;
rtc_ray.dir_z = ray.D.z;
rtc_ray.tnear = 0.0f;
rtc_ray.tfar = ray.t;
rtc_ray.tnear = ray.tmin;
rtc_ray.tfar = ray.tmax;
rtc_ray.time = ray.time;
rtc_ray.mask = visibility;
}
@@ -107,7 +107,7 @@ ccl_device_inline bool kernel_embree_is_self_intersection(const KernelGlobals kg
const int oID = hit->instID[0] / 2;
if ((ray->self.object == oID) || (ray->self.light_object == oID)) {
RTCScene inst_scene = (RTCScene)rtcGetGeometryUserData(
rtcGetGeometry(kernel_data.bvh.scene, hit->instID[0]));
rtcGetGeometry(kernel_data.device_bvh, hit->instID[0]));
const int pID = hit->primID +
(intptr_t)rtcGetGeometryUserData(rtcGetGeometry(inst_scene, hit->geomID));
status = intersection_skip_self_shadow(ray->self, oID, pID);
@@ -117,7 +117,7 @@ ccl_device_inline bool kernel_embree_is_self_intersection(const KernelGlobals kg
const int oID = hit->geomID / 2;
if ((ray->self.object == oID) || (ray->self.light_object == oID)) {
const int pID = hit->primID + (intptr_t)rtcGetGeometryUserData(
rtcGetGeometry(kernel_data.bvh.scene, hit->geomID));
rtcGetGeometry(kernel_data.device_bvh, hit->geomID));
status = intersection_skip_self_shadow(ray->self, oID, pID);
}
}
@@ -133,14 +133,14 @@ ccl_device_inline void kernel_embree_convert_hit(KernelGlobals kg,
isect->t = ray->tfar;
if (hit->instID[0] != RTC_INVALID_GEOMETRY_ID) {
RTCScene inst_scene = (RTCScene)rtcGetGeometryUserData(
rtcGetGeometry(kernel_data.bvh.scene, hit->instID[0]));
rtcGetGeometry(kernel_data.device_bvh, hit->instID[0]));
isect->prim = hit->primID +
(intptr_t)rtcGetGeometryUserData(rtcGetGeometry(inst_scene, hit->geomID));
isect->object = hit->instID[0] / 2;
}
else {
isect->prim = hit->primID + (intptr_t)rtcGetGeometryUserData(
rtcGetGeometry(kernel_data.bvh.scene, hit->geomID));
rtcGetGeometry(kernel_data.device_bvh, hit->geomID));
isect->object = hit->geomID / 2;
}
@@ -166,7 +166,7 @@ ccl_device_inline void kernel_embree_convert_sss_hit(
isect->v = hit->u;
isect->t = ray->tfar;
RTCScene inst_scene = (RTCScene)rtcGetGeometryUserData(
rtcGetGeometry(kernel_data.bvh.scene, object * 2));
rtcGetGeometry(kernel_data.device_bvh, object * 2));
isect->prim = hit->primID +
(intptr_t)rtcGetGeometryUserData(rtcGetGeometry(inst_scene, hit->geomID));
isect->object = object;

View File

@@ -47,8 +47,9 @@ ccl_device_inline
float3 P = ray->P;
float3 dir = bvh_clamp_direction(ray->D);
float3 idir = bvh_inverse_direction(dir);
float tmin = ray->tmin;
int object = OBJECT_NONE;
float isect_t = ray->t;
float isect_t = ray->tmax;
if (local_isect != NULL) {
local_isect->num_hits = 0;
@@ -59,10 +60,13 @@ ccl_device_inline
if (!(object_flag & SD_OBJECT_TRANSFORM_APPLIED)) {
#if BVH_FEATURE(BVH_MOTION)
Transform ob_itfm;
isect_t *= bvh_instance_motion_push(kg, local_object, ray, &P, &dir, &idir, &ob_itfm);
const float t_world_to_instance = bvh_instance_motion_push(
kg, local_object, ray, &P, &dir, &idir, &ob_itfm);
#else
isect_t *= bvh_instance_push(kg, local_object, ray, &P, &dir, &idir);
const float t_world_to_instance = bvh_instance_push(kg, local_object, ray, &P, &dir, &idir);
#endif
isect_t *= t_world_to_instance;
tmin *= t_world_to_instance;
object = local_object;
}
@@ -81,6 +85,7 @@ ccl_device_inline
dir,
#endif
idir,
tmin,
isect_t,
node_addr,
PATH_RAY_ALL_VISIBILITY,
@@ -155,6 +160,7 @@ ccl_device_inline
local_object,
prim,
prim_addr,
tmin,
isect_t,
lcg_state,
max_hits)) {
@@ -191,6 +197,7 @@ ccl_device_inline
local_object,
prim,
prim_addr,
tmin,
isect_t,
lcg_state,
max_hits)) {

View File

@@ -18,7 +18,8 @@ ccl_device_forceinline Transform bvh_unaligned_node_fetch_space(KernelGlobals kg
ccl_device_forceinline int bvh_aligned_node_intersect(KernelGlobals kg,
const float3 P,
const float3 idir,
const float t,
const float tmin,
const float tmax,
const int node_addr,
const uint visibility,
float dist[2])
@@ -39,8 +40,8 @@ ccl_device_forceinline int bvh_aligned_node_intersect(KernelGlobals kg,
float c0hiy = (node1.z - P.y) * idir.y;
float c0loz = (node2.x - P.z) * idir.z;
float c0hiz = (node2.z - P.z) * idir.z;
float c0min = max4(0.0f, min(c0lox, c0hix), min(c0loy, c0hiy), min(c0loz, c0hiz));
float c0max = min4(t, max(c0lox, c0hix), max(c0loy, c0hiy), max(c0loz, c0hiz));
float c0min = max4(tmin, min(c0lox, c0hix), min(c0loy, c0hiy), min(c0loz, c0hiz));
float c0max = min4(tmax, max(c0lox, c0hix), max(c0loy, c0hiy), max(c0loz, c0hiz));
float c1lox = (node0.y - P.x) * idir.x;
float c1hix = (node0.w - P.x) * idir.x;
@@ -48,8 +49,8 @@ ccl_device_forceinline int bvh_aligned_node_intersect(KernelGlobals kg,
float c1hiy = (node1.w - P.y) * idir.y;
float c1loz = (node2.y - P.z) * idir.z;
float c1hiz = (node2.w - P.z) * idir.z;
float c1min = max4(0.0f, min(c1lox, c1hix), min(c1loy, c1hiy), min(c1loz, c1hiz));
float c1max = min4(t, max(c1lox, c1hix), max(c1loy, c1hiy), max(c1loz, c1hiz));
float c1min = max4(tmin, min(c1lox, c1hix), min(c1loy, c1hiy), min(c1loz, c1hiz));
float c1max = min4(tmax, max(c1lox, c1hix), max(c1loy, c1hiy), max(c1loz, c1hiz));
dist[0] = c0min;
dist[1] = c1min;
@@ -66,7 +67,8 @@ ccl_device_forceinline int bvh_aligned_node_intersect(KernelGlobals kg,
ccl_device_forceinline bool bvh_unaligned_node_intersect_child(KernelGlobals kg,
const float3 P,
const float3 dir,
const float t,
const float tmin,
const float tmax,
int node_addr,
int child,
float dist[2])
@@ -83,8 +85,8 @@ ccl_device_forceinline bool bvh_unaligned_node_intersect_child(KernelGlobals kg,
const float far_x = max(lower_xyz.x, upper_xyz.x);
const float far_y = max(lower_xyz.y, upper_xyz.y);
const float far_z = max(lower_xyz.z, upper_xyz.z);
const float tnear = max4(0.0f, near_x, near_y, near_z);
const float tfar = min4(t, far_x, far_y, far_z);
const float tnear = max4(tmin, near_x, near_y, near_z);
const float tfar = min4(tmax, far_x, far_y, far_z);
*dist = tnear;
return tnear <= tfar;
}
@@ -93,7 +95,8 @@ ccl_device_forceinline int bvh_unaligned_node_intersect(KernelGlobals kg,
const float3 P,
const float3 dir,
const float3 idir,
const float t,
const float tmin,
const float tmax,
const int node_addr,
const uint visibility,
float dist[2])
@@ -102,7 +105,7 @@ ccl_device_forceinline int bvh_unaligned_node_intersect(KernelGlobals kg,
#ifdef __VISIBILITY_FLAG__
float4 cnodes = kernel_data_fetch(bvh_nodes, node_addr + 0);
#endif
if (bvh_unaligned_node_intersect_child(kg, P, dir, t, node_addr, 0, &dist[0])) {
if (bvh_unaligned_node_intersect_child(kg, P, dir, tmin, tmax, node_addr, 0, &dist[0])) {
#ifdef __VISIBILITY_FLAG__
if ((__float_as_uint(cnodes.x) & visibility))
#endif
@@ -110,7 +113,7 @@ ccl_device_forceinline int bvh_unaligned_node_intersect(KernelGlobals kg,
mask |= 1;
}
}
if (bvh_unaligned_node_intersect_child(kg, P, dir, t, node_addr, 1, &dist[1])) {
if (bvh_unaligned_node_intersect_child(kg, P, dir, tmin, tmax, node_addr, 1, &dist[1])) {
#ifdef __VISIBILITY_FLAG__
if ((__float_as_uint(cnodes.y) & visibility))
#endif
@@ -125,16 +128,17 @@ ccl_device_forceinline int bvh_node_intersect(KernelGlobals kg,
const float3 P,
const float3 dir,
const float3 idir,
const float t,
const float tmin,
const float tmax,
const int node_addr,
const uint visibility,
float dist[2])
{
float4 node = kernel_data_fetch(bvh_nodes, node_addr);
if (__float_as_uint(node.x) & PATH_RAY_NODE_UNALIGNED) {
return bvh_unaligned_node_intersect(kg, P, dir, idir, t, node_addr, visibility, dist);
return bvh_unaligned_node_intersect(kg, P, dir, idir, tmin, tmax, node_addr, visibility, dist);
}
else {
return bvh_aligned_node_intersect(kg, P, idir, t, node_addr, visibility, dist);
return bvh_aligned_node_intersect(kg, P, idir, tmin, tmax, node_addr, visibility, dist);
}
}

View File

@@ -49,6 +49,7 @@ ccl_device_inline
float3 P = ray->P;
float3 dir = bvh_clamp_direction(ray->D);
float3 idir = bvh_inverse_direction(dir);
float tmin = ray->tmin;
int object = OBJECT_NONE;
uint num_hits = 0;
@@ -59,12 +60,12 @@ ccl_device_inline
/* Max distance in world space. May be dynamically reduced when max number of
* recorded hits is exceeded and we no longer need to find hits beyond the max
* distance found. */
float t_max_world = ray->t;
float t_max_world = ray->tmax;
/* Current maximum distance to the intersection.
* Is calculated as a ray length, transformed to an object space when entering
* instance node. */
float t_max_current = ray->t;
float t_max_current = ray->tmax;
/* Conversion from world to local space for the current instance if any, 1.0
* otherwise. */
@@ -88,6 +89,7 @@ ccl_device_inline
dir,
#endif
idir,
tmin,
t_max_current,
node_addr,
visibility,
@@ -156,8 +158,16 @@ ccl_device_inline
switch (type & PRIMITIVE_ALL) {
case PRIMITIVE_TRIANGLE: {
hit = triangle_intersect(
kg, &isect, P, dir, t_max_current, visibility, prim_object, prim, prim_addr);
hit = triangle_intersect(kg,
&isect,
P,
dir,
tmin,
t_max_current,
visibility,
prim_object,
prim,
prim_addr);
break;
}
#if BVH_FEATURE(BVH_MOTION)
@@ -166,6 +176,7 @@ ccl_device_inline
&isect,
P,
dir,
tmin,
t_max_current,
ray->time,
visibility,
@@ -189,8 +200,16 @@ ccl_device_inline
}
const int curve_type = kernel_data_fetch(prim_type, prim_addr);
hit = curve_intersect(
kg, &isect, P, dir, t_max_current, prim_object, prim, ray->time, curve_type);
hit = curve_intersect(kg,
&isect,
P,
dir,
tmin,
t_max_current,
prim_object,
prim,
ray->time,
curve_type);
break;
}
@@ -207,8 +226,16 @@ ccl_device_inline
}
const int point_type = kernel_data_fetch(prim_type, prim_addr);
hit = point_intersect(
kg, &isect, P, dir, t_max_current, prim_object, prim, ray->time, point_type);
hit = point_intersect(kg,
&isect,
P,
dir,
tmin,
t_max_current,
prim_object,
prim,
ray->time,
point_type);
break;
}
#endif /* BVH_FEATURE(BVH_POINTCLOUD) */
@@ -302,6 +329,7 @@ ccl_device_inline
/* Convert intersection to object space. */
t_max_current *= t_world_to_instance;
tmin *= t_world_to_instance;
++stack_ptr;
kernel_assert(stack_ptr < BVH_STACK_SIZE);
@@ -323,7 +351,8 @@ ccl_device_inline
#endif
/* Restore world space ray length. */
t_max_current = ray->t;
tmin = ray->tmin;
t_max_current = ray->tmax;
object = OBJECT_NONE;
t_world_to_instance = 1.0f;

View File

@@ -43,13 +43,14 @@ ccl_device_noinline bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals kg,
float3 P = ray->P;
float3 dir = bvh_clamp_direction(ray->D);
float3 idir = bvh_inverse_direction(dir);
float tmin = ray->tmin;
int object = OBJECT_NONE;
#if BVH_FEATURE(BVH_MOTION)
Transform ob_itfm;
#endif
isect->t = ray->t;
isect->t = ray->tmax;
isect->u = 0.0f;
isect->v = 0.0f;
isect->prim = PRIM_NONE;
@@ -71,6 +72,7 @@ ccl_device_noinline bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals kg,
dir,
#endif
idir,
tmin,
isect->t,
node_addr,
visibility,
@@ -133,8 +135,16 @@ ccl_device_noinline bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals kg,
switch (type & PRIMITIVE_ALL) {
case PRIMITIVE_TRIANGLE: {
if (triangle_intersect(
kg, isect, P, dir, isect->t, visibility, prim_object, prim, prim_addr)) {
if (triangle_intersect(kg,
isect,
P,
dir,
tmin,
isect->t,
visibility,
prim_object,
prim,
prim_addr)) {
/* shadow ray early termination */
if (visibility & PATH_RAY_SHADOW_OPAQUE)
return true;
@@ -147,6 +157,7 @@ ccl_device_noinline bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals kg,
isect,
P,
dir,
tmin,
isect->t,
ray->time,
visibility,
@@ -174,7 +185,7 @@ ccl_device_noinline bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals kg,
const int curve_type = kernel_data_fetch(prim_type, prim_addr);
const bool hit = curve_intersect(
kg, isect, P, dir, isect->t, prim_object, prim, ray->time, curve_type);
kg, isect, P, dir, tmin, isect->t, prim_object, prim, ray->time, curve_type);
if (hit) {
/* shadow ray early termination */
if (visibility & PATH_RAY_SHADOW_OPAQUE)
@@ -195,7 +206,7 @@ ccl_device_noinline bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals kg,
const int point_type = kernel_data_fetch(prim_type, prim_addr);
const bool hit = point_intersect(
kg, isect, P, dir, isect->t, prim_object, prim, ray->time, point_type);
kg, isect, P, dir, tmin, isect->t, prim_object, prim, ray->time, point_type);
if (hit) {
/* shadow ray early termination */
if (visibility & PATH_RAY_SHADOW_OPAQUE)
@@ -212,11 +223,15 @@ ccl_device_noinline bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals kg,
object = kernel_data_fetch(prim_object, -prim_addr - 1);
#if BVH_FEATURE(BVH_MOTION)
isect->t *= bvh_instance_motion_push(kg, object, ray, &P, &dir, &idir, &ob_itfm);
const float t_world_to_instance = bvh_instance_motion_push(
kg, object, ray, &P, &dir, &idir, &ob_itfm);
#else
isect->t *= bvh_instance_push(kg, object, ray, &P, &dir, &idir);
const float t_world_to_instance = bvh_instance_push(kg, object, ray, &P, &dir, &idir);
#endif
isect->t *= t_world_to_instance;
tmin *= t_world_to_instance;
++stack_ptr;
kernel_assert(stack_ptr < BVH_STACK_SIZE);
traversal_stack[stack_ptr] = ENTRYPOINT_SENTINEL;
@@ -235,6 +250,7 @@ ccl_device_noinline bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals kg,
#else
isect->t = bvh_instance_pop(kg, object, ray, &P, &dir, &idir, isect->t);
#endif
tmin = ray->tmin;
object = OBJECT_NONE;
node_addr = traversal_stack[stack_ptr];

View File

@@ -5,6 +5,19 @@
CCL_NAMESPACE_BEGIN
/* Offset intersection distance by the smallest possible amount, to skip
* intersections at this distance. This works in cases where the ray start
* position is unchanged and only tmin is updated, since for self
* intersection we'll be comparing against the exact same distances. */
ccl_device_forceinline float intersection_t_offset(const float t)
{
/* This is a simplified version of nextafterf(t, FLT_MAX), only dealing with
* non-negative and finite t. */
kernel_assert(t >= 0.0f && isfinite_safe(t));
const uint32_t bits = (t == 0.0f) ? 1 : __float_as_uint(t) + 1;
return __uint_as_float(bits);
}
#if defined(__KERNEL_CPU__)
ccl_device int intersections_compare(const void *a, const void *b)
{

View File

@@ -46,13 +46,14 @@ ccl_device_inline
float3 P = ray->P;
float3 dir = bvh_clamp_direction(ray->D);
float3 idir = bvh_inverse_direction(dir);
float tmin = ray->tmin;
int object = OBJECT_NONE;
#if BVH_FEATURE(BVH_MOTION)
Transform ob_itfm;
#endif
isect->t = ray->t;
isect->t = ray->tmax;
isect->u = 0.0f;
isect->v = 0.0f;
isect->prim = PRIM_NONE;
@@ -73,6 +74,7 @@ ccl_device_inline
dir,
#endif
idir,
tmin,
isect->t,
node_addr,
visibility,
@@ -140,7 +142,7 @@ ccl_device_inline
continue;
}
triangle_intersect(
kg, isect, P, dir, isect->t, visibility, prim_object, prim, prim_addr);
kg, isect, P, dir, tmin, isect->t, visibility, prim_object, prim, prim_addr);
}
break;
}
@@ -165,6 +167,7 @@ ccl_device_inline
isect,
P,
dir,
tmin,
isect->t,
ray->time,
visibility,
@@ -186,11 +189,15 @@ ccl_device_inline
int object_flag = kernel_data_fetch(object_flag, object);
if (object_flag & SD_OBJECT_HAS_VOLUME) {
#if BVH_FEATURE(BVH_MOTION)
isect->t *= bvh_instance_motion_push(kg, object, ray, &P, &dir, &idir, &ob_itfm);
const float t_world_to_instance = bvh_instance_motion_push(
kg, object, ray, &P, &dir, &idir, &ob_itfm);
#else
isect->t *= bvh_instance_push(kg, object, ray, &P, &dir, &idir);
const float t_world_to_instance = bvh_instance_push(kg, object, ray, &P, &dir, &idir);
#endif
isect->t *= t_world_to_instance;
tmin *= t_world_to_instance;
++stack_ptr;
kernel_assert(stack_ptr < BVH_STACK_SIZE);
traversal_stack[stack_ptr] = ENTRYPOINT_SENTINEL;
@@ -217,6 +224,8 @@ ccl_device_inline
isect->t = bvh_instance_pop(kg, object, ray, &P, &dir, &idir, isect->t);
#endif
tmin = ray->tmin;
object = OBJECT_NONE;
node_addr = traversal_stack[stack_ptr];
--stack_ptr;

View File

@@ -44,12 +44,12 @@ ccl_device_inline
int node_addr = kernel_data.bvh.root;
/* ray parameters in registers */
const float tmax = ray->t;
float3 P = ray->P;
float3 dir = bvh_clamp_direction(ray->D);
float3 idir = bvh_inverse_direction(dir);
float tmin = ray->tmin;
int object = OBJECT_NONE;
float isect_t = tmax;
float isect_t = ray->tmax;
#if BVH_FEATURE(BVH_MOTION)
Transform ob_itfm;
@@ -58,7 +58,7 @@ ccl_device_inline
int num_hits_in_instance = 0;
uint num_hits = 0;
isect_array->t = tmax;
isect_array->t = ray->tmax;
/* traversal loop */
do {
@@ -75,6 +75,7 @@ ccl_device_inline
dir,
#endif
idir,
tmin,
isect_t,
node_addr,
visibility,
@@ -141,8 +142,16 @@ ccl_device_inline
if ((object_flag & SD_OBJECT_HAS_VOLUME) == 0) {
continue;
}
hit = triangle_intersect(
kg, isect_array, P, dir, isect_t, visibility, prim_object, prim, prim_addr);
hit = triangle_intersect(kg,
isect_array,
P,
dir,
tmin,
isect_t,
visibility,
prim_object,
prim,
prim_addr);
if (hit) {
/* Move on to next entry in intersections array. */
isect_array++;
@@ -189,6 +198,7 @@ ccl_device_inline
isect_array,
P,
dir,
tmin,
isect_t,
ray->time,
visibility,
@@ -232,11 +242,15 @@ ccl_device_inline
int object_flag = kernel_data_fetch(object_flag, object);
if (object_flag & SD_OBJECT_HAS_VOLUME) {
#if BVH_FEATURE(BVH_MOTION)
isect_t *= bvh_instance_motion_push(kg, object, ray, &P, &dir, &idir, &ob_itfm);
const float t_world_to_instance = bvh_instance_motion_push(
kg, object, ray, &P, &dir, &idir, &ob_itfm);
#else
isect_t *= bvh_instance_push(kg, object, ray, &P, &dir, &idir);
const float t_world_to_instance = bvh_instance_push(kg, object, ray, &P, &dir, &idir);
#endif
isect_t *= t_world_to_instance;
tmin *= t_world_to_instance;
num_hits_in_instance = 0;
isect_array->t = isect_t;
@@ -280,7 +294,8 @@ ccl_device_inline
#endif
}
isect_t = tmax;
tmin = ray->tmin;
isect_t = ray->tmax;
isect_array->t = isect_t;
object = OBJECT_NONE;

View File

@@ -165,9 +165,11 @@ ccl_device void camera_sample_perspective(KernelGlobals kg,
float nearclip = kernel_data.cam.nearclip * z_inv;
ray->P += nearclip * ray->D;
ray->dP += nearclip * ray->dD;
ray->t = kernel_data.cam.cliplength * z_inv;
ray->tmin = 0.0f;
ray->tmax = kernel_data.cam.cliplength * z_inv;
#else
ray->t = FLT_MAX;
ray->tmin = 0.0f;
ray->tmax = FLT_MAX;
#endif
}
@@ -231,9 +233,11 @@ ccl_device void camera_sample_orthographic(KernelGlobals kg,
#ifdef __CAMERA_CLIPPING__
/* clipping */
ray->t = kernel_data.cam.cliplength;
ray->tmin = 0.0f;
ray->tmax = kernel_data.cam.cliplength;
#else
ray->t = FLT_MAX;
ray->tmin = 0.0f;
ray->tmax = FLT_MAX;
#endif
}
@@ -258,7 +262,7 @@ ccl_device_inline void camera_sample_panorama(ccl_constant KernelCamera *cam,
/* indicates ray should not receive any light, outside of the lens */
if (is_zero(D)) {
ray->t = 0.0f;
ray->tmax = 0.0f;
return;
}
@@ -349,9 +353,11 @@ ccl_device_inline void camera_sample_panorama(ccl_constant KernelCamera *cam,
float nearclip = cam->nearclip;
ray->P += nearclip * ray->D;
ray->dP += nearclip * ray->dD;
ray->t = cam->cliplength;
ray->tmin = 0.0f;
ray->tmax = cam->cliplength;
#else
ray->t = FLT_MAX;
ray->tmin = 0.0f;
ray->tmax = FLT_MAX;
#endif
}
@@ -368,7 +374,7 @@ ccl_device_inline void camera_sample(KernelGlobals kg,
ccl_private Ray *ray)
{
/* pixel filter */
int filter_table_offset = kernel_data.film.filter_table_offset;
int filter_table_offset = kernel_data.tables.filter_table_offset;
float raster_x = x + lookup_table_read(kg, filter_u, filter_table_offset, FILTER_TABLE_SIZE);
float raster_y = y + lookup_table_read(kg, filter_v, filter_table_offset, FILTER_TABLE_SIZE);

View File

@@ -0,0 +1,206 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#ifndef KERNEL_STRUCT_BEGIN
# define KERNEL_STRUCT_BEGIN(name, parent)
#endif
#ifndef KERNEL_STRUCT_END
# define KERNEL_STRUCT_END(name)
#endif
#ifndef KERNEL_STRUCT_MEMBER
# define KERNEL_STRUCT_MEMBER(parent, type, name)
#endif
/* Background. */
KERNEL_STRUCT_BEGIN(KernelBackground, background)
/* xyz store direction, w the angle. float4 instead of float3 is used
* to ensure consistent padding/alignment across devices. */
KERNEL_STRUCT_MEMBER(background, float4, sun)
/* Only shader index. */
KERNEL_STRUCT_MEMBER(background, int, surface_shader)
KERNEL_STRUCT_MEMBER(background, int, volume_shader)
KERNEL_STRUCT_MEMBER(background, float, volume_step_size)
KERNEL_STRUCT_MEMBER(background, int, transparent)
KERNEL_STRUCT_MEMBER(background, float, transparent_roughness_squared_threshold)
/* Portal sampling. */
KERNEL_STRUCT_MEMBER(background, float, portal_weight)
KERNEL_STRUCT_MEMBER(background, int, num_portals)
KERNEL_STRUCT_MEMBER(background, int, portal_offset)
/* Sun sampling. */
KERNEL_STRUCT_MEMBER(background, float, sun_weight)
/* Importance map sampling. */
KERNEL_STRUCT_MEMBER(background, float, map_weight)
KERNEL_STRUCT_MEMBER(background, int, map_res_x)
KERNEL_STRUCT_MEMBER(background, int, map_res_y)
/* Multiple importance sampling. */
KERNEL_STRUCT_MEMBER(background, int, use_mis)
/* Lightgroup. */
KERNEL_STRUCT_MEMBER(background, int, lightgroup)
/* Padding. */
KERNEL_STRUCT_MEMBER(background, int, pad1)
KERNEL_STRUCT_MEMBER(background, int, pad2)
KERNEL_STRUCT_MEMBER(background, int, pad3)
KERNEL_STRUCT_END(KernelBackground)
/* BVH: own BVH2 if no native device acceleration struct used. */
KERNEL_STRUCT_BEGIN(KernelBVH, bvh)
KERNEL_STRUCT_MEMBER(bvh, int, root)
KERNEL_STRUCT_MEMBER(bvh, int, have_motion)
KERNEL_STRUCT_MEMBER(bvh, int, have_curves)
KERNEL_STRUCT_MEMBER(bvh, int, bvh_layout)
KERNEL_STRUCT_MEMBER(bvh, int, use_bvh_steps)
KERNEL_STRUCT_MEMBER(bvh, int, curve_subdivisions)
KERNEL_STRUCT_MEMBER(bvh, int, pad1)
KERNEL_STRUCT_MEMBER(bvh, int, pad2)
KERNEL_STRUCT_END(KernelBVH)
/* Film. */
KERNEL_STRUCT_BEGIN(KernelFilm, film)
/* XYZ to rendering color space transform. float4 instead of float3 to
* ensure consistent padding/alignment across devices. */
KERNEL_STRUCT_MEMBER(film, float4, xyz_to_r)
KERNEL_STRUCT_MEMBER(film, float4, xyz_to_g)
KERNEL_STRUCT_MEMBER(film, float4, xyz_to_b)
KERNEL_STRUCT_MEMBER(film, float4, rgb_to_y)
/* Rec709 to rendering color space. */
KERNEL_STRUCT_MEMBER(film, float4, rec709_to_r)
KERNEL_STRUCT_MEMBER(film, float4, rec709_to_g)
KERNEL_STRUCT_MEMBER(film, float4, rec709_to_b)
KERNEL_STRUCT_MEMBER(film, int, is_rec709)
/* Exposuse. */
KERNEL_STRUCT_MEMBER(film, float, exposure)
/* Passed used. */
KERNEL_STRUCT_MEMBER(film, int, pass_flag)
KERNEL_STRUCT_MEMBER(film, int, light_pass_flag)
/* Pass offsets. */
KERNEL_STRUCT_MEMBER(film, int, pass_stride)
KERNEL_STRUCT_MEMBER(film, int, pass_combined)
KERNEL_STRUCT_MEMBER(film, int, pass_depth)
KERNEL_STRUCT_MEMBER(film, int, pass_position)
KERNEL_STRUCT_MEMBER(film, int, pass_normal)
KERNEL_STRUCT_MEMBER(film, int, pass_roughness)
KERNEL_STRUCT_MEMBER(film, int, pass_motion)
KERNEL_STRUCT_MEMBER(film, int, pass_motion_weight)
KERNEL_STRUCT_MEMBER(film, int, pass_uv)
KERNEL_STRUCT_MEMBER(film, int, pass_object_id)
KERNEL_STRUCT_MEMBER(film, int, pass_material_id)
KERNEL_STRUCT_MEMBER(film, int, pass_diffuse_color)
KERNEL_STRUCT_MEMBER(film, int, pass_glossy_color)
KERNEL_STRUCT_MEMBER(film, int, pass_transmission_color)
KERNEL_STRUCT_MEMBER(film, int, pass_diffuse_indirect)
KERNEL_STRUCT_MEMBER(film, int, pass_glossy_indirect)
KERNEL_STRUCT_MEMBER(film, int, pass_transmission_indirect)
KERNEL_STRUCT_MEMBER(film, int, pass_volume_indirect)
KERNEL_STRUCT_MEMBER(film, int, pass_diffuse_direct)
KERNEL_STRUCT_MEMBER(film, int, pass_glossy_direct)
KERNEL_STRUCT_MEMBER(film, int, pass_transmission_direct)
KERNEL_STRUCT_MEMBER(film, int, pass_volume_direct)
KERNEL_STRUCT_MEMBER(film, int, pass_emission)
KERNEL_STRUCT_MEMBER(film, int, pass_background)
KERNEL_STRUCT_MEMBER(film, int, pass_ao)
KERNEL_STRUCT_MEMBER(film, float, pass_alpha_threshold)
KERNEL_STRUCT_MEMBER(film, int, pass_shadow)
KERNEL_STRUCT_MEMBER(film, float, pass_shadow_scale)
KERNEL_STRUCT_MEMBER(film, int, pass_shadow_catcher)
KERNEL_STRUCT_MEMBER(film, int, pass_shadow_catcher_sample_count)
KERNEL_STRUCT_MEMBER(film, int, pass_shadow_catcher_matte)
/* Cryptomatte. */
KERNEL_STRUCT_MEMBER(film, int, cryptomatte_passes)
KERNEL_STRUCT_MEMBER(film, int, cryptomatte_depth)
KERNEL_STRUCT_MEMBER(film, int, pass_cryptomatte)
/* Adaptive sampling. */
KERNEL_STRUCT_MEMBER(film, int, pass_adaptive_aux_buffer)
KERNEL_STRUCT_MEMBER(film, int, pass_sample_count)
/* Mist. */
KERNEL_STRUCT_MEMBER(film, int, pass_mist)
KERNEL_STRUCT_MEMBER(film, float, mist_start)
KERNEL_STRUCT_MEMBER(film, float, mist_inv_depth)
KERNEL_STRUCT_MEMBER(film, float, mist_falloff)
/* Denoising. */
KERNEL_STRUCT_MEMBER(film, int, pass_denoising_normal)
KERNEL_STRUCT_MEMBER(film, int, pass_denoising_albedo)
KERNEL_STRUCT_MEMBER(film, int, pass_denoising_depth)
/* AOVs. */
KERNEL_STRUCT_MEMBER(film, int, pass_aov_color)
KERNEL_STRUCT_MEMBER(film, int, pass_aov_value)
/* Light groups. */
KERNEL_STRUCT_MEMBER(film, int, pass_lightgroup)
/* Baking. */
KERNEL_STRUCT_MEMBER(film, int, pass_bake_primitive)
KERNEL_STRUCT_MEMBER(film, int, pass_bake_differential)
/* Shadow catcher. */
KERNEL_STRUCT_MEMBER(film, int, use_approximate_shadow_catcher)
/* Padding. */
KERNEL_STRUCT_MEMBER(film, int, pad1)
KERNEL_STRUCT_MEMBER(film, int, pad2)
KERNEL_STRUCT_END(KernelFilm)
/* Integrator. */
KERNEL_STRUCT_BEGIN(KernelIntegrator, integrator)
/* Emission. */
KERNEL_STRUCT_MEMBER(integrator, int, use_direct_light)
KERNEL_STRUCT_MEMBER(integrator, int, num_distribution)
KERNEL_STRUCT_MEMBER(integrator, int, num_all_lights)
KERNEL_STRUCT_MEMBER(integrator, float, pdf_triangles)
KERNEL_STRUCT_MEMBER(integrator, float, pdf_lights)
KERNEL_STRUCT_MEMBER(integrator, float, light_inv_rr_threshold)
/* Bounces. */
KERNEL_STRUCT_MEMBER(integrator, int, min_bounce)
KERNEL_STRUCT_MEMBER(integrator, int, max_bounce)
KERNEL_STRUCT_MEMBER(integrator, int, max_diffuse_bounce)
KERNEL_STRUCT_MEMBER(integrator, int, max_glossy_bounce)
KERNEL_STRUCT_MEMBER(integrator, int, max_transmission_bounce)
KERNEL_STRUCT_MEMBER(integrator, int, max_volume_bounce)
/* AO bounces. */
KERNEL_STRUCT_MEMBER(integrator, int, ao_bounces)
KERNEL_STRUCT_MEMBER(integrator, float, ao_bounces_distance)
KERNEL_STRUCT_MEMBER(integrator, float, ao_bounces_factor)
KERNEL_STRUCT_MEMBER(integrator, float, ao_additive_factor)
/* Transparency. */
KERNEL_STRUCT_MEMBER(integrator, int, transparent_min_bounce)
KERNEL_STRUCT_MEMBER(integrator, int, transparent_max_bounce)
KERNEL_STRUCT_MEMBER(integrator, int, transparent_shadows)
/* Caustics. */
KERNEL_STRUCT_MEMBER(integrator, int, caustics_reflective)
KERNEL_STRUCT_MEMBER(integrator, int, caustics_refractive)
KERNEL_STRUCT_MEMBER(integrator, float, filter_glossy)
/* Seed. */
KERNEL_STRUCT_MEMBER(integrator, int, seed)
/* Clamp. */
KERNEL_STRUCT_MEMBER(integrator, float, sample_clamp_direct)
KERNEL_STRUCT_MEMBER(integrator, float, sample_clamp_indirect)
/* MIS. */
KERNEL_STRUCT_MEMBER(integrator, int, use_lamp_mis)
/* Caustics. */
KERNEL_STRUCT_MEMBER(integrator, int, use_caustics)
/* Sampling pattern. */
KERNEL_STRUCT_MEMBER(integrator, int, sampling_pattern)
KERNEL_STRUCT_MEMBER(integrator, float, scrambling_distance)
/* Volume render. */
KERNEL_STRUCT_MEMBER(integrator, int, use_volumes)
KERNEL_STRUCT_MEMBER(integrator, int, volume_max_steps)
KERNEL_STRUCT_MEMBER(integrator, float, volume_step_rate)
/* Shadow catcher. */
KERNEL_STRUCT_MEMBER(integrator, int, has_shadow_catcher)
/* Closure filter. */
KERNEL_STRUCT_MEMBER(integrator, int, filter_closures)
/* MIS debugging. */
KERNEL_STRUCT_MEMBER(integrator, int, direct_light_sampling_type)
/* Padding */
KERNEL_STRUCT_MEMBER(integrator, int, pad1)
KERNEL_STRUCT_END(KernelIntegrator)
/* SVM. For shader specialization. */
KERNEL_STRUCT_BEGIN(KernelSVMUsage, svm_usage)
#define SHADER_NODE_TYPE(type) KERNEL_STRUCT_MEMBER(svm_usage, int, type)
#include "kernel/svm/node_types_template.h"
KERNEL_STRUCT_END(KernelSVMUsage)
#undef KERNEL_STRUCT_BEGIN
#undef KERNEL_STRUCT_MEMBER
#undef KERNEL_STRUCT_END

View File

@@ -246,7 +246,7 @@ ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
ccl_gpu_kernel_postfix
#if defined(__KERNEL_METAL_APPLE__) && defined(__METALRT__)
constant int __dummy_constant [[function_constant(0)]];
constant int __dummy_constant [[function_constant(Kernel_DummyConstant)]];
#endif
ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)

View File

@@ -0,0 +1,15 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2021-2022 Blender Foundation */
enum {
Kernel_DummyConstant,
#define KERNEL_STRUCT_MEMBER(parent, type, name) KernelData_##parent##_##name,
#include "kernel/data_template.h"
};
#ifdef __KERNEL_METAL__
# define KERNEL_STRUCT_MEMBER(parent, type, name) \
constant type kernel_data_##parent##_##name \
[[function_constant(KernelData_##parent##_##name)]];
# include "kernel/data_template.h"
#endif

View File

@@ -5,6 +5,7 @@
#include "kernel/device/metal/compat.h"
#include "kernel/device/metal/globals.h"
#include "kernel/device/metal/function_constants.h"
#include "kernel/device/gpu/kernel.h"
/* MetalRT intersection handlers */
@@ -409,6 +410,7 @@ void metalrt_intersection_curve(constant KernelParamsMetal &launch_params_metal,
const float3 ray_origin,
const float3 ray_direction,
float time,
const float ray_tmin,
const float ray_tmax,
thread BoundingBoxIntersectionResult &result)
{
@@ -433,7 +435,7 @@ void metalrt_intersection_curve(constant KernelParamsMetal &launch_params_metal,
isect.t *= len;
MetalKernelContext context(launch_params_metal);
if (context.curve_intersect(NULL, &isect, P, dir, isect.t, object, prim, time, type)) {
if (context.curve_intersect(NULL, &isect, P, dir, ray_tmin, isect.t, object, prim, time, type)) {
result = metalrt_visibility_test<BoundingBoxIntersectionResult, METALRT_HIT_BOUNDING_BOX>(
launch_params_metal, payload, object, prim, isect.u);
if (result.accept) {
@@ -455,6 +457,7 @@ void metalrt_intersection_curve_shadow(constant KernelParamsMetal &launch_params
const float3 ray_origin,
const float3 ray_direction,
float time,
const float ray_tmin,
const float ray_tmax,
thread BoundingBoxIntersectionResult &result)
{
@@ -474,7 +477,7 @@ void metalrt_intersection_curve_shadow(constant KernelParamsMetal &launch_params
isect.t *= len;
MetalKernelContext context(launch_params_metal);
if (context.curve_intersect(NULL, &isect, P, dir, isect.t, object, prim, time, type)) {
if (context.curve_intersect(NULL, &isect, P, dir, ray_tmin, isect.t, object, prim, time, type)) {
result.continue_search = metalrt_shadow_all_hit<METALRT_HIT_BOUNDING_BOX>(
launch_params_metal, payload, object, prim, float2(isect.u, isect.v), ray_tmax);
result.accept = !result.continue_search;
@@ -493,6 +496,7 @@ __intersection__curve_ribbon(constant KernelParamsMetal &launch_params_metal [[b
const uint primitive_id [[primitive_id]],
const float3 ray_origin [[origin]],
const float3 ray_direction [[direction]],
const float ray_tmin [[min_distance]],
const float ray_tmax [[max_distance]])
{
uint prim = primitive_id + kernel_data_fetch(object_prim_offset, object);
@@ -510,7 +514,7 @@ __intersection__curve_ribbon(constant KernelParamsMetal &launch_params_metal [[b
# else
0.0f,
# endif
ray_tmax, result);
ray_tmin, ray_tmax, result);
}
return result;
@@ -524,6 +528,7 @@ __intersection__curve_ribbon_shadow(constant KernelParamsMetal &launch_params_me
const uint primitive_id [[primitive_id]],
const float3 ray_origin [[origin]],
const float3 ray_direction [[direction]],
const float ray_tmin [[min_distance]],
const float ray_tmax [[max_distance]])
{
uint prim = primitive_id + kernel_data_fetch(object_prim_offset, object);
@@ -541,7 +546,7 @@ __intersection__curve_ribbon_shadow(constant KernelParamsMetal &launch_params_me
# else
0.0f,
# endif
ray_tmax, result);
ray_tmin, ray_tmax, result);
}
return result;
@@ -555,6 +560,7 @@ __intersection__curve_all(constant KernelParamsMetal &launch_params_metal [[buff
const uint primitive_id [[primitive_id]],
const float3 ray_origin [[origin]],
const float3 ray_direction [[direction]],
const float ray_tmin [[min_distance]],
const float ray_tmax [[max_distance]])
{
uint prim = primitive_id + kernel_data_fetch(object_prim_offset, object);
@@ -570,7 +576,7 @@ __intersection__curve_all(constant KernelParamsMetal &launch_params_metal [[buff
# else
0.0f,
# endif
ray_tmax, result);
ray_tmin, ray_tmax, result);
return result;
}
@@ -583,6 +589,7 @@ __intersection__curve_all_shadow(constant KernelParamsMetal &launch_params_metal
const uint primitive_id [[primitive_id]],
const float3 ray_origin [[origin]],
const float3 ray_direction [[direction]],
const float ray_tmin [[min_distance]],
const float ray_tmax [[max_distance]])
{
uint prim = primitive_id + kernel_data_fetch(object_prim_offset, object);
@@ -599,7 +606,7 @@ __intersection__curve_all_shadow(constant KernelParamsMetal &launch_params_metal
# else
0.0f,
# endif
ray_tmax, result);
ray_tmin, ray_tmax, result);
return result;
}
@@ -615,6 +622,7 @@ void metalrt_intersection_point(constant KernelParamsMetal &launch_params_metal,
const float3 ray_origin,
const float3 ray_direction,
float time,
const float ray_tmin,
const float ray_tmax,
thread BoundingBoxIntersectionResult &result)
{
@@ -639,7 +647,7 @@ void metalrt_intersection_point(constant KernelParamsMetal &launch_params_metal,
isect.t *= len;
MetalKernelContext context(launch_params_metal);
if (context.point_intersect(NULL, &isect, P, dir, isect.t, object, prim, time, type)) {
if (context.point_intersect(NULL, &isect, P, dir, ray_tmin, isect.t, object, prim, time, type)) {
result = metalrt_visibility_test<BoundingBoxIntersectionResult, METALRT_HIT_BOUNDING_BOX>(
launch_params_metal, payload, object, prim, isect.u);
if (result.accept) {
@@ -661,6 +669,7 @@ void metalrt_intersection_point_shadow(constant KernelParamsMetal &launch_params
const float3 ray_origin,
const float3 ray_direction,
float time,
const float ray_tmin,
const float ray_tmax,
thread BoundingBoxIntersectionResult &result)
{
@@ -680,7 +689,7 @@ void metalrt_intersection_point_shadow(constant KernelParamsMetal &launch_params
isect.t *= len;
MetalKernelContext context(launch_params_metal);
if (context.point_intersect(NULL, &isect, P, dir, isect.t, object, prim, time, type)) {
if (context.point_intersect(NULL, &isect, P, dir, ray_tmin, isect.t, object, prim, time, type)) {
result.continue_search = metalrt_shadow_all_hit<METALRT_HIT_BOUNDING_BOX>(
launch_params_metal, payload, object, prim, float2(isect.u, isect.v), ray_tmax);
result.accept = !result.continue_search;
@@ -699,6 +708,7 @@ __intersection__point(constant KernelParamsMetal &launch_params_metal [[buffer(1
const uint primitive_id [[primitive_id]],
const float3 ray_origin [[origin]],
const float3 ray_direction [[direction]],
const float ray_tmin [[min_distance]],
const float ray_tmax [[max_distance]])
{
const uint prim = primitive_id + kernel_data_fetch(object_prim_offset, object);
@@ -715,7 +725,7 @@ __intersection__point(constant KernelParamsMetal &launch_params_metal [[buffer(1
# else
0.0f,
# endif
ray_tmax, result);
ray_tmin, ray_tmax, result);
return result;
}
@@ -728,6 +738,7 @@ __intersection__point_shadow(constant KernelParamsMetal &launch_params_metal [[b
const uint primitive_id [[primitive_id]],
const float3 ray_origin [[origin]],
const float3 ray_direction [[direction]],
const float ray_tmin [[min_distance]],
const float ray_tmax [[max_distance]])
{
const uint prim = primitive_id + kernel_data_fetch(object_prim_offset, object);
@@ -744,7 +755,7 @@ __intersection__point_shadow(constant KernelParamsMetal &launch_params_metal [[b
# else
0.0f,
# endif
ray_tmax, result);
ray_tmin, ray_tmax, result);
return result;
}

View File

@@ -670,7 +670,7 @@ bool oneapi_enqueue_kernel(KernelContext *kernel_context,
}
static const int lowest_supported_driver_version_win = 1011660;
static const int lowest_supported_driver_version_neo = 20066;
static const int lowest_supported_driver_version_neo = 23570;
static int parse_driver_build_version(const sycl::device &device)
{

View File

@@ -51,32 +51,36 @@ ccl_device_forceinline int get_object_id()
extern "C" __global__ void __raygen__kernel_optix_integrator_intersect_closest()
{
const int global_index = optixGetLaunchIndex().x;
const int path_index = (kernel_params.path_index_array) ? kernel_params.path_index_array[global_index] :
global_index;
const int path_index = (kernel_params.path_index_array) ?
kernel_params.path_index_array[global_index] :
global_index;
integrator_intersect_closest(nullptr, path_index, kernel_params.render_buffer);
}
extern "C" __global__ void __raygen__kernel_optix_integrator_intersect_shadow()
{
const int global_index = optixGetLaunchIndex().x;
const int path_index = (kernel_params.path_index_array) ? kernel_params.path_index_array[global_index] :
global_index;
const int path_index = (kernel_params.path_index_array) ?
kernel_params.path_index_array[global_index] :
global_index;
integrator_intersect_shadow(nullptr, path_index);
}
extern "C" __global__ void __raygen__kernel_optix_integrator_intersect_subsurface()
{
const int global_index = optixGetLaunchIndex().x;
const int path_index = (kernel_params.path_index_array) ? kernel_params.path_index_array[global_index] :
global_index;
const int path_index = (kernel_params.path_index_array) ?
kernel_params.path_index_array[global_index] :
global_index;
integrator_intersect_subsurface(nullptr, path_index);
}
extern "C" __global__ void __raygen__kernel_optix_integrator_intersect_volume_stack()
{
const int global_index = optixGetLaunchIndex().x;
const int path_index = (kernel_params.path_index_array) ? kernel_params.path_index_array[global_index] :
global_index;
const int path_index = (kernel_params.path_index_array) ?
kernel_params.path_index_array[global_index] :
global_index;
integrator_intersect_volume_stack(nullptr, path_index);
}
@@ -408,6 +412,7 @@ ccl_device_inline void optix_intersection_curve(const int prim, const int type)
float3 P = optixGetObjectRayOrigin();
float3 dir = optixGetObjectRayDirection();
float tmin = optixGetRayTmin();
/* The direction is not normalized by default, but the curve intersection routine expects that */
float len;
@@ -425,7 +430,7 @@ ccl_device_inline void optix_intersection_curve(const int prim, const int type)
if (isect.t != FLT_MAX)
isect.t *= len;
if (curve_intersect(NULL, &isect, P, dir, isect.t, object, prim, time, type)) {
if (curve_intersect(NULL, &isect, P, dir, tmin, isect.t, object, prim, time, type)) {
static_assert(PRIMITIVE_ALL < 128, "Values >= 128 are reserved for OptiX internal use");
optixReportIntersection(isect.t / len,
type & PRIMITIVE_ALL,
@@ -462,6 +467,7 @@ extern "C" __global__ void __intersection__point()
float3 P = optixGetObjectRayOrigin();
float3 dir = optixGetObjectRayDirection();
float tmin = optixGetRayTmin();
/* The direction is not normalized by default, the point intersection routine expects that. */
float len;
@@ -480,7 +486,7 @@ extern "C" __global__ void __intersection__point()
isect.t *= len;
}
if (point_intersect(NULL, &isect, P, dir, isect.t, object, prim, time, type)) {
if (point_intersect(NULL, &isect, P, dir, tmin, isect.t, object, prim, time, type)) {
static_assert(PRIMITIVE_ALL < 128, "Values >= 128 are reserved for OptiX internal use");
optixReportIntersection(isect.t / len, type & PRIMITIVE_ALL);
}

View File

@@ -156,7 +156,8 @@ ccl_device_inline float2 half_plane_intersect(const float3 P, const float3 N, co
}
ccl_device bool curve_intersect_iterative(const float3 ray_dir,
ccl_private float *ray_tfar,
const float ray_tmin,
ccl_private float *ray_tmax,
const float dt,
const float4 curve[4],
float u,
@@ -220,7 +221,7 @@ ccl_device bool curve_intersect_iterative(const float3 ray_dir,
if (fabsf(f) < f_err && fabsf(g) < g_err) {
t += dt;
if (!(0.0f <= t && t <= *ray_tfar)) {
if (!(t >= ray_tmin && t <= *ray_tmax)) {
return false; /* Rejects NaNs */
}
if (!(u >= 0.0f && u <= 1.0f)) {
@@ -237,7 +238,7 @@ ccl_device bool curve_intersect_iterative(const float3 ray_dir,
}
/* Record intersection. */
*ray_tfar = t;
*ray_tmax = t;
isect->t = t;
isect->u = u;
isect->v = 0.0f;
@@ -250,7 +251,8 @@ ccl_device bool curve_intersect_iterative(const float3 ray_dir,
ccl_device bool curve_intersect_recursive(const float3 ray_orig,
const float3 ray_dir,
float ray_tfar,
const float ray_tmin,
float ray_tmax,
float4 curve[4],
ccl_private Intersection *isect)
{
@@ -331,7 +333,7 @@ ccl_device bool curve_intersect_recursive(const float3 ray_orig,
}
/* Intersect with cap-planes. */
float2 tp = make_float2(-dt, ray_tfar - dt);
float2 tp = make_float2(ray_tmin - dt, ray_tmax - dt);
tp = make_float2(max(tp.x, tc_outer.x), min(tp.y, tc_outer.y));
const float2 h0 = half_plane_intersect(
float4_to_float3(P0), float4_to_float3(dP0du), ray_dir);
@@ -394,19 +396,20 @@ ccl_device bool curve_intersect_recursive(const float3 ray_orig,
CURVE_NUM_BEZIER_SUBDIVISIONS;
if (depth >= termDepth) {
found |= curve_intersect_iterative(
ray_dir, &ray_tfar, dt, curve, u_outer0, tp0.x, use_backfacing, isect);
ray_dir, ray_tmin, &ray_tmax, dt, curve, u_outer0, tp0.x, use_backfacing, isect);
}
else {
recurse = true;
}
}
if (valid1 && (tp1.x + dt <= ray_tfar)) {
const float t1 = tp1.x + dt;
if (valid1 && (t1 >= ray_tmin && t1 <= ray_tmax)) {
const int termDepth = unstable1 ? CURVE_NUM_BEZIER_SUBDIVISIONS_UNSTABLE :
CURVE_NUM_BEZIER_SUBDIVISIONS;
if (depth >= termDepth) {
found |= curve_intersect_iterative(
ray_dir, &ray_tfar, dt, curve, u_outer1, tp1.y, use_backfacing, isect);
ray_dir, ray_tmin, &ray_tmax, dt, curve, u_outer1, tp1.y, use_backfacing, isect);
}
else {
recurse = true;
@@ -456,7 +459,8 @@ ccl_device_inline bool cylinder_culling_test(const float2 p1, const float2 p2, c
* v0,v1,v3 and v2,v3,v1. The edge v1,v2 decides which of the two
* triangles gets intersected.
*/
ccl_device_inline bool ribbon_intersect_quad(const float ray_tfar,
ccl_device_inline bool ribbon_intersect_quad(const float ray_tmin,
const float ray_tmax,
const float3 quad_v0,
const float3 quad_v1,
const float3 quad_v2,
@@ -497,7 +501,7 @@ ccl_device_inline bool ribbon_intersect_quad(const float ray_tfar,
/* Perform depth test? */
const float t = rcpDen * dot(v0, Ng);
if (!(0.0f <= t && t <= ray_tfar)) {
if (!(t >= ray_tmin && t <= ray_tmax)) {
return false;
}
@@ -534,7 +538,8 @@ ccl_device_inline float4 ribbon_to_ray_space(const float3 ray_space[3],
ccl_device_inline bool ribbon_intersect(const float3 ray_org,
const float3 ray_dir,
float ray_tfar,
const float ray_tmin,
float ray_tmax,
const int N,
float4 curve[4],
ccl_private Intersection *isect)
@@ -582,7 +587,7 @@ ccl_device_inline bool ribbon_intersect(const float3 ray_org,
/* Intersect quad. */
float vu, vv, vt;
bool valid0 = ribbon_intersect_quad(ray_tfar, lp0, lp1, up1, up0, &vu, &vv, &vt);
bool valid0 = ribbon_intersect_quad(ray_tmin, ray_tmax, lp0, lp1, up1, up0, &vu, &vv, &vt);
if (valid0) {
/* ignore self intersections */
@@ -596,7 +601,7 @@ ccl_device_inline bool ribbon_intersect(const float3 ray_org,
vv = 2.0f * vv - 1.0f;
/* Record intersection. */
ray_tfar = vt;
ray_tmax = vt;
isect->t = vt;
isect->u = u + vu * step_size;
isect->v = vv;
@@ -616,6 +621,7 @@ ccl_device_forceinline bool curve_intersect(KernelGlobals kg,
ccl_private Intersection *isect,
const float3 P,
const float3 dir,
const float tmin,
const float tmax,
int object,
int prim,
@@ -645,7 +651,7 @@ ccl_device_forceinline bool curve_intersect(KernelGlobals kg,
if (type & PRIMITIVE_CURVE_RIBBON) {
/* todo: adaptive number of subdivisions could help performance here. */
const int subdivisions = kernel_data.bvh.curve_subdivisions;
if (ribbon_intersect(P, dir, tmax, subdivisions, curve, isect)) {
if (ribbon_intersect(P, dir, tmin, tmax, subdivisions, curve, isect)) {
isect->prim = prim;
isect->object = object;
isect->type = type;
@@ -655,7 +661,7 @@ ccl_device_forceinline bool curve_intersect(KernelGlobals kg,
return false;
}
else {
if (curve_intersect_recursive(P, dir, tmax, curve, isect)) {
if (curve_intersect_recursive(P, dir, tmin, tmax, curve, isect)) {
isect->prim = prim;
isect->object = object;
isect->type = type;

View File

@@ -46,6 +46,7 @@ ccl_device_inline bool motion_triangle_intersect(KernelGlobals kg,
ccl_private Intersection *isect,
float3 P,
float3 dir,
float tmin,
float tmax,
float time,
uint visibility,
@@ -58,7 +59,7 @@ ccl_device_inline bool motion_triangle_intersect(KernelGlobals kg,
motion_triangle_vertices(kg, object, prim, time, verts);
/* Ray-triangle intersection, unoptimized. */
float t, u, v;
if (ray_triangle_intersect(P, dir, tmax, verts[0], verts[1], verts[2], &u, &v, &t)) {
if (ray_triangle_intersect(P, dir, tmin, tmax, verts[0], verts[1], verts[2], &u, &v, &t)) {
#ifdef __VISIBILITY_FLAG__
/* Visibility flag test. we do it here under the assumption
* that most triangles are culled by node flags.
@@ -92,6 +93,7 @@ ccl_device_inline bool motion_triangle_intersect_local(KernelGlobals kg,
int object,
int prim,
int prim_addr,
float tmin,
float tmax,
ccl_private uint *lcg_state,
int max_hits)
@@ -101,7 +103,7 @@ ccl_device_inline bool motion_triangle_intersect_local(KernelGlobals kg,
motion_triangle_vertices(kg, object, prim, time, verts);
/* Ray-triangle intersection, unoptimized. */
float t, u, v;
if (!ray_triangle_intersect(P, dir, tmax, verts[0], verts[1], verts[2], &u, &v, &t)) {
if (!ray_triangle_intersect(P, dir, tmin, tmax, verts[0], verts[1], verts[2], &u, &v, &t)) {
return false;
}

View File

@@ -9,8 +9,12 @@ CCL_NAMESPACE_BEGIN
#ifdef __POINTCLOUD__
ccl_device_forceinline bool point_intersect_test(
const float4 point, const float3 P, const float3 dir, const float tmax, ccl_private float *t)
ccl_device_forceinline bool point_intersect_test(const float4 point,
const float3 P,
const float3 dir,
const float tmin,
const float tmax,
ccl_private float *t)
{
const float3 center = float4_to_float3(point);
const float radius = point.w;
@@ -28,12 +32,12 @@ ccl_device_forceinline bool point_intersect_test(
const float td = sqrt((r2 - l2) * rd2);
const float t_front = projC0 - td;
const bool valid_front = (0.0f <= t_front) & (t_front <= tmax);
const bool valid_front = (tmin <= t_front) & (t_front <= tmax);
/* Always back-face culling for now. */
# if 0
const float t_back = projC0 + td;
const bool valid_back = (0.0f <= t_back) & (t_back <= tmax);
const bool valid_back = (tmin <= t_back) & (t_back <= tmax);
/* check if there is a first hit */
const bool valid_first = valid_front | valid_back;
@@ -56,6 +60,7 @@ ccl_device_forceinline bool point_intersect(KernelGlobals kg,
ccl_private Intersection *isect,
const float3 P,
const float3 dir,
const float tmin,
const float tmax,
const int object,
const int prim,
@@ -65,7 +70,7 @@ ccl_device_forceinline bool point_intersect(KernelGlobals kg,
const float4 point = (type & PRIMITIVE_MOTION) ? motion_point(kg, object, prim, time) :
kernel_data_fetch(points, prim);
if (!point_intersect_test(point, P, dir, tmax, &isect->t)) {
if (!point_intersect_test(point, P, dir, tmin, tmax, &isect->t)) {
return false;
}

View File

@@ -407,7 +407,7 @@ ccl_device_inline void shader_setup_from_volume(KernelGlobals kg,
{
/* vectors */
sd->P = ray->P;
sd->P = ray->P + ray->D * ray->tmin;
sd->N = -ray->D;
sd->Ng = -ray->D;
sd->I = -ray->D;
@@ -441,7 +441,6 @@ ccl_device_inline void shader_setup_from_volume(KernelGlobals kg,
/* for NDC coordinates */
sd->ray_P = ray->P;
sd->ray_dP = ray->dP;
}
#endif /* __VOLUME__ */

View File

@@ -17,6 +17,7 @@ ccl_device_inline bool triangle_intersect(KernelGlobals kg,
ccl_private Intersection *isect,
float3 P,
float3 dir,
float tmin,
float tmax,
uint visibility,
int object,
@@ -28,7 +29,7 @@ ccl_device_inline bool triangle_intersect(KernelGlobals kg,
tri_b = kernel_data_fetch(tri_verts, tri_vindex + 1),
tri_c = kernel_data_fetch(tri_verts, tri_vindex + 2);
float t, u, v;
if (ray_triangle_intersect(P, dir, tmax, tri_a, tri_b, tri_c, &u, &v, &t)) {
if (ray_triangle_intersect(P, dir, tmin, tmax, tri_a, tri_b, tri_c, &u, &v, &t)) {
#ifdef __VISIBILITY_FLAG__
/* Visibility flag test. we do it here under the assumption
* that most triangles are culled by node flags.
@@ -62,6 +63,7 @@ ccl_device_inline bool triangle_intersect_local(KernelGlobals kg,
int object,
int prim,
int prim_addr,
float tmin,
float tmax,
ccl_private uint *lcg_state,
int max_hits)
@@ -71,7 +73,7 @@ ccl_device_inline bool triangle_intersect_local(KernelGlobals kg,
tri_b = kernel_data_fetch(tri_verts, tri_vindex + 1),
tri_c = kernel_data_fetch(tri_verts, tri_vindex + 2);
float t, u, v;
if (!ray_triangle_intersect(P, dir, tmax, tri_a, tri_b, tri_c, &u, &v, &t)) {
if (!ray_triangle_intersect(P, dir, tmin, tmax, tri_a, tri_b, tri_c, &u, &v, &t)) {
return false;
}

View File

@@ -174,14 +174,15 @@ ccl_device bool integrator_init_from_bake(KernelGlobals kg,
Ray ray ccl_optional_struct_init;
ray.P = zero_float3();
ray.D = normalize(P);
ray.t = FLT_MAX;
ray.tmin = 0.0f;
ray.tmax = FLT_MAX;
ray.time = 0.5f;
ray.dP = differential_zero_compact();
ray.dD = differential_zero_compact();
integrator_state_write_ray(kg, state, &ray);
/* Setup next kernel to execute. */
INTEGRATOR_PATH_INIT(DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND);
integrator_path_init(kg, state, DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND);
}
else {
/* Surface baking. */
@@ -210,7 +211,8 @@ ccl_device bool integrator_init_from_bake(KernelGlobals kg,
Ray ray ccl_optional_struct_init;
ray.P = P + N;
ray.D = -N;
ray.t = FLT_MAX;
ray.tmin = 0.0f;
ray.tmax = FLT_MAX;
ray.time = 0.5f;
/* Setup differentials. */
@@ -247,13 +249,15 @@ ccl_device bool integrator_init_from_bake(KernelGlobals kg,
const bool use_raytrace_kernel = (shader_flags & SD_HAS_RAYTRACE);
if (use_caustics) {
INTEGRATOR_PATH_INIT_SORTED(DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE, shader_index);
integrator_path_init_sorted(
kg, state, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE, shader_index);
}
else if (use_raytrace_kernel) {
INTEGRATOR_PATH_INIT_SORTED(DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE, shader_index);
integrator_path_init_sorted(
kg, state, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE, shader_index);
}
else {
INTEGRATOR_PATH_INIT_SORTED(DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE, shader_index);
integrator_path_init_sorted(kg, state, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE, shader_index);
}
}

View File

@@ -86,7 +86,7 @@ ccl_device bool integrator_init_from_camera(KernelGlobals kg,
/* Generate camera ray. */
Ray ray;
integrate_camera_sample(kg, sample, x, y, rng_hash, &ray);
if (ray.t == 0.0f) {
if (ray.tmax == 0.0f) {
return true;
}
@@ -100,10 +100,10 @@ ccl_device bool integrator_init_from_camera(KernelGlobals kg,
/* Continue with intersect_closest kernel, optionally initializing volume
* stack before that if the camera may be inside a volume. */
if (kernel_data.cam.is_inside_volume) {
INTEGRATOR_PATH_INIT(DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK);
integrator_path_init(kg, state, DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK);
}
else {
INTEGRATOR_PATH_INIT(DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST);
integrator_path_init(kg, state, DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST);
}
return true;

View File

@@ -109,14 +109,14 @@ ccl_device_forceinline void integrator_split_shadow_catcher(
/* If using background pass, schedule background shading kernel so that we have a background
* to alpha-over on. The background kernel will then continue the path afterwards. */
INTEGRATOR_STATE_WRITE(state, path, flag) |= PATH_RAY_SHADOW_CATCHER_BACKGROUND;
INTEGRATOR_PATH_INIT(DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND);
integrator_path_init(kg, state, DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND);
return;
}
if (!integrator_state_volume_stack_is_empty(kg, state)) {
/* Volume stack is not empty. Re-init the volume stack to exclude any non-shadow catcher
* objects from it, and then continue shading volume and shadow catcher surface after. */
INTEGRATOR_PATH_INIT(DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK);
integrator_path_init(kg, state, DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK);
return;
}
@@ -128,18 +128,19 @@ ccl_device_forceinline void integrator_split_shadow_catcher(
const bool use_raytrace_kernel = (flags & SD_HAS_RAYTRACE);
if (use_caustics) {
INTEGRATOR_PATH_INIT_SORTED(DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE, shader);
integrator_path_init_sorted(kg, state, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE, shader);
}
else if (use_raytrace_kernel) {
INTEGRATOR_PATH_INIT_SORTED(DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE, shader);
integrator_path_init_sorted(
kg, state, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE, shader);
}
else {
INTEGRATOR_PATH_INIT_SORTED(DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE, shader);
integrator_path_init_sorted(kg, state, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE, shader);
}
}
/* Schedule next kernel to be executed after updating volume stack for shadow catcher. */
template<uint32_t current_kernel>
template<DeviceKernel current_kernel>
ccl_device_forceinline void integrator_intersect_next_kernel_after_shadow_catcher_volume(
KernelGlobals kg, IntegratorState state)
{
@@ -156,20 +157,21 @@ ccl_device_forceinline void integrator_intersect_next_kernel_after_shadow_catche
const bool use_raytrace_kernel = (flags & SD_HAS_RAYTRACE);
if (use_caustics) {
INTEGRATOR_PATH_NEXT_SORTED(
current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE, shader);
integrator_path_next_sorted(
kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE, shader);
}
else if (use_raytrace_kernel) {
INTEGRATOR_PATH_NEXT_SORTED(
current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE, shader);
integrator_path_next_sorted(
kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE, shader);
}
else {
INTEGRATOR_PATH_NEXT_SORTED(current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE, shader);
integrator_path_next_sorted(
kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE, shader);
}
}
/* Schedule next kernel to be executed after executing background shader for shadow catcher. */
template<uint32_t current_kernel>
template<DeviceKernel current_kernel>
ccl_device_forceinline void integrator_intersect_next_kernel_after_shadow_catcher_background(
KernelGlobals kg, IntegratorState state)
{
@@ -177,7 +179,8 @@ ccl_device_forceinline void integrator_intersect_next_kernel_after_shadow_catche
if (!integrator_state_volume_stack_is_empty(kg, state)) {
/* Volume stack is not empty. Re-init the volume stack to exclude any non-shadow catcher
* objects from it, and then continue shading volume and shadow catcher surface after. */
INTEGRATOR_PATH_NEXT(current_kernel, DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK);
integrator_path_next(
kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK);
return;
}
@@ -190,7 +193,7 @@ ccl_device_forceinline void integrator_intersect_next_kernel_after_shadow_catche
*
* Note that current_kernel is a template value since making this a variable
* leads to poor performance with CUDA atomics. */
template<uint32_t current_kernel>
template<DeviceKernel current_kernel>
ccl_device_forceinline void integrator_intersect_next_kernel(
KernelGlobals kg,
IntegratorState state,
@@ -206,10 +209,10 @@ ccl_device_forceinline void integrator_intersect_next_kernel(
const int flags = (hit_surface) ? kernel_data_fetch(shaders, shader).flags : 0;
if (!integrator_intersect_terminate(kg, state, flags)) {
INTEGRATOR_PATH_NEXT(current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_VOLUME);
integrator_path_next(kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_VOLUME);
}
else {
INTEGRATOR_PATH_TERMINATE(current_kernel);
integrator_path_terminate(kg, state, current_kernel);
}
return;
}
@@ -218,7 +221,7 @@ ccl_device_forceinline void integrator_intersect_next_kernel(
if (hit) {
/* Hit a surface, continue with light or surface kernel. */
if (isect->type & PRIMITIVE_LAMP) {
INTEGRATOR_PATH_NEXT(current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_LIGHT);
integrator_path_next(kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_LIGHT);
}
else {
/* Hit a surface, continue with surface kernel unless terminated. */
@@ -231,16 +234,16 @@ ccl_device_forceinline void integrator_intersect_next_kernel(
(object_flags & SD_OBJECT_CAUSTICS);
const bool use_raytrace_kernel = (flags & SD_HAS_RAYTRACE);
if (use_caustics) {
INTEGRATOR_PATH_NEXT_SORTED(
current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE, shader);
integrator_path_next_sorted(
kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE, shader);
}
else if (use_raytrace_kernel) {
INTEGRATOR_PATH_NEXT_SORTED(
current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE, shader);
integrator_path_next_sorted(
kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE, shader);
}
else {
INTEGRATOR_PATH_NEXT_SORTED(
current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE, shader);
integrator_path_next_sorted(
kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE, shader);
}
#ifdef __SHADOW_CATCHER__
@@ -249,13 +252,13 @@ ccl_device_forceinline void integrator_intersect_next_kernel(
#endif
}
else {
INTEGRATOR_PATH_TERMINATE(current_kernel);
integrator_path_terminate(kg, state, current_kernel);
}
}
}
else {
/* Nothing hit, continue with background kernel. */
INTEGRATOR_PATH_NEXT(current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND);
integrator_path_next(kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND);
}
}
@@ -263,7 +266,7 @@ ccl_device_forceinline void integrator_intersect_next_kernel(
*
* The logic here matches integrator_intersect_next_kernel, except that
* volume shading and termination testing have already been done. */
template<uint32_t current_kernel>
template<DeviceKernel current_kernel>
ccl_device_forceinline void integrator_intersect_next_kernel_after_volume(
KernelGlobals kg,
IntegratorState state,
@@ -273,7 +276,7 @@ ccl_device_forceinline void integrator_intersect_next_kernel_after_volume(
if (isect->prim != PRIM_NONE) {
/* Hit a surface, continue with light or surface kernel. */
if (isect->type & PRIMITIVE_LAMP) {
INTEGRATOR_PATH_NEXT(current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_LIGHT);
integrator_path_next(kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_LIGHT);
return;
}
else {
@@ -286,16 +289,16 @@ ccl_device_forceinline void integrator_intersect_next_kernel_after_volume(
const bool use_raytrace_kernel = (flags & SD_HAS_RAYTRACE);
if (use_caustics) {
INTEGRATOR_PATH_NEXT_SORTED(
current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE, shader);
integrator_path_next_sorted(
kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE, shader);
}
else if (use_raytrace_kernel) {
INTEGRATOR_PATH_NEXT_SORTED(
current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE, shader);
integrator_path_next_sorted(
kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE, shader);
}
else {
INTEGRATOR_PATH_NEXT_SORTED(
current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE, shader);
integrator_path_next_sorted(
kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE, shader);
}
#ifdef __SHADOW_CATCHER__
@@ -307,7 +310,7 @@ ccl_device_forceinline void integrator_intersect_next_kernel_after_volume(
}
else {
/* Nothing hit, continue with background kernel. */
INTEGRATOR_PATH_NEXT(current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND);
integrator_path_next(kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND);
return;
}
}
@@ -321,7 +324,7 @@ ccl_device void integrator_intersect_closest(KernelGlobals kg,
/* Read ray from integrator state into local memory. */
Ray ray ccl_optional_struct_init;
integrator_state_read_ray(kg, state, &ray);
kernel_assert(ray.t != 0.0f);
kernel_assert(ray.tmax != 0.0f);
const uint visibility = path_state_ray_visibility(state);
const int last_isect_prim = INTEGRATOR_STATE(state, isect, prim);
@@ -329,12 +332,12 @@ ccl_device void integrator_intersect_closest(KernelGlobals kg,
/* Trick to use short AO rays to approximate indirect light at the end of the path. */
if (path_state_ao_bounce(kg, state)) {
ray.t = kernel_data.integrator.ao_bounces_distance;
ray.tmax = kernel_data.integrator.ao_bounces_distance;
if (last_isect_object != OBJECT_NONE) {
const float object_ao_distance = kernel_data_fetch(objects, last_isect_object).ao_distance;
if (object_ao_distance != 0.0f) {
ray.t = object_ao_distance;
ray.tmax = object_ao_distance;
}
}
}

View File

@@ -162,7 +162,7 @@ ccl_device void integrator_intersect_shadow(KernelGlobals kg, IntegratorShadowSt
if (opaque_hit) {
/* Hit an opaque surface, shadow path ends here. */
INTEGRATOR_SHADOW_PATH_TERMINATE(DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW);
integrator_shadow_path_terminate(kg, state, DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW);
return;
}
else {
@@ -171,7 +171,9 @@ ccl_device void integrator_intersect_shadow(KernelGlobals kg, IntegratorShadowSt
*
* TODO: could also write to render buffer directly if no transparent shadows?
* Could save a kernel execution for the common case. */
INTEGRATOR_SHADOW_PATH_NEXT(DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW,
integrator_shadow_path_next(kg,
state,
DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW,
DEVICE_KERNEL_INTEGRATOR_SHADE_SHADOW);
return;
}

View File

@@ -17,7 +17,7 @@ ccl_device void integrator_intersect_subsurface(KernelGlobals kg, IntegratorStat
}
#endif
INTEGRATOR_PATH_TERMINATE(DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE);
integrator_path_terminate(kg, state, DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE);
}
CCL_NAMESPACE_END

View File

@@ -24,7 +24,8 @@ ccl_device void integrator_volume_stack_update_for_subsurface(KernelGlobals kg,
Ray volume_ray ccl_optional_struct_init;
volume_ray.P = from_P;
volume_ray.D = normalize_len(to_P - from_P, &volume_ray.t);
volume_ray.D = normalize_len(to_P - from_P, &volume_ray.tmax);
volume_ray.tmin = 0.0f;
volume_ray.self.object = INTEGRATOR_STATE(state, isect, object);
volume_ray.self.prim = INTEGRATOR_STATE(state, isect, prim);
volume_ray.self.light_object = OBJECT_NONE;
@@ -58,12 +59,9 @@ ccl_device void integrator_volume_stack_update_for_subsurface(KernelGlobals kg,
volume_stack_enter_exit(kg, state, stack_sd);
/* Move ray forward. */
volume_ray.P = stack_sd->P;
volume_ray.tmin = intersection_t_offset(isect.t);
volume_ray.self.object = isect.object;
volume_ray.self.prim = isect.prim;
if (volume_ray.t != FLT_MAX) {
volume_ray.D = normalize_len(to_P - volume_ray.P, &volume_ray.t);
}
++step;
}
#endif
@@ -82,7 +80,8 @@ ccl_device void integrator_volume_stack_init(KernelGlobals kg, IntegratorState s
/* Trace ray in random direction. Any direction works, Z up is a guess to get the
* fewest hits. */
volume_ray.D = make_float3(0.0f, 0.0f, 1.0f);
volume_ray.t = FLT_MAX;
volume_ray.tmin = 0.0f;
volume_ray.tmax = FLT_MAX;
volume_ray.self.object = OBJECT_NONE;
volume_ray.self.prim = PRIM_NONE;
volume_ray.self.light_object = OBJECT_NONE;
@@ -199,7 +198,7 @@ ccl_device void integrator_volume_stack_init(KernelGlobals kg, IntegratorState s
}
/* Move ray forward. */
volume_ray.P = stack_sd->P;
volume_ray.tmin = intersection_t_offset(isect.t);
volume_ray.self.object = isect.object;
volume_ray.self.prim = isect.prim;
++step;
@@ -222,7 +221,9 @@ ccl_device void integrator_intersect_volume_stack(KernelGlobals kg, IntegratorSt
}
else {
/* Volume stack init for camera rays, continue with intersection of camera ray. */
INTEGRATOR_PATH_NEXT(DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK,
integrator_path_next(kg,
state,
DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK,
DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST);
}
}

View File

@@ -137,8 +137,14 @@ ccl_device_forceinline void mnee_update_light_sample(KernelGlobals kg,
}
}
else if (ls->type == LIGHT_AREA) {
float invarea = fabsf(klight->area.invarea);
ls->D = normalize_len(ls->P - P, &ls->t);
ls->pdf = fabsf(klight->area.invarea);
ls->pdf = invarea;
if (klight->area.tan_spread > 0.f) {
ls->eval_fac = 0.25f * invarea;
ls->eval_fac *= light_spread_attenuation(
ls->D, ls->Ng, klight->area.tan_spread, klight->area.normalize_spread);
}
}
ls->pdf *= kernel_data.integrator.pdf_lights;
@@ -436,6 +442,7 @@ ccl_device_forceinline bool mnee_newton_solver(KernelGlobals kg,
projection_ray.self.light_prim = PRIM_NONE;
projection_ray.dP = differential_make_compact(sd->dP);
projection_ray.dD = differential_zero_compact();
projection_ray.tmin = 0.0f;
projection_ray.time = sd->time;
Intersection projection_isect;
@@ -499,8 +506,8 @@ ccl_device_forceinline bool mnee_newton_solver(KernelGlobals kg,
projection_ray.self.prim = pv.prim;
projection_ray.P = pv.p;
}
projection_ray.D = normalize_len(tentative_p - projection_ray.P, &projection_ray.t);
projection_ray.t *= MNEE_PROJECTION_DISTANCE_MULTIPLIER;
projection_ray.D = normalize_len(tentative_p - projection_ray.P, &projection_ray.tmax);
projection_ray.tmax *= MNEE_PROJECTION_DISTANCE_MULTIPLIER;
bool projection_success = false;
for (int isect_count = 0; isect_count < MNEE_MAX_INTERSECTION_COUNT; isect_count++) {
@@ -519,8 +526,7 @@ ccl_device_forceinline bool mnee_newton_solver(KernelGlobals kg,
projection_ray.self.object = projection_isect.object;
projection_ray.self.prim = projection_isect.prim;
projection_ray.P += projection_isect.t * projection_ray.D;
projection_ray.t -= projection_isect.t;
projection_ray.tmin = intersection_t_offset(projection_isect.t);
}
if (!projection_success) {
reduce_stepsize = true;
@@ -852,6 +858,7 @@ ccl_device_forceinline bool mnee_path_contribution(KernelGlobals kg,
Ray probe_ray;
probe_ray.self.light_object = ls->object;
probe_ray.self.light_prim = ls->prim;
probe_ray.tmin = 0.0f;
probe_ray.dP = differential_make_compact(sd->dP);
probe_ray.dD = differential_zero_compact();
probe_ray.time = sd->time;
@@ -867,13 +874,13 @@ ccl_device_forceinline bool mnee_path_contribution(KernelGlobals kg,
ccl_private const ManifoldVertex &v = vertices[vi];
/* Check visibility. */
probe_ray.D = normalize_len(v.p - probe_ray.P, &probe_ray.t);
probe_ray.D = normalize_len(v.p - probe_ray.P, &probe_ray.tmax);
if (scene_intersect(kg, &probe_ray, PATH_RAY_TRANSMIT, &probe_isect)) {
int hit_object = (probe_isect.object == OBJECT_NONE) ?
kernel_data_fetch(prim_object, probe_isect.prim) :
probe_isect.object;
/* Test whether the ray hit the appropriate object at its intended location. */
if (hit_object != v.object || fabsf(probe_ray.t - probe_isect.t) > MNEE_MIN_DISTANCE)
if (hit_object != v.object || fabsf(probe_ray.tmax - probe_isect.t) > MNEE_MIN_DISTANCE)
return false;
}
probe_ray.self.object = v.object;
@@ -952,15 +959,16 @@ ccl_device_forceinline int kernel_path_mnee_sample(KernelGlobals kg,
probe_ray.self.light_object = ls->object;
probe_ray.self.light_prim = ls->prim;
probe_ray.P = sd->P;
probe_ray.tmin = 0.0f;
if (ls->t == FLT_MAX) {
/* Distant / env light. */
probe_ray.D = ls->D;
probe_ray.t = ls->t;
probe_ray.tmax = ls->t;
}
else {
/* Other lights, avoid self-intersection. */
probe_ray.D = ls->P - probe_ray.P;
probe_ray.D = normalize_len(probe_ray.D, &probe_ray.t);
probe_ray.D = normalize_len(probe_ray.D, &probe_ray.tmax);
}
probe_ray.dP = differential_make_compact(sd->dP);
probe_ray.dD = differential_zero_compact();
@@ -1042,9 +1050,7 @@ ccl_device_forceinline int kernel_path_mnee_sample(KernelGlobals kg,
probe_ray.self.object = probe_isect.object;
probe_ray.self.prim = probe_isect.prim;
probe_ray.P += probe_isect.t * probe_ray.D;
if (ls->t != FLT_MAX)
probe_ray.t -= probe_isect.t;
probe_ray.tmin = intersection_t_offset(probe_isect.t);
};
/* Mark the manifold walk invalid to keep mollification on by default. */

View File

@@ -52,7 +52,6 @@ ccl_device_inline void path_state_init_integrator(KernelGlobals kg,
INTEGRATOR_STATE_WRITE(state, path, flag) = PATH_RAY_CAMERA | PATH_RAY_MIS_SKIP |
PATH_RAY_TRANSPARENT_BACKGROUND;
INTEGRATOR_STATE_WRITE(state, path, mis_ray_pdf) = 0.0f;
INTEGRATOR_STATE_WRITE(state, path, mis_ray_t) = 0.0f;
INTEGRATOR_STATE_WRITE(state, path, min_ray_pdf) = FLT_MAX;
INTEGRATOR_STATE_WRITE(state, path, continuation_probability) = 1.0f;
INTEGRATOR_STATE_WRITE(state, path, throughput) = make_float3(1.0f, 1.0f, 1.0f);

View File

@@ -62,11 +62,10 @@ ccl_device float3 integrator_eval_background_shader(KernelGlobals kg,
const float3 ray_P = INTEGRATOR_STATE(state, ray, P);
const float3 ray_D = INTEGRATOR_STATE(state, ray, D);
const float mis_ray_pdf = INTEGRATOR_STATE(state, path, mis_ray_pdf);
const float mis_ray_t = INTEGRATOR_STATE(state, path, mis_ray_t);
/* multiple importance sampling, get background light pdf for ray
* direction, and compute weight with respect to BSDF pdf */
const float pdf = background_light_pdf(kg, ray_P - ray_D * mis_ray_t, ray_D);
const float pdf = background_light_pdf(kg, ray_P, ray_D);
const float mis_weight = light_sample_mis_weight_forward(kg, mis_ray_pdf, pdf);
L *= mis_weight;
}
@@ -213,7 +212,7 @@ ccl_device void integrator_shade_background(KernelGlobals kg,
}
#endif
INTEGRATOR_PATH_TERMINATE(DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND);
integrator_path_terminate(kg, state, DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND);
}
CCL_NAMESPACE_END

View File

@@ -22,19 +22,8 @@ ccl_device_inline void integrate_light(KernelGlobals kg,
const float3 ray_D = INTEGRATOR_STATE(state, ray, D);
const float ray_time = INTEGRATOR_STATE(state, ray, time);
/* Advance ray beyond light. */
/* TODO: can we make this more numerically robust to avoid reintersecting the
* same light in some cases? Ray should not intersect surface anymore as the
* object and prim ids will prevent self intersection. */
const float3 new_ray_P = ray_P + ray_D * isect.t;
INTEGRATOR_STATE_WRITE(state, ray, P) = new_ray_P;
INTEGRATOR_STATE_WRITE(state, ray, t) -= isect.t;
/* Set position to where the BSDF was sampled, for correct MIS PDF. */
const float mis_ray_t = INTEGRATOR_STATE(state, path, mis_ray_t);
ray_P -= ray_D * mis_ray_t;
isect.t += mis_ray_t;
INTEGRATOR_STATE_WRITE(state, path, mis_ray_t) = isect.t;
/* Advance ray to new start distance. */
INTEGRATOR_STATE_WRITE(state, ray, tmin) = intersection_t_offset(isect.t);
LightSample ls ccl_optional_struct_init;
const bool use_light_sample = light_sample_from_intersection(kg, &isect, ray_P, ray_D, &ls);
@@ -99,11 +88,13 @@ ccl_device void integrator_shade_light(KernelGlobals kg,
INTEGRATOR_STATE_WRITE(state, path, transparent_bounce) = transparent_bounce;
if (transparent_bounce >= kernel_data.integrator.transparent_max_bounce) {
INTEGRATOR_PATH_TERMINATE(DEVICE_KERNEL_INTEGRATOR_SHADE_LIGHT);
integrator_path_terminate(kg, state, DEVICE_KERNEL_INTEGRATOR_SHADE_LIGHT);
return;
}
else {
INTEGRATOR_PATH_NEXT(DEVICE_KERNEL_INTEGRATOR_SHADE_LIGHT,
integrator_path_next(kg,
state,
DEVICE_KERNEL_INTEGRATOR_SHADE_LIGHT,
DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST);
return;
}

View File

@@ -75,13 +75,9 @@ ccl_device_inline void integrate_transparent_volume_shadow(KernelGlobals kg,
ray.self.light_object = OBJECT_NONE;
ray.self.light_prim = PRIM_NONE;
/* Modify ray position and length to match current segment. */
const float start_t = (hit == 0) ? 0.0f :
INTEGRATOR_STATE_ARRAY(state, shadow_isect, hit - 1, t);
const float end_t = (hit < num_recorded_hits) ?
INTEGRATOR_STATE_ARRAY(state, shadow_isect, hit, t) :
ray.t;
ray.P += start_t * ray.D;
ray.t = end_t - start_t;
ray.tmin = (hit == 0) ? ray.tmin : INTEGRATOR_STATE_ARRAY(state, shadow_isect, hit - 1, t);
ray.tmax = (hit < num_recorded_hits) ? INTEGRATOR_STATE_ARRAY(state, shadow_isect, hit, t) :
ray.tmax;
shader_setup_from_volume(kg, shadow_sd, &ray);
@@ -137,10 +133,7 @@ ccl_device_inline bool integrate_transparent_shadow(KernelGlobals kg,
/* There are more hits that we could not recorded due to memory usage,
* adjust ray to intersect again from the last hit. */
const float last_hit_t = INTEGRATOR_STATE_ARRAY(state, shadow_isect, num_recorded_hits - 1, t);
const float3 ray_P = INTEGRATOR_STATE(state, shadow_ray, P);
const float3 ray_D = INTEGRATOR_STATE(state, shadow_ray, D);
INTEGRATOR_STATE_WRITE(state, shadow_ray, P) = ray_P + last_hit_t * ray_D;
INTEGRATOR_STATE_WRITE(state, shadow_ray, t) -= last_hit_t;
INTEGRATOR_STATE_WRITE(state, shadow_ray, tmin) = intersection_t_offset(last_hit_t);
}
return false;
@@ -158,20 +151,22 @@ ccl_device void integrator_shade_shadow(KernelGlobals kg,
/* Evaluate transparent shadows. */
const bool opaque = integrate_transparent_shadow(kg, state, num_hits);
if (opaque) {
INTEGRATOR_SHADOW_PATH_TERMINATE(DEVICE_KERNEL_INTEGRATOR_SHADE_SHADOW);
integrator_shadow_path_terminate(kg, state, DEVICE_KERNEL_INTEGRATOR_SHADE_SHADOW);
return;
}
#endif
if (shadow_intersections_has_remaining(num_hits)) {
/* More intersections to find, continue shadow ray. */
INTEGRATOR_SHADOW_PATH_NEXT(DEVICE_KERNEL_INTEGRATOR_SHADE_SHADOW,
integrator_shadow_path_next(kg,
state,
DEVICE_KERNEL_INTEGRATOR_SHADE_SHADOW,
DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW);
return;
}
else {
kernel_accum_light(kg, state, render_buffer);
INTEGRATOR_SHADOW_PATH_TERMINATE(DEVICE_KERNEL_INTEGRATOR_SHADE_SHADOW);
integrator_shadow_path_terminate(kg, state, DEVICE_KERNEL_INTEGRATOR_SHADE_SHADOW);
return;
}
}

View File

@@ -77,7 +77,7 @@ ccl_device_forceinline void integrate_surface_emission(KernelGlobals kg,
# endif
{
const float bsdf_pdf = INTEGRATOR_STATE(state, path, mis_ray_pdf);
const float t = sd->ray_length + INTEGRATOR_STATE(state, path, mis_ray_t);
const float t = sd->ray_length;
/* Multiple importance sampling, get triangle light pdf,
* and compute weight with respect to BSDF pdf. */
@@ -190,8 +190,8 @@ ccl_device_forceinline void integrate_surface_direct_light(KernelGlobals kg,
const bool is_light = light_sample_is_light(&ls);
/* Branch off shadow kernel. */
INTEGRATOR_SHADOW_PATH_INIT(
shadow_state, state, DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW, shadow);
IntegratorShadowState shadow_state = integrator_shadow_path_init(
kg, state, DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW, false);
/* Copy volume stack and enter/exit volume. */
integrator_state_copy_volume_stack_to_shadow(kg, shadow_state, state);
@@ -323,16 +323,21 @@ ccl_device_forceinline int integrate_surface_bsdf_bssrdf_bounce(
return LABEL_NONE;
}
/* Setup ray. Note that clipping works through transparent bounces. */
INTEGRATOR_STATE_WRITE(state, ray, P) = sd->P;
INTEGRATOR_STATE_WRITE(state, ray, D) = normalize(bsdf_omega_in);
INTEGRATOR_STATE_WRITE(state, ray, t) = (label & LABEL_TRANSPARENT) ?
INTEGRATOR_STATE(state, ray, t) - sd->ray_length :
FLT_MAX;
if (label & LABEL_TRANSPARENT) {
/* Only need to modify start distance for transparent. */
INTEGRATOR_STATE_WRITE(state, ray, tmin) = intersection_t_offset(sd->ray_length);
}
else {
/* Setup ray with changed origin and direction. */
INTEGRATOR_STATE_WRITE(state, ray, P) = sd->P;
INTEGRATOR_STATE_WRITE(state, ray, D) = normalize(bsdf_omega_in);
INTEGRATOR_STATE_WRITE(state, ray, tmin) = 0.0f;
INTEGRATOR_STATE_WRITE(state, ray, tmax) = FLT_MAX;
#ifdef __RAY_DIFFERENTIALS__
INTEGRATOR_STATE_WRITE(state, ray, dP) = differential_make_compact(sd->dP);
INTEGRATOR_STATE_WRITE(state, ray, dD) = differential_make_compact(bsdf_domega_in);
INTEGRATOR_STATE_WRITE(state, ray, dP) = differential_make_compact(sd->dP);
INTEGRATOR_STATE_WRITE(state, ray, dD) = differential_make_compact(bsdf_domega_in);
#endif
}
/* Update throughput. */
float3 throughput = INTEGRATOR_STATE(state, path, throughput);
@@ -349,12 +354,8 @@ ccl_device_forceinline int integrate_surface_bsdf_bssrdf_bounce(
}
/* Update path state */
if (label & LABEL_TRANSPARENT) {
INTEGRATOR_STATE_WRITE(state, path, mis_ray_t) += sd->ray_length;
}
else {
if (!(label & LABEL_TRANSPARENT)) {
INTEGRATOR_STATE_WRITE(state, path, mis_ray_pdf) = bsdf_pdf;
INTEGRATOR_STATE_WRITE(state, path, mis_ray_t) = 0.0f;
INTEGRATOR_STATE_WRITE(state, path, min_ray_pdf) = fminf(
bsdf_pdf, INTEGRATOR_STATE(state, path, min_ray_pdf));
}
@@ -371,17 +372,8 @@ ccl_device_forceinline int integrate_surface_volume_only_bounce(IntegratorState
return LABEL_NONE;
}
/* Setup ray position, direction stays unchanged. */
INTEGRATOR_STATE_WRITE(state, ray, P) = sd->P;
/* Clipping works through transparent. */
INTEGRATOR_STATE_WRITE(state, ray, t) -= sd->ray_length;
# ifdef __RAY_DIFFERENTIALS__
INTEGRATOR_STATE_WRITE(state, ray, dP) = differential_make_compact(sd->dP);
# endif
INTEGRATOR_STATE_WRITE(state, path, mis_ray_t) += sd->ray_length;
/* Only modify start distance. */
INTEGRATOR_STATE_WRITE(state, ray, tmin) = intersection_t_offset(sd->ray_length);
return LABEL_TRANSMIT | LABEL_TRANSPARENT;
}
@@ -432,7 +424,8 @@ ccl_device_forceinline void integrate_surface_ao(KernelGlobals kg,
Ray ray ccl_optional_struct_init;
ray.P = shadow_ray_offset(kg, sd, ao_D, &skip_self);
ray.D = ao_D;
ray.t = kernel_data.integrator.ao_bounces_distance;
ray.tmin = 0.0f;
ray.tmax = kernel_data.integrator.ao_bounces_distance;
ray.time = sd->time;
ray.self.object = (skip_self) ? sd->object : OBJECT_NONE;
ray.self.prim = (skip_self) ? sd->prim : PRIM_NONE;
@@ -442,7 +435,8 @@ ccl_device_forceinline void integrate_surface_ao(KernelGlobals kg,
ray.dD = differential_zero_compact();
/* Branch off shadow kernel. */
INTEGRATOR_SHADOW_PATH_INIT(shadow_state, state, DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW, ao);
IntegratorShadowState shadow_state = integrator_shadow_path_init(
kg, state, DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW, true);
/* Copy volume stack and enter/exit volume. */
integrator_state_copy_volume_stack_to_shadow(kg, shadow_state, state);
@@ -604,22 +598,23 @@ ccl_device bool integrate_surface(KernelGlobals kg,
}
template<uint node_feature_mask = KERNEL_FEATURE_NODE_MASK_SURFACE & ~KERNEL_FEATURE_NODE_RAYTRACE,
int current_kernel = DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE>
DeviceKernel current_kernel = DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE>
ccl_device_forceinline void integrator_shade_surface(KernelGlobals kg,
IntegratorState state,
ccl_global float *ccl_restrict render_buffer)
{
if (integrate_surface<node_feature_mask>(kg, state, render_buffer)) {
if (INTEGRATOR_STATE(state, path, flag) & PATH_RAY_SUBSURFACE) {
INTEGRATOR_PATH_NEXT(current_kernel, DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE);
integrator_path_next(
kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE);
}
else {
kernel_assert(INTEGRATOR_STATE(state, ray, t) != 0.0f);
INTEGRATOR_PATH_NEXT(current_kernel, DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST);
kernel_assert(INTEGRATOR_STATE(state, ray, tmax) != 0.0f);
integrator_path_next(kg, state, current_kernel, DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST);
}
}
else {
INTEGRATOR_PATH_TERMINATE(current_kernel);
integrator_path_terminate(kg, state, current_kernel);
}
}

View File

@@ -114,7 +114,8 @@ ccl_device_inline bool volume_shader_sample(KernelGlobals kg,
ccl_device_forceinline void volume_step_init(KernelGlobals kg,
ccl_private const RNGState *rng_state,
const float object_step_size,
float t,
const float tmin,
const float tmax,
ccl_private float *step_size,
ccl_private float *step_shade_offset,
ccl_private float *steps_offset,
@@ -122,7 +123,7 @@ ccl_device_forceinline void volume_step_init(KernelGlobals kg,
{
if (object_step_size == FLT_MAX) {
/* Homogeneous volume. */
*step_size = t;
*step_size = tmax - tmin;
*step_shade_offset = 0.0f;
*steps_offset = 1.0f;
*max_steps = 1;
@@ -130,6 +131,7 @@ ccl_device_forceinline void volume_step_init(KernelGlobals kg,
else {
/* Heterogeneous volume. */
*max_steps = kernel_data.integrator.volume_max_steps;
const float t = tmax - tmin;
float step = min(object_step_size, t);
/* compute exact steps in advance for malloc */
@@ -165,7 +167,7 @@ ccl_device void volume_shadow_homogeneous(KernelGlobals kg, IntegratorState stat
float3 sigma_t = zero_float3();
if (shadow_volume_shader_sample(kg, state, sd, &sigma_t)) {
*throughput *= volume_color_transmittance(sigma_t, ray->t);
*throughput *= volume_color_transmittance(sigma_t, ray->tmax - ray->tmin);
}
}
# endif
@@ -194,7 +196,8 @@ ccl_device void volume_shadow_heterogeneous(KernelGlobals kg,
volume_step_init(kg,
&rng_state,
object_step_size,
ray->t,
ray->tmin,
ray->tmax,
&step_size,
&step_shade_offset,
&unused,
@@ -202,13 +205,13 @@ ccl_device void volume_shadow_heterogeneous(KernelGlobals kg,
const float steps_offset = 1.0f;
/* compute extinction at the start */
float t = 0.0f;
float t = ray->tmin;
float3 sum = zero_float3();
for (int i = 0; i < max_steps; i++) {
/* advance to new position */
float new_t = min(ray->t, (i + steps_offset) * step_size);
float new_t = min(ray->tmax, ray->tmin + (i + steps_offset) * step_size);
float dt = new_t - t;
float3 new_P = ray->P + ray->D * (t + dt * step_shade_offset);
@@ -233,7 +236,7 @@ ccl_device void volume_shadow_heterogeneous(KernelGlobals kg,
/* stop if at the end of the volume */
t = new_t;
if (t == ray->t) {
if (t == ray->tmax) {
/* Update throughput in case we haven't done it above */
tp = *throughput * exp(sum);
break;
@@ -257,15 +260,16 @@ ccl_device float volume_equiangular_sample(ccl_private const Ray *ccl_restrict r
const float xi,
ccl_private float *pdf)
{
const float t = ray->t;
const float tmin = ray->tmin;
const float tmax = ray->tmax;
const float delta = dot((light_P - ray->P), ray->D);
const float D = safe_sqrtf(len_squared(light_P - ray->P) - delta * delta);
if (UNLIKELY(D == 0.0f)) {
*pdf = 0.0f;
return 0.0f;
}
const float theta_a = -atan2f(delta, D);
const float theta_b = atan2f(t - delta, D);
const float theta_a = atan2f(tmin - delta, D);
const float theta_b = atan2f(tmax - delta, D);
const float t_ = D * tanf((xi * theta_b) + (1 - xi) * theta_a);
if (UNLIKELY(theta_b == theta_a)) {
*pdf = 0.0f;
@@ -273,7 +277,7 @@ ccl_device float volume_equiangular_sample(ccl_private const Ray *ccl_restrict r
}
*pdf = D / ((theta_b - theta_a) * (D * D + t_ * t_));
return min(t, delta + t_); /* min is only for float precision errors */
return clamp(delta + t_, tmin, tmax); /* clamp is only for float precision errors */
}
ccl_device float volume_equiangular_pdf(ccl_private const Ray *ccl_restrict ray,
@@ -286,11 +290,12 @@ ccl_device float volume_equiangular_pdf(ccl_private const Ray *ccl_restrict ray,
return 0.0f;
}
const float t = ray->t;
const float tmin = ray->tmin;
const float tmax = ray->tmax;
const float t_ = sample_t - delta;
const float theta_a = -atan2f(delta, D);
const float theta_b = atan2f(t - delta, D);
const float theta_a = atan2f(tmin - delta, D);
const float theta_b = atan2f(tmax - delta, D);
if (UNLIKELY(theta_b == theta_a)) {
return 0.0f;
}
@@ -310,11 +315,12 @@ ccl_device float volume_equiangular_cdf(ccl_private const Ray *ccl_restrict ray,
return 0.0f;
}
const float t = ray->t;
const float tmin = ray->tmin;
const float tmax = ray->tmax;
const float t_ = sample_t - delta;
const float theta_a = -atan2f(delta, D);
const float theta_b = atan2f(t - delta, D);
const float theta_a = atan2f(tmin - delta, D);
const float theta_b = atan2f(tmax - delta, D);
if (UNLIKELY(theta_b == theta_a)) {
return 0.0f;
}
@@ -390,8 +396,8 @@ ccl_device float3 volume_emission_integrate(ccl_private VolumeShaderCoefficients
typedef struct VolumeIntegrateState {
/* Volume segment extents. */
float start_t;
float end_t;
float tmin;
float tmax;
/* If volume is absorption-only up to this point, and no probabilistic
* scattering or termination has been used yet. */
@@ -426,9 +432,9 @@ ccl_device_forceinline void volume_integrate_step_scattering(
/* Equiangular sampling for direct lighting. */
if (vstate.direct_sample_method == VOLUME_SAMPLE_EQUIANGULAR && !result.direct_scatter) {
if (result.direct_t >= vstate.start_t && result.direct_t <= vstate.end_t &&
if (result.direct_t >= vstate.tmin && result.direct_t <= vstate.tmax &&
vstate.equiangular_pdf > VOLUME_SAMPLE_PDF_CUTOFF) {
const float new_dt = result.direct_t - vstate.start_t;
const float new_dt = result.direct_t - vstate.tmin;
const float3 new_transmittance = volume_color_transmittance(coeff.sigma_t, new_dt);
result.direct_scatter = true;
@@ -458,7 +464,7 @@ ccl_device_forceinline void volume_integrate_step_scattering(
/* compute sampling distance */
const float sample_sigma_t = volume_channel_get(coeff.sigma_t, channel);
const float new_dt = -logf(1.0f - vstate.rscatter) / sample_sigma_t;
const float new_t = vstate.start_t + new_dt;
const float new_t = vstate.tmin + new_dt;
/* transmittance and pdf */
const float3 new_transmittance = volume_color_transmittance(coeff.sigma_t, new_dt);
@@ -528,7 +534,8 @@ ccl_device_forceinline void volume_integrate_heterogeneous(
volume_step_init(kg,
rng_state,
object_step_size,
ray->t,
ray->tmin,
ray->tmax,
&step_size,
&step_shade_offset,
&steps_offset,
@@ -536,8 +543,8 @@ ccl_device_forceinline void volume_integrate_heterogeneous(
/* Initialize volume integration state. */
VolumeIntegrateState vstate ccl_optional_struct_init;
vstate.start_t = 0.0f;
vstate.end_t = 0.0f;
vstate.tmin = ray->tmin;
vstate.tmax = ray->tmin;
vstate.absorption_only = true;
vstate.rscatter = path_state_rng_1D(kg, rng_state, PRNG_SCATTER_DISTANCE);
vstate.rphase = path_state_rng_1D(kg, rng_state, PRNG_PHASE_CHANNEL);
@@ -578,8 +585,8 @@ ccl_device_forceinline void volume_integrate_heterogeneous(
for (int i = 0; i < max_steps; i++) {
/* Advance to new position */
vstate.end_t = min(ray->t, (i + steps_offset) * step_size);
const float shade_t = vstate.start_t + (vstate.end_t - vstate.start_t) * step_shade_offset;
vstate.tmax = min(ray->tmax, ray->tmin + (i + steps_offset) * step_size);
const float shade_t = vstate.tmin + (vstate.tmax - vstate.tmin) * step_shade_offset;
sd->P = ray->P + ray->D * shade_t;
/* compute segment */
@@ -588,7 +595,7 @@ ccl_device_forceinline void volume_integrate_heterogeneous(
const int closure_flag = sd->flag;
/* Evaluate transmittance over segment. */
const float dt = (vstate.end_t - vstate.start_t);
const float dt = (vstate.tmax - vstate.tmin);
const float3 transmittance = (closure_flag & SD_EXTINCTION) ?
volume_color_transmittance(coeff.sigma_t, dt) :
one_float3();
@@ -645,8 +652,8 @@ ccl_device_forceinline void volume_integrate_heterogeneous(
}
/* Stop if at the end of the volume. */
vstate.start_t = vstate.end_t;
if (vstate.start_t == ray->t) {
vstate.tmin = vstate.tmax;
if (vstate.tmin == ray->tmax) {
break;
}
}
@@ -774,8 +781,8 @@ ccl_device_forceinline void integrate_volume_direct_light(
const bool is_light = light_sample_is_light(ls);
/* Branch off shadow kernel. */
INTEGRATOR_SHADOW_PATH_INIT(
shadow_state, state, DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW, shadow);
IntegratorShadowState shadow_state = integrator_shadow_path_init(
kg, state, DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW, false);
/* Write shadow ray and associated state to global memory. */
integrator_state_write_shadow_ray(kg, shadow_state, &ray);
@@ -880,7 +887,8 @@ ccl_device_forceinline bool integrate_volume_phase_scatter(
/* Setup ray. */
INTEGRATOR_STATE_WRITE(state, ray, P) = sd->P;
INTEGRATOR_STATE_WRITE(state, ray, D) = normalize(phase_omega_in);
INTEGRATOR_STATE_WRITE(state, ray, t) = FLT_MAX;
INTEGRATOR_STATE_WRITE(state, ray, tmin) = 0.0f;
INTEGRATOR_STATE_WRITE(state, ray, tmax) = FLT_MAX;
# ifdef __RAY_DIFFERENTIALS__
INTEGRATOR_STATE_WRITE(state, ray, dP) = differential_make_compact(sd->dP);
INTEGRATOR_STATE_WRITE(state, ray, dD) = differential_make_compact(phase_domega_in);
@@ -901,7 +909,6 @@ ccl_device_forceinline bool integrate_volume_phase_scatter(
/* Update path state */
INTEGRATOR_STATE_WRITE(state, path, mis_ray_pdf) = phase_pdf;
INTEGRATOR_STATE_WRITE(state, path, mis_ray_t) = 0.0f;
INTEGRATOR_STATE_WRITE(state, path, min_ray_pdf) = fminf(
phase_pdf, INTEGRATOR_STATE(state, path, min_ray_pdf));
@@ -1021,7 +1028,7 @@ ccl_device void integrator_shade_volume(KernelGlobals kg,
integrator_state_read_isect(kg, state, &isect);
/* Set ray length to current segment. */
ray.t = (isect.prim != PRIM_NONE) ? isect.t : FLT_MAX;
ray.tmax = (isect.prim != PRIM_NONE) ? isect.t : FLT_MAX;
/* Clean volume stack for background rays. */
if (isect.prim == PRIM_NONE) {
@@ -1032,13 +1039,15 @@ ccl_device void integrator_shade_volume(KernelGlobals kg,
if (event == VOLUME_PATH_SCATTERED) {
/* Queue intersect_closest kernel. */
INTEGRATOR_PATH_NEXT(DEVICE_KERNEL_INTEGRATOR_SHADE_VOLUME,
integrator_path_next(kg,
state,
DEVICE_KERNEL_INTEGRATOR_SHADE_VOLUME,
DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST);
return;
}
else if (event == VOLUME_PATH_MISSED) {
/* End path. */
INTEGRATOR_PATH_TERMINATE(DEVICE_KERNEL_INTEGRATOR_SHADE_VOLUME);
integrator_path_terminate(kg, state, DEVICE_KERNEL_INTEGRATOR_SHADE_VOLUME);
return;
}
else {

View File

@@ -50,7 +50,7 @@ ccl_device_inline bool kernel_shadow_catcher_is_path_split_bounce(KernelGlobals
ccl_device_inline bool kernel_shadow_catcher_path_can_split(KernelGlobals kg,
ConstIntegratorState state)
{
if (INTEGRATOR_PATH_IS_TERMINATED) {
if (integrator_path_is_terminated(state)) {
return false;
}

View File

@@ -47,7 +47,8 @@ KERNEL_STRUCT_END(shadow_path)
KERNEL_STRUCT_BEGIN(shadow_ray)
KERNEL_STRUCT_MEMBER(shadow_ray, packed_float3, P, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(shadow_ray, packed_float3, D, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(shadow_ray, float, t, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(shadow_ray, float, tmin, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(shadow_ray, float, tmax, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(shadow_ray, float, time, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(shadow_ray, float, dP, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(shadow_ray, int, object, KERNEL_FEATURE_PATH_TRACING)

View File

@@ -127,6 +127,9 @@ typedef struct IntegratorStateGPU {
/* Index of main path which will be used by a next shadow catcher split. */
ccl_global int *next_main_path_index;
/* Divisor used to partition active indices by locality when sorting by material. */
uint sort_partition_divisor;
} IntegratorStateGPU;
/* Abstraction

View File

@@ -10,125 +10,196 @@ CCL_NAMESPACE_BEGIN
/* Control Flow
*
* Utilities for control flow between kernels. The implementation may differ per device
* or even be handled on the host side. To abstract such differences, experiment with
* different implementations and for debugging, this is abstracted using macros.
* Utilities for control flow between kernels. The implementation is different between CPU and
* GPU devices. For the latter part of the logic is handled on the host side with wavefronts.
*
* There is a main path for regular path tracing camera for path tracing. Shadows for next
* event estimation branch off from this into their own path, that may be computed in
* parallel while the main path continues.
* parallel while the main path continues. Additionally, shading kernels are sorted using
* a key for coherence.
*
* Each kernel on the main path must call one of these functions. These may not be called
* multiple times from the same kernel.
*
* INTEGRATOR_PATH_INIT(next_kernel)
* INTEGRATOR_PATH_NEXT(current_kernel, next_kernel)
* INTEGRATOR_PATH_TERMINATE(current_kernel)
* integrator_path_init(kg, state, next_kernel)
* integrator_path_next(kg, state, current_kernel, next_kernel)
* integrator_path_terminate(kg, state, current_kernel)
*
* For the shadow path similar functions are used, and again each shadow kernel must call
* one of them, and only once.
*/
#define INTEGRATOR_PATH_IS_TERMINATED (INTEGRATOR_STATE(state, path, queued_kernel) == 0)
#define INTEGRATOR_SHADOW_PATH_IS_TERMINATED \
(INTEGRATOR_STATE(state, shadow_path, queued_kernel) == 0)
ccl_device_forceinline bool integrator_path_is_terminated(ConstIntegratorState state)
{
return INTEGRATOR_STATE(state, path, queued_kernel) == 0;
}
ccl_device_forceinline bool integrator_shadow_path_is_terminated(ConstIntegratorShadowState state)
{
return INTEGRATOR_STATE(state, shadow_path, queued_kernel) == 0;
}
#ifdef __KERNEL_GPU__
# define INTEGRATOR_PATH_INIT(next_kernel) \
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], \
1); \
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
# define INTEGRATOR_PATH_NEXT(current_kernel, next_kernel) \
atomic_fetch_and_sub_uint32( \
&kernel_integrator_state.queue_counter->num_queued[current_kernel], 1); \
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], \
1); \
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
# define INTEGRATOR_PATH_TERMINATE(current_kernel) \
atomic_fetch_and_sub_uint32( \
&kernel_integrator_state.queue_counter->num_queued[current_kernel], 1); \
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = 0;
ccl_device_forceinline void integrator_path_init(KernelGlobals kg,
IntegratorState state,
const DeviceKernel next_kernel)
{
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], 1);
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
}
# define INTEGRATOR_SHADOW_PATH_INIT(shadow_state, state, next_kernel, shadow_type) \
IntegratorShadowState shadow_state = atomic_fetch_and_add_uint32( \
&kernel_integrator_state.next_shadow_path_index[0], 1); \
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], \
1); \
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, queued_kernel) = next_kernel;
# define INTEGRATOR_SHADOW_PATH_NEXT(current_kernel, next_kernel) \
atomic_fetch_and_sub_uint32( \
&kernel_integrator_state.queue_counter->num_queued[current_kernel], 1); \
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], \
1); \
INTEGRATOR_STATE_WRITE(state, shadow_path, queued_kernel) = next_kernel;
# define INTEGRATOR_SHADOW_PATH_TERMINATE(current_kernel) \
atomic_fetch_and_sub_uint32( \
&kernel_integrator_state.queue_counter->num_queued[current_kernel], 1); \
INTEGRATOR_STATE_WRITE(state, shadow_path, queued_kernel) = 0;
ccl_device_forceinline void integrator_path_next(KernelGlobals kg,
IntegratorState state,
const DeviceKernel current_kernel,
const DeviceKernel next_kernel)
{
atomic_fetch_and_sub_uint32(&kernel_integrator_state.queue_counter->num_queued[current_kernel],
1);
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], 1);
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
}
# define INTEGRATOR_PATH_INIT_SORTED(next_kernel, key) \
{ \
const int key_ = key; \
atomic_fetch_and_add_uint32( \
&kernel_integrator_state.queue_counter->num_queued[next_kernel], 1); \
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel; \
INTEGRATOR_STATE_WRITE(state, path, shader_sort_key) = key_; \
atomic_fetch_and_add_uint32(&kernel_integrator_state.sort_key_counter[next_kernel][key_], \
1); \
}
# define INTEGRATOR_PATH_NEXT_SORTED(current_kernel, next_kernel, key) \
{ \
const int key_ = key; \
atomic_fetch_and_sub_uint32( \
&kernel_integrator_state.queue_counter->num_queued[current_kernel], 1); \
atomic_fetch_and_add_uint32( \
&kernel_integrator_state.queue_counter->num_queued[next_kernel], 1); \
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel; \
INTEGRATOR_STATE_WRITE(state, path, shader_sort_key) = key_; \
atomic_fetch_and_add_uint32(&kernel_integrator_state.sort_key_counter[next_kernel][key_], \
1); \
}
ccl_device_forceinline void integrator_path_terminate(KernelGlobals kg,
IntegratorState state,
const DeviceKernel current_kernel)
{
atomic_fetch_and_sub_uint32(&kernel_integrator_state.queue_counter->num_queued[current_kernel],
1);
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = 0;
}
ccl_device_forceinline IntegratorShadowState integrator_shadow_path_init(
KernelGlobals kg, IntegratorState state, const DeviceKernel next_kernel, const bool is_ao)
{
IntegratorShadowState shadow_state = atomic_fetch_and_add_uint32(
&kernel_integrator_state.next_shadow_path_index[0], 1);
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], 1);
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, queued_kernel) = next_kernel;
return shadow_state;
}
ccl_device_forceinline void integrator_shadow_path_next(KernelGlobals kg,
IntegratorShadowState state,
const DeviceKernel current_kernel,
const DeviceKernel next_kernel)
{
atomic_fetch_and_sub_uint32(&kernel_integrator_state.queue_counter->num_queued[current_kernel],
1);
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], 1);
INTEGRATOR_STATE_WRITE(state, shadow_path, queued_kernel) = next_kernel;
}
ccl_device_forceinline void integrator_shadow_path_terminate(KernelGlobals kg,
IntegratorShadowState state,
const DeviceKernel current_kernel)
{
atomic_fetch_and_sub_uint32(&kernel_integrator_state.queue_counter->num_queued[current_kernel],
1);
INTEGRATOR_STATE_WRITE(state, shadow_path, queued_kernel) = 0;
}
/* Sort first by truncated state index (for good locality), then by key (for good coherence). */
# define INTEGRATOR_SORT_KEY(key, state) \
(key + kernel_data.max_shaders * (state / kernel_integrator_state.sort_partition_divisor))
ccl_device_forceinline void integrator_path_init_sorted(KernelGlobals kg,
IntegratorState state,
const DeviceKernel next_kernel,
const uint32_t key)
{
const int key_ = INTEGRATOR_SORT_KEY(key, state);
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], 1);
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
INTEGRATOR_STATE_WRITE(state, path, shader_sort_key) = key_;
atomic_fetch_and_add_uint32(&kernel_integrator_state.sort_key_counter[next_kernel][key_], 1);
}
ccl_device_forceinline void integrator_path_next_sorted(KernelGlobals kg,
IntegratorState state,
const DeviceKernel current_kernel,
const DeviceKernel next_kernel,
const uint32_t key)
{
const int key_ = INTEGRATOR_SORT_KEY(key, state);
atomic_fetch_and_sub_uint32(&kernel_integrator_state.queue_counter->num_queued[current_kernel],
1);
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], 1);
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
INTEGRATOR_STATE_WRITE(state, path, shader_sort_key) = key_;
atomic_fetch_and_add_uint32(&kernel_integrator_state.sort_key_counter[next_kernel][key_], 1);
}
#else
# define INTEGRATOR_PATH_INIT(next_kernel) \
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
# define INTEGRATOR_PATH_INIT_SORTED(next_kernel, key) \
{ \
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel; \
(void)key; \
}
# define INTEGRATOR_PATH_NEXT(current_kernel, next_kernel) \
{ \
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel; \
(void)current_kernel; \
}
# define INTEGRATOR_PATH_TERMINATE(current_kernel) \
{ \
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = 0; \
(void)current_kernel; \
}
# define INTEGRATOR_PATH_NEXT_SORTED(current_kernel, next_kernel, key) \
{ \
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel; \
(void)key; \
(void)current_kernel; \
}
ccl_device_forceinline void integrator_path_init(KernelGlobals kg,
IntegratorState state,
const DeviceKernel next_kernel)
{
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
}
# define INTEGRATOR_SHADOW_PATH_INIT(shadow_state, state, next_kernel, shadow_type) \
IntegratorShadowState shadow_state = &state->shadow_type; \
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, queued_kernel) = next_kernel;
# define INTEGRATOR_SHADOW_PATH_NEXT(current_kernel, next_kernel) \
{ \
INTEGRATOR_STATE_WRITE(state, shadow_path, queued_kernel) = next_kernel; \
(void)current_kernel; \
}
# define INTEGRATOR_SHADOW_PATH_TERMINATE(current_kernel) \
{ \
INTEGRATOR_STATE_WRITE(state, shadow_path, queued_kernel) = 0; \
(void)current_kernel; \
}
ccl_device_forceinline void integrator_path_init_sorted(KernelGlobals kg,
IntegratorState state,
const DeviceKernel next_kernel,
const uint32_t key)
{
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
(void)key;
}
ccl_device_forceinline void integrator_path_next(KernelGlobals kg,
IntegratorState state,
const DeviceKernel current_kernel,
const DeviceKernel next_kernel)
{
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
(void)current_kernel;
}
ccl_device_forceinline void integrator_path_terminate(KernelGlobals kg,
IntegratorState state,
const DeviceKernel current_kernel)
{
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = 0;
(void)current_kernel;
}
ccl_device_forceinline void integrator_path_next_sorted(KernelGlobals kg,
IntegratorState state,
const DeviceKernel current_kernel,
const DeviceKernel next_kernel,
const uint32_t key)
{
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
(void)key;
(void)current_kernel;
}
ccl_device_forceinline IntegratorShadowState integrator_shadow_path_init(
KernelGlobals kg, IntegratorState state, const DeviceKernel next_kernel, const bool is_ao)
{
IntegratorShadowState shadow_state = (is_ao) ? &state->ao : &state->shadow;
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, queued_kernel) = next_kernel;
return shadow_state;
}
ccl_device_forceinline void integrator_shadow_path_next(KernelGlobals kg,
IntegratorShadowState state,
const DeviceKernel current_kernel,
const DeviceKernel next_kernel)
{
INTEGRATOR_STATE_WRITE(state, shadow_path, queued_kernel) = next_kernel;
(void)current_kernel;
}
ccl_device_forceinline void integrator_shadow_path_terminate(KernelGlobals kg,
IntegratorShadowState state,
const DeviceKernel current_kernel)
{
INTEGRATOR_STATE_WRITE(state, shadow_path, queued_kernel) = 0;
(void)current_kernel;
}
#endif

View File

@@ -37,11 +37,10 @@ KERNEL_STRUCT_MEMBER(path, uint32_t, flag, KERNEL_FEATURE_PATH_TRACING)
/* enum PathRayMNEE */
KERNEL_STRUCT_MEMBER(path, uint8_t, mnee, KERNEL_FEATURE_PATH_TRACING)
/* Multiple importance sampling
* The PDF of BSDF sampling at the last scatter point, and distance to the
* last scatter point minus the last ray segment. This distance lets us
* compute the complete distance through transparent surfaces and volumes. */
* The PDF of BSDF sampling at the last scatter point, which is at ray distance
* zero and distance. Note that transparency and volume attenuation increase
* the ray tmin but keep P unmodified so that this works. */
KERNEL_STRUCT_MEMBER(path, float, mis_ray_pdf, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(path, float, mis_ray_t, KERNEL_FEATURE_PATH_TRACING)
/* Filter glossy. */
KERNEL_STRUCT_MEMBER(path, float, min_ray_pdf, KERNEL_FEATURE_PATH_TRACING)
/* Continuation probability for path termination. */
@@ -63,7 +62,8 @@ KERNEL_STRUCT_END(path)
KERNEL_STRUCT_BEGIN(ray)
KERNEL_STRUCT_MEMBER(ray, packed_float3, P, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(ray, packed_float3, D, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(ray, float, t, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(ray, float, tmin, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(ray, float, tmax, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(ray, float, time, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(ray, float, dP, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(ray, float, dD, KERNEL_FEATURE_PATH_TRACING)

View File

@@ -17,7 +17,8 @@ ccl_device_forceinline void integrator_state_write_ray(KernelGlobals kg,
{
INTEGRATOR_STATE_WRITE(state, ray, P) = ray->P;
INTEGRATOR_STATE_WRITE(state, ray, D) = ray->D;
INTEGRATOR_STATE_WRITE(state, ray, t) = ray->t;
INTEGRATOR_STATE_WRITE(state, ray, tmin) = ray->tmin;
INTEGRATOR_STATE_WRITE(state, ray, tmax) = ray->tmax;
INTEGRATOR_STATE_WRITE(state, ray, time) = ray->time;
INTEGRATOR_STATE_WRITE(state, ray, dP) = ray->dP;
INTEGRATOR_STATE_WRITE(state, ray, dD) = ray->dD;
@@ -29,7 +30,8 @@ ccl_device_forceinline void integrator_state_read_ray(KernelGlobals kg,
{
ray->P = INTEGRATOR_STATE(state, ray, P);
ray->D = INTEGRATOR_STATE(state, ray, D);
ray->t = INTEGRATOR_STATE(state, ray, t);
ray->tmin = INTEGRATOR_STATE(state, ray, tmin);
ray->tmax = INTEGRATOR_STATE(state, ray, tmax);
ray->time = INTEGRATOR_STATE(state, ray, time);
ray->dP = INTEGRATOR_STATE(state, ray, dP);
ray->dD = INTEGRATOR_STATE(state, ray, dD);
@@ -42,7 +44,8 @@ ccl_device_forceinline void integrator_state_write_shadow_ray(
{
INTEGRATOR_STATE_WRITE(state, shadow_ray, P) = ray->P;
INTEGRATOR_STATE_WRITE(state, shadow_ray, D) = ray->D;
INTEGRATOR_STATE_WRITE(state, shadow_ray, t) = ray->t;
INTEGRATOR_STATE_WRITE(state, shadow_ray, tmin) = ray->tmin;
INTEGRATOR_STATE_WRITE(state, shadow_ray, tmax) = ray->tmax;
INTEGRATOR_STATE_WRITE(state, shadow_ray, time) = ray->time;
INTEGRATOR_STATE_WRITE(state, shadow_ray, dP) = ray->dP;
}
@@ -53,7 +56,8 @@ ccl_device_forceinline void integrator_state_read_shadow_ray(KernelGlobals kg,
{
ray->P = INTEGRATOR_STATE(state, shadow_ray, P);
ray->D = INTEGRATOR_STATE(state, shadow_ray, D);
ray->t = INTEGRATOR_STATE(state, shadow_ray, t);
ray->tmin = INTEGRATOR_STATE(state, shadow_ray, tmin);
ray->tmax = INTEGRATOR_STATE(state, shadow_ray, tmax);
ray->time = INTEGRATOR_STATE(state, shadow_ray, time);
ray->dP = INTEGRATOR_STATE(state, shadow_ray, dP);
ray->dD = differential_zero_compact();

View File

@@ -38,7 +38,8 @@ ccl_device int subsurface_bounce(KernelGlobals kg,
/* Setup ray into surface. */
INTEGRATOR_STATE_WRITE(state, ray, P) = sd->P;
INTEGRATOR_STATE_WRITE(state, ray, D) = bssrdf->N;
INTEGRATOR_STATE_WRITE(state, ray, t) = FLT_MAX;
INTEGRATOR_STATE_WRITE(state, ray, tmin) = 0.0f;
INTEGRATOR_STATE_WRITE(state, ray, tmax) = FLT_MAX;
INTEGRATOR_STATE_WRITE(state, ray, dP) = differential_make_compact(sd->dP);
INTEGRATOR_STATE_WRITE(state, ray, dD) = differential_zero_compact();
@@ -160,7 +161,7 @@ ccl_device_inline bool subsurface_scatter(KernelGlobals kg, IntegratorState stat
/* Pretend ray is coming from the outside towards the exit point. This ensures
* correct front/back facing normals.
* TODO: find a more elegant solution? */
ray.P += ray.D * ray.t * 2.0f;
ray.P += ray.D * ray.tmax * 2.0f;
ray.D = -ray.D;
integrator_state_write_isect(kg, state, &ss_isect.hits[0]);
@@ -177,17 +178,23 @@ ccl_device_inline bool subsurface_scatter(KernelGlobals kg, IntegratorState stat
const bool use_raytrace_kernel = (shader_flags & SD_HAS_RAYTRACE);
if (use_caustics) {
INTEGRATOR_PATH_NEXT_SORTED(DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE,
integrator_path_next_sorted(kg,
state,
DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE,
DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE,
shader);
}
else if (use_raytrace_kernel) {
INTEGRATOR_PATH_NEXT_SORTED(DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE,
integrator_path_next_sorted(kg,
state,
DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE,
DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE,
shader);
}
else {
INTEGRATOR_PATH_NEXT_SORTED(DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE,
integrator_path_next_sorted(kg,
state,
DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE,
DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE,
shader);
}

View File

@@ -82,7 +82,8 @@ ccl_device_inline bool subsurface_disk(KernelGlobals kg,
/* Create ray. */
ray.P = P + disk_N * disk_height + disk_P;
ray.D = -disk_N;
ray.t = 2.0f * disk_height;
ray.tmin = 0.0f;
ray.tmax = 2.0f * disk_height;
ray.dP = ray_dP;
ray.dD = differential_zero_compact();
ray.time = time;
@@ -188,7 +189,8 @@ ccl_device_inline bool subsurface_disk(KernelGlobals kg,
ray.P = ray.P + ray.D * ss_isect.hits[hit].t;
ray.D = ss_isect.Ng[hit];
ray.t = 1.0f;
ray.tmin = 0.0f;
ray.tmax = 1.0f;
return true;
}

View File

@@ -195,7 +195,8 @@ ccl_device_inline bool subsurface_random_walk(KernelGlobals kg,
/* Setup ray. */
ray.P = P;
ray.D = D;
ray.t = FLT_MAX;
ray.tmin = 0.0f;
ray.tmax = FLT_MAX;
ray.time = time;
ray.dP = ray_dP;
ray.dD = differential_zero_compact();
@@ -370,10 +371,10 @@ ccl_device_inline bool subsurface_random_walk(KernelGlobals kg,
* chance of connecting to it.
* TODO: Maybe use less than 10 times the mean free path? */
if (bounce == 0) {
ray.t = max(t, 10.0f / (reduce_min(sigma_t)));
ray.tmax = max(t, 10.0f / (reduce_min(sigma_t)));
}
else {
ray.t = t;
ray.tmax = t;
/* After the first bounce the object can intersect the same surface again */
ray.self.object = OBJECT_NONE;
ray.self.prim = PRIM_NONE;
@@ -384,12 +385,12 @@ ccl_device_inline bool subsurface_random_walk(KernelGlobals kg,
if (hit) {
#ifdef __KERNEL_GPU_RAYTRACING__
/* t is always in world space with OptiX and MetalRT. */
ray.t = ss_isect.hits[0].t;
ray.tmax = ss_isect.hits[0].t;
#else
/* Compute world space distance to surface hit. */
float3 D = transform_direction(&ob_itfm, ray.D);
D = normalize(D) * ss_isect.hits[0].t;
ray.t = len(transform_direction(&ob_tfm, D));
ray.tmax = len(transform_direction(&ob_tfm, D));
#endif
}
@@ -397,16 +398,16 @@ ccl_device_inline bool subsurface_random_walk(KernelGlobals kg,
/* Check if we hit the opposite side. */
if (hit) {
have_opposite_interface = true;
opposite_distance = dot(ray.P + ray.t * ray.D - P, -N);
opposite_distance = dot(ray.P + ray.tmax * ray.D - P, -N);
}
/* Apart from the opposite side check, we were supposed to only trace up to distance t,
* so check if there would have been a hit in that case. */
hit = ray.t < t;
hit = ray.tmax < t;
}
/* Use the distance to the exit point for the throughput update if we found one. */
if (hit) {
t = ray.t;
t = ray.tmax;
}
/* Advance to new scatter location. */

View File

@@ -270,31 +270,26 @@ ccl_device bool lights_intersect(KernelGlobals kg,
if (type == LIGHT_SPOT) {
/* Spot/Disk light. */
const float mis_ray_t = INTEGRATOR_STATE(state, path, mis_ray_t);
const float3 ray_P = ray->P - ray->D * mis_ray_t;
const float3 lightP = make_float3(klight->co[0], klight->co[1], klight->co[2]);
const float radius = klight->spot.radius;
if (radius == 0.0f) {
continue;
}
/* disk oriented normal */
const float3 lightN = normalize(ray_P - lightP);
const float3 lightN = normalize(ray->P - lightP);
/* One sided. */
if (dot(ray->D, lightN) >= 0.0f) {
continue;
}
float3 P;
if (!ray_disk_intersect(ray->P, ray->D, ray->t, lightP, lightN, radius, &P, &t)) {
if (!ray_disk_intersect(
ray->P, ray->D, ray->tmin, ray->tmax, lightP, lightN, radius, &P, &t)) {
continue;
}
}
else if (type == LIGHT_POINT) {
/* Sphere light (aka, aligned disk light). */
const float mis_ray_t = INTEGRATOR_STATE(state, path, mis_ray_t);
const float3 ray_P = ray->P - ray->D * mis_ray_t;
const float3 lightP = make_float3(klight->co[0], klight->co[1], klight->co[2]);
const float radius = klight->spot.radius;
if (radius == 0.0f) {
@@ -302,9 +297,10 @@ ccl_device bool lights_intersect(KernelGlobals kg,
}
/* disk oriented normal */
const float3 lightN = normalize(ray_P - lightP);
const float3 lightN = normalize(ray->P - lightP);
float3 P;
if (!ray_disk_intersect(ray->P, ray->D, ray->t, lightP, lightN, radius, &P, &t)) {
if (!ray_disk_intersect(
ray->P, ray->D, ray->tmin, ray->tmax, lightP, lightN, radius, &P, &t)) {
continue;
}
}
@@ -330,8 +326,19 @@ ccl_device bool lights_intersect(KernelGlobals kg,
const float3 light_P = make_float3(klight->co[0], klight->co[1], klight->co[2]);
float3 P;
if (!ray_quad_intersect(
ray->P, ray->D, 0.0f, ray->t, light_P, axisu, axisv, Ng, &P, &t, &u, &v, is_round)) {
if (!ray_quad_intersect(ray->P,
ray->D,
ray->tmin,
ray->tmax,
light_P,
axisu,
axisv,
Ng,
&P,
&t,
&u,
&v,
is_round)) {
continue;
}
}
@@ -775,7 +782,8 @@ ccl_device_forceinline void triangle_light_sample(KernelGlobals kg,
ls->D = z * B + safe_sqrtf(1.0f - z * z) * safe_normalize(C_ - dot(C_, B) * B);
/* calculate intersection with the planar triangle */
if (!ray_triangle_intersect(P, ls->D, FLT_MAX, V[0], V[1], V[2], &ls->u, &ls->v, &ls->t)) {
if (!ray_triangle_intersect(
P, ls->D, 0.0f, FLT_MAX, V[0], V[1], V[2], &ls->u, &ls->v, &ls->t)) {
ls->pdf = 0.0f;
return;
}

View File

@@ -227,23 +227,24 @@ ccl_device_inline void shadow_ray_setup(ccl_private const ShaderData *ccl_restri
if (ls->shader & SHADER_CAST_SHADOW) {
/* setup ray */
ray->P = P;
ray->tmin = 0.0f;
if (ls->t == FLT_MAX) {
/* distant light */
ray->D = ls->D;
ray->t = ls->t;
ray->tmax = ls->t;
}
else {
/* other lights, avoid self-intersection */
ray->D = ls->P - P;
ray->D = normalize_len(ray->D, &ray->t);
ray->D = normalize_len(ray->D, &ray->tmax);
}
}
else {
/* signal to not cast shadow ray */
ray->P = zero_float3();
ray->D = zero_float3();
ray->t = 0.0f;
ray->tmax = 0.0f;
}
ray->dP = differential_make_compact(sd->dP);

View File

@@ -1094,10 +1094,8 @@ bool OSLRenderServices::get_background_attribute(const KernelGlobalsCPU *kg,
ndc[0] = camera_world_to_ndc(kg, sd, sd->ray_P);
if (derivatives) {
ndc[1] = camera_world_to_ndc(kg, sd, sd->ray_P + make_float3(sd->ray_dP, 0.0f, 0.0f)) -
ndc[0];
ndc[2] = camera_world_to_ndc(kg, sd, sd->ray_P + make_float3(0.0f, sd->ray_dP, 0.0f)) -
ndc[0];
ndc[1] = zero_float3();
ndc[2] = zero_float3();
}
}
else {
@@ -1671,7 +1669,8 @@ bool OSLRenderServices::trace(TraceOpt &options,
ray.P = TO_FLOAT3(P);
ray.D = TO_FLOAT3(R);
ray.t = (options.maxdist == 1.0e30f) ? FLT_MAX : options.maxdist - options.mindist;
ray.tmin = 0.0f;
ray.tmax = (options.maxdist == 1.0e30f) ? FLT_MAX : options.maxdist - options.mindist;
ray.time = sd->time;
ray.self.object = OBJECT_NONE;
ray.self.prim = PRIM_NONE;

View File

@@ -59,7 +59,8 @@ ccl_device float svm_ao(
Ray ray;
ray.P = sd->P;
ray.D = D.x * T + D.y * B + D.z * N;
ray.t = max_dist;
ray.tmin = 0.0f;
ray.tmax = max_dist;
ray.time = sd->time;
ray.self.object = sd->object;
ray.self.prim = sd->prim;

View File

@@ -179,7 +179,8 @@ ccl_device float3 svm_bevel(
Ray ray ccl_optional_struct_init;
ray.P = sd->P + disk_N * disk_height + disk_P;
ray.D = -disk_N;
ray.t = 2.0f * disk_height;
ray.tmin = 0.0f;
ray.tmax = 2.0f * disk_height;
ray.dP = differential_zero_compact();
ray.dD = differential_zero_compact();
ray.time = sd->time;

View File

@@ -0,0 +1,110 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#ifndef SHADER_NODE_TYPE
# define SHADER_NODE_TYPE(name)
#endif
/* NOTE: for best OpenCL performance, item definition in the enum must
* match the switch case order in `svm.h`. */
SHADER_NODE_TYPE(NODE_END)
SHADER_NODE_TYPE(NODE_SHADER_JUMP)
SHADER_NODE_TYPE(NODE_CLOSURE_BSDF)
SHADER_NODE_TYPE(NODE_CLOSURE_EMISSION)
SHADER_NODE_TYPE(NODE_CLOSURE_BACKGROUND)
SHADER_NODE_TYPE(NODE_CLOSURE_SET_WEIGHT)
SHADER_NODE_TYPE(NODE_CLOSURE_WEIGHT)
SHADER_NODE_TYPE(NODE_EMISSION_WEIGHT)
SHADER_NODE_TYPE(NODE_MIX_CLOSURE)
SHADER_NODE_TYPE(NODE_JUMP_IF_ZERO)
SHADER_NODE_TYPE(NODE_JUMP_IF_ONE)
SHADER_NODE_TYPE(NODE_GEOMETRY)
SHADER_NODE_TYPE(NODE_CONVERT)
SHADER_NODE_TYPE(NODE_TEX_COORD)
SHADER_NODE_TYPE(NODE_VALUE_F)
SHADER_NODE_TYPE(NODE_VALUE_V)
SHADER_NODE_TYPE(NODE_ATTR)
SHADER_NODE_TYPE(NODE_VERTEX_COLOR)
SHADER_NODE_TYPE(NODE_GEOMETRY_BUMP_DX)
SHADER_NODE_TYPE(NODE_GEOMETRY_BUMP_DY)
SHADER_NODE_TYPE(NODE_SET_DISPLACEMENT)
SHADER_NODE_TYPE(NODE_DISPLACEMENT)
SHADER_NODE_TYPE(NODE_VECTOR_DISPLACEMENT)
SHADER_NODE_TYPE(NODE_TEX_IMAGE)
SHADER_NODE_TYPE(NODE_TEX_IMAGE_BOX)
SHADER_NODE_TYPE(NODE_TEX_NOISE)
SHADER_NODE_TYPE(NODE_SET_BUMP)
SHADER_NODE_TYPE(NODE_ATTR_BUMP_DX)
SHADER_NODE_TYPE(NODE_ATTR_BUMP_DY)
SHADER_NODE_TYPE(NODE_VERTEX_COLOR_BUMP_DX)
SHADER_NODE_TYPE(NODE_VERTEX_COLOR_BUMP_DY)
SHADER_NODE_TYPE(NODE_TEX_COORD_BUMP_DX)
SHADER_NODE_TYPE(NODE_TEX_COORD_BUMP_DY)
SHADER_NODE_TYPE(NODE_CLOSURE_SET_NORMAL)
SHADER_NODE_TYPE(NODE_ENTER_BUMP_EVAL)
SHADER_NODE_TYPE(NODE_LEAVE_BUMP_EVAL)
SHADER_NODE_TYPE(NODE_HSV)
SHADER_NODE_TYPE(NODE_CLOSURE_HOLDOUT)
SHADER_NODE_TYPE(NODE_FRESNEL)
SHADER_NODE_TYPE(NODE_LAYER_WEIGHT)
SHADER_NODE_TYPE(NODE_CLOSURE_VOLUME)
SHADER_NODE_TYPE(NODE_PRINCIPLED_VOLUME)
SHADER_NODE_TYPE(NODE_MATH)
SHADER_NODE_TYPE(NODE_VECTOR_MATH)
SHADER_NODE_TYPE(NODE_RGB_RAMP)
SHADER_NODE_TYPE(NODE_GAMMA)
SHADER_NODE_TYPE(NODE_BRIGHTCONTRAST)
SHADER_NODE_TYPE(NODE_LIGHT_PATH)
SHADER_NODE_TYPE(NODE_OBJECT_INFO)
SHADER_NODE_TYPE(NODE_PARTICLE_INFO)
SHADER_NODE_TYPE(NODE_HAIR_INFO)
SHADER_NODE_TYPE(NODE_POINT_INFO)
SHADER_NODE_TYPE(NODE_TEXTURE_MAPPING)
SHADER_NODE_TYPE(NODE_MAPPING)
SHADER_NODE_TYPE(NODE_MIN_MAX)
SHADER_NODE_TYPE(NODE_CAMERA)
SHADER_NODE_TYPE(NODE_TEX_ENVIRONMENT)
SHADER_NODE_TYPE(NODE_TEX_SKY)
SHADER_NODE_TYPE(NODE_TEX_GRADIENT)
SHADER_NODE_TYPE(NODE_TEX_VORONOI)
SHADER_NODE_TYPE(NODE_TEX_MUSGRAVE)
SHADER_NODE_TYPE(NODE_TEX_WAVE)
SHADER_NODE_TYPE(NODE_TEX_MAGIC)
SHADER_NODE_TYPE(NODE_TEX_CHECKER)
SHADER_NODE_TYPE(NODE_TEX_BRICK)
SHADER_NODE_TYPE(NODE_TEX_WHITE_NOISE)
SHADER_NODE_TYPE(NODE_NORMAL)
SHADER_NODE_TYPE(NODE_LIGHT_FALLOFF)
SHADER_NODE_TYPE(NODE_IES)
SHADER_NODE_TYPE(NODE_CURVES)
SHADER_NODE_TYPE(NODE_TANGENT)
SHADER_NODE_TYPE(NODE_NORMAL_MAP)
SHADER_NODE_TYPE(NODE_INVERT)
SHADER_NODE_TYPE(NODE_MIX)
SHADER_NODE_TYPE(NODE_SEPARATE_COLOR)
SHADER_NODE_TYPE(NODE_COMBINE_COLOR)
SHADER_NODE_TYPE(NODE_SEPARATE_VECTOR)
SHADER_NODE_TYPE(NODE_COMBINE_VECTOR)
SHADER_NODE_TYPE(NODE_SEPARATE_HSV)
SHADER_NODE_TYPE(NODE_COMBINE_HSV)
SHADER_NODE_TYPE(NODE_VECTOR_ROTATE)
SHADER_NODE_TYPE(NODE_VECTOR_TRANSFORM)
SHADER_NODE_TYPE(NODE_WIREFRAME)
SHADER_NODE_TYPE(NODE_WAVELENGTH)
SHADER_NODE_TYPE(NODE_BLACKBODY)
SHADER_NODE_TYPE(NODE_MAP_RANGE)
SHADER_NODE_TYPE(NODE_VECTOR_MAP_RANGE)
SHADER_NODE_TYPE(NODE_CLAMP)
SHADER_NODE_TYPE(NODE_BEVEL)
SHADER_NODE_TYPE(NODE_AMBIENT_OCCLUSION)
SHADER_NODE_TYPE(NODE_TEX_VOXEL)
SHADER_NODE_TYPE(NODE_AOV_START)
SHADER_NODE_TYPE(NODE_AOV_COLOR)
SHADER_NODE_TYPE(NODE_AOV_VALUE)
SHADER_NODE_TYPE(NODE_FLOAT_CURVE)
/* Padding for struct alignment. */
SHADER_NODE_TYPE(NODE_PAD1)
#undef SHADER_NODE_TYPE

View File

@@ -204,6 +204,15 @@ CCL_NAMESPACE_END
CCL_NAMESPACE_BEGIN
#ifdef __KERNEL_USE_DATA_CONSTANTS__
# define SVM_CASE(node) \
case node: \
if (!kernel_data_svm_usage_##node) \
break;
#else
# define SVM_CASE(node) case node:
#endif
/* Main Interpreter Loop */
template<uint node_feature_mask, ShaderType type, typename ConstIntegratorGenericState>
ccl_device void svm_eval_nodes(KernelGlobals kg,
@@ -219,9 +228,10 @@ ccl_device void svm_eval_nodes(KernelGlobals kg,
uint4 node = read_node(kg, &offset);
switch (node.x) {
case NODE_END:
return;
case NODE_SHADER_JUMP: {
SVM_CASE(NODE_END)
return;
SVM_CASE(NODE_SHADER_JUMP)
{
if (type == SHADER_TYPE_SURFACE)
offset = node.y;
else if (type == SHADER_TYPE_VOLUME)
@@ -232,351 +242,349 @@ ccl_device void svm_eval_nodes(KernelGlobals kg,
return;
break;
}
case NODE_CLOSURE_BSDF:
offset = svm_node_closure_bsdf<node_feature_mask, type>(
kg, sd, stack, node, path_flag, offset);
break;
case NODE_CLOSURE_EMISSION:
IF_KERNEL_NODES_FEATURE(EMISSION)
{
svm_node_closure_emission(sd, stack, node);
}
break;
case NODE_CLOSURE_BACKGROUND:
IF_KERNEL_NODES_FEATURE(EMISSION)
{
svm_node_closure_background(sd, stack, node);
}
break;
case NODE_CLOSURE_SET_WEIGHT:
svm_node_closure_set_weight(sd, node.y, node.z, node.w);
break;
case NODE_CLOSURE_WEIGHT:
svm_node_closure_weight(sd, stack, node.y);
break;
case NODE_EMISSION_WEIGHT:
IF_KERNEL_NODES_FEATURE(EMISSION)
{
svm_node_emission_weight(kg, sd, stack, node);
}
break;
case NODE_MIX_CLOSURE:
svm_node_mix_closure(sd, stack, node);
break;
case NODE_JUMP_IF_ZERO:
if (stack_load_float(stack, node.z) <= 0.0f)
offset += node.y;
break;
case NODE_JUMP_IF_ONE:
if (stack_load_float(stack, node.z) >= 1.0f)
offset += node.y;
break;
case NODE_GEOMETRY:
svm_node_geometry(kg, sd, stack, node.y, node.z);
break;
case NODE_CONVERT:
svm_node_convert(kg, sd, stack, node.y, node.z, node.w);
break;
case NODE_TEX_COORD:
offset = svm_node_tex_coord(kg, sd, path_flag, stack, node, offset);
break;
case NODE_VALUE_F:
svm_node_value_f(kg, sd, stack, node.y, node.z);
break;
case NODE_VALUE_V:
offset = svm_node_value_v(kg, sd, stack, node.y, offset);
break;
case NODE_ATTR:
svm_node_attr<node_feature_mask>(kg, sd, stack, node);
break;
case NODE_VERTEX_COLOR:
svm_node_vertex_color(kg, sd, stack, node.y, node.z, node.w);
break;
case NODE_GEOMETRY_BUMP_DX:
IF_KERNEL_NODES_FEATURE(BUMP)
{
svm_node_geometry_bump_dx(kg, sd, stack, node.y, node.z);
}
break;
case NODE_GEOMETRY_BUMP_DY:
IF_KERNEL_NODES_FEATURE(BUMP)
{
svm_node_geometry_bump_dy(kg, sd, stack, node.y, node.z);
}
break;
case NODE_SET_DISPLACEMENT:
svm_node_set_displacement<node_feature_mask>(kg, sd, stack, node.y);
break;
case NODE_DISPLACEMENT:
svm_node_displacement<node_feature_mask>(kg, sd, stack, node);
break;
case NODE_VECTOR_DISPLACEMENT:
offset = svm_node_vector_displacement<node_feature_mask>(kg, sd, stack, node, offset);
break;
case NODE_TEX_IMAGE:
offset = svm_node_tex_image(kg, sd, stack, node, offset);
break;
case NODE_TEX_IMAGE_BOX:
svm_node_tex_image_box(kg, sd, stack, node);
break;
case NODE_TEX_NOISE:
offset = svm_node_tex_noise(kg, sd, stack, node.y, node.z, node.w, offset);
break;
case NODE_SET_BUMP:
svm_node_set_bump<node_feature_mask>(kg, sd, stack, node);
break;
case NODE_ATTR_BUMP_DX:
IF_KERNEL_NODES_FEATURE(BUMP)
{
svm_node_attr_bump_dx(kg, sd, stack, node);
}
break;
case NODE_ATTR_BUMP_DY:
IF_KERNEL_NODES_FEATURE(BUMP)
{
svm_node_attr_bump_dy(kg, sd, stack, node);
}
break;
case NODE_VERTEX_COLOR_BUMP_DX:
IF_KERNEL_NODES_FEATURE(BUMP)
{
svm_node_vertex_color_bump_dx(kg, sd, stack, node.y, node.z, node.w);
}
break;
case NODE_VERTEX_COLOR_BUMP_DY:
IF_KERNEL_NODES_FEATURE(BUMP)
{
svm_node_vertex_color_bump_dy(kg, sd, stack, node.y, node.z, node.w);
}
break;
case NODE_TEX_COORD_BUMP_DX:
IF_KERNEL_NODES_FEATURE(BUMP)
{
offset = svm_node_tex_coord_bump_dx(kg, sd, path_flag, stack, node, offset);
}
break;
case NODE_TEX_COORD_BUMP_DY:
IF_KERNEL_NODES_FEATURE(BUMP)
{
offset = svm_node_tex_coord_bump_dy(kg, sd, path_flag, stack, node, offset);
}
break;
case NODE_CLOSURE_SET_NORMAL:
IF_KERNEL_NODES_FEATURE(BUMP)
{
svm_node_set_normal(kg, sd, stack, node.y, node.z);
}
break;
case NODE_ENTER_BUMP_EVAL:
IF_KERNEL_NODES_FEATURE(BUMP_STATE)
{
svm_node_enter_bump_eval(kg, sd, stack, node.y);
}
break;
case NODE_LEAVE_BUMP_EVAL:
IF_KERNEL_NODES_FEATURE(BUMP_STATE)
{
svm_node_leave_bump_eval(kg, sd, stack, node.y);
}
break;
case NODE_HSV:
svm_node_hsv(kg, sd, stack, node);
break;
case NODE_CLOSURE_HOLDOUT:
svm_node_closure_holdout(sd, stack, node);
break;
case NODE_FRESNEL:
svm_node_fresnel(sd, stack, node.y, node.z, node.w);
break;
case NODE_LAYER_WEIGHT:
svm_node_layer_weight(sd, stack, node);
break;
case NODE_CLOSURE_VOLUME:
IF_KERNEL_NODES_FEATURE(VOLUME)
{
svm_node_closure_volume<type>(kg, sd, stack, node);
}
break;
case NODE_PRINCIPLED_VOLUME:
IF_KERNEL_NODES_FEATURE(VOLUME)
{
offset = svm_node_principled_volume<type>(kg, sd, stack, node, path_flag, offset);
}
break;
case NODE_MATH:
svm_node_math(kg, sd, stack, node.y, node.z, node.w);
break;
case NODE_VECTOR_MATH:
offset = svm_node_vector_math(kg, sd, stack, node.y, node.z, node.w, offset);
break;
case NODE_RGB_RAMP:
offset = svm_node_rgb_ramp(kg, sd, stack, node, offset);
break;
case NODE_GAMMA:
svm_node_gamma(sd, stack, node.y, node.z, node.w);
break;
case NODE_BRIGHTCONTRAST:
svm_node_brightness(sd, stack, node.y, node.z, node.w);
break;
case NODE_LIGHT_PATH:
svm_node_light_path<node_feature_mask>(kg, state, sd, stack, node.y, node.z, path_flag);
break;
case NODE_OBJECT_INFO:
svm_node_object_info(kg, sd, stack, node.y, node.z);
break;
case NODE_PARTICLE_INFO:
svm_node_particle_info(kg, sd, stack, node.y, node.z);
break;
SVM_CASE(NODE_CLOSURE_BSDF)
offset = svm_node_closure_bsdf<node_feature_mask, type>(
kg, sd, stack, node, path_flag, offset);
break;
SVM_CASE(NODE_CLOSURE_EMISSION)
IF_KERNEL_NODES_FEATURE(EMISSION)
{
svm_node_closure_emission(sd, stack, node);
}
break;
SVM_CASE(NODE_CLOSURE_BACKGROUND)
IF_KERNEL_NODES_FEATURE(EMISSION)
{
svm_node_closure_background(sd, stack, node);
}
break;
SVM_CASE(NODE_CLOSURE_SET_WEIGHT)
svm_node_closure_set_weight(sd, node.y, node.z, node.w);
break;
SVM_CASE(NODE_CLOSURE_WEIGHT)
svm_node_closure_weight(sd, stack, node.y);
break;
SVM_CASE(NODE_EMISSION_WEIGHT)
IF_KERNEL_NODES_FEATURE(EMISSION)
{
svm_node_emission_weight(kg, sd, stack, node);
}
break;
SVM_CASE(NODE_MIX_CLOSURE)
svm_node_mix_closure(sd, stack, node);
break;
SVM_CASE(NODE_JUMP_IF_ZERO)
if (stack_load_float(stack, node.z) <= 0.0f)
offset += node.y;
break;
SVM_CASE(NODE_JUMP_IF_ONE)
if (stack_load_float(stack, node.z) >= 1.0f)
offset += node.y;
break;
SVM_CASE(NODE_GEOMETRY)
svm_node_geometry(kg, sd, stack, node.y, node.z);
break;
SVM_CASE(NODE_CONVERT)
svm_node_convert(kg, sd, stack, node.y, node.z, node.w);
break;
SVM_CASE(NODE_TEX_COORD)
offset = svm_node_tex_coord(kg, sd, path_flag, stack, node, offset);
break;
SVM_CASE(NODE_VALUE_F)
svm_node_value_f(kg, sd, stack, node.y, node.z);
break;
SVM_CASE(NODE_VALUE_V)
offset = svm_node_value_v(kg, sd, stack, node.y, offset);
break;
SVM_CASE(NODE_ATTR)
svm_node_attr<node_feature_mask>(kg, sd, stack, node);
break;
SVM_CASE(NODE_VERTEX_COLOR)
svm_node_vertex_color(kg, sd, stack, node.y, node.z, node.w);
break;
SVM_CASE(NODE_GEOMETRY_BUMP_DX)
IF_KERNEL_NODES_FEATURE(BUMP)
{
svm_node_geometry_bump_dx(kg, sd, stack, node.y, node.z);
}
break;
SVM_CASE(NODE_GEOMETRY_BUMP_DY)
IF_KERNEL_NODES_FEATURE(BUMP)
{
svm_node_geometry_bump_dy(kg, sd, stack, node.y, node.z);
}
break;
SVM_CASE(NODE_SET_DISPLACEMENT)
svm_node_set_displacement<node_feature_mask>(kg, sd, stack, node.y);
break;
SVM_CASE(NODE_DISPLACEMENT)
svm_node_displacement<node_feature_mask>(kg, sd, stack, node);
break;
SVM_CASE(NODE_VECTOR_DISPLACEMENT)
offset = svm_node_vector_displacement<node_feature_mask>(kg, sd, stack, node, offset);
break;
SVM_CASE(NODE_TEX_IMAGE)
offset = svm_node_tex_image(kg, sd, stack, node, offset);
break;
SVM_CASE(NODE_TEX_IMAGE_BOX)
svm_node_tex_image_box(kg, sd, stack, node);
break;
SVM_CASE(NODE_TEX_NOISE)
offset = svm_node_tex_noise(kg, sd, stack, node.y, node.z, node.w, offset);
break;
SVM_CASE(NODE_SET_BUMP)
svm_node_set_bump<node_feature_mask>(kg, sd, stack, node);
break;
SVM_CASE(NODE_ATTR_BUMP_DX)
IF_KERNEL_NODES_FEATURE(BUMP)
{
svm_node_attr_bump_dx(kg, sd, stack, node);
}
break;
SVM_CASE(NODE_ATTR_BUMP_DY)
IF_KERNEL_NODES_FEATURE(BUMP)
{
svm_node_attr_bump_dy(kg, sd, stack, node);
}
break;
SVM_CASE(NODE_VERTEX_COLOR_BUMP_DX)
IF_KERNEL_NODES_FEATURE(BUMP)
{
svm_node_vertex_color_bump_dx(kg, sd, stack, node.y, node.z, node.w);
}
break;
SVM_CASE(NODE_VERTEX_COLOR_BUMP_DY)
IF_KERNEL_NODES_FEATURE(BUMP)
{
svm_node_vertex_color_bump_dy(kg, sd, stack, node.y, node.z, node.w);
}
break;
SVM_CASE(NODE_TEX_COORD_BUMP_DX)
IF_KERNEL_NODES_FEATURE(BUMP)
{
offset = svm_node_tex_coord_bump_dx(kg, sd, path_flag, stack, node, offset);
}
break;
SVM_CASE(NODE_TEX_COORD_BUMP_DY)
IF_KERNEL_NODES_FEATURE(BUMP)
{
offset = svm_node_tex_coord_bump_dy(kg, sd, path_flag, stack, node, offset);
}
break;
SVM_CASE(NODE_CLOSURE_SET_NORMAL)
IF_KERNEL_NODES_FEATURE(BUMP)
{
svm_node_set_normal(kg, sd, stack, node.y, node.z);
}
break;
SVM_CASE(NODE_ENTER_BUMP_EVAL)
IF_KERNEL_NODES_FEATURE(BUMP_STATE)
{
svm_node_enter_bump_eval(kg, sd, stack, node.y);
}
break;
SVM_CASE(NODE_LEAVE_BUMP_EVAL)
IF_KERNEL_NODES_FEATURE(BUMP_STATE)
{
svm_node_leave_bump_eval(kg, sd, stack, node.y);
}
break;
SVM_CASE(NODE_HSV)
svm_node_hsv(kg, sd, stack, node);
break;
SVM_CASE(NODE_CLOSURE_HOLDOUT)
svm_node_closure_holdout(sd, stack, node);
break;
SVM_CASE(NODE_FRESNEL)
svm_node_fresnel(sd, stack, node.y, node.z, node.w);
break;
SVM_CASE(NODE_LAYER_WEIGHT)
svm_node_layer_weight(sd, stack, node);
break;
SVM_CASE(NODE_CLOSURE_VOLUME)
IF_KERNEL_NODES_FEATURE(VOLUME)
{
svm_node_closure_volume<type>(kg, sd, stack, node);
}
break;
SVM_CASE(NODE_PRINCIPLED_VOLUME)
IF_KERNEL_NODES_FEATURE(VOLUME)
{
offset = svm_node_principled_volume<type>(kg, sd, stack, node, path_flag, offset);
}
break;
SVM_CASE(NODE_MATH)
svm_node_math(kg, sd, stack, node.y, node.z, node.w);
break;
SVM_CASE(NODE_VECTOR_MATH)
offset = svm_node_vector_math(kg, sd, stack, node.y, node.z, node.w, offset);
break;
SVM_CASE(NODE_RGB_RAMP)
offset = svm_node_rgb_ramp(kg, sd, stack, node, offset);
break;
SVM_CASE(NODE_GAMMA)
svm_node_gamma(sd, stack, node.y, node.z, node.w);
break;
SVM_CASE(NODE_BRIGHTCONTRAST)
svm_node_brightness(sd, stack, node.y, node.z, node.w);
break;
SVM_CASE(NODE_LIGHT_PATH)
svm_node_light_path<node_feature_mask>(kg, state, sd, stack, node.y, node.z, path_flag);
break;
SVM_CASE(NODE_OBJECT_INFO)
svm_node_object_info(kg, sd, stack, node.y, node.z);
break;
SVM_CASE(NODE_PARTICLE_INFO)
svm_node_particle_info(kg, sd, stack, node.y, node.z);
break;
#if defined(__HAIR__)
case NODE_HAIR_INFO:
svm_node_hair_info(kg, sd, stack, node.y, node.z);
break;
SVM_CASE(NODE_HAIR_INFO)
svm_node_hair_info(kg, sd, stack, node.y, node.z);
break;
#endif
#if defined(__POINTCLOUD__)
case NODE_POINT_INFO:
svm_node_point_info(kg, sd, stack, node.y, node.z);
break;
SVM_CASE(NODE_POINT_INFO)
svm_node_point_info(kg, sd, stack, node.y, node.z);
break;
#endif
case NODE_TEXTURE_MAPPING:
offset = svm_node_texture_mapping(kg, sd, stack, node.y, node.z, offset);
break;
case NODE_MAPPING:
svm_node_mapping(kg, sd, stack, node.y, node.z, node.w);
break;
case NODE_MIN_MAX:
offset = svm_node_min_max(kg, sd, stack, node.y, node.z, offset);
break;
case NODE_CAMERA:
svm_node_camera(kg, sd, stack, node.y, node.z, node.w);
break;
case NODE_TEX_ENVIRONMENT:
svm_node_tex_environment(kg, sd, stack, node);
break;
case NODE_TEX_SKY:
offset = svm_node_tex_sky(kg, sd, stack, node, offset);
break;
case NODE_TEX_GRADIENT:
svm_node_tex_gradient(sd, stack, node);
break;
case NODE_TEX_VORONOI:
offset = svm_node_tex_voronoi<node_feature_mask>(
kg, sd, stack, node.y, node.z, node.w, offset);
break;
case NODE_TEX_MUSGRAVE:
offset = svm_node_tex_musgrave(kg, sd, stack, node.y, node.z, node.w, offset);
break;
case NODE_TEX_WAVE:
offset = svm_node_tex_wave(kg, sd, stack, node, offset);
break;
case NODE_TEX_MAGIC:
offset = svm_node_tex_magic(kg, sd, stack, node, offset);
break;
case NODE_TEX_CHECKER:
svm_node_tex_checker(kg, sd, stack, node);
break;
case NODE_TEX_BRICK:
offset = svm_node_tex_brick(kg, sd, stack, node, offset);
break;
case NODE_TEX_WHITE_NOISE:
svm_node_tex_white_noise(kg, sd, stack, node.y, node.z, node.w);
break;
case NODE_NORMAL:
offset = svm_node_normal(kg, sd, stack, node.y, node.z, node.w, offset);
break;
case NODE_LIGHT_FALLOFF:
svm_node_light_falloff(sd, stack, node);
break;
case NODE_IES:
svm_node_ies(kg, sd, stack, node);
break;
case NODE_RGB_CURVES:
case NODE_VECTOR_CURVES:
offset = svm_node_curves(kg, sd, stack, node, offset);
break;
case NODE_FLOAT_CURVE:
offset = svm_node_curve(kg, sd, stack, node, offset);
break;
case NODE_TANGENT:
svm_node_tangent(kg, sd, stack, node);
break;
case NODE_NORMAL_MAP:
svm_node_normal_map(kg, sd, stack, node);
break;
case NODE_INVERT:
svm_node_invert(sd, stack, node.y, node.z, node.w);
break;
case NODE_MIX:
offset = svm_node_mix(kg, sd, stack, node.y, node.z, node.w, offset);
break;
case NODE_SEPARATE_COLOR:
svm_node_separate_color(kg, sd, stack, node.y, node.z, node.w);
break;
case NODE_COMBINE_COLOR:
svm_node_combine_color(kg, sd, stack, node.y, node.z, node.w);
break;
case NODE_SEPARATE_VECTOR:
svm_node_separate_vector(sd, stack, node.y, node.z, node.w);
break;
case NODE_COMBINE_VECTOR:
svm_node_combine_vector(sd, stack, node.y, node.z, node.w);
break;
case NODE_SEPARATE_HSV:
offset = svm_node_separate_hsv(kg, sd, stack, node.y, node.z, node.w, offset);
break;
case NODE_COMBINE_HSV:
offset = svm_node_combine_hsv(kg, sd, stack, node.y, node.z, node.w, offset);
break;
case NODE_VECTOR_ROTATE:
svm_node_vector_rotate(sd, stack, node.y, node.z, node.w);
break;
case NODE_VECTOR_TRANSFORM:
svm_node_vector_transform(kg, sd, stack, node);
break;
case NODE_WIREFRAME:
svm_node_wireframe(kg, sd, stack, node);
break;
case NODE_WAVELENGTH:
svm_node_wavelength(kg, sd, stack, node.y, node.z);
break;
case NODE_BLACKBODY:
svm_node_blackbody(kg, sd, stack, node.y, node.z);
break;
case NODE_MAP_RANGE:
offset = svm_node_map_range(kg, sd, stack, node.y, node.z, node.w, offset);
break;
case NODE_VECTOR_MAP_RANGE:
offset = svm_node_vector_map_range(kg, sd, stack, node.y, node.z, node.w, offset);
break;
case NODE_CLAMP:
offset = svm_node_clamp(kg, sd, stack, node.y, node.z, node.w, offset);
break;
SVM_CASE(NODE_TEXTURE_MAPPING)
offset = svm_node_texture_mapping(kg, sd, stack, node.y, node.z, offset);
break;
SVM_CASE(NODE_MAPPING)
svm_node_mapping(kg, sd, stack, node.y, node.z, node.w);
break;
SVM_CASE(NODE_MIN_MAX)
offset = svm_node_min_max(kg, sd, stack, node.y, node.z, offset);
break;
SVM_CASE(NODE_CAMERA)
svm_node_camera(kg, sd, stack, node.y, node.z, node.w);
break;
SVM_CASE(NODE_TEX_ENVIRONMENT)
svm_node_tex_environment(kg, sd, stack, node);
break;
SVM_CASE(NODE_TEX_SKY)
offset = svm_node_tex_sky(kg, sd, stack, node, offset);
break;
SVM_CASE(NODE_TEX_GRADIENT)
svm_node_tex_gradient(sd, stack, node);
break;
SVM_CASE(NODE_TEX_VORONOI)
offset = svm_node_tex_voronoi<node_feature_mask>(
kg, sd, stack, node.y, node.z, node.w, offset);
break;
SVM_CASE(NODE_TEX_MUSGRAVE)
offset = svm_node_tex_musgrave(kg, sd, stack, node.y, node.z, node.w, offset);
break;
SVM_CASE(NODE_TEX_WAVE)
offset = svm_node_tex_wave(kg, sd, stack, node, offset);
break;
SVM_CASE(NODE_TEX_MAGIC)
offset = svm_node_tex_magic(kg, sd, stack, node, offset);
break;
SVM_CASE(NODE_TEX_CHECKER)
svm_node_tex_checker(kg, sd, stack, node);
break;
SVM_CASE(NODE_TEX_BRICK)
offset = svm_node_tex_brick(kg, sd, stack, node, offset);
break;
SVM_CASE(NODE_TEX_WHITE_NOISE)
svm_node_tex_white_noise(kg, sd, stack, node.y, node.z, node.w);
break;
SVM_CASE(NODE_NORMAL)
offset = svm_node_normal(kg, sd, stack, node.y, node.z, node.w, offset);
break;
SVM_CASE(NODE_LIGHT_FALLOFF)
svm_node_light_falloff(sd, stack, node);
break;
SVM_CASE(NODE_IES)
svm_node_ies(kg, sd, stack, node);
break;
SVM_CASE(NODE_CURVES)
offset = svm_node_curves(kg, sd, stack, node, offset);
break;
SVM_CASE(NODE_FLOAT_CURVE)
offset = svm_node_curve(kg, sd, stack, node, offset);
break;
SVM_CASE(NODE_TANGENT)
svm_node_tangent(kg, sd, stack, node);
break;
SVM_CASE(NODE_NORMAL_MAP)
svm_node_normal_map(kg, sd, stack, node);
break;
SVM_CASE(NODE_INVERT)
svm_node_invert(sd, stack, node.y, node.z, node.w);
break;
SVM_CASE(NODE_MIX)
offset = svm_node_mix(kg, sd, stack, node.y, node.z, node.w, offset);
break;
SVM_CASE(NODE_SEPARATE_COLOR)
svm_node_separate_color(kg, sd, stack, node.y, node.z, node.w);
break;
SVM_CASE(NODE_COMBINE_COLOR)
svm_node_combine_color(kg, sd, stack, node.y, node.z, node.w);
break;
SVM_CASE(NODE_SEPARATE_VECTOR)
svm_node_separate_vector(sd, stack, node.y, node.z, node.w);
break;
SVM_CASE(NODE_COMBINE_VECTOR)
svm_node_combine_vector(sd, stack, node.y, node.z, node.w);
break;
SVM_CASE(NODE_SEPARATE_HSV)
offset = svm_node_separate_hsv(kg, sd, stack, node.y, node.z, node.w, offset);
break;
SVM_CASE(NODE_COMBINE_HSV)
offset = svm_node_combine_hsv(kg, sd, stack, node.y, node.z, node.w, offset);
break;
SVM_CASE(NODE_VECTOR_ROTATE)
svm_node_vector_rotate(sd, stack, node.y, node.z, node.w);
break;
SVM_CASE(NODE_VECTOR_TRANSFORM)
svm_node_vector_transform(kg, sd, stack, node);
break;
SVM_CASE(NODE_WIREFRAME)
svm_node_wireframe(kg, sd, stack, node);
break;
SVM_CASE(NODE_WAVELENGTH)
svm_node_wavelength(kg, sd, stack, node.y, node.z);
break;
SVM_CASE(NODE_BLACKBODY)
svm_node_blackbody(kg, sd, stack, node.y, node.z);
break;
SVM_CASE(NODE_MAP_RANGE)
offset = svm_node_map_range(kg, sd, stack, node.y, node.z, node.w, offset);
break;
SVM_CASE(NODE_VECTOR_MAP_RANGE)
offset = svm_node_vector_map_range(kg, sd, stack, node.y, node.z, node.w, offset);
break;
SVM_CASE(NODE_CLAMP)
offset = svm_node_clamp(kg, sd, stack, node.y, node.z, node.w, offset);
break;
#ifdef __SHADER_RAYTRACE__
case NODE_BEVEL:
svm_node_bevel<node_feature_mask>(kg, state, sd, stack, node);
break;
case NODE_AMBIENT_OCCLUSION:
svm_node_ao<node_feature_mask>(kg, state, sd, stack, node);
break;
SVM_CASE(NODE_BEVEL)
svm_node_bevel<node_feature_mask>(kg, state, sd, stack, node);
break;
SVM_CASE(NODE_AMBIENT_OCCLUSION)
svm_node_ao<node_feature_mask>(kg, state, sd, stack, node);
break;
#endif
case NODE_TEX_VOXEL:
IF_KERNEL_NODES_FEATURE(VOLUME)
{
offset = svm_node_tex_voxel(kg, sd, stack, node, offset);
}
break;
case NODE_AOV_START:
if (!svm_node_aov_check(path_flag, render_buffer)) {
return;
}
break;
case NODE_AOV_COLOR:
svm_node_aov_color<node_feature_mask>(kg, state, sd, stack, node, render_buffer);
break;
case NODE_AOV_VALUE:
svm_node_aov_value<node_feature_mask>(kg, state, sd, stack, node, render_buffer);
break;
SVM_CASE(NODE_TEX_VOXEL)
IF_KERNEL_NODES_FEATURE(VOLUME)
{
offset = svm_node_tex_voxel(kg, sd, stack, node, offset);
}
break;
SVM_CASE(NODE_AOV_START)
if (!svm_node_aov_check(path_flag, render_buffer)) {
return;
}
break;
SVM_CASE(NODE_AOV_COLOR)
svm_node_aov_color<node_feature_mask>(kg, state, sd, stack, node, render_buffer);
break;
SVM_CASE(NODE_AOV_VALUE)
svm_node_aov_value<node_feature_mask>(kg, state, sd, stack, node, render_buffer);
break;
default:
kernel_assert(!"Unknown node type was passed to the SVM machine");
return;

View File

@@ -138,7 +138,7 @@ ccl_device_noinline int svm_node_tex_coord_bump_dx(KernelGlobals kg,
case NODE_TEXCO_WINDOW: {
if ((path_flag & PATH_RAY_CAMERA) && sd->object == OBJECT_NONE &&
kernel_data.cam.type == CAMERA_ORTHOGRAPHIC)
data = camera_world_to_ndc(kg, sd, sd->ray_P + make_float3(sd->ray_dP, 0.0f, 0.0f));
data = camera_world_to_ndc(kg, sd, sd->ray_P);
else
data = camera_world_to_ndc(kg, sd, sd->P + sd->dP.dx);
data.z = 0.0f;
@@ -223,7 +223,7 @@ ccl_device_noinline int svm_node_tex_coord_bump_dy(KernelGlobals kg,
case NODE_TEXCO_WINDOW: {
if ((path_flag & PATH_RAY_CAMERA) && sd->object == OBJECT_NONE &&
kernel_data.cam.type == CAMERA_ORTHOGRAPHIC)
data = camera_world_to_ndc(kg, sd, sd->ray_P + make_float3(0.0f, sd->ray_dP, 0.0f));
data = camera_world_to_ndc(kg, sd, sd->ray_P);
else
data = camera_world_to_ndc(kg, sd, sd->P + sd->dP.dy);
data.z = 0.0f;

View File

@@ -17,104 +17,9 @@ CCL_NAMESPACE_BEGIN
/* Nodes */
typedef enum ShaderNodeType {
NODE_END = 0,
NODE_SHADER_JUMP,
NODE_CLOSURE_BSDF,
NODE_CLOSURE_EMISSION,
NODE_CLOSURE_BACKGROUND,
NODE_CLOSURE_SET_WEIGHT,
NODE_CLOSURE_WEIGHT,
NODE_EMISSION_WEIGHT,
NODE_MIX_CLOSURE,
NODE_JUMP_IF_ZERO,
NODE_JUMP_IF_ONE,
NODE_GEOMETRY,
NODE_CONVERT,
NODE_TEX_COORD,
NODE_VALUE_F,
NODE_VALUE_V,
NODE_ATTR,
NODE_VERTEX_COLOR,
NODE_GEOMETRY_BUMP_DX,
NODE_GEOMETRY_BUMP_DY,
NODE_SET_DISPLACEMENT,
NODE_DISPLACEMENT,
NODE_VECTOR_DISPLACEMENT,
NODE_TEX_IMAGE,
NODE_TEX_IMAGE_BOX,
NODE_TEX_NOISE,
NODE_SET_BUMP,
NODE_ATTR_BUMP_DX,
NODE_ATTR_BUMP_DY,
NODE_VERTEX_COLOR_BUMP_DX,
NODE_VERTEX_COLOR_BUMP_DY,
NODE_TEX_COORD_BUMP_DX,
NODE_TEX_COORD_BUMP_DY,
NODE_CLOSURE_SET_NORMAL,
NODE_ENTER_BUMP_EVAL,
NODE_LEAVE_BUMP_EVAL,
NODE_HSV,
NODE_CLOSURE_HOLDOUT,
NODE_FRESNEL,
NODE_LAYER_WEIGHT,
NODE_CLOSURE_VOLUME,
NODE_PRINCIPLED_VOLUME,
NODE_MATH,
NODE_VECTOR_MATH,
NODE_RGB_RAMP,
NODE_GAMMA,
NODE_BRIGHTCONTRAST,
NODE_LIGHT_PATH,
NODE_OBJECT_INFO,
NODE_PARTICLE_INFO,
NODE_HAIR_INFO,
NODE_POINT_INFO,
NODE_TEXTURE_MAPPING,
NODE_MAPPING,
NODE_MIN_MAX,
NODE_CAMERA,
NODE_TEX_ENVIRONMENT,
NODE_TEX_SKY,
NODE_TEX_GRADIENT,
NODE_TEX_VORONOI,
NODE_TEX_MUSGRAVE,
NODE_TEX_WAVE,
NODE_TEX_MAGIC,
NODE_TEX_CHECKER,
NODE_TEX_BRICK,
NODE_TEX_WHITE_NOISE,
NODE_NORMAL,
NODE_LIGHT_FALLOFF,
NODE_IES,
NODE_RGB_CURVES,
NODE_VECTOR_CURVES,
NODE_TANGENT,
NODE_NORMAL_MAP,
NODE_INVERT,
NODE_MIX,
NODE_SEPARATE_COLOR,
NODE_COMBINE_COLOR,
NODE_SEPARATE_VECTOR,
NODE_COMBINE_VECTOR,
NODE_SEPARATE_HSV,
NODE_COMBINE_HSV,
NODE_VECTOR_ROTATE,
NODE_VECTOR_TRANSFORM,
NODE_WIREFRAME,
NODE_WAVELENGTH,
NODE_BLACKBODY,
NODE_MAP_RANGE,
NODE_VECTOR_MAP_RANGE,
NODE_CLAMP,
NODE_BEVEL,
NODE_AMBIENT_OCCLUSION,
NODE_TEX_VOXEL,
NODE_AOV_START,
NODE_AOV_COLOR,
NODE_AOV_VALUE,
NODE_FLOAT_CURVE,
/* NOTE: for best OpenCL performance, item definition in the enum must
* match the switch case order in `svm.h`. */
#define SHADER_NODE_TYPE(name) name,
#include "node_types_template.h"
NODE_NUM
} ShaderNodeType;
typedef enum NodeAttributeOutputType {

View File

@@ -535,7 +535,8 @@ typedef struct RaySelfPrimitives {
typedef struct Ray {
float3 P; /* origin */
float3 D; /* direction */
float t; /* length of the ray */
float tmin; /* start distance */
float tmax; /* end distance */
float time; /* time (for motion blur) */
RaySelfPrimitives self;
@@ -1072,94 +1073,6 @@ typedef struct KernelCamera {
} KernelCamera;
static_assert_align(KernelCamera, 16);
typedef struct KernelFilm {
float exposure;
int pass_flag;
int light_pass_flag;
int pass_stride;
int pass_combined;
int pass_depth;
int pass_position;
int pass_normal;
int pass_roughness;
int pass_motion;
int pass_motion_weight;
int pass_uv;
int pass_object_id;
int pass_material_id;
int pass_diffuse_color;
int pass_glossy_color;
int pass_transmission_color;
int pass_diffuse_indirect;
int pass_glossy_indirect;
int pass_transmission_indirect;
int pass_volume_indirect;
int pass_diffuse_direct;
int pass_glossy_direct;
int pass_transmission_direct;
int pass_volume_direct;
int pass_emission;
int pass_background;
int pass_ao;
float pass_alpha_threshold;
int pass_shadow;
float pass_shadow_scale;
int pass_shadow_catcher;
int pass_shadow_catcher_sample_count;
int pass_shadow_catcher_matte;
int filter_table_offset;
int cryptomatte_passes;
int cryptomatte_depth;
int pass_cryptomatte;
int pass_adaptive_aux_buffer;
int pass_sample_count;
int pass_mist;
float mist_start;
float mist_inv_depth;
float mist_falloff;
int pass_denoising_normal;
int pass_denoising_albedo;
int pass_denoising_depth;
int pass_aov_color;
int pass_aov_value;
int pass_lightgroup;
/* XYZ to rendering color space transform. float4 instead of float3 to
* ensure consistent padding/alignment across devices. */
float4 xyz_to_r;
float4 xyz_to_g;
float4 xyz_to_b;
float4 rgb_to_y;
/* Rec709 to rendering color space. */
float4 rec709_to_r;
float4 rec709_to_g;
float4 rec709_to_b;
int is_rec709;
int pass_bake_primitive;
int pass_bake_differential;
int use_approximate_shadow_catcher;
int pad1;
} KernelFilm;
static_assert_align(KernelFilm, 16);
typedef struct KernelFilmConvert {
int pass_offset;
int pass_stride;
@@ -1201,108 +1114,6 @@ typedef struct KernelFilmConvert {
} KernelFilmConvert;
static_assert_align(KernelFilmConvert, 16);
typedef struct KernelBackground {
/* only shader index */
int surface_shader;
int volume_shader;
float volume_step_size;
int transparent;
float transparent_roughness_squared_threshold;
/* portal sampling */
float portal_weight;
int num_portals;
int portal_offset;
/* sun sampling */
float sun_weight;
/* xyz store direction, w the angle. float4 instead of float3 is used
* to ensure consistent padding/alignment across devices. */
float4 sun;
/* map sampling */
float map_weight;
int map_res_x;
int map_res_y;
int use_mis;
int lightgroup;
/* Padding */
int pad1, pad2;
} KernelBackground;
static_assert_align(KernelBackground, 16);
typedef struct KernelIntegrator {
/* emission */
int use_direct_light;
int num_distribution;
int num_all_lights;
float pdf_triangles;
float pdf_lights;
float light_inv_rr_threshold;
/* bounces */
int min_bounce;
int max_bounce;
int max_diffuse_bounce;
int max_glossy_bounce;
int max_transmission_bounce;
int max_volume_bounce;
/* AO bounces */
int ao_bounces;
float ao_bounces_distance;
float ao_bounces_factor;
float ao_additive_factor;
/* transparent */
int transparent_min_bounce;
int transparent_max_bounce;
int transparent_shadows;
/* caustics */
int caustics_reflective;
int caustics_refractive;
float filter_glossy;
/* seed */
int seed;
/* clamp */
float sample_clamp_direct;
float sample_clamp_indirect;
/* mis */
int use_lamp_mis;
/* caustics */
int use_caustics;
/* sampler */
int sampling_pattern;
/* volume render */
int use_volumes;
int volume_max_steps;
float volume_step_rate;
int has_shadow_catcher;
float scrambling_distance;
/* Closure filter. */
int filter_closures;
/* MIS debugging. */
int direct_light_sampling_type;
/* padding */
int pad1;
} KernelIntegrator;
static_assert_align(KernelIntegrator, 16);
typedef enum KernelBVHLayout {
BVH_LAYOUT_NONE = 0,
@@ -1320,36 +1131,25 @@ typedef enum KernelBVHLayout {
BVH_LAYOUT_ALL = BVH_LAYOUT_BVH2 | BVH_LAYOUT_EMBREE | BVH_LAYOUT_OPTIX | BVH_LAYOUT_METAL,
} KernelBVHLayout;
typedef struct KernelBVH {
/* Own BVH */
int root;
int have_motion;
int have_curves;
int bvh_layout;
int use_bvh_steps;
int curve_subdivisions;
/* Specialized struct that can become constants in dynamic compilation. */
#define KERNEL_STRUCT_BEGIN(name, parent) struct name {
#define KERNEL_STRUCT_END(name) \
} \
; \
static_assert_align(name, 16);
/* Custom BVH */
#ifdef __KERNEL_OPTIX__
OptixTraversableHandle scene;
#elif defined __METALRT__
metalrt_as_type scene;
#ifdef __KERNEL_USE_DATA_CONSTANTS__
# define KERNEL_STRUCT_MEMBER(parent, type, name) type __unused_##name;
#else
# ifdef __EMBREE__
RTCScene scene;
# ifndef __KERNEL_64_BIT__
int pad2;
# endif
# else
int scene, pad2;
# endif
# define KERNEL_STRUCT_MEMBER(parent, type, name) type name;
#endif
} KernelBVH;
static_assert_align(KernelBVH, 16);
#include "kernel/data_template.h"
typedef struct KernelTables {
int beckmann_offset;
int pad1, pad2, pad3;
int filter_table_offset;
int pad1, pad2;
} KernelTables;
static_assert_align(KernelTables, 16);
@@ -1362,18 +1162,37 @@ typedef struct KernelBake {
static_assert_align(KernelBake, 16);
typedef struct KernelData {
/* Features and limits. */
uint kernel_features;
uint max_closures;
uint max_shaders;
uint volume_stack_size;
/* Always dynamic data mambers. */
KernelCamera cam;
KernelFilm film;
KernelBackground background;
KernelIntegrator integrator;
KernelBVH bvh;
KernelTables tables;
KernelBake bake;
KernelTables tables;
/* Potentially specialized data members. */
#define KERNEL_STRUCT_BEGIN(name, parent) name parent;
#include "kernel/data_template.h"
/* Device specific BVH. */
#ifdef __KERNEL_OPTIX__
OptixTraversableHandle device_bvh;
#elif defined __METALRT__
metalrt_as_type device_bvh;
#else
# ifdef __EMBREE__
RTCScene device_bvh;
# ifndef __KERNEL_64_BIT__
int pad1;
# endif
# else
int device_bvh, pad1;
# endif
#endif
int pad2, pad3;
} KernelData;
static_assert_align(KernelData, 16);

View File

@@ -394,7 +394,7 @@ void Film::device_update(Device *device, DeviceScene *dscene, Scene *scene)
vector<float> table = filter_table(filter_type, filter_width);
scene->lookup_tables->remove_table(&filter_table_offset_);
filter_table_offset_ = scene->lookup_tables->add_table(dscene, table);
kfilm->filter_table_offset = (int)filter_table_offset_;
dscene->data.tables.filter_table_offset = (int)filter_table_offset_;
/* mist pass parameters */
kfilm->mist_start = mist_start;

View File

@@ -1362,7 +1362,7 @@ void GeometryManager::device_update_bvh(Device *device,
dscene->data.bvh.use_bvh_steps = (scene->params.num_bvh_time_steps != 0);
dscene->data.bvh.curve_subdivisions = scene->params.curve_subdivisions();
/* The scene handle is set in 'CPUDevice::const_copy_to' and 'OptiXDevice::const_copy_to' */
dscene->data.bvh.scene = 0;
dscene->data.device_bvh = 0;
}
/* Set of flags used to help determining what data has been modified or needs reallocation, so we

View File

@@ -369,6 +369,8 @@ void Scene::device_update(Device *device_, Progress &progress)
device->const_copy_to("data", &dscene.data, sizeof(dscene.data));
}
device->optimize_for_scene(this);
if (print_stats) {
size_t mem_used = util_guarded_get_mem_used();
size_t mem_peak = util_guarded_get_mem_peak();

View File

@@ -82,7 +82,7 @@ class DeviceScene {
device_vector<uint> patches;
/* pointcloud */
/* point-cloud */
device_vector<float4> points;
device_vector<uint> points_shader;
@@ -124,7 +124,7 @@ class DeviceScene {
/* integrator */
device_vector<float> sample_pattern_lut;
/* ies lights */
/* IES lights */
device_vector<float> ies_lights;
KernelData data;

View File

@@ -6671,7 +6671,7 @@ void CurvesNode::compile(SVMCompiler &compiler,
ShaderInput *fac_in = input("Fac");
compiler.add_node(type,
compiler.add_node(ShaderNodeType(type),
compiler.encode_uchar4(compiler.stack_assign(fac_in),
compiler.stack_assign(value_in),
compiler.stack_assign(value_out),
@@ -6736,7 +6736,7 @@ void RGBCurvesNode::constant_fold(const ConstantFolder &folder)
void RGBCurvesNode::compile(SVMCompiler &compiler)
{
CurvesNode::compile(compiler, NODE_RGB_CURVES, input("Color"), output("Color"));
CurvesNode::compile(compiler, NODE_CURVES, input("Color"), output("Color"));
}
void RGBCurvesNode::compile(OSLCompiler &compiler)
@@ -6774,7 +6774,7 @@ void VectorCurvesNode::constant_fold(const ConstantFolder &folder)
void VectorCurvesNode::compile(SVMCompiler &compiler)
{
CurvesNode::compile(compiler, NODE_VECTOR_CURVES, input("Vector"), output("Vector"));
CurvesNode::compile(compiler, NODE_CURVES, input("Vector"), output("Vector"));
}
void VectorCurvesNode::compile(OSLCompiler &compiler)

View File

@@ -44,8 +44,6 @@ void SVMShaderManager::device_update_shader(Scene *scene,
}
assert(shader->graph);
svm_nodes->push_back_slow(make_int4(NODE_SHADER_JUMP, 0, 0, 0));
SVMCompiler::Summary summary;
SVMCompiler compiler(scene);
compiler.background = (shader == scene->background->get_shader(scene));
@@ -170,6 +168,9 @@ SVMCompiler::SVMCompiler(Scene *scene) : scene(scene)
background = false;
mix_weight_offset = SVM_STACK_INVALID;
compile_failed = false;
/* This struct has one entry for every node, in order of ShaderNodeType definition. */
svm_node_types_used = (std::atomic_int *)&scene->dscene.data.svm_usage;
}
int SVMCompiler::stack_size(SocketType::Type type)
@@ -378,11 +379,13 @@ void SVMCompiler::add_node(int a, int b, int c, int d)
void SVMCompiler::add_node(ShaderNodeType type, int a, int b, int c)
{
svm_node_types_used[type] = true;
current_svm_nodes.push_back_slow(make_int4(type, a, b, c));
}
void SVMCompiler::add_node(ShaderNodeType type, const float3 &f)
{
svm_node_types_used[type] = true;
current_svm_nodes.push_back_slow(
make_int4(type, __float_as_int(f.x), __float_as_int(f.y), __float_as_int(f.z)));
}
@@ -663,6 +666,7 @@ void SVMCompiler::generate_multi_closure(ShaderNode *root_node,
/* Add instruction to skip closure and its dependencies if mix
* weight is zero.
*/
svm_node_types_used[NODE_JUMP_IF_ONE] = true;
current_svm_nodes.push_back_slow(make_int4(NODE_JUMP_IF_ONE, 0, stack_assign(facin), 0));
int node_jump_skip_index = current_svm_nodes.size() - 1;
@@ -678,6 +682,7 @@ void SVMCompiler::generate_multi_closure(ShaderNode *root_node,
/* Add instruction to skip closure and its dependencies if mix
* weight is zero.
*/
svm_node_types_used[NODE_JUMP_IF_ZERO] = true;
current_svm_nodes.push_back_slow(make_int4(NODE_JUMP_IF_ZERO, 0, stack_assign(facin), 0));
int node_jump_skip_index = current_svm_nodes.size() - 1;
@@ -844,6 +849,9 @@ void SVMCompiler::compile_type(Shader *shader, ShaderGraph *graph, ShaderType ty
void SVMCompiler::compile(Shader *shader, array<int4> &svm_nodes, int index, Summary *summary)
{
svm_node_types_used[NODE_SHADER_JUMP] = true;
svm_nodes.push_back_slow(make_int4(NODE_SHADER_JUMP, 0, 0, 0));
/* copy graph for shader with bump mapping */
ShaderNode *output = shader->graph->output();
int start_num_svm_nodes = svm_nodes.size();

View File

@@ -211,6 +211,7 @@ class SVMCompiler {
/* compile */
void compile_type(Shader *shader, ShaderGraph *graph, ShaderType type);
std::atomic_int *svm_node_types_used;
array<int4> current_svm_nodes;
ShaderType current_type;
Shader *current_shader;

View File

@@ -10,7 +10,8 @@ CCL_NAMESPACE_BEGIN
ccl_device bool ray_sphere_intersect(float3 ray_P,
float3 ray_D,
float ray_t,
float ray_tmin,
float ray_tmax,
float3 sphere_P,
float sphere_radius,
ccl_private float3 *isect_P,
@@ -33,7 +34,7 @@ ccl_device bool ray_sphere_intersect(float3 ray_P,
return false;
}
const float t = tp - sqrtf(radiussq - dsq); /* pythagoras */
if (t < ray_t) {
if (t > ray_tmin && t < ray_tmax) {
*isect_t = t;
*isect_P = ray_P + ray_D * t;
return true;
@@ -44,7 +45,8 @@ ccl_device bool ray_sphere_intersect(float3 ray_P,
ccl_device bool ray_aligned_disk_intersect(float3 ray_P,
float3 ray_D,
float ray_t,
float ray_tmin,
float ray_tmax,
float3 disk_P,
float disk_radius,
ccl_private float3 *isect_P,
@@ -59,7 +61,7 @@ ccl_device bool ray_aligned_disk_intersect(float3 ray_P,
}
/* Compute t to intersection point. */
const float t = -disk_t / div;
if (t < 0.0f || t > ray_t) {
if (!(t > ray_tmin && t < ray_tmax)) {
return false;
}
/* Test if within radius. */
@@ -74,7 +76,8 @@ ccl_device bool ray_aligned_disk_intersect(float3 ray_P,
ccl_device bool ray_disk_intersect(float3 ray_P,
float3 ray_D,
float ray_t,
float ray_tmin,
float ray_tmax,
float3 disk_P,
float3 disk_N,
float disk_radius,
@@ -92,7 +95,8 @@ ccl_device bool ray_disk_intersect(float3 ray_P,
}
float3 P = ray_P + t * ray_D;
float3 T = P - disk_P;
if (dot(T, T) < sqr(disk_radius) /*&& t > 0.f*/ && t <= ray_t) {
if (dot(T, T) < sqr(disk_radius) && (t > ray_tmin && t < ray_tmax)) {
*isect_P = ray_P + t * ray_D;
*isect_t = t;
return true;
@@ -103,7 +107,8 @@ ccl_device bool ray_disk_intersect(float3 ray_P,
ccl_device_forceinline bool ray_triangle_intersect(float3 ray_P,
float3 ray_dir,
float ray_t,
float ray_tmin,
float ray_tmax,
const float3 tri_a,
const float3 tri_b,
const float3 tri_c,
@@ -149,16 +154,14 @@ ccl_device_forceinline bool ray_triangle_intersect(float3 ray_P,
/* Perform depth test. */
const float T = dot3(v0, Ng);
const int sign_den = (__float_as_int(den) & 0x80000000);
const float sign_T = xor_signmask(T, sign_den);
if ((sign_T < 0.0f) || (sign_T > ray_t * xor_signmask(den, sign_den))) {
const float t = T / den;
if (!(t >= ray_tmin && t <= ray_tmax)) {
return false;
}
const float inv_den = 1.0f / den;
*isect_u = U * inv_den;
*isect_v = V * inv_den;
*isect_t = T * inv_den;
*isect_u = U / den;
*isect_v = V / den;
*isect_t = t;
return true;
#undef dot3
@@ -171,8 +174,8 @@ ccl_device_forceinline bool ray_triangle_intersect(float3 ray_P,
*/
ccl_device bool ray_quad_intersect(float3 ray_P,
float3 ray_D,
float ray_mint,
float ray_maxt,
float ray_tmin,
float ray_tmax,
float3 quad_P,
float3 quad_u,
float3 quad_v,
@@ -185,7 +188,7 @@ ccl_device bool ray_quad_intersect(float3 ray_P,
{
/* Perform intersection test. */
float t = -(dot(ray_P, quad_n) - dot(quad_P, quad_n)) / dot(ray_D, quad_n);
if (t < ray_mint || t > ray_maxt) {
if (!(t > ray_tmin && t < ray_tmax)) {
return false;
}
const float3 hit = ray_P + t * ray_D;

View File

@@ -136,6 +136,19 @@ void string_replace(string &haystack, const string &needle, const string &other)
}
}
void string_replace_same_length(string &haystack, const string &needle, const string &other)
{
assert(needle.size() == other.size());
size_t pos = 0;
while (pos != string::npos) {
pos = haystack.find(needle, pos);
if (pos != string::npos) {
memcpy(haystack.data() + pos, other.data(), other.size());
pos += other.size();
}
}
}
string string_remove_trademark(const string &s)
{
string result = s;
@@ -164,6 +177,11 @@ string to_string(const char *str)
return string(str);
}
string to_string(const float4 &v)
{
return string_printf("%f,%f,%f,%f", v.x, v.y, v.z, v.w);
}
string string_to_lower(const string &s)
{
string r = s;

View File

@@ -38,12 +38,14 @@ void string_split(vector<string> &tokens,
const string &separators = "\t ",
bool skip_empty_tokens = true);
void string_replace(string &haystack, const string &needle, const string &other);
void string_replace_same_length(string &haystack, const string &needle, const string &other);
bool string_startswith(string_view s, string_view start);
bool string_endswith(string_view s, string_view end);
string string_strip(const string &s);
string string_remove_trademark(const string &s);
string string_from_bool(const bool var);
string to_string(const char *str);
string to_string(const float4 &v);
string string_to_lower(const string &s);
/* Wide char strings are only used on Windows to deal with non-ASCII

View File

@@ -3,6 +3,7 @@
set(INC
.
../clog
../glew-mx
../../source/blender/imbuf
../../source/blender/makesdna
@@ -65,6 +66,8 @@ set(SRC
intern/GHOST_Util.h
intern/GHOST_Window.h
intern/GHOST_WindowManager.h
intern/GHOST_utildefines.h
intern/GHOST_utildefines_variadic.h
)
set(LIB

View File

@@ -547,21 +547,15 @@ typedef struct {
/** The key code. */
GHOST_TKey key;
/* ascii / utf8: both should always be set when possible,
* - ascii may be '\0' however if the user presses a non ascii key
* - unicode may not be set if the system has no unicode support
*
* These values are intended to be used as follows.
* For text input use unicode when available, fallback to ascii.
* For areas where unicode is not needed, number input for example, always
* use ascii, unicode is ignored - campbell.
*/
/** The ascii code for the key event ('\0' if none). */
char ascii;
/** The unicode character. if the length is 6, not NULL terminated if all 6 are set. */
char utf8_buf[6];
/** Generated by auto-repeat. */
/**
* Enabled when the key is held (auto-repeat).
* In this case press events are sent without a corresponding release/up event.
*
* All back-ends must set this variable for correct behavior regarding repeatable keys.
*/
char is_repeat;
} GHOST_TEventKeyData;

Some files were not shown because too many files have changed in this diff Show More