Compare commits

...

468 Commits

Author SHA1 Message Date
c396f41e14 Merge branch 'main' into geometry-nodes-simulation 2023-04-22 16:15:41 +02:00
a9f02cd8d8 Geometry Nodes: add map from bsocket index to lf socket index
The socket indices in `bNode` and their corresponding `lf::Node`
don't match exactly, because `lf::Node` does not contain the
unavailable sockets. A simple mapping from `bNodeSocket` index
to `lf::Socket` index is required for future work. For now it
only removes the need for various temporary vectors.
2023-04-22 16:14:45 +02:00
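The index mapping described in a9f02cd8d8 can be illustrated with a minimal sketch. It assumes sockets are represented only by an availability flag; the function name `build_socket_index_map` is hypothetical and is not part of Blender.

```python
from typing import List, Optional


def build_socket_index_map(is_available: List[bool]) -> List[Optional[int]]:
    """Map each original socket index to its index among available sockets.

    Unavailable sockets map to None because they have no counterpart in the
    compacted list (hypothetical stand-in for the lazy-function node).
    """
    index_map: List[Optional[int]] = []
    next_compact_index = 0
    for available in is_available:
        if available:
            index_map.append(next_compact_index)
            next_compact_index += 1
        else:
            index_map.append(None)
    return index_map


# The second socket is unavailable, so later sockets shift down by one.
print(build_socket_index_map([True, False, True, True]))  # [0, None, 1, 2]
```

Building the map once means later lookups need no temporary vectors.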
1e81e27557 Cleanup: use hash value that fits into int to fix compile error 2023-04-22 15:28:26 +02:00
8523e361e8 BLI: use decay_t to determine type to hash
Goal is to fix a compile error on macos.
2023-04-22 15:01:56 +02:00
05ddbc20b8 Simulation Nodes: bake simulation states to disk
This adds baking support to simulation nodes.

The following features are supported:
* Bake simulation nodes of selected objects from the new "Baking" panel in the
  object properties.
* Free baked/cached simulation data.
* The bake is stored on disk in a folder next to the .blend file (so it's necessary
  to save before baking works).
* Baked data is detected automatically when reloading the file.
* The data stored on disk is partially deduplicated. Only duplicates that can be
  detected using implicit-sharing are taken into account.
* The baked data can contain meshes, curves, pointclouds and instances.
* The simulation state is written using a combination of raw binary files for the
  data arrays and `.json` for meta data. Other formats besides `.json` could be
  used (most code is agnostic to that), but json is the easiest to use right now and
  seems to be good enough for the common use cases (note that the size of the `.json`
  files does not depend on how large e.g. the baked mesh is).
* During baking, there is a progress bar and it can be interrupted using escape.

Limitations:
* Volumes are not written to disk yet.
* Currently it always bakes the entire scene frame range.
* Baking subframes is supported internally, but is not exposed in the UI.
* Currently, all attributes are written, but that is likely not necessary in most
  cases (e.g. selection attributes are written as well).

Pull Request: blender/blender#106937
2023-04-22 14:48:43 +02:00
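The storage scheme from 05ddbc20b8 (raw binary data plus small `.json` metadata, with shared arrays written once) can be sketched roughly as follows. This is not the on-disk format Blender uses: the blob names, the metadata keys, and the use of Python object identity as a stand-in for implicit sharing are all assumptions made for illustration.

```python
import json
from array import array


def bake_frame(attributes):
    """Serialize one frame: raw bytes per unique array plus JSON metadata.

    `attributes` maps attribute names to array.array objects. Arrays that are
    the same Python object stand in for implicitly shared data and are
    written only once.
    """
    blobs = {}           # hypothetical blob file name -> raw bytes
    blob_for_array = {}  # id(array object) -> blob file name
    meta = {}
    for name, values in attributes.items():
        key = id(values)
        if key not in blob_for_array:
            blob_name = f"blob_{len(blobs)}.bin"
            blobs[blob_name] = values.tobytes()
            blob_for_array[key] = blob_name
        # The metadata entry has a fixed size regardless of the array length.
        meta[name] = {
            "file": blob_for_array[key],
            "type": values.typecode,
            "size": len(values),
        }
    return json.dumps(meta, indent=2), blobs
```

Because the metadata stores only a reference, a type code and a length per attribute, its size stays independent of how large the baked arrays are.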
8e967cfeaf Mesh: Cache loose vertices
Similar to the cache of loose edges added in 1ea169d90e,
cache the number of loose vertices and which are loose in a bit map.
This can save significant time when drawing large meshes in the
viewport, because recalculations can be avoided when the data doesn't
change, and because many geometry nodes set the loose geometry
caches eagerly when the meshes contain no loose elements.

There are two types of loose vertices:
1. Vertices not used by any edges or faces
   `Mesh.loose_verts()`
2. Vertices not used by any faces (may be used by loose edges)
   `Mesh.verts_no_face()`

Because both are used by Blender in various places, because the cost
is only a bit per vertex (or constant at best) and for design consistency,
we cache both types of loose elements. The bit maps will only be
allocated when they're actually used, but they are already accessed
in a few important places:
- Attribute domain interpolation
- Subdivision surface modifier
- Viewport drawing

Just skipping viewport drawing calculation after certain geometry
nodes setups can have a large impact. Here is the time taken by
viewport loose geometry extraction before and after the change:
- 4 million vertex grid node: 28 ms to 0 ms
- Large molecular nodes setup (curve to mesh node): 104 ms to 0 ms
- Realize instances with 1 million cubes: 131 ms to 0 ms

Pull Request: blender/blender#105567
2023-04-22 13:46:11 +02:00
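The two kinds of loose vertices from 8e967cfeaf, and the idea of computing each map only on first use, can be sketched like this. `MeshSketch` is a hypothetical stand-in and not the Blender `Mesh` API; plain lists of booleans replace the bit maps.

```python
from functools import cached_property


class MeshSketch:
    """Hypothetical mesh with lazily computed loose-vertex maps."""

    def __init__(self, num_verts, edges, faces):
        self.num_verts = num_verts
        self.edges = edges  # list of (v0, v1) pairs, including face edges
        self.faces = faces  # list of tuples of vertex indices

    @cached_property
    def verts_no_face(self):
        # Vertices not used by any face (they may still be on loose edges).
        used = [False] * self.num_verts
        for face in self.faces:
            for vert in face:
                used[vert] = True
        return [not u for u in used]

    @cached_property
    def loose_verts(self):
        # Vertices used by neither faces nor edges.
        loose = list(self.verts_no_face)
        for v0, v1 in self.edges:
            loose[v0] = False
            loose[v1] = False
        return loose
```

Each map is built on first access and then reused, which mirrors the "only allocated when actually used" behavior described above.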
8fbf0a79fc Geometry Nodes: replace more maps with array
This is the same as 8b6777edc2 but for node groups instead
of built-in nodes.
2023-04-22 13:36:58 +02:00
f307b5ae65 Merge branch 'main' into geometry-nodes-simulation 2023-04-22 13:13:16 +02:00
15f9e42c4f Geometry Nodes: new Index of Nearest node
The node outputs the index of the closest element to itself. See #102387
for the original design.

This is different from the Sample Nearest node in three important ways:
* It does not have a geometry input, instead the geometry is taken from the
  field evaluation context.
* The node can exclude the "current" element from the search.
* The group id input can be used to build subsets of elements that only
  consider each other as neighbors and ignore elements with other ids.

Pull Request: blender/blender#104619
2023-04-22 13:11:51 +02:00
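The behavior described in 15f9e42c4f (nearest element excluding the current one, restricted to the same group id) can be sketched with a brute-force search. The real node presumably uses a spatial acceleration structure; returning `-1` when an element has no neighbor is an assumption of this sketch.

```python
import math


def index_of_nearest(positions, group_ids):
    """For each element, return the index of the closest other element
    that has the same group id, or -1 if there is none (assumed convention)."""
    result = []
    for i, (p, group) in enumerate(zip(positions, group_ids)):
        best_index = -1
        best_dist_sq = math.inf
        for j, (q, other_group) in enumerate(zip(positions, group_ids)):
            if j == i or other_group != group:
                continue  # skip the "current" element and other groups
            dist_sq = sum((a - b) ** 2 for a, b in zip(p, q))
            if dist_sq < best_dist_sq:
                best_index, best_dist_sq = j, dist_sq
        result.append(best_index)
    return result


points = [(0, 0, 0), (1, 0, 0), (1.1, 0, 0), (5, 0, 0)]
print(index_of_nearest(points, [0, 0, 0, 0]))  # [1, 2, 1, 2]
print(index_of_nearest(points, [0, 0, 1, 1]))  # [1, 0, 3, 2]
```

With group ids, elements 1 and 2 no longer see each other even though they are the closest pair.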
8b6777edc2 Geometry Nodes: replace multiple maps with single array
The goal here is to avoid creating too many `Map`s which are fairly
large compared to simple arrays. Also, access in arrays is more
efficient.
2023-04-22 13:01:28 +02:00
0e82510ea2 Sculpt: Fix #107068: Crash in multires unsubdivide
A BMesh customdata offset was being pulled from
the base mesh's customdata.
2023-04-21 18:08:42 -07:00
21b51e7b88 Cleanup: Use function ref instead of pointer for transform gizmo
Avoid the need to cast to and from `void *` and use
lambda capture to make things a bit more automatic.

Pull Request: blender/blender#107225
2023-04-21 21:41:35 +02:00
a04b39faf4 Merge branch 'main' into geometry-nodes-simulation 2023-04-21 15:26:21 -04:00
c547ff1ebd Fix VSE thumbnails are displayed from incorrect scene
Add scene to rendering context, so it can be used for cache hashing.
2023-04-21 20:33:52 +02:00
968ecf6f8b Fix #106993: Slowness with Orbiting around select + Mesh Symmetry
The issue happens because the algorithm used to calculate the center of
the selection first needs to create a TransData array. In this array,
the code calculates the "mirrored" elements which can be quite slow in
dense meshes.

The solution is to replace this slow algorithm used for calculating the pivot
point with the fast algorithm used to calculate the position of transform
gizmos.

Pull Request: blender/blender#107203
2023-04-21 20:07:05 +02:00
7bf56e5c75 Fix failing lite build
Caused by b21695a507.
2023-04-21 19:29:16 +02:00
271ddc303d Fix Build Warning
Removing unused variable in font_cursor_text_index_from_event
2023-04-21 10:27:24 -07:00
68f8253c71 VFONT: Text Selection Operator
An operator to allow interactive text selection for 3D Text Objects.
This is from the code of Yash Dabhade (yashdabhade) for GSoC 2022
with corrections and simplifications. Also includes double-click for
word selection.

Pull Request: blender/blender#106915
2023-04-21 19:08:44 +02:00
29a4903eb8 Metal: Resolve high memory pressure on EEVEE render
When EEVEE is rendering multiple samples via
eevee_draw_scene, the command submission and in-flight
memory pressure would grow until all samples completed,
due to lack of intermediate flushing of GPU work and memory.

This patch adds a command flush and memory clear for this case
which occurs with high TAA sample counts during saving, similar
to the process in EEVEE_render_draw.

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#107221
2023-04-21 18:24:30 +02:00
35a8341d7b UI: Make it possible to add shortcuts to UI operators.
Function `WM_keymap_guess_opname()` skipped `UI_OT` operator types. In
some cases this is detrimental to workflow, see #105371.

To exclude operators from getting keyboard shortcut it was suggested by
Campbell to use flag `OPTYPE_INTERNAL` or make new one.

Pull Request: blender/blender#105383
2023-04-21 17:09:41 +02:00
b21695a507 VSE: Add sound strip retiming support
This patch contains changes needed for retiming sound strips.

`BKE_sound_set_scene_sound_pitch()` is replaced by
`BKE_sound_set_scene_sound_pitch_constant_range()` which uses new
Audaspace interface to set pitch in bulk.
This is done in `SEQ_retiming_sound_animation_data_set()` where retimed
sections are created for each strip. When strip is inside of meta
strip(s), the retimed sections of meta and actual strip are split where
they intersect and pitch is multiplied where they overlap. Each section
will have pitch value that is provided to audaspace.

Waveform overlay now represents retimed audio accurately.

Ref: #100337

Pull Request: blender/blender#105072
2023-04-21 16:53:27 +02:00
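The splitting and multiplication of retimed sections described in b21695a507 can be sketched as follows. Sections are modeled as hypothetical `(start, end, pitch)` tuples, and a pitch of 1.0 outside any section is an assumption; the actual data structures in `SEQ_retiming_sound_animation_data_set()` are not shown in the commit message.

```python
def combine_pitch_sections(meta_sections, strip_sections):
    """Split two lists of (start, end, pitch) sections at every boundary
    and multiply the pitches where they overlap."""
    boundaries = sorted(
        {t for start, end, _ in meta_sections + strip_sections for t in (start, end)}
    )

    def pitch_at(sections, start, end):
        for s, e, pitch in sections:
            if s <= start and end <= e:
                return pitch
        return 1.0  # assumed: no retiming outside the listed sections

    return [
        (start, end, pitch_at(meta_sections, start, end) * pitch_at(strip_sections, start, end))
        for start, end in zip(boundaries, boundaries[1:])
    ]


# A meta strip at double speed overlapping a strip at half speed.
print(combine_pitch_sections([(0, 10, 2.0)], [(5, 15, 0.5)]))
# [(0, 5, 2.0), (5, 10, 1.0), (10, 15, 0.5)]
```

Each resulting section carries a single constant pitch, which is the kind of value that could then be handed to the audio backend in bulk.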
b44dace9d8 Build: remove smatch, sparse & splint checking scripts
These checkers were all C-only making them increasingly less useful.
2023-04-21 23:32:55 +10:00
5e76622f47 Cleanup: remove redundant code 2023-04-21 23:27:21 +10:00
bc338aac74 Cleanup: Avoid switch fallthrough, avoid copying bit span
Duplicating a few function calls makes this section easier to follow.
2023-04-21 08:43:18 -04:00
0dd16758f0 Core: libquery: change Collection's parent pointer to master collection to 'not owned'. 2023-04-21 14:24:54 +02:00
3ee21d1098 Cleanup: rename anonymous attribute id pointer type 2023-04-21 14:14:27 +02:00
88f6d584ca Fix: wrong field inferencing in Sample Curve node
Whether the outputs are fields only depends on whether at least one of the
last three inputs is a field. It does not matter whether the `Value` input is
a field.

Pull Request: blender/blender#106007
2023-04-21 13:27:09 +02:00
e6ec1e4baf Vulkan: Attach debug utils to GPU_debug_group.
Also removes some unneeded CPP keywords.

Pull Request: blender/blender#107217
2023-04-21 12:49:36 +02:00
09a2b5c70f Docs: note that renaming data-blocks sorted them which impacts iteration
Address issue raised in #107027.
2023-04-21 20:36:29 +10:00
3650b36141 Metal: TF more optimal for hair refinement
Patch prefers usage of Transform Feedback for hair refinement
as opposed to compute, as vertex work can be pipelined with
existing rendering work which is in-flight.

This approach is ~20% faster depending on the scene. Note that
the current implementation only uses TF, as storage buffer support
is disabled. Though once storage buffer support is added, we should
still use the TF path.

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#107166
2023-04-21 12:35:21 +02:00
bebb17a973 Vulkan: Provide Debug Utilities
This PR uses the VK_EXT_debug_utils extension, but it's only for labeling, so it doesn't rely on the VK_LAYER_KHRONOS_validation functionality.

The functions that do these things are loaded into the runtime as vulkan extensions.

Declare the function pointers in a struct and make them members of vk_context.

Pull Request: blender/blender#106098
2023-04-21 12:32:40 +02:00
fc288ec856 Fix: missing variable initialization 2023-04-21 11:44:56 +02:00
99f5e60b86 RNA: ignore some large arrays in override code
This speeds up saving `070_0100.anim.blend` from the Heist project
from ~3s to ~300ms by adding PROPOVERRIDE_IGNORE in a few
places. It's not completely obvious to me when `PROPOVERRIDE_IGNORE`
should be used and when it shouldn't. Given that the same is done for
meshes already, it seems correct.

Pull Request: blender/blender#107196
2023-04-21 10:15:51 +02:00
680a54c7d0 EEVEE Next: Ensure correct texture usage for views
Add texture usage flags for textures which are used as texture views
or require texture views for backing implementation.

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#107163
2023-04-21 10:06:55 +02:00
25138fd6e0 EEVEE Next: GLSL Metal shader type compatibility
Apply compilation fixes for Metal compatibility.
This includes explicit type casts, packed data types
where vec3 alignment is inconsistent, constructor replacement
with factory function.

The Metal shader generator also needs knowledge of when bound
resources are fundamental data types, so
SHADOWS_TILE_DATA_PACKED must be described as uint in
ShaderCreateInfo.

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#107178
2023-04-21 09:55:37 +02:00
0865f80591 Merge branch 'main' into geometry-nodes-simulation 2023-04-21 09:26:59 +02:00
4134682ec2 Fix #107156: UV Cylinder/Sphere Projection fails after other operators
Caused by 6b8cdd5979.

Above commit introduced element tagging for boundary calculations but
only cleared them properly on all faces if the new `Preserve Seams`
option was chosen. We cannot be sure about the state of element tags
from prior operators though, so correct the culprit check to also only
be in effect if the new `Preserve Seams` option was chosen.

Pull Request: blender/blender#107161
2023-04-21 08:28:03 +02:00
c18351f670 Metal: Increase concurrent shader compilation threads
Leverage new API call in Metal to increase the number of threads
dedicated to concurrent shader compilation. First step to improve
parallel compilation times when multiple engines are active.

Would also enable an increase in worker threads for shader
compilation jobs within the DRWManager.

Note that this is only available in the latest
version of macOS Ventura (13.3).

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#106625
2023-04-21 07:52:17 +02:00
fdf920bf5d Metal: Add textureGrad support
Fixes compilation errors in viewport compositor.

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#106805
2023-04-21 07:45:30 +02:00
19dbe049db Fix #106469: unstable tessellation with quad-flipping detection
The result of detecting if a quad should flip the default 0-2 split
when tessellated only used a pre-calculated normal when available,
since the method of detecting the flip was different, the check for a
concave face could change depending on the existence of polygon-normals.

In practice this meant cycles render preview could use a different
tessellation than the GPU display.

While [0] exposed the bug, it's an inherent problem with having 2
methods of detecting concave quads.

Remove is_quad_flip_v3_first_third_fast_with_normal(..) and always
use is_quad_flip_v3_first_third_fast(..), because having to calculate
the normal inline has significant overhead.

Note that "bow-tie" quads may now render with a subdivision in a
different direction although they must be very distorted with both
triangles along the 0-2 split pointing away from each other.

Thanks to @HooglyBoogly for investigating the issue.

[0]: 16fbadde36.
2023-04-21 15:02:47 +10:00
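A normal-free test for whether a quad should flip its default 0-2 split, in the spirit of 19dbe049db, might look like the sketch below. It checks whether the second and fourth corners lie on the same side of the 0-2 diagonal. Whether `is_quad_flip_v3_first_third_fast(..)` uses exactly this formulation is an assumption; the helper names here are hypothetical.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def quad_should_flip(v1, v2, v3, v4):
    """Return True when splitting along v1-v3 would be invalid because
    v2 and v4 are on the same side of that diagonal."""
    d_12 = _sub(v2, v1)
    d_13 = _sub(v3, v1)
    d_14 = _sub(v4, v1)
    cross_a = _cross(d_12, d_13)
    cross_b = _cross(d_14, d_13)
    # Same direction means both corners are on one side of the diagonal.
    return _dot(cross_a, cross_b) > 0.0


square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
concave = [(0, 0, 0), (2, 0, 0), (2, 2, 0), (1.5, 0.5, 0)]
print(quad_should_flip(*square))   # False
print(quad_should_flip(*concave))  # True
```

Because the test uses only the four corner positions, it gives the same answer whether or not a face normal has been computed beforehand.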
5721b34e53 Cleanup: add win32 suffix to BLI_path_is_abs
Naming made it seem this might be the opposite of BLI_path_is_rel,
when it checks for WIN32 specific path prefixes.
2023-04-21 09:46:06 +10:00
417b62522d Cleanup: code-comments in path_util.c
- Remove duplicate doc-string.
- Use full sentences.
- Use back-ticks for path literals
  (to avoid confusion with doxy-slash commands).
2023-04-21 09:34:30 +10:00
fc749d9d25 Cleanup: replace binary '&' with '&&' check
As the intention is to check both statements are true, avoid bitwise
operations on boolean results.
2023-04-21 09:13:09 +10:00
54ce0ac922 Cleanup: use const variables when reading X11 events 2023-04-21 09:12:31 +10:00
62806012ed Cleanup: resolve uninitialized members in GHOST Window & SystemX11
While this didn't cause bugs, initialize members to avoid problems
in the future.

GHOST::SystemX11
- m_keyboard_vector
- m_keycode_last_repeat_key

GHOST::Window
- m_cursorGrabInitPos
- m_userData
2023-04-21 09:06:25 +10:00
ae24fe56a3 Cleanup: quiet shadowed variable warning 2023-04-21 08:42:10 +10:00
04faf12bd8 Cleanup: use BLI_listbase_is_single to avoid unnecessary counting 2023-04-21 08:35:09 +10:00
8e69b41bdf Cleanup: use const for implicit sharing info
Generally, one does not know if the sharing info is currently shared
and should therefore be const. Better keep it const almost all the
time and only remove the constness when absolutely necessary
and the code has checked that it is valid.
2023-04-20 23:32:33 +02:00
491f098edf Cleanup: Fix custom data memcpy call null argument
The data was only null if the size was also zero, but it's simple
to avoid the ASAN warning anyway.
2023-04-20 17:31:29 -04:00
f6ec11741c Fix #106208: data-block socket defaults not used for node group
The main challenge is to avoid dangling pointers. Currently, the lifetime of socket
declarations is somewhat unbounded (at least we didn't restrict it explicitly yet).
Therefore, storing non-owning pointers in it is tricky. For ID pointers one could
potentially use the foreach-id iterator to update pointers in declarations as well,
but that's a bit out of scope and might not be the right solution anyway, since it's
not obvious that all node declarations are reachable from IDs stored in `bmain`.

The solution now is to use a callback that retrieves the right ID pointer when it
is used. The important thing is that the callback does not capture any potentially
dangling pointer either.

Pull Request: blender/blender#107179
2023-04-20 22:27:45 +02:00
4babb7c02e Cycles: oneAPI: Fix volume intersection for Embree GPU execution 2023-04-20 21:20:33 +02:00
0d9fa73b42 Cycles: oneAPI: Fix motion blur rendering for Embree GPU execution
CPU non-unified shared memory was used for shared geometry buffers.
For the Embree GPU case, we now create new geometry buffers on GPU instead.
2023-04-20 21:20:33 +02:00
7e92fb92ec Cycles: oneAPI: Fix kernels preloading in case of incompatible AoT binaries
When running oneAPI with AoT binaries, on hardware that's not compatible with
these, recompilation could have been missing from the kernels loading phase and
happen during execution instead.

These changes fix it: any kernel compilation will now happen during the
kernels loading phase.
2023-04-20 21:20:33 +02:00
13d30b0481 Cleanup: fix various warnings on Windows
Ensure windows.h is included before some other headers to avoid
redefining macros.

Pull Request: blender/blender#107189
2023-04-20 20:46:13 +02:00
c732d901a7 Fix: Iteration for BMLayerCollection was broken
It was broken in two ways:
- bpy_bmlayercollection_iter passed PY_SSIZE_T_MIN, while
PY_SSIZE_T_MAX was needed.
- bpy_bmlayercollection_subscript_slice() contained an
off-by-one error.

Pull Request: blender/blender#107165
2023-04-20 20:28:25 +02:00
f04a7a07e3 macOS: Add open files to system recent files
Completes the TODO in GHOST_SystemPathsCocoa::addToSystemRecentFiles
Also renames the filename parameter to the more appropriate filepath.

The recently opened/saved file will now also show up in:
- Blender Dock icon > Right click.
- Three finger swipe down in Open Blender i.e., App Expose

Based on an earlier contribution by @jenkm.

Pull Request: blender/blender#107174
2023-04-20 23:53:08 +05:30
82ca3d3604 Fix #107185: Edit mode or existing attribute break rest position
After e45ed69349 we need to remove the existing attribute
when adding the rest position before evaluating modifiers. Also, adding
the rest position attribute was completely skipped in edit mode.

Pull Request: blender/blender#107190
2023-04-20 20:18:02 +02:00
2f4a8ecf18 Fix: Spreadsheet missing other geometry types for edit mode mesh objects
We need to add to the spreadsheet's display geometry set
rather than completely replacing it with just the mesh.
2023-04-20 13:06:01 -04:00
b2c822065c Fix #106977: Crash when OpenEXR IO fails
The crash can occur in the following situations:

- Attempt to open a corrupted EXR file
- Attempt to save an EXR file under a non-existing directory.

The root cause is not really clear: for some reason the OpenEXR API on
the Blender side can not catch OpenEXR exceptions by a constant
reference to a std::exception, although it can by a constant reference
to an Iex::BaseExc.

This does not seem to be an issue with the OpenEXR library itself as
the idiff tool from our SVN folder catches the exceptions correctly.
It is also not caused by the symbols_apple.map as erasing it does not
make the problem go away.

It could still be some compiler/visibility flag which we were unable
to nail down yet.

The proposed solution is to add catch-all cases, mimicking the OIIO
tools. This solves the problem with the downside is that there are
no friendly error messages in the terminal. Those messages could be
brought as part of the workaround by additionally catching the
Iex::BaseExc exception. But probably nobody relies on those error
prints anyway, so added complexity in the code is likely does not
worth it.

Pull Request: blender/blender#107184
2023-04-20 18:40:07 +02:00
5c4b0c98d3 Animation: Add in Parent space alignment option to the Transform Orientation gizmo
Animation: Adds a new "Parent Space" Orientation option for the Transformation Gizmo.

---
For child targets (objects, bones, etc) being able to transform in parent space is a desired feature (especially when it comes to rigging / animation).

For objects:
* with a parent, the gizmo orients to its parent's orientation
* without a parent, the gizmo orients to Global space

For Armatures:
* Child bone shows parent's space regardless if "Local Location" is set for parent bone
* For root bone **without** "Local Location" set, use the armature object's space.
* For root bone **with** "Local Location" set, use local bone space.

---

No new transformation orientation code needs to be written; we can achieve the desired results by using the existing `transform_orientations_create_from_axis`, `ED_getTransformOrientationMatrix`, and `unit_m3` methods. To do this, we check to see if the bone has a parent; if so, we use the bone's pose matrix (`pose_mat`). This is done similarly for objects using the parent's object matrix (`object_to_world`).

Pull Request: blender/blender#104724
2023-04-20 17:40:19 +02:00
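The selection rules listed in 5c4b0c98d3 amount to a small decision table, sketched below. The classes and the returned labels are hypothetical and only name which space would be used; no matrices are computed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BoneSketch:
    name: str
    parent: Optional["BoneSketch"] = None
    use_local_location: bool = False


@dataclass
class ObjectSketch:
    name: str
    parent: Optional["ObjectSketch"] = None


def parent_space_for_object(ob: ObjectSketch) -> str:
    # Objects fall back to Global space when they have no parent.
    return f"orientation of {ob.parent.name}" if ob.parent else "global"


def parent_space_for_bone(bone: BoneSketch) -> str:
    if bone.parent is not None:
        # Child bones use the parent's space regardless of "Local Location".
        return f"space of parent bone {bone.parent.name}"
    if bone.use_local_location:
        return f"local space of {bone.name}"
    return "armature object space"
```

For example, a root bone without "Local Location" resolves to the armature object's space, while any child bone resolves to its parent bone's space.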
0e23aef6b6 Fix build error when not using unity build 2023-04-20 15:46:15 +02:00
4d34028ce9 use BitArrayVector instead of MultiValueMap for attribute propagation detection
This simplifies the code and also makes it more efficient in many cases
(although not by much in my simple tests).
2023-04-20 14:31:21 +02:00
475f9a3e23 Cycles: Break up geometry.cpp and scene.cpp file into smaller pieces
scene.cpp and geometry.cpp are large files that can be broken up into smaller, easier-to-handle files. This change has been broken out from #105403 to make understanding the changes easier.

geometry.cpp is broken up into:
1. geometry.cpp
2. geometry_attributes.cpp
3. geometry_bvh.cpp
4. geometry_mesh.cpp

scene.h & scene.cpp is broken into:
1. scene.h
2. scene.cpp
3. devicescene.h
4. devicescene.cpp

Pull Request: blender/blender#107079
2023-04-20 12:26:02 +02:00
100f37af49 Fix #100053: Incorrect saving asset catalogs after renaming parent item
When a parent item was renamed, the `TreeView` was doing everything as
expected, however `AssetCatalogService::update_catalog_path` is supposed
to also update the catalog paths of all sub-catalogs [which it does --
but it does not tag sub-catalogs as having unsaved changes, resulting in
wrong saving of catalogs afterwards, meaning the parent item was saved
with the old name and a new item with the new name was created].

Now also tag sub-catalogs for having unsaved changes.

This should also go into 3.3 LTS

Pull Request: blender/blender#107121
2023-04-20 11:21:27 +02:00
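The fix in 100f37af49 (renaming a parent catalog must also tag its sub-catalogs as changed) can be sketched as below. The `Catalog` class and this free function are hypothetical simplifications and not the `AssetCatalogService` interface.

```python
from dataclasses import dataclass


@dataclass
class Catalog:
    path: str
    has_unsaved_changes: bool = False


def update_catalog_path(catalogs, old_path, new_path):
    """Rename a catalog and all of its sub-catalogs, tagging each as unsaved."""
    prefix = old_path + "/"
    for catalog in catalogs:
        if catalog.path == old_path or catalog.path.startswith(prefix):
            catalog.path = new_path + catalog.path[len(old_path):]
            # Without this tag, a sub-catalog would be saved under its old path.
            catalog.has_unsaved_changes = True
```

Matching on `old_path + "/"` keeps unrelated catalogs such as `propsheet` untouched when `props` is renamed.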
7ce10ebbbf Cycles: oneAPI: Remove excess quotes in a capabilities output 2023-04-20 11:09:16 +02:00
770b193253 Cleanup: use function style casts & nullptr, spelling in comments 2023-04-20 18:28:50 +10:00
0fa68d1a01 Cleanup: format 2023-04-20 18:28:50 +10:00
6d35e1c238 Fix missing include causing build error & invalid NULL check 2023-04-20 18:28:50 +10:00
fe7815e117 Fix #106771: Selection offset in timeline when NLA track is offset
The selection (box select, click select...) had an offset when selecting keys in the timeline.
That was because the function to get the NLA mapping ignored the timeline.

Pull Request: blender/blender#106904
2023-04-20 10:26:26 +02:00
60ced5283a Animation: make properties from motion path library overrideable
The following properties were not library overrideable, but now are
* Line Thickness
* Color
* Custom Color Checkbox

Pull Request: blender/blender#106959
2023-04-20 10:08:39 +02:00
4054d76749 Fix: Normalization with baked curves and preview range
Currently when a baked curve is in the Graph Editor and normalization is enabled, it doesn't work.
It even throws a warning.

This patch adds the missing logic to normalize baked FCurves within a preview range.

Pull Request: blender/blender#106890
2023-04-20 10:07:49 +02:00
88b125e75d Fix regression tests failure on the latest Xcode
When using Xcode version 14.3 on Apple Silicon hardware a number of
regression tests fail. This change fixes this problem.

The root cause comes down to floating point contraction. It was already
disabled for GCC on Linux, but not for Clang on either Linux or
macOS.

Also corrected the comment about the Clang default, as it was set to on
somewhere in 2021.

Pull Request: blender/blender#107136
2023-04-20 08:56:55 +02:00
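Why floating point contraction (88b125e75d) can change test results is easy to show in isolation: a contracted `a * b + c` rounds once, while the separate multiply and add round twice. The sketch below emulates the single rounding with exact rational arithmetic; the specific values are chosen for illustration and are not taken from Blender's tests.

```python
from fractions import Fraction

a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

# Two roundings: the exact product 1 - 2**-54 rounds to 1.0 before the add.
unfused = a * b + c

# One rounding, as a fused multiply-add would do (emulated exactly here).
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print(unfused)  # 0.0
print(fused == -(2.0 ** -54))  # True
```

The difference is tiny, but a regression test comparing against reference output can flip when the compiler decides to contract such expressions.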
b69f8de5b5 Fix #105450: Resolve box selection issue in Metal
Occlusion query buffers not being cleared to zero resulted in
erroneous selection in certain situations.

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#107135
2023-04-20 08:47:56 +02:00
dda4c0721c EEVEE-Next: Resolve compilation errors in Metal
Shader source requires explicit conversions and shader address
space qualifiers in certain places in order to compile for Metal.

We also require constructors for a number of default struct types.

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#106219
2023-04-20 08:03:31 +02:00
397a14deff GPencil: Several Weight Paint additions
This patch adds several tools and options to the weight paint mode of Grease Pencil.

* Blur tool: smooths out vertex weights, by calculating a gaussian blur of adjacent vertices.
* Average tool: painting the average weight from all weights under the brush.
* Smear tool: smudges weights by grabbing the weights under the brush and 'dragging' them.

* With the + and - icons in the toolbar, the user can easily switch between adding and subtracting weight while drawing weights.
* With shortcut `D` you can toggle between these two.

* The auto-normalize option ensures that all bone-deforming vertex groups add up to 1.0 while weight painting.
* With `Ctrl-F` a radial control for weight is invoked (in addition to the radial controls for brush size and strength).
* With `Ctrl-RMB` the user can sample the weight. This sets the brush Weight from the weight under the cursor.

* When painting weights in vertex groups for bones, the user can quickly switch to another vertex group by clicking on a bone with `Ctrl-LMB`.
For this to work, follow these steps:
* Select the armature and switch to Pose Mode.
* Select your Grease Pencil object and switch immediately to Weight Paint Mode.
* Select a bone in the armature with `Ctrl-LMB`. The corresponding vertex group is automatically activated.

Pull Request: blender/blender#106663
2023-04-20 07:55:24 +02:00
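The auto-normalize behavior from 397a14deff can be sketched for a single vertex as follows. This simplified version scales all given weights uniformly and assumes every group passed in is bone-deforming; how Blender treats locked groups or the group currently being painted is not covered.

```python
def auto_normalize(weights):
    """Scale the weights of one vertex so they add up to 1.0.

    `weights` maps vertex group names to weights. A vertex with no weight
    at all is returned unchanged to avoid dividing by zero.
    """
    total = sum(weights.values())
    if total == 0.0:
        return dict(weights)
    return {group: weight / total for group, weight in weights.items()}


print(auto_normalize({"arm": 0.5, "hand": 1.5}))  # {'arm': 0.25, 'hand': 0.75}
```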
3d6117994c Realtime Compositor: Implement ID Mask node
This patch implements the ID Mask node for the realtime compositor.

The node can be implemented as a GPU shader operation when the
anti-aliasing option is disabled, which is something we should do when
the evaluator allows nodes to be executed as both standard and GPU shader
operations.

Pull Request: blender/blender#106593
2023-04-20 07:20:58 +02:00
335d32153e Cleanup: remove dead code, reduce variable scope 2023-04-20 13:57:31 +10:00
62cc09f267 Cleanup: match argument names between functions & declarations 2023-04-20 13:52:58 +10:00
92f79e002e Cleanup: format 2023-04-20 13:35:35 +10:00
3f0d2cf9e1 Add scripts dir to the make format paths for Python
`make format` uses autopep8 to format Python, using a list of paths
specified in `tools/utils_maintenance/autopep8_format_paths.py`. The
scripts folder used to be a submodule inside release, but it is now at
the root of the blender repo.

This commit adds `scripts` to the list of paths to format.

Ref !107143
2023-04-20 05:30:34 +02:00
80fdf4a88d Merge branch 'main' into geometry-nodes-simulation 2023-04-19 23:25:50 -04:00
716b9cff23 Fix: Search in node editors missing items
Mistake in d6abd2ce72.
2023-04-19 23:19:45 -04:00
945d71b56b Merge branch 'main' into geometry-nodes-simulation 2023-04-19 22:52:35 -04:00
5ca7e1301f Cleanup: Remove redundant custom data initialization
The comment didn't really make sense, since the removed code did the
same thing as the CustomData function anyway, and that's already done
in `mesh_init_data`.
2023-04-19 22:45:48 -04:00
a32fb96311 Cleanup: Use more specific arguments to calc edges function 2023-04-19 22:45:48 -04:00
3e41b98295 Cleanup: Use utility to create mesh in metaball tessellation
Avoid the need to add each data array manually.
2023-04-19 22:45:48 -04:00
b633b460b8 Cleanup: STL Import: Use utility to copy corner verts data 2023-04-19 22:45:48 -04:00
639ec2e5a9 BLI_path: add BLI_path_extension_or_end
Some callers that access the extension operated on the end of the path
even when there was no extension. Using this avoids having to assign
the end of the string using a separate check.
2023-04-20 12:32:25 +10:00
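The idea behind `BLI_path_extension_or_end` (639ec2e5a9) can be sketched with string indices instead of pointers. The rule that a leading dot in the file name does not start an extension is an assumption of this sketch and may not match the C implementation exactly.

```python
def path_extension_or_end(path):
    """Return the index where the extension starts, or len(path) if the
    file name has no extension."""
    basename_start = max(path.rfind("/"), path.rfind("\\")) + 1
    dot = path.rfind(".", basename_start)
    if dot <= basename_start:
        # No dot at all (-1), or only a leading dot as in ".bashrc".
        return len(path)
    return dot


path = "/tmp/render.0001.png"
print(path[path_extension_or_end(path):])  # .png
```

Callers can always slice or append at the returned index, so no separate "has extension" branch is needed.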
b778e09492 Cleanup: use memmove instead of a string copy for BLI_path_suffix 2023-04-20 11:58:30 +10:00
7884de02f3 Tests: add BLI_path_extension replace/ensure tests for overflow handling 2023-04-20 11:58:29 +10:00
95296dc3aa Cleanup: remove "Path" prefix from path_utils tests 2023-04-20 11:58:27 +10:00
9be0304b67 Cleanup: order expected value last in path_util tests 2023-04-20 11:58:25 +10:00
b6e527febb Tests: add more path extension tests 2023-04-20 11:58:23 +10:00
373cfa731f Cleanup: use EXPECT_STREQ instead of EXPECT_EQ_ARRAY
While both work, the output of strings being different is more useful.
2023-04-20 11:58:22 +10:00
7cc7cd0e80 Cleanup: use a define for all style flags
This ensures new styles only need to be added in one place.
2023-04-20 11:58:20 +10:00
5294758830 Fix buffer overflow in BLI_path_frame_strip with long extensions
The file extension was copied into a buffer without checking its size.
While large extensions aren't typical, some callers used small fixed
size buffers so an unusually named file could crash.
2023-04-20 11:47:22 +10:00
80edd10168 Fix regression in BLI_path_suffix for long extensions
Changes from [0] passed in a pointer size to BLI_strncpy.

[0]: f8e23e495b
2023-04-20 11:18:26 +10:00
f87e474af0 Cleanup: Move view3d_gizmo_ruler.c to C++
Move view3d_gizmo_ruler.c to C++ to make further changes easier.
See #103343

Pull Request: blender/blender#107148
2023-04-20 00:06:49 +02:00
911f9bea84 Fix #107067: Properly clear CD_FLAG_ACTIVE/DEFAULT_COLOR flags
Runtime this information is stored in the active_color_attribute and
default_color_attribute strings on the mesh, however when saving it
is still saved in the old format with flags on the CustomData layers.
When converting from the strings to the layers not all flags were
properly cleared from the CustomData layers, leading to multiple
layers having the CD_FLAG_COLOR_ACTIVE/RENDER flag.
2023-04-19 22:26:31 +02:00
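The conversion described in 911f9bea84 (from attribute name strings back to per-layer flags) only stays consistent if stale flags are cleared on every layer first. A minimal sketch, with layers as plain dictionaries and hypothetical bit values for the two flags:

```python
CD_FLAG_COLOR_ACTIVE = 1 << 0  # bit values are assumptions for this sketch
CD_FLAG_COLOR_RENDER = 1 << 1


def write_color_flags(layers, active_name, default_name):
    """Set the active/render flags from the name strings, clearing any
    stale flags on all layers first."""
    for layer in layers:
        layer["flag"] &= ~(CD_FLAG_COLOR_ACTIVE | CD_FLAG_COLOR_RENDER)
        if layer["name"] == active_name:
            layer["flag"] |= CD_FLAG_COLOR_ACTIVE
        if layer["name"] == default_name:
            layer["flag"] |= CD_FLAG_COLOR_RENDER
```

Clearing inside the same loop guarantees that at most one layer ends up with each flag, and unrelated flag bits are left alone.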
e05cbad0d1 Sculpt: Fix #107093: expand helper function not specialized to pbvh
sculpt_expand_is_face_in_active_component wasn't specialized for
the different PBVH types.
2023-04-19 12:59:09 -07:00
2ab500c234 Cleanup: Remove unnecessary point cloud function argument
The "nomain to main" function for point clouds now always takes
ownership of the source data-block, just like the mesh version.
2023-04-19 15:52:56 -04:00
7535ab412a Cleanup: Remove redundant "reference" argument to geometry copy
Implicit sharing means attribute ownership is shared between geometry
data-blocks, and the sharing happens automatically. So it's unnecessary
to choose whether to enable it when copying a mesh.
2023-04-19 15:52:56 -04:00
10d175e223 Cleanup: Use consistent argument order for mesh creation functions
The typical order is vertex, edge, face(polygon), corner(loop), but in
these three functions polys and loops were reversed. Also use more
typical "num" variable names rather than "len"
2023-04-19 15:52:56 -04:00
60bb57663a Cleanup: IO: Separate creating mesh and adding to Main
Create a "nomain" mesh when converting intermediate representations
to a Mesh, meaning those areas don't have to know about data-block
names or the main database, and also that the boilerplate of adding
attributes individually can be avoided. The attribute arrays aren't
copied here, so the performance should be unaffected.
2023-04-19 15:52:56 -04:00
9344deed89 UI: Change the name of Invert nodes to Invert Color
The nodes for inverting a color are named simply Invert, which begs the question: invert what?

This patch changes the naming for the node in Shading, Texture and Compositing nodes to *Invert Color*

This matches the naming of other color dedicated nodes like Separate Color or Combine Color

Pull Request: blender/blender#106750
2023-04-19 21:52:20 +02:00
c6d4de9e49 Render: Fix crash in baking
corner_edges was being passed to a function that expected
corner_verts.
2023-04-19 12:40:50 -07:00
199c7da06d Assets: Do Not Show Blank Read-Only Metadata
Do not show asset metadata "description", "license", "copyright", or
"author" if they are empty AND read-only, since they can't be edited
and contain no useful information to show.

Pull Request: blender/blender#105812
2023-04-19 19:55:26 +02:00
acb34c718e Fix #107120: Small fixes to OS File Operations
Small fixes to recent file operations changes. FileOperations enum
starting with zero results in bad behavior with EnumPropertyItem. Typo
fix.

Pull Request: blender/blender#107138
2023-04-19 19:43:15 +02:00
097b9c5a36 Fix: Build error after last commit
Also fix fallthrough warnings
2023-04-19 13:05:08 -04:00
98ccee78fe Geometry Nodes: Slightly optimize mesh to curve node
Avoid copying the selected edges if all edges are selected, and
parallelize gathering the selection otherwise. Also use `int2` instead
of `std::pair`.

In a simple test file I observed an approximate 10% FPS improvement,
though in real world cases the impact is probably much smaller.
2023-04-19 12:35:09 -04:00
d5757a0a10 Cycles: re-enable AMD GPU binaries on Windows
Using the new HIP SDK 5.5 that includes a fix for the compiler bug.

This also enables the light tree.

For Linux the binaries are still disabled. ROCm 5.5 is planned to
include the same fix but not released yet. When that happens we
should be able to enable Linux as well.

Ref #104786
Fix #104085

Pull Request: blender/blender#107098
2023-04-19 18:18:05 +02:00
45c0762f1b Fix #107125: Entering Grease Pencil Vertex Paint mode crashes
Caused by uninitialized `ToolSettings` `GpPaint` [which was later
accessed in `BKE_gpencil_palette_ensure`].

Not 100% sure why `ToolSettings` `GpPaint` is properly initialized in a
default startup file, but for some files, this was not the case (as in
the report)

See 22462fed00 for a similar commit.

Now initialize `ToolSettings` `GpPaint` (alongside `GpVertexPaint`) when
entering grease pencil vertex paint mode.

Should probably go into LTS releases as well.

Pull Request: blender/blender#107131
2023-04-19 16:50:46 +02:00
599e52119f Fix #107101: Update depsgraph on muting VSE channel
Muting a VSE channel did not mute the sound. This was caused by a lack
of depsgraph updates for sound when the mute state changed for the channel.
Now fixed.

Caused by ad146bd17a

Pull Request: blender/blender#107116
2023-04-19 16:27:21 +02:00
a7422f3cd7 deps_builder/windows: Cleanup dpcpp harvest
The dpcpp folder grew from 200M to 500M with the last update
due to lld being enabled and having 5 different copies in the bin
folder. We do not need to ship lld so it can be safely removed.

However previous harvest cleaned up the build folder before copying
the libs to their final destination in output, this will no longer
work, since we actually do need lld to build embree.

So copy to the full build folder to output first, then remove the
binaries we do not need. Embree will use the binaries in the build
folder so it will be unaffected by this.
2023-04-19 07:57:46 -06:00
d6abd2ce72 Fix #106138: Node add searches missing context-based poll
Before the add node search refactor and link-drag-search, nodes were
filtered out based on whether they worked with the active render
engine. For example, the Principled Hair BSDF node doesn't work with
EEVEE, so it isn't displayed in the UI. While we might want to relax
this in the future, we have no better way to show that they don't work
right now, so it's best to keep that behavior.

The filtering is implemented with a new node type callback, mainly
to reduce the boilerplate of implementing many node search callbacks
otherwise. It's also relatively clear this way I think. The only
downside is that now there are three poll functions.

I didn't port the "eevee_cycles_shader_nodes_poll" to the new
searches, since I don't understand the purpose of it.

Pull Request: blender/blender#106829
2023-04-19 15:48:18 +02:00
91a29c9b9a Fix #107127: Context property driver to view layer does not work
The resolution of the driver value RNA path was using the wrong
property (it was forced to be referenced relative to the ID).

Pull Request: blender/blender#107129
2023-04-19 15:32:37 +02:00
5ab48a53e4 Cleanup: Use generic edge calculation for legacy curve to mesh
Change the "displist to mesh" conversion to use the edge calculation
function used everywhere else, to allow removing the old code. This
changes edge vertex and corner edge indices, requiring a test update,
but the visual result should be the same.
2023-04-19 09:29:08 -04:00
b647c2b88d Cleanup: Remove unused variables/functions
Also change from `unsigned int` to `uint` for consistency
between function declarations and definitions.
2023-04-19 08:52:48 -04:00
86611a5fcc Tests: add tests for BLI_path_extension ensure & replace 2023-04-19 21:15:43 +10:00
7f2c7feaee Fix #107113: VSE channel buttons invisible in Light theme
Also fix inconsistency in Movie Clip and Status Bar headers.
2023-04-19 12:39:33 +02:00
c0f7801660 Fix regression in BLI_path_extension_ensure
Error in [0] removed trailing '.' stripping.

[0]: f8e23e495b
2023-04-19 20:33:55 +10:00
a5140712cc Merge branch 'main' into geometry-nodes-simulation 2023-04-19 11:25:26 +02:00
e45ed69349 Attributes: Integrate implicit sharing with the attribute API
Add the ability to retrieve implicit sharing info directly from the
C++ attribute API, which simplifies memory usage and performance
optimizations making use of it. This commit uses the additions to
the API to avoid copies in a few places:
- The "rest_position" attribute in the mesh modifier stack
- Instance on Points node
- Instances to points node
- Mesh to points node
- Points to vertices node

Many files are affected because in order to include the new information
in the API's returned data, I had to switch a bunch of types from
`VArray` to `AttributeReader`. This generally makes sense anyway, since
it allows retrieving the domain, which wasn't possible before in some
cases. I overloaded the `*` dereference operator for some syntactic sugar
to avoid the (very ugly) `.varray` that would be necessary otherwise.

Pull Request: blender/blender#107059
2023-04-19 11:21:06 +02:00
19ac02767c Fix regression in recent BLI_path extension logic
Error in [0] meant BLI_path_extension_replace &
BLI_path_extension_ensure did nothing when the input path had no
extension.

[0]: f8e23e495b
2023-04-19 18:38:56 +10:00
fd10ecaeaf Fix bitwise logical operation in Metal backend
Pull Request: blender/blender#107084
2023-04-19 10:02:12 +02:00
187998970a Fix unused variable in release build in Metal backend 2023-04-19 10:02:09 +02:00
c872b6b930 Fix set but unused variable in Freestyle 2023-04-19 10:02:09 +02:00
3c34b13cf8 Fix set but unused variable in mesh intersect
A bit tricky, since there is also variable shadowing involved.
2023-04-19 10:02:09 +02:00
9e63c3cee8 Fix strict prototypes in Audio 2023-04-19 10:02:09 +02:00
a20f45bab9 Fix unqualified access to std::move in OpenSubdiv 2023-04-19 10:02:09 +02:00
63c20e08c4 Fix set but unused variable in Libmv 2023-04-19 10:02:09 +02:00
4f7dc1e4b6 Fix set but unused variable in IK solver 2023-04-19 10:02:09 +02:00
8ed543c6f2 Fix set but unused variable in dualcon octree 2023-04-19 10:02:09 +02:00
daaed83a32 Fix set but unused variables in Cycles 2023-04-19 10:02:09 +02:00
7982d86117 Fix unqualified access to std::move in Cycles 2023-04-19 10:02:09 +02:00
33e5cd4e2f Fix bitwise operation used on boolean in Mantaflow 2023-04-19 10:02:09 +02:00
8365bce958 CMake: Add extra strict flags cancellation for Clang 2023-04-19 10:02:09 +02:00
b0ec4d889a Fix #106998: selection of bones in grease pencil weightpaint mode fails
Caused by 2eeec49640.

Above commit would early out when falling through the specialized
greasepencil selection operator to view3d_select_exec. But in order to
select posebones in grease pencil weightpaint mode, we still have to
continue with view3d_select_exec.

Now check this special case [with convenient
`BKE_object_pose_armature_get_with_wpaint_check`] and DONT early out in
that case.

Should go into 3.3 LTS as well.

Pull Request: blender/blender#107076
2023-04-19 09:13:21 +02:00
40c76a1945 Merge branch 'main' into geometry-nodes-simulation 2023-04-19 08:57:35 +02:00
c10e8e4166 Fix #106751: No implicit conversion for group inputs
When a node input is connected to a group node input that is unlinked
and is of a different type, no implicit conversion takes place, so the
value is unexpected.

This patch fixes that by considering the types of both sockets and doing
implicit conversion if necessary.
2023-04-19 06:24:39 +02:00
ed590e9181 macOS/GTests: simplify blender_test library linking
Reverts dcb2821292 but handles
the linker error by relying on target_link_libraries deduplication.

Reverts 18a15bafe8 but handles
blender_test linking after dependency change by passing
lib to target_link_libraries itself.

Closes #107033
2023-04-19 09:05:43 +05:30
26a194abbd BLI_path: add BLI_path_extension_strip as an alternative to replace
While replacing the extension with an empty string works,
it required a redundant string-size argument which took a dummy
value in some cases. Avoid having to pass in a redundant string size by
adding a function that strips the extension.
2023-04-19 12:59:43 +10:00
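As a rough Python analogue of the two approaches described in 26a194abbd (not the C implementation): `extension_replace` and `extension_strip` are hypothetical names, and `os.path.splitext` stands in for the extension search. The point is only that stripping needs no replacement argument at all.

```python
import os.path


def extension_replace(path: str, ext: str) -> str:
    """Swap whatever extension `path` has for `ext` (hypothetical helper)."""
    root, _old_ext = os.path.splitext(path)
    return root + ext


def extension_strip(path: str) -> str:
    """Drop the extension; no replacement argument is needed (hypothetical helper)."""
    return os.path.splitext(path)[0]


# Replacing with an empty string and stripping give the same result.
print(extension_replace("render/frame.png", ""))
print(extension_strip("render/frame.png"))
```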
9e6757f20f Cleanup: expand on why the extension isn't replaced for blend-file save 2023-04-19 12:58:54 +10:00
61fe8da989 Cleanup: avoid changing the filepath for alembic frame range calculation
The internal utility get_sequence_len would make its filename
argument absolute so as to scan its directory for files.

Perform this on the directory instead so the filename can be const.
2023-04-19 12:33:28 +10:00
643f8bcedd Cleanup: avoid redundant string copy
This may have been done because BLI_path_frame_get used to take a
non-const string.
2023-04-19 12:32:42 +10:00
f8e23e495b BLI_path: improve behavior of BLI_path_extension
Finding the extension included hidden files (starting with a '.'),
now finding the extension matches Python's `os.path.splitext` behavior
which has the advantage that a hidden file is not considered one long
extension - with an empty name part.

Also update code to use BLI_path_extension in cases which previously
in-lined this logic.

BLI_path_frame_get path argument is now const,
it was being manipulated unnecessarily.
2023-04-19 11:33:26 +10:00
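The `os.path.splitext` behavior that f8e23e495b refers to can be checked directly in Python. This only shows the reference behavior being matched; it says nothing about how the C function is written.

```python
import os.path

# A leading '.' marks a hidden file, not the start of an extension.
for path in (".bashrc", "scene.blend", "archive.tar.gz", "dir.d/.hidden"):
    name, ext = os.path.splitext(path)
    print(f"{path!r}: name={name!r} ext={ext!r}")
```

For `.bashrc` and `dir.d/.hidden` the extension is empty and the whole string is the name part, which is the case the commit message calls out.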
7f241fc773 Tests: add test for BLI_path_suffix & BLI_path_sequence_decode 2023-04-19 11:33:26 +10:00
6d2351d26b Text object: operators to move cursor to the top or bottom
This adds new movement types TEXT_BEGIN and TEXT_END to allow
FONT_OT_move and FONT_OT_move_select operators move the text
cursor (caret) to the top and bottom of the text.

Pull Request: blender/blender#106196
2023-04-19 02:18:19 +02:00
846d78b09a Cleanup: improve doc-strings for EditFont 2023-04-19 09:06:24 +10:00
b132118f89 Cleanup: balance doxygen grouping, minor grouping adjustment 2023-04-19 09:02:21 +10:00
88f5dd3c72 Cleanup: format 2023-04-19 08:02:42 +10:00
eb2867de90 Cleanup: spelling in comments 2023-04-19 08:02:41 +10:00
Mateusz Albecki
0fd14d659b GHOST/Wayland: Fix disposeContext with VK
During createOffscreenContext with VK backend enabled wl_surface
was not stored in the context's user data. This resulted in nullptr
dereference later on during disposeContext. Added a line that sets
user data and additionally added nullptr checks in disposeContext.

Ref !107057.
2023-04-19 07:44:06 +10:00
1469613d65 Fix Build Warnings
A differing const argument and an unused var caused by conditionals.

Introduced in 694f792ee1
2023-04-18 13:59:07 -07:00
95bc1dd0e5 Cleanup: Missing definition 2023-04-18 16:56:26 -04:00
4382a0b350 Cleanup: avoid warnings from gcc in oneAPI device compilation
When building using GCC and with Embree without GPU support, there were
a few unused variables and a non-defined macro.
2023-04-18 22:40:40 +02:00
70892e82ac Cycles: oneAPI: use specialization constant to compile with/without Embree on GPU 2023-04-18 22:09:42 +02:00
9821a2d397 Cycles: pass kernel features to get_bvh_layout_mask
This allows selectively disabling Hardware Raytracing in the oneAPI
backend, depending on features used.
2023-04-18 22:09:42 +02:00
3f8c995109 Cycles: add hardware raytracing support to oneAPI device
Updated Embree 4 library with GPU support is required for it to be
compiled - compatibility with Embree 3 and Embree 4 without GPU support
is maintained.
Enabling hardware raytracing is an opt-in user setting for now.

Pull Request: blender/blender#106266
2023-04-18 22:09:42 +02:00
887022257d Cycles: update DPCPP to 2022-12 release
We also backport a patch to program_manager to it as
61e51015a5
helps avoid unnecessary recompilation when enumerating available
kernels.
2023-04-18 22:09:41 +02:00
5cdf0c9ee9 Cycles: update compute-runtime to 23.05.25593.18
This fixes oneAPI AoT compilation on Linux when using Embree on GPU.
2023-04-18 22:09:41 +02:00
66b4e426cc Cycles: build Embree 4 with GPU support 2023-04-18 22:09:41 +02:00
72aeee96ac Fix Build Warning in fileops.c
Marking unused function arguments caused by conditionals.

Introduced in 694f792ee1
2023-04-18 12:53:45 -07:00
f7ba61d3a6 Fix #107009: Setting Text Object Styles
This allows toggling of text styles of selected text and at the current
mouse cursor position if nothing is selected.

Pull Request: blender/blender#107048
2023-04-18 21:24:38 +02:00
70d854538b Curves: Optimize edit mode selection draw extraction
Use the attribute API for domain and type interpolation instead of doing
it manually. I observed a 3.8x improvement in curve selection mode and
an 18x improvement in point selection mode.
2023-04-18 14:57:04 -04:00
694f792ee1 UI: OS File Operations Within File Browser
Adds a submenu to the File Browser selected item context menu that
allows opening the item or viewing the location in an OS browsing
window. On Win32 also allows other actions like editing, searching,
opening command prompt, etc.

Pull Request: blender/blender#104531
2023-04-18 20:39:30 +02:00
4edcae75aa Cleanup: Remove unused using keyword 2023-04-18 13:38:11 -04:00
954c6c0ae6 Revert "Cycles: move oneAPI kernels dynamic library to blender.shared"
This reverts commit df096eab77.
There is a corner case for when WITH_CYCLES_ONEAPI_BINARIES is set to on
and later turned off during config, in case there is no ocloc.
2023-04-18 18:48:37 +02:00
3a72442f63 Fix comment style in previous commit.
Pull Request: blender/blender#107091
2023-04-18 17:28:53 +02:00
5bb3a3f157 Fix: segfault when indexing into some collections with strings.
This happens when the collection's item type doesn't have a
'nameproperty' to index with.  For debug builds we error out with an
assert, since in general this shouldn't happen.  For release builds
Python will report item not found.

Pull Request: blender/blender#107086
2023-04-18 17:15:22 +02:00
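A minimal sketch of the release-build behavior described in 5bb3a3f157, assuming a toy `Collection` class (hypothetical, not the RNA implementation): a string lookup on a collection whose items have no name property reports "not found" instead of crashing.

```python
class Collection:
    """Toy collection; `name_property` is None when items cannot be looked up by name."""

    def __init__(self, items, name_property=None):
        self._items = list(items)
        self._name_property = name_property

    def __getitem__(self, key):
        if isinstance(key, int):
            return self._items[key]
        if self._name_property is None:
            # Nothing to compare the string against: report "not found".
            raise KeyError(f"key {key!r} not found")
        for item in self._items:
            if item.get(self._name_property) == key:
                return item
        raise KeyError(f"key {key!r} not found")


named = Collection([{"name": "Cube"}], name_property="name")
unnamed = Collection([{"value": 1}])
print(named["Cube"])
try:
    unnamed["Cube"]
except KeyError as err:
    print("lookup failed:", err)
```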
d818d05415 Cleanup: Remove unnecessary attribute provider callbacks
We don't use the callbacks that create virtual arrays from the custom data
anymore, they just add extra indirection. The only non-obvious case was
the crease attribute which had a setter function. Replace that with an
attribute validator like the other similar attributes.

Pull Request: blender/blender#107088
2023-04-18 17:13:38 +02:00
7c927155b5 Fix #90159: Inconsistent display of active filters for import/export file dialogs
Use `filter_glob` property to list only operator extension files.
PR includes filtering for collada, usd, alembic file formats.

Old Revision: https://archive.blender.org/developer/D16739

Pull Request: blender/blender#107034
2023-04-18 15:57:45 +02:00
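To illustrate the kind of filtering a `filter_glob` value implies, here is a standalone sketch using Python's `fnmatch`. The function name `filter_files`, the semicolon-separated pattern format, and the case-insensitive match are assumptions made for the example, not taken from the commit.

```python
from fnmatch import fnmatchcase


def filter_files(names, filter_glob):
    """Keep only names matching one of the ';'-separated glob patterns (case-insensitive)."""
    patterns = [p.lower() for p in filter_glob.split(";") if p]
    return [n for n in names if any(fnmatchcase(n.lower(), p) for p in patterns)]


files = ["shot.usd", "cache.abc", "Layout.USDC", "notes.txt"]
print(filter_files(files, "*.usd;*.usdc"))
```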
63f309df11 Fix #107081: Slow selection with context variables
A solution for an older bug was causing it.

Added a special case to avoid an extra relation for context
variables as they do not change during the dependency graph
evaluation.

Pull Request: blender/blender#107082
2023-04-18 15:40:07 +02:00
6f26bb6841 add missing immUnbindProgram() 2023-04-18 14:27:54 +02:00
66158498de BLI: Return number of values removed from remove_if
Make the `remove_if` function for `Vector`, `VectorSet`, `Set`, and `Map` return the number of elements it removed.

Pull Request: blender/blender#107069
2023-04-18 13:28:14 +02:00
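The contract described in 66158498de (remove matching elements and report how many were removed) can be sketched in Python as follows. This is an analogue for a plain list, not the BLI container code.

```python
def remove_if(values, predicate):
    """Remove every element for which `predicate` is true, in place.

    Returns the number of removed elements.
    """
    kept = [v for v in values if not predicate(v)]
    removed = len(values) - len(kept)
    values[:] = kept  # mutate the caller's list rather than rebinding a local name
    return removed


data = [1, 2, 3, 4, 5, 6]
count = remove_if(data, lambda v: v % 2 == 0)
print(count, data)
```

Returning the count lets callers detect whether anything changed without comparing sizes before and after the call.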
e41cd795a6 Small cleanups to zone drawing
- Avoid unnecessary topology tag
- Copy from offsets in one line
- Comment formatting
- Avoid describing future changes in comment, let the code stand on its own
2023-04-18 07:24:06 -04:00
25747301db Cycles: fix SYCL debug library linking on Windows 2023-04-18 12:33:48 +02:00
b623be3377 Cleanup: remove clang-format: off for EnumPropertyItem definitions
These aren't special cases so format them as is done with all other
enum-property declarations.
2023-04-18 20:30:00 +10:00
77268dbe3b WM: add versioning for 3.5 sculpt brushes (part of fix for #106057)
Add a versioning function for tool ID's which can be used if these
need to be changed in future.
2023-04-18 20:30:00 +10:00
7b4d71683f Fix #107060: Curves sculpt mode does not select default tool
Regression in [0] when curve tool names changed to use brush names
with the utility function generate_from_enum_ex().

[0]: 786734e6c8
2023-04-18 20:16:31 +10:00
58b1c54671 Cleanup: remove "Curves" suffix from curve sculpting enum
This isn't necessary information & types aren't included in other
brush names.
2023-04-18 20:16:31 +10:00
01c6824eaf Cleanup: make format 2023-04-18 12:12:51 +02:00
8981bb4ac6 Geometry Nodes: Simulation Zone drawing updates
* Make the drawing smoother/anti-aliased.
* We use the alpha to blend between the background and the zone color.
* If alpha is 100% we then get to see the dotted background again
* Change zone corner radius to match nodes/layout.

If we want to set the background transparent again we need to do:
```
-    immUniformThemeColorBlend(TH_BACK, TH_NODE_ZONE_SIMULATION, zone_color[3]);
+    immUniformThemeColor(TH_NODE_ZONE_SIMULATION);
```

For the design behind some of those change see #106810
Pull Request: #107043
2023-04-18 12:08:27 +02:00
732fa26413 Fix #107032: API Document: matrix_channel (PoseBone) description incorrect
Update the RNA and DNA documentation for two bone matrices:

- `PoseBone.matrix_channel` (`bPoseChannel::chan_mat` in DNA) contains
  the evaluated loc/rot/scale channels, including constraints and drivers.
- `PoseBone.matrix` (`bPoseChannel::pose_mat` in DNA) contains the same
  transform, but then expressed in the armature object space.

No functional changes, just clarifications in comments / tooltips.
2023-04-18 12:01:45 +02:00
4d7a7ce67c Fix #107050: accessing nullptr after progress is canceled 2023-04-18 11:58:07 +02:00
e4926b4b2a Merge branch 'main' into geometry-nodes-simulation 2023-04-18 11:15:23 +02:00
6e75581e65 BKE: Rework ID swap code to properly handle embedded ID pointers.
While embedded IDs are usually considered as private local data of their
owner ID, some areas of code, like the depsgraph, can consider them as
regular IDs in some aspects.

So when swapping IDs, also properly 'counter-swap' their potential
embedded IDs, such that the pointers to the embedded IDs remain as before
swapping, even though the data of the embedded IDs is swapped.

The main target of this change is memfile undo code. There, newly read
IDs are swapped with their older version, so that the old address
contains the new data. This allows to avoid rebuilding some of the
depsgraph. Doing the same thing for embedded IDs should reduce even
further the need for depsgraph rebuilds on undo steps.

This commit also gives more control over the remapping of 'self' ID
pointers inside themselves.

Pull Request: blender/blender#107044
2023-04-18 11:09:36 +02:00
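The "counter-swap" idea from 6e75581e65 can be sketched with two toy classes. `OwnerID`, `EmbeddedID` and `swap_ids` are hypothetical names, and Python object identity stands in for a memory address.

```python
class EmbeddedID:
    def __init__(self, data):
        self.data = data


class OwnerID:
    def __init__(self, data, embedded):
        self.data = data
        self.embedded = embedded


def swap_ids(id_a, id_b):
    # 1. Swap all content of the owners, embedded pointers included.
    id_a.__dict__, id_b.__dict__ = id_b.__dict__, id_a.__dict__
    # 2. Counter-swap the embedded pointers so each owner points at the
    #    same embedded object as before the swap.
    id_a.embedded, id_b.embedded = id_b.embedded, id_a.embedded
    # 3. Swap the embedded objects' content so the data still follows the owner data.
    id_a.embedded.__dict__, id_b.embedded.__dict__ = (
        id_b.embedded.__dict__,
        id_a.embedded.__dict__,
    )


old_tree, new_tree = EmbeddedID("old nodes"), EmbeddedID("new nodes")
old_id, new_id = OwnerID("old", old_tree), OwnerID("new", new_tree)
swap_ids(old_id, new_id)
print(old_id.data, old_id.embedded is old_tree, old_tree.data)
```

After the swap, anything that held a reference to `old_tree` still refers to a valid object, which now carries the newly read data.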
664b31ea73 Cleanup: make format 2023-04-18 09:45:01 +02:00
4d75f10a8a EEVEE: Optimise texture usage flags
Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#107037
2023-04-18 08:11:46 +02:00
982392ca13 Docs: Update RNA to user manual url map
Fixes #107005
2023-04-18 01:12:27 -04:00
ab8acbbfe5 Cleanup: Use curve positions accessor function
There's no particular reason to use the attribute API instead here.
2023-04-17 23:38:10 -04:00
c234a802ba Cleanup: Remove unused using keyword 2023-04-17 23:38:10 -04:00
7bb8c8a5cf Cleanup: Improve comments about curves and mesh offset spans 2023-04-17 23:38:10 -04:00
c615ccde68 Fix splash preference overriding Read Home File's use_splash property
- Split out WM_init_splash_on_startup(..) which performs startup checks.
- WM_init_splash(..) now shows the splash (ignoring preferences).
- Avoid calling BLI_exists on an empty string (in some cases).
2023-04-18 12:59:13 +10:00
1bb77d9eae Cleanup: Better logging for imbuf tests
Recent failures requiring investigation have exposed some shortcomings
that this addresses:
- When creating the diff image for offline comparison, use a higher
  threshold to prevent idiff from printing more output which will often
  contradict the primary failure output just above it (very confusing)
- For metadata failures, make sure these get printed so it's obvious
  what kind of failure we're dealing with

Pull Request: blender/blender#107058
2023-04-18 03:32:20 +02:00
302eb1e0d7 Cleanup: compile warning, correct wording 2023-04-18 11:04:08 +10:00
c4c1cc7cd3 Cleanup: double quotes for non-enum strings
Also use back-ticks for code-references in comments.
2023-04-18 10:51:32 +10:00
2f743b0a92 Cleanup: Replace manual flag checking with methods in node.cc
Not all flags have methods, and not all node primitive types have this.
Replacement of rather simple cases.

Pull Request: blender/blender#107055
2023-04-18 00:29:10 +02:00
29f137e138 Sculpt: fix brush.falloff_shape not being reset in "reset brush" op 2023-04-17 15:16:35 -07:00
96fa5fc2b3 Sculpt: Fix #106996: Missing null check in BKE_sculpt_update_object_before_eval 2023-04-17 14:05:29 -07:00
df096eab77 Cycles: move oneAPI kernels dynamic library to blender.shared
After 17800e0c03, the oneAPI kernels library was still able to find sycl6.dll but that wasn't reliable.
We fix this by moving the oneAPI kernels library also into blender.shared.

Pull Request: blender/blender#106894
2023-04-17 21:47:35 +02:00
09b770388a Fix #107004: Cycles shadow caustics not working with area lights
Tested the wrong variable after a refactor for light spread.
2023-04-17 20:46:08 +02:00
870930bc32 Fix build error using WITH_CYCLES_LOGGING=OFF
Mismatch between glog and stubs. CHECK_NULL also does not exist. Tests
also require logging to be available.
2023-04-17 20:36:18 +02:00
74eda0b6fc Fix build error on macOS after previous commit 2023-04-17 17:47:29 +02:00
92919864a0 Fix #106293: Cycles importance sampling with multiple suns works poorly
Keep sun in importance map in this case, as we do not use special sun
importance sampling in this case.
2023-04-17 17:30:47 +02:00
cff94a808e Fix #106706: fireflies with Nishita sky sun sampling at certain angles
Due to floating point differences between importance sampling and
texture evaluation, disagreeing on whether or not a ray lies within
the sun disc.

* Use the same input values for geographical_to_direction() in
  sky_radiance_nishita() and kernel_data.background.sun.
* The mathematical operations in pdf_uniform_cone() were adjusted to
  match sky_radiance_nishita().

Pull Request: blender/blender#106764
2023-04-17 17:29:27 +02:00
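The failure mode in cff94a808e (two code paths disagreeing about a boundary because they round differently) can be reproduced in miniature. The two functions below are purely illustrative stand-ins for "importance sampling" and "texture evaluation"; they compute the same mathematical sum in a different order.

```python
def inside_a(x, y, z, limit):
    # One code path adds left to right.
    return (x + y) + z <= limit


def inside_b(x, y, z, limit):
    # The other groups the same terms differently.
    return x + (y + z) <= limit


# Mathematically both sums equal 0.6, but the rounded results differ.
print(inside_a(0.1, 0.2, 0.3, 0.6), inside_b(0.1, 0.2, 0.3, 0.6))
```

Feeding both paths the same inputs and matching their sequence of operations, as the commit does, removes this kind of disagreement.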
a8feb20e1c DNA: Move irradiance grid light cache data to Object level
This is the first step for refactoring the lightcache system.
Each probe instance (as in `Object`) will now store its own baked data.
The data is currently stored in uncompressed readable format.

This introduces two new operators for baking to avoid confusion with
the previous light baking pipeline. These do nothing other than
creating empty caches that will be populated by EEVEE later on.

The DNA storage is made to be able to include multiple caches
in case of baked simulation over time but it isn't yet supported.

I prefer to keep the implementation simple for now as the long term
goals for this feature are uncertain.
There is still a type flag (`LightProbeObjectCache.cache_type`) that
will be used for versioning.

The naming convention of structs is a bit weird but that's all I
found in order to avoid interfering with the old scene light cache
that is still used by (old) EEVEE.

Related task #106449.

Pull Request: blender/blender#106808
2023-04-17 17:12:19 +02:00
b1703bd902 Fix #107020: crash when canceling Sky Resize with mesh symmetry
As with `t->data`, use calloc for `tc->data_mirror`.

This way you make sure that all values are properly initialized.
2023-04-17 11:32:36 -03:00
c041a36286 Fix #91966: Alembic/USD export ignores bone parent animation
For non-object parents (so bones & vertices), the parent is now also
explicitly checked for animation. In other words: having an animated
parent will cause the transform of the child to be written to Alembic/USD
on every frame (as if it is animated itself).
2023-04-17 16:21:39 +02:00
315cc66bd8 Fix #106732: Support for simulation zones in copy operators
Copying a simulation zone should keep the 1:1 pairing intact (see `remap_pairing` functions)
- When copying a simulation input node on its own, unpair it to avoid ambiguity. Only the old simulation input node is paired to the output.
- When copying a simulation output node on its own, no special action is needed - the node gets a new ID and nothing is paired with it.
- When copying both input and output, remap the `output_node_id` property of the simulation input node, so that it is paired with the output copy.

There are a couple of places where copies happen:
* Node tree copy
* Duplicate nodes
* Group Separate (copies nodes from the group tree into another tree)
* Clipboard (both copy to clipboard and paste into node tree)
* Shader node tree branch copy for execution

These copy operators do mostly the same thing, but in slightly different ways, which makes the code incompatible (e.g. using a `Map<const bNode *, bNode *> node_map` vs. `Map<bNode *, bNode *> node_map`). That's why there are 3 `remap_pairing` implementations.

Dynamic node declarations are problematic:
Copying nodes invokes `nodeDeclarationEnsure` to generate declarations for new nodes. It does not, however, change the socket lists. If a dynamic declaration for a node copy alters the sockets (in this case: remove all because the node is unpaired), the subsequent `update_socket_declarations` will crash because it expects sockets to match the declaration.
At the end of operators there is usually a `BKE_ntree_update_main` or similar, which invokes `update_node_declaration_and_sockets`. This method _does_ update the socket lists as well (see #106732), but only if a node is tagged for a respective update.

The solution here is to use `update_node_declaration_and_sockets` for dynamic declarations instead of just `build_node_declaration_dynamic`.

Pull Request: blender/blender#106812
2023-04-17 16:10:55 +02:00
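The three pairing rules listed in 315cc66bd8 can be sketched as below. `Node`, `copy_nodes` and the string node kinds are hypothetical; only the `output_node_id` property name comes from the commit message.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class Node:
    identifier: int
    kind: str  # "SIM_INPUT", "SIM_OUTPUT", or anything else
    output_node_id: Optional[int] = None  # only meaningful for "SIM_INPUT"


def copy_nodes(nodes, next_id):
    """Copy `nodes`, giving each copy a new identifier and remapping the pairing."""
    id_map = {}
    copies = []
    for node in nodes:
        copies.append(replace(node, identifier=next_id))
        id_map[node.identifier] = next_id
        next_id += 1
    for copy in copies:
        if copy.kind == "SIM_INPUT":
            # Paired output was copied too: point at its copy.
            # Otherwise the lookup yields None and the copy is unpaired.
            copy.output_node_id = id_map.get(copy.output_node_id)
    return copies


sim_in = Node(1, "SIM_INPUT", output_node_id=2)
sim_out = Node(2, "SIM_OUTPUT")
both = copy_nodes([sim_in, sim_out], next_id=10)
input_only = copy_nodes([sim_in], next_id=20)
print(both[0].output_node_id, input_only[0].output_node_id, sim_in.output_node_id)
```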
b75b734969 Core: Memfile Undo: Optimize handling of 'no undo' IDs.
Do not read IDs from types flagged as 'no undo', whether they are local
or linked.

This should have no effect currently, since all 'no undo' ID types
currently are supposedly only local data anyways (Screen, WindowManager
and WorkSpace).
2023-04-17 16:08:41 +02:00
915b8b6093 Core: Memfile undo: Add ID tag for IDs that are 'reused in place'.
These IDs kept their address, but their content has been replaced
(re-read from the memfile undo step). Add an ID tag to identify them.

As a further cleanup, systematically tag these IDs for depsgraph COW,
since their data is effectively modified (though in practice all of
these IDs are expected to already have other update tags anyway).

No change in behavior is expected from this commit.
2023-04-17 15:46:21 +02:00
a16bcb6576 Core: ID remapping: Do remap 'not owning embedded' ID pointers.
This should not have much effective consequences with current code, but
fixes potential missed remappings for e.g. some nodetree pointers in the
node editor, or the `parent` pointer of collections to a scene's master
collection.
2023-04-17 15:46:21 +02:00
0bc957063c Fix #106405: Cycles multi GPU crash with vertex color baking
Avoid division by zero when one of the devices gets no work.
2023-04-17 15:31:35 +02:00
38bf3e1911 I18n: translate default preset name
The "New Preset" message was already translated and used in some
preset panels, but not all.

Pull Request: blender/blender#106973
2023-04-17 15:00:07 +02:00
48979c6cdc Py module i18n utils: return subprocess.run result to catch output of external commands.
Avoids having prints in random order in multi-processes concurrent
context.
2023-04-17 14:38:51 +02:00
e45746591b Metal: Add new files for Storage Buffers support 2023-04-17 14:12:32 +02:00
2a4323c2f5 Mesh: Move edges to a generic attribute
Implements #95966, as the final step of #95965.

This commit changes the storage of mesh edge vertex indices from the
`MEdge` type to the generic `int2` attribute type. This follows the
general design for geometry and the attribute system, where the data
storage type and the usage semantics are separated.

The main benefit of the change is reduced memory usage-- the
requirements of storing mesh edges is reduced by 1/3. For example,
this saves 8MB on a 1 million vertex grid. This also gives performance
benefits to any memory-bound mesh processing algorithm that uses edges.

Another benefit is that all of the edge's vertex indices are
contiguous. In a few cases, it's helpful to process all of them as
`Span<int>` rather than `Span<int2>`. Similarly, the type is more
likely to match a generic format used by a library, or code that
shouldn't know about specific Blender `Mesh` types.

Various Notes:
- The `.edge_verts` name is used to reflect a mapping between domains,
  similar to `.corner_verts`, etc. The period means that the data
  shouldn't be changed arbitrarily by the user or procedural operations.
- `edge[0]` is now used instead of `edge.v1`
- Signed integers are used instead of unsigned to reduce the mixing
  of signed-ness, which can be error prone.
- All of the previously used core mesh data types (`MVert`, `MEdge`,
  `MLoop`, `MPoly`) are now deprecated. Only generic types are used.
- The `vec2i` DNA type is used in the few C files where necessary.

Pull Request: blender/blender#106638
2023-04-17 13:47:41 +02:00
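The memory figures quoted in 2a4323c2f5 can be checked with a little arithmetic. The 12-byte legacy layout below (two 32-bit indices plus 4 bytes of legacy flag data) is an assumption chosen to be consistent with the stated 1/3 reduction, and the "1 million vertex grid" is taken to be 1000 x 1000 vertices.

```python
import struct

# Assumed legacy edge layout: two unsigned 32-bit indices + 4 bytes of legacy flags.
LEGACY_EDGE_SIZE = struct.calcsize("=IIbbh")
# Generic int2 attribute: two signed 32-bit indices.
INT2_EDGE_SIZE = struct.calcsize("=ii")


def grid_edge_count(verts_x, verts_y):
    """Number of edges in a regular quad grid with the given vertex counts."""
    return verts_x * (verts_y - 1) + verts_y * (verts_x - 1)


def bytes_saved(edge_count):
    return edge_count * (LEGACY_EDGE_SIZE - INT2_EDGE_SIZE)


edges = grid_edge_count(1000, 1000)
print(LEGACY_EDGE_SIZE, INT2_EDGE_SIZE, edges, bytes_saved(edges) / 1e6, "MB")
```

Under these assumptions the grid has just under two million edges and saves 4 bytes per edge, which lands at roughly the 8MB mentioned in the commit message.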
f588a0596b Fix #106943: driver on inactive view layer doesn't work
Animation data (including drivers) on inactive view layers now work. The
removal of such view layers was too optimistic; they are now kept
around. The bases are still removed, mostly for safety sake and to keep
the changes to a minimum.

`scene_remove_unused_view_layers()` has been renamed to
`scene_minimize_unused_view_layers()` to reflect its new functionality.

For compatibility with assumptions in other areas of the code, the
function still ensures the input view layer is at index 0.

This also introduces a new function
`BKE_view_layer_free_object_content(view_layer)`, which is a subset of
the functionality of `BKE_view_layer_free()`.
2023-04-17 12:59:03 +02:00
fe7540d39a Cleanup: Define type for object type enum
Having a type defined allows the compiler to help with type safety. For
example we can use it in switches to trigger a warning when a new object
type is added but not covered by the switch yet (but probably should).
2023-04-17 12:39:42 +02:00
62d9e55eec Graph editor: fix box select when scene has annotations
The graph editor box select operator now works properly again, when there
is an annotation layer in the scene.
2023-04-17 12:15:24 +02:00
0ed0165eea Refactor: anim, simplify range check
Simple application of De Morgan's law. No functional changes.
2023-04-17 12:15:24 +02:00
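
As a rough illustration of the kind of rewrite De Morgan's law allows, the sketch below compares a negated range check with its simplified form. The function names and the range bounds are hypothetical and are not taken from the actual animation code.

```python
def outside_range_original(value, start, end):
    # Negated conjunction: "not inside the range".
    return not (value >= start and value <= end)


def outside_range_simplified(value, start, end):
    # De Morgan: not (A and B) == (not A) or (not B).
    return value < start or value > end


# Both forms agree for every tested value, so the refactor is purely cosmetic.
for v in range(-2, 8):
    assert outside_range_original(v, 0, 5) == outside_range_simplified(v, 0, 5)
```

Because the two expressions are logically equivalent, such a change carries no functional difference, only fewer negations to read.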
c8435185e1 I18n: Updated translation files from SVN trunk (r6467). 2023-04-17 12:00:22 +02:00
dfa42c614f Cleanup: UI messages fixes and tweaks. 2023-04-17 11:41:10 +02:00
5491563e59 Fix #106982: crash with muted node
The lazy function for muted nodes did request inputs
even if they were not required.
2023-04-17 10:59:05 +02:00
6e59d0b20f Cleanup: document type of Scene::view_layers 2023-04-17 10:57:09 +02:00
3a02d760f7 Python API: Expose background drawing argument for GPUOffScreen.draw_view3d
Currently, when using the python api for offscreen drawing, the
default background will always be rendered into the GPUOffScreen's
framebuffer, rendering the alpha channel essentially useless and
making it difficult to separate objects from the background.

This patch allows offscreen drawing of a 3d view with transparent
background by exposing an optional parameter to the python api,
enabling, for example, compositing the result over another image.

The new parameter to draw_view3d() is optional, with the default
value matching the previous behavior, so this change is fully

Pull Request: blender/blender#105748
2023-04-17 09:28:02 +02:00
15f464019a Geometry Nodes: avoid last buffer copy in Blur Attribute node
Previously, there was a "main" and "tmp" buffer and the final
result was expected to be in the "main" buffer. Now the two buffers
are called a and b and the final result can be in either of those.

This can improve performance especially if the number of iterations is low.

Pull Request: blender/blender#106860
2023-04-17 08:08:46 +02:00
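
The two-buffer idea can be sketched in a few lines. This is a simplified model, assuming a plain neighbor-average blur; the `blur` function and its `neighbors` layout are hypothetical and do not mirror the node's real implementation.

```python
def blur(values, neighbors, iterations):
    """Average each value with its neighbors, alternating between two buffers."""
    a = list(values)
    b = [0.0] * len(values)
    src, dst = a, b
    for _ in range(iterations):
        for i, adjacent in enumerate(neighbors):
            total = src[i] + sum(src[j] for j in adjacent)
            dst[i] = total / (1 + len(adjacent))
        # Swap roles instead of copying the result back into a "main" buffer.
        src, dst = dst, src
    # After the last swap, `src` holds the newest result; it may be `a` or `b`.
    return src


result = blur([0.0, 3.0, 0.0], [[1], [0, 2], [1]], iterations=1)
```

Returning whichever buffer holds the latest values avoids one full copy, which matters most when only a few iterations run.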
348f57bcec Fix #107017: Missing checks for #PyObject_GetBuffer success
`PyObject_GetBuffer` was used without checking that it was successful.
This could cause the code to access an incompatible or uninitialized
`Py_buffer`.

Add the missing checks, and clear the raised `PyExc_BufferError`
to silently fall back to accessing the PyObject as a sequence.
2023-04-17 16:07:20 +10:00
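
The actual fix lives in C, but the same "try the buffer protocol, then fall back to sequence access" pattern can be sketched at the Python level. The helper name `as_float_list` is hypothetical; note that at this level a non-buffer object raises `TypeError` from `memoryview()`, which only loosely corresponds to the C-level error handling described above.

```python
def as_float_list(obj):
    """Read values through the buffer protocol, or as a plain sequence."""
    try:
        view = memoryview(obj)
    except TypeError:
        # Not a buffer: silently fall back to iterating the object.
        return [float(x) for x in obj]
    with view:
        # Only reached when acquiring the buffer succeeded.
        return [float(x) for x in view.tolist()]


from_buffer = as_float_list(bytes([1, 2]))
from_sequence = as_float_list([1, 2])
```

The important point is that the buffer is only touched after the acquisition is known to have succeeded.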
8f3796e90a Merge branch 'main' into geometry-nodes-simulation 2023-04-17 06:34:09 +02:00
1d8389cd09 Fix: missing cache to get evaluated positions
Without this, there is a crash in the
`geo_node_geometry_test_duplicate_elements_curve_points` test in
a debug build. This was broken in 7bd7043a74.
2023-04-17 06:32:30 +02:00
c7d80b8c70 Fix crash saving an image when ImageOutput::open fails
Saving a PNG into path without write access would crash,
caused by recent move to OIIO.
2023-04-17 13:31:10 +10:00
0b1fb22f69 Fix screenshot path defaulting to the root directory for unsaved files
Using a "//" prefix resolves to the root directory which isn't a good
default as it typically doesn't have write permissions.
Only set the name and let the file selector pick a directory to use
(matches how saving from the text editor works).
2023-04-17 13:31:08 +10:00
153cb7e1df Cleanup: remove inline checks for GPU front-buffer reading
Add WM_window_pixels_read & WM_window_pixels_read_sample that
use front-buffer pixel reading when supported.

Note that direct access to reading from the front-buffer is still needed
for writing thumbnails - where redrawing can cause problems
(see code-comments).
2023-04-17 12:28:56 +10:00
e78c3c9d96 Docs: comments for disabling the front-buffer & view3d offset correction
Expand on why front-buffer support is always disabled on Wayland &
why viewport orbit around selection offset correction isn't used for
perspective views.
2023-04-17 12:27:34 +10:00
7bd7043a74 Fix #106927: Crash when removing handle position attribute
Bezier curve position evaluation expects the handle position attributes
to exist and doesn't handle the case where they don't. Switch to using
a utility function to evaluate each curve type so Bezier evaluation can
stop early in that case.
2023-04-16 21:34:35 -04:00
2fade47a9d Fix: Transform geometry node doesn't translate volumes correctly
Fixes a bug introduced in b0b9e746fa.
The volume transformation matrix is multiplied in the wrong order
which means the grid scale is applied on the translation.
2023-04-17 03:10:40 +02:00
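
To see why the multiplication order matters, here is a small 2D sketch using 3x3 homogeneous matrices and the column-vector convention. The helpers are hypothetical and the values are made up; the sketch does not claim which order the volume code uses, only that swapping the operands changes whether the scale affects the translation.

```python
def matmul(a, b):
    # Plain 3x3 matrix product; with column vectors, `b` is applied first.
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def scale(s):
    return [[s, 0.0, 0.0], [0.0, s, 0.0], [0.0, 0.0, 1.0]]


grid_scale = scale(0.5)
move = translation(4.0, 0.0)

wrong = matmul(grid_scale, move)  # translate, then scale: offset shrinks to 2.0
right = matmul(move, grid_scale)  # scale, then translate: offset stays 4.0
```

In the first product the translation column is multiplied by the grid scale, which matches the symptom described in the commit message.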
Henry Chang
bd86e719ab UI: Sculpt Paint tool defaults #97616 #105759
Default settings changed for Sculpt mode's
Paint Brush, Smear Brush, and Smear Brush.

~~This includes updates of PR review #105691.~~

Updated to only include commits relevant to this PR.

Reviewed by: Joseph Eagar & Julian Kaspar
Pull Request: #105759
2023-04-16 15:24:47 -07:00
4563a47ac5 Squashed commit of the following:
commit 7aa5e65dcbda862dcb17ecfc6727eb241a12c316
Merge: c08a9ec19f 7c9e493da55
Author: Joseph Eagar <joeedh@gmail.com>
Date:   Sun Apr 16 15:11:53 2023 -0700

    Merge branch 'main' of https://projects.blender.org/ChengduLittleA/blender into ChengduLittleA-main

commit 7c9e493da55a4adbfa2415b711e6d0daa2720ad9
Author: YimingWu <xp8110@outlook.com>
Date:   Fri Mar 31 17:46:32 2023 +0800

    Fix #106358: Handles null evaluated object when entering sculpting workspace.

    The setup where everything in the scene is invisible/not enabled could
    trigger a crash when switched to sculpting workspace, triggered when
    opening the file.

    This patch handles such situation.
2023-04-16 15:14:11 -07:00
Patrick Foley
c08a9ec19f Sculpt: updated Mask and Face Set menu operators
Changed the menu operators:

    Expand Mask by Topology (hotkey Shift A)
    Expand Mask by Normals (hotkey Shift Alt A)
    Expand Face Set by Topology (hotkey Shift W)
    Expand Active Face Set (hotkey Shift Alt W)

so that their hotkeys would appear in their menu entries.

Resolves #104023

Co-authored-by: DisquietingFridge <30654622+DisquietingFridge@users.noreply.github.com>
Pull Request: #104568

Rebased for main instead of sculpt-dev
2023-04-16 15:04:58 -07:00
45ef51d0fb Fix #106242 "Edit Dyntopo Detail Size" status bar missing info
Fixed issue#106242 "Edit Dyntopo Detail Size" status bar missing info

Pull Request: blender/blender#106476
2023-04-16 23:54:48 +02:00
9d4949f80b Cleanup: Reduce nesting in node.cc
Decompose most of the nesting in the code to make the code
more consistent along the line of program execution.
Mainly achieved through:
- Remove redundant else
- Invert condition
- Add temporary variable to redistribute and name conditions

Pull Request: blender/blender#105509
2023-04-16 22:53:09 +02:00
de7e3454fb UI: Capabilities Flag for Clipboard Image copy/paste
This adds a WM_capabilities_flag to indicate that a platform
implements support for copying and pasting images using a shared
clipboard.

Pull Request: blender/blender#106990
2023-04-16 21:04:55 +02:00
254d148458 Fix: PLY export behavior with multiple meshes
A few fixes included here:
- Use `reserve` properly to add space after the first mesh
- Add to the end of the UVs array instead of replacing it for every mesh

Also, a cleanup/simplification:
- Split face size and face vertex loops, they are independent

Pull Request: blender/blender#106967
2023-04-16 20:00:16 +02:00
e1571cb105 Cleanup: correct terms, spelling in comments 2023-04-16 20:41:22 +10:00
5f40118899 Cleanup: rename GPU_offscreen_read_{pixels=>color} noted as a TODO 2023-04-16 20:38:19 +10:00
6cc2c16d06 Fix #106264: Color picker broken with Wayland & AMD GPU
- Use off-screen drawing when reading from the front-buffer isn't
  supported.

- Add a capabilities flag for reading the front-buffer which is always
  disabled on WAYLAND.

- Add GPU_offscreen_read_pixels_region, used for reading a sub-region of
  an off-screen buffer - use for color-picking a single pixel.
2023-04-16 20:16:54 +10:00
6722f90734 Cleanup: quiet mypy warnings in gitea_inactive_developers
Also add to the list of scripts to check with "make check_mypy".
2023-04-16 17:03:56 +10:00
b827c8cd1e Fix #104385: Unexpected clipping in ortho view & orbit around selection
Orbit around selection didn't work well in orthographic views,
potentially causing viewport offset to drift during navigation
to the point content would be outside the far clipping range.

Resolve by aligning the view offset depth with the dynamic offset
being orbited around.
2023-04-16 16:24:41 +10:00
8afb8db66e Cleanup: spelling in comments 2023-04-16 16:24:38 +10:00
cffc9bdb93 Cleanup: quiet unused argument warning 2023-04-16 16:24:36 +10:00
bb25302fc3 Docs: Fix wrong function return type
Fixes blender/blender-manual#104384
2023-04-15 21:03:47 -04:00
b601ae87d0 UV: add overlapping island support for uv packing
From the UV Packing options, choose:

 "Merge Overlapped" / "Overlapping islands stick together"
2023-04-15 13:59:12 +12:00
e078419c9c UV: cleanup uv_parametrizer, simplify types 2023-04-15 10:58:18 +12:00
6a0b90bc92 Cleanup: move pre-rotation inside uv packing engine 2023-04-15 10:33:32 +12:00
1924045142 Cleanup: format 2023-04-15 10:04:07 +12:00
db47f82626 GPU: Add Texture Usage Parameter to GPUOffscreen.
Currently the Textures used for offscreen rendering don't have
the `GPU_TEXTURE_USAGE_HOST_READ` flag. But in some cases it is
needed. This PR adds a parameter when creating an offscreen
buffer.

Another solution could be to add this flag to all textures, but
we chose not to do this as that reduces the amount of fine-tuning
options for Metal/Vulkan backends. GPU can store textures
differently based on its actual usage.

This option isn't available in the python API as we don't expect
add-on developers to fine-tune texture usages to this extent.

For convenience `GPU_TEXTURE_USAGE_ATTACHMENT` is by default
always added.

Pull Request: blender/blender#106899
2023-04-14 22:02:51 +02:00
b86fc55d30 Cleanup: Use Vector for passing lists of PBVHNodes around
Cleaned up sculpt code to store lists of `PBVHNodes` with
`blender::Vector` instead of simple pointer arrays.  This is much
simpler and eliminates memory leaks caused by forgetting to free
the result of `BKE_pbvh_search_gather`.

Notes:

* `BKE_pbvh_search_gather` is now `blender::pbvh::search_gather`.
* `FilterCache` and `ExpandCache` have ownership over their .nodes
  members; as a result they're no longer pure C structs and
  are allocated with `MEM_new`/`MEM_delete`.
* The word 'totnode' no longer occurs anywhere in
  `source/blender/editors/sculpt_paint`

Todo (not for this PR): create a new properly C++ task API for sculpt
      (with lambdas) and use it for brushes.

Pull Request: blender/blender#106884
2023-04-14 21:16:42 +02:00
15683d81be Fix: Mesh validate missing mesh polygon removal tags
This was done in a macro before 7966cd16d6.
2023-04-14 14:24:42 -04:00
8df6974a15 Fix #106879: Texture Node add search is broken 2023-04-14 13:52:13 -04:00
7d4edcfa68 Cleanup: Use consistent mesh vertex position names 2023-04-14 13:42:28 -04:00
62548acb1a Cleanup: Re-organize our ID tags.
Re-organize ID tags in a more logical way, and keep their values
strictly increasing, splitting the free available ones in-between the
main groups (to avoid having to edit all tags values when adding a new
one).

Note that shuffling around these ID tags values should not be an issue
anymore, all of these are strictly run-time, and fully cleared in write
code when writing into a .blend file.

This also led to the second cleanup, which is removing some asserts on
ID tag values in readcode, these are useless since the tag is cleared on
write.
2023-04-14 19:23:40 +02:00
c43d493cce blendfile write: Fix handling of embedded IDs.
Embedded IDs did not benefit from any of the recent optimizations
(especially for undo case) when writing regular IDs (cleaning up of some
pure runtime data that would generate a lot of fake 'changed on undo'
status).

Now factor out of `write_file_handle` this part of the code generating
temp ID copy with cleaned-up data for writing, and expose it in BLO API
such that IDs owning embedded ones can also use it.
2023-04-14 19:20:58 +02:00
bfd1836861 Cycles: add instancing support in light tree
Build a subtree for each unique mesh light.

Pull Request: #106683
2023-04-14 19:12:16 +02:00
910f60de4c Fix (unreported) wrong code in foreach_id code for Editors.
Code there was fairly naive and simple, missing some ID pointers,
sometimes improperly accessing non-ID data as IDs (usual dear Outliner
tree element usages of its 'ID' pointer...).

And code was especially quite severely broken in case these UI ID
usages were processed in a non-readonly context (i.e. if some of these
ID pointers were expected to be modified).

Code has been updated following existing very similar code in
`lib_link_workspace_layout_restore` from `readfile.cc`.
2023-04-14 19:06:58 +02:00
495f679246 Fix (unreported) outliner readfile code doing invalid ID pointer reading.
Code re-reading new ID pointers addresses inside readfile process would
not ensure that the 'ID' pointer of the outliner's treestore element is
actually a real ID pointer, and not a 'fake' one.

Probably harmless in practice, though this could have potentially been
the cause of extremely random rare crashes or corruption...
2023-04-14 19:06:58 +02:00
0cb17a7036 Fix (unreported) invalid pointer assignment in 2.80 collection doversion code.
Code would assign a LayerCollection pointer to an ID pointer... Funny
enough, it never seemed to have been an issue until now.
2023-04-14 19:06:58 +02:00
d633d9fd02 Curves: Define "lookup int" function for RNA arrays
The build seems to complain without this, though theoretically it isn't
meant to be necessary. Though keeping them defined can potentially
avoid quadratic lookups too.
2023-04-14 12:42:28 -04:00
dcb3b1c1f9 Geometry: Use implicit sharing for curve and mesh offsets
Similar to 7eee378ecc, this change decreases memory usage and
improves performance when copying curves and meshes without changing
their topology. The same change used for custom data layers is applied
to face and curve offset indices, which aren't stored as a custom data
layer.

The implicit sharing info for the offsets is stored in the mesh and
curve runtime structs, since it doesn't need to be written to files
directly. When changing the offsets pointer directly, the sharing info
must be updated accordingly. To make that easier, a few utility
functions take care of common operations like making an array mutable,
resizing an array, and creating sharing info for allocated data.

This commit also clarifies the intention to not allocate the offsets
at all when there are no curves/faces. That slightly complicates some
of the logic, but there's no reason for the single `0` integer to be
allocated.

Pull Request: blender/blender#106907
2023-04-14 17:58:13 +02:00
fed463df78 IDManagement: Extend ID remapping code.
This commit adds some new, specific flags to further control the ID
remapping process (like an option to skip user refcounting completely).

It also adds a new function to do 'raw' remapping, without any extra
post-processing, depsgraph tagging, etc. This is not used currently, but
will soon be needed by readfile post-processing code changes.

There are also some small cleanups and reorganization in that area of code,
the main noticeable change being the switch from a short to an int for
the flags controlling remapping code (using short here does not give
any benefit, and makes it harder to switch to integers when it becomes
necessary).

No change in behavior is expected from this commit.
2023-04-14 16:59:47 +02:00
988f23cec3 Attributes: Add 2D integer vector attribute type
This type will be used to store mesh edges in #106638, but it could
be used for anything else too. This commit adds support for:
- The new type in the Python API
- Editing the type in the edit mode "Attribute Set" operator
- Rendering the type in EEVEE and Cycles for all geometry types
- Geometry nodes attribute interpolation and mixing
- Viewing the type in the spreadsheet and using row filters

The attribute uses the `blender::int2` type in most code, and
the `vec2i` DNA type in C code when necessary. The enum names
are based on `INT32_2D` for consistency with `INT8` and `INT32`.

Pull Request: blender/blender#106677
2023-04-14 16:08:05 +02:00
80f3f59555 Fix: Remove unsupported data types in extrude and split edges nodes
The extrude node resizes an existing mesh, but doesn't initialize new
data for most non-generic data types like shape keys, freestyle tags,
or custom normals. The split edges node doesn't process some
similar vertex data either.

In the future this data can become generic attributes, or it can be
supported in the nodes anyway. But now the new data is un-initialized
after being allocated.

Fixes #106926
2023-04-14 10:06:48 -04:00
3f31ac2e1a Cleanup: Make deprecated custom data type handling consistent
Mark some types deprecated where they weren't already, remove redundant
comments, and remove the type masks for deprecated types.
2023-04-14 10:06:48 -04:00
2b4a62fa18 Fix: Respect preview range when auto normalizing in Graph Editor
When hitting normalize in the Graph Editor, it would frame the y-extents of the visible part of the FCurves.
Now, when a preview range is set, it frames the part of the FCurves in the preview range.

Pull Request: blender/blender#106888
2023-04-14 15:23:58 +02:00
23bce32888 Tools: util to get inactive member of teams from gitea
Note, at the moment it is using the last login as the criterion for
whether the person should be listed (comparing it to 2 years past).

However anyone who hasn't logged in in gitea yet shows as last login 1970.

To run this you need to install all the required python packages and
generate a token with scope "read:org" or "admin:org".

See:
infrastructure/blender-projects-platform#55
2023-04-14 14:54:26 +02:00
a1cc15f239 Fix: Assert when converting curves object to mesh object
`BKE_mesh_nomain_to_mesh` expects the object's data to be the mesh.
Also, the curve to mesh conversion can return a null pointer, so use
an empty mesh in that case. Thanks to Falk David for finding these.
2023-04-14 08:51:00 -04:00
4d1acf42e9 Cleanup: Minor fix to comments. 2023-04-14 14:24:40 +02:00
e0a3fcb622 Fix #106856: Pose library does not autokey mirrored poses
The auto-keying system was still considering the input Action, and not the
mirrored one. This is now fixed.
2023-04-14 14:19:28 +02:00
10f20bf5d5 Refactor: Rename more grease pencil files to legacy
This renames more files and folders to indicate that it is grease pencil legacy code.

Pull Request: blender/blender#106862
2023-04-14 13:35:08 +02:00
c9258e6e19 Cleanup: BKE: lib_query: Add a new type of callback flags for not-owned embedded ID pointers.
This is the case e.g. of the `parent` collection pointer of collections
children of a scene's master collection, or some nodetree pointers in
the UI data (node editor).

Right now handling of this new flag is exactly the same as in owning
embedded case, the distinction between both usages will happen in future
commits.

This commit is expected to have no behavioral change at all.
2023-04-14 13:29:19 +02:00
c63b2e5187 BLI timeit utils: Add accessor to time value for TIMEBLOCK macros. 2023-04-14 13:29:19 +02:00
4c793a5b20 Fix: Dangling pointer when clearing mesh
Missed in 1db918f948.
2023-04-14 07:00:28 -04:00
786734e6c8 Fix #106057: setting the sculpt curve brush in Python clears active tool
The built-in brush identifier didn't match the enum name causing
brush assignment not to update the tool-system (clearing the tool).

Resolve by using generate_from_enum_ex(..) to avoid each brush
definition having to manually duplicate enum definitions.
2023-04-14 20:14:06 +10:00
10b7d4f601 WM: support separators when generating tools from enums & icon map
Currently this is disabled for sculpt, we may want to enable this in the
future. Also add an icon map argument for brushes to use generic icons.
2023-04-14 20:14:06 +10:00
bf6f69399f RNA: add EnumProperty.enum_items_static_ui to access separators & titles
Expose the full enum including separators and section titles,
useful for the tool system so it's possible to read separators
from brush enums (not part of this commit).
2023-04-14 20:14:06 +10:00
fba960301f Cleanup: use the tool-order for the curve-sculpt tool enum
Prepare for using the enum for brush definitions.
2023-04-14 20:14:06 +10:00
40683e524c Cleanup: remove unused argument for tool-systems generate_from_enum_ex
Also avoid using a dict as a default argument. While it didn't cause
problems - in general it's bad practice and worth avoiding as any
modifications produce strange behavior.
2023-04-14 20:14:06 +10:00
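
The pitfall with a dict as a default argument can be shown with a minimal sketch. The function names below are hypothetical and unrelated to the tool-system code; they only demonstrate the general Python behavior.

```python
def register_bad(name, registry={}):
    # The default dict is created once, when the function is defined,
    # so every call without `registry` mutates the same object.
    registry[name] = True
    return registry


def register_good(name, registry=None):
    # Using None as a sentinel gives each call its own fresh dict.
    if registry is None:
        registry = {}
    registry[name] = True
    return registry
```

With `register_bad`, entries from earlier calls leak into later ones, while `register_good` starts from an empty dict each time unless the caller passes one in.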
e9d4e571d0 Cleanup: use identity comparisons with False 2023-04-14 20:14:06 +10:00
84e216fcee UI: replace "copy/paste buffer" by "internal clipboard"
A buffer is a technical term most often referred to using the metaphor
of a "clipboard" in applications. However, the "clipboard" is usually
the system clipboard, used to carry data across applications. To
avoid confusion, this replaces "clipboard" by "internal clipboard"
when not dealing with the system clipboard.

In addition, a few places still used the "[copy/paste] buffer"
terminology, so they are replaced with "internal clipboard" as well.

The replacement from "[copy/paste] buffer" to "clipboard" was
undertaken in previous commits da6d6f99a8, 14b60c3a1c. This
commit should tackle the remaining occurrences.

Pull Request: blender/blender#106060
2023-04-14 12:12:30 +02:00
1b94e60fb0 UI: Quick tooltip showing tab name for Properties editor tabs
Adds "quick tooltips" to quickly see the name of a tab in the Properties
editor. (See patch for visuals.)

From own experience users are often confused about the name of the different
tabs, and I always found the delay to see it in the tooltip annoying. These
quick tooltips have been introduced for the toolshelf and solve this issue
nicely here. There is still a delay so that simple mouse movements won't
trigger the tooltips, but they show up a lot faster than normal tooltips now.

This may have the side-effect that icon only enum-item buttons will show the
enum name when there is no RNA property description. Previously we wouldn't
show this, even if available.

Pull Request: blender/blender#106906
2023-04-14 11:43:20 +02:00
33bfbb2a0c USD IO: Move to the new Mesh Attributes API for Colors
This revision moves the vertex color reading and writing in the USD import and export functions over to the new Mesh Attributes API. I have removed anything else (new features or unnecessary changes) that was present in the prior patches to focus only on this task.

On the import side, I've introduced a class method named read_custom_data. In this function is the call-out for reading mesh colors. As requested, this function is intended to be the starting point for future Attribute reads, with methods like the new read_color_data* methods being called when a USD primvar matches a specific heuristic. UVs will (in the future, not in this revision) also need to be processed here. In a later patch, any primvars that do not match a heuristic can be imported as generic Attributes. There is a matching function on the export side, write_custom_data.

Attached is a .blend file for testing. The plane has five Color Attributes. The colors should be visibly the same when exported and re-imported.

I have also enabled color attribute imports by default. I believe it would be counter intuitive for most users for this feature to be off-- it means that at some point, a person round-tripping with default settings will lose data.

Pull Request: blender/blender#105347
2023-04-14 11:05:26 +02:00
86b39e0aac Vulkan: Fix Compilation Issue on Windows.
Vulkan uses IMath. IMath on windows requires an option to indicate
it is used as dll file. This option wasn't set for the GPU module.

Thanks to Kazashi Yoshioka for mentioning this.

Pull Request: blender/blender#106932
2023-04-14 10:17:35 +02:00
71c4b7f1d0 Fix compilation warning
GHOST_getClipboardImage expects signed integers to be passed by pointer.

Pull Request: blender/blender#106933
2023-04-14 09:58:30 +02:00
26dc9f90d2 Fix: Metal null buffer initialization
Buffer wasn't actually initialized and read out of bounds.

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#106807
2023-04-14 07:59:45 +02:00
2745cacd95 Fix #106704: Resolve flashing Metal viewport
Previous fix to resolve GPU hang which could occur in the
Metal backend caused additional flickering to occur as
a side effect, due to removal of required execution
dependencies in certain places.

This patch resolves both problems by only removing the
GPU hang dependency stall when additional synchronization
primitives are used along-side the global sync primitive.

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#106914
2023-04-14 07:54:08 +02:00
dc1b36f288 Cleanup: correct naming of struct member comments
Also reduce right-shift for DupliGenerator declarations.
2023-04-14 14:33:28 +10:00
26aa1b1367 Cleanup: correct doc-string & naming for BMUVOffsets access function
Changes to [0] which worked as intended but used confusing naming.

- The note on using -1 for the active layer causes an assertion.
- The doc-string was above the wrong function.
- The meaning of the `_n()` suffix was flipped,
  where the `layer_index_n` refers to an absolute index across all
  layer types which is done internally for an index calculated from the
  `layer` argument, not the argument itself which is a UV index.
  Rename BM_uv_map_get_offsets_n to BM_uv_map_get_offsets_from_layer.

[0]: 412b6a8f65
2023-04-14 14:33:25 +10:00
37b7702d74 Cleanup: comment blocks, #if 0 commented code 2023-04-14 13:51:38 +10:00
1633fca4a4 Cleanup: order return arguments last, minor doc-string clarification 2023-04-14 13:36:49 +10:00
a3e954e0a7 UV: paste now skips tagging objects for update when they aren't changed 2023-04-14 13:36:49 +10:00
cbc5b17c1a Cleanup: Use const int cast 2023-04-13 22:20:47 -04:00
a76c714c26 Cleanup: Whitespace 2023-04-13 22:20:35 -04:00
c22fed5c01 Cleanup: Unnecessary null check
See comment on MEM_delete
2023-04-13 22:20:26 -04:00
c38d259779 Cleanup: Use const pointers 2023-04-13 22:17:06 -04:00
8497737d92 Cleanup: Use reference 2023-04-13 22:16:42 -04:00
c385369e07 Cleanup: Fix typo in function name 2023-04-13 22:12:47 -04:00
4ebe696e85 Cleanup: Comment formatting and const casts 2023-04-13 22:12:09 -04:00
b180fea69e Cleanup: Proper doxygen section 2023-04-13 22:08:57 -04:00
a066d62798 Cleanup: Function and variable naming 2023-04-13 22:08:02 -04:00
c226954d03 Cleanup: Proper doxygen syntax 2023-04-13 22:06:43 -04:00
9e4f58a8eb Merge branch 'main' into geometry-nodes-simulation 2023-04-13 22:06:08 -04:00
39bcf6bdc9 UI: Allow Clipboard Copy/Paste Images
Adds operators to copy and paste to and from the OS clipboard, but only
implemented for Windows.

Pull Request: blender/blender#105833
2023-04-14 03:48:17 +02:00
29c2722753 Cleanup: remove stray white-space & redundant addition 2023-04-14 11:25:59 +10:00
89aa86cb0a Cleanup: remove unused ImBuf::c_handle
Cache limiting is implemented in moviecache.cc.
Marked for possible removal in 2011.
2023-04-14 11:25:10 +10:00
3fadaa4fca Cleanup: Remove unused PBVH "respect hide" variable
Pull Request: blender/blender#106832
2023-04-13 22:15:01 +02:00
dda9c59044 Cleanup: remove useless macro
Could cause:
```
warning C4005: 'DIAL_RESOLUTION': macro redefinition
```

Missed in 97c05aa288
2023-04-13 16:02:23 -03:00
97c05aa288 Transform: improve visualization when dragging Gizmos
Apply the changes suggested at #103782

It includes:
- Draw dot at the origin of the active gizmo
- Hide other gizmos while dragging (except the move arrows)

Other changes:
- Draw shadow for the move and scale circle gizmos (while transforming)

Pull Request: blender/blender#104624
2023-04-13 20:23:03 +02:00
bdd6e617ea Cleanup: expose utility that finds a gizmo through its properties
Some properties may have the pointer stored in the gizmo structure
itself.

Reading from the struct directly is useful for cases where the value is
accessed frequently but not often required by the caller.

A disadvantage is that the property may not be saved in the file.
2023-04-13 15:18:44 -03:00
fa13058fa6 UI: Color Picker Positioning
If there is not enough space for the Color Picker either above or below
the launching button, adjust the position to fit instead of clipping.

Pull Request: blender/blender#106122
2023-04-13 19:44:54 +02:00
71ed98debe Point Cloud: Avoid unnecessarily initializing initial positions
Similar to e35f971da1. We aren't meant to rely on the
zero-initialization on creation anyway. I observed a small (a few
percent) decrease in minimum runtime in the geometry nodes points node.
2023-04-13 12:49:16 -04:00
197e9b9f80 Cleanup: Use more descriptive function name in extrude node 2023-04-13 12:49:16 -04:00
a7bee90c1d Cleanup: Add access method for point cloud positions
The position attribute has special meaning for point clouds, and
meshes and curves have access methods for the attribute as well.
This saves boilerplate and gives more consistency between types.
2023-04-13 12:49:16 -04:00
cef128e68a Fix menu padding in Console editor header
The View menu was a few pixels to the right compared to all other editors.
2023-04-13 18:06:12 +02:00
baeb386410 Merge branch 'main' into geometry-nodes-simulation 2023-04-13 16:41:16 +02:00
5ba35b3d15 Fix #106748: Rendering with OSL fails with OPTIX_ERROR_PIPELINE_LINK_ERROR
The OSL GPU services implementation of noise intrinsics was missing the
overloads for derivatives and therefore OptiX pipeline creation would fail if
those were referenced.
2023-04-13 15:52:23 +02:00
936e608382 Cleanup: Deduplicate curves data-block copying
After 7eee378ecc, the logic is the same in the
CurvesGeometry copy constructor and the curves data-block
copying.
2023-04-13 09:26:13 -04:00
1db918f948 Fix #106901: Dangling pointer after freeing mesh runtime data
This led to a double-free when deleting the mesh runtime
struct after explicitly clearing the derived caches separately.
2023-04-13 09:21:22 -04:00
7eee378ecc Custom Data: support implicit sharing for custom data layers
This integrates the new implicit-sharing system (from fbcddfcd68)
with `CustomData`. Now the potentially long arrays referenced by custom
data layers can be shared between different systems but most importantly
between different geometries. This makes e.g. copying a mesh much cheaper
because none of the attributes has to be copied. Only when an attribute
is modified does it have to be copied.
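
The copy-on-write behavior described above can be modeled with a small sketch. The `SharedArray` class is hypothetical and greatly simplified; it is not Blender's implicit-sharing API, only an illustration of "share on copy, duplicate on first write".

```python
class SharedArray:
    """Hypothetical copy-on-write array with a shared user count."""

    def __init__(self, data):
        self._data = data
        self._users = [1]  # boxed so that all sharers see the same count

    def share(self):
        # A cheap "copy": both objects reference the same underlying data.
        other = SharedArray.__new__(SharedArray)
        other._data = self._data
        other._users = self._users
        self._users[0] += 1
        return other

    def read(self):
        return self._data

    def write(self, index, value):
        if self._users[0] > 1:
            # Still shared: detach by copying the data before modifying it.
            self._users[0] -= 1
            self._data = list(self._data)
            self._users = [1]
        self._data[index] = value


original = SharedArray([1, 2, 3])
copy = original.share()   # no data is duplicated here
copy.write(0, 9)          # the actual copy happens only now
```

Reads never trigger a copy, so geometry that is only passed along unchanged stays as cheap as a pointer copy.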

Also see the original design task: #95845.

This reduces memory and improves performance by avoiding unnecessary
data copies. For example, the used memory after loading a highly
subdivided mesh is reduced from 2.4GB to 1.79GB. This is about 25%
less which is the expected amount because in `main` there are 4 copies
of the data:
1. The original data which is allocated when the file is loaded.
2. The copy for the depsgraph allocated during depsgraph evaluation.
3. The copy for the undo system allocated when the first undo step is
  created right after loading the file.
4. GPU buffers allocated for drawing.

This patch only gets rid of copy number 2 for the depsgraph. In theory
the other copies can be removed as part of follow up PRs as well though.
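
The "about 25%" figure can be checked with a quick calculation. This sketch assumes, as a simplification, that the four copies listed above are roughly equal in size, so removing one of them should save about a quarter of the memory.

```python
before_gb = 2.4   # memory use reported before the patch
after_gb = 1.79   # memory use reported after the patch

measured = (before_gb - after_gb) / before_gb  # fraction actually saved
expected = 1 / 4                               # one of four equal copies removed

print(f"measured {measured:.1%}, expected {expected:.0%}")
```

The measured saving lands within a percentage point of the estimate, which is consistent with only the depsgraph copy being eliminated.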

-----

The patch has three main components:
* Slightly modified `CustomData` API to make it work better with implicit
  sharing:
  * `CD_REFERENCE` and `CD_DUPLICATE` have been removed because they are
    meaningless when implicit-sharing is used.
  * `CD_ASSIGN` has been removed as well because it's not an allocation
    type anyway. The functionality of using existing arrays as custom
    data layers has not been removed though.
  * This can still be done with `CustomData_add_layer_with_data` which
    also has a new argument that allows passing in information about
    whether the array is shared.
  * `CD_FLAG_NOFREE` has been removed because it's no longer necessary. It
    only existed because of `CD_REFERENCE`.
  * `CustomData_copy` and `CustomData_merge` have each been split up into
    functions that copy the actual attribute values and functions that do
    not. The latter now have the `_layout` suffix
    (e.g. `CustomData_copy_layout`).
* Changes in `customdata.cc` to make it actually use implicit-sharing.
* Changes in various other files to adapt to the changes in `BKE_customdata.h`.

Pull Request: blender/blender#106228
2023-04-13 14:57:57 +02:00
7e764ec692 GPU: Texture: Expose depth dimension extent
This function was not exposed outside of internal GPU module.

Renaming `draw::Texture::depth()` to `is_depth` for consistency
and removing the ambiguity.
2023-04-13 14:06:53 +02:00
aa6e95281f Add support for OpenPGL 0.5.0
Some functions changed slightly for this non beta release.
No functional changes though as we didn't use what was removed.

Pull Request: blender/blender#106861
2023-04-13 11:44:35 +02:00
c26083b6be Fix warning in the STL code
The fast_float is an external library, so move it to the
system includes which has less strict compiler flags applied.

This matches how other IO modules use this library as well.

Pull Request: blender/blender#106892
2023-04-13 11:06:42 +02:00
e5d50b1787 Fix Cycles unknown passes logged when built with Cycles debug
When WITH_CYCLES_DEBUG is set to ON the following errors are
printed to the console:

E0412 15:51:22.588564 7996345 sync.cpp:737] Unknown pass Guiding Color
E0412 15:51:22.588605 7996345 sync.cpp:737] Unknown pass Guiding Probability
E0412 15:51:22.588613 7996345 sync.cpp:737] Unknown pass Guiding Average Roughness

This change fixes this by treating the guiding passes the same
way as all other passes, solving the errors and making it possible
to visualize guiding passes in the viewport later on.

Pull Request: blender/blender#106863
2023-04-13 10:27:07 +02:00
34739f6a6d Fix #106672: MacOS/OpenGL doesn't draw anything Eevee related.
This PR reverts the breaking part of the #106535. This part doesn't seem
to be required to fix the HD4400-HD5500 issue.

Might also fix #106844.

Pull Request: blender/blender#106887
2023-04-13 10:18:44 +02:00
a899d57e57 Fix: Ignore hidden FCurves when framing y-extents
On hitting normalize,
when the y-extents of the Graph Editor are framed to the FCurves,
it didn't ignore hidden FCurves.
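
A rough sketch of the intended behavior, assuming a simplified curve type. `Curve`, its `hide` flag, and `visible_y_extents` are hypothetical names for illustration, not the Graph Editor code:

```python
from dataclasses import dataclass


@dataclass
class Curve:
    """Hypothetical stand-in for an FCurve: sampled y values and a hide flag."""
    values: list
    hide: bool = False


def visible_y_extents(curves):
    """Return (y_min, y_max) over visible curves, or None if there are none."""
    ys = [y for curve in curves if not curve.hide for y in curve.values]
    if not ys:
        return None
    return min(ys), max(ys)


curves = [Curve([0.0, 1.0]), Curve([-50.0, 80.0], hide=True), Curve([0.5, 2.0])]
extents = visible_y_extents(curves)
```

The hidden curve with the large range no longer widens the framed extents.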
2023-04-13 09:52:05 +02:00
0e5c941049 Fix #106773: resolve Metal grease pencil fill
Changes to viewport state to resolve texture paint color
selection introduced a side effect wherein the correct
attachment size of a framebuffer was reset. This size is
needed when scissor regions are disabled to return the
state to its correct default. When this default was wrong,
certain operators would have incorrect offsets.

To maintain consistency with the OpenGL backend, the
Metal backend independently tracks the raw attachment
size using default_width/height. This will also reset to zero
when attachments are all removed, unlike other state which
may be retained.

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#106857
2023-04-13 08:23:34 +02:00
ed4374f089 Fix #66722: file doesn't open if app not focused
After double clicking a file, user can click on a different app and
Blender will lose focus. Then it stays on splash screen. So fetch any
window instead of relying on active one to open the file.

pull request #106769
2023-04-13 10:48:29 +05:30
0a270e3513 Revert "GHOST/Wayland: avoid up-scaling window content"
This reverts commit 4fe2685615.

Always use the preferred_scale requested by the compositor as this
did not work so well in the intended use case (where the low resolution
monitor text was scaled down and difficult to see).

After discussion with @ZedDB, revert this change since there are cases
where either functionality might be preferred - to ensure Blender's UI
is visible on a low resolution mirrored projector, for example.

Changing the behavior of the preferred scale makes most sense in the
compositor, instead of controlling it on a per-application basis.
2023-04-13 13:14:24 +10:00
684dcd3680 CMake: use GCC's -fuse-ld=mold support (mold now requires GCC 12.1)
GCC did not support mold when support was initially added,
since then GCC has been updated to add support, removing the need
to point to a binary directory containing an alternative `ld` command.

Support for MOLD_BIN CMake option has been kept as mold may be installed
to standard location.
2023-04-13 13:14:19 +10:00
a7462f58d1 Fix missing import for DJV on APPLE
We might remove support for this if it's not used,
correct the import for now.
2023-04-13 13:14:17 +10:00
11ad851fbe Fix lightmap UV calculation ignoring unselected objects in edit-mode 2023-04-13 13:14:15 +10:00
ad5ec544c8 Cleanup: remove unused apply-image option from uvcalc_lightmap
Note that this was a quick update from Blender 2.4's API,
while many other improvements could be made - remove this option as
it does nothing.
2023-04-13 13:14:13 +10:00
61cb302dd5 Cleanup: sort cmake file lists 2023-04-13 13:14:11 +10:00
678dc456e3 Cleanup: remove duplicate class
Introduced in ba982119cd.
2023-04-13 13:14:09 +10:00
14905cd1d5 Cleanup: quiet pylint's inconsistent-return-statements warning
No functional changes as the return value either relied on None
evaluating to False or wasn't used.
2023-04-13 13:14:06 +10:00
6482f9fffe Cleanup: quiet various pylint warnings 2023-04-13 13:14:05 +10:00
84e72e8170 Cleanup: remove duplicate assignments, assign 'layout' early in draw(..)
Also remove unused assignments.
2023-04-13 13:14:03 +10:00
ab64bd264a Cleanup: consistent naming for tool_settings & operator properties
Use names:
- `tool_settings` instead of `ts`.

- `props` instead of `op` / `prop` / `op_props`
  As Python may reference operators, don't confuse the operator
  properties with an instance of the operator.

In both cases these names were already used for most scripts.
2023-04-13 13:14:01 +10:00
b0edd63752 Cleanup: avoid early imports, remove unused variable 2023-04-13 13:13:59 +10:00
741d8dc1e2 Cleanup: format, use C++ nullptr & function style casts 2023-04-13 13:13:57 +10:00
4bccbceb34 Cleanup: quiet GCC warnings 2023-04-13 13:13:56 +10:00
35071af465 Sculpt: Fix #104631: Tip Roundness on Paint brush causes jittering
I cleaned up the cube brush tip code quite a bit; more remains
to be done.  There is a new function to initialize cube
tip matrices, SCULPT_cube_tip_init.  It's currently only
used by the paint brush, I'll need to do a bit of testing
before using it for clay strips and multiplane scrape.

Note: SCULPT_cube_tip_init uses the brush local matrix code
to avoid code duplication (and to take advantage of the debouncing
that is done there).
2023-04-12 18:33:53 -07:00
d48939f103 Cleanup: Simplify uv packing engine
Migrating to use uv_phi[] for placement.
2023-04-13 10:05:19 +12:00
5ba30e07f2 Cleanup: Simplify uv packing api
Migrating preprocessing into the packing engine.
2023-04-13 10:01:43 +12:00
ecfdbaef9b Cleanup: format 2023-04-13 09:58:15 +12:00
2b565c6bd0 UI: Add slash character support to fuzzy search initials mode
Nodes with names separated by a slash (/) cannot
be searched by their initials.

This commit adds the slash character to
the list of separators for this type of
fuzzy search.
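
As an illustration only, initials matching with a configurable separator set might look like the sketch below. The separator set, the helper names, and the prefix-matching rule are assumptions; the commit only states that the slash was added to the list of separators.

```python
import re

# Assumed separator set for illustration; only "/" is confirmed by the commit.
SEPARATORS = " -_/"


def initials(name, separators=SEPARATORS):
    """Return the lowercase first letter of each word in `name`."""
    words = re.split("[" + re.escape(separators) + "]+", name)
    return "".join(word[0].lower() for word in words if word)


def matches_initials(query, name, separators=SEPARATORS):
    """True if `query` is a prefix of the initials of `name`."""
    return initials(name, separators).startswith(query.lower())
```

With the slash treated as a separator, a name like "Mesh/Extrude" yields two initials instead of one, so a query made of both initials can match.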

Pull Request: blender/blender#106838
2023-04-12 23:53:36 +02:00
a277117b3e Merge branch 'main' into geometry-nodes-simulation 2023-04-11 09:19:12 +02:00
e665a50fb6 Cleanup: rename region to zone in theme settings 2023-04-07 16:53:21 +02:00
c6d8da0e97 Remove simulation inputs when deleting outputs and vice versa
To avoid unpaired simulation inputs or outputs, whenever deleting such
nodes the respective paired node should be deleted as well. A simple
utility function selects paired nodes before the delete operator removes
them.

This does not affect API methods, which still remove only individual
nodes. The feature is primarily a workflow improvement.
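
The selection step could be sketched as follows. This is a simplified model, not the operator code: nodes are plain integer ids and `pairs` is a hypothetical mapping from input-node id to output-node id.

```python
def expand_selection_with_pairs(selected_ids, pairs):
    """Return the selection plus the partner of every selected simulation node.

    `pairs` maps a hypothetical input-node id to its output-node id.
    """
    partner = {}
    for input_id, output_id in pairs.items():
        partner[input_id] = output_id
        partner[output_id] = input_id
    expanded = set(selected_ids)
    for node_id in selected_ids:
        if node_id in partner:
            expanded.add(partner[node_id])
    return expanded
```

A delete operator would then remove everything in the expanded set, while an API call that removes a single node would skip this step.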

Resolves #105728

Pull Request: blender/blender#106597
2023-04-05 18:28:36 +02:00
98bc439e47 Add simulation input and output node as a pair
Simulation input and output nodes are currently added individually, but they always need to exist as a pair. This PR modifies the _Add Node_ menu so that a single menu entry adds an input and output node together.

The `NODE_OT_add_node` operator currently adds just a single node type. A new variant of this operator is added which adds a _simulation zone_ with origin + target node instead. This requires some modification of the `NodeAddOperator` base class, moving the `node_type` property into the final implementation. Unlike the `NODE_OT_add_node` operator, the `NODE_OT_add_simulation_zone` adds 2 different node types.

After adding the two nodes, a reference needs to be added to "pair" them: Input node ("origin") stores the UID of the output node ("target") in its `output_node_id` property. So far this was detected automatically when adding an input, but this method is not very robust (e.g. it depends on order of adding nodes and adding multiple pairs can be tricky).

Now the pairing is done explicitly through an API function `node_geo_simulation_input_pair_with_output`. The `NODE_OT_add_simulation_zone` operator performs pairing of the input/output nodes after adding them. The function is accessible through RNA, so an operator may be added if necessary to allow users to fix unpaired nodes.

In addition to pairing the two nodes, the operator also positions them at a comfortable distance, as well as adding a default link between the two Geometry sockets for convenience.
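
The explicit pairing can be pictured with the sketch below. The `Node` class, the `kind` strings, and both helper functions are hypothetical; only the `output_node_id` property name comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    uid: int
    kind: str  # "SIM_INPUT" or "SIM_OUTPUT" in this sketch
    output_node_id: Optional[int] = None  # only meaningful on input nodes


def pair_with_output(sim_input, sim_output):
    """Store the output node's UID on the input node; False on a type mismatch."""
    if sim_input.kind != "SIM_INPUT" or sim_output.kind != "SIM_OUTPUT":
        return False
    sim_input.output_node_id = sim_output.uid
    return True


def find_paired_output(sim_input, nodes):
    """Look the partner up by UID; None means unpaired or the output is gone."""
    for node in nodes:
        if node.kind == "SIM_OUTPUT" and node.uid == sim_input.output_node_id:
            return node
    return None
```

Because the pairing is stored as a UID rather than inferred from creation order, adding several pairs in any order stays unambiguous, and a missing partner shows up as `None` on lookup.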

Resolves #105727

Pull Request: blender/blender#106557
2023-04-05 16:20:41 +02:00
3d8d142205 Check if output node exists before creating a sim input lazy function.
Avoid creating a lazy function for the `Simulation Input` node if the `output_node_id` is invalid. This can happen when e.g. the output node is deleted without also deleting the input node. The lazy function assumes a valid output ID and will crash if the output node does not exist.

Pull Request: blender/blender#106585
2023-04-05 14:28:58 +02:00
d98988d872 Simulation Nodes: indicate which frames are cached in timeline 2023-03-28 13:37:57 +02:00
b14668ae03 Simulation Nodes: refactor automatic caching
The most important part of this change is that the simulation
state at a specific point in time is more self contained now.
This way, only the modifier has to deal with finding the old/new
simulation states and not every simulation individually.

Furthermore, this also includes some simple cache invalidation
when the user changes something that might affect the result.
2023-03-28 13:37:30 +02:00
f04787e87b Merge branch 'main' into geometry-nodes-simulation 2023-03-28 11:48:25 +02:00
0351ce9769 Merge branch 'main' into geometry-nodes-simulation 2023-03-22 11:35:08 +01:00
c4b3e0e0bb Merge branch 'main' into geometry-nodes-simulation 2023-03-21 17:56:34 +01:00
2ddddf6e36 fix simulation zone background color 2023-03-21 16:02:37 +01:00
b652bcbe92 remove use_persistent_cache from simulation output node 2023-03-21 12:38:47 +01:00
8b47a252b1 Merge branch 'main' into geometry-nodes-simulation 2023-03-21 12:25:13 +01:00
d9fb08133a Geometry Nodes: improve simulation zone discovery 2023-03-20 16:59:44 +01:00
753af18573 BLI: improve bit data structures and processing 2023-03-20 16:59:16 +01:00
7b62cc943d fix 2023-03-20 16:25:02 +01:00
e5c63abece Merge branch 'main' into geometry-nodes-simulation 2023-03-20 16:15:08 +01:00
e21afd2f2f Merge branch 'main' into geometry-nodes-simulation 2023-03-16 12:47:11 +01:00
3bad7a51cb Implement simulation nodes as lazy functions directly
Support for multiple sockets will be slightly more complete now too,
but that part hasn't been tested.
2023-03-13 17:29:06 -04:00
ecb91d46a8 Merge branch 'main' into geometry-nodes-simulation 2023-03-13 13:45:05 -04:00
f36dd06609 Add initial infrastructure for multiple simulation state items
- Use dynamic declarations to build simulation node sockets
- Fixes in some node code for further use of dynamic declarations
- Copying, freeing, reading, and writing of simulation state array
- Add simulation state items with link drag operator

The new sockets won't do anything yet, only geometry sockets are
supported, and there is no way to remove sockets yet.
2023-03-10 16:17:24 -05:00
b2e508f7af Merge branch 'main' into geometry-nodes-simulation 2023-03-10 12:31:50 -05:00
a90f02d5f4 Merge branch 'main' into geometry-nodes-simulation 2023-03-10 11:33:03 -05:00
9241ab1d7c Merge branch 'main' into geometry-nodes-simulation 2023-02-28 11:48:47 -05:00
6f56fee3bb Fixes after merge 2023-02-17 17:44:56 -05:00
9393c2aba9 Merge branch 'main' into geometry-nodes-simulation 2023-02-17 17:09:21 -05:00
508fd044b4 Revert "Fix simulation"
This reverts commit 468f43c7a6.

Revert "Add initial dynamic declarations"

This reverts commit 50a2c77c4e.

Revert "Add initial simulation state items array to output node"

This reverts commit 3f1027567d.
2022-12-19 12:04:46 -06:00
38567bc023 Cleanup: Slightly refactor cancelling link drag operator
Clarify that the dragged links aren't stored in the tree, use a
separate function for cancelling vs. applying the links to the tree.
2022-12-19 12:04:46 -06:00
8fa664fd33 Cleanup: Return early in node link operator, remove useless comments 2022-12-19 12:04:46 -06:00
1b24140d9f Cleanup: Remove redundant information from node link drag struct 2022-12-19 12:04:46 -06:00
aeea690e00 Cleanup: Remove unnecessary node link flag
Links that are currently being dragged are now stored outside
of the node tree, so we don't need a flag to distinguish them
from "proper" links.
2022-12-19 12:04:46 -06:00
6ecb1cb780 Fix: socket tooltip not showing when there was no type conversion 2022-12-19 12:04:46 -06:00
988241e23e Geometry Nodes: simplify handling of invalid group interface sockets
Previously, the code tried to keep node groups working even if some of
their input/output sockets had undefined type. This caused some
complexity with no benefit because not all places outside of this file
would handle the case correctly. Now node groups with undefined
interface sockets are disabled and have to be fixed manually before
they work again.

Undefined interface sockets are mostly caused by invalid Python
API usage and incomplete forward compatibility (e.g. when newer
versions introduce new socket types that the older version does
not know).
2022-12-19 12:04:46 -06:00
a2cee52617 Fix std::optional value() build error on older macOS SDK
Patch from @dupoxy

Differential Revision: https://developer.blender.org/D16796
2022-12-19 12:04:46 -06:00
a0ed3601c9 Fix T103187: Opening node search menu is slow because of assets.
Avoid utility function call that would query the file system, this was a
bottleneck. The path joining was also problematic. See patch for more
details.

Differential Revision: https://developer.blender.org/D16768

Reviewed by: Jacques Lucke
2022-12-19 12:04:46 -06:00
8226abc111 Cleanup: Remove duplicate UV islands header
This code was duplicated from `pbvh_uv_islands.hh`,
which was the version that was actually used.
2022-12-19 12:04:46 -06:00
Christophe Hery
807be888a5 Fix: Crash after mesh color attribute name commit
6514bb05ea missed a null check when accessing the active
and default color attribute names, since the CustomData API does not
do that check itself.
2022-12-19 12:04:46 -06:00
f92a85d7d2 Nodes: Add Exclusion color mix mode
Expands Color Mix nodes with new Exclusion mode.

Similar to Difference but produces less contrast.

Requested by Pierre Schiller @3D_director and
@OmarSquircleArt on twitter.

Differential Revision: https://developer.blender.org/D16543
2022-12-19 12:04:46 -06:00
6bd6d7aec7 Fix T103258: Deleting a shader with OptiX OSL results in an illegal address error
Materials without connections to the output node would crash with OSL
in OptiX, since the Cycles `OSLCompiler` generates an empty shader
group reference for them, which resulted in the OptiX device
implementation setting an empty SBT entry for the corresponding direct
callables, which then crashed when calling those direct callables was
attempted in `osl_eval_nodes`. This fixes that by setting the SBT entries
for empty shader groups to a dummy direct callable that does nothing.
2022-12-19 12:04:46 -06:00
cfb77c54b0 Fix T103257: Enabling or disabling viewport denoising while using OptiX OSL results in an error
Switching viewport denoising causes kernels to be reloaded with a new
feature mask, which would destroy the existing OptiX pipelines. But OSL
kernels were not reloaded as well, leaving the shading pipeline
uninitialized and therefore causing an error when it is later attempted to
execute it. This fixes that by ensuring OSL kernels are always reloaded
when the normal kernels are too.
2022-12-19 12:04:46 -06:00
0d18005d2b Cleanup: indentation in CMake files 2022-12-19 12:04:46 -06:00
06525747c0 Build: resolve failure to copy indirect dependencies for USD on Linux
Even when building without OpenImageIO and OpenVDB, USD depends on these
libraries.

Ensure these libraries are copied when building with USD.
2022-12-19 12:04:46 -06:00
74171ff3b0 CMake: warn Linux references old linux_centos7_x86_64 paths
When the centos7 library dir is found, warn when the values of cached
variables reference it, listing the variables and their values.
2022-12-19 12:04:46 -06:00
32a7384c0b Geometry Nodes: improve dot graph export of lazy function graph
* Dim default input values.
* Print default input values instead of type name.
* Add node/socket names to group input/output nodes.
2022-12-19 12:04:46 -06:00
54942e5ea6 Fix T102792: Sculpt cursor jumps to random place
Restrict the condition under which paint cursors read the cursor
location from the operating-system.

This caused a glitch when dragging UI elements in the painting context
popup, since the paint cursor would display using mouse motion
which was clamped to the window center - an internal detail of hidden
cursor grabbing.

Now only read the cursor coordinates when clamped to a region which
is used for the transform cursor to stay visible even when the cursor
wraps around.
2022-12-19 12:04:46 -06:00
5d1ed47d6c Fix T103253: Infinite drag of number buttons is broken on WIN32
The recent revert of cursor grabbing changes was intended to match the
Blender 3.3 release. This is the case for the 3.4x branch, however there is
an additional change to grabbing on WIN32 by Germano [0] which is a
significant improvement on old grabbing logic for Windows.
So instead of matching 3.3x behavior, restore logic that keeps
the cursor centered while grabbing & hidden.

This re-introduces T102792 issue displaying the paint-brush while
dragging buttons, this will have to be solved separately.

Re-apply [1] & [2], revert [3] & [4].

[0]: 9fd6dae793
[1]: 4cac8025f0
[2]: 230744d6fd
[3]: 0240b89599
[4]: a3a9459050
2022-12-19 12:04:46 -06:00
bb8cbf0c10 UI: don't change mouse cursor while it's grabbed
The paint cursor was continuously set which meant hiding the cursor
while interacting with buttons would immediately show it again.

This exposed cursor warping.
2022-12-19 12:04:22 -06:00
b3386868fe Build: correct extension type for SNDFILE 2022-12-19 12:04:22 -06:00
c9bd78890a Build: remove opus workaround for sndfile
For some reason SNDFILE now builds without this workaround,
which broke building FFMPEG.
2022-12-19 12:04:22 -06:00
a12614d166 Fix T102923: replace zero check with epsilons with uv constrain to bounds
Small roundoff errors during UV editing can sometimes occur, most likely
due to so-called "catastrophic cancellation".

Here we set a tolerance around zero when using Constrain-To-Bounds and UV Scaling.

The tolerance is set at one quarter of a texel, on a 65536 x 65536 texture.
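
Worked out, a quarter of a texel on a 65536 x 65536 texture is 0.25 / 65536 = 2^-18, roughly 3.8e-06 in UV units. A minimal sketch of such a check follows; the constant and function names are hypothetical and not taken from the patch.

```python
TEXTURE_SIZE = 65536
# One texel spans 1 / TEXTURE_SIZE in UV space; the tolerance is a quarter of that.
TOLERANCE = 0.25 / TEXTURE_SIZE


def is_near_zero(value, tolerance=TOLERANCE):
    """Tolerant replacement for an exact `value == 0` comparison."""
    return abs(value) <= tolerance
```

A residue such as `0.1 + 0.2 - 0.3`, which is not exactly zero in floating point, passes this check where an exact comparison would fail.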

TODO: If this fix holds, we should formalize the tolerance into the UV editing
subsystem, perhaps as a helper function, and investigate where else it needs
to be applied.

Differential Revision: https://developer.blender.org/D16702
2022-12-19 12:04:22 -06:00
0403d77a0f Fix T103237: Prevent UV Unwrap from packing hidden UV islands
When migrating to the new packing API, pin_unselected was not
implemented correctly.

Regression from rB143e74c0b8eb, rBe3075f3cf7ce, rB0ce18561bc82.

Differential Revision: https://developer.blender.org/D16788

Reviewed By: Campbell Barton

Duplicated in blender-v3.4-release as rB3dcd9992676a
2022-12-19 12:04:22 -06:00
fe7a0ebce4 Cleanup: format 2022-12-19 12:04:22 -06:00
af7c34716b Fix active/default color names not being editable
Revert [0] and enable the editable flag as the intent for [1] was that
these values would be editable.

[0]: e58f5422c3
[1]: 6514bb05ea
2022-12-19 12:04:22 -06:00
Damien Picard
34f307547b Fix T103183: UV map name of mesh converted from curve is untranslated
Upon conversion, the newly-created UV map with default name "UVMap"
should be translated.

Reviewed By: mont29

Maniphest Tasks: T103183

Differential Revision: https://developer.blender.org/D16775
2022-12-19 12:04:22 -06:00
bb9b2b556f Cleanup: remove unused active name set callback functions 2022-12-19 12:04:22 -06:00
a3a132ea74 Build: upgrade pre-built libraries for Linux
Replace ../lib/linux_centos7_x86_64 with ../lib/linux_x86_64_glibc_228,
built with Rocky8 Linux, compatible with the VFX platform CY2023,
see: T99618.

- Update build-bot configuration.
- Remove unnecessary check for Blosc, this is part of OpenVDB lib now.
- Remove WITH_CXX11_ABI, always use new C++11 ABI now
- Replace centos7 by glibc_228 everywhere

Note that existing builds with cached paths pointing to
"../lib/linux_centos7_x86_64" will need to be updated.

Includes contributions by Brecht.
2022-12-19 12:04:21 -06:00
ac59dfeffd Fix T103049: Cycles specular light leak regression
The logic here is not ideal but was unintentionally changed in refactoring
for path guiding, now restore it back to 3.3 behavior again.
2022-12-19 12:04:21 -06:00
24523726d7 Mesh: Store active & default color attributes with strings
Attributes are unifying around a name-based API, and we would like to
be able to move away from CustomData in the future. This patch moves
the identification of active and fallback (render) color attributes
to strings on the mesh from flags on CustomDataLayer. This also
removes some ugliness used to retrieve these attributes and maintain
the active status.

The design is described more here: T98366

The patch keeps forward compatibility working until 4.0 with
the same method as the mesh struct of array refactors (T95965).

The strings are allowed to not correspond to an attribute, to allow
setting the active/default attribute independently of actually filling
its data. When applying a modifier, if the strings don't match an
attribute, they will be removed.

The realize instances / join node and join operator take the names from
the first / active input mesh. While other heuristics may be helpful
(and could be a future improvement), just using the first is simple
and predictable.

Differential Revision: https://developer.blender.org/D15169
2022-12-19 12:04:21 -06:00
20cab8f8f2 Geometry Nodes: Add error message when applying modifier with no mesh
If the resulting geometry from applying a geometry nodes modifier
contains no mesh, give an error message. This gives people something to
search and makes the behavior more purposeful.

Also remove the `modifyMesh` implementation from the geometry nodes
modifier, since it isn't necessary anymore. And remove the existing
"Modifier returned error, skipping apply" message which was cryptic
and redundant if applying returns an actual error message.

Resolves T103229

Differential Revision: https://developer.blender.org/D16782
2022-12-19 12:04:21 -06:00
f9b621a9d9 Nodes: Allow skipping node attachment after dragging
This patch allows skipping the automatic insertion of nodes on top of
links when the transform operator ends. When putting nodes into small
spaces this often gets in the way and wastes time. Now, when holding
`alt`, this is turned off.

The header text is also improved to add this shortcut and to remove
the Dx and Dy values and improve the formatting a bit.

Making this functionality optional might allow us to use it in more
places in the future, like for the nodes added by link-drag-search.

Differential Revision: https://developer.blender.org/D16230
2022-12-19 12:04:21 -06:00
Iliya Katueshenock
f1b16f3ceb Fix: ignore unavailable sockets linked to multi-input socket
Differential Revision: https://developer.blender.org/D16784
2022-12-19 12:04:21 -06:00
cc9d9c7724 Fix make deps harvest error on Linux, due to macOS specific folder in Vulkan 2022-12-19 12:04:21 -06:00
f1f2ff1116 Fix T103170: missing Cycles viewport light threshold update after exposure edit 2022-12-19 12:04:21 -06:00
6b26b0db21 Fix build issue with NanoVDB and HIP on Linux
This patch was already accepted upstream, so this is temporary until we update
to a new OpenVDB release that includes it.
2022-12-19 12:04:21 -06:00
8848cfdf4b Cleanup: fix warning 2022-12-19 12:04:21 -06:00
e50d567c97 Cleanup: Various improvements to modifier apply operator
Use C++ casts, decrease variable scope, use references, use const.
2022-12-19 12:04:21 -06:00
496e344015 Cleanup: Move mesh modifier apply function to editors module
The function was highly related to the apply modifier operator,
and only used once. This was too specific to be in the blenkernel,
especially in a mesh conversion file.
2022-12-19 12:04:21 -06:00
c30718ded9 cmake/win: Allow running blender_test from the VS debugger
This was missing some paths setup in the environment, ctest
normally sets this up before running the tests from the CLI
but that does not help the IDE all that much.
2022-12-19 12:04:21 -06:00
1965e31d17 Fix T101130: Scaling of NLA Strip Via S Hotkey Not Working
After switching over to using start_frame / end_frame, scaling an NLA strip didn't scale the strip, it just repeated the action.

Now within the NLA transform code, we look for TFM_TIME_EXTEND / TFM_TIME_SCALE transform mode, and handle the update to strip scale accordingly.
2022-12-19 12:04:21 -06:00
5e384860a6 Merge branch 'master' into geometry-nodes-simulation 2022-12-15 13:29:14 +01:00
468f43c7a6 Fix simulation 2022-12-14 16:42:48 -06:00
50a2c77c4e Add initial dynamic declarations 2022-12-14 16:02:43 -06:00
c82b1aa1c0 Merge branch 'master' into geometry-nodes-simulation 2022-12-14 14:41:42 -06:00
3f1027567d Add initial simulation state items array to output node 2022-12-14 14:34:20 -06:00
ab8d77359b Merge branch 'master' into geometry-nodes-simulation 2022-12-14 14:02:15 -06:00
3ef95d7f19 add theme color for simulation region 2022-12-13 11:44:14 +01:00
02a264f5ab Merge branch 'master' into geometry-nodes-simulation 2022-12-13 11:15:20 +01:00
2525c1c023 Add initial "Extend" socket. Doesn't do anything yet 2022-12-10 15:12:17 -06:00
620b190e52 Fix delta time output
Output time in seconds rather than frame units
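
For reference, converting a frame delta to seconds is a one-line computation. The sketch below assumes the effective frame rate is `fps / fps_base`, as in Blender's scene render settings; the function name is hypothetical.

```python
def delta_time_seconds(delta_frames, fps, fps_base=1.0):
    """Convert a frame delta into seconds; the effective rate is fps / fps_base."""
    return delta_frames * fps_base / fps
```

At 24 fps a one-frame step is 1/24 of a second, so a simulation that integrates with this value behaves the same regardless of the scene frame rate.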
2022-12-10 14:07:09 -06:00
657ffe9aa7 Fix: Simulation resets when playback stops 2022-12-10 14:06:32 -06:00
3fcf50d37a Make the simulation always run, remove run socket
Note: Still unstable. Simulation resets when playback stops bug.

Since the "Run" behavior can basically be implemented with the switch
node already, it's just adding unnecessary complexity to the interface
now, when its use case isn't clear. We decided to remove it for now,
and only consider it later as a possible convenience feature, rather
than an essential part of the design.

Also, the simulation nodes are now considered "side effect nodes"
for the evaluator, meaning they are *always* evaluated, even if they
aren't needed because of switch nodes, etc. This was the best way
we thought of to make simulations run consistently even through
situations like that.
2022-12-10 00:01:20 -06:00
49e8218edf Remove Started and Ended booleans
These aren't theoretically necessary, since you can just create them
as regular outputs. Maybe we will eventually add them back for
convenience, but that's not clear.
2022-12-09 18:31:07 -06:00
94e6f87ebc Remove "Persistent Cache" option from the UI
I will keep it internally, but for the simulation MVP we want to focus
on the most basic "last frame's cache" features at first.
2022-12-09 17:12:12 -06:00
3dcb437d5c Remove elapsed time sockets
Since this is theoretically redundant with simulating a float
with the delta time value, we decided to remove it for now to
make the whole interface simpler.
2022-12-09 17:11:05 -06:00
ffe0db184a Merge branch 'master' into geometry-nodes-simulation 2022-12-09 17:05:29 -06:00
8648cf4717 Merge branch 'master' into geometry-nodes-simulation 2022-12-09 15:23:58 -06:00
c9958c8e9f Basics of temporary and persistent cache working again
Edge cases not really tested still though
2022-12-08 00:06:35 -06:00
24e2d08b49 Merge branch 'master' into geometry-nodes-simulation 2022-12-07 16:29:07 -06:00
bb732c240d Half finished refactoring to simulation cache
- Cache is accessed with a string identifier, allowing multiple sockets in the future
- Cache is meant to be stored in a simple array, not sparsely like before
- Persistent cache and temporary "last run" cache are separated more clearly
- Use a "Time Point" class instead of integers, to maybe clarify adding subframe support in the future
- Use a different "sim" namespace (not sure if that will last)
- The value from the last frame is moved, to avoid a copy when no persistent cache is used

I don't think this works now, at least I haven't tested it.
2022-12-04 15:06:28 -06:00
850aa3d26a Allow multiple caches in the same node group
The caches now hash the identifier of the output node as well.
2022-12-02 16:13:18 -06:00
1ba264d5f0 Be more forgiving when simulation nodes lost their storage 2022-12-02 16:06:03 -06:00
92a1234830 Add hint to add simulation output node first
Until we can add both nodes at the same time, or we find
an improved way to link the two nodes
2022-12-02 15:49:11 -06:00
afe5d0b9f2 Merge branch 'master' into geometry-nodes-simulation 2022-12-02 14:48:03 -06:00
cff291d1f3 Merge branch 'master' into geometry-nodes-simulation 2022-12-02 14:32:09 -06:00
bbcdca1378 Merge branch 'master' into geometry-nodes-simulation 2022-12-02 11:24:32 -06:00
a0caa03942 Merge branch 'master' into geometry-nodes-simulation 2022-12-02 09:17:54 -06:00
7469e19446 Add basic UI support for multiple simulations in a group
Just allows multiple simulation "frames"/regions/contexts to be drawn
in the editor, doesn't include any changes to caching yet.
2022-12-01 19:49:05 -06:00
0019d6cc8f Fix property name for node UI 2022-12-01 18:11:11 -06:00
e9c3e4f14e Merge branch 'master' into geometry-nodes-simulation 2022-12-01 17:57:14 -06:00
bdd71c129c Set "Run" input default to true 2022-12-01 16:34:50 -06:00
18b2ec1963 Rename "Use Cache" to "Persistent Cache"
This is a temporary option anyway, the caches will be controlled in a
more unified place at the object or scene level. But for now the name
can be a bit better anyway.
2022-12-01 16:32:01 -06:00
97df619be7 Move "Run" input to simulation output node
It's the output node that decides whether to request the values
from the nodes inside the simulation, so it makes more sense
for it to be there. This is part of a general effort to have less
redundancy in the options.
2022-12-01 16:30:46 -06:00
3059f1743e Merge branch 'master' into geometry-nodes-simulation 2022-12-01 15:45:46 -06:00
4726803e85 Fixed typo in Simulation Input node
"Delta Time" was set instead of "Elapsed Time" output.
2022-11-30 23:35:48 +01:00
fa277178e8 Merge branch 'master' into geometry-nodes-simulation 2022-11-26 13:51:16 +01:00
7cf192956b Move simulation input/output node a bit more into the convex frame 2022-11-25 12:04:04 +01:00
f7cf6e957d Merge branch 'master' into geometry-nodes-simulation 2022-11-25 11:20:53 +01:00
b5ea0d2f41 Add frame around nodes in simulation 2022-11-24 16:35:40 +01:00
c2a632cd41 Basic working simulation and in-memory caching 2022-11-23 17:19:25 -06:00
7ad2b93ec4 More semi-working sockets, laziness, more TODOs, RNA 2022-11-23 16:02:39 -06:00
6d930d0b4a Merge branch 'master' into geometry-nodes-simulation 2022-11-23 12:04:46 -06:00
f55f2b5ff4 Add some input sockets 2022-11-22 21:57:06 -06:00
5aaa435ac7 Simulation output and input nodes 2022-11-22 18:23:26 -06:00
1123 changed files with 29376 additions and 17810 deletions

@@ -521,7 +521,8 @@ endif()
if(NOT APPLE)
option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" ON)
option(WITH_CYCLES_HIP_BINARIES "Build Cycles AMD HIP binaries" OFF)
set(CYCLES_HIP_BINARIES_ARCH gfx900 gfx906 gfx90c gfx902 gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1034 gfx1035 gfx1100 gfx1101 gfx1102 CACHE STRING "AMD HIP architectures to build binaries for")
# Radeon VII (gfx906) not currently working with HIP SDK, so left out of the list.
set(CYCLES_HIP_BINARIES_ARCH gfx900 gfx90c gfx902 gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1034 gfx1035 gfx1100 gfx1101 gfx1102 CACHE STRING "AMD HIP architectures to build binaries for")
mark_as_advanced(WITH_CYCLES_DEVICE_HIP)
mark_as_advanced(CYCLES_HIP_BINARIES_ARCH)
endif()
@@ -1580,6 +1581,8 @@ elseif(CMAKE_C_COMPILER_ID MATCHES "Clang")
add_check_c_compiler_flag(C_REMOVE_STRICT_FLAGS C_WARN_NO_MISSING_NORETURN -Wno-missing-noreturn)
add_check_c_compiler_flag(C_REMOVE_STRICT_FLAGS C_WARN_NO_UNUSED_BUT_SET_VARIABLE -Wno-unused-but-set-variable)
add_check_c_compiler_flag(C_REMOVE_STRICT_FLAGS C_WARN_NO_DEPRECATED_DECLARATIONS -Wno-deprecated-declarations)
add_check_c_compiler_flag(C_REMOVE_STRICT_FLAGS C_WARN_NO_STRICT_PROTOTYPES -Wno-strict-prototypes)
add_check_c_compiler_flag(C_REMOVE_STRICT_FLAGS C_WARN_NO_BITWISE_INSTEAD_OF_LOGICAL -Wno-bitwise-instead-of-logical)
add_check_cxx_compiler_flag(CXX_REMOVE_STRICT_FLAGS CXX_WARN_NO_UNUSED_PARAMETER -Wno-unused-parameter)
add_check_cxx_compiler_flag(CXX_REMOVE_STRICT_FLAGS CXX_WARN_NO_UNUSED_PRIVATE_FIELD -Wno-unused-private-field)
@@ -1593,6 +1596,7 @@ elseif(CMAKE_C_COMPILER_ID MATCHES "Clang")
add_check_cxx_compiler_flag(CXX_REMOVE_STRICT_FLAGS CXX_WARN_NO_UNDEFINED_VAR_TEMPLATE -Wno-undefined-var-template)
add_check_cxx_compiler_flag(CXX_REMOVE_STRICT_FLAGS CXX_WARN_NO_INSTANTIATION_AFTER_SPECIALIZATION -Wno-instantiation-after-specialization)
add_check_cxx_compiler_flag(CXX_REMOVE_STRICT_FLAGS CXX_WARN_NO_MISLEADING_INDENTATION -Wno-misleading-indentation)
add_check_cxx_compiler_flag(CXX_REMOVE_STRICT_FLAGS CXX_WARN_NO_BITWISE_INSTEAD_OF_LOGICAL -Wno-bitwise-instead-of-logical)
elseif(CMAKE_C_COMPILER_ID MATCHES "Intel")

@@ -58,9 +58,6 @@ Static Source Code Checking
* check_cppcheck: Run blender source through cppcheck (C & C++).
* check_clang_array: Run blender source through clang array checking script (C & C++).
* check_deprecated: Check if there is any deprecated code to remove.
* check_splint: Run blenders source through splint (C only).
* check_sparse: Run blenders source through sparse (C only).
* check_smatch: Run blenders source through smatch (C only).
* check_descriptions: Check for duplicate/invalid descriptions.
* check_licenses: Check license headers follow the SPDX license specification,
using one of the accepted licenses in 'doc/license/SPDX-license-identifiers.txt'
@@ -474,21 +471,6 @@ check_clang_array: .FORCE
@cd "$(BUILD_DIR)" ; \
$(PYTHON) "$(BLENDER_DIR)/build_files/cmake/cmake_static_check_clang_array.py"
check_splint: .FORCE
@$(CMAKE_CONFIG)
@cd "$(BUILD_DIR)" ; \
$(PYTHON) "$(BLENDER_DIR)/build_files/cmake/cmake_static_check_splint.py"
check_sparse: .FORCE
@$(CMAKE_CONFIG)
@cd "$(BUILD_DIR)" ; \
$(PYTHON) "$(BLENDER_DIR)/build_files/cmake/cmake_static_check_sparse.py"
check_smatch: .FORCE
@$(CMAKE_CONFIG)
@cd "$(BUILD_DIR)" ; \
$(PYTHON) "$(BLENDER_DIR)/build_files/cmake/cmake_static_check_smatch.py"
check_mypy: .FORCE
@$(PYTHON) "$(BLENDER_DIR)/tools/check_source/check_mypy.py"

@@ -90,28 +90,26 @@ include(cmake/haru.cmake)
# Boost needs to be included after `python.cmake` due to the PYTHON_BINARY variable being needed.
include(cmake/boost.cmake)
include(cmake/pugixml.cmake)
include(cmake/ispc.cmake)
include(cmake/openimagedenoise.cmake)
include(cmake/embree.cmake)
include(cmake/openpgl.cmake)
include(cmake/fmt.cmake)
include(cmake/robinmap.cmake)
include(cmake/xml2.cmake)
include(cmake/fribidi.cmake)
include(cmake/harfbuzz.cmake)
if(NOT APPLE)
include(cmake/xr_openxr.cmake)
if(NOT WIN32 OR BUILD_MODE STREQUAL Release)
include(cmake/dpcpp.cmake)
include(cmake/dpcpp_deps.cmake)
endif()
include(cmake/dpcpp.cmake)
include(cmake/dpcpp_deps.cmake)
if(NOT WIN32)
include(cmake/igc.cmake)
include(cmake/gmmlib.cmake)
include(cmake/ocloc.cmake)
endif()
endif()
include(cmake/ispc.cmake)
include(cmake/openimagedenoise.cmake)
# Embree needs to be included after dpcpp as it uses it for compiling with GPU support
include(cmake/embree.cmake)
include(cmake/openpgl.cmake)
include(cmake/fmt.cmake)
include(cmake/robinmap.cmake)
include(cmake/xml2.cmake)
# OpenColorIO and dependencies.
include(cmake/expat.cmake)

@@ -156,6 +156,7 @@ download_source(OPENCLHEADERS)
download_source(ICDLOADER)
download_source(MP11)
download_source(SPIRV_HEADERS)
download_source(UNIFIED_RUNTIME)
download_source(IGC)
download_source(IGC_LLVM)
download_source(IGC_OPENCL_CLANG)

@@ -5,6 +5,9 @@
# for now.
string(REPLACE "-DCMAKE_CXX_STANDARD=17" " " DPCPP_CMAKE_FLAGS "${DEFAULT_CMAKE_FLAGS}")
# DPCPP already generates debug libs, there isn't much point in compiling it in debug mode itself.
string(REPLACE "-DCMAKE_BUILD_TYPE=Debug" "-DCMAKE_BUILD_TYPE=Release" DPCPP_CMAKE_FLAGS "${DPCPP_CMAKE_FLAGS}")
if(WIN32)
set(LLVM_GENERATOR "Ninja")
else()
@@ -38,17 +41,18 @@ set(DPCPP_EXTRA_ARGS
-DLEVEL_ZERO_LIBRARY=${LIBDIR}/level-zero/lib/${LIBPREFIX}ze_loader${SHAREDLIBEXT}
-DLEVEL_ZERO_INCLUDE_DIR=${LIBDIR}/level-zero/include
-DLLVM_EXTERNAL_SPIRV_HEADERS_SOURCE_DIR=${BUILD_DIR}/spirvheaders/src/external_spirvheaders/
-DUNIFIED_RUNTIME_SOURCE_DIR=${BUILD_DIR}/unifiedruntime/src/external_unifiedruntime/
# Below here is copied from an invocation of buildbot/config.py
-DLLVM_ENABLE_ASSERTIONS=ON
-DLLVM_TARGETS_TO_BUILD=X86
-DLLVM_EXTERNAL_PROJECTS=sycl^^llvm-spirv^^opencl^^libdevice^^xpti^^xptifw
-DLLVM_EXTERNAL_PROJECTS=sycl^^llvm-spirv^^opencl^^libdevice^^xpti^^xptifw^^lld
-DLLVM_EXTERNAL_SYCL_SOURCE_DIR=${DPCPP_SOURCE_ROOT}/sycl
-DLLVM_EXTERNAL_LLVM_SPIRV_SOURCE_DIR=${DPCPP_SOURCE_ROOT}/llvm-spirv
-DLLVM_EXTERNAL_XPTI_SOURCE_DIR=${DPCPP_SOURCE_ROOT}/xpti
-DXPTI_SOURCE_DIR=${DPCPP_SOURCE_ROOT}/xpti
-DLLVM_EXTERNAL_XPTIFW_SOURCE_DIR=${DPCPP_SOURCE_ROOT}/xptifw
-DLLVM_EXTERNAL_LIBDEVICE_SOURCE_DIR=${DPCPP_SOURCE_ROOT}/libdevice
-DLLVM_ENABLE_PROJECTS=clang^^sycl^^llvm-spirv^^opencl^^libdevice^^xpti^^xptifw
-DLLVM_ENABLE_PROJECTS=clang^^sycl^^llvm-spirv^^opencl^^libdevice^^xpti^^xptifw^^lld
-DLIBCLC_TARGETS_TO_BUILD=
-DLIBCLC_GENERATE_REMANGLED_VARIANTS=OFF
-DSYCL_BUILD_PI_HIP_PLATFORM=AMD
@@ -104,13 +108,19 @@ add_dependencies(
external_mp11
external_level-zero
external_spirvheaders
external_unifiedruntime
)
if(BUILD_MODE STREQUAL Release AND WIN32)
ExternalProject_Add_Step(external_dpcpp after_install
COMMAND ${CMAKE_COMMAND} -E rm -f ${LIBDIR}/dpcpp/bin/clang-cl.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${LIBDIR}/dpcpp/bin/clang-cpp.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${LIBDIR}/dpcpp/bin/clang.exe
COMMAND ${CMAKE_COMMAND} -E copy_directory ${LIBDIR}/dpcpp ${HARVEST_TARGET}/dpcpp
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/clang-cl.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/clang-cpp.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/clang.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/ld.lld.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/ld64.lld.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/lld.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/lld-link.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/wasm-ld.exe
)
endif()

@@ -59,3 +59,13 @@ ExternalProject_Add(external_spirvheaders
BUILD_COMMAND echo .
INSTALL_COMMAND echo .
)
ExternalProject_Add(external_unifiedruntime
URL file://${PACKAGE_DIR}/${UNIFIED_RUNTIME_FILE}
URL_HASH ${UNIFIED_RUNTIME_HASH_TYPE}=${UNIFIED_RUNTIME_HASH}
DOWNLOAD_DIR ${DOWNLOAD_DIR}
PREFIX ${BUILD_DIR}/unifiedruntime
CONFIGURE_COMMAND echo .
BUILD_COMMAND echo .
INSTALL_COMMAND echo .
)

@@ -3,6 +3,8 @@
# Note the utility apps may use png/tiff/gif system libraries, but the
# library itself does not depend on them, so should give no problems.
set(EMBREE_CMAKE_FLAGS ${DEFAULT_CMAKE_FLAGS})
set(EMBREE_EXTRA_ARGS
-DEMBREE_ISPC_SUPPORT=OFF
-DEMBREE_TUTORIALS=OFF
@@ -31,6 +33,43 @@ if(NOT BLENDER_PLATFORM_ARM)
)
endif()
if(NOT APPLE)
if(WIN32)
# Levels below -O2 don't work well for Embree+SYCL.
string(REGEX REPLACE "-O[A-Za-z0-9]" "" EMBREE_CLANG_CMAKE_CXX_FLAGS_DEBUG ${BLENDER_CLANG_CMAKE_C_FLAGS_DEBUG})
string(APPEND EMBREE_CLANG_CMAKE_CXX_FLAGS_DEBUG " -O2")
string(REGEX REPLACE "-O[A-Za-z0-9]" "" EMBREE_CLANG_CMAKE_C_FLAGS_DEBUG ${BLENDER_CLANG_CMAKE_C_FLAGS_DEBUG})
string(APPEND EMBREE_CLANG_CMAKE_C_FLAGS_DEBUG " -O2")
set(EMBREE_CMAKE_FLAGS
-DCMAKE_BUILD_TYPE=${BUILD_MODE}
-DCMAKE_CXX_FLAGS_RELEASE=${BLENDER_CLANG_CMAKE_CXX_FLAGS_RELEASE}
-DCMAKE_CXX_FLAGS_MINSIZEREL=${BLENDER_CLANG_CMAKE_CXX_FLAGS_MINSIZEREL}
-DCMAKE_CXX_FLAGS_RELWITHDEBINFO=${BLENDER_CLANG_CMAKE_CXX_FLAGS_RELWITHDEBINFO}
-DCMAKE_CXX_FLAGS_DEBUG=${EMBREE_CLANG_CMAKE_CXX_FLAGS_DEBUG}
-DCMAKE_C_FLAGS_RELEASE=${BLENDER_CLANG_CMAKE_C_FLAGS_RELEASE}
-DCMAKE_C_FLAGS_MINSIZEREL=${BLENDER_CLANG_CMAKE_C_FLAGS_MINSIZEREL}
-DCMAKE_C_FLAGS_RELWITHDEBINFO=${BLENDER_CLANG_CMAKE_C_FLAGS_RELWITHDEBINFO}
-DCMAKE_C_FLAGS_DEBUG=${EMBREE_CLANG_CMAKE_C_FLAGS_DEBUG}
-DCMAKE_CXX_STANDARD=17
)
set(EMBREE_EXTRA_ARGS
-DCMAKE_CXX_COMPILER=${LIBDIR}/dpcpp/bin/clang++.exe
-DCMAKE_C_COMPILER=${LIBDIR}/dpcpp/bin/clang.exe
-DCMAKE_SHARED_LINKER_FLAGS=-L"${LIBDIR}/dpcpp/lib"
-DEMBREE_SYCL_SUPPORT=ON
${EMBREE_EXTRA_ARGS}
)
else()
set(EMBREE_EXTRA_ARGS
-DCMAKE_CXX_COMPILER=${LIBDIR}/dpcpp/bin/clang++
-DCMAKE_C_COMPILER=${LIBDIR}/dpcpp/bin/clang
-DCMAKE_SHARED_LINKER_FLAGS=-L"${LIBDIR}/dpcpp/lib"
-DEMBREE_SYCL_SUPPORT=ON
${EMBREE_EXTRA_ARGS}
)
endif()
endif()
if(TBB_STATIC_LIBRARY)
set(EMBREE_EXTRA_ARGS
${EMBREE_EXTRA_ARGS}
@@ -42,16 +81,25 @@ ExternalProject_Add(external_embree
URL file://${PACKAGE_DIR}/${EMBREE_FILE}
DOWNLOAD_DIR ${DOWNLOAD_DIR}
URL_HASH ${EMBREE_HASH_TYPE}=${EMBREE_HASH}
CMAKE_GENERATOR ${PLATFORM_ALT_GENERATOR}
PREFIX ${BUILD_DIR}/embree
PATCH_COMMAND ${PATCH_CMD} -p 1 -d ${BUILD_DIR}/embree/src/external_embree < ${PATCH_DIR}/embree.diff
CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${LIBDIR}/embree ${DEFAULT_CMAKE_FLAGS} ${EMBREE_EXTRA_ARGS}
CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${LIBDIR}/embree ${EMBREE_CMAKE_FLAGS} ${EMBREE_EXTRA_ARGS}
INSTALL_DIR ${LIBDIR}/embree
)
add_dependencies(
external_embree
external_tbb
)
if(NOT APPLE)
add_dependencies(
external_embree
external_tbb
external_dpcpp
)
else()
add_dependencies(
external_embree
external_tbb
)
endif()
if(WIN32)
if(BUILD_MODE STREQUAL Release)
@@ -66,6 +114,7 @@ if(WIN32)
ExternalProject_Add_Step(external_embree after_install
COMMAND ${CMAKE_COMMAND} -E copy ${LIBDIR}/embree/bin/embree4_d.dll ${HARVEST_TARGET}/embree/bin/embree4_d.dll
COMMAND ${CMAKE_COMMAND} -E copy ${LIBDIR}/embree/lib/embree4_d.lib ${HARVEST_TARGET}/embree/lib/embree4_d.lib
COMMAND ${CMAKE_COMMAND} -E copy ${LIBDIR}/embree/lib/embree4_sycl_d.lib ${HARVEST_TARGET}/embree/lib/embree4_sycl_d.lib
DEPENDEES install
)
endif()
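The Windows branch of the Embree hunk above rewrites the debug flags by stripping any existing `-O<x>` token and appending `-O2`, because levels below `-O2` are reported not to work well for Embree+SYCL. A minimal Python sketch of that string transformation follows; the function name `force_optimization_level` is hypothetical and only illustrates what the two `string(REGEX REPLACE ...)` / `string(APPEND ...)` pairs do.

```python
import re


def force_optimization_level(flags: str, level: str = "-O2") -> str:
    """Drop any existing -O<x> token and append the requested level.

    Mirrors the CMake pattern "-O[A-Za-z0-9]", which matches "-O"
    followed by exactly one letter or digit.
    """
    stripped = re.sub(r"-O[A-Za-z0-9]", "", flags)
    # Collapse the whitespace left behind by the removed token.
    return " ".join(stripped.split() + [level])


if __name__ == "__main__":
    print(force_optimization_level("-D_DLL -D_MT -g -D_DEBUG"))
    print(force_optimization_level("-Os -DNDEBUG"))
```

Because the pattern only consumes one character after `-O`, a multi-character level such as `-Ofast` would not be removed cleanly; none of the flag sets in this diff use one.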

@@ -74,6 +74,27 @@ if(WIN32)
set(BLENDER_CMAKE_CXX_FLAGS_RELEASE "/MD ${COMMON_MSVC_FLAGS} /D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS /O2 /Ob2 /D NDEBUG /D PLATFORM_WINDOWS /DPSAPI_VERSION=2 /DTINYFORMAT_ALLOW_WCHAR_STRINGS")
set(BLENDER_CMAKE_CXX_FLAGS_RELWITHDEBINFO "/MD ${COMMON_MSVC_FLAGS} /D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS /Zi /O2 /Ob1 /D NDEBUG /D PLATFORM_WINDOWS /DPSAPI_VERSION=2 /DTINYFORMAT_ALLOW_WCHAR_STRINGS")
# Set similar flags for CLANG compilation.
set(COMMON_CLANG_FLAGS "-D_DLL -D_MT") # Equivalent to MSVC /MD
if(WITH_OPTIMIZED_DEBUG)
set(BLENDER_CLANG_CMAKE_C_FLAGS_DEBUG "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrtd -O2 -D_DEBUG -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
else()
set(BLENDER_CLANG_CMAKE_C_FLAGS_DEBUG "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrtd -g -D_DEBUG -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
endif()
set(BLENDER_CLANG_CMAKE_C_FLAGS_MINSIZEREL "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrt -Os -DNDEBUG -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
set(BLENDER_CLANG_CMAKE_C_FLAGS_RELEASE "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrt -O2 -DNDEBUG -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
set(BLENDER_CLANG_CMAKE_C_FLAGS_RELWITHDEBINFO "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrt -g -O2 -DNDEBUG -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
if(WITH_OPTIMIZED_DEBUG)
set(BLENDER_CLANG_CMAKE_CXX_FLAGS_DEBUG "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrtd -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -O2 -D_DEBUG -DPLATFORM_WINDOWS -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS -DBOOST_DEBUG_PYTHON -DBOOST_ALL_NO_LIB")
else()
set(BLENDER_CLANG_CMAKE_CXX_FLAGS_DEBUG "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrtd -D_DEBUG -DPLATFORM_WINDOWS -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -g -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS -DBOOST_DEBUG_PYTHON -DBOOST_ALL_NO_LIB")
endif()
set(BLENDER_CLANG_CMAKE_CXX_FLAGS_MINSIZEREL "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrt -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -O2 -DNDEBUG -DPLATFORM_WINDOWS -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
set(BLENDER_CLANG_CMAKE_CXX_FLAGS_RELEASE "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrt -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -O2 -DNDEBUG -DPLATFORM_WINDOWS -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
set(BLENDER_CLANG_CMAKE_CXX_FLAGS_RELWITHDEBINFO "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrt -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -g -O2 -DNDEBUG -DPLATFORM_WINDOWS -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
set(PLATFORM_FLAGS)
set(PLATFORM_CXX_FLAGS)
set(PLATFORM_CMAKE_FLAGS)

@@ -599,15 +599,15 @@ set(OPENPGL_HASH db63f5dac5cfa8c110ede241f0c413f00db0c4748697381c4fa23e0f9e82a75
set(OPENPGL_HASH_TYPE SHA256)
set(OPENPGL_FILE openpgl-${OPENPGL_VERSION}.tar.gz)
set(LEVEL_ZERO_VERSION v1.8.5)
set(LEVEL_ZERO_VERSION v1.8.8)
set(LEVEL_ZERO_URI https://github.com/oneapi-src/level-zero/archive/refs/tags/${LEVEL_ZERO_VERSION}.tar.gz)
set(LEVEL_ZERO_HASH b6e9663bbcc53c148d32376998298bec6f7c434ef2218c61fa708963e3a09394)
set(LEVEL_ZERO_HASH 3553ae8fa0d2d69c4210a8f3428bd6612bd8bb8a627faf52c3658a01851e66d2)
set(LEVEL_ZERO_HASH_TYPE SHA256)
set(LEVEL_ZERO_FILE level-zero-${LEVEL_ZERO_VERSION}.tar.gz)
set(DPCPP_VERSION 20221019)
set(DPCPP_URI https://github.com/intel/llvm/archive/refs/tags/sycl-nightly/${DPCPP_VERSION}.tar.gz)
set(DPCPP_HASH 2f533946e91ce3829431758ea17b0b834b960c1a796e9e4563c86e03eb9603a2)
set(DPCPP_VERSION 2022-12)
set(DPCPP_URI https://github.com/intel/llvm/archive/refs/tags/${DPCPP_VERSION}.tar.gz)
set(DPCPP_HASH 13151d5ae79f7c9c4a9b072a0c486ae7b3c4993e301bb1268c92214451025790)
set(DPCPP_HASH_TYPE SHA256)
set(DPCPP_FILE DPCPP-${DPCPP_VERSION}.tar.gz)
@@ -620,9 +620,9 @@ set(DPCPP_FILE DPCPP-${DPCPP_VERSION}.tar.gz)
# will take care of building them, unpack is being done in dpcpp_deps.cmake
# Source llvm/lib/SYCLLowerIR/CMakeLists.txt
set(VCINTRINSICS_VERSION abce9184b7a3a7fe1b02289b9285610d9dc45465)
set(VCINTRINSICS_VERSION 782fbf7301dc73acaa049a4324c976ad94f587f7)
set(VCINTRINSICS_URI https://github.com/intel/vc-intrinsics/archive/${VCINTRINSICS_VERSION}.tar.gz)
set(VCINTRINSICS_HASH 3e9fd471246b87633b26f7e15e17ab7733d357458c53d5c5881c03929d6c551f)
set(VCINTRINSICS_HASH f4c0ccad8c1f77760364c551c65e8e1cf194d058889fa46d3b1b2d19ec4dc33f)
set(VCINTRINSICS_HASH_TYPE SHA256)
set(VCINTRINSICS_FILE vc-intrinsics-${VCINTRINSICS_VERSION}.tar.gz)
@@ -657,6 +657,13 @@ set(SPIRV_HEADERS_HASH ec8ecb471a62672697846c436501638ab25447ae9d4a6761e0bfe8a9a
set(SPIRV_HEADERS_HASH_TYPE SHA256)
set(SPIRV_HEADERS_FILE SPIR-V-Headers-${SPIRV_HEADERS_VERSION}.tar.gz)
# Source llvm/sycl/plugins/unified_runtime/CMakeLists.txt
set(UNIFIED_RUNTIME_VERSION fd711c920acc4434cb52ff18b078c082d9d7f44d)
set(UNIFIED_RUNTIME_URI https://github.com/oneapi-src/unified-runtime/archive/${UNIFIED_RUNTIME_VERSION}.tar.gz)
set(UNIFIED_RUNTIME_HASH 535ca2ee78f68c5e7e62b10f1bbabd909179488885566e6d9b1fc50e8a1be65f)
set(UNIFIED_RUNTIME_HASH_TYPE SHA256)
set(UNIFIED_RUNTIME_FILE unified-runtime-${UNIFIED_RUNTIME_VERSION}.tar.gz)
######################
### DPCPP DEPS END ###
######################
@@ -730,9 +737,9 @@ set(GMMLIB_HASH c1f33e1519edfc527127baeb0436b783430dfd256c643130169a3a71dc86aff9
set(GMMLIB_HASH_TYPE SHA256)
set(GMMLIB_FILE ${GMMLIB_VERSION}.tar.gz)
set(OCLOC_VERSION 22.49.25018.21)
set(OCLOC_VERSION 23.05.25593.18)
set(OCLOC_URI https://github.com/intel/compute-runtime/archive/refs/tags/${OCLOC_VERSION}.tar.gz)
set(OCLOC_HASH 92362dae08b503a34e5d3820ed284198c452bcd5e7504d90eb69887b20492c06)
set(OCLOC_HASH 122415028e631922ae999c996954dfd98ce9a32decd564d5484c31476ec9306e)
set(OCLOC_HASH_TYPE SHA256)
set(OCLOC_FILE ocloc-${OCLOC_VERSION}.tar.gz)

@@ -14,6 +14,7 @@ graph[autosize = false, size = "25.7,8.3!", resolution = 300];
external_dpcpp -- external_mp11;
external_dpcpp -- external_level_zero;
external_dpcpp -- external_spirvheaders;
external_dpcpp -- external_unifiedruntime;
external_embree -- external_tbb;
external_ffmpeg -- external_zlib;
external_ffmpeg -- external_openjpeg;

@@ -34,3 +34,156 @@ diff -Naur llvm-sycl-nightly-20220208.orig/libdevice/cmake/modules/SYCLLibdevice
libsycldevice-obj
libsycldevice-spv)
diff --git a/sycl/source/detail/program_manager/program_manager.cpp b/sycl/source/detail/program_manager/program_manager.cpp
index 17eeaafae194..09e6d2217aaa 100644
--- a/sycl/source/detail/program_manager/program_manager.cpp
+++ b/sycl/source/detail/program_manager/program_manager.cpp
@@ -1647,46 +1647,120 @@ ProgramManager::getSYCLDeviceImagesWithCompatibleState(
}
assert(BinImages.size() > 0 && "Expected to find at least one device image");
+ // Ignore images with incompatible state. Image is considered compatible
+ // with a target state if an image is already in the target state or can
+ // be brought to target state by compiling/linking/building.
+ //
+ // Example: an image in "executable" state is not compatible with
+ // "input" target state - there is no operation to convert the image it
+ // to "input" state. An image in "input" state is compatible with
+ // "executable" target state because it can be built to get into
+ // "executable" state.
+ for (auto It = BinImages.begin(); It != BinImages.end();) {
+ if (getBinImageState(*It) > TargetState)
+ It = BinImages.erase(It);
+ else
+ ++It;
+ }
+
std::vector<device_image_plain> SYCLDeviceImages;
- for (RTDeviceBinaryImage *BinImage : BinImages) {
- const bundle_state ImgState = getBinImageState(BinImage);
-
- // Ignore images with incompatible state. Image is considered compatible
- // with a target state if an image is already in the target state or can
- // be brought to target state by compiling/linking/building.
- //
- // Example: an image in "executable" state is not compatible with
- // "input" target state - there is no operation to convert the image it
- // to "input" state. An image in "input" state is compatible with
- // "executable" target state because it can be built to get into
- // "executable" state.
- if (ImgState > TargetState)
- continue;
- for (const sycl::device &Dev : Devs) {
+ // If a non-input state is requested, we can filter out some compatible
+ // images and return only those with the highest compatible state for each
+ // device-kernel pair. This map tracks how many kernel-device pairs need each
+ // image, so that any unneeded ones are skipped.
+ // TODO this has no effect if the requested state is input, consider having
+ // a separate branch for that case to avoid unnecessary tracking work.
+ struct DeviceBinaryImageInfo {
+ std::shared_ptr<std::vector<sycl::kernel_id>> KernelIDs;
+ bundle_state State = bundle_state::input;
+ int RequirementCounter = 0;
+ };
+ std::unordered_map<RTDeviceBinaryImage *, DeviceBinaryImageInfo> ImageInfoMap;
+
+ for (const sycl::device &Dev : Devs) {
+ // Track the highest image state for each requested kernel.
+ using StateImagesPairT =
+ std::pair<bundle_state, std::vector<RTDeviceBinaryImage *>>;
+ using KernelImageMapT =
+ std::map<kernel_id, StateImagesPairT, LessByNameComp>;
+ KernelImageMapT KernelImageMap;
+ if (!KernelIDs.empty())
+ for (const kernel_id &KernelID : KernelIDs)
+ KernelImageMap.insert({KernelID, {}});
+
+ for (RTDeviceBinaryImage *BinImage : BinImages) {
if (!compatibleWithDevice(BinImage, Dev) ||
!doesDevSupportImgAspects(Dev, *BinImage))
continue;
- std::shared_ptr<std::vector<sycl::kernel_id>> KernelIDs;
- // Collect kernel names for the image
- {
- std::lock_guard<std::mutex> KernelIDsGuard(m_KernelIDsMutex);
- KernelIDs = m_BinImg2KernelIDs[BinImage];
- // If the image does not contain any non-service kernels we can skip it.
- if (!KernelIDs || KernelIDs->empty())
- continue;
+ auto InsertRes = ImageInfoMap.insert({BinImage, {}});
+ DeviceBinaryImageInfo &ImgInfo = InsertRes.first->second;
+ if (InsertRes.second) {
+ ImgInfo.State = getBinImageState(BinImage);
+ // Collect kernel names for the image
+ {
+ std::lock_guard<std::mutex> KernelIDsGuard(m_KernelIDsMutex);
+ ImgInfo.KernelIDs = m_BinImg2KernelIDs[BinImage];
+ }
}
+ const bundle_state ImgState = ImgInfo.State;
+ const std::shared_ptr<std::vector<sycl::kernel_id>> &ImageKernelIDs =
+ ImgInfo.KernelIDs;
+ int &ImgRequirementCounter = ImgInfo.RequirementCounter;
- DeviceImageImplPtr Impl = std::make_shared<detail::device_image_impl>(
- BinImage, Ctx, Devs, ImgState, KernelIDs, /*PIProgram=*/nullptr);
+ // If the image does not contain any non-service kernels we can skip it.
+ if (!ImageKernelIDs || ImageKernelIDs->empty())
+ continue;
- SYCLDeviceImages.push_back(
- createSyclObjFromImpl<device_image_plain>(Impl));
- break;
+ // Update tracked information.
+ for (kernel_id &KernelID : *ImageKernelIDs) {
+ StateImagesPairT *StateImagesPair;
+ // If only specific kernels are requested, ignore the rest.
+ if (!KernelIDs.empty()) {
+ auto It = KernelImageMap.find(KernelID);
+ if (It == KernelImageMap.end())
+ continue;
+ StateImagesPair = &It->second;
+ } else
+ StateImagesPair = &KernelImageMap[KernelID];
+
+ auto &[KernelImagesState, KernelImages] = *StateImagesPair;
+
+ if (KernelImages.empty()) {
+ KernelImagesState = ImgState;
+ KernelImages.push_back(BinImage);
+ ++ImgRequirementCounter;
+ } else if (KernelImagesState < ImgState) {
+ for (RTDeviceBinaryImage *Img : KernelImages) {
+ auto It = ImageInfoMap.find(Img);
+ assert(It != ImageInfoMap.end());
+ assert(It->second.RequirementCounter > 0);
+ --(It->second.RequirementCounter);
+ }
+ KernelImages.clear();
+ KernelImages.push_back(BinImage);
+ KernelImagesState = ImgState;
+ ++ImgRequirementCounter;
+ } else if (KernelImagesState == ImgState) {
+ KernelImages.push_back(BinImage);
+ ++ImgRequirementCounter;
+ }
+ }
}
}
+ for (const auto &ImgInfoPair : ImageInfoMap) {
+ if (ImgInfoPair.second.RequirementCounter == 0)
+ continue;
+
+ DeviceImageImplPtr Impl = std::make_shared<detail::device_image_impl>(
+ ImgInfoPair.first, Ctx, Devs, ImgInfoPair.second.State,
+ ImgInfoPair.second.KernelIDs, /*PIProgram=*/nullptr);
+
+ SYCLDeviceImages.push_back(createSyclObjFromImpl<device_image_plain>(Impl));
+ }
+
return SYCLDeviceImages;
}
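The patched `getSYCLDeviceImagesWithCompatibleState` above first discards images whose state is past the target state and then, per device and per kernel, keeps only the images with the highest remaining state, using a requirement counter to skip images no kernel needs. The following Python sketch models that selection under simplifying assumptions: the `Image` class, the integer state constants, and `select_images` are hypothetical stand-ins, image names are assumed unique, and device compatibility is reduced to a set lookup (no aspect checks or locking).

```python
from dataclasses import dataclass

# Bundle states ordered as in the patch comment: input < object < executable.
INPUT, OBJECT, EXECUTABLE = 0, 1, 2


@dataclass(frozen=True)
class Image:
    name: str             # assumed unique; used as the counter key
    state: int
    kernels: tuple        # kernel names contained in the image
    devices: frozenset    # devices the image is compatible with


def select_images(images, devices, target_state, requested_kernels=()):
    # Step 1: drop images whose state is already past the target state.
    candidates = [img for img in images if img.state <= target_state]
    # How many device-kernel pairs currently need each image.
    required = {img.name: 0 for img in candidates}

    for dev in devices:
        best = {}  # kernel -> (highest state seen, images in that state)
        for img in candidates:
            if dev not in img.devices or not img.kernels:
                continue
            for kernel in img.kernels:
                # If only specific kernels are requested, ignore the rest.
                if requested_kernels and kernel not in requested_kernels:
                    continue
                state, holders = best.get(kernel, (None, []))
                if state is None or state < img.state:
                    # A higher state supersedes the images tracked so far.
                    for old in holders:
                        required[old.name] -= 1
                    best[kernel] = (img.state, [img])
                    required[img.name] += 1
                elif state == img.state:
                    holders.append(img)
                    required[img.name] += 1
    # Step 2: emit only the images some device-kernel pair still needs.
    return [img for img in candidates if required[img.name] > 0]
```

For example, an input-state image providing kernels `k1` and `k2` stays selected next to an executable-state image providing only `k1`, because `k2` still depends on it; if only `k1` is requested, the input-state image is dropped.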

@@ -149,3 +149,19 @@ index 074f910a2..30f490818 100644
return is_hit_first | is_hit_second;
}
};
diff -ruN a/kernels/sycl/rthwif_embree_builder.cpp b/kernels/sycl/rthwif_embree_builder.cpp
--- a/kernels/sycl/rthwif_embree_builder.cpp 2023-03-28 17:23:06.429190200 +0200
+++ b/kernels/sycl/rthwif_embree_builder.cpp 2023-03-28 17:35:01.291938600 +0200
@@ -540,7 +540,12 @@
assert(offset <= geomDescrData.size());
}
+ /* Force running BVH building sequentially from the calling thread if using TBB < 2021, as it otherwise leads to runtime issues. */
+#if TBB_VERSION_MAJOR<2021
+ RTHWIF_PARALLEL_OPERATION parallelOperation = nullptr;
+#else
RTHWIF_PARALLEL_OPERATION parallelOperation = rthwifNewParallelOperation();
+#endif
/* estimate static accel size */
BBox1f time_range(0,1);

@@ -37,18 +37,24 @@ elseif(HIP_HIPCC_EXECUTABLE)
set(HIP_VERSION_MINOR 0)
set(HIP_VERSION_PATCH 0)
if(WIN32)
set(_hipcc_executable ${HIP_HIPCC_EXECUTABLE}.bat)
else()
set(_hipcc_executable ${HIP_HIPCC_EXECUTABLE})
endif()
# Get version from the output.
execute_process(COMMAND ${HIP_HIPCC_EXECUTABLE} --version
OUTPUT_VARIABLE HIP_VERSION_RAW
execute_process(COMMAND ${_hipcc_executable} --version
OUTPUT_VARIABLE _hip_version_raw
ERROR_QUIET
OUTPUT_STRIP_TRAILING_WHITESPACE)
# Parse parts.
if(HIP_VERSION_RAW MATCHES "HIP version: .*")
if(_hip_version_raw MATCHES "HIP version: .*")
# Strip the HIP prefix and get list of individual version components.
string(REGEX REPLACE
".*HIP version: ([.0-9]+).*" "\\1"
HIP_SEMANTIC_VERSION "${HIP_VERSION_RAW}")
HIP_SEMANTIC_VERSION "${_hip_version_raw}")
string(REPLACE "." ";" HIP_VERSION_PARTS "${HIP_SEMANTIC_VERSION}")
list(LENGTH HIP_VERSION_PARTS NUM_HIP_VERSION_PARTS)
@@ -71,7 +77,13 @@ elseif(HIP_HIPCC_EXECUTABLE)
# Construct full semantic version.
set(HIP_VERSION "${HIP_VERSION_MAJOR}.${HIP_VERSION_MINOR}.${HIP_VERSION_PATCH}")
unset(HIP_VERSION_RAW)
unset(_hip_version_raw)
unset(_hipcc_executable)
else()
set(HIP_FOUND FALSE)
endif()
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(HIP
REQUIRED_VARS HIP_HIPCC_EXECUTABLE
VERSION_VAR HIP_VERSION)
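The FindHIP hunk above runs `hipcc --version` (through the `.bat` wrapper on Windows), extracts the text after `HIP version:` with the regex `([.0-9]+)`, and splits it on dots into major, minor, and patch. A small Python sketch of the same parsing is shown here as an illustration; `parse_hip_version` is a hypothetical helper, and padding missing components with `0` is an assumption based on the zero defaults set at the top of the hunk.

```python
import re


def parse_hip_version(raw: str):
    """Return "major.minor.patch" from hipcc --version output, or None."""
    match = re.search(r"HIP version: ([.0-9]+)", raw)
    if match is None:
        return None
    parts = [p for p in match.group(1).split(".") if p]
    # Assumption: components that are absent default to 0.
    parts += ["0"] * (3 - len(parts))
    major, minor, patch = parts[:3]
    return f"{major}.{minor}.{patch}"


if __name__ == "__main__":
    sample = "HIP version: 5.4.22803-474e8620\nclang version 15.0.0"
    print(parse_hip_version(sample))
```

The character class stops at the first character that is neither a digit nor a dot, so a build suffix following the patch number is not included in the result.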

@@ -108,7 +108,11 @@ FIND_PACKAGE_HANDLE_STANDARD_ARGS(SYCL
IF(SYCL_FOUND)
SET(SYCL_INCLUDE_DIR ${SYCL_INCLUDE_DIR} ${SYCL_INCLUDE_DIR}/sycl)
SET(SYCL_LIBRARIES ${SYCL_LIBRARY})
IF(WIN32 AND SYCL_LIBRARY_DEBUG)
SET(SYCL_LIBRARIES optimized ${SYCL_LIBRARY} debug ${SYCL_LIBRARY_DEBUG})
ELSE()
SET(SYCL_LIBRARIES ${SYCL_LIBRARY})
ENDIF()
ELSE()
SET(SYCL_FOUND FALSE)
ENDIF()

@@ -1,58 +0,0 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: GPL-2.0-or-later
CHECKER_IGNORE_PREFIX = [
"extern",
"intern/moto",
]
CHECKER_BIN = "smatch"
CHECKER_ARGS = [
"--full-path",
"--two-passes",
]
import project_source_info
import subprocess
import sys
import os
USE_QUIET = (os.environ.get("QUIET", None) is not None)
def main():
source_info = project_source_info.build_info(use_cxx=False, ignore_prefix_list=CHECKER_IGNORE_PREFIX)
source_defines = project_source_info.build_defines_as_args()
check_commands = []
for c, inc_dirs, defs in source_info:
cmd = ([CHECKER_BIN] +
CHECKER_ARGS +
[c] +
[("-I%s" % i) for i in inc_dirs] +
[("-D%s" % d) for d in defs] +
source_defines
)
check_commands.append((c, cmd))
def my_process(i, c, cmd):
if not USE_QUIET:
percent = 100.0 * (i / len(check_commands))
percent_str = "[" + ("%.2f]" % percent).rjust(7) + " %:"
sys.stdout.flush()
sys.stdout.write("%s %s\n" % (percent_str, c))
return subprocess.Popen(cmd)
process_functions = []
for i, (c, cmd) in enumerate(check_commands):
process_functions.append((my_process, (i, c, cmd)))
project_source_info.queue_processes(process_functions)
if __name__ == "__main__":
main()

@@ -1,56 +0,0 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: GPL-2.0-or-later
CHECKER_IGNORE_PREFIX = [
"extern",
"intern/moto",
]
CHECKER_BIN = "sparse"
CHECKER_ARGS = [
]
import project_source_info
import subprocess
import sys
import os
USE_QUIET = (os.environ.get("QUIET", None) is not None)
def main():
source_info = project_source_info.build_info(use_cxx=False, ignore_prefix_list=CHECKER_IGNORE_PREFIX)
source_defines = project_source_info.build_defines_as_args()
check_commands = []
for c, inc_dirs, defs in source_info:
cmd = ([CHECKER_BIN] +
CHECKER_ARGS +
[c] +
[("-I%s" % i) for i in inc_dirs] +
[("-D%s" % d) for d in defs] +
source_defines
)
check_commands.append((c, cmd))
def my_process(i, c, cmd):
if not USE_QUIET:
percent = 100.0 * (i / len(check_commands))
percent_str = "[" + ("%.2f]" % percent).rjust(7) + " %:"
sys.stdout.flush()
sys.stdout.write("%s %s\n" % (percent_str, c))
return subprocess.Popen(cmd)
process_functions = []
for i, (c, cmd) in enumerate(check_commands):
process_functions.append((my_process, (i, c, cmd)))
project_source_info.queue_processes(process_functions)
if __name__ == "__main__":
main()

@@ -1,86 +0,0 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: GPL-2.0-or-later
CHECKER_IGNORE_PREFIX = [
"extern",
"intern/moto",
]
CHECKER_BIN = "splint"
CHECKER_ARGS = [
"-weak",
"-posix-lib",
"-linelen", "10000",
"+ignorequals",
"+relaxtypes",
"-retvalother",
"+matchanyintegral",
"+longintegral",
"+ignoresigns",
"-nestcomment",
"-predboolothers",
"-ifempty",
"-unrecogcomments",
# we may want to remove these later
"-type",
"-fixedformalarray",
"-fullinitblock",
"-fcnuse",
"-initallelements",
"-castfcnptr",
# -forcehints,
"-bufferoverflowhigh", # warns a lot about sprintf()
# re-definitions, rna causes most of these
"-redef",
"-syntax",
# dummy, without this splint complains with:
# /usr/include/bits/confname.h:31:27: *** Internal Bug at cscannerHelp.c:2428: Unexpanded macro not function or constant: int _PC_MAX_CANON
"-D_PC_MAX_CANON=0",
]
import project_source_info
import subprocess
import sys
import os
USE_QUIET = (os.environ.get("QUIET", None) is not None)
def main():
source_info = project_source_info.build_info(use_cxx=False, ignore_prefix_list=CHECKER_IGNORE_PREFIX)
check_commands = []
for c, inc_dirs, defs in source_info:
cmd = ([CHECKER_BIN] +
CHECKER_ARGS +
[c] +
[("-I%s" % i) for i in inc_dirs] +
[("-D%s" % d) for d in defs]
)
check_commands.append((c, cmd))
def my_process(i, c, cmd):
if not USE_QUIET:
percent = 100.0 * (i / len(check_commands))
percent_str = "[" + ("%.2f]" % percent).rjust(7) + " %:"
sys.stdout.write("%s %s\n" % (percent_str, c))
sys.stdout.flush()
return subprocess.Popen(cmd)
process_functions = []
for i, (c, cmd) in enumerate(check_commands):
process_functions.append((my_process, (i, c, cmd)))
project_source_info.queue_processes(process_functions)
if __name__ == "__main__":
main()

@@ -82,7 +82,7 @@ if(NOT APPLE)
set(WITH_CYCLES_DEVICE_OPTIX ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_CUDA_BINARIES ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_CUBIN_COMPILER OFF CACHE BOOL "" FORCE)
set(WITH_CYCLES_HIP_BINARIES OFF CACHE BOOL "" FORCE)
set(WITH_CYCLES_HIP_BINARIES ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_DEVICE_ONEAPI ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_ONEAPI_BINARIES ON CACHE BOOL "" FORCE)
endif()

@@ -1384,4 +1384,3 @@ macro(windows_process_platform_bundled_libraries library_deps)
endforeach()
endif()
endmacro()

@@ -174,7 +174,7 @@ if(SYSTEMSTUBS_LIBRARY)
list(APPEND PLATFORM_LINKLIBS SystemStubs)
endif()
string(APPEND PLATFORM_CFLAGS " -pipe -funsigned-char -fno-strict-aliasing")
string(APPEND PLATFORM_CFLAGS " -pipe -funsigned-char -fno-strict-aliasing -ffp-contract=off")
set(PLATFORM_LINKFLAGS
"-fexceptions -framework CoreServices -framework Foundation -framework IOKit -framework AppKit -framework Cocoa -framework Carbon -framework AudioUnit -framework AudioToolbox -framework CoreAudio -framework Metal -framework QuartzCore"
)

@@ -803,8 +803,7 @@ if(CMAKE_COMPILER_IS_GNUCC)
# Automatically turned on when building with "-march=native". This is
# explicitly turned off here as it will make floating point math give a bit
# different results. This will lead to automated test failures. So disable
# this until we support it. Seems to default to off in clang and the intel
# compiler.
# this until we support it.
set(PLATFORM_CFLAGS "-pipe -fPIC -funsigned-char -fno-strict-aliasing -ffp-contract=off")
# `maybe-uninitialized` is unreliable in release builds, but fine in debug builds.
@@ -815,64 +814,49 @@ if(CMAKE_COMPILER_IS_GNUCC)
string(PREPEND CMAKE_CXX_FLAGS_RELWITHDEBINFO "${GCC_EXTRA_FLAGS_RELEASE} ")
unset(GCC_EXTRA_FLAGS_RELEASE)
# NOTE(@campbellbarton): Eventually mold will be able to use `-fuse-ld=mold`,
# however at the moment this only works for GCC 12.1+ (unreleased at time of writing).
# So a workaround is used here "-B" which points to another path to find system commands
# such as `ld`.
if(WITH_LINKER_MOLD AND _IS_LINKER_DEFAULT)
find_program(MOLD_BIN "mold")
mark_as_advanced(MOLD_BIN)
if(NOT MOLD_BIN)
message(STATUS "The \"mold\" binary could not be found, using system linker.")
set(WITH_LINKER_MOLD OFF)
elseif(CMAKE_C_COMPILER_VERSION VERSION_LESS 12.1)
message(STATUS "GCC 12.1 or newer is required for the MOLD linker.")
set(WITH_LINKER_MOLD OFF)
else()
# By default mold installs the binary to:
# - `{PREFIX}/bin/mold` as well as a symbolic-link in...
# - `{PREFIX}/lib/mold/ld`.
# (where `PREFIX` is typically `/usr/`).
#
# This block of code finds `{PREFIX}/lib/mold` from the `mold` binary.
# Other methods of searching for the path could also be made to work,
# we could even make our own directory and symbolic-link, however it's more
# convenient to use the one provided by mold.
#
# Use the binary path to "mold", to find the common prefix which contains "lib/mold".
# The parent directory: e.g. `/usr/bin/mold` -> `/usr/bin/`.
get_filename_component(MOLD_PREFIX "${MOLD_BIN}" DIRECTORY)
# The common prefix path: e.g. `/usr/bin/` -> `/usr/` to use as a hint.
get_filename_component(MOLD_PREFIX "${MOLD_PREFIX}" DIRECTORY)
# Find `{PREFIX}/lib/mold/ld`, store the directory component (without the `ld`).
# Then pass `-B {PREFIX}/lib/mold` to GCC so the `ld` located there overrides the default.
find_path(
MOLD_BIN_DIR "ld"
HINTS "${MOLD_PREFIX}"
# The default path is `libexec`, Arch Linux for e.g.
# replaces this with `lib` so check both.
PATH_SUFFIXES "libexec/mold" "lib/mold" "lib64/mold"
NO_DEFAULT_PATH
NO_CACHE
get_filename_component(MOLD_BIN_DIR "${MOLD_BIN}" DIRECTORY)
# Check if the `-B` argument is required.
# This will happen when `MOLD_BIN` points to a non-standard location.
# Keep this option as mold is not yet a standard system component and
# users may have it installed in some unexpected place.
set(_mold_args "-fuse-ld=mold")
execute_process(
COMMAND ${CMAKE_C_COMPILER} -B ${MOLD_BIN_DIR} ${_mold_args} -Wl,--version
ERROR_QUIET OUTPUT_VARIABLE LD_VERSION_WITH_DIR
)
if(NOT MOLD_BIN_DIR)
message(STATUS
"The mold linker could not find the directory containing the linker command "
"(typically "
"\"${MOLD_PREFIX}/libexec/mold/ld\") or "
"\"${MOLD_PREFIX}/lib/mold/ld\") using system linker."
)
set(WITH_LINKER_MOLD OFF)
execute_process(
COMMAND ${CMAKE_C_COMPILER} ${_mold_args} -Wl,--version
ERROR_QUIET OUTPUT_VARIABLE LD_VERSION
)
if(NOT (LD_VERSION STREQUAL LD_VERSION_WITH_DIR))
string(PREPEND _mold_args "-B \"${MOLD_BIN_DIR}\" ")
set(LD_VERSION "${LD_VERSION_WITH_DIR}")
endif()
unset(MOLD_PREFIX)
endif()
if(WITH_LINKER_MOLD)
# GCC will search for `ld` in this directory first.
string(APPEND CMAKE_EXE_LINKER_FLAGS " -B \"${MOLD_BIN_DIR}\"")
string(APPEND CMAKE_SHARED_LINKER_FLAGS " -B \"${MOLD_BIN_DIR}\"")
string(APPEND CMAKE_MODULE_LINKER_FLAGS " -B \"${MOLD_BIN_DIR}\"")
set(_IS_LINKER_DEFAULT OFF)
if("${LD_VERSION}" MATCHES "mold ")
string(APPEND CMAKE_EXE_LINKER_FLAGS " ${_mold_args}")
string(APPEND CMAKE_SHARED_LINKER_FLAGS " ${_mold_args}")
string(APPEND CMAKE_MODULE_LINKER_FLAGS " ${_mold_args}")
set(_IS_LINKER_DEFAULT OFF)
else()
message(STATUS "GNU mold linker isn't available, using the default system linker.")
endif()
unset(_mold_args)
unset(MOLD_BIN_DIR)
unset(LD_VERSION)
endif()
unset(MOLD_BIN)
unset(MOLD_BIN_DIR)
endif()
if(WITH_LINKER_GOLD AND _IS_LINKER_DEFAULT)
@@ -907,7 +891,7 @@ if(CMAKE_COMPILER_IS_GNUCC)
# CLang is the same as GCC for now.
elseif(CMAKE_C_COMPILER_ID MATCHES "Clang")
set(PLATFORM_CFLAGS "-pipe -fPIC -funsigned-char -fno-strict-aliasing")
set(PLATFORM_CFLAGS "-pipe -fPIC -funsigned-char -fno-strict-aliasing -ffp-contract=off")
if(WITH_LINKER_MOLD AND _IS_LINKER_DEFAULT)
find_program(MOLD_BIN "mold")

@@ -9,7 +9,7 @@ buildbot:
cuda11:
version: '11.4.1'
hip:
version: '5.3.22480'
version: '5.5.30571'
optix:
version: '7.3.0'
ocloc:

@@ -1,7 +1,7 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: GPL-2.0-or-later
'''
"""
This script generates the blender.1 man page, embedding the help text
from the Blender executable itself. Invoke it as follows:
@@ -9,7 +9,7 @@ from the Blender executable itself. Invoke it as follows:
where <path-to-blender> is the path to the Blender executable,
and <output-filename> is where to write the generated man page.
'''
"""
import argparse
import os
@@ -87,29 +87,29 @@ def man_page_from_blender_help(fh: TextIO, blender_bin: str, verbose: bool) -> N
(blender_info["date"], blender_info["version"].replace(".", "\\&."))
)
fh.write(r'''
fh.write(r"""
.SH NAME
blender \- a full-featured 3D application''')
blender \- a full-featured 3D application""")
fh.write(r'''
fh.write(r"""
.SH SYNOPSIS
.B blender [args ...] [file] [args ...]''')
.B blender [args ...] [file] [args ...]""")
fh.write(r'''
fh.write(r"""
.br
.SH DESCRIPTION
.PP
.B blender
is a full-featured 3D application. It supports the entirety of the 3D pipeline - '''
'''modeling, rigging, animation, simulation, rendering, compositing, motion tracking, and video editing.
is a full-featured 3D application. It supports the entirety of the 3D pipeline - """
"""modeling, rigging, animation, simulation, rendering, compositing, motion tracking, and video editing.
Use Blender to create 3D images and animations, films and commercials, content for games, '''
r'''architectural and industrial visualizations, and scientific visualizations.
Use Blender to create 3D images and animations, films and commercials, content for games, """
r"""architectural and industrial visualizations, and scientific visualizations.
https://www.blender.org''')
https://www.blender.org""")
fh.write(r'''
.SH OPTIONS''')
fh.write(r"""
.SH OPTIONS""")
fh.write("\n\n")
@@ -152,7 +152,7 @@ https://www.blender.org''')
# Footer Content.
fh.write(r'''
fh.write(r"""
.br
.SH SEE ALSO
.B luxrender(1)
@@ -162,7 +162,7 @@ https://www.blender.org''')
This manpage was written for a Debian GNU/Linux system by Daniel Mester
<mester@uni-bremen.de> and updated by Cyril Brulebois
<cyril.brulebois@enst-bretagne.fr> and Dan Eicher <dan@trollwerks.org>.
''')
""")
def create_argparse() -> argparse.ArgumentParser:

@@ -865,29 +865,40 @@ Unfortunate Corner Cases
Besides all expected cases listed above, there are a few others that should not be
an issue but, due to internal implementation details, currently are:
- ``Object.hide_viewport``, ``Object.hide_select`` and ``Object.hide_render``:
Setting any of those Booleans will trigger a rebuild of Collection caches,
thus breaking any current iteration over ``Collection.all_objects``.
Collection Objects
^^^^^^^^^^^^^^^^^^
Changing: ``Object.hide_viewport``, ``Object.hide_select`` or ``Object.hide_render``
will trigger a rebuild of Collection caches, thus breaking any current iteration over ``Collection.all_objects``.
.. rubric:: Do not:
.. code-block:: python
# `all_objects` is an iterator. Using it directly while performing operations on its members that will update
# the memory accessed by the `all_objects` iterator will lead to invalid memory accesses and crashes.
for object in bpy.data.collections["Collection"].all_objects:
object.hide_viewport = True
.. rubric:: Do not:
.. rubric:: Do:
.. code-block:: python
.. code-block:: python
# `all_objects` is an iterator. Using it directly while performing operations on its members that will update
# the memory accessed by the `all_objects` iterator will lead to invalid memory accesses and crashes.
for object in bpy.data.collections["Collection"].all_objects:
object.hide_viewport = True
# `all_objects[:]` is an independent list generated from the iterator. As long as no objects are deleted,
# its content will remain valid even if the data accessed by the `all_objects` iterator is modified.
for object in bpy.data.collections["Collection"].all_objects[:]:
object.hide_viewport = True
.. rubric:: Do:
Data-Blocks Renaming During Iteration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code-block:: python
# `all_objects[:]` is an independent list generated from the iterator. As long as no objects are deleted,
# its content will remain valid even if the data accessed by the `all_objects` iterator is modified.
for object in bpy.data.collections["Collection"].all_objects[:]:
object.hide_viewport = True
Data-blocks accessed from ``bpy.data`` are sorted when their name is set.
Any loop that iterates over data such as ``bpy.data.objects``, for example,
and sets each object's ``name`` must get all items from the iterator first (typically by converting to a list or tuple)
to avoid missing some objects and iterating over others multiple times.
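
The effect described above can be reproduced without Blender. The sketch below is only an illustration: ``SortedCollection`` is a hypothetical stand-in for a name-sorted collection such as ``bpy.data.objects`` and is not part of the ``bpy`` API. It re-sorts its storage on every rename, which is the behavior the paragraph describes.

```python
class SortedCollection:
    """Hypothetical toy stand-in for a name-sorted collection such as bpy.data.objects."""

    def __init__(self, names):
        self._names = sorted(names)

    def __len__(self):
        return len(self._names)

    def __getitem__(self, index):
        return self._names[index]

    def rename(self, old, new):
        # Renaming re-sorts the storage, like setting a data-block name.
        self._names[self._names.index(old)] = new
        self._names.sort()

    def names(self):
        # Independent snapshot of the current names.
        return list(self._names)


def rename_unsafe(coll):
    """Rename while walking the live collection by index; returns the names visited."""
    visited = []
    for i in range(len(coll)):
        name = coll[i]
        visited.append(name)
        coll.rename(name, "z_" + name)
    return visited


def rename_safe(coll):
    """Snapshot the names first, then rename; every item is visited exactly once."""
    visited = []
    for name in coll.names():
        visited.append(name)
        coll.rename(name, "z_" + name)
    return visited
```

With the names ``a``, ``b``, ``c``, the unsafe loop never visits ``b`` and renames ``c`` twice, while the snapshot loop renames each item once. In Blender the equivalent of the snapshot is iterating over ``bpy.data.objects[:]`` or ``list(bpy.data.objects)``.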
sys.exit

@@ -572,7 +572,7 @@ template<class T> inline bool cmpMinMax(T &minv, T &maxv, const T &val)
}
template<> inline bool cmpMinMax<Vec3>(Vec3 &minv, Vec3 &maxv, const Vec3 &val)
{
return (cmpMinMax(minv.x, maxv.x, val.x) | cmpMinMax(minv.y, maxv.y, val.y) |
return (cmpMinMax(minv.x, maxv.x, val.x) || cmpMinMax(minv.y, maxv.y, val.y) ||
cmpMinMax(minv.z, maxv.z, val.z));
}
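
Note on this hunk: ``|`` evaluates every operand, while ``||`` stops at the first true one. The non-const reference parameters suggest that ``cmpMinMax`` updates ``minv``/``maxv`` as a side effect; if that assumption holds, the short-circuit form can leave the y and z bounds untouched once the x comparison reports a change. The Python sketch below illustrates the difference with a hypothetical ``cmp_min_max`` helper (its body is assumed, not taken from the source), using ``|`` on booleans versus ``or``.

```python
def cmp_min_max(bounds, axis, value):
    """Hypothetical stand-in: grow bounds[axis] = [min, max] to include value.

    Returns True if either bound changed.
    """
    lo, hi = bounds[axis]
    changed = False
    if value < lo:
        lo, changed = value, True
    if value > hi:
        hi, changed = value, True
    bounds[axis] = [lo, hi]
    return changed


def update_eager(bounds, val):
    # `|` on bools evaluates all three calls, like bitwise `|` in C++.
    return (cmp_min_max(bounds, "x", val[0]) |
            cmp_min_max(bounds, "y", val[1]) |
            cmp_min_max(bounds, "z", val[2]))


def update_short_circuit(bounds, val):
    # `or` stops at the first True result, like `||` in C++.
    return (cmp_min_max(bounds, "x", val[0]) or
            cmp_min_max(bounds, "y", val[1]) or
            cmp_min_max(bounds, "z", val[2]))


def fresh_bounds():
    return {"x": [0, 0], "y": [0, 0], "z": [0, 0]}
```

Both variants return the same boolean, so the difference only matters if callers rely on all three axes being updated in one call.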

@@ -281,6 +281,9 @@ endif()
if(WITH_CYCLES_EMBREE)
add_definitions(-DWITH_EMBREE)
if(WITH_CYCLES_DEVICE_ONEAPI AND EMBREE_SYCL_SUPPORT)
add_definitions(-DWITH_EMBREE_GPU)
endif()
add_definitions(-DEMBREE_MAJOR_VERSION=${EMBREE_MAJOR_VERSION})
include_directories(
SYSTEM

@@ -106,7 +106,7 @@ class CyclesRender(bpy.types.RenderEngine):
from . import osl
osl.update_script_node(node, self.report)
else:
self.report({'ERROR'}, "OSL support disabled in this build.")
self.report({'ERROR'}, "OSL support disabled in this build")
def update_render_passes(self, scene, srl):
engine.register_passes(self, scene, srl)

@@ -172,6 +172,8 @@ def system_info():
def list_render_passes(scene, srl):
import _cycles
crl = srl.cycles
# Combined pass.
@@ -250,6 +252,12 @@ def list_render_passes(scene, srl):
for lightgroup in srl.lightgroups:
yield ("Combined_%s" % lightgroup.name, "RGB", 'COLOR')
# Path guiding debug passes.
if _cycles.with_debug:
yield ("Guiding Color", "RGB", 'COLOR')
yield ("Guiding Probability", "X", 'VALUE')
yield ("Guiding Average Roughness", "X", 'VALUE')
def register_passes(engine, scene, view_layer):
for name, channelids, channeltype in list_render_passes(scene, view_layer):

@@ -1544,6 +1544,13 @@ class CyclesPreferences(bpy.types.AddonPreferences):
default=False,
)
use_oneapirt: BoolProperty(
name="Embree on GPU (Experimental)",
description="Embree GPU execution will allow to use hardware ray tracing on Intel GPUs, which will provide better performance. "
"However this support is experimental and some scenes may render incorrectly",
default=False,
)
kernel_optimization_level: EnumProperty(
name="Kernel Optimization",
description="Kernels can be optimized based on scene content. Optimized kernels are requested at the start of a render. "
@@ -1676,16 +1683,16 @@ class CyclesPreferences(bpy.types.AddonPreferences):
col.label(text=iface_("and NVIDIA driver version %s or newer") % driver_version,
icon='BLANK1', translate=False)
elif device_type == 'HIP':
if True:
col.label(text="HIP temporarily disabled due to compiler bugs", icon='BLANK1')
else:
import sys
if sys.platform[:3] == "win":
driver_version = "21.Q4"
col.label(text="Requires AMD GPU with Vega or RDNA architecture", icon='BLANK1')
col.label(text=iface_("and AMD Radeon Pro %s driver or newer") % driver_version,
icon='BLANK1', translate=False)
elif sys.platform.startswith("linux"):
import sys
if sys.platform[:3] == "win":
driver_version = "21.Q4"
col.label(text="Requires AMD GPU with Vega or RDNA architecture", icon='BLANK1')
col.label(text=iface_("and AMD Radeon Pro %s driver or newer") % driver_version,
icon='BLANK1', translate=False)
elif sys.platform.startswith("linux"):
if True:
col.label(text="HIP temporarily disabled due to compiler bugs", icon='BLANK1')
else:
driver_version = "22.10"
col.label(text="Requires AMD GPU with Vega or RDNA architecture", icon='BLANK1')
col.label(text=iface_("and AMD driver version %s or newer") % driver_version, icon='BLANK1',
@@ -1763,6 +1770,11 @@ class CyclesPreferences(bpy.types.AddonPreferences):
col.prop(self, "kernel_optimization_level")
col.prop(self, "use_metalrt")
if compute_device_type == 'ONEAPI' and _cycles.with_embree_gpu:
row = layout.row()
row.use_property_split = True
row.prop(self, "use_oneapirt")
def draw(self, context):
self.draw_impl(self.layout, context)

@@ -803,6 +803,16 @@ static void attr_create_generic(Scene *scene,
num_curves, num_keys, data, element, [&](int i) { return float(src[i]); });
break;
}
case BL::Attribute::data_type_INT32_2D: {
BL::Int2Attribute b_int2_attribute{b_attribute};
const int2 *src = static_cast<const int2 *>(b_int2_attribute.data[0].ptr.data);
Attribute *attr = attributes.add(name, TypeFloat2, element);
float2 *data = attr->data_float2();
fill_generic_attribute(num_curves, num_keys, data, element, [&](int i) {
return make_float2(float(src[i][0]), float(src[i][1]));
});
break;
}
case BL::Attribute::data_type_FLOAT_VECTOR: {
BL::FloatVectorAttribute b_vector_attribute{b_attribute};
const float(*src)[3] = static_cast<const float(*)[3]>(b_vector_attribute.data[0].ptr.data);

@@ -112,9 +112,26 @@ DeviceInfo blender_device_info(BL::Preferences &b_preferences,
device.has_peer_memory = false;
}
if (get_boolean(cpreferences, "use_metalrt")) {
device.use_metalrt = true;
bool accumulated_use_hardware_raytracing = false;
foreach (
DeviceInfo &info,
(device.multi_devices.size() != 0 ? device.multi_devices : vector<DeviceInfo>({device}))) {
if (info.type == DEVICE_METAL && !get_boolean(cpreferences, "use_metalrt")) {
info.use_hardware_raytracing = false;
}
if (info.type == DEVICE_ONEAPI && !get_boolean(cpreferences, "use_oneapirt")) {
info.use_hardware_raytracing = false;
}
/* The flag is accumulated here because multi-devices are currently supported only for
 * the same backend + CPU in Blender, and both oneAPI and Metal have a
 * global boolean backend setting (see above) for enabling/disabling HW RT,
 * so all sub-devices in the multi-device should enable (or disable) HW RT
 * simultaneously (CPU devices are expected to ignore the `use_hardware_raytracing` setting). */
accumulated_use_hardware_raytracing |= info.use_hardware_raytracing;
}
device.use_hardware_raytracing = accumulated_use_hardware_raytracing;
if (preview) {
/* Disable specialization for preview renders. */
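
Note on this hunk: the per-device preference check followed by an OR-accumulation can be sketched in Python as follows. This is a simplified model, not the Cycles code: the ``DeviceInfo`` dataclass, the string device types, and the ``prefs`` dictionary are assumptions made for illustration, and the sketch assumes device enumeration has already set ``use_hardware_raytracing`` on capable devices.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceInfo:
    # Simplified, hypothetical model of the C++ DeviceInfo.
    type: str
    use_hardware_raytracing: bool = False
    multi_devices: list = field(default_factory=list)


def apply_hardware_raytracing_prefs(device, prefs):
    """Clear the flag on sub-devices whose backend preference is off, then OR the results."""
    sub_devices = device.multi_devices if device.multi_devices else [device]
    accumulated = False
    for info in sub_devices:
        if info.type == "METAL" and not prefs.get("use_metalrt", False):
            info.use_hardware_raytracing = False
        if info.type == "ONEAPI" and not prefs.get("use_oneapirt", False):
            info.use_hardware_raytracing = False
        # Any sub-device that still wants HW RT enables it for the whole device.
        accumulated |= info.use_hardware_raytracing
    device.use_hardware_raytracing = accumulated
    return device
```

Because a multi-device mixes one GPU backend with the CPU, a single enabled GPU sub-device is enough to turn the flag on for the combined device.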

@@ -280,7 +280,7 @@ static void fill_generic_attribute(BL::Mesh &b_mesh,
assert(0);
}
else {
const MEdge *edges = static_cast<const MEdge *>(b_mesh.edges[0].ptr.data);
const int2 *edges = static_cast<const int2 *>(b_mesh.edges[0].ptr.data);
const size_t verts_num = b_mesh.vertices.length();
vector<int> count(verts_num, 0);
@@ -288,11 +288,11 @@ static void fill_generic_attribute(BL::Mesh &b_mesh,
for (int i = 0; i < edges_num; i++) {
TypeInCycles value = get_value_at_index(i);
const MEdge &b_edge = edges[i];
data[b_edge.v1] += value;
data[b_edge.v2] += value;
count[b_edge.v1]++;
count[b_edge.v2]++;
const int2 &b_edge = edges[i];
data[b_edge[0]] += value;
data[b_edge[1]] += value;
count[b_edge[0]]++;
count[b_edge[1]]++;
}
for (size_t i = 0; i < verts_num; i++) {
@@ -528,6 +528,19 @@ static void attr_create_generic(Scene *scene,
});
break;
}
case BL::Attribute::data_type_INT32_2D: {
BL::Int2Attribute b_int2_attribute{b_attribute};
if (b_int2_attribute.data.length() == 0) {
continue;
}
const int2 *src = static_cast<const int2 *>(b_int2_attribute.data[0].ptr.data);
Attribute *attr = attributes.add(name, TypeFloat2, element);
float2 *data = attr->data_float2();
fill_generic_attribute(b_mesh, data, b_domain, subdivision, [&](int i) {
return make_float2(float(src[i][0]), float(src[i][1]));
});
break;
}
default:
/* Not supported. */
break;
@@ -783,13 +796,13 @@ static void attr_create_pointiness(Scene *scene, Mesh *mesh, BL::Mesh &b_mesh, b
EdgeMap visited_edges;
memset(&counter[0], 0, sizeof(int) * counter.size());
const MEdge *edges = static_cast<MEdge *>(b_mesh.edges[0].ptr.data);
const int2 *edges = static_cast<int2 *>(b_mesh.edges[0].ptr.data);
const int edges_num = b_mesh.edges.length();
for (int i = 0; i < edges_num; i++) {
const MEdge &b_edge = edges[i];
const int v0 = vert_orig_index[b_edge.v1];
const int v1 = vert_orig_index[b_edge.v2];
const int2 &b_edge = edges[i];
const int v0 = vert_orig_index[b_edge[0]];
const int v1 = vert_orig_index[b_edge[1]];
if (visited_edges.exists(v0, v1)) {
continue;
}
@@ -825,9 +838,9 @@ static void attr_create_pointiness(Scene *scene, Mesh *mesh, BL::Mesh &b_mesh, b
memset(&counter[0], 0, sizeof(int) * counter.size());
visited_edges.clear();
for (int i = 0; i < edges_num; i++) {
const MEdge &b_edge = edges[i];
const int v0 = vert_orig_index[b_edge.v1];
const int v1 = vert_orig_index[b_edge.v2];
const int2 &b_edge = edges[i];
const int v0 = vert_orig_index[b_edge[0]];
const int v1 = vert_orig_index[b_edge[1]];
if (visited_edges.exists(v0, v1)) {
continue;
}
@@ -894,12 +907,12 @@ static void attr_create_random_per_island(Scene *scene,
DisjointSet vertices_sets(number_of_vertices);
const MEdge *edges = static_cast<MEdge *>(b_mesh.edges[0].ptr.data);
const int2 *edges = static_cast<int2 *>(b_mesh.edges[0].ptr.data);
const int edges_num = b_mesh.edges.length();
const int *corner_verts = find_corner_vert_attribute(b_mesh);
for (int i = 0; i < edges_num; i++) {
vertices_sets.join(edges[i].v1, edges[i].v2);
vertices_sets.join(edges[i][0], edges[i][1]);
}
AttributeSet &attributes = (subdivision) ? mesh->subd_attributes : mesh->attributes;
@@ -1221,12 +1234,12 @@ static void create_subd_mesh(Scene *scene,
mesh->reserve_subd_creases(num_creases);
const MEdge *edges = static_cast<MEdge *>(b_mesh.edges[0].ptr.data);
const int2 *edges = static_cast<int2 *>(b_mesh.edges[0].ptr.data);
for (int i = 0; i < edges_num; i++) {
const float crease = creases[i];
if (crease != 0.0f) {
const MEdge &b_edge = edges[i];
mesh->add_edge_crease(b_edge.v1, b_edge.v2, crease);
const int2 &b_edge = edges[i];
mesh->add_edge_crease(b_edge[0], b_edge[1], crease);
}
}
}

@@ -102,6 +102,16 @@ static void copy_attributes(PointCloud *pointcloud,
}
break;
}
case BL::Attribute::data_type_INT32_2D: {
BL::Int2Attribute b_int2_attribute{b_attribute};
const int2 *src = static_cast<const int2 *>(b_int2_attribute.data[0].ptr.data);
Attribute *attr = attributes.add(name, TypeFloat2, element);
float2 *data = attr->data_float2();
for (int i = 0; i < num_points; i++) {
data[i] = make_float2(float(src[i][0]), float(src[i][1]));
}
break;
}
case BL::Attribute::data_type_FLOAT_VECTOR: {
BL::FloatVectorAttribute b_vector_attribute{b_attribute};
const float(*src)[3] = static_cast<const float(*)[3]>(b_vector_attribute.data[0].ptr.data);

@@ -1034,6 +1034,14 @@ void *CCL_python_module_init()
Py_INCREF(Py_False);
#endif /* WITH_EMBREE */
#ifdef WITH_EMBREE_GPU
PyModule_AddObject(mod, "with_embree_gpu", Py_True);
Py_INCREF(Py_True);
#else /* WITH_EMBREE_GPU */
PyModule_AddObject(mod, "with_embree_gpu", Py_False);
Py_INCREF(Py_False);
#endif /* WITH_EMBREE_GPU */
if (ccl::openimagedenoise_supported()) {
PyModule_AddObject(mod, "with_openimagedenoise", Py_True);
Py_INCREF(Py_True);

@@ -1061,7 +1061,7 @@ void BlenderSession::ensure_display_driver_if_needed()
unique_ptr<BlenderDisplayDriver> display_driver = make_unique<BlenderDisplayDriver>(
b_engine, b_scene, background);
display_driver_ = display_driver.get();
session->set_display_driver(move(display_driver));
session->set_display_driver(std::move(display_driver));
}
CCL_NAMESPACE_END

@@ -981,22 +981,8 @@ static ShaderNode *add_node(Scene *scene,
sky->set_sun_disc(b_sky_node.sun_disc());
sky->set_sun_size(b_sky_node.sun_size());
sky->set_sun_intensity(b_sky_node.sun_intensity());
/* Patch sun position to be able to animate daylight cycle while keeping the shading code
* simple. */
float sun_rotation = b_sky_node.sun_rotation();
/* Wrap into [-2PI..2PI] range. */
float sun_elevation = fmodf(b_sky_node.sun_elevation(), M_2PI_F);
/* Wrap into [-PI..PI] range. */
if (fabsf(sun_elevation) >= M_PI_F) {
sun_elevation -= copysignf(2.0f, sun_elevation) * M_PI_F;
}
/* Wrap into [-PI/2..PI/2] range while keeping the same absolute position. */
if (sun_elevation >= M_PI_2_F || sun_elevation <= -M_PI_2_F) {
sun_elevation = copysignf(M_PI_F, sun_elevation) - sun_elevation;
sun_rotation += M_PI_F;
}
sky->set_sun_elevation(sun_elevation);
sky->set_sun_rotation(sun_rotation);
sky->set_sun_elevation(b_sky_node.sun_elevation());
sky->set_sun_rotation(b_sky_node.sun_rotation());
sky->set_altitude(b_sky_node.altitude());
sky->set_air_density(b_sky_node.air_density());
sky->set_dust_density(b_sky_node.dust_density());
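
Note on this hunk: for reference, the sun-position wrapping that appears in the hunk's longer variant can be sketched in Python as below. It is a direct port of the three steps shown in the C++ lines; the function name ``wrap_sun_position`` is made up for this illustration. ``math.fmod`` keeps the sign of the dividend, as ``fmodf`` does.

```python
import math


def wrap_sun_position(elevation, rotation):
    """Port of the wrapping steps in the hunk above (hypothetical helper name)."""
    # Wrap into the [-2PI..2PI] range.
    elevation = math.fmod(elevation, 2.0 * math.pi)
    # Wrap into the [-PI..PI] range.
    if abs(elevation) >= math.pi:
        elevation -= math.copysign(2.0, elevation) * math.pi
    # Wrap into [-PI/2..PI/2] while keeping the same absolute position:
    # mirror the elevation and turn the sun around by half a revolution.
    if elevation >= math.pi / 2.0 or elevation <= -math.pi / 2.0:
        elevation = math.copysign(math.pi, elevation) - elevation
        rotation += math.pi
    return elevation, rotation
```

For example, an elevation of 135 degrees with zero rotation maps to 45 degrees with a rotation of 180 degrees, which points at the same place in the sky.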

@@ -634,6 +634,10 @@ static bool get_known_pass_type(BL::RenderPass &b_pass, PassType &type, PassMode
MAP_PASS("AdaptiveAuxBuffer", PASS_ADAPTIVE_AUX_BUFFER, false);
MAP_PASS("Debug Sample Count", PASS_SAMPLE_COUNT, false);
MAP_PASS("Guiding Color", PASS_GUIDING_COLOR, false);
MAP_PASS("Guiding Probability", PASS_GUIDING_PROBABILITY, false);
MAP_PASS("Guiding Average Roughness", PASS_GUIDING_AVG_ROUGHNESS, false);
if (string_startswith(name, cryptomatte_prefix)) {
type = PASS_CRYPTOMATTE;
mode = PassMode::DENOISED;
@@ -684,18 +688,6 @@ void BlenderSync::sync_render_passes(BL::RenderLayer &b_rlay, BL::ViewLayer &b_v
}
scene->film->set_cryptomatte_passes(cryptomatte_passes);
/* Path guiding debug passes. */
#ifdef WITH_CYCLES_DEBUG
b_engine.add_pass("Guiding Color", 3, "RGB", b_view_layer.name().c_str());
pass_add(scene, PASS_GUIDING_COLOR, "Guiding Color", PassMode::NOISY);
b_engine.add_pass("Guiding Probability", 1, "X", b_view_layer.name().c_str());
pass_add(scene, PASS_GUIDING_PROBABILITY, "Guiding Probability", PassMode::NOISY);
b_engine.add_pass("Guiding Average Roughness", 1, "X", b_view_layer.name().c_str());
pass_add(scene, PASS_GUIDING_AVG_ROUGHNESS, "Guiding Average Roughness", PassMode::NOISY);
#endif
unordered_set<string> expected_passes;
/* Custom AOV passes. */

@@ -527,7 +527,7 @@ BVHNode *BVHBuild::run()
if (progress.get_cancel()) {
rootnode->deleteSubtree();
rootnode = NULL;
VLOG_WORK << "BVH build cancelled.";
VLOG_WORK << "BVH build canceled.";
}
else {
/*rotate(rootnode, 4, 5);*/

@@ -606,7 +606,7 @@ void BVH2::pack_instances(size_t nodes_size, size_t leaf_nodes_size)
int4 *bvh_nodes = &bvh->pack.nodes[0];
size_t bvh_nodes_size = bvh->pack.nodes.size();
for (size_t i = 0, j = 0; i < bvh_nodes_size; j++) {
for (size_t i = 0; i < bvh_nodes_size;) {
size_t nsize, nsize_bbox;
if (bvh_nodes[i].x & PATH_RAY_NODE_UNALIGNED) {
nsize = BVH_UNALIGNED_NODE_SIZE;

@@ -111,9 +111,13 @@ BVHEmbree::~BVHEmbree()
}
}
void BVHEmbree::build(Progress &progress, Stats *stats, RTCDevice rtc_device_)
void BVHEmbree::build(Progress &progress,
Stats *stats,
RTCDevice rtc_device_,
const bool rtc_device_is_sycl_)
{
rtc_device = rtc_device_;
rtc_device_is_sycl = rtc_device_is_sycl_;
assert(rtc_device);
rtcSetDeviceErrorFunction(rtc_device, rtc_error_func, NULL);
@@ -266,15 +270,29 @@ void BVHEmbree::add_triangles(const Object *ob, const Mesh *mesh, int i)
rtcSetGeometryTimeStepCount(geom_id, num_motion_steps);
const int *triangles = mesh->get_triangles().data();
rtcSetSharedGeometryBuffer(geom_id,
RTC_BUFFER_TYPE_INDEX,
0,
RTC_FORMAT_UINT3,
triangles,
0,
sizeof(int) * 3,
num_triangles);
if (!rtc_device_is_sycl) {
rtcSetSharedGeometryBuffer(geom_id,
RTC_BUFFER_TYPE_INDEX,
0,
RTC_FORMAT_UINT3,
triangles,
0,
sizeof(int) * 3,
num_triangles);
}
else {
/* NOTE(sirgienko): If the Embree device is a SYCL device, then Embree execution will
* happen on GPU, and we cannot use standard host pointers at this point. So instead
* of making a shared geometry buffer - a new Embree buffer will be created and data
* will be copied. */
int *triangles_buffer = (int *)rtcSetNewGeometryBuffer(
geom_id, RTC_BUFFER_TYPE_INDEX, 0, RTC_FORMAT_UINT3, sizeof(int) * 3, num_triangles);
assert(triangles_buffer);
if (triangles_buffer) {
static_assert(sizeof(int) == sizeof(uint));
std::memcpy(triangles_buffer, triangles, sizeof(int) * 3 * (num_triangles));
}
}
set_tri_vertex_buffer(geom_id, mesh, false);
rtcSetGeometryUserData(geom_id, (void *)prim_offset);
@@ -323,14 +341,38 @@ void BVHEmbree::set_tri_vertex_buffer(RTCGeometry geom_id, const Mesh *mesh, con
rtcUpdateGeometryBuffer(geom_id, RTC_BUFFER_TYPE_VERTEX, t);
}
else {
rtcSetSharedGeometryBuffer(geom_id,
RTC_BUFFER_TYPE_VERTEX,
t,
RTC_FORMAT_FLOAT3,
verts,
0,
sizeof(float3),
num_verts + 1);
if (!rtc_device_is_sycl) {
rtcSetSharedGeometryBuffer(geom_id,
RTC_BUFFER_TYPE_VERTEX,
t,
RTC_FORMAT_FLOAT3,
verts,
0,
sizeof(float3),
num_verts + 1);
}
else {
/* NOTE(sirgienko): If the Embree device is a SYCL device, then Embree execution will
* happen on GPU, and we cannot use standard host pointers at this point. So instead
* of making a shared geometry buffer - a new Embree buffer will be created and data
* will be copied. */
/* As float3 is packed on GPU side, we map it to packed_float3. */
packed_float3 *verts_buffer = (packed_float3 *)rtcSetNewGeometryBuffer(
geom_id,
RTC_BUFFER_TYPE_VERTEX,
t,
RTC_FORMAT_FLOAT3,
sizeof(packed_float3),
num_verts + 1);
assert(verts_buffer);
if (verts_buffer) {
for (size_t i = (size_t)0; i < num_verts + 1; ++i) {
verts_buffer[i].x = verts[i].x;
verts_buffer[i].y = verts[i].y;
verts_buffer[i].z = verts[i].z;
}
}
}
}
}
}

@@ -29,7 +29,10 @@ class PointCloud;
class BVHEmbree : public BVH {
public:
void build(Progress &progress, Stats *stats, RTCDevice rtc_device);
void build(Progress &progress,
Stats *stats,
RTCDevice rtc_device,
const bool isSyclEmbreeDevice = false);
void refit(Progress &progress);
RTCScene scene;
@@ -55,6 +58,7 @@ class BVHEmbree : public BVH {
const bool update);
RTCDevice rtc_device;
bool rtc_device_is_sycl;
enum RTCBuildQuality build_quality;
};

@@ -42,15 +42,19 @@ endif()
###########################################################################
if(WITH_CYCLES_HIP_BINARIES AND WITH_CYCLES_DEVICE_HIP)
set(WITH_CYCLES_HIP_BINARIES OFF)
message(STATUS "HIP temporarily disabled due to compiler bugs")
if(UNIX)
# Disabled until there is a HIP 5.5 release for Linux.
set(WITH_CYCLES_HIP_BINARIES OFF)
message(STATUS "HIP temporarily disabled due to compiler bugs")
else()
# Need at least HIP 5.5 to solve compiler bug affecting the kernel.
find_package(HIP 5.5.0)
set_and_warn_library_found("HIP compiler" HIP_FOUND WITH_CYCLES_HIP_BINARIES)
# find_package(HIP)
# set_and_warn_library_found("HIP compiler" HIP_FOUND WITH_CYCLES_HIP_BINARIES)
# if(HIP_FOUND)
# message(STATUS "Found HIP ${HIP_HIPCC_EXECUTABLE} (${HIP_VERSION})")
# endif()
if(HIP_FOUND)
message(STATUS "Found HIP ${HIP_HIPCC_EXECUTABLE} (${HIP_VERSION})")
endif()
endif()
endif()
if(NOT WITH_HIP_DYNLOAD)

@@ -84,7 +84,7 @@ CPUDevice::~CPUDevice()
texture_info.free();
}
BVHLayoutMask CPUDevice::get_bvh_layout_mask() const
BVHLayoutMask CPUDevice::get_bvh_layout_mask(uint /*kernel_features*/) const
{
BVHLayoutMask bvh_layout_mask = BVH_LAYOUT_BVH2;
#ifdef WITH_EMBREE

@@ -56,7 +56,7 @@ class CPUDevice : public Device {
CPUDevice(const DeviceInfo &info_, Stats &stats_, Profiler &profiler_);
~CPUDevice();
virtual BVHLayoutMask get_bvh_layout_mask() const override;
virtual BVHLayoutMask get_bvh_layout_mask(uint /*kernel_features*/) const override;
/* Returns true if the texture info was copied to the device (meaning, some more
* re-initialization might be needed). */

@@ -35,7 +35,7 @@ bool CUDADevice::have_precompiled_kernels()
return path_exists(cubins_path);
}
BVHLayoutMask CUDADevice::get_bvh_layout_mask() const
BVHLayoutMask CUDADevice::get_bvh_layout_mask(uint /*kernel_features*/) const
{
return BVH_LAYOUT_BVH2;
}

@@ -38,7 +38,7 @@ class CUDADevice : public GPUDevice {
static bool have_precompiled_kernels();
virtual BVHLayoutMask get_bvh_layout_mask() const override;
virtual BVHLayoutMask get_bvh_layout_mask(uint /*kernel_features*/) const override;
void set_error(const string &error) override;

@@ -354,7 +354,7 @@ DeviceInfo Device::get_multi_device(const vector<DeviceInfo> &subdevices,
info.has_guiding = true;
info.has_profiling = true;
info.has_peer_memory = false;
info.use_metalrt = false;
info.use_hardware_raytracing = false;
info.denoisers = DENOISER_ALL;
foreach (const DeviceInfo &device, subdevices) {
@@ -403,7 +403,7 @@ DeviceInfo Device::get_multi_device(const vector<DeviceInfo> &subdevices,
info.has_guiding &= device.has_guiding;
info.has_profiling &= device.has_profiling;
info.has_peer_memory |= device.has_peer_memory;
info.use_metalrt |= device.use_metalrt;
info.use_hardware_raytracing |= device.use_hardware_raytracing;
info.denoisers &= device.denoisers;
}

@@ -71,15 +71,16 @@ class DeviceInfo {
string description;
string id; /* used for user preferences, should stay fixed with changing hardware config */
int num;
bool display_device; /* GPU is used as a display device. */
bool has_nanovdb; /* Support NanoVDB volumes. */
bool has_light_tree; /* Support light tree. */
bool has_osl; /* Support Open Shading Language. */
bool has_guiding; /* Support path guiding. */
bool has_profiling; /* Supports runtime collection of profiling info. */
bool has_peer_memory; /* GPU has P2P access to memory of another GPU. */
bool has_gpu_queue; /* Device supports GPU queue. */
bool use_metalrt; /* Use MetalRT to accelerate ray queries (Metal only). */
bool display_device; /* GPU is used as a display device. */
bool has_nanovdb; /* Support NanoVDB volumes. */
bool has_light_tree; /* Support light tree. */
bool has_osl; /* Support Open Shading Language. */
bool has_guiding; /* Support path guiding. */
bool has_profiling; /* Supports runtime collection of profiling info. */
bool has_peer_memory; /* GPU has P2P access to memory of another GPU. */
bool has_gpu_queue; /* Device supports GPU queue. */
bool use_hardware_raytracing; /* Use hardware ray tracing to accelerate ray queries in a backend.
*/
KernelOptimizationLevel kernel_optimization_level; /* Optimization level applied to path tracing
* kernels (Metal only). */
DenoiserTypeMask denoisers; /* Supported denoiser types. */
@@ -101,7 +102,7 @@ class DeviceInfo {
has_profiling = false;
has_peer_memory = false;
has_gpu_queue = false;
use_metalrt = false;
use_hardware_raytracing = false;
denoisers = DENOISER_NONE;
}
@@ -157,7 +158,7 @@ class Device {
fprintf(stderr, "%s\n", error.c_str());
fflush(stderr);
}
virtual BVHLayoutMask get_bvh_layout_mask() const = 0;
virtual BVHLayoutMask get_bvh_layout_mask(uint kernel_features) const = 0;
/* statistics */
Stats &stats;

@@ -20,7 +20,7 @@ class DummyDevice : public Device {
~DummyDevice() {}
virtual BVHLayoutMask get_bvh_layout_mask() const override
virtual BVHLayoutMask get_bvh_layout_mask(uint /*kernel_features*/) const override
{
return 0;
}

@@ -137,7 +137,7 @@ void device_hip_info(vector<DeviceInfo> &devices)
info.num = num;
info.has_nanovdb = true;
info.has_light_tree = false;
info.has_light_tree = true;
info.denoisers = 0;
info.has_gpu_queue = true;

@@ -35,7 +35,7 @@ bool HIPDevice::have_precompiled_kernels()
return path_exists(fatbins_path);
}
BVHLayoutMask HIPDevice::get_bvh_layout_mask() const
BVHLayoutMask HIPDevice::get_bvh_layout_mask(uint /*kernel_features*/) const
{
return BVH_LAYOUT_BVH2;
}

@@ -35,7 +35,7 @@ class HIPDevice : public GPUDevice {
static bool have_precompiled_kernels();
virtual BVHLayoutMask get_bvh_layout_mask() const override;
virtual BVHLayoutMask get_bvh_layout_mask(uint /*kernel_features*/) const override;
void set_error(const string &error) override;

@@ -3,7 +3,9 @@
#include "device/kernel.h"
#include "util/log.h"
#ifndef __KERNEL_ONEAPI__
# include "util/log.h"
#endif
CCL_NAMESPACE_BEGIN
@@ -153,10 +155,13 @@ const char *device_kernel_as_string(DeviceKernel kernel)
case DEVICE_KERNEL_NUM:
break;
};
#ifndef __KERNEL_ONEAPI__
LOG(FATAL) << "Unhandled kernel " << static_cast<int>(kernel) << ", should never happen.";
#endif
return "UNKNOWN";
}
#ifndef __KERNEL_ONEAPI__
std::ostream &operator<<(std::ostream &os, DeviceKernel kernel)
{
os << device_kernel_as_string(kernel);
@@ -178,5 +183,6 @@ string device_kernel_mask_as_string(DeviceKernelMask mask)
return str;
}
#endif
CCL_NAMESPACE_END

@@ -3,11 +3,13 @@
#pragma once
#include "kernel/types.h"
#ifndef __KERNEL_ONEAPI__
# include "kernel/types.h"
#include "util/string.h"
# include "util/string.h"
#include <ostream> // NOLINT
# include <ostream> // NOLINT
#endif
CCL_NAMESPACE_BEGIN
@@ -15,9 +17,12 @@ bool device_kernel_has_shading(DeviceKernel kernel);
bool device_kernel_has_intersection(DeviceKernel kernel);
const char *device_kernel_as_string(DeviceKernel kernel);
#ifndef __KERNEL_ONEAPI__
std::ostream &operator<<(std::ostream &os, DeviceKernel kernel);
typedef uint64_t DeviceKernelMask;
string device_kernel_mask_as_string(DeviceKernelMask mask);
#endif
CCL_NAMESPACE_END

@@ -100,7 +100,7 @@ class MetalDevice : public Device {
virtual void cancel() override;
virtual BVHLayoutMask get_bvh_layout_mask() const override;
virtual BVHLayoutMask get_bvh_layout_mask(uint /*kernel_features*/) const override;
void set_error(const string &error) override;

@@ -39,7 +39,7 @@ bool MetalDevice::is_device_cancelled(int ID)
return get_device_by_ID(ID, lock) == nullptr;
}
BVHLayoutMask MetalDevice::get_bvh_layout_mask() const
BVHLayoutMask MetalDevice::get_bvh_layout_mask(uint /*kernel_features*/) const
{
return use_metalrt ? BVH_LAYOUT_METAL : BVH_LAYOUT_BVH2;
}
@@ -100,12 +100,12 @@ MetalDevice::MetalDevice(const DeviceInfo &info, Stats &stats, Profiler &profile
}
case METAL_GPU_AMD: {
max_threads_per_threadgroup = 128;
use_metalrt = info.use_metalrt;
use_metalrt = info.use_hardware_raytracing;
break;
}
case METAL_GPU_APPLE: {
max_threads_per_threadgroup = 512;
use_metalrt = info.use_metalrt;
use_metalrt = info.use_hardware_raytracing;
break;
}
}

@@ -96,12 +96,13 @@ class MultiDevice : public Device {
return error_msg;
}
virtual BVHLayoutMask get_bvh_layout_mask() const override
virtual BVHLayoutMask get_bvh_layout_mask(uint kernel_features) const override
{
BVHLayoutMask bvh_layout_mask = BVH_LAYOUT_ALL;
BVHLayoutMask bvh_layout_mask_all = BVH_LAYOUT_NONE;
foreach (const SubDevice &sub_device, devices) {
BVHLayoutMask device_bvh_layout_mask = sub_device.device->get_bvh_layout_mask();
BVHLayoutMask device_bvh_layout_mask = sub_device.device->get_bvh_layout_mask(
kernel_features);
bvh_layout_mask &= device_bvh_layout_mask;
bvh_layout_mask_all |= device_bvh_layout_mask;
}
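The multi-device override above folds the per-device masks into two values: the layouts every sub-device supports (bitwise AND) and the layouts at least one sub-device supports (bitwise OR). A minimal standalone sketch of that folding follows; the `LayoutMask` type, the `LAYOUT_*` bit values, and `combine_layout_masks` are hypothetical stand-ins, not the real Cycles definitions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using LayoutMask = std::uint32_t;

// Illustrative bit values only; the real BVH_LAYOUT_* constants are defined in Cycles.
constexpr LayoutMask LAYOUT_NONE = 0;
constexpr LayoutMask LAYOUT_BVH2 = 1u << 0;
constexpr LayoutMask LAYOUT_EMBREE = 1u << 1;
constexpr LayoutMask LAYOUT_OPTIX = 1u << 2;
constexpr LayoutMask LAYOUT_ALL = ~0u;

struct CombinedMasks {
  LayoutMask common;  // Layouts supported by every sub-device.
  LayoutMask any;     // Layouts supported by at least one sub-device.
};

// Hypothetical helper: start from "all" for the intersection and "none" for the union,
// then narrow or widen with each per-device mask.
CombinedMasks combine_layout_masks(const std::vector<LayoutMask> &per_device)
{
  CombinedMasks result{LAYOUT_ALL, LAYOUT_NONE};
  for (const LayoutMask mask : per_device) {
    result.common &= mask;
    result.any |= mask;
  }
  return result;
}
```

With one device reporting BVH2 and Embree and another reporting BVH2 and OptiX, only BVH2 survives in `common`, which is why a shared layout can still be chosen when the devices differ.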

@@ -40,12 +40,12 @@ bool device_oneapi_init()
if (getenv("SYCL_CACHE_THRESHOLD") == nullptr) {
_putenv_s("SYCL_CACHE_THRESHOLD", "0");
}
if (getenv("SYCL_DEVICE_FILTER") == nullptr) {
if (getenv("ONEAPI_DEVICE_SELECTOR") == nullptr) {
if (getenv("CYCLES_ONEAPI_ALL_DEVICES") == nullptr) {
_putenv_s("SYCL_DEVICE_FILTER", "level_zero");
_putenv_s("ONEAPI_DEVICE_SELECTOR", "level_zero:*");
}
else {
_putenv_s("SYCL_DEVICE_FILTER", "level_zero,cuda,hip");
_putenv_s("ONEAPI_DEVICE_SELECTOR", "!opencl:*");
}
}
if (getenv("SYCL_ENABLE_PCI") == nullptr) {
@@ -58,10 +58,10 @@ bool device_oneapi_init()
setenv("SYCL_CACHE_PERSISTENT", "1", false);
setenv("SYCL_CACHE_THRESHOLD", "0", false);
if (getenv("CYCLES_ONEAPI_ALL_DEVICES") == nullptr) {
setenv("SYCL_DEVICE_FILTER", "level_zero", false);
setenv("ONEAPI_DEVICE_SELECTOR", "level_zero:*", false);
}
else {
setenv("SYCL_DEVICE_FILTER", "level_zero,cuda,hip", false);
setenv("ONEAPI_DEVICE_SELECTOR", "!opencl:*", false);
}
setenv("SYCL_ENABLE_PCI", "1", false);
setenv("SYCL_PI_LEVEL_ZERO_USE_COPY_ENGINE_FOR_IN_ORDER_QUEUE", "0", false);
@@ -87,7 +87,8 @@ Device *device_oneapi_create(const DeviceInfo &info, Stats &stats, Profiler &pro
}
#ifdef WITH_ONEAPI
static void device_iterator_cb(const char *id, const char *name, int num, void *user_ptr)
static void device_iterator_cb(
const char *id, const char *name, int num, bool hwrt_support, void *user_ptr)
{
vector<DeviceInfo> *devices = (vector<DeviceInfo> *)user_ptr;
@@ -112,6 +113,13 @@ static void device_iterator_cb(const char *id, const char *name, int num, void *
/* NOTE(@nsirgien): Seems not possible to know from SYCL/oneAPI or Level0. */
info.display_device = false;
# ifdef WITH_EMBREE_GPU
info.use_hardware_raytracing = hwrt_support;
# else
info.use_hardware_raytracing = false;
(void)hwrt_support;
# endif
devices->push_back(info);
VLOG_INFO << "Added device \"" << name << "\" with id \"" << info.id << "\".";
}
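The initialization hunks in this file follow one pattern: an environment variable receives a default only when the user has not already set it (`getenv` plus `_putenv_s` on Windows, `setenv` with `overwrite` set to false elsewhere). A minimal sketch of that pattern as a single helper is shown below; `set_env_default` is a hypothetical name and is not part of the Cycles sources.

```cpp
#include <cassert>
#include <cstring>
#include <stdlib.h>

// Hypothetical helper: apply `value` to `name` only if the variable is not set yet.
// Returns true when the default was applied, false when an existing value was kept
// or the platform call failed.
bool set_env_default(const char *name, const char *value)
{
  if (getenv(name) != nullptr) {
    return false;  // Respect whatever the user exported before launch.
  }
#ifdef _WIN32
  return _putenv_s(name, value) == 0;
#else
  return setenv(name, value, 0) == 0;
#endif
}
```

A call such as `set_env_default("ONEAPI_DEVICE_SELECTOR", "level_zero:*")` would then leave a user-provided selector untouched.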

@@ -8,7 +8,19 @@
# include "util/debug.h"
# include "util/log.h"
# ifdef WITH_EMBREE_GPU
# include "bvh/embree.h"
# endif
# include "kernel/device/oneapi/globals.h"
# include "kernel/device/oneapi/kernel.h"
# if defined(WITH_EMBREE_GPU) && defined(EMBREE_SYCL_SUPPORT) && !defined(SYCL_LANGUAGE_VERSION)
/* These declarations are missing from embree headers when compiling from a compiler that doesn't
* support SYCL. */
extern "C" RTCDevice rtcNewSYCLDevice(sycl::context context, const char *config);
extern "C" bool rtcIsSYCLDeviceSupported(const sycl::device sycl_device);
# endif
CCL_NAMESPACE_BEGIN
@@ -22,16 +34,29 @@ static void queue_error_cb(const char *message, void *user_ptr)
OneapiDevice::OneapiDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
: Device(info, stats, profiler),
device_queue_(nullptr),
# ifdef WITH_EMBREE_GPU
embree_device(nullptr),
embree_scene(nullptr),
# endif
texture_info_(this, "texture_info", MEM_GLOBAL),
kg_memory_(nullptr),
kg_memory_device_(nullptr),
kg_memory_size_(0)
{
need_texture_info_ = false;
use_hardware_raytracing = info.use_hardware_raytracing;
oneapi_set_error_cb(queue_error_cb, &oneapi_error_string_);
bool is_finished_ok = create_queue(device_queue_, info.num);
bool is_finished_ok = create_queue(device_queue_,
info.num,
# ifdef WITH_EMBREE_GPU
use_hardware_raytracing ? &embree_device : nullptr
# else
nullptr
# endif
);
if (is_finished_ok == false) {
set_error("oneAPI queue initialization error: got runtime exception \"" +
oneapi_error_string_ + "\"");
@@ -42,6 +67,16 @@ OneapiDevice::OneapiDevice(const DeviceInfo &info, Stats &stats, Profiler &profi
assert(device_queue_);
}
# ifdef WITH_EMBREE_GPU
use_hardware_raytracing = use_hardware_raytracing && (embree_device != nullptr);
# else
use_hardware_raytracing = false;
# endif
if (use_hardware_raytracing) {
VLOG_INFO << "oneAPI will use hardware ray tracing for intersection acceleration.";
}
size_t globals_segment_size;
is_finished_ok = kernel_globals_size(globals_segment_size);
if (is_finished_ok == false) {
@@ -64,6 +99,11 @@ OneapiDevice::OneapiDevice(const DeviceInfo &info, Stats &stats, Profiler &profi
OneapiDevice::~OneapiDevice()
{
# ifdef WITH_EMBREE_GPU
if (embree_device)
rtcReleaseDevice(embree_device);
# endif
texture_info_.free();
usm_free(device_queue_, kg_memory_);
usm_free(device_queue_, kg_memory_device_);
@@ -80,15 +120,47 @@ bool OneapiDevice::check_peer_access(Device * /*peer_device*/)
return false;
}
BVHLayoutMask OneapiDevice::get_bvh_layout_mask() const
bool OneapiDevice::can_use_hardware_raytracing_for_features(uint requested_features) const
{
return BVH_LAYOUT_BVH2;
/* MNEE and Ray-trace kernels currently don't work correctly with HWRT. */
return !(requested_features & (KERNEL_FEATURE_MNEE | KERNEL_FEATURE_NODE_RAYTRACE));
}
BVHLayoutMask OneapiDevice::get_bvh_layout_mask(uint requested_features) const
{
return (use_hardware_raytracing &&
can_use_hardware_raytracing_for_features(requested_features)) ?
BVH_LAYOUT_EMBREE :
BVH_LAYOUT_BVH2;
}
# ifdef WITH_EMBREE_GPU
void OneapiDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
{
if (embree_device && bvh->params.bvh_layout == BVH_LAYOUT_EMBREE) {
BVHEmbree *const bvh_embree = static_cast<BVHEmbree *>(bvh);
if (refit) {
bvh_embree->refit(progress);
}
else {
bvh_embree->build(progress, &stats, embree_device, true);
}
if (bvh->params.top_level) {
embree_scene = bvh_embree->scene;
}
}
else {
Device::build_bvh(bvh, progress, refit);
}
}
# endif
bool OneapiDevice::load_kernels(const uint requested_features)
{
assert(device_queue_);
kernel_features = requested_features;
bool is_finished_ok = oneapi_run_test_kernel(device_queue_);
if (is_finished_ok == false) {
set_error("oneAPI test kernel execution: got a runtime exception \"" + oneapi_error_string_ +
@@ -100,7 +172,14 @@ bool OneapiDevice::load_kernels(const uint requested_features)
assert(device_queue_);
}
is_finished_ok = oneapi_load_kernels(device_queue_, (const unsigned int)requested_features);
if (use_hardware_raytracing && !can_use_hardware_raytracing_for_features(requested_features)) {
VLOG_INFO
<< "Hardware ray tracing disabled, not supported yet by oneAPI for requested features.";
use_hardware_raytracing = false;
}
is_finished_ok = oneapi_load_kernels(
device_queue_, (const unsigned int)requested_features, use_hardware_raytracing);
if (is_finished_ok == false) {
set_error("oneAPI kernels loading: got a runtime exception \"" + oneapi_error_string_ + "\"");
}
@@ -327,6 +406,16 @@ void OneapiDevice::const_copy_to(const char *name, void *host, size_t size)
<< string_human_readable_number(size) << " bytes. ("
<< string_human_readable_size(size) << ")";
# ifdef WITH_EMBREE_GPU
if (strcmp(name, "data") == 0) {
assert(size <= sizeof(KernelData));
/* Update the scene handle (it is different for each device when using multiple devices). */
KernelData *const data = (KernelData *)host;
data->device_bvh = embree_scene;
}
# endif
ConstMemMap::iterator i = const_mem_map_.find(name);
device_vector<uchar> *data;
@@ -446,7 +535,9 @@ void OneapiDevice::check_usm(SyclQueue *queue_, const void *usm_ptr, bool allow_
# endif
}
bool OneapiDevice::create_queue(SyclQueue *&external_queue, int device_index)
bool OneapiDevice::create_queue(SyclQueue *&external_queue,
int device_index,
void *embree_device_pointer)
{
bool finished_correct = true;
try {
@@ -457,6 +548,13 @@ bool OneapiDevice::create_queue(SyclQueue *&external_queue, int device_index)
sycl::queue *created_queue = new sycl::queue(devices[device_index],
sycl::property::queue::in_order());
external_queue = reinterpret_cast<SyclQueue *>(created_queue);
# ifdef WITH_EMBREE_GPU
if (embree_device_pointer) {
*((RTCDevice *)embree_device_pointer) = rtcNewSYCLDevice(created_queue->get_context(), "");
}
# else
(void)embree_device_pointer;
# endif
}
catch (sycl::exception const &e) {
finished_correct = false;
@@ -625,7 +723,8 @@ bool OneapiDevice::enqueue_kernel(KernelContext *kernel_context,
size_t global_size,
void **args)
{
return oneapi_enqueue_kernel(kernel_context, kernel, global_size, args);
return oneapi_enqueue_kernel(
kernel_context, kernel, global_size, kernel_features, use_hardware_raytracing, args);
}
/* Compute-runtime (ie. NEO) version is what gets returned by sycl/L0 on Windows
@@ -767,9 +866,9 @@ char *OneapiDevice::device_capabilities()
sycl::id<3> max_work_item_sizes =
device.get_info<sycl::info::device::max_work_item_sizes<3>>();
WRITE_ATTR("max_work_item_sizes_dim0", ((size_t)max_work_item_sizes.get(0)))
WRITE_ATTR("max_work_item_sizes_dim1", ((size_t)max_work_item_sizes.get(1)))
WRITE_ATTR("max_work_item_sizes_dim2", ((size_t)max_work_item_sizes.get(2)))
WRITE_ATTR(max_work_item_sizes_dim0, ((size_t)max_work_item_sizes.get(0)))
WRITE_ATTR(max_work_item_sizes_dim1, ((size_t)max_work_item_sizes.get(1)))
WRITE_ATTR(max_work_item_sizes_dim2, ((size_t)max_work_item_sizes.get(2)))
GET_NUM_ATTR(max_work_group_size)
GET_NUM_ATTR(max_num_sub_groups)
@@ -792,7 +891,7 @@ char *OneapiDevice::device_capabilities()
GET_NUM_ATTR(native_vector_width_half)
size_t max_clock_frequency = device.get_info<sycl::info::device::max_clock_frequency>();
WRITE_ATTR("max_clock_frequency", max_clock_frequency)
WRITE_ATTR(max_clock_frequency, max_clock_frequency)
GET_NUM_ATTR(address_bits)
GET_NUM_ATTR(max_mem_alloc_size)
@@ -801,7 +900,7 @@ char *OneapiDevice::device_capabilities()
* supported so we always return false, even if device supports HW texture usage acceleration.
*/
bool image_support = false;
WRITE_ATTR("image_support", (size_t)image_support)
WRITE_ATTR(image_support, (size_t)image_support)
GET_NUM_ATTR(max_parameter_size)
GET_NUM_ATTR(mem_base_addr_align)
@@ -830,12 +929,17 @@ void OneapiDevice::iterate_devices(OneAPIDeviceIteratorCallback cb, void *user_p
std::string name = device.get_info<sycl::info::device::name>();
# else
std::string name = "SYCL Host Task (Debug)";
# endif
# ifdef WITH_EMBREE_GPU
bool hwrt_support = rtcIsSYCLDeviceSupported(device);
# else
bool hwrt_support = false;
# endif
std::string id = "ONEAPI_" + platform_name + "_" + name;
if (device.has(sycl::aspect::ext_intel_pci_address)) {
id.append("_" + device.get_info<sycl::ext::intel::info::device::pci_address>());
}
(cb)(id.c_str(), name.c_str(), num, user_ptr);
(cb)(id.c_str(), name.c_str(), num, hwrt_support, user_ptr);
num++;
}
}
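`OneapiDevice::get_bvh_layout_mask()` in this file picks the Embree layout only when the device supports hardware ray tracing and the requested kernel features exclude MNEE and the ray-trace shader nodes. A minimal standalone sketch of that gating is given below; the `FEATURE_*` bit values, the `Layout` enum, and both function names are illustrative assumptions rather than the real Cycles constants.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative bit values only; the real KERNEL_FEATURE_* flags are defined in Cycles.
constexpr std::uint32_t FEATURE_NODE_RAYTRACE = 1u << 0;
constexpr std::uint32_t FEATURE_MNEE = 1u << 1;
constexpr std::uint32_t FEATURE_VOLUME = 1u << 2;

enum Layout { LAYOUT_BVH2, LAYOUT_EMBREE };

// True when none of the features known to be incompatible with HWRT are requested.
bool features_allow_hwrt(std::uint32_t requested_features)
{
  return (requested_features & (FEATURE_MNEE | FEATURE_NODE_RAYTRACE)) == 0;
}

// Fall back to the software BVH2 layout unless both conditions hold.
Layout choose_layout(bool device_has_hwrt, std::uint32_t requested_features)
{
  return (device_has_hwrt && features_allow_hwrt(requested_features)) ? LAYOUT_EMBREE :
                                                                        LAYOUT_BVH2;
}
```

Because the decision depends on the feature mask, callers have to pass the scene's kernel features, which is what the new `uint kernel_features` parameter on the virtual method provides.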

@@ -16,15 +16,16 @@ CCL_NAMESPACE_BEGIN
class DeviceQueue;
typedef void (*OneAPIDeviceIteratorCallback)(const char *id,
const char *name,
int num,
void *user_ptr);
typedef void (*OneAPIDeviceIteratorCallback)(
const char *id, const char *name, int num, bool hwrt_support, void *user_ptr);
class OneapiDevice : public Device {
private:
SyclQueue *device_queue_;
# ifdef WITH_EMBREE_GPU
RTCDevice embree_device;
RTCScene embree_scene;
# endif
using ConstMemMap = map<string, device_vector<uchar> *>;
ConstMemMap const_mem_map_;
device_vector<TextureInfo> texture_info_;
@@ -34,17 +35,21 @@ class OneapiDevice : public Device {
size_t kg_memory_size_ = (size_t)0;
size_t max_memory_on_device_ = (size_t)0;
std::string oneapi_error_string_;
bool use_hardware_raytracing = false;
unsigned int kernel_features = 0;
public:
virtual BVHLayoutMask get_bvh_layout_mask() const override;
virtual BVHLayoutMask get_bvh_layout_mask(uint kernel_features) const override;
OneapiDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler);
virtual ~OneapiDevice();
# ifdef WITH_EMBREE_GPU
void build_bvh(BVH *bvh, Progress &progress, bool refit) override;
# endif
bool check_peer_access(Device *peer_device) override;
bool load_kernels(const uint requested_features) override;
bool load_kernels(const uint kernel_features) override;
void load_texture_info();
@@ -113,8 +118,9 @@ class OneapiDevice : public Device {
SyclQueue *sycl_queue();
protected:
bool can_use_hardware_raytracing_for_features(uint kernel_features) const;
void check_usm(SyclQueue *queue, const void *usm_ptr, bool allow_host);
bool create_queue(SyclQueue *&external_queue, int device_index);
bool create_queue(SyclQueue *&external_queue, int device_index, void *embree_device);
void free_queue(SyclQueue *queue);
void *usm_aligned_alloc_host(SyclQueue *queue, size_t memory_size, size_t alignment);
void *usm_alloc_device(SyclQueue *queue, size_t memory_size);

@@ -151,7 +151,7 @@ unique_ptr<DeviceQueue> OptiXDevice::gpu_queue_create()
return make_unique<OptiXDeviceQueue>(this);
}
BVHLayoutMask OptiXDevice::get_bvh_layout_mask() const
BVHLayoutMask OptiXDevice::get_bvh_layout_mask(uint /*kernel_features*/) const
{
/* OptiX has its own internal acceleration structure format. */
return BVH_LAYOUT_OPTIX;

@@ -88,7 +88,7 @@ class OptiXDevice : public CUDADevice {
OptiXDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler);
~OptiXDevice();
BVHLayoutMask get_bvh_layout_mask() const override;
BVHLayoutMask get_bvh_layout_mask(uint /*kernel_features*/) const override;
string compile_kernel_get_common_cflags(const uint kernel_features);

@@ -574,7 +574,7 @@ void PathTrace::denoise(const RenderWork &render_work)
void PathTrace::set_output_driver(unique_ptr<OutputDriver> driver)
{
output_driver_ = move(driver);
output_driver_ = std::move(driver);
}
void PathTrace::set_display_driver(unique_ptr<DisplayDriver> driver)
@@ -585,7 +585,7 @@ void PathTrace::set_display_driver(unique_ptr<DisplayDriver> driver)
destroy_gpu_resources();
if (driver) {
display_ = make_unique<PathTraceDisplay>(move(driver));
display_ = make_unique<PathTraceDisplay>(std::move(driver));
}
else {
display_ = nullptr;

@@ -9,7 +9,9 @@
CCL_NAMESPACE_BEGIN
PathTraceDisplay::PathTraceDisplay(unique_ptr<DisplayDriver> driver) : driver_(move(driver)) {}
PathTraceDisplay::PathTraceDisplay(unique_ptr<DisplayDriver> driver) : driver_(std::move(driver))
{
}
void PathTraceDisplay::reset(const BufferParams &buffer_params, const bool reset_rendering)
{

@@ -357,8 +357,12 @@ void PathTraceWorkCPU::guiding_push_sample_data_to_global_storage(
# if PATH_GUIDING_LEVEL >= 2
const bool use_direct_light = kernel_data.integrator.use_guiding_direct_light;
const bool use_mis_weights = kernel_data.integrator.use_guiding_mis_weights;
# if OPENPGL_VERSION_MINOR >= 5
kg->opgl_path_segment_storage->PrepareSamples(use_mis_weights, use_direct_light, false);
# else
kg->opgl_path_segment_storage->PrepareSamples(
false, nullptr, use_mis_weights, use_direct_light, false);
# endif
# endif
# ifdef WITH_CYCLES_DEBUG

@@ -28,6 +28,7 @@ static size_t estimate_single_state_size(const uint kernel_features)
#define KERNEL_STRUCT_ARRAY_MEMBER(parent_struct, type, name, feature) \
state_size += (kernel_features & (feature)) ? sizeof(type) : 0;
#define KERNEL_STRUCT_END(name) \
(void)array_index; \
break; \
}
#define KERNEL_STRUCT_END_ARRAY(name, cpu_array_size, gpu_array_size) \
@@ -139,6 +140,7 @@ void PathTraceWorkGPU::alloc_integrator_soa()
integrator_state_gpu_.parent_struct[array_index].name = (type *)array->device_pointer; \
}
#define KERNEL_STRUCT_END(name) \
(void)array_index; \
break; \
}
#define KERNEL_STRUCT_END_ARRAY(name, cpu_array_size, gpu_array_size) \
@@ -299,8 +301,8 @@ void PathTraceWorkGPU::render_samples(RenderStatistics &statistics,
* become busy after adding new tiles). This is especially important for the shadow catcher which
* schedules work in halves of available number of paths. */
work_tile_scheduler_.set_max_num_path_states(max_num_paths_ / 8);
work_tile_scheduler_.set_accelerated_rt((device_->get_bvh_layout_mask() & BVH_LAYOUT_OPTIX) !=
0);
work_tile_scheduler_.set_accelerated_rt(
(device_->get_bvh_layout_mask(device_scene_->data.kernel_features) & BVH_LAYOUT_OPTIX) != 0);
work_tile_scheduler_.reset(effective_buffer_params_,
start_sample,
samples_num,

@@ -55,21 +55,29 @@ void WorkTileScheduler::reset_scheduler_state()
VLOG_WORK << "Will schedule tiles of size " << tile_size_;
if (VLOG_IS_ON(3)) {
/* The logging is based on multiple tiles scheduled, ignoring overhead of multi-tile scheduling
* and purely focusing on the number of used path states. */
const int num_path_states_in_tile = tile_size_.width * tile_size_.height *
tile_size_.num_samples;
const int num_tiles = max_num_path_states_ / num_path_states_in_tile;
VLOG_WORK << "Number of unused path states: "
<< max_num_path_states_ - num_tiles * num_path_states_in_tile;
const int num_path_states_in_tile = tile_size_.width * tile_size_.height *
tile_size_.num_samples;
if (num_path_states_in_tile == 0) {
num_tiles_x_ = 0;
num_tiles_y_ = 0;
num_tiles_per_sample_range_ = 0;
}
else {
if (VLOG_IS_ON(3)) {
/* The logging is based on multiple tiles scheduled, ignoring overhead of multi-tile
* scheduling and purely focusing on the number of used path states. */
const int num_tiles = max_num_path_states_ / num_path_states_in_tile;
VLOG_WORK << "Number of unused path states: "
<< max_num_path_states_ - num_tiles * num_path_states_in_tile;
}
num_tiles_x_ = divide_up(image_size_px_.x, tile_size_.width);
num_tiles_y_ = divide_up(image_size_px_.y, tile_size_.height);
num_tiles_per_sample_range_ = divide_up(samples_num_, tile_size_.num_samples);
}
num_tiles_x_ = divide_up(image_size_px_.x, tile_size_.width);
num_tiles_y_ = divide_up(image_size_px_.y, tile_size_.height);
total_tiles_num_ = num_tiles_x_ * num_tiles_y_;
num_tiles_per_sample_range_ = divide_up(samples_num_, tile_size_.num_samples);
next_work_index_ = 0;
total_work_size_ = total_tiles_num_ * num_tiles_per_sample_range_;
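The scheduler change above guards against a tile with zero path states: in that case every tile count is set to zero instead of dividing by a zero tile dimension. A minimal sketch of the guarded computation follows; `TileSize`, `TileCounts`, and `count_tiles` are hypothetical names, and `divide_up` is assumed to be the usual round-up integer division.

```cpp
#include <cassert>

struct TileSize {
  int width, height, num_samples;
};

struct TileCounts {
  int num_tiles_x, num_tiles_y, num_tiles_per_sample_range, total_work_size;
};

// Assumed behavior of divide_up(): integer division rounded up; y must be non-zero.
int divide_up(int x, int y)
{
  return (x + y - 1) / y;
}

// Hypothetical helper mirroring the guarded branch: an empty tile yields no work at all.
TileCounts count_tiles(int image_width, int image_height, int samples_num, const TileSize &tile)
{
  TileCounts counts{0, 0, 0, 0};
  const int num_path_states_in_tile = tile.width * tile.height * tile.num_samples;
  if (num_path_states_in_tile != 0) {
    counts.num_tiles_x = divide_up(image_width, tile.width);
    counts.num_tiles_y = divide_up(image_height, tile.height);
    counts.num_tiles_per_sample_range = divide_up(samples_num, tile.num_samples);
  }
  counts.total_work_size = counts.num_tiles_x * counts.num_tiles_y *
                           counts.num_tiles_per_sample_range;
  return counts;
}
```

Since the product is zero only when at least one factor is zero, a non-zero `num_path_states_in_tile` also guarantees that each individual divisor is non-zero.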

@@ -96,10 +96,13 @@ set(SRC_KERNEL_DEVICE_ONEAPI_HEADERS
device/oneapi/compat.h
device/oneapi/context_begin.h
device/oneapi/context_end.h
device/oneapi/context_intersect_begin.h
device/oneapi/context_intersect_end.h
device/oneapi/globals.h
device/oneapi/image.h
device/oneapi/kernel.h
device/oneapi/kernel_templates.h
device/cpu/bvh.h
)
set(SRC_KERNEL_CLOSURE_HEADERS
@@ -764,7 +767,7 @@ if(WITH_CYCLES_DEVICE_ONEAPI)
# Set defaults for spir64 and spir64_gen options
if(NOT DEFINED CYCLES_ONEAPI_SYCL_OPTIONS_spir64)
set(CYCLES_ONEAPI_SYCL_OPTIONS_spir64 "-options '-ze-opt-large-register-file -ze-opt-regular-grf-kernel integrator_intersect'")
set(CYCLES_ONEAPI_SYCL_OPTIONS_spir64 "-options '-ze-opt-regular-grf-kernel integrator_intersect -ze-opt-large-grf-kernel shade -ze-opt-no-local-to-generic'")
endif()
if(NOT DEFINED CYCLES_ONEAPI_SYCL_OPTIONS_spir64_gen)
set(CYCLES_ONEAPI_SYCL_OPTIONS_spir64_gen "${CYCLES_ONEAPI_SYCL_OPTIONS_spir64}" CACHE STRING "Extra build options for spir64_gen target")
@@ -775,8 +778,6 @@ if(WITH_CYCLES_DEVICE_ONEAPI)
# Host execution won't use GPU binaries, no need to compile them.
if(WITH_CYCLES_ONEAPI_BINARIES AND NOT WITH_CYCLES_ONEAPI_HOST_TASK_EXECUTION)
# AoT binaries aren't currently reused when calling sycl::build.
list(APPEND sycl_compiler_flags -DSYCL_SKIP_KERNELS_PRELOAD)
# Iterate over all targets and their options
list(JOIN CYCLES_ONEAPI_SYCL_TARGETS "," targets_string)
list(APPEND sycl_compiler_flags -fsycl-targets=${targets_string})
@@ -798,6 +799,59 @@ if(WITH_CYCLES_DEVICE_ONEAPI)
-I"${NANOVDB_INCLUDE_DIR}")
endif()
if(WITH_CYCLES_EMBREE AND EMBREE_SYCL_SUPPORT)
list(APPEND sycl_compiler_flags
-DWITH_EMBREE
-DWITH_EMBREE_GPU
-DEMBREE_MAJOR_VERSION=${EMBREE_MAJOR_VERSION}
-I"${EMBREE_INCLUDE_DIRS}")
if(WIN32)
list(APPEND sycl_compiler_flags
-ladvapi32.lib
)
endif()
set(next_library_mode "")
foreach(library ${EMBREE_LIBRARIES})
string(TOLOWER "${library}" library_lower)
if(("${library_lower}" STREQUAL "optimized") OR
("${library_lower}" STREQUAL "debug"))
set(next_library_mode "${library_lower}")
else()
if(next_library_mode STREQUAL "")
list(APPEND EMBREE_TBB_LIBRARIES_optimized ${library})
list(APPEND EMBREE_TBB_LIBRARIES_debug ${library})
else()
list(APPEND EMBREE_TBB_LIBRARIES_${next_library_mode} ${library})
endif()
set(next_library_mode "")
endif()
endforeach()
foreach(library ${TBB_LIBRARIES})
string(TOLOWER "${library}" library_lower)
if(("${library_lower}" STREQUAL "optimized") OR
("${library_lower}" STREQUAL "debug"))
set(next_library_mode "${library_lower}")
else()
if(next_library_mode STREQUAL "")
list(APPEND EMBREE_TBB_LIBRARIES_optimized ${library})
list(APPEND EMBREE_TBB_LIBRARIES_debug ${library})
else()
list(APPEND EMBREE_TBB_LIBRARIES_${next_library_mode} ${library})
endif()
set(next_library_mode "")
endif()
endforeach()
list(APPEND sycl_compiler_flags
"$<$<CONFIG:Release>:${EMBREE_TBB_LIBRARIES_optimized}>"
"$<$<CONFIG:RelWithDebInfo>:${EMBREE_TBB_LIBRARIES_optimized}>"
"$<$<CONFIG:MinSizeRel>:${EMBREE_TBB_LIBRARIES_optimized}>"
"$<$<CONFIG:Debug>:${EMBREE_TBB_LIBRARIES_debug}>"
)
endif()
if(WITH_CYCLES_DEBUG)
list(APPEND sycl_compiler_flags -DWITH_CYCLES_DEBUG)
endif()

@@ -21,6 +21,28 @@
# define __BVH2__
#endif
#if defined(__KERNEL_ONEAPI__) && defined(WITH_EMBREE_GPU)
/* bool is apparently not tested for specialization constants:
* https://github.com/intel/llvm/blob/39d1c65272a786b2b13a6f094facfddf9408406d/sycl/test/basic_tests/SYCL-2020-spec-constants.cpp#L25-L27
* Instead of adding one more bool specialization constant, we reuse the existing embree_features
* one and use RTC_FEATURE_FLAG_NONE as the value that means Embree must not be called on the GPU.
*/
/* We set it to RTC_FEATURE_FLAG_NONE by default so AoT binaries contain MNEE and ray-trace kernels
* pre-compiled without Embree.
* Changing this default value would require updating the logic in oneapi_load_kernels(). */
static constexpr sycl::specialization_id<RTCFeatureFlags> oneapi_embree_features{
RTC_FEATURE_FLAG_NONE};
# define IF_USING_EMBREE \
if (kernel_handler.get_specialization_constant<oneapi_embree_features>() != \
RTC_FEATURE_FLAG_NONE)
# define IF_NOT_USING_EMBREE \
if (kernel_handler.get_specialization_constant<oneapi_embree_features>() == \
RTC_FEATURE_FLAG_NONE)
#else
# define IF_USING_EMBREE
# define IF_NOT_USING_EMBREE
#endif
CCL_NAMESPACE_BEGIN
#ifdef __BVH2__
@@ -74,30 +96,39 @@ ccl_device_intersect bool scene_intersect(KernelGlobals kg,
}
# ifdef __EMBREE__
if (kernel_data.device_bvh) {
return kernel_embree_intersect(kg, ray, visibility, isect);
IF_USING_EMBREE
{
if (kernel_data.device_bvh) {
return kernel_embree_intersect(kg, ray, visibility, isect);
}
}
# endif
IF_NOT_USING_EMBREE
{
# ifdef __OBJECT_MOTION__
if (kernel_data.bvh.have_motion) {
if (kernel_data.bvh.have_motion) {
# ifdef __HAIR__
if (kernel_data.bvh.have_curves) {
return bvh_intersect_hair_motion(kg, ray, isect, visibility);
}
if (kernel_data.bvh.have_curves) {
return bvh_intersect_hair_motion(kg, ray, isect, visibility);
}
# endif /* __HAIR__ */
return bvh_intersect_motion(kg, ray, isect, visibility);
}
return bvh_intersect_motion(kg, ray, isect, visibility);
}
# endif /* __OBJECT_MOTION__ */
# ifdef __HAIR__
if (kernel_data.bvh.have_curves) {
return bvh_intersect_hair(kg, ray, isect, visibility);
}
if (kernel_data.bvh.have_curves) {
return bvh_intersect_hair(kg, ray, isect, visibility);
}
# endif /* __HAIR__ */
return bvh_intersect(kg, ray, isect, visibility);
return bvh_intersect(kg, ray, isect, visibility);
}
kernel_assert(false);
return false;
}
/* Single object BVH traversal, for SSS/AO/bevel. */
@@ -129,17 +160,27 @@ ccl_device_intersect bool scene_intersect_local(KernelGlobals kg,
}
# ifdef __EMBREE__
if (kernel_data.device_bvh) {
return kernel_embree_intersect_local(kg, ray, local_isect, local_object, lcg_state, max_hits);
IF_USING_EMBREE
{
if (kernel_data.device_bvh) {
return kernel_embree_intersect_local(
kg, ray, local_isect, local_object, lcg_state, max_hits);
}
}
# endif
IF_NOT_USING_EMBREE
{
# ifdef __OBJECT_MOTION__
if (kernel_data.bvh.have_motion) {
return bvh_intersect_local_motion(kg, ray, local_isect, local_object, lcg_state, max_hits);
}
if (kernel_data.bvh.have_motion) {
return bvh_intersect_local_motion(kg, ray, local_isect, local_object, lcg_state, max_hits);
}
# endif /* __OBJECT_MOTION__ */
return bvh_intersect_local(kg, ray, local_isect, local_object, lcg_state, max_hits);
return bvh_intersect_local(kg, ray, local_isect, local_object, lcg_state, max_hits);
}
kernel_assert(false);
return false;
}
# endif
@@ -184,35 +225,44 @@ ccl_device_intersect bool scene_intersect_shadow_all(KernelGlobals kg,
}
# ifdef __EMBREE__
if (kernel_data.device_bvh) {
return kernel_embree_intersect_shadow_all(
kg, state, ray, visibility, max_hits, num_recorded_hits, throughput);
IF_USING_EMBREE
{
if (kernel_data.device_bvh) {
return kernel_embree_intersect_shadow_all(
kg, state, ray, visibility, max_hits, num_recorded_hits, throughput);
}
}
# endif
IF_NOT_USING_EMBREE
{
# ifdef __OBJECT_MOTION__
if (kernel_data.bvh.have_motion) {
if (kernel_data.bvh.have_motion) {
# ifdef __HAIR__
if (kernel_data.bvh.have_curves) {
return bvh_intersect_shadow_all_hair_motion(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
}
if (kernel_data.bvh.have_curves) {
return bvh_intersect_shadow_all_hair_motion(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
}
# endif /* __HAIR__ */
return bvh_intersect_shadow_all_motion(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
}
return bvh_intersect_shadow_all_motion(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
}
# endif /* __OBJECT_MOTION__ */
# ifdef __HAIR__
if (kernel_data.bvh.have_curves) {
return bvh_intersect_shadow_all_hair(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
}
if (kernel_data.bvh.have_curves) {
return bvh_intersect_shadow_all_hair(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
}
# endif /* __HAIR__ */
return bvh_intersect_shadow_all(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
return bvh_intersect_shadow_all(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
}
kernel_assert(false);
return false;
}
# endif /* __SHADOW_RECORD_ALL__ */
@@ -239,13 +289,28 @@ ccl_device_intersect bool scene_intersect_volume(KernelGlobals kg,
return false;
}
# ifdef __OBJECT_MOTION__
if (kernel_data.bvh.have_motion) {
return bvh_intersect_volume_motion(kg, ray, isect, visibility);
# ifdef __EMBREE__
IF_USING_EMBREE
{
if (kernel_data.device_bvh) {
return kernel_embree_intersect_volume(kg, ray, isect, visibility);
}
}
# endif
IF_NOT_USING_EMBREE
{
# ifdef __OBJECT_MOTION__
if (kernel_data.bvh.have_motion) {
return bvh_intersect_volume_motion(kg, ray, isect, visibility);
}
# endif /* __OBJECT_MOTION__ */
return bvh_intersect_volume(kg, ray, isect, visibility);
return bvh_intersect_volume(kg, ray, isect, visibility);
}
kernel_assert(false);
return false;
}
# endif /* defined(__VOLUME__) && !defined(__VOLUME_RECORD_ALL__) */
@@ -275,18 +340,27 @@ ccl_device_intersect uint scene_intersect_volume(KernelGlobals kg,
}
# ifdef __EMBREE__
if (kernel_data.device_bvh) {
return kernel_embree_intersect_volume(kg, ray, isect, max_hits, visibility);
IF_USING_EMBREE
{
if (kernel_data.device_bvh) {
return kernel_embree_intersect_volume(kg, ray, isect, max_hits, visibility);
}
}
# endif
IF_NOT_USING_EMBREE
{
# ifdef __OBJECT_MOTION__
if (kernel_data.bvh.have_motion) {
return bvh_intersect_volume_all_motion(kg, ray, isect, max_hits, visibility);
}
if (kernel_data.bvh.have_motion) {
return bvh_intersect_volume_all_motion(kg, ray, isect, max_hits, visibility);
}
# endif /* __OBJECT_MOTION__ */
return bvh_intersect_volume_all(kg, ray, isect, max_hits, visibility);
return bvh_intersect_volume_all(kg, ray, isect, max_hits, visibility);
}
kernel_assert(false);
return false;
}
# endif /* defined(__VOLUME__) && defined(__VOLUME_RECORD_ALL__) */
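The `IF_USING_EMBREE` / `IF_NOT_USING_EMBREE` macros used throughout this file expand either to an `if (...)` on a runtime-known value or to nothing, in which case the braces that follow become plain blocks that always execute. A minimal plain C++ sketch of that macro pattern is shown below. It is an illustration only: the global flag `g_use_embree` is a hypothetical stand-in for the SYCL specialization constant, and `demo_intersect` with its return codes is invented for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for the value read from the specialization constant.
static bool g_use_embree = false;

#ifdef DEMO_WITHOUT_RUNTIME_SWITCH
/* Expands to nothing: the following braces are ordinary blocks that always run. */
#  define IF_USING_EMBREE
#  define IF_NOT_USING_EMBREE
#else
#  define IF_USING_EMBREE if (g_use_embree)
#  define IF_NOT_USING_EMBREE if (!g_use_embree)
#endif

// Returns 1 for the hardware path, 2 for the software BVH path, and 0 when the
// hardware path was selected but no device BVH is available (illustrative codes).
int demo_intersect(bool has_device_bvh)
{
  IF_USING_EMBREE
  {
    if (has_device_bvh) {
      return 1;
    }
  }

  IF_NOT_USING_EMBREE
  {
    return 2;
  }

  /* Only reachable when the runtime switch selected the hardware path without a BVH. */
  return 0;
}
```

The trailing `return` plays the role of the `kernel_assert(false); return false;` lines added in the hunks above: the compiler cannot prove that one of the two guarded blocks always returns, so the function needs a final return statement.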

@@ -51,8 +51,6 @@ ccl_device_inline
int object = OBJECT_NONE;
float isect_t = ray->tmax;
int num_hits_in_instance = 0;
uint num_hits = 0;
isect_array->t = ray->tmax;
@@ -152,7 +150,6 @@ ccl_device_inline
/* Move on to next entry in intersections array. */
isect_array++;
num_hits++;
num_hits_in_instance++;
isect_array->t = isect_t;
if (num_hits == max_hits) {
return num_hits;
@@ -193,7 +190,6 @@ ccl_device_inline
/* Move on to next entry in intersections array. */
isect_array++;
num_hits++;
num_hits_in_instance++;
isect_array->t = isect_t;
if (num_hits == max_hits) {
return num_hits;
@@ -219,7 +215,6 @@ ccl_device_inline
bvh_instance_push(kg, object, ray, &P, &dir, &idir);
#endif
num_hits_in_instance = 0;
isect_array->t = isect_t;
++stack_ptr;

@@ -64,6 +64,7 @@ KERNEL_DATA_ARRAY(float2, light_background_conditional_cdf)
KERNEL_DATA_ARRAY(KernelLightTreeNode, light_tree_nodes)
KERNEL_DATA_ARRAY(KernelLightTreeEmitter, light_tree_emitters)
KERNEL_DATA_ARRAY(uint, light_to_tree)
KERNEL_DATA_ARRAY(uint, object_to_tree)
KERNEL_DATA_ARRAY(uint, object_lookup_offset)
KERNEL_DATA_ARRAY(uint, triangle_to_tree)

@@ -20,6 +20,7 @@ KERNEL_STRUCT_BEGIN(KernelBackground, background)
/* xyz store direction, w the angle. float4 instead of float3 is used
* to ensure consistent padding/alignment across devices. */
KERNEL_STRUCT_MEMBER(background, float4, sun)
KERNEL_STRUCT_MEMBER(background, int, use_sun_guiding)
/* Only shader index. */
KERNEL_STRUCT_MEMBER(background, int, surface_shader)
KERNEL_STRUCT_MEMBER(background, int, volume_shader)
@@ -39,6 +40,10 @@ KERNEL_STRUCT_MEMBER(background, int, use_mis)
KERNEL_STRUCT_MEMBER(background, int, lightgroup)
/* Light Index. */
KERNEL_STRUCT_MEMBER(background, int, light_index)
/* Padding. */
KERNEL_STRUCT_MEMBER(background, int, pad1)
KERNEL_STRUCT_MEMBER(background, int, pad2)
KERNEL_STRUCT_MEMBER(background, int, pad3)
KERNEL_STRUCT_END(KernelBackground)
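The three padding members added above presumably keep the struct size a multiple of 16 bytes. A minimal sketch of how such a layout can be checked, assuming 4-byte `float` and `int`; `Float4Sketch`, `BackgroundSketch` and `is_16_byte_multiple` are hypothetical names, and the member list is an illustrative subset rather than the real `KernelBackground`:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a device float4: four packed floats, 16 bytes.
struct Float4Sketch {
  float x, y, z, w;
};

// Illustrative subset of members; the real struct has more fields.
struct BackgroundSketch {
  Float4Sketch sun;     // 16 bytes
  int use_sun_guiding;  // 4 bytes each from here on
  int surface_shader;
  int volume_shader;
  int lightgroup;
  int light_index;
  // Explicit padding: 16 + 5 * 4 = 36 bytes so far, three ints bring it to 48.
  int pad1;
  int pad2;
  int pad3;
};

// Returns true when a struct of this size tiles cleanly in 16-byte units.
constexpr bool is_16_byte_multiple(std::size_t size)
{
  return size % 16 == 0;
}

static_assert(is_16_byte_multiple(sizeof(BackgroundSketch)),
              "struct size must be a multiple of 16 bytes");
```

A compile-time check like this would fail as soon as a member is added without adjusting the padding.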
/* BVH: own BVH2 if no native device acceleration struct used. */

View File

@@ -13,8 +13,13 @@
# include <embree3/rtcore_scene.h>
#endif
#include "kernel/device/cpu/compat.h"
#include "kernel/device/cpu/globals.h"
#ifdef __KERNEL_ONEAPI__
# include "kernel/device/oneapi/compat.h"
# include "kernel/device/oneapi/globals.h"
#else
# include "kernel/device/cpu/compat.h"
# include "kernel/device/cpu/globals.h"
#endif
#include "kernel/bvh/types.h"
#include "kernel/bvh/util.h"
@@ -33,11 +38,16 @@ using numhit_t = uint8_t;
using numhit_t = uint32_t;
#endif
#define CYCLES_EMBREE_USED_FEATURES \
(RTCFeatureFlags)(RTC_FEATURE_FLAG_TRIANGLE | RTC_FEATURE_FLAG_INSTANCE | \
RTC_FEATURE_FLAG_FILTER_FUNCTION_IN_ARGUMENTS | RTC_FEATURE_FLAG_POINT | \
RTC_FEATURE_FLAG_MOTION_BLUR | RTC_FEATURE_FLAG_ROUND_CATMULL_ROM_CURVE | \
RTC_FEATURE_FLAG_FLAT_CATMULL_ROM_CURVE)
#ifdef __KERNEL_ONEAPI__
# define CYCLES_EMBREE_USED_FEATURES \
(kernel_handler.get_specialization_constant<oneapi_embree_features>())
#else
# define CYCLES_EMBREE_USED_FEATURES \
(RTCFeatureFlags)(RTC_FEATURE_FLAG_TRIANGLE | RTC_FEATURE_FLAG_INSTANCE | \
RTC_FEATURE_FLAG_FILTER_FUNCTION_IN_ARGUMENTS | RTC_FEATURE_FLAG_POINT | \
RTC_FEATURE_FLAG_MOTION_BLUR | RTC_FEATURE_FLAG_ROUND_CATMULL_ROM_CURVE | \
RTC_FEATURE_FLAG_FLAT_CATMULL_ROM_CURVE)
#endif
#define EMBREE_IS_HAIR(x) (x & 1)
@@ -99,7 +109,9 @@ struct CCLVolumeContext
#if EMBREE_MAJOR_VERSION >= 4
KernelGlobals kg;
const Ray *ray;
# ifdef __VOLUME_RECORD_ALL__
numhit_t max_hits;
# endif
numhit_t num_hits;
#endif
Intersection *vol_isect;
@@ -252,7 +264,8 @@ ccl_device_inline void kernel_embree_convert_sss_hit(KernelGlobals kg,
* Things like recording subsurface or shadow hits for later evaluation
* as well as filtering for volume objects happen here.
* Cycles' own BVH does that directly inside the traversal calls. */
ccl_device void kernel_embree_filter_intersection_func(const RTCFilterFunctionNArguments *args)
ccl_device_forceinline void kernel_embree_filter_intersection_func_impl(
const RTCFilterFunctionNArguments *args)
{
/* Current implementation in Cycles assumes only single-ray intersection queries. */
assert(args->N == 1);
@@ -263,7 +276,11 @@ ccl_device void kernel_embree_filter_intersection_func(const RTCFilterFunctionNA
#else
CCLIntersectContext *ctx = (CCLIntersectContext *)(args->context);
#endif
#ifdef __KERNEL_ONEAPI__
KernelGlobalsGPU *kg = nullptr;
#else
const KernelGlobalsCPU *kg = ctx->kg;
#endif
const Ray *cray = ctx->ray;
if (kernel_embree_is_self_intersection(
@@ -277,7 +294,7 @@ ccl_device void kernel_embree_filter_intersection_func(const RTCFilterFunctionNA
* as well as filtering for volume objects happen here.
* Cycles' own BVH does that directly inside the traversal calls.
*/
ccl_device void kernel_embree_filter_occluded_shadow_all_func(
ccl_device_forceinline void kernel_embree_filter_occluded_shadow_all_func_impl(
const RTCFilterFunctionNArguments *args)
{
/* Current implementation in Cycles assumes only single-ray intersection queries. */
@@ -290,7 +307,11 @@ ccl_device void kernel_embree_filter_occluded_shadow_all_func(
#else
CCLIntersectContext *ctx = (CCLIntersectContext *)(args->context);
#endif
#ifdef __KERNEL_ONEAPI__
KernelGlobalsGPU *kg = nullptr;
#else
const KernelGlobalsCPU *kg = ctx->kg;
#endif
const Ray *cray = ctx->ray;
Intersection current_isect;
@@ -326,7 +347,7 @@ ccl_device void kernel_embree_filter_occluded_shadow_all_func(
}
/* Test if we need to record this transparent intersection. */
const numhit_t max_record_hits = min(ctx->max_hits, INTEGRATOR_SHADOW_ISECT_SIZE);
const numhit_t max_record_hits = min(ctx->max_hits, numhit_t(INTEGRATOR_SHADOW_ISECT_SIZE));
if (ctx->num_recorded_hits < max_record_hits) {
/* If maximum number of hits was reached, replace the intersection with the
* highest distance. We want to find the N closest intersections. */
@@ -363,7 +384,7 @@ ccl_device void kernel_embree_filter_occluded_shadow_all_func(
*args->valid = 0;
}
ccl_device_forceinline void kernel_embree_filter_occluded_local_func(
ccl_device_forceinline void kernel_embree_filter_occluded_local_func_impl(
const RTCFilterFunctionNArguments *args)
{
/* Current implementation in Cycles assumes only single-ray intersection queries. */
@@ -376,7 +397,11 @@ ccl_device_forceinline void kernel_embree_filter_occluded_local_func(
#else
CCLIntersectContext *ctx = (CCLIntersectContext *)(args->context);
#endif
#ifdef __KERNEL_ONEAPI__
KernelGlobalsGPU *kg = nullptr;
#else
const KernelGlobalsCPU *kg = ctx->kg;
#endif
const Ray *cray = ctx->ray;
/* Check if it's hitting the correct object. */
@@ -462,7 +487,7 @@ ccl_device_forceinline void kernel_embree_filter_occluded_local_func(
*args->valid = 0;
}
ccl_device_forceinline void kernel_embree_filter_occluded_volume_all_func(
ccl_device_forceinline void kernel_embree_filter_occluded_volume_all_func_impl(
const RTCFilterFunctionNArguments *args)
{
/* Current implementation in Cycles assumes only single-ray intersection queries. */
@@ -475,11 +500,17 @@ ccl_device_forceinline void kernel_embree_filter_occluded_volume_all_func(
#else
CCLIntersectContext *ctx = (CCLIntersectContext *)(args->context);
#endif
#ifdef __KERNEL_ONEAPI__
KernelGlobalsGPU *kg = nullptr;
#else
const KernelGlobalsCPU *kg = ctx->kg;
#endif
const Ray *cray = ctx->ray;
#ifdef __VOLUME_RECORD_ALL__
/* Append the intersection to the end of the array. */
if (ctx->num_hits < ctx->max_hits) {
#endif
Intersection current_isect;
kernel_embree_convert_hit(
kg, ray, hit, &current_isect, reinterpret_cast<intptr_t>(args->geometryUserPtr));
@@ -496,10 +527,17 @@ ccl_device_forceinline void kernel_embree_filter_occluded_volume_all_func(
int object_flag = kernel_data_fetch(object_flag, tri_object);
if ((object_flag & SD_OBJECT_HAS_VOLUME) == 0) {
--ctx->num_hits;
#ifndef __VOLUME_RECORD_ALL__
/* Without __VOLUME_RECORD_ALL__ we only need the first counted hit, so we
* continue tracing only if the current hit is not counted. */
*args->valid = 0;
#endif
}
#ifdef __VOLUME_RECORD_ALL__
/* This tells Embree to continue tracing. */
*args->valid = 0;
}
#endif
}
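The record-and-continue behaviour of the volume filter (the `__VOLUME_RECORD_ALL__` path) can be sketched without Embree. This is a simplified model, assuming the Embree convention that clearing `valid` rejects a hit and lets traversal continue; `VolumeQuerySketch`, `filter_volume_hit` and `trace_volume_hits` are hypothetical names, not Cycles functions:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Values a filter callback writes to `valid` (assumed Embree convention).
constexpr int kAcceptHit = -1;  // hit is accepted, an occlusion query stops
constexpr int kRejectHit = 0;   // hit is ignored, traversal continues

struct VolumeQuerySketch {
  int max_hits;
  std::vector<int> recorded;  // object ids of the volume hits recorded so far
};

// Models the filter: record volume objects until the array is full.
int filter_volume_hit(VolumeQuerySketch &query, int object_id, bool has_volume)
{
  if (static_cast<int>(query.recorded.size()) < query.max_hits) {
    if (has_volume) {
      query.recorded.push_back(object_id);
    }
    return kRejectHit;  // keep tracing to collect further hits
  }
  return kAcceptHit;  // array is full, accepting the hit ends the query
}

// Feeds candidate hits (object id, has-volume flag) through the filter.
int trace_volume_hits(VolumeQuerySketch &query,
                      const std::vector<std::pair<int, bool>> &candidates)
{
  for (const std::pair<int, bool> &candidate : candidates) {
    if (filter_volume_hit(query, candidate.first, candidate.second) == kAcceptHit) {
      break;
    }
  }
  return static_cast<int>(query.recorded.size());
}
```

Hits on objects without a volume are skipped and do not consume a slot, which mirrors the `--ctx->num_hits` line in the hunk above.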
#if EMBREE_MAJOR_VERSION < 4
@@ -513,14 +551,14 @@ ccl_device_forceinline void kernel_embree_filter_occluded_func(
switch (ctx->type) {
case CCLIntersectContext::RAY_SHADOW_ALL:
kernel_embree_filter_occluded_shadow_all_func(args);
kernel_embree_filter_occluded_shadow_all_func_impl(args);
break;
case CCLIntersectContext::RAY_LOCAL:
case CCLIntersectContext::RAY_SSS:
kernel_embree_filter_occluded_local_func(args);
kernel_embree_filter_occluded_local_func_impl(args);
break;
case CCLIntersectContext::RAY_VOLUME_ALL:
kernel_embree_filter_occluded_volume_all_func(args);
kernel_embree_filter_occluded_volume_all_func_impl(args);
break;
case CCLIntersectContext::RAY_REGULAR:
@@ -569,7 +607,63 @@ ccl_device void kernel_embree_filter_occluded_func_backface_cull(
kernel_embree_filter_occluded_func(args);
}
#endif
#ifdef __KERNEL_ONEAPI__
/* Static wrappers so we can call the callbacks from outside the ONEAPIKernelContext class. */
RTC_SYCL_INDIRECTLY_CALLABLE static void ccl_always_inline
kernel_embree_filter_intersection_func_static(const RTCFilterFunctionNArguments *args)
{
RTCHit *hit = (RTCHit *)args->hit;
CCLFirstHitContext *ctx = (CCLFirstHitContext *)(args->context);
ONEAPIKernelContext *context = static_cast<ONEAPIKernelContext *>(ctx->kg);
context->kernel_embree_filter_intersection_func_impl(args);
}
RTC_SYCL_INDIRECTLY_CALLABLE static void ccl_always_inline
kernel_embree_filter_occluded_shadow_all_func_static(const RTCFilterFunctionNArguments *args)
{
RTCHit *hit = (RTCHit *)args->hit;
CCLShadowContext *ctx = (CCLShadowContext *)(args->context);
ONEAPIKernelContext *context = static_cast<ONEAPIKernelContext *>(ctx->kg);
context->kernel_embree_filter_occluded_shadow_all_func_impl(args);
}
RTC_SYCL_INDIRECTLY_CALLABLE static void ccl_always_inline
kernel_embree_filter_occluded_local_func_static(const RTCFilterFunctionNArguments *args)
{
RTCHit *hit = (RTCHit *)args->hit;
CCLLocalContext *ctx = (CCLLocalContext *)(args->context);
ONEAPIKernelContext *context = static_cast<ONEAPIKernelContext *>(ctx->kg);
context->kernel_embree_filter_occluded_local_func_impl(args);
}
RTC_SYCL_INDIRECTLY_CALLABLE static void ccl_always_inline
kernel_embree_filter_occluded_volume_all_func_static(const RTCFilterFunctionNArguments *args)
{
RTCHit *hit = (RTCHit *)args->hit;
CCLVolumeContext *ctx = (CCLVolumeContext *)(args->context);
ONEAPIKernelContext *context = static_cast<ONEAPIKernelContext *>(ctx->kg);
context->kernel_embree_filter_occluded_volume_all_func_impl(args);
}
# define kernel_embree_filter_intersection_func \
ONEAPIKernelContext::kernel_embree_filter_intersection_func_static
# define kernel_embree_filter_occluded_shadow_all_func \
ONEAPIKernelContext::kernel_embree_filter_occluded_shadow_all_func_static
# define kernel_embree_filter_occluded_local_func \
ONEAPIKernelContext::kernel_embree_filter_occluded_local_func_static
# define kernel_embree_filter_occluded_volume_all_func \
ONEAPIKernelContext::kernel_embree_filter_occluded_volume_all_func_static
#else
# define kernel_embree_filter_intersection_func kernel_embree_filter_intersection_func_impl
# if EMBREE_MAJOR_VERSION >= 4
# define kernel_embree_filter_occluded_shadow_all_func \
kernel_embree_filter_occluded_shadow_all_func_impl
# define kernel_embree_filter_occluded_local_func kernel_embree_filter_occluded_local_func_impl
# define kernel_embree_filter_occluded_volume_all_func \
kernel_embree_filter_occluded_volume_all_func_impl
# endif
#endif
/* Scene intersection. */
@@ -583,7 +677,15 @@ ccl_device_intersect bool kernel_embree_intersect(KernelGlobals kg,
#if EMBREE_MAJOR_VERSION >= 4
CCLFirstHitContext ctx;
rtcInitRayQueryContext(&ctx);
# ifdef __KERNEL_ONEAPI__
/* NOTE(sirgienko): Cycles GPU back-ends pass NULL to KernelGlobals and
* use global device allocation (CUDA, Optix, HIP) or pass all needed data
* as a class context (Metal, oneAPI). So we need to pass this context here
* in order to have access to it later in Embree filter functions on GPU. */
ctx.kg = (KernelGlobals)this;
# else
ctx.kg = kg;
# endif
#else
CCLIntersectContext ctx(kg, CCLIntersectContext::RAY_REGULAR);
rtcInitIntersectContext(&ctx);
@@ -596,7 +698,7 @@ ccl_device_intersect bool kernel_embree_intersect(KernelGlobals kg,
#if EMBREE_MAJOR_VERSION >= 4
RTCIntersectArguments args;
rtcInitIntersectArguments(&args);
args.filter = (RTCFilterFunctionN)kernel_embree_filter_intersection_func;
args.filter = reinterpret_cast<RTCFilterFunctionN>(kernel_embree_filter_intersection_func);
args.feature_mask = CYCLES_EMBREE_USED_FEATURES;
args.context = &ctx;
rtcIntersect1(kernel_data.device_bvh, &ray_hit, &args);
@@ -625,7 +727,15 @@ ccl_device_intersect bool kernel_embree_intersect_local(KernelGlobals kg,
# if EMBREE_MAJOR_VERSION >= 4
CCLLocalContext ctx;
rtcInitRayQueryContext(&ctx);
# ifdef __KERNEL_ONEAPI__
/* NOTE(sirgienko): Cycles GPU back-ends pass NULL to KernelGlobals and
* use global device allocation (CUDA, Optix, HIP) or pass all needed data
* as a class context (Metal, oneAPI). So we need to pass this context here
* in order to have access to it later in Embree filter functions on GPU. */
ctx.kg = (KernelGlobals)this;
# else
ctx.kg = kg;
# endif
# else
CCLIntersectContext ctx(kg,
has_bvh ? CCLIntersectContext::RAY_SSS : CCLIntersectContext::RAY_LOCAL);
@@ -646,7 +756,7 @@ ccl_device_intersect bool kernel_embree_intersect_local(KernelGlobals kg,
# if EMBREE_MAJOR_VERSION >= 4
RTCOccludedArguments args;
rtcInitOccludedArguments(&args);
args.filter = (RTCFilterFunctionN)(kernel_embree_filter_occluded_local_func);
args.filter = reinterpret_cast<RTCFilterFunctionN>(kernel_embree_filter_occluded_local_func);
args.feature_mask = CYCLES_EMBREE_USED_FEATURES;
args.context = &ctx;
# endif
@@ -692,7 +802,7 @@ ccl_device_intersect bool kernel_embree_intersect_local(KernelGlobals kg,
#ifdef __SHADOW_RECORD_ALL__
ccl_device_intersect bool kernel_embree_intersect_shadow_all(KernelGlobals kg,
IntegratorShadowStateCPU *state,
IntegratorShadowState state,
ccl_private const Ray *ray,
uint visibility,
uint max_hits,
@@ -702,7 +812,15 @@ ccl_device_intersect bool kernel_embree_intersect_shadow_all(KernelGlobals kg,
# if EMBREE_MAJOR_VERSION >= 4
CCLShadowContext ctx;
rtcInitRayQueryContext(&ctx);
# ifdef __KERNEL_ONEAPI__
/* NOTE(sirgienko): Cycles GPU back-ends pass NULL to KernelGlobals and
* use global device allocation (CUDA, Optix, HIP) or pass all needed data
* as a class context (Metal, oneAPI). So we need to pass this context here
* in order to have access to it later in Embree filter functions on GPU. */
ctx.kg = (KernelGlobals)this;
# else
ctx.kg = kg;
# endif
# else
CCLIntersectContext ctx(kg, CCLIntersectContext::RAY_SHADOW_ALL);
rtcInitIntersectContext(&ctx);
@@ -718,7 +836,8 @@ ccl_device_intersect bool kernel_embree_intersect_shadow_all(KernelGlobals kg,
# if EMBREE_MAJOR_VERSION >= 4
RTCOccludedArguments args;
rtcInitOccludedArguments(&args);
args.filter = (RTCFilterFunctionN)kernel_embree_filter_occluded_shadow_all_func;
args.filter = reinterpret_cast<RTCFilterFunctionN>(
kernel_embree_filter_occluded_shadow_all_func);
args.feature_mask = CYCLES_EMBREE_USED_FEATURES;
args.context = &ctx;
rtcOccluded1(kernel_data.device_bvh, &rtc_ray, &args);
@@ -736,19 +855,31 @@ ccl_device_intersect bool kernel_embree_intersect_shadow_all(KernelGlobals kg,
ccl_device_intersect uint kernel_embree_intersect_volume(KernelGlobals kg,
ccl_private const Ray *ray,
ccl_private Intersection *isect,
# ifdef __VOLUME_RECORD_ALL__
const uint max_hits,
# endif
const uint visibility)
{
# if EMBREE_MAJOR_VERSION >= 4
CCLVolumeContext ctx;
rtcInitRayQueryContext(&ctx);
# ifdef __KERNEL_ONEAPI__
/* NOTE(sirgienko): Cycles GPU back-ends pass NULL to KernelGlobals and
* use global device allocation (CUDA, Optix, HIP) or pass all needed data
* as a class context (Metal, oneAPI). So we need to pass this context here
* in order to have access to it later in Embree filter functions on GPU. */
ctx.kg = (KernelGlobals)this;
# else
ctx.kg = kg;
# endif
# else
CCLIntersectContext ctx(kg, CCLIntersectContext::RAY_VOLUME_ALL);
rtcInitIntersectContext(&ctx);
# endif
ctx.vol_isect = isect;
# ifdef __VOLUME_RECORD_ALL__
ctx.max_hits = numhit_t(max_hits);
# endif
ctx.num_hits = numhit_t(0);
ctx.ray = ray;
RTCRay rtc_ray;
@@ -756,7 +887,8 @@ ccl_device_intersect uint kernel_embree_intersect_volume(KernelGlobals kg,
# if EMBREE_MAJOR_VERSION >= 4
RTCOccludedArguments args;
rtcInitOccludedArguments(&args);
args.filter = (RTCFilterFunctionN)kernel_embree_filter_occluded_volume_all_func;
args.filter = reinterpret_cast<RTCFilterFunctionN>(
kernel_embree_filter_occluded_volume_all_func);
args.feature_mask = CYCLES_EMBREE_USED_FEATURES;
args.context = &ctx;
rtcOccluded1(kernel_data.device_bvh, &rtc_ray, &args);
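The `_static` wrappers and the `ctx.kg = (KernelGlobals)this` assignments in this file follow a common trampoline pattern: a C-style callback recovers the owning object from a user-data pointer and forwards to a member function. A minimal sketch of that pattern, with hypothetical types (`FilterArgsSketch`, `QueryContextSketch`, `KernelContextSketch`) standing in for the Embree and Cycles ones:

```cpp
#include <cassert>

// Hypothetical stand-in for the callback argument struct.
struct FilterArgsSketch {
  void *context;  // points at the per-query context
  int *valid;     // cleared by the filter to continue traversal
};

// Plain function pointer type, as a C API would expect it.
using FilterFnSketch = void (*)(const FilterArgsSketch *);

// Hypothetical per-query context carrying a pointer back to the owner.
struct QueryContextSketch {
  void *kg;
  int num_hits;
};

class KernelContextSketch {
 public:
  int hits_seen = 0;

  // Member implementation: has access to the object's state.
  void filter_impl(const FilterArgsSketch *args)
  {
    QueryContextSketch *ctx = static_cast<QueryContextSketch *>(args->context);
    ++ctx->num_hits;
    ++hits_seen;
    *args->valid = 0;
  }

  // Static trampoline: recovers `this` from the context and forwards the call.
  static void filter_static(const FilterArgsSketch *args)
  {
    QueryContextSketch *ctx = static_cast<QueryContextSketch *>(args->context);
    static_cast<KernelContextSketch *>(ctx->kg)->filter_impl(args);
  }

  // Simulates a query that reports `candidate_hits` hits to the filter.
  int intersect(int candidate_hits)
  {
    QueryContextSketch ctx{this, 0};
    int valid = -1;
    FilterArgsSketch args{&ctx, &valid};
    FilterFnSketch filter = &KernelContextSketch::filter_static;
    for (int i = 0; i < candidate_hits; i++) {
      valid = -1;
      filter(&args);
    }
    return ctx.num_hits;
  }
};
```

The static function is needed because a pointer to a non-static member function cannot be passed where a plain function pointer is expected.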

View File

@@ -128,6 +128,12 @@ ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
}
ccl_gpu_kernel_postfix
/* Intersection kernels need access to the kernel handler for specialization constants to work
* properly. */
#ifdef __KERNEL_ONEAPI__
# include "kernel/device/oneapi/context_intersect_begin.h"
#endif
ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
ccl_gpu_kernel_signature(integrator_intersect_closest,
ccl_global const int *path_index_array,
@@ -185,6 +191,10 @@ ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
}
ccl_gpu_kernel_postfix
#ifdef __KERNEL_ONEAPI__
# include "kernel/device/oneapi/context_intersect_end.h"
#endif
ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
ccl_gpu_kernel_signature(integrator_shade_background,
ccl_global const int *path_index_array,
@@ -249,6 +259,12 @@ ccl_gpu_kernel_postfix
constant int __dummy_constant [[function_constant(Kernel_DummyConstant)]];
#endif
/* Kernels using intersections need access to the kernel handler for specialization constants to
* work properly. */
#ifdef __KERNEL_ONEAPI__
# include "kernel/device/oneapi/context_intersect_begin.h"
#endif
ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
ccl_gpu_kernel_signature(integrator_shade_surface_raytrace,
ccl_global const int *path_index_array,
@@ -287,6 +303,9 @@ ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
}
}
ccl_gpu_kernel_postfix
#ifdef __KERNEL_ONEAPI__
# include "kernel/device/oneapi/context_intersect_end.h"
#endif
ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
ccl_gpu_kernel_signature(integrator_shade_volume,

View File

@@ -5,6 +5,11 @@
#define __KERNEL_GPU__
#define __KERNEL_ONEAPI__
#define __KERNEL_64_BIT__
#ifdef WITH_EMBREE_GPU
# define __KERNEL_GPU_RAYTRACING__
#endif
#define CCL_NAMESPACE_BEGIN
#define CCL_NAMESPACE_END
@@ -57,17 +62,19 @@
#define ccl_gpu_kernel_threads(block_num_threads)
#ifndef WITH_ONEAPI_SYCL_HOST_TASK
# define ccl_gpu_kernel_signature(name, ...) \
# define __ccl_gpu_kernel_signature(name, ...) \
void oneapi_kernel_##name(KernelGlobalsGPU *ccl_restrict kg, \
size_t kernel_global_size, \
size_t kernel_local_size, \
sycl::handler &cgh, \
__VA_ARGS__) { \
(kg); \
cgh.parallel_for<class kernel_##name>( \
cgh.parallel_for( \
sycl::nd_range<1>(kernel_global_size, kernel_local_size), \
[=](sycl::nd_item<1> item) {
# define ccl_gpu_kernel_signature __ccl_gpu_kernel_signature
# define ccl_gpu_kernel_postfix \
}); \
}

View File

@@ -0,0 +1,18 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2023 Intel Corporation */
#if !defined(WITH_ONEAPI_SYCL_HOST_TASK) && defined(WITH_EMBREE_GPU)
# undef ccl_gpu_kernel_signature
# define ccl_gpu_kernel_signature(name, ...) \
void oneapi_kernel_##name(KernelGlobalsGPU *ccl_restrict kg, \
size_t kernel_global_size, \
size_t kernel_local_size, \
sycl::handler &cgh, \
__VA_ARGS__) \
{ \
(kg); \
cgh.parallel_for( \
sycl::nd_range<1>(kernel_global_size, kernel_local_size), \
[=](sycl::nd_item<1> item, sycl::kernel_handler oneapi_kernel_handler) { \
((ONEAPIKernelContext*)kg)->kernel_handler = oneapi_kernel_handler;
#endif

View File

@@ -0,0 +1,7 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2023 Intel Corporation */
#if !defined(WITH_ONEAPI_SYCL_HOST_TASK) && defined(WITH_EMBREE_GPU)
# undef ccl_gpu_kernel_signature
# define ccl_gpu_kernel_signature __ccl_gpu_kernel_signature
#endif
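The two headers above bracket a group of kernels: the "begin" header swaps in a signature macro that also captures the kernel handler, and the "end" header restores the default. A minimal, self-contained sketch of the same swap-and-restore technique in a single file; all macro and function names here are hypothetical, and an integer stands in for the handler:

```cpp
#include <cassert>

// Default signature macro: the generated function sees handler_value == 0.
#define BASE_KERNEL_SIGNATURE_SKETCH(name) \
  int kernel_##name(int x) \
  { \
    const int handler_value = 0;
#define KERNEL_SIGNATURE_SKETCH BASE_KERNEL_SIGNATURE_SKETCH
#define KERNEL_POSTFIX_SKETCH }

// Generated with the default signature.
KERNEL_SIGNATURE_SKETCH(shade_background)
  return x + handler_value;
KERNEL_POSTFIX_SKETCH

// Equivalent of the "begin" header: swap in a signature that sets up extra state.
#undef KERNEL_SIGNATURE_SKETCH
#define KERNEL_SIGNATURE_SKETCH(name) \
  int kernel_##name(int x) \
  { \
    const int handler_value = 100;

KERNEL_SIGNATURE_SKETCH(intersect_closest)
  return x + handler_value;
KERNEL_POSTFIX_SKETCH

// Equivalent of the "end" header: restore the default signature.
#undef KERNEL_SIGNATURE_SKETCH
#define KERNEL_SIGNATURE_SKETCH BASE_KERNEL_SIGNATURE_SKETCH

KERNEL_SIGNATURE_SKETCH(shade_volume)
  return x + handler_value;
KERNEL_POSTFIX_SKETCH
```

Keeping the default under a separate name is what makes the restore step possible, which matches the `__ccl_gpu_kernel_signature` alias introduced in the compat header.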

View File

@@ -31,6 +31,8 @@ typedef struct KernelGlobalsGPU {
size_t nd_item_group_range_0;
size_t nd_item_global_id_0;
size_t nd_item_global_range_0;
#else
sycl::kernel_handler kernel_handler;
#endif
} KernelGlobalsGPU;

View File

@@ -16,9 +16,22 @@
# include "kernel/device/gpu/kernel.h"
# include "device/kernel.cpp"
static OneAPIErrorCallback s_error_cb = nullptr;
static void *s_error_user_ptr = nullptr;
# ifdef WITH_EMBREE_GPU
static const RTCFeatureFlags CYCLES_ONEAPI_EMBREE_BASIC_FEATURES =
(const RTCFeatureFlags)(RTC_FEATURE_FLAG_TRIANGLE | RTC_FEATURE_FLAG_INSTANCE |
RTC_FEATURE_FLAG_FILTER_FUNCTION_IN_ARGUMENTS |
RTC_FEATURE_FLAG_POINT | RTC_FEATURE_FLAG_MOTION_BLUR);
static const RTCFeatureFlags CYCLES_ONEAPI_EMBREE_ALL_FEATURES =
(const RTCFeatureFlags)(CYCLES_ONEAPI_EMBREE_BASIC_FEATURES |
RTC_FEATURE_FLAG_ROUND_CATMULL_ROM_CURVE |
RTC_FEATURE_FLAG_FLAT_CATMULL_ROM_CURVE);
# endif
void oneapi_set_error_cb(OneAPIErrorCallback cb, void *user_ptr)
{
s_error_cb = cb;
@@ -142,15 +155,99 @@ size_t oneapi_kernel_preferred_local_size(SyclQueue *queue,
return std::min(limit_work_group_size, preferred_work_group_size);
}
bool oneapi_load_kernels(SyclQueue *queue_, const uint requested_features)
bool oneapi_kernel_is_required_for_features(const std::string &kernel_name,
const uint kernel_features)
{
if ((kernel_features & KERNEL_FEATURE_NODE_RAYTRACE) == 0 &&
kernel_name.find(device_kernel_as_string(DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE)) !=
std::string::npos)
return false;
if ((kernel_features & KERNEL_FEATURE_MNEE) == 0 &&
kernel_name.find(device_kernel_as_string(DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE)) !=
std::string::npos)
return false;
if ((kernel_features & KERNEL_FEATURE_VOLUME) == 0 &&
kernel_name.find(device_kernel_as_string(DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK)) !=
std::string::npos)
return false;
return true;
}
bool oneapi_kernel_is_raytrace_or_mnee(const std::string &kernel_name)
{
return (kernel_name.find(device_kernel_as_string(DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE)) !=
std::string::npos) ||
(kernel_name.find(device_kernel_as_string(
DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE)) != std::string::npos);
}
bool oneapi_kernel_is_using_embree(const std::string &kernel_name)
{
# ifdef WITH_EMBREE_GPU
/* MNEE and Ray-trace kernels aren't yet enabled to use Embree. */
for (int i = 0; i < (int)DEVICE_KERNEL_NUM; i++) {
DeviceKernel kernel = (DeviceKernel)i;
if (device_kernel_has_intersection(kernel)) {
if (kernel_name.find(device_kernel_as_string(kernel)) != std::string::npos) {
return !oneapi_kernel_is_raytrace_or_mnee(kernel_name);
}
}
}
# endif
return false;
}
bool oneapi_load_kernels(SyclQueue *queue_,
const uint kernel_features,
bool use_hardware_raytracing)
{
# ifdef SYCL_SKIP_KERNELS_PRELOAD
(void)queue_;
(void)kernel_features;
(void)use_hardware_raytracing;
# else
assert(queue_);
sycl::queue *queue = reinterpret_cast<sycl::queue *>(queue_);
# ifdef WITH_EMBREE_GPU
/* For best performance, we always JIT compile the kernels that are using Embree. */
if (use_hardware_raytracing) {
try {
sycl::kernel_bundle<sycl::bundle_state::input> all_kernels_bundle =
sycl::get_kernel_bundle<sycl::bundle_state::input>(queue->get_context(),
{queue->get_device()});
for (const sycl::kernel_id &kernel_id : all_kernels_bundle.get_kernel_ids()) {
const std::string &kernel_name = kernel_id.get_name();
if (!oneapi_kernel_is_required_for_features(kernel_name, kernel_features) ||
!oneapi_kernel_is_using_embree(kernel_name)) {
continue;
}
sycl::kernel_bundle<sycl::bundle_state::input> one_kernel_bundle_input =
sycl::get_kernel_bundle<sycl::bundle_state::input>(queue->get_context(), {kernel_id});
/* Hair requires embree curves support. */
if (kernel_features & KERNEL_FEATURE_HAIR) {
one_kernel_bundle_input
.set_specialization_constant<ONEAPIKernelContext::oneapi_embree_features>(
CYCLES_ONEAPI_EMBREE_ALL_FEATURES);
sycl::build(one_kernel_bundle_input);
}
else {
one_kernel_bundle_input
.set_specialization_constant<ONEAPIKernelContext::oneapi_embree_features>(
CYCLES_ONEAPI_EMBREE_BASIC_FEATURES);
sycl::build(one_kernel_bundle_input);
}
}
}
catch (sycl::exception const &e) {
if (s_error_cb) {
s_error_cb(e.what(), s_error_user_ptr);
}
return false;
}
}
# endif
try {
sycl::kernel_bundle<sycl::bundle_state::input> all_kernels_bundle =
sycl::get_kernel_bundle<sycl::bundle_state::input>(queue->get_context(),
@@ -159,27 +256,29 @@ bool oneapi_load_kernels(SyclQueue *queue_, const uint requested_features)
for (const sycl::kernel_id &kernel_id : all_kernels_bundle.get_kernel_ids()) {
const std::string &kernel_name = kernel_id.get_name();
/* NOTE(@nsirgien): Names in this conditions below should match names from
* oneapi_call macro in oneapi_enqueue_kernel below */
if (((requested_features & KERNEL_FEATURE_VOLUME) == 0) &&
kernel_name.find("oneapi_kernel_integrator_shade_volume") != std::string::npos) {
/* When HWRT is on, compilation of kernels using Embree is already handled in the previous
* block. */
if (!oneapi_kernel_is_required_for_features(kernel_name, kernel_features) ||
(use_hardware_raytracing && oneapi_kernel_is_using_embree(kernel_name))) {
continue;
}
if (((requested_features & KERNEL_FEATURE_MNEE) == 0) &&
kernel_name.find("oneapi_kernel_integrator_shade_surface_mnee") != std::string::npos) {
# ifdef WITH_EMBREE_GPU
if (oneapi_kernel_is_using_embree(kernel_name) ||
oneapi_kernel_is_raytrace_or_mnee(kernel_name)) {
sycl::kernel_bundle<sycl::bundle_state::input> one_kernel_bundle_input =
sycl::get_kernel_bundle<sycl::bundle_state::input>(queue->get_context(), {kernel_id});
one_kernel_bundle_input
.set_specialization_constant<ONEAPIKernelContext::oneapi_embree_features>(
RTC_FEATURE_FLAG_NONE);
sycl::build(one_kernel_bundle_input);
continue;
}
if (((requested_features & KERNEL_FEATURE_NODE_RAYTRACE) == 0) &&
kernel_name.find("oneapi_kernel_integrator_shade_surface_raytrace") !=
std::string::npos) {
continue;
}
sycl::kernel_bundle<sycl::bundle_state::input> one_kernel_bundle =
sycl::get_kernel_bundle<sycl::bundle_state::input>(queue->get_context(), {kernel_id});
sycl::build(one_kernel_bundle);
# endif
/* This call will ensure that AoT or cached JIT binaries are available
* for execution. It will trigger compilation if it is not already the case. */
(void)sycl::get_kernel_bundle<sycl::bundle_state::executable>(queue->get_context(),
{kernel_id});
}
}
catch (sycl::exception const &e) {
@@ -188,13 +287,14 @@ bool oneapi_load_kernels(SyclQueue *queue_, const uint requested_features)
}
return false;
}
# endif
return true;
}
bool oneapi_enqueue_kernel(KernelContext *kernel_context,
int kernel,
size_t global_size,
const uint kernel_features,
bool use_hardware_raytracing,
void **args)
{
bool success = true;
@@ -248,6 +348,21 @@ bool oneapi_enqueue_kernel(KernelContext *kernel_context,
try {
queue->submit([&](sycl::handler &cgh) {
# ifdef WITH_EMBREE_GPU
/* Spec says it has no effect if the called kernel doesn't support the below specialization
* constant but it can still trigger a recompilation, so we set it only if needed. */
if (device_kernel_has_intersection(device_kernel)) {
const RTCFeatureFlags used_embree_features = !use_hardware_raytracing ?
RTC_FEATURE_FLAG_NONE :
!(kernel_features & KERNEL_FEATURE_HAIR) ?
CYCLES_ONEAPI_EMBREE_BASIC_FEATURES :
CYCLES_ONEAPI_EMBREE_ALL_FEATURES;
cgh.set_specialization_constant<ONEAPIKernelContext::oneapi_embree_features>(
used_embree_features);
}
# else
(void)kernel_features;
# endif
switch (device_kernel) {
case DEVICE_KERNEL_INTEGRATOR_RESET: {
oneapi_call(kg, cgh, global_size, local_size, args, oneapi_kernel_integrator_reset);
@@ -549,4 +664,5 @@ bool oneapi_enqueue_kernel(KernelContext *kernel_context,
# endif
return success;
}
#endif /* WITH_ONEAPI */
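The kernel selection logic in this file reduces to two decisions: whether a kernel is needed for the requested features, and which Embree feature mask to use as the specialization constant. A simplified sketch of both, assuming hypothetical flag values and kernel-name substrings (the real code uses `device_kernel_as_string()` and `RTCFeatureFlags`):

```cpp
#include <cassert>
#include <string>

// Hypothetical feature bits, not the real KERNEL_FEATURE_* values.
constexpr unsigned FEATURE_HAIR_SKETCH = 1u << 0;
constexpr unsigned FEATURE_MNEE_SKETCH = 1u << 1;
constexpr unsigned FEATURE_NODE_RAYTRACE_SKETCH = 1u << 2;

// Hypothetical stand-in for the Embree feature masks.
enum EmbreeFeaturesSketch : unsigned {
  EMBREE_NONE_SKETCH = 0,
  EMBREE_BASIC_SKETCH = 1u << 0,
  EMBREE_CURVES_SKETCH = 1u << 1,
  EMBREE_ALL_SKETCH = EMBREE_BASIC_SKETCH | EMBREE_CURVES_SKETCH,
};

// A kernel can be skipped when its name matches a feature that was not requested.
bool kernel_is_required(const std::string &kernel_name, unsigned kernel_features)
{
  if ((kernel_features & FEATURE_MNEE_SKETCH) == 0 &&
      kernel_name.find("shade_surface_mnee") != std::string::npos) {
    return false;
  }
  if ((kernel_features & FEATURE_NODE_RAYTRACE_SKETCH) == 0 &&
      kernel_name.find("shade_surface_raytrace") != std::string::npos) {
    return false;
  }
  return true;
}

// No hardware ray tracing: no Embree features. Curves are only added when hair is used.
EmbreeFeaturesSketch select_embree_features(bool use_hardware_raytracing,
                                            unsigned kernel_features)
{
  if (!use_hardware_raytracing) {
    return EMBREE_NONE_SKETCH;
  }
  return (kernel_features & FEATURE_HAIR_SKETCH) ? EMBREE_ALL_SKETCH : EMBREE_BASIC_SKETCH;
}
```

Written as an early return plus one conditional, the selection is equivalent to the nested ternary used in `oneapi_enqueue_kernel`.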

View File

@@ -47,10 +47,14 @@ CYCLES_KERNEL_ONEAPI_EXPORT size_t oneapi_kernel_preferred_local_size(
CYCLES_KERNEL_ONEAPI_EXPORT bool oneapi_enqueue_kernel(KernelContext *context,
int kernel,
size_t global_size,
const unsigned int kernel_features,
bool use_hardware_raytracing,
void **args);
CYCLES_KERNEL_ONEAPI_EXPORT bool oneapi_load_kernels(SyclQueue *queue,
const unsigned int requested_features);
const unsigned int kernel_features,
bool use_hardware_raytracing);
# ifdef __cplusplus
}
# endif
#endif /* WITH_ONEAPI */

View File

@@ -454,8 +454,13 @@ ccl_device_forceinline bool guiding_bsdf_init(KernelGlobals kg,
ccl_private float &rand)
{
#if defined(__PATH_GUIDING__) && PATH_GUIDING_LEVEL >= 4
# if OPENPGL_VERSION_MINOR >= 5
if (kg->opgl_surface_sampling_distribution->Init(
kg->opgl_guiding_field, guiding_point3f(P), rand)) {
# else
if (kg->opgl_surface_sampling_distribution->Init(
kg->opgl_guiding_field, guiding_point3f(P), rand, true)) {
# endif
kg->opgl_surface_sampling_distribution->ApplyCosineProduct(guiding_point3f(N));
return true;
}
@@ -506,8 +511,13 @@ ccl_device_forceinline bool guiding_phase_init(KernelGlobals kg,
return false;
}
# if OPENPGL_VERSION_MINOR >= 5
if (kg->opgl_volume_sampling_distribution->Init(
kg->opgl_guiding_field, guiding_point3f(P), rand)) {
# else
if (kg->opgl_volume_sampling_distribution->Init(
kg->opgl_guiding_field, guiding_point3f(P), rand, true)) {
# endif
kg->opgl_volume_sampling_distribution->ApplySingleLobeHenyeyGreensteinProduct(guiding_vec3f(D),
g);
return true;

View File

@@ -342,7 +342,7 @@ ccl_device_forceinline void area_light_update_position(const ccl_global KernelLi
ls->D = normalize_len(ls->P - P, &ls->t);
ls->pdf = invarea;
if (klight->area.tan_half_spread > 0) {
if (klight->area.normalize_spread > 0) {
ls->eval_fac = 0.25f * invarea;
ls->eval_fac *= area_light_spread_attenuation(
ls->D, ls->Ng, klight->area.tan_half_spread, klight->area.normalize_spread);

View File

@@ -56,7 +56,7 @@ ccl_device_noinline bool light_distribution_sample(KernelGlobals kg,
const int index = light_distribution_sample(kg, randn);
const float pdf_selection = kernel_data.integrator.distribution_pdf_lights;
return light_sample<in_volume_segment>(
kg, randu, randv, time, P, bounce, path_flag, index, pdf_selection, ls);
kg, randu, randv, time, P, bounce, path_flag, index, 0, pdf_selection, ls);
}
ccl_device_inline float light_distribution_pdf_lamp(KernelGlobals kg)

View File

@@ -108,6 +108,7 @@ ccl_device_noinline bool light_sample(KernelGlobals kg,
const int bounce,
const uint32_t path_flag,
const int emitter_index,
const int object_id,
const float pdf_selection,
ccl_private LightSample *ls)
{
@@ -117,8 +118,9 @@ ccl_device_noinline bool light_sample(KernelGlobals kg,
if (kernel_data.integrator.use_light_tree) {
ccl_global const KernelLightTreeEmitter *kemitter = &kernel_data_fetch(light_tree_emitters,
emitter_index);
prim = kemitter->prim_id;
mesh_light = kemitter->mesh_light;
prim = kemitter->light.id;
mesh_light.shader_flag = kemitter->mesh_light.shader_flag;
mesh_light.object_id = object_id;
}
else
#endif

View File

@@ -438,7 +438,9 @@ ccl_device_inline float light_sample_mis_weight_forward_surface(KernelGlobals kg
const float3 N = INTEGRATOR_STATE(state, path, mis_origin_n);
uint lookup_offset = kernel_data_fetch(object_lookup_offset, sd->object);
uint prim_offset = kernel_data_fetch(object_prim_offset, sd->object);
pdf *= light_tree_pdf(kg, ray_P, N, path_flag, sd->prim - prim_offset + lookup_offset);
uint triangle = kernel_data_fetch(triangle_to_tree, sd->prim - prim_offset + lookup_offset);
pdf *= light_tree_pdf(kg, ray_P, N, path_flag, sd->object, triangle);
}
else
#endif
@@ -462,7 +464,7 @@ ccl_device_inline float light_sample_mis_weight_forward_lamp(KernelGlobals kg,
#ifdef __LIGHT_TREE__
if (kernel_data.integrator.use_light_tree) {
const float3 N = INTEGRATOR_STATE(state, path, mis_origin_n);
pdf *= light_tree_pdf(kg, P, N, path_flag, ~ls->lamp);
pdf *= light_tree_pdf(kg, P, N, path_flag, 0, kernel_data_fetch(light_to_tree, ls->lamp));
}
else
#endif
@@ -496,7 +498,8 @@ ccl_device_inline float light_sample_mis_weight_forward_background(KernelGlobals
#ifdef __LIGHT_TREE__
if (kernel_data.integrator.use_light_tree) {
const float3 N = INTEGRATOR_STATE(state, path, mis_origin_n);
pdf *= light_tree_pdf(kg, ray_P, N, path_flag, ~kernel_data.background.light_index);
uint light = kernel_data_fetch(light_to_tree, kernel_data.background.light_index);
pdf *= light_tree_pdf(kg, ray_P, N, path_flag, 0, light);
}
else
#endif
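The surface case above resolves a global primitive index to a light tree emitter through a per-object lookup offset. A minimal sketch of that index arithmetic, assuming the tables behave as their names suggest; `SceneTablesSketch` and `emitter_for_triangle` are hypothetical, and the exact semantics of the real arrays are not confirmed by this diff:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical flat tables mirroring the kernel data arrays used above.
struct SceneTablesSketch {
  std::vector<uint32_t> object_prim_offset;    // first global primitive of each object
  std::vector<uint32_t> object_lookup_offset;  // first slot of each object in the lookup table
  std::vector<uint32_t> triangle_to_tree;      // lookup slot -> emitter index in the light tree
};

// Converts a global primitive index into the emitter index for that triangle.
inline uint32_t emitter_for_triangle(const SceneTablesSketch &tables, int object, uint32_t prim)
{
  // Make the primitive index local to the object, then shift into its lookup range.
  const uint32_t local_prim = prim - tables.object_prim_offset[object];
  return tables.triangle_to_tree[local_prim + tables.object_lookup_offset[object]];
}
```

For example, with object 1 starting at global primitive 10 and lookup slot 4, primitive 11 maps to slot 5 of `triangle_to_tree`.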

View File

@@ -69,6 +69,59 @@ ccl_device float3 compute_v(
cos_phi0 * o0 + dot_o1_a * inv_len * o1;
}
ccl_device_inline bool is_light(const ccl_global KernelLightTreeEmitter *kemitter)
{
return kemitter->light.id < 0;
}
ccl_device_inline bool is_mesh(const ccl_global KernelLightTreeEmitter *kemitter)
{
return !is_light(kemitter) && kemitter->mesh_light.object_id == OBJECT_NONE;
}
ccl_device_inline bool is_triangle(const ccl_global KernelLightTreeEmitter *kemitter)
{
return !is_light(kemitter) && kemitter->mesh_light.object_id != OBJECT_NONE;
}
ccl_device_inline bool is_leaf(const ccl_global KernelLightTreeNode *knode)
{
/* The distant node is also considered a leaf node. */
return knode->type >= LIGHT_TREE_LEAF;
}
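The helper predicates above distinguish three emitter kinds from two integers: a negative id marks a light (stored as the bitwise complement of the lamp index), and otherwise the object id decides between mesh and triangle. A small sketch of that encoding, assuming `OBJECT_NONE` is a sentinel equal to `-1`; `EmitterSketch`, `EmitterKind`, `classify` and the encode/decode helpers are hypothetical:

```cpp
#include <cassert>

// Assumed sentinel for "no object"; the real OBJECT_NONE value is not shown in this diff.
constexpr int OBJECT_NONE_SKETCH = -1;

// Hypothetical flattened emitter: one id field plus the mesh light object id.
struct EmitterSketch {
  int id;
  int object_id;
};

enum class EmitterKind { Light, Mesh, Triangle };

// Lamp indices are stored complemented, so every stored lamp id is negative.
inline int encode_lamp(int lamp_index)
{
  return ~lamp_index;
}

// Complementing again recovers the original lamp index.
inline int decode_lamp(int stored_id)
{
  return ~stored_id;
}

// Mirrors is_light / is_mesh / is_triangle from the hunk above.
inline EmitterKind classify(const EmitterSketch &emitter)
{
  if (emitter.id < 0) {
    return EmitterKind::Light;
  }
  return emitter.object_id == OBJECT_NONE_SKETCH ? EmitterKind::Mesh : EmitterKind::Triangle;
}
```

Using the complement rather than negation keeps lamp index 0 representable, since `~0` is `-1`.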
template<bool in_volume_segment>
ccl_device void light_tree_to_local_space(KernelGlobals kg,
const int object_id,
ccl_private float3 &P,
ccl_private float3 &N_or_D,
ccl_private float &t)
{
const int object_flag = kernel_data_fetch(object_flag, object_id);
if (!(object_flag & SD_OBJECT_TRANSFORM_APPLIED)) {
#ifdef __OBJECT_MOTION__
Transform itfm;
object_fetch_transform_motion_test(kg, object_id, 0.5f, &itfm);
#else
const Transform itfm = object_fetch_transform(kg, object_id, OBJECT_INVERSE_TRANSFORM);
#endif
P = transform_point(&itfm, P);
if (in_volume_segment) {
/* Transform direction. */
float3 D_local = transform_direction(&itfm, N_or_D);
float scale;
N_or_D = normalize_len(D_local, &scale);
t *= scale;
}
else if (!is_zero(N_or_D)) {
/* Transform normal. */
const Transform tfm = object_fetch_transform(kg, object_id, OBJECT_TRANSFORM);
N_or_D = normalize(transform_direction_transposed(&tfm, N_or_D));
}
}
}
/* This is the general function for calculating the importance of either a cluster or an emitter.
* Both of the specialized functions obtain the necessary data before calling this function. */
template<bool in_volume_segment>
@@ -184,9 +237,8 @@ ccl_device bool compute_emitter_centroid_and_dir(KernelGlobals kg,
ccl_private float3 &centroid,
ccl_private packed_float3 &dir)
{
const int prim_id = kemitter->prim_id;
if (prim_id < 0) {
const ccl_global KernelLight *klight = &kernel_data_fetch(lights, ~prim_id);
if (is_light(kemitter)) {
const ccl_global KernelLight *klight = &kernel_data_fetch(lights, ~(kemitter->light.id));
centroid = klight->co;
switch (klight->type) {
@@ -213,19 +265,22 @@ ccl_device bool compute_emitter_centroid_and_dir(KernelGlobals kg,
}
}
else {
kernel_assert(is_triangle(kemitter));
const int object = kemitter->mesh_light.object_id;
float3 vertices[3];
triangle_world_space_vertices(kg, object, prim_id, -1.0f, vertices);
triangle_vertices(kg, kemitter->triangle.id, vertices);
centroid = (vertices[0] + vertices[1] + vertices[2]) / 3.0f;
const bool is_front_only = (kemitter->emission_sampling == EMISSION_SAMPLING_FRONT);
const bool is_back_only = (kemitter->emission_sampling == EMISSION_SAMPLING_BACK);
const bool is_front_only = (kemitter->triangle.emission_sampling == EMISSION_SAMPLING_FRONT);
const bool is_back_only = (kemitter->triangle.emission_sampling == EMISSION_SAMPLING_BACK);
if (is_front_only || is_back_only) {
dir = safe_normalize(cross(vertices[1] - vertices[0], vertices[2] - vertices[0]));
if (is_back_only) {
dir = -dir;
}
if (kernel_data_fetch(object_flag, object) & SD_OBJECT_NEGATIVE_SCALE) {
const int object_flag = kernel_data_fetch(object_flag, object);
if ((object_flag & SD_OBJECT_TRANSFORM_APPLIED) &&
(object_flag & SD_OBJECT_NEGATIVE_SCALE)) {
dir = -dir;
}
}
@@ -237,6 +292,75 @@ ccl_device bool compute_emitter_centroid_and_dir(KernelGlobals kg,
return true;
}
template<bool in_volume_segment>
ccl_device void light_tree_node_importance(KernelGlobals kg,
const float3 P,
const float3 N_or_D,
const float t,
const bool has_transmission,
const ccl_global KernelLightTreeNode *knode,
ccl_private float &max_importance,
ccl_private float &min_importance)
{
const BoundingCone bcone = knode->bcone;
const BoundingBox bbox = knode->bbox;
float3 point_to_centroid;
float cos_theta_u;
float distance;
if (knode->type == LIGHT_TREE_DISTANT) {
if (in_volume_segment) {
return;
}
point_to_centroid = -bcone.axis;
cos_theta_u = fast_cosf(bcone.theta_o);
distance = 1.0f;
}
else {
const float3 centroid = 0.5f * (bbox.min + bbox.max);
if (in_volume_segment) {
const float3 D = N_or_D;
const float3 closest_point = P + dot(centroid - P, D) * D;
/* Minimal distance of the ray to the cluster. */
distance = len(centroid - closest_point);
point_to_centroid = -compute_v(centroid, P, D, bcone.axis, t);
cos_theta_u = light_tree_cos_bounding_box_angle(bbox, closest_point, point_to_centroid);
}
else {
const float3 N = N_or_D;
const float3 bbox_extent = bbox.max - centroid;
const bool bbox_is_visible = has_transmission |
(dot(N, centroid - P) + dot(fabs(N), fabs(bbox_extent)) > 0);
/* If the node is guaranteed to be behind the surface we're sampling, and the surface is
* opaque, then we can give the node an importance of 0 as it contributes nothing to the
* surface. */
if (!bbox_is_visible) {
return;
}
point_to_centroid = normalize_len(centroid - P, &distance);
cos_theta_u = light_tree_cos_bounding_box_angle(bbox, P, point_to_centroid);
}
/* Clamp distance to half the radius of the cluster when splitting is disabled. */
distance = fmaxf(0.5f * len(centroid - bbox.max), distance);
}
/* TODO: currently max_distance = min_distance, max_importance = min_importance for the
* nodes. Do we need better weights for complex scenes? */
light_tree_importance<in_volume_segment>(N_or_D,
has_transmission,
point_to_centroid,
cos_theta_u,
bcone,
distance,
distance,
t,
knode->energy,
max_importance,
min_importance);
}
template<bool in_volume_segment>
ccl_device void light_tree_emitter_importance(KernelGlobals kg,
const float3 P,
@@ -247,11 +371,21 @@ ccl_device void light_tree_emitter_importance(KernelGlobals kg,
ccl_private float &max_importance,
ccl_private float &min_importance)
{
max_importance = 0.0f;
min_importance = 0.0f;
const ccl_global KernelLightTreeEmitter *kemitter = &kernel_data_fetch(light_tree_emitters,
emitter_index);
max_importance = 0.0f;
min_importance = 0.0f;
if (is_mesh(kemitter)) {
const ccl_global KernelLightTreeNode *knode = &kernel_data_fetch(light_tree_nodes,
kemitter->mesh.node_id);
light_tree_node_importance<in_volume_segment>(
kg, P, N_or_D, t, has_transmission, knode, max_importance, min_importance);
return;
}
BoundingCone bcone;
bcone.theta_o = kemitter->theta_o;
bcone.theta_e = kemitter->theta_e;
@@ -264,8 +398,6 @@ ccl_device void light_tree_emitter_importance(KernelGlobals kg,
return;
}
const int prim_id = kemitter->prim_id;
if (in_volume_segment) {
const float3 D = N_or_D;
/* Closest point. */
@@ -279,9 +411,15 @@ ccl_device void light_tree_emitter_importance(KernelGlobals kg,
P_c = P;
}
/* Early out if the emitter is guaranteed to be invisible. */
bool is_visible;
if (prim_id < 0) {
const ccl_global KernelLight *klight = &kernel_data_fetch(lights, ~prim_id);
if (is_triangle(kemitter)) {
is_visible = triangle_light_tree_parameters<in_volume_segment>(
kg, kemitter, centroid, P_c, N_or_D, bcone, cos_theta_u, distance, point_to_centroid);
}
else {
kernel_assert(is_light(kemitter));
const ccl_global KernelLight *klight = &kernel_data_fetch(lights, ~(kemitter->light.id));
switch (klight->type) {
/* The function templates only modify cos_theta_u when in_volume_segment = true. */
case LIGHT_SPOT:
@@ -309,10 +447,6 @@ ccl_device void light_tree_emitter_importance(KernelGlobals kg,
return;
}
}
else { /* Mesh light. */
is_visible = triangle_light_tree_parameters<in_volume_segment>(
kg, kemitter, centroid, P_c, N_or_D, bcone, cos_theta_u, distance, point_to_centroid);
}
is_visible |= has_transmission;
if (!is_visible) {
@@ -333,81 +467,31 @@ ccl_device void light_tree_emitter_importance(KernelGlobals kg,
}
template<bool in_volume_segment>
ccl_device void light_tree_node_importance(KernelGlobals kg,
const float3 P,
const float3 N_or_D,
const float t,
const bool has_transmission,
const ccl_global KernelLightTreeNode *knode,
ccl_private float &max_importance,
ccl_private float &min_importance)
ccl_device void light_tree_child_importance(KernelGlobals kg,
const float3 P,
const float3 N_or_D,
const float t,
const bool has_transmission,
const ccl_global KernelLightTreeNode *knode,
ccl_private float &max_importance,
ccl_private float &min_importance)
{
max_importance = 0.0f;
min_importance = 0.0f;
if (knode->num_emitters == 1) {
/* At a leaf node with only one emitter. */
light_tree_emitter_importance<in_volume_segment>(
kg, P, N_or_D, t, has_transmission, -knode->child_index, max_importance, min_importance);
light_tree_emitter_importance<in_volume_segment>(kg,
P,
N_or_D,
t,
has_transmission,
knode->leaf.first_emitter,
max_importance,
min_importance);
}
else if (knode->num_emitters != 0) {
const BoundingCone bcone = knode->bcone;
const BoundingBox bbox = knode->bbox;
float3 point_to_centroid;
float cos_theta_u;
float distance;
if (knode->bit_trail == 1) {
/* Distant light node. */
if (in_volume_segment) {
return;
}
point_to_centroid = -bcone.axis;
cos_theta_u = fast_cosf(bcone.theta_o);
distance = 1.0f;
}
else {
const float3 centroid = 0.5f * (bbox.min + bbox.max);
if (in_volume_segment) {
const float3 D = N_or_D;
const float3 closest_point = P + dot(centroid - P, D) * D;
/* Minimal distance of the ray to the cluster. */
distance = len(centroid - closest_point);
point_to_centroid = -compute_v(centroid, P, D, bcone.axis, t);
cos_theta_u = light_tree_cos_bounding_box_angle(bbox, closest_point, point_to_centroid);
}
else {
const float3 N = N_or_D;
const float3 bbox_extent = bbox.max - centroid;
const bool bbox_is_visible = has_transmission |
(dot(N, centroid - P) + dot(fabs(N), fabs(bbox_extent)) > 0);
/* If the node is guaranteed to be behind the surface we're sampling, and the surface is
* opaque, then we can give the node an importance of 0 as it contributes nothing to the
* surface. */
if (!bbox_is_visible) {
return;
}
point_to_centroid = normalize_len(centroid - P, &distance);
cos_theta_u = light_tree_cos_bounding_box_angle(bbox, P, point_to_centroid);
}
/* Clamp distance to half the radius of the cluster when splitting is disabled. */
distance = fmaxf(0.5f * len(centroid - bbox.max), distance);
}
/* TODO: currently max_distance = min_distance, max_importance = min_importance for the
* nodes. Do we need better weights for complex scenes? */
light_tree_importance<in_volume_segment>(N_or_D,
has_transmission,
point_to_centroid,
cos_theta_u,
bcone,
distance,
distance,
t,
knode->energy,
max_importance,
min_importance);
light_tree_node_importance<in_volume_segment>(
kg, P, N_or_D, t, has_transmission, knode, max_importance, min_importance);
}
}
@@ -440,26 +524,30 @@ ccl_device void sample_resevoir(const int current_index,
template<bool in_volume_segment>
ccl_device int light_tree_cluster_select_emitter(KernelGlobals kg,
ccl_private float &rand,
const float3 P,
const float3 N_or_D,
const float t,
ccl_private float3 &P,
ccl_private float3 &N_or_D,
ccl_private float &t,
const bool has_transmission,
const ccl_global KernelLightTreeNode *knode,
ccl_private int *node_index,
ccl_private float *pdf_factor)
{
float selected_importance[2] = {0.0f, 0.0f};
float total_importance[2] = {0.0f, 0.0f};
int selected_index = -1;
const ccl_global KernelLightTreeNode *knode = &kernel_data_fetch(light_tree_nodes, *node_index);
*node_index = -1;
/* Mark emitters with zero importance. Used for reservoir sampling when total minimum importance = 0. */
kernel_assert(knode->num_emitters <= sizeof(uint) * 8);
uint has_importance = 0;
const bool sample_max = (rand > 0.5f); /* Sampling using the maximum importance. */
rand = rand * 2.0f - float(sample_max);
if (knode->num_emitters > 1) {
rand = rand * 2.0f - float(sample_max);
}
for (int i = 0; i < knode->num_emitters; i++) {
int current_index = -knode->child_index + i;
int current_index = knode->leaf.first_emitter + i;
/* maximum importance = importance[0], minimum importance = importance[1] */
float importance[2];
light_tree_emitter_importance<in_volume_segment>(
@@ -492,7 +580,7 @@ ccl_device int light_tree_cluster_select_emitter(KernelGlobals kg,
else {
selected_index = -1;
for (int i = 0; i < knode->num_emitters; i++) {
int current_index = -knode->child_index + i;
int current_index = knode->leaf.first_emitter + i;
sample_resevoir(current_index,
float(has_importance & 1),
selected_index,
@@ -508,8 +596,24 @@ ccl_device int light_tree_cluster_select_emitter(KernelGlobals kg,
}
}
*pdf_factor = 0.5f * (selected_importance[0] / total_importance[0] +
selected_importance[1] / total_importance[1]);
*pdf_factor *= 0.5f * (selected_importance[0] / total_importance[0] +
selected_importance[1] / total_importance[1]);
const ccl_global KernelLightTreeEmitter *kemitter = &kernel_data_fetch(light_tree_emitters,
selected_index);
if (is_mesh(kemitter)) {
/* Transform ray from world to local space. */
light_tree_to_local_space<in_volume_segment>(kg, kemitter->mesh.object_id, P, N_or_D, t);
*node_index = kemitter->mesh.node_id;
const ccl_global KernelLightTreeNode *knode = &kernel_data_fetch(light_tree_nodes,
*node_index);
if (knode->type == LIGHT_TREE_INSTANCE) {
/* Switch to the node with the subtree. */
*node_index = knode->instance.reference;
}
}
return selected_index;
}
@@ -528,9 +632,9 @@ ccl_device bool get_left_probability(KernelGlobals kg,
const ccl_global KernelLightTreeNode *right = &kernel_data_fetch(light_tree_nodes, right_index);
float min_left_importance, max_left_importance, min_right_importance, max_right_importance;
light_tree_node_importance<in_volume_segment>(
light_tree_child_importance<in_volume_segment>(
kg, P, N_or_D, t, has_transmission, left, max_left_importance, min_left_importance);
light_tree_node_importance<in_volume_segment>(
light_tree_child_importance<in_volume_segment>(
kg, P, N_or_D, t, has_transmission, right, max_right_importance, min_right_importance);
const float total_max_importance = max_left_importance + max_right_importance;
@@ -556,8 +660,8 @@ ccl_device_noinline bool light_tree_sample(KernelGlobals kg,
const float randv,
const float time,
const float3 P,
const float3 N_or_D,
const float t,
float3 N_or_D,
float t,
const int shader_flags,
const int bounce,
const uint32_t path_flag,
@@ -571,28 +675,38 @@ ccl_device_noinline bool light_tree_sample(KernelGlobals kg,
float pdf_leaf = 1.0f;
float pdf_selection = 1.0f;
int selected_emitter = -1;
int object = 0;
int node_index = 0; /* Root node. */
float3 local_P = P;
/* Traverse the light tree until a leaf node is reached. */
while (true) {
const ccl_global KernelLightTreeNode *knode = &kernel_data_fetch(light_tree_nodes, node_index);
if (knode->child_index <= 0) {
if (is_leaf(knode)) {
/* At a leaf node, we pick an emitter. */
selected_emitter = light_tree_cluster_select_emitter<in_volume_segment>(
kg, randn, P, N_or_D, t, has_transmission, knode, &pdf_selection);
break;
kg, randn, local_P, N_or_D, t, has_transmission, &node_index, &pdf_selection);
if (node_index < 0) {
break;
}
else {
/* Continue with the picked mesh light. */
object = kernel_data_fetch(light_tree_emitters, selected_emitter).mesh.object_id;
continue;
}
}
/* At an interior node, the left child is directly after the parent, while the right child is
* stored as the child index. */
const int left_index = node_index + 1;
const int right_index = knode->child_index;
const int right_index = knode->inner.right_child;
float left_prob;
if (!get_left_probability<in_volume_segment>(
kg, P, N_or_D, t, has_transmission, left_index, right_index, left_prob)) {
kg, local_P, N_or_D, t, has_transmission, left_index, right_index, left_prob)) {
return false; /* Both child nodes have zero importance. */
}
@@ -610,38 +724,104 @@ ccl_device_noinline bool light_tree_sample(KernelGlobals kg,
pdf_selection *= pdf_leaf;
return light_sample<in_volume_segment>(
kg, randu, randv, time, P, bounce, path_flag, selected_emitter, pdf_selection, ls);
kg, randu, randv, time, P, bounce, path_flag, selected_emitter, object, pdf_selection, ls);
}
/* We need to be able to find the probability of selecting a given light for MIS. */
ccl_device float light_tree_pdf(
KernelGlobals kg, const float3 P, const float3 N, const int path_flag, const int emitter)
KernelGlobals kg, float3 P, float3 N, const int path_flag, const int object, const uint target)
{
const bool has_transmission = (path_flag & PATH_RAY_MIS_HAD_TRANSMISSION);
/* Target emitter info. */
const int target_emitter = (emitter >= 0) ? kernel_data_fetch(triangle_to_tree, emitter) :
kernel_data_fetch(light_to_tree, ~emitter);
ccl_global const KernelLightTreeEmitter *kemitter = &kernel_data_fetch(light_tree_emitters,
target_emitter);
const int target_leaf = kemitter->parent_index;
ccl_global const KernelLightTreeNode *kleaf = &kernel_data_fetch(light_tree_nodes, target_leaf);
uint bit_trail = kleaf->bit_trail;
int node_index = 0; /* Root node. */
ccl_global const KernelLightTreeEmitter *kemitter = &kernel_data_fetch(light_tree_emitters,
target);
int root_index, target_leaf;
uint bit_trail, target_emitter;
if (is_triangle(kemitter)) {
/* If the target is an emissive triangle, first traverse the top level tree to find the mesh
* light emitter, then traverse the subtree. */
target_emitter = kernel_data_fetch(object_to_tree, object);
ccl_global const KernelLightTreeEmitter *kmesh = &kernel_data_fetch(light_tree_emitters,
target_emitter);
target_leaf = kmesh->parent_index;
root_index = kmesh->mesh.node_id;
ccl_global const KernelLightTreeNode *kroot = &kernel_data_fetch(light_tree_nodes, root_index);
bit_trail = kroot->bit_trail;
if (kroot->type == LIGHT_TREE_INSTANCE) {
root_index = kroot->instance.reference;
}
}
else {
root_index = 0;
target_leaf = kemitter->parent_index;
bit_trail = kernel_data_fetch(light_tree_nodes, target_leaf).bit_trail;
target_emitter = target;
}
float pdf = 1.0f;
int node_index = 0;
/* Traverse the light tree until we reach the target leaf node. */
while (true) {
const ccl_global KernelLightTreeNode *knode = &kernel_data_fetch(light_tree_nodes, node_index);
if (knode->child_index <= 0) {
break;
if (is_leaf(knode)) {
kernel_assert(node_index == target_leaf);
ccl_global const KernelLightTreeNode *kleaf = &kernel_data_fetch(light_tree_nodes,
target_leaf);
/* Iterate through leaf node to find the probability of sampling the target emitter. */
float target_max_importance = 0.0f;
float target_min_importance = 0.0f;
float total_max_importance = 0.0f;
float total_min_importance = 0.0f;
int num_has_importance = 0;
for (int i = 0; i < kleaf->num_emitters; i++) {
const int emitter = kleaf->leaf.first_emitter + i;
float max_importance, min_importance;
light_tree_emitter_importance<false>(
kg, P, N, 0, has_transmission, emitter, max_importance, min_importance);
num_has_importance += (max_importance > 0);
if (emitter == target_emitter) {
target_max_importance = max_importance;
target_min_importance = min_importance;
}
total_max_importance += max_importance;
total_min_importance += min_importance;
}
if (target_max_importance > 0.0f) {
pdf *= 0.5f * (target_max_importance / total_max_importance +
(total_min_importance > 0 ? target_min_importance / total_min_importance :
1.0f / num_has_importance));
}
else {
return 0.0f;
}
if (root_index) {
/* Arrived at the mesh light. Continue with the subtree. */
float unused;
light_tree_to_local_space<false>(kg, object, P, N, unused);
node_index = root_index;
root_index = 0;
target_emitter = target;
target_leaf = kemitter->parent_index;
bit_trail = kernel_data_fetch(light_tree_nodes, target_leaf).bit_trail;
continue;
}
else {
kernel_assert(node_index == target_leaf);
return pdf;
}
}
/* Interior node. */
const int left_index = node_index + 1;
const int right_index = knode->child_index;
const int right_index = knode->inner.right_child;
float left_prob;
if (!get_left_probability<false>(
@@ -658,36 +838,6 @@ ccl_device float light_tree_pdf(
return 0.0f;
}
}
kernel_assert(node_index == target_leaf);
/* Iterate through leaf node to find the probability of sampling the target emitter. */
float target_max_importance = 0.0f;
float target_min_importance = 0.0f;
float total_max_importance = 0.0f;
float total_min_importance = 0.0f;
int num_has_importance = 0;
for (int i = 0; i < kleaf->num_emitters; i++) {
const int emitter = -kleaf->child_index + i;
float max_importance, min_importance;
light_tree_emitter_importance<false>(
kg, P, N, 0, has_transmission, emitter, max_importance, min_importance);
num_has_importance += (max_importance > 0);
if (emitter == target_emitter) {
target_max_importance = max_importance;
target_min_importance = min_importance;
}
total_max_importance += max_importance;
total_min_importance += min_importance;
}
if (target_max_importance > 0.0f) {
return pdf * 0.5f *
(target_max_importance / total_max_importance +
(total_min_importance > 0 ? target_min_importance / total_min_importance :
1.0f / num_has_importance));
}
return 0.0f;
}
CCL_NAMESPACE_END
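
The leaf step of `light_tree_pdf` above averages two selection probabilities, one from the maximum importances and one from the minimum importances, and falls back to a uniform choice among emitters with non-zero maximum importance when the minimum importances sum to zero. A minimal sketch of that arithmetic, assuming the importances are already computed and stored in plain arrays; `leaf_pdf_sketch` and the demo arrays are hypothetical and not part of Cycles.

```cpp
// Hypothetical stand-alone version of the leaf-node pdf computation.
// max_imp / min_imp hold the per-emitter importances of one leaf node.
constexpr float leaf_pdf_sketch(const float *max_imp,
                                const float *min_imp,
                                int num_emitters,
                                int target)
{
  float total_max = 0.0f;
  float total_min = 0.0f;
  int num_has_importance = 0;
  for (int i = 0; i < num_emitters; i++) {
    num_has_importance += (max_imp[i] > 0.0f);
    total_max += max_imp[i];
    total_min += min_imp[i];
  }
  /* An emitter with zero maximum importance can never be picked. */
  if (!(max_imp[target] > 0.0f)) {
    return 0.0f;
  }
  /* Uniform fallback over contributing emitters when all minimum importances are zero. */
  const float min_term = (total_min > 0.0f) ? min_imp[target] / total_min :
                                              1.0f / num_has_importance;
  return 0.5f * (max_imp[target] / total_max + min_term);
}

/* Demo data, chosen so that the results are exact in float. */
constexpr float kDemoMax[2] = {1.0f, 3.0f};
constexpr float kDemoMin[2] = {1.0f, 3.0f};
constexpr float kDemoMinZero[2] = {0.0f, 0.0f};
constexpr float kDemoMaxOneDead[2] = {0.0f, 2.0f};
```

With `kDemoMax` and `kDemoMinZero`, the second emitter gets 0.5 * (3/4 + 1/2) = 0.625: the first term comes from the maximum importances and the second from the uniform fallback.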

View File

@@ -304,9 +304,8 @@ ccl_device_forceinline bool triangle_light_tree_parameters(
cos_theta_u = FLT_MAX;
const int object = kemitter->mesh_light.object_id;
float3 vertices[3];
triangle_world_space_vertices(kg, object, kemitter->prim_id, -1.0f, vertices);
triangle_vertices(kg, kemitter->triangle.id, vertices);
bool shape_above_surface = false;
for (int i = 0; i < 3; i++) {

View File

@@ -1390,19 +1390,128 @@ ccl_device_extern void osl_noiseparams_set_impulses(ccl_private OSLNoiseOptions
res->y = n; \
res->z = n; \
} \
ccl_device_extern void name##_vv(ccl_private float3 *res, const float3 *v) \
ccl_device_extern void name##_vv(ccl_private float3 *res, ccl_private const float3 *v) \
{ \
const float n = name##_fv(v); \
res->x = n; \
res->y = n; \
res->z = n; \
} \
ccl_device_extern void name##_vvf(ccl_private float3 *res, const float3 *v, float w) \
ccl_device_extern void name##_vvf( \
ccl_private float3 *res, ccl_private const float3 *v, float w) \
{ \
const float n = name##_fvf(v, w); \
res->x = n; \
res->y = n; \
res->z = n; \
} \
ccl_device_extern void name##_dfdf(ccl_private float *res, ccl_private const float *x) \
{ \
res[0] = name##_ff(x[0]); \
res[1] = name##_ff(x[1]); \
res[2] = name##_ff(x[2]); \
} \
ccl_device_extern void name##_dfdff( \
ccl_private float *res, ccl_private const float *x, float y) \
{ \
res[0] = name##_fff(x[0], y); \
res[1] = name##_fff(x[1], y); \
res[2] = name##_fff(x[2], y); \
} \
ccl_device_extern void name##_dffdf( \
ccl_private float *res, float x, ccl_private const float *y) \
{ \
res[0] = name##_fff(x, y[0]); \
res[1] = name##_fff(x, y[1]); \
res[2] = name##_fff(x, y[2]); \
} \
ccl_device_extern void name##_dfdfdf( \
ccl_private float *res, ccl_private const float *x, ccl_private const float *y) \
{ \
res[0] = name##_fff(x[0], y[0]); \
res[1] = name##_fff(x[1], y[1]); \
res[2] = name##_fff(x[2], y[2]); \
} \
ccl_device_extern void name##_dfdv(ccl_private float *res, ccl_private const float3 *v) \
{ \
res[0] = name##_fv(&v[0]); \
res[1] = name##_fv(&v[1]); \
res[2] = name##_fv(&v[2]); \
} \
ccl_device_extern void name##_dfdvf( \
ccl_private float *res, ccl_private const float3 *v, float w) \
{ \
res[0] = name##_fvf(&v[0], w); \
res[1] = name##_fvf(&v[1], w); \
res[2] = name##_fvf(&v[2], w); \
} \
ccl_device_extern void name##_dfvdf( \
ccl_private float *res, ccl_private const float3 *v, ccl_private const float *w) \
{ \
res[0] = name##_fvf(v, w[0]); \
res[1] = name##_fvf(v, w[1]); \
res[2] = name##_fvf(v, w[2]); \
} \
ccl_device_extern void name##_dfdvdf( \
ccl_private float *res, ccl_private const float3 *v, ccl_private const float *w) \
{ \
res[0] = name##_fvf(&v[0], w[0]); \
res[1] = name##_fvf(&v[1], w[1]); \
res[2] = name##_fvf(&v[2], w[2]); \
} \
ccl_device_extern void name##_dvdf(ccl_private float3 *res, ccl_private const float *x) \
{ \
name##_vf(&res[0], x[0]); \
name##_vf(&res[1], x[1]); \
name##_vf(&res[2], x[2]); \
} \
ccl_device_extern void name##_dvdff( \
ccl_private float3 *res, ccl_private const float *x, float y) \
{ \
name##_vff(&res[0], x[0], y); \
name##_vff(&res[1], x[1], y); \
name##_vff(&res[2], x[2], y); \
} \
ccl_device_extern void name##_dvfdf( \
ccl_private float3 *res, float x, ccl_private const float *y) \
{ \
name##_vff(&res[0], x, y[0]); \
name##_vff(&res[1], x, y[1]); \
name##_vff(&res[2], x, y[2]); \
} \
ccl_device_extern void name##_dvdfdf( \
ccl_private float3 *res, ccl_private const float *x, ccl_private const float *y) \
{ \
name##_vff(&res[0], x[0], y[0]); \
name##_vff(&res[1], x[1], y[1]); \
name##_vff(&res[2], x[2], y[2]); \
} \
ccl_device_extern void name##_dvdv(ccl_private float3 *res, ccl_private const float3 *v) \
{ \
name##_vv(&res[0], &v[0]); \
name##_vv(&res[1], &v[1]); \
name##_vv(&res[2], &v[2]); \
} \
ccl_device_extern void name##_dvdvf( \
ccl_private float3 *res, ccl_private const float3 *v, float w) \
{ \
name##_vvf(&res[0], &v[0], w); \
name##_vvf(&res[1], &v[1], w); \
name##_vvf(&res[2], &v[2], w); \
} \
ccl_device_extern void name##_dvvdf( \
ccl_private float3 *res, ccl_private const float3 *v, ccl_private const float *w) \
{ \
name##_vvf(&res[0], v, w[0]); \
name##_vvf(&res[1], v, w[1]); \
name##_vvf(&res[2], v, w[2]); \
} \
ccl_device_extern void name##_dvdvdf( \
ccl_private float3 *res, ccl_private const float3 *v, ccl_private const float *w) \
{ \
name##_vvf(&res[0], &v[0], w[0]); \
name##_vvf(&res[1], &v[1], w[1]); \
name##_vvf(&res[2], &v[2], w[2]); \
}
ccl_device_forceinline float hashnoise_1d(float p)

View File

@@ -132,11 +132,11 @@ color sky_radiance_nishita(vector dir, float nishita_data[10], string filename)
/* definitions */
vector sun_dir = geographical_to_direction(sun_elevation, sun_rotation + M_PI_2);
float sun_dir_angle = precise_angle(dir, sun_dir);
float half_angular = angular_diameter / 2.0;
float half_angular = angular_diameter * 0.5;
float dir_elevation = M_PI_2 - direction[0];
/* if ray inside sun disc render it, otherwise render sky.
* alternatively, ignore the sun if we're evaluating the background texture. */
/* If the ray is inside the sun disc, render it, otherwise render the sky.
* Alternatively, ignore the sun if we're evaluating the background texture. */
if (sun_dir_angle < half_angular && sun_disc == 1 && raytype("importance_bake") != 1) {
/* get 2 pixels data */
color pixel_bottom = color(nishita_data[0], nishita_data[1], nishita_data[2]);

View File

@@ -84,8 +84,8 @@ ccl_device_inline void sample_uniform_cone(const float3 N,
ccl_device_inline float pdf_uniform_cone(const float3 N, float3 D, float angle)
{
float zMin = cosf(angle);
float z = dot(N, D);
if (z > zMin) {
float z = precise_angle(N, D);
if (z < angle) {
return M_1_2PI_F / (1.0f - zMin);
}
return 0.0f;

View File

@@ -138,12 +138,13 @@ ccl_device float3 sky_radiance_nishita(KernelGlobals kg,
/* definitions */
float3 sun_dir = geographical_to_direction(sun_elevation, sun_rotation + M_PI_2_F);
float sun_dir_angle = precise_angle(dir, sun_dir);
float half_angular = angular_diameter / 2.0f;
float half_angular = angular_diameter * 0.5f;
float dir_elevation = M_PI_2_F - direction.x;
/* if ray inside sun disc render it, otherwise render sky.
* alternatively, ignore the sun if we're evaluating the background texture. */
if (sun_disc && sun_dir_angle < half_angular && !(path_flag & PATH_RAY_IMPORTANCE_BAKE)) {
/* If the ray is inside the sun disc, render it, otherwise render the sky.
* Alternatively, ignore the sun if we're evaluating the background texture. */
if (sun_disc && sun_dir_angle < half_angular &&
!((path_flag & PATH_RAY_IMPORTANCE_BAKE) && kernel_data.background.use_sun_guiding)) {
/* get 2 pixels data */
float y;

View File

@@ -3,8 +3,9 @@
#pragma once
#if !defined(__KERNEL_GPU__) && defined(WITH_EMBREE)
# if EMBREE_MAJOR_VERSION >= 4
#if (!defined(__KERNEL_GPU__) || (defined(__KERNEL_ONEAPI__) && defined(WITH_EMBREE_GPU))) && \
defined(WITH_EMBREE)
# if EMBREE_MAJOR_VERSION == 4
# include <embree4/rtcore.h>
# include <embree4/rtcore_scene.h>
# else
@@ -78,9 +79,8 @@ CCL_NAMESPACE_BEGIN
#define __VISIBILITY_FLAG__
#define __VOLUME__
/* TODO: solve internal compiler errors and enable light tree on HIP. */
/* TODO: solve internal compiler perf issue and enable light tree on Metal/AMD. */
#if defined(__KERNEL_HIP__) || defined(__KERNEL_METAL_AMD__)
#if defined(__KERNEL_METAL_AMD__)
# undef __LIGHT_TREE__
#endif
@@ -1370,6 +1370,13 @@ using BoundingCone = struct BoundingCone {
float theta_e;
};
enum LightTreeNodeType : uint8_t {
LIGHT_TREE_INSTANCE = (1 << 0),
LIGHT_TREE_INNER = (1 << 1),
LIGHT_TREE_LEAF = (1 << 2),
LIGHT_TREE_DISTANT = (1 << 3),
};
typedef struct KernelLightTreeNode {
/* Bounding box. */
BoundingBox bbox;
@@ -1380,17 +1387,25 @@ typedef struct KernelLightTreeNode {
/* Energy. */
float energy;
/* If this is 0 or less, we're at a leaf node
* and the negative value indexes into the first child of the light array.
* Otherwise, it's an index to the node's second child. */
int child_index;
int num_emitters; /* leaf nodes need to know the number of emitters stored. */
LightTreeNodeType type;
/* Leaf nodes need to know the number of emitters stored. */
int num_emitters;
union {
struct {
int first_emitter; /* The index of the first emitter. */
} leaf;
struct {
int right_child; /* The index of the right child. */
} inner;
struct {
int reference; /* A reference to the node with the subtree. */
} instance;
};
/* Bit trail. */
uint bit_trail;
/* Padding. */
int pad;
} KernelLightTreeNode;
static_assert_align(KernelLightTreeNode, 16);
@@ -1402,10 +1417,23 @@ typedef struct KernelLightTreeEmitter {
/* Energy. */
float energy;
/* The location in the lights or triangles array. */
int prim_id;
union {
struct {
int id; /* The location in the triangles array. */
EmissionSampling emission_sampling;
} triangle;
struct {
int id; /* The location in the lights array. */
} light;
struct {
int object_id;
int node_id;
} mesh;
};
MeshLight mesh_light;
EmissionSampling emission_sampling;
/* Parent. */
int parent_index;
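
The `is_leaf()` helper in the light tree kernel tests `knode->type >= LIGHT_TREE_LEAF`, which relies on the order of the flag values in `LightTreeNodeType`: leaf and distant nodes have the two largest values. A minimal sketch of that check, with the enum copied from the hunk above and a free function that takes the type directly (an assumption made here so the example needs no kernel data structures).

```cpp
#include <cstdint>

// Enum values copied from the diff above.
enum LightTreeNodeType : uint8_t {
  LIGHT_TREE_INSTANCE = (1 << 0),
  LIGHT_TREE_INNER = (1 << 1),
  LIGHT_TREE_LEAF = (1 << 2),
  LIGHT_TREE_DISTANT = (1 << 3),
};

// Hypothetical variant of is_leaf() operating on the type value alone.
// Leaf (4) and distant (8) nodes compare >= LIGHT_TREE_LEAF;
// instance (1) and inner (2) nodes do not.
constexpr bool is_leaf_type(LightTreeNodeType type)
{
  return type >= LIGHT_TREE_LEAF;
}
```

The comparison only works while no node type with a value above `LIGHT_TREE_LEAF` needs to be treated as a non-leaf, so new enum entries would have to respect that ordering.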

View File

@@ -15,8 +15,12 @@ set(SRC
camera.cpp
colorspace.cpp
constant_fold.cpp
devicescene.cpp
film.cpp
geometry.cpp
geometry_attributes.cpp
geometry_bvh.cpp
geometry_mesh.cpp
hair.cpp
image.cpp
image_oiio.cpp
@@ -55,6 +59,7 @@ set(SRC_HEADERS
camera.h
colorspace.h
constant_fold.h
devicescene.h
film.h
geometry.h
hair.h

View File

@@ -0,0 +1,64 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#include "scene/devicescene.h"
#include "device/device.h"
#include "device/memory.h"
CCL_NAMESPACE_BEGIN
DeviceScene::DeviceScene(Device *device)
: bvh_nodes(device, "bvh_nodes", MEM_GLOBAL),
bvh_leaf_nodes(device, "bvh_leaf_nodes", MEM_GLOBAL),
object_node(device, "object_node", MEM_GLOBAL),
prim_type(device, "prim_type", MEM_GLOBAL),
prim_visibility(device, "prim_visibility", MEM_GLOBAL),
prim_index(device, "prim_index", MEM_GLOBAL),
prim_object(device, "prim_object", MEM_GLOBAL),
prim_time(device, "prim_time", MEM_GLOBAL),
tri_verts(device, "tri_verts", MEM_GLOBAL),
tri_shader(device, "tri_shader", MEM_GLOBAL),
tri_vnormal(device, "tri_vnormal", MEM_GLOBAL),
tri_vindex(device, "tri_vindex", MEM_GLOBAL),
tri_patch(device, "tri_patch", MEM_GLOBAL),
tri_patch_uv(device, "tri_patch_uv", MEM_GLOBAL),
curves(device, "curves", MEM_GLOBAL),
curve_keys(device, "curve_keys", MEM_GLOBAL),
curve_segments(device, "curve_segments", MEM_GLOBAL),
patches(device, "patches", MEM_GLOBAL),
points(device, "points", MEM_GLOBAL),
points_shader(device, "points_shader", MEM_GLOBAL),
objects(device, "objects", MEM_GLOBAL),
object_motion_pass(device, "object_motion_pass", MEM_GLOBAL),
object_motion(device, "object_motion", MEM_GLOBAL),
object_flag(device, "object_flag", MEM_GLOBAL),
object_volume_step(device, "object_volume_step", MEM_GLOBAL),
object_prim_offset(device, "object_prim_offset", MEM_GLOBAL),
camera_motion(device, "camera_motion", MEM_GLOBAL),
attributes_map(device, "attributes_map", MEM_GLOBAL),
attributes_float(device, "attributes_float", MEM_GLOBAL),
attributes_float2(device, "attributes_float2", MEM_GLOBAL),
attributes_float3(device, "attributes_float3", MEM_GLOBAL),
attributes_float4(device, "attributes_float4", MEM_GLOBAL),
attributes_uchar4(device, "attributes_uchar4", MEM_GLOBAL),
light_distribution(device, "light_distribution", MEM_GLOBAL),
lights(device, "lights", MEM_GLOBAL),
light_background_marginal_cdf(device, "light_background_marginal_cdf", MEM_GLOBAL),
light_background_conditional_cdf(device, "light_background_conditional_cdf", MEM_GLOBAL),
light_tree_nodes(device, "light_tree_nodes", MEM_GLOBAL),
light_tree_emitters(device, "light_tree_emitters", MEM_GLOBAL),
light_to_tree(device, "light_to_tree", MEM_GLOBAL),
object_to_tree(device, "object_to_tree", MEM_GLOBAL),
object_lookup_offset(device, "object_lookup_offset", MEM_GLOBAL),
triangle_to_tree(device, "triangle_to_tree", MEM_GLOBAL),
particles(device, "particles", MEM_GLOBAL),
svm_nodes(device, "svm_nodes", MEM_GLOBAL),
shaders(device, "shaders", MEM_GLOBAL),
lookup_table(device, "lookup_table", MEM_GLOBAL),
sample_pattern_lut(device, "sample_pattern_lut", MEM_GLOBAL),
ies_lights(device, "ies", MEM_GLOBAL)
{
memset((void *)&data, 0, sizeof(data));
}
CCL_NAMESPACE_END

View File

@@ -0,0 +1,101 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#ifndef __DEVICESCENE_H__
#define __DEVICESCENE_H__
#include "device/device.h"
#include "device/memory.h"
#include "util/types.h"
#include "util/vector.h"
CCL_NAMESPACE_BEGIN
class DeviceScene {
public:
/* BVH */
device_vector<int4> bvh_nodes;
device_vector<int4> bvh_leaf_nodes;
device_vector<int> object_node;
device_vector<int> prim_type;
device_vector<uint> prim_visibility;
device_vector<int> prim_index;
device_vector<int> prim_object;
device_vector<float2> prim_time;
/* mesh */
device_vector<packed_float3> tri_verts;
device_vector<uint> tri_shader;
device_vector<packed_float3> tri_vnormal;
device_vector<packed_uint3> tri_vindex;
device_vector<uint> tri_patch;
device_vector<float2> tri_patch_uv;
device_vector<KernelCurve> curves;
device_vector<float4> curve_keys;
device_vector<KernelCurveSegment> curve_segments;
device_vector<uint> patches;
/* point-cloud */
device_vector<float4> points;
device_vector<uint> points_shader;
/* objects */
device_vector<KernelObject> objects;
device_vector<Transform> object_motion_pass;
device_vector<DecomposedTransform> object_motion;
device_vector<uint> object_flag;
device_vector<float> object_volume_step;
device_vector<uint> object_prim_offset;
/* cameras */
device_vector<DecomposedTransform> camera_motion;
/* attributes */
device_vector<AttributeMap> attributes_map;
device_vector<float> attributes_float;
device_vector<float2> attributes_float2;
device_vector<packed_float3> attributes_float3;
device_vector<float4> attributes_float4;
device_vector<uchar4> attributes_uchar4;
/* lights */
device_vector<KernelLightDistribution> light_distribution;
device_vector<KernelLight> lights;
device_vector<float2> light_background_marginal_cdf;
device_vector<float2> light_background_conditional_cdf;
/* light tree */
device_vector<KernelLightTreeNode> light_tree_nodes;
device_vector<KernelLightTreeEmitter> light_tree_emitters;
device_vector<uint> light_to_tree;
device_vector<uint> object_to_tree;
device_vector<uint> object_lookup_offset;
device_vector<uint> triangle_to_tree;
/* particles */
device_vector<KernelParticle> particles;
/* shaders */
device_vector<int4> svm_nodes;
device_vector<KernelShader> shaders;
/* lookup tables */
device_vector<float> lookup_table;
/* integrator */
device_vector<float> sample_pattern_lut;
/* IES lights */
device_vector<float> ies_lights;
KernelData data;
DeviceScene(Device *device);
};
CCL_NAMESPACE_END
#endif /* __DEVICESCENE_H__ */
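Reviewer note: the `device_vector` members declared above are driven elsewhere in this diff through `alloc()`, `need_realloc()`, `tag_modified()` and `copy_to_device_if_modified()`. The following is a minimal host-only sketch of that usage pattern. `MockDeviceVector` and its members `device_copy()` and `num_copies()` are hypothetical names invented for illustration, and the semantics (for example, treating any size change as a reallocation) are assumptions, not the real Cycles `device_vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

/* Hypothetical host-only stand-in; NOT the real Cycles device_vector. */
template<typename T> class MockDeviceVector {
 public:
  /* Resize host storage and remember whether the size changed (assumed semantics). */
  T *alloc(size_t n)
  {
    if (n != host_.size()) {
      host_.assign(n, T());
      need_realloc_ = true;
    }
    return host_.data();
  }

  bool need_realloc() const { return need_realloc_; }
  void tag_modified() { modified_ = true; }

  /* "Upload" only when something changed; returns true if a copy happened. */
  bool copy_to_device_if_modified()
  {
    if (!modified_ && !need_realloc_) {
      return false;
    }
    device_ = host_; /* stands in for the host-to-device transfer */
    modified_ = false;
    need_realloc_ = false;
    ++num_copies_;
    return true;
  }

  T &operator[](size_t i)
  {
    assert(i < host_.size());
    return host_[i];
  }

  size_t size() const { return host_.size(); }
  const std::vector<T> &device_copy() const { return device_; }
  int num_copies() const { return num_copies_; }

 private:
  std::vector<T> host_;
  std::vector<T> device_;
  bool need_realloc_ = false;
  bool modified_ = false;
  int num_copies_ = 0;
};

}  // namespace sketch
```

In this sketch a second `alloc()` with an unchanged size leaves `need_realloc()` false, so only arrays explicitly tagged as modified are copied again. That is the behavior the update functions in this diff appear to rely on when they skip unmodified geometry.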

File diff suppressed because it is too large


@@ -30,6 +30,38 @@ class Shader;
class Volume;
struct PackedBVH;
/* Set of flags used to help determine what data has been modified or needs reallocation, so we
* can decide which device data to free or update. */
enum {
DEVICE_CURVE_DATA_MODIFIED = (1 << 0),
DEVICE_MESH_DATA_MODIFIED = (1 << 1),
DEVICE_POINT_DATA_MODIFIED = (1 << 2),
ATTR_FLOAT_MODIFIED = (1 << 3),
ATTR_FLOAT2_MODIFIED = (1 << 4),
ATTR_FLOAT3_MODIFIED = (1 << 5),
ATTR_FLOAT4_MODIFIED = (1 << 6),
ATTR_UCHAR4_MODIFIED = (1 << 7),
CURVE_DATA_NEED_REALLOC = (1 << 8),
MESH_DATA_NEED_REALLOC = (1 << 9),
POINT_DATA_NEED_REALLOC = (1 << 10),
ATTR_FLOAT_NEEDS_REALLOC = (1 << 11),
ATTR_FLOAT2_NEEDS_REALLOC = (1 << 12),
ATTR_FLOAT3_NEEDS_REALLOC = (1 << 13),
ATTR_FLOAT4_NEEDS_REALLOC = (1 << 14),
ATTR_UCHAR4_NEEDS_REALLOC = (1 << 15),
ATTRS_NEED_REALLOC = (ATTR_FLOAT_NEEDS_REALLOC | ATTR_FLOAT2_NEEDS_REALLOC |
ATTR_FLOAT3_NEEDS_REALLOC | ATTR_FLOAT4_NEEDS_REALLOC |
ATTR_UCHAR4_NEEDS_REALLOC),
DEVICE_MESH_DATA_NEEDS_REALLOC = (MESH_DATA_NEED_REALLOC | ATTRS_NEED_REALLOC),
DEVICE_POINT_DATA_NEEDS_REALLOC = (POINT_DATA_NEED_REALLOC | ATTRS_NEED_REALLOC),
DEVICE_CURVE_DATA_NEEDS_REALLOC = (CURVE_DATA_NEED_REALLOC | ATTRS_NEED_REALLOC),
};
/* Geometry
*
* Base class for geometric types like Mesh and Hair. */
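Reviewer note: the composite values in the enum above (for example `DEVICE_MESH_DATA_NEEDS_REALLOC`) are bitwise ORs of the single-bit flags, so a caller would typically test them with a bitwise AND. The sketch below reproduces only a subset of the flags, with the same bit positions, inside a `sketch` namespace. The helpers `any_set` and `all_set` are hypothetical and do not appear in the diff.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

/* Subset of the flags from the enum above, same bit positions. */
enum : uint32_t {
  DEVICE_MESH_DATA_MODIFIED = (1u << 1),
  MESH_DATA_NEED_REALLOC = (1u << 9),
  ATTR_FLOAT_NEEDS_REALLOC = (1u << 11),
  ATTR_FLOAT3_NEEDS_REALLOC = (1u << 13),

  /* Composite masks, reduced to the bits defined in this sketch. */
  ATTRS_NEED_REALLOC = (ATTR_FLOAT_NEEDS_REALLOC | ATTR_FLOAT3_NEEDS_REALLOC),
  DEVICE_MESH_DATA_NEEDS_REALLOC = (MESH_DATA_NEED_REALLOC | ATTRS_NEED_REALLOC),
};

/* Hypothetical helper: true if at least one bit of `mask` is set in `flags`. */
constexpr bool any_set(uint32_t flags, uint32_t mask)
{
  return (flags & mask) != 0;
}

/* Hypothetical helper: true only if every bit of `mask` is set in `flags`. */
constexpr bool all_set(uint32_t flags, uint32_t mask)
{
  return (flags & mask) == mask;
}

/* A single attribute realloc bit is enough to trigger the composite mesh mask. */
static_assert(any_set(ATTR_FLOAT3_NEEDS_REALLOC, DEVICE_MESH_DATA_NEEDS_REALLOC),
              "attribute realloc implies mesh data realloc");

}  // namespace sketch
```

The distinction matters for the composite masks: `any_set` answers "does anything in this group need work", while `all_set` would only be true when every member bit is raised.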


@@ -0,0 +1,722 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#include "bvh/bvh.h"
#include "bvh/bvh2.h"
#include "device/device.h"
#include "scene/attribute.h"
#include "scene/camera.h"
#include "scene/geometry.h"
#include "scene/hair.h"
#include "scene/light.h"
#include "scene/mesh.h"
#include "scene/object.h"
#include "scene/pointcloud.h"
#include "scene/scene.h"
#include "scene/shader.h"
#include "scene/shader_nodes.h"
#include "scene/stats.h"
#include "scene/volume.h"
#include "subd/patch_table.h"
#include "subd/split.h"
#include "kernel/osl/globals.h"
#include "util/foreach.h"
#include "util/log.h"
#include "util/progress.h"
#include "util/task.h"
CCL_NAMESPACE_BEGIN
bool Geometry::need_attribute(Scene *scene, AttributeStandard std)
{
if (std == ATTR_STD_NONE)
return false;
if (scene->need_global_attribute(std))
return true;
foreach (Node *node, used_shaders) {
Shader *shader = static_cast<Shader *>(node);
if (shader->attributes.find(std))
return true;
}
return false;
}
bool Geometry::need_attribute(Scene * /*scene*/, ustring name)
{
if (name == ustring())
return false;
foreach (Node *node, used_shaders) {
Shader *shader = static_cast<Shader *>(node);
if (shader->attributes.find(name))
return true;
}
return false;
}
AttributeRequestSet Geometry::needed_attributes()
{
AttributeRequestSet result;
foreach (Node *node, used_shaders) {
Shader *shader = static_cast<Shader *>(node);
result.add(shader->attributes);
}
return result;
}
bool Geometry::has_voxel_attributes() const
{
foreach (const Attribute &attr, attributes.attributes) {
if (attr.element == ATTR_ELEMENT_VOXEL) {
return true;
}
}
return false;
}
/* Generate a normal attribute map entry from an attribute descriptor. */
static void emit_attribute_map_entry(AttributeMap *attr_map,
size_t index,
uint64_t id,
TypeDesc type,
const AttributeDescriptor &desc)
{
attr_map[index].id = id;
attr_map[index].element = desc.element;
attr_map[index].offset = as_uint(desc.offset);
if (type == TypeDesc::TypeFloat)
attr_map[index].type = NODE_ATTR_FLOAT;
else if (type == TypeDesc::TypeMatrix)
attr_map[index].type = NODE_ATTR_MATRIX;
else if (type == TypeFloat2)
attr_map[index].type = NODE_ATTR_FLOAT2;
else if (type == TypeFloat4)
attr_map[index].type = NODE_ATTR_FLOAT4;
else if (type == TypeRGBA)
attr_map[index].type = NODE_ATTR_RGBA;
else
attr_map[index].type = NODE_ATTR_FLOAT3;
attr_map[index].flags = desc.flags;
}
/* Generate an attribute map end marker, optionally including a link to another map.
* Links are used to connect object attribute maps to mesh attribute maps. */
static void emit_attribute_map_terminator(AttributeMap *attr_map,
size_t index,
bool chain,
uint chain_link)
{
for (int j = 0; j < ATTR_PRIM_TYPES; j++) {
attr_map[index + j].id = ATTR_STD_NONE;
attr_map[index + j].element = chain; /* link is valid flag */
attr_map[index + j].offset = chain ? chain_link + j : 0; /* link to the correct sub-entry */
attr_map[index + j].type = 0;
attr_map[index + j].flags = 0;
}
}
/* Generate all necessary attribute map entries from the attribute request. */
static void emit_attribute_mapping(
AttributeMap *attr_map, size_t index, uint64_t id, AttributeRequest &req, Geometry *geom)
{
emit_attribute_map_entry(attr_map, index, id, req.type, req.desc);
if (geom->is_mesh()) {
Mesh *mesh = static_cast<Mesh *>(geom);
if (mesh->get_num_subd_faces()) {
emit_attribute_map_entry(attr_map, index + 1, id, req.subd_type, req.subd_desc);
}
}
}
void GeometryManager::update_svm_attributes(Device *,
DeviceScene *dscene,
Scene *scene,
vector<AttributeRequestSet> &geom_attributes,
vector<AttributeRequestSet> &object_attributes)
{
/* for SVM, the attributes_map table is used to lookup the offset of an
* attribute, based on a unique shader attribute id. */
/* compute array stride */
size_t attr_map_size = 0;
for (size_t i = 0; i < scene->geometry.size(); i++) {
Geometry *geom = scene->geometry[i];
geom->attr_map_offset = attr_map_size;
#ifdef WITH_OSL
size_t attr_count = 0;
foreach (AttributeRequest &req, geom_attributes[i].requests) {
if (req.std != ATTR_STD_NONE &&
scene->shader_manager->get_attribute_id(req.std) != (uint64_t)req.std)
attr_count += 2;
else
attr_count += 1;
}
#else
const size_t attr_count = geom_attributes[i].size();
#endif
attr_map_size += (attr_count + 1) * ATTR_PRIM_TYPES;
}
for (size_t i = 0; i < scene->objects.size(); i++) {
Object *object = scene->objects[i];
/* only allocate a table for the object if it actually has attributes */
if (object_attributes[i].size() == 0) {
object->attr_map_offset = 0;
}
else {
object->attr_map_offset = attr_map_size;
attr_map_size += (object_attributes[i].size() + 1) * ATTR_PRIM_TYPES;
}
}
if (attr_map_size == 0)
return;
if (!dscene->attributes_map.need_realloc()) {
return;
}
/* create attribute map */
AttributeMap *attr_map = dscene->attributes_map.alloc(attr_map_size);
memset(attr_map, 0, dscene->attributes_map.size() * sizeof(*attr_map));
for (size_t i = 0; i < scene->geometry.size(); i++) {
Geometry *geom = scene->geometry[i];
AttributeRequestSet &attributes = geom_attributes[i];
/* set geometry attributes */
size_t index = geom->attr_map_offset;
foreach (AttributeRequest &req, attributes.requests) {
uint64_t id;
if (req.std == ATTR_STD_NONE)
id = scene->shader_manager->get_attribute_id(req.name);
else
id = scene->shader_manager->get_attribute_id(req.std);
emit_attribute_mapping(attr_map, index, id, req, geom);
index += ATTR_PRIM_TYPES;
#ifdef WITH_OSL
/* Some standard attributes are explicitly referenced via their standard ID, so add those
* again in case they were added under a different attribute ID. */
if (req.std != ATTR_STD_NONE && id != (uint64_t)req.std) {
emit_attribute_mapping(attr_map, index, (uint64_t)req.std, req, geom);
index += ATTR_PRIM_TYPES;
}
#endif
}
emit_attribute_map_terminator(attr_map, index, false, 0);
}
for (size_t i = 0; i < scene->objects.size(); i++) {
Object *object = scene->objects[i];
AttributeRequestSet &attributes = object_attributes[i];
/* set object attributes */
if (attributes.size() > 0) {
size_t index = object->attr_map_offset;
foreach (AttributeRequest &req, attributes.requests) {
uint64_t id;
if (req.std == ATTR_STD_NONE)
id = scene->shader_manager->get_attribute_id(req.name);
else
id = scene->shader_manager->get_attribute_id(req.std);
emit_attribute_mapping(attr_map, index, id, req, object->geometry);
index += ATTR_PRIM_TYPES;
}
emit_attribute_map_terminator(attr_map, index, true, object->geometry->attr_map_offset);
}
}
/* copy to device */
dscene->attributes_map.copy_to_device();
}
void GeometryManager::update_attribute_element_offset(Geometry *geom,
device_vector<float> &attr_float,
size_t &attr_float_offset,
device_vector<float2> &attr_float2,
size_t &attr_float2_offset,
device_vector<packed_float3> &attr_float3,
size_t &attr_float3_offset,
device_vector<float4> &attr_float4,
size_t &attr_float4_offset,
device_vector<uchar4> &attr_uchar4,
size_t &attr_uchar4_offset,
Attribute *mattr,
AttributePrimitive prim,
TypeDesc &type,
AttributeDescriptor &desc)
{
if (mattr) {
/* store element and type */
desc.element = mattr->element;
desc.flags = mattr->flags;
type = mattr->type;
/* store attribute data in arrays */
size_t size = mattr->element_size(geom, prim);
AttributeElement &element = desc.element;
int &offset = desc.offset;
if (mattr->element == ATTR_ELEMENT_VOXEL) {
/* store slot in offset value */
ImageHandle &handle = mattr->data_voxel();
offset = handle.svm_slot();
}
else if (mattr->element == ATTR_ELEMENT_CORNER_BYTE) {
uchar4 *data = mattr->data_uchar4();
offset = attr_uchar4_offset;
assert(attr_uchar4.size() >= offset + size);
if (mattr->modified) {
for (size_t k = 0; k < size; k++) {
attr_uchar4[offset + k] = data[k];
}
attr_uchar4.tag_modified();
}
attr_uchar4_offset += size;
}
else if (mattr->type == TypeDesc::TypeFloat) {
float *data = mattr->data_float();
offset = attr_float_offset;
assert(attr_float.size() >= offset + size);
if (mattr->modified) {
for (size_t k = 0; k < size; k++) {
attr_float[offset + k] = data[k];
}
attr_float.tag_modified();
}
attr_float_offset += size;
}
else if (mattr->type == TypeFloat2) {
float2 *data = mattr->data_float2();
offset = attr_float2_offset;
assert(attr_float2.size() >= offset + size);
if (mattr->modified) {
for (size_t k = 0; k < size; k++) {
attr_float2[offset + k] = data[k];
}
attr_float2.tag_modified();
}
attr_float2_offset += size;
}
else if (mattr->type == TypeDesc::TypeMatrix) {
Transform *tfm = mattr->data_transform();
offset = attr_float4_offset;
assert(attr_float4.size() >= offset + size * 3);
if (mattr->modified) {
for (size_t k = 0; k < size * 3; k++) {
attr_float4[offset + k] = (&tfm->x)[k];
}
attr_float4.tag_modified();
}
attr_float4_offset += size * 3;
}
else if (mattr->type == TypeFloat4 || mattr->type == TypeRGBA) {
float4 *data = mattr->data_float4();
offset = attr_float4_offset;
assert(attr_float4.size() >= offset + size);
if (mattr->modified) {
for (size_t k = 0; k < size; k++) {
attr_float4[offset + k] = data[k];
}
attr_float4.tag_modified();
}
attr_float4_offset += size;
}
else {
float3 *data = mattr->data_float3();
offset = attr_float3_offset;
assert(attr_float3.size() >= offset + size);
if (mattr->modified) {
for (size_t k = 0; k < size; k++) {
attr_float3[offset + k] = data[k];
}
attr_float3.tag_modified();
}
attr_float3_offset += size;
}
/* mesh vertex/curve index is global, not per object, so we sneak
* a correction for that in here */
if (geom->is_mesh()) {
Mesh *mesh = static_cast<Mesh *>(geom);
if (mesh->subdivision_type == Mesh::SUBDIVISION_CATMULL_CLARK &&
desc.flags & ATTR_SUBDIVIDED) {
/* Indices for subdivided attributes are retrieved
* from patch table so no need for correction here. */
}
else if (element == ATTR_ELEMENT_VERTEX)
offset -= mesh->vert_offset;
else if (element == ATTR_ELEMENT_VERTEX_MOTION)
offset -= mesh->vert_offset;
else if (element == ATTR_ELEMENT_FACE) {
if (prim == ATTR_PRIM_GEOMETRY)
offset -= mesh->prim_offset;
else
offset -= mesh->face_offset;
}
else if (element == ATTR_ELEMENT_CORNER || element == ATTR_ELEMENT_CORNER_BYTE) {
if (prim == ATTR_PRIM_GEOMETRY)
offset -= 3 * mesh->prim_offset;
else
offset -= mesh->corner_offset;
}
}
else if (geom->is_hair()) {
Hair *hair = static_cast<Hair *>(geom);
if (element == ATTR_ELEMENT_CURVE)
offset -= hair->prim_offset;
else if (element == ATTR_ELEMENT_CURVE_KEY)
offset -= hair->curve_key_offset;
else if (element == ATTR_ELEMENT_CURVE_KEY_MOTION)
offset -= hair->curve_key_offset;
}
else if (geom->is_pointcloud()) {
if (element == ATTR_ELEMENT_VERTEX)
offset -= geom->prim_offset;
else if (element == ATTR_ELEMENT_VERTEX_MOTION)
offset -= geom->prim_offset;
}
}
else {
/* attribute not found */
desc.element = ATTR_ELEMENT_NONE;
desc.offset = 0;
}
}
static void update_attribute_element_size(Geometry *geom,
Attribute *mattr,
AttributePrimitive prim,
size_t *attr_float_size,
size_t *attr_float2_size,
size_t *attr_float3_size,
size_t *attr_float4_size,
size_t *attr_uchar4_size)
{
if (mattr) {
size_t size = mattr->element_size(geom, prim);
if (mattr->element == ATTR_ELEMENT_VOXEL) {
/* pass */
}
else if (mattr->element == ATTR_ELEMENT_CORNER_BYTE) {
*attr_uchar4_size += size;
}
else if (mattr->type == TypeDesc::TypeFloat) {
*attr_float_size += size;
}
else if (mattr->type == TypeFloat2) {
*attr_float2_size += size;
}
else if (mattr->type == TypeDesc::TypeMatrix) {
*attr_float4_size += size * 4;
}
else if (mattr->type == TypeFloat4 || mattr->type == TypeRGBA) {
*attr_float4_size += size;
}
else {
*attr_float3_size += size;
}
}
}
void GeometryManager::device_update_attributes(Device *device,
DeviceScene *dscene,
Scene *scene,
Progress &progress)
{
progress.set_status("Updating Mesh", "Computing attributes");
/* gather per mesh requested attributes. as meshes may have multiple
* shaders assigned, this merges the requested attributes that have
* been set per shader by the shader manager */
vector<AttributeRequestSet> geom_attributes(scene->geometry.size());
for (size_t i = 0; i < scene->geometry.size(); i++) {
Geometry *geom = scene->geometry[i];
geom->index = i;
scene->need_global_attributes(geom_attributes[i]);
foreach (Node *node, geom->get_used_shaders()) {
Shader *shader = static_cast<Shader *>(node);
geom_attributes[i].add(shader->attributes);
}
if (geom->is_hair() && static_cast<Hair *>(geom)->need_shadow_transparency()) {
geom_attributes[i].add(ATTR_STD_SHADOW_TRANSPARENCY);
}
}
/* convert object attributes to use the same data structures as geometry ones */
vector<AttributeRequestSet> object_attributes(scene->objects.size());
vector<AttributeSet> object_attribute_values;
object_attribute_values.reserve(scene->objects.size());
for (size_t i = 0; i < scene->objects.size(); i++) {
Object *object = scene->objects[i];
Geometry *geom = object->geometry;
size_t geom_idx = geom->index;
assert(geom_idx < scene->geometry.size() && scene->geometry[geom_idx] == geom);
object_attribute_values.push_back(AttributeSet(geom, ATTR_PRIM_GEOMETRY));
AttributeRequestSet &geom_requests = geom_attributes[geom_idx];
AttributeRequestSet &attributes = object_attributes[i];
AttributeSet &values = object_attribute_values[i];
for (size_t j = 0; j < object->attributes.size(); j++) {
ParamValue &param = object->attributes[j];
/* add attributes that are requested and not already handled by the mesh */
if (geom_requests.find(param.name()) && !geom->attributes.find(param.name())) {
attributes.add(param.name());
Attribute *attr = values.add(param.name(), param.type(), ATTR_ELEMENT_OBJECT);
assert(param.datasize() == attr->buffer.size());
memcpy(attr->buffer.data(), param.data(), param.datasize());
}
}
}
/* mesh attributes are stored in a single array per data type. here we fill
* those arrays, and set the offset and element type to create attribute
* maps next */
/* Pre-allocate attributes to avoid array re-allocation, which would
* take 2x the overall attribute memory usage.
*/
size_t attr_float_size = 0;
size_t attr_float2_size = 0;
size_t attr_float3_size = 0;
size_t attr_float4_size = 0;
size_t attr_uchar4_size = 0;
for (size_t i = 0; i < scene->geometry.size(); i++) {
Geometry *geom = scene->geometry[i];
AttributeRequestSet &attributes = geom_attributes[i];
foreach (AttributeRequest &req, attributes.requests) {
Attribute *attr = geom->attributes.find(req);
update_attribute_element_size(geom,
attr,
ATTR_PRIM_GEOMETRY,
&attr_float_size,
&attr_float2_size,
&attr_float3_size,
&attr_float4_size,
&attr_uchar4_size);
if (geom->is_mesh()) {
Mesh *mesh = static_cast<Mesh *>(geom);
Attribute *subd_attr = mesh->subd_attributes.find(req);
update_attribute_element_size(mesh,
subd_attr,
ATTR_PRIM_SUBD,
&attr_float_size,
&attr_float2_size,
&attr_float3_size,
&attr_float4_size,
&attr_uchar4_size);
}
}
}
for (size_t i = 0; i < scene->objects.size(); i++) {
Object *object = scene->objects[i];
foreach (Attribute &attr, object_attribute_values[i].attributes) {
update_attribute_element_size(object->geometry,
&attr,
ATTR_PRIM_GEOMETRY,
&attr_float_size,
&attr_float2_size,
&attr_float3_size,
&attr_float4_size,
&attr_uchar4_size);
}
}
dscene->attributes_float.alloc(attr_float_size);
dscene->attributes_float2.alloc(attr_float2_size);
dscene->attributes_float3.alloc(attr_float3_size);
dscene->attributes_float4.alloc(attr_float4_size);
dscene->attributes_uchar4.alloc(attr_uchar4_size);
/* The order of those flags needs to match that of AttrKernelDataType. */
const bool attributes_need_realloc[AttrKernelDataType::NUM] = {
dscene->attributes_float.need_realloc(),
dscene->attributes_float2.need_realloc(),
dscene->attributes_float3.need_realloc(),
dscene->attributes_float4.need_realloc(),
dscene->attributes_uchar4.need_realloc(),
};
size_t attr_float_offset = 0;
size_t attr_float2_offset = 0;
size_t attr_float3_offset = 0;
size_t attr_float4_offset = 0;
size_t attr_uchar4_offset = 0;
/* Fill in attributes. */
for (size_t i = 0; i < scene->geometry.size(); i++) {
Geometry *geom = scene->geometry[i];
AttributeRequestSet &attributes = geom_attributes[i];
/* todo: we now store std and name attributes from requests even if
* they actually refer to the same mesh attributes, optimize */
foreach (AttributeRequest &req, attributes.requests) {
Attribute *attr = geom->attributes.find(req);
if (attr) {
/* force a copy if we need to reallocate all the data */
attr->modified |= attributes_need_realloc[Attribute::kernel_type(*attr)];
}
update_attribute_element_offset(geom,
dscene->attributes_float,
attr_float_offset,
dscene->attributes_float2,
attr_float2_offset,
dscene->attributes_float3,
attr_float3_offset,
dscene->attributes_float4,
attr_float4_offset,
dscene->attributes_uchar4,
attr_uchar4_offset,
attr,
ATTR_PRIM_GEOMETRY,
req.type,
req.desc);
if (geom->is_mesh()) {
Mesh *mesh = static_cast<Mesh *>(geom);
Attribute *subd_attr = mesh->subd_attributes.find(req);
if (subd_attr) {
/* force a copy if we need to reallocate all the data */
subd_attr->modified |= attributes_need_realloc[Attribute::kernel_type(*subd_attr)];
}
update_attribute_element_offset(mesh,
dscene->attributes_float,
attr_float_offset,
dscene->attributes_float2,
attr_float2_offset,
dscene->attributes_float3,
attr_float3_offset,
dscene->attributes_float4,
attr_float4_offset,
dscene->attributes_uchar4,
attr_uchar4_offset,
subd_attr,
ATTR_PRIM_SUBD,
req.subd_type,
req.subd_desc);
}
if (progress.get_cancel())
return;
}
}
for (size_t i = 0; i < scene->objects.size(); i++) {
Object *object = scene->objects[i];
AttributeRequestSet &attributes = object_attributes[i];
AttributeSet &values = object_attribute_values[i];
foreach (AttributeRequest &req, attributes.requests) {
Attribute *attr = values.find(req);
if (attr) {
attr->modified |= attributes_need_realloc[Attribute::kernel_type(*attr)];
}
update_attribute_element_offset(object->geometry,
dscene->attributes_float,
attr_float_offset,
dscene->attributes_float2,
attr_float2_offset,
dscene->attributes_float3,
attr_float3_offset,
dscene->attributes_float4,
attr_float4_offset,
dscene->attributes_uchar4,
attr_uchar4_offset,
attr,
ATTR_PRIM_GEOMETRY,
req.type,
req.desc);
/* object attributes don't care about subdivision */
req.subd_type = req.type;
req.subd_desc = req.desc;
if (progress.get_cancel())
return;
}
}
/* create attribute lookup maps */
if (scene->shader_manager->use_osl())
update_osl_globals(device, scene);
update_svm_attributes(device, dscene, scene, geom_attributes, object_attributes);
if (progress.get_cancel())
return;
/* copy to device */
progress.set_status("Updating Mesh", "Copying Attributes to device");
dscene->attributes_float.copy_to_device_if_modified();
dscene->attributes_float2.copy_to_device_if_modified();
dscene->attributes_float3.copy_to_device_if_modified();
dscene->attributes_float4.copy_to_device_if_modified();
dscene->attributes_uchar4.copy_to_device_if_modified();
if (progress.get_cancel())
return;
/* After mesh attributes and patch tables have been copied to device memory,
* we need to update offsets in the objects. */
scene->object_manager->device_update_geom_offsets(device, dscene, scene);
}
CCL_NAMESPACE_END
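Reviewer note: `device_update_attributes` above follows a two-pass scheme. It first sums the element counts of all requested attributes, allocates each typed array once, and then fills the arrays while advancing a running offset per type. The sketch below shows the same idea for a single `float` array. `Packed` and `pack_two_pass` are hypothetical names, and plain `std::vector` stands in for the Cycles containers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

/* Hypothetical result type: one shared array plus the start of each item in it. */
struct Packed {
  std::vector<float> data;
  std::vector<size_t> offsets;
};

/* Two-pass packing: count first, allocate once, then copy with a running offset. */
inline Packed pack_two_pass(const std::vector<std::vector<float>> &items)
{
  Packed out;

  /* Pass 1: compute the total size so the array is allocated exactly once. */
  size_t total = 0;
  for (const std::vector<float> &item : items) {
    total += item.size();
  }
  out.data.resize(total);
  out.offsets.reserve(items.size());

  /* Pass 2: copy each item at the current offset, then advance the offset. */
  size_t offset = 0;
  for (const std::vector<float> &item : items) {
    assert(out.data.size() >= offset + item.size());
    out.offsets.push_back(offset);
    for (size_t k = 0; k < item.size(); k++) {
      out.data[offset + k] = item[k];
    }
    offset += item.size();
  }
  return out;
}

}  // namespace sketch
```

Counting first avoids growing the array while it is being filled, which is the reallocation cost the pre-allocation comment in the function above refers to. The recorded offsets play the role of `desc.offset` before the per-geometry corrections are subtracted.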


@@ -0,0 +1,196 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#include "bvh/bvh.h"
#include "bvh/bvh2.h"
#include "device/device.h"
#include "scene/attribute.h"
#include "scene/camera.h"
#include "scene/geometry.h"
#include "scene/hair.h"
#include "scene/light.h"
#include "scene/mesh.h"
#include "scene/object.h"
#include "scene/pointcloud.h"
#include "scene/scene.h"
#include "scene/shader.h"
#include "scene/shader_nodes.h"
#include "scene/stats.h"
#include "scene/volume.h"
#include "subd/patch_table.h"
#include "subd/split.h"
#include "kernel/osl/globals.h"
#include "util/foreach.h"
#include "util/log.h"
#include "util/progress.h"
#include "util/task.h"
CCL_NAMESPACE_BEGIN
void Geometry::compute_bvh(Device *device,
DeviceScene *dscene,
SceneParams *params,
Progress *progress,
size_t n,
size_t total)
{
if (progress->get_cancel())
return;
compute_bounds();
const BVHLayout bvh_layout = BVHParams::best_bvh_layout(
params->bvh_layout, device->get_bvh_layout_mask(dscene->data.kernel_features));
if (need_build_bvh(bvh_layout)) {
string msg = "Updating Geometry BVH ";
if (name.empty())
msg += string_printf("%u/%u", (uint)(n + 1), (uint)total);
else
msg += string_printf("%s %u/%u", name.c_str(), (uint)(n + 1), (uint)total);
Object object;
/* Ensure all visibility bits are set at the geometry level BVH. Actual
* visibility is tested in the object level BVH. */
object.set_is_shadow_catcher(true);
object.set_visibility(~0);
object.set_geometry(this);
vector<Geometry *> geometry;
geometry.push_back(this);
vector<Object *> objects;
objects.push_back(&object);
if (bvh && !need_update_rebuild) {
progress->set_status(msg, "Refitting BVH");
bvh->replace_geometry(geometry, objects);
device->build_bvh(bvh, *progress, true);
}
else {
progress->set_status(msg, "Building BVH");
BVHParams bparams;
bparams.use_spatial_split = params->use_bvh_spatial_split;
bparams.use_compact_structure = params->use_bvh_compact_structure;
bparams.bvh_layout = bvh_layout;
bparams.use_unaligned_nodes = dscene->data.bvh.have_curves &&
params->use_bvh_unaligned_nodes;
bparams.num_motion_triangle_steps = params->num_bvh_time_steps;
bparams.num_motion_curve_steps = params->num_bvh_time_steps;
bparams.num_motion_point_steps = params->num_bvh_time_steps;
bparams.bvh_type = params->bvh_type;
bparams.curve_subdivisions = params->curve_subdivisions();
delete bvh;
bvh = BVH::create(bparams, geometry, objects, device);
MEM_GUARDED_CALL(progress, device->build_bvh, bvh, *progress, false);
}
}
need_update_rebuild = false;
need_update_bvh_for_offset = false;
}
void GeometryManager::device_update_bvh(Device *device,
DeviceScene *dscene,
Scene *scene,
Progress &progress)
{
/* bvh build */
progress.set_status("Updating Scene BVH", "Building");
BVHParams bparams;
bparams.top_level = true;
bparams.bvh_layout = BVHParams::best_bvh_layout(
scene->params.bvh_layout, device->get_bvh_layout_mask(dscene->data.kernel_features));
bparams.use_spatial_split = scene->params.use_bvh_spatial_split;
bparams.use_unaligned_nodes = dscene->data.bvh.have_curves &&
scene->params.use_bvh_unaligned_nodes;
bparams.num_motion_triangle_steps = scene->params.num_bvh_time_steps;
bparams.num_motion_curve_steps = scene->params.num_bvh_time_steps;
bparams.num_motion_point_steps = scene->params.num_bvh_time_steps;
bparams.bvh_type = scene->params.bvh_type;
bparams.curve_subdivisions = scene->params.curve_subdivisions();
VLOG_INFO << "Using " << bvh_layout_name(bparams.bvh_layout) << " layout.";
const bool can_refit = scene->bvh != nullptr &&
(bparams.bvh_layout == BVHLayout::BVH_LAYOUT_OPTIX ||
bparams.bvh_layout == BVHLayout::BVH_LAYOUT_METAL);
BVH *bvh = scene->bvh;
if (!scene->bvh) {
bvh = scene->bvh = BVH::create(bparams, scene->geometry, scene->objects, device);
}
device->build_bvh(bvh, progress, can_refit);
if (progress.get_cancel()) {
return;
}
const bool has_bvh2_layout = (bparams.bvh_layout == BVH_LAYOUT_BVH2);
PackedBVH pack;
if (has_bvh2_layout) {
pack = std::move(static_cast<BVH2 *>(bvh)->pack);
}
else {
pack.root_index = -1;
}
/* copy to device */
progress.set_status("Updating Scene BVH", "Copying BVH to device");
/* When using BVH2, we always have to copy/update the data as its layout is dependent on the
* BVH's leaf nodes which may be different when the objects or vertices move. */
if (pack.nodes.size()) {
dscene->bvh_nodes.steal_data(pack.nodes);
dscene->bvh_nodes.copy_to_device();
}
if (pack.leaf_nodes.size()) {
dscene->bvh_leaf_nodes.steal_data(pack.leaf_nodes);
dscene->bvh_leaf_nodes.copy_to_device();
}
if (pack.object_node.size()) {
dscene->object_node.steal_data(pack.object_node);
dscene->object_node.copy_to_device();
}
if (pack.prim_type.size()) {
dscene->prim_type.steal_data(pack.prim_type);
dscene->prim_type.copy_to_device();
}
if (pack.prim_visibility.size()) {
dscene->prim_visibility.steal_data(pack.prim_visibility);
dscene->prim_visibility.copy_to_device();
}
if (pack.prim_index.size()) {
dscene->prim_index.steal_data(pack.prim_index);
dscene->prim_index.copy_to_device();
}
if (pack.prim_object.size()) {
dscene->prim_object.steal_data(pack.prim_object);
dscene->prim_object.copy_to_device();
}
if (pack.prim_time.size()) {
dscene->prim_time.steal_data(pack.prim_time);
dscene->prim_time.copy_to_device();
}
dscene->data.bvh.root = pack.root_index;
dscene->data.bvh.use_bvh_steps = (scene->params.num_bvh_time_steps != 0);
dscene->data.bvh.curve_subdivisions = scene->params.curve_subdivisions();
/* The scene handle is set in 'CPUDevice::const_copy_to' and 'OptiXDevice::const_copy_to' */
dscene->data.device_bvh = 0;
}
CCL_NAMESPACE_END
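Reviewer note: `device_update_bvh` above repeats one pattern for each packed array: if the array is non-empty, take ownership of its storage with `steal_data()` and then call `copy_to_device()`. A minimal sketch of that pattern follows. `HostBuffer` and `upload_if_present` are hypothetical, `std::vector` move assignment stands in for the real `steal_data`, and the upload is only counted, not performed.

```cpp
#include <cassert>
#include <utility>
#include <vector>

namespace sketch {

/* Hypothetical stand-in for a device array with host-side storage. */
template<typename T> struct HostBuffer {
  std::vector<T> host;
  int uploads = 0;

  /* Take ownership of the source storage instead of copying it. */
  void steal_data(std::vector<T> &from)
  {
    host = std::move(from);
    from.clear(); /* leave the moved-from vector in a known empty state */
  }

  /* Stands in for the host-to-device transfer; only counts calls here. */
  void copy_to_device() { ++uploads; }
};

/* Steal and upload only when the packed array actually holds data. */
template<typename T> bool upload_if_present(HostBuffer<T> &dst, std::vector<T> &src)
{
  if (src.empty()) {
    return false;
  }
  dst.steal_data(src);
  dst.copy_to_device();
  return true;
}

}  // namespace sketch
```

Moving the storage avoids holding two host copies of large BVH arrays at once. The empty check mirrors the `if (pack.nodes.size())` guards above, which leave the device arrays untouched for layouts that produce no packed data.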


@@ -0,0 +1,223 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#include "bvh/bvh.h"
#include "bvh/bvh2.h"
#include "device/device.h"
#include "scene/attribute.h"
#include "scene/camera.h"
#include "scene/geometry.h"
#include "scene/hair.h"
#include "scene/light.h"
#include "scene/mesh.h"
#include "scene/object.h"
#include "scene/osl.h"
#include "scene/pointcloud.h"
#include "scene/scene.h"
#include "scene/shader.h"
#include "scene/shader_nodes.h"
#include "scene/stats.h"
#include "scene/volume.h"
#include "subd/patch_table.h"
#include "subd/split.h"
#ifdef WITH_OSL
# include "kernel/osl/globals.h"
#endif
#include "util/foreach.h"
#include "util/log.h"
#include "util/progress.h"
#include "util/task.h"
CCL_NAMESPACE_BEGIN
void GeometryManager::device_update_mesh(Device *,
DeviceScene *dscene,
Scene *scene,
Progress &progress)
{
/* Count. */
size_t vert_size = 0;
size_t tri_size = 0;
size_t curve_key_size = 0;
size_t curve_size = 0;
size_t curve_segment_size = 0;
size_t point_size = 0;
size_t patch_size = 0;
foreach (Geometry *geom, scene->geometry) {
if (geom->geometry_type == Geometry::MESH || geom->geometry_type == Geometry::VOLUME) {
Mesh *mesh = static_cast<Mesh *>(geom);
vert_size += mesh->verts.size();
tri_size += mesh->num_triangles();
if (mesh->get_num_subd_faces()) {
Mesh::SubdFace last = mesh->get_subd_face(mesh->get_num_subd_faces() - 1);
patch_size += (last.ptex_offset + last.num_ptex_faces()) * 8;
/* patch tables are stored in same array so include them in patch_size */
if (mesh->patch_table) {
mesh->patch_table_offset = patch_size;
patch_size += mesh->patch_table->total_size();
}
}
}
else if (geom->is_hair()) {
Hair *hair = static_cast<Hair *>(geom);
curve_key_size += hair->get_curve_keys().size();
curve_size += hair->num_curves();
curve_segment_size += hair->num_segments();
}
else if (geom->is_pointcloud()) {
PointCloud *pointcloud = static_cast<PointCloud *>(geom);
point_size += pointcloud->num_points();
}
}
  /* Fill in all the arrays. */
  if (tri_size != 0) {
    /* normals */
    progress.set_status("Updating Mesh", "Computing normals");
    packed_float3 *tri_verts = dscene->tri_verts.alloc(vert_size);
    uint *tri_shader = dscene->tri_shader.alloc(tri_size);
    packed_float3 *vnormal = dscene->tri_vnormal.alloc(vert_size);
    packed_uint3 *tri_vindex = dscene->tri_vindex.alloc(tri_size);
    uint *tri_patch = dscene->tri_patch.alloc(tri_size);
    float2 *tri_patch_uv = dscene->tri_patch_uv.alloc(vert_size);
    const bool copy_all_data = dscene->tri_shader.need_realloc() ||
                               dscene->tri_vindex.need_realloc() ||
                               dscene->tri_vnormal.need_realloc() ||
                               dscene->tri_patch.need_realloc() ||
                               dscene->tri_patch_uv.need_realloc();
    foreach (Geometry *geom, scene->geometry) {
      if (geom->geometry_type == Geometry::MESH || geom->geometry_type == Geometry::VOLUME) {
        Mesh *mesh = static_cast<Mesh *>(geom);
        if (mesh->shader_is_modified() || mesh->smooth_is_modified() ||
            mesh->triangles_is_modified() || copy_all_data) {
          mesh->pack_shaders(scene, &tri_shader[mesh->prim_offset]);
        }
        if (mesh->verts_is_modified() || copy_all_data) {
          mesh->pack_normals(&vnormal[mesh->vert_offset]);
        }
        if (mesh->verts_is_modified() || mesh->triangles_is_modified() ||
            mesh->vert_patch_uv_is_modified() || copy_all_data) {
          mesh->pack_verts(&tri_verts[mesh->vert_offset],
                           &tri_vindex[mesh->prim_offset],
                           &tri_patch[mesh->prim_offset],
                           &tri_patch_uv[mesh->vert_offset]);
        }
        if (progress.get_cancel())
          return;
      }
    }
    /* vertex coordinates */
    progress.set_status("Updating Mesh", "Copying Mesh to device");
    dscene->tri_verts.copy_to_device_if_modified();
    dscene->tri_shader.copy_to_device_if_modified();
    dscene->tri_vnormal.copy_to_device_if_modified();
    dscene->tri_vindex.copy_to_device_if_modified();
    dscene->tri_patch.copy_to_device_if_modified();
    dscene->tri_patch_uv.copy_to_device_if_modified();
  }
  if (curve_segment_size != 0) {
    progress.set_status("Updating Mesh", "Copying Curves to device");
    float4 *curve_keys = dscene->curve_keys.alloc(curve_key_size);
    KernelCurve *curves = dscene->curves.alloc(curve_size);
    KernelCurveSegment *curve_segments = dscene->curve_segments.alloc(curve_segment_size);
    const bool copy_all_data = dscene->curve_keys.need_realloc() ||
                               dscene->curves.need_realloc() ||
                               dscene->curve_segments.need_realloc();
    foreach (Geometry *geom, scene->geometry) {
      if (geom->is_hair()) {
        Hair *hair = static_cast<Hair *>(geom);
        bool curve_keys_co_modified = hair->curve_radius_is_modified() ||
                                      hair->curve_keys_is_modified();
        bool curve_data_modified = hair->curve_shader_is_modified() ||
                                   hair->curve_first_key_is_modified();
        if (!curve_keys_co_modified && !curve_data_modified && !copy_all_data) {
          continue;
        }
        hair->pack_curves(scene,
                          &curve_keys[hair->curve_key_offset],
                          &curves[hair->prim_offset],
                          &curve_segments[hair->curve_segment_offset]);
        if (progress.get_cancel())
          return;
      }
    }
    dscene->curve_keys.copy_to_device_if_modified();
    dscene->curves.copy_to_device_if_modified();
    dscene->curve_segments.copy_to_device_if_modified();
  }
  if (point_size != 0) {
    progress.set_status("Updating Mesh", "Copying Point clouds to device");
    float4 *points = dscene->points.alloc(point_size);
    uint *points_shader = dscene->points_shader.alloc(point_size);
    foreach (Geometry *geom, scene->geometry) {
      if (geom->is_pointcloud()) {
        PointCloud *pointcloud = static_cast<PointCloud *>(geom);
        pointcloud->pack(
            scene, &points[pointcloud->prim_offset], &points_shader[pointcloud->prim_offset]);
        if (progress.get_cancel())
          return;
      }
    }
    dscene->points.copy_to_device();
    dscene->points_shader.copy_to_device();
  }
  if (patch_size != 0 && dscene->patches.need_realloc()) {
    progress.set_status("Updating Mesh", "Copying Patches to device");
    uint *patch_data = dscene->patches.alloc(patch_size);
    foreach (Geometry *geom, scene->geometry) {
      if (geom->is_mesh()) {
        Mesh *mesh = static_cast<Mesh *>(geom);
        mesh->pack_patches(&patch_data[mesh->patch_offset]);
        if (mesh->patch_table) {
          mesh->patch_table->copy_adjusting_offsets(&patch_data[mesh->patch_table_offset],
                                                    mesh->patch_table_offset);
        }
        if (progress.get_cancel())
          return;
      }
    }
    dscene->patches.copy_to_device();
  }
}
CCL_NAMESPACE_END
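
The function above follows a count-then-pack pattern: it first sums the sizes of all geometries, allocates each shared array once, and then has every geometry write into its own slice at a stored offset. The following is a minimal standalone sketch of that pattern, not Cycles code. `ToyGeometry`, `assign_offsets` and `pack_all` are hypothetical names invented for illustration, and the sketch assigns offsets during the counting pass for brevity, whereas in the excerpt only `patch_table_offset` is visibly set that way.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a geometry object; this is not a Cycles type.
struct ToyGeometry {
  std::vector<float> verts;     // data this geometry contributes
  std::size_t vert_offset = 0;  // where its slice starts in the shared array
};

// Pass 1: give each geometry its offset and return the total element count.
inline std::size_t assign_offsets(std::vector<ToyGeometry> &geometry)
{
  std::size_t total = 0;
  for (ToyGeometry &geom : geometry) {
    geom.vert_offset = total;
    total += geom.verts.size();
  }
  return total;
}

// Pass 2: allocate the shared array once, then copy each geometry's data
// into the slice that starts at its offset.
inline std::vector<float> pack_all(std::vector<ToyGeometry> &geometry)
{
  std::vector<float> packed(assign_offsets(geometry));
  for (const ToyGeometry &geom : geometry) {
    std::copy(geom.verts.begin(),
              geom.verts.end(),
              packed.begin() + static_cast<std::ptrdiff_t>(geom.vert_offset));
  }
  return packed;
}
```

Because every geometry owns a disjoint slice, a single geometry can be repacked without touching the others, which is what makes the per-geometry `*_is_modified()` checks in the real function worthwhile.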

Some files were not shown because too many files have changed in this diff