Compare commits

399 Commits

Author SHA1 Message Date
ea637ada20 Merge branch 'asset-shelf' into brush-assets-project 2023-04-21 11:03:58 +02:00
c2309356c6 Merge branch 'main' into asset-shelf 2023-04-21 11:02:31 +02:00
99f5e60b86 RNA: ignore some large arrays in override code
This speeds up saving `070_0100.anim.blend` from the Heist project
from ~3s to ~300ms by adding PROPOVERRIDE_IGNORE in a few
places. It's not completely obvious to me when `PROPOVERRIDE_IGNORE`
should be used and when it shouldn't. Given that the same is done for
meshes already, it seems correct.

Pull Request: blender/blender#107196
2023-04-21 10:15:51 +02:00
680a54c7d0 EEVEE Next: Ensure correct texture usage for views
Add texture usage flags for textures which are used as texture views
or require texture views for backing implementation.

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#107163
2023-04-21 10:06:55 +02:00
25138fd6e0 EEVEE Next: GLSL Metal shader type compatibility
Apply compilation fixes for Metal compatibility.
This includes explicit type casts, packed data types
where vec3 alignment is inconsistent, constructor replacement
with factory function.

The Metal shader generator also needs knowledge of when bound
resources are fundamental data types, so
SHADOWS_TILE_DATA_PACKED must be described as uint in
ShaderCreateInfo.

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#107178
2023-04-21 09:55:37 +02:00
4134682ec2 Fix #107156: UV Cylinder/Sphere Projection fails after other operators
Caused by 6b8cdd5979.

Above commit introduced element tagging for boundary calculations but
only cleared them properly on all faces if the new `Preserve Seams`
option was chosen. We cannot be sure about the state of element tags
from prior operators though, so correct the culprit check to also only
be in effect if the new `Preserve Seams` option was chosen.

Pull Request: blender/blender#107161
2023-04-21 08:28:03 +02:00
c18351f670 Metal: Increase concurrent shader compilation threads
Leverage new API call in Metal to increase the number of threads
dedicated to concurrent shader compilation. First step to improve
parallel compilation times when multiple engines are active.

Would also enable an increase in worker threads for shader
compilation jobs within the DRWManager.

Note that this is only available in the latest
version of macOS Ventura (13.3).

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#106625
2023-04-21 07:52:17 +02:00
fdf920bf5d Metal: Add textureGrad support
Fixes compilation errors in viewport compositor.

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#106805
2023-04-21 07:45:30 +02:00
19dbe049db Fix #106469: unstable tessellation with quad-flipping detection
The result of detecting if a quad should flip the default 0-2 split
when tessellated only used a pre-calculated normal when available.
Since the method of detecting the flip was different, the check for a
concave face could change depending on the existence of polygon-normals.

In practice this meant the Cycles render preview could use a different
tessellation than the GPU display.

While [0] exposed the bug, it's an inherent problem with having 2
methods of detecting concave quads.

Remove is_quad_flip_v3_first_third_fast_with_normal(..) and always
use is_quad_flip_v3_first_third_fast(..), because having to calculate
the normal inline has significant overhead.

Note that "bow-tie" quads may now render with a subdivision in a
different direction although they must be very distorted with both
triangles along the 0-2 split pointing away from each other.

Thanks to @HooglyBoogly for investigating the issue.

[0]: 16fbadde36.
2023-04-21 15:02:47 +10:00
5721b34e53 Cleanup: add win32 suffix to BLI_path_is_abs
Naming made it seem this might be the opposite of BLI_path_is_rel,
when it checks for WIN32 specific path prefixes.
2023-04-21 09:46:06 +10:00
417b62522d Cleanup: code-comments in path_util.c
- Remove duplicate doc-string.
- Use full sentences.
- Use back-ticks for path literals
  (to avoid confusion with doxy-slash commands).
2023-04-21 09:34:30 +10:00
fc749d9d25 Cleanup: replace binary '&' with '&&' check
As the intention is to check both statements are true, avoid bitwise
operations on boolean results.
2023-04-21 09:13:09 +10:00
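Note on fc749d9d25: the difference between the two operators can be sketched as below. This is a minimal illustration, not code from the Blender sources; `CallCounter`, `check` and both `calls_with_*` functions are hypothetical names introduced here. With `&` both operands are always evaluated, while `&&` stops as soon as the first operand is false.

```cpp
#include <cassert>

// Hypothetical helper that records how many checks were evaluated.
struct CallCounter {
  int calls = 0;
};

inline bool check(CallCounter &counter, bool result) {
  counter.calls++;
  return result;
}

// Bitwise AND on boolean results: both operands are always evaluated.
inline int calls_with_bitwise_and(bool first, bool second) {
  CallCounter counter;
  const bool both = check(counter, first) & check(counter, second);
  (void)both;
  return counter.calls;
}

// Logical AND: the second operand is skipped when the first is false.
inline int calls_with_logical_and(bool first, bool second) {
  CallCounter counter;
  const bool both = check(counter, first) && check(counter, second);
  (void)both;
  return counter.calls;
}
```

Besides stating the intent more directly, `&&` avoids evaluating a second check whose result can no longer matter.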
54ce0ac922 Cleanup: use const variables when reading X11 events 2023-04-21 09:12:31 +10:00
62806012ed Cleanup: resolve uninitialized members in GHOST Window & SystemX11
While this didn't cause bugs, initialize members to avoid problems
in the future.

GHOST::SystemX11
- m_keyboard_vector
- m_keycode_last_repeat_key

GHOST::Window
- m_cursorGrabInitPos
- m_userData
2023-04-21 09:06:25 +10:00
ae24fe56a3 Cleanup: quiet shadowed variable warning 2023-04-21 08:42:10 +10:00
04faf12bd8 Cleanup: use BLI_listbase_is_single to avoid unnecessary counting 2023-04-21 08:35:09 +10:00
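Note on 04faf12bd8: the saving from an "is single" check over a full count can be sketched as follows. The `Node` type and both functions are hypothetical stand-ins written for this note; Blender's `ListBase` and `BLI_listbase_is_single` are not reproduced here.

```cpp
#include <cassert>

// Hypothetical singly linked node used only for this sketch.
struct Node {
  Node *next = nullptr;
};

// Walks the whole list: the cost grows with the number of elements.
inline int list_count(const Node *first) {
  int count = 0;
  for (const Node *node = first; node != nullptr; node = node->next) {
    count++;
  }
  return count;
}

// Looks at no more than two nodes, regardless of the list length.
inline bool list_is_single(const Node *first) {
  return first != nullptr && first->next == nullptr;
}
```

A caller that only needs to know whether the list has exactly one element can use the second form instead of comparing the count against 1.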
8e69b41bdf Cleanup: use const for implicit sharing info
Generally, one does not know if the sharing info is currently shared
and should therefore be const. Better keep it const almost all the
time and only remove the constness when absolutely necessary
and the code has checked that it is valid.
2023-04-20 23:32:33 +02:00
491f098edf Cleanup: Fix custom data memcpy call null argument
The data was only null if the size was also zero, but it's simple
to avoid the ASAN warning anyway.
2023-04-20 17:31:29 -04:00
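Note on 491f098edf: a guard of the kind described might look like the sketch below. `copy_bytes_checked` is a hypothetical name, not the function touched by the commit; the point is only that returning early on a zero size means `memcpy` is never called with a null pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Skips the memcpy call entirely when there is nothing to copy, so null
// source or destination pointers paired with a zero size never reach memcpy.
inline void copy_bytes_checked(void *dst, const void *src, std::size_t size) {
  if (size == 0) {
    return;
  }
  std::memcpy(dst, src, size);
}
```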
f6ec11741c Fix #106208: data-block socket defaults not used for node group
The main challenge is to avoid dangling pointers. Currently, the lifetime of socket
declarations is somewhat unbounded (at least we didn't restrict it explicitly yet).
Therefore, storing non-owning pointers in it is tricky. For ID pointers one could
potentially use the foreach-id iterator to update pointers in declarations as well,
but that's a bit out of scope and might not be the right solution anyway, since it's
not obvious that all node declarations are reachable from IDs stored in `bmain`.

The solution now is to use a callback that retrieves the right ID pointer when it
is used. The important thing is that the callback does not capture any potentially
dangling pointer either.

Pull Request: blender/blender#107179
2023-04-20 22:27:45 +02:00
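Note on f6ec11741c: the idea of resolving a pointer through a callback at the time of use, rather than storing it, can be sketched as below. All names (`DataBlock`, `Database`, `SocketDeclaration`, `make_declaration`) are hypothetical and heavily simplified; the real declaration and ID types are not shown. The callback captures only a name by value, so it holds nothing that can dangle.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-ins: a store that owns data-blocks, keyed by name.
struct DataBlock {
  int value = 0;
};
using Database = std::map<std::string, DataBlock>;

// The declaration keeps a lookup callback instead of a raw DataBlock pointer.
struct SocketDeclaration {
  std::function<const DataBlock *(const Database &)> default_value_fn;
};

// The lambda captures the name by value, so it holds no pointer that could dangle.
inline SocketDeclaration make_declaration(std::string name) {
  SocketDeclaration declaration;
  declaration.default_value_fn = [name](const Database &db) -> const DataBlock * {
    const auto it = db.find(name);
    return it == db.end() ? nullptr : &it->second;
  };
  return declaration;
}
```

If the data-block is removed from the store, the callback simply returns `nullptr` the next time it is used.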
4babb7c02e Cycles: oneAPI: Fix volume intersection for Embree GPU execution 2023-04-20 21:20:33 +02:00
0d9fa73b42 Cycles: oneAPI: Fix motion blur rendering for Embree GPU execution
CPU non-unified shared memory was used for shared geometry buffers.
For the Embree GPU case, we now create new geometry buffers on GPU instead.
2023-04-20 21:20:33 +02:00
7e92fb92ec Cycles: oneAPI: Fix kernels preloading in case of incompatible AoT binaries
When running oneAPI with AoT binaries, on hardware that's not compatible with
these, recompilation could have been missing from the kernels loading phase and
happen during execution instead.

This change fixes it: any kernel compilation will now happen during the
kernels loading phase.
2023-04-20 21:20:33 +02:00
13d30b0481 Cleanup: fix various warnings on Windows
Ensure windows.h is included before some other headers to avoid
redefining macros.

Pull Request: blender/blender#107189
2023-04-20 20:46:13 +02:00
c732d901a7 Fix: Iteration for BMLayerCollection was broken
It was broken in two ways:
- bpy_bmlayercollection_iter passed PY_SSIZE_T_MIN, while
PY_SSIZE_T_MAX was needed.
- bpy_bmlayercollection_subscript_slice() contained an
off-by-one error.

Pull Request: blender/blender#107165
2023-04-20 20:28:25 +02:00
f04a7a07e3 macOS: Add open files to system recent files
Completes the TODO in GHOST_SystemPathsCocoa::addToSystemRecentFiles
Also renames the filename parameter to the more appropriate filepath.

The recently opened/saved file will now also show up in:
- Blender Dock icon > Right click.
- Three finger swipe down in Open Blender i.e., App Expose

Based on an earlier contribution by @jenkm.

Pull Request: blender/blender#107174
2023-04-20 23:53:08 +05:30
82ca3d3604 Fix #107185: Edit mode or existing attribute break rest position
After e45ed69349 we need to remove the existing attribute
when adding the rest position before evaluating modifiers. Also, adding
the rest position attribute was completely skipped in edit mode.

Pull Request: blender/blender#107190
2023-04-20 20:18:02 +02:00
2f4a8ecf18 Fix: Spreadsheet missing other geometry types for edit mode mesh objects
We need to add to the spreadsheet's display geometry set
rather than completely replacing it with just the mesh.
2023-04-20 13:06:01 -04:00
b2c822065c Fix #106977: Crash when OpenEXR IO fails
The crash can occur in the following situations:

- Attempt to open a corrupted EXR file
- Attempt to save an EXR file under a non-existing directory.

The root cause is not really clear: for some reason the OpenEXR API on
the Blender side can not catch OpenEXR exceptions by a constant
reference to a std::exception, although it can by a constant reference
to an Iex::BaseExc.

This does not seem to be an issue with the OpenEXR library itself as
the idiff tool from our SVN folder catches the exceptions correctly.
It is also not caused by the symbols_apple.map as erasing it does not
make the problem go away.

It could still be some compiler/visibility flag which we were unable
to nail down yet.

The proposed solution is to add catch-all cases, mimicking the OIIO
tools. This solves the problem, with the downside that there are
no friendly error messages in the terminal. Those messages could be
brought back as part of the workaround by additionally catching the
Iex::BaseExc exception. But probably nobody relies on those error
prints anyway, so the added complexity in the code is likely not
worth it.

Pull Request: blender/blender#107184
2023-04-20 18:40:07 +02:00
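Note on b2c822065c: a catch-all fallback of the kind described can be sketched as below. This is a generic illustration, not the OpenEXR IO code: `run_guarded` and `ForeignError` are hypothetical, and `ForeignError` deliberately does not derive from `std::exception` to stand in for an exception that the typed handler fails to catch.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical exception type that the typed handler below cannot match.
struct ForeignError {};

// Runs `fn` and reports the outcome as text instead of letting an
// exception escape.
template<typename Fn> std::string run_guarded(Fn &&fn) {
  try {
    fn();
    return "ok";
  }
  catch (const std::exception &e) {
    // Friendly message when the typed handler matches.
    return std::string("error: ") + e.what();
  }
  catch (...) {
    // Catch-all: no message is available, but nothing propagates further.
    return "unknown error";
  }
}
```

The catch-all branch shows the trade-off mentioned in the commit message: the failure is contained, but the error text is lost.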
5c4b0c98d3 Animation: Add in Parent space alignment option to the Transform Orientation gizmo
Animation: Adds a new "Parent Space" Orientation option for the Transformation Gizmo.

---
For child targets (objects, bones, etc) being able to transform in parent space is a desired feature (especially when it comes to rigging / animation).

For objects:
* with a parent, the gizmo orients to its parent's orientation
* without a parent, the gizmo orients to Global space

For Armatures:
* Child bone shows parent's space regardless of whether "Local Location" is set for the parent bone
* For root bone **without** "Local Location" set, use the armature object's space.
* For root bone **with** "Local Location" set, use local bone space.

---

No new transformation orientation code needs to be written; we can achieve the desired results by using the existing `transform_orientations_create_from_axis`, `ED_getTransformOrientationMatrix`, and `unit_m3` methods. To do this, we check whether the bone has a parent; if so, we use the bone's pose matrix (`pose_mat`). This is done similarly for objects using the parent's object matrix (`object_to_world`).

Pull Request: blender/blender#104724
2023-04-20 17:40:19 +02:00
0e23aef6b6 Fix build error when not using unity build 2023-04-20 15:46:15 +02:00
475f9a3e23 Cycles: Break up geometry.cpp and scene.cpp file into smaller pieces
scene.cpp and geometry.cpp are large files that can be broken up into smaller, easier to handle files. This change has been split out from #105403 to make the changes easier to understand.

geometry.cpp is broken up into:
1. geometry.cpp
2. geometry_attributes.cpp
3. geometry_bvh.cpp
4. geometry_mesh.cpp

scene.h & scene.cpp are broken up into:
1. scene.h
2. scene.cpp
3. devicescene.h
4. devicescene.cpp

Pull Request: blender/blender#107079
2023-04-20 12:26:02 +02:00
505c02a5db Merge branch 'asset-shelf' into brush-assets-project 2023-04-20 11:40:19 +02:00
e6387349d6 Merge branch 'main' into asset-shelf 2023-04-20 11:39:28 +02:00
100f37af49 Fix #100053: Incorrect saving asset catalogs after renaming parent item
When a parent item was renamed, the `TreeView` was doing everything as
expected, however `AssetCatalogService::update_catalog_path` is supposed
to also update the catalog paths of all sub-catalogs [which it does --
but it does not tag sub-catalogs as having unsaved changes, resulting in
wrong saving of catalogs afterwards, meaning the parent item was saved
with the old name and a new item with the new name was created].

Now also tag sub-catalogs for having unsaved changes.

This should also go into 3.3 LTS

Pull Request: blender/blender#107121
2023-04-20 11:21:27 +02:00
7ce10ebbbf Cycles: oneAPI: Remove excess quotes in a capabilities output 2023-04-20 11:09:16 +02:00
770b193253 Cleanup: use function style casts & nullptr, spelling in comments 2023-04-20 18:28:50 +10:00
0fa68d1a01 Cleanup: format 2023-04-20 18:28:50 +10:00
6d35e1c238 Fix missing include causing build error & invalid NULL check 2023-04-20 18:28:50 +10:00
fe7815e117 Fix #106771: Selection offset in timeline when NLA track is offset
The selection (box select, click select...) had an offset when selecting keys in the timeline.
That was because the function to get the NLA mapping ignored the timeline.

Pull Request: blender/blender#106904
2023-04-20 10:26:26 +02:00
60ced5283a Animation: make properties from motion path library overrideable
The following properties were not library overrideable, but now are
* Line Thickness
* Color
* Custom Color Checkbox

Pull Request: blender/blender#106959
2023-04-20 10:08:39 +02:00
4054d76749 Fix: Normalization with baked curves and preview range
Currently when a baked curve is in the Graph Editor and normalization is enabled, it doesn't work.
It even throws a warning.

This patch adds the missing logic to normalize baked FCurves within a preview range.

Pull Request: blender/blender#106890
2023-04-20 10:07:49 +02:00
88b125e75d Fix regression tests failure on the latest Xcode
When using Xcode version 14.3 on Apple Silicon hardware a number of
regression tests fail. This change fixes this problem.

The root cause comes down to floating point contraction. It was already
disabled for GCC on Linux, but not for Clang on either Linux or
macOS.

Also corrected the comment about the Clang default, as it was set to on
somewhere in 2021.

Pull Request: blender/blender#107136
2023-04-20 08:56:55 +02:00
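Note on 88b125e75d: floating point contraction lets a compiler fuse `a * b + c` into a single fused multiply-add, which rounds once instead of twice and can therefore change results between platforms. The sketch below shows the numeric difference using standard C++ only; the function names and input values are chosen for this note and are not taken from the failing tests. GCC and Clang expose the behavior through `-ffp-contract`, though which exact flag the build change uses is an assumption here.

```cpp
#include <cassert>
#include <cmath>

// a * b is rounded to double before the addition; the volatile store keeps
// the compiler from contracting the two operations into one fused instruction.
inline double multiply_then_add(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// std::fma rounds only once, after the exact product and sum are formed.
inline double fused_multiply_add(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With `a = 1 + 2^-27` and `c = -(1 + 2^-26)`, the separately rounded version loses the `2^-54` term of `a * a` while the fused version keeps it, so the two functions return different values for the same inputs.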
b69f8de5b5 Fix #105450: Resolve box selection issue in Metal
Occlusion query buffers not being cleared to zero resulted in
erroneous selection in certain situations.

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#107135
2023-04-20 08:47:56 +02:00
dda4c0721c EEVEE-Next: Resolve compilation errors in Metal
Shader source requires explicit conversions and shader address
space qualifiers in certain places in order to compile for Metal.

We also require constructors for a number of default struct types.

Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#106219
2023-04-20 08:03:31 +02:00
397a14deff GPencil: Several Weight Paint additions
This patch adds several tools and options to the weight paint mode of Grease Pencil.

* Blur tool: smooths out vertex weights, by calculating a gaussian blur of adjacent vertices.
* Average tool: painting the average weight from all weights under the brush.
* Smear tool: smudges weights by grabbing the weights under the brush and 'dragging' them.

* With the + and - icons in the toolbar, the user can easily switch between adding and subtracting weight while drawing weights.
* With shortcut `D` you can toggle between these two.

* The auto-normalize option ensures that all bone-deforming vertex groups add up to 1.0 while weight painting.
* With `Ctrl-F` a radial control for weight is invoked (in addition to the radial controls for brush size and strength).
* With `Ctrl-RMB` the user can sample the weight. This sets the brush Weight from the weight under the cursor.

* When painting weights in vertex groups for bones, the user can quickly switch to another vertex group by clicking on a bone with `Ctrl-LMB`.
For this to work, follow these steps:
* Select the armature and switch to Pose Mode.
* Select your Grease Pencil object and switch immediately to Weight Paint Mode.
* Select a bone in the armature with `Ctrl-LMB`. The corresponding vertex group is automatically activated.

Pull Request: blender/blender#106663
2023-04-20 07:55:24 +02:00
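Note on 397a14deff: the auto-normalize behavior described above (weights of the bone-deforming vertex groups adding up to 1.0) can be sketched for a single vertex as below. `normalize_weights` is a hypothetical helper written for this note and says nothing about how the Grease Pencil code stores or locks its weights.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Scales the weights so they sum to 1.0. If every weight is zero there is
// nothing to scale, so the input is returned unchanged (a choice made for
// this sketch only).
inline std::vector<float> normalize_weights(std::vector<float> weights) {
  const float total = std::accumulate(weights.begin(), weights.end(), 0.0f);
  if (total <= 0.0f) {
    return weights;
  }
  for (float &weight : weights) {
    weight /= total;
  }
  return weights;
}
```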
3d6117994c Realtime Compositor: Implement ID Mask node
This patch implements the ID Mask node for the realtime compositor.

The node can be implemented as a GPU shader operation when the
anti-aliasing option is disabled, which is something we should do when
the evaluator allows nodes to be executed as both standard and GPU shader
operations.

Pull Request: blender/blender#106593
2023-04-20 07:20:58 +02:00
335d32153e Cleanup: remove dead code, reduce variable scope 2023-04-20 13:57:31 +10:00
62cc09f267 Cleanup: match argument names between functions & declarations 2023-04-20 13:52:58 +10:00
92f79e002e Cleanup: format 2023-04-20 13:35:35 +10:00
3f0d2cf9e1 Add scripts dir to the make format paths for Python
`make format` uses autopep8 to format Python, using a list of paths
specified in `tools/utils_maintenance/autopep8_format_paths.py`. The
scripts folder used to be a submodule inside release, but it is now at
the root of the blender repo.

This commit adds `scripts` to the list of paths to format.

Ref !107143
2023-04-20 05:30:34 +02:00
716b9cff23 Fix: Search in node editors missing items
Mistake in d6abd2ce72.
2023-04-19 23:19:45 -04:00
5ca7e1301f Cleanup: Remove redundant custom data initialization
The comment didn't really make sense, since the removed code did the
same thing as the CustomData function anyway, and that's already done
in `mesh_init_data`.
2023-04-19 22:45:48 -04:00
a32fb96311 Cleanup: Use more specific arguments to calc edges function 2023-04-19 22:45:48 -04:00
3e41b98295 Cleanup: Use utility to create mesh in metaball tessellation
Avoid the need to add each data array manually.
2023-04-19 22:45:48 -04:00
b633b460b8 Cleanup: STL Import: Use utility to copy corner verts data 2023-04-19 22:45:48 -04:00
639ec2e5a9 BLI_path: add BLI_path_extension_or_end
Some callers that access the extension operated on the end of the path
even when there was no extension. Using this avoids having to assign
the end of the string using a separate check.
2023-04-20 12:32:25 +10:00
b778e09492 Cleanup: use memmove instead of a string copy for BLI_path_suffix 2023-04-20 11:58:30 +10:00
7884de02f3 Tests: add BLI_path_extension replace/ensure tests for overflow handling 2023-04-20 11:58:29 +10:00
95296dc3aa Cleanup: remove "Path" prefix from path_utils tests 2023-04-20 11:58:27 +10:00
9be0304b67 Cleanup: order expected value last in path_util tests 2023-04-20 11:58:25 +10:00
b6e527febb Tests: add more path extension tests 2023-04-20 11:58:23 +10:00
373cfa731f Cleanup: use EXPECT_STREQ instead of EXPECT_EQ_ARRAY
While both work, the output of strings being different is more useful.
2023-04-20 11:58:22 +10:00
7cc7cd0e80 Cleanup: use a define for all style flags
This ensures new styles only need to be added in one place.
2023-04-20 11:58:20 +10:00
5294758830 Fix buffer overflow in BLI_path_frame_strip with long extensions
The file extension was copied into a buffer without checking its size.
While large extensions aren't typical, some callers used small fixed
size buffers so an unusually named file could crash.
2023-04-20 11:47:22 +10:00
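Note on 5294758830: a size-checked copy into a fixed buffer might look like the sketch below. `copy_extension_checked` is a hypothetical helper, not the code in `BLI_path_frame_strip`; it only illustrates refusing the copy when the extension plus its terminator would not fit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies `ext` into `dst` only when it fits, including the terminating null.
// Returns false and leaves `dst` as an empty string otherwise.
inline bool copy_extension_checked(char *dst, std::size_t dst_size, const char *ext) {
  if (dst_size == 0) {
    return false;
  }
  const std::size_t len = std::strlen(ext);
  if (len >= dst_size) {
    dst[0] = '\0';
    return false;
  }
  std::memcpy(dst, ext, len + 1);
  return true;
}
```

Callers with small fixed-size buffers can then handle the `false` case instead of writing past the end of the buffer.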
80edd10168 Fix regression in BLI_path_suffix for long extensions
Changes from [0] passed in a pointer size to BLI_strncpy.

[0]: f8e23e495b
2023-04-20 11:18:26 +10:00
f87e474af0 Cleanup: Move view3d_gizmo_ruler.c to C++
Move view3d_gizmo_ruler.c to C++ to make further changes easier.
See #103343

Pull Request: blender/blender#107148
2023-04-20 00:06:49 +02:00
911f9bea84 Fix #107067: Properly clear CD_FLAG_ACTIVE/DEFAULT_COLOR flags
At runtime this information is stored in the active_color_attribute and
default_color_attribute strings on the mesh, however when saving it
is still saved in the old format with flags on the CustomData layers.
When converting from the strings to the layers not all flags were
properly cleared from the CustomData layers, leading to multiple
layers having the CD_FLAG_COLOR_ACTIVE/RENDER flag.
2023-04-19 22:26:31 +02:00
e05cbad0d1 Sculpt: Fix #107093: expand helper function not specialized to pbvh
sculpt_expand_is_face_in_active_component wasn't specialized for
the different PBVH types.
2023-04-19 12:59:09 -07:00
2ab500c234 Cleanup: Remove unnecessary point cloud function argument
The "nomain to main" function for point clouds now always takes
ownership of the source data-block, just like the mesh version.
2023-04-19 15:52:56 -04:00
7535ab412a Cleanup: Remove redundant "reference" argument to geometry copy
Implicit sharing means attribute ownership is shared between geometry
data-blocks, and the sharing happens automatically. So it's unnecessary
to choose whether to enable it when copying a mesh.
2023-04-19 15:52:56 -04:00
10d175e223 Cleanup: Use consistent argument order for mesh creation functions
The typical order is vertex, edge, face(polygon), corner(loop), but in
these three functions polys and loops were reversed. Also use more
typical "num" variable names rather than "len"
2023-04-19 15:52:56 -04:00
60bb57663a Cleanup: IO: Separate creating mesh and adding to Main
Create a "nomain" mesh when converting intermediate representations
to a Mesh, meaning those areas don't have to know about data-block
names or the main database, and also that the boilerplate of adding
attributes individually can be avoided. The attribute arrays aren't
copied here, so the performance should be unaffected.
2023-04-19 15:52:56 -04:00
9344deed89 UI: Change the name of Invert nodes to Invert Color
The nodes for inverting a color are named simply Invert, which begs the question: invert what?

This patch changes the naming for the node in Shading, Texture and Compositing nodes to *Invert Color*

This matches the naming of other color dedicated nodes like Separate Color or Combine Color

Pull Request: blender/blender#106750
2023-04-19 21:52:20 +02:00
c6d4de9e49 Render: Fix crash in baking
corner_edges was being passed to a function that expected
corner_verts.
2023-04-19 12:40:50 -07:00
199c7da06d Assets: Do Not Show Blank Read-Only Metadata
Do not show asset metadata "description", "license", "copyright", or
"author" if they are empty AND read-only, since they can't be edited
and contain no useful information to show.

Pull Request: blender/blender#105812
2023-04-19 19:55:26 +02:00
acb34c718e Fix #107120: Small fixes to OS File Operations
Small fixes to recent file operations changes. FileOperations enum
starting with zero results in bad behavior with EnumPropertyItem. Typo
fix.

Pull Request: blender/blender#107138
2023-04-19 19:43:15 +02:00
097b9c5a36 Fix: Build error after last commit
Also fix fallthrough warnings
2023-04-19 13:05:08 -04:00
98ccee78fe Geometry Nodes: Slightly optimize mesh to curve node
Avoid copying the selected edges if all edges are selected, and
parallelize gathering the selection otherwise. Also use `int2` instead
of `std::pair`.

In a simple test file I observed an approximate 10% FPS improvement,
though in real world cases the impact is probably much smaller.
2023-04-19 12:35:09 -04:00
d5757a0a10 Cycles: re-enable AMD GPU binaries on Windows
Using the new HIP SDK 5.5 that includes a fix for the compiler bug.

This also enables the light tree.

For Linux the binaries are still disabled. ROCm 5.5 is planned to
include the same fix but not released yet. When that happens we
should be able to enable Linux as well.

Ref #104786
Fix #104085

Pull Request: blender/blender#107098
2023-04-19 18:18:05 +02:00
45c0762f1b Fix #107125: Entering Grease Pencil Vertex Paint mode crashes
Caused by uninitialized `ToolSettings` `GpPaint` [which was later
accessed in `BKE_gpencil_palette_ensure`].

Not 100% sure why `ToolSettings` `GpPaint` is properly initialized in a
default startup file, but for some files, this was not the case (as in
the report)

See 22462fed00 for a similar commit.

Now initialize `ToolSettings` `GpPaint` (alongside `GpVertexPaint`) when
entering grease pencil vertex paint mode.

Should probably go into LTS releases as well.

Pull Request: blender/blender#107131
2023-04-19 16:50:46 +02:00
599e52119f Fix #107101: Update depsgraph on muting VSE channel
Muting a VSE channel does not mute the sound. This is caused by a lack
of depsgraph updates for sound when the mute state changes for the channel.
Now fixed.

Caused by ad146bd17a

Pull Request: blender/blender#107116
2023-04-19 16:27:21 +02:00
a7422f3cd7 deps_builder/windows: Cleanup dpcpp harvest
The dpcpp folder grew from 200M to 500M with the last update
due to lld being enabled and having 5 different copies in the bin
folder. We do not need to ship lld so it can be safely removed.

However, the previous harvest cleaned up the build folder before copying
the libs to their final destination in output; this will no longer
work, since we actually do need lld to build embree.

So copy the full build folder to output first, then remove the
binaries we do not need. Embree will use the binaries in the build
folder so it will be unaffected by this.
2023-04-19 07:57:46 -06:00
d6abd2ce72 Fix #106138: Node add searches missing context-based poll
Before the add node search refactor and link-drag-search, nodes were
filtered out based on whether they worked with the active render
engine. For example, the Principled Hair BSDF node doesn't work with
EEVEE, so it isn't displayed in the UI. While we might want to relax
this in the future, we have no better way to show that they don't work
right now, so it's best to keep that behavior.

The filtering is implemented with a new node type callback, mainly
to reduce the boilerplate of implementing many node search callbacks
otherwise. It's also relatively clear this way I think. The only
downside is that now there are three poll functions.

I didn't port the "eevee_cycles_shader_nodes_poll" to the new
searches, since I don't understand the purpose of it.

Pull Request: blender/blender#106829
2023-04-19 15:48:18 +02:00
91a29c9b9a Fix #107127: Context property driver to view layer does not work
The resolution of the driver value RNA path was using the wrong
property (it was forced to be referenced relative to the ID).

Pull Request: blender/blender#107129
2023-04-19 15:32:37 +02:00
5ab48a53e4 Cleanup: Use generic edge calculation for legacy curve to mesh
Change the "displist to mesh" conversion to use the edge calculation
function used everywhere else, to allow removing the old code. This
changes edge vertex and corner edge indices, requiring a test update,
but the visual result should be the same.
2023-04-19 09:29:08 -04:00
b647c2b88d Cleanup: Remove unused variables/functions
Also change from `unsigned int` to `uint` for consistency
between function declarations and definitions.
2023-04-19 08:52:48 -04:00
86611a5fcc Tests: add tests for BLI_path_extension ensure & replace 2023-04-19 21:15:43 +10:00
7f2c7feaee Fix #107113: VSE channel buttons invisible in Light theme
Also fix inconsistency in Movie Clip and Status Bar headers.
2023-04-19 12:39:33 +02:00
c0f7801660 Fix regression in BLI_path_extension_ensure
Error in [0] removed trailing '.' stripping.

[0]: f8e23e495b
2023-04-19 20:33:55 +10:00
e45ed69349 Attributes: Integrate implicit sharing with the attribute API
Add the ability to retrieve implicit sharing info directly from the
C++ attribute API, which simplifies memory usage and performance
optimizations making use of it. This commit uses the additions to
the API to avoid copies in a few places:
- The "rest_position" attribute in the mesh modifier stack
- Instance on Points node
- Instances to points node
- Mesh to points node
- Points to vertices node

Many files are affected because in order to include the new information
in the API's returned data, I had to switch a bunch of types from
`VArray` to `AttributeReader`. This generally makes sense anyway, since
it allows retrieving the domain, which wasn't possible before in some
cases. I overloaded the `*` deference operator for some syntactic sugar
to avoid the (very ugly) `.varray` that would be necessary otherwise.

Pull Request: blender/blender#107059
2023-04-19 11:21:06 +02:00
19ac02767c Fix regression in recent BLI_path extension logic
Error in [0] meant BLI_path_extension_replace &
BLI_path_extension_ensure did nothing when the input path had no
extension.

[0]: f8e23e495b
2023-04-19 18:38:56 +10:00
fd10ecaeaf Fix bitwise logical operation in Metal backend
Pull Request: blender/blender#107084
2023-04-19 10:02:12 +02:00
187998970a Fix unused variable in release build in Metal backend 2023-04-19 10:02:09 +02:00
c872b6b930 Fix set but unused variable in Freestyle 2023-04-19 10:02:09 +02:00
3c34b13cf8 Fix set but unused variable in mesh intersect
A bit tricky, since there is also variable shadowing involved.
2023-04-19 10:02:09 +02:00
9e63c3cee8 Fix strict prototypes in Audio 2023-04-19 10:02:09 +02:00
a20f45bab9 Fix unqualified access to std::move in OpenSubdiv 2023-04-19 10:02:09 +02:00
63c20e08c4 Fix set but unused variable in Libmv 2023-04-19 10:02:09 +02:00
4f7dc1e4b6 Fix set but unused variable in IK solver 2023-04-19 10:02:09 +02:00
8ed543c6f2 Fix set but unused variable in dualcon octree 2023-04-19 10:02:09 +02:00
daaed83a32 Fix set but unused variables in Cycles 2023-04-19 10:02:09 +02:00
7982d86117 Fix unqualified access to std::move in Cycles 2023-04-19 10:02:09 +02:00
33e5cd4e2f Fix bitwise operation used on boolean in Mantaflow 2023-04-19 10:02:09 +02:00
8365bce958 CMake: Add extra strict flags cancellation for Clang 2023-04-19 10:02:09 +02:00
b0ec4d889a Fix #106998: selection of bones in grease pencil weightpaint mode fails
Caused by 2eeec49640.

Above commit would early out when falling through the specialized
greasepencil selection operator to view3d_select_exec. But in order to
select posebones in grease pencil weightpaint mode, we still have to
continue with view3d_select_exec.

Now check this special case [with convenient
`BKE_object_pose_armature_get_with_wpaint_check`] and don't early out in
that case.

Should go into 3.3 LTS as well.

Pull Request: blender/blender#107076
2023-04-19 09:13:21 +02:00
c10e8e4166 Fix #106751: No implicit conversion for group inputs
When a node input is connected to a group node input that is unlinked
and is of a different type, no implicit conversion takes place, so the
value is unexpected.

This patch fixes that by considering the types of both sockets and doing
implicit conversion if necessary.
2023-04-19 06:24:39 +02:00
ed590e9181 macOS/GTests: simplify blender_test library linking
Reverts dcb2821292 but handles
the linker error by relying on target_link_libraries deduplication.

Reverts 18a15bafe8 but handles
blender_test linking after dependency change by passing
lib to target_link_libraries itself.

Closes #107033
2023-04-19 09:05:43 +05:30
26a194abbd BLI_path: add BLI_path_extension_strip as an alternative to replace
While replacing the extension with an empty string works,
it required a redundant string-size argument which took a dummy
value in some cases. Avoid having to pass in a redundant string size by
adding a function that strips the extension.
2023-04-19 12:59:43 +10:00
9e6757f20f Cleanup: expand on why the extension isn't replaced for blend-file save 2023-04-19 12:58:54 +10:00
61fe8da989 Cleanup: avoid changing the filepath for alembic frame range calculation
The internal utility get_sequence_len would make its filename
argument absolute so as to scan its directory for files.

Perform this on the directory instead so the filename can be const.
2023-04-19 12:33:28 +10:00
643f8bcedd Cleanup: avoid redundant string copy
This may have been done because BLI_path_frame_get used to take a
non-const string.
2023-04-19 12:32:42 +10:00
f8e23e495b BLI_path: improve behavior of BLI_path_extension
Previously, finding the extension included hidden files (starting with
a '.'). Finding the extension now matches Python's `os.path.splitext`
behavior, which has the advantage that a hidden file is not considered
one long extension with an empty name part.

Also update code to use BLI_path_extension in cases which previously
in-lined this logic.

BLI_path_frame_get path argument is now const,
it was being manipulated unnecessarily.
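
The `os.path.splitext` behavior referred to above can be checked directly
in Python. This only illustrates the Python side, it does not call
`BLI_path_extension`, and the file names are made up for the example:

```python
import os.path

# A leading dot marks a hidden file, it does not start an extension.
hidden = os.path.splitext(".bashrc")
# A hidden file can still carry a real extension.
hidden_with_ext = os.path.splitext(".config.json")
# Only the last suffix counts as the extension.
multi = os.path.splitext("archive.tar.gz")

print(hidden)           # ('.bashrc', '')
print(hidden_with_ext)  # ('.config', '.json')
print(multi)            # ('archive.tar', '.gz')
```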
2023-04-19 11:33:26 +10:00
7f241fc773 Tests: add test for BLI_path_suffix & BLI_path_sequence_decode 2023-04-19 11:33:26 +10:00
6d2351d26b Text object: operators to move cursor to the top or bottom
This adds new movement types TEXT_BEGIN and TEXT_END to allow the
FONT_OT_move and FONT_OT_move_select operators to move the text
cursor (caret) to the top and bottom of the text.

Pull Request: blender/blender#106196
2023-04-19 02:18:19 +02:00
846d78b09a Cleanup: improve doc-strings for EditFont 2023-04-19 09:06:24 +10:00
b132118f89 Cleanup: balance doxygen grouping, minor grouping adjustment 2023-04-19 09:02:21 +10:00
88f5dd3c72 Cleanup: format 2023-04-19 08:02:42 +10:00
eb2867de90 Cleanup: spelling in comments 2023-04-19 08:02:41 +10:00
Mateusz Albecki
0fd14d659b GHOST/Wayland: Fix disposeContext with VK
During createOffscreenContext with VK backend enabled wl_surface
was not stored in the context's user data. This resulted in nullptr
dereference later on during disposeContext. Added a line that sets
user data and additionally added nullptr checks in disposeContext.

Ref !107057.
2023-04-19 07:44:06 +10:00
1469613d65 Fix Build Warnings
A differing const argument and an unused var caused by conditionals.

Introduced in 694f792ee1
2023-04-18 13:59:07 -07:00
4382a0b350 Cleanup: avoid warnings from gcc in oneAPI device compilation
When building using GCC and with Embree without GPU support, there were
a few unused variables and a non-defined macro.
2023-04-18 22:40:40 +02:00
70892e82ac Cycles: oneAPI: use specialization constant to compile with/without Embree on GPU 2023-04-18 22:09:42 +02:00
9821a2d397 Cycles: pass kernel features to get_bvh_layout_mask
This allows selectively disabling Hardware Raytracing in the oneAPI
backend, depending on features used.
2023-04-18 22:09:42 +02:00
3f8c995109 Cycles: add hardware raytracing support to oneAPI device
An updated Embree 4 library with GPU support is required for it to be
compiled - compatibility with Embree 3 and Embree 4 without GPU support
is maintained.
Enabling hardware raytracing is an opt-in user setting for now.

Pull Request: blender/blender#106266
2023-04-18 22:09:42 +02:00
887022257d Cycles: update DPCPP to 2022-12 release
We also backport a patch to program_manager to it as
61e51015a5
helps avoid unnecessary recompilation when enumerating available
kernels.
2023-04-18 22:09:41 +02:00
5cdf0c9ee9 Cycles: update compute-runtime to 23.05.25593.18
This fixes oneAPI AoT compilation on Linux when using Embree on GPU.
2023-04-18 22:09:41 +02:00
66b4e426cc Cycles: build Embree 4 with GPU support 2023-04-18 22:09:41 +02:00
72aeee96ac Fix Build Warning in fileops.c
Marking unused function arguments caused by conditionals.

Introduced in 694f792ee1
2023-04-18 12:53:45 -07:00
f7ba61d3a6 Fix #107009: Setting Text Object Styles
This allows toggling of text styles of selected text and at the current
mouse cursor position if nothing is selected.

Pull Request: blender/blender#107048
2023-04-18 21:24:38 +02:00
70d854538b Curves: Optimize edit mode selection draw extraction
Use the attribute API for domain and type interpolation instead of doing
it manually. I observed a 3.8x improvement in curve selection mode and
an 18x improvement in point selection mode.
2023-04-18 14:57:04 -04:00
694f792ee1 UI: OS File Operations Within File Browser
Adds a submenu to the File Browser selected item context menu that
allows opening the item or viewing the location in an OS browsing
window. On Win32 also allows other actions like editing, searching,
opening command prompt, etc.

Pull Request: blender/blender#104531
2023-04-18 20:39:30 +02:00
4edcae75aa Cleanup: Remove unused using keyword 2023-04-18 13:38:11 -04:00
954c6c0ae6 Revert "Cycles: move oneAPI kernels dynamic library to blender.shared"
This reverts commit df096eab77.
There is a corner case for when WITH_CYCLES_ONEAPI_BINARIES is set to on
and later turned off during config, in case there is no ocloc.
2023-04-18 18:48:37 +02:00
3a72442f63 Fix comment style in previous commit.
Pull Request: blender/blender#107091
2023-04-18 17:28:53 +02:00
5bb3a3f157 Fix: segfault when indexing into some collections with strings.
This happens when the collection's item type doesn't have a
'nameproperty' to index with.  For debug builds we error out with an
assert, since in general this shouldn't happen.  For release builds
Python will report item not found.

Pull Request: blender/blender#107086
2023-04-18 17:15:22 +02:00
d818d05415 Cleanup: Remove unnecessary attribute provider callbacks
We don't use the callbacks that create virtual arrays from the custom data
anymore, they just add extra indirection. The only non-obvious case was
the crease attribute which had a setter function. Replace that with an
attribute validator like the other similar attributes.

Pull Request: blender/blender#107088
2023-04-18 17:13:38 +02:00
7c927155b5 Fix #90159: Inconsistent display of active filters for import/export file dialogs
Use `filter_glob` property to list only operator extension files.
PR includes filtering for collada, usd, alembic file formats.

Old Revision: https://archive.blender.org/developer/D16739

Pull Request: blender/blender#107034
2023-04-18 15:57:45 +02:00
63f309df11 Fix #107081: Slow selection with context variables
A solution for an older bug was causing it.

Added a special case to avoid an extra relation for context
variables as they do not change during the dependency graph
evaluation.

Pull Request: blender/blender#107082
2023-04-18 15:40:07 +02:00
66158498de BLI: Return number of values removed from remove_if
Make the `remove_if` function for `Vector`, `VectorSet`, `Set`, and `Map` return the number of elements it removed.
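
As a rough illustration of the idea (not the BLI C++ API itself), a
list-based Python analogue could look like this. The helper below is
hypothetical and only mirrors the "remove in place, return the count"
contract described above:

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def remove_if(items: List[T], predicate: Callable[[T], bool]) -> int:
    """Remove matching elements in place and return how many were removed."""
    old_len = len(items)
    # Slice assignment keeps the same list object, so callers see the change.
    items[:] = [x for x in items if not predicate(x)]
    return old_len - len(items)


values = [1, 2, 3, 4, 5, 6]
removed = remove_if(values, lambda x: x % 2 == 0)
print(removed, values)  # 3 [1, 3, 5]
```

Returning the count lets a caller tell whether anything changed without
comparing sizes before and after the call.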

Pull Request: blender/blender#107069
2023-04-18 13:28:14 +02:00
25747301db Cycles: fix SYCL debug library linking on Windows 2023-04-18 12:33:48 +02:00
b623be3377 Cleanup: remove clang-format: off for EnumPropertyItem definitions
These aren't special cases so format them as is done with all other
enum-property declarations.
2023-04-18 20:30:00 +10:00
77268dbe3b WM: add versioning for 3.5 sculpt brushes (part of fix for #106057)
Add a versioning function for tool ID's which can be used if these
need to be changed in future.
2023-04-18 20:30:00 +10:00
7b4d71683f Fix #107060: Curves sculpt mode does not select default tool
Regression in [0] when curve tool names changed to use brush names
with the utility function generate_from_enum_ex().

[0]: 786734e6c8
2023-04-18 20:16:31 +10:00
58b1c54671 Cleanup: remove "Curves" suffix from curve sculpting enum
This isn't necessary information & types aren't included in other
brush names.
2023-04-18 20:16:31 +10:00
732fa26413 Fix #107032: API Document: matrix_channel (PoseBone) description incorrect
Update the RNA and DNA documentation for two bone matrices:

- `PoseBone.matrix_channel` (`bPoseChannel::chan_mat` in DNA) contains
  the evaluated loc/rot/scale channels, including constraints and drivers.
- `PoseBone.matrix` (`bPoseChannel::pose_mat` in DNA) contains the same
  transform, but then expressed in the armature object space.

No functional changes, just clarifications in comments / tooltips.
2023-04-18 12:01:45 +02:00
4d7a7ce67c Fix #107050: accessing nullptr after progress is canceled 2023-04-18 11:58:07 +02:00
2d9e3501ee Merge branch 'asset-shelf' into brush-assets-project 2023-04-18 11:19:21 +02:00
565af18e9e Merge branch 'main' into asset-shelf 2023-04-18 11:17:44 +02:00
6e75581e65 BKE: Rework ID swap code to properly handle embedded ID pointers.
While embedded IDs are usually considered as private local data of their
owner ID, some areas of code, like the depsgraph, can consider them as
regular IDs in some aspects.

So when swapping IDs, also properly 'counter-swap' their potential
embedded IDs, such that the pointers to the embedded IDs remain as before
swapping, even though the data of the embedded IDs is swapped.

The main target of this change is memfile undo code. There, newly read
IDs are swapped with their older version, so that the old address
contains the new data. This avoids rebuilding some of the
depsgraph. Doing the same thing for embedded IDs should reduce even
further the need for depsgraph rebuilds on undo steps.

This commit also gives more control over the remapping of 'self' ID
pointers inside themselves.

Pull Request: blender/blender#107044
2023-04-18 11:09:36 +02:00
664b31ea73 Cleanup: make format 2023-04-18 09:45:01 +02:00
4d75f10a8a EEVEE: Optimise texture usage flags
Authored by Apple: Michael Parkin-White

Pull Request: blender/blender#107037
2023-04-18 08:11:46 +02:00
982392ca13 Docs: Update RNA to user manual url map
Fixes #107005
2023-04-18 01:12:27 -04:00
ab8acbbfe5 Cleanup: Use curve positions accessor function
There's no particular reason to use the attribute API instead here.
2023-04-17 23:38:10 -04:00
c234a802ba Cleanup: Remove unused using keyword 2023-04-17 23:38:10 -04:00
7bb8c8a5cf Cleanup: Improve comments about curves and mesh offset spans 2023-04-17 23:38:10 -04:00
c615ccde68 Fix splash preference overriding Read Home File's use_splash property
- Split out WM_init_splash_on_startup(..) which performs startup checks.
- WM_init_splash(..) now shows the splash (ignoring preferences).
- Avoid calling BLI_exists on an empty string (in some cases).
2023-04-18 12:59:13 +10:00
1bb77d9eae Cleanup: Better logging for imbuf tests
Recent failures requiring investigation have exposed some shortcomings
that this addresses:
- When creating the diff image for offline comparison, use a higher
  threshold to prevent idiff from printing more output which will often
  contradict the primary failure output just above it (very confusing)
- For metadata failures, make sure these get printed so it's obvious
  what kind of failure we're dealing with

Pull Request: blender/blender#107058
2023-04-18 03:32:20 +02:00
302eb1e0d7 Cleanup: compile warning, correct wording 2023-04-18 11:04:08 +10:00
c4c1cc7cd3 Cleanup: double quotes for non-enum strings
Also use back-ticks for code-references in comments.
2023-04-18 10:51:32 +10:00
2f743b0a92 Cleanup: Replace manual flag checking with methods in node.cc
Not all flags have methods, and not all node primitive types have this.
Replacement of rather simple cases.

Pull Request: blender/blender#107055
2023-04-18 00:29:10 +02:00
29f137e138 Sculpt: fix brush.falloff_shape not being reset in "reset brush" op 2023-04-17 15:16:35 -07:00
96fa5fc2b3 Sculpt: Fix #106996: Missing null check in BKE_sculpt_update_object_before_eval 2023-04-17 14:05:29 -07:00
df096eab77 Cycles: move oneAPI kernels dynamic library to blender.shared
After 17800e0c03, the oneAPI kernels library was still able to find sycl6.dll but that wasn't reliable.
We fix this by moving the oneAPI kernels library also into blender.shared.

Pull Request: blender/blender#106894
2023-04-17 21:47:35 +02:00
09b770388a Fix #107004: Cycles shadow caustics not working with area lights
Tested the wrong variable after a refactor for light spread.
2023-04-17 20:46:08 +02:00
870930bc32 Fix build error using WITH_CYCLES_LOGGING=OFF
Mismatch between glog and stubs. CHECK_NULL also does not exist. Tests
also require logging to be available.
2023-04-17 20:36:18 +02:00
74eda0b6fc Fix build error on macOS after previous commit 2023-04-17 17:47:29 +02:00
92919864a0 Fix #106293: Cycles importance sampling with multiple suns works poorly
Keep sun in importance map in this case, as we do not use special sun
importance sampling in this case.
2023-04-17 17:30:47 +02:00
cff94a808e Fix #106706: fireflies with Nishita sky sun sampling at certain angles
Caused by floating point differences between importance sampling and
texture evaluation, which disagreed on whether or not a ray lies within
the sun disc.

* Use the same input values for geographical_to_direction() in
  sky_radiance_nishita() and kernel_data.background.sun.
* The mathematical operations in pdf_uniform_cone() were adjusted to
  match sky_radiance_nishita().

Pull Request: blender/blender#106764
2023-04-17 17:29:27 +02:00
a8feb20e1c DNA: Move irradiance grid light cache data to Object level
This is the first step for refactoring the lightcache system.
Each probe instance (as in `Object`) will now store its own baked data.
The data is currently stored in uncompressed readable format.

This introduces two new operators for baking to avoid confusion with
the previous light baking pipeline. These do nothing other than
creating empty caches that will be populated by EEVEE later on.

The DNA storage is made to be able to include multiple caches
in case of baked simulation over time but it isn't yet supported.

I prefer to keep the implementation simple for now as the long term
goals for this feature are uncertain.
There is still a type flag (`LightProbeObjectCache.cache_type`) that
will be used for versioning.

The naming convention of structs is a bit weird but that's all I
found in order to avoid interfering with the old scene light cache
that is still used by (old) EEVEE.

Related task #106449.

Pull Request: blender/blender#106808
2023-04-17 17:12:19 +02:00
b1703bd902 Fix #107020: crash when canceling Sky Resize with mesh symmetry
Like `t->data` use calloc to `tc->data_mirror`.

This way you make sure that all values are properly initialized.
2023-04-17 11:32:36 -03:00
c041a36286 Fix #91966: Alembic/USD export ignores bone parent animation
For non-object parents (so bones & vertices), the parent is now also
explicitly checked for animation. In other words: having an animated
parent will cause the transform of the child to be written to Alembic/USD
on every frame (as if it is animated itself).
2023-04-17 16:21:39 +02:00
b75b734969 Core: Memfile Undo: Optimize handling of 'no undo' IDs.
Do not read IDs from types flagged as 'no undo', whether they are local
or linked.

This should have no effect currently, since all 'no undo' ID types
currently are supposedly only local data anyways (Screen, WindowManager
and WorkSpace).
2023-04-17 16:08:41 +02:00
915b8b6093 Core: Memfile undo: Add ID tag for IDs that are 'reused in place'.
These IDs kept their address, but their content has been replaced
(re-read from the memfile undo step). Add an ID tag to identify them.

As a further cleanup, systematically tag these IDs for depsgraph COW,
since their data is effectively modified (though in practice all of
these IDs are expected to already have other update tags anyway).

No change in behavior is expected from this commit.
2023-04-17 15:46:21 +02:00
a16bcb6576 Core: ID remapping: Do remap 'not owning embedded' ID pointers.
This should not have many practical consequences with current code, but
fixes potential missed remappings for e.g. some nodetree pointers in the
node editor, or the `parent` pointer of collections to a scene's master
collection.
2023-04-17 15:46:21 +02:00
0bc957063c Fix #106405: Cycles multi GPU crash with vertex color baking
Avoid division by zero when one of the devices gets no work.
2023-04-17 15:31:35 +02:00
38bf3e1911 I18n: translate default preset name
The "New Preset" message was already translated and used in some
preset panels, but not all.

Pull Request: blender/blender#106973
2023-04-17 15:00:07 +02:00
48979c6cdc Py module i18n utils: return subprocess.run result to catch output of external commands.
Avoids having prints in random order in multi-processes concurrent
context.
2023-04-17 14:38:51 +02:00
e45746591b Metal: Add new files for Storage Buffers support 2023-04-17 14:12:32 +02:00
2a4323c2f5 Mesh: Move edges to a generic attribute
Implements #95966, as the final step of #95965.

This commit changes the storage of mesh edge vertex indices from the
`MEdge` type to the generic `int2` attribute type. This follows the
general design for geometry and the attribute system, where the data
storage type and the usage semantics are separated.

The main benefit of the change is reduced memory usage-- the
memory required to store mesh edges is reduced by 1/3. For example,
this saves 8MB on a 1 million vertex grid. This also gives performance
benefits to any memory-bound mesh processing algorithm that uses edges.

Another benefit is that all of the edge's vertex indices are
contiguous. In a few cases, it's helpful to process all of them as
`Span<int>` rather than `Span<int2>`. Similarly, the type is more
likely to match a generic format used by a library, or code that
shouldn't know about specific Blender `Mesh` types.

Various Notes:
- The `.edge_verts` name is used to reflect a mapping between domains,
  similar to `.corner_verts`, etc. The period means that the data
  shouldn't be changed arbitrarily by the user or procedural operations.
- `edge[0]` is now used instead of `edge.v1`
- Signed integers are used instead of unsigned to reduce the mixing
  of signed-ness, which can be error prone.
- All of the previously used core mesh data types (`MVert`, `MEdge`,
  `MLoop`, `MPoly`) are now deprecated. Only generic types are used.
- The `vec2i` DNA type is used in the few C files where necessary.
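
The 8MB figure above can be reproduced with a little arithmetic. This is
a sketch under assumptions: the grid is taken to be 1000 x 1000 vertices,
the new storage is two 4-byte integers per edge, and the old per-edge size
of 12 bytes is inferred from the stated 1/3 reduction rather than taken
from the `MEdge` definition:

```python
# Assumed sizes in bytes per edge (see the note above the code).
OLD_EDGE_BYTES = 12  # inferred from the stated 1/3 reduction
NEW_EDGE_BYTES = 8   # two 4-byte vertex indices


def grid_edge_count(verts_x: int, verts_y: int) -> int:
    """Number of edges in a regular quad grid with the given vertex counts."""
    horizontal = verts_y * (verts_x - 1)
    vertical = verts_x * (verts_y - 1)
    return horizontal + vertical


edges = grid_edge_count(1000, 1000)
saved = edges * (OLD_EDGE_BYTES - NEW_EDGE_BYTES)
print(edges)  # 1998000
print(saved)  # 7992000 bytes, roughly 8MB
```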

Pull Request: blender/blender#106638
2023-04-17 13:47:41 +02:00
f588a0596b Fix #106943: driver on inactive view layer doesn't work
Animation data (including drivers) on inactive view layers now work. The
removal of such view layers was too optimistic; they are now kept
around. The bases are still removed, mostly for safety's sake and to keep
the changes to a minimum.

`scene_remove_unused_view_layers()` has been renamed to
`scene_minimize_unused_view_layers()` to reflect its new functionality.

For compatibility with assumptions in other areas of the code, the
function still ensures the input view layer is at index 0.

This also introduces a new function
`BKE_view_layer_free_object_content(view_layer)`, which is a subset of
the functionality of `BKE_view_layer_free()`.
2023-04-17 12:59:03 +02:00
fe7540d39a Cleanup: Define type for object type enum
Having a type defined allows the compiler to help with type safety. For
example we can use it in switches to trigger a warning when a new object
type is added but not covered by the switch yet (but probably should).
2023-04-17 12:39:42 +02:00
62d9e55eec Graph editor: fix box select when scene has annotations
The graph editor box select operator now works properly again, when there
is an annotation layer in the scene.
2023-04-17 12:15:24 +02:00
0ed0165eea Refactor: anim, simplify range check
Simple application of De Morgan's law. No functional changes.
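
For readers unfamiliar with the rule, a generic range check rewritten with
De Morgan's law looks like the following. The function names and the
integer test values are illustrative only, they are not taken from the
animation code:

```python
def in_range_negated(value: int, start: int, end: int) -> bool:
    # Original style: negate an "outside the range" test.
    return not (value < start or value > end)


def in_range_simplified(value: int, start: int, end: int) -> bool:
    # De Morgan: not (A or B) == (not A) and (not B).
    return value >= start and value <= end


# Both forms agree for every integer sample around the range [2, 5].
samples = range(0, 8)
agree = all(
    in_range_negated(v, 2, 5) == in_range_simplified(v, 2, 5) for v in samples
)
print(agree)  # True
```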
2023-04-17 12:15:24 +02:00
c8435185e1 I18n: Updated translation files from SVN trunk (r6467). 2023-04-17 12:00:22 +02:00
dfa42c614f Cleanup: UI messages fixes and tweaks. 2023-04-17 11:41:10 +02:00
5491563e59 Fix #106982: crash with muted node
The lazy function for muted nodes did request inputs
even if they were not required.
2023-04-17 10:59:05 +02:00
6e59d0b20f Cleanup: document type of Scene::view_layers 2023-04-17 10:57:09 +02:00
3a02d760f7 Python API: Expose background drawing argument for GPUOffScreen.draw_view3d
Currently, when using the python api for offscreen drawing, the
default background will always be rendered into the GPUOffScreen's
framebuffer, rendering the alpha channel essentially useless and
making it difficult to separate objects from the background.

This patch allows offscreen drawing of a 3d view with transparent
background by exposing an optional parameter to the python api,
enabling, for example, compositing the result over another image.

The new parameter to draw_view3d() is optional, with the default
value matching the previous behavior, so this change is fully
backward compatible.

Pull Request: blender/blender#105748
2023-04-17 09:28:02 +02:00
15f464019a Geometry Nodes: avoid last buffer copy in Blur Attribute node
Previously, there was a "main" and "tmp" buffer and the final
result was expected to be in the "main" buffer. Now the two buffers
are called a and b and the final result can be in either of those.

This can improve performance especially if the number of iterations is low.
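
The two-buffer scheme can be sketched as follows. This is a simplified
1D stand-in, not the node's implementation: the neighbor averaging and the
function name are assumptions made for the example. The point is that the
buffers swap roles every iteration and the caller receives whichever one
was written last, so no final copy back is needed:

```python
from typing import List


def blur_iterations(values: List[float], iterations: int) -> List[float]:
    """Repeatedly average each value with its neighbors using two buffers."""
    a = list(values)
    b = [0.0] * len(a)
    src, dst = a, b
    for _ in range(iterations):
        last = len(src) - 1
        for i in range(len(src)):
            left = src[max(i - 1, 0)]
            right = src[min(i + 1, last)]
            dst[i] = (left + src[i] + right) / 3.0
        # Swap roles instead of copying `dst` back into `src`.
        src, dst = dst, src
    # After the last swap, `src` is the buffer that holds the final result.
    return src


print(blur_iterations([0.0, 3.0, 0.0], 1))  # [1.0, 1.0, 1.0]
```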

Pull Request: blender/blender#106860
2023-04-17 08:08:46 +02:00
348f57bcec Fix #107017: Missing checks for #PyObject_GetBuffer success
`PyObject_GetBuffer` was used without checking that it was successful.
This could cause the code to access an incompatible or uninitialized
`Py_buffer`.

Add the missing checks, and clear the raised `PyExc_BufferError`
to silently fall back to accessing the PyObject as a sequence.
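
The same "try the buffer protocol, otherwise treat it as a sequence"
pattern can be shown from the Python side. This is only an analogy to the
C fix: the helper name is made up, it handles one-dimensional data only,
and at the Python level a failed buffer request surfaces as `TypeError`
from `memoryview()`:

```python
from array import array


def as_float_list(obj):
    """Read numbers through the buffer protocol, else as a plain sequence."""
    try:
        view = memoryview(obj)
    except TypeError:
        # No buffer support: fall back to iterating the object.
        return [float(x) for x in obj]
    with view:
        return [float(x) for x in view.tolist()]


from_buffer = as_float_list(array("d", [1.5, 2.5]))
from_sequence = as_float_list([1, 2, 3])
print(from_buffer)    # [1.5, 2.5]
print(from_sequence)  # [1.0, 2.0, 3.0]
```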
2023-04-17 16:07:20 +10:00
1d8389cd09 Fix: missing cache to get evaluated positions
Without this, there is a crash in the
`geo_node_geometry_test_duplicate_elements_curve_points` test in
a debug build. This was broken in 7bd7043a74.
2023-04-17 06:32:30 +02:00
c7d80b8c70 Fix crash saving an image when ImageOutput::open fails
Saving a PNG into path without write access would crash,
caused by recent move to OIIO.
2023-04-17 13:31:10 +10:00
0b1fb22f69 Fix screenshot path defaulting to the root directory for unsaved files
Using a "//" prefix resolves to the root directory which isn't a good
default as it typically doesn't have write permissions.
Only set the name and let the file selector pick a directory to use
(matches how saving from the text editor works).
2023-04-17 13:31:08 +10:00
153cb7e1df Cleanup: remove inline checks for GPU front-buffer reading
Add WM_window_pixels_read & WM_window_pixels_read_sample that
use front-buffer pixel reading when supported.

Note that direct access to reading from the front-buffer is still needed
for writing thumbnails - where redrawing can cause problems
(see code-comments).
2023-04-17 12:28:56 +10:00
e78c3c9d96 Docs: comments for disabling the front-buffer & view3d offset correction
Expand on why front-buffer support is always disabled on Wayland &
why viewport orbit around selection offset correction isn't used for
perspective views.
2023-04-17 12:27:34 +10:00
7bd7043a74 Fix #106927: Crash when removing handle position attribute
Bezier curve position evaluation expects the handle position attributes
to exist and doesn't handle the case where they don't. Switch to using
a utility function to evaluate each curve type so Bezier evaluation can
stop early in that case.
2023-04-16 21:34:35 -04:00
2fade47a9d Fix: Transform geometry node doesn't translate volumes correctly
Fixes a bug introduced in b0b9e746fa.
The volume transformation matrix is multiplied in the wrong order
which means the grid scale is applied on the translation.
2023-04-17 03:10:40 +02:00
Henry Chang
bd86e719ab UI: Sculpt Paint tool defaults #97616 #105759
Default settings changed for Sculpt mode's
Paint Brush, Smear Brush, and Smear Brush.

~~This includes updates of PR review #105691.~~

Updated to only include commits relevant to this PR.

Reviewed by: Joseph Eagar & Julian Kaspar
Pull Request: #105759
2023-04-16 15:24:47 -07:00
4563a47ac5 Squashed commit of the following:
commit 7aa5e65dcbda862dcb17ecfc6727eb241a12c316
Merge: c08a9ec19f 7c9e493da55
Author: Joseph Eagar <joeedh@gmail.com>
Date:   Sun Apr 16 15:11:53 2023 -0700

    Merge branch 'main' of https://projects.blender.org/ChengduLittleA/blender into ChengduLittleA-main

commit 7c9e493da55a4adbfa2415b711e6d0daa2720ad9
Author: YimingWu <xp8110@outlook.com>
Date:   Fri Mar 31 17:46:32 2023 +0800

    Fix #106358: Handles null evaluated object when entering sculpting workspace.

    The setup where everything in the scene is invisible/not enabled could
    trigger a crash when switched to sculpting workspace, triggered when
    opening the file.

    This patch handles such situation.
2023-04-16 15:14:11 -07:00
Patrick Foley
c08a9ec19f Sculpt: updated Mask and Face Set menu operators
Changed the menu operators:

    Expand Mask by Topology (hotkey Shift A)
    Expand Mask by Normals (hotkey Shift Alt A)
    Expand Face Set by Topology (hotkey Shift W)
    Expand Active Face Set (hotkey Shift Alt W)

so that their hotkeys would appear in their menu entries.

Resolves #104023

Co-authored-by: DisquietingFridge <30654622+DisquietingFridge@users.noreply.github.com>
Pull Request: #104568

Rebased for main instead of sculpt-dev
2023-04-16 15:04:58 -07:00
45ef51d0fb Fix #106242 "Edit Dyntopo Detail Size" status bar missing info
Fixed issue#106242 "Edit Dyntopo Detail Size" status bar missing info

Pull Request: blender/blender#106476
2023-04-16 23:54:48 +02:00
9d4949f80b Cleanup: Reduce nesting in node.cc
Decompose most of the nesting in the code to make the code
more consistent along the line of program execution.
Mainly achieved through:
- Remove redundant else
- Invert condition
- Add temporary variable to redistribute and name conditions

Pull Request: blender/blender#105509
2023-04-16 22:53:09 +02:00
de7e3454fb UI: Capabilities Flag for Clipboard Image copy/paste
This adds a WM_capabilities_flag to indicate that a platform
implements support for copying and pasting images using a shared
clipboard.

Pull Request: blender/blender#106990
2023-04-16 21:04:55 +02:00
254d148458 Fix: PLY export behavior with multiple meshes
A few fixes included here:
- Use `reserve` properly to add space after the first mesh
- Add to the end of the UVs array instead of replacing it for every mesh

Also, a cleanup/simplification:
- Split face size and face vertex loops, they are independent

Pull Request: blender/blender#106967
2023-04-16 20:00:16 +02:00
e1571cb105 Cleanup: correct terms, spelling in comments 2023-04-16 20:41:22 +10:00
5f40118899 Cleanup: rename GPU_offscreen_read_{pixels=>color} noted as a TODO 2023-04-16 20:38:19 +10:00
6cc2c16d06 Fix #106264: Color picker broken with Wayland & AMD GPU
- Use off-screen drawing when reading from the front-buffer isn't
  supported.

- Add a capabilities flag for reading the front-buffer which is always
  disabled on WAYLAND.

- Add GPU_offscreen_read_pixels_region, used for reading a sub-region of
  an off-screen buffer - use for color-picking a single pixel.
2023-04-16 20:16:54 +10:00
6722f90734 Cleanup: quiet mypy warnings in gitea_inactive_developers
Also add to the list of scripts to check with "make check_mypy".
2023-04-16 17:03:56 +10:00
b827c8cd1e Fix #104385: Unexpected clipping in ortho view & orbit around selection
Orbit around selection didn't work well in orthographic views,
potentially causing viewport offset to drift during navigation
to the point content would be outside the far clipping range.

Resolve by aligning the view offset depth with the dynamic offset
being orbited around.
2023-04-16 16:24:41 +10:00
8afb8db66e Cleanup: spelling in comments 2023-04-16 16:24:38 +10:00
cffc9bdb93 Cleanup: quiet unused argument warning 2023-04-16 16:24:36 +10:00
bb25302fc3 Docs: Fix wrong function return type
Fixes blender/blender-manual#104384
2023-04-15 21:03:47 -04:00
f2163e657e Merge branch 'asset-shelf' into brush-assets-project 2023-04-14 12:51:07 +02:00
2acf304e14 Add display popover and "Show Names" option 2023-04-14 12:50:54 +02:00
961c6b1c48 Merge branch 'asset-shelf' into brush-assets-project 2023-04-14 12:02:28 +02:00
5c15bcac30 Merge branch 'main' into asset-shelf 2023-04-14 12:02:02 +02:00
5cb07d29df Merge branch 'asset-shelf' into brush-assets-project 2023-04-13 15:58:13 +02:00
00b689c357 Merge branch 'main' into asset-shelf 2023-04-13 15:57:48 +02:00
a3baa62487 Fix uninitialized asset shelf theme color in existing theme 2023-04-12 17:25:14 +02:00
73a2c3453c Fix error when re-registering a bpy.types.AssetShelf subclass 2023-04-12 17:20:26 +02:00
505b4a5dfc Merge branch 'asset-shelf' into brush-assets-project 2023-04-12 16:26:16 +02:00
21ae01b7eb Merge branch 'main' into asset-shelf 2023-04-12 16:25:33 +02:00
52e431f28e Merge branch 'asset-shelf' into brush-assets-project 2023-04-12 15:18:32 +02:00
5efe4a32ff Merge branch 'main' into asset-shelf 2023-04-12 15:18:05 +02:00
4b06cabf1e Update for subversion bump in main branch 2023-04-11 18:29:04 +02:00
45a8625088 Merge branch 'main' into asset-shelf 2023-04-11 18:27:24 +02:00
e25eec6ad8 Merge branch 'asset-shelf' into brush-assets-project 2023-04-07 16:05:20 +02:00
512994fbdb Merge branch 'main' into asset-shelf 2023-04-07 16:03:14 +02:00
321972e69a Merge branch 'asset-shelf' into brush-assets-project 2023-04-07 14:35:40 +02:00
8c3fe5cf23 Merge remote-tracking branch 'upstream/main' into asset-shelf 2023-04-07 14:34:25 +02:00
b93fe0f9e0 Add "All" catalog tab that is always visible 2023-04-07 13:09:14 +02:00
b7a9fed4bc Change default color for asset shelf footer/header 2023-04-07 11:58:11 +02:00
d067e77ee6 Fix wrong theme setting used for asset shelf footer/header 2023-04-07 11:57:36 +02:00
253b417225 Add theme options for the asset shelf 2023-04-07 11:38:37 +02:00
2cd3f1443a Merge branch 'asset-shelf' into brush-assets-project 2023-04-06 12:11:27 +02:00
455deed475 Merge branch 'main' into asset-shelf 2023-04-06 12:10:56 +02:00
2e288dad63 Use new region poll to remove asset shelf based on context 2023-04-05 16:10:23 +02:00
8f74d10ee9 Merge branch 'main' into asset-shelf 2023-04-05 15:55:28 +02:00
cedb752766 Merge branch 'asset-shelf' into brush-assets-project 2023-04-03 16:05:08 +02:00
5797c4944d Merge branch 'main' into asset-shelf 2023-04-03 16:03:31 +02:00
a2eaea461d Merge branch 'asset-shelf' into brush-assets-project 2023-04-01 20:21:29 +02:00
2be8104d8d Merge branch 'main' into asset-shelf 2023-04-01 20:20:35 +02:00
301369ca4b Merge branch 'asset-shelf' into brush-assets-project 2023-03-30 18:42:35 +02:00
f991d52377 Merge branch 'main' into asset-shelf 2023-03-30 18:41:56 +02:00
7ecb23962e Switch to asset-shelf add-ons branch on make update 2023-03-30 15:34:59 +02:00
12d3bda37a Merge branch 'asset-shelf' into brush-assets-project 2023-03-30 12:48:15 +02:00
baa3b34bad Merge branch 'main' into asset-shelf 2023-03-30 12:38:04 +02:00
f9106cb08a Merge branch 'asset-shelf' into brush-assets-project 2023-03-29 11:33:53 +02:00
3b06cb74a6 Merge branch 'temp-asset-weak-reference' into asset-shelf 2023-03-29 11:31:54 +02:00
e8bd93685b Cleanup. 2023-03-29 11:04:52 +02:00
098cd86fa4 Merge branch 'main' into temp-asset-weak-reference 2023-03-29 10:38:31 +02:00
3dc872af0e More fixes. 2023-03-28 20:32:04 +02:00
e28d0720ae More tests and debug prints. 2023-03-28 20:19:16 +02:00
7fa63f66c3 Further fixes and debug prints for path & assets nightmare. 2023-03-28 20:12:03 +02:00
ba73275532 Further fixes to path handling.
debug prints are intended for now.
2023-03-28 16:57:03 +02:00
e27067d7ac Attempt to fix normalizing ID name part of paths. 2023-03-28 15:17:56 +02:00
de9f3308de Merge branch 'main' into temp-asset-weak-reference 2023-03-28 10:52:23 +02:00
08f0b8c870 Merge branch 'asset-shelf' into brush-assets-project 2023-03-27 12:19:47 +02:00
39c45ec425 Merge branch 'main' into asset-shelf 2023-03-27 12:19:16 +02:00
4710dc0337 Fix test failure on macOS because of short string optimization of moved result
Since we pass around a struct with a string, and multiple string-references
into this string, we have to make sure the memory address of the string buffer
never changes. It would seem that move semantics give this behavior, but it
wouldn't work for short strings. When moving a string, short string
optimization would still require the string to be copied, not moved, causing a
change in the memory address. GCC and Clang use different definitions of "short
string", causing differences in behavior.

Wrap the string in a unique pointer, so the string itself is never moved or
copied.
2023-03-24 15:10:08 +01:00
2a7571fa41 Cleanup: Use more clear nullopt return value instead of {} 2023-03-24 15:04:03 +01:00
4dd7d9380d Merge branch 'main' into temp-asset-weak-reference 2023-03-24 11:46:58 +01:00
ce2ef6b9cc Merge branch 'main' into temp-asset-weak-reference 2023-03-23 17:51:33 +01:00
974ebe4864 Merge branch 'main' into temp-asset-weak-reference 2023-03-23 12:21:52 +01:00
873097b0e6 Address points from review 2023-03-23 12:21:18 +01:00
cb96a82aa6 Fix test failure when test asset path is absolute with ".." components 2023-03-23 12:10:57 +01:00
662a405b46 Merge branch 'temp-asset-weak-reference' into brush-assets-project 2023-03-23 11:55:31 +01:00
02d6b2fc16 Merge branch 'asset-shelf' into brush-assets-project 2023-03-23 11:55:16 +01:00
1dce368d37 Merge branch 'main' into asset-shelf 2023-03-23 11:54:27 +01:00
c40c110d4d Fix asset representation tests. 2023-03-23 11:47:05 +01:00
47a1e77a60 Merge branch 'main' into temp-asset-weak-reference 2023-03-23 11:42:11 +01:00
9a56b28771 Basic unit tests for the new path resolving and exploding function 2023-03-22 12:41:08 +01:00
693be95b2d Some refactoring for function to resolve to an exploded path
- Use (arguably) more readable return type instead of return arguments.
- Update function API comment.
- Use more clear (because more direct) return values
2023-03-22 12:34:12 +01:00
1d465ff09c Add function to create an exploded path from a weak asset reference 2023-03-22 11:27:04 +01:00
f8018f7a10 Merge branch 'main' into asset-shelf 2023-03-22 11:14:45 +01:00
91c903b714 Merge branch 'asset-shelf' into brush-assets-project 2023-03-21 18:50:55 +01:00
203e2a3820 Merge branch 'main' into asset-shelf 2023-03-21 18:50:13 +01:00
a26021fc00 Merge branch 'asset-shelf' into brush-assets-project 2023-03-21 17:18:28 +01:00
be1b0deecb Merge branch 'main' into asset-shelf 2023-03-21 17:17:18 +01:00
9e632da128 Free asset shelf types lists in SpaceTypes. 2023-03-21 17:16:38 +01:00
b5ee3d6614 Merge branch 'asset-shelf' into brush-assets-project 2023-03-21 12:52:06 +01:00
5478ca2589 Merge branch 'main' into asset-shelf 2023-03-21 12:50:45 +01:00
af482b91dc Update to changes in main branch 2023-03-20 11:53:49 +01:00
2f258e5b02 Merge branch 'main' into asset-shelf 2023-03-20 11:53:43 +01:00
56070deb6c Merge branch 'asset-shelf' into brush-assets-project 2023-03-16 21:04:55 +01:00
045ef3e214 Merge branch 'main' into asset-shelf 2023-03-16 21:04:15 +01:00
8c1eb0e79a Merge branch 'temp-asset-weak-reference' into brush-assets-project 2023-03-16 18:02:50 +01:00
84503ebf34 Add functions to resolve a full path to an asset from the weak reference
Includes unit tests.
2023-03-16 18:02:16 +01:00
6390c90791 Merge branch 'main' into temp-asset-weak-reference 2023-03-16 16:32:45 +01:00
f5bfa3cd6b Merge branch 'asset-shelf' into brush-assets-project 2023-03-16 16:19:31 +01:00
90097c8c77 Merge branch 'main' into asset-shelf 2023-03-16 16:10:34 +01:00
d1a9dbe2bb Merge branch 'main' into temp-asset-weak-reference 2023-03-16 15:09:44 +01:00
4c12dbd8aa Merge branch 'asset-shelf' into brush-assets-project 2023-03-10 18:15:28 +01:00
e1dd542445 Merge branch 'main' into asset-shelf 2023-03-10 18:11:11 +01:00
3e844fb2dc Attempt to fix linker error on GCC 2023-03-09 16:44:49 +01:00
9c0a73ee2c Merge branch 'temp-asset-weak-reference' into brush-assets-project 2023-03-09 16:13:39 +01:00
d6aa967809 Merge branch 'asset-shelf' into brush-assets-project 2023-03-09 16:13:32 +01:00
0ab8643dcd Merge branch 'main' into asset-shelf 2023-03-09 16:13:21 +01:00
819933cc2e Cleanup: struct member order 2023-03-09 15:47:56 +01:00
7a6f7aa8bc Add test case for custom asset libraries 2023-03-09 15:47:26 +01:00
a77e194b37 Remove redundant library name parameter in more cases 2023-03-09 15:46:48 +01:00
04e820cc3f Fix potentially uninitialized values in weak reference 2023-03-09 15:00:57 +01:00
8849199940 Merge branch 'main' into temp-asset-weak-reference 2023-03-09 14:51:53 +01:00
ccbbad8d4b Refactor: Don't pass library name for builtin asset libraries 2023-03-09 14:50:58 +01:00
436d8b6521 Merge branch 'temp-asset-weak-reference' into brush-assets-project 2023-03-09 12:04:34 +01:00
54b6b8660a Fix error in asset weak reference move constructor
I thought the tests passed, what did I doo??..
2023-03-09 12:02:16 +01:00
db6d6ad604 Merge branch 'temp-asset-weak-reference' into brush-assets-project 2023-03-08 16:57:04 +01:00
68c17c32fd Merge branch 'asset-shelf' into brush-assets-project 2023-03-08 16:56:33 +01:00
dca315eea2 Merge branch 'main' into asset-shelf 2023-03-08 16:56:09 +01:00
146141fad3 Merge branch 'main' into temp-asset-weak-reference 2023-03-08 16:53:03 +01:00
557da13294 Initial weak asset reference, to store asset references in .blends 2023-03-08 16:51:52 +01:00
a809070c2d Merge branch 'asset-shelf' into brush-assets-project 2023-03-07 11:41:00 +01:00
8a83495f7b Merge branch 'main' into asset-shelf 2023-03-07 11:31:13 +01:00
9f3d8c2373 Merge branch 'asset-shelf' into brush-assets-project 2023-03-07 11:13:23 +01:00
7b81c6e78b Merge branch 'main' into asset-shelf 2023-03-07 11:10:43 +01:00
b722292130 Merge branch 'main' into asset-shelf 2023-03-06 11:29:31 +01:00
3d1cb8a5ab Merge branch 'main' into asset-shelf 2023-03-03 16:44:10 +01:00
9ea9239189 Merge branch 'main' into asset-shelf 2023-03-02 14:46:38 +01:00
ed60f6c1c3 Update to RNA type registration changes in master 2023-02-28 12:20:05 +01:00
88cf5a36cd Merge branch 'main' into asset-shelf 2023-02-28 12:04:00 +01:00
62af85aed4 Split asset shelf file into multiple files
Makes code more focused, improving local readability.
2023-02-27 19:36:00 +01:00
7d15670a2a Remove duplicated code from merge conflict 2023-02-27 17:03:00 +01:00
1667992e14 Merge branch 'main' into asset-shelf 2023-02-27 17:01:23 +01:00
9b3954c9f8 Merge branch 'asset-shelf' into brush-assets-project 2023-02-27 15:11:02 +01:00
a0b5fda5af Fix build error 2023-02-27 12:25:40 +01:00
19dbb4bfdc Merge branch 'main' into asset-shelf 2023-02-27 11:45:01 +01:00
ccde6e8bea Merge branch 'asset-shelf' into brush-assets-project 2023-02-22 14:28:15 +01:00
1e31d65986 Cleanup: Minor comment formatting change 2023-02-22 14:27:16 +01:00
36326ab736 Merge branch 'temp-asset-shelf-type-bpy' into asset-shelf 2023-02-21 18:49:56 +01:00
2cbf647d9c Temporary design to support filtering assets by type
Python-defined asset shelves can now define an `asset_poll__()` function
to determine if an asset should be visible or not, based on type
information. This isn't great and can probably be a performance issue in
bigger libraries. A proper solution would be to provide a set of asset
traits to filter by, but this is a bit tricky to implement. This is a
temporary solution so the brush assets project isn't held up by this.
2023-02-21 18:47:11 +01:00
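The per-asset polling described in the commit above can be pictured with a small stand-alone sketch. It does not use `bpy`: `Asset`, `BrushShelf` and `visible_assets` are hypothetical stand-ins, and only the `asset_poll__()` name comes from the commit message. It also hints at why this could become slow in bigger libraries, since the callback runs once per asset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """Hypothetical stand-in for an asset entry; not a Blender type."""
    name: str
    id_type: str


class BrushShelf:
    """Hypothetical shelf that only wants to show brush assets."""

    @classmethod
    def asset_poll__(cls, asset: Asset) -> bool:
        # Decide visibility from type information only.
        return asset.id_type == "BRUSH"


def visible_assets(shelf, assets):
    # The poll runs once per asset, so the cost grows linearly with library size.
    return [asset for asset in assets if shelf.asset_poll__(asset)]


library = [
    Asset("Clay Strips", "BRUSH"),
    Asset("Walk Cycle", "ACTION"),
    Asset("Smooth", "BRUSH"),
]
shown = [asset.name for asset in visible_assets(BrushShelf, library)]
```

A trait-based filter, as the commit suggests for a proper solution, would presumably let the shelf declare what it wants up front instead of being asked about every asset.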
1ede8e033c Merge branch 'main' into temp-asset-shelf-type-bpy 2023-02-21 15:29:55 +01:00
d6c4ca72dd Merge branch 'main' into temp-asset-shelf-type-bpy 2023-02-21 15:23:13 +01:00
3a3b444c8e Fix new essentials asset library not being covered in library path query 2023-02-21 14:50:19 +01:00
de27f63ac4 Brush: Add writing and reading ID preview for Brushes.
Mandatory change for the Brush Assets project, from quick test does not
seem to break anything (more) in existing 'old' brushes...

Re. #101908.

Pull Request #105016
2023-02-21 14:50:19 +01:00
e24ea1483c Fix Cycles MetalRT access of macOS 11 features when unavailable
After recent changes in 2d994de.

Pull Request #104976
2023-02-21 14:50:19 +01:00
bbf7dacbb0 Gitea: more updates for new scoped label syntax 2023-02-21 14:50:19 +01:00
451458bad9 Gitea: update for new scoped label syntax 2023-02-21 14:50:19 +01:00
b6a2d0cebd Curves: Add cursor snapping support
This adds support for cursor snapping for the new curves object.

It implements a function `transverts_from_curves_positions_create` (to separate the logic from the `Curves` object type). That function is then C wrapped by `ED_curves_transverts_create` and finally used in `ED_transverts_create_from_obedit`.

Pull Request #104967
2023-02-21 14:50:19 +01:00
2d8b1455bc Curves: Fix proportional editing not working
This adds proper support for proportional editing for the Curves object.

Co-authored-by: Hans Goudey <h.goudey@me.com>
Pull Request #104620
2023-02-21 14:50:19 +01:00
7fe04f68a0 Cleanup: Mark overridden virtual call as such
Fixes the `-Winconsistent-missing-override` warning.

In theory the `virtual` is redundant in such a case, but this is how
it is done in many other areas of USD code.

Pull Request #104977
2023-02-21 14:50:19 +01:00
37a2e81747 Fix #104979: GPencil Dot-hash only affects first frame with Time mod
The active frame must be recovered using `BKE_gpencil_frame_retime_get`
2023-02-21 14:50:19 +01:00
ffebf8685f Fix: Channel clamping when markers are used
Previously when markers were used, the newly introduced clamping code (#104516) would stop the last channel from being shown.

This patch fixes that by modifying the `v2d->tot.ymin` calculation.

This is a bit counterintuitive since the `v2d->tot` height is calculated in `action_draw.c`. But the advantage of doing it there is that it also works for the channels region.

Pull Request #104892
2023-02-21 14:50:19 +01:00
57705d9f98 Tests: Address imbuf_save failures on ARM64 builds
This does 2 things to address the ARM64 failures:
- Increases the threshold to be in line with what Cycles uses
- Disables the 2 problematic WebP variations (#105006 will track)
2023-02-21 14:50:19 +01:00
e21911a0c3 Cleanup: format 2023-02-21 14:50:19 +01:00
615958c449 Tests: Add tests for image format saving and loading
This adds saving and loading tests for our supported image formats.

**Saving - bf_imbuf_save.py**
There are 2 template images which are loaded anew for each file save
attempt.  One is an 8-bit RGBA image and the other 32-bit. This is
required as many formats use a variety of factors to determine which of
`ibuf->rect` or `ibuf->rectfloat` to use for processing.  The templates
are constructed to have alpha transparency as well as values > 1 (or
clamped to 1 for the case of the 8-bit template).

Test flow:
 - Load in an appropriate template image
 - Save it to the desired format with the desired set of options
 - Compare against the reference image

Notes:
 - 98 references are used totaling ~3.6MB
 - 10-12 second test runtime
 - Templates can be reconstructed with the create-templates.blend file

**Loading - bf_imbuf_load.py**
Test flow:
 - Load in each of the reference images
 - Save them back out as .exr
 - Save additional metadata to a secondary file (alpha mode, colorspace etc)
 - Compare the saved out .exr with another set of reference .exrs
 - Compare the saved out file metadata with set of reference metadata

Notes:
 - 98 exr references are used totaling ~10MB
 - 10-12 second test runtime as well

An HTML report is not implemented. The diff output organization is very
similar to the other tests so it should be somewhat easy to do in the
future if we want.

The standard set of environment variables are implemented for both:
BLENDER_TEST_UPDATE, BLENDER_VERBOSE, and BLENDER_TEST_COLOR

Pull Request #104442
2023-02-21 14:50:19 +01:00
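The "compare against the reference image" step in the test flow above can be sketched roughly as follows. This is not the code of `bf_imbuf_save.py` or `bf_imbuf_load.py`: the helper names, the flat list-of-floats pixel representation and the max-difference metric are all assumptions made for illustration. Only the `BLENDER_TEST_UPDATE` variable name comes from the commit message, and treating any non-empty value as "on" is likewise an assumption.

```python
import os


def max_abs_diff(pixels_a, pixels_b):
    """Largest per-channel difference between two equally sized buffers."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("buffers differ in size")
    return max((abs(a - b) for a, b in zip(pixels_a, pixels_b)), default=0.0)


def matches_reference(output, reference, threshold):
    # A test passes when no channel deviates by more than the threshold.
    return max_abs_diff(output, reference) <= threshold


def should_update_references(environ=os.environ):
    # Assumption: any non-empty value of BLENDER_TEST_UPDATE enables updating.
    return bool(environ.get("BLENDER_TEST_UPDATE"))
```

Raising the threshold, as a later commit on this page does for ARM64, simply widens the tolerance passed to a check of this kind.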
8f2a9bc116 Sculpt: Implement mesh filter cancel
Added new function sculpt_mesh_filter_cancel in sculpt_filter_mesh.cc
for cancelling mesh filters. It currently is unused pending a
revamped modal map for mesh filter (see pull req 104718).
2023-02-21 14:50:19 +01:00
b3a5c8df62 BMesh: fix invalid existence check in BM_mesh_bm_to_me
Remember that the null customdata layer index is -1,
not 0.
2023-02-21 14:50:19 +01:00
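The point of the commit above, that "not found" is -1 rather than 0, can be illustrated in isolation. The sketch below is plain Python with hypothetical helper names and is not the BMesh code; it shows one way an existence check can go wrong when the sentinel is treated like a boolean.

```python
NO_LAYER = -1  # "not found" sentinel, as stated in the commit message


def layer_index(layer_names, name):
    """Return the index of `name`, or NO_LAYER when it is absent."""
    try:
        return layer_names.index(name)
    except ValueError:
        return NO_LAYER


def has_layer_buggy(layer_names, name):
    # Wrong: index 0 is a valid layer but is falsy, and -1 is truthy.
    return bool(layer_index(layer_names, name))


def has_layer(layer_names, name):
    # Correct: compare against the sentinel explicitly.
    return layer_index(layer_names, name) != NO_LAYER
```

With `["position", "uv"]`, the buggy check reports the first layer as missing and a non-existent layer as present, which is exactly the inversion an explicit comparison avoids.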
5de924a656 Fix test
Pull Request #104934
2023-02-21 14:50:19 +01:00
eefaa1f8fa BLI_math: fix parameter aliasing in mul_m3_series and mul_m4_series
No functional changes.
2023-02-21 14:50:19 +01:00
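As general background on why parameter aliasing matters for matrix routines, here is a small 2x2 sketch. The function names are hypothetical and this is not the `mul_m3_series` code. The commit states there were no functional changes, so this shows the hazard in general rather than reproducing the issue it fixed.

```python
def mul_m2_unsafe(r, a, b):
    """Write a @ b into r element by element; wrong if r is a or b."""
    for i in range(2):
        for j in range(2):
            r[i][j] = a[i][0] * b[0][j] + a[i][1] * b[1][j]


def mul_m2_safe(r, a, b):
    """Compute into a temporary first, then copy, so aliasing is harmless."""
    tmp = [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
           for i in range(2)]
    for i in range(2):
        r[i][:] = tmp[i]
```

Calling `mul_m2_unsafe(a, a, b)` overwrites entries of `a` that later terms still need, while the safe variant reads all inputs before writing anything.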
8ff42046ab Fix #82936: Make Geometry Nodes modifier icon blue in outliner
In the outliner, the icons for modifiers are tinted blue. This didn't
work for the geometry nodes modifier icon.

Defining the icon with the macro `DEF_ICON_MODIFIER` also
defines the appropriate theme color so it's now tinted blue
when drawn in the outliner like the other modifier icons.

Pull Request #104957
2023-02-21 14:50:19 +01:00
ac17d02993 Nodes: Allow adding multiple search items per type in add menu
Add a per node type callback for creating node add search operations,
similar to the way link drag search is implemented (11be151d58).

Currently the searchable strings have to be separate items in the list.
In a separate step, we can look into adding invisible searchable text
to search items if that's still necessary.

Resolves #102118

Pull Request #104794
2023-02-21 14:50:19 +01:00
80dfc85e71 Merge branch 'asset-shelf' into brush-assets-project 2023-02-21 12:23:21 +01:00
2b64282d9a Merge branch 'main' into asset-shelf 2023-02-21 12:22:44 +01:00
dcb1147eef Basic support for registering asset shelf as a type in BPY
For example, the pose library add-on can now register an asset shelf
like this:

```
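import bpy
from bpy.types import Context

# `PoseLibraryPanel` is defined elsewhere in the pose library add-on.
class VIEW3D_AST_pose_library(bpy.types.AssetShelf):
    bl_space_type = "VIEW_3D"

    @classmethod
    def poll(cls, context: Context) -> bool:
        return PoseLibraryPanel.poll(context)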
```

Filtering by ID type is not supported yet.

This replaces the hack of registering a header type and the asset shelf
template to draw into that.
2023-02-20 20:48:59 +01:00
38833c26e4 Cleanup: Minor correction in comment 2023-02-20 17:11:22 +01:00
11b871728b Merge branch 'main' into asset-shelf 2023-02-20 17:10:22 +01:00
dc46d4c4b3 Merge branch 'main' into brush-assets-project 2023-02-20 15:04:07 +01:00
04f1a1ce21 fix building. 2023-02-18 16:42:51 +01:00
c24c3a445b Merge branch 'main' into brush-assets-project 2023-02-18 16:01:04 +01:00
89177caaaf Merge branch 'main' into brush-assets-project 2023-02-17 11:16:11 +01:00
4fe8d1e39e Fix unused variable in release builds
Assert is not needed in this case.
2023-02-16 18:56:03 +01:00
8b9db43ca0 Merge branch 'main' into asset-shelf 2023-02-16 17:47:28 +01:00
baf45e1ac8 Show asset name in tooltip, add a small gap before catalog tabs 2023-02-03 15:50:22 +01:00
002158c26f Fix asset shelf footer region resizing using wrong coordinates 2023-02-03 15:50:08 +01:00
2110b71f1c Allow operators to override asset applying on click through the keymap 2023-02-03 15:41:48 +01:00
4b4cf6da9a Merge branch 'master' into asset-shelf 2023-02-02 17:02:10 +01:00
56582fbf82 Get filtering by asset catalog to work 2023-02-01 16:26:30 +01:00
d5c60f912f Tabs to activate a catalog
The tabs should be fully working themselves, however we don't filter
the asset shelf contents based on the active catalog (well, catalog
path) yet.

Includes the changes from D17164.
2023-02-01 12:49:55 +01:00
3700b74476 Merge branch 'temp-uibut-non-trivial-construction' into asset-shelf 2023-02-01 11:25:36 +01:00
c4176781d3 Fix assert failures because of unexpected RNA index default 2023-01-31 17:37:36 +01:00
3a34c91448 Cleanup: Remove unnecessary macro & unnecessary cast 2023-01-31 17:32:05 +01:00
b4edc40fb8 UI: Make uiBut safe for non-trivial construction
Essentially, I wanted to use a non-trivially-constructible C++ type
(`std::function`) inside `uiBut`. But this would mean we can't use
`MEM_cnew()` like allocation anymore.

Rather than writing worse code, allow non-trivial construction for
`uiBut`. Member-initializing all members is annoying since there are so
many, but rather safe than sorry. As we use more C++ types (e.g. convert
callbacks to use `std::function`), this should become less since they
initialize properly on default construction.

Also use proper C++ inheritance for `uiBut` subtypes, the old way to
allocate based on size isn't working anymore.

Differential Revision: https://developer.blender.org/D17164
2023-01-31 17:26:29 +01:00
db77a9c55e Merge branch 'master' into asset-shelf 2023-01-31 11:24:51 +01:00
4fa69fbda8 Popup to select which catalogs are displayed in the asset shelf footer
The selected catalogs are currently listed as simple labels in the
footer, just for testing.
2023-01-19 11:50:45 +01:00
47c9c31138 Merge branch 'master' into asset-shelf 2023-01-18 18:32:25 +01:00
b3ee7ad2cc Extend asset shelf region with a region for the catalogs & options
The new region is empty, except for a dummy button.
2023-01-17 16:24:03 +01:00
8ddf492e7c Basic asset shelf prototype
Adds the necessary bits to be able to show an asset shelf template via
the pose library add-on.
2023-01-17 15:18:17 +01:00
0e3f5c6673 Merge branch 'master' into asset-shelf 2023-01-17 13:08:57 +01:00
9b00338ed4 Merge branch 'master' into asset-shelf 2023-01-16 17:22:56 +01:00
d6df32a6f8 Add basic (empty) asset shelf region 2022-12-07 20:00:27 +01:00
3cd93ace24 Simple progress reporting for "All" library
Progress bar display for file reading (and other operations) has actually
been broken in master for a while, so this won't actually be reported. Still
calculate it so it works once that's fixed.
2022-12-07 18:29:38 +01:00
e8575bfd4a General cleanup (comments, remove outdated TODO marks, naming) 2022-12-07 18:20:05 +01:00
747a9ea263 Make catalogs from "All" library read-only
Loading the asset library will create a read-only catalog service. The
read-only nature is not dealt with much in the asset catalog code, the
using code (e.g. the UI) is responsible for respecting it.
2022-12-06 17:02:05 +01:00
af0c1d72a2 Merge branch 'master' into temp-asset-library-all 2022-12-06 11:57:01 +01:00
11abc1be39 Use "All" library for node search menu building
Code was manually building the search menu items from all asset
libraries, this is simpler now.
2022-12-02 20:29:12 +01:00
ecc25bc62e Use "All" library for node add menu building
Code was manually building the add menu from all asset libraries, this
should be simpler now.
2022-12-02 20:23:54 +01:00
a07a2e2369 Avoid redundant loading of catalogs and "All" library processing 2022-12-02 19:37:43 +01:00
af5d225653 Load catalogs for "All" asset library
Merges the catalog definitions from all asset libraries into the
storage of the "All" one, builds the catalog tree and refreshes data as
needed. This doesn't allow writing changes back to the catalog
definition files, so the UI probably shouldn't allow edits.
2022-12-02 19:20:37 +01:00
1dc8305213 Merge branch 'master' into temp-asset-library-all 2022-12-02 19:18:18 +01:00
fb2303fb73 Avoid ugly nested library storage
We actually don't have to do this, since we can just iterate over all
loaded libraries after calling the loading for the "All" asset library.
2022-12-02 16:58:47 +01:00
5186c9c9c6 Merge remote-tracking branch 'origin/master' into temp-asset-library-all 2022-12-02 16:20:28 +01:00
126136baab Fix missing asset previews and broken drag & drop in "All" library
Together with the changes made in master, all this does is making sure
the assets are loaded and removed using the correct asset library nested
within the "All" library. Now full paths for the assets can be built
correctly from the asset identifier, which fixes preview loading and
drag & drop.
2022-11-30 20:19:30 +01:00
3f1e4f6f56 Merge branch 'master' into temp-asset-library-all 2022-11-30 20:02:29 +01:00
03bd437170 Merge branch 'master' into temp-asset-library-all 2022-11-30 19:49:29 +01:00
2c2515d465 Merge branch 'master' into temp-asset-library-all 2022-11-29 11:14:41 +01:00
d51212c4f0 Integrate "All" library better with the asset system
Now it actually loads data from all asset libraries when this is
selected. The asset representations still need to be loaded by the file
browser backend, this won't change for now.

This adds the concept of nested asset libraries, which I'd prefer to
keep as implementation detail and not expose in the API. But for now
it's needed (for the asset representation loading by the file browser
backend).
2022-11-28 19:37:00 +01:00
86b9b1df22 Merge branch 'master' into temp-asset-library-all 2022-11-28 15:44:59 +01:00
ca8fa2f7d6 Merge branch 'master' into temp-asset-library-all 2022-11-24 16:25:39 +01:00
33bcc4f430 Initial "All" asset library loading support
An "All" asset library can be selected in the Asset Browser and asset
view templates now, and that will load all assets from all asset
libraries. Preview loading, drag & drop and asset catalogs don't work
yet.
2022-11-22 17:59:29 +01:00
771 changed files with 16952 additions and 12114 deletions

View File

@@ -521,7 +521,8 @@ endif()
if(NOT APPLE)
option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" ON)
option(WITH_CYCLES_HIP_BINARIES "Build Cycles AMD HIP binaries" OFF)
set(CYCLES_HIP_BINARIES_ARCH gfx900 gfx906 gfx90c gfx902 gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1034 gfx1035 gfx1100 gfx1101 gfx1102 CACHE STRING "AMD HIP architectures to build binaries for")
# Radeon VII (gfx906) not currently working with HIP SDK, so left out of the list.
set(CYCLES_HIP_BINARIES_ARCH gfx900 gfx90c gfx902 gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1034 gfx1035 gfx1100 gfx1101 gfx1102 CACHE STRING "AMD HIP architectures to build binaries for")
mark_as_advanced(WITH_CYCLES_DEVICE_HIP)
mark_as_advanced(CYCLES_HIP_BINARIES_ARCH)
endif()
@@ -1580,6 +1581,8 @@ elseif(CMAKE_C_COMPILER_ID MATCHES "Clang")
add_check_c_compiler_flag(C_REMOVE_STRICT_FLAGS C_WARN_NO_MISSING_NORETURN -Wno-missing-noreturn)
add_check_c_compiler_flag(C_REMOVE_STRICT_FLAGS C_WARN_NO_UNUSED_BUT_SET_VARIABLE -Wno-unused-but-set-variable)
add_check_c_compiler_flag(C_REMOVE_STRICT_FLAGS C_WARN_NO_DEPRECATED_DECLARATIONS -Wno-deprecated-declarations)
add_check_c_compiler_flag(C_REMOVE_STRICT_FLAGS C_WARN_NO_STRICT_PROTOTYPES -Wno-strict-prototypes)
add_check_c_compiler_flag(C_REMOVE_STRICT_FLAGS C_WARN_NO_BITWISE_INSTEAD_OF_LOGICAL -Wno-bitwise-instead-of-logical)
add_check_cxx_compiler_flag(CXX_REMOVE_STRICT_FLAGS CXX_WARN_NO_UNUSED_PARAMETER -Wno-unused-parameter)
add_check_cxx_compiler_flag(CXX_REMOVE_STRICT_FLAGS CXX_WARN_NO_UNUSED_PRIVATE_FIELD -Wno-unused-private-field)
@@ -1593,6 +1596,7 @@ elseif(CMAKE_C_COMPILER_ID MATCHES "Clang")
add_check_cxx_compiler_flag(CXX_REMOVE_STRICT_FLAGS CXX_WARN_NO_UNDEFINED_VAR_TEMPLATE -Wno-undefined-var-template)
add_check_cxx_compiler_flag(CXX_REMOVE_STRICT_FLAGS CXX_WARN_NO_INSTANTIATION_AFTER_SPECIALIZATION -Wno-instantiation-after-specialization)
add_check_cxx_compiler_flag(CXX_REMOVE_STRICT_FLAGS CXX_WARN_NO_MISLEADING_INDENTATION -Wno-misleading-indentation)
add_check_cxx_compiler_flag(CXX_REMOVE_STRICT_FLAGS CXX_WARN_NO_BITWISE_INSTEAD_OF_LOGICAL -Wno-bitwise-instead-of-logical)
elseif(CMAKE_C_COMPILER_ID MATCHES "Intel")

View File

@@ -90,28 +90,26 @@ include(cmake/haru.cmake)
# Boost needs to be included after `python.cmake` due to the PYTHON_BINARY variable being needed.
include(cmake/boost.cmake)
include(cmake/pugixml.cmake)
include(cmake/ispc.cmake)
include(cmake/openimagedenoise.cmake)
include(cmake/embree.cmake)
include(cmake/openpgl.cmake)
include(cmake/fmt.cmake)
include(cmake/robinmap.cmake)
include(cmake/xml2.cmake)
include(cmake/fribidi.cmake)
include(cmake/harfbuzz.cmake)
if(NOT APPLE)
include(cmake/xr_openxr.cmake)
if(NOT WIN32 OR BUILD_MODE STREQUAL Release)
include(cmake/dpcpp.cmake)
include(cmake/dpcpp_deps.cmake)
endif()
include(cmake/dpcpp.cmake)
include(cmake/dpcpp_deps.cmake)
if(NOT WIN32)
include(cmake/igc.cmake)
include(cmake/gmmlib.cmake)
include(cmake/ocloc.cmake)
endif()
endif()
include(cmake/ispc.cmake)
include(cmake/openimagedenoise.cmake)
# Embree needs to be included after dpcpp as it uses it for compiling with GPU support
include(cmake/embree.cmake)
include(cmake/openpgl.cmake)
include(cmake/fmt.cmake)
include(cmake/robinmap.cmake)
include(cmake/xml2.cmake)
# OpenColorIO and dependencies.
include(cmake/expat.cmake)

View File

@@ -156,6 +156,7 @@ download_source(OPENCLHEADERS)
download_source(ICDLOADER)
download_source(MP11)
download_source(SPIRV_HEADERS)
download_source(UNIFIED_RUNTIME)
download_source(IGC)
download_source(IGC_LLVM)
download_source(IGC_OPENCL_CLANG)

View File

@@ -5,6 +5,9 @@
# for now.
string(REPLACE "-DCMAKE_CXX_STANDARD=17" " " DPCPP_CMAKE_FLAGS "${DEFAULT_CMAKE_FLAGS}")
# DPCPP already generates debug libs, there isn't much point in compiling it in debug mode itself.
string(REPLACE "-DCMAKE_BUILD_TYPE=Debug" "-DCMAKE_BUILD_TYPE=Release" DPCPP_CMAKE_FLAGS "${DPCPP_CMAKE_FLAGS}")
if(WIN32)
set(LLVM_GENERATOR "Ninja")
else()
@@ -38,17 +41,18 @@ set(DPCPP_EXTRA_ARGS
-DLEVEL_ZERO_LIBRARY=${LIBDIR}/level-zero/lib/${LIBPREFIX}ze_loader${SHAREDLIBEXT}
-DLEVEL_ZERO_INCLUDE_DIR=${LIBDIR}/level-zero/include
-DLLVM_EXTERNAL_SPIRV_HEADERS_SOURCE_DIR=${BUILD_DIR}/spirvheaders/src/external_spirvheaders/
-DUNIFIED_RUNTIME_SOURCE_DIR=${BUILD_DIR}/unifiedruntime/src/external_unifiedruntime/
# Below here is copied from an invocation of buildbot/config.py
-DLLVM_ENABLE_ASSERTIONS=ON
-DLLVM_TARGETS_TO_BUILD=X86
-DLLVM_EXTERNAL_PROJECTS=sycl^^llvm-spirv^^opencl^^libdevice^^xpti^^xptifw
-DLLVM_EXTERNAL_PROJECTS=sycl^^llvm-spirv^^opencl^^libdevice^^xpti^^xptifw^^lld
-DLLVM_EXTERNAL_SYCL_SOURCE_DIR=${DPCPP_SOURCE_ROOT}/sycl
-DLLVM_EXTERNAL_LLVM_SPIRV_SOURCE_DIR=${DPCPP_SOURCE_ROOT}/llvm-spirv
-DLLVM_EXTERNAL_XPTI_SOURCE_DIR=${DPCPP_SOURCE_ROOT}/xpti
-DXPTI_SOURCE_DIR=${DPCPP_SOURCE_ROOT}/xpti
-DLLVM_EXTERNAL_XPTIFW_SOURCE_DIR=${DPCPP_SOURCE_ROOT}/xptifw
-DLLVM_EXTERNAL_LIBDEVICE_SOURCE_DIR=${DPCPP_SOURCE_ROOT}/libdevice
-DLLVM_ENABLE_PROJECTS=clang^^sycl^^llvm-spirv^^opencl^^libdevice^^xpti^^xptifw
-DLLVM_ENABLE_PROJECTS=clang^^sycl^^llvm-spirv^^opencl^^libdevice^^xpti^^xptifw^^lld
-DLIBCLC_TARGETS_TO_BUILD=
-DLIBCLC_GENERATE_REMANGLED_VARIANTS=OFF
-DSYCL_BUILD_PI_HIP_PLATFORM=AMD
@@ -104,13 +108,19 @@ add_dependencies(
external_mp11
external_level-zero
external_spirvheaders
external_unifiedruntime
)
if(BUILD_MODE STREQUAL Release AND WIN32)
ExternalProject_Add_Step(external_dpcpp after_install
COMMAND ${CMAKE_COMMAND} -E rm -f ${LIBDIR}/dpcpp/bin/clang-cl.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${LIBDIR}/dpcpp/bin/clang-cpp.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${LIBDIR}/dpcpp/bin/clang.exe
COMMAND ${CMAKE_COMMAND} -E copy_directory ${LIBDIR}/dpcpp ${HARVEST_TARGET}/dpcpp
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/clang-cl.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/clang-cpp.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/clang.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/ld.lld.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/ld64.lld.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/lld.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/lld-link.exe
COMMAND ${CMAKE_COMMAND} -E rm -f ${HARVEST_TARGET}/dpcpp/bin/wasm-ld.exe
)
endif()

View File

@@ -59,3 +59,13 @@ ExternalProject_Add(external_spirvheaders
BUILD_COMMAND echo .
INSTALL_COMMAND echo .
)
ExternalProject_Add(external_unifiedruntime
URL file://${PACKAGE_DIR}/${UNIFIED_RUNTIME_FILE}
URL_HASH ${UNIFIED_RUNTIME_HASH_TYPE}=${UNIFIED_RUNTIME_HASH}
DOWNLOAD_DIR ${DOWNLOAD_DIR}
PREFIX ${BUILD_DIR}/unifiedruntime
CONFIGURE_COMMAND echo .
BUILD_COMMAND echo .
INSTALL_COMMAND echo .
)

View File

@@ -3,6 +3,8 @@
# Note the utility apps may use png/tiff/gif system libraries, but the
# library itself does not depend on them, so should give no problems.
set(EMBREE_CMAKE_FLAGS ${DEFAULT_CMAKE_FLAGS})
set(EMBREE_EXTRA_ARGS
-DEMBREE_ISPC_SUPPORT=OFF
-DEMBREE_TUTORIALS=OFF
@@ -31,6 +33,43 @@ if(NOT BLENDER_PLATFORM_ARM)
)
endif()
if(NOT APPLE)
if(WIN32)
# Levels below -O2 don't work well for Embree+SYCL.
string(REGEX REPLACE "-O[A-Za-z0-9]" "" EMBREE_CLANG_CMAKE_CXX_FLAGS_DEBUG ${BLENDER_CLANG_CMAKE_C_FLAGS_DEBUG})
string(APPEND EMBREE_CLANG_CMAKE_CXX_FLAGS_DEBUG " -O2")
string(REGEX REPLACE "-O[A-Za-z0-9]" "" EMBREE_CLANG_CMAKE_C_FLAGS_DEBUG ${BLENDER_CLANG_CMAKE_C_FLAGS_DEBUG})
string(APPEND EMBREE_CLANG_CMAKE_C_FLAGS_DEBUG " -O2")
set(EMBREE_CMAKE_FLAGS
-DCMAKE_BUILD_TYPE=${BUILD_MODE}
-DCMAKE_CXX_FLAGS_RELEASE=${BLENDER_CLANG_CMAKE_CXX_FLAGS_RELEASE}
-DCMAKE_CXX_FLAGS_MINSIZEREL=${BLENDER_CLANG_CMAKE_CXX_FLAGS_MINSIZEREL}
-DCMAKE_CXX_FLAGS_RELWITHDEBINFO=${BLENDER_CLANG_CMAKE_CXX_FLAGS_RELWITHDEBINFO}
-DCMAKE_CXX_FLAGS_DEBUG=${EMBREE_CLANG_CMAKE_CXX_FLAGS_DEBUG}
-DCMAKE_C_FLAGS_RELEASE=${BLENDER_CLANG_CMAKE_C_FLAGS_RELEASE}
-DCMAKE_C_FLAGS_MINSIZEREL=${BLENDER_CLANG_CMAKE_C_FLAGS_MINSIZEREL}
-DCMAKE_C_FLAGS_RELWITHDEBINFO=${BLENDER_CLANG_CMAKE_C_FLAGS_RELWITHDEBINFO}
-DCMAKE_C_FLAGS_DEBUG=${EMBREE_CLANG_CMAKE_C_FLAGS_DEBUG}
-DCMAKE_CXX_STANDARD=17
)
set(EMBREE_EXTRA_ARGS
-DCMAKE_CXX_COMPILER=${LIBDIR}/dpcpp/bin/clang++.exe
-DCMAKE_C_COMPILER=${LIBDIR}/dpcpp/bin/clang.exe
-DCMAKE_SHARED_LINKER_FLAGS=-L"${LIBDIR}/dpcpp/lib"
-DEMBREE_SYCL_SUPPORT=ON
${EMBREE_EXTRA_ARGS}
)
else()
set(EMBREE_EXTRA_ARGS
-DCMAKE_CXX_COMPILER=${LIBDIR}/dpcpp/bin/clang++
-DCMAKE_C_COMPILER=${LIBDIR}/dpcpp/bin/clang
-DCMAKE_SHARED_LINKER_FLAGS=-L"${LIBDIR}/dpcpp/lib"
-DEMBREE_SYCL_SUPPORT=ON
${EMBREE_EXTRA_ARGS}
)
endif()
endif()
if(TBB_STATIC_LIBRARY)
set(EMBREE_EXTRA_ARGS
${EMBREE_EXTRA_ARGS}
@@ -42,16 +81,25 @@ ExternalProject_Add(external_embree
URL file://${PACKAGE_DIR}/${EMBREE_FILE}
DOWNLOAD_DIR ${DOWNLOAD_DIR}
URL_HASH ${EMBREE_HASH_TYPE}=${EMBREE_HASH}
CMAKE_GENERATOR ${PLATFORM_ALT_GENERATOR}
PREFIX ${BUILD_DIR}/embree
PATCH_COMMAND ${PATCH_CMD} -p 1 -d ${BUILD_DIR}/embree/src/external_embree < ${PATCH_DIR}/embree.diff
CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${LIBDIR}/embree ${DEFAULT_CMAKE_FLAGS} ${EMBREE_EXTRA_ARGS}
CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${LIBDIR}/embree ${EMBREE_CMAKE_FLAGS} ${EMBREE_EXTRA_ARGS}
INSTALL_DIR ${LIBDIR}/embree
)
add_dependencies(
external_embree
external_tbb
)
if(NOT APPLE)
add_dependencies(
external_embree
external_tbb
external_dpcpp
)
else()
add_dependencies(
external_embree
external_tbb
)
endif()
if(WIN32)
if(BUILD_MODE STREQUAL Release)
@@ -66,6 +114,7 @@ if(WIN32)
ExternalProject_Add_Step(external_embree after_install
COMMAND ${CMAKE_COMMAND} -E copy ${LIBDIR}/embree/bin/embree4_d.dll ${HARVEST_TARGET}/embree/bin/embree4_d.dll
COMMAND ${CMAKE_COMMAND} -E copy ${LIBDIR}/embree/lib/embree4_d.lib ${HARVEST_TARGET}/embree/lib/embree4_d.lib
COMMAND ${CMAKE_COMMAND} -E copy ${LIBDIR}/embree/lib/embree4_sycl_d.lib ${HARVEST_TARGET}/embree/lib/embree4_sycl_d.lib
DEPENDEES install
)
endif()

View File

@@ -74,6 +74,27 @@ if(WIN32)
set(BLENDER_CMAKE_CXX_FLAGS_RELEASE "/MD ${COMMON_MSVC_FLAGS} /D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS /O2 /Ob2 /D NDEBUG /D PLATFORM_WINDOWS /DPSAPI_VERSION=2 /DTINYFORMAT_ALLOW_WCHAR_STRINGS")
set(BLENDER_CMAKE_CXX_FLAGS_RELWITHDEBINFO "/MD ${COMMON_MSVC_FLAGS} /D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS /Zi /O2 /Ob1 /D NDEBUG /D PLATFORM_WINDOWS /DPSAPI_VERSION=2 /DTINYFORMAT_ALLOW_WCHAR_STRINGS")
# Set similar flags for CLANG compilation.
set(COMMON_CLANG_FLAGS "-D_DLL -D_MT") # Equivalent to MSVC /MD
if(WITH_OPTIMIZED_DEBUG)
set(BLENDER_CLANG_CMAKE_C_FLAGS_DEBUG "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrtd -O2 -D_DEBUG -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
else()
set(BLENDER_CLANG_CMAKE_C_FLAGS_DEBUG "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrtd -g -D_DEBUG -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
endif()
set(BLENDER_CLANG_CMAKE_C_FLAGS_MINSIZEREL "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrt -Os -DNDEBUG -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
set(BLENDER_CLANG_CMAKE_C_FLAGS_RELEASE "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrt -O2 -DNDEBUG -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
set(BLENDER_CLANG_CMAKE_C_FLAGS_RELWITHDEBINFO "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrt -g -O2 -DNDEBUG -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
if(WITH_OPTIMIZED_DEBUG)
set(BLENDER_CLANG_CMAKE_CXX_FLAGS_DEBUG "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrtd -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -O2 -D_DEBUG -DPLATFORM_WINDOWS -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS -DBOOST_DEBUG_PYTHON -DBOOST_ALL_NO_LIB")
else()
set(BLENDER_CLANG_CMAKE_CXX_FLAGS_DEBUG "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrtd -D_DEBUG -DPLATFORM_WINDOWS -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -g -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS -DBOOST_DEBUG_PYTHON -DBOOST_ALL_NO_LIB")
endif()
set(BLENDER_CLANG_CMAKE_CXX_FLAGS_MINSIZEREL "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrt -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -O2 -DNDEBUG -DPLATFORM_WINDOWS -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
set(BLENDER_CLANG_CMAKE_CXX_FLAGS_RELEASE "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrt -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -O2 -DNDEBUG -DPLATFORM_WINDOWS -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
set(BLENDER_CLANG_CMAKE_CXX_FLAGS_RELWITHDEBINFO "${COMMON_CLANG_FLAGS} -Xclang --dependent-lib=msvcrt -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -g -O2 -DNDEBUG -DPLATFORM_WINDOWS -DPSAPI_VERSION=2 -DTINYFORMAT_ALLOW_WCHAR_STRINGS")
set(PLATFORM_FLAGS)
set(PLATFORM_CXX_FLAGS)
set(PLATFORM_CMAKE_FLAGS)

View File

@@ -599,15 +599,15 @@ set(OPENPGL_HASH db63f5dac5cfa8c110ede241f0c413f00db0c4748697381c4fa23e0f9e82a75
set(OPENPGL_HASH_TYPE SHA256)
set(OPENPGL_FILE openpgl-${OPENPGL_VERSION}.tar.gz)
set(LEVEL_ZERO_VERSION v1.8.5)
set(LEVEL_ZERO_VERSION v1.8.8)
set(LEVEL_ZERO_URI https://github.com/oneapi-src/level-zero/archive/refs/tags/${LEVEL_ZERO_VERSION}.tar.gz)
set(LEVEL_ZERO_HASH b6e9663bbcc53c148d32376998298bec6f7c434ef2218c61fa708963e3a09394)
set(LEVEL_ZERO_HASH 3553ae8fa0d2d69c4210a8f3428bd6612bd8bb8a627faf52c3658a01851e66d2)
set(LEVEL_ZERO_HASH_TYPE SHA256)
set(LEVEL_ZERO_FILE level-zero-${LEVEL_ZERO_VERSION}.tar.gz)
set(DPCPP_VERSION 20221019)
set(DPCPP_URI https://github.com/intel/llvm/archive/refs/tags/sycl-nightly/${DPCPP_VERSION}.tar.gz)
set(DPCPP_HASH 2f533946e91ce3829431758ea17b0b834b960c1a796e9e4563c86e03eb9603a2)
set(DPCPP_VERSION 2022-12)
set(DPCPP_URI https://github.com/intel/llvm/archive/refs/tags/${DPCPP_VERSION}.tar.gz)
set(DPCPP_HASH 13151d5ae79f7c9c4a9b072a0c486ae7b3c4993e301bb1268c92214451025790)
set(DPCPP_HASH_TYPE SHA256)
set(DPCPP_FILE DPCPP-${DPCPP_VERSION}.tar.gz)
@@ -620,9 +620,9 @@ set(DPCPP_FILE DPCPP-${DPCPP_VERSION}.tar.gz)
# will take care of building them, unpack is being done in dpcpp_deps.cmake
# Source llvm/lib/SYCLLowerIR/CMakeLists.txt
set(VCINTRINSICS_VERSION abce9184b7a3a7fe1b02289b9285610d9dc45465)
set(VCINTRINSICS_VERSION 782fbf7301dc73acaa049a4324c976ad94f587f7)
set(VCINTRINSICS_URI https://github.com/intel/vc-intrinsics/archive/${VCINTRINSICS_VERSION}.tar.gz)
set(VCINTRINSICS_HASH 3e9fd471246b87633b26f7e15e17ab7733d357458c53d5c5881c03929d6c551f)
set(VCINTRINSICS_HASH f4c0ccad8c1f77760364c551c65e8e1cf194d058889fa46d3b1b2d19ec4dc33f)
set(VCINTRINSICS_HASH_TYPE SHA256)
set(VCINTRINSICS_FILE vc-intrinsics-${VCINTRINSICS_VERSION}.tar.gz)
@@ -657,6 +657,13 @@ set(SPIRV_HEADERS_HASH ec8ecb471a62672697846c436501638ab25447ae9d4a6761e0bfe8a9a
set(SPIRV_HEADERS_HASH_TYPE SHA256)
set(SPIRV_HEADERS_FILE SPIR-V-Headers-${SPIRV_HEADERS_VERSION}.tar.gz)
# Source llvm/sycl/plugins/unified_runtime/CMakeLists.txt
set(UNIFIED_RUNTIME_VERSION fd711c920acc4434cb52ff18b078c082d9d7f44d)
set(UNIFIED_RUNTIME_URI https://github.com/oneapi-src/unified-runtime/archive/${UNIFIED_RUNTIME_VERSION}.tar.gz)
set(UNIFIED_RUNTIME_HASH 535ca2ee78f68c5e7e62b10f1bbabd909179488885566e6d9b1fc50e8a1be65f)
set(UNIFIED_RUNTIME_HASH_TYPE SHA256)
set(UNIFIED_RUNTIME_FILE unified-runtime-${UNIFIED_RUNTIME_VERSION}.tar.gz)
######################
### DPCPP DEPS END ###
######################
@@ -730,9 +737,9 @@ set(GMMLIB_HASH c1f33e1519edfc527127baeb0436b783430dfd256c643130169a3a71dc86aff9
set(GMMLIB_HASH_TYPE SHA256)
set(GMMLIB_FILE ${GMMLIB_VERSION}.tar.gz)
set(OCLOC_VERSION 22.49.25018.21)
set(OCLOC_VERSION 23.05.25593.18)
set(OCLOC_URI https://github.com/intel/compute-runtime/archive/refs/tags/${OCLOC_VERSION}.tar.gz)
set(OCLOC_HASH 92362dae08b503a34e5d3820ed284198c452bcd5e7504d90eb69887b20492c06)
set(OCLOC_HASH 122415028e631922ae999c996954dfd98ce9a32decd564d5484c31476ec9306e)
set(OCLOC_HASH_TYPE SHA256)
set(OCLOC_FILE ocloc-${OCLOC_VERSION}.tar.gz)

View File

@@ -14,6 +14,7 @@ graph[autosize = false, size = "25.7,8.3!", resolution = 300];
external_dpcpp -- external_mp11;
external_dpcpp -- external_level_zero;
external_dpcpp -- external_spirvheaders;
external_dpcpp -- external_unifiedruntime;
external_embree -- external_tbb;
external_ffmpeg -- external_zlib;
external_ffmpeg -- external_openjpeg;

View File

@@ -34,3 +34,156 @@ diff -Naur llvm-sycl-nightly-20220208.orig/libdevice/cmake/modules/SYCLLibdevice
libsycldevice-obj
libsycldevice-spv)
diff --git a/sycl/source/detail/program_manager/program_manager.cpp b/sycl/source/detail/program_manager/program_manager.cpp
index 17eeaafae194..09e6d2217aaa 100644
--- a/sycl/source/detail/program_manager/program_manager.cpp
+++ b/sycl/source/detail/program_manager/program_manager.cpp
@@ -1647,46 +1647,120 @@ ProgramManager::getSYCLDeviceImagesWithCompatibleState(
}
assert(BinImages.size() > 0 && "Expected to find at least one device image");
+ // Ignore images with incompatible state. Image is considered compatible
+ // with a target state if an image is already in the target state or can
+ // be brought to target state by compiling/linking/building.
+ //
+ // Example: an image in "executable" state is not compatible with
+ // "input" target state - there is no operation to convert the image it
+ // to "input" state. An image in "input" state is compatible with
+ // "executable" target state because it can be built to get into
+ // "executable" state.
+ for (auto It = BinImages.begin(); It != BinImages.end();) {
+ if (getBinImageState(*It) > TargetState)
+ It = BinImages.erase(It);
+ else
+ ++It;
+ }
+
std::vector<device_image_plain> SYCLDeviceImages;
- for (RTDeviceBinaryImage *BinImage : BinImages) {
- const bundle_state ImgState = getBinImageState(BinImage);
-
- // Ignore images with incompatible state. Image is considered compatible
- // with a target state if an image is already in the target state or can
- // be brought to target state by compiling/linking/building.
- //
- // Example: an image in "executable" state is not compatible with
- // "input" target state - there is no operation to convert the image it
- // to "input" state. An image in "input" state is compatible with
- // "executable" target state because it can be built to get into
- // "executable" state.
- if (ImgState > TargetState)
- continue;
- for (const sycl::device &Dev : Devs) {
+ // If a non-input state is requested, we can filter out some compatible
+ // images and return only those with the highest compatible state for each
+ // device-kernel pair. This map tracks how many kernel-device pairs need each
+ // image, so that any unneeded ones are skipped.
+ // TODO this has no effect if the requested state is input, consider having
+ // a separate branch for that case to avoid unnecessary tracking work.
+ struct DeviceBinaryImageInfo {
+ std::shared_ptr<std::vector<sycl::kernel_id>> KernelIDs;
+ bundle_state State = bundle_state::input;
+ int RequirementCounter = 0;
+ };
+ std::unordered_map<RTDeviceBinaryImage *, DeviceBinaryImageInfo> ImageInfoMap;
+
+ for (const sycl::device &Dev : Devs) {
+ // Track the highest image state for each requested kernel.
+ using StateImagesPairT =
+ std::pair<bundle_state, std::vector<RTDeviceBinaryImage *>>;
+ using KernelImageMapT =
+ std::map<kernel_id, StateImagesPairT, LessByNameComp>;
+ KernelImageMapT KernelImageMap;
+ if (!KernelIDs.empty())
+ for (const kernel_id &KernelID : KernelIDs)
+ KernelImageMap.insert({KernelID, {}});
+
+ for (RTDeviceBinaryImage *BinImage : BinImages) {
if (!compatibleWithDevice(BinImage, Dev) ||
!doesDevSupportImgAspects(Dev, *BinImage))
continue;
- std::shared_ptr<std::vector<sycl::kernel_id>> KernelIDs;
- // Collect kernel names for the image
- {
- std::lock_guard<std::mutex> KernelIDsGuard(m_KernelIDsMutex);
- KernelIDs = m_BinImg2KernelIDs[BinImage];
- // If the image does not contain any non-service kernels we can skip it.
- if (!KernelIDs || KernelIDs->empty())
- continue;
+ auto InsertRes = ImageInfoMap.insert({BinImage, {}});
+ DeviceBinaryImageInfo &ImgInfo = InsertRes.first->second;
+ if (InsertRes.second) {
+ ImgInfo.State = getBinImageState(BinImage);
+ // Collect kernel names for the image
+ {
+ std::lock_guard<std::mutex> KernelIDsGuard(m_KernelIDsMutex);
+ ImgInfo.KernelIDs = m_BinImg2KernelIDs[BinImage];
+ }
}
+ const bundle_state ImgState = ImgInfo.State;
+ const std::shared_ptr<std::vector<sycl::kernel_id>> &ImageKernelIDs =
+ ImgInfo.KernelIDs;
+ int &ImgRequirementCounter = ImgInfo.RequirementCounter;
- DeviceImageImplPtr Impl = std::make_shared<detail::device_image_impl>(
- BinImage, Ctx, Devs, ImgState, KernelIDs, /*PIProgram=*/nullptr);
+ // If the image does not contain any non-service kernels we can skip it.
+ if (!ImageKernelIDs || ImageKernelIDs->empty())
+ continue;
- SYCLDeviceImages.push_back(
- createSyclObjFromImpl<device_image_plain>(Impl));
- break;
+ // Update tracked information.
+ for (kernel_id &KernelID : *ImageKernelIDs) {
+ StateImagesPairT *StateImagesPair;
+ // If only specific kernels are requested, ignore the rest.
+ if (!KernelIDs.empty()) {
+ auto It = KernelImageMap.find(KernelID);
+ if (It == KernelImageMap.end())
+ continue;
+ StateImagesPair = &It->second;
+ } else
+ StateImagesPair = &KernelImageMap[KernelID];
+
+ auto &[KernelImagesState, KernelImages] = *StateImagesPair;
+
+ if (KernelImages.empty()) {
+ KernelImagesState = ImgState;
+ KernelImages.push_back(BinImage);
+ ++ImgRequirementCounter;
+ } else if (KernelImagesState < ImgState) {
+ for (RTDeviceBinaryImage *Img : KernelImages) {
+ auto It = ImageInfoMap.find(Img);
+ assert(It != ImageInfoMap.end());
+ assert(It->second.RequirementCounter > 0);
+ --(It->second.RequirementCounter);
+ }
+ KernelImages.clear();
+ KernelImages.push_back(BinImage);
+ KernelImagesState = ImgState;
+ ++ImgRequirementCounter;
+ } else if (KernelImagesState == ImgState) {
+ KernelImages.push_back(BinImage);
+ ++ImgRequirementCounter;
+ }
+ }
}
}
+ for (const auto &ImgInfoPair : ImageInfoMap) {
+ if (ImgInfoPair.second.RequirementCounter == 0)
+ continue;
+
+ DeviceImageImplPtr Impl = std::make_shared<detail::device_image_impl>(
+ ImgInfoPair.first, Ctx, Devs, ImgInfoPair.second.State,
+ ImgInfoPair.second.KernelIDs, /*PIProgram=*/nullptr);
+
+ SYCLDeviceImages.push_back(createSyclObjFromImpl<device_image_plain>(Impl));
+ }
+
return SYCLDeviceImages;
}
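
Reviewer note: the selection rule introduced by the patch above (drop images whose state is beyond the target, then keep, per device and kernel, only the images in the highest remaining state) can be sketched in Python as follows. This is a simplified illustration, not the DPC++ implementation: `select_images`, the `images` dictionary and the `is_compatible` callback are hypothetical names, and the optional filter on explicitly requested kernel IDs is left out.

```python
from enum import IntEnum


class BundleState(IntEnum):
    # Ordering mirrors the patch: input < object < executable.
    INPUT = 0
    OBJECT = 1
    EXECUTABLE = 2


def select_images(images, devices, target_state, is_compatible):
    """Return the names of images still needed by at least one device-kernel pair.

    images: dict mapping image name -> (BundleState, list of kernel ids).
    is_compatible: callable (image name, device) -> bool (hypothetical stand-in
    for the device and aspect checks in the patch).
    """
    # Images already past the target state cannot be brought back to it.
    candidates = {n: v for n, v in images.items() if v[0] <= target_state}
    requirement = {name: 0 for name in candidates}

    for dev in devices:
        best = {}  # kernel id -> (highest state seen, [image names in that state])
        for name, (state, kernels) in candidates.items():
            if not kernels or not is_compatible(name, dev):
                continue
            for kernel in kernels:
                if kernel not in best or best[kernel][0] < state:
                    # A higher state supersedes every image tracked so far.
                    for old in best.get(kernel, (None, []))[1]:
                        requirement[old] -= 1
                    best[kernel] = (state, [name])
                    requirement[name] += 1
                elif best[kernel][0] == state:
                    best[kernel][1].append(name)
                    requirement[name] += 1

    return sorted(n for n, count in requirement.items() if count > 0)
```

With one image in input state providing two kernels and two executable images providing one kernel each, only the two executable images remain when the target state is executable.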

View File

@@ -149,3 +149,19 @@ index 074f910a2..30f490818 100644
return is_hit_first | is_hit_second;
}
};
diff -ruN a/kernels/sycl/rthwif_embree_builder.cpp b/kernels/sycl/rthwif_embree_builder.cpp
--- a/kernels/sycl/rthwif_embree_builder.cpp 2023-03-28 17:23:06.429190200 +0200
+++ b/kernels/sycl/rthwif_embree_builder.cpp 2023-03-28 17:35:01.291938600 +0200
@@ -540,7 +540,12 @@
assert(offset <= geomDescrData.size());
}
+ /* Force running BVH building sequentially from the calling thread if using TBB < 2021, as it otherwise leads to runtime issues. */
+#if TBB_VERSION_MAJOR<2021
+ RTHWIF_PARALLEL_OPERATION parallelOperation = nullptr;
+#else
RTHWIF_PARALLEL_OPERATION parallelOperation = rthwifNewParallelOperation();
+#endif
/* estimate static accel size */
BBox1f time_range(0,1);

View File

@@ -37,18 +37,24 @@ elseif(HIP_HIPCC_EXECUTABLE)
set(HIP_VERSION_MINOR 0)
set(HIP_VERSION_PATCH 0)
if(WIN32)
set(_hipcc_executable ${HIP_HIPCC_EXECUTABLE}.bat)
else()
set(_hipcc_executable ${HIP_HIPCC_EXECUTABLE})
endif()
# Get version from the output.
execute_process(COMMAND ${HIP_HIPCC_EXECUTABLE} --version
OUTPUT_VARIABLE HIP_VERSION_RAW
execute_process(COMMAND ${_hipcc_executable} --version
OUTPUT_VARIABLE _hip_version_raw
ERROR_QUIET
OUTPUT_STRIP_TRAILING_WHITESPACE)
# Parse parts.
if(HIP_VERSION_RAW MATCHES "HIP version: .*")
if(_hip_version_raw MATCHES "HIP version: .*")
# Strip the HIP prefix and get list of individual version components.
string(REGEX REPLACE
".*HIP version: ([.0-9]+).*" "\\1"
HIP_SEMANTIC_VERSION "${HIP_VERSION_RAW}")
HIP_SEMANTIC_VERSION "${_hip_version_raw}")
string(REPLACE "." ";" HIP_VERSION_PARTS "${HIP_SEMANTIC_VERSION}")
list(LENGTH HIP_VERSION_PARTS NUM_HIP_VERSION_PARTS)
@@ -71,7 +77,13 @@ elseif(HIP_HIPCC_EXECUTABLE)
# Construct full semantic version.
set(HIP_VERSION "${HIP_VERSION_MAJOR}.${HIP_VERSION_MINOR}.${HIP_VERSION_PATCH}")
unset(HIP_VERSION_RAW)
unset(_hip_version_raw)
unset(_hipcc_executable)
else()
set(HIP_FOUND FALSE)
endif()
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(HIP
REQUIRED_VARS HIP_HIPCC_EXECUTABLE
VERSION_VAR HIP_VERSION)
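
Reviewer note: the parsing steps in this hunk (match the `HIP version:` line, capture the dotted number, split it on `.`) can be sketched in Python as below. The function name `parse_hip_version` and the zero-padding of missing components are assumptions for illustration; the CMake lines that assign the individual components are not shown in this hunk.

```python
import re


def parse_hip_version(raw):
    """Return (major, minor, patch) parsed from `hipcc --version` output, or None."""
    # Same pattern as the CMake regex: capture only digits and dots after the prefix.
    match = re.search(r"HIP version: ([.0-9]+)", raw)
    if match is None:
        return None
    parts = [p for p in match.group(1).split(".") if p]
    numbers = [int(p) for p in parts[:3]]
    # Assumption: components that are absent default to 0.
    numbers += [0] * (3 - len(numbers))
    return tuple(numbers)
```

A version tuple compares component by component, so `parse_hip_version(output) >= (5, 5, 0)` expresses the same minimum as `find_package(HIP 5.5.0)`.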

View File

@@ -108,7 +108,11 @@ FIND_PACKAGE_HANDLE_STANDARD_ARGS(SYCL
IF(SYCL_FOUND)
SET(SYCL_INCLUDE_DIR ${SYCL_INCLUDE_DIR} ${SYCL_INCLUDE_DIR}/sycl)
SET(SYCL_LIBRARIES ${SYCL_LIBRARY})
IF(WIN32 AND SYCL_LIBRARY_DEBUG)
SET(SYCL_LIBRARIES optimized ${SYCL_LIBRARY} debug ${SYCL_LIBRARY_DEBUG})
ELSE()
SET(SYCL_LIBRARIES ${SYCL_LIBRARY})
ENDIF()
ELSE()
SET(SYCL_FOUND FALSE)
ENDIF()

View File

@@ -82,7 +82,7 @@ if(NOT APPLE)
set(WITH_CYCLES_DEVICE_OPTIX ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_CUDA_BINARIES ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_CUBIN_COMPILER OFF CACHE BOOL "" FORCE)
set(WITH_CYCLES_HIP_BINARIES OFF CACHE BOOL "" FORCE)
set(WITH_CYCLES_HIP_BINARIES ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_DEVICE_ONEAPI ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_ONEAPI_BINARIES ON CACHE BOOL "" FORCE)
endif()

View File

@@ -1384,4 +1384,3 @@ macro(windows_process_platform_bundled_libraries library_deps)
endforeach()
endif()
endmacro()

View File

@@ -174,7 +174,7 @@ if(SYSTEMSTUBS_LIBRARY)
list(APPEND PLATFORM_LINKLIBS SystemStubs)
endif()
string(APPEND PLATFORM_CFLAGS " -pipe -funsigned-char -fno-strict-aliasing")
string(APPEND PLATFORM_CFLAGS " -pipe -funsigned-char -fno-strict-aliasing -ffp-contract=off")
set(PLATFORM_LINKFLAGS
"-fexceptions -framework CoreServices -framework Foundation -framework IOKit -framework AppKit -framework Cocoa -framework Carbon -framework AudioUnit -framework AudioToolbox -framework CoreAudio -framework Metal -framework QuartzCore"
)

View File

@@ -803,8 +803,7 @@ if(CMAKE_COMPILER_IS_GNUCC)
# Automatically turned on when building with "-march=native". This is
# explicitly turned off here as it will make floating point math give a bit
# different results. This will lead to automated test failures. So disable
# this until we support it. Seems to default to off in clang and the intel
# compiler.
# this until we support it.
set(PLATFORM_CFLAGS "-pipe -fPIC -funsigned-char -fno-strict-aliasing -ffp-contract=off")
# `maybe-uninitialized` is unreliable in release builds, but fine in debug builds.
@@ -892,7 +891,7 @@ if(CMAKE_COMPILER_IS_GNUCC)
# Clang is the same as GCC for now.
elseif(CMAKE_C_COMPILER_ID MATCHES "Clang")
set(PLATFORM_CFLAGS "-pipe -fPIC -funsigned-char -fno-strict-aliasing")
set(PLATFORM_CFLAGS "-pipe -fPIC -funsigned-char -fno-strict-aliasing -ffp-contract=off")
if(WITH_LINKER_MOLD AND _IS_LINKER_DEFAULT)
find_program(MOLD_BIN "mold")

View File

@@ -9,7 +9,7 @@ buildbot:
cuda11:
version: '11.4.1'
hip:
version: '5.3.22480'
version: '5.5.30571'
optix:
version: '7.3.0'
ocloc:

View File

@@ -489,7 +489,8 @@ if __name__ == "__main__":
branch = f"blender-v{major}.{minor}-release"
release_version: Optional[str] = f"{major}.{minor}"
else:
branch = 'main'
# TODO !!!!! remove this before merge !!!!!
branch = 'asset-shelf'
release_version = None
if not args.no_libraries:

View File

@@ -1,7 +1,7 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: GPL-2.0-or-later
'''
"""
This script generates the blender.1 man page, embedding the help text
from the Blender executable itself. Invoke it as follows:
@@ -9,7 +9,7 @@ from the Blender executable itself. Invoke it as follows:
where <path-to-blender> is the path to the Blender executable,
and <output-filename> is where to write the generated man page.
'''
"""
import argparse
import os
@@ -87,29 +87,29 @@ def man_page_from_blender_help(fh: TextIO, blender_bin: str, verbose: bool) -> N
(blender_info["date"], blender_info["version"].replace(".", "\\&."))
)
fh.write(r'''
fh.write(r"""
.SH NAME
blender \- a full-featured 3D application''')
blender \- a full-featured 3D application""")
fh.write(r'''
fh.write(r"""
.SH SYNOPSIS
.B blender [args ...] [file] [args ...]''')
.B blender [args ...] [file] [args ...]""")
fh.write(r'''
fh.write(r"""
.br
.SH DESCRIPTION
.PP
.B blender
is a full-featured 3D application. It supports the entirety of the 3D pipeline - '''
'''modeling, rigging, animation, simulation, rendering, compositing, motion tracking, and video editing.
is a full-featured 3D application. It supports the entirety of the 3D pipeline - """
"""modeling, rigging, animation, simulation, rendering, compositing, motion tracking, and video editing.
Use Blender to create 3D images and animations, films and commercials, content for games, '''
r'''architectural and industrial visualizations, and scientific visualizations.
Use Blender to create 3D images and animations, films and commercials, content for games, """
r"""architectural and industrial visualizations, and scientific visualizations.
https://www.blender.org''')
https://www.blender.org""")
fh.write(r'''
.SH OPTIONS''')
fh.write(r"""
.SH OPTIONS""")
fh.write("\n\n")
@@ -152,7 +152,7 @@ https://www.blender.org''')
# Footer Content.
fh.write(r'''
fh.write(r"""
.br
.SH SEE ALSO
.B luxrender(1)
@@ -162,7 +162,7 @@ https://www.blender.org''')
This manpage was written for a Debian GNU/Linux system by Daniel Mester
<mester@uni-bremen.de> and updated by Cyril Brulebois
<cyril.brulebois@enst-bretagne.fr> and Dan Eicher <dan@trollwerks.org>.
''')
""")
def create_argparse() -> argparse.ArgumentParser:

View File

@@ -572,7 +572,7 @@ template<class T> inline bool cmpMinMax(T &minv, T &maxv, const T &val)
}
template<> inline bool cmpMinMax<Vec3>(Vec3 &minv, Vec3 &maxv, const Vec3 &val)
{
return (cmpMinMax(minv.x, maxv.x, val.x) | cmpMinMax(minv.y, maxv.y, val.y) |
return (cmpMinMax(minv.x, maxv.x, val.x) || cmpMinMax(minv.y, maxv.y, val.y) ||
cmpMinMax(minv.z, maxv.z, val.z));
}
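
Reviewer note: the operands in this hunk are function calls, so the two operators are not interchangeable if those calls have side effects. The Python sketch below assumes that the generic `cmpMinMax` widens its min/max arguments and returns whether it did so; its body is not shown in this hunk. The names `cmp_min_max`, `update_bitwise` and `update_short_circuit` are hypothetical.

```python
def cmp_min_max(bounds, axis, val):
    """Widen bounds[axis] = [min, max] to include val; return True if it changed."""
    lo, hi = bounds[axis]
    changed = val < lo or val > hi
    bounds[axis] = [min(lo, val), max(hi, val)]
    return changed


def update_bitwise(bounds, p):
    # `|` evaluates every operand, so all three axes are always updated.
    return (cmp_min_max(bounds, 0, p[0]) |
            cmp_min_max(bounds, 1, p[1]) |
            cmp_min_max(bounds, 2, p[2]))


def update_short_circuit(bounds, p):
    # `or` stops at the first True operand, so later axes may be skipped.
    return (cmp_min_max(bounds, 0, p[0]) or
            cmp_min_max(bounds, 1, p[1]) or
            cmp_min_max(bounds, 2, p[2]))
```

Under that assumption, a point that extends the bounds on the first axis leaves the other two axes untouched in the short-circuit variant, which is worth checking against the intent of this change.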

View File

@@ -281,6 +281,9 @@ endif()
if(WITH_CYCLES_EMBREE)
add_definitions(-DWITH_EMBREE)
if(WITH_CYCLES_DEVICE_ONEAPI AND EMBREE_SYCL_SUPPORT)
add_definitions(-DWITH_EMBREE_GPU)
endif()
add_definitions(-DEMBREE_MAJOR_VERSION=${EMBREE_MAJOR_VERSION})
include_directories(
SYSTEM

View File

@@ -106,7 +106,7 @@ class CyclesRender(bpy.types.RenderEngine):
from . import osl
osl.update_script_node(node, self.report)
else:
self.report({'ERROR'}, "OSL support disabled in this build.")
self.report({'ERROR'}, "OSL support disabled in this build")
def update_render_passes(self, scene, srl):
engine.register_passes(self, scene, srl)

View File

@@ -1544,6 +1544,13 @@ class CyclesPreferences(bpy.types.AddonPreferences):
default=False,
)
use_oneapirt: BoolProperty(
name="Embree on GPU (Experimental)",
description="Embree on GPU enables the use of hardware ray tracing on Intel GPUs, providing better performance. "
"However, this support is experimental and some scenes may render incorrectly",
default=False,
)
kernel_optimization_level: EnumProperty(
name="Kernel Optimization",
description="Kernels can be optimized based on scene content. Optimized kernels are requested at the start of a render. "
@@ -1676,16 +1683,16 @@ class CyclesPreferences(bpy.types.AddonPreferences):
col.label(text=iface_("and NVIDIA driver version %s or newer") % driver_version,
icon='BLANK1', translate=False)
elif device_type == 'HIP':
if True:
col.label(text="HIP temporarily disabled due to compiler bugs", icon='BLANK1')
else:
import sys
if sys.platform[:3] == "win":
driver_version = "21.Q4"
col.label(text="Requires AMD GPU with Vega or RDNA architecture", icon='BLANK1')
col.label(text=iface_("and AMD Radeon Pro %s driver or newer") % driver_version,
icon='BLANK1', translate=False)
elif sys.platform.startswith("linux"):
import sys
if sys.platform[:3] == "win":
driver_version = "21.Q4"
col.label(text="Requires AMD GPU with Vega or RDNA architecture", icon='BLANK1')
col.label(text=iface_("and AMD Radeon Pro %s driver or newer") % driver_version,
icon='BLANK1', translate=False)
elif sys.platform.startswith("linux"):
if True:
col.label(text="HIP temporarily disabled due to compiler bugs", icon='BLANK1')
else:
driver_version = "22.10"
col.label(text="Requires AMD GPU with Vega or RDNA architecture", icon='BLANK1')
col.label(text=iface_("and AMD driver version %s or newer") % driver_version, icon='BLANK1',
@@ -1763,6 +1770,11 @@ class CyclesPreferences(bpy.types.AddonPreferences):
col.prop(self, "kernel_optimization_level")
col.prop(self, "use_metalrt")
if compute_device_type == 'ONEAPI' and _cycles.with_embree_gpu:
row = layout.row()
row.use_property_split = True
row.prop(self, "use_oneapirt")
def draw(self, context):
self.draw_impl(self.layout, context)

View File

@@ -112,9 +112,26 @@ DeviceInfo blender_device_info(BL::Preferences &b_preferences,
device.has_peer_memory = false;
}
if (get_boolean(cpreferences, "use_metalrt")) {
device.use_metalrt = true;
bool accumulated_use_hardware_raytracing = false;
foreach (
DeviceInfo &info,
(device.multi_devices.size() != 0 ? device.multi_devices : vector<DeviceInfo>({device}))) {
if (info.type == DEVICE_METAL && !get_boolean(cpreferences, "use_metalrt")) {
info.use_hardware_raytracing = false;
}
if (info.type == DEVICE_ONEAPI && !get_boolean(cpreferences, "use_oneapirt")) {
info.use_hardware_raytracing = false;
}
/* The flag is accumulated across sub-devices because multi-devices are currently only
* supported for a single backend + CPU in Blender, and both oneAPI and Metal have a
* global boolean backend setting (see above) for enabling/disabling HW RT,
* so all sub-devices in the multi-device should enable (or disable) HW RT
* simultaneously (CPU devices are expected to ignore the `use_hardware_raytracing` setting). */
accumulated_use_hardware_raytracing |= info.use_hardware_raytracing;
}
device.use_hardware_raytracing = accumulated_use_hardware_raytracing;
if (preview) {
/* Disable specialization for preview renders. */

View File

@@ -280,7 +280,7 @@ static void fill_generic_attribute(BL::Mesh &b_mesh,
assert(0);
}
else {
const MEdge *edges = static_cast<const MEdge *>(b_mesh.edges[0].ptr.data);
const int2 *edges = static_cast<const int2 *>(b_mesh.edges[0].ptr.data);
const size_t verts_num = b_mesh.vertices.length();
vector<int> count(verts_num, 0);
@@ -288,11 +288,11 @@ static void fill_generic_attribute(BL::Mesh &b_mesh,
for (int i = 0; i < edges_num; i++) {
TypeInCycles value = get_value_at_index(i);
const MEdge &b_edge = edges[i];
data[b_edge.v1] += value;
data[b_edge.v2] += value;
count[b_edge.v1]++;
count[b_edge.v2]++;
const int2 &b_edge = edges[i];
data[b_edge[0]] += value;
data[b_edge[1]] += value;
count[b_edge[0]]++;
count[b_edge[1]]++;
}
for (size_t i = 0; i < verts_num; i++) {
@@ -796,13 +796,13 @@ static void attr_create_pointiness(Scene *scene, Mesh *mesh, BL::Mesh &b_mesh, b
EdgeMap visited_edges;
memset(&counter[0], 0, sizeof(int) * counter.size());
const MEdge *edges = static_cast<MEdge *>(b_mesh.edges[0].ptr.data);
const int2 *edges = static_cast<int2 *>(b_mesh.edges[0].ptr.data);
const int edges_num = b_mesh.edges.length();
for (int i = 0; i < edges_num; i++) {
const MEdge &b_edge = edges[i];
const int v0 = vert_orig_index[b_edge.v1];
const int v1 = vert_orig_index[b_edge.v2];
const int2 &b_edge = edges[i];
const int v0 = vert_orig_index[b_edge[0]];
const int v1 = vert_orig_index[b_edge[1]];
if (visited_edges.exists(v0, v1)) {
continue;
}
@@ -838,9 +838,9 @@ static void attr_create_pointiness(Scene *scene, Mesh *mesh, BL::Mesh &b_mesh, b
memset(&counter[0], 0, sizeof(int) * counter.size());
visited_edges.clear();
for (int i = 0; i < edges_num; i++) {
const MEdge &b_edge = edges[i];
const int v0 = vert_orig_index[b_edge.v1];
const int v1 = vert_orig_index[b_edge.v2];
const int2 &b_edge = edges[i];
const int v0 = vert_orig_index[b_edge[0]];
const int v1 = vert_orig_index[b_edge[1]];
if (visited_edges.exists(v0, v1)) {
continue;
}
@@ -907,12 +907,12 @@ static void attr_create_random_per_island(Scene *scene,
DisjointSet vertices_sets(number_of_vertices);
const MEdge *edges = static_cast<MEdge *>(b_mesh.edges[0].ptr.data);
const int2 *edges = static_cast<int2 *>(b_mesh.edges[0].ptr.data);
const int edges_num = b_mesh.edges.length();
const int *corner_verts = find_corner_vert_attribute(b_mesh);
for (int i = 0; i < edges_num; i++) {
vertices_sets.join(edges[i].v1, edges[i].v2);
vertices_sets.join(edges[i][0], edges[i][1]);
}
AttributeSet &attributes = (subdivision) ? mesh->subd_attributes : mesh->attributes;
@@ -1234,12 +1234,12 @@ static void create_subd_mesh(Scene *scene,
mesh->reserve_subd_creases(num_creases);
const MEdge *edges = static_cast<MEdge *>(b_mesh.edges[0].ptr.data);
const int2 *edges = static_cast<int2 *>(b_mesh.edges[0].ptr.data);
for (int i = 0; i < edges_num; i++) {
const float crease = creases[i];
if (crease != 0.0f) {
const MEdge &b_edge = edges[i];
mesh->add_edge_crease(b_edge.v1, b_edge.v2, crease);
const int2 &b_edge = edges[i];
mesh->add_edge_crease(b_edge[0], b_edge[1], crease);
}
}
}

View File

@@ -1034,6 +1034,14 @@ void *CCL_python_module_init()
Py_INCREF(Py_False);
#endif /* WITH_EMBREE */
#ifdef WITH_EMBREE_GPU
PyModule_AddObject(mod, "with_embree_gpu", Py_True);
Py_INCREF(Py_True);
#else /* WITH_EMBREE_GPU */
PyModule_AddObject(mod, "with_embree_gpu", Py_False);
Py_INCREF(Py_False);
#endif /* WITH_EMBREE_GPU */
if (ccl::openimagedenoise_supported()) {
PyModule_AddObject(mod, "with_openimagedenoise", Py_True);
Py_INCREF(Py_True);

View File

@@ -1061,7 +1061,7 @@ void BlenderSession::ensure_display_driver_if_needed()
unique_ptr<BlenderDisplayDriver> display_driver = make_unique<BlenderDisplayDriver>(
b_engine, b_scene, background);
display_driver_ = display_driver.get();
session->set_display_driver(move(display_driver));
session->set_display_driver(std::move(display_driver));
}
CCL_NAMESPACE_END

View File

@@ -981,22 +981,8 @@ static ShaderNode *add_node(Scene *scene,
sky->set_sun_disc(b_sky_node.sun_disc());
sky->set_sun_size(b_sky_node.sun_size());
sky->set_sun_intensity(b_sky_node.sun_intensity());
/* Patch sun position to be able to animate daylight cycle while keeping the shading code
* simple. */
float sun_rotation = b_sky_node.sun_rotation();
/* Wrap into [-2PI..2PI] range. */
float sun_elevation = fmodf(b_sky_node.sun_elevation(), M_2PI_F);
/* Wrap into [-PI..PI] range. */
if (fabsf(sun_elevation) >= M_PI_F) {
sun_elevation -= copysignf(2.0f, sun_elevation) * M_PI_F;
}
/* Wrap into [-PI/2..PI/2] range while keeping the same absolute position. */
if (sun_elevation >= M_PI_2_F || sun_elevation <= -M_PI_2_F) {
sun_elevation = copysignf(M_PI_F, sun_elevation) - sun_elevation;
sun_rotation += M_PI_F;
}
sky->set_sun_elevation(sun_elevation);
sky->set_sun_rotation(sun_rotation);
sky->set_sun_elevation(b_sky_node.sun_elevation());
sky->set_sun_rotation(b_sky_node.sun_rotation());
sky->set_altitude(b_sky_node.altitude());
sky->set_air_density(b_sky_node.air_density());
sky->set_dust_density(b_sky_node.dust_density());
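
Reviewer note: for reference, the wrapping logic removed by this hunk transliterates to Python roughly as follows. This is a sketch of the deleted lines only: `wrap_sun_position` is a hypothetical name, Python doubles stand in for the C++ floats, and the diff shown here does not say where (or whether) the wrapping is done after this change.

```python
import math


def wrap_sun_position(elevation, rotation):
    """Fold the sun elevation into [-PI/2..PI/2], adjusting rotation to match."""
    # Wrap into [-2PI..2PI] range.
    elevation = math.fmod(elevation, 2.0 * math.pi)
    # Wrap into [-PI..PI] range.
    if abs(elevation) >= math.pi:
        elevation -= math.copysign(2.0, elevation) * math.pi
    # Wrap into [-PI/2..PI/2] range while keeping the same absolute position.
    if elevation >= math.pi / 2 or elevation <= -math.pi / 2:
        elevation = math.copysign(math.pi, elevation) - elevation
        rotation += math.pi
    return elevation, rotation
```

An elevation of 100 degrees, for example, folds to 80 degrees with the rotation turned by half a circle, which points at the same place in the sky.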

View File

@@ -527,7 +527,7 @@ BVHNode *BVHBuild::run()
if (progress.get_cancel()) {
rootnode->deleteSubtree();
rootnode = NULL;
VLOG_WORK << "BVH build cancelled.";
VLOG_WORK << "BVH build canceled.";
}
else {
/*rotate(rootnode, 4, 5);*/

View File

@@ -606,7 +606,7 @@ void BVH2::pack_instances(size_t nodes_size, size_t leaf_nodes_size)
int4 *bvh_nodes = &bvh->pack.nodes[0];
size_t bvh_nodes_size = bvh->pack.nodes.size();
for (size_t i = 0, j = 0; i < bvh_nodes_size; j++) {
for (size_t i = 0; i < bvh_nodes_size;) {
size_t nsize, nsize_bbox;
if (bvh_nodes[i].x & PATH_RAY_NODE_UNALIGNED) {
nsize = BVH_UNALIGNED_NODE_SIZE;

View File

@@ -111,9 +111,13 @@ BVHEmbree::~BVHEmbree()
}
}
void BVHEmbree::build(Progress &progress, Stats *stats, RTCDevice rtc_device_)
void BVHEmbree::build(Progress &progress,
Stats *stats,
RTCDevice rtc_device_,
const bool rtc_device_is_sycl_)
{
rtc_device = rtc_device_;
rtc_device_is_sycl = rtc_device_is_sycl_;
assert(rtc_device);
rtcSetDeviceErrorFunction(rtc_device, rtc_error_func, NULL);
@@ -266,15 +270,29 @@ void BVHEmbree::add_triangles(const Object *ob, const Mesh *mesh, int i)
rtcSetGeometryTimeStepCount(geom_id, num_motion_steps);
const int *triangles = mesh->get_triangles().data();
rtcSetSharedGeometryBuffer(geom_id,
RTC_BUFFER_TYPE_INDEX,
0,
RTC_FORMAT_UINT3,
triangles,
0,
sizeof(int) * 3,
num_triangles);
if (!rtc_device_is_sycl) {
rtcSetSharedGeometryBuffer(geom_id,
RTC_BUFFER_TYPE_INDEX,
0,
RTC_FORMAT_UINT3,
triangles,
0,
sizeof(int) * 3,
num_triangles);
}
else {
/* NOTE(sirgienko): If the Embree device is a SYCL device, then Embree execution will
* happen on GPU, and we cannot use standard host pointers at this point. So instead
* of making a shared geometry buffer, a new Embree buffer is created and the data
* is copied into it. */
int *triangles_buffer = (int *)rtcSetNewGeometryBuffer(
geom_id, RTC_BUFFER_TYPE_INDEX, 0, RTC_FORMAT_UINT3, sizeof(int) * 3, num_triangles);
assert(triangles_buffer);
if (triangles_buffer) {
static_assert(sizeof(int) == sizeof(uint));
std::memcpy(triangles_buffer, triangles, sizeof(int) * 3 * (num_triangles));
}
}
set_tri_vertex_buffer(geom_id, mesh, false);
rtcSetGeometryUserData(geom_id, (void *)prim_offset);
@@ -323,14 +341,38 @@ void BVHEmbree::set_tri_vertex_buffer(RTCGeometry geom_id, const Mesh *mesh, con
rtcUpdateGeometryBuffer(geom_id, RTC_BUFFER_TYPE_VERTEX, t);
}
else {
rtcSetSharedGeometryBuffer(geom_id,
RTC_BUFFER_TYPE_VERTEX,
t,
RTC_FORMAT_FLOAT3,
verts,
0,
sizeof(float3),
num_verts + 1);
if (!rtc_device_is_sycl) {
rtcSetSharedGeometryBuffer(geom_id,
RTC_BUFFER_TYPE_VERTEX,
t,
RTC_FORMAT_FLOAT3,
verts,
0,
sizeof(float3),
num_verts + 1);
}
else {
/* NOTE(sirgienko): If the Embree device is a SYCL device, then Embree execution will
* happen on GPU, and we cannot use standard host pointers at this point. So instead
* of making a shared geometry buffer, a new Embree buffer is created and the data
* is copied into it. */
/* As float3 is packed on GPU side, we map it to packed_float3. */
packed_float3 *verts_buffer = (packed_float3 *)rtcSetNewGeometryBuffer(
geom_id,
RTC_BUFFER_TYPE_VERTEX,
t,
RTC_FORMAT_FLOAT3,
sizeof(packed_float3),
num_verts + 1);
assert(verts_buffer);
if (verts_buffer) {
for (size_t i = (size_t)0; i < num_verts + 1; ++i) {
verts_buffer[i].x = verts[i].x;
verts_buffer[i].y = verts[i].y;
verts_buffer[i].z = verts[i].z;
}
}
}
}
}
}
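
Reviewer note: the SYCL branch above copies vertices one by one rather than with a single `memcpy` because the source and destination element types differ. A minimal Python sketch of that repacking, assuming the host `float3` occupies 16 bytes (three floats plus one float of padding) and `packed_float3` occupies 12; both sizes are assumptions here, not taken from this diff, and `pack_vertices` is a hypothetical name.

```python
import struct

PADDED_STRIDE = 16  # assumed size of a host-side float3 (x, y, z, padding)
PACKED_STRIDE = 12  # assumed size of packed_float3 (x, y, z)


def pack_vertices(padded, num_verts):
    """Copy num_verts padded vertices into a tightly packed byte string."""
    out = bytearray()
    for i in range(num_verts):
        # Read four little-endian floats and drop the padding component.
        x, y, z, _pad = struct.unpack_from("<4f", padded, i * PADDED_STRIDE)
        out += struct.pack("<3f", x, y, z)
    return bytes(out)
```

The result has `num_verts * PACKED_STRIDE` bytes, which corresponds to the `sizeof(packed_float3)` stride passed to `rtcSetNewGeometryBuffer` in the hunk.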

View File

@@ -29,7 +29,10 @@ class PointCloud;
class BVHEmbree : public BVH {
public:
void build(Progress &progress, Stats *stats, RTCDevice rtc_device);
void build(Progress &progress,
Stats *stats,
RTCDevice rtc_device,
const bool isSyclEmbreeDevice = false);
void refit(Progress &progress);
RTCScene scene;
@@ -55,6 +58,7 @@ class BVHEmbree : public BVH {
const bool update);
RTCDevice rtc_device;
bool rtc_device_is_sycl;
enum RTCBuildQuality build_quality;
};

View File

@@ -42,15 +42,19 @@ endif()
###########################################################################
if(WITH_CYCLES_HIP_BINARIES AND WITH_CYCLES_DEVICE_HIP)
set(WITH_CYCLES_HIP_BINARIES OFF)
message(STATUS "HIP temporarily disabled due to compiler bugs")
if(UNIX)
# Disabled until there is a HIP 5.5 release for Linux.
set(WITH_CYCLES_HIP_BINARIES OFF)
message(STATUS "HIP temporarily disabled due to compiler bugs")
else()
# Need at least HIP 5.5 to solve compiler bug affecting the kernel.
find_package(HIP 5.5.0)
set_and_warn_library_found("HIP compiler" HIP_FOUND WITH_CYCLES_HIP_BINARIES)
# find_package(HIP)
# set_and_warn_library_found("HIP compiler" HIP_FOUND WITH_CYCLES_HIP_BINARIES)
# if(HIP_FOUND)
# message(STATUS "Found HIP ${HIP_HIPCC_EXECUTABLE} (${HIP_VERSION})")
# endif()
if(HIP_FOUND)
message(STATUS "Found HIP ${HIP_HIPCC_EXECUTABLE} (${HIP_VERSION})")
endif()
endif()
endif()
if(NOT WITH_HIP_DYNLOAD)

View File

@@ -84,7 +84,7 @@ CPUDevice::~CPUDevice()
texture_info.free();
}
BVHLayoutMask CPUDevice::get_bvh_layout_mask() const
BVHLayoutMask CPUDevice::get_bvh_layout_mask(uint /*kernel_features*/) const
{
BVHLayoutMask bvh_layout_mask = BVH_LAYOUT_BVH2;
#ifdef WITH_EMBREE

View File

@@ -56,7 +56,7 @@ class CPUDevice : public Device {
CPUDevice(const DeviceInfo &info_, Stats &stats_, Profiler &profiler_);
~CPUDevice();
virtual BVHLayoutMask get_bvh_layout_mask() const override;
virtual BVHLayoutMask get_bvh_layout_mask(uint /*kernel_features*/) const override;
/* Returns true if the texture info was copied to the device (meaning, some more
* re-initialization might be needed). */

View File

@@ -35,7 +35,7 @@ bool CUDADevice::have_precompiled_kernels()
return path_exists(cubins_path);
}
BVHLayoutMask CUDADevice::get_bvh_layout_mask() const
BVHLayoutMask CUDADevice::get_bvh_layout_mask(uint /*kernel_features*/) const
{
return BVH_LAYOUT_BVH2;
}

View File

@@ -38,7 +38,7 @@ class CUDADevice : public GPUDevice {
static bool have_precompiled_kernels();
virtual BVHLayoutMask get_bvh_layout_mask() const override;
virtual BVHLayoutMask get_bvh_layout_mask(uint /*kernel_features*/) const override;
void set_error(const string &error) override;

View File

@@ -354,7 +354,7 @@ DeviceInfo Device::get_multi_device(const vector<DeviceInfo> &subdevices,
info.has_guiding = true;
info.has_profiling = true;
info.has_peer_memory = false;
info.use_metalrt = false;
info.use_hardware_raytracing = false;
info.denoisers = DENOISER_ALL;
foreach (const DeviceInfo &device, subdevices) {
@@ -403,7 +403,7 @@ DeviceInfo Device::get_multi_device(const vector<DeviceInfo> &subdevices,
info.has_guiding &= device.has_guiding;
info.has_profiling &= device.has_profiling;
info.has_peer_memory |= device.has_peer_memory;
info.use_metalrt |= device.use_metalrt;
info.use_hardware_raytracing |= device.use_hardware_raytracing;
info.denoisers &= device.denoisers;
}

View File

@@ -71,15 +71,16 @@ class DeviceInfo {
string description;
string id; /* used for user preferences, should stay fixed with changing hardware config */
int num;
bool display_device; /* GPU is used as a display device. */
bool has_nanovdb; /* Support NanoVDB volumes. */
bool has_light_tree; /* Support light tree. */
bool has_osl; /* Support Open Shading Language. */
bool has_guiding; /* Support path guiding. */
bool has_profiling; /* Supports runtime collection of profiling info. */
bool has_peer_memory; /* GPU has P2P access to memory of another GPU. */
bool has_gpu_queue; /* Device supports GPU queue. */
bool use_metalrt; /* Use MetalRT to accelerate ray queries (Metal only). */
bool display_device; /* GPU is used as a display device. */
bool has_nanovdb; /* Support NanoVDB volumes. */
bool has_light_tree; /* Support light tree. */
bool has_osl; /* Support Open Shading Language. */
bool has_guiding; /* Support path guiding. */
bool has_profiling; /* Supports runtime collection of profiling info. */
bool has_peer_memory; /* GPU has P2P access to memory of another GPU. */
bool has_gpu_queue; /* Device supports GPU queue. */
bool use_hardware_raytracing; /* Use hardware ray tracing to accelerate ray queries in a backend.
*/
KernelOptimizationLevel kernel_optimization_level; /* Optimization level applied to path tracing
* kernels (Metal only). */
DenoiserTypeMask denoisers; /* Supported denoiser types. */
@@ -101,7 +102,7 @@ class DeviceInfo {
has_profiling = false;
has_peer_memory = false;
has_gpu_queue = false;
use_metalrt = false;
use_hardware_raytracing = false;
denoisers = DENOISER_NONE;
}
@@ -157,7 +158,7 @@ class Device {
fprintf(stderr, "%s\n", error.c_str());
fflush(stderr);
}
virtual BVHLayoutMask get_bvh_layout_mask() const = 0;
virtual BVHLayoutMask get_bvh_layout_mask(uint kernel_features) const = 0;
/* statistics */
Stats &stats;

View File

@@ -20,7 +20,7 @@ class DummyDevice : public Device {
~DummyDevice() {}
virtual BVHLayoutMask get_bvh_layout_mask() const override
virtual BVHLayoutMask get_bvh_layout_mask(uint /*kernel_features*/) const override
{
return 0;
}

View File

@@ -137,7 +137,7 @@ void device_hip_info(vector<DeviceInfo> &devices)
info.num = num;
info.has_nanovdb = true;
info.has_light_tree = false;
info.has_light_tree = true;
info.denoisers = 0;
info.has_gpu_queue = true;

View File

@@ -35,7 +35,7 @@ bool HIPDevice::have_precompiled_kernels()
return path_exists(fatbins_path);
}
BVHLayoutMask HIPDevice::get_bvh_layout_mask() const
BVHLayoutMask HIPDevice::get_bvh_layout_mask(uint /*kernel_features*/) const
{
return BVH_LAYOUT_BVH2;
}

View File

@@ -35,7 +35,7 @@ class HIPDevice : public GPUDevice {
static bool have_precompiled_kernels();
virtual BVHLayoutMask get_bvh_layout_mask() const override;
virtual BVHLayoutMask get_bvh_layout_mask(uint /*kernel_features*/) const override;
void set_error(const string &error) override;

View File

@@ -3,7 +3,9 @@
#include "device/kernel.h"
#include "util/log.h"
#ifndef __KERNEL_ONEAPI__
# include "util/log.h"
#endif
CCL_NAMESPACE_BEGIN
@@ -153,10 +155,13 @@ const char *device_kernel_as_string(DeviceKernel kernel)
case DEVICE_KERNEL_NUM:
break;
};
#ifndef __KERNEL_ONEAPI__
LOG(FATAL) << "Unhandled kernel " << static_cast<int>(kernel) << ", should never happen.";
#endif
return "UNKNOWN";
}
#ifndef __KERNEL_ONEAPI__
std::ostream &operator<<(std::ostream &os, DeviceKernel kernel)
{
os << device_kernel_as_string(kernel);
@@ -178,5 +183,6 @@ string device_kernel_mask_as_string(DeviceKernelMask mask)
return str;
}
#endif
CCL_NAMESPACE_END

View File

@@ -3,11 +3,13 @@
#pragma once
#include "kernel/types.h"
#ifndef __KERNEL_ONEAPI__
# include "kernel/types.h"
#include "util/string.h"
# include "util/string.h"
#include <ostream> // NOLINT
# include <ostream> // NOLINT
#endif
CCL_NAMESPACE_BEGIN
@@ -15,9 +17,12 @@ bool device_kernel_has_shading(DeviceKernel kernel);
bool device_kernel_has_intersection(DeviceKernel kernel);
const char *device_kernel_as_string(DeviceKernel kernel);
#ifndef __KERNEL_ONEAPI__
std::ostream &operator<<(std::ostream &os, DeviceKernel kernel);
typedef uint64_t DeviceKernelMask;
string device_kernel_mask_as_string(DeviceKernelMask mask);
#endif
CCL_NAMESPACE_END

View File

@@ -100,7 +100,7 @@ class MetalDevice : public Device {
virtual void cancel() override;
virtual BVHLayoutMask get_bvh_layout_mask() const override;
virtual BVHLayoutMask get_bvh_layout_mask(uint /*kernel_features*/) const override;
void set_error(const string &error) override;

View File

@@ -39,7 +39,7 @@ bool MetalDevice::is_device_cancelled(int ID)
return get_device_by_ID(ID, lock) == nullptr;
}
BVHLayoutMask MetalDevice::get_bvh_layout_mask() const
BVHLayoutMask MetalDevice::get_bvh_layout_mask(uint /*kernel_features*/) const
{
return use_metalrt ? BVH_LAYOUT_METAL : BVH_LAYOUT_BVH2;
}
@@ -100,12 +100,12 @@ MetalDevice::MetalDevice(const DeviceInfo &info, Stats &stats, Profiler &profile
}
case METAL_GPU_AMD: {
max_threads_per_threadgroup = 128;
use_metalrt = info.use_metalrt;
use_metalrt = info.use_hardware_raytracing;
break;
}
case METAL_GPU_APPLE: {
max_threads_per_threadgroup = 512;
use_metalrt = info.use_metalrt;
use_metalrt = info.use_hardware_raytracing;
break;
}
}

View File

@@ -96,12 +96,13 @@ class MultiDevice : public Device {
return error_msg;
}
virtual BVHLayoutMask get_bvh_layout_mask() const override
virtual BVHLayoutMask get_bvh_layout_mask(uint kernel_features) const override
{
BVHLayoutMask bvh_layout_mask = BVH_LAYOUT_ALL;
BVHLayoutMask bvh_layout_mask_all = BVH_LAYOUT_NONE;
foreach (const SubDevice &sub_device, devices) {
BVHLayoutMask device_bvh_layout_mask = sub_device.device->get_bvh_layout_mask();
BVHLayoutMask device_bvh_layout_mask = sub_device.device->get_bvh_layout_mask(
kernel_features);
bvh_layout_mask &= device_bvh_layout_mask;
bvh_layout_mask_all |= device_bvh_layout_mask;
}
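Note on the hunk above: every sub-device is now queried with the same kernel_features, and the results are folded into an intersection (layouts all devices share) and a union (layouts any device offers). A minimal standalone sketch of that fold follows. The type alias, the bit values, and the names LayoutMasks and combine_layout_masks are assumptions made for illustration, not the Cycles definitions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the Cycles types; the bit values are illustrative only.
using BVHLayoutMask = std::uint32_t;
constexpr BVHLayoutMask BVH_LAYOUT_NONE = 0u;
constexpr BVHLayoutMask BVH_LAYOUT_BVH2 = 1u << 0;
constexpr BVHLayoutMask BVH_LAYOUT_EMBREE = 1u << 1;
constexpr BVHLayoutMask BVH_LAYOUT_ALL = ~0u;

struct LayoutMasks {
  BVHLayoutMask common; /* Layouts supported by every sub-device. */
  BVHLayoutMask any;    /* Layouts supported by at least one sub-device. */
};

/* Each entry is the mask one sub-device reported for the same kernel features. */
inline LayoutMasks combine_layout_masks(const std::vector<BVHLayoutMask> &per_device)
{
  LayoutMasks result{BVH_LAYOUT_ALL, BVH_LAYOUT_NONE};
  for (const BVHLayoutMask mask : per_device) {
    result.common &= mask; /* Intersection: start from "all" and narrow down. */
    result.any |= mask;    /* Union: start from "none" and widen. */
  }
  return result;
}
```

With one device reporting BVH2 and Embree and another reporting only BVH2, the common mask in this sketch reduces to BVH2 while the union keeps both bits.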

View File

@@ -40,12 +40,12 @@ bool device_oneapi_init()
if (getenv("SYCL_CACHE_THRESHOLD") == nullptr) {
_putenv_s("SYCL_CACHE_THRESHOLD", "0");
}
if (getenv("SYCL_DEVICE_FILTER") == nullptr) {
if (getenv("ONEAPI_DEVICE_SELECTOR") == nullptr) {
if (getenv("CYCLES_ONEAPI_ALL_DEVICES") == nullptr) {
_putenv_s("SYCL_DEVICE_FILTER", "level_zero");
_putenv_s("ONEAPI_DEVICE_SELECTOR", "level_zero:*");
}
else {
_putenv_s("SYCL_DEVICE_FILTER", "level_zero,cuda,hip");
_putenv_s("ONEAPI_DEVICE_SELECTOR", "!opencl:*");
}
}
if (getenv("SYCL_ENABLE_PCI") == nullptr) {
@@ -58,10 +58,10 @@ bool device_oneapi_init()
setenv("SYCL_CACHE_PERSISTENT", "1", false);
setenv("SYCL_CACHE_THRESHOLD", "0", false);
if (getenv("CYCLES_ONEAPI_ALL_DEVICES") == nullptr) {
setenv("SYCL_DEVICE_FILTER", "level_zero", false);
setenv("ONEAPI_DEVICE_SELECTOR", "level_zero:*", false);
}
else {
setenv("SYCL_DEVICE_FILTER", "level_zero,cuda,hip", false);
setenv("ONEAPI_DEVICE_SELECTOR", "!opencl:*", false);
}
setenv("SYCL_ENABLE_PCI", "1", false);
setenv("SYCL_PI_LEVEL_ZERO_USE_COPY_ENGINE_FOR_IN_ORDER_QUEUE", "0", false);
@@ -87,7 +87,8 @@ Device *device_oneapi_create(const DeviceInfo &info, Stats &stats, Profiler &pro
}
#ifdef WITH_ONEAPI
static void device_iterator_cb(const char *id, const char *name, int num, void *user_ptr)
static void device_iterator_cb(
const char *id, const char *name, int num, bool hwrt_support, void *user_ptr)
{
vector<DeviceInfo> *devices = (vector<DeviceInfo> *)user_ptr;
@@ -112,6 +113,13 @@ static void device_iterator_cb(const char *id, const char *name, int num, void *
/* NOTE(@nsirgien): Seems not possible to know from SYCL/oneAPI or Level0. */
info.display_device = false;
# ifdef WITH_EMBREE_GPU
info.use_hardware_raytracing = hwrt_support;
# else
info.use_hardware_raytracing = false;
(void)hwrt_support;
# endif
devices->push_back(info);
VLOG_INFO << "Added device \"" << name << "\" with id \"" << info.id << "\".";
}
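Note on the hunk above: both platform branches only set ONEAPI_DEVICE_SELECTOR when the user has not set it already, and choose between the two values shown in the diff depending on CYCLES_ONEAPI_ALL_DEVICES. The sketch below illustrates that "default unless overridden" pattern in isolation. The helper names set_env_default and oneapi_selector_default are hypothetical, and the sketch says nothing about how the oneAPI runtime interprets the selector strings.

```cpp
#include <cassert>
#include <stdlib.h>
#include <string>

/* Set an environment variable only when it is not already present,
 * so a value exported by the user always wins. */
inline void set_env_default(const char *name, const char *value)
{
#ifdef _WIN32
  if (getenv(name) == nullptr) {
    _putenv_s(name, value);
  }
#else
  /* The last argument is the overwrite flag; 0 keeps an existing value. */
  setenv(name, value, 0);
#endif
}

/* Mirrors the two selector strings that appear in the diff. */
inline std::string oneapi_selector_default(bool all_devices)
{
  return all_devices ? "!opencl:*" : "level_zero:*";
}
```

A caller would pass `getenv("CYCLES_ONEAPI_ALL_DEVICES") != nullptr` as the flag and hand the result to set_env_default.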

View File

@@ -8,7 +8,19 @@
# include "util/debug.h"
# include "util/log.h"
# ifdef WITH_EMBREE_GPU
# include "bvh/embree.h"
# endif
# include "kernel/device/oneapi/globals.h"
# include "kernel/device/oneapi/kernel.h"
# if defined(WITH_EMBREE_GPU) && defined(EMBREE_SYCL_SUPPORT) && !defined(SYCL_LANGUAGE_VERSION)
/* These declarations are missing from embree headers when compiling from a compiler that doesn't
* support SYCL. */
extern "C" RTCDevice rtcNewSYCLDevice(sycl::context context, const char *config);
extern "C" bool rtcIsSYCLDeviceSupported(const sycl::device sycl_device);
# endif
CCL_NAMESPACE_BEGIN
@@ -22,16 +34,29 @@ static void queue_error_cb(const char *message, void *user_ptr)
OneapiDevice::OneapiDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
: Device(info, stats, profiler),
device_queue_(nullptr),
# ifdef WITH_EMBREE_GPU
embree_device(nullptr),
embree_scene(nullptr),
# endif
texture_info_(this, "texture_info", MEM_GLOBAL),
kg_memory_(nullptr),
kg_memory_device_(nullptr),
kg_memory_size_(0)
{
need_texture_info_ = false;
use_hardware_raytracing = info.use_hardware_raytracing;
oneapi_set_error_cb(queue_error_cb, &oneapi_error_string_);
bool is_finished_ok = create_queue(device_queue_, info.num);
bool is_finished_ok = create_queue(device_queue_,
info.num,
# ifdef WITH_EMBREE_GPU
use_hardware_raytracing ? &embree_device : nullptr
# else
nullptr
# endif
);
if (is_finished_ok == false) {
set_error("oneAPI queue initialization error: got runtime exception \"" +
oneapi_error_string_ + "\"");
@@ -42,6 +67,16 @@ OneapiDevice::OneapiDevice(const DeviceInfo &info, Stats &stats, Profiler &profi
assert(device_queue_);
}
# ifdef WITH_EMBREE_GPU
use_hardware_raytracing = use_hardware_raytracing && (embree_device != nullptr);
# else
use_hardware_raytracing = false;
# endif
if (use_hardware_raytracing) {
VLOG_INFO << "oneAPI will use hardware ray tracing for intersection acceleration.";
}
size_t globals_segment_size;
is_finished_ok = kernel_globals_size(globals_segment_size);
if (is_finished_ok == false) {
@@ -64,6 +99,11 @@ OneapiDevice::OneapiDevice(const DeviceInfo &info, Stats &stats, Profiler &profi
OneapiDevice::~OneapiDevice()
{
# ifdef WITH_EMBREE_GPU
if (embree_device)
rtcReleaseDevice(embree_device);
# endif
texture_info_.free();
usm_free(device_queue_, kg_memory_);
usm_free(device_queue_, kg_memory_device_);
@@ -80,15 +120,47 @@ bool OneapiDevice::check_peer_access(Device * /*peer_device*/)
return false;
}
BVHLayoutMask OneapiDevice::get_bvh_layout_mask() const
bool OneapiDevice::can_use_hardware_raytracing_for_features(uint requested_features) const
{
return BVH_LAYOUT_BVH2;
/* MNEE and Ray-trace kernels currently don't work correctly with HWRT. */
return !(requested_features & (KERNEL_FEATURE_MNEE | KERNEL_FEATURE_NODE_RAYTRACE));
}
BVHLayoutMask OneapiDevice::get_bvh_layout_mask(uint requested_features) const
{
return (use_hardware_raytracing &&
can_use_hardware_raytracing_for_features(requested_features)) ?
BVH_LAYOUT_EMBREE :
BVH_LAYOUT_BVH2;
}
# ifdef WITH_EMBREE_GPU
void OneapiDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
{
if (embree_device && bvh->params.bvh_layout == BVH_LAYOUT_EMBREE) {
BVHEmbree *const bvh_embree = static_cast<BVHEmbree *>(bvh);
if (refit) {
bvh_embree->refit(progress);
}
else {
bvh_embree->build(progress, &stats, embree_device, true);
}
if (bvh->params.top_level) {
embree_scene = bvh_embree->scene;
}
}
else {
Device::build_bvh(bvh, progress, refit);
}
}
# endif
bool OneapiDevice::load_kernels(const uint requested_features)
{
assert(device_queue_);
kernel_features = requested_features;
bool is_finished_ok = oneapi_run_test_kernel(device_queue_);
if (is_finished_ok == false) {
set_error("oneAPI test kernel execution: got a runtime exception \"" + oneapi_error_string_ +
@@ -100,7 +172,14 @@ bool OneapiDevice::load_kernels(const uint requested_features)
assert(device_queue_);
}
is_finished_ok = oneapi_load_kernels(device_queue_, (const unsigned int)requested_features);
if (use_hardware_raytracing && !can_use_hardware_raytracing_for_features(requested_features)) {
VLOG_INFO
<< "Hardware ray tracing disabled, not supported yet by oneAPI for requested features.";
use_hardware_raytracing = false;
}
is_finished_ok = oneapi_load_kernels(
device_queue_, (const unsigned int)requested_features, use_hardware_raytracing);
if (is_finished_ok == false) {
set_error("oneAPI kernels loading: got a runtime exception \"" + oneapi_error_string_ + "\"");
}
@@ -327,6 +406,16 @@ void OneapiDevice::const_copy_to(const char *name, void *host, size_t size)
<< string_human_readable_number(size) << " bytes. ("
<< string_human_readable_size(size) << ")";
# ifdef WITH_EMBREE_GPU
if (strcmp(name, "data") == 0) {
assert(size <= sizeof(KernelData));
/* Update scene handle (since it is different for each device on multi devices). */
KernelData *const data = (KernelData *)host;
data->device_bvh = embree_scene;
}
# endif
ConstMemMap::iterator i = const_mem_map_.find(name);
device_vector<uchar> *data;
@@ -446,7 +535,9 @@ void OneapiDevice::check_usm(SyclQueue *queue_, const void *usm_ptr, bool allow_
# endif
}
bool OneapiDevice::create_queue(SyclQueue *&external_queue, int device_index)
bool OneapiDevice::create_queue(SyclQueue *&external_queue,
int device_index,
void *embree_device_pointer)
{
bool finished_correct = true;
try {
@@ -457,6 +548,13 @@ bool OneapiDevice::create_queue(SyclQueue *&external_queue, int device_index)
sycl::queue *created_queue = new sycl::queue(devices[device_index],
sycl::property::queue::in_order());
external_queue = reinterpret_cast<SyclQueue *>(created_queue);
# ifdef WITH_EMBREE_GPU
if (embree_device_pointer) {
*((RTCDevice *)embree_device_pointer) = rtcNewSYCLDevice(created_queue->get_context(), "");
}
# else
(void)embree_device_pointer;
# endif
}
catch (sycl::exception const &e) {
finished_correct = false;
@@ -625,7 +723,8 @@ bool OneapiDevice::enqueue_kernel(KernelContext *kernel_context,
size_t global_size,
void **args)
{
return oneapi_enqueue_kernel(kernel_context, kernel, global_size, args);
return oneapi_enqueue_kernel(
kernel_context, kernel, global_size, kernel_features, use_hardware_raytracing, args);
}
/* Compute-runtime (i.e. NEO) version is what gets returned by sycl/L0 on Windows
@@ -767,9 +866,9 @@ char *OneapiDevice::device_capabilities()
sycl::id<3> max_work_item_sizes =
device.get_info<sycl::info::device::max_work_item_sizes<3>>();
WRITE_ATTR("max_work_item_sizes_dim0", ((size_t)max_work_item_sizes.get(0)))
WRITE_ATTR("max_work_item_sizes_dim1", ((size_t)max_work_item_sizes.get(1)))
WRITE_ATTR("max_work_item_sizes_dim2", ((size_t)max_work_item_sizes.get(2)))
WRITE_ATTR(max_work_item_sizes_dim0, ((size_t)max_work_item_sizes.get(0)))
WRITE_ATTR(max_work_item_sizes_dim1, ((size_t)max_work_item_sizes.get(1)))
WRITE_ATTR(max_work_item_sizes_dim2, ((size_t)max_work_item_sizes.get(2)))
GET_NUM_ATTR(max_work_group_size)
GET_NUM_ATTR(max_num_sub_groups)
@@ -792,7 +891,7 @@ char *OneapiDevice::device_capabilities()
GET_NUM_ATTR(native_vector_width_half)
size_t max_clock_frequency = device.get_info<sycl::info::device::max_clock_frequency>();
WRITE_ATTR("max_clock_frequency", max_clock_frequency)
WRITE_ATTR(max_clock_frequency, max_clock_frequency)
GET_NUM_ATTR(address_bits)
GET_NUM_ATTR(max_mem_alloc_size)
@@ -801,7 +900,7 @@ char *OneapiDevice::device_capabilities()
* supported so we always return false, even if device supports HW texture usage acceleration.
*/
bool image_support = false;
WRITE_ATTR("image_support", (size_t)image_support)
WRITE_ATTR(image_support, (size_t)image_support)
GET_NUM_ATTR(max_parameter_size)
GET_NUM_ATTR(mem_base_addr_align)
@@ -830,12 +929,17 @@ void OneapiDevice::iterate_devices(OneAPIDeviceIteratorCallback cb, void *user_p
std::string name = device.get_info<sycl::info::device::name>();
# else
std::string name = "SYCL Host Task (Debug)";
# endif
# ifdef WITH_EMBREE_GPU
bool hwrt_support = rtcIsSYCLDeviceSupported(device);
# else
bool hwrt_support = false;
# endif
std::string id = "ONEAPI_" + platform_name + "_" + name;
if (device.has(sycl::aspect::ext_intel_pci_address)) {
id.append("_" + device.get_info<sycl::ext::intel::info::device::pci_address>());
}
(cb)(id.c_str(), name.c_str(), num, user_ptr);
(cb)(id.c_str(), name.c_str(), num, hwrt_support, user_ptr);
num++;
}
}
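Note on the oneAPI hunks above: the BVH layout is now chosen from two inputs, whether the device can use hardware ray tracing and whether the requested kernel features are compatible with it (MNEE and ray-trace nodes are excluded for now). A minimal sketch of that decision, assuming made-up bit values and a simplified Layout enum in place of the real KERNEL_FEATURE_* and BVH_LAYOUT_* constants:

```cpp
#include <cassert>
#include <cstdint>

/* Hypothetical bit values; the real KERNEL_FEATURE_* constants are defined in Cycles. */
constexpr std::uint32_t KERNEL_FEATURE_NODE_RAYTRACE = 1u << 0;
constexpr std::uint32_t KERNEL_FEATURE_MNEE = 1u << 1;
constexpr std::uint32_t KERNEL_FEATURE_OTHER = 1u << 2; /* Any unrelated feature. */

enum class Layout { BVH2, Embree };

/* Hardware ray tracing is only allowed when neither excluded feature is requested. */
inline bool can_use_hwrt_for_features(std::uint32_t requested_features)
{
  return (requested_features & (KERNEL_FEATURE_MNEE | KERNEL_FEATURE_NODE_RAYTRACE)) == 0;
}

/* Fall back to BVH2 when the device lacks support or the features rule it out. */
inline Layout choose_layout(bool device_supports_hwrt, std::uint32_t requested_features)
{
  return (device_supports_hwrt && can_use_hwrt_for_features(requested_features)) ?
             Layout::Embree :
             Layout::BVH2;
}
```

Because the feature mask is an argument rather than stored state, the same device can report a different layout for different scenes, which is why get_bvh_layout_mask gained the kernel_features parameter across all backends.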

View File

@@ -16,15 +16,16 @@ CCL_NAMESPACE_BEGIN
class DeviceQueue;
typedef void (*OneAPIDeviceIteratorCallback)(const char *id,
const char *name,
int num,
void *user_ptr);
typedef void (*OneAPIDeviceIteratorCallback)(
const char *id, const char *name, int num, bool hwrt_support, void *user_ptr);
class OneapiDevice : public Device {
private:
SyclQueue *device_queue_;
# ifdef WITH_EMBREE_GPU
RTCDevice embree_device;
RTCScene embree_scene;
# endif
using ConstMemMap = map<string, device_vector<uchar> *>;
ConstMemMap const_mem_map_;
device_vector<TextureInfo> texture_info_;
@@ -34,17 +35,21 @@ class OneapiDevice : public Device {
size_t kg_memory_size_ = (size_t)0;
size_t max_memory_on_device_ = (size_t)0;
std::string oneapi_error_string_;
bool use_hardware_raytracing = false;
unsigned int kernel_features = 0;
public:
virtual BVHLayoutMask get_bvh_layout_mask() const override;
virtual BVHLayoutMask get_bvh_layout_mask(uint kernel_features) const override;
OneapiDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler);
virtual ~OneapiDevice();
# ifdef WITH_EMBREE_GPU
void build_bvh(BVH *bvh, Progress &progress, bool refit) override;
# endif
bool check_peer_access(Device *peer_device) override;
bool load_kernels(const uint requested_features) override;
bool load_kernels(const uint kernel_features) override;
void load_texture_info();
@@ -113,8 +118,9 @@ class OneapiDevice : public Device {
SyclQueue *sycl_queue();
protected:
bool can_use_hardware_raytracing_for_features(uint kernel_features) const;
void check_usm(SyclQueue *queue, const void *usm_ptr, bool allow_host);
bool create_queue(SyclQueue *&external_queue, int device_index);
bool create_queue(SyclQueue *&external_queue, int device_index, void *embree_device);
void free_queue(SyclQueue *queue);
void *usm_aligned_alloc_host(SyclQueue *queue, size_t memory_size, size_t alignment);
void *usm_alloc_device(SyclQueue *queue, size_t memory_size);

View File

@@ -151,7 +151,7 @@ unique_ptr<DeviceQueue> OptiXDevice::gpu_queue_create()
return make_unique<OptiXDeviceQueue>(this);
}
BVHLayoutMask OptiXDevice::get_bvh_layout_mask() const
BVHLayoutMask OptiXDevice::get_bvh_layout_mask(uint /*kernel_features*/) const
{
/* OptiX has its own internal acceleration structure format. */
return BVH_LAYOUT_OPTIX;

View File

@@ -88,7 +88,7 @@ class OptiXDevice : public CUDADevice {
OptiXDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler);
~OptiXDevice();
BVHLayoutMask get_bvh_layout_mask() const override;
BVHLayoutMask get_bvh_layout_mask(uint /*kernel_features*/) const override;
string compile_kernel_get_common_cflags(const uint kernel_features);

View File

@@ -574,7 +574,7 @@ void PathTrace::denoise(const RenderWork &render_work)
void PathTrace::set_output_driver(unique_ptr<OutputDriver> driver)
{
output_driver_ = move(driver);
output_driver_ = std::move(driver);
}
void PathTrace::set_display_driver(unique_ptr<DisplayDriver> driver)
@@ -585,7 +585,7 @@ void PathTrace::set_display_driver(unique_ptr<DisplayDriver> driver)
destroy_gpu_resources();
if (driver) {
display_ = make_unique<PathTraceDisplay>(move(driver));
display_ = make_unique<PathTraceDisplay>(std::move(driver));
}
else {
display_ = nullptr;

View File

@@ -9,7 +9,9 @@
CCL_NAMESPACE_BEGIN
PathTraceDisplay::PathTraceDisplay(unique_ptr<DisplayDriver> driver) : driver_(move(driver)) {}
PathTraceDisplay::PathTraceDisplay(unique_ptr<DisplayDriver> driver) : driver_(std::move(driver))
{
}
void PathTraceDisplay::reset(const BufferParams &buffer_params, const bool reset_rendering)
{
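Note on the move() to std::move() changes above: they only qualify the call and leave the ownership transfer itself unchanged. A small sketch of the same constructor pattern, using placeholder Driver and Display types rather than the Cycles classes:

```cpp
#include <cassert>
#include <memory>
#include <utility>

/* Placeholder types for illustration only. */
struct Driver {
  int id;
};

class Display {
 public:
  /* Take ownership by value, then move the pointer into the member. */
  explicit Display(std::unique_ptr<Driver> driver) : driver_(std::move(driver)) {}

  int driver_id() const
  {
    return driver_->id;
  }

 private:
  std::unique_ptr<Driver> driver_;
};
```

After `Display display(std::move(ptr));` the caller's `ptr` is null and the Display instance owns the driver.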

View File

@@ -28,6 +28,7 @@ static size_t estimate_single_state_size(const uint kernel_features)
#define KERNEL_STRUCT_ARRAY_MEMBER(parent_struct, type, name, feature) \
state_size += (kernel_features & (feature)) ? sizeof(type) : 0;
#define KERNEL_STRUCT_END(name) \
(void)array_index; \
break; \
}
#define KERNEL_STRUCT_END_ARRAY(name, cpu_array_size, gpu_array_size) \
@@ -139,6 +140,7 @@ void PathTraceWorkGPU::alloc_integrator_soa()
integrator_state_gpu_.parent_struct[array_index].name = (type *)array->device_pointer; \
}
#define KERNEL_STRUCT_END(name) \
(void)array_index; \
break; \
}
#define KERNEL_STRUCT_END_ARRAY(name, cpu_array_size, gpu_array_size) \
@@ -299,8 +301,8 @@ void PathTraceWorkGPU::render_samples(RenderStatistics &statistics,
* become busy after adding new tiles). This is especially important for the shadow catcher which
* schedules work in halves of available number of paths. */
work_tile_scheduler_.set_max_num_path_states(max_num_paths_ / 8);
work_tile_scheduler_.set_accelerated_rt((device_->get_bvh_layout_mask() & BVH_LAYOUT_OPTIX) !=
0);
work_tile_scheduler_.set_accelerated_rt(
(device_->get_bvh_layout_mask(device_scene_->data.kernel_features) & BVH_LAYOUT_OPTIX) != 0);
work_tile_scheduler_.reset(effective_buffer_params_,
start_sample,
samples_num,

View File

@@ -55,21 +55,29 @@ void WorkTileScheduler::reset_scheduler_state()
VLOG_WORK << "Will schedule tiles of size " << tile_size_;
if (VLOG_IS_ON(3)) {
/* The logging is based on multiple tiles scheduled, ignoring overhead of multi-tile scheduling
* and purely focusing on the number of used path states. */
const int num_path_states_in_tile = tile_size_.width * tile_size_.height *
tile_size_.num_samples;
const int num_tiles = max_num_path_states_ / num_path_states_in_tile;
VLOG_WORK << "Number of unused path states: "
<< max_num_path_states_ - num_tiles * num_path_states_in_tile;
const int num_path_states_in_tile = tile_size_.width * tile_size_.height *
tile_size_.num_samples;
if (num_path_states_in_tile == 0) {
num_tiles_x_ = 0;
num_tiles_y_ = 0;
num_tiles_per_sample_range_ = 0;
}
else {
if (VLOG_IS_ON(3)) {
/* The logging is based on multiple tiles scheduled, ignoring overhead of multi-tile
* scheduling and purely focusing on the number of used path states. */
const int num_tiles = max_num_path_states_ / num_path_states_in_tile;
VLOG_WORK << "Number of unused path states: "
<< max_num_path_states_ - num_tiles * num_path_states_in_tile;
}
num_tiles_x_ = divide_up(image_size_px_.x, tile_size_.width);
num_tiles_y_ = divide_up(image_size_px_.y, tile_size_.height);
num_tiles_per_sample_range_ = divide_up(samples_num_, tile_size_.num_samples);
}
num_tiles_x_ = divide_up(image_size_px_.x, tile_size_.width);
num_tiles_y_ = divide_up(image_size_px_.y, tile_size_.height);
total_tiles_num_ = num_tiles_x_ * num_tiles_y_;
num_tiles_per_sample_range_ = divide_up(samples_num_, tile_size_.num_samples);
next_work_index_ = 0;
total_work_size_ = total_tiles_num_ * num_tiles_per_sample_range_;
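Note on the hunk above: the tile counts are now computed only when a tile holds at least one path state, so divide_up is never called with a zero divisor. A minimal sketch of that guard, where TileSize, TileCounts, count_tiles and this divide_up are simplified stand-ins rather than the scheduler's own members:

```cpp
#include <cassert>

struct TileSize {
  int width, height, num_samples;
};

struct TileCounts {
  int x, y, per_sample_range;
};

/* Integer division rounded up; assumes y > 0. */
inline int divide_up(int x, int y)
{
  return (x + y - 1) / y;
}

inline TileCounts count_tiles(int image_width,
                              int image_height,
                              int samples_num,
                              const TileSize &tile)
{
  const int num_path_states_in_tile = tile.width * tile.height * tile.num_samples;
  if (num_path_states_in_tile == 0) {
    /* Degenerate tile: schedule nothing instead of dividing by zero. */
    return {0, 0, 0};
  }
  return {divide_up(image_width, tile.width),
          divide_up(image_height, tile.height),
          divide_up(samples_num, tile.num_samples)};
}
```

In this sketch an empty tile size yields zero tiles in every dimension, so a total work size derived from these counts is also zero.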

View File

@@ -96,10 +96,13 @@ set(SRC_KERNEL_DEVICE_ONEAPI_HEADERS
device/oneapi/compat.h
device/oneapi/context_begin.h
device/oneapi/context_end.h
device/oneapi/context_intersect_begin.h
device/oneapi/context_intersect_end.h
device/oneapi/globals.h
device/oneapi/image.h
device/oneapi/kernel.h
device/oneapi/kernel_templates.h
device/cpu/bvh.h
)
set(SRC_KERNEL_CLOSURE_HEADERS
@@ -764,7 +767,7 @@ if(WITH_CYCLES_DEVICE_ONEAPI)
# Set defaults for spir64 and spir64_gen options
if(NOT DEFINED CYCLES_ONEAPI_SYCL_OPTIONS_spir64)
set(CYCLES_ONEAPI_SYCL_OPTIONS_spir64 "-options '-ze-opt-large-register-file -ze-opt-regular-grf-kernel integrator_intersect'")
set(CYCLES_ONEAPI_SYCL_OPTIONS_spir64 "-options '-ze-opt-regular-grf-kernel integrator_intersect -ze-opt-large-grf-kernel shade -ze-opt-no-local-to-generic'")
endif()
if(NOT DEFINED CYCLES_ONEAPI_SYCL_OPTIONS_spir64_gen)
set(CYCLES_ONEAPI_SYCL_OPTIONS_spir64_gen "${CYCLES_ONEAPI_SYCL_OPTIONS_spir64}" CACHE STRING "Extra build options for spir64_gen target")
@@ -775,8 +778,6 @@ if(WITH_CYCLES_DEVICE_ONEAPI)
# Host execution won't use GPU binaries, no need to compile them.
if(WITH_CYCLES_ONEAPI_BINARIES AND NOT WITH_CYCLES_ONEAPI_HOST_TASK_EXECUTION)
# AoT binaries aren't currently reused when calling sycl::build.
list(APPEND sycl_compiler_flags -DSYCL_SKIP_KERNELS_PRELOAD)
# Iterate over all targets and their options
list(JOIN CYCLES_ONEAPI_SYCL_TARGETS "," targets_string)
list(APPEND sycl_compiler_flags -fsycl-targets=${targets_string})
@@ -798,6 +799,59 @@ if(WITH_CYCLES_DEVICE_ONEAPI)
-I"${NANOVDB_INCLUDE_DIR}")
endif()
if(WITH_CYCLES_EMBREE AND EMBREE_SYCL_SUPPORT)
list(APPEND sycl_compiler_flags
-DWITH_EMBREE
-DWITH_EMBREE_GPU
-DEMBREE_MAJOR_VERSION=${EMBREE_MAJOR_VERSION}
-I"${EMBREE_INCLUDE_DIRS}")
if(WIN32)
list(APPEND sycl_compiler_flags
-ladvapi32.lib
)
endif()
set(next_library_mode "")
foreach(library ${EMBREE_LIBRARIES})
string(TOLOWER "${library}" library_lower)
if(("${library_lower}" STREQUAL "optimized") OR
("${library_lower}" STREQUAL "debug"))
set(next_library_mode "${library_lower}")
else()
if(next_library_mode STREQUAL "")
list(APPEND EMBREE_TBB_LIBRARIES_optimized ${library})
list(APPEND EMBREE_TBB_LIBRARIES_debug ${library})
else()
list(APPEND EMBREE_TBB_LIBRARIES_${next_library_mode} ${library})
endif()
set(next_library_mode "")
endif()
endforeach()
foreach(library ${TBB_LIBRARIES})
string(TOLOWER "${library}" library_lower)
if(("${library_lower}" STREQUAL "optimized") OR
("${library_lower}" STREQUAL "debug"))
set(next_library_mode "${library_lower}")
else()
if(next_library_mode STREQUAL "")
list(APPEND EMBREE_TBB_LIBRARIES_optimized ${library})
list(APPEND EMBREE_TBB_LIBRARIES_debug ${library})
else()
list(APPEND EMBREE_TBB_LIBRARIES_${next_library_mode} ${library})
endif()
set(next_library_mode "")
endif()
endforeach()
list(APPEND sycl_compiler_flags
"$<$<CONFIG:Release>:${EMBREE_TBB_LIBRARIES_optimized}>"
"$<$<CONFIG:RelWithDebInfo>:${EMBREE_TBB_LIBRARIES_optimized}>"
"$<$<CONFIG:MinSizeRel>:${EMBREE_TBB_LIBRARIES_optimized}>"
"$<$<CONFIG:Debug>:${EMBREE_TBB_LIBRARIES_debug}>"
)
endif()
if(WITH_CYCLES_DEBUG)
list(APPEND sycl_compiler_flags -DWITH_CYCLES_DEBUG)
endif()

View File

@@ -21,6 +21,28 @@
# define __BVH2__
#endif
#if defined(__KERNEL_ONEAPI__) && defined(WITH_EMBREE_GPU)
/* bool is apparently not tested for specialization constants:
* https://github.com/intel/llvm/blob/39d1c65272a786b2b13a6f094facfddf9408406d/sycl/test/basic_tests/SYCL-2020-spec-constants.cpp#L25-L27
* Instead of adding one more bool specialization constant, we reuse the existing embree_features
* one and use RTC_FEATURE_FLAG_NONE as the value that means Embree must not be called on the GPU.
*/
/* We set it to RTC_FEATURE_FLAG_NONE by default so AoT binaries contain MNEE and ray-trace kernels
* pre-compiled without Embree.
* Changing this default value would require updating the logic in oneapi_load_kernels(). */
static constexpr sycl::specialization_id<RTCFeatureFlags> oneapi_embree_features{
RTC_FEATURE_FLAG_NONE};
# define IF_USING_EMBREE \
if (kernel_handler.get_specialization_constant<oneapi_embree_features>() != \
RTC_FEATURE_FLAG_NONE)
# define IF_NOT_USING_EMBREE \
if (kernel_handler.get_specialization_constant<oneapi_embree_features>() == \
RTC_FEATURE_FLAG_NONE)
#else
# define IF_USING_EMBREE
# define IF_NOT_USING_EMBREE
#endif
CCL_NAMESPACE_BEGIN
#ifdef __BVH2__
@@ -74,30 +96,39 @@ ccl_device_intersect bool scene_intersect(KernelGlobals kg,
}
# ifdef __EMBREE__
if (kernel_data.device_bvh) {
return kernel_embree_intersect(kg, ray, visibility, isect);
IF_USING_EMBREE
{
if (kernel_data.device_bvh) {
return kernel_embree_intersect(kg, ray, visibility, isect);
}
}
# endif
IF_NOT_USING_EMBREE
{
# ifdef __OBJECT_MOTION__
if (kernel_data.bvh.have_motion) {
if (kernel_data.bvh.have_motion) {
# ifdef __HAIR__
if (kernel_data.bvh.have_curves) {
return bvh_intersect_hair_motion(kg, ray, isect, visibility);
}
if (kernel_data.bvh.have_curves) {
return bvh_intersect_hair_motion(kg, ray, isect, visibility);
}
# endif /* __HAIR__ */
return bvh_intersect_motion(kg, ray, isect, visibility);
}
return bvh_intersect_motion(kg, ray, isect, visibility);
}
# endif /* __OBJECT_MOTION__ */
# ifdef __HAIR__
if (kernel_data.bvh.have_curves) {
return bvh_intersect_hair(kg, ray, isect, visibility);
}
if (kernel_data.bvh.have_curves) {
return bvh_intersect_hair(kg, ray, isect, visibility);
}
# endif /* __HAIR__ */
return bvh_intersect(kg, ray, isect, visibility);
return bvh_intersect(kg, ray, isect, visibility);
}
kernel_assert(false);
return false;
}
/* Single object BVH traversal, for SSS/AO/bevel. */
@@ -129,17 +160,27 @@ ccl_device_intersect bool scene_intersect_local(KernelGlobals kg,
}
# ifdef __EMBREE__
if (kernel_data.device_bvh) {
return kernel_embree_intersect_local(kg, ray, local_isect, local_object, lcg_state, max_hits);
IF_USING_EMBREE
{
if (kernel_data.device_bvh) {
return kernel_embree_intersect_local(
kg, ray, local_isect, local_object, lcg_state, max_hits);
}
}
# endif
IF_NOT_USING_EMBREE
{
# ifdef __OBJECT_MOTION__
if (kernel_data.bvh.have_motion) {
return bvh_intersect_local_motion(kg, ray, local_isect, local_object, lcg_state, max_hits);
}
if (kernel_data.bvh.have_motion) {
return bvh_intersect_local_motion(kg, ray, local_isect, local_object, lcg_state, max_hits);
}
# endif /* __OBJECT_MOTION__ */
return bvh_intersect_local(kg, ray, local_isect, local_object, lcg_state, max_hits);
return bvh_intersect_local(kg, ray, local_isect, local_object, lcg_state, max_hits);
}
kernel_assert(false);
return false;
}
# endif
@@ -184,35 +225,44 @@ ccl_device_intersect bool scene_intersect_shadow_all(KernelGlobals kg,
}
# ifdef __EMBREE__
if (kernel_data.device_bvh) {
return kernel_embree_intersect_shadow_all(
kg, state, ray, visibility, max_hits, num_recorded_hits, throughput);
IF_USING_EMBREE
{
if (kernel_data.device_bvh) {
return kernel_embree_intersect_shadow_all(
kg, state, ray, visibility, max_hits, num_recorded_hits, throughput);
}
}
# endif
IF_NOT_USING_EMBREE
{
# ifdef __OBJECT_MOTION__
if (kernel_data.bvh.have_motion) {
if (kernel_data.bvh.have_motion) {
# ifdef __HAIR__
if (kernel_data.bvh.have_curves) {
return bvh_intersect_shadow_all_hair_motion(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
}
if (kernel_data.bvh.have_curves) {
return bvh_intersect_shadow_all_hair_motion(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
}
# endif /* __HAIR__ */
return bvh_intersect_shadow_all_motion(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
}
return bvh_intersect_shadow_all_motion(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
}
# endif /* __OBJECT_MOTION__ */
# ifdef __HAIR__
if (kernel_data.bvh.have_curves) {
return bvh_intersect_shadow_all_hair(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
}
if (kernel_data.bvh.have_curves) {
return bvh_intersect_shadow_all_hair(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
}
# endif /* __HAIR__ */
return bvh_intersect_shadow_all(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
return bvh_intersect_shadow_all(
kg, ray, state, visibility, max_hits, num_recorded_hits, throughput);
}
kernel_assert(false);
return false;
}
# endif /* __SHADOW_RECORD_ALL__ */
@@ -239,13 +289,28 @@ ccl_device_intersect bool scene_intersect_volume(KernelGlobals kg,
return false;
}
# ifdef __OBJECT_MOTION__
if (kernel_data.bvh.have_motion) {
return bvh_intersect_volume_motion(kg, ray, isect, visibility);
# ifdef __EMBREE__
IF_USING_EMBREE
{
if (kernel_data.device_bvh) {
return kernel_embree_intersect_volume(kg, ray, isect, visibility);
}
}
# endif
IF_NOT_USING_EMBREE
{
# ifdef __OBJECT_MOTION__
if (kernel_data.bvh.have_motion) {
return bvh_intersect_volume_motion(kg, ray, isect, visibility);
}
# endif /* __OBJECT_MOTION__ */
return bvh_intersect_volume(kg, ray, isect, visibility);
return bvh_intersect_volume(kg, ray, isect, visibility);
}
kernel_assert(false);
return false;
}
# endif /* defined(__VOLUME__) && !defined(__VOLUME_RECORD_ALL__) */
@@ -275,18 +340,27 @@ ccl_device_intersect uint scene_intersect_volume(KernelGlobals kg,
}
# ifdef __EMBREE__
if (kernel_data.device_bvh) {
return kernel_embree_intersect_volume(kg, ray, isect, max_hits, visibility);
IF_USING_EMBREE
{
if (kernel_data.device_bvh) {
return kernel_embree_intersect_volume(kg, ray, isect, max_hits, visibility);
}
}
# endif
IF_NOT_USING_EMBREE
{
# ifdef __OBJECT_MOTION__
if (kernel_data.bvh.have_motion) {
return bvh_intersect_volume_all_motion(kg, ray, isect, max_hits, visibility);
}
if (kernel_data.bvh.have_motion) {
return bvh_intersect_volume_all_motion(kg, ray, isect, max_hits, visibility);
}
# endif /* __OBJECT_MOTION__ */
return bvh_intersect_volume_all(kg, ray, isect, max_hits, visibility);
return bvh_intersect_volume_all(kg, ray, isect, max_hits, visibility);
}
kernel_assert(false);
return false;
}
# endif /* defined(__VOLUME__) && defined(__VOLUME_RECORD_ALL__) */

View File

@@ -51,8 +51,6 @@ ccl_device_inline
int object = OBJECT_NONE;
float isect_t = ray->tmax;
int num_hits_in_instance = 0;
uint num_hits = 0;
isect_array->t = ray->tmax;
@@ -152,7 +150,6 @@ ccl_device_inline
/* Move on to next entry in intersections array. */
isect_array++;
num_hits++;
num_hits_in_instance++;
isect_array->t = isect_t;
if (num_hits == max_hits) {
return num_hits;
@@ -193,7 +190,6 @@ ccl_device_inline
/* Move on to next entry in intersections array. */
isect_array++;
num_hits++;
num_hits_in_instance++;
isect_array->t = isect_t;
if (num_hits == max_hits) {
return num_hits;
@@ -219,7 +215,6 @@ ccl_device_inline
bvh_instance_push(kg, object, ray, &P, &dir, &idir);
#endif
num_hits_in_instance = 0;
isect_array->t = isect_t;
++stack_ptr;

View File

@@ -20,6 +20,7 @@ KERNEL_STRUCT_BEGIN(KernelBackground, background)
/* xyz store direction, w the angle. float4 instead of float3 is used
* to ensure consistent padding/alignment across devices. */
KERNEL_STRUCT_MEMBER(background, float4, sun)
KERNEL_STRUCT_MEMBER(background, int, use_sun_guiding)
/* Only shader index. */
KERNEL_STRUCT_MEMBER(background, int, surface_shader)
KERNEL_STRUCT_MEMBER(background, int, volume_shader)
@@ -39,6 +40,10 @@ KERNEL_STRUCT_MEMBER(background, int, use_mis)
KERNEL_STRUCT_MEMBER(background, int, lightgroup)
/* Light Index. */
KERNEL_STRUCT_MEMBER(background, int, light_index)
/* Padding. */
KERNEL_STRUCT_MEMBER(background, int, pad1)
KERNEL_STRUCT_MEMBER(background, int, pad2)
KERNEL_STRUCT_MEMBER(background, int, pad3)
KERNEL_STRUCT_END(KernelBackground)
/* BVH: own BVH2 if no native device acceleration struct used. */

View File

@@ -13,8 +13,13 @@
# include <embree3/rtcore_scene.h>
#endif
#include "kernel/device/cpu/compat.h"
#include "kernel/device/cpu/globals.h"
#ifdef __KERNEL_ONEAPI__
# include "kernel/device/oneapi/compat.h"
# include "kernel/device/oneapi/globals.h"
#else
# include "kernel/device/cpu/compat.h"
# include "kernel/device/cpu/globals.h"
#endif
#include "kernel/bvh/types.h"
#include "kernel/bvh/util.h"
@@ -33,11 +38,16 @@ using numhit_t = uint8_t;
using numhit_t = uint32_t;
#endif
#define CYCLES_EMBREE_USED_FEATURES \
(RTCFeatureFlags)(RTC_FEATURE_FLAG_TRIANGLE | RTC_FEATURE_FLAG_INSTANCE | \
RTC_FEATURE_FLAG_FILTER_FUNCTION_IN_ARGUMENTS | RTC_FEATURE_FLAG_POINT | \
RTC_FEATURE_FLAG_MOTION_BLUR | RTC_FEATURE_FLAG_ROUND_CATMULL_ROM_CURVE | \
RTC_FEATURE_FLAG_FLAT_CATMULL_ROM_CURVE)
#ifdef __KERNEL_ONEAPI__
# define CYCLES_EMBREE_USED_FEATURES \
(kernel_handler.get_specialization_constant<oneapi_embree_features>())
#else
# define CYCLES_EMBREE_USED_FEATURES \
(RTCFeatureFlags)(RTC_FEATURE_FLAG_TRIANGLE | RTC_FEATURE_FLAG_INSTANCE | \
RTC_FEATURE_FLAG_FILTER_FUNCTION_IN_ARGUMENTS | RTC_FEATURE_FLAG_POINT | \
RTC_FEATURE_FLAG_MOTION_BLUR | RTC_FEATURE_FLAG_ROUND_CATMULL_ROM_CURVE | \
RTC_FEATURE_FLAG_FLAT_CATMULL_ROM_CURVE)
#endif
#define EMBREE_IS_HAIR(x) (x & 1)
@@ -99,7 +109,9 @@ struct CCLVolumeContext
#if EMBREE_MAJOR_VERSION >= 4
KernelGlobals kg;
const Ray *ray;
# ifdef __VOLUME_RECORD_ALL__
numhit_t max_hits;
# endif
numhit_t num_hits;
#endif
Intersection *vol_isect;
@@ -252,7 +264,8 @@ ccl_device_inline void kernel_embree_convert_sss_hit(KernelGlobals kg,
* Things like recording subsurface or shadow hits for later evaluation
* as well as filtering for volume objects happen here.
* Cycles' own BVH does that directly inside the traversal calls. */
ccl_device void kernel_embree_filter_intersection_func(const RTCFilterFunctionNArguments *args)
ccl_device_forceinline void kernel_embree_filter_intersection_func_impl(
const RTCFilterFunctionNArguments *args)
{
/* Current implementation in Cycles assumes only single-ray intersection queries. */
assert(args->N == 1);
@@ -263,7 +276,11 @@ ccl_device void kernel_embree_filter_intersection_func(const RTCFilterFunctionNA
#else
CCLIntersectContext *ctx = (CCLIntersectContext *)(args->context);
#endif
#ifdef __KERNEL_ONEAPI__
KernelGlobalsGPU *kg = nullptr;
#else
const KernelGlobalsCPU *kg = ctx->kg;
#endif
const Ray *cray = ctx->ray;
if (kernel_embree_is_self_intersection(
@@ -277,7 +294,7 @@ ccl_device void kernel_embree_filter_intersection_func(const RTCFilterFunctionNA
* as well as filtering for volume objects happen here.
* Cycles' own BVH does that directly inside the traversal calls.
*/
ccl_device void kernel_embree_filter_occluded_shadow_all_func(
ccl_device_forceinline void kernel_embree_filter_occluded_shadow_all_func_impl(
const RTCFilterFunctionNArguments *args)
{
/* Current implementation in Cycles assumes only single-ray intersection queries. */
@@ -290,7 +307,11 @@ ccl_device void kernel_embree_filter_occluded_shadow_all_func(
#else
CCLIntersectContext *ctx = (CCLIntersectContext *)(args->context);
#endif
#ifdef __KERNEL_ONEAPI__
KernelGlobalsGPU *kg = nullptr;
#else
const KernelGlobalsCPU *kg = ctx->kg;
#endif
const Ray *cray = ctx->ray;
Intersection current_isect;
@@ -326,7 +347,7 @@ ccl_device void kernel_embree_filter_occluded_shadow_all_func(
}
/* Test if we need to record this transparent intersection. */
const numhit_t max_record_hits = min(ctx->max_hits, INTEGRATOR_SHADOW_ISECT_SIZE);
const numhit_t max_record_hits = min(ctx->max_hits, numhit_t(INTEGRATOR_SHADOW_ISECT_SIZE));
if (ctx->num_recorded_hits < max_record_hits) {
/* If maximum number of hits was reached, replace the intersection with the
* highest distance. We want to find the N closest intersections. */
@@ -363,7 +384,7 @@ ccl_device void kernel_embree_filter_occluded_shadow_all_func(
*args->valid = 0;
}
ccl_device_forceinline void kernel_embree_filter_occluded_local_func(
ccl_device_forceinline void kernel_embree_filter_occluded_local_func_impl(
const RTCFilterFunctionNArguments *args)
{
/* Current implementation in Cycles assumes only single-ray intersection queries. */
@@ -376,7 +397,11 @@ ccl_device_forceinline void kernel_embree_filter_occluded_local_func(
#else
CCLIntersectContext *ctx = (CCLIntersectContext *)(args->context);
#endif
#ifdef __KERNEL_ONEAPI__
KernelGlobalsGPU *kg = nullptr;
#else
const KernelGlobalsCPU *kg = ctx->kg;
#endif
const Ray *cray = ctx->ray;
/* Check if it's hitting the correct object. */
@@ -462,7 +487,7 @@ ccl_device_forceinline void kernel_embree_filter_occluded_local_func(
*args->valid = 0;
}
ccl_device_forceinline void kernel_embree_filter_occluded_volume_all_func(
ccl_device_forceinline void kernel_embree_filter_occluded_volume_all_func_impl(
const RTCFilterFunctionNArguments *args)
{
/* Current implementation in Cycles assumes only single-ray intersection queries. */
@@ -475,11 +500,17 @@ ccl_device_forceinline void kernel_embree_filter_occluded_volume_all_func(
#else
CCLIntersectContext *ctx = (CCLIntersectContext *)(args->context);
#endif
#ifdef __KERNEL_ONEAPI__
KernelGlobalsGPU *kg = nullptr;
#else
const KernelGlobalsCPU *kg = ctx->kg;
#endif
const Ray *cray = ctx->ray;
#ifdef __VOLUME_RECORD_ALL__
/* Append the intersection to the end of the array. */
if (ctx->num_hits < ctx->max_hits) {
#endif
Intersection current_isect;
kernel_embree_convert_hit(
kg, ray, hit, &current_isect, reinterpret_cast<intptr_t>(args->geometryUserPtr));
@@ -496,10 +527,17 @@ ccl_device_forceinline void kernel_embree_filter_occluded_volume_all_func(
int object_flag = kernel_data_fetch(object_flag, tri_object);
if ((object_flag & SD_OBJECT_HAS_VOLUME) == 0) {
--ctx->num_hits;
#ifndef __VOLUME_RECORD_ALL__
/* Without __VOLUME_RECORD_ALL__ we only need the first counted hit, so we
* continue tracing only if the current hit is not counted. */
*args->valid = 0;
#endif
}
#ifdef __VOLUME_RECORD_ALL__
/* This tells Embree to continue tracing. */
*args->valid = 0;
}
#endif
}
#if EMBREE_MAJOR_VERSION < 4
@@ -513,14 +551,14 @@ ccl_device_forceinline void kernel_embree_filter_occluded_func(
switch (ctx->type) {
case CCLIntersectContext::RAY_SHADOW_ALL:
kernel_embree_filter_occluded_shadow_all_func(args);
kernel_embree_filter_occluded_shadow_all_func_impl(args);
break;
case CCLIntersectContext::RAY_LOCAL:
case CCLIntersectContext::RAY_SSS:
kernel_embree_filter_occluded_local_func(args);
kernel_embree_filter_occluded_local_func_impl(args);
break;
case CCLIntersectContext::RAY_VOLUME_ALL:
kernel_embree_filter_occluded_volume_all_func(args);
kernel_embree_filter_occluded_volume_all_func_impl(args);
break;
case CCLIntersectContext::RAY_REGULAR:
@@ -569,7 +607,63 @@ ccl_device void kernel_embree_filter_occluded_func_backface_cull(
kernel_embree_filter_occluded_func(args);
}
#endif
#ifdef __KERNEL_ONEAPI__
/* Static wrappers so we can call the callbacks from outside the ONEAPIKernelContext class. */
RTC_SYCL_INDIRECTLY_CALLABLE static void ccl_always_inline
kernel_embree_filter_intersection_func_static(const RTCFilterFunctionNArguments *args)
{
RTCHit *hit = (RTCHit *)args->hit;
CCLFirstHitContext *ctx = (CCLFirstHitContext *)(args->context);
ONEAPIKernelContext *context = static_cast<ONEAPIKernelContext *>(ctx->kg);
context->kernel_embree_filter_intersection_func_impl(args);
}
RTC_SYCL_INDIRECTLY_CALLABLE static void ccl_always_inline
kernel_embree_filter_occluded_shadow_all_func_static(const RTCFilterFunctionNArguments *args)
{
RTCHit *hit = (RTCHit *)args->hit;
CCLShadowContext *ctx = (CCLShadowContext *)(args->context);
ONEAPIKernelContext *context = static_cast<ONEAPIKernelContext *>(ctx->kg);
context->kernel_embree_filter_occluded_shadow_all_func_impl(args);
}
RTC_SYCL_INDIRECTLY_CALLABLE static void ccl_always_inline
kernel_embree_filter_occluded_local_func_static(const RTCFilterFunctionNArguments *args)
{
RTCHit *hit = (RTCHit *)args->hit;
CCLLocalContext *ctx = (CCLLocalContext *)(args->context);
ONEAPIKernelContext *context = static_cast<ONEAPIKernelContext *>(ctx->kg);
context->kernel_embree_filter_occluded_local_func_impl(args);
}
RTC_SYCL_INDIRECTLY_CALLABLE static void ccl_always_inline
kernel_embree_filter_occluded_volume_all_func_static(const RTCFilterFunctionNArguments *args)
{
RTCHit *hit = (RTCHit *)args->hit;
CCLVolumeContext *ctx = (CCLVolumeContext *)(args->context);
ONEAPIKernelContext *context = static_cast<ONEAPIKernelContext *>(ctx->kg);
context->kernel_embree_filter_occluded_volume_all_func_impl(args);
}
# define kernel_embree_filter_intersection_func \
ONEAPIKernelContext::kernel_embree_filter_intersection_func_static
# define kernel_embree_filter_occluded_shadow_all_func \
ONEAPIKernelContext::kernel_embree_filter_occluded_shadow_all_func_static
# define kernel_embree_filter_occluded_local_func \
ONEAPIKernelContext::kernel_embree_filter_occluded_local_func_static
# define kernel_embree_filter_occluded_volume_all_func \
ONEAPIKernelContext::kernel_embree_filter_occluded_volume_all_func_static
#else
# define kernel_embree_filter_intersection_func kernel_embree_filter_intersection_func_impl
# if EMBREE_MAJOR_VERSION >= 4
# define kernel_embree_filter_occluded_shadow_all_func \
kernel_embree_filter_occluded_shadow_all_func_impl
# define kernel_embree_filter_occluded_local_func kernel_embree_filter_occluded_local_func_impl
# define kernel_embree_filter_occluded_volume_all_func \
kernel_embree_filter_occluded_volume_all_func_impl
# endif
#endif
/* Scene intersection. */
@@ -583,7 +677,15 @@ ccl_device_intersect bool kernel_embree_intersect(KernelGlobals kg,
#if EMBREE_MAJOR_VERSION >= 4
CCLFirstHitContext ctx;
rtcInitRayQueryContext(&ctx);
# ifdef __KERNEL_ONEAPI__
/* NOTE(sirgienko): Cycles GPU back-ends pass NULL as KernelGlobals and either
* use global device allocation (CUDA, OptiX, HIP) or pass all needed data
* as a class context (Metal, oneAPI). So we need to pass this context here
* in order to have access to it later in the Embree filter functions on GPU. */
ctx.kg = (KernelGlobals)this;
# else
ctx.kg = kg;
# endif
#else
CCLIntersectContext ctx(kg, CCLIntersectContext::RAY_REGULAR);
rtcInitIntersectContext(&ctx);
@@ -596,7 +698,7 @@ ccl_device_intersect bool kernel_embree_intersect(KernelGlobals kg,
#if EMBREE_MAJOR_VERSION >= 4
RTCIntersectArguments args;
rtcInitIntersectArguments(&args);
args.filter = (RTCFilterFunctionN)kernel_embree_filter_intersection_func;
args.filter = reinterpret_cast<RTCFilterFunctionN>(kernel_embree_filter_intersection_func);
args.feature_mask = CYCLES_EMBREE_USED_FEATURES;
args.context = &ctx;
rtcIntersect1(kernel_data.device_bvh, &ray_hit, &args);
@@ -625,7 +727,15 @@ ccl_device_intersect bool kernel_embree_intersect_local(KernelGlobals kg,
# if EMBREE_MAJOR_VERSION >= 4
CCLLocalContext ctx;
rtcInitRayQueryContext(&ctx);
# ifdef __KERNEL_ONEAPI__
/* NOTE(sirgienko): Cycles GPU back-ends pass NULL as KernelGlobals and either
* use global device allocation (CUDA, OptiX, HIP) or pass all needed data
* as a class context (Metal, oneAPI). So we need to pass this context here
* in order to have access to it later in the Embree filter functions on GPU. */
ctx.kg = (KernelGlobals)this;
# else
ctx.kg = kg;
# endif
# else
CCLIntersectContext ctx(kg,
has_bvh ? CCLIntersectContext::RAY_SSS : CCLIntersectContext::RAY_LOCAL);
@@ -646,7 +756,7 @@ ccl_device_intersect bool kernel_embree_intersect_local(KernelGlobals kg,
# if EMBREE_MAJOR_VERSION >= 4
RTCOccludedArguments args;
rtcInitOccludedArguments(&args);
args.filter = (RTCFilterFunctionN)(kernel_embree_filter_occluded_local_func);
args.filter = reinterpret_cast<RTCFilterFunctionN>(kernel_embree_filter_occluded_local_func);
args.feature_mask = CYCLES_EMBREE_USED_FEATURES;
args.context = &ctx;
# endif
@@ -692,7 +802,7 @@ ccl_device_intersect bool kernel_embree_intersect_local(KernelGlobals kg,
#ifdef __SHADOW_RECORD_ALL__
ccl_device_intersect bool kernel_embree_intersect_shadow_all(KernelGlobals kg,
IntegratorShadowStateCPU *state,
IntegratorShadowState state,
ccl_private const Ray *ray,
uint visibility,
uint max_hits,
@@ -702,7 +812,15 @@ ccl_device_intersect bool kernel_embree_intersect_shadow_all(KernelGlobals kg,
# if EMBREE_MAJOR_VERSION >= 4
CCLShadowContext ctx;
rtcInitRayQueryContext(&ctx);
# ifdef __KERNEL_ONEAPI__
/* NOTE(sirgienko): Cycles GPU back-ends pass NULL as KernelGlobals and either
* use global device allocation (CUDA, OptiX, HIP) or pass all needed data
* as a class context (Metal, oneAPI). So we need to pass this context here
* in order to have access to it later in the Embree filter functions on GPU. */
ctx.kg = (KernelGlobals)this;
# else
ctx.kg = kg;
# endif
# else
CCLIntersectContext ctx(kg, CCLIntersectContext::RAY_SHADOW_ALL);
rtcInitIntersectContext(&ctx);
@@ -718,7 +836,8 @@ ccl_device_intersect bool kernel_embree_intersect_shadow_all(KernelGlobals kg,
# if EMBREE_MAJOR_VERSION >= 4
RTCOccludedArguments args;
rtcInitOccludedArguments(&args);
args.filter = (RTCFilterFunctionN)kernel_embree_filter_occluded_shadow_all_func;
args.filter = reinterpret_cast<RTCFilterFunctionN>(
kernel_embree_filter_occluded_shadow_all_func);
args.feature_mask = CYCLES_EMBREE_USED_FEATURES;
args.context = &ctx;
rtcOccluded1(kernel_data.device_bvh, &rtc_ray, &args);
@@ -736,19 +855,31 @@ ccl_device_intersect bool kernel_embree_intersect_shadow_all(KernelGlobals kg,
ccl_device_intersect uint kernel_embree_intersect_volume(KernelGlobals kg,
ccl_private const Ray *ray,
ccl_private Intersection *isect,
# ifdef __VOLUME_RECORD_ALL__
const uint max_hits,
# endif
const uint visibility)
{
# if EMBREE_MAJOR_VERSION >= 4
CCLVolumeContext ctx;
rtcInitRayQueryContext(&ctx);
# ifdef __KERNEL_ONEAPI__
/* NOTE(sirgienko): Cycles GPU back-ends pass NULL as KernelGlobals and either
* use global device allocation (CUDA, OptiX, HIP) or pass all needed data
* as a class context (Metal, oneAPI). So we need to pass this context here
* in order to have access to it later in the Embree filter functions on GPU. */
ctx.kg = (KernelGlobals)this;
# else
ctx.kg = kg;
# endif
# else
CCLIntersectContext ctx(kg, CCLIntersectContext::RAY_VOLUME_ALL);
rtcInitIntersectContext(&ctx);
# endif
ctx.vol_isect = isect;
# ifdef __VOLUME_RECORD_ALL__
ctx.max_hits = numhit_t(max_hits);
# endif
ctx.num_hits = numhit_t(0);
ctx.ray = ray;
RTCRay rtc_ray;
@@ -756,7 +887,8 @@ ccl_device_intersect uint kernel_embree_intersect_volume(KernelGlobals kg,
# if EMBREE_MAJOR_VERSION >= 4
RTCOccludedArguments args;
rtcInitOccludedArguments(&args);
args.filter = (RTCFilterFunctionN)kernel_embree_filter_occluded_volume_all_func;
args.filter = reinterpret_cast<RTCFilterFunctionN>(
kernel_embree_filter_occluded_volume_all_func);
args.feature_mask = CYCLES_EMBREE_USED_FEATURES;
args.context = &ctx;
rtcOccluded1(kernel_data.device_bvh, &rtc_ray, &args);
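The static wrappers added in this file follow a common pattern: a C-style callback cannot be a non-static member function, so the owning object is smuggled through the query context and the wrapper forwards to the member implementation. The sketch below is a minimal, standalone illustration of that pattern only. `FilterArgs`, `QueryContext`, `KernelContext` and `run_filter` are hypothetical stand-ins, not the Embree or Cycles types.

```cpp
/* Hypothetical stand-in for RTCFilterFunctionNArguments. */
struct FilterArgs {
  void *context; /* Per-query context supplied by the caller. */
  int *valid;    /* Set to 0 to reject the hit and continue tracing. */
};

/* Hypothetical stand-in for a per-query context such as CCLFirstHitContext.
 * `kg` carries a pointer to the object that owns the filter implementation. */
struct QueryContext {
  void *kg;
};

class KernelContext {
 public:
  int rejected = 0;

  /* Member implementation: has access to the object's state. */
  void filter_impl(const FilterArgs *args)
  {
    *args->valid = 0;
    ++rejected;
  }

  /* Static wrapper: recovers the object from the context and forwards. */
  static void filter_static(const FilterArgs *args)
  {
    QueryContext *ctx = static_cast<QueryContext *>(args->context);
    KernelContext *self = static_cast<KernelContext *>(ctx->kg);
    self->filter_impl(args);
  }
};

/* Plain function pointer type, as a C API would expect. */
using FilterFn = void (*)(const FilterArgs *);

/* Simulates the library invoking the callback once; returns the `valid` flag. */
static int run_filter(KernelContext &kc)
{
  QueryContext ctx{&kc};
  int valid = -1;
  FilterArgs args{&ctx, &valid};
  FilterFn fn = &KernelContext::filter_static;
  fn(&args);
  return valid;
}
```

Calling `run_filter` on a fresh `KernelContext` leaves `valid` at 0 and bumps `rejected`, which mirrors how the wrapper lets a plain function pointer reach per-object state.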

View File

@@ -128,6 +128,12 @@ ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
}
ccl_gpu_kernel_postfix
/* Intersection kernels need access to the kernel handler for specialization constants to work
* properly. */
#ifdef __KERNEL_ONEAPI__
# include "kernel/device/oneapi/context_intersect_begin.h"
#endif
ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
ccl_gpu_kernel_signature(integrator_intersect_closest,
ccl_global const int *path_index_array,
@@ -185,6 +191,10 @@ ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
}
ccl_gpu_kernel_postfix
#ifdef __KERNEL_ONEAPI__
# include "kernel/device/oneapi/context_intersect_end.h"
#endif
ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
ccl_gpu_kernel_signature(integrator_shade_background,
ccl_global const int *path_index_array,
@@ -249,6 +259,12 @@ ccl_gpu_kernel_postfix
constant int __dummy_constant [[function_constant(Kernel_DummyConstant)]];
#endif
/* Kernels using intersections need access to the kernel handler for specialization constants to
* work properly. */
#ifdef __KERNEL_ONEAPI__
# include "kernel/device/oneapi/context_intersect_begin.h"
#endif
ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
ccl_gpu_kernel_signature(integrator_shade_surface_raytrace,
ccl_global const int *path_index_array,
@@ -287,6 +303,9 @@ ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
}
}
ccl_gpu_kernel_postfix
#ifdef __KERNEL_ONEAPI__
# include "kernel/device/oneapi/context_intersect_end.h"
#endif
ccl_gpu_kernel(GPU_KERNEL_BLOCK_NUM_THREADS, GPU_KERNEL_MAX_REGISTERS)
ccl_gpu_kernel_signature(integrator_shade_volume,

View File

@@ -5,6 +5,11 @@
#define __KERNEL_GPU__
#define __KERNEL_ONEAPI__
#define __KERNEL_64_BIT__
#ifdef WITH_EMBREE_GPU
# define __KERNEL_GPU_RAYTRACING__
#endif
#define CCL_NAMESPACE_BEGIN
#define CCL_NAMESPACE_END
@@ -57,17 +62,19 @@
#define ccl_gpu_kernel_threads(block_num_threads)
#ifndef WITH_ONEAPI_SYCL_HOST_TASK
# define ccl_gpu_kernel_signature(name, ...) \
# define __ccl_gpu_kernel_signature(name, ...) \
void oneapi_kernel_##name(KernelGlobalsGPU *ccl_restrict kg, \
size_t kernel_global_size, \
size_t kernel_local_size, \
sycl::handler &cgh, \
__VA_ARGS__) { \
(kg); \
cgh.parallel_for<class kernel_##name>( \
cgh.parallel_for( \
sycl::nd_range<1>(kernel_global_size, kernel_local_size), \
[=](sycl::nd_item<1> item) {
# define ccl_gpu_kernel_signature __ccl_gpu_kernel_signature
# define ccl_gpu_kernel_postfix \
}); \
}

View File

@@ -0,0 +1,18 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2023 Intel Corporation */
#if !defined(WITH_ONEAPI_SYCL_HOST_TASK) && defined(WITH_EMBREE_GPU)
# undef ccl_gpu_kernel_signature
# define ccl_gpu_kernel_signature(name, ...) \
void oneapi_kernel_##name(KernelGlobalsGPU *ccl_restrict kg, \
size_t kernel_global_size, \
size_t kernel_local_size, \
sycl::handler &cgh, \
__VA_ARGS__) \
{ \
(kg); \
cgh.parallel_for( \
sycl::nd_range<1>(kernel_global_size, kernel_local_size), \
[=](sycl::nd_item<1> item, sycl::kernel_handler oneapi_kernel_handler) { \
((ONEAPIKernelContext*)kg)->kernel_handler = oneapi_kernel_handler;
#endif

View File

@@ -0,0 +1,7 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2023 Intel Corporation */
#if !defined(WITH_ONEAPI_SYCL_HOST_TASK) && defined(WITH_EMBREE_GPU)
# undef ccl_gpu_kernel_signature
# define ccl_gpu_kernel_signature __ccl_gpu_kernel_signature
#endif
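The begin/end header pair above temporarily swaps the kernel signature macro and then restores the default, which works because the default is kept under a second name. The following is a reduced sketch of that save, override and restore sequence in a single translation unit. All macro and function names here (`BASE_KERNEL_SIGNATURE`, `KERNEL_SIGNATURE`, `KERNEL_POSTFIX`, `kernel_*`) are hypothetical, and the `handler` variable merely stands in for whatever extra state the overridden signature provides.

```cpp
/* Default signature, kept under a stable name so it can be restored later. */
#define BASE_KERNEL_SIGNATURE(name) \
  int kernel_##name(int x) \
  { \
    int handler = 0;
#define KERNEL_SIGNATURE BASE_KERNEL_SIGNATURE
#define KERNEL_POSTFIX }

/* Kernel defined with the default signature. */
KERNEL_SIGNATURE(shade)
  return x + handler;
KERNEL_POSTFIX

/* Equivalent of the "begin" header: override the signature. */
#undef KERNEL_SIGNATURE
#define KERNEL_SIGNATURE(name) \
  int kernel_##name(int x) \
  { \
    int handler = 100;

/* Kernel defined while the override is active. */
KERNEL_SIGNATURE(intersect)
  return x + handler;
KERNEL_POSTFIX

/* Equivalent of the "end" header: restore the default. */
#undef KERNEL_SIGNATURE
#define KERNEL_SIGNATURE BASE_KERNEL_SIGNATURE

/* Kernel defined after the restore uses the default again. */
KERNEL_SIGNATURE(shade_volume)
  return x + handler;
KERNEL_POSTFIX
```

Only the kernel defined between the override and the restore sees the alternative prologue; kernels before and after are unaffected.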

View File

@@ -31,6 +31,8 @@ typedef struct KernelGlobalsGPU {
size_t nd_item_group_range_0;
size_t nd_item_global_id_0;
size_t nd_item_global_range_0;
#else
sycl::kernel_handler kernel_handler;
#endif
} KernelGlobalsGPU;

View File

@@ -16,9 +16,22 @@
# include "kernel/device/gpu/kernel.h"
# include "device/kernel.cpp"
static OneAPIErrorCallback s_error_cb = nullptr;
static void *s_error_user_ptr = nullptr;
# ifdef WITH_EMBREE_GPU
static const RTCFeatureFlags CYCLES_ONEAPI_EMBREE_BASIC_FEATURES =
(const RTCFeatureFlags)(RTC_FEATURE_FLAG_TRIANGLE | RTC_FEATURE_FLAG_INSTANCE |
RTC_FEATURE_FLAG_FILTER_FUNCTION_IN_ARGUMENTS |
RTC_FEATURE_FLAG_POINT | RTC_FEATURE_FLAG_MOTION_BLUR);
static const RTCFeatureFlags CYCLES_ONEAPI_EMBREE_ALL_FEATURES =
(const RTCFeatureFlags)(CYCLES_ONEAPI_EMBREE_BASIC_FEATURES |
RTC_FEATURE_FLAG_ROUND_CATMULL_ROM_CURVE |
RTC_FEATURE_FLAG_FLAT_CATMULL_ROM_CURVE);
# endif
void oneapi_set_error_cb(OneAPIErrorCallback cb, void *user_ptr)
{
s_error_cb = cb;
@@ -142,15 +155,99 @@ size_t oneapi_kernel_preferred_local_size(SyclQueue *queue,
return std::min(limit_work_group_size, preferred_work_group_size);
}
bool oneapi_load_kernels(SyclQueue *queue_, const uint requested_features)
bool oneapi_kernel_is_required_for_features(const std::string &kernel_name,
const uint kernel_features)
{
if ((kernel_features & KERNEL_FEATURE_NODE_RAYTRACE) == 0 &&
kernel_name.find(device_kernel_as_string(DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE)) !=
std::string::npos)
return false;
if ((kernel_features & KERNEL_FEATURE_MNEE) == 0 &&
kernel_name.find(device_kernel_as_string(DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE)) !=
std::string::npos)
return false;
if ((kernel_features & KERNEL_FEATURE_VOLUME) == 0 &&
kernel_name.find(device_kernel_as_string(DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK)) !=
std::string::npos)
return false;
return true;
}
bool oneapi_kernel_is_raytrace_or_mnee(const std::string &kernel_name)
{
return (kernel_name.find(device_kernel_as_string(DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE)) !=
std::string::npos) ||
(kernel_name.find(device_kernel_as_string(
DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE)) != std::string::npos);
}
bool oneapi_kernel_is_using_embree(const std::string &kernel_name)
{
# ifdef WITH_EMBREE_GPU
/* MNEE and Ray-trace kernels aren't yet enabled to use Embree. */
for (int i = 0; i < (int)DEVICE_KERNEL_NUM; i++) {
DeviceKernel kernel = (DeviceKernel)i;
if (device_kernel_has_intersection(kernel)) {
if (kernel_name.find(device_kernel_as_string(kernel)) != std::string::npos) {
return !oneapi_kernel_is_raytrace_or_mnee(kernel_name);
}
}
}
# endif
return false;
}
bool oneapi_load_kernels(SyclQueue *queue_,
const uint kernel_features,
bool use_hardware_raytracing)
{
# ifdef SYCL_SKIP_KERNELS_PRELOAD
(void)queue_;
(void)kernel_features;
(void)use_hardware_raytracing;
# else
assert(queue_);
sycl::queue *queue = reinterpret_cast<sycl::queue *>(queue_);
# ifdef WITH_EMBREE_GPU
/* For best performance, we always JIT compile the kernels that are using Embree. */
if (use_hardware_raytracing) {
try {
sycl::kernel_bundle<sycl::bundle_state::input> all_kernels_bundle =
sycl::get_kernel_bundle<sycl::bundle_state::input>(queue->get_context(),
{queue->get_device()});
for (const sycl::kernel_id &kernel_id : all_kernels_bundle.get_kernel_ids()) {
const std::string &kernel_name = kernel_id.get_name();
if (!oneapi_kernel_is_required_for_features(kernel_name, kernel_features) ||
!oneapi_kernel_is_using_embree(kernel_name)) {
continue;
}
sycl::kernel_bundle<sycl::bundle_state::input> one_kernel_bundle_input =
sycl::get_kernel_bundle<sycl::bundle_state::input>(queue->get_context(), {kernel_id});
/* Hair requires embree curves support. */
if (kernel_features & KERNEL_FEATURE_HAIR) {
one_kernel_bundle_input
.set_specialization_constant<ONEAPIKernelContext::oneapi_embree_features>(
CYCLES_ONEAPI_EMBREE_ALL_FEATURES);
sycl::build(one_kernel_bundle_input);
}
else {
one_kernel_bundle_input
.set_specialization_constant<ONEAPIKernelContext::oneapi_embree_features>(
CYCLES_ONEAPI_EMBREE_BASIC_FEATURES);
sycl::build(one_kernel_bundle_input);
}
}
}
catch (sycl::exception const &e) {
if (s_error_cb) {
s_error_cb(e.what(), s_error_user_ptr);
}
return false;
}
}
# endif
try {
sycl::kernel_bundle<sycl::bundle_state::input> all_kernels_bundle =
sycl::get_kernel_bundle<sycl::bundle_state::input>(queue->get_context(),
@@ -159,27 +256,29 @@ bool oneapi_load_kernels(SyclQueue *queue_, const uint requested_features)
for (const sycl::kernel_id &kernel_id : all_kernels_bundle.get_kernel_ids()) {
const std::string &kernel_name = kernel_id.get_name();
/* NOTE(@nsirgien): Names in this conditions below should match names from
* oneapi_call macro in oneapi_enqueue_kernel below */
if (((requested_features & KERNEL_FEATURE_VOLUME) == 0) &&
kernel_name.find("oneapi_kernel_integrator_shade_volume") != std::string::npos) {
/* If HWRT is on, compilation of kernels using Embree is already handled in the
* previous block. */
if (!oneapi_kernel_is_required_for_features(kernel_name, kernel_features) ||
(use_hardware_raytracing && oneapi_kernel_is_using_embree(kernel_name))) {
continue;
}
if (((requested_features & KERNEL_FEATURE_MNEE) == 0) &&
kernel_name.find("oneapi_kernel_integrator_shade_surface_mnee") != std::string::npos) {
# ifdef WITH_EMBREE_GPU
if (oneapi_kernel_is_using_embree(kernel_name) ||
oneapi_kernel_is_raytrace_or_mnee(kernel_name)) {
sycl::kernel_bundle<sycl::bundle_state::input> one_kernel_bundle_input =
sycl::get_kernel_bundle<sycl::bundle_state::input>(queue->get_context(), {kernel_id});
one_kernel_bundle_input
.set_specialization_constant<ONEAPIKernelContext::oneapi_embree_features>(
RTC_FEATURE_FLAG_NONE);
sycl::build(one_kernel_bundle_input);
continue;
}
if (((requested_features & KERNEL_FEATURE_NODE_RAYTRACE) == 0) &&
kernel_name.find("oneapi_kernel_integrator_shade_surface_raytrace") !=
std::string::npos) {
continue;
}
sycl::kernel_bundle<sycl::bundle_state::input> one_kernel_bundle =
sycl::get_kernel_bundle<sycl::bundle_state::input>(queue->get_context(), {kernel_id});
sycl::build(one_kernel_bundle);
# endif
/* This call will ensure that AoT or cached JIT binaries are available
* for execution. It will trigger compilation if it is not already the case. */
(void)sycl::get_kernel_bundle<sycl::bundle_state::executable>(queue->get_context(),
{kernel_id});
}
}
catch (sycl::exception const &e) {
@@ -188,13 +287,14 @@ bool oneapi_load_kernels(SyclQueue *queue_, const uint requested_features)
}
return false;
}
# endif
return true;
}
bool oneapi_enqueue_kernel(KernelContext *kernel_context,
int kernel,
size_t global_size,
const uint kernel_features,
bool use_hardware_raytracing,
void **args)
{
bool success = true;
@@ -248,6 +348,21 @@ bool oneapi_enqueue_kernel(KernelContext *kernel_context,
try {
queue->submit([&](sycl::handler &cgh) {
# ifdef WITH_EMBREE_GPU
/* The spec says setting it has no effect if the called kernel doesn't use the
* specialization constant below, but it can still trigger a recompilation, so we
* set it only if needed. */
if (device_kernel_has_intersection(device_kernel)) {
const RTCFeatureFlags used_embree_features = !use_hardware_raytracing ?
RTC_FEATURE_FLAG_NONE :
!(kernel_features & KERNEL_FEATURE_HAIR) ?
CYCLES_ONEAPI_EMBREE_BASIC_FEATURES :
CYCLES_ONEAPI_EMBREE_ALL_FEATURES;
cgh.set_specialization_constant<ONEAPIKernelContext::oneapi_embree_features>(
used_embree_features);
}
# else
(void)kernel_features;
# endif
switch (device_kernel) {
case DEVICE_KERNEL_INTEGRATOR_RESET: {
oneapi_call(kg, cgh, global_size, local_size, args, oneapi_kernel_integrator_reset);
@@ -549,4 +664,5 @@ bool oneapi_enqueue_kernel(KernelContext *kernel_context,
# endif
return success;
}
#endif /* WITH_ONEAPI */
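The feature mask passed as a specialization constant is chosen from three cases: no hardware ray tracing, ray tracing without hair, and ray tracing with hair. A standalone sketch of that selection is given below. The bit values, `KERNEL_FEATURE_HAIR_BIT` and `select_embree_features` are hypothetical placeholders chosen for illustration and do not match the real `RTCFeatureFlags` or Cycles kernel feature values.

```cpp
#include <cstdint>

/* Hypothetical bit assignments; the real RTCFeatureFlags values differ. */
constexpr uint32_t FEATURE_NONE = 0u;
constexpr uint32_t FEATURE_TRIANGLE = 1u << 0;
constexpr uint32_t FEATURE_INSTANCE = 1u << 1;
constexpr uint32_t FEATURE_FILTER_IN_ARGUMENTS = 1u << 2;
constexpr uint32_t FEATURE_POINT = 1u << 3;
constexpr uint32_t FEATURE_MOTION_BLUR = 1u << 4;
constexpr uint32_t FEATURE_ROUND_CURVE = 1u << 5;
constexpr uint32_t FEATURE_FLAT_CURVE = 1u << 6;

/* Everything except curve primitives. */
constexpr uint32_t BASIC_FEATURES = FEATURE_TRIANGLE | FEATURE_INSTANCE |
                                    FEATURE_FILTER_IN_ARGUMENTS | FEATURE_POINT |
                                    FEATURE_MOTION_BLUR;
/* Basic features plus the curve primitives needed for hair. */
constexpr uint32_t ALL_FEATURES = BASIC_FEATURES | FEATURE_ROUND_CURVE | FEATURE_FLAT_CURVE;

/* Hypothetical stand-in for KERNEL_FEATURE_HAIR. */
constexpr uint32_t KERNEL_FEATURE_HAIR_BIT = 1u << 0;

/* Mirrors the nested ternary in the enqueue path, written as early returns. */
static uint32_t select_embree_features(bool use_hardware_raytracing, uint32_t kernel_features)
{
  if (!use_hardware_raytracing) {
    return FEATURE_NONE;
  }
  if ((kernel_features & KERNEL_FEATURE_HAIR_BIT) == 0) {
    return BASIC_FEATURES;
  }
  return ALL_FEATURES;
}
```

Keeping the curve bits out of the mask when the scene has no hair presumably lets the compiler drop the curve intersection code from the specialized kernel, which is the point of making the mask a specialization constant.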

View File

@@ -47,10 +47,14 @@ CYCLES_KERNEL_ONEAPI_EXPORT size_t oneapi_kernel_preferred_local_size(
CYCLES_KERNEL_ONEAPI_EXPORT bool oneapi_enqueue_kernel(KernelContext *context,
int kernel,
size_t global_size,
const unsigned int kernel_features,
bool use_hardware_raytracing,
void **args);
CYCLES_KERNEL_ONEAPI_EXPORT bool oneapi_load_kernels(SyclQueue *queue,
const unsigned int requested_features);
const unsigned int kernel_features,
bool use_hardware_raytracing);
# ifdef __cplusplus
}
# endif
#endif /* WITH_ONEAPI */

View File

@@ -342,7 +342,7 @@ ccl_device_forceinline void area_light_update_position(const ccl_global KernelLi
ls->D = normalize_len(ls->P - P, &ls->t);
ls->pdf = invarea;
if (klight->area.tan_half_spread > 0) {
if (klight->area.normalize_spread > 0) {
ls->eval_fac = 0.25f * invarea;
ls->eval_fac *= area_light_spread_attenuation(
ls->D, ls->Ng, klight->area.tan_half_spread, klight->area.normalize_spread);

View File

@@ -132,11 +132,11 @@ color sky_radiance_nishita(vector dir, float nishita_data[10], string filename)
/* definitions */
vector sun_dir = geographical_to_direction(sun_elevation, sun_rotation + M_PI_2);
float sun_dir_angle = precise_angle(dir, sun_dir);
float half_angular = angular_diameter / 2.0;
float half_angular = angular_diameter * 0.5;
float dir_elevation = M_PI_2 - direction[0];
/* if ray inside sun disc render it, otherwise render sky.
* alternatively, ignore the sun if we're evaluating the background texture. */
/* If the ray is inside the sun disc, render it, otherwise render the sky.
* Alternatively, ignore the sun if we're evaluating the background texture. */
if (sun_dir_angle < half_angular && sun_disc == 1 && raytype("importance_bake") != 1) {
/* get 2 pixels data */
color pixel_bottom = color(nishita_data[0], nishita_data[1], nishita_data[2]);

View File

@@ -84,8 +84,8 @@ ccl_device_inline void sample_uniform_cone(const float3 N,
ccl_device_inline float pdf_uniform_cone(const float3 N, float3 D, float angle)
{
float zMin = cosf(angle);
float z = dot(N, D);
if (z > zMin) {
float z = precise_angle(N, D);
if (z < angle) {
return M_1_2PI_F / (1.0f - zMin);
}
return 0.0f;
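The hunk above switches the inside-cone test from comparing cosines to comparing angles. A likely motivation is precision: for very small cones the cosine is so close to 1 that a dot product cannot distinguish directions just inside from just outside. The sketch below is a standalone version of the new test under stated assumptions. `Vec3` and its helpers are hypothetical, and `precise_angle` is assumed to be the usual `2 * atan2(|a - b|, |a + b|)` formulation for unit vectors, which may differ from the actual Cycles helper.

```cpp
#include <cmath>

constexpr float kPi = 3.14159265358979323846f;

/* Hypothetical minimal vector type; Cycles uses its own float3. */
struct Vec3 {
  float x, y, z;
};

static Vec3 vec_sub(Vec3 a, Vec3 b)
{
  return {a.x - b.x, a.y - b.y, a.z - b.z};
}

static Vec3 vec_add(Vec3 a, Vec3 b)
{
  return {a.x + b.x, a.y + b.y, a.z + b.z};
}

static float vec_len(Vec3 v)
{
  return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

/* Assumed definition: angle between two unit vectors, stable for small angles. */
static float precise_angle(Vec3 a, Vec3 b)
{
  return 2.0f * std::atan2(vec_len(vec_sub(a, b)), vec_len(vec_add(a, b)));
}

/* PDF of uniformly sampling directions in a cone of half-angle `angle` around N:
 * 1 / (2 * pi * (1 - cos(angle))) inside the cone, 0 outside. */
static float pdf_uniform_cone(Vec3 N, Vec3 D, float angle)
{
  const float z_min = std::cos(angle);
  if (precise_angle(N, D) < angle) {
    return 1.0f / (2.0f * kPi * (1.0f - z_min));
  }
  return 0.0f;
}
```

The returned density is unchanged by the switch; only the membership test differs, so results should agree with the cosine version wherever that version was numerically reliable.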

View File

@@ -138,12 +138,13 @@ ccl_device float3 sky_radiance_nishita(KernelGlobals kg,
/* definitions */
float3 sun_dir = geographical_to_direction(sun_elevation, sun_rotation + M_PI_2_F);
float sun_dir_angle = precise_angle(dir, sun_dir);
float half_angular = angular_diameter / 2.0f;
float half_angular = angular_diameter * 0.5f;
float dir_elevation = M_PI_2_F - direction.x;
- /* if ray inside sun disc render it, otherwise render sky.
- * alternatively, ignore the sun if we're evaluating the background texture. */
- if (sun_disc && sun_dir_angle < half_angular && !(path_flag & PATH_RAY_IMPORTANCE_BAKE)) {
+ /* If the ray is inside the sun disc, render it, otherwise render the sky.
+ * Alternatively, ignore the sun if we're evaluating the background texture. */
+ if (sun_disc && sun_dir_angle < half_angular &&
+ !((path_flag & PATH_RAY_IMPORTANCE_BAKE) && kernel_data.background.use_sun_guiding)) {
/* get 2 pixels data */
float y;

View File

@@ -3,8 +3,9 @@
#pragma once
- #if !defined(__KERNEL_GPU__) && defined(WITH_EMBREE)
- # if EMBREE_MAJOR_VERSION >= 4
+ #if (!defined(__KERNEL_GPU__) || (defined(__KERNEL_ONEAPI__) && defined(WITH_EMBREE_GPU))) && \
+ defined(WITH_EMBREE)
+ # if EMBREE_MAJOR_VERSION == 4
# include <embree4/rtcore.h>
# include <embree4/rtcore_scene.h>
# else
@@ -78,9 +79,8 @@ CCL_NAMESPACE_BEGIN
#define __VISIBILITY_FLAG__
#define __VOLUME__
- /* TODO: solve internal compiler errors and enable light tree on HIP. */
/* TODO: solve internal compiler perf issue and enable light tree on Metal/AMD. */
- #if defined(__KERNEL_HIP__) || defined(__KERNEL_METAL_AMD__)
+ #if defined(__KERNEL_METAL_AMD__)
# undef __LIGHT_TREE__
#endif

View File

@@ -15,8 +15,12 @@ set(SRC
camera.cpp
colorspace.cpp
constant_fold.cpp
devicescene.cpp
film.cpp
geometry.cpp
geometry_attributes.cpp
geometry_bvh.cpp
geometry_mesh.cpp
hair.cpp
image.cpp
image_oiio.cpp
@@ -55,6 +59,7 @@ set(SRC_HEADERS
camera.h
colorspace.h
constant_fold.h
devicescene.h
film.h
geometry.h
hair.h

View File

@@ -0,0 +1,64 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#include "scene/devicescene.h"
#include "device/device.h"
#include "device/memory.h"
CCL_NAMESPACE_BEGIN
DeviceScene::DeviceScene(Device *device)
: bvh_nodes(device, "bvh_nodes", MEM_GLOBAL),
bvh_leaf_nodes(device, "bvh_leaf_nodes", MEM_GLOBAL),
object_node(device, "object_node", MEM_GLOBAL),
prim_type(device, "prim_type", MEM_GLOBAL),
prim_visibility(device, "prim_visibility", MEM_GLOBAL),
prim_index(device, "prim_index", MEM_GLOBAL),
prim_object(device, "prim_object", MEM_GLOBAL),
prim_time(device, "prim_time", MEM_GLOBAL),
tri_verts(device, "tri_verts", MEM_GLOBAL),
tri_shader(device, "tri_shader", MEM_GLOBAL),
tri_vnormal(device, "tri_vnormal", MEM_GLOBAL),
tri_vindex(device, "tri_vindex", MEM_GLOBAL),
tri_patch(device, "tri_patch", MEM_GLOBAL),
tri_patch_uv(device, "tri_patch_uv", MEM_GLOBAL),
curves(device, "curves", MEM_GLOBAL),
curve_keys(device, "curve_keys", MEM_GLOBAL),
curve_segments(device, "curve_segments", MEM_GLOBAL),
patches(device, "patches", MEM_GLOBAL),
points(device, "points", MEM_GLOBAL),
points_shader(device, "points_shader", MEM_GLOBAL),
objects(device, "objects", MEM_GLOBAL),
object_motion_pass(device, "object_motion_pass", MEM_GLOBAL),
object_motion(device, "object_motion", MEM_GLOBAL),
object_flag(device, "object_flag", MEM_GLOBAL),
object_volume_step(device, "object_volume_step", MEM_GLOBAL),
object_prim_offset(device, "object_prim_offset", MEM_GLOBAL),
camera_motion(device, "camera_motion", MEM_GLOBAL),
attributes_map(device, "attributes_map", MEM_GLOBAL),
attributes_float(device, "attributes_float", MEM_GLOBAL),
attributes_float2(device, "attributes_float2", MEM_GLOBAL),
attributes_float3(device, "attributes_float3", MEM_GLOBAL),
attributes_float4(device, "attributes_float4", MEM_GLOBAL),
attributes_uchar4(device, "attributes_uchar4", MEM_GLOBAL),
light_distribution(device, "light_distribution", MEM_GLOBAL),
lights(device, "lights", MEM_GLOBAL),
light_background_marginal_cdf(device, "light_background_marginal_cdf", MEM_GLOBAL),
light_background_conditional_cdf(device, "light_background_conditional_cdf", MEM_GLOBAL),
light_tree_nodes(device, "light_tree_nodes", MEM_GLOBAL),
light_tree_emitters(device, "light_tree_emitters", MEM_GLOBAL),
light_to_tree(device, "light_to_tree", MEM_GLOBAL),
object_to_tree(device, "object_to_tree", MEM_GLOBAL),
object_lookup_offset(device, "object_lookup_offset", MEM_GLOBAL),
triangle_to_tree(device, "triangle_to_tree", MEM_GLOBAL),
particles(device, "particles", MEM_GLOBAL),
svm_nodes(device, "svm_nodes", MEM_GLOBAL),
shaders(device, "shaders", MEM_GLOBAL),
lookup_table(device, "lookup_table", MEM_GLOBAL),
sample_pattern_lut(device, "sample_pattern_lut", MEM_GLOBAL),
ies_lights(device, "ies", MEM_GLOBAL)
{
memset((void *)&data, 0, sizeof(data));
}
CCL_NAMESPACE_END

View File

@@ -0,0 +1,101 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#ifndef __DEVICESCENE_H__
#define __DEVICESCENE_H__
#include "device/device.h"
#include "device/memory.h"
#include "util/types.h"
#include "util/vector.h"
CCL_NAMESPACE_BEGIN
class DeviceScene {
public:
/* BVH */
device_vector<int4> bvh_nodes;
device_vector<int4> bvh_leaf_nodes;
device_vector<int> object_node;
device_vector<int> prim_type;
device_vector<uint> prim_visibility;
device_vector<int> prim_index;
device_vector<int> prim_object;
device_vector<float2> prim_time;
/* mesh */
device_vector<packed_float3> tri_verts;
device_vector<uint> tri_shader;
device_vector<packed_float3> tri_vnormal;
device_vector<packed_uint3> tri_vindex;
device_vector<uint> tri_patch;
device_vector<float2> tri_patch_uv;
device_vector<KernelCurve> curves;
device_vector<float4> curve_keys;
device_vector<KernelCurveSegment> curve_segments;
device_vector<uint> patches;
/* point-cloud */
device_vector<float4> points;
device_vector<uint> points_shader;
/* objects */
device_vector<KernelObject> objects;
device_vector<Transform> object_motion_pass;
device_vector<DecomposedTransform> object_motion;
device_vector<uint> object_flag;
device_vector<float> object_volume_step;
device_vector<uint> object_prim_offset;
/* cameras */
device_vector<DecomposedTransform> camera_motion;
/* attributes */
device_vector<AttributeMap> attributes_map;
device_vector<float> attributes_float;
device_vector<float2> attributes_float2;
device_vector<packed_float3> attributes_float3;
device_vector<float4> attributes_float4;
device_vector<uchar4> attributes_uchar4;
/* lights */
device_vector<KernelLightDistribution> light_distribution;
device_vector<KernelLight> lights;
device_vector<float2> light_background_marginal_cdf;
device_vector<float2> light_background_conditional_cdf;
/* light tree */
device_vector<KernelLightTreeNode> light_tree_nodes;
device_vector<KernelLightTreeEmitter> light_tree_emitters;
device_vector<uint> light_to_tree;
device_vector<uint> object_to_tree;
device_vector<uint> object_lookup_offset;
device_vector<uint> triangle_to_tree;
/* particles */
device_vector<KernelParticle> particles;
/* shaders */
device_vector<int4> svm_nodes;
device_vector<KernelShader> shaders;
/* lookup tables */
device_vector<float> lookup_table;
/* integrator */
device_vector<float> sample_pattern_lut;
/* IES lights */
device_vector<float> ies_lights;
KernelData data;
DeviceScene(Device *device);
};
CCL_NAMESPACE_END
#endif /* __DEVICESCENE_H__ */

File diff suppressed because it is too large

View File

@@ -30,6 +30,38 @@ class Shader;
class Volume;
struct PackedBVH;
/* Set of flags used to help determine what data has been modified or needs reallocation, so we
* can decide which device data to free or update. */
enum {
DEVICE_CURVE_DATA_MODIFIED = (1 << 0),
DEVICE_MESH_DATA_MODIFIED = (1 << 1),
DEVICE_POINT_DATA_MODIFIED = (1 << 2),
ATTR_FLOAT_MODIFIED = (1 << 3),
ATTR_FLOAT2_MODIFIED = (1 << 4),
ATTR_FLOAT3_MODIFIED = (1 << 5),
ATTR_FLOAT4_MODIFIED = (1 << 6),
ATTR_UCHAR4_MODIFIED = (1 << 7),
CURVE_DATA_NEED_REALLOC = (1 << 8),
MESH_DATA_NEED_REALLOC = (1 << 9),
POINT_DATA_NEED_REALLOC = (1 << 10),
ATTR_FLOAT_NEEDS_REALLOC = (1 << 11),
ATTR_FLOAT2_NEEDS_REALLOC = (1 << 12),
ATTR_FLOAT3_NEEDS_REALLOC = (1 << 13),
ATTR_FLOAT4_NEEDS_REALLOC = (1 << 14),
ATTR_UCHAR4_NEEDS_REALLOC = (1 << 15),
ATTRS_NEED_REALLOC = (ATTR_FLOAT_NEEDS_REALLOC | ATTR_FLOAT2_NEEDS_REALLOC |
ATTR_FLOAT3_NEEDS_REALLOC | ATTR_FLOAT4_NEEDS_REALLOC |
ATTR_UCHAR4_NEEDS_REALLOC),
DEVICE_MESH_DATA_NEEDS_REALLOC = (MESH_DATA_NEED_REALLOC | ATTRS_NEED_REALLOC),
DEVICE_POINT_DATA_NEEDS_REALLOC = (POINT_DATA_NEED_REALLOC | ATTRS_NEED_REALLOC),
DEVICE_CURVE_DATA_NEEDS_REALLOC = (CURVE_DATA_NEED_REALLOC | ATTRS_NEED_REALLOC),
};
/* Geometry
*
* Base class for geometric types like Mesh and Hair. */
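
Review note on the flag set added in the hunk above: it defines individual bits for "modified" and "needs realloc" states, then ORs them into composite masks. A minimal sketch of how such a composite mask can be tested follows. It uses a reduced subset of the flags with the same bit positions. The `SKETCH_` names and the `mesh_needs_realloc` helper are hypothetical and not part of the diff.

```cpp
#include <cstdint>

// Reduced, hypothetical copy of the flag layout: one bit per state,
// plus composite masks built by OR-ing the single bits together.
enum : uint32_t {
  SKETCH_MESH_DATA_MODIFIED = (1u << 1),
  SKETCH_MESH_DATA_NEED_REALLOC = (1u << 9),
  SKETCH_ATTR_FLOAT_NEEDS_REALLOC = (1u << 11),
  SKETCH_ATTR_FLOAT2_NEEDS_REALLOC = (1u << 12),

  SKETCH_ATTRS_NEED_REALLOC = (SKETCH_ATTR_FLOAT_NEEDS_REALLOC |
                               SKETCH_ATTR_FLOAT2_NEEDS_REALLOC),
  SKETCH_DEVICE_MESH_DATA_NEEDS_REALLOC = (SKETCH_MESH_DATA_NEED_REALLOC |
                                           SKETCH_ATTRS_NEED_REALLOC),
};

// True if any bit covered by the composite mesh mask is set in `flags`.
inline bool mesh_needs_realloc(uint32_t flags)
{
  return (flags & SKETCH_DEVICE_MESH_DATA_NEEDS_REALLOC) != 0;
}
```

Because the composite mask includes the attribute bits, a single attribute array that needs reallocation is enough to make the mesh-level check return true, while a plain "modified" bit is not.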

View File

@@ -0,0 +1,722 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#include "bvh/bvh.h"
#include "bvh/bvh2.h"
#include "device/device.h"
#include "scene/attribute.h"
#include "scene/camera.h"
#include "scene/geometry.h"
#include "scene/hair.h"
#include "scene/light.h"
#include "scene/mesh.h"
#include "scene/object.h"
#include "scene/pointcloud.h"
#include "scene/scene.h"
#include "scene/shader.h"
#include "scene/shader_nodes.h"
#include "scene/stats.h"
#include "scene/volume.h"
#include "subd/patch_table.h"
#include "subd/split.h"
#include "kernel/osl/globals.h"
#include "util/foreach.h"
#include "util/log.h"
#include "util/progress.h"
#include "util/task.h"
CCL_NAMESPACE_BEGIN
bool Geometry::need_attribute(Scene *scene, AttributeStandard std)
{
if (std == ATTR_STD_NONE)
return false;
if (scene->need_global_attribute(std))
return true;
foreach (Node *node, used_shaders) {
Shader *shader = static_cast<Shader *>(node);
if (shader->attributes.find(std))
return true;
}
return false;
}
bool Geometry::need_attribute(Scene * /*scene*/, ustring name)
{
if (name == ustring())
return false;
foreach (Node *node, used_shaders) {
Shader *shader = static_cast<Shader *>(node);
if (shader->attributes.find(name))
return true;
}
return false;
}
AttributeRequestSet Geometry::needed_attributes()
{
AttributeRequestSet result;
foreach (Node *node, used_shaders) {
Shader *shader = static_cast<Shader *>(node);
result.add(shader->attributes);
}
return result;
}
bool Geometry::has_voxel_attributes() const
{
foreach (const Attribute &attr, attributes.attributes) {
if (attr.element == ATTR_ELEMENT_VOXEL) {
return true;
}
}
return false;
}
/* Generate a normal attribute map entry from an attribute descriptor. */
static void emit_attribute_map_entry(AttributeMap *attr_map,
size_t index,
uint64_t id,
TypeDesc type,
const AttributeDescriptor &desc)
{
attr_map[index].id = id;
attr_map[index].element = desc.element;
attr_map[index].offset = as_uint(desc.offset);
if (type == TypeDesc::TypeFloat)
attr_map[index].type = NODE_ATTR_FLOAT;
else if (type == TypeDesc::TypeMatrix)
attr_map[index].type = NODE_ATTR_MATRIX;
else if (type == TypeFloat2)
attr_map[index].type = NODE_ATTR_FLOAT2;
else if (type == TypeFloat4)
attr_map[index].type = NODE_ATTR_FLOAT4;
else if (type == TypeRGBA)
attr_map[index].type = NODE_ATTR_RGBA;
else
attr_map[index].type = NODE_ATTR_FLOAT3;
attr_map[index].flags = desc.flags;
}
/* Generate an attribute map end marker, optionally including a link to another map.
* Links are used to connect object attribute maps to mesh attribute maps. */
static void emit_attribute_map_terminator(AttributeMap *attr_map,
size_t index,
bool chain,
uint chain_link)
{
for (int j = 0; j < ATTR_PRIM_TYPES; j++) {
attr_map[index + j].id = ATTR_STD_NONE;
attr_map[index + j].element = chain; /* link is valid flag */
attr_map[index + j].offset = chain ? chain_link + j : 0; /* link to the correct sub-entry */
attr_map[index + j].type = 0;
attr_map[index + j].flags = 0;
}
}
/* Generate all necessary attribute map entries from the attribute request. */
static void emit_attribute_mapping(
AttributeMap *attr_map, size_t index, uint64_t id, AttributeRequest &req, Geometry *geom)
{
emit_attribute_map_entry(attr_map, index, id, req.type, req.desc);
if (geom->is_mesh()) {
Mesh *mesh = static_cast<Mesh *>(geom);
if (mesh->get_num_subd_faces()) {
emit_attribute_map_entry(attr_map, index + 1, id, req.subd_type, req.subd_desc);
}
}
}
void GeometryManager::update_svm_attributes(Device *,
DeviceScene *dscene,
Scene *scene,
vector<AttributeRequestSet> &geom_attributes,
vector<AttributeRequestSet> &object_attributes)
{
/* For SVM, the attributes_map table is used to look up the offset of an
* attribute, based on a unique shader attribute id. */
/* compute array stride */
size_t attr_map_size = 0;
for (size_t i = 0; i < scene->geometry.size(); i++) {
Geometry *geom = scene->geometry[i];
geom->attr_map_offset = attr_map_size;
#ifdef WITH_OSL
size_t attr_count = 0;
foreach (AttributeRequest &req, geom_attributes[i].requests) {
if (req.std != ATTR_STD_NONE &&
scene->shader_manager->get_attribute_id(req.std) != (uint64_t)req.std)
attr_count += 2;
else
attr_count += 1;
}
#else
const size_t attr_count = geom_attributes[i].size();
#endif
attr_map_size += (attr_count + 1) * ATTR_PRIM_TYPES;
}
for (size_t i = 0; i < scene->objects.size(); i++) {
Object *object = scene->objects[i];
/* only allocate a table for the object if it actually has attributes */
if (object_attributes[i].size() == 0) {
object->attr_map_offset = 0;
}
else {
object->attr_map_offset = attr_map_size;
attr_map_size += (object_attributes[i].size() + 1) * ATTR_PRIM_TYPES;
}
}
if (attr_map_size == 0)
return;
if (!dscene->attributes_map.need_realloc()) {
return;
}
/* create attribute map */
AttributeMap *attr_map = dscene->attributes_map.alloc(attr_map_size);
memset(attr_map, 0, dscene->attributes_map.size() * sizeof(*attr_map));
for (size_t i = 0; i < scene->geometry.size(); i++) {
Geometry *geom = scene->geometry[i];
AttributeRequestSet &attributes = geom_attributes[i];
/* set geometry attributes */
size_t index = geom->attr_map_offset;
foreach (AttributeRequest &req, attributes.requests) {
uint64_t id;
if (req.std == ATTR_STD_NONE)
id = scene->shader_manager->get_attribute_id(req.name);
else
id = scene->shader_manager->get_attribute_id(req.std);
emit_attribute_mapping(attr_map, index, id, req, geom);
index += ATTR_PRIM_TYPES;
#ifdef WITH_OSL
/* Some standard attributes are explicitly referenced via their standard ID, so add those
* again in case they were added under a different attribute ID. */
if (req.std != ATTR_STD_NONE && id != (uint64_t)req.std) {
emit_attribute_mapping(attr_map, index, (uint64_t)req.std, req, geom);
index += ATTR_PRIM_TYPES;
}
#endif
}
emit_attribute_map_terminator(attr_map, index, false, 0);
}
for (size_t i = 0; i < scene->objects.size(); i++) {
Object *object = scene->objects[i];
AttributeRequestSet &attributes = object_attributes[i];
/* set object attributes */
if (attributes.size() > 0) {
size_t index = object->attr_map_offset;
foreach (AttributeRequest &req, attributes.requests) {
uint64_t id;
if (req.std == ATTR_STD_NONE)
id = scene->shader_manager->get_attribute_id(req.name);
else
id = scene->shader_manager->get_attribute_id(req.std);
emit_attribute_mapping(attr_map, index, id, req, object->geometry);
index += ATTR_PRIM_TYPES;
}
emit_attribute_map_terminator(attr_map, index, true, object->geometry->attr_map_offset);
}
}
/* copy to device */
dscene->attributes_map.copy_to_device();
}
void GeometryManager::update_attribute_element_offset(Geometry *geom,
device_vector<float> &attr_float,
size_t &attr_float_offset,
device_vector<float2> &attr_float2,
size_t &attr_float2_offset,
device_vector<packed_float3> &attr_float3,
size_t &attr_float3_offset,
device_vector<float4> &attr_float4,
size_t &attr_float4_offset,
device_vector<uchar4> &attr_uchar4,
size_t &attr_uchar4_offset,
Attribute *mattr,
AttributePrimitive prim,
TypeDesc &type,
AttributeDescriptor &desc)
{
if (mattr) {
/* store element and type */
desc.element = mattr->element;
desc.flags = mattr->flags;
type = mattr->type;
/* store attribute data in arrays */
size_t size = mattr->element_size(geom, prim);
AttributeElement &element = desc.element;
int &offset = desc.offset;
if (mattr->element == ATTR_ELEMENT_VOXEL) {
/* store slot in offset value */
ImageHandle &handle = mattr->data_voxel();
offset = handle.svm_slot();
}
else if (mattr->element == ATTR_ELEMENT_CORNER_BYTE) {
uchar4 *data = mattr->data_uchar4();
offset = attr_uchar4_offset;
assert(attr_uchar4.size() >= offset + size);
if (mattr->modified) {
for (size_t k = 0; k < size; k++) {
attr_uchar4[offset + k] = data[k];
}
attr_uchar4.tag_modified();
}
attr_uchar4_offset += size;
}
else if (mattr->type == TypeDesc::TypeFloat) {
float *data = mattr->data_float();
offset = attr_float_offset;
assert(attr_float.size() >= offset + size);
if (mattr->modified) {
for (size_t k = 0; k < size; k++) {
attr_float[offset + k] = data[k];
}
attr_float.tag_modified();
}
attr_float_offset += size;
}
else if (mattr->type == TypeFloat2) {
float2 *data = mattr->data_float2();
offset = attr_float2_offset;
assert(attr_float2.size() >= offset + size);
if (mattr->modified) {
for (size_t k = 0; k < size; k++) {
attr_float2[offset + k] = data[k];
}
attr_float2.tag_modified();
}
attr_float2_offset += size;
}
else if (mattr->type == TypeDesc::TypeMatrix) {
Transform *tfm = mattr->data_transform();
offset = attr_float4_offset;
assert(attr_float4.size() >= offset + size * 3);
if (mattr->modified) {
for (size_t k = 0; k < size * 3; k++) {
attr_float4[offset + k] = (&tfm->x)[k];
}
attr_float4.tag_modified();
}
attr_float4_offset += size * 3;
}
else if (mattr->type == TypeFloat4 || mattr->type == TypeRGBA) {
float4 *data = mattr->data_float4();
offset = attr_float4_offset;
assert(attr_float4.size() >= offset + size);
if (mattr->modified) {
for (size_t k = 0; k < size; k++) {
attr_float4[offset + k] = data[k];
}
attr_float4.tag_modified();
}
attr_float4_offset += size;
}
else {
float3 *data = mattr->data_float3();
offset = attr_float3_offset;
assert(attr_float3.size() >= offset + size);
if (mattr->modified) {
for (size_t k = 0; k < size; k++) {
attr_float3[offset + k] = data[k];
}
attr_float3.tag_modified();
}
attr_float3_offset += size;
}
/* mesh vertex/curve index is global, not per object, so we sneak
* a correction for that in here */
if (geom->is_mesh()) {
Mesh *mesh = static_cast<Mesh *>(geom);
if (mesh->subdivision_type == Mesh::SUBDIVISION_CATMULL_CLARK &&
desc.flags & ATTR_SUBDIVIDED) {
/* Indices for subdivided attributes are retrieved
* from patch table so no need for correction here. */
}
else if (element == ATTR_ELEMENT_VERTEX)
offset -= mesh->vert_offset;
else if (element == ATTR_ELEMENT_VERTEX_MOTION)
offset -= mesh->vert_offset;
else if (element == ATTR_ELEMENT_FACE) {
if (prim == ATTR_PRIM_GEOMETRY)
offset -= mesh->prim_offset;
else
offset -= mesh->face_offset;
}
else if (element == ATTR_ELEMENT_CORNER || element == ATTR_ELEMENT_CORNER_BYTE) {
if (prim == ATTR_PRIM_GEOMETRY)
offset -= 3 * mesh->prim_offset;
else
offset -= mesh->corner_offset;
}
}
else if (geom->is_hair()) {
Hair *hair = static_cast<Hair *>(geom);
if (element == ATTR_ELEMENT_CURVE)
offset -= hair->prim_offset;
else if (element == ATTR_ELEMENT_CURVE_KEY)
offset -= hair->curve_key_offset;
else if (element == ATTR_ELEMENT_CURVE_KEY_MOTION)
offset -= hair->curve_key_offset;
}
else if (geom->is_pointcloud()) {
if (element == ATTR_ELEMENT_VERTEX)
offset -= geom->prim_offset;
else if (element == ATTR_ELEMENT_VERTEX_MOTION)
offset -= geom->prim_offset;
}
}
else {
/* attribute not found */
desc.element = ATTR_ELEMENT_NONE;
desc.offset = 0;
}
}
static void update_attribute_element_size(Geometry *geom,
Attribute *mattr,
AttributePrimitive prim,
size_t *attr_float_size,
size_t *attr_float2_size,
size_t *attr_float3_size,
size_t *attr_float4_size,
size_t *attr_uchar4_size)
{
if (mattr) {
size_t size = mattr->element_size(geom, prim);
if (mattr->element == ATTR_ELEMENT_VOXEL) {
/* pass */
}
else if (mattr->element == ATTR_ELEMENT_CORNER_BYTE) {
*attr_uchar4_size += size;
}
else if (mattr->type == TypeDesc::TypeFloat) {
*attr_float_size += size;
}
else if (mattr->type == TypeFloat2) {
*attr_float2_size += size;
}
else if (mattr->type == TypeDesc::TypeMatrix) {
*attr_float4_size += size * 4;
}
else if (mattr->type == TypeFloat4 || mattr->type == TypeRGBA) {
*attr_float4_size += size;
}
else {
*attr_float3_size += size;
}
}
}
void GeometryManager::device_update_attributes(Device *device,
DeviceScene *dscene,
Scene *scene,
Progress &progress)
{
progress.set_status("Updating Mesh", "Computing attributes");
/* Gather per mesh requested attributes. As meshes may have multiple
* shaders assigned, this merges the requested attributes that have
* been set per shader by the shader manager. */
vector<AttributeRequestSet> geom_attributes(scene->geometry.size());
for (size_t i = 0; i < scene->geometry.size(); i++) {
Geometry *geom = scene->geometry[i];
geom->index = i;
scene->need_global_attributes(geom_attributes[i]);
foreach (Node *node, geom->get_used_shaders()) {
Shader *shader = static_cast<Shader *>(node);
geom_attributes[i].add(shader->attributes);
}
if (geom->is_hair() && static_cast<Hair *>(geom)->need_shadow_transparency()) {
geom_attributes[i].add(ATTR_STD_SHADOW_TRANSPARENCY);
}
}
/* convert object attributes to use the same data structures as geometry ones */
vector<AttributeRequestSet> object_attributes(scene->objects.size());
vector<AttributeSet> object_attribute_values;
object_attribute_values.reserve(scene->objects.size());
for (size_t i = 0; i < scene->objects.size(); i++) {
Object *object = scene->objects[i];
Geometry *geom = object->geometry;
size_t geom_idx = geom->index;
assert(geom_idx < scene->geometry.size() && scene->geometry[geom_idx] == geom);
object_attribute_values.push_back(AttributeSet(geom, ATTR_PRIM_GEOMETRY));
AttributeRequestSet &geom_requests = geom_attributes[geom_idx];
AttributeRequestSet &attributes = object_attributes[i];
AttributeSet &values = object_attribute_values[i];
for (size_t j = 0; j < object->attributes.size(); j++) {
ParamValue &param = object->attributes[j];
/* add attributes that are requested and not already handled by the mesh */
if (geom_requests.find(param.name()) && !geom->attributes.find(param.name())) {
attributes.add(param.name());
Attribute *attr = values.add(param.name(), param.type(), ATTR_ELEMENT_OBJECT);
assert(param.datasize() == attr->buffer.size());
memcpy(attr->buffer.data(), param.data(), param.datasize());
}
}
}
/* Mesh attributes are stored in a single array per data type. Here we fill
* those arrays, and set the offset and element type to create attribute
* maps next. */
/* Pre-allocate attributes to avoid array re-allocation, which would
* take 2x the overall attribute memory usage.
*/
size_t attr_float_size = 0;
size_t attr_float2_size = 0;
size_t attr_float3_size = 0;
size_t attr_float4_size = 0;
size_t attr_uchar4_size = 0;
for (size_t i = 0; i < scene->geometry.size(); i++) {
Geometry *geom = scene->geometry[i];
AttributeRequestSet &attributes = geom_attributes[i];
foreach (AttributeRequest &req, attributes.requests) {
Attribute *attr = geom->attributes.find(req);
update_attribute_element_size(geom,
attr,
ATTR_PRIM_GEOMETRY,
&attr_float_size,
&attr_float2_size,
&attr_float3_size,
&attr_float4_size,
&attr_uchar4_size);
if (geom->is_mesh()) {
Mesh *mesh = static_cast<Mesh *>(geom);
Attribute *subd_attr = mesh->subd_attributes.find(req);
update_attribute_element_size(mesh,
subd_attr,
ATTR_PRIM_SUBD,
&attr_float_size,
&attr_float2_size,
&attr_float3_size,
&attr_float4_size,
&attr_uchar4_size);
}
}
}
for (size_t i = 0; i < scene->objects.size(); i++) {
Object *object = scene->objects[i];
foreach (Attribute &attr, object_attribute_values[i].attributes) {
update_attribute_element_size(object->geometry,
&attr,
ATTR_PRIM_GEOMETRY,
&attr_float_size,
&attr_float2_size,
&attr_float3_size,
&attr_float4_size,
&attr_uchar4_size);
}
}
dscene->attributes_float.alloc(attr_float_size);
dscene->attributes_float2.alloc(attr_float2_size);
dscene->attributes_float3.alloc(attr_float3_size);
dscene->attributes_float4.alloc(attr_float4_size);
dscene->attributes_uchar4.alloc(attr_uchar4_size);
/* The order of those flags needs to match that of AttrKernelDataType. */
const bool attributes_need_realloc[AttrKernelDataType::NUM] = {
dscene->attributes_float.need_realloc(),
dscene->attributes_float2.need_realloc(),
dscene->attributes_float3.need_realloc(),
dscene->attributes_float4.need_realloc(),
dscene->attributes_uchar4.need_realloc(),
};
size_t attr_float_offset = 0;
size_t attr_float2_offset = 0;
size_t attr_float3_offset = 0;
size_t attr_float4_offset = 0;
size_t attr_uchar4_offset = 0;
/* Fill in attributes. */
for (size_t i = 0; i < scene->geometry.size(); i++) {
Geometry *geom = scene->geometry[i];
AttributeRequestSet &attributes = geom_attributes[i];
/* todo: we now store std and name attributes from requests even if
* they actually refer to the same mesh attributes, optimize */
foreach (AttributeRequest &req, attributes.requests) {
Attribute *attr = geom->attributes.find(req);
if (attr) {
/* force a copy if we need to reallocate all the data */
attr->modified |= attributes_need_realloc[Attribute::kernel_type(*attr)];
}
update_attribute_element_offset(geom,
dscene->attributes_float,
attr_float_offset,
dscene->attributes_float2,
attr_float2_offset,
dscene->attributes_float3,
attr_float3_offset,
dscene->attributes_float4,
attr_float4_offset,
dscene->attributes_uchar4,
attr_uchar4_offset,
attr,
ATTR_PRIM_GEOMETRY,
req.type,
req.desc);
if (geom->is_mesh()) {
Mesh *mesh = static_cast<Mesh *>(geom);
Attribute *subd_attr = mesh->subd_attributes.find(req);
if (subd_attr) {
/* force a copy if we need to reallocate all the data */
subd_attr->modified |= attributes_need_realloc[Attribute::kernel_type(*subd_attr)];
}
update_attribute_element_offset(mesh,
dscene->attributes_float,
attr_float_offset,
dscene->attributes_float2,
attr_float2_offset,
dscene->attributes_float3,
attr_float3_offset,
dscene->attributes_float4,
attr_float4_offset,
dscene->attributes_uchar4,
attr_uchar4_offset,
subd_attr,
ATTR_PRIM_SUBD,
req.subd_type,
req.subd_desc);
}
if (progress.get_cancel())
return;
}
}
for (size_t i = 0; i < scene->objects.size(); i++) {
Object *object = scene->objects[i];
AttributeRequestSet &attributes = object_attributes[i];
AttributeSet &values = object_attribute_values[i];
foreach (AttributeRequest &req, attributes.requests) {
Attribute *attr = values.find(req);
if (attr) {
attr->modified |= attributes_need_realloc[Attribute::kernel_type(*attr)];
}
update_attribute_element_offset(object->geometry,
dscene->attributes_float,
attr_float_offset,
dscene->attributes_float2,
attr_float2_offset,
dscene->attributes_float3,
attr_float3_offset,
dscene->attributes_float4,
attr_float4_offset,
dscene->attributes_uchar4,
attr_uchar4_offset,
attr,
ATTR_PRIM_GEOMETRY,
req.type,
req.desc);
/* object attributes don't care about subdivision */
req.subd_type = req.type;
req.subd_desc = req.desc;
if (progress.get_cancel())
return;
}
}
/* create attribute lookup maps */
if (scene->shader_manager->use_osl())
update_osl_globals(device, scene);
update_svm_attributes(device, dscene, scene, geom_attributes, object_attributes);
if (progress.get_cancel())
return;
/* copy to device */
progress.set_status("Updating Mesh", "Copying Attributes to device");
dscene->attributes_float.copy_to_device_if_modified();
dscene->attributes_float2.copy_to_device_if_modified();
dscene->attributes_float3.copy_to_device_if_modified();
dscene->attributes_float4.copy_to_device_if_modified();
dscene->attributes_uchar4.copy_to_device_if_modified();
if (progress.get_cancel())
return;
/* After mesh attributes and patch tables have been copied to device memory,
* we need to update offsets in the objects. */
scene->object_manager->device_update_geom_offsets(device, dscene, scene);
}
CCL_NAMESPACE_END
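
Review note on the file above: device_update_attributes works in two passes. It first sums the size of every requested attribute per data type, allocates each shared array once, and then copies attribute data in while advancing a running offset. A minimal sketch of that size-then-fill pattern for a single float array follows. `FloatAttribute`, `PackedAttributes` and `pack_attributes` are hypothetical names for illustration, and std::vector stands in for the device array type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-geometry attribute: a plain list of floats.
using FloatAttribute = std::vector<float>;

struct PackedAttributes {
  std::vector<float> data;     // one shared array holding every attribute
  std::vector<size_t> offsets; // start index of each attribute inside `data`
};

inline PackedAttributes pack_attributes(const std::vector<FloatAttribute> &attrs)
{
  // Pass 1: sum the sizes so the shared array is allocated exactly once.
  size_t total = 0;
  for (const FloatAttribute &attr : attrs) {
    total += attr.size();
  }

  PackedAttributes packed;
  packed.data.resize(total);
  packed.offsets.reserve(attrs.size());

  // Pass 2: copy each attribute at the running offset and record where it starts.
  size_t offset = 0;
  for (const FloatAttribute &attr : attrs) {
    assert(packed.data.size() >= offset + attr.size());
    packed.offsets.push_back(offset);
    for (size_t k = 0; k < attr.size(); k++) {
      packed.data[offset + k] = attr[k];
    }
    offset += attr.size();
  }
  return packed;
}
```

Sizing first avoids growing the shared array while it is being filled, which is the re-allocation cost the pre-allocation comment in the file refers to.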

View File

@@ -0,0 +1,196 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#include "bvh/bvh.h"
#include "bvh/bvh2.h"
#include "device/device.h"
#include "scene/attribute.h"
#include "scene/camera.h"
#include "scene/geometry.h"
#include "scene/hair.h"
#include "scene/light.h"
#include "scene/mesh.h"
#include "scene/object.h"
#include "scene/pointcloud.h"
#include "scene/scene.h"
#include "scene/shader.h"
#include "scene/shader_nodes.h"
#include "scene/stats.h"
#include "scene/volume.h"
#include "subd/patch_table.h"
#include "subd/split.h"
#include "kernel/osl/globals.h"
#include "util/foreach.h"
#include "util/log.h"
#include "util/progress.h"
#include "util/task.h"
CCL_NAMESPACE_BEGIN
void Geometry::compute_bvh(Device *device,
DeviceScene *dscene,
SceneParams *params,
Progress *progress,
size_t n,
size_t total)
{
if (progress->get_cancel())
return;
compute_bounds();
const BVHLayout bvh_layout = BVHParams::best_bvh_layout(
params->bvh_layout, device->get_bvh_layout_mask(dscene->data.kernel_features));
if (need_build_bvh(bvh_layout)) {
string msg = "Updating Geometry BVH ";
if (name.empty())
msg += string_printf("%u/%u", (uint)(n + 1), (uint)total);
else
msg += string_printf("%s %u/%u", name.c_str(), (uint)(n + 1), (uint)total);
Object object;
/* Ensure all visibility bits are set at the geometry level BVH. The
* object level BVH is where actual visibility is tested. */
object.set_is_shadow_catcher(true);
object.set_visibility(~0);
object.set_geometry(this);
vector<Geometry *> geometry;
geometry.push_back(this);
vector<Object *> objects;
objects.push_back(&object);
if (bvh && !need_update_rebuild) {
progress->set_status(msg, "Refitting BVH");
bvh->replace_geometry(geometry, objects);
device->build_bvh(bvh, *progress, true);
}
else {
progress->set_status(msg, "Building BVH");
BVHParams bparams;
bparams.use_spatial_split = params->use_bvh_spatial_split;
bparams.use_compact_structure = params->use_bvh_compact_structure;
bparams.bvh_layout = bvh_layout;
bparams.use_unaligned_nodes = dscene->data.bvh.have_curves &&
params->use_bvh_unaligned_nodes;
bparams.num_motion_triangle_steps = params->num_bvh_time_steps;
bparams.num_motion_curve_steps = params->num_bvh_time_steps;
bparams.num_motion_point_steps = params->num_bvh_time_steps;
bparams.bvh_type = params->bvh_type;
bparams.curve_subdivisions = params->curve_subdivisions();
delete bvh;
bvh = BVH::create(bparams, geometry, objects, device);
MEM_GUARDED_CALL(progress, device->build_bvh, bvh, *progress, false);
}
}
need_update_rebuild = false;
need_update_bvh_for_offset = false;
}
void GeometryManager::device_update_bvh(Device *device,
DeviceScene *dscene,
Scene *scene,
Progress &progress)
{
/* bvh build */
progress.set_status("Updating Scene BVH", "Building");
BVHParams bparams;
bparams.top_level = true;
bparams.bvh_layout = BVHParams::best_bvh_layout(
scene->params.bvh_layout, device->get_bvh_layout_mask(dscene->data.kernel_features));
bparams.use_spatial_split = scene->params.use_bvh_spatial_split;
bparams.use_unaligned_nodes = dscene->data.bvh.have_curves &&
scene->params.use_bvh_unaligned_nodes;
bparams.num_motion_triangle_steps = scene->params.num_bvh_time_steps;
bparams.num_motion_curve_steps = scene->params.num_bvh_time_steps;
bparams.num_motion_point_steps = scene->params.num_bvh_time_steps;
bparams.bvh_type = scene->params.bvh_type;
bparams.curve_subdivisions = scene->params.curve_subdivisions();
VLOG_INFO << "Using " << bvh_layout_name(bparams.bvh_layout) << " layout.";
const bool can_refit = scene->bvh != nullptr &&
(bparams.bvh_layout == BVHLayout::BVH_LAYOUT_OPTIX ||
bparams.bvh_layout == BVHLayout::BVH_LAYOUT_METAL);
BVH *bvh = scene->bvh;
if (!scene->bvh) {
bvh = scene->bvh = BVH::create(bparams, scene->geometry, scene->objects, device);
}
device->build_bvh(bvh, progress, can_refit);
if (progress.get_cancel()) {
return;
}
const bool has_bvh2_layout = (bparams.bvh_layout == BVH_LAYOUT_BVH2);
PackedBVH pack;
if (has_bvh2_layout) {
pack = std::move(static_cast<BVH2 *>(bvh)->pack);
}
else {
pack.root_index = -1;
}
/* copy to device */
progress.set_status("Updating Scene BVH", "Copying BVH to device");
/* When using BVH2, we always have to copy/update the data as its layout is dependent on the
* BVH's leaf nodes which may be different when the objects or vertices move. */
if (pack.nodes.size()) {
dscene->bvh_nodes.steal_data(pack.nodes);
dscene->bvh_nodes.copy_to_device();
}
if (pack.leaf_nodes.size()) {
dscene->bvh_leaf_nodes.steal_data(pack.leaf_nodes);
dscene->bvh_leaf_nodes.copy_to_device();
}
if (pack.object_node.size()) {
dscene->object_node.steal_data(pack.object_node);
dscene->object_node.copy_to_device();
}
if (pack.prim_type.size()) {
dscene->prim_type.steal_data(pack.prim_type);
dscene->prim_type.copy_to_device();
}
if (pack.prim_visibility.size()) {
dscene->prim_visibility.steal_data(pack.prim_visibility);
dscene->prim_visibility.copy_to_device();
}
if (pack.prim_index.size()) {
dscene->prim_index.steal_data(pack.prim_index);
dscene->prim_index.copy_to_device();
}
if (pack.prim_object.size()) {
dscene->prim_object.steal_data(pack.prim_object);
dscene->prim_object.copy_to_device();
}
if (pack.prim_time.size()) {
dscene->prim_time.steal_data(pack.prim_time);
dscene->prim_time.copy_to_device();
}
dscene->data.bvh.root = pack.root_index;
dscene->data.bvh.use_bvh_steps = (scene->params.num_bvh_time_steps != 0);
dscene->data.bvh.curve_subdivisions = scene->params.curve_subdivisions();
/* The scene handle is set in 'CPUDevice::const_copy_to' and 'OptiXDevice::const_copy_to' */
dscene->data.device_bvh = 0;
}
CCL_NAMESPACE_END


@@ -0,0 +1,223 @@
/* SPDX-License-Identifier: Apache-2.0
* Copyright 2011-2022 Blender Foundation */
#include "bvh/bvh.h"
#include "bvh/bvh2.h"
#include "device/device.h"
#include "scene/attribute.h"
#include "scene/camera.h"
#include "scene/geometry.h"
#include "scene/hair.h"
#include "scene/light.h"
#include "scene/mesh.h"
#include "scene/object.h"
#include "scene/osl.h"
#include "scene/pointcloud.h"
#include "scene/scene.h"
#include "scene/shader.h"
#include "scene/shader_nodes.h"
#include "scene/stats.h"
#include "scene/volume.h"
#include "subd/patch_table.h"
#include "subd/split.h"
#ifdef WITH_OSL
# include "kernel/osl/globals.h"
#endif
#include "util/foreach.h"
#include "util/log.h"
#include "util/progress.h"
#include "util/task.h"
CCL_NAMESPACE_BEGIN
void GeometryManager::device_update_mesh(Device *,
DeviceScene *dscene,
Scene *scene,
Progress &progress)
{
/* Count. */
size_t vert_size = 0;
size_t tri_size = 0;
size_t curve_key_size = 0;
size_t curve_size = 0;
size_t curve_segment_size = 0;
size_t point_size = 0;
size_t patch_size = 0;
foreach (Geometry *geom, scene->geometry) {
if (geom->geometry_type == Geometry::MESH || geom->geometry_type == Geometry::VOLUME) {
Mesh *mesh = static_cast<Mesh *>(geom);
vert_size += mesh->verts.size();
tri_size += mesh->num_triangles();
if (mesh->get_num_subd_faces()) {
Mesh::SubdFace last = mesh->get_subd_face(mesh->get_num_subd_faces() - 1);
patch_size += (last.ptex_offset + last.num_ptex_faces()) * 8;
/* patch tables are stored in same array so include them in patch_size */
if (mesh->patch_table) {
mesh->patch_table_offset = patch_size;
patch_size += mesh->patch_table->total_size();
}
}
}
else if (geom->is_hair()) {
Hair *hair = static_cast<Hair *>(geom);
curve_key_size += hair->get_curve_keys().size();
curve_size += hair->num_curves();
curve_segment_size += hair->num_segments();
}
else if (geom->is_pointcloud()) {
PointCloud *pointcloud = static_cast<PointCloud *>(geom);
point_size += pointcloud->num_points();
}
}
/* Fill in all the arrays. */
if (tri_size != 0) {
/* normals */
progress.set_status("Updating Mesh", "Computing normals");
packed_float3 *tri_verts = dscene->tri_verts.alloc(vert_size);
uint *tri_shader = dscene->tri_shader.alloc(tri_size);
packed_float3 *vnormal = dscene->tri_vnormal.alloc(vert_size);
packed_uint3 *tri_vindex = dscene->tri_vindex.alloc(tri_size);
uint *tri_patch = dscene->tri_patch.alloc(tri_size);
float2 *tri_patch_uv = dscene->tri_patch_uv.alloc(vert_size);
const bool copy_all_data = dscene->tri_shader.need_realloc() ||
dscene->tri_vindex.need_realloc() ||
dscene->tri_vnormal.need_realloc() ||
dscene->tri_patch.need_realloc() ||
dscene->tri_patch_uv.need_realloc();
foreach (Geometry *geom, scene->geometry) {
if (geom->geometry_type == Geometry::MESH || geom->geometry_type == Geometry::VOLUME) {
Mesh *mesh = static_cast<Mesh *>(geom);
if (mesh->shader_is_modified() || mesh->smooth_is_modified() ||
mesh->triangles_is_modified() || copy_all_data) {
mesh->pack_shaders(scene, &tri_shader[mesh->prim_offset]);
}
if (mesh->verts_is_modified() || copy_all_data) {
mesh->pack_normals(&vnormal[mesh->vert_offset]);
}
if (mesh->verts_is_modified() || mesh->triangles_is_modified() ||
mesh->vert_patch_uv_is_modified() || copy_all_data) {
mesh->pack_verts(&tri_verts[mesh->vert_offset],
&tri_vindex[mesh->prim_offset],
&tri_patch[mesh->prim_offset],
&tri_patch_uv[mesh->vert_offset]);
}
if (progress.get_cancel())
return;
}
}
/* vertex coordinates */
progress.set_status("Updating Mesh", "Copying Mesh to device");
dscene->tri_verts.copy_to_device_if_modified();
dscene->tri_shader.copy_to_device_if_modified();
dscene->tri_vnormal.copy_to_device_if_modified();
dscene->tri_vindex.copy_to_device_if_modified();
dscene->tri_patch.copy_to_device_if_modified();
dscene->tri_patch_uv.copy_to_device_if_modified();
}
if (curve_segment_size != 0) {
progress.set_status("Updating Mesh", "Copying Curves to device");
float4 *curve_keys = dscene->curve_keys.alloc(curve_key_size);
KernelCurve *curves = dscene->curves.alloc(curve_size);
KernelCurveSegment *curve_segments = dscene->curve_segments.alloc(curve_segment_size);
const bool copy_all_data = dscene->curve_keys.need_realloc() ||
dscene->curves.need_realloc() ||
dscene->curve_segments.need_realloc();
foreach (Geometry *geom, scene->geometry) {
if (geom->is_hair()) {
Hair *hair = static_cast<Hair *>(geom);
bool curve_keys_co_modified = hair->curve_radius_is_modified() ||
hair->curve_keys_is_modified();
bool curve_data_modified = hair->curve_shader_is_modified() ||
hair->curve_first_key_is_modified();
if (!curve_keys_co_modified && !curve_data_modified && !copy_all_data) {
continue;
}
hair->pack_curves(scene,
&curve_keys[hair->curve_key_offset],
&curves[hair->prim_offset],
&curve_segments[hair->curve_segment_offset]);
if (progress.get_cancel())
return;
}
}
dscene->curve_keys.copy_to_device_if_modified();
dscene->curves.copy_to_device_if_modified();
dscene->curve_segments.copy_to_device_if_modified();
}
if (point_size != 0) {
progress.set_status("Updating Mesh", "Copying Point clouds to device");
float4 *points = dscene->points.alloc(point_size);
uint *points_shader = dscene->points_shader.alloc(point_size);
foreach (Geometry *geom, scene->geometry) {
if (geom->is_pointcloud()) {
PointCloud *pointcloud = static_cast<PointCloud *>(geom);
pointcloud->pack(
scene, &points[pointcloud->prim_offset], &points_shader[pointcloud->prim_offset]);
if (progress.get_cancel())
return;
}
}
dscene->points.copy_to_device();
dscene->points_shader.copy_to_device();
}
if (patch_size != 0 && dscene->patches.need_realloc()) {
progress.set_status("Updating Mesh", "Copying Patches to device");
uint *patch_data = dscene->patches.alloc(patch_size);
foreach (Geometry *geom, scene->geometry) {
if (geom->is_mesh()) {
Mesh *mesh = static_cast<Mesh *>(geom);
mesh->pack_patches(&patch_data[mesh->patch_offset]);
if (mesh->patch_table) {
mesh->patch_table->copy_adjusting_offsets(&patch_data[mesh->patch_table_offset],
mesh->patch_table_offset);
}
if (progress.get_cancel())
return;
}
}
dscene->patches.copy_to_device();
}
}
CCL_NAMESPACE_END


@@ -450,6 +450,9 @@ void LightManager::device_update_tree(Device *,
* More benchmarking is needed to determine what number works best. */
LightTree light_tree(scene, dscene, progress, 8);
LightTreeNode *root = light_tree.build(scene, dscene);
if (progress.get_cancel()) {
return;
}
/* We want to create separate arrays corresponding to triangles and lights,
* which will be used to index back into the light tree for PDF calculations. */
@@ -730,7 +733,7 @@ void LightManager::device_update_background(Device *device,
/* Determine sun direction from lat/long and texture mapping. */
float latitude = sky->get_sun_elevation();
float longitude = M_2PI_F - sky->get_sun_rotation() + M_PI_2_F;
float longitude = sky->get_sun_rotation() + M_PI_2_F;
float3 sun_direction = make_float3(
cosf(latitude) * cosf(longitude), cosf(latitude) * sinf(longitude), sinf(latitude));
Transform sky_transform = transform_inverse(sky->tex_mapping.compute_transform());
@@ -752,7 +755,8 @@ void LightManager::device_update_background(Device *device,
}
/* If there's more than one sun, fall back to map sampling instead. */
if (num_suns != 1) {
kbackground->use_sun_guiding = (num_suns == 1);
if (!kbackground->use_sun_guiding) {
kbackground->sun_weight = 0.0f;
environment_res.x = max(environment_res.x, 4096);
environment_res.y = max(environment_res.y, 2048);


@@ -595,7 +595,7 @@ void ObjectManager::device_update_object_transform(UpdateObjectTransformState *s
void ObjectManager::device_update_prim_offsets(Device *device, DeviceScene *dscene, Scene *scene)
{
if (!scene->integrator->get_use_light_tree()) {
BVHLayoutMask layout_mask = device->get_bvh_layout_mask();
BVHLayoutMask layout_mask = device->get_bvh_layout_mask(dscene->data.kernel_features);
if (layout_mask != BVH_LAYOUT_METAL && layout_mask != BVH_LAYOUT_MULTI_METAL &&
layout_mask != BVH_LAYOUT_MULTI_METAL_EMBREE) {
return;


@@ -24,6 +24,7 @@
#include "scene/svm.h"
#include "scene/tables.h"
#include "scene/volume.h"
#include "scene/devicescene.h"
#include "session/session.h"
#include "util/foreach.h"
@@ -33,59 +34,7 @@
CCL_NAMESPACE_BEGIN
DeviceScene::DeviceScene(Device *device)
: bvh_nodes(device, "bvh_nodes", MEM_GLOBAL),
bvh_leaf_nodes(device, "bvh_leaf_nodes", MEM_GLOBAL),
object_node(device, "object_node", MEM_GLOBAL),
prim_type(device, "prim_type", MEM_GLOBAL),
prim_visibility(device, "prim_visibility", MEM_GLOBAL),
prim_index(device, "prim_index", MEM_GLOBAL),
prim_object(device, "prim_object", MEM_GLOBAL),
prim_time(device, "prim_time", MEM_GLOBAL),
tri_verts(device, "tri_verts", MEM_GLOBAL),
tri_shader(device, "tri_shader", MEM_GLOBAL),
tri_vnormal(device, "tri_vnormal", MEM_GLOBAL),
tri_vindex(device, "tri_vindex", MEM_GLOBAL),
tri_patch(device, "tri_patch", MEM_GLOBAL),
tri_patch_uv(device, "tri_patch_uv", MEM_GLOBAL),
curves(device, "curves", MEM_GLOBAL),
curve_keys(device, "curve_keys", MEM_GLOBAL),
curve_segments(device, "curve_segments", MEM_GLOBAL),
patches(device, "patches", MEM_GLOBAL),
points(device, "points", MEM_GLOBAL),
points_shader(device, "points_shader", MEM_GLOBAL),
objects(device, "objects", MEM_GLOBAL),
object_motion_pass(device, "object_motion_pass", MEM_GLOBAL),
object_motion(device, "object_motion", MEM_GLOBAL),
object_flag(device, "object_flag", MEM_GLOBAL),
object_volume_step(device, "object_volume_step", MEM_GLOBAL),
object_prim_offset(device, "object_prim_offset", MEM_GLOBAL),
camera_motion(device, "camera_motion", MEM_GLOBAL),
attributes_map(device, "attributes_map", MEM_GLOBAL),
attributes_float(device, "attributes_float", MEM_GLOBAL),
attributes_float2(device, "attributes_float2", MEM_GLOBAL),
attributes_float3(device, "attributes_float3", MEM_GLOBAL),
attributes_float4(device, "attributes_float4", MEM_GLOBAL),
attributes_uchar4(device, "attributes_uchar4", MEM_GLOBAL),
light_distribution(device, "light_distribution", MEM_GLOBAL),
lights(device, "lights", MEM_GLOBAL),
light_background_marginal_cdf(device, "light_background_marginal_cdf", MEM_GLOBAL),
light_background_conditional_cdf(device, "light_background_conditional_cdf", MEM_GLOBAL),
light_tree_nodes(device, "light_tree_nodes", MEM_GLOBAL),
light_tree_emitters(device, "light_tree_emitters", MEM_GLOBAL),
light_to_tree(device, "light_to_tree", MEM_GLOBAL),
object_to_tree(device, "object_to_tree", MEM_GLOBAL),
object_lookup_offset(device, "object_lookup_offset", MEM_GLOBAL),
triangle_to_tree(device, "triangle_to_tree", MEM_GLOBAL),
particles(device, "particles", MEM_GLOBAL),
svm_nodes(device, "svm_nodes", MEM_GLOBAL),
shaders(device, "shaders", MEM_GLOBAL),
lookup_table(device, "lookup_table", MEM_GLOBAL),
sample_pattern_lut(device, "sample_pattern_lut", MEM_GLOBAL),
ies_lights(device, "ies", MEM_GLOBAL)
{
memset((void *)&data, 0, sizeof(data));
}
Scene::Scene(const SceneParams &params_, Device *device)
: name("Scene"),


@@ -6,20 +6,16 @@
#include "bvh/params.h"
#include "scene/devicescene.h"
#include "scene/film.h"
#include "scene/image.h"
#include "scene/shader.h"
#include "device/device.h"
#include "device/memory.h"
#include "util/param.h"
#include "util/string.h"
#include "util/system.h"
#include "util/texture.h"
#include "util/thread.h"
#include "util/types.h"
#include "util/vector.h"
CCL_NAMESPACE_BEGIN
@@ -54,92 +50,6 @@ class RenderStats;
class SceneUpdateStats;
class Volume;
/* Scene Device Data */
class DeviceScene {
public:
/* BVH */
device_vector<int4> bvh_nodes;
device_vector<int4> bvh_leaf_nodes;
device_vector<int> object_node;
device_vector<int> prim_type;
device_vector<uint> prim_visibility;
device_vector<int> prim_index;
device_vector<int> prim_object;
device_vector<float2> prim_time;
/* mesh */
device_vector<packed_float3> tri_verts;
device_vector<uint> tri_shader;
device_vector<packed_float3> tri_vnormal;
device_vector<packed_uint3> tri_vindex;
device_vector<uint> tri_patch;
device_vector<float2> tri_patch_uv;
device_vector<KernelCurve> curves;
device_vector<float4> curve_keys;
device_vector<KernelCurveSegment> curve_segments;
device_vector<uint> patches;
/* point-cloud */
device_vector<float4> points;
device_vector<uint> points_shader;
/* objects */
device_vector<KernelObject> objects;
device_vector<Transform> object_motion_pass;
device_vector<DecomposedTransform> object_motion;
device_vector<uint> object_flag;
device_vector<float> object_volume_step;
device_vector<uint> object_prim_offset;
/* cameras */
device_vector<DecomposedTransform> camera_motion;
/* attributes */
device_vector<AttributeMap> attributes_map;
device_vector<float> attributes_float;
device_vector<float2> attributes_float2;
device_vector<packed_float3> attributes_float3;
device_vector<float4> attributes_float4;
device_vector<uchar4> attributes_uchar4;
/* lights */
device_vector<KernelLightDistribution> light_distribution;
device_vector<KernelLight> lights;
device_vector<float2> light_background_marginal_cdf;
device_vector<float2> light_background_conditional_cdf;
/* light tree */
device_vector<KernelLightTreeNode> light_tree_nodes;
device_vector<KernelLightTreeEmitter> light_tree_emitters;
device_vector<uint> light_to_tree;
device_vector<uint> object_to_tree;
device_vector<uint> object_lookup_offset;
device_vector<uint> triangle_to_tree;
/* particles */
device_vector<KernelParticle> particles;
/* shaders */
device_vector<int4> svm_nodes;
device_vector<KernelShader> shaders;
/* lookup tables */
device_vector<float> lookup_table;
/* integrator */
device_vector<float> sample_pattern_lut;
/* IES lights */
device_vector<float> ies_lights;
KernelData data;
DeviceScene(Device *device);
};
/* Scene Parameters */
class SceneParams {


@@ -758,12 +758,7 @@ static void sky_texture_precompute_nishita(SunSky *sunsky,
float pixel_top[3];
SKY_nishita_skymodel_precompute_sun(
sun_elevation, sun_size, altitude, air_density, dust_density, pixel_bottom, pixel_top);
/* limit sun rotation between 0 and 360 degrees */
sun_rotation = fmodf(sun_rotation, M_2PI_F);
if (sun_rotation < 0.0f) {
sun_rotation += M_2PI_F;
}
sun_rotation = M_2PI_F - sun_rotation;
/* send data to svm_sky */
sunsky->nishita_data[0] = pixel_bottom[0];
sunsky->nishita_data[1] = pixel_bottom[1];
@@ -873,6 +868,37 @@ NODE_DEFINE(SkyTextureNode)
SkyTextureNode::SkyTextureNode() : TextureNode(get_node_type()) {}
void SkyTextureNode::simplify_settings(Scene * /* scene */)
{
/* Patch sun position so users are able to animate the daylight cycle while keeping the shading
* code simple. */
float new_sun_elevation = sun_elevation;
float new_sun_rotation = sun_rotation;
/* Wrap `new_sun_elevation` into [-2PI..2PI] range. */
new_sun_elevation = fmodf(new_sun_elevation, M_2PI_F);
/* Wrap `new_sun_elevation` into [-PI..PI] range. */
if (fabsf(new_sun_elevation) >= M_PI_F) {
new_sun_elevation -= copysignf(2.0f, new_sun_elevation) * M_PI_F;
}
/* Wrap `new_sun_elevation` into [-PI/2..PI/2] range while keeping the same absolute position. */
if (new_sun_elevation >= M_PI_2_F || new_sun_elevation <= -M_PI_2_F) {
new_sun_elevation = copysignf(M_PI_F, new_sun_elevation) - new_sun_elevation;
new_sun_rotation += M_PI_F;
}
/* Wrap `new_sun_rotation` into [-2PI..2PI] range. */
new_sun_rotation = fmodf(new_sun_rotation, M_2PI_F);
/* Wrap `new_sun_rotation` into [0..2PI] range. */
if (new_sun_rotation < 0.0f) {
new_sun_rotation += M_2PI_F;
}
new_sun_rotation = M_2PI_F - new_sun_rotation;
sun_elevation = new_sun_elevation;
sun_rotation = new_sun_rotation;
}
void SkyTextureNode::compile(SVMCompiler &compiler)
{
ShaderInput *vector_in = input("Vector");
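
The wrapping steps added to `SkyTextureNode::simplify_settings` above can be exercised in isolation. The following is a minimal sketch, not the Cycles implementation: `wrap_sun_position`, `SunPositionSketch`, and the `k*` constants are hypothetical stand-ins for the node's members and the `M_PI_F` family of macros.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical constants standing in for M_PI_F, M_2PI_F and M_PI_2_F.
constexpr float kPi = 3.14159265358979323846f;
constexpr float kTwoPi = 2.0f * kPi;
constexpr float kHalfPi = 0.5f * kPi;

// Hypothetical result type; the real code writes back into the node's sockets.
struct SunPositionSketch {
  float elevation;
  float rotation;
};

SunPositionSketch wrap_sun_position(float elevation, float rotation)
{
  assert(std::isfinite(elevation) && std::isfinite(rotation));

  // Wrap elevation into [-2PI..2PI], then into [-PI..PI].
  elevation = std::fmod(elevation, kTwoPi);
  if (std::fabs(elevation) >= kPi) {
    elevation -= std::copysign(2.0f, elevation) * kPi;
  }

  // Fold elevation into [-PI/2..PI/2]; the sun then sits on the opposite
  // side of the sky, so the rotation is moved by half a turn to compensate.
  if (elevation >= kHalfPi || elevation <= -kHalfPi) {
    elevation = std::copysign(kPi, elevation) - elevation;
    rotation += kPi;
  }

  // Wrap rotation into [0..2PI], then mirror it as the hunk does.
  rotation = std::fmod(rotation, kTwoPi);
  if (rotation < 0.0f) {
    rotation += kTwoPi;
  }
  rotation = kTwoPi - rotation;

  return {elevation, rotation};
}
```

With these steps, an elevation of 120 degrees and a rotation of zero comes out as an elevation of 60 degrees with the rotation moved by half a turn, which is the "same absolute position" the comment in the hunk refers to.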


@@ -167,6 +167,8 @@ class SkyTextureNode : public TextureNode {
NODE_SOCKET_API(float3, vector)
ImageHandle handle;
void simplify_settings(Scene *scene);
float get_sun_size()
{
/* Clamping for numerical precision. */


@@ -621,12 +621,12 @@ void Session::set_pause(bool pause)
void Session::set_output_driver(unique_ptr<OutputDriver> driver)
{
path_trace_->set_output_driver(move(driver));
path_trace_->set_output_driver(std::move(driver));
}
void Session::set_display_driver(unique_ptr<DisplayDriver> driver)
{
path_trace_->set_display_driver(move(driver));
path_trace_->set_display_driver(std::move(driver));
}
double Session::get_estimated_remaining_time() const
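
The hunk above replaces unqualified `move(...)` calls with `std::move(...)`. A small sketch of the same ownership hand-off follows; `DriverSketch` and `PathTraceSketch` are hypothetical types, not the Cycles classes. The diff does not state a motivation. One likely reason (an assumption) is that the qualified call does not depend on a using-declaration or argument-dependent lookup to find `std::move`.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical driver type used only for this illustration.
struct DriverSketch {
  int id = 0;
};

// Hypothetical owner that takes a driver by value and stores it.
class PathTraceSketch {
 public:
  void set_output_driver(std::unique_ptr<DriverSketch> driver)
  {
    // Fully qualified call: ownership moves from the parameter into the member.
    driver_ = std::move(driver);
  }

  bool has_driver() const
  {
    return driver_ != nullptr;
  }

 private:
  std::unique_ptr<DriverSketch> driver_;
};
```

After the call, the caller's `unique_ptr` is empty and the owner holds the driver.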

View File

@@ -285,7 +285,7 @@ static bool configure_image_spec_from_buffer(ImageSpec *image_spec,
*image_spec = ImageSpec(
buffer_params.width, buffer_params.height, num_channels, TypeDesc::FLOAT);
image_spec->channelnames = move(channel_names);
image_spec->channelnames = std::move(channel_names);
if (!buffer_params_to_image_spec_atttributes(image_spec, buffer_params)) {
return false;


@@ -1,7 +1,7 @@
# SPDX-License-Identifier: Apache-2.0
# Copyright 2011-2022 Blender Foundation
if(WITH_GTESTS)
if(WITH_GTESTS AND WITH_CYCLES_LOGGING)
Include(GTestTesting)
# Otherwise we get warnings here that we can't fix in external projects
@@ -53,6 +53,6 @@ if(NOT APPLE)
endif()
endif()
if(WITH_GTESTS)
if(WITH_GTESTS AND WITH_CYCLES_LOGGING)
blender_src_gtest(cycles "${SRC}" "${LIB}")
endif()


@@ -36,8 +36,7 @@ class LogMessageVoidify {
# define CHECK(expression) LOG_SUPPRESS()
# define CHECK_NOTNULL(expression) LOG_SUPPRESS()
# define CHECK_NULL(expression) LOG_SUPPRESS()
# define CHECK_NOTNULL(expression) (expression)
# define CHECK_NEAR(actual, expected, eps) LOG_SUPPRESS()
@@ -50,8 +49,7 @@ class LogMessageVoidify {
# define DCHECK(expression) LOG_SUPPRESS()
# define DCHECK_NOTNULL(expression) LOG_SUPPRESS()
# define DCHECK_NULL(expression) LOG_SUPPRESS()
# define DCHECK_NOTNULL(expression) (expression)
# define DCHECK_NEAR(actual, expected, eps) LOG_SUPPRESS()
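
In the logging-disabled branch shown above, `CHECK_NOTNULL` and `DCHECK_NOTNULL` now expand to their argument. A plausible reason (an assumption, the diff does not say so) is that call sites can keep using the macro result as a value, as they can with glog's `CHECK_NOTNULL`, which returns its argument. The sketch below uses a hypothetical `SKETCH_CHECK_NOTNULL` macro and `NodeSketch` type to show such a call site.

```cpp
#include <cassert>

// Hypothetical stand-in for the logging-disabled definition in the hunk.
#define SKETCH_CHECK_NOTNULL(expression) (expression)

// Hypothetical payload type for the illustration.
struct NodeSketch {
  int value;
};

int read_value(NodeSketch *node)
{
  // The macro result is assigned to a pointer, which only compiles when the
  // macro evaluates to its argument rather than to a suppressed log statement.
  NodeSketch *checked = SKETCH_CHECK_NOTNULL(node);
  return checked->value;
}
```

Note that in this configuration the macro performs no check at all; it only keeps such call sites compiling.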


@@ -809,7 +809,7 @@ static string path_source_replace_includes_recursive(const string &_source,
const size_t source_length = source.length();
size_t index = 0;
/* Information about where we are in the source. */
size_t line_number = 0, column_number = 1;
size_t column_number = 1;
/* Currently gathered non-preprocessor token.
* Store as start/length rather than token itself to avoid overhead of
* memory re-allocations on each character concatenation.
@@ -842,7 +842,6 @@ static string path_source_replace_includes_recursive(const string &_source,
preprocessor_line = "";
}
column_number = 0;
++line_number;
}
else if (ch == '#' && column_number == 1 && !inside_preprocessor) {
/* Append all possible non-preprocessor token to the result. */


@@ -4,7 +4,6 @@
#ifndef __UTIL_VECTOR_H__
#define __UTIL_VECTOR_H__
#include <cassert>
#include <cstring>
#include <vector>


@@ -897,12 +897,10 @@ void Octree::printPath(PathElement *path)
void Octree::printPaths(PathList *path)
{
PathList *iter = path;
int i = 0;
while (iter != NULL) {
dc_printf("Path %d:\n", i);
printPath(iter);
iter = iter->next;
i++;
}
}
@@ -1256,7 +1254,6 @@ Node *Octree::connectFace(
updateParent(&newnode->internal, len, st);
int flag = 0;
// Add the cells to the rings and fill in the patch
PathElement *newEleN;
if (curEleN->pos[0] != stN[0] || curEleN->pos[1] != stN[1] || curEleN->pos[2] != stN[2]) {
@@ -1286,7 +1283,6 @@ Node *Octree::connectFace(
alpha);
curEleN = newEleN;
flag++;
}
PathElement *newEleP;
@@ -1316,7 +1312,6 @@ Node *Octree::connectFace(
alpha);
curEleP = newEleP;
flag++;
}
/*
@@ -1543,6 +1538,8 @@ void Octree::getFacePoint(PathElement *leaf, int dir, int &x, int &y, float &p,
float off[3];
int num = 0, num2 = 0;
(void)num2; // Unused in release builds.
LeafNode *leafnode = locateLeaf(leaf->pos);
for (int i = 0; i < 4; i++) {
int edgeind = faceMap[dir * 2][i];


@@ -72,7 +72,7 @@ class GHOST_ISystemPaths {
/**
* Add the file to the operating system most recently used files
*/
virtual void addToSystemRecentFiles(const char *filename) const = 0;
virtual void addToSystemRecentFiles(const char *filepath) const = 0;
private:
/** The one and only system paths. */


@@ -61,7 +61,7 @@ extern const char *GHOST_getBinaryDir(void);
/**
* Add the file to the operating system most recently used files
*/
extern void GHOST_addToSystemRecentFiles(const char *filename);
extern void GHOST_addToSystemRecentFiles(const char *filepath);
#ifdef __cplusplus
}


@@ -98,6 +98,14 @@ typedef enum {
* This is a convention for X11/WAYLAND, select text & MMB to paste (without an explicit copy).
*/
GHOST_kCapabilityPrimaryClipboard = (1 << 2),
/**
* Support for reading the front-buffer.
*/
GHOST_kCapabilityGPUReadFrontBuffer = (1 << 3),
/**
* Set when there is support for system clipboard copy/paste.
*/
GHOST_kCapabilityClipboardImages = (1 << 4),
} GHOST_TCapabilityFlag;
/**
@@ -106,7 +114,7 @@ typedef enum {
*/
#define GHOST_CAPABILITY_FLAG_ALL \
(GHOST_kCapabilityCursorWarp | GHOST_kCapabilityWindowPosition | \
GHOST_kCapabilityPrimaryClipboard)
GHOST_kCapabilityPrimaryClipboard | GHOST_kCapabilityGPUReadFrontBuffer)
/* Xtilt and Ytilt represent how much the pen is tilted away from
* vertically upright in either the X or Y direction, with X and Y the
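
The capability values above are single-bit flags combined into a mask. The sketch below mirrors that layout with hypothetical names (`CapabilitySketch`, `kAllSketch`, `has_capability`); the values for the first two flags are assumed, since only bits 2 to 4 appear in the hunk. As shown in the hunk, the `GHOST_CAPABILITY_FLAG_ALL` mask gains the front-buffer flag, while the clipboard-images flag does not appear in it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the flag layout; bits 0 and 1 are assumed values.
enum CapabilitySketch : std::uint32_t {
  kCursorWarp = 1u << 0,      // assumed, not shown in the hunk
  kWindowPosition = 1u << 1,  // assumed, not shown in the hunk
  kPrimaryClipboard = 1u << 2,
  kGPUReadFrontBuffer = 1u << 3,
  kClipboardImages = 1u << 4,
};

// Same combination as the updated mask in the hunk.
constexpr std::uint32_t kAllSketch = kCursorWarp | kWindowPosition | kPrimaryClipboard |
                                     kGPUReadFrontBuffer;

// A capability is present when its bit is set in the mask.
constexpr bool has_capability(std::uint32_t flags, CapabilitySketch capability)
{
  return (flags & capability) != 0;
}

static_assert(has_capability(kAllSketch, kPrimaryClipboard), "bit 2 is part of the mask");
```

A backend that lacks a feature would clear the corresponding bit from such a mask before reporting it.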


@@ -10,6 +10,9 @@
#include "GHOST_Context.hh"
#ifdef _WIN32
# define WIN32_LEAN_AND_MEAN
# include <windows.h>
# include <epoxy/wgl.h>
# include <tchar.h>
#
@@ -18,6 +21,8 @@
# endif
#endif
#include <epoxy/gl.h>
#include <cstdio>
#include <cstring>

Some files were not shown because too many files have changed in this diff.