1
1

Compare commits

..

734 Commits

Author SHA1 Message Date
dbc8b52752 test using compiled function in math node 2022-01-02 20:46:01 +01:00
ab6a116334 initial ir optimization 2022-01-02 18:23:04 +01:00
077debe17f fix function generation 2022-01-02 16:00:33 +01:00
33d6b09d3d test creating add function 2022-01-02 16:49:23 +01:00
f92a1e20bc add pass manager 2022-01-02 14:59:02 +01:00
c1e014f2a1 enable object cache code path 2021-12-29 21:44:40 +01:00
6a69a32c6d object file test 2021-12-29 20:37:26 +01:00
cc32f73a29 add object cache 2021-12-29 20:14:47 +01:00
d4367fa8e0 Merge branch 'master' into temp-llvm-testing 2021-12-29 19:25:33 +01:00
bb0da7dbbd Fix (unreported): missing relations update after adding scene time node
This just moves the relations update to a lower level function that is used
by other functions. Eventually, the special case for this node should be
generalized.
2021-12-29 19:40:14 +01:00
5e8b42bf86 Cleanup: Remove unused DerivedMesh functions 2021-12-29 12:37:01 -06:00
a94d80716e Geometry Nodes: Support instances in the delete geometry node
Ever since the instance domain was added, this was exposed, it just
didn't do anything. This patch implements the instances domain in the
delete and separate geometry nodes, where it acts on the top-level
instances.

We act on a mutable instances input, with the idea that eventually
copy on write attribute layers will make this less expensive. It also
allows us to keep the instance references in place and to do less
work in some situations.

Ref T93554

Differential Revision: https://developer.blender.org/D13565
2021-12-29 11:31:58 -06:00
a836ded990 Geometry Nodes: Accumulate Fields Node
This function node creates a running total of a given Vector, Float, or
Int field.

Inputs:
  - Value: The field to be accumulated
  - Group Index: The values of this input are used to aggregate the input
    into separate 'bins', creating multiple accumulations.
Outputs:
  - Leading and Trailing: Returns the running totals starting
   at either the first value of each accumulations or 0 respectively.
  - Total: Returns the total accumulation at all positions of the field.

There's currently plenty of duplicate work happening when multiple outputs
are used that could be optimized by a future refactor to field inputs.

Differential Revision: https://developer.blender.org/D12743
2021-12-29 10:25:39 -06:00
279085e18e Fix: Issues with attribute comparison in geometry nodes tests
A few typos in 17770192fb lead to an incorrect count of custom
data layers in the test meshes. We only want to consider layers that are
not anonymous, and there was a copy and paste mistake.
2021-12-29 10:23:53 -06:00
d5b77fd522 Nodes: Composite: UI fixes to time node
- Use default size consistent with other curve nodes
- Use column instead of row for properties
2021-12-29 11:16:44 -05:00
Germano Cavalcante
1c7d7c9150 Fix T94113: Local view + Geometry Nodes is broken for instances
`GeometrySet::compute_boundbox_without_instances` may not initialize min
max in some cases such as meshes without vertices.

This can result in a Bounding Box with impossible dimensions
(min=FLT_MAX, max=-FLT_MAX).

So repeat the same solution seen in `BKE_object_boundbox_calc_from_mesh`
and set boundbox values to zero.

Reviewed By: HooglyBoogly

Differential Revision: https://developer.blender.org/D13664
2021-12-29 12:36:58 -03:00
d786b48aab Cleanup: Remove dead code 2021-12-29 10:31:17 -05:00
Aaron Carlisle
465bd66519 Nodes: Cleanup: Remove no op registration functions
All these function paramaters are set to NULL so they arent necessary.

Reviewed By: HooglyBoogly, JacquesLucke

Differential Revision: https://developer.blender.org/D13686
2021-12-29 10:00:50 -05:00
Germano Cavalcante
b7f6377e38 gpu.types.GPUOffScreen: accept format argument for color texture
Some projects need more than 8-bit RGBA off-screen, so add the ability to
accept color format and defaults to RGBA8 so existing code should not be
affected.

Currently supported formats:
- RGBA8 (default)
- RGBA16
- RGBA16F
- RGBA32F

Reviewed By: mano-wii

Differential Revision: https://developer.blender.org/D13650
2021-12-29 11:46:53 -03:00
bdcc258305 Fix: VSE colormix blend factor not working
Blend factor was used to adjust alpha of background image, which is not
correct. This was done in fdee84fd56 where another change was, that
background alpha is copied into result, which is correct.

Apply blend factor to foreground image alpha channel.
2021-12-29 15:00:57 +01:00
4bf74afacc Fix T94422: Shading/Normals break on array modifier caps
The array modifier does not necessarily tag normals dirty.
If it doesnt, normals are recalculated "internally" using the offset ob
transform. This was happening for the array items, but not for the caps.

Now do the same thing for caps.

Maniphest Tasks: T94422

Differential Revision: https://developer.blender.org/D13681
2021-12-29 10:16:48 +01:00
dc0bf9b702 Fix T94453: Weld modifier crash after recent cleanup
I had assumed that the span's size was the same as the length variable.
In the future, separate lengths could be removed in favor of using
lengths directly from spans.
2021-12-29 00:16:54 -06:00
ba38b06a97 Nodes: Convert shader, shader category nodes to c++
Also add file namespace

This is needed to use new node APIs

Differential Revision: https://developer.blender.org/D13684
2021-12-28 22:51:57 -05:00
b92ef379b7 Cleanup: clang-tidy
Fixes two instances of `-Wunused-but-set-variable`

There are several more of these but these were low hanging
and noisy with one being in a header functions.
2021-12-28 21:53:41 -05:00
53ed7ec7f2 Cleanup: clang-tidy
- modernize-deprecated-headers
- modernize-redundant-void-arg

Missed in rB11ac276caaa6e6d42176452526af97cf972abb5f
2021-12-28 20:58:50 -05:00
c34ea3323a Cleanup: Remove unused node tree "local sync" functions 2021-12-28 18:25:18 -06:00
7006d4f0fb Cleanup: Use indices instead of pointers
This improves code readability.

Take the opportunity and improve the comments too.
2021-12-28 20:42:21 -03:00
1464eff375 Cleanup: Return early, organize variable declarations 2021-12-28 15:36:59 -06:00
c32ce881e8 Nodes: Enable unity build for function nodes
Unity build saves 5 seconds off the total build time when compiling `bf_nodes_function`.
Total build times went from 25s to 20s (20% reduction),
tested with ninja on linux running i5 8250U.
2021-12-28 15:49:42 -05:00
Aaron Carlisle
1e9175e1d7 Nodes: Add bf_nodes_function module
In the future this will be used to support unity builds for function nodes

Differential Revision: https://developer.blender.org/D13682
2021-12-28 15:18:17 -05:00
2668f9181c Nodes: Split shader color ramp into its own file 2021-12-28 14:18:31 -05:00
715e0faabc Nodes: Declare function nodes in individual file namespace
To be used in the future to support unity builds
2021-12-28 14:18:31 -05:00
955748ab1e Fix: Duplicate link search entries for attribute statistic node
Using the output declarations is incorrect because there is a
declaration for each type. Instead loop over the names directly,
since it will make it easier to add an integer mode that only
supports some of the outputs.
2021-12-28 12:44:36 -06:00
4cbcfd22f5 Fix T94442: Trim curve node can crash with duplicate point
The calculation to find the factor between two evaluated points assumed
that the points were not at the same location. This assumption is some-
what reasonable, since we might expect `lower_bound` to skip those
point anyway. However, the report found a case where the first two
evaluated points were coincident, and there is no strong reason not
to make this safe, so add a check for 0 length before the division.
2021-12-28 12:22:14 -06:00
d7c556de32 GPencil: Avoid crashes calling from python
This is part of T94439
2021-12-28 18:50:07 +01:00
2e6ae11326 Nodes: Add bf_nodes_composite module
In the future this will be used to support unity builds for composite nodes

Differential Revision: https://developer.blender.org/D13678
2021-12-28 12:22:06 -05:00
1a721c5dbe Fix T94380: Scrolling zooms in spreadsheet data set region
The region used to be type "Channels", but the standard for this
type is "Tools", which is what the file browser uses. This follows
the changes in rB01df48a98394, which also make the region more
"standard."
2021-12-28 11:05:31 -06:00
0e38002dd5 Cleanup: Loops in VSE effect processing
Some effect functions looped over alternating lines, previously with
different factors. Since only one factor is used, code can be
simplified by looping all lines in one for loop.

There should be no functional changes.
2021-12-28 16:40:09 +01:00
bd9d09ca82 Fix T94439: Some GPencil operators crash when calling from console
This fix avoid the segment fault for several operators.
2021-12-28 16:07:07 +01:00
9bacd54312 LibOverride: better handling of the "no override of bones' shapes" case.
Also avoid overriding collections of bone shape objects, if possible.
2021-12-28 15:08:10 +01:00
23ac79f2c2 LibOverride: Tweak RNA 'need resync' detection code.
* Assert about source ID of an overridden pointer property not being a
  liboverride was not necessary, just skip in that case.
* Tag actual 'real' ID owner for resync, and not (potentially) an embedded one.
2021-12-28 15:08:10 +01:00
44db9f192e Cleanup: Factor in VSE effect processing
2 factor variables were passed to effects, but they were hard coded to
have same value.

Remove duplicate variable from arguments, rename single argument to
`fac`. Inverted factor variables were renamed to `mfac`. Any other
factor related variables are prefixed with `temp_`.

There should be no functional changes.
2021-12-28 14:30:21 +01:00
a7dca135dc Fix loss of cloth disk cache on reload in library overrides.
If the override system creates an override record for the cache
name (no idea why though), it trashes the disk cache on file load.

The reason is that it tries to rename cache files in update handler
when assigning the name, and BLI_rename deletes the target file even
if both names are the same.

This is a safe fix that simply aborts the pointless rename attempt.

Differential Revision: https://developer.blender.org/D13679
2021-12-28 14:56:09 +03:00
b29e33caa2 Fix T94408: missing sockets after adding node group 2021-12-28 11:21:07 +01:00
e28222966b Fix compile error when building without OpenSubDiv
Stubs were missing for functions added to opensubdiv_evaluator_capi.h
2021-12-28 09:36:30 +01:00
d25fa3250a Fix T94420: deadlock with subsurf modifiers
The deadlock was caused as the lock on the Mesh mutex used to compute
the subdivision wrapper was not released in some early exits of the
function.
2021-12-28 09:20:17 +01:00
f6699bfccf Revert "LineArt: Intersection function additional clamping"
This reverts commit 0a68fa8e14.
2021-12-28 11:55:37 +08:00
0a68fa8e14 LineArt: Intersection function additional clamping
To handle a rare case where it leads to a -1 index in isect order lookup
2021-12-28 11:21:25 +08:00
8c4edd1b37 Cleanup: clang format 2021-12-27 16:23:23 -05:00
d5b72fb06c Nodes: Declare shader nodes in individual file namespace
To be used in the future to support unity builds
2021-12-27 16:21:31 -05:00
336f6f4bbd GPencil: Fix error in previous Cleanup 2021-12-27 19:51:32 +01:00
d6dd2f51bb Cleanup: Use vector function 2021-12-27 13:16:47 -05:00
5814de65f9 Cleanup: Store cursor location in tGPspoint as an array
Fixes many instances of `-Wstringop-overread` warning on GCC 11

Differential Revision: https://developer.blender.org/D13672
2021-12-27 12:31:31 -05:00
11ac276caa Cleanup: clang tidy
Use c++ headers; use nullptr; redundant `void` in parameter list;
inconsistent parameter name.
2021-12-27 18:18:37 +01:00
1c9d8fcb47 Render: move editor/render module to c++
Doing this in preparation for some work on asset preview generation.

Differential Revision: https://developer.blender.org/D13676
2021-12-27 17:26:09 +01:00
644e6c7a3e Fix T93941: geometry proximity breaks with high resolution mesh
The calls to `.fill` were overwriting indices that are processed by
other threads.
2021-12-27 17:21:09 +01:00
eed45d2a23 OpenSubDiv: add support for an OpenGL evaluator
This evaluator is used in order to evaluate subdivision at render time, allowing for
faster renders of meshes with a subdivision surface modifier placed at the last
position in the modifier list.

When evaluating the subsurf modifier, we detect whether we can delegate evaluation
to the draw code. If so, the subdivision is first evaluated on the GPU using our own
custom evaluator (only the coarse data needs to be initially sent to the GPU), then,
buffers for the final `MeshBufferCache` are filled on the GPU using a set of
compute shaders. However, some buffers are still filled on the CPU side, if doing so
on the GPU is impractical (e.g. the line adjacency buffer used for x-ray, whose
logic is hardly GPU compatible).

This is done at the mesh buffer extraction level so that the result can be readily used
in the various OpenGL engines, without having to write custom geometry or tesselation
shaders.

We use our own subdivision evaluation shaders, instead of OpenSubDiv's vanilla one, in
order to control the data layout, and interpolation. For example, we store vertex colors
as compressed 16-bit integers, while OpenSubDiv's default evaluator only work for float
types.

In order to still access the modified geometry on the CPU side, for use in modifiers
or transform operators, a dedicated wrapper type is added `MESH_WRAPPER_TYPE_SUBD`.
Subdivision will be lazily evaluated via `BKE_object_get_evaluated_mesh` which will
create such a wrapper if possible. If the final subdivision surface is not needed on
the CPU side, `BKE_object_get_evaluated_mesh_no_subsurf` should be used.

Enabling or disabling GPU subdivision can be done through the user preferences (under
Viewport -> Subdivision).

See patch description for benchmarks.

Reviewed By: campbellbarton, jbakker, fclem, brecht, #eevee_viewport

Differential Revision: https://developer.blender.org/D12406
2021-12-27 16:35:54 +01:00
31e120ef49 Nodes: Support linking to existing group input from link drag search.
Before one could only create a new group input using the link drag search.
With this patch it becomes possible to create a Group Input node for an
existing input.

Differential Revision: https://developer.blender.org/D13674
2021-12-27 16:15:11 +01:00
51a131ddbc BLI: add utility to check if type is any specific type
This adds `blender::is_same_any_v` which is the almost the same as
`std::is_same_v`. The difference is that it allows for checking multiple
types at the same time.

Differential Revision: https://developer.blender.org/D13673
2021-12-27 16:08:11 +01:00
594438ef0d Allocator: add missing include
The placement-new operator requires `#include <new>`.
It is used in `MEM_new`.
2021-12-27 15:41:56 +01:00
7cf5f4cc63 LineArt: Protecting bounding area links.
In case they overflowed the bounding area maximum link count,
Protect the link array so it doesn't crash.
2021-12-27 14:33:58 +08:00
52585b39a1 LineArt: Remove duplicated edge-boundbox linking.
The edge is linked twice from differen calls during line art calculation
Probably caused by a merge and both calls stayed for some reason.
This would lead to edge link overflowing its limit of 2^16 items.
2021-12-27 14:22:09 +08:00
20b438d523 Cleanup: Use array for BKE cursor functions
Missed this function in rB67525b88d2e
2021-12-26 15:08:41 -05:00
5cf993f951 Cleanup: Fix compile warning
Own mistake in rB67525b88d2e
2021-12-26 15:08:41 -05:00
28a8d434d5 Fix T94387: Mesh sequence cache, crash when clicking a panel
The crash happens when opening a panel (added in rB43f5e761a66e87fed664a199cda867639f8daf3e)
when no CacheFile is set in the modifier.

To fix this, check that the CacheFile pointer is not null before attempting to draw anything.
2021-12-26 13:45:49 +01:00
dd3a72f275 Docs: Add to and cleanup attribute API docs
Most of the comment block is similar to the text in the source
code documentation wiki. It's helpful to have some text in
a header file too, so it's closer for programmers already looking
at the code.

This also uses more consistent syntax and wording in the comments
about the attribute API in `GeometryComponent`.

Ref T93753

Differential Revision: https://developer.blender.org/D13661
2021-12-25 15:08:38 -06:00
ceed8f7c06 Cleanup: Use more common variable name
It's preferred to use `area` as a name for `ScrArea` variables.
2021-12-25 14:57:07 -06:00
85abac7e87 Cleanup: Move customdata.c to C++
Differential Revision: https://developer.blender.org/D13666
2021-12-25 14:28:22 -06:00
Christoph Lendenfeld
f7ddb1ed8a Breakdown Implementation
This patch adds the breakdown (or tween) functionality to the graph editor.

The factor defines the linear interpolation from left key to right key.

Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D9375
Ref: D9375
2021-12-25 20:58:47 +01:00
Aaron Carlisle
fbd01624e3 Shader Nodes: Convert bump node to use new socket builder
This node is a bit special in that it uses two internal sockets
for a hack for Eevee; see rBffd5e1e6acd296a187e7af016f9d7f8a9f209f87

As a result, the `SOCK_UNAVAIL` flag is exposed to socket builder API.

Reviewed By: JacquesLucke, fclem

Differential Revision: https://developer.blender.org/D13496
2021-12-25 11:13:15 -05:00
Aaron Carlisle
c5862da5ad Cleanup: use new c++ guarded allocator API in nodes
Also simplify the allocation name to `__func__`

Reviewed By: JacquesLucke

Differential Revision: https://developer.blender.org/D13665
2021-12-25 11:12:08 -05:00
Christoph Lendenfeld
9085b4a731 Blend To Neighbor Implementation
This patch adds the blend to neighbor operator to the Graph editor.

The operator acts like the blend to neighbor operator for a pose context, just working on keyframes.

Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D9374
Ref: D9374
2021-12-25 13:40:53 +01:00
e505957b47 Fix T94375: Python error when trying to add Grease Pencil brush preset
The prop name was wrong.
2021-12-25 12:53:35 +01:00
28df0107d4 Fix T94362: GPUMaterialTexture references freed ImageUser
The issue was caused by rB7e712b2d6a0d257d272ed35622b41d06274af8df
and the fact that `GPUMaterialTexture` contains an `ImageUser *` which
references the `ImageUser` on e.g. `NodeTexImage`.

Since the node tree update refactor, it is possible that the node tree changes
without changing the actual material. Therefore, either the renderer should
check if the node tree has changed or it should not store pointers to data in
node storage. The latter approach is implemented in this patch.

Differential Revision: https://developer.blender.org/D13663
2021-12-25 11:14:02 +01:00
f1e04116f0 Cleanup: Do not use magic number 2021-12-25 00:02:45 -05:00
67525b88d2 Cleanup: Use array for BKE cursor functions
Differential Revision: https://developer.blender.org/D12962
2021-12-24 23:59:33 -05:00
95c7e8aa13 Update RNA to user manual mapping file 2021-12-24 22:54:59 -05:00
fc45b00720 Cleanup: Define node tree icon in register function
I suppose this was done to reduce then dependencies.
However, most nodes already depend on UI code so this isnt too useful.
2021-12-24 22:47:58 -05:00
6e0cf86e73 Cleanup: use new c++ guarded allocator API
API added in rBa3ad5abf2fe85d623f9e78fefc34e27bdc14632e
2021-12-24 22:18:04 -05:00
79012c6784 Cleanup: Use consistent order for custom data mesh masks
Loops come last in the struct's definition, use the same order when
initializing the common masks in customdata.c (they were switched
with the poly masks).
2021-12-24 18:26:11 -06:00
26c7be71d7 Nodes: Migrate bump shader node to cpp
Needed for D13496
2021-12-24 15:42:58 -05:00
291d2a2222 Cleanup: Remove misleading comments
Most of these custom data layers weren't BMesh only, and the
one that actually looks to be BMesh only has `BM` in its name.
2021-12-24 11:00:01 -06:00
d48fc7d156 Cleanup: Use vector instead of linked list 2021-12-24 10:05:47 -06:00
dd01ce2cd0 Fix T94322: add missing updates after recent refactor
This was a regression in rB7e712b2d6a0d257d272ed35622b41d06274af8df.
2021-12-24 13:39:50 +01:00
ba4b7b4319 Fix T94162: incorrect handling when there are multiple group outputs
Typically a node group should only have a single Group Output node.
However, currently Blender already supports having multiple group outputs,
one of which is active. This wasn't handled correctly by geometry nodes.

Differential Revision: https://developer.blender.org/D13611
2021-12-24 12:34:04 +01:00
c0db8a9a3b Cleanup: remove unused button function
rB05f900e3466b45a19e13bea6dd641e4f7b8b46e9 removed unused button functions,
but since that commit the `uiDefIconTextButBit()` static function sits
unused as well. It's now been removed.
2021-12-24 10:39:36 +01:00
81b3933abb Fix T94357: Node Ungroup operator copies current node tree
This was a mistake in rBfdc4a1a590d8befb1ff which copied the parent
node tree into itself rather than accessing the node group's nodes.
2021-12-23 22:48:55 -06:00
35bd6fe993 Fix T94344: Incorrect size for edge vertices node output
This looks like a copy and paste error from the original commit.
The virtual array output used the number of mesh polygons instead
of the number of edges.
2021-12-23 16:22:07 -06:00
2df912466c Cleanup: Remove outdated comment
After rB01df48a983944ab3f8a, this comment no longer applies.
2021-12-23 13:28:44 -06:00
582f6032fc Cleanup: Move hair object type files to C++
Differential Revision: https://developer.blender.org/D13657
2021-12-23 11:46:45 -06:00
05f900e346 Cleanup: Remove unused UI button definition functions
These were part of the older buttons API that shouldn't be used in
more places at this point. Most layouts should be built with the regular
layout system API and RNA properties. This sort of button can still be
created though, since these were just shortcuts anyway.
2021-12-23 11:30:11 -06:00
43f5e761a6 Cache File: use panels to organize UI
This adds interface panels to organize the Cache File UI parameters for
modifiers and constraints into related components: velocity, time, and
render procedural.

Properties relating to the three aforementioned components are separated
from `uiTemplateCacheFile` into their own functions (e.g.
`uiTemplateCacheFileVelocity` for the velocity one), which are in turn
called from the specific panel creation routines of the modifiers and
constraints (for constraints, the functions are exposed to the RNA).

`uiTemplateCacheFile` now only shows the properties for the file path,
and in the case of constraints, the scale property.

The properties that are only defined per modifier (like the velocity
scale), are shown in the proper modifier layout panel if applicable.

Reviewed By: sybren

Differential Revision: https://developer.blender.org/D13652
2021-12-23 18:05:26 +01:00
Christoph Lendenfeld
7a71a95f32 Graph Slider Ops: Show error when no valid keys are found
When using graph slider operators like D9374
it showed a warning when no keys were selected.
However since that stops the modal operation it should be an Error.
Also the message was misleading
since it could error for different reasons than stated.

Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D13655
Ref: D13655
2021-12-23 14:31:28 +01:00
d71009d980 Avoid exception when no weight paint settings exist
Just an extra check for `None` before accessing its properties.
2021-12-23 14:10:24 +01:00
025c921416 Cleanup: remove BKE_animdata_driver_path_hack
The `BKE_animdata_driver_path_hack()` function has had almost no effect
since rB51b796ff1528, and basically boils down to:

```
return base_path ? base_path : RNA_path_from_ID_to_property(ptr, prop);
```

Since `base_path` was `NULL` in the majority of cases, it's just been
replaced by a direct call to `RNA_path_from_ID_to_property()`. The
conditional now just appears in one remaining case.

This relates to T91387.

Reviewed By: mont29

Differential Revision: https://developer.blender.org/D13646
2021-12-23 13:49:58 +01:00
00965c98cb LibOverride: protect better against using on complex inter-dependency cases.
Do not allow 3DView operator to run on the liboverride of an
instantiating Empty object. And tweak behavior in the Outliner
operations too.

Related to T94226.

Note that this remains fairly exotic, bad idea not recommended cases,
such complex inter-dependencies between different libraries inside a
same liboverride hierarchy is just not possible to handle properly.
2021-12-23 10:13:07 +01:00
710e279b19 Fix missing type declaration compile error
rBbd3bd776c893 broke compilation here due to missing type declaration
for basic types as the source file is not including this header. In any
case, it is the responsibility of header files to include headers for
types used by value in function parameters or struct definitions.
2021-12-23 08:01:43 +01:00
41f3164e57 Cleanup: typo in comment 2021-12-23 07:53:47 +01:00
1931387799 Fix: Curve trim node test failure
Caused by 60c59d7d61. The position wasn't copied into the correct
place on each spline. Somehow I didn't catch that in the tests I ran.
2021-12-22 18:38:30 -06:00
8f89196be2 Fix T94232: No selection with set material node empty material list
If the input mesh had no materials already, the new material would
become the only material on the mesh, meaning the material was
added to all of the faces, instead of just the selected faces.

The mesh primitive nodes in geometry nodes already add an empty
slot by default, so this only affects outside geometry.

The fix is just adding an empty slot before the new slot, so the
non-selected material indices can still point to an empty slot.

Differential Revision: https://developer.blender.org/D13654
2021-12-22 18:15:21 -06:00
60c59d7d61 Cleanup: Remove spline add_point method, refactor mesh to curve node
It's better to calculate the size of a spline before creating it, and this
should simplify refactoring to a data structure that stores all point
attribute contiguously (see T94193). The mesh to curve conversion is
simplified slightly now, it creates the curve output after gathering all
of the result vertex indices. This should be more efficient too, since
it only grows an index vector for each spline, not a whole spline.
2021-12-22 17:39:35 -06:00
c593db5a2f Nodes: Add link drag search support for map range node
Previously only the float version of the node was connected to.
This adds connection operations for vector sockets, and exposes
the "Steps" socket properly when it's selected.
2021-12-22 17:22:10 -06:00
6b662ebdb9 Cleanup: Return early 2021-12-22 17:18:37 -06:00
9033d270d5 Fix: Extra space at the front of "Sample Curve" node name 2021-12-22 17:13:16 -06:00
dca5be9b94 Fix: Wrong node link drag search menu items for attribute statistic
Caused by capturing local variables by reference in a function that
outlives the scope it was created in. Also use a more generic function
for the first two inputs.
2021-12-22 16:45:41 -06:00
14621e7720 Fix: Potential use after scope in curve to mesh node
I don't think this has been visible, since I only ran into it after
changing other code that affected this. However, some attributes
can keep a reference to the source component to use when tagging
caches dirty (like the position attribute tagging the normals dirty).
Here, the component was created inside a function, then the attributes
were used afterwards.

Also add some comments warning about this in the header file.
2021-12-22 16:29:00 -06:00
Michael
c6e7fc9744 Fix: Large stack allocation in compositor
When COM_EXPORT_GRAPHVIZ is enabled, DebugInfo::graphviz
uses a char[1000000] as local variable. When this function
is called this is allocated on the stack, which has a size
of just 1MB on mac and may cause a stack overflow.

This patch allocates the memory on the heap and frees
the memory at the end of the function.

Reviewed By: LazyDodo

Differential Revision: https://developer.blender.org/D13628
2021-12-22 13:49:52 -07:00
0fd72a98ac Cleanup: Avoid adding points to splines sequentially
This should be faster because it avoids reallocating the internal
vectors when the size is known beforehand, but it may also help
a potential refactor to a different data structure (see T94193).
2021-12-22 14:35:46 -06:00
b4f978e901 Fix: Missing update when toggling node mute
Toggling node mute doesn't cause node trees to reevaluate after
rB7e712b2d6a0d257. Toggling a link mute still works though. To fix this,
the operator tags the node and node with a new update tag function
(that uses an existing tag internally).

Differential Revision: https://developer.blender.org/D13653
2021-12-22 14:00:34 -06:00
586e2face6 Fix T93408: Snap performance regression at high poll rate
Caused by {rBfba9cd019f21f29bad1a6f3713370c5172dbc97f}.

The snap timer was accidentally modified and damaged.
2021-12-22 16:11:05 -03:00
e2a9e7e803 Nodes: Remove unnecessary node tree socket tagging
`SOCK_IN_USE` is now set in `update_socket_used_tags` in
`node_tree_update.cc` when a node tree is changed.
It doesn't need to run every single redraw. Removing this
results in a small speedup of 0.4 ms when drawing a tree
with about 4000 nodes (from about 70 ms total).

Differential Revision: https://developer.blender.org/D13645
2021-12-22 11:25:55 -06:00
6a71b2af66 Mesh: Parallelize bounding box calculation (WIP)
This replaces the single-threaded calculation of mesh min and max
positions with a `parallel_reduce` loop. Since the bounding box
of a mesh is retrieved quite often (at the end of each evaluation,
currently 2(?!) times when leaving edit mode, etc.), this makes for a
quite noticeable speedup actually.

On my Ryzen 3700x and a 4.2 million vertex mesh, I observed
a 4.4x performance increase, from 14 ms to 4.4 ms.

I added some methods to `float3` so they would be inlined, but
they're also a nice addition, since they're used often anyway.

Differential Revision: https://developer.blender.org/D13572
2021-12-22 11:04:03 -06:00
3579a9e0fc Cleanup: Remove debug print. 2021-12-22 17:53:00 +01:00
921708fc76 Fix (unreported) potential bug in collections parenting update code.
Own mistake in rB2ef192a55b2c. Did not seem to have any visible effect
though...
2021-12-22 17:50:10 +01:00
f577abc5cd Cleanup: Use LISTBASE_FOREACH_ macros. 2021-12-22 17:39:53 +01:00
8d3e57f338 Fix T93799: Outliner: Remaping objects could result in duplicates in a collection.
Fix is similar to how CollectionObject with NULL object pointers are handled.

Using one of the 'free' pad bytes in Object_Runtime struct instead of a
gset (or other external way to detect object duplicates), as this is
several times faster.

NOTE: This makes remapping slightly slower again (adds 10 extra seconds
to file case in T94059).

General improvements of remapping time complexity, especially when
remapping a lot of IDs at once, is a separate topic currently
investigated in D13615.
2021-12-22 17:34:13 +01:00
902318f0fd Fix part of T93799: Outliner: Remap Users crash (for ID Type Object).
This commit fixes the crash itself, however this can still lead to a
same collection 'owning' the same object several time.

Issue here was a bad assumption in layer resync code, that would lead to
removing valid objects from the viewlayer's `object_bases_hash` in
`BKE_layer_collection_sync`, when deleting no-more-used bases, in case
of bases duplicate.
2021-12-22 17:03:20 +01:00
978a930d9c Fix: Build issue on 32 bit archs
The cast to size_t leads to a build issue on 32
bit archs. cursor_delim_type_utf8 expects an int
so an additional cast to size_t is not required.

Reported by user frispete on devtalk.
2021-12-22 08:39:33 -07:00
aba91a745a Fix T93999: GPencil Box tool allows decreasing subdiv, but not increase
The problem was the number of points for each edge of the box was wrong and the wheelmouse effect was anulated.

Also fixed the value displayed in the status bar to keep consistency with subdivision value.

Reviewed By: lichtwerk

Maniphest Tasks: T93999

Differential Revision: https://developer.blender.org/D1363
2021-12-22 16:07:48 +01:00
dbbf0e7f66 Nodes: Improve node tree copy performance
When copying a full node tree, we can avoid an O(n^2) loop finding a
unique name for every node if we assume they already have unique names.
That is a reasonable assumption, since unique names are verified
elsewhere when adding a new node.

Copying a node tree with about 4000 nodes took 42 ms before,
now it takes 6 ms.

Differential Revision: https://developer.blender.org/D13644
2021-12-22 08:52:46 -06:00
fdc4a1a590 Nodes: Refactor to remove node and socket "new" pointers
These pointers point to the new nodes when duplicating,
and their even used to point to "original" nodes for
"localized" trees. They're just a bad design decision
that make code confusing and buggy.

Instead, node copy functions now optionally add to a map
of old to new socket pointers. The case where the compositor
abused these pointers as "original" pointers are handled
by looking up the string node names.

Differential Revision: https://developer.blender.org/D13518
2021-12-22 08:47:46 -06:00
d6224db8f1 Geometry Nodes: improve multi socket handling in evaluator
Previously, the values passed to a multi-input socket were stored
in the order that they arrived in. Then, when the values are accessed,
they are sorted depending on the link order.

Now, the ordering is determined in the beginning before execution starts.
Every value is assigned to the right index directly, avoiding the sort
in the end. This makes the ordering more explicit.
2021-12-22 13:16:01 +01:00
2ce2bffc4d Fix T94295: VSE fades error when no suitable sequences selected
This errored out in two scenarios:
- current frame not in strips framerange (this was reported)
- no strips selected at all

Now handle these cases properly in the operator and give appropriate
report info.

Maniphest Tasks: T94295

Differential Revision: https://developer.blender.org/D13642
2021-12-22 09:28:31 +01:00
d2bf60cc17 Cleanup: Clang tidy, restore alphabetical sorting 2021-12-21 14:32:22 -06:00
Germano Cavalcante
6db0919724 Fix T94191: correct (time) translation headers not showing DeltaX
Caused by {rBb0d9e6797fb8}

For the header (both Graph Editor case in general `headerTranslation` as
well as `headerTimeTranslate`) we are interested in deltas values
(not absolute values).

Since culprit commit, `snapFrameTransform` was not working with deltas
anymore, but we have to compensate for this.

For the Graph Editor, this only worked "by accident" in rB7192e57d63a5,
since `ival` is still zero at this point.

So now, reacquire the delta right after the snap operation.

Also use a more appropriate center value in the translate operator.

Maniphest Tasks: T94191

Differential Revision: https://developer.blender.org/D13641
2021-12-21 13:00:51 -03:00
aa7105f759 Cleanup: use BKE_pose_is_layer_visible in more places
This was added in rBd13970de8627, now use in more places.
2021-12-21 16:45:46 +01:00
bdbd0cffda Nodes: Improve performance when freeing a node tree
This commit makes freeing a node tree about 25 to 30 times faster.
Freeing a node tree happens whenever it is edited. Freeing a node
tree with about 4000 nodes went from 30-50ms to about 2 ms.

This was so slow before because for every node that was freed
when freeing the node tree, `node_free_node` looped over all
other nodes to detach frames, and then looped over all links to
remove any links connected to the node. That was all pointless
work because everything else is about to be freed anyway.

Instead, move that "detaching" behavior to the dedicated function
for removing a single node, and to the "local" version of the free
function to be safe, since I know less about what that version expects.

Differential Revision: https://developer.blender.org/D13636
2021-12-21 09:23:48 -06:00
8cf1994455 Fix T93960: Asset Catalogs I/O fails with unicode file paths on Windows
On Windows, encode file paths as UTF-16 before trying to open the file
for reading/writing.

This introduces a new class `blender::fstream`, which wraps
`std::fstream` and provides this UTF-16 encoding. This class should also
be used in other areas, like the Alembic importer/exporter.

Manifest Task: T93960

Reviewed By: JacquesLucke

Differential Revision: https://developer.blender.org/D13633
2021-12-21 15:54:09 +01:00
d66a6525c3 Assets: log message when catalog definitions cannot be loaded
Log a message (via `CLOG`) when asset catalog definitions cannot be
loaded.

Reviewed by @jacqueslucke in D13633
2021-12-21 15:53:57 +01:00
7e712b2d6a Nodes: refactor node tree update handling
Goals of this refactor:
* More unified approach to updating everything that needs to be updated
  after a change in a node tree.
* The updates should happen in the correct order and quadratic or worse
  algorithms should be avoided.
* Improve detection of changes to the output to avoid tagging the depsgraph
  when it's not necessary.
* Move towards a more declarative style of defining nodes by having a
  more centralized update procedure.

The refactor consists of two main parts:
* Node tree tagging and update refactor.
  * Generally, when changes are done to a node tree, it is tagged dirty
    until a global update function is called that updates everything in
    the correct order.
  * The tagging is more fine-grained compared to before, to allow for more
    precise depsgraph update tagging.
* Depsgraph changes.
  * The shading specific depsgraph node for node trees as been removed.
  * Instead, there is a new `NTREE_OUTPUT` depsgrap node, which is only
    tagged when the output of the node tree changed (e.g. the Group Output
    or Material Output node).
  * The copy-on-write relation from node trees to the data block they are
    embedded in is now non-flushing. This avoids e.g. triggering a material
    update after the shader node tree changed in unrelated ways. Instead
    the material has a flushing relation to the new `NTREE_OUTPUT` node now.
  * The depsgraph no longer reports data block changes through to cycles
    through `Depsgraph.updates` when only the node tree changed in ways
    that do not affect the output.

Avoiding unnecessary updates seems to work well for geometry nodes and cycles.
The situation is a bit worse when there are drivers on the node tree, but that
could potentially be improved separately in the future.

Avoiding updates in eevee and the compositor is more tricky, but also less urgent.
* Eevee updates are triggered by calling `DRW_notify_view_update` in
  `ED_render_view3d_update` indirectly from `DEG_editors_update`.
* Compositor updates are triggered by `ED_node_composite_job` in `node_area_refresh`.
  This is triggered by calling `ED_area_tag_refresh` in `node_area_listener`.

Removing updates always has the risk of breaking some dependency that no
one was aware of. It's not unlikely that this will happen here as well. Adding
back missing updates should be quite a bit easier than getting rid of
unnecessary updates though.

Differential Revision: https://developer.blender.org/D13246
2021-12-21 15:18:56 +01:00
1abf2f3c7c Cleanup: clang format
Missed in rB7c9e4099854a, sorry.
2021-12-21 14:28:04 +01:00
d13970de86 Fix T92930: Outliner "Show Active" bone fails in certain situations
Outliner would frame the armature object instead of the bone if the bone
was on a hidden armature layer.

Similar to issues reported in e.g. T58068 and T80464, this is due to the
fact that `BKE_pose_channel_active` always checks for the armature layer
(and returns NULL if a bone is not on a visible armature layer).

Now propose to make this layer check **optional** (and e.g. from the
Outliner be more permissive). This also introduces
`BKE_pose_channel_active_if_layer_visible` which just wraps
`BKE_pose_channel_active` with the check being ON.

Maniphest Tasks: T92930

Differential Revision: https://developer.blender.org/D13154
2021-12-21 14:07:48 +01:00
fac42e3fa1 Tests: initialise BKE callbacks before loading blend file
Initialise the BKE callback system in
`BlendfileLoadingBaseTest::SetUpTestCase()`. This allows certain tests
to run in debug mode (when `BLI_assert` is enabled).
2021-12-21 11:12:47 +01:00
68f1b2c671 Fix T93757: Do not force-instantiate indrectly linked objects in linking case. 2021-12-21 10:09:01 +01:00
bb4de77b82 Fix T93839: Copy/Paste of empty instantiating a collection.
Do not also instantiate a collection in the view layer, if it is already
instantiated through an empty object.
2021-12-21 09:52:53 +01:00
c0f06ba614 Fix build error in debug builds from recent commit
r7acd3ad7d8e58b913c5 converted a pointer to a reference,
but an assert still compares the variable to a pointer.
2021-12-20 22:48:31 -06:00
e4de5b4657 Fix T94280: Crash when splitting meta strip
This happens because in `SEQ_time_update_sequence` function
`SEQ_get_meta_by_seqbase` returns uninitialized value. This isn't nice,
but it shouldn't happen in first place. Problem is, that
`SEQ_edit_strip_split` does move strips into detached `ListBase`, so
other functions can't see them anymore. Detached `ListBase` is used
solely to preserve relationships during duplication.

Move strips to original `ListBase` immediately after duplication and
return `NULL` if `SEQ_get_meta_by_seqbase` can't find meta strip.

Splitting itself can still rely on fact, that number of original and
duplicated strips is same and they are placed next to each other in
exactly same order at the end of original `ListBase`.
2021-12-21 05:27:46 +01:00
5457b66301 Cleanup: Use span instead of raw pointer
This is a followup to the previous commit.
2021-12-20 18:10:29 -06:00
7acd3ad7d8 Cleanup: Use simpler loops in weld modifier
In this commit I changed many loops to range-based for loops.
I also removed some of the redundant iterator variables, using
indexing inside the loop instead. Generally an optimizing compiler
should have no problem doing the smartest thing in that situation,
and this way makes it much easier to tell where data is coming from.

I only changed the loops I was confident about, so there is still more
that could be done in the future.

Differential Revision: https://developer.blender.org/D13637
2021-12-20 18:03:06 -06:00
59221476b0 Fix T94262: Grease Pencil Blur Effect DoF mode wrong
This was visible outside of camera view and was not respecting the
"Depth of Field" checkbox on the Camera properties.

Now return early if DoF should not be visible.

Maniphest Tasks: T94262

Differential Revision: https://developer.blender.org/D13631
2021-12-20 21:52:42 +01:00
399f84d479 Run clang-format 2021-12-20 14:30:29 -05:00
d93dd85951 Sculpt: split sculpt.c into three files
Sculpt.c is now three files:

* Sculpt.c: main API methods and the brush stroke operator
* Sculpt_brushes.c: Code for individual brushes.
* Sculpt_ops.c: Sculpt operators other than the brush stroke operator.

TODO: split brush stroke operator into a new file (sculpt_stroke.c?).
2021-12-20 14:20:51 -05:00
Wannes Malfait
e9092110ff Fix T94144: Duplicate edges in dual mesh node
The duplicated edges were caused by 'oversubdivided' edges, i.e. edges
where some of the vertices on them are only connected to two polygons.
The fix finds these vertices and 'dissolves' them so that only one edge
is created.

For most 'normal' meshes this shouldn't occurr, or only very little, so
the performance impact of this change should be neglegible. In practice
this is also avoidable by triangulating the mesh first.

Differential Revision: https://developer.blender.org/D13445
2021-12-20 13:08:05 -06:00
bdb5852e28 Cleanup: Remove mesh valid assertions in debug builds
These are useful for development, but when the primitive nodes
aren't actively changing, the performance cost is not worth it.
2021-12-20 12:46:40 -06:00
Piotr Makal
70de992afe Fix: Incorrect assert conditions in NURBSpline
This fix provides better conditions for asserts in `NURBSpline::knots`
method accounting for cyclic NURBS curves.

Differential Revision: https://developer.blender.org/D13620
2021-12-20 12:43:14 -06:00
76cb11e332 Cleanup: Remove unused node type flag
This flag was checked, but not set anywhere.
2021-12-20 11:03:24 -06:00
82fa2bdf3f Cleanup: Comment formatting in node.cc
Also remove a SCOPED_TIMER I added by mistake in a previous commit.
2021-12-20 10:48:01 -06:00
cb96435047 Geometry Nodes: Parallelize mesh grid primitive
On a Ryzen 3700x, this ended up 2.5x faster than before. More
benchmarking details are included in the differential revision.

For smaller grids, all this should do is increase the
code size a bit, and add a few more if statements.

Differential Revision: https://developer.blender.org/D13617
2021-12-20 10:34:31 -06:00
1d25ba175e Cleanup: Remove unused arguments 2021-12-20 09:57:18 -06:00
6c33a0f6d6 Update our USD 21.02 patch to support gcc-11
There are two issues in USD code that break building it with gcc-11,
one (in `demangle.cpp`) was already fixed upstream, the other (in
`singularTask.h`) is still pending (reported upstream, see
https://github.com/PixarAnimationStudios/USD/issues/1721).

CC #platforms_builds_tests_devices project.
2021-12-20 16:17:42 +01:00
edb3ab0617 Fix T94251: Cycles wrong pointcloud normal for instanced objects
Refactor code a bit also so we need to do fewer matrix transforms for shader
data setup of points and curves.
2021-12-20 14:14:43 +01:00
e2e7f7ea52 Fix Cycles OptiX crash with 3D curves after point cloud changes
Includes refactoring to reduce the number of bits taken by primitive types,
so they more easily fit in the OptiX limit.
2021-12-20 14:14:43 +01:00
Maxime Chambonnet
5adc06d2d8 install_deps: Fix OIIO and OSL build with OpenEXR
Root path variables for those libraries is now using the 'standard' naming
scheme.

With tweaks/cleanups from @mont29.

Reviewed By: mont29

Differential Revision: https://developer.blender.org/D13591
2021-12-20 12:52:07 +01:00
deb3d566a5 BLI: fix Vector.prepend declaration
Using `&&` there was a typo. With `&&` the `prepend` method
could not be called with a const reference as argument.
2021-12-20 10:46:43 +01:00
ffd1a7d8c8 Fix T93570: VSE image transforms in preview dont autokey
This was basically not implemented, do this via
`ED_autokeyframe_property` in a new dedicated function in
special_aftertrans_update.

Maniphest Tasks: T93570

Differential Revision: https://developer.blender.org/D13608
2021-12-20 09:12:37 +01:00
fdb2167b4a Docs: use doxygen formatting for BLI
Differentiate doc-strings from title/section text.
2021-12-20 19:07:10 +11:00
5c63c0a58c Docs: use doxygen formatting for DNA
Differentiate doc-strings from title/section text.
Also use explicit doxygen references to struct members
so it's not ambiguous which member is being referenced.

Note that these changes aren't complete (some files weren't touched).
2021-12-20 19:07:10 +11:00
528fc35fed Docs: minor correction to doc-string
The text editor no longer accumulates changes.
2021-12-20 17:42:27 +11:00
0b69793487 Fix T94254: Crash using view_all operator in VSE
Caused by `NULL` dereference in `SEQ_meta_stack_active_get()`.

Check if `Editing` is `NULL` before accessing meta stack.
2021-12-20 02:18:59 +01:00
34fac7a670 Fix: Meta strip not created with alpha over blend mode
Meta add function wasn't updated. All strips now use alpha over
blending, so set it in `SEQ_sequence_alloc()`.
2021-12-20 01:52:40 +01:00
cc367908cd Cleanup: Remove more texture nodes preview handling
Similar to the previous commit, this allowed removing a function to set
a single pixel of a node preview.
2021-12-19 18:37:57 -06:00
0966eab8e9 VSE: Clamp sound strip when adding movie strip
When adding multiple movie strips and sound stream is longer than video,
this results in gaps between video strips, which is undesirable.
2021-12-20 01:35:25 +01:00
c27d0cb7b9 Cleanup: Remove more no-op node preview code
This is a follow-up to rB43875e8dd1d76ee, removing some
processing of non-existent node previews in the shader and
texture nodes systems.
2021-12-19 18:11:47 -06:00
eac6aff741 Cleanup: Const arguments, remove unused argument 2021-12-19 17:57:51 -06:00
6878897a6d Cleanup: Clang tidy lamda function name 2021-12-18 17:18:41 -06:00
8a91673562 Cleanup: Move weld modifier to C++
This moves `MOD_weld.cc` to C++, fixing compiler warnings
coming from the change. It also goes a little bit further and converts
the code to use C++ data structures: `Span`, `Array`, and `Vector`.
This makes the code more shorter and easier to reason about, and
makes memory maneagement more automatic.

Differential Revision: https://developer.blender.org/D13618
2021-12-18 13:42:48 -06:00
491b9cd2b9 Fix: Build error from missing include 2021-12-18 13:41:56 -06:00
b9861055ad Fix: Link drag search missing field nodes from function nodes
When dragging from the inputs of function nodes, other function
nodes wouldn't connect, because their socket declaration field types
weren't set correctly. Instead, they relied on code properly checking
the *node* declaration's `is_function_node()` method. However,
that increases complexity and requires passing the node instead of
just the socket in more places. Instead, set the proper field types
in the socket declaration during building.

Differential Revision: https://developer.blender.org/D13600
2021-12-18 13:31:06 -06:00
d15d512a68 Fix T94173: Missing update for frame node size
Before d56bbfea7b, nodes were updated (size calculated and
buttons added) in reverse order. Instead, now calculate the size of
frame nodes after all other nodes. Separating the drawing further
may be a good step to removing the O(n^2) loop later on.
2021-12-18 13:27:05 -06:00
4c3f57ffa7 Cleanup: compiler warnings with clang
Includes use of memcpy to avoid warnings about deprecated members.
2021-12-18 18:36:34 +01:00
3471b0016c Fix T94215: compositer denoise node UI wrongly shows as disabled
After recent refactoring in 4e98d974b5.
2021-12-18 05:53:38 +01:00
214a56ce8c GPU: add memory barriers for vertex and index buffers
This adds memory barriers to use with `GPU_memory_barrier` to ensure that
writes to a vertex or index buffer issued before the barrier are
completed after it, so they can be safely read later by another shader.

`GPU_BARRIER_VERTEX_ATTRIB_ARRAY` should be used for vertex buffers (`GPUVertBuf`),
and `GPU_BARRIER_ELEMENT_ARRAY` should be used for index buffers (`GPUIndexBuf`).

Reviewed By: fclem

Differential Revision: https://developer.blender.org/D13595
2021-12-18 00:55:54 +01:00
Christoph Lendenfeld
8e31e53aa0 Function to return a list of keyframe segments
Add a function that returns a list of keyframe segments
A segment being a continuous selection of keyframes
Will be used by future operators in the graph editor

Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D13531
Ref: D13531
2021-12-17 22:43:02 +00:00
76fd2ff9fa Fix T94184: Outliner: Collection dragging tooltip is not updating
In the context of the dragdrop tooltip, the event referenced to the window
is out of date and contains invalid `mval` values.

Avoid using `win->eventstate` as much as possible.
2021-12-17 19:13:14 -03:00
4b21067aea Fix T94116: Drivers can have multiple variables with same name
The RNA setter now ensures that driver variables are uniquely named
(within the scope of the driver).

Versioning code has been added to ensure this uniqueness. The last
variable with the non-unique name retains the original name; this
ensures that the driver will still evaluate to the same value as before
this fix.

This also introduces a new blenlib function `BLI_listbase_from_link()`,
which can be used to find the entire list from any item within the list.

Manifest Task: T94116

Reviewed By: mont29, JacquesLucke

Maniphest Tasks: T94116

Differential Revision: https://developer.blender.org/D13594
2021-12-17 17:31:15 +01:00
2648d920d8 Theme: Node Group color only needs RGB, not RGBA
The node group alpha theme was used for the overlay drawing in the node
editor. Since this was removed (919e513fa8) the alpha channel doesn't
need to be exposed anymore.

Reported as part of T93654.
2021-12-17 16:47:06 +01:00
Hans Goudey
0aabaa4583 Cleanup: Use signed integers in the weld modifier
The style guide mentions that unsigned integers shouldn't be used to
show that a value won't be negative. Many places don't follow this
properly yet. The modifier used to cast an array of `uint` to `int` in
order to pass it to `BLI_kdtree_3d_calc_duplicates_fast`. That is no
longer necessary.

Differential Revision: https://developer.blender.org/D13613
2021-12-17 12:40:01 -03:00
552dce0de7 Cleanup: quiet warning due to incompatible pointer types 2021-12-17 15:46:00 +01:00
77760194fe Cleanup: use new c++ guarded allocator api in some files 2021-12-17 15:42:28 +01:00
a3ad5abf2f Allocator: simplify using guarded allocator in C++ code
Using the `MEM_*` API from C++ code was a bit annoying:
* When converting C to C++ code, one often has to add a type cast on
  returned `void *`. That leads to having the same type name three times
  in the same line. This patch reduces the amount to two and removes the
  `sizeof(...)` from the line.
* The existing alternative of using `OBJECT_GUARDED_NEW` looks a out
  of place compared to other allocation methods. Sometimes
  `MEM_CXX_CLASS_ALLOC_FUNCS` can be used when structs are defined
  in C++ code. It doesn't look great but it's definitely better. The downside
  is that it makes the name of the allocation less useful. That's because
  the same name is used for all allocations of a type, independend of
  where it is allocated.

This patch introduces three new functions: `MEM_new`, `MEM_cnew` and
`MEM_delete`. These cover the majority of use cases (array allocation is
not covered).

The `OBJECT_GUARDED_*` macros are removed because they are not
needed anymore.

Differential Revision: https://developer.blender.org/D13502
2021-12-17 15:42:28 +01:00
Phil Stopford
c0d96ca9a5 UI: make Remap User dialog in outliner wider
The previous size was too small for common object names.

Differential Revision: https://developer.blender.org/D13604
2021-12-17 15:21:12 +01:00
Alessio Monti di Sopra
3b965ba10b UI: Fix node socket alignment in some cases
The patch fixes some misalignments in the nodes' sockets/options
recently introduced in 26d2caee3b, while maintaining the original
fix for T92268.

The original fix made the top padding always of the same size; while
that works when the first row of the other node is `Socket | Socket`,
it doesn't for other more common cases, `like Socket | Node Option`,
where the text results misaligned.

Differential Revision: https://developer.blender.org/D13451
2021-12-17 08:04:28 -06:00
b386f960f6 Fix error in Cycles geometry update tagging after pointcloud addition
Thanks to Christophe Hery for spotting this.
2021-12-17 14:59:24 +01:00
Michael
67734d1853 Fix T94142, T94182: Cycles metal broken after pointcloud changes
Missing ccl_private form an older patch.

Differential Revision: https://developer.blender.org/D13612
2021-12-17 14:46:52 +01:00
f3c1d0e3a3 Fix T94137: GPencil: Eraser does not erase first point
The eraser checks the current, previous and next point (and sets pc0,
pc1 & pc2 corresponding to that for futher occlusion/brush/clipping
checks). For the very first point, it sets pc0 to pc1 [which makes sense,
there is no previous point, so we should assume the previous segment is
"visible" as soon as the first point is], but does so *before* pc1 is
even calculated. This makes following occlusion/brush/clipping checks
work with zero values [which leads to no earsing in most cases].

Now *first* calculate pc1, *then* set pc0 to pc1.

Maniphest Tasks: T94137

Differential Revision: https://developer.blender.org/D13593
2021-12-17 14:38:11 +01:00
7c9e409985 Outliner ID Remap Users: hide ID type from the UI
The correct type should be set by invoke already, changing it to a non-
matching type (e.g. trying to remap Mesh users with a Camera block) does
not really make sense afaict, reason being that we would be presented
with the "Invalid old/new ID pair" message in such case anyways (code
checks GS(old_id->name) == GS(new_id->name)).

This alone wouldnt be a pressing issue, but since doing this with an
object ID type crashes atm., it seems to make sense to clean this up now
(of course the crash should be looked into, but this is for a separate
patch -- if that is solved, we could also think about adding the "Remap
Users" entry back in the context menu for objects as well [which was
removed in rB17bd5c9d4b1e for some reason]).

Part of T93799.

Differential Revision: https://developer.blender.org/D13512
2021-12-17 13:58:25 +01:00
9690b7c91b Fix (unreported): missed running versioning code in some files
The versioning code was accidentally put not at the very bottom.
That lead to a situation where it wasn't run on some files that happened
to be within a specific short time frame.

Since the versioning code is idempotent, it can just run again on existing
files. Therefore, this commit just moves it back to the bottom so that it
is executed on all files again.

Broken Commit: rB5b61737a8f41688699fd1d711a25b7cea86d1530.
2021-12-17 13:49:53 +01:00
367b484841 Fix T94166: set handle position node crashed after refactor
This was an oversight in rB8e2c9f2dd3118bfdb69ccf0ab2b9f968a854aae4.
2021-12-17 10:40:01 +01:00
b9d6271916 GPU: Sort SSBOs In Shader Interface.
SSBOs weren't sorted, but other types were. This was an
oversight when SSBOs were introduced (GPU compute pipeline).

Issue identified by @fclem.
2021-12-17 08:16:06 +01:00
7788293954 Cleanup: Silenced unused var warnings. 2021-12-17 08:04:35 +01:00
0e1bb232e6 UI: move "undo history" from a custom popup to a menu type
This lets the undo history expand as a regular sub-menu
instead of being a popup.

Also disable the active undo step menu item as this is a no-op.
2021-12-17 17:28:57 +11:00
c536eb410c Cleanup: spelling in comments 2021-12-17 15:26:28 +11:00
6b3c3454d3 Cleanup: use more common naming suffix for item callbacks 2021-12-17 15:11:16 +11:00
Johannes Jakob
1f50c2876e UI: Correct "QuickTime" Spelling
Change the spelling of the QuickTime output video container item from
"Quicktime" to "QuickTime"

Differential Revision: https://developer.blender.org/D10929

Reviewed by Harley Acheson
2021-12-16 17:59:06 -08:00
b7e151d876 Cleanup: Simplify logic in set material node 2021-12-16 15:44:50 -06:00
da2c564bb0 Geometry Nodes: Support point clouds in the set material node
Now that point clouds can be rendered with cycles, it makes sense
to allow assigning a material to them. Note that like volumes, they
only support a single material though.
2021-12-16 15:39:59 -06:00
35b1e9fc3a Cycles: pointcloud rendering
This add support for rendering of the point cloud object in Blender, as a native
geometry type in Cycles that is more memory and time efficient than instancing
sphere meshes. This can be useful for rendering sand, water splashes, particles,
motion graphics, etc.

Points are currently always rendered as spheres, with backface culling. More
shapes are likely to be added later, but this is the most important one and can
be customized with shaders.

For CPU rendering the Embree primitive is used, for GPU there is our own
intersection code. Motion blur is suppored. Volumes inside points are not
currently supported.

Implemented with help from:
* Kévin Dietrich: Alembic procedural integration
* Patrick Mourse: OptiX integration
* Josh Whelchel: update for cycles-x changes

Ref T92573

Differential Revision: https://developer.blender.org/D9887
2021-12-16 20:54:04 +01:00
2229179faa Revert "Cycles-X: Add hysteresis to resolution divider algorithm"
This reverts commit d8b4275162. It causes reduced
viewport render resolution. Revert for now until I have time to look into this
more closely.
2021-12-16 18:29:27 +01:00
bd12855110 Cleanup: Move curve.c to C++
I need this for a refactor I'm looking into for bounding boxes.
It may be helpful in the future when using `CurveEval` in more places.

Differential Revision: https://developer.blender.org/D13596
2021-12-16 11:22:48 -06:00
197b3502b0 LibOverride: Further improve creation/resync pre-process speed.
Fully get rid of `BKE_collection_object_find` in
`lib_override_group_tag_data_object_to_collection_init`, even if only
used a few times this function was still noticeable in profiling data.

Now instead loop over collections' objects to build required
object-to-collections mapping.

Adds an extra 5-10% speed-up compared to previous commit rB0624fad0f3ff.

Related to T94059.
2021-12-16 18:15:19 +01:00
9765ddf4eb Fix T94109: 3d cursor crash when using shortcut
Operator was erroneously starting edge_slide operation.

Revert part of the changes in rB3fab16fe8eb4 as obedit_type was being
confused with object_mode.
2021-12-16 13:45:35 -03:00
0624fad0f3 LibOverride: Improve speed of resync and creation of liboverrides.
`BKE_collection_object_find` has extremely bad performances (very high
time complexity). While ideally this should be fixed in that API, for
now cache its results once at the beginning of the resync/creation
process.

This makes loading of complex production files with a lot of
liboverrides to resync three to four times faster.

Thanks to @brecht for the profiling in T94059.
2021-12-16 17:27:38 +01:00
b1f865f9d3 LibOverride: Cleanup log about unfound subitems.
Not finding subitem when its name and index are invalid/unset is
expected behavior, and does happen when e.g. inserting a new constraint
or modifier at the begining of the stack.
2021-12-16 16:48:00 +01:00
0f589f8d3c Fix T94115: Selecting current action in undo history undoes all
When selecting the current undo step there is no need to do anything.

Fix and minor refactor to de-duplicate refreshing after running
undo/redo & undo history.
2021-12-16 23:40:11 +11:00
d6902668e3 UI: support Copy To Selected for id-properties [GN modifier properties]
Both {key Alt} editing behavior as well as `Copy To Selected` were not
working on geometry nodes modifiers (even if these matched exactly -
having the same nodegroup - on multiple objects)

Reason is that code checks pointer equality on the discovered properties
[geometry nodes modifier properties are stored as ID properties], but
these are not the same across objects (since these are fetched from
NodesModifierSettings - which are different on different objects).

note: if general custom properties are "API defined" on existing classes,
this was working, we are getting the exact property for different IDs in
this case

Now be more permissive with ID properties not defined on classes in
general and dont check pointer equality for them. For ID properties on
specific IDs (not the ones defined on classes) this //might// be undesired
(havent spotted issues though, even if equally named ID properies with
different types existed -- this then simply does nothing).

For geometry nodes modifiers, new code also checks if the nodegroups are
the same [since generic naming "Input_XXX" is shared for all modifiers --
and starting to copy over things to unrelated modifiers is not desired
here].

Fixes T93983.

Maniphest Tasks: T93983

Differential Revision: https://developer.blender.org/D13573
2021-12-16 13:18:19 +01:00
6da23db5e0 UI: deduplicate code for Copy To Selected and Alt-button tweaking
This resolves an old TODO to deduplicate code in copy_to_selected_button
& ui_selectcontext_begin.
This is also in hindsight of adding id-property support [incl. Geometry
Nodes modifier properties] for this in the next commit.
No behavior change expected here.

ref T93983 & D13573
2021-12-16 13:18:19 +01:00
3e04d37529 Cleanup: correct docstring for driver_variable_name_validate
No functional changes.
2021-12-16 12:54:25 +01:00
59774d64f0 Docs: add doc-strings for BLI_path functions 2021-12-16 21:58:04 +11:00
21c7689b77 Animation: send notifier when keyframe is inserted
`<some_id>.keyframe_insert()` now sends a notifier that animation data
was changed, so that animation-related editors can properly refresh.

Since this function is quite high-level (if necessary it creates the
Action and FCurves), I thought this would be a suitable location for the
notifier. If high keyframing speed is required, it is still recommended
to use `FCurveKeyframePoints.insert(options={'FAST'})` instead.
2021-12-16 11:26:21 +01:00
1818110459 Fix compile error on Windows. 2021-12-16 10:08:31 +01:00
8dbd406ea0 WM: various changes to file writing behavior
Saving with only a filename (from Python) wasn't being prevented,
while it would successfully write the file to the working-directory,
path remapping and setting relative paths wouldn't work afterwards
as `Main.filepath` would have no directory component.

Disallow this since it's a corner case which only ever occurs
when path names without any directories are used from Python,
the overhead of expanding the working-directory for all data saving
operations isn't worthwhile.

The following changes have been made:

- bpy.ops.wm.save_mainfile() without a filepath argument
  fails & reports and error when the file hasn't been saved.

  Previously it would write to "untitled.blend" and set the
  `G.main->filepath` to this as well.

- bpy.ops.wm.save_mainfile(filepath="untitled.blend")
  fails & reports and error as the filename has no directory component.

- `BLI_path_is_abs_from_cwd` was added to check if the path would
  attempt to expand to the CWD.
2021-12-16 16:27:35 +11:00
15c3617009 Cleanup: simplify file saving logic
Revert part of the fix from 073669dd85
that initialized the file-path on first save as it's no longer needed.

Also remove relbase argument to BLI_path_normalize as the destination
file paths shouldn't use relative locations.
2021-12-16 16:27:35 +11:00
Aaron Carlisle
4e98d974b5 Nodes: Begin splitting composite node buttons into individual files
Currently, most node buttons are defined in `drawnode.cc` however,
this is inconvenient because it requires editing many files when adding new nodes.
The goal is to minimize the number of files needed to add or update a node.

This commit moves most of the node layout functions for composite nodes into their respected
`source/blender/nodes/composite/nodes` file.

In the future, these functions will be simplified to `node_layout` once files have their own namespace.
See {D13466} for more information.

Reviewed By: HooglyBoogly

Differential Revision: https://developer.blender.org/D13523
2021-12-15 21:30:29 -05:00
5de109cc2d Remove G.relbase_valid
In almost all cases there is no difference between `G.relbase_valid`
and checking `G.main->filepath` isn't an empty string.

In many places a non-empty string is already being used instead of
`G.relbase_valid`.

The only situation where this was needed was when saving from
`wm_file_write` where they temporarily became out of sync.
This has been replaced by adding a new member to `BlendFileWriteParams`
to account for saving an unsaved file for the first time.

Reviewed By: brecht

Ref D13564
2021-12-16 11:41:46 +11:00
4b12f521e3 Cleanup: spelling 2021-12-16 11:38:09 +11:00
807efa8538 Cleanup: clang-format 2021-12-16 11:38:08 +11:00
366ec5f0f8 Cleanup: unused variable warning 2021-12-16 11:38:06 +11:00
b265b447b6 Fix various cases of incorrect filtering in node link drag search
Some nodes didn't check the type of the link's socket for filtering.
Do this with a combination of manually calling the node tree's validate
links function and using the helper function for declarations.

Also clean up a few cases that added geometry sockets manually
when they can use the simpler helper function.
2021-12-15 18:05:45 -06:00
36a830b4d3 Fix: Compare node missing from link drag search
This was missing from rB11be151d58ec0ca955f.
It uses the same approach as the quadrilateral node.
2021-12-15 17:37:55 -06:00
49311a73b8 Fix: Crash in nodes modifier with missing node group
We cannot depend on node->id being non-null for group nodes.
2021-12-15 15:00:20 -06:00
aa55cb2996 Cleanup: Use const arguments, references
Also slightly change naming to avoid camel case.
2021-12-15 14:51:58 -06:00
43875e8dd1 Cleanup: Remove no-op node preview function calls
This patch removes no-op node editor preview code (`PR_NODE_RENDER`)
and most calls to `BKE_node_preview_init_tree`. The only remaining call is
in the compositor.

 - Shader nodes previews don't seem to do anything.
 - In-node previews for the texture node system doesn't work either.

This is a first step to refactoring to remove `preview_xsize`,
`preview_ysize`, and `prvr` from nodes in DNA, aligned with
the general goal of removing runtime/derived data from data
structs.

Differential Revision: https://developer.blender.org/D13578
2021-12-15 14:27:38 -06:00
b32f5a922f Fix T93995: Cycles camera motion blur not working in right stereo view
Thanks to Michael (michael64) for identifying the solution.

Ref D13567
2021-12-15 20:47:54 +01:00
11be151d58 Node Editor: Link Drag Search Menu
This commit adds a search menu when links are dragged above empty
space. When releasing the drag, a menu displays all compatible
sockets with the source link. The "main" sockets (usually the first)
are weighted above other sockets in the search, so they appear first
when you type the name of the node.

A few special operators for creating a reroute or a group input node
are also added to the search.

Translation is started after choosing a node so it can be placed
quickly, since users would likely adjust the position after anyway.

A small "+" is displayed next to the cursor to give a hint about this.

Further improvements are possible after this first iteration:
 - Support custom node trees.
 - Better drawing of items in the search menu.
 - Potential tweaks to filtering of items, depending on user feedback.

Thanks to Juanfran Matheu for developing an initial patch.

Differential Revision: https://developer.blender.org/D8286
2021-12-15 09:51:57 -06:00
474adc6f88 Refactor: Simplify spreadsheet handling of cell values
Previously we used a `CellValue` class to hold the data for a cell,
and called a function to fill it whenever necessary. This is an
unnecessary complication when we have virtual generic arrays
and most data is already easily accessible that way anyway.
This patch removes `CellValue` and uses `fn::GVArray` to provide
access to data instead.

In the future, if rows have different types within a single column,
we can use a `GVArray` of `blender::Any` to interface with the drawing.

Along with that, the use of virtual arrays made it easy to do a
few other cleanups:
 - Use selection domain interpolations from rB5841f8656d95
   for the mesh selection filter.
 - Change the row filter to only calculate for necessary indices.

Differential Revision: https://developer.blender.org/D13478
2021-12-15 09:34:13 -06:00
d79868c4e6 Fix T93975: add more nested instance limit checks
Differential Revision: https://developer.blender.org/D13585
2021-12-15 15:46:39 +01:00
Michael
5f52684a0f Initialize the fourth and final instance variable of MemoryProxy
The constructor of MemoryProxy initializes 3 of 4 instances variables.
If a MemoryProxy is constructed and MemoryProxy::free is called
on this instance, buffer_ is undefined and 'delete buffer_;' causes errors.

Although this misuse pattern does not exist in the current codebase
it already tripped up the Address Sanitizer on various occasions
while debugging unrelated problems.

Reviewed By: jbakker

Differential Revision: https://developer.blender.org/D13569
2021-12-15 15:46:28 +01:00
1a833dbdb9 Geometry Nodes: Add Selection to Attribute Statistics
This adds a bool field selection input to the Attribute Statistics node.
This is useful for running calculations on a subset of the input field data
rather that then whole set.

Differential Revision: https://developer.blender.org/D13520
2021-12-15 08:04:49 -06:00
e85d7d5a28 Fix T93971: "Center Cursor & Frame All" fails to redraw
bda9e4238a changed smooth-view
not to redraw when there were no changes made.
Redrawing is needed for repositioning the cursor.

Subscribe to changes to the 3d cursor to ensure all view ports
are updated (not just the current one).
2021-12-16 00:15:13 +11:00
c101ded463 Cleanup: remove disabled code
Originally pointcache wasn't supported when the file wasn't saved.
Remove commented code as this hasn't been the case for a long time.
2021-12-15 23:45:11 +11:00
a9d8ff6e21 Cleanup: unused variable warning 2021-12-15 23:43:31 +11:00
883e4c089a MetaBall: optimize memory allocation for meta-ball tessellation
Double the allocation size when the limit is reached instead of
increasing by a fixed number.

Also re-allocate to the exact size once complete instead of over
allocating. This gives a minor speedup in my tests ~19% faster
tessellation for ~1million faces.
2021-12-15 23:40:39 +11:00
6a885e5d89 Fix meta-ball bound-box calculation reading past buffer bounds
This broke "test_undo.view3d_multi_mode_select" test in
"lib/tests/ui_simulate" and is likely exposed by recent changes to
bounding box calculation.

The missing check for DL_INDEX4 dates back to code from 2002 which
intended to check this but was checking for DL_INDEX3 twice
which got removed as part of a cleaned up.

This could be hidden from memory checking tools as meta-balls
over-allocate vertex arrays.
2021-12-15 23:40:33 +11:00
3a856f7967 Fix compile errors on windows. 2021-12-15 11:51:03 +01:00
6051b80dd2 Cleanup: Use pixel in stead of texels in naming. 2021-12-15 11:19:29 +01:00
723fb16343 Images: 1,2,3 channel support for transform function.
Added support for 1, 2, 3 float channel source images. Destination
images must still be 4 channels.
2021-12-15 11:09:31 +01:00
67b657f07c Fix T94082: Curve to point empty evaluated NURBS crash
This is basically the same as rBee4ed99866fbb7ab04, the fix is
simply to check if the spline has evaluated points when deciding
the offsets into the result points array.
2021-12-14 18:57:45 -06:00
644eb68524 Fix T93949: Preview Image Error When No Screen
Fix an error if "File Preview Type" is "Auto" and there is no screen.

See D13574 for details.

Differential Revision: https://developer.blender.org/D13574

Reviewed by Julian Eisel
2021-12-14 13:50:40 -08:00
40aee0b2e9 Fix possible use-after-free on error handling during VR view drawing
Whenever an exception happens in VR session code, we cancel the entire
session. Alongside that, we removed the "surface" item used to draw into
an offscreen context. This would mess up the iterator of the surface
draw loop.
Similar to 7afd84df40.
2021-12-14 20:57:39 +01:00
4cfa21f09b Fix null-pointer dereference on error handling during VR view drawing 2021-12-14 20:44:17 +01:00
7afd84df40 Fix possible use-after-free on error handling during VR view drawing
Whenever an exception happens in VR session code, we cancel the entire
session. Alongside that, we removed the "surface" item used to draw into
an offscreen context. But this may still be stored as active surface,
leading to a use-after-free when deactivating this active surface, for
example.
2021-12-14 20:13:58 +01:00
7e8912eb96 Fix Cycles compilation with CUDA / Optix after recent Map Range additions. 2021-12-14 20:01:30 +01:00
Charlie Jolly
5b61737a8f Nodes: Add vector support to Map Range node
This replaces lost functionality from the old GN Attribute Map Range node.
This also adds vector support to the shader version of the node.

Notes:
This breaks forward compatibility as this node now uses data storage.

Reviewed By: HooglyBoogly, brecht

Differential Revision: https://developer.blender.org/D12760
2021-12-14 18:27:01 +00:00
44232a2ce6 Cleanup: Remove unused arguments 2021-12-14 12:16:28 -06:00
d56bbfea7b Cleanup: Remove runtime uiBlock pointer from nodes
Code is simpler when the uiBlocks used during drawing are simply
stored in an array. Additionally, looping can be simpler when we use
an vector to hold a temporary copy of the tree's linked list of nodes.

This patch also slightly changes how uiBlocks are "named" in
`node_uiblocks_init`. Now it uses the node name instead of the
pointer, which is helpful so we rely less on the node's address.

Differential Revision: https://developer.blender.org/D13540
2021-12-14 11:19:47 -06:00
fdd41ac49e Cleanup: Simplify node group input and output socket verification
This commit refactors the way the socket lists for group nodes,
and group input/output nodes are verified to match the group's
interface.

Previously the `bNodeSocket.new_sock` pointer was used to
temporarily mark the new sockets. This made the code confusing
and more complicated than necessary.

Now the old socket list is saved, and sockets are moved directly from
the old list to a new list if they match, or a new socket is created
directly in the new list.

This change is split from D13518, which aims to remove the `new_node`
and `new_sock` pointers. In the future this code might be removed
entirely in favor of using node socket declarations.

Differential Revision: https://developer.blender.org/D13543
2021-12-14 10:56:12 -06:00
f5ce243a56 Geometry Nodes: support instance attributes when realizing instances
This patch refactors the instance-realization code and adds new functionality.
* Named and anonymous attributes are propagated from instances to the
  realized geometry. If the same attribute exists on the geometry and on an
  instance, the attribute on the geometry has precedence.
* The id attribute has special handling to avoid creating the same id on many
  output points. This is necessary to make e.g. the Random Value node work
  as expected afterwards.

Realizing instance attributes has an effect on existing files, especially due to the
id attribute. To avoid breaking existing files, the Realize Instances node now has
a legacy option that is enabled for all already existing Realize Instances nodes.
Removing this legacy behavior does affect some existing files (although not many).
We can decide whether it's worth to remove the old behavior as a separate step.

This refactor also improves performance when realizing instances. That is mainly
due to multi-threading. See D13446 to get the file used for benchmarking. The
curve code is not as optimized as it could be yet. That's mainly because the storage
for these attributes might change soonish and it wasn't worth optimizing for the
current storage format right now.

```
1,000,000 x mesh vertex:       530 ms -> 130 ms
1,000,000 x simple cube:      1290 ms -> 190 ms
1,000,000 x point:            1000 ms -> 150 ms
1,000,000 x curve spiral:     1740 ms -> 330 ms
1,000,000 x curve line:       1110 ms -> 210 ms
10,000 x subdivided cylinder:  170 ms ->  40 ms
10 x subdivided spiral:        180 ms -> 180 ms
```

Differential Revision: https://developer.blender.org/D13446
2021-12-14 15:57:58 +01:00
8e2c9f2dd3 Geometry Nodes: simplify using selection when evaluating fields
We often had to use two `FieldEvaluator` instances to first evaluate
the selection and then the remaining fields. Now both can be done
with a single `FieldEvaluator`. This results in less boilerplate code in
many cases.

Performance is not affected by this change. In a separate patch we
could improve performance by reusing evaluated sub-fields that are
used by the selection and the other fields.

Differential Revision: https://developer.blender.org/D13571
2021-12-14 15:40:27 +01:00
b44a500988 Fix T93920: Wrong field inferencing state with unavailable socket
This commit ignores unavailable sockets in one more place, to fix the
case in T93920.

Differential Revision: https://developer.blender.org/D13562
2021-12-14 08:30:45 -06:00
b5c18288f5 Fix T93649: Blender freezes when saving with active VR session
Dead-lock when VR viewport drawing and depsgraph updates would fight for
the draw-manager GL lock. This didn't usually cause issues because the
depsgraph would be evaluated at this point already, except in rare
exceptions like after file writing.

Fix this by ensuring the XR surface gets its depsgraph updated after
handling notifiers, which is where regular windows also do the depsgraph
updating.
2021-12-14 15:23:50 +01:00
5cce6894d2 Cleanup: consistent naming for the blender file name 2021-12-14 21:32:14 +11:00
381cef1773 Cleanup: remove oudated comment from early COW development
Added in 161ab6109e
2021-12-14 21:17:06 +11:00
c9e5897bb0 Cleanup: use typed enum for wmDrag.flags
Also use 'e' prefix for enum type name.
2021-12-14 21:08:30 +11:00
f6fd3a84c2 Cleanup: reorganize doxygen modules
- Nest compositor pages under the compositor module
- Nest GUI, DNA/RNA & externformats modules under Blender.
- Remove modules from intern which no longer exist.
- Add intern modules (atomic, eigen, glew-mx, libc_compat, locale,
  numaapi, rigidbody, sky, utfconv).
- Use 'intern_' prefix for intern modules since some of the modules
  use generic terms such as locale & atomic.
2021-12-14 20:56:11 +11:00
a207c1cdaf Cleanup: resolve parameter mis-matches in doc-strings
Renamed or removed parameters which no longer exist.
2021-12-14 18:35:23 +11:00
c097c7b855 Cleanup: correct unbalanced doxygen groups
Also add groups in some files.
2021-12-14 16:17:10 +11:00
c1f5d8d023 Fix T91005: Autosplit produces unusable files
Audio PTS was reset for each new file. This caused misalignment of video
and audio streams. In Blender, these files can't be loaded, other
players will fail to align audio and video.

Since timestamps are reset intentionally, reset also video stream
timestamps.

There were other bugs:
After timestamp was reset for audio, write_audio_frames started
encoding from timeline start until target frame, so each split video
had more audio than it should.

Also audio for last frame before splitting was written into new file.

Differential Revision: https://developer.blender.org/D13280
2021-12-14 01:16:24 +01:00
b647509913 Fix T93844: High memory usage during VSE preview
Since 88c02bf826 FFmpeg handles are freed if image is not displayed.
This change did not work correctly if strips are inside meta strip,
because overlap did not consider meta strip boundary, only strips inside
of meta strip.

Pass frame range to `sequencer_all_free_anim_ibufs`, if strip is inside
of meta strip, frame range is reduced to fit meta strip boundary, but if
meta strip is being edited, range must be set to +/-`MAXFRAME`,
otherwise playback performance would be too bad.
2021-12-14 00:43:54 +01:00
2a0a6a0541 Remove G.save_over
The difference between G.save_over and G.relbase_valid was minor.

There is one change in functionality. When saving the default-startup
file from an already loaded blend file - future save actions will
continue to write to the originally loaded file instead of prompting
the user to select a location to save the file.

This change makes saving the startup file behave the same way
"Save a Copy" does.

Reviewed By: brecht

Ref D13556
2021-12-14 09:42:46 +11:00
a603bb3459 Cleanup: clang-format 2021-12-14 09:42:44 +11:00
e688c927eb Fix T94022: Both options GPU/CPU checked under preferences cause viewport render crash. (ARM/Metal)
This fixes crash T94022 when selecting live viewport render with both GPU & CPU devices selected. It is caused by incorrect `KernelBVHLayout` assignment. Similar to `BVH_LAYOUT_MULTI_OPTIX` for Optix, this patch adds a `BVH_LAYOUT_MULTI_METAL` to correctly redirect to the correct Metal BVH layout type.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D13561
2021-12-13 22:34:48 +00:00
Piotr Makal
b8952ecec9 Cleanup: Remove unused curve types from enum
There were a few unused enum values: `CU_CARDINAL` and `CU_BSPLINE`
This commit cleans them up from code as they were not used for
anything meaningful.

Differential Revision: https://developer.blender.org/D13554
2021-12-13 15:13:47 -06:00
0606adceb3 UI: String Search: Add an optional weight parameter
This builds off of rBf951aa063f7, adding a weight parameter which can
be used to change the order of items when they have the same match
score. In the future, if string searching gets a C++ API, we could
use an optional parameter for the weight, since it is not used yet.

This will be used for the node link drag search menu (D8286).

Differential Revision: https://developer.blender.org/D13559
2021-12-13 13:58:30 -06:00
Cody Winchester
a90c356467 GPencil: Add randomize options to Length modifier
This patch adds a randomize factor for the start/end lengths in the Length modifier.

Reviewed By: #grease_pencil, antoniov, pepeland, HooglyBoogly

Differential Revision: https://developer.blender.org/D12928
2021-12-13 17:14:44 +01:00
459af75d1e GPencil: New Shrinkwrap modifier
his new modifier is equals to the existing mesh modifier but adapted to grease pencil.

The underlying functions used to calculate the shrink are the same used in meshes.

{F11794101}

Reviewed By: pepeland, HooglyBoogly

Differential Revision: https://developer.blender.org/D13192
2021-12-13 17:09:22 +01:00
49802af7cd Cycles: add text explaining minimum requirements for Metal when no device found 2021-12-13 15:40:59 +01:00
Brecht Van Lommel
3f96555123 Cycles: enable Metal GPU rendering
This adds the remaining bits to enable Metal on macOS. There are still
performance optimizations and other improvements planned, but it should
now be ready for early testing.

This is currently only enabled on in Arm builds for M1 GPUs. It is not
yet working on AMD or Intel GPUs.

Ref T92212

Differential Revision: https://developer.blender.org/D13503
2021-12-13 13:57:13 +01:00
8ba6302696 Geometry Nodes: fix combining field inputs
This was an oversight in rB7b88a4a3ba7eb9b839afa6c42d070812a3af7997.
2021-12-13 13:51:42 +01:00
8709cbb73e Fix T93704: StructRNA.path_resolve fails silently with missing keys
Resolving the path to a missing pose-bone (for example),
was not raising an error as it should have.

Regression introduced in f9ccd26b03,
which didn't update collection lookup logic to fail in the case the
key of a collection wasn't found.
2021-12-13 23:48:17 +11:00
1686979747 Geometry Nodes: move up destruct instructions in procedure
This implements an optimization pass for multi-function procedures.
It optimizes memory reuse by moving destruct instructions up.
For more details see the in-code comment.

In very large fields with many short lived intermediate values, this change
can improve performance 3-4x. Furthermore, in such cases, peak memory
consumption is reduced significantly (e.g. 100x lower peak memory usage).

Differential Revision: https://developer.blender.org/D13548
2021-12-13 13:28:33 +01:00
e549d6c1bd Fix T93169: Weightpaint falloff popover drawn twice
This came with {rBf8a0e102cf5e}.

The panel was meant specifically for the gradient tool, but since it was
given the ".weighpaint" context, it would also draw as part of generic
header toolsettings drawing.

Now remove this context on purpose and only draw this specifically from
the gradient tools ToolDef.

Maniphest Tasks: T93169

Differential Revision: https://developer.blender.org/D13268
2021-12-13 09:38:17 +01:00
bea5a9997d Cleanup: clang-format 2021-12-13 16:22:21 +11:00
36c6b2e893 Cleanup: spelling in comments
Also move notes about where noise functions come from
into the function body as it's not relavant to the public doc-string.
2021-12-13 16:22:20 +11:00
8ad2642c47 Cleanup: use "filepath" term for Main, BlendFileData & FileGlobal
Use "filepath" which is the current convention for naming full paths.

- Main use "name" which isn't obviously a file path.
- BlendFileData & FileGlobal used "filename" which is often
  used for the name component of a path (without the directory).
2021-12-13 16:22:19 +11:00
27231afce5 Cleanup: remove checks for invalid input for Python Run utilities
These arguments must be non-null for useful functionality,
there is no need for paranoid checks.

The return value in case of invalid input for BPY_run_string_as_number
was also wrong (casting -1 to a bool, when false was expected).
2021-12-13 16:22:18 +11:00
80f1527762 Cleanup: Point cloud vs point-cloud in comments
Point cloud is two separate words, this just changes comments in
a few places where we were inconsistent. A small wording change
to another comment is also included.
2021-12-12 23:00:39 -06:00
d9d4b9899e Cleanup: Remove includes in node_util.h header
This ends up including the removed headers in many unnecessary places.
Also, remove unnecessary extern from function definitions.
2021-12-12 22:56:51 -06:00
f5679838bc Docs: document all Global (G) struct members 2021-12-13 14:05:31 +11:00
bc1e517bb3 Docs: improve on doc-strings for BPY_extern_run.h
Also add ATTR_NONNULL function attributes.
2021-12-13 13:12:09 +11:00
03015a9b22 Cleanup/docs: Add comment to spline lookup factor method 2021-12-12 18:03:30 -06:00
92237f11eb Geometry Nodes: improve memory reuse in procedure executor
This reduces the number of separate memory allocations done
by the multi-function procedure executor (which is used by the
field evaluation).

Now a linear memory allocator is used to allocate all intermediate
values. Furthermore, more buffers are reused when possible. This
reduces the total amount of allocated memory and improves
cache efficiency because the values are more likely to be in cache
already.

The performance improvement of this patch are most noticable
when few elements are processed by many functions. The situation
will improve even more with D13548, because then buffers can actually
be reused in practice. I measured up to 20% faster field evaluation
in extreme cases with this change.
2021-12-12 10:34:02 +01:00
b444e1be0f Cleanup: use struct instead of class
Using `class` and `struct` for the same type can cause issues on windows.
2021-12-12 08:58:55 +01:00
Charlie Jolly
8c55481e33 UI: Add curve handle buttons to CurveMap interface
This patch exposes the vector handle options as buttons
and aligns the UI between CurveMap and CurveProfile more closely.
- CurveMap point editing is on a single row like CurveProfile
- Tools menu is moved to the right hand side on both widgets
- Emboss curve map buttons

Reviewed By: HooglyBoogly

Differential Revision: https://developer.blender.org/D10980
2021-12-12 07:17:35 +00:00
9df13fba69 macOS: add tilde-based path expander
Replaces `HOME` environment variable usage for user
directories like in D12802.

Reviewed By: #platform_macos, brecht
Differential Revision: https://developer.blender.org/D13212
2021-12-11 22:29:53 +05:30
23be5fd449 Cleanup: Use const arguments
Also remove unnecessary function to set a node type's
label function that duplicated its definition, and make
another function static.
2021-12-11 09:51:53 -06:00
96387a2eff Cleanup: const, autoreleasepool, remove unneeded cast.
Early return in some places also.
De-duplicate getSystemDir and getUserDir also.

Reviewed By: #platform_macos, brecht
Differential Revision: https://developer.blender.org/D13211
2021-12-11 19:07:36 +05:30
7b88a4a3ba Geometry Nodes: remove accidental exponential time algorithm
Calling `foreach_field_input` on a highly nested field (we do that
often) has an exponential running time in the number of nodes.
That is because the same node may be visited many times.
This made Blender freeze on some setups that should work just fine.
Now every field keeps track of its inputs all the time. That replaces
the exponential algorithm with constant time access.
2021-12-11 11:26:55 +01:00
4c705ab8fe Fix compilation error on Windows platform. 2021-12-11 09:48:43 +01:00
8c7d970e2c Sky: Use Gauss-Laguerre quadrature for optical depth calculation
This allows to use fewer evaluations (30 msec down to 23 for me) while giving more accurate results (3x-10x less relative absolute error) compared to classic ray marching.

Not a massive difference, but meh, it's better. For Cycles the speedup doesn't really matter much, but I also have a patch for Eevee support.

I've also tried Gauss-Legendre and Gauss-Lobatto - the latter was always worse, while the former was slightly better at 2deg elevation but notably worse on 15deg.

Unfortunately the same approach can't be used for the integration along the primary ray, since there we also need the accumulated transmission so far at every integration point, not just the total result.

Differential Revision: https://developer.blender.org/D13521
2021-12-11 01:02:11 +01:00
1a02c0d7dd Cleanup: Return early for null check 2021-12-10 16:42:34 -06:00
a8b730e04c Cleanup: Use const argument, rename variables
The const argument makes sense because these are the "source"
sockets, even though a const cast is necessary at one point.
The name "interface_socket" is an improvement over "stemp"
because the latter sounds like "temporary", or it confuses
the old socket template system with a node group's interface.
2021-12-10 14:54:32 -06:00
7c2fb00e66 Cleanup: Remove unnecessary runtime rectangle from nodes in DNA
I assume this `butr` rectangle was used more in the past,
but currently its value is set and used less than 10 lines apart,
so it's trivial to remove 16 bytes from every node. The other
rectangles are also runtime data and could be removed, but
they are more difficult.
2021-12-10 13:52:02 -06:00
5a3d5f751f Revert recent changes to ImBuf transform
This reverts commit 943aed0de3  and f76e04bf4d

The latter caused a test failure: `sequencer_render_transform`
Reverting since the fix is not obvious (to me), and people are
away for the weekend.
2021-12-10 12:55:36 -06:00
5ca38fd612 Cleanup/Docs: Add comments to Mesh header, rearrange fields
Most of the fields in Mesh had no comments, or outdated misleading
comments. For example, "BMESH ONLY" referred to the BMesh project,
not the data structure. Given how much these structs are used, it should
save a lot of time to have proper comments.

I also rearranged the fields in mesh to have a more logical order. Now
the most important fields come first. In the process I was able to
remove 19 bytes of unnecessary padding (31->12). I just had to
change a `short` flag to `char`.

Differential Revision: https://developer.blender.org/D13454
2021-12-10 10:42:28 -06:00
Moritz Röhrich
f886f29355 Fix T93591: Random Value node first and last value proportion
This patch replaces `round_fl_to_int` with `floor` and adjusts the
maximum value accordingly. The call to `round_fl_to_int` is problematic
here because it messes with the probability distribution at the edges
of the value range, meaning the first and last values were only half
as common as all other values. Since `round_fl_to_int` does
`floor(val + 0.5)`, it will not introduce misbehavior in edge cases.

Differential Revision: https://developer.blender.org/D13474
2021-12-10 09:34:30 -06:00
943aed0de3 ImBuf: Extracted UV Wrapping from the Interpolation functions.
Improvement of the IMB_transform function by separating the UVWrapping
method from the color interpolation function. This would allow us to
perform uv wrapping operations separate from the interpolation function.

Currently this wrapping is only supported when interpolating colors.
This change would allow us to perform it on non-color image buffers.
2021-12-10 16:14:36 +01:00
60a9703de8 Added support for large images to openexr. 2021-12-10 15:38:46 +01:00
05df6366a4 Added support for large texture to OCIO. 2021-12-10 15:38:25 +01:00
f76e04bf4d ImBuf: Use templating for IMB_transform.
Reduce the inner loop of IMB_transform by extracting writing to an
output buffer in a template. This reduces a branch in the inner loop and
would allow different number of channels in the future.
2021-12-10 15:19:41 +01:00
37e799e299 Fix crash using 32k images.
Use IMB_get_rect_len to solve overflow issues.
2021-12-10 12:28:48 +01:00
bd2b48e98d Cleanup: move public doc-strings into headers for various API's
Some doc-strings were skipped because of blank-lines between
the doc-string and the symbol and needed to be moved manually.

- Added space below non doc-string comments to make it clear
  these aren't comments for the symbols directly below them.
- Use doxy sections for some headers.

Ref T92709
2021-12-10 21:42:06 +11:00
63f8d18c0f Cleanup: move public doc-strings into headers for 'python/intern'
Ref T92709
2021-12-10 21:40:53 +11:00
3060217d39 Cleanup: move public doc-strings into headers for 'nodes'
Ref T92709
2021-12-10 21:40:30 +11:00
715f57371b Cleanup: spelling in comments 2021-12-10 21:28:56 +11:00
b87b33adbf Cleanup: doxygen comments for node_geo_dual_mesh
- Indent text after dot-points.
- Use an unparsed code-block to display the text verbatim in doxygen.
2021-12-10 21:28:56 +11:00
fcf8fc3eaa Fix Crash: Loading Huge Images.
When loading huge images (30k) blender crashed with a buffer overflow.
The reason is that determine the length of a buffer was done in 32bit
precision and afterwards stored in 64 bit precision.

This patch adds a new function to do the correct calculation in 64bit.
It should be added to other sections in blender as well. But that should
be tested on a per case situation.
2021-12-10 10:37:46 +01:00
William Leeson
57f46b9d5f Fix T92036: Magic Texture in Volumetric World Shaders render differently with the CPU and GPU
When rendering volume surfaces in unbounded worlds the volume stepping can produce large values. If used with a magic texture node the values can results in a Inf float which when used in a sin or cos produces a NaN.

To fix this the input values are mapped into the periodic range of the sin and cos functions (-2*PI 2*PI) this stops the possibility of a Inf occurring and thus the NaN. It also improves the accuracy and smoothness of the result due to loss of precision when large values are summed with smaller ones effectively removing the parts of the smaller number (i.e. those in the -2PI to 2PI range) that result in variation of the output of sin and cos.

Reviewed By: brecht

Maniphest Tasks: T92036

Differential Revision: https://developer.blender.org/D12821
2021-12-10 09:09:20 +01:00
566a458950 Cleanup: move public doc-strings into headers for 'depsgraph'
- Added space below non doc-string comments to make it clear
  these aren't comments for the symbols directly below them.
- Use doxy sections for some headers.

Ref T92709
2021-12-10 12:19:36 +11:00
dffd032bc9 Geometry Nodes: Remove unnecessary copy when replacing data
In the `replace_mesh`/`replace_curve` etc. methods, the component
was retrieved with write access. Retrieving with write access will
duplicate the data if the component has another user. This means that
the replaced geometry data was often duplicated just to be deleted
a moment later.

I expect this would have a large impact in performance in some specific
situations when dealing with large geometry. In a scene with many small
meshes though, I didn't observe a significant difference.

This also makes replacing a geometry set's data with the same data
that's already in the set safe. It would be valid to assert for that
case instead, but this seems safer.

Differential Revision: https://developer.blender.org/D13530
2021-12-09 17:43:30 -06:00
18412744c8 Fix T93687: Transform Gpencil vertices not working if scale is zero
Solution similar to rBd5cefc1844cf.

Basically the problem occurs because `td->smtx` was set to zero matrix.
2021-12-09 18:17:23 -03:00
74a566d211 Cleanup: Return early in null check
I'm planning to make these functions slightly more complicated,
and it makes sense to return early when checking one of the parameters
for null anyway.
2021-12-09 14:34:15 -06:00
Andrii
b8f41825e8 Fix Cycles wrong adaptive sampling render when using sample offset
Sample offset was not accounted for in the adaptive sampling code and caused
issues, like immediately applied adaptive filtering, with non-zero values.

Differential Revision: https://developer.blender.org/D13510
2021-12-09 20:54:41 +01:00
Andrii
dbd64a5592 Cycles: support merging images rendered with adaptive sampling
This patch adds support for merging images rendered with adaptive sampling to
the merge operator (which is currently only exposed in the Python API).

To do this an sample count buffer is created for each render layer from the
sample count pass if it exists, or from the metadata otherwise. This is then
used for averaging passes.

Differential Revision: https://developer.blender.org/D13457
2021-12-09 20:52:46 +01:00
Andrii
20987b0f29 Cleanup: use more modern C++ in Cycles merge operator
Use structured binding and for-each loop, remove reduntant type casts, use
find_if instead of loop.

Differential Revision: https://developer.blender.org/D13456
2021-12-09 20:52:46 +01:00
bd3bd776c8 Geometry Nodes: Scene Time Node
This node outputs the current scene time in seconds or in frames.
Use of this node eliminates the need to use drivers to control values
in the node tree that are driven by the scene time.
Frame is a float value to provide for subframe rendering for motion
blur.

Differential Revision: https://developer.blender.org/D13455
2021-12-09 11:50:25 -06:00
ad44f22397 Fix T93498: Cycles fast GI add method affected by bounces settings
This is only for the replace method.
2021-12-09 18:17:48 +01:00
4b00a779ec Fix T93890: Cycles error with shadow catcher + OptiX denoise without passes 2021-12-09 18:01:26 +01:00
56fa6f58a0 Fix T93874: Cycles crash with fast GI approximation 2021-12-09 17:46:21 +01:00
e427e4dbb1 Fix T93871: Image.has_data returns True for images that failed to load 2021-12-09 17:36:19 +01:00
fc14d02bc5 Fix T93892: Changing bone name leaves non-functional vertex group
The corresponding vertex group was renamed properly, but the armature
influence was broken for that bone.

Caused by {rB3b6ee8cee708}.

Since above commit, vertex group names are stored on object data (mesh/
lattice/gpencil) and if we update these, we have to inform dependency
graph to have immediate effect.

Maniphest Tasks: T93892

Differential Revision: https://developer.blender.org/D13526
2021-12-09 16:47:06 +01:00
5ce1c63e1b Fix T93691: Crash when loading custom thumbnail in custom library
This was an issue with the mixed list of external assets and assets from
the current file. When closing the File Browser to select the custom
preview image, the assets from the current file would be cleared for
reread, to make sure we display up-to-date file data. That is because
the workspace of the temporary File Browser was deleted, causing a
change in the file data (main data-base). The reread would happen in a
background thread, meaning it might not finish before the custom preview
operator runs and queries the active asset. So the preview operator
would get the wrong active asset from context.

Two fixes were needed:
* Make sure current file data is reread before the operator runs, by
  doing this partial rereading on the main thread.
* Ensure the asset list (in fact file list) order stays consistent over
  rereads. If multiple assets with the same name were shown, the
  operator might also have gotten the wrong asset, also leading to a
  crash.

Additionally the file operation handler should probably poll before
executing, to fail gracefully at least (not crash).
2021-12-09 15:51:00 +01:00
74fa4ee92b Fix (unreported): missing null check
A crash happened when `instance_id_attribute` further down in the
function was null. This issue was probably introduced when the
id attribute starting using generic attribute handling.
2021-12-09 13:54:47 +01:00
d812e46f40 Cleanup: move public doc-strings into headers for 'io/usd'
Ref T92709
2021-12-09 22:47:58 +11:00
74e57efb2d Cleanup: move public doc-strings into headers for 'io/alembic'
Ref T92709
2021-12-09 22:37:24 +11:00
69f55b1b62 Cleanup: Various cleanups to the tree-view API
* Correct URL for documentation (was changed recently).
* Add comments.
* Reevaluate and update which functions are public, protected or
  private.
* Reorder functions and classes to be more logical and readable.
* Add helper class for the public item API so individual functions it
  uses can be made protected/private (the helper class is a friend).
  Also allows splitting API implementation from the C-API.
* Move internal layout builder helper class to the source file, out of
  the header.
* More consistent naming.
* Add alias for item-container, so it's more clear how it can be used.
* Use const.
* Remove unnecessary forward declaration.
2021-12-09 12:35:15 +01:00
9183f9f860 Cleanup/Documentation: Add/move comments for asset files
Adds some basic high-level explanations for editor/UI level asset APIs.
Also moves one such comment from the source file to the header file,
so it's in the same file as other API comments.
2021-12-09 12:35:15 +01:00
50f378e5c8 Cleanup: move public doc-strings into headers for 'io/collada'
Ref T92709
2021-12-09 22:25:45 +11:00
973dac9b5f Cleanup: use doxy section for itasc_plugin
It wasn't obvious all callbacks were part of the plugin-API.
2021-12-09 21:19:46 +11:00
7f4878ac7f Cleanup: move public doc-strings into headers for 'functions'
Ref T92709
2021-12-09 21:17:16 +11:00
Azeem Bande-Ali
b8bad3549d Fix T93519: handle prefix names in autocompletes
Autocomplete entires keep track of the length of the prefix
in `name_prefix_offset`. However, the name matching logic
was comparing the string including the prefix which resulted
in tab-completion not working (when the user didn't also type
in the prefix, typically two whitespaces).

This is fixed by passing in a char pointer after the end of
the prefix.

Additionally, some searchbox logic is moved. Previously,
`ui_searchbox_apply` would clear the entry which would mean
that `ui_searchbox_find_index` would never succeed. Now the
search box is only cleared if no match was found.

Differential Revision: https://developer.blender.org/D13483
2021-12-09 11:15:19 +01:00
65de17ece4 Cleanup: move public doc-strings into headers for 'freestyle'
Ref T92709
2021-12-09 21:09:54 +11:00
bc01003673 Cleanup: move public doc-strings into headers for 'nodes/geometry'
Ref T92709
2021-12-09 20:58:39 +11:00
cd4a7be5b2 Cleanup: move public doc-strings into headers for 'io/gpencil'
Ref T92709
2021-12-09 20:34:14 +11:00
3647a1e621 Cleanup: move public doc-strings into headers for 'geometry'
Ref T92709
2021-12-09 20:23:10 +11:00
8ef8f3a60a Cleanup: spelling in comments 2021-12-09 20:21:26 +11:00
15a428eab0 Cleanup: move public doc-strings into headers for 'gpencil_modifiers'
Removed doc-strings for operator definitions as they didn't provide
useful information in addition to the operators own description.

Ref T92709
2021-12-09 20:11:29 +11:00
8aed5dbcf8 Cleanup: move public doc-strings into headers for 'compositor'
Ref T92709
2021-12-09 20:01:49 +11:00
7c76bdca1b Cleanup: move public doc-strings into headers for 'gpu'
Ref T92709
2021-12-09 20:01:47 +11:00
9f546d6908 Cleanup: move public doc-strings into headers for 'imbuf'
Ref T92709
2021-12-09 20:01:45 +11:00
9e365069af Cleanup: move public doc-strings into headers for 'blenlib'
- Added space below non doc-string comments to make it clear
  these aren't comments for the symbols directly below them.
- Use doxy sections for some headers.
- Minor improvements to doc-strings.

Ref T92709
2021-12-09 20:01:44 +11:00
Alaska
d8b4275162 Cycles-X: Add hysteresis to resolution divider algorithm
Adds hysteresis to the resolution divider algorithm to avoid having the resolution bounce around when on the boundary of two resolutions.

Reviewed By: brecht, leesonw

Differential Revision: https://developer.blender.org/D12385
2021-12-09 09:18:47 +01:00
894269ad12 Fix incorrect copying of XR action map items
After using MEM_dupallocN() on the original item, the bindings ListBase
for the new item needs to be cleared and each binding copied separately.
2021-12-09 16:29:05 +09:00
30cebf5747 Cleanup: Remove empty node button layout function
Was unused since the first commit:
rB658b4c0d56dffbcf1476c2a2a019fa0ecfb79376
2021-12-08 22:02:45 -05:00
cf6be711e2 Fix T93869: snap cursor may fail in orthographic view
Float precision issues cause the `ED_view3d_win_to_3d_on_plane` to return
a value even when the view ray is parallel to the plane.

A more general solution might be desired in this case, as other areas that
use `ED_view3d_win_to_3d_on_plane` might have the same problem.

For now, just work around the problem for the snap cursor.
2021-12-08 23:47:26 -03:00
3753a0b72b Fix T93523: Memory leak in Menu Search
Fixes a memory leak introduced by D13225.
Caused by not freeing the hash-map in some cases.

Differential Revision: https://developer.blender.org/D13432
2021-12-09 01:48:11 +01:00
be2213472f Fix T93858: Zstd-compressed .blend files from external tools aren't recognized
The issue here was that after the seek table check, the underlying file wasn't
rewound to the start, so the code that checks for the BLENDER header
immediately reaches EOF and fails.

Since Blender always writes files with a seek table, this bug isn't triggered
by files saved in Blender itself. However, files compressed in external tools
generally don't have a seek table.
2021-12-08 23:35:47 +01:00
f32b63ec58 Fix T93642: value used as transform offset is ignored in some modes
If a transform operator is executed as modal and has a "value" set, that value works as an offset.

However few transform modes were supporting this.

Include more transform modes as supported.
2021-12-08 14:21:57 -03:00
6ebc581b52 Cleanup: Avoid lookup the same property string multiple times
This is a micro-optimization that is useful for operators with many properties.
2021-12-08 14:21:57 -03:00
069d63561a Geometry Nodes: Mesh Island Node
This node is a field input that outputs a separate index for each mesh island.
The indices are based on the order of the lowest-numbered vertex in each island.

Authoring help from @hooglyboogly

Differential Revision: https://developer.blender.org/D13504
2021-12-08 10:14:03 -06:00
72b39d3f92 Fix T93728: Greasepencil separate will loose all vertex groups
Caused by {rB3b6ee8cee708}

The raw data is copied here correctly
[`BKE_gpencil_stroke_weights_duplicate` in
`BKE_gpencil_stroke_duplicate`] but the vertex groups names are missing.
Prior to above commit is was enough to have `ED_object_add_duplicate`
(this seemingly took care of duplicating object's defbase).
Now vertex groups names sit on the `bGPdata` rather than the `Object`,
and since the separate operation creates **new** `bGPdata` we have to
copy vertex groups names - and active index - over [via
`BKE_defgroup_copy_list`].

Maniphest Tasks: T93728

Differential Revision: https://developer.blender.org/D13509
2021-12-08 16:10:07 +01:00
e23b54a59f Cycles: Fix OS version warnings
This patch suppresses OS version warnings and hides currently unsupported Metal GPUs when enumerating devices.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D13506
2021-12-08 15:08:12 +00:00
Campbell Barton
5e9dba822d Cleanup: Avoid error prone pointer storage in SnapObjectParams
eed48a7322 caused the `SnapObjectParams` to be stored in the `SnapObjectContext`.

As this pointer is always passing in stack memory, so it seems error prone to keep a reference to this in `SnapObjectContext` since failure to set this will reference undefined stack memory.

So avoid this by moving params out of `SnapObjectContext`.

Differential Revision: https://developer.blender.org/D13401
2021-12-08 12:05:36 -03:00
98bb8e6955 Fix T91680: viewport selection broken in macOS x86 build with Xcode 13
There is an apparent compiler bug here, tweak the code to avoid it. This did
not affect official builds as we were still using Xcode 12.
2021-12-08 15:46:15 +01:00
5b06759473 Cleanup: Use float2 for node view functions
Though the interfacing with `rctf` becomes slightly more complicated,
this should be more helpful as more of this code usese `float2` instead
of two separate floats.
2021-12-08 09:44:02 -05:00
61776befc3 Cleanup: move public doc-strings into headers for 'editors'
Ref T92709
2021-12-09 01:14:10 +11:00
8f1997975d ImBuf: Performance IMB_transform.
This change uses generics to reduce code duplication an increases
flexibility and performance. Performance increase isn't measurable
for end users.
2021-12-08 13:02:55 +01:00
32b1a13fa1 Cleanup: move public doc-strings into headers for 'sequencer'
Ref T92709
2021-12-08 21:05:06 +11:00
e89d42ddff Cleanup: move public doc-strings into headers for 'draw'
Ref T92709
2021-12-08 20:30:05 +11:00
a46ff1dd38 Cleanup: move public doc-strings into headers for 'shader_fx'
Ref T92709
2021-12-08 20:30:05 +11:00
2c0ccb0159 Cleanup: move public doc-strings into headers for 'simulation'
Ref T92709
2021-12-08 20:30:05 +11:00
07726ef1b6 Cleanup: move public doc-strings into headers for 'blentranslation'
Ref T92709
2021-12-08 20:30:05 +11:00
ca0c9757f2 Cleanup: moved IMB_transform to transform.cc.
Part of a refactoring to make IMB_transform more generic to reduce
unneeded branching.
2021-12-08 09:54:52 +01:00
a7b64a714d Cleanup: Silence clang-tidy warnings. 2021-12-08 09:52:38 +01:00
4f48b2992b Cleanup: move public doc-strings into headers for 'blenloader'
Ref T92709
2021-12-08 17:51:45 +11:00
93ba5e2375 Cleanup: move public doc-strings into headers for 'render'
Ref T92709
2021-12-08 17:12:43 +11:00
7e92717439 Cleanup: move public doc-strings into headers for 'windowmanager'
Ref T92709
2021-12-08 17:12:41 +11:00
db795a4727 Cleanup: move public doc-strings into headers for 'makesrna'
Ref T92709
2021-12-08 17:12:40 +11:00
da67a19ed9 Cleanup: move public doc-strings into headers for 'makesdna'
Ref T92709
2021-12-08 17:12:39 +11:00
2545119112 Cleanup: move public doc-strings into headers for 'blendthumb'
Ref T92709
2021-12-08 17:12:37 +11:00
d6c3ea9e7a Cleanup: move public doc-strings into headers for 'blenfont'
Ref T92709
2021-12-08 17:12:35 +11:00
00f3957b8e Cleanup: move public doc-strings into headers for 'modifiers'
Ref T92709
2021-12-08 17:12:33 +11:00
c1279768a7 Cleanup: Clang-Tidy modernize-redundant-void-arg 2021-12-08 00:31:20 -05:00
cbcd74de22 Cleanup: Missing braces in array declaration 2021-12-08 00:30:36 -05:00
b71e29b3da Cleanup: clang-format 2021-12-07 23:12:13 -05:00
47b36ddcce Cleanup: Nodes: clang-tidy modernize-redundant-void-arg 2021-12-07 23:09:19 -05:00
2964c4e1d0 Cleanup: spelling in comments 2021-12-08 13:31:19 +11:00
333dc7b5c4 Nodes: Add Shader Socket to new decleration API
This commit adds the shader socket type to the new socket builder api.

Re commits part of rB0bd3cad04edf4bf9b9d3b1353f955534aa5e6740
2021-12-07 21:05:13 -05:00
0d8c479225 Shader Nodes: Use camel case for file names
Recommits part of rBf60b95b5320f8d6abe6a629fe8fc4f1b94d0d91c
2021-12-07 20:57:25 -05:00
1552c92fb1 Cycles: Fix Metal BVH crash caused by missing WITH_METAL define
Reviewed By: brecht

Differential Revision: https://developer.blender.org/D13505
2021-12-07 21:23:52 +00:00
5568455d63 Cleanup: Extend a few comments in BKE_spline.hh 2021-12-07 15:17:10 -05:00
204ae33d75 Revert "Fix T93350: Cycles renders shows black during rendering huge resolutions"
This reverts commit 5e37f70307.

It is leading to freezing of the entire desktop for a few seconds when stopping
3D viewport rendering on my Linux / NVIDIA system.
2021-12-07 20:49:34 +01:00
b815088416 Tests: add Cycles perspective/ortho/panoramic camera regression tests
Ref D12691
2021-12-07 20:08:12 +01:00
Håkan Ardö
24e0165463 Cycles: add Fisheye Lens Polynomial camera model
This allows real world cameras to be modeled by specifying the coordinates of a
4th degree polynomial that relates a pixels distance (in mm) from the optical
center on the sensor to the angle (in radians) of the world ray that is
projected onto that pixel.

This is available as part of the panoramic lens type, however it can also be
used to model lens distortions in projective cameras for example.

Differential Revision: https://developer.blender.org/D12691
2021-12-07 20:05:57 +01:00
205254150a Build: don't look for CUDA toolkit if not using Cycles CUDA device 2021-12-07 19:52:54 +01:00
763cd2e0be Cleanup: fix compiler warning 2021-12-07 19:52:54 +01:00
5bd41b2e25 Shader Nodes: Create a new bf_nodes_shader library
Re commits rBf72cc47d8edf849af98e196f721022bacf86a5e7 but without the unity build
2021-12-07 13:20:22 -05:00
63bd356faf Cleanup: Missing include
This included is needed for the `ATTR_NONNULL` macro used in the header.
As found in a recent c --> c++ if the includes get ordered in a different order
this could result in an error.

Re commits rBc20098e6ec6adee874a12e510aa4a56d89f92838
2021-12-07 13:20:22 -05:00
b9641cfc37 Cleanup: clang-tidy: modernize-redundant-void-arg
Re commits part of rB0578921063fbb081239439062215f2538a31af4b
2021-12-07 13:20:22 -05:00
7fbb767259 Docs: Add more comments to geometry set header
This adds a bit more information to `GeometrySet` and each of the
geometry components. There is probably still more that can be written,
but this includes the most important information that I could think of.

I'd like to include some more general information about the
attribute API in a separate patch.

Differential Revision: https://developer.blender.org/D13501
2021-12-07 13:04:32 -05:00
Sergey Sharybin
5e37f70307 Fix T93350: Cycles renders shows black during rendering huge resolutions
The root of the issue is caused by Cycles ignoring OpenGL limitation on
the maximum resolution of textures: Cycles was allocating texture of the
final render resolution. It was exceeding limitation on certain GPUs and
driver.

The idea is simple: use multiple textures for the display, each of which
will fit into OpenGL limitations.

There is some code which allows the display driver to know when to start
the new tile. Also added some code to allow force graphics interop to be
re-created. The latter one ended up not used in the final version of the
patch, but it might be helpful for other drivers implementation.

The tile size is limited to 8K now as it is the safest size for textures
on many GPUs and OpenGL drivers.

Differential Revision: https://developer.blender.org/D13385
2021-12-07 19:01:42 +01:00
Mikhail Matrosov
a92805bf24 Fix T93418: Cycles shadow terminator Geometry Offset artifacts with translucency
Differential Revision: https://developer.blender.org/D13468
2021-12-07 19:01:39 +01:00
484714992c Build: remove Cycles CUDA/HIP/OPTIX build options on macOS
So those device types are not listed in the preferences. Metal will be
added instead as the only option.
2021-12-07 19:01:39 +01:00
13af88b23f Build: clean up handling of some Cycles build options
* Don't link embree / OSL when WITH_CYCLES is disabled
* Simplify lite config by disabling Cycles as a whole using this
* Remove code handling the removed WITH_CYCLES_NETWORK option
2021-12-07 19:01:39 +01:00
e14f8c2dd7 Cycles: Reintroduce device-only memory handling that got lost in Cycles X merge
Somehow only a part of rBf4f8b6dde32b0438e0b97a6d8ebeb89802987127 ended up in
Cycles X, causing the issue that commit fixed, "OPTIX_ERROR_INVALID_VALUE" when the
system is out of memory, to show up again.
This adds the missing changes to fix that problem.

Maniphest Tasks: T93620

Differential Revision: https://developer.blender.org/D13488
2021-12-07 18:50:10 +01:00
a37dac0a88 Cycles: add Metal device type to device_type_for_description
Add a `DEVICE_METAL` case to the enum-value-to-string mapping function.
2021-12-07 17:47:35 +01:00
fee4b58627 Cycles: fix build on non-Apple systems
Skip compiling `metal.mm` unless `WITH_CYCLES_DEVICE_METAL` is enabled.
2021-12-07 17:46:52 +01:00
c4cee2e221 Geometry Nodes: Edge Neighbors Node
Creates a new Edge Neighbors node which outputs a field
containing the number of faces connected to each edge.

Differential Revision: https://developer.blender.org/D13493
2021-12-07 10:07:24 -06:00
9558fa5196 Cycles: Metal host-side code
This patch adds the Metal host-side code:

- Add all core host-side Metal backend files (device_impl, queue, etc)
- Add MetalRT BVH setup files
- Integrate with Cycles device enumeration code
- Revive `path_source_replace_includes` in util/path (required for MSL compilation)

This patch also includes a couple of small kernel-side fixes:

- Add an implementation of `lgammaf` for Metal [Nemes, Gergő (2010), "New asymptotic expansion for the Gamma function", Archiv der Mathematik](https://users.renyi.hu/~gergonemes/)
- include "work_stealing.h" inside the Metal context class because it accesses state now

Ref T92212

Reviewed By: brecht

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D13423
2021-12-07 15:52:21 +00:00
565b33c0ad Geometry Nodes: new Geometry to Instance node
This adds a new Geometry to Instance node that turns every
connected input geometry into an instance. Those instances
can for example be used in the Instance on Points node.

Differential Revision: https://developer.blender.org/D13500
2021-12-07 15:37:12 +01:00
a8e0fe6a54 Geometry Nodes: move type conversions to blenkernel
The type conversions do not depend on other files in the nodes
module. Furthermore we want to use the conversions in the
geometry module without creating a dependency to the
nodes module there.
2021-12-07 15:22:08 +01:00
2309fa20af Cleanup: Add macro and functions for node storage
The `node_storage` functions to retrieve const and mutable structs
from a node are generated by a short macro that can be placed at the
top of each relevant file. I use this in D8286 to make code snippets
in the socket declarations much shorter, but I thought it would be
good to use it consistently everywhere else too.

The functions are also useful to avoid copy and paste errors,
like the one corrected in the cylinder node in this commit.

Differential Revision: https://developer.blender.org/D13491
2021-12-07 09:09:30 -05:00
Yuchen Wen
6a9775ec6f Fix T93467: Use world bg color for pose library previews
Use the World viewport color when rendering pose library previews.

The World's viewport color is chosen instead of the World shading nodes,
as the latter would require rendering with `OB_RENDER` (instead of
`OB_SOLID`), which would take considerably longer.

Manifest Task: T93467

Reviewed By: sybren

Differential Revision: https://developer.blender.org/D13470
2021-12-07 14:25:00 +01:00
0f48b37aae Revert moving all shader nodes to c++
This reverts to following commits:
* rB5cad004d716da02f511bd34983ac7da820308676
* rB97e3a2d935ba9b21b127eda7ca104d4bcf4e48bd
* rBf60b95b5320f8d6abe6a629fe8fc4f1b94d0d91c
* rB0bd3cad04edf4bf9b9d3b1353f955534aa5e6740
* rBf72cc47d8edf849af98e196f721022bacf86a5e7
* rB3f7014ecc9d523997062eadd62888af5fc70a2b6
* rB0578921063fbb081239439062215f2538a31af4b
* rBc20098e6ec6adee874a12e510aa4a56d89f92838
* rBd5efda72f501ad95679d7ac554086a1fb18c1ac0

The original move to c++ that the other commits depended upon had some issues
that should be fixed before committing it again. The issues were reported in
T93797, T93809 and T93798.

We should also find a better rule for not using c-style casts going forward,
although that wouldn't have been reason enough to revert the commits.
Introducing something like a `MEM_new<T>` and `MEM_delete<T>`
function might help with the the most common case of casting the return
type of `MEM_malloc`.

Going forward, I recommend first committing the changes that don't
require converting files to c++. Then convert the shading node files
in smaller chunks. Especially don't mix fairly low risk changes like
moving some simple nodes, with higher risk changes.
2021-12-07 13:26:39 +01:00
ae5a89e80a Fix T93797, T93809: Crash/undefined-behavior when opening demo file
Error in d5efda72f5. Was changing an iteration that would free items
to an iterator that is not safe for use in such cases.

There still seem to be significant issues with the rendering, but that's
a separate issue to be fixed.
2021-12-07 12:03:10 +01:00
4312cb8545 Fix memory leak when loading large asset libraries 2021-12-07 11:47:06 +01:00
cd494087c1 Fix crash when switching back from render preview.
Issue is that external engine uses the gpu info. but overwrote the
instance data. The draw manager would then detect instance data and
required the engine type to have a instance free callback.

The solution is to save some space in the engine data to hold an empty
and unused instance_data attribute to comply with `ViewportEngineData`
struct.
2021-12-07 11:30:50 +01:00
Jeroen Bakker
b069218a55 DrawManager: Engine Instance Data.
In the original design draw engines had to copy with a limitation that
they were not allowed to reuse complex data structures between drawing
calls. Data that could be reused were limited to:
- GPUFramebuffers
- GPUTextures
- Memory that could be removed calling MEM_freeN (storage list)
- DRWPass

This is fine when the storage list contains arrays or structs but when
more complex data types (vectors, maps) etc wasn't possible.

This patch adds instance_data that can be reused between drawing calls.
The instance_data is controlled by the draw engine and doesn't need to
be limited as described above.

When an engines stores instance_data it must implement the
`DrawEngineType.instance_free` callback to free the data.

The patch originates from eevee rewrite. But was added to master as the
image engine rewrite also has a need for it.

Reviewed By: fclem

Differential Revision: https://developer.blender.org/D13425
2021-12-07 10:34:38 +01:00
Jeroen Bakker
e2f0b4a0cb Cleanup: Use rcti marking dirty regions when texture painting.
Dirty regions when painting are not using rcti. Meaning less
understandable code. Found issue when refactoring the image_gpu partial
update. In a future change gpu partial update API will be using rcti
also what makes the code even cleaner.

Reviewed By: campbellbarton

Differential Revision: https://developer.blender.org/D13260
2021-12-07 10:33:23 +01:00
1de3636624 Cleanup: note that functions in BKE_node.h aren't part of blenkernel 2021-12-07 18:55:57 +11:00
512a560cde Cleanup: remove BKE_ptcache_remove
No longer needed as the temporary directory is cleared on exit.
2021-12-07 18:47:01 +11:00
a55d318d71 Cleanup: sort DNA renaming
Add note at the beginning & end so it's not overlooked.
2021-12-07 18:44:28 +11:00
c6a200c693 Cleanup: clang-format 2021-12-07 18:39:16 +11:00
1e7ef83e46 Cleanup: clarify source/destination args for BKE_spacedata_copylist 2021-12-07 18:26:35 +11:00
a37a6fb445 Cleanup: remove incorrect/unhelpful comments 2021-12-07 18:23:57 +11:00
6483dee141 Cleanup: replace int with bool type 2021-12-07 18:15:15 +11:00
ffc4c126f5 Cleanup: move public doc-strings into headers for 'blenkernel'
- Added space below non doc-string comments to make it clear
  these aren't comments for the symbols directly below them.
- Use doxy sections for some headers.
- Minor improvements to doc-strings.

Ref T92709
2021-12-07 17:38:48 +11:00
luzpaz
f159d49f56 Cleanup: Fix various source typos
This is a continuation of D13462 to clean up source typos.

Differential Revision: https://developer.blender.org/D13471
2021-12-06 22:39:52 -05:00
luzpaz
92dae5775f Cleanup: Fix typos in source code
Source typos corrected

Reviewed By: Blendify

Differential Revision: https://developer.blender.org/D13462
2021-12-06 22:23:17 -05:00
50fb0fd636 Docs: Incorrect link to context type
Fixes T93773
2021-12-06 22:07:38 -05:00
5cad004d71 Cleanup: clang format 2021-12-06 18:52:08 -05:00
97e3a2d935 Shader Nodes: Migrate shader category to new node socket declaration API
No functional changes, tests passed and I double checked the nodes by hand.
2021-12-06 18:52:08 -05:00
Christoph Lendenfeld
b1696702cd Fix: Remove line from common invoke
The line that sets the factor_prop in graph_slider_ops.c
has been left in the common invoke function by accident.

Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D13477
Ref: D13477
2021-12-06 22:27:04 +00:00
Christoph Lendenfeld
ae6f3056fc Cleanup: renames in graph_slider_ops
rename percentage_prop to factor_prop
rename remove_ratio to factor
remove "graphkeys" prefix from functions

Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D13476
Ref: D13476
2021-12-06 22:19:00 +00:00
78ae587649 Cleanup: Use C++ types for multi input socket sorting
The algorithm used is still quite inefficient, but at least the code
is easier to read and a little bit simpler now.
2021-12-06 17:12:46 -05:00
aa23e870ec Cleanup: Remove unused includes 2021-12-06 16:50:44 -05:00
01779970c2 Cleanup: Remove unused node flag 2021-12-06 16:35:38 -05:00
b91ac86cfc Cleanup: Remove unnecessary generic includes from headers 2021-12-06 16:04:14 -05:00
5705db5bb3 Fix: Compile error in field input
Instead of essentially hashing a bool, just use a ternary operator.

Differential Revision: https://developer.blender.org/D13494
2021-12-06 15:47:09 -05:00
0ed254574f macOS: Fix build error in hash functions
Remove unneeded recent static_cast attempt.

Reviewed By: HooglyBoogly
Differential Revision: https://developer.blender.org/D13492
2021-12-07 01:38:14 +05:30
Aaron Carlisle
f60b95b532 Shader Nodes: Split each node into own file
This improves both code finding,  for example "color ramp" now has its own file.
And now each node has its own namespace so function names can be simplified
similar to rBfab39440e94

This commit also makes all file names use snake case instead of camel case.

Reviewed By: HooglyBoogly

Differential Revision: https://developer.blender.org/D13482
2021-12-06 14:47:49 -05:00
c3c69fee09 Cleanup: Fix warnings about copied Map loop variables
The `Map::items()` iterator does not return references.
2021-12-06 13:49:37 -05:00
86d520f268 Fix: Attempt to fix build error on macOS 2021-12-06 13:47:53 -05:00
Aaron Carlisle
9792994311 Nodes: Add function to set compact socket flag for vectors
This flag is currently only used for vector sockets
so the function is limited to the vector builder.

The flag is only used by two shader nodes at the moment
and this is needed to port them over to the new socket declaration API.

Reviewed By: JacquesLucke

Differential Revision: https://developer.blender.org/D13490
2021-12-06 13:40:02 -05:00
2d4c7fa896 Geometry Nodes: reduce code duplication with new GeometyrFieldInput
Most of our field inputs are currently specific to geometry. This patch introduces
a new `GeometryFieldInput` that reduces the overhead of adding new geometry
field input.

Differential Revision: https://developer.blender.org/D13489
2021-12-06 19:13:24 +01:00
2814740f5b Geometry Nodes: 4 Field Inputs for Mesh Topology Data
Creates 4 new nodes which provide topology information
for the mesh. Values are interpolated from the primary
domain in each case using basic attribute interpolation.

Vertex Neighbors
  - Vertex Count
  - Face Count
Face Neighbors
  - Vertex Count
  - Neighboring Face Count
Edge Vertices
  - Vertex Index 1
  - Vertex Index 2
  - Position 1
  - Position 2
Face Area
  - Face Area

Differential Revision: https://developer.blender.org/D13343
2021-12-06 11:58:08 -06:00
ee4ed99866 Fix T93521: Single point NURBS crash in resample node
The resample node didn't handle the case of when a spline didn't have
any evaluated points. For poly and Bezier splines we should never hit
this case, but it is expected when the number of NURBS control points
is smaller than its order, so we have to handle the case here.

It's not that obvious what to do in this case, there are a few options:
 - Remove the bad splines from the result
 - Generate empty splines for those inputs
 - Skip resampling the bad splines, copy them to the result
 - Arbitrarily generate single point splines

I chose option three, just skipping the "bad" splines. Since the node
already has a selection input, this can be described by just extending
that. "Splines with no evaluated points are implicitly deselected."
The first option would probably be valid too though.

Differential Revision: https://developer.blender.org/D13434
2021-12-06 12:19:27 -05:00
Aaron Carlisle
0bd3cad04e Nodes: Add Shader Socket to new decleration API
This commit adds the shader socket type to the new socket builder api.

As a test, this commit also converts the Add Shader node to the new API

Reviewed By: JacquesLucke

Differential Revision: https://developer.blender.org/D13485
2021-12-06 11:59:52 -05:00
Aaron Carlisle
f72cc47d8e Shader Nodes: Unity Build
- Create a new `bf_nodes_shader` library
- Enable unity builds for  `bf_nodes_shader`, gives abount a 2.7x speed up for compile times

Reviewed By: JacquesLucke

Differential Revision: https://developer.blender.org/D13484
2021-12-06 11:47:45 -05:00
0c703b856b Disable asset indexing in the Asset List
Asset indexing should be disabled, but this was overlooked in the
asset list (used for the pose library in the 3D View).
2021-12-06 17:42:16 +01:00
86992a96b8 Asset Indexer: use fixed-length string for ID code
Use fixed-length string to convert `char[2]` to `std::string`. Otherwise
`strlen()` is called, which is problematic as the `char[2]` is not
zero-terminated.
2021-12-06 17:42:16 +01:00
c2292b2cd6 Cleanup: remove unnecessary extern template implementations
This technique isn't really necessary anymore, because unity builds
avoid instantiating the same template too many times already.
2021-12-06 17:31:42 +01:00
7d1a10a9bb Fix T93314: Thumbnails not drawn with default scale
Decrease threshold for drawing thumbnails.

This was unintended change in daaa43232d that was overlooked.
2021-12-06 17:01:51 +01:00
a5e3899853 VSE: Fix strip with mask modifier not blending
Set `ibuf->planes` to `R_IMF_PLANES_RGBA` because mask modifier adds
transparent areas to image.
2021-12-06 17:01:51 +01:00
Aurelien Jarno
477631d9ec cmake: fix linking with WITH_X11_XF86VMODE and bfd
Fix typos in the variables of the Xxf86vm libray to fix link failure
with bfd. These variables are defined in platform_unix.cmake.

Reviewed By: zeddb

Differential Revision: https://developer.blender.org/D12911
2021-12-06 16:46:23 +01:00
b9c6ef4e8f Fix T93707: Dragging the tweaked NLA strip causes crash
Earlier code assumed that the active strip was on the active track. This
commit detects when this assumption doesn't hold, and adds a more thorough
search of the active strip.
2021-12-06 15:50:21 +01:00
0e52af097f Fix T93611: Curve modifier crash in editmode in certain situations
Caused by {rB3b6ee8cee708}

Above commit was trying to get the vertexgroup from the mesh that is
passed into `deformVertsEM` (but that can be NULL).
When can it be NULL, when is is non-NULL?
`editbmesh_calc_modifiers` only passes in a non-NULL mesh to
`deformVertsEM` under certain conditions:
- a non-deform-only modifier is handled currently
- a non-deform-only modifier preceeds the current modifier
- a deform-only modifier preceeds the current modifier (and the current
one depends on normals)

So the passed-in mesh cannot be relied on, now get the vertex group from
the context object data (like it was before the culprit commit).

Related commit: rB8f22feefbc20

Maniphest Tasks: T93611

Differential Revision: https://developer.blender.org/D13487
2021-12-06 15:20:30 +01:00
989d510e3e Cleanup: remove some temp dev asserts in new link/append code.
No longer needed now that all code uses that new
BKE_blendfile_link_append module, and that instantiation code in
BLO_readfile has been removed.
2021-12-06 11:29:27 +01:00
9a69c456bd Fix T93388: dropping object on grid in orthogonal view misses the floor plane
`ED_view3d_win_to_3d_on_plane` does not use the `clip_start` and
`clip_end` values of the scene, so the `do_clip` option can be misleading
especially in the orthographic view where the `clip_start` is negative.

For now, don't use the `do_clip` option in orthographic view.
2021-12-05 23:40:43 -03:00
3d8dea9ff9 Fix T93732: Snap Cursor not working after changing Add Object settings
`g_data_intern.state_default.gzgrp_type` is a very specific member and
cannot be set to default.
2021-12-05 23:40:43 -03:00
Aaron Carlisle
3f7014ecc9 Shader Nodes: Declare nodes in their own namespace
Follow up on rB1df8abff257030ba79bc23dc321f35494f4d91c5

This puts all static functions in geometry node files into a new
namespace. This allows using unity build which can improve
compile times significantly

- The namespace name is derived from the file name.
  That makes it possible to write some tooling that checks the names later on.
  The filename extension (cc) is added to the namespace name as well.
  This also possibly simplifies tooling but also makes it more obvious that this namespace is specific to a file.
- In the register function of every node, I added a namespace alias namespace `file_ns = blender::nodes::node_shader_*_cc`;.
  This avoids some duplication of the file name and may also simplify tooling, because this line is easy to detect.
  The name `file_ns` stands for "file namespace" and also indicates that this namespace corresponds to the current file.

In the future some nodes will be split up to separate files and given their own namespace
This will allow function names to be simplified similar to rBfab39440e94

Reviewed By: HooglyBoogly

Differential Revision: https://developer.blender.org/D13480
2021-12-05 17:47:33 -05:00
0a8a22fc04 Cleanup: Remove unused next and prev pointers in bNodeType 2021-12-05 17:15:21 -05:00
9a312ba192 Cleanup: Remove unnecesary node type draw callback
As a followup to 338c1060d5, apply the same change to the node
drawing callback. This helps to simplify code when the complexity
of a callback isn't necessary right now.
2021-12-05 17:12:25 -05:00
338c1060d5 Cleanup: Remove unnecessary node type callbacks for drawing
Currently there are a few callbacks on `bNodeType` that do the same
thing for every node type except reroutes and frame nodes. Having a
callback for basic things complicates code and makes it harder to
understand, and reroutes and frames are special cases in larger way.

Arguably frame nodes shouldn't even be drawn like regular nodes,
given that it adds a case of O(N^2) looping through all nodes.
"Unrolling" the callbacks makes it easier to see what's happening,
and therefore easier to optimize.

Differential Revision: https://developer.blender.org/D13463
2021-12-05 16:45:41 -05:00
0578921063 Cleanup: clang-tidy: modernize-redundant-void-arg
This change follows up on recent c --> c++ conversions
2021-12-05 16:38:00 -05:00
c20098e6ec Cleanup: Add missing include
Fixes compilation errors after rBd5efda72f501
Re sorted some includes.
2021-12-05 13:33:15 -05:00
Aaron Carlisle
d5efda72f5 Cleanup: Migrate all shader nodes to c++
This will be useful in the future to use the new socket builder API

Aditional changes:

- Declare variables where initialized
- Do not use relative includes

Differential Revision: https://developer.blender.org/D13465
2021-12-05 12:12:45 -05:00
b32f9bf801 Geometry Nodes: use array instead of map in GeometrySet
`GeometrySet` contains at most one component of each type.
Previously, a map was used to make sure that each component
type only exists once. The overhead of a map (especially with
inline storage) is rather large though. Since all component types
are known at compile time and the number of types is low,
a simple `std::array` works as well.

Some benefits of using `std::array` here:
* Looking up the component of a specific type is a bit faster.
* The size of `GeometrySet` becomes much smaller from 192 to 40 bytes.
* Debugging a `GeometrySet` in many tools becomes simpler because
  one can  easily see which components exists and which don't
2021-12-05 15:10:11 +01:00
d19443074a Fix (unreported): off-by-one error when setting curve handles
The problem is that `current_mask` can become `selection.size()`.
The loop could also be refactored to break out of it when the end
of the selection is reached. This can be done separately.
2021-12-05 14:49:30 +01:00
Aaron Carlisle
5ef5a9fc24 Compositor: Migrate most nodes to new socket builder API
This patch leaves a out a few nodes:

- Group Nodes
- Image input node
- File output node
- Switch View
- Cryptomatte

These nodes above are a bit more complicated and should be worked on individually.

Differential Revision: https://developer.blender.org/D13266
2021-12-03 20:26:38 -05:00
Christoph Lendenfeld
d5920744f4 Create generic modal functions from GRAPH_OT_decimate
This patch extracts the modal functions from GRAPH_OT_decimate
and makes them generic so they can be reused by future operators

Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D9326
Ref: D9326
2021-12-03 22:24:41 +00:00
56ed4c14d3 Cleanup: Use LISTBASE_FOREACH macro 2021-12-03 16:33:14 -05:00
2d8606b360 Cleanup: Use references in node editor, other improvements
This helps to tell when a pointer is expected to be null, and avoid
overly verbose code when dereferencing. This commit also includes
a few other cleanups in this area:
 - Use const in a few places
 - Use `float2` instead of `float[2]`
 - Remove some unnecessary includes and old code
The change can be continued further in the future.
2021-12-03 16:25:17 -05:00
ca0dbf8c26 Cleanup: Remove unused code
This is very old and is unlikely to be useful in the near future.
2021-12-03 12:31:00 -05:00
d1f118d228 Cleanup: Const, use references, C++ types
Also remove some unnecessary includes
2021-12-03 11:43:10 -05:00
be3f3812dc Cleanup: Comments and ordering in node editor header 2021-12-03 11:31:25 -05:00
cb0fbe1fde Cleanup: Use typed enum for node resize direction 2021-12-03 11:05:59 -05:00
ab927f5ca7 ImBuf: Made Wrapping and Cropping optional in IMB_transform.
`IMB_transform` is used in VSE. It had a required crop parameter
for cropping the source buffer. This is not always needed.

In the image engine we want to use the use the `IMB_transform`
with wrap repeat. Both options are mutual exclusive and due
to performance reasons the wrap repeat is only available when
performing a nearest interpolation.
2021-12-03 13:48:00 +01:00
c4e041da23 Cleanup: move public doc-strings into headers for 'bmesh'
Some minor improvements to doc-strings too.

Ref T92709
2021-12-03 20:10:57 +11:00
Jesse Yurkovich
7c4fc5b58d Fix T93541: Use warning instead of error for exceeding layer limits
Instead of using RPT_ERROR, use RPT_WARNING which will not raise an
exception to Python. This broke some scripts (including FBX import)
which already check for a None return value.

Ref D13458

Reviewed By: campbellbarton
2021-12-03 17:05:18 +11:00
de5d36560f Fix T93574: Asset triangulating a mesh
Correct assert for edit-mesh normal calculation.
2021-12-03 16:55:34 +11:00
56ff954030 UI: Expand tree-view items (e.g. asset catalogs) on click to activate
This actually gives a quite nice behavior in my opinion, especially for
asset catalogs, where activating a catalog makes all assets inside it or
its (grand-)child catalogs visible, so showing the child catalogs then
adds useful information. Maybe this should become a feature for
asset catalogs only, to be evaluated once the tree-view API is used in
more cases. Only asset catalogs are affected by this change right now.

Part of T93582.
2021-12-02 19:49:48 +01:00
c0122cc888 Asset Browser: Don't expand top-level catalogs by default
The "All" item will be expanded of course, and the tree will also be
expanded so that the active item is visible. But feedback was that in
some setups, opening all the top-level catalogs is a bit too much. So
collapse them for a more compact default layout.

Part of T93582.
2021-12-02 19:49:48 +01:00
a159f67ccc Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-12-02 19:37:01 +01:00
f1cca30557 Blender 3.0 - version bump -> release 2021-12-02 19:35:47 +01:00
0988711575 Licenses: Attribution document for Blender 3.0
A few libraries were updated, a few were added, and a few were missing
from the previous license document.
2021-12-02 19:35:01 +01:00
fca6a9fe3f Asset Browser: Remove home icon for "All" item
Originally this wasn't really meant to stay there permanently, but
helped giving things a nicer alignment. Others seemed to like it at
first so it stayed, but after more feedback we decided to remove it
again. The alingment is better now with the previous commit.

Part of T93582.
2021-12-02 19:14:50 +01:00
e38532773b Asset Browser: Nest all catalogs under the "All" item
This makes the relation to the catalogs and the behavior more clear.

Part of T93582.
2021-12-02 19:14:50 +01:00
cef8f5ff50 Docs: add README for HIPEW library 2021-12-02 18:43:42 +01:00
1e98a0cee5 NanoSVG: Mention the version we use 2021-12-02 18:22:05 +01:00
431255e5e8 Docs: 3.0 release description for Linux appdata
Includes a typo fix for 2.93.
2021-12-02 18:02:19 +01:00
27b70428c1 Cleanup: Avoid using C++ keyword as variable name 2021-12-02 11:20:22 -05:00
e6a732daad Merge branch 'blender-v3.0-release' 2021-12-02 16:43:25 +01:00
7e60d8a713 Fix missing Blender logo in Windows store package
D9681 was not properly merged to all branches, leaving a path to a non-existent
icon file in the maniphest.
2021-12-02 16:35:24 +01:00
2fd657db5b Fix T93560: crash with image paint undo and cycles preview render
Cycles preview rendering could free the image buffers being used by drawing in
another thread due to a race condition. This race condition was unlikely before,
but now that preview renders are started right before we draw the image in the
image editor or load it as a texture in the 3D viewport, it's likely to happen.

As we are close to release this is too risky to fix properly, just avoid freeing
the cache for preview renders instead and accept increased memory usage in some
cases.
2021-12-02 16:34:16 +01:00
aec56e562a Fix (unreported): incorrect custom data layer created
Without this fix `CustomDataAttributes::create_by_move`
did not work on named attributes.
2021-12-02 15:32:19 +01:00
a1f0f2eacb Cleanup: Move public docs to BKE_spline.hh header 2021-12-02 09:24:21 -05:00
23ffcb242d Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-12-02 14:40:20 +01:00
61e92eeb3e Fix Action.asset_data["is_single_frame"] set incorrectly
The asset metadata custom property `["is_single_frame"]` was set
incorrectly. Since this is intended for forward compatibility, including
being covered by the asset metadata indexing, it's important to have it
set correctly from the first release of Blender that includes the asset
browser.

Differential Revision: https://developer.blender.org/D13452
2021-12-02 14:35:12 +01:00
1620dcd208 Fix: don't use BLI_strncpy_utf8 for copying file paths
File paths can be any encoding, so using some UTF-8-specific function is
not the right way to go.
2021-12-02 13:02:39 +01:00
c5ec3738d8 Blenloader: move ghost include path from INC to TEST_INC
The Ghost dependency was added to avoid a memory leak (rBc7a1e115b507),
but since that's only for the unit tests, it's better to add the path to
`TEST_INC`.
2021-12-02 12:58:45 +01:00
e130903060 BLI: avoid invoking tbb for small workloads
We often call `parallel_for` in places with very variable
sized workloads. When many elements are processed,
using multi-threading is great, but when processing
few elements (possibly many times) using `parallel_for`
can result in significant overhead.

I measured that this improves performance by >20% in
the refactored realize instances code I'm working on
separately. The change might also help with debugging
sometimes, because the stack trace is smaller and contains
fewer irrevelant symbols.
2021-12-02 12:56:47 +01:00
0f89d05848 Fix T93563: Crash subdividing with overlapping tri and quad
The first loop was left out when finding the split edge boundary.

Error from f2138686d9.
2021-12-02 22:53:44 +11:00
42a6b2fd06 Cleanup: move public doc-strings into headers for 'python' 2021-12-02 22:53:44 +11:00
1766549418 Fix T92308: OptiX denoising fails with high resolutions
The OptiX denoiser does have an upper limit as to how many pixels it can denoise at once, so
this changes the OptiX denoising process to use tiles for high resolution images.
The OptiX SDK does have an utility function for this purpose, so changes are minor, adjusting
the configured tile size and including enough overlap.

Maniphest Tasks: T92308

Differential Revision: https://developer.blender.org/D13436
2021-12-02 12:10:46 +01:00
7da979c070 Merge branch 'blender-v3.0-release' 2021-12-02 11:19:00 +01:00
67c490daaf Fix T93548: Appended (material) assets don't have a fake user
Since our design is to always keep data-blocks marked as assets on exit,
and our technical design for this is to do this via fake users, ensure
the fake user is set for an appended asset.

Reviewed by: Bastien Montagne

Differential Revision: https://developer.blender.org/D13443
2021-12-02 11:18:27 +01:00
68e3755209 Fix T93555: crash when muting nodes with multiple internal links
The crash happened because I was incorrectly and inconsistently assuming
that a socket is part of at most one internal link. However, this is not the case.
In geometry nodes, an input socket can be internally linked to multiple
output sockets. In the general case, an output could also be linked to multiple
input sockets, even though we don't have that in Blender yet.

Dalai gave green light to cherry pick this fix for 3.0.
2021-12-02 11:10:41 +01:00
9f290467ca Blendread: Remove all instantiation logic from BLO_library_link_ code.
Instantiation is now fully handled by BKE_blendfile_link_append module.

Note that this also allows removal of the `BLO_LIBLINK_NEEDS_ID_TAG_DOIT`
flag.

Part of T91414: Unify link/append between WM operators and BPY context
manager API, and cleanup usages of `BKE_library_make_local`.
2021-12-02 11:10:34 +01:00
4fe8c62b56 Cleanup: Readfile: Remove deprecated BLO_library_link_copypaste.
Rewrite of ID paste code in rB3f08488244c0 made this function useless,
ID pasting is now handled by the BKE_blendfile_link_append module too.
2021-12-02 10:27:34 +01:00
198e571e87 Fix T93555: crash when muting nodes with multiple internal links
The crash happened because I was incorrectly and inconsistently assuming
that a socket is part of at most one internal link. However, this is not the case.
In geometry nodes, an input socket can be internally linked to multiple
output sockets. In the general case, an output could also be linked to multiple
input sockets, even though we don't have that in Blender yet.
2021-12-02 09:41:36 +01:00
f1b0b0ffb8 Cleanup: spelling in comments 2021-12-02 16:02:34 +11:00
8f69c91408 Fix T93410: Crash hiding a region from Python used by a popover
Close all active buttons when hiding a region as this can be called
from Python a popover is open from that region.

Failure to do this causes the popover to read from the freed button.

Also rename UI_screen_free_active_but to
UI_screen_free_active_but_highlight since it only frees highlighted
buttons.
2021-12-02 15:53:41 +11:00
0de1d2e84e Cleanup: FIx build with USD after recent refactor
rB218360a89217f4e8321319035bf4d9ff97fb2658 missed a couple renames in USD code paths.
2021-12-01 23:34:51 -05:00
Leon Leno
bb3d03973a UI: Fix scaling of HSV cursor when zooming
The small circle used to choose the hue/saturation and value in the
color widget was drawn with a fixed screen space size. Now scale the
circle used as cursor in the color widget based on the zoom. This
could have been part of a9642f8d61 but the implementation is
different.

Based on a fix provided by Erik Abrahamsson

Differential Revision: https://developer.blender.org/D13444
2021-12-01 22:22:53 -05:00
Nikhil Shringarpurey
26d2caee3b Fix T92268: Group input and output nodes have inconsistent padding
The group output node did not have the same size padding as the group
input, leading to the node looking different and actually being smaller.

Differential Revision: https://developer.blender.org/D13092
2021-12-01 22:04:44 -05:00
218360a892 Cleanup: Rename curve struct fields
These existing names were unhelpful at best, actively confusing at
worst. This patch renames them to be consistent with the terms
used to refer to the values in the UI.
 - `width` -> `offset`
 - `ext1` -> `extrude`
 - `ext2` -> `bevel_radius`

Differential Revision: https://developer.blender.org/D9627
2021-12-01 22:01:35 -05:00
ca9cdba2df Fix: Add tooltip translation marker to disabled hints
This was overlooked, as it seems there's no way for these strings to be
translated currently. Generally it's not that clear whether `N_` or
`TIP_` should be used in this case, but `TIP_` seems more consistent.
To avoid the cost of the translation lookup when the UI text isn't
necessary, we could allow the disabled hint argument to be optional.

Differential Revision: https://developer.blender.org/D13141
2021-12-01 21:55:04 -05:00
b869da0c10 Fix T84710: Instances with only mesh edges or vertices are invisible
Wire-only meshes have a special case in the overlay drawing to give
the wire shader a special color (which avoids the lines being dashed,
somehow). The fast path for duplis didn't have that special case.

Differential Revision: https://developer.blender.org/D13196
2021-12-01 21:47:57 -05:00
70a7685d04 UI: Add an option to display the node editor context path
Since we have the overlays popover, it makes sense to allow toggling the
context path like in the 3D viewport. This commit adds a property,
and turns it on by default in existing files.

Differential Revision: https://developer.blender.org/D13248
2021-12-01 21:45:41 -05:00
2b6c01d98c Fix: Updating geometry nodes test results fails
I ran into this when adding new geometry nodes tests.
The issue was caused by 7168a4fa5c.

Differential Revision: https://developer.blender.org/D13440
2021-12-01 21:22:00 -05:00
70a0d45b69 Merge branch 'blender-v3.0-release' 2021-12-01 21:17:48 -05:00
594656e7a3 Fix T93525: Crash with curve/text armature bone gizmo
The problem is that drw_batch_cache_generate_requested_delayed
is called on the object, which uses the original object data type to
choose which data type to get info for. So for curves and text it uses
the incorrect type (not the evaluated mesh like we hardcoded in the
armature overlay code).

To fix this I hardcoded the "delayed" generation to only use the
evaluated mesh. Luckily it wasn't use elsewhere besides this
armature overlay system. That seems like the simplest fix for
3.0. A proper solution should rewrite this whole area anyway.

Differential Revision: https://developer.blender.org/D13439
2021-12-01 21:16:18 -05:00
fed4fc9c42 Cleanup: Move node_shader_util.c to C++
This will allow using a function I've declared in a C++ header.
2021-12-01 15:08:42 -05:00
8ca8380699 Cleanup: compiler warning 2021-12-01 20:52:31 +01:00
Sebastian Herholz
cb334428b0 Cycles: fix bugs in point and spot light multiple importance sampling
* Spot lights are now handled as disks aligned with the direction of the
  spotlight instead of view aligned disks.

* Point light is now handled separately from the spot light, to fix a case
  where multiple lights are intersected in a row. Before the origin of the
  ray was the previously intersected light and not the origin of the initial
  ray traced from the last surface/volume interaction.

This makes both strategies in multiple importance sampling converge to the same
result. It changes the render results in some scenes, for example the junkshop
scene where there are large point lights overlapping scene geometry and each
other.

Differential Revision: https://developer.blender.org/D13233
2021-12-01 20:20:12 +01:00
9afd6e7b70 Cleanup: clarify material copying for hair and pointclouds
No functional change, just more clear this way it comes from src.
2021-12-01 19:48:48 +01:00
Yuki Hashimoto
7336af3259 Fix some shortcut keys not working on macOS with Japanese input
Differential Revision: https://developer.blender.org/D13414
2021-12-01 19:40:47 +01:00
128ebdb062 Fix: Remove incorrect asserts for empty attributes
While it is an edge case, it isn't incorrect for the attribute storage
to have a zero size, it will just return an empty span or nothing.
2021-12-01 12:44:02 -05:00
bc48da3235 Fix: Instances component does not copy attributes 2021-12-01 12:36:49 -05:00
be6e56b36b Cleanup: Fix errors in previous commit
After 1ef8ef4941 there were build warnings because of unused
arguments. Also missed to change code to iterate `strips` instead of
`seqbase` in 2 functions.
2021-12-01 18:13:09 +01:00
506d672524 Geometry Nodes: deduplicate join geometry code
The Realize Instances and Join Geometry node can share most of their code.
Until now, both nodes had their own implementations though. This patch
removes the implementation in the Join Geometry node in favor of the more
general Realize Instances implementation.

This removes an optimization for avoiding spline copies. This can be brought
back later. The realize instances code has to be rewritten to support attribute
instances anyway.

Differential Revision: https://developer.blender.org/D13417
2021-12-01 17:57:08 +01:00
Wannes Malfait
d54a08c8af Geometry Nodes: Dual Mesh Node
This node calculates the dual of the input mesh. This means that faces
get replaced with vertices and vertices with faces. In principle this
only makes sense when the mesh in manifold, but there is an option to
keep the (non-manifold) boundaries of the mesh intact.

Attributes are propagated:
 - Point domain goes to face domain and vice versa
 - Edge domain and Face corner domain gets mapped to itself
Because of the duality, when the mesh is manifold, the attributes get
mapped to themselves when applying the node twice.

Thanks to Leul Mulugeta (@Leul) for help with the
ascii diagrams in the code comments.

Note that this does not work well with some non-manifold geometry,
like an edge connected to more than 2 faces, or a vertex connected to
only two faces, while not being in the boundary. This is because there
is no good way to define the dual at some of those points. This type
of non-manifold vertices are just removed for this reason.

Differential Revision: https://developer.blender.org/D12949
2021-12-01 11:11:50 -05:00
1757840843 Geometry Nodes: Generalized Compare Node
Replace compare floats node with a generalized compare node. The node
allows for the comparison of float, int, string, color, and vector.

The datatypes support the following operators:
Float, Int: <, >, <=, >=, ==, !=
String: ==, !=
Color: ==, !=, lighter, darker
    (using rgb_to_grayscale value as the brightness value)

Vector Supports 5 comparison modes for: ==, !=, <, >, <=, >=
Average: The average of the components of the vectors are compared.
Dot Product: The dot product of the vectors are compared.
Direction: The angle between the vectors is compared to an angle
Element-wise: The individual components of the vectors are compared.
Length: The lengths of the vectors are compared.

Differential Revision: https://developer.blender.org/D13228
2021-12-01 09:36:25 -06:00
f8dd03d3dd Cleanup: Store instances id attribute with other attributes
Now that we can store any dynamic attribute on the instances component,
we don't need the special case for `id`, it can just be handled by the
generic attribute storage. Mostly this just allows removing a bunch
of redundant code.

I had to add a null check for `update_custom_data_pointers` because
the instances component doesn't have any pointers to inside of
custom data.

Differential Revision: https://developer.blender.org/D13430
2021-12-01 09:27:27 -05:00
Paul Golter
fd8418385c Add layer and pass index parameters to rna_Image_gl_load
This patch adds a layer_idx and pass_idx parameter to the rna_Image_gl_load function. It exposes both to the Python API as optional parameters.
This allows python scripters to specifiy which layer and pass they want to load in to an OpenGL texture. Right now image.gl_load() always takes the first layer and first pass.
This is limiting when working with multilayer openEXRs.

With this patch scripters can do something like this:

```
pass_idx = area.spaces.active.image_user.multilayer_pass
layer_idx = area.spaces.active.image_user.multilayer_layer
image.gl_load(frame=0, layer_idx=layer_idx, pass_idx=pass_idx)
```

As the parameters are optional and default to 0, it should not break existing code.

Reviewed By: campbellbarton

Differential Revision: https://developer.blender.org/D13435
2021-12-01 14:58:49 +01:00
1ef8ef4941 Cleanup: Remove seq->tmp_flag DNA member
Commit f0d20198b2 used this field to flag rendered strips which were
queried by `SEQ_query_rendered_strips()`. Then operators iterate all
strips and checks state of `seq->tmp_flag`.

Use collection returned by `SEQ_query_rendered_strips` directly.
There should be no functional changes.

This commit adds functions `all_strips_from_context` and
`selected_strips_from_context` that provide collection of strips based
on context. Most operators can use this collection directly.
There is already `seq->tmp` DNA field, but is should not be used unless
absolutely necessary. Better option is to use human readable flag.
Best is to not use DNA fields for temporary storage in runtime.
2021-12-01 12:49:17 +01:00
d5d91b4ae4 Color adjustments for splash for Blender 3.x series
Tweaks by Andy Goralczyk.
2021-12-01 12:46:20 +01:00
02ab4ad991 Fix T92561: unstable particle distribution with Alembic files
When enabling or disabling a Mesh Sequence Cache modifier of an Object
with a hair particle system, the hair would switch positions. This is
caused because original coordinates in Blender are expected to be
normalized, and toggling the modifier would cause the usage of different
orco layers: one that is normalized, and the other which isn't.

This bug exposes a few related issues:
- if the Alembic file did not have orco data,
`MOD_deform_mesh_eval_get`, used by the particle system modifier, would
add an orco layer without normalization
- `MOD_deform_mesh_eval_get` would also ignore the presence of an orco
layer (e.g. one that could have been read from Alembic)
- if the Alembic file did have orco data, the data would be read
unnormalized

To fix those various issues, original coordinates are normalized when
read from Alembic and unnormalized when written to Alembic; and a new
utility function `BKE_mesh_orco_ensure` is added to add a normalized
orco layer if none exists on the mesh already, this function derives
from the code used in the particle system.

Reviewed By: brecht

Maniphest Tasks: T92561

Differential Revision: https://developer.blender.org/D13306
2021-12-01 12:44:32 +01:00
fd22404837 Cleanup: Use int8 type rather than char for buffer
Indicates that this is just a buffer with an element size of 8 bit, not
a displayable/printable string buffer.
2021-12-01 12:19:43 +01:00
51791004ea Fix errors in user preferences after 2.8 hack removal
How to reproduce it:
* Open User Preferences.
* Got to Add-ons tab.

Issue introduced on d723e331f1

Reported via chat by Pedro Alcaide (povmaniac).
2021-12-01 11:38:08 +01:00
9cec9b4d6e Fix(unreported): LineArt intersection mask logic error.
The stroke generation call mistakenly uses all enabled
types to check intersection mask, the correct behavior
is to use individual edge(chain) type.
2021-12-01 15:55:52 +08:00
24b84e4688 LineArt: Use consitent view vector direction.
Now do not invertes view vector in different stages of calculation.
2021-12-01 15:55:16 +08:00
5eeaf4cce6 LineArt: Use consitent view vector direction.
Now do not invertes view vector in different stages of calculation.
2021-12-01 15:53:30 +08:00
386b112f76 Geometry Nodes: Propagate attributes in Instances to Points node
This is part of T92926, and is very similar to 221b7b27fc, except
slightly similar, because it only transfers from one component, and
it doesn't handle multi-threading at the moment.
2021-11-30 23:30:35 -05:00
Aaron Carlisle
d723e331f1 Cleanup: Remove hack to hide pre 2.8 addons in the user preferences
Should be safe to do so now that 2.8+ has been out for 8 releases and over 2 years now.

Reviewed By: campbellbarton

Differential Revision: https://developer.blender.org/D12984
2021-11-30 22:55:52 -05:00
1cb99f5808 Merge branch 'blender-v3.0-release' 2021-12-01 11:47:28 +11:00
b3d101ac29 Fix T93100: VSE RMB shift-select fails with "fallback tools"
When the select action was set to "Select Tool", shift-clicking
on sequence strips wasn't selecting the strip.

Regression in 2a2d873124

Thanks to @a.monti for the fix.
2021-12-01 11:43:19 +11:00
88e9e97ee9 Geometry Nodes: Add Point Count to Spline Length Node
- Integer Field input of the number of control points on each spline
  in the spline domain.

  - In the point domain, it is the number of points on the spline that
  contains the given control point.

Differential Revision: https://developer.blender.org/D13279
2021-11-30 15:25:43 -06:00
247f37f765 Fix missing subsurface IOR/Anisotropy socket after refactor in c2ab47e 2021-11-30 22:09:18 +01:00
221b7b27fc Geometry Nodes: propagate attributes in Instance on Points node
This is part of T92926.

Differential Revision: https://developer.blender.org/D13429
2021-11-30 18:37:44 +01:00
88b37b639e LibLink/Append: Some cleanup and addition to comments. 2021-11-30 17:37:00 +01:00
1b4734c57d Geometry Nodes: Curve Menu Organize
Put Spline Parameter in the correct position in the Curve menu.
2021-11-30 10:32:12 -06:00
3f08488244 Rewrite ID Paste code to use new BKE_blendfile_link_append module.
No behavioral changes are expected here.

Part of T91414: Unify link/append between WM operators and BPY context
manager API, and cleanup usages of `BKE_library_make_local`.
2021-11-30 17:18:35 +01:00
b9f54dd48a LibLink/Append: Add a utils to link/append all ID of given types from a given library.
This will be used by the copy/paste code.

Part of T91414: Unify link/append between WM operators and BPY context
manager API, and cleanup usages of `BKE_library_make_local`.
2021-11-30 17:18:26 +01:00
581fb2da10 LibLink/Append: tweak instantiation of collections in viewlayers.
When appended collections are instantiated in viewlayers, do not add to
viewlayers collections children of other instantiated collections.

This default behavior is required for proper copy/paste behavior. If
previous behavior needs to be brought back for regular append operation,
this can be decided and done, but think this default behavior also makes
sense in normal append case.

part of T91414: Unify link/append between WM operators and BPY context
manager API, and cleanup usages of `BKE_library_make_local`.
2021-11-30 17:13:28 +01:00
c822e03e2a Fix: Missing handling of dynamic instance attribute size
The attributes need to be reallocated when the size changes.

Differential Revision: https://developer.blender.org/D13390
2021-11-30 10:59:11 -05:00
2fbb52dd86 Merge branch 'blender-v3.0-release' 2021-11-30 15:41:29 +01:00
3788003cda Fix T93368: Dragging Blends Without Previews
Unfortunately the drop logic for file-path based drag & drop checks the
used icon for its logic. This is very bad and should be changed. But
doing this involves some changes that are better not done during bcon4,
so for now stick to it and update the icon check.

Reviewed by: Julian Eisel

Differential Revision: https://developer.blender.org/D13383?id=45314
2021-11-30 15:40:27 +01:00
74039388cd Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-30 15:36:48 +01:00
7863e03e89 Fix(unreported): LineArt intersection mask logic error.
The stroke generation call mistakenly uses all enabled
types to check intersection mask, the correct behavior
is to use individual edge(chain) type.
2021-11-30 22:24:42 +08:00
Bastien Montagne
de7f1e8e07 Fix T93353: Reload Library Override file loses Constraints, take II.
When adding `INSERT` operations over RNACollection items, rna diffing
code did not properly report the properties as not being equals.

This in turn triggered the 'purge unused exiting override properties'
mechanism, thus deleting the exitsting (valid) insert override property
operation.

NOTE: This should also be backported to 2.93, and probably 2.83.

Reviewed By: sybren, jbakker

Maniphest Tasks: T93353

Differential Revision: https://developer.blender.org/D13426
2021-11-30 15:19:44 +01:00
Germano Cavalcante
251c017534 Fix T93477: Viewport X-Ray is influencing snapping even in material mode
The default snap behavior to perform on tools and cursors is to the
final geometry and not edited geometry.

In snapping to edited geometry, there are some specific behaviors that
are not convenient in some cases. For example the general occlusion
test of X-Ray geometries during dragdrop.

This fix also resolves a regression for tools like measure and placement
that were also ignoring the snap to face in x-ray mode.

Differential Revision: https://developer.blender.org/D13410
2021-11-30 11:03:57 -03:00
0704570721 LibLink/Append: Port bpy.data.libraries.load to new BKE_blendfile_link_append module.
Note that this fully replaces the 'PyCapsule' storage of linked/appended items
in the python API code by the generic storage of items in the
`BlendfileLinkAppendContext` data.

Maniphest Tasks: T91414

Differential Revision: https://developer.blender.org/D13331
2021-11-30 15:02:53 +01:00
8cf0d15b60 Fix T93508: Shift+F1 to switch to asset browser randomly crashes 2021-11-30 14:35:29 +01:00
1cd9fcd98d Geometry Nodes: Rename Curve Parameter, Add Index on Spline
- Rename the Curve Parameter node to Spline Parameter.
  - Add "Index on Spline" to the node. This output is the index of
the current point on it's parent spline rather than the entrire curve.

Differential Revision: https://developer.blender.org/D13275
2021-11-30 07:21:14 -06:00
2f7bec04e8 Asset Browser: Fix incorrect user message
Text would display "No asset selected" when it actually inidicates that
there is no asset active (not selected). Changed it to "No active asset"
now.
2021-11-30 13:01:23 +01:00
3692c0521c Fix Asset Browser properties region toggle not showing open/closed state
The button is supposed to be blue (default theme) when the properties
region is open, to indicate that state.
2021-11-30 12:40:57 +01:00
b67dca9b76 Depsgraph: remove shading parameters component
This component served no purpose anymore. It was technical
dept from the early 2.80 days.

Differential Revision: https://developer.blender.org/D13422
2021-11-30 12:14:23 +01:00
4b971bb87c Asset Bundle Copy button: only report each external dependency once
The `ASSET_OT_bundle_install` operator only works when the blend file is
self-contained. It reports any external dependencies. Before this patch:

- every dependency was mentioned, even when it repeated the same
  filename over and over again, and
- multiple dependencies were all mentioned in the error popup,
  potentially filling the screen.

This is now resolved by:

- only reporting each external file once, and
- referring to the console when there are multiple external dependencies.

Reviewed by: severin, dfelinto

Differential Revision: https://developer.blender.org/D13413
2021-11-30 11:50:38 +01:00
Julian Eisel
2e53f8b4b1 Fix T92577: Cannot open shortcut folders on Windows
`file.select()` wasn't handling redirects as it should when it also
opens directories. This was only uncovered by a change in the keymap.

Reviewed By: Bastien Montagne, Harley Acheson

Differential Revision: https://developer.blender.org/D13388
2021-11-30 11:29:40 +01:00
e1cb2a226c Merge branch 'blender-v3.0-release' 2021-11-30 11:17:04 +01:00
d8edc2c634 VSE: Disable interactivity in combined view
Combined view of timeline and preview causes seemingly unpredictable
behavior after some operators have been allowed to run in preview
region.

Disable new features in this combined view, so behavior should be
consistent with previous versions.

ref: https://developer.blender.org/T92584

Reviewed By: campbellbarton

Differential Revision: https://developer.blender.org/D13419
2021-11-30 11:15:20 +01:00
6f460b76fe LibLink/Append: tweak asserts in main BKE link/append functions.
Now that those functions are much widely used, just return early in case
there is nothing to link/append,  instead of asserting over it.
2021-11-30 11:04:41 +01:00
bc1e3238c4 Merge remote-tracking branch 'origin/blender-v3.0-release'
This includes adjustment of rBc12d8a72cef5 to the new path traversal code
introduced in rBe5e8db73df86.
2021-11-30 10:56:14 +01:00
c12d8a72ce BPath traversing: allow skipping weak library references
Add flag to `BKE_bpath_traverse_id()` and friends to skip weak
references (see below). This makes a distinction between "this blend
file depends on that file" and "this blend file references that file,
but doesn't directly use its data". This distinction is for the Asset
Bundle install operator, which refuses to copy the blend file when it's
not self-contained.

Weak references are those that are not directly used by the blend file,
but are still present to allow path rewriting. For example, when an
Asset is loaded its originating blend file is saved in
`ID::library_weak_reference`; this reference is purely for deduplication
purposes, and not for actually loading any data.

Reviewed by: mont29, brecht

Differential Revision: https://developer.blender.org/D13412
2021-11-30 10:41:23 +01:00
d7f0de0e3a Fix T93442: Crash when proxy building is cancelled
If strip is removed while proxy is built, this causes crash when
building is finished. This happens because proxy anims are freed in
`SEQ_proxy_rebuild_finish()`.

Iterate over available strips and check if original strip still exists
by comparing `SessionUUID`.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D13408
2021-11-30 10:08:22 +01:00
Jake
7168a4fa5c Tests: add edit-mesh operator tests
Added operator tests for hide, symmetry_snap, tris_convert_to_quads,
uvs_rotate, uvs_rotate, uv_texture_add, uv_texture_remove,
vert_connect_concave, vert_connect_nonplanar, vertex_color_add,
vertex_color_remove, vertices_smooth_laplacian, wireframe,
sculpt_vertex_color_add and sculpt_vertex_color_remove.

Ref D11798

Reviewed By: campbellbarton
2021-11-30 17:43:24 +11:00
500ec993f5 Revert "Fix: Const warning in editmesh_knife.c"
It's important the coordinates the knife is operating on are never
manipulated since it will cause problems which are difficult to
troubleshoot.

Instead, use a cast in the MEM_freeN(..) call.

This reverts commit 8600d4491f.
2021-11-30 16:30:08 +11:00
Matheus Santos
da279927b1 Text Editor: Line number highlight follow selection
Change the current behavior of line number highlighting to follow the
current selected line text->sell, not the current line text->curl.
2021-11-30 14:32:52 +11:00
ac447ba1a3 Cleanup: clang-format, trailing space 2021-11-30 10:15:17 +11:00
e2473d3baf Cleanup: remove blank lines in comment blocks 2021-11-30 10:15:17 +11:00
76471dbd5e Cleanup: capitalize NOTE tag 2021-11-30 10:15:17 +11:00
4e45265dc6 Cleanup: spelling in comments & strings 2021-11-30 10:15:17 +11:00
262ef26ea3 Cleanup: use colon after doxygen params, correct slash direction 2021-11-30 10:15:17 +11:00
35124acd19 Geometry Nodes: Domain Size Node
The Domain Size node has a single geometry input and a selection for
the component type. Based on the component chosen, outputs containing
single values for the related domains are shown.

Mesh:
  - Point Count
  - Edge Count
  - Face Count
  - Face Corner Count
Curve:
  - Point Count
  - Spline Count
Point Cloud:
  - Point Count
Instances:
  - Instance Count

Differential Revision: https://developer.blender.org/D13365
2021-11-29 13:04:25 -06:00
c2001ec275 Merge branch 'blender-v3.0-release' 2021-11-29 19:25:28 +01:00
e7ae9f493a Fix T93310: crash due to broken image paths
The crash was caused by allocating an uninitialized amount of memory.
This fix initializes a bunch of variables that could cause the error.

It should be possible to also fix this in the function that actually uses
the uninitialized memory, but that could cause unknown consequences
that are a bit too risky for 3.0. Just initializing some variables should
be safe though. For more details see D13369.

Differential Revision: https://developer.blender.org/D13369
2021-11-29 19:23:43 +01:00
4fac3be146 Fix Cycles OptiX doing a bit too much work for almost opaque curve shadows
Found in D13353, likely has no significant impact in performance.
2021-11-29 18:41:37 +01:00
2e6f914e37 Fix debug build error after recent Cycles kernel argument changes 2021-11-29 18:41:37 +01:00
0adb356f2e Merge branch 'blender-v3.0-release' 2021-11-29 17:09:53 +01:00
aa7051c8f2 Fix T93439: Armature widgets from hidden collections are invisible
The are few things in the dependency graph which lead to the issue:
- IDs are only built once.
- Object-data level (Armature, i,e,) builder dependent on the object
  visibility.

This caused issues when an armature is first built as not directly
visible (via driver, i.e.) and then was built as a directly visible.
This did not update visibility flag on the node for the custom shape
object.

The idea behind the fix is to go away form passing object visibility
flag to the geometry-level builders and instead rely on the common
visibility flush post-processing to make sure certain objects are
fully visible when needed.

This is the safest minimal part of the change for 3.0 release which
acts as an additional way to ensure visibility. This means that it
might not be a complete fix (if some configuration was overseen) but
it should not make currently working cases to not work.

The fix should also make modifiers used on rigify widgets to work.

The more complete fix will have `is_object_visible` argument removed
from the geometry-level builder functions.

Differential Revision: https://developer.blender.org/D13404
2021-11-29 16:59:50 +01:00
aff6227412 Merge branch 'blender-v3.0-release' 2021-11-29 16:59:09 +01:00
dae9917915 Fix T93384: Objects with Constraints to curves have wrong locations on file load
Regression since 3.93 caused by 752c6d668b.

Follow the code from 2.93 which was always leaving curve modifiers
evaluation with a valid and clean state of the bounding box.

This is also what was proposed and agreed on in the following
design task: T92206: Bounding Box: compute during depsgraph evaluation

Tested with files from T90808 and T93384.

For the 3.0 going with the safest and minimal change. The rest of
the bounding box un-entanglement is to happen outside of the stable
branch.

Thanks The patch is based on the code from Philipp Oeser and
investigation by Germano Cavalcante and Dr. Sybren A. Stüvel,
thanks!

Differential Revision: https://developer.blender.org/D13409
2021-11-29 16:45:31 +01:00
2206b6b9a0 Fix T92628: .blend thumbnail renders black with Cycles 3D viewport render
Don't use Cycles for rendering thumbnails, fall back to Solid shading.

Differential Revision: https://developer.blender.org/D13406
2021-11-29 16:27:55 +01:00
f613c4c095 Cycles: MetalRT support (kernel side)
This patch adds MetalRT support to Cycles kernel code. It is mostly additive in nature or confined to Metal-specific code, however there are a few areas where this interacts with other code:

- MetalRT closely follows the Optix implementation, and in some cases (notably handling of transforms) it makes sense to extend Optix special-casing to MetalRT. For these generalisations we now have `__KERNEL_GPU_RAYTRACING__` instead of `__KERNEL_OPTIX__`.
- MetalRT doesn't support primitive offsetting (as with `primitiveIndexOffset` in Optix), so we define and populate a new kernel texture, `__object_prim_offset`, containing per-object primitive / curve-segment offsets. This is referenced and applied in MetalRT intersection handlers.
- Two new BVH layout enum values have been added: `BVH_LAYOUT_METAL` and `BVH_LAYOUT_MULTI_METAL_EMBREE` for XPU mode). Some host-side enum case handling has been updated where it is trivial to do so.

Ref T92212

Reviewed By: brecht

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D13353
2021-11-29 15:20:26 +00:00
98a5c924fc Cycles: Metal readiness: Specify DeviceQueue::enqueue arg types
This patch adds new arg-type parameters to `DeviceQueue::enqueue` and its overrides. This is in preparation for the Metal backend which needs this information for correct argument encoding.

Ref T92212

Reviewed By: brecht

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D13357
2021-11-29 14:56:06 +00:00
Pratik Borhade
03c9563582 Fix T93431: Crash when empty is marked as asset
Make `ED_preview_id_is_supported(ID *)` NULL-safe. It's semantically
valid, as it's not possible to render a preview of a NULL ID.

The crash was introduced in 481f032f5c

Reviewed By: sybren, jbakker

Maniphest Tasks: T93431

Differential Revision: https://developer.blender.org/D13398
2021-11-29 15:25:03 +01:00
f9add2d63e Fix: Missing min value for set spline resolution node 2021-11-29 09:11:51 -05:00
Bastien Montagne
e5e8db73df Refactor BKE_bpath module.
The main goal of this refactor is to make BPath module use `IDTypeInfo`,
and move each ID-specific part of the `foreach_path` looper into their
own IDTypeInfo struct, using a new `foreach_path` callback.

Additionally, following improvements/cleanups are included:
* Attempt to get better, more consistent namings.
** In particular, move from `path_visitor` to more standard `foreach_path`.
* Update and extend documentation.
** API doc was moved to header, according to recent discussions on this
   topic.
* Remove `BKE_bpath_relocate_visitor` from API, this is specific
  callback that belongs in `lib_id.c` user code.

NOTE: This commit is expected to be 100% non-behavioral-change. This
implies that several potential further changes were only noted as
comments (like using a more generic solution for
`lib_id_library_local_paths`, addressing inconsistencies like path of
packed libraries always being skipped, regardless of the
`BKE_BPATH_FOREACH_PATH_SKIP_PACKED` `eBPathForeachFlag` flag value,
etc.).

NOTE: basic unittests were added to master already in
rBdcc500e5a265093bc9cc.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D13381
2021-11-29 14:22:38 +01:00
6ae34bb071 Fix drawing annotations on surface
Caused by {rBaa0ac0035a0d}.

Similar solution to {rBc0fdaf700a5}.
2021-11-29 09:43:33 -03:00
7c703b4699 Fix T93417: Strip outline not aligned with image
When using non-uniform aspect ratio, strip outline is not aligned
correctly.

Apply pixel aspect ratio to image quad and origin.
2021-11-29 12:44:42 +01:00
f1118ee51e Nodes: Support internal links for custom sockets
Currently, nodes with custom sockets do not get their internal links
populated. So operators like Delete And Reconnect don't work with such
nodes. This patch put custom sockets into consideration when computing
priorities for internal links such that sockets of the same idname get
connected. Additionally, the patch cleanup the function in the process
to avoid redundant code repetition.

Reviewed By: Jacques Lucke

Differential Revision: https://developer.blender.org/D13386
2021-11-29 13:01:53 +02:00
ab2a7aa0da Fix T93438: Auto linking do not work for custom sockets
Currently, custom sockets are no longer supported for automatic linking
when dropping a node on a link. This is because SOCK_CUSTOM is given a
negative priority and is ignored. To fix this, SOCK_CUSTOM is now given
the lowest priority and the rest of the sockets got their priority
incremented.

Reviewed By: Jacques Lucke

Differential Revision: https://developer.blender.org/D13403
2021-11-29 12:34:11 +02:00
cebe5f5bf4 Cleanup: Silenced clang-tidy warning. 2021-11-29 11:26:35 +01:00
0cbcddd91e Merge branch 'blender-v3.0-release' 2021-11-29 02:14:05 -08:00
b31250feba Fix T93456: Properly translate operator on splash screen
Use the translation API to lookup the string before formatting occurs.

Differential Revision: https://developer.blender.org/D13400
2021-11-29 02:04:32 -08:00
444971aa8e Cleanup: typos in comments. 2021-11-28 19:05:22 +01:00
f7f558e293 Cleanup: Deduplicate instances component in spreadsheet
Currently we have a separate `InstancesDataSource`, which does almost
exactly the same thing as `GeometryDataSource`, except that it hardcodes
a few more columns: "Name", "Rotation", and "Scale". We can easily
replace that with a couple of if statements in the geometry data source.

This also makes named attributes on instances display
in the spreadsheet.

Differential Revision: https://developer.blender.org/D13391
2021-11-27 17:09:35 -05:00
e121b5b66c Merge branch 'blender-v3.0-release' 2021-11-27 21:59:57 +01:00
Brecht Van Lommel
d2e6087335 Fix build error with TBB 2021 and booleans
Linux distributions are using newer TBB versions than official releases, and
TBB 2021 is an API breaking release.

In general we should avoid using TBB directly and go through the abstractions
in BLI_task.hh, though there is no abstraction for this.

For 3.0 the safe option is to just not cancel the task but instead early out
in the lambda function. Given the grain size of 2048 there should be no
significant performance difference.

Differential Revision: https://developer.blender.org/D13382
2021-11-27 19:08:06 +01:00
aa6c922d99 Geometry Nodes: Optimize Cube primitive vertex calculation
This patch gets rid of the O(N^3) complexity
of calculate_vertices. Execution time of the node is
reduced from 250ms to 140ms with 500^3 vertices.
In the future edge calculations could be done manually
and reduce the execution time even further.

Differential Revision: https://developer.blender.org/D13207
2021-11-27 19:06:07 +01:00
d2f4fb68f5 Geometry Nodes: Parallelize "Set Spline Type"-node
Parallelizes the loop that converts splines.
It gives around a 2x speedup on curves with over 1k splines.

Differential Revision: https://developer.blender.org/D13389
2021-11-27 18:17:58 +01:00
7b4867d1ba progress 2021-11-27 13:02:34 +01:00
468bba3d2b initial testing 2021-11-27 11:49:27 +01:00
2531358297 Attempt to fix Windows new bpath tests failing, take IV.
Follow up to rBdcc500e5a265093bc9cc, rB92daff6ac2adb5bb,
rB61bd5882a20c6f3 and rB08264aaf82da8.

Sorry for the noise, this time should be good.
2021-11-26 22:39:34 +01:00
08264aaf82 Attempt to fix Windows new bpath tests failing, take III.
Follow up to rBdcc500e5a265093bc9cc, rB92daff6ac2adb5bb and rB61bd5882a20c6f3.
2021-11-26 22:07:37 +01:00
61bd5882a2 Attempt to fix Windows new bpath tests failing, take II.
Follow up to rBdcc500e5a265093bc9cc and rB92daff6ac2adb5bb.

Also shortening a bit the macros names...
2021-11-26 21:34:48 +01:00
a0bb6bb4d6 Fix T89564: Spline IK breaks when it is far away from the world origin
The isect_line_sphere algorithm became very imprecise when the line and
the sphere were reasonably far away from the world origin.

This would lead to no intersections being reported even if there was a
guaranteed intersection (line crossing from inside the sphere to the
outside).

To fix this we now use the secant root finding method to get an
intersection point. This is much more stable and robust it seems.
2021-11-26 18:24:58 +01:00
97465046c6 Geometry Nodes: add utility to set remaining outputs
Differential Revision: https://developer.blender.org/D13384
2021-11-26 18:01:59 +01:00
92daff6ac2 Attempt to fix Windows new bpath tests failing.
Follow up to rBdcc500e5a265093bc9cc.
2021-11-26 17:57:40 +01:00
2efc2221cc Fix T93402: Linking Collections instantiate extra sub-collections.
Linking is more relax than appending when it comes to instantiating
indirectly linked collections/objects.
2021-11-26 17:49:18 +01:00
0789f61373 Cleanup: remove warnings
This assert was producing warning in debug builds because
it was never hit under some circumstances.
2021-11-26 17:32:09 +01:00
602ecbdf9a Geometry Nodes: optimize Set Position node
This implements four optimizations in the Set Position node:
* Check whether the position input is the current position and ignore
  it if it is. This results in a speedup when only the Offset input is used.
* Use multi-threading when copying to computed values to the
  position attribute. All geometry types benefit from this.
* Use devirtualization for the offset and position input. This optimizes
  the common case that they are either single values or computed
  in the fly in a span.
* Write to `Mesh->mvert` directly instead of creating a temporary span.
  This makes setting mesh vertex positions even more efficient.

In my simple benchmark I'm using a White Noise node to offset the
position of 1,000,000 vertices. The speed is `20 ms -> 4.5 ms` in the
multi-threaded case and `32 ms -> 22 ms` in the single-threaded case.
2021-11-26 15:33:35 +01:00
eb7827e797 Cycles: Fix film convert address space mismatch on Metal
This patch fixes an address space mismatch in the film convert kernels on Metal. The `film_get_pass_pixel_...` functions take a `ccl_private` result pointer, but the film convert kernels pass a `ccl_global` memory pointer. Specialising the pass-fetch functions with templates results in compilation errors on Visual Studio, so instead this patch just adds an intermediate local on Metal.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D13350
2021-11-26 13:58:48 +00:00
12a83db83c Fix T93290: Rotation without contraint after extrude has wrong axis
The default orientation of the mode was being indicated as overridden,
although the one of constraint was used.
2021-11-26 10:49:00 -03:00
f86331a033 Geometry Nodes: deduplicate virtual array implementations
For some underlying data (e.g. spans) we had two virtual array
implementations. One for the mutable and one for the immutable
case. Now that most code does not deal with the virtual array
implementations directly anymore (since rBrBd4c868da9f97a),
we can get away with sharing one implementation for both cases.
This means that we have to do a `const_cast` in a few places, but
this is an implementation detail that does not leak into "user code"
(only when explicitly casting a `VArrayImpl` to a `VMutableArrayImpl`,
which should happen nowhere).
2021-11-26 14:47:15 +01:00
ef88047a97 Fix T89081: Freestyle noise seed of zero crash
This leads to division by zero in Freestyle's NoiseShader which also
crashes blender.

Not sure if we really need a do_version patch for old files, as an
alternative we could also force a positive number in the NoiseShader.
This patch does not do either, just force a positive range in RNA from
now on.

Maniphest Tasks: T89081

Differential Revision: https://developer.blender.org/D13332
2021-11-26 14:40:07 +01:00
a773cd3850 Fix T93130: Frame Selected with selected paint mask does not work
This broke with {rB20fac2eca723} (which landed in 2.63), so long
standing bug.

Convention for paint modes is:
- when no paint mask is active, `Frame Selected` will focus the last
stroke
- when paint mask is active, `Frame Selected` will focus the selected
mask faces

To check the right vert coords we have to offset with `mp->loopstart`.

Maniphest Tasks: T93130

Differential Revision: https://developer.blender.org/D13247
2021-11-26 14:32:49 +01:00
35c3644e78 Fix T93117: Texture paint clone tool crash in certain situation
Caused by {rBaf162658e127}, so long standing bug.

When changing clone slots (report involved a quite complicated sequence
of selecting textures and undo -- but I think this could happen in more
situations) code checks for UV of new clone slot.
However, since above commit the slot and the clone slot were mixed up,
so in this case the responsible NULL check (for when no UV is assigned)
wasnt working.
Now correct this (NULL check the clone slot uv -- instead of the paint
slot UV).

note: not sure why low level CustomData functions actually dont do the
name NULL checks themselves (seems like callers are always responsible).

Maniphest Tasks: T93117

Differential Revision: https://developer.blender.org/D13378
2021-11-26 14:24:46 +01:00
63342861e7 Fix: error in previous commit
Forgot to actually slice the span in rB6b5e1cfacab4c4605ec2d7bfef360389afe849be.
2021-11-26 13:29:24 +01:00
b066d58216 CMake mini-rewrite: Sync code check with BKE_blender_version_is_alpha
Reproduce the logic we already do in C (BKE_blender_version_is_alpha)
and the CMake file.

Otherwise it can get out of sync if we add/rename the non-alpha release cycles.

No functional change.

Differential Revision: https://developer.blender.org/D13379
2021-11-26 13:05:37 +01:00
236be8e9f1 Fix T93380: Texture paint clone tool crash without clone image
This was crashing using the clone tool without a clone image assigned.

Caused by {rB9111ea78acf4}.
Since above commit, `BKE_image_acquire_ibuf` was using `ima->runtime`
without checking for NULL first.
Since callers are not required to check for this, just return early
here.

note: there is still a memory leak using the clone tool without a clone
image assigned (but this was also the case before said commit and needs
to be investigated separately).

Maniphest Tasks: T93380

Differential Revision: https://developer.blender.org/D13377
2021-11-26 11:46:26 +01:00
dcc500e5a2 BKE_bpath: Add minimal unittests.
This is far from a complete coverage, but should catch most of potential
issues when rewriting this code.
2021-11-26 11:08:14 +01:00
Simon Lenz
c6eaa9c552 MaskEditor: draw active layer on top
Instead of drawing the mask layers in the sequence of their occurence, draw the active mask *always* on top.

Implementation:
- move drawing loop for splines to separate static function
- draw active mask last

Example: lowest layer is active, yet still drawn on top.
{F12140355}

This is part of an effort to make mask editing more intuitive & easy to use: https://developer.blender.org/T93097

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D13372
2021-11-26 11:08:05 +01:00
658fd8df0b Geometry Nodes: refactor multi-threading in field evaluation
Previously, there was a fixed grain size for all multi-functions. That was
not sufficient because some functions could benefit a lot from smaller
grain sizes.

This refactors adds a new `MultiFunction::call_auto` method which has the
same effect as just calling `MultiFunction::call` but additionally figures
out how to execute the specific multi-function efficiently. It determines
a good grain size and decides whether the mask indices should be shifted
or not.

Most multi-function evaluations benefit from this, but medium sized work
loads (1000 - 50000 elements) benefit from it the most. Especially when
expensive multi-functions (e.g. noise) is involved. This is because for
smaller work loads, threading is rarely used and for larger work loads
threading worked fine before already.

With this patch, multi-functions can specify execution hints, that allow
the caller to execute it most efficiently. These execution hints still
have to be added to more functions.

Some performance measurements of a field evaluation involving noise and
math nodes, ordered by the number of elements being evaluated:
```
1,000,000: 133 ms   -> 120 ms
  100,000:  30 ms   ->  18 ms
   10,000:  20 ms   ->   2.7 ms
    1,000:   4 ms   ->   0.5 ms
      100:   0.5 ms ->   0.4 ms
```
2021-11-26 11:06:16 +01:00
Martijn Versteegh
004172de38 Clarify a confusing comment.
Affect and effect are too confusing for non-native english speakers
(like me). Also BAKING_MASK_MARGIN doesn't exist anymore in the code.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D13361
2021-11-26 10:57:17 +01:00
6b5e1cfaca Geometry Nodes: better devirtualization for sliced virtual arrays
Under some circumstances that can lead to more than a 2x
performance increase, because math nodes can better optimize
for the case when the slice is a single value or span.
2021-11-26 10:08:19 +01:00
2cda65a35a Geometry Nodes: avoid allocation when construct varray for single value
Previously, `GVArray::ForSingle` would always allocate a copy of the passed
in value. Now it only does so when the value is too large or not trivial.
2021-11-26 09:59:41 +01:00
8015433f81 Cleanup: Migrate image_gpu.cc to CC.
To prepare for future changes {T92613}.
2021-11-26 08:16:35 +01:00
e3cf7ebdb1 Cleanup: Silence clang-tidy warning. 2021-11-26 08:00:05 +01:00
a073e1e401 Cleanup: Silence clang-tidy warnings. 2021-11-26 07:59:38 +01:00
7b5a6f452a Merge branch 'blender-v3.0-release' 2021-11-25 18:33:19 +01:00
466b50dbc9 Cycles: expose direct light sampling option in Debug panel, tweak panel layout 2021-11-25 18:30:54 +01:00
2fb8c6805a Fix build error with experimental features after recent release cycle bump
Hair, pointcloud and simulation datablock types should be disabled in the
beta cycles already like other experimental features.
2021-11-25 18:24:46 +01:00
e3d3296327 Merge remote-tracking branch 'origin/blender-v3.0-release'
Merge conflict: Makes sure master is still alpha, while 3.0 is rc.
2021-11-25 18:00:33 +01:00
e6a41e1c80 Blender 3.0 bcon4 - change release cycle to release candidate
This is still a rolling release candidate with new builds every day
as a preparation to the final release.
2021-11-25 17:59:49 +01:00
b2bb3e4b72 Fix T90082: Autoscrolling after renaming in the File Browser broken
Caused by 6b0869039a

Above commit introduced selection after renaming. This includes calling
`file_select_deselect_all` [which resorts and refilters].

So now, to have the correct file for scrolling, get it again after
sorting by calling `file_params_find_renamed` again.

Differential Revision: https://developer.blender.org/D13368
2021-11-25 17:32:40 +01:00
4a3f99ad5a Merge branch 'blender-v3.0-release' 2021-11-25 17:26:53 +01:00
5a11c6e558 Fix missing margin below panels
A minor cosmetic fix. When the view was scrolled all the way to the
bottom, the lowest panel would end right on the view edge. The
scrollable view should get the same margin at the bottom as used at the
top.
2021-11-25 17:15:48 +01:00
5514ca58a4 Fix T92313: Heading of redo panel is not aligned properly
This corrects some alignments issues through new margins introduced in
93544b641b. Basic idea of this fix is to only add the new margins when
drawing a panel with background. These margins were added specifically
for the background boxes, so that makes sense.

Alternative fix to D13199.

This also fixes some margings added unintentionally in mentioned commit.
There is a little jump of the toolbar and the tabs in the Properties
when comparing the UI without this fix to 2.93:
{F12158085} {F12158039}
The jump is gone with this fix applied (compare to the 2.93 screenshot):
{F12158064}
While not a serious issue, this confirms that this fix actually tackles
the root of the issue.
2021-11-25 17:09:45 +01:00
94e8db1e86 Fix T92278: Small size of previews in the shading popover
Don't use the side padding for menu item contents when displaying
previews or icons in a row or grid layout. This can cause problems for
the preview drawing and doesn't make sense to draw there anyway.

This not only fixes the mentioned issue, but also too small heighlight
for the collection color tag in the Outliner context menu.

Alternative to and similar to D13125.
2021-11-25 15:55:04 +01:00
3652f5f758 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-25 15:22:40 +01:00
c91d196159 Fix T93274: Assigning asset catalog doesn't mark file as modified
Assigning a catalog to an asset via drag-and-drop in the asset browser
now creates an undo step. Not only does this allow undoing the action,
it also tags the blend file as modified.

Reviewed by: Severin

Differential Revision: https://developer.blender.org/D13370
2021-11-25 15:02:23 +01:00
9812a08848 Fix: Crash when muting the Group Output node
This fixes a crash when muting the "Group Output" node.
It should not be possible to mute it so this patch
sets the `no_muting`-variable on it.

Differential Revision: https://developer.blender.org/D13364
2021-11-25 15:01:38 +01:00
e216660382 Merge branch 'blender-v3.0-release'
Conflicts:
	source/blender/windowmanager/intern/wm_files_link.c
2021-11-25 15:00:06 +01:00
e253fb2143 Fix T93353: Reload Library used as source for liboverride loses Constraints.
Liboverride properties and operations list need to be fully up-to-date
before libraries are reloaded, otherwise re-applying those liboverrides
after linked data is reloaded may miss some changes.
2021-11-25 14:56:56 +01:00
3bf10e5d0a Fix T89996, T90063: bugs with multi-button reset and entering values in popups
This reverts the changes to fix T87448, where entering the same value in number
buttons causes an unnecessary update. This is not stable enough for 3.0 and so
is being reverted, better to have an unnecessary update than no update in other
cases.

This effectively reverts the changes from rBeb06ccc32462 and follow up fixes
rBe1a9ba94c599, rBbbb52a462ef9, rBec30cf0b742f, and rB071799d4fc44. The code is
disabled with a comment on how it could be implemented better.
2021-11-25 14:45:19 +01:00
2378f057a0 Merge branch 'blender-v3.0-release' 2021-11-25 14:21:05 +01:00
bc4c20d414 Fix T93360: 'Iteractive Light Track' do not work over empty background
Bug introduced in {rBaa0ac0035a0d}.

The invalid depth fallback was changed to `FLT_MAX` in order to match the
annotation and gpencil operations.

This broke the `Interactive Light Track` operator which invalidates the
operation if the depth value is `1.0f`.

The chosen solution was to change the value tested in the annotation and
gpencil operations.
2021-11-25 10:18:57 -03:00
ffddf9e5c9 Fix T93338: Curve Guide force field crash
Caused by {rBcf2baa585cc8}.

For Curve Guide force fields to work, the `Path Animation` option has to
be enabled. With it disabled, we are lacking the necessary
`anim_path_accum_length` data initialized [done by
`BKE_anim_path_calc_data`] which `BKE_where_on_path` relies on since
above commit.

Now just check for this before using it - and return early otherwise.
Prior to said commit, `BKE_where_on_path` would equally return early
with a similar message, so that is expected behavior here.

Maniphest Tasks: T93338

Differential Revision: https://developer.blender.org/D13371
2021-11-25 14:16:12 +01:00
447378753d BLI: remove special cases for is_span and is_single methods
Those were not implemented consistently and don't really help in practice.
2021-11-25 13:51:23 +01:00
845716e600 Fix T92609 Default Compositing tab shows red overlay when stereoscopy is turned on
This was caused by the drawing not being done on the right frammebuffer.
2021-11-25 13:40:04 +01:00
d6646f7a8a Fix T93367: wrong attribute propagation in Delete Geometry node
This only happened with non-point selections. It used an incorrect
index mapping for the attribute propagation.
2021-11-25 11:32:12 +01:00
1e2376f41f Merge branch 'blender-v3.0-release' 2021-11-25 10:36:30 +01:00
82808e18e6 Fix T93362: crash when capturing attribute after fillet curve node
The issue was that the attribute propagation in the Fillet Curve node seems
pretty broken. I couldn't really make sense of the old code. It changed the
size of the point attribute domains on splines to 1 for some reason which
led to a crash in the next node.

Differential Revision: https://developer.blender.org/D13362
2021-11-25 10:33:05 +01:00
5ffb9b6dc4 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-25 10:22:32 +01:00
Bastien Montagne
a0acb9bd0c Fix T91444: Edge Loop Preview fails with two Mirror Modifiers
The mirror modifiers merge option caused unnecessary re-ordering
to the vertex array with original vertices merging into their copies.

While this wasn't an error, it meant creating a 1:1 mapping from input
vertices to their final output wasn't reliable (when looping over
vertices first to last) as is done in
BKE_editmesh_vert_coords_when_deformed.

As merging in either direction is supported, keep the source meshes
vertices in-order since it allows the vertex coordinates to be extracted.

NOTE: Since this change introduce issues for some cases (e.g. bound
modifiers like SurfaceDeform), this change is only applied to newly
created modifiers, existing ones will still use the old incorrect merge
behavior.

Reviewed By: @brecht

Maniphest Tasks: T93321, T91444

Differential Revision: https://developer.blender.org/D13355
2021-11-25 10:21:49 +01:00
e6cd4761e7 Fix T93321: Modified Mirror modifier behavior break some other tools like bound SurfaceDeform.
Revert "Fix T91444: Edge Loop Preview fails with two Mirror Modifiers"

This reverts commit 1a7757b0bc.

Caused issue reported in T93321, boiling down to the fact that other
operations or modifiers (like the SurfaceDeform one) rely on the order
of the vertices in the mesh to remain consistent.

Changing this in a modifier would mean those operations need to be
reset/re-created (e.g. rebound for the SurfaceDeform case), which is not
doable in `do_version` code.
2021-11-25 10:21:49 +01:00
726bc3a46b Fix T93155: Approximate shadow catcher displayed wrong on CPU and GPU
Was happening during rendering, causing visual artifacts when doing
CPU+GPU rendering, and giving different in-progress results on different
devices.

The root of the issue comes to the fact that math used in the approximate
shadow catcher calculation might have resulted in negative alpha channel,
and negative values for display are handled differently on CPU and GPU.
Such difference in handling is caused by an approximate conversion used on
the CPU for the performance reasons.

This change makes it so no negative alpha is generated by the approximate
shadow catcher. Not sure if we need some explicit clamping somewhere to
deal with possible negative values coming from somewhere else.

The shadow catcher cornell box tests are to be updated for the new code,
but the new result seems to be more accurate.

Differential Revision: https://developer.blender.org/D13354
2021-11-25 10:17:52 +01:00
610e68590b Merge branch 'blender-v3.0-release' 2021-11-25 10:12:57 +01:00
ce5561b815 Fix Py API: wrong doc about type of Collection property.
Collection property only accepts PropertyGroup type, not ID ones.

Reported on IRC by @frameshift, thanks.
2021-11-25 10:12:35 +01:00
f12a6ff5cb Merge branch 'blender-v3.0-release' 2021-11-25 09:51:12 +01:00
40d28b40df Fix black Cycles result when cancelling tiled rendering with shadow catcher
Noticed when was looking into T93155. Steps to reproduce:

- Open the .blend file from the report
- Hit F12 to start rendering
- After some tiles were rendered hit Esc

The issue is caused by "sticky" cancel reported via Progress. This  means
that once user hit Esc all further requests for cancel state will return
truth, which was preventing OIDN denoiser from completing the denoising
task.

Now only allow stopping the denoiser when interactive rendering requests
a very fast stopping.

Aiming the fix for 3.0 branch.

Differential Revision: https://developer.blender.org/D13352
2021-11-25 09:50:33 +01:00
William Leeson
c49d2cbe92 Merge branch 'blender-v3.0-release' to bring in D13042:
Fix performance decrease with Scrambling Distance on
2021-11-25 09:41:03 +01:00
Alaska
b41c72b710 Fix performance decrease with Scrambling Distance on
With the current code in master, scrambling distance is enabled on non-hardware accelerated ray tracing devices see a measurable performance decrease when compared scrambling distance on vs off. From testing, this performance decrease comes from the large tile sizes scheduled in `tile.cpp`.

This patch attempts to address the performance decrease by using different algorithms to calculate the tile size for devices with hardware accelerated ray traversal and devices without. Large tile sizes for hardware accelerated devices and small tile sizes for others.

Most of this code is based on proposals from @brecht and @leesonw

Reviewed By: brecht, leesonw

Differential Revision: https://developer.blender.org/D13042
2021-11-25 09:32:26 +01:00
827c5b399e Merge branch 'blender-v3.0-release' 2021-11-25 00:04:32 +01:00
8f2db94627 Fix T93357: crash when opening search menu
This is the same fix as in rBde35a90f9f56d3ff3ac80c13bf1ae296853ba877
but for the blender-v3.0-release branch.
2021-11-25 00:03:41 +01:00
c26011efcb Fix T92962 Boolean output indices unstable.
A parallel loop to create the interesection meshes for each triangle
meant that with parallelism, the output order of the created meshes
could vary with each execution. Keep the parallelism for doing the CDTs
for interesection, but move the extraction of the new faces into a
serial loop afterwards, for repeatability.
2021-11-24 15:26:12 -05:00
6b1b3383c6 Merge branch 'blender-v3.0-release' 2021-11-24 21:10:24 +01:00
Leon Leno
a9642f8d61 UI: Improve scaling of widgets when zooming
This commit improves the scaling of some ui widgets when
zooming by making the radius of the rounded corners
dependent on the element's zoom level.

Needed to fix T92278 without padding issues, see D13125.

Reviewed By: Hans Goudey, Julian Eisel

Differential Revision: https://developer.blender.org/D12842
2021-11-24 21:06:32 +01:00
c155a5f9d7 Merge branch 'blender-v3.0-release' 2021-11-24 14:53:16 -03:00
752c6d668b Fix T90808: wrong BoundBox after undo curve selection
There are two functions that recalculate the boundbox of an object:
- One that considers the evaluated geometry
- Another that only considers the object's `data`.

Most of the time, the bound box is calculated on the final object
(with modifiers), so it doesn't seem right to just rely on `ob->data`
to recalculate the `ob->runtime.bb`.

Be sure to calculate the BoundBox based on the final geometry and
only use `ob->data` as a fallback

Differential Revision: https://developer.blender.org/D12282
2021-11-24 14:52:49 -03:00
bbadee6fc1 UI: Blend File Icons Thumbnail View
Changes icon used to indicate blend file when overlaid over larger
document icon when in thumbnail view. Only seen when file does not
have a preview.

Followup to {rB611e4ffaab43}

For more details and examples see D13342

Differential Revision: https://developer.blender.org/D13342

Reviewed by Julian Eisel
2021-11-24 09:52:27 -08:00
7a7ae4df43 UI: Blend File Icons Thumbnail View
Changes icon used to indicate blend file when overlaid over larger
document icon when in thumbnail view. Only seen when file does not
have a preview.

Followup to {rB611e4ffaab43}

For more details and examples see D13342

Differential Revision: https://developer.blender.org/D13342

Reviewed by Julian Eisel
2021-11-24 09:50:02 -08:00
338408772a Merge branch 'blender-v3.0-release' 2021-11-24 18:37:33 +01:00
71c39a9e2e Asset Browser: Activate a catalog when dragging
Without this it's easy to loose track of which catalog you are dragging.
Things feel generally quite jumpy/disconnected, activating the catalog
makes things feel far less like that.
I consider this an important usability fix, therefore I'm adding it to
the release branch.
2021-11-24 18:05:08 +01:00
cae3b581b0 Asset Browser: Fix catalog being renamed when dropping into parent
When dropping catalogs it is ensured that the name of the moved catalog
is unique within the new parent catalog. When dropping a catalog into
the parent, the catalog would not actually move to a different location,
but it would still be renamed. The unique name logic simply isn't smart
enough to ignore the catalog that is about to be moved.
Address this by disallowing dragging a catalog into its own parent. It's
already there.
2021-11-24 17:59:14 +01:00
499c24ce75 BLI: add slice method to index mask and generic span 2021-11-24 17:49:33 +01:00
cbe9a87d28 Geometry Nodes: use simpler types when devirtualizing virtual array
The compiler is more likely to optimize away the function call
overhead when the used type is simpler and not virtual.
2021-11-24 17:46:00 +01:00
01ab36ebc1 Asset Browser: Support dragging catalogs into top level
This was an oversight when I added catalog drag & drop support. I forgot
to add this for dragging catalogs into the top level by dragging into to
the "All" item as well. This made the drag & drop support rather broken
because it wouldn't work for a basic case.
2021-11-24 17:31:28 +01:00
e206a0ae96 Geometry Nodes: reduce thread switching in evaluator
When a node is executed, it usually schedules other nodes.
Right now, those newly scheduled nodes are added to a
task pool so that another thread can start working on them
immediatly.

However, that leads to the situation where sometimes each
node in a simple chain is executed by another thread. That
leads to additional threading overhead and reduced cache
efficiency (for caches that are not shared between cores).

Now, when a node is executed and schedules other nodes,
the first of those newly scheduled nodes will always be
executed on the same thread once the current node is done.
If it schedules more than one other node, those will be
added to the task pool as before.

The speedup achieved by this is hard to measure. I found it
to be a couple percent faster in some extreme cases, not
much to get excited about. It's nice though that the number
of tasks added to the task pool is commonly reduced by a
factor of 4 or 5.
2021-11-24 17:26:41 +01:00
4930cd5db6 Merge branch 'blender-v3.0-release' 2021-11-24 10:40:13 -05:00
a07089dcb1 Fix T92120 (partially): No bone custom shape with curve object meshes
This part of the drawing code assumes that the bone custom object
has only one evaluated geometry component, and it also uses the
object type to check which data to draw, with the functions like
`DRW_cache_object_surface_get` that just take an object input.
Those functions usually work on evaluated objects, which use the
instancing system to access a temporary object with `object.data`
replaced for data types that don't match the original object.

That assumption used to work, but now curve, point cloud, or volume
objects can have an evaluated mesh which is not accessed with the
same object for render engine drawing.

The "correct" solution for the way this code is structured would be to
loop through all of the geometry components and try to get GPU batches
from every one of them. However, that significantly increases complexity
in an area that should probably be refactored anyway. This patch treats
the mesh as a special case, and only draws the evaluated mesh.

The **best** solution in my opinion might be refactoring this area to
use the instancing system with some sort of viewport-only flag so
the custom shape instances aren't added in the render.

The solution is "partial" because the "Wireframe" option only works
for meshes from mesh objects, even after this fix, and because other
data besides meshes is not displayed at all.

Differential Revision: https://developer.blender.org/D13038
2021-11-24 10:39:33 -05:00
7a97e925fd Cycles: Add support for building with OptiX 7.4 SDK and use built-in catmull-rom curve type
Some enum names were changed/removed in OptiX 7.4, so some changes are necessary to
make things compile still.
In addition, OptiX 7.4 also adds built-in support for catmull-rom curves, so it is no longer
necessary to convert the catmull-rom data to cubic bsplines first, and has endcaps disabled
by default now, so can remove the special handling via any-hit programs that filtered them
out before.

Differential Revision: https://developer.blender.org/D13351
2021-11-24 16:33:04 +01:00
56b068a664 Fix inconsistent UI terminology for tiling option
Was meant to be Use instead of Using.
2021-11-24 15:54:53 +01:00
64d9291d26 Cleanup: formatting 2021-11-24 15:54:53 +01:00
Alessio Monti di Sopra
2cc56495f3 UI: Fix alignment for recently added/edited icons
The patch slightly modifies two recently added icons "FILE_BLEND" and
"CURRENT_FILE" to better align them to the pixel grid, and change the
design of "FILE_BACKUP" to avoid alignment and readability issues, as
well as avoiding the outline version of the Blender logo which violates
the official logo guidelines.

Differential revision: https://developer.blender.org/D13346
2021-11-24 15:51:57 +01:00
72acce43bc Animation: allow marking actions as cyclic for Cycle-Aware Keying.
When a manual frame range is set, allow marking an action as having
Cyclic Animation. This does not affect how the action is evaluated,
but the Cycle-Aware Keying option will automatically make any newly
added F-Curves cyclic. This allows using the option from the start
to build the cycle, rather than only for tweaking an existing loop.

The curves are made cyclic when they have only one key, either
after inserting the first key, or before adding the second one.
The latter case avoids the need to manually make the first added
curve cyclic after marking a newly added action cyclic.

Differential Revision: https://developer.blender.org/D11803
2021-11-24 15:58:32 +03:00
5d59b38605 Animation: allow manually setting the intended playback range for actions.
Some operations, e.g. adding a new action strip to NLA, require
knowing the active frame range of an action. However, currently it
can only be deduced by scanning the keyframes of the curves within
it. This is not ideal if e.g. curves are staggered for overlap.

As suggested by Nathan Vegdahl in comments to T54724, this patch adds
Action properties that allow manually specifying its active frame range.
The settings are exposed via a panel in the Dopesheet and Action Editor.
When enabled, the range is highlighted in the background using a striped
fill to distinguish it from the solid filled regular playback range.

When set, the frame range is used when adding or updating NLA tracks,
and by add-ons using `Action.frame_range`, e.g. FBX exporter.

Differential Revision: https://developer.blender.org/D11803
2021-11-24 15:58:32 +03:00
057cb7e5e7 Context: add accessors returning selected actions for animation editors.
Add a new 'selected_visible_actions' property to allow querying
actions that are selected in animation related editors for use in
UI and operators. The 'selected_editable_actions' variant excludes
linked actions (the only reason an action can be read-only).

In the Action and Shape Key editors there is only one action
that is specified by the field at the top of the editor.

In Dope Sheet it scans the channel rows and returns all actions
related to the selected items. This includes summary items for
actions and groups.

In Graph Editor, it lists actions associated with selected curves.

The new property is also used for Copy To Selected and Alt-Click.

Ref D11803
2021-11-24 15:58:32 +03:00
65f547c3fc Geometry Nodes: add utility to show debug messages in node editor
This is only meant to be used for development purposes for now,
not to show warnings to the user.

Differential Revision: https://developer.blender.org/D13348
2021-11-24 13:39:20 +01:00
9fc5a9c78f macOS: Fix build error due to std::optional<T>::value 2021-11-24 17:41:37 +05:30
7ea4342e73 Geometry Nodes: reduce number of scheduled nodes in evaluator
Previously, there were a couple of cases where nodes were scheduled when
that was not really necessary. This change doesn't seem to have a big impact
on performance, but simplifies the code a bit.
2021-11-24 12:57:36 +01:00
833eb90820 Cleanup: removed shadowed variable 2021-11-24 12:57:36 +01:00
simon
25478bdc9a Merge branch 'blender-v3.0-release' 2021-11-24 12:56:19 +01:00
5a50b46376 Fix T93352: Material previews lost render samples
There are few layers of things which lead to the situation
of more noisy material preview: the do-version of the
preview.blend did not happen (at least from the Python
side as we did not investigate the C side deep). This made
Cycles to use default integrator settings for the preview
file ever since the Cycles X was merged. Those settings are
adaptive sampling with max 4K samples, noise threshold 0.01.
Opening the file in Blender 3.0 for edit did run the versioning
code which effectively lowered the number of samples used for
rendering.

This change makes it so the preview file is configured with
the exact effective settings as seen by Cycles prior to the
file was re-saved (adaptive sampling with the parameters noted
above).

This fix does not chaneg the fact that the versioning code is
not used for preview.blend, it only solves the regression in
the quality of previews.

The fix is done and reviewed with collaboration with Dalai and Sergey.
2021-11-24 12:47:30 +01:00
f5e5a0987a Geometry Nodes: decouple Delete Geometry node from Mask modifier
This dependency was a bit ugly and the functions from the mask modifier
did a few things that we don't need in the Delete Geometry node:
* The mask modifier uses `CustomData_copy_data` which the node doesn't need.
* The mask modifier checks for `-2` in some values, but this special value is
  not used by the node.

Differential Revision: https://developer.blender.org/D13335
2021-11-24 12:00:35 +01:00
050b205a97 CMake: add WITH_UNITY_BUILD option to improve compile times
Unity builds are only used in the `bf_nodes_geometry` module for now.
This module has been prepared to support unity builds already.
Usually, there is a 2-4x speedup when building `bf_nodes_geometry`
compared to without unity builds (e.g. 145s to 55s).

For more information about how unity builds work and how they help
with compile times, see D13341.

Differential Revision: https://developer.blender.org/D13341
2021-11-24 11:40:26 +01:00
3850fdd5b9 Assets: Sanitize threaded preview creation with undo
Basically, this fixes disappearing previews when editing asset metadata
or performing undo/redo actions.

The preview generation in a background job will eventually modify ID
data, but the undo push was done prior to that. So obviously, an undo
then would mean the preview is lost.

This patch makes it so undo/redo will regenerate the preview, if the preview
rendering was invoked but not finished in the undone/redone state.

The preview flag PRV_UNFINISHED wasn't entirely what we needed. So I had to
change it to a slightly different flag, with different semantics.
2021-11-24 11:26:37 +01:00
98a5658239 Cleanup: Renamed DefaultDrawingMode ImageSpaceDrawingMode.
Explains more what it does, not how it is used.
2021-11-24 11:23:44 +01:00
cd818fd081 Assets: Sanitize threaded preview creation with undo
Basically, this fixes disappearing previews when editing asset metadata
or performing undo/redo actions.

The preview generation in a background job will eventually modify ID
data, but the undo push was done prior to that. So obviously, an undo
then would mean the preview is lost.

This patch makes it so undo/redo will regenerate the preview, if the preview
rendering was invoked but not finished in the undone/redone state.

The preview flag PRV_UNFINISHED wasn't entirely what we needed. So I had to
change it to a slightly different flag, with different semantics.
2021-11-24 11:20:35 +01:00
75b53542f2 ImageEngine: Remove unneeded check for tiled image.
Regular images are also tiled images, but with a default tile that leads
to the same result.
2021-11-24 11:02:05 +01:00
17770192fb Tests: exclude anonymous attributes from mesh comparison
The set of anonymous attributes on a geometry is not visible to the user.
2021-11-24 10:38:42 +01:00
e0763760e4 Cleanup: IDTypeInfo new asset_type_info member.
Two issues addressed here:

I) `asset_type_info` is sub-data, not a callback. Therefore, move it
before the callbacks in the `IDTypeInfo` struct.

II) More important, initialize this new attribute in *ALL* `IDTypeInfo`
instances. No member of this struct should ever be left implicitely
uninitilazed, ever.

Aftermath of rBa84f1c02d251.
2021-11-24 10:35:47 +01:00
f8dea3fe64 Fix T93345: missing null check for geometry nodes logger 2021-11-24 10:16:14 +01:00
2ddbf81c47 Cleanup: Add info about attributes init in Mesh IDTypeInfo after conversion to C++.
rB6002914f141f totally lost those info, in C++ we use comments by
convention.
2021-11-24 10:09:35 +01:00
f9db7675e0 Cleanup: use lowercase in private functions. 2021-11-24 10:06:17 +01:00
785503a7e4 Cleanup: use lowercase in private functions. 2021-11-24 10:05:45 +01:00
1a887b0088 Cleanup: use nullptr 2021-11-24 10:05:01 +01:00
4d46e8a5e0 Merge branch 'blender-v3.0-release' 2021-11-24 10:02:57 +01:00
4b259edb0a Cleanup: Silent compilation warning in draw_manager. 2021-11-24 10:02:30 +01:00
de35a90f9f Fix crash when opening search menu.
Asset install bundle poll didn't check the space it was running in and
assumed that it was always running in file browser.
2021-11-24 09:27:47 +01:00
ebe5a5eca8 AssetBrowser: Disable asset indexing.
Asset Indexing is disabled for now as it requires {D12990} to support
object snapping.
2021-11-24 08:32:29 +01:00
Jeroen Bakker
dbeab82a89 T91406: Asset Library Indexing
Asset library indexing would store indexes of asset files to speed up
asset library browsing.

* Indexes are read when they are up to date
** Index should exist
** Index last modify data should be later than the file it indexes
** Index version should match
* The index of a file containing no assets can be load without opening
   the index file. The size of the file should be below a 32 bytes.
* Indexes are stored on a persistent cache folder.
* Unused index files are automatically removed.

The structure of the index files contains all data needed for browsing assets:
```
{
  "version": <file version number>,
  "entries": [{
    "name": "<asset name>",
    "catalog_id": "<catalog_id>",
    "catalog_name": "<catalog_name>",
    "description": "<description>",
    "author": "<author>",
    "tags": ["<tag>"]
  }]
}
```

Reviewed By: sybren, Severin

Maniphest Tasks: T91406

Differential Revision: https://developer.blender.org/D12693
2021-11-24 08:21:40 +01:00
Jeroen Bakker
16fc0da0e7 Animation: Add test cases for ED_keyframes_keylist.
This patch adds test cases to detect edge cases when finding
keylist columns.

The patch originated during development of D12052 to make sure
the new implementation matches the old implementation. It would
be good to add these test cases to master so this part is covered
in a next change might influence the expected edges.

The patch covers `ED_keylist_find_next`, `ED_keylist_find_prev`
and `ED_keylist_find_exact` methods.

Reviewed By: sybren

Differential Revision: https://developer.blender.org/D12302
2021-11-24 08:02:50 +01:00
21f22759ea Merge branch 'blender-v3.0-release' 2021-11-23 17:17:36 -08:00
60c0b79256 Add tablet data to Wintab fallback cursor movement. 2021-11-23 17:10:30 -08:00
Christoph Lendenfeld
6b4405d757 Extract keyframe segment calculation
extracts the search for keyframe segments (consecutive selection of keys)
It will be reused by future graph editor operators

Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D9360
Ref: D9360
2021-11-23 21:38:23 +00:00
d144983f8c Cleanup: unused variable 2021-11-23 18:15:59 -03:00
56f66602c7 Fix (unreported): Local preview icons not loading
Bug introduced in {rB9d7422b817d1}.

The solution is not an init thread for preview tasks that won't complete.
2021-11-23 18:11:54 -03:00
9159295c3c Clang Tidy: ignore some passes that changed or were added in version 13
I get hundreds of clang-tidy errors without ignoring those passes right now.
To not forget about the passes, I added them to T78535.
2021-11-23 19:45:05 +01:00
62a04f7aa6 Cleanup: clang tidy
The parameter name was inconsistent with the declaration.
2021-11-23 19:38:22 +01:00
38a3819171 Merge branch 'blender-v3.0-release' 2021-11-23 19:04:44 +01:00
3844e9dbe7 Fix (unreported): unlinked group input is not logged in geometry nodes
Differential Revision: https://developer.blender.org/D13340
2021-11-23 19:03:16 +01:00
ea93e5df6c Asset: Merge asset library/list refresh operators
In rBdcdbaf89bd11, I introduced a new operator
(`file.asset_library_refresh()`) to handle Asset Browser refreshing more
separate from File Browser refreshing. However, there already was
`asset.asset_list_refresh()`, which at this point only works for asset
view templates, but was intended to cover the Asset Browser case in
future too. This would happen once the Asset Browser uses the asset list
design of the asset view template.

So rather than having two operators for refreshing asset library data,
have one that just handles both cases, until they converge into one.
This avoids changes to the Python API in future (deprecating/changing
operators).

Differential Revision: https://developer.blender.org/D13239
2021-11-23 18:57:25 +01:00
60befc8f02 Clean-up: Fix BLI_rect.h collision with windows.h
windows.h `#defines rct1` as a number which is
problematic if we include `BLI_rect.h` after
`windows.h` .

by renaming `rct1/2` to `rct_a/b` we side step
the collision and straighten up the naming with
the functions directly above it.
2021-11-23 10:51:09 -07:00
62b50c612f Cleanup: Else after return, other simplifications
`std::stringstream` already returns a `std::string`, and there is no
particular reason to use short here instead of int.
2021-11-23 12:49:45 -05:00
9e5aae4215 Asset: Merge asset library/list refresh operators
In rBdcdbaf89bd11, I introduced a new operator
(`file.asset_library_refresh()`) to handle Asset Browser refreshing more
separate from File Browser refreshing. However, there already was
`asset.asset_list_refresh()`, which at this point only works for asset
view templates, but was intended to cover the Asset Browser case in
future too. This would happen once the Asset Browser uses the asset list
design of the asset view template.

So rather than having two operators for refreshing asset library data,
have one that just handles both cases, until they converge into one.
This avoids changes to the Python API in future (deprecating/changing
operators).

Differential Revision: https://developer.blender.org/D13239
2021-11-23 18:40:31 +01:00
c09e8a3590 Merge branch 'blender-v3.0-release' 2021-11-23 18:35:56 +01:00
792badcfef Fix broken handling of constraints reordering with library overrides
Alternative to D13291 (description partially copied from there).

New drag & drop reordering code would call constraints reordering
operator with the generic context, and not the one from the panel's
layout. missing the "constraint" member which is mandatory for poll
function to properly deal with override vs. local constraints.

For this to work in a decent way, there needs to be some panel-wide
context that we can restore when executing callbacks outside of the
normal draw context. So similar to uiLayoutSetContextPointer() to set
context on a layout level, this introduces
UI_panel_context_pointer_set() for panel level context (this calls the
former for the current panel root layout as well).

Differential Revision: https://developer.blender.org/D13308
2021-11-23 18:34:51 +01:00
c0a2b21744 Merge branch 'blender-v3.0-release' 2021-11-23 18:04:28 +01:00
fb4851fbbc Fix: The bounding box gizmo breaks if transform pivot is set to cursor
The bounding box transform code assumed that the pivot would always be
the sequence object transform center.

Rework the code so that this assumption is true even if the general
transform pivot is set to be the 2D cursor.
2021-11-23 18:03:36 +01:00
b40e930ac7 Merge branch 'blender-v3.0-release' 2021-11-23 17:52:30 +01:00
cf266ecaa6 Geometry Nodes: fix attribute propagation in Delete Geometry node
Previously, attribute propagation did not work correctly in when only
deleting edges and faces (but not points). Face and face corner attributes
were propagated wrongly or not at all respectively.

In order to keep the patch relatively small for the release branch,
it does not include some small optimizations. These can be done in 3.1:
* Use a `Span<int>` instead of `IndexMask` to avoid creating an
  unnecessary `Vector<int64_t>`.
* Only prepare index mappings when there are actually attributes to
  propagate.

Differential Revision: https://developer.blender.org/D13338
2021-11-23 17:48:09 +01:00
2526 changed files with 83675 additions and 48684 deletions

View File

@@ -12,6 +12,8 @@ Checks: >
-readability-avoid-const-params-in-decls,
-readability-simplify-boolean-expr,
-readability-make-member-function-const,
-readability-suspicious-call-argument,
-readability-redundant-member-init,
-readability-misleading-indentation,
@@ -25,6 +27,8 @@ Checks: >
-bugprone-branch-clone,
-bugprone-macro-parentheses,
-bugprone-reserved-identifier,
-bugprone-easily-swappable-parameters,
-bugprone-implicit-widening-of-multiplication-result,
-bugprone-sizeof-expression,
-bugprone-integer-division,
@@ -40,7 +44,8 @@ Checks: >
-modernize-pass-by-value,
# Cannot be enabled yet, because using raw string literals in tests breaks
# the windows compiler currently.
-modernize-raw-string-literal
-modernize-raw-string-literal,
-modernize-return-braced-init-list
CheckOptions:
- key: modernize-use-default-member-init.UseAssignment

View File

@@ -187,6 +187,13 @@ mark_as_advanced(CPACK_OVERRIDE_PACKAGENAME)
mark_as_advanced(BUILDINFO_OVERRIDE_DATE)
mark_as_advanced(BUILDINFO_OVERRIDE_TIME)
if(${CMAKE_VERSION} VERSION_GREATER_EQUAL "3.16")
option(WITH_UNITY_BUILD "Enable unity build for modules that support it to improve compile times" ON)
mark_as_advanced(WITH_UNITY_BUILD)
else()
set(WITH_UNITY_BUILD OFF)
endif()
option(WITH_IK_ITASC "Enable ITASC IK solver (only disable for development & for incompatible C++ compilers)" ON)
option(WITH_IK_SOLVER "Enable Legacy IK solver (only disable for development)" ON)
option(WITH_FFTW3 "Enable FFTW3 support (Used for smoke, ocean sim, and audio effects)" ON)
@@ -426,30 +433,40 @@ mark_as_advanced(WITH_CYCLES_DEBUG_NAN)
mark_as_advanced(WITH_CYCLES_NATIVE_ONLY)
# NVIDIA CUDA & OptiX
option(WITH_CYCLES_DEVICE_CUDA "Enable Cycles NVIDIA CUDA compute support" ON)
option(WITH_CYCLES_DEVICE_OPTIX "Enable Cycles NVIDIA OptiX support" ON)
mark_as_advanced(WITH_CYCLES_DEVICE_CUDA)
if(NOT APPLE)
option(WITH_CYCLES_DEVICE_CUDA "Enable Cycles NVIDIA CUDA compute support" ON)
option(WITH_CYCLES_DEVICE_OPTIX "Enable Cycles NVIDIA OptiX support" ON)
mark_as_advanced(WITH_CYCLES_DEVICE_CUDA)
option(WITH_CYCLES_CUDA_BINARIES "Build Cycles NVIDIA CUDA binaries" OFF)
set(CYCLES_CUDA_BINARIES_ARCH sm_30 sm_35 sm_37 sm_50 sm_52 sm_60 sm_61 sm_70 sm_75 sm_86 compute_75 CACHE STRING "CUDA architectures to build binaries for")
option(WITH_CYCLES_CUBIN_COMPILER "Build cubins with nvrtc based compiler instead of nvcc" OFF)
option(WITH_CYCLES_CUDA_BUILD_SERIAL "Build cubins one after another (useful on machines with limited RAM)" OFF)
option(WITH_CUDA_DYNLOAD "Dynamically load CUDA libraries at runtime (for developers, makes cuda-gdb work)" ON)
mark_as_advanced(CYCLES_CUDA_BINARIES_ARCH)
mark_as_advanced(WITH_CYCLES_CUBIN_COMPILER)
mark_as_advanced(WITH_CYCLES_CUDA_BUILD_SERIAL)
mark_as_advanced(WITH_CUDA_DYNLOAD)
option(WITH_CYCLES_CUDA_BINARIES "Build Cycles NVIDIA CUDA binaries" OFF)
set(CYCLES_CUDA_BINARIES_ARCH sm_30 sm_35 sm_37 sm_50 sm_52 sm_60 sm_61 sm_70 sm_75 sm_86 compute_75 CACHE STRING "CUDA architectures to build binaries for")
option(WITH_CYCLES_CUBIN_COMPILER "Build cubins with nvrtc based compiler instead of nvcc" OFF)
option(WITH_CYCLES_CUDA_BUILD_SERIAL "Build cubins one after another (useful on machines with limited RAM)" OFF)
option(WITH_CUDA_DYNLOAD "Dynamically load CUDA libraries at runtime (for developers, makes cuda-gdb work)" ON)
mark_as_advanced(CYCLES_CUDA_BINARIES_ARCH)
mark_as_advanced(WITH_CYCLES_CUBIN_COMPILER)
mark_as_advanced(WITH_CYCLES_CUDA_BUILD_SERIAL)
mark_as_advanced(WITH_CUDA_DYNLOAD)
endif()
# AMD HIP
if(WIN32)
option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" ON)
else()
option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" OFF)
if(NOT APPLE)
if(WIN32)
option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" ON)
else()
option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" OFF)
endif()
option(WITH_CYCLES_HIP_BINARIES "Build Cycles AMD HIP binaries" OFF)
set(CYCLES_HIP_BINARIES_ARCH gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1034 CACHE STRING "AMD HIP architectures to build binaries for")
mark_as_advanced(WITH_CYCLES_DEVICE_HIP)
mark_as_advanced(CYCLES_HIP_BINARIES_ARCH)
endif()
# Apple Metal
if(APPLE)
option(WITH_CYCLES_DEVICE_METAL "Enable Cycles Apple Metal compute support" ON)
endif()
option(WITH_CYCLES_HIP_BINARIES "Build Cycles AMD HIP binaries" OFF)
set(CYCLES_HIP_BINARIES_ARCH gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1034 CACHE STRING "AMD HIP architectures to build binaries for")
mark_as_advanced(WITH_CYCLES_DEVICE_HIP)
mark_as_advanced(CYCLES_HIP_BINARIES_ARCH)
# Draw Manager
option(WITH_DRAW_DEBUG "Add extra debug capabilities to Draw Manager" OFF)
@@ -494,11 +511,10 @@ if(WIN32)
endif()
# This should be turned off when Blender enter beta/rc/release
if("${BLENDER_VERSION_CYCLE}" STREQUAL "release" OR
"${BLENDER_VERSION_CYCLE}" STREQUAL "rc")
set(WITH_EXPERIMENTAL_FEATURES OFF)
else()
if("${BLENDER_VERSION_CYCLE}" STREQUAL "alpha")
set(WITH_EXPERIMENTAL_FEATURES ON)
else()
set(WITH_EXPERIMENTAL_FEATURES OFF)
endif()
# Unit testsing
@@ -840,7 +856,7 @@ if(WITH_AUDASPACE)
endif()
# Auto-enable CUDA dynload if toolkit is not found.
if(NOT WITH_CUDA_DYNLOAD)
if(WITH_CYCLES AND WITH_CYCLES_DEVICE_CUDA AND NOT WITH_CUDA_DYNLOAD)
find_package(CUDA)
if(NOT CUDA_FOUND)
message(STATUS "CUDA toolkit not found, using dynamic runtime loading of libraries (WITH_CUDA_DYNLOAD) instead")

View File

@@ -2083,9 +2083,9 @@ compile_OIIO() {
cmake_d="$cmake_d -D OPENEXR_VERSION=$OPENEXR_VERSION"
if [ "$_with_built_openexr" = true ]; then
cmake_d="$cmake_d -D ILMBASE_HOME=$INST/openexr"
cmake_d="$cmake_d -D OPENEXR_HOME=$INST/openexr"
INFO "ILMBASE_HOME=$INST/openexr"
cmake_d="$cmake_d -D ILMBASE_ROOT=$INST/openexr"
cmake_d="$cmake_d -D OPENEXR_ROOT=$INST/openexr"
INFO "Ilmbase_ROOT=$INST/openexr"
fi
# ptex is only needed when nicholas bishop is ready
@@ -2374,9 +2374,9 @@ compile_OSL() {
#~ cmake_d="$cmake_d -D ILMBASE_VERSION=$ILMBASE_VERSION"
if [ "$_with_built_openexr" = true ]; then
INFO "ILMBASE_HOME=$INST/openexr"
cmake_d="$cmake_d -D OPENEXR_ROOT_DIR=$INST/openexr"
cmake_d="$cmake_d -D ILMBASE_ROOT_DIR=$INST/openexr"
cmake_d="$cmake_d -D ILMBASE_ROOT=$INST/openexr"
cmake_d="$cmake_d -D OPENEXR_ROOT=$INST/openexr"
INFO "Ilmbase_ROOT=$INST/openexr"
# XXX Temp workaround... sigh, ILMBase really messed the things up by defining their custom names ON by default :(
fi

View File

@@ -197,3 +197,38 @@ index 67ec0d15f..6dc3e85a0 100644
#else
#error Unknown architecture.
#endif
diff --git a/pxr/base/arch/demangle.cpp b/pxr/base/arch/demangle.cpp
index 67ec0d15f..6dc3e85a0 100644
--- a/pxr/base/arch/demangle.cpp
+++ b/pxr/base/arch/demangle.cpp
@@ -36,6 +36,7 @@
#if (ARCH_COMPILER_GCC_MAJOR == 3 && ARCH_COMPILER_GCC_MINOR >= 1) || \
ARCH_COMPILER_GCC_MAJOR > 3 || defined(ARCH_COMPILER_CLANG)
#define _AT_LEAST_GCC_THREE_ONE_OR_CLANG
+#include <cxxabi.h>
#endif
PXR_NAMESPACE_OPEN_SCOPE
@@ -138,7 +139,6 @@
#endif
#if defined(_AT_LEAST_GCC_THREE_ONE_OR_CLANG)
-#include <cxxabi.h>
/*
* This routine doesn't work when you get to gcc3.4.
diff --git a/pxr/base/work/singularTask.h b/pxr/base/work/singularTask.h
index 67ec0d15f..6dc3e85a0 100644
--- a/pxr/base/work/singularTask.h
+++ b/pxr/base/work/singularTask.h
@@ -120,7 +120,7 @@
// case we go again to ensure the task can do whatever it
// was awakened to do. Once we successfully take the count
// to zero, we stop.
- size_t old = count;
+ std::size_t old = count;
do { _fn(); } while (
!count.compare_exchange_strong(old, 0));
});

View File

@@ -19,9 +19,6 @@ set(WITH_CODEC_SNDFILE OFF CACHE BOOL "" FORCE)
set(WITH_COMPOSITOR OFF CACHE BOOL "" FORCE)
set(WITH_COREAUDIO OFF CACHE BOOL "" FORCE)
set(WITH_CYCLES OFF CACHE BOOL "" FORCE)
set(WITH_CYCLES_DEVICE_OPTIX OFF CACHE BOOL "" FORCE)
set(WITH_CYCLES_EMBREE OFF CACHE BOOL "" FORCE)
set(WITH_CYCLES_OSL OFF CACHE BOOL "" FORCE)
set(WITH_DRACO OFF CACHE BOOL "" FORCE)
set(WITH_FFTW3 OFF CACHE BOOL "" FORCE)
set(WITH_FREESTYLE OFF CACHE BOOL "" FORCE)

View File

@@ -61,6 +61,7 @@ set(WITH_MEM_JEMALLOC ON CACHE BOOL "" FORCE)
# platform dependent options
if(APPLE)
set(WITH_COREAUDIO ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_DEVICE_METAL ON CACHE BOOL "" FORCE)
endif()
if(NOT WIN32)
set(WITH_JACK ON CACHE BOOL "" FORCE)

View File

@@ -257,9 +257,6 @@ if(WITH_BOOST)
if(WITH_INTERNATIONAL)
list(APPEND _boost_FIND_COMPONENTS locale)
endif()
if(WITH_CYCLES_NETWORK)
list(APPEND _boost_FIND_COMPONENTS serialization)
endif()
if(WITH_OPENVDB)
list(APPEND _boost_FIND_COMPONENTS iostreams)
endif()
@@ -339,7 +336,7 @@ if(WITH_LLVM)
endif()
if(WITH_CYCLES_OSL)
if(WITH_CYCLES AND WITH_CYCLES_OSL)
set(CYCLES_OSL ${LIBDIR}/osl)
find_library(OSL_LIB_EXEC NAMES oslexec PATHS ${CYCLES_OSL}/lib)
@@ -359,7 +356,7 @@ if(WITH_CYCLES_OSL)
endif()
endif()
if(WITH_CYCLES_EMBREE)
if(WITH_CYCLES AND WITH_CYCLES_EMBREE)
find_package(Embree 3.8.0 REQUIRED)
# Increase stack size for Embree, only works for executables.
if(NOT WITH_PYTHON_MODULE)

View File

@@ -241,7 +241,7 @@ if(WITH_INPUT_NDOF)
endif()
endif()
if(WITH_CYCLES_OSL)
if(WITH_CYCLES AND WITH_CYCLES_OSL)
set(CYCLES_OSL ${LIBDIR}/osl CACHE PATH "Path to OpenShadingLanguage installation")
if(EXISTS ${CYCLES_OSL} AND NOT OSL_ROOT)
set(OSL_ROOT ${CYCLES_OSL})
@@ -314,7 +314,7 @@ if(WITH_BOOST)
endif()
set(Boost_USE_MULTITHREADED ON)
set(__boost_packages filesystem regex thread date_time)
if(WITH_CYCLES_OSL)
if(WITH_CYCLES AND WITH_CYCLES_OSL)
if(NOT (${OSL_LIBRARY_VERSION_MAJOR} EQUAL "1" AND ${OSL_LIBRARY_VERSION_MINOR} LESS "6"))
list(APPEND __boost_packages wave)
else()
@@ -323,9 +323,6 @@ if(WITH_BOOST)
if(WITH_INTERNATIONAL)
list(APPEND __boost_packages locale)
endif()
if(WITH_CYCLES_NETWORK)
list(APPEND __boost_packages serialization)
endif()
if(WITH_OPENVDB)
list(APPEND __boost_packages iostreams)
endif()
@@ -403,7 +400,7 @@ if(WITH_OPENCOLORIO)
endif()
endif()
if(WITH_CYCLES_EMBREE)
if(WITH_CYCLES AND WITH_CYCLES_EMBREE)
find_package(Embree 3.8.0 REQUIRED)
endif()

View File

@@ -477,7 +477,7 @@ if(WITH_PYTHON)
endif()
if(WITH_BOOST)
if(WITH_CYCLES_OSL)
if(WITH_CYCLES AND WITH_CYCLES_OSL)
set(boost_extra_libs wave)
endif()
if(WITH_INTERNATIONAL)
@@ -520,7 +520,7 @@ if(WITH_BOOST)
debug ${BOOST_LIBPATH}/libboost_thread-${BOOST_DEBUG_POSTFIX}
debug ${BOOST_LIBPATH}/libboost_chrono-${BOOST_DEBUG_POSTFIX}
)
if(WITH_CYCLES_OSL)
if(WITH_CYCLES AND WITH_CYCLES_OSL)
set(BOOST_LIBRARIES ${BOOST_LIBRARIES}
optimized ${BOOST_LIBPATH}/libboost_wave-${BOOST_POSTFIX}
debug ${BOOST_LIBPATH}/libboost_wave-${BOOST_DEBUG_POSTFIX})
@@ -708,7 +708,7 @@ if(WITH_CODEC_SNDFILE)
set(LIBSNDFILE_LIBRARIES ${LIBSNDFILE_LIBPATH}/libsndfile-1.lib)
endif()
if(WITH_CYCLES_OSL)
if(WITH_CYCLES AND WITH_CYCLES_OSL)
set(CYCLES_OSL ${LIBDIR}/osl CACHE PATH "Path to OpenShadingLanguage installation")
set(OSL_SHADER_DIR ${CYCLES_OSL}/shaders)
# Shaders have moved around a bit between OSL versions, check multiple locations
@@ -741,7 +741,7 @@ if(WITH_CYCLES_OSL)
endif()
endif()
if(WITH_CYCLES_EMBREE)
if(WITH_CYCLES AND WITH_CYCLES_EMBREE)
windows_find_package(Embree)
if(NOT EMBREE_FOUND)
set(EMBREE_INCLUDE_DIRS ${LIBDIR}/embree/include)

View File

@@ -6,91 +6,90 @@
* as part of the normal development process.
*/
/** \defgroup MEM Guarded memory (de)allocation
* \ingroup intern
/* TODO: other modules.
* - `libmv`
* - `cycles`
* - `opencolorio`
* - `opensubdiv`
* - `openvdb`
* - `quadriflow`
*/
/** \defgroup clog C-Logging (CLOG)
* \ingroup intern
*/
/** \defgroup intern_atomic Atomic Operations
* \ingroup intern */
/** \defgroup ctr container
* \ingroup intern
*/
/** \defgroup intern_clog C-Logging (CLOG)
* \ingroup intern */
/** \defgroup iksolver iksolver
* \ingroup intern
*/
/** \defgroup intern_eigen Eigen
* \ingroup intern */
/** \defgroup itasc itasc
* \ingroup intern
*/
/** \defgroup intern_glew-mx GLEW with Multiple Rendering Context's
* \ingroup intern */
/** \defgroup memutil memutil
* \ingroup intern
*/
/** \defgroup intern_iksolver Inverse Kinematics (Solver)
* \ingroup intern */
/** \defgroup mikktspace mikktspace
* \ingroup intern
*/
/** \defgroup intern_itasc Inverse Kinematics (ITASC)
* \ingroup intern */
/** \defgroup moto moto
* \ingroup intern
*/
/** \defgroup intern_libc_compat libc Compatibility For Linux
* \ingroup intern */
/** \defgroup eigen eigen
* \ingroup intern
*/
/** \defgroup intern_locale Locale
* \ingroup intern */
/** \defgroup smoke smoke
* \ingroup intern
*/
/** \defgroup intern_mantaflow Manta-Flow Fluid Simulation
* \ingroup intern */
/** \defgroup string string
* \ingroup intern
*/
/** \defgroup intern_mem Guarded Memory (de)allocation
* \ingroup intern */
/** \defgroup intern_memutil Memory Utilities (memutil)
* \ingroup intern */
/** \defgroup intern_mikktspace MikktSpace
* \ingroup intern */
/** \defgroup intern_numaapi NUMA (Non Uniform Memory Architecture)
* \ingroup intern */
/** \defgroup intern_rigidbody Rigid-Body C-API
* \ingroup intern */
/** \defgroup intern_sky_model Sky Model
* \ingroup intern */
/** \defgroup intern_utf_conv UTF-8/16 Conversion (utfconv)
* \ingroup intern */
/** \defgroup audaspace Audaspace
* \ingroup intern undoc
* \todo add to doxygen
*/
* \todo add to doxygen */
/** \defgroup audcoreaudio Audaspace CoreAudio
* \ingroup audaspace
*/
* \ingroup audaspace */
/** \defgroup audfx Audaspace FX
* \ingroup audaspace
*/
* \ingroup audaspace */
/** \defgroup audopenal Audaspace OpenAL
* \ingroup audaspace
*/
* \ingroup audaspace */
/** \defgroup audpulseaudio Audaspace PulseAudio
* \ingroup audaspace
*/
* \ingroup audaspace */
/** \defgroup audwasapi Audaspace WASAPI
* \ingroup audaspace
*/
* \ingroup audaspace */
/** \defgroup audpython Audaspace Python
* \ingroup audaspace
*/
* \ingroup audaspace */
/** \defgroup audsdl Audaspace SDL
* \ingroup audaspace
*/
* \ingroup audaspace */
/** \defgroup audsrc Audaspace SRC
*
* \ingroup audaspace
*/
* \ingroup audaspace */
/** \defgroup audffmpeg Audaspace FFMpeg
* \ingroup audaspace
*/
* \ingroup audaspace */
/** \defgroup audfftw Audaspace FFTW
* \ingroup audaspace
*/
* \ingroup audaspace */
/** \defgroup audjack Audaspace Jack
* \ingroup audaspace
*/
* \ingroup audaspace */
/** \defgroup audsndfile Audaspace sndfile
* \ingroup audaspace
*/
* \ingroup audaspace */
/** \defgroup GHOST GHOST API
* \ingroup intern GUI

View File

@@ -5,7 +5,8 @@
/** \defgroup bmesh BMesh
* \ingroup blender
*/
/** \defgroup compositor Compositing */
/** \defgroup compositor Compositing
* \ingroup blender */
/** \defgroup python Python
* \ingroup blender
@@ -78,7 +79,8 @@
* \ingroup blender
*/
/** \defgroup data DNA, RNA and .blend access*/
/** \defgroup data DNA, RNA and .blend access
* \ingroup blender */
/** \defgroup gpu GPU
* \ingroup blender
@@ -101,11 +103,12 @@
* merged in docs.
*/
/** \defgroup gui GUI */
/**
* \defgroup gui GUI
* \ingroup blender */
/** \defgroup wm Window Manager
* \ingroup blender gui
*/
* \ingroup gui */
/* ================================ */
@@ -279,7 +282,8 @@
* \ingroup gui
*/
/** \defgroup externformats External Formats */
/** \defgroup externformats External Formats
* \ingroup blender */
/** \defgroup collada COLLADA
* \ingroup externformats
@@ -308,4 +312,7 @@
/* ================================ */
/** \defgroup undoc Undocumented
* \brief Modules and libraries that are still undocumented, or lacking proper integration into the doxygen system, are marked in this group. */
*
* \brief Modules and libraries that are still undocumented,
* or lacking proper integration into the doxygen system, are marked in this group.
*/

View File

@@ -61,7 +61,7 @@ def blender_extract_info(blender_bin: str) -> Dict[str, str]:
stdout=subprocess.PIPE,
).stdout.decode(encoding="utf-8")
blender_version_ouput = subprocess.run(
blender_version_output = subprocess.run(
[blender_bin, "--version"],
env=blender_env,
check=True,
@@ -73,7 +73,7 @@ def blender_extract_info(blender_bin: str) -> Dict[str, str]:
# check for each lines prefix to ensure these aren't included.
blender_version = ""
blender_date = ""
for l in blender_version_ouput.split("\n"):
for l in blender_version_output.split("\n"):
if l.startswith("Blender "):
# Remove 'Blender' prefix.
blender_version = l.split(" ", 1)[1].strip()

View File

@@ -1103,6 +1103,7 @@ context_type_map = {
"selectable_objects": ("Object", True),
"selected_asset_files": ("FileSelectEntry", True),
"selected_bones": ("EditBone", True),
"selected_editable_actions": ("Action", True),
"selected_editable_bones": ("EditBone", True),
"selected_editable_fcurves": ("FCurve", True),
"selected_editable_keyframes": ("Keyframe", True),
@@ -1118,12 +1119,13 @@ context_type_map = {
"selected_pose_bones": ("PoseBone", True),
"selected_pose_bones_from_active_object": ("PoseBone", True),
"selected_sequences": ("Sequence", True),
"selected_visible_actions": ("Action", True),
"selected_visible_fcurves": ("FCurve", True),
"sequences": ("Sequence", True),
"soft_body": ("SoftBodyModifier", False),
"speaker": ("Speaker", False),
"texture": ("Texture", False),
"texture_slot": ("MaterialTextureSlot", False),
"texture_slot": ("TextureSlot", False),
"texture_user": ("ID", False),
"texture_user_property": ("Property", False),
"ui_list": ("UIList", False),

12
extern/hipew/README vendored Normal file
View File

@@ -0,0 +1,12 @@
The HIP Extension Wrangler Library (HIPEW) is a cross-platform open-source
C/C++ library to dynamically load the HIP library.
HIP (Heterogeneous-Compute Interface for Portability) is an API for C++
programming on AMD GPUs.
It is maintained as part of the Blender project, but included in extern/
for consistency with CUEW and CLEW libraries.
LICENSE
HIPEW is released under the Apache 2.0 license.

5
extern/hipew/README.blender vendored Normal file
View File

@@ -0,0 +1,5 @@
Project: Blender
URL: https://git.blender.org/blender.git
License: Apache 2.0
Upstream version: N/A
Local modifications: None

View File

@@ -219,17 +219,17 @@ static int hipewHasOldDriver(const char *hip_path) {
DWORD verHandle = 0;
DWORD verSize = GetFileVersionInfoSize(hip_path, &verHandle);
int old_driver = 0;
if(verSize != 0) {
if (verSize != 0) {
LPSTR verData = (LPSTR)malloc(verSize);
if(GetFileVersionInfo(hip_path, verHandle, verSize, verData)) {
if (GetFileVersionInfo(hip_path, verHandle, verSize, verData)) {
LPBYTE lpBuffer = NULL;
UINT size = 0;
if(VerQueryValue(verData, "\\", (VOID FAR * FAR *)&lpBuffer, &size)) {
if(size) {
if (VerQueryValue(verData, "\\", (VOID FAR * FAR *)&lpBuffer, &size)) {
if (size) {
VS_FIXEDFILEINFO *verInfo = (VS_FIXEDFILEINFO *)lpBuffer;
/* Magic value from
* https://docs.microsoft.com/en-us/windows/win32/api/verrsrc/ns-verrsrc-vs_fixedfileinfo */
if(verInfo->dwSignature == 0xfeef04bd) {
if (verInfo->dwSignature == 0xfeef04bd) {
unsigned int fileVersionLS0 = (verInfo->dwFileVersionLS >> 16) & 0xffff;
unsigned int fileversionLS1 = (verInfo->dwFileVersionLS >> 0) & 0xffff;
/* Corresponds to versions older than AMD Radeon Pro 21.Q4. */

View File

@@ -1,7 +1,7 @@
Project: NanoSVG
URL: https://github.com/memononen/nanosvg
License: zlib
Upstream version:
Upstream version: 3cdd4a9d7886
Local modifications: Added some functionality to manage grease pencil layers
Added a fix to SVG import arc and float errors (https://developer.blender.org/rB11dc674c78b49fc4e0b7c134c375b6c8b8eacbcc)

View File

@@ -45,7 +45,7 @@
*/
/** \file
* \ingroup Atomic
* \ingroup intern_atomic
*
* \brief Provides wrapper around system-specific atomic primitives,
* and some extensions (faked-atomic operations over float numbers).

View File

@@ -44,6 +44,10 @@
* The Original Code is: adapted from jemalloc.
*/
/** \file
* \ingroup intern_atomic
*/
#ifndef __ATOMIC_OPS_EXT_H__
#define __ATOMIC_OPS_EXT_H__

View File

@@ -5,7 +5,7 @@
* All rights reserved.
* Copyright (C) 2007-2012 Mozilla Foundation. All rights reserved.
* Copyright (C) 2009-2013 Facebook, Inc. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
* 1. Redistributions of source code must retain the above copyright notice(s),
@@ -13,7 +13,7 @@
* 2. Redistributions in binary form must reproduce the above copyright notice(s),
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY EXPRESS
* OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
* MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
@@ -26,6 +26,10 @@
* ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
/** \file
* \ingroup intern_atomic
*/
#ifndef __ATOMIC_OPS_MSVC_H__
#define __ATOMIC_OPS_MSVC_H__

View File

@@ -44,6 +44,10 @@
* The Original Code is: adapted from jemalloc.
*/
/** \file
* \ingroup intern_atomic
*/
#ifndef __ATOMIC_OPS_UNIX_H__
#define __ATOMIC_OPS_UNIX_H__

View File

@@ -44,6 +44,10 @@
* The Original Code is: adapted from jemalloc.
*/
/** \file
* \ingroup intern_atomic
*/
#ifndef __ATOMIC_OPS_UTILS_H__
#define __ATOMIC_OPS_UTILS_H__

View File

@@ -14,11 +14,8 @@
* Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
*/
#ifndef __CLG_LOG_H__
#define __CLG_LOG_H__
/** \file
* \ingroup clog
* \ingroup intern_clog
*
* C Logging Library (clog)
* ========================
@@ -68,6 +65,9 @@
* - 4+: May be used for more details than 3, should be avoided but not prevented.
*/
#ifndef __CLG_LOG_H__
#define __CLG_LOG_H__
#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */

View File

@@ -15,7 +15,7 @@
*/
/** \file
* \ingroup clog
* \ingroup intern_clog
*/
#include <assert.h>
@@ -388,7 +388,7 @@ static void clg_ctx_fatal_action(CLogContext *ctx)
static void clg_ctx_backtrace(CLogContext *ctx)
{
/* Note: we avoid writing to 'FILE', for back-trace we make an exception,
/* NOTE: we avoid writing to 'FILE', for back-trace we make an exception,
* if necessary we could have a version of the callback that writes to file
* descriptor all at once. */
ctx->callbacks.backtrace_fn(ctx->output_file);

View File

@@ -40,6 +40,7 @@ set(SRC
object_cull.cpp
output_driver.cpp
particles.cpp
pointcloud.cpp
curves.cpp
logging.cpp
python.cpp
@@ -87,6 +88,7 @@ endif()
set(ADDON_FILES
addon/__init__.py
addon/camera.py
addon/engine.py
addon/operators.py
addon/osl.py
@@ -101,6 +103,11 @@ add_definitions(${GL_DEFINITIONS})
if(WITH_CYCLES_DEVICE_HIP)
add_definitions(-DWITH_HIP)
endif()
if(WITH_CYCLES_DEVICE_METAL)
add_definitions(-DWITH_METAL)
endif()
if(WITH_MOD_FLUID)
add_definitions(-DWITH_FLUID)
endif()

View File

@@ -0,0 +1,84 @@
#
# Copyright 2011-2021 Blender Foundation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# <pep8 compliant>
# Fit to match default projective camera with focal_length 50 and sensor_width 36.
default_fisheye_polynomial = [-1.1735143712967577e-05,
-0.019988736953434998,
-3.3525322965709175e-06,
3.099275275886036e-06,
-2.6064646454854524e-08]
# Utilities to generate lens polynomials to match built-in camera types, only here
# for reference at the moment, not used by the code.
def create_grid(sensor_height, sensor_width):
import numpy as np
if sensor_height is None:
sensor_height = sensor_width / (16 / 9) # Default aspect ration 16:9
uu, vv = np.meshgrid(np.linspace(0, 1, 100), np.linspace(0, 1, 100))
uu = (uu - 0.5) * sensor_width
vv = (vv - 0.5) * sensor_height
rr = np.sqrt(uu ** 2 + vv ** 2)
return rr
def fisheye_lens_polynomial_from_projective(focal_length=50, sensor_width=36, sensor_height=None):
import numpy as np
rr = create_grid(sensor_height, sensor_width)
polynomial = np.polyfit(rr.flat, (-np.arctan(rr / focal_length)).flat, 4)
return list(reversed(polynomial))
def fisheye_lens_polynomial_from_projective_fov(fov, sensor_width=36, sensor_height=None):
import numpy as np
f = sensor_width / 2 / np.tan(fov / 2)
return fisheye_lens_polynomial_from_projective(f, sensor_width, sensor_height)
def fisheye_lens_polynomial_from_equisolid(lens=10.5, sensor_width=36, sensor_height=None):
import numpy as np
rr = create_grid(sensor_height, sensor_width)
x = rr.reshape(-1)
x = np.stack([x**i for i in [1, 2, 3, 4]])
y = (-2 * np.arcsin(rr / (2 * lens))).reshape(-1)
polynomial = np.linalg.lstsq(x.T, y.T, rcond=None)[0]
return [0] + list(polynomial)
def fisheye_lens_polynomial_from_equidistant(fov=180, sensor_width=36, sensor_height=None):
import numpy as np
return [0, -np.radians(fov) / sensor_width, 0, 0, 0]
def fisheye_lens_polynomial_from_distorted_projective_polynomial(k1, k2, k3, focal_length=50, sensor_width=36, sensor_height=None):
import numpy as np
rr = create_grid(sensor_height, sensor_width)
r2 = (rr / focal_length) ** 2
r4 = r2 * r2
r6 = r4 * r2
r_coeff = 1 + k1 * r2 + k2 * r4 + k3 * r6
polynomial = np.polyfit(rr.flat, (-np.arctan(rr / focal_length * r_coeff)).flat, 4)
return list(reversed(polynomial))
def fisheye_lens_polynomial_from_distorted_projective_divisions(k1, k2, focal_length=50, sensor_width=36, sensor_height=None):
import numpy as np
rr = create_grid(sensor_height, sensor_width)
r2 = (rr / focal_length) ** 2
r4 = r2 * r2
r_coeff = 1 + k1 * r2 + k2 * r4
polynomial = np.polyfit(rr.flat, (-np.arctan(rr / focal_length / r_coeff)).flat, 4)
return list(reversed(polynomial))

View File

@@ -28,7 +28,7 @@ def _configure_argument_parser():
action='store_true')
parser.add_argument("--cycles-device",
help="Set the device to use for Cycles, overriding user preferences and the scene setting."
"Valid options are 'CPU', 'CUDA', 'OPTIX', or 'HIP'"
"Valid options are 'CPU', 'CUDA', 'OPTIX', 'HIP' or 'METAL'."
"Additionally, you can append '+CPU' to any GPU type for hybrid rendering.",
default=None)
return parser

View File

@@ -33,6 +33,7 @@ from math import pi
# enums
from . import engine
from . import camera
enum_devices = (
('CPU', "CPU", "Use CPU for rendering"),
@@ -72,6 +73,8 @@ enum_panorama_types = (
('FISHEYE_EQUISOLID', "Fisheye Equisolid",
"Similar to most fisheye modern lens, takes sensor dimensions into consideration"),
('MIRRORBALL', "Mirror Ball", "Uses the mirror ball mapping"),
('FISHEYE_LENS_POLYNOMIAL', "Fisheye Lens Polynomial",
"Defines the lens projection as polynomial to allow real world camera lenses to be mimicked."),
)
enum_curve_shape = (
@@ -111,7 +114,8 @@ enum_device_type = (
('CPU', "CPU", "CPU", 0),
('CUDA', "CUDA", "CUDA", 1),
('OPTIX', "OptiX", "OptiX", 3),
("HIP", "HIP", "HIP", 4)
('HIP', "HIP", "HIP", 4),
('METAL', "Metal", "Metal", 5)
)
enum_texture_limit = (
@@ -429,7 +433,7 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
)
direct_light_sampling_type: EnumProperty(
name="Direct Light Sampling Type",
name="Direct Light Sampling",
description="The type of strategy used for sampling direct light contributions",
items=enum_direct_light_sampling_type,
default='MULTIPLE_IMPORTANCE_SAMPLING',
@@ -790,7 +794,7 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
)
use_auto_tile: BoolProperty(
name="Using Tiling",
name="Use Tiling",
description="Render high resolution images in tiles to reduce memory usage, using the specified tile size. Tiles are cached to disk while rendering to save memory",
default=True,
)
@@ -890,6 +894,32 @@ class CyclesCameraSettings(bpy.types.PropertyGroup):
default=pi,
)
fisheye_polynomial_k0: FloatProperty(
name="Fisheye Polynomial K0",
description="Coefficient K0 of the lens polinomial",
default=camera.default_fisheye_polynomial[0], precision=6, step=0.1, subtype='ANGLE',
)
fisheye_polynomial_k1: FloatProperty(
name="Fisheye Polynomial K1",
description="Coefficient K1 of the lens polinomial",
default=camera.default_fisheye_polynomial[1], precision=6, step=0.1, subtype='ANGLE',
)
fisheye_polynomial_k2: FloatProperty(
name="Fisheye Polynomial K2",
description="Coefficient K2 of the lens polinomial",
default=camera.default_fisheye_polynomial[2], precision=6, step=0.1, subtype='ANGLE',
)
fisheye_polynomial_k3: FloatProperty(
name="Fisheye Polynomial K3",
description="Coefficient K3 of the lens polinomial",
default=camera.default_fisheye_polynomial[3], precision=6, step=0.1, subtype='ANGLE',
)
fisheye_polynomial_k4: FloatProperty(
name="Fisheye Polynomial K4",
description="Coefficient K4 of the lens polinomial",
default=camera.default_fisheye_polynomial[4], precision=6, step=0.1, subtype='ANGLE',
)
@classmethod
def register(cls):
bpy.types.Camera.cycles = PointerProperty(
@@ -1312,8 +1342,7 @@ class CyclesPreferences(bpy.types.AddonPreferences):
def get_device_types(self, context):
import _cycles
has_cuda, has_optix, has_hip = _cycles.get_device_types()
has_cuda, has_optix, has_hip, has_metal = _cycles.get_device_types()
list = [('NONE', "None", "Don't use compute device", 0)]
if has_cuda:
list.append(('CUDA', "CUDA", "Use CUDA for GPU acceleration", 1))
@@ -1321,6 +1350,8 @@ class CyclesPreferences(bpy.types.AddonPreferences):
list.append(('OPTIX', "OptiX", "Use OptiX for GPU acceleration", 3))
if has_hip:
list.append(('HIP', "HIP", "Use HIP for GPU acceleration", 4))
if has_metal:
list.append(('METAL', "Metal", "Use Metal for GPU acceleration", 5))
return list
@@ -1346,7 +1377,7 @@ class CyclesPreferences(bpy.types.AddonPreferences):
def update_device_entries(self, device_list):
for device in device_list:
if not device[1] in {'CUDA', 'OPTIX', 'CPU', 'HIP'}:
if not device[1] in {'CUDA', 'OPTIX', 'CPU', 'HIP', 'METAL'}:
continue
# Try to find existing Device entry
entry = self.find_existing_device_entry(device)
@@ -1390,7 +1421,7 @@ class CyclesPreferences(bpy.types.AddonPreferences):
import _cycles
# Ensure `self.devices` is not re-allocated when the second call to
# get_devices_for_type is made, freeing items from the first list.
for device_type in ('CUDA', 'OPTIX', 'HIP'):
for device_type in ('CUDA', 'OPTIX', 'HIP', 'METAL'):
self.update_device_entries(_cycles.available_devices(device_type))
# Deprecated: use refresh_devices instead.
@@ -1442,6 +1473,8 @@ class CyclesPreferences(bpy.types.AddonPreferences):
col.label(text="Requires discrete AMD GPU with RDNA architecture", icon='BLANK1')
if sys.platform[:3] == "win":
col.label(text="and AMD Radeon Pro 21.Q4 driver or newer", icon='BLANK1')
elif device_type == 'METAL':
col.label(text="Requires Apple Silicon and macOS 12.0 or newer", icon='BLANK1')
return
for device in devices:

View File

@@ -97,6 +97,11 @@ def use_cpu(context):
return (get_device_type(context) == 'NONE' or cscene.device == 'CPU')
def use_metal(context):
cscene = context.scene.cycles
return (get_device_type(context) == 'METAL' and cscene.device == 'GPU')
def use_cuda(context):
cscene = context.scene.cycles
@@ -1015,7 +1020,7 @@ class CYCLES_OBJECT_PT_motion_blur(CyclesButtonsPanel, Panel):
def poll(cls, context):
ob = context.object
if CyclesButtonsPanel.poll(context) and ob:
if ob.type in {'MESH', 'CURVE', 'CURVE', 'SURFACE', 'FONT', 'META', 'CAMERA'}:
if ob.type in {'MESH', 'CURVE', 'CURVE', 'SURFACE', 'FONT', 'META', 'CAMERA', 'HAIR', 'POINTCLOUD'}:
return True
if ob.instance_type == 'COLLECTION' and ob.instance_collection:
return True
@@ -1819,37 +1824,38 @@ class CYCLES_RENDER_PT_debug(CyclesDebugButtonsPanel, Panel):
def draw(self, context):
layout = self.layout
layout.use_property_split = True
layout.use_property_decorate = False # No animation.
scene = context.scene
cscene = scene.cycles
col = layout.column()
col = layout.column(heading="CPU")
col.label(text="CPU Flags:")
row = col.row(align=True)
row.prop(cscene, "debug_use_cpu_sse2", toggle=True)
row.prop(cscene, "debug_use_cpu_sse3", toggle=True)
row.prop(cscene, "debug_use_cpu_sse41", toggle=True)
row.prop(cscene, "debug_use_cpu_avx", toggle=True)
row.prop(cscene, "debug_use_cpu_avx2", toggle=True)
col.prop(cscene, "debug_bvh_layout")
col.prop(cscene, "debug_bvh_layout", text="BVH")
col.separator()
col = layout.column()
col.label(text="CUDA Flags:")
col = layout.column(heading="CUDA")
col.prop(cscene, "debug_use_cuda_adaptive_compile")
col = layout.column(heading="OptiX")
col.prop(cscene, "debug_use_optix_debug", text="Module Debug")
col.separator()
col = layout.column()
col.label(text="OptiX Flags:")
col.prop(cscene, "debug_use_optix_debug")
col.prop(cscene, "debug_bvh_type", text="Viewport BVH")
col.separator()
col = layout.column()
col.prop(cscene, "debug_bvh_type")
import _cycles
if _cycles.with_debug:
col.prop(cscene, "direct_light_sampling_type")
class CYCLES_RENDER_PT_simplify(CyclesButtonsPanel, Panel):

View File

@@ -69,6 +69,12 @@ struct BlenderCamera {
float pole_merge_angle_from;
float pole_merge_angle_to;
float fisheye_polynomial_k0;
float fisheye_polynomial_k1;
float fisheye_polynomial_k2;
float fisheye_polynomial_k3;
float fisheye_polynomial_k4;
enum { AUTO, HORIZONTAL, VERTICAL } sensor_fit;
float sensor_width;
float sensor_height;
@@ -200,6 +206,12 @@ static void blender_camera_from_object(BlenderCamera *bcam,
bcam->longitude_min = RNA_float_get(&ccamera, "longitude_min");
bcam->longitude_max = RNA_float_get(&ccamera, "longitude_max");
bcam->fisheye_polynomial_k0 = RNA_float_get(&ccamera, "fisheye_polynomial_k0");
bcam->fisheye_polynomial_k1 = RNA_float_get(&ccamera, "fisheye_polynomial_k1");
bcam->fisheye_polynomial_k2 = RNA_float_get(&ccamera, "fisheye_polynomial_k2");
bcam->fisheye_polynomial_k3 = RNA_float_get(&ccamera, "fisheye_polynomial_k3");
bcam->fisheye_polynomial_k4 = RNA_float_get(&ccamera, "fisheye_polynomial_k4");
bcam->interocular_distance = b_camera.stereo().interocular_distance();
if (b_camera.stereo().convergence_mode() == BL::CameraStereoData::convergence_mode_PARALLEL) {
bcam->convergence_distance = FLT_MAX;
@@ -422,7 +434,8 @@ static void blender_camera_sync(Camera *cam,
cam->set_full_height(height);
/* panorama sensor */
if (bcam->type == CAMERA_PANORAMA && bcam->panorama_type == PANORAMA_FISHEYE_EQUISOLID) {
if (bcam->type == CAMERA_PANORAMA && (bcam->panorama_type == PANORAMA_FISHEYE_EQUISOLID ||
bcam->panorama_type == PANORAMA_FISHEYE_LENS_POLYNOMIAL)) {
float fit_xratio = (float)bcam->render_width * bcam->pixelaspect.x;
float fit_yratio = (float)bcam->render_height * bcam->pixelaspect.y;
bool horizontal_fit;
@@ -465,6 +478,12 @@ static void blender_camera_sync(Camera *cam,
cam->set_latitude_min(bcam->latitude_min);
cam->set_latitude_max(bcam->latitude_max);
cam->set_fisheye_polynomial_k0(bcam->fisheye_polynomial_k0);
cam->set_fisheye_polynomial_k1(bcam->fisheye_polynomial_k1);
cam->set_fisheye_polynomial_k2(bcam->fisheye_polynomial_k2);
cam->set_fisheye_polynomial_k3(bcam->fisheye_polynomial_k3);
cam->set_fisheye_polynomial_k4(bcam->fisheye_polynomial_k4);
cam->set_longitude_min(bcam->longitude_min);
cam->set_longitude_max(bcam->longitude_max);

View File

@@ -819,11 +819,14 @@ void BlenderSync::sync_hair(BL::Depsgraph b_depsgraph, BObjectInfo &b_ob_info, H
new_hair.set_used_shaders(used_shaders);
if (view_layer.use_hair) {
#ifdef WITH_HAIR_NODES
if (b_ob_info.object_data.is_a(&RNA_Hair)) {
/* Hair object. */
sync_hair(&new_hair, b_ob_info, false);
}
else {
else
#endif
{
/* Particle hair. */
bool need_undeformed = new_hair.need_attribute(scene, ATTR_STD_GENERATED);
BL::Mesh b_mesh = object_to_mesh(
@@ -870,12 +873,15 @@ void BlenderSync::sync_hair_motion(BL::Depsgraph b_depsgraph,
/* Export deformed coordinates. */
if (ccl::BKE_object_is_deform_modified(b_ob_info, b_scene, preview)) {
#ifdef WITH_HAIR_NODES
if (b_ob_info.object_data.is_a(&RNA_Hair)) {
/* Hair object. */
sync_hair(hair, b_ob_info, true, motion_step);
return;
}
else {
else
#endif
{
/* Particle hair. */
BL::Mesh b_mesh = object_to_mesh(
b_data, b_ob_info, b_depsgraph, false, Mesh::SUBDIVISION_NONE);

View File

@@ -27,6 +27,7 @@ enum ComputeDevice {
COMPUTE_DEVICE_CUDA = 1,
COMPUTE_DEVICE_OPTIX = 3,
COMPUTE_DEVICE_HIP = 4,
COMPUTE_DEVICE_METAL = 5,
COMPUTE_DEVICE_NUM
};
@@ -85,6 +86,9 @@ DeviceInfo blender_device_info(BL::Preferences &b_preferences, BL::Scene &b_scen
else if (compute_device == COMPUTE_DEVICE_HIP) {
mask |= DEVICE_MASK_HIP;
}
else if (compute_device == COMPUTE_DEVICE_METAL) {
mask |= DEVICE_MASK_METAL;
}
vector<DeviceInfo> devices = Device::available_devices(mask);
/* Match device preferences and available devices. */

View File

@@ -19,6 +19,7 @@
#include "scene/hair.h"
#include "scene/mesh.h"
#include "scene/object.h"
#include "scene/pointcloud.h"
#include "scene/volume.h"
#include "blender/sync.h"
@@ -31,10 +32,18 @@ CCL_NAMESPACE_BEGIN
static Geometry::Type determine_geom_type(BObjectInfo &b_ob_info, bool use_particle_hair)
{
#ifdef WITH_HAIR_NODES
if (b_ob_info.object_data.is_a(&RNA_Hair) || use_particle_hair) {
#else
if (use_particle_hair) {
#endif
return Geometry::HAIR;
}
if (b_ob_info.object_data.is_a(&RNA_PointCloud)) {
return Geometry::POINTCLOUD;
}
if (b_ob_info.object_data.is_a(&RNA_Volume) ||
(b_ob_info.object_data == b_ob_info.real_object.data() &&
object_fluid_gas_domain_find(b_ob_info.real_object))) {
@@ -107,6 +116,9 @@ Geometry *BlenderSync::sync_geometry(BL::Depsgraph &b_depsgraph,
else if (geom_type == Geometry::VOLUME) {
geom = scene->create_node<Volume>();
}
else if (geom_type == Geometry::POINTCLOUD) {
geom = scene->create_node<PointCloud>();
}
else {
geom = scene->create_node<Mesh>();
}
@@ -166,6 +178,10 @@ Geometry *BlenderSync::sync_geometry(BL::Depsgraph &b_depsgraph,
Volume *volume = static_cast<Volume *>(geom);
sync_volume(b_ob_info, volume);
}
else if (geom_type == Geometry::POINTCLOUD) {
PointCloud *pointcloud = static_cast<PointCloud *>(geom);
sync_pointcloud(pointcloud, b_ob_info);
}
else {
Mesh *mesh = static_cast<Mesh *>(geom);
sync_mesh(b_depsgraph, b_ob_info, mesh);
@@ -215,7 +231,11 @@ void BlenderSync::sync_geometry_motion(BL::Depsgraph &b_depsgraph,
if (progress.get_cancel())
return;
#ifdef WITH_HAIR_NODES
if (b_ob_info.object_data.is_a(&RNA_Hair) || use_particle_hair) {
#else
if (use_particle_hair) {
#endif
Hair *hair = static_cast<Hair *>(geom);
sync_hair_motion(b_depsgraph, b_ob_info, hair, motion_step);
}
@@ -223,6 +243,10 @@ void BlenderSync::sync_geometry_motion(BL::Depsgraph &b_depsgraph,
object_fluid_gas_domain_find(b_ob_info.real_object)) {
/* No volume motion blur support yet. */
}
else if (b_ob_info.object_data.is_a(&RNA_PointCloud)) {
PointCloud *pointcloud = static_cast<PointCloud *>(geom);
sync_pointcloud_motion(pointcloud, b_ob_info, motion_step);
}
else {
Mesh *mesh = static_cast<Mesh *>(geom);
sync_mesh_motion(b_depsgraph, b_ob_info, mesh, motion_step);

View File

@@ -24,8 +24,14 @@ CCL_NAMESPACE_BEGIN
/* Packed Images */
BlenderImageLoader::BlenderImageLoader(BL::Image b_image, int frame)
: b_image(b_image), frame(frame), free_cache(!b_image.has_data())
BlenderImageLoader::BlenderImageLoader(BL::Image b_image,
const int frame,
const bool is_preview_render)
: b_image(b_image),
frame(frame),
/* Don't free cache for preview render to avoid race condition from T93560, to be fixed
properly later as we are close to release. */
free_cache(!is_preview_render && !b_image.has_data())
{
}

View File

@@ -25,7 +25,7 @@ CCL_NAMESPACE_BEGIN
class BlenderImageLoader : public ImageLoader {
public:
BlenderImageLoader(BL::Image b_image, int frame);
BlenderImageLoader(BL::Image b_image, const int frame, const bool is_preview_render);
bool load_metadata(const ImageDeviceFeatures &features, ImageMetaData &metadata) override;
bool load_pixels(const ImageMetaData &metadata,

View File

@@ -72,7 +72,8 @@ bool BlenderSync::object_is_geometry(BObjectInfo &b_ob_info)
BL::Object::type_enum type = b_ob_info.iter_object.type();
if (type == BL::Object::type_VOLUME || type == BL::Object::type_HAIR) {
if (type == BL::Object::type_VOLUME || type == BL::Object::type_HAIR ||
type == BL::Object::type_POINTCLOUD) {
/* Will be exported attached to mesh. */
return true;
}
@@ -206,7 +207,7 @@ Object *BlenderSync::sync_object(BL::Depsgraph &b_depsgraph,
return NULL;
}
/* only interested in object that we can create meshes from */
/* only interested in object that we can create geometry from */
if (!object_is_geometry(b_ob_info)) {
return NULL;
}

View File

@@ -66,7 +66,7 @@ bool BlenderOutputDriver::read_render_tile(const Tile &tile)
bool BlenderOutputDriver::update_render_tile(const Tile &tile)
{
/* Use final write for preview renders, otherwise render result wouldn't be be updated
/* Use final write for preview renders, otherwise render result wouldn't be updated
* quickly on Blender side. For all other cases we use the display driver. */
if (b_engine_.is_preview()) {
write_render_tile(tile);

View File

@@ -0,0 +1,253 @@
/*
* Copyright 2011-2013 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include "scene/pointcloud.h"
#include "scene/attribute.h"
#include "scene/scene.h"
#include "blender/sync.h"
#include "blender/util.h"
#include "util/foreach.h"
#include "util/hash.h"
CCL_NAMESPACE_BEGIN
template<typename TypeInCycles, typename GetValueAtIndex>
static void fill_generic_attribute(BL::PointCloud &b_pointcloud,
TypeInCycles *data,
const GetValueAtIndex &get_value_at_index)
{
const int num_points = b_pointcloud.points.length();
for (int i = 0; i < num_points; i++) {
data[i] = get_value_at_index(i);
}
}
static void copy_attributes(PointCloud *pointcloud, BL::PointCloud b_pointcloud)
{
AttributeSet &attributes = pointcloud->attributes;
for (BL::Attribute &b_attribute : b_pointcloud.attributes) {
const ustring name{b_attribute.name().c_str()};
if (attributes.find(name)) {
continue;
}
const AttributeElement element = ATTR_ELEMENT_VERTEX;
const BL::Attribute::data_type_enum b_data_type = b_attribute.data_type();
switch (b_data_type) {
case BL::Attribute::data_type_FLOAT: {
BL::FloatAttribute b_float_attribute{b_attribute};
Attribute *attr = attributes.add(name, TypeFloat, element);
float *data = attr->data_float();
fill_generic_attribute(
b_pointcloud, data, [&](int i) { return b_float_attribute.data[i].value(); });
break;
}
case BL::Attribute::data_type_BOOLEAN: {
BL::BoolAttribute b_bool_attribute{b_attribute};
Attribute *attr = attributes.add(name, TypeFloat, element);
float *data = attr->data_float();
fill_generic_attribute(
b_pointcloud, data, [&](int i) { return (float)b_bool_attribute.data[i].value(); });
break;
}
case BL::Attribute::data_type_INT: {
BL::IntAttribute b_int_attribute{b_attribute};
Attribute *attr = attributes.add(name, TypeFloat, element);
float *data = attr->data_float();
fill_generic_attribute(
b_pointcloud, data, [&](int i) { return (float)b_int_attribute.data[i].value(); });
break;
}
case BL::Attribute::data_type_FLOAT_VECTOR: {
BL::FloatVectorAttribute b_vector_attribute{b_attribute};
Attribute *attr = attributes.add(name, TypeVector, element);
float3 *data = attr->data_float3();
fill_generic_attribute(b_pointcloud, data, [&](int i) {
BL::Array<float, 3> v = b_vector_attribute.data[i].vector();
return make_float3(v[0], v[1], v[2]);
});
break;
}
case BL::Attribute::data_type_FLOAT_COLOR: {
BL::FloatColorAttribute b_color_attribute{b_attribute};
Attribute *attr = attributes.add(name, TypeRGBA, element);
float4 *data = attr->data_float4();
fill_generic_attribute(b_pointcloud, data, [&](int i) {
BL::Array<float, 4> v = b_color_attribute.data[i].color();
return make_float4(v[0], v[1], v[2], v[3]);
});
break;
}
case BL::Attribute::data_type_FLOAT2: {
BL::Float2Attribute b_float2_attribute{b_attribute};
Attribute *attr = attributes.add(name, TypeFloat2, element);
float2 *data = attr->data_float2();
fill_generic_attribute(b_pointcloud, data, [&](int i) {
BL::Array<float, 2> v = b_float2_attribute.data[i].vector();
return make_float2(v[0], v[1]);
});
break;
}
default:
/* Not supported. */
break;
}
}
}
static void export_pointcloud(Scene *scene, PointCloud *pointcloud, BL::PointCloud b_pointcloud)
{
/* TODO: optimize so we can straight memcpy arrays from Blender? */
/* Add requested attributes. */
Attribute *attr_random = NULL;
if (pointcloud->need_attribute(scene, ATTR_STD_POINT_RANDOM)) {
attr_random = pointcloud->attributes.add(ATTR_STD_POINT_RANDOM);
}
/* Reserve memory. */
const int num_points = b_pointcloud.points.length();
pointcloud->reserve(num_points);
/* Export points. */
BL::PointCloud::points_iterator b_point_iter;
for (b_pointcloud.points.begin(b_point_iter); b_point_iter != b_pointcloud.points.end();
++b_point_iter) {
BL::Point b_point = *b_point_iter;
const float3 co = get_float3(b_point.co());
const float radius = b_point.radius();
pointcloud->add_point(co, radius);
/* Random number per point. */
if (attr_random != NULL) {
attr_random->add(hash_uint2_to_float(b_point.index(), 0));
}
}
/* Export attributes */
copy_attributes(pointcloud, b_pointcloud);
}
static void export_pointcloud_motion(PointCloud *pointcloud,
BL::PointCloud b_pointcloud,
int motion_step)
{
/* Find or add attribute. */
Attribute *attr_mP = pointcloud->attributes.find(ATTR_STD_MOTION_VERTEX_POSITION);
bool new_attribute = false;
if (!attr_mP) {
attr_mP = pointcloud->attributes.add(ATTR_STD_MOTION_VERTEX_POSITION);
new_attribute = true;
}
/* Export motion points. */
const int num_points = pointcloud->num_points();
float3 *mP = attr_mP->data_float3() + motion_step * num_points;
bool have_motion = false;
int num_motion_points = 0;
const array<float3> &pointcloud_points = pointcloud->get_points();
BL::PointCloud::points_iterator b_point_iter;
for (b_pointcloud.points.begin(b_point_iter); b_point_iter != b_pointcloud.points.end();
++b_point_iter) {
BL::Point b_point = *b_point_iter;
if (num_motion_points < num_points) {
float3 P = get_float3(b_point.co());
P.w = b_point.radius();
mP[num_motion_points] = P;
have_motion = have_motion || (P != pointcloud_points[num_motion_points]);
num_motion_points++;
}
}
/* In case of new attribute, we verify if there really was any motion. */
if (new_attribute) {
if (num_motion_points != num_points || !have_motion) {
pointcloud->attributes.remove(ATTR_STD_MOTION_VERTEX_POSITION);
}
else if (motion_step > 0) {
/* Motion, fill up previous steps that we might have skipped because
* they had no motion, but we need them anyway now. */
for (int step = 0; step < motion_step; step++) {
pointcloud->copy_center_to_motion_step(step);
}
}
}
/* Export attributes */
copy_attributes(pointcloud, b_pointcloud);
}
void BlenderSync::sync_pointcloud(PointCloud *pointcloud, BObjectInfo &b_ob_info)
{
size_t old_numpoints = pointcloud->num_points();
array<Node *> used_shaders = pointcloud->get_used_shaders();
PointCloud new_pointcloud;
new_pointcloud.set_used_shaders(used_shaders);
/* TODO: add option to filter out points in the view layer. */
BL::PointCloud b_pointcloud(b_ob_info.object_data);
export_pointcloud(scene, &new_pointcloud, b_pointcloud);
/* update original sockets */
for (const SocketType &socket : new_pointcloud.type->inputs) {
/* Those sockets are updated in sync_object, so do not modify them. */
if (socket.name == "use_motion_blur" || socket.name == "motion_steps" ||
socket.name == "used_shaders") {
continue;
}
pointcloud->set_value(socket, new_pointcloud, socket);
}
pointcloud->attributes.clear();
foreach (Attribute &attr, new_pointcloud.attributes.attributes) {
pointcloud->attributes.attributes.push_back(std::move(attr));
}
/* tag update */
const bool rebuild = (pointcloud && old_numpoints != pointcloud->num_points());
pointcloud->tag_update(scene, rebuild);
}
void BlenderSync::sync_pointcloud_motion(PointCloud *pointcloud,
BObjectInfo &b_ob_info,
int motion_step)
{
/* Skip if nothing exported. */
if (pointcloud->num_points() == 0) {
return;
}
/* Export deformed coordinates. */
if (ccl::BKE_object_is_deform_modified(b_ob_info, b_scene, preview)) {
/* PointCloud object. */
BL::PointCloud b_pointcloud(b_ob_info.object_data);
export_pointcloud_motion(pointcloud, b_pointcloud, motion_step);
}
else {
/* No deformation on this frame, copy coordinates if other frames did have it. */
pointcloud->copy_center_to_motion_step(motion_step);
}
}
CCL_NAMESPACE_END

View File

@@ -906,16 +906,18 @@ static PyObject *enable_print_stats_func(PyObject * /*self*/, PyObject * /*args*
static PyObject *get_device_types_func(PyObject * /*self*/, PyObject * /*args*/)
{
vector<DeviceType> device_types = Device::available_types();
bool has_cuda = false, has_optix = false, has_hip = false;
bool has_cuda = false, has_optix = false, has_hip = false, has_metal = false;
foreach (DeviceType device_type, device_types) {
has_cuda |= (device_type == DEVICE_CUDA);
has_optix |= (device_type == DEVICE_OPTIX);
has_hip |= (device_type == DEVICE_HIP);
has_metal |= (device_type == DEVICE_METAL);
}
PyObject *list = PyTuple_New(3);
PyObject *list = PyTuple_New(4);
PyTuple_SET_ITEM(list, 0, PyBool_FromLong(has_cuda));
PyTuple_SET_ITEM(list, 1, PyBool_FromLong(has_optix));
PyTuple_SET_ITEM(list, 2, PyBool_FromLong(has_hip));
PyTuple_SET_ITEM(list, 3, PyBool_FromLong(has_metal));
return list;
}
@@ -944,6 +946,9 @@ static PyObject *set_device_override_func(PyObject * /*self*/, PyObject *arg)
else if (override == "HIP") {
BlenderSession::device_override = DEVICE_MASK_HIP;
}
else if (override == "METAL") {
BlenderSession::device_override = DEVICE_MASK_METAL;
}
else {
printf("\nError: %s is not a valid Cycles device.\n", override.c_str());
Py_RETURN_FALSE;
@@ -1054,5 +1059,13 @@ void *CCL_python_module_init()
Py_INCREF(Py_False);
}
#ifdef WITH_CYCLES_DEBUG
PyModule_AddObject(mod, "with_debug", Py_True);
Py_INCREF(Py_True);
#else /* WITH_CYCLES_DEBUG */
PyModule_AddObject(mod, "with_debug", Py_False);
Py_INCREF(Py_False);
#endif /* WITH_CYCLES_DEBUG */
return (void *)mod;
}

View File

@@ -396,6 +396,13 @@ void BlenderSession::render(BL::Depsgraph &b_depsgraph_)
/* set the current view */
b_engine.active_view_set(b_rview_name.c_str());
/* Force update in this case, since the camera transform on each frame changes
* in different views. This could be optimized by somehow storing the animated
* camera transforms separate from the fixed stereo transform. */
if ((scene->need_motion() != Scene::MOTION_NONE) && view_index > 0) {
sync->tag_update();
}
/* update scene */
BL::Object b_camera_override(b_engine.camera_override());
sync->sync_camera(b_render, b_camera_override, width, height, b_rview_name.c_str());
@@ -629,7 +636,7 @@ void BlenderSession::bake(BL::Depsgraph &b_depsgraph_,
integrator->set_use_emission((bake_filter & BL::BakeSettings::pass_filter_EMIT) != 0);
}
/* Always use transpanent background for baking. */
/* Always use transparent background for baking. */
scene->background->set_transparent(true);
/* Load built-in images from Blender. */

View File

@@ -378,10 +378,19 @@ static ShaderNode *add_node(Scene *scene,
}
else if (b_node.is_a(&RNA_ShaderNodeMapRange)) {
BL::ShaderNodeMapRange b_map_range_node(b_node);
MapRangeNode *map_range_node = graph->create_node<MapRangeNode>();
map_range_node->set_clamp(b_map_range_node.clamp());
map_range_node->set_range_type((NodeMapRangeType)b_map_range_node.interpolation_type());
node = map_range_node;
if (b_map_range_node.data_type() == BL::ShaderNodeMapRange::data_type_FLOAT_VECTOR) {
VectorMapRangeNode *vector_map_range_node = graph->create_node<VectorMapRangeNode>();
vector_map_range_node->set_use_clamp(b_map_range_node.clamp());
vector_map_range_node->set_range_type(
(NodeMapRangeType)b_map_range_node.interpolation_type());
node = vector_map_range_node;
}
else {
MapRangeNode *map_range_node = graph->create_node<MapRangeNode>();
map_range_node->set_clamp(b_map_range_node.clamp());
map_range_node->set_range_type((NodeMapRangeType)b_map_range_node.interpolation_type());
node = map_range_node;
}
}
else if (b_node.is_a(&RNA_ShaderNodeClamp)) {
BL::ShaderNodeClamp b_clamp_node(b_node);
@@ -762,7 +771,8 @@ static ShaderNode *add_node(Scene *scene,
int scene_frame = b_scene.frame_current();
int image_frame = image_user_frame_number(b_image_user, b_image, scene_frame);
image->handle = scene->image_manager->add_image(
new BlenderImageLoader(b_image, image_frame), image->image_params());
new BlenderImageLoader(b_image, image_frame, b_engine.is_preview()),
image->image_params());
}
else {
ustring filename = ustring(
@@ -797,8 +807,9 @@ static ShaderNode *add_node(Scene *scene,
if (is_builtin) {
int scene_frame = b_scene.frame_current();
int image_frame = image_user_frame_number(b_image_user, b_image, scene_frame);
env->handle = scene->image_manager->add_image(new BlenderImageLoader(b_image, image_frame),
env->image_params());
env->handle = scene->image_manager->add_image(
new BlenderImageLoader(b_image, image_frame, b_engine.is_preview()),
env->image_params());
}
else {
env->set_filename(

View File

@@ -95,6 +95,11 @@ void BlenderSync::reset(BL::BlendData &b_data, BL::Scene &b_scene)
this->b_scene = b_scene;
}
void BlenderSync::tag_update()
{
has_updates_ = true;
}
/* Sync */
void BlenderSync::sync_recalc(BL::Depsgraph &b_depsgraph, BL::SpaceView3D &b_v3d)

View File

@@ -66,6 +66,8 @@ class BlenderSync {
void reset(BL::BlendData &b_data, BL::Scene &b_scene);
void tag_update();
/* sync */
void sync_recalc(BL::Depsgraph &b_depsgraph, BL::SpaceView3D &b_v3d);
void sync_data(BL::RenderSettings &b_render,
@@ -167,12 +169,16 @@ class BlenderSync {
Hair *hair, BL::Mesh &b_mesh, BObjectInfo &b_ob_info, bool motion, int motion_step = 0);
bool object_has_particle_hair(BL::Object b_ob);
/* Point Cloud */
void sync_pointcloud(PointCloud *pointcloud, BObjectInfo &b_ob_info);
void sync_pointcloud_motion(PointCloud *pointcloud, BObjectInfo &b_ob_info, int motion_step = 0);
/* Camera */
void sync_camera_motion(
BL::RenderSettings &b_render, BL::Object &b_ob, int width, int height, float motion_time);
/* Geometry */
Geometry *sync_geometry(BL::Depsgraph &b_depsgrpah,
Geometry *sync_geometry(BL::Depsgraph &b_depsgraph,
BObjectInfo &b_ob_info,
bool object_updated,
bool use_particle_hair,
@@ -267,7 +273,6 @@ class BlenderSync {
Progress &progress;
protected:
/* Indicates that `sync_recalc()` detected changes in the scene.
* If this flag is false then the data is considered to be up-to-date and will not be
* synchronized at all. */

View File

@@ -33,6 +33,17 @@ set(SRC
unaligned.cpp
)
set(SRC_METAL
metal.mm
)
if(WITH_CYCLES_DEVICE_METAL)
list(APPEND SRC
${SRC_METAL}
)
add_definitions(-DWITH_METAL)
endif()
set(SRC_HEADERS
bvh.h
bvh2.h
@@ -46,6 +57,7 @@ set(SRC_HEADERS
sort.h
split.h
unaligned.h
metal.h
)
set(LIB

View File

@@ -26,6 +26,7 @@
#include "scene/hair.h"
#include "scene/mesh.h"
#include "scene/object.h"
#include "scene/pointcloud.h"
#include "scene/scene.h"
#include "util/algorithm.h"
@@ -113,9 +114,9 @@ void BVHBuild::add_reference_triangles(BoundBox &root,
else {
/* Motion triangles, trace optimized case: we split triangle
* primitives into separate nodes for each of the time steps.
* This way we minimize overlap of neighbor curve primitives.
* This way we minimize overlap of neighbor triangle primitives.
*/
const int num_bvh_steps = params.num_motion_curve_steps * 2 + 1;
const int num_bvh_steps = params.num_motion_triangle_steps * 2 + 1;
const float num_bvh_steps_inv_1 = 1.0f / (num_bvh_steps - 1);
const size_t num_verts = mesh->verts.size();
const size_t num_steps = mesh->motion_steps;
@@ -269,6 +270,101 @@ void BVHBuild::add_reference_curves(BoundBox &root, BoundBox &center, Hair *hair
}
}
void BVHBuild::add_reference_points(BoundBox &root,
BoundBox &center,
PointCloud *pointcloud,
int i)
{
const Attribute *point_attr_mP = NULL;
if (pointcloud->has_motion_blur()) {
point_attr_mP = pointcloud->attributes.find(ATTR_STD_MOTION_VERTEX_POSITION);
}
const float3 *points_data = &pointcloud->points[0];
const float *radius_data = &pointcloud->radius[0];
const size_t num_points = pointcloud->num_points();
const float3 *motion_data = (point_attr_mP) ? point_attr_mP->data_float3() : NULL;
const size_t num_steps = pointcloud->get_motion_steps();
if (point_attr_mP == NULL) {
/* Really simple logic for static points. */
for (uint j = 0; j < num_points; j++) {
const PointCloud::Point point = pointcloud->get_point(j);
BoundBox bounds = BoundBox::empty;
point.bounds_grow(points_data, radius_data, bounds);
if (bounds.valid()) {
references.push_back(BVHReference(bounds, j, i, PRIMITIVE_POINT));
root.grow(bounds);
center.grow(bounds.center2());
}
}
}
else if (params.num_motion_point_steps == 0 || params.use_spatial_split) {
/* Simple case of motion points: single node for the whole
* shutter time. Lowest memory usage but less optimal
* rendering.
*/
/* TODO(sergey): Support motion steps for spatially split BVH. */
for (uint j = 0; j < num_points; j++) {
const PointCloud::Point point = pointcloud->get_point(j);
BoundBox bounds = BoundBox::empty;
point.bounds_grow(points_data, radius_data, bounds);
for (size_t step = 0; step < num_steps - 1; step++) {
point.bounds_grow(motion_data + step * num_points, radius_data, bounds);
}
if (bounds.valid()) {
references.push_back(BVHReference(bounds, j, i, PRIMITIVE_MOTION_POINT));
root.grow(bounds);
center.grow(bounds.center2());
}
}
}
else {
/* Motion points, trace optimized case: we split point
* primitives into separate nodes for each of the time steps.
* This way we minimize overlap of neighbor point primitives.
*/
const int num_bvh_steps = params.num_motion_point_steps * 2 + 1;
const float num_bvh_steps_inv_1 = 1.0f / (num_bvh_steps - 1);
for (uint j = 0; j < num_points; j++) {
const PointCloud::Point point = pointcloud->get_point(j);
const size_t num_steps = pointcloud->get_motion_steps();
const float3 *point_steps = point_attr_mP->data_float3();
/* Calculate bounding box of the previous time step.
* Will be reused later to avoid duplicated work on
* calculating BVH time step boundbox.
*/
float4 prev_key = point.motion_key(
points_data, radius_data, point_steps, num_points, num_steps, 0.0f, j);
BoundBox prev_bounds = BoundBox::empty;
point.bounds_grow(prev_key, prev_bounds);
/* Create all primitive time steps, */
for (int bvh_step = 1; bvh_step < num_bvh_steps; ++bvh_step) {
const float curr_time = (float)(bvh_step)*num_bvh_steps_inv_1;
float4 curr_key = point.motion_key(
points_data, radius_data, point_steps, num_points, num_steps, curr_time, j);
BoundBox curr_bounds = BoundBox::empty;
point.bounds_grow(curr_key, curr_bounds);
BoundBox bounds = prev_bounds;
bounds.grow(curr_bounds);
if (bounds.valid()) {
const float prev_time = (float)(bvh_step - 1) * num_bvh_steps_inv_1;
references.push_back(
BVHReference(bounds, j, i, PRIMITIVE_MOTION_POINT, prev_time, curr_time));
root.grow(bounds);
center.grow(bounds.center2());
}
/* Current time boundbox becomes previous one for the
* next time step.
*/
prev_bounds = curr_bounds;
}
}
}
}
void BVHBuild::add_reference_geometry(BoundBox &root,
BoundBox &center,
Geometry *geom,
@@ -282,6 +378,10 @@ void BVHBuild::add_reference_geometry(BoundBox &root,
Hair *hair = static_cast<Hair *>(geom);
add_reference_curves(root, center, hair, object_index);
}
else if (geom->geometry_type == Geometry::POINTCLOUD) {
PointCloud *pointcloud = static_cast<PointCloud *>(geom);
add_reference_points(root, center, pointcloud, object_index);
}
}
void BVHBuild::add_reference_object(BoundBox &root, BoundBox &center, Object *ob, int i)
@@ -311,6 +411,10 @@ static size_t count_primitives(Geometry *geom)
Hair *hair = static_cast<Hair *>(geom);
return count_curve_segments(hair);
}
else if (geom->geometry_type == Geometry::POINTCLOUD) {
PointCloud *pointcloud = static_cast<PointCloud *>(geom);
return pointcloud->num_points();
}
return 0;
}
@@ -328,8 +432,9 @@ void BVHBuild::add_references(BVHRange &root)
if (!ob->get_geometry()->is_instanced()) {
num_alloc_references += count_primitives(ob->get_geometry());
}
else
else {
num_alloc_references++;
}
}
else {
num_alloc_references += count_primitives(ob->get_geometry());
@@ -394,7 +499,7 @@ BVHNode *BVHBuild::run()
spatial_min_overlap = root.bounds().safe_area() * params.spatial_split_alpha;
spatial_free_index = 0;
need_prim_time = params.num_motion_curve_steps > 0 || params.num_motion_triangle_steps > 0;
need_prim_time = params.use_motion_steps();
/* init progress updates */
double build_start_time;
@@ -535,7 +640,8 @@ bool BVHBuild::range_within_max_leaf_size(const BVHRange &range,
const vector<BVHReference> &references) const
{
size_t size = range.size();
size_t max_leaf_size = max(params.max_triangle_leaf_size, params.max_curve_leaf_size);
size_t max_leaf_size = max(max(params.max_triangle_leaf_size, params.max_curve_leaf_size),
params.max_point_leaf_size);
if (size > max_leaf_size)
return false;
@@ -544,32 +650,44 @@ bool BVHBuild::range_within_max_leaf_size(const BVHRange &range,
size_t num_motion_triangles = 0;
size_t num_curves = 0;
size_t num_motion_curves = 0;
size_t num_points = 0;
size_t num_motion_points = 0;
for (int i = 0; i < size; i++) {
const BVHReference &ref = references[range.start() + i];
if (ref.prim_type() & PRIMITIVE_ALL_CURVE) {
if (ref.prim_type() & PRIMITIVE_ALL_MOTION) {
if (ref.prim_type() & PRIMITIVE_CURVE) {
if (ref.prim_type() & PRIMITIVE_MOTION) {
num_motion_curves++;
}
else {
num_curves++;
}
}
else if (ref.prim_type() & PRIMITIVE_ALL_TRIANGLE) {
if (ref.prim_type() & PRIMITIVE_ALL_MOTION) {
else if (ref.prim_type() & PRIMITIVE_TRIANGLE) {
if (ref.prim_type() & PRIMITIVE_MOTION) {
num_motion_triangles++;
}
else {
num_triangles++;
}
}
else if (ref.prim_type() & PRIMITIVE_POINT) {
if (ref.prim_type() & PRIMITIVE_MOTION) {
num_motion_points++;
}
else {
num_points++;
}
}
}
return (num_triangles <= params.max_triangle_leaf_size) &&
(num_motion_triangles <= params.max_motion_triangle_leaf_size) &&
(num_curves <= params.max_curve_leaf_size) &&
(num_motion_curves <= params.max_motion_curve_leaf_size);
(num_motion_curves <= params.max_motion_curve_leaf_size) &&
(num_points <= params.max_point_leaf_size) &&
(num_motion_points <= params.max_motion_point_leaf_size);
}
/* multithreaded binning builder */
@@ -855,7 +973,7 @@ BVHNode *BVHBuild::create_leaf_node(const BVHRange &range, const vector<BVHRefer
for (int i = 0; i < range.size(); i++) {
const BVHReference &ref = references[range.start() + i];
if (ref.prim_index() != -1) {
uint32_t type_index = bitscan((uint32_t)(ref.prim_type() & PRIMITIVE_ALL));
uint32_t type_index = PRIMITIVE_INDEX(ref.prim_type() & PRIMITIVE_ALL);
p_ref[type_index].push_back(ref);
p_type[type_index].push_back(ref.prim_type());
p_index[type_index].push_back(ref.prim_index());

View File

@@ -39,6 +39,7 @@ class Geometry;
class Hair;
class Mesh;
class Object;
class PointCloud;
class Progress;
/* BVH Builder */
@@ -68,6 +69,7 @@ class BVHBuild {
/* Adding references. */
void add_reference_triangles(BoundBox &root, BoundBox &center, Mesh *mesh, int i);
void add_reference_curves(BoundBox &root, BoundBox &center, Hair *hair, int i);
void add_reference_points(BoundBox &root, BoundBox &center, PointCloud *pointcloud, int i);
void add_reference_geometry(BoundBox &root, BoundBox &center, Geometry *geom, int i);
void add_reference_object(BoundBox &root, BoundBox &center, Object *ob, int i);
void add_references(BVHRange &root);

View File

@@ -19,6 +19,7 @@
#include "bvh/bvh2.h"
#include "bvh/embree.h"
#include "bvh/metal.h"
#include "bvh/multi.h"
#include "bvh/optix.h"
@@ -40,8 +41,12 @@ const char *bvh_layout_name(BVHLayout layout)
return "EMBREE";
case BVH_LAYOUT_OPTIX:
return "OPTIX";
case BVH_LAYOUT_METAL:
return "METAL";
case BVH_LAYOUT_MULTI_OPTIX:
case BVH_LAYOUT_MULTI_METAL:
case BVH_LAYOUT_MULTI_OPTIX_EMBREE:
case BVH_LAYOUT_MULTI_METAL_EMBREE:
return "MULTI";
case BVH_LAYOUT_ALL:
return "ALL";
@@ -102,9 +107,18 @@ BVH *BVH::create(const BVHParams &params,
#else
(void)device;
break;
#endif
case BVH_LAYOUT_METAL:
#ifdef WITH_METAL
return bvh_metal_create(params, geometry, objects, device);
#else
(void)device;
break;
#endif
case BVH_LAYOUT_MULTI_OPTIX:
case BVH_LAYOUT_MULTI_METAL:
case BVH_LAYOUT_MULTI_OPTIX_EMBREE:
case BVH_LAYOUT_MULTI_METAL_EMBREE:
return new BVHMulti(params, geometry, objects);
case BVH_LAYOUT_NONE:
case BVH_LAYOUT_ALL:

View File

@@ -20,6 +20,7 @@
#include "scene/hair.h"
#include "scene/mesh.h"
#include "scene/object.h"
#include "scene/pointcloud.h"
#include "bvh/build.h"
#include "bvh/node.h"
@@ -386,7 +387,7 @@ void BVH2::refit_primitives(int start, int end, BoundBox &bbox, uint &visibility
}
else {
/* Primitives. */
if (pack.prim_type[prim] & PRIMITIVE_ALL_CURVE) {
if (pack.prim_type[prim] & PRIMITIVE_CURVE) {
/* Curves. */
const Hair *hair = static_cast<const Hair *>(ob->get_geometry());
int prim_offset = (params.top_level) ? hair->prim_offset : 0;
@@ -409,6 +410,30 @@ void BVH2::refit_primitives(int start, int end, BoundBox &bbox, uint &visibility
}
}
}
else if (pack.prim_type[prim] & PRIMITIVE_POINT) {
/* Points. */
const PointCloud *pointcloud = static_cast<const PointCloud *>(ob->get_geometry());
int prim_offset = (params.top_level) ? pointcloud->prim_offset : 0;
const float3 *points = &pointcloud->points[0];
const float *radius = &pointcloud->radius[0];
PointCloud::Point point = pointcloud->get_point(pidx - prim_offset);
point.bounds_grow(points, radius, bbox);
/* Motion points. */
if (pointcloud->get_use_motion_blur()) {
Attribute *attr = pointcloud->attributes.find(ATTR_STD_MOTION_VERTEX_POSITION);
if (attr) {
size_t pointcloud_size = pointcloud->points.size();
size_t steps = pointcloud->get_motion_steps() - 1;
float3 *point_steps = attr->data_float3();
for (size_t i = 0; i < steps; i++)
point.bounds_grow(point_steps + i * pointcloud_size, radius, bbox);
}
}
}
else {
/* Triangles. */
const Mesh *mesh = static_cast<const Mesh *>(ob->get_geometry());
@@ -505,7 +530,8 @@ void BVH2::pack_instances(size_t nodes_size, size_t leaf_nodes_size)
pack.leaf_nodes.resize(leaf_nodes_size);
pack.object_node.resize(objects.size());
if (params.num_motion_curve_steps > 0 || params.num_motion_triangle_steps > 0) {
if (params.num_motion_curve_steps > 0 || params.num_motion_triangle_steps > 0 ||
params.num_motion_point_steps > 0) {
pack.prim_time.resize(prim_index_size);
}
@@ -564,13 +590,7 @@ void BVH2::pack_instances(size_t nodes_size, size_t leaf_nodes_size)
float2 *bvh_prim_time = bvh->pack.prim_time.size() ? &bvh->pack.prim_time[0] : NULL;
for (size_t i = 0; i < bvh_prim_index_size; i++) {
if (bvh->pack.prim_type[i] & PRIMITIVE_ALL_CURVE) {
pack_prim_index[pack_prim_index_offset] = bvh_prim_index[i] + geom_prim_offset;
}
else {
pack_prim_index[pack_prim_index_offset] = bvh_prim_index[i] + geom_prim_offset;
}
pack_prim_index[pack_prim_index_offset] = bvh_prim_index[i] + geom_prim_offset;
pack_prim_type[pack_prim_index_offset] = bvh_prim_type[i];
pack_prim_visibility[pack_prim_index_offset] = bvh_prim_visibility[i];
pack_prim_object[pack_prim_index_offset] = 0; // unused for instances

View File

@@ -45,6 +45,7 @@
# include "scene/hair.h"
# include "scene/mesh.h"
# include "scene/object.h"
# include "scene/pointcloud.h"
# include "util/foreach.h"
# include "util/log.h"
@@ -90,7 +91,7 @@ static void rtc_filter_occluded_func(const RTCFilterFunctionNArguments *args)
++ctx->num_hits;
/* Always use baked shadow transparency for curves. */
if (current_isect.type & PRIMITIVE_ALL_CURVE) {
if (current_isect.type & PRIMITIVE_CURVE) {
ctx->throughput *= intersection_curve_shadow_transparency(
kg, current_isect.object, current_isect.prim, current_isect.u);
@@ -245,7 +246,7 @@ static void rtc_filter_occluded_func(const RTCFilterFunctionNArguments *args)
}
}
static void rtc_filter_func_thick_curve(const RTCFilterFunctionNArguments *args)
static void rtc_filter_func_backface_cull(const RTCFilterFunctionNArguments *args)
{
const RTCRay *ray = (RTCRay *)args->ray;
RTCHit *hit = (RTCHit *)args->hit;
@@ -258,7 +259,7 @@ static void rtc_filter_func_thick_curve(const RTCFilterFunctionNArguments *args)
}
}
static void rtc_filter_occluded_func_thick_curve(const RTCFilterFunctionNArguments *args)
static void rtc_filter_occluded_func_backface_cull(const RTCFilterFunctionNArguments *args)
{
const RTCRay *ray = (RTCRay *)args->ray;
RTCHit *hit = (RTCHit *)args->hit;
@@ -410,6 +411,12 @@ void BVHEmbree::add_object(Object *ob, int i)
add_curves(ob, hair, i);
}
}
else if (geom->geometry_type == Geometry::POINTCLOUD) {
PointCloud *pointcloud = static_cast<PointCloud *>(geom);
if (pointcloud->num_points() > 0) {
add_points(ob, pointcloud, i);
}
}
}
void BVHEmbree::add_instance(Object *ob, int i)
@@ -624,6 +631,89 @@ void BVHEmbree::set_curve_vertex_buffer(RTCGeometry geom_id, const Hair *hair, c
}
}
void BVHEmbree::set_point_vertex_buffer(RTCGeometry geom_id,
const PointCloud *pointcloud,
const bool update)
{
const Attribute *attr_mP = NULL;
size_t num_motion_steps = 1;
if (pointcloud->has_motion_blur()) {
attr_mP = pointcloud->attributes.find(ATTR_STD_MOTION_VERTEX_POSITION);
if (attr_mP) {
num_motion_steps = pointcloud->get_motion_steps();
}
}
const size_t num_points = pointcloud->num_points();
/* Copy the point data to Embree */
const int t_mid = (num_motion_steps - 1) / 2;
const float *radius = pointcloud->get_radius().data();
for (int t = 0; t < num_motion_steps; ++t) {
const float3 *verts;
if (t == t_mid || attr_mP == NULL) {
verts = pointcloud->get_points().data();
}
else {
int t_ = (t > t_mid) ? (t - 1) : t;
verts = &attr_mP->data_float3()[t_ * num_points];
}
float4 *rtc_verts = (update) ? (float4 *)rtcGetGeometryBufferData(
geom_id, RTC_BUFFER_TYPE_VERTEX, t) :
(float4 *)rtcSetNewGeometryBuffer(geom_id,
RTC_BUFFER_TYPE_VERTEX,
t,
RTC_FORMAT_FLOAT4,
sizeof(float) * 4,
num_points);
assert(rtc_verts);
if (rtc_verts) {
for (size_t j = 0; j < num_points; ++j) {
rtc_verts[j] = float3_to_float4(verts[j]);
rtc_verts[j].w = radius[j];
}
}
if (update) {
rtcUpdateGeometryBuffer(geom_id, RTC_BUFFER_TYPE_VERTEX, t);
}
}
}
void BVHEmbree::add_points(const Object *ob, const PointCloud *pointcloud, int i)
{
size_t prim_offset = pointcloud->prim_offset;
const Attribute *attr_mP = NULL;
size_t num_motion_steps = 1;
if (pointcloud->has_motion_blur()) {
attr_mP = pointcloud->attributes.find(ATTR_STD_MOTION_VERTEX_POSITION);
if (attr_mP) {
num_motion_steps = pointcloud->get_motion_steps();
}
}
enum RTCGeometryType type = RTC_GEOMETRY_TYPE_SPHERE_POINT;
RTCGeometry geom_id = rtcNewGeometry(rtc_device, type);
rtcSetGeometryBuildQuality(geom_id, build_quality);
rtcSetGeometryTimeStepCount(geom_id, num_motion_steps);
set_point_vertex_buffer(geom_id, pointcloud, false);
rtcSetGeometryUserData(geom_id, (void *)prim_offset);
rtcSetGeometryIntersectFilterFunction(geom_id, rtc_filter_func_backface_cull);
rtcSetGeometryOccludedFilterFunction(geom_id, rtc_filter_occluded_func_backface_cull);
rtcSetGeometryMask(geom_id, ob->visibility_for_tracing());
rtcCommitGeometry(geom_id);
rtcAttachGeometryByID(scene, geom_id, i * 2);
rtcReleaseGeometry(geom_id);
}
void BVHEmbree::add_curves(const Object *ob, const Hair *hair, int i)
{
size_t prim_offset = hair->curve_segment_offset;
@@ -678,8 +768,8 @@ void BVHEmbree::add_curves(const Object *ob, const Hair *hair, int i)
rtcSetGeometryOccludedFilterFunction(geom_id, rtc_filter_occluded_func);
}
else {
rtcSetGeometryIntersectFilterFunction(geom_id, rtc_filter_func_thick_curve);
rtcSetGeometryOccludedFilterFunction(geom_id, rtc_filter_occluded_func_thick_curve);
rtcSetGeometryIntersectFilterFunction(geom_id, rtc_filter_func_backface_cull);
rtcSetGeometryOccludedFilterFunction(geom_id, rtc_filter_occluded_func_backface_cull);
}
rtcSetGeometryMask(geom_id, ob->visibility_for_tracing());
@@ -716,6 +806,14 @@ void BVHEmbree::refit(Progress &progress)
rtcCommitGeometry(geom);
}
}
else if (geom->geometry_type == Geometry::POINTCLOUD) {
PointCloud *pointcloud = static_cast<PointCloud *>(geom);
if (pointcloud->num_points() > 0) {
RTCGeometry geom = rtcGetGeometry(scene, geom_id);
set_point_vertex_buffer(geom, pointcloud, true);
rtcCommitGeometry(geom);
}
}
}
geom_id += 2;
}

View File

@@ -33,6 +33,7 @@ CCL_NAMESPACE_BEGIN
class Hair;
class Mesh;
class PointCloud;
class BVHEmbree : public BVH {
public:
@@ -51,11 +52,15 @@ class BVHEmbree : public BVH {
void add_object(Object *ob, int i);
void add_instance(Object *ob, int i);
void add_curves(const Object *ob, const Hair *hair, int i);
void add_points(const Object *ob, const PointCloud *pointcloud, int i);
void add_triangles(const Object *ob, const Mesh *mesh, int i);
private:
void set_tri_vertex_buffer(RTCGeometry geom_id, const Mesh *mesh, const bool update);
void set_curve_vertex_buffer(RTCGeometry geom_id, const Hair *hair, const bool update);
void set_point_vertex_buffer(RTCGeometry geom_id,
const PointCloud *pointcloud,
const bool update);
RTCDevice rtc_device;
enum RTCBuildQuality build_quality;

35
intern/cycles/bvh/metal.h Normal file
View File

@@ -0,0 +1,35 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef __BVH_METAL_H__
#define __BVH_METAL_H__
#ifdef WITH_METAL
# include "bvh/bvh.h"
CCL_NAMESPACE_BEGIN
BVH *bvh_metal_create(const BVHParams &params,
const vector<Geometry *> &geometry,
const vector<Object *> &objects,
Device *device);
CCL_NAMESPACE_END
#endif /* WITH_METAL */
#endif /* __BVH_METAL_H__ */

View File

@@ -0,0 +1,33 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifdef WITH_METAL
# include "device/metal/bvh.h"
CCL_NAMESPACE_BEGIN
BVH *bvh_metal_create(const BVHParams &params,
const vector<Geometry *> &geometry,
const vector<Object *> &objects,
Device *device)
{
return new BVHMetal(params, geometry, objects, device);
}
CCL_NAMESPACE_END
#endif /* WITH_METAL */

View File

@@ -83,6 +83,8 @@ class BVHParams {
int max_motion_triangle_leaf_size;
int max_curve_leaf_size;
int max_motion_curve_leaf_size;
int max_point_leaf_size;
int max_motion_point_leaf_size;
/* object or mesh level bvh */
bool top_level;
@@ -98,13 +100,13 @@ class BVHParams {
/* Split time range to this number of steps and create leaf node for each
* of this time steps.
*
* Speeds up rendering of motion curve primitives in the cost of higher
* memory usage.
* Speeds up rendering of motion primitives in the cost of higher memory usage.
*/
int num_motion_curve_steps;
/* Same as above, but for triangle primitives. */
int num_motion_triangle_steps;
int num_motion_curve_steps;
int num_motion_point_steps;
/* Same as in SceneParams. */
int bvh_type;
@@ -132,6 +134,8 @@ class BVHParams {
max_motion_triangle_leaf_size = 8;
max_curve_leaf_size = 1;
max_motion_curve_leaf_size = 4;
max_point_leaf_size = 8;
max_motion_point_leaf_size = 8;
top_level = false;
bvh_layout = BVH_LAYOUT_BVH2;
@@ -139,6 +143,7 @@ class BVHParams {
num_motion_curve_steps = 0;
num_motion_triangle_steps = 0;
num_motion_point_steps = 0;
bvh_type = 0;
@@ -166,6 +171,12 @@ class BVHParams {
return (size <= min_leaf_size || level >= MAX_DEPTH);
}
bool use_motion_steps()
{
return num_motion_curve_steps > 0 || num_motion_triangle_steps > 0 ||
num_motion_point_steps > 0;
}
/* Gets best matching BVH.
*
* If the requested layout is supported by the device, it will be used.

View File

@@ -23,6 +23,7 @@
#include "scene/hair.h"
#include "scene/mesh.h"
#include "scene/object.h"
#include "scene/pointcloud.h"
#include "util/algorithm.h"
@@ -426,6 +427,32 @@ void BVHSpatialSplit::split_curve_primitive(const Hair *hair,
}
}
void BVHSpatialSplit::split_point_primitive(const PointCloud *pointcloud,
const Transform *tfm,
int prim_index,
int dim,
float pos,
BoundBox &left_bounds,
BoundBox &right_bounds)
{
/* No real splitting support for points, assume they are small enough for it
* not to matter. */
float3 point = pointcloud->get_points()[prim_index];
if (tfm != NULL) {
point = transform_point(tfm, point);
}
point = get_unaligned_point(point);
if (point[dim] <= pos) {
left_bounds.grow(point);
}
if (point[dim] >= pos) {
right_bounds.grow(point);
}
}
void BVHSpatialSplit::split_triangle_reference(const BVHReference &ref,
const Mesh *mesh,
int dim,
@@ -453,6 +480,16 @@ void BVHSpatialSplit::split_curve_reference(const BVHReference &ref,
right_bounds);
}
void BVHSpatialSplit::split_point_reference(const BVHReference &ref,
const PointCloud *pointcloud,
int dim,
float pos,
BoundBox &left_bounds,
BoundBox &right_bounds)
{
split_point_primitive(pointcloud, NULL, ref.prim_index(), dim, pos, left_bounds, right_bounds);
}
void BVHSpatialSplit::split_object_reference(
const Object *object, int dim, float pos, BoundBox &left_bounds, BoundBox &right_bounds)
{
@@ -475,6 +512,13 @@ void BVHSpatialSplit::split_object_reference(
}
}
}
else if (geom->geometry_type == Geometry::POINTCLOUD) {
PointCloud *pointcloud = static_cast<PointCloud *>(geom);
for (int point_idx = 0; point_idx < pointcloud->num_points(); ++point_idx) {
split_point_primitive(
pointcloud, &object->get_tfm(), point_idx, dim, pos, left_bounds, right_bounds);
}
}
}
void BVHSpatialSplit::split_reference(const BVHBuild &builder,
@@ -491,14 +535,18 @@ void BVHSpatialSplit::split_reference(const BVHBuild &builder,
/* loop over vertices/edges. */
const Object *ob = builder.objects[ref.prim_object()];
if (ref.prim_type() & PRIMITIVE_ALL_TRIANGLE) {
if (ref.prim_type() & PRIMITIVE_TRIANGLE) {
Mesh *mesh = static_cast<Mesh *>(ob->get_geometry());
split_triangle_reference(ref, mesh, dim, pos, left_bounds, right_bounds);
}
else if (ref.prim_type() & PRIMITIVE_ALL_CURVE) {
else if (ref.prim_type() & PRIMITIVE_CURVE) {
Hair *hair = static_cast<Hair *>(ob->get_geometry());
split_curve_reference(ref, hair, dim, pos, left_bounds, right_bounds);
}
else if (ref.prim_type() & PRIMITIVE_POINT) {
PointCloud *pointcloud = static_cast<PointCloud *>(ob->get_geometry());
split_point_reference(ref, pointcloud, dim, pos, left_bounds, right_bounds);
}
else {
split_object_reference(ob, dim, pos, left_bounds, right_bounds);
}

View File

@@ -26,6 +26,7 @@ CCL_NAMESPACE_BEGIN
class BVHBuild;
class Hair;
class Mesh;
class PointCloud;
struct Transform;
/* Object Split */
@@ -123,6 +124,13 @@ class BVHSpatialSplit {
float pos,
BoundBox &left_bounds,
BoundBox &right_bounds);
void split_point_primitive(const PointCloud *pointcloud,
const Transform *tfm,
int prim_index,
int dim,
float pos,
BoundBox &left_bounds,
BoundBox &right_bounds);
/* Lower-level functions which calculates boundaries of left and right nodes
* needed for spatial split.
@@ -141,6 +149,12 @@ class BVHSpatialSplit {
float pos,
BoundBox &left_bounds,
BoundBox &right_bounds);
void split_point_reference(const BVHReference &ref,
const PointCloud *pointcloud,
int dim,
float pos,
BoundBox &left_bounds,
BoundBox &right_bounds);
void split_object_reference(
const Object *object, int dim, float pos, BoundBox &left_bounds, BoundBox &right_bounds);

View File

@@ -69,7 +69,7 @@ bool BVHUnaligned::compute_aligned_space(const BVHReference &ref, Transform *ali
const int packed_type = ref.prim_type();
const int type = (packed_type & PRIMITIVE_ALL);
/* No motion blur curves here, we can't fit them to aligned boxes well. */
if (type & (PRIMITIVE_CURVE_RIBBON | PRIMITIVE_CURVE_THICK)) {
if ((type & PRIMITIVE_CURVE) && !(type & PRIMITIVE_MOTION)) {
const int curve_index = ref.prim_index();
const int segment = PRIMITIVE_UNPACK_SEGMENT(packed_type);
const Hair *hair = static_cast<const Hair *>(object->get_geometry());
@@ -95,7 +95,7 @@ BoundBox BVHUnaligned::compute_aligned_prim_boundbox(const BVHReference &prim,
const int packed_type = prim.prim_type();
const int type = (packed_type & PRIMITIVE_ALL);
/* No motion blur curves here, we can't fit them to aligned boxes well. */
if (type & (PRIMITIVE_CURVE_RIBBON | PRIMITIVE_CURVE_THICK)) {
if ((type & PRIMITIVE_CURVE) && !(type & PRIMITIVE_MOTION)) {
const int curve_index = prim.prim_index();
const int segment = PRIMITIVE_UNPACK_SEGMENT(packed_type);
const Hair *hair = static_cast<const Hair *>(object->get_geometry());

View File

@@ -551,4 +551,23 @@ if(NOT WITH_HIP_DYNLOAD)
set(WITH_HIP_DYNLOAD ON)
endif()
###########################################################################
# Metal
###########################################################################
if(WITH_CYCLES_DEVICE_METAL)
find_library(METAL_LIBRARY Metal)
# This file was added in the 12.0 SDK, use it as a way to detect the version.
if (METAL_LIBRARY AND NOT EXISTS "${METAL_LIBRARY}/Headers/MTLFunctionStitching.h")
message(STATUS "Metal version too old, must be SDK 12.0 or newer, disabling WITH_CYCLES_DEVICE_METAL")
set(WITH_CYCLES_DEVICE_METAL OFF)
elseif (NOT METAL_LIBRARY)
message(STATUS "Metal not found, disabling WITH_CYCLES_DEVICE_METAL")
set(WITH_CYCLES_DEVICE_METAL OFF)
else()
message(STATUS "Found Metal: ${METAL_LIBRARY}")
endif()
endif()
unset(_cycles_lib_dir)

View File

@@ -43,7 +43,7 @@ if(WITH_CYCLES_DEVICE_HIP AND WITH_HIP_DYNLOAD)
add_definitions(-DWITH_HIP_DYNLOAD)
endif()
set(SRC
set(SRC_BASE
device.cpp
denoise.cpp
graphics_interop.cpp
@@ -104,6 +104,21 @@ set(SRC_MULTI
multi/device.h
)
set(SRC_METAL
metal/bvh.mm
metal/bvh.h
metal/device.mm
metal/device.h
metal/device_impl.mm
metal/device_impl.h
metal/kernel.mm
metal/kernel.h
metal/queue.mm
metal/queue.h
metal/util.mm
metal/util.h
)
set(SRC_OPTIX
optix/device.cpp
optix/device.h
@@ -123,6 +138,17 @@ set(SRC_HEADERS
queue.h
)
set(SRC
${SRC_BASE}
${SRC_CPU}
${SRC_CUDA}
${SRC_HIP}
${SRC_DUMMY}
${SRC_MULTI}
${SRC_OPTIX}
${SRC_HEADERS}
)
set(LIB
cycles_kernel
cycles_util
@@ -158,6 +184,15 @@ endif()
if(WITH_CYCLES_DEVICE_OPTIX)
add_definitions(-DWITH_OPTIX)
endif()
if(WITH_CYCLES_DEVICE_METAL)
list(APPEND LIB
${METAL_LIBRARY}
)
add_definitions(-DWITH_METAL)
list(APPEND SRC
${SRC_METAL}
)
endif()
if(WITH_OPENIMAGEDENOISE)
list(APPEND LIB
@@ -168,20 +203,12 @@ endif()
include_directories(${INC})
include_directories(SYSTEM ${INC_SYS})
cycles_add_library(cycles_device "${LIB}"
${SRC}
${SRC_CPU}
${SRC_CUDA}
${SRC_HIP}
${SRC_DUMMY}
${SRC_MULTI}
${SRC_OPTIX}
${SRC_HEADERS}
)
cycles_add_library(cycles_device "${LIB}" ${SRC})
source_group("cpu" FILES ${SRC_CPU})
source_group("cuda" FILES ${SRC_CUDA})
source_group("dummy" FILES ${SRC_DUMMY})
source_group("multi" FILES ${SRC_MULTI})
source_group("metal" FILES ${SRC_METAL})
source_group("optix" FILES ${SRC_OPTIX})
source_group("common" FILES ${SRC} ${SRC_HEADERS})

View File

@@ -129,8 +129,7 @@ void CPUDevice::mem_alloc(device_memory &mem)
<< string_human_readable_size(mem.memory_size()) << ")";
}
if (mem.type == MEM_DEVICE_ONLY) {
assert(!mem.host_pointer);
if (mem.type == MEM_DEVICE_ONLY || !mem.host_pointer) {
size_t alignment = MIN_ALIGNMENT_CPU_DATA_TYPES;
void *data = util_aligned_malloc(mem.memory_size(), alignment);
mem.device_pointer = (device_ptr)data;
@@ -189,7 +188,7 @@ void CPUDevice::mem_free(device_memory &mem)
tex_free((device_texture &)mem);
}
else if (mem.device_pointer) {
if (mem.type == MEM_DEVICE_ONLY) {
if (mem.type == MEM_DEVICE_ONLY || !mem.host_pointer) {
util_aligned_free((void *)mem.device_pointer);
}
mem.device_pointer = 0;
@@ -274,7 +273,8 @@ void CPUDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
{
#ifdef WITH_EMBREE
if (bvh->params.bvh_layout == BVH_LAYOUT_EMBREE ||
bvh->params.bvh_layout == BVH_LAYOUT_MULTI_OPTIX_EMBREE) {
bvh->params.bvh_layout == BVH_LAYOUT_MULTI_OPTIX_EMBREE ||
bvh->params.bvh_layout == BVH_LAYOUT_MULTI_METAL_EMBREE) {
BVHEmbree *const bvh_embree = static_cast<BVHEmbree *>(bvh);
if (refit) {
bvh_embree->refit(progress);

View File

@@ -477,10 +477,10 @@ void CUDADevice::reserve_local_memory(const uint kernel_features)
* still to make it faster. */
CUDADeviceQueue queue(this);
void *d_path_index = nullptr;
void *d_render_buffer = nullptr;
device_ptr d_path_index = 0;
device_ptr d_render_buffer = 0;
int d_work_size = 0;
void *args[] = {&d_path_index, &d_render_buffer, &d_work_size};
DeviceKernelArguments args(&d_path_index, &d_render_buffer, &d_work_size);
queue.init_execution();
queue.enqueue(test_kernel, 1, args);
@@ -678,7 +678,7 @@ CUDADevice::CUDAMem *CUDADevice::generic_alloc(device_memory &mem, size_t pitch_
void *shared_pointer = 0;
if (mem_alloc_result != CUDA_SUCCESS && can_map_host) {
if (mem_alloc_result != CUDA_SUCCESS && can_map_host && mem.type != MEM_DEVICE_ONLY) {
if (mem.shared_pointer) {
/* Another device already allocated host memory. */
mem_alloc_result = CUDA_SUCCESS;
@@ -701,8 +701,14 @@ CUDADevice::CUDAMem *CUDADevice::generic_alloc(device_memory &mem, size_t pitch_
}
if (mem_alloc_result != CUDA_SUCCESS) {
status = " failed, out of device and host memory";
set_error("System is out of GPU and shared host memory");
if (mem.type == MEM_DEVICE_ONLY) {
status = " failed, out of device memory";
set_error("System is out of GPU memory");
}
else {
status = " failed, out of device and host memory";
set_error("System is out of GPU and shared host memory");
}
}
if (mem.name) {

View File

@@ -89,7 +89,9 @@ bool CUDADeviceQueue::kernel_available(DeviceKernel kernel) const
return cuda_device_->kernels.available(kernel);
}
bool CUDADeviceQueue::enqueue(DeviceKernel kernel, const int work_size, void *args[])
bool CUDADeviceQueue::enqueue(DeviceKernel kernel,
const int work_size,
DeviceKernelArguments const &args)
{
if (cuda_device_->have_error()) {
return false;
@@ -133,7 +135,7 @@ bool CUDADeviceQueue::enqueue(DeviceKernel kernel, const int work_size, void *ar
1,
shared_mem_bytes,
cuda_stream_,
args,
const_cast<void **>(args.values),
0),
"enqueue");

View File

@@ -42,7 +42,9 @@ class CUDADeviceQueue : public DeviceQueue {
virtual bool kernel_available(DeviceKernel kernel) const override;
virtual bool enqueue(DeviceKernel kernel, const int work_size, void *args[]) override;
virtual bool enqueue(DeviceKernel kernel,
const int work_size,
DeviceKernelArguments const &args) override;
virtual bool synchronize() override;

View File

@@ -27,6 +27,7 @@
#include "device/cuda/device.h"
#include "device/dummy/device.h"
#include "device/hip/device.h"
#include "device/metal/device.h"
#include "device/multi/device.h"
#include "device/optix/device.h"
@@ -49,6 +50,7 @@ vector<DeviceInfo> Device::cuda_devices;
vector<DeviceInfo> Device::optix_devices;
vector<DeviceInfo> Device::cpu_devices;
vector<DeviceInfo> Device::hip_devices;
vector<DeviceInfo> Device::metal_devices;
uint Device::devices_initialized_mask = 0;
/* Device */
@@ -105,6 +107,12 @@ Device *Device::create(const DeviceInfo &info, Stats &stats, Profiler &profiler)
break;
#endif
#ifdef WITH_METAL
case DEVICE_METAL:
if (device_metal_init())
device = device_metal_create(info, stats, profiler);
break;
#endif
default:
break;
}
@@ -128,6 +136,8 @@ DeviceType Device::type_from_string(const char *name)
return DEVICE_MULTI;
else if (strcmp(name, "HIP") == 0)
return DEVICE_HIP;
else if (strcmp(name, "METAL") == 0)
return DEVICE_METAL;
return DEVICE_NONE;
}
@@ -144,6 +154,8 @@ string Device::string_from_type(DeviceType type)
return "MULTI";
else if (type == DEVICE_HIP)
return "HIP";
else if (type == DEVICE_METAL)
return "METAL";
return "";
}
@@ -161,7 +173,9 @@ vector<DeviceType> Device::available_types()
#ifdef WITH_HIP
types.push_back(DEVICE_HIP);
#endif
#ifdef WITH_METAL
types.push_back(DEVICE_METAL);
#endif
return types;
}
@@ -227,6 +241,20 @@ vector<DeviceInfo> Device::available_devices(uint mask)
}
}
#ifdef WITH_METAL
if (mask & DEVICE_MASK_METAL) {
if (!(devices_initialized_mask & DEVICE_MASK_METAL)) {
if (device_metal_init()) {
device_metal_info(metal_devices);
}
devices_initialized_mask |= DEVICE_MASK_METAL;
}
foreach (DeviceInfo &info, metal_devices) {
devices.push_back(info);
}
}
#endif
return devices;
}
@@ -266,6 +294,15 @@ string Device::device_capabilities(uint mask)
}
#endif
#ifdef WITH_METAL
if (mask & DEVICE_MASK_METAL) {
if (device_metal_init()) {
capabilities += "\nMetal device capabilities:\n";
capabilities += device_metal_capabilities();
}
}
#endif
return capabilities;
}
@@ -354,6 +391,7 @@ void Device::free_memory()
optix_devices.free_memory();
hip_devices.free_memory();
cpu_devices.free_memory();
metal_devices.free_memory();
}
unique_ptr<DeviceQueue> Device::gpu_queue_create()

View File

@@ -52,6 +52,7 @@ enum DeviceType {
DEVICE_MULTI,
DEVICE_OPTIX,
DEVICE_HIP,
DEVICE_METAL,
DEVICE_DUMMY,
};
@@ -60,6 +61,7 @@ enum DeviceTypeMask {
DEVICE_MASK_CUDA = (1 << DEVICE_CUDA),
DEVICE_MASK_OPTIX = (1 << DEVICE_OPTIX),
DEVICE_MASK_HIP = (1 << DEVICE_HIP),
DEVICE_MASK_METAL = (1 << DEVICE_METAL),
DEVICE_MASK_ALL = ~0
};
@@ -281,6 +283,7 @@ class Device {
static vector<DeviceInfo> optix_devices;
static vector<DeviceInfo> cpu_devices;
static vector<DeviceInfo> hip_devices;
static vector<DeviceInfo> metal_devices;
static uint devices_initialized_mask;
};

View File

@@ -440,10 +440,10 @@ void HIPDevice::reserve_local_memory(const uint kernel_features)
* still to make it faster. */
HIPDeviceQueue queue(this);
void *d_path_index = nullptr;
void *d_render_buffer = nullptr;
device_ptr d_path_index = 0;
device_ptr d_render_buffer = 0;
int d_work_size = 0;
void *args[] = {&d_path_index, &d_render_buffer, &d_work_size};
DeviceKernelArguments args(&d_path_index, &d_render_buffer, &d_work_size);
queue.init_execution();
queue.enqueue(test_kernel, 1, args);

View File

@@ -89,7 +89,9 @@ bool HIPDeviceQueue::kernel_available(DeviceKernel kernel) const
return hip_device_->kernels.available(kernel);
}
bool HIPDeviceQueue::enqueue(DeviceKernel kernel, const int work_size, void *args[])
bool HIPDeviceQueue::enqueue(DeviceKernel kernel,
const int work_size,
DeviceKernelArguments const &args)
{
if (hip_device_->have_error()) {
return false;
@@ -132,7 +134,7 @@ bool HIPDeviceQueue::enqueue(DeviceKernel kernel, const int work_size, void *arg
1,
shared_mem_bytes,
hip_stream_,
args,
const_cast<void **>(args.values),
0),
"enqueue");

View File

@@ -42,7 +42,9 @@ class HIPDeviceQueue : public DeviceQueue {
virtual bool kernel_available(DeviceKernel kernel) const override;
virtual bool enqueue(DeviceKernel kernel, const int work_size, void *args[]) override;
virtual bool enqueue(DeviceKernel kernel,
const int work_size,
DeviceKernelArguments const &args) override;
virtual bool synchronize() override;

View File

@@ -263,6 +263,7 @@ class device_memory {
friend class CUDADevice;
friend class OptiXDevice;
friend class HIPDevice;
friend class MetalDevice;
/* Only create through subclasses. */
device_memory(Device *device, const char *name, MemoryType type);
@@ -581,7 +582,7 @@ template<typename T> class device_vector : public device_memory {
* from an already allocated base memory. It is freed automatically when it
* goes out of scope, which should happen before base memory is freed.
*
* Note: some devices require offset and size of the sub_ptr to be properly
* NOTE: some devices require offset and size of the sub_ptr to be properly
* aligned to device->mem_address_alingment(). */
class device_sub_ptr {

View File

@@ -0,0 +1,66 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once
#ifdef WITH_METAL
# include "bvh/bvh.h"
# include "bvh/params.h"
# include "device/memory.h"
# include <Metal/Metal.h>
CCL_NAMESPACE_BEGIN
class BVHMetal : public BVH {
public:
API_AVAILABLE(macos(11.0))
id<MTLAccelerationStructure> accel_struct = nil;
bool accel_struct_building = false;
API_AVAILABLE(macos(11.0))
vector<id<MTLAccelerationStructure>> blas_array;
bool motion_blur = false;
Stats &stats;
bool build(Progress &progress, id<MTLDevice> device, id<MTLCommandQueue> queue, bool refit);
BVHMetal(const BVHParams &params,
const vector<Geometry *> &geometry,
const vector<Object *> &objects,
Device *device);
virtual ~BVHMetal();
bool build_BLAS(Progress &progress, id<MTLDevice> device, id<MTLCommandQueue> queue, bool refit);
bool build_BLAS_mesh(Progress &progress,
id<MTLDevice> device,
id<MTLCommandQueue> queue,
Geometry *const geom,
bool refit);
bool build_BLAS_hair(Progress &progress,
id<MTLDevice> device,
id<MTLCommandQueue> queue,
Geometry *const geom,
bool refit);
bool build_TLAS(Progress &progress, id<MTLDevice> device, id<MTLCommandQueue> queue, bool refit);
};
CCL_NAMESPACE_END
#endif /* WITH_METAL */

View File

@@ -0,0 +1,813 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifdef WITH_METAL
# include "scene/hair.h"
# include "scene/mesh.h"
# include "scene/object.h"
# include "util/progress.h"
# include "device/metal/bvh.h"
CCL_NAMESPACE_BEGIN
# define BVH_status(...) \
{ \
string str = string_printf(__VA_ARGS__); \
progress.set_substatus(str); \
}
BVHMetal::BVHMetal(const BVHParams &params_,
const vector<Geometry *> &geometry_,
const vector<Object *> &objects_,
Device *device)
: BVH(params_, geometry_, objects_), stats(device->stats)
{
}
BVHMetal::~BVHMetal()
{
if (@available(macos 12.0, *)) {
if (accel_struct) {
stats.mem_free(accel_struct.allocatedSize);
[accel_struct release];
}
}
}
bool BVHMetal::build_BLAS_mesh(Progress &progress,
id<MTLDevice> device,
id<MTLCommandQueue> queue,
Geometry *const geom,
bool refit)
{
if (@available(macos 12.0, *)) {
/* Build BLAS for triangle primitives */
Mesh *const mesh = static_cast<Mesh *const>(geom);
if (mesh->num_triangles() == 0) {
return false;
}
/*------------------------------------------------*/
BVH_status(
"Building mesh BLAS | %7d tris | %s", (int)mesh->num_triangles(), geom->name.c_str());
/*------------------------------------------------*/
const bool use_fast_trace_bvh = (params.bvh_type == BVH_TYPE_STATIC);
const array<float3> &verts = mesh->get_verts();
const array<int> &tris = mesh->get_triangles();
const size_t num_verts = verts.size();
const size_t num_indices = tris.size();
size_t num_motion_steps = 1;
Attribute *motion_keys = mesh->attributes.find(ATTR_STD_MOTION_VERTEX_POSITION);
if (motion_blur && mesh->get_use_motion_blur() && motion_keys) {
num_motion_steps = mesh->get_motion_steps();
}
MTLResourceOptions storage_mode;
if (device.hasUnifiedMemory) {
storage_mode = MTLResourceStorageModeShared;
}
else {
storage_mode = MTLResourceStorageModeManaged;
}
/* Upload the mesh data to the GPU */
id<MTLBuffer> posBuf = nil;
id<MTLBuffer> indexBuf = [device newBufferWithBytes:tris.data()
length:num_indices * sizeof(tris.data()[0])
options:storage_mode];
if (num_motion_steps == 1) {
posBuf = [device newBufferWithBytes:verts.data()
length:num_verts * sizeof(verts.data()[0])
options:storage_mode];
}
else {
posBuf = [device newBufferWithLength:num_verts * num_motion_steps * sizeof(verts.data()[0])
options:storage_mode];
float3 *dest_data = (float3 *)[posBuf contents];
size_t center_step = (num_motion_steps - 1) / 2;
for (size_t step = 0; step < num_motion_steps; ++step) {
const float3 *verts = mesh->get_verts().data();
/* The center step for motion vertices is not stored in the attribute. */
if (step != center_step) {
verts = motion_keys->data_float3() + (step > center_step ? step - 1 : step) * num_verts;
}
memcpy(dest_data + num_verts * step, verts, num_verts * sizeof(float3));
}
if (storage_mode == MTLResourceStorageModeManaged) {
[posBuf didModifyRange:NSMakeRange(0, posBuf.length)];
}
}
/* Create an acceleration structure. */
MTLAccelerationStructureGeometryDescriptor *geomDesc;
if (num_motion_steps > 1) {
std::vector<MTLMotionKeyframeData *> vertex_ptrs;
vertex_ptrs.reserve(num_motion_steps);
for (size_t step = 0; step < num_motion_steps; ++step) {
MTLMotionKeyframeData *k = [MTLMotionKeyframeData data];
k.buffer = posBuf;
k.offset = num_verts * step * sizeof(float3);
vertex_ptrs.push_back(k);
}
MTLAccelerationStructureMotionTriangleGeometryDescriptor *geomDescMotion =
[MTLAccelerationStructureMotionTriangleGeometryDescriptor descriptor];
geomDescMotion.vertexBuffers = [NSArray arrayWithObjects:vertex_ptrs.data()
count:vertex_ptrs.size()];
geomDescMotion.vertexStride = sizeof(verts.data()[0]);
geomDescMotion.indexBuffer = indexBuf;
geomDescMotion.indexBufferOffset = 0;
geomDescMotion.indexType = MTLIndexTypeUInt32;
geomDescMotion.triangleCount = num_indices / 3;
geomDescMotion.intersectionFunctionTableOffset = 0;
geomDesc = geomDescMotion;
}
else {
MTLAccelerationStructureTriangleGeometryDescriptor *geomDescNoMotion =
[MTLAccelerationStructureTriangleGeometryDescriptor descriptor];
geomDescNoMotion.vertexBuffer = posBuf;
geomDescNoMotion.vertexBufferOffset = 0;
geomDescNoMotion.vertexStride = sizeof(verts.data()[0]);
geomDescNoMotion.indexBuffer = indexBuf;
geomDescNoMotion.indexBufferOffset = 0;
geomDescNoMotion.indexType = MTLIndexTypeUInt32;
geomDescNoMotion.triangleCount = num_indices / 3;
geomDescNoMotion.intersectionFunctionTableOffset = 0;
geomDesc = geomDescNoMotion;
}
/* Force a single any-hit call, so shadow record-all behavior works correctly */
/* (Match optix behavior: unsigned int build_flags =
* OPTIX_GEOMETRY_FLAG_REQUIRE_SINGLE_ANYHIT_CALL;) */
geomDesc.allowDuplicateIntersectionFunctionInvocation = false;
MTLPrimitiveAccelerationStructureDescriptor *accelDesc =
[MTLPrimitiveAccelerationStructureDescriptor descriptor];
accelDesc.geometryDescriptors = @[ geomDesc ];
if (num_motion_steps > 1) {
accelDesc.motionStartTime = 0.0f;
accelDesc.motionEndTime = 1.0f;
accelDesc.motionStartBorderMode = MTLMotionBorderModeClamp;
accelDesc.motionEndBorderMode = MTLMotionBorderModeClamp;
accelDesc.motionKeyframeCount = num_motion_steps;
}
if (!use_fast_trace_bvh) {
accelDesc.usage |= (MTLAccelerationStructureUsageRefit |
MTLAccelerationStructureUsagePreferFastBuild);
}
MTLAccelerationStructureSizes accelSizes = [device
accelerationStructureSizesWithDescriptor:accelDesc];
id<MTLAccelerationStructure> accel_uncompressed = [device
newAccelerationStructureWithSize:accelSizes.accelerationStructureSize];
id<MTLBuffer> scratchBuf = [device newBufferWithLength:accelSizes.buildScratchBufferSize
options:MTLResourceStorageModePrivate];
id<MTLBuffer> sizeBuf = [device newBufferWithLength:8 options:MTLResourceStorageModeShared];
id<MTLCommandBuffer> accelCommands = [queue commandBuffer];
id<MTLAccelerationStructureCommandEncoder> accelEnc =
[accelCommands accelerationStructureCommandEncoder];
if (refit) {
[accelEnc refitAccelerationStructure:accel_struct
descriptor:accelDesc
destination:accel_uncompressed
scratchBuffer:scratchBuf
scratchBufferOffset:0];
}
else {
[accelEnc buildAccelerationStructure:accel_uncompressed
descriptor:accelDesc
scratchBuffer:scratchBuf
scratchBufferOffset:0];
}
if (use_fast_trace_bvh) {
[accelEnc writeCompactedAccelerationStructureSize:accel_uncompressed
toBuffer:sizeBuf
offset:0
sizeDataType:MTLDataTypeULong];
}
[accelEnc endEncoding];
[accelCommands addCompletedHandler:^(id<MTLCommandBuffer> command_buffer) {
/* free temp resources */
[scratchBuf release];
[indexBuf release];
[posBuf release];
if (use_fast_trace_bvh) {
/* Compact the accel structure */
uint64_t compressed_size = *(uint64_t *)sizeBuf.contents;
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
id<MTLCommandBuffer> accelCommands = [queue commandBuffer];
id<MTLAccelerationStructureCommandEncoder> accelEnc =
[accelCommands accelerationStructureCommandEncoder];
id<MTLAccelerationStructure> accel = [device
newAccelerationStructureWithSize:compressed_size];
[accelEnc copyAndCompactAccelerationStructure:accel_uncompressed
toAccelerationStructure:accel];
[accelEnc endEncoding];
[accelCommands addCompletedHandler:^(id<MTLCommandBuffer> command_buffer) {
uint64_t allocated_size = [accel allocatedSize];
stats.mem_alloc(allocated_size);
accel_struct = accel;
[accel_uncompressed release];
accel_struct_building = false;
}];
[accelCommands commit];
});
}
else {
/* set our acceleration structure to the uncompressed structure */
accel_struct = accel_uncompressed;
uint64_t allocated_size = [accel_struct allocatedSize];
stats.mem_alloc(allocated_size);
accel_struct_building = false;
}
[sizeBuf release];
}];
accel_struct_building = true;
[accelCommands commit];
return true;
}
return false;
}
bool BVHMetal::build_BLAS_hair(Progress &progress,
id<MTLDevice> device,
id<MTLCommandQueue> queue,
Geometry *const geom,
bool refit)
{
if (@available(macos 12.0, *)) {
/* Build BLAS for hair curves */
Hair *hair = static_cast<Hair *>(geom);
if (hair->num_curves() == 0) {
return false;
}
/*------------------------------------------------*/
BVH_status(
"Building hair BLAS | %7d curves | %s", (int)hair->num_curves(), geom->name.c_str());
/*------------------------------------------------*/
const bool use_fast_trace_bvh = (params.bvh_type == BVH_TYPE_STATIC);
const size_t num_segments = hair->num_segments();
size_t num_motion_steps = 1;
Attribute *motion_keys = hair->attributes.find(ATTR_STD_MOTION_VERTEX_POSITION);
if (motion_blur && hair->get_use_motion_blur() && motion_keys) {
num_motion_steps = hair->get_motion_steps();
}
const size_t num_aabbs = num_segments * num_motion_steps;
MTLResourceOptions storage_mode;
if (device.hasUnifiedMemory) {
storage_mode = MTLResourceStorageModeShared;
}
else {
storage_mode = MTLResourceStorageModeManaged;
}
/* Allocate a GPU buffer for the AABB data and populate it */
id<MTLBuffer> aabbBuf = [device
newBufferWithLength:num_aabbs * sizeof(MTLAxisAlignedBoundingBox)
options:storage_mode];
MTLAxisAlignedBoundingBox *aabb_data = (MTLAxisAlignedBoundingBox *)[aabbBuf contents];
/* Get AABBs for each motion step */
size_t center_step = (num_motion_steps - 1) / 2;
for (size_t step = 0; step < num_motion_steps; ++step) {
/* The center step for motion vertices is not stored in the attribute */
const float3 *keys = hair->get_curve_keys().data();
if (step != center_step) {
size_t attr_offset = (step > center_step) ? step - 1 : step;
/* Technically this is a float4 array, but sizeof(float3) == sizeof(float4) */
keys = motion_keys->data_float3() + attr_offset * hair->get_curve_keys().size();
}
for (size_t j = 0, i = 0; j < hair->num_curves(); ++j) {
const Hair::Curve curve = hair->get_curve(j);
for (int segment = 0; segment < curve.num_segments(); ++segment, ++i) {
{
BoundBox bounds = BoundBox::empty;
curve.bounds_grow(segment, keys, hair->get_curve_radius().data(), bounds);
const size_t index = step * num_segments + i;
aabb_data[index].min = (MTLPackedFloat3 &)bounds.min;
aabb_data[index].max = (MTLPackedFloat3 &)bounds.max;
}
}
}
}
if (storage_mode == MTLResourceStorageModeManaged) {
[aabbBuf didModifyRange:NSMakeRange(0, aabbBuf.length)];
}
# if 0
for (size_t i=0; i<num_aabbs && i < 400; i++) {
MTLAxisAlignedBoundingBox& bb = aabb_data[i];
printf(" %d: %.1f,%.1f,%.1f -- %.1f,%.1f,%.1f\n", int(i), bb.min.x, bb.min.y, bb.min.z, bb.max.x, bb.max.y, bb.max.z);
}
# endif
MTLAccelerationStructureGeometryDescriptor *geomDesc;
if (motion_blur) {
std::vector<MTLMotionKeyframeData *> aabb_ptrs;
aabb_ptrs.reserve(num_motion_steps);
for (size_t step = 0; step < num_motion_steps; ++step) {
MTLMotionKeyframeData *k = [MTLMotionKeyframeData data];
k.buffer = aabbBuf;
k.offset = step * num_segments * sizeof(MTLAxisAlignedBoundingBox);
aabb_ptrs.push_back(k);
}
MTLAccelerationStructureMotionBoundingBoxGeometryDescriptor *geomDescMotion =
[MTLAccelerationStructureMotionBoundingBoxGeometryDescriptor descriptor];
geomDescMotion.boundingBoxBuffers = [NSArray arrayWithObjects:aabb_ptrs.data()
count:aabb_ptrs.size()];
geomDescMotion.boundingBoxCount = num_segments;
geomDescMotion.boundingBoxStride = sizeof(aabb_data[0]);
geomDescMotion.intersectionFunctionTableOffset = 1;
/* Force a single any-hit call, so shadow record-all behavior works correctly */
/* (Match optix behavior: unsigned int build_flags =
* OPTIX_GEOMETRY_FLAG_REQUIRE_SINGLE_ANYHIT_CALL;) */
geomDescMotion.allowDuplicateIntersectionFunctionInvocation = false;
geomDescMotion.opaque = true;
geomDesc = geomDescMotion;
}
else {
MTLAccelerationStructureBoundingBoxGeometryDescriptor *geomDescNoMotion =
[MTLAccelerationStructureBoundingBoxGeometryDescriptor descriptor];
geomDescNoMotion.boundingBoxBuffer = aabbBuf;
geomDescNoMotion.boundingBoxBufferOffset = 0;
geomDescNoMotion.boundingBoxCount = int(num_aabbs);
geomDescNoMotion.boundingBoxStride = sizeof(aabb_data[0]);
geomDescNoMotion.intersectionFunctionTableOffset = 1;
/* Force a single any-hit call, so shadow record-all behavior works correctly */
/* (Match optix behavior: unsigned int build_flags =
* OPTIX_GEOMETRY_FLAG_REQUIRE_SINGLE_ANYHIT_CALL;) */
geomDescNoMotion.allowDuplicateIntersectionFunctionInvocation = false;
geomDescNoMotion.opaque = true;
geomDesc = geomDescNoMotion;
}
MTLPrimitiveAccelerationStructureDescriptor *accelDesc =
[MTLPrimitiveAccelerationStructureDescriptor descriptor];
accelDesc.geometryDescriptors = @[ geomDesc ];
if (motion_blur) {
accelDesc.motionStartTime = 0.0f;
accelDesc.motionEndTime = 1.0f;
accelDesc.motionStartBorderMode = MTLMotionBorderModeVanish;
accelDesc.motionEndBorderMode = MTLMotionBorderModeVanish;
accelDesc.motionKeyframeCount = num_motion_steps;
}
if (!use_fast_trace_bvh) {
accelDesc.usage |= (MTLAccelerationStructureUsageRefit |
MTLAccelerationStructureUsagePreferFastBuild);
}
MTLAccelerationStructureSizes accelSizes = [device
accelerationStructureSizesWithDescriptor:accelDesc];
id<MTLAccelerationStructure> accel_uncompressed = [device
newAccelerationStructureWithSize:accelSizes.accelerationStructureSize];
id<MTLBuffer> scratchBuf = [device newBufferWithLength:accelSizes.buildScratchBufferSize
options:MTLResourceStorageModePrivate];
id<MTLBuffer> sizeBuf = [device newBufferWithLength:8 options:MTLResourceStorageModeShared];
id<MTLCommandBuffer> accelCommands = [queue commandBuffer];
id<MTLAccelerationStructureCommandEncoder> accelEnc =
[accelCommands accelerationStructureCommandEncoder];
if (refit) {
[accelEnc refitAccelerationStructure:accel_struct
descriptor:accelDesc
destination:accel_uncompressed
scratchBuffer:scratchBuf
scratchBufferOffset:0];
}
else {
[accelEnc buildAccelerationStructure:accel_uncompressed
descriptor:accelDesc
scratchBuffer:scratchBuf
scratchBufferOffset:0];
}
if (use_fast_trace_bvh) {
[accelEnc writeCompactedAccelerationStructureSize:accel_uncompressed
toBuffer:sizeBuf
offset:0
sizeDataType:MTLDataTypeULong];
}
[accelEnc endEncoding];
[accelCommands addCompletedHandler:^(id<MTLCommandBuffer> command_buffer) {
/* free temp resources */
[scratchBuf release];
[aabbBuf release];
if (use_fast_trace_bvh) {
/* Compact the accel structure */
uint64_t compressed_size = *(uint64_t *)sizeBuf.contents;
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
id<MTLCommandBuffer> accelCommands = [queue commandBuffer];
id<MTLAccelerationStructureCommandEncoder> accelEnc =
[accelCommands accelerationStructureCommandEncoder];
id<MTLAccelerationStructure> accel = [device
newAccelerationStructureWithSize:compressed_size];
[accelEnc copyAndCompactAccelerationStructure:accel_uncompressed
toAccelerationStructure:accel];
[accelEnc endEncoding];
[accelCommands addCompletedHandler:^(id<MTLCommandBuffer> command_buffer) {
uint64_t allocated_size = [accel allocatedSize];
stats.mem_alloc(allocated_size);
accel_struct = accel;
[accel_uncompressed release];
accel_struct_building = false;
}];
[accelCommands commit];
});
}
else {
/* set our acceleration structure to the uncompressed structure */
accel_struct = accel_uncompressed;
uint64_t allocated_size = [accel_struct allocatedSize];
stats.mem_alloc(allocated_size);
accel_struct_building = false;
}
[sizeBuf release];
}];
accel_struct_building = true;
[accelCommands commit];
return true;
}
return false;
}
bool BVHMetal::build_BLAS(Progress &progress,
id<MTLDevice> device,
id<MTLCommandQueue> queue,
bool refit)
{
if (@available(macos 12.0, *)) {
assert(objects.size() == 1 && geometry.size() == 1);
/* Build bottom level acceleration structures (BLAS) */
Geometry *const geom = geometry[0];
switch (geom->geometry_type) {
case Geometry::VOLUME:
case Geometry::MESH:
return build_BLAS_mesh(progress, device, queue, geom, refit);
case Geometry::HAIR:
return build_BLAS_hair(progress, device, queue, geom, refit);
default:
return false;
}
}
return false;
}
bool BVHMetal::build_TLAS(Progress &progress,
id<MTLDevice> device,
id<MTLCommandQueue> queue,
bool refit)
{
if (@available(macos 12.0, *)) {
/* we need to sync here and ensure that all BLAS have completed async generation by both GCD
* and Metal */
{
__block bool complete_bvh = false;
while (!complete_bvh) {
dispatch_sync(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
complete_bvh = true;
for (Object *ob : objects) {
/* Skip non-traceable objects */
if (!ob->is_traceable())
continue;
Geometry const *geom = ob->get_geometry();
BVHMetal const *blas = static_cast<BVHMetal const *>(geom->bvh);
if (blas->accel_struct_building) {
complete_bvh = false;
/* We're likely waiting on a command buffer that's in flight to complete.
* Queue up a command buffer and wait for it complete before checking the BLAS again
*/
id<MTLCommandBuffer> command_buffer = [queue commandBuffer];
[command_buffer commit];
[command_buffer waitUntilCompleted];
break;
}
}
});
}
}
uint32_t num_instances = 0;
uint32_t num_motion_transforms = 0;
for (Object *ob : objects) {
/* Skip non-traceable objects */
if (!ob->is_traceable())
continue;
num_instances++;
if (ob->use_motion()) {
num_motion_transforms += max(1, ob->get_motion().size());
}
else {
num_motion_transforms++;
}
}
/*------------------------------------------------*/
BVH_status("Building TLAS | %7d instances", (int)num_instances);
/*------------------------------------------------*/
const bool use_fast_trace_bvh = (params.bvh_type == BVH_TYPE_STATIC);
NSMutableArray *all_blas = [NSMutableArray array];
unordered_map<BVHMetal const *, int> instance_mapping;
/* Lambda function to build/retrieve the BLAS index mapping */
auto get_blas_index = [&](BVHMetal const *blas) {
auto it = instance_mapping.find(blas);
if (it != instance_mapping.end()) {
return it->second;
}
else {
int blas_index = (int)[all_blas count];
instance_mapping[blas] = blas_index;
if (@available(macos 12.0, *)) {
[all_blas addObject:blas->accel_struct];
}
return blas_index;
}
};
MTLResourceOptions storage_mode;
if (device.hasUnifiedMemory) {
storage_mode = MTLResourceStorageModeShared;
}
else {
storage_mode = MTLResourceStorageModeManaged;
}
size_t instance_size;
if (motion_blur) {
instance_size = sizeof(MTLAccelerationStructureMotionInstanceDescriptor);
}
else {
instance_size = sizeof(MTLAccelerationStructureUserIDInstanceDescriptor);
}
/* Allocate a GPU buffer for the instance data and populate it */
id<MTLBuffer> instanceBuf = [device newBufferWithLength:num_instances * instance_size
options:storage_mode];
id<MTLBuffer> motion_transforms_buf = nil;
MTLPackedFloat4x3 *motion_transforms = nullptr;
if (motion_blur && num_motion_transforms) {
motion_transforms_buf = [device
newBufferWithLength:num_motion_transforms * sizeof(MTLPackedFloat4x3)
options:storage_mode];
motion_transforms = (MTLPackedFloat4x3 *)motion_transforms_buf.contents;
}
uint32_t instance_index = 0;
uint32_t motion_transform_index = 0;
for (Object *ob : objects) {
/* Skip non-traceable objects */
if (!ob->is_traceable())
continue;
Geometry const *geom = ob->get_geometry();
BVHMetal const *blas = static_cast<BVHMetal const *>(geom->bvh);
uint32_t accel_struct_index = get_blas_index(blas);
/* Add some of the object visibility bits to the mask.
* __prim_visibility contains the combined visibility bits of all instances, so is not
* reliable if they differ between instances.
*
* METAL_WIP: OptiX visibility mask can only contain 8 bits, so have to trade-off here
* and select just a few important ones.
*/
uint32_t mask = ob->visibility_for_tracing() & 0xFF;
/* Have to have at least one bit in the mask, or else instance would always be culled. */
if (0 == mask) {
mask = 0xFF;
}
/* Set user instance ID to object index */
int object_index = ob->get_device_index();
uint32_t user_id = uint32_t(object_index);
/* Bake into the appropriate descriptor */
if (motion_blur) {
MTLAccelerationStructureMotionInstanceDescriptor *instances =
(MTLAccelerationStructureMotionInstanceDescriptor *)[instanceBuf contents];
MTLAccelerationStructureMotionInstanceDescriptor &desc = instances[instance_index++];
desc.accelerationStructureIndex = accel_struct_index;
desc.userID = user_id;
desc.mask = mask;
desc.motionStartTime = 0.0f;
desc.motionEndTime = 1.0f;
desc.motionTransformsStartIndex = motion_transform_index;
desc.motionStartBorderMode = MTLMotionBorderModeVanish;
desc.motionEndBorderMode = MTLMotionBorderModeVanish;
desc.intersectionFunctionTableOffset = 0;
int key_count = ob->get_motion().size();
if (key_count) {
desc.motionTransformsCount = key_count;
Transform *keys = ob->get_motion().data();
for (int i = 0; i < key_count; i++) {
float *t = (float *)&motion_transforms[motion_transform_index++];
/* Transpose transform */
auto src = (float const *)&keys[i];
for (int i = 0; i < 12; i++) {
t[i] = src[(i / 3) + 4 * (i % 3)];
}
}
}
else {
desc.motionTransformsCount = 1;
float *t = (float *)&motion_transforms[motion_transform_index++];
if (ob->get_geometry()->is_instanced()) {
/* Transpose transform */
auto src = (float const *)&ob->get_tfm();
for (int i = 0; i < 12; i++) {
t[i] = src[(i / 3) + 4 * (i % 3)];
}
}
else {
/* Clear transform to identity matrix */
t[0] = t[4] = t[8] = 1.0f;
}
}
}
else {
MTLAccelerationStructureUserIDInstanceDescriptor *instances =
(MTLAccelerationStructureUserIDInstanceDescriptor *)[instanceBuf contents];
MTLAccelerationStructureUserIDInstanceDescriptor &desc = instances[instance_index++];
desc.accelerationStructureIndex = accel_struct_index;
desc.userID = user_id;
desc.mask = mask;
desc.intersectionFunctionTableOffset = 0;
float *t = (float *)&desc.transformationMatrix;
if (ob->get_geometry()->is_instanced()) {
/* Transpose transform */
auto src = (float const *)&ob->get_tfm();
for (int i = 0; i < 12; i++) {
t[i] = src[(i / 3) + 4 * (i % 3)];
}
}
else {
/* Clear transform to identity matrix */
t[0] = t[4] = t[8] = 1.0f;
}
}
}
if (storage_mode == MTLResourceStorageModeManaged) {
[instanceBuf didModifyRange:NSMakeRange(0, instanceBuf.length)];
if (motion_transforms_buf) {
[motion_transforms_buf didModifyRange:NSMakeRange(0, motion_transforms_buf.length)];
assert(num_motion_transforms == motion_transform_index);
}
}
MTLInstanceAccelerationStructureDescriptor *accelDesc =
[MTLInstanceAccelerationStructureDescriptor descriptor];
accelDesc.instanceCount = num_instances;
accelDesc.instanceDescriptorType = MTLAccelerationStructureInstanceDescriptorTypeUserID;
accelDesc.instanceDescriptorBuffer = instanceBuf;
accelDesc.instanceDescriptorBufferOffset = 0;
accelDesc.instanceDescriptorStride = instance_size;
accelDesc.instancedAccelerationStructures = all_blas;
if (motion_blur) {
accelDesc.instanceDescriptorType = MTLAccelerationStructureInstanceDescriptorTypeMotion;
accelDesc.motionTransformBuffer = motion_transforms_buf;
accelDesc.motionTransformCount = num_motion_transforms;
}
if (!use_fast_trace_bvh) {
accelDesc.usage |= (MTLAccelerationStructureUsageRefit |
MTLAccelerationStructureUsagePreferFastBuild);
}
MTLAccelerationStructureSizes accelSizes = [device
accelerationStructureSizesWithDescriptor:accelDesc];
id<MTLAccelerationStructure> accel = [device
newAccelerationStructureWithSize:accelSizes.accelerationStructureSize];
id<MTLBuffer> scratchBuf = [device newBufferWithLength:accelSizes.buildScratchBufferSize
options:MTLResourceStorageModePrivate];
id<MTLCommandBuffer> accelCommands = [queue commandBuffer];
id<MTLAccelerationStructureCommandEncoder> accelEnc =
[accelCommands accelerationStructureCommandEncoder];
if (refit) {
[accelEnc refitAccelerationStructure:accel_struct
descriptor:accelDesc
destination:accel
scratchBuffer:scratchBuf
scratchBufferOffset:0];
}
else {
[accelEnc buildAccelerationStructure:accel
descriptor:accelDesc
scratchBuffer:scratchBuf
scratchBufferOffset:0];
}
[accelEnc endEncoding];
[accelCommands commit];
[accelCommands waitUntilCompleted];
if (motion_transforms_buf) {
[motion_transforms_buf release];
}
[instanceBuf release];
[scratchBuf release];
uint64_t allocated_size = [accel allocatedSize];
stats.mem_alloc(allocated_size);
/* Cache top and bottom-level acceleration structs */
accel_struct = accel;
blas_array.clear();
blas_array.reserve(all_blas.count);
for (id<MTLAccelerationStructure> blas in all_blas) {
blas_array.push_back(blas);
}
return true;
}
return false;
}
bool BVHMetal::build(Progress &progress,
id<MTLDevice> device,
id<MTLCommandQueue> queue,
bool refit)
{
if (@available(macos 12.0, *)) {
if (refit && params.bvh_type != BVH_TYPE_STATIC) {
assert(accel_struct);
}
else {
if (accel_struct) {
stats.mem_free(accel_struct.allocatedSize);
[accel_struct release];
accel_struct = nil;
}
}
}
if (!params.top_level) {
return build_BLAS(progress, device, queue, refit);
}
else {
return build_TLAS(progress, device, queue, refit);
}
}
CCL_NAMESPACE_END
#endif /* WITH_METAL */

View File

@@ -0,0 +1,37 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once
#include "util/string.h"
#include "util/vector.h"
CCL_NAMESPACE_BEGIN
class Device;
class DeviceInfo;
class Profiler;
class Stats;
bool device_metal_init();
Device *device_metal_create(const DeviceInfo &info, Stats &stats, Profiler &profiler);
void device_metal_info(vector<DeviceInfo> &devices);
string device_metal_capabilities();
CCL_NAMESPACE_END

View File

@@ -0,0 +1,136 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifdef WITH_METAL
# include "device/metal/device.h"
# include "device/metal/device_impl.h"
#endif
#include "util/debug.h"
#include "util/set.h"
#include "util/system.h"
CCL_NAMESPACE_BEGIN
#ifdef WITH_METAL
Device *device_metal_create(const DeviceInfo &info, Stats &stats, Profiler &profiler)
{
return new MetalDevice(info, stats, profiler);
}
bool device_metal_init()
{
return true;
}
static int device_metal_get_num_devices_safe(uint32_t *num_devices)
{
*num_devices = MTLCopyAllDevices().count;
return 0;
}
void device_metal_info(vector<DeviceInfo> &devices)
{
uint32_t num_devices = 0;
device_metal_get_num_devices_safe(&num_devices);
if (num_devices == 0) {
return;
}
vector<MetalPlatformDevice> usable_devices;
MetalInfo::get_usable_devices(&usable_devices);
/* Devices are numbered consecutively across platforms. */
set<string> unique_ids;
int device_index = 0;
for (MetalPlatformDevice &device : usable_devices) {
/* Compute unique ID for persistent user preferences. */
const string &device_name = device.device_name;
string id = string("METAL_") + device_name;
/* Hardware ID might not be unique, add device number in that case. */
if (unique_ids.find(id) != unique_ids.end()) {
id += string_printf("_ID_%d", num_devices);
}
unique_ids.insert(id);
/* Create DeviceInfo. */
DeviceInfo info;
info.type = DEVICE_METAL;
info.description = string_remove_trademark(string(device_name));
/* Ensure unique naming on Apple Silicon / SoC devices which return the same string for CPU and
* GPU */
if (info.description == system_cpu_brand_string()) {
info.description += " (GPU)";
}
info.num = device_index;
/* We don't know if it's used for display, but assume it is. */
info.display_device = true;
info.denoisers = DENOISER_NONE;
info.id = id;
devices.push_back(info);
device_index++;
}
}
string device_metal_capabilities()
{
string result = "";
string error_msg = "";
uint32_t num_devices = 0;
assert(device_metal_get_num_devices_safe(&num_devices));
if (num_devices == 0) {
return "No Metal devices found\n";
}
result += string_printf("Number of devices: %u\n", num_devices);
NSArray<id<MTLDevice>> *allDevices = MTLCopyAllDevices();
for (id<MTLDevice> device in allDevices) {
result += string_printf("\t\tDevice: %s\n", [device.name UTF8String]);
}
return result;
}
#else
Device *device_metal_create(const DeviceInfo &info, Stats &stats, Profiler &profiler)
{
return nullptr;
}
bool device_metal_init()
{
return false;
}
void device_metal_info(vector<DeviceInfo> &devices)
{
}
string device_metal_capabilities()
{
return "";
}
#endif
CCL_NAMESPACE_END

View File

@@ -0,0 +1,166 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once
#ifdef WITH_METAL
# include "bvh/bvh.h"
# include "device/device.h"
# include "device/metal/bvh.h"
# include "device/metal/device.h"
# include "device/metal/kernel.h"
# include "device/metal/queue.h"
# include "device/metal/util.h"
# include <Metal/Metal.h>
CCL_NAMESPACE_BEGIN
class DeviceQueue;
class MetalDevice : public Device {
public:
id<MTLDevice> mtlDevice = nil;
id<MTLLibrary> mtlLibrary[PSO_NUM] = {nil};
id<MTLArgumentEncoder> mtlBufferKernelParamsEncoder =
nil; /* encoder used for fetching device pointers from MTLBuffers */
id<MTLCommandQueue> mtlGeneralCommandQueue = nil;
id<MTLArgumentEncoder> mtlAncillaryArgEncoder =
nil; /* encoder used for fetching device pointers from MTLBuffers */
string source_used_for_compile[PSO_NUM];
KernelParamsMetal launch_params = {0};
/* MetalRT members ----------------------------------*/
BVHMetal *bvhMetalRT = nullptr;
bool motion_blur = false;
id<MTLArgumentEncoder> mtlASArgEncoder =
nil; /* encoder used for fetching device pointers from MTLAccelerationStructure */
/*---------------------------------------------------*/
string device_name;
MetalGPUVendor device_vendor;
uint kernel_features;
MTLResourceOptions default_storage_mode;
int max_threads_per_threadgroup;
int mtlDevId = 0;
bool first_error = true;
struct MetalMem {
device_memory *mem = nullptr;
int pointer_index = -1;
id<MTLBuffer> mtlBuffer = nil;
id<MTLTexture> mtlTexture = nil;
uint64_t offset = 0;
uint64_t size = 0;
void *hostPtr = nullptr;
bool use_UMA = false; /* If true, UMA memory in shared_pointer is being used. */
};
typedef map<device_memory *, unique_ptr<MetalMem>> MetalMemMap;
MetalMemMap metal_mem_map;
std::vector<id<MTLResource>> delayed_free_list;
std::recursive_mutex metal_mem_map_mutex;
/* Bindless Textures */
device_vector<TextureInfo> texture_info;
bool need_texture_info;
id<MTLArgumentEncoder> mtlTextureArgEncoder = nil;
id<MTLBuffer> texture_bindings_2d = nil;
id<MTLBuffer> texture_bindings_3d = nil;
std::vector<id<MTLTexture>> texture_slot_map;
MetalDeviceKernels kernels;
bool use_metalrt = false;
bool use_function_specialisation = false;
virtual BVHLayoutMask get_bvh_layout_mask() const override;
void set_error(const string &error) override;
MetalDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler);
virtual ~MetalDevice();
bool support_device(const uint /*kernel_features*/);
bool check_peer_access(Device *peer_device) override;
bool use_adaptive_compilation();
string get_source(const uint kernel_features);
string compile_kernel(const uint kernel_features, const char *name);
virtual bool load_kernels(const uint kernel_features) override;
void reserve_local_memory(const uint kernel_features);
void init_host_memory();
void load_texture_info();
virtual bool should_use_graphics_interop() override;
virtual unique_ptr<DeviceQueue> gpu_queue_create() override;
virtual void build_bvh(BVH *bvh, Progress &progress, bool refit) override;
/* ------------------------------------------------------------------ */
/* low-level memory management */
MetalMem *generic_alloc(device_memory &mem);
void generic_copy_to(device_memory &mem);
void generic_free(device_memory &mem);
void mem_alloc(device_memory &mem) override;
void mem_copy_to(device_memory &mem) override;
void mem_copy_from(device_memory &mem)
{
mem_copy_from(mem, -1, -1, -1, -1);
}
void mem_copy_from(device_memory &mem, size_t y, size_t w, size_t h, size_t elem) override;
void mem_zero(device_memory &mem) override;
void mem_free(device_memory &mem) override;
device_ptr mem_alloc_sub_ptr(device_memory &mem, size_t offset, size_t /*size*/) override;
virtual void const_copy_to(const char *name, void *host, size_t size) override;
void global_alloc(device_memory &mem);
void global_free(device_memory &mem);
void tex_alloc(device_texture &mem);
void tex_alloc_as_buffer(device_texture &mem);
void tex_free(device_texture &mem);
void flush_delayed_free_list();
};
CCL_NAMESPACE_END
#endif

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,168 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once
#ifdef WITH_METAL
# include "device/kernel.h"
# include <Metal/Metal.h>
CCL_NAMESPACE_BEGIN
class MetalDevice;
enum {
METALRT_FUNC_DEFAULT_TRI,
METALRT_FUNC_DEFAULT_BOX,
METALRT_FUNC_SHADOW_TRI,
METALRT_FUNC_SHADOW_BOX,
METALRT_FUNC_LOCAL_TRI,
METALRT_FUNC_LOCAL_BOX,
METALRT_FUNC_CURVE_RIBBON,
METALRT_FUNC_CURVE_RIBBON_SHADOW,
METALRT_FUNC_CURVE_ALL,
METALRT_FUNC_CURVE_ALL_SHADOW,
METALRT_FUNC_NUM
};
enum { METALRT_TABLE_DEFAULT, METALRT_TABLE_SHADOW, METALRT_TABLE_LOCAL, METALRT_TABLE_NUM };
/* Pipeline State Object types */
enum {
/* A kernel that can be used with all scenes, supporting all features.
* It is slow to compile, but only needs to be compiled once and is then
* cached for future render sessions. This allows a render to get underway
* on the GPU quickly.
*/
PSO_GENERIC,
/* A kernel that is relatively quick to compile, but is specialized for the
* scene being rendered. It only contains the functionality and even baked in
* constants for values that means it needs to be recompiled whenever a
* dependent setting is changed. The render performance of this kernel is
* significantly faster though, and justifies the extra compile time.
*/
/* METAL_WIP: This isn't used and will require more changes to enable. */
PSO_SPECIALISED,
PSO_NUM
};
const char *kernel_type_as_string(int kernel_type);
struct MetalKernelPipeline {
void release()
{
if (pipeline) {
[pipeline release];
pipeline = nil;
if (@available(macOS 11.0, *)) {
for (int i = 0; i < METALRT_TABLE_NUM; i++) {
if (intersection_func_table[i]) {
[intersection_func_table[i] release];
intersection_func_table[i] = nil;
}
}
}
}
if (function) {
[function release];
function = nil;
}
if (@available(macOS 11.0, *)) {
for (int i = 0; i < METALRT_TABLE_NUM; i++) {
if (intersection_func_table[i]) {
[intersection_func_table[i] release];
}
}
}
}
bool loaded = false;
id<MTLFunction> function = nil;
id<MTLComputePipelineState> pipeline = nil;
API_AVAILABLE(macos(11.0))
id<MTLIntersectionFunctionTable> intersection_func_table[METALRT_TABLE_NUM] = {nil};
};
struct MetalKernelLoadDesc {
int pso_index = 0;
const char *function_name = nullptr;
int kernel_index = 0;
int threads_per_threadgroup = 0;
MTLFunctionConstantValues *constant_values = nullptr;
NSArray *linked_functions = nullptr;
struct IntersectorFunctions {
NSArray *defaults;
NSArray *shadow;
NSArray *local;
NSArray *operator[](int index) const
{
if (index == METALRT_TABLE_DEFAULT)
return defaults;
if (index == METALRT_TABLE_SHADOW)
return shadow;
return local;
}
} intersector_functions = {nullptr};
};
/* Metal kernel and associate occupancy information. */
class MetalDeviceKernel {
public:
~MetalDeviceKernel();
bool load(MetalDevice *device, MetalKernelLoadDesc const &desc, class MD5Hash const &md5);
void mark_loaded(int pso_index)
{
pso[pso_index].loaded = true;
}
int get_num_threads_per_block() const
{
return num_threads_per_block;
}
const MetalKernelPipeline &get_pso() const;
double load_duration = 0.0;
private:
MetalKernelPipeline pso[PSO_NUM];
int num_threads_per_block = 0;
};
/* Cache of Metal kernels for each DeviceKernel. */
class MetalDeviceKernels {
public:
bool load(MetalDevice *device, int kernel_type);
bool available(DeviceKernel kernel) const;
const MetalDeviceKernel &get(DeviceKernel kernel) const;
MetalDeviceKernel kernels_[DEVICE_KERNEL_NUM];
id<MTLFunction> rt_intersection_funcs[PSO_NUM][METALRT_FUNC_NUM] = {{nil}};
string loaded_md5[PSO_NUM];
};
CCL_NAMESPACE_END
#endif /* WITH_METAL */

View File

@@ -0,0 +1,525 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifdef WITH_METAL
# include "device/metal/kernel.h"
# include "device/metal/device_impl.h"
# include "util/md5.h"
# include "util/path.h"
# include "util/tbb.h"
# include "util/time.h"
CCL_NAMESPACE_BEGIN
/* limit to 2 MTLCompiler instances */
int max_mtlcompiler_threads = 2;
const char *kernel_type_as_string(int kernel_type)
{
switch (kernel_type) {
case PSO_GENERIC:
return "PSO_GENERIC";
case PSO_SPECIALISED:
return "PSO_SPECIALISED";
default:
assert(0);
}
return "";
}
MetalDeviceKernel::~MetalDeviceKernel()
{
for (int i = 0; i < PSO_NUM; i++) {
pso[i].release();
}
}
bool MetalDeviceKernel::load(MetalDevice *device,
MetalKernelLoadDesc const &desc_in,
MD5Hash const &md5)
{
__block MetalKernelLoadDesc const desc(desc_in);
if (desc.kernel_index == DEVICE_KERNEL_INTEGRATOR_MEGAKERNEL) {
/* skip megakernel */
return true;
}
bool use_binary_archive = true;
if (getenv("CYCLES_METAL_DISABLE_BINARY_ARCHIVES")) {
use_binary_archive = false;
}
id<MTLBinaryArchive> archive = nil;
string metalbin_path;
if (use_binary_archive) {
NSProcessInfo *processInfo = [NSProcessInfo processInfo];
string osVersion = [[processInfo operatingSystemVersionString] UTF8String];
MD5Hash local_md5(md5);
local_md5.append(osVersion);
string metalbin_name = string(desc.function_name) + "." + local_md5.get_hex() +
to_string(desc.pso_index) + ".bin";
metalbin_path = path_cache_get(path_join("kernels", metalbin_name));
path_create_directories(metalbin_path);
if (path_exists(metalbin_path) && use_binary_archive) {
if (@available(macOS 11.0, *)) {
MTLBinaryArchiveDescriptor *archiveDesc = [[MTLBinaryArchiveDescriptor alloc] init];
archiveDesc.url = [NSURL fileURLWithPath:@(metalbin_path.c_str())];
archive = [device->mtlDevice newBinaryArchiveWithDescriptor:archiveDesc error:nil];
[archiveDesc release];
}
}
}
NSString *entryPoint = [@(desc.function_name) copy];
NSError *error = NULL;
if (@available(macOS 11.0, *)) {
MTLFunctionDescriptor *func_desc = [MTLIntersectionFunctionDescriptor functionDescriptor];
func_desc.name = entryPoint;
if (desc.constant_values) {
func_desc.constantValues = desc.constant_values;
}
pso[desc.pso_index].function = [device->mtlLibrary[desc.pso_index]
newFunctionWithDescriptor:func_desc
error:&error];
}
[entryPoint release];
if (pso[desc.pso_index].function == nil) {
NSString *err = [error localizedDescription];
string errors = [err UTF8String];
device->set_error(
string_printf("Error getting function \"%s\": %s", desc.function_name, errors.c_str()));
return false;
}
pso[desc.pso_index].function.label = [@(desc.function_name) copy];
__block MTLComputePipelineDescriptor *computePipelineStateDescriptor =
[[MTLComputePipelineDescriptor alloc] init];
computePipelineStateDescriptor.buffers[0].mutability = MTLMutabilityImmutable;
computePipelineStateDescriptor.buffers[1].mutability = MTLMutabilityImmutable;
computePipelineStateDescriptor.buffers[2].mutability = MTLMutabilityImmutable;
if (@available(macos 10.14, *)) {
computePipelineStateDescriptor.maxTotalThreadsPerThreadgroup = desc.threads_per_threadgroup;
}
computePipelineStateDescriptor.threadGroupSizeIsMultipleOfThreadExecutionWidth = true;
computePipelineStateDescriptor.computeFunction = pso[desc.pso_index].function;
if (@available(macOS 11.0, *)) {
/* Attach the additional functions to an MTLLinkedFunctions object */
if (desc.linked_functions) {
computePipelineStateDescriptor.linkedFunctions = [[MTLLinkedFunctions alloc] init];
computePipelineStateDescriptor.linkedFunctions.functions = desc.linked_functions;
}
computePipelineStateDescriptor.maxCallStackDepth = 1;
}
/* Create a new Compute pipeline state object */
MTLPipelineOption pipelineOptions = MTLPipelineOptionNone;
bool creating_new_archive = false;
if (@available(macOS 11.0, *)) {
if (use_binary_archive) {
if (!archive) {
MTLBinaryArchiveDescriptor *archiveDesc = [[MTLBinaryArchiveDescriptor alloc] init];
archiveDesc.url = nil;
archive = [device->mtlDevice newBinaryArchiveWithDescriptor:archiveDesc error:nil];
creating_new_archive = true;
double starttime = time_dt();
if (![archive addComputePipelineFunctionsWithDescriptor:computePipelineStateDescriptor
error:&error]) {
NSString *errStr = [error localizedDescription];
metal_printf("Failed to add PSO to archive:\n%s\n",
errStr ? [errStr UTF8String] : "nil");
}
else {
double duration = time_dt() - starttime;
metal_printf("%2d | %-55s | %7.2fs\n",
desc.kernel_index,
device_kernel_as_string((DeviceKernel)desc.kernel_index),
duration);
if (desc.pso_index == PSO_GENERIC) {
this->load_duration = duration;
}
}
}
computePipelineStateDescriptor.binaryArchives = [NSArray arrayWithObjects:archive, nil];
pipelineOptions = MTLPipelineOptionFailOnBinaryArchiveMiss;
}
}
double starttime = time_dt();
MTLNewComputePipelineStateWithReflectionCompletionHandler completionHandler = ^(
id<MTLComputePipelineState> computePipelineState,
MTLComputePipelineReflection *reflection,
NSError *error) {
bool recreate_archive = false;
if (computePipelineState == nil && archive && !creating_new_archive) {
assert(0);
NSString *errStr = [error localizedDescription];
metal_printf(
"Failed to create compute pipeline state \"%s\" from archive - attempting recreation... "
"(error: %s)\n",
device_kernel_as_string((DeviceKernel)desc.kernel_index),
errStr ? [errStr UTF8String] : "nil");
computePipelineState = [device->mtlDevice
newComputePipelineStateWithDescriptor:computePipelineStateDescriptor
options:MTLPipelineOptionNone
reflection:nullptr
error:&error];
recreate_archive = true;
}
double duration = time_dt() - starttime;
if (computePipelineState == nil) {
NSString *errStr = [error localizedDescription];
device->set_error(string_printf("Failed to create compute pipeline state \"%s\", error: \n",
device_kernel_as_string((DeviceKernel)desc.kernel_index)) +
(errStr ? [errStr UTF8String] : "nil"));
metal_printf("%2d | %-55s | %7.2fs | FAILED!\n",
desc.kernel_index,
device_kernel_as_string((DeviceKernel)desc.kernel_index),
duration);
return;
}
pso[desc.pso_index].pipeline = computePipelineState;
num_threads_per_block = round_down(computePipelineState.maxTotalThreadsPerThreadgroup,
computePipelineState.threadExecutionWidth);
num_threads_per_block = std::max(num_threads_per_block,
(int)computePipelineState.threadExecutionWidth);
if (!use_binary_archive) {
metal_printf("%2d | %-55s | %7.2fs\n",
desc.kernel_index,
device_kernel_as_string((DeviceKernel)desc.kernel_index),
duration);
if (desc.pso_index == PSO_GENERIC) {
this->load_duration = duration;
}
}
if (@available(macOS 11.0, *)) {
if (creating_new_archive || recreate_archive) {
if (![archive serializeToURL:[NSURL fileURLWithPath:@(metalbin_path.c_str())]
error:&error]) {
metal_printf("Failed to save binary archive, error:\n%s\n",
[[error localizedDescription] UTF8String]);
}
}
}
[computePipelineStateDescriptor release];
computePipelineStateDescriptor = nil;
if (device->use_metalrt && desc.linked_functions) {
for (int table = 0; table < METALRT_TABLE_NUM; table++) {
if (@available(macOS 11.0, *)) {
MTLIntersectionFunctionTableDescriptor *ift_desc =
[[MTLIntersectionFunctionTableDescriptor alloc] init];
ift_desc.functionCount = desc.intersector_functions[table].count;
pso[desc.pso_index].intersection_func_table[table] = [pso[desc.pso_index].pipeline
newIntersectionFunctionTableWithDescriptor:ift_desc];
/* Finally write the function handles into this pipeline's table */
for (int i = 0; i < 2; i++) {
id<MTLFunctionHandle> handle = [pso[desc.pso_index].pipeline
functionHandleWithFunction:desc.intersector_functions[table][i]];
[pso[desc.pso_index].intersection_func_table[table] setFunction:handle atIndex:i];
}
}
}
}
mark_loaded(desc.pso_index);
};
if (desc.pso_index == PSO_SPECIALISED) {
/* Asynchronous load */
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
NSError *error;
id<MTLComputePipelineState> pipeline = [device->mtlDevice
newComputePipelineStateWithDescriptor:computePipelineStateDescriptor
options:pipelineOptions
reflection:nullptr
error:&error];
completionHandler(pipeline, nullptr, error);
});
}
else {
/* Block on load to ensure we continue with a valid kernel function */
id<MTLComputePipelineState> pipeline = [device->mtlDevice
newComputePipelineStateWithDescriptor:computePipelineStateDescriptor
options:pipelineOptions
reflection:nullptr
error:&error];
completionHandler(pipeline, nullptr, error);
}
return true;
}
const MetalKernelPipeline &MetalDeviceKernel::get_pso() const
{
if (pso[PSO_SPECIALISED].loaded) {
return pso[PSO_SPECIALISED];
}
assert(pso[PSO_GENERIC].loaded);
return pso[PSO_GENERIC];
}
bool MetalDeviceKernels::load(MetalDevice *device, int kernel_type)
{
bool any_error = false;
MD5Hash md5;
/* Build the function constant table */
MTLFunctionConstantValues *constant_values = nullptr;
if (kernel_type == PSO_SPECIALISED) {
constant_values = [MTLFunctionConstantValues new];
# define KERNEL_FILM(_type, name) \
[constant_values setConstantValue:&data.film.name \
type:get_MTLDataType_##_type() \
atIndex:KernelData_film_##name]; \
md5.append((uint8_t *)&data.film.name, sizeof(data.film.name));
# define KERNEL_BACKGROUND(_type, name) \
[constant_values setConstantValue:&data.background.name \
type:get_MTLDataType_##_type() \
atIndex:KernelData_background_##name]; \
md5.append((uint8_t *)&data.background.name, sizeof(data.background.name));
# define KERNEL_INTEGRATOR(_type, name) \
[constant_values setConstantValue:&data.integrator.name \
type:get_MTLDataType_##_type() \
atIndex:KernelData_integrator_##name]; \
md5.append((uint8_t *)&data.integrator.name, sizeof(data.integrator.name));
# define KERNEL_BVH(_type, name) \
[constant_values setConstantValue:&data.bvh.name \
type:get_MTLDataType_##_type() \
atIndex:KernelData_bvh_##name]; \
md5.append((uint8_t *)&data.bvh.name, sizeof(data.bvh.name));
/* METAL_WIP: populate constant_values based on KernelData */
assert(0);
/*
const KernelData &data = device->launch_params.data;
# include "kernel/types/background.h"
# include "kernel/types/bvh.h"
# include "kernel/types/film.h"
# include "kernel/types/integrator.h"
*/
}
if (device->use_metalrt) {
if (@available(macOS 11.0, *)) {
/* create the id<MTLFunction> for each intersection function */
const char *function_names[] = {
"__anyhit__cycles_metalrt_visibility_test_tri",
"__anyhit__cycles_metalrt_visibility_test_box",
"__anyhit__cycles_metalrt_shadow_all_hit_tri",
"__anyhit__cycles_metalrt_shadow_all_hit_box",
"__anyhit__cycles_metalrt_local_hit_tri",
"__anyhit__cycles_metalrt_local_hit_box",
"__intersection__curve_ribbon",
"__intersection__curve_ribbon_shadow",
"__intersection__curve_all",
"__intersection__curve_all_shadow",
};
assert(sizeof(function_names) / sizeof(function_names[0]) == METALRT_FUNC_NUM);
MTLFunctionDescriptor *desc = [MTLIntersectionFunctionDescriptor functionDescriptor];
if (kernel_type == PSO_SPECIALISED) {
desc.constantValues = constant_values;
}
for (int i = 0; i < METALRT_FUNC_NUM; i++) {
const char *function_name = function_names[i];
desc.name = [@(function_name) copy];
NSError *error = NULL;
rt_intersection_funcs[kernel_type][i] = [device->mtlLibrary[kernel_type]
newFunctionWithDescriptor:desc
error:&error];
if (rt_intersection_funcs[kernel_type][i] == nil) {
NSString *err = [error localizedDescription];
string errors = [err UTF8String];
device->set_error(string_printf(
"Error getting intersection function \"%s\": %s", function_name, errors.c_str()));
any_error = true;
break;
}
rt_intersection_funcs[kernel_type][i].label = [@(function_name) copy];
}
}
}
md5.append(device->source_used_for_compile[kernel_type]);
string hash = md5.get_hex();
if (loaded_md5[kernel_type] == hash) {
return true;
}
if (!any_error) {
NSArray *table_functions[METALRT_TABLE_NUM] = {nil};
NSArray *function_list = nil;
if (device->use_metalrt) {
id<MTLFunction> box_intersect_default = nil;
id<MTLFunction> box_intersect_shadow = nil;
if (device->kernel_features & KERNEL_FEATURE_HAIR) {
/* Add curve intersection programs. */
if (device->kernel_features & KERNEL_FEATURE_HAIR_THICK) {
/* Slower programs for thick hair since that also slows down ribbons.
* Ideally this should not be needed. */
box_intersect_default = rt_intersection_funcs[kernel_type][METALRT_FUNC_CURVE_ALL];
box_intersect_shadow = rt_intersection_funcs[kernel_type][METALRT_FUNC_CURVE_ALL_SHADOW];
}
else {
box_intersect_default = rt_intersection_funcs[kernel_type][METALRT_FUNC_CURVE_RIBBON];
box_intersect_shadow =
rt_intersection_funcs[kernel_type][METALRT_FUNC_CURVE_RIBBON_SHADOW];
}
}
table_functions[METALRT_TABLE_DEFAULT] = [NSArray
arrayWithObjects:rt_intersection_funcs[kernel_type][METALRT_FUNC_DEFAULT_TRI],
box_intersect_default ?
box_intersect_default :
rt_intersection_funcs[kernel_type][METALRT_FUNC_DEFAULT_BOX],
nil];
table_functions[METALRT_TABLE_SHADOW] = [NSArray
arrayWithObjects:rt_intersection_funcs[kernel_type][METALRT_FUNC_SHADOW_TRI],
box_intersect_shadow ?
box_intersect_shadow :
rt_intersection_funcs[kernel_type][METALRT_FUNC_SHADOW_BOX],
nil];
table_functions[METALRT_TABLE_LOCAL] = [NSArray
arrayWithObjects:rt_intersection_funcs[kernel_type][METALRT_FUNC_LOCAL_TRI],
rt_intersection_funcs[kernel_type][METALRT_FUNC_LOCAL_BOX],
nil];
NSMutableSet *unique_functions = [NSMutableSet
setWithArray:table_functions[METALRT_TABLE_DEFAULT]];
[unique_functions addObjectsFromArray:table_functions[METALRT_TABLE_SHADOW]];
[unique_functions addObjectsFromArray:table_functions[METALRT_TABLE_LOCAL]];
function_list = [[NSArray arrayWithArray:[unique_functions allObjects]]
sortedArrayUsingComparator:^NSComparisonResult(id<MTLFunction> f1, id<MTLFunction> f2) {
return [f1.label compare:f2.label];
}];
unique_functions = nil;
}
metal_printf("Starting %s \"cycles_metal_...\" pipeline builds\n",
kernel_type_as_string(kernel_type));
tbb::task_arena local_arena(max_mtlcompiler_threads);
local_arena.execute([&]() {
tbb::parallel_for(int(0), int(DEVICE_KERNEL_NUM), [&](int i) {
/* skip megakernel */
if (i == DEVICE_KERNEL_INTEGRATOR_MEGAKERNEL) {
return;
}
/* Only specialize kernels where it can make an impact. */
if (kernel_type == PSO_SPECIALISED) {
if (i < DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST ||
i > DEVICE_KERNEL_INTEGRATOR_MEGAKERNEL) {
return;
}
}
MetalDeviceKernel &kernel = kernels_[i];
const std::string function_name = std::string("cycles_metal_") +
device_kernel_as_string((DeviceKernel)i);
int threads_per_threadgroup = device->max_threads_per_threadgroup;
if (i > DEVICE_KERNEL_INTEGRATOR_MEGAKERNEL && i < DEVICE_KERNEL_INTEGRATOR_RESET) {
/* Always use 512 for the sorting kernels */
threads_per_threadgroup = 512;
}
NSArray *kernel_function_list = nil;
if (i == DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST ||
i == DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW ||
i == DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE ||
i == DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK ||
i == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE) {
kernel_function_list = function_list;
}
MetalKernelLoadDesc desc;
desc.pso_index = kernel_type;
desc.kernel_index = i;
desc.linked_functions = kernel_function_list;
desc.intersector_functions.defaults = table_functions[METALRT_TABLE_DEFAULT];
desc.intersector_functions.shadow = table_functions[METALRT_TABLE_SHADOW];
desc.intersector_functions.local = table_functions[METALRT_TABLE_LOCAL];
desc.constant_values = constant_values;
desc.threads_per_threadgroup = threads_per_threadgroup;
desc.function_name = function_name.c_str();
bool success = kernel.load(device, desc, md5);
any_error |= !success;
});
});
}
bool loaded = !any_error;
if (loaded) {
loaded_md5[kernel_type] = hash;
}
return loaded;
}
const MetalDeviceKernel &MetalDeviceKernels::get(DeviceKernel kernel) const
{
return kernels_[(int)kernel];
}
bool MetalDeviceKernels::available(DeviceKernel kernel) const
{
return kernels_[(int)kernel].get_pso().function != nil;
}
CCL_NAMESPACE_END
#endif /* WITH_METAL*/

View File

@@ -0,0 +1,99 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once
#ifdef WITH_METAL
# include "device/kernel.h"
# include "device/memory.h"
# include "device/queue.h"
# include "device/metal/util.h"
# include "kernel/device/metal/globals.h"
# define metal_printf VLOG(4) << string_printf
CCL_NAMESPACE_BEGIN
class MetalDevice;
/* Base class for Metal queues. */
class MetalDeviceQueue : public DeviceQueue {
public:
MetalDeviceQueue(MetalDevice *device);
~MetalDeviceQueue();
virtual int num_concurrent_states(const size_t) const override;
virtual int num_concurrent_busy_states() const override;
virtual void init_execution() override;
virtual bool enqueue(DeviceKernel kernel,
const int work_size,
DeviceKernelArguments const &args) override;
virtual bool synchronize() override;
virtual void zero_to_device(device_memory &mem) override;
virtual void copy_to_device(device_memory &mem) override;
virtual void copy_from_device(device_memory &mem) override;
virtual bool kernel_available(DeviceKernel kernel) const override;
protected:
void prepare_resources(DeviceKernel kernel);
id<MTLComputeCommandEncoder> get_compute_encoder(DeviceKernel kernel);
id<MTLBlitCommandEncoder> get_blit_encoder();
MetalDevice *metal_device;
MetalBufferPool temp_buffer_pool;
API_AVAILABLE(macos(11.0), ios(14.0))
MTLCommandBufferDescriptor *command_buffer_desc = nullptr;
id<MTLDevice> mtlDevice = nil;
id<MTLCommandQueue> mtlCommandQueue = nil;
id<MTLCommandBuffer> mtlCommandBuffer = nil;
id<MTLComputeCommandEncoder> mtlComputeEncoder = nil;
id<MTLBlitCommandEncoder> mtlBlitEncoder = nil;
API_AVAILABLE(macos(10.14), ios(14.0))
id<MTLSharedEvent> shared_event = nil;
API_AVAILABLE(macos(10.14), ios(14.0))
MTLSharedEventListener *shared_event_listener = nil;
dispatch_queue_t event_queue;
dispatch_semaphore_t wait_semaphore;
struct CopyBack {
void *host_pointer;
void *gpu_mem;
uint64_t size;
};
std::vector<CopyBack> copy_back_mem;
uint64_t shared_event_id;
uint64_t command_buffers_submitted = 0;
uint64_t command_buffers_completed = 0;
Stats &stats;
void close_compute_encoder();
void close_blit_encoder();
};
CCL_NAMESPACE_END
#endif /* WITH_METAL */

View File

@@ -0,0 +1,610 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifdef WITH_METAL
# include "device/metal/queue.h"
# include "device/metal/device_impl.h"
# include "device/metal/kernel.h"
# include "util/path.h"
# include "util/string.h"
# include "util/time.h"
CCL_NAMESPACE_BEGIN
/* MetalDeviceQueue */
MetalDeviceQueue::MetalDeviceQueue(MetalDevice *device)
: DeviceQueue(device), metal_device(device), stats(device->stats)
{
if (@available(macos 11.0, *)) {
command_buffer_desc = [[MTLCommandBufferDescriptor alloc] init];
command_buffer_desc.errorOptions = MTLCommandBufferErrorOptionEncoderExecutionStatus;
}
mtlDevice = device->mtlDevice;
mtlCommandQueue = [mtlDevice newCommandQueue];
if (@available(macos 10.14, *)) {
shared_event = [mtlDevice newSharedEvent];
shared_event_id = 1;
/* Shareable event listener */
event_queue = dispatch_queue_create("com.cycles.metal.event_queue", NULL);
shared_event_listener = [[MTLSharedEventListener alloc] initWithDispatchQueue:event_queue];
}
wait_semaphore = dispatch_semaphore_create(0);
}
MetalDeviceQueue::~MetalDeviceQueue()
{
/* Tidying up here isn't really practical - we should expect and require the work
* queue to be empty here. */
assert(mtlCommandBuffer == nil);
assert(command_buffers_submitted == command_buffers_completed);
if (@available(macos 10.14, *)) {
[shared_event_listener release];
[shared_event release];
}
if (@available(macos 11.0, *)) {
[command_buffer_desc release];
}
if (mtlCommandQueue) {
[mtlCommandQueue release];
mtlCommandQueue = nil;
}
}
int MetalDeviceQueue::num_concurrent_states(const size_t /*state_size*/) const
{
/* METAL_WIP */
/* TODO: compute automatically. */
/* TODO: must have at least num_threads_per_block. */
int result = 1048576;
if (metal_device->device_vendor == METAL_GPU_AMD) {
result *= 2;
}
else if (metal_device->device_vendor == METAL_GPU_APPLE) {
result *= 4;
}
return result;
}
int MetalDeviceQueue::num_concurrent_busy_states() const
{
/* METAL_WIP */
/* TODO: compute automatically. */
int result = 65536;
if (metal_device->device_vendor == METAL_GPU_AMD) {
result *= 2;
}
else if (metal_device->device_vendor == METAL_GPU_APPLE) {
result *= 4;
}
return result;
}
void MetalDeviceQueue::init_execution()
{
/* Synchronize all textures and memory copies before executing task. */
metal_device->load_texture_info();
synchronize();
}
bool MetalDeviceQueue::enqueue(DeviceKernel kernel,
const int work_size,
DeviceKernelArguments const &args)
{
if (metal_device->have_error()) {
return false;
}
VLOG(3) << "Metal queue launch " << device_kernel_as_string(kernel) << ", work_size "
<< work_size;
const MetalDeviceKernel &metal_kernel = metal_device->kernels.get(kernel);
const MetalKernelPipeline &metal_kernel_pso = metal_kernel.get_pso();
id<MTLComputeCommandEncoder> mtlComputeCommandEncoder = get_compute_encoder(kernel);
/* Determine size requirement for argument buffer. */
size_t arg_buffer_length = 0;
for (size_t i = 0; i < args.count; i++) {
size_t size_in_bytes = args.sizes[i];
arg_buffer_length = round_up(arg_buffer_length, size_in_bytes) + size_in_bytes;
}
/* 256 is the Metal offset alignment for constant address space bindings */
arg_buffer_length = round_up(arg_buffer_length, 256);
/* Globals placed after "vanilla" arguments. */
size_t globals_offsets = arg_buffer_length;
arg_buffer_length += sizeof(KernelParamsMetal);
arg_buffer_length = round_up(arg_buffer_length, 256);
/* Metal ancillary bindless pointers. */
size_t metal_offsets = arg_buffer_length;
arg_buffer_length += metal_device->mtlAncillaryArgEncoder.encodedLength;
arg_buffer_length = round_up(arg_buffer_length, metal_device->mtlAncillaryArgEncoder.alignment);
/* Temporary buffer used to prepare arg_buffer */
uint8_t *init_arg_buffer = (uint8_t *)alloca(arg_buffer_length);
memset(init_arg_buffer, 0, arg_buffer_length);
/* Prepare the non-pointer "enqueue" arguments */
size_t bytes_written = 0;
for (size_t i = 0; i < args.count; i++) {
size_t size_in_bytes = args.sizes[i];
bytes_written = round_up(bytes_written, size_in_bytes);
if (args.types[i] != DeviceKernelArguments::POINTER) {
memcpy(init_arg_buffer + bytes_written, args.values[i], size_in_bytes);
}
bytes_written += size_in_bytes;
}
/* Prepare any non-pointer (i.e. plain-old-data) KernelParamsMetal data */
/* The plain-old-data is contiguous, continuing to the end of KernelParamsMetal */
size_t plain_old_launch_data_offset = offsetof(KernelParamsMetal, __integrator_state) +
sizeof(IntegratorStateGPU);
size_t plain_old_launch_data_size = sizeof(KernelParamsMetal) - plain_old_launch_data_offset;
memcpy(init_arg_buffer + globals_offsets + plain_old_launch_data_offset,
(uint8_t *)&metal_device->launch_params + plain_old_launch_data_offset,
plain_old_launch_data_size);
/* Allocate an argument buffer. */
MTLResourceOptions arg_buffer_options = MTLResourceStorageModeManaged;
if (@available(macOS 11.0, *)) {
if ([mtlDevice hasUnifiedMemory]) {
arg_buffer_options = MTLResourceStorageModeShared;
}
}
id<MTLBuffer> arg_buffer = temp_buffer_pool.get_buffer(
mtlDevice, mtlCommandBuffer, arg_buffer_length, arg_buffer_options, init_arg_buffer, stats);
/* Encode the pointer "enqueue" arguments */
bytes_written = 0;
for (size_t i = 0; i < args.count; i++) {
size_t size_in_bytes = args.sizes[i];
bytes_written = round_up(bytes_written, size_in_bytes);
if (args.types[i] == DeviceKernelArguments::POINTER) {
[metal_device->mtlBufferKernelParamsEncoder setArgumentBuffer:arg_buffer
offset:bytes_written];
if (MetalDevice::MetalMem *mmem = *(MetalDevice::MetalMem **)args.values[i]) {
[mtlComputeCommandEncoder useResource:mmem->mtlBuffer
usage:MTLResourceUsageRead | MTLResourceUsageWrite];
[metal_device->mtlBufferKernelParamsEncoder setBuffer:mmem->mtlBuffer offset:0 atIndex:0];
}
else {
if (@available(macos 12.0, *)) {
[metal_device->mtlBufferKernelParamsEncoder setBuffer:nil offset:0 atIndex:0];
}
}
}
bytes_written += size_in_bytes;
}
/* Encode KernelParamsMetal buffers */
[metal_device->mtlBufferKernelParamsEncoder setArgumentBuffer:arg_buffer offset:globals_offsets];
/* this relies on IntegratorStateGPU layout being contiguous device_ptrs */
const size_t pointer_block_end = offsetof(KernelParamsMetal, __integrator_state) +
sizeof(IntegratorStateGPU);
for (size_t offset = 0; offset < pointer_block_end; offset += sizeof(device_ptr)) {
int pointer_index = offset / sizeof(device_ptr);
MetalDevice::MetalMem *mmem = *(
MetalDevice::MetalMem **)((uint8_t *)&metal_device->launch_params + offset);
if (mmem && (mmem->mtlBuffer || mmem->mtlTexture)) {
[metal_device->mtlBufferKernelParamsEncoder setBuffer:mmem->mtlBuffer
offset:0
atIndex:pointer_index];
}
else {
if (@available(macos 12.0, *)) {
[metal_device->mtlBufferKernelParamsEncoder setBuffer:nil offset:0 atIndex:pointer_index];
}
}
}
bytes_written = globals_offsets + sizeof(KernelParamsMetal);
/* Encode ancillaries */
[metal_device->mtlAncillaryArgEncoder setArgumentBuffer:arg_buffer offset:metal_offsets];
[metal_device->mtlAncillaryArgEncoder setBuffer:metal_device->texture_bindings_2d
offset:0
atIndex:0];
[metal_device->mtlAncillaryArgEncoder setBuffer:metal_device->texture_bindings_3d
offset:0
atIndex:1];
if (@available(macos 12.0, *)) {
if (metal_device->use_metalrt) {
if (metal_device->bvhMetalRT) {
id<MTLAccelerationStructure> accel_struct = metal_device->bvhMetalRT->accel_struct;
[metal_device->mtlAncillaryArgEncoder setAccelerationStructure:accel_struct atIndex:2];
}
for (int table = 0; table < METALRT_TABLE_NUM; table++) {
if (metal_kernel_pso.intersection_func_table[table]) {
[metal_kernel_pso.intersection_func_table[table] setBuffer:arg_buffer
offset:globals_offsets
atIndex:1];
[metal_device->mtlAncillaryArgEncoder
setIntersectionFunctionTable:metal_kernel_pso.intersection_func_table[table]
atIndex:3 + table];
[mtlComputeCommandEncoder useResource:metal_kernel_pso.intersection_func_table[table]
usage:MTLResourceUsageRead];
}
else {
[metal_device->mtlAncillaryArgEncoder setIntersectionFunctionTable:nil
atIndex:3 + table];
}
}
}
bytes_written = metal_offsets + metal_device->mtlAncillaryArgEncoder.encodedLength;
}
if (arg_buffer.storageMode == MTLStorageModeManaged) {
[arg_buffer didModifyRange:NSMakeRange(0, bytes_written)];
}
[mtlComputeCommandEncoder setBuffer:arg_buffer offset:0 atIndex:0];
[mtlComputeCommandEncoder setBuffer:arg_buffer offset:globals_offsets atIndex:1];
[mtlComputeCommandEncoder setBuffer:arg_buffer offset:metal_offsets atIndex:2];
if (metal_device->use_metalrt) {
if (@available(macos 12.0, *)) {
auto bvhMetalRT = metal_device->bvhMetalRT;
switch (kernel) {
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST:
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW:
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE:
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK:
case DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE:
break;
default:
bvhMetalRT = nil;
break;
}
if (bvhMetalRT) {
/* Mark all Accelerations resources as used */
[mtlComputeCommandEncoder useResource:bvhMetalRT->accel_struct usage:MTLResourceUsageRead];
[mtlComputeCommandEncoder useResources:bvhMetalRT->blas_array.data()
count:bvhMetalRT->blas_array.size()
usage:MTLResourceUsageRead];
}
}
}
[mtlComputeCommandEncoder setComputePipelineState:metal_kernel_pso.pipeline];
/* Compute kernel launch parameters. */
const int num_threads_per_block = metal_kernel.get_num_threads_per_block();
int shared_mem_bytes = 0;
switch (kernel) {
case DEVICE_KERNEL_INTEGRATOR_QUEUED_PATHS_ARRAY:
case DEVICE_KERNEL_INTEGRATOR_QUEUED_SHADOW_PATHS_ARRAY:
case DEVICE_KERNEL_INTEGRATOR_ACTIVE_PATHS_ARRAY:
case DEVICE_KERNEL_INTEGRATOR_TERMINATED_PATHS_ARRAY:
case DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY:
case DEVICE_KERNEL_INTEGRATOR_COMPACT_PATHS_ARRAY:
case DEVICE_KERNEL_INTEGRATOR_TERMINATED_SHADOW_PATHS_ARRAY:
case DEVICE_KERNEL_INTEGRATOR_COMPACT_SHADOW_PATHS_ARRAY:
/* See parallel_active_index.h for why this amount of shared memory is needed.
* Rounded up to 16 bytes for Metal */
shared_mem_bytes = round_up((num_threads_per_block + 1) * sizeof(int), 16);
[mtlComputeCommandEncoder setThreadgroupMemoryLength:shared_mem_bytes atIndex:0];
break;
default:
break;
}
MTLSize size_threadgroups_per_dispatch = MTLSizeMake(
divide_up(work_size, num_threads_per_block), 1, 1);
MTLSize size_threads_per_threadgroup = MTLSizeMake(num_threads_per_block, 1, 1);
[mtlComputeCommandEncoder dispatchThreadgroups:size_threadgroups_per_dispatch
threadsPerThreadgroup:size_threads_per_threadgroup];
[mtlCommandBuffer addCompletedHandler:^(id<MTLCommandBuffer> command_buffer) {
NSString *kernel_name = metal_kernel_pso.function.label;
/* Enhanced command buffer errors are only available in 11.0+ */
if (@available(macos 11.0, *)) {
if (command_buffer.status == MTLCommandBufferStatusError && command_buffer.error != nil) {
printf("CommandBuffer Failed: %s\n", [kernel_name UTF8String]);
NSArray<id<MTLCommandBufferEncoderInfo>> *encoderInfos = [command_buffer.error.userInfo
valueForKey:MTLCommandBufferEncoderInfoErrorKey];
if (encoderInfos != nil) {
for (id<MTLCommandBufferEncoderInfo> encoderInfo : encoderInfos) {
NSLog(@"%@", encoderInfo);
}
}
id<MTLLogContainer> logs = command_buffer.logs;
for (id<MTLFunctionLog> log in logs) {
NSLog(@"%@", log);
}
}
else if (command_buffer.error) {
printf("CommandBuffer Failed: %s\n", [kernel_name UTF8String]);
}
}
}];
return !(metal_device->have_error());
}
bool MetalDeviceQueue::synchronize()
{
if (metal_device->have_error()) {
return false;
}
if (mtlComputeEncoder) {
close_compute_encoder();
}
close_blit_encoder();
if (mtlCommandBuffer) {
uint64_t shared_event_id = this->shared_event_id++;
if (@available(macos 10.14, *)) {
__block dispatch_semaphore_t block_sema = wait_semaphore;
[shared_event notifyListener:shared_event_listener
atValue:shared_event_id
block:^(id<MTLSharedEvent> sharedEvent, uint64_t value) {
dispatch_semaphore_signal(block_sema);
}];
[mtlCommandBuffer encodeSignalEvent:shared_event value:shared_event_id];
[mtlCommandBuffer commit];
dispatch_semaphore_wait(wait_semaphore, DISPATCH_TIME_FOREVER);
}
[mtlCommandBuffer release];
for (const CopyBack &mmem : copy_back_mem) {
memcpy((uchar *)mmem.host_pointer, (uchar *)mmem.gpu_mem, mmem.size);
}
copy_back_mem.clear();
temp_buffer_pool.process_command_buffer_completion(mtlCommandBuffer);
metal_device->flush_delayed_free_list();
mtlCommandBuffer = nil;
}
return !(metal_device->have_error());
}
void MetalDeviceQueue::zero_to_device(device_memory &mem)
{
assert(mem.type != MEM_GLOBAL && mem.type != MEM_TEXTURE);
if (mem.memory_size() == 0) {
return;
}
/* Allocate on demand. */
if (mem.device_pointer == 0) {
metal_device->mem_alloc(mem);
}
/* Zero memory on device. */
assert(mem.device_pointer != 0);
std::lock_guard<std::recursive_mutex> lock(metal_device->metal_mem_map_mutex);
MetalDevice::MetalMem &mmem = *metal_device->metal_mem_map.at(&mem);
if (mmem.mtlBuffer) {
id<MTLBlitCommandEncoder> blitEncoder = get_blit_encoder();
[blitEncoder fillBuffer:mmem.mtlBuffer range:NSMakeRange(mmem.offset, mmem.size) value:0];
}
else {
metal_device->mem_zero(mem);
}
}
void MetalDeviceQueue::copy_to_device(device_memory &mem)
{
if (mem.memory_size() == 0) {
return;
}
/* Allocate on demand. */
if (mem.device_pointer == 0) {
metal_device->mem_alloc(mem);
}
assert(mem.device_pointer != 0);
assert(mem.host_pointer != nullptr);
std::lock_guard<std::recursive_mutex> lock(metal_device->metal_mem_map_mutex);
auto result = metal_device->metal_mem_map.find(&mem);
if (result != metal_device->metal_mem_map.end()) {
if (mem.host_pointer == mem.shared_pointer) {
return;
}
MetalDevice::MetalMem &mmem = *result->second;
id<MTLBlitCommandEncoder> blitEncoder = get_blit_encoder();
id<MTLBuffer> buffer = temp_buffer_pool.get_buffer(mtlDevice,
mtlCommandBuffer,
mmem.size,
MTLResourceStorageModeShared,
mem.host_pointer,
stats);
[blitEncoder copyFromBuffer:buffer
sourceOffset:0
toBuffer:mmem.mtlBuffer
destinationOffset:mmem.offset
size:mmem.size];
}
else {
metal_device->mem_copy_to(mem);
}
}
void MetalDeviceQueue::copy_from_device(device_memory &mem)
{
assert(mem.type != MEM_GLOBAL && mem.type != MEM_TEXTURE);
if (mem.memory_size() == 0) {
return;
}
assert(mem.device_pointer != 0);
assert(mem.host_pointer != nullptr);
std::lock_guard<std::recursive_mutex> lock(metal_device->metal_mem_map_mutex);
MetalDevice::MetalMem &mmem = *metal_device->metal_mem_map.at(&mem);
if (mmem.mtlBuffer) {
const size_t size = mem.memory_size();
if (mem.device_pointer) {
if ([mmem.mtlBuffer storageMode] == MTLStorageModeManaged) {
id<MTLBlitCommandEncoder> blitEncoder = get_blit_encoder();
[blitEncoder synchronizeResource:mmem.mtlBuffer];
}
if (mem.host_pointer != mmem.hostPtr) {
if (mtlCommandBuffer) {
copy_back_mem.push_back({mem.host_pointer, mmem.hostPtr, size});
}
else {
memcpy((uchar *)mem.host_pointer, (uchar *)mmem.hostPtr, size);
}
}
}
else {
memset((char *)mem.host_pointer, 0, size);
}
}
else {
metal_device->mem_copy_from(mem);
}
}
bool MetalDeviceQueue::kernel_available(DeviceKernel kernel) const
{
return metal_device->kernels.available(kernel);
}
void MetalDeviceQueue::prepare_resources(DeviceKernel kernel)
{
std::lock_guard<std::recursive_mutex> lock(metal_device->metal_mem_map_mutex);
/* declare resource usage */
for (auto &it : metal_device->metal_mem_map) {
device_memory *mem = it.first;
MTLResourceUsage usage = MTLResourceUsageRead;
if (mem->type != MEM_GLOBAL && mem->type != MEM_READ_ONLY && mem->type != MEM_TEXTURE) {
usage |= MTLResourceUsageWrite;
}
if (it.second->mtlBuffer) {
/* METAL_WIP - use array version (i.e. useResources) */
[mtlComputeEncoder useResource:it.second->mtlBuffer usage:usage];
}
else if (it.second->mtlTexture) {
/* METAL_WIP - use array version (i.e. useResources) */
[mtlComputeEncoder useResource:it.second->mtlTexture usage:usage | MTLResourceUsageSample];
}
}
/* ancillaries */
[mtlComputeEncoder useResource:metal_device->texture_bindings_2d usage:MTLResourceUsageRead];
[mtlComputeEncoder useResource:metal_device->texture_bindings_3d usage:MTLResourceUsageRead];
}
id<MTLComputeCommandEncoder> MetalDeviceQueue::get_compute_encoder(DeviceKernel kernel)
{
bool concurrent = (kernel < DEVICE_KERNEL_INTEGRATOR_NUM);
if (@available(macos 10.14, *)) {
if (mtlComputeEncoder) {
if (mtlComputeEncoder.dispatchType == concurrent ? MTLDispatchTypeConcurrent :
MTLDispatchTypeSerial) {
/* declare usage of MTLBuffers etc */
prepare_resources(kernel);
return mtlComputeEncoder;
}
close_compute_encoder();
}
close_blit_encoder();
if (!mtlCommandBuffer) {
mtlCommandBuffer = [mtlCommandQueue commandBuffer];
[mtlCommandBuffer retain];
}
mtlComputeEncoder = [mtlCommandBuffer
computeCommandEncoderWithDispatchType:concurrent ? MTLDispatchTypeConcurrent :
MTLDispatchTypeSerial];
/* declare usage of MTLBuffers etc */
prepare_resources(kernel);
}
return mtlComputeEncoder;
}
id<MTLBlitCommandEncoder> MetalDeviceQueue::get_blit_encoder()
{
if (mtlBlitEncoder) {
return mtlBlitEncoder;
}
if (mtlComputeEncoder) {
close_compute_encoder();
}
if (!mtlCommandBuffer) {
mtlCommandBuffer = [mtlCommandQueue commandBuffer];
[mtlCommandBuffer retain];
}
mtlBlitEncoder = [mtlCommandBuffer blitCommandEncoder];
return mtlBlitEncoder;
}
void MetalDeviceQueue::close_compute_encoder()
{
[mtlComputeEncoder endEncoding];
mtlComputeEncoder = nil;
}
void MetalDeviceQueue::close_blit_encoder()
{
if (mtlBlitEncoder) {
[mtlBlitEncoder endEncoding];
mtlBlitEncoder = nil;
}
}
CCL_NAMESPACE_END
#endif /* WITH_METAL */

View File

@@ -0,0 +1,101 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once
#ifdef WITH_METAL
# include <Metal/Metal.h>
# include <string>
# include "device/metal/device.h"
# include "device/metal/kernel.h"
# include "device/queue.h"
# include "util/thread.h"
CCL_NAMESPACE_BEGIN
enum MetalGPUVendor {
METAL_GPU_UNKNOWN = 0,
METAL_GPU_APPLE = 1,
METAL_GPU_AMD = 2,
METAL_GPU_INTEL = 3,
};
/* Retains a named MTLDevice for device enumeration. */
struct MetalPlatformDevice {
MetalPlatformDevice(id<MTLDevice> device, const string &device_name)
: device_id(device), device_name(device_name)
{
[device_id retain];
}
~MetalPlatformDevice()
{
[device_id release];
}
id<MTLDevice> device_id;
string device_name;
};
/* Contains static Metal helper functions. */
struct MetalInfo {
static bool device_version_check(id<MTLDevice> device);
static void get_usable_devices(vector<MetalPlatformDevice> *usable_devices);
static MetalGPUVendor get_vendor_from_device_name(string const &device_name);
/* Platform information. */
static bool get_num_devices(uint32_t *num_platforms);
static uint32_t get_num_devices();
static bool get_device_name(id<MTLDevice> device_id, string *device_name);
static string get_device_name(id<MTLDevice> device_id);
};
/* Pool of MTLBuffers whose lifetime is linked to a single MTLCommandBuffer */
class MetalBufferPool {
struct MetalBufferListEntry {
MetalBufferListEntry(id<MTLBuffer> buffer, id<MTLCommandBuffer> command_buffer)
: buffer(buffer), command_buffer(command_buffer)
{
}
MetalBufferListEntry() = delete;
id<MTLBuffer> buffer;
id<MTLCommandBuffer> command_buffer;
};
std::vector<MetalBufferListEntry> buffer_free_list;
std::vector<MetalBufferListEntry> buffer_in_use_list;
thread_mutex buffer_mutex;
size_t total_temp_mem_size = 0;
public:
MetalBufferPool() = default;
~MetalBufferPool();
id<MTLBuffer> get_buffer(id<MTLDevice> device,
id<MTLCommandBuffer> command_buffer,
NSUInteger length,
MTLResourceOptions options,
const void *pointer,
Stats &stats);
void process_command_buffer_completion(id<MTLCommandBuffer> command_buffer);
};
CCL_NAMESPACE_END
#endif /* WITH_METAL */

View File

@@ -0,0 +1,218 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifdef WITH_METAL
# include "device/metal/util.h"
# include "device/metal/device_impl.h"
# include "util/md5.h"
# include "util/path.h"
# include "util/string.h"
# include "util/time.h"
# include <pwd.h>
# include <sys/shm.h>
# include <time.h>
CCL_NAMESPACE_BEGIN
MetalGPUVendor MetalInfo::get_vendor_from_device_name(string const &device_name)
{
if (device_name.find("Intel") != string::npos) {
return METAL_GPU_INTEL;
}
else if (device_name.find("AMD") != string::npos) {
return METAL_GPU_AMD;
}
else if (device_name.find("Apple") != string::npos) {
return METAL_GPU_APPLE;
}
return METAL_GPU_UNKNOWN;
}
bool MetalInfo::device_version_check(id<MTLDevice> device)
{
/* Metal Cycles doesn't work correctly on macOS versions older than 12.0 */
if (@available(macos 12.0, *)) {
MetalGPUVendor vendor = get_vendor_from_device_name([[device name] UTF8String]);
/* Metal Cycles works on Apple Silicon GPUs at present */
return (vendor == METAL_GPU_APPLE);
}
return false;
}
void MetalInfo::get_usable_devices(vector<MetalPlatformDevice> *usable_devices)
{
static bool first_time = true;
# define FIRST_VLOG(severity) \
if (first_time) \
VLOG(severity)
usable_devices->clear();
NSArray<id<MTLDevice>> *allDevices = MTLCopyAllDevices();
for (id<MTLDevice> device in allDevices) {
string device_name;
if (!get_device_name(device, &device_name)) {
FIRST_VLOG(2) << "Failed to get device name, ignoring.";
continue;
}
static const char *forceIntelStr = getenv("CYCLES_METAL_FORCE_INTEL");
bool forceIntel = forceIntelStr ? (atoi(forceIntelStr) != 0) : false;
if (forceIntel && device_name.find("Intel") == string::npos) {
FIRST_VLOG(2) << "CYCLES_METAL_FORCE_INTEL causing non-Intel device " << device_name
<< " to be ignored.";
continue;
}
if (!device_version_check(device)) {
FIRST_VLOG(2) << "Ignoring device " << device_name << " due to too old compiler version.";
continue;
}
FIRST_VLOG(2) << "Adding new device " << device_name << ".";
string hardware_id;
usable_devices->push_back(MetalPlatformDevice(device, device_name));
}
first_time = false;
}
bool MetalInfo::get_num_devices(uint32_t *num_devices)
{
*num_devices = MTLCopyAllDevices().count;
return true;
}
uint32_t MetalInfo::get_num_devices()
{
uint32_t num_devices;
if (!get_num_devices(&num_devices)) {
return 0;
}
return num_devices;
}
bool MetalInfo::get_device_name(id<MTLDevice> device, string *platform_name)
{
*platform_name = [device.name UTF8String];
return true;
}
string MetalInfo::get_device_name(id<MTLDevice> device)
{
string platform_name;
if (!get_device_name(device, &platform_name)) {
return "";
}
return platform_name;
}
id<MTLBuffer> MetalBufferPool::get_buffer(id<MTLDevice> device,
id<MTLCommandBuffer> command_buffer,
NSUInteger length,
MTLResourceOptions options,
const void *pointer,
Stats &stats)
{
id<MTLBuffer> buffer;
MTLStorageMode storageMode = MTLStorageMode((options & MTLResourceStorageModeMask) >>
MTLResourceStorageModeShift);
MTLCPUCacheMode cpuCacheMode = MTLCPUCacheMode((options & MTLResourceCPUCacheModeMask) >>
MTLResourceCPUCacheModeShift);
buffer_mutex.lock();
for (auto entry = buffer_free_list.begin(); entry != buffer_free_list.end(); entry++) {
MetalBufferListEntry bufferEntry = *entry;
/* Check if buffer matches size and storage mode and is old enough to reuse */
if (bufferEntry.buffer.length == length && storageMode == bufferEntry.buffer.storageMode &&
cpuCacheMode == bufferEntry.buffer.cpuCacheMode) {
buffer = bufferEntry.buffer;
buffer_free_list.erase(entry);
bufferEntry.command_buffer = command_buffer;
buffer_in_use_list.push_back(bufferEntry);
buffer_mutex.unlock();
/* Copy over data */
if (pointer) {
memcpy(buffer.contents, pointer, length);
if (bufferEntry.buffer.storageMode == MTLStorageModeManaged) {
[buffer didModifyRange:NSMakeRange(0, length)];
}
}
return buffer;
}
}
// NSLog(@"Creating buffer of length %lu (%lu)", length, frameCount);
if (pointer) {
buffer = [device newBufferWithBytes:pointer length:length options:options];
}
else {
buffer = [device newBufferWithLength:length options:options];
}
MetalBufferListEntry buffer_entry(buffer, command_buffer);
stats.mem_alloc(buffer.allocatedSize);
total_temp_mem_size += buffer.allocatedSize;
buffer_in_use_list.push_back(buffer_entry);
buffer_mutex.unlock();
return buffer;
}
void MetalBufferPool::process_command_buffer_completion(id<MTLCommandBuffer> command_buffer)
{
assert(command_buffer);
thread_scoped_lock lock(buffer_mutex);
/* Release all buffers that have not been recently reused back into the free pool */
for (auto entry = buffer_in_use_list.begin(); entry != buffer_in_use_list.end();) {
MetalBufferListEntry buffer_entry = *entry;
if (buffer_entry.command_buffer == command_buffer) {
entry = buffer_in_use_list.erase(entry);
buffer_entry.command_buffer = nil;
buffer_free_list.push_back(buffer_entry);
}
else {
entry++;
}
}
}
MetalBufferPool::~MetalBufferPool()
{
thread_scoped_lock lock(buffer_mutex);
/* Release all buffers that have not been recently reused */
for (auto entry = buffer_free_list.begin(); entry != buffer_free_list.end();) {
MetalBufferListEntry buffer_entry = *entry;
id<MTLBuffer> buffer = buffer_entry.buffer;
// NSLog(@"Releasing buffer of length %lu (%lu) (%lu outstanding)", buffer.length, frameCount,
// bufferFreeList.size());
total_temp_mem_size -= buffer.allocatedSize;
[buffer release];
entry = buffer_free_list.erase(entry);
}
}
CCL_NAMESPACE_END
#endif /* WITH_METAL */

View File

@@ -124,11 +124,20 @@ class MultiDevice : public Device {
return BVH_LAYOUT_MULTI_OPTIX;
}
/* With multiple Metal devices, every device needs its own acceleration structure */
if (bvh_layout_mask == BVH_LAYOUT_METAL) {
return BVH_LAYOUT_MULTI_METAL;
}
/* When devices do not share a common BVH layout, fall back to creating one for each */
const BVHLayoutMask BVH_LAYOUT_OPTIX_EMBREE = (BVH_LAYOUT_OPTIX | BVH_LAYOUT_EMBREE);
if ((bvh_layout_mask_all & BVH_LAYOUT_OPTIX_EMBREE) == BVH_LAYOUT_OPTIX_EMBREE) {
return BVH_LAYOUT_MULTI_OPTIX_EMBREE;
}
const BVHLayoutMask BVH_LAYOUT_METAL_EMBREE = (BVH_LAYOUT_METAL | BVH_LAYOUT_EMBREE);
if ((bvh_layout_mask_all & BVH_LAYOUT_METAL_EMBREE) == BVH_LAYOUT_METAL_EMBREE) {
return BVH_LAYOUT_MULTI_METAL_EMBREE;
}
return bvh_layout_mask;
}
@@ -151,7 +160,9 @@ class MultiDevice : public Device {
}
assert(bvh->params.bvh_layout == BVH_LAYOUT_MULTI_OPTIX ||
bvh->params.bvh_layout == BVH_LAYOUT_MULTI_OPTIX_EMBREE);
bvh->params.bvh_layout == BVH_LAYOUT_MULTI_METAL ||
bvh->params.bvh_layout == BVH_LAYOUT_MULTI_OPTIX_EMBREE ||
bvh->params.bvh_layout == BVH_LAYOUT_MULTI_METAL_EMBREE);
BVHMulti *const bvh_multi = static_cast<BVHMulti *>(bvh);
bvh_multi->sub_bvhs.resize(devices.size());
@@ -174,9 +185,14 @@ class MultiDevice : public Device {
BVHParams params = bvh->params;
if (bvh->params.bvh_layout == BVH_LAYOUT_MULTI_OPTIX)
params.bvh_layout = BVH_LAYOUT_OPTIX;
else if (bvh->params.bvh_layout == BVH_LAYOUT_MULTI_METAL)
params.bvh_layout = BVH_LAYOUT_METAL;
else if (bvh->params.bvh_layout == BVH_LAYOUT_MULTI_OPTIX_EMBREE)
params.bvh_layout = sub.device->info.type == DEVICE_OPTIX ? BVH_LAYOUT_OPTIX :
BVH_LAYOUT_EMBREE;
else if (bvh->params.bvh_layout == BVH_LAYOUT_MULTI_METAL_EMBREE)
params.bvh_layout = sub.device->info.type == DEVICE_METAL ? BVH_LAYOUT_METAL :
BVH_LAYOUT_EMBREE;
/* Skip building a bottom level acceleration structure for non-instanced geometry on Embree
* (since they are put into the top level directly, see bvh_embree.cpp) */

View File

@@ -28,6 +28,7 @@
# include "scene/mesh.h"
# include "scene/object.h"
# include "scene/pass.h"
# include "scene/pointcloud.h"
# include "scene/scene.h"
# include "util/debug.h"
@@ -41,17 +42,19 @@
# define __KERNEL_OPTIX__
# include "kernel/device/optix/globals.h"
# include <optix_denoiser_tiling.h>
CCL_NAMESPACE_BEGIN
OptiXDevice::Denoiser::Denoiser(OptiXDevice *device)
: device(device), queue(device), state(device, "__denoiser_state")
: device(device), queue(device), state(device, "__denoiser_state", true)
{
}
OptiXDevice::OptiXDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
: CUDADevice(info, stats, profiler),
sbt_data(this, "__sbt", MEM_READ_ONLY),
launch_params(this, "__params"),
launch_params(this, "__params", false),
denoiser_(this)
{
/* Make the CUDA context current. */
@@ -208,11 +211,15 @@ bool OptiXDevice::load_kernels(const uint kernel_features)
}
else {
module_options.optLevel = OPTIX_COMPILE_OPTIMIZATION_LEVEL_3;
module_options.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_LINEINFO;
module_options.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_NONE;
}
module_options.boundValues = nullptr;
module_options.numBoundValues = 0;
# if OPTIX_ABI_VERSION >= 55
module_options.payloadTypes = nullptr;
module_options.numPayloadTypes = 0;
# endif
OptixPipelineCompileOptions pipeline_options = {};
/* Default to no motion blur and two-level graph, since it is the fastest option. */
@@ -227,11 +234,18 @@ bool OptiXDevice::load_kernels(const uint kernel_features)
pipeline_options.usesPrimitiveTypeFlags = OPTIX_PRIMITIVE_TYPE_FLAGS_TRIANGLE;
if (kernel_features & KERNEL_FEATURE_HAIR) {
if (kernel_features & KERNEL_FEATURE_HAIR_THICK) {
# if OPTIX_ABI_VERSION >= 55
pipeline_options.usesPrimitiveTypeFlags |= OPTIX_PRIMITIVE_TYPE_FLAGS_ROUND_CATMULLROM;
# else
pipeline_options.usesPrimitiveTypeFlags |= OPTIX_PRIMITIVE_TYPE_FLAGS_ROUND_CUBIC_BSPLINE;
# endif
}
else
pipeline_options.usesPrimitiveTypeFlags |= OPTIX_PRIMITIVE_TYPE_FLAGS_CUSTOM;
}
if (kernel_features & KERNEL_FEATURE_POINTCLOUD) {
pipeline_options.usesPrimitiveTypeFlags |= OPTIX_PRIMITIVE_TYPE_FLAGS_CUSTOM;
}
/* Keep track of whether motion blur is enabled, so to enable/disable motion in BVH builds
* This is necessary since objects may be reported to have motion if the Vector pass is
@@ -324,7 +338,13 @@ bool OptiXDevice::load_kernels(const uint kernel_features)
if (kernel_features & KERNEL_FEATURE_HAIR_THICK) {
/* Built-in thick curve intersection. */
OptixBuiltinISOptions builtin_options = {};
# if OPTIX_ABI_VERSION >= 55
builtin_options.builtinISModuleType = OPTIX_PRIMITIVE_TYPE_ROUND_CATMULLROM;
builtin_options.buildFlags = OPTIX_BUILD_FLAG_PREFER_FAST_TRACE;
builtin_options.curveEndcapFlags = OPTIX_CURVE_ENDCAP_DEFAULT; /* Disable end-caps. */
# else
builtin_options.builtinISModuleType = OPTIX_PRIMITIVE_TYPE_ROUND_CUBIC_BSPLINE;
# endif
builtin_options.usesMotionBlur = false;
optix_assert(optixBuiltinISModuleGet(
@@ -356,6 +376,18 @@ bool OptiXDevice::load_kernels(const uint kernel_features)
}
}
/* Pointclouds */
if (kernel_features & KERNEL_FEATURE_POINTCLOUD) {
group_descs[PG_HITD_POINTCLOUD] = group_descs[PG_HITD];
group_descs[PG_HITD_POINTCLOUD].kind = OPTIX_PROGRAM_GROUP_KIND_HITGROUP;
group_descs[PG_HITD_POINTCLOUD].hitgroup.moduleIS = optix_module;
group_descs[PG_HITD_POINTCLOUD].hitgroup.entryFunctionNameIS = "__intersection__point";
group_descs[PG_HITS_POINTCLOUD] = group_descs[PG_HITS];
group_descs[PG_HITS_POINTCLOUD].kind = OPTIX_PROGRAM_GROUP_KIND_HITGROUP;
group_descs[PG_HITS_POINTCLOUD].hitgroup.moduleIS = optix_module;
group_descs[PG_HITS_POINTCLOUD].hitgroup.entryFunctionNameIS = "__intersection__point";
}
if (kernel_features & (KERNEL_FEATURE_SUBSURFACE | KERNEL_FEATURE_NODE_RAYTRACE)) {
/* Add hit group for local intersections. */
group_descs[PG_HITL].kind = OPTIX_PROGRAM_GROUP_KIND_HITGROUP;
@@ -403,6 +435,10 @@ bool OptiXDevice::load_kernels(const uint kernel_features)
stack_size[PG_HITD_MOTION].cssIS + stack_size[PG_HITD_MOTION].cssAH);
trace_css = std::max(trace_css,
stack_size[PG_HITS_MOTION].cssIS + stack_size[PG_HITS_MOTION].cssAH);
trace_css = std::max(
trace_css, stack_size[PG_HITD_POINTCLOUD].cssIS + stack_size[PG_HITD_POINTCLOUD].cssAH);
trace_css = std::max(
trace_css, stack_size[PG_HITS_POINTCLOUD].cssIS + stack_size[PG_HITS_POINTCLOUD].cssAH);
OptixPipelineLinkOptions link_options = {};
link_options.maxTraceDepth = 1;
@@ -411,7 +447,7 @@ bool OptiXDevice::load_kernels(const uint kernel_features)
link_options.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_FULL;
}
else {
link_options.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_LINEINFO;
link_options.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_NONE;
}
if (kernel_features & KERNEL_FEATURE_NODE_RAYTRACE) {
@@ -428,6 +464,10 @@ bool OptiXDevice::load_kernels(const uint kernel_features)
pipeline_groups.push_back(groups[PG_HITD_MOTION]);
pipeline_groups.push_back(groups[PG_HITS_MOTION]);
}
if (kernel_features & KERNEL_FEATURE_POINTCLOUD) {
pipeline_groups.push_back(groups[PG_HITD_POINTCLOUD]);
pipeline_groups.push_back(groups[PG_HITS_POINTCLOUD]);
}
pipeline_groups.push_back(groups[PG_CALL_SVM_AO]);
pipeline_groups.push_back(groups[PG_CALL_SVM_BEVEL]);
@@ -467,6 +507,10 @@ bool OptiXDevice::load_kernels(const uint kernel_features)
pipeline_groups.push_back(groups[PG_HITD_MOTION]);
pipeline_groups.push_back(groups[PG_HITS_MOTION]);
}
if (kernel_features & KERNEL_FEATURE_POINTCLOUD) {
pipeline_groups.push_back(groups[PG_HITD_POINTCLOUD]);
pipeline_groups.push_back(groups[PG_HITS_POINTCLOUD]);
}
optix_assert(optixPipelineCreate(context,
&pipeline_options,
@@ -507,7 +551,7 @@ class OptiXDevice::DenoiseContext {
: denoise_params(task.params),
render_buffers(task.render_buffers),
buffer_params(task.buffer_params),
guiding_buffer(device, "denoiser guiding passes buffer"),
guiding_buffer(device, "denoiser guiding passes buffer", true),
num_samples(task.num_samples)
{
num_input_passes = 1;
@@ -522,9 +566,9 @@ class OptiXDevice::DenoiseContext {
}
}
const int num_guiding_passes = num_input_passes - 1;
use_guiding_passes = (num_input_passes - 1) > 0;
if (num_guiding_passes) {
if (use_guiding_passes) {
if (task.allow_inplace_modification) {
guiding_params.device_pointer = render_buffers->buffer.device_pointer;
@@ -577,6 +621,7 @@ class OptiXDevice::DenoiseContext {
/* Number of input passes. Including the color and extra auxiliary passes. */
int num_input_passes = 0;
bool use_guiding_passes = false;
bool use_pass_albedo = false;
bool use_pass_normal = false;
@@ -653,22 +698,22 @@ bool OptiXDevice::denoise_filter_guiding_preprocess(DenoiseContext &context)
const int work_size = buffer_params.width * buffer_params.height;
void *args[] = {const_cast<device_ptr *>(&context.guiding_params.device_pointer),
const_cast<int *>(&context.guiding_params.pass_stride),
const_cast<int *>(&context.guiding_params.pass_albedo),
const_cast<int *>(&context.guiding_params.pass_normal),
&context.render_buffers->buffer.device_pointer,
const_cast<int *>(&buffer_params.offset),
const_cast<int *>(&buffer_params.stride),
const_cast<int *>(&buffer_params.pass_stride),
const_cast<int *>(&context.pass_sample_count),
const_cast<int *>(&context.pass_denoising_albedo),
const_cast<int *>(&context.pass_denoising_normal),
const_cast<int *>(&buffer_params.full_x),
const_cast<int *>(&buffer_params.full_y),
const_cast<int *>(&buffer_params.width),
const_cast<int *>(&buffer_params.height),
const_cast<int *>(&context.num_samples)};
DeviceKernelArguments args(&context.guiding_params.device_pointer,
&context.guiding_params.pass_stride,
&context.guiding_params.pass_albedo,
&context.guiding_params.pass_normal,
&context.render_buffers->buffer.device_pointer,
&buffer_params.offset,
&buffer_params.stride,
&buffer_params.pass_stride,
&context.pass_sample_count,
&context.pass_denoising_albedo,
&context.pass_denoising_normal,
&buffer_params.full_x,
&buffer_params.full_y,
&buffer_params.width,
&buffer_params.height,
&context.num_samples);
return denoiser_.queue.enqueue(DEVICE_KERNEL_FILTER_GUIDING_PREPROCESS, work_size, args);
}
@@ -679,11 +724,11 @@ bool OptiXDevice::denoise_filter_guiding_set_fake_albedo(DenoiseContext &context
const int work_size = buffer_params.width * buffer_params.height;
void *args[] = {const_cast<device_ptr *>(&context.guiding_params.device_pointer),
const_cast<int *>(&context.guiding_params.pass_stride),
const_cast<int *>(&context.guiding_params.pass_albedo),
const_cast<int *>(&buffer_params.width),
const_cast<int *>(&buffer_params.height)};
DeviceKernelArguments args(&context.guiding_params.device_pointer,
&context.guiding_params.pass_stride,
&context.guiding_params.pass_albedo,
&buffer_params.width,
&buffer_params.height);
return denoiser_.queue.enqueue(DEVICE_KERNEL_FILTER_GUIDING_SET_FAKE_ALBEDO, work_size, args);
}
@@ -708,7 +753,7 @@ void OptiXDevice::denoise_pass(DenoiseContext &context, PassType pass_type)
return;
}
}
else if (!context.albedo_replaced_with_fake) {
else if (context.use_guiding_passes && !context.albedo_replaced_with_fake) {
context.albedo_replaced_with_fake = true;
if (!denoise_filter_guiding_set_fake_albedo(context)) {
LOG(ERROR) << "Error replacing real albedo with the fake one.";
@@ -779,15 +824,15 @@ bool OptiXDevice::denoise_filter_color_preprocess(DenoiseContext &context, const
const int work_size = buffer_params.width * buffer_params.height;
void *args[] = {&context.render_buffers->buffer.device_pointer,
const_cast<int *>(&buffer_params.full_x),
const_cast<int *>(&buffer_params.full_y),
const_cast<int *>(&buffer_params.width),
const_cast<int *>(&buffer_params.height),
const_cast<int *>(&buffer_params.offset),
const_cast<int *>(&buffer_params.stride),
const_cast<int *>(&buffer_params.pass_stride),
const_cast<int *>(&pass.denoised_offset)};
DeviceKernelArguments args(&context.render_buffers->buffer.device_pointer,
&buffer_params.full_x,
&buffer_params.full_y,
&buffer_params.width,
&buffer_params.height,
&buffer_params.offset,
&buffer_params.stride,
&buffer_params.pass_stride,
&pass.denoised_offset);
return denoiser_.queue.enqueue(DEVICE_KERNEL_FILTER_COLOR_PREPROCESS, work_size, args);
}
@@ -799,20 +844,20 @@ bool OptiXDevice::denoise_filter_color_postprocess(DenoiseContext &context,
const int work_size = buffer_params.width * buffer_params.height;
void *args[] = {&context.render_buffers->buffer.device_pointer,
const_cast<int *>(&buffer_params.full_x),
const_cast<int *>(&buffer_params.full_y),
const_cast<int *>(&buffer_params.width),
const_cast<int *>(&buffer_params.height),
const_cast<int *>(&buffer_params.offset),
const_cast<int *>(&buffer_params.stride),
const_cast<int *>(&buffer_params.pass_stride),
const_cast<int *>(&context.num_samples),
const_cast<int *>(&pass.noisy_offset),
const_cast<int *>(&pass.denoised_offset),
const_cast<int *>(&context.pass_sample_count),
const_cast<int *>(&pass.num_components),
const_cast<bool *>(&pass.use_compositing)};
DeviceKernelArguments args(&context.render_buffers->buffer.device_pointer,
&buffer_params.full_x,
&buffer_params.full_y,
&buffer_params.width,
&buffer_params.height,
&buffer_params.offset,
&buffer_params.stride,
&buffer_params.pass_stride,
&context.num_samples,
&pass.noisy_offset,
&pass.denoised_offset,
&context.pass_sample_count,
&pass.num_components,
&pass.use_compositing);
return denoiser_.queue.enqueue(DEVICE_KERNEL_FILTER_COLOR_POSTPROCESS, work_size, args);
}
@@ -870,35 +915,33 @@ bool OptiXDevice::denoise_create_if_needed(DenoiseContext &context)
bool OptiXDevice::denoise_configure_if_needed(DenoiseContext &context)
{
if (denoiser_.is_configured && (denoiser_.configured_size.x == context.buffer_params.width &&
denoiser_.configured_size.y == context.buffer_params.height)) {
/* Limit maximum tile size denoiser can be invoked with. */
const int2 tile_size = make_int2(min(context.buffer_params.width, 4096),
min(context.buffer_params.height, 4096));
if (denoiser_.is_configured &&
(denoiser_.configured_size.x == tile_size.x && denoiser_.configured_size.y == tile_size.y)) {
return true;
}
const BufferParams &buffer_params = context.buffer_params;
OptixDenoiserSizes sizes = {};
optix_assert(optixDenoiserComputeMemoryResources(
denoiser_.optix_denoiser, buffer_params.width, buffer_params.height, &sizes));
/* Denoiser is invoked on whole images only, so no overlap needed (would be used for tiling). */
denoiser_.scratch_size = sizes.withoutOverlapScratchSizeInBytes;
denoiser_.scratch_offset = sizes.stateSizeInBytes;
denoiser_.optix_denoiser, tile_size.x, tile_size.y, &denoiser_.sizes));
/* Allocate denoiser state if tile size has changed since last setup. */
denoiser_.state.alloc_to_device(denoiser_.scratch_offset + denoiser_.scratch_size);
denoiser_.state.alloc_to_device(denoiser_.sizes.stateSizeInBytes +
denoiser_.sizes.withOverlapScratchSizeInBytes);
/* Initialize denoiser state for the current tile size. */
const OptixResult result = optixDenoiserSetup(
denoiser_.optix_denoiser,
0, /* Work around bug in r495 drivers that causes artifacts when denoiser setup is called
on a stream that is not the default stream */
buffer_params.width,
buffer_params.height,
tile_size.x + denoiser_.sizes.overlapWindowSizeInPixels * 2,
tile_size.y + denoiser_.sizes.overlapWindowSizeInPixels * 2,
denoiser_.state.device_pointer,
denoiser_.scratch_offset,
denoiser_.state.device_pointer + denoiser_.scratch_offset,
denoiser_.scratch_size);
denoiser_.sizes.stateSizeInBytes,
denoiser_.state.device_pointer + denoiser_.sizes.stateSizeInBytes,
denoiser_.sizes.withOverlapScratchSizeInBytes);
if (result != OPTIX_SUCCESS) {
set_error("Failed to set up OptiX denoiser");
return false;
@@ -907,8 +950,7 @@ bool OptiXDevice::denoise_configure_if_needed(DenoiseContext &context)
cuda_assert(cuCtxSynchronize());
denoiser_.is_configured = true;
denoiser_.configured_size.x = buffer_params.width;
denoiser_.configured_size.y = buffer_params.height;
denoiser_.configured_size = tile_size;
return true;
}
@@ -979,18 +1021,20 @@ bool OptiXDevice::denoise_run(DenoiseContext &context, const DenoisePass &pass)
guide_layers.albedo = albedo_layer;
guide_layers.normal = normal_layer;
optix_assert(optixDenoiserInvoke(denoiser_.optix_denoiser,
denoiser_.queue.stream(),
&params,
denoiser_.state.device_pointer,
denoiser_.scratch_offset,
&guide_layers,
&image_layers,
1,
0,
0,
denoiser_.state.device_pointer + denoiser_.scratch_offset,
denoiser_.scratch_size));
optix_assert(optixUtilDenoiserInvokeTiled(denoiser_.optix_denoiser,
denoiser_.queue.stream(),
&params,
denoiser_.state.device_pointer,
denoiser_.sizes.stateSizeInBytes,
&guide_layers,
&image_layers,
1,
denoiser_.state.device_pointer +
denoiser_.sizes.stateSizeInBytes,
denoiser_.sizes.withOverlapScratchSizeInBytes,
denoiser_.sizes.overlapWindowSizeInPixels,
denoiser_.configured_size.x,
denoiser_.configured_size.y));
return true;
}
@@ -1000,6 +1044,13 @@ bool OptiXDevice::build_optix_bvh(BVHOptiX *bvh,
const OptixBuildInput &build_input,
uint16_t num_motion_steps)
{
/* Allocate and build acceleration structures only one at a time, to prevent parallel builds
* from running out of memory (since both original and compacted acceleration structure memory
* may be allocated at the same time for the duration of this function). The builds would
* otherwise happen on the same CUDA stream anyway. */
static thread_mutex mutex;
thread_scoped_lock lock(mutex);
const CUDAContextScope scope(this);
const bool use_fast_trace_bvh = (bvh->params.bvh_type == BVH_TYPE_STATIC);
@@ -1025,13 +1076,14 @@ bool OptiXDevice::build_optix_bvh(BVHOptiX *bvh,
optix_assert(optixAccelComputeMemoryUsage(context, &options, &build_input, 1, &sizes));
/* Allocate required output buffers. */
device_only_memory<char> temp_mem(this, "optix temp as build mem");
device_only_memory<char> temp_mem(this, "optix temp as build mem", true);
temp_mem.alloc_to_device(align_up(sizes.tempSizeInBytes, 8) + 8);
if (!temp_mem.device_pointer) {
/* Make sure temporary memory allocation succeeded. */
return false;
}
/* Acceleration structure memory has to be allocated on the device (not allowed on the host). */
device_only_memory<char> &out_data = *bvh->as_data;
if (operation == OPTIX_BUILD_OPERATION_BUILD) {
assert(out_data.device == this);
@@ -1080,12 +1132,13 @@ bool OptiXDevice::build_optix_bvh(BVHOptiX *bvh,
/* There is no point compacting if the size does not change. */
if (compacted_size < sizes.outputSizeInBytes) {
device_only_memory<char> compacted_data(this, "optix compacted as");
device_only_memory<char> compacted_data(this, "optix compacted as", false);
compacted_data.alloc_to_device(compacted_size);
if (!compacted_data.device_pointer)
if (!compacted_data.device_pointer) {
/* Do not compact if memory allocation for compacted acceleration structure fails.
* Can just use the uncompacted one then, so succeed here regardless. */
return !have_error();
}
optix_assert(optixAccelCompact(
context, NULL, out_handle, compacted_data.device_pointer, compacted_size, &out_handle));
@@ -1096,6 +1149,8 @@ bool OptiXDevice::build_optix_bvh(BVHOptiX *bvh,
std::swap(out_data.device_size, compacted_data.device_size);
std::swap(out_data.device_pointer, compacted_data.device_pointer);
/* Original acceleration structure memory is freed when 'compacted_data' goes out of scope.
*/
}
}
@@ -1178,20 +1233,27 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
int ka = max(k0 - 1, curve.first_key);
int kb = min(k1 + 1, curve.first_key + curve.num_keys - 1);
index_data[i] = i * 4;
float4 *const v = vertex_data.data() + step * num_vertices + index_data[i];
# if OPTIX_ABI_VERSION >= 55
v[0] = make_float4(keys[ka].x, keys[ka].y, keys[ka].z, curve_radius[ka]);
v[1] = make_float4(keys[k0].x, keys[k0].y, keys[k0].z, curve_radius[k0]);
v[2] = make_float4(keys[k1].x, keys[k1].y, keys[k1].z, curve_radius[k1]);
v[3] = make_float4(keys[kb].x, keys[kb].y, keys[kb].z, curve_radius[kb]);
# else
const float4 px = make_float4(keys[ka].x, keys[k0].x, keys[k1].x, keys[kb].x);
const float4 py = make_float4(keys[ka].y, keys[k0].y, keys[k1].y, keys[kb].y);
const float4 pz = make_float4(keys[ka].z, keys[k0].z, keys[k1].z, keys[kb].z);
const float4 pw = make_float4(
curve_radius[ka], curve_radius[k0], curve_radius[k1], curve_radius[kb]);
/* Convert Catmull-Rom data to Bezier spline. */
/* Convert Catmull-Rom data to B-spline. */
static const float4 cr2bsp0 = make_float4(+7, -4, +5, -2) / 6.f;
static const float4 cr2bsp1 = make_float4(-2, 11, -4, +1) / 6.f;
static const float4 cr2bsp2 = make_float4(+1, -4, 11, -2) / 6.f;
static const float4 cr2bsp3 = make_float4(-2, +5, -4, +7) / 6.f;
index_data[i] = i * 4;
float4 *const v = vertex_data.data() + step * num_vertices + index_data[i];
v[0] = make_float4(
dot(cr2bsp0, px), dot(cr2bsp0, py), dot(cr2bsp0, pz), dot(cr2bsp0, pw));
v[1] = make_float4(
@@ -1200,6 +1262,7 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
dot(cr2bsp2, px), dot(cr2bsp2, py), dot(cr2bsp2, pz), dot(cr2bsp2, pw));
v[3] = make_float4(
dot(cr2bsp3, px), dot(cr2bsp3, py), dot(cr2bsp3, pz), dot(cr2bsp3, pw));
# endif
}
else {
BoundBox bounds = BoundBox::empty;
@@ -1241,7 +1304,11 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
OptixBuildInput build_input = {};
if (hair->curve_shape == CURVE_THICK) {
build_input.type = OPTIX_BUILD_INPUT_TYPE_CURVES;
# if OPTIX_ABI_VERSION >= 55
build_input.curveArray.curveType = OPTIX_PRIMITIVE_TYPE_ROUND_CATMULLROM;
# else
build_input.curveArray.curveType = OPTIX_PRIMITIVE_TYPE_ROUND_CUBIC_BSPLINE;
# endif
build_input.curveArray.numPrimitives = num_segments;
build_input.curveArray.vertexBuffers = (CUdeviceptr *)vertex_ptrs.data();
build_input.curveArray.numVertices = num_vertices;
@@ -1255,7 +1322,7 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
}
else {
/* Disable visibility test any-hit program, since it is already checked during
* intersection. Those trace calls that require anyhit can force it with a ray flag. */
* intersection. Those trace calls that require any-hit can force it with a ray flag. */
build_flags |= OPTIX_GEOMETRY_FLAG_DISABLE_ANYHIT;
build_input.type = OPTIX_BUILD_INPUT_TYPE_CUSTOM_PRIMITIVES;
@@ -1335,6 +1402,86 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
build_input.triangleArray.numSbtRecords = 1;
build_input.triangleArray.primitiveIndexOffset = mesh->prim_offset;
if (!build_optix_bvh(bvh_optix, operation, build_input, num_motion_steps)) {
progress.set_error("Failed to build OptiX acceleration structure");
}
}
else if (geom->geometry_type == Geometry::POINTCLOUD) {
/* Build BLAS for points primitives. */
PointCloud *const pointcloud = static_cast<PointCloud *const>(geom);
const size_t num_points = pointcloud->num_points();
if (num_points == 0) {
return;
}
size_t num_motion_steps = 1;
Attribute *motion_points = pointcloud->attributes.find(ATTR_STD_MOTION_VERTEX_POSITION);
if (motion_blur && pointcloud->get_use_motion_blur() && motion_points) {
num_motion_steps = pointcloud->get_motion_steps();
}
device_vector<OptixAabb> aabb_data(this, "optix temp aabb data", MEM_READ_ONLY);
aabb_data.alloc(num_points * num_motion_steps);
/* Get AABBs for each motion step. */
for (size_t step = 0; step < num_motion_steps; ++step) {
/* The center step for motion vertices is not stored in the attribute. */
const float3 *points = pointcloud->get_points().data();
const float *radius = pointcloud->get_radius().data();
size_t center_step = (num_motion_steps - 1) / 2;
if (step != center_step) {
size_t attr_offset = (step > center_step) ? step - 1 : step;
/* Technically this is a float4 array, but sizeof(float3) == sizeof(float4). */
points = motion_points->data_float3() + attr_offset * num_points;
}
for (size_t i = 0; i < num_points; ++i) {
const PointCloud::Point point = pointcloud->get_point(i);
BoundBox bounds = BoundBox::empty;
point.bounds_grow(points, radius, bounds);
const size_t index = step * num_points + i;
aabb_data[index].minX = bounds.min.x;
aabb_data[index].minY = bounds.min.y;
aabb_data[index].minZ = bounds.min.z;
aabb_data[index].maxX = bounds.max.x;
aabb_data[index].maxY = bounds.max.y;
aabb_data[index].maxZ = bounds.max.z;
}
}
/* Upload AABB data to GPU. */
aabb_data.copy_to_device();
vector<device_ptr> aabb_ptrs;
aabb_ptrs.reserve(num_motion_steps);
for (size_t step = 0; step < num_motion_steps; ++step) {
aabb_ptrs.push_back(aabb_data.device_pointer + step * num_points * sizeof(OptixAabb));
}
/* Disable visibility test any-hit program, since it is already checked during
* intersection. Those trace calls that require anyhit can force it with a ray flag.
* For those, force a single any-hit call, so shadow record-all behavior works correctly. */
unsigned int build_flags = OPTIX_GEOMETRY_FLAG_DISABLE_ANYHIT |
OPTIX_GEOMETRY_FLAG_REQUIRE_SINGLE_ANYHIT_CALL;
OptixBuildInput build_input = {};
build_input.type = OPTIX_BUILD_INPUT_TYPE_CUSTOM_PRIMITIVES;
# if OPTIX_ABI_VERSION < 23
build_input.aabbArray.aabbBuffers = (CUdeviceptr *)aabb_ptrs.data();
build_input.aabbArray.numPrimitives = num_points;
build_input.aabbArray.strideInBytes = sizeof(OptixAabb);
build_input.aabbArray.flags = &build_flags;
build_input.aabbArray.numSbtRecords = 1;
build_input.aabbArray.primitiveIndexOffset = pointcloud->prim_offset;
# else
build_input.customPrimitiveArray.aabbBuffers = (CUdeviceptr *)aabb_ptrs.data();
build_input.customPrimitiveArray.numPrimitives = num_points;
build_input.customPrimitiveArray.strideInBytes = sizeof(OptixAabb);
build_input.customPrimitiveArray.flags = &build_flags;
build_input.customPrimitiveArray.numSbtRecords = 1;
build_input.customPrimitiveArray.primitiveIndexOffset = pointcloud->prim_offset;
# endif
if (!build_optix_bvh(bvh_optix, operation, build_input, num_motion_steps)) {
progress.set_error("Failed to build OptiX acceleration structure");
}
@@ -1422,9 +1569,22 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
instance.sbtOffset = PG_HITD_MOTION - PG_HITD;
}
}
else {
else if (ob->get_geometry()->geometry_type == Geometry::POINTCLOUD) {
/* Use the hit group that has an intersection program for point clouds. */
instance.sbtOffset = PG_HITD_POINTCLOUD - PG_HITD;
/* Also skip point clouds in local trace calls. */
instance.visibilityMask |= 4;
}
# if OPTIX_ABI_VERSION < 55
/* Cannot disable any-hit program for thick curves, since it needs to filter out end-caps. */
else
# endif
{
/* Can disable __anyhit__kernel_optix_visibility_test by default (except for thick curves,
* since it needs to filter out end-caps there).
* It is enabled where necessary (visibility mask exceeds 8 bits or the other any-hit
* programs like __anyhit__kernel_optix_shadow_all_hit) via OPTIX_RAY_FLAG_ENFORCE_ANYHIT.
*/
@@ -1494,9 +1654,6 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
cuMemcpyHtoD(motion_transform_gpu, &motion_transform, motion_transform_size);
delete[] reinterpret_cast<uint8_t *>(&motion_transform);
/* Disable instance transform if object uses motion transform already. */
instance.flags |= OPTIX_INSTANCE_FLAG_DISABLE_TRANSFORM;
/* Get traversable handle to motion transform. */
optixConvertPointerToTraversableHandle(context,
motion_transform_gpu,
@@ -1510,10 +1667,6 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
/* Set transform matrix. */
memcpy(instance.transform, &ob->get_tfm(), sizeof(instance.transform));
}
else {
/* Disable instance transform if geometry already has it applied to vertex data. */
instance.flags |= OPTIX_INSTANCE_FLAG_DISABLE_TRANSFORM;
}
}
}

View File

@@ -44,6 +44,8 @@ enum {
PG_HITV, /* __VOLUME__ hit group. */
PG_HITD_MOTION,
PG_HITS_MOTION,
PG_HITD_POINTCLOUD,
PG_HITS_POINTCLOUD,
PG_CALL_SVM_AO,
PG_CALL_SVM_BEVEL,
NUM_PROGRAM_GROUPS
@@ -52,9 +54,9 @@ enum {
static const int MISS_PROGRAM_GROUP_OFFSET = PG_MISS;
static const int NUM_MIS_PROGRAM_GROUPS = 1;
static const int HIT_PROGAM_GROUP_OFFSET = PG_HITD;
static const int NUM_HIT_PROGRAM_GROUPS = 6;
static const int NUM_HIT_PROGRAM_GROUPS = 8;
static const int CALLABLE_PROGRAM_GROUPS_BASE = PG_CALL_SVM_AO;
static const int NUM_CALLABLE_PROGRAM_GROUPS = 3;
static const int NUM_CALLABLE_PROGRAM_GROUPS = 2;
/* List of OptiX pipelines. */
enum { PIP_SHADE_RAYTRACE, PIP_INTERSECT, NUM_PIPELINES };
@@ -98,8 +100,7 @@ class OptiXDevice : public CUDADevice {
/* OptiX denoiser state and scratch buffers, stored in a single memory buffer.
* The memory layout goes as following: [denoiser state][scratch buffer]. */
device_only_memory<unsigned char> state;
size_t scratch_offset = 0;
size_t scratch_size = 0;
OptixDenoiserSizes sizes = {};
bool use_pass_albedo = false;
bool use_pass_normal = false;

View File

@@ -47,7 +47,9 @@ static bool is_optix_specific_kernel(DeviceKernel kernel)
kernel == DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK);
}
bool OptiXDeviceQueue::enqueue(DeviceKernel kernel, const int work_size, void *args[])
bool OptiXDeviceQueue::enqueue(DeviceKernel kernel,
const int work_size,
DeviceKernelArguments const &args)
{
if (!is_optix_specific_kernel(kernel)) {
return CUDADeviceQueue::enqueue(kernel, work_size, args);
@@ -69,7 +71,7 @@ bool OptiXDeviceQueue::enqueue(DeviceKernel kernel, const int work_size, void *a
cuda_device_assert(
cuda_device_,
cuMemcpyHtoDAsync(launch_params_ptr + offsetof(KernelParamsOptiX, path_index_array),
args[0], // &d_path_index
args.values[0], // &d_path_index
sizeof(device_ptr),
cuda_stream_));
@@ -78,7 +80,7 @@ bool OptiXDeviceQueue::enqueue(DeviceKernel kernel, const int work_size, void *a
cuda_device_assert(
cuda_device_,
cuMemcpyHtoDAsync(launch_params_ptr + offsetof(KernelParamsOptiX, render_buffer),
args[1], // &d_render_buffer
args.values[1], // &d_render_buffer
sizeof(device_ptr),
cuda_stream_));
}

View File

@@ -31,7 +31,9 @@ class OptiXDeviceQueue : public CUDADeviceQueue {
virtual void init_execution() override;
virtual bool enqueue(DeviceKernel kernel, const int work_size, void *args[]) override;
virtual bool enqueue(DeviceKernel kernel,
const int work_size,
DeviceKernelArguments const &args) override;
};
CCL_NAMESPACE_END

View File

@@ -31,6 +31,72 @@ class device_memory;
struct KernelWorkTile;
/* Container for device kernel arguments with type correctness ensured by API. */
struct DeviceKernelArguments {
enum Type {
POINTER,
INT32,
FLOAT32,
BOOLEAN,
KERNEL_FILM_CONVERT,
};
static const int MAX_ARGS = 16;
Type types[MAX_ARGS];
void *values[MAX_ARGS];
size_t sizes[MAX_ARGS];
size_t count = 0;
DeviceKernelArguments()
{
}
template<class T> DeviceKernelArguments(const T *arg)
{
add(arg);
}
template<class T, class... Args> DeviceKernelArguments(const T *first, Args... args)
{
add(first);
add(args...);
}
void add(const KernelFilmConvert *value)
{
add(KERNEL_FILM_CONVERT, value, sizeof(KernelFilmConvert));
}
void add(const device_ptr *value)
{
add(POINTER, value, sizeof(device_ptr));
}
void add(const int32_t *value)
{
add(INT32, value, sizeof(int32_t));
}
void add(const float *value)
{
add(FLOAT32, value, sizeof(float));
}
void add(const bool *value)
{
add(BOOLEAN, value, 4);
}
void add(const Type type, const void *value, size_t size)
{
types[count] = type;
values[count] = (void *)value;
sizes[count] = size;
count++;
}
template<typename T, typename... Args> void add(const T *first, Args... args)
{
add(first);
add(args...);
}
};
/* Abstraction of a command queue for a device.
* Provides API to schedule kernel execution in a specific queue with minimal possible overhead
* from driver side.
@@ -66,7 +132,9 @@ class DeviceQueue {
* - int: pass pointer to the int
* - device memory: pass pointer to device_memory.device_pointer
* Return false if there was an error executing this or a previous kernel. */
virtual bool enqueue(DeviceKernel kernel, const int work_size, void *args[]) = 0;
virtual bool enqueue(DeviceKernel kernel,
const int work_size,
DeviceKernelArguments const &args) = 0;
/* Wait unit all enqueued kernels have finished execution.
* Return false if there was an error executing any of the enqueued kernels. */

View File

@@ -31,7 +31,7 @@ struct Node;
struct NodeType;
struct Transform;
/* Note: in the following macros we use "type const &" instead of "const type &"
/* NOTE: in the following macros we use "type const &" instead of "const type &"
* to avoid issues when pasting a pointer type. */
#define NODE_SOCKET_API_BASE_METHODS(type_, name, string_name) \
const SocketType *get_##name##_socket() const \

View File

@@ -54,30 +54,30 @@ void PassAccessorGPU::run_film_convert_kernels(DeviceKernel kernel,
if (destination.d_pixels) {
DCHECK_EQ(destination.stride, 0) << "Custom stride for float destination is not implemented.";
void *args[] = {const_cast<KernelFilmConvert *>(&kfilm_convert),
const_cast<device_ptr *>(&destination.d_pixels),
const_cast<device_ptr *>(&render_buffers->buffer.device_pointer),
const_cast<int *>(&work_size),
const_cast<int *>(&buffer_params.window_width),
const_cast<int *>(&offset),
const_cast<int *>(&buffer_params.stride),
const_cast<int *>(&destination.offset),
const_cast<int *>(&destination_stride)};
DeviceKernelArguments args(&kfilm_convert,
&destination.d_pixels,
&render_buffers->buffer.device_pointer,
&work_size,
&buffer_params.window_width,
&offset,
&buffer_params.stride,
&destination.offset,
&destination_stride);
queue_->enqueue(kernel, work_size, args);
}
if (destination.d_pixels_half_rgba) {
const DeviceKernel kernel_half_float = static_cast<DeviceKernel>(kernel + 1);
void *args[] = {const_cast<KernelFilmConvert *>(&kfilm_convert),
const_cast<device_ptr *>(&destination.d_pixels_half_rgba),
const_cast<device_ptr *>(&render_buffers->buffer.device_pointer),
const_cast<int *>(&work_size),
const_cast<int *>(&buffer_params.window_width),
const_cast<int *>(&offset),
const_cast<int *>(&buffer_params.stride),
const_cast<int *>(&destination.offset),
const_cast<int *>(&destination_stride)};
DeviceKernelArguments args(&kfilm_convert,
&destination.d_pixels_half_rgba,
&render_buffers->buffer.device_pointer,
&work_size,
&buffer_params.window_width,
&offset,
&buffer_params.stride,
&destination.offset,
&destination_stride);
queue_->enqueue(kernel_half_float, work_size, args);
}

View File

@@ -482,7 +482,11 @@ void PathTrace::set_denoiser_params(const DenoiseParams &params)
}
denoiser_ = Denoiser::create(device_, params);
denoiser_->is_cancelled_cb = [this]() { return is_cancel_requested(); };
/* Only take into account the "immediate" cancel to have interactive rendering responding to
* navigation as quickly as possible, but allow to run denoiser after user hit Esc button while
* doing offline rendering. */
denoiser_->is_cancelled_cb = [this]() { return render_cancel_.is_requested; };
}
void PathTrace::set_adaptive_sampling(const AdaptiveSampling &adaptive_sampling)
@@ -1089,6 +1093,8 @@ static const char *device_type_for_description(const DeviceType type)
return "Dummy";
case DEVICE_MULTI:
return "Multi";
case DEVICE_METAL:
return "Metal";
}
return "UNKNOWN";

View File

@@ -258,7 +258,8 @@ void PathTraceWorkGPU::render_samples(RenderStatistics &statistics,
* become busy after adding new tiles). This is especially important for the shadow catcher which
* schedules work in halves of available number of paths. */
work_tile_scheduler_.set_max_num_path_states(max_num_paths_ / 8);
work_tile_scheduler_.set_accelerated_rt((device_->get_bvh_layout_mask() & BVH_LAYOUT_OPTIX) !=
0);
work_tile_scheduler_.reset(effective_buffer_params_,
start_sample,
samples_num,
@@ -333,7 +334,8 @@ DeviceKernel PathTraceWorkGPU::get_most_queued_kernel() const
void PathTraceWorkGPU::enqueue_reset()
{
void *args[] = {&max_num_paths_};
DeviceKernelArguments args(&max_num_paths_);
queue_->enqueue(DEVICE_KERNEL_INTEGRATOR_RESET, max_num_paths_, args);
queue_->zero_to_device(integrator_queue_counter_);
queue_->zero_to_device(integrator_shader_sort_counter_);
@@ -404,7 +406,7 @@ bool PathTraceWorkGPU::enqueue_path_iteration()
void PathTraceWorkGPU::enqueue_path_iteration(DeviceKernel kernel, const int num_paths_limit)
{
void *d_path_index = (void *)NULL;
device_ptr d_path_index = 0;
/* Create array of path indices for which this kernel is queued to be executed. */
int work_size = kernel_max_active_main_path_index(kernel);
@@ -415,14 +417,14 @@ void PathTraceWorkGPU::enqueue_path_iteration(DeviceKernel kernel, const int num
if (kernel_uses_sorting(kernel)) {
/* Compute array of active paths, sorted by shader. */
work_size = num_queued;
d_path_index = (void *)queued_paths_.device_pointer;
d_path_index = queued_paths_.device_pointer;
compute_sorted_queued_paths(
DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY, kernel, num_paths_limit);
}
else if (num_queued < work_size) {
work_size = num_queued;
d_path_index = (void *)queued_paths_.device_pointer;
d_path_index = queued_paths_.device_pointer;
if (kernel_is_shadow_path(kernel)) {
/* Compute array of active shadow paths for specific kernel. */
@@ -441,8 +443,7 @@ void PathTraceWorkGPU::enqueue_path_iteration(DeviceKernel kernel, const int num
switch (kernel) {
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST: {
/* Closest ray intersection kernels with integrator state and render buffer. */
void *d_render_buffer = (void *)buffers_->buffer.device_pointer;
void *args[] = {&d_path_index, &d_render_buffer, const_cast<int *>(&work_size)};
DeviceKernelArguments args(&d_path_index, &buffers_->buffer.device_pointer, &work_size);
queue_->enqueue(kernel, work_size, args);
break;
@@ -452,7 +453,7 @@ void PathTraceWorkGPU::enqueue_path_iteration(DeviceKernel kernel, const int num
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE:
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK: {
/* Ray intersection kernels with integrator state. */
void *args[] = {&d_path_index, const_cast<int *>(&work_size)};
DeviceKernelArguments args(&d_path_index, &work_size);
queue_->enqueue(kernel, work_size, args);
break;
@@ -464,8 +465,7 @@ void PathTraceWorkGPU::enqueue_path_iteration(DeviceKernel kernel, const int num
case DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE:
case DEVICE_KERNEL_INTEGRATOR_SHADE_VOLUME: {
/* Shading kernels with integrator state and render buffer. */
void *d_render_buffer = (void *)buffers_->buffer.device_pointer;
void *args[] = {&d_path_index, &d_render_buffer, const_cast<int *>(&work_size)};
DeviceKernelArguments args(&d_path_index, &buffers_->buffer.device_pointer, &work_size);
queue_->enqueue(kernel, work_size, args);
break;
@@ -483,15 +483,17 @@ void PathTraceWorkGPU::compute_sorted_queued_paths(DeviceKernel kernel,
const int num_paths_limit)
{
int d_queued_kernel = queued_kernel;
void *d_counter = integrator_state_gpu_.sort_key_counter[d_queued_kernel];
void *d_prefix_sum = (void *)integrator_shader_sort_prefix_sum_.device_pointer;
assert(d_counter != nullptr && d_prefix_sum != nullptr);
device_ptr d_counter = (device_ptr)integrator_state_gpu_.sort_key_counter[d_queued_kernel];
device_ptr d_prefix_sum = integrator_shader_sort_prefix_sum_.device_pointer;
assert(d_counter != 0 && d_prefix_sum != 0);
/* Compute prefix sum of number of active paths with each shader. */
{
const int work_size = 1;
int max_shaders = device_scene_->data.max_shaders;
void *args[] = {&d_counter, &d_prefix_sum, &max_shaders};
DeviceKernelArguments args(&d_counter, &d_prefix_sum, &max_shaders);
queue_->enqueue(DEVICE_KERNEL_PREFIX_SUM, work_size, args);
}
@@ -506,15 +508,16 @@ void PathTraceWorkGPU::compute_sorted_queued_paths(DeviceKernel kernel,
* end of the array since compaction would need to do less work. */
const int work_size = kernel_max_active_main_path_index(queued_kernel);
void *d_queued_paths = (void *)queued_paths_.device_pointer;
void *d_num_queued_paths = (void *)num_queued_paths_.device_pointer;
void *args[] = {const_cast<int *>(&work_size),
const_cast<int *>(&num_paths_limit),
&d_queued_paths,
&d_num_queued_paths,
&d_counter,
&d_prefix_sum,
&d_queued_kernel};
device_ptr d_queued_paths = queued_paths_.device_pointer;
device_ptr d_num_queued_paths = num_queued_paths_.device_pointer;
DeviceKernelArguments args(&work_size,
&num_paths_limit,
&d_queued_paths,
&d_num_queued_paths,
&d_counter,
&d_prefix_sum,
&d_queued_kernel);
queue_->enqueue(kernel, work_size, args);
}
@@ -526,10 +529,10 @@ void PathTraceWorkGPU::compute_queued_paths(DeviceKernel kernel, DeviceKernel qu
/* Launch kernel to fill the active paths arrays. */
const int work_size = kernel_max_active_main_path_index(queued_kernel);
void *d_queued_paths = (void *)queued_paths_.device_pointer;
void *d_num_queued_paths = (void *)num_queued_paths_.device_pointer;
void *args[] = {
const_cast<int *>(&work_size), &d_queued_paths, &d_num_queued_paths, &d_queued_kernel};
device_ptr d_queued_paths = queued_paths_.device_pointer;
device_ptr d_num_queued_paths = num_queued_paths_.device_pointer;
DeviceKernelArguments args(&work_size, &d_queued_paths, &d_num_queued_paths, &d_queued_kernel);
queue_->zero_to_device(num_queued_paths_);
queue_->enqueue(kernel, work_size, args);
@@ -605,15 +608,17 @@ void PathTraceWorkGPU::compact_paths(const int num_active_paths,
{
/* Compact fragmented path states into the start of the array, moving any paths
* with index higher than the number of active paths into the gaps. */
void *d_compact_paths = (void *)queued_paths_.device_pointer;
void *d_num_queued_paths = (void *)num_queued_paths_.device_pointer;
device_ptr d_compact_paths = queued_paths_.device_pointer;
device_ptr d_num_queued_paths = num_queued_paths_.device_pointer;
/* Create array with terminated paths that we can write to. */
{
/* TODO: can the work size be reduced here? */
int offset = num_active_paths;
int work_size = num_active_paths;
void *args[] = {&work_size, &d_compact_paths, &d_num_queued_paths, &offset};
DeviceKernelArguments args(&work_size, &d_compact_paths, &d_num_queued_paths, &offset);
queue_->zero_to_device(num_queued_paths_);
queue_->enqueue(terminated_paths_kernel, work_size, args);
}
@@ -622,8 +627,10 @@ void PathTraceWorkGPU::compact_paths(const int num_active_paths,
* than the number of active paths. */
{
int work_size = max_active_path_index;
void *args[] = {
&work_size, &d_compact_paths, &d_num_queued_paths, const_cast<int *>(&num_active_paths)};
DeviceKernelArguments args(
&work_size, &d_compact_paths, &d_num_queued_paths, &num_active_paths);
queue_->zero_to_device(num_queued_paths_);
queue_->enqueue(compact_paths_kernel, work_size, args);
}
@@ -638,8 +645,10 @@ void PathTraceWorkGPU::compact_paths(const int num_active_paths,
int work_size = num_compact_paths;
int active_states_offset = 0;
int terminated_states_offset = num_active_paths;
void *args[] = {
&d_compact_paths, &active_states_offset, &terminated_states_offset, &work_size};
DeviceKernelArguments args(
&d_compact_paths, &active_states_offset, &terminated_states_offset, &work_size);
queue_->enqueue(compact_kernel, work_size, args);
}
}
@@ -768,14 +777,12 @@ void PathTraceWorkGPU::enqueue_work_tiles(DeviceKernel kernel,
queue_->copy_to_device(work_tiles_);
void *d_work_tiles = (void *)work_tiles_.device_pointer;
void *d_render_buffer = (void *)buffers_->buffer.device_pointer;
device_ptr d_work_tiles = work_tiles_.device_pointer;
device_ptr d_render_buffer = buffers_->buffer.device_pointer;
/* Launch kernel. */
void *args[] = {&d_work_tiles,
const_cast<int *>(&num_work_tiles),
&d_render_buffer,
const_cast<int *>(&max_tile_work_size)};
DeviceKernelArguments args(
&d_work_tiles, &num_work_tiles, &d_render_buffer, &max_tile_work_size);
queue_->enqueue(kernel, max_tile_work_size * num_work_tiles, args);
@@ -965,16 +972,16 @@ int PathTraceWorkGPU::adaptive_sampling_convergence_check_count_active(float thr
const int work_size = effective_buffer_params_.width * effective_buffer_params_.height;
void *args[] = {&buffers_->buffer.device_pointer,
const_cast<int *>(&effective_buffer_params_.full_x),
const_cast<int *>(&effective_buffer_params_.full_y),
const_cast<int *>(&effective_buffer_params_.width),
const_cast<int *>(&effective_buffer_params_.height),
&threshold,
&reset,
&effective_buffer_params_.offset,
&effective_buffer_params_.stride,
&num_active_pixels.device_pointer};
DeviceKernelArguments args(&buffers_->buffer.device_pointer,
&effective_buffer_params_.full_x,
&effective_buffer_params_.full_y,
&effective_buffer_params_.width,
&effective_buffer_params_.height,
&threshold,
&reset,
&effective_buffer_params_.offset,
&effective_buffer_params_.stride,
&num_active_pixels.device_pointer);
queue_->enqueue(DEVICE_KERNEL_ADAPTIVE_SAMPLING_CONVERGENCE_CHECK, work_size, args);
@@ -988,13 +995,13 @@ void PathTraceWorkGPU::enqueue_adaptive_sampling_filter_x()
{
const int work_size = effective_buffer_params_.height;
void *args[] = {&buffers_->buffer.device_pointer,
&effective_buffer_params_.full_x,
&effective_buffer_params_.full_y,
&effective_buffer_params_.width,
&effective_buffer_params_.height,
&effective_buffer_params_.offset,
&effective_buffer_params_.stride};
DeviceKernelArguments args(&buffers_->buffer.device_pointer,
&effective_buffer_params_.full_x,
&effective_buffer_params_.full_y,
&effective_buffer_params_.width,
&effective_buffer_params_.height,
&effective_buffer_params_.offset,
&effective_buffer_params_.stride);
queue_->enqueue(DEVICE_KERNEL_ADAPTIVE_SAMPLING_CONVERGENCE_FILTER_X, work_size, args);
}
@@ -1003,13 +1010,13 @@ void PathTraceWorkGPU::enqueue_adaptive_sampling_filter_y()
{
const int work_size = effective_buffer_params_.width;
void *args[] = {&buffers_->buffer.device_pointer,
&effective_buffer_params_.full_x,
&effective_buffer_params_.full_y,
&effective_buffer_params_.width,
&effective_buffer_params_.height,
&effective_buffer_params_.offset,
&effective_buffer_params_.stride};
DeviceKernelArguments args(&buffers_->buffer.device_pointer,
&effective_buffer_params_.full_x,
&effective_buffer_params_.full_y,
&effective_buffer_params_.width,
&effective_buffer_params_.height,
&effective_buffer_params_.offset,
&effective_buffer_params_.stride);
queue_->enqueue(DEVICE_KERNEL_ADAPTIVE_SAMPLING_CONVERGENCE_FILTER_Y, work_size, args);
}
@@ -1018,10 +1025,10 @@ void PathTraceWorkGPU::cryptomatte_postproces()
{
const int work_size = effective_buffer_params_.width * effective_buffer_params_.height;
void *args[] = {&buffers_->buffer.device_pointer,
const_cast<int *>(&work_size),
&effective_buffer_params_.offset,
&effective_buffer_params_.stride};
DeviceKernelArguments args(&buffers_->buffer.device_pointer,
&work_size,
&effective_buffer_params_.offset,
&effective_buffer_params_.stride);
queue_->enqueue(DEVICE_KERNEL_CRYPTOMATTE_POSTPROCESS, work_size, args);
}
@@ -1070,8 +1077,9 @@ int PathTraceWorkGPU::shadow_catcher_count_possible_splits()
queue_->zero_to_device(num_queued_paths_);
const int work_size = max_active_main_path_index_;
void *d_num_queued_paths = (void *)num_queued_paths_.device_pointer;
void *args[] = {const_cast<int *>(&work_size), &d_num_queued_paths};
device_ptr d_num_queued_paths = num_queued_paths_.device_pointer;
DeviceKernelArguments args(&work_size, &d_num_queued_paths);
queue_->enqueue(DEVICE_KERNEL_INTEGRATOR_SHADOW_CATCHER_COUNT_POSSIBLE_SPLITS, work_size, args);
queue_->copy_from_device(num_queued_paths_);

View File

@@ -112,7 +112,7 @@ int RenderScheduler::get_rendered_sample() const
{
DCHECK_GT(get_num_rendered_samples(), 0);
return start_sample_ + get_num_rendered_samples() - 1;
return start_sample_ + get_num_rendered_samples() - 1 - sample_offset_;
}
int RenderScheduler::get_num_rendered_samples() const
@@ -406,6 +406,9 @@ bool RenderScheduler::set_postprocess_render_work(RenderWork *render_work)
any_scheduled = true;
}
/* Force update. */
any_scheduled = true;
if (any_scheduled) {
render_work->display.update = true;
}
@@ -874,7 +877,8 @@ int RenderScheduler::get_num_samples_to_path_trace() const
* is to ensure that the final render is pixel-matched regardless of how many samples per second
* compute device can do. */
return adaptive_sampling_.align_samples(path_trace_start_sample, num_samples_to_render);
return adaptive_sampling_.align_samples(path_trace_start_sample - sample_offset_,
num_samples_to_render);
}
int RenderScheduler::get_num_samples_during_navigation(int resolution_divider) const

View File

@@ -158,14 +158,16 @@ bool ShaderEval::eval_gpu(Device *device,
/* Execute work on GPU in chunk, so we can cancel.
* TODO : query appropriate size from device.*/
const int64_t chunk_size = 65536;
const int32_t chunk_size = 65536;
void *d_input = (void *)input.device_pointer;
void *d_output = (void *)output.device_pointer;
device_ptr d_input = input.device_pointer;
device_ptr d_output = output.device_pointer;
for (int64_t d_offset = 0; d_offset < work_size; d_offset += chunk_size) {
int64_t d_work_size = std::min(chunk_size, work_size - d_offset);
void *args[] = {&d_input, &d_output, &d_offset, &d_work_size};
assert(work_size <= 0x7fffffff);
for (int32_t d_offset = 0; d_offset < int32_t(work_size); d_offset += chunk_size) {
int32_t d_work_size = std::min(chunk_size, int32_t(work_size) - d_offset);
DeviceKernelArguments args(&d_input, &d_output, &d_offset, &d_work_size);
queue->enqueue(kernel, d_work_size, args);
queue->synchronize();

View File

@@ -46,7 +46,8 @@ ccl_device_inline uint round_up_to_power_of_two(uint x)
return next_power_of_two(x);
}
TileSize tile_calculate_best_size(const int2 &image_size,
TileSize tile_calculate_best_size(const bool accel_rt,
const int2 &image_size,
const int num_samples,
const int max_num_path_states,
const float scrambling_distance)
@@ -73,7 +74,7 @@ TileSize tile_calculate_best_size(const int2 &image_size,
TileSize tile_size;
const int num_path_states_per_sample = max_num_path_states / num_samples;
if (scrambling_distance < 0.9f) {
if (scrambling_distance < 0.9f && accel_rt) {
/* Prefer large tiles for scrambling distance, bounded by max num path states. */
tile_size.width = min(image_size.x, max_num_path_states);
tile_size.height = min(image_size.y, max(max_num_path_states / tile_size.width, 1));

View File

@@ -49,7 +49,8 @@ std::ostream &operator<<(std::ostream &os, const TileSize &tile_size);
* of active path states.
* Will attempt to provide best guess to keep path tracing threads of a device as localized as
* possible, and have as many threads active for every tile as possible. */
TileSize tile_calculate_best_size(const int2 &image_size,
TileSize tile_calculate_best_size(const bool accel_rt,
const int2 &image_size,
const int num_samples,
const int max_num_path_states,
const float scrambling_distance);

View File

@@ -28,6 +28,11 @@ WorkTileScheduler::WorkTileScheduler()
{
}
void WorkTileScheduler::set_accelerated_rt(bool accelerated_rt)
{
accelerated_rt_ = accelerated_rt;
}
void WorkTileScheduler::set_max_num_path_states(int max_num_path_states)
{
max_num_path_states_ = max_num_path_states;
@@ -61,7 +66,7 @@ void WorkTileScheduler::reset(const BufferParams &buffer_params,
void WorkTileScheduler::reset_scheduler_state()
{
tile_size_ = tile_calculate_best_size(
image_size_px_, samples_num_, max_num_path_states_, scrambling_distance_);
accelerated_rt_, image_size_px_, samples_num_, max_num_path_states_, scrambling_distance_);
VLOG(3) << "Will schedule tiles of size " << tile_size_;

View File

@@ -31,6 +31,9 @@ class WorkTileScheduler {
public:
WorkTileScheduler();
/* To indicate if there is accelerated RT support. */
void set_accelerated_rt(bool state);
/* MAximum path states which are allowed to be used by a single scheduled work tile.
*
* Affects the scheduled work size: the work size will be as big as possible, but will not exceed
@@ -55,6 +58,9 @@ class WorkTileScheduler {
protected:
void reset_scheduler_state();
/* Used to indicate if there is accelerated ray tracing. */
bool accelerated_rt_ = false;
/* Maximum allowed path states to be used.
*
* TODO(sergey): Naming can be improved. The fact that this is a limiting factor based on the

View File

@@ -179,11 +179,14 @@ set(SRC_KERNEL_GEOM_HEADERS
geom/curve.h
geom/curve_intersect.h
geom/motion_curve.h
geom/motion_point.h
geom/motion_triangle.h
geom/motion_triangle_intersect.h
geom/motion_triangle_shader.h
geom/object.h
geom/patch.h
geom/point.h
geom/point_intersect.h
geom/primitive.h
geom/shader_data.h
geom/subd_triangle.h
@@ -207,6 +210,7 @@ set(SRC_KERNEL_BVH_HEADERS
bvh/volume.h
bvh/volume_all.h
bvh/embree.h
bvh/metal.h
)
set(SRC_KERNEL_CAMERA_HEADERS

View File

@@ -31,6 +31,10 @@
# include "kernel/bvh/embree.h"
#endif
#ifdef __METALRT__
# include "kernel/bvh/metal.h"
#endif
#include "kernel/bvh/types.h"
#include "kernel/bvh/util.h"
@@ -38,31 +42,31 @@
CCL_NAMESPACE_BEGIN
#ifndef __KERNEL_OPTIX__
#if !defined(__KERNEL_GPU_RAYTRACING__)
/* Regular BVH traversal */
# include "kernel/bvh/nodes.h"
# define BVH_FUNCTION_NAME bvh_intersect
# define BVH_FUNCTION_FEATURES 0
# define BVH_FUNCTION_FEATURES BVH_POINTCLOUD
# include "kernel/bvh/traversal.h"
# if defined(__HAIR__)
# define BVH_FUNCTION_NAME bvh_intersect_hair
# define BVH_FUNCTION_FEATURES BVH_HAIR
# define BVH_FUNCTION_FEATURES BVH_HAIR | BVH_POINTCLOUD
# include "kernel/bvh/traversal.h"
# endif
# if defined(__OBJECT_MOTION__)
# define BVH_FUNCTION_NAME bvh_intersect_motion
# define BVH_FUNCTION_FEATURES BVH_MOTION
# define BVH_FUNCTION_FEATURES BVH_MOTION | BVH_POINTCLOUD
# include "kernel/bvh/traversal.h"
# endif
# if defined(__HAIR__) && defined(__OBJECT_MOTION__)
# define BVH_FUNCTION_NAME bvh_intersect_hair_motion
# define BVH_FUNCTION_FEATURES BVH_HAIR | BVH_MOTION
# define BVH_FUNCTION_FEATURES BVH_HAIR | BVH_MOTION | BVH_POINTCLOUD
# include "kernel/bvh/traversal.h"
# endif
@@ -98,26 +102,27 @@ CCL_NAMESPACE_BEGIN
# if defined(__SHADOW_RECORD_ALL__)
# define BVH_FUNCTION_NAME bvh_intersect_shadow_all
# define BVH_FUNCTION_FEATURES 0
# define BVH_FUNCTION_FEATURES BVH_POINTCLOUD
# include "kernel/bvh/shadow_all.h"
# if defined(__HAIR__)
# define BVH_FUNCTION_NAME bvh_intersect_shadow_all_hair
# define BVH_FUNCTION_FEATURES BVH_HAIR
# define BVH_FUNCTION_FEATURES BVH_HAIR | BVH_POINTCLOUD
# include "kernel/bvh/shadow_all.h"
# endif
# if defined(__OBJECT_MOTION__)
# define BVH_FUNCTION_NAME bvh_intersect_shadow_all_motion
# define BVH_FUNCTION_FEATURES BVH_MOTION
# define BVH_FUNCTION_FEATURES BVH_MOTION | BVH_POINTCLOUD
# include "kernel/bvh/shadow_all.h"
# endif
# if defined(__HAIR__) && defined(__OBJECT_MOTION__)
# define BVH_FUNCTION_NAME bvh_intersect_shadow_all_hair_motion
# define BVH_FUNCTION_FEATURES BVH_HAIR | BVH_MOTION
# define BVH_FUNCTION_FEATURES BVH_HAIR | BVH_MOTION | BVH_POINTCLOUD
# include "kernel/bvh/shadow_all.h"
# endif
# endif /* __SHADOW_RECORD_ALL__ */
/* Record all intersections - Volume BVH traversal. */
@@ -139,7 +144,7 @@ CCL_NAMESPACE_BEGIN
# undef BVH_NAME_EVAL
# undef BVH_FUNCTION_FULL_NAME
#endif /* __KERNEL_OPTIX__ */
#endif /* !defined(__KERNEL_GPU_RAYTRACING__) */
ccl_device_inline bool scene_intersect_valid(ccl_private const Ray *ray)
{
@@ -205,7 +210,95 @@ ccl_device_intersect bool scene_intersect(KernelGlobals kg,
isect->type = p5;
return p5 != PRIMITIVE_NONE;
#else /* __KERNEL_OPTIX__ */
#elif defined(__METALRT__)
if (!scene_intersect_valid(ray)) {
isect->t = ray->t;
isect->type = PRIMITIVE_NONE;
return false;
}
# if defined(__KERNEL_DEBUG__)
if (is_null_instance_acceleration_structure(metal_ancillaries->accel_struct)) {
isect->t = ray->t;
isect->type = PRIMITIVE_NONE;
kernel_assert(!"Invalid metal_ancillaries->accel_struct pointer");
return false;
}
if (is_null_intersection_function_table(metal_ancillaries->ift_default)) {
isect->t = ray->t;
isect->type = PRIMITIVE_NONE;
kernel_assert(!"Invalid ift_default");
return false;
}
# endif
metal::raytracing::ray r(ray->P, ray->D, 0.0f, ray->t);
metalrt_intersector_type metalrt_intersect;
if (!kernel_data.bvh.have_curves) {
metalrt_intersect.assume_geometry_type(metal::raytracing::geometry_type::triangle);
}
MetalRTIntersectionPayload payload;
payload.u = 0.0f;
payload.v = 0.0f;
payload.visibility = visibility;
typename metalrt_intersector_type::result_type intersection;
uint ray_mask = visibility & 0xFF;
if (0 == ray_mask && (visibility & ~0xFF) != 0) {
ray_mask = 0xFF;
/* No further intersector setup required: Default MetalRT behavior is any-hit. */
}
else if (visibility & PATH_RAY_SHADOW_OPAQUE) {
/* No further intersector setup required: Shadow ray early termination is controlled by the
* intersection handler */
}
# if defined(__METALRT_MOTION__)
payload.time = ray->time;
intersection = metalrt_intersect.intersect(r,
metal_ancillaries->accel_struct,
ray_mask,
ray->time,
metal_ancillaries->ift_default,
payload);
# else
intersection = metalrt_intersect.intersect(
r, metal_ancillaries->accel_struct, ray_mask, metal_ancillaries->ift_default, payload);
# endif
if (intersection.type == intersection_type::none) {
isect->t = ray->t;
isect->type = PRIMITIVE_NONE;
return false;
}
isect->t = intersection.distance;
isect->prim = payload.prim;
isect->type = payload.type;
isect->object = intersection.user_instance_id;
isect->t = intersection.distance;
if (intersection.type == intersection_type::triangle) {
isect->u = 1.0f - intersection.triangle_barycentric_coord.y -
intersection.triangle_barycentric_coord.x;
isect->v = intersection.triangle_barycentric_coord.x;
}
else {
isect->u = payload.u;
isect->v = payload.v;
}
return isect->type != PRIMITIVE_NONE;
#else
if (!scene_intersect_valid(ray)) {
return false;
}
@@ -289,7 +382,69 @@ ccl_device_intersect bool scene_intersect_local(KernelGlobals kg,
p5);
return p5;
# else /* __KERNEL_OPTIX__ */
# elif defined(__METALRT__)
if (!scene_intersect_valid(ray)) {
if (local_isect) {
local_isect->num_hits = 0;
}
return false;
}
# if defined(__KERNEL_DEBUG__)
if (is_null_instance_acceleration_structure(metal_ancillaries->accel_struct)) {
if (local_isect) {
local_isect->num_hits = 0;
}
kernel_assert(!"Invalid metal_ancillaries->accel_struct pointer");
return false;
}
if (is_null_intersection_function_table(metal_ancillaries->ift_local)) {
if (local_isect) {
local_isect->num_hits = 0;
}
kernel_assert(!"Invalid ift_local");
return false;
}
# endif
metal::raytracing::ray r(ray->P, ray->D, 0.0f, ray->t);
metalrt_intersector_type metalrt_intersect;
metalrt_intersect.force_opacity(metal::raytracing::forced_opacity::non_opaque);
if (!kernel_data.bvh.have_curves) {
metalrt_intersect.assume_geometry_type(metal::raytracing::geometry_type::triangle);
}
MetalRTIntersectionLocalPayload payload;
payload.local_object = local_object;
payload.max_hits = max_hits;
payload.local_isect.num_hits = 0;
if (lcg_state) {
payload.has_lcg_state = true;
payload.lcg_state = *lcg_state;
}
payload.result = false;
typename metalrt_intersector_type::result_type intersection;
# if defined(__METALRT_MOTION__)
intersection = metalrt_intersect.intersect(
r, metal_ancillaries->accel_struct, 0xFF, ray->time, metal_ancillaries->ift_local, payload);
# else
intersection = metalrt_intersect.intersect(
r, metal_ancillaries->accel_struct, 0xFF, metal_ancillaries->ift_local, payload);
# endif
if (lcg_state) {
*lcg_state = payload.lcg_state;
}
*local_isect = payload.local_isect;
return payload.result;
# else
if (!scene_intersect_valid(ray)) {
if (local_isect) {
local_isect->num_hits = 0;
@@ -406,7 +561,67 @@ ccl_device_intersect bool scene_intersect_shadow_all(KernelGlobals kg,
*throughput = __uint_as_float(p1);
return p5;
# else /* __KERNEL_OPTIX__ */
# elif defined(__METALRT__)
if (!scene_intersect_valid(ray)) {
return false;
}
# if defined(__KERNEL_DEBUG__)
if (is_null_instance_acceleration_structure(metal_ancillaries->accel_struct)) {
kernel_assert(!"Invalid metal_ancillaries->accel_struct pointer");
return false;
}
if (is_null_intersection_function_table(metal_ancillaries->ift_shadow)) {
kernel_assert(!"Invalid ift_shadow");
return false;
}
# endif
metal::raytracing::ray r(ray->P, ray->D, 0.0f, ray->t);
metalrt_intersector_type metalrt_intersect;
metalrt_intersect.force_opacity(metal::raytracing::forced_opacity::non_opaque);
if (!kernel_data.bvh.have_curves) {
metalrt_intersect.assume_geometry_type(metal::raytracing::geometry_type::triangle);
}
MetalRTIntersectionShadowPayload payload;
payload.visibility = visibility;
payload.max_hits = max_hits;
payload.num_hits = 0;
payload.num_recorded_hits = 0;
payload.throughput = 1.0f;
payload.result = false;
payload.state = state;
uint ray_mask = visibility & 0xFF;
if (0 == ray_mask && (visibility & ~0xFF) != 0) {
ray_mask = 0xFF;
}
typename metalrt_intersector_type::result_type intersection;
# if defined(__METALRT_MOTION__)
payload.time = ray->time;
intersection = metalrt_intersect.intersect(r,
metal_ancillaries->accel_struct,
ray_mask,
ray->time,
metal_ancillaries->ift_shadow,
payload);
# else
intersection = metalrt_intersect.intersect(
r, metal_ancillaries->accel_struct, ray_mask, metal_ancillaries->ift_shadow, payload);
# endif
*num_recorded_hits = payload.num_recorded_hits;
*throughput = payload.throughput;
return payload.result;
# else
if (!scene_intersect_valid(ray)) {
*num_recorded_hits = 0;
*throughput = 1.0f;
@@ -503,7 +718,76 @@ ccl_device_intersect bool scene_intersect_volume(KernelGlobals kg,
isect->type = p5;
return p5 != PRIMITIVE_NONE;
# else /* __KERNEL_OPTIX__ */
# elif defined(__METALRT__)
if (!scene_intersect_valid(ray)) {
return false;
}
# if defined(__KERNEL_DEBUG__)
if (is_null_instance_acceleration_structure(metal_ancillaries->accel_struct)) {
kernel_assert(!"Invalid metal_ancillaries->accel_struct pointer");
return false;
}
if (is_null_intersection_function_table(metal_ancillaries->ift_default)) {
kernel_assert(!"Invalid ift_default");
return false;
}
# endif
metal::raytracing::ray r(ray->P, ray->D, 0.0f, ray->t);
metalrt_intersector_type metalrt_intersect;
metalrt_intersect.force_opacity(metal::raytracing::forced_opacity::non_opaque);
if (!kernel_data.bvh.have_curves) {
metalrt_intersect.assume_geometry_type(metal::raytracing::geometry_type::triangle);
}
MetalRTIntersectionPayload payload;
payload.visibility = visibility;
typename metalrt_intersector_type::result_type intersection;
uint ray_mask = visibility & 0xFF;
if (0 == ray_mask && (visibility & ~0xFF) != 0) {
ray_mask = 0xFF;
}
# if defined(__METALRT_MOTION__)
payload.time = ray->time;
intersection = metalrt_intersect.intersect(r,
metal_ancillaries->accel_struct,
ray_mask,
ray->time,
metal_ancillaries->ift_default,
payload);
# else
intersection = metalrt_intersect.intersect(
r, metal_ancillaries->accel_struct, ray_mask, metal_ancillaries->ift_default, payload);
# endif
if (intersection.type == intersection_type::none) {
return false;
}
isect->prim = payload.prim;
isect->type = payload.type;
isect->object = intersection.user_instance_id;
isect->t = intersection.distance;
if (intersection.type == intersection_type::triangle) {
isect->u = 1.0f - intersection.triangle_barycentric_coord.y -
intersection.triangle_barycentric_coord.x;
isect->v = intersection.triangle_barycentric_coord.x;
}
else {
isect->u = payload.u;
isect->v = payload.v;
}
return isect->type != PRIMITIVE_NONE;
# else
if (!scene_intersect_valid(ray)) {
return false;
}

View File

@@ -0,0 +1,47 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
struct MetalRTIntersectionPayload {
uint visibility;
float u, v;
int prim;
int type;
#if defined(__METALRT_MOTION__)
float time;
#endif
};
struct MetalRTIntersectionLocalPayload {
uint local_object;
uint lcg_state;
short max_hits;
bool has_lcg_state;
bool result;
LocalIntersection local_isect;
};
struct MetalRTIntersectionShadowPayload {
uint visibility;
#if defined(__METALRT_MOTION__)
float time;
#endif
int state;
float throughput;
short max_hits;
short num_hits;
short num_recorded_hits;
bool result;
};

View File

@@ -28,6 +28,7 @@
* without new features slowing things down.
*
* BVH_HAIR: hair curve rendering
* BVH_POINTCLOUD: point cloud rendering
* BVH_MOTION: motion blur rendering
*/
@@ -173,7 +174,7 @@ ccl_device_inline
case PRIMITIVE_MOTION_CURVE_THICK:
case PRIMITIVE_CURVE_RIBBON:
case PRIMITIVE_MOTION_CURVE_RIBBON: {
if ((type & PRIMITIVE_ALL_MOTION) && kernel_data.bvh.use_bvh_steps) {
if ((type & PRIMITIVE_MOTION) && kernel_data.bvh.use_bvh_steps) {
const float2 prim_time = kernel_tex_fetch(__prim_time, prim_addr);
if (ray->time < prim_time.x || ray->time > prim_time.y) {
hit = false;
@@ -199,6 +200,34 @@ ccl_device_inline
break;
}
#endif
#if BVH_FEATURE(BVH_POINTCLOUD)
case PRIMITIVE_POINT:
case PRIMITIVE_MOTION_POINT: {
if ((type & PRIMITIVE_MOTION) && kernel_data.bvh.use_bvh_steps) {
const float2 prim_time = kernel_tex_fetch(__prim_time, prim_addr);
if (ray->time < prim_time.x || ray->time > prim_time.y) {
hit = false;
break;
}
}
const int point_object = (object == OBJECT_NONE) ?
kernel_tex_fetch(__prim_object, prim_addr) :
object;
const int point_prim = kernel_tex_fetch(__prim_index, prim_addr);
const int point_type = kernel_tex_fetch(__prim_type, prim_addr);
hit = point_intersect(kg,
&isect,
P,
dir,
t_max_current,
point_object,
point_prim,
ray->time,
point_type);
break;
}
#endif /* BVH_FEATURE(BVH_POINTCLOUD) */
default: {
hit = false;
break;
@@ -226,7 +255,7 @@ ccl_device_inline
bool record_intersection = true;
/* Always use baked shadow transparency for curves. */
if (isect.type & PRIMITIVE_ALL_CURVE) {
if (isect.type & PRIMITIVE_CURVE) {
*throughput *= intersection_curve_shadow_transparency(
kg, isect.object, isect.prim, isect.u);

Some files were not shown because too many files have changed in this diff Show More