1
1

Compare commits

..

288 Commits

Author SHA1 Message Date
c6f85d7a8d improve handling of incompatible enums 2021-11-10 11:47:36 +01:00
25f44ab631 update collection info node 2021-11-10 11:11:12 +01:00
1d729aa2f7 Merge branch 'master' into temp-enum-socket 2021-11-10 11:04:50 +01:00
aa440923c8 Fix: Hide selection value in resample curve node, fix order
Selection sockets are meant to come right after the geometry,
this was missed in review of rBa7672caeb255e3.
Also, the selection value was not hidden.
2021-11-09 17:23:11 -06:00
1f6010e609 Cleanup: Move info_stats.c to C++ 2021-11-09 15:57:58 -06:00
Michael Jones
3a4c8f406a Cycles: Adapt shared kernel/device/gpu layer for MSL
This patch adapts the shared kernel entrypoints so that they can be compiled as MSL (Metal Shading Language). Where possible, the adaptations avoid changes in common code.

In MSL, kernel function inputs are explicitly bound to resources. In the case of argument buffers, we declare a struct containing the kernel arguments, accessible via device pointer. This differs from CUDA and HIP where kernel function arguments are declared as traditional C-style function parameters. This patch adapts the entrypoints declared in kernel.h so that they can be translated via a new `ccl_gpu_kernel_signature` macro into the required parameter struct + kernel entrypoint pairing for MSL.

MSL buffer attribution must be applied to function parameters or non-static class data members. To allow universal access to the integrator state, kernel data, and texture fetch adapters, we wrap all of the shared kernel code in a `MetalKernelContext` class. This is achieved by bracketing the appropriate kernel headers with "context_begin.h" and "context_end.h" on Metal. When calling deeper into the kernel code, we must reference the context class (e.g. `context.integrator_init_from_camera`). This extra prefixing is performed by a set of defines in "context_end.h". These will require explicit maintenance if entrypoints change. We invite discussion on more maintainable ways to enforce correctness.

Lambda expressions are not supported on MSL, so a new `ccl_gpu_kernel_lambda` macro generates an inline function object and optionally capturing any required state. This yields the same behaviour. This approach is applied to all parallel_... implementations which are templated by operation. The lambda expressions in the film_convert... kernels don't adapt cleanly to use function objects. However, these entrypoints can be macro-generated more concisely to avoid lambda expressions entirely, instead relying on constant folding to handle the pixel/channel conversions.

A separate implementation of `gpu_parallel_active_index_array` is provided for Metal to workaround some subtle differences in SIMD width, and also to encapsulate some required thread parameters which must be declared as explicit entrypoint function parameters.

Ref T92212

Reviewed By: brecht

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D13109
2021-11-09 21:43:10 +00:00
4648c4990c Merge branch 'blender-v3.0-release' 2021-11-09 13:08:36 -06:00
b17561f8b2 support enum in interface socket 2021-11-09 20:00:05 +01:00
dea72dbb9d fixes 2021-11-09 19:54:42 +01:00
fe2cde8390 improve 2021-11-09 19:28:55 +01:00
74257d6ccb Merge branch 'master' into temp-enum-socket 2021-11-09 18:29:57 +01:00
a356e4fb3f Merge branch 'blender-v3.0-release' 2021-11-09 17:53:15 +01:00
7383f95443 Merge branch 'blender-v3.0-release' 2021-11-09 11:43:18 -05:00
ed0df0f3c6 Merge branch 'blender-v3.0-release' 2021-11-09 10:15:20 -06:00
e5fb5c9d7b Merge branch 'blender-v3.0-release' 2021-11-09 17:07:41 +01:00
Demeter Dzadik
e452c43fd6 Let Unlink Action operator have an undo step
I noticed while rigging a character and editing actions that the Unlink Action operator had no undo step. Doesn't feel intentional, so this patch adds the necessary flags.

Reviewed By: mont29

Differential Revision: https://developer.blender.org/D12346
2021-11-09 16:58:39 +01:00
0bdf9d10a4 Merge branch 'blender-v3.0-release' 2021-11-09 16:46:58 +01:00
fb0ae66ee5 Merge branch 'blender-v3.0-release' 2021-11-09 16:10:19 +01:00
9dc3f454d9 Merge branch 'blender-v3.0-release' 2021-11-09 11:27:25 -03:00
Demeter Dzadik
c092cc35b3 Expose BLI_string_flip_side_name as bpy.utils.flip_name
Expose a new function in `bpy.utils.flip_name(name, strip_number=False)
that allows flipping bone names, eg "Bone.L" -> "Bone.R".

Useful for add-ons to avoid re-implementing Blender's name flipping.

Ref D12322
2021-11-10 01:19:04 +11:00
accdd4c1bc Merge branch 'blender-v3.0-release' 2021-11-09 15:08:37 +01:00
625349a6bd Cleanup: spelling, C style comments 2021-11-10 00:56:17 +11:00
65bbac6692 Cleanup: clang-format 2021-11-10 00:55:38 +11:00
faeb2cc900 Merge branch 'blender-v3.0-release' 2021-11-09 14:49:47 +01:00
0bcf014bcf Merge branch 'blender-v3.0-release' 2021-11-10 00:38:51 +11:00
41b0820ddd Merge branch 'blender-v3.0-release' 2021-11-09 13:31:33 +01:00
6c24cafecc Fix T92876: Cycles incorrect volume emission + absorption handling 2021-11-09 13:13:56 +01:00
cb487b6507 Asset Catalogs: add test for proper shortening of simple names
Catalog simple names are supposed to fit into the DNA field `char
AssetMetaData::catalog_simple_name[64]`, and thus should be shortened
appropriately. This was already happening, but is now also covered by a
test.

No functional changes.
2021-11-09 13:13:30 +01:00
09f1be53d8 Fix T92950: spreadsheet shows 0 instances when there are instances
Fix found by @erik85.
2021-11-09 13:10:13 +01:00
Demeter Dzadik
de8e13036b Armature Make/Clear Parent: Grey out options that don't do anything
In armature edit mode, the Make/Clear Parent operators don't do anything
in various cases, but only one of these cases was previously indicated,
and it was indicated by hiding the option completely instead of graying
it out.

Clear Parent (Alt+P) problems fixed:
- "Clear Parent" option always showed up, even when none of the selected
  bones had a parent.
- "Disconnect Bone" option always showed up, even when use_connected on
  all selected bones was already false.

Make Parent (Ctrl+P) problems fixed:
- "Keep Offset" option didn't show up when all selected bones' parent
  was already the active bone. This was correct, and this patch tries to
  make all behaviours consistent with this.
- "Connected" option always showed up, even when all selected bones'
  parent was already the active bone, and they all had use_connect set
  to True.

With this patch all options show up all the time, but in cases where
they would do nothing, they will be grayed out.

Reviewed By: sybren

Differential Revision: https://developer.blender.org/D6100
2021-11-09 12:16:30 +01:00
Cody Winchester
4e2478940e Alembic: Allow exporting of animated vertex colors
Allow exporting of animated vertex colors to Alembic.

The changes are made to be in line with the way the UV Maps are written.
Each vertex color gets a OC4fGeomParam created and mapped into the
CDStreamConfig to avoid recreating the Param on each frame.

The time sample index is also stored in the config now and set onto the
UV and Vertex Color params each frame. Without this the exports would
get inconsistent timing results where animated UV maps and Vertex Colors
were not playing back at the original speed.

Reviewed By: sybren

Maniphest Tasks: T88074

Differential Revision: https://developer.blender.org/D11278
2021-11-09 10:54:13 +01:00
6b0a6c2ca9 Merge branch 'blender-v3.0-release' 2021-11-09 10:34:07 +01:00
ad679ee747 Merge branch 'blender-v3.0-release' 2021-11-09 10:07:44 +01:00
a7540f4b36 Merge branch 'blender-v3.0-release' 2021-11-09 17:11:35 +11:00
2772a033c9 Merge branch 'blender-v3.0-release' 2021-11-09 15:49:41 +11:00
8772a6fb9b Merge branch 'blender-v3.0-release' 2021-11-09 15:49:38 +11:00
Jeducious
d5d97e4169 Fix T92704: Redrawing while saving crashes outside the main thread
If the blend file is saved from a script in another thread,
like the render thread for example, Blender will crash on the call that
redraws the UI.

Ref D13140
2021-11-09 15:28:00 +11:00
e0dae0f98f Merge branch 'blender-v3.0-release' 2021-11-08 16:57:19 +01:00
fe2ed4a229 Merge branch 'blender-v3.0-release' 2021-11-08 18:44:39 +03:00
33beec1cec Cleanup: remove redundant arg when forcing zero initialization 2021-11-08 12:41:30 -03:00
e1c4e5df22 GPencil: New option to export PDF full scene
This new mode export all frames of the scene.

Reviewed By: pepeland

Differential Revision: https://developer.blender.org/D13055
2021-11-08 16:03:30 +01:00
b4d47523c2 evaluate enum node 2021-11-08 15:53:24 +01:00
c865577643 GPUTest: Add support to test on Windows.
On windows the OpenGL context wasn't activated when created, on Linux it
is. This patch will activate the context in gpu/draw test cases.
2021-11-08 15:49:51 +01:00
Jarrett Johnson
495e60c0da Basic engine shaders test
This patch adds shader compilation tests for the basic engine in `shaders_test.cc`

Addresses T92701

Reviewed By: jbakker

Differential Revision: https://developer.blender.org/D13066
2021-11-08 15:43:50 +01:00
d728c22181 add index outut 2021-11-08 15:24:26 +01:00
1c31d62951 fix 2021-11-08 15:24:21 +01:00
e6ba5ec37b progress 2021-11-08 14:50:48 +01:00
bb6547cb5f Merge branch 'blender-v3.0-release' 2021-11-09 00:25:49 +11:00
c55d0ebea5 Merge branch 'blender-v3.0-release' 2021-11-09 00:25:46 +11:00
62bd391187 improved inferencing 2021-11-08 13:51:57 +01:00
e736900e9a Merge branch 'master' into temp-enum-socket 2021-11-08 12:25:52 +01:00
Bastien Montagne
d6e2210935 Fix inconsistent creation of NodeTreeTypeUndefined type of node tree
Currently, when creating a new node tree ID, its `typeinfo` is set to
`NodeTreeTypeUndefined`, but its `type` enum value is left to `0`,
aka `NTREE_SHADER`.

This patch adds a new `NTREE_UNDIFINED` value, and use it for
`NodeTreeTypeUndefined` types of node trees.

NOTE: While it is not clear whether that actually fixes issues currently,
quite a bit of code still relies on the value of `type`, so think it
makes sense to sanitize this.

NOTE: Would have been ideal to reserve `0` value to undefined type,
but at this point this is not possible anymore, so chose to use `-2` instead.

Reviewed By: JacquesLucke

Differential Revision: https://developer.blender.org/D13123
2021-11-08 12:24:47 +01:00
fc373af8f5 Nodes: store socket declaration reference in socket
Previously, to get the declaration of a socket, one had to go
through `node->declaration`. Now this indirection is not necessary
anymore. This makes it easier to add more per-socket information
into the declaration and accessing it in various places.

Currently, this system is used by socket descriptions and node warnings
for unsupported geometry component types.
2021-11-08 12:24:01 +01:00
09cef0fc00 Fix ASAN issue in image editor.
Adding virtual constructors to base classes.
2021-11-08 11:37:42 +01:00
65548da002 fix missing update 2021-11-08 11:13:37 +01:00
3481d13104 Merge branch 'master' into temp-enum-socket 2021-11-08 11:02:09 +01:00
ee4966d146 Merge branch 'blender-v3.0-release' 2021-11-08 17:19:08 +11:00
c3f5fca8a2 Cleanup: avoid error prone struct declarations in C++
Reference struct members by name instead relying on their order.
This also simplifies moving back to named members when all compilers
we use support them.
2021-11-08 17:00:36 +11:00
de581a2302 Fix reading the 3rd value of 2D cursors when transforming
Out of bounds read and potential out-of-bounds write when transforming
the 2D cursor for image editor and sequencer.

While this didn't cause user visible bugs in my tests,
it's error prone and should be avoided.

Use TransData2D for 2D cursors.
2021-11-08 17:00:36 +11:00
fb4b737518 CMake: add missing headers to CMake lists 2021-11-08 17:00:36 +11:00
27b37517f8 Cleanup: use static sets 2021-11-08 17:00:36 +11:00
42df2a7b57 Cleanup: Grammar in comments 2021-11-07 23:20:29 -06:00
0f80602632 Cleanup: remove references to non-existent 'mtexpoly' 2021-11-08 14:14:16 +11:00
b24a03e635 Cleanup: remove duplicate doc-strings
Internal struct ObTfmBack had out of sync doc-strings
for members duplicated from Object.

Remove the doc-strings as there is this is just temporary storage.
2021-11-08 14:14:15 +11:00
ed24b7d9a2 Cleanup: spelling in comments 2021-11-08 14:14:14 +11:00
3b726cfee8 Cleanup: compiler warnings 2021-11-08 14:14:13 +11:00
0654c41b0c Cleanup: use doxygen for BLF glyph
- Use doxy formatted functions.
- Use doxy sections.
2021-11-08 13:37:51 +11:00
4f387e66ac Merge branch 'blender-v3.0-release' 2021-11-07 20:45:19 -03:00
23ef20855b progress 2021-11-07 13:42:25 +01:00
ebc81c6de4 progress 2021-11-07 13:07:00 +01:00
ffd8b05e20 initial enum inferencin 2021-11-07 12:07:38 +01:00
1006e84faa progress 2021-11-07 11:53:12 +01:00
892da668dc Cleanup: Clang tidy 2021-11-07 00:39:20 -05:00
dfb86671fe show enum labels in node 2021-11-06 19:34:24 +01:00
eddf5ad581 BLF: Refactor blf_glyph.c
Cleanup and Simplification of blf_glyph.c

See D13095 for details.

Differential Revision: https://developer.blender.org/D13095

Reviewed by Campbell Barton
2021-11-06 11:29:48 -07:00
2fb43d04cb support custom socket drawing 2021-11-06 19:26:58 +01:00
6002914f14 Cleanup: Convert mesh.c to C++
This commit renames mesh.c to mesh.cc and makes
it compile in C++. Can be useful in the future to be able
to use C++ functionality in existing and new functions.

Differential Revision: https://developer.blender.org/D13134
2021-11-06 19:16:37 +01:00
dad7371b41 progress 2021-11-06 19:11:52 +01:00
eaa9feb9a0 progress 2021-11-06 18:25:11 +01:00
d12eff1a88 improve enum storage 2021-11-06 18:15:55 +01:00
6069ff45c7 add more enum storage 2021-11-06 17:12:38 +01:00
f121713ece Merge branch 'master' into temp-enum-socket 2021-11-06 16:46:21 +01:00
f315a46982 Nodes: add preview image storage to node group
This is part of T92811.

Differential Revision: https://developer.blender.org/D13105
2021-11-06 16:43:26 +01:00
81baeec59b Cleanup: remove window_manager & editor includes from BLF
Remove the need to include the window manager & editor functions
in low level font rendering code.

- The default font size is now set when changed in the preferences.
- Flushing cache is set as a callback.
2021-11-06 16:28:02 +11:00
a804a11db1 Merge branch 'blender-v3.0-release' 2021-11-05 17:22:06 -05:00
48841c479f Merge branch 'blender-v3.0-release' 2021-11-05 18:00:01 -04:00
9e611c5616 Merge branch 'blender-v3.0-release' 2021-11-05 16:33:08 -05:00
f0bc7f3261 Merge branch 'blender-v3.0-release' 2021-11-05 20:40:02 +01:00
81bee0e75a UI: Fix minor theme mismatch
Pie menu got wrong item highlight and options settings were outdated.
2021-11-05 19:20:47 +01:00
3211c80a31 Fix T92815: Incorrect handling of evaluated meshes from curves
Evaluated meshes from curves are presented to render engines as
separate instance objects now, just like evaluated meshes from other
object types like point clouds and volumes. For that reason, cycles
should not consider curve objects as geometry (previously it did,
meaning it retrieved a second mesh from the curve object as well
as the temporary evaluated mesh geometry).

Further, avoid adding a curve object's evaluated mesh as data_eval,
since that is special behavior for meshes that is arbitrary. Adding an
evaluated mesh there but not an evalauted pointcloud is arbitrary,
for example. Retrieve the evaluated mesh in from the geometry set
in BKE_object_get_evaluated_mesh now, to support that change.

This gets us closer to a place where all of an object's evaluated data
is stored in geometry_set_eval, and we just have helper functions
to access specific geometry components.

Differential Revision: https://developer.blender.org/D13118
2021-11-05 19:20:47 +01:00
8d2a0d9b4c UI: Apply recent theme fixes for Preferences saved in 3.1 builds
Followup to e65230f0c0.

Pablo and I decided it's fine to reset themes again when saved with the
recent 3.1 builds.

This needed to be done a bit careful, since a normal version patch
resetting the theme would've reset the theme for anybody opening
preferences of a 3.0 build (even the final release build) in a 3.1
build. So make sure the theme is at least from a 3.1 build (but not
newer then this commit of course).
2021-11-05 19:00:10 +01:00
cc49c479a7 Cleanup: Use reference for non-optional C++ parameter
A reference makes clear that NULL is not an expected value. So it's the
prefered way of passing a `const` input parameter (at least if it may
not be cheap to copy).
2021-11-05 18:27:55 +01:00
aaf86bad87 Merge branch 'blender-v3.0-release' 2021-11-05 11:19:29 -05:00
212dcd6075 Merge branch 'blender-v3.0-release' 2021-11-05 10:57:04 -05:00
bbd8d33453 Merge branch 'blender-v3.0-release' 2021-11-05 16:23:56 +01:00
f415b41a94 Merge branch 'blender-v3.0-release' 2021-11-05 09:59:26 -05:00
625b2f59f0 Fix GCC warnings after own recent commit
Caused by 4e09fd76bc.
2021-11-05 15:32:11 +01:00
2986924301 Cleanup: Remove misleading comment
Python isn't doing any conversion here. We just do a regular lookup of
the given enum identifier in the RNA enum definition.
2021-11-05 15:32:11 +01:00
29efd26e71 Cleanup: Remove redundant scope qualifiers. 2021-11-05 15:18:58 +01:00
885c79915f Merge branch 'blender-v3.0-release' 2021-11-05 15:08:09 +01:00
a0f50c1890 Merge branch 'blender-v3.0-release' 2021-11-05 15:06:10 +01:00
4e09fd76bc Cleanup (UI): Add/use type for operator context enum
Adds a `wmOperatorCallContext` typedef for the existing `WM_OP_XXX`
operator context enum. This adds type safety, allows the compiler to
produce better warnings and helps understanding what a variable is for.

Differential Revision: https://developer.blender.org/D13113

Reviewed by: Campbell Barton
2021-11-05 14:57:26 +01:00
35198606d5 Cleanup: Remove wrong comments in versioning
This comment is from the block at the end of the versioning functions,
where we have an unversioned block to collect versioning code that
doesn't require immediate version bumping. The comment was probably just
copied over with the code when bumping the version eventually.
2021-11-05 14:53:57 +01:00
32c90d2d7c Cleanup: Split image engine into ImageEngine, SpaceAccessor and DrawingMode.
Image engine is used to draw an image into a space. The current
structure wasn't clear and couldn't be easilly extended. This refactor
spliced the image draw engine into 3 main components.

- Space accessors: contains an interface to communicate with space data
  (Image editor, UV Editor, Node Editor) in a common way. This reduced
  the branching in the code base.
- DrawingMode: contains an interface to the used tactic to draw an image
  inside the space framebuffer. Currently only one mode is implemented;
  in the future there could be a separate drawing mode for huge images.
- ImageEngine: the core that connects the draw manager with the space
  data and drawing mode.
2021-11-05 13:25:00 +01:00
d7f4fdf845 GPencil: Fix dash modifier missing vertex color.
The original code did not copy vertex color to the generated stroke,
now fixed.
2021-11-05 20:02:35 +08:00
016a575002 Merge branch 'blender-v3.0-release' 2021-11-05 16:05:00 +11:00
3a4dade2f8 Merge branch 'blender-v3.0-release' 2021-11-05 16:04:57 +11:00
b9968b83ad Merge branch 'blender-v3.0-release' 2021-11-05 16:04:52 +11:00
40d090cc72 Merge branch 'blender-v3.0-release' 2021-11-05 16:04:43 +11:00
5cc21b095a Merge branch 'blender-v3.0-release' 2021-11-05 16:04:40 +11:00
7061d1e39f Fix T92740: Missing lock around the image CacheLimiter
A recent change exposed this long-standing race. Simply protect the
MEM_CacheLimiter with its lock now. Additionally, guard against
unmanaging an already destroyed cache handle.

Ref T92740, T92838
2021-11-04 20:58:32 -07:00
1b6238edba Merge branch 'blender-v3.0-release' 2021-11-05 13:45:32 +11:00
5b7a14c019 Cleanup: use doxy sections in graph editor module 2021-11-05 13:38:58 +11:00
9d2e325694 Cleanup: spelling/typos 2021-11-05 13:38:58 +11:00
Christoph Lendenfeld
3364a5bea6 Cleanup: move code in graph_slider_ops
Future operators can use the same code,
so it is moved up to disassociate it from decimate

No functional changes

Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D12489
Ref: D12489
2021-11-04 21:35:28 +00:00
Christoph Lendenfeld
6986b43b3d Cleanup: renames in graph_slider_ops
This patch renames:
* tDecimateGraphOp to tGraphSliderOp
* dgo to gso (to match with the struct rename)
* decimate_reset_bezts to reset_bezts to indicate it can be used by other functions

No functional changes

Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D12490
Ref: D12490
2021-11-04 21:20:26 +00:00
df3e30398f Merge branch 'blender-v3.0-release' 2021-11-04 16:56:32 -03:00
556c71a84a GPencil: New option to Merge All layers in active one
This new option allows to combine all layers in the active one. Also the merge down option has been improved.

Reviewed By: mendio, pablo vazquez (UI)

Differential Revision: https://developer.blender.org/D13054
2021-11-04 19:40:55 +01:00
1c6d3d614a Minor tweaks to new append code.
* Name generated 'append' collection, instead of getting a generic
  meaningless name.
* Do not check if a collections's objects are already instantiated, when
  we already know that we want to instantiate that collection.
2021-11-04 19:00:22 +01:00
101fa4a425 Merge branch 'blender-v3.0-release' 2021-11-04 18:32:37 +01:00
2373ce7fcf Merge branch 'blender-v3.0-release' 2021-11-04 17:27:51 +01:00
a72b26527d Merge branch 'blender-v3.0-release' 2021-11-04 16:50:53 +01:00
2becb3e9af Merge branch 'blender-v3.0-release' 2021-11-04 10:10:52 -05:00
37b862fa6c Cleanup: Remove operator context override for drop-box operators
Drop-boxes should act on the context determined through the exact cursor
location. There should be no need to override that, basically by the
nature of how drop-boxes work.
So Campbell and I agreed on removing this.

If we wanted to support it, we'd have to restore the operator context
when drawing drop-boxes, see
https://developer.blender.org/T92501#1247581.
2021-11-04 16:02:54 +01:00
db43d19c16 Merge branch 'blender-v3.0-release' 2021-11-04 15:32:50 +01:00
d3328fabc9 Attempt to fix failing lib link test on windows.
According to https://docs.python.org/3/library/os.html#os.rename,
`os.rename` has os-specific behavior, and will fail in case you attempt
to rename to an existing file on windows.

So using `os.replace` instead, which should be os-agnostic.

NOTE: Fact that temp test directory is not cleared after tests are
sucessfully ran does not sound great...
2021-11-04 15:06:58 +01:00
8eff1eca52 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-04 14:50:48 +01:00
165100d8ac add node storage 2021-11-04 13:48:04 +01:00
ffb5d1205e add initial enum node 2021-11-04 13:38:18 +01:00
805540c713 initial enum socket 2021-11-04 13:28:24 +01:00
c641107c95 Fix T92800: (UI) Radial control values get wrong color assigned
The global theme state didn't get updated or unset properly when drawing
overlays. Now paint cursors use the theme settings of the space they are
in, while global overlays use the global fallback, which is the main 3D
View region.
2021-11-04 11:52:11 +01:00
978ef093db Cleanup: fix compiler warnings
Those were introduced in rBccead2ed9c6121c42a516712da38a2faec877e2f.
2021-11-04 11:29:16 +01:00
2d6d8fc7ca Attempt fix for new lib reload/relocate tests on windows.
Try splitting them into their own class.
2021-11-04 11:25:30 +01:00
82b20b6975 Merge branch 'blender-v3.0-release' 2021-11-04 10:51:40 +01:00
feaf5b95e0 Merge branch 'blender-v3.0-release' 2021-11-04 08:35:46 +01:00
36a6528723 Merge branch 'blender-v3.0-release' 2021-11-04 18:10:07 +11:00
d7f9f083d4 Merge branch 'blender-v3.0-release' 2021-11-04 14:51:58 +11:00
7c188d8241 Merge branch 'blender-v3.0-release' 2021-11-04 14:51:55 +11:00
9cd5b3c9b6 Merge branch 'blender-v3.0-release' 2021-11-04 14:51:52 +11:00
682f1548be Merge branch 'blender-v3.0-release' 2021-11-04 14:51:50 +11:00
c0fdaf700a Fix the GPencil stroke not sticking to other strokes
Issue seen when setting `Stroke Placement` of type `Stroke`.

Regression introduced in {rBaa0ac0035a0d3601672a0c732e3f8f932a36fc04}.
2021-11-03 18:51:25 -03:00
0c6b815855 Geometry Nodes: Add Length Output to Curve Parameter Node
Adds a length output to the curve parameter node which returns the
length of a spline at each point, or the length of the curve at
each spline depending on the domain.

Differential Revision: https://developer.blender.org/D12882
2021-11-03 15:05:46 -05:00
d6ed9c2b40 Merge branch 'blender-v3.0-release' 2021-11-03 14:41:50 -05:00
431524aebc Geometry Nodes: Selection outputs for Cone and Cylinder
This adds Top, Bottom and Side selections to the Primitive
Mesh nodes Cone and Cylinder.
2021-11-03 20:20:15 +01:00
ccead2ed9c Spreadsheet: Display geometry volume component grids
This shows a geometry's volume grids in the spreadsheet.
Three columns are displayed:
 - Name: The text name of each grid
 - Data type: Float, Vector, etc.
 - Class: Fog volume, Level Set, or unkown

In the future, values of the voxels themselves could be displayed,
but that is a much more complex problem, with important performance
implications, etc.

Differential Revision: https://developer.blender.org/D13049
2021-11-03 13:45:51 -05:00
Nikhil Shringarpurey
4e5537d841 Geometry Nodes: Add tooltips to primitive node inputs
Building on the work in rBef45399f3be0, this commits adds
tooltips to the inputs for the default primitives nodes.

Differential Revision: https://developer.blender.org/D12640
2021-11-03 13:25:44 -05:00
7aaedc09c7 Merge branch 'blender-v3.0-release' 2021-11-03 18:16:16 +01:00
e10caf6fe3 Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-11-03 18:07:36 +01:00
a7672caeb2 Geometry Nodes: Add Selection Input to Resample Curves Node
Add a boolean input to the resample curve node that indicates which
splines should be resampled and which should be unchanged.

Differential Revision: https://developer.blender.org/D13064
2021-11-03 11:55:24 -05:00
de2988ea1b Cleanup: Remove effect-less const
Using `const` on an enum type returned by value doesn't have an effect.
2021-11-03 17:28:52 +01:00
debf4b70db Cleanup: Avoid redundant template parameter in BLI serializing API
The `ContainerValue` template can obtain the type of the contained value
via the given `Container` type, simply using `Container::value_type`.
Use this as the default way to determine the value type which simplifies
using the template. If necessary the value type can be passed explicitly
still.
2021-11-03 17:28:20 +01:00
Germano Cavalcante
aa0ac0035a GPencil and Annotation: Use cached depth to perform depth testing operations
Operations such as erasing with occlusion and drawing on the surface
require reading the depth buffer.

However, this is being done with minimal efficiency.

Currently, to read the depth corresponding to each point of the new stroke,
a ReadPixel is called to send a message to the GPU and read the depth of
the corresponding pixel in the VRAM.

The communication between GPU and CPU is known to be a slow operation so
it is good to be avoided.

Therefore, save the entire depth buffer in a cache to be read directly
from the RAM.

(Also the `ED_view3d_autodist_depth` and `ED_view3d_autodist_depth_seg` have
been removed since they are no longer used).

Reviewed By: antoniov, fclem

Differential Revision: https://developer.blender.org/D10894
2021-11-03 12:10:37 -03:00
8b516d8712 Include node name for socket animation channel UI
The channel names were often indistingushable in animation editors.
Now include the node _name_ (unfortunately, getting the _label_ could
result in bad performance in some circustances -- see previous version
of D13085).
Similar to what rB77744b581d08 did for some VSE strip properties.

ref. T91917

Maniphest Tasks: T91917

Differential Revision: https://developer.blender.org/D13085
2021-11-03 15:03:40 +01:00
2a88343213 Merge branch 'blender-v3.0-release' 2021-11-03 14:24:33 +01:00
04d35f9315 Merge branch 'blender-v3.0-release' 2021-11-03 10:23:55 -03:00
3532da44ee Merge branch 'blender-v3.0-release' 2021-11-03 22:26:39 +11:00
5095e4fc22 UI: Display disabled-hint when dragging material outside object mode
Display a "disabled hint" (text explaining why something isn't possible)
when dragging a material over the 3D View, while being in edit mode or
so (anything that isn't object mode).
2021-11-03 11:57:05 +01:00
bdf6665e3a Merge branch 'blender-v3.0-release' 2021-11-03 10:54:53 +01:00
e1f1b0841d Merge branch 'blender-v3.0-release' 2021-11-03 15:56:11 +11:00
cea7ee7582 Merge branch 'blender-v3.0-release' 2021-11-03 11:49:58 +11:00
ac0eefe26f Merge branch 'blender-v3.0-release' 2021-11-02 18:22:50 -05:00
c4b73847d3 BLF: Remove Thread Locking For Font Metrics
This patch removes the need to lock the thread just to get to some
generic (not glyph-specific) font metrics.

See D12976 for more details.

Differential Revision: https://developer.blender.org/D12976

Reviewed by Campbell Barton
2021-11-02 14:28:25 -07:00
8c58838f6a Merge branch 'blender-v3.0-release' 2021-11-02 15:23:59 -05:00
5cd1210b52 Merge branch 'blender-v3.0-release' 2021-11-02 13:44:17 -05:00
3dcd042bfc Merge branch 'blender-v3.0-release' 2021-11-02 18:38:59 +01:00
53fdde3f64 Merge branch 'blender-v3.0-release' 2021-11-02 18:14:59 +01:00
0a254109b8 Merge branch 'blender-v3.0-release' 2021-11-02 17:52:03 +01:00
242ad4bd0f Merge branch 'blender-v3.0-release' 2021-11-02 17:49:52 +01:00
27621490c2 Merge branch 'blender-v3.0-release' 2021-11-02 18:33:25 +02:00
12bf4adbe3 Merge branch 'blender-v3.0-release' 2021-11-02 17:05:03 +01:00
2c23256288 Merge branch 'blender-v3.0-release' 2021-11-02 16:03:17 +00:00
c01b3c534b Merge branch 'blender-v3.0-release' 2021-11-02 15:42:09 +01:00
1b2342b4d3 Tests: Add basic unittest for library reload and relocate operators. 2021-11-02 15:33:30 +01:00
a7e92843f7 Fix: Build error on windows.
External symbols in C files need
to be marked as such otherwise
the linker will not find them.
2021-11-02 07:43:44 -06:00
e64d4d0200 Merge branch 'blender-v3.0-release' 2021-11-02 07:59:35 -05:00
0daf429591 Merge branch 'blender-v3.0-release' 2021-11-02 23:38:01 +11:00
b7dc667eb2 Merge branch 'blender-v3.0-release' 2021-11-02 12:31:05 +01:00
7a4ee2fd4f Merge branch 'blender-v3.0-release' 2021-11-02 22:06:35 +11:00
52f4a908f7 Removed compilation warning nullptr check in image engine. 2021-11-02 12:00:07 +01:00
ffd3dd6376 Merge branch 'blender-v3.0-release' 2021-11-02 11:17:53 +01:00
a2f5a10129 Merge branch 'blender-v3.0-release' 2021-11-02 11:09:09 +01:00
223f2b27d1 Merge branch 'blender-v3.0-release' 2021-11-02 09:57:16 +01:00
21e168069d Merge branch 'blender-v3.0-release' 2021-11-02 19:52:01 +11:00
8ca6e51ade Cleanup: clang-tidy 2021-11-02 19:50:53 +11:00
47d12268e3 Cleanup: Change image engine to CPP.
Added namespace blender::draw::image_engine. Code is still C-a-like.
Only changed the obvious code style to CPP (structs, nullptr, casts).
2021-11-02 09:16:33 +01:00
9cc05fe9c4 Merge branch 'blender-v3.0-release' 2021-11-02 16:08:13 +11:00
af9e0409f1 Merge branch 'blender-v3.0-release' 2021-11-02 16:08:10 +11:00
7b436ead6b Merge branch 'blender-v3.0-release' 2021-11-02 16:08:07 +11:00
5363437555 Merge branch 'blender-v3.0-release' 2021-11-02 00:46:32 -03:00
75f5edcaf3 Merge branch 'blender-v3.0-release' 2021-11-01 14:48:20 -05:00
6ddbcaa096 Merge branch 'blender-v3.0-release' 2021-11-01 14:42:49 -05:00
e045249a28 Merge branch 'blender-v3.0-release' 2021-11-01 19:54:58 +01:00
85176c86f0 Fix T92722: Error when saving new render preset
The preset menu name was not renamed in 4ddad5a7ee.
2021-11-01 19:25:11 +01:00
765c2cc6c7 Merge branch 'blender-v3.0-release' 2021-11-01 13:17:54 -05:00
339fd8027f Merge branch 'blender-v3.0-release' 2021-11-01 17:28:40 +01:00
adc540cf7c Fix T92655: spreadsheet_duplicate Split Exception
Check SpaceSpreadsheet's runtime is not null when trying to duplicate
the data when doing an area split.

See D13047 for further details.

Differential Revision: https://developer.blender.org/D13047

Reviewed by Jacques Lucke
2021-11-01 08:52:00 -07:00
1704a394d8 Fix T92689: Assert loading file with a sound.
Dummy mistake in rBc8c53ceecc30 (boolean inversion).
2021-11-01 16:44:12 +01:00
d56d3fc6b1 Merge branch 'blender-v3.0-release' 2021-11-01 16:43:26 +01:00
6acba759e0 Merge branch 'blender-v3.0-release' 2021-11-01 11:58:11 -03:00
8fbbd69946 Fix T92629: Crash on mesh separate after rB43bc494892c3.
rB43bc494892c3 switched `BKE_libblock_relink_to_newid` to use new ID
remapping and libquery code.

However, that new code does protect by default against remapping an
objects's data pointer when that object is in Edit mode, since this is
not a behavior that generic BKE code can handle (due to required editing
data for most obdata types when in edit mode).

So specific code that does create new IDs and need remapping in Edit
mode has to pass specific exception flags to remaping code.

This commit adds those remapping flags to `BKE_libblock_relink_to_newid`
and add said exception flag to the remapping call from
`ED_object_add_duplicate` when the object is in edit mode.
2021-11-01 15:46:28 +01:00
e85e126e3f IDRemap: Add option to force remapping obdata in edit mode.
In theory we should never allow remapping of Objects' obdata ID pointer
when the object is in Edit mode. But there are some cases were this is
needed, so adding yet another exception option to remapping flags.

Preliminary change to fix T92629.
2021-11-01 15:46:28 +01:00
64de6ad4fe Fix use-after-free in image code 2021-11-01 15:36:09 +01:00
2fb725ea30 Cleanup: Unused argument
Fixes strict compiler warnings.
2021-11-01 15:14:49 +01:00
9de4f64197 Fix compile error on Windows
Caused by 7150f919d3. This undid part of 79a88b5e91. Added a comment
for why this include is needed, to avoid this error from happening
again.
2021-11-01 14:44:18 +01:00
6897c2141e Merge branch 'blender-v3.0-release' 2021-11-01 13:39:24 +01:00
7150f919d3 Cleanup: Remove unused headers in asset files
Also move system includes first, like we have it elsewhere in Blender.
2021-11-01 13:21:07 +01:00
0eb63328e8 Merge branch 'blender-v3.0-release' 2021-11-01 13:02:48 +01:00
d07e3bde20 Fix Cycles tests after recent logging changes
The constant folding tests rely on logging sync.
2021-11-01 12:47:03 +01:00
9111ea78ac Localize image mutex lock into runtime field of Image datablock
Allows to avoid a global lock being held while reading files from disk,
solving performance issues when Cycles needs to read a lot of packed
images.

Simple test file F11597666

Differential Revision: https://developer.blender.org/D13032
2021-11-01 12:47:03 +01:00
b6dd5be213 Merge branch 'blender-v3.0-release' 2021-11-01 22:25:27 +11:00
b5eada7f69 Merge branch 'blender-v3.0-release' 2021-11-01 22:25:24 +11:00
3f0991266f Merge branch 'blender-v3.0-release' 2021-11-01 12:15:09 +01:00
a96b2f39b8 Geometry Nodes: improve check if object has geometry set instances
The improves playback speed in my instance heavy scene from ~3.7 fps to ~3.9 fps.
2021-11-01 12:00:41 +01:00
Jarrett Johnson
7dd84f05aa Pointcloud selection support
This patch adds support for selecting pointclouds.

Since pointclouds were not properly drawn to the selection buffer (as diagonsed by output from `glReadPixels` and Renderdoc), they were not able to be selectable by depth picking or occlusion queries. In `basic_engine`, objects were rendered with a shader which draws to a depth buffer but only assumes a single position vertex attribute. Pointclouds, though, require at least another vertex attribute `pos_inst` which provides the instance offsets. Thus, this patch adds another shader variant for pointclouds which supports these two attributes and renders the points appropriately.

{F11652666}

Addresses T92415

Reviewed By: fclem

Differential Revision: https://developer.blender.org/D13059
2021-11-01 11:41:13 +01:00
fe44001215 Revert "Pointcloud selection support"
This reverts commit a50f8b3fd8.
2021-11-01 11:40:34 +01:00
Jarrett Johnson
a50f8b3fd8 Pointcloud selection support
This patch adds support for selecting pointclouds.

Since pointclouds were not properly drawn to the selection buffer (as diagonsed by output from `glReadPixels` and Renderdoc), they were not able to be selectable by depth picking or occlusion queries. In `basic_engine`, objects were rendered with a shader which draws to a depth buffer but only assumes a single position vertex attribute. Pointclouds, though, require at least another vertex attribute `pos_inst` which provides the instance offsets. Thus, this patch adds another shader variant for pointclouds which supports these two attributes and renders the points appropriately.

{F11652666}

Addresses T92415

Reviewed By: fclem

Differential Revision: https://developer.blender.org/D13059
2021-11-01 11:25:23 +01:00
8379eefafb Cycles: Enable debug symbols for Clang
Debug symbols were disabled for Clang at some point due to link issues.
This is no longer the case for any reasonably modern version of Clang.
So this patch removes the check in question.

Differential Revision: https://developer.blender.org/D13045

Reviewed By: brecht
2021-11-01 11:38:03 +02:00
5327413b37 Cleanup: Remove Cycles device checks for half float.
All supported devices support half float now, so we can remove the check.

Differential Revision: https://developer.blender.org/D13021
2021-11-01 10:18:30 +01:00
289f013e64 Merge branch 'blender-v3.0-release' 2021-11-01 17:48:50 +11:00
49a0453799 Merge branch 'blender-v3.0-release' 2021-11-01 17:22:44 +11:00
de4793e0e6 Merge branch 'blender-v3.0-release' 2021-11-01 17:16:53 +11:00
55ce05e0bb Merge branch 'blender-v3.0-release' 2021-11-01 17:16:50 +11:00
346a812d7e Fix scale cage gizmo in pose-mode
The active objects matrix was ignored when calculating the cage.
2021-11-01 17:16:45 +11:00
1e749d0602 Cleanup: spelling, use C comments 2021-11-01 14:00:58 +11:00
b99d6e1bed Fix errors in BKE_appdir_font_folder_default
- Missing NULL check for the HOME environment variable.
- The user preference path was written to even when the
  path didn't exist.
2021-11-01 13:43:35 +11:00
e2937ff24f Merge branch 'blender-v3.0-release' 2021-11-01 13:15:10 +11:00
c312c71969 Fix: Build error on all platforms
Types were used before being declared.
2021-10-31 11:52:24 -06:00
Christoph Lendenfeld
b2e9f35c5e Cleanup: Extract function to store bezt arrays
The code to store an original bezt array previously lived in
`graphkeys_decimate_invoke`.
Since future graph slider operators will need this function as well,
it has been extracted.

No functional changes.

Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D12487
Ref: D12487
2021-10-31 11:28:10 +00:00
Christoph Lendenfeld
1b6daa871d Cleanup: Extract keyframe filter to constant
An int flag is used to filter animation channels for
operators to work on. The flag was duplicated multiple times.
This patch removes the duplication by creating a constant

Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D12486
Ref: D12486
2021-10-31 11:19:40 +00:00
Christoph Lendenfeld
4e502bb6d2 Merge branch 'blender-v3.0-release' 2021-10-31 11:10:11 +00:00
ae9052a33e Cleanup: Simplify logic for adding grid in points to volume node
Instead of creating a separate grid first and then merging the points
to volume grid, use the recently added `BKE_volume_grid_add_vdb`
helper function for this purpose.
2021-10-30 17:26:18 -05:00
9cfffe8468 UI: Open File Browser with Thumbnails for Fonts
When browsing to open a font file, open File Browser in Thumbnail View
and sorting by name, instead of using the last-used states.

See D13040 for more details.

Differential Revision: https://developer.blender.org/D13040

Reviewed by Julian Eisel
2021-10-30 14:02:34 -07:00
03a962d8ca Merge branch 'blender-v3.0-release' 2021-10-30 16:45:04 +02:00
02a9377da0 UI: Default Fonts Folder for Mac and Linux
Initial defaults for userdef->fontdir for Mac and Linux.

See D12802 for more details.

Differential Revision: https://developer.blender.org/D12802

Reviewed by Campbell Barton
2021-10-29 17:15:22 -07:00
dcdbaf89bd Asset Browser: Correct name & tooltip for asset list refresh operator
The name and tooltip were talking about file-lists, which exposes the
fact that the Asset Browser uses the File Browser code in the UI, which
we shouldn't do. This can confuse users.
Instead have a dedicated operator for the Asset Browser with a proper
name and tooltip.
2021-10-29 16:55:20 +02:00
0c3da7f370 Merge branch 'blender-v3.0-release' 2021-10-29 09:42:24 -05:00
837b72fa83 Merge branch 'blender-v3.0-release' 2021-10-29 11:25:32 -03:00
d18d87d3e7 Fix T92576: Crash switching from Asset Browser to File Brower
The asset catalog filtering data needs to be cleared when with the other
asset library data of the file list. This is done when changing between
asset and file browser (and in other cases).
2021-10-29 16:00:43 +02:00
e1db6dc11b Fix crash on undo after recent lib_query refactor.
Forgot that scene uses part of its ID looping code for specific undo
handling.

Caused by rBe3b2f0fd6ff9.
2021-10-29 12:54:32 +02:00
c8c53ceecc LibQuery: Remove last 'bool returns' from public API.
Those were used in a very few places to detect whether iteration should
be stopped or not, but one can use `BKE_lib_query_foreachid_iter_stop`
now for that.

Also fix early break handling in embedded IDs processing.

Fix T90922: Fix return policy inconsistency in `scene_foreach_id`.
2021-10-29 11:32:26 +02:00
259731909c LibQuery: Fix potential memleak in recursive case.
In case `library_foreach_ID_link` would return early in recursive
process, it would not properly free its utils data.

Also add proper iteration break in case some sub-calls requested it.

Finally, make this function return a boolean to know whether iteration
should be stopped or not (will be used in future commit to fix this
handling in embedded IDs case).

Part of T90922: Fix return policy inconsistency in `scene_foreach_id`.
2021-10-29 11:32:26 +02:00
e3b2f0fd6f LibQuery: Add macro to help break looping when requested.
The new `BKE_LIB_FOREACHID_PROCESS_FUNCTION_CALL` execute the given
statement and then check status of `LibraryForeachIDData` data, and
return in case stop of iteration is requested.

This is very similar to the other `BKE_LIB_FOREACHID_PROCESS_` existing
macros, and allows us to properly break iteration when a sub-function
has requested it.

Part of T90922: Fix return policy inconsistency in `scene_foreach_id`.
2021-10-29 11:32:26 +02:00
51c1c1cd93 Fix potential early-return in WM foreach_id process.
Add a function to check if iteration over ID usages should stop (using
internal `IDWALK_STOP` status flag).

Use it in `BKE_LIB_FOREACHID_PROCESS_` macros, and in
`window_manager_foreach_id` to handle properly the active workspace case
(previous code could skip the call to `BKE_workspace_active_set` in case
iteration over ID usages was stopped by callback on that specific ID
usage).

Part of T90922: Fix return policy inconsistency in `scene_foreach_id`.
2021-10-29 11:32:26 +02:00
eae59645de Cleanup: Add some comments to some sub-function of foreach_id process. 2021-10-29 11:32:26 +02:00
c112418e95 Merge branch 'blender-v3.0-release' 2021-10-29 11:05:20 +02:00
43bc494892 IDManagement: Remove deprecated BKE_libblock_relink_to_newid usages.
Move all usages to new `BKE_libblock_relink_to_newid_new`, and rename
that one to `BKE_libblock_relink_to_newid`.

Fix T91413.
2021-10-29 10:45:48 +02:00
fb688c8d5c Merge branch 'blender-v3.0-release' 2021-10-29 10:24:37 +02:00
261bb766fb Merge branch 'blender-v3.0-release' 2021-10-29 10:11:26 +02:00
1688cb27cd Merge branch 'blender-v3.0-release' 2021-10-29 09:49:48 +02:00
657923cf93 Merge branch 'blender-v3.0-release' 2021-10-29 15:01:50 +11:00
57f7650dc7 Merge branch 'blender-v3.0-release' 2021-10-29 14:01:19 +11:00
99fbf1716f Merge branch 'blender-v3.0-release' 2021-10-29 13:54:43 +11:00
38fc19d643 Cleanup: rename blf_utf8_next_fast to blf_glyph_from_utf8_and_step
Calling this 'fast' no longer made sense as the slower code-path
has been removed.
2021-10-29 13:49:31 +11:00
1e2589bfa5 Cleanup: remove redundant BLI_UTF8_ERR check 2021-10-29 13:15:39 +11:00
0e71162e68 Cleanup: resolve cast warnings 2021-10-29 13:14:23 +11:00
70947ebc65 BLF Refactor: blf_utf8_next_fast
Simplification of BLF glyph loading

See D13026 for details.

Differential Revision: https://developer.blender.org/D13026

Reviewed by Campbell Barton
2021-10-28 18:49:21 -07:00
59534dbee2 BLF Refactor: blf_kerning_step_fast
Simplification of BLF Kerning

See D13015 for more details.

Differential Revision: https://developer.blender.org/D13015

Reviewed by Campbell Barton
2021-10-28 18:20:34 -07:00
0283a22456 Merge branch 'blender-v3.0-release' 2021-10-29 10:53:42 +11:00
8eb10de739 Merge branch 'blender-v3.0-release' 2021-10-29 10:53:39 +11:00
adf82fe943 Merge branch 'blender-v3.0-release' 2021-10-28 22:42:04 +02:00
fc36772b06 Tests: minor updates to benchmark script for running on buildbot
* graph command accepts folder of json files as input
* reset command clears log files
2021-10-28 22:41:40 +02:00
731926e70e Merge branch 'blender-v3.0-release' 2021-10-28 14:25:14 -05:00
a6af0e570d Merge branch 'blender-v3.0-release' 2021-10-28 21:20:25 +02:00
ec9357a94e Merge branch 'blender-v3.0-release' 2021-10-28 16:02:06 +02:00
60b278a3bb Merge branch 'blender-v3.0-release' 2021-10-28 15:52:19 +02:00
5f1107ffaf Merge branch 'blender-v3.0-release' 2021-10-28 12:01:01 +02:00
2f8ed53d6f Merge remote-tracking branch 'origin/blender-v3.0-release' 2021-10-28 11:23:25 +02:00
c1cfb475b3 LineArt: Fix(unreported) Material mask panel logic
The logic should be: show material mask panel if in_front is on,
it was inverted unintentionally.
2021-10-28 17:17:48 +08:00
4b57d5a9a0 LineArt: Fix(unreported) depsgraph camera error
This fixes unintentional line art error when custom camera doesn't
exist, now not adding custom camera relation in this case.
2021-10-28 17:13:22 +08:00
4adde62f60 Python doc generator: add missing selected_ids context key
Add the context key I introduced in rB03c0581c6ed to the Python API docs
generator.

No functional changes to Blender.
2021-10-28 10:32:13 +02:00
2501d00268 Python doc generator: raise explanatory error when context key is missing
When a new key is added to the context, it also needs to be added to the
`sphinx_doc_gen.py` file for generating the Python API documentation.
When this isn't done, the script would raise a generic `KeyError`. Now
it explains what needs to be updated to solve the problem.

No functional changes to Blender.
2021-10-28 10:32:13 +02:00
45439dfe4c Preferences: remove special case for copying previous settings
This was only needed for skipping version numbers when the numbering
scheme changed for 3.0.
2021-10-28 17:59:51 +11:00
5568599015 Merge branch 'blender-v3.0-release' 2021-10-28 17:49:15 +11:00
a7879dea7c Merge branch 'blender-v3.0-release' 2021-10-28 17:49:12 +11:00
43c603c2ff Merge branch 'blender-v3.0-release' 2021-10-28 17:45:28 +11:00
4979537a65 Merge branch 'blender-v3.0-release' 2021-10-28 16:56:11 +11:00
99b6127b73 Merge branch 'blender-v3.0-release' 2021-10-28 16:56:08 +11:00
c647bd899f Merge branch 'blender-v3.0-release' 2021-10-28 14:08:57 +11:00
a2f0f98271 Merge branch 'blender-v3.0-release' 2021-10-27 19:07:40 -03:00
7d2c759054 Merge branch 'blender-v3.0-release' 2021-10-27 23:22:37 +02:00
8f02de3de7 Cleanup: remove redundant variable
`free_tooltip` is no longer needed.
2021-10-27 18:11:47 -03:00
e4a5fd4298 Fix broken Python API doc generation after addition of selected_ids 2021-10-27 22:16:36 +02:00
39c11c03d0 Asset Browser: Increase size of search button a bit
Before this, the search button was quite small really, not much text would fit
into it. Increase the size a bit, but not too much to still make the layout
work in smaller area sizes.
2021-10-27 18:56:41 +02:00
3e32a68f38 UI: Refactor how dragging onto text buttons works, fixing issues
There was a bunch of special handling to support dropping data-blocks onto
string or search-menu buttons, to change the value of these. This refactor
makes that case use the normal drop-box design, where an operator is executed
on drop that gets input properties set by the drop-box. This should also make
it easier to add support for dragging assets into these buttons.

In addition this fixes an issue: Two tooltips were shown when dragging assets
over text buttons. None should be shown, because this isn't supported.
2021-10-27 18:51:44 +02:00
bfd2921d38 Revert "Blender 3.0 bcon3 (beta)"
This reverts commit f7a3450e63.
2021-10-27 18:44:28 +02:00
c346bb1990 Revert "3.0 splashscreen"
This reverts commit 78c1c71988.
2021-10-27 18:43:27 +02:00
17efd14682 Merge branch 'blender-v3.0-release' 2021-10-27 18:43:18 +02:00
dab3591588 Blender 3.1 bcon1 - alpha
Bump the version number for the new release cycle.
2021-10-27 18:40:49 +02:00
628 changed files with 8752 additions and 9389 deletions

View File

@@ -440,11 +440,7 @@ mark_as_advanced(WITH_CYCLES_CUDA_BUILD_SERIAL)
mark_as_advanced(WITH_CUDA_DYNLOAD)
# AMD HIP
if(WIN32)
option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" ON)
else()
option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" OFF)
endif()
option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" OFF)
option(WITH_CYCLES_HIP_BINARIES "Build Cycles AMD HIP binaries" OFF)
set(CYCLES_HIP_BINARIES_ARCH gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1034 CACHE STRING "AMD HIP architectures to build binaries for")
mark_as_advanced(WITH_CYCLES_DEVICE_HIP)
@@ -494,8 +490,7 @@ endif()
# This should be turned off when Blender enter beta/rc/release
if("${BLENDER_VERSION_CYCLE}" STREQUAL "release" OR
"${BLENDER_VERSION_CYCLE}" STREQUAL "rc" OR
"${BLENDER_VERSION_CYCLE}" STREQUAL "beta")
"${BLENDER_VERSION_CYCLE}" STREQUAL "rc")
set(WITH_EXPERIMENTAL_FEATURES OFF)
else()
set(WITH_EXPERIMENTAL_FEATURES ON)

View File

@@ -42,7 +42,6 @@ ExternalProject_Add(nanovdb
URL_HASH ${NANOVDB_HASH_TYPE}=${NANOVDB_HASH}
PREFIX ${BUILD_DIR}/nanovdb
SOURCE_SUBDIR nanovdb
PATCH_COMMAND ${PATCH_CMD} -p 1 -d ${BUILD_DIR}/nanovdb/src/nanovdb < ${PATCH_DIR}/nanovdb.diff
CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${LIBDIR}/nanovdb ${DEFAULT_CMAKE_FLAGS} ${NANOVDB_EXTRA_ARGS}
INSTALL_DIR ${LIBDIR}/nanovdb
)

View File

@@ -1,374 +0,0 @@
Index: nanovdb/nanovdb/NanoVDB.h
===================================================================
--- a/nanovdb/nanovdb/NanoVDB.h (revision 62751)
+++ b/nanovdb/nanovdb/NanoVDB.h (working copy)
@@ -152,8 +152,8 @@
#endif // __CUDACC_RTC__
-#ifdef __CUDACC__
-// Only define __hostdev__ when using NVIDIA CUDA compiler
+#if defined(__CUDACC__) || defined(__HIP__)
+// Only define __hostdev__ when using NVIDIA CUDA or HIP compiler
#define __hostdev__ __host__ __device__
#else
#define __hostdev__
@@ -461,7 +461,7 @@
/// Maximum floating-point values
template<typename T>
struct Maximum;
-#ifdef __CUDA_ARCH__
+#if defined(__CUDA_ARCH__) || defined(__HIP__)
template<>
struct Maximum<int>
{
@@ -1006,10 +1006,10 @@
using Vec3i = Vec3<int>;
/// @brief Return a single precision floating-point vector of this coordinate
-Vec3f Coord::asVec3s() const { return Vec3f(float(mVec[0]), float(mVec[1]), float(mVec[2])); }
+inline __hostdev__ Vec3f Coord::asVec3s() const { return Vec3f(float(mVec[0]), float(mVec[1]), float(mVec[2])); }
/// @brief Return a double precision floating-point vector of this coordinate
-Vec3d Coord::asVec3d() const { return Vec3d(double(mVec[0]), double(mVec[1]), double(mVec[2])); }
+inline __hostdev__ Vec3d Coord::asVec3d() const { return Vec3d(double(mVec[0]), double(mVec[1]), double(mVec[2])); }
// ----------------------------> Vec4 <--------------------------------------
@@ -1820,7 +1820,7 @@
}; // Map
template<typename Mat4T>
-void Map::set(const Mat4T& mat, const Mat4T& invMat, double taper)
+__hostdev__ void Map::set(const Mat4T& mat, const Mat4T& invMat, double taper)
{
float * mf = mMatF, *vf = mVecF;
float* mif = mInvMatF;
@@ -2170,7 +2170,7 @@
}; // Class Grid
template<typename TreeT>
-int Grid<TreeT>::findBlindDataForSemantic(GridBlindDataSemantic semantic) const
+__hostdev__ int Grid<TreeT>::findBlindDataForSemantic(GridBlindDataSemantic semantic) const
{
for (uint32_t i = 0, n = blindDataCount(); i < n; ++i)
if (blindMetaData(i).mSemantic == semantic)
@@ -2328,7 +2328,7 @@
}; // Tree class
template<typename RootT>
-void Tree<RootT>::extrema(ValueType& min, ValueType& max) const
+__hostdev__ void Tree<RootT>::extrema(ValueType& min, ValueType& max) const
{
min = this->root().valueMin();
max = this->root().valueMax();
@@ -2336,7 +2336,7 @@
template<typename RootT>
template<typename NodeT>
-const NodeT* Tree<RootT>::getNode(uint32_t i) const
+__hostdev__ const NodeT* Tree<RootT>::getNode(uint32_t i) const
{
static_assert(is_same<TreeNodeT<NodeT::LEVEL>, NodeT>::value, "Tree::getNode: unvalid node type");
NANOVDB_ASSERT(i < DataType::mCount[NodeT::LEVEL]);
@@ -2345,7 +2345,7 @@
template<typename RootT>
template<int LEVEL>
-const typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i) const
+__hostdev__ const typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i) const
{
NANOVDB_ASSERT(i < DataType::mCount[LEVEL]);
return reinterpret_cast<const TreeNodeT<LEVEL>*>(reinterpret_cast<const uint8_t*>(this) + DataType::mBytes[LEVEL]) + i;
@@ -2353,7 +2353,7 @@
template<typename RootT>
template<typename NodeT>
-NodeT* Tree<RootT>::getNode(uint32_t i)
+__hostdev__ NodeT* Tree<RootT>::getNode(uint32_t i)
{
static_assert(is_same<TreeNodeT<NodeT::LEVEL>, NodeT>::value, "Tree::getNode: invalid node type");
NANOVDB_ASSERT(i < DataType::mCount[NodeT::LEVEL]);
@@ -2362,7 +2362,7 @@
template<typename RootT>
template<int LEVEL>
-typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i)
+__hostdev__ typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i)
{
NANOVDB_ASSERT(i < DataType::mCount[LEVEL]);
return reinterpret_cast<TreeNodeT<LEVEL>*>(reinterpret_cast<uint8_t*>(this) + DataType::mBytes[LEVEL]) + i;
@@ -2370,7 +2370,7 @@
template<typename RootT>
template<typename NodeT>
-uint32_t Tree<RootT>::getNodeID(const NodeT& node) const
+__hostdev__ uint32_t Tree<RootT>::getNodeID(const NodeT& node) const
{
static_assert(is_same<TreeNodeT<NodeT::LEVEL>, NodeT>::value, "Tree::getNodeID: invalid node type");
const NodeT* first = reinterpret_cast<const NodeT*>(reinterpret_cast<const uint8_t*>(this) + DataType::mBytes[NodeT::LEVEL]);
@@ -2380,7 +2380,7 @@
template<typename RootT>
template<typename NodeT>
-uint32_t Tree<RootT>::getLinearOffset(const NodeT& node) const
+__hostdev__ uint32_t Tree<RootT>::getLinearOffset(const NodeT& node) const
{
return this->getNodeID(node) + DataType::mPFSum[NodeT::LEVEL];
}
@@ -3366,7 +3366,7 @@
}; // LeafNode class
template<typename ValueT, typename CoordT, template<uint32_t> class MaskT, uint32_t LOG2DIM>
-inline void LeafNode<ValueT, CoordT, MaskT, LOG2DIM>::updateBBox()
+inline __hostdev__ void LeafNode<ValueT, CoordT, MaskT, LOG2DIM>::updateBBox()
{
static_assert(LOG2DIM == 3, "LeafNode::updateBBox: only supports LOGDIM = 3!");
if (!this->isActive()) return;
Index: nanovdb/nanovdb/util/SampleFromVoxels.h
===================================================================
--- a/nanovdb/nanovdb/util/SampleFromVoxels.h (revision 62751)
+++ b/nanovdb/nanovdb/util/SampleFromVoxels.h (working copy)
@@ -22,7 +22,7 @@
#define NANOVDB_SAMPLE_FROM_VOXELS_H_HAS_BEEN_INCLUDED
// Only define __hostdev__ when compiling as NVIDIA CUDA
-#ifdef __CUDACC__
+#if defined(__CUDACC__) || defined(__HIP__)
#define __hostdev__ __host__ __device__
#else
#include <cmath> // for floor
@@ -136,7 +136,7 @@
template<typename TreeOrAccT>
template<typename Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const Vec3T& xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const Vec3T& xyz) const
{
const CoordT ijk = Round<CoordT>(xyz);
if (ijk != mPos) {
@@ -147,7 +147,7 @@
}
template<typename TreeOrAccT>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const CoordT& ijk) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const CoordT& ijk) const
{
if (ijk != mPos) {
mPos = ijk;
@@ -158,7 +158,7 @@
template<typename TreeOrAccT>
template<typename Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, false>::operator()(const Vec3T& xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, false>::operator()(const Vec3T& xyz) const
{
return mAcc.getValue(Round<CoordT>(xyz));
}
@@ -195,7 +195,7 @@
}; // TrilinearSamplerBase
template<typename TreeOrAccT>
-void TrilinearSampler<TreeOrAccT>::stencil(CoordT& ijk, ValueT (&v)[2][2][2]) const
+__hostdev__ void TrilinearSampler<TreeOrAccT>::stencil(CoordT& ijk, ValueT (&v)[2][2][2]) const
{
v[0][0][0] = mAcc.getValue(ijk); // i, j, k
@@ -224,7 +224,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType TrilinearSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
+__hostdev__ typename TreeOrAccT::ValueType TrilinearSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
{
#if 0
auto lerp = [](ValueT a, ValueT b, ValueT w){ return fma(w, b-a, a); };// = w*(b-a) + a
@@ -239,7 +239,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-Vec3T<typename TreeOrAccT::ValueType> TrilinearSampler<TreeOrAccT>::gradient(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
+__hostdev__ Vec3T<typename TreeOrAccT::ValueType> TrilinearSampler<TreeOrAccT>::gradient(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
{
static_assert(std::is_floating_point<ValueT>::value, "TrilinearSampler::gradient requires a floating-point type");
#if 0
@@ -270,7 +270,7 @@
}
template<typename TreeOrAccT>
-bool TrilinearSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[2][2][2])
+__hostdev__ bool TrilinearSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[2][2][2])
{
static_assert(std::is_floating_point<ValueT>::value, "TrilinearSampler::zeroCrossing requires a floating-point type");
const bool less = v[0][0][0] < ValueT(0);
@@ -363,7 +363,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::sample(xyz, mVal);
@@ -370,7 +370,7 @@
}
template<typename TreeOrAccT>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(const CoordT &ijk) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(const CoordT &ijk) const
{
return ijk == mPos ? mVal[0][0][0] : BaseT::mAcc.getValue(ijk);
}
@@ -377,7 +377,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, true>::gradient(Vec3T<RealT> xyz) const
+__hostdev__ Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, true>::gradient(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::gradient(xyz, mVal);
@@ -393,7 +393,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-void SampleFromVoxels<TreeOrAccT, 1, true>::cache(Vec3T<RealT>& xyz) const
+__hostdev__ void SampleFromVoxels<TreeOrAccT, 1, true>::cache(Vec3T<RealT>& xyz) const
{
CoordT ijk = Floor<CoordT>(xyz);
if (ijk != mPos) {
@@ -406,7 +406,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
{
ValueT val[2][2][2];
CoordT ijk = Floor<CoordT>(xyz);
@@ -418,7 +418,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
{
auto lerp = [](ValueT a, ValueT b, RealT w) { return a + ValueT(w) * (b - a); };
@@ -463,7 +463,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-inline Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, false>::gradient(Vec3T<RealT> xyz) const
+inline __hostdev__ Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, false>::gradient(Vec3T<RealT> xyz) const
{
ValueT val[2][2][2];
CoordT ijk = Floor<CoordT>(xyz);
@@ -473,7 +473,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-bool SampleFromVoxels<TreeOrAccT, 1, false>::zeroCrossing(Vec3T<RealT> xyz) const
+__hostdev__ bool SampleFromVoxels<TreeOrAccT, 1, false>::zeroCrossing(Vec3T<RealT> xyz) const
{
ValueT val[2][2][2];
CoordT ijk = Floor<CoordT>(xyz);
@@ -510,7 +510,7 @@
}; // TriquadraticSamplerBase
template<typename TreeOrAccT>
-void TriquadraticSampler<TreeOrAccT>::stencil(const CoordT &ijk, ValueT (&v)[3][3][3]) const
+__hostdev__ void TriquadraticSampler<TreeOrAccT>::stencil(const CoordT &ijk, ValueT (&v)[3][3][3]) const
{
CoordT p(ijk[0] - 1, 0, 0);
for (int dx = 0; dx < 3; ++dx, ++p[0]) {
@@ -526,7 +526,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType TriquadraticSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[3][3][3])
+__hostdev__ typename TreeOrAccT::ValueType TriquadraticSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[3][3][3])
{
auto kernel = [](const ValueT* value, double weight)->ValueT {
return weight * (weight * (0.5f * (value[0] + value[2]) - value[1]) +
@@ -545,7 +545,7 @@
}
template<typename TreeOrAccT>
-bool TriquadraticSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[3][3][3])
+__hostdev__ bool TriquadraticSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[3][3][3])
{
static_assert(std::is_floating_point<ValueT>::value, "TrilinearSampler::zeroCrossing requires a floating-point type");
const bool less = v[0][0][0] < ValueT(0);
@@ -624,7 +624,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::sample(xyz, mVal);
@@ -631,7 +631,7 @@
}
template<typename TreeOrAccT>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(const CoordT &ijk) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(const CoordT &ijk) const
{
return ijk == mPos ? mVal[1][1][1] : BaseT::mAcc.getValue(ijk);
}
@@ -646,7 +646,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-void SampleFromVoxels<TreeOrAccT, 2, true>::cache(Vec3T<RealT>& xyz) const
+__hostdev__ void SampleFromVoxels<TreeOrAccT, 2, true>::cache(Vec3T<RealT>& xyz) const
{
CoordT ijk = Floor<CoordT>(xyz);
if (ijk != mPos) {
@@ -657,7 +657,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, false>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, false>::operator()(Vec3T<RealT> xyz) const
{
ValueT val[3][3][3];
CoordT ijk = Floor<CoordT>(xyz);
@@ -667,7 +667,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-bool SampleFromVoxels<TreeOrAccT, 2, false>::zeroCrossing(Vec3T<RealT> xyz) const
+__hostdev__ bool SampleFromVoxels<TreeOrAccT, 2, false>::zeroCrossing(Vec3T<RealT> xyz) const
{
ValueT val[3][3][3];
CoordT ijk = Floor<CoordT>(xyz);
@@ -710,7 +710,7 @@
}; // TricubicSampler
template<typename TreeOrAccT>
-void TricubicSampler<TreeOrAccT>::stencil(const CoordT& ijk, ValueT (&C)[64]) const
+__hostdev__ void TricubicSampler<TreeOrAccT>::stencil(const CoordT& ijk, ValueT (&C)[64]) const
{
auto fetch = [&](int i, int j, int k) -> ValueT& { return C[((i + 1) << 4) + ((j + 1) << 2) + k + 1]; };
@@ -929,7 +929,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 3, true>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 3, true>::operator()(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::sample(xyz, mC);
@@ -937,7 +937,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-void SampleFromVoxels<TreeOrAccT, 3, true>::cache(Vec3T<RealT>& xyz) const
+__hostdev__ void SampleFromVoxels<TreeOrAccT, 3, true>::cache(Vec3T<RealT>& xyz) const
{
CoordT ijk = Floor<CoordT>(xyz);
if (ijk != mPos) {

View File

@@ -81,5 +81,4 @@ if(NOT APPLE)
set(WITH_CYCLES_DEVICE_OPTIX ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_CUDA_BINARIES ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_CUBIN_COMPILER OFF CACHE BOOL "" FORCE)
set(WITH_CYCLES_HIP_BINARIES ON CACHE BOOL "" FORCE)
endif()

View File

@@ -5,38 +5,38 @@
update-code:
git:
submodules:
- branch: blender-v3.0-release
- branch: master
commit_id: HEAD
path: release/scripts/addons
- branch: blender-v3.0-release
- branch: master
commit_id: HEAD
path: release/scripts/addons_contrib
- branch: blender-v3.0-release
- branch: master
commit_id: HEAD
path: release/datafiles/locale
- branch: blender-v3.0-release
- branch: master
commit_id: HEAD
path: source/tools
svn:
libraries:
darwin-arm64:
branch: tags/blender-3.0-release
branch: trunk
commit_id: HEAD
path: lib/darwin_arm64
darwin-x86_64:
branch: tags/blender-3.0-release
branch: trunk
commit_id: HEAD
path: lib/darwin
linux-x86_64:
branch: tags/blender-3.0-release
branch: trunk
commit_id: HEAD
path: lib/linux_centos7_x86_64
windows-amd64:
branch: tags/blender-3.0-release
branch: trunk
commit_id: HEAD
path: lib/win64_vc15
tests:
branch: tags/blender-3.0-release
branch: trunk
commit_id: HEAD
path: lib/tests
benchmarks:

View File

@@ -38,7 +38,7 @@ PROJECT_NAME = Blender
# could be handy for archiving the generated documentation or if some version
# control system is used.
PROJECT_NUMBER = V3.0
PROJECT_NUMBER = V3.1
# Using the PROJECT_BRIEF tag one can provide an optional one line description
# for a project that appears at the top of each page and should give viewer a

View File

@@ -42,13 +42,8 @@ class SimpleMouseOperator(bpy.types.Operator):
self.y = event.mouse_y
return self.execute(context)
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(SimpleMouseOperator.bl_idname, text="Simple Mouse Operator")
# Register and add to the view menu (required to also use F3 search "Simple Mouse Operator" for quick access)
bpy.utils.register_class(SimpleMouseOperator)
bpy.types.VIEW3D_MT_view.append(menu_func)
# Test call to the newly defined operator.
# Here we call the operator and invoke it, meaning that the settings are taken

View File

@@ -43,7 +43,7 @@ def menu_func(self, context):
self.layout.operator(ExportSomeData.bl_idname, text="Text Export Operator")
# Register and add to the file selector (required to also use F3 search "Text Export Operator" for quick access)
# Register and add to the file selector
bpy.utils.register_class(ExportSomeData)
bpy.types.TOPBAR_MT_file_export.append(menu_func)

View File

@@ -27,14 +27,8 @@ class DialogOperator(bpy.types.Operator):
wm = context.window_manager
return wm.invoke_props_dialog(self)
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(DialogOperator.bl_idname, text="Dialog Operator")
# Register and add to the object menu (required to also use F3 search "Dialog Operator" for quick access)
bpy.utils.register_class(DialogOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# Test call.
bpy.ops.object.dialog_operator('INVOKE_DEFAULT')

View File

@@ -41,13 +41,8 @@ class CustomDrawOperator(bpy.types.Operator):
col.prop(self, "my_string")
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(CustomDrawOperator.bl_idname, text="Custom Draw Operator")
# Register and add to the object menu (required to also use F3 search "Custom Draw Operator" for quick access)
bpy.utils.register_class(CustomDrawOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# test call
bpy.ops.object.custom_draw('INVOKE_DEFAULT')

View File

@@ -55,13 +55,8 @@ class ModalOperator(bpy.types.Operator):
context.window_manager.modal_handler_add(self)
return {'RUNNING_MODAL'}
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(ModalOperator.bl_idname, text="Modal Operator")
# Register and add to the object menu (required to also use F3 search "Modal Operator" for quick access)
bpy.utils.register_class(ModalOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# test call
bpy.ops.object.modal_operator('INVOKE_DEFAULT')

View File

@@ -31,13 +31,8 @@ class SearchEnumOperator(bpy.types.Operator):
context.window_manager.invoke_search_popup(self)
return {'RUNNING_MODAL'}
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(SearchEnumOperator.bl_idname, text="Search Enum Operator")
# Register and add to the object menu (required to also use F3 search "Search Enum Operator" for quick access)
bpy.utils.register_class(SearchEnumOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# test call
bpy.ops.object.search_enum_operator('INVOKE_DEFAULT')

View File

@@ -22,13 +22,8 @@ class HelloWorldOperator(bpy.types.Operator):
print("Hello World")
return {'FINISHED'}
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(HelloWorldOperator.bl_idname, text="Hello World Operator")
# Register and add to the view menu (required to also use F3 search "Hello World Operator" for quick access)
bpy.utils.register_class(HelloWorldOperator)
bpy.types.VIEW3D_MT_view.append(menu_func)
# test call to the newly defined operator
bpy.ops.wm.hello_world()

View File

@@ -106,6 +106,24 @@ including advanced features.
floating-point values. These values are interpreted as a plane equation.
.. function:: glColor (red, green, blue, alpha):
B{glColor3b, glColor3d, glColor3f, glColor3i, glColor3s, glColor3ub, glColor3ui, glColor3us,
glColor4b, glColor4d, glColor4f, glColor4i, glColor4s, glColor4ub, glColor4ui, glColor4us,
glColor3bv, glColor3dv, glColor3fv, glColor3iv, glColor3sv, glColor3ubv, glColor3uiv,
glColor3usv, glColor4bv, glColor4dv, glColor4fv, glColor4iv, glColor4sv, glColor4ubv,
glColor4uiv, glColor4usv}
Set a new color.
.. seealso:: `OpenGL Docs <https://khronos.org/registry/OpenGL-Refpages/gl4/html/glColor.xhtml>`__
:type red, green, blue, alpha: Depends on function prototype.
:arg red, green, blue: Specify new red, green, and blue values for the current color.
:arg alpha: Specifies a new alpha value for the current color. Included only in the
four-argument glColor4 commands. (With '4' colors only)
.. function:: glColorMask(red, green, blue, alpha):
Enable and disable writing of frame buffer color components

View File

@@ -1123,7 +1123,7 @@ context_type_map = {
"soft_body": ("SoftBodyModifier", False),
"speaker": ("Speaker", False),
"texture": ("Texture", False),
"texture_slot": ("TextureSlot", False),
"texture_slot": ("MaterialTextureSlot", False),
"texture_user": ("ID", False),
"texture_user_property": ("Property", False),
"ui_list": ("UIList", False),
@@ -1224,7 +1224,10 @@ def pycontext2sphinx(basepath):
while char_array[i] is not None:
member = ctypes.string_at(char_array[i]).decode(encoding="ascii")
fw(".. data:: %s\n\n" % member)
member_type, is_seq = context_type_map[member]
try:
member_type, is_seq = context_type_map[member]
except KeyError:
raise SystemExit("Error: context key %r not found in context_type_map; update %s" % (member, __file__)) from None
fw(" :type: %s :class:`bpy.types.%s`\n\n" % ("sequence of " if is_seq else "", member_type))
unique.add(member)
i += 1
@@ -2251,7 +2254,7 @@ def main():
# First monkey patch to load in fake members.
setup_monkey_patch()
# Perform changes to Blender it's self.
# Perform changes to Blender itself.
setup_data = setup_blender()
# eventually, create the dirs

12
extern/hipew/README vendored
View File

@@ -1,12 +0,0 @@
The HIP Extension Wrangler Library (HIPEW) is a cross-platform open-source
C/C++ library to dynamically load the HIP library.
HIP (Heterogeneous-Compute Interface for Portability) is an API for C++
programming on AMD GPUs.
It is maintained as part of the Blender project, but included in extern/
for consistency with CUEW and CLEW libraries.
LICENSE
HIPEW is released under the Apache 2.0 license.

View File

@@ -1,5 +0,0 @@
Project: Blender
URL: https://git.blender.org/blender.git
License: Apache 2.0
Upstream version: N/A
Local modifications: None

View File

@@ -804,29 +804,31 @@ typedef enum hipDeviceP2PAttr {
} hipDeviceP2PAttr;
typedef struct HIP_MEMCPY3D {
unsigned int srcXInBytes;
unsigned int srcY;
unsigned int srcZ;
unsigned int srcLOD;
size_t srcXInBytes;
size_t srcY;
size_t srcZ;
size_t srcLOD;
hipMemoryType srcMemoryType;
const void* srcHost;
hipDeviceptr_t srcDevice;
hArray srcArray;
unsigned int srcPitch;
unsigned int srcHeight;
unsigned int dstXInBytes;
unsigned int dstY;
unsigned int dstZ;
unsigned int dstLOD;
hArray * srcArray;
void* reserved0;
size_t srcPitch;
size_t srcHeight;
size_t dstXInBytes;
size_t dstY;
size_t dstZ;
size_t dstLOD;
hipMemoryType dstMemoryType;
void* dstHost;
hipDeviceptr_t dstDevice;
hArray dstArray;
unsigned int dstPitch;
unsigned int dstHeight;
unsigned int WidthInBytes;
unsigned int Height;
unsigned int Depth;
hArray * dstArray;
void* reserved1;
size_t dstPitch;
size_t dstHeight;
size_t WidthInBytes;
size_t Height;
size_t Depth;
} HIP_MEMCPY3D;
typedef struct HIP_MEMCPY3D_PEER_st {
@@ -877,7 +879,7 @@ typedef struct HIP_RESOURCE_DESC_st {
hipResourceType resType;
union {
struct {
hArray h_Array;
hArray * h_Array;
} array;
struct {
hipMipmappedArray_t hMipmappedArray;
@@ -1072,10 +1074,9 @@ typedef enum hiprtcResult {
typedef hipError_t HIPAPI thipGetErrorName(hipError_t error, const char** pStr);
typedef hipError_t HIPAPI thipInit(unsigned int Flags);
typedef hipError_t HIPAPI thipDriverGetVersion(int* driverVersion);
typedef hipError_t HIPAPI thipGetDevice(int* device);
typedef hipError_t HIPAPI thipGetDevice(hipDevice_t* device, int ordinal);
typedef hipError_t HIPAPI thipGetDeviceCount(int* count);
typedef hipError_t HIPAPI thipGetDeviceProperties(hipDeviceProp_t* props, int deviceId);
typedef hipError_t HIPAPI thipDeviceGet(hipDevice_t* device, int ordinal);
typedef hipError_t HIPAPI thipDeviceGetName(char* name, int len, hipDevice_t dev);
typedef hipError_t HIPAPI thipDeviceGetAttribute(int* pi, hipDeviceAttribute_t attrib, hipDevice_t dev);
typedef hipError_t HIPAPI thipDeviceComputeCapability(int* major, int* minor, hipDevice_t dev);
@@ -1208,7 +1209,6 @@ extern thipDriverGetVersion *hipDriverGetVersion;
extern thipGetDevice *hipGetDevice;
extern thipGetDeviceCount *hipGetDeviceCount;
extern thipGetDeviceProperties *hipGetDeviceProperties;
extern thipDeviceGet* hipDeviceGet;
extern thipDeviceGetName *hipDeviceGetName;
extern thipDeviceGetAttribute *hipDeviceGetAttribute;
extern thipDeviceComputeCapability *hipDeviceComputeCapability;
@@ -1333,7 +1333,6 @@ enum {
HIPEW_SUCCESS = 0,
HIPEW_ERROR_OPEN_FAILED = -1,
HIPEW_ERROR_ATEXIT_FAILED = -2,
HIPEW_ERROR_OLD_DRIVER = -3,
};
enum {

View File

@@ -71,7 +71,6 @@ thipDriverGetVersion *hipDriverGetVersion;
thipGetDevice *hipGetDevice;
thipGetDeviceCount *hipGetDeviceCount;
thipGetDeviceProperties *hipGetDeviceProperties;
thipDeviceGet* hipDeviceGet;
thipDeviceGetName *hipDeviceGetName;
thipDeviceGetAttribute *hipDeviceGetAttribute;
thipDeviceComputeCapability *hipDeviceComputeCapability;
@@ -214,36 +213,6 @@ static void hipewHipExit(void) {
}
}
#ifdef _WIN32
static int hipewHasOldDriver(const char *hip_path) {
DWORD verHandle = 0;
DWORD verSize = GetFileVersionInfoSize(hip_path, &verHandle);
int old_driver = 0;
if (verSize != 0) {
LPSTR verData = (LPSTR)malloc(verSize);
if (GetFileVersionInfo(hip_path, verHandle, verSize, verData)) {
LPBYTE lpBuffer = NULL;
UINT size = 0;
if (VerQueryValue(verData, "\\", (VOID FAR * FAR *)&lpBuffer, &size)) {
if (size) {
VS_FIXEDFILEINFO *verInfo = (VS_FIXEDFILEINFO *)lpBuffer;
/* Magic value from
* https://docs.microsoft.com/en-us/windows/win32/api/verrsrc/ns-verrsrc-vs_fixedfileinfo */
if (verInfo->dwSignature == 0xfeef04bd) {
unsigned int fileVersionLS0 = (verInfo->dwFileVersionLS >> 16) & 0xffff;
unsigned int fileversionLS1 = (verInfo->dwFileVersionLS >> 0) & 0xffff;
/* Corresponds to versions older than AMD Radeon Pro 21.Q4. */
old_driver = ((fileVersionLS0 < 3354) || (fileVersionLS0 == 3354 && fileversionLS1 < 13));
}
}
}
}
free(verData);
}
return old_driver;
}
#endif
static int hipewHipInit(void) {
/* Library paths. */
#ifdef _WIN32
@@ -271,14 +240,6 @@ static int hipewHipInit(void) {
return result;
}
#ifdef _WIN32
/* Test for driver version. */
if(hipewHasOldDriver(hip_paths[0])) {
result = HIPEW_ERROR_OLD_DRIVER;
return result;
}
#endif
/* Load library. */
hip_lib = dynamic_library_open_find(hip_paths);
@@ -294,7 +255,6 @@ static int hipewHipInit(void) {
HIP_LIBRARY_FIND_CHECKED(hipGetDevice);
HIP_LIBRARY_FIND_CHECKED(hipGetDeviceCount);
HIP_LIBRARY_FIND_CHECKED(hipGetDeviceProperties);
HIP_LIBRARY_FIND_CHECKED(hipDeviceGet);
HIP_LIBRARY_FIND_CHECKED(hipDeviceGetName);
HIP_LIBRARY_FIND_CHECKED(hipDeviceGetAttribute);
HIP_LIBRARY_FIND_CHECKED(hipDeviceComputeCapability);

View File

@@ -1,7 +1,7 @@
Project: NanoSVG
URL: https://github.com/memononen/nanosvg
License: zlib
Upstream version: 3cdd4a9d7886
Upstream version:
Local modifications: Added some functionality to manage grease pencil layers
Added a fix to SVG import arc and float errors (https://developer.blender.org/rB11dc674c78b49fc4e0b7c134c375b6c8b8eacbcc)

View File

@@ -82,7 +82,7 @@ static void session_print_status()
string status, substatus;
/* get status */
double progress = options.session->progress.get_progress();
float progress = options.session->progress.get_progress();
options.session->progress.get_status(status, substatus);
if (substatus != "")
@@ -183,7 +183,7 @@ static void display_info(Progress &progress)
progress.get_time(total_time, sample_time);
progress.get_status(status, substatus);
double progress_val = progress.get_progress();
float progress_val = progress.get_progress();
if (substatus != "")
status += ": " + substatus;

View File

@@ -138,11 +138,6 @@ endif()
blender_add_lib(bf_intern_cycles "${SRC}" "${INC}" "${INC_SYS}" "${LIB}")
# avoid link failure with clang 3.4 debug
if(CMAKE_C_COMPILER_ID MATCHES "Clang" AND NOT ${CMAKE_C_COMPILER_VERSION} VERSION_LESS '3.4')
string(APPEND CMAKE_CXX_FLAGS_DEBUG " -gline-tables-only")
endif()
add_dependencies(bf_intern_cycles bf_rna)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${ADDON_FILES}" ${CYCLES_INSTALL_PATH})

View File

@@ -346,7 +346,7 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
name="Scrambling Distance",
default=1.0,
min=0.0, max=1.0,
description="Reduce randomization between pixels to improve GPU rendering performance, at the cost of possible rendering artifacts if set too low. Only works when not using adaptive sampling",
description="Lower values give faster rendering with GPU rendering and less noise with all devices at the cost of possible artifacts if set too low. Only works when not using adaptive sampling",
)
preview_scrambling_distance: BoolProperty(
name="Scrambling Distance viewport",
@@ -354,10 +354,10 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
description="Uses the Scrambling Distance value for the viewport. Faster but may flicker",
)
auto_scrambling_distance: BoolProperty(
name="Automatic Scrambling Distance",
adaptive_scrambling_distance: BoolProperty(
name="Adaptive Scrambling Distance",
default=False,
description="Automatically reduce the randomization between pixels to improve GPU rendering performance, at the cost of possible rendering artifacts. Only works when not using adaptive sampling",
description="Uses a formula to adapt the scrambling distance strength based on the sample count",
)
use_layer_samples: EnumProperty(
@@ -770,8 +770,8 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
)
use_auto_tile: BoolProperty(
name="Use Tiling",
description="Render high resolution images in tiles to reduce memory usage, using the specified tile size. Tiles are cached to disk while rendering to save memory",
name="Auto Tiles",
description="Automatically render high resolution images in tiles to reduce memory usage, using the specified tile size. Tiles are cached to disk while rendering to save memory",
default=True,
)
tile_size: IntProperty(

View File

@@ -292,13 +292,13 @@ class CYCLES_RENDER_PT_sampling_advanced(CyclesButtonsPanel, Panel):
layout.separator()
heading = layout.column(align=True, heading="Scrambling Distance")
heading.active = not (cscene.use_adaptive_sampling and cscene.use_preview_adaptive_sampling)
heading.prop(cscene, "auto_scrambling_distance", text="Automatic")
sub = heading.row()
col = layout.column(align=True)
col.active = not (cscene.use_adaptive_sampling and cscene.use_preview_adaptive_sampling)
col.prop(cscene, "scrambling_distance", text="Scrambling Distance")
col.prop(cscene, "adaptive_scrambling_distance", text="Adaptive")
sub = col.row(align=True)
sub.active = not cscene.use_preview_adaptive_sampling
sub.prop(cscene, "preview_scrambling_distance", text="Viewport")
heading.prop(cscene, "scrambling_distance", text="Multiplier")
layout.separator()
@@ -1051,7 +1051,7 @@ class CYCLES_OBJECT_PT_motion_blur(CyclesButtonsPanel, Panel):
def has_geometry_visibility(ob):
return ob and ((ob.type in {'MESH', 'CURVE', 'SURFACE', 'FONT', 'META', 'LIGHT', 'VOLUME', 'POINTCLOUD', 'HAIR'}) or
return ob and ((ob.type in {'MESH', 'CURVE', 'SURFACE', 'FONT', 'META', 'LIGHT'}) or
(ob.instance_type == 'COLLECTION' and ob.instance_collection))

View File

@@ -819,14 +819,11 @@ void BlenderSync::sync_hair(BL::Depsgraph b_depsgraph, BObjectInfo &b_ob_info, H
new_hair.set_used_shaders(used_shaders);
if (view_layer.use_hair) {
#ifdef WITH_HAIR_NODES
if (b_ob_info.object_data.is_a(&RNA_Hair)) {
/* Hair object. */
sync_hair(&new_hair, b_ob_info, false);
}
else
#endif
{
else {
/* Particle hair. */
bool need_undeformed = new_hair.need_attribute(scene, ATTR_STD_GENERATED);
BL::Mesh b_mesh = object_to_mesh(
@@ -873,15 +870,12 @@ void BlenderSync::sync_hair_motion(BL::Depsgraph b_depsgraph,
/* Export deformed coordinates. */
if (ccl::BKE_object_is_deform_modified(b_ob_info, b_scene, preview)) {
#ifdef WITH_HAIR_NODES
if (b_ob_info.object_data.is_a(&RNA_Hair)) {
/* Hair object. */
sync_hair(hair, b_ob_info, true, motion_step);
return;
}
else
#endif
{
else {
/* Particle hair. */
BL::Mesh b_mesh = object_to_mesh(
b_data, b_ob_info, b_depsgraph, false, Mesh::SUBDIVISION_NONE);

View File

@@ -31,11 +31,7 @@ CCL_NAMESPACE_BEGIN
static Geometry::Type determine_geom_type(BObjectInfo &b_ob_info, bool use_particle_hair)
{
#ifdef WITH_HAIR_NODES
if (b_ob_info.object_data.is_a(&RNA_Hair) || use_particle_hair) {
#else
if (use_particle_hair) {
#endif
return Geometry::HAIR;
}
@@ -219,11 +215,7 @@ void BlenderSync::sync_geometry_motion(BL::Depsgraph &b_depsgraph,
if (progress.get_cancel())
return;
#ifdef WITH_HAIR_NODES
if (b_ob_info.object_data.is_a(&RNA_Hair) || use_particle_hair) {
#else
if (use_particle_hair) {
#endif
Hair *hair = static_cast<Hair *>(geom);
sync_hair_motion(b_depsgraph, b_ob_info, hair, motion_step);
}

View File

@@ -24,14 +24,8 @@ CCL_NAMESPACE_BEGIN
/* Packed Images */
BlenderImageLoader::BlenderImageLoader(BL::Image b_image,
const int frame,
const bool is_preview_render)
: b_image(b_image),
frame(frame),
/* Don't free cache for preview render to avoid race condition from T93560, to be fixed
properly later as we are close to release. */
free_cache(!is_preview_render && !b_image.has_data())
BlenderImageLoader::BlenderImageLoader(BL::Image b_image, int frame)
: b_image(b_image), frame(frame), free_cache(!b_image.has_data())
{
}

View File

@@ -25,7 +25,7 @@ CCL_NAMESPACE_BEGIN
class BlenderImageLoader : public ImageLoader {
public:
BlenderImageLoader(BL::Image b_image, const int frame, const bool is_preview_render);
BlenderImageLoader(BL::Image b_image, int frame);
bool load_metadata(const ImageDeviceFeatures &features, ImageMetaData &metadata) override;
bool load_pixels(const ImageMetaData &metadata,

View File

@@ -62,15 +62,15 @@ bool BlenderSync::BKE_object_is_modified(BL::Object &b_ob)
return false;
}
bool BlenderSync::object_is_geometry(BObjectInfo &b_ob_info)
bool BlenderSync::object_is_geometry(BL::Object &b_ob)
{
BL::ID b_ob_data = b_ob_info.object_data;
BL::ID b_ob_data = b_ob.data();
if (!b_ob_data) {
return false;
}
BL::Object::type_enum type = b_ob_info.iter_object.type();
BL::Object::type_enum type = b_ob.type();
if (type == BL::Object::type_VOLUME || type == BL::Object::type_HAIR) {
/* Will be exported attached to mesh. */
@@ -87,24 +87,6 @@ bool BlenderSync::object_is_geometry(BObjectInfo &b_ob_info)
return b_ob_data.is_a(&RNA_Mesh);
}
bool BlenderSync::object_can_have_geometry(BL::Object &b_ob)
{
BL::Object::type_enum type = b_ob.type();
switch (type) {
case BL::Object::type_MESH:
case BL::Object::type_CURVE:
case BL::Object::type_SURFACE:
case BL::Object::type_META:
case BL::Object::type_FONT:
case BL::Object::type_HAIR:
case BL::Object::type_POINTCLOUD:
case BL::Object::type_VOLUME:
return true;
default:
return false;
}
}
bool BlenderSync::object_is_light(BL::Object &b_ob)
{
BL::ID b_ob_data = b_ob.data();
@@ -207,7 +189,7 @@ Object *BlenderSync::sync_object(BL::Depsgraph &b_depsgraph,
}
/* only interested in object that we can create meshes from */
if (!object_is_geometry(b_ob_info)) {
if (!object_is_geometry(b_ob)) {
return NULL;
}
@@ -294,7 +276,7 @@ Object *BlenderSync::sync_object(BL::Depsgraph &b_depsgraph,
object->set_visibility(visibility);
object->set_is_shadow_catcher(b_ob.is_shadow_catcher() || b_parent.is_shadow_catcher());
object->set_is_shadow_catcher(b_ob.is_shadow_catcher());
float shadow_terminator_shading_offset = get_float(cobject, "shadow_terminator_offset");
object->set_shadow_terminator_shading_offset(shadow_terminator_shading_offset);

View File

@@ -120,7 +120,7 @@ void BlenderOutputDriver::write_render_tile(const Tile &tile)
b_pass.rect(&pixels[0]);
}
b_engine_.end_result(b_rr, false, false, true);
b_engine_.end_result(b_rr, true, false, true);
}
CCL_NAMESPACE_END

View File

@@ -129,7 +129,7 @@ void BlenderSession::create_session()
/* reset status/progress */
last_status = "";
last_error = "";
last_progress = -1.0;
last_progress = -1.0f;
start_resize_time = 0.0;
/* create session */
@@ -615,24 +615,6 @@ void BlenderSession::bake(BL::Depsgraph &b_depsgraph_,
sync->sync_camera(b_render, b_camera_override, width, height, "");
sync->sync_data(
b_render, b_depsgraph, b_v3d, b_camera_override, width, height, &python_thread_state);
/* Filtering settings for combined pass. */
if (pass->get_type() == PASS_COMBINED) {
Integrator *integrator = scene->integrator;
integrator->set_use_direct_light((bake_filter & BL::BakeSettings::pass_filter_DIRECT) != 0);
integrator->set_use_indirect_light((bake_filter & BL::BakeSettings::pass_filter_INDIRECT) !=
0);
integrator->set_use_diffuse((bake_filter & BL::BakeSettings::pass_filter_DIFFUSE) != 0);
integrator->set_use_glossy((bake_filter & BL::BakeSettings::pass_filter_GLOSSY) != 0);
integrator->set_use_transmission(
(bake_filter & BL::BakeSettings::pass_filter_TRANSMISSION) != 0);
integrator->set_use_emission((bake_filter & BL::BakeSettings::pass_filter_EMIT) != 0);
}
/* Always use transpanent background for baking. */
scene->background->set_transparent(true);
/* Load built-in images from Blender. */
builtin_images_load();
}
@@ -859,7 +841,7 @@ void BlenderSession::get_status(string &status, string &substatus)
session->progress.get_status(status, substatus);
}
void BlenderSession::get_progress(double &progress, double &total_time, double &render_time)
void BlenderSession::get_progress(float &progress, double &total_time, double &render_time)
{
session->progress.get_time(total_time, render_time);
progress = session->progress.get_progress();
@@ -867,10 +849,10 @@ void BlenderSession::get_progress(double &progress, double &total_time, double &
void BlenderSession::update_bake_progress()
{
double progress = session->progress.get_progress();
float progress = session->progress.get_progress();
if (progress != last_progress) {
b_engine.update_progress((float)progress);
b_engine.update_progress(progress);
last_progress = progress;
}
}
@@ -879,7 +861,7 @@ void BlenderSession::update_status_progress()
{
string timestatus, status, substatus;
string scene_status = "";
double progress;
float progress;
double total_time, remaining_time = 0, render_time;
float mem_used = (float)session->stats.mem_used / 1024.0f / 1024.0f;
float mem_peak = (float)session->stats.mem_peak / 1024.0f / 1024.0f;
@@ -923,7 +905,7 @@ void BlenderSession::update_status_progress()
last_status_time = current_time;
}
if (progress != last_progress) {
b_engine.update_progress((float)progress);
b_engine.update_progress(progress);
last_progress = progress;
}

View File

@@ -82,7 +82,7 @@ class BlenderSession {
void tag_redraw();
void tag_update();
void get_status(string &status, string &substatus);
void get_progress(double &progress, double &total_time, double &render_time);
void get_progress(float &progress, double &total_time, double &render_time);
void test_cancel();
void update_status_progress();
void update_bake_progress();
@@ -108,7 +108,7 @@ class BlenderSession {
string last_status;
string last_error;
double last_progress;
float last_progress;
double last_status_time;
int width, height;

View File

@@ -762,8 +762,7 @@ static ShaderNode *add_node(Scene *scene,
int scene_frame = b_scene.frame_current();
int image_frame = image_user_frame_number(b_image_user, b_image, scene_frame);
image->handle = scene->image_manager->add_image(
new BlenderImageLoader(b_image, image_frame, b_engine.is_preview()),
image->image_params());
new BlenderImageLoader(b_image, image_frame), image->image_params());
}
else {
ustring filename = ustring(
@@ -798,9 +797,8 @@ static ShaderNode *add_node(Scene *scene,
if (is_builtin) {
int scene_frame = b_scene.frame_current();
int image_frame = image_user_frame_number(b_image_user, b_image, scene_frame);
env->handle = scene->image_manager->add_image(
new BlenderImageLoader(b_image, image_frame, b_engine.is_preview()),
env->image_params());
env->handle = scene->image_manager->add_image(new BlenderImageLoader(b_image, image_frame),
env->image_params());
}
else {
env->set_filename(

View File

@@ -162,19 +162,19 @@ void BlenderSync::sync_recalc(BL::Depsgraph &b_depsgraph, BL::SpaceView3D &b_v3d
/* Object */
else if (b_id.is_a(&RNA_Object)) {
BL::Object b_ob(b_id);
const bool can_have_geometry = object_can_have_geometry(b_ob);
const bool is_light = !can_have_geometry && object_is_light(b_ob);
const bool is_geometry = object_is_geometry(b_ob);
const bool is_light = !is_geometry && object_is_light(b_ob);
if (b_ob.is_instancer() && b_update.is_updated_shading()) {
/* Needed for e.g. object color updates on instancer. */
object_map.set_recalc(b_ob);
}
if (can_have_geometry || is_light) {
if (is_geometry || is_light) {
const bool updated_geometry = b_update.is_updated_geometry();
/* Geometry (mesh, hair, volume). */
if (can_have_geometry) {
if (is_geometry) {
if (b_update.is_updated_transform() || b_update.is_updated_shading()) {
object_map.set_recalc(b_ob);
}
@@ -365,8 +365,8 @@ void BlenderSync::sync_integrator(BL::ViewLayer &b_view_layer, bool background)
int samples = get_int(cscene, "samples");
float scrambling_distance = get_float(cscene, "scrambling_distance");
bool auto_scrambling_distance = get_boolean(cscene, "auto_scrambling_distance");
if (auto_scrambling_distance) {
bool adaptive_scrambling_distance = get_boolean(cscene, "adaptive_scrambling_distance");
if (adaptive_scrambling_distance) {
scrambling_distance *= 4.0f / sqrtf(samples);
}

View File

@@ -208,8 +208,7 @@ class BlenderSync {
/* util */
void find_shader(BL::ID &id, array<Node *> &used_shaders, Shader *default_shader);
bool BKE_object_is_modified(BL::Object &b_ob);
bool object_is_geometry(BObjectInfo &b_ob_info);
bool object_can_have_geometry(BL::Object &b_ob);
bool object_is_geometry(BL::Object &b_ob);
bool object_is_light(BL::Object &b_ob);
/* variables */

View File

@@ -30,17 +30,15 @@ BVHOptiX::BVHOptiX(const BVHParams &params_,
: BVH(params_, geometry_, objects_),
device(device),
traversable_handle(0),
as_data(make_unique<device_only_memory<char>>(
device, params.top_level ? "optix tlas" : "optix blas", false)),
motion_transform_data(
make_unique<device_only_memory<char>>(device, "optix motion transform", false))
as_data(device, params_.top_level ? "optix tlas" : "optix blas", false),
motion_transform_data(device, "optix motion transform", false)
{
}
BVHOptiX::~BVHOptiX()
{
/* Acceleration structure memory is delayed freed on device, since deleting the
* BVH may happen while still being used for rendering. */
// Acceleration structure memory is delayed freed on device, since deleting the
// BVH may happen while still being used for rendering.
device->release_optix_bvh(this);
}

View File

@@ -25,16 +25,14 @@
# include "device/memory.h"
# include "util/unique_ptr.h"
CCL_NAMESPACE_BEGIN
class BVHOptiX : public BVH {
public:
Device *device;
uint64_t traversable_handle;
unique_ptr<device_only_memory<char>> as_data;
unique_ptr<device_only_memory<char>> motion_transform_data;
device_only_memory<char> as_data;
device_only_memory<char> motion_transform_data;
protected:
friend class BVH;

View File

@@ -38,7 +38,6 @@ void device_cpu_info(vector<DeviceInfo> &devices)
info.id = "CPU";
info.num = 0;
info.has_osl = true;
info.has_half_images = true;
info.has_nanovdb = true;
info.has_profiling = true;
if (openimagedenoise_supported()) {

View File

@@ -134,7 +134,8 @@ void CPUDevice::mem_alloc(device_memory &mem)
<< string_human_readable_size(mem.memory_size()) << ")";
}
if (mem.type == MEM_DEVICE_ONLY || !mem.host_pointer) {
if (mem.type == MEM_DEVICE_ONLY) {
assert(!mem.host_pointer);
size_t alignment = MIN_ALIGNMENT_CPU_DATA_TYPES;
void *data = util_aligned_malloc(mem.memory_size(), alignment);
mem.device_pointer = (device_ptr)data;
@@ -193,7 +194,7 @@ void CPUDevice::mem_free(device_memory &mem)
tex_free((device_texture &)mem);
}
else if (mem.device_pointer) {
if (mem.type == MEM_DEVICE_ONLY || !mem.host_pointer) {
if (mem.type == MEM_DEVICE_ONLY) {
util_aligned_free((void *)mem.device_pointer);
}
mem.device_pointer = 0;

View File

@@ -42,7 +42,7 @@ class CPUKernels {
IntegratorInitFunction integrator_init_from_camera;
IntegratorInitFunction integrator_init_from_bake;
IntegratorShadeFunction integrator_intersect_closest;
IntegratorFunction integrator_intersect_closest;
IntegratorFunction integrator_intersect_shadow;
IntegratorFunction integrator_intersect_subsurface;
IntegratorFunction integrator_intersect_volume_stack;

View File

@@ -144,7 +144,6 @@ void device_cuda_info(vector<DeviceInfo> &devices)
info.description = string(name);
info.num = num;
info.has_half_images = (major >= 3);
info.has_nanovdb = true;
info.denoisers = 0;

View File

@@ -680,7 +680,7 @@ CUDADevice::CUDAMem *CUDADevice::generic_alloc(device_memory &mem, size_t pitch_
void *shared_pointer = 0;
if (mem_alloc_result != CUDA_SUCCESS && can_map_host && mem.type != MEM_DEVICE_ONLY) {
if (mem_alloc_result != CUDA_SUCCESS && can_map_host) {
if (mem.shared_pointer) {
/* Another device already allocated host memory. */
mem_alloc_result = CUDA_SUCCESS;
@@ -703,14 +703,8 @@ CUDADevice::CUDAMem *CUDADevice::generic_alloc(device_memory &mem, size_t pitch_
}
if (mem_alloc_result != CUDA_SUCCESS) {
if (mem.type == MEM_DEVICE_ONLY) {
status = " failed, out of device memory";
set_error("System is out of GPU memory");
}
else {
status = " failed, out of device and host memory";
set_error("System is out of GPU and shared host memory");
}
status = " failed, out of device and host memory";
set_error("System is out of GPU and shared host memory");
}
if (mem.name) {
@@ -783,7 +777,6 @@ void CUDADevice::generic_free(device_memory &mem)
if (mem.device_pointer) {
CUDAContextScope scope(this);
thread_scoped_lock lock(cuda_mem_map_mutex);
DCHECK(cuda_mem_map.find(&mem) != cuda_mem_map.end());
const CUDAMem &cmem = cuda_mem_map[&mem];
/* If cmem.use_mapped_host is true, reference counting is used
@@ -1152,7 +1145,6 @@ void CUDADevice::tex_free(device_texture &mem)
if (mem.device_pointer) {
CUDAContextScope scope(this);
thread_scoped_lock lock(cuda_mem_map_mutex);
DCHECK(cuda_mem_map.find(&mem) != cuda_mem_map.end());
const CUDAMem &cmem = cuda_mem_map[&mem];
if (cmem.texobject) {

View File

@@ -286,7 +286,6 @@ DeviceInfo Device::get_multi_device(const vector<DeviceInfo> &subdevices,
info.description = "Multi Device";
info.num = 0;
info.has_half_images = true;
info.has_nanovdb = true;
info.has_osl = true;
info.has_profiling = true;
@@ -333,7 +332,6 @@ DeviceInfo Device::get_multi_device(const vector<DeviceInfo> &subdevices,
}
/* Accumulate device info. */
info.has_half_images &= device.has_half_images;
info.has_nanovdb &= device.has_nanovdb;
info.has_osl &= device.has_osl;
info.has_profiling &= device.has_profiling;

View File

@@ -73,7 +73,6 @@ class DeviceInfo {
int num;
bool display_device; /* GPU is used as a display device. */
bool has_nanovdb; /* Support NanoVDB volumes. */
bool has_half_images; /* Support half-float textures. */
bool has_osl; /* Support Open Shading Language. */
bool has_profiling; /* Supports runtime collection of profiling info. */
bool has_peer_memory; /* GPU has P2P access to memory of another GPU. */
@@ -90,7 +89,6 @@ class DeviceInfo {
num = 0;
cpu_threads = 0;
display_device = false;
has_half_images = false;
has_nanovdb = false;
has_osl = false;
has_profiling = false;

View File

@@ -57,16 +57,9 @@ bool device_hip_init()
}
}
else {
if (hipew_result == HIPEW_ERROR_ATEXIT_FAILED) {
VLOG(1) << "HIPEW initialization failed: Error setting up atexit() handler";
}
else if (hipew_result == HIPEW_ERROR_OLD_DRIVER) {
VLOG(1) << "HIPEW initialization failed: Driver version too old, requires AMD Radeon Pro "
"21.Q4 driver or newer";
}
else {
VLOG(1) << "HIPEW initialization failed: Error opening HIP dynamic library";
}
VLOG(1) << "HIPEW initialization failed: "
<< ((hipew_result == HIPEW_ERROR_ATEXIT_FAILED) ? "Error setting up atexit() handler" :
"Error opening the library");
}
return result;
@@ -148,7 +141,6 @@ void device_hip_info(vector<DeviceInfo> &devices)
info.description = string(name);
info.num = num;
info.has_half_images = true;
info.has_nanovdb = true;
info.denoisers = 0;

View File

@@ -99,7 +99,7 @@ HIPDevice::HIPDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
}
/* Setup device and context. */
result = hipDeviceGet(&hipDevice, hipDevId);
result = hipGetDevice(&hipDevice, hipDevId);
if (result != hipSuccess) {
set_error(string_printf("Failed to get HIP device handle from ordinal (%s)",
hipewErrorString(result)));
@@ -222,6 +222,7 @@ string HIPDevice::compile_kernel_get_common_cflags(const uint kernel_features)
const string include_path = source_path;
string cflags = string_printf(
"-m%d "
"--ptxas-options=\"-v\" "
"--use_fast_math "
"-DHIPCC "
"-I\"%s\"",
@@ -233,7 +234,10 @@ string HIPDevice::compile_kernel_get_common_cflags(const uint kernel_features)
return cflags;
}
string HIPDevice::compile_kernel(const uint kernel_features, const char *name, const char *base)
string HIPDevice::compile_kernel(const uint kernel_features,
const char *name,
const char *base,
bool force_ptx)
{
/* Compute kernel name. */
int major, minor;
@@ -251,11 +255,13 @@ string HIPDevice::compile_kernel(const uint kernel_features, const char *name, c
/* Attempt to use kernel provided with Blender. */
if (!use_adaptive_compilation()) {
const string fatbin = path_get(string_printf("lib/%s_%s.fatbin", name, arch));
VLOG(1) << "Testing for pre-compiled kernel " << fatbin << ".";
if (path_exists(fatbin)) {
VLOG(1) << "Using precompiled kernel.";
return fatbin;
if (!force_ptx) {
const string fatbin = path_get(string_printf("lib/%s_%s.fatbin", name, arch));
VLOG(1) << "Testing for pre-compiled kernel " << fatbin << ".";
if (path_exists(fatbin)) {
VLOG(1) << "Using precompiled kernel.";
return fatbin;
}
}
}
@@ -292,9 +298,9 @@ string HIPDevice::compile_kernel(const uint kernel_features, const char *name, c
# ifdef _WIN32
if (!use_adaptive_compilation() && have_precompiled_kernels()) {
if (!hipSupportsDevice(hipDevId)) {
if (major < 3) {
set_error(
string_printf("HIP backend requires compute capability 10.1 or up, but found %d.%d. "
string_printf("HIP backend requires compute capability 3.0 or up, but found %d.%d. "
"Your GPU is not supported.",
major,
minor));
@@ -745,7 +751,6 @@ void HIPDevice::generic_free(device_memory &mem)
if (mem.device_pointer) {
HIPContextScope scope(this);
thread_scoped_lock lock(hip_mem_map_mutex);
DCHECK(hip_mem_map.find(&mem) != hip_mem_map.end());
const HIPMem &cmem = hip_mem_map[&mem];
/* If cmem.use_mapped_host is true, reference counting is used
@@ -989,16 +994,16 @@ void HIPDevice::tex_alloc(device_texture &mem)
<< string_human_readable_number(mem.memory_size()) << " bytes. ("
<< string_human_readable_size(mem.memory_size()) << ")";
hip_assert(hipArray3DCreate((hArray *)&array_3d, &desc));
hip_assert(hipArray3DCreate(&array_3d, &desc));
if (!array_3d) {
return;
}
HIP_MEMCPY3D param;
memset(&param, 0, sizeof(HIP_MEMCPY3D));
memset(&param, 0, sizeof(param));
param.dstMemoryType = hipMemoryTypeArray;
param.dstArray = array_3d;
param.dstArray = &array_3d;
param.srcMemoryType = hipMemoryTypeHost;
param.srcHost = mem.host_pointer;
param.srcPitch = src_pitch;
@@ -1064,13 +1069,13 @@ void HIPDevice::tex_alloc(device_texture &mem)
if (mem.info.data_type != IMAGE_DATA_TYPE_NANOVDB_FLOAT &&
mem.info.data_type != IMAGE_DATA_TYPE_NANOVDB_FLOAT3) {
/* Bindless textures. */
/* Kepler+, bindless textures. */
hipResourceDesc resDesc;
memset(&resDesc, 0, sizeof(resDesc));
if (array_3d) {
resDesc.resType = hipResourceTypeArray;
resDesc.res.array.h_Array = array_3d;
resDesc.res.array.h_Array = &array_3d;
resDesc.flags = 0;
}
else if (mem.data_height > 0) {
@@ -1115,7 +1120,6 @@ void HIPDevice::tex_free(device_texture &mem)
if (mem.device_pointer) {
HIPContextScope scope(this);
thread_scoped_lock lock(hip_mem_map_mutex);
DCHECK(hip_mem_map.find(&mem) != hip_mem_map.end());
const HIPMem &cmem = hip_mem_map[&mem];
if (cmem.texobject) {
@@ -1156,8 +1160,6 @@ bool HIPDevice::should_use_graphics_interop()
* possible, but from the empiric measurements it can be considerably slower than using naive
* pixels copy. */
/* Disable graphics interop for now, because of driver bug in 21.40. See T92972 */
# if 0
HIPContextScope scope(this);
int num_all_devices = 0;
@@ -1176,7 +1178,6 @@ bool HIPDevice::should_use_graphics_interop()
return true;
}
}
# endif
return false;
}

View File

@@ -95,7 +95,8 @@ class HIPDevice : public Device {
string compile_kernel(const uint kernel_features,
const char *name,
const char *base = "hip");
const char *base = "hip",
bool force_ptx = false);
virtual bool load_kernels(const uint kernel_features) override;
void reserve_local_memory(const uint kernel_features);

View File

@@ -44,6 +44,45 @@ device_memory::device_memory(Device *device, const char *name, MemoryType type)
{
}
device_memory::device_memory(device_memory &&other) noexcept
: data_type(other.data_type),
data_elements(other.data_elements),
data_size(other.data_size),
device_size(other.device_size),
data_width(other.data_width),
data_height(other.data_height),
data_depth(other.data_depth),
type(other.type),
name(other.name),
device(other.device),
device_pointer(other.device_pointer),
host_pointer(other.host_pointer),
shared_pointer(other.shared_pointer),
shared_counter(other.shared_counter),
original_device_ptr(other.original_device_ptr),
original_device_size(other.original_device_size),
original_device(other.original_device),
need_realloc_(other.need_realloc_),
modified(other.modified)
{
other.data_elements = 0;
other.data_size = 0;
other.device_size = 0;
other.data_width = 0;
other.data_height = 0;
other.data_depth = 0;
other.device = 0;
other.device_pointer = 0;
other.host_pointer = 0;
other.shared_pointer = 0;
other.shared_counter = 0;
other.original_device_ptr = 0;
other.original_device_size = 0;
other.original_device = 0;
other.need_realloc_ = false;
other.modified = false;
}
device_memory::~device_memory()
{
assert(shared_pointer == 0);

View File

@@ -281,16 +281,11 @@ class device_memory {
/* Only create through subclasses. */
device_memory(Device *device, const char *name, MemoryType type);
device_memory(device_memory &&other) noexcept;
/* No copying and allowed.
*
* This is because device implementation might need to register device memory in an allocation
* map of some sort and use pointer as a key to identify blocks. Moving data from one place to
* another bypassing device allocation routines will make those maps hard to maintain. */
/* No copying allowed. */
device_memory(const device_memory &) = delete;
device_memory(device_memory &&other) noexcept = delete;
device_memory &operator=(const device_memory &) = delete;
device_memory &operator=(device_memory &&) = delete;
/* Host allocation on the device. All host_pointer memory should be
* allocated with these functions, for devices that support using

View File

@@ -44,14 +44,14 @@
CCL_NAMESPACE_BEGIN
OptiXDevice::Denoiser::Denoiser(OptiXDevice *device)
: device(device), queue(device), state(device, "__denoiser_state", true)
: device(device), queue(device), state(device, "__denoiser_state")
{
}
OptiXDevice::OptiXDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
: CUDADevice(info, stats, profiler),
sbt_data(this, "__sbt", MEM_READ_ONLY),
launch_params(this, "__params", false),
launch_params(this, "__params"),
denoiser_(this)
{
/* Make the CUDA context current. */
@@ -507,7 +507,7 @@ class OptiXDevice::DenoiseContext {
: denoise_params(task.params),
render_buffers(task.render_buffers),
buffer_params(task.buffer_params),
guiding_buffer(device, "denoiser guiding passes buffer", true),
guiding_buffer(device, "denoiser guiding passes buffer"),
num_samples(task.num_samples)
{
num_input_passes = 1;
@@ -522,9 +522,9 @@ class OptiXDevice::DenoiseContext {
}
}
use_guiding_passes = (num_input_passes - 1) > 0;
const int num_guiding_passes = num_input_passes - 1;
if (use_guiding_passes) {
if (num_guiding_passes) {
if (task.allow_inplace_modification) {
guiding_params.device_pointer = render_buffers->buffer.device_pointer;
@@ -577,7 +577,6 @@ class OptiXDevice::DenoiseContext {
/* Number of input passes. Including the color and extra auxiliary passes. */
int num_input_passes = 0;
bool use_guiding_passes = false;
bool use_pass_albedo = false;
bool use_pass_normal = false;
@@ -709,7 +708,7 @@ void OptiXDevice::denoise_pass(DenoiseContext &context, PassType pass_type)
return;
}
}
else if (context.use_guiding_passes && !context.albedo_replaced_with_fake) {
else if (!context.albedo_replaced_with_fake) {
context.albedo_replaced_with_fake = true;
if (!denoise_filter_guiding_set_fake_albedo(context)) {
LOG(ERROR) << "Error replacing real albedo with the fake one.";
@@ -887,7 +886,8 @@ bool OptiXDevice::denoise_configure_if_needed(DenoiseContext &context)
denoiser_.scratch_offset = sizes.stateSizeInBytes;
/* Allocate denoiser state if tile size has changed since last setup. */
denoiser_.state.alloc_to_device(denoiser_.scratch_offset + denoiser_.scratch_size);
denoiser_.state.alloc_to_device(denoiser_.scratch_offset + denoiser_.scratch_size +
sizeof(float));
/* Initialize denoiser state for the current tile size. */
const OptixResult result = optixDenoiserSetup(
@@ -971,6 +971,16 @@ bool OptiXDevice::denoise_run(DenoiseContext &context, const DenoisePass &pass)
/* Finally run denoising. */
OptixDenoiserParams params = {}; /* All parameters are disabled/zero. */
params.hdrIntensity = denoiser_.state.device_pointer + denoiser_.scratch_offset +
denoiser_.scratch_size;
optix_assert(
optixDenoiserComputeIntensity(denoiser_.optix_denoiser,
denoiser_.queue.stream(),
&color_layer,
params.hdrIntensity,
denoiser_.state.device_pointer + denoiser_.scratch_offset,
denoiser_.scratch_size));
OptixDenoiserLayer image_layers = {};
image_layers.input = color_layer;
@@ -1001,13 +1011,6 @@ bool OptiXDevice::build_optix_bvh(BVHOptiX *bvh,
const OptixBuildInput &build_input,
uint16_t num_motion_steps)
{
/* Allocate and build acceleration structures only one at a time, to prevent parallel builds
* from running out of memory (since both original and compacted acceleration structure memory
* may be allocated at the same time for the duration of this function). The builds would
* otherwise happen on the same CUDA stream anyway. */
static thread_mutex mutex;
thread_scoped_lock lock(mutex);
const CUDAContextScope scope(this);
const bool use_fast_trace_bvh = (bvh->params.bvh_type == BVH_TYPE_STATIC);
@@ -1033,15 +1036,14 @@ bool OptiXDevice::build_optix_bvh(BVHOptiX *bvh,
optix_assert(optixAccelComputeMemoryUsage(context, &options, &build_input, 1, &sizes));
/* Allocate required output buffers. */
device_only_memory<char> temp_mem(this, "optix temp as build mem", true);
device_only_memory<char> temp_mem(this, "optix temp as build mem");
temp_mem.alloc_to_device(align_up(sizes.tempSizeInBytes, 8) + 8);
if (!temp_mem.device_pointer) {
/* Make sure temporary memory allocation succeeded. */
return false;
}
/* Acceleration structure memory has to be allocated on the device (not allowed on the host). */
device_only_memory<char> &out_data = *bvh->as_data;
device_only_memory<char> &out_data = bvh->as_data;
if (operation == OPTIX_BUILD_OPERATION_BUILD) {
assert(out_data.device == this);
out_data.alloc_to_device(sizes.outputSizeInBytes);
@@ -1089,13 +1091,12 @@ bool OptiXDevice::build_optix_bvh(BVHOptiX *bvh,
/* There is no point compacting if the size does not change. */
if (compacted_size < sizes.outputSizeInBytes) {
device_only_memory<char> compacted_data(this, "optix compacted as", false);
device_only_memory<char> compacted_data(this, "optix compacted as");
compacted_data.alloc_to_device(compacted_size);
if (!compacted_data.device_pointer) {
if (!compacted_data.device_pointer)
/* Do not compact if memory allocation for compacted acceleration structure fails.
* Can just use the uncompacted one then, so succeed here regardless. */
return !have_error();
}
optix_assert(optixAccelCompact(
context, NULL, out_handle, compacted_data.device_pointer, compacted_size, &out_handle));
@@ -1106,8 +1107,6 @@ bool OptiXDevice::build_optix_bvh(BVHOptiX *bvh,
std::swap(out_data.device_size, compacted_data.device_size);
std::swap(out_data.device_pointer, compacted_data.device_pointer);
/* Original acceleration structure memory is freed when 'compacted_data' goes out of scope.
*/
}
}
@@ -1135,7 +1134,7 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
operation = OPTIX_BUILD_OPERATION_UPDATE;
}
else {
bvh_optix->as_data->free();
bvh_optix->as_data.free();
bvh_optix->traversable_handle = 0;
}
@@ -1196,7 +1195,7 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
const float4 pw = make_float4(
curve_radius[ka], curve_radius[k0], curve_radius[k1], curve_radius[kb]);
/* Convert Catmull-Rom data to B-spline. */
/* Convert Catmull-Rom data to Bezier spline. */
static const float4 cr2bsp0 = make_float4(+7, -4, +5, -2) / 6.f;
static const float4 cr2bsp1 = make_float4(-2, 11, -4, +1) / 6.f;
static const float4 cr2bsp2 = make_float4(+1, -4, 11, -2) / 6.f;
@@ -1356,9 +1355,9 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
unsigned int num_instances = 0;
unsigned int max_num_instances = 0xFFFFFFFF;
bvh_optix->as_data->free();
bvh_optix->as_data.free();
bvh_optix->traversable_handle = 0;
bvh_optix->motion_transform_data->free();
bvh_optix->motion_transform_data.free();
optixDeviceContextGetProperty(context,
OPTIX_DEVICE_PROPERTY_LIMIT_MAX_INSTANCE_ID,
@@ -1391,8 +1390,8 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
}
}
assert(bvh_optix->motion_transform_data->device == this);
bvh_optix->motion_transform_data->alloc_to_device(total_motion_transform_size);
assert(bvh_optix->motion_transform_data.device == this);
bvh_optix->motion_transform_data.alloc_to_device(total_motion_transform_size);
}
for (Object *ob : bvh->objects) {
@@ -1453,7 +1452,7 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
motion_transform_offset = align_up(motion_transform_offset,
OPTIX_TRANSFORM_BYTE_ALIGNMENT);
CUdeviceptr motion_transform_gpu = bvh_optix->motion_transform_data->device_pointer +
CUdeviceptr motion_transform_gpu = bvh_optix->motion_transform_data.device_pointer +
motion_transform_offset;
motion_transform_offset += motion_transform_size;

View File

@@ -23,7 +23,6 @@
# include "device/optix/queue.h"
# include "device/optix/util.h"
# include "kernel/types.h"
# include "util/unique_ptr.h"
CCL_NAMESPACE_BEGIN
@@ -77,7 +76,7 @@ class OptiXDevice : public CUDADevice {
device_only_memory<KernelParamsOptiX> launch_params;
OptixTraversableHandle tlas_handle = 0;
vector<unique_ptr<device_only_memory<char>>> delayed_free_bvh_memory;
vector<device_only_memory<char>> delayed_free_bvh_memory;
thread_mutex delayed_free_bvh_mutex;
class Denoiser {

View File

@@ -73,8 +73,7 @@ bool OptiXDeviceQueue::enqueue(DeviceKernel kernel, const int work_size, void *a
sizeof(device_ptr),
cuda_stream_));
if (kernel == DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST ||
kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE) {
if (kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE) {
cuda_device_assert(
cuda_device_,
cuMemcpyHtoDAsync(launch_params_ptr + offsetof(KernelParamsOptiX, render_buffer),

View File

@@ -33,10 +33,7 @@ unique_ptr<Denoiser> Denoiser::create(Device *path_trace_device, const DenoisePa
return make_unique<OptiXDenoiser>(path_trace_device, params);
}
/* Always fallback to OIDN. */
DenoiseParams oidn_params = params;
oidn_params.type = DENOISER_OPENIMAGEDENOISE;
return make_unique<OIDNDenoiser>(path_trace_device, oidn_params);
return make_unique<OIDNDenoiser>(path_trace_device, params);
}
Denoiser::Denoiser(Device *path_trace_device, const DenoiseParams &params)

View File

@@ -47,6 +47,9 @@ static bool oidn_progress_monitor_function(void *user_ptr, double /*n*/)
OIDNDenoiser *oidn_denoiser = reinterpret_cast<OIDNDenoiser *>(user_ptr);
return !oidn_denoiser->is_cancelled();
}
#endif
#ifdef WITH_OPENIMAGEDENOISE
class OIDNPass {
public:
@@ -544,6 +547,7 @@ class OIDNDenoiseContext {
* the fake values and denoising of passes which do need albedo can no longer happen. */
bool albedo_replaced_with_fake_ = false;
};
#endif
static unique_ptr<DeviceQueue> create_device_queue(const RenderBuffers *render_buffers)
{
@@ -578,20 +582,18 @@ static void copy_render_buffers_to_device(unique_ptr<DeviceQueue> &queue,
}
}
#endif
bool OIDNDenoiser::denoise_buffer(const BufferParams &buffer_params,
RenderBuffers *render_buffers,
const int num_samples,
bool allow_inplace_modification)
{
#ifdef WITH_OPENIMAGEDENOISE
thread_scoped_lock lock(mutex_);
/* Make sure the host-side data is available for denoising. */
unique_ptr<DeviceQueue> queue = create_device_queue(render_buffers);
copy_render_buffers_from_device(queue, render_buffers);
#ifdef WITH_OPENIMAGEDENOISE
OIDNDenoiseContext context(
this, params_, buffer_params, render_buffers, num_samples, allow_inplace_modification);
@@ -618,11 +620,6 @@ bool OIDNDenoiser::denoise_buffer(const BufferParams &buffer_params,
* copies data from the device it doesn't overwrite the denoiser buffers. */
copy_render_buffers_to_device(queue, render_buffers);
}
#else
(void)buffer_params;
(void)render_buffers;
(void)num_samples;
(void)allow_inplace_modification;
#endif
/* This code is not supposed to run when compiled without OIDN support, so can assume if we made

View File

@@ -296,13 +296,13 @@ static BufferParams scale_buffer_params(const BufferParams &params, int resoluti
scaled_params.window_x = params.window_x / resolution_divider;
scaled_params.window_y = params.window_y / resolution_divider;
scaled_params.window_width = max(1, params.window_width / resolution_divider);
scaled_params.window_height = max(1, params.window_height / resolution_divider);
scaled_params.window_width = params.window_width / resolution_divider;
scaled_params.window_height = params.window_height / resolution_divider;
scaled_params.full_x = params.full_x / resolution_divider;
scaled_params.full_y = params.full_y / resolution_divider;
scaled_params.full_width = max(1, params.full_width / resolution_divider);
scaled_params.full_height = max(1, params.full_height / resolution_divider);
scaled_params.full_width = params.full_width / resolution_divider;
scaled_params.full_height = params.full_height / resolution_divider;
scaled_params.update_offset_stride();
@@ -479,11 +479,7 @@ void PathTrace::set_denoiser_params(const DenoiseParams &params)
}
denoiser_ = Denoiser::create(device_, params);
/* Only take into account the "immediate" cancel to have interactive rendering responding to
* navigation as quickly as possible, but allow to run denoiser after user hit Esc button while
* doing offline rendering. */
denoiser_->is_cancelled_cb = [this]() { return render_cancel_.is_requested; };
denoiser_->is_cancelled_cb = [this]() { return is_cancel_requested(); };
}
void PathTrace::set_adaptive_sampling(const AdaptiveSampling &adaptive_sampling)
@@ -851,8 +847,7 @@ void PathTrace::progress_update_if_needed(const RenderWork &render_work)
{
if (progress_ != nullptr) {
const int2 tile_size = get_render_tile_size();
const uint64_t num_samples_added = uint64_t(tile_size.x) * tile_size.y *
render_work.path_trace.num_samples;
const int num_samples_added = tile_size.x * tile_size.y * render_work.path_trace.num_samples;
const int current_sample = render_work.path_trace.start_sample +
render_work.path_trace.num_samples;
progress_->add_samples(num_samples_added, current_sample);

View File

@@ -77,10 +77,8 @@ void PathTraceWorkCPU::render_samples(RenderStatistics &statistics,
const int64_t image_height = effective_buffer_params_.height;
const int64_t total_pixels_num = image_width * image_height;
if (device_->profiler.active()) {
for (CPUKernelThreadGlobals &kernel_globals : kernel_thread_globals_) {
kernel_globals.start_profiling();
}
for (CPUKernelThreadGlobals &kernel_globals : kernel_thread_globals_) {
kernel_globals.start_profiling();
}
tbb::task_arena local_arena = local_tbb_arena_create(device_);
@@ -108,10 +106,9 @@ void PathTraceWorkCPU::render_samples(RenderStatistics &statistics,
render_samples_full_pipeline(kernel_globals, work_tile, samples_num);
});
});
if (device_->profiler.active()) {
for (CPUKernelThreadGlobals &kernel_globals : kernel_thread_globals_) {
kernel_globals.stop_profiling();
}
for (CPUKernelThreadGlobals &kernel_globals : kernel_thread_globals_) {
kernel_globals.stop_profiling();
}
statistics.occupancy = 1.0f;

View File

@@ -257,8 +257,7 @@ void PathTraceWorkGPU::render_samples(RenderStatistics &statistics,
* become busy after adding new tiles). This is especially important for the shadow catcher which
* schedules work in halves of available number of paths. */
work_tile_scheduler_.set_max_num_path_states(max_num_paths_ / 8);
work_tile_scheduler_.set_accelerated_rt((device_->get_bvh_layout_mask() & BVH_LAYOUT_OPTIX) !=
0);
work_tile_scheduler_.reset(effective_buffer_params_,
start_sample,
samples_num,
@@ -438,15 +437,7 @@ void PathTraceWorkGPU::enqueue_path_iteration(DeviceKernel kernel, const int num
DCHECK_LE(work_size, max_num_paths_);
switch (kernel) {
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST: {
/* Closest ray intersection kernels with integrator state and render buffer. */
void *d_render_buffer = (void *)buffers_->buffer.device_pointer;
void *args[] = {&d_path_index, &d_render_buffer, const_cast<int *>(&work_size)};
queue_->enqueue(kernel, work_size, args);
break;
}
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST:
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW:
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE:
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK: {

View File

@@ -827,26 +827,6 @@ int RenderScheduler::get_num_samples_to_path_trace() const
num_samples_to_occupy = lround(state_.occupancy_num_samples * 0.7f / state_.occupancy);
}
/* When time limit is used clamp the calculated number of samples to keep occupancy.
* This is because time limit causes the last render iteration to happen with less number of
* samples, which conflicts with the occupancy (lower number of samples causes lower
* occupancy, also the calculation is based on number of previously rendered samples).
*
* When time limit is not used the number of samples per render iteration is either increasing
* or stays the same, so there is no need to clamp number of samples calculated for occupancy.
*/
if (time_limit_ && state_.start_render_time) {
const double remaining_render_time = max(
0.0, time_limit_ - (time_dt() - state_.start_render_time));
const double time_per_sample_average = path_trace_time_.get_average();
const double predicted_render_time = num_samples_to_occupy * time_per_sample_average;
if (predicted_render_time > remaining_render_time) {
num_samples_to_occupy = lround(num_samples_to_occupy *
(remaining_render_time / predicted_render_time));
}
}
num_samples_to_render = max(num_samples_to_render,
min(num_samples_to_occupy, max_num_samples_to_render));
}

View File

@@ -46,8 +46,7 @@ ccl_device_inline uint round_up_to_power_of_two(uint x)
return next_power_of_two(x);
}
TileSize tile_calculate_best_size(const bool accel_rt,
const int2 &image_size,
TileSize tile_calculate_best_size(const int2 &image_size,
const int num_samples,
const int max_num_path_states,
const float scrambling_distance)
@@ -74,7 +73,7 @@ TileSize tile_calculate_best_size(const bool accel_rt,
TileSize tile_size;
const int num_path_states_per_sample = max_num_path_states / num_samples;
if (scrambling_distance < 0.9f && accel_rt) {
if (scrambling_distance < 0.9f) {
/* Prefer large tiles for scrambling distance, bounded by max num path states. */
tile_size.width = min(image_size.x, max_num_path_states);
tile_size.height = min(image_size.y, max(max_num_path_states / tile_size.width, 1));

View File

@@ -49,8 +49,7 @@ std::ostream &operator<<(std::ostream &os, const TileSize &tile_size);
* of active path states.
* Will attempt to provide best guess to keep path tracing threads of a device as localized as
* possible, and have as many threads active for every tile as possible. */
TileSize tile_calculate_best_size(const bool accel_rt,
const int2 &image_size,
TileSize tile_calculate_best_size(const int2 &image_size,
const int num_samples,
const int max_num_path_states,
const float scrambling_distance);

View File

@@ -28,11 +28,6 @@ WorkTileScheduler::WorkTileScheduler()
{
}
void WorkTileScheduler::set_accelerated_rt(bool accelerated_rt)
{
accelerated_rt_ = accelerated_rt;
}
void WorkTileScheduler::set_max_num_path_states(int max_num_path_states)
{
max_num_path_states_ = max_num_path_states;
@@ -64,7 +59,7 @@ void WorkTileScheduler::reset(const BufferParams &buffer_params,
void WorkTileScheduler::reset_scheduler_state()
{
tile_size_ = tile_calculate_best_size(
accelerated_rt_, image_size_px_, samples_num_, max_num_path_states_, scrambling_distance_);
image_size_px_, samples_num_, max_num_path_states_, scrambling_distance_);
VLOG(3) << "Will schedule tiles of size " << tile_size_;

View File

@@ -31,9 +31,6 @@ class WorkTileScheduler {
public:
WorkTileScheduler();
/* To indicate if there is accelerated RT support. */
void set_accelerated_rt(bool state);
/* MAximum path states which are allowed to be used by a single scheduled work tile.
*
* Affects the scheduled work size: the work size will be as big as possible, but will not exceed
@@ -57,9 +54,6 @@ class WorkTileScheduler {
protected:
void reset_scheduler_state();
/* Used to indicate if there is accelerated ray tracing. */
bool accelerated_rt_ = false;
/* Maximum allowed path states to be used.
*
* TODO(sergey): Naming can be improved. The fact that this is a limiting factor based on the

View File

@@ -39,6 +39,10 @@ set(SRC_KERNEL_DEVICE_HIP
device/hip/kernel.cpp
)
set(SRC_KERNEL_DEVICE_METAL
device/metal/kernel.metal
)
set(SRC_KERNEL_DEVICE_OPTIX
device/optix/kernel.cu
device/optix/kernel_shader_raytrace.cu
@@ -79,6 +83,13 @@ set(SRC_KERNEL_DEVICE_OPTIX_HEADERS
device/optix/globals.h
)
set(SRC_KERNEL_DEVICE_METAL_HEADERS
device/metal/compat.h
device/metal/context_begin.h
device/metal/context_end.h
device/metal/globals.h
)
set(SRC_KERNEL_CLOSURE_HEADERS
closure/alloc.h
closure/bsdf.h
@@ -368,6 +379,7 @@ if(WITH_CYCLES_CUDA_BINARIES)
${SRC_KERNEL_HEADERS}
${SRC_KERNEL_DEVICE_GPU_HEADERS}
${SRC_KERNEL_DEVICE_CUDA_HEADERS}
${SRC_KERNEL_DEVICE_METAL_HEADERS}
${SRC_UTIL_HEADERS}
)
set(cuda_cubins)
@@ -565,12 +577,6 @@ if(WITH_CYCLES_HIP_BINARIES AND WITH_CYCLES_DEVICE_HIP)
set(name ${name}_experimental)
endif()
if(WITH_NANOVDB)
set(hip_flags ${hip_flags}
-D WITH_NANOVDB
-I "${NANOVDB_INCLUDE_DIR}")
endif()
if(WITH_CYCLES_DEBUG)
set(hip_flags ${hip_flags} -D __KERNEL_DEBUG__)
endif()
@@ -729,12 +735,14 @@ cycles_add_library(cycles_kernel "${LIB}"
${SRC_KERNEL_DEVICE_CUDA}
${SRC_KERNEL_DEVICE_HIP}
${SRC_KERNEL_DEVICE_OPTIX}
${SRC_KERNEL_DEVICE_METAL}
${SRC_KERNEL_HEADERS}
${SRC_KERNEL_DEVICE_CPU_HEADERS}
${SRC_KERNEL_DEVICE_GPU_HEADERS}
${SRC_KERNEL_DEVICE_CUDA_HEADERS}
${SRC_KERNEL_DEVICE_HIP_HEADERS}
${SRC_KERNEL_DEVICE_OPTIX_HEADERS}
${SRC_KERNEL_DEVICE_METAL_HEADERS}
)
source_group("bake" FILES ${SRC_KERNEL_BAKE_HEADERS})
@@ -746,6 +754,7 @@ source_group("device\\cuda" FILES ${SRC_KERNEL_DEVICE_CUDA} ${SRC_KERNEL_DEVICE_
source_group("device\\gpu" FILES ${SRC_KERNEL_DEVICE_GPU_HEADERS})
source_group("device\\hip" FILES ${SRC_KERNEL_DEVICE_HIP} ${SRC_KERNEL_DEVICE_HIP_HEADERS})
source_group("device\\optix" FILES ${SRC_KERNEL_DEVICE_OPTIX} ${SRC_KERNEL_DEVICE_OPTIX_HEADERS})
source_group("device\\metal" FILES ${SRC_KERNEL_DEVICE_METAL} ${SRC_KERNEL_DEVICE_METAL_HEADERS})
source_group("film" FILES ${SRC_KERNEL_FILM_HEADERS})
source_group("geom" FILES ${SRC_KERNEL_GEOM_HEADERS})
source_group("integrator" FILES ${SRC_KERNEL_INTEGRATOR_HEADERS})
@@ -778,6 +787,8 @@ delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_DEVICE_HIP}" ${CYCLES_
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_DEVICE_HIP_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/device/hip)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_DEVICE_OPTIX}" ${CYCLES_INSTALL_PATH}/source/kernel/device/optix)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_DEVICE_OPTIX_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/device/optix)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_DEVICE_METAL}" ${CYCLES_INSTALL_PATH}/source/kernel/device/metal)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_DEVICE_METAL_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/device/metal)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_FILM_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/film)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_GEOM_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/geom)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_INTEGRATOR_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/integrator)

View File

@@ -438,7 +438,7 @@ ccl_device_inline int bsdf_sample(KernelGlobals kg,
if (label & LABEL_TRANSMIT) {
float threshold_squared = kernel_data.background.transparent_roughness_squared_threshold;
if (threshold_squared >= 0.0f && !(label & LABEL_DIFFUSE)) {
if (threshold_squared >= 0.0f) {
if (bsdf_get_specular_roughness_squared(sc) <= threshold_squared) {
label |= LABEL_TRANSMIT_TRANSPARENT;
}

View File

@@ -37,7 +37,7 @@
KERNEL_INTEGRATOR_INIT_FUNCTION(init_from_camera);
KERNEL_INTEGRATOR_INIT_FUNCTION(init_from_bake);
KERNEL_INTEGRATOR_SHADE_FUNCTION(intersect_closest);
KERNEL_INTEGRATOR_FUNCTION(intersect_closest);
KERNEL_INTEGRATOR_FUNCTION(intersect_shadow);
KERNEL_INTEGRATOR_FUNCTION(intersect_subsurface);
KERNEL_INTEGRATOR_FUNCTION(intersect_volume_stack);

View File

@@ -112,7 +112,7 @@ CCL_NAMESPACE_BEGIN
DEFINE_INTEGRATOR_INIT_KERNEL(init_from_camera)
DEFINE_INTEGRATOR_INIT_KERNEL(init_from_bake)
DEFINE_INTEGRATOR_SHADE_KERNEL(intersect_closest)
DEFINE_INTEGRATOR_KERNEL(intersect_closest)
DEFINE_INTEGRATOR_KERNEL(intersect_subsurface)
DEFINE_INTEGRATOR_KERNEL(intersect_volume_stack)
DEFINE_INTEGRATOR_SHADE_KERNEL(shade_background)

View File

@@ -75,6 +75,7 @@ typedef unsigned long long uint64_t;
#define ccl_gpu_block_idx_x (blockIdx.x)
#define ccl_gpu_grid_dim_x (gridDim.x)
#define ccl_gpu_warp_size (warpSize)
#define ccl_gpu_thread_mask(thread_warp) uint(0xFFFFFFFF >> (ccl_gpu_warp_size - thread_warp))
#define ccl_gpu_global_id_x() (ccl_gpu_block_idx_x * ccl_gpu_block_dim_x + ccl_gpu_thread_idx_x)
#define ccl_gpu_global_size_x() (ccl_gpu_grid_dim_x * ccl_gpu_block_dim_x)

View File

@@ -93,11 +93,35 @@
/* Compute number of threads per block and minimum blocks per multiprocessor
* given the maximum number of registers per thread. */
#define ccl_gpu_kernel(block_num_threads, thread_num_registers) \
#define ccl_gpu_kernel_threads(block_num_threads) \
extern "C" __global__ void __launch_bounds__(block_num_threads)
#define ccl_gpu_kernel_threads_registers(block_num_threads, thread_num_registers) \
extern "C" __global__ void __launch_bounds__(block_num_threads, \
GPU_MULTIPRESSOR_MAX_REGISTERS / \
(block_num_threads * thread_num_registers))
/* allow ccl_gpu_kernel to accept 1 or 2 parameters */
#define SELECT_MACRO(_1, _2, NAME, ...) NAME
#define ccl_gpu_kernel(...) \
SELECT_MACRO(__VA_ARGS__, ccl_gpu_kernel_threads_registers, ccl_gpu_kernel_threads)(__VA_ARGS__)
#define ccl_gpu_kernel_signature(name, ...) kernel_gpu_##name(__VA_ARGS__)
#define ccl_gpu_kernel_call(x) x
/* define a function object where "func" is the lambda body, and additional parameters are used to
* specify captured state */
#define ccl_gpu_kernel_lambda(func, ...) \
struct KernelLambda { \
__VA_ARGS__; \
__device__ int operator()(const int state) \
{ \
return (func); \
} \
} ccl_gpu_kernel_lambda_pass; \
ccl_gpu_kernel_lambda_pass
/* sanity checks */
#if GPU_KERNEL_BLOCK_NUM_THREADS > GPU_BLOCK_MAX_THREADS

View File

@@ -65,7 +65,9 @@ ccl_device float cubic_h1(float a)
/* Fast bicubic texture lookup using 4 bilinear lookups, adapted from CUDA samples. */
template<typename T>
ccl_device_noinline T kernel_tex_image_interp_bicubic(const TextureInfo &info, float x, float y)
ccl_device_noinline T kernel_tex_image_interp_bicubic(ccl_global const TextureInfo &info,
float x,
float y)
{
ccl_gpu_tex_object tex = (ccl_gpu_tex_object)info.data;
@@ -94,7 +96,7 @@ ccl_device_noinline T kernel_tex_image_interp_bicubic(const TextureInfo &info, f
/* Fast tricubic texture lookup using 8 trilinear lookups. */
template<typename T>
ccl_device_noinline T
kernel_tex_image_interp_tricubic(const TextureInfo &info, float x, float y, float z)
kernel_tex_image_interp_tricubic(ccl_global const TextureInfo &info, float x, float y, float z)
{
ccl_gpu_tex_object tex = (ccl_gpu_tex_object)info.data;
@@ -169,7 +171,7 @@ ccl_device T kernel_tex_image_interp_tricubic_nanovdb(S &s, float x, float y, fl
template<typename T>
ccl_device_noinline T kernel_tex_image_interp_nanovdb(
const TextureInfo &info, float x, float y, float z, uint interpolation)
ccl_global const TextureInfo &info, float x, float y, float z, uint interpolation)
{
using namespace nanovdb;
@@ -191,7 +193,7 @@ ccl_device_noinline T kernel_tex_image_interp_nanovdb(
ccl_device float4 kernel_tex_image_interp(KernelGlobals kg, int id, float x, float y)
{
const TextureInfo &info = kernel_tex_fetch(__texture_info, id);
ccl_global const TextureInfo &info = kernel_tex_fetch(__texture_info, id);
/* float4, byte4, ushort4 and half4 */
const int texture_type = info.data_type;
@@ -226,7 +228,7 @@ ccl_device float4 kernel_tex_image_interp_3d(KernelGlobals kg,
float3 P,
InterpolationType interp)
{
const TextureInfo &info = kernel_tex_fetch(__texture_info, id);
ccl_global const TextureInfo &info = kernel_tex_fetch(__texture_info, id);
if (info.use_transform_3d) {
P = transform_point(&info.transform_3d, P);

File diff suppressed because it is too large Load Diff

View File

@@ -31,10 +31,43 @@ CCL_NAMESPACE_BEGIN
# define GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE 512
#endif
#ifdef __KERNEL_METAL__
struct ActiveIndexContext {
ActiveIndexContext(int _thread_index,
int _global_index,
int _threadgroup_size,
int _simdgroup_size,
int _simd_lane_index,
int _simd_group_index,
int _num_simd_groups,
threadgroup int *_simdgroup_offset)
: thread_index(_thread_index),
global_index(_global_index),
blocksize(_threadgroup_size),
ccl_gpu_warp_size(_simdgroup_size),
thread_warp(_simd_lane_index),
warp_index(_simd_group_index),
num_warps(_num_simd_groups),
warp_offset(_simdgroup_offset)
{
}
const int thread_index, global_index, blocksize, ccl_gpu_warp_size, thread_warp, warp_index,
num_warps;
threadgroup int *warp_offset;
template<uint blocksizeDummy, typename IsActiveOp>
void active_index_array(const uint num_states,
ccl_global int *indices,
ccl_global int *num_indices,
IsActiveOp is_active_op)
{
const uint state_index = global_index;
#else
template<uint blocksize, typename IsActiveOp>
__device__ void gpu_parallel_active_index_array(const uint num_states,
int *indices,
int *num_indices,
ccl_global int *indices,
ccl_global int *num_indices,
IsActiveOp is_active_op)
{
extern ccl_gpu_shared int warp_offset[];
@@ -45,43 +78,62 @@ __device__ void gpu_parallel_active_index_array(const uint num_states,
const uint warp_index = thread_index / ccl_gpu_warp_size;
const uint num_warps = blocksize / ccl_gpu_warp_size;
/* Test if state corresponding to this thread is active. */
const uint state_index = ccl_gpu_block_idx_x * blocksize + thread_index;
const uint is_active = (state_index < num_states) ? is_active_op(state_index) : 0;
#endif
/* For each thread within a warp compute how many other active states precede it. */
const uint thread_mask = 0xFFFFFFFF >> (ccl_gpu_warp_size - thread_warp);
const uint thread_offset = ccl_gpu_popc(ccl_gpu_ballot(is_active) & thread_mask);
/* Test if state corresponding to this thread is active. */
const uint is_active = (state_index < num_states) ? is_active_op(state_index) : 0;
/* Last thread in warp stores number of active states for each warp. */
if (thread_warp == ccl_gpu_warp_size - 1) {
warp_offset[warp_index] = thread_offset + is_active;
}
/* For each thread within a warp compute how many other active states precede it. */
const uint thread_offset = ccl_gpu_popc(ccl_gpu_ballot(is_active) &
ccl_gpu_thread_mask(thread_warp));
ccl_gpu_syncthreads();
/* Last thread in block converts per-warp sizes to offsets, increments global size of
* index array and gets offset to write to. */
if (thread_index == blocksize - 1) {
/* TODO: parallelize this. */
int offset = 0;
for (int i = 0; i < num_warps; i++) {
int num_active = warp_offset[i];
warp_offset[i] = offset;
offset += num_active;
/* Last thread in warp stores number of active states for each warp. */
if (thread_warp == ccl_gpu_warp_size - 1) {
warp_offset[warp_index] = thread_offset + is_active;
}
const uint block_num_active = warp_offset[warp_index] + thread_offset + is_active;
warp_offset[num_warps] = atomic_fetch_and_add_uint32(num_indices, block_num_active);
ccl_gpu_syncthreads();
/* Last thread in block converts per-warp sizes to offsets, increments global size of
* index array and gets offset to write to. */
if (thread_index == blocksize - 1) {
/* TODO: parallelize this. */
int offset = 0;
for (int i = 0; i < num_warps; i++) {
int num_active = warp_offset[i];
warp_offset[i] = offset;
offset += num_active;
}
const uint block_num_active = warp_offset[warp_index] + thread_offset + is_active;
warp_offset[num_warps] = atomic_fetch_and_add_uint32(num_indices, block_num_active);
}
ccl_gpu_syncthreads();
/* Write to index array. */
if (is_active) {
const uint block_offset = warp_offset[num_warps];
indices[block_offset + warp_offset[warp_index] + thread_offset] = state_index;
}
}
ccl_gpu_syncthreads();
#ifdef __KERNEL_METAL__
}; /* end class ActiveIndexContext */
/* Write to index array. */
if (is_active) {
const uint block_offset = warp_offset[num_warps];
indices[block_offset + warp_offset[warp_index] + thread_offset] = state_index;
}
}
/* inject the required thread params into a struct, and redirect to its templated member function
*/
# define gpu_parallel_active_index_array \
ActiveIndexContext(metal_local_id, \
metal_global_id, \
metal_local_size, \
simdgroup_size, \
simd_lane_index, \
simd_group_index, \
num_simd_groups, \
simdgroup_offset) \
.active_index_array
#endif
CCL_NAMESPACE_END

View File

@@ -33,10 +33,12 @@ CCL_NAMESPACE_BEGIN
# define GPU_PARALLEL_PREFIX_SUM_DEFAULT_BLOCK_SIZE 512
#endif
template<uint blocksize>
__device__ void gpu_parallel_prefix_sum(int *counter, int *prefix_sum, const int num_values)
__device__ void gpu_parallel_prefix_sum(const int global_id,
ccl_global int *counter,
ccl_global int *prefix_sum,
const int num_values)
{
if (!(ccl_gpu_block_idx_x == 0 && ccl_gpu_thread_idx_x == 0)) {
if (global_id != 0) {
return;
}

View File

@@ -33,16 +33,16 @@ CCL_NAMESPACE_BEGIN
#endif
#define GPU_PARALLEL_SORTED_INDEX_INACTIVE_KEY (~0)
template<uint blocksize, typename GetKeyOp>
__device__ void gpu_parallel_sorted_index_array(const uint num_states,
template<typename GetKeyOp>
__device__ void gpu_parallel_sorted_index_array(const uint state_index,
const uint num_states,
const int num_states_limit,
int *indices,
int *num_indices,
int *key_counter,
int *key_prefix_sum,
ccl_global int *indices,
ccl_global int *num_indices,
ccl_global int *key_counter,
ccl_global int *key_prefix_sum,
GetKeyOp get_key_op)
{
const uint state_index = ccl_gpu_block_idx_x * blocksize + ccl_gpu_thread_idx_x;
const int key = (state_index < num_states) ? get_key_op(state_index) :
GPU_PARALLEL_SORTED_INDEX_INACTIVE_KEY;

View File

@@ -29,20 +29,17 @@ ccl_device_inline void get_work_pixel(ccl_global const KernelWorkTile *tile,
ccl_private uint *y,
ccl_private uint *sample)
{
uint sample_offset, pixel_offset;
if (kernel_data.integrator.scrambling_distance < 0.9f) {
/* Keep threads for the same sample together. */
uint tile_pixels = tile->w * tile->h;
sample_offset = global_work_index / tile_pixels;
pixel_offset = global_work_index - sample_offset * tile_pixels;
}
else {
/* Keeping threads for the same pixel together.
* Appears to improve performance by a few % on CUDA and OptiX. */
sample_offset = global_work_index % tile->num_samples;
pixel_offset = global_work_index / tile->num_samples;
}
#if 0
/* Keep threads for the same sample together. */
uint tile_pixels = tile->w * tile->h;
uint sample_offset = global_work_index / tile_pixels;
uint pixel_offset = global_work_index - sample_offset * tile_pixels;
#else
/* Keeping threads for the same pixel together.
* Appears to improve performance by a few % on CUDA and OptiX. */
uint sample_offset = global_work_index % tile->num_samples;
uint pixel_offset = global_work_index / tile->num_samples;
#endif
uint y_offset = pixel_offset / tile->w;
uint x_offset = pixel_offset - y_offset * tile->w;

View File

@@ -74,6 +74,7 @@ typedef unsigned long long uint64_t;
#define ccl_gpu_block_idx_x (blockIdx.x)
#define ccl_gpu_grid_dim_x (gridDim.x)
#define ccl_gpu_warp_size (warpSize)
#define ccl_gpu_thread_mask(thread_warp) uint(0xFFFFFFFF >> (ccl_gpu_warp_size - thread_warp))
#define ccl_gpu_global_id_x() (ccl_gpu_block_idx_x * ccl_gpu_block_dim_x + ccl_gpu_thread_idx_x)
#define ccl_gpu_global_size_x() (ccl_gpu_grid_dim_x * ccl_gpu_block_dim_x)

View File

@@ -36,11 +36,35 @@
/* Compute number of threads per block and minimum blocks per multiprocessor
* given the maximum number of registers per thread. */
#define ccl_gpu_kernel(block_num_threads, thread_num_registers) \
#define ccl_gpu_kernel_threads(block_num_threads) \
extern "C" __global__ void __launch_bounds__(block_num_threads)
#define ccl_gpu_kernel_threads_registers(block_num_threads, thread_num_registers) \
extern "C" __global__ void __launch_bounds__(block_num_threads, \
GPU_MULTIPRESSOR_MAX_REGISTERS / \
(block_num_threads * thread_num_registers))
/* allow ccl_gpu_kernel to accept 1 or 2 parameters */
#define SELECT_MACRO(_1, _2, NAME, ...) NAME
#define ccl_gpu_kernel(...) \
SELECT_MACRO(__VA_ARGS__, ccl_gpu_kernel_threads_registers, ccl_gpu_kernel_threads)(__VA_ARGS__)
#define ccl_gpu_kernel_signature(name, ...) kernel_gpu_##name(__VA_ARGS__)
#define ccl_gpu_kernel_call(x) x
/* define a function object where "func" is the lambda body, and additional parameters are used to
* specify captured state */
#define ccl_gpu_kernel_lambda(func, ...) \
struct KernelLambda { \
__VA_ARGS__; \
__device__ int operator()(const int state) \
{ \
return (func); \
} \
} ccl_gpu_kernel_lambda_pass; \
ccl_gpu_kernel_lambda_pass
/* sanity checks */
#if GPU_KERNEL_BLOCK_NUM_THREADS > GPU_BLOCK_MAX_THREADS

View File

@@ -58,6 +58,95 @@ using namespace metal;
#define kernel_assert(cond)
#define ccl_gpu_global_id_x() metal_global_id
#define ccl_gpu_warp_size simdgroup_size
#define ccl_gpu_thread_idx_x simd_group_index
#define ccl_gpu_thread_mask(thread_warp) uint64_t((1ull << thread_warp) - 1)
#define ccl_gpu_ballot(predicate) ((uint64_t)((simd_vote::vote_t)simd_ballot(predicate)))
#define ccl_gpu_popc(x) popcount(x)
// clang-format off
/* kernel.h adapters */
#define ccl_gpu_kernel(...)
/* convert a comma-separated list into a semicolon-separated list (so that we can generate a struct based on kernel entrypoint parameters) */
#define FN0()
#define FN1(p1) p1;
#define FN2(p1, p2) p1; p2;
#define FN3(p1, p2, p3) p1; p2; p3;
#define FN4(p1, p2, p3, p4) p1; p2; p3; p4;
#define FN5(p1, p2, p3, p4, p5) p1; p2; p3; p4; p5;
#define FN6(p1, p2, p3, p4, p5, p6) p1; p2; p3; p4; p5; p6;
#define FN7(p1, p2, p3, p4, p5, p6, p7) p1; p2; p3; p4; p5; p6; p7;
#define FN8(p1, p2, p3, p4, p5, p6, p7, p8) p1; p2; p3; p4; p5; p6; p7; p8;
#define FN9(p1, p2, p3, p4, p5, p6, p7, p8, p9) p1; p2; p3; p4; p5; p6; p7; p8; p9;
#define FN10(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10) p1; p2; p3; p4; p5; p6; p7; p8; p9; p10;
#define FN11(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11) p1; p2; p3; p4; p5; p6; p7; p8; p9; p10; p11;
#define FN12(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12) p1; p2; p3; p4; p5; p6; p7; p8; p9; p10; p11; p12;
#define FN13(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13) p1; p2; p3; p4; p5; p6; p7; p8; p9; p10; p11; p12; p13;
#define FN14(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13, p14) p1; p2; p3; p4; p5; p6; p7; p8; p9; p10; p11; p12; p13; p14;
#define FN15(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13, p14, p15) p1; p2; p3; p4; p5; p6; p7; p8; p9; p10; p11; p12; p13; p14; p15;
#define FN16(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13, p14, p15, p16) p1; p2; p3; p4; p5; p6; p7; p8; p9; p10; p11; p12; p13; p14; p15; p16;
#define GET_LAST_ARG(p0, p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13, p14, p15, p16, ...) p16
#define PARAMS_MAKER(...) GET_LAST_ARG(__VA_ARGS__, FN16, FN15, FN14, FN13, FN12, FN11, FN10, FN9, FN8, FN7, FN6, FN5, FN4, FN3, FN2, FN1, FN0)
/* generate a struct containing the entrypoint parameters and a "run" method which can access them implicitly via this-> */
#define ccl_gpu_kernel_signature(name, ...) \
struct kernel_gpu_##name \
{ \
PARAMS_MAKER(__VA_ARGS__)(__VA_ARGS__) \
void run(thread MetalKernelContext& context, \
threadgroup int *simdgroup_offset, \
const uint metal_global_id, \
const ushort metal_local_id, \
const ushort metal_local_size, \
uint simdgroup_size, \
uint simd_lane_index, \
uint simd_group_index, \
uint num_simd_groups) ccl_global const; \
}; \
kernel void kernel_metal_##name(device const kernel_gpu_##name *params_struct, \
constant KernelParamsMetal &ccl_restrict _launch_params_metal, \
constant MetalAncillaries *_metal_ancillaries, \
threadgroup int *simdgroup_offset[[ threadgroup(0) ]], \
const uint metal_global_id [[thread_position_in_grid]], \
const ushort metal_local_id [[thread_position_in_threadgroup]], \
const ushort metal_local_size [[threads_per_threadgroup]], \
uint simdgroup_size [[threads_per_simdgroup]], \
uint simd_lane_index [[thread_index_in_simdgroup]], \
uint simd_group_index [[simdgroup_index_in_threadgroup]], \
uint num_simd_groups [[simdgroups_per_threadgroup]]) { \
MetalKernelContext context(_launch_params_metal, _metal_ancillaries); \
INIT_DEBUG_BUFFER \
params_struct->run(context, simdgroup_offset, metal_global_id, metal_local_id, metal_local_size, simdgroup_size, simd_lane_index, simd_group_index, num_simd_groups); \
} \
void kernel_gpu_##name::run(thread MetalKernelContext& context, \
threadgroup int *simdgroup_offset, \
const uint metal_global_id, \
const ushort metal_local_id, \
const ushort metal_local_size, \
uint simdgroup_size, \
uint simd_lane_index, \
uint simd_group_index, \
uint num_simd_groups) ccl_global const
#define ccl_gpu_kernel_call(x) context.x
/* define a function object where "func" is the lambda body, and additional parameters are used to specify captured state */
#define ccl_gpu_kernel_lambda(func, ...) \
struct KernelLambda \
{ \
KernelLambda(ccl_private MetalKernelContext &_context) : context(_context) {} \
ccl_private MetalKernelContext &context; \
__VA_ARGS__; \
int operator()(const int state) const { return (func); } \
}ccl_gpu_kernel_lambda_pass(context); ccl_gpu_kernel_lambda_pass
// clang-format on
/* make_type definitions with Metal style element initializers */
#ifdef make_float2
# undef make_float2
@@ -124,3 +213,38 @@ using namespace metal;
#define logf(x) trigmode::log(float(x))
#define NULL 0
/* texture bindings and sampler setup */
struct Texture2DParamsMetal {
texture2d<float, access::sample> tex;
};
struct Texture3DParamsMetal {
texture3d<float, access::sample> tex;
};
struct MetalAncillaries {
device Texture2DParamsMetal *textures_2d;
device Texture3DParamsMetal *textures_3d;
};
enum SamplerType {
SamplerFilterNearest_AddressRepeat,
SamplerFilterNearest_AddressClampEdge,
SamplerFilterNearest_AddressClampZero,
SamplerFilterLinear_AddressRepeat,
SamplerFilterLinear_AddressClampEdge,
SamplerFilterLinear_AddressClampZero,
SamplerCount
};
constant constexpr array<sampler, SamplerCount> metal_samplers = {
sampler(address::repeat, filter::nearest),
sampler(address::clamp_to_edge, filter::nearest),
sampler(address::clamp_to_zero, filter::nearest),
sampler(address::repeat, filter::linear),
sampler(address::clamp_to_edge, filter::linear),
sampler(address::clamp_to_zero, filter::linear),
};

View File

@@ -0,0 +1,79 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
// clang-format off
/* Open the Metal kernel context class
* Necessary to access resource bindings */
class MetalKernelContext {
public:
constant KernelParamsMetal &launch_params_metal;
constant MetalAncillaries *metal_ancillaries;
MetalKernelContext(constant KernelParamsMetal &_launch_params_metal, constant MetalAncillaries * _metal_ancillaries)
: launch_params_metal(_launch_params_metal), metal_ancillaries(_metal_ancillaries)
{}
/* texture fetch adapter functions */
typedef uint64_t ccl_gpu_tex_object;
template<typename T>
inline __attribute__((__always_inline__))
T ccl_gpu_tex_object_read_2D(ccl_gpu_tex_object tex, float x, float y) const {
kernel_assert(0);
return 0;
}
template<typename T>
inline __attribute__((__always_inline__))
T ccl_gpu_tex_object_read_3D(ccl_gpu_tex_object tex, float x, float y, float z) const {
kernel_assert(0);
return 0;
}
// texture2d
template<>
inline __attribute__((__always_inline__))
float4 ccl_gpu_tex_object_read_2D(ccl_gpu_tex_object tex, float x, float y) const {
const uint tid(tex);
const uint sid(tex >> 32);
return metal_ancillaries->textures_2d[tid].tex.sample(metal_samplers[sid], float2(x, y));
}
template<>
inline __attribute__((__always_inline__))
float ccl_gpu_tex_object_read_2D(ccl_gpu_tex_object tex, float x, float y) const {
const uint tid(tex);
const uint sid(tex >> 32);
return metal_ancillaries->textures_2d[tid].tex.sample(metal_samplers[sid], float2(x, y)).x;
}
// texture3d
template<>
inline __attribute__((__always_inline__))
float4 ccl_gpu_tex_object_read_3D(ccl_gpu_tex_object tex, float x, float y, float z) const {
const uint tid(tex);
const uint sid(tex >> 32);
return metal_ancillaries->textures_3d[tid].tex.sample(metal_samplers[sid], float3(x, y, z));
}
template<>
inline __attribute__((__always_inline__))
float ccl_gpu_tex_object_read_3D(ccl_gpu_tex_object tex, float x, float y, float z) const {
const uint tid(tex);
const uint sid(tex >> 32);
return metal_ancillaries->textures_3d[tid].tex.sample(metal_samplers[sid], float3(x, y, z)).x;
}
# include "kernel/device/gpu/image.h"
// clang-format on

View File

@@ -0,0 +1,23 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
}
; /* end of MetalKernelContext class definition */
/* Silently redirect into the MetalKernelContext instance */
/* NOTE: These macros will need maintaining as entrypoints change */
#undef kernel_integrator_state
#define kernel_integrator_state context.launch_params_metal.__integrator_state

View File

@@ -0,0 +1,51 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/* Constant Globals */
#include "kernel/types.h"
#include "kernel/util/profiling.h"
#include "kernel/integrator/state.h"
CCL_NAMESPACE_BEGIN
typedef struct KernelParamsMetal {
#define KERNEL_TEX(type, name) ccl_constant type *name;
#include "kernel/textures.h"
#undef KERNEL_TEX
const IntegratorStateGPU __integrator_state;
const KernelData data;
} KernelParamsMetal;
typedef struct KernelGlobalsGPU {
int unused[1];
} KernelGlobalsGPU;
typedef ccl_global const KernelGlobalsGPU *ccl_restrict KernelGlobals;
#define kernel_data launch_params_metal.data
#define kernel_integrator_state launch_params_metal.__integrator_state
/* data lookup defines */
#define kernel_tex_fetch(tex, index) launch_params_metal.tex[index]
#define kernel_tex_array(tex) launch_params_metal.tex
CCL_NAMESPACE_END

View File

@@ -0,0 +1,25 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/* Metal kernel entry points */
// clang-format off
#include "kernel/device/metal/compat.h"
#include "kernel/device/metal/globals.h"
#include "kernel/device/gpu/kernel.h"
// clang-format on

View File

@@ -76,6 +76,7 @@ typedef unsigned long long uint64_t;
#define ccl_gpu_block_idx_x (blockIdx.x)
#define ccl_gpu_grid_dim_x (gridDim.x)
#define ccl_gpu_warp_size (warpSize)
#define ccl_gpu_thread_mask(thread_warp) uint(0xFFFFFFFF >> (ccl_gpu_warp_size - thread_warp))
#define ccl_gpu_global_id_x() (ccl_gpu_block_idx_x * ccl_gpu_block_dim_x + ccl_gpu_thread_idx_x)
#define ccl_gpu_global_size_x() (ccl_gpu_grid_dim_x * ccl_gpu_block_dim_x)

View File

@@ -57,7 +57,7 @@ extern "C" __global__ void __raygen__kernel_optix_integrator_intersect_closest()
const int global_index = optixGetLaunchIndex().x;
const int path_index = (__params.path_index_array) ? __params.path_index_array[global_index] :
global_index;
integrator_intersect_closest(nullptr, path_index, __params.render_buffer);
integrator_intersect_closest(nullptr, path_index);
}
extern "C" __global__ void __raygen__kernel_optix_integrator_intersect_shadow()

View File

@@ -33,72 +33,62 @@ CCL_NAMESPACE_BEGIN
* them separately. */
ccl_device_inline void bsdf_eval_init(ccl_private BsdfEval *eval,
const ClosureType closure_type,
const bool is_diffuse,
float3 value)
{
eval->diffuse = zero_float3();
eval->glossy = zero_float3();
if (CLOSURE_IS_BSDF_DIFFUSE(closure_type)) {
if (is_diffuse) {
eval->diffuse = value;
}
else if (CLOSURE_IS_BSDF_GLOSSY(closure_type)) {
else {
eval->glossy = value;
}
eval->sum = value;
}
ccl_device_inline void bsdf_eval_accum(ccl_private BsdfEval *eval,
const ClosureType closure_type,
float3 value)
const bool is_diffuse,
float3 value,
float mis_weight)
{
if (CLOSURE_IS_BSDF_DIFFUSE(closure_type)) {
value *= mis_weight;
if (is_diffuse) {
eval->diffuse += value;
}
else if (CLOSURE_IS_BSDF_GLOSSY(closure_type)) {
else {
eval->glossy += value;
}
eval->sum += value;
}
ccl_device_inline bool bsdf_eval_is_zero(ccl_private BsdfEval *eval)
{
return is_zero(eval->sum);
return is_zero(eval->diffuse) && is_zero(eval->glossy);
}
ccl_device_inline void bsdf_eval_mul(ccl_private BsdfEval *eval, float value)
{
eval->diffuse *= value;
eval->glossy *= value;
eval->sum *= value;
}
ccl_device_inline void bsdf_eval_mul3(ccl_private BsdfEval *eval, float3 value)
{
eval->diffuse *= value;
eval->glossy *= value;
eval->sum *= value;
}
ccl_device_inline float3 bsdf_eval_sum(ccl_private const BsdfEval *eval)
{
return eval->sum;
return eval->diffuse + eval->glossy;
}
ccl_device_inline float3 bsdf_eval_pass_diffuse_weight(ccl_private const BsdfEval *eval)
ccl_device_inline float3 bsdf_eval_diffuse_glossy_ratio(ccl_private const BsdfEval *eval)
{
/* Ratio of diffuse weight to recover proportions for writing to render pass.
/* Ratio of diffuse and glossy to recover proportions for writing to render pass.
* We assume reflection, transmission and volume scatter to be exclusive. */
return safe_divide_float3_float3(eval->diffuse, eval->sum);
}
ccl_device_inline float3 bsdf_eval_pass_glossy_weight(ccl_private const BsdfEval *eval)
{
/* Ratio of glossy weight to recover proportions for writing to render pass.
* We assume reflection, transmission and volume scatter to be exclusive. */
return safe_divide_float3_float3(eval->glossy, eval->sum);
return safe_divide_float3_float3(eval->diffuse, eval->diffuse + eval->glossy);
}
/* --------------------------------------------------------------------
@@ -361,48 +351,38 @@ ccl_device_inline void kernel_accum_emission_or_background_pass(KernelGlobals kg
/* Directly visible, write to emission or background pass. */
pass_offset = pass;
}
else if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
if (path_flag & PATH_RAY_SURFACE_PASS) {
/* Indirectly visible through reflection. */
const float3 diffuse_weight = INTEGRATOR_STATE(state, path, pass_diffuse_weight);
const float3 glossy_weight = INTEGRATOR_STATE(state, path, pass_glossy_weight);
else if (path_flag & (PATH_RAY_REFLECT_PASS | PATH_RAY_TRANSMISSION_PASS)) {
/* Indirectly visible through reflection. */
const int glossy_pass_offset = (path_flag & PATH_RAY_REFLECT_PASS) ?
((INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_glossy_direct :
kernel_data.film.pass_glossy_indirect) :
((INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_transmission_direct :
kernel_data.film.pass_transmission_indirect);
/* Glossy */
const int glossy_pass_offset = ((INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_glossy_direct :
kernel_data.film.pass_glossy_indirect);
if (glossy_pass_offset != PASS_UNUSED) {
kernel_write_pass_float3(buffer + glossy_pass_offset, glossy_weight * contribution);
}
/* Transmission */
const int transmission_pass_offset = ((INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_transmission_direct :
kernel_data.film.pass_transmission_indirect);
if (transmission_pass_offset != PASS_UNUSED) {
/* Transmission is what remains if not diffuse and glossy, not stored explicitly to save
* GPU memory. */
const float3 transmission_weight = one_float3() - diffuse_weight - glossy_weight;
kernel_write_pass_float3(buffer + transmission_pass_offset,
transmission_weight * contribution);
}
/* Reconstruct diffuse subset of throughput. */
pass_offset = (INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_diffuse_direct :
kernel_data.film.pass_diffuse_indirect;
if (pass_offset != PASS_UNUSED) {
contribution *= diffuse_weight;
}
if (glossy_pass_offset != PASS_UNUSED) {
/* Glossy is a subset of the throughput, reconstruct it here using the
* diffuse-glossy ratio. */
const float3 ratio = INTEGRATOR_STATE(state, path, diffuse_glossy_ratio);
const float3 glossy_contribution = (one_float3() - ratio) * contribution;
kernel_write_pass_float3(buffer + glossy_pass_offset, glossy_contribution);
}
else if (path_flag & PATH_RAY_VOLUME_PASS) {
/* Indirectly visible through volume. */
pass_offset = (INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_volume_direct :
kernel_data.film.pass_volume_indirect;
/* Reconstruct diffuse subset of throughput. */
pass_offset = (INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_diffuse_direct :
kernel_data.film.pass_diffuse_indirect;
if (pass_offset != PASS_UNUSED) {
contribution *= INTEGRATOR_STATE(state, path, diffuse_glossy_ratio);
}
}
else if (path_flag & PATH_RAY_VOLUME_PASS) {
/* Indirectly visible through volume. */
pass_offset = (INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_volume_direct :
kernel_data.film.pass_volume_indirect;
}
/* Single write call for GPU coherence. */
if (pass_offset != PASS_UNUSED) {
@@ -446,60 +426,49 @@ ccl_device_inline void kernel_accum_light(KernelGlobals kg,
#ifdef __PASSES__
if (kernel_data.film.light_pass_flag & PASS_ANY) {
const uint32_t path_flag = INTEGRATOR_STATE(state, shadow_path, flag);
int pass_offset = PASS_UNUSED;
if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
int pass_offset = PASS_UNUSED;
if (path_flag & (PATH_RAY_REFLECT_PASS | PATH_RAY_TRANSMISSION_PASS)) {
/* Indirectly visible through reflection. */
const int glossy_pass_offset = (path_flag & PATH_RAY_REFLECT_PASS) ?
((INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_glossy_direct :
kernel_data.film.pass_glossy_indirect) :
((INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_transmission_direct :
kernel_data.film.pass_transmission_indirect);
if (path_flag & PATH_RAY_SURFACE_PASS) {
/* Indirectly visible through reflection. */
const float3 diffuse_weight = INTEGRATOR_STATE(state, shadow_path, pass_diffuse_weight);
const float3 glossy_weight = INTEGRATOR_STATE(state, shadow_path, pass_glossy_weight);
/* Glossy */
const int glossy_pass_offset = ((INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_glossy_direct :
kernel_data.film.pass_glossy_indirect);
if (glossy_pass_offset != PASS_UNUSED) {
kernel_write_pass_float3(buffer + glossy_pass_offset, glossy_weight * contribution);
}
/* Transmission */
const int transmission_pass_offset = ((INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_transmission_direct :
kernel_data.film.pass_transmission_indirect);
if (transmission_pass_offset != PASS_UNUSED) {
/* Transmission is what remains if not diffuse and glossy, not stored explicitly to save
* GPU memory. */
const float3 transmission_weight = one_float3() - diffuse_weight - glossy_weight;
kernel_write_pass_float3(buffer + transmission_pass_offset,
transmission_weight * contribution);
}
/* Reconstruct diffuse subset of throughput. */
pass_offset = (INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_diffuse_direct :
kernel_data.film.pass_diffuse_indirect;
if (pass_offset != PASS_UNUSED) {
contribution *= diffuse_weight;
}
}
else if (path_flag & PATH_RAY_VOLUME_PASS) {
/* Indirectly visible through volume. */
pass_offset = (INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_volume_direct :
kernel_data.film.pass_volume_indirect;
if (glossy_pass_offset != PASS_UNUSED) {
/* Glossy is a subset of the throughput, reconstruct it here using the
* diffuse-glossy ratio. */
const float3 ratio = INTEGRATOR_STATE(state, shadow_path, diffuse_glossy_ratio);
const float3 glossy_contribution = (one_float3() - ratio) * contribution;
kernel_write_pass_float3(buffer + glossy_pass_offset, glossy_contribution);
}
/* Single write call for GPU coherence. */
/* Reconstruct diffuse subset of throughput. */
pass_offset = (INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_diffuse_direct :
kernel_data.film.pass_diffuse_indirect;
if (pass_offset != PASS_UNUSED) {
kernel_write_pass_float3(buffer + pass_offset, contribution);
contribution *= INTEGRATOR_STATE(state, shadow_path, diffuse_glossy_ratio);
}
}
else if (path_flag & PATH_RAY_VOLUME_PASS) {
/* Indirectly visible through volume. */
pass_offset = (INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_volume_direct :
kernel_data.film.pass_volume_indirect;
}
/* Single write call for GPU coherence. */
if (pass_offset != PASS_UNUSED) {
kernel_write_pass_float3(buffer + pass_offset, contribution);
}
/* Write shadow pass. */
if (kernel_data.film.pass_shadow != PASS_UNUSED && (path_flag & PATH_RAY_SHADOW_FOR_LIGHT) &&
(path_flag & PATH_RAY_TRANSPARENT_BACKGROUND)) {
(path_flag & PATH_RAY_CAMERA)) {
const float3 unshadowed_throughput = INTEGRATOR_STATE(
state, shadow_path, unshadowed_throughput);
const float3 shadowed_throughput = INTEGRATOR_STATE(state, shadow_path, throughput);

View File

@@ -160,6 +160,40 @@ ccl_device_forceinline void kernel_write_denoising_features_volume(KernelGlobals
}
#endif /* __DENOISING_FEATURES__ */
#ifdef __SHADOW_CATCHER__
/* Write shadow catcher passes on a bounce from the shadow catcher object. */
ccl_device_forceinline void kernel_write_shadow_catcher_bounce_data(
KernelGlobals kg,
IntegratorState state,
ccl_private const ShaderData *sd,
ccl_global float *ccl_restrict render_buffer)
{
if (!kernel_data.integrator.has_shadow_catcher) {
return;
}
kernel_assert(kernel_data.film.pass_shadow_catcher_sample_count != PASS_UNUSED);
kernel_assert(kernel_data.film.pass_shadow_catcher_matte != PASS_UNUSED);
if (!kernel_shadow_catcher_is_path_split_bounce(kg, state, sd->object_flag)) {
return;
}
ccl_global float *buffer = kernel_pass_pixel_render_buffer(kg, state, render_buffer);
/* Count sample for the shadow catcher object. */
kernel_write_pass_float(buffer + kernel_data.film.pass_shadow_catcher_sample_count, 1.0f);
/* Since the split is done, the sample does not contribute to the matte, so accumulate it as
* transparency to the matte. */
const float3 throughput = INTEGRATOR_STATE(state, path, throughput);
kernel_write_pass_float(buffer + kernel_data.film.pass_shadow_catcher_matte + 3,
average(throughput));
}
#endif /* __SHADOW_CATCHER__ */
ccl_device_inline size_t kernel_write_id_pass(ccl_global float *ccl_restrict buffer,
size_t depth,
float id,
@@ -177,7 +211,7 @@ ccl_device_inline void kernel_write_data_passes(KernelGlobals kg,
#ifdef __PASSES__
const uint32_t path_flag = INTEGRATOR_STATE(state, path, flag);
if (!(path_flag & PATH_RAY_TRANSPARENT_BACKGROUND)) {
if (!(path_flag & PATH_RAY_CAMERA)) {
return;
}

View File

@@ -460,7 +460,7 @@ ccl_device_inline float4 film_calculate_shadow_catcher_matte_with_shadow(
const float transparency = in_matte[3] * scale;
const float alpha = saturatef(1.0f - transparency);
const float alpha_matte = (1.0f - alpha) * (1.0f - saturatef(average(shadow_catcher))) + alpha;
const float alpha_matte = (1.0f - alpha) * (1.0f - average(shadow_catcher)) + alpha;
if (kfilm_convert->use_approximate_shadow_catcher_background) {
kernel_assert(kfilm_convert->pass_background != PASS_UNUSED);

View File

@@ -70,16 +70,14 @@ ccl_device bool integrator_init_from_bake(KernelGlobals kg,
/* Setup render buffers. */
const int index = INTEGRATOR_STATE(state, path, render_pixel_index);
const int pass_stride = kernel_data.film.pass_stride;
ccl_global float *buffer = render_buffer + index * pass_stride;
render_buffer += index * pass_stride;
ccl_global float *primitive = buffer + kernel_data.film.pass_bake_primitive;
ccl_global float *differential = buffer + kernel_data.film.pass_bake_differential;
ccl_global float *primitive = render_buffer + kernel_data.film.pass_bake_primitive;
ccl_global float *differential = render_buffer + kernel_data.film.pass_bake_differential;
const int seed = __float_as_uint(primitive[0]);
int prim = __float_as_uint(primitive[1]);
if (prim == -1) {
/* Accumulate transparency for empty pixels. */
kernel_accum_transparent(kg, state, 0, 1.0f, buffer);
return false;
}

View File

@@ -88,10 +88,7 @@ ccl_device_forceinline bool integrator_intersect_terminate(KernelGlobals kg,
#ifdef __SHADOW_CATCHER__
/* Split path if a shadow catcher was hit. */
ccl_device_forceinline void integrator_split_shadow_catcher(
KernelGlobals kg,
IntegratorState state,
ccl_private const Intersection *ccl_restrict isect,
ccl_global float *ccl_restrict render_buffer)
KernelGlobals kg, IntegratorState state, ccl_private const Intersection *ccl_restrict isect)
{
/* Test if we hit a shadow catcher object, and potentially split the path to continue tracing two
* paths from here. */
@@ -100,8 +97,6 @@ ccl_device_forceinline void integrator_split_shadow_catcher(
return;
}
kernel_write_shadow_catcher_bounce_data(kg, state, render_buffer);
/* Mark state as having done a shadow catcher split so that it stops contributing to
* the shadow catcher matte pass, but keeps contributing to the combined pass. */
INTEGRATOR_STATE_WRITE(state, path, flag) |= PATH_RAY_SHADOW_CATCHER_HIT;
@@ -196,7 +191,6 @@ ccl_device_forceinline void integrator_intersect_next_kernel(
KernelGlobals kg,
IntegratorState state,
ccl_private const Intersection *ccl_restrict isect,
ccl_global float *ccl_restrict render_buffer,
const bool hit)
{
/* Continue with volume kernel if we are inside a volume, regardless if we hit anything. */
@@ -239,7 +233,7 @@ ccl_device_forceinline void integrator_intersect_next_kernel(
#ifdef __SHADOW_CATCHER__
/* Handle shadow catcher. */
integrator_split_shadow_catcher(kg, state, isect, render_buffer);
integrator_split_shadow_catcher(kg, state, isect);
#endif
}
else {
@@ -259,10 +253,7 @@ ccl_device_forceinline void integrator_intersect_next_kernel(
* volume shading and termination testing have already been done. */
template<uint32_t current_kernel>
ccl_device_forceinline void integrator_intersect_next_kernel_after_volume(
KernelGlobals kg,
IntegratorState state,
ccl_private const Intersection *ccl_restrict isect,
ccl_global float *ccl_restrict render_buffer)
KernelGlobals kg, IntegratorState state, ccl_private const Intersection *ccl_restrict isect)
{
if (isect->prim != PRIM_NONE) {
/* Hit a surface, continue with light or surface kernel. */
@@ -287,7 +278,7 @@ ccl_device_forceinline void integrator_intersect_next_kernel_after_volume(
#ifdef __SHADOW_CATCHER__
/* Handle shadow catcher. */
integrator_split_shadow_catcher(kg, state, isect, render_buffer);
integrator_split_shadow_catcher(kg, state, isect);
#endif
return;
}
@@ -299,9 +290,7 @@ ccl_device_forceinline void integrator_intersect_next_kernel_after_volume(
}
}
ccl_device void integrator_intersect_closest(KernelGlobals kg,
IntegratorState state,
ccl_global float *ccl_restrict render_buffer)
ccl_device void integrator_intersect_closest(KernelGlobals kg, IntegratorState state)
{
PROFILING_INIT(kg, PROFILING_INTERSECT_CLOSEST);
@@ -352,7 +341,7 @@ ccl_device void integrator_intersect_closest(KernelGlobals kg,
/* Setup up next kernel to be executed. */
integrator_intersect_next_kernel<DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST>(
kg, state, &isect, render_buffer, hit);
kg, state, &isect, hit);
}
CCL_NAMESPACE_END

View File

@@ -76,7 +76,7 @@ ccl_device void integrator_megakernel(KernelGlobals kg,
if (queued_kernel) {
switch (queued_kernel) {
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST:
integrator_intersect_closest(kg, state, render_buffer);
integrator_intersect_closest(kg, state);
break;
case DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND:
integrator_shade_background(kg, state, render_buffer);

View File

@@ -70,9 +70,6 @@ ccl_device_inline void path_state_init_integrator(KernelGlobals kg,
INTEGRATOR_STATE_WRITE(state, path, continuation_probability) = 1.0f;
INTEGRATOR_STATE_WRITE(state, path, throughput) = make_float3(1.0f, 1.0f, 1.0f);
INTEGRATOR_STATE_WRITE(state, isect, object) = OBJECT_NONE;
INTEGRATOR_STATE_WRITE(state, isect, prim) = PRIM_NONE;
if (kernel_data.kernel_features & KERNEL_FEATURE_VOLUME) {
INTEGRATOR_STATE_ARRAY_WRITE(state, volume_stack, 0, object) = OBJECT_NONE;
INTEGRATOR_STATE_ARRAY_WRITE(
@@ -125,7 +122,7 @@ ccl_device_inline void path_state_next(KernelGlobals kg, IntegratorState state,
/* volume scatter */
flag |= PATH_RAY_VOLUME_SCATTER;
flag &= ~PATH_RAY_TRANSPARENT_BACKGROUND;
if (!(flag & PATH_RAY_ANY_PASS)) {
if (bounce == 1) {
flag |= PATH_RAY_VOLUME_PASS;
}
@@ -187,8 +184,8 @@ ccl_device_inline void path_state_next(KernelGlobals kg, IntegratorState state,
}
/* Render pass categories. */
if (!(flag & PATH_RAY_ANY_PASS) && !(flag & PATH_RAY_TRANSPARENT_BACKGROUND)) {
flag |= PATH_RAY_SURFACE_PASS;
if (bounce == 1) {
flag |= (label & LABEL_TRANSMIT) ? PATH_RAY_TRANSMISSION_PASS : PATH_RAY_REFLECT_PASS;
}
}
@@ -211,7 +208,9 @@ ccl_device_inline bool path_state_volume_next(IntegratorState state)
}
/* Random number generator next bounce. */
INTEGRATOR_STATE_WRITE(state, path, rng_offset) += PRNG_BOUNCE_NUM;
if (volume_bounds_bounce > 1) {
INTEGRATOR_STATE_WRITE(state, path, rng_offset) += PRNG_BOUNCE_NUM;
}
return true;
}

View File

@@ -191,18 +191,14 @@ ccl_device_forceinline void integrate_surface_direct_light(KernelGlobals kg,
const uint16_t transparent_bounce = INTEGRATOR_STATE(state, path, transparent_bounce);
uint32_t shadow_flag = INTEGRATOR_STATE(state, path, flag);
shadow_flag |= (is_light) ? PATH_RAY_SHADOW_FOR_LIGHT : 0;
shadow_flag |= PATH_RAY_SURFACE_PASS;
shadow_flag |= (is_transmission) ? PATH_RAY_TRANSMISSION_PASS : PATH_RAY_REFLECT_PASS;
const float3 throughput = INTEGRATOR_STATE(state, path, throughput) * bsdf_eval_sum(&bsdf_eval);
if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
const float3 pass_diffuse_weight = (bounce == 0) ?
bsdf_eval_pass_diffuse_weight(&bsdf_eval) :
INTEGRATOR_STATE(state, path, pass_diffuse_weight);
const float3 pass_glossy_weight = (bounce == 0) ?
bsdf_eval_pass_glossy_weight(&bsdf_eval) :
INTEGRATOR_STATE(state, path, pass_glossy_weight);
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, pass_diffuse_weight) = pass_diffuse_weight;
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, pass_glossy_weight) = pass_glossy_weight;
const float3 diffuse_glossy_ratio = (bounce == 0) ?
bsdf_eval_diffuse_glossy_ratio(&bsdf_eval) :
INTEGRATOR_STATE(state, path, diffuse_glossy_ratio);
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, diffuse_glossy_ratio) = diffuse_glossy_ratio;
}
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, render_pixel_index) = INTEGRATOR_STATE(
@@ -287,9 +283,7 @@ ccl_device_forceinline int integrate_surface_bsdf_bssrdf_bounce(
if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
if (INTEGRATOR_STATE(state, path, bounce) == 0) {
INTEGRATOR_STATE_WRITE(state, path, pass_diffuse_weight) = bsdf_eval_pass_diffuse_weight(
&bsdf_eval);
INTEGRATOR_STATE_WRITE(state, path, pass_glossy_weight) = bsdf_eval_pass_glossy_weight(
INTEGRATOR_STATE_WRITE(state, path, diffuse_glossy_ratio) = bsdf_eval_diffuse_glossy_ratio(
&bsdf_eval);
}
}
@@ -451,7 +445,7 @@ ccl_device bool integrate_surface(KernelGlobals kg,
}
#endif
shader_prepare_surface_closures(kg, state, &sd, path_flag);
shader_prepare_surface_closures(kg, state, &sd);
#ifdef __HOLDOUT__
/* Evaluate holdout. */
@@ -498,6 +492,10 @@ ccl_device bool integrate_surface(KernelGlobals kg,
kernel_write_denoising_features_surface(kg, state, &sd, render_buffer);
#endif
#ifdef __SHADOW_CATCHER__
kernel_write_shadow_catcher_bounce_data(kg, state, &sd, render_buffer);
#endif
/* Direct light. */
PROFILING_EVENT(PROFILING_SHADE_SURFACE_DIRECT_LIGHT);
integrate_surface_direct_light(kg, state, &sd, &rng_state);

View File

@@ -263,12 +263,6 @@ ccl_device void volume_shadow_heterogeneous(KernelGlobals kg,
/* Equi-angular sampling as in:
* "Importance Sampling Techniques for Path Tracing in Participating Media" */
/* Below this pdf we ignore samples, as they tend to lead to very long distances.
* This can cause performance issues with BVH traversal in OptiX, leading it to
* traverse many nodes. Since these contribute very little to the image, just ignore
* those samples. */
# define VOLUME_SAMPLE_PDF_CUTOFF 1e-8f
ccl_device float volume_equiangular_sample(ccl_private const Ray *ccl_restrict ray,
const float3 light_P,
const float xi,
@@ -443,8 +437,7 @@ ccl_device_forceinline void volume_integrate_step_scattering(
/* Equiangular sampling for direct lighting. */
if (vstate.direct_sample_method == VOLUME_SAMPLE_EQUIANGULAR && !result.direct_scatter) {
if (result.direct_t >= vstate.start_t && result.direct_t <= vstate.end_t &&
vstate.equiangular_pdf > VOLUME_SAMPLE_PDF_CUTOFF) {
if (result.direct_t >= vstate.start_t && result.direct_t <= vstate.end_t) {
const float new_dt = result.direct_t - vstate.start_t;
const float3 new_transmittance = volume_color_transmittance(coeff.sigma_t, new_dt);
@@ -481,28 +474,26 @@ ccl_device_forceinline void volume_integrate_step_scattering(
const float3 new_transmittance = volume_color_transmittance(coeff.sigma_t, new_dt);
const float distance_pdf = dot(channel_pdf, coeff.sigma_t * new_transmittance);
if (vstate.distance_pdf * distance_pdf > VOLUME_SAMPLE_PDF_CUTOFF) {
/* throughput */
result.indirect_scatter = true;
result.indirect_t = new_t;
result.indirect_throughput *= coeff.sigma_s * new_transmittance / distance_pdf;
shader_copy_volume_phases(&result.indirect_phases, sd);
/* throughput */
result.indirect_scatter = true;
result.indirect_t = new_t;
result.indirect_throughput *= coeff.sigma_s * new_transmittance / distance_pdf;
shader_copy_volume_phases(&result.indirect_phases, sd);
if (vstate.direct_sample_method != VOLUME_SAMPLE_EQUIANGULAR) {
/* If using distance sampling for direct light, just copy parameters
* of indirect light since we scatter at the same point then. */
result.direct_scatter = true;
result.direct_t = result.indirect_t;
result.direct_throughput = result.indirect_throughput;
shader_copy_volume_phases(&result.direct_phases, sd);
if (vstate.direct_sample_method != VOLUME_SAMPLE_EQUIANGULAR) {
/* If using distance sampling for direct light, just copy parameters
* of indirect light since we scatter at the same point then. */
result.direct_scatter = true;
result.direct_t = result.indirect_t;
result.direct_throughput = result.indirect_throughput;
shader_copy_volume_phases(&result.direct_phases, sd);
/* Multiple importance sampling. */
if (vstate.use_mis) {
const float equiangular_pdf = volume_equiangular_pdf(ray, equiangular_light_P, new_t);
const float mis_weight = power_heuristic(vstate.distance_pdf * distance_pdf,
equiangular_pdf);
result.direct_throughput *= 2.0f * mis_weight;
}
/* Multiple importance sampling. */
if (vstate.use_mis) {
const float equiangular_pdf = volume_equiangular_pdf(ray, equiangular_light_P, new_t);
const float mis_weight = power_heuristic(vstate.distance_pdf * distance_pdf,
equiangular_pdf);
result.direct_throughput *= 2.0f * mis_weight;
}
}
}
@@ -703,10 +694,8 @@ ccl_device_forceinline bool integrate_volume_sample_light(
float light_u, light_v;
path_state_rng_2D(kg, rng_state, PRNG_LIGHT_U, &light_u, &light_v);
if (!light_distribution_sample_from_volume_segment(
kg, light_u, light_v, sd->time, sd->P, bounce, path_flag, ls)) {
return false;
}
light_distribution_sample_from_volume_segment(
kg, light_u, light_v, sd->time, sd->P, bounce, path_flag, ls);
if (ls->shader & SHADER_EXCLUDE_SCATTER) {
return false;
@@ -805,11 +794,10 @@ ccl_device_forceinline void integrate_volume_direct_light(
const float3 throughput_phase = throughput * bsdf_eval_sum(&phase_eval);
if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
const float3 pass_diffuse_weight = (bounce == 0) ?
one_float3() :
INTEGRATOR_STATE(state, path, pass_diffuse_weight);
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, pass_diffuse_weight) = pass_diffuse_weight;
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, pass_glossy_weight) = zero_float3();
const float3 diffuse_glossy_ratio = (bounce == 0) ?
one_float3() :
INTEGRATOR_STATE(state, path, diffuse_glossy_ratio);
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, diffuse_glossy_ratio) = diffuse_glossy_ratio;
}
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, render_pixel_index) = INTEGRATOR_STATE(
@@ -888,8 +876,7 @@ ccl_device_forceinline bool integrate_volume_phase_scatter(
INTEGRATOR_STATE_WRITE(state, path, throughput) = throughput_phase;
if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
INTEGRATOR_STATE_WRITE(state, path, pass_diffuse_weight) = one_float3();
INTEGRATOR_STATE_WRITE(state, path, pass_glossy_weight) = zero_float3();
INTEGRATOR_STATE_WRITE(state, path, diffuse_glossy_ratio) = one_float3();
}
/* Update path state */
@@ -1037,7 +1024,7 @@ ccl_device void integrator_shade_volume(KernelGlobals kg,
else {
/* Continue to background, light or surface. */
integrator_intersect_next_kernel_after_volume<DEVICE_KERNEL_INTEGRATOR_SHADE_VOLUME>(
kg, state, &isect, render_buffer);
kg, state, &isect);
return;
}
#endif /* __VOLUME__ */

View File

@@ -105,42 +105,8 @@ ccl_device_inline void shader_copy_volume_phases(ccl_private ShaderVolumePhases
ccl_device_inline void shader_prepare_surface_closures(KernelGlobals kg,
ConstIntegratorState state,
ccl_private ShaderData *sd,
const uint32_t path_flag)
ccl_private ShaderData *sd)
{
/* Filter out closures. */
if (kernel_data.integrator.filter_closures) {
if (kernel_data.integrator.filter_closures & FILTER_CLOSURE_EMISSION) {
sd->closure_emission_background = zero_float3();
}
if (kernel_data.integrator.filter_closures & FILTER_CLOSURE_DIRECT_LIGHT) {
sd->flag &= ~SD_BSDF_HAS_EVAL;
}
if (path_flag & PATH_RAY_CAMERA) {
for (int i = 0; i < sd->num_closure; i++) {
ccl_private ShaderClosure *sc = &sd->closure[i];
if ((CLOSURE_IS_BSDF_DIFFUSE(sc->type) &&
(kernel_data.integrator.filter_closures & FILTER_CLOSURE_DIFFUSE)) ||
(CLOSURE_IS_BSDF_GLOSSY(sc->type) &&
(kernel_data.integrator.filter_closures & FILTER_CLOSURE_GLOSSY)) ||
(CLOSURE_IS_BSDF_TRANSMISSION(sc->type) &&
(kernel_data.integrator.filter_closures & FILTER_CLOSURE_TRANSMISSION))) {
sc->type = CLOSURE_NONE_ID;
sc->sample_weight = 0.0f;
}
else if ((CLOSURE_IS_BSDF_TRANSPARENT(sc->type) &&
(kernel_data.integrator.filter_closures & FILTER_CLOSURE_TRANSPARENT))) {
sc->type = CLOSURE_HOLDOUT_ID;
sc->sample_weight = 0.0f;
sd->flag |= SD_HOLDOUT;
}
}
}
}
/* Defensive sampling.
*
* We can likely also do defensive sampling at deeper bounces, particularly
@@ -243,7 +209,8 @@ ccl_device_inline float _shader_bsdf_multi_eval(KernelGlobals kg,
float3 eval = bsdf_eval(kg, sd, sc, omega_in, is_transmission, &bsdf_pdf);
if (bsdf_pdf != 0.0f) {
bsdf_eval_accum(result_eval, sc->type, eval * sc->weight);
const bool is_diffuse = CLOSURE_IS_BSDF_DIFFUSE(sc->type);
bsdf_eval_accum(result_eval, is_diffuse, eval * sc->weight, 1.0f);
sum_pdf += bsdf_pdf * sc->sample_weight;
}
}
@@ -268,7 +235,7 @@ ccl_device_inline
ccl_private BsdfEval *bsdf_eval,
const uint light_shader_flags)
{
bsdf_eval_init(bsdf_eval, CLOSURE_NONE_ID, zero_float3());
bsdf_eval_init(bsdf_eval, false, zero_float3());
return _shader_bsdf_multi_eval(
kg, sd, omega_in, is_transmission, NULL, bsdf_eval, 0.0f, 0.0f, light_shader_flags);
@@ -361,7 +328,8 @@ ccl_device int shader_bsdf_sample_closure(KernelGlobals kg,
label = bsdf_sample(kg, sd, sc, randu, randv, &eval, omega_in, domega_in, pdf);
if (*pdf != 0.0f) {
bsdf_eval_init(bsdf_eval, sc->type, eval * sc->weight);
const bool is_diffuse = CLOSURE_IS_BSDF_DIFFUSE(sc->type);
bsdf_eval_init(bsdf_eval, is_diffuse, eval * sc->weight);
if (sd->num_closure > 1) {
const bool is_transmission = shader_bsdf_is_transmission(sd, *omega_in);
@@ -687,7 +655,7 @@ ccl_device_inline float _shader_volume_phase_multi_eval(
float3 eval = volume_phase_eval(sd, svc, omega_in, &phase_pdf);
if (phase_pdf != 0.0f) {
bsdf_eval_accum(result_eval, CLOSURE_VOLUME_HENYEY_GREENSTEIN_ID, eval);
bsdf_eval_accum(result_eval, false, eval, 1.0f);
sum_pdf += phase_pdf * svc->sample_weight;
}
@@ -703,7 +671,7 @@ ccl_device float shader_volume_phase_eval(KernelGlobals kg,
const float3 omega_in,
ccl_private BsdfEval *phase_eval)
{
bsdf_eval_init(phase_eval, CLOSURE_VOLUME_HENYEY_GREENSTEIN_ID, zero_float3());
bsdf_eval_init(phase_eval, false, zero_float3());
return _shader_volume_phase_multi_eval(sd, phases, omega_in, -1, phase_eval, 0.0f, 0.0f);
}
@@ -761,7 +729,7 @@ ccl_device int shader_volume_phase_sample(KernelGlobals kg,
label = volume_phase_sample(sd, svc, randu, randv, &eval, omega_in, domega_in, pdf);
if (*pdf != 0.0f) {
bsdf_eval_init(phase_eval, CLOSURE_VOLUME_HENYEY_GREENSTEIN_ID, eval);
bsdf_eval_init(phase_eval, false, eval);
}
return label;
@@ -784,7 +752,7 @@ ccl_device int shader_phase_sample_closure(KernelGlobals kg,
label = volume_phase_sample(sd, sc, randu, randv, &eval, omega_in, domega_in, pdf);
if (*pdf != 0.0f)
bsdf_eval_init(phase_eval, CLOSURE_VOLUME_HENYEY_GREENSTEIN_ID, eval);
bsdf_eval_init(phase_eval, false, eval);
return label;
}

View File

@@ -16,7 +16,6 @@
#pragma once
#include "kernel/film/write_passes.h"
#include "kernel/integrator/path_state.h"
#include "kernel/integrator/state_util.h"
@@ -48,7 +47,7 @@ ccl_device_inline bool kernel_shadow_catcher_is_path_split_bounce(KernelGlobals
return false;
}
if (path_flag & PATH_RAY_SHADOW_CATCHER_HIT) {
if (path_flag & PATH_RAY_SHADOW_CATCHER_PASS) {
return false;
}
@@ -89,28 +88,6 @@ ccl_device_forceinline bool kernel_shadow_catcher_is_object_pass(const uint32_t
return path_flag & PATH_RAY_SHADOW_CATCHER_PASS;
}
/* Write shadow catcher passes on a bounce from the shadow catcher object. */
ccl_device_forceinline void kernel_write_shadow_catcher_bounce_data(
KernelGlobals kg, IntegratorState state, ccl_global float *ccl_restrict render_buffer)
{
kernel_assert(kernel_data.film.pass_shadow_catcher_sample_count != PASS_UNUSED);
kernel_assert(kernel_data.film.pass_shadow_catcher_matte != PASS_UNUSED);
const uint32_t render_pixel_index = INTEGRATOR_STATE(state, path, render_pixel_index);
const uint64_t render_buffer_offset = (uint64_t)render_pixel_index *
kernel_data.film.pass_stride;
ccl_global float *buffer = render_buffer + render_buffer_offset;
/* Count sample for the shadow catcher object. */
kernel_write_pass_float(buffer + kernel_data.film.pass_shadow_catcher_sample_count, 1.0f);
/* Since the split is done, the sample does not contribute to the matte, so accumulate it as
* transparency to the matte. */
const float3 throughput = INTEGRATOR_STATE(state, path, throughput);
kernel_write_pass_float(buffer + kernel_data.film.pass_shadow_catcher_matte + 3,
average(throughput));
}
#endif /* __SHADOW_CATCHER__ */
CCL_NAMESPACE_END

View File

@@ -46,9 +46,8 @@ KERNEL_STRUCT_MEMBER(shadow_path,
float3,
unshadowed_throughput,
KERNEL_FEATURE_SHADOW_PASS | KERNEL_FEATURE_AO_ADDITIVE)
/* Ratio of throughput to distinguish diffuse / glossy / transmission render passes. */
KERNEL_STRUCT_MEMBER(shadow_path, float3, pass_diffuse_weight, KERNEL_FEATURE_LIGHT_PASSES)
KERNEL_STRUCT_MEMBER(shadow_path, float3, pass_glossy_weight, KERNEL_FEATURE_LIGHT_PASSES)
/* Ratio of throughput to distinguish diffuse and glossy render passes. */
KERNEL_STRUCT_MEMBER(shadow_path, float3, diffuse_glossy_ratio, KERNEL_FEATURE_LIGHT_PASSES)
/* Number of intersections found by ray-tracing. */
KERNEL_STRUCT_MEMBER(shadow_path, uint16_t, num_hits, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_END(shadow_path)

View File

@@ -60,9 +60,8 @@ KERNEL_STRUCT_MEMBER(path, float, min_ray_pdf, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(path, float, continuation_probability, KERNEL_FEATURE_PATH_TRACING)
/* Throughput. */
KERNEL_STRUCT_MEMBER(path, float3, throughput, KERNEL_FEATURE_PATH_TRACING)
/* Ratio of throughput to distinguish diffuse / glossy / transmission render passes. */
KERNEL_STRUCT_MEMBER(path, float3, pass_diffuse_weight, KERNEL_FEATURE_LIGHT_PASSES)
KERNEL_STRUCT_MEMBER(path, float3, pass_glossy_weight, KERNEL_FEATURE_LIGHT_PASSES)
/* Ratio of throughput to distinguish diffuse and glossy render passes. */
KERNEL_STRUCT_MEMBER(path, float3, diffuse_glossy_ratio, KERNEL_FEATURE_LIGHT_PASSES)
/* Denoising. */
KERNEL_STRUCT_MEMBER(path, float3, denoising_feature_throughput, KERNEL_FEATURE_DENOISING)
/* Shader sorting. */

View File

@@ -71,10 +71,6 @@ ccl_device int subsurface_bounce(KernelGlobals kg,
}
# endif
if (sd->flag & SD_BACKFACING) {
path_flag |= PATH_RAY_SUBSURFACE_BACKFACING;
}
INTEGRATOR_STATE_WRITE(state, path, throughput) *= weight;
INTEGRATOR_STATE_WRITE(state, path, flag) = path_flag;
@@ -83,8 +79,7 @@ ccl_device int subsurface_bounce(KernelGlobals kg,
if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
if (INTEGRATOR_STATE(state, path, bounce) == 0) {
INTEGRATOR_STATE_WRITE(state, path, pass_diffuse_weight) = one_float3();
INTEGRATOR_STATE_WRITE(state, path, pass_glossy_weight) = zero_float3();
INTEGRATOR_STATE_WRITE(state, path, diffuse_glossy_ratio) = one_float3();
}
}

View File

@@ -47,7 +47,6 @@ ccl_device_inline bool subsurface_disk(KernelGlobals kg,
const float time = INTEGRATOR_STATE(state, ray, time);
const float3 Ng = INTEGRATOR_STATE(state, subsurface, Ng);
const int object = INTEGRATOR_STATE(state, isect, object);
const uint32_t path_flag = INTEGRATOR_STATE(state, path, flag);
/* Read subsurface scattering parameters. */
const float3 radius = INTEGRATOR_STATE(state, subsurface, radius);
@@ -124,9 +123,6 @@ ccl_device_inline bool subsurface_disk(KernelGlobals kg,
const int object = ss_isect.hits[hit].object;
const int object_flag = kernel_tex_fetch(__object_flag, object);
float3 hit_Ng = ss_isect.Ng[hit];
if (path_flag & PATH_RAY_SUBSURFACE_BACKFACING) {
hit_Ng = -hit_Ng;
}
if (object_flag & SD_OBJECT_NEGATIVE_SCALE_APPLIED) {
hit_Ng = -hit_Ng;
}

View File

@@ -73,7 +73,7 @@ ccl_device_inline bool light_sample(KernelGlobals kg,
ls->P = zero_float3();
ls->Ng = zero_float3();
ls->D = zero_float3();
ls->pdf = 1.0f;
ls->pdf = true;
ls->t = FLT_MAX;
return true;
}
@@ -131,7 +131,7 @@ ccl_device_inline bool light_sample(KernelGlobals kg,
float3 dir = make_float3(klight->spot.dir[0], klight->spot.dir[1], klight->spot.dir[2]);
ls->eval_fac *= spot_light_attenuation(
dir, klight->spot.spot_angle, klight->spot.spot_smooth, ls->Ng);
if (!in_volume_segment && ls->eval_fac == 0.0f) {
if (ls->eval_fac == 0.0f) {
return false;
}
}
@@ -170,7 +170,7 @@ ccl_device_inline bool light_sample(KernelGlobals kg,
float3 sample_axisu = axisu;
float3 sample_axisv = axisv;
if (!in_volume_segment && klight->area.tan_spread > 0.0f) {
if (klight->area.tan_spread > 0.0f) {
if (!light_spread_clamp_area_light(
P, Ng, &ls->P, &sample_axisu, &sample_axisv, klight->area.tan_spread)) {
return false;
@@ -203,7 +203,7 @@ ccl_device_inline bool light_sample(KernelGlobals kg,
ls->pdf *= kernel_data.integrator.pdf_lights;
return in_volume_segment || (ls->pdf > 0.0f);
return (ls->pdf > 0.0f);
}
ccl_device bool lights_intersect(KernelGlobals kg,

View File

@@ -199,9 +199,6 @@ ccl_device_inline float3 shadow_ray_offset(KernelGlobals kg,
if (offset_cutoff > 0.0f) {
float NgL = dot(Ng, L);
float offset_amount = 0.0f;
if (NL < 0) {
NL = -NL;
}
if (NL < offset_cutoff) {
offset_amount = clamp(2.0f - (NgL + NL) / offset_cutoff, 0.0f, 1.0f);
}

Some files were not shown because too many files have changed in this diff Show More