Compare commits

..

503 Commits

Author SHA1 Message Date
a82f633ebb GPU: Cleanup GPU_batch.h documentation and some of the API for consistency
Documented all functions, adding use case and side effects.

Also replace the use of shortened argument name by more meaningful ones.

Renamed `GPU_batch_instbuf_add_ex` and `GPU_batch_vertbuf_add_ex` to remove
the `ex` suffix as they are the main version used (removed the few usage
of the other version).

Renamed `GPU_batch_draw_instanced` to `GPU_batch_draw_instance_range` and
make it consistent with `GPU_batch_draw_range`.
2023-02-11 12:52:58 -08:00
5a184f2ff0 Cleanup: Move 6 sculpt-session-related files and header to C++
To allow further mesh data structure refactoring. See #103343

Pull Request #104540
2023-02-11 12:52:55 -08:00
4a75d4b581 Fix #104383: don't update declaration for clipboard copy
When nodes are copied to the clipboard, they don't need their declaration.
For nodes with dynamic declaration that might depend on the node tree itself,
the declaration could not be build anyway, because the node-clipboard does
not have a node tree.

Pull Request #104432
2023-02-11 12:50:45 -08:00
13a5f5c926 Fix #104435: Fix rna_NlaStrip_new add strip logic to be correct boolean expression
Fixed #104435: Use correct conditional logic when testing if a new NLA strip can be added in the rna_NlaStrip_new method
2023-02-11 12:50:45 -08:00
ce2b6a5b44 Cleanup: Fix const correctness warning in recent commit 2023-02-11 12:50:45 -08:00
Kevin C. Burke
2c22895ee6 Fix: Sequencer "Pan" label using incorrect keyword 'heading_ctxt'
Oversight in db87e2a638

Reviewed By: ISS
Differential Revision: https://archive.blender.org/developer/D17213
2023-02-11 12:50:45 -08:00
4fc62cf0f1 Geometry Nodes: Edges to Face Groups Node
Add a new node that groups faces inside of boundary edge regions.
This is the opposite action as the existing "Face Group Boundaries"
node. It's also the same as some of the "Initialize Face Sets"
options in sculpt mode.

Discussion in #102962 has favored "Group" for a name for these
sockets rather than "Set", so that is used here.

Pull Request #104428
2023-02-11 12:50:44 -08:00
ed40f76fe2 Assets: Implement viewport drag and drop for geometry nodes
Currently there's no way to assign a geometry node group from the asset
browser to an object as a modifier without first appending/linking it
manually. This patch adds a drag and drop operator that adds a new
modifier and assigns the dragged tree.

Pull Request #104430
2023-02-11 12:50:44 -08:00
8f56c7835e Fix: Add missing "-" in logic to get the channel height
This was missed when doing the refactoring in #104500
It didn't seem to have any effect until I worked on clamping the view
2023-02-11 12:50:44 -08:00
3e8c82fb68 Mesh: Remove unnecessary edge draw flag
As described in #95966, replace the `ME_EDGEDRAW` flag with a bit
vector in mesh runtime data. Currently the the flag is only ever set
to false for the "optimal display" feature of the subdivision surface
modifier. When creating an "original" mesh in the main data-base,
the flag is always supposed to be true.

The bit vector is now created by the modifier only as necessary, and
is cleared for topology-changing operations. This fixes incorrect
interpolation of the flag as noted in #104376. Generally it isn't
possible to interpolate it through topology-changing operations.

After this, only the seam status needs to be removed from edges before
we can replace them with the generic `int2` type (or something similar)
and reduce memory usage by 1/3.

Related:
- 10131a6f62
- 145839aa42

In the future `BM_ELEM_DRAW` could be removed as well. Currently it is
used and aliased by other defines in some non-obvious ways though.

Pull Request #104417
2023-02-11 12:50:44 -08:00
b352d9986e Curves: Add box selection
This adds a `select_box` function for the `Curves` object. It is used in the `view3d_box_select` operator.

It also adds the basic selection tools in the toolbar of Edit Mode.

Authored-by: Falk David <falkdavid@gmx.de>
Pull Request #104411
2023-02-11 12:50:44 -08:00
8be7d35212 I18n: use format strings for Cycles version error messages
The required version numbers for various devices was hardcoded in the
UI messages. The result was that every time one of these versions was
bumped, every language team had to update the message in question.

Instead, the version numbers can be extracted, and injected into the
error messages using string formatting so that translation updates
need happen less frequently.

Pull Request #104488
2023-02-11 12:50:44 -08:00
6136c39b56 Refactor: remove yscale from bAnimContext
`bAnimContext` had a float property called `yscale_fac` that was used to define the height of the keyframe channels.

However the property was never set, only read so there really is no need to have it in the struct.

Moreover it complicated getting the channel height because `bAnimContext` had to be passed in.

Speaking of getting the channel height. This was done with macros. I ripped them all out and replaced them with function calls.

Originally it was introduced in this patch: https://developer.blender.org/rB095c8dbe6919857ea322b213a1e240161cd7c843

Co-authored-by: Christoph Lendenfeld <chris.lenden@gmail.com>
Pull Request #104500
2023-02-11 12:50:44 -08:00
760665e88b Fix freeing uninitialized pointer in GHOST/Wayland + X11 fallback
Freeing the timer manager didn't account for Wayland being partially
initialized.
2023-02-11 12:50:44 -08:00
70897a4fea Build: disable LTO for Python builds
LTO compiled libpython3.10.a failed to link with GCC 12.0,
disable since these libraries are intended for developers to link
against.
2023-02-11 12:50:44 -08:00
70d05ccd01 Build: enable Python optimizations (PGO & LTO) on Linux
This is used for most Python release builds and has been reported to
give a modest 5-10% speedup (depending on the workload).

This could be enabled on macOS too but needs to be tested.
2023-02-11 12:50:44 -08:00
c1fd0fa079 GPU: Fix assert when using light gizmo.
Blender was reporting that the GPU_TEXTURE_USAGE_HOST_READ wasn't set.
This is used to indicate that the textures needs to be read back to
CPU. Textures that don't need to be read back can be optimized by the
GPU backend.

Found during investigation of #104282.
2023-02-11 12:50:44 -08:00
6495e36779 Spelling: Assert message in GPU_texture_read. 2023-02-11 12:50:44 -08:00
66c7106672 Cleanup: solve compiler warnings.
Classes were predefined as structs.
2023-02-11 12:50:44 -08:00
8d8960001c Cycles: update Intel Graphics Compiler to 1.0.13064.7 on Linux
Linux side of 8afcecdf1f.

Reviewed by: LazyDodo, sergey, campbellbarton
Ref !104458, 16984
2023-02-11 12:50:44 -08:00
95da6fddcb Cleanup: Remove unused/redundant includes from BKE_curves.hh
Avoid including headers that are obviously redundant, and don't
include BLI_task.hh in the header file, since it isn't really related.
2023-02-11 12:50:44 -08:00
5fba2a156b Cleanup: update username in code-comments: campbellbarton -> ideasman42
Gitea migration changed my username, update code-comments.
2023-02-11 12:50:43 -08:00
c58e60af70 Cleanup: spelling in comments 2023-02-11 12:50:43 -08:00
0f3a8ff8f3 Cleanup: enum conversion compiler warnings 2023-02-11 12:50:43 -08:00
3aff175c27 PyAPI: minor change to rna_manual_reference loading
- Use bpy.utils.execfile instead of importing then deleting from
  sys.modules.
- Add a note for why keeping this cached in memory isn't necessary.

This has the advantage of not interfering with any scripts that import
`rna_manual_reference` as a module.
2023-02-11 12:50:43 -08:00
a6861475b6 EEVEE-Next: Shadows: Add global switch
This allow to bypass all cost associated with shadow mapping.

This can be useful in certain situation, such as opening a scene on a
lower end system or just to gain performance in some situation (lookdev).
2023-02-11 12:50:43 -08:00
02576a8b8e EEVEE-Next: Shadow: Fix issue with last merge
The merge with master updated the code to use the new matrix API. This
introduce some regressions.

For sunlights make sure there is enough tilemaps in orthographic mode
to cover the depth range and fix the level offset in perspective.
2023-02-11 12:50:43 -08:00
f5031dc7f4 Fix Cycles link error with debug/asan builds after recent bugfix
Pull Request #104487
2023-02-11 12:50:43 -08:00
fa55c85176 EEVEE-Next: Virtual Shadow Map initial implementation
Implements virtual shadow mapping for EEVEE-Next primary shadow solution.
This technique aims to deliver really high precision shadowing for many
lights while keeping a relatively low cost.

The technique works by splitting each shadows in tiles that are only
allocated & updated on demand by visible surfaces and volumes.
Local lights use cubemap projection with mipmap level of detail to adapt
the resolution to the receiver distance.
Sun lights use clipmap distribution or cascade distribution (depending on
which is better) for selecting the level of detail with the distance to
the camera.

Current maximum shadow precision for local light is about 1 pixel per 0.01
degrees.
For sun light, the maximum resolution is based on the camera far clip
distance which sets the most coarse clipmap.

## Limitation:
Alpha Blended surfaces might not get correct shadowing in some corner
casses. This is to be fixed in another commit.
While resolution is greatly increase, it is still finite. It is virtually
equivalent to one 8K shadow per shadow cube face and per clipmap level.
There is no filtering present for now.

## Parameters:
Shadow Pool Size: In bytes, amount of GPU memory to dedicate to the
shadow pool (is allocated per viewport).
Shadow Scaling: Scale the shadow resolution. Base resolution should
target subpixel accuracy (within the limitation of the technique).

Related to #93220
Related to #104472
2023-02-11 12:50:43 -08:00
df37a60c62 BLI: Math: Fix vector operator * with MutableMatView
This was caused by operator priority trying to use
`friend VecBase operator*(const VecBase &a, FactorT b)`.

Adding tests as these were not covered.
2023-02-11 12:50:43 -08:00
f7e4c3cb69 Fix Cycles debug build error after host falback changes
Introduced in dcfb6df9ce6.

Co-authored-by: Lucas Tadeu Teixeira <lucas@lucastadeu.com>

Pull Request #104454
2023-02-11 12:50:43 -08:00
909daaf629 Make update: Ignore submodules
The previous change in the .gitmodules made it so the `make update`
rejects to do its thing because it now sees changes in the submodules
and rejected to update, thinking there are unstaged changes.

Ignore the submodule changes, bringing the old behavior closer to
what it was.
2023-02-11 12:50:42 -08:00
28e9e40a8d Un-ignore modules in .gitmodules configuration
The meaning of the ignore option for submodules did change since our
initial Git setup was done: back then it was affecting both diff and
stage families of Git command. Unfortunately, the actual behavior did
violate what documentation was stating (the documentation was stating
that the option only affects diff family of commands). This got fixed
in Git some time after our initial setup and it was the behavior of the
commands changed, not the documentation. This lead to a situation when
we can no longer see that submodules are modified and staged, and it is
very easy to stage the submodules.

For the clarity: diff and status are both "status" family, show and
diff are "diff" family.

Hence this change: since there is no built-in zero-configuration way
of forbidding Git from staging submodules lets make it visible and
clear what the state of submodules is.

We still need to inform people to not stage submodules, for which
we can offer some configuration tips and scripts but doing so is
outside of the scope of this change at it requires some additional
research. Current goal is simple: make it visible and clear what is
going to be committed to Git.

This is a response to an increased frequency of incidents when the
submodules are getting modified and committed without authors even
noticing this (which is also a bit annoying to recover from).

Differential Revision: https://developer.blender.org/D13001
2023-02-11 12:50:42 -08:00
c5b0eba85b Fix references to the main branch in the .gitmodules 2023-02-11 12:50:42 -08:00
52a31a8ca9 Subdivision Surface: fix a serious performance hit when mixing CPU & GPU.
Subdivision surface efficiency relies on caching pre-computed topology
data for evaluation between frames. However, while eed45d2a23
introduced a second GPU subdiv evaluator type, it still only kept
one slot for caching this runtime data per mesh.

The result is that if the mesh is also needed on CPU, for instance
due to a modifier on a different object (e.g. shrinkwrap), the two
evaluators are used at the same time and fight over the single slot.
This causes the topology data to be discarded and recomputed twice
per frame.

Since avoiding duplicate evaluation is a complex task, this fix
simply adds a second separate cache slot for the GPU data, so that
the cost is simply running subdivision twice, not recomputing topology
twice.

To help diagnostics, I also add a message to show when GPU evaluation
is actually used to the modifier panel. Two frame counters are used
to suppress flicker in the UI panel.

Differential Revision: https://developer.blender.org/D17117

Pull Request #104441
2023-02-11 12:50:42 -08:00
d3f14bda16 Cleanup: use enum literals, order likely case first in polyfill_2d 2023-02-11 12:50:42 -08:00
3abceef389 Fix #103913: Triangulate sometimes creates degenerate triangles
The ear clipping method used by polyfill_2d only excluded concave ears
which meant ears exactly co-linear edges created zero area triangles
even when convex ears are available.

While polyfill_2d prioritizes performance over *pretty* results,
there is no need to pick degenerate triangles with other candidates
are available. As noted in code-comments, callers that require higher
quality tessellation should use BLI_polyfill_beautify.
2023-02-11 12:50:42 -08:00
ed334e642f Merge branch 'main' into temp-sculpt-roll-mapping 2023-02-07 18:15:15 -08:00
ef04a4262e Merge branch 'master' into temp-sculpt-roll-mapping 2023-02-07 18:15:08 -08:00
6aa1b5d031 Cleanup: format 2023-02-08 00:21:57 +01:00
5c994d7846 Fix #104297: Cycling geometry nodes viewer ignores sockets
Sockets after the geometry socket were ignored when cycling through
the node's output sockets. If there are multiple geometry sockets, the
behavior could still be refined probably, but this should at least make
basic non-geometry socket cycling work.
2023-02-07 16:01:54 -05:00
53b057aa09 Cleanup: Move 18 sculpt files to C++
To allow further mesh data structure refactoring. See #103343

Pull Request #104436
2023-02-07 21:56:45 +01:00
e817cff009 Release: support generating LTS release notes from Gitea
Now a single script to generate both links and release notes. It also includes
the issue ID for the LTS releases, so only the release version needs to be
specified.

Pull Request #104402
2023-02-07 21:23:24 +01:00
41ddd3d732 Fix: Experimental Panel links modified for Gitea
Modifies the links to point to the new developer site.

Pull Request #104425
2023-02-07 19:54:43 +01:00
f5552d759c Fix compiler error 2023-02-07 18:32:24 +01:00
f01bf82480 Curves: Add select pick operator
This adds the `select_pick` function for to `Curves` objects.
It is used in the common `view3d_select` operator.

Pull Request #104406
2023-02-07 17:50:39 +01:00
8d9d16fb53 Fix #104396: Blender crashes when moving Keyframes in Graph Editor
`t->region->gizmo_map` can be `nullptr`.

Caused by 19b63b932d
2023-02-07 12:59:35 -03:00
349350b304 Fix T104390: Regression: Object selection in viewport is not working
Caused by alignment difference between C and C++. Asan caught the issue
on startup.

Removing the unused view matrix storage copy avoids this problem.
2023-02-07 16:45:47 +01:00
cb5318b651 Docs: change Git URLs to point projects.blender.org instead of git.blender.org 2023-02-07 14:23:05 +01:00
bd6b0bac88 Update references to the new projects platform and main branch 2023-02-07 14:18:19 +01:00
3002670332 Fix T104368: Incorrect tooltip text in Blender 3.4.1's Preferences > File Paths > Scripts field.
Use backticks to cleary identify 'path' parts of this tooltip.
2023-02-07 09:50:50 +01:00
f086cf3cea Cleanup: remove redundant parenthesis 2023-02-07 17:34:20 +11:00
2609ca2b8e Cleanup: tweaks to cycles/metal preferences
- Auto-format.
- Use raw string for regex.
- Remove redundant assignment.
- Remove duplicate arm64 check.
- Break early out of loop.
2023-02-07 17:30:13 +11:00
b-init
7e8153b07d Keymap: support default shortcut to toggle overlays in all space-types
UV Editor, Image Editor & Sequencer didn't have a shortcut for toggling
overlays. Use the same shortcut as the 3D viewport.

Ref D16959
2023-02-07 16:54:06 +11:00
622cad7073 Cleanup: minor tweak to recent fix for T10438
Minor change to [0], prefer calling em_setup_viewcontext,
even though there is no functional difference at the moment,
if this function ever performs additional operations than assigning
`ViewContext.em`, it would have to be manually in-lined in
`view3d_circle_select_recalc`.

[0]: 430cc9d7bf
2023-02-07 16:18:30 +11:00
44daeaae7d Cleanup: use arg instead of param for generated sphinx docs 2023-02-07 15:14:22 +11:00
db8b5a2316 PyDoc: remove deprecated dpi argument from BLF example 2023-02-07 15:12:05 +11:00
dbca0cc9d5 Fix crash on exit under Wayland
Order of free error from [0] caused the timer manager
to be freed before the timer.

[0]: 7de1a4d1d8
2023-02-07 15:12:05 +11:00
e4f77c1a6c Cleanup: format 2023-02-07 16:57:35 +13:00
Jon Denning
e27c89c7c7 Docs: added missing documentation for WindowManager methods
Added missing documentation for `draw_cursor_add` and
`draw_cursor_remove` methods for `WindowManager`.

Differential Revision: https://developer.blender.org/D14860
2023-02-06 22:40:10 -05:00
af5706c960 Docs: improve doc-string for WM_operator_flag_only_pass_through_on_press
The doc-string didn't provide any context for how the funciton is
intended to be used.
2023-02-07 14:18:59 +11:00
a99022e22d Cleanup: spelling in comments 2023-02-07 14:17:01 +11:00
d5af895419 Fix missing matrix includes 2023-02-07 14:07:21 +11:00
Jason Fielder
8703db393b Metal: Ensure explicit return after discard to eliminate differences in behaviour between GPUs.
Discard is not always treated as an explicit return and flow control can continue for required derivative calculations. This behaviour is different in Metal vs OpenGL. Adding return after discards ensures consistency in expectation as behaviour is well-defined.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17199
2023-02-07 00:58:06 +01:00
Jason Fielder
f152159101 Metal: Guard advanced command buffer debugging behind OS version flag.
Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17181
2023-02-07 00:51:16 +01:00
d3500c482f Cleanup: Move DRW_pbvh.h header to C++
For continued refactoring of the Mesh data structure. See T103343.
2023-02-06 16:52:02 -05:00
3a1583972a Fix T104256: Curve to points node skips curve domain attributes
7536abbe16 forgot to port the curve domain attributes.
2023-02-06 16:30:25 -05:00
6dcfb6df9c Cycles: Abstract host memory fallback for GPU devices
Host memory fallback in CUDA and HIP devices is almost identical.
We remove duplicated code and create a shared generic version that
other devices (oneAPI) will be able to use.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17173
2023-02-06 22:19:32 +01:00
b0b9e746fa BLI: Use BLI_math_matrix_type.hh instead of BLI_math_float4x4.hh
Straightforward port. I took the oportunity to remove some C vector
functions (ex: copy_v2_v2).

This makes some changes to DRWView to accomodate the alignement
requirements of the float4x4 type.
2023-02-06 21:25:45 +01:00
8be3fcab7e Fix tests after recent commit.
`9c14039a8f4b5f` broke blenlib tests in release builds, due to how
`EXPECT_BLI_ASSERT` works (in release builds it just calls the given
function, so if that crashes then the test fails).

For now remove that check in the test.
2023-02-06 21:14:20 +01:00
d20f992322 Mesh: Avoid copying positions in object mode modifier stack
Remove the use of a separate contiguous positions array now that
they are stored that way in the first place. This allows removing the
complexity of tracking whether it is allocated and deformed in the
mesh modifier stack.

Instead of deferring the creation of the final mesh until after the
positions have been copied and deformed, create the final mesh
first and then deform its positions.

Since vertex and face normals are calculated lazily, we can rely on
individual modifiers to calculate them as necessary and simplify
the modifier stack. This was hard to change before because of the
separate array of deformed positions.

Differential Revision: https://developer.blender.org/D16971
2023-02-06 14:34:29 -05:00
a38d99e0b2 Fix (unreported): Transform gizmo not restoring when changing mode
When activating a rotation with the Transform gizmo for example, some
gizmos are hidden but they don't reappear when changing the mode.

Make sure the gizmos corresponding to the mode always reappear.
2023-02-06 16:20:12 -03:00
75db4c082b Cleanup: Use BitVector instead of BLI_bitmap for subsurf face dot tags
For better type safety and more automatic memory management.
2023-02-06 14:13:53 -05:00
Michael Jones
2d994de77c Cycles: MetalRT optimisation for subsurface intersection queries
This patch optimises subsurface intersection queries on MetalRT. Currently intersect_local traverses from the scene root, retrospectively discarding all non-local hits. Using a lookup of bottom level acceleration structures, we can explicitly query only the relevant instance. On M1 Max, with MetalRT selected, this can give a render speedup of 15-20% for scenes like Monster which make heavy use of subsurface scattering.

Patch authored by Marco Giordano.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17153
2023-02-06 19:12:29 +00:00
961d99d3a4 Minor comments about current usages of BKE_main_namemap_destroy. 2023-02-06 19:36:36 +01:00
430cc9d7bf Fix T104381: Assert on Circle Select end modal
`em_setup_vivewcontext` cannot be used in this function now as it
expects `obedit` to be a mesh. It also duplicated the viewcontext init.
Instead `BKE_editmesh_from_object` is called only when type is a mesh.
2023-02-06 19:32:14 +01:00
a3551ed878 BKE main namemap: Add a way to clear all namemaps of a given Main.
Existing `BKE_main_namemap_destroy` is too specific when a entire Main
needs to have its namemaps cleared, since it would not handle the
Library ones.

While in regular situation current code is fine, ID management code that
may also edit linked data needs a wider, simpler clearing tool.
2023-02-06 19:29:21 +01:00
288b13b252 BKE lib remapper: Add new util to overwrite an existing mapping.
Current `BKE_id_remapper_add` would not replace an already existing
mapping rule, now `BKE_id_remapper_add_overwrite` allows that behavior
if necessary.
2023-02-06 19:29:21 +01:00
85b2bce037 BKE libremap: do not assert on identity pairs.
If identity pairs (i.e. old ID pointer being same as new one) was
forbidden, then this should be asserted against in code defining
remapping, not in code applying it.

But it is actually sometimes usefull to allow/use identity pairs, so
simply early-return on these instead of asserting.
2023-02-06 19:29:21 +01:00
c7b601c79e Cleanup: Subdiv: Use index instead of pointer
The edge pointer was retrieved from the index and then just
used to calculate the index again. Remove the edge pointer
and only use the index.
2023-02-06 12:29:01 -05:00
81f8d74f6d EEVEE: Cleanup: light power / radiance code and document it
All thanks to @weizhen for spoting the inconsistencies.
2023-02-06 18:23:34 +01:00
9c14039a8f BLI ListBase: Add new 'SplitAfter' given link util.
Allows to easily split a ListBase by moving the part after given link
into a new ListBase.
2023-02-06 17:58:40 +01:00
39e0bbfa55 BKE libquery: Add option to also process ID.lib pointer.
This was never needed before, but upcomming work for Brush Asset project
requires it.
2023-02-06 17:31:01 +01:00
8fe4f3b756 Minor code doc improvement. 2023-02-06 17:19:02 +01:00
Iliya Katueshenock
a86f657692 Fix T104233: crash when deleting a group node that is displayed by other editor
Differential Revision: https://developer.blender.org/D17172
2023-02-06 16:18:16 +01:00
8adebaeb7c Modifiers: measure execution time and provide Python access
The goal is to give technical artists the ability to optimize modifier usage
and/or geometry node groups for performance. In the long term, it
would be useful if Blender could provide its own UI to display profiling
information to users. However, right now, there are too many open
design questions making it infeasible to tackle this in the short term.

This commit uses a simpler approach: Instead of adding new ui for
profiling data, it exposes the execution-time of modifiers in the Python
API. This allows technical artists to access the information and to build
their own UI to display the relevant information. In the long term this
will hopefully also help us to integrate a native ui for this in Blender
by observing how users use this information.

Note: The execution time of a modifier highly depends on what other
things the CPU is doing at the same time. For example, in many more
complex files, many objects and therefore modifiers are evaluated at
the same time by multiple threads which makes the measurement
much less reliable. For best results, make sure that only one object
is evaluated at a time (e.g. by changing it in isolation) and that no
other process on the system keeps the CPU busy.

As shown below, the execution time has to be accessed on the
evaluated object, not the original object.

```lang=python
import bpy
depsgraph = bpy.context.view_layer.depsgraph
ob = bpy.context.active_object
ob_eval = ob.evaluated_get(depsgraph)
modifier_eval = ob_eval.modifiers[0]
print(modifier_eval.execution_time, "s")
```

Differential Revision: https://developer.blender.org/D17185
2023-02-06 15:40:15 +01:00
773a36d2f8 Fix Cycles OneAPI build error after recent changes 2023-02-06 15:36:49 +01:00
cc623ee7b0 Transform: do not save settings when canceling the operation
If we are canceling, the settings must remain the same as before we
start the operation.
2023-02-06 11:18:52 -03:00
deaddbdcff Fix forced snap status being removed when changing transform mode
The `SNAP_FORCED` setting is set to the operation and not the snap
status.

Therefore, this option should not be cleared along with the other
statuses when resetting snapping.

Move then the location of this setting to `TransInfo::modifiers`.
2023-02-06 11:18:52 -03:00
f2538c7173 Fix T104335: MNEE + OptiX OSL results in illegal address error
The OptiX pipeline created for OSL was missing sufficient continuation
stack to handle the MNEE ray generation program.
2023-02-06 15:06:52 +01:00
4bd3b02984 Python: Suppress BGL deprecation messages after 100 times.
BGL deprecation calls used to be reported on each use. As bgl calls
are typically part of a handler that is triggered at refresh this
could lead to overflow of messages and slowing down systems when
the terminal/console had to be refreshed as well.

This patch only reports the first 100 bgl deprecation calls. This
gives enough feedback to the developer that changes needs to be made
. But still provides good responsiveness to users when they have
such add-on enabled. Only the first frames can have a slowdown.
2023-02-06 13:35:29 +01:00
7beb487e9a Fix T104353: Crash on opening sculpting template
`t->region` was `NULL`.

It can happen depending on the context.

Caused by rB19b63b932d2b.
2023-02-06 09:21:04 -03:00
9ad3a85f8b Fix Cycles GPU binaries build error after recent changes for Metal 2023-02-06 13:17:57 +01:00
Michael Jones
654e1e901b Cycles: Use local atomics for faster shader sorting (enabled on Metal)
This patch adds two new kernels: SORT_BUCKET_PASS and SORT_WRITE_PASS. These replace PREFIX_SUM and SORTED_PATHS_ARRAY on supported devices (currently implemented on Metal, but will be trivial to enable on the other backends). The new kernels exploit sort partitioning (see D15331) by sorting each partition separately using local atomics. This can give an overall render speedup of 2-3% depending on architecture. As before, we fall back to the original non-partitioned sorting when the shader count is "too high".

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16909
2023-02-06 11:18:26 +00:00
Michael Jones
46c9f7702a Cycles: Enable MetalRT opt-in for AMD/Navi2 GPUs
Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17043
2023-02-06 11:14:11 +00:00
Michael Jones
be0912a402 Cycles: Prevent use of both AMD and Intel Metal devices at same time
This patch removes the option to select both AMD and Intel GPUs on system that have both. Currently both devices will be selected by default which results in crashes and other poorly understood behaviour. This patch adds precedence for using any discrete AMD GPU over an integrated Intel one. This can be overridden with CYCLES_METAL_FORCE_INTEL.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17166
2023-02-06 11:13:33 +00:00
Michael Jones
0a3df611e7 Fix T103393: Cycles: Undefine __LIGHT_TREE__ on Metal/AMD to fix perf
This patch fixes T103393 by undefining `__LIGHT_TREE__` on Metal/AMD as it has an unexpected & major impact on performance even when light trees are not in use.

Patch authored by Prakash Kamliya.

Reviewed By: brecht

Maniphest Tasks: T103393

Differential Revision: https://developer.blender.org/D17167
2023-02-06 11:12:34 +00:00
Amelie Fondevilla
6d297c35c8 Fix T104371: GPencil merge down layer duplicates wrong frame
The merge down operator was sometimes copying the wrong frame, which altered the animation.
While merging the layers, it is sometimes needed to duplicate a keyframe,
when the lowest layer does not have a keyframe but the highest layer does.
Instead of duplicating the previous keyframe of the lowest layer, the code
was actually duplicating the active frame of the layer which was the current frame in the timeline.

This patch fixes the issue by setting the previous keyframe of the layer as its active frame before duplication.

Related issue: T104371.

Differential Revision: https://developer.blender.org/D17214
2023-02-06 10:44:17 +01:00
329eeacc66 Cleanup: Cycles: Remove isotropic microfacet closure setup functions
Turns out these are 100% redundant, so get rid of them.
2023-02-06 04:26:36 +01:00
2627635ff3 Cleanup: use nullptr in C++ 2023-02-06 12:50:34 +11:00
d6b6050e5b Cleanup: use function style casts in C++ 2023-02-06 12:35:51 +11:00
731c3efd97 Cleanup: format 2023-02-06 12:32:45 +11:00
9f5c17f4af Cleanup: comments in code 2023-02-06 12:25:04 +11:00
4fcc9f5e7e Cleanup: use back-slash doxygen commands, de-duplicate doc-string 2023-02-06 12:25:04 +11:00
7de1a4d1d8 Fix GHOST/Wayland thread-unsafe timer-manager manipulation
Mutex locks for manipulating GHOST_System::m_timerManager from
GHOST_SystemWayland relied on WAYLAND being the only user of the
timer-manager.

This isn't the case as timers are fired from
`GHOST_System::dispatchEvents`.

Resolve by using a separate timer-manager for wayland key-repeat timers.
2023-02-06 12:25:04 +11:00
d3949a4fdb Fix GHOST/Wayland thread-unsafe key-repeat timer checks
Resolve a thread safety issue reported by valgrind's helgrind checker,
although I wasn't able to redo the error in practice.

NULL check on the key-repeat timer also needs to lock, otherwise it's
possible the timer is set in another thread before the lock is acquired.

Now all key-repeat timer access which may run from a thread
locks the timer mutex before any checks or timer manipulation.
2023-02-06 12:25:04 +11:00
b642dc7bc7 Fix: Incorrect forward-compatible saving of face sets
There were two errors with the function used to convert face sets
to the legacy mesh format for keeping forward compatibility:
- It was moved before `CustomData_blend_write_prepare` so it
  operated on an empty span.
- It modified the mesh when it's only supposed to change the copy
  of the layers written to the file.

Differential Revision: https://developer.blender.org/D17210
2023-02-05 18:09:22 -05:00
501352ef05 Cleanup: Move PBVH files to C++
For continued refactoring of the Mesh data structure. See T103343.
2023-02-05 17:36:47 -05:00
3420c47900 GPU: Fix incorrect implicit datatype conversions in test case shaders. 2023-02-05 22:33:46 +01:00
RiverIntheSky
a094f30a74 Fix T104363: BLI_assert 'attr->comp_len == 3' failed in cage 2d gizmo 2023-02-05 21:21:35 +01:00
3d6ceb737d Geometry Nodes: parallelize part of Duplicate Elements node
This implements two optimizations:
* If the duplication count is constant, the offsets array can be
  filled directly in parallel.
* Otherwise, extracting the counts from the virtual array is parallelized.
  But there is still a serial loop over all elements in the end to compute
  the offsets.
2023-02-05 20:59:39 +01:00
b7034e7280 Cleanup: use boolean instead of int, use const arguments, variable 2023-02-05 21:51:34 +11:00
a97607dcfa Cleanup: use typed enum for the handler flag in wm_event_system 2023-02-05 21:51:34 +11:00
f1314f3d5b Cleanup: make WM_HANDLER_* action flags local to wm_event_system 2023-02-05 21:51:34 +11:00
d6cd7d1138 WM: correct the return flag from wm_handler_fileselect_do
In the unlikely case an area could not be created OPERATOR_CANCELLED
was returned, this has the same value of WM_HANDLER_HANDLED however
break is logical in this situation and both flags work.
2023-02-05 21:51:34 +11:00
b2196ebe28 Win32: Relax Debug Assert in BLI_file_attributes
If GetFileAttributesW returns an error, only debug assert if the reason
is file not found.

See D17204 for more details.

Differential Revision: https://developer.blender.org/D17204

Reviewed by Ray Molenkamp
2023-02-04 13:44:10 -08:00
1a13940ef8 Fix T88617: Wrong annotation cursor in Preview of sequencer editor
Allow preview region to change cursor as per the selected tool

Reviewed by: campbellbarton, ISS

Differential Revision: https://developer.blender.org/D16878
2023-02-04 10:22:50 +05:30
62fc001979 Cleanup: silence warning
```
warning C4457: declaration of 'fac' hides function parameter
message : see declaration of 'fac'
```
2023-02-04 00:16:02 -03:00
Germano Cavalcante
06305e5ca8 Cleanup/refactor: split Edge and Vert Slide code into more specific functions
This makes the code more readable.

In this commit also the `int curr_side_unclamp` member was moved to
`EdgeSlideParams` as it is a common value for all "Containers".
2023-02-03 23:01:26 -03:00
789e549dba Geometry Nodes: Tweak menu location of sample nodes
There was an inconsistency between geometry sample nodes and mesh/curve
sample nodes, where the latter didn't have a special "Sample" category,
and we categorized as "Operations", which they were not. Also put the
sample category between "Read" and "Write" since the verb name is
more consistent and sampling is an advanced form of reading.
2023-02-03 16:26:56 -05:00
ab9c34e7e7 Fix T104316: Width of headerless panels on regions with overlap
In some cases panels without headers were drawn too wide in sidebars
with region overlap.

Instead of always using the maximum width the view allows, headerless
panels now also use the width calculated in
`panel_draw_width_from_max_width_get`. The function already returns the
correct width in all cases and also takes care of insetting the panels,
when their backdrop needs to be drawn, which is necessary for headerless
panels, too, when there is region overlap.

Reviewed By: Hans Goudey

Differential Revision: http://developer.blender.org/D17194
2023-02-03 21:40:38 +01:00
90f36fc50e Fix (unreported): snap to object origin not respecting clipping planes
There was an incorrect conversion for local clip planes although the
coordinates used are in world positions.
2023-02-03 17:16:44 -03:00
73000c792d Cycles: Reorganize Fresnel handling in Microfacet closures
This is both a cleanup and a preparation for the Principled v2 changes.
Notable changes:
- Clearcoat weight is now folded into the closure weight, there's no reason
  to track this separately.
- There's a general-purpose helper for computing a Closure's albedo, which is
  currently used by the denoising albedo and diffuse/gloss/transmission color
  passes.
- The d/g/t color passes didn't account for closure albedo before, this means
  that e.g. metallic shaders with Principled v2 now have their color texture
  included in the glossy color pass. Also fixes T104041 (sheen albedo).
- Instead of precomputing and storing the albedo during shader setup, compute
  it when needed. This is technically redundant since we still need to compute
  it on shader setup to adjust the sample weight, but the operation is cheap
  enough that freeing up the storage seems worth it.
- Future changes (Principled v2) are easier to integrate since the Fresnel
  handling isn't all over the place anymore.
- Fresnel handling in the Multiscattering GGX code is still ugly, but since
  removing that entirely is the next step, putting effort into cleaning it up
  doesn't seem worth it.
- Apart from the d/g/t color passes, no changes to render results are expected.

Differential Revision: https://developer.blender.org/D17101
2023-02-03 21:03:48 +01:00
0220bdc2d5 Cycles: Tests: Add option to increase SPP for manual comparisons
Useful for validating changes when sampling/noise changes:
- First run with BLENDER_TEST_UPDATE=1 and e.g. CYCLESTEST_SPP_MULTIPLIER=32
- Apply your change
- Run with only CYCLESTEST_SPP_MULTIPLIER=32
- Compare
- Reset the SVN repo

Differential Revision: https://developer.blender.org/D17107
2023-02-03 21:00:47 +01:00
b454416927 Cycles: add non-uniform scaling to spot light size
Cycles ignores the size of spot lights, therefore the illuminated area doesn't match the gizmo. This patch resolves this discrepancy.
| Before (Cycles) | After (Cycles) | Eevee
|{F14200605}|{F14200595}|{F14200600}|
This is done by scaling the ray direction by the size of the cone. The implementation of `spot_light_attenuation()` in `spot.h` matches `spot_attenuation()` in `lights_lib.glsl`.
**Test file**:
{F14200728}

Differential Revision: https://developer.blender.org/D17129
2023-02-03 18:51:14 +01:00
6590a288b0 Fix number sliders not working
Own mistake in d204830107.

For some buttons the type is changed after construction, which means the button
has to be reconstructed. For example number buttons can be turned into number
slider buttons this way. New code was unintentionally overriding the button
type after reconstruction with the old type again.
2023-02-03 18:46:12 +01:00
23506622a5 Gizmo: add central point to circular 2D cage 2023-02-03 18:30:57 +01:00
7f958217ad Curves: Use shared caches for evaluated data
Use the `SharedCache` concept introduced in D16204 to share lazily
calculated evaluated data between original and multiple evaluated
curves data-blocks. Combined with D14139, this should basically remove
most costs associated with copying large curves data-blocks (though
they add slightly higher constant overhead). The caches should
interact well with undo steps, limiting recalculations on undo/redo.

Options for avoiding the new overhead associated with the shared
caches are described in T104327 and can be addressed separately.

Simple situations affected by this change are using any of the following data
on an evaluated curves data-block without first invalidating it:
- Evaluated offsets (size of evaluated curves)
- Evaluated positions
- Evaluated tangents
- Evaluated normals
- Evaluated lengths (spline parameter node)
- Internal Bezier and NURBS caches

In a test with 4m points and 170k curves, using curve normals in a
procedural setup that didn't change positions procedurally gave 5x
faster playback speeds. Avoiding recalculating the offsets on every
update saved about 3 ms for every sculpt update for brushes that don't
change topology.

Differential Revision: https://developer.blender.org/D17134
2023-02-03 12:19:40 -05:00
dc79281223 Nodes: Add modal keymap for Node link drag
This will add a proper modal keymap for the node link drag operator.
It allows the user to customize the keys used to start drag and so on.
Also it gets rid of the custom status bar message.

Differential Revision: https://developer.blender.org/D17190
2023-02-03 17:41:12 +01:00
1195933ada Fix: Compile error after BMesh conversion commit 2023-02-03 11:09:19 -05:00
a1c01e0c06 Attempt to fix build errors with NDOF enabled 2023-02-03 17:03:29 +01:00
fcc1166821 GPU: Disable verbose GLSL variable names in debug builds
GpuInput::node can be deallocated in some cases. (See T104265)
This is a temp workaround until a proper solution is implemented.
2023-02-03 17:00:35 +01:00
d204830107 UI: Make uiBut safe for non-trivial construction
No user-visible changes expected.

Essentially, this makes it possible to use C++ types like `std::function`
inside `uiBut`. This has plenty of benefits, for example this should help
significantly reducing unsafe `void *` use (since a `std::function` can hold
arbitrary data while preserving types).

----

I wanted to use a non-trivially-constructible C++ type (`std::function`) inside
`uiBut`. But this would mean we can't use `MEM_cnew()` like allocation anymore.

Rather than writing worse code, allow non-trivial construction for `uiBut`.
Member-initializing all members is annoying since there are so many, but rather
safe than sorry. As we use more C++ types (e.g. convert callbacks to use
`std::function`), this should become less since they initialize properly on
default construction.

Also use proper C++ inheritance for `uiBut` subtypes, the old way to allocate
based on size isn't working anymore.

Differential Revision: https://developer.blender.org/D17164

Reviewed by: Hans Goudey
2023-02-03 16:35:51 +01:00
ebe8f8ce71 BMesh: Parallelize BMesh to evaluated Mesh conversion
Currently this conversion (which happens when using modifiers in edit
mode, for example) is completely single threaded. It's harder than some
other areas to multithread because BMesh elements don't always know
their indices (and vise versa), and because the dynamic AoS format
used by BMesh makes some typical solutions not helpful.

This patch proposes to split the operation into two steps. The first
updates the indices of BMesh elements and builds tables for easy
iteration later. It also checks if some optional mesh attributes
should be added. The second uses parallel loops over all elements,
copying attribute values and building the Mesh topology.

Both steps process different domains in separate threads (though the
first has to combine faces and loops). Though this isn't proper data
parallelism, it's quite helpful because each domain doesn't affect the
others.

**Timings**
I tested this on a Ryzen 7950x with a 1 million face grid, with no
extra attributes and then with several color attributes and vertex
groups.

| File | Before | After |
| Simple | 101.6 ms | 59.6 ms |
| More Attributes | 149.2 ms | 65.6 ms |

The optimization scales better with more attributes on the BMesh. The
speedup isn't as linear as multithreading other operations, indicating
added overhead. I think this is worth it though, because the user is
usually actively interacting with a mesh in edit mode.

See the differential revision for more timing information.

Differential Revision: https://developer.blender.org/D16249
2023-02-03 10:20:19 -05:00
Germano Cavalcante
19b63b932d Fix transform gizmo not updating according to state
Whenever a transform operation is activated by gizmo, the gizmo
modal is maintained, but its drawing remains the same even if the
transform mode or constrain is changed.

So update the gizmo according to the mode or constrain set.

NOTE: Currently only 3D view gizmo is affected
2023-02-03 11:07:43 -03:00
af0d378177 Cleanup: Move multires reshape header to C++
For continued refactoring of the Mesh data structure. See T103343.
2023-02-03 08:48:00 -05:00
96e37affe5 Cleanup: Remove unused image paint function 2023-02-03 08:18:41 -05:00
5a9d2b872e Cleanup: incorrect naming of storage_buf parameters.
They were named vert.
2023-02-03 14:11:07 +01:00
0f51b5b599 Animation: Add Operator for adding FCurve modifiers to more menus
The operator has the option to add to selected FCurves instead
of only the active, but it was only exposed in the redo panel.
This patch adds the operator to the right-click menu on FCurve channels,
and to the channel menu in the Graph editor.
Both times with configured to add to selected
instead of only the active FCurve

Revied by Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D17066
Ref: D17066
2023-02-03 13:06:15 +01:00
cc23b6abd6 Cleanup: rename cage2d draw style (RECTANGLE -> BOX_TRANSFORM) 2023-02-03 12:01:45 +01:00
3c8c0f1094 Gizmo: add gizmo for adjusting spot light blend
Ref T104280

Differential Revision: https://developer.blender.org/D17171
2023-02-03 11:32:10 +01:00
d335db09e0 Fix T90893: Restoring a key in the keymap editor doesn't work if the key is only disabled
Reset `KMI_INACTIVE` flag when restore-item  button is called

Reviewed By: campbellbarton

Differential Revision: https://developer.blender.org/D16773
2023-02-03 15:45:59 +05:30
ab3e1809b6 Vulkan: Select graphics queue with compute capabilities.
There might be cases that the graphics queue and compute
queue are separated, but that would be something that we
will solve in the future.
2023-02-03 10:16:01 +01:00
a45a881534 Spelling: familly -> family.
Fix spelling mistake originated from tmp-vulkan branch.
2023-02-03 10:16:01 +01:00
9565ea0724 IO: Harmonize UI for selection of axes in OBJ and Collada
Implements T103858: in OBJ importer and exporter, and in Collada
exporter, present axis choices as a dropdown instead of inline button
row.

Reviewed By: Bastien Montagne
Differential Revision: https://developer.blender.org/D17186
2023-02-03 11:14:53 +02:00
2fb0c20f53 Cleanup: remove unreachable return values 2023-02-03 20:02:25 +11:00
307113d744 GPU: Fix incorrect datatype conversions in test case shaders. 2023-02-03 08:23:55 +01:00
11d8965da1 Cleanup: quiet undeclared variable warning 2023-02-03 15:10:58 +11:00
c468aeefb5 PyAPI: updates to bl_ui.generic_ui_list which didn't follow conventions
- Replace type annotations with doc-strings, the current conventions is
  not to use type annotations in startup scripts.
- Replace abbreviation "idx" with "index" in public arguments/properties.
- Replace `len(..) > 0` with boolean checks.
- Add `__all__` to list public members.
- Use `arg` instead of `param` for doc-strings.
- Locate the doc-string so it shows as `__doc__`.
2023-02-03 15:07:14 +11:00
266d8de687 Cleanup: spelling in comments 2023-02-03 12:41:01 +11:00
27e2b32a06 Cleanup: Remove redundant calls to init grids
`BKE_volume_init_grids` gets called in `volume_init_data` that is run
on creating new datablocks so it's unnecessary to run it separately.
2023-02-03 00:14:19 +01:00
023277359f Fix T104053: Limit pbvh recursion
The recursion depth was checked for equality with a maximum depth,
allowing leaves with more primitives if a certain depth was reached.

However, a single leaf must always use the same material (including
set_smooth), so if a leaf contained multiple materials it was split
anyway. This meant that in the next recursion step the depth was
larger than the cutoff value and it would go back to recursing until
the number of primitives was small enough, ignoring the recursion
depth for the rest of the process.

In certain edge cases this could lead to a stack overflow.

Even with the check changed from 'equality' to 'larger or equal'
this could still fail in the pathological case where every primitive
has another material. But that can't be helped, and it wouldn't
realistically happen either.

Differential Revsision: https://developer.blender.org/D17188
2023-02-03 00:09:26 +01:00
b5e00a1482 Revert "BLI: Use BLI_math_matrix_type.hh instead of BLI_math_float4x4.hh"
This reverts commit 52de84b0db.

had some build issues on windows i can't quickly resolve, revert for
now while we fix the problems
2023-02-02 11:46:23 -07:00
f31ad5d98b Cleanup: refactoring BKE_nlastrips_add_strip method to Null check. Adding Unit tests for method
Reviewed By: Sybren

Differential Revision: https://developer.blender.org/D17155
2023-02-02 13:20:25 -05:00
52de84b0db BLI: Use BLI_math_matrix_type.hh instead of BLI_math_float4x4.hh
Straightforward port. I took the oportunity to remove some C vector
functions (ex: `copy_v2_v2`).

This makes some changes to DRWView to accomodate the alignement
requirements of the float4x4 type.
2023-02-02 18:11:35 +01:00
ca99a59605 Fix T104261: crash when trying to draw viewer overlay for empty curve
`BKE_displist_make_curveTypes` only sets `curve_eval` if the final geometry
of the object actually contains curve data, otherwise it's null.
2023-02-02 17:20:57 +01:00
Iliya Katueshenock
fdc81be213 Fix T104223: attribute not propagated to Set Curve Radius node
This declaration change tells the evaluation system that the radius
field is evaluated on the input geometry. Which in turn means that
attributes referenced by the radius field should be propagated to
the geometry.
2023-02-02 17:04:29 +01:00
fc2a64e21a Win32: Better Error Handling in BLI_file_attributes
After Win32 API call GetFileAttributesW, check for
INVALID_FILE_ATTRIBUTES, which is returned on error,
usually if file not found.

See D17176 for more details.

Differential Revision: https://developer.blender.org/D17176

Reviewed by Ray Molenkamp
2023-02-02 08:01:43 -08:00
7208938707 Fix T104278: incorrect handling of unavailable linked node groups
The declaration of group nodes using unavailable linked groups contains
a `skip_updating_sockets` tag, which indicates that the node shouldn't
change. This information was not used properly further down the line.
2023-02-02 16:38:53 +01:00
2a19810f97 Fix T104296: crash caused by incorrectly initialized group nodes
Versioning code in `do_versions_after_linking_260` inserted new group input
and output nodes. And (reasonably?) expected sockets to exist on those nodes.
However, `nodeAddStaticNode` did not initialize sockets on nodes with that use
`declare_dynamic` yet. This patch changes it so that `declare_dynamic` is used
in more places, which caused issues during file loading when node groups are
updated in somewhat arbitrary order (not in an order that is based on which
groups use which).

Differential Revision: https://developer.blender.org/D17183
2023-02-02 16:38:53 +01:00
04aab7d516 Animation: Add "Select Linked Vertices" to Weight Paint Mode
Adds two operators to select linked  vertices in weight paint mode.
Similar to how it works in edit mode.
Press "L" to select vertices under the cursor,
or CTRL + "L" to select anything linked to the current selection.

Reviewed by: Sybren A. Stüvel, Hans Goudey, Marion Stalke
Differential Revision: https://developer.blender.org/D16848
Ref: D16848
2023-02-02 16:17:17 +01:00
fe5d54d3d0 Gizmo: add new cage2d draw style for circular shapes
`ED_GIZMO_CAGE2D_STYLE_CIRCLE` now draw circles. The previous `ED_GIZMO_CAGE2D_STYLE_CIRCLE`, which drew rectangles, is renamed to `ED_GIZMO_CAGE2D_STYLE_RECTANGLE`. The meaning of `ED_GIZMO_CAGE2D_STYLE_BOX` is now unclear and probably needs to be renamed too.
Ref T104280

Maniphest Tasks: T104280

Differential Revision: https://developer.blender.org/D17174
2023-02-02 16:15:23 +01:00
c2c6707919 Fix T102690: Object can be animated with a single keyframe
Remove some assumptions that an FCurve with a single keyframe always is
extrapolated in constant fashion. There's no reason for the Beziér handles
to be ignored in such a case.

FCurve evaluation already worked properly, it was just the drawing in the
graph editor and the selectability of the handles that needed adjustments.
2023-02-02 16:01:03 +01:00
c1b85103fe GPU: Added testcase for SSBO/Compute.
Test case is a smaller step towards supporting Vulkan. Other
test cases rely on SSBOs as well so it is better to first satisfy
this step before handling the others.
2023-02-02 14:11:43 +01:00
49ad91b5ab Animation: Enable Pin Icon in the Dope Sheet
Even though the dope sheet respects the pinned channels, it did not show the icon to interact with the pinned state.

This adds the icon to the dope sheet as well.

{F14178715}

Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D17061
Ref: D17061
2023-02-02 13:58:13 +01:00
bbc18673f3 Cleanup: make format 2023-02-02 13:16:39 +01:00
d77b87eb93 Fix invalid versioning code for meshes
The issue was introduced in d92edca862: one shall not use
polygon index to index polygon custom data layer.
2023-02-02 13:10:45 +01:00
644d54a193 GPU: Fix memory leak in VkShader.
std::string makes a copy, without taking ownership.
2023-02-02 13:06:10 +01:00
6c66f3e2b3 GPU: Remove prototype without implementation.
`GPUShaderInterface(const ShaderCreateInfo&)` is defined but its
implementation has been removed.
2023-02-02 11:46:29 +01:00
d92edca862 Fix: CustomData layers become unsorted in versioning
n versioning, when converting CD_SCULPT_FACE_SETS to CD_PROP_INT32
the layers were not kept properly ordered by type.

This was discovered while investigating T104053

Differential Revision: D17165
2023-02-02 09:49:25 +01:00
Stephen Seo
e1ee86b63c Change default AV1 encoder for "slowest"
Previously, having the "Encoding speed" set to "slowest" would choose
libaom-av1 first and librav1e second. This change makes Blender choose
librav1e first (and has a fallback to whatever other AV1 codec is
available if librav1e is not installed).

Addresses /T103849 on systems where librav1e codec available.

Reviewed By: sergey, ISS

Maniphest Tasks: T103849

Differential Revision: https://developer.blender.org/D17002
2023-02-02 09:33:25 +01:00
199233eee1 Cleanup: Change VMA from CRLF to LF.
To match the rest of our repository.
2023-02-02 08:23:54 +01:00
5d9971bc63 Vulkan: Fix compilation warning in VMA. 2023-02-02 08:19:17 +01:00
99520a79b6 Viewport Compositor: Platform support message.
In the previous situation the message was shown for Apple devices.
But this is not correct and confusing.

- Apple with Metal backend are supported, OpenGL on Apple isn't
- Legacy devices on Windows or Linux are also not supported.

This change will check that the capabilities of the GPU match the
requirements to use Viewport compositor. Based on those capabilities
a message is shown and the panel is activated.
2023-02-02 07:43:18 +01:00
e2c3bff78b Spelling: 'GPU Back end' -> 'GPU Backend'. 2023-02-02 07:35:32 +01:00
96ea0dd458 Cleanup: spelling in comments 2023-02-02 14:00:32 +11:00
31a6400279 Fix error referencing transition context which doesn't exist
Regression in [0] added reference to constraint which doesn't exist.

[0]: db87e2a638
2023-02-02 13:33:08 +11:00
d82ffb9787 Fix T103530: Allow Recursive File Lists
In File Browser, correct full path creation so that it works correctly
in recursive listings.

See D17175 for more details.

Differential Revision: https://developer.blender.org/D17175

Reviewed by Julian Eisel
2023-02-01 14:06:38 -08:00
a4868f2058 Fix T104136: Crash with Mesh to Volume & Simplify
Blender crashes when Simplify is enabled and set to 0.
This is because mesh_to_volume_grid returns nullptr at voxel size 0.
A simple null-check fixes this problem.

Differential Revision: https://developer.blender.org/D17122
2023-02-01 20:50:39 +01:00
9b86741ae7 FIx possible return of string without null character
`WM_modalkeymap_items_to_string` is expected to always return a string.

But in the special case of zero length, the returned string was not
terminated with `'\0'`.

This can cause problems with the header of the knife tool for example.
It always uses the returned string.

(This issue was observed when investigating T103804).
2023-02-01 15:29:21 -03:00
c105c49407 Cleanup: rename places where 3D cage gizmo uses 2D cage enums 2023-02-01 17:28:35 +01:00
19fe5caf87 Geometry Nodes: add support for eye dropper for object input in modifier
Differential Revision: https://developer.blender.org/D17108
2023-02-01 12:53:57 +01:00
5a97b282e9 Fix (unreported) crash when importing data with custom normals.
The usage of `index` was inverted between destination array of vectors
and source vector in rB3a3d9488a1633.
2023-02-01 10:53:53 +01:00
e515f81318 Cleanup: move swapping objects in a collection to a utility function
Including this logic inline complicated D17016 & was noted as a TODO.
2023-02-01 14:41:40 +11:00
345dded146 Revert "Geometry Nodes: hide group inputs with "Hide Value" enabled in modifier"
This reverts commit 11a9578a19.

Reverting this because there was a miscommunication between Simon and me. Shortly
before I committed the change, Simon noticed that there are cases when "Hide Value"
is checked to hide the value in a group node, but we still want to show the value
in the modifier.
2023-01-31 19:28:22 +01:00
3750e7ef0b Sculpt: Don't invert in geodesic mask expand keymap 2023-01-31 10:23:07 -08:00
0a9520ce84 Sculpt: Un-invert expand normal falloff
Normals falloff mask is created inverted compared
to how other falloffs work.  It's now un-inverted
in a final step at the end.

Also exposed the normals falloff smooth steps as
an RNA property.
2023-01-31 10:23:07 -08:00
11a9578a19 Geometry Nodes: hide group inputs with "Hide Value" enabled in modifier
When the "Hide Value" option of a group input is enabled, only its name
is displayed in group nodes. Modifiers should have the same behavior.
However, for modifiers, only showing the name does not make sense
when the user can't edit the value. Therefore the value is not shown at all.
2023-01-31 18:55:06 +01:00
2995165148 Cleanup: simplify wrapping CurvesGeometry in C++ 2023-01-31 18:45:55 +01:00
fa9fc59b56 Fix T104240: OptiX OSL texture loading broken with displacement
The image manager used to handle OSL textures on the GPU by
default loads images after displacement is evaluated. This is a
problem when the displacement shader uses any textures, hence
why the geometry manager already makes the image manager
load any images used in the displacement shader graph early
(`GeometryManager::device_update_displacement_images`).
This only handled Cycles image nodes however, not OSL nodes, so
if any `texture` calls were made in OSL those would be missed and
therefore crash when accessed on the GPU. Unfortunately it is not
simple to determine which textures referenced by OSL are needed
for displacement, so the solution for now is to simply load all of
them early if true displacement is used.
This patch also fixes the result of the displacement shader not
being used properly in OptiX.

Maniphest Tasks: T104240

Differential Revision: https://developer.blender.org/D17162
2023-01-31 16:41:00 +01:00
ce42906b89 Fix wrong spot light blend circle radius in viewport
Was using squared cosine instead of cosine. The problem is obvious for large spread angles.
|before|after
|{F14220946}|{F14220948}
The images are generating by returning `floor(spotmask)` in function `spot_attenuation()` in `lights_lib.glsl`.
2023-01-31 16:26:51 +01:00
1cb04af1df BLI: Fix missing const correctness
This fix incorrect rBa837604d44a0 commit
2023-01-31 15:36:24 +01:00
6de43f6f04 BLI: Fix invalid swizzle commit
This fix incorrect rBa837604d44a0 commit
2023-01-31 14:53:54 +01:00
a837604d44 BLI: Math: Add xy() and xyz() swizzle functions
Thoses are added for better component masking syntax.
This avoids the harder to read `float2(my_vec4)`.

This is not meant to be fully usable swizzling support but could be
extended in the future.
2023-01-31 14:48:19 +01:00
59b7aec9a2 VSE: Add missing NULL check in poll
Commit 90e9406866 forgot to add a NULL check for `ed`
2023-01-31 14:43:17 +01:00
5021f10e65 GPencil: Group similar UI parameters in Offset modifier 2023-01-31 13:13:00 +01:00
c1e5359eba GPencil: Use BLI_math_matrix_type.hh instead of BLI_math_float4x4.hh
Straightforward port. I took the oportunity to remove some C vector
functions (ex: `copy_v2_v2()`).

Reviewed By: antoniov

Differential Revision: https://developer.blender.org/D17160
2023-01-31 12:53:44 +01:00
a71ae981c7 Fix T104243: GPencil Cutter flattening caps on both sides 2023-01-31 11:18:13 +01:00
Jason Fielder
a504058dee Metal: Optimize GLSL to MSL translation. Improve cached compilation
Reduces the GLSL to MSL translation stage of shader compilation from 120 ms to 5 ms for complex EEVEE materials. This manifests in faster overall compilations, and faster cache hits for secondary compilations, as the MSL variant is needed as a key.

Startup time is also improved for both first-run and second-run. Note that this change does not affect shader compilation times within the Metal API.

Also disables shader output to disk

Authored by Apple: Michael Parkin-White

Ref T96261
Depends on D16990

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17033
2023-01-31 11:06:31 +01:00
Jason Fielder
f3bd5458a3 Metal: Optimise shader texture cache usage and branch reduction via point sampling.
Replace texelFetch calls with a texture point-sample rather than a textureRead call. This increases texture cache utilisation when mixing between sampled calls and reads. Bounds checking can also be removed from these functions, reducing instruction count and branch divergence, as the sampler routine handles range clamping.

Authored by Apple: Michael Parkin-White
Ref T96261

Depends on D16923

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17021
2023-01-31 10:56:25 +01:00
9f866a92dc Fix T104176: Smooth modifier with vergex group not work for negative factors.
Regression from rBabc8d6b12ce9, over three years ago!

Should be backported to the active LTS releases too.
2023-01-31 09:55:45 +01:00
3d765f6527 GPU: Fix gpu_math_test.glsl test case.
Test results were generated from incorrect code.
Code was fixed, but test results weren't updated.
This patch updates the test results to match the implementation.
- `projection_perspective.w.w == 0.0`
2023-01-31 09:51:08 +01:00
59aefaf3d0 Vulkan: Fix assert when compiling transform feedback shaders.
Transform feedback shaders don't have a fragment shader and should not
fail when it is not given.
2023-01-31 09:50:50 +01:00
d44d165e8a Vulkan: Fix compilation on Linux.
ShaderC was missing and is required for compilation.
2023-01-31 09:23:21 +01:00
d8cd8a0852 Cleanup: remove compilation warning
Unused parameter in vk_backend.
2023-01-31 09:10:18 +01:00
b7e178cb7d GPU: Cross test OpenGL tests to Vulkan.
Enhanced the GPU_TEST macro to also handle Vulkan backend
when WITH_VULKAN_BACKEND compilation option has been
enabled.
2023-01-31 08:48:52 +01:00
6b8fa899ca Metal: Fix compilation of GLSL used in test cases.
Added imageStore for 1d textures.
2023-01-31 08:42:33 +01:00
57efef2635 Fix generating geometry icons for meshes without vertex colors 2023-01-31 16:46:40 +11:00
91263a8b8b Sculpt: Fix T104040: Always update eevee shadows in sculpt modes 2023-01-30 21:29:11 -08:00
79c82fc1c5 Cleanup: trailing space 2023-01-31 15:49:04 +11:00
b18b53eae0 CMake: add missing headers 2023-01-31 14:22:26 +11:00
4f1800d70a Docs: note that delimiting by winding could be supported
Some users requested this behavior since it was removed,
so note that it could be supported again.
2023-01-31 14:22:25 +11:00
6c8c8c20c7 Cleanup: quiet mypy type checking warnings 2023-01-31 14:22:24 +11:00
27b4916b1a Cleanup: spelling in comments
Also minor changes in comments:
- Reference BLENDER_HISTORY_FILE instead of the literal file-name
  (simplifies looking up usage).
- Use usernames in tags, as noted in code-style.
2023-01-31 14:22:23 +11:00
ea8fd343eb Gitea: add merge message templates
To add the pull request # at the end of the commit.
2023-01-30 23:48:41 +01:00
f359a39d11 Fix T100028: Convert USD camera properties to mm from USD units.
Authored by Sonny Campbell.

Currently when importing a USD file, some of the camera properties are
ignored, or the units are not converted correctly from USD world units.
On import we currently set the focal length, but not the camera sensor
size (horizontal and vertical aperture), so the camera field of view
is wrong. The sensor size information is in the USD file, but is ignored
for perspective cameras.

USD uses "tenth of a world unit" scale for some physical camera properties
like focal length and aperture.
https://graphics.pixar.com/usd/release/api/class_usd_geom_camera.html#UsdGeom_CameraUnits

I have added the UsdStage's metersPerUnit parameter to the ImportSettings
so the camera can do the required conversion on import. This will convert from
the USD file's world units to millimeters for Blender's camera settings.

Reviewed by: Sybren and makowalski.

Differential Revision: https://developer.blender.org/D16019
2023-01-30 17:22:26 -05:00
129093fbce Cycles: Fix crash when rendering with OSL on multiple GPUs
The `MultiDevice` implementation of `get_cpu_osl_memory` returns a
nullptr when there is no CPU device in the mix. As such access to that
crashed in `update_osl_globals`. But that only updates maps that are not
currently used on the GPU anyway, so can just skip that when the CPU
is not used for rendering.

Maniphest Tasks: T104216
2023-01-30 19:40:22 +01:00
87a923fdb6 GPU: Add SSBO binding test to new structure.
This test was added to test the shader info structure binding
information for SSBOs. It still used the legacy GLSL structure.
2023-01-30 19:07:33 +01:00
f4deed288b USD export: style fixes to previous commit.
Changed to C-style comments and added braces.
2023-01-30 11:57:28 -05:00
c79b55fc05 USD export: add scale and bias for normal maps.
Changes authored by Michael B Johnson (drwave).

This addresses the issue in T102911.

Add scale and bias inputs to ensure the normals are
in tangent space [(-1,-1,-1), (1,1,1)].

This is following the convention as set in the USD Spec
(https://graphics.pixar.com/usd/dev/spec_usdpreviewsurface.html).

Differential Revision: https://developer.blender.org/D17072
2023-01-30 11:32:03 -05:00
ad083f925c GPencil: Rename init_time to time_start 2023-01-30 16:27:00 +01:00
ce13d0d326 GPU: Only compile test shaders when test cases option is enabled.
The glsl files + create infos of shaders that are only used
during development where still being compiled into blender.

This isn't needed and shouldn't be included. This change will
only include them when WITH_GTEST and WITH_OPENGL_DRAW_TESTS are
enabled. All other cases those files will be skipped.
2023-01-30 15:46:12 +01:00
1a50f814e6 Cleanup: Remove unused variable, use switch, and C++ casting. 2023-01-30 15:41:01 +01:00
Jason Fielder
aca9c131fc Metal: Fix issue with premature safe buffer list flush and optimize memory manager overhead.
Resolve an issue where released buffers were returned to the reusable memory pool before GPU work associated with these buffers had been encoded. Usually release of memory pools is dependent on successful completion of GPU work via command buffer callbacks. However, if the pool refresh operation occurs between encoding of work and submission, buffer ref-count is prematurely decremented.

Patch also ensures safe buffer free lists are only flushed once a set number of buffers have been used. This reduces overhead of small and frequent flushes, without raising the memory ceiling significantly.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17118
2023-01-30 14:54:39 +01:00
d4d4efd3d3 GPU: Use create info for compute test cases.
Compute test case still used legacy API to construct
GLSL shaders. This change will migrate it to use the
GPUShaderCreateInfo's.

In preparation to run test-cases against non-opengl
back-ends.
2023-01-30 14:45:07 +01:00
62dd0855a9 Cleanup: Remove special handling of 3DCursor in undo code.
Such preserve-across-undo data handling is now done through the IDType
callbacks, see e.g. `scene_undo_preserve` for the 3DCursor case.
2023-01-30 14:40:59 +01:00
Jason Fielder
596ee79a9f Metal: Optimize shader local memory usage.
Due to shader global scope emulation via class interface, global constant arrays in shaders are allocated in per-thread shader local memory. To reduce memory pressure, placing these constant arrays inside function scope will ensure they only reside within device constant memory. This results in a tangible 1.5-2x performance uplift for the specific shaders affected.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17089
2023-01-30 13:53:00 +01:00
dea924a91f GPU: Fix incorrectly commited test compilation of all shaders 2023-01-30 12:30:21 +01:00
0da74d3ee9 GPU: Fix GLSL compilation on OpenGL backend.
Regression in {d0f55aa671f7}
2023-01-30 12:23:02 +01:00
411345757c Cleanup: Remove unused variable.
Introduced in recent commit.
2023-01-30 12:04:44 +01:00
a36c1cabce Vulkan: Changes to CMake config.
Paths to vulkan libraries, paths and related components were
hardcoded in the platform cmake file. This patch separates
this by using adding CMake modules for Vulkan and ShaderC.

This change has only been applied to the macOs configuration as
that is currently our main platform for development. Other platforms
will be added during the development of the Vulkan back-end.
2023-01-30 12:04:44 +01:00
084dd110c9 Build: Remove unused BLENDER_GL_LIBRARIES.
This CMAKE variable isn't used.
2023-01-30 12:04:44 +01:00
Jason Fielder
6dde185dc4 Metal: Fix edge-case with point primitive restart index removal where all indices are restarts.
Metal backend does not support primtiive restart for point primtiives. Hence strip_restart_indices removes restart indices by swapping them to the end of the index buffer and reducing the length.
An edge-case existed where all indices within the index buffer were restarts and no valid swap-index would be found, resulting in a buffer underflow.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17088
2023-01-30 11:26:38 +01:00
Jason Fielder
57552f52b2 Metal: Realtime compositor enablement with addition of GPU Compute.
This patch adds support for compilation and execution of GLSL compute shaders. This, along with a few systematic changes and fixes, enable realtime compositor functionality with the Metal backend on macOS. A number of GLSL source modifications have been made to add the required level of type explicitness, allowing all compilations to succeed.

GLSL Compute shader compilation follows a similar path to Vertex/Fragment translation, with added support for shader atomics, shared memory blocks and barriers.

Texture flags have also been updated to ensure correct read/write specification for textures used within the compositor pipeline. GPU command submission changes have also been made in the high level path, when Metal is used, to address command buffer time-outs caused by certain expensive compute shaders.

Authored by Apple: Michael Parkin-White

Ref T96261
Ref T99210

Reviewed By: fclem

Maniphest Tasks: T99210, T96261

Differential Revision: https://developer.blender.org/D16990
2023-01-30 11:06:56 +01:00
d0f55aa671 Vulkan: Fix GLSL compilation errors.
Recent changes in our GLSL libraries didn't compile on Vulkan. This
change reverts a compile directive that was removed, but required
in order to compile using the Vulkan backend.
2023-01-30 10:55:14 +01:00
2ff08d6d9c Cleanup: Pass explicit type to MEM_cnew.
Better avoid fancy implicit typing in this template, and be clear about
what type is actually being allocated.
2023-01-30 10:41:42 +01:00
be8778355a Cleanup: Unused parameters and variables. 2023-01-30 09:45:42 +01:00
3649c05f57 Cleanup: Run make format on codebase. 2023-01-30 09:40:17 +01:00
Damien Picard
db87e2a638 I18n: extract and disambiguate a few messages
Extract:
- EEVEE: Compiling Shaders (the same message exists in EEVEE Next, but
  it uses string concatenation and I don't know yet how to deal with
  those--see T92758)

Disambiguate:
- Pan (audio, camera)
- Box (TextSequence)
- Mix (noun in constraints, GP materials)
- Volume (object type, file system)
- Floor (math integer part, 3D viewport horizontal plane)
  - Impossible to disambiguate the constraint name because
    bConstraintTypeInfo doesn't have a context field.
- Show Overlay (in the sequence editor, use the same message as other
  editors to avoid a confusion with the Frame Overlay feature, also
  called "Show Overlay")

Additionally, fix a few issues reported by Joan Pujolar (@jpujolar)
in T101830.

Reviewed By: mont29

Differential Revision: https://developer.blender.org/D17114
2023-01-30 09:38:57 +01:00
Damien Picard
75c772391d I18n: construct report verbosely when moving objects to collection
When moving or linking an object to a collection, the report was not
properly translatable. In French for instance, it would give
nonsensical half-translated sentences such as "<Object> moved vers
<collection>", instead of "<Object> déplacé vers <collection>".

Instead, separate the report into the four possible translations (one
or multiple objects, linking or moving). This is very verbose and less
legible, but it ensure the sentences can be properly translated,
including plurals in languages which use grammatical agreement.

In addition, use BKE_collection_ui_name_get() to get the collection
name, because the Scene Collection's name is hardcoded, but it can be
localized.

Reviewed By: mont29

Differential Revision: https://developer.blender.org/D17112
2023-01-30 09:31:38 +01:00
90e9406866 VSE: Add Update scene frame range operator
This operator updates scene strip internal length to reflect target
scene length. Previously scene strip had to be deleted and added from
scratch. Scene strip length in timeline will not be changed.
2023-01-30 07:56:53 +01:00
ad146bd17a Fix T103852: Muting timeline channel does not update image
Add RNA update function to invalidate cache for all strips in channel.
2023-01-30 06:48:01 +01:00
11de4aa0ce Update RNA to User manual mappings 2023-01-29 19:00:47 -05:00
042775ad48 Sculpt: Fix T104068, depth calculation error in trim tools
Also made the coplanar padding factor relative.
2023-01-28 16:10:02 -08:00
e497da5fda Fix: off by one error in previous commit
Fixes rB90253ad2e753acde161b38d82bd650d54d3f6581.
2023-01-29 00:13:37 +01:00
52ed8bcb27 Gitea: fix pull request template so commit body can be set as description 2023-01-28 18:11:51 +01:00
90253ad2e7 Geometry Nodes: avoid creating a lazy function many times
It's better to use some statically allocated functions instead
of dynamically allocating them all the time.
2023-01-28 15:28:55 +01:00
b2534fb866 Fix: anonymous attribute output requested even though it's not used
The code removed here was intended to be an optimization that
avoids creating an additional node to join multiple attribute sets.
However, that optimization did not work, because it did not take
into account whether the single attribute set is required or not.
2023-01-28 14:55:39 +01:00
904357d67a Fix: assert when converting between incompatible field types
This results in a compile time error now which hopefully prevents
this specific kind of mistake in the future.
2023-01-28 14:52:15 +01:00
89aae4ac82 Node Editor: Controlled node link swapping
Allow to explicitly swap node links by pressing the alt-key while
reconnecting node links. This replaces the old auto-swapping based on
matching prefixes in socket names.

The new behavior works as follows:

* By default plugging links into already occupied (single input)
  sockets will connect the dragged link and remove the existing one.
* Pressing the alt-key while dragging an existing node link from one
  socket to another socket that is already connected will swap the
  links' destinations.
* Pressing the alt-key while dragging a new node link into an already
  linked socket will try to reconnect the existing links into another
  socket of the same type and remove the links, if no matching socket
  is found on the node. This is similar to the old auto-swapping.

Swapping links from or to multi input sockets is not supported.

This commit also makes the link drag tooltip better visible, when using
light themes by using the text theme color.

Reviewed By: Hans Goudey, Simon Thommes

Differential Revision: https://developer.blender.org/D16244
2023-01-28 10:07:29 +01:00
fe5c3a0ab3 GNUmakefile: add convenience target 'check_wiki_file_structure'
This target ensures https://wiki.blender.org/wiki/Source/File_Structure
follows Blender's source tree.
2023-01-28 16:41:12 +11:00
4e9c6929c1 VSE: Handle drivers when duplicating strips
Most operations where strips are duplicated use `SEQ_animation` API,
which handles keyframe animation. Now drivers are handled as well.

When group of strips is duplicated and driver references other strip,
it will still reference original strip. However, this is much better,
than previous behavior, when strip duplication results in "transfer" of
driver from original strip to duplicated one.

Fixes T104141
2023-01-28 03:24:44 +01:00
3a9e589142 Fix (Unreported): VSE side panel flickering when tweaking offset value
Panel was split by factor calculated from property value string length.
Since these properties have float type now, calculated length was
incorrect.
2023-01-28 00:25:56 +01:00
d7e914270f Fix (unreported): Pasted strip is not active
Broken by renaming strip before comparing name to reference.
2023-01-28 00:08:07 +01:00
cef03c867b UV: cleanup winding
Simplify `BM_uv_element_map_create` by using `BM_face_calc_area_uv_signed`.

Remove unused UV winding code in `BM_uv_vert_map_create`.

Fixes unlikely memory leak in `BKE_mesh_uv_vert_map_create`.

No functional changes.

Differential Revision: https://developer.blender.org/D17137
2023-01-28 11:03:45 +13:00
8336de03a6 Cleanup: VSE: use context->for_render instead of G.is_rendering 2023-01-27 22:50:38 +01:00
9facc5067a Cleanup: Simplify mesh and point cloud conversion
Since mesh positions are stored as a generic attribute,
the attribute doesn't need special handling here.
2023-01-27 14:34:10 -06:00
Colin Basnett
328772f2d9 Mesh: Add operator to flip quad tessellation
This adds a new operator: bpy.ops.mesh.flip_quad_tessellation()

This operator rotates the internal loops of the selected quads, allowing
the user to control tessellation without destructively altering the
mesh.

{F14201995}

This operator can be found in the "Face" menu (Ctrl+F) under "Face
Data".

{F14201997}

Reviewed By: campbellbarton, dbystedt

Differential Revision: https://developer.blender.org/D17056
2023-01-27 11:02:55 -08:00
f5e76aa39e Cleanup: Array types, const, math API in workbench code
Some miscellaneous cleanups left over from a fix/cleanup combo:
- Use const variables
- Use the C++ `math` namespace functions
- Use `std::array` for arrays with size known at compile time
- Use `MutableSpan` instead of reference to array

Differential Revision: https://developer.blender.org/D17094
2023-01-27 11:35:08 -06:00
0050d6d399 Cleanup: move function to file where it is used
`drawLine` is only used for constraint, so it should be in
`transform_constraints.c`
2023-01-27 14:10:43 -03:00
8343e841fd Cleanup: Quiet unused variable warning in non-debug builds 2023-01-27 09:59:59 -06:00
179605bd2d Fix T104168: No active UV when reading auto-save files
Similar to 6d12d43a05, we should skip the
legacy to generic conversion if there's nothing to convert.
2023-01-27 09:59:59 -06:00
000e722c7d Geometry Nodes: Optimize start point case of Points of Curve node
In the node groups for T103730, the "Points of Curve" node is often used to
retrieve the root point of every curve. Since the curve point offsets array
already contains that data directly, we can detect this as a special case and
avoid all the other work.

Differential Revision: https://developer.blender.org/D17128
2023-01-27 09:59:59 -06:00
e99ae0a75d Vulkan: Tweaks to CMake configuration.
MoltenVK wasn't found as it was previous part of lib/vulkan.
as lib/vulkan now doesn't contain
the full sdk, we will use a moltenvk folder.

At this moment the moltenvk folder isn't filled, but will eventually be.
2023-01-27 16:58:14 +01:00
b67b84bd5d Fix T103984: USD exports pass usdchecker
These changes were authored by Michael B Johnson (drwave).

The default Blender USD export currently produces files that trigger
errors in the usdchecker that ships with USD 22.11.

The changes are:

- Set the defaultPrim if no defaultPrim is set. This sets it to the
first prim in the hierarchy which matches the behaviour of Pixar's
referencing (where referencing a USD layer without a defaultPrim will
pick the first prim) as well as matches the logic in Pixar's Maya USD
exporter code.

- Applies the MaterialBindingAPI to prims with material binding
attributes. This is a relatively new requirement for USD as it will
help for efficiency with upcoming changes to Hydra.

- Removes the preview scope in the USD shader hierarchy, because it
is no longer valid for shaders to have any non-container ancestors in
their hierarchy up until the enclosing Material prim.

Reviewed by: Michael Kowalski

Differential Revision: https://developer.blender.org/D17041
2023-01-27 10:29:58 -05:00
4635dd6aed Fix T104157: Deleting an active OSL node causes issues
Removing all OSL script nodes from the shader graph would cause that
graph to no longer report it using `KERNEL_FEATURE_SHADER_RAYTRACE`
via `ShaderManager::get_graph_kernel_features`, but the shader object
itself still would have the `has_surface_raytrace` field set.
This caused kernels to be reloaded without shader raytracing support, but
later the `DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE`
kernel would still be invoked since the shader continued to report it
requiring that through the `SD_HAS_RAYTRACE` flag set because of
`has_surface_raytrace`.
Fix that by ensuring `has_surface_raytrace` is reset on every shader update,
so that when all OSL script nodes are deleted it is set to false, and only
stays true when there are still OSL script nodes (or other nodes using it).

Maniphest Tasks: T104157

Differential Revision: https://developer.blender.org/D17140
2023-01-27 16:14:25 +01:00
2590de913d Cleanup: Silence compilation warning (unused parameter). 2023-01-27 15:50:52 +01:00
46c68c46a5 3D Texturing: Adding more cases where seams can be fixed.
Case added where a corner in uv space share the same edge in 3d space.
This used to work, but was lost when implementing
the new approach.
2023-01-27 15:49:10 +01:00
3d7697b325 Tweak to previous commit: move checks on DNA deprecated data at the end of readfile code.
BKE blendfile should not be allowed to deal with DNA deprectaed data, so
move recent check in rB138b3815e528 into BLO readfile, in a new
`blo_read_file_checks` util that is being called at the very end of main
readfile code (`blo_read_file_internal` and `library_link_end`).
2023-01-27 15:32:44 +01:00
138b3815e5 Fix (unreported) missing clear of deprecated Window data on fileread.
rB7f564d74f9ed (6 years ago!) forgot to clear the deprecated
`Window->screen` pointer on file read for recent-enough .blend files.

This is required since a valid value is always written in .blend files
for that pointer, to ensure backward compatibility.

The issue was never detected so far because that pointer is explicitely
reset to NULL after filewrite, which includes any memfile undostep
write, and usually existing UI data is re-used instead of loading the
one from the .blend file, so thedden  assert in `blo_lib_link_restore`
would never be triggered.

Now moved the assert at the end of `setup_app_data` to ensure it always
get checked.
2023-01-27 15:14:55 +01:00
1fd1d24265 Revert accidental changes to sub-modules
When committing via VSCode's Git UI,
my commit 073cf46b2e
seems to have affected sub-modules.
Gotta be more careful!
2023-01-27 15:05:44 +01:00
073cf46b2e Fix Generic List move ops clickable with 1 element
Meant to include this piece of feedback from Sybren in D14119.
2023-01-27 14:56:53 +01:00
75d6228583 PyAPI: Generic UIList for CollectionProperties
This patch adds a draw_ui_list() function, which is a wrapper around
layout.template_list(). It implements generic add/remove/move buttons,
passing the correct "row" integer to template_list(), as well as a
drop-down menu, if provided, making it a complete solution for
consistent UILists for addons.

Differential Revision: https://developer.blender.org/D14119
2023-01-27 14:51:13 +01:00
Iliya Katueshenock
454057f9df Fix T104175: adding Blur Attribute node with link drag search fails
The node does not support blurring booleans, but that was not handled
property in link drag search.

Differential Revision: https://developer.blender.org/D17139
2023-01-27 14:48:29 +01:00
79f70e48eb Fix: crash in mesh topology nodes
This was broken in rBd6c9cd445cb41480b40.
2023-01-27 12:40:08 +01:00
d5b026a16c Fix incorrect RNA path for GPencil brush settings, and add it for Curves brush settings.
RNA paths should be relative to their owner ID, not to some other ID!
2023-01-27 11:11:18 +01:00
0cce653892 Cleanup: Pass double2 by reference.
Just to be sure if compilers aren't inlining the function that at
least the parameter isn't copied.
2023-01-27 10:25:39 +01:00
71d7c919c0 ImBuf: Limit transform region to pixels that are affected by the transformation.
Only the area where the source buffer will be drawn on the destination buffer
will be evaluated. This will improve performance in sequencer and image editors as
less calcuations needs to happen.
2023-01-27 09:56:19 +01:00
b3fd169259 ImBuf: Precalc subsamples to reduce branching.
Micro improvement to store delta uv per subsample. This reduces
branching that might happen on the CPU, but also makes it possible
to add other sub-sampling filters as well.

No changes for the end-user.
2023-01-27 08:39:26 +01:00
2b4bafeac6 UV: add "similar object" and "similar winding" to uv "select similar"
Adds new options to UV Face selection in the UV Editor, with UV > Select > Select Similar

In multi object edit mode, "Similar Object" selects faces which have the same object.

"Similar Winding" will select faces which have the same winding, i.e. are they
facing upwards or downwards.

Resolves: T103975
Differential Revision: https://developer.blender.org/D17125
2023-01-27 17:58:07 +13:00
6c8db7c22b Fix T103868: render uv transform gizmo even if it has zero area
Change the 2D Gizmo drawing function to provide a usable transform matrix,
if it would otherwise be degenerate.

Differential Revision: https://developer.blender.org/D17113
2023-01-27 17:06:14 +13:00
34a6591a07 Fix T98594: missing uv editor redraw with geometry nodes modifier
If an object has a geometry nodes modifier, the UVs on that object might change
in response to any change on any other object.

Now we will redraw the UV editor on any object change, not just the active object.

Differential Revision: https://developer.blender.org/D17124
2023-01-27 16:53:14 +13:00
d76004f48f Sculpt: Fix sculpt expand not switching falloff types properly 2023-01-26 17:54:10 -08:00
742c2e46bb Cleanup: format 2023-01-27 14:45:37 +13:00
e735bf02cb Fix linux/mac compiler warning. 2023-01-26 20:16:07 -05:00
647cffc001 Sculpt: Add numpad aliases for number keymap entries in expand modal map 2023-01-26 16:57:28 -08:00
Aaron Carlisle
3e90390918 Sculpt: Resolve Shift R shortcut conflicts
Based on T99607:
- Existing Angle Control shortcuts are removed
- Voxel, Dyntopo and Hair resolution shortcuts are remapped to `R`

Since voxel remeshing is not compatible with dyntopo, each can use the shortcut `R` for the remeshing resolution without causing a conflict.
The shortcut `R` is not currently used for anything important.
The angle control menu is commonly not used.
And sculpt mode is only coincidentally inheriting the rotate operator shortcut on `R` because nothing else is mapped to the key.

Reviewed By: Julien Kaspar and Hans Goudey and Joseph Eagar
Differential Revision: https://developer.blender.org/D16511
Ref D16511
2023-01-26 16:55:03 -08:00
cdef135f6f USD import: Support importing USDZ.
This addressed feature request T99811.

Added the following features to fully support importing USDZ archives:

- Added .usdz to the list of supported extensions.
- Added new USD import options to copy textures from USDZ archives. The
textures may be imported as packed data (the default) or to a directory
on disk.
- Extended the USD material import logic to handle package-relative texture
assets paths by invoking the USD asset resolver to copy the textures from
the USDZ archive to a directory on disk. When importing in Packed mode,
the textures are first saved to Blender's temporary session directory
prior to packing.

The new USD import options are

- Import Textures: Behavior when importing textures from a USDZ archive
- Textures Directory: Path to the directory where imported textures will
be copied
- File Name Collision: Behavior when the name of an imported texture file
conflicts with an existing file

Import Textures menu options:

- None: Don't import textures
- Packed: Import textures as packed data (the default)
- Copy: Copy files to Textures Directory

File Name Collision menu options:

- Use Existing: If a file with the same name already exists, use that
instead of copying (the default)
- Overwrite: Overwrite existing files

Reviewed by: Bastien

Differential Revision: https://developer.blender.org/D17074
2023-01-26 18:08:45 -05:00
Iliya Katueshenock
9a4c54e8b0 Fix: Curve to Points node has wrong field interface status
In 7536abbe16 changes make possible to input field as Count field.
But changes of declaration probably was forgotten. So now this input
can take field and node will be work. But input link was red. This
patch resolves this issue.

Differential Revision: https://developer.blender.org/D17131
2023-01-26 13:01:03 -06:00
d6c9cd445c Geometry Nodes: Skip sorting in topology nodes if possible
When the sort weights are a single value, they have no effect,
so sorting the relevant indices for the element will be wasted work.
The sorting is expensive compared to the rest of the node. In my
simple test of the points of curve node, it became 6 times faster
when the weights are a single value.
2023-01-26 12:34:28 -06:00
Germano Cavalcante
3b4486424a Fix repeated transform constraint orientations
On some occasions, as in cases where transform operations are triggered
via gizmos, the constrain orientations that can be toggled with
multiple clicks of X, Y or Z were repeated.

There is no use in maintaining repeated orientations.
2023-01-26 14:54:48 -03:00
f19f50d288 USD export test format fixes. 2023-01-26 10:35:14 -05:00
1fd54204b0 Enable commented out code.
Code was disabled for debugging purposes.
2023-01-26 15:17:37 +01:00
c412c38f0d Fix missing code-paths in previous sequence/imbuf commits. 2023-01-26 15:15:21 +01:00
3b17d6c619 Sequencer: Made subsampling a transform option.
There are cases where automatic selection of subsampling doesn't work
This patch move adds a filtering option that
can enable this.
2023-01-26 15:03:19 +01:00
f210842a72 Sequencer: Improve Image Transform Quality When Exporting.
Image Transform use linear or nearest sampling during editing and exporting.
This gets sampling is fine for images that aren't scaled. When sequencing
however you mostly want use some sort of scaling, that leads to poorer
quality.

This change will use sub-sampling to improve the quality. This is only
enabled when rendering. During editing the subsampling is disabled to
keep the user interface reacting as expected.

Another improvement is that image transform is stopped at the moment
it hadn't sampled at least 4 samples for a scan line. In that case we
expect that there are no further samples that would change to result.

In a future patch this could be replaced by a ray/bounding bo intersection
as that would remove some unneeded loops in both the single sampled and
sub sampled approach.
2023-01-26 14:25:49 +01:00
6d79bc0073 ImBuf: Use vector types in transform.cc.
Replace array types with the vector types. No functional changes.
2023-01-26 14:25:49 +01:00
f7dd7d5454 Python API: add a method for reordering modifiers.
Add an `object.modifiers.move()` method, similar to the one
for constraints and some other collections. Currently reordering
modifiers requires using operators, which depend on context.

The implementation is straightforward, except for the need to
make the severity of errors reported by the underlying editor
code into a parameter, so that the new Python API function
reports any problems as Python exceptions, and refactoring
the code to allow aborting a blocked move before making any
changes. I also turn the negative index condition from an assert
into an error.

Differential Revision: https://developer.blender.org/D16966
2023-01-26 14:56:58 +02:00
4a768a7857 Cleanup: remove unused variable 2023-01-26 13:31:41 +01:00
a42f307915 Shader Nodes: Use layers from evaluated mesh
The list was populated from the base (unevaluated) object, but now that
Geometry nodes can generate various layers this is impractical..

Differential Revision: https://developer.blender.org/D17093
2023-01-26 10:31:13 +01:00
ef39d85d7c Fix T104121: Grease pencil clear function in python not updated in viewport
The clear functions for grease pencil data/frame/layer
was not followed by immediate update in the viewport.
Some notifiers were missing in the rna definition of these functions,
which are added in by this patch.

Reviewed By: Antonio Vazquez

Differential Revision: https://developer.blender.org/D17120
2023-01-26 09:02:45 +01:00
564673b8ea Fix T71137: curve minimum twist producing wrong geometry
Only one point should be used to create a reference rotation for other
points to follow. Using two caused the resulting twist to be
asymmetric, especially noticeable on symmetrical, cyclic curves.

An update to [0] which broke curve_to_mesh & deform_modifiers tests,
now this change only applies to cyclic curves as the final result was
much greater for non-cyclic curves because of a difference between how
end-point directions are calculated (see code-comments for details).

Alternate fix to D11886 which caused T101843.

[0]: 36a82314a0.
2023-01-26 17:38:20 +11:00
c1beaea80f Fix T103587: Redo panel doesn't appear for spin operator
Regression in [0] which cleared the redo-panel if an operator added
its own undo step. This worked for sculpt to fix T101743, but caused
the redo-panel to be cleared for actions who's undo steps where created
by nested operators (which is the case for the spin operator).

Fix by checking an undo-step is added without registering an operator.

[0]: f68e50a263
2023-01-26 11:06:38 +11:00
f5d94f97e4 Fix T104132: Vertex colors no longer displaying in Workbench 2023-01-25 23:01:00 +01:00
fa57c691f6 USD IO CI Tests
Various new CI tests for USD Import / Export functionalty:

Import:
 - Added mesh import tests for topology types and multiple UV sets. (Python)

Export:
 - Added a verification tests for mesh topology. (C++)
 - Added a test to make sure UsdPreviewSurface export conversion of materials
is correct. (C++)

Reviewed by: Sybren and Hans.

Differential Revision: https://developer.blender.org/D16274
2023-01-25 14:51:39 -05:00
feae1c7d05 Gitea: update issue template with new scoped labels and other tweaks 2023-01-25 20:39:27 +01:00
6e0d58a68a Cleanup: Edit Mesh: Decrease variable scope, use bool instead of int 2023-01-25 12:56:26 -06:00
9ad051140c Revert "Fix T71137: curve minimum twist producing wrong geometry"
This reverts commit 36a82314a0.
as it has broken tests for the last day and a half, it likely just
needs a test file update, but we can't keep this failing longer
than it already has.
2023-01-25 11:00:08 -07:00
Sibo Van Gool
2f9346bc0a Fix T104071: Mix up in Realize Instances tooltip text
A mistake in the node type descriptions gave the node a description for
the reverse curve node.

Differential Revision: https://developer.blender.org/D17111
2023-01-25 11:42:10 -06:00
f954029e97 Curves Sculpt: Add report about missing surface for puff brush
This is done in the other sculpt brushes that require a surface object.
2023-01-25 11:38:12 -06:00
405b08e00d Fix: Workbench Next: Ensure matcaps textures are loaded 2023-01-25 17:12:25 +01:00
e744673268 Draw: Improve Texture assignment operator
Differential Revision: https://developer.blender.org/D17119
2023-01-25 17:12:25 +01:00
ffe45ad87a USD import unused materials.
Added a new Import All Materials USD import option.  When this
option is enabled, USD materials not used by any geometry will
be included in the import.  Imported materials with no users
will have a fake user assigned.

Maniphest Tasks: T97195

Differential Revision: https://developer.blender.org/D16172
2023-01-25 10:46:07 -05:00
Faheim Arslan M
69288daa74 Fix T100674: Use plural for axes option in object properties
This commit fixes T100674, Renamed the "axis" label under viewport display
panel to "axes" in object properties.

Differential Revision: https://developer.blender.org/D16650
2023-01-25 16:25:17 +01:00
12ca26afc2 Fix T104089: Removing greasepencil vertex group can leave no active
We should always have an active vertexgroup (making sure this is the
case was just not happening on the greasepencil side).

Now do this (similar to what is done for other object types in
`object_defgroup_remove_common`).

Maniphest Tasks: T104089

Differential Revision: https://developer.blender.org/D17091
2023-01-25 13:52:35 +01:00
ef3225b9da Cleanup: Workbench Next: Remove UNUSED_VARS 2023-01-25 13:16:53 +01:00
e27d1d9bb6 Fix broken listdir code.
Error in `bli_builddir` cahnges in B4815d0706fb5 commit.
2023-01-25 12:31:00 +01:00
b898e00edc Cleanup: remove unused KernelGlobals in microfacet BSDF 2023-01-25 11:26:51 +01:00
72c012ab4a Python: enable user site-packages without --python-use-system-env
User site-packages were disabled unless `--python-use-system-env`
argument was given. This was done to prevent Blender's Python
unintentionally using modules that happened to be installed locally,
however it meant so pip installing modules from Blender would fail to
load those modules (for some users).

Enable user site-packages since it's needed for installing additional
packages via pip, see code-comments for details.

Resolves T104000.
2023-01-25 12:52:30 +11:00
7416021bf7 Command Line Arguments: all errors now print to the stderr
This was done by some callbacks but not all.
Only use the stdout for status & information.
2023-01-25 12:20:48 +11:00
6c310acccc help now includes a GPU section & improve --gpu-backend error
- Include available GPU backends in the GPU backend error.
- Use stderr for the error message.
2023-01-25 12:20:00 +11:00
821dee6de4 CMake: de-duplicate option(..) for platform specific defaults
Use a variable for the default instead, avoid duplicate descriptions.
2023-01-25 12:18:41 +11:00
adcb0edca0 Cleanup: clang-tidy, replace defines with enum, redundant parenthesis 2023-01-25 11:53:06 +11:00
9c71f5807f Cleanup: remove unused argument, warnings 2023-01-25 11:18:33 +11:00
0c2b2cdb78 BLI_filelist: minor changes to bli_builddir behavior
- Don't call exit() when memory allocation fails, while unlikely
  internal failures should not be exiting the application.
- Don't print a message when the directory is empty as it's
  unnecessarily noisy.
- Print errors the the stderr & include the reason for opendir failing.
2023-01-25 10:48:59 +11:00
d3967ce27c Cleanup: use early return in bli_builddir, reduce right shift 2023-01-25 10:48:56 +11:00
b51034b9ca Asset system: use native slash for AssetLibraryIndex.indices_base_path
When constructing run-time paths native slashes are preferred as WIN32
doesn't have full support for forward slashes in paths.
It can also cause problems when performing exact matches on paths
which are normalized, where normalizing one of the paths makes
comparisons fail.

Using the system native slash would have avoided T103385.
2023-01-25 09:59:43 +11:00
b8cb962fb2 Cleanup: perform string join & allocation in a single step 2023-01-25 09:59:33 +11:00
d446809f96 Cleanup: remove unused argument 2023-01-25 09:53:35 +11:00
c5d9938adc Fix T104073: Unable to add more than two strips in one channel
Originally, function `sequencer_generic_invoke_xy_guess_channel`
looked in what channel the last strip of the same type is and the
channel used for new strip placement. This was changed as a workaround
when sound strips were added below movie strips
(58bea005c5).
Now these workarounds aren't necessary, but the function was left
unchanged.

Current logic is for adding movie strip to channel 1 is:
- Sound is added to channel 1
- Video is added to channel 2
- If there is only video, it is added to channel 1

Since movie may, or may not contain sound, it is not possible to align
added strips in a way that was done before, unless timeline is analyzed
more in detail, but this would be quite inefficient.
Also, note, that the code did not work, if strip is added next to
previous one, so it mostly did not work as intended.

This commit changes:
- Fix alignment not working when strip is added right next to previous
  strip
- Assume that movie strips have sound strip underneath it, so channel
  below last movie strip is picked
- For other strip types, pick channel of the last strip of same type

Ultimately, the code is still rather weak, and better system, like using
channel where strip was manually moved, or concept of "typed channels"
would improve this feature.
2023-01-24 23:28:16 +01:00
1ad11355a3 Transform: fix use of "snap_point" property
There is not much documentation on the "snap_point" property, but by
code it is possible to note that it serves to set a target snap point
and is of internal use as it is hidden from the Redo panel.

However, this property was still very dependent on Tools settings and
if set to an operator's call, it changes the scene configurations
inadequately.

Therefore,
- remove this dependency from UI for rotation and resize operators,
- do not change the state of the snap in the scene and
- cleanup the code.
2023-01-24 17:10:21 -03:00
0f52aa0954 Transform: Initialize 'transform_matrix' accordingly
Some transform modes are changeable, so callbacks should be reset
together.

Currently the unchanged `transform_matrix` callback is not a major
issue as it is only used for gizmos and gizmos stop updating when
changing the operator type.
2023-01-24 17:10:21 -03:00
5cc9f07d5c Draw: Workbench Next: Fix shaders memory leaks
Ensure all shaders are deleted
2023-01-24 20:52:35 +01:00
361ebe98d5 Draw: Workbench Next: Fix shadow culling after recent cleanup commit
79ba1a1ac8 changed virtual function signatures so they didn't match their parents.
2023-01-24 20:52:35 +01:00
789ab9b92a Sculpt: Fix T104090: Automask topology not constrained by brush radius 2023-01-24 11:04:39 -08:00
a73a2d345f Fix T104044: keep order of UVMaps on load
Use a Vector<std::string> , instead of a Set<std::string> as a Set does
not keep the same order when iterating over it.

Differential Revision: https://developer.blender.org/D17103
2023-01-24 18:26:02 +01:00
e56f284843 readfile: Fix 'virtual' issue in memfile case.
In many cases when reading undo memfile, n the 'restore id from old
main' part of the process, the 'id_old' is not set, which means that
the call to `BKE_main_idmap_insert_id` would try to dereference a `nullptr`.

In practice this is likely not an issue (see comment in code for details),
but at least explicitely check for a nullptr `id_old` pointer.

Would deffer actual cleanup of this area for after 3.5 is branched out.
2023-01-24 18:15:06 +01:00
30b34735e3 Cleanup/refactor: Add a new flag for IDTypes that are not read in memfile undo.
Currently only affects 'UI' IDs (WindowManager, Screen, etc.), but in
the future other types may be affected as well.

NOTE: this is only used in readfile code itself, not in the
post-processing performed by `setup_app_data`, as this code is too
specific for such generic handling.
2023-01-24 18:15:06 +01:00
e308b891c8 Cycles: Use faster and exact GGX VNDF sampling algorithm
Based on "Sampling the GGX Distribution of Visible Normals" by Eric Heitz
(https://jcgt.org/published/0007/04/01/).

Also, this removes the lambdaI computation from the Beckmann sampling code and
just recomputes it below. We already need to recompute for two other cases
(GGX and clearcoat), so this makes the code more consistent.

In terms of performance, I don't expect a notable impact since the earlier
computation also was non-trivial, and while it probably was slightly more
accurate, I'd argue that being consistent between evaluation and sampling is
more important than absolute numerical accuracy anyways.

Differential Revision: https://developer.blender.org/D17100
2023-01-24 17:59:29 +01:00
fdcb55b285 Cycles: Switch microfacet code to non-separable shadowing-masking term
This gives closer results to what I've seen in papers and other renderers when
using the code to precompute albedo later (to replace MultiGGX).

It's usually a tiny difference, the only case where I've seen it matter is
in the `shader/node_group_float.blend` test - but that's a (single-scatter) GGX
closure with 0.9 roughness, so it's not too surprising. In any case, the new
result looks closer to Eevee, so that's good I guess.

Differential Revision: https://developer.blender.org/D17099
2023-01-24 17:59:29 +01:00
ce25e3e581 Cycles: Cleanup: Add general-purpose conversion between sin and cos 2023-01-24 17:59:29 +01:00
b1c8889396 Cleanup: format 2023-01-24 17:59:29 +01:00
3dd3a0f02e temp-sculpt-roll-mapping: Fix deadlock 2023-01-24 08:54:12 -08:00
ae80a6696f Geometry Nodes: don't show warning in modifier with multiple geometry inputs
Simon mentioned that this gets in the way more than it helps. No geometry
sockets currently show up in the modifier panel. People may build node groups
that have multiple geometry inputs that can be used when the group is used
as node instead of as modifier.

In the future we could also allow e.g. choosing an object to pass into a geometry
socket. That has the problem that we'd also have to duplicate other functionality
of the Object Info node (original vs. relative space).
2023-01-24 17:45:47 +01:00
6899474615 Fix: crash when adding curves in curves sculpt mode with interpolation
There were two issues:
* The `new_point_counts_per_curve` was one too large, resulting in
  `interpolate_from_neighbors` reading invalid memory.
* Writing the counts into the existing offsets array didn't quite work
  because there can be a collision at the offset right between the
  last old curve and the first new point. There was a race condition
  where this value could be read and written at the same time.
2023-01-24 17:36:53 +01:00
3ca4ec3857 Merge branch 'master' into temp-sculpt-roll-mapping 2023-01-24 08:30:06 -08:00
62743dde11 Gpencil: More UI tweaks in Offset modifier
Remove more duplicate words
2023-01-24 16:53:28 +01:00
0931d91ab6 GPenciil: Small UI tweak
Remove duplicate `Layer` word
2023-01-24 16:47:22 +01:00
3956c4738f GPencil: Change range for Length value in Simplify modifier
In some cases a smaller value is required. Anyways, the UI value
soft limit is 0.005
2023-01-24 16:44:37 +01:00
813425877b Geometry Nodes: propagate material from guides in Interpolate Curves node
This was missing from the original implementation.
2023-01-24 16:24:00 +01:00
6e4e5f6484 UI: Allow spawning file browser dialog from regular file browser editor
When some path property was displayed in the File Browser, clicking the
icon to open a file browser dialog to choose a file/directory would
show an error. While this makes sense when you are already in a file
browser dialog (we don't support such nested file browser dialogs),
allow it when the file browser is opened as a normal editor, not as a
dialog.
2023-01-24 15:33:44 +01:00
10a06dfd11 Code-style: Rename internal parts of pbvh_uv_islands.
- InnerEdge -> FanSegment
- full -> is_manifold
2023-01-24 15:12:21 +01:00
21eff2c0ac 3DTexturing: Improve fix seam bleeding for manifold parts of mesh.
Current implementation had some faulty assumtions and had some work
arounds for crashes that were actually limitation of the implementation.

The main reason for this was that the implementation didn't add new
primitives in the same direction it was already adding. Some when
incorrect behavior was detected it was assumed that the part wasn't
manifold (anymore) and didn't fix that part of the mesh.

The new implementation will extract a solution and use this solution
also as the order to generate primitives in uv space.

This patch fixes several crashes and improves the overall quality
when fixing seam bleeding. It also adds additional debug tools
(print_debug) implementation in order to find issues faster in the
future.
2023-01-24 15:03:21 +01:00
4815d0706f Fix T103385: Asset Browser Thumbnails take long time to load
Regression in [0] caused by a change where path joining would
replace a forward slash with a back-slash when joining paths WIN32.
Now the directory is always used as a prefix for the paths returned
by BLI_filelist_dir_contents which resolves the regression.

[0]: 9f6a045e23
2023-01-25 00:21:13 +11:00
246485b213 BLI_path: make BLI_path_slash_is_native_compat into a public function 2023-01-25 00:21:11 +11:00
33edef15ed Fix T104095: missing crazy space data while sculpting curves
`remember_deformed_curve_positions_if_necessary` has to be called before
topology changing operations on curves. Otherwise the crazy-space data
is invalid.
2023-01-24 14:07:01 +01:00
cf332e896f GPencil: Fix unreported double rotation in Fill circles
The help circles to fill areas were rotated with the object rotation,
but as the stroke is part of the object, apply the rot matrix again
produced a double rotation, so it can be removed.
2023-01-24 13:19:22 +01:00
c16bd34316 Fix: memory allocation before guarded allocator is initialized
Use the construct-on-first-use idiom to fix this.
2023-01-24 13:07:54 +01:00
a36e2a9b64 Gpencil: Expose stroke and point time properties to API
Two properties are now exposed in python API :
time of each point, and inittime of each stroke.

Reviewed by: Antonio Vazquez

Differential Revision: https://developer.blender.org/D17104
2023-01-24 12:39:48 +01:00
74aa960398 Cleanup: add newline in Collada error message 2023-01-24 10:53:52 +01:00
1c90f8209d Cycles: fix rendering with Nishita Sky Texture on Intel Arc GPUs
Speckles and missing lights were experienced in scenes with Nishita Sky
Texture and a Sun Size smaller than 1.5°, such as in Lone Monk and Attic
scenes.
Increasing the precision of cosf fixes it.
2023-01-24 09:58:22 +01:00
6279042d21 Cleanup: use const, doc-string for BevList.poly & rename args
It wasn't clear that BevList.poly was used to check cyclic BevList's.
2023-01-24 16:42:38 +11:00
36a82314a0 Fix T71137: curve minimum twist producing wrong geometry
Only one point should be used to create a reference rotation for other
points to follow. Using two caused the resulting twist to be
asymmetric, especially noticeable on symmetrical, cyclic curves.

Alternate fix to D11886 which caused T101843.
2023-01-24 16:40:58 +11:00
67c48314ba Revert "Fix T71137: curve minimum twist producing wrong geometry"
This reverts commit cf72194214.
2023-01-24 15:33:47 +11:00
0ad4d07f10 Fix: Debug build compile error after recent cleanup commit
79ba1a1ac8 missed that this variable was used in an assert.
2023-01-23 15:57:33 -06:00
f72969f377 Fix T103888: Regression: Side-by-side stereo renders ignore color management
Looks like rB42937493d8253a295a97092315288f961e8c6dba accidentally dropped the
colorspace copy.
2023-01-23 22:12:16 +01:00
42f8f98ee1 Fix T104088: Geometry nodes modifier boolean lost on undo
After object-mode undo (memfile undo), the value wan't lost, but the
property would be temporarily converted back to integer type in order
to be forward compatible. Now only use the forward compatible
writing when writing undo steps. Auto-saves and similar files are
currently not forward compatible anyway.
2023-01-23 15:06:55 -06:00
79ba1a1ac8 Cleanup: Compiler warnings in new sculpt & workbench
Set but unused variables, unused arguments, unnecessary/incorrect type
casting, missing static qualifier. Unused arguments are removed.
2023-01-23 14:57:26 -06:00
f24b9a7943 Cleanup: Rename pbvh.cc to pbvh_colors.cc
In preparation for moving pbvh.c to C++, give this file a different name
so git will recognize that future commit as a rename.
2023-01-23 14:47:08 -06:00
7fc395354c Cleanup: Use offset indices arguments for curves utilities
Make the functions more flexible and more generic by changing the curves
arguments to the curve offsets. This way, theoretically they could become
normal utility functions in the future. Also do a consistency pass over
the algorithms that generate new curves geometry for naming and
code ordering, and use of utility functions. The functions are really
quite similar, and it's much easier to tell this way.
2023-01-23 14:43:04 -06:00
15575b953d Merge By Distance: Optimize algorithm to find duplicate polygons
The most time-consuming operation in merge by distance is to find
duplicate faces (faces that are different but have the same vertices).

Therefore, some strategies were planned to optimize this algorithm:
- Store the corner indices in an array thus avoiding multiple calls of `weld_iter_loop_of_poly_next`;
- Create a map of polygons linked to edges instead of linked to vertices - this decreases the number of connections and reduces the calculation of the intersection of polygon indices.

There are other fields to optimize, like reusing the `wpolys` array
instead of creating a new array of corner offsets. And join some arrays
as members of the same struct to be used in the same buffer.
But for now, it is already a nice optimization. And the new
`poly_find_doubles` function can be reused in the future to create a
generic utility.

The result of the optimization varies greatly depending on the number
of polygons, the size of each polygon and the number of duplicates.
On average it was something around 2 times faster.

Worst case tested (old vs new): 0.1ms vs 0.3ms
Best case tested (old vs new): 10.0ms vs 3.2ms

Differential Revision: https://developer.blender.org/D17071
2023-01-23 17:21:52 -03:00
b6b6e47e1d Sculpt: PBVH node splitting for texture paint
`PBVH_Leaf` nodes are now split into a new `PBVH_TexLeaf`
node type when using the paint brush.  These nodes are
split by image pixels, not triangles.  This greatly
increases performance when working with large
textures on low-poly meshes.

Reviewed By: Jeroen Bakker
Differential Revision: https://developer.blender.org/D14900
Ref: D14900
2023-01-23 10:44:50 -08:00
8afcecdf1f Cycles: update Intel Graphics compiler to 101.4032 on Windows
A noticeable (>5%) performance regression in oneAPI backend came with
a501a2dbff. Updating to latest graphics
compiler from driver 101.4032 fixes it.

I've tested it with current min-supported drivers and it runs well but
since compatibility of graphics compiler with older drivers isn't
guaranteed, I'm also bumping the min-supported driver versions.

If end-users consider latest drivers too fresh to switch to (version
isn't released as stable on Linux as of today but should be before
Blender 3.5 release), CYCLES_ONEAPI_ALL_DEVICES=1 env variable can be
used.

Intel Graphics Compiler on Linux will be updated in a later commit
so we can then close D16984.

Reviewed By: sergey, LazyDodo
2023-01-23 19:36:34 +01:00
fe8bf5e0c7 Fix T104039: GPencil Fill crash in multiframe
The problem was that the stroke array was not reset for 
each iteration of the loop, so the second time around, 
the array was not initialized correctly and 
was left with a NaN value.
2023-01-23 18:44:53 +01:00
e49e5f6f08 Enable USD Preview Surface import by default
The USD Preview Surface material import feature is now considered
stable, so this patch removes this option from the Experimental
category in the UI.

The Import USD Preview option is now enabled by default.

The Experimental box has been removed.

A new Materials box has been added to group the Import
USD Preview Surface, Set Material Blend and Material
Collision Mode options.

Reviewed by: Sybren

Differential Revision: https://developer.blender.org/D17053
2023-01-23 12:02:38 -05:00
ba982119cd Workbench Next
Rewrite of the Workbench engine using C++ and the new Draw Manager API.

The new engine can be enabled in Blender `Preferences > Experimental > Workbench Next`.
After that, the engine can be selected in `Properties > Scene > Render Engine`.
When `Workbench Next` is the active engine, it also handles the `Solid` viewport mode rendering.

The rewrite aims to be functionally equivalent to the current Workbench engine, but it also includes some small fixes/tweaks:
- `In Front` rendered objects now work correctly with DoF and Shadows.
- The `Sampling > Viewport` setting is actually used when the viewport is in `Render Mode`.
- In `Texture` mode, textured materials also use the material properties. (Previously, only non textured materials would)

To do:
- Sculpt PBVH.
- Volume rendering.
- Hair rendering.
- Use the "no_geom" shader versions for shadow rendering.
- Decide the final API for custom visibility culling (Needed for shadows).
- Profile/optimize.

Known Issues:
- Matcaps are not loaded until they’re shown elsewhere. (e.g. when opening the `Viewort Shading` UI)
- Outlines are drawn between different materials of the same object. (Each material submesh has its own object handle)

Reviewed By: fclem

Maniphest Tasks: T101619

Differential Revision: https://developer.blender.org/D16826
2023-01-23 17:59:07 +01:00
Jason Fielder
84c25fdcaa Metal: Improve command buffer handling and workload scheduling.
Improve handling for cases where maximum in-flight command buffer count is exceeded. This can occur during light-baking operations. Ensures the application handles this gracefully and also improves workload pipelining by situationally stalling until GPU work has completed, if too much work is queued up.

This may have a tangible benefit for T103742 by ensuring Blender does not queue up too much GPU work.

Authored by Apple: Michael Parkin-White

Ref T96261
Ref T103742
Depends on D17018

Reviewed By: fclem

Maniphest Tasks: T103742, T96261

Differential Revision: https://developer.blender.org/D17019
2023-01-23 17:47:21 +01:00
Jason Fielder
139fb38d4f DRW: Add texture usage host read to Lightcache texture.
Required by Metal backend to have correct usage flags for textures which are read by host.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17020
2023-01-23 17:46:46 +01:00
Jason Fielder
1c672f3d1d Fix T103433: Ensure Metal memory allocator is safe for multi-threaded allocation. Resolves crash when baking indirect lighting.
Also applies correct texture usage flag to light bake texture.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem, jbakker

Maniphest Tasks: T96261, T103433

Differential Revision: https://developer.blender.org/D17018
2023-01-23 17:45:39 +01:00
Jason Fielder
0ba5954bb2 Fix T103635: Fix failing EEVEE and OCIO shader compilations in Metal.
Affecting render output preview when tone mapping is used, and EEVEE scenes such as Mr Elephant rendering in pink due to missing shaders.

Authored by Apple: Michael Parkin-White

Ref T103635
Ref T96261

Reviewed By: fclem

Maniphest Tasks: T103635, T96261

Differential Revision: https://developer.blender.org/D16923
2023-01-23 17:40:10 +01:00
8e56ded86d Cycles: temporarily disable AMD Vega GPU rendering due to compiler bug
To make daily builds pass while we figure this out.

Ref T104097
2023-01-23 17:30:12 +01:00
62af8ae57b Merge branch 'master' into temp-sculpt-roll-mapping 2023-01-12 19:34:16 -08:00
bddb691dfc temp-sculpt-roll-mapping: Fix texture rotation bugs 2023-01-03 10:34:20 -08:00
a489b73a38 Merge branch 'master' into temp-sculpt-roll-mapping 2023-01-03 09:18:36 -08:00
a238533550 temp-sculpt-roll-mapping: Visualization improvements
* Only draw curve line at start of stroke
* Fixed dash
2022-12-30 15:12:22 -08:00
4faa5e30a5 Merge remote-tracking branch 'origin' into temp-sculpt-roll-mapping 2022-12-30 14:22:01 -08:00
2157115fa6 temp-sculpt-roll-mapping: Fix merge errors 2022-12-30 14:21:26 -08:00
8931df291a Merge branch 'master' into temp-sculpt-roll-mapping 2022-12-29 12:25:05 -08:00
d15324f1c3 temp-sculpt-roll-mapping: update more comments 2022-12-18 05:45:37 -08:00
90b633967b Merge remote-tracking branch 'origin' into temp-sculpt-roll-mapping 2022-12-17 20:58:31 -08:00
e9b573e11d temp-sculpt-roll-mapping: Cleanup some comments 2022-12-17 20:58:10 -08:00
ce56101897 Merge branch 'master' into temp-sculpt-roll-mapping 2022-12-15 17:56:57 -08:00
733a764b07 temp-sculpt-roll-mapping: Implement symmetry/tile/radial modes
* Note that tile + radial does not work
* Also cleaned up the code a bit.
2022-11-20 00:58:27 -08:00
999ba46d11 Merge remote-tracking branch 'origin' into temp-sculpt-roll-mapping 2022-11-20 00:01:41 -08:00
1fa19af508 Merge remote-tracking branch 'origin' into temp-sculpt-roll-mapping 2022-11-17 16:50:53 -08:00
997ff01503 Merge branch 'master' into temp-sculpt-roll-mapping 2022-11-17 01:02:26 -08:00
88f2350c34 temp-sculpt-roll-mapping: Update comment 2022-11-17 00:43:07 -08:00
33a1472d4e temp-sculpt-roll-mapping: New strategy for spline tex mapping
Decomposing into a quadratic spline did not work.  Having
to test each segment individually for the closest point test
wasn't worth it.

New strategy is to (roughly) find the inflection points on
the curve and binary search them for the closest
point test.
2022-11-17 00:06:26 -08:00
7a6e2d1e39 temp-sculpt-roll-mapping: remove debug draw points 2022-11-16 00:45:02 -08:00
f8afdc971f temp-sculpt-roll-mappings: Experimental quadratic spline code
Doesn't quite work yet.  The idea is to directly solve for
curve texture coordinates using a quadratic spline.
2022-11-16 00:40:46 -08:00
9a8c4aefb3 temp-sculpt-roll-mapping: mirror symmetry fixes 2022-11-15 10:40:09 -08:00
af4048f2b8 temp-sculpt-roll-mapping: Fix stroke points projection outside of the
mesh
2022-11-15 09:13:31 -08:00
eecd4b69f2 Sculpt: Fix mask from cavity settings issues
Mask from cavity can now pull settings from three
places: the operator properties, scene tool settings
or the brush.  This is needed to make the "create mask"
button work as expected.
2022-11-15 08:44:03 -08:00
12c5ae7e06 GPU: Update Vulkan backend with latest API changes.
UniformBuf::bind_as_ssbo has been introduced.
2022-11-15 08:44:02 -08:00
852a9ffa3d Fix T102482: Crash loading geometry nodes assets after file load
Asset library data is destructed on file load. Asset lists (weak and
hopefully temporary design) contain pointers into it that would dangle
then. Make sure the asset lists are destructed before the asset library
data.
2022-11-15 08:44:02 -08:00
893565268d Fix T102512: Snapping Option Face Nearest is not working
Caused by misuse of the `GS` macro.

Error in rB8f4e52b7e0dd
2022-11-15 08:44:02 -08:00
Philipp Oeser
00113d6888 Report messages from override ID template to the UI.
Previously these were only written to the console with no UI feedback
whatsoever. Just a bit nicer to give the user some info that something
went wrong.

See also T102495.

Differential Revision: https://developer.blender.org/D15732
2022-11-15 08:44:02 -08:00
9826204157 Fix WM_report not printing into console.
Also make it print warnings, not only errors.

Related to D15732.
2022-11-15 08:44:01 -08:00
295dc61334 Fix T102495: Fix UI-invisible warning and failing liboverride creation in IDTemplate in some cases.
Replace `RNA_warning` uage by `WM_report`.

And allow creating liboverrides of linked obdata used by purely local
objects.
2022-11-15 08:44:01 -08:00
fc4c8a3647 Cleanup: Move OptiX denoiser code from device into denoiser class
Cycles already treats denoising fairly separate in its code, with a
dedicated `Denoiser` base class used to describe denoising
behavior. That class has been fully implemented for OIDN
(`denoiser_oidn.cpp`), but for OptiX was mostly empty
(`denoiser_optix.cpp`) and denoising was instead implemented in
the OptiX device. That meant denoising code was split over various
files and directories, making it a bit awkward to work with. This
patch moves the OptiX denoising implementation into the existing
`OptiXDenoiser` class, so that everything is in one place. There are
no functional changes, code has been mostly moved as-is. To
retain support for potential other denoiser implementations based
on a GPU device in the future, the `DeviceDenoiser` base class was
kept and slightly extended (and its file renamed to
`denoiser_gpu.cpp` to follow similar naming rules as
`path_trace_work_*.cpp`).

Differential Revision: https://developer.blender.org/D16502
2022-11-15 08:44:01 -08:00
2e4ed11dd3 Fix (unreported) broken args handling in install_deps.
Issue likely introduced when openpgl lib was added a few months ago.
2022-11-15 08:44:01 -08:00
f46b932c59 GPU: UniformBuffer: Add possibility to bind as SSBO
This way UBOs can be modified directly in shader just like VBOs and IBOs.
2022-11-15 08:44:01 -08:00
d67cf5d288 GPU: State: Add GPU_BARRIER_UNIFORM
This allows to synchronise uniform buffer writes from compute shader
when an UBO is bound as SSBO.
2022-11-15 08:44:00 -08:00
0a16677741 GPU: Enabled Metal test cases.
This commit enabled the metal gpu backend test cases. These test cases
will currently fail, but are by default disabled.
2022-11-15 08:44:00 -08:00
f37cbba8f1 GPU: Improve Codegen variable names
Include the node name and parameter index in the variable name for easier debugging.
(Enabled for debug builds only)

Reviewed By: fclem, jbakker

Differential Revision: https://developer.blender.org/D16496
2022-11-15 08:44:00 -08:00
e2449adfcd Fix T101562: GPU: Update fmod to match Cycles/OSL behavior
Implementation from:
https://github.com/AcademySoftwareFoundation/OpenShadingLanguage/blob/main/src/include/OSL/dual.h#L1283
https://github.com/OpenImageIO/oiio/blob/master/src/include/OpenImageIO/fmath.h#L1605

Reviewed By: fclem

Maniphest Tasks: T101562

Differential Revision: https://developer.blender.org/D16497
2022-11-15 08:44:00 -08:00
63e630b703 install_deps: Update openpgl to 0.4.1.
Ref. T101403.
2022-11-15 08:44:00 -08:00
c2499366f2 Cleanup: fix compile error in leak detection utility 2022-11-15 08:43:59 -08:00
b30e20d0d9 Fix T101536: missing node tree updates after remapping id
The main problem is that the node tree was not properly updated
after a property in the tree changed. More specifically, the collection
pointer in the Collection Info node was cleared, but the node tree
was not updated after that (usually this is handled by rna updates).

Differential Revision: https://developer.blender.org/D16289
2022-11-15 08:43:59 -08:00
Damien Picard
51976a9e4c I18n: fix UI layout operator context extraction
Some operator layout buttons with custom text did not get the proper
context: they'd get * instead of Operator.

This was probably never noticed because the only operator that
actually had this issue was Import Images as Planes in the Add menu.

Reviewed By: mont29

Differential Revision: https://developer.blender.org/D15993
2022-11-15 08:43:59 -08:00
Damien Picard
6749cec22f I18n: fix description_from_data_path() for property assignment
The property assignment operators' tooltips were never actually
translated when they were constructed dynamically from the description
in the prop's RNA.

This was visible when using such operators in menus (example I found
was the Marker Settings, Shift + E in the Movie Clip Editor).

"%s: %s" is already extracted elsewhere, might as well use it.

Reviewed By: mont29

Differential Revision: https://developer.blender.org/D16439
2022-11-15 08:43:59 -08:00
Damien Picard
f5877da993 I18n: improve keymap translations
- The label for modal keymaps was extracted but did not use the proper
  context on translation.
- Same goes for modal keymap items.
- Extract the UI messages from rna_keymap_ui.py
- Translate global keymap names.
- Use the proper context in the status bar for the tool prompt operator

Ref T102071

Maniphest Tasks: T102071

Differential Revision: https://developer.blender.org/D16348
2022-11-15 08:43:59 -08:00
42dd5cc430 Fix T101775: grease pencil keyframe not filtered in dopesheet summary
Grease pencil data keyframes were listed twice in the summary.

First by the generic object data listing,
which did not handle properly grease pencil objects,
and did not account for the grease pencil filter.
Second by the specific grease pencil function.

Now only the second call is made,
and the filter hides keyframes in summary as well.

Reviewed By : Jeroen Bakker, Falk David

Differential Revision: https://developer.blender.org/D16369
2022-11-15 08:43:58 -08:00
e16f473523 Animation: rearrange grease pencil channels in the main dopesheet
Operations to rearrange channels in the main dopesheet
did not cover grease pencil layer channels.
Now grease pencil layer channels can be moved up and down
in the main dopesheet just like other channels.

Reviewed By: Sybren A. Stüvel

Differential Revision: https://developer.blender.org/D15542
2022-11-15 08:43:58 -08:00
a97e2fd4e6 Fix hair/curve drawing artifacts when workarounds are enabled.
Regression introduced by {rB601995c3b86986cf8f8e5b6e5a65bcfa7f8f2e32}.
Noticed by Heist project as they render final frames with workarounds enabled.

The mentioned patch introduces attaching VBO as textures. This used to be
done by the caller. The mechanism used a different order hence the VBO
could still be unbound when using. This cannot be solved inside the new
mechanism clearly so this patch will just bind when the buffer isn't bound
just before the drawing command is sent to the GPU driver.
2022-11-15 08:43:58 -08:00
b0e48cb936 Cleanup: format 2022-11-15 08:43:58 -08:00
dc14353791 Cleanup: quiet unused variable warning 2022-11-15 08:43:57 -08:00
03218f7b45 GHOST/Wayland: limit the size of the events_pending vector
Avoid keeping allocated overly large events_pending vector in case of
long delays between processing events.

While in practice this isn't likely to cause problems, it's better to
avoid keeping unnecessarily large allocations.

Also remove invalid comment.
2022-11-15 08:43:57 -08:00
472762f0dc Fix T100855: Input while Blender is unresponsive exits under Wayland
Consume events in a thread to prevent Wayland's event buffer from
overflowing Waylands internal buffer and closing the connection.
From a users perspective this seemed like a crash.

Details:

- This is a workaround for a known bug in Wayland [0].
  Threaded event handling has been if-defed so it can be removed when
  it's no longer needed.

- GTK & QT use threaded event handling to avoid this problem
  (SDL on the other hand doesn't).

- The complexity and number of locks needed to handle events in a
  separate thread is a significant down-side, but as far as I can see
  this is necessary.

- Re-connecting to the Wayland server is possible but not practical as
  the OpenGL context is lost and as far as I can tell it's not possible
  to keep it active (see: D16492).

[0]: https://gitlab.freedesktop.org/wayland/wayland/-/issues/159
2022-11-15 08:43:57 -08:00
6868d7b74a GHOST/Wayland: share logic for xdg/libdecor window configuration
Split frame & frame_pending structs, making window actions simpler
to reason about.
2022-11-15 08:43:57 -08:00
a3c4e28df9 GHOST/Wayland: only use a single swapBuffers for libdecor redrawing
Using a single draw works in my tests and I couldn't reproduce the
issue noted in the comment.

Also apply minor cleanup, assigning a variable before calling methods to
reduce diff-noise in planned changes.
2022-11-15 08:43:57 -08:00
Iliya Katueshenock
444ce7c3b7 Geometry Nodes: Image Info Node
This commit adds a new "Image Info" node to retrieve various
information from an image like its width, height, and whether
it has an alpha channel. It is also possible to retrieve the FPS
and frame count of video files.

Differential Revision: https://developer.blender.org/D15042
2022-11-15 08:43:56 -08:00
Mattias Fredriksson
24044c64ba Partial Fix T102428: Invertible Bezier curve trim within one segment
Adjusts behavior for trimming Bezier curves, specifically the outer
Bezier handles for the endpoints which do not influence the actual
curve. Handles are only adjusted if they lie within the same segment
with at most one endpoint being a control point (unless they are the
same in which handles are set to the point itself). The result yields
a curve in which the trim result can be inverted by re-setting the
cyclic property for the curve using 'Set Spline Cyclic' node
(iff both trim endpoints lie within a segment).

Differential Revision: https://developer.blender.org/D16488
2022-11-15 08:43:56 -08:00
Laurynas Duburas
9dd0899298 Fix: Versioning problem with cyclic bezier NURBS
Patch fixes versioning issue with NURBS files saved with Blender
version from commit 45d038181a to 0602852860 and opened with
Blender version from commit 0602852860. Cyclic Bezier NURBS
saved and then opened with Blender versions mentioned above
changed their shape.

Bug was reported in comments of T101160, circle problem.

Differential Revision: https://developer.blender.org/D16503
2022-11-15 08:43:56 -08:00
0783789a95 Fix T102502: Collada import sets incorrect material indices
The weird code dealing with `MeshPrimitive` didn't increment the
material indices pointer for geometry types besides triangle fans.
Also use a proper accessor to avoid adding a duplicate material
indices attribute, just in case this code is used on existing meshes.
2022-11-15 08:43:56 -08:00
7e1898cfa7 Cleanup: Resolve unused variable warning in draw module 2022-11-15 08:43:56 -08:00
a67f703ff3 Fix T100491: Mouse selection is inaccurate in NLA Editor
Selection range is +/-7 pixels to actual clicked position, but strip selection
was biased towards rightmost strip.

To make selection more intuitive, select closest strip to clicked position, and
stop iterating when strip intersects clicked pixel.

Reviewed By: sybren

Differential Revision: https://developer.blender.org/D15728
2022-11-15 08:43:56 -08:00
a504c9a69e Python API: add Image.save() optional parameters quality and filepath
To override the default quality and filepath. After changes to unify the image
operator and this method, the quality changed from 75 to 90 to make both
consistent. However there was no way to lower the quality to match the previous
behavior, this adds support for that.

Ref T102421
2022-11-15 08:43:55 -08:00
e86612d442 Fix T102421: Image.save() not respecting set image.file_format
The default file type for new ImBuf is already PNG, no need to override it
again in the save method.
2022-11-15 08:43:55 -08:00
63d4f862ce Cleanup: BVH utils: Remove print, use spans instead of pointers 2022-11-15 08:43:55 -08:00
8a19fcc500 Cycles: Enable MetalRT pointclouds & other fixes
Code authored by Marco Giordano.

This fixes pointcloud rendering on MetalRT and some other subtle MetalRT bugs:
- Incorrect kernel hashing
- Missing specialisation constants
- Incorrect visibility filtering
- Missing null pointer check

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16499
2022-11-15 08:43:55 -08:00
ed4e04a1f6 Usual minor fixes to UI messages & i18m utils module. 2022-11-15 08:43:55 -08:00
96e93cc70c Mesh: Avoid calculating normals when building BVH tree
Though they are sometimes used by users of the BVH tree, mostly
vertex normals when building the BVH tree is unnecessary. Skip it
instead and avoid storing the vertex normals in the BVH tree cache.
They are just calculated in the few places they are actually needed.
This should save at least a few percent of the runtime in some cases
where the normals weren't needed otherwise.
2022-11-15 08:43:54 -08:00
85dc681fc3 Cleanup: Reduce indentation and variable scope in BVH utils 2022-11-15 08:43:54 -08:00
e442a17587 Cleanup: Use C++ BitVector instead of BLI_bitmap for BVH utils
This gives a friendlier interface, an inline buffer, RAII, etc.
Also switch some BMesh functions that were only used by the snap
system's use of BVH utils.
2022-11-15 08:43:54 -08:00
f4e4a0b5b3 Cleanup: Allow using C++ features in BMesh header functions
Generally the `extern "C" {` brackets shouldn't be added around other
headers since it causes problems when using C++ features in them.
Follow that convention for the "bmesh.h" header.
2022-11-15 08:43:54 -08:00
dd669dd703 Cleanup: Move bmesh_iterators.c to C++ 2022-11-15 08:43:53 -08:00
adf4dd8355 Cleanup: Braces around initialization warning
There is no need to initialize the bounding box as it is fully
initialized by the BKE_boundbox_init_from_minmax().
2022-11-15 08:43:53 -08:00
Ejner Fergo
9b36b846f0 Gizmo toggle for Movie Clip Editor
This patch adds a "Show Gizmo" toggle to the Movie Clip Editor header, for consistency with other editors.

{F13892765}

Differential Revision: https://developer.blender.org/D16437
2022-11-15 08:43:53 -08:00
Mikhail Matrosov
9d74da3352 Cycles: improve adaptive sampling for overexposed scenes
Render time is reduced for overexposed scenes, by taking into account absolute
light intensity for adaptive sampling.

This can negatively affect some scenes where compositing or color management
are used to make the scene much darker or lighter. For best results adjust the
Film > Exposure setting to bring the intensity into a good range, and then do
further compositing and color management on top of that. Note that this setting
is different than color management exposure.

Previously Cycles' adaptive sampling used sqrt(I) to normalize noise level to
conform to a viewer's eye sensitivity. It is great for darker regions of the
image, but also requests too much samples in bright regions, sometimes several
times more than needed. Highlights can tolerate more noise because in most
examples it is still less noticeable then the noise in darker areas in the same
render.

Differential Revision: https://developer.blender.org/D16392
2022-11-15 08:43:52 -08:00
a72f590d51 Fix image datablock losing movie metadata on undo
The anims data is a runtime cache similar to the image buffer or GPU texture
and needs to be preserved through undo in the same way.

Found as part of D15042 development.
2022-11-15 08:43:52 -08:00
Colin Basnett
6b590a86a6 UI: disable curve map & profile zoom buttons at max/min zoom level
Disable the zoom in and out buttons on the when they would have no effect.

This also removes an incorrect comment that indicates the maximum zoom level
was 20x when in fact it was 25x.

Differential Revision: https://developer.blender.org/D16252
2022-11-15 08:43:52 -08:00
48527a5c2d DRW: Fix compilation issues in inline functions 2022-11-15 08:43:52 -08:00
Iliya Katueshenock
94e761e263 Fix: incorrect field dependencies in the raycast node
This does not change the result when evaluating the node.

Differential Revision: https://developer.blender.org/D16325
2022-11-15 08:43:51 -08:00
85cec1deff Asset system: New asset system code module (with files from BKE)
Adds a new `source/blender/asset_system` directory and moves asset
related files from BKE to it. More asset related code can follow
(e.g. asset indexing, ED_assetlist stuff) but needs further work to
untangle it. I also kept `BKE_asset.h` and `asset.cc` as is, since they
deal with asset DNA data mostly, thus make sense in BKE.

Motivation:
- Makes the asset system design more present (term wasn't even used in
  code before).
- An `asset_system` directory is quite descriptive (trivial to identify
  core asset system features) and makes it easy to find asset code.
- Asset system is mostly runtime data, with little relation to other
  `Main`/BKE/DNA types.
- There's a lot of stuff in BKE already. It shouldn't be just a dump for
  all stuff that seems core enough.
- Being its own directly helps us be more mindful about encapsulating
  the module well, and avoiding dependencies on other modules.
- We can be more free with splitting files here than in BKE.
- In future there might be an asset system BPY module, which would then
  map quite nicely to the `asset_system` directory.

Checked with some other core devs, consensus seems that this makes
sense.
2022-11-15 08:43:51 -08:00
165daf0bc6 Fix T98989: Performance regression when using multiple bump nodes
Ensure each graph material_function only evaluates the input links that are connected to it.

Differential Revision: https://developer.blender.org/D16425
2022-11-15 08:43:51 -08:00
aed5a62db4 GHOST/Wayland: fix building when GHOST_OPENGL_ALPHA is defined 2022-11-15 08:43:51 -08:00
0c1b536c57 DRW: View: Add base for multi-view support
This implements the base needed for supporting multiple view concurently
inside the same drawcall.

The view used by common macros and view related functions is indexed using
a global variable `drw_view_id` which can be set arbitrarly or read
from the `drw_ResourceID`.

This is needed for EEVEE-Next shadow but can be used for other purpose
in the future.

Note that a shader specialization is needed for it to work. `DRW_VIEW_LEN`
needs to be defined to the amount of view the shader will access.

The number of views contained in a `draw::View` is set at construction
time.

Note that the maximum number of object correctly drawn by the shaders
using multiple views will be lower than thoses who don't.
2022-11-15 08:43:51 -08:00
903c25f097 Cleanup: DRW: Remove two clang-tidy warnings 2022-11-15 08:43:50 -08:00
6e2fc39eb7 Cleanup: Move mesh_remap.c to C++
To facilitate further mesh data structure refactoring.
2022-11-15 08:43:50 -08:00
f2f1bbdea3 UV: fix crash with uv copy on empty selection
Introduced in 721fc9c1c9
2022-11-15 08:43:50 -08:00
96dfc8be8e GHOST/Wayland: call exit() when Wayland has a fatal error
Without this, a fatal error simply floods the stderr with the same
message without exiting.

Also add note on why reconnecting to the display server isn't practical.
2022-11-15 08:43:50 -08:00
4d57da0dd0 GHOST/Wayland: remove display listener (was warning on startup)
This already has a default listener which uses waylands logging.
2022-11-15 08:43:50 -08:00
9eb885d029 Cleanup: Disable mesh normal debug time printing
Left enabled mistakenly by d63ada602d.
2022-11-15 08:43:49 -08:00
e2376eceb5 Cleanup: Move lineart_cpu.c to C++
To enable further mesh data structure refactoring-- access to loose
edges in particular.
2022-11-15 08:43:49 -08:00
f9947b50c6 UV: fix compile on windows
Remove VLAs for compiling on windows.
Regression from 721fc9c1c9
2022-11-15 08:43:49 -08:00
812de962a3 Fix T95335 Bevel operator Loop Slide overshoot.
If the edge you are going to slide along is very close to in line
with the adjacent beveled edge, then there will be sharp overshoots.
There is an epsilon comparison to just abandon loop slide if this
situation is happening. That epsilon used to be 0.25 radians, but
bug T86768 complained that that value was too high, so it was changed
to .0001 radians (5 millidegrees). Now this current bug shows that
that was too aggressively small, so this change ups it by a factor
of 10, to .001 radians (5 centidegrees). All previous bug reports
remained fixed.
2022-11-15 08:43:49 -08:00
8bea9693df Fix T102187: Add knife tool in mesh panel
Add knife tool option in mesh panel

Reviewer: campbellbarton, JulienKaspar

Differential Revision: https://developer.blender.org/D16395
2022-11-15 08:43:49 -08:00
035e040ee2 DRW: Manager: Fix ClearMulti breaking compilation on Mac
The error was:
`draw_pass.hh:1055:16: error: call to implicitly-deleted default constructor of 'blender::draw::command::Undetermined [3]'
2022-11-15 08:43:48 -08:00
da33e57e39 BLI: Fix ListBaseWrapper::get wrong return type 2022-11-15 08:43:48 -08:00
6ee2f9dc15 DRW: Manager: Add bind_texture command for vertex buffer
This allows the same behavior as with `DRW_shgroup_buffer_texture`.
2022-11-15 08:43:48 -08:00
453a0c430b DRW: Manager: Add ClearMulti command
Allows to record `GPU_framebuffer_multi_clear` inside `draw::Pass`.
2022-11-15 08:43:48 -08:00
94a35be532 DRW: Manager: Finish / change implementation of framebuffer_set command
Use reference instead of direct pointer. This is because framebuffers
often use temp textures and are configured later just before submission.
2022-11-15 08:43:47 -08:00
4efef2fccd DRW: Wrappers: Allow taking reference of the framebuffer object
This is in order to make it work with the new `framebuffer_set` command
which requires a `GPUFrameBuffer **`.
2022-11-15 08:43:47 -08:00
a80e4f4bc9 DRW: Wrappers: Allow trivial types inside draw::SwapChain
This allows to use pointers and such other trivial types which cannot
implement the `swap` mehod.
2022-11-15 08:43:47 -08:00
8e40e6400a DRW: Wrappers: Avoid default vector length of 0 if sizeof(T) is large
This increases the default size to some reasonable value (>512bytes) and
allocate at least 1 element.
2022-11-15 08:43:47 -08:00
9f1560a9a4 Cleanup: fix compiler error/warnings 2022-11-15 08:43:47 -08:00
dce36e3e80 UV: implement copy and paste for uv
Implement a new topology-based copy and paste solution for UVs.

Usage notes:

* Open the UV Editor

* Use the selection tools to select a Quad joined to a Triangle joined to another Quad.
* From the menu, choose UV > UV Copy
 * The UV co-ordinates for your quad<=>tri<=>quad are now stored internally

* Use the selection tools to select a different Quad joined to a Triangle joined to a Quad.
* (Optional) From the menu, choose UV > Split > Selection

* From the menu, choose UV > UV Paste
 * The UV co-ordinates for the new selection will be moved to match the stored UVs.

Repeat selection / UV Paste steps as many times as desired.
For performance considerations, see https://en.wikipedia.org/wiki/Graph_isomorphism_problem

In theory, UV Copy and Paste should work with all UV selection modes.
Please report any problems.

A copy has been made of the Graph Isomorphism code from https://github.com/stefanoquer/graphISO
Copyright (c) 2019 Stefano Quer stefano.quer@polito.it GPL v3 or later.

Additional integration code Copyright (c) 2022 by Blender Foundation, GPL v2 or later.

Maniphest Tasks: T77911
Differential Revision: https://developer.blender.org/D16278
2022-11-15 08:43:46 -08:00
be9bac1f0d Fix: OpenSubdiv reporting version 0.0.0 in system_info.txt
OSD Lists as 0, 0, 0 this is due to opensubdiv_capi.cc not actually including
the OSD version header, so it's not getting the version define, and the code
in openSubdiv_getVersionHex is really well prepared to deal with any or no
version at all of OSD, catches the problem and returns 0, 0, 0

Given this file is only build when OSD is enabled we can just blindly include
opensubdiv/version.h here

Reviewed by: brecht
Differential Revision: https://developer.blender.org/D16398
2022-11-15 08:43:46 -08:00
0f06080bc3 Cleanup: Move cloth.c to C++
To support further mesh data structure refactoring.
2022-11-15 08:43:46 -08:00
b5d4fbb9cc BLI: improve CPPType system
* Support bidirectional type lookups. E.g. finding the base type of a
  field was supported, but not the other way around. This also removes
  the todo in `get_vector_type`. To achieve this, types have to be
  registered up-front.
* Separate `CPPType` from other "type traits". For example, previously
  `ValueOrFieldCPPType` adds additional behavior on top of `CPPType`.
  Previously, it was a subclass, now it just contains a reference to the
  `CPPType` it corresponds to. This follows the composition-over-inheritance
  idea. This makes it easier to have self-contained "type traits" without
  having to put everything into `CPPType`.

Differential Revision: https://developer.blender.org/D16479
2022-11-15 08:43:45 -08:00
431d527679 Fix: Curves sculptmode: deduplicate checks for being in mode
rBa8f7d41d3898 added a "duplicate" check for being in curves sculptmode
unnecessarily afaict (`in_sculpt_curve_mode` in addition to the
previously existing `in_curves_sculpt_mode`).
Over time, the later evolved to also take into account the output of a
viewer node, see rBc55d38f00b8c (the previously existing
`in_curves_sculpt_mode` did not receive this).

This all results in the fact that selection is not drawn with a viewer
node (can be useful though, and there are separate opacity controls for
both selection and the viewer attribute, so these can be used/blended to
everyones liking).

So now deduplicate the check.

Differential Revision: https://developer.blender.org/D16467
2022-11-15 08:43:45 -08:00
Iliya Katueshenock
d5dff81cf6 BLI: use templates for disjoint set data structure
Differential Revision: https://developer.blender.org/D16472
2022-11-15 08:43:45 -08:00
Iliya Katueshenock
c619c86662 Fix: missing tooltip for color sockets
Differential Revision: https://developer.blender.org/D16401
2022-11-15 08:43:45 -08:00
Iliya Katueshenock
dba0a2ad9a Cleanup: remove unused variable
Differential Revision: https://developer.blender.org/D16350
2022-11-15 08:43:44 -08:00
Iliya Katueshenock
2a90fc497b Fix: geometry nodes viewer shows black with dangling reroute input
Differential Revision: https://developer.blender.org/D16322
2022-11-15 08:43:44 -08:00
Iliya Katueshenock
36673016e9 Cleanup: make GArray declarations more explicit
Differential Revision: https://developer.blender.org/D16064
2022-11-15 08:43:44 -08:00
30e9c5b7f5 Cleanup: add a system reference to the wayland window
Avoid relying on GHOST_ISystem::getSystem(), store the system instead.
2022-11-15 08:43:44 -08:00
6e82f6bf2f Fix window title not redrawing with Wayland/libdecor 2022-11-15 08:43:44 -08:00
f6f05740be GHOST/Wayland: skip resizing the EGL surface unnecessarily
wl_egl_window_resize ran when the window became active/inactive for e.g.
2022-11-15 08:43:44 -08:00
a873f7299a Cleanup: move title into GWL_Window
Nearly all Wayland window data is stored in this struct,
follow this convention so GWL_Window functions can be self contained.
2022-11-15 08:43:44 -08:00
9f03d4bc2c Cleanup: move Wayland window state utilities into lower level functions
Add low level gwl_window_* functions.
2022-11-15 08:43:43 -08:00
dc2ac546cd Fix non-interactive window borders after changes to event handling
Regression in [0] causes LIBDECOR interactions not to be detected.

[0]: deb8ae6bd1
2022-11-15 08:43:43 -08:00
a3f23a321e Cleanup: use snake-case for WAYLAND utility functions
It wasn't so obvious which functions were part of the GHOST API
and which system functions were utilities.

This convention was already in place but not always followed.
2022-11-15 08:43:43 -08:00
ddecc6166b GHOST/Wayland: replace roundtrip with dispatch_pending
Add a non-blocking version wrapper for wl_display_dispatch_pending.
This uses roughly the same logic as Wayland_PumpEvents in SDL.
Noticed this when investigating T100855.

Note that performing a round-trip doesn't seem necessary from looking
into QT/GTK & SDL event handling loops.
2022-11-15 08:43:43 -08:00
9155608875 Cleanup: Simplify handling of loop to poly map in normal calculation
A Loop to poly map was passed as an optional output to the loop normal
calculation. That meant it was often recalculated more than necessary.
Instead, treat it as an optional argument. This also helps relieve
unnecessary responsibilities from the already-complicated loop normal
calculation code.
2022-11-15 08:43:43 -08:00
415ed999ab Cleanup: Use simpler timers for mesh normals debug timing 2022-11-15 08:43:43 -08:00
cfab4f432c Cleanup: Make loop normal calculation function static 2022-11-15 08:43:42 -08:00
4ac1f59a81 Cleanup: Decrease variable scope in mesh loop normal calculation 2022-11-15 08:43:42 -08:00
1ed9df3763 Cleanup: Use spans for loop normal calculation input data 2022-11-15 08:43:42 -08:00
61e591b2a5 Cleanup: Remove unnecessary struct keywords 2022-11-15 08:43:42 -08:00
c29795452c Merge branch 'master' into temp-sculpt-roll-mapping 2022-11-11 14:18:46 -08:00
9980fd0b8e temp-sculpt-roll-mapping: Port roll tex mapping code from sculpt-dev 2022-11-05 14:47:48 -07:00
1016 changed files with 50482 additions and 34960 deletions

View File

@@ -0,0 +1,5 @@
${CommitTitle}
${CommitBody}
Pull Request #${PullRequestIndex}

View File

@@ -0,0 +1,3 @@
${PullRequestTitle}
Pull Request #${PullRequestIndex}

View File

@@ -1,13 +1,15 @@
name: Bug Report
about: File a bug report
labels:
- bug
- "type::Report"
- "status::Needs Triage"
- "priority::Normal"
body:
- type: markdown
attributes:
value: |
### Instructions
First time reporting? See [tips](https://wiki.blender.org/wiki/Process/Bug_Reports) and [walkthrough video](https://www.youtube.com/watch?v=JTD0OJq_rF4).
First time reporting? See [tips](https://wiki.blender.org/wiki/Process/Bug_Reports).
* Use **Help > Report a Bug** in Blender to fill system information and exact Blender version.
* Test [daily builds](https://builder.blender.org/) to verify if the issue is already fixed.
@@ -19,6 +21,7 @@ body:
id: body
attributes:
label: "Description"
hide_label: true
value: |
**System Information**
Operating system:

View File

@@ -1,9 +1,10 @@
name: Design
about: Create a design task (for developers only)
labels:
- design
- "type::Design"
body:
- type: textarea
id: body
attributes:
label: "Description"
hide_label: true

View File

@@ -1,9 +1,10 @@
name: To Do
about: Create a to do task (for developers only)
labels:
- todo
- "type::To Do"
body:
- type: textarea
id: body
attributes:
label: "Description"
hide_label: true

View File

@@ -14,7 +14,4 @@ body:
id: body
attributes:
label: "Description"
value: |
Description of the problem that is addressed in the patch.
Description of the proposed solution and its implementation.
hide_label: true

View File

@@ -1,5 +1,4 @@
This repository is only used as a mirror of git.blender.org. Blender development happens on
https://developer.blender.org.
This repository is only used as a mirror. Blender development happens on projects.blender.org.
To get started with contributing code, please see:
https://wiki.blender.org/wiki/Process/Contributing_Code

3
.github/stale.yml vendored
View File

@@ -15,8 +15,7 @@ staleLabel: stale
# Comment to post when closing a stale Issue or Pull Request.
closeComment: >
This issue has been automatically closed, because this repository is only
used as a mirror of git.blender.org. Blender development happens on
developer.blender.org.
used as a mirror. Blender development happens on projects.blender.org.
To get started contributing code, please read:
https://wiki.blender.org/wiki/Process/Contributing_Code

12
.gitmodules vendored
View File

@@ -1,20 +1,16 @@
[submodule "release/scripts/addons"]
path = release/scripts/addons
url = ../blender-addons.git
branch = master
ignore = all
branch = main
[submodule "release/scripts/addons_contrib"]
path = release/scripts/addons_contrib
url = ../blender-addons-contrib.git
branch = master
ignore = all
branch = main
[submodule "release/datafiles/locale"]
path = release/datafiles/locale
url = ../blender-translations.git
branch = master
ignore = all
branch = main
[submodule "source/tools"]
path = source/tools
url = ../blender-dev-tools.git
branch = master
ignore = all
branch = main

View File

@@ -167,14 +167,26 @@ get_blender_version()
option(WITH_BLENDER "Build blender (disable to build only the blender player)" ON)
mark_as_advanced(WITH_BLENDER)
if(APPLE)
# In future, can be used with `quicklookthumbnailing/qlthumbnailreply` to create file
# thumbnails for say Finder. Turn it off for now.
option(WITH_BLENDER_THUMBNAILER "Build \"blender-thumbnailer\" thumbnail extraction utility" OFF)
elseif(WIN32)
option(WITH_BLENDER_THUMBNAILER "Build \"BlendThumb.dll\" helper for Windows explorer integration" ON)
if(WIN32)
option(WITH_BLENDER_THUMBNAILER "\
Build \"BlendThumb.dll\" helper for Windows explorer integration to support extracting \
thumbnails from `.blend` files."
ON
)
else()
option(WITH_BLENDER_THUMBNAILER "Build \"blender-thumbnailer\" thumbnail extraction utility" ON)
set(_option_default ON)
if(APPLE)
# In future, can be used with `quicklookthumbnailing/qlthumbnailreply`
# to create file thumbnails for say Finder.
# Turn it off for now, even though it can build on APPLE, it's not likely to be useful.
set(_option_default OFF)
endif()
option(WITH_BLENDER_THUMBNAILER "\
Build stand-alone \"blender-thumbnailer\" command-line thumbnail extraction utility, \
intended for use by file-managers to extract PNG images from `.blend` files."
${_option_default}
)
unset(_option_default)
endif()
option(WITH_INTERNATIONAL "Enable I18N (International fonts and text)" ON)
@@ -214,14 +226,19 @@ option(WITH_BULLET "Enable Bullet (Physics Engine)" ON)
option(WITH_SYSTEM_BULLET "Use the systems bullet library (currently unsupported due to missing features in upstream!)" )
mark_as_advanced(WITH_SYSTEM_BULLET)
option(WITH_OPENCOLORIO "Enable OpenColorIO color management" ON)
set(_option_default ON)
if(APPLE)
# There's no OpenXR runtime in sight for macOS, neither is code well
# tested there -> disable it by default.
option(WITH_XR_OPENXR "Enable VR features through the OpenXR specification" OFF)
mark_as_advanced(WITH_XR_OPENXR)
else()
option(WITH_XR_OPENXR "Enable VR features through the OpenXR specification" ON)
set(_option_default OFF)
endif()
option(WITH_XR_OPENXR "Enable VR features through the OpenXR specification" ${_option_default})
if(APPLE)
mark_as_advanced(WITH_XR_OPENXR)
endif()
unset(_option_default)
option(WITH_GMP "Enable features depending on GMP (Exact Boolean)" ON)
# Compositor
@@ -353,12 +370,13 @@ else()
set(WITH_COREAUDIO OFF)
endif()
if(NOT WIN32)
set(_option_default ON)
if(APPLE)
option(WITH_JACK "Enable JACK Support (http://www.jackaudio.org)" OFF)
else()
option(WITH_JACK "Enable JACK Support (http://www.jackaudio.org)" ON)
set(_option_default OFF)
endif()
option(WITH_JACK_DYNLOAD "Enable runtime dynamic JACK libraries loading" OFF)
option(WITH_JACK "Enable JACK Support (http://www.jackaudio.org)" ${_option_default})
unset(_option_default)
option(WITH_JACK_DYNLOAD "Enable runtime dynamic JACK libraries loading" OFF)
else()
set(WITH_JACK OFF)
endif()
@@ -506,7 +524,7 @@ endif()
if(NOT APPLE)
option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" ON)
option(WITH_CYCLES_HIP_BINARIES "Build Cycles AMD HIP binaries" OFF)
set(CYCLES_HIP_BINARIES_ARCH gfx900 gfx906 gfx90c gfx902 gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1034 gfx1035 gfx1100 gfx1101 gfx1102 CACHE STRING "AMD HIP architectures to build binaries for")
set(CYCLES_HIP_BINARIES_ARCH gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1034 gfx1035 gfx1100 gfx1101 gfx1102 CACHE STRING "AMD HIP architectures to build binaries for")
mark_as_advanced(WITH_CYCLES_DEVICE_HIP)
mark_as_advanced(CYCLES_HIP_BINARIES_ARCH)
endif()
@@ -1223,13 +1241,6 @@ if(WITH_OPENGL)
add_definitions(-DWITH_OPENGL)
endif()
#-----------------------------------------------------------------------------
# Configure Vulkan.
if(WITH_VULKAN_BACKEND)
list(APPEND BLENDER_GL_LIBRARIES ${VULKAN_LIBRARIES})
endif()
# -----------------------------------------------------------------------------
# Configure Metal

View File

@@ -71,6 +71,13 @@ Static Source Code Checking
* check_mypy: Checks all Python scripts using mypy,
see: source/tools/check_source/check_mypy_config.py scripts which are included.
Documentation Checking
* check_wiki_file_structure:
Check the WIKI documentation for the source-tree's file structure
matches Blender's source-code.
See: https://wiki.blender.org/wiki/Source/File_Structure
Spell Checkers
This runs the spell checker from the developer tools repositor.
@@ -481,6 +488,10 @@ check_smatch: .FORCE
check_mypy: .FORCE
@$(PYTHON) "$(BLENDER_DIR)/source/tools/check_source/check_mypy.py"
check_wiki_file_structure: .FORCE
@PYTHONIOENCODING=utf_8 $(PYTHON) \
"$(BLENDER_DIR)/source/tools/check_wiki/check_wiki_file_structure.py"
check_spelling_py: .FORCE
@cd "$(BUILD_DIR)" ; \
PYTHONIOENCODING=utf_8 $(PYTHON) \

View File

@@ -24,7 +24,7 @@ Development
-----------
- [Build Instructions](https://wiki.blender.org/wiki/Building_Blender)
- [Code Review & Bug Tracker](https://developer.blender.org)
- [Code Review & Bug Tracker](https://projects.blender.org)
- [Developer Forum](https://devtalk.blender.org)
- [Developer Documentation](https://wiki.blender.org)

View File

@@ -2,7 +2,7 @@
# LLVM does not switch over to cpp17 until llvm 16 and building ealier versions with
# MSVC is leading to some crashes in ISPC. Switch back to their default on all platforms
# for now.
# for now.
string(REPLACE "-DCMAKE_CXX_STANDARD=17" " " DPCPP_CMAKE_FLAGS "${DEFAULT_CMAKE_FLAGS}")
if(WIN32)

View File

@@ -40,7 +40,8 @@ ExternalProject_Add(external_igc_llvm
${PATCH_CMD} -p 1 -d ${IGC_LLVM_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/clang/0004-OpenCL-support-cl_ext_float_atomics.patch &&
${PATCH_CMD} -p 1 -d ${IGC_LLVM_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/clang/0005-OpenCL-Add-cl_khr_integer_dot_product.patch &&
${PATCH_CMD} -p 1 -d ${IGC_LLVM_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/llvm/0001-Memory-leak-fix-for-Managed-Static-Mutex.patch &&
${PATCH_CMD} -p 1 -d ${IGC_LLVM_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/llvm/0002-Remove-repo-name-in-LLVM-IR.patch
${PATCH_CMD} -p 1 -d ${IGC_LLVM_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/llvm/0002-Remove-repo-name-in-LLVM-IR.patch &&
${PATCH_CMD} -p 1 -d ${IGC_LLVM_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/llvm/0003-Add-missing-include-limit-in-benchmark.patch
)
add_dependencies(
external_igc_llvm
@@ -55,9 +56,6 @@ ExternalProject_Add(external_igc_spirv_translator
CONFIGURE_COMMAND echo .
BUILD_COMMAND echo .
INSTALL_COMMAND echo .
PATCH_COMMAND ${PATCH_CMD} -p 1 -d ${IGC_SPIRV_TRANSLATOR_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/spirv/0001-update-SPIR-V-headers-for-SPV_INTEL_split_barrier.patch &&
${PATCH_CMD} -p 1 -d ${IGC_SPIRV_TRANSLATOR_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/spirv/0002-Add-support-for-split-barriers-extension-SPV_INTEL_s.patch &&
${PATCH_CMD} -p 1 -d ${IGC_SPIRV_TRANSLATOR_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/spirv/0003-Support-cl_bf16_conversions.patch
)
add_dependencies(
external_igc_spirv_translator

View File

@@ -42,7 +42,7 @@ endif()
# LLVM does not switch over to cpp17 until llvm 16 and building ealier versions with
# MSVC is leading to some crashes in ISPC. Switch back to their default on all platforms
# for now.
# for now.
string(REPLACE "-DCMAKE_CXX_STANDARD=17" " " LLVM_CMAKE_FLAGS "${DEFAULT_CMAKE_FLAGS}")
# short project name due to long filename issues on windows

View File

@@ -88,6 +88,19 @@ else()
export LDFLAGS=${PYTHON_LDFLAGS} &&
export PKG_CONFIG_PATH=${LIBDIR}/ffi/lib/pkgconfig)
# NOTE: untested on APPLE so far.
if(NOT APPLE)
set(PYTHON_CONFIGURE_EXTRA_ARGS
${PYTHON_CONFIGURE_EXTRA_ARGS}
# Used on most release Linux builds (Fedora for e.g.),
# increases build times noticeably with the benefit of a modest speedup at runtime.
--enable-optimizations
# While LTO is OK when building on the same system, it's incompatible across GCC versions,
# making it impractical for developers to build against, so keep it disabled.
# `--with-lto`
)
endif()
ExternalProject_Add(external_python
URL file://${PACKAGE_DIR}/${PYTHON_FILE}
DOWNLOAD_DIR ${DOWNLOAD_DIR}

View File

@@ -10,9 +10,9 @@ if(WIN32)
DOWNLOAD_DIR ${DOWNLOAD_DIR}
URL_HASH ${SSL_HASH_TYPE}=${SSL_HASH}
PREFIX ${BUILD_DIR}/ssl
CONFIGURE_COMMAND echo "."
BUILD_COMMAND echo "."
INSTALL_COMMAND echo "."
CONFIGURE_COMMAND echo "."
BUILD_COMMAND echo "."
INSTALL_COMMAND echo "."
INSTALL_DIR ${LIBDIR}/ssl
)
else()
@@ -46,4 +46,4 @@ else()
INSTALL_COMMAND ${CONFIGURE_ENV} && cd ${BUILD_DIR}/ssl/src/external_ssl/ && make install
INSTALL_DIR ${LIBDIR}/ssl
)
endif()
endif()

View File

@@ -668,9 +668,9 @@ set(SPIRV_HEADERS_FILE SPIR-V-Headers-${SPIRV_HEADERS_VERSION}.tar.gz)
# compiler, the versions used are taken from the following location
# https://github.com/intel/intel-graphics-compiler/releases
set(IGC_VERSION 1.0.12149.1)
set(IGC_VERSION 1.0.13064.7)
set(IGC_URI https://github.com/intel/intel-graphics-compiler/archive/refs/tags/igc-${IGC_VERSION}.tar.gz)
set(IGC_HASH 44f67f24e3bc5130f9f062533abf8154782a9d0a992bc19b498639a8521ae836)
set(IGC_HASH a929abd4cca2b293961ec0437ee4b3b2147bd3b2c8a3c423af78c0c359b2e5ae)
set(IGC_HASH_TYPE SHA256)
set(IGC_FILE igc-${IGC_VERSION}.tar.gz)
@@ -690,15 +690,15 @@ set(IGC_LLVM_FILE ${IGC_LLVM_VERSION}.tar.gz)
#
# WARNING WARNING WARNING
set(IGC_OPENCL_CLANG_VERSION 363a5262d8c7cff3fb28f3bdb5d85c8d7e91c1bb)
set(IGC_OPENCL_CLANG_VERSION ee31812ea8b89d08c2918f045d11a19bd33525c5)
set(IGC_OPENCL_CLANG_URI https://github.com/intel/opencl-clang/archive/${IGC_OPENCL_CLANG_VERSION}.tar.gz)
set(IGC_OPENCL_CLANG_HASH aa8cf72bb239722ce8ce44f79413c6887ecc8ca18477dd520aa5c4809756da9a)
set(IGC_OPENCL_CLANG_HASH 1db6735bbcfaa31e8a9ba39f121d6bafa806ea8919e9f56782d6aaa67771ddda)
set(IGC_OPENCL_CLANG_HASH_TYPE SHA256)
set(IGC_OPENCL_CLANG_FILE opencl-clang-${IGC_OPENCL_CLANG_VERSION}.tar.gz)
set(IGC_VCINTRINSICS_VERSION v0.5.0)
set(IGC_VCINTRINSICS_VERSION v0.11.0)
set(IGC_VCINTRINSICS_URI https://github.com/intel/vc-intrinsics/archive/refs/tags/${IGC_VCINTRINSICS_VERSION}.tar.gz)
set(IGC_VCINTRINSICS_HASH 70bb47c5e32173cf61514941e83ae7c7eb4485e6d2fca60cfa1f50d4f42c41f2)
set(IGC_VCINTRINSICS_HASH e5acd5626ce7fa6d41ce154c50ac805eda734ee66af94ef28e680ac2ad81bb9f)
set(IGC_VCINTRINSICS_HASH_TYPE SHA256)
set(IGC_VCINTRINSICS_FILE vc-intrinsics-${IGC_VCINTRINSICS_VERSION}.tar.gz)
@@ -714,9 +714,9 @@ set(IGC_SPIRV_TOOLS_HASH 6e19900e948944243024aedd0a201baf3854b377b9cc7a386553bc1
set(IGC_SPIRV_TOOLS_HASH_TYPE SHA256)
set(IGC_SPIRV_TOOLS_FILE SPIR-V-Tools-${IGC_SPIRV_TOOLS_VERSION}.tar.gz)
set(IGC_SPIRV_TRANSLATOR_VERSION a31ffaeef77e23d500b3ea3d35e0c42ff5648ad9)
set(IGC_SPIRV_TRANSLATOR_VERSION d739c01d65ec00dee64dedd40deed805216a7193)
set(IGC_SPIRV_TRANSLATOR_URI https://github.com/KhronosGroup/SPIRV-LLVM-Translator/archive/${IGC_SPIRV_TRANSLATOR_VERSION}.tar.gz)
set(IGC_SPIRV_TRANSLATOR_HASH 9e26c96a45341b8f8af521bacea20e752623346340addd02af95d669f6e89252)
set(IGC_SPIRV_TRANSLATOR_HASH ddc0cc9ccbe59dadeaf291012d59de142b2e9f2b124dbb634644d39daddaa13e)
set(IGC_SPIRV_TRANSLATOR_HASH_TYPE SHA256)
set(IGC_SPIRV_TRANSLATOR_FILE SPIR-V-Translator-${IGC_SPIRV_TRANSLATOR_VERSION}.tar.gz)
@@ -724,15 +724,15 @@ set(IGC_SPIRV_TRANSLATOR_FILE SPIR-V-Translator-${IGC_SPIRV_TRANSLATOR_VERSION}.
### Intel Graphics Compiler DEPS END ###
########################################
set(GMMLIB_VERSION intel-gmmlib-22.1.8)
set(GMMLIB_VERSION intel-gmmlib-22.3.0)
set(GMMLIB_URI https://github.com/intel/gmmlib/archive/refs/tags/${GMMLIB_VERSION}.tar.gz)
set(GMMLIB_HASH bf23e9a3742b4fb98c7666c9e9b29f3219e4b2fb4d831aaf4eed71f5e2d17368)
set(GMMLIB_HASH c1f33e1519edfc527127baeb0436b783430dfd256c643130169a3a71dc86aff9)
set(GMMLIB_HASH_TYPE SHA256)
set(GMMLIB_FILE ${GMMLIB_VERSION}.tar.gz)
set(OCLOC_VERSION 22.38.24278)
set(OCLOC_VERSION 22.49.25018.21)
set(OCLOC_URI https://github.com/intel/compute-runtime/archive/refs/tags/${OCLOC_VERSION}.tar.gz)
set(OCLOC_HASH db0c542fccd651e6404b15a74d46027f1ce0eda8dc9e25a40cbb6c0faef257ee)
set(OCLOC_HASH 92362dae08b503a34e5d3820ed284198c452bcd5e7504d90eb69887b20492c06)
set(OCLOC_HASH_TYPE SHA256)
set(OCLOC_FILE ocloc-${OCLOC_VERSION}.tar.gz)

View File

@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-or-later
if(WIN32)
set(XML2_EXTRA_ARGS
set(XML2_EXTRA_ARGS
-DLIBXML2_WITH_ZLIB=OFF
-DLIBXML2_WITH_LZMA=OFF
-DLIBXML2_WITH_PYTHON=OFF

View File

@@ -1,7 +1,7 @@
diff -Naur external_igc_opencl_clang.orig/CMakeLists.txt external_igc_opencl_clang/CMakeLists.txt
--- external_igc_opencl_clang.orig/CMakeLists.txt 2022-03-16 05:51:10 -0600
+++ external_igc_opencl_clang/CMakeLists.txt 2022-05-23 10:40:09 -0600
@@ -126,22 +126,24 @@
@@ -147,22 +147,24 @@
)
endif()

View File

@@ -24,7 +24,7 @@ SET(_moltenvk_SEARCH_DIRS
# FIXME: These finder modules typically don't use LIBDIR,
# this should be set by `./build_files/cmake/platform/` instead.
IF(DEFINED LIBDIR)
SET(_moltenvk_SEARCH_DIRS ${_moltenvk_SEARCH_DIRS} ${LIBDIR}/vulkan/MoltenVK)
SET(_moltenvk_SEARCH_DIRS ${_moltenvk_SEARCH_DIRS} ${LIBDIR}/moltenvk)
ENDIF()
FIND_PATH(MOLTENVK_INCLUDE_DIR

View File

@@ -0,0 +1,63 @@
# SPDX-License-Identifier: BSD-3-Clause
# Copyright 2023 Blender Foundation.
# - Find ShaderC libraries
# Find the ShaderC includes and libraries
# This module defines
# SHADERC_INCLUDE_DIRS, where to find MoltenVK headers, Set when
# SHADERC_INCLUDE_DIR is found.
# SHADERC_LIBRARIES, libraries to link against to use ShaderC.
# SHADERC_ROOT_DIR, The base directory to search for ShaderC.
# This can also be an environment variable.
# SHADERC_FOUND, If false, do not try to use ShaderC.
#
# If SHADERC_ROOT_DIR was defined in the environment, use it.
IF(NOT SHADERC_ROOT_DIR AND NOT $ENV{SHADERC_ROOT_DIR} STREQUAL "")
SET(SHADERC_ROOT_DIR $ENV{SHADERC_ROOT_DIR})
ENDIF()
SET(_shaderc_SEARCH_DIRS
${SHADERC_ROOT_DIR}
)
# FIXME: These finder modules typically don't use LIBDIR,
# this should be set by `./build_files/cmake/platform/` instead.
IF(DEFINED LIBDIR)
SET(_shaderc_SEARCH_DIRS ${_shaderc_SEARCH_DIRS} ${LIBDIR}/shaderc)
ENDIF()
FIND_PATH(SHADERC_INCLUDE_DIR
NAMES
shaderc/shaderc.h
HINTS
${_shaderc_SEARCH_DIRS}
PATH_SUFFIXES
include
)
FIND_LIBRARY(SHADERC_LIBRARY
NAMES
shaderc_combined
HINTS
${_shaderc_SEARCH_DIRS}
PATH_SUFFIXES
lib
)
# handle the QUIETLY and REQUIRED arguments and set SHADERC_FOUND to TRUE if
# all listed variables are TRUE
INCLUDE(FindPackageHandleStandardArgs)
FIND_PACKAGE_HANDLE_STANDARD_ARGS(ShaderC DEFAULT_MSG SHADERC_LIBRARY SHADERC_INCLUDE_DIR)
IF(SHADERC_FOUND)
SET(SHADERC_LIBRARIES ${SHADERC_LIBRARY})
SET(SHADERC_INCLUDE_DIRS ${SHADERC_INCLUDE_DIR})
ENDIF()
MARK_AS_ADVANCED(
SHADERC_INCLUDE_DIR
SHADERC_LIBRARY
)
UNSET(_shaderc_SEARCH_DIRS)

View File

@@ -0,0 +1,63 @@
# SPDX-License-Identifier: BSD-3-Clause
# Copyright 2023 Blender Foundation.
# - Find Vulkan libraries
# Find the Vulkan includes and libraries
# This module defines
# VULKAN_INCLUDE_DIRS, where to find Vulkan headers, Set when
# VULKAN_INCLUDE_DIR is found.
# VULKAN_LIBRARIES, libraries to link against to use Vulkan.
# VULKAN_ROOT_DIR, The base directory to search for Vulkan.
# This can also be an environment variable.
# VULKAN_FOUND, If false, do not try to use Vulkan.
#
# If VULKAN_ROOT_DIR was defined in the environment, use it.
IF(NOT VULKAN_ROOT_DIR AND NOT $ENV{VULKAN_ROOT_DIR} STREQUAL "")
SET(VULKAN_ROOT_DIR $ENV{VULKAN_ROOT_DIR})
ENDIF()
SET(_vulkan_SEARCH_DIRS
${VULKAN_ROOT_DIR}
)
# FIXME: These finder modules typically don't use LIBDIR,
# this should be set by `./build_files/cmake/platform/` instead.
IF(DEFINED LIBDIR)
SET(_vulkan_SEARCH_DIRS ${_vulkan_SEARCH_DIRS} ${LIBDIR}/vulkan)
ENDIF()
FIND_PATH(VULKAN_INCLUDE_DIR
NAMES
vulkan/vulkan.h
HINTS
${_vulkan_SEARCH_DIRS}
PATH_SUFFIXES
include
)
FIND_LIBRARY(VULKAN_LIBRARY
NAMES
vulkan
HINTS
${_vulkan_SEARCH_DIRS}
PATH_SUFFIXES
lib
)
# handle the QUIETLY and REQUIRED arguments and set VULKAN_FOUND to TRUE if
# all listed variables are TRUE
INCLUDE(FindPackageHandleStandardArgs)
FIND_PACKAGE_HANDLE_STANDARD_ARGS(Vulkan DEFAULT_MSG VULKAN_LIBRARY VULKAN_INCLUDE_DIR)
IF(VULKAN_FOUND)
SET(VULKAN_LIBRARIES ${VULKAN_LIBRARY})
SET(VULKAN_INCLUDE_DIRS ${VULKAN_INCLUDE_DIR})
ENDIF()
MARK_AS_ADVANCED(
VULKAN_INCLUDE_DIR
VULKAN_LIBRARY
)
UNSET(_vulkan_SEARCH_DIRS)

View File

@@ -23,19 +23,19 @@ if(EXISTS ${SOURCE_DIR}/.git)
if(MY_WC_BRANCH STREQUAL "HEAD")
# Detached HEAD, check whether commit hash is reachable
# in the master branch
# in the main branch
execute_process(COMMAND git rev-parse --short=12 HEAD
WORKING_DIRECTORY ${SOURCE_DIR}
OUTPUT_VARIABLE MY_WC_HASH
OUTPUT_STRIP_TRAILING_WHITESPACE)
execute_process(COMMAND git branch --list master blender-v* --contains ${MY_WC_HASH}
execute_process(COMMAND git branch --list main blender-v* --contains ${MY_WC_HASH}
WORKING_DIRECTORY ${SOURCE_DIR}
OUTPUT_VARIABLE _git_contains_check
OUTPUT_STRIP_TRAILING_WHITESPACE)
if(NOT _git_contains_check STREQUAL "")
set(MY_WC_BRANCH "master")
set(MY_WC_BRANCH "main")
else()
execute_process(COMMAND git show-ref --tags -d
WORKING_DIRECTORY ${SOURCE_DIR}
@@ -48,7 +48,7 @@ if(EXISTS ${SOURCE_DIR}/.git)
OUTPUT_STRIP_TRAILING_WHITESPACE)
if(_git_tag_hashes MATCHES "${_git_head_hash}")
set(MY_WC_BRANCH "master")
set(MY_WC_BRANCH "main")
else()
execute_process(COMMAND git branch --contains ${MY_WC_HASH}
WORKING_DIRECTORY ${SOURCE_DIR}

View File

@@ -11,11 +11,11 @@
mkdir ~/blender-git
cd ~/blender-git
git clone http://git.blender.org/blender.git
git clone https://projects.blender.org/blender/blender.git
cd blender
git submodule update --init --recursive
git submodule foreach git checkout master
git submodule foreach git pull --rebase origin master
git submodule foreach git checkout main
git submodule foreach git pull --rebase origin main
# create build dir
mkdir ~/blender-git/build-cmake
@@ -35,7 +35,7 @@ ln -s ~/blender-git/build-cmake/bin/blender ~/blender-git/blender/blender.bin
echo ""
echo "* Useful Commands *"
echo " Run Blender: ~/blender-git/blender/blender.bin"
echo " Update Blender: git pull --rebase; git submodule foreach git pull --rebase origin master"
echo " Update Blender: git pull --rebase; git submodule foreach git pull --rebase origin main"
echo " Reconfigure Blender: cd ~/blender-git/build-cmake ; cmake ."
echo " Build Blender: cd ~/blender-git/build-cmake ; make"
echo ""

View File

@@ -97,20 +97,8 @@ add_bundled_libraries(materialx/lib)
if(WITH_VULKAN_BACKEND)
find_package(MoltenVK REQUIRED)
if(EXISTS ${LIBDIR}/vulkan)
set(VULKAN_FOUND On)
set(VULKAN_ROOT_DIR ${LIBDIR}/vulkan/macOS)
set(VULKAN_INCLUDE_DIR ${VULKAN_ROOT_DIR}/include)
set(VULKAN_LIBRARY ${VULKAN_ROOT_DIR}/lib/libvulkan.1.dylib)
set(SHADERC_LIBRARY ${VULKAN_ROOT_DIR}/lib/libshaderc_combined.a)
set(VULKAN_INCLUDE_DIRS ${VULKAN_INCLUDE_DIR} ${MOLTENVK_INCLUDE_DIRS})
set(VULKAN_LIBRARIES ${VULKAN_LIBRARY} ${SHADERC_LIBRARY} ${MOLTENVK_LIBRARIES})
else()
message(WARNING "Vulkan SDK was not found, disabling WITH_VULKAN_BACKEND")
set(WITH_VULKAN_BACKEND OFF)
endif()
find_package(ShaderC REQUIRED)
find_package(Vulkan REQUIRED)
endif()
if(WITH_OPENSUBDIV)

View File

@@ -111,6 +111,7 @@ find_package_wrapper(Epoxy REQUIRED)
if(WITH_VULKAN_BACKEND)
find_package_wrapper(Vulkan REQUIRED)
find_package_wrapper(ShaderC REQUIRED)
endif()
function(check_freetype_for_brotli)

View File

@@ -5,16 +5,16 @@
update-code:
git:
submodules:
- branch: master
- branch: main
commit_id: HEAD
path: release/scripts/addons
- branch: master
- branch: main
commit_id: HEAD
path: release/scripts/addons_contrib
- branch: master
- branch: main
commit_id: HEAD
path: release/datafiles/locale
- branch: master
- branch: main
commit_id: HEAD
path: source/tools
svn:
@@ -63,7 +63,7 @@ buildbot:
optix:
version: '7.3.0'
ocloc:
version: '101.3430'
version: '101.4032'
cmake:
default:
version: any

View File

@@ -24,7 +24,7 @@ import os
import re
import platform
import string
import setuptools # type: ignore
import setuptools
import sys
from typing import (
@@ -58,7 +58,7 @@ Each Blender release supports one Python version, and the package is only compat
## Source Code
* [Releases](https://download.blender.org/source/)
* Repository: [git.blender.org/blender.git](https://git.blender.org/gitweb/gitweb.cgi/blender.git)
* Repository: [projects.blender.org/blender/blender.git](https://projects.blender.org/blender/blender)
## Credits
@@ -208,7 +208,7 @@ def main() -> None:
return paths
# Ensure this wheel is marked platform specific.
class BinaryDistribution(setuptools.dist.Distribution): # type: ignore
class BinaryDistribution(setuptools.dist.Distribution):
def has_ext_modules(self) -> bool:
return True

View File

@@ -13,10 +13,10 @@ import sys
import make_utils
from make_utils import call
# Parse arguments
# Parse arguments.
def parse_arguments():
def parse_arguments() -> argparse.Namespace:
parser = argparse.ArgumentParser()
parser.add_argument("--ctest-command", default="ctest")
parser.add_argument("--cmake-command", default="cmake")

View File

@@ -170,7 +170,7 @@ def git_update_skip(args: argparse.Namespace, check_remote_exists: bool = True)
return "rebase or merge in progress, complete it first"
# Abort if uncommitted changes.
changes = check_output([args.git_command, 'status', '--porcelain', '--untracked-files=no'])
changes = check_output([args.git_command, 'status', '--porcelain', '--untracked-files=no', '--ignore-submodules'])
if len(changes) != 0:
return "you have unstaged changes"
@@ -202,8 +202,8 @@ def submodules_update(
sys.exit(1)
# Update submodules to appropriate given branch,
# falling back to master if none is given and/or found in a sub-repository.
branch_fallback = "master"
# falling back to main if none is given and/or found in a sub-repository.
branch_fallback = "main"
if not branch:
branch = branch_fallback

View File

@@ -3,9 +3,9 @@ if NOT exist "%BLENDER_DIR%\source\tools\.git" (
if not "%GIT%" == "" (
"%GIT%" submodule update --init --recursive --progress
if errorlevel 1 goto FAIL
"%GIT%" submodule foreach git checkout master
"%GIT%" submodule foreach git checkout main
if errorlevel 1 goto FAIL
"%GIT%" submodule foreach git pull --rebase origin master
"%GIT%" submodule foreach git pull --rebase origin main
if errorlevel 1 goto FAIL
goto EOF
) else (

View File

@@ -37,7 +37,7 @@ def draw_callback_px(self, context):
# BLF drawing routine
font_id = font_info["font_id"]
blf.position(font_id, 2, 80, 0)
blf.size(font_id, 50, 72)
blf.size(font_id, 50)
blf.draw(font_id, "Hello World")

View File

@@ -1816,9 +1816,9 @@ def pyrna2sphinx(basepath):
# operators
def write_ops():
API_BASEURL = "https://developer.blender.org/diffusion/B/browse/master/release/scripts"
API_BASEURL_ADDON = "https://developer.blender.org/diffusion/BA"
API_BASEURL_ADDON_CONTRIB = "https://developer.blender.org/diffusion/BAC"
API_BASEURL = "https://projects.blender.org/blender/blender/src/branch/main/release/scripts"
API_BASEURL_ADDON = "https://projects.blender.org/blender/blender-addons"
API_BASEURL_ADDON_CONTRIB = "https://projects.blender.org/blender/blender-addons-contrib"
op_modules = {}
op = None

View File

@@ -156,7 +156,7 @@ var Popover = function() {
},
getNamed : function(v) {
$.each(all_versions, function(ix, title) {
if (ix === "master" || ix === "latest") {
if (ix === "master" || ix === "main" || ix === "latest") {
var m = title.match(/\d\.\d[\w\d\.]*/)[0];
if (parseFloat(m) == v) {
v = ix;

View File

@@ -1,5 +1,5 @@
Project: Blender
URL: https://git.blender.org/blender.git
URL: https://projects.blender.org/blender/blender.git
License: Apache 2.0
Upstream version: N/A
Local modifications: None

View File

@@ -7,6 +7,7 @@ set(INC
set(INC_SYS
${VULKAN_INCLUDE_DIRS}
${MOLTENVK_INCLUDE_DIRS}
)
set(SRC

View File

@@ -0,0 +1,15 @@
diff --git a/extern/vulkan_memory_allocator/vk_mem_alloc.h b/extern/vulkan_memory_allocator/vk_mem_alloc.h
index 60f572038c0..63a9994ba46 100644
--- a/extern/vulkan_memory_allocator/vk_mem_alloc.h
+++ b/extern/vulkan_memory_allocator/vk_mem_alloc.h
@@ -13371,8 +13371,8 @@ bool VmaDefragmentationContext_T::IncrementCounters(VkDeviceSize bytes)
// Early return when max found
if (++m_PassStats.allocationsMoved >= m_MaxPassAllocations || m_PassStats.bytesMoved >= m_MaxPassBytes)
{
- VMA_ASSERT(m_PassStats.allocationsMoved == m_MaxPassAllocations ||
- m_PassStats.bytesMoved == m_MaxPassBytes && "Exceeded maximal pass threshold!");
+ VMA_ASSERT((m_PassStats.allocationsMoved == m_MaxPassAllocations ||
+ m_PassStats.bytesMoved == m_MaxPassBytes) && "Exceeded maximal pass threshold!");
return true;
}
return false;

File diff suppressed because it is too large Load Diff

View File

@@ -12,6 +12,7 @@ from bpy.props import (
PointerProperty,
StringProperty,
)
from bpy.app.translations import pgettext_iface as iface_
from math import pi
@@ -1664,30 +1665,48 @@ class CyclesPreferences(bpy.types.AddonPreferences):
col.label(text="No compatible GPUs found for Cycles", icon='INFO')
if device_type == 'CUDA':
col.label(text="Requires NVIDIA GPU with compute capability 3.0", icon='BLANK1')
compute_capability = "3.0"
col.label(text=iface_("Requires NVIDIA GPU with compute capability %s") % compute_capability,
icon='BLANK1', translate=False)
elif device_type == 'OPTIX':
col.label(text="Requires NVIDIA GPU with compute capability 5.0", icon='BLANK1')
col.label(text="and NVIDIA driver version 470 or newer", icon='BLANK1')
compute_capability = "5.0"
driver_version = "470"
col.label(text=iface_("Requires NVIDIA GPU with compute capability %s") % compute_capability,
icon='BLANK1', translate=False)
col.label(text="and NVIDIA driver version %s or newer" % driver_version,
icon='BLANK1', translate=False)
elif device_type == 'HIP':
import sys
if sys.platform[:3] == "win":
col.label(text="Requires AMD GPU with Vega or RDNA architecture", icon='BLANK1')
col.label(text="and AMD Radeon Pro 21.Q4 driver or newer", icon='BLANK1')
driver_version = "21.Q4"
col.label(text="Requires AMD GPU with RDNA architecture", icon='BLANK1')
col.label(text=iface_("and AMD Radeon Pro %s driver or newer") % driver_version,
icon='BLANK1', translate=False)
elif sys.platform.startswith("linux"):
col.label(text="Requires AMD GPU with Vega or RDNA architecture", icon='BLANK1')
col.label(text="and AMD driver version 22.10 or newer", icon='BLANK1')
driver_version = "22.10"
col.label(text="Requires AMD GPU with RDNA architecture", icon='BLANK1')
col.label(text=iface_("and AMD driver version %s or newer") % driver_version, icon='BLANK1',
translate=False)
elif device_type == 'ONEAPI':
import sys
if sys.platform.startswith("win"):
driver_version = "101.4032"
col.label(text="Requires Intel GPU with Xe-HPG architecture", icon='BLANK1')
col.label(text="and Windows driver version 101.3430 or newer", icon='BLANK1')
col.label(text=iface_("and Windows driver version %s or newer") % driver_version,
icon='BLANK1', translate=False)
elif sys.platform.startswith("linux"):
driver_version = "1.3.24931"
col.label(text="Requires Intel GPU with Xe-HPG architecture and", icon='BLANK1')
col.label(text=" - intel-level-zero-gpu version 1.3.23904 or newer", icon='BLANK1')
col.label(text=iface_(" - intel-level-zero-gpu version %s or newer") % driver_version,
icon='BLANK1', translate=False)
col.label(text=" - oneAPI Level-Zero Loader", icon='BLANK1')
elif device_type == 'METAL':
col.label(text="Requires Apple Silicon with macOS 12.2 or newer", icon='BLANK1')
col.label(text="or AMD with macOS 12.3 or newer", icon='BLANK1')
silicon_mac_version = "12.2"
amd_mac_version = "12.3"
col.label(text=iface_("Requires Apple Silicon with macOS %s or newer") % silicon_mac_version,
icon='BLANK1', translate=False)
col.label(text=iface_("or AMD with macOS %s or newer") % amd_mac_version, icon='BLANK1',
translate=False)
return
for device in devices:
@@ -1723,12 +1742,21 @@ class CyclesPreferences(bpy.types.AddonPreferences):
if compute_device_type == 'METAL':
import platform
# MetalRT only works on Apple Silicon at present, pending argument encoding fixes on AMD
# Kernel specialization is only viable on Apple Silicon at present due to relative compilation speed
if platform.machine() == 'arm64':
import re
is_navi_2 = False
for device in devices:
if re.search(r"((RX)|(Pro)|(PRO))\s+W?6\d00X", device.name):
is_navi_2 = True
break
# MetalRT only works on Apple Silicon and Navi2.
is_arm64 = platform.machine() == 'arm64'
if is_arm64 or is_navi_2:
col = layout.column()
col.use_property_split = True
col.prop(self, "kernel_optimization_level")
# Kernel specialization is only supported on Apple Silicon
if is_arm64:
col.prop(self, "kernel_optimization_level")
col.prop(self, "use_metalrt")
def draw(self, context):

View File

@@ -48,6 +48,8 @@ void BlenderSync::sync_light(BL::Object &b_parent,
case BL::Light::type_SPOT: {
BL::SpotLight b_spot_light(b_light);
light->set_size(b_spot_light.shadow_soft_size());
light->set_axisu(transform_get_column(&tfm, 0));
light->set_axisv(transform_get_column(&tfm, 1));
light->set_light_type(LIGHT_SPOT);
light->set_spot_angle(b_spot_light.spot_size());
light->set_spot_smooth(b_spot_light.spot_blend());

View File

@@ -53,8 +53,12 @@ void CUDADevice::set_error(const string &error)
}
CUDADevice::CUDADevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
: Device(info, stats, profiler), texture_info(this, "texture_info", MEM_GLOBAL)
: GPUDevice(info, stats, profiler)
{
/* Verify that base class types can be used with specific backend types */
static_assert(sizeof(texMemObject) == sizeof(CUtexObject));
static_assert(sizeof(arrayMemObject) == sizeof(CUarray));
first_error = true;
cuDevId = info.num;
@@ -65,12 +69,6 @@ CUDADevice::CUDADevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
need_texture_info = false;
device_texture_headroom = 0;
device_working_headroom = 0;
move_texture_to_host = false;
map_host_limit = 0;
map_host_used = 0;
can_map_host = 0;
pitch_alignment = 0;
/* Initialize CUDA. */
@@ -91,8 +89,9 @@ CUDADevice::CUDADevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
/* CU_CTX_MAP_HOST for mapping host memory when out of device memory.
* CU_CTX_LMEM_RESIZE_TO_MAX for reserving local memory ahead of render,
* so we can predict which memory to map to host. */
cuda_assert(
cuDeviceGetAttribute(&can_map_host, CU_DEVICE_ATTRIBUTE_CAN_MAP_HOST_MEMORY, cuDevice));
int value;
cuda_assert(cuDeviceGetAttribute(&value, CU_DEVICE_ATTRIBUTE_CAN_MAP_HOST_MEMORY, cuDevice));
can_map_host = value != 0;
cuda_assert(cuDeviceGetAttribute(
&pitch_alignment, CU_DEVICE_ATTRIBUTE_TEXTURE_PITCH_ALIGNMENT, cuDevice));
@@ -499,311 +498,57 @@ void CUDADevice::reserve_local_memory(const uint kernel_features)
# endif
}
void CUDADevice::init_host_memory()
{
/* Limit amount of host mapped memory, because allocating too much can
* cause system instability. Leave at least half or 4 GB of system
* memory free, whichever is smaller. */
size_t default_limit = 4 * 1024 * 1024 * 1024LL;
size_t system_ram = system_physical_ram();
if (system_ram > 0) {
if (system_ram / 2 > default_limit) {
map_host_limit = system_ram - default_limit;
}
else {
map_host_limit = system_ram / 2;
}
}
else {
VLOG_WARNING << "Mapped host memory disabled, failed to get system RAM";
map_host_limit = 0;
}
/* Amount of device memory to keep is free after texture memory
* and working memory allocations respectively. We set the working
* memory limit headroom lower so that some space is left after all
* texture memory allocations. */
device_working_headroom = 32 * 1024 * 1024LL; // 32MB
device_texture_headroom = 128 * 1024 * 1024LL; // 128MB
VLOG_INFO << "Mapped host memory limit set to " << string_human_readable_number(map_host_limit)
<< " bytes. (" << string_human_readable_size(map_host_limit) << ")";
}
void CUDADevice::load_texture_info()
{
if (need_texture_info) {
/* Unset flag before copying, so this does not loop indefinitely if the copy below calls
* into 'move_textures_to_host' (which calls 'load_texture_info' again). */
need_texture_info = false;
texture_info.copy_to_device();
}
}
void CUDADevice::move_textures_to_host(size_t size, bool for_texture)
{
/* Break out of recursive call, which can happen when moving memory on a multi device. */
static bool any_device_moving_textures_to_host = false;
if (any_device_moving_textures_to_host) {
return;
}
/* Signal to reallocate textures in host memory only. */
move_texture_to_host = true;
while (size > 0) {
/* Find suitable memory allocation to move. */
device_memory *max_mem = NULL;
size_t max_size = 0;
bool max_is_image = false;
thread_scoped_lock lock(cuda_mem_map_mutex);
foreach (CUDAMemMap::value_type &pair, cuda_mem_map) {
device_memory &mem = *pair.first;
CUDAMem *cmem = &pair.second;
/* Can only move textures allocated on this device (and not those from peer devices).
* And need to ignore memory that is already on the host. */
if (!mem.is_resident(this) || cmem->use_mapped_host) {
continue;
}
bool is_texture = (mem.type == MEM_TEXTURE || mem.type == MEM_GLOBAL) &&
(&mem != &texture_info);
bool is_image = is_texture && (mem.data_height > 1);
/* Can't move this type of memory. */
if (!is_texture || cmem->array) {
continue;
}
/* For other textures, only move image textures. */
if (for_texture && !is_image) {
continue;
}
/* Try to move largest allocation, prefer moving images. */
if (is_image > max_is_image || (is_image == max_is_image && mem.device_size > max_size)) {
max_is_image = is_image;
max_size = mem.device_size;
max_mem = &mem;
}
}
lock.unlock();
/* Move to host memory. This part is mutex protected since
* multiple CUDA devices could be moving the memory. The
* first one will do it, and the rest will adopt the pointer. */
if (max_mem) {
VLOG_WORK << "Move memory from device to host: " << max_mem->name;
static thread_mutex move_mutex;
thread_scoped_lock lock(move_mutex);
any_device_moving_textures_to_host = true;
/* Potentially need to call back into multi device, so pointer mapping
* and peer devices are updated. This is also necessary since the device
* pointer may just be a key here, so cannot be accessed and freed directly.
* Unfortunately it does mean that memory is reallocated on all other
* devices as well, which is potentially dangerous when still in use (since
* a thread rendering on another devices would only be caught in this mutex
* if it so happens to do an allocation at the same time as well. */
max_mem->device_copy_to();
size = (max_size >= size) ? 0 : size - max_size;
any_device_moving_textures_to_host = false;
}
else {
break;
}
}
/* Unset flag before texture info is reloaded, since it should stay in device memory. */
move_texture_to_host = false;
/* Update texture info array with new pointers. */
load_texture_info();
}
CUDADevice::CUDAMem *CUDADevice::generic_alloc(device_memory &mem, size_t pitch_padding)
void CUDADevice::get_device_memory_info(size_t &total, size_t &free)
{
CUDAContextScope scope(this);
CUdeviceptr device_pointer = 0;
size_t size = mem.memory_size() + pitch_padding;
CUresult mem_alloc_result = CUDA_ERROR_OUT_OF_MEMORY;
const char *status = "";
/* First try allocating in device memory, respecting headroom. We make
* an exception for texture info. It is small and frequently accessed,
* so treat it as working memory.
*
* If there is not enough room for working memory, we will try to move
* textures to host memory, assuming the performance impact would have
* been worse for working memory. */
bool is_texture = (mem.type == MEM_TEXTURE || mem.type == MEM_GLOBAL) && (&mem != &texture_info);
bool is_image = is_texture && (mem.data_height > 1);
size_t headroom = (is_texture) ? device_texture_headroom : device_working_headroom;
size_t total = 0, free = 0;
cuMemGetInfo(&free, &total);
/* Move textures to host memory if needed. */
if (!move_texture_to_host && !is_image && (size + headroom) >= free && can_map_host) {
move_textures_to_host(size + headroom - free, is_texture);
cuMemGetInfo(&free, &total);
}
/* Allocate in device memory. */
if (!move_texture_to_host && (size + headroom) < free) {
mem_alloc_result = cuMemAlloc(&device_pointer, size);
if (mem_alloc_result == CUDA_SUCCESS) {
status = " in device memory";
}
}
/* Fall back to mapped host memory if needed and possible. */
void *shared_pointer = 0;
if (mem_alloc_result != CUDA_SUCCESS && can_map_host && mem.type != MEM_DEVICE_ONLY) {
if (mem.shared_pointer) {
/* Another device already allocated host memory. */
mem_alloc_result = CUDA_SUCCESS;
shared_pointer = mem.shared_pointer;
}
else if (map_host_used + size < map_host_limit) {
/* Allocate host memory ourselves. */
mem_alloc_result = cuMemHostAlloc(
&shared_pointer, size, CU_MEMHOSTALLOC_DEVICEMAP | CU_MEMHOSTALLOC_WRITECOMBINED);
assert((mem_alloc_result == CUDA_SUCCESS && shared_pointer != 0) ||
(mem_alloc_result != CUDA_SUCCESS && shared_pointer == 0));
}
if (mem_alloc_result == CUDA_SUCCESS) {
cuda_assert(cuMemHostGetDevicePointer_v2(&device_pointer, shared_pointer, 0));
map_host_used += size;
status = " in host memory";
}
}
if (mem_alloc_result != CUDA_SUCCESS) {
if (mem.type == MEM_DEVICE_ONLY) {
status = " failed, out of device memory";
set_error("System is out of GPU memory");
}
else {
status = " failed, out of device and host memory";
set_error("System is out of GPU and shared host memory");
}
}
if (mem.name) {
VLOG_WORK << "Buffer allocate: " << mem.name << ", "
<< string_human_readable_number(mem.memory_size()) << " bytes. ("
<< string_human_readable_size(mem.memory_size()) << ")" << status;
}
mem.device_pointer = (device_ptr)device_pointer;
mem.device_size = size;
stats.mem_alloc(size);
if (!mem.device_pointer) {
return NULL;
}
/* Insert into map of allocations. */
thread_scoped_lock lock(cuda_mem_map_mutex);
CUDAMem *cmem = &cuda_mem_map[&mem];
if (shared_pointer != 0) {
/* Replace host pointer with our host allocation. Only works if
* CUDA memory layout is the same and has no pitch padding. Also
* does not work if we move textures to host during a render,
* since other devices might be using the memory. */
if (!move_texture_to_host && pitch_padding == 0 && mem.host_pointer &&
mem.host_pointer != shared_pointer) {
memcpy(shared_pointer, mem.host_pointer, size);
/* A Call to device_memory::host_free() should be preceded by
* a call to device_memory::device_free() for host memory
* allocated by a device to be handled properly. Two exceptions
* are here and a call in OptiXDevice::generic_alloc(), where
* the current host memory can be assumed to be allocated by
* device_memory::host_alloc(), not by a device */
mem.host_free();
mem.host_pointer = shared_pointer;
}
mem.shared_pointer = shared_pointer;
mem.shared_counter++;
cmem->use_mapped_host = true;
}
else {
cmem->use_mapped_host = false;
}
return cmem;
}
void CUDADevice::generic_copy_to(device_memory &mem)
bool CUDADevice::alloc_device(void *&device_pointer, size_t size)
{
if (!mem.host_pointer || !mem.device_pointer) {
return;
}
CUDAContextScope scope(this);
/* If use_mapped_host of mem is false, the current device only uses device memory allocated by
* cuMemAlloc regardless of mem.host_pointer and mem.shared_pointer, and should copy data from
* mem.host_pointer. */
thread_scoped_lock lock(cuda_mem_map_mutex);
if (!cuda_mem_map[&mem].use_mapped_host || mem.host_pointer != mem.shared_pointer) {
const CUDAContextScope scope(this);
cuda_assert(
cuMemcpyHtoD((CUdeviceptr)mem.device_pointer, mem.host_pointer, mem.memory_size()));
}
CUresult mem_alloc_result = cuMemAlloc((CUdeviceptr *)&device_pointer, size);
return mem_alloc_result == CUDA_SUCCESS;
}
void CUDADevice::generic_free(device_memory &mem)
void CUDADevice::free_device(void *device_pointer)
{
if (mem.device_pointer) {
CUDAContextScope scope(this);
thread_scoped_lock lock(cuda_mem_map_mutex);
DCHECK(cuda_mem_map.find(&mem) != cuda_mem_map.end());
const CUDAMem &cmem = cuda_mem_map[&mem];
CUDAContextScope scope(this);
/* If cmem.use_mapped_host is true, reference counting is used
* to safely free a mapped host memory. */
cuda_assert(cuMemFree((CUdeviceptr)device_pointer));
}
if (cmem.use_mapped_host) {
assert(mem.shared_pointer);
if (mem.shared_pointer) {
assert(mem.shared_counter > 0);
if (--mem.shared_counter == 0) {
if (mem.host_pointer == mem.shared_pointer) {
mem.host_pointer = 0;
}
cuMemFreeHost(mem.shared_pointer);
mem.shared_pointer = 0;
}
}
map_host_used -= mem.device_size;
}
else {
/* Free device memory. */
cuda_assert(cuMemFree(mem.device_pointer));
}
bool CUDADevice::alloc_host(void *&shared_pointer, size_t size)
{
CUDAContextScope scope(this);
stats.mem_free(mem.device_size);
mem.device_pointer = 0;
mem.device_size = 0;
CUresult mem_alloc_result = cuMemHostAlloc(
&shared_pointer, size, CU_MEMHOSTALLOC_DEVICEMAP | CU_MEMHOSTALLOC_WRITECOMBINED);
return mem_alloc_result == CUDA_SUCCESS;
}
cuda_mem_map.erase(cuda_mem_map.find(&mem));
}
void CUDADevice::free_host(void *shared_pointer)
{
CUDAContextScope scope(this);
cuMemFreeHost(shared_pointer);
}
bool CUDADevice::transform_host_pointer(void *&device_pointer, void *&shared_pointer)
{
CUDAContextScope scope(this);
cuda_assert(cuMemHostGetDevicePointer_v2((CUdeviceptr *)&device_pointer, shared_pointer, 0));
return true;
}
void CUDADevice::copy_host_to_device(void *device_pointer, void *host_pointer, size_t size)
{
const CUDAContextScope scope(this);
cuda_assert(cuMemcpyHtoD((CUdeviceptr)device_pointer, host_pointer, size));
}
void CUDADevice::mem_alloc(device_memory &mem)
@@ -868,8 +613,8 @@ void CUDADevice::mem_zero(device_memory &mem)
/* If use_mapped_host of mem is false, mem.device_pointer currently refers to device memory
* regardless of mem.host_pointer and mem.shared_pointer. */
thread_scoped_lock lock(cuda_mem_map_mutex);
if (!cuda_mem_map[&mem].use_mapped_host || mem.host_pointer != mem.shared_pointer) {
thread_scoped_lock lock(device_mem_map_mutex);
if (!device_mem_map[&mem].use_mapped_host || mem.host_pointer != mem.shared_pointer) {
const CUDAContextScope scope(this);
cuda_assert(cuMemsetD8((CUdeviceptr)mem.device_pointer, 0, mem.memory_size()));
}
@@ -994,19 +739,19 @@ void CUDADevice::tex_alloc(device_texture &mem)
return;
}
CUDAMem *cmem = NULL;
Mem *cmem = NULL;
CUarray array_3d = NULL;
size_t src_pitch = mem.data_width * dsize * mem.data_elements;
size_t dst_pitch = src_pitch;
if (!mem.is_resident(this)) {
thread_scoped_lock lock(cuda_mem_map_mutex);
cmem = &cuda_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
cmem = &device_mem_map[&mem];
cmem->texobject = 0;
if (mem.data_depth > 1) {
array_3d = (CUarray)mem.device_pointer;
cmem->array = array_3d;
cmem->array = reinterpret_cast<arrayMemObject>(array_3d);
}
else if (mem.data_height > 0) {
dst_pitch = align_up(src_pitch, pitch_alignment);
@@ -1050,10 +795,10 @@ void CUDADevice::tex_alloc(device_texture &mem)
mem.device_size = size;
stats.mem_alloc(size);
thread_scoped_lock lock(cuda_mem_map_mutex);
cmem = &cuda_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
cmem = &device_mem_map[&mem];
cmem->texobject = 0;
cmem->array = array_3d;
cmem->array = reinterpret_cast<arrayMemObject>(array_3d);
}
else if (mem.data_height > 0) {
/* 2D texture, using pitch aligned linear memory. */
@@ -1137,8 +882,8 @@ void CUDADevice::tex_alloc(device_texture &mem)
texDesc.filterMode = filter_mode;
texDesc.flags = CU_TRSF_NORMALIZED_COORDINATES;
thread_scoped_lock lock(cuda_mem_map_mutex);
cmem = &cuda_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
cmem = &device_mem_map[&mem];
cuda_assert(cuTexObjectCreate(&cmem->texobject, &resDesc, &texDesc, NULL));
@@ -1153,9 +898,9 @@ void CUDADevice::tex_free(device_texture &mem)
{
if (mem.device_pointer) {
CUDAContextScope scope(this);
thread_scoped_lock lock(cuda_mem_map_mutex);
DCHECK(cuda_mem_map.find(&mem) != cuda_mem_map.end());
const CUDAMem &cmem = cuda_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
DCHECK(device_mem_map.find(&mem) != device_mem_map.end());
const Mem &cmem = device_mem_map[&mem];
if (cmem.texobject) {
/* Free bindless texture. */
@@ -1164,16 +909,16 @@ void CUDADevice::tex_free(device_texture &mem)
if (!mem.is_resident(this)) {
/* Do not free memory here, since it was allocated on a different device. */
cuda_mem_map.erase(cuda_mem_map.find(&mem));
device_mem_map.erase(device_mem_map.find(&mem));
}
else if (cmem.array) {
/* Free array. */
cuArrayDestroy(cmem.array);
cuArrayDestroy(reinterpret_cast<CUarray>(cmem.array));
stats.mem_free(mem.device_size);
mem.device_pointer = 0;
mem.device_size = 0;
cuda_mem_map.erase(cuda_mem_map.find(&mem));
device_mem_map.erase(device_mem_map.find(&mem));
}
else {
lock.unlock();

View File

@@ -21,7 +21,7 @@ CCL_NAMESPACE_BEGIN
class DeviceQueue;
class CUDADevice : public Device {
class CUDADevice : public GPUDevice {
friend class CUDAContextScope;
@@ -29,36 +29,11 @@ class CUDADevice : public Device {
CUdevice cuDevice;
CUcontext cuContext;
CUmodule cuModule;
size_t device_texture_headroom;
size_t device_working_headroom;
bool move_texture_to_host;
size_t map_host_used;
size_t map_host_limit;
int can_map_host;
int pitch_alignment;
int cuDevId;
int cuDevArchitecture;
bool first_error;
struct CUDAMem {
CUDAMem() : texobject(0), array(0), use_mapped_host(false)
{
}
CUtexObject texobject;
CUarray array;
/* If true, a mapped host memory in shared_pointer is being used. */
bool use_mapped_host;
};
typedef map<device_memory *, CUDAMem> CUDAMemMap;
CUDAMemMap cuda_mem_map;
thread_mutex cuda_mem_map_mutex;
/* Bindless Textures */
device_vector<TextureInfo> texture_info;
bool need_texture_info;
CUDADeviceKernels kernels;
static bool have_precompiled_kernels();
@@ -88,17 +63,13 @@ class CUDADevice : public Device {
void reserve_local_memory(const uint kernel_features);
void init_host_memory();
void load_texture_info();
void move_textures_to_host(size_t size, bool for_texture);
CUDAMem *generic_alloc(device_memory &mem, size_t pitch_padding = 0);
void generic_copy_to(device_memory &mem);
void generic_free(device_memory &mem);
virtual void get_device_memory_info(size_t &total, size_t &free) override;
virtual bool alloc_device(void *&device_pointer, size_t size) override;
virtual void free_device(void *device_pointer) override;
virtual bool alloc_host(void *&shared_pointer, size_t size) override;
virtual void free_host(void *shared_pointer) override;
virtual bool transform_host_pointer(void *&device_pointer, void *&shared_pointer) override;
virtual void copy_host_to_device(void *device_pointer, void *host_pointer, size_t size) override;
void mem_alloc(device_memory &mem) override;

View File

@@ -452,6 +452,320 @@ void *Device::get_cpu_osl_memory()
return nullptr;
}
GPUDevice::~GPUDevice() noexcept(false)
{
}
bool GPUDevice::load_texture_info()
{
if (need_texture_info) {
/* Unset flag before copying, so this does not loop indefinitely if the copy below calls
* into 'move_textures_to_host' (which calls 'load_texture_info' again). */
need_texture_info = false;
texture_info.copy_to_device();
return true;
}
else {
return false;
}
}
void GPUDevice::init_host_memory(size_t preferred_texture_headroom,
size_t preferred_working_headroom)
{
/* Limit amount of host mapped memory, because allocating too much can
* cause system instability. Leave at least half or 4 GB of system
* memory free, whichever is smaller. */
size_t default_limit = 4 * 1024 * 1024 * 1024LL;
size_t system_ram = system_physical_ram();
if (system_ram > 0) {
if (system_ram / 2 > default_limit) {
map_host_limit = system_ram - default_limit;
}
else {
map_host_limit = system_ram / 2;
}
}
else {
VLOG_WARNING << "Mapped host memory disabled, failed to get system RAM";
map_host_limit = 0;
}
/* Amount of device memory to keep free after texture memory
* and working memory allocations respectively. We set the working
* memory limit headroom lower than the working one so there
* is space left for it. */
device_working_headroom = preferred_working_headroom > 0 ? preferred_working_headroom :
32 * 1024 * 1024LL; // 32MB
device_texture_headroom = preferred_texture_headroom > 0 ? preferred_texture_headroom :
128 * 1024 * 1024LL; // 128MB
VLOG_INFO << "Mapped host memory limit set to " << string_human_readable_number(map_host_limit)
<< " bytes. (" << string_human_readable_size(map_host_limit) << ")";
}
void GPUDevice::move_textures_to_host(size_t size, bool for_texture)
{
/* Break out of recursive call, which can happen when moving memory on a multi device. */
static bool any_device_moving_textures_to_host = false;
if (any_device_moving_textures_to_host) {
return;
}
/* Signal to reallocate textures in host memory only. */
move_texture_to_host = true;
while (size > 0) {
/* Find suitable memory allocation to move. */
device_memory *max_mem = NULL;
size_t max_size = 0;
bool max_is_image = false;
thread_scoped_lock lock(device_mem_map_mutex);
foreach (MemMap::value_type &pair, device_mem_map) {
device_memory &mem = *pair.first;
Mem *cmem = &pair.second;
/* Can only move textures allocated on this device (and not those from peer devices).
* And need to ignore memory that is already on the host. */
if (!mem.is_resident(this) || cmem->use_mapped_host) {
continue;
}
bool is_texture = (mem.type == MEM_TEXTURE || mem.type == MEM_GLOBAL) &&
(&mem != &texture_info);
bool is_image = is_texture && (mem.data_height > 1);
/* Can't move this type of memory. */
if (!is_texture || cmem->array) {
continue;
}
/* For other textures, only move image textures. */
if (for_texture && !is_image) {
continue;
}
/* Try to move largest allocation, prefer moving images. */
if (is_image > max_is_image || (is_image == max_is_image && mem.device_size > max_size)) {
max_is_image = is_image;
max_size = mem.device_size;
max_mem = &mem;
}
}
lock.unlock();
/* Move to host memory. This part is mutex protected since
* multiple backend devices could be moving the memory. The
* first one will do it, and the rest will adopt the pointer. */
if (max_mem) {
VLOG_WORK << "Move memory from device to host: " << max_mem->name;
static thread_mutex move_mutex;
thread_scoped_lock lock(move_mutex);
any_device_moving_textures_to_host = true;
/* Potentially need to call back into multi device, so pointer mapping
* and peer devices are updated. This is also necessary since the device
* pointer may just be a key here, so cannot be accessed and freed directly.
* Unfortunately it does mean that memory is reallocated on all other
* devices as well, which is potentially dangerous when still in use (since
* a thread rendering on another devices would only be caught in this mutex
* if it so happens to do an allocation at the same time as well. */
max_mem->device_copy_to();
size = (max_size >= size) ? 0 : size - max_size;
any_device_moving_textures_to_host = false;
}
else {
break;
}
}
/* Unset flag before texture info is reloaded, since it should stay in device memory. */
move_texture_to_host = false;
/* Update texture info array with new pointers. */
load_texture_info();
}
GPUDevice::Mem *GPUDevice::generic_alloc(device_memory &mem, size_t pitch_padding)
{
void *device_pointer = 0;
size_t size = mem.memory_size() + pitch_padding;
bool mem_alloc_result = false;
const char *status = "";
/* First try allocating in device memory, respecting headroom. We make
* an exception for texture info. It is small and frequently accessed,
* so treat it as working memory.
*
* If there is not enough room for working memory, we will try to move
* textures to host memory, assuming the performance impact would have
* been worse for working memory. */
bool is_texture = (mem.type == MEM_TEXTURE || mem.type == MEM_GLOBAL) && (&mem != &texture_info);
bool is_image = is_texture && (mem.data_height > 1);
size_t headroom = (is_texture) ? device_texture_headroom : device_working_headroom;
size_t total = 0, free = 0;
get_device_memory_info(total, free);
/* Move textures to host memory if needed. */
if (!move_texture_to_host && !is_image && (size + headroom) >= free && can_map_host) {
move_textures_to_host(size + headroom - free, is_texture);
get_device_memory_info(total, free);
}
/* Allocate in device memory. */
if (!move_texture_to_host && (size + headroom) < free) {
mem_alloc_result = alloc_device(device_pointer, size);
if (mem_alloc_result) {
device_mem_in_use += size;
status = " in device memory";
}
}
/* Fall back to mapped host memory if needed and possible. */
void *shared_pointer = 0;
if (!mem_alloc_result && can_map_host && mem.type != MEM_DEVICE_ONLY) {
if (mem.shared_pointer) {
/* Another device already allocated host memory. */
mem_alloc_result = true;
shared_pointer = mem.shared_pointer;
}
else if (map_host_used + size < map_host_limit) {
/* Allocate host memory ourselves. */
mem_alloc_result = alloc_host(shared_pointer, size);
assert((mem_alloc_result && shared_pointer != 0) ||
(!mem_alloc_result && shared_pointer == 0));
}
if (mem_alloc_result) {
assert(transform_host_pointer(device_pointer, shared_pointer));
map_host_used += size;
status = " in host memory";
}
}
if (!mem_alloc_result) {
if (mem.type == MEM_DEVICE_ONLY) {
status = " failed, out of device memory";
set_error("System is out of GPU memory");
}
else {
status = " failed, out of device and host memory";
set_error("System is out of GPU and shared host memory");
}
}
if (mem.name) {
VLOG_WORK << "Buffer allocate: " << mem.name << ", "
<< string_human_readable_number(mem.memory_size()) << " bytes. ("
<< string_human_readable_size(mem.memory_size()) << ")" << status;
}
mem.device_pointer = (device_ptr)device_pointer;
mem.device_size = size;
stats.mem_alloc(size);
if (!mem.device_pointer) {
return NULL;
}
/* Insert into map of allocations. */
thread_scoped_lock lock(device_mem_map_mutex);
Mem *cmem = &device_mem_map[&mem];
if (shared_pointer != 0) {
/* Replace host pointer with our host allocation. Only works if
* memory layout is the same and has no pitch padding. Also
* does not work if we move textures to host during a render,
* since other devices might be using the memory. */
if (!move_texture_to_host && pitch_padding == 0 && mem.host_pointer &&
mem.host_pointer != shared_pointer) {
memcpy(shared_pointer, mem.host_pointer, size);
/* A Call to device_memory::host_free() should be preceded by
* a call to device_memory::device_free() for host memory
* allocated by a device to be handled properly. Two exceptions
* are here and a call in OptiXDevice::generic_alloc(), where
* the current host memory can be assumed to be allocated by
* device_memory::host_alloc(), not by a device */
mem.host_free();
mem.host_pointer = shared_pointer;
}
mem.shared_pointer = shared_pointer;
mem.shared_counter++;
cmem->use_mapped_host = true;
}
else {
cmem->use_mapped_host = false;
}
return cmem;
}
void GPUDevice::generic_free(device_memory &mem)
{
if (mem.device_pointer) {
thread_scoped_lock lock(device_mem_map_mutex);
DCHECK(device_mem_map.find(&mem) != device_mem_map.end());
const Mem &cmem = device_mem_map[&mem];
/* If cmem.use_mapped_host is true, reference counting is used
* to safely free a mapped host memory. */
if (cmem.use_mapped_host) {
assert(mem.shared_pointer);
if (mem.shared_pointer) {
assert(mem.shared_counter > 0);
if (--mem.shared_counter == 0) {
if (mem.host_pointer == mem.shared_pointer) {
mem.host_pointer = 0;
}
free_host(mem.shared_pointer);
mem.shared_pointer = 0;
}
}
map_host_used -= mem.device_size;
}
else {
/* Free device memory. */
free_device((void *)mem.device_pointer);
device_mem_in_use -= mem.device_size;
}
stats.mem_free(mem.device_size);
mem.device_pointer = 0;
mem.device_size = 0;
device_mem_map.erase(device_mem_map.find(&mem));
}
}
void GPUDevice::generic_copy_to(device_memory &mem)
{
if (!mem.host_pointer || !mem.device_pointer) {
return;
}
/* If use_mapped_host of mem is false, the current device only uses device memory allocated by
* backend device allocation regardless of mem.host_pointer and mem.shared_pointer, and should
* copy data from mem.host_pointer. */
thread_scoped_lock lock(device_mem_map_mutex);
if (!device_mem_map[&mem].use_mapped_host || mem.host_pointer != mem.shared_pointer) {
copy_host_to_device((void *)mem.device_pointer, mem.host_pointer, mem.memory_size());
}
}
/* DeviceInfo */
CCL_NAMESPACE_END

View File

@@ -309,6 +309,93 @@ class Device {
static uint devices_initialized_mask;
};
/* Device, which is GPU, with some common functionality for GPU backends */
class GPUDevice : public Device {
protected:
GPUDevice(const DeviceInfo &info_, Stats &stats_, Profiler &profiler_)
: Device(info_, stats_, profiler_),
texture_info(this, "texture_info", MEM_GLOBAL),
need_texture_info(false),
can_map_host(false),
map_host_used(0),
map_host_limit(0),
device_texture_headroom(0),
device_working_headroom(0),
device_mem_map(),
device_mem_map_mutex(),
move_texture_to_host(false),
device_mem_in_use(0)
{
}
public:
virtual ~GPUDevice() noexcept(false);
/* For GPUs that can use bindless textures in some way or another. */
device_vector<TextureInfo> texture_info;
bool need_texture_info;
/* Returns true if the texture info was copied to the device (meaning, some more
* re-initialization might be needed). */
virtual bool load_texture_info();
protected:
/* Memory allocation, only accessed through device_memory. */
friend class device_memory;
bool can_map_host;
size_t map_host_used;
size_t map_host_limit;
size_t device_texture_headroom;
size_t device_working_headroom;
typedef unsigned long long texMemObject;
typedef unsigned long long arrayMemObject;
struct Mem {
Mem() : texobject(0), array(0), use_mapped_host(false)
{
}
texMemObject texobject;
arrayMemObject array;
/* If true, a mapped host memory in shared_pointer is being used. */
bool use_mapped_host;
};
typedef map<device_memory *, Mem> MemMap;
MemMap device_mem_map;
thread_mutex device_mem_map_mutex;
bool move_texture_to_host;
/* Simple counter which will try to track amount of used device memory */
size_t device_mem_in_use;
virtual void init_host_memory(size_t preferred_texture_headroom = 0,
size_t preferred_working_headroom = 0);
virtual void move_textures_to_host(size_t size, bool for_texture);
/* Allocation, deallocation and copy functions, with corresponding
* support of device/host allocations. */
virtual GPUDevice::Mem *generic_alloc(device_memory &mem, size_t pitch_padding = 0);
virtual void generic_free(device_memory &mem);
virtual void generic_copy_to(device_memory &mem);
/* total - amount of device memory, free - amount of available device memory */
virtual void get_device_memory_info(size_t &total, size_t &free) = 0;
virtual bool alloc_device(void *&device_pointer, size_t size) = 0;
virtual void free_device(void *device_pointer) = 0;
virtual bool alloc_host(void *&shared_pointer, size_t size) = 0;
virtual void free_host(void *shared_pointer) = 0;
/* This function should return device pointer corresponding to shared pointer, which
* is host buffer, allocated in `alloc_host`. The function should `true`, if such
* address transformation is possible and `false` otherwise. */
virtual bool transform_host_pointer(void *&device_pointer, void *&shared_pointer) = 0;
virtual void copy_host_to_device(void *device_pointer, void *host_pointer, size_t size) = 0;
};
CCL_NAMESPACE_END
#endif /* __DEVICE_H__ */

View File

@@ -53,8 +53,12 @@ void HIPDevice::set_error(const string &error)
}
HIPDevice::HIPDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
: Device(info, stats, profiler), texture_info(this, "texture_info", MEM_GLOBAL)
: GPUDevice(info, stats, profiler)
{
/* Verify that base class types can be used with specific backend types */
static_assert(sizeof(texMemObject) == sizeof(hipTextureObject_t));
static_assert(sizeof(arrayMemObject) == sizeof(hArray));
first_error = true;
hipDevId = info.num;
@@ -65,12 +69,6 @@ HIPDevice::HIPDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
need_texture_info = false;
device_texture_headroom = 0;
device_working_headroom = 0;
move_texture_to_host = false;
map_host_limit = 0;
map_host_used = 0;
can_map_host = 0;
pitch_alignment = 0;
/* Initialize HIP. */
@@ -91,7 +89,9 @@ HIPDevice::HIPDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
/* hipDeviceMapHost for mapping host memory when out of device memory.
* hipDeviceLmemResizeToMax for reserving local memory ahead of render,
* so we can predict which memory to map to host. */
hip_assert(hipDeviceGetAttribute(&can_map_host, hipDeviceAttributeCanMapHostMemory, hipDevice));
int value;
hip_assert(hipDeviceGetAttribute(&value, hipDeviceAttributeCanMapHostMemory, hipDevice));
can_map_host = value != 0;
hip_assert(
hipDeviceGetAttribute(&pitch_alignment, hipDeviceAttributeTexturePitchAlignment, hipDevice));
@@ -460,305 +460,58 @@ void HIPDevice::reserve_local_memory(const uint kernel_features)
# endif
}
void HIPDevice::init_host_memory()
{
/* Limit amount of host mapped memory, because allocating too much can
* cause system instability. Leave at least half or 4 GB of system
* memory free, whichever is smaller. */
size_t default_limit = 4 * 1024 * 1024 * 1024LL;
size_t system_ram = system_physical_ram();
if (system_ram > 0) {
if (system_ram / 2 > default_limit) {
map_host_limit = system_ram - default_limit;
}
else {
map_host_limit = system_ram / 2;
}
}
else {
VLOG_WARNING << "Mapped host memory disabled, failed to get system RAM";
map_host_limit = 0;
}
/* Amount of device memory to keep is free after texture memory
* and working memory allocations respectively. We set the working
* memory limit headroom lower so that some space is left after all
* texture memory allocations. */
device_working_headroom = 32 * 1024 * 1024LL; // 32MB
device_texture_headroom = 128 * 1024 * 1024LL; // 128MB
VLOG_INFO << "Mapped host memory limit set to " << string_human_readable_number(map_host_limit)
<< " bytes. (" << string_human_readable_size(map_host_limit) << ")";
}
void HIPDevice::load_texture_info()
{
if (need_texture_info) {
/* Unset flag before copying, so this does not loop indefinitely if the copy below calls
* into 'move_textures_to_host' (which calls 'load_texture_info' again). */
need_texture_info = false;
texture_info.copy_to_device();
}
}
void HIPDevice::move_textures_to_host(size_t size, bool for_texture)
{
/* Break out of recursive call, which can happen when moving memory on a multi device. */
static bool any_device_moving_textures_to_host = false;
if (any_device_moving_textures_to_host) {
return;
}
/* Signal to reallocate textures in host memory only. */
move_texture_to_host = true;
while (size > 0) {
/* Find suitable memory allocation to move. */
device_memory *max_mem = NULL;
size_t max_size = 0;
bool max_is_image = false;
thread_scoped_lock lock(hip_mem_map_mutex);
foreach (HIPMemMap::value_type &pair, hip_mem_map) {
device_memory &mem = *pair.first;
HIPMem *cmem = &pair.second;
/* Can only move textures allocated on this device (and not those from peer devices).
* And need to ignore memory that is already on the host. */
if (!mem.is_resident(this) || cmem->use_mapped_host) {
continue;
}
bool is_texture = (mem.type == MEM_TEXTURE || mem.type == MEM_GLOBAL) &&
(&mem != &texture_info);
bool is_image = is_texture && (mem.data_height > 1);
/* Can't move this type of memory. */
if (!is_texture || cmem->array) {
continue;
}
/* For other textures, only move image textures. */
if (for_texture && !is_image) {
continue;
}
/* Try to move largest allocation, prefer moving images. */
if (is_image > max_is_image || (is_image == max_is_image && mem.device_size > max_size)) {
max_is_image = is_image;
max_size = mem.device_size;
max_mem = &mem;
}
}
lock.unlock();
/* Move to host memory. This part is mutex protected since
* multiple HIP devices could be moving the memory. The
* first one will do it, and the rest will adopt the pointer. */
if (max_mem) {
VLOG_WORK << "Move memory from device to host: " << max_mem->name;
static thread_mutex move_mutex;
thread_scoped_lock lock(move_mutex);
any_device_moving_textures_to_host = true;
/* Potentially need to call back into multi device, so pointer mapping
* and peer devices are updated. This is also necessary since the device
* pointer may just be a key here, so cannot be accessed and freed directly.
* Unfortunately it does mean that memory is reallocated on all other
* devices as well, which is potentially dangerous when still in use (since
* a thread rendering on another devices would only be caught in this mutex
* if it so happens to do an allocation at the same time as well. */
max_mem->device_copy_to();
size = (max_size >= size) ? 0 : size - max_size;
any_device_moving_textures_to_host = false;
}
else {
break;
}
}
/* Unset flag before texture info is reloaded, since it should stay in device memory. */
move_texture_to_host = false;
/* Update texture info array with new pointers. */
load_texture_info();
}
HIPDevice::HIPMem *HIPDevice::generic_alloc(device_memory &mem, size_t pitch_padding)
void HIPDevice::get_device_memory_info(size_t &total, size_t &free)
{
HIPContextScope scope(this);
hipDeviceptr_t device_pointer = 0;
size_t size = mem.memory_size() + pitch_padding;
hipError_t mem_alloc_result = hipErrorOutOfMemory;
const char *status = "";
/* First try allocating in device memory, respecting headroom. We make
* an exception for texture info. It is small and frequently accessed,
* so treat it as working memory.
*
* If there is not enough room for working memory, we will try to move
* textures to host memory, assuming the performance impact would have
* been worse for working memory. */
bool is_texture = (mem.type == MEM_TEXTURE || mem.type == MEM_GLOBAL) && (&mem != &texture_info);
bool is_image = is_texture && (mem.data_height > 1);
size_t headroom = (is_texture) ? device_texture_headroom : device_working_headroom;
size_t total = 0, free = 0;
hipMemGetInfo(&free, &total);
/* Move textures to host memory if needed. */
if (!move_texture_to_host && !is_image && (size + headroom) >= free && can_map_host) {
move_textures_to_host(size + headroom - free, is_texture);
hipMemGetInfo(&free, &total);
}
/* Allocate in device memory. */
if (!move_texture_to_host && (size + headroom) < free) {
mem_alloc_result = hipMalloc(&device_pointer, size);
if (mem_alloc_result == hipSuccess) {
status = " in device memory";
}
}
/* Fall back to mapped host memory if needed and possible. */
void *shared_pointer = 0;
if (mem_alloc_result != hipSuccess && can_map_host) {
if (mem.shared_pointer) {
/* Another device already allocated host memory. */
mem_alloc_result = hipSuccess;
shared_pointer = mem.shared_pointer;
}
else if (map_host_used + size < map_host_limit) {
/* Allocate host memory ourselves. */
mem_alloc_result = hipHostMalloc(
&shared_pointer, size, hipHostMallocMapped | hipHostMallocWriteCombined);
assert((mem_alloc_result == hipSuccess && shared_pointer != 0) ||
(mem_alloc_result != hipSuccess && shared_pointer == 0));
}
if (mem_alloc_result == hipSuccess) {
hip_assert(hipHostGetDevicePointer(&device_pointer, shared_pointer, 0));
map_host_used += size;
status = " in host memory";
}
}
if (mem_alloc_result != hipSuccess) {
status = " failed, out of device and host memory";
set_error("System is out of GPU and shared host memory");
}
if (mem.name) {
VLOG_WORK << "Buffer allocate: " << mem.name << ", "
<< string_human_readable_number(mem.memory_size()) << " bytes. ("
<< string_human_readable_size(mem.memory_size()) << ")" << status;
}
mem.device_pointer = (device_ptr)device_pointer;
mem.device_size = size;
stats.mem_alloc(size);
if (!mem.device_pointer) {
return NULL;
}
/* Insert into map of allocations. */
thread_scoped_lock lock(hip_mem_map_mutex);
HIPMem *cmem = &hip_mem_map[&mem];
if (shared_pointer != 0) {
/* Replace host pointer with our host allocation. Only works if
* HIP memory layout is the same and has no pitch padding. Also
* does not work if we move textures to host during a render,
* since other devices might be using the memory. */
if (!move_texture_to_host && pitch_padding == 0 && mem.host_pointer &&
mem.host_pointer != shared_pointer) {
memcpy(shared_pointer, mem.host_pointer, size);
/* A Call to device_memory::host_free() should be preceded by
* a call to device_memory::device_free() for host memory
* allocated by a device to be handled properly. Two exceptions
* are here and a call in OptiXDevice::generic_alloc(), where
* the current host memory can be assumed to be allocated by
* device_memory::host_alloc(), not by a device */
mem.host_free();
mem.host_pointer = shared_pointer;
}
mem.shared_pointer = shared_pointer;
mem.shared_counter++;
cmem->use_mapped_host = true;
}
else {
cmem->use_mapped_host = false;
}
return cmem;
}
void HIPDevice::generic_copy_to(device_memory &mem)
bool HIPDevice::alloc_device(void *&device_pointer, size_t size)
{
if (!mem.host_pointer || !mem.device_pointer) {
return;
}
HIPContextScope scope(this);
/* If use_mapped_host of mem is false, the current device only uses device memory allocated by
* hipMalloc regardless of mem.host_pointer and mem.shared_pointer, and should copy data from
* mem.host_pointer. */
thread_scoped_lock lock(hip_mem_map_mutex);
if (!hip_mem_map[&mem].use_mapped_host || mem.host_pointer != mem.shared_pointer) {
const HIPContextScope scope(this);
hip_assert(
hipMemcpyHtoD((hipDeviceptr_t)mem.device_pointer, mem.host_pointer, mem.memory_size()));
}
hipError_t mem_alloc_result = hipMalloc((hipDeviceptr_t *)&device_pointer, size);
return mem_alloc_result == hipSuccess;
}
void HIPDevice::generic_free(device_memory &mem)
void HIPDevice::free_device(void *device_pointer)
{
if (mem.device_pointer) {
HIPContextScope scope(this);
thread_scoped_lock lock(hip_mem_map_mutex);
DCHECK(hip_mem_map.find(&mem) != hip_mem_map.end());
const HIPMem &cmem = hip_mem_map[&mem];
HIPContextScope scope(this);
/* If cmem.use_mapped_host is true, reference counting is used
* to safely free a mapped host memory. */
hip_assert(hipFree((hipDeviceptr_t)device_pointer));
}
if (cmem.use_mapped_host) {
assert(mem.shared_pointer);
if (mem.shared_pointer) {
assert(mem.shared_counter > 0);
if (--mem.shared_counter == 0) {
if (mem.host_pointer == mem.shared_pointer) {
mem.host_pointer = 0;
}
hipHostFree(mem.shared_pointer);
mem.shared_pointer = 0;
}
}
map_host_used -= mem.device_size;
}
else {
/* Free device memory. */
hip_assert(hipFree(mem.device_pointer));
}
bool HIPDevice::alloc_host(void *&shared_pointer, size_t size)
{
HIPContextScope scope(this);
stats.mem_free(mem.device_size);
mem.device_pointer = 0;
mem.device_size = 0;
hipError_t mem_alloc_result = hipHostMalloc(
&shared_pointer, size, hipHostMallocMapped | hipHostMallocWriteCombined);
hip_mem_map.erase(hip_mem_map.find(&mem));
}
return mem_alloc_result == hipSuccess;
}
void HIPDevice::free_host(void *shared_pointer)
{
HIPContextScope scope(this);
hipHostFree(shared_pointer);
}
bool HIPDevice::transform_host_pointer(void *&device_pointer, void *&shared_pointer)
{
HIPContextScope scope(this);
hip_assert(hipHostGetDevicePointer((hipDeviceptr_t *)&device_pointer, shared_pointer, 0));
return true;
}
void HIPDevice::copy_host_to_device(void *device_pointer, void *host_pointer, size_t size)
{
const HIPContextScope scope(this);
hip_assert(hipMemcpyHtoD((hipDeviceptr_t)device_pointer, host_pointer, size));
}
void HIPDevice::mem_alloc(device_memory &mem)
@@ -823,8 +576,8 @@ void HIPDevice::mem_zero(device_memory &mem)
/* If use_mapped_host of mem is false, mem.device_pointer currently refers to device memory
* regardless of mem.host_pointer and mem.shared_pointer. */
thread_scoped_lock lock(hip_mem_map_mutex);
if (!hip_mem_map[&mem].use_mapped_host || mem.host_pointer != mem.shared_pointer) {
thread_scoped_lock lock(device_mem_map_mutex);
if (!device_mem_map[&mem].use_mapped_host || mem.host_pointer != mem.shared_pointer) {
const HIPContextScope scope(this);
hip_assert(hipMemsetD8((hipDeviceptr_t)mem.device_pointer, 0, mem.memory_size()));
}
@@ -951,19 +704,19 @@ void HIPDevice::tex_alloc(device_texture &mem)
return;
}
HIPMem *cmem = NULL;
Mem *cmem = NULL;
hArray array_3d = NULL;
size_t src_pitch = mem.data_width * dsize * mem.data_elements;
size_t dst_pitch = src_pitch;
if (!mem.is_resident(this)) {
thread_scoped_lock lock(hip_mem_map_mutex);
cmem = &hip_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
cmem = &device_mem_map[&mem];
cmem->texobject = 0;
if (mem.data_depth > 1) {
array_3d = (hArray)mem.device_pointer;
cmem->array = array_3d;
cmem->array = reinterpret_cast<arrayMemObject>(array_3d);
}
else if (mem.data_height > 0) {
dst_pitch = align_up(src_pitch, pitch_alignment);
@@ -1007,10 +760,10 @@ void HIPDevice::tex_alloc(device_texture &mem)
mem.device_size = size;
stats.mem_alloc(size);
thread_scoped_lock lock(hip_mem_map_mutex);
cmem = &hip_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
cmem = &device_mem_map[&mem];
cmem->texobject = 0;
cmem->array = array_3d;
cmem->array = reinterpret_cast<arrayMemObject>(array_3d);
}
else if (mem.data_height > 0) {
/* 2D texture, using pitch aligned linear memory. */
@@ -1095,8 +848,8 @@ void HIPDevice::tex_alloc(device_texture &mem)
texDesc.filterMode = filter_mode;
texDesc.flags = HIP_TRSF_NORMALIZED_COORDINATES;
thread_scoped_lock lock(hip_mem_map_mutex);
cmem = &hip_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
cmem = &device_mem_map[&mem];
hip_assert(hipTexObjectCreate(&cmem->texobject, &resDesc, &texDesc, NULL));
@@ -1111,9 +864,9 @@ void HIPDevice::tex_free(device_texture &mem)
{
if (mem.device_pointer) {
HIPContextScope scope(this);
thread_scoped_lock lock(hip_mem_map_mutex);
DCHECK(hip_mem_map.find(&mem) != hip_mem_map.end());
const HIPMem &cmem = hip_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
DCHECK(device_mem_map.find(&mem) != device_mem_map.end());
const Mem &cmem = device_mem_map[&mem];
if (cmem.texobject) {
/* Free bindless texture. */
@@ -1122,16 +875,16 @@ void HIPDevice::tex_free(device_texture &mem)
if (!mem.is_resident(this)) {
/* Do not free memory here, since it was allocated on a different device. */
hip_mem_map.erase(hip_mem_map.find(&mem));
device_mem_map.erase(device_mem_map.find(&mem));
}
else if (cmem.array) {
/* Free array. */
hipArrayDestroy(cmem.array);
hipArrayDestroy(reinterpret_cast<hArray>(cmem.array));
stats.mem_free(mem.device_size);
mem.device_pointer = 0;
mem.device_size = 0;
hip_mem_map.erase(hip_mem_map.find(&mem));
device_mem_map.erase(device_mem_map.find(&mem));
}
else {
lock.unlock();

View File

@@ -18,7 +18,7 @@ CCL_NAMESPACE_BEGIN
class DeviceQueue;
class HIPDevice : public Device {
class HIPDevice : public GPUDevice {
friend class HIPContextScope;
@@ -26,36 +26,11 @@ class HIPDevice : public Device {
hipDevice_t hipDevice;
hipCtx_t hipContext;
hipModule_t hipModule;
size_t device_texture_headroom;
size_t device_working_headroom;
bool move_texture_to_host;
size_t map_host_used;
size_t map_host_limit;
int can_map_host;
int pitch_alignment;
int hipDevId;
int hipDevArchitecture;
bool first_error;
struct HIPMem {
HIPMem() : texobject(0), array(0), use_mapped_host(false)
{
}
hipTextureObject_t texobject;
hArray array;
/* If true, a mapped host memory in shared_pointer is being used. */
bool use_mapped_host;
};
typedef map<device_memory *, HIPMem> HIPMemMap;
HIPMemMap hip_mem_map;
thread_mutex hip_mem_map_mutex;
/* Bindless Textures */
device_vector<TextureInfo> texture_info;
bool need_texture_info;
HIPDeviceKernels kernels;
static bool have_precompiled_kernels();
@@ -81,17 +56,13 @@ class HIPDevice : public Device {
virtual bool load_kernels(const uint kernel_features) override;
void reserve_local_memory(const uint kernel_features);
void init_host_memory();
void load_texture_info();
void move_textures_to_host(size_t size, bool for_texture);
HIPMem *generic_alloc(device_memory &mem, size_t pitch_padding = 0);
void generic_copy_to(device_memory &mem);
void generic_free(device_memory &mem);
virtual void get_device_memory_info(size_t &total, size_t &free) override;
virtual bool alloc_device(void *&device_pointer, size_t size) override;
virtual void free_device(void *device_pointer) override;
virtual bool alloc_host(void *&shared_pointer, size_t size) override;
virtual void free_host(void *shared_pointer) override;
virtual bool transform_host_pointer(void *&device_pointer, void *&shared_pointer) override;
virtual void copy_host_to_device(void *device_pointer, void *host_pointer, size_t size) override;
void mem_alloc(device_memory &mem) override;

View File

@@ -51,7 +51,7 @@ static inline bool hipSupportsDevice(const int hipDevId)
hipDeviceGetAttribute(&major, hipDeviceAttributeComputeCapabilityMajor, hipDevId);
hipDeviceGetAttribute(&minor, hipDeviceAttributeComputeCapabilityMinor, hipDevId);
return (major >= 9);
return (major >= 10);
}
CCL_NAMESPACE_END

View File

@@ -73,6 +73,10 @@ const char *device_kernel_as_string(DeviceKernel kernel)
return "integrator_terminated_paths_array";
case DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY:
return "integrator_sorted_paths_array";
case DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS:
return "integrator_sort_bucket_pass";
case DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS:
return "integrator_sort_write_pass";
case DEVICE_KERNEL_INTEGRATOR_COMPACT_PATHS_ARRAY:
return "integrator_compact_paths_array";
case DEVICE_KERNEL_INTEGRATOR_COMPACT_STATES:

View File

@@ -247,6 +247,8 @@ class device_memory {
bool is_resident(Device *sub_device) const;
protected:
friend class Device;
friend class GPUDevice;
friend class CUDADevice;
friend class OptiXDevice;
friend class HIPDevice;

View File

@@ -21,6 +21,7 @@ class BVHMetal : public BVH {
API_AVAILABLE(macos(11.0))
vector<id<MTLAccelerationStructure>> blas_array;
vector<uint32_t> blas_lookup;
bool motion_blur = false;

View File

@@ -816,6 +816,11 @@ bool BVHMetal::build_TLAS(Progress &progress,
uint32_t instance_index = 0;
uint32_t motion_transform_index = 0;
// allocate look up buffer for wost case scenario
uint64_t count = objects.size();
blas_lookup.resize(count);
for (Object *ob : objects) {
/* Skip non-traceable objects */
if (!ob->is_traceable())
@@ -843,12 +848,15 @@ bool BVHMetal::build_TLAS(Progress &progress,
/* Set user instance ID to object index */
int object_index = ob->get_device_index();
uint32_t user_id = uint32_t(object_index);
int currIndex = instance_index++;
assert(user_id < blas_lookup.size());
blas_lookup[user_id] = accel_struct_index;
/* Bake into the appropriate descriptor */
if (motion_blur) {
MTLAccelerationStructureMotionInstanceDescriptor *instances =
(MTLAccelerationStructureMotionInstanceDescriptor *)[instanceBuf contents];
MTLAccelerationStructureMotionInstanceDescriptor &desc = instances[instance_index++];
MTLAccelerationStructureMotionInstanceDescriptor &desc = instances[currIndex];
desc.accelerationStructureIndex = accel_struct_index;
desc.userID = user_id;
@@ -894,7 +902,7 @@ bool BVHMetal::build_TLAS(Progress &progress,
else {
MTLAccelerationStructureUserIDInstanceDescriptor *instances =
(MTLAccelerationStructureUserIDInstanceDescriptor *)[instanceBuf contents];
MTLAccelerationStructureUserIDInstanceDescriptor &desc = instances[instance_index++];
MTLAccelerationStructureUserIDInstanceDescriptor &desc = instances[currIndex];
desc.accelerationStructureIndex = accel_struct_index;
desc.userID = user_id;

View File

@@ -55,6 +55,10 @@ void device_metal_info(vector<DeviceInfo> &devices)
info.denoisers = DENOISER_NONE;
info.id = id;
if (MetalInfo::get_device_vendor(device) == METAL_GPU_AMD) {
info.has_light_tree = false;
}
devices.push_back(info);
device_index++;
}

View File

@@ -74,6 +74,11 @@ class MetalDevice : public Device {
id<MTLBuffer> texture_bindings_3d = nil;
std::vector<id<MTLTexture>> texture_slot_map;
/* BLAS encoding & lookup */
id<MTLArgumentEncoder> mtlBlasArgEncoder = nil;
id<MTLBuffer> blas_buffer = nil;
id<MTLBuffer> blas_lookup_buffer = nil;
bool use_metalrt = false;
MetalPipelineType kernel_specialization_level = PSO_GENERIC;
@@ -105,6 +110,8 @@ class MetalDevice : public Device {
bool use_adaptive_compilation();
bool use_local_atomic_sort() const;
bool make_source_and_check_if_compile_needed(MetalPipelineType pso_type);
void make_source(MetalPipelineType pso_type, const uint kernel_features);

View File

@@ -192,6 +192,10 @@ MetalDevice::MetalDevice(const DeviceInfo &info, Stats &stats, Profiler &profile
arg_desc_as.dataType = MTLDataTypeInstanceAccelerationStructure;
arg_desc_as.access = MTLArgumentAccessReadOnly;
MTLArgumentDescriptor *arg_desc_ptrs = [[MTLArgumentDescriptor alloc] init];
arg_desc_ptrs.dataType = MTLDataTypePointer;
arg_desc_ptrs.access = MTLArgumentAccessReadOnly;
MTLArgumentDescriptor *arg_desc_ift = [[MTLArgumentDescriptor alloc] init];
arg_desc_ift.dataType = MTLDataTypeIntersectionFunctionTable;
arg_desc_ift.access = MTLArgumentAccessReadOnly;
@@ -204,14 +208,28 @@ MetalDevice::MetalDevice(const DeviceInfo &info, Stats &stats, Profiler &profile
[ancillary_desc addObject:[arg_desc_ift copy]]; /* ift_shadow */
arg_desc_ift.index = index++;
[ancillary_desc addObject:[arg_desc_ift copy]]; /* ift_local */
arg_desc_ift.index = index++;
[ancillary_desc addObject:[arg_desc_ift copy]]; /* ift_local_prim */
arg_desc_ptrs.index = index++;
[ancillary_desc addObject:[arg_desc_ptrs copy]]; /* blas array */
arg_desc_ptrs.index = index++;
[ancillary_desc addObject:[arg_desc_ptrs copy]]; /* look up table for blas */
[arg_desc_ift release];
[arg_desc_as release];
[arg_desc_ptrs release];
}
}
mtlAncillaryArgEncoder = [mtlDevice newArgumentEncoderWithArguments:ancillary_desc];
// preparing the blas arg encoder
MTLArgumentDescriptor *arg_desc_blas = [[MTLArgumentDescriptor alloc] init];
arg_desc_blas.dataType = MTLDataTypeInstanceAccelerationStructure;
arg_desc_blas.access = MTLArgumentAccessReadOnly;
mtlBlasArgEncoder = [mtlDevice newArgumentEncoderWithArguments:@[ arg_desc_blas ]];
[arg_desc_blas release];
for (int i = 0; i < ancillary_desc.count; i++) {
[ancillary_desc[i] release];
}
@@ -271,6 +289,11 @@ bool MetalDevice::use_adaptive_compilation()
return DebugFlags().metal.adaptive_compile;
}
bool MetalDevice::use_local_atomic_sort() const
{
return DebugFlags().metal.use_local_atomic_sort;
}
void MetalDevice::make_source(MetalPipelineType pso_type, const uint kernel_features)
{
string global_defines;
@@ -278,6 +301,10 @@ void MetalDevice::make_source(MetalPipelineType pso_type, const uint kernel_feat
global_defines += "#define __KERNEL_FEATURES__ " + to_string(kernel_features) + "\n";
}
if (use_local_atomic_sort()) {
global_defines += "#define __KERNEL_LOCAL_ATOMIC_SORT__\n";
}
if (use_metalrt) {
global_defines += "#define __METALRT__\n";
if (motion_blur) {
@@ -1231,6 +1258,33 @@ void MetalDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
if (@available(macos 11.0, *)) {
if (bvh->params.top_level) {
bvhMetalRT = bvh_metal;
// allocate required buffers for BLAS array
uint64_t count = bvhMetalRT->blas_array.size();
uint64_t bufferSize = mtlBlasArgEncoder.encodedLength * count;
blas_buffer = [mtlDevice newBufferWithLength:bufferSize options:default_storage_mode];
stats.mem_alloc(blas_buffer.allocatedSize);
for (uint64_t i = 0; i < count; ++i) {
[mtlBlasArgEncoder setArgumentBuffer:blas_buffer
offset:i * mtlBlasArgEncoder.encodedLength];
[mtlBlasArgEncoder setAccelerationStructure:bvhMetalRT->blas_array[i] atIndex:0];
}
count = bvhMetalRT->blas_lookup.size();
bufferSize = sizeof(uint32_t) * count;
blas_lookup_buffer = [mtlDevice newBufferWithLength:bufferSize
options:default_storage_mode];
stats.mem_alloc(blas_lookup_buffer.allocatedSize);
memcpy([blas_lookup_buffer contents],
bvhMetalRT -> blas_lookup.data(),
blas_lookup_buffer.allocatedSize);
if (default_storage_mode == MTLResourceStorageModeManaged) {
[blas_buffer didModifyRange:NSMakeRange(0, blas_buffer.length)];
[blas_lookup_buffer didModifyRange:NSMakeRange(0, blas_lookup_buffer.length)];
}
}
}
}

View File

@@ -19,6 +19,8 @@ enum {
METALRT_FUNC_SHADOW_BOX,
METALRT_FUNC_LOCAL_TRI,
METALRT_FUNC_LOCAL_BOX,
METALRT_FUNC_LOCAL_TRI_PRIM,
METALRT_FUNC_LOCAL_BOX_PRIM,
METALRT_FUNC_CURVE_RIBBON,
METALRT_FUNC_CURVE_RIBBON_SHADOW,
METALRT_FUNC_CURVE_ALL,
@@ -28,7 +30,13 @@ enum {
METALRT_FUNC_NUM
};
enum { METALRT_TABLE_DEFAULT, METALRT_TABLE_SHADOW, METALRT_TABLE_LOCAL, METALRT_TABLE_NUM };
enum {
METALRT_TABLE_DEFAULT,
METALRT_TABLE_SHADOW,
METALRT_TABLE_LOCAL,
METALRT_TABLE_LOCAL_PRIM,
METALRT_TABLE_NUM
};
/* Pipeline State Object types */
enum MetalPipelineType {

View File

@@ -87,6 +87,9 @@ struct ShaderCache {
break;
}
}
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS] = {1024, 1024};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS] = {1024, 1024};
}
~ShaderCache();
@@ -521,6 +524,8 @@ void MetalKernelPipeline::compile()
"__anyhit__cycles_metalrt_shadow_all_hit_box",
"__anyhit__cycles_metalrt_local_hit_tri",
"__anyhit__cycles_metalrt_local_hit_box",
"__anyhit__cycles_metalrt_local_hit_tri_prim",
"__anyhit__cycles_metalrt_local_hit_box_prim",
"__intersection__curve_ribbon",
"__intersection__curve_ribbon_shadow",
"__intersection__curve_all",
@@ -611,11 +616,17 @@ void MetalKernelPipeline::compile()
rt_intersection_function[METALRT_FUNC_LOCAL_BOX],
rt_intersection_function[METALRT_FUNC_LOCAL_BOX],
nil];
table_functions[METALRT_TABLE_LOCAL_PRIM] = [NSArray
arrayWithObjects:rt_intersection_function[METALRT_FUNC_LOCAL_TRI_PRIM],
rt_intersection_function[METALRT_FUNC_LOCAL_BOX_PRIM],
rt_intersection_function[METALRT_FUNC_LOCAL_BOX_PRIM],
nil];
NSMutableSet *unique_functions = [NSMutableSet
setWithArray:table_functions[METALRT_TABLE_DEFAULT]];
[unique_functions addObjectsFromArray:table_functions[METALRT_TABLE_SHADOW]];
[unique_functions addObjectsFromArray:table_functions[METALRT_TABLE_LOCAL]];
[unique_functions addObjectsFromArray:table_functions[METALRT_TABLE_LOCAL_PRIM]];
if (kernel_has_intersection(device_kernel)) {
linked_functions = [[NSArray arrayWithArray:[unique_functions allObjects]]

View File

@@ -25,6 +25,7 @@ class MetalDeviceQueue : public DeviceQueue {
virtual int num_concurrent_states(const size_t) const override;
virtual int num_concurrent_busy_states(const size_t) const override;
virtual int num_sort_partition_elements() const override;
virtual bool supports_local_atomic_sort() const override;
virtual void init_execution() override;

View File

@@ -315,6 +315,11 @@ int MetalDeviceQueue::num_sort_partition_elements() const
return MetalInfo::optimal_sort_partition_elements(metal_device_->mtlDevice);
}
bool MetalDeviceQueue::supports_local_atomic_sort() const
{
return metal_device_->use_local_atomic_sort();
}
void MetalDeviceQueue::init_execution()
{
/* Synchronize all textures and memory copies before executing task. */
@@ -477,6 +482,12 @@ bool MetalDeviceQueue::enqueue(DeviceKernel kernel,
if (metal_device_->bvhMetalRT) {
id<MTLAccelerationStructure> accel_struct = metal_device_->bvhMetalRT->accel_struct;
[metal_device_->mtlAncillaryArgEncoder setAccelerationStructure:accel_struct atIndex:2];
[metal_device_->mtlAncillaryArgEncoder setBuffer:metal_device_->blas_buffer
offset:0
atIndex:7];
[metal_device_->mtlAncillaryArgEncoder setBuffer:metal_device_->blas_lookup_buffer
offset:0
atIndex:8];
}
for (int table = 0; table < METALRT_TABLE_NUM; table++) {
@@ -527,6 +538,10 @@ bool MetalDeviceQueue::enqueue(DeviceKernel kernel,
if (bvhMetalRT) {
/* Mark all Accelerations resources as used */
[mtlComputeCommandEncoder useResource:bvhMetalRT->accel_struct usage:MTLResourceUsageRead];
[mtlComputeCommandEncoder useResource:metal_device_->blas_buffer
usage:MTLResourceUsageRead];
[mtlComputeCommandEncoder useResource:metal_device_->blas_lookup_buffer
usage:MTLResourceUsageRead];
[mtlComputeCommandEncoder useResources:bvhMetalRT->blas_array.data()
count:bvhMetalRT->blas_array.size()
usage:MTLResourceUsageRead];
@@ -553,13 +568,24 @@ bool MetalDeviceQueue::enqueue(DeviceKernel kernel,
/* See parallel_active_index.h for why this amount of shared memory is needed.
* Rounded up to 16 bytes for Metal */
shared_mem_bytes = (int)round_up((num_threads_per_block + 1) * sizeof(int), 16);
[mtlComputeCommandEncoder setThreadgroupMemoryLength:shared_mem_bytes atIndex:0];
break;
case DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS:
case DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS: {
int key_count = metal_device_->launch_params.data.max_shaders;
shared_mem_bytes = (int)round_up(key_count * sizeof(int), 16);
break;
}
default:
break;
}
if (shared_mem_bytes) {
assert(shared_mem_bytes <= 32 * 1024);
[mtlComputeCommandEncoder setThreadgroupMemoryLength:shared_mem_bytes atIndex:0];
}
MTLSize size_threadgroups_per_dispatch = MTLSizeMake(
divide_up(work_size, num_threads_per_block), 1, 1);
MTLSize size_threads_per_threadgroup = MTLSizeMake(num_threads_per_block, 1, 1);

View File

@@ -64,6 +64,12 @@ MetalGPUVendor MetalInfo::get_device_vendor(id<MTLDevice> device)
return METAL_GPU_INTEL;
}
else if (strstr(device_name, "AMD")) {
/* Setting this env var hides AMD devices thus exposing any integrated Intel devices. */
if (auto str = getenv("CYCLES_METAL_FORCE_INTEL")) {
if (atoi(str)) {
return METAL_GPU_UNKNOWN;
}
}
return METAL_GPU_AMD;
}
else if (strstr(device_name, "Apple")) {
@@ -96,6 +102,15 @@ vector<id<MTLDevice>> const &MetalInfo::get_usable_devices()
return usable_devices;
}
/* If the system has both an AMD GPU (discrete) and an Intel one (integrated), prefer the AMD
* one. This can be overridden with CYCLES_METAL_FORCE_INTEL. */
bool has_usable_amd_gpu = false;
if (@available(macos 12.3, *)) {
for (id<MTLDevice> device in MTLCopyAllDevices()) {
has_usable_amd_gpu |= (get_device_vendor(device) == METAL_GPU_AMD);
}
}
metal_printf("Usable Metal devices:\n");
for (id<MTLDevice> device in MTLCopyAllDevices()) {
string device_name = get_device_name(device);
@@ -111,8 +126,10 @@ vector<id<MTLDevice>> const &MetalInfo::get_usable_devices()
}
# if defined(MAC_OS_VERSION_13_0)
if (@available(macos 13.0, *)) {
usable |= (vendor == METAL_GPU_INTEL);
if (!has_usable_amd_gpu) {
if (@available(macos 13.0, *)) {
usable |= (vendor == METAL_GPU_INTEL);
}
}
# endif

View File

@@ -377,7 +377,7 @@ void OneapiDevice::tex_alloc(device_texture &mem)
generic_alloc(mem);
generic_copy_to(mem);
/* Resize if needed. Also, in case of resize - allocate in advance for future allocs. */
/* Resize if needed. Also, in case of resize - allocate in advance for future allocations. */
const uint slot = mem.slot;
if (slot >= texture_info_.size()) {
texture_info_.resize(slot + 128);
@@ -631,9 +631,9 @@ bool OneapiDevice::enqueue_kernel(KernelContext *kernel_context,
/* Compute-runtime (ie. NEO) version is what gets returned by sycl/L0 on Windows
* since Windows driver 101.3268. */
/* The same min compute-runtime version is currently required across Windows and Linux.
* For Windows driver 101.3430, compute-runtime version is 23904. */
static const int lowest_supported_driver_version_win = 1013430;
static const int lowest_supported_driver_version_neo = 23904;
* For Windows driver 101.4032, compute-runtime version is 24931. */
static const int lowest_supported_driver_version_win = 1014032;
static const int lowest_supported_driver_version_neo = 24931;
int OneapiDevice::parse_driver_build_version(const sycl::device &device)
{

View File

@@ -854,12 +854,14 @@ bool OptiXDevice::load_osl_kernels()
context, group_descs, 2, &group_options, nullptr, 0, &osl_groups[i * 2]));
}
OptixStackSizes stack_size[NUM_PROGRAM_GROUPS] = {};
vector<OptixStackSizes> osl_stack_size(osl_groups.size());
/* Update SBT with new entries. */
sbt_data.alloc(NUM_PROGRAM_GROUPS + osl_groups.size());
for (int i = 0; i < NUM_PROGRAM_GROUPS; ++i) {
optix_assert(optixSbtRecordPackHeader(groups[i], &sbt_data[i]));
optix_assert(optixProgramGroupGetStackSize(groups[i], &stack_size[i]));
}
for (size_t i = 0; i < osl_groups.size(); ++i) {
if (osl_groups[i] != NULL) {
@@ -907,13 +909,15 @@ bool OptiXDevice::load_osl_kernels()
0,
&pipelines[PIP_SHADE]));
const unsigned int css = std::max(stack_size[PG_RGEN_SHADE_SURFACE_RAYTRACE].cssRG,
stack_size[PG_RGEN_SHADE_SURFACE_MNEE].cssRG);
unsigned int dss = 0;
for (unsigned int i = 0; i < osl_stack_size.size(); ++i) {
dss = std::max(dss, osl_stack_size[i].dssDC);
}
optix_assert(optixPipelineSetStackSize(
pipelines[PIP_SHADE], 0, dss, 0, pipeline_options.usesMotionBlur ? 3 : 2));
pipelines[PIP_SHADE], 0, dss, css, pipeline_options.usesMotionBlur ? 3 : 2));
}
return !have_error();

View File

@@ -112,6 +112,13 @@ class DeviceQueue {
return 65536;
}
/* Does device support local atomic sorting kernels (INTEGRATOR_SORT_BUCKET_PASS and
* INTEGRATOR_SORT_WRITE_PASS)? */
virtual bool supports_local_atomic_sort() const
{
return false;
}
/* Initialize execution of kernels on this queue.
*
* Will, for example, load all data required by the kernels from Device to global or path state.

View File

@@ -71,6 +71,8 @@ PathTraceWorkGPU::PathTraceWorkGPU(Device *device,
device, "integrator_shader_mnee_sort_counter", MEM_READ_WRITE),
integrator_shader_sort_prefix_sum_(
device, "integrator_shader_sort_prefix_sum", MEM_READ_WRITE),
integrator_shader_sort_partition_key_offsets_(
device, "integrator_shader_sort_partition_key_offsets", MEM_READ_WRITE),
integrator_next_main_path_index_(device, "integrator_next_main_path_index", MEM_READ_WRITE),
integrator_next_shadow_path_index_(
device, "integrator_next_shadow_path_index", MEM_READ_WRITE),
@@ -207,33 +209,45 @@ void PathTraceWorkGPU::alloc_integrator_sorting()
integrator_state_gpu_.sort_partition_divisor = (int)divide_up(max_num_paths_,
num_sort_partitions_);
/* Allocate arrays for shader sorting. */
const int sort_buckets = device_scene_->data.max_shaders * num_sort_partitions_;
if (integrator_shader_sort_counter_.size() < sort_buckets) {
integrator_shader_sort_counter_.alloc(sort_buckets);
integrator_shader_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE] =
(int *)integrator_shader_sort_counter_.device_pointer;
integrator_shader_sort_prefix_sum_.alloc(sort_buckets);
integrator_shader_sort_prefix_sum_.zero_to_device();
}
if (device_scene_->data.kernel_features & KERNEL_FEATURE_NODE_RAYTRACE) {
if (integrator_shader_raytrace_sort_counter_.size() < sort_buckets) {
integrator_shader_raytrace_sort_counter_.alloc(sort_buckets);
integrator_shader_raytrace_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE] =
(int *)integrator_shader_raytrace_sort_counter_.device_pointer;
if (num_sort_partitions_ > 1 && queue_->supports_local_atomic_sort()) {
/* Allocate array for partitioned shader sorting using local atomics. */
const int num_offsets = (device_scene_->data.max_shaders + 1) * num_sort_partitions_;
if (integrator_shader_sort_partition_key_offsets_.size() < num_offsets) {
integrator_shader_sort_partition_key_offsets_.alloc(num_offsets);
integrator_shader_sort_partition_key_offsets_.zero_to_device();
}
integrator_state_gpu_.sort_partition_key_offsets =
(int *)integrator_shader_sort_partition_key_offsets_.device_pointer;
}
else {
/* Allocate arrays for shader sorting. */
const int sort_buckets = device_scene_->data.max_shaders * num_sort_partitions_;
if (integrator_shader_sort_counter_.size() < sort_buckets) {
integrator_shader_sort_counter_.alloc(sort_buckets);
integrator_shader_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE] =
(int *)integrator_shader_sort_counter_.device_pointer;
if (device_scene_->data.kernel_features & KERNEL_FEATURE_MNEE) {
if (integrator_shader_mnee_sort_counter_.size() < sort_buckets) {
integrator_shader_mnee_sort_counter_.alloc(sort_buckets);
integrator_shader_mnee_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE] =
(int *)integrator_shader_mnee_sort_counter_.device_pointer;
integrator_shader_sort_prefix_sum_.alloc(sort_buckets);
integrator_shader_sort_prefix_sum_.zero_to_device();
}
if (device_scene_->data.kernel_features & KERNEL_FEATURE_NODE_RAYTRACE) {
if (integrator_shader_raytrace_sort_counter_.size() < sort_buckets) {
integrator_shader_raytrace_sort_counter_.alloc(sort_buckets);
integrator_shader_raytrace_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE] =
(int *)integrator_shader_raytrace_sort_counter_.device_pointer;
}
}
if (device_scene_->data.kernel_features & KERNEL_FEATURE_MNEE) {
if (integrator_shader_mnee_sort_counter_.size() < sort_buckets) {
integrator_shader_mnee_sort_counter_.alloc(sort_buckets);
integrator_shader_mnee_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE] =
(int *)integrator_shader_mnee_sort_counter_.device_pointer;
}
}
}
}
@@ -451,8 +465,7 @@ void PathTraceWorkGPU::enqueue_path_iteration(DeviceKernel kernel, const int num
work_size = num_queued;
d_path_index = queued_paths_.device_pointer;
compute_sorted_queued_paths(
DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY, kernel, num_paths_limit);
compute_sorted_queued_paths(kernel, num_paths_limit);
}
else if (num_queued < work_size) {
work_size = num_queued;
@@ -511,11 +524,26 @@ void PathTraceWorkGPU::enqueue_path_iteration(DeviceKernel kernel, const int num
}
}
void PathTraceWorkGPU::compute_sorted_queued_paths(DeviceKernel kernel,
DeviceKernel queued_kernel,
void PathTraceWorkGPU::compute_sorted_queued_paths(DeviceKernel queued_kernel,
const int num_paths_limit)
{
int d_queued_kernel = queued_kernel;
/* Launch kernel to fill the active paths arrays. */
if (num_sort_partitions_ > 1 && queue_->supports_local_atomic_sort()) {
const int work_size = kernel_max_active_main_path_index(queued_kernel);
device_ptr d_queued_paths = queued_paths_.device_pointer;
int partition_size = (int)integrator_state_gpu_.sort_partition_divisor;
DeviceKernelArguments args(
&work_size, &partition_size, &num_paths_limit, &d_queued_paths, &d_queued_kernel);
queue_->enqueue(DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS, 1024 * num_sort_partitions_, args);
queue_->enqueue(DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS, 1024 * num_sort_partitions_, args);
return;
}
device_ptr d_counter = (device_ptr)integrator_state_gpu_.sort_key_counter[d_queued_kernel];
device_ptr d_prefix_sum = integrator_shader_sort_prefix_sum_.device_pointer;
assert(d_counter != 0 && d_prefix_sum != 0);
@@ -552,7 +580,7 @@ void PathTraceWorkGPU::compute_sorted_queued_paths(DeviceKernel kernel,
&d_prefix_sum,
&d_queued_kernel);
queue_->enqueue(kernel, work_size, args);
queue_->enqueue(DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY, work_size, args);
}
}

View File

@@ -70,9 +70,7 @@ class PathTraceWorkGPU : public PathTraceWork {
void enqueue_path_iteration(DeviceKernel kernel, const int num_paths_limit = INT_MAX);
void compute_queued_paths(DeviceKernel kernel, DeviceKernel queued_kernel);
void compute_sorted_queued_paths(DeviceKernel kernel,
DeviceKernel queued_kernel,
const int num_paths_limit);
void compute_sorted_queued_paths(DeviceKernel queued_kernel, const int num_paths_limit);
void compact_main_paths(const int num_active_paths);
void compact_shadow_paths();
@@ -135,6 +133,7 @@ class PathTraceWorkGPU : public PathTraceWork {
device_vector<int> integrator_shader_raytrace_sort_counter_;
device_vector<int> integrator_shader_mnee_sort_counter_;
device_vector<int> integrator_shader_sort_prefix_sum_;
device_vector<int> integrator_shader_sort_partition_key_offsets_;
/* Path split. */
device_vector<int> integrator_next_main_path_index_;
device_vector<int> integrator_next_shadow_path_index_;

View File

@@ -170,7 +170,7 @@ ccl_device_inline int bsdf_sample(KernelGlobals kg,
case CLOSURE_BSDF_MICROFACET_GGX_CLEARCOAT_ID:
case CLOSURE_BSDF_MICROFACET_GGX_REFRACTION_ID:
label = bsdf_microfacet_ggx_sample(
kg, sc, Ng, sd->wi, randu, randv, eval, wo, pdf, sampled_roughness, eta);
sc, Ng, sd->wi, randu, randv, eval, wo, pdf, sampled_roughness, eta);
break;
case CLOSURE_BSDF_MICROFACET_MULTI_GGX_ID:
case CLOSURE_BSDF_MICROFACET_MULTI_GGX_FRESNEL_ID:
@@ -185,7 +185,7 @@ ccl_device_inline int bsdf_sample(KernelGlobals kg,
case CLOSURE_BSDF_MICROFACET_BECKMANN_ID:
case CLOSURE_BSDF_MICROFACET_BECKMANN_REFRACTION_ID:
label = bsdf_microfacet_beckmann_sample(
kg, sc, Ng, sd->wi, randu, randv, eval, wo, pdf, sampled_roughness, eta);
sc, Ng, sd->wi, randu, randv, eval, wo, pdf, sampled_roughness, eta);
break;
case CLOSURE_BSDF_ASHIKHMIN_SHIRLEY_ID:
label = bsdf_ashikhmin_shirley_sample(
@@ -661,4 +661,38 @@ ccl_device void bsdf_blur(KernelGlobals kg, ccl_private ShaderClosure *sc, float
#endif
}
ccl_device_inline Spectrum bsdf_albedo(ccl_private const ShaderData *sd,
ccl_private const ShaderClosure *sc)
{
Spectrum albedo = sc->weight;
/* Some closures include additional components such as Fresnel terms that cause their albedo to
* be below 1. The point of this function is to return a best-effort estimation of their albedo,
* meaning the amount of reflected/refracted light that would be expected when illuminated by a
* uniform white background.
* This is used for the denoising albedo pass and diffuse/glossy/transmission color passes.
* NOTE: This should always match the sample_weight of the closure - as in, if there's an albedo
* adjustment in here, the sample_weight should also be reduced accordingly.
* TODO(lukas): Consider calling this function to determine the sample_weight? Would be a bit of
* extra overhead though. */
#if defined(__SVM__) || defined(__OSL__)
switch (sc->type) {
case CLOSURE_BSDF_MICROFACET_GGX_FRESNEL_ID:
case CLOSURE_BSDF_MICROFACET_MULTI_GGX_FRESNEL_ID:
case CLOSURE_BSDF_MICROFACET_MULTI_GGX_GLASS_FRESNEL_ID:
case CLOSURE_BSDF_MICROFACET_GGX_CLEARCOAT_ID:
albedo *= microfacet_fresnel((ccl_private const MicrofacetBsdf *)sc, sd->wi, sc->N);
break;
case CLOSURE_BSDF_PRINCIPLED_SHEEN_ID:
albedo *= ((ccl_private const PrincipledSheenBsdf *)sc)->avg_value;
break;
case CLOSURE_BSDF_HAIR_PRINCIPLED_ID:
albedo *= bsdf_principled_hair_albedo(sc);
break;
default:
break;
}
#endif
return albedo;
}
CCL_NAMESPACE_END

View File

@@ -41,11 +41,6 @@ static_assert(sizeof(ShaderClosure) >= sizeof(PrincipledHairBSDF),
static_assert(sizeof(ShaderClosure) >= sizeof(PrincipledHairExtra),
"PrincipledHairExtra is too large!");
ccl_device_inline float cos_from_sin(const float s)
{
return safe_sqrtf(1.0f - s * s);
}
/* Gives the change in direction in the normal plane for the given angles and p-th-order
* scattering. */
ccl_device_inline float delta_phi(int p, float gamma_o, float gamma_t)

View File

@@ -23,8 +23,6 @@ enum MicrofacetType {
typedef struct MicrofacetExtra {
Spectrum color, cspec0;
Spectrum fresnel_color;
float clearcoat;
} MicrofacetExtra;
typedef struct MicrofacetBsdf {
@@ -37,190 +35,99 @@ typedef struct MicrofacetBsdf {
static_assert(sizeof(ShaderClosure) >= sizeof(MicrofacetBsdf), "MicrofacetBsdf is too large!");
/* Beckmann and GGX microfacet importance sampling. */
ccl_device_inline void microfacet_beckmann_sample_slopes(KernelGlobals kg,
const float cos_theta_i,
const float sin_theta_i,
float randu,
float randv,
ccl_private float *slope_x,
ccl_private float *slope_y,
ccl_private float *G1i)
{
/* Special case (normal incidence). */
if (cos_theta_i >= 0.99999f) {
const float r = sqrtf(-logf(randu));
const float phi = M_2PI_F * randv;
*slope_x = r * cosf(phi);
*slope_y = r * sinf(phi);
*G1i = 1.0f;
return;
}
/* Precomputations. */
const float tan_theta_i = sin_theta_i / cos_theta_i;
const float inv_a = tan_theta_i;
const float cot_theta_i = 1.0f / tan_theta_i;
const float erf_a = fast_erff(cot_theta_i);
const float exp_a2 = expf(-cot_theta_i * cot_theta_i);
const float SQRT_PI_INV = 0.56418958354f;
const float Lambda = 0.5f * (erf_a - 1.0f) + (0.5f * SQRT_PI_INV) * (exp_a2 * inv_a);
const float G1 = 1.0f / (1.0f + Lambda); /* masking */
*G1i = G1;
/* Based on paper from Wenzel Jakob
* An Improved Visible Normal Sampling Routine for the Beckmann Distribution
*
* http://www.mitsuba-renderer.org/~wenzel/files/visnormal.pdf
*
* Reformulation from OpenShadingLanguage which avoids using inverse
* trigonometric functions.
*/
/* Sample slope X.
*
* Compute a coarse approximation using the approximation:
* exp(-ierf(x)^2) ~= 1 - x * x
* solve y = 1 + b + K * (1 - b * b)
*/
const float K = tan_theta_i * SQRT_PI_INV;
const float y_approx = randu * (1.0f + erf_a + K * (1 - erf_a * erf_a));
const float y_exact = randu * (1.0f + erf_a + K * exp_a2);
float b = K > 0 ? (0.5f - sqrtf(K * (K - y_approx + 1.0f) + 0.25f)) / K : y_approx - 1.0f;
float inv_erf = fast_ierff(b);
float2 begin = make_float2(-1.0f, -y_exact);
float2 end = make_float2(erf_a, 1.0f + erf_a + K * exp_a2 - y_exact);
float2 current = make_float2(b, 1.0f + b + K * expf(-sqr(inv_erf)) - y_exact);
/* Find root in a monotonic interval using newton method, under given precision and maximal
* iterations. Falls back to bisection if newton step produces results outside of the valid
* interval.*/
const float precision = 1e-6f;
const int max_iter = 3;
int iter = 0;
while (fabsf(current.y) > precision && iter++ < max_iter) {
if (signf(begin.y) == signf(current.y)) {
begin.x = current.x;
begin.y = current.y;
}
else {
end.x = current.x;
}
const float newton_x = current.x - current.y / (1.0f - inv_erf * tan_theta_i);
current.x = (newton_x >= begin.x && newton_x <= end.x) ? newton_x : 0.5f * (begin.x + end.x);
inv_erf = fast_ierff(current.x);
current.y = 1.0f + current.x + K * expf(-sqr(inv_erf)) - y_exact;
}
*slope_x = inv_erf;
*slope_y = fast_ierff(2.0f * randv - 1.0f);
}
/* GGX microfacet importance sampling from:
*
/* Beckmann VNDF importance sampling algorithm from:
* Importance Sampling Microfacet-Based BSDFs using the Distribution of Visible Normals.
* E. Heitz and E. d'Eon, EGSR 2014
*/
* Eric Heitz and Eugene d'Eon, EGSR 2014.
* https://hal.inria.fr/hal-00996995v2/document */
ccl_device_inline void microfacet_ggx_sample_slopes(const float cos_theta_i,
const float sin_theta_i,
float randu,
float randv,
ccl_private float *slope_x,
ccl_private float *slope_y,
ccl_private float *G1i)
{
/* Special case (normal incidence). */
if (cos_theta_i >= 0.99999f) {
const float r = sqrtf(randu / (1.0f - randu));
const float phi = M_2PI_F * randv;
*slope_x = r * cosf(phi);
*slope_y = r * sinf(phi);
*G1i = 1.0f;
return;
}
/* Precomputations. */
const float tan_theta_i = sin_theta_i / cos_theta_i;
const float G1_inv = 0.5f * (1.0f + safe_sqrtf(1.0f + tan_theta_i * tan_theta_i));
*G1i = 1.0f / G1_inv;
/* Sample slope_x. */
const float A = 2.0f * randu * G1_inv - 1.0f;
const float AA = A * A;
const float tmp = 1.0f / (AA - 1.0f);
const float B = tan_theta_i;
const float BB = B * B;
const float D = safe_sqrtf(BB * (tmp * tmp) - (AA - BB) * tmp);
const float slope_x_1 = B * tmp - D;
const float slope_x_2 = B * tmp + D;
*slope_x = (A < 0.0f || slope_x_2 * tan_theta_i > 1.0f) ? slope_x_1 : slope_x_2;
/* Sample slope_y. */
float S;
if (randv > 0.5f) {
S = 1.0f;
randv = 2.0f * (randv - 0.5f);
}
else {
S = -1.0f;
randv = 2.0f * (0.5f - randv);
}
const float z = (randv * (randv * (randv * 0.27385f - 0.73369f) + 0.46341f)) /
(randv * (randv * (randv * 0.093073f + 0.309420f) - 1.000000f) + 0.597999f);
*slope_y = S * z * safe_sqrtf(1.0f + (*slope_x) * (*slope_x));
}
template<MicrofacetType m_type>
ccl_device_forceinline float3 microfacet_sample_stretched(KernelGlobals kg,
const float3 wi,
const float alpha_x,
const float alpha_y,
const float randu,
const float randv,
ccl_private float *G1i)
ccl_device_forceinline float3 microfacet_beckmann_sample_vndf(const float3 wi,
const float alpha_x,
const float alpha_y,
const float randu,
const float randv)
{
/* 1. stretch wi */
float3 wi_ = make_float3(alpha_x * wi.x, alpha_y * wi.y, wi.z);
wi_ = normalize(wi_);
/* Compute polar coordinates of wi_. */
float costheta_ = 1.0f;
float sintheta_ = 0.0f;
float cosphi_ = 1.0f;
float sinphi_ = 0.0f;
if (wi_.z < 0.99999f) {
costheta_ = wi_.z;
sintheta_ = safe_sqrtf(1.0f - costheta_ * costheta_);
float invlen = 1.0f / sintheta_;
cosphi_ = wi_.x * invlen;
sinphi_ = wi_.y * invlen;
}
/* 2. sample P22_{wi}(x_slope, y_slope, 1, 1) */
float slope_x, slope_y;
float cos_phi_i = 1.0f;
float sin_phi_i = 0.0f;
if (m_type == MicrofacetType::BECKMANN) {
microfacet_beckmann_sample_slopes(
kg, costheta_, sintheta_, randu, randv, &slope_x, &slope_y, G1i);
if (wi_.z >= 0.99999f) {
/* Special case (normal incidence). */
const float r = sqrtf(-logf(randu));
const float phi = M_2PI_F * randv;
slope_x = r * cosf(phi);
slope_y = r * sinf(phi);
}
else {
microfacet_ggx_sample_slopes(costheta_, sintheta_, randu, randv, &slope_x, &slope_y, G1i);
/* Precomputations. */
const float cos_theta_i = wi_.z;
const float sin_theta_i = sin_from_cos(cos_theta_i);
const float tan_theta_i = sin_theta_i / cos_theta_i;
const float cot_theta_i = 1.0f / tan_theta_i;
const float erf_a = fast_erff(cot_theta_i);
const float exp_a2 = expf(-cot_theta_i * cot_theta_i);
const float SQRT_PI_INV = 0.56418958354f;
float invlen = 1.0f / sin_theta_i;
cos_phi_i = wi_.x * invlen;
sin_phi_i = wi_.y * invlen;
/* Based on paper from Wenzel Jakob
* An Improved Visible Normal Sampling Routine for the Beckmann Distribution
*
* http://www.mitsuba-renderer.org/~wenzel/files/visnormal.pdf
*
* Reformulation from OpenShadingLanguage which avoids using inverse
* trigonometric functions.
*/
/* Sample slope X.
*
* Compute a coarse approximation using the approximation:
* exp(-ierf(x)^2) ~= 1 - x * x
* solve y = 1 + b + K * (1 - b * b)
*/
const float K = tan_theta_i * SQRT_PI_INV;
const float y_approx = randu * (1.0f + erf_a + K * (1 - erf_a * erf_a));
const float y_exact = randu * (1.0f + erf_a + K * exp_a2);
float b = K > 0 ? (0.5f - sqrtf(K * (K - y_approx + 1.0f) + 0.25f)) / K : y_approx - 1.0f;
float inv_erf = fast_ierff(b);
float2 begin = make_float2(-1.0f, -y_exact);
float2 end = make_float2(erf_a, 1.0f + erf_a + K * exp_a2 - y_exact);
float2 current = make_float2(b, 1.0f + b + K * expf(-sqr(inv_erf)) - y_exact);
/* Find root in a monotonic interval using newton method, under given precision and maximal
* iterations. Falls back to bisection if newton step produces results outside of the valid
* interval.*/
const float precision = 1e-6f;
const int max_iter = 3;
int iter = 0;
while (fabsf(current.y) > precision && iter++ < max_iter) {
if (signf(begin.y) == signf(current.y)) {
begin.x = current.x;
begin.y = current.y;
}
else {
end.x = current.x;
}
const float newton_x = current.x - current.y / (1.0f - inv_erf * tan_theta_i);
current.x = (newton_x >= begin.x && newton_x <= end.x) ? newton_x : 0.5f * (begin.x + end.x);
inv_erf = fast_ierff(current.x);
current.y = 1.0f + current.x + K * expf(-sqr(inv_erf)) - y_exact;
}
slope_x = inv_erf;
slope_y = fast_ierff(2.0f * randv - 1.0f);
}
/* 3. rotate */
float tmp = cosphi_ * slope_x - sinphi_ * slope_y;
slope_y = sinphi_ * slope_x + cosphi_ * slope_y;
float tmp = cos_phi_i * slope_x - sin_phi_i * slope_y;
slope_y = sin_phi_i * slope_x + cos_phi_i * slope_y;
slope_x = tmp;
/* 4. unstretch */
@@ -231,6 +138,43 @@ ccl_device_forceinline float3 microfacet_sample_stretched(KernelGlobals kg,
return normalize(make_float3(-slope_x, -slope_y, 1.0f));
}
/* GGX VNDF importance sampling algorithm from:
* Sampling the GGX Distribution of Visible Normals.
* Eric Heitz, JCGT Vol. 7, No. 4, 2018.
* https://jcgt.org/published/0007/04/01/ */
ccl_device_forceinline float3 microfacet_ggx_sample_vndf(const float3 wi,
const float alpha_x,
const float alpha_y,
const float randu,
const float randv)
{
/* Section 3.2: Transforming the view direction to the hemisphere configuration. */
float3 wi_ = normalize(make_float3(alpha_x * wi.x, alpha_y * wi.y, wi.z));
/* Section 4.1: Orthonormal basis. */
float lensq = sqr(wi_.x) + sqr(wi_.y);
float3 T1, T2;
if (lensq > 1e-7f) {
T1 = make_float3(-wi_.y, wi_.x, 0.0f) * inversesqrtf(lensq);
T2 = cross(wi_, T1);
}
else {
/* Normal incidence, any basis is fine. */
T1 = make_float3(1.0f, 0.0f, 0.0f);
T2 = make_float3(0.0f, 1.0f, 0.0f);
}
/* Section 4.2: Parameterization of the projected area. */
float2 t = concentric_sample_disk(randu, randv);
t.y = mix(safe_sqrtf(1.0f - sqr(t.x)), t.y, 0.5f * (1.0f + wi_.z));
/* Section 4.3: Reprojection onto hemisphere. */
float3 H_ = t.x * T1 + t.y * T2 + safe_sqrtf(1.0f - len_squared(t)) * wi_;
/* Section 3.4: Transforming the normal back to the ellipsoid configuration. */
return normalize(make_float3(alpha_x * H_.x, alpha_y * H_.y, max(0.0f, H_.z)));
}
/* Calculate the reflection color
*
* If fresnel is used, the color is an interpolation of the F0 color and white
@@ -238,26 +182,25 @@ ccl_device_forceinline float3 microfacet_sample_stretched(KernelGlobals kg,
*
* Else it is simply white
*/
ccl_device_forceinline Spectrum reflection_color(ccl_private const MicrofacetBsdf *bsdf,
float3 L,
float3 H)
ccl_device_forceinline Spectrum microfacet_fresnel(ccl_private const MicrofacetBsdf *bsdf,
float3 wi,
float3 H)
{
Spectrum F = one_spectrum();
bool use_clearcoat = bsdf->type == CLOSURE_BSDF_MICROFACET_GGX_CLEARCOAT_ID;
bool use_fresnel = (bsdf->type == CLOSURE_BSDF_MICROFACET_GGX_FRESNEL_ID || use_clearcoat);
if (use_fresnel) {
float F0 = fresnel_dielectric_cos(1.0f, bsdf->ior);
F = interpolate_fresnel_color(L, H, bsdf->ior, F0, bsdf->extra->cspec0);
if (CLOSURE_IS_BSDF_MICROFACET_FRESNEL(bsdf->type)) {
return interpolate_fresnel_color(wi, H, bsdf->ior, bsdf->extra->cspec0);
}
if (use_clearcoat) {
F *= 0.25f * bsdf->extra->clearcoat;
else if (bsdf->type == CLOSURE_BSDF_MICROFACET_GGX_CLEARCOAT_ID) {
return make_spectrum(fresnel_dielectric_cos(dot(wi, H), bsdf->ior));
}
else {
return one_spectrum();
}
}
return F;
ccl_device_forceinline void bsdf_microfacet_adjust_weight(ccl_private const ShaderData *sd,
ccl_private MicrofacetBsdf *bsdf)
{
bsdf->sample_weight *= average(microfacet_fresnel(bsdf, sd->wi, bsdf->N));
}
/* Generalized Trowbridge-Reitz for clearcoat. */
@@ -271,37 +214,48 @@ ccl_device_forceinline float bsdf_clearcoat_D(float alpha2, float cos_NH)
return (alpha2 - 1.0f) / (M_PI_F * logf(alpha2) * t);
}
/* Monodirectional shadowing-masking term. */
/* Smith shadowing-masking term, here in the non-separable form.
* For details, see:
* Understanding the Masking-Shadowing Function in Microfacet-Based BRDFs.
* Eric Heitz, JCGT Vol. 3, No. 2, 2014.
* https://jcgt.org/published/0003/02/03/ */
template<MicrofacetType m_type>
ccl_device_inline float bsdf_G1_from_sqr_alpha_tan_n(float sqr_alpha_tan_n)
ccl_device_inline float bsdf_lambda_from_sqr_alpha_tan_n(float sqr_alpha_tan_n)
{
if (m_type == MicrofacetType::GGX) {
return 2.0f / (1.0f + sqrtf(1.0f + sqr_alpha_tan_n));
/* Equation 72. */
return 0.5f * (sqrtf(1.0f + sqr_alpha_tan_n) - 1.0f);
}
else {
/* m_type == MicrofacetType::BECKMANN */
/* m_type == MicrofacetType::BECKMANN
* Approximation from below Equation 69. */
if (sqr_alpha_tan_n < 0.39f) {
/* Equivalent to a >= 1.6f, but also handles sqr_alpha_tan_n == 0.0f cleanly. */
return 0.0f;
}
const float a = inversesqrtf(sqr_alpha_tan_n);
return (a > 1.6f) ? 1.0f : ((2.181f * a + 3.535f) * a) / ((2.577f * a + 2.276f) * a + 1.0f);
return ((0.396f * a - 1.259f) * a + 1.0f) / ((2.181f * a + 3.535f) * a);
}
}
template<MicrofacetType m_type> ccl_device_inline float bsdf_G1(float alpha2, float cos_N)
template<MicrofacetType m_type> ccl_device_inline float bsdf_lambda(float alpha2, float cos_N)
{
return bsdf_G1_from_sqr_alpha_tan_n<m_type>(alpha2 * fmaxf(1.0f / (cos_N * cos_N) - 1.0f, 0.0f));
return bsdf_lambda_from_sqr_alpha_tan_n<m_type>(alpha2 * fmaxf(1.0f / sqr(cos_N) - 1.0f, 0.0f));
}
template<MicrofacetType m_type>
ccl_device_inline float bsdf_aniso_G1(float alpha_x, float alpha_y, float3 V)
ccl_device_inline float bsdf_aniso_lambda(float alpha_x, float alpha_y, float3 V)
{
return bsdf_G1_from_sqr_alpha_tan_n<m_type>((sqr(alpha_x * V.x) + sqr(alpha_y * V.y)) /
sqr(V.z));
const float sqr_alpha_tan_n = (sqr(alpha_x * V.x) + sqr(alpha_y * V.y)) / sqr(V.z);
return bsdf_lambda_from_sqr_alpha_tan_n<m_type>(sqr_alpha_tan_n);
}
/* Smith's separable shadowing-masking term. */
/* Combined shadowing-masking term. */
template<MicrofacetType m_type>
ccl_device_inline float bsdf_G(float alpha2, float cos_NI, float cos_NO)
{
return bsdf_G1<m_type>(alpha2, cos_NI) * bsdf_G1<m_type>(alpha2, cos_NO);
return 1.0f / (1.0f + bsdf_lambda<m_type>(alpha2, cos_NI) + bsdf_lambda<m_type>(alpha2, cos_NO));
}
/* Normal distribution function. */
@@ -335,22 +289,6 @@ ccl_device_inline float bsdf_aniso_D(float alpha_x, float alpha_y, float3 H)
}
}
ccl_device_forceinline void bsdf_microfacet_fresnel_color(ccl_private const ShaderData *sd,
ccl_private MicrofacetBsdf *bsdf)
{
kernel_assert(CLOSURE_IS_BSDF_MICROFACET_FRESNEL(bsdf->type));
float F0 = fresnel_dielectric_cos(1.0f, bsdf->ior);
bsdf->extra->fresnel_color = interpolate_fresnel_color(
sd->wi, bsdf->N, bsdf->ior, F0, bsdf->extra->cspec0);
if (bsdf->type == CLOSURE_BSDF_MICROFACET_GGX_CLEARCOAT_ID) {
bsdf->extra->fresnel_color *= 0.25f * bsdf->extra->clearcoat;
}
bsdf->sample_weight *= average(bsdf->extra->fresnel_color);
}
template<MicrofacetType m_type>
ccl_device Spectrum bsdf_microfacet_eval(ccl_private const ShaderClosure *sc,
const float3 Ng,
@@ -382,7 +320,7 @@ ccl_device Spectrum bsdf_microfacet_eval(ccl_private const ShaderClosure *sc,
H *= inv_len_H;
const float cos_NH = dot(N, H);
float D, G1i, G1o;
float D, lambdaI, lambdaO;
/* TODO: add support for anisotropic transmission. */
if (alpha_x == alpha_y || m_refractive) { /* Isotropic. */
@@ -399,8 +337,8 @@ ccl_device Spectrum bsdf_microfacet_eval(ccl_private const ShaderClosure *sc,
D = bsdf_D<m_type>(alpha2, cos_NH);
}
G1i = bsdf_G1<m_type>(alpha2, cos_NI);
G1o = bsdf_G1<m_type>(alpha2, cos_NO);
lambdaI = bsdf_lambda<m_type>(alpha2, cos_NI);
lambdaO = bsdf_lambda<m_type>(alpha2, cos_NO);
}
else { /* Anisotropic. */
float3 X, Y;
@@ -412,25 +350,23 @@ ccl_device Spectrum bsdf_microfacet_eval(ccl_private const ShaderClosure *sc,
D = bsdf_aniso_D<m_type>(alpha_x, alpha_y, local_H);
G1i = bsdf_aniso_G1<m_type>(alpha_x, alpha_y, local_I);
G1o = bsdf_aniso_G1<m_type>(alpha_x, alpha_y, local_O);
lambdaI = bsdf_aniso_lambda<m_type>(alpha_x, alpha_y, local_I);
lambdaO = bsdf_aniso_lambda<m_type>(alpha_x, alpha_y, local_O);
}
const float common = G1i * D / cos_NI *
const float common = D / cos_NI *
(m_refractive ?
sqr(bsdf->ior * inv_len_H) * fabsf(dot(H, wi) * dot(H, wo)) :
0.25f);
*pdf = common;
*pdf = common / (1.0f + lambdaI);
const Spectrum F = m_refractive ? one_spectrum() : reflection_color(bsdf, wo, H);
return F * G1o * common;
const Spectrum F = microfacet_fresnel(bsdf, wo, H);
return F * common / (1.0f + lambdaO + lambdaI);
}
template<MicrofacetType m_type>
ccl_device int bsdf_microfacet_sample(KernelGlobals kg,
ccl_private const ShaderClosure *sc,
ccl_device int bsdf_microfacet_sample(ccl_private const ShaderClosure *sc,
float3 Ng,
float3 wi,
float randu,
@@ -466,10 +402,15 @@ ccl_device int bsdf_microfacet_sample(KernelGlobals kg,
/* Importance sampling with distribution of visible normals. Vectors are transformed to local
* space before and after sampling. */
float G1i;
const float3 local_I = make_float3(dot(X, wi), dot(Y, wi), cos_NI);
const float3 local_H = microfacet_sample_stretched<m_type>(
kg, local_I, alpha_x, alpha_y, randu, randv, &G1i);
float3 local_H;
if (m_type == MicrofacetType::GGX) {
local_H = microfacet_ggx_sample_vndf(local_I, alpha_x, alpha_y, randu, randv);
}
else {
/* m_type == MicrofacetType::BECKMANN */
local_H = microfacet_beckmann_sample_vndf(local_I, alpha_x, alpha_y, randu, randv);
}
const float3 H = X * local_H.x + Y * local_H.y + N * local_H.z;
const float cos_NH = local_H.z;
@@ -502,19 +443,12 @@ ccl_device int bsdf_microfacet_sample(KernelGlobals kg,
label |= LABEL_SINGULAR;
/* Some high number for MIS. */
*pdf = 1e6f;
*eval = make_spectrum(1e6f);
bool use_fresnel = (bsdf->type == CLOSURE_BSDF_MICROFACET_GGX_FRESNEL_ID ||
bsdf->type == CLOSURE_BSDF_MICROFACET_GGX_CLEARCOAT_ID);
if (use_fresnel && !m_refractive) {
*eval *= reflection_color(bsdf, *wo, H);
}
*eval = make_spectrum(1e6f) * microfacet_fresnel(bsdf, *wo, H);
}
else {
label |= LABEL_GLOSSY;
float cos_NO = dot(N, *wo);
float D, G1o;
float D, lambdaI, lambdaO;
/* TODO: add support for anisotropic transmission. */
if (alpha_x == alpha_y || m_refractive) { /* Isotropic. */
@@ -526,34 +460,32 @@ ccl_device int bsdf_microfacet_sample(KernelGlobals kg,
/* The masking-shadowing term for clearcoat has a fixed alpha of 0.25
* => alpha2 = 0.25 * 0.25 */
alpha2 = 0.0625f;
/* Recalculate G1i. */
G1i = bsdf_G1<m_type>(alpha2, cos_NI);
}
else {
D = bsdf_D<m_type>(alpha2, cos_NH);
}
G1o = bsdf_G1<m_type>(alpha2, cos_NO);
lambdaO = bsdf_lambda<m_type>(alpha2, cos_NO);
lambdaI = bsdf_lambda<m_type>(alpha2, cos_NI);
}
else { /* Anisotropic. */
const float3 local_O = make_float3(dot(X, *wo), dot(Y, *wo), cos_NO);
D = bsdf_aniso_D<m_type>(alpha_x, alpha_y, local_H);
G1o = bsdf_aniso_G1<m_type>(alpha_x, alpha_y, local_O);
lambdaO = bsdf_aniso_lambda<m_type>(alpha_x, alpha_y, local_O);
lambdaI = bsdf_aniso_lambda<m_type>(alpha_x, alpha_y, local_I);
}
const float cos_HO = dot(H, *wo);
const float common = G1i * D / cos_NI *
const float common = D / cos_NI *
(m_refractive ? fabsf(cos_HI * cos_HO) / sqr(cos_HO + cos_HI / m_eta) :
0.25f);
*pdf = common;
*pdf = common / (1.0f + lambdaI);
Spectrum F = m_refractive ? one_spectrum() : reflection_color(bsdf, *wo, H);
*eval = G1o * common * F;
Spectrum F = microfacet_fresnel(bsdf, *wo, H);
*eval = F * common / (1.0f + lambdaI + lambdaO);
}
*sampled_roughness = make_float2(alpha_x, alpha_y);
@@ -587,14 +519,6 @@ ccl_device int bsdf_microfacet_ggx_setup(ccl_private MicrofacetBsdf *bsdf)
return SD_BSDF | SD_BSDF_HAS_EVAL;
}
/* Required to maintain OSL interface. */
ccl_device int bsdf_microfacet_ggx_isotropic_setup(ccl_private MicrofacetBsdf *bsdf)
{
bsdf->alpha_y = bsdf->alpha_x;
return bsdf_microfacet_ggx_setup(bsdf);
}
ccl_device int bsdf_microfacet_ggx_fresnel_setup(ccl_private MicrofacetBsdf *bsdf,
ccl_private const ShaderData *sd)
{
@@ -605,7 +529,7 @@ ccl_device int bsdf_microfacet_ggx_fresnel_setup(ccl_private MicrofacetBsdf *bsd
bsdf->type = CLOSURE_BSDF_MICROFACET_GGX_FRESNEL_ID;
bsdf_microfacet_fresnel_color(sd, bsdf);
bsdf_microfacet_adjust_weight(sd, bsdf);
return SD_BSDF | SD_BSDF_HAS_EVAL;
}
@@ -613,14 +537,12 @@ ccl_device int bsdf_microfacet_ggx_fresnel_setup(ccl_private MicrofacetBsdf *bsd
ccl_device int bsdf_microfacet_ggx_clearcoat_setup(ccl_private MicrofacetBsdf *bsdf,
ccl_private const ShaderData *sd)
{
bsdf->extra->cspec0 = saturate(bsdf->extra->cspec0);
bsdf->alpha_x = saturatef(bsdf->alpha_x);
bsdf->alpha_y = bsdf->alpha_x;
bsdf->type = CLOSURE_BSDF_MICROFACET_GGX_CLEARCOAT_ID;
bsdf_microfacet_fresnel_color(sd, bsdf);
bsdf_microfacet_adjust_weight(sd, bsdf);
return SD_BSDF | SD_BSDF_HAS_EVAL;
}
@@ -654,8 +576,7 @@ ccl_device Spectrum bsdf_microfacet_ggx_eval(ccl_private const ShaderClosure *sc
return bsdf_microfacet_eval<MicrofacetType::GGX>(sc, Ng, wi, wo, pdf);
}
ccl_device int bsdf_microfacet_ggx_sample(KernelGlobals kg,
ccl_private const ShaderClosure *sc,
ccl_device int bsdf_microfacet_ggx_sample(ccl_private const ShaderClosure *sc,
float3 Ng,
float3 wi,
float randu,
@@ -667,7 +588,7 @@ ccl_device int bsdf_microfacet_ggx_sample(KernelGlobals kg,
ccl_private float *eta)
{
return bsdf_microfacet_sample<MicrofacetType::GGX>(
kg, sc, Ng, wi, randu, randv, eval, wo, pdf, sampled_roughness, eta);
sc, Ng, wi, randu, randv, eval, wo, pdf, sampled_roughness, eta);
}
/* Beckmann microfacet with Smith shadow-masking from:
@@ -684,14 +605,6 @@ ccl_device int bsdf_microfacet_beckmann_setup(ccl_private MicrofacetBsdf *bsdf)
return SD_BSDF | SD_BSDF_HAS_EVAL;
}
/* Required to maintain OSL interface. */
ccl_device int bsdf_microfacet_beckmann_isotropic_setup(ccl_private MicrofacetBsdf *bsdf)
{
bsdf->alpha_y = bsdf->alpha_x;
return bsdf_microfacet_beckmann_setup(bsdf);
}
ccl_device int bsdf_microfacet_beckmann_refraction_setup(ccl_private MicrofacetBsdf *bsdf)
{
bsdf->alpha_x = saturatef(bsdf->alpha_x);
@@ -718,8 +631,7 @@ ccl_device Spectrum bsdf_microfacet_beckmann_eval(ccl_private const ShaderClosur
return bsdf_microfacet_eval<MicrofacetType::BECKMANN>(sc, Ng, wi, wo, pdf);
}
ccl_device int bsdf_microfacet_beckmann_sample(KernelGlobals kg,
ccl_private const ShaderClosure *sc,
ccl_device int bsdf_microfacet_beckmann_sample(ccl_private const ShaderClosure *sc,
float3 Ng,
float3 wi,
float randu,
@@ -731,7 +643,7 @@ ccl_device int bsdf_microfacet_beckmann_sample(KernelGlobals kg,
ccl_private float *eta)
{
return bsdf_microfacet_sample<MicrofacetType::BECKMANN>(
kg, sc, Ng, wi, randu, randv, eval, wo, pdf, sampled_roughness, eta);
sc, Ng, wi, randu, randv, eval, wo, pdf, sampled_roughness, eta);
}
CCL_NAMESPACE_END

View File

@@ -43,7 +43,7 @@ ccl_device_forceinline float2 mf_sampleP22_11(const float cosI,
return make_float2(r * cosf(phi), r * sinf(phi));
}
const float sinI = safe_sqrtf(1.0f - cosI * cosI);
const float sinI = sin_from_cos(cosI);
const float tanI = sinI / cosI;
const float projA = 0.5f * (cosI + 1.0f);
if (projA < 0.0001f)
@@ -401,7 +401,7 @@ ccl_device int bsdf_microfacet_multi_ggx_fresnel_setup(ccl_private MicrofacetBsd
bsdf->type = CLOSURE_BSDF_MICROFACET_MULTI_GGX_FRESNEL_ID;
bsdf_microfacet_fresnel_color(sd, bsdf);
bsdf_microfacet_adjust_weight(sd, bsdf);
return bsdf_microfacet_multi_ggx_common_setup(bsdf);
}
@@ -575,7 +575,7 @@ ccl_device int bsdf_microfacet_multi_ggx_glass_fresnel_setup(ccl_private Microfa
bsdf->type = CLOSURE_BSDF_MICROFACET_MULTI_GGX_GLASS_FRESNEL_ID;
bsdf_microfacet_fresnel_color(sd, bsdf);
bsdf_microfacet_adjust_weight(sd, bsdf);
return SD_BSDF | SD_BSDF_HAS_EVAL | SD_BSDF_NEEDS_LCG;
}

View File

@@ -73,9 +73,8 @@ ccl_device_forceinline Spectrum MF_FUNCTION_FULL_NAME(mf_eval)(float3 wi,
eval = make_spectrum(val);
#endif
float F0 = fresnel_dielectric_cos(1.0f, eta);
if (use_fresnel) {
throughput = interpolate_fresnel_color(wi, wh, eta, F0, cspec0);
throughput = interpolate_fresnel_color(wi, wh, eta, cspec0);
eval *= throughput;
}
@@ -144,11 +143,11 @@ ccl_device_forceinline Spectrum MF_FUNCTION_FULL_NAME(mf_eval)(float3 wi,
throughput *= color;
}
else if (use_fresnel && order > 0) {
throughput *= interpolate_fresnel_color(wi_prev, wm, eta, F0, cspec0);
throughput *= interpolate_fresnel_color(wi_prev, wm, eta, cspec0);
}
#else /* MF_MULTI_GLOSSY */
if (use_fresnel && order > 0) {
throughput *= interpolate_fresnel_color(-wr, wm, eta, F0, cspec0);
throughput *= interpolate_fresnel_color(-wr, wm, eta, cspec0);
}
wr = mf_sample_phase_glossy(-wr, &throughput, wm);
#endif
@@ -192,8 +191,6 @@ ccl_device_forceinline Spectrum MF_FUNCTION_FULL_NAME(mf_sample)(float3 wi,
float G1_r = 0.0f;
bool outside = true;
float F0 = fresnel_dielectric_cos(1.0f, eta);
int order;
for (order = 0; order < 10; order++) {
/* Sample microfacet height. */
@@ -229,22 +226,12 @@ ccl_device_forceinline Spectrum MF_FUNCTION_FULL_NAME(mf_sample)(float3 wi,
throughput *= color;
}
else {
Spectrum t_color = interpolate_fresnel_color(wi_prev, wm, eta, F0, cspec0);
if (order == 0)
throughput = t_color;
else
throughput *= t_color;
throughput *= interpolate_fresnel_color(wi_prev, wm, eta, cspec0);
}
}
#else /* MF_MULTI_GLOSSY */
if (use_fresnel) {
Spectrum t_color = interpolate_fresnel_color(-wr, wm, eta, F0, cspec0);
if (order == 0)
throughput = t_color;
else
throughput *= t_color;
throughput *= interpolate_fresnel_color(-wr, wm, eta, cspec0);
}
wr = mf_sample_phase_glossy(-wr, &throughput, wm);
#endif

View File

@@ -89,19 +89,21 @@ ccl_device float schlick_fresnel(float u)
return m2 * m2 * m; // pow(m, 5)
}
/* Calculate the fresnel color which is a blend between white and the F0 color (cspec0) */
ccl_device_forceinline Spectrum
interpolate_fresnel_color(float3 L, float3 H, float ior, float F0, Spectrum cspec0)
/* Calculate the fresnel color, which is a blend between white and the F0 color */
ccl_device_forceinline Spectrum interpolate_fresnel_color(float3 L,
float3 H,
float ior,
Spectrum F0)
{
/* Calculate the fresnel interpolation factor
* The value from fresnel_dielectric_cos(...) has to be normalized because
* the cspec0 keeps the F0 color
*/
float F0_norm = 1.0f / (1.0f - F0);
float FH = (fresnel_dielectric_cos(dot(L, H), ior) - F0) * F0_norm;
/* Compute the real Fresnel term and remap it from real_F0..1 to F0..1.
* The reason why we use this remapping instead of directly doing the
* Schlick approximation lerp(F0, 1.0, (1.0-cosLH)^5) is that for cases
* with similar IORs (e.g. ice in water), the relative IOR can be close
* enough to 1.0 that the Schlick approximation becomes inaccurate. */
float real_F = fresnel_dielectric_cos(dot(L, H), ior);
float real_F0 = fresnel_dielectric_cos(1.0f, ior);
/* Blend between white and a specular color with respect to the fresnel */
return cspec0 * (1.0f - FH) + make_spectrum(FH);
return mix(F0, one_spectrum(), inverse_lerp(real_F0, 1.0f, real_F));
}
ccl_device float3 ensure_valid_reflection(float3 Ng, float3 I, float3 N)

View File

@@ -88,7 +88,7 @@ henyey_greenstrein_sample(float3 D, float g, float randu, float randv, ccl_priva
}
}
float sin_theta = safe_sqrtf(1.0f - cos_theta * cos_theta);
float sin_theta = sin_from_cos(cos_theta);
float phi = M_2PI_F * randv;
float3 dir = make_float3(sin_theta * cosf(phi), sin_theta * sinf(phi), cos_theta);

View File

@@ -401,6 +401,72 @@ ccl_gpu_kernel_threads(GPU_PARALLEL_SORTED_INDEX_DEFAULT_BLOCK_SIZE)
}
ccl_gpu_kernel_postfix
ccl_gpu_kernel_threads(GPU_PARALLEL_SORT_BLOCK_SIZE)
ccl_gpu_kernel_signature(integrator_sort_bucket_pass,
int num_states,
int partition_size,
int num_states_limit,
ccl_global int *indices,
int kernel_index)
{
#if defined(__KERNEL_LOCAL_ATOMIC_SORT__)
int max_shaders = context.launch_params_metal.data.max_shaders;
ccl_global ushort *d_queued_kernel = (ccl_global ushort *)
kernel_integrator_state.path.queued_kernel;
ccl_global uint *d_shader_sort_key = (ccl_global uint *)
kernel_integrator_state.path.shader_sort_key;
ccl_global int *key_offsets = (ccl_global int *)
kernel_integrator_state.sort_partition_key_offsets;
gpu_parallel_sort_bucket_pass(num_states,
partition_size,
max_shaders,
kernel_index,
d_queued_kernel,
d_shader_sort_key,
key_offsets,
(threadgroup int *)threadgroup_array,
metal_local_id,
metal_local_size,
metal_grid_id);
#endif
}
ccl_gpu_kernel_postfix
ccl_gpu_kernel_threads(GPU_PARALLEL_SORT_BLOCK_SIZE)
ccl_gpu_kernel_signature(integrator_sort_write_pass,
int num_states,
int partition_size,
int num_states_limit,
ccl_global int *indices,
int kernel_index)
{
#if defined(__KERNEL_LOCAL_ATOMIC_SORT__)
int max_shaders = context.launch_params_metal.data.max_shaders;
ccl_global ushort *d_queued_kernel = (ccl_global ushort *)
kernel_integrator_state.path.queued_kernel;
ccl_global uint *d_shader_sort_key = (ccl_global uint *)
kernel_integrator_state.path.shader_sort_key;
ccl_global int *key_offsets = (ccl_global int *)
kernel_integrator_state.sort_partition_key_offsets;
gpu_parallel_sort_write_pass(num_states,
partition_size,
max_shaders,
kernel_index,
num_states_limit,
indices,
d_queued_kernel,
d_shader_sort_key,
key_offsets,
(threadgroup int *)threadgroup_array,
metal_local_id,
metal_local_size,
metal_grid_id);
#endif
}
ccl_gpu_kernel_postfix
ccl_gpu_kernel_threads(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE)
ccl_gpu_kernel_signature(integrator_compact_paths_array,
int num_states,

View File

@@ -178,7 +178,7 @@ __device__
simd_lane_index, \
simd_group_index, \
num_simd_groups, \
simdgroup_offset)
(threadgroup int *)threadgroup_array)
#elif defined(__KERNEL_ONEAPI__)
# define gpu_parallel_active_index_array(num_states, indices, num_indices, is_active_op) \

View File

@@ -19,6 +19,115 @@ CCL_NAMESPACE_BEGIN
# define GPU_PARALLEL_SORTED_INDEX_DEFAULT_BLOCK_SIZE 512
#endif
#define GPU_PARALLEL_SORTED_INDEX_INACTIVE_KEY (~0)
#define GPU_PARALLEL_SORT_BLOCK_SIZE 1024
#if defined(__KERNEL_LOCAL_ATOMIC_SORT__)
# define atomic_store_local(p, x) \
atomic_store_explicit((threadgroup atomic_int *)p, x, memory_order_relaxed)
# define atomic_load_local(p) \
atomic_load_explicit((threadgroup atomic_int *)p, memory_order_relaxed)
ccl_device_inline void gpu_parallel_sort_bucket_pass(const uint num_states,
const uint partition_size,
const uint max_shaders,
const uint queued_kernel,
ccl_global ushort *d_queued_kernel,
ccl_global uint *d_shader_sort_key,
ccl_global int *partition_key_offsets,
ccl_gpu_shared int *buckets,
const ushort local_id,
const ushort local_size,
const ushort grid_id)
{
/* Zero the bucket sizes. */
if (local_id < max_shaders) {
atomic_store_local(&buckets[local_id], 0);
}
ccl_gpu_syncthreads();
/* Determine bucket sizes within the partitions. */
const uint partition_start = partition_size * uint(grid_id);
const uint partition_end = min(num_states, partition_start + partition_size);
for (int state_index = partition_start + uint(local_id); state_index < partition_end;
state_index += uint(local_size)) {
ushort kernel_index = d_queued_kernel[state_index];
if (kernel_index == queued_kernel) {
uint key = d_shader_sort_key[state_index] % max_shaders;
atomic_fetch_and_add_uint32(&buckets[key], 1);
}
}
ccl_gpu_syncthreads();
/* Calculate the partition's local offsets from the prefix sum of bucket sizes. */
if (local_id == 0) {
int offset = 0;
for (int i = 0; i < max_shaders; i++) {
partition_key_offsets[i + uint(grid_id) * (max_shaders + 1)] = offset;
offset = offset + atomic_load_local(&buckets[i]);
}
/* Store the number of active states in this partition. */
partition_key_offsets[max_shaders + uint(grid_id) * (max_shaders + 1)] = offset;
}
}
ccl_device_inline void gpu_parallel_sort_write_pass(const uint num_states,
const uint partition_size,
const uint max_shaders,
const uint queued_kernel,
const int num_states_limit,
ccl_global int *indices,
ccl_global ushort *d_queued_kernel,
ccl_global uint *d_shader_sort_key,
ccl_global int *partition_key_offsets,
ccl_gpu_shared int *local_offset,
const ushort local_id,
const ushort local_size,
const ushort grid_id)
{
/* Calculate each partition's global offset from the prefix sum of the active state counts per
* partition. */
if (local_id < max_shaders) {
int partition_offset = 0;
for (int i = 0; i < uint(grid_id); i++) {
int partition_key_count = partition_key_offsets[max_shaders + uint(i) * (max_shaders + 1)];
partition_offset += partition_key_count;
}
ccl_global int *key_offsets = partition_key_offsets + (uint(grid_id) * (max_shaders + 1));
atomic_store_local(&local_offset[local_id], key_offsets[local_id] + partition_offset);
}
ccl_gpu_syncthreads();
/* Write the sorted active indices. */
const uint partition_start = partition_size * uint(grid_id);
const uint partition_end = min(num_states, partition_start + partition_size);
ccl_global int *key_offsets = partition_key_offsets + (uint(grid_id) * max_shaders);
for (int state_index = partition_start + uint(local_id); state_index < partition_end;
state_index += uint(local_size)) {
ushort kernel_index = d_queued_kernel[state_index];
if (kernel_index == queued_kernel) {
uint key = d_shader_sort_key[state_index] % max_shaders;
int index = atomic_fetch_and_add_uint32(&local_offset[key], 1);
if (index < num_states_limit) {
indices[index] = state_index;
}
}
}
}
#endif /* __KERNEL_LOCAL_ATOMIC_SORT__ */
template<typename GetKeyOp>
__device__ void gpu_parallel_sorted_index_array(const uint state_index,

View File

@@ -172,17 +172,14 @@ ccl_device_intersect bool scene_intersect_local(KernelGlobals kg,
kernel_assert(!"Invalid ift_local");
return false;
}
# endif
metal::raytracing::ray r(ray->P, ray->D, ray->tmin, ray->tmax);
metalrt_intersector_type metalrt_intersect;
metalrt_intersect.force_opacity(metal::raytracing::forced_opacity::non_opaque);
bool triangle_only = !kernel_data.bvh.have_curves && !kernel_data.bvh.have_points;
if (triangle_only) {
metalrt_intersect.assume_geometry_type(metal::raytracing::geometry_type::triangle);
if (is_null_intersection_function_table(metal_ancillaries->ift_local_prim)) {
if (local_isect) {
local_isect->num_hits = 0;
}
kernel_assert(!"Invalid ift_local_prim");
return false;
}
# endif
MetalRTIntersectionLocalPayload payload;
payload.self = ray->self;
@@ -195,14 +192,48 @@ ccl_device_intersect bool scene_intersect_local(KernelGlobals kg,
}
payload.result = false;
typename metalrt_intersector_type::result_type intersection;
metal::raytracing::ray r(ray->P, ray->D, ray->tmin, ray->tmax);
# if defined(__METALRT_MOTION__)
metalrt_intersector_type metalrt_intersect;
typename metalrt_intersector_type::result_type intersection;
metalrt_intersect.force_opacity(metal::raytracing::forced_opacity::non_opaque);
bool triangle_only = !kernel_data.bvh.have_curves && !kernel_data.bvh.have_points;
if (triangle_only) {
metalrt_intersect.assume_geometry_type(metal::raytracing::geometry_type::triangle);
}
intersection = metalrt_intersect.intersect(
r, metal_ancillaries->accel_struct, 0xFF, ray->time, metal_ancillaries->ift_local, payload);
# else
metalrt_blas_intersector_type metalrt_intersect;
typename metalrt_blas_intersector_type::result_type intersection;
metalrt_intersect.force_opacity(metal::raytracing::forced_opacity::non_opaque);
bool triangle_only = !kernel_data.bvh.have_curves && !kernel_data.bvh.have_points;
if (triangle_only) {
metalrt_intersect.assume_geometry_type(metal::raytracing::geometry_type::triangle);
}
// if we know we are going to get max one hit, like for random-sss-walk we can
// optimize and accept the first hit
if (max_hits == 1) {
metalrt_intersect.accept_any_intersection(true);
}
int blas_index = metal_ancillaries->blas_userID_to_index_lookUp[local_object];
// transform the ray into object's local space
Transform itfm = kernel_data_fetch(objects, local_object).itfm;
r.origin = transform_point(&itfm, r.origin);
r.direction = transform_direction(&itfm, r.direction);
intersection = metalrt_intersect.intersect(
r, metal_ancillaries->accel_struct, 0xFF, metal_ancillaries->ift_local, payload);
r,
metal_ancillaries->blas_accel_structs[blas_index].blas,
metal_ancillaries->ift_local_prim,
payload);
# endif
if (lcg_state) {

View File

@@ -105,10 +105,11 @@ struct kernel_gpu_##name \
{ \
PARAMS_MAKER(__VA_ARGS__)(__VA_ARGS__) \
void run(thread MetalKernelContext& context, \
threadgroup int *simdgroup_offset, \
threadgroup atomic_int *threadgroup_array, \
const uint metal_global_id, \
const ushort metal_local_id, \
const ushort metal_local_size, \
const ushort metal_grid_id, \
uint simdgroup_size, \
uint simd_lane_index, \
uint simd_group_index, \
@@ -117,22 +118,24 @@ struct kernel_gpu_##name \
kernel void cycles_metal_##name(device const kernel_gpu_##name *params_struct, \
constant KernelParamsMetal &ccl_restrict _launch_params_metal, \
constant MetalAncillaries *_metal_ancillaries, \
threadgroup int *simdgroup_offset[[ threadgroup(0) ]], \
threadgroup atomic_int *threadgroup_array[[ threadgroup(0) ]], \
const uint metal_global_id [[thread_position_in_grid]], \
const ushort metal_local_id [[thread_position_in_threadgroup]], \
const ushort metal_local_size [[threads_per_threadgroup]], \
const ushort metal_grid_id [[threadgroup_position_in_grid]], \
uint simdgroup_size [[threads_per_simdgroup]], \
uint simd_lane_index [[thread_index_in_simdgroup]], \
uint simd_group_index [[simdgroup_index_in_threadgroup]], \
uint num_simd_groups [[simdgroups_per_threadgroup]]) { \
MetalKernelContext context(_launch_params_metal, _metal_ancillaries); \
params_struct->run(context, simdgroup_offset, metal_global_id, metal_local_id, metal_local_size, simdgroup_size, simd_lane_index, simd_group_index, num_simd_groups); \
params_struct->run(context, threadgroup_array, metal_global_id, metal_local_id, metal_local_size, metal_grid_id, simdgroup_size, simd_lane_index, simd_group_index, num_simd_groups); \
} \
void kernel_gpu_##name::run(thread MetalKernelContext& context, \
threadgroup int *simdgroup_offset, \
threadgroup atomic_int *threadgroup_array, \
const uint metal_global_id, \
const ushort metal_local_id, \
const ushort metal_local_size, \
const ushort metal_grid_id, \
uint simdgroup_size, \
uint simd_lane_index, \
uint simd_group_index, \
@@ -263,13 +266,25 @@ ccl_device_forceinline uchar4 make_uchar4(const uchar x,
# if defined(__METALRT_MOTION__)
# define METALRT_TAGS instancing, instance_motion, primitive_motion
# define METALRT_BLAS_TAGS , primitive_motion
# else
# define METALRT_TAGS instancing
# define METALRT_BLAS_TAGS
# endif /* __METALRT_MOTION__ */
typedef acceleration_structure<METALRT_TAGS> metalrt_as_type;
typedef intersection_function_table<triangle_data, METALRT_TAGS> metalrt_ift_type;
typedef metal::raytracing::intersector<triangle_data, METALRT_TAGS> metalrt_intersector_type;
# if defined(__METALRT_MOTION__)
typedef acceleration_structure<primitive_motion> metalrt_blas_as_type;
typedef intersection_function_table<triangle_data, primitive_motion> metalrt_blas_ift_type;
typedef metal::raytracing::intersector<triangle_data, primitive_motion>
metalrt_blas_intersector_type;
# else
typedef acceleration_structure<> metalrt_blas_as_type;
typedef intersection_function_table<triangle_data> metalrt_blas_ift_type;
typedef metal::raytracing::intersector<triangle_data> metalrt_blas_intersector_type;
# endif
#endif /* __METALRT__ */
@@ -282,6 +297,12 @@ struct Texture3DParamsMetal {
texture3d<float, access::sample> tex;
};
#ifdef __METALRT__
struct MetalRTBlasWrapper {
metalrt_blas_as_type blas;
};
#endif
struct MetalAncillaries {
device Texture2DParamsMetal *textures_2d;
device Texture3DParamsMetal *textures_3d;
@@ -291,6 +312,9 @@ struct MetalAncillaries {
metalrt_ift_type ift_default;
metalrt_ift_type ift_shadow;
metalrt_ift_type ift_local;
metalrt_blas_ift_type ift_local_prim;
constant MetalRTBlasWrapper *blas_accel_structs;
constant int *blas_userID_to_index_lookUp;
#endif
};

View File

@@ -34,7 +34,7 @@ class MetalKernelContext {
kernel_assert(0);
return 0;
}
#ifdef __KERNEL_METAL_INTEL__
template<typename TextureType, typename CoordsType>
inline __attribute__((__always_inline__))
@@ -55,7 +55,7 @@ class MetalKernelContext {
}
}
#endif
// texture2d
template<>
inline __attribute__((__always_inline__))

View File

@@ -139,6 +139,20 @@ TReturn metalrt_local_hit(constant KernelParamsMetal &launch_params_metal,
#endif
}
[[intersection(triangle, triangle_data )]] TriangleIntersectionResult
__anyhit__cycles_metalrt_local_hit_tri_prim(
constant KernelParamsMetal &launch_params_metal [[buffer(1)]],
ray_data MetalKernelContext::MetalRTIntersectionLocalPayload &payload [[payload]],
uint primitive_id [[primitive_id]],
float2 barycentrics [[barycentric_coord]],
float ray_tmax [[distance]])
{
//instance_id, aka the user_id has been removed. If we take this function we optimized the
//SSS for starting traversal from a primitive acceleration structure instead of the root of the global AS.
//this means we will always be intersecting the correct object no need for the userid to check
return metalrt_local_hit<TriangleIntersectionResult, METALRT_HIT_TRIANGLE>(
launch_params_metal, payload, payload.local_object, primitive_id, barycentrics, ray_tmax);
}
[[intersection(triangle, triangle_data, METALRT_TAGS)]] TriangleIntersectionResult
__anyhit__cycles_metalrt_local_hit_tri(
constant KernelParamsMetal &launch_params_metal [[buffer(1)]],
@@ -163,6 +177,17 @@ __anyhit__cycles_metalrt_local_hit_box(const float ray_tmax [[max_distance]])
return result;
}
[[intersection(bounding_box, triangle_data )]] BoundingBoxIntersectionResult
__anyhit__cycles_metalrt_local_hit_box_prim(const float ray_tmax [[max_distance]])
{
/* unused function */
BoundingBoxIntersectionResult result;
result.distance = ray_tmax;
result.accept = false;
result.continue_search = false;
return result;
}
template<uint intersection_type>
bool metalrt_shadow_all_hit(constant KernelParamsMetal &launch_params_metal,
ray_data MetalKernelContext::MetalRTIntersectionShadowPayload &payload,

View File

@@ -195,7 +195,15 @@ using sycl::half;
#define fmodf(x, y) sycl::fmod((x), (y))
#define lgammaf(x) sycl::lgamma((x))
#define cosf(x) sycl::native::cos(((float)(x)))
/* `sycl::native::cos` precision is not sufficient and `-ffast-math` lets
* the current DPC++ compiler overload `sycl::cos` with it.
* We work around this issue by directly calling the SPIRV implementation which
* provides greater precision. */
#if defined(__SYCL_DEVICE_ONLY__) && defined(__SPIR__)
# define cosf(x) __spirv_ocl_cos(((float)(x)))
#else
# define cosf(x) sycl::cos(((float)(x)))
#endif
#define sinf(x) sycl::native::sin(((float)(x)))
#define powf(x, y) sycl::native::powr(((float)(x)), ((float)(y)))
#define tanf(x) sycl::native::tan(((float)(x)))

View File

@@ -372,6 +372,16 @@ bool oneapi_enqueue_kernel(KernelContext *kernel_context,
kg, cgh, global_size, local_size, args, oneapi_kernel_integrator_sorted_paths_array);
break;
}
case DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS: {
oneapi_call(
kg, cgh, global_size, local_size, args, oneapi_kernel_integrator_sort_bucket_pass);
break;
}
case DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS: {
oneapi_call(
kg, cgh, global_size, local_size, args, oneapi_kernel_integrator_sort_write_pass);
break;
}
case DEVICE_KERNEL_INTEGRATOR_COMPACT_PATHS_ARRAY: {
oneapi_call(kg,
cgh,

View File

@@ -58,23 +58,7 @@ ccl_device_forceinline void film_write_denoising_features_surface(KernelGlobals
normal += sc->N * sc->sample_weight;
sum_weight += sc->sample_weight;
Spectrum closure_albedo = sc->weight;
/* Closures that include a Fresnel term typically have weights close to 1 even though their
* actual contribution is significantly lower.
* To account for this, we scale their weight by the average fresnel factor (the same is also
* done for the sample weight in the BSDF setup, so we don't need to scale that here). */
if (CLOSURE_IS_BSDF_MICROFACET_FRESNEL(sc->type)) {
ccl_private MicrofacetBsdf *bsdf = (ccl_private MicrofacetBsdf *)sc;
closure_albedo *= bsdf->extra->fresnel_color;
}
else if (sc->type == CLOSURE_BSDF_PRINCIPLED_SHEEN_ID) {
ccl_private PrincipledSheenBsdf *bsdf = (ccl_private PrincipledSheenBsdf *)sc;
closure_albedo *= bsdf->avg_value;
}
else if (sc->type == CLOSURE_BSDF_HAIR_PRINCIPLED_ID) {
closure_albedo *= bsdf_principled_hair_albedo(sc);
}
else if (sc->type == CLOSURE_BSDF_PRINCIPLED_DIFFUSE_ID) {
if (sc->type == CLOSURE_BSDF_PRINCIPLED_DIFFUSE_ID) {
/* BSSRDF already accounts for weight, retro-reflection would double up. */
ccl_private const PrincipledDiffuseBsdf *bsdf = (ccl_private const PrincipledDiffuseBsdf *)
sc;
@@ -83,6 +67,7 @@ ccl_device_forceinline void film_write_denoising_features_surface(KernelGlobals
}
}
Spectrum closure_albedo = bsdf_albedo(sd, sc);
if (bsdf_get_specular_roughness_squared(sc) > sqr(0.075f)) {
diffuse_albedo += closure_albedo;
sum_nonspecular_weight += sc->sample_weight;

View File

@@ -720,7 +720,7 @@ ccl_device_inline void curve_shader_setup(KernelGlobals kg,
const float3 tangent = normalize(dPdu);
const float3 bitangent = normalize(cross(tangent, -D));
const float sine = sd->v;
const float cosine = safe_sqrtf(1.0f - sine * sine);
const float cosine = cos_from_sin(sine);
sd->N = normalize(sine * bitangent - cosine * normalize(cross(tangent, bitangent)));
# if 0

View File

@@ -704,9 +704,9 @@ ccl_device_forceinline bool mnee_compute_transfer_matrix(ccl_private const Shade
float ilo = -eta * ilh;
float cos_theta = dot(wo, m.n);
float sin_theta = safe_sqrtf(1.f - sqr(cos_theta));
float sin_theta = sin_from_cos(cos_theta);
float cos_phi = dot(wo, s);
float sin_phi = safe_sqrtf(1.f - sqr(cos_phi));
float sin_phi = sin_from_cos(cos_phi);
/* Wo = (cos_phi * sin_theta) * s + (sin_phi * sin_theta) * t + cos_theta * n. */
float3 dH_dtheta = ilo * (cos_theta * (cos_phi * s + sin_phi * t) - sin_theta * m.n);

View File

@@ -132,6 +132,9 @@ typedef struct IntegratorStateGPU {
/* Index of main path which will be used by a next shadow catcher split. */
ccl_global int *next_main_path_index;
/* Partition/key offsets used when writing sorted active indices. */
ccl_global int *sort_partition_key_offsets;
/* Divisor used to partition active indices by locality when sorting by material. */
uint sort_partition_divisor;
} IntegratorStateGPU;

View File

@@ -115,6 +115,13 @@ ccl_device_forceinline void integrator_path_init_sorted(KernelGlobals kg,
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], 1);
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
INTEGRATOR_STATE_WRITE(state, path, shader_sort_key) = key_;
# if defined(__KERNEL_LOCAL_ATOMIC_SORT__)
if (!kernel_integrator_state.sort_key_counter[next_kernel]) {
return;
}
# endif
atomic_fetch_and_add_uint32(&kernel_integrator_state.sort_key_counter[next_kernel][key_], 1);
}
@@ -130,6 +137,13 @@ ccl_device_forceinline void integrator_path_next_sorted(KernelGlobals kg,
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], 1);
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
INTEGRATOR_STATE_WRITE(state, path, shader_sort_key) = key_;
# if defined(__KERNEL_LOCAL_ATOMIC_SORT__)
if (!kernel_integrator_state.sort_key_counter[next_kernel]) {
return;
}
# endif
atomic_fetch_and_add_uint32(&kernel_integrator_state.sort_key_counter[next_kernel][key_], 1);
}

View File

@@ -136,7 +136,7 @@ ccl_device_forceinline float diffusion_length_dwivedi(float alpha)
ccl_device_forceinline float3 direction_from_cosine(float3 D, float cos_theta, float randv)
{
float sin_theta = safe_sqrtf(1.0f - cos_theta * cos_theta);
float sin_theta = sin_from_cos(cos_theta);
float phi = M_2PI_F * randv;
float3 dir = make_float3(sin_theta * cosf(phi), sin_theta * sinf(phi), cos_theta);

View File

@@ -621,7 +621,7 @@ ccl_device Spectrum surface_shader_diffuse(KernelGlobals kg, ccl_private const S
ccl_private const ShaderClosure *sc = &sd->closure[i];
if (CLOSURE_IS_BSDF_DIFFUSE(sc->type) || CLOSURE_IS_BSSRDF(sc->type))
eval += sc->weight;
eval += bsdf_albedo(sd, sc);
}
return eval;
@@ -635,7 +635,7 @@ ccl_device Spectrum surface_shader_glossy(KernelGlobals kg, ccl_private const Sh
ccl_private const ShaderClosure *sc = &sd->closure[i];
if (CLOSURE_IS_BSDF_GLOSSY(sc->type))
eval += sc->weight;
eval += bsdf_albedo(sd, sc);
}
return eval;
@@ -649,7 +649,7 @@ ccl_device Spectrum surface_shader_transmission(KernelGlobals kg, ccl_private co
ccl_private const ShaderClosure *sc = &sd->closure[i];
if (CLOSURE_IS_BSDF_TRANSMISSION(sc->type))
eval += sc->weight;
eval += bsdf_albedo(sd, sc);
}
return eval;

View File

@@ -102,7 +102,7 @@ ccl_device float area_light_spread_attenuation(const float3 D,
/* The factor M_PI_F comes from integrating the radiance over the hemisphere */
return (cos_a > 0.9999997f) ? M_PI_F : 0.0f;
}
const float sin_a = safe_sqrtf(1.0f - sqr(cos_a));
const float sin_a = sin_from_cos(cos_a);
const float tan_a = sin_a / cos_a;
return max((tan_half_spread - tan_a) * normalize_spread, 0.0f);
}

View File

@@ -7,24 +7,13 @@
CCL_NAMESPACE_BEGIN
ccl_device float spot_light_attenuation(float3 dir,
float cos_half_spot_angle,
float spot_smooth,
float3 N)
ccl_device float spot_light_attenuation(const ccl_global KernelSpotLight *spot, float3 ray)
{
float attenuation = dot(dir, N);
const float3 scaled_ray = safe_normalize(
make_float3(dot(ray, spot->axis_u), dot(ray, spot->axis_v), dot(ray, spot->dir)) /
spot->len);
if (attenuation <= cos_half_spot_angle) {
attenuation = 0.0f;
}
else {
float t = attenuation - cos_half_spot_angle;
if (t < spot_smooth && spot_smooth != 0.0f)
attenuation *= smoothstepf(t / spot_smooth);
}
return attenuation;
return smoothstepf((scaled_ray.z - spot->cos_half_spot_angle) / spot->spot_smooth);
}
template<bool in_volume_segment>
@@ -57,8 +46,7 @@ ccl_device_inline bool spot_light_sample(const ccl_global KernelLight *klight,
ls->eval_fac = (0.25f * M_1_PI_F) * invarea;
/* spot light attenuation */
ls->eval_fac *= spot_light_attenuation(
klight->spot.dir, klight->spot.cos_half_spot_angle, klight->spot.spot_smooth, -ls->D);
ls->eval_fac *= spot_light_attenuation(&klight->spot, -ls->D);
if (!in_volume_segment && ls->eval_fac == 0.0f) {
return false;
}
@@ -87,8 +75,7 @@ ccl_device_forceinline void spot_light_update_position(const ccl_global KernelLi
ls->pdf = invarea;
/* spot light attenuation */
ls->eval_fac *= spot_light_attenuation(
klight->spot.dir, klight->spot.cos_half_spot_angle, klight->spot.spot_smooth, ls->Ng);
ls->eval_fac *= spot_light_attenuation(&klight->spot, ls->Ng);
}
ccl_device_inline bool spot_light_intersect(const ccl_global KernelLight *klight,
@@ -129,8 +116,7 @@ ccl_device_inline bool spot_light_sample_from_intersection(
ls->pdf = invarea;
/* spot light attenuation */
ls->eval_fac *= spot_light_attenuation(
klight->spot.dir, klight->spot.cos_half_spot_angle, klight->spot.spot_smooth, -ls->D);
ls->eval_fac *= spot_light_attenuation(&klight->spot, -ls->D);
if (ls->eval_fac == 0.0f) {
return false;

View File

@@ -47,11 +47,6 @@ ccl_device float light_tree_cos_bounding_box_angle(const BoundingBox bbox,
return cos_theta_u;
}
ccl_device_forceinline float sin_from_cos(const float c)
{
return safe_sqrtf(1.0f - sqr(c));
}
/* Compute vector v as in Fig .8. P_v is the corresponding point along the ray. */
ccl_device float3 compute_v(
const float3 centroid, const float3 P, const float3 D, const float3 bcone_axis, const float t)

View File

@@ -218,7 +218,7 @@ ccl_device_forceinline bool triangle_light_sample(KernelGlobals kg,
/* Finally, select a random point along the edge of the new triangle
* That point on the spherical triangle is the sampled ray direction */
const float z = 1.0f - randv * (1.0f - dot(C_, B));
ls->D = z * B + safe_sqrtf(1.0f - z * z) * safe_normalize(C_ - dot(C_, B) * B);
ls->D = z * B + sin_from_cos(z) * safe_normalize(C_ - dot(C_, B) * B);
/* calculate intersection with the planar triangle */
if (!ray_triangle_intersect(

View File

@@ -209,14 +209,7 @@ ccl_device void osl_closure_microfacet_setup(KernelGlobals kg,
if (closure->distribution == make_string("ggx", 11253504724482777663ull) ||
closure->distribution == make_string("default", 4430693559278735917ull)) {
if (!closure->refract) {
if (closure->alpha_x == closure->alpha_y) {
/* Isotropic */
sd->flag |= bsdf_microfacet_ggx_isotropic_setup(bsdf);
}
else {
/* Anisotropic */
sd->flag |= bsdf_microfacet_ggx_setup(bsdf);
}
sd->flag |= bsdf_microfacet_ggx_setup(bsdf);
}
else {
sd->flag |= bsdf_microfacet_ggx_refraction_setup(bsdf);
@@ -225,14 +218,7 @@ ccl_device void osl_closure_microfacet_setup(KernelGlobals kg,
/* Beckmann */
else {
if (!closure->refract) {
if (closure->alpha_x == closure->alpha_y) {
/* Isotropic */
sd->flag |= bsdf_microfacet_beckmann_isotropic_setup(bsdf);
}
else {
/* Anisotropic */
sd->flag |= bsdf_microfacet_beckmann_setup(bsdf);
}
sd->flag |= bsdf_microfacet_beckmann_setup(bsdf);
}
else {
sd->flag |= bsdf_microfacet_beckmann_refraction_setup(bsdf);
@@ -258,9 +244,9 @@ ccl_device void osl_closure_microfacet_ggx_setup(
}
bsdf->N = ensure_valid_reflection(sd->Ng, sd->wi, closure->N);
bsdf->alpha_x = closure->alpha_x;
bsdf->alpha_x = bsdf->alpha_y = closure->alpha_x;
sd->flag |= bsdf_microfacet_ggx_isotropic_setup(bsdf);
sd->flag |= bsdf_microfacet_ggx_setup(bsdf);
}
ccl_device void osl_closure_microfacet_ggx_aniso_setup(
@@ -345,7 +331,6 @@ ccl_device void osl_closure_microfacet_ggx_fresnel_setup(
bsdf->extra = extra;
bsdf->extra->color = rgb_to_spectrum(closure->color);
bsdf->extra->cspec0 = rgb_to_spectrum(closure->cspec0);
bsdf->extra->clearcoat = 0.0f;
bsdf->T = zero_float3();
@@ -383,7 +368,6 @@ ccl_device void osl_closure_microfacet_ggx_aniso_fresnel_setup(
bsdf->extra = extra;
bsdf->extra->color = rgb_to_spectrum(closure->color);
bsdf->extra->cspec0 = rgb_to_spectrum(closure->cspec0);
bsdf->extra->clearcoat = 0.0f;
bsdf->T = closure->T;
@@ -426,7 +410,6 @@ ccl_device void osl_closure_microfacet_multi_ggx_setup(
bsdf->extra = extra;
bsdf->extra->color = rgb_to_spectrum(closure->color);
bsdf->extra->cspec0 = zero_spectrum();
bsdf->extra->clearcoat = 0.0f;
bsdf->T = zero_float3();
@@ -467,7 +450,6 @@ ccl_device void osl_closure_microfacet_multi_ggx_glass_setup(
bsdf->extra = extra;
bsdf->extra->color = rgb_to_spectrum(closure->color);
bsdf->extra->cspec0 = zero_spectrum();
bsdf->extra->clearcoat = 0.0f;
bsdf->T = zero_float3();
@@ -508,7 +490,6 @@ ccl_device void osl_closure_microfacet_multi_ggx_aniso_setup(
bsdf->extra = extra;
bsdf->extra->color = rgb_to_spectrum(closure->color);
bsdf->extra->cspec0 = zero_spectrum();
bsdf->extra->clearcoat = 0.0f;
bsdf->T = closure->T;
@@ -551,7 +532,6 @@ ccl_device void osl_closure_microfacet_multi_ggx_fresnel_setup(
bsdf->extra = extra;
bsdf->extra->color = rgb_to_spectrum(closure->color);
bsdf->extra->cspec0 = rgb_to_spectrum(closure->cspec0);
bsdf->extra->clearcoat = 0.0f;
bsdf->T = zero_float3();
@@ -592,7 +572,6 @@ ccl_device void osl_closure_microfacet_multi_ggx_glass_fresnel_setup(
bsdf->extra = extra;
bsdf->extra->color = rgb_to_spectrum(closure->color);
bsdf->extra->cspec0 = rgb_to_spectrum(closure->cspec0);
bsdf->extra->clearcoat = 0.0f;
bsdf->T = zero_float3();
@@ -633,7 +612,6 @@ ccl_device void osl_closure_microfacet_multi_ggx_aniso_fresnel_setup(
bsdf->extra = extra;
bsdf->extra->color = rgb_to_spectrum(closure->color);
bsdf->extra->cspec0 = rgb_to_spectrum(closure->cspec0);
bsdf->extra->clearcoat = 0.0f;
bsdf->T = closure->T;
@@ -660,9 +638,9 @@ ccl_device void osl_closure_microfacet_beckmann_setup(
}
bsdf->N = ensure_valid_reflection(sd->Ng, sd->wi, closure->N);
bsdf->alpha_x = closure->alpha_x;
bsdf->alpha_x = bsdf->alpha_y = closure->alpha_x;
sd->flag |= bsdf_microfacet_beckmann_isotropic_setup(bsdf);
sd->flag |= bsdf_microfacet_beckmann_setup(bsdf);
}
ccl_device void osl_closure_microfacet_beckmann_aniso_setup(
@@ -865,27 +843,18 @@ ccl_device void osl_closure_principled_clearcoat_setup(
float3 weight,
ccl_private const PrincipledClearcoatClosure *closure)
{
weight *= 0.25f * closure->clearcoat;
ccl_private MicrofacetBsdf *bsdf = (ccl_private MicrofacetBsdf *)bsdf_alloc(
sd, sizeof(MicrofacetBsdf), rgb_to_spectrum(weight));
if (!bsdf) {
return;
}
MicrofacetExtra *extra = (MicrofacetExtra *)closure_alloc_extra(sd, sizeof(MicrofacetExtra));
if (!extra) {
return;
}
bsdf->N = ensure_valid_reflection(sd->Ng, sd->wi, closure->N);
bsdf->alpha_x = closure->clearcoat_roughness;
bsdf->alpha_y = closure->clearcoat_roughness;
bsdf->ior = 1.5f;
bsdf->extra = extra;
bsdf->extra->color = zero_spectrum();
bsdf->extra->cspec0 = make_spectrum(0.04f);
bsdf->extra->clearcoat = closure->clearcoat;
bsdf->T = zero_float3();
sd->flag |= bsdf_microfacet_ggx_clearcoat_setup(bsdf, sd);

View File

@@ -161,7 +161,10 @@ ccl_device_inline void osl_eval_nodes(KernelGlobals kg,
/* shadeindex = */ 0);
# endif
if (globals.Ci) {
if constexpr (type == SHADER_TYPE_DISPLACEMENT) {
sd->P = globals.P;
}
else if (globals.Ci) {
flatten_closure_tree(kg, sd, path_flag, globals.Ci);
}
}

View File

@@ -67,17 +67,18 @@ ccl_device_inline void sample_uniform_cone(const float3 N,
ccl_private float3 *wo,
ccl_private float *pdf)
{
float zMin = cosf(angle);
float z = zMin - zMin * randu + randu;
float r = safe_sqrtf(1.0f - sqr(z));
float phi = M_2PI_F * randv;
float x = r * cosf(phi);
float y = r * sinf(phi);
const float cosThetaMin = cosf(angle);
const float cosTheta = mix(cosThetaMin, 1.0f, randu);
const float sinTheta = sin_from_cos(cosTheta);
const float phi = M_2PI_F * randv;
const float x = sinTheta * cosf(phi);
const float y = sinTheta * sinf(phi);
const float z = cosTheta;
float3 T, B;
make_orthonormals(N, &T, &B);
*wo = x * T + y * B + z * N;
*pdf = M_1_2PI_F / (1.0f - zMin);
*pdf = M_1_2PI_F / (1.0f - cosThetaMin);
}
ccl_device_inline float pdf_uniform_cone(const float3 N, float3 D, float angle)

View File

@@ -46,17 +46,8 @@ ccl_device_noinline_cpu float2 svm_brick(float3 p,
float tint = saturatef((brick_noise((rownum << 16) + (bricknum & 0xFFFF)) + bias));
float min_dist = min(min(x, y), min(brick_width - x, row_height - y));
float mortar;
if (min_dist >= mortar_size) {
mortar = 0.0f;
}
else if (mortar_smooth == 0.0f) {
mortar = 1.0f;
}
else {
min_dist = 1.0f - min_dist / mortar_size;
mortar = (min_dist < mortar_smooth) ? smoothstepf(min_dist / mortar_smooth) : 1.0f;
}
min_dist = 1.0f - min_dist / mortar_size;
float mortar = smoothstepf(min_dist / mortar_smooth);
return make_float2(tint, mortar);
}

View File

@@ -333,7 +333,6 @@ ccl_device_noinline int svm_node_closure_bsdf(KernelGlobals kg,
bsdf->extra->cspec0 = rgb_to_spectrum(
(specular * 0.08f * tmp_col) * (1.0f - metallic) + base_color * metallic);
bsdf->extra->color = rgb_to_spectrum(base_color);
bsdf->extra->clearcoat = 0.0f;
/* setup bsdf */
if (distribution == CLOSURE_BSDF_MICROFACET_GGX_GLASS_ID ||
@@ -383,7 +382,6 @@ ccl_device_noinline int svm_node_closure_bsdf(KernelGlobals kg,
bsdf->extra->color = rgb_to_spectrum(base_color);
bsdf->extra->cspec0 = rgb_to_spectrum(cspec0);
bsdf->extra->clearcoat = 0.0f;
/* setup bsdf */
sd->flag |= bsdf_microfacet_ggx_fresnel_setup(bsdf, sd);
@@ -440,7 +438,6 @@ ccl_device_noinline int svm_node_closure_bsdf(KernelGlobals kg,
bsdf->extra->color = rgb_to_spectrum(base_color);
bsdf->extra->cspec0 = rgb_to_spectrum(cspec0);
bsdf->extra->clearcoat = 0.0f;
/* setup bsdf */
sd->flag |= bsdf_microfacet_multi_ggx_glass_fresnel_setup(bsdf, sd);
@@ -455,30 +452,20 @@ ccl_device_noinline int svm_node_closure_bsdf(KernelGlobals kg,
#ifdef __CAUSTICS_TRICKS__
if (kernel_data.integrator.caustics_reflective || (path_flag & PATH_RAY_DIFFUSE) == 0) {
#endif
if (clearcoat > CLOSURE_WEIGHT_CUTOFF) {
ccl_private MicrofacetBsdf *bsdf = (ccl_private MicrofacetBsdf *)bsdf_alloc(
sd, sizeof(MicrofacetBsdf), weight);
ccl_private MicrofacetExtra *extra =
(bsdf != NULL) ?
(ccl_private MicrofacetExtra *)closure_alloc_extra(sd, sizeof(MicrofacetExtra)) :
NULL;
Spectrum clearcoat_weight = 0.25f * clearcoat * weight;
ccl_private MicrofacetBsdf *bsdf = (ccl_private MicrofacetBsdf *)bsdf_alloc(
sd, sizeof(MicrofacetBsdf), clearcoat_weight);
if (bsdf && extra) {
bsdf->N = clearcoat_normal;
bsdf->T = zero_float3();
bsdf->ior = 1.5f;
bsdf->extra = extra;
if (bsdf) {
bsdf->N = clearcoat_normal;
bsdf->T = zero_float3();
bsdf->ior = 1.5f;
bsdf->alpha_x = clearcoat_roughness * clearcoat_roughness;
bsdf->alpha_y = clearcoat_roughness * clearcoat_roughness;
bsdf->alpha_x = clearcoat_roughness * clearcoat_roughness;
bsdf->alpha_y = clearcoat_roughness * clearcoat_roughness;
bsdf->extra->color = zero_spectrum();
bsdf->extra->cspec0 = make_spectrum(0.04f);
bsdf->extra->clearcoat = clearcoat;
/* setup bsdf */
sd->flag |= bsdf_microfacet_ggx_clearcoat_setup(bsdf, sd);
}
/* setup bsdf */
sd->flag |= bsdf_microfacet_ggx_clearcoat_setup(bsdf, sd);
}
#ifdef __CAUSTICS_TRICKS__
}
@@ -584,7 +571,6 @@ ccl_device_noinline int svm_node_closure_bsdf(KernelGlobals kg,
if (bsdf->extra) {
bsdf->extra->color = rgb_to_spectrum(stack_load_float3(stack, data_node.w));
bsdf->extra->cspec0 = zero_spectrum();
bsdf->extra->clearcoat = 0.0f;
sd->flag |= bsdf_microfacet_multi_ggx_setup(bsdf);
}
}
@@ -724,7 +710,6 @@ ccl_device_noinline int svm_node_closure_bsdf(KernelGlobals kg,
kernel_assert(stack_valid(data_node.z));
bsdf->extra->color = rgb_to_spectrum(stack_load_float3(stack, data_node.z));
bsdf->extra->cspec0 = zero_spectrum();
bsdf->extra->clearcoat = 0.0f;
/* setup bsdf */
sd->flag |= bsdf_microfacet_multi_ggx_glass_setup(bsdf);

View File

@@ -489,8 +489,7 @@ typedef enum ClosureType {
#define CLOSURE_IS_BSDF_MICROFACET_FRESNEL(type) \
(type == CLOSURE_BSDF_MICROFACET_MULTI_GGX_FRESNEL_ID || \
type == CLOSURE_BSDF_MICROFACET_MULTI_GGX_GLASS_FRESNEL_ID || \
type == CLOSURE_BSDF_MICROFACET_GGX_FRESNEL_ID || \
type == CLOSURE_BSDF_MICROFACET_GGX_CLEARCOAT_ID)
type == CLOSURE_BSDF_MICROFACET_GGX_FRESNEL_ID)
#define CLOSURE_IS_BSDF_OR_BSSRDF(type) (type <= CLOSURE_BSSRDF_RANDOM_WALK_FIXED_RADIUS_ID)
#define CLOSURE_IS_BSSRDF(type) \
(type >= CLOSURE_BSSRDF_BURLEY_ID && type <= CLOSURE_BSSRDF_RANDOM_WALK_FIXED_RADIUS_ID)

View File

@@ -74,7 +74,8 @@ CCL_NAMESPACE_BEGIN
#define __VOLUME__
/* TODO: solve internal compiler errors and enable light tree on HIP. */
#ifdef __KERNEL_HIP__
/* TODO: solve internal compiler perf issue and enable light tree on Metal/AMD. */
#if defined(__KERNEL_HIP__) || defined(__KERNEL_METAL_AMD__)
# undef __LIGHT_TREE__
#endif
@@ -1290,12 +1291,14 @@ typedef struct KernelCurveSegment {
static_assert_align(KernelCurveSegment, 8);
typedef struct KernelSpotLight {
packed_float3 axis_u;
float radius;
packed_float3 axis_v;
float invarea;
float cos_half_spot_angle;
float spot_smooth;
packed_float3 dir;
float pad;
float cos_half_spot_angle;
packed_float3 len;
float spot_smooth;
} KernelSpotLight;
/* PointLight is SpotLight with only radius and invarea being used. */
@@ -1506,6 +1509,8 @@ typedef enum DeviceKernel : int {
DEVICE_KERNEL_INTEGRATOR_ACTIVE_PATHS_ARRAY,
DEVICE_KERNEL_INTEGRATOR_TERMINATED_PATHS_ARRAY,
DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY,
DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS,
DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS,
DEVICE_KERNEL_INTEGRATOR_COMPACT_PATHS_ARRAY,
DEVICE_KERNEL_INTEGRATOR_COMPACT_STATES,
DEVICE_KERNEL_INTEGRATOR_TERMINATED_SHADOW_PATHS_ARRAY,

View File

@@ -13,6 +13,7 @@
#include "scene/light.h"
#include "scene/mesh.h"
#include "scene/object.h"
#include "scene/osl.h"
#include "scene/pointcloud.h"
#include "scene/scene.h"
#include "scene/shader.h"
@@ -23,7 +24,9 @@
#include "subd/patch_table.h"
#include "subd/split.h"
#include "kernel/osl/globals.h"
#ifdef WITH_OSL
# include "kernel/osl/globals.h"
#endif
#include "util/foreach.h"
#include "util/log.h"
@@ -306,6 +309,11 @@ void GeometryManager::update_osl_globals(Device *device, Scene *scene)
{
#ifdef WITH_OSL
OSLGlobals *og = (OSLGlobals *)device->get_cpu_osl_memory();
if (og == nullptr) {
/* Can happen when rendering with multiple GPUs, but no CPU (in which case the name maps filled
* below are not used anyway) */
return;
}
og->object_name_map.clear();
og->object_names.clear();
@@ -1666,6 +1674,7 @@ void GeometryManager::device_update_displacement_images(Device *device,
TaskPool pool;
ImageManager *image_manager = scene->image_manager;
set<int> bump_images;
bool has_osl_node = false;
foreach (Geometry *geom, scene->geometry) {
if (geom->is_modified()) {
/* Geometry-level check for hair shadow transparency.
@@ -1685,6 +1694,9 @@ void GeometryManager::device_update_displacement_images(Device *device,
continue;
}
foreach (ShaderNode *node, shader->graph->nodes) {
if (node->special_type == SHADER_SPECIAL_TYPE_OSL) {
has_osl_node = true;
}
if (node->special_type != SHADER_SPECIAL_TYPE_IMAGE_SLOT) {
continue;
}
@@ -1700,6 +1712,15 @@ void GeometryManager::device_update_displacement_images(Device *device,
}
}
}
#ifdef WITH_OSL
/* If any OSL node is used for displacement, it may reference a texture. But it's
* unknown which ones, so have to load them all. */
if (has_osl_node) {
OSLShaderManager::osl_image_slots(device, image_manager, bump_images);
}
#endif
foreach (int slot, bump_images) {
pool.push(function_bind(
&ImageManager::device_update_slot, image_manager, device, scene, slot, &progress));

Some files were not shown because too many files have changed in this diff Show More