Compare commits

301 Commits

Author SHA1 Message Date
dc2d180181 Bump version to 3.0.1 release 2022-01-25 18:19:44 +01:00
Jens Ehrhardt
ddf92d719b Fix T95099: Have launcher wait for Blender exit
Unity launches Blender in background mode to do some
file conversions. Ever since the launcher was introduced,
this process has been broken.

The root cause here is: Unity looks up the default program
to launch .blend files with, which is now the launcher, then
launches it in background mode with a script to export the data.

The launcher however was designed to exit as quickly as
possible so there would not be an extra background process
lingering. It does not wait for Blender to exit and does not
pass back any error codes.

This broke Unity's workflow since it assumed that if the process
exits and succeeds the data *must* be ready for reading, which
no longer holds true.

This change keeps the launcher design as it was previously,
*except* when launching in background mode, then it
waits and passes back any error codes, thus restoring
Unity's workflow.

Differential Revision: https://developer.blender.org/D13894
Reviewed by: LazyDodo, Brecht
2022-01-25 09:44:31 +01:00
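The behaviour described in ddf92d719b can be sketched roughly as follows. This is a Python illustration only, not the launcher's actual code; the `launch` helper and its `background` parameter are hypothetical names introduced for the example.

```python
import subprocess
import sys


def launch(cmd, background):
    """Start `cmd`; wait for it only when running in background mode.

    Hypothetical helper: in background mode it returns the child's exit
    code, otherwise it returns 0 right after spawning the child.
    """
    proc = subprocess.Popen(cmd)
    if background:
        # The caller relies on the exit code, so block until the child exits
        # and pass its return code back.
        return proc.wait()
    # Interactive launch: do not linger, report success immediately.
    return 0


# A child process that fails with exit code 3 (stand-in for a failed export).
failing_child = [sys.executable, "-c", "import sys; sys.exit(3)"]
exit_code = launch(failing_child, background=True)
```

With this shape, a caller that treats "process exited successfully" as "data is ready" sees the real exit code in background mode, while interactive launches still return at once.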
add07576a0 Mesh: Add versioning in 3.0 for vertex normal refactor
This tags all meshes from the future 3.1 version dirty,
since normals aren't "implicitly" read as part of `MVert`
anymore after rBcfa53e0fbeed7178.

Differential Revision: https://developer.blender.org/D13856
2022-01-24 10:28:40 -06:00
63b9e5378b Fix T94202: GPUFrameBuffer: wrong refcount in the buffer passed for read_color and read_depth
The buffer passed as an argument to `GPUFrameBuffer.read_color` is used
in the return of the function and therefore, if not used, its refcount is
decremented.

So be sure to increment the refcount of the already existing objects that
will be used in the return of a function.
2022-01-20 21:13:27 +01:00
822501d86d Cleanup: Avoid possible NULL pointer error
In normal conditions, `gpf` always has a value, but it is safer to move the access inside the NULL check.
2022-01-20 21:12:36 +01:00
b3fe135854 Fix T94197: Applying boolean with fast solver clears bevel weights
For boolean operations only one of the meshes was checked to determine
if bevel weights should be created.

Now initialize custom data from the flags of both meshes.
Note that this is a localized fix to be back-ported, further changes
will be made so edit-mode conversion accounts for this
without the caller needing explicit checks for custom-data flags.
2022-01-20 21:11:48 +01:00
0039432cfc Fix T94715: multiple volumes using the same .vdb causes freeze
Needs more TBB task isolation, as even freeing an OpenVDB grid uses
multithreading.
2022-01-20 21:01:24 +01:00
Richard Antalik
9600f36cfc Fix T94768: Crash in VSE prefetching
Only the fix part of rBf2fb9a0c59a applied (P2726).
2022-01-17 13:07:21 +01:00
688713db24 Fix T94827: Group Input/Output cannot connect to custom sockets
Caused by rBa5c59fb90ef9.

Since Group Input and Output sockets happen to be of type `SOCK_CUSTOM`
[and since rBa5c59fb90ef9 custom py defined sockets are too :)] a check
introduced in rB513066e8ad6f that prevents connections for `SOCK_CUSTOM`
triggered.

Now refine the check, so it specifically looks for NODE_GROUP_INPUT /
NODE_GROUP_OUTPUT, too (this keeps the intention intact to not connect
group inputs to group outputs and vice versa, but allows custom py
defined sockets to connect again) and put it in a new utility function.

Maniphest Tasks: T94827

Differential Revision: https://developer.blender.org/D13817
2022-01-17 12:08:18 +01:00
5079a460a7 Fix T94878: LineArt crease threshold logic error.
A coding mistake allows default crease to override object crease, now fixed.
2022-01-17 12:06:19 +01:00
bcca7bf975 Fix T94089: GPencil Drawing don't Update after paste in Dopesheet
When pasting new frames, the datablock needs to be tagged to update the drawings.
2022-01-17 12:05:27 +01:00
d7a1fc0868 Fix T94903: GPencil: Copying keys doesn't preserve Keyframe Type
When a new frame is created, ensure the keytype of source key is used.
2022-01-17 12:02:37 +01:00
40c5786df3 Fix crash caused by exception in Python gizmo target get handler 2022-01-17 12:01:35 +01:00
71bc9d4760 Fix T94624: Object as font instances don't work
The fundamental limitation is that we can only have one instance
("dupli") generator at a time. Because the mesh output of a curve
object is output as an instance, the geometry set instances existed,
replacing the object as font instances. The "fix" is to reverse the
order. The behavior won't be perfect still, but at least the old
behavior will be preserved, which is really what matters for a
feature like this.

One way to take this change further would be completely disabling
regular geometry evaluation while this option is active. However,
it doesn't seem like that would actually improve the state of the code.

Differential Revision: https://developer.blender.org/D13768
2022-01-17 11:59:59 +01:00
1d462e1729 Fix T92953: Tool Settings: Drag on Tweak fails with LMB select 2022-01-17 11:58:43 +01:00
44ac03785c Fix T94799: GPencil Strokes drawn at 0.0 Strength still visible
There was a clamp with a value greater than 0.
2022-01-17 11:56:32 +01:00
aafbd74646 Fix T94728: Auto Depth problem with Clipping Region
Issue introduced in rB1d49293b80446b89b5b12fa0eeefaf14e5051e48

`drw_manager_init` must be called after `drw_context_state_init` as
`DST.draw_ctx.sh_cfg` (indicating when the view is clipped) must be set
first.

Differential Revision: https://developer.blender.org/D13795
2022-01-12 08:17:18 +01:00
edc85e182e Cleanup: remove unnecessary 'use_opengl_context' parameter
The argument passed is always false.
2022-01-12 08:14:03 +01:00
Germano Cavalcante
9d6680e7f9 Fix T93477: Viewport X-Ray is influencing snapping even in material mode
The default snap behavior to perform on tools and cursors is to the
final geometry and not edited geometry.

In snapping to edited geometry, there are some specific behaviors that
are not convenient in some cases. For example the general occlusion
test of X-Ray geometries during dragdrop.

This fix also resolves a regression for tools like measure and placement
that were also ignoring the snap to face in x-ray mode.

Differential Revision: https://developer.blender.org/D13410
2022-01-11 11:14:26 +01:00
15e4d0f25d Fix T94600: Apply single shrinkwrap constraint fails
rBd6891d9bee2b introduced a way to apply a single constraint from the
constraint stack. For this we want to work in the evaluated domain, in
particular the constraint target should be evaluated (the shrinkwrap
constraint needs to have access to the target's evaluated mesh).

Thx a lot to @sergey for handholding here!

Maniphest Tasks: T94600

Differential Revision: https://developer.blender.org/D13765
2022-01-11 10:33:08 +01:00
e53f3954a4 Fix T93949: Preview Image Error When No Screen
Fix an error if "File Preview Type" is "Auto" and there is no screen.

See D13574 for details.

Differential Revision: https://developer.blender.org/D13574

Reviewed by Julian Eisel
2022-01-11 10:32:04 +01:00
3b686b8233 Fix T94708: negative reference count error with Python API callbacks
Regression in 7972785d7b that caused
Python callback arguments to be de-referenced twice - potentially
accessing freed memory. Making a new-file with a circle-select
tool active triggered this (for example).

Now arguments aren't de-referenced when Blender itself has already
removed the callback handle.
2022-01-11 10:31:04 +01:00
Mikhail Matrosov
4c8740a452 Fix T93418: Cycles shadow terminator Geometry Offset artifacts with translucency
Differential Revision: https://developer.blender.org/D13468
2022-01-11 10:29:12 +01:00
9a7b1d2245 Fix T94145: Knife tool fails in orthographic mode
Calculating min and max orthographic extent forgot to convert to
worldspace coordinates.
2022-01-11 10:28:07 +01:00
72aefef9d2 Fix T93695: Discontinuous cutting with the knife tool
An important check to reject edge linehits when a vertex of that edge
was already hit was accidentally removed in
rB6e77afe6ec7b6a73f218f1fef264758abcbc778a
2022-01-11 10:26:54 +01:00
ed397ff507 Fix T94506: Crash in Compositing 2022-01-11 10:25:36 +01:00
7d26cf01f7 Fix T94563: Cycles standalone build error with strict float/double casting
Thanks to John David for finding this.
2022-01-11 10:24:43 +01:00
c3457af23f Fix Cycles allocating result too early
When tiled rendering was used, the render result was
allocated at the end of every view layer render as
opposed to the intended end of all rendering.

Modify the render_result_end so that it only ensures
pixels are allocated if pixels are actually copied
over.
2022-01-11 10:23:58 +01:00
3707a78471 Fix Cycles using Cancel semantic on final result write
Seems like a copy-paste bug from another place.
2022-01-11 10:23:17 +01:00
e84625dcbc Fix T94169: Missing grease pencil render with tiled rendering
Delay grease pencil until after the render result is written
to the Blender side.

Differential Revision: https://developer.blender.org/D13740
2022-01-11 10:20:49 +01:00
29c4c78a38 Fix T94620: GPencil AutoMerge does not work when Draw On Back is enabled
The problem was that the stroke was added to the head, so the `prev` pointer was NULL. Now check if the list is empty (`next`).
2022-01-11 10:18:21 +01:00
cd804fb2e9 Fix T94392: 3D Cursor surface projection onto hidden geometry
Regression introduced in rB098008f42d8127d9b60717c7059d3c55a3bfada7

Previously the selected geometry was ignored along with the hidden one.

The mentioned commit caused neither the hidden nor the selected one to be ignored.

But hidden geometry needs to be ignored.
2022-01-11 10:16:54 +01:00
69c56d2819 LineArt: Correct clamping of out of bound isect index
Handles rare cases where line doesn't intersect the triangle correctly.
2022-01-11 10:15:42 +01:00
7339663bbc LineArt: Correct collection intersection mask logic.
The logic used to be:

"if collection doesn't have child collection, check if ob is from this one"

The correct logic should be:

"if collection child does not have this ob, then check this collection".
2022-01-11 10:14:26 +01:00
6bd924c748 Fix T93868: GPencil material filter does not work with instances
When the material is used in several objects, the filter by material does not work as expected because the internal pointers differ due to the evaluated version.

Now, the original version of the material is compared to keep same address.
2022-01-11 10:13:10 +01:00
Shen Ciao
e8a8e953b3 Fix T93134: Set origin broken for curve edit mode
Bug: Set Origin causes unexpected offset on Grease Pencil strokes when Curve Editing is enabled.
Fix: Add transformation of editcurve points in `object_origin_set_exec`.

Reviewed By: #grease_pencil, antoniov

Maniphest Tasks: T93134

Differential Revision: https://developer.blender.org/D13273
2022-01-11 10:11:42 +01:00
784c04bcbd Fix T93163: GPencil scale thickness fails in negative scales
Previously, negative scales produced an invalid thickness. Now the absolute value is used to avoid this situation.
2022-01-11 10:09:32 +01:00
50c39ff8fe Fix T94454: Python API curve to mesh use after free without depsgraph
This was caused by a mistake in eb0eb54d96, which removed
the clearing of the curve edit mode pointers that are set when creating
the temporary data for the conversion. If they are not cleared, the
generic ID free function will also free the edit mode data, which is
wrong when the source curve is in edit mode.
2022-01-11 10:08:12 +01:00
512014f042 Fix T94442: Trim curve node can crash with duplicate point
The calculation to find the factor between two evaluated points assumed
that the points were not at the same location. This assumption is
somewhat reasonable, since we might expect `lower_bound` to skip those
points anyway. However, the report found a case where the first two
evaluated points were coincident, and there is no strong reason not
to make this safe, so add a check for 0 length before the division.
2022-01-11 10:06:54 +01:00
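The zero-length guard described in 512014f042 follows a common pattern, sketched below in Python. The function name `factor_between`, its parameters, and the choice to return `0.0` for coincident points are assumptions made for illustration; the actual node code may handle the degenerate case differently.

```python
import math


def factor_between(p0, p1, length_along):
    """Return how far `length_along` reaches into the segment p0 -> p1, as a factor.

    Coincident points would cause a division by zero, so that case is
    handled explicitly before dividing (returning 0.0 is an assumption).
    """
    segment_length = math.dist(p0, p1)
    if segment_length == 0.0:
        return 0.0
    return length_along / segment_length


halfway = factor_between((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), 1.0)
degenerate = factor_between((1.0, 1.0, 1.0), (1.0, 1.0, 1.0), 0.0)
```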
3d5dbc1c44 Cycles: Reintroduce device-only memory handling that got lost in Cycles X merge
Somehow only a part of rBf4f8b6dde32b0438e0b97a6d8ebeb89802987127 ended up in
Cycles X, causing the issue that commit fixed, "OPTIX_ERROR_INVALID_VALUE" when the
system is out of memory, to show up again.
This adds the missing changes to fix that problem.

Maniphest Tasks: T93620

Differential Revision: https://developer.blender.org/D13488
2022-01-11 10:05:20 +01:00
5d80b64d28 Fix T94375: Python error when trying to add Grease Pencil brush preset
The prop name was wrong.
2022-01-11 10:03:51 +01:00
Harley Acheson
08226693cf Fix T94334: 3DView View Menu Close Error
Add Error checking to `do_view3d_header_buttons` so that it does
not crash if area->win does not exist because it has been closed.

Note this is a temporary simple fix that will be replaced by D13660.

---

Selecting "Close Area" from the 3DView View / Area menu will crash when `do_view3d_header_buttons` is called afterward even though the area has closed. It gets a NULL result from CTX_wm_window(C) and dies. This patch just adds a check for this being NULL and exits out in this case.

`uiTemplateEditModeSelection` is a bit dodgy adding `do_view3d_header_buttons` as a handler for the entire uiBlock. This patch is meant to be a simple and temporary solution in 3.0.1, replaced later by {D13660} which fixes this area by using an operator instead.
2022-01-11 10:01:14 +01:00
b2e15cb19d Fix T93408: Snap performance regression at high poll rate
Caused by {rBfba9cd019f21f29bad1a6f3713370c5172dbc97f}.

The snap timer was accidentally modified and damaged.
2022-01-11 09:53:12 +01:00
6514e4c418 Fix: Build issue on 32 bit archs
The cast to size_t leads to a build issue on 32
bit archs. cursor_delim_type_utf8 expects an int
so an additional cast to size_t is not required.

Reported by user frispete on devtalk.
2022-01-11 09:51:10 +01:00
Germano Cavalcante
66addab27a Fix T94191: correct (time) translation headers not showing DeltaX
Caused by {rBb0d9e6797fb8}

For the header (both the Graph Editor case in the general `headerTranslation`
and `headerTimeTranslate`), we are interested in delta values
(not absolute values).

Since the culprit commit, `snapFrameTransform` no longer works with deltas,
so we have to compensate for this.

For the Graph Editor, this only worked "by accident" in rB7192e57d63a5,
since `ival` is still zero at this point.

So now, reacquire the delta right after the snap operation.

Also use a more appropriate center value in the translate operator.

Maniphest Tasks: T94191

Differential Revision: https://developer.blender.org/D13641
2022-01-11 09:26:50 +01:00
606f6e73b0 Fix T94280: Crash when splitting meta strip
This happens because, in the `SEQ_time_update_sequence` function,
`SEQ_get_meta_by_seqbase` returns an uninitialized value. This isn't nice,
but it shouldn't happen in the first place. The problem is that
`SEQ_edit_strip_split` moves strips into a detached `ListBase`, so
other functions can't see them anymore. The detached `ListBase` is used
solely to preserve relationships during duplication.

Move strips to original `ListBase` immediately after duplication and
return `NULL` if `SEQ_get_meta_by_seqbase` can't find meta strip.

Splitting itself can still rely on the fact that the number of original and
duplicated strips is the same and that they are placed next to each other in
exactly the same order at the end of the original `ListBase`.
2022-01-11 09:25:19 +01:00
56d45a2974 Fix T94254: Crash using view_all operator in VSE
Caused by `NULL` dereference in `SEQ_meta_stack_active_get()`.

Check if `Editing` is `NULL` before accessing meta stack.
2022-01-11 09:23:53 +01:00
fa0173f728 Fix T94184: Outliner: Collection dragging tooltip is not updating
In the context of the dragdrop tooltip, the event referenced to the window
is out of date and contains invalid `mval` values.

Avoid using `win->eventstate` as much as possible.
2022-01-11 09:22:41 +01:00
52905c02ae Fix T94109: 3d cursor crash when using shortcut
Operator was erroneously starting edge_slide operation.

Revert part of the changes in rB3fab16fe8eb4 as obedit_type was being
confused with object_mode.
2022-01-11 09:21:27 +01:00
50ecf9dcf5 Fix T94115: Selecting current action in undo history undoes all
When selecting the current undo step there is no need to do anything.

Fix and minor refactor to de-duplicate refreshing after running
undo/redo & undo history.
2022-01-11 09:20:15 +01:00
0564b19ff4 Cleanup: Correct order of guard and lock in moviecache_valfree
Fix own mistake in rB7061d1e39fe

In my attempt to quickly address T92838, along with the original bug, I
made a nonsensical choice to use the limiter lock to guard the check
against the cache item itself. While harmless, it is not necessary and
semantically wrong / potentially confusing to future readers of the code.

Differential Revision: https://developer.blender.org/D13122
2022-01-11 09:18:50 +01:00
ff5630b7fa Fix T92740: Missing lock around the image CacheLimiter
A recent change exposed this long-standing race. Simply protect the
MEM_CacheLimiter with its lock now. Additionally, guard against
unmanaging an already destroyed cache handle.

Ref T92740, T92838
2022-01-11 09:17:41 +01:00
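The two parts of ff5630b7fa (take the limiter's lock around every access, and tolerate unmanaging a handle that is already gone) can be illustrated with a toy Python class. `ToyCacheLimiter` and its methods are hypothetical and do not mirror the `MEM_CacheLimiter` API.

```python
import threading


class ToyCacheLimiter:
    """Toy stand-in for a cache limiter shared between threads (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._handles = set()

    def insert(self, handle):
        # Every access to the shared set happens under the lock.
        with self._lock:
            self._handles.add(handle)

    def unmanage(self, handle):
        # Guard against unmanaging a handle that was already destroyed.
        with self._lock:
            if handle not in self._handles:
                return False
            self._handles.remove(handle)
            return True


limiter = ToyCacheLimiter()
limiter.insert("image-buffer")
first = limiter.unmanage("image-buffer")
second = limiter.unmanage("image-buffer")
```

The second `unmanage` call is a harmless no-op instead of touching state that no longer exists.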
0a6b6eb13b Fix T94082: Curve to point empty evaluated NURBS crash
This is basically the same as rBee4ed99866fbb7ab04, the fix is
simply to check if the spline has evaluated points when deciding
the offsets into the result points array.
2022-01-11 09:15:28 +01:00
b9f5e6c0b4 Fix T93971: "Center Cursor & Frame All" fails to redraw
bda9e4238a changed smooth-view
not to redraw when there were no changes made.
Redrawing is needed for repositioning the cursor.

Subscribe to changes to the 3d cursor to ensure all view ports
are updated (not just the current one).
2022-01-11 09:14:10 +01:00
62ce0c60cd Fix meta-ball bound-box calculation reading past buffer bounds
This broke "test_undo.view3d_multi_mode_select" test in
"lib/tests/ui_simulate" and is likely exposed by recent changes to
bounding box calculation.

The missing check for DL_INDEX4 dates back to code from 2002 which
intended to check this but was checking for DL_INDEX3 twice
which got removed as part of a cleanup.

This could be hidden from memory checking tools as meta-balls
over-allocate vertex arrays.
2022-01-11 09:12:43 +01:00
58ee4852b6 Fix T93649: Blender freezes when saving with active VR session
Dead-lock when VR viewport drawing and depsgraph updates would fight for
the draw-manager GL lock. This didn't usually cause issues because the
depsgraph would be evaluated at this point already, except in rare
exceptions like after file writing.

Fix this by ensuring the XR surface gets its depsgraph updated after
handling notifiers, which is where regular windows also do the depsgraph
updating.
2022-01-11 09:11:31 +01:00
ad4d66580e Fix T93704: StructRNA.path_resolve fails silently with missing keys
Resolving the path to a missing pose-bone (for example),
was not raising an error as it should have.

Regression introduced in f9ccd26b03,
which didn't update collection lookup logic to fail in the case the
key of a collection wasn't found.
2022-01-11 09:09:38 +01:00
21f02dd85d Fix T94243: weightpaint gradient panel shown in other places
Weightpaint gradient tool panel showed in other modes (and as a separate
panel).

Fix for fix, see
- rBf8a0e102cf5e
- rBe549d6c1bd2d

So now, check mode again and restrict to topbar (prevents an additional
panel since this is already included in the brush settings).

ref rB0837926740b3 in sculpt-dev branch, so thx @joeedh as well!

Maniphest Tasks: T94243

Differential Revision: https://developer.blender.org/D13630
2022-01-11 09:07:25 +01:00
7d5fa51666 Fix T93169: Weightpaint falloff popover drawn twice
This came with {rBf8a0e102cf5e}.

The panel was meant specifically for the gradient tool, but since it was
given the ".weighpaint" context, it would also draw as part of generic
header toolsettings drawing.

Now remove this context on purpose and only draw this specifically from
the gradient tools ToolDef.

Maniphest Tasks: T93169

Differential Revision: https://developer.blender.org/D13268
2022-01-11 09:05:19 +01:00
61fe0d6264 Fix T93438: Auto linking do not work for custom sockets
Currently, custom sockets are no longer supported for automatic linking
when dropping a node on a link. This is because SOCK_CUSTOM is given a
negative priority and is ignored. To fix this, SOCK_CUSTOM is now given
the lowest priority and the rest of the sockets got their priority
incremented.

Reviewed By: Jacques Lucke

Differential Revision: https://developer.blender.org/D13403
2022-01-11 08:51:47 +01:00
Moritz Röhrich
99efb95441 Fix T93591: Random Value node first and last value proportion
This patch replaces `round_fl_to_int` with `floor` and adjusts the
maximum value accordingly. The call to `round_fl_to_int` is problematic
here because it messes with the probability distribution at the edges
of the value range, meaning the first and last values were only half
as common as all other values. Since `round_fl_to_int` itself computes
`floor(val + 0.5)`, switching to `floor` does not introduce misbehavior in edge cases.

Differential Revision: https://developer.blender.org/D13474
2022-01-11 08:50:31 +01:00
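The distribution problem fixed in 99efb95441 can be reproduced with a small Python sketch. It uses evenly spaced samples instead of random numbers so the counts are deterministic. Treating "adjusts the maximum value accordingly" as "sample from `[lo, hi + 1)`" is an assumption, and the helper names are made up for the example.

```python
import math
from collections import Counter


def round_to_int(value):
    # Rounding implemented as floor(value + 0.5), as stated in the commit message.
    return math.floor(value + 0.5)


def counts_with_round(lo, hi, samples):
    """Round evenly spaced samples from [lo, hi] to integers and count them."""
    values = (lo + (hi - lo) * (i + 0.5) / samples for i in range(samples))
    return dict(Counter(round_to_int(v) for v in values))


def counts_with_floor(lo, hi, samples):
    """Floor evenly spaced samples from [lo, hi + 1) and count them (assumed fix)."""
    values = (lo + (hi + 1 - lo) * (i + 0.5) / samples for i in range(samples))
    return dict(Counter(math.floor(v) for v in values))


rounded = counts_with_round(0, 3, 6000)
floored = counts_with_floor(0, 3, 6000)
```

With rounding, the end values 0 and 3 each collect only half a unit of the input range, so they appear half as often as 1 and 2. With `floor` over the widened range, each integer covers exactly one unit.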
dc7ff75ef8 Fix T93498: Cycles fast GI add method affected by bounces settings
This is only for the replace method.
2022-01-11 08:49:08 +01:00
e83df74008 Fix T93890: Cycles error with shadow catcher + OptiX denoise without passes 2022-01-11 08:47:52 +01:00
55ecdf3195 Fix T93874: Cycles crash with fast GI approximation 2022-01-11 08:45:03 +01:00
5beadc31d6 Fix T93871: Image.has_data returns True for images that failed to load 2022-01-11 08:43:35 +01:00
a8a9a08bf7 Fix T84710: Instances with only mesh edges or vertices are invisible
Wire-only meshes have a special case in the overlay drawing to give
the wire shader a special color (which avoids the lines being dashed,
somehow). The fast path for duplis didn't have that special case.

Differential Revision: https://developer.blender.org/D13196
2022-01-11 08:42:11 +01:00
88f8b01e66 Fix T93691: Crash when loading custom thumbnail in custom library
This was an issue with the mixed list of external assets and assets from
the current file. When closing the File Browser to select the custom
preview image, the assets from the current file would be cleared for
reread, to make sure we display up-to-date file data. That is because
the workspace of the temporary File Browser was deleted, causing a
change in the file data (main data-base). The reread would happen in a
background thread, meaning it might not finish before the custom preview
operator runs and queries the active asset. So the preview operator
would get the wrong active asset from context.

Two fixes were needed:
* Make sure current file data is reread before the operator runs, by
  doing this partial rereading on the main thread.
* Ensure the asset list (in fact file list) order stays consistent over
  rereads. If multiple assets with the same name were shown, the
  operator might also have gotten the wrong asset, also leading to a
  crash.

Additionally the file operation handler should probably poll before
executing, to fail gracefully at least (not crash).
2022-01-11 08:41:02 +01:00
10d65b821b Fix T93892: Changing bone name leaves non-functional vertex group
The corresponding vertex group was renamed properly, but the armature
influence was broken for that bone.

Caused by {rB3b6ee8cee708}.

Since above commit, vertex group names are stored on object data (mesh/
lattice/gpencil) and if we update these, we have to inform dependency
graph to have immediate effect.

Maniphest Tasks: T93892

Differential Revision: https://developer.blender.org/D13526
2022-01-11 08:39:24 +01:00
56bd7adce7 Fix T93858: Zstd-compressed .blend files from external tools aren't recognized
The issue here was that after the seek table check, the underlying file wasn't
rewound to the start, so the code that checks for the BLENDER header
immediately reaches EOF and fails.

Since Blender always writes files with a seek table, this bug isn't triggered
by files saved in Blender itself. However, files compressed in external tools
generally don't have a seek table.
2022-01-11 08:37:47 +01:00
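The failure mode in 56bd7adce7 (a probe leaves the file position at the end, so the following header read hits EOF) is easy to show with an in-memory stream. The sketch below is generic Python, not Blender's file reader; `probe_trailer`, `read_header`, and the `rewind` flag are hypothetical names, and the probe does not parse a real seek table.

```python
import io


def probe_trailer(stream, size=4):
    """Read the last `size` bytes of the stream (stand-in for a seek table check)."""
    stream.seek(0, io.SEEK_END)
    end = stream.tell()
    stream.seek(max(0, end - size))
    return stream.read(size)  # leaves the position at EOF


def read_header(stream, size, rewind=True):
    """Probe the trailer, then read `size` header bytes from the start."""
    probe_trailer(stream)
    if rewind:
        # Without this, the read below starts at EOF and returns nothing.
        stream.seek(0)
    return stream.read(size)


data = b"BLENDER-payload-bytes"
fixed = read_header(io.BytesIO(data), 7)
broken = read_header(io.BytesIO(data), 7, rewind=False)
```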
b4c9f8da88 Fix T93869: snap cursor may fail in orthographic view
Float precision issues cause the `ED_view3d_win_to_3d_on_plane` to return
a value even when the view ray is parallel to the plane.

A more general solution might be desired in this case, as other areas that
use `ED_view3d_win_to_3d_on_plane` might have the same problem.

For now, just work around the problem for the snap cursor.
2022-01-11 08:36:20 +01:00
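For the parallel-ray case mentioned in b4c9f8da88, a common general technique is to reject the intersection when the ray direction is almost perpendicular to the plane normal. The Python sketch below shows that technique only; it is not the workaround the commit applies to the snap cursor, and the function name and the `eps` value are assumptions.

```python
def ray_plane_intersection(ray_origin, ray_dir, plane_co, plane_no, eps=1e-6):
    """Intersect a ray with a plane; return None if the ray is (nearly) parallel."""

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    denom = dot(ray_dir, plane_no)
    if abs(denom) < eps:
        # Dividing by a tiny value would give a huge, meaningless distance.
        return None
    to_plane = tuple(p - o for p, o in zip(plane_co, ray_origin))
    distance = dot(to_plane, plane_no) / denom
    return tuple(o + distance * d for o, d in zip(ray_origin, ray_dir))


# A ray pointing straight down onto the floor plane hits the origin.
hit = ray_plane_intersection((0, 0, 5), (0, 0, -1), (0, 0, 0), (0, 0, 1))
# A ray that is almost parallel to the floor plane is rejected.
miss = ray_plane_intersection((0, 0, 5), (1, 0, 1e-9), (0, 0, 0), (0, 0, 1))
```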
0b7dbff04a Fix T93728: Greasepencil separate will lose all vertex groups
Caused by {rB3b6ee8cee708}

The raw data is copied here correctly
[`BKE_gpencil_stroke_weights_duplicate` in
`BKE_gpencil_stroke_duplicate`] but the vertex groups names are missing.
Prior to the above commit it was enough to have `ED_object_add_duplicate`
(this seemingly took care of duplicating object's defbase).
Now vertex groups names sit on the `bGPdata` rather than the `Object`,
and since the separate operation creates **new** `bGPdata` we have to
copy vertex groups names - and active index - over [via
`BKE_defgroup_copy_list`].

Maniphest Tasks: T93728

Differential Revision: https://developer.blender.org/D13509
2022-01-11 08:34:05 +01:00
Yuchen Wen
b93127c57b Fix T93467: Use world bg color for pose library previews
Use the World viewport color when rendering pose library previews.

The World's viewport color is chosen instead of the World shading nodes,
as the latter would require rendering with `OB_RENDER` (instead of
`OB_SOLID`), which would take considerably longer.

Maniphest Task: T93467

Reviewed By: sybren

Differential Revision: https://developer.blender.org/D13470
2022-01-11 08:32:32 +01:00
d009056b01 Docs: Incorrect link to context type
Fixes T93773
2022-01-11 08:31:01 +01:00
Hans Goudey
669577a973 Fix T93521: Single point NURBS crash in resample node
The resample node didn't handle the case of when a spline didn't have
any evaluated points. For poly and Bezier splines we should never hit
this case, but it is expected when the number of NURBS control points
is smaller than its order, so we have to handle the case here.

It's not that obvious what to do in this case, there are a few options:

- Remove the bad splines from the result
- Generate empty splines for those inputs
- Skip resampling the bad splines, copy them to the result
- Arbitrarily generate single point splines

I chose option three, just skipping the "bad" splines. Since the node
already has a selection input, this can be described by just extending
that. "Splines with no evaluated points are implicitly deselected."
The first option would probably be valid too though.

Differential Revision: https://developer.blender.org/D13434
2022-01-10 16:26:09 +01:00
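Option three from 669577a973 (skip splines without evaluated points and copy them to the result unchanged) can be sketched in Python as below. The resampling here is a simplified interpolation by point index rather than by arc length, and all names are hypothetical; it only illustrates the "implicitly deselected" behaviour.

```python
def resample(points, count):
    """Resample a polyline to `count` points, spaced uniformly by index (simplified)."""
    if count == 1 or len(points) == 1:
        return [points[0]] * count
    last = len(points) - 1
    result = []
    for i in range(count):
        t = i * last / (count - 1)
        lower = min(int(t), last - 1)
        factor = t - lower
        a, b = points[lower], points[lower + 1]
        result.append(tuple(x + (y - x) * factor for x, y in zip(a, b)))
    return result


def resample_splines(splines, count):
    """Resample each spline; splines with no evaluated points are passed through."""
    return [resample(s, count) if s else s for s in splines]


splines = [[(0.0, 0.0), (2.0, 0.0)], []]  # the second spline has no evaluated points
resampled = resample_splines(splines, 3)
```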
e29026bb4b Fix T93314: Thumbnails not drawn with default scale
Decrease threshold for drawing thumbnails.

This was an unintended change in daaa43232d that was overlooked.
2022-01-10 16:17:56 +01:00
1fd824345d VSE: Fix strip with mask modifier not blending
Set `ibuf->planes` to `R_IMF_PLANES_RGBA` because the mask modifier adds
transparent areas to the image.
2022-01-10 16:16:38 +01:00
fd4c343dcd Fix T93707: Dragging the tweaked NLA strip causes crash
Earlier code assumed that the active strip was on the active track. This
commit detects when this assumption doesn't hold, and adds a more thorough
search of the active strip.
2022-01-10 16:15:23 +01:00
46f5b305e4 Fix T93611: Curve modifier crash in editmode in certain situations
Caused by {rB3b6ee8cee708}

Above commit was trying to get the vertexgroup from the mesh that is
passed into `deformVertsEM` (but that can be NULL).
When can it be NULL, and when is it non-NULL?
`editbmesh_calc_modifiers` only passes in a non-NULL mesh to
`deformVertsEM` under certain conditions:
- a non-deform-only modifier is handled currently
- a non-deform-only modifier precedes the current modifier
- a deform-only modifier precedes the current modifier (and the current
one depends on normals)

So the passed-in mesh cannot be relied on, now get the vertex group from
the context object data (like it was before the culprit commit).

Related commit: rB8f22feefbc20

Maniphest Tasks: T93611

Differential Revision: https://developer.blender.org/D13487
2022-01-10 16:14:08 +01:00
4c8b93c5c2 Fix T93732: Snap Cursor not working after changing Add Object settings
`g_data_intern.state_default.gzgrp_type` is a very specific member and
cannot be set to default.
2022-01-10 16:12:52 +01:00
256c1b82f6 Fix T93388: dropping object on grid in orthogonal view misses the floor plane
`ED_view3d_win_to_3d_on_plane` does not use the `clip_start` and
`clip_end` values of the scene, so the `do_clip` option can be misleading
especially in the orthographic view where the `clip_start` is negative.

For now, don't use the `do_clip` option in orthographic view.
2022-01-10 16:11:35 +01:00
580c603df0 Fix T93574: Assert triangulating a mesh
Correct assert for edit-mesh normal calculation.
2022-01-10 16:09:15 +01:00
Jesse Yurkovich
c37cd35469 Fix T93541: Use warning instead of error for exceeding layer limits
Instead of using RPT_ERROR, use RPT_WARNING which will not raise an
exception to Python. This broke some scripts (including FBX import)
which already check for a None return value.

Ref D13458

Reviewed By: campbellbarton
2022-01-10 16:07:19 +01:00
490f1648a7 Fix T93563: Crash subdividing with overlapping tri and quad
The first loop was left out when finding the split edge boundary.

Error from f2138686d9.
2022-01-10 15:59:44 +01:00
256a2d1e98 Fix T93508: Shift+F1 to switch to asset browser randomly crashes 2022-01-10 15:36:50 +01:00
77694b571f Fix T92561: unstable particle distribution with Alembic files
When enabling or disabling a Mesh Sequence Cache modifier of an Object
with a hair particle system, the hair would switch positions. This is
caused because original coordinates in Blender are expected to be
normalized, and toggling the modifier would cause the usage of different
orco layers: one that is normalized, and the other which isn't.

This bug exposes a few related issues:
- if the Alembic file did not have orco data,
`MOD_deform_mesh_eval_get`, used by the particle system modifier, would
add an orco layer without normalization
- `MOD_deform_mesh_eval_get` would also ignore the presence of an orco
layer (e.g. one that could have been read from Alembic)
- if the Alembic file did have orco data, the data would be read
unnormalized

To fix those various issues, original coordinates are normalized when
read from Alembic and unnormalized when written to Alembic; and a new
utility function `BKE_mesh_orco_ensure` is added to add a normalized
orco layer if none exists on the mesh already, this function derives
from the code used in the particle system.

Reviewed By: brecht

Maniphest Tasks: T92561

Differential Revision: https://developer.blender.org/D13306
2022-01-10 15:33:22 +01:00
Yuki Hashimoto
8b44b756d8 Fix some shortcut keys not working on macOS with Japanese input
Differential Revision: https://developer.blender.org/D13414
2022-01-10 15:31:57 +01:00
24a79289b0 Asset Browser: Fix incorrect user message
Text would display "No asset selected" when it actually indicates that
there is no asset active (not selected). Changed it to "No active asset"
now.
2022-01-10 15:30:02 +01:00
2af6cb9dce Fix Asset Browser properties region toggle not showing open/closed state
The button is supposed to be blue (default theme) when the properties
region is open, to indicate that state.
2022-01-10 15:28:17 +01:00
8ca4d20878 Fix T91680: viewport selection broken in macOS x86 build with Xcode 13
There is an apparent compiler bug here, tweak the code to avoid it. This did
not affect official builds as we were still using Xcode 12.
2022-01-06 14:45:52 +01:00
e78a21afb6 Fix/workaround macOS Rosetta crash running Cycles AVX tests
Just disable these tests on macOS for now as fixing seems hard, and we want to
be able to cross-compile and test x86_64 on Arm machines on the buildbot.
2022-01-06 14:33:10 +01:00
d02eecc0ca Fix Cycles AVX test failure with x86_64 build running on Arm
Don't create const avx vectors before validating if CPU supports AVX.
2022-01-06 14:33:01 +01:00
f17593ff26 Bump version to 3.0.1 release candidate 2022-01-06 14:24:31 +01:00
f1cca30557 Blender 3.0 - version bump -> release 2021-12-02 19:35:47 +01:00
0988711575 Licenses: Attribution document for Blender 3.0
A few libraries were updated, a few were added, and a few were missing
from the previous license document.
2021-12-02 19:35:01 +01:00
cef8f5ff50 Docs: add README for HIPEW library 2021-12-02 18:43:42 +01:00
1e98a0cee5 NanoSVG: Mention the version we use 2021-12-02 18:22:05 +01:00
431255e5e8 Docs: 3.0 release description for Linux appdata
Includes a typo fix for 2.93.
2021-12-02 18:02:19 +01:00
7e60d8a713 Fix missing Blender logo in Windows store package
D9681 was not properly merged to all branches, leaving a path to a non-existent
icon file in the manifest.
2021-12-02 16:35:24 +01:00
2fd657db5b Fix T93560: crash with image paint undo and cycles preview render
Cycles preview rendering could free the image buffers being used by drawing in
another thread due to a race condition. This race condition was unlikely before,
but now that preview renders are started right before we draw the image in the
image editor or load it as a texture in the 3D viewport, it's likely to happen.

As we are close to release this is too risky to fix properly, just avoid freeing
the cache for preview renders instead and accept increased memory usage in some
cases.
2021-12-02 16:34:16 +01:00
61e92eeb3e Fix Action.asset_data["is_single_frame"] set incorrectly
The asset metadata custom property `["is_single_frame"]` was set
incorrectly. Since this is intended for forward compatibility, including
being covered by the asset metadata indexing, it's important to have it
set correctly from the first release of Blender that includes the asset
browser.

Differential Revision: https://developer.blender.org/D13452
2021-12-02 14:35:12 +01:00
67c490daaf Fix T93548: Appended (material) assets don't have a fake user
Since our design is to always keep data-blocks marked as assets on exit,
and our technical design for this is to do this via fake users, ensure
the fake user is set for an appended asset.

Reviewed by: Bastien Montagne

Differential Revision: https://developer.blender.org/D13443
2021-12-02 11:18:27 +01:00
68e3755209 Fix T93555: crash when muting nodes with multiple internal links
The crash happened because I was incorrectly and inconsistently assuming
that a socket is part of at most one internal link. However, this is not the case.
In geometry nodes, an input socket can be internally linked to multiple
output sockets. In the general case, an output could also be linked to multiple
input sockets, even though we don't have that in Blender yet.
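
The data-structure point above can be sketched in Python: internal links are
grouped per input socket into a list, so nothing assumes a single link per
socket. The function name and the socket names in the usage are hypothetical,
for illustration only.

```python
from collections import defaultdict


def outputs_by_input(internal_links):
    """Group (input_socket, output_socket) pairs by input socket.

    An input may feed several outputs, so each value is a list.
    """
    mapping = defaultdict(list)
    for input_socket, output_socket in internal_links:
        mapping[input_socket].append(output_socket)
    return dict(mapping)
```

A caller iterating over the mapping then handles every linked output instead
of only the first one it finds.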

Dalai gave green light to cherry pick this fix for 3.0.
2021-12-02 11:10:41 +01:00
594656e7a3 Fix T93525: Crash with curve/text armature bone gizmo
The problem is that drw_batch_cache_generate_requested_delayed
is called on the object, which uses the original object data type to
choose which data type to get info for. So for curves and text it uses
the incorrect type (not the evaluated mesh like we hardcoded in the
armature overlay code).

To fix this I hardcoded the "delayed" generation to only use the
evaluated mesh. Luckily it wasn't used elsewhere besides this
armature overlay system. That seems like the simplest fix for
3.0. A proper solution should rewrite this whole area anyway.

Differential Revision: https://developer.blender.org/D13439
2021-12-01 21:16:18 -05:00
9cec9b4d6e Fix(unreported): LineArt intersection mask logic error.
The stroke generation call mistakenly uses all enabled
types to check intersection mask, the correct behavior
is to use individual edge(chain) type.
2021-12-01 15:55:52 +08:00
24b84e4688 LineArt: Use consistent view vector direction.
The view vector is no longer inverted in different stages of calculation.
2021-12-01 15:55:16 +08:00
b3d101ac29 Fix T93100: VSE RMB shift-select fails with "fallback tools"
When the select action was set to "Select Tool", shift-clicking
on sequence strips wasn't selecting the strip.

Regression in 2a2d873124

Thanks to @a.monti for the fix.
2021-12-01 11:43:19 +11:00
3788003cda Fix T93368: Dragging Blends Without Previews
Unfortunately the drop logic for file-path based drag & drop checks the
used icon for its logic. This is very bad and should be changed. But
doing this involves some changes that are better not done during bcon4,
so for now stick to it and update the icon check.

Reviewed by: Julian Eisel

Differential Revision: https://developer.blender.org/D13383?id=45314
2021-11-30 15:40:27 +01:00
Bastien Montagne
de7f1e8e07 Fix T93353: Reload Library Override file loses Constraints, take II.
When adding `INSERT` operations over RNACollection items, rna diffing
code did not properly report the properties as not being equal.

This in turn triggered the 'purge unused existing override properties'
mechanism, thus deleting the existing (valid) insert override property
operation.

NOTE: This should also be backported to 2.93, and probably 2.83.

Reviewed By: sybren, jbakker

Maniphest Tasks: T93353

Differential Revision: https://developer.blender.org/D13426
2021-11-30 15:19:44 +01:00
4b971bb87c Asset Bundle Copy button: only report each external dependency once
The `ASSET_OT_bundle_install` operator only works when the blend file is
self-contained. It reports any external dependencies. Before this patch:

- every dependency was mentioned, even when it repeated the same
  filename over and over again, and
- multiple dependencies were all mentioned in the error popup,
  potentially filling the screen.

This is now resolved by:

- only reporting each external file once, and
- referring to the console when there are multiple external dependencies.
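
The reporting behavior listed above could look roughly like the following
Python sketch. The function names and message wording are hypothetical and
are not taken from the operator's code.

```python
def unique_dependencies(paths):
    """Drop repeated file paths while keeping first-seen order."""
    return list(dict.fromkeys(paths))


def popup_message(paths):
    """Build a short message for the error popup, or None if self-contained."""
    unique = unique_dependencies(paths)
    if not unique:
        return None
    if len(unique) == 1:
        return "External dependency: " + unique[0]
    # With several files, keep the popup short and point at the console.
    return "%d external dependencies, see the console for details" % len(unique)
```

The full list of unique paths would then be printed to the console once,
rather than once per reference.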

Reviewed by: severin, dfelinto

Differential Revision: https://developer.blender.org/D13413
2021-11-30 11:50:38 +01:00
Julian Eisel
2e53f8b4b1 Fix T92577: Cannot open shortcut folders on Windows
`file.select()` wasn't handling redirects as it should when it also
opens directories. This was only uncovered by a change in the keymap.

Reviewed By: Bastien Montagne, Harley Acheson

Differential Revision: https://developer.blender.org/D13388
2021-11-30 11:29:40 +01:00
d8edc2c634 VSE: Disable interactivity in combined view
Combined view of timeline and preview causes seemingly unpredictable
behavior after some operators have been allowed to run in preview
region.

Disable new features in this combined view, so behavior should be
consistent with previous versions.

ref: https://developer.blender.org/T92584

Reviewed By: campbellbarton

Differential Revision: https://developer.blender.org/D13419
2021-11-30 11:15:20 +01:00
c12d8a72ce BPath traversing: allow skipping weak library references
Add flag to `BKE_bpath_traverse_id()` and friends to skip weak
references (see below). This makes a distinction between "this blend
file depends on that file" and "this blend file references that file,
but doesn't directly use its data". This distinction is for the Asset
Bundle install operator, which refuses to copy the blend file when it's
not self-contained.

Weak references are those that are not directly used by the blend file,
but are still present to allow path rewriting. For example, when an
Asset is loaded its originating blend file is saved in
`ID::library_weak_reference`; this reference is purely for deduplication
purposes, and not for actually loading any data.

Reviewed by: mont29, brecht

Differential Revision: https://developer.blender.org/D13412
2021-11-30 10:41:23 +01:00
e7ae9f493a Fix T93310: crash due to broken image paths
The crash was caused by allocating an uninitialized amount of memory.
This fix initializes a bunch of variables that could cause the error.

It should be possible to also fix this in the function that actually uses
the uninitialized memory, but that could cause unknown consequences
that are a bit too risky for 3.0. Just initializing some variables should
be safe though. For more details see D13369.

Differential Revision: https://developer.blender.org/D13369
2021-11-29 19:23:43 +01:00
aa7051c8f2 Fix T93439: Armature widgets from hidden collections are invisible
There are a few things in the dependency graph which lead to the issue:
- IDs are only built once.
- The object-data level builder (for an Armature, for example) depends on
  the object visibility.

This caused issues when an armature was first built as not directly
visible (via a driver, for example) and then built as directly visible.
This did not update the visibility flag on the node for the custom shape
object.

The idea behind the fix is to move away from passing the object visibility
flag to the geometry-level builders and instead rely on the common
visibility flush post-processing to make sure certain objects are
fully visible when needed.

This is the safest minimal part of the change for the 3.0 release, which
acts as an additional way to ensure visibility. This means that it
might not be a complete fix (if some configuration was overlooked), but
it should not break currently working cases.

The fix should also make modifiers used on rigify widgets work.

The more complete fix will have `is_object_visible` argument removed
from the geometry-level builder functions.

Differential Revision: https://developer.blender.org/D13404
2021-11-29 16:59:50 +01:00
dae9917915 Fix T93384: Objects with Constraints to curves have wrong locations on file load
Regression since 2.93 caused by 752c6d668b.

Follow the code from 2.93 which was always leaving curve modifiers
evaluation with a valid and clean state of the bounding box.

This is also what was proposed and agreed on in the following
design task: T92206: Bounding Box: compute during depsgraph evaluation

Tested with files from T90808 and T93384.

For the 3.0 going with the safest and minimal change. The rest of
the bounding box un-entanglement is to happen outside of the stable
branch.

The patch is based on the code from Philipp Oeser and
investigation by Germano Cavalcante and Dr. Sybren A. Stüvel,
thanks!

Differential Revision: https://developer.blender.org/D13409
2021-11-29 16:45:31 +01:00
2206b6b9a0 Fix T92628: .blend thumbnail renders black with Cycles 3D viewport render
Don't use Cycles for rendering thumbnails, fall back to Solid shading.

Differential Revision: https://developer.blender.org/D13406
2021-11-29 16:27:55 +01:00
Pratik Borhade
03c9563582 Fix T93431: Crash when empty is marked as asset
Make `ED_preview_id_is_supported(ID *)` NULL-safe. It's semantically
valid, as it's not possible to render a preview of a NULL ID.

The crash was introduced in 481f032f5c

Reviewed By: sybren, jbakker

Maniphest Tasks: T93431

Differential Revision: https://developer.blender.org/D13398
2021-11-29 15:25:03 +01:00
b31250feba Fix T93456: Properly translate operator on splash screen
Use the translation API to lookup the string before formatting occurs.

Differential Revision: https://developer.blender.org/D13400
2021-11-29 02:04:32 -08:00
Brecht Van Lommel
d2e6087335 Fix build error with TBB 2021 and booleans
Linux distributions are using newer TBB versions than official releases, and
TBB 2021 is an API breaking release.

In general we should avoid using TBB directly and go through the abstractions
in BLI_task.hh, though there is no abstraction for this.

For 3.0 the safe option is to just not cancel the task but instead early out
in the lambda function. Given the grain size of 2048 there should be no
significant performance difference.
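
The early-out approach can be illustrated with a small Python sketch that
uses a thread pool in place of TBB. Instead of cancelling the whole task,
each chunk checks a shared flag first and returns immediately once a result
is known. The function name is hypothetical; only the grain size of 2048 is
taken from the message above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def any_match(items, predicate, grain_size=2048):
    """Return True if any item satisfies predicate, processing in chunks."""
    found = threading.Event()

    def process_chunk(start):
        # Early out: skip the chunk instead of cancelling the task group.
        if found.is_set():
            return
        for item in items[start:start + grain_size]:
            if predicate(item):
                found.set()
                return

    with ThreadPoolExecutor() as pool:
        # Consume the iterator so every chunk has run or early-outed.
        list(pool.map(process_chunk, range(0, len(items), grain_size)))
    return found.is_set()
```

Chunks that already started still run to completion, which is why a small
grain size keeps the wasted work low.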

Differential Revision: https://developer.blender.org/D13382
2021-11-27 19:08:06 +01:00
2fb8c6805a Fix build error with experimental features after recent release cycle bump
Hair, pointcloud and simulation datablock types should be disabled in the
beta cycles already like other experimental features.
2021-11-25 18:24:46 +01:00
e6a41e1c80 Blender 3.0 bcon4 - change release cycle to release candidate
This is still a rolling release candidate with new builds every day
as a preparation to the final release.
2021-11-25 17:59:49 +01:00
b2bb3e4b72 Fix T90082: Autoscrolling after renaming in the File Browser broken
Caused by 6b0869039a

Above commit introduced selection after renaming. This includes calling
`file_select_deselect_all` [which resorts and refilters].

So now, to have the correct file for scrolling, get it again after
sorting by calling `file_params_find_renamed` again.

Differential Revision: https://developer.blender.org/D13368
2021-11-25 17:32:40 +01:00
5a11c6e558 Fix missing margin below panels
A minor cosmetic fix. When the view was scrolled all the way to the
bottom, the lowest panel would end right on the view edge. The
scrollable view should get the same margin at the bottom as used at the
top.
2021-11-25 17:15:48 +01:00
5514ca58a4 Fix T92313: Heading of redo panel is not aligned properly
This corrects some alignment issues through new margins introduced in
93544b641b. Basic idea of this fix is to only add the new margins when
drawing a panel with background. These margins were added specifically
for the background boxes, so that makes sense.

Alternative fix to D13199.

This also fixes some margins added unintentionally in the mentioned commit.
There is a little jump of the toolbar and the tabs in the Properties
when comparing the UI without this fix to 2.93:
{F12158085} {F12158039}
The jump is gone with this fix applied (compare to the 2.93 screenshot):
{F12158064}
While not a serious issue, this confirms that this fix actually tackles
the root of the issue.
2021-11-25 17:09:45 +01:00
94e8db1e86 Fix T92278: Small size of previews in the shading popover
Don't use the side padding for menu item contents when displaying
previews or icons in a row or grid layout. This can cause problems for
the preview drawing and doesn't make sense to draw there anyway.

This not only fixes the mentioned issue, but also the too-small highlight
for the collection color tag in the Outliner context menu.

Alternative to and similar to D13125.
2021-11-25 15:55:04 +01:00
c91d196159 Fix T93274: Assigning asset catalog doesn't mark file as modified
Assigning a catalog to an asset via drag-and-drop in the asset browser
now creates an undo step. Not only does this allow undoing the action,
it also tags the blend file as modified.

Reviewed by: Severin

Differential Revision: https://developer.blender.org/D13370
2021-11-25 15:02:23 +01:00
e253fb2143 Fix T93353: Reload Library used as source for liboverride loses Constraints.
Liboverride properties and operations list need to be fully up-to-date
before libraries are reloaded, otherwise re-applying those liboverrides
after linked data is reloaded may miss some changes.
2021-11-25 14:56:56 +01:00
3bf10e5d0a Fix T89996, T90063: bugs with multi-button reset and entering values in popups
This reverts the changes to fix T87448, where entering the same value in number
buttons causes an unnecessary update. This is not stable enough for 3.0 and so
is being reverted, better to have an unnecessary update than no update in other
cases.

This effectively reverts the changes from rBeb06ccc32462 and follow up fixes
rBe1a9ba94c599, rBbbb52a462ef9, rBec30cf0b742f, and rB071799d4fc44. The code is
disabled with a comment on how it could be implemented better.
2021-11-25 14:45:19 +01:00
ffddf9e5c9 Fix T93338: Curve Guide force field crash
Caused by {rBcf2baa585cc8}.

For Curve Guide force fields to work, the `Path Animation` option has to
be enabled. With it disabled, we are lacking the necessary
`anim_path_accum_length` data initialized [done by
`BKE_anim_path_calc_data`] which `BKE_where_on_path` relies on since
above commit.

Now just check for this before using it - and return early otherwise.
Prior to said commit, `BKE_where_on_path` would equally return early
with a similar message, so that is expected behavior here.

Maniphest Tasks: T93338

Differential Revision: https://developer.blender.org/D13371
2021-11-25 14:16:12 +01:00
845716e600 Fix T92609 Default Compositing tab shows red overlay when stereoscopy is turned on
This was caused by the drawing not being done on the right framebuffer.
2021-11-25 13:40:04 +01:00
82808e18e6 Fix T93362: crash when capturing attribute after fillet curve node
The issue was that the attribute propagation in the Fillet Curve node seems
pretty broken. I couldn't really make sense of the old code. It changed the
size of the point attribute domains on splines to 1 for some reason which
led to a crash in the next node.

Differential Revision: https://developer.blender.org/D13362
2021-11-25 10:33:05 +01:00
Bastien Montagne
a0acb9bd0c Fix T91444: Edge Loop Preview fails with two Mirror Modifiers
The mirror modifiers merge option caused unnecessary re-ordering
to the vertex array with original vertices merging into their copies.

While this wasn't an error, it meant creating a 1:1 mapping from input
vertices to their final output wasn't reliable (when looping over
vertices first to last) as is done in
BKE_editmesh_vert_coords_when_deformed.

As merging in either direction is supported, keep the source meshes
vertices in-order since it allows the vertex coordinates to be extracted.

NOTE: Since this change introduces issues for some cases (e.g. bound
modifiers like SurfaceDeform), this change is only applied to newly
created modifiers, existing ones will still use the old incorrect merge
behavior.

Reviewed By: @brecht

Maniphest Tasks: T93321, T91444

Differential Revision: https://developer.blender.org/D13355
2021-11-25 10:21:49 +01:00
e6cd4761e7 Fix T93321: Modified Mirror modifier behavior breaks some other tools like bound SurfaceDeform.
Revert "Fix T91444: Edge Loop Preview fails with two Mirror Modifiers"

This reverts commit 1a7757b0bc.

Caused issue reported in T93321, boiling down to the fact that other
operations or modifiers (like the SurfaceDeform one) rely on the order
of the vertices in the mesh to remain consistent.

Changing this in a modifier would mean those operations need to be
reset/re-created (e.g. rebound for the SurfaceDeform case), which is not
doable in `do_version` code.
2021-11-25 10:21:49 +01:00
726bc3a46b Fix T93155: Approximate shadow catcher displayed wrong on CPU and GPU
Was happening during rendering, causing visual artifacts when doing
CPU+GPU rendering, and giving different in-progress results on different
devices.

The root of the issue comes down to the fact that the math used in the approximate
shadow catcher calculation might have resulted in a negative alpha channel,
and negative values for display are handled differently on CPU and GPU.
Such difference in handling is caused by an approximate conversion used on
the CPU for performance reasons.

This change makes it so no negative alpha is generated by the approximate
shadow catcher. Not sure if we need some explicit clamping somewhere to
deal with possible negative values coming from somewhere else.

The shadow catcher cornell box tests are to be updated for the new code,
but the new result seems to be more accurate.

Differential Revision: https://developer.blender.org/D13354
2021-11-25 10:17:52 +01:00
ce5561b815 Fix Py API: wrong doc about type of Collection property.
Collection property only accepts PropertyGroup type, not ID ones.

Reported on IRC by @frameshift, thanks.
2021-11-25 10:12:35 +01:00
40d28b40df Fix black Cycles result when cancelling tiled rendering with shadow catcher
Noticed while looking into T93155. Steps to reproduce:

- Open the .blend file from the report
- Hit F12 to start rendering
- After some tiles were rendered hit Esc

The issue is caused by "sticky" cancel reported via Progress. This means
that once the user hits Esc, all further requests for the cancel state will
return true, which was preventing the OIDN denoiser from completing the
denoising task.

Now only allow stopping the denoiser when interactive rendering requests
a very fast stopping.

Aiming the fix for 3.0 branch.

Differential Revision: https://developer.blender.org/D13352
2021-11-25 09:50:33 +01:00
Alaska
b41c72b710 Fix performance decrease with Scrambling Distance on
With the current code in master, devices without hardware accelerated ray tracing see a measurable performance decrease when scrambling distance is on compared to off. From testing, this performance decrease comes from the large tile sizes scheduled in `tile.cpp`.

This patch attempts to address the performance decrease by using different algorithms to calculate the tile size for devices with hardware accelerated ray traversal and devices without. Large tile sizes for hardware accelerated devices and small tile sizes for others.

Most of this code is based on proposals from @brecht and @leesonw

Reviewed By: brecht, leesonw

Differential Revision: https://developer.blender.org/D13042
2021-11-25 09:32:26 +01:00
8f2db94627 Fix T93357: crash when opening search menu
This is the same fix as in rBde35a90f9f56d3ff3ac80c13bf1ae296853ba877
but for the blender-v3.0-release branch.
2021-11-25 00:03:41 +01:00
Leon Leno
a9642f8d61 UI: Improve scaling of widgets when zooming
This commit improves the scaling of some ui widgets when
zooming by making the radius of the rounded corners
dependent on the element's zoom level.

Needed to fix T92278 without padding issues, see D13125.

Reviewed By: Hans Goudey, Julian Eisel

Differential Revision: https://developer.blender.org/D12842
2021-11-24 21:06:32 +01:00
752c6d668b Fix T90808: wrong BoundBox after undo curve selection
There are two functions that recalculate the boundbox of an object:
- One that considers the evaluated geometry
- Another that only considers the object's `data`.

Most of the time, the bound box is calculated on the final object
(with modifiers), so it doesn't seem right to just rely on `ob->data`
to recalculate the `ob->runtime.bb`.

Be sure to calculate the BoundBox based on the final geometry and
only use `ob->data` as a fallback
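
A minimal Python sketch of the "evaluated geometry first, `ob->data` as
fallback" rule follows. The function names and the point-list representation
are hypothetical and only illustrate the order of preference.

```python
def bounding_box(points):
    """Axis-aligned bounding box as (min_corner, max_corner)."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def object_bounding_box(evaluated_points, data_points):
    """Prefer the evaluated (post-modifier) geometry, else the original data."""
    points = evaluated_points if evaluated_points else data_points
    return bounding_box(points)
```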

Differential Revision: https://developer.blender.org/D12282
2021-11-24 14:52:49 -03:00
7a7ae4df43 UI: Blend File Icons Thumbnail View
Changes icon used to indicate blend file when overlaid over larger
document icon when in thumbnail view. Only seen when file does not
have a preview.

Followup to {rB611e4ffaab43}

For more details and examples see D13342

Differential Revision: https://developer.blender.org/D13342

Reviewed by Julian Eisel
2021-11-24 09:50:02 -08:00
71c39a9e2e Asset Browser: Activate a catalog when dragging
Without this it's easy to lose track of which catalog you are dragging.
Things feel generally quite jumpy/disconnected, activating the catalog
makes things feel far less like that.
I consider this an important usability fix, therefore I'm adding it to
the release branch.
2021-11-24 18:05:08 +01:00
cae3b581b0 Asset Browser: Fix catalog being renamed when dropping into parent
When dropping catalogs it is ensured that the name of the moved catalog
is unique within the new parent catalog. When dropping a catalog into
the parent, the catalog would not actually move to a different location,
but it would still be renamed. The unique name logic simply isn't smart
enough to ignore the catalog that is about to be moved.
Address this by disallowing dragging a catalog into its own parent. It's
already there.
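
The drop check described above can be sketched in Python, assuming catalogs
are identified by "/"-separated paths and the top level is the empty path.
The function names are hypothetical.

```python
def parent_path(catalog_path):
    """Return the parent of a '/'-separated catalog path ('' for top level)."""
    head, _, _ = catalog_path.rpartition("/")
    return head


def can_drop_into(dragged_path, target_path):
    """Refuse a drop onto the catalog's current parent: it is already there."""
    return target_path != parent_path(dragged_path)
```

For example, `can_drop_into("Characters/Ellie", "Characters")` is refused,
so the unique-name logic never runs for a move that changes nothing.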
2021-11-24 17:59:14 +01:00
01ab36ebc1 Asset Browser: Support dragging catalogs into top level
This was an oversight when I added catalog drag & drop support. I forgot
to add this for dragging catalogs into the top level by dragging into
the "All" item as well. This made the drag & drop support rather broken
because it wouldn't work for a basic case.
2021-11-24 17:31:28 +01:00
a07089dcb1 Fix T92120 (partially): No bone custom shape with curve object meshes
This part of the drawing code assumes that the bone custom object
has only one evaluated geometry component, and it also uses the
object type to check which data to draw, with the functions like
`DRW_cache_object_surface_get` that just take an object input.
Those functions usually work on evaluated objects, which use the
instancing system to access a temporary object with `object.data`
replaced for data types that don't match the original object.

That assumption used to work, but now curve, point cloud, or volume
objects can have an evaluated mesh which is not accessed with the
same object for render engine drawing.

The "correct" solution for the way this code is structured would be to
loop through all of the geometry components and try to get GPU batches
from every one of them. However, that significantly increases complexity
in an area that should probably be refactored anyway. This patch treats
the mesh as a special case, and only draws the evaluated mesh.

The **best** solution in my opinion might be refactoring this area to
use the instancing system with some sort of viewport-only flag so
the custom shape instances aren't added in the render.

The solution is "partial" because the "Wireframe" option only works
for meshes from mesh objects, even after this fix, and because other
data besides meshes is not displayed at all.

Differential Revision: https://developer.blender.org/D13038
2021-11-24 10:39:33 -05:00
56b068a664 Fix inconsistent UI terminology for tiling option
Was meant to be Use instead of Using.
2021-11-24 15:54:53 +01:00
64d9291d26 Cleanup: formatting 2021-11-24 15:54:53 +01:00
Alessio Monti di Sopra
2cc56495f3 UI: Fix alignment for recently added/edited icons
The patch slightly modifies two recently added icons "FILE_BLEND" and
"CURRENT_FILE" to better align them to the pixel grid, and change the
design of "FILE_BACKUP" to avoid alignment and readability issues, as
well as avoiding the outline version of the Blender logo which violates
the official logo guidelines.

Differential revision: https://developer.blender.org/D13346
2021-11-24 15:51:57 +01:00
5a50b46376 Fix T93352: Material previews lost render samples
There are a few layers of things which lead to the situation
of noisier material previews: the do-version of the
preview.blend did not happen (at least from the Python
side, as we did not investigate the C side deeply). This made
Cycles use default integrator settings for the preview
file ever since Cycles X was merged. Those settings are
adaptive sampling with max 4K samples, noise threshold 0.01.
Opening the file in Blender 3.0 for editing did run the versioning
code, which effectively lowered the number of samples used for
rendering.

This change makes it so the preview file is configured with
the exact effective settings as seen by Cycles before the
file was re-saved (adaptive sampling with the parameters noted
above).

This fix does not change the fact that the versioning code is
not used for preview.blend, it only solves the regression in
the quality of previews.

The fix was done and reviewed in collaboration with Dalai and Sergey.
2021-11-24 12:47:30 +01:00
cd818fd081 Assets: Sanitize threaded preview creation with undo
Basically, this fixes disappearing previews when editing asset metadata
or performing undo/redo actions.

The preview generation in a background job will eventually modify ID
data, but the undo push was done prior to that. So obviously, an undo
then would mean the preview is lost.

This patch makes it so undo/redo will regenerate the preview, if the preview
rendering was invoked but not finished in the undone/redone state.

The preview flag PRV_UNFINISHED wasn't entirely what we needed. So I had to
change it to a slightly different flag, with different semantics.
2021-11-24 11:20:35 +01:00
785503a7e4 Cleanup: use lowercase in private functions. 2021-11-24 10:05:45 +01:00
4b259edb0a Cleanup: Silence compilation warning in draw_manager. 2021-11-24 10:02:30 +01:00
60c0b79256 Add tablet data to Wintab fallback cursor movement. 2021-11-23 17:10:30 -08:00
3844e9dbe7 Fix (unreported): unlinked group input is not logged in geometry nodes
Differential Revision: https://developer.blender.org/D13340
2021-11-23 19:03:16 +01:00
9e5aae4215 Asset: Merge asset library/list refresh operators
In rBdcdbaf89bd11, I introduced a new operator
(`file.asset_library_refresh()`) to handle Asset Browser refreshing more
separate from File Browser refreshing. However, there already was
`asset.asset_list_refresh()`, which at this point only works for asset
view templates, but was intended to cover the Asset Browser case in
future too. This would happen once the Asset Browser uses the asset list
design of the asset view template.

So rather than having two operators for refreshing asset library data,
have one that just handles both cases, until they converge into one.
This avoids changes to the Python API in future (deprecating/changing
operators).

Differential Revision: https://developer.blender.org/D13239
2021-11-23 18:40:31 +01:00
792badcfef Fix broken handling of constraints reordering with library overrides
Alternative to D13291 (description partially copied from there).

New drag & drop reordering code would call the constraints reordering
operator with the generic context, and not the one from the panel's
layout, missing the "constraint" member which is mandatory for the poll
function to properly deal with override vs. local constraints.

For this to work in a decent way, there needs to be some panel-wide
context that we can restore when executing callbacks outside of the
normal draw context. So similar to uiLayoutSetContextPointer() to set
context on a layout level, this introduces
UI_panel_context_pointer_set() for panel level context (this calls the
former for the current panel root layout as well).

Differential Revision: https://developer.blender.org/D13308
2021-11-23 18:34:51 +01:00
fb4851fbbc Fix: The bounding box gizmo breaks if transform pivot is set to cursor
The bounding box transform code assumed that the pivot would always be
the sequence object transform center.

Rework the code so that this assumption is true even if the general
transform pivot is set to be the 2D cursor.
2021-11-23 18:03:36 +01:00
cf266ecaa6 Geometry Nodes: fix attribute propagation in Delete Geometry node
Previously, attribute propagation did not work correctly when only
deleting edges and faces (but not points). Face and face corner attributes
were propagated wrongly or not at all respectively.

In order to keep the patch relatively small for the release branch,
it does not include some small optimizations. These can be done in 3.1:
* Use a `Span<int>` instead of `IndexMask` to avoid creating an
  unnecessary `Vector<int64_t>`.
* Only prepare index mappings when there are actually attributes to
  propagate.

Differential Revision: https://developer.blender.org/D13338
2021-11-23 17:48:09 +01:00
a6b7f32112 Fix T93297: incorrect eevee motion blur with geometry instances
This disables motion blur for geometry instances in eevee, which did
not work correctly anyway. See code comment for more details.

Differential Revision: https://developer.blender.org/D13334
2021-11-23 16:40:21 +01:00
70424195a8 Cycles: Fix possible access to non-initialized light sample in volume
Happened in barbershop file where number of bounces to the light was
reached.

Differential Revision: https://developer.blender.org/D13336
2021-11-23 16:38:15 +01:00
2cbb9d7a76 Fix T93322: Freestyle Sinus Displacement Division by Zero Crash
This happens if the Wavelength is set to 0.0f.

Not sure if we really need a do_version patch for old files, as an
alternative we could also force a slight offset in the
SinusDisplacementShader. This patch does not do either, just force a
positive range from now on.
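
To show why a zero wavelength crashes and how a positive range avoids it,
here is a hedged Python sketch. The sine formula is a generic wave, not the
actual `SinusDisplacementShader` code, and the minimum value is a made-up
placeholder.

```python
import math

# Hypothetical lower bound; the real property limit may differ.
MIN_WAVELENGTH = 0.0001


def set_wavelength(value):
    """Clamp the wavelength into a strictly positive range."""
    return max(value, MIN_WAVELENGTH)


def sinus_offset(distance, wavelength, amplitude, phase=0.0):
    """Generic sine displacement; dividing by wavelength needs it to be > 0."""
    wavelength = set_wavelength(wavelength)
    return amplitude * math.sin(2.0 * math.pi * distance / wavelength + phase)
```

Note that clamping on assignment does not repair values already stored in
old files, which is the do_version question raised above.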

Maniphest Tasks: T93322

Differential Revision: https://developer.blender.org/D13329
2021-11-23 16:29:37 +01:00
Sayak Biswas
3bb8d173e7 Fix T93109: Cycles HIP missing check for correct driver version
21.Q4 is required, older versions should not show devices in the preferences.
This adds a check for the file version of amdhip64.dll file during hipew
initialization.

Differential Revision: https://developer.blender.org/D13324
2021-11-23 15:45:37 +01:00
1a7c32a0ab Fix T93320: Freestyle LineStyleModifier blend 'Minimum' error
This was just a typo in {rBb408d8af31c9}
Must be 'MINIMUM' (instead of 'MININUM').

Maniphest Tasks: T93320

Differential Revision: https://developer.blender.org/D13328
2021-11-23 15:38:52 +01:00
ceb25cbeba Fix compilation warnings when building without OpenImageDenoiser
Reported by Sybren, thanks!
2021-11-23 15:34:54 +01:00
5efddc4347 Fix add-on Preferences using the .blend file icon, not the Blender logo
Intention of the icon is to mark add-ons that are official/bundled.
Doesn't make much sense to use the .blend file icon for that. It's
arguable if the Blender logo should be used for this, but the file icon
is definitely the wrong choice.
2021-11-23 15:30:05 +01:00
436ce22194 Fix T93296: raycast node uses wrong domain for face corner attributes
This changes what domain is used by the raycast node. This should fix the
behavior for face corner attributes (but may make it a bit slower for other
attributes). I think for 3.0 this is an acceptable trade off. For 3.1 we can do
what the comment suggests already.

Differential Revision: https://developer.blender.org/D13333
2021-11-23 14:38:02 +01:00
dab04bc053 Fix T93231: crash when overwriting vertex group with other domain
The problem was that we forgot to actually remove the vertex group when
it should be deleted. We only removed all the data that was attached to it.

Differential Revision: https://developer.blender.org/D13326
2021-11-23 14:38:02 +01:00
0479a66313 Fix broken versioning after recent refactor of insertion in liboverrides.
rB33c5e7bcd5e5b79 doversion code was incorrectly dealing with 'insert in
first position' case from older blendfiles.

Specifically, a NULL anchor is valid (it means that the new item is the
first of the stored override data, and should be inserted at start of
the list).

Reported as part of T93321.
2021-11-23 14:36:40 +01:00
611e4ffaab Icons: Replace .blend file icons, add "Current File" icon
The Blender icon must not be used to refer to anything that is not
Blender itself. Using the Blender icon on its own to refer to .blend
files or the currently open file is a no-go, which was brought up by
Ton.

This does the following changes to the icon file:
* Add new "Current File" icon
* Change the .blend file icon to contain a file icon with the Blender
  logo, but not merely the Blender logo.
* Change the backup .blend file icon accordingly.

The new "Current File" icon is used in the Asset Browser, but
could/should be used in the Outliner as well. That needs more design
discussion though.
2021-11-23 14:32:35 +01:00
b7c98c87ac Cleanup: clang-tidy warnings
Silence warnings about usage of 'else' after 'return'.
2021-11-23 10:15:00 -03:00
Christian Stolze
bba6fe83e2 Fix T89204: slow repeated rendering with GPUOffscreen.draw_view3d.
Reviewed By: fclem
Differential Revision: D13235
2021-11-23 14:08:38 +01:00
6ab3349bd4 Documentation: Remove deprecated glColor* from bgl module.
glColor isn't supported but still part of the documentation. This
patch removes the glColor from the documentation.

Ref {T93315}
2021-11-23 13:03:12 +01:00
3e65bb86f9 Cleanup: fix clang-tidy warnings
Fix clang-tidy warnings introduced by a recent commit on the release
branch.

No functional changes.
2021-11-23 13:02:00 +01:00
cd2849c89b Asset Browser: add operator for installing asset bundles
Add an operator "Copy to Asset Library" for installing asset bundles
into already-existing asset libraries.

The operator is shown when:
- the "Current File" library is selected,
- the blend file name matches `*_bundle.blend`, and
- the file is not already located in an asset library.
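
The three conditions above can be sketched as a single predicate. This is a hedged illustration in plain Python, not the operator's actual poll code: `can_copy_bundle`, its parameters, and the use of POSIX-style paths are all assumptions.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def can_copy_bundle(active_library, blend_path, asset_library_roots):
    """Return True when all three listed conditions hold (hypothetical helper)."""
    path = PurePosixPath(blend_path)
    # Condition 1: the "Current File" library is selected.
    if active_library != "Current File":
        return False
    # Condition 2: the blend file name matches `*_bundle.blend`.
    if not fnmatchcase(path.name, "*_bundle.blend"):
        return False
    # Condition 3: the file is not already located in an asset library.
    for root in asset_library_roots:
        if PurePosixPath(root) in path.parents:
            return False
    return True
```

Note that this sketch leaves out the check for external file references, which the commit message says is done when the operator runs rather than in its poll function.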

The user can select a target asset library, then gets a "Save As"
dialogue box to select where in that library the file should be saved.
This allows for renaming, saving in a subdirectory, etc.

The Asset Catalogs from the asset bundle are merged into the target
asset library.

The operator will refuse to run when external files are referenced. This
is not done in its poll function, as it's quite an extensive operator
(it loops over all ID datablocks).

Reviewed by: Severin

Differential Revision: https://developer.blender.org/D13312
2021-11-23 12:29:44 +01:00
b02ac2d8be Fix T93092: incomplete animation rendering of multi-layer exr composition
This was broken by rB0c3b215e7d5456878b155d13440864f49ad1f230.
The caching of loaded exr files needed some special treatment.

Differential Revision: https://developer.blender.org/D13313
2021-11-23 09:43:00 +01:00
84be741329 Fix T92654: missing padding in spreadsheet cells
This was probably broken by rB5c2330203e11e0d916960218b07d88d2193bf526.
For now just add the padding back in a spreadsheet specific way.

Differential Revision: https://developer.blender.org/D13315
2021-11-23 09:35:47 +01:00
6987060f70 Fix T93090: crash with data transfer modifier and geometry nodes
There was a missing normals layer that was requested by the data transfer
modifier from the target object. The normal layer was correctly added to
the target object. However, it never reached the data transfer modifier
because the mesh was copied in `BKE_object_get_evaluated_mesh`
(in the call to `get_mesh_for_write`) and the copy does not include the normals
layer.

The solution is to not use `get_mesh_for_write` here which was only used
because `BKE_object_get_evaluated_mesh` returns a non-const `Mesh *`.
Mid term, it should actually return a `const Mesh *` to avoid the confusion.

Differential Revision: https://developer.blender.org/D13319
2021-11-23 09:32:12 +01:00
Sayak Biswas
f749506163 Fix T93244: Cycles HIP not working with multi GPU rendering
Use the correct device function (hipDeviceGet) for multi GPU setups, instead
of hipGetDevice which just returns the default device.

Differential Revision: https://developer.blender.org/D13323
2021-11-23 00:55:56 +01:00
481f032f5c Assets: Generate light preview when making light object an asset
We already supported previews for lights, just didn't actually use them
when making a light object an asset. They were only used when making the
light data itself an asset.
2021-11-22 21:14:11 +01:00
34615cd269 Fix grayed out preview generation button for light objects
Mistake in e7bea3fb6e.

We should only skip preview generation for objects that don't support
preview rendering, not completely forbid accessing preview data of such
IDs.
2021-11-22 21:12:36 +01:00
e2b736aa40 Fix part of T93278: transparent glass option not working with environment pass 2021-11-22 20:58:09 +01:00
Takahiro Shizuki
ee0277271c IME: Fix Multi-Window Duplicated First Character
Fix problem with duplicated initial character when initiating or
switching to new windows. This is done by updating our copies of state
and modes from the new window when it receives WM_IME_SETCONTEXT
message. This problem and fix are only for the Windows platform.
2021-11-22 10:41:11 -08:00
8a84a61f6b Fix confusing new Cycles UI terminology
* Rename "Auto Tiles" to "Use Tiling", it's not really automatic and
  confusing with the old auto tile size add-on.
* Rename "Adaptive" scrambling distance to "Automatic", to avoid confusion
  with adaptive sampling.
2021-11-22 18:37:47 +01:00
336ca6796a Fix T90308: Cycles crash copying memory from device to host
Happens when device runs out of memory and Cycles is moving some
textures to the host memory.

The delayed memory free for OptiX BVH was moving data from one
device_memory to another, leaving the original device memory in
an invalid state. This was ruining the allocation map in the CUDA
device which is using pointer to the device_memory.

This change makes it so the memory pointer is stolen from BVH
into the delayed memory free list.

Additionally, forbid copying and moving instances of device_memory
and added sanity checks in the device implementation.

Differential Revision: https://developer.blender.org/D13316
2021-11-22 17:26:59 +01:00
25c83c217b Cleanup: Clang-format of the HIP device implementation 2021-11-22 17:26:52 +01:00
875f24352a Material Preview: Fix Sphere object (squared UV and poles)
A good sphere preview material has a 1:1 UV ratio (so we see squares as
least distorted as possible), as well as good poles.

Square UV:
The original sphere expected a 2:1 panorama to be mapped to it. This
patch changes that (I scaled Y by 0.5) so that square textures look ok.

Poles:
The original sphere had a low initial resolution, so no amount of
subdivision would fix the poles.

The sphere has a subdivision modifier with 0 resolution. Later (3.1?) I
want to try to change the resolution on-the-fly based on whether the material
has a displacement map.

Old sphere (1.9K vertices):
{F11845752, size=500px}

New sphere (2.0K vertices):
{F11845710, size=500px}

Differential Revision: https://developer.blender.org/D13309
2021-11-22 16:46:49 +01:00
Henrik Dick
819b9bdfa1 Fix T92631: Fix negative thickness regression in complex solidify
This regression was introduced by D11832, but there were problems before
that as well. I seem to have missed it in review. See the differential
revision for a screenshot of the difference.

Differential Revision: https://developer.blender.org/D13216
2021-11-22 09:33:49 -05:00
7b09213f2f Fix T93198: Frame Selected in greasepencil curve editing does not work
Was not taking into account curve points at all.

Maniphest Tasks: T93198

Differential Revision: https://developer.blender.org/D13281
2021-11-22 13:42:00 +01:00
0b246ed813 Revert "Fix (unreported) broken handling of constraints reordering with liboverride."
This reverts commit 6eaa69c66c.

Committed by mistake, sorry for the noise.
2021-11-22 10:39:20 +01:00
6c16bb2706 Fix broken NLA RNA code after own rBfa6a913ef19c.
Thanks to @scurest for noticing this mistake!
2021-11-22 09:31:59 +01:00
6eaa69c66c Fix (unreported) broken handling of constraints reordering with liboverride.
New drag&drop reordering code would call constraints reordering operator
with the generic context, and not the one from the panel's layout,
missing the "constraint" member which is mandatory for poll function to
properly deal with override vs. local constraints.

This commit fixes it by generating a temp bContextStore in the panel
re-ordering callback.

NOTE: this fix will have to be extended to modifiers (which happen to
work currently because they have an 'active' status), and gpencil
modifiers (which are also broken currently).

Differential Revision: https://developer.blender.org/D13291
2021-11-22 09:28:27 +01:00
1b2ee3cf20 Fix T92090: Eevee crash with Intel HD 4000 and macOS 10.15.7
A recent security update to macOS 10.15.7 causes crashes when using Eevee and
various other 3D viewport features. It appears that glGenerateMipmap is
broken, causing a crash whenever its commands are flushed/submitted to the GPU.

Ideally this would be fixed in a driver update, however it's unlikely this will
happen. Earlier macOS versions have been receiving security updates for 2 years,
and that window has just passed for 10.15. Further, computers with these GPUs
can't upgrade to a newer macOS version.

As a workaround, disable mipmaps on these GPUs, by setting the mipmap max level
to 0 and not calling glGenerateMipmaps. Effects like depth of field also use
mipmaps, but fill in the mip levels by other means. In those cases we keep the
mipmap level.

Differential Revision: https://developer.blender.org/D13295
2021-11-20 17:50:05 +01:00
Sayak Biswas
f2bb42a095 Fix T92984: Cycles HIP crash with smoke volumes
This fixes the app crash happening when trying to render smoke as a dense
3D texture. The changes are related to matching up hipew with the actual HIP
headers.

Differential Revision: https://developer.blender.org/D13296
2021-11-20 14:02:38 +01:00
b20997cb34 Fix T93194: greasepencil channel lists ignoring collection visibility
Same fix as rB0a3b4d4c64f1, but this time for greasepencil.

To repeat: the dopesheet in greasepencil mode was ignoring the temporary
visibility flag of collections. As a result, even though the dopesheet
was supposed to show animation data of visible greasepencils only, it was
still showing such data for greasepencils that were hidden by hiding
their collection.
2021-11-20 12:31:33 +01:00
092df87534 Fix error in rBfb0ea9
There is no need to multiply the "dash_width" by `UM.pixel size` since the "viewport_size" is already being divided by the DPI.

Ref {rBfb0ea9}
2021-11-20 01:44:35 -03:00
fb0ea94c63 Fix T85855: F-curves too thin on Mac
Use the `GPU_SHADER_3D_POLYLINE_UNIFORM_COLOR` shader instead of `GPU_SHADER_2D_LINE_DASHED_UNIFORM_COLOR`.
This is just a partial solution as "protected" fcurves still use the dashed shader.

Differential Revision: https://developer.blender.org/D13290
2021-11-19 16:57:05 -03:00
00e4d665f4 Fix T91838 Crash when toggling edit mode on object with geometry node modifier, but only if the instanced objects material has a normal map assigned.
This is only a workaround to avoid the crash. The underlying issue is left
unfixed.

New report for tracking the underlying issue is T93223.
2021-11-19 19:24:22 +01:00
1b686c60b5 Fix T93046: Cycles world volume rendering very slow in OptiX with some scenes
With very long ray distance, OptiX ends up traversing many BVH nodes due to
a feature that improves precision. However this causes very slow rendering.

We now avoid generating such long rays by rejecting the few samples that have
long ray distances and very low probability of being generated. This should not
meaningfully affect render results.

Thanks to Sergey and Patrick for the investigation.
2021-11-19 17:42:22 +01:00
0f1a200a67 Fix T92682: EEVEE motion blur crash with curve objects
After rBb9febb54a492, the evaluated mesh from a curve is now presented
to render engines as a separate mesh object, but some code still assumed
that a curve object itself could have an evaluated mesh. However, this is
still true for surface objects and metaballs, which don't
use geometry sets yet.

Differential Revision: https://developer.blender.org/D13272
2021-11-19 11:36:29 -05:00
1a1ddcb5e2 Asset Browser: don't display linked-in asset datablocks
Datablocks marked as asset, linked from another file, were shown in the
"Current File" asset library. This is now resolved.
2021-11-19 16:29:25 +01:00
06ead314b6 Asset Preferences: disallow single file as asset library
Asset libraries should be directories on disk. By manually entering a
file path it was possible to have a single blend file as asset library,
but that was not a designed-for situation, and it doesn't play well
with the asset catalog system.
2021-11-19 16:08:55 +01:00
33c5e7bcd5 LibOverrides: Refactor how diffing of RNA collections is handled.
Original implementation was a quick prototype which should have never
landed as-is in master. It had very limiting constraints and did not
allow for any real further development.

This commit fixes the internal implementation to make it more sensible,
maintainable and evolutive.

NOTE: This commit introduces another forward-incompatibility in the
Blender file format: Files saved after this commit won't open properly
in older versions of blender regarding local inserted constraints or
modifiers into overrides of linked data.

NOTE: Technical details: The 'anchor' item name/index is now stored in
`subitem_reference_` members, and the actual 'source' item name/index is
stored in `subitem_local_` members of the override property operation
data.
Previously, only the `subitem_local_` members were used, storing the
anchor item name/index, and assuming the 'source' item was always the
next in the list.

Milestone I of T82160.

Maniphest Tasks: T82160

Differential Revision: https://developer.blender.org/D13282
2021-11-19 15:41:53 +01:00
d6ea881a74 BLI_listbase: Add utils to search from string or index.
If a valid matching string is found, return that item, otherwise
fallback to the item matching the given index, if any.

This will be useful in RNA override code, and potentially other
areas where data in lists can be referenced by their names or indices.
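
The lookup order described above (name first, then index) can be sketched as follows. This is an illustrative Python version over a plain list; the function name and the `key` parameter are hypothetical and do not mirror the actual `BLI_listbase` signature.

```python
def find_by_name_or_index(items, name, index, key=lambda item: item["name"]):
    """Return the item matching `name`, else the item at `index`, else None."""
    # Prefer a name match when a non-empty name is given.
    if name:
        for item in items:
            if key(item) == name:
                return item
    # Otherwise fall back to the index, if it is in range.
    if 0 <= index < len(items):
        return items[index]
    return None
```

For example, with items named "Copy Location" and "Limit Distance", looking up the name "Limit Distance" with index 0 returns the second item, while looking up an unknown name with index 0 falls back to the first.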
2021-11-19 15:41:36 +01:00
04ec36f677 Fix T87912: use session id instead of name to identify dropped object
The old code did not work when there were multiple ids with
the same name (which can happen when ids are linked in).
The solution is to use the session ids instead. Those are different
even when two ids have the same name.

Differential Revision: https://developer.blender.org/D11116
2021-11-19 15:28:44 +01:00
a20e703d1a Fix T93184: Link color not used for custom sockets
D13044 allowed the link color overlay to be used with custom sockets.
This no longer works due to a condition that checks if the socket is
standard or not, which was in place to avoid bad indexing of the
std_node_socket_colors array. Since that array is no longer used, this
condition needs to be removed.

Differential Revision: https://developer.blender.org/D13274

Reviewed By: Hans Goudey
2021-11-19 15:55:58 +02:00
fa6a913ef1 LibOverride: Add read-only flags accessors for 'local override' status.
Constraints, modifiers and NLA tracks can now report from RNA whether
they are defined as coming from the override's reference linked data,
or are local to the override.
2021-11-19 12:09:28 +01:00
83e245023c Fix (unreported) wrong behavior of constraints in liboverrides.
All constraints were 'made local', including the ones coming from the
reference linked object.
2021-11-19 12:09:28 +01:00
de3fda29c7 Fix T93054: crash when deleting a missed linked file
This is a bit similar to rBb7260ca4c9f4b7618c9c214f1270e31d6ed9886b.
Sometimes a group node may not reference a node group
because it was linked and can't be found.
2021-11-19 10:15:58 +01:00
4ea6b4ba84 Fix crash in VSE versioning code from recent commit
Caused by {rB4d09a692e22a}.
Greenlit by @sergey in chat.
2021-11-19 10:05:18 +01:00
4d09a692e2 Fix T92847: Meta-strip corrupt
Offsets for meta strip were invalid. No steps to reproduce the issue are
available, but it is quite possible that there are files with incorrect
state after issues with meta strips were fixed.

Ensure correct offsets for meta strips in versioning code.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D13257
2021-11-19 06:12:09 +01:00
fd2a155d06 Fix T91797: Cycles volume rendering artifact with overlapping volumes
With the new volume rendering code this was no longer accurate, we always
need to use a new dimension for the next volume segment.
2021-11-18 19:27:37 +01:00
7c4e4d605c Pose Library: clarify apply/blend operator names
The old names had "{verb} Pose Library Pose"; they are now named
"{verb} Pose Asset", which is shorter & clearer.
2021-11-18 17:52:17 +01:00
12fc395436 Fix T93152: Cycles baking multiple selected object to active not working 2021-11-18 17:43:27 +01:00
f0be276514 Fix T93082: Cycles baking not handling transparency correctly
When baking, replace the transparent BSDF with holdout. This ensures no
objects behind are baked, and that the baked image has alpha.
2021-11-18 17:13:16 +01:00
ed91e759d1 Fix Cryptomatte accurate option showing for Cycles, only needed for Eevee 2021-11-18 17:13:16 +01:00
8d1357ea6b Pose Library: add conversion button to old poselib UI
Add the "Convert Old Pose Library" operator to the old pose library (in
the Armature properties editor). This makes it more discoverable; before
it only was available in the Action editor.
2021-11-18 16:35:56 +01:00
31afa1bb9a Asset tags: include partial tag matches
When filtering the asset browser, also include results that have partial
tag matches. So searching for "xcite" will include results tagged with
"excited".

This brings the tag filtering in line with other search boxes in
Blender. Later we might want to provide users with more options for
prefix-only ("excite" would match "excited", but "xcited" would not) or
only exact matches.
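
The difference between partial, prefix-only, and exact matching can be sketched like this. It is a minimal Python illustration; the `tag_matches` name, the `mode` parameter, and the case-insensitive comparison are assumptions rather than the asset browser's actual filtering code.

```python
def tag_matches(query, tag, mode="partial"):
    """Compare a search query against one tag (case-insensitive by assumption)."""
    query, tag = query.lower(), tag.lower()
    if mode == "partial":
        # Substring match anywhere in the tag, as in this commit.
        return query in tag
    if mode == "prefix":
        # Possible future option: only match from the start of the tag.
        return tag.startswith(query)
    if mode == "exact":
        # Possible future option: the whole tag must match.
        return query == tag
    raise ValueError(f"unknown mode: {mode!r}")
```

Here `tag_matches("cite", "excited")` is true under partial matching but false under prefix matching, which only accepts queries such as "excite".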
2021-11-18 16:21:00 +01:00
0624acf088 Fix T92442: undo removal of Eevee cryptomatte accurate mode option
The Cycles accurate mode was removed, but the Eevee option for this has
a different meaning and should not have been removed. The Eevee accurate
makes cryptomatte accumulate for every sample, which Cycles has always
done regardless of any option.
2021-11-18 15:59:08 +01:00
b926f54f3c Fix: error when materializing curve point attribute
Differential Revision: https://developer.blender.org/D13271
2021-11-18 15:28:04 +01:00
f71813204c Cycles: Don't tile if image area fits into tile area
Previously the check was done based on dimension of image and if any
of dimensions were larger than tile size tiling was used.

This change makes it so that if the image does not exceed the number of
pixels in the tile, no tiling will be used. Allows to render widescreen images
without tiling.
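
The change from a per-dimension check to an area check can be sketched as below. The function names are hypothetical, and the sketch assumes square tiles of `tile_size` by `tile_size` pixels.

```python
def needs_tiling_by_dimension(width, height, tile_size):
    """Old behaviour: tile as soon as either dimension exceeds the tile size."""
    return width > tile_size or height > tile_size


def needs_tiling_by_area(width, height, tile_size):
    """New behaviour: tile only when the image has more pixels than one tile."""
    return width * height > tile_size * tile_size
```

A 4096x1024 widescreen image with a tile size of 2048 has exactly as many pixels as one tile, so the area check skips tiling where the per-dimension check would have tiled.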

Differential Revision: https://developer.blender.org/D13206
2021-11-18 14:27:45 +01:00
3ad2bf1327 Cycles: Fix command line render overshooting time limit
The calculation based on preserving device occupancy was conflicting
with the fact that time limit needs to render less samples at the last
round of render work.

For example, rendering BMW27 for 30sec on i9-11900k was actually
rendering for almost a minute. Now the render time limit is respected
much more closely.

Differential Revision: https://developer.blender.org/D13269
2021-11-18 14:27:45 +01:00
bd2e3bb7bd Fix T93045: Cycles HIP not rendering OpenVDB volumes
Build HIP kernels with NanoVDB, and patch NanoVDB to work with HIP.

This is a header only library so no rebuild is needed. The changes are being
submitted upstream to openvdb, so this patch should be temporary.

Thanks Thomas for help testing this.
2021-11-18 13:24:56 +01:00
e5774282b9 Fix asset preview not showing up for current file data-blocks
For data-blocks from the current file, the image-buffer for dragging
wasn't set at all. This wasn't intentional, dragging things in the Asset
Browser should just always show the preview.
2021-11-18 11:47:21 +01:00
William Leeson
8c0370ef7b Fix T93102: Principled hair shader stack was built incorrectly
stack_assign_if was used in the middle of creating the shader value blocks,
which caused stack variables to be inserted in the middle of the shader value data.
This resulted in the shader node data not being in sequential order. This was also
the case for the wave texture node.

Reviewed By: brecht

Maniphest Tasks: T93102

Differential Revision: https://developer.blender.org/D13262
2021-11-18 09:31:29 +01:00
a182b05f07 Fix T90390: VSE Jump operator doesn't work during playback
Tag dependency graph `ID_RECALC_AUDIO_SEEK` update.
2021-11-18 02:59:50 +01:00
Andrea Beconcini
daaa43232d Fix T92445: Thumbnail height without overlay text
This patch changes the thumbnails' height used for image and movie
strips by choosing the proper size according to the VSE's text overlay
settings: i.e. thumbnails use the whole strip's height when no overlay
text is displayed; otherwise, some space is left for the overlay.

Reviewed By: ISS

Differential Revision: https://developer.blender.org/D13043
2021-11-18 02:14:11 +01:00
d8fd575af9 Fix T93154: Crash adding multiple movie strips
When adding multiple movies at once and only some of them have an
audio track, this causes a crash on NULL dereference. Issue was introduced
in bdbc7e12a0 to align sound and video properly.

Check if sound is present in movie file. If it's not, don't try to align
sound with video.
2021-11-18 01:32:06 +01:00
b071083496 Fix T93166: Division by zero when drawing thumbnails
Caused by incorrect step calculation for too short strips. For these,
step should be equal to strip length not 0.
2021-11-18 01:11:51 +01:00
00a9617f92 Fix: wrong assert in geometry nodes evaluator
It only makes sense to check if all required outputs have been computed
if the node was executed at all.
2021-11-17 15:40:53 +01:00
51b8e34fb7 LineArt: Improve certain edge cases in occlusion
This patch includes:
View vector fix for ortho back face.
Point on segment logic correction.
Better handling of boundary cases.

See review page for detailed description.

Reviewed By: Sebastian Parborg (zeddb)

Differential Revision: https://developer.blender.org/D13143
2021-11-17 14:31:12 +08:00
6e6123b40f Fix T93080: Crash on scrubbing with snapping
Sequencer wasn't initialized, snapping crashed on NULL dereference.
Add NULL check.
2021-11-17 05:40:25 +01:00
9bdf3fa5f0 Fix T91724: Strip height is too limited
This change was introduced in 997b5fe45d, to not display pixelated
thumbnails. However when VSE timeline height is made smaller, this
limits strip height.

Change limit, so one strip can occupy full height of VSE timeline
2021-11-17 02:45:53 +01:00
f829b86039 Cleanup: Correct copy paste error in comment
Mistake from rB2743d746ea4f38c098512f6dd6fc33d5a62429d3
2021-11-16 17:49:51 -05:00
1e4d1eb398 Fix another Linux build error with double and float comparison 2021-11-16 23:12:50 +01:00
b496c1c721 Cleanup: compiler warnings 2021-11-16 22:29:50 +01:00
3189171a94 Fix build error with strict double to float conversion 2021-11-16 22:25:24 +01:00
f30e1fd2f0 Fix T93085: Incorrect geometry nodes modifier warning
It's valid for a node group connected to the modifier not to
have a geometry input, but I didn't consider that case
with the last change I made here, f3bdabbe24.

Differential Revision: https://developer.blender.org/D13231
2021-11-16 14:51:03 -06:00
25d30e6c99 Fix T92857: Deadlock in geometry nodes curve multi-threading
The spline code, especially for Bezier splines, often makes use of lazy
evaluation and caching. In order to do that, they use mutex locks.
When multi-threading, this can lead to problems. Further detail
can be found in rBfcc844f8fbd0d1.

To fix the deadlock, isolate the task before multi-threading
when holding a lock.

Differential Revision: https://developer.blender.org/D13229
2021-11-16 14:49:58 -06:00
cfd0e96e47 Fix T93125: Cycles wrong remaining render time with high number of samples
Avoid integer overflow.
2021-11-16 20:49:32 +01:00
7293c1b357 Fix T93106: Cycles SSS not working with normals pointing inside 2021-11-16 19:44:45 +01:00
1572c4d3d3 Fix T93011: Individual origins being used when pivot point is overridden
There should be a special `t->around` for this case, but for now let's
just avoid having the individual origins overlap.
2021-11-16 13:56:11 -03:00
bd37553850 Cleanup: better delimit member initialization
The initialization of `t->around` and `t->view` was scattered and with
duplicate code
2021-11-16 13:55:30 -03:00
0335df9384 Transform: better contextualize the status bar
`Remove Last Snap Point` should only be displayed when there is a Snap Point to be removed.
2021-11-16 13:39:10 -03:00
b3529ecf0e Fix CUDA error when using tiny border in viewport
Need to clamp scaled render buffers window to be above zero
when applying resolution divider.
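
A minimal sketch of the clamp, assuming the scaled size is computed by integer division; `scale_window` and its parameters are hypothetical names, not the Cycles code.

```python
def scale_window(width, height, resolution_divider):
    """Scale a border window down, never letting a side drop to zero."""
    # Integer division of a tiny border can yield 0, so clamp each side to 1.
    scaled_width = max(1, width // resolution_divider)
    scaled_height = max(1, height // resolution_divider)
    return scaled_width, scaled_height
```

A 3x2 pixel border with a resolution divider of 8 would scale to 0x0 without the clamp; with it, the window stays at 1x1.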
2021-11-16 17:25:18 +01:00
72ee62e0da Fix crash on freeing hair system
Fix a crash when a hair system's `ParticleSettings` ID datablock was
linked from another file but couldn't be found. This results in default
settings, with `type = PART_EMITTER`, where the particle data still has
a non-NULL `hair` pointer. Previously, copies of such a particle system
would NOT copy hair data for non-hair particle systems, hence the
pointer of the copy pointed to the original data, which got freed (at
least) twice upon closing the blend file.

This is now fixed by always copying the hair data, regardless of the
particle system type.

Reviewed by: mont29

Differential Revision: https://developer.blender.org/D13245
2021-11-16 17:18:01 +01:00
07af45eec5 Asset Browser: hide catalog debug info behind debug option
Add a new "experimental" debug option `show_asset_debug_info`, and use
that to determine the visibility of the active asset's catalog UUID and
simple name. Previously this was only determined by the "Developer
Extras" option, which meant it was visible in too many situations. It's
not really a "developer extra", and really just a debugging tool, so the
new option is more in line with its purpose.

Reviewed by: Severin

Differential Revision: https://developer.blender.org/D13242
2021-11-16 16:49:27 +01:00
ce0d817bb7 UI: Fix hard to read text for drag disabled hints
In 499dbb626a, the background color of drag tooltips was changed so
text becomes more readable. But multiple people were touching the same
code, so the disabled hint tooltips didn't get the same tweak. They
would benefit from them even more, since the red text is even harder to
read on the transparent background than the regular, white text.
2021-11-16 14:37:20 +01:00
c7a1e115b5 Tests: fix memory leak of GHOST system paths
Dispose of GHOST system paths when tearing down `BlendfileLoadingBaseTest`
and some other test cases. This prevents a memory leak.

A better solution would be to rework Blender's initialisation & teardown
structure, but that's outside the scope of this fix.

No functional changes to Blender.
2021-11-16 13:07:11 +01:00
faa8aa3bb9 Asset Browser: Forbid dragging catalogs into themselves
While there is nothing technically that would cause issues when moving a
catalog into itself (it just changes the path of the catalog, and the
missing parent catalogs will be created), it seems broken to the user.
So disable this in the drag & drop code for asset catalogs.
2021-11-16 13:01:57 +01:00
052c22199d Asset Browser: use one more refresh operator
Refreshing the assets requires `file_OT_asset_library_refresh` in the
asset browser, and `asset_OT_list_refresh` for the asset view. Both
are now done from `ASSET_OT_open_containing_blend_file`.
2021-11-16 12:35:39 +01:00
Diptangshu Dey
da14a482f2 Fix T90866: Python operator templates are not accessible from menus
Python Operator templates made accessible from respective menus
(required to also use F3 search for quick access)
Also fixed Modal Draw Operator id_name (had duplicate name from other template)

Maniphest Tasks: T90866

Differential Revision: https://developer.blender.org/D13182
2021-11-16 10:45:23 +01:00
7d985d6b69 Fix T93066: Alembic export ignores Mantaflow particles
`ABCPointsWriter::is_supported` already checked for valid particle
system types (liquid, spray, foam, bubbles, ...).

`AbstractHierarchyIterator::make_writers_particle_systems` did not
create a writer for these though, so now bring these in line and also
create writers for these.
2021-11-16 09:41:09 +01:00
a040d2a93a Fix T90592: Incorrect scrollbar range with backdrop
`v2d->tot` rect was set for backdrop drawing. Set range before drawing
scrollbars.

Reviewed By: Severin

Differential Revision: https://developer.blender.org/D13099
2021-11-15 20:28:11 +01:00
7e148c45c8 Fix T90415: Missing cache invalidation
Some RNA properties and operators did not invalidate cache or did it
incorrectly.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D13101
2021-11-15 20:23:57 +01:00
d3c45e1c39 Fix T91405: Block artefacts in WEBM video
Issue was caused by incorrect use of the FFmpeg asynchronous decoding API. In most
cases, decoder returns 1 frame each time it is fed by 1 packet. Here
decoder wanted to return more frames, but our code always expected only
one.

Before sending new packets to decoder, check if there are frames to
receive. If there are, process them, otherwise continue decoding as
usual.
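
The receive-before-send order described above can be sketched with a toy decoder. `ToyDecoder` and `next_frame` are hypothetical stand-ins written for this illustration; they only mimic the send-packet/receive-frame shape of FFmpeg's asynchronous decoding and are not that API.

```python
from collections import deque


class ToyDecoder:
    """Stand-in decoder: each packet may decode to zero or more frames."""

    def __init__(self):
        self._pending = deque()

    def send_packet(self, packet):
        # A packet is modelled as the list of frames it decodes to.
        self._pending.extend(packet)

    def receive_frame(self):
        # Return the next decoded frame, or None when nothing is pending.
        return self._pending.popleft() if self._pending else None


def next_frame(decoder, packets):
    """Return the next frame, feeding a new packet only when none is pending."""
    while True:
        # First check whether the decoder still has frames to hand out.
        frame = decoder.receive_frame()
        if frame is not None:
            return frame
        # Nothing pending: continue decoding as usual with the next packet.
        packet = next(packets, None)
        if packet is None:
            return None
        decoder.send_packet(packet)
```

Calling `next_frame` repeatedly with `packets` as an iterator returns every frame in order, including the extra frames from packets that decode to more than one frame.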

Reviewed By: zeddb, sergey

Differential Revision: https://developer.blender.org/D13079
2021-11-15 20:20:33 +01:00
ef8240e64c Fix T91992: Incorrect clip strip image size
When proxy size lower than 100% is used, clip strips are rendered with
incorrect image size.

This is because if proxies aren't enabled in movieclip, it automatically
falls back on rendering original media. Sequencer doesn't have knowledge
about this and since 9c99292a16 it assumes that image is proxy,
because it explicitly requested this size.

Check movieclip flag to see if proxies are enabled.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D13080
2021-11-15 20:07:46 +01:00
62da41d63d Fix: Incorrect socket identifier versioning
There were two issues:
 - The third math node socket does not exist in old enough files.
 - The comment incorrectly referred to the vector math node.

Differential Revision: https://developer.blender.org/D13219
2021-11-15 08:52:58 -06:00
Jeroen Bakker
a5c59fb90e Fix T89260: Eevee crashes with custom node sockets.
Cause of this issue is that Custom Node Sockets info type was
initialized as SOCK_FLOAT when registering. Areas within the core that
would ignore custom socket types by checking its type would use the
socket as being a float type.

When custom node sockets have a property called default_value, Blender
tries to store it as an internal default value, which failed in debug
builds.

This patch will set the socket type to SOCK_CUSTOM when registering a
custom socket type and allow, but skip storage of custom default values.
In this case the default values should already be stored as custom
properties.

Reviewed By: campbellbarton, JacquesLucke

Maniphest Tasks: T89260

Differential Revision: https://developer.blender.org/D13174
2021-11-15 08:13:22 +01:00
622e6f05f1 Fix T92750: sculpt vertex colors missing in object mode
The layers were not aliased properly for usage in the shaders.

Regression caused by rB03013d19d167.
2021-11-15 02:38:52 +01:00
0a6f428be7 Fix snapping not performing on selected bones or splines
This is an old issue but never reported as it is only visible in the measure tool snapping.
2021-11-12 21:56:42 -03:00
eed48a7322 Cleanup: reference some snap parameters in the snap context itself
This decreases the number of parameters in functions and makes important variables available in more places.
2021-11-12 21:34:19 -03:00
6b4ca78108 Fix T92636: Vector math node link disconnects when loading old file
Previously, unique identifiers for sockets were created automatically,
sometimes using .001 and sometimes _001. Now they are created
manually with the second format, but some files were saved with .001
format. I think this was only an issue in the vector math node.

rBd845ba481c6d fixed this problem in 3.1, but in a more general way.
After I merge this patch to 3.1, I will revert it, since the versioning
added in that commit will make this redundant.

Differential Revision: https://developer.blender.org/D13209
2021-11-12 17:07:38 -06:00
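
The identifier mismatch described in 6b4ca78108 (`.001` versus `_001` suffixes) can be sketched as a small string fix-up. The actual versioning code from D13209 is not shown on this page, so `normalize_socket_identifier` is a hypothetical helper and the three-digit suffix rule is an assumption made for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: rewrite a trailing ".NNN" suffix as "_NNN".
// Identifiers without such a suffix are returned unchanged.
std::string normalize_socket_identifier(std::string identifier)
{
  const std::size_t size = identifier.size();
  if (size < 4 || identifier[size - 4] != '.') {
    return identifier;
  }
  for (std::size_t i = size - 3; i < size; i++) {
    if (!std::isdigit(static_cast<unsigned char>(identifier[i]))) {
      return identifier;
    }
  }
  identifier[size - 4] = '_';
  return identifier;
}
```

With both old and new files mapped to the same spelling, a link that refers to the socket by identifier can still be matched after loading.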
8d3a771574 Fix: Node Class Type for Select by Handle Type
Change the node class type for Node Select by Handle Type to
NODE_CLASS_INPUT
2021-11-12 14:05:40 -06:00
888b879f5f Fix: Incorrect transfer attribute error message with curves
The node does support curves, but only in index mode (see T88630)
So add a specific error message for the nearest mode, and let the
node support curves in the declaration.

Differential Revision: https://developer.blender.org/D13205
2021-11-12 13:25:22 -06:00
71131b4969 Tests: disable audio system for performance tests 2021-11-12 20:03:46 +01:00
b4d9b8b7f8 Fix T91893, T92455: wrong transmission pass with hair and multiscatter glass
We need to increase GPU memory usage a bit. Unfortunately we can't get away
with writing either reflection or transmission passes because these BSDFs may
scatter in either direction but still must be in a fixed reflection or
transmission category to match up with the color passes.
2021-11-12 20:03:46 +01:00
ef0b8d6306 Fix T92002: no Cycles combined baking support for filter settings 2021-11-12 20:03:46 +01:00
9d0d4b8601 Fix T93029: OptiX denoising artifacts at high sample counts in specific scenes
Partially reverts commit rB440a3475b8f5410e5c41bfbed5ce82771b41356f because
"optixDenoiserComputeIntensity" does not currently support input images that are not packed (the
"pixelStrideInBytes" field is not zero). As a result the intensity calculation would take into account
data from other passes in the image, some of which was still scaled by the number of samples, and
therefore produce wildly incorrect results that then caused artifacts in the denoised image.

Maniphest Tasks: T93029
2021-11-12 18:59:50 +01:00
2b394e1108 UI: (Performance) Avoid drawing buttons out of view
The UI was always drawing all buttons in a layout, no matter if they
were scrolled out of view (as in, outside of the visible part of the
region) or not. This means it's doing quite some work that can be
avoided.
UI drawing generally isn't a big bottleneck in Blender, so I don't
expect huge speedups from this. But while playing back animation, we do
redraw a fair bit of the UI, so in cases where there are many buttons
out of view, it may bring a little FPS boost. E.g. say in complex node
trees (the node editor is redrawn on animation playback in case there
are animated values that need updated UI feedback). This also mitigates
the issue in T92922 significantly.

Differential Revision: https://developer.blender.org/T92922

Reviewed by: Brecht Van Lommel
2021-11-12 18:21:12 +01:00
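
The idea in 2b394e1108 (skip buttons that lie outside the visible part of the region) amounts to a rectangle overlap test before drawing. A minimal sketch, assuming axis-aligned rectangles; `Rect`, `rects_overlap` and `count_drawn_buttons` are illustrative names, not Blender's UI API.

```cpp
#include <vector>

// Axis-aligned rectangle (illustrative type, not Blender's rcti/rctf).
struct Rect {
  float xmin, xmax, ymin, ymax;
};

// True when the two rectangles share some area. Rectangles that only
// touch along an edge are treated as not overlapping.
bool rects_overlap(const Rect &a, const Rect &b)
{
  return a.xmin < b.xmax && b.xmin < a.xmax && a.ymin < b.ymax && b.ymin < a.ymax;
}

// Counts the buttons that would actually be drawn for the visible area.
// A real draw loop would call its drawing routine where the counter is.
int count_drawn_buttons(const std::vector<Rect> &buttons, const Rect &visible)
{
  int drawn = 0;
  for (const Rect &button : buttons) {
    if (!rects_overlap(button, visible)) {
      continue;  // Scrolled out of view, skip the drawing work.
    }
    drawn++;
  }
  return drawn;
}
```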
Yevgeny Makarov
896d3f1ce5 UI: Do not shade alpha when blending colors
UI_GetThemeColorBlendShade4fv was incorrectly changing alpha by the amount
of the shading offset.

See D9944 for more details.

Differential Revision: https://developer.blender.org/D9944

Reviewed by Hans Goudey
2021-11-12 08:51:31 -08:00
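
The behavior fixed in 896d3f1ce5 can be sketched as follows: blend two RGBA colors, then apply the shading offset to the RGB channels only. This is an illustration under assumptions, not the body of `UI_GetThemeColorBlendShade4fv`; `Color4` and `blend_shade` are hypothetical names and colors are kept as 0..255 integers for simplicity.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

using Color4 = std::array<int, 4>;  // RGBA, each channel in 0..255.

// Blend `a` towards `b` by `fac`, then shade RGB by `offset`.
// Alpha is blended but deliberately not shaded.
Color4 blend_shade(const Color4 &a, const Color4 &b, float fac, int offset)
{
  fac = std::clamp(fac, 0.0f, 1.0f);
  Color4 result{};
  for (std::size_t i = 0; i < 4; i++) {
    const float blended = (1.0f - fac) * float(a[i]) + fac * float(b[i]);
    const int shade = (i < 3) ? offset : 0;
    result[i] = std::clamp(static_cast<int>(blended + 0.5f) + shade, 0, 255);
  }
  return result;
}
```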
76105eb752 Fix T93007: Cycles not updating for animated Object properties like color 2021-11-12 17:09:21 +01:00
d48523cb4d Fix Cycles ray visibility panel missing for volume objects 2021-11-12 16:18:07 +01:00
b4cfe80547 Assets: Store Action sub-type in asset metadata
Blender 3.0 will only support single-frame Actions in the pose library.
The goal of this patch is to lay the groundwork for making it possible
for the Asset Browser to reject/hide "animation snippet" Action assets.

Determining whether an Action has one or more frames (i.e. whether it
has a single pose or animation) requires inspecting the Action itself,
and thus loading the data-block itself. This would make it impossible to
quickly determine from the asset browser.

To solve this, the Action is inspected before saving, and a
`"is_single_frame"` boolean (well, 0/1 integer) IDProperty is added.

Reviewed by: Severin

Differential Revision: https://developer.blender.org/D13202
2021-11-12 14:36:22 +01:00
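
The pre-save inspection described in b4cfe80547 can be sketched as a check over all keyframe times, with the result stored as a 0/1 integer. `ToyFCurve`, `action_is_single_frame` and `single_frame_property_value` are hypothetical names, and treating an Action without any keys as "not single-frame" is an assumption of this sketch, not something the commit message states.

```cpp
#include <vector>

// Illustrative stand-in for an animation curve: only the key times matter here.
struct ToyFCurve {
  std::vector<float> key_frames;
};

// True when every key of every curve sits on the same frame.
// An action without keys is reported as not single-frame (sketch assumption).
bool action_is_single_frame(const std::vector<ToyFCurve> &fcurves)
{
  bool found_key = false;
  float first_frame = 0.0f;
  for (const ToyFCurve &fcurve : fcurves) {
    for (const float frame : fcurve.key_frames) {
      if (!found_key) {
        first_frame = frame;
        found_key = true;
      }
      else if (frame != first_frame) {
        return false;
      }
    }
  }
  return found_key;
}

// Value to store in the metadata as a 0/1 integer property.
int single_frame_property_value(const std::vector<ToyFCurve> &fcurves)
{
  return action_is_single_frame(fcurves) ? 1 : 0;
}
```

Storing the result next to the asset metadata means a browser can filter on the integer without loading the Action itself.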
5f7d5c0809 Gizmo: adjust when gizmo protection flags are displayed
Restore behavior reverted in
0ea60cf6b8 but only for location
as it makes sense to use protection flags in global mode in that case.
2021-11-13 00:28:13 +11:00
0ea60cf6b8 Fix gizmo protection flag use in object mode
Channel protection flags were only used in global mode,
this doesn't make any sense, especially for rotation and scale.

Follow pose-bones, only using protection flags for
local & gimbal orientation.
2021-11-13 00:19:36 +11:00
de8a46c6ad Fix T93008: Cycles: Huge memory spike when saving tile to disk
The root of the problem lies in a bug in OIIO which we can work around
from our side (which does not affect peak memory usage).

Thanks Brecht for finding the root cause!

Differential Revision: https://developer.blender.org/D13186
2021-11-12 11:20:01 +01:00
2b633f12ad Fix crash in asset browser when switching from file browser
When the file browser is in asset browser mode, it sets the callback
`filelist->prepare_filter_fn` to an asset browser specific function. This
function will segfault if there is no current asset library. Switching back
from asset browser to file browser would not reset that callback to
`NULL`, causing it to be called and crash Blender. This is now fixed.
2021-11-12 10:51:09 +01:00
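
The fix in 2b633f12ad comes down to clearing a mode-specific callback when leaving that mode. A minimal sketch, assuming a `std::function` in place of the C function pointer; `ToyFileList`, `BrowseMode`, `set_browse_mode` and `apply_filter` are hypothetical names and do not mirror the file browser code.

```cpp
#include <functional>
#include <utility>

enum class BrowseMode { Files, Assets };

// Illustrative stand-in for the file list with its optional callback.
struct ToyFileList {
  BrowseMode mode = BrowseMode::Files;
  std::function<void()> prepare_filter_fn;  // Empty when nothing needs preparing.
};

// Install the asset-specific callback only in asset mode and reset it
// otherwise, so a stale callback is never left behind.
void set_browse_mode(ToyFileList &list, BrowseMode mode, std::function<void()> asset_prepare_fn)
{
  list.mode = mode;
  if (mode == BrowseMode::Assets) {
    list.prepare_filter_fn = std::move(asset_prepare_fn);
  }
  else {
    list.prepare_filter_fn = nullptr;
  }
}

// Filtering calls the callback only when one is set.
void apply_filter(const ToyFileList &list)
{
  if (list.prepare_filter_fn) {
    list.prepare_filter_fn();
  }
}
```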
William Leeson
456876208b Fix T92601: Disable profiling when the profiler is deemed not active.
Adds a method to profiler that can be used to check if it is active.
This is used to determine if stop_profiling and start_profiling
should be called.

| patch | Juans Scene UI 256 samples | Juans Scene bg 256 samples | junkshop UI | junkshop bg |
| No patch | 6:16.59 | 4:05.37 | 2:08.48 | 1:59.7 |
| D13187   | 4:12.15 | 3:57.36 | 2:07.25 | 1:58.16 |
| D13185   | 4:11.18 | 3:54.74 | 2:07.44 | 1:58.03 |
| D13190   | 4:12.39 | 3:55.42 | 2:07.62 | 1:58.68 |

UI - means rendered from within Blender
bg - means rendered from the command line using ##blender -b scene.blend -f 1##

Reviewed By: sergey, brecht

Maniphest Tasks: T92601

Differential Revision: https://developer.blender.org/D13190
2021-11-12 10:16:01 +01:00
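
The guard described in 456876208b can be sketched like this: query the profiler once and only call the start/stop functions when it is active. `ToyProfiler` and `run_with_profiling` are hypothetical names; the real Cycles profiler interface is not reproduced here.

```cpp
// Illustrative profiler that only counts how often it is started and stopped.
struct ToyProfiler {
  bool active = false;
  int start_calls = 0;
  int stop_calls = 0;

  bool is_active() const
  {
    return active;
  }
  void start()
  {
    start_calls++;
  }
  void stop()
  {
    stop_calls++;
  }
};

// Run `work`, wrapping it in start/stop only when profiling is active.
template<typename Fn> void run_with_profiling(ToyProfiler &profiler, Fn &&work)
{
  const bool profile = profiler.is_active();
  if (profile) {
    profiler.start();
  }
  work();
  if (profile) {
    profiler.stop();
  }
}
```

Reading the flag once before the work keeps start and stop paired even if the active state changes while `work` runs.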
1061f5a1ba Cleanup: remove redundant loop-index access
Use a counter for loop indices as they're being iterated in order.
2021-11-12 18:33:05 +11:00
1a7757b0bc Fix T91444: Edge Loop Preview fails with two Mirror Modifiers
The mirror modifier's merge option caused unnecessary re-ordering
to the vertex array with original vertices merging into their copies.

While this wasn't an error, it meant creating a 1:1 mapping from input
vertices to their final output wasn't reliable (when looping over
vertices first to last) as is done in
BKE_editmesh_vert_coords_when_deformed.

As merging in either direction is supported, keep the source mesh's
vertices in-order since it allows the vertex coordinates to be extracted.
2021-11-12 18:27:47 +11:00
f133c6b094 Fix crash saving blend files in background mode
Reading the window's pixels was attempted in background mode.
2021-11-12 18:05:27 +11:00
d612d92630 Cleanup: use term sequence_strip instead of vse_strip 2021-11-12 17:38:58 +11:00
bd734cc441 Fix: Attribute Transfer node does not work with a single index
Differential Revision: https://developer.blender.org/D13194
2021-11-11 19:49:20 +01:00
f3bdabbe24 Fix: Incorrect modifier warning with non-geometry input first
The code assumed that any geometry input that wasn't the first input
was a second geometry input. Fix by separating the warning for the
first input and for the number of geometry inputs.
2021-11-11 11:47:19 -06:00
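
The separation of checks described in f3bdabbe24 can be sketched as two independent tests: one on the first input, one on the number of geometry inputs. `SocketType`, `InputWarnings` and `check_group_inputs` are hypothetical names, and leaving an empty input list unflagged is a choice made for this sketch.

```cpp
#include <algorithm>
#include <vector>

enum class SocketType { Geometry, Float, Vector };

struct InputWarnings {
  bool first_input_not_geometry = false;
  bool multiple_geometry_inputs = false;
};

// Evaluate the two conditions separately instead of inferring
// "second geometry input" from "geometry input that is not first".
InputWarnings check_group_inputs(const std::vector<SocketType> &inputs)
{
  InputWarnings warnings;
  if (!inputs.empty() && inputs.front() != SocketType::Geometry) {
    warnings.first_input_not_geometry = true;
  }
  const auto geometry_count = std::count(inputs.begin(), inputs.end(), SocketType::Geometry);
  warnings.multiple_geometry_inputs = geometry_count > 1;
  return warnings;
}
```

With inputs `{Float, Geometry}` only the first-input condition is reported, which is the case the old combined check got wrong.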
393879f30c Fix: wrong field inferencing with unavailable sockets 2021-11-11 18:26:40 +01:00
3d9c8397fc Fix T93005: Cycles shadow catcher not inherited by instances 2021-11-11 18:12:05 +01:00
Peter Sergay
4bc08b79aa UI: Updated Blender Light theme for Spreadsheet
This patch adds list colors to the light theme for the spreadsheet,
which are needed for the data set region. Addresses T92492.

Differential Revision: https://developer.blender.org/D13090
2021-11-11 10:10:28 -06:00
Martijn Versteegh
7aa39b40f4 Fix: Prevent use of uninitialized memory when creating Bezier spline
When constructing Bezier splines from DNA, the positions of the
left/right handles were set directly in the internal vectors, by
requesting a reference to them. The problem is that
BezierSpline::handle_positions_left() calls ensure_auto_handles()
before returning the reference. That function does some calculations on
uninitialized memory if the positions array is not yet filled.

Differential Revision: https://developer.blender.org/D13107
2021-11-11 09:25:10 -06:00
d26d3cfe19 Fix T92868: Cycles catcher with transparency crashes
The issue was caused by splitting happening twice.

Fixed by checking for the split flag, which is assigned to both states
during split.

The tricky part was to write catcher data at the moment of split: the
transparency and shadow catcher sample count is to be accumulated at
that point. Now it is happening in the `intersect_closest` kernel.
The downside is that render buffer is to be passed to the kernel, but
the benefit is that extra split bounce check is not needed now.

Had to move the pass writing to the shadow catcher header, since including
`film/passes.h` brings in the requirement to have BSDF data structures
available.

Differential Revision: https://developer.blender.org/D13177
2021-11-11 15:21:35 +01:00
9be6880d02 Fix (unreported) bad handling of reports in libraries.load code.
rB60fee69682ac39 only partially fixed the issue, `BlendFileReadReport
bf_reports` was now properly stored in `BPy_Library` `self` for the
lifetime of the context, but its `reports` member was still referencing
a variable local to the `bpy_lib_enter` function.
2021-11-11 14:29:14 +01:00
b7e2408ea4 Fix T92867: Gimbal rotation broken when used for multiple objects
Support gimbal orientation for objects & bones.
2021-11-11 21:33:09 +11:00
bb64155c63 Cleanup: split gimbal_axis into pose and object
Allow access to a single bone's gimbal matrix.
2021-11-11 21:14:08 +11:00
d753ebd40a Cleanup: spelling in comments 2021-11-11 14:59:30 +11:00
ddf0bacaa9 Cleanup: typo in function name 2021-11-11 14:58:55 +11:00
3929db265f Fix T92954: Loop Cut Tool preview line visible during operation
Apply a local-workaround instead of adding support for this use-case
since pre-selection isn't the intended purpose of gizmos.

This also resolves a glitch where poly-build and loop cut would
briefly show loop-cut or poly-build pre-selection after transforming.

See gizmo_preselect_poll_for_draw note for more details.
2021-11-11 14:58:55 +11:00
22ffd69a91 Fix T89313: Attribute search crash with animation playback
rBc473b2ce8bdbf8fa42 improved the situation somewhat, but
attribute search still crashes during animation playback, because
the UI search data references stale memory. The proper solution
is to allow the search to own data rather than just referencing it,
but I would prefer not to do that for 3.0. In the meantime, just
disable attribute search when animation is playing.

Differential Revision: https://developer.blender.org/D13179
2021-11-10 15:43:18 -06:00
040630bb9a Fix wrong device check in HIP kernel compile.
Also cleanup some related code, that was falsely copied from CUDA.

Differential Revision: https://developer.blender.org/D13180
2021-11-10 22:24:53 +01:00
7689f501e2 Cycles: enable HIP device and binaries on Windows
We've now done testing to confirm this works with RDNA and RDNA2 AMD GPUs
on Windows. The AMD driver needed for this will soon be released publicly.
2021-11-10 20:16:44 +01:00
Thomas Dinges
e507a789b3 Cycles: disable graphics interop for HIP devices
This is due to a driver bug, so disable it for now until it gets resolved
in a future driver release.

Ref T92972

Differential Revision: https://developer.blender.org/D13167
2021-11-10 20:16:44 +01:00
6b0008129e Fix T92972: Cycles HIP wrong render display after a recent refactor
It's unclear why this fails. Maybe the size of half4 is not the expected
8 bytes and adjacent pixels are overwritten. Or there is some bug in the
HIP compiler writing a struct into global memory, which we probably don't
do elsewhere in the kernel.

Thanks to Thomas, William and Jeroen for helping investigate this.
2021-11-10 20:03:07 +01:00
c8e93da0a7 Fix Cycles assert in denoising fallback to OIDN 2021-11-10 19:56:30 +01:00
612 changed files with 9276 additions and 7721 deletions

@@ -440,7 +440,11 @@ mark_as_advanced(WITH_CYCLES_CUDA_BUILD_SERIAL)
mark_as_advanced(WITH_CUDA_DYNLOAD)
# AMD HIP
-option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" OFF)
+if(WIN32)
+option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" ON)
+else()
+option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" OFF)
+endif()
option(WITH_CYCLES_HIP_BINARIES "Build Cycles AMD HIP binaries" OFF)
set(CYCLES_HIP_BINARIES_ARCH gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1034 CACHE STRING "AMD HIP architectures to build binaries for")
mark_as_advanced(WITH_CYCLES_DEVICE_HIP)
@@ -490,7 +494,8 @@ endif()
# This should be turned off when Blender enter beta/rc/release
if("${BLENDER_VERSION_CYCLE}" STREQUAL "release" OR
-"${BLENDER_VERSION_CYCLE}" STREQUAL "rc")
+"${BLENDER_VERSION_CYCLE}" STREQUAL "rc" OR
+"${BLENDER_VERSION_CYCLE}" STREQUAL "beta")
set(WITH_EXPERIMENTAL_FEATURES OFF)
else()
set(WITH_EXPERIMENTAL_FEATURES ON)

@@ -42,6 +42,7 @@ ExternalProject_Add(nanovdb
URL_HASH ${NANOVDB_HASH_TYPE}=${NANOVDB_HASH}
PREFIX ${BUILD_DIR}/nanovdb
SOURCE_SUBDIR nanovdb
+PATCH_COMMAND ${PATCH_CMD} -p 1 -d ${BUILD_DIR}/nanovdb/src/nanovdb < ${PATCH_DIR}/nanovdb.diff
CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${LIBDIR}/nanovdb ${DEFAULT_CMAKE_FLAGS} ${NANOVDB_EXTRA_ARGS}
INSTALL_DIR ${LIBDIR}/nanovdb
)

@@ -0,0 +1,374 @@
Index: nanovdb/nanovdb/NanoVDB.h
===================================================================
--- a/nanovdb/nanovdb/NanoVDB.h (revision 62751)
+++ b/nanovdb/nanovdb/NanoVDB.h (working copy)
@@ -152,8 +152,8 @@
#endif // __CUDACC_RTC__
-#ifdef __CUDACC__
-// Only define __hostdev__ when using NVIDIA CUDA compiler
+#if defined(__CUDACC__) || defined(__HIP__)
+// Only define __hostdev__ when using NVIDIA CUDA or HIP compiler
#define __hostdev__ __host__ __device__
#else
#define __hostdev__
@@ -461,7 +461,7 @@
/// Maximum floating-point values
template<typename T>
struct Maximum;
-#ifdef __CUDA_ARCH__
+#if defined(__CUDA_ARCH__) || defined(__HIP__)
template<>
struct Maximum<int>
{
@@ -1006,10 +1006,10 @@
using Vec3i = Vec3<int>;
/// @brief Return a single precision floating-point vector of this coordinate
-Vec3f Coord::asVec3s() const { return Vec3f(float(mVec[0]), float(mVec[1]), float(mVec[2])); }
+inline __hostdev__ Vec3f Coord::asVec3s() const { return Vec3f(float(mVec[0]), float(mVec[1]), float(mVec[2])); }
/// @brief Return a double precision floating-point vector of this coordinate
-Vec3d Coord::asVec3d() const { return Vec3d(double(mVec[0]), double(mVec[1]), double(mVec[2])); }
+inline __hostdev__ Vec3d Coord::asVec3d() const { return Vec3d(double(mVec[0]), double(mVec[1]), double(mVec[2])); }
// ----------------------------> Vec4 <--------------------------------------
@@ -1820,7 +1820,7 @@
}; // Map
template<typename Mat4T>
-void Map::set(const Mat4T& mat, const Mat4T& invMat, double taper)
+__hostdev__ void Map::set(const Mat4T& mat, const Mat4T& invMat, double taper)
{
float * mf = mMatF, *vf = mVecF;
float* mif = mInvMatF;
@@ -2170,7 +2170,7 @@
}; // Class Grid
template<typename TreeT>
-int Grid<TreeT>::findBlindDataForSemantic(GridBlindDataSemantic semantic) const
+__hostdev__ int Grid<TreeT>::findBlindDataForSemantic(GridBlindDataSemantic semantic) const
{
for (uint32_t i = 0, n = blindDataCount(); i < n; ++i)
if (blindMetaData(i).mSemantic == semantic)
@@ -2328,7 +2328,7 @@
}; // Tree class
template<typename RootT>
-void Tree<RootT>::extrema(ValueType& min, ValueType& max) const
+__hostdev__ void Tree<RootT>::extrema(ValueType& min, ValueType& max) const
{
min = this->root().valueMin();
max = this->root().valueMax();
@@ -2336,7 +2336,7 @@
template<typename RootT>
template<typename NodeT>
-const NodeT* Tree<RootT>::getNode(uint32_t i) const
+__hostdev__ const NodeT* Tree<RootT>::getNode(uint32_t i) const
{
static_assert(is_same<TreeNodeT<NodeT::LEVEL>, NodeT>::value, "Tree::getNode: unvalid node type");
NANOVDB_ASSERT(i < DataType::mCount[NodeT::LEVEL]);
@@ -2345,7 +2345,7 @@
template<typename RootT>
template<int LEVEL>
-const typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i) const
+__hostdev__ const typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i) const
{
NANOVDB_ASSERT(i < DataType::mCount[LEVEL]);
return reinterpret_cast<const TreeNodeT<LEVEL>*>(reinterpret_cast<const uint8_t*>(this) + DataType::mBytes[LEVEL]) + i;
@@ -2353,7 +2353,7 @@
template<typename RootT>
template<typename NodeT>
-NodeT* Tree<RootT>::getNode(uint32_t i)
+__hostdev__ NodeT* Tree<RootT>::getNode(uint32_t i)
{
static_assert(is_same<TreeNodeT<NodeT::LEVEL>, NodeT>::value, "Tree::getNode: invalid node type");
NANOVDB_ASSERT(i < DataType::mCount[NodeT::LEVEL]);
@@ -2362,7 +2362,7 @@
template<typename RootT>
template<int LEVEL>
-typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i)
+__hostdev__ typename TreeNode<Tree<RootT>, LEVEL>::type* Tree<RootT>::getNode(uint32_t i)
{
NANOVDB_ASSERT(i < DataType::mCount[LEVEL]);
return reinterpret_cast<TreeNodeT<LEVEL>*>(reinterpret_cast<uint8_t*>(this) + DataType::mBytes[LEVEL]) + i;
@@ -2370,7 +2370,7 @@
template<typename RootT>
template<typename NodeT>
-uint32_t Tree<RootT>::getNodeID(const NodeT& node) const
+__hostdev__ uint32_t Tree<RootT>::getNodeID(const NodeT& node) const
{
static_assert(is_same<TreeNodeT<NodeT::LEVEL>, NodeT>::value, "Tree::getNodeID: invalid node type");
const NodeT* first = reinterpret_cast<const NodeT*>(reinterpret_cast<const uint8_t*>(this) + DataType::mBytes[NodeT::LEVEL]);
@@ -2380,7 +2380,7 @@
template<typename RootT>
template<typename NodeT>
-uint32_t Tree<RootT>::getLinearOffset(const NodeT& node) const
+__hostdev__ uint32_t Tree<RootT>::getLinearOffset(const NodeT& node) const
{
return this->getNodeID(node) + DataType::mPFSum[NodeT::LEVEL];
}
@@ -3366,7 +3366,7 @@
}; // LeafNode class
template<typename ValueT, typename CoordT, template<uint32_t> class MaskT, uint32_t LOG2DIM>
-inline void LeafNode<ValueT, CoordT, MaskT, LOG2DIM>::updateBBox()
+inline __hostdev__ void LeafNode<ValueT, CoordT, MaskT, LOG2DIM>::updateBBox()
{
static_assert(LOG2DIM == 3, "LeafNode::updateBBox: only supports LOGDIM = 3!");
if (!this->isActive()) return;
Index: nanovdb/nanovdb/util/SampleFromVoxels.h
===================================================================
--- a/nanovdb/nanovdb/util/SampleFromVoxels.h (revision 62751)
+++ b/nanovdb/nanovdb/util/SampleFromVoxels.h (working copy)
@@ -22,7 +22,7 @@
#define NANOVDB_SAMPLE_FROM_VOXELS_H_HAS_BEEN_INCLUDED
// Only define __hostdev__ when compiling as NVIDIA CUDA
-#ifdef __CUDACC__
+#if defined(__CUDACC__) || defined(__HIP__)
#define __hostdev__ __host__ __device__
#else
#include <cmath> // for floor
@@ -136,7 +136,7 @@
template<typename TreeOrAccT>
template<typename Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const Vec3T& xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const Vec3T& xyz) const
{
const CoordT ijk = Round<CoordT>(xyz);
if (ijk != mPos) {
@@ -147,7 +147,7 @@
}
template<typename TreeOrAccT>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const CoordT& ijk) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, true>::operator()(const CoordT& ijk) const
{
if (ijk != mPos) {
mPos = ijk;
@@ -158,7 +158,7 @@
template<typename TreeOrAccT>
template<typename Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, false>::operator()(const Vec3T& xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 0, false>::operator()(const Vec3T& xyz) const
{
return mAcc.getValue(Round<CoordT>(xyz));
}
@@ -195,7 +195,7 @@
}; // TrilinearSamplerBase
template<typename TreeOrAccT>
-void TrilinearSampler<TreeOrAccT>::stencil(CoordT& ijk, ValueT (&v)[2][2][2]) const
+__hostdev__ void TrilinearSampler<TreeOrAccT>::stencil(CoordT& ijk, ValueT (&v)[2][2][2]) const
{
v[0][0][0] = mAcc.getValue(ijk); // i, j, k
@@ -224,7 +224,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType TrilinearSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
+__hostdev__ typename TreeOrAccT::ValueType TrilinearSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
{
#if 0
auto lerp = [](ValueT a, ValueT b, ValueT w){ return fma(w, b-a, a); };// = w*(b-a) + a
@@ -239,7 +239,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-Vec3T<typename TreeOrAccT::ValueType> TrilinearSampler<TreeOrAccT>::gradient(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
+__hostdev__ Vec3T<typename TreeOrAccT::ValueType> TrilinearSampler<TreeOrAccT>::gradient(const Vec3T<RealT> &uvw, const ValueT (&v)[2][2][2])
{
static_assert(std::is_floating_point<ValueT>::value, "TrilinearSampler::gradient requires a floating-point type");
#if 0
@@ -270,7 +270,7 @@
}
template<typename TreeOrAccT>
-bool TrilinearSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[2][2][2])
+__hostdev__ bool TrilinearSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[2][2][2])
{
static_assert(std::is_floating_point<ValueT>::value, "TrilinearSampler::zeroCrossing requires a floating-point type");
const bool less = v[0][0][0] < ValueT(0);
@@ -363,7 +363,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::sample(xyz, mVal);
@@ -370,7 +370,7 @@
}
template<typename TreeOrAccT>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(const CoordT &ijk) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, true>::operator()(const CoordT &ijk) const
{
return ijk == mPos ? mVal[0][0][0] : BaseT::mAcc.getValue(ijk);
}
@@ -377,7 +377,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, true>::gradient(Vec3T<RealT> xyz) const
+__hostdev__ Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, true>::gradient(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::gradient(xyz, mVal);
@@ -393,7 +393,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-void SampleFromVoxels<TreeOrAccT, 1, true>::cache(Vec3T<RealT>& xyz) const
+__hostdev__ void SampleFromVoxels<TreeOrAccT, 1, true>::cache(Vec3T<RealT>& xyz) const
{
CoordT ijk = Floor<CoordT>(xyz);
if (ijk != mPos) {
@@ -406,7 +406,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
{
ValueT val[2][2][2];
CoordT ijk = Floor<CoordT>(xyz);
@@ -418,7 +418,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 1, false>::operator()(Vec3T<RealT> xyz) const
{
auto lerp = [](ValueT a, ValueT b, RealT w) { return a + ValueT(w) * (b - a); };
@@ -463,7 +463,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-inline Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, false>::gradient(Vec3T<RealT> xyz) const
+inline __hostdev__ Vec3T<typename TreeOrAccT::ValueType> SampleFromVoxels<TreeOrAccT, 1, false>::gradient(Vec3T<RealT> xyz) const
{
ValueT val[2][2][2];
CoordT ijk = Floor<CoordT>(xyz);
@@ -473,7 +473,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-bool SampleFromVoxels<TreeOrAccT, 1, false>::zeroCrossing(Vec3T<RealT> xyz) const
+__hostdev__ bool SampleFromVoxels<TreeOrAccT, 1, false>::zeroCrossing(Vec3T<RealT> xyz) const
{
ValueT val[2][2][2];
CoordT ijk = Floor<CoordT>(xyz);
@@ -510,7 +510,7 @@
}; // TriquadraticSamplerBase
template<typename TreeOrAccT>
-void TriquadraticSampler<TreeOrAccT>::stencil(const CoordT &ijk, ValueT (&v)[3][3][3]) const
+__hostdev__ void TriquadraticSampler<TreeOrAccT>::stencil(const CoordT &ijk, ValueT (&v)[3][3][3]) const
{
CoordT p(ijk[0] - 1, 0, 0);
for (int dx = 0; dx < 3; ++dx, ++p[0]) {
@@ -526,7 +526,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType TriquadraticSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[3][3][3])
+__hostdev__ typename TreeOrAccT::ValueType TriquadraticSampler<TreeOrAccT>::sample(const Vec3T<RealT> &uvw, const ValueT (&v)[3][3][3])
{
auto kernel = [](const ValueT* value, double weight)->ValueT {
return weight * (weight * (0.5f * (value[0] + value[2]) - value[1]) +
@@ -545,7 +545,7 @@
}
template<typename TreeOrAccT>
-bool TriquadraticSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[3][3][3])
+__hostdev__ bool TriquadraticSampler<TreeOrAccT>::zeroCrossing(const ValueT (&v)[3][3][3])
{
static_assert(std::is_floating_point<ValueT>::value, "TrilinearSampler::zeroCrossing requires a floating-point type");
const bool less = v[0][0][0] < ValueT(0);
@@ -624,7 +624,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::sample(xyz, mVal);
@@ -631,7 +631,7 @@
}
template<typename TreeOrAccT>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(const CoordT &ijk) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, true>::operator()(const CoordT &ijk) const
{
return ijk == mPos ? mVal[1][1][1] : BaseT::mAcc.getValue(ijk);
}
@@ -646,7 +646,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-void SampleFromVoxels<TreeOrAccT, 2, true>::cache(Vec3T<RealT>& xyz) const
+__hostdev__ void SampleFromVoxels<TreeOrAccT, 2, true>::cache(Vec3T<RealT>& xyz) const
{
CoordT ijk = Floor<CoordT>(xyz);
if (ijk != mPos) {
@@ -657,7 +657,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, false>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 2, false>::operator()(Vec3T<RealT> xyz) const
{
ValueT val[3][3][3];
CoordT ijk = Floor<CoordT>(xyz);
@@ -667,7 +667,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-bool SampleFromVoxels<TreeOrAccT, 2, false>::zeroCrossing(Vec3T<RealT> xyz) const
+__hostdev__ bool SampleFromVoxels<TreeOrAccT, 2, false>::zeroCrossing(Vec3T<RealT> xyz) const
{
ValueT val[3][3][3];
CoordT ijk = Floor<CoordT>(xyz);
@@ -710,7 +710,7 @@
}; // TricubicSampler
template<typename TreeOrAccT>
-void TricubicSampler<TreeOrAccT>::stencil(const CoordT& ijk, ValueT (&C)[64]) const
+__hostdev__ void TricubicSampler<TreeOrAccT>::stencil(const CoordT& ijk, ValueT (&C)[64]) const
{
auto fetch = [&](int i, int j, int k) -> ValueT& { return C[((i + 1) << 4) + ((j + 1) << 2) + k + 1]; };
@@ -929,7 +929,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 3, true>::operator()(Vec3T<RealT> xyz) const
+__hostdev__ typename TreeOrAccT::ValueType SampleFromVoxels<TreeOrAccT, 3, true>::operator()(Vec3T<RealT> xyz) const
{
this->cache(xyz);
return BaseT::sample(xyz, mC);
@@ -937,7 +937,7 @@
template<typename TreeOrAccT>
template<typename RealT, template<typename...> class Vec3T>
-void SampleFromVoxels<TreeOrAccT, 3, true>::cache(Vec3T<RealT>& xyz) const
+__hostdev__ void SampleFromVoxels<TreeOrAccT, 3, true>::cache(Vec3T<RealT>& xyz) const
{
CoordT ijk = Floor<CoordT>(xyz);
if (ijk != mPos) {
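
The pattern this patch applies throughout can be sketched in isolation: a qualifier macro that expands to `__host__ __device__` under a CUDA or HIP compiler and to nothing elsewhere, repeated on out-of-class member definitions. `TOY_HOSTDEV` and `ToyCoord` are hypothetical names used only for illustration; on a plain host compiler the macro is empty and the code builds as ordinary C++.

```cpp
// Expands to device qualifiers only under a CUDA or HIP compiler.
#if defined(__CUDACC__) || defined(__HIP__)
#  define TOY_HOSTDEV __host__ __device__
#else
#  define TOY_HOSTDEV
#endif

struct ToyCoord {
  int v[3];
  TOY_HOSTDEV float sum() const;
};

// The out-of-class definition repeats the qualifier, which is the kind of
// annotation the patch adds to the existing definitions.
inline TOY_HOSTDEV float ToyCoord::sum() const
{
  return float(v[0]) + float(v[1]) + float(v[2]);
}
```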

@@ -81,4 +81,5 @@ if(NOT APPLE)
set(WITH_CYCLES_DEVICE_OPTIX ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_CUDA_BINARIES ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_CUBIN_COMPILER OFF CACHE BOOL "" FORCE)
+set(WITH_CYCLES_HIP_BINARIES ON CACHE BOOL "" FORCE)
endif()

@@ -5,38 +5,38 @@
update-code: update-code:
git: git:
submodules: submodules:
- branch: master - branch: blender-v3.0-release
commit_id: HEAD commit_id: HEAD
path: release/scripts/addons path: release/scripts/addons
- branch: master - branch: blender-v3.0-release
commit_id: HEAD commit_id: HEAD
path: release/scripts/addons_contrib path: release/scripts/addons_contrib
- branch: master - branch: blender-v3.0-release
commit_id: HEAD commit_id: HEAD
path: release/datafiles/locale path: release/datafiles/locale
- branch: master - branch: blender-v3.0-release
commit_id: HEAD commit_id: HEAD
path: source/tools path: source/tools
svn: svn:
libraries: libraries:
darwin-arm64: darwin-arm64:
branch: trunk branch: tags/blender-3.0-release
commit_id: HEAD commit_id: HEAD
path: lib/darwin_arm64 path: lib/darwin_arm64
darwin-x86_64: darwin-x86_64:
branch: trunk branch: tags/blender-3.0-release
commit_id: HEAD commit_id: HEAD
path: lib/darwin path: lib/darwin
linux-x86_64: linux-x86_64:
branch: trunk branch: tags/blender-3.0-release
commit_id: HEAD commit_id: HEAD
path: lib/linux_centos7_x86_64 path: lib/linux_centos7_x86_64
windows-amd64: windows-amd64:
branch: trunk branch: tags/blender-3.0-release
commit_id: HEAD commit_id: HEAD
path: lib/win64_vc15 path: lib/win64_vc15
tests: tests:
branch: trunk branch: tags/blender-3.0-release
commit_id: HEAD commit_id: HEAD
path: lib/tests path: lib/tests
benchmarks: benchmarks:

@@ -38,7 +38,7 @@ PROJECT_NAME = Blender
# could be handy for archiving the generated documentation or if some version # could be handy for archiving the generated documentation or if some version
# control system is used. # control system is used.
PROJECT_NUMBER = V3.1 PROJECT_NUMBER = V3.0
# Using the PROJECT_BRIEF tag one can provide an optional one line description # Using the PROJECT_BRIEF tag one can provide an optional one line description
# for a project that appears at the top of each page and should give viewer a # for a project that appears at the top of each page and should give viewer a

@@ -42,8 +42,13 @@ class SimpleMouseOperator(bpy.types.Operator):
self.y = event.mouse_y self.y = event.mouse_y
return self.execute(context) return self.execute(context)
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(SimpleMouseOperator.bl_idname, text="Simple Mouse Operator")
# Register and add to the view menu (required to also use F3 search "Simple Mouse Operator" for quick access)
bpy.utils.register_class(SimpleMouseOperator) bpy.utils.register_class(SimpleMouseOperator)
bpy.types.VIEW3D_MT_view.append(menu_func)
# Test call to the newly defined operator. # Test call to the newly defined operator.
# Here we call the operator and invoke it, meaning that the settings are taken # Here we call the operator and invoke it, meaning that the settings are taken

@@ -43,7 +43,7 @@ def menu_func(self, context):
self.layout.operator(ExportSomeData.bl_idname, text="Text Export Operator") self.layout.operator(ExportSomeData.bl_idname, text="Text Export Operator")
# Register and add to the file selector # Register and add to the file selector (required to also use F3 search "Text Export Operator" for quick access)
bpy.utils.register_class(ExportSomeData) bpy.utils.register_class(ExportSomeData)
bpy.types.TOPBAR_MT_file_export.append(menu_func) bpy.types.TOPBAR_MT_file_export.append(menu_func)

@@ -27,8 +27,14 @@ class DialogOperator(bpy.types.Operator):
wm = context.window_manager wm = context.window_manager
return wm.invoke_props_dialog(self) return wm.invoke_props_dialog(self)
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(DialogOperator.bl_idname, text="Dialog Operator")
# Register and add to the object menu (required to also use F3 search "Dialog Operator" for quick access)
bpy.utils.register_class(DialogOperator) bpy.utils.register_class(DialogOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# Test call. # Test call.
bpy.ops.object.dialog_operator('INVOKE_DEFAULT') bpy.ops.object.dialog_operator('INVOKE_DEFAULT')

@@ -41,8 +41,13 @@ class CustomDrawOperator(bpy.types.Operator):
col.prop(self, "my_string") col.prop(self, "my_string")
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(CustomDrawOperator.bl_idname, text="Custom Draw Operator")
# Register and add to the object menu (required to also use F3 search "Custom Draw Operator" for quick access)
bpy.utils.register_class(CustomDrawOperator) bpy.utils.register_class(CustomDrawOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# test call # test call
bpy.ops.object.custom_draw('INVOKE_DEFAULT') bpy.ops.object.custom_draw('INVOKE_DEFAULT')

@@ -55,8 +55,13 @@ class ModalOperator(bpy.types.Operator):
context.window_manager.modal_handler_add(self) context.window_manager.modal_handler_add(self)
return {'RUNNING_MODAL'} return {'RUNNING_MODAL'}
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(ModalOperator.bl_idname, text="Modal Operator")
# Register and add to the object menu (required to also use F3 search "Modal Operator" for quick access)
bpy.utils.register_class(ModalOperator) bpy.utils.register_class(ModalOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# test call # test call
bpy.ops.object.modal_operator('INVOKE_DEFAULT') bpy.ops.object.modal_operator('INVOKE_DEFAULT')

@@ -31,8 +31,13 @@ class SearchEnumOperator(bpy.types.Operator):
context.window_manager.invoke_search_popup(self) context.window_manager.invoke_search_popup(self)
return {'RUNNING_MODAL'} return {'RUNNING_MODAL'}
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(SearchEnumOperator.bl_idname, text="Search Enum Operator")
# Register and add to the object menu (required to also use F3 search "Search Enum Operator" for quick access)
bpy.utils.register_class(SearchEnumOperator) bpy.utils.register_class(SearchEnumOperator)
bpy.types.VIEW3D_MT_object.append(menu_func)
# test call # test call
bpy.ops.object.search_enum_operator('INVOKE_DEFAULT') bpy.ops.object.search_enum_operator('INVOKE_DEFAULT')

@@ -22,8 +22,13 @@ class HelloWorldOperator(bpy.types.Operator):
print("Hello World") print("Hello World")
return {'FINISHED'} return {'FINISHED'}
# Only needed if you want to add into a dynamic menu
def menu_func(self, context):
self.layout.operator(HelloWorldOperator.bl_idname, text="Hello World Operator")
# Register and add to the view menu (required to also use F3 search "Hello World Operator" for quick access)
bpy.utils.register_class(HelloWorldOperator) bpy.utils.register_class(HelloWorldOperator)
bpy.types.VIEW3D_MT_view.append(menu_func)
# test call to the newly defined operator # test call to the newly defined operator
bpy.ops.wm.hello_world() bpy.ops.wm.hello_world()

@@ -106,24 +106,6 @@ including advanced features.
floating-point values. These values are interpreted as a plane equation. floating-point values. These values are interpreted as a plane equation.
.. function:: glColor (red, green, blue, alpha):
B{glColor3b, glColor3d, glColor3f, glColor3i, glColor3s, glColor3ub, glColor3ui, glColor3us,
glColor4b, glColor4d, glColor4f, glColor4i, glColor4s, glColor4ub, glColor4ui, glColor4us,
glColor3bv, glColor3dv, glColor3fv, glColor3iv, glColor3sv, glColor3ubv, glColor3uiv,
glColor3usv, glColor4bv, glColor4dv, glColor4fv, glColor4iv, glColor4sv, glColor4ubv,
glColor4uiv, glColor4usv}
Set a new color.
.. seealso:: `OpenGL Docs <https://khronos.org/registry/OpenGL-Refpages/gl4/html/glColor.xhtml>`__
:type red, green, blue, alpha: Depends on function prototype.
:arg red, green, blue: Specify new red, green, and blue values for the current color.
:arg alpha: Specifies a new alpha value for the current color. Included only in the
four-argument glColor4 commands. (With '4' colors only)
.. function:: glColorMask(red, green, blue, alpha): .. function:: glColorMask(red, green, blue, alpha):
Enable and disable writing of frame buffer color components Enable and disable writing of frame buffer color components

@@ -1123,7 +1123,7 @@ context_type_map = {
"soft_body": ("SoftBodyModifier", False), "soft_body": ("SoftBodyModifier", False),
"speaker": ("Speaker", False), "speaker": ("Speaker", False),
"texture": ("Texture", False), "texture": ("Texture", False),
"texture_slot": ("MaterialTextureSlot", False), "texture_slot": ("TextureSlot", False),
"texture_user": ("ID", False), "texture_user": ("ID", False),
"texture_user_property": ("Property", False), "texture_user_property": ("Property", False),
"ui_list": ("UIList", False), "ui_list": ("UIList", False),
@@ -1224,10 +1224,7 @@ def pycontext2sphinx(basepath):
while char_array[i] is not None: while char_array[i] is not None:
member = ctypes.string_at(char_array[i]).decode(encoding="ascii") member = ctypes.string_at(char_array[i]).decode(encoding="ascii")
fw(".. data:: %s\n\n" % member) fw(".. data:: %s\n\n" % member)
try: member_type, is_seq = context_type_map[member]
member_type, is_seq = context_type_map[member]
except KeyError:
raise SystemExit("Error: context key %r not found in context_type_map; update %s" % (member, __file__)) from None
fw(" :type: %s :class:`bpy.types.%s`\n\n" % ("sequence of " if is_seq else "", member_type)) fw(" :type: %s :class:`bpy.types.%s`\n\n" % ("sequence of " if is_seq else "", member_type))
unique.add(member) unique.add(member)
i += 1 i += 1
@@ -2254,7 +2251,7 @@ def main():
# First monkey patch to load in fake members. # First monkey patch to load in fake members.
setup_monkey_patch() setup_monkey_patch()
# Perform changes to Blender itself. # Perform changes to Blender it's self.
setup_data = setup_blender() setup_data = setup_blender()
# eventually, create the dirs # eventually, create the dirs

extern/hipew/README

@@ -0,0 +1,12 @@
The HIP Extension Wrangler Library (HIPEW) is a cross-platform open-source
C/C++ library to dynamically load the HIP library.
HIP (Heterogeneous-Compute Interface for Portability) is an API for C++
programming on AMD GPUs.
It is maintained as part of the Blender project, but included in extern/
for consistency with CUEW and CLEW libraries.
LICENSE
HIPEW is released under the Apache 2.0 license.

extern/hipew/README.blender

@@ -0,0 +1,5 @@
Project: Blender
URL: https://git.blender.org/blender.git
License: Apache 2.0
Upstream version: N/A
Local modifications: None

@@ -804,31 +804,29 @@ typedef enum hipDeviceP2PAttr {
} hipDeviceP2PAttr; } hipDeviceP2PAttr;
typedef struct HIP_MEMCPY3D { typedef struct HIP_MEMCPY3D {
size_t srcXInBytes; unsigned int srcXInBytes;
size_t srcY; unsigned int srcY;
size_t srcZ; unsigned int srcZ;
size_t srcLOD; unsigned int srcLOD;
hipMemoryType srcMemoryType; hipMemoryType srcMemoryType;
const void* srcHost; const void* srcHost;
hipDeviceptr_t srcDevice; hipDeviceptr_t srcDevice;
hArray * srcArray; hArray srcArray;
void* reserved0; unsigned int srcPitch;
size_t srcPitch; unsigned int srcHeight;
size_t srcHeight; unsigned int dstXInBytes;
size_t dstXInBytes; unsigned int dstY;
size_t dstY; unsigned int dstZ;
size_t dstZ; unsigned int dstLOD;
size_t dstLOD;
hipMemoryType dstMemoryType; hipMemoryType dstMemoryType;
void* dstHost; void* dstHost;
hipDeviceptr_t dstDevice; hipDeviceptr_t dstDevice;
hArray * dstArray; hArray dstArray;
void* reserved1; unsigned int dstPitch;
size_t dstPitch; unsigned int dstHeight;
size_t dstHeight; unsigned int WidthInBytes;
size_t WidthInBytes; unsigned int Height;
size_t Height; unsigned int Depth;
size_t Depth;
} HIP_MEMCPY3D; } HIP_MEMCPY3D;
typedef struct HIP_MEMCPY3D_PEER_st { typedef struct HIP_MEMCPY3D_PEER_st {
@@ -879,7 +877,7 @@ typedef struct HIP_RESOURCE_DESC_st {
hipResourceType resType; hipResourceType resType;
union { union {
struct { struct {
hArray * h_Array; hArray h_Array;
} array; } array;
struct { struct {
hipMipmappedArray_t hMipmappedArray; hipMipmappedArray_t hMipmappedArray;
@@ -1074,9 +1072,10 @@ typedef enum hiprtcResult {
typedef hipError_t HIPAPI thipGetErrorName(hipError_t error, const char** pStr); typedef hipError_t HIPAPI thipGetErrorName(hipError_t error, const char** pStr);
typedef hipError_t HIPAPI thipInit(unsigned int Flags); typedef hipError_t HIPAPI thipInit(unsigned int Flags);
typedef hipError_t HIPAPI thipDriverGetVersion(int* driverVersion); typedef hipError_t HIPAPI thipDriverGetVersion(int* driverVersion);
typedef hipError_t HIPAPI thipGetDevice(hipDevice_t* device, int ordinal); typedef hipError_t HIPAPI thipGetDevice(int* device);
typedef hipError_t HIPAPI thipGetDeviceCount(int* count); typedef hipError_t HIPAPI thipGetDeviceCount(int* count);
typedef hipError_t HIPAPI thipGetDeviceProperties(hipDeviceProp_t* props, int deviceId); typedef hipError_t HIPAPI thipGetDeviceProperties(hipDeviceProp_t* props, int deviceId);
typedef hipError_t HIPAPI thipDeviceGet(hipDevice_t* device, int ordinal);
typedef hipError_t HIPAPI thipDeviceGetName(char* name, int len, hipDevice_t dev); typedef hipError_t HIPAPI thipDeviceGetName(char* name, int len, hipDevice_t dev);
typedef hipError_t HIPAPI thipDeviceGetAttribute(int* pi, hipDeviceAttribute_t attrib, hipDevice_t dev); typedef hipError_t HIPAPI thipDeviceGetAttribute(int* pi, hipDeviceAttribute_t attrib, hipDevice_t dev);
typedef hipError_t HIPAPI thipDeviceComputeCapability(int* major, int* minor, hipDevice_t dev); typedef hipError_t HIPAPI thipDeviceComputeCapability(int* major, int* minor, hipDevice_t dev);
@@ -1209,6 +1208,7 @@ extern thipDriverGetVersion *hipDriverGetVersion;
extern thipGetDevice *hipGetDevice; extern thipGetDevice *hipGetDevice;
extern thipGetDeviceCount *hipGetDeviceCount; extern thipGetDeviceCount *hipGetDeviceCount;
extern thipGetDeviceProperties *hipGetDeviceProperties; extern thipGetDeviceProperties *hipGetDeviceProperties;
extern thipDeviceGet* hipDeviceGet;
extern thipDeviceGetName *hipDeviceGetName; extern thipDeviceGetName *hipDeviceGetName;
extern thipDeviceGetAttribute *hipDeviceGetAttribute; extern thipDeviceGetAttribute *hipDeviceGetAttribute;
extern thipDeviceComputeCapability *hipDeviceComputeCapability; extern thipDeviceComputeCapability *hipDeviceComputeCapability;
@@ -1333,6 +1333,7 @@ enum {
HIPEW_SUCCESS = 0, HIPEW_SUCCESS = 0,
HIPEW_ERROR_OPEN_FAILED = -1, HIPEW_ERROR_OPEN_FAILED = -1,
HIPEW_ERROR_ATEXIT_FAILED = -2, HIPEW_ERROR_ATEXIT_FAILED = -2,
HIPEW_ERROR_OLD_DRIVER = -3,
}; };
enum { enum {

@@ -71,6 +71,7 @@ thipDriverGetVersion *hipDriverGetVersion;
thipGetDevice *hipGetDevice; thipGetDevice *hipGetDevice;
thipGetDeviceCount *hipGetDeviceCount; thipGetDeviceCount *hipGetDeviceCount;
thipGetDeviceProperties *hipGetDeviceProperties; thipGetDeviceProperties *hipGetDeviceProperties;
thipDeviceGet* hipDeviceGet;
thipDeviceGetName *hipDeviceGetName; thipDeviceGetName *hipDeviceGetName;
thipDeviceGetAttribute *hipDeviceGetAttribute; thipDeviceGetAttribute *hipDeviceGetAttribute;
thipDeviceComputeCapability *hipDeviceComputeCapability; thipDeviceComputeCapability *hipDeviceComputeCapability;
@@ -213,6 +214,36 @@ static void hipewHipExit(void) {
} }
} }
#ifdef _WIN32
static int hipewHasOldDriver(const char *hip_path) {
DWORD verHandle = 0;
DWORD verSize = GetFileVersionInfoSize(hip_path, &verHandle);
int old_driver = 0;
if (verSize != 0) {
LPSTR verData = (LPSTR)malloc(verSize);
if (GetFileVersionInfo(hip_path, verHandle, verSize, verData)) {
LPBYTE lpBuffer = NULL;
UINT size = 0;
if (VerQueryValue(verData, "\\", (VOID FAR * FAR *)&lpBuffer, &size)) {
if (size) {
VS_FIXEDFILEINFO *verInfo = (VS_FIXEDFILEINFO *)lpBuffer;
/* Magic value from
* https://docs.microsoft.com/en-us/windows/win32/api/verrsrc/ns-verrsrc-vs_fixedfileinfo */
if (verInfo->dwSignature == 0xfeef04bd) {
unsigned int fileVersionLS0 = (verInfo->dwFileVersionLS >> 16) & 0xffff;
unsigned int fileversionLS1 = (verInfo->dwFileVersionLS >> 0) & 0xffff;
/* Corresponds to versions older than AMD Radeon Pro 21.Q4. */
old_driver = ((fileVersionLS0 < 3354) || (fileVersionLS0 == 3354 && fileversionLS1 < 13));
}
}
}
}
free(verData);
}
return old_driver;
}
#endif
static int hipewHipInit(void) { static int hipewHipInit(void) {
/* Library paths. */ /* Library paths. */
#ifdef _WIN32 #ifdef _WIN32
@@ -240,6 +271,14 @@ static int hipewHipInit(void) {
return result; return result;
} }
#ifdef _WIN32
/* Test for driver version. */
if(hipewHasOldDriver(hip_paths[0])) {
result = HIPEW_ERROR_OLD_DRIVER;
return result;
}
#endif
/* Load library. */ /* Load library. */
hip_lib = dynamic_library_open_find(hip_paths); hip_lib = dynamic_library_open_find(hip_paths);
@@ -255,6 +294,7 @@ static int hipewHipInit(void) {
HIP_LIBRARY_FIND_CHECKED(hipGetDevice); HIP_LIBRARY_FIND_CHECKED(hipGetDevice);
HIP_LIBRARY_FIND_CHECKED(hipGetDeviceCount); HIP_LIBRARY_FIND_CHECKED(hipGetDeviceCount);
HIP_LIBRARY_FIND_CHECKED(hipGetDeviceProperties); HIP_LIBRARY_FIND_CHECKED(hipGetDeviceProperties);
HIP_LIBRARY_FIND_CHECKED(hipDeviceGet);
HIP_LIBRARY_FIND_CHECKED(hipDeviceGetName); HIP_LIBRARY_FIND_CHECKED(hipDeviceGetName);
HIP_LIBRARY_FIND_CHECKED(hipDeviceGetAttribute); HIP_LIBRARY_FIND_CHECKED(hipDeviceGetAttribute);
HIP_LIBRARY_FIND_CHECKED(hipDeviceComputeCapability); HIP_LIBRARY_FIND_CHECKED(hipDeviceComputeCapability);
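
The version test inside `hipewHasOldDriver` above boils down to a two-part comparison on the low 32 bits of the file version. A minimal, platform-independent sketch of just that comparison follows. The parameter `file_version_ls` stands in for `VS_FIXEDFILEINFO::dwFileVersionLS`, and the names `is_old_driver`, `pack_version_ls`, `kMinBuild` and `kMinRevision` are hypothetical; only the threshold values 3354 and 13 come from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Thresholds taken from the diff ("older than AMD Radeon Pro 21.Q4").
constexpr std::uint32_t kMinBuild = 3354;
constexpr std::uint32_t kMinRevision = 13;

// Hypothetical helper: build a dwFileVersionLS-style value from its two 16-bit halves.
constexpr std::uint32_t pack_version_ls(std::uint32_t high, std::uint32_t low)
{
  return ((high & 0xffffu) << 16) | (low & 0xffffu);
}

// True when (high, low) compares lexicographically below (kMinBuild, kMinRevision).
constexpr bool is_old_driver(std::uint32_t file_version_ls)
{
  return ((file_version_ls >> 16) & 0xffffu) < kMinBuild ||
         (((file_version_ls >> 16) & 0xffffu) == kMinBuild &&
          (file_version_ls & 0xffffu) < kMinRevision);
}

// Small self-check of the boundary cases.
inline void check_driver_examples()
{
  assert(is_old_driver(pack_version_ls(3354, 12)));   // just below the threshold
  assert(!is_old_driver(pack_version_ls(3354, 13)));  // exactly the threshold
}
```

Keeping the comparison separate from the Win32 calls (`GetFileVersionInfo`, `VerQueryValue`) makes the boundary cases easy to check without a Windows machine.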

@@ -1,7 +1,7 @@
Project: NanoSVG Project: NanoSVG
URL: https://github.com/memononen/nanosvg URL: https://github.com/memononen/nanosvg
License: zlib License: zlib
Upstream version: Upstream version: 3cdd4a9d7886
Local modifications: Added some functionality to manage grease pencil layers Local modifications: Added some functionality to manage grease pencil layers
Added a fix to SVG import arc and float errors (https://developer.blender.org/rB11dc674c78b49fc4e0b7c134c375b6c8b8eacbcc) Added a fix to SVG import arc and float errors (https://developer.blender.org/rB11dc674c78b49fc4e0b7c134c375b6c8b8eacbcc)

@@ -82,7 +82,7 @@ static void session_print_status()
string status, substatus; string status, substatus;
/* get status */ /* get status */
float progress = options.session->progress.get_progress(); double progress = options.session->progress.get_progress();
options.session->progress.get_status(status, substatus); options.session->progress.get_status(status, substatus);
if (substatus != "") if (substatus != "")
@@ -183,7 +183,7 @@ static void display_info(Progress &progress)
progress.get_time(total_time, sample_time); progress.get_time(total_time, sample_time);
progress.get_status(status, substatus); progress.get_status(status, substatus);
float progress_val = progress.get_progress(); double progress_val = progress.get_progress();
if (substatus != "") if (substatus != "")
status += ": " + substatus; status += ": " + substatus;

@@ -138,6 +138,11 @@ endif()
blender_add_lib(bf_intern_cycles "${SRC}" "${INC}" "${INC_SYS}" "${LIB}") blender_add_lib(bf_intern_cycles "${SRC}" "${INC}" "${INC_SYS}" "${LIB}")
# avoid link failure with clang 3.4 debug
if(CMAKE_C_COMPILER_ID MATCHES "Clang" AND NOT ${CMAKE_C_COMPILER_VERSION} VERSION_LESS '3.4')
string(APPEND CMAKE_CXX_FLAGS_DEBUG " -gline-tables-only")
endif()
add_dependencies(bf_intern_cycles bf_rna) add_dependencies(bf_intern_cycles bf_rna)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${ADDON_FILES}" ${CYCLES_INSTALL_PATH}) delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${ADDON_FILES}" ${CYCLES_INSTALL_PATH})

@@ -346,7 +346,7 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
name="Scrambling Distance", name="Scrambling Distance",
default=1.0, default=1.0,
min=0.0, max=1.0, min=0.0, max=1.0,
description="Lower values give faster rendering with GPU rendering and less noise with all devices at the cost of possible artifacts if set too low. Only works when not using adaptive sampling", description="Reduce randomization between pixels to improve GPU rendering performance, at the cost of possible rendering artifacts if set too low. Only works when not using adaptive sampling",
) )
preview_scrambling_distance: BoolProperty( preview_scrambling_distance: BoolProperty(
name="Scrambling Distance viewport", name="Scrambling Distance viewport",
@@ -354,10 +354,10 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
description="Uses the Scrambling Distance value for the viewport. Faster but may flicker", description="Uses the Scrambling Distance value for the viewport. Faster but may flicker",
) )
adaptive_scrambling_distance: BoolProperty( auto_scrambling_distance: BoolProperty(
name="Adaptive Scrambling Distance", name="Automatic Scrambling Distance",
default=False, default=False,
description="Uses a formula to adapt the scrambling distance strength based on the sample count", description="Automatically reduce the randomization between pixels to improve GPU rendering performance, at the cost of possible rendering artifacts. Only works when not using adaptive sampling",
) )
use_layer_samples: EnumProperty( use_layer_samples: EnumProperty(
@@ -770,8 +770,8 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
) )
use_auto_tile: BoolProperty( use_auto_tile: BoolProperty(
name="Auto Tiles", name="Use Tiling",
description="Automatically render high resolution images in tiles to reduce memory usage, using the specified tile size. Tiles are cached to disk while rendering to save memory", description="Render high resolution images in tiles to reduce memory usage, using the specified tile size. Tiles are cached to disk while rendering to save memory",
default=True, default=True,
) )
tile_size: IntProperty( tile_size: IntProperty(

@@ -292,13 +292,13 @@ class CYCLES_RENDER_PT_sampling_advanced(CyclesButtonsPanel, Panel):
layout.separator() layout.separator()
col = layout.column(align=True) heading = layout.column(align=True, heading="Scrambling Distance")
col.active = not (cscene.use_adaptive_sampling and cscene.use_preview_adaptive_sampling) heading.active = not (cscene.use_adaptive_sampling and cscene.use_preview_adaptive_sampling)
col.prop(cscene, "scrambling_distance", text="Scrambling Distance") heading.prop(cscene, "auto_scrambling_distance", text="Automatic")
col.prop(cscene, "adaptive_scrambling_distance", text="Adaptive") sub = heading.row()
sub = col.row(align=True)
sub.active = not cscene.use_preview_adaptive_sampling sub.active = not cscene.use_preview_adaptive_sampling
sub.prop(cscene, "preview_scrambling_distance", text="Viewport") sub.prop(cscene, "preview_scrambling_distance", text="Viewport")
heading.prop(cscene, "scrambling_distance", text="Multiplier")
layout.separator() layout.separator()
@@ -1051,7 +1051,7 @@ class CYCLES_OBJECT_PT_motion_blur(CyclesButtonsPanel, Panel):
def has_geometry_visibility(ob): def has_geometry_visibility(ob):
return ob and ((ob.type in {'MESH', 'CURVE', 'SURFACE', 'FONT', 'META', 'LIGHT'}) or return ob and ((ob.type in {'MESH', 'CURVE', 'SURFACE', 'FONT', 'META', 'LIGHT', 'VOLUME', 'POINTCLOUD', 'HAIR'}) or
(ob.instance_type == 'COLLECTION' and ob.instance_collection)) (ob.instance_type == 'COLLECTION' and ob.instance_collection))

@@ -819,11 +819,14 @@ void BlenderSync::sync_hair(BL::Depsgraph b_depsgraph, BObjectInfo &b_ob_info, H
new_hair.set_used_shaders(used_shaders); new_hair.set_used_shaders(used_shaders);
if (view_layer.use_hair) { if (view_layer.use_hair) {
#ifdef WITH_HAIR_NODES
if (b_ob_info.object_data.is_a(&RNA_Hair)) { if (b_ob_info.object_data.is_a(&RNA_Hair)) {
/* Hair object. */ /* Hair object. */
sync_hair(&new_hair, b_ob_info, false); sync_hair(&new_hair, b_ob_info, false);
} }
else { else
#endif
{
/* Particle hair. */ /* Particle hair. */
bool need_undeformed = new_hair.need_attribute(scene, ATTR_STD_GENERATED); bool need_undeformed = new_hair.need_attribute(scene, ATTR_STD_GENERATED);
BL::Mesh b_mesh = object_to_mesh( BL::Mesh b_mesh = object_to_mesh(
@@ -870,12 +873,15 @@ void BlenderSync::sync_hair_motion(BL::Depsgraph b_depsgraph,
/* Export deformed coordinates. */ /* Export deformed coordinates. */
if (ccl::BKE_object_is_deform_modified(b_ob_info, b_scene, preview)) { if (ccl::BKE_object_is_deform_modified(b_ob_info, b_scene, preview)) {
#ifdef WITH_HAIR_NODES
if (b_ob_info.object_data.is_a(&RNA_Hair)) { if (b_ob_info.object_data.is_a(&RNA_Hair)) {
/* Hair object. */ /* Hair object. */
sync_hair(hair, b_ob_info, true, motion_step); sync_hair(hair, b_ob_info, true, motion_step);
return; return;
} }
else { else
#endif
{
/* Particle hair. */ /* Particle hair. */
BL::Mesh b_mesh = object_to_mesh( BL::Mesh b_mesh = object_to_mesh(
b_data, b_ob_info, b_depsgraph, false, Mesh::SUBDIVISION_NONE); b_data, b_ob_info, b_depsgraph, false, Mesh::SUBDIVISION_NONE);
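
The two hunks above rely on a preprocessor idiom: the `if (...) { ... } else` prefix is wrapped in `#ifdef WITH_HAIR_NODES`, while the trailing `{ ... }` block stays outside it, so the same block serves as the `else` branch when the feature is compiled in and as a plain scope when it is not. A reduced sketch of the pattern is shown here; `ObjectInfo`, `is_hair_object` and `pick_hair_path` are hypothetical stand-ins, not Cycles types.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for `b_ob_info.object_data.is_a(&RNA_Hair)`.
struct ObjectInfo {
  bool is_hair_object;
};

inline std::string pick_hair_path(const ObjectInfo &ob)
{
  (void)ob;  // unused when WITH_HAIR_NODES is not defined
#ifdef WITH_HAIR_NODES
  if (ob.is_hair_object) {
    return "hair object";
  }
  else
#endif
  {
    // With the feature disabled this is just a nested scope that always runs.
    return "particle hair";
  }
}

// Self-check that holds in both build configurations.
inline void check_hair_path()
{
  assert(pick_hair_path(ObjectInfo{false}) == "particle hair");
}
```

When built without `-DWITH_HAIR_NODES`, every object takes the particle-hair path regardless of `is_hair_object`.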

@@ -31,7 +31,11 @@ CCL_NAMESPACE_BEGIN
static Geometry::Type determine_geom_type(BObjectInfo &b_ob_info, bool use_particle_hair) static Geometry::Type determine_geom_type(BObjectInfo &b_ob_info, bool use_particle_hair)
{ {
#ifdef WITH_HAIR_NODES
if (b_ob_info.object_data.is_a(&RNA_Hair) || use_particle_hair) { if (b_ob_info.object_data.is_a(&RNA_Hair) || use_particle_hair) {
#else
if (use_particle_hair) {
#endif
return Geometry::HAIR; return Geometry::HAIR;
} }
@@ -215,7 +219,11 @@ void BlenderSync::sync_geometry_motion(BL::Depsgraph &b_depsgraph,
if (progress.get_cancel()) if (progress.get_cancel())
return; return;
#ifdef WITH_HAIR_NODES
if (b_ob_info.object_data.is_a(&RNA_Hair) || use_particle_hair) { if (b_ob_info.object_data.is_a(&RNA_Hair) || use_particle_hair) {
#else
if (use_particle_hair) {
#endif
Hair *hair = static_cast<Hair *>(geom); Hair *hair = static_cast<Hair *>(geom);
sync_hair_motion(b_depsgraph, b_ob_info, hair, motion_step); sync_hair_motion(b_depsgraph, b_ob_info, hair, motion_step);
} }

@@ -24,8 +24,14 @@ CCL_NAMESPACE_BEGIN
/* Packed Images */ /* Packed Images */
BlenderImageLoader::BlenderImageLoader(BL::Image b_image, int frame) BlenderImageLoader::BlenderImageLoader(BL::Image b_image,
: b_image(b_image), frame(frame), free_cache(!b_image.has_data()) const int frame,
const bool is_preview_render)
: b_image(b_image),
frame(frame),
/* Don't free cache for preview render to avoid race condition from T93560, to be fixed
properly later as we are close to release. */
free_cache(!is_preview_render && !b_image.has_data())
{ {
} }

@@ -25,7 +25,7 @@ CCL_NAMESPACE_BEGIN
class BlenderImageLoader : public ImageLoader { class BlenderImageLoader : public ImageLoader {
public: public:
BlenderImageLoader(BL::Image b_image, int frame); BlenderImageLoader(BL::Image b_image, const int frame, const bool is_preview_render);
bool load_metadata(const ImageDeviceFeatures &features, ImageMetaData &metadata) override; bool load_metadata(const ImageDeviceFeatures &features, ImageMetaData &metadata) override;
bool load_pixels(const ImageMetaData &metadata, bool load_pixels(const ImageMetaData &metadata,

@@ -294,7 +294,7 @@ Object *BlenderSync::sync_object(BL::Depsgraph &b_depsgraph,
object->set_visibility(visibility); object->set_visibility(visibility);
object->set_is_shadow_catcher(b_ob.is_shadow_catcher()); object->set_is_shadow_catcher(b_ob.is_shadow_catcher() || b_parent.is_shadow_catcher());
float shadow_terminator_shading_offset = get_float(cobject, "shadow_terminator_offset"); float shadow_terminator_shading_offset = get_float(cobject, "shadow_terminator_offset");
object->set_shadow_terminator_shading_offset(shadow_terminator_shading_offset); object->set_shadow_terminator_shading_offset(shadow_terminator_shading_offset);

@@ -120,7 +120,7 @@ void BlenderOutputDriver::write_render_tile(const Tile &tile)
b_pass.rect(&pixels[0]); b_pass.rect(&pixels[0]);
} }
b_engine_.end_result(b_rr, true, false, true); b_engine_.end_result(b_rr, false, false, true);
} }
CCL_NAMESPACE_END CCL_NAMESPACE_END

@@ -129,7 +129,7 @@ void BlenderSession::create_session()
/* reset status/progress */ /* reset status/progress */
last_status = ""; last_status = "";
last_error = ""; last_error = "";
last_progress = -1.0f; last_progress = -1.0;
start_resize_time = 0.0; start_resize_time = 0.0;
/* create session */ /* create session */
@@ -615,6 +615,24 @@ void BlenderSession::bake(BL::Depsgraph &b_depsgraph_,
sync->sync_camera(b_render, b_camera_override, width, height, ""); sync->sync_camera(b_render, b_camera_override, width, height, "");
sync->sync_data( sync->sync_data(
b_render, b_depsgraph, b_v3d, b_camera_override, width, height, &python_thread_state); b_render, b_depsgraph, b_v3d, b_camera_override, width, height, &python_thread_state);
/* Filtering settings for combined pass. */
if (pass->get_type() == PASS_COMBINED) {
Integrator *integrator = scene->integrator;
integrator->set_use_direct_light((bake_filter & BL::BakeSettings::pass_filter_DIRECT) != 0);
integrator->set_use_indirect_light((bake_filter & BL::BakeSettings::pass_filter_INDIRECT) !=
0);
integrator->set_use_diffuse((bake_filter & BL::BakeSettings::pass_filter_DIFFUSE) != 0);
integrator->set_use_glossy((bake_filter & BL::BakeSettings::pass_filter_GLOSSY) != 0);
integrator->set_use_transmission(
(bake_filter & BL::BakeSettings::pass_filter_TRANSMISSION) != 0);
integrator->set_use_emission((bake_filter & BL::BakeSettings::pass_filter_EMIT) != 0);
}
/* Always use transpanent background for baking. */
scene->background->set_transparent(true);
/* Load built-in images from Blender. */
builtin_images_load(); builtin_images_load();
} }
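
The combined-pass filtering added in the hunk above turns each bit of `bake_filter` into one boolean integrator setting. The same decoding can be sketched in isolation as below. The bit values in `BakeFilter` are invented for illustration (the real `BL::BakeSettings::pass_filter_*` values are not shown in the diff), and `IntegratorFlags` and `flags_from_bake_filter` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical bit assignments; the actual pass_filter_* values may differ.
enum BakeFilter : unsigned {
  FILTER_DIRECT = 1u << 0,
  FILTER_INDIRECT = 1u << 1,
  FILTER_DIFFUSE = 1u << 2,
  FILTER_GLOSSY = 1u << 3,
  FILTER_TRANSMISSION = 1u << 4,
  FILTER_EMIT = 1u << 5,
};

// Hypothetical plain-data mirror of the six integrator toggles set in the diff.
struct IntegratorFlags {
  bool use_direct_light;
  bool use_indirect_light;
  bool use_diffuse;
  bool use_glossy;
  bool use_transmission;
  bool use_emission;
};

// Each flag is simply "is this bit set in the mask?".
inline IntegratorFlags flags_from_bake_filter(unsigned bake_filter)
{
  IntegratorFlags flags;
  flags.use_direct_light = (bake_filter & FILTER_DIRECT) != 0;
  flags.use_indirect_light = (bake_filter & FILTER_INDIRECT) != 0;
  flags.use_diffuse = (bake_filter & FILTER_DIFFUSE) != 0;
  flags.use_glossy = (bake_filter & FILTER_GLOSSY) != 0;
  flags.use_transmission = (bake_filter & FILTER_TRANSMISSION) != 0;
  flags.use_emission = (bake_filter & FILTER_EMIT) != 0;
  return flags;
}

// Self-check: an empty mask disables every contribution.
inline void check_bake_filter()
{
  const IntegratorFlags none = flags_from_bake_filter(0u);
  assert(!none.use_direct_light && !none.use_emission);
}
```

Comparing against `!= 0` rather than casting the masked value to `bool` matches the style of the diff and keeps the intent explicit.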
@@ -841,7 +859,7 @@ void BlenderSession::get_status(string &status, string &substatus)
session->progress.get_status(status, substatus); session->progress.get_status(status, substatus);
} }
void BlenderSession::get_progress(float &progress, double &total_time, double &render_time) void BlenderSession::get_progress(double &progress, double &total_time, double &render_time)
{ {
session->progress.get_time(total_time, render_time); session->progress.get_time(total_time, render_time);
progress = session->progress.get_progress(); progress = session->progress.get_progress();
@@ -849,10 +867,10 @@ void BlenderSession::get_progress(float &progress, double &total_time, double &r
void BlenderSession::update_bake_progress() void BlenderSession::update_bake_progress()
{ {
float progress = session->progress.get_progress(); double progress = session->progress.get_progress();
if (progress != last_progress) { if (progress != last_progress) {
b_engine.update_progress(progress); b_engine.update_progress((float)progress);
last_progress = progress; last_progress = progress;
} }
} }
@@ -861,7 +879,7 @@ void BlenderSession::update_status_progress()
{ {
string timestatus, status, substatus; string timestatus, status, substatus;
string scene_status = ""; string scene_status = "";
float progress; double progress;
double total_time, remaining_time = 0, render_time; double total_time, remaining_time = 0, render_time;
float mem_used = (float)session->stats.mem_used / 1024.0f / 1024.0f; float mem_used = (float)session->stats.mem_used / 1024.0f / 1024.0f;
float mem_peak = (float)session->stats.mem_peak / 1024.0f / 1024.0f; float mem_peak = (float)session->stats.mem_peak / 1024.0f / 1024.0f;
@@ -905,7 +923,7 @@ void BlenderSession::update_status_progress()
last_status_time = current_time; last_status_time = current_time;
} }
if (progress != last_progress) { if (progress != last_progress) {
b_engine.update_progress(progress); b_engine.update_progress((float)progress);
last_progress = progress; last_progress = progress;
} }
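
The hunks above widen the progress value from `float` to `double` and cast back to `float` only at the `b_engine.update_progress` call. The diff does not state the motivation. One plausible reason, offered here purely as an assumption, is that a `float` fraction cannot distinguish neighbouring progress steps once the total amount of work is large, so a `progress != last_progress` test would stop firing. The sketch below illustrates that effect with hypothetical helpers `progress_double` and `progress_float`.

```cpp
#include <cassert>
#include <cstdint>

// Fraction of work done, computed in double precision.
inline double progress_double(std::uint64_t done, std::uint64_t total)
{
  assert(done <= total);
  return total ? static_cast<double>(done) / static_cast<double>(total) : 0.0;
}

// The same fraction narrowed to single precision.
inline float progress_float(std::uint64_t done, std::uint64_t total)
{
  return static_cast<float>(progress_double(done, total));
}

// With 2^26 work items, the second-to-last and last steps collapse in float
// (both round to 1.0f) but remain distinct in double.
inline bool last_step_visible_in_float()
{
  const std::uint64_t total = std::uint64_t(1) << 26;
  return progress_float(total - 1, total) != progress_float(total, total);
}

inline bool last_step_visible_in_double()
{
  const std::uint64_t total = std::uint64_t(1) << 26;
  return progress_double(total - 1, total) != progress_double(total, total);
}
```

Whether Cycles ever reaches work counts of this size is not established by the diff; the example only shows what the wider type buys in principle.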

@@ -82,7 +82,7 @@ class BlenderSession {
void tag_redraw(); void tag_redraw();
void tag_update(); void tag_update();
void get_status(string &status, string &substatus); void get_status(string &status, string &substatus);
void get_progress(float &progress, double &total_time, double &render_time); void get_progress(double &progress, double &total_time, double &render_time);
void test_cancel(); void test_cancel();
void update_status_progress(); void update_status_progress();
void update_bake_progress(); void update_bake_progress();
@@ -108,7 +108,7 @@ class BlenderSession {
string last_status; string last_status;
string last_error; string last_error;
float last_progress; double last_progress;
double last_status_time; double last_status_time;
int width, height; int width, height;

@@ -762,7 +762,8 @@ static ShaderNode *add_node(Scene *scene,
int scene_frame = b_scene.frame_current(); int scene_frame = b_scene.frame_current();
int image_frame = image_user_frame_number(b_image_user, b_image, scene_frame); int image_frame = image_user_frame_number(b_image_user, b_image, scene_frame);
image->handle = scene->image_manager->add_image( image->handle = scene->image_manager->add_image(
new BlenderImageLoader(b_image, image_frame), image->image_params()); new BlenderImageLoader(b_image, image_frame, b_engine.is_preview()),
image->image_params());
} }
else { else {
ustring filename = ustring( ustring filename = ustring(
@@ -797,8 +798,9 @@ static ShaderNode *add_node(Scene *scene,
if (is_builtin) { if (is_builtin) {
int scene_frame = b_scene.frame_current(); int scene_frame = b_scene.frame_current();
int image_frame = image_user_frame_number(b_image_user, b_image, scene_frame); int image_frame = image_user_frame_number(b_image_user, b_image, scene_frame);
env->handle = scene->image_manager->add_image(new BlenderImageLoader(b_image, image_frame), env->handle = scene->image_manager->add_image(
env->image_params()); new BlenderImageLoader(b_image, image_frame, b_engine.is_preview()),
env->image_params());
} }
else { else {
env->set_filename( env->set_filename(

@@ -365,8 +365,8 @@ void BlenderSync::sync_integrator(BL::ViewLayer &b_view_layer, bool background)
int samples = get_int(cscene, "samples"); int samples = get_int(cscene, "samples");
float scrambling_distance = get_float(cscene, "scrambling_distance"); float scrambling_distance = get_float(cscene, "scrambling_distance");
bool adaptive_scrambling_distance = get_boolean(cscene, "adaptive_scrambling_distance"); bool auto_scrambling_distance = get_boolean(cscene, "auto_scrambling_distance");
if (adaptive_scrambling_distance) { if (auto_scrambling_distance) {
scrambling_distance *= 4.0f / sqrtf(samples); scrambling_distance *= 4.0f / sqrtf(samples);
} }
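The automatic scrambling distance scaling in the hunk above can be worked through in isolation. This is a minimal sketch, not the Cycles implementation: the free function `auto_scrambling_distance` is a hypothetical name, and it assumes `samples` is positive, which the original code does not check either.

```cpp
#include <cassert>
#include <cmath>

/* Hypothetical helper that repeats the arithmetic from the hunk above:
 * the user-set distance is multiplied by 4 / sqrt(samples).
 * Assumes samples > 0. */
static float auto_scrambling_distance(float scrambling_distance, int samples)
{
  return scrambling_distance * (4.0f / sqrtf(static_cast<float>(samples)));
}
```

With this factor the distance is unchanged at 16 samples and halves each time the sample count is multiplied by four, for example `auto_scrambling_distance(1.0f, 64)` gives `0.5f`.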

@@ -30,15 +30,17 @@ BVHOptiX::BVHOptiX(const BVHParams &params_,
: BVH(params_, geometry_, objects_), : BVH(params_, geometry_, objects_),
device(device), device(device),
traversable_handle(0), traversable_handle(0),
as_data(device, params_.top_level ? "optix tlas" : "optix blas", false), as_data(make_unique<device_only_memory<char>>(
motion_transform_data(device, "optix motion transform", false) device, params.top_level ? "optix tlas" : "optix blas", false)),
motion_transform_data(
make_unique<device_only_memory<char>>(device, "optix motion transform", false))
{ {
} }
BVHOptiX::~BVHOptiX() BVHOptiX::~BVHOptiX()
{ {
// Acceleration structure memory is delayed freed on device, since deleting the /* Acceleration structure memory is delayed freed on device, since deleting the
// BVH may happen while still being used for rendering. * BVH may happen while still being used for rendering. */
device->release_optix_bvh(this); device->release_optix_bvh(this);
} }

@@ -25,14 +25,16 @@
# include "device/memory.h" # include "device/memory.h"
# include "util/unique_ptr.h"
CCL_NAMESPACE_BEGIN CCL_NAMESPACE_BEGIN
class BVHOptiX : public BVH { class BVHOptiX : public BVH {
public: public:
Device *device; Device *device;
uint64_t traversable_handle; uint64_t traversable_handle;
device_only_memory<char> as_data; unique_ptr<device_only_memory<char>> as_data;
device_only_memory<char> motion_transform_data; unique_ptr<device_only_memory<char>> motion_transform_data;
protected: protected:
friend class BVH; friend class BVH;

@@ -38,6 +38,7 @@ void device_cpu_info(vector<DeviceInfo> &devices)
info.id = "CPU"; info.id = "CPU";
info.num = 0; info.num = 0;
info.has_osl = true; info.has_osl = true;
info.has_half_images = true;
info.has_nanovdb = true; info.has_nanovdb = true;
info.has_profiling = true; info.has_profiling = true;
if (openimagedenoise_supported()) { if (openimagedenoise_supported()) {

@@ -134,8 +134,7 @@ void CPUDevice::mem_alloc(device_memory &mem)
<< string_human_readable_size(mem.memory_size()) << ")"; << string_human_readable_size(mem.memory_size()) << ")";
} }
if (mem.type == MEM_DEVICE_ONLY) { if (mem.type == MEM_DEVICE_ONLY || !mem.host_pointer) {
assert(!mem.host_pointer);
size_t alignment = MIN_ALIGNMENT_CPU_DATA_TYPES; size_t alignment = MIN_ALIGNMENT_CPU_DATA_TYPES;
void *data = util_aligned_malloc(mem.memory_size(), alignment); void *data = util_aligned_malloc(mem.memory_size(), alignment);
mem.device_pointer = (device_ptr)data; mem.device_pointer = (device_ptr)data;
@@ -194,7 +193,7 @@ void CPUDevice::mem_free(device_memory &mem)
tex_free((device_texture &)mem); tex_free((device_texture &)mem);
} }
else if (mem.device_pointer) { else if (mem.device_pointer) {
if (mem.type == MEM_DEVICE_ONLY) { if (mem.type == MEM_DEVICE_ONLY || !mem.host_pointer) {
util_aligned_free((void *)mem.device_pointer); util_aligned_free((void *)mem.device_pointer);
} }
mem.device_pointer = 0; mem.device_pointer = 0;

@@ -42,7 +42,7 @@ class CPUKernels {
IntegratorInitFunction integrator_init_from_camera; IntegratorInitFunction integrator_init_from_camera;
IntegratorInitFunction integrator_init_from_bake; IntegratorInitFunction integrator_init_from_bake;
IntegratorFunction integrator_intersect_closest; IntegratorShadeFunction integrator_intersect_closest;
IntegratorFunction integrator_intersect_shadow; IntegratorFunction integrator_intersect_shadow;
IntegratorFunction integrator_intersect_subsurface; IntegratorFunction integrator_intersect_subsurface;
IntegratorFunction integrator_intersect_volume_stack; IntegratorFunction integrator_intersect_volume_stack;

@@ -144,6 +144,7 @@ void device_cuda_info(vector<DeviceInfo> &devices)
info.description = string(name); info.description = string(name);
info.num = num; info.num = num;
info.has_half_images = (major >= 3);
info.has_nanovdb = true; info.has_nanovdb = true;
info.denoisers = 0; info.denoisers = 0;

@@ -680,7 +680,7 @@ CUDADevice::CUDAMem *CUDADevice::generic_alloc(device_memory &mem, size_t pitch_
void *shared_pointer = 0; void *shared_pointer = 0;
if (mem_alloc_result != CUDA_SUCCESS && can_map_host) { if (mem_alloc_result != CUDA_SUCCESS && can_map_host && mem.type != MEM_DEVICE_ONLY) {
if (mem.shared_pointer) { if (mem.shared_pointer) {
/* Another device already allocated host memory. */ /* Another device already allocated host memory. */
mem_alloc_result = CUDA_SUCCESS; mem_alloc_result = CUDA_SUCCESS;
@@ -703,8 +703,14 @@ CUDADevice::CUDAMem *CUDADevice::generic_alloc(device_memory &mem, size_t pitch_
} }
if (mem_alloc_result != CUDA_SUCCESS) { if (mem_alloc_result != CUDA_SUCCESS) {
status = " failed, out of device and host memory"; if (mem.type == MEM_DEVICE_ONLY) {
set_error("System is out of GPU and shared host memory"); status = " failed, out of device memory";
set_error("System is out of GPU memory");
}
else {
status = " failed, out of device and host memory";
set_error("System is out of GPU and shared host memory");
}
} }
if (mem.name) { if (mem.name) {
@@ -777,6 +783,7 @@ void CUDADevice::generic_free(device_memory &mem)
if (mem.device_pointer) { if (mem.device_pointer) {
CUDAContextScope scope(this); CUDAContextScope scope(this);
thread_scoped_lock lock(cuda_mem_map_mutex); thread_scoped_lock lock(cuda_mem_map_mutex);
DCHECK(cuda_mem_map.find(&mem) != cuda_mem_map.end());
const CUDAMem &cmem = cuda_mem_map[&mem]; const CUDAMem &cmem = cuda_mem_map[&mem];
/* If cmem.use_mapped_host is true, reference counting is used /* If cmem.use_mapped_host is true, reference counting is used
@@ -1145,6 +1152,7 @@ void CUDADevice::tex_free(device_texture &mem)
if (mem.device_pointer) { if (mem.device_pointer) {
CUDAContextScope scope(this); CUDAContextScope scope(this);
thread_scoped_lock lock(cuda_mem_map_mutex); thread_scoped_lock lock(cuda_mem_map_mutex);
DCHECK(cuda_mem_map.find(&mem) != cuda_mem_map.end());
const CUDAMem &cmem = cuda_mem_map[&mem]; const CUDAMem &cmem = cuda_mem_map[&mem];
if (cmem.texobject) { if (cmem.texobject) {

@@ -286,6 +286,7 @@ DeviceInfo Device::get_multi_device(const vector<DeviceInfo> &subdevices,
info.description = "Multi Device"; info.description = "Multi Device";
info.num = 0; info.num = 0;
info.has_half_images = true;
info.has_nanovdb = true; info.has_nanovdb = true;
info.has_osl = true; info.has_osl = true;
info.has_profiling = true; info.has_profiling = true;
@@ -332,6 +333,7 @@ DeviceInfo Device::get_multi_device(const vector<DeviceInfo> &subdevices,
} }
/* Accumulate device info. */ /* Accumulate device info. */
info.has_half_images &= device.has_half_images;
info.has_nanovdb &= device.has_nanovdb; info.has_nanovdb &= device.has_nanovdb;
info.has_osl &= device.has_osl; info.has_osl &= device.has_osl;
info.has_profiling &= device.has_profiling; info.has_profiling &= device.has_profiling;
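The accumulation pattern in the hunk above (start each capability at `true`, then `&=` it with every sub-device) can be sketched on its own. The `Capabilities` struct and `combine` function are hypothetical names that mirror only two of the `DeviceInfo` flags; this is an illustration under those assumptions, not the actual `Device::get_multi_device` code.

```cpp
#include <cassert>
#include <vector>

/* Hypothetical stand-in for a subset of the DeviceInfo flags. */
struct Capabilities {
  bool has_half_images = false;
  bool has_nanovdb = false;
};

/* A combined device supports a feature only if every sub-device supports it. */
static Capabilities combine(const std::vector<Capabilities> &subdevices)
{
  Capabilities combined;
  /* Start optimistic, as the multi-device setup above does. */
  combined.has_half_images = true;
  combined.has_nanovdb = true;

  for (const Capabilities &device : subdevices) {
    /* A single sub-device without the feature clears the combined flag. */
    combined.has_half_images &= device.has_half_images;
    combined.has_nanovdb &= device.has_nanovdb;
  }
  return combined;
}
```

One consequence of starting at `true` is that an empty sub-device list reports every capability as supported, so a caller of this sketch would need to handle that case separately.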

@@ -73,6 +73,7 @@ class DeviceInfo {
int num; int num;
bool display_device; /* GPU is used as a display device. */ bool display_device; /* GPU is used as a display device. */
bool has_nanovdb; /* Support NanoVDB volumes. */ bool has_nanovdb; /* Support NanoVDB volumes. */
bool has_half_images; /* Support half-float textures. */
bool has_osl; /* Support Open Shading Language. */ bool has_osl; /* Support Open Shading Language. */
bool has_profiling; /* Supports runtime collection of profiling info. */ bool has_profiling; /* Supports runtime collection of profiling info. */
bool has_peer_memory; /* GPU has P2P access to memory of another GPU. */ bool has_peer_memory; /* GPU has P2P access to memory of another GPU. */
@@ -89,6 +90,7 @@ class DeviceInfo {
num = 0; num = 0;
cpu_threads = 0; cpu_threads = 0;
display_device = false; display_device = false;
has_half_images = false;
has_nanovdb = false; has_nanovdb = false;
has_osl = false; has_osl = false;
has_profiling = false; has_profiling = false;

@@ -57,9 +57,16 @@ bool device_hip_init()
} }
} }
else { else {
VLOG(1) << "HIPEW initialization failed: " if (hipew_result == HIPEW_ERROR_ATEXIT_FAILED) {
<< ((hipew_result == HIPEW_ERROR_ATEXIT_FAILED) ? "Error setting up atexit() handler" : VLOG(1) << "HIPEW initialization failed: Error setting up atexit() handler";
"Error opening the library"); }
else if (hipew_result == HIPEW_ERROR_OLD_DRIVER) {
VLOG(1) << "HIPEW initialization failed: Driver version too old, requires AMD Radeon Pro "
"21.Q4 driver or newer";
}
else {
VLOG(1) << "HIPEW initialization failed: Error opening HIP dynamic library";
}
} }
return result; return result;
@@ -141,6 +148,7 @@ void device_hip_info(vector<DeviceInfo> &devices)
info.description = string(name); info.description = string(name);
info.num = num; info.num = num;
info.has_half_images = true;
info.has_nanovdb = true; info.has_nanovdb = true;
info.denoisers = 0; info.denoisers = 0;

@@ -99,7 +99,7 @@ HIPDevice::HIPDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
} }
/* Setup device and context. */ /* Setup device and context. */
result = hipGetDevice(&hipDevice, hipDevId); result = hipDeviceGet(&hipDevice, hipDevId);
if (result != hipSuccess) { if (result != hipSuccess) {
set_error(string_printf("Failed to get HIP device handle from ordinal (%s)", set_error(string_printf("Failed to get HIP device handle from ordinal (%s)",
hipewErrorString(result))); hipewErrorString(result)));
@@ -222,7 +222,6 @@ string HIPDevice::compile_kernel_get_common_cflags(const uint kernel_features)
const string include_path = source_path; const string include_path = source_path;
string cflags = string_printf( string cflags = string_printf(
"-m%d " "-m%d "
"--ptxas-options=\"-v\" "
"--use_fast_math " "--use_fast_math "
"-DHIPCC " "-DHIPCC "
"-I\"%s\"", "-I\"%s\"",
@@ -234,10 +233,7 @@ string HIPDevice::compile_kernel_get_common_cflags(const uint kernel_features)
return cflags; return cflags;
} }
string HIPDevice::compile_kernel(const uint kernel_features, string HIPDevice::compile_kernel(const uint kernel_features, const char *name, const char *base)
const char *name,
const char *base,
bool force_ptx)
{ {
/* Compute kernel name. */ /* Compute kernel name. */
int major, minor; int major, minor;
@@ -255,13 +251,11 @@ string HIPDevice::compile_kernel(const uint kernel_features,
/* Attempt to use kernel provided with Blender. */ /* Attempt to use kernel provided with Blender. */
if (!use_adaptive_compilation()) { if (!use_adaptive_compilation()) {
if (!force_ptx) { const string fatbin = path_get(string_printf("lib/%s_%s.fatbin", name, arch));
const string fatbin = path_get(string_printf("lib/%s_%s.fatbin", name, arch)); VLOG(1) << "Testing for pre-compiled kernel " << fatbin << ".";
VLOG(1) << "Testing for pre-compiled kernel " << fatbin << "."; if (path_exists(fatbin)) {
if (path_exists(fatbin)) { VLOG(1) << "Using precompiled kernel.";
VLOG(1) << "Using precompiled kernel."; return fatbin;
return fatbin;
}
} }
} }
@@ -298,9 +292,9 @@ string HIPDevice::compile_kernel(const uint kernel_features,
# ifdef _WIN32 # ifdef _WIN32
if (!use_adaptive_compilation() && have_precompiled_kernels()) { if (!use_adaptive_compilation() && have_precompiled_kernels()) {
if (major < 3) { if (!hipSupportsDevice(hipDevId)) {
set_error( set_error(
string_printf("HIP backend requires compute capability 3.0 or up, but found %d.%d. " string_printf("HIP backend requires compute capability 10.1 or up, but found %d.%d. "
"Your GPU is not supported.", "Your GPU is not supported.",
major, major,
minor)); minor));
@@ -751,6 +745,7 @@ void HIPDevice::generic_free(device_memory &mem)
if (mem.device_pointer) { if (mem.device_pointer) {
HIPContextScope scope(this); HIPContextScope scope(this);
thread_scoped_lock lock(hip_mem_map_mutex); thread_scoped_lock lock(hip_mem_map_mutex);
DCHECK(hip_mem_map.find(&mem) != hip_mem_map.end());
const HIPMem &cmem = hip_mem_map[&mem]; const HIPMem &cmem = hip_mem_map[&mem];
/* If cmem.use_mapped_host is true, reference counting is used /* If cmem.use_mapped_host is true, reference counting is used
@@ -994,16 +989,16 @@ void HIPDevice::tex_alloc(device_texture &mem)
<< string_human_readable_number(mem.memory_size()) << " bytes. (" << string_human_readable_number(mem.memory_size()) << " bytes. ("
<< string_human_readable_size(mem.memory_size()) << ")"; << string_human_readable_size(mem.memory_size()) << ")";
hip_assert(hipArray3DCreate(&array_3d, &desc)); hip_assert(hipArray3DCreate((hArray *)&array_3d, &desc));
if (!array_3d) { if (!array_3d) {
return; return;
} }
HIP_MEMCPY3D param; HIP_MEMCPY3D param;
memset(&param, 0, sizeof(param)); memset(&param, 0, sizeof(HIP_MEMCPY3D));
param.dstMemoryType = hipMemoryTypeArray; param.dstMemoryType = hipMemoryTypeArray;
param.dstArray = &array_3d; param.dstArray = array_3d;
param.srcMemoryType = hipMemoryTypeHost; param.srcMemoryType = hipMemoryTypeHost;
param.srcHost = mem.host_pointer; param.srcHost = mem.host_pointer;
param.srcPitch = src_pitch; param.srcPitch = src_pitch;
@@ -1069,13 +1064,13 @@ void HIPDevice::tex_alloc(device_texture &mem)
if (mem.info.data_type != IMAGE_DATA_TYPE_NANOVDB_FLOAT && if (mem.info.data_type != IMAGE_DATA_TYPE_NANOVDB_FLOAT &&
mem.info.data_type != IMAGE_DATA_TYPE_NANOVDB_FLOAT3) { mem.info.data_type != IMAGE_DATA_TYPE_NANOVDB_FLOAT3) {
/* Kepler+, bindless textures. */ /* Bindless textures. */
hipResourceDesc resDesc; hipResourceDesc resDesc;
memset(&resDesc, 0, sizeof(resDesc)); memset(&resDesc, 0, sizeof(resDesc));
if (array_3d) { if (array_3d) {
resDesc.resType = hipResourceTypeArray; resDesc.resType = hipResourceTypeArray;
resDesc.res.array.h_Array = &array_3d; resDesc.res.array.h_Array = array_3d;
resDesc.flags = 0; resDesc.flags = 0;
} }
else if (mem.data_height > 0) { else if (mem.data_height > 0) {
@@ -1120,6 +1115,7 @@ void HIPDevice::tex_free(device_texture &mem)
if (mem.device_pointer) { if (mem.device_pointer) {
HIPContextScope scope(this); HIPContextScope scope(this);
thread_scoped_lock lock(hip_mem_map_mutex); thread_scoped_lock lock(hip_mem_map_mutex);
DCHECK(hip_mem_map.find(&mem) != hip_mem_map.end());
const HIPMem &cmem = hip_mem_map[&mem]; const HIPMem &cmem = hip_mem_map[&mem];
if (cmem.texobject) { if (cmem.texobject) {
@@ -1160,6 +1156,8 @@ bool HIPDevice::should_use_graphics_interop()
* possible, but from the empiric measurements it can be considerably slower than using naive * possible, but from the empiric measurements it can be considerably slower than using naive
* pixels copy. */ * pixels copy. */
/* Disable graphics interop for now, because of driver bug in 21.40. See T92972 */
# if 0
HIPContextScope scope(this); HIPContextScope scope(this);
int num_all_devices = 0; int num_all_devices = 0;
@@ -1178,6 +1176,7 @@ bool HIPDevice::should_use_graphics_interop()
return true; return true;
} }
} }
# endif
return false; return false;
} }

@@ -95,8 +95,7 @@ class HIPDevice : public Device {
string compile_kernel(const uint kernel_features, string compile_kernel(const uint kernel_features,
const char *name, const char *name,
const char *base = "hip", const char *base = "hip");
bool force_ptx = false);
virtual bool load_kernels(const uint kernel_features) override; virtual bool load_kernels(const uint kernel_features) override;
void reserve_local_memory(const uint kernel_features); void reserve_local_memory(const uint kernel_features);

@@ -44,45 +44,6 @@ device_memory::device_memory(Device *device, const char *name, MemoryType type)
{ {
} }
device_memory::device_memory(device_memory &&other) noexcept
: data_type(other.data_type),
data_elements(other.data_elements),
data_size(other.data_size),
device_size(other.device_size),
data_width(other.data_width),
data_height(other.data_height),
data_depth(other.data_depth),
type(other.type),
name(other.name),
device(other.device),
device_pointer(other.device_pointer),
host_pointer(other.host_pointer),
shared_pointer(other.shared_pointer),
shared_counter(other.shared_counter),
original_device_ptr(other.original_device_ptr),
original_device_size(other.original_device_size),
original_device(other.original_device),
need_realloc_(other.need_realloc_),
modified(other.modified)
{
other.data_elements = 0;
other.data_size = 0;
other.device_size = 0;
other.data_width = 0;
other.data_height = 0;
other.data_depth = 0;
other.device = 0;
other.device_pointer = 0;
other.host_pointer = 0;
other.shared_pointer = 0;
other.shared_counter = 0;
other.original_device_ptr = 0;
other.original_device_size = 0;
other.original_device = 0;
other.need_realloc_ = false;
other.modified = false;
}
device_memory::~device_memory() device_memory::~device_memory()
{ {
assert(shared_pointer == 0); assert(shared_pointer == 0);

@@ -281,11 +281,16 @@ class device_memory {
/* Only create through subclasses. */ /* Only create through subclasses. */
device_memory(Device *device, const char *name, MemoryType type); device_memory(Device *device, const char *name, MemoryType type);
device_memory(device_memory &&other) noexcept;
/* No copying allowed. */ /* No copying and moving allowed.
*
* This is because device implementation might need to register device memory in an allocation
* map of some sort and use pointer as a key to identify blocks. Moving data from one place to
* another bypassing device allocation routines will make those maps hard to maintain. */
device_memory(const device_memory &) = delete; device_memory(const device_memory &) = delete;
device_memory(device_memory &&other) noexcept = delete;
device_memory &operator=(const device_memory &) = delete; device_memory &operator=(const device_memory &) = delete;
device_memory &operator=(device_memory &&) = delete;
/* Host allocation on the device. All host_pointer memory should be /* Host allocation on the device. All host_pointer memory should be
* allocated with these functions, for devices that support using * allocated with these functions, for devices that support using
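The comment above gives the reason for deleting the move operations: a device may key its allocation map on the address of the memory object. The sketch below illustrates that reasoning with hypothetical names (`allocation_map`, `TrackedBuffer`); it is not the Cycles `device_memory` class. It also shows why such objects are then held through `unique_ptr` in containers, as the `vector<unique_ptr<device_only_memory<char>>>` change elsewhere in this diff does: the vector moves the pointers when it grows, while the objects keep their addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

/* Hypothetical stand-in for a device allocation map keyed by object address. */
static std::map<const void *, size_t> allocation_map;

/* Hypothetical non-copyable, non-movable type that registers itself by address. */
class TrackedBuffer {
 public:
  explicit TrackedBuffer(size_t size)
  {
    allocation_map[this] = size;
  }
  ~TrackedBuffer()
  {
    allocation_map.erase(this);
  }

  /* Moving would leave a stale key in the map, so forbid it along with copying. */
  TrackedBuffer(const TrackedBuffer &) = delete;
  TrackedBuffer(TrackedBuffer &&) = delete;
  TrackedBuffer &operator=(const TrackedBuffer &) = delete;
  TrackedBuffer &operator=(TrackedBuffer &&) = delete;
};

/* True if every buffer is still registered under its current address. */
static bool all_registered(const std::vector<std::unique_ptr<TrackedBuffer>> &buffers)
{
  for (const std::unique_ptr<TrackedBuffer> &buffer : buffers) {
    if (allocation_map.find(buffer.get()) == allocation_map.end()) {
      return false;
    }
  }
  return true;
}

/* Grow a vector through several reallocations and check the map afterwards. */
static bool registry_survives_growth(size_t count)
{
  std::vector<std::unique_ptr<TrackedBuffer>> buffers;
  for (size_t i = 0; i < count; i++) {
    buffers.push_back(std::make_unique<TrackedBuffer>(i + 1));
  }
  return all_registered(buffers) && allocation_map.size() == count;
}
```

Storing `TrackedBuffer` by value in a `std::vector` and calling `push_back` would not compile, because growing the vector needs the deleted move or copy constructor.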

@@ -44,14 +44,14 @@
CCL_NAMESPACE_BEGIN CCL_NAMESPACE_BEGIN
OptiXDevice::Denoiser::Denoiser(OptiXDevice *device) OptiXDevice::Denoiser::Denoiser(OptiXDevice *device)
: device(device), queue(device), state(device, "__denoiser_state") : device(device), queue(device), state(device, "__denoiser_state", true)
{ {
} }
OptiXDevice::OptiXDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler) OptiXDevice::OptiXDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
: CUDADevice(info, stats, profiler), : CUDADevice(info, stats, profiler),
sbt_data(this, "__sbt", MEM_READ_ONLY), sbt_data(this, "__sbt", MEM_READ_ONLY),
launch_params(this, "__params"), launch_params(this, "__params", false),
denoiser_(this) denoiser_(this)
{ {
/* Make the CUDA context current. */ /* Make the CUDA context current. */
@@ -507,7 +507,7 @@ class OptiXDevice::DenoiseContext {
: denoise_params(task.params), : denoise_params(task.params),
render_buffers(task.render_buffers), render_buffers(task.render_buffers),
buffer_params(task.buffer_params), buffer_params(task.buffer_params),
guiding_buffer(device, "denoiser guiding passes buffer"), guiding_buffer(device, "denoiser guiding passes buffer", true),
num_samples(task.num_samples) num_samples(task.num_samples)
{ {
num_input_passes = 1; num_input_passes = 1;
@@ -522,9 +522,9 @@ class OptiXDevice::DenoiseContext {
} }
} }
const int num_guiding_passes = num_input_passes - 1; use_guiding_passes = (num_input_passes - 1) > 0;
if (num_guiding_passes) { if (use_guiding_passes) {
if (task.allow_inplace_modification) { if (task.allow_inplace_modification) {
guiding_params.device_pointer = render_buffers->buffer.device_pointer; guiding_params.device_pointer = render_buffers->buffer.device_pointer;
@@ -577,6 +577,7 @@ class OptiXDevice::DenoiseContext {
/* Number of input passes. Including the color and extra auxiliary passes. */ /* Number of input passes. Including the color and extra auxiliary passes. */
int num_input_passes = 0; int num_input_passes = 0;
bool use_guiding_passes = false;
bool use_pass_albedo = false; bool use_pass_albedo = false;
bool use_pass_normal = false; bool use_pass_normal = false;
@@ -708,7 +709,7 @@ void OptiXDevice::denoise_pass(DenoiseContext &context, PassType pass_type)
return; return;
} }
} }
else if (!context.albedo_replaced_with_fake) { else if (context.use_guiding_passes && !context.albedo_replaced_with_fake) {
context.albedo_replaced_with_fake = true; context.albedo_replaced_with_fake = true;
if (!denoise_filter_guiding_set_fake_albedo(context)) { if (!denoise_filter_guiding_set_fake_albedo(context)) {
LOG(ERROR) << "Error replacing real albedo with the fake one."; LOG(ERROR) << "Error replacing real albedo with the fake one.";
@@ -886,8 +887,7 @@ bool OptiXDevice::denoise_configure_if_needed(DenoiseContext &context)
denoiser_.scratch_offset = sizes.stateSizeInBytes; denoiser_.scratch_offset = sizes.stateSizeInBytes;
/* Allocate denoiser state if tile size has changed since last setup. */ /* Allocate denoiser state if tile size has changed since last setup. */
denoiser_.state.alloc_to_device(denoiser_.scratch_offset + denoiser_.scratch_size + denoiser_.state.alloc_to_device(denoiser_.scratch_offset + denoiser_.scratch_size);
sizeof(float));
/* Initialize denoiser state for the current tile size. */ /* Initialize denoiser state for the current tile size. */
const OptixResult result = optixDenoiserSetup( const OptixResult result = optixDenoiserSetup(
@@ -971,16 +971,6 @@ bool OptiXDevice::denoise_run(DenoiseContext &context, const DenoisePass &pass)
/* Finally run denoising. */ /* Finally run denoising. */
OptixDenoiserParams params = {}; /* All parameters are disabled/zero. */ OptixDenoiserParams params = {}; /* All parameters are disabled/zero. */
params.hdrIntensity = denoiser_.state.device_pointer + denoiser_.scratch_offset +
denoiser_.scratch_size;
optix_assert(
optixDenoiserComputeIntensity(denoiser_.optix_denoiser,
denoiser_.queue.stream(),
&color_layer,
params.hdrIntensity,
denoiser_.state.device_pointer + denoiser_.scratch_offset,
denoiser_.scratch_size));
OptixDenoiserLayer image_layers = {}; OptixDenoiserLayer image_layers = {};
image_layers.input = color_layer; image_layers.input = color_layer;
@@ -1011,6 +1001,13 @@ bool OptiXDevice::build_optix_bvh(BVHOptiX *bvh,
const OptixBuildInput &build_input, const OptixBuildInput &build_input,
uint16_t num_motion_steps) uint16_t num_motion_steps)
{ {
/* Allocate and build acceleration structures only one at a time, to prevent parallel builds
* from running out of memory (since both original and compacted acceleration structure memory
* may be allocated at the same time for the duration of this function). The builds would
* otherwise happen on the same CUDA stream anyway. */
static thread_mutex mutex;
thread_scoped_lock lock(mutex);
const CUDAContextScope scope(this); const CUDAContextScope scope(this);
const bool use_fast_trace_bvh = (bvh->params.bvh_type == BVH_TYPE_STATIC); const bool use_fast_trace_bvh = (bvh->params.bvh_type == BVH_TYPE_STATIC);
@@ -1036,14 +1033,15 @@ bool OptiXDevice::build_optix_bvh(BVHOptiX *bvh,
optix_assert(optixAccelComputeMemoryUsage(context, &options, &build_input, 1, &sizes)); optix_assert(optixAccelComputeMemoryUsage(context, &options, &build_input, 1, &sizes));
/* Allocate required output buffers. */ /* Allocate required output buffers. */
device_only_memory<char> temp_mem(this, "optix temp as build mem"); device_only_memory<char> temp_mem(this, "optix temp as build mem", true);
temp_mem.alloc_to_device(align_up(sizes.tempSizeInBytes, 8) + 8); temp_mem.alloc_to_device(align_up(sizes.tempSizeInBytes, 8) + 8);
if (!temp_mem.device_pointer) { if (!temp_mem.device_pointer) {
/* Make sure temporary memory allocation succeeded. */ /* Make sure temporary memory allocation succeeded. */
return false; return false;
} }
device_only_memory<char> &out_data = bvh->as_data; /* Acceleration structure memory has to be allocated on the device (not allowed on the host). */
device_only_memory<char> &out_data = *bvh->as_data;
if (operation == OPTIX_BUILD_OPERATION_BUILD) { if (operation == OPTIX_BUILD_OPERATION_BUILD) {
assert(out_data.device == this); assert(out_data.device == this);
out_data.alloc_to_device(sizes.outputSizeInBytes); out_data.alloc_to_device(sizes.outputSizeInBytes);
@@ -1091,12 +1089,13 @@ bool OptiXDevice::build_optix_bvh(BVHOptiX *bvh,
/* There is no point compacting if the size does not change. */ /* There is no point compacting if the size does not change. */
if (compacted_size < sizes.outputSizeInBytes) { if (compacted_size < sizes.outputSizeInBytes) {
device_only_memory<char> compacted_data(this, "optix compacted as"); device_only_memory<char> compacted_data(this, "optix compacted as", false);
compacted_data.alloc_to_device(compacted_size); compacted_data.alloc_to_device(compacted_size);
if (!compacted_data.device_pointer) if (!compacted_data.device_pointer) {
/* Do not compact if memory allocation for compacted acceleration structure fails. /* Do not compact if memory allocation for compacted acceleration structure fails.
* Can just use the uncompacted one then, so succeed here regardless. */ * Can just use the uncompacted one then, so succeed here regardless. */
return !have_error(); return !have_error();
}
optix_assert(optixAccelCompact( optix_assert(optixAccelCompact(
context, NULL, out_handle, compacted_data.device_pointer, compacted_size, &out_handle)); context, NULL, out_handle, compacted_data.device_pointer, compacted_size, &out_handle));
@@ -1107,6 +1106,8 @@ bool OptiXDevice::build_optix_bvh(BVHOptiX *bvh,
std::swap(out_data.device_size, compacted_data.device_size); std::swap(out_data.device_size, compacted_data.device_size);
std::swap(out_data.device_pointer, compacted_data.device_pointer); std::swap(out_data.device_pointer, compacted_data.device_pointer);
/* Original acceleration structure memory is freed when 'compacted_data' goes out of scope.
*/
} }
} }
@@ -1134,7 +1135,7 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
operation = OPTIX_BUILD_OPERATION_UPDATE; operation = OPTIX_BUILD_OPERATION_UPDATE;
} }
else { else {
bvh_optix->as_data.free(); bvh_optix->as_data->free();
bvh_optix->traversable_handle = 0; bvh_optix->traversable_handle = 0;
} }
@@ -1195,7 +1196,7 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
const float4 pw = make_float4( const float4 pw = make_float4(
curve_radius[ka], curve_radius[k0], curve_radius[k1], curve_radius[kb]); curve_radius[ka], curve_radius[k0], curve_radius[k1], curve_radius[kb]);
/* Convert Catmull-Rom data to Bezier spline. */ /* Convert Catmull-Rom data to B-spline. */
static const float4 cr2bsp0 = make_float4(+7, -4, +5, -2) / 6.f; static const float4 cr2bsp0 = make_float4(+7, -4, +5, -2) / 6.f;
static const float4 cr2bsp1 = make_float4(-2, 11, -4, +1) / 6.f; static const float4 cr2bsp1 = make_float4(-2, 11, -4, +1) / 6.f;
static const float4 cr2bsp2 = make_float4(+1, -4, 11, -2) / 6.f; static const float4 cr2bsp2 = make_float4(+1, -4, 11, -2) / 6.f;
@@ -1355,9 +1356,9 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
unsigned int num_instances = 0; unsigned int num_instances = 0;
unsigned int max_num_instances = 0xFFFFFFFF; unsigned int max_num_instances = 0xFFFFFFFF;
bvh_optix->as_data.free(); bvh_optix->as_data->free();
bvh_optix->traversable_handle = 0; bvh_optix->traversable_handle = 0;
bvh_optix->motion_transform_data.free(); bvh_optix->motion_transform_data->free();
optixDeviceContextGetProperty(context, optixDeviceContextGetProperty(context,
OPTIX_DEVICE_PROPERTY_LIMIT_MAX_INSTANCE_ID, OPTIX_DEVICE_PROPERTY_LIMIT_MAX_INSTANCE_ID,
@@ -1390,8 +1391,8 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
} }
} }
assert(bvh_optix->motion_transform_data.device == this); assert(bvh_optix->motion_transform_data->device == this);
bvh_optix->motion_transform_data.alloc_to_device(total_motion_transform_size); bvh_optix->motion_transform_data->alloc_to_device(total_motion_transform_size);
} }
for (Object *ob : bvh->objects) { for (Object *ob : bvh->objects) {
@@ -1452,7 +1453,7 @@ void OptiXDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
motion_transform_offset = align_up(motion_transform_offset, motion_transform_offset = align_up(motion_transform_offset,
OPTIX_TRANSFORM_BYTE_ALIGNMENT); OPTIX_TRANSFORM_BYTE_ALIGNMENT);
CUdeviceptr motion_transform_gpu = bvh_optix->motion_transform_data.device_pointer + CUdeviceptr motion_transform_gpu = bvh_optix->motion_transform_data->device_pointer +
motion_transform_offset; motion_transform_offset;
motion_transform_offset += motion_transform_size; motion_transform_offset += motion_transform_size;

@@ -23,6 +23,7 @@
# include "device/optix/queue.h" # include "device/optix/queue.h"
# include "device/optix/util.h" # include "device/optix/util.h"
# include "kernel/types.h" # include "kernel/types.h"
# include "util/unique_ptr.h"
CCL_NAMESPACE_BEGIN CCL_NAMESPACE_BEGIN
@@ -76,7 +77,7 @@ class OptiXDevice : public CUDADevice {
device_only_memory<KernelParamsOptiX> launch_params; device_only_memory<KernelParamsOptiX> launch_params;
OptixTraversableHandle tlas_handle = 0; OptixTraversableHandle tlas_handle = 0;
vector<device_only_memory<char>> delayed_free_bvh_memory; vector<unique_ptr<device_only_memory<char>>> delayed_free_bvh_memory;
thread_mutex delayed_free_bvh_mutex; thread_mutex delayed_free_bvh_mutex;
class Denoiser { class Denoiser {

@@ -73,7 +73,8 @@ bool OptiXDeviceQueue::enqueue(DeviceKernel kernel, const int work_size, void *a
sizeof(device_ptr), sizeof(device_ptr),
cuda_stream_)); cuda_stream_));
if (kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE) { if (kernel == DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST ||
kernel == DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE) {
cuda_device_assert( cuda_device_assert(
cuda_device_, cuda_device_,
cuMemcpyHtoDAsync(launch_params_ptr + offsetof(KernelParamsOptiX, render_buffer), cuMemcpyHtoDAsync(launch_params_ptr + offsetof(KernelParamsOptiX, render_buffer),

@@ -33,7 +33,10 @@ unique_ptr<Denoiser> Denoiser::create(Device *path_trace_device, const DenoisePa
return make_unique<OptiXDenoiser>(path_trace_device, params); return make_unique<OptiXDenoiser>(path_trace_device, params);
} }
return make_unique<OIDNDenoiser>(path_trace_device, params); /* Always fallback to OIDN. */
DenoiseParams oidn_params = params;
oidn_params.type = DENOISER_OPENIMAGEDENOISE;
return make_unique<OIDNDenoiser>(path_trace_device, oidn_params);
} }
Denoiser::Denoiser(Device *path_trace_device, const DenoiseParams &params) Denoiser::Denoiser(Device *path_trace_device, const DenoiseParams &params)

@@ -47,9 +47,6 @@ static bool oidn_progress_monitor_function(void *user_ptr, double /*n*/)
OIDNDenoiser *oidn_denoiser = reinterpret_cast<OIDNDenoiser *>(user_ptr); OIDNDenoiser *oidn_denoiser = reinterpret_cast<OIDNDenoiser *>(user_ptr);
return !oidn_denoiser->is_cancelled(); return !oidn_denoiser->is_cancelled();
} }
#endif
#ifdef WITH_OPENIMAGEDENOISE
class OIDNPass { class OIDNPass {
public: public:
@@ -547,7 +544,6 @@ class OIDNDenoiseContext {
* the fake values and denoising of passes which do need albedo can no longer happen. */ * the fake values and denoising of passes which do need albedo can no longer happen. */
bool albedo_replaced_with_fake_ = false; bool albedo_replaced_with_fake_ = false;
}; };
#endif
static unique_ptr<DeviceQueue> create_device_queue(const RenderBuffers *render_buffers) static unique_ptr<DeviceQueue> create_device_queue(const RenderBuffers *render_buffers)
{ {
@@ -582,18 +578,20 @@ static void copy_render_buffers_to_device(unique_ptr<DeviceQueue> &queue,
} }
} }
#endif
bool OIDNDenoiser::denoise_buffer(const BufferParams &buffer_params, bool OIDNDenoiser::denoise_buffer(const BufferParams &buffer_params,
RenderBuffers *render_buffers, RenderBuffers *render_buffers,
const int num_samples, const int num_samples,
bool allow_inplace_modification) bool allow_inplace_modification)
{ {
#ifdef WITH_OPENIMAGEDENOISE
thread_scoped_lock lock(mutex_); thread_scoped_lock lock(mutex_);
/* Make sure the host-side data is available for denoising. */ /* Make sure the host-side data is available for denoising. */
unique_ptr<DeviceQueue> queue = create_device_queue(render_buffers); unique_ptr<DeviceQueue> queue = create_device_queue(render_buffers);
copy_render_buffers_from_device(queue, render_buffers); copy_render_buffers_from_device(queue, render_buffers);
#ifdef WITH_OPENIMAGEDENOISE
OIDNDenoiseContext context( OIDNDenoiseContext context(
this, params_, buffer_params, render_buffers, num_samples, allow_inplace_modification); this, params_, buffer_params, render_buffers, num_samples, allow_inplace_modification);
@@ -620,6 +618,11 @@ bool OIDNDenoiser::denoise_buffer(const BufferParams &buffer_params,
* copies data from the device it doesn't overwrite the denoiser buffers. */ * copies data from the device it doesn't overwrite the denoiser buffers. */
copy_render_buffers_to_device(queue, render_buffers); copy_render_buffers_to_device(queue, render_buffers);
} }
#else
(void)buffer_params;
(void)render_buffers;
(void)num_samples;
(void)allow_inplace_modification;
#endif #endif
/* This code is not supposed to run when compiled without OIDN support, so can assume if we made /* This code is not supposed to run when compiled without OIDN support, so can assume if we made

@@ -296,13 +296,13 @@ static BufferParams scale_buffer_params(const BufferParams &params, int resoluti
scaled_params.window_x = params.window_x / resolution_divider; scaled_params.window_x = params.window_x / resolution_divider;
scaled_params.window_y = params.window_y / resolution_divider; scaled_params.window_y = params.window_y / resolution_divider;
scaled_params.window_width = params.window_width / resolution_divider; scaled_params.window_width = max(1, params.window_width / resolution_divider);
scaled_params.window_height = params.window_height / resolution_divider; scaled_params.window_height = max(1, params.window_height / resolution_divider);
scaled_params.full_x = params.full_x / resolution_divider; scaled_params.full_x = params.full_x / resolution_divider;
scaled_params.full_y = params.full_y / resolution_divider; scaled_params.full_y = params.full_y / resolution_divider;
scaled_params.full_width = params.full_width / resolution_divider; scaled_params.full_width = max(1, params.full_width / resolution_divider);
scaled_params.full_height = params.full_height / resolution_divider; scaled_params.full_height = max(1, params.full_height / resolution_divider);
scaled_params.update_offset_stride(); scaled_params.update_offset_stride();
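
The hunk above clamps every scaled width and height to at least 1, so a large resolution divider can no longer produce a zero-sized buffer. A minimal sketch of that rule, assuming a hypothetical helper `scale_dimension` that is not part of Cycles:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper for illustration only; Cycles does this inline in
// scale_buffer_params(). Integer division truncates toward zero, so any
// size smaller than the divider would become 0 without the clamp.
int scale_dimension(int size, int resolution_divider)
{
  assert(size > 0 && resolution_divider > 0);
  return std::max(1, size / resolution_divider);
}
```

With this helper, a 3 pixel wide window scaled by a divider of 8 stays 1 pixel wide instead of collapsing to 0. Offsets such as `window_x` are left unclamped in the diff, since 0 is a valid offset.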
@@ -479,7 +479,11 @@ void PathTrace::set_denoiser_params(const DenoiseParams &params)
} }
denoiser_ = Denoiser::create(device_, params); denoiser_ = Denoiser::create(device_, params);
denoiser_->is_cancelled_cb = [this]() { return is_cancel_requested(); };
/* Only take into account the "immediate" cancel to have interactive rendering responding to
* navigation as quickly as possible, but allow to run denoiser after user hit Esc button while
* doing offline rendering. */
denoiser_->is_cancelled_cb = [this]() { return render_cancel_.is_requested; };
} }
void PathTrace::set_adaptive_sampling(const AdaptiveSampling &adaptive_sampling) void PathTrace::set_adaptive_sampling(const AdaptiveSampling &adaptive_sampling)
@@ -847,7 +851,8 @@ void PathTrace::progress_update_if_needed(const RenderWork &render_work)
{ {
if (progress_ != nullptr) { if (progress_ != nullptr) {
const int2 tile_size = get_render_tile_size(); const int2 tile_size = get_render_tile_size();
const int num_samples_added = tile_size.x * tile_size.y * render_work.path_trace.num_samples; const uint64_t num_samples_added = uint64_t(tile_size.x) * tile_size.y *
render_work.path_trace.num_samples;
const int current_sample = render_work.path_trace.start_sample + const int current_sample = render_work.path_trace.start_sample +
render_work.path_trace.num_samples; render_work.path_trace.num_samples;
progress_->add_samples(num_samples_added, current_sample); progress_->add_samples(num_samples_added, current_sample);
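
The change above widens the sample count to 64 bits because the product of tile width, tile height and sample count can exceed the range of a 32-bit `int`. A minimal sketch of the same arithmetic, assuming a hypothetical free function `count_samples_added` that is not part of Cycles:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only. Converting the first operand
// to uint64_t makes the whole product evaluate in 64 bits; converting only
// the final result would be too late, the overflow would already have happened.
uint64_t count_samples_added(int tile_width, int tile_height, int num_samples)
{
  assert(tile_width >= 0 && tile_height >= 0 && num_samples >= 0);
  return uint64_t(tile_width) * tile_height * num_samples;
}
```

For example, an 8192 x 8192 tile with 64 samples gives 2^32 samples, which does not fit in a signed 32-bit integer.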

@@ -77,8 +77,10 @@ void PathTraceWorkCPU::render_samples(RenderStatistics &statistics,
const int64_t image_height = effective_buffer_params_.height; const int64_t image_height = effective_buffer_params_.height;
const int64_t total_pixels_num = image_width * image_height; const int64_t total_pixels_num = image_width * image_height;
for (CPUKernelThreadGlobals &kernel_globals : kernel_thread_globals_) { if (device_->profiler.active()) {
kernel_globals.start_profiling(); for (CPUKernelThreadGlobals &kernel_globals : kernel_thread_globals_) {
kernel_globals.start_profiling();
}
} }
tbb::task_arena local_arena = local_tbb_arena_create(device_); tbb::task_arena local_arena = local_tbb_arena_create(device_);
@@ -106,9 +108,10 @@ void PathTraceWorkCPU::render_samples(RenderStatistics &statistics,
render_samples_full_pipeline(kernel_globals, work_tile, samples_num); render_samples_full_pipeline(kernel_globals, work_tile, samples_num);
}); });
}); });
if (device_->profiler.active()) {
for (CPUKernelThreadGlobals &kernel_globals : kernel_thread_globals_) { for (CPUKernelThreadGlobals &kernel_globals : kernel_thread_globals_) {
kernel_globals.stop_profiling(); kernel_globals.stop_profiling();
}
} }
statistics.occupancy = 1.0f; statistics.occupancy = 1.0f;

@@ -257,7 +257,8 @@ void PathTraceWorkGPU::render_samples(RenderStatistics &statistics,
* become busy after adding new tiles). This is especially important for the shadow catcher which * become busy after adding new tiles). This is especially important for the shadow catcher which
* schedules work in halves of available number of paths. */ * schedules work in halves of available number of paths. */
work_tile_scheduler_.set_max_num_path_states(max_num_paths_ / 8); work_tile_scheduler_.set_max_num_path_states(max_num_paths_ / 8);
work_tile_scheduler_.set_accelerated_rt((device_->get_bvh_layout_mask() & BVH_LAYOUT_OPTIX) !=
0);
work_tile_scheduler_.reset(effective_buffer_params_, work_tile_scheduler_.reset(effective_buffer_params_,
start_sample, start_sample,
samples_num, samples_num,
@@ -437,7 +438,15 @@ void PathTraceWorkGPU::enqueue_path_iteration(DeviceKernel kernel, const int num
DCHECK_LE(work_size, max_num_paths_); DCHECK_LE(work_size, max_num_paths_);
switch (kernel) { switch (kernel) {
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST: case DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST: {
/* Closest ray intersection kernels with integrator state and render buffer. */
void *d_render_buffer = (void *)buffers_->buffer.device_pointer;
void *args[] = {&d_path_index, &d_render_buffer, const_cast<int *>(&work_size)};
queue_->enqueue(kernel, work_size, args);
break;
}
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW: case DEVICE_KERNEL_INTEGRATOR_INTERSECT_SHADOW:
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE: case DEVICE_KERNEL_INTEGRATOR_INTERSECT_SUBSURFACE:
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK: { case DEVICE_KERNEL_INTEGRATOR_INTERSECT_VOLUME_STACK: {

@@ -827,6 +827,26 @@ int RenderScheduler::get_num_samples_to_path_trace() const
num_samples_to_occupy = lround(state_.occupancy_num_samples * 0.7f / state_.occupancy); num_samples_to_occupy = lround(state_.occupancy_num_samples * 0.7f / state_.occupancy);
} }
/* When time limit is used clamp the calculated number of samples to keep occupancy.
* This is because time limit causes the last render iteration to happen with less number of
* samples, which conflicts with the occupancy (lower number of samples causes lower
* occupancy, also the calculation is based on number of previously rendered samples).
*
* When time limit is not used the number of samples per render iteration is either increasing
* or stays the same, so there is no need to clamp number of samples calculated for occupancy.
*/
if (time_limit_ && state_.start_render_time) {
const double remaining_render_time = max(
0.0, time_limit_ - (time_dt() - state_.start_render_time));
const double time_per_sample_average = path_trace_time_.get_average();
const double predicted_render_time = num_samples_to_occupy * time_per_sample_average;
if (predicted_render_time > remaining_render_time) {
num_samples_to_occupy = lround(num_samples_to_occupy *
(remaining_render_time / predicted_render_time));
}
}
num_samples_to_render = max(num_samples_to_render, num_samples_to_render = max(num_samples_to_render,
min(num_samples_to_occupy, max_num_samples_to_render)); min(num_samples_to_occupy, max_num_samples_to_render));
} }
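
The added block scales the occupancy-driven sample count down when the predicted render time would exceed the time left before the limit. A minimal sketch of that clamp in isolation, assuming a hypothetical function `clamp_samples_to_time_limit` with the scheduler state passed in as plain arguments:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper for illustration only; in the diff this logic lives
// inside RenderScheduler::get_num_samples_to_path_trace().
int clamp_samples_to_time_limit(int num_samples_to_occupy,
                                double time_per_sample_average,
                                double remaining_render_time)
{
  assert(num_samples_to_occupy >= 0 && time_per_sample_average >= 0.0);
  /* The remaining time is never negative, matching the max(0.0, ...) in the diff. */
  remaining_render_time = std::max(0.0, remaining_render_time);
  const double predicted_render_time = num_samples_to_occupy * time_per_sample_average;
  /* predicted > remaining >= 0 implies predicted > 0, so the division is safe. */
  if (predicted_render_time > remaining_render_time) {
    return int(std::lround(num_samples_to_occupy *
                           (remaining_render_time / predicted_render_time)));
  }
  return num_samples_to_occupy;
}
```

With 100 samples at 0.5 s each and 10 s remaining, the prediction is 50 s, so the count is scaled by 10/50. When the prediction fits in the remaining time the count is returned unchanged.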

@@ -46,7 +46,8 @@ ccl_device_inline uint round_up_to_power_of_two(uint x)
return next_power_of_two(x); return next_power_of_two(x);
} }
TileSize tile_calculate_best_size(const int2 &image_size, TileSize tile_calculate_best_size(const bool accel_rt,
const int2 &image_size,
const int num_samples, const int num_samples,
const int max_num_path_states, const int max_num_path_states,
const float scrambling_distance) const float scrambling_distance)
@@ -73,7 +74,7 @@ TileSize tile_calculate_best_size(const int2 &image_size,
TileSize tile_size; TileSize tile_size;
const int num_path_states_per_sample = max_num_path_states / num_samples; const int num_path_states_per_sample = max_num_path_states / num_samples;
if (scrambling_distance < 0.9f) { if (scrambling_distance < 0.9f && accel_rt) {
/* Prefer large tiles for scrambling distance, bounded by max num path states. */ /* Prefer large tiles for scrambling distance, bounded by max num path states. */
tile_size.width = min(image_size.x, max_num_path_states); tile_size.width = min(image_size.x, max_num_path_states);
tile_size.height = min(image_size.y, max(max_num_path_states / tile_size.width, 1)); tile_size.height = min(image_size.y, max(max_num_path_states / tile_size.width, 1));
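
The large-tile branch shown above is now taken only when scrambling distance is in use and the device reports accelerated ray tracing. A minimal sketch of just the width and height computation from that branch, assuming a hypothetical `TileExtent` struct and function name; the rest of `tile_calculate_best_size` is not visible in this hunk and is not reproduced:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical type for illustration only; the real TileSize has more fields.
struct TileExtent {
  int width;
  int height;
};

TileExtent large_tile_extent(int image_width, int image_height, int max_num_path_states)
{
  assert(image_width > 0 && image_height > 0 && max_num_path_states > 0);
  TileExtent tile;
  /* Take as much of a row as the path state budget allows. */
  tile.width = std::min(image_width, max_num_path_states);
  /* Spend the remaining budget on rows, always keeping at least one. */
  tile.height = std::min(image_height, std::max(max_num_path_states / tile.width, 1));
  return tile;
}
```

For a 1920 x 1080 image and a budget of 1048576 path states this gives a 1920 x 546 tile, which keeps `width * height` within the budget.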

@@ -49,7 +49,8 @@ std::ostream &operator<<(std::ostream &os, const TileSize &tile_size);
* of active path states. * of active path states.
* Will attempt to provide best guess to keep path tracing threads of a device as localized as * Will attempt to provide best guess to keep path tracing threads of a device as localized as
* possible, and have as many threads active for every tile as possible. */ * possible, and have as many threads active for every tile as possible. */
TileSize tile_calculate_best_size(const int2 &image_size, TileSize tile_calculate_best_size(const bool accel_rt,
const int2 &image_size,
const int num_samples, const int num_samples,
const int max_num_path_states, const int max_num_path_states,
const float scrambling_distance); const float scrambling_distance);

@@ -28,6 +28,11 @@ WorkTileScheduler::WorkTileScheduler()
{ {
} }
void WorkTileScheduler::set_accelerated_rt(bool accelerated_rt)
{
accelerated_rt_ = accelerated_rt;
}
void WorkTileScheduler::set_max_num_path_states(int max_num_path_states) void WorkTileScheduler::set_max_num_path_states(int max_num_path_states)
{ {
max_num_path_states_ = max_num_path_states; max_num_path_states_ = max_num_path_states;
@@ -59,7 +64,7 @@ void WorkTileScheduler::reset(const BufferParams &buffer_params,
void WorkTileScheduler::reset_scheduler_state() void WorkTileScheduler::reset_scheduler_state()
{ {
tile_size_ = tile_calculate_best_size( tile_size_ = tile_calculate_best_size(
image_size_px_, samples_num_, max_num_path_states_, scrambling_distance_); accelerated_rt_, image_size_px_, samples_num_, max_num_path_states_, scrambling_distance_);
VLOG(3) << "Will schedule tiles of size " << tile_size_; VLOG(3) << "Will schedule tiles of size " << tile_size_;

@@ -31,6 +31,9 @@ class WorkTileScheduler {
public: public:
WorkTileScheduler(); WorkTileScheduler();
/* To indicate if there is accelerated RT support. */
void set_accelerated_rt(bool state);
/* Maximum path states which are allowed to be used by a single scheduled work tile. /* Maximum path states which are allowed to be used by a single scheduled work tile.
* *
* Affects the scheduled work size: the work size will be as big as possible, but will not exceed * Affects the scheduled work size: the work size will be as big as possible, but will not exceed
@@ -54,6 +57,9 @@ class WorkTileScheduler {
protected: protected:
void reset_scheduler_state(); void reset_scheduler_state();
/* Used to indicate if there is accelerated ray tracing. */
bool accelerated_rt_ = false;
/* Maximum allowed path states to be used. /* Maximum allowed path states to be used.
* *
* TODO(sergey): Naming can be improved. The fact that this is a limiting factor based on the * TODO(sergey): Naming can be improved. The fact that this is a limiting factor based on the

@@ -39,10 +39,6 @@ set(SRC_KERNEL_DEVICE_HIP
device/hip/kernel.cpp device/hip/kernel.cpp
) )
set(SRC_KERNEL_DEVICE_METAL
device/metal/kernel.metal
)
set(SRC_KERNEL_DEVICE_OPTIX set(SRC_KERNEL_DEVICE_OPTIX
device/optix/kernel.cu device/optix/kernel.cu
device/optix/kernel_shader_raytrace.cu device/optix/kernel_shader_raytrace.cu
@@ -83,13 +79,6 @@ set(SRC_KERNEL_DEVICE_OPTIX_HEADERS
device/optix/globals.h device/optix/globals.h
) )
set(SRC_KERNEL_DEVICE_METAL_HEADERS
device/metal/compat.h
device/metal/context_begin.h
device/metal/context_end.h
device/metal/globals.h
)
set(SRC_KERNEL_CLOSURE_HEADERS set(SRC_KERNEL_CLOSURE_HEADERS
closure/alloc.h closure/alloc.h
closure/bsdf.h closure/bsdf.h
@@ -576,6 +565,12 @@ if(WITH_CYCLES_HIP_BINARIES AND WITH_CYCLES_DEVICE_HIP)
set(name ${name}_experimental) set(name ${name}_experimental)
endif() endif()
if(WITH_NANOVDB)
set(hip_flags ${hip_flags}
-D WITH_NANOVDB
-I "${NANOVDB_INCLUDE_DIR}")
endif()
if(WITH_CYCLES_DEBUG) if(WITH_CYCLES_DEBUG)
set(hip_flags ${hip_flags} -D __KERNEL_DEBUG__) set(hip_flags ${hip_flags} -D __KERNEL_DEBUG__)
endif() endif()
@@ -734,14 +729,12 @@ cycles_add_library(cycles_kernel "${LIB}"
${SRC_KERNEL_DEVICE_CUDA} ${SRC_KERNEL_DEVICE_CUDA}
${SRC_KERNEL_DEVICE_HIP} ${SRC_KERNEL_DEVICE_HIP}
${SRC_KERNEL_DEVICE_OPTIX} ${SRC_KERNEL_DEVICE_OPTIX}
${SRC_KERNEL_DEVICE_METAL}
${SRC_KERNEL_HEADERS} ${SRC_KERNEL_HEADERS}
${SRC_KERNEL_DEVICE_CPU_HEADERS} ${SRC_KERNEL_DEVICE_CPU_HEADERS}
${SRC_KERNEL_DEVICE_GPU_HEADERS} ${SRC_KERNEL_DEVICE_GPU_HEADERS}
${SRC_KERNEL_DEVICE_CUDA_HEADERS} ${SRC_KERNEL_DEVICE_CUDA_HEADERS}
${SRC_KERNEL_DEVICE_HIP_HEADERS} ${SRC_KERNEL_DEVICE_HIP_HEADERS}
${SRC_KERNEL_DEVICE_OPTIX_HEADERS} ${SRC_KERNEL_DEVICE_OPTIX_HEADERS}
${SRC_KERNEL_DEVICE_METAL_HEADERS}
) )
source_group("bake" FILES ${SRC_KERNEL_BAKE_HEADERS}) source_group("bake" FILES ${SRC_KERNEL_BAKE_HEADERS})
@@ -753,7 +746,6 @@ source_group("device\\cuda" FILES ${SRC_KERNEL_DEVICE_CUDA} ${SRC_KERNEL_DEVICE_
source_group("device\\gpu" FILES ${SRC_KERNEL_DEVICE_GPU_HEADERS}) source_group("device\\gpu" FILES ${SRC_KERNEL_DEVICE_GPU_HEADERS})
source_group("device\\hip" FILES ${SRC_KERNEL_DEVICE_HIP} ${SRC_KERNEL_DEVICE_HIP_HEADERS}) source_group("device\\hip" FILES ${SRC_KERNEL_DEVICE_HIP} ${SRC_KERNEL_DEVICE_HIP_HEADERS})
source_group("device\\optix" FILES ${SRC_KERNEL_DEVICE_OPTIX} ${SRC_KERNEL_DEVICE_OPTIX_HEADERS}) source_group("device\\optix" FILES ${SRC_KERNEL_DEVICE_OPTIX} ${SRC_KERNEL_DEVICE_OPTIX_HEADERS})
source_group("device\\metal" FILES ${SRC_KERNEL_DEVICE_METAL} ${SRC_KERNEL_DEVICE_METAL_HEADERS})
source_group("film" FILES ${SRC_KERNEL_FILM_HEADERS}) source_group("film" FILES ${SRC_KERNEL_FILM_HEADERS})
source_group("geom" FILES ${SRC_KERNEL_GEOM_HEADERS}) source_group("geom" FILES ${SRC_KERNEL_GEOM_HEADERS})
source_group("integrator" FILES ${SRC_KERNEL_INTEGRATOR_HEADERS}) source_group("integrator" FILES ${SRC_KERNEL_INTEGRATOR_HEADERS})
@@ -786,8 +778,6 @@ delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_DEVICE_HIP}" ${CYCLES_
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_DEVICE_HIP_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/device/hip) delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_DEVICE_HIP_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/device/hip)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_DEVICE_OPTIX}" ${CYCLES_INSTALL_PATH}/source/kernel/device/optix) delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_DEVICE_OPTIX}" ${CYCLES_INSTALL_PATH}/source/kernel/device/optix)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_DEVICE_OPTIX_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/device/optix) delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_DEVICE_OPTIX_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/device/optix)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_DEVICE_METAL}" ${CYCLES_INSTALL_PATH}/source/kernel/device/metal)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_DEVICE_METAL_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/device/metal)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_FILM_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/film) delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_FILM_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/film)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_GEOM_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/geom) delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_GEOM_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/geom)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_INTEGRATOR_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/integrator) delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNEL_INTEGRATOR_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/integrator)

@@ -438,7 +438,7 @@ ccl_device_inline int bsdf_sample(KernelGlobals kg,
if (label & LABEL_TRANSMIT) { if (label & LABEL_TRANSMIT) {
float threshold_squared = kernel_data.background.transparent_roughness_squared_threshold; float threshold_squared = kernel_data.background.transparent_roughness_squared_threshold;
if (threshold_squared >= 0.0f) { if (threshold_squared >= 0.0f && !(label & LABEL_DIFFUSE)) {
if (bsdf_get_specular_roughness_squared(sc) <= threshold_squared) { if (bsdf_get_specular_roughness_squared(sc) <= threshold_squared) {
label |= LABEL_TRANSMIT_TRANSPARENT; label |= LABEL_TRANSMIT_TRANSPARENT;
} }

@@ -37,7 +37,7 @@
KERNEL_INTEGRATOR_INIT_FUNCTION(init_from_camera); KERNEL_INTEGRATOR_INIT_FUNCTION(init_from_camera);
KERNEL_INTEGRATOR_INIT_FUNCTION(init_from_bake); KERNEL_INTEGRATOR_INIT_FUNCTION(init_from_bake);
KERNEL_INTEGRATOR_FUNCTION(intersect_closest); KERNEL_INTEGRATOR_SHADE_FUNCTION(intersect_closest);
KERNEL_INTEGRATOR_FUNCTION(intersect_shadow); KERNEL_INTEGRATOR_FUNCTION(intersect_shadow);
KERNEL_INTEGRATOR_FUNCTION(intersect_subsurface); KERNEL_INTEGRATOR_FUNCTION(intersect_subsurface);
KERNEL_INTEGRATOR_FUNCTION(intersect_volume_stack); KERNEL_INTEGRATOR_FUNCTION(intersect_volume_stack);

@@ -112,7 +112,7 @@ CCL_NAMESPACE_BEGIN
DEFINE_INTEGRATOR_INIT_KERNEL(init_from_camera) DEFINE_INTEGRATOR_INIT_KERNEL(init_from_camera)
DEFINE_INTEGRATOR_INIT_KERNEL(init_from_bake) DEFINE_INTEGRATOR_INIT_KERNEL(init_from_bake)
DEFINE_INTEGRATOR_KERNEL(intersect_closest) DEFINE_INTEGRATOR_SHADE_KERNEL(intersect_closest)
DEFINE_INTEGRATOR_KERNEL(intersect_subsurface) DEFINE_INTEGRATOR_KERNEL(intersect_subsurface)
DEFINE_INTEGRATOR_KERNEL(intersect_volume_stack) DEFINE_INTEGRATOR_KERNEL(intersect_volume_stack)
DEFINE_INTEGRATOR_SHADE_KERNEL(shade_background) DEFINE_INTEGRATOR_SHADE_KERNEL(shade_background)

@@ -75,7 +75,6 @@ typedef unsigned long long uint64_t;
#define ccl_gpu_block_idx_x (blockIdx.x) #define ccl_gpu_block_idx_x (blockIdx.x)
#define ccl_gpu_grid_dim_x (gridDim.x) #define ccl_gpu_grid_dim_x (gridDim.x)
#define ccl_gpu_warp_size (warpSize) #define ccl_gpu_warp_size (warpSize)
#define ccl_gpu_thread_mask(thread_warp) uint(0xFFFFFFFF >> (ccl_gpu_warp_size - thread_warp))
#define ccl_gpu_global_id_x() (ccl_gpu_block_idx_x * ccl_gpu_block_dim_x + ccl_gpu_thread_idx_x) #define ccl_gpu_global_id_x() (ccl_gpu_block_idx_x * ccl_gpu_block_dim_x + ccl_gpu_thread_idx_x)
#define ccl_gpu_global_size_x() (ccl_gpu_grid_dim_x * ccl_gpu_block_dim_x) #define ccl_gpu_global_size_x() (ccl_gpu_grid_dim_x * ccl_gpu_block_dim_x)

@@ -92,29 +92,12 @@
/* Compute number of threads per block and minimum blocks per multiprocessor /* Compute number of threads per block and minimum blocks per multiprocessor
* given the maximum number of registers per thread. */ * given the maximum number of registers per thread. */
#define ccl_gpu_kernel(block_num_threads, thread_num_registers) \ #define ccl_gpu_kernel(block_num_threads, thread_num_registers) \
extern "C" __global__ void __launch_bounds__(block_num_threads, \ extern "C" __global__ void __launch_bounds__(block_num_threads, \
GPU_MULTIPRESSOR_MAX_REGISTERS / \ GPU_MULTIPRESSOR_MAX_REGISTERS / \
(block_num_threads * thread_num_registers)) (block_num_threads * thread_num_registers))
#define ccl_gpu_kernel_threads(block_num_threads) \
extern "C" __global__ void __launch_bounds__(block_num_threads)
#define ccl_gpu_kernel_signature(name, ...) kernel_gpu_##name(__VA_ARGS__)
#define ccl_gpu_kernel_call(x) x
/* Define a function object where "func" is the lambda body, and additional parameters are used to
* specify captured state */
#define ccl_gpu_kernel_lambda(func, ...) \
struct KernelLambda { \
__VA_ARGS__; \
__device__ int operator()(const int state) \
{ \
return (func); \
} \
} ccl_gpu_kernel_lambda_pass
/* sanity checks */ /* sanity checks */
#if GPU_KERNEL_BLOCK_NUM_THREADS > GPU_BLOCK_MAX_THREADS #if GPU_KERNEL_BLOCK_NUM_THREADS > GPU_BLOCK_MAX_THREADS

@@ -65,9 +65,7 @@ ccl_device float cubic_h1(float a)
/* Fast bicubic texture lookup using 4 bilinear lookups, adapted from CUDA samples. */ /* Fast bicubic texture lookup using 4 bilinear lookups, adapted from CUDA samples. */
template<typename T> template<typename T>
ccl_device_noinline T kernel_tex_image_interp_bicubic(ccl_global const TextureInfo &info, ccl_device_noinline T kernel_tex_image_interp_bicubic(const TextureInfo &info, float x, float y)
float x,
float y)
{ {
ccl_gpu_tex_object tex = (ccl_gpu_tex_object)info.data; ccl_gpu_tex_object tex = (ccl_gpu_tex_object)info.data;
@@ -96,7 +94,7 @@ ccl_device_noinline T kernel_tex_image_interp_bicubic(ccl_global const TextureIn
/* Fast tricubic texture lookup using 8 trilinear lookups. */ /* Fast tricubic texture lookup using 8 trilinear lookups. */
template<typename T> template<typename T>
ccl_device_noinline T ccl_device_noinline T
kernel_tex_image_interp_tricubic(ccl_global const TextureInfo &info, float x, float y, float z) kernel_tex_image_interp_tricubic(const TextureInfo &info, float x, float y, float z)
{ {
ccl_gpu_tex_object tex = (ccl_gpu_tex_object)info.data; ccl_gpu_tex_object tex = (ccl_gpu_tex_object)info.data;
@@ -171,7 +169,7 @@ ccl_device T kernel_tex_image_interp_tricubic_nanovdb(S &s, float x, float y, fl
template<typename T> template<typename T>
ccl_device_noinline T kernel_tex_image_interp_nanovdb( ccl_device_noinline T kernel_tex_image_interp_nanovdb(
ccl_global const TextureInfo &info, float x, float y, float z, uint interpolation) const TextureInfo &info, float x, float y, float z, uint interpolation)
{ {
using namespace nanovdb; using namespace nanovdb;
@@ -193,7 +191,7 @@ ccl_device_noinline T kernel_tex_image_interp_nanovdb(
ccl_device float4 kernel_tex_image_interp(KernelGlobals kg, int id, float x, float y) ccl_device float4 kernel_tex_image_interp(KernelGlobals kg, int id, float x, float y)
{ {
ccl_global const TextureInfo &info = kernel_tex_fetch(__texture_info, id); const TextureInfo &info = kernel_tex_fetch(__texture_info, id);
/* float4, byte4, ushort4 and half4 */ /* float4, byte4, ushort4 and half4 */
const int texture_type = info.data_type; const int texture_type = info.data_type;
@@ -228,7 +226,7 @@ ccl_device float4 kernel_tex_image_interp_3d(KernelGlobals kg,
float3 P, float3 P,
InterpolationType interp) InterpolationType interp)
{ {
ccl_global const TextureInfo &info = kernel_tex_fetch(__texture_info, id); const TextureInfo &info = kernel_tex_fetch(__texture_info, id);
if (info.use_transform_3d) { if (info.use_transform_3d) {
P = transform_point(&info.transform_3d, P); P = transform_point(&info.transform_3d, P);

File diff suppressed because it is too large.

@@ -31,43 +31,10 @@ CCL_NAMESPACE_BEGIN
# define GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE 512 # define GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE 512
#endif #endif
#ifdef __KERNEL_METAL__
struct ActiveIndexContext {
ActiveIndexContext(int _thread_index,
int _global_index,
int _threadgroup_size,
int _simdgroup_size,
int _simd_lane_index,
int _simd_group_index,
int _num_simd_groups,
threadgroup int *_simdgroup_offset)
: thread_index(_thread_index),
global_index(_global_index),
blocksize(_threadgroup_size),
ccl_gpu_warp_size(_simdgroup_size),
thread_warp(_simd_lane_index),
warp_index(_simd_group_index),
num_warps(_num_simd_groups),
warp_offset(_simdgroup_offset)
{
}
const int thread_index, global_index, blocksize, ccl_gpu_warp_size, thread_warp, warp_index,
num_warps;
threadgroup int *warp_offset;
template<uint blocksizeDummy, typename IsActiveOp>
void active_index_array(const uint num_states,
ccl_global int *indices,
ccl_global int *num_indices,
IsActiveOp is_active_op)
{
const uint state_index = global_index;
#else
template<uint blocksize, typename IsActiveOp> template<uint blocksize, typename IsActiveOp>
__device__ void gpu_parallel_active_index_array(const uint num_states, __device__ void gpu_parallel_active_index_array(const uint num_states,
ccl_global int *indices, int *indices,
ccl_global int *num_indices, int *num_indices,
IsActiveOp is_active_op) IsActiveOp is_active_op)
{ {
extern ccl_gpu_shared int warp_offset[]; extern ccl_gpu_shared int warp_offset[];
@@ -78,62 +45,43 @@ __device__ void gpu_parallel_active_index_array(const uint num_states,
const uint warp_index = thread_index / ccl_gpu_warp_size; const uint warp_index = thread_index / ccl_gpu_warp_size;
const uint num_warps = blocksize / ccl_gpu_warp_size; const uint num_warps = blocksize / ccl_gpu_warp_size;
/* Test if state corresponding to this thread is active. */
const uint state_index = ccl_gpu_block_idx_x * blocksize + thread_index; const uint state_index = ccl_gpu_block_idx_x * blocksize + thread_index;
#endif const uint is_active = (state_index < num_states) ? is_active_op(state_index) : 0;
/* Test if state corresponding to this thread is active. */ /* For each thread within a warp compute how many other active states precede it. */
const uint is_active = (state_index < num_states) ? is_active_op(state_index) : 0; const uint thread_mask = 0xFFFFFFFF >> (ccl_gpu_warp_size - thread_warp);
const uint thread_offset = ccl_gpu_popc(ccl_gpu_ballot(is_active) & thread_mask);
/* For each thread within a warp compute how many other active states precede it. */ /* Last thread in warp stores number of active states for each warp. */
const uint thread_offset = ccl_gpu_popc(ccl_gpu_ballot(is_active) & if (thread_warp == ccl_gpu_warp_size - 1) {
ccl_gpu_thread_mask(thread_warp)); warp_offset[warp_index] = thread_offset + is_active;
/* Last thread in warp stores number of active states for each warp. */
if (thread_warp == ccl_gpu_warp_size - 1) {
warp_offset[warp_index] = thread_offset + is_active;
}
ccl_gpu_syncthreads();
/* Last thread in block converts per-warp sizes to offsets, increments global size of
* index array and gets offset to write to. */
if (thread_index == blocksize - 1) {
/* TODO: parallelize this. */
int offset = 0;
for (int i = 0; i < num_warps; i++) {
int num_active = warp_offset[i];
warp_offset[i] = offset;
offset += num_active;
}
const uint block_num_active = warp_offset[warp_index] + thread_offset + is_active;
warp_offset[num_warps] = atomic_fetch_and_add_uint32(num_indices, block_num_active);
}
ccl_gpu_syncthreads();
/* Write to index array. */
if (is_active) {
const uint block_offset = warp_offset[num_warps];
indices[block_offset + warp_offset[warp_index] + thread_offset] = state_index;
}
} }
#ifdef __KERNEL_METAL__ ccl_gpu_syncthreads();
}; /* end class ActiveIndexContext */
/* inject the required thread params into a struct, and redirect to its templated member function /* Last thread in block converts per-warp sizes to offsets, increments global size of
*/ * index array and gets offset to write to. */
# define gpu_parallel_active_index_array \ if (thread_index == blocksize - 1) {
ActiveIndexContext(metal_local_id, \ /* TODO: parallelize this. */
metal_global_id, \ int offset = 0;
metal_local_size, \ for (int i = 0; i < num_warps; i++) {
simdgroup_size, \ int num_active = warp_offset[i];
simd_lane_index, \ warp_offset[i] = offset;
simd_group_index, \ offset += num_active;
num_simd_groups, \ }
simdgroup_offset) \
.active_index_array const uint block_num_active = warp_offset[warp_index] + thread_offset + is_active;
#endif warp_offset[num_warps] = atomic_fetch_and_add_uint32(num_indices, block_num_active);
}
ccl_gpu_syncthreads();
/* Write to index array. */
if (is_active) {
const uint block_offset = warp_offset[num_warps];
indices[block_offset + warp_offset[warp_index] + thread_offset] = state_index;
}
}
CCL_NAMESPACE_END CCL_NAMESPACE_END
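
The kernel above compacts active state indices using a warp ballot, a per-lane mask and a population count. A minimal host-side sketch of the same idea, processing one 32-lane warp at a time in sequence; the function names are hypothetical, and the lane mask is written as `(1u << lane) - 1` because shifting a 32-bit value by 32 is undefined in host C++:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Portable population count (stand-in for ccl_gpu_popc).
static int popcount32(uint32_t v)
{
  int count = 0;
  while (v) {
    v &= v - 1; /* Clear the lowest set bit. */
    count++;
  }
  return count;
}

// Hypothetical CPU emulation for illustration only; it is sequential, so
// unlike the GPU kernel it needs no shared memory, barriers or atomics.
std::vector<int> compact_active_indices(const std::vector<bool> &is_active)
{
  constexpr uint32_t warp_size = 32;
  const uint32_t num_states = uint32_t(is_active.size());
  std::vector<int> indices;

  for (uint32_t warp_start = 0; warp_start < num_states; warp_start += warp_size) {
    /* Emulate the ballot: one bit per lane whose state is active. */
    uint32_t ballot = 0;
    for (uint32_t lane = 0; lane < warp_size; lane++) {
      const uint32_t state = warp_start + lane;
      if (state < num_states && is_active[state]) {
        ballot |= 1u << lane;
      }
    }

    /* Reserve output slots for this warp after all previous warps. */
    const size_t warp_offset = indices.size();
    const int warp_num_active = popcount32(ballot);
    indices.resize(warp_offset + warp_num_active);

    for (uint32_t lane = 0; lane < warp_size; lane++) {
      if (ballot & (1u << lane)) {
        /* Count the active lanes that precede this one. */
        const uint32_t lane_mask = (1u << lane) - 1u;
        const int thread_offset = popcount32(ballot & lane_mask);
        assert(thread_offset < warp_num_active);
        indices[warp_offset + thread_offset] = int(warp_start + lane);
      }
    }
  }
  return indices;
}
```

The sketch returns indices in ascending order because warps are visited in sequence. On the GPU each block reserves its range with an atomic add, so the order of blocks in the output is not guaranteed.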

@@ -33,12 +33,10 @@ CCL_NAMESPACE_BEGIN
# define GPU_PARALLEL_PREFIX_SUM_DEFAULT_BLOCK_SIZE 512 # define GPU_PARALLEL_PREFIX_SUM_DEFAULT_BLOCK_SIZE 512
#endif #endif
__device__ void gpu_parallel_prefix_sum(const int global_id, template<uint blocksize>
ccl_global int *counter, __device__ void gpu_parallel_prefix_sum(int *counter, int *prefix_sum, const int num_values)
ccl_global int *prefix_sum,
const int num_values)
{ {
if (global_id != 0) { if (!(ccl_gpu_block_idx_x == 0 && ccl_gpu_thread_idx_x == 0)) {
return; return;
} }

@@ -33,16 +33,16 @@ CCL_NAMESPACE_BEGIN
#endif #endif
#define GPU_PARALLEL_SORTED_INDEX_INACTIVE_KEY (~0) #define GPU_PARALLEL_SORTED_INDEX_INACTIVE_KEY (~0)
template<typename GetKeyOp> template<uint blocksize, typename GetKeyOp>
__device__ void gpu_parallel_sorted_index_array(const uint state_index, __device__ void gpu_parallel_sorted_index_array(const uint num_states,
const uint num_states,
const int num_states_limit, const int num_states_limit,
ccl_global int *indices, int *indices,
ccl_global int *num_indices, int *num_indices,
ccl_global int *key_counter, int *key_counter,
ccl_global int *key_prefix_sum, int *key_prefix_sum,
GetKeyOp get_key_op) GetKeyOp get_key_op)
{ {
const uint state_index = ccl_gpu_block_idx_x * blocksize + ccl_gpu_thread_idx_x;
const int key = (state_index < num_states) ? get_key_op(state_index) : const int key = (state_index < num_states) ? get_key_op(state_index) :
GPU_PARALLEL_SORTED_INDEX_INACTIVE_KEY; GPU_PARALLEL_SORTED_INDEX_INACTIVE_KEY;

@@ -29,17 +29,20 @@ ccl_device_inline void get_work_pixel(ccl_global const KernelWorkTile *tile,
ccl_private uint *y, ccl_private uint *y,
ccl_private uint *sample) ccl_private uint *sample)
{ {
#if 0 uint sample_offset, pixel_offset;
/* Keep threads for the same sample together. */
uint tile_pixels = tile->w * tile->h; if (kernel_data.integrator.scrambling_distance < 0.9f) {
uint sample_offset = global_work_index / tile_pixels; /* Keep threads for the same sample together. */
uint pixel_offset = global_work_index - sample_offset * tile_pixels; uint tile_pixels = tile->w * tile->h;
#else sample_offset = global_work_index / tile_pixels;
/* Keeping threads for the same pixel together. pixel_offset = global_work_index - sample_offset * tile_pixels;
* Appears to improve performance by a few % on CUDA and OptiX. */ }
uint sample_offset = global_work_index % tile->num_samples; else {
uint pixel_offset = global_work_index / tile->num_samples; /* Keeping threads for the same pixel together.
#endif * Appears to improve performance by a few % on CUDA and OptiX. */
sample_offset = global_work_index % tile->num_samples;
pixel_offset = global_work_index / tile->num_samples;
}
uint y_offset = pixel_offset / tile->w; uint y_offset = pixel_offset / tile->w;
uint x_offset = pixel_offset - y_offset * tile->w; uint x_offset = pixel_offset - y_offset * tile->w;
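
Note on the hunk above: `get_work_pixel` has two ways of mapping a flat work index to a pixel and a sample, and the new side picks between them at run time (via `scrambling_distance < 0.9f`) instead of with `#if 0`. The sketch below reproduces both mappings on the host so they can be compared. The struct `WorkPixel`, the function `map_work_index`, and the boolean selector are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

struct WorkPixel {
  uint32_t x_offset;
  uint32_t y_offset;
  uint32_t sample_offset;
};

// Maps a flat work index to (x, y, sample) within a tile.
// keep_samples_together == true  : consecutive indices walk pixels of one sample.
// keep_samples_together == false : consecutive indices walk samples of one pixel.
inline WorkPixel map_work_index(uint32_t global_work_index,
                                uint32_t tile_w,
                                uint32_t tile_h,
                                uint32_t num_samples,
                                bool keep_samples_together)
{
  assert(tile_w > 0 && tile_h > 0 && num_samples > 0);
  uint32_t sample_offset, pixel_offset;
  if (keep_samples_together) {
    const uint32_t tile_pixels = tile_w * tile_h;
    sample_offset = global_work_index / tile_pixels;
    pixel_offset = global_work_index - sample_offset * tile_pixels;
  }
  else {
    sample_offset = global_work_index % num_samples;
    pixel_offset = global_work_index / num_samples;
  }
  const uint32_t y_offset = pixel_offset / tile_w;
  const uint32_t x_offset = pixel_offset - y_offset * tile_w;
  return {x_offset, y_offset, sample_offset};
}
```

For a 4x2 tile with 3 samples, work index 10 lands on pixel (2, 0), sample 1 in the first mode and on pixel (3, 0), sample 1 in the second.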

@@ -74,7 +74,6 @@ typedef unsigned long long uint64_t;
#define ccl_gpu_block_idx_x (blockIdx.x) #define ccl_gpu_block_idx_x (blockIdx.x)
#define ccl_gpu_grid_dim_x (gridDim.x) #define ccl_gpu_grid_dim_x (gridDim.x)
#define ccl_gpu_warp_size (warpSize) #define ccl_gpu_warp_size (warpSize)
#define ccl_gpu_thread_mask(thread_warp) uint(0xFFFFFFFF >> (ccl_gpu_warp_size - thread_warp))
#define ccl_gpu_global_id_x() (ccl_gpu_block_idx_x * ccl_gpu_block_dim_x + ccl_gpu_thread_idx_x) #define ccl_gpu_global_id_x() (ccl_gpu_block_idx_x * ccl_gpu_block_dim_x + ccl_gpu_thread_idx_x)
#define ccl_gpu_global_size_x() (ccl_gpu_grid_dim_x * ccl_gpu_block_dim_x) #define ccl_gpu_global_size_x() (ccl_gpu_grid_dim_x * ccl_gpu_block_dim_x)
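
Note on the hunk above: the removed `ccl_gpu_thread_mask` macro builds a mask of the lanes below a given lane. Combined with a ballot and a population count, such a mask yields a thread's rank among the active lanes of its warp. The sketch below shows that idea on the host using the `(1 << lane) - 1` form from the Metal variant of the macro. Both function names are hypothetical, and how the kernels combine these macros is an assumption rather than something this diff shows.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Mask with one bit set for every lane strictly below `lane`.
inline uint64_t lanes_below_mask(unsigned lane)
{
  assert(lane < 64); /* shifting by 64 or more would be undefined */
  return (uint64_t(1) << lane) - 1;
}

// Number of active lanes below `lane`, given a ballot bit mask in which
// bit i is set when lane i is active.
inline unsigned thread_offset_in_warp(uint64_t ballot, unsigned lane)
{
  const uint64_t below = ballot & lanes_below_mask(lane);
  return static_cast<unsigned>(std::bitset<64>(below).count());
}
```

With ballot `0xB` (lanes 0, 1 and 3 active), lane 3 gets offset 2 because two active lanes precede it.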

@@ -35,29 +35,12 @@
/* Compute number of threads per block and minimum blocks per multiprocessor /* Compute number of threads per block and minimum blocks per multiprocessor
* given the maximum number of registers per thread. */ * given the maximum number of registers per thread. */
#define ccl_gpu_kernel(block_num_threads, thread_num_registers) \ #define ccl_gpu_kernel(block_num_threads, thread_num_registers) \
extern "C" __global__ void __launch_bounds__(block_num_threads, \ extern "C" __global__ void __launch_bounds__(block_num_threads, \
GPU_MULTIPRESSOR_MAX_REGISTERS / \ GPU_MULTIPRESSOR_MAX_REGISTERS / \
(block_num_threads * thread_num_registers)) (block_num_threads * thread_num_registers))
#define ccl_gpu_kernel_threads(block_num_threads) \
extern "C" __global__ void __launch_bounds__(block_num_threads)
#define ccl_gpu_kernel_signature(name, ...) kernel_gpu_##name(__VA_ARGS__)
#define ccl_gpu_kernel_call(x) x
/* Define a function object where "func" is the lambda body, and additional parameters are used to
* specify captured state */
#define ccl_gpu_kernel_lambda(func, ...) \
struct KernelLambda { \
__VA_ARGS__; \
__device__ int operator()(const int state) \
{ \
return (func); \
} \
} ccl_gpu_kernel_lambda_pass
/* sanity checks */ /* sanity checks */
#if GPU_KERNEL_BLOCK_NUM_THREADS > GPU_BLOCK_MAX_THREADS #if GPU_KERNEL_BLOCK_NUM_THREADS > GPU_BLOCK_MAX_THREADS

@@ -58,96 +58,6 @@ using namespace metal;
#define kernel_assert(cond) #define kernel_assert(cond)
#define ccl_gpu_global_id_x() metal_global_id
#define ccl_gpu_warp_size simdgroup_size
#define ccl_gpu_thread_idx_x simd_group_index
#define ccl_gpu_thread_mask(thread_warp) uint64_t((1ull << thread_warp) - 1)
#define ccl_gpu_ballot(predicate) ((uint64_t)((simd_vote::vote_t)simd_ballot(predicate)))
#define ccl_gpu_popc(x) popcount(x)
// clang-format off
/* kernel.h adapters */
#define ccl_gpu_kernel(block_num_threads, thread_num_registers)
#define ccl_gpu_kernel_threads(block_num_threads)
/* convert a comma-separated list into a semicolon-separated list (so that we can generate a struct based on kernel entrypoint parameters) */
#define FN0()
#define FN1(p1) p1;
#define FN2(p1, p2) p1; p2;
#define FN3(p1, p2, p3) p1; p2; p3;
#define FN4(p1, p2, p3, p4) p1; p2; p3; p4;
#define FN5(p1, p2, p3, p4, p5) p1; p2; p3; p4; p5;
#define FN6(p1, p2, p3, p4, p5, p6) p1; p2; p3; p4; p5; p6;
#define FN7(p1, p2, p3, p4, p5, p6, p7) p1; p2; p3; p4; p5; p6; p7;
#define FN8(p1, p2, p3, p4, p5, p6, p7, p8) p1; p2; p3; p4; p5; p6; p7; p8;
#define FN9(p1, p2, p3, p4, p5, p6, p7, p8, p9) p1; p2; p3; p4; p5; p6; p7; p8; p9;
#define FN10(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10) p1; p2; p3; p4; p5; p6; p7; p8; p9; p10;
#define FN11(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11) p1; p2; p3; p4; p5; p6; p7; p8; p9; p10; p11;
#define FN12(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12) p1; p2; p3; p4; p5; p6; p7; p8; p9; p10; p11; p12;
#define FN13(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13) p1; p2; p3; p4; p5; p6; p7; p8; p9; p10; p11; p12; p13;
#define FN14(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13, p14) p1; p2; p3; p4; p5; p6; p7; p8; p9; p10; p11; p12; p13; p14;
#define FN15(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13, p14, p15) p1; p2; p3; p4; p5; p6; p7; p8; p9; p10; p11; p12; p13; p14; p15;
#define FN16(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13, p14, p15, p16) p1; p2; p3; p4; p5; p6; p7; p8; p9; p10; p11; p12; p13; p14; p15; p16;
#define GET_LAST_ARG(p0, p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13, p14, p15, p16, ...) p16
#define PARAMS_MAKER(...) GET_LAST_ARG(__VA_ARGS__, FN16, FN15, FN14, FN13, FN12, FN11, FN10, FN9, FN8, FN7, FN6, FN5, FN4, FN3, FN2, FN1, FN0)
/* generate a struct containing the entrypoint parameters and a "run" method which can access them implicitly via this-> */
#define ccl_gpu_kernel_signature(name, ...) \
struct kernel_gpu_##name \
{ \
PARAMS_MAKER(__VA_ARGS__)(__VA_ARGS__) \
void run(thread MetalKernelContext& context, \
threadgroup int *simdgroup_offset, \
const uint metal_global_id, \
const ushort metal_local_id, \
const ushort metal_local_size, \
uint simdgroup_size, \
uint simd_lane_index, \
uint simd_group_index, \
uint num_simd_groups) ccl_global const; \
}; \
kernel void kernel_metal_##name(device const kernel_gpu_##name *params_struct, \
constant KernelParamsMetal &ccl_restrict _launch_params_metal, \
constant MetalAncillaries *_metal_ancillaries, \
threadgroup int *simdgroup_offset[[ threadgroup(0) ]], \
const uint metal_global_id [[thread_position_in_grid]], \
const ushort metal_local_id [[thread_position_in_threadgroup]], \
const ushort metal_local_size [[threads_per_threadgroup]], \
uint simdgroup_size [[threads_per_simdgroup]], \
uint simd_lane_index [[thread_index_in_simdgroup]], \
uint simd_group_index [[simdgroup_index_in_threadgroup]], \
uint num_simd_groups [[simdgroups_per_threadgroup]]) { \
MetalKernelContext context(_launch_params_metal, _metal_ancillaries); \
INIT_DEBUG_BUFFER \
params_struct->run(context, simdgroup_offset, metal_global_id, metal_local_id, metal_local_size, simdgroup_size, simd_lane_index, simd_group_index, num_simd_groups); \
} \
void kernel_gpu_##name::run(thread MetalKernelContext& context, \
threadgroup int *simdgroup_offset, \
const uint metal_global_id, \
const ushort metal_local_id, \
const ushort metal_local_size, \
uint simdgroup_size, \
uint simd_lane_index, \
uint simd_group_index, \
uint num_simd_groups) ccl_global const
#define ccl_gpu_kernel_call(x) context.x
/* define a function object where "func" is the lambda body, and additional parameters are used to specify captured state */
#define ccl_gpu_kernel_lambda(func, ...) \
struct KernelLambda \
{ \
KernelLambda(ccl_private MetalKernelContext &_context) : context(_context) {} \
ccl_private MetalKernelContext &context; \
__VA_ARGS__; \
int operator()(const int state) const { return (func); } \
} ccl_gpu_kernel_lambda_pass(context)
// clang-format on
/* make_type definitions with Metal style element initializers */ /* make_type definitions with Metal style element initializers */
#ifdef make_float2 #ifdef make_float2
# undef make_float2 # undef make_float2
@@ -214,38 +124,3 @@ void kernel_gpu_##name::run(thread MetalKernelContext& context, \
#define logf(x) trigmode::log(float(x)) #define logf(x) trigmode::log(float(x))
#define NULL 0 #define NULL 0
/* texture bindings and sampler setup */
struct Texture2DParamsMetal {
texture2d<float, access::sample> tex;
};
struct Texture3DParamsMetal {
texture3d<float, access::sample> tex;
};
struct MetalAncillaries {
device Texture2DParamsMetal *textures_2d;
device Texture3DParamsMetal *textures_3d;
};
enum SamplerType {
SamplerFilterNearest_AddressRepeat,
SamplerFilterNearest_AddressClampEdge,
SamplerFilterNearest_AddressClampZero,
SamplerFilterLinear_AddressRepeat,
SamplerFilterLinear_AddressClampEdge,
SamplerFilterLinear_AddressClampZero,
SamplerCount
};
constant constexpr array<sampler, SamplerCount> metal_samplers = {
sampler(address::repeat, filter::nearest),
sampler(address::clamp_to_edge, filter::nearest),
sampler(address::clamp_to_zero, filter::nearest),
sampler(address::repeat, filter::linear),
sampler(address::clamp_to_edge, filter::linear),
sampler(address::clamp_to_zero, filter::linear),
};

@@ -1,79 +0,0 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
// clang-format off
/* Open the Metal kernel context class
* Necessary to access resource bindings */
class MetalKernelContext {
public:
constant KernelParamsMetal &launch_params_metal;
constant MetalAncillaries *metal_ancillaries;
MetalKernelContext(constant KernelParamsMetal &_launch_params_metal, constant MetalAncillaries * _metal_ancillaries)
: launch_params_metal(_launch_params_metal), metal_ancillaries(_metal_ancillaries)
{}
/* texture fetch adapter functions */
typedef uint64_t ccl_gpu_tex_object;
template<typename T>
inline __attribute__((__always_inline__))
T ccl_gpu_tex_object_read_2D(ccl_gpu_tex_object tex, float x, float y) const {
kernel_assert(0);
return 0;
}
template<typename T>
inline __attribute__((__always_inline__))
T ccl_gpu_tex_object_read_3D(ccl_gpu_tex_object tex, float x, float y, float z) const {
kernel_assert(0);
return 0;
}
// texture2d
template<>
inline __attribute__((__always_inline__))
float4 ccl_gpu_tex_object_read_2D(ccl_gpu_tex_object tex, float x, float y) const {
const uint tid(tex);
const uint sid(tex >> 32);
return metal_ancillaries->textures_2d[tid].tex.sample(metal_samplers[sid], float2(x, y));
}
template<>
inline __attribute__((__always_inline__))
float ccl_gpu_tex_object_read_2D(ccl_gpu_tex_object tex, float x, float y) const {
const uint tid(tex);
const uint sid(tex >> 32);
return metal_ancillaries->textures_2d[tid].tex.sample(metal_samplers[sid], float2(x, y)).x;
}
// texture3d
template<>
inline __attribute__((__always_inline__))
float4 ccl_gpu_tex_object_read_3D(ccl_gpu_tex_object tex, float x, float y, float z) const {
const uint tid(tex);
const uint sid(tex >> 32);
return metal_ancillaries->textures_3d[tid].tex.sample(metal_samplers[sid], float3(x, y, z));
}
template<>
inline __attribute__((__always_inline__))
float ccl_gpu_tex_object_read_3D(ccl_gpu_tex_object tex, float x, float y, float z) const {
const uint tid(tex);
const uint sid(tex >> 32);
return metal_ancillaries->textures_3d[tid].tex.sample(metal_samplers[sid], float3(x, y, z)).x;
}
# include "kernel/device/gpu/image.h"
// clang-format on
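
Note on the removed file above: the texture read adapters treat the 64-bit `ccl_gpu_tex_object` as two packed 32-bit fields, with the texture index in the low half (`tid`) and the sampler index in the high half (`sid`). A minimal host-side sketch of that layout follows. The `pack_tex_handle` helper and the `TexHandleParts` struct are hypothetical; the diff only shows the unpacking side.

```cpp
#include <cassert>
#include <cstdint>

// Six sampler entries, matching the SamplerType enum shown in the diff.
constexpr uint32_t kSamplerCount = 6;

struct TexHandleParts {
  uint32_t texture_index; /* low 32 bits, "tid" in the diff */
  uint32_t sampler_index; /* high 32 bits, "sid" in the diff */
};

// Hypothetical packing helper: the inverse of the unpacking below.
inline uint64_t pack_tex_handle(uint32_t texture_index, uint32_t sampler_index)
{
  assert(sampler_index < kSamplerCount);
  return (uint64_t(sampler_index) << 32) | uint64_t(texture_index);
}

// Mirrors `const uint tid(tex); const uint sid(tex >> 32);`.
inline TexHandleParts unpack_tex_handle(uint64_t tex)
{
  return {uint32_t(tex), uint32_t(tex >> 32)};
}
```

Packing both indices into one handle lets the generic image code pass a single integer around while the adapter selects the texture and the sampler separately.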

@@ -1,23 +0,0 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
}
; /* end of MetalKernelContext class definition */
/* Silently redirect into the MetalKernelContext instance */
/* NOTE: These macros will need maintaining as entrypoints change */
#undef kernel_integrator_state
#define kernel_integrator_state context.launch_params_metal.__integrator_state

@@ -1,51 +0,0 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/* Constant Globals */
#include "kernel/types.h"
#include "kernel/util/profiling.h"
#include "kernel/integrator/state.h"
CCL_NAMESPACE_BEGIN
typedef struct KernelParamsMetal {
#define KERNEL_TEX(type, name) ccl_constant type *name;
#include "kernel/textures.h"
#undef KERNEL_TEX
const IntegratorStateGPU __integrator_state;
const KernelData data;
} KernelParamsMetal;
typedef struct KernelGlobalsGPU {
int unused[1];
} KernelGlobalsGPU;
typedef ccl_global const KernelGlobalsGPU *ccl_restrict KernelGlobals;
#define kernel_data launch_params_metal.data
#define kernel_integrator_state launch_params_metal.__integrator_state
/* data lookup defines */
#define kernel_tex_fetch(tex, index) launch_params_metal.tex[index]
#define kernel_tex_array(tex) launch_params_metal.tex
CCL_NAMESPACE_END

@@ -1,25 +0,0 @@
/*
* Copyright 2021 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/* Metal kernel entry points */
// clang-format off
#include "kernel/device/metal/compat.h"
#include "kernel/device/metal/globals.h"
#include "kernel/device/gpu/kernel.h"
// clang-format on

@@ -76,7 +76,6 @@ typedef unsigned long long uint64_t;
#define ccl_gpu_block_idx_x (blockIdx.x) #define ccl_gpu_block_idx_x (blockIdx.x)
#define ccl_gpu_grid_dim_x (gridDim.x) #define ccl_gpu_grid_dim_x (gridDim.x)
#define ccl_gpu_warp_size (warpSize) #define ccl_gpu_warp_size (warpSize)
#define ccl_gpu_thread_mask(thread_warp) uint(0xFFFFFFFF >> (ccl_gpu_warp_size - thread_warp))
#define ccl_gpu_global_id_x() (ccl_gpu_block_idx_x * ccl_gpu_block_dim_x + ccl_gpu_thread_idx_x) #define ccl_gpu_global_id_x() (ccl_gpu_block_idx_x * ccl_gpu_block_dim_x + ccl_gpu_thread_idx_x)
#define ccl_gpu_global_size_x() (ccl_gpu_grid_dim_x * ccl_gpu_block_dim_x) #define ccl_gpu_global_size_x() (ccl_gpu_grid_dim_x * ccl_gpu_block_dim_x)

@@ -57,7 +57,7 @@ extern "C" __global__ void __raygen__kernel_optix_integrator_intersect_closest()
const int global_index = optixGetLaunchIndex().x; const int global_index = optixGetLaunchIndex().x;
const int path_index = (__params.path_index_array) ? __params.path_index_array[global_index] : const int path_index = (__params.path_index_array) ? __params.path_index_array[global_index] :
global_index; global_index;
integrator_intersect_closest(nullptr, path_index); integrator_intersect_closest(nullptr, path_index, __params.render_buffer);
} }
extern "C" __global__ void __raygen__kernel_optix_integrator_intersect_shadow() extern "C" __global__ void __raygen__kernel_optix_integrator_intersect_shadow()

@@ -33,62 +33,72 @@ CCL_NAMESPACE_BEGIN
* them separately. */ * them separately. */
ccl_device_inline void bsdf_eval_init(ccl_private BsdfEval *eval, ccl_device_inline void bsdf_eval_init(ccl_private BsdfEval *eval,
const bool is_diffuse, const ClosureType closure_type,
float3 value) float3 value)
{ {
eval->diffuse = zero_float3(); eval->diffuse = zero_float3();
eval->glossy = zero_float3(); eval->glossy = zero_float3();
if (is_diffuse) { if (CLOSURE_IS_BSDF_DIFFUSE(closure_type)) {
eval->diffuse = value; eval->diffuse = value;
} }
else { else if (CLOSURE_IS_BSDF_GLOSSY(closure_type)) {
eval->glossy = value; eval->glossy = value;
} }
eval->sum = value;
} }
ccl_device_inline void bsdf_eval_accum(ccl_private BsdfEval *eval, ccl_device_inline void bsdf_eval_accum(ccl_private BsdfEval *eval,
const bool is_diffuse, const ClosureType closure_type,
float3 value, float3 value)
float mis_weight)
{ {
value *= mis_weight; if (CLOSURE_IS_BSDF_DIFFUSE(closure_type)) {
if (is_diffuse) {
eval->diffuse += value; eval->diffuse += value;
} }
else { else if (CLOSURE_IS_BSDF_GLOSSY(closure_type)) {
eval->glossy += value; eval->glossy += value;
} }
eval->sum += value;
} }
ccl_device_inline bool bsdf_eval_is_zero(ccl_private BsdfEval *eval) ccl_device_inline bool bsdf_eval_is_zero(ccl_private BsdfEval *eval)
{ {
return is_zero(eval->diffuse) && is_zero(eval->glossy); return is_zero(eval->sum);
} }
ccl_device_inline void bsdf_eval_mul(ccl_private BsdfEval *eval, float value) ccl_device_inline void bsdf_eval_mul(ccl_private BsdfEval *eval, float value)
{ {
eval->diffuse *= value; eval->diffuse *= value;
eval->glossy *= value; eval->glossy *= value;
eval->sum *= value;
} }
ccl_device_inline void bsdf_eval_mul3(ccl_private BsdfEval *eval, float3 value) ccl_device_inline void bsdf_eval_mul3(ccl_private BsdfEval *eval, float3 value)
{ {
eval->diffuse *= value; eval->diffuse *= value;
eval->glossy *= value; eval->glossy *= value;
eval->sum *= value;
} }
ccl_device_inline float3 bsdf_eval_sum(ccl_private const BsdfEval *eval) ccl_device_inline float3 bsdf_eval_sum(ccl_private const BsdfEval *eval)
{ {
return eval->diffuse + eval->glossy; return eval->sum;
} }
ccl_device_inline float3 bsdf_eval_diffuse_glossy_ratio(ccl_private const BsdfEval *eval) ccl_device_inline float3 bsdf_eval_pass_diffuse_weight(ccl_private const BsdfEval *eval)
{ {
/* Ratio of diffuse and glossy to recover proportions for writing to render pass. /* Ratio of diffuse weight to recover proportions for writing to render pass.
* We assume reflection, transmission and volume scatter to be exclusive. */ * We assume reflection, transmission and volume scatter to be exclusive. */
return safe_divide_float3_float3(eval->diffuse, eval->diffuse + eval->glossy); return safe_divide_float3_float3(eval->diffuse, eval->sum);
}
ccl_device_inline float3 bsdf_eval_pass_glossy_weight(ccl_private const BsdfEval *eval)
{
/* Ratio of glossy weight to recover proportions for writing to render pass.
* We assume reflection, transmission and volume scatter to be exclusive. */
return safe_divide_float3_float3(eval->glossy, eval->sum);
} }
/* -------------------------------------------------------------------- /* --------------------------------------------------------------------
@@ -351,37 +361,47 @@ ccl_device_inline void kernel_accum_emission_or_background_pass(KernelGlobals kg
/* Directly visible, write to emission or background pass. */ /* Directly visible, write to emission or background pass. */
pass_offset = pass; pass_offset = pass;
} }
else if (path_flag & (PATH_RAY_REFLECT_PASS | PATH_RAY_TRANSMISSION_PASS)) { else if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
/* Indirectly visible through reflection. */ if (path_flag & PATH_RAY_SURFACE_PASS) {
const int glossy_pass_offset = (path_flag & PATH_RAY_REFLECT_PASS) ? /* Indirectly visible through reflection. */
((INTEGRATOR_STATE(state, path, bounce) == 1) ? const float3 diffuse_weight = INTEGRATOR_STATE(state, path, pass_diffuse_weight);
kernel_data.film.pass_glossy_direct : const float3 glossy_weight = INTEGRATOR_STATE(state, path, pass_glossy_weight);
kernel_data.film.pass_glossy_indirect) :
((INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_transmission_direct :
kernel_data.film.pass_transmission_indirect);
if (glossy_pass_offset != PASS_UNUSED) { /* Glossy */
/* Glossy is a subset of the throughput, reconstruct it here using the const int glossy_pass_offset = ((INTEGRATOR_STATE(state, path, bounce) == 1) ?
* diffuse-glossy ratio. */ kernel_data.film.pass_glossy_direct :
const float3 ratio = INTEGRATOR_STATE(state, path, diffuse_glossy_ratio); kernel_data.film.pass_glossy_indirect);
const float3 glossy_contribution = (one_float3() - ratio) * contribution; if (glossy_pass_offset != PASS_UNUSED) {
kernel_write_pass_float3(buffer + glossy_pass_offset, glossy_contribution); kernel_write_pass_float3(buffer + glossy_pass_offset, glossy_weight * contribution);
} }
/* Reconstruct diffuse subset of throughput. */ /* Transmission */
pass_offset = (INTEGRATOR_STATE(state, path, bounce) == 1) ? const int transmission_pass_offset = ((INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_diffuse_direct : kernel_data.film.pass_transmission_direct :
kernel_data.film.pass_diffuse_indirect; kernel_data.film.pass_transmission_indirect);
if (pass_offset != PASS_UNUSED) {
contribution *= INTEGRATOR_STATE(state, path, diffuse_glossy_ratio); if (transmission_pass_offset != PASS_UNUSED) {
/* Transmission is what remains if not diffuse and glossy, not stored explicitly to save
* GPU memory. */
const float3 transmission_weight = one_float3() - diffuse_weight - glossy_weight;
kernel_write_pass_float3(buffer + transmission_pass_offset,
transmission_weight * contribution);
}
/* Reconstruct diffuse subset of throughput. */
pass_offset = (INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_diffuse_direct :
kernel_data.film.pass_diffuse_indirect;
if (pass_offset != PASS_UNUSED) {
contribution *= diffuse_weight;
}
}
else if (path_flag & PATH_RAY_VOLUME_PASS) {
/* Indirectly visible through volume. */
pass_offset = (INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_volume_direct :
kernel_data.film.pass_volume_indirect;
} }
}
else if (path_flag & PATH_RAY_VOLUME_PASS) {
/* Indirectly visible through volume. */
pass_offset = (INTEGRATOR_STATE(state, path, bounce) == 1) ?
kernel_data.film.pass_volume_direct :
kernel_data.film.pass_volume_indirect;
} }
/* Single write call for GPU coherence. */ /* Single write call for GPU coherence. */
@@ -426,49 +446,60 @@ ccl_device_inline void kernel_accum_light(KernelGlobals kg,
#ifdef __PASSES__ #ifdef __PASSES__
if (kernel_data.film.light_pass_flag & PASS_ANY) { if (kernel_data.film.light_pass_flag & PASS_ANY) {
const uint32_t path_flag = INTEGRATOR_STATE(state, shadow_path, flag); const uint32_t path_flag = INTEGRATOR_STATE(state, shadow_path, flag);
int pass_offset = PASS_UNUSED;
if (path_flag & (PATH_RAY_REFLECT_PASS | PATH_RAY_TRANSMISSION_PASS)) { if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
/* Indirectly visible through reflection. */ int pass_offset = PASS_UNUSED;
const int glossy_pass_offset = (path_flag & PATH_RAY_REFLECT_PASS) ?
((INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_glossy_direct :
kernel_data.film.pass_glossy_indirect) :
((INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_transmission_direct :
kernel_data.film.pass_transmission_indirect);
if (glossy_pass_offset != PASS_UNUSED) { if (path_flag & PATH_RAY_SURFACE_PASS) {
/* Glossy is a subset of the throughput, reconstruct it here using the /* Indirectly visible through reflection. */
* diffuse-glossy ratio. */ const float3 diffuse_weight = INTEGRATOR_STATE(state, shadow_path, pass_diffuse_weight);
const float3 ratio = INTEGRATOR_STATE(state, shadow_path, diffuse_glossy_ratio); const float3 glossy_weight = INTEGRATOR_STATE(state, shadow_path, pass_glossy_weight);
const float3 glossy_contribution = (one_float3() - ratio) * contribution;
kernel_write_pass_float3(buffer + glossy_pass_offset, glossy_contribution); /* Glossy */
const int glossy_pass_offset = ((INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_glossy_direct :
kernel_data.film.pass_glossy_indirect);
if (glossy_pass_offset != PASS_UNUSED) {
kernel_write_pass_float3(buffer + glossy_pass_offset, glossy_weight * contribution);
}
/* Transmission */
const int transmission_pass_offset = ((INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_transmission_direct :
kernel_data.film.pass_transmission_indirect);
if (transmission_pass_offset != PASS_UNUSED) {
/* Transmission is what remains if not diffuse and glossy, not stored explicitly to save
* GPU memory. */
const float3 transmission_weight = one_float3() - diffuse_weight - glossy_weight;
kernel_write_pass_float3(buffer + transmission_pass_offset,
transmission_weight * contribution);
}
/* Reconstruct diffuse subset of throughput. */
pass_offset = (INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_diffuse_direct :
kernel_data.film.pass_diffuse_indirect;
if (pass_offset != PASS_UNUSED) {
contribution *= diffuse_weight;
}
}
else if (path_flag & PATH_RAY_VOLUME_PASS) {
/* Indirectly visible through volume. */
pass_offset = (INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_volume_direct :
kernel_data.film.pass_volume_indirect;
} }
/* Reconstruct diffuse subset of throughput. */ /* Single write call for GPU coherence. */
pass_offset = (INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_diffuse_direct :
kernel_data.film.pass_diffuse_indirect;
if (pass_offset != PASS_UNUSED) { if (pass_offset != PASS_UNUSED) {
contribution *= INTEGRATOR_STATE(state, shadow_path, diffuse_glossy_ratio); kernel_write_pass_float3(buffer + pass_offset, contribution);
} }
} }
else if (path_flag & PATH_RAY_VOLUME_PASS) {
/* Indirectly visible through volume. */
pass_offset = (INTEGRATOR_STATE(state, shadow_path, bounce) == 0) ?
kernel_data.film.pass_volume_direct :
kernel_data.film.pass_volume_indirect;
}
/* Single write call for GPU coherence. */
if (pass_offset != PASS_UNUSED) {
kernel_write_pass_float3(buffer + pass_offset, contribution);
}
/* Write shadow pass. */ /* Write shadow pass. */
if (kernel_data.film.pass_shadow != PASS_UNUSED && (path_flag & PATH_RAY_SHADOW_FOR_LIGHT) && if (kernel_data.film.pass_shadow != PASS_UNUSED && (path_flag & PATH_RAY_SHADOW_FOR_LIGHT) &&
(path_flag & PATH_RAY_CAMERA)) { (path_flag & PATH_RAY_TRANSPARENT_BACKGROUND)) {
const float3 unshadowed_throughput = INTEGRATOR_STATE( const float3 unshadowed_throughput = INTEGRATOR_STATE(
state, shadow_path, unshadowed_throughput); state, shadow_path, unshadowed_throughput);
const float3 shadowed_throughput = INTEGRATOR_STATE(state, shadow_path, throughput); const float3 shadowed_throughput = INTEGRATOR_STATE(state, shadow_path, throughput);
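
Note on the hunks above: on the new side, `BsdfEval` keeps a running `sum` next to `diffuse` and `glossy`, and the light passes recover the transmission share as whatever remains after the diffuse and glossy weights. The sketch below illustrates that bookkeeping with scalar `float` values instead of `float3`. The names `LobeKind`, `EvalSketch`, `eval_accum` and `pass_weights` are hypothetical stand-ins, not the Cycles API.

```cpp
#include <cassert>

// Stand-in for the CLOSURE_IS_BSDF_DIFFUSE / CLOSURE_IS_BSDF_GLOSSY tests.
enum class LobeKind { Diffuse, Glossy, Other };

struct EvalSketch {
  float diffuse = 0.0f;
  float glossy = 0.0f;
  float sum = 0.0f; /* includes lobes that are neither diffuse nor glossy */
};

inline void eval_accum(EvalSketch &eval, LobeKind kind, float value)
{
  assert(value >= 0.0f);
  if (kind == LobeKind::Diffuse) {
    eval.diffuse += value;
  }
  else if (kind == LobeKind::Glossy) {
    eval.glossy += value;
  }
  eval.sum += value;
}

// Division that returns zero instead of dividing by zero.
inline float safe_divide(float a, float b)
{
  return (b != 0.0f) ? a / b : 0.0f;
}

struct PassWeights {
  float diffuse;
  float glossy;
  float transmission;
};

inline PassWeights pass_weights(const EvalSketch &eval)
{
  const float diffuse = safe_divide(eval.diffuse, eval.sum);
  const float glossy = safe_divide(eval.glossy, eval.sum);
  /* Transmission is the remainder and is not stored explicitly. */
  return {diffuse, glossy, 1.0f - diffuse - glossy};
}
```

Only two weights need to be stored per path because the third follows from them, which is the memory saving that the comment in the diff refers to.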

@@ -160,40 +160,6 @@ ccl_device_forceinline void kernel_write_denoising_features_volume(KernelGlobals
} }
#endif /* __DENOISING_FEATURES__ */ #endif /* __DENOISING_FEATURES__ */
#ifdef __SHADOW_CATCHER__
/* Write shadow catcher passes on a bounce from the shadow catcher object. */
ccl_device_forceinline void kernel_write_shadow_catcher_bounce_data(
KernelGlobals kg,
IntegratorState state,
ccl_private const ShaderData *sd,
ccl_global float *ccl_restrict render_buffer)
{
if (!kernel_data.integrator.has_shadow_catcher) {
return;
}
kernel_assert(kernel_data.film.pass_shadow_catcher_sample_count != PASS_UNUSED);
kernel_assert(kernel_data.film.pass_shadow_catcher_matte != PASS_UNUSED);
if (!kernel_shadow_catcher_is_path_split_bounce(kg, state, sd->object_flag)) {
return;
}
ccl_global float *buffer = kernel_pass_pixel_render_buffer(kg, state, render_buffer);
/* Count sample for the shadow catcher object. */
kernel_write_pass_float(buffer + kernel_data.film.pass_shadow_catcher_sample_count, 1.0f);
/* Since the split is done, the sample does not contribute to the matte, so accumulate it as
* transparency to the matte. */
const float3 throughput = INTEGRATOR_STATE(state, path, throughput);
kernel_write_pass_float(buffer + kernel_data.film.pass_shadow_catcher_matte + 3,
average(throughput));
}
#endif /* __SHADOW_CATCHER__ */
ccl_device_inline size_t kernel_write_id_pass(ccl_global float *ccl_restrict buffer, ccl_device_inline size_t kernel_write_id_pass(ccl_global float *ccl_restrict buffer,
size_t depth, size_t depth,
float id, float id,
@@ -211,7 +177,7 @@ ccl_device_inline void kernel_write_data_passes(KernelGlobals kg,
#ifdef __PASSES__ #ifdef __PASSES__
const uint32_t path_flag = INTEGRATOR_STATE(state, path, flag); const uint32_t path_flag = INTEGRATOR_STATE(state, path, flag);
if (!(path_flag & PATH_RAY_CAMERA)) { if (!(path_flag & PATH_RAY_TRANSPARENT_BACKGROUND)) {
return; return;
} }

@@ -460,7 +460,7 @@ ccl_device_inline float4 film_calculate_shadow_catcher_matte_with_shadow(
const float transparency = in_matte[3] * scale; const float transparency = in_matte[3] * scale;
const float alpha = saturatef(1.0f - transparency); const float alpha = saturatef(1.0f - transparency);
const float alpha_matte = (1.0f - alpha) * (1.0f - average(shadow_catcher)) + alpha; const float alpha_matte = (1.0f - alpha) * (1.0f - saturatef(average(shadow_catcher))) + alpha;
if (kfilm_convert->use_approximate_shadow_catcher_background) { if (kfilm_convert->use_approximate_shadow_catcher_background) {
kernel_assert(kfilm_convert->pass_background != PASS_UNUSED); kernel_assert(kfilm_convert->pass_background != PASS_UNUSED);
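
Note on the hunk above: the only change is wrapping `average(shadow_catcher)` in `saturatef`. When that average exceeds 1, the unclamped expression can push `alpha_matte` below zero. A small host-side sketch of the new expression follows, assuming the transparency and the shadow catcher average are passed in as plain floats; the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Clamp to the [0, 1] range, as saturatef does in the kernel.
inline float saturate(float v)
{
  return std::min(std::max(v, 0.0f), 1.0f);
}

// New-side expression for alpha_matte, with the shadow catcher average
// clamped before it is used.
inline float shadow_catcher_alpha_matte(float transparency, float shadow_catcher_average)
{
  const float alpha = saturate(1.0f - transparency);
  return (1.0f - alpha) * (1.0f - saturate(shadow_catcher_average)) + alpha;
}
```

With a transparency of 1 and an average of 1.5, the old expression would evaluate to -0.5, while the clamped form gives 0.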

@@ -70,14 +70,16 @@ ccl_device bool integrator_init_from_bake(KernelGlobals kg,
/* Setup render buffers. */ /* Setup render buffers. */
const int index = INTEGRATOR_STATE(state, path, render_pixel_index); const int index = INTEGRATOR_STATE(state, path, render_pixel_index);
const int pass_stride = kernel_data.film.pass_stride; const int pass_stride = kernel_data.film.pass_stride;
render_buffer += index * pass_stride; ccl_global float *buffer = render_buffer + index * pass_stride;
ccl_global float *primitive = render_buffer + kernel_data.film.pass_bake_primitive; ccl_global float *primitive = buffer + kernel_data.film.pass_bake_primitive;
ccl_global float *differential = render_buffer + kernel_data.film.pass_bake_differential; ccl_global float *differential = buffer + kernel_data.film.pass_bake_differential;
const int seed = __float_as_uint(primitive[0]); const int seed = __float_as_uint(primitive[0]);
int prim = __float_as_uint(primitive[1]); int prim = __float_as_uint(primitive[1]);
if (prim == -1) { if (prim == -1) {
/* Accumulate transparency for empty pixels. */
kernel_accum_transparent(kg, state, 0, 1.0f, buffer);
return false; return false;
} }

@@ -88,7 +88,10 @@ ccl_device_forceinline bool integrator_intersect_terminate(KernelGlobals kg,
#ifdef __SHADOW_CATCHER__ #ifdef __SHADOW_CATCHER__
/* Split path if a shadow catcher was hit. */ /* Split path if a shadow catcher was hit. */
ccl_device_forceinline void integrator_split_shadow_catcher( ccl_device_forceinline void integrator_split_shadow_catcher(
KernelGlobals kg, IntegratorState state, ccl_private const Intersection *ccl_restrict isect) KernelGlobals kg,
IntegratorState state,
ccl_private const Intersection *ccl_restrict isect,
ccl_global float *ccl_restrict render_buffer)
{ {
/* Test if we hit a shadow catcher object, and potentially split the path to continue tracing two /* Test if we hit a shadow catcher object, and potentially split the path to continue tracing two
* paths from here. */ * paths from here. */
@@ -97,6 +100,8 @@ ccl_device_forceinline void integrator_split_shadow_catcher(
return; return;
} }
kernel_write_shadow_catcher_bounce_data(kg, state, render_buffer);
/* Mark state as having done a shadow catcher split so that it stops contributing to /* Mark state as having done a shadow catcher split so that it stops contributing to
* the shadow catcher matte pass, but keeps contributing to the combined pass. */ * the shadow catcher matte pass, but keeps contributing to the combined pass. */
INTEGRATOR_STATE_WRITE(state, path, flag) |= PATH_RAY_SHADOW_CATCHER_HIT; INTEGRATOR_STATE_WRITE(state, path, flag) |= PATH_RAY_SHADOW_CATCHER_HIT;
@@ -191,6 +196,7 @@ ccl_device_forceinline void integrator_intersect_next_kernel(
KernelGlobals kg, KernelGlobals kg,
IntegratorState state, IntegratorState state,
ccl_private const Intersection *ccl_restrict isect, ccl_private const Intersection *ccl_restrict isect,
ccl_global float *ccl_restrict render_buffer,
const bool hit) const bool hit)
{ {
/* Continue with volume kernel if we are inside a volume, regardless if we hit anything. */ /* Continue with volume kernel if we are inside a volume, regardless if we hit anything. */
@@ -233,7 +239,7 @@ ccl_device_forceinline void integrator_intersect_next_kernel(
#ifdef __SHADOW_CATCHER__ #ifdef __SHADOW_CATCHER__
/* Handle shadow catcher. */ /* Handle shadow catcher. */
integrator_split_shadow_catcher(kg, state, isect); integrator_split_shadow_catcher(kg, state, isect, render_buffer);
#endif #endif
} }
else { else {
@@ -253,7 +259,10 @@ ccl_device_forceinline void integrator_intersect_next_kernel(
* volume shading and termination testing have already been done. */ * volume shading and termination testing have already been done. */
template<uint32_t current_kernel> template<uint32_t current_kernel>
ccl_device_forceinline void integrator_intersect_next_kernel_after_volume( ccl_device_forceinline void integrator_intersect_next_kernel_after_volume(
KernelGlobals kg, IntegratorState state, ccl_private const Intersection *ccl_restrict isect) KernelGlobals kg,
IntegratorState state,
ccl_private const Intersection *ccl_restrict isect,
ccl_global float *ccl_restrict render_buffer)
{ {
if (isect->prim != PRIM_NONE) { if (isect->prim != PRIM_NONE) {
/* Hit a surface, continue with light or surface kernel. */ /* Hit a surface, continue with light or surface kernel. */
@@ -278,7 +287,7 @@ ccl_device_forceinline void integrator_intersect_next_kernel_after_volume(
#ifdef __SHADOW_CATCHER__ #ifdef __SHADOW_CATCHER__
/* Handle shadow catcher. */ /* Handle shadow catcher. */
integrator_split_shadow_catcher(kg, state, isect); integrator_split_shadow_catcher(kg, state, isect, render_buffer);
#endif #endif
return; return;
} }
@@ -290,7 +299,9 @@ ccl_device_forceinline void integrator_intersect_next_kernel_after_volume(
} }
} }
ccl_device void integrator_intersect_closest(KernelGlobals kg, IntegratorState state) ccl_device void integrator_intersect_closest(KernelGlobals kg,
IntegratorState state,
ccl_global float *ccl_restrict render_buffer)
{ {
PROFILING_INIT(kg, PROFILING_INTERSECT_CLOSEST); PROFILING_INIT(kg, PROFILING_INTERSECT_CLOSEST);
@@ -341,7 +352,7 @@ ccl_device void integrator_intersect_closest(KernelGlobals kg, IntegratorState s
/* Setup up next kernel to be executed. */ /* Setup up next kernel to be executed. */
integrator_intersect_next_kernel<DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST>( integrator_intersect_next_kernel<DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST>(
kg, state, &isect, hit); kg, state, &isect, render_buffer, hit);
} }
CCL_NAMESPACE_END CCL_NAMESPACE_END

@@ -76,7 +76,7 @@ ccl_device void integrator_megakernel(KernelGlobals kg,
if (queued_kernel) { if (queued_kernel) {
switch (queued_kernel) { switch (queued_kernel) {
case DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST: case DEVICE_KERNEL_INTEGRATOR_INTERSECT_CLOSEST:
integrator_intersect_closest(kg, state); integrator_intersect_closest(kg, state, render_buffer);
break; break;
case DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND: case DEVICE_KERNEL_INTEGRATOR_SHADE_BACKGROUND:
integrator_shade_background(kg, state, render_buffer); integrator_shade_background(kg, state, render_buffer);

@@ -70,6 +70,9 @@ ccl_device_inline void path_state_init_integrator(KernelGlobals kg,
INTEGRATOR_STATE_WRITE(state, path, continuation_probability) = 1.0f; INTEGRATOR_STATE_WRITE(state, path, continuation_probability) = 1.0f;
INTEGRATOR_STATE_WRITE(state, path, throughput) = make_float3(1.0f, 1.0f, 1.0f); INTEGRATOR_STATE_WRITE(state, path, throughput) = make_float3(1.0f, 1.0f, 1.0f);
INTEGRATOR_STATE_WRITE(state, isect, object) = OBJECT_NONE;
INTEGRATOR_STATE_WRITE(state, isect, prim) = PRIM_NONE;
if (kernel_data.kernel_features & KERNEL_FEATURE_VOLUME) { if (kernel_data.kernel_features & KERNEL_FEATURE_VOLUME) {
INTEGRATOR_STATE_ARRAY_WRITE(state, volume_stack, 0, object) = OBJECT_NONE; INTEGRATOR_STATE_ARRAY_WRITE(state, volume_stack, 0, object) = OBJECT_NONE;
INTEGRATOR_STATE_ARRAY_WRITE( INTEGRATOR_STATE_ARRAY_WRITE(
@@ -122,7 +125,7 @@ ccl_device_inline void path_state_next(KernelGlobals kg, IntegratorState state,
/* volume scatter */ /* volume scatter */
flag |= PATH_RAY_VOLUME_SCATTER; flag |= PATH_RAY_VOLUME_SCATTER;
flag &= ~PATH_RAY_TRANSPARENT_BACKGROUND; flag &= ~PATH_RAY_TRANSPARENT_BACKGROUND;
if (bounce == 1) { if (!(flag & PATH_RAY_ANY_PASS)) {
flag |= PATH_RAY_VOLUME_PASS; flag |= PATH_RAY_VOLUME_PASS;
} }
@@ -184,8 +187,8 @@ ccl_device_inline void path_state_next(KernelGlobals kg, IntegratorState state,
} }
/* Render pass categories. */ /* Render pass categories. */
if (bounce == 1) { if (!(flag & PATH_RAY_ANY_PASS) && !(flag & PATH_RAY_TRANSPARENT_BACKGROUND)) {
flag |= (label & LABEL_TRANSMIT) ? PATH_RAY_TRANSMISSION_PASS : PATH_RAY_REFLECT_PASS; flag |= PATH_RAY_SURFACE_PASS;
} }
} }
@@ -208,9 +211,7 @@ ccl_device_inline bool path_state_volume_next(IntegratorState state)
} }
/* Random number generator next bounce. */ /* Random number generator next bounce. */
if (volume_bounds_bounce > 1) { INTEGRATOR_STATE_WRITE(state, path, rng_offset) += PRNG_BOUNCE_NUM;
INTEGRATOR_STATE_WRITE(state, path, rng_offset) += PRNG_BOUNCE_NUM;
}
return true; return true;
} }

@@ -191,14 +191,18 @@ ccl_device_forceinline void integrate_surface_direct_light(KernelGlobals kg,
const uint16_t transparent_bounce = INTEGRATOR_STATE(state, path, transparent_bounce); const uint16_t transparent_bounce = INTEGRATOR_STATE(state, path, transparent_bounce);
uint32_t shadow_flag = INTEGRATOR_STATE(state, path, flag); uint32_t shadow_flag = INTEGRATOR_STATE(state, path, flag);
shadow_flag |= (is_light) ? PATH_RAY_SHADOW_FOR_LIGHT : 0; shadow_flag |= (is_light) ? PATH_RAY_SHADOW_FOR_LIGHT : 0;
shadow_flag |= (is_transmission) ? PATH_RAY_TRANSMISSION_PASS : PATH_RAY_REFLECT_PASS; shadow_flag |= PATH_RAY_SURFACE_PASS;
const float3 throughput = INTEGRATOR_STATE(state, path, throughput) * bsdf_eval_sum(&bsdf_eval); const float3 throughput = INTEGRATOR_STATE(state, path, throughput) * bsdf_eval_sum(&bsdf_eval);
if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) { if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
const float3 diffuse_glossy_ratio = (bounce == 0) ? const float3 pass_diffuse_weight = (bounce == 0) ?
bsdf_eval_diffuse_glossy_ratio(&bsdf_eval) : bsdf_eval_pass_diffuse_weight(&bsdf_eval) :
INTEGRATOR_STATE(state, path, diffuse_glossy_ratio); INTEGRATOR_STATE(state, path, pass_diffuse_weight);
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, diffuse_glossy_ratio) = diffuse_glossy_ratio; const float3 pass_glossy_weight = (bounce == 0) ?
bsdf_eval_pass_glossy_weight(&bsdf_eval) :
INTEGRATOR_STATE(state, path, pass_glossy_weight);
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, pass_diffuse_weight) = pass_diffuse_weight;
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, pass_glossy_weight) = pass_glossy_weight;
} }
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, render_pixel_index) = INTEGRATOR_STATE( INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, render_pixel_index) = INTEGRATOR_STATE(
@@ -283,7 +287,9 @@ ccl_device_forceinline int integrate_surface_bsdf_bssrdf_bounce(
if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) { if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
if (INTEGRATOR_STATE(state, path, bounce) == 0) { if (INTEGRATOR_STATE(state, path, bounce) == 0) {
INTEGRATOR_STATE_WRITE(state, path, diffuse_glossy_ratio) = bsdf_eval_diffuse_glossy_ratio( INTEGRATOR_STATE_WRITE(state, path, pass_diffuse_weight) = bsdf_eval_pass_diffuse_weight(
&bsdf_eval);
INTEGRATOR_STATE_WRITE(state, path, pass_glossy_weight) = bsdf_eval_pass_glossy_weight(
&bsdf_eval); &bsdf_eval);
} }
} }
@@ -445,7 +451,7 @@ ccl_device bool integrate_surface(KernelGlobals kg,
} }
#endif #endif
shader_prepare_surface_closures(kg, state, &sd); shader_prepare_surface_closures(kg, state, &sd, path_flag);
#ifdef __HOLDOUT__ #ifdef __HOLDOUT__
/* Evaluate holdout. */ /* Evaluate holdout. */
@@ -492,10 +498,6 @@ ccl_device bool integrate_surface(KernelGlobals kg,
kernel_write_denoising_features_surface(kg, state, &sd, render_buffer); kernel_write_denoising_features_surface(kg, state, &sd, render_buffer);
#endif #endif
#ifdef __SHADOW_CATCHER__
kernel_write_shadow_catcher_bounce_data(kg, state, &sd, render_buffer);
#endif
/* Direct light. */ /* Direct light. */
PROFILING_EVENT(PROFILING_SHADE_SURFACE_DIRECT_LIGHT); PROFILING_EVENT(PROFILING_SHADE_SURFACE_DIRECT_LIGHT);
integrate_surface_direct_light(kg, state, &sd, &rng_state); integrate_surface_direct_light(kg, state, &sd, &rng_state);

@@ -263,6 +263,12 @@ ccl_device void volume_shadow_heterogeneous(KernelGlobals kg,
/* Equi-angular sampling as in: /* Equi-angular sampling as in:
* "Importance Sampling Techniques for Path Tracing in Participating Media" */ * "Importance Sampling Techniques for Path Tracing in Participating Media" */
/* Below this pdf we ignore samples, as they tend to lead to very long distances.
* This can cause performance issues with BVH traversal in OptiX, leading it to
* traverse many nodes. Since these contribute very little to the image, just ignore
* those samples. */
# define VOLUME_SAMPLE_PDF_CUTOFF 1e-8f
ccl_device float volume_equiangular_sample(ccl_private const Ray *ccl_restrict ray, ccl_device float volume_equiangular_sample(ccl_private const Ray *ccl_restrict ray,
const float3 light_P, const float3 light_P,
const float xi, const float xi,
@@ -437,7 +443,8 @@ ccl_device_forceinline void volume_integrate_step_scattering(
/* Equiangular sampling for direct lighting. */ /* Equiangular sampling for direct lighting. */
if (vstate.direct_sample_method == VOLUME_SAMPLE_EQUIANGULAR && !result.direct_scatter) { if (vstate.direct_sample_method == VOLUME_SAMPLE_EQUIANGULAR && !result.direct_scatter) {
if (result.direct_t >= vstate.start_t && result.direct_t <= vstate.end_t) { if (result.direct_t >= vstate.start_t && result.direct_t <= vstate.end_t &&
vstate.equiangular_pdf > VOLUME_SAMPLE_PDF_CUTOFF) {
const float new_dt = result.direct_t - vstate.start_t; const float new_dt = result.direct_t - vstate.start_t;
const float3 new_transmittance = volume_color_transmittance(coeff.sigma_t, new_dt); const float3 new_transmittance = volume_color_transmittance(coeff.sigma_t, new_dt);
@@ -474,26 +481,28 @@ ccl_device_forceinline void volume_integrate_step_scattering(
const float3 new_transmittance = volume_color_transmittance(coeff.sigma_t, new_dt); const float3 new_transmittance = volume_color_transmittance(coeff.sigma_t, new_dt);
const float distance_pdf = dot(channel_pdf, coeff.sigma_t * new_transmittance); const float distance_pdf = dot(channel_pdf, coeff.sigma_t * new_transmittance);
/* throughput */ if (vstate.distance_pdf * distance_pdf > VOLUME_SAMPLE_PDF_CUTOFF) {
result.indirect_scatter = true; /* throughput */
result.indirect_t = new_t; result.indirect_scatter = true;
result.indirect_throughput *= coeff.sigma_s * new_transmittance / distance_pdf; result.indirect_t = new_t;
shader_copy_volume_phases(&result.indirect_phases, sd); result.indirect_throughput *= coeff.sigma_s * new_transmittance / distance_pdf;
shader_copy_volume_phases(&result.indirect_phases, sd);
if (vstate.direct_sample_method != VOLUME_SAMPLE_EQUIANGULAR) { if (vstate.direct_sample_method != VOLUME_SAMPLE_EQUIANGULAR) {
/* If using distance sampling for direct light, just copy parameters /* If using distance sampling for direct light, just copy parameters
* of indirect light since we scatter at the same point then. */ * of indirect light since we scatter at the same point then. */
result.direct_scatter = true; result.direct_scatter = true;
result.direct_t = result.indirect_t; result.direct_t = result.indirect_t;
result.direct_throughput = result.indirect_throughput; result.direct_throughput = result.indirect_throughput;
shader_copy_volume_phases(&result.direct_phases, sd); shader_copy_volume_phases(&result.direct_phases, sd);
/* Multiple importance sampling. */ /* Multiple importance sampling. */
if (vstate.use_mis) { if (vstate.use_mis) {
const float equiangular_pdf = volume_equiangular_pdf(ray, equiangular_light_P, new_t); const float equiangular_pdf = volume_equiangular_pdf(ray, equiangular_light_P, new_t);
const float mis_weight = power_heuristic(vstate.distance_pdf * distance_pdf, const float mis_weight = power_heuristic(vstate.distance_pdf * distance_pdf,
equiangular_pdf); equiangular_pdf);
result.direct_throughput *= 2.0f * mis_weight; result.direct_throughput *= 2.0f * mis_weight;
}
} }
} }
} }
@@ -694,8 +703,10 @@ ccl_device_forceinline bool integrate_volume_sample_light(
float light_u, light_v; float light_u, light_v;
path_state_rng_2D(kg, rng_state, PRNG_LIGHT_U, &light_u, &light_v); path_state_rng_2D(kg, rng_state, PRNG_LIGHT_U, &light_u, &light_v);
light_distribution_sample_from_volume_segment( if (!light_distribution_sample_from_volume_segment(
kg, light_u, light_v, sd->time, sd->P, bounce, path_flag, ls); kg, light_u, light_v, sd->time, sd->P, bounce, path_flag, ls)) {
return false;
}
if (ls->shader & SHADER_EXCLUDE_SCATTER) { if (ls->shader & SHADER_EXCLUDE_SCATTER) {
return false; return false;
@@ -794,10 +805,11 @@ ccl_device_forceinline void integrate_volume_direct_light(
const float3 throughput_phase = throughput * bsdf_eval_sum(&phase_eval); const float3 throughput_phase = throughput * bsdf_eval_sum(&phase_eval);
if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) { if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
const float3 diffuse_glossy_ratio = (bounce == 0) ? const float3 pass_diffuse_weight = (bounce == 0) ?
one_float3() : one_float3() :
INTEGRATOR_STATE(state, path, diffuse_glossy_ratio); INTEGRATOR_STATE(state, path, pass_diffuse_weight);
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, diffuse_glossy_ratio) = diffuse_glossy_ratio; INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, pass_diffuse_weight) = pass_diffuse_weight;
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, pass_glossy_weight) = zero_float3();
} }
INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, render_pixel_index) = INTEGRATOR_STATE( INTEGRATOR_STATE_WRITE(shadow_state, shadow_path, render_pixel_index) = INTEGRATOR_STATE(
@@ -876,7 +888,8 @@ ccl_device_forceinline bool integrate_volume_phase_scatter(
INTEGRATOR_STATE_WRITE(state, path, throughput) = throughput_phase; INTEGRATOR_STATE_WRITE(state, path, throughput) = throughput_phase;
if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) { if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
INTEGRATOR_STATE_WRITE(state, path, diffuse_glossy_ratio) = one_float3(); INTEGRATOR_STATE_WRITE(state, path, pass_diffuse_weight) = one_float3();
INTEGRATOR_STATE_WRITE(state, path, pass_glossy_weight) = zero_float3();
} }
/* Update path state */ /* Update path state */
@@ -1024,7 +1037,7 @@ ccl_device void integrator_shade_volume(KernelGlobals kg,
else { else {
/* Continue to background, light or surface. */ /* Continue to background, light or surface. */
integrator_intersect_next_kernel_after_volume<DEVICE_KERNEL_INTEGRATOR_SHADE_VOLUME>( integrator_intersect_next_kernel_after_volume<DEVICE_KERNEL_INTEGRATOR_SHADE_VOLUME>(
kg, state, &isect); kg, state, &isect, render_buffer);
return; return;
} }
#endif /* __VOLUME__ */ #endif /* __VOLUME__ */

@@ -105,8 +105,42 @@ ccl_device_inline void shader_copy_volume_phases(ccl_private ShaderVolumePhases
ccl_device_inline void shader_prepare_surface_closures(KernelGlobals kg, ccl_device_inline void shader_prepare_surface_closures(KernelGlobals kg,
ConstIntegratorState state, ConstIntegratorState state,
ccl_private ShaderData *sd) ccl_private ShaderData *sd,
const uint32_t path_flag)
{ {
/* Filter out closures. */
if (kernel_data.integrator.filter_closures) {
if (kernel_data.integrator.filter_closures & FILTER_CLOSURE_EMISSION) {
sd->closure_emission_background = zero_float3();
}
if (kernel_data.integrator.filter_closures & FILTER_CLOSURE_DIRECT_LIGHT) {
sd->flag &= ~SD_BSDF_HAS_EVAL;
}
if (path_flag & PATH_RAY_CAMERA) {
for (int i = 0; i < sd->num_closure; i++) {
ccl_private ShaderClosure *sc = &sd->closure[i];
if ((CLOSURE_IS_BSDF_DIFFUSE(sc->type) &&
(kernel_data.integrator.filter_closures & FILTER_CLOSURE_DIFFUSE)) ||
(CLOSURE_IS_BSDF_GLOSSY(sc->type) &&
(kernel_data.integrator.filter_closures & FILTER_CLOSURE_GLOSSY)) ||
(CLOSURE_IS_BSDF_TRANSMISSION(sc->type) &&
(kernel_data.integrator.filter_closures & FILTER_CLOSURE_TRANSMISSION))) {
sc->type = CLOSURE_NONE_ID;
sc->sample_weight = 0.0f;
}
else if ((CLOSURE_IS_BSDF_TRANSPARENT(sc->type) &&
(kernel_data.integrator.filter_closures & FILTER_CLOSURE_TRANSPARENT))) {
sc->type = CLOSURE_HOLDOUT_ID;
sc->sample_weight = 0.0f;
sd->flag |= SD_HOLDOUT;
}
}
}
}
/* Defensive sampling. /* Defensive sampling.
* *
* We can likely also do defensive sampling at deeper bounces, particularly * We can likely also do defensive sampling at deeper bounces, particularly
@@ -209,8 +243,7 @@ ccl_device_inline float _shader_bsdf_multi_eval(KernelGlobals kg,
float3 eval = bsdf_eval(kg, sd, sc, omega_in, is_transmission, &bsdf_pdf); float3 eval = bsdf_eval(kg, sd, sc, omega_in, is_transmission, &bsdf_pdf);
if (bsdf_pdf != 0.0f) { if (bsdf_pdf != 0.0f) {
const bool is_diffuse = CLOSURE_IS_BSDF_DIFFUSE(sc->type); bsdf_eval_accum(result_eval, sc->type, eval * sc->weight);
bsdf_eval_accum(result_eval, is_diffuse, eval * sc->weight, 1.0f);
sum_pdf += bsdf_pdf * sc->sample_weight; sum_pdf += bsdf_pdf * sc->sample_weight;
} }
} }
@@ -235,7 +268,7 @@ ccl_device_inline
ccl_private BsdfEval *bsdf_eval, ccl_private BsdfEval *bsdf_eval,
const uint light_shader_flags) const uint light_shader_flags)
{ {
bsdf_eval_init(bsdf_eval, false, zero_float3()); bsdf_eval_init(bsdf_eval, CLOSURE_NONE_ID, zero_float3());
return _shader_bsdf_multi_eval( return _shader_bsdf_multi_eval(
kg, sd, omega_in, is_transmission, NULL, bsdf_eval, 0.0f, 0.0f, light_shader_flags); kg, sd, omega_in, is_transmission, NULL, bsdf_eval, 0.0f, 0.0f, light_shader_flags);
@@ -328,8 +361,7 @@ ccl_device int shader_bsdf_sample_closure(KernelGlobals kg,
label = bsdf_sample(kg, sd, sc, randu, randv, &eval, omega_in, domega_in, pdf); label = bsdf_sample(kg, sd, sc, randu, randv, &eval, omega_in, domega_in, pdf);
if (*pdf != 0.0f) { if (*pdf != 0.0f) {
const bool is_diffuse = CLOSURE_IS_BSDF_DIFFUSE(sc->type); bsdf_eval_init(bsdf_eval, sc->type, eval * sc->weight);
bsdf_eval_init(bsdf_eval, is_diffuse, eval * sc->weight);
if (sd->num_closure > 1) { if (sd->num_closure > 1) {
const bool is_transmission = shader_bsdf_is_transmission(sd, *omega_in); const bool is_transmission = shader_bsdf_is_transmission(sd, *omega_in);
@@ -655,7 +687,7 @@ ccl_device_inline float _shader_volume_phase_multi_eval(
float3 eval = volume_phase_eval(sd, svc, omega_in, &phase_pdf); float3 eval = volume_phase_eval(sd, svc, omega_in, &phase_pdf);
if (phase_pdf != 0.0f) { if (phase_pdf != 0.0f) {
bsdf_eval_accum(result_eval, false, eval, 1.0f); bsdf_eval_accum(result_eval, CLOSURE_VOLUME_HENYEY_GREENSTEIN_ID, eval);
sum_pdf += phase_pdf * svc->sample_weight; sum_pdf += phase_pdf * svc->sample_weight;
} }
@@ -671,7 +703,7 @@ ccl_device float shader_volume_phase_eval(KernelGlobals kg,
const float3 omega_in, const float3 omega_in,
ccl_private BsdfEval *phase_eval) ccl_private BsdfEval *phase_eval)
{ {
bsdf_eval_init(phase_eval, false, zero_float3()); bsdf_eval_init(phase_eval, CLOSURE_VOLUME_HENYEY_GREENSTEIN_ID, zero_float3());
return _shader_volume_phase_multi_eval(sd, phases, omega_in, -1, phase_eval, 0.0f, 0.0f); return _shader_volume_phase_multi_eval(sd, phases, omega_in, -1, phase_eval, 0.0f, 0.0f);
} }
@@ -729,7 +761,7 @@ ccl_device int shader_volume_phase_sample(KernelGlobals kg,
label = volume_phase_sample(sd, svc, randu, randv, &eval, omega_in, domega_in, pdf); label = volume_phase_sample(sd, svc, randu, randv, &eval, omega_in, domega_in, pdf);
if (*pdf != 0.0f) { if (*pdf != 0.0f) {
bsdf_eval_init(phase_eval, false, eval); bsdf_eval_init(phase_eval, CLOSURE_VOLUME_HENYEY_GREENSTEIN_ID, eval);
} }
return label; return label;
@@ -752,7 +784,7 @@ ccl_device int shader_phase_sample_closure(KernelGlobals kg,
label = volume_phase_sample(sd, sc, randu, randv, &eval, omega_in, domega_in, pdf); label = volume_phase_sample(sd, sc, randu, randv, &eval, omega_in, domega_in, pdf);
if (*pdf != 0.0f) if (*pdf != 0.0f)
bsdf_eval_init(phase_eval, false, eval); bsdf_eval_init(phase_eval, CLOSURE_VOLUME_HENYEY_GREENSTEIN_ID, eval);
return label; return label;
} }

@@ -16,6 +16,7 @@
#pragma once #pragma once
#include "kernel/film/write_passes.h"
#include "kernel/integrator/path_state.h" #include "kernel/integrator/path_state.h"
#include "kernel/integrator/state_util.h" #include "kernel/integrator/state_util.h"
@@ -47,7 +48,7 @@ ccl_device_inline bool kernel_shadow_catcher_is_path_split_bounce(KernelGlobals
return false; return false;
} }
if (path_flag & PATH_RAY_SHADOW_CATCHER_PASS) { if (path_flag & PATH_RAY_SHADOW_CATCHER_HIT) {
return false; return false;
} }
@@ -88,6 +89,28 @@ ccl_device_forceinline bool kernel_shadow_catcher_is_object_pass(const uint32_t
return path_flag & PATH_RAY_SHADOW_CATCHER_PASS; return path_flag & PATH_RAY_SHADOW_CATCHER_PASS;
} }
/* Write shadow catcher passes on a bounce from the shadow catcher object. */
ccl_device_forceinline void kernel_write_shadow_catcher_bounce_data(
KernelGlobals kg, IntegratorState state, ccl_global float *ccl_restrict render_buffer)
{
kernel_assert(kernel_data.film.pass_shadow_catcher_sample_count != PASS_UNUSED);
kernel_assert(kernel_data.film.pass_shadow_catcher_matte != PASS_UNUSED);
const uint32_t render_pixel_index = INTEGRATOR_STATE(state, path, render_pixel_index);
const uint64_t render_buffer_offset = (uint64_t)render_pixel_index *
kernel_data.film.pass_stride;
ccl_global float *buffer = render_buffer + render_buffer_offset;
/* Count sample for the shadow catcher object. */
kernel_write_pass_float(buffer + kernel_data.film.pass_shadow_catcher_sample_count, 1.0f);
/* Since the split is done, the sample does not contribute to the matte, so accumulate it as
* transparency to the matte. */
const float3 throughput = INTEGRATOR_STATE(state, path, throughput);
kernel_write_pass_float(buffer + kernel_data.film.pass_shadow_catcher_matte + 3,
average(throughput));
}
#endif /* __SHADOW_CATCHER__ */ #endif /* __SHADOW_CATCHER__ */
CCL_NAMESPACE_END CCL_NAMESPACE_END

@@ -46,8 +46,9 @@ KERNEL_STRUCT_MEMBER(shadow_path,
float3, float3,
unshadowed_throughput, unshadowed_throughput,
KERNEL_FEATURE_SHADOW_PASS | KERNEL_FEATURE_AO_ADDITIVE) KERNEL_FEATURE_SHADOW_PASS | KERNEL_FEATURE_AO_ADDITIVE)
/* Ratio of throughput to distinguish diffuse and glossy render passes. */ /* Ratio of throughput to distinguish diffuse / glossy / transmission render passes. */
KERNEL_STRUCT_MEMBER(shadow_path, float3, diffuse_glossy_ratio, KERNEL_FEATURE_LIGHT_PASSES) KERNEL_STRUCT_MEMBER(shadow_path, float3, pass_diffuse_weight, KERNEL_FEATURE_LIGHT_PASSES)
KERNEL_STRUCT_MEMBER(shadow_path, float3, pass_glossy_weight, KERNEL_FEATURE_LIGHT_PASSES)
/* Number of intersections found by ray-tracing. */ /* Number of intersections found by ray-tracing. */
KERNEL_STRUCT_MEMBER(shadow_path, uint16_t, num_hits, KERNEL_FEATURE_PATH_TRACING) KERNEL_STRUCT_MEMBER(shadow_path, uint16_t, num_hits, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_END(shadow_path) KERNEL_STRUCT_END(shadow_path)

@@ -60,8 +60,9 @@ KERNEL_STRUCT_MEMBER(path, float, min_ray_pdf, KERNEL_FEATURE_PATH_TRACING)
KERNEL_STRUCT_MEMBER(path, float, continuation_probability, KERNEL_FEATURE_PATH_TRACING) KERNEL_STRUCT_MEMBER(path, float, continuation_probability, KERNEL_FEATURE_PATH_TRACING)
/* Throughput. */ /* Throughput. */
KERNEL_STRUCT_MEMBER(path, float3, throughput, KERNEL_FEATURE_PATH_TRACING) KERNEL_STRUCT_MEMBER(path, float3, throughput, KERNEL_FEATURE_PATH_TRACING)
/* Ratio of throughput to distinguish diffuse and glossy render passes. */ /* Ratio of throughput to distinguish diffuse / glossy / transmission render passes. */
KERNEL_STRUCT_MEMBER(path, float3, diffuse_glossy_ratio, KERNEL_FEATURE_LIGHT_PASSES) KERNEL_STRUCT_MEMBER(path, float3, pass_diffuse_weight, KERNEL_FEATURE_LIGHT_PASSES)
KERNEL_STRUCT_MEMBER(path, float3, pass_glossy_weight, KERNEL_FEATURE_LIGHT_PASSES)
/* Denoising. */ /* Denoising. */
KERNEL_STRUCT_MEMBER(path, float3, denoising_feature_throughput, KERNEL_FEATURE_DENOISING) KERNEL_STRUCT_MEMBER(path, float3, denoising_feature_throughput, KERNEL_FEATURE_DENOISING)
/* Shader sorting. */ /* Shader sorting. */

@@ -71,6 +71,10 @@ ccl_device int subsurface_bounce(KernelGlobals kg,
} }
# endif # endif
if (sd->flag & SD_BACKFACING) {
path_flag |= PATH_RAY_SUBSURFACE_BACKFACING;
}
INTEGRATOR_STATE_WRITE(state, path, throughput) *= weight; INTEGRATOR_STATE_WRITE(state, path, throughput) *= weight;
INTEGRATOR_STATE_WRITE(state, path, flag) = path_flag; INTEGRATOR_STATE_WRITE(state, path, flag) = path_flag;
@@ -79,7 +83,8 @@ ccl_device int subsurface_bounce(KernelGlobals kg,
if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) { if (kernel_data.kernel_features & KERNEL_FEATURE_LIGHT_PASSES) {
if (INTEGRATOR_STATE(state, path, bounce) == 0) { if (INTEGRATOR_STATE(state, path, bounce) == 0) {
INTEGRATOR_STATE_WRITE(state, path, diffuse_glossy_ratio) = one_float3(); INTEGRATOR_STATE_WRITE(state, path, pass_diffuse_weight) = one_float3();
INTEGRATOR_STATE_WRITE(state, path, pass_glossy_weight) = zero_float3();
} }
} }

@@ -47,6 +47,7 @@ ccl_device_inline bool subsurface_disk(KernelGlobals kg,
const float time = INTEGRATOR_STATE(state, ray, time); const float time = INTEGRATOR_STATE(state, ray, time);
const float3 Ng = INTEGRATOR_STATE(state, subsurface, Ng); const float3 Ng = INTEGRATOR_STATE(state, subsurface, Ng);
const int object = INTEGRATOR_STATE(state, isect, object); const int object = INTEGRATOR_STATE(state, isect, object);
const uint32_t path_flag = INTEGRATOR_STATE(state, path, flag);
/* Read subsurface scattering parameters. */ /* Read subsurface scattering parameters. */
const float3 radius = INTEGRATOR_STATE(state, subsurface, radius); const float3 radius = INTEGRATOR_STATE(state, subsurface, radius);
@@ -123,6 +124,9 @@ ccl_device_inline bool subsurface_disk(KernelGlobals kg,
const int object = ss_isect.hits[hit].object; const int object = ss_isect.hits[hit].object;
const int object_flag = kernel_tex_fetch(__object_flag, object); const int object_flag = kernel_tex_fetch(__object_flag, object);
float3 hit_Ng = ss_isect.Ng[hit]; float3 hit_Ng = ss_isect.Ng[hit];
if (path_flag & PATH_RAY_SUBSURFACE_BACKFACING) {
hit_Ng = -hit_Ng;
}
if (object_flag & SD_OBJECT_NEGATIVE_SCALE_APPLIED) { if (object_flag & SD_OBJECT_NEGATIVE_SCALE_APPLIED) {
hit_Ng = -hit_Ng; hit_Ng = -hit_Ng;
} }

@@ -73,7 +73,7 @@ ccl_device_inline bool light_sample(KernelGlobals kg,
ls->P = zero_float3(); ls->P = zero_float3();
ls->Ng = zero_float3(); ls->Ng = zero_float3();
ls->D = zero_float3(); ls->D = zero_float3();
ls->pdf = true; ls->pdf = 1.0f;
ls->t = FLT_MAX; ls->t = FLT_MAX;
return true; return true;
} }
@@ -131,7 +131,7 @@ ccl_device_inline bool light_sample(KernelGlobals kg,
float3 dir = make_float3(klight->spot.dir[0], klight->spot.dir[1], klight->spot.dir[2]); float3 dir = make_float3(klight->spot.dir[0], klight->spot.dir[1], klight->spot.dir[2]);
ls->eval_fac *= spot_light_attenuation( ls->eval_fac *= spot_light_attenuation(
dir, klight->spot.spot_angle, klight->spot.spot_smooth, ls->Ng); dir, klight->spot.spot_angle, klight->spot.spot_smooth, ls->Ng);
if (ls->eval_fac == 0.0f) { if (!in_volume_segment && ls->eval_fac == 0.0f) {
return false; return false;
} }
} }
@@ -170,7 +170,7 @@ ccl_device_inline bool light_sample(KernelGlobals kg,
float3 sample_axisu = axisu; float3 sample_axisu = axisu;
float3 sample_axisv = axisv; float3 sample_axisv = axisv;
if (klight->area.tan_spread > 0.0f) { if (!in_volume_segment && klight->area.tan_spread > 0.0f) {
if (!light_spread_clamp_area_light( if (!light_spread_clamp_area_light(
P, Ng, &ls->P, &sample_axisu, &sample_axisv, klight->area.tan_spread)) { P, Ng, &ls->P, &sample_axisu, &sample_axisv, klight->area.tan_spread)) {
return false; return false;
@@ -203,7 +203,7 @@ ccl_device_inline bool light_sample(KernelGlobals kg,
ls->pdf *= kernel_data.integrator.pdf_lights; ls->pdf *= kernel_data.integrator.pdf_lights;
return (ls->pdf > 0.0f); return in_volume_segment || (ls->pdf > 0.0f);
} }
ccl_device bool lights_intersect(KernelGlobals kg, ccl_device bool lights_intersect(KernelGlobals kg,

@@ -199,6 +199,9 @@ ccl_device_inline float3 shadow_ray_offset(KernelGlobals kg,
if (offset_cutoff > 0.0f) { if (offset_cutoff > 0.0f) {
float NgL = dot(Ng, L); float NgL = dot(Ng, L);
float offset_amount = 0.0f; float offset_amount = 0.0f;
if (NL < 0) {
NL = -NL;
}
if (NL < offset_cutoff) { if (NL < offset_cutoff) {
offset_amount = clamp(2.0f - (NgL + NL) / offset_cutoff, 0.0f, 1.0f); offset_amount = clamp(2.0f - (NgL + NL) / offset_cutoff, 0.0f, 1.0f);
} }

@@ -23,7 +23,8 @@ CCL_NAMESPACE_BEGIN
ccl_device_inline bool svm_node_aov_check(const uint32_t path_flag, ccl_device_inline bool svm_node_aov_check(const uint32_t path_flag,
ccl_global float *render_buffer) ccl_global float *render_buffer)
{ {
bool is_primary = (path_flag & PATH_RAY_CAMERA) && (!(path_flag & PATH_RAY_SINGLE_PASS_DONE)); bool is_primary = (path_flag & PATH_RAY_TRANSPARENT_BACKGROUND) &&
(!(path_flag & PATH_RAY_SINGLE_PASS_DONE));
return ((render_buffer != NULL) && is_primary); return ((render_buffer != NULL) && is_primary);
} }

Some files were not shown because too many files have changed in this diff.