Compare commits

..

422 Commits

Author SHA1 Message Date
576af00029 Merge branch 'main', update to changes 2023-02-17 15:37:16 +01:00
08f24553dc Cleanup: Remove accidentally merged TODO comment 2023-02-17 15:05:39 +01:00
7838eb12c6 Merge branch 'blender-v3.5-release' 2023-02-17 14:37:35 +01:00
54b1e71dda Make update: Allow amd64 architecture
Apparently, the 65bit Intel architecture is presented differently
on Linux and Windows.

Allow both variants for the command line, so that semantically the
command line argument can be seen as a lower case platform.machine.
2023-02-17 14:35:53 +01:00
69677827ff Fix #104347: Loop Cut Tool becomes impressive with GPU Subdivision
When updating a mesh, the GPU Subdivision code makes calls to
`GPU_indexbuf_bind_as_ssbo()`.

This may cause the current VAO index buffer to change due to calls from
`glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ibo_id_)` in
`GPU_indexbuf_bind_as_ssbo()`.

The solution is to unbind the VAO (by calling `glBindVertexArray(0)`)
before creating the index buffer IBO.

Co-authored-by: Germano Cavalcante <grmncv@gmail.com>
Pull Request #104873
2023-02-17 14:30:29 +01:00
42f636f7e8 Merge branch 'blender-v3.5-release' 2023-02-17 13:43:17 +01:00
60abeddc55 Make update: Add --architecture command line attribute
Possible values are x86_64 and arm64.

Allows to use make_update.py in a cross-compile environment, like
building x86_64 macOS Blender from Apple Silicon machine.

Pull Request #104863
2023-02-17 13:42:15 +01:00
cb95f8aea7 Animation: Clamp V2D so keyframes cannot go offscreen
In the Dope Sheet and the Timeline, it was possible to drag the view until the keyframes were completely out of view.
(Important to drag in the region with the keyframes, dragging in the channel box already did clamping)

This patch adds a clamping mechanism matching that of the channel box. That means the last channel will stick to the bottom of the view.

Co-authored-by: Christoph Lendenfeld <chris.lenden@gmail.com>
Pull Request #104516
2023-02-17 12:47:45 +01:00
d4480fdfa3 Merge branch 'blender-v3.5-release' 2023-02-17 11:45:49 +01:00
6ffaee8d9a Fix #95400: Crash when running Euler Filter on baked Curves
Fix a crash when using the Euler Filter from the Graph Editor on baked curves.

The crash happened because baked curves have no bezt array.
Skipping any curves where that was missing fixes the issue.

Co-authored-by: Christoph Lendenfeld <chris.lenden@gmail.com>
Pull Request #104858
2023-02-17 11:45:11 +01:00
bf7ccd43ca Cleanup: use PI and PI/2 in compatible_eul
3.2 and 1.6 where used as rough equivalents to M_PI & M_PI_2, however
this raised questions about the significance of these values.

Running thousands of tests with generated euler inputs I wasn't able to
detect a difference so use M_PI & M_PI_2 instead.
2023-02-17 21:18:30 +11:00
773903f43c Docs: add doc-string for compatible_eul & improve code-comments 2023-02-17 21:18:30 +11:00
9b3ce950e6 Fix #95400: Crash when running Euler Filter on baked Curves
Fix a crash when using the Euler Filter from the Graph Editor on baked curves.

The crash happened because baked curves have no bezt array.
Skipping any curves where that was missing fixes the issue.

Co-authored-by: Christoph Lendenfeld <chris.lenden@gmail.com>
Pull Request #104858
2023-02-17 11:05:57 +01:00
f6c6805226 Merge branch 'blender-v3.5-release' 2023-02-17 10:56:41 +01:00
a4b92a6814 Fix #104829: Clip editor has poor performance when Metadata panel is closed
According to the report it is a regression since 3.2, but it is tricky
to pin-point which exact commit caused it.

The root of the issue is that under certain circumstances frame might
be read and processed twice, depending on the order in which panels and
the main area is drawn: the footage information panel skips cache, so
it it is drawn prior to the main area it leads to 2 frame reads. Opening
the Metadata panel triggers code path which forces frame to be put to
the cache, solving the double frame read.

Solution is simple: do not skip cache when acquiring image buffer for
the footage information: the same frame will be needed for the main
area as well.

Pull Request #104860
2023-02-17 10:56:10 +01:00
932148ec54 Merge branch 'blender-v3.5-release' 2023-02-17 10:15:28 +01:00
09b9106e3d Make update: Use BKE_blender_version to detect release branches
On a user level there are no expected changes, other than being able
to update submodules and libraries from a main repository at a detached
HEAD situation (which did not work before).

On the infrastructure level of things this moves us closer to ability
to use the main make_update.py for the buildbot update-code stage, and
to remove the update-code section from the pipeline_config.yaml.

The initial idea of switching make_update to the pipeline config did
not really work, causing duplicated work done on blender side and the
buildbot side. Additionally, it is not easy to switch make_update.py
to use pipeline_config.yaml because the YAML parser is not included
into default package of Python.

There will be few more steps of updates to this script before we can
actually clean-up the pipeline_config: the changes needs to be also
applied on the buildbot side to switch it to the actual make_update.

Switching buildbot to the official make_update.py allows to much more
easily apply the submodules change as per #104573.
2023-02-17 10:04:37 +01:00
0250b40750 Cleanup: Silence compiler warning in VKShader. 2023-02-17 09:54:57 +01:00
c6ea00de3c BLI_math: correct the threshold for wrapping rotation in compatible_eul
Having a threshold well above PI would result in discontinuity in some
cases.

The discontinuity can be measured by generating euler values (both
random and interpolated rotations), then comparing the accumulated
difference.
Changing the threshold for wrapping rotations produces at least as good
or more compatible results.

This was reported as #17297 and fixed in [0], however the change was
only applied for the game-engine.

Ref !104856

[0]: ab44742cf3
2023-02-17 19:52:15 +11:00
1a6943d553 Merge branch 'blender-v3.5-release' 2023-02-17 09:38:18 +01:00
c624e56ffc Fix #104810: Appending a camera does not pull in background movie clip
This was broken even before 0649e63716 and was always expanding the
`Image`, not the movie clip (even if the source was set to
`CAM_BGIMG_SOURCE_MOVIE`)

Now the rule here seems to be to always expand unconditionally, so
remove checking the source and always expand image and movie clip.

Co-authored-by: Philipp Oeser <philipp@blender.org>
Pull Request #104815
2023-02-17 09:36:16 +01:00
5de9a5dea5 Sculpt: Fix 104618: Topology automasking errors
Sculpt island tags are now invalidated whenever the PBVH
is rebuilt.
2023-02-16 22:37:20 -08:00
9808d6abd8 Cleanup: loop over axes in compatible_eul
There is no need to inline the loop.
2023-02-17 16:18:43 +11:00
6b84636ff2 Cleanup: unused warning 2023-02-17 16:18:31 +11:00
a52c0a252f Cleanup: fix warnings from last commit 2023-02-16 21:14:27 -08:00
c352eeb213 Sculpt: Add support for last operator panel to mesh filter operator
* Repeat last operator now works for mesh filters.
* Added an iteration_count property to repeat the filter.
  This is especially useful when compounded with the repeat
  last operator tool.
* The mouse event history is stored for mesh filters
  with more advanced user input (mostly Smooth and Relax
  filters).
2023-02-16 20:54:56 -08:00
ef60b13c1f Deps: Optimize meson based deps
meson defaults to debug builds [0] unless you tell it differently, this
diff changes the options for

- epoxy
- fribidi
- harfbuzz
- wayland
- wayland_protocols

to be optimized, mesa was already optimized

[0] https://mesonbuild.com/Builtin-options.html#core-options

Pull Request #104802
2023-02-17 01:43:25 +01:00
9c79875f65 Cleanup: use BitSpan instead of BitVector where possible
Passing a `BitSpan` is generally better because then the caller is not
forced to allocate the bits with a `BitVector`. Also, the `BitSpan` can
be stored in the stack, which removes one pointer indirection compared
to accessing bits through a `BitVector &`.
2023-02-17 00:47:02 +01:00
891f47b801 BLI: new bit span data structure
This adds `BitSpan` and `MutableBitSpan`. They work essentially the same as
the normal `Span` and `MutableSpan`, but work on individual bits instead
(the smallest type `Span` can handle is one byte large).

This also splits up `BLI_bit_vector.hh` and introduces two new headers:
`BLI_bit_ref.hh` and `BLI_bit_span.hh`.

The goal here is to make working with dynamically sized bit masks more
convenient. I'm mainly working on this because I might want to use this
in #104629. It can also be used to cleanup function signatures that
currently take a reference to a `BitVector`. Like with `Span` vs. `Vector`,
it is better to pass a `BitSpan` to function than a `const BitVector &`.

Unit tests for the new code are included.

Pull Request #104671
2023-02-17 00:42:44 +01:00
8094d389f7 Cleanup: format 2023-02-17 11:41:12 +13:00
09d3ebfd72 Merge branch 'blender-v3.5-release' 2023-02-16 17:00:18 -05:00
efc2e5134f Fix #104841: Split function for Cycles for sharp edges ignores attribute
Cycles uses the "split faces" mesh function to support sharp edges
and auto-smooth. However, 75ad8da1ea updated that
function to ignore the edges that are explicitly tagged as sharp and
only use the edge angle. Fix by taking the attribute into account too.
2023-02-16 16:59:58 -05:00
4ebb66864a Fix #104826: Mesh to BMesh with shape keys can corrupt layers
The custom data layer mappings from dfacaf4f40 were created
*before* the BMesh shape key layers were added, invalidating the BMesh
data offsets they stored. Fix by creating the mappings after all layers
have been created.
2023-02-16 16:30:59 -05:00
8a51d61b44 Merge branch 'blender-v3.5-release' 2023-02-16 15:20:00 -05:00
eb5fead5ac Fix: Use proper types in compare node link drag search options
Clean up logic to make it more clear and formalize the way to choose
fixed node data type based on operation. This make possible to more
easily fix wrong node data type for color type and less than ops.

Pull Request #104617
2023-02-16 15:18:56 -05:00
b2a536e9d7 Fix: Remove the rest of the tracks in the BKE_nlatrack_remove_and_free to free up the memory
Fix: The BKE_nlatrack_remove_and_free (#104752) unit test leaks a little memory. Cleaning up the rest of the track list to ensure everything is freed.
Co-authored-by: Nate Rupsis <nrupsis@gmail.com>
Pull Request #104839
2023-02-16 20:24:33 +01:00
77c273ee37 Fix unused variable in release builds
Assert is not needed in this case.
2023-02-16 18:57:50 +01:00
4cdf27cd96 Merge branch 'blender-v3.5-release' 2023-02-16 12:50:11 -05:00
9f41f95c8e Fix #104785: Quick fur keeps asset status of appended node groups
Use the recently added "clear_asset_data" option in the append operator.

Pull Request #104828
2023-02-16 18:14:47 +01:00
c2a0decbf3 Cleanup: Use Span to iterate over nodes instead of ListBase
Similar to 5b8e2ebd97.
2023-02-16 12:12:38 -05:00
851de8170d Cleanup: Nodes: Avoid unnecessary const cast, use early return 2023-02-16 12:11:20 -05:00
4e7a7e613a Cleanup: Fix operator description 2023-02-16 17:47:04 +01:00
bf5a89f4e0 Cleanup: Adding in NLA Track Remove / Remove and free BKE methods
This PR adds 2 new methods:
* BKE_nlatrack_remove
* BKE_nlatrack_remove_and_free

and modifies the existing `BKE_nlatrack_free` to remove the track list parameter.

This refactor splits out the removal / freeing into it's own methods, and provides a higher order method (BKE_nlatrack_remove_and_free) to conveniently call both.

Co-authored-by: Nate Rupsis <nrupsis@gmail.com>
Pull Request #104752
2023-02-16 17:31:09 +01:00
81b53aa507 WM: Add option for clearing asset data to append operator
When appending assets it often isn't expected for the asset tags and
meta-data to be included. Add an option to the append operator to
disable appending the asset data, exposing existing internal options.
2023-02-16 11:17:02 -05:00
55843cd64b Curves: Add select more/less
This adds the "Select More/Less" operators for Curves. Both operators use the `select_adjacent` function to (de)select adjacent points.

Pull Request #104626
2023-02-16 17:02:43 +01:00
1e9564864c Cleanup: Refactor File/Asset Browser button dragging code
Code to set the dragging data for a button was mostly duplicated, they
share it now. The followup commit also needs an easy way to reuse the
logic, which is possible now.
2023-02-16 16:57:10 +01:00
6da512f0bc UI: Refactor how draggable part of button is determined
No longer use the existance of an image pointer inside the button, or
the the type of a button to decide if the entire button should be
draggable, or only its icon. This was rather sneaky, undocumented
behavior. Instead make this a proper option in the public UI API, and
enable it by default in specific cases to match the previous behavior.

This at least makes the behavior known and controllable in the API, and
not something that's just "secretly" done deep down in the button
handling code. I'd argue we should just use the entire button width by
default for dragging, but that would be a bigger change.
2023-02-16 16:57:10 +01:00
9edb1d0a7c Fix #104166: Add redraw for asset marking and unmarking
When the users click the "Mark as Asset" with the mouse hover the fake
user button, the button was not refreshed. In fact, the areas are not
listening to the "NC_ID NA_EDITED", which is the signal emitted after
an asset is marked/unmarked. Because of this, the areas aren't redrawn
(especially the ID buttons).

This little patch adds the event listening for the areas where this
problem is happening  node editor and properties editor.

Pull Request #104694
2023-02-16 16:26:23 +01:00
2ccb820c7e Merge branch 'blender-v3.5-release' 2023-02-16 15:27:40 +01:00
47934b5c2b Curves: Add remove_selection function
This adds a `remove_selection` function that can be used by other
objects that make use of `CurvesGeometry`.

Pull Request #104813
2023-02-16 15:26:13 +01:00
be6847e773 Cleanup: Remove read-only attribute type
After the removal of the "normal" attribute providers, we no longer
use the concept of read-only attributes. Removing this status simplifies
code, clarifies the design, and removes potentially buggy corner cases.

Pull Request #104795
2023-02-16 15:08:28 +01:00
300c673a64 Geometry Nodes: Remove "normal" attribute
The "normal" was added before fields existed because we needed a
way to expose the data to geometry nodes. It isn't really an attribute,
because it's read-only and it's derived rather than original data.

No features have relied on the "normal" attribute existing, except
for the corresponding column in the spreadsheet. However, the
column in the spreadsheet is also inconsistent, since it isn't an
attribute but looks just like the other columns. The normal is
always visible in the spreadsheet.

Pull Request #104795
2023-02-16 15:08:28 +01:00
368559647f Merge branch 'blender-v3.5-release' 2023-02-16 15:01:29 +01:00
c785e7431e Themes: Fix several issues in Blender Light theme
* Fix #92539: Hard to read the breadcrumbs.
* Fix View Item active, hover, and text color (e.g. count numbers in the
  Spreadsheet were almost unreadable).
* Fix mismatching node type colors with the default theme.
  Blender Light is meant to be simply a brighter version of the default,
  so screenshots and tutorials can be followed with both themes.
* Use the same outline color for widgets, so they match when aligned in a row.
* Make panels standout (not fully transparent), like in the default theme.
2023-02-16 14:39:41 +01:00
d705c8ed57 Merge branch 'blender-v3.5-release' 2023-02-16 08:30:52 -05:00
9d15b3f424 Fix #104697: Curves Sculpt: Setting brush shortcuts does not work
The active tools in `_defs_curves_sculpt` don't use names that are
exactly the same as the corresponding brush name with "builtin_brush."
at the beginning, instead they use more standard identifiers without
capitals or spaces.

The "brush_select" utility operator assumed the names matched though.
That can be fixed by manually mapping the brushes to the active tools.

Pull Request #104792
2023-02-16 13:48:39 +01:00
c9285f83ab Fix #104698: Assert and failure adding shortcuts to curves sculpt tools
The keymap name in `WM_keymap_guess_from_context` didn't match the
name of the keymap in the Blender default keymap (`km_sculpt_curves`).
Fix by changing the utility function to match the keymap name.

Before right clicking on any tool in curves sculpt mode gave an assert,
now it shows a context menu.

Pull Request #104791
2023-02-16 13:47:13 +01:00
c39b81d832 UI: Improvement to screen_find_area_xy
In BKE_screen_area_map_find_area_xy (find a ScrArea by 2D location),
ignore edges by using screen verts instead of totrct

Differential Revision: blender/blender#104680

Reviewed by Brecht Van Lommel
2023-02-16 02:38:57 +01:00
6e7242f00c Merge branch 'blender-v3.5-release' 2023-02-16 00:53:04 +01:00
4cb119f533 Fix #104584: Reassigning effect strip input does not update its position
Lookup cache was not invalidated, to update attached effects position, a handle
of a input strip is touched.

To update attached effects, currently the code only does that when strip
position is changed. This is, because effect strip updating is done internally
in sequencer module code and ideally shouldn't be done at all. A TODO comment
with further explanation is added.
2023-02-16 00:48:52 +01:00
10c0c2a156 Windows: allow windows to span multiple monitors
On the Windows platform allow Blender windows to be created that are
spread over multiple adjacent monitors.

---

On Windows we are quite feature-complete and stable for the creation and placement of multiple (non-temp) Blender windows. We correctly do so across multiple monitors no matter their arrangement, resolution, and scale.

However, there is another way that Blender could use multiple monitors - suggested by a core dev - which is to size a window so that it comprises multiple monitors. There are some advantages to this way of working because the one window remains constantly active and in focus. It also allows a single region (like a node editor) to be as large as possible.

But this way of working is not currently possible. That is because during window creation we constrain them to fit within the confines of the nearest single monitor. This has mostly been done for simplicity and safety. We don't want to restore a saved window to a position where it cannot be seen or used.

This patch addresses that. It allows windows to span multiple monitors, and does so safely by constraining the four corners of the window to be within the working area of any active monitor.  This means it allows the creation of single windows as shown below in blue (left two), but does not allow the one in orange (right):

![image.png](/attachments/b2a9dfca-d54c-467a-ab95-717df3b25051)

Note this has been previously (before gitea) reviewed and approved by Brecht.

Co-authored-by: Harley Acheson <harley.acheson@gmail.com>
Pull Request #104438
2023-02-16 00:41:26 +01:00
ab63fe9eab Cleanup: format 2023-02-16 10:04:09 +11:00
a2551f23ad Cleanup: document & improve naming for Wayland's pending window actions 2023-02-16 10:02:23 +11:00
f178e3f849 Merge branch 'blender-v3.5-release' 2023-02-15 23:56:58 +01:00
7a76f2ae77 Fix #104370: Draw: Don't request the same attribute more than once
Avoid running out of attributes when multiple material slots use the same one.

Cleanup:
Removes the return value from drw_attributes_add_request since it shouldn't be modified afterward and it's never used.
Avoid making copies of DRW_AttributeRequest in drw_attributes_has_request.

Co-authored-by: Miguel Pozo <pragma37@gmail.com>
Pull Request #104709
2023-02-15 23:54:51 +01:00
b1abc23899 Fix: Skip anonymous CustomData layers for the python API
Now that the UV map names are read from the evaluated mesh the names of
the anonymous layers would show up in the UV Map node and be accessible
via the python interface.

This changes the collection definition to skip anonymous layers.

Pull Request #104783
2023-02-15 23:54:50 +01:00
043bff144d Merge branch 'blender-v3.5-release' 2023-02-15 17:40:15 -05:00
d465b92823 Cleanup: Make format, fix missing static warning 2023-02-15 17:39:53 -05:00
7ad1d3156c Cleanup: Use simpler attribute transfer API in duplicate elements node
Instead of retrieving which attributes to transfer from the geometry set
which exists at a different abstraction level, get them from accessors
directly with a newer utility function. This removes boilerplate code
and makes the logic clearer for a future even more generic attribute
propagation API.
2023-02-15 17:38:47 -05:00
2cad80b0e5 Merge branch 'blender-v3.5-release' 2023-02-15 16:13:35 -05:00
Colin Marmond
c59d2f3f2d Fix #102529: De-duplicate add node search items with assets
Fix a little omission. The id name has a 2 char prefix which has
to be removed in order to compare to the real name.

Pull Request #104793
2023-02-15 16:10:37 -05:00
1e6ed77896 Merge branch 'blender-v3.5-release' 2023-02-15 13:27:49 -05:00
72a2229848 Fix #104690: Evaluated positions user-after-free for copied poly curves
The evaluated positions cache can live longer than a specific
`CurvesGeometry`, but for only-poly curves, it pointed to the positions,
which are freed when the curves are. Instead, use the same pattern
as the evaluated offsets and don't store the positions span, just return
it when retrieving evaluated positions.
2023-02-15 13:26:06 -05:00
b7e39acfcd Fix #104789: uv_layers.remove() always raises error
A simple typo from 6c774feba2. Also return
so nothing happens when the UV map isn't found.
2023-02-15 13:25:46 -05:00
b07085fe3c Fix #104690: Evaluated positions user-after-free for copied poly curves
The evaluated positions cache can live longer than a specific
`CurvesGeometry`, but for only-poly curves, it pointed to the positions,
which are freed when the curves are. Instead, use the same pattern
as the evaluated offsets and don't store the positions span, just return
it when retrieving evaluated positions.
2023-02-15 11:38:27 -05:00
753eb9c20e Cycles: disable AMD GPU rendering binaries due to compiler bugs
There are known bugs in HIP compiler that are causing random build failures
when making changes to the Cycles kernel. This is preventing developers from
efficiently making improvements to Cycles.

For now Cycles AMD GPU rendering is disabled in Blender 3.6 until a good
solution is found, so that ongoing work like Principled v2 is not blocked.
We hope this can be resolved later on in the 3.6 release cycle.

Ref #104786
2023-02-15 17:24:37 +01:00
05b177b0b3 Revert changes to align asset library labels in menus
This reverts 1116d821dc and part of 5bac672e1a.

The solution was made specifically for the 3.5 release, to avoid
breaking other cases. The previous commit addressed the problem properly
by letting the general menu code align labels where needed.
2023-02-15 16:09:08 +01:00
58752ad93c UI: Ensure menus with icons align all menu item labels
Usually in Blender, we try to align the labels of items within a menu,
if necessary by adding a blank icon for padding. This wasn't done for
menus generated from enum properties (RNA or custom property enums). Now
we do it whenever there is at least one item with an icon.
2023-02-15 16:09:08 +01:00
0f29a65744 Merge branch 'blender-v3.5-release' 2023-02-15 15:59:34 +01:00
1116d821dc Fix weird icon padding in asset library selector menu button
Since the menu doesn't automatically align the labels like other menus
and pulldowns in Blender, I manually made them align using the blank
icon. However the menu button would also include this blank icon now.

This is a specific fix for the 3.5 release. In the main branch I will
replace it with proper support for automatically aligning labels in such
menus.
2023-02-15 15:59:12 +01:00
b9bf4700b0 Fix weird icon padding in asset library selector menu button
Since the menu doesn't automatically align the labels like other menus
and pulldowns in Blender, I manually made them align using the blank
icon. However the menu button would also include this blank icon now.

This is a specific fix for the 3.5 release. In the main branch I will
replace it with proper support for automatically aligning labels in such
menus.
2023-02-15 15:58:40 +01:00
Matti-Ranta
0b63136e63 Cleanup: delete .arcconfig
Now that https://developer.blender.org/ has been decommissioned, arcconfig is no longer needed.

Pull Request #104768
2023-02-15 15:49:22 +01:00
7ae5bc142b Merge branch 'blender-v3.5-release' 2023-02-15 15:37:47 +01:00
a7ccb3df3e Fix possible compiler warning and linker error
Type was forward declared as class, but is a struct. GCC is quiet about this
Clang gives a warning about potential linker errors on MSVC.
2023-02-15 15:37:30 +01:00
a6ebe0ca62 Fix possible compiler warning and linker error
Type was forward declared as class, but is a struct. GCC is quiet about this
Clang gives a warning about potential linker errors on MSVC.
2023-02-15 15:09:22 +01:00
d8112ec1ea Revert "Splash screen for Blender 3.5"
This reverts commit a0b4da2bd0.
2023-02-15 14:17:00 +01:00
6417da28a8 Revert "Blender 3.5 Beta:"
This reverts commit 904b7e4e41.
2023-02-15 14:16:43 +01:00
33193093b1 Merge branch 'blender-v3.5-release' 2023-02-15 14:16:25 +01:00
a0b4da2bd0 Splash screen for Blender 3.5
Credits: Nicole Morena
2023-02-15 14:14:14 +01:00
904b7e4e41 Blender 3.5 Beta:
* BLENDER_VERSION_CYCLE set to beta
* Update pipeline_config.yaml to point to 3.5 branches and svn tags
* Update and uncomment BLENDER_VERSION in download.cmake
2023-02-15 13:56:14 +01:00
d5d48a986b Release cycle: Bump version to 3.6. 2023-02-15 13:49:56 +01:00
5bac672e1a Asset Browser: Add separator line & icon padding for import method menu
Usually when a menu item displays an icon, we indent all other items
with an empty icon so items align nicely. Now with more built-in asset
libraries (the new "Essentials" library), this inconsistency becomes
more apparent.

Also add a separator line between the "All" asset library and the
others, makes the menu look more organized.
2023-02-15 13:11:10 +01:00
e3b5a2ae15 Asset Browser: Hide import method menu for essentials
Essentials always use "Append (Reuse Data)", displaying the menu to
change that doesn't make sense.
2023-02-15 13:03:36 +01:00
0d798ef57c Assets/UI: Use UI-list for asset library preferences UI
The Preferences for asset libraries are becoming more than a simple name
+ path. E.g. there is now an Import Method options, and we previously
also considered a Relative Paths option (which we may still want to
add). The previous UI, while consistent with the Auto Run Python Scripts
UI isn't well suited for less than trivial cases. Using UI lists makes
the UI more scalable and follows usual list UI patterns more. There is
also more space for the path button now.

Part of #104686.
2023-02-15 13:00:17 +01:00
972f58c482 Asset Browser: Rename "Import Type" to "Import Method"
"Type" sounds like you're selecting a thing, not a behavior. So "Method"
should be better. Part of 104686.
2023-02-15 12:58:15 +01:00
ae84a2956e Assets: Preference for default import method for an asset library
The default import method for an asset library can now be determined in
the Preferences. The Asset Browser has a new "Follow Preferences" option
for the importing. The essentials asset library still only uses "Append
(Reuse Data)".

This is part of #104686, which aims at improving the import method
selection, especially for the introduction of the new essentials library
(which doesn't support certain import methods). Further changes are
coming to improve the UI, see #104686.

Pull Request: #104688
2023-02-15 12:51:23 +01:00
4387dbf03e Curves: Replace quick fur operator, add to add menu
The quick fur operator now uses the new hair system. It adds a new
curves object for every selected mesh, and adds geometry nodes
modifiers from the essentials assets that generate curves. A few
settings are exposed in the redo panel, including an option for whether
to apply the modifier to generate the initial curves so that there is
original editable data.

The point of the operator is to give people a sense of how to use the
node groups and to give a very fast way to build a basic setup for
further tweaking.

Pull Request #104764
2023-02-15 12:13:34 +01:00
837ba82402 Fix #90629: VSE: Split selection is determined by menu position
Use the right hand side selection, as it fits the typical workflow
the best.

Arguably, the same would need to be done for the k-shortcut, but
that is another issue to be tackled. As well as making the selection
active.

Pull Request #104777
2023-02-15 12:05:37 +01:00
1a94457586 Fix #104760: GPencil Fill Gap circles no longer disappear when blue line appear
The problem was the alpha channel was set wrongly and the original 0 value was replaced with 1 by error.
2023-02-15 11:55:32 +01:00
35cde8b5fd Curves: move edit mode out of experimental
We (Dalai, Hans, Falk, Simon and me) decided that the curves edit mode
is useful enough to justify moving it out of experimental now. So far it
supports the following features:
* Various selection tools. The selections are synced with sculpt mode.
* Transform tools.
* Delete curves/points.

More functionality of the old curve edit mode will be ported over in
future releases.
2023-02-15 11:43:55 +01:00
ef46f735f3 Cleanup: LineArt: Remove lineart_cpp_bridge.cc
Removed this file because lineart_cpu.cc is already c++.

Pull Request #104736
2023-02-15 11:33:37 +01:00
7481a36d51 EEVEE: Remove unnecessary material optimization assertion.
Fix unreported assert in basic scenes.

Authored by Apple: Michael Parkin-White

Pull Request #104775
2023-02-15 11:27:25 +01:00
3807a0981e Curves: Add delete operator to edit mode
This implements the delete operator in curves edit mode. The behavior
is similar to the delete operator in the edit mode of legacy curves,
i.e. it's actually dissolving and doesn't split curves. This is also
the behavior that we generally want for the hair use case.

The operator is added to the `Curves` menu and can be accessed via
the keyboard using `X` or `Del`.

Pull Request #104765
2023-02-15 10:50:01 +01:00
ddc6815b1a Curves: add Curve Parameter Falloff to comb brush
This adds a new `Curve Falloff` popover to the comb brush tool settings.
The curve control allows changing the brush weight along the curve to
e.g. affect the tip more than the root. This is a relative way to get
something like stiffness for short hair.

This functionality could potentially be added to some other brushes,
but the comb brush is the most important one, so that is added first.

I did add the buttons add the buttons to choose a curve map preset.
However, I did not add the preset dropdown, because that just adds
some unnecessary complexity in the code now and is redundant.

Pull Request #104589
2023-02-15 10:42:30 +01:00
fc6d13eced Cleanup: extract the check to save a blend file into a function
Part of D11422 by @filedescriptor, makes it convenient to run a funciton
when the file cannot be saved.
2023-02-15 16:19:49 +11:00
e424931cf7 Cleanup: rename Byte to Bytes in Python string utilities
These functions operate on byte arrays, not a single byte,
match naming for CPython's PyBytes_AS_STRING.
2023-02-15 16:19:14 +11:00
6661342dc5 Mesh: Set active attribute values edit mode operator
This patch adds a simple operator to set values of the active
attribute for the selected element. The aim is to give simple control
over attribute values in edit mode rather than to provide the fastest
workflow for most cases. Eventually this operator might be less
important compared to more advanced attribute editing tools, but for
now, exposing a little bit of functionality is low hanging fruit and
will help to see the possibilities.

The implementation mostly consists of boilerplate to register the
necessary property types for the operator and draw their UI.
Beyond that, we just loop over selected elements and set a value.

Pull Request #104426
2023-02-15 04:34:53 +01:00
02c3889b1c Cleanup: quiet clang warnings
Quiet unused argument, shadow, array-bounds & range-loop-bind-reference
warnings.
2023-02-15 13:26:54 +11:00
8d35b28f2a Cleanup: spelling in comments 2023-02-15 13:11:14 +11:00
00a7344f27 Cleanup: quiet warnings from shellcheck, correct indentation 2023-02-15 13:11:06 +11:00
8de121ffc6 Cleanup: use 'show_' prefix for RNA sculpt booleans
Follow naming conventions from
https://wiki.blender.org/wiki/Source/Architecture/RNA
2023-02-15 13:11:04 +11:00
c3b53abc10 Fix #84057: Delete operator removes unexpected strip
Caused by strips being flagged for removal, but the flag was never
cleared. As far as I can tell, this issue is not reproducible anymore,
but there may be files with this flag still set.
2023-02-15 01:25:38 +01:00
dcc80748d3 Fix overlay_uniform_color_clipped inheritance
overlay_uniform_color_clipped was inheriting from overlay_depth_only, which doesn't
make much sense.
I've changed it to inherit from overlay_uniform_color instead, which is consistent
with other \*\_clipped variants of shaders.

Pull Request #104761
2023-02-14 22:21:59 +01:00
7b9d1cb51f Eevee: GPU Material node graph optimization.
Certain material node graphs can be very expensive to run. This feature aims to produce secondary GPUPass shaders within a GPUMaterial which provide optimal runtime performance. Such optimizations include baking constant data into the shader source directly, allowing the compiler to propogate constants and perform aggressive optimization upfront.

As optimizations can result in reduction of shader editor and animation interactivity, optimized pass generation and compilation is deferred until all outstanding compilations have completed. Optimization is also delayed util a material has remained unmodified for a set period of time, to reduce excessive compilation. The original variant of the material shader is kept to maintain interactivity.

Also adding a new concept to gpu::Shader allowing assignment of a parent shader from which a shader can pull PSO descriptors and any required metadata for asynchronous shader cache warming. This enables fully asynchronous shader optimization, without runtime hitching, while also reducing runtime hitching for standard materials, by using PSO descriptors from default materials, ahead of rendering.

Further shader graph optimizations are likely also possible with this architecture. Certain scenes, such as Wanderer benefit significantly. Viewport performance for this scene is 2-3x faster on Apple-silicon based GPUs.

Authored by Apple: Michael Parkin-White

Ref T96261
Pull Request #104536
2023-02-14 21:51:03 +01:00
db2eaa5c86 OBJ: fixed some faces wrongly skipped in invalid face validation logic (#104593)
The logic for looping over imported OBJ faces and checking whether any
of them are "invalid" (duplicate vertices) was wrongly skipping
validation of the next face right after some invalid face. It
was the previously invalid face, moving the last into its place,
but then the loop was incrementing the face index and that just-moved
face was not properly validated.

Fixes #104593 - importing attached obj file (which contains some faces
that have duplicate indices). Added test coverage with a much smaller
obj file.
2023-02-14 21:49:25 +02:00
e1a29b58bb Fix: Curve resolution input node missing default
When the attribute doesn't exist, the node should give the default
of 12, as defined by the accessor method for `bke::CurvesGeometry`.

Pull Request #104674
2023-02-14 19:59:37 +01:00
87f96b7664 Fix wrong variable being used as plan in 'gizmo_3d_dial_matrixbasis_calc'
A plane must be `float[4]`.
2023-02-14 15:39:32 -03:00
a755e6e63f Revert "MSVC: lower C4100 warning level from 4 to 3"
This reverts commit db4e7616f3.

Caused many issues when compiling mantaflow.
2023-02-14 15:36:35 -03:00
f72d28d7e7 Gizmo: use the final matrix for the dial3D clip plane
Dial3D uses the `matrix_basis` for the clip plane.

This can cause inconsistencies if the gizmo has `matrix_offset` or
`matrix_space`.
2023-02-14 15:31:30 -03:00
db4e7616f3 MSVC: lower C4100 warning level from 4 to 3
This better aligns with OSX/Linux warnings.

Although `__pragma(warning(suppress:4100))` is not the same as
`__attribute__((__unused__))` in gcc (which only affects the attribute
instead of the line), it still seems to be better to use it than to
hide the warning entirely.
2023-02-14 14:38:18 -03:00
f4db58844d GPU: Fix wrong 2D shader used from 3D drawing
Regression introduced by d165d6aa2a.
2023-02-14 18:32:42 +01:00
acf7f46b77 Fix #104748: GPencil: Fill tool is not working at all
Caused by missing color uniform update.
2023-02-14 18:32:42 +01:00
4fbc9c42e5 Curves: Add transform tools to toolbar
This commit also makes `ED_transform_calc_gizmo_stats` work
for curves.

Pull Request #104750
2023-02-14 18:24:24 +01:00
5ca65001ea Geometry Nodes: Operator to wrap a modifier's node group
As described in #104171, add an operator that creates a new node group
that contain the current node group and named attribute nodes to deal
with the outputs. This saves manual work when moving a high-level
modifier to the node editor for better procedural control.

Pull Request #104546
2023-02-14 18:21:50 +01:00
c9f02569c7 Curves: cage overlay for sculpt mode
This adds a new overlay for curves sculpt mode that displays the curves that the
user currently edits. Those may be different from the evaluated/original curves
when procedural deformations or child curves are used.

The overlay can clash with the evaluated curves when they are exactly on top of
each other. There is not much we can do about that currently. The user will have
to decide whether the overlay should be shown or not on a case-by-case basis.

Pull Request #104467
2023-02-14 18:10:11 +01:00
eb9647c53d Cleanup: rename vertex buffer attribute to be more specific
This renames `data` and `color` to `selection`. This is better because
it's actually what the corresponding buffers contain. Using this
more correct name makes sharing vertex buffers between different
gpu batches for different shaders easier.
2023-02-14 17:53:20 +01:00
7ba59c8a62 Assets: show all assets by default instead of current file
The "current file" mode is only useful when creating new assets.
However, the far more common use case and the one that should require
fewer steps is to use existing assets.

There is a risk that this causes freezing if the file browser preview
caching does not work properly. So we'll have to keep an eye on the bug
tracker to see if this is an issue in practice.

Pull Request #104749
2023-02-14 17:44:41 +01:00
b3fb73f325 Assets: bundle Essentials with Blender
This patch adds an "Essentials" asset library that is bundled with Blender.
Also see #103620. At build time, the `lib/assets/publish` folder is copied
to `datafiles/assets` in the build directory.

In the UI, the "Essentials" library can be accessed like other custom asset
libraries with the exception that assets from that library cannot be linked.

The immediate impact of this is that Blender now comes with some geometry
node groups for procedural hair grooming.

Pull Request #104474
2023-02-14 17:35:29 +01:00
459c210154 Fix #104652: UV Face Dots randomized
Match the check from before 75db4c082b,
where we just checked that the face dots bitmap wasn't null.
2023-02-14 11:27:34 -05:00
715acf758c Curves: Use deformed points in edit mode overlay
Uses the deformed crazyspace to populate the position buffer for Curves edit mode overlay.

Pull Request #104705
2023-02-14 16:33:38 +01:00
cfe828b452 OBJ: Support polylines with more than 2 vertices.
The OBJ spec (page B1-17) allows "l" entries to specify
polylines with more than 2 vertices, optionally with texture
coordinates.

Previously, only the first 2 vertices of each polyline
were read and added as loose edges, failing when texture
coordinates were present.

This adds support for proper polylines, reading but ignoring
texture coordinates.

Pull Request #104503
2023-02-14 16:30:50 +01:00
4cc94679dd MSVC: suppress additional warnings out of external headers
In 161908157d we moved all warnings
coming out of the library folder to /W0 as many of them do not follow
our code-style nor can we force them to.

When i made this change, i took `/external:templates-` to mean
"and that goes for you too, templates" and it decisively does the
opposite leading to /W3 warnings coming out of openvdb

This change removes the flag as it should have never have been added
in the first place.
2023-02-14 07:56:14 -07:00
Robin Hohnsbeen
39f63c8c08 Sculpting: Vector displacement for the sculpting draw brush
Sculpt: Added vector displacement for the sculpting draw brush (area plane mapping only for now)

Vector displacement maps (VDM) provide a way to create complex displacements that can have overhangs in one brush dab.
This is unlike standard displacement with height maps that only displace in the normal direction.
Forms like ears, curled horns, etc can be created in one click if VMDs are used.
There is a checkbox on the draw brush in the texture settings "Vector Displacement" that enables/disables this feature.

Technical description: The RGB channels of a texture in a brush stroke are read and interpreted as individual vectors, that are used to offset vertices.
As of now, this is only working for the draw brush using the area plane mapping. Symmetry and radial symmetry are working.

A few things to consider when making VD-Maps:
* UVs need to stay intact for the bake mesh (e.g. voxel remeshing can't be used to create VD Meshes)
* When exporting a VD Map it should be in the file format OpenEXR (for positive and negative floating point values).
* Export resolution can be 512x512 or lower (EXR files can get very large, but VDM brushes don't need a high resolution)

And when using them:
* Inside Blender clamping needs to be unchecked on the texture
* The brush falloff should be set to constant (or nearly constant)

This patch was inspired by this [right-click-select proposal](https://blender.community/c/rightclickselect/WqWx/) Thanks for the post!

(Moved [this patch](https://archive.blender.org/developer/D17080) to here.)

Co-authored-by: Robin Hohnsbeen <robin@hohnsbeen.de>
Pull Request #104481
2023-02-14 15:29:27 +01:00
d4f7f1dfe4 Fix memory leak in view3d_select
There was an unintentional early return in the click select
execute function that caused the operator properties not
to be freed.
2023-02-14 14:59:07 +01:00
a76b5d3a07 Curves: change collision radius for consistency with default radius
Offset distance for curve collision  will be matched with default curve
radius for better visibility.

Pull Request #104648
2023-02-14 14:15:04 +01:00
59b9a88a35 Fix: duplicate elements node does not update curve type counts
Curve type counts are updated eagerly but it was missing in this
node leading to a crash further down the line where the counts
were expected to be correct.
2023-02-14 13:32:00 +01:00
4126284e46 Allocator: Fail building when trying to MEM_delete a void pointer
`MEM_delete()` is designed for type safe destruction and freeing, void
pointers make that impossible.
Was reviewing a patch that was trying to free a C-style custom data
pointer this way. Apparently MSVC compiles this just fine, other
compilers error out. Make sure this is a build error on all platforms
with a useful message.
2023-02-14 12:48:26 +01:00
1fd0c80725 Build: correct forwarding jobs argument to make_deps_wrapper.sh 2023-02-14 22:26:59 +11:00
5215543280 Constraints: use a dimensionless unit prop for Maintain Volume
The Volume property of the Maintain Volume constraint was marked as a
distance, which made it confusing--especially with non-metric units.

The volume can actually be understood as a factor of the initial
volume, so it should be dimensionless.

Additionally, the volume had a range of 0.001 to 100.0. This is wide
enough in most cases, but sometimes you may need to go orders of
magnitude higher or lower to consider vast or thin objects, and there
should be no drawback to extending the limits, provided they stay
positive.

Pull Request #104489
2023-02-14 12:19:53 +01:00
1e449bee21 Cycles: Temporary fix for Principled Hair albedo
The proper fix (bb9eb262d4) caused compilation problems with HIP, so we're
delaying it until 3.6.
To fix the original bug report (#104586), this is a quick workaround that'll
hopefully not upset the compiler.

Pull Request #104723
2023-02-14 12:18:10 +01:00
5040c39d1a Fix T103354: Author extents on UsdGeomMesh
A properly authored USD file will have the extent attribute authored on all prims conforming to UsdGeomBoundable.
This cached extent information is useful because it allows the 3D range of prims to be quickly understood without reading potentially large arrays of data. Note that because the shape of prims may change over time, extent attributes are always evaluated for a given timecode.

This patch introduces support for authoring extents on meshes and volumes during export to USD.

Because extents are common to multiple kinds of geometries, the main support for authoring extents has been placed in USDAbstractWriter, whose new author_extent method can operate on any prim conforming to pxr::UsdGeomBoundable. The USD library already provides us the code necessary to compute the bounds for a given prim, in pxr::UsdGeomBBoxCache::ComputeLocalBound.
Note that not all prims that are imageable are boundable, such as transforms and cameras.

For more details on extents, see https://graphics.pixar.com/usd/release/api/class_usd_geom_boundable.html#details.

Note that when new types of geometries are introduced, such as curves in https://developer.blender.org/D16545, we will need to update the USD writer for that geometry such that it calls this->author_extent.

Update on Feb 2: This patch has been updated to include a unit test to ensure authored extents are valid. This test requires new test assets that will need to be submitted via svn. The test assets are attached in the d16837_usd_test_assets.zip file. To use, unzip and merge the contents of this zip into the lib/tests/usd folder.

This unit test also addresses #104269 by validating compliance of exported USD via UsdUtils.ComplianceChecker.

Pull Request #104676
2023-02-14 12:11:53 +01:00
a3e6cb5dab Gizmo: press shift key to scale cage2d gizmo uniformly
Reviewed by: Campbell Barton
Pull Request #104418
2023-02-14 11:08:54 +01:00
fe0a219b5b Revert: Formatting changes space_toolsystem_toolbar 2023-02-14 10:30:49 +01:00
09498264f6 Deps: build one at a time, each using all available cores on Linux
Use a MAKE wrapper for 'make deps' on Linux that ensures dependencies
are built one at a time. This is preferable because building many
dependencies at once made troubleshooting impractical and had the
downside that large deps such as LLVM would bottleneck on a single core.

This may be used for macOS, so far it's only tested on Linux.
2023-02-14 16:37:04 +11:00
46c34ba1f6 GNUmakefile: include install directory in 'make deps' message
Avoids uncertainly, ensure deps are copied from the right place.
2023-02-14 13:26:30 +11:00
d02f863b18 Update RNA to User manual mappings 2023-02-14 13:19:18 +11:00
d851c91856 PyDocs: update links in generated API docs for GITEA migration
- Correct broken link for undocumented modules.
  Point to the contributing page, it seems #51062 was lost with the
  GITEA task migration.
- Correct Blender Version link to the SHA1.
2023-02-14 12:41:51 +11:00
a6dcf6e6cb Cleanup: avoid redundant sqrt when picking reroute nodes 2023-02-14 12:26:29 +11:00
818c16ef1f Cleanup: pass a const argument to CustomData_get_elem_size 2023-02-14 12:26:01 +11:00
750e278fd1 Cleanup: line wrapping in USERPREF_PT_experimental_new_features
Changing from Phabricator links caused poor formatting.
Also remove blank lines between comment and it's function.
2023-02-14 12:13:04 +11:00
85f8ba9db7 Cleanup: include missing header, sort CMake file lists 2023-02-14 11:52:14 +11:00
1ac80e8338 Cleanup: quiet unreachable-code warning, use ARRAY_SIZE macro 2023-02-14 11:50:00 +11:00
6dc88682ea License headers: add missing license identifier 2023-02-14 11:47:13 +11:00
36937e523a Fix #66863: Bisect plane rotation behavior bugged for side/cut view
The matrix of the rotation gizmo is somewhat confusing.

Attaching to the Z axis has more predictable results.
2023-02-13 21:15:38 -03:00
0fa34aa0ec Cleanup: spelling in comments, reference enum types in doc-strings
Also use doxy formatting for structs in sculpt_uv.c.
2023-02-14 10:29:48 +11:00
e928dd300b Cleanup: format 2023-02-14 10:29:46 +11:00
dc46465490 Cleanup: use typed enum (eDirEntry_SelectFlag) where appropriate 2023-02-14 10:29:44 +11:00
1d7bf4f826 BLI_utildefines: add ^= operator to ENUM_OPERATORS macro 2023-02-14 10:29:42 +11:00
Erik Abrahamsson
526f2273c6 Docs: improve online manual lookup time time
Matching the RNA id's to the search pattern is slow because
of the function `fnmatchcase`. This patch first checks the string
prefix without any special characters used by fnmatch,
if the `startswith` check fails, there is no need to check `fnmatchcase`.
Before the optimization, an online manual lookup took about 400ms
which is quite noticeable, with this patch applied it's under 10ms.

Ref !104581.
2023-02-14 10:29:40 +11:00
bb9eb262d4 Revert "Cycles: Clean up the Principled Hair BSDF implementation"
This reverts commit cb77865c21.

Appears to break HIP compilation, so delay until 3.6.
2023-02-13 23:32:53 +01:00
1f615690e6 Cycles: Add support for CUDA 12
CUDA 12 no longer supports sm_3*, in particular 35 and sm_37,
so skip building those if only CUDA 12+ is available.

Pull Request #104713
2023-02-13 23:04:02 +01:00
6a0b1eae8c Fix #104097: re-enable Cycles AMD Vega support
The internal compiler error appears to be gone. Unclear why it appeared in the
first place and why it's gone now. Just random kernel code changes causing it.

Pull Request #104719
2023-02-13 22:53:08 +01:00
cb77865c21 Cycles: Clean up the Principled Hair BSDF implementation
- Rename roughness variables for more clarity - before, the SVM/OSL code would
  set s and v to the linear roughness values, and the setup function would over-
  write them with the distribution parameters. This actually caused a bug in the
  albedo code, since it intended to use the linear roughness value, but ended up
  getting the remapped value.
- Deduplicate the evaluation and sample functions. Most of their code is the
  same, only the middle part is different.
- Changed albedo computation to return the sum of the intensities of the four
  BSDF lobes. Previously, the code applied the inverse of the color->sigma
  mapping from the paper - this returns the color specified in the node, but
  for very dark hair (e.g. when using the Melanin controls) the result is
  extremely low (e.g. 0.000001) despite the hair still reflecting a significant
  amount of light (since the R lobe is independent of sigma). This causes issues
  with the light component passes, so this change fixes #104586.
- There's quite a few computations at the start of the evaluation function that
  are needed for sampling, evaluation and albedo computation, but only depend on
  the view direction. Therefore, just precompute them - we still have space in
  PrincipledHairExtra after all.
- Fix a tiny bug - the direction sampling code did not account for the R lobe
  roughness modifier.

Pull Request #104669
2023-02-13 22:49:01 +01:00
7dbf2e2e95 Curves: Draw point overlay only in point selection mode
In case the selection domain is `ATTR_DOMAIN_CURVE` then do not draw the points.

Pull Request #104715
2023-02-13 22:25:06 +01:00
c92c52f0c9 Cleanup: Move draw_attributes.hh to C++
In order to experiment with different storage types for `DRW_Attributes`
and for general cleanup (see #103343). Also move a curves header to C++.

Pull Request #104716
2023-02-13 20:56:24 +01:00
dfacaf4f40 Fix: Incorrect BMesh to Mesh attribute copying
The existing logic to copy `BMesh` custom data layers to `Mesh`
attribute arrays was quite complicated, and incorrect in some cases
when the source and destinations didn't have the same layers.
The functions leave a lot to be desired in general, since they have
a lot of redundant complexity that ends up doing the same thing for
every element.

The problem in #104154 was that the "rest_position" attribute overwrote
the mesh positions since it has the same type and the positions weren't
copied. This same problem has shown up in boolean attribute conversion
in the past. Other changes fixed some specific cases but I think a
larger change is the only proper solution.

This patch adds preprocessing before looping over all elements to
find the basic information for copying the relevant layers, taking
layer names into account. The preprocessing makes the hot loops
simpler.

In a simple file with a 1 million vertex grid, I observed a 6%
improvement animation playback framerate in edit mode with a simple
geometry nodes modifier, from 5 to 5.3 FPS.

Fixes #104154, #104348

Pull Request #104421
2023-02-13 20:52:02 +01:00
0dfc102531 Fix #104588: Initialize Face Sets from edit mode selection broken
Mistake in da4bd24c3e
2023-02-13 14:16:23 -05:00
1e794d9a92 BMesh: Add flexibility for future lazily created UV selection
Don't crash on nonexisting uv selection layers. Add an assert
because for now it is a bug if they don't exist. But when converting
back to Mesh it is preferable to accept in release mode, as opposed to
crashing.

Pull Request #104600
2023-02-13 20:08:30 +01:00
684789c815 Fix #104501: Clear CD_FLAG_NOCOPY after use
When generating a Mesh from a BMesh the uv map bool layers are not
copied if all elements are false. To suppress the copying the flag
CD_FLAG_NOCOPY is set in the layer flags. However these layers *do*
need to be copied to other BMeshes (for example undo steps). So we
need to clear them afterwards.
2023-02-13 20:08:29 +01:00
72a85d976a USD Import: USD Shapes support
This commit adds the ability to import USD Shape primitives (Gprims).
They are imported as Blender Meshes using the USD API to convert, so
that they appear the same as they would in other applications. USD
Shapes are important in many workflows, particularly in gaming, where
they are used for stand-in geometry or for collision primitives.

Pull Request #104707
2023-02-13 19:49:24 +01:00
8d20db0594 Curves: Fix curve segment selection
Make sure that in curve selection mode, the segments are checked
for intersection instead of only the points.

Pull Request #104703
2023-02-13 19:05:28 +01:00
64f83f80c6 Fix #104373: Motion Tracker solve with anamorphic footage not working
The issue was caused by rather recent refactor in 7dea18b3aa.

The root of the issue lies within the fact that the optical center was updated
on the Blender side after the solution was run. There was a mistake in the code
which double-corrected for the pixel aspect ratio.

Added a comment in the code about this, so that it does not look suspicious.

Pull Request #104711
2023-02-13 18:49:45 +01:00
9b7d71cec2 Cleanup: Remove redundant translation markers in context poll message
`CTX_wm_operator_poll_msg_set()` is covered by the translation script
and always translates these strings. Checked with Bastien, he prefers
not having the redundancy here.
2023-02-13 18:41:38 +01:00
64e4aede7f Fix bug report including rB commit hash prefix not needed for Gitea 2023-02-13 18:34:13 +01:00
675717eb79 Cleanup: fix a few typos in UI messages
Issues reported by @Joan-Pujolar in #43295.

Pull Request #104672
2023-02-13 18:24:53 +01:00
c2a5c592fe Fix #103397: Vector UI control shader UBO alignment fix.
Replace float3 with packed_float3 when using additional type padding.

Authored by Apple: Michael Parkin-White

Ref #96261
Pull Request #104699
2023-02-13 18:16:38 +01:00
0e1a6f6033 Cleanup: Add in NLAStrip / NLATrack remove / clean methods
## Cleanup: Refactor NLATrack / NLAStrip Remove

This PR adds 3 new methods:
* BKE_nlatrack_remove_strip
* BKE_nlastrip_remove
* BKE_nlastrip_remove_and_free

These named BKE methods are really just replacements for BLI_remlink, but with some added checks, and enhanced readability.

Co-authored-by: Nate Rupsis <nrupsis@gmail.com>
Pull Request #104437
2023-02-13 18:10:13 +01:00
d42c803f5b Cleanup: add comment explaining #interpf is performing lerp 2023-02-13 17:32:19 +01:00
218a909dec GPencil: Fix unreported Eyedropper color difference in Materials
The color selected was converted wrongly for materials. The undo of the conversion must be done only for palettes.

Also, some code cleanup done.
2023-02-13 17:30:20 +01:00
bea1eff3a5 Fix: No update setting curve selection domain with all selected
Send a notification and tag for an update even if the selection doesn't
exist, which is still necessary for drawing that depends on the
selection domain.
2023-02-13 11:22:31 -05:00
d5c60fb685 Cleanup: add comment to cage gizmo scaling regarding the boundary 2023-02-13 17:08:23 +01:00
9fecf1f8b8 Cycles: Replace resolution divider loop with an analytical formula
As a side effect of this change, more resolution divisions are now available.
Before this patch the possible resolution divisions were all powers of two.
Now the possible resolution divisions are the multiples of pixel_size.

This increase in possible resolution divisions is the same idea proposed in https://archive.blender.org/developer/D13590.
In that patch there were concerns that this will increase the time between a user navigating
and seeing the 1:1 render. To my knowledge this is a non-issue and there should be
little to no increase in time between those two events.

Pull Request #104450
2023-02-13 13:02:47 +01:00
99e71ec1f2 Assets: Store pointer to owning asset library in asset representation
This is needed to be able to query asset library information from an
asset. This again is relevant especially for the "All" asset library,
where you can't just directly access the library itself, which is
different for different assets.

The current design is that an asset representation is owned by exactly
one asset library, so having this pointer is perfectly compatible with
the design.

Reviewed by: Julian Eisel
2023-02-13 12:57:03 +01:00
4f19e99cb1 Tests: add unit test for BLI_strnlen
It wasn't there. Now it is.
2023-02-13 12:54:35 +01:00
147c75b831 Cleanup: EEVEE-Next: Fix typo 2023-02-13 12:49:51 +01:00
0d9fbfe7fe GPUShader: Fix compilation caused by designated initializers in C++ 2023-02-13 12:49:22 +01:00
da0c182a61 Cycles: Fix compilation after recent GPU cleanups 2023-02-13 12:17:46 +01:00
dd171f7743 Cleanup: GPUShader: Rename GPU_shader_uniform_vector
Rename to `GPU_shader_uniform_float/int_ex` to make more sense as a
general purpose function.
2023-02-13 11:22:38 +01:00
b68bac7ced Cleanup: GPUShader: Remove GPU_shader_uniform_int/float
Simplify the API, leaving only one function to set uniform without the
uniform name.
2023-02-13 11:22:38 +01:00
173a8f4ac9 GPU: Removes GPU_shader_get_builtin_ssbo
Simplify the API. Use hardcoded ssbo location instead.
2023-02-13 11:22:38 +01:00
164f591033 Cleanup: GPU: Rename some functions for consistency 2023-02-13 11:22:38 +01:00
83a6642045 Cleanup: GPU: Move eGPUKeyframeShapes to shader shared
Removes code duplication.
2023-02-13 11:22:38 +01:00
158f87203e Cleanup: GPUShader: Reorganize GPU_shader.h to separate depecated API
This avoid confusion to what to use nowadays.
Also improves documentation.
2023-02-13 11:22:38 +01:00
d92c28582a Cleanup: GPUShader: Split Builtins to their own header
Also improve documentation and cleanup.
2023-02-13 11:22:38 +01:00
d165d6aa2a GPU: Remove GPU_SHADER_3D_POINT_FIXED_SIZE_VARYING_COLOR
This replaces `GPU_SHADER_3D_POINT_FIXED_SIZE_VARYING_COLOR` by
GPU_SHADER_2D_POINT_UNIFORM_SIZE_UNIFORM_COLOR_OUTLINE_AA`.

None of the usage made sense to not use the AA shader.
Scale the point size to account for the rounded shape.
2023-02-13 11:22:38 +01:00
5f3c97b999 GPU: Move gpu_shader_builtin.c to C++
This also removes all unused part of it now that we have converted all
shaders to use create infos.
2023-02-13 11:22:38 +01:00
afd6358fc0 Gizmo: add gizmos for spot and point light radius
Ref #104280

Pull Request #104410
2023-02-13 11:18:21 +01:00
701f90c677 Gizmo: make cage gizmo follow the cursor exactly when scaling
Ref #104280
The scaling of area light / spot light blend was wrong because it is
calculated for pivot at the edges. The new implementation in theory
works for all `abs(pivot) <= 0.5f`, although we only have -0.5, 0, and
0.5.
- Axis constraint for box cage was only applied when there is translate
flag, now the same logic is applied regardless of the translate flag,
this means when dragging the edge, the scaling in the other axis stays
the same; when dragging the corners, it applies free-form scaling.
- Due to the existence of margin, `data->orig_mouse` does not lie
exactly on the boundary. Using that value to compute the scaling causes
the error to accumulate over distance. The new implementation uses the
original dimension of the object instead, and only uses
`data->orig_mouse` to determine the side of the original cursor relative
to the pivot.
- For circular gizmo with unsigned scaling, the gizmo only follow the
cursor exactly when the cursor stays in the original quadrant, otherwise
it's hard to handle the logic when we should clamp the scaling.
2023-02-13 10:48:31 +01:00
9f4ee75c9b Fix #80307: Underline shortcut not working in enum context menus
The buttons of enum context menus are of type `UI_BUT_ROW`. They
are part of the set of buttons we create underline shortcuts for in
`ui_menu_block_set_keyaccels`.
But since they weren't handled in `ui_handle_button_activate_by_type`,
pressing the underline shortcuts didn't do anyting in those cases.

Co-authored-by: Leon Schittek <leon.schittek@gmx.net>
Co-authored-by: Brecht Van Lommel <brecht@noreply.localhost>
Pull Request #104433
2023-02-13 10:43:36 +01:00
2e19aa1399 UI: add camera passepartout color to theme
Currently the passepartout color is hardcoded to black. While a
sensible default for cinema, it may make less sense for other media,
whether video, print, web, etc. It greatly affects viewing conditions
of the image and should be user selectable, much like painting
programs allow.

Pull Request #104486
2023-02-13 09:15:34 +01:00
918cf6cd77 Fix #104341: Handle edge case in Curve to Mesh node
Don't create caps when using cyclic profile splines with two or fewer
points.
This case wasn't handled, yet, leading to invalid meshes or crashes.

Co-authored-by: Leon Schittek <leon.schittek@gmx.net>
Pull Request #104594
2023-02-13 09:03:00 +01:00
af8941e6a8 Vulkan: Use guardedalloc for driver allocations.
Vulkan has a pluggable memory allocation feature, which allows internal
driver allocations to be done by the client application provided
allocator. Vulkan uses this for more client application allocations
done inside the driver, but can also do it for more internal oriented
allocations.

VK_ALLOCATION_CALLBACKS initializes allocation callbacks for host allocations.
The macro creates a local static variable with the name vk_allocation_callbacks
that can be passed to vulkan API functions that expect
const VkAllocationCallbacks *pAllocator.

When WITH_VULKAN_GUARDEDALLOC=Off the memory allocation implemented
in the vulkan device driver is used for both internal and application
oriented memory operations.

For now this would help during the development of Vulkan backend to
detect hidden memory leaks that are hidden inside the driver part
of the stack. In a later stage we need to measure the overhead and
if this should become the default behavior.

Pull Request #104434
2023-02-13 08:37:35 +01:00
f828ecf4ba GPU: Use same read back API as SSBOs
The GPU module has 2 different styles when reading back data from
GPU buffers. The SSBOs used a memcpy to copy the data to a
pre-allocated buffer. IndexBuf/VertBuf gave back a driver/platform
controlled pointer to the memory.

Readback is done for test cases returning mapped pointers is not safe.
For this reason we settled on using the same approach as the SSBO.
Copy the data to a caller pre-allocated buffer.

Reason why this API is currently changed is that the Vulkan API is more
strict on mapping/unmapping buffers that can lead to potential issues
down the road.

Pull Request #104571
2023-02-13 08:34:19 +01:00
86b3073c9e Cleanup: Quiet unused variable warning
Also name another argument for consistency.
2023-02-12 22:27:07 -05:00
2a7440176e Fix: Missing const specifier for curve field input
Mistake in 000e722c7d, which probably made the viewer node
auto-domain detection behave differently when the special case was used.
2023-02-12 20:17:41 -05:00
6ea3fdebc8 Fix: Workbench Next: Extruded frustum binding 2023-02-12 23:40:28 +01:00
77963ff778 Fix #104637: EEVEE Displacement regression after #104595
Keep using the 3 evaluations dF_branch method for the Displacement output.
The optimized 2 evaluations method used by node_bump is now on its own macro (dF_branch_incomplete).
displacement_bump modifies the normal that nodetree_exec uses, so even with a refactor it wouldn’t be possible to re-use the computation anyway.
2023-02-12 23:06:21 +01:00
f0669ff8ba BLI: use larger integer type in BitVector
Using larger integer types allows for more efficient code, because we
can use the hardware better. Instead of working on individual bytes,
the code can now work on 8 bytes at a time. We don't really benefit
from this immediately but I'm planning to implement some more optimized
bit vector operations for #104629.

Pull Request #104658
2023-02-12 18:00:48 +01:00
3f40962414 Cleanup: use sized int types for polyfill_2d
Also correct building when USE_CLIP_EVEN is disabled.
2023-02-12 16:35:24 +11:00
32149f8d7a Tests: add polyfill2d test to ensure the result has no zero area tris
Add a test to address the issue raised in #103913, where zero area
triangles could be created from polygons that have co-linear edges
but were not degenerate.
2023-02-12 16:26:34 +11:00
91346755ce Cleanup: use '#' prefix for issues instead of 'T'
Match the convention from Gitea instead of Phabricator's T for tasks.
2023-02-12 14:56:05 +11:00
a02fa6c40d Cleanup: spelling in comments 2023-02-12 14:23:16 +11:00
10354b043f Fix crash selecting faces in wire-frame mode
Regression in [0] didn't account for the mesh not having
subdivision surface is applied.

[0]: 75db4c082b
2023-02-12 14:20:52 +11:00
c7456272b1 Cleanup: EEVEE-Next: Add LIGHT_FOREACH macros to clang-format exceptions 2023-02-12 01:41:54 +01:00
77aa9e8809 Cleanup: GPU: Remove commented lines without any comments or purpose
These were added during a big refactor. They were supposed to be
uncommented at some point but the new code does not even need a default
world.
2023-02-12 01:21:53 +01:00
d33960aead Cleanup: remove whole-archive linking for USD
Since USD is no longer statically linked these linker tricks
are no longer needed.

Co-authored-by: Ray Molenkamp <github@lazydodo.com>
Pull Request #104627
2023-02-11 19:48:47 +01:00
085c854b2a Fix curves selection toggling 2023-02-11 19:23:48 +01:00
82867753cf Transform: Hide trackball gizmo while dragging
It was accidentally displayed in a38d99e0b2.
2023-02-11 15:22:57 -03:00
232e02282e Fix circular transform gizmo always displaying Global orientation
The Global orientation comes from the mode's default orientation
(without the constraints).

It's not really exposed.
2023-02-11 15:20:38 -03:00
b9fa32cccd Fix #104587: 'Extrude To Cursor' snapping ignoring 'Target Selection'
Although not a transform operator, `Extrude to Cursor` depends on some
snapping settings.

So it should use the `Target Selection` options as well.
2023-02-11 14:50:37 -03:00
197eee6e04 Fix transform gizmos not changing in Automatic Constraint mode 2023-02-11 13:56:06 -03:00
e732580fcc Nodes: change order of Hide Value and Hide in Modifier
Based on the review comment in #104517.
2023-02-11 16:14:38 +01:00
158f809dcb Geometry Nodes: Add option to hide input in modifier
When building a node group that's meant to be used directly in the
node editor as well as in the modifier, it's useful to be able to have
some inputs that are only meant for the node editor, like inputs that
only make sense when combined with other nodes.

In the future we might have the ability to only display certain assets
in the modifier and the node editor, but until then this simple solution
allows a bit more customization.

Pull Request #104517
2023-02-11 16:11:10 +01:00
19ea673260 Cleanup: Remove const keyword in declarations 2023-02-11 15:05:55 +01:00
b723a398f3 Curves: initial surface collision for curves sculpt mode
During hair grooming in curves sculpt mode, it is very useful when hair strands
are prevented from intersecting with the surface mesh. Unfortunately, it also
decreases performance significantly so we don't want it to be turned on all the time.

The surface collision is used by the Comb, Pinch and Puff brushes currently.
It can be turned on or off on a per-geometry basis.

The intersection prevention quality of this patch is not perfect yet. This can
be improved over time using a better solver. Overall, perfect collision detection
at the cost of bad performance is not necessary for interactive sculpting,
because the user can fix small mistakes very quickly. Nevertheless, the quality
can probably still be improved significantly without too big slow-downs depending
on the use case. This can be done separately from this patch.

Pull Request #104469
2023-02-11 13:46:37 +01:00
0f708fa2e3 Geometry Nodes: use smooth normals in Distribute Points on Faces node
Previously, the node used the "true" normal of every looptri. Now it uses the
"loop normals" which includes e.g. smooth faces and custom normals. The true
normal can still be used on the points by capturing it before the Distribute node.

We do intend to expose the smooth normals separately in geometry nodes as well,
but this is an important first step.

It's also necessary to generate child hair between guide hair strands that don't
have visible artifacts at face boundaries.

For perfect backward compatibility, the node still has a "Legacy Normal" option
in the side bar. Creating the exact same behavior with existing nodes isn't
really possible unfortunately because of the specifics of how the Distribute
node used to compute the normals using looptris.

Pull Request #104414
2023-02-11 13:25:59 +01:00
6478eb565a Cleanup: format 2023-02-11 14:26:56 +11:00
fefc6a73b3 Fix pep8 checker operating on dot-files
Temporary editor files were included which could make the checker fail.
2023-02-11 14:12:43 +11:00
9f4edf8c2a Cleanup: remove unused variables 2023-02-11 14:04:35 +11:00
ce44953933 Cleanup: various C++ cleanups 2023-02-11 14:04:35 +11:00
343bb4a5a3 Cleanup: Use const char * for layer names in collada exporter
CustomData layer names should not be written except via the CusomData
api. Therefore use const char * instead of char * when referencing the
layer name.

Pull Request #104585
2023-02-11 01:13:38 +01:00
efabe81c91 Fix #103903: Bump Node performance regression
Avoid computing the non-derivative height twice.
The height is now computed as part of the main function, while the height at x and y offsets are still computed on a separate function.
The differentials are now computed directly at node_bump.

Co-authored-by: Miguel Pozo <pragma37@gmail.com>
Pull Request #104595
2023-02-10 21:06:53 +01:00
0e6da74e98 Fix #104282: Resolve Depth read for D24_S8 types in Metal.
Fixes incorrect spotlight gizmo orientation when moving.

Authored by Apple: Michael Parkin-White

Related to #96261

Pull Request #104537
2023-02-10 20:40:07 +01:00
8a32d56056 Tests: Fix device list of benchmark script only showing a single GPU
Pull Request #104583
2023-02-10 19:38:37 +01:00
7351f533e0 Curves: Add lasso and circle select
This adds a `select_lasso` and a `select_circle` function for the Curves object. It is used in the `view3d_lasso_select` and `view3d_circle_select` operator.

Co-authored-by: Falk David <falkdavid@gmx.de>
Pull Request #104560
2023-02-10 19:06:08 +01:00
5c4e1ed578 UI: Make text nomenclature and ordering consistent
"Center" -> "Middle" when describing vertical alignment.
"Align X" -> "Horizontal Alignment"
"Align Y" -> "Vertical Alignment"
Vertical alignment options rearranged to be consistently top-most to
bottom-most.

---

Co-authored-by: joshua-maros <60271685+joshua-maros@users.noreply.github.com>
Pull Request #104493
2023-02-10 19:05:37 +01:00
6f8c441950 Curves: Add select linked
This adds a new `select_linked` function that selects all the points
on a curve if there is at least one point already selected.
This also adds a keymap for the operator.

Co-authored-by: Falk David <falkdavid@gmx.de>
Pull Request #104569
2023-02-10 18:58:08 +01:00
d411be8a99 Cleanup: Use utility function to find groups in node tree
Add `contains_group` method in python api for `NodeTree` type, cleanup
`ntreeHasTree` function, reuse `ntreeHasTree` in more place in code.
The algorithm has been changed to not recheck trees by using set.

Performance gains from avoiding already checked node trees:
Based on tests, can say that for large files with a huge number
of trees, the response speed of opening the search menu in the
node editor increased by ~200 times (for really large projects
with 16 individual groups in 6 levels of nesting). Group insert
operations are also accelerated, but this is different in some cases.

Pull Request #104465
2023-02-10 17:30:55 +01:00
fae661a1ab Revert "Un-ignore modules in .gitmodules configuration"
This reverts commit aab707ab70.

A different solution to the submodule problem is being considered in #104573.
Revert to the previous behavior that developers are familiar with for now.
2023-02-10 17:15:28 +01:00
923152d180 Geometry Nodes: improve parallelization in Delete/Separate Geometry node
This just adds `threading::parallel_for` and `threading::parallel_invoke` in a few
places where it can be added trivially. The run time of the `separate_geometry`
function changes from 830 ms to 413 ms in my test file.

Pull Request #104563
2023-02-10 17:14:30 +01:00
0ea15a6fbb Fix: Inaccessible default for node group image sockets
The type was just skipped when drawing defaults for the image sockets.
2023-02-10 09:22:45 -05:00
284cdbb6cf Cleanup: Use lambdas in mesh mapping callback, remove unused arguments
Using callback functions didn't scale well as more arguments are added.
It got very confusing when to pass tehmarguments weren't always used.
Instead use a `FunctionRef` with indices for arguments. Also remove
unused edge arguments to topology mapping functions.
2023-02-10 08:37:50 -05:00
bad2c3b9ef Geometry Nodes: Experimental option for Volumes
Adds an experimental option under "New Features" in preferences,
which enables visibility of the new Volume Nodes.
Right now this option does nothing but will be used during development.
See #103248

Pull Request #104552
2023-02-10 14:21:01 +01:00
88f9c55f7f Sculpt: Fix Dyntopo Warnings
Because of T95965, some attributes are stored as generic attributes
in Mesh but have special handling for the conversion to BMesh.

Expose a function to tell whether certain attribute names are handled
specially in the conversion, and refactor the error checking process
to use it. Also check for generic attributes on the face domain which
wasn't done before.

Author: Hans Goudey
Reviewed By: Joseph Eagar

Co-authored-by: Joseph Eagar <joeedh@gmail.com>
Pull Request #104567
2023-02-10 13:16:10 +01:00
dc9f7fe64f Fix #104514: GPencil merge down layer misses some frames
When merging two gpencil layers, if the destination layer had a keyframe
where the source layer did not, strokes of the previous keyframe
in source layer were lost in that frame.

This happened because the merge operator was looping through
frames of the source layer and appending strokes in the
corresponding destination layer, but never completing
other frames than the ones existing in the source layer.

This patch fixes it by first adding in source layer
all frames that are in destination layer.

Co-authored-by: Amelie Fondevilla <amelie.fondevilla@les-fees-speciales.coop>
Pull Request #104558
2023-02-10 12:55:06 +01:00
Lucas Tadeu Teixeira
5d30c3994e Sequencer: Don't create undo step when click-select does nothing
When the sequencer is empty (i.e., there are no sequences),
we would have the deselect_all variable set to true called
ED_sequencer_deselect_all to select any existing sequences.

Ref !104453
2023-02-10 21:49:42 +11:00
51ceeb506f Fix #104026: Click-Drag to select graph editor channels no longer working
Box-Selecting channels in the dope sheet with click-drag was no longer possible as of Blender 3.2

Due to the removal of tweak events the box select operator was always shadowed by the click operator.

Original Phabricator discussion here: https://archive.blender.org/developer/D17065

Use `WM_operator_flag_only_pass_through_on_press` on click operator to fix it

Co-authored-by: Christoph Lendenfeld <chris.lenden@gmail.com>
Pull Request #104505
2023-02-10 11:36:01 +01:00
01480229b1 Cycles: Fix MetalRT checkbox not hooked up to device on AMD
(Follow on from D17043)
On AMD Navi2 devices the MetalRT checkbox was not hooked up properly and had no effect. This patch fixes it.

Co-authored-by: Michael Jones <michael_p_jones@apple.com>
Pull Request #104520
2023-02-10 10:55:39 +01:00
b77c82e2bb Tests: minor updates to make bl_rna_manual_reference more useful
- Avoid flooding the output with every match that succeeds.
- Report patterns listed in the manual that don't match anything in
  Blender.
- Disable external URL lookups, this is too slow.
  Instead use a LOCAL_PREFIX (a local build of the manual)
  or skip the test.
2023-02-10 14:04:15 +11:00
c2c62c3618 RNA: return a dummy language value when WITH_INTERNATIONAL=OFF
Without this, every access to "language" would warn that the enum
value didn't match a value in the enum items.

This made the bl_rna_manual_reference.py test output practically
unusable.
2023-02-10 13:18:33 +11:00
a8d951abdd Docs: remove malformed patterns for RNA mapping
The generator now skips these with a warning, they will need to be
corrected in the user manual.

This caused tests/python/bl_rna_manual_reference.py to fail looking
up URL's.
2023-02-10 13:04:27 +11:00
4cbe0bff34 Cleanup: spelling in comments 2023-02-10 11:34:20 +11:00
48d9363fa7 Cleanup: quiet clang compiler warnings
- undeclared variable warning.
- unreachable-code-return warnings.
- array-parameter, mismatch bound.
- 'requires' is a keyword in C++20, (rename to requires_flag).
2023-02-10 11:27:30 +11:00
8ac3096e24 Fix add-on & manual link in Help menu ignoring the current language
Use bpy.utils.manual_language_code() create manual URL's instead of
assuming English.
2023-02-10 11:02:45 +11:00
2ee9c12a23 PyAPI: add bpy.utils.manual_locale_code()
Move the function for getting the language code associated with the
user manual into a utility function (from the generated
rna_manual_reference.py).

This allows other parts of Blender to create a manual URL based on the
current locale preferences and environment.

Ref !104494
2023-02-10 11:01:02 +11:00
2d351e9ee3 Cleanup: format 2023-02-10 10:48:50 +11:00
7e0e07657c GPU: Cleanup GPU_batch.h documentation and some of the API for consistency
Documented all functions, adding use case and side effects.

Also replace the use of shortened argument name by more meaningful ones.

Renamed `GPU_batch_instbuf_add_ex` and `GPU_batch_vertbuf_add_ex` to remove
the `ex` suffix as they are the main version used (removed the few usage
of the other version).

Renamed `GPU_batch_draw_instanced` to `GPU_batch_draw_instance_range` and
make it consistent with `GPU_batch_draw_range`.
2023-02-09 22:46:35 +01:00
5c8edbd99b Cleanup: Move 6 sculpt-session-related files and header to C++
To allow further mesh data structure refactoring. See #103343

Pull Request #104540
2023-02-09 20:35:50 +01:00
2cfc4d7644 Fix #104383: don't update declaration for clipboard copy
When nodes are copied to the clipboard, they don't need their declaration.
For nodes with dynamic declaration that might depend on the node tree itself,
the declaration could not be build anyway, because the node-clipboard does
not have a node tree.

Pull Request #104432
2023-02-09 19:20:39 +01:00
bc0d3c91b1 Fix #104435: Fix rna_NlaStrip_new add strip logic to be correct boolean expression
Fixed #104435: Use correct conditional logic when testing if a new NLA strip can be added in the rna_NlaStrip_new method
2023-02-09 19:09:27 +01:00
50918d44fb Cleanup: Fix const correctness warning in recent commit 2023-02-09 11:26:38 -05:00
Kevin C. Burke
1649921791 Fix: Sequencer "Pan" label using incorrect keyword 'heading_ctxt'
Oversight in db87e2a638

Reviewed By: ISS
Differential Revision: https://archive.blender.org/developer/D17213
2023-02-09 16:33:50 +01:00
50dfd5f501 Geometry Nodes: Edges to Face Groups Node
Add a new node that groups faces inside of boundary edge regions.
This is the opposite action as the existing "Face Group Boundaries"
node. It's also the same as some of the "Initialize Face Sets"
options in sculpt mode.

Discussion in #102962 has favored "Group" for a name for these
sockets rather than "Set", so that is used here.

Pull Request #104428
2023-02-09 16:27:20 +01:00
bfa7f9db0e Assets: Implement viewport drag and drop for geometry nodes
Currently there's no way to assign a geometry node group from the asset
browser to an object as a modifier without first appending/linking it
manually. This patch adds a drag and drop operator that adds a new
modifier and assigns the dragged tree.

Pull Request #104430
2023-02-09 16:04:14 +01:00
b8e15a4a84 Fix: Add missing "-" in logic to get the channel height
This was missed when doing the refactoring in #104500
It didn't seem to have any effect until I worked on clamping the view
2023-02-09 15:59:57 +01:00
7ca651d182 Mesh: Remove unnecessary edge draw flag
As described in #95966, replace the `ME_EDGEDRAW` flag with a bit
vector in mesh runtime data. Currently the the flag is only ever set
to false for the "optimal display" feature of the subdivision surface
modifier. When creating an "original" mesh in the main data-base,
the flag is always supposed to be true.

The bit vector is now created by the modifier only as necessary, and
is cleared for topology-changing operations. This fixes incorrect
interpolation of the flag as noted in #104376. Generally it isn't
possible to interpolate it through topology-changing operations.

After this, only the seam status needs to be removed from edges before
we can replace them with the generic `int2` type (or something similar)
and reduce memory usage by 1/3.

Related:
- 10131a6f62
- 145839aa42

In the future `BM_ELEM_DRAW` could be removed as well. Currently it is
used and aliased by other defines in some non-obvious ways though.

Pull Request #104417
2023-02-09 15:56:05 +01:00
3bed78ff59 Curves: Add box selection
This adds a `select_box` function for the `Curves` object. It is used in the `view3d_box_select` operator.

It also adds the basic selection tools in the toolbar of Edit Mode.

Authored-by: Falk David <falkdavid@gmx.de>
Pull Request #104411
2023-02-09 15:53:42 +01:00
22edf04458 I18n: use format strings for Cycles version error messages
The required version numbers for various devices was hardcoded in the
UI messages. The result was that every time one of these versions was
bumped, every language team had to update the message in question.

Instead, the version numbers can be extracted, and injected into the
error messages using string formatting so that translation updates
need happen less frequently.

Pull Request #104488
2023-02-09 15:48:19 +01:00
666c2ea012 Refactor: remove yscale from bAnimContext
`bAnimContext` had a float property called `yscale_fac` that was used to define the height of the keyframe channels.

However the property was never set, only read so there really is no need to have it in the struct.

Moreover it complicated getting the channel height because `bAnimContext` had to be passed in.

Speaking of getting the channel height. This was done with macros. I ripped them all out and replaced them with function calls.

Originally it was introduced in this patch: https://developer.blender.org/rB095c8dbe6919857ea322b213a1e240161cd7c843

Co-authored-by: Christoph Lendenfeld <chris.lenden@gmail.com>
Pull Request #104500
2023-02-09 14:28:04 +01:00
ca183993a5 Fix freeing uninitialized pointer in GHOST/Wayland + X11 fallback
Freeing the timer manager didn't account for Wayland being partially
initialized.
2023-02-09 23:41:03 +11:00
0e196bab76 Build: disable LTO for Python builds
LTO compiled libpython3.10.a failed to link with GCC 12.0,
disable since these libraries are intended for developers to link
against.
2023-02-09 23:00:08 +11:00
f222fe6a3a Build: enable Python optimizations (PGO & LTO) on Linux
This is used for most Python release builds and has been reported to
give a modest 5-10% speedup (depending on the workload).

This could be enabled on macOS too but needs to be tested.
2023-02-09 20:59:08 +11:00
8b35db914e GPU: Fix assert when using light gizmo.
Blender was reporting that the GPU_TEXTURE_USAGE_HOST_READ wasn't set.
This is used to indicate that the textures needs to be read back to
CPU. Textures that don't need to be read back can be optimized by the
GPU backend.

Found during investigation of #104282.
2023-02-09 08:36:27 +01:00
1883e782cb Spelling: Assert message in GPU_texture_read. 2023-02-09 08:11:11 +01:00
7effc6ffc4 Cleanup: solve compiler warnings.
Classes were predefined as structs.
2023-02-09 08:01:33 +01:00
f3d7de709f Cycles: update Intel Graphics Compiler to 1.0.13064.7 on Linux
Linux side of 8afcecdf1f.

Reviewed by: LazyDodo, sergey, campbellbarton
Ref !104458, 16984
2023-02-09 15:43:23 +11:00
3c8f7b1a64 Cleanup: Remove unused/redundant includes from BKE_curves.hh
Avoid including headers that are obviously redundant, and don't
include BLI_task.hh in the header file, since it isn't really related.
2023-02-08 20:29:52 -05:00
0381fe7bfe Cleanup: update username in code-comments: campbellbarton -> ideasman42
Gitea migration changed my username, update code-comments.
2023-02-09 11:33:48 +11:00
5f842ef336 Cleanup: spelling in comments 2023-02-09 11:24:50 +11:00
5b110548eb Cleanup: enum conversion compiler warnings 2023-02-09 11:18:32 +11:00
9fd71d470e PyAPI: minor change to rna_manual_reference loading
- Use bpy.utils.execfile instead of importing then deleting from
  sys.modules.
- Add a note for why keeping this cached in memory isn't necessary.

This has the advantage of not interfering with any scripts that import
`rna_manual_reference` as a module.
2023-02-09 11:18:15 +11:00
94d280fc3f EEVEE-Next: Shadows: Add global switch
This allow to bypass all cost associated with shadow mapping.

This can be useful in certain situation, such as opening a scene on a
lower end system or just to gain performance in some situation (lookdev).
2023-02-09 00:48:33 +01:00
9103978952 EEVEE-Next: Shadow: Fix issue with last merge
The merge with master updated the code to use the new matrix API. This
introduce some regressions.

For sunlights make sure there is enough tilemaps in orthographic mode
to cover the depth range and fix the level offset in perspective.
2023-02-08 23:45:12 +01:00
9c03a1c92f Fix Cycles link error with debug/asan builds after recent bugfix
Pull Request #104487
2023-02-08 23:20:29 +01:00
a0f5240089 EEVEE-Next: Virtual Shadow Map initial implementation
Implements virtual shadow mapping for EEVEE-Next primary shadow solution.
This technique aims to deliver really high precision shadowing for many
lights while keeping a relatively low cost.

The technique works by splitting each shadows in tiles that are only
allocated & updated on demand by visible surfaces and volumes.
Local lights use cubemap projection with mipmap level of detail to adapt
the resolution to the receiver distance.
Sun lights use clipmap distribution or cascade distribution (depending on
which is better) for selecting the level of detail with the distance to
the camera.

Current maximum shadow precision for local light is about 1 pixel per 0.01
degrees.
For sun light, the maximum resolution is based on the camera far clip
distance which sets the most coarse clipmap.

## Limitation:
Alpha Blended surfaces might not get correct shadowing in some corner
casses. This is to be fixed in another commit.
While resolution is greatly increase, it is still finite. It is virtually
equivalent to one 8K shadow per shadow cube face and per clipmap level.
There is no filtering present for now.

## Parameters:
Shadow Pool Size: In bytes, amount of GPU memory to dedicate to the
shadow pool (is allocated per viewport).
Shadow Scaling: Scale the shadow resolution. Base resolution should
target subpixel accuracy (within the limitation of the technique).

Related to #93220
Related to #104472
2023-02-08 21:18:44 +01:00
0ab3ac7a41 BLI: Math: Fix vector operator * with MutableMatView
This was caused by operator priority trying to use
`friend VecBase operator*(const VecBase &a, FactorT b)`.

Adding tests as these were not covered.
2023-02-08 19:38:53 +01:00
a1282ab015 Fix Cycles debug build error after host falback changes
Introduced in dcfb6df9ce6.

Co-authored-by: Lucas Tadeu Teixeira <lucas@lucastadeu.com>

Pull Request #104454
2023-02-08 19:27:40 +01:00
43f308f216 Make update: Ignore submodules
The previous change in the .gitmodules made it so the `make update`
rejects to do its thing because it now sees changes in the submodules
and rejected to update, thinking there are unstaged changes.

Ignore the submodule changes, bringing the old behavior closer to
what it was.
2023-02-08 14:05:50 +01:00
aab707ab70 Un-ignore modules in .gitmodules configuration
The meaning of the ignore option for submodules did change since our
initial Git setup was done: back then it was affecting both diff and
stage families of Git command. Unfortunately, the actual behavior did
violate what documentation was stating (the documentation was stating
that the option only affects diff family of commands). This got fixed
in Git some time after our initial setup and it was the behavior of the
commands changed, not the documentation. This lead to a situation when
we can no longer see that submodules are modified and staged, and it is
very easy to stage the submodules.

For the clarity: diff and status are both "status" family, show and
diff are "diff" family.

Hence this change: since there is no built-in zero-configuration way
of forbidding Git from staging submodules lets make it visible and
clear what the state of submodules is.

We still need to inform people to not stage submodules, for which
we can offer some configuration tips and scripts but doing so is
outside of the scope of this change at it requires some additional
research. Current goal is simple: make it visible and clear what is
going to be committed to Git.

This is a response to an increased frequency of incidents when the
submodules are getting modified and committed without authors even
noticing this (which is also a bit annoying to recover from).

Differential Revision: https://developer.blender.org/D13001
2023-02-08 11:32:32 +01:00
4ed8a360e9 Fix references to the main branch in the .gitmodules 2023-02-08 11:01:01 +01:00
4d3bfb3f41 Subdivision Surface: fix a serious performance hit when mixing CPU & GPU.
Subdivision surface efficiency relies on caching pre-computed topology
data for evaluation between frames. However, while eed45d2a23
introduced a second GPU subdiv evaluator type, it still only kept
one slot for caching this runtime data per mesh.

The result is that if the mesh is also needed on CPU, for instance
due to a modifier on a different object (e.g. shrinkwrap), the two
evaluators are used at the same time and fight over the single slot.
This causes the topology data to be discarded and recomputed twice
per frame.

Since avoiding duplicate evaluation is a complex task, this fix
simply adds a second separate cache slot for the GPU data, so that
the cost is simply running subdivision twice, not recomputing topology
twice.

To help diagnostics, I also add a message to show when GPU evaluation
is actually used to the modifier panel. Two frame counters are used
to suppress flicker in the UI panel.

Differential Revision: https://developer.blender.org/D17117

Pull Request #104441
2023-02-08 10:27:22 +01:00
d781e52ee0 Cleanup: use enum literals, order likely case first in polyfill_2d 2023-02-08 17:06:54 +11:00
09eb4fe19a Fix #103913: Triangulate sometimes creates degenerate triangles
The ear clipping method used by polyfill_2d only excluded concave ears
which meant ears exactly co-linear edges created zero area triangles
even when convex ears are available.

While polyfill_2d prioritizes performance over *pretty* results,
there is no need to pick degenerate triangles with other candidates
are available. As noted in code-comments, callers that require higher
quality tessellation should use BLI_polyfill_beautify.
2023-02-08 16:59:42 +11:00
6aa1b5d031 Cleanup: format 2023-02-08 00:21:57 +01:00
5c994d7846 Fix #104297: Cycling geometry nodes viewer ignores sockets
Sockets after the geometry socket were ignored when cycling through
the node's output sockets. If there are multiple geometry sockets, the
behavior could still be refined probably, but this should at least make
basic non-geometry socket cycling work.
2023-02-07 16:01:54 -05:00
53b057aa09 Cleanup: Move 18 sculpt files to C++
To allow further mesh data structure refactoring. See #103343

Pull Request #104436
2023-02-07 21:56:45 +01:00
e817cff009 Release: support generating LTS release notes from Gitea
Now a single script to generate both links and release notes. It also includes
the issue ID for the LTS releases, so only the release version needs to be
specified.

Pull Request #104402
2023-02-07 21:23:24 +01:00
41ddd3d732 Fix: Experimental Panel links modified for Gitea
Modifies the links to point to the new developer site.

Pull Request #104425
2023-02-07 19:54:43 +01:00
f5552d759c Fix compiler error 2023-02-07 18:32:24 +01:00
f01bf82480 Curves: Add select pick operator
This adds the `select_pick` function for to `Curves` objects.
It is used in the common `view3d_select` operator.

Pull Request #104406
2023-02-07 17:50:39 +01:00
8d9d16fb53 Fix #104396: Blender crashes when moving Keyframes in Graph Editor
`t->region->gizmo_map` can be `nullptr`.

Caused by 19b63b932d
2023-02-07 12:59:35 -03:00
349350b304 Fix T104390: Regression: Object selection in viewport is not working
Caused by alignment difference between C and C++. Asan caught the issue
on startup.

Removing the unused view matrix storage copy avoids this problem.
2023-02-07 16:45:47 +01:00
cb5318b651 Docs: change Git URLs to point projects.blender.org instead of git.blender.org 2023-02-07 14:23:05 +01:00
bd6b0bac88 Update references to the new projects platform and main branch 2023-02-07 14:18:19 +01:00
3002670332 Fix T104368: Incorrect tooltip text in Blender 3.4.1's Preferences > File Paths > Scripts field.
Use backticks to cleary identify 'path' parts of this tooltip.
2023-02-07 09:50:50 +01:00
f086cf3cea Cleanup: remove redundant parenthesis 2023-02-07 17:34:20 +11:00
2609ca2b8e Cleanup: tweaks to cycles/metal preferences
- Auto-format.
- Use raw string for regex.
- Remove redundant assignment.
- Remove duplicate arm64 check.
- Break early out of loop.
2023-02-07 17:30:13 +11:00
b-init
7e8153b07d Keymap: support default shortcut to toggle overlays in all space-types
UV Editor, Image Editor & Sequencer didn't have a shortcut for toggling
overlays. Use the same shortcut as the 3D viewport.

Ref D16959
2023-02-07 16:54:06 +11:00
622cad7073 Cleanup: minor tweak to recent fix for T10438
Minor change to [0], prefer calling em_setup_viewcontext,
even though there is no functional difference at the moment,
if this function ever performs additional operations than assigning
`ViewContext.em`, it would have to be manually in-lined in
`view3d_circle_select_recalc`.

[0]: 430cc9d7bf
2023-02-07 16:18:30 +11:00
44daeaae7d Cleanup: use arg instead of param for generated sphinx docs 2023-02-07 15:14:22 +11:00
db8b5a2316 PyDoc: remove deprecated dpi argument from BLF example 2023-02-07 15:12:05 +11:00
dbca0cc9d5 Fix crash on exit under Wayland
Order of free error from [0] caused the timer manager
to be freed before the timer.

[0]: 7de1a4d1d8
2023-02-07 15:12:05 +11:00
e4f77c1a6c Cleanup: format 2023-02-07 16:57:35 +13:00
Jon Denning
e27c89c7c7 Docs: added missing documentation for WindowManager methods
Added missing documentation for `draw_cursor_add` and
`draw_cursor_remove` methods for `WindowManager`.

Differential Revision: https://developer.blender.org/D14860
2023-02-06 22:40:10 -05:00
af5706c960 Docs: improve doc-string for WM_operator_flag_only_pass_through_on_press
The doc-string didn't provide any context for how the funciton is
intended to be used.
2023-02-07 14:18:59 +11:00
a99022e22d Cleanup: spelling in comments 2023-02-07 14:17:01 +11:00
d5af895419 Fix missing matrix includes 2023-02-07 14:07:21 +11:00
Jason Fielder
8703db393b Metal: Ensure explicit return after discard to eliminate differences in behaviour between GPUs.
Discard is not always treated as an explicit return and flow control can continue for required derivative calculations. This behaviour is different in Metal vs OpenGL. Adding return after discards ensures consistency in expectation as behaviour is well-defined.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17199
2023-02-07 00:58:06 +01:00
Jason Fielder
f152159101 Metal: Guard advanced command buffer debugging behind OS version flag.
Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17181
2023-02-07 00:51:16 +01:00
d3500c482f Cleanup: Move DRW_pbvh.h header to C++
For continued refactoring of the Mesh data structure. See T103343.
2023-02-06 16:52:02 -05:00
3a1583972a Fix T104256: Curve to points node skips curve domain attributes
7536abbe16 forgot to port the curve domain attributes.
2023-02-06 16:30:25 -05:00
6dcfb6df9c Cycles: Abstract host memory fallback for GPU devices
Host memory fallback in CUDA and HIP devices is almost identical.
We remove duplicated code and create a shared generic version that
other devices (oneAPI) will be able to use.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17173
2023-02-06 22:19:32 +01:00
b0b9e746fa BLI: Use BLI_math_matrix_type.hh instead of BLI_math_float4x4.hh
Straightforward port. I took the oportunity to remove some C vector
functions (ex: copy_v2_v2).

This makes some changes to DRWView to accomodate the alignement
requirements of the float4x4 type.
2023-02-06 21:25:45 +01:00
8be3fcab7e Fix tests after recent commit.
`9c14039a8f4b5f` broke blenlib tests in release builds, due to how
`EXPECT_BLI_ASSERT` works (in release builds it just calls the given
function, so if that crashes then the test fails).

For now remove that check in the test.
2023-02-06 21:14:20 +01:00
d20f992322 Mesh: Avoid copying positions in object mode modifier stack
Remove the use of a separate contiguous positions array now that
they are stored that way in the first place. This allows removing the
complexity of tracking whether it is allocated and deformed in the
mesh modifier stack.

Instead of deferring the creation of the final mesh until after the
positions have been copied and deformed, create the final mesh
first and then deform its positions.

Since vertex and face normals are calculated lazily, we can rely on
individual modifiers to calculate them as necessary and simplify
the modifier stack. This was hard to change before because of the
separate array of deformed positions.

Differential Revision: https://developer.blender.org/D16971
2023-02-06 14:34:29 -05:00
a38d99e0b2 Fix (unreported): Transform gizmo not restoring when changing mode
When activating a rotation with the Transform gizmo for example, some
gizmos are hidden but they don't reappear when changing the mode.

Make sure the gizmos corresponding to the mode always reappear.
2023-02-06 16:20:12 -03:00
75db4c082b Cleanup: Use BitVector instead of BLI_bitmap for subsurf face dot tags
For better type safety and more automatic memory management.
2023-02-06 14:13:53 -05:00
Michael Jones
2d994de77c Cycles: MetalRT optimisation for subsurface intersection queries
This patch optimises subsurface intersection queries on MetalRT. Currently intersect_local traverses from the scene root, retrospectively discarding all non-local hits. Using a lookup of bottom level acceleration structures, we can explicitly query only the relevant instance. On M1 Max, with MetalRT selected, this can give a render speedup of 15-20% for scenes like Monster which make heavy use of subsurface scattering.

Patch authored by Marco Giordano.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17153
2023-02-06 19:12:29 +00:00
961d99d3a4 Minor comments about current usages of BKE_main_namemap_destroy. 2023-02-06 19:36:36 +01:00
430cc9d7bf Fix T104381: Assert on Circle Select end modal
`em_setup_vivewcontext` cannot be used in this function now as it
expects `obedit` to be a mesh. It also duplicated the viewcontext init.
Instead `BKE_editmesh_from_object` is called only when type is a mesh.
2023-02-06 19:32:14 +01:00
a3551ed878 BKE main namemap: Add a way to clear all namemaps of a given Main.
Existing `BKE_main_namemap_destroy` is too specific when a entire Main
needs to have its namemaps cleared, since it would not handle the
Library ones.

While in regular situation current code is fine, ID management code that
may also edit linked data needs a wider, simpler clearing tool.
2023-02-06 19:29:21 +01:00
288b13b252 BKE lib remapper: Add new util to overwrite an existing mapping.
Current `BKE_id_remapper_add` would not replace an already existing
mapping rule, now `BKE_id_remapper_add_overwrite` allows that behavior
if necessary.
2023-02-06 19:29:21 +01:00
85b2bce037 BKE libremap: do not assert on identity pairs.
If identity pairs (i.e. old ID pointer being same as new one) was
forbidden, then this should be asserted against in code defining
remapping, not in code applying it.

But it is actually sometimes usefull to allow/use identity pairs, so
simply early-return on these instead of asserting.
2023-02-06 19:29:21 +01:00
c7b601c79e Cleanup: Subdiv: Use index instead of pointer
The edge pointer was retrieved from the index and then just
used to calculate the index again. Remove the edge pointer
and only use the index.
2023-02-06 12:29:01 -05:00
81f8d74f6d EEVEE: Cleanup: light power / radiance code and document it
All thanks to @weizhen for spoting the inconsistencies.
2023-02-06 18:23:34 +01:00
9c14039a8f BLI ListBase: Add new 'SplitAfter' given link util.
Allows to easily split a ListBase by moving the part after given link
into a new ListBase.
2023-02-06 17:58:40 +01:00
39e0bbfa55 BKE libquery: Add option to also process ID.lib pointer.
This was never needed before, but upcomming work for Brush Asset project
requires it.
2023-02-06 17:31:01 +01:00
8fe4f3b756 Minor code doc improvement. 2023-02-06 17:19:02 +01:00
Iliya Katueshenock
a86f657692 Fix T104233: crash when deleting a group node that is displayed by other editor
Differential Revision: https://developer.blender.org/D17172
2023-02-06 16:18:16 +01:00
8adebaeb7c Modifiers: measure execution time and provide Python access
The goal is to give technical artists the ability to optimize modifier usage
and/or geometry node groups for performance. In the long term, it
would be useful if Blender could provide its own UI to display profiling
information to users. However, right now, there are too many open
design questions making it infeasible to tackle this in the short term.

This commit uses a simpler approach: Instead of adding new ui for
profiling data, it exposes the execution-time of modifiers in the Python
API. This allows technical artists to access the information and to build
their own UI to display the relevant information. In the long term this
will hopefully also help us to integrate a native ui for this in Blender
by observing how users use this information.

Note: The execution time of a modifier highly depends on what other
things the CPU is doing at the same time. For example, in many more
complex files, many objects and therefore modifiers are evaluated at
the same time by multiple threads which makes the measurement
much less reliable. For best results, make sure that only one object
is evaluated at a time (e.g. by changing it in isolation) and that no
other process on the system keeps the CPU busy.

As shown below, the execution time has to be accessed on the
evaluated object, not the original object.

```lang=python
import bpy
depsgraph = bpy.context.view_layer.depsgraph
ob = bpy.context.active_object
ob_eval = ob.evaluated_get(depsgraph)
modifier_eval = ob_eval.modifiers[0]
print(modifier_eval.execution_time, "s")
```

Differential Revision: https://developer.blender.org/D17185
2023-02-06 15:40:15 +01:00
773a36d2f8 Fix Cycles OneAPI build error after recent changes 2023-02-06 15:36:49 +01:00
cc623ee7b0 Transform: do not save settings when canceling the operation
If we are canceling, the settings must remain the same as before we
start the operation.
2023-02-06 11:18:52 -03:00
deaddbdcff Fix forced snap status being removed when changing transform mode
The `SNAP_FORCED` setting is set to the operation and not the snap
status.

Therefore, this option should not be cleared along with the other
statuses when resetting snapping.

Move then the location of this setting to `TransInfo::modifiers`.
2023-02-06 11:18:52 -03:00
f2538c7173 Fix T104335: MNEE + OptiX OSL results in illegal address error
The OptiX pipeline created for OSL was missing sufficient continuation
stack to handle the MNEE ray generation program.
2023-02-06 15:06:52 +01:00
4bd3b02984 Python: Suppress BGL deprecation messages after 100 times.
BGL deprecation calls used to be reported on each use. As bgl calls
are typically part of a handler that is triggered at refresh this
could lead to overflow of messages and slowing down systems when
the terminal/console had to be refreshed as well.

This patch only reports the first 100 bgl deprecation calls. This
gives enough feedback to the developer that changes needs to be made
. But still provides good responsiveness to users when they have
such add-on enabled. Only the first frames can have a slowdown.
2023-02-06 13:35:29 +01:00
7beb487e9a Fix T104353: Crash on opening sculpting template
`t->region` was `NULL`.

It can happen depending on the context.

Caused by rB19b63b932d2b.
2023-02-06 09:21:04 -03:00
9ad3a85f8b Fix Cycles GPU binaries build error after recent changes for Metal 2023-02-06 13:17:57 +01:00
Michael Jones
654e1e901b Cycles: Use local atomics for faster shader sorting (enabled on Metal)
This patch adds two new kernels: SORT_BUCKET_PASS and SORT_WRITE_PASS. These replace PREFIX_SUM and SORTED_PATHS_ARRAY on supported devices (currently implemented on Metal, but will be trivial to enable on the other backends). The new kernels exploit sort partitioning (see D15331) by sorting each partition separately using local atomics. This can give an overall render speedup of 2-3% depending on architecture. As before, we fall back to the original non-partitioned sorting when the shader count is "too high".

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16909
2023-02-06 11:18:26 +00:00
Michael Jones
46c9f7702a Cycles: Enable MetalRT opt-in for AMD/Navi2 GPUs
Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17043
2023-02-06 11:14:11 +00:00
Michael Jones
be0912a402 Cycles: Prevent use of both AMD and Intel Metal devices at same time
This patch removes the option to select both AMD and Intel GPUs on system that have both. Currently both devices will be selected by default which results in crashes and other poorly understood behaviour. This patch adds precedence for using any discrete AMD GPU over an integrated Intel one. This can be overridden with CYCLES_METAL_FORCE_INTEL.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17166
2023-02-06 11:13:33 +00:00
Michael Jones
0a3df611e7 Fix T103393: Cycles: Undefine __LIGHT_TREE__ on Metal/AMD to fix perf
This patch fixes T103393 by undefining `__LIGHT_TREE__` on Metal/AMD as it has an unexpected & major impact on performance even when light trees are not in use.

Patch authored by Prakash Kamliya.

Reviewed By: brecht

Maniphest Tasks: T103393

Differential Revision: https://developer.blender.org/D17167
2023-02-06 11:12:34 +00:00
Amelie Fondevilla
6d297c35c8 Fix T104371: GPencil merge down layer duplicates wrong frame
The merge down operator was sometimes copying the wrong frame, which altered the animation.
While merging the layers, it is sometimes needed to duplicate a keyframe,
when the lowest layer does not have a keyframe but the highest layer does.
Instead of duplicating the previous keyframe of the lowest layer, the code
was actually duplicating the active frame of the layer which was the current frame in the timeline.

This patch fixes the issue by setting the previous keyframe of the layer as its active frame before duplication.

Related issue: T104371.

Differential Revision: https://developer.blender.org/D17214
2023-02-06 10:44:17 +01:00
329eeacc66 Cleanup: Cycles: Remove isotropic microfacet closure setup functions
Turns out these are 100% redundant, so get rid of them.
2023-02-06 04:26:36 +01:00
2627635ff3 Cleanup: use nullptr in C++ 2023-02-06 12:50:34 +11:00
d6b6050e5b Cleanup: use function style casts in C++ 2023-02-06 12:35:51 +11:00
731c3efd97 Cleanup: format 2023-02-06 12:32:45 +11:00
9f5c17f4af Cleanup: comments in code 2023-02-06 12:25:04 +11:00
4fcc9f5e7e Cleanup: use back-slash doxygen commands, de-duplicate doc-string 2023-02-06 12:25:04 +11:00
7de1a4d1d8 Fix GHOST/Wayland thread-unsafe timer-manager manipulation
Mutex locks for manipulating GHOST_System::m_timerManager from
GHOST_SystemWayland relied on WAYLAND being the only user of the
timer-manager.

This isn't the case as timers are fired from
`GHOST_System::dispatchEvents`.

Resolve by using a separate timer-manager for wayland key-repeat timers.
2023-02-06 12:25:04 +11:00
d3949a4fdb Fix GHOST/Wayland thread-unsafe key-repeat timer checks
Resolve a thread safety issue reported by valgrind's helgrind checker,
although I wasn't able to redo the error in practice.

NULL check on the key-repeat timer also needs to lock, otherwise it's
possible the timer is set in another thread before the lock is acquired.

Now all key-repeat timer access which may run from a thread
locks the timer mutex before any checks or timer manipulation.
2023-02-06 12:25:04 +11:00
b642dc7bc7 Fix: Incorrect forward-compatible saving of face sets
There were two errors with the function used to convert face sets
to the legacy mesh format for keeping forward compatibility:
- It was moved before `CustomData_blend_write_prepare` so it
  operated on an empty span.
- It modified the mesh when it's only supposed to change the copy
  of the layers written to the file.

Differential Revision: https://developer.blender.org/D17210
2023-02-05 18:09:22 -05:00
501352ef05 Cleanup: Move PBVH files to C++
For continued refactoring of the Mesh data structure. See T103343.
2023-02-05 17:36:47 -05:00
b999b79d08 Support project asset libraries in new asset library loading 2023-01-12 14:37:39 +01:00
e63671c21d Support & use new asset library path query from master 2023-01-10 16:45:39 +01:00
3ff321bc98 Merge branch 'master' into blender-projects-basics 2023-01-10 16:37:29 +01:00
f0eba2fe6c Merge branch 'master' into blender-projects-basics 2023-01-10 15:54:27 +01:00
01f9ac24ed Adapt to changes in master, fixes all compile errors 2023-01-06 17:20:03 +01:00
58d32e8185 Merge branch 'master' into blender-projects-basics 2023-01-06 15:26:21 +01:00
7158e0eb4c Merge branch 'master' into blender-projects-basics 2022-11-02 11:58:03 +01:00
75794f95c0 Fix incorrect operator name in UI script 2022-10-18 19:00:15 +02:00
7234c363db Hide UI behind experimental option 2022-10-18 18:26:50 +02:00
c475201378 Use operator poll instead of cancelling; Minor cleanup 2022-10-18 17:32:10 +02:00
907a7aba1e Confirmation prompt for closing file with unsaved project settings
The regular confirmation prompt is shown then with a "Save modified
project settings" checkbox.
2022-10-18 16:44:43 +02:00
e1d90eccc5 Fix error when changing asset library name or path
Coded the RNA functions to only work with the preferences for the
start. But we actually don't have to care here if the libraries are
defined in the preferences or project.
2022-10-18 16:23:53 +02:00
42091addb0 Add comment on asset library definition 2022-10-18 15:45:20 +02:00
7aba0015d0 Cleanup: Remove unused headers 2022-10-18 15:30:19 +02:00
eafc9cb446 Refactor BKE project interface & implementation
Mostly this is about more cleanly separating between `BlenderProject`
and `ProjectSettings`. Overall this is a nice improvement I think, helps
readability of interfaces and implementation a lot.
2022-10-18 15:24:56 +02:00
30b08fbec5 Merge branch 'master' into blender-projects-basics 2022-10-18 15:18:36 +02:00
55512dd2a8 Update to path changes in master 2022-10-18 11:49:50 +02:00
dacbcde446 Merge branch 'master' into blender-projects-basics 2022-10-17 22:36:47 +02:00
7987608fae Add missing break 2022-10-17 15:10:18 +02:00
3dfe0421f2 Attempt to fix build error on clean builds 2022-10-14 19:14:38 +02:00
80a4c388f9 Cleanup: Move project settings code to own file
Will probably reorganize things even further, but step by step.
2022-10-14 19:13:06 +02:00
8ea0551d36 Cleanup: Move operators & operator logic to editors/project 2022-10-14 15:17:23 +02:00
6177df9dce Support displaying project asset libraries in asset browser
The hint in the asset browser for when the selected library path doesn't
exists is updated to mention the project settings too, and there is a
button to open the project settings in the asset library section.
2022-10-13 21:31:20 +02:00
97f667058c Set default project name & add default asset library
Default project name will just be the directory name. The default asset
library will be called "Project Library" and point to an `assets/`
directory inside the project root directory.
2022-10-13 19:37:03 +02:00
1f819df7df Add UI for setting up project asset libraries
Looks just like the UI for setting up custom asset libraries in the
Preferences. However project asset libraries use paths relative to the
project root directory.

Had to do some changes to project data storage to avoid memory issues.
2022-10-13 16:43:54 +02:00
f22cc99944 Merge branch 'master' into blender-projects-basics 2022-10-13 12:06:12 +02:00
97e21ed248 Support writing asset library definitions to the project settings
Unit tested, there is no UI for this yet.
2022-10-13 11:57:24 +02:00
378e0d96fc Make custom asset libraries independent of Preferences
Projects are supposed to support custom asset libraries too. So the
custom asset library types should be generalized and not be preferences
specific.
2022-10-12 16:29:30 +02:00
416a486c00 Swap order of Show Preferences and Show Project Settings menu entires
Looks a bit less nice, but doesn't mess with muscle memory as much,
since the Preferences used to be last in the menu. I for one kept
opening the wrong window :)
2022-10-12 15:57:54 +02:00
9dc2bcf166 Indicate project folders with special icon in File Browser
Using the dev-fund heart icon temporarily :)
2022-10-07 15:58:14 +02:00
c8daddf7db Rename File->New Project to Set up Project, open settings after 2022-10-07 15:37:54 +02:00
b2b00cb8ea Improve message in Project Settings when there's no project + cleanup
Also show a button to set up a project.
2022-10-07 11:55:23 +02:00
0cba4a12a3 Merge branch 'master' into blender-projects-basics 2022-10-07 11:12:23 +02:00
4441e3b791 Fix incorrect setting of initial file browser location 2022-10-06 19:36:23 +02:00
3013cf0b37 Add menu with operator to delete project configuration
Deletes the active project's .blender_project directory, but leaves all
actual project files (.blends and such) untouched.
2022-10-06 19:32:06 +02:00
3002bd6e88 Show warning popup when creating project not containing the current file 2022-10-06 17:42:49 +02:00
3ad6715265 Cleanup: Move project operators closer together 2022-10-06 17:39:53 +02:00
2578c0ba92 Show warning when saving files outside the active project 2022-10-06 17:35:05 +02:00
525f24f1ba Disable buttons if no project is loaded, always show hint then
Disable the navigation and "Save Settings" buttons when there is no
active project.

The message saying that no project is loaded always shows up now, even
if the UI somehow displays a section that is not the "General" one.
2022-10-06 17:00:04 +02:00
871bdb5d1f Keep track of unsaved changes, indicate in "Save Settings" button 2022-10-06 16:42:59 +02:00
818a3d3743 Merge branch 'master' into blender-projects-basics 2022-10-06 16:03:40 +02:00
483a70eeb5 Support changing project name in Project Settings UI + show location
Adds basic access to the active project and its properties to BPY
(`context.project`).
2022-10-06 15:53:04 +02:00
6fde8ab8db Cleanup: variable names, memory debug names 2022-10-06 15:52:22 +02:00
cd15fbbed6 Support saving project settings to .blender_project/settings.json 2022-10-06 15:51:10 +02:00
3ab935c8a9 Project Settings: Add sections support and display some text for testing 2022-10-04 02:15:48 +02:00
420cc8a093 Add operator to open project settings window (like preferences) 2022-10-03 22:40:44 +02:00
c8583415e5 Add theme options for new project settings editor 2022-10-03 21:45:10 +02:00
5cc304b9a7 Add most bolierplate code & some buttons for Project Settings editor 2022-10-03 21:34:17 +02:00
c2fce60bea Don't display save file name & red-alert on "New Project..." 2022-10-03 16:15:45 +02:00
a2863b8891 Merge branch 'master' into blender-projects-basics 2022-10-03 10:42:58 +02:00
d5ae369efa Initial support for reading project name from settings.json
Adds very basic json deserializing for reading a project name from a
`.blender_project/settings.json` file. The name is displayed in the
window title.
2022-09-29 16:58:15 +02:00
c32b7bfe54 Add "New Project" operator to select a directory to use as project root
Name and exact design of this is pending still, they are not optimal.
2022-09-29 15:47:41 +02:00
24cb209915 Show hint in window title when editing file inside project
Currently only says "Has Project", but should be the project name once
that's supported.
2022-09-29 15:46:32 +02:00
5fb4b2d6ce Fix project not updated correctly on file save and write 2022-09-29 15:46:01 +02:00
1e105e99b0 Merge branch 'master' into blender-projects-basics 2022-09-29 14:38:07 +02:00
ce9949f7fa Support active project, set when loading or writing a file (+ tests)
The active project is determined via the path of the .blend file. So if
that changes (on write) or when a new .blend file is opend, the active
project is updated.

Some of the added tests require a latest checkout of the libraries SVN
repository (see rBL63043).
2022-09-29 14:29:00 +02:00
994c33ae71 Merge branch 'master' into blender-projects-basics 2022-09-29 00:17:53 +02:00
939f20490d Handle and test both Unix and Windows style slashes
When unit testing on Unix with Windows style slashes, the project
directories would have the backslash in the name, rather than
recognizing it as nested directory. We could just expect native paths
only like most BLI functions, but it's not a big problem to just support
any format and just convert it internally. The most important part is
that the API defines well how it deals with the different formats, and
that this is unit tested. Ideally we'd have some path object type that
abstracts away the difference.
2022-09-29 00:17:20 +02:00
6d7a301b04 Test project paths with unicode characters & trailing slashes 2022-09-28 16:08:50 +02:00
39987a890a Add create, dummy load function and tests for project settings directory 2022-09-27 15:30:02 +02:00
1310 changed files with 20995 additions and 11939 deletions

View File

@@ -1,8 +0,0 @@
{
"project_id" : "Blender",
"conduit_uri" : "https://developer.blender.org/",
"phabricator.uri" : "https://developer.blender.org/",
"git.default-relative-commit" : "origin/master",
"arc.land.update.default" : "rebase",
"arc.land.onto.default" : "master"
}

View File

@@ -236,6 +236,8 @@ ForEachMacros:
- LOOP_UNSELECTED_POINTS
- LOOP_VISIBLE_KEYS
- LOOP_VISIBLE_POINTS
- LIGHT_FOREACH_BEGIN_DIRECTIONAL
- LIGHT_FOREACH_BEGIN_LOCAL
- LISTBASE_CIRCULAR_BACKWARD_BEGIN
- LISTBASE_CIRCULAR_FORWARD_BEGIN
- LISTBASE_FOREACH

View File

@@ -1,5 +1,4 @@
This repository is only used as a mirror of git.blender.org. Blender development happens on
https://developer.blender.org.
This repository is only used as a mirror. Blender development happens on projects.blender.org.
To get started with contributing code, please see:
https://wiki.blender.org/wiki/Process/Contributing_Code

3
.github/stale.yml vendored
View File

@@ -15,8 +15,7 @@ staleLabel: stale
# Comment to post when closing a stale Issue or Pull Request.
closeComment: >
This issue has been automatically closed, because this repository is only
used as a mirror of git.blender.org. Blender development happens on
developer.blender.org.
used as a mirror. Blender development happens on projects.blender.org.
To get started contributing code, please read:
https://wiki.blender.org/wiki/Process/Contributing_Code

8
.gitmodules vendored
View File

@@ -1,20 +1,20 @@
[submodule "release/scripts/addons"]
path = release/scripts/addons
url = ../blender-addons.git
branch = master
branch = main
ignore = all
[submodule "release/scripts/addons_contrib"]
path = release/scripts/addons_contrib
url = ../blender-addons-contrib.git
branch = master
branch = main
ignore = all
[submodule "release/datafiles/locale"]
path = release/datafiles/locale
url = ../blender-translations.git
branch = master
branch = main
ignore = all
[submodule "source/tools"]
path = source/tools
url = ../blender-dev-tools.git
branch = master
branch = main
ignore = all

View File

@@ -524,7 +524,7 @@ endif()
if(NOT APPLE)
option(WITH_CYCLES_DEVICE_HIP "Enable Cycles AMD HIP support" ON)
option(WITH_CYCLES_HIP_BINARIES "Build Cycles AMD HIP binaries" OFF)
set(CYCLES_HIP_BINARIES_ARCH gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1034 gfx1035 gfx1100 gfx1101 gfx1102 CACHE STRING "AMD HIP architectures to build binaries for")
set(CYCLES_HIP_BINARIES_ARCH gfx900 gfx906 gfx90c gfx902 gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1034 gfx1035 gfx1100 gfx1101 gfx1102 CACHE STRING "AMD HIP architectures to build binaries for")
mark_as_advanced(WITH_CYCLES_DEVICE_HIP)
mark_as_advanced(CYCLES_HIP_BINARIES_ARCH)
endif()
@@ -625,8 +625,10 @@ mark_as_advanced(
# Vulkan
option(WITH_VULKAN_BACKEND "Enable Vulkan as graphics backend (only for development)" OFF)
option(WITH_VULKAN_GUARDEDALLOC "Use guardedalloc for host allocations done inside Vulkan (development option)" OFF)
mark_as_advanced(
WITH_VULKAN_BACKEND
WITH_VULKAN_GUARDEDALLOC
)
# Metal

View File

@@ -299,7 +299,11 @@ else
ifneq ("$(wildcard $(DEPS_BUILD_DIR)/build.ninja)","")
DEPS_BUILD_COMMAND:=ninja
else
DEPS_BUILD_COMMAND:=make -s
ifeq ($(OS), Darwin)
DEPS_BUILD_COMMAND:=make -s
else
DEPS_BUILD_COMMAND:="$(BLENDER_DIR)/build_files/build_environment/linux/make_deps_wrapper.sh" -s
endif
endif
endif
@@ -398,7 +402,7 @@ endif
deps: .FORCE
@echo
@echo Configuring dependencies in \"$(DEPS_BUILD_DIR)\"
@echo Configuring dependencies in \"$(DEPS_BUILD_DIR)\", install to \"$(DEPS_INSTALL_DIR)\"
@cmake -H"$(DEPS_SOURCE_DIR)" \
-B"$(DEPS_BUILD_DIR)" \

View File

@@ -24,7 +24,7 @@ Development
-----------
- [Build Instructions](https://wiki.blender.org/wiki/Building_Blender)
- [Code Review & Bug Tracker](https://developer.blender.org)
- [Code Review & Bug Tracker](https://projects.blender.org)
- [Developer Forum](https://devtalk.blender.org)
- [Developer Documentation](https://wiki.blender.org)

View File

@@ -10,7 +10,7 @@ ExternalProject_Add(external_epoxy
URL_HASH ${EPOXY_HASH_TYPE}=${EPOXY_HASH}
PREFIX ${BUILD_DIR}/epoxy
PATCH_COMMAND ${PATCH_CMD} -p 1 -N -d ${BUILD_DIR}/epoxy/src/external_epoxy/ < ${PATCH_DIR}/epoxy.diff
CONFIGURE_COMMAND ${CONFIGURE_ENV} && ${MESON} setup --prefix ${LIBDIR}/epoxy --default-library ${EPOXY_LIB_TYPE} --libdir lib ${BUILD_DIR}/epoxy/src/external_epoxy-build ${BUILD_DIR}/epoxy/src/external_epoxy -Dtests=false
CONFIGURE_COMMAND ${CONFIGURE_ENV} && ${MESON} setup --prefix ${LIBDIR}/epoxy --default-library ${EPOXY_LIB_TYPE} --libdir lib ${BUILD_DIR}/epoxy/src/external_epoxy-build ${BUILD_DIR}/epoxy/src/external_epoxy -Dtests=false ${MESON_BUILD_TYPE}
BUILD_COMMAND ninja
INSTALL_COMMAND ninja install
)

View File

@@ -9,7 +9,7 @@ ExternalProject_Add(external_fribidi
URL_HASH ${FRIBIDI_HASH_TYPE}=${FRIBIDI_HASH}
DOWNLOAD_DIR ${DOWNLOAD_DIR}
PREFIX ${BUILD_DIR}/fribidi
CONFIGURE_COMMAND ${MESON} setup --prefix ${LIBDIR}/fribidi -Ddocs=false --default-library static --libdir lib ${BUILD_DIR}/fribidi/src/external_fribidi-build ${BUILD_DIR}/fribidi/src/external_fribidi
CONFIGURE_COMMAND ${MESON} setup --prefix ${LIBDIR}/fribidi ${MESON_BUILD_TYPE} -Ddocs=false --default-library static --libdir lib ${BUILD_DIR}/fribidi/src/external_fribidi-build ${BUILD_DIR}/fribidi/src/external_fribidi
BUILD_COMMAND ninja
INSTALL_COMMAND ninja install
INSTALL_DIR ${LIBDIR}/fribidi

View File

@@ -22,7 +22,7 @@ elseif(UNIX AND NOT APPLE)
)
endif()
# Boolean crashes with Arm assembly, see T103423.
# Boolean crashes with Arm assembly, see #103423.
if(BLENDER_PLATFORM_ARM)
set(GMP_OPTIONS
${GMP_OPTIONS}

View File

@@ -21,6 +21,7 @@ set(HARFBUZZ_EXTRA_OPTIONS
# Only used for command line utilities,
# disable as this would add an addition & unnecessary build-dependency.
-Dcairo=disabled
${MESON_BUILD_TYPE}
)
ExternalProject_Add(external_harfbuzz
@@ -59,3 +60,10 @@ if(BUILD_MODE STREQUAL Release AND WIN32)
DEPENDEES install
)
endif()
if(BUILD_MODE STREQUAL Debug AND WIN32)
ExternalProject_Add_Step(external_harfbuzz after_install
COMMAND ${CMAKE_COMMAND} -E copy ${LIBDIR}/harfbuzz/lib/libharfbuzz.a ${HARVEST_TARGET}/harfbuzz/lib/libharfbuzz_d.lib
DEPENDEES install
)
endif()

View File

@@ -40,7 +40,8 @@ ExternalProject_Add(external_igc_llvm
${PATCH_CMD} -p 1 -d ${IGC_LLVM_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/clang/0004-OpenCL-support-cl_ext_float_atomics.patch &&
${PATCH_CMD} -p 1 -d ${IGC_LLVM_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/clang/0005-OpenCL-Add-cl_khr_integer_dot_product.patch &&
${PATCH_CMD} -p 1 -d ${IGC_LLVM_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/llvm/0001-Memory-leak-fix-for-Managed-Static-Mutex.patch &&
${PATCH_CMD} -p 1 -d ${IGC_LLVM_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/llvm/0002-Remove-repo-name-in-LLVM-IR.patch
${PATCH_CMD} -p 1 -d ${IGC_LLVM_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/llvm/0002-Remove-repo-name-in-LLVM-IR.patch &&
${PATCH_CMD} -p 1 -d ${IGC_LLVM_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/llvm/0003-Add-missing-include-limit-in-benchmark.patch
)
add_dependencies(
external_igc_llvm
@@ -55,9 +56,6 @@ ExternalProject_Add(external_igc_spirv_translator
CONFIGURE_COMMAND echo .
BUILD_COMMAND echo .
INSTALL_COMMAND echo .
PATCH_COMMAND ${PATCH_CMD} -p 1 -d ${IGC_SPIRV_TRANSLATOR_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/spirv/0001-update-SPIR-V-headers-for-SPV_INTEL_split_barrier.patch &&
${PATCH_CMD} -p 1 -d ${IGC_SPIRV_TRANSLATOR_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/spirv/0002-Add-support-for-split-barriers-extension-SPV_INTEL_s.patch &&
${PATCH_CMD} -p 1 -d ${IGC_SPIRV_TRANSLATOR_SOURCE_DIR} < ${IGC_OPENCL_CLANG_PATCH_DIR}/spirv/0003-Support-cl_bf16_conversions.patch
)
add_dependencies(
external_igc_spirv_translator

View File

@@ -15,7 +15,7 @@ llvm-config = '${LIBDIR}/llvm/bin/llvm-config'"
)
set(MESA_EXTRA_FLAGS
-Dbuildtype=release
${MESON_BUILD_TYPE}
-Dc_args=${MESA_CFLAGS}
-Dcpp_args=${MESA_CXXFLAGS}
-Dc_link_args=${MESA_LDFLAGS}

View File

@@ -16,8 +16,10 @@ message("BuildMode = ${BUILD_MODE}")
if(BUILD_MODE STREQUAL "Debug")
set(LIBDIR ${CMAKE_CURRENT_BINARY_DIR}/Debug)
set(MESON_BUILD_TYPE -Dbuildtype=debug)
else()
set(LIBDIR ${CMAKE_CURRENT_BINARY_DIR}/Release)
set(MESON_BUILD_TYPE -Dbuildtype=release)
endif()
set(DOWNLOAD_DIR "${CMAKE_CURRENT_BINARY_DIR}/downloads" CACHE STRING "Path for downloaded files")

View File

@@ -88,6 +88,19 @@ else()
export LDFLAGS=${PYTHON_LDFLAGS} &&
export PKG_CONFIG_PATH=${LIBDIR}/ffi/lib/pkgconfig)
# NOTE: untested on APPLE so far.
if(NOT APPLE)
set(PYTHON_CONFIGURE_EXTRA_ARGS
${PYTHON_CONFIGURE_EXTRA_ARGS}
# Used on most release Linux builds (Fedora for e.g.),
# increases build times noticeably with the benefit of a modest speedup at runtime.
--enable-optimizations
# While LTO is OK when building on the same system, it's incompatible across GCC versions,
# making it impractical for developers to build against, so keep it disabled.
# `--with-lto`
)
endif()
ExternalProject_Add(external_python
URL file://${PACKAGE_DIR}/${PYTHON_FILE}
DOWNLOAD_DIR ${DOWNLOAD_DIR}

View File

@@ -668,9 +668,9 @@ set(SPIRV_HEADERS_FILE SPIR-V-Headers-${SPIRV_HEADERS_VERSION}.tar.gz)
# compiler, the versions used are taken from the following location
# https://github.com/intel/intel-graphics-compiler/releases
set(IGC_VERSION 1.0.12149.1)
set(IGC_VERSION 1.0.13064.7)
set(IGC_URI https://github.com/intel/intel-graphics-compiler/archive/refs/tags/igc-${IGC_VERSION}.tar.gz)
set(IGC_HASH 44f67f24e3bc5130f9f062533abf8154782a9d0a992bc19b498639a8521ae836)
set(IGC_HASH a929abd4cca2b293961ec0437ee4b3b2147bd3b2c8a3c423af78c0c359b2e5ae)
set(IGC_HASH_TYPE SHA256)
set(IGC_FILE igc-${IGC_VERSION}.tar.gz)
@@ -690,15 +690,15 @@ set(IGC_LLVM_FILE ${IGC_LLVM_VERSION}.tar.gz)
#
# WARNING WARNING WARNING
set(IGC_OPENCL_CLANG_VERSION 363a5262d8c7cff3fb28f3bdb5d85c8d7e91c1bb)
set(IGC_OPENCL_CLANG_VERSION ee31812ea8b89d08c2918f045d11a19bd33525c5)
set(IGC_OPENCL_CLANG_URI https://github.com/intel/opencl-clang/archive/${IGC_OPENCL_CLANG_VERSION}.tar.gz)
set(IGC_OPENCL_CLANG_HASH aa8cf72bb239722ce8ce44f79413c6887ecc8ca18477dd520aa5c4809756da9a)
set(IGC_OPENCL_CLANG_HASH 1db6735bbcfaa31e8a9ba39f121d6bafa806ea8919e9f56782d6aaa67771ddda)
set(IGC_OPENCL_CLANG_HASH_TYPE SHA256)
set(IGC_OPENCL_CLANG_FILE opencl-clang-${IGC_OPENCL_CLANG_VERSION}.tar.gz)
set(IGC_VCINTRINSICS_VERSION v0.5.0)
set(IGC_VCINTRINSICS_VERSION v0.11.0)
set(IGC_VCINTRINSICS_URI https://github.com/intel/vc-intrinsics/archive/refs/tags/${IGC_VCINTRINSICS_VERSION}.tar.gz)
set(IGC_VCINTRINSICS_HASH 70bb47c5e32173cf61514941e83ae7c7eb4485e6d2fca60cfa1f50d4f42c41f2)
set(IGC_VCINTRINSICS_HASH e5acd5626ce7fa6d41ce154c50ac805eda734ee66af94ef28e680ac2ad81bb9f)
set(IGC_VCINTRINSICS_HASH_TYPE SHA256)
set(IGC_VCINTRINSICS_FILE vc-intrinsics-${IGC_VCINTRINSICS_VERSION}.tar.gz)
@@ -714,9 +714,9 @@ set(IGC_SPIRV_TOOLS_HASH 6e19900e948944243024aedd0a201baf3854b377b9cc7a386553bc1
set(IGC_SPIRV_TOOLS_HASH_TYPE SHA256)
set(IGC_SPIRV_TOOLS_FILE SPIR-V-Tools-${IGC_SPIRV_TOOLS_VERSION}.tar.gz)
set(IGC_SPIRV_TRANSLATOR_VERSION a31ffaeef77e23d500b3ea3d35e0c42ff5648ad9)
set(IGC_SPIRV_TRANSLATOR_VERSION d739c01d65ec00dee64dedd40deed805216a7193)
set(IGC_SPIRV_TRANSLATOR_URI https://github.com/KhronosGroup/SPIRV-LLVM-Translator/archive/${IGC_SPIRV_TRANSLATOR_VERSION}.tar.gz)
set(IGC_SPIRV_TRANSLATOR_HASH 9e26c96a45341b8f8af521bacea20e752623346340addd02af95d669f6e89252)
set(IGC_SPIRV_TRANSLATOR_HASH ddc0cc9ccbe59dadeaf291012d59de142b2e9f2b124dbb634644d39daddaa13e)
set(IGC_SPIRV_TRANSLATOR_HASH_TYPE SHA256)
set(IGC_SPIRV_TRANSLATOR_FILE SPIR-V-Translator-${IGC_SPIRV_TRANSLATOR_VERSION}.tar.gz)
@@ -724,15 +724,15 @@ set(IGC_SPIRV_TRANSLATOR_FILE SPIR-V-Translator-${IGC_SPIRV_TRANSLATOR_VERSION}.
### Intel Graphics Compiler DEPS END ###
########################################
set(GMMLIB_VERSION intel-gmmlib-22.1.8)
set(GMMLIB_VERSION intel-gmmlib-22.3.0)
set(GMMLIB_URI https://github.com/intel/gmmlib/archive/refs/tags/${GMMLIB_VERSION}.tar.gz)
set(GMMLIB_HASH bf23e9a3742b4fb98c7666c9e9b29f3219e4b2fb4d831aaf4eed71f5e2d17368)
set(GMMLIB_HASH c1f33e1519edfc527127baeb0436b783430dfd256c643130169a3a71dc86aff9)
set(GMMLIB_HASH_TYPE SHA256)
set(GMMLIB_FILE ${GMMLIB_VERSION}.tar.gz)
set(OCLOC_VERSION 22.38.24278)
set(OCLOC_VERSION 22.49.25018.21)
set(OCLOC_URI https://github.com/intel/compute-runtime/archive/refs/tags/${OCLOC_VERSION}.tar.gz)
set(OCLOC_HASH db0c542fccd651e6404b15a74d46027f1ce0eda8dc9e25a40cbb6c0faef257ee)
set(OCLOC_HASH 92362dae08b503a34e5d3820ed284198c452bcd5e7504d90eb69887b20492c06)
set(OCLOC_HASH_TYPE SHA256)
set(OCLOC_FILE ocloc-${OCLOC_VERSION}.tar.gz)

View File

@@ -13,7 +13,7 @@ ExternalProject_Add(external_wayland
# NOTE: `-lm` is needed for `libxml2` which is a static library that uses `libm.so`,
# without this, math symbols such as `floor` aren't found.
CONFIGURE_COMMAND ${CMAKE_COMMAND} -E env PKG_CONFIG_PATH=${LIBDIR}/expat/lib/pkgconfig:${LIBDIR}/xml2/lib/pkgconfig:${LIBDIR}/ffi/lib/pkgconfig:$PKG_CONFIG_PATH
${MESON} --prefix ${LIBDIR}/wayland -Ddocumentation=false -Dtests=false -D "c_link_args=-L${LIBDIR}/ffi/lib -lm" . ../external_wayland
${MESON} --prefix ${LIBDIR}/wayland ${MESON_BUILD_TYPE} -Ddocumentation=false -Dtests=false -D "c_link_args=-L${LIBDIR}/ffi/lib -lm" . ../external_wayland
BUILD_COMMAND ninja
INSTALL_COMMAND ninja install
)

View File

@@ -7,7 +7,7 @@ ExternalProject_Add(external_wayland_protocols
PREFIX ${BUILD_DIR}/wayland-protocols
# Use `-E` so the `PKG_CONFIG_PATH` can be defined to link against our own WAYLAND.
CONFIGURE_COMMAND ${CMAKE_COMMAND} -E env PKG_CONFIG_PATH=${LIBDIR}/wayland/lib64/pkgconfig:$PKG_CONFIG_PATH
${MESON} --prefix ${LIBDIR}/wayland-protocols . ../external_wayland_protocols -Dtests=false
${MESON} --prefix ${LIBDIR}/wayland-protocols ${MESON_BUILD_TYPE} . ../external_wayland_protocols -Dtests=false
BUILD_COMMAND ninja
INSTALL_COMMAND ninja install
)

View File

@@ -0,0 +1,74 @@
#!/usr/bin/env bash
# SPDX-License-Identifier: GPL-2.0-or-later
# This script ensures:
# - One dependency is built at a time.
# - That dependency uses all available cores.
#
# Without this, simply calling `make -j$(nproc)` from the `${CMAKE_BUILD_DIR}/deps/`
# directory will build many projects at once.
#
# This is undesirable for the following reasons:
#
# - The output from projects is mixed together,
# making it difficult to track down the cause of a build failure.
#
# - Larger dependencies such as LLVM can bottleneck the build process,
# making it necessary to cancel the build and manually run build commands in each directory.
#
# - Building many projects at once means canceling (Control-C) can lead to the build being in an undefined state.
# It's possible canceling happens as a patch is being applied or files are being copied.
# (steps that aren't part of the compilation process where it's typically safe to cancel).
if [[ -z "$MY_MAKE_CALL_LEVEL" ]]; then
export MY_MAKE_CALL_LEVEL=0
export MY_MAKEFLAGS=$MAKEFLAGS
# Extract the jobs argument (`-jN`, `-j N`, `--jobs=N`).
add_next=0
for i in "$@"; do
case $i in
-j*)
export MY_JOBS_ARG=$i
if [ "$MY_JOBS_ARG" = "-j" ]; then
add_next=1
fi
;;
--jobs=*)
shift # past argument=value
MY_JOBS_ARG=$i
;;
*)
if (( add_next == 1 )); then
MY_JOBS_ARG="$MY_JOBS_ARG $i"
add_next=0
fi
;;
esac
done
unset i add_next
if [[ -z "$MY_JOBS_ARG" ]]; then
MY_JOBS_ARG="-j$(nproc)"
fi
export MY_JOBS_ARG
# Support user defined `MAKEFLAGS`.
export MAKEFLAGS="$MY_MAKEFLAGS -j1"
else
export MY_MAKE_CALL_LEVEL=$(( MY_MAKE_CALL_LEVEL + 1 ))
if (( MY_MAKE_CALL_LEVEL == 1 )); then
# Important to set jobs to 1, otherwise user defined jobs argument is used.
export MAKEFLAGS="$MY_MAKEFLAGS -j1"
elif (( MY_MAKE_CALL_LEVEL == 2 )); then
# This is the level used by each sub-project.
export MAKEFLAGS="$MY_MAKEFLAGS $MY_JOBS_ARG"
fi
# Else leave `MY_MAKEFLAGS` flags as-is, avoids setting a high number of jobs on recursive
# calls (which may easily run out of memory). Let the job-server handle the rest.
fi
# Useful for troubleshooting the wrapper.
# echo "Call level: $MY_MAKE_CALL_LEVEL, args=$@".
# Call actual make but ensure recursive calls run via this script.
exec make MAKE="$0" "$@"

View File

@@ -1,7 +1,7 @@
diff -Naur external_igc_opencl_clang.orig/CMakeLists.txt external_igc_opencl_clang/CMakeLists.txt
--- external_igc_opencl_clang.orig/CMakeLists.txt 2022-03-16 05:51:10 -0600
+++ external_igc_opencl_clang/CMakeLists.txt 2022-05-23 10:40:09 -0600
@@ -126,22 +126,24 @@
@@ -147,22 +147,24 @@
)
endif()

View File

@@ -23,19 +23,19 @@ if(EXISTS ${SOURCE_DIR}/.git)
if(MY_WC_BRANCH STREQUAL "HEAD")
# Detached HEAD, check whether commit hash is reachable
# in the master branch
# in the main branch
execute_process(COMMAND git rev-parse --short=12 HEAD
WORKING_DIRECTORY ${SOURCE_DIR}
OUTPUT_VARIABLE MY_WC_HASH
OUTPUT_STRIP_TRAILING_WHITESPACE)
execute_process(COMMAND git branch --list master blender-v* --contains ${MY_WC_HASH}
execute_process(COMMAND git branch --list main blender-v* --contains ${MY_WC_HASH}
WORKING_DIRECTORY ${SOURCE_DIR}
OUTPUT_VARIABLE _git_contains_check
OUTPUT_STRIP_TRAILING_WHITESPACE)
if(NOT _git_contains_check STREQUAL "")
set(MY_WC_BRANCH "master")
set(MY_WC_BRANCH "main")
else()
execute_process(COMMAND git show-ref --tags -d
WORKING_DIRECTORY ${SOURCE_DIR}
@@ -48,7 +48,7 @@ if(EXISTS ${SOURCE_DIR}/.git)
OUTPUT_STRIP_TRAILING_WHITESPACE)
if(_git_tag_hashes MATCHES "${_git_head_hash}")
set(MY_WC_BRANCH "master")
set(MY_WC_BRANCH "main")
else()
execute_process(COMMAND git branch --contains ${MY_WC_HASH}
WORKING_DIRECTORY ${SOURCE_DIR}

View File

@@ -85,7 +85,7 @@ if(NOT APPLE)
set(WITH_CYCLES_DEVICE_OPTIX ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_CUDA_BINARIES ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_CUBIN_COMPILER OFF CACHE BOOL "" FORCE)
set(WITH_CYCLES_HIP_BINARIES ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_HIP_BINARIES OFF CACHE BOOL "" FORCE)
set(WITH_CYCLES_DEVICE_ONEAPI ON CACHE BOOL "" FORCE)
set(WITH_CYCLES_ONEAPI_BINARIES ON CACHE BOOL "" FORCE)
endif()

View File

@@ -11,11 +11,11 @@
mkdir ~/blender-git
cd ~/blender-git
git clone http://git.blender.org/blender.git
git clone https://projects.blender.org/blender/blender.git
cd blender
git submodule update --init --recursive
git submodule foreach git checkout master
git submodule foreach git pull --rebase origin master
git submodule foreach git checkout main
git submodule foreach git pull --rebase origin main
# create build dir
mkdir ~/blender-git/build-cmake
@@ -35,7 +35,7 @@ ln -s ~/blender-git/build-cmake/bin/blender ~/blender-git/blender/blender.bin
echo ""
echo "* Useful Commands *"
echo " Run Blender: ~/blender-git/blender/blender.bin"
echo " Update Blender: git pull --rebase; git submodule foreach git pull --rebase origin master"
echo " Update Blender: git pull --rebase; git submodule foreach git pull --rebase origin main"
echo " Reconfigure Blender: cd ~/blender-git/build-cmake ; cmake ."
echo " Build Blender: cd ~/blender-git/build-cmake ; make"
echo ""

View File

@@ -544,7 +544,7 @@ endfunction()
function(setup_platform_linker_libs
target
)
# jemalloc must be early in the list, to be before pthread (see T57998)
# jemalloc must be early in the list, to be before pthread (see #57998).
if(WITH_MEM_JEMALLOC)
target_link_libraries(${target} ${JEMALLOC_LIBRARIES})
endif()

View File

@@ -440,7 +440,7 @@ string(APPEND PLATFORM_LINKFLAGS " -stdlib=libc++")
# Make stack size more similar to Embree, required for Embree.
string(APPEND PLATFORM_LINKFLAGS_EXECUTABLE " -Wl,-stack_size,0x100000")
# Suppress ranlib "has no symbols" warnings (workaround for T48250)
# Suppress ranlib "has no symbols" warnings (workaround for #48250).
set(CMAKE_C_ARCHIVE_CREATE "<CMAKE_AR> Scr <TARGET> <LINK_FLAGS> <OBJECTS>")
set(CMAKE_CXX_ARCHIVE_CREATE "<CMAKE_AR> Scr <TARGET> <LINK_FLAGS> <OBJECTS>")
# llvm-ranlib doesn't support this flag. Xcode's libtool does.

View File

@@ -121,7 +121,7 @@ if(WITH_WINDOWS_BUNDLE_CRT)
include(InstallRequiredSystemLibraries)
# ucrtbase(d).dll cannot be in the manifest, due to the way windows 10 handles
# redirects for this dll, for details see T88813.
# redirects for this dll, for details see #88813.
foreach(lib ${CMAKE_INSTALL_SYSTEM_RUNTIME_LIBS})
string(FIND ${lib} "ucrtbase" pos)
if(NOT pos EQUAL -1)
@@ -295,7 +295,7 @@ unset(MATERIALX_LIB_FOLDER_EXISTS)
if(NOT MSVC_CLANG AND # Available with MSVC 15.7+ but not for CLANG.
NOT WITH_WINDOWS_SCCACHE AND # And not when sccache is enabled
NOT VS_CLANG_TIDY) # Clang-tidy does not like these options
add_compile_options(/experimental:external /external:templates- /external:I "${LIBDIR}" /external:W0)
add_compile_options(/experimental:external /external:I "${LIBDIR}" /external:W0)
endif()
# Add each of our libraries to our cmake_prefix_path so find_package() could work

View File

@@ -5,16 +5,16 @@
update-code:
git:
submodules:
- branch: master
- branch: main
commit_id: HEAD
path: release/scripts/addons
- branch: master
- branch: main
commit_id: HEAD
path: release/scripts/addons_contrib
- branch: master
- branch: main
commit_id: HEAD
path: release/datafiles/locale
- branch: master
- branch: main
commit_id: HEAD
path: source/tools
svn:

View File

@@ -58,7 +58,7 @@ Each Blender release supports one Python version, and the package is only compat
## Source Code
* [Releases](https://download.blender.org/source/)
* Repository: [git.blender.org/blender.git](https://git.blender.org/gitweb/gitweb.cgi/blender.git)
* Repository: [projects.blender.org/blender/blender.git](https://projects.blender.org/blender/blender)
## Credits

View File

@@ -42,6 +42,7 @@ def parse_arguments() -> argparse.Namespace:
parser.add_argument("--svn-branch", default=None)
parser.add_argument("--git-command", default="git")
parser.add_argument("--use-linux-libraries", action="store_true")
parser.add_argument("--architecture", type=str, choices=("x86_64", "amd64", "arm64",))
return parser.parse_args()
@@ -51,6 +52,17 @@ def get_blender_git_root() -> str:
# Setup for precompiled libraries and tests from svn.
def get_effective_architecture(args: argparse.Namespace):
if args.architecture:
return args.architecture
# Check platform.version to detect arm64 with x86_64 python binary.
if "ARM64" in platform.version():
return "arm64"
return platform.machine().lower()
def svn_update(args: argparse.Namespace, release_version: Optional[str]) -> None:
svn_non_interactive = [args.svn_command, '--non-interactive']
@@ -58,11 +70,11 @@ def svn_update(args: argparse.Namespace, release_version: Optional[str]) -> None
svn_url = make_utils.svn_libraries_base_url(release_version, args.svn_branch)
# Checkout precompiled libraries
architecture = get_effective_architecture(args)
if sys.platform == 'darwin':
# Check platform.version to detect arm64 with x86_64 python binary.
if platform.machine() == 'arm64' or ('ARM64' in platform.version()):
if architecture == 'arm64':
lib_platform = "darwin_arm64"
elif platform.machine() == 'x86_64':
elif architecture == 'x86_64':
lib_platform = "darwin"
else:
lib_platform = None
@@ -170,7 +182,7 @@ def git_update_skip(args: argparse.Namespace, check_remote_exists: bool = True)
return "rebase or merge in progress, complete it first"
# Abort if uncommitted changes.
changes = check_output([args.git_command, 'status', '--porcelain', '--untracked-files=no'])
changes = check_output([args.git_command, 'status', '--porcelain', '--untracked-files=no', '--ignore-submodules'])
if len(changes) != 0:
return "you have unstaged changes"
@@ -202,8 +214,8 @@ def submodules_update(
sys.exit(1)
# Update submodules to appropriate given branch,
# falling back to master if none is given and/or found in a sub-repository.
branch_fallback = "master"
# falling back to main if none is given and/or found in a sub-repository.
branch_fallback = "main"
if not branch:
branch = branch_fallback
@@ -256,14 +268,15 @@ if __name__ == "__main__":
blender_skip_msg = ""
submodules_skip_msg = ""
# Test if we are building a specific release version.
branch = make_utils.git_branch(args.git_command)
if branch == 'HEAD':
sys.stderr.write('Blender git repository is in detached HEAD state, must be in a branch\n')
sys.exit(1)
tag = make_utils.git_tag(args.git_command)
release_version = make_utils.git_branch_release_version(branch, tag)
blender_version = make_utils. parse_blender_version()
if blender_version.cycle != 'alpha':
major = blender_version.version // 100
minor = blender_version.version % 100
branch = f"blender-v{major}.{minor}-release"
release_version = f"{major}.{minor}"
else:
branch = 'main'
release_version = None
if not args.no_libraries:
svn_update(args, release_version)

View File

@@ -3,9 +3,9 @@ if NOT exist "%BLENDER_DIR%\source\tools\.git" (
if not "%GIT%" == "" (
"%GIT%" submodule update --init --recursive --progress
if errorlevel 1 goto FAIL
"%GIT%" submodule foreach git checkout master
"%GIT%" submodule foreach git checkout main
if errorlevel 1 goto FAIL
"%GIT%" submodule foreach git pull --rebase origin master
"%GIT%" submodule foreach git pull --rebase origin main
if errorlevel 1 goto FAIL
goto EOF
) else (

View File

@@ -38,7 +38,7 @@ PROJECT_NAME = Blender
# could be handy for archiving the generated documentation or if some version
# control system is used.
PROJECT_NUMBER = V3.5
PROJECT_NUMBER = V3.6
# Using the PROJECT_BRIEF tag one can provide an optional one line description
# for a project that appears at the top of each page and should give viewer a

View File

@@ -37,7 +37,7 @@ def draw_callback_px(self, context):
# BLF drawing routine
font_id = font_info["font_id"]
blf.position(font_id, 2, 80, 0)
blf.size(font_id, 50, 72)
blf.size(font_id, 50)
blf.draw(font_id, "Hello World")

View File

@@ -476,7 +476,7 @@ MODULE_GROUPING = {
# -------------------------------BLENDER----------------------------------------
# converting bytes to strings, due to T30154
# Converting bytes to strings, due to #30154.
BLENDER_REVISION = str(bpy.app.build_hash, 'utf_8')
BLENDER_REVISION_TIMESTAMP = bpy.app.build_commit_timestamp
@@ -487,7 +487,7 @@ BLENDER_VERSION_DOTS = "%d.%d" % (bpy.app.version[0], bpy.app.version[1])
if BLENDER_REVISION != "Unknown":
# SHA1 Git hash
BLENDER_VERSION_HASH = BLENDER_REVISION
BLENDER_VERSION_HASH_HTML_LINK = "<a href=https://developer.blender.org/rB%s>%s</a>" % (
BLENDER_VERSION_HASH_HTML_LINK = "<a href=https://projects.blender.org/blender/blender/commit/%s>%s</a>" % (
BLENDER_VERSION_HASH, BLENDER_VERSION_HASH,
)
BLENDER_VERSION_DATE = time.strftime("%d/%m/%Y", time.localtime(BLENDER_REVISION_TIMESTAMP))
@@ -647,7 +647,7 @@ def undocumented_message(module_name, type_name, identifier):
module_name, type_name, identifier,
)
return "Undocumented, consider `contributing <https://developer.blender.org/T51061>`__."
return "Undocumented, consider `contributing <https://developer.blender.org/>`__."
def range_str(val):
@@ -1816,9 +1816,9 @@ def pyrna2sphinx(basepath):
# operators
def write_ops():
API_BASEURL = "https://developer.blender.org/diffusion/B/browse/master/release/scripts"
API_BASEURL_ADDON = "https://developer.blender.org/diffusion/BA"
API_BASEURL_ADDON_CONTRIB = "https://developer.blender.org/diffusion/BAC"
API_BASEURL = "https://projects.blender.org/blender/blender/src/branch/main/release/scripts"
API_BASEURL_ADDON = "https://projects.blender.org/blender/blender-addons"
API_BASEURL_ADDON_CONTRIB = "https://projects.blender.org/blender/blender-addons-contrib"
op_modules = {}
op = None
@@ -2200,7 +2200,7 @@ def write_rst_enum_items(basepath, key, key_no_prefix, enum_items):
Write a single page for a static enum in RST.
This helps avoiding very large lists being in-lined in many places which is an issue
especially with icons in ``bpy.types.UILayout``. See T87008.
especially with icons in ``bpy.types.UILayout``. See #87008.
"""
filepath = os.path.join(basepath, "%s.rst" % key_no_prefix)
with open(filepath, "w", encoding="utf-8") as fh:

View File

@@ -156,7 +156,7 @@ var Popover = function() {
},
getNamed : function(v) {
$.each(all_versions, function(ix, title) {
if (ix === "master" || ix === "latest") {
if (ix === "master" || ix === "main" || ix === "latest") {
var m = title.match(/\d\.\d[\w\d\.]*/)[0];
if (parseFloat(m) == v) {
v = ix;

View File

@@ -1,5 +1,5 @@
Project: Blender
URL: https://git.blender.org/blender.git
URL: https://projects.blender.org/blender/blender.git
License: Apache 2.0
Upstream version: N/A
Local modifications: None

View File

@@ -12,6 +12,7 @@ from bpy.props import (
PointerProperty,
StringProperty,
)
from bpy.app.translations import pgettext_iface as iface_
from math import pi
@@ -1664,30 +1665,51 @@ class CyclesPreferences(bpy.types.AddonPreferences):
col.label(text="No compatible GPUs found for Cycles", icon='INFO')
if device_type == 'CUDA':
col.label(text="Requires NVIDIA GPU with compute capability 3.0", icon='BLANK1')
compute_capability = "3.0"
col.label(text=iface_("Requires NVIDIA GPU with compute capability %s") % compute_capability,
icon='BLANK1', translate=False)
elif device_type == 'OPTIX':
col.label(text="Requires NVIDIA GPU with compute capability 5.0", icon='BLANK1')
col.label(text="and NVIDIA driver version 470 or newer", icon='BLANK1')
compute_capability = "5.0"
driver_version = "470"
col.label(text=iface_("Requires NVIDIA GPU with compute capability %s") % compute_capability,
icon='BLANK1', translate=False)
col.label(text="and NVIDIA driver version %s or newer" % driver_version,
icon='BLANK1', translate=False)
elif device_type == 'HIP':
import sys
if sys.platform[:3] == "win":
col.label(text="Requires AMD GPU with RDNA architecture", icon='BLANK1')
col.label(text="and AMD Radeon Pro 21.Q4 driver or newer", icon='BLANK1')
elif sys.platform.startswith("linux"):
col.label(text="Requires AMD GPU with RDNA architecture", icon='BLANK1')
col.label(text="and AMD driver version 22.10 or newer", icon='BLANK1')
if True:
col.label(text="HIP temporarily disabled due to compiler bugs", icon='BLANK1')
else:
import sys
if sys.platform[:3] == "win":
driver_version = "21.Q4"
col.label(text="Requires AMD GPU with Vega or RDNA architecture", icon='BLANK1')
col.label(text=iface_("and AMD Radeon Pro %s driver or newer") % driver_version,
icon='BLANK1', translate=False)
elif sys.platform.startswith("linux"):
driver_version = "22.10"
col.label(text="Requires AMD GPU with Vega or RDNA architecture", icon='BLANK1')
col.label(text=iface_("and AMD driver version %s or newer") % driver_version, icon='BLANK1',
translate=False)
elif device_type == 'ONEAPI':
import sys
if sys.platform.startswith("win"):
driver_version = "101.4032"
col.label(text="Requires Intel GPU with Xe-HPG architecture", icon='BLANK1')
col.label(text="and Windows driver version 101.4032 or newer", icon='BLANK1')
col.label(text=iface_("and Windows driver version %s or newer") % driver_version,
icon='BLANK1', translate=False)
elif sys.platform.startswith("linux"):
driver_version = "1.3.24931"
col.label(text="Requires Intel GPU with Xe-HPG architecture and", icon='BLANK1')
col.label(text=" - intel-level-zero-gpu version 1.3.24931 or newer", icon='BLANK1')
col.label(text=iface_(" - intel-level-zero-gpu version %s or newer") % driver_version,
icon='BLANK1', translate=False)
col.label(text=" - oneAPI Level-Zero Loader", icon='BLANK1')
elif device_type == 'METAL':
col.label(text="Requires Apple Silicon with macOS 12.2 or newer", icon='BLANK1')
col.label(text="or AMD with macOS 12.3 or newer", icon='BLANK1')
silicon_mac_version = "12.2"
amd_mac_version = "12.3"
col.label(text=iface_("Requires Apple Silicon with macOS %s or newer") % silicon_mac_version,
icon='BLANK1', translate=False)
col.label(text=iface_("or AMD with macOS %s or newer") % amd_mac_version, icon='BLANK1',
translate=False)
return
for device in devices:
@@ -1723,12 +1745,21 @@ class CyclesPreferences(bpy.types.AddonPreferences):
if compute_device_type == 'METAL':
import platform
# MetalRT only works on Apple Silicon at present, pending argument encoding fixes on AMD
# Kernel specialization is only viable on Apple Silicon at present due to relative compilation speed
if platform.machine() == 'arm64':
import re
is_navi_2 = False
for device in devices:
if re.search(r"((RX)|(Pro)|(PRO))\s+W?6\d00X", device.name):
is_navi_2 = True
break
# MetalRT only works on Apple Silicon and Navi2.
is_arm64 = platform.machine() == 'arm64'
if is_arm64 or is_navi_2:
col = layout.column()
col.use_property_split = True
col.prop(self, "kernel_optimization_level")
# Kernel specialization is only supported on Apple Silicon
if is_arm64:
col.prop(self, "kernel_optimization_level")
col.prop(self, "use_metalrt")
def draw(self, context):

View File

@@ -20,7 +20,7 @@ class CyclesPresetPanel(PresetPanel, Panel):
@staticmethod
def post_cb(context):
# Modify an arbitrary built-in scene property to force a depsgraph
# update, because add-on properties don't. (see T62325)
# update, because add-on properties don't. (see #62325)
render = context.scene.render
render.filter_size = render.filter_size

View File

@@ -105,11 +105,12 @@ GPUShader *BlenderFallbackDisplayShader::bind(int width, int height)
/* Bind shader now to enable uniform assignment. */
GPU_shader_bind(shader_program_);
GPU_shader_uniform_int(shader_program_, image_texture_location_, 0);
int slot = 0;
GPU_shader_uniform_int_ex(shader_program_, image_texture_location_, 1, 1, &slot);
float size[2];
size[0] = width;
size[1] = height;
GPU_shader_uniform_vector(shader_program_, fullscreen_location_, 2, 1, size);
GPU_shader_uniform_float_ex(shader_program_, fullscreen_location_, 2, 1, size);
return shader_program_;
}

View File

@@ -20,7 +20,7 @@ BlenderImageLoader::BlenderImageLoader(BL::Image b_image,
: b_image(b_image),
frame(frame),
tile_number(tile_number),
/* Don't free cache for preview render to avoid race condition from T93560, to be fixed
/* Don't free cache for preview render to avoid race condition from #93560, to be fixed
* properly later as we are close to release. */
free_cache(!is_preview_render && !b_image.has_data())
{
@@ -72,7 +72,7 @@ bool BlenderImageLoader::load_metadata(const ImageDeviceFeatures &, ImageMetaDat
metadata.colorspace = u_colorspace_raw;
}
else {
/* In some cases (e.g. T94135), the colorspace setting in Blender gets updated as part of the
/* In some cases (e.g. #94135), the colorspace setting in Blender gets updated as part of the
* metadata queries in this function, so update the colorspace setting here. */
PointerRNA colorspace_ptr = b_image.colorspace_settings().ptr;
metadata.colorspace = get_enum_identifier(colorspace_ptr, "name");

View File

@@ -24,7 +24,7 @@ void BlenderSync::sync_light(BL::Object &b_parent,
Light *light = light_map.find(key);
/* Check if the transform was modified, in case a linked collection is moved we do not get a
* specific depsgraph update (T88515). This also mimics the behavior for Objects. */
* specific depsgraph update (#88515). This also mimics the behavior for Objects. */
const bool tfm_updated = (light && light->get_tfm() != tfm);
/* Update if either object or light data changed. */

View File

@@ -94,7 +94,7 @@ void python_thread_state_restore(void **python_thread_state)
*python_thread_state = NULL;
}
static const char *PyC_UnicodeAsByte(PyObject *py_str, PyObject **coerce)
static const char *PyC_UnicodeAsBytes(PyObject *py_str, PyObject **coerce)
{
const char *result = PyUnicode_AsUTF8(py_str);
if (result) {
@@ -131,8 +131,8 @@ static PyObject *init_func(PyObject * /*self*/, PyObject *args)
}
PyObject *path_coerce = nullptr, *user_path_coerce = nullptr;
path_init(PyC_UnicodeAsByte(path, &path_coerce),
PyC_UnicodeAsByte(user_path, &user_path_coerce));
path_init(PyC_UnicodeAsBytes(path, &path_coerce),
PyC_UnicodeAsBytes(user_path, &user_path_coerce));
Py_XDECREF(path_coerce);
Py_XDECREF(user_path_coerce);

View File

@@ -404,7 +404,7 @@ void BlenderSession::render(BL::Depsgraph &b_depsgraph_)
* point we know that we've got everything to render current view layer.
*/
/* At the moment we only free if we are not doing multi-view
* (or if we are rendering the last view). See T58142/D4239 for discussion.
* (or if we are rendering the last view). See #58142/D4239 for discussion.
*/
if (view_index == num_views - 1) {
free_blender_memory_if_possible();

View File

@@ -766,7 +766,7 @@ void BlenderSync::free_data_after_sync(BL::Depsgraph &b_depsgraph)
(BlenderSession::headless || is_interface_locked) &&
/* Baking re-uses the depsgraph multiple times, clearing crashes
* reading un-evaluated mesh data which isn't aligned with the
* geometry we're baking, see T71012. */
* geometry we're baking, see #71012. */
!scene->bake_manager->get_baking() &&
/* Persistent data must main caches for performance and correctness. */
!is_persistent_data;

View File

@@ -42,12 +42,15 @@ endif()
###########################################################################
if(WITH_CYCLES_HIP_BINARIES AND WITH_CYCLES_DEVICE_HIP)
find_package(HIP)
set_and_warn_library_found("HIP compiler" HIP_FOUND WITH_CYCLES_HIP_BINARIES)
set(WITH_CYCLES_HIP_BINARIES OFF)
message(STATUS "HIP temporarily disabled due to compiler bugs")
if(HIP_FOUND)
message(STATUS "Found HIP ${HIP_HIPCC_EXECUTABLE} (${HIP_VERSION})")
endif()
# find_package(HIP)
# set_and_warn_library_found("HIP compiler" HIP_FOUND WITH_CYCLES_HIP_BINARIES)
# if(HIP_FOUND)
# message(STATUS "Found HIP ${HIP_HIPCC_EXECUTABLE} (${HIP_VERSION})")
# endif()
endif()
if(NOT WITH_HIP_DYNLOAD)

View File

@@ -53,8 +53,12 @@ void CUDADevice::set_error(const string &error)
}
CUDADevice::CUDADevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
: Device(info, stats, profiler), texture_info(this, "texture_info", MEM_GLOBAL)
: GPUDevice(info, stats, profiler)
{
/* Verify that base class types can be used with specific backend types */
static_assert(sizeof(texMemObject) == sizeof(CUtexObject));
static_assert(sizeof(arrayMemObject) == sizeof(CUarray));
first_error = true;
cuDevId = info.num;
@@ -65,12 +69,6 @@ CUDADevice::CUDADevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
need_texture_info = false;
device_texture_headroom = 0;
device_working_headroom = 0;
move_texture_to_host = false;
map_host_limit = 0;
map_host_used = 0;
can_map_host = 0;
pitch_alignment = 0;
/* Initialize CUDA. */
@@ -91,8 +89,9 @@ CUDADevice::CUDADevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
/* CU_CTX_MAP_HOST for mapping host memory when out of device memory.
* CU_CTX_LMEM_RESIZE_TO_MAX for reserving local memory ahead of render,
* so we can predict which memory to map to host. */
cuda_assert(
cuDeviceGetAttribute(&can_map_host, CU_DEVICE_ATTRIBUTE_CAN_MAP_HOST_MEMORY, cuDevice));
int value;
cuda_assert(cuDeviceGetAttribute(&value, CU_DEVICE_ATTRIBUTE_CAN_MAP_HOST_MEMORY, cuDevice));
can_map_host = value != 0;
cuda_assert(cuDeviceGetAttribute(
&pitch_alignment, CU_DEVICE_ATTRIBUTE_TEXTURE_PITCH_ALIGNMENT, cuDevice));
@@ -499,311 +498,57 @@ void CUDADevice::reserve_local_memory(const uint kernel_features)
# endif
}
void CUDADevice::init_host_memory()
{
/* Limit amount of host mapped memory, because allocating too much can
* cause system instability. Leave at least half or 4 GB of system
* memory free, whichever is smaller. */
size_t default_limit = 4 * 1024 * 1024 * 1024LL;
size_t system_ram = system_physical_ram();
if (system_ram > 0) {
if (system_ram / 2 > default_limit) {
map_host_limit = system_ram - default_limit;
}
else {
map_host_limit = system_ram / 2;
}
}
else {
VLOG_WARNING << "Mapped host memory disabled, failed to get system RAM";
map_host_limit = 0;
}
/* Amount of device memory to keep is free after texture memory
* and working memory allocations respectively. We set the working
* memory limit headroom lower so that some space is left after all
* texture memory allocations. */
device_working_headroom = 32 * 1024 * 1024LL; // 32MB
device_texture_headroom = 128 * 1024 * 1024LL; // 128MB
VLOG_INFO << "Mapped host memory limit set to " << string_human_readable_number(map_host_limit)
<< " bytes. (" << string_human_readable_size(map_host_limit) << ")";
}
void CUDADevice::load_texture_info()
{
if (need_texture_info) {
/* Unset flag before copying, so this does not loop indefinitely if the copy below calls
* into 'move_textures_to_host' (which calls 'load_texture_info' again). */
need_texture_info = false;
texture_info.copy_to_device();
}
}
void CUDADevice::move_textures_to_host(size_t size, bool for_texture)
{
/* Break out of recursive call, which can happen when moving memory on a multi device. */
static bool any_device_moving_textures_to_host = false;
if (any_device_moving_textures_to_host) {
return;
}
/* Signal to reallocate textures in host memory only. */
move_texture_to_host = true;
while (size > 0) {
/* Find suitable memory allocation to move. */
device_memory *max_mem = NULL;
size_t max_size = 0;
bool max_is_image = false;
thread_scoped_lock lock(cuda_mem_map_mutex);
foreach (CUDAMemMap::value_type &pair, cuda_mem_map) {
device_memory &mem = *pair.first;
CUDAMem *cmem = &pair.second;
/* Can only move textures allocated on this device (and not those from peer devices).
* And need to ignore memory that is already on the host. */
if (!mem.is_resident(this) || cmem->use_mapped_host) {
continue;
}
bool is_texture = (mem.type == MEM_TEXTURE || mem.type == MEM_GLOBAL) &&
(&mem != &texture_info);
bool is_image = is_texture && (mem.data_height > 1);
/* Can't move this type of memory. */
if (!is_texture || cmem->array) {
continue;
}
/* For other textures, only move image textures. */
if (for_texture && !is_image) {
continue;
}
/* Try to move largest allocation, prefer moving images. */
if (is_image > max_is_image || (is_image == max_is_image && mem.device_size > max_size)) {
max_is_image = is_image;
max_size = mem.device_size;
max_mem = &mem;
}
}
lock.unlock();
/* Move to host memory. This part is mutex protected since
* multiple CUDA devices could be moving the memory. The
* first one will do it, and the rest will adopt the pointer. */
if (max_mem) {
VLOG_WORK << "Move memory from device to host: " << max_mem->name;
static thread_mutex move_mutex;
thread_scoped_lock lock(move_mutex);
any_device_moving_textures_to_host = true;
/* Potentially need to call back into multi device, so pointer mapping
* and peer devices are updated. This is also necessary since the device
* pointer may just be a key here, so cannot be accessed and freed directly.
* Unfortunately it does mean that memory is reallocated on all other
* devices as well, which is potentially dangerous when still in use (since
* a thread rendering on another devices would only be caught in this mutex
* if it so happens to do an allocation at the same time as well. */
max_mem->device_copy_to();
size = (max_size >= size) ? 0 : size - max_size;
any_device_moving_textures_to_host = false;
}
else {
break;
}
}
/* Unset flag before texture info is reloaded, since it should stay in device memory. */
move_texture_to_host = false;
/* Update texture info array with new pointers. */
load_texture_info();
}
CUDADevice::CUDAMem *CUDADevice::generic_alloc(device_memory &mem, size_t pitch_padding)
void CUDADevice::get_device_memory_info(size_t &total, size_t &free)
{
CUDAContextScope scope(this);
CUdeviceptr device_pointer = 0;
size_t size = mem.memory_size() + pitch_padding;
CUresult mem_alloc_result = CUDA_ERROR_OUT_OF_MEMORY;
const char *status = "";
/* First try allocating in device memory, respecting headroom. We make
* an exception for texture info. It is small and frequently accessed,
* so treat it as working memory.
*
* If there is not enough room for working memory, we will try to move
* textures to host memory, assuming the performance impact would have
* been worse for working memory. */
bool is_texture = (mem.type == MEM_TEXTURE || mem.type == MEM_GLOBAL) && (&mem != &texture_info);
bool is_image = is_texture && (mem.data_height > 1);
size_t headroom = (is_texture) ? device_texture_headroom : device_working_headroom;
size_t total = 0, free = 0;
cuMemGetInfo(&free, &total);
/* Move textures to host memory if needed. */
if (!move_texture_to_host && !is_image && (size + headroom) >= free && can_map_host) {
move_textures_to_host(size + headroom - free, is_texture);
cuMemGetInfo(&free, &total);
}
/* Allocate in device memory. */
if (!move_texture_to_host && (size + headroom) < free) {
mem_alloc_result = cuMemAlloc(&device_pointer, size);
if (mem_alloc_result == CUDA_SUCCESS) {
status = " in device memory";
}
}
/* Fall back to mapped host memory if needed and possible. */
void *shared_pointer = 0;
if (mem_alloc_result != CUDA_SUCCESS && can_map_host && mem.type != MEM_DEVICE_ONLY) {
if (mem.shared_pointer) {
/* Another device already allocated host memory. */
mem_alloc_result = CUDA_SUCCESS;
shared_pointer = mem.shared_pointer;
}
else if (map_host_used + size < map_host_limit) {
/* Allocate host memory ourselves. */
mem_alloc_result = cuMemHostAlloc(
&shared_pointer, size, CU_MEMHOSTALLOC_DEVICEMAP | CU_MEMHOSTALLOC_WRITECOMBINED);
assert((mem_alloc_result == CUDA_SUCCESS && shared_pointer != 0) ||
(mem_alloc_result != CUDA_SUCCESS && shared_pointer == 0));
}
if (mem_alloc_result == CUDA_SUCCESS) {
cuda_assert(cuMemHostGetDevicePointer_v2(&device_pointer, shared_pointer, 0));
map_host_used += size;
status = " in host memory";
}
}
if (mem_alloc_result != CUDA_SUCCESS) {
if (mem.type == MEM_DEVICE_ONLY) {
status = " failed, out of device memory";
set_error("System is out of GPU memory");
}
else {
status = " failed, out of device and host memory";
set_error("System is out of GPU and shared host memory");
}
}
if (mem.name) {
VLOG_WORK << "Buffer allocate: " << mem.name << ", "
<< string_human_readable_number(mem.memory_size()) << " bytes. ("
<< string_human_readable_size(mem.memory_size()) << ")" << status;
}
mem.device_pointer = (device_ptr)device_pointer;
mem.device_size = size;
stats.mem_alloc(size);
if (!mem.device_pointer) {
return NULL;
}
/* Insert into map of allocations. */
thread_scoped_lock lock(cuda_mem_map_mutex);
CUDAMem *cmem = &cuda_mem_map[&mem];
if (shared_pointer != 0) {
/* Replace host pointer with our host allocation. Only works if
* CUDA memory layout is the same and has no pitch padding. Also
* does not work if we move textures to host during a render,
* since other devices might be using the memory. */
if (!move_texture_to_host && pitch_padding == 0 && mem.host_pointer &&
mem.host_pointer != shared_pointer) {
memcpy(shared_pointer, mem.host_pointer, size);
/* A Call to device_memory::host_free() should be preceded by
* a call to device_memory::device_free() for host memory
* allocated by a device to be handled properly. Two exceptions
* are here and a call in OptiXDevice::generic_alloc(), where
* the current host memory can be assumed to be allocated by
* device_memory::host_alloc(), not by a device */
mem.host_free();
mem.host_pointer = shared_pointer;
}
mem.shared_pointer = shared_pointer;
mem.shared_counter++;
cmem->use_mapped_host = true;
}
else {
cmem->use_mapped_host = false;
}
return cmem;
}
void CUDADevice::generic_copy_to(device_memory &mem)
bool CUDADevice::alloc_device(void *&device_pointer, size_t size)
{
if (!mem.host_pointer || !mem.device_pointer) {
return;
}
CUDAContextScope scope(this);
/* If use_mapped_host of mem is false, the current device only uses device memory allocated by
* cuMemAlloc regardless of mem.host_pointer and mem.shared_pointer, and should copy data from
* mem.host_pointer. */
thread_scoped_lock lock(cuda_mem_map_mutex);
if (!cuda_mem_map[&mem].use_mapped_host || mem.host_pointer != mem.shared_pointer) {
const CUDAContextScope scope(this);
cuda_assert(
cuMemcpyHtoD((CUdeviceptr)mem.device_pointer, mem.host_pointer, mem.memory_size()));
}
CUresult mem_alloc_result = cuMemAlloc((CUdeviceptr *)&device_pointer, size);
return mem_alloc_result == CUDA_SUCCESS;
}
void CUDADevice::generic_free(device_memory &mem)
void CUDADevice::free_device(void *device_pointer)
{
if (mem.device_pointer) {
CUDAContextScope scope(this);
thread_scoped_lock lock(cuda_mem_map_mutex);
DCHECK(cuda_mem_map.find(&mem) != cuda_mem_map.end());
const CUDAMem &cmem = cuda_mem_map[&mem];
CUDAContextScope scope(this);
/* If cmem.use_mapped_host is true, reference counting is used
* to safely free a mapped host memory. */
cuda_assert(cuMemFree((CUdeviceptr)device_pointer));
}
if (cmem.use_mapped_host) {
assert(mem.shared_pointer);
if (mem.shared_pointer) {
assert(mem.shared_counter > 0);
if (--mem.shared_counter == 0) {
if (mem.host_pointer == mem.shared_pointer) {
mem.host_pointer = 0;
}
cuMemFreeHost(mem.shared_pointer);
mem.shared_pointer = 0;
}
}
map_host_used -= mem.device_size;
}
else {
/* Free device memory. */
cuda_assert(cuMemFree(mem.device_pointer));
}
bool CUDADevice::alloc_host(void *&shared_pointer, size_t size)
{
CUDAContextScope scope(this);
stats.mem_free(mem.device_size);
mem.device_pointer = 0;
mem.device_size = 0;
CUresult mem_alloc_result = cuMemHostAlloc(
&shared_pointer, size, CU_MEMHOSTALLOC_DEVICEMAP | CU_MEMHOSTALLOC_WRITECOMBINED);
return mem_alloc_result == CUDA_SUCCESS;
}
cuda_mem_map.erase(cuda_mem_map.find(&mem));
}
void CUDADevice::free_host(void *shared_pointer)
{
CUDAContextScope scope(this);
cuMemFreeHost(shared_pointer);
}
bool CUDADevice::transform_host_pointer(void *&device_pointer, void *&shared_pointer)
{
CUDAContextScope scope(this);
cuda_assert(cuMemHostGetDevicePointer_v2((CUdeviceptr *)&device_pointer, shared_pointer, 0));
return true;
}
void CUDADevice::copy_host_to_device(void *device_pointer, void *host_pointer, size_t size)
{
const CUDAContextScope scope(this);
cuda_assert(cuMemcpyHtoD((CUdeviceptr)device_pointer, host_pointer, size));
}
void CUDADevice::mem_alloc(device_memory &mem)
@@ -868,8 +613,8 @@ void CUDADevice::mem_zero(device_memory &mem)
/* If use_mapped_host of mem is false, mem.device_pointer currently refers to device memory
* regardless of mem.host_pointer and mem.shared_pointer. */
thread_scoped_lock lock(cuda_mem_map_mutex);
if (!cuda_mem_map[&mem].use_mapped_host || mem.host_pointer != mem.shared_pointer) {
thread_scoped_lock lock(device_mem_map_mutex);
if (!device_mem_map[&mem].use_mapped_host || mem.host_pointer != mem.shared_pointer) {
const CUDAContextScope scope(this);
cuda_assert(cuMemsetD8((CUdeviceptr)mem.device_pointer, 0, mem.memory_size()));
}
@@ -994,19 +739,19 @@ void CUDADevice::tex_alloc(device_texture &mem)
return;
}
CUDAMem *cmem = NULL;
Mem *cmem = NULL;
CUarray array_3d = NULL;
size_t src_pitch = mem.data_width * dsize * mem.data_elements;
size_t dst_pitch = src_pitch;
if (!mem.is_resident(this)) {
thread_scoped_lock lock(cuda_mem_map_mutex);
cmem = &cuda_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
cmem = &device_mem_map[&mem];
cmem->texobject = 0;
if (mem.data_depth > 1) {
array_3d = (CUarray)mem.device_pointer;
cmem->array = array_3d;
cmem->array = reinterpret_cast<arrayMemObject>(array_3d);
}
else if (mem.data_height > 0) {
dst_pitch = align_up(src_pitch, pitch_alignment);
@@ -1050,10 +795,10 @@ void CUDADevice::tex_alloc(device_texture &mem)
mem.device_size = size;
stats.mem_alloc(size);
thread_scoped_lock lock(cuda_mem_map_mutex);
cmem = &cuda_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
cmem = &device_mem_map[&mem];
cmem->texobject = 0;
cmem->array = array_3d;
cmem->array = reinterpret_cast<arrayMemObject>(array_3d);
}
else if (mem.data_height > 0) {
/* 2D texture, using pitch aligned linear memory. */
@@ -1137,8 +882,8 @@ void CUDADevice::tex_alloc(device_texture &mem)
texDesc.filterMode = filter_mode;
texDesc.flags = CU_TRSF_NORMALIZED_COORDINATES;
thread_scoped_lock lock(cuda_mem_map_mutex);
cmem = &cuda_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
cmem = &device_mem_map[&mem];
cuda_assert(cuTexObjectCreate(&cmem->texobject, &resDesc, &texDesc, NULL));
@@ -1153,9 +898,9 @@ void CUDADevice::tex_free(device_texture &mem)
{
if (mem.device_pointer) {
CUDAContextScope scope(this);
thread_scoped_lock lock(cuda_mem_map_mutex);
DCHECK(cuda_mem_map.find(&mem) != cuda_mem_map.end());
const CUDAMem &cmem = cuda_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
DCHECK(device_mem_map.find(&mem) != device_mem_map.end());
const Mem &cmem = device_mem_map[&mem];
if (cmem.texobject) {
/* Free bindless texture. */
@@ -1164,16 +909,16 @@ void CUDADevice::tex_free(device_texture &mem)
if (!mem.is_resident(this)) {
/* Do not free memory here, since it was allocated on a different device. */
cuda_mem_map.erase(cuda_mem_map.find(&mem));
device_mem_map.erase(device_mem_map.find(&mem));
}
else if (cmem.array) {
/* Free array. */
cuArrayDestroy(cmem.array);
cuArrayDestroy(reinterpret_cast<CUarray>(cmem.array));
stats.mem_free(mem.device_size);
mem.device_pointer = 0;
mem.device_size = 0;
cuda_mem_map.erase(cuda_mem_map.find(&mem));
device_mem_map.erase(device_mem_map.find(&mem));
}
else {
lock.unlock();

View File

@@ -21,7 +21,7 @@ CCL_NAMESPACE_BEGIN
class DeviceQueue;
class CUDADevice : public Device {
class CUDADevice : public GPUDevice {
friend class CUDAContextScope;
@@ -29,36 +29,11 @@ class CUDADevice : public Device {
CUdevice cuDevice;
CUcontext cuContext;
CUmodule cuModule;
size_t device_texture_headroom;
size_t device_working_headroom;
bool move_texture_to_host;
size_t map_host_used;
size_t map_host_limit;
int can_map_host;
int pitch_alignment;
int cuDevId;
int cuDevArchitecture;
bool first_error;
struct CUDAMem {
CUDAMem() : texobject(0), array(0), use_mapped_host(false)
{
}
CUtexObject texobject;
CUarray array;
/* If true, a mapped host memory in shared_pointer is being used. */
bool use_mapped_host;
};
typedef map<device_memory *, CUDAMem> CUDAMemMap;
CUDAMemMap cuda_mem_map;
thread_mutex cuda_mem_map_mutex;
/* Bindless Textures */
device_vector<TextureInfo> texture_info;
bool need_texture_info;
CUDADeviceKernels kernels;
static bool have_precompiled_kernels();
@@ -88,17 +63,13 @@ class CUDADevice : public Device {
void reserve_local_memory(const uint kernel_features);
void init_host_memory();
void load_texture_info();
void move_textures_to_host(size_t size, bool for_texture);
CUDAMem *generic_alloc(device_memory &mem, size_t pitch_padding = 0);
void generic_copy_to(device_memory &mem);
void generic_free(device_memory &mem);
virtual void get_device_memory_info(size_t &total, size_t &free) override;
virtual bool alloc_device(void *&device_pointer, size_t size) override;
virtual void free_device(void *device_pointer) override;
virtual bool alloc_host(void *&shared_pointer, size_t size) override;
virtual void free_host(void *shared_pointer) override;
virtual bool transform_host_pointer(void *&device_pointer, void *&shared_pointer) override;
virtual void copy_host_to_device(void *device_pointer, void *host_pointer, size_t size) override;
void mem_alloc(device_memory &mem) override;

View File

@@ -452,6 +452,320 @@ void *Device::get_cpu_osl_memory()
return nullptr;
}
GPUDevice::~GPUDevice() noexcept(false)
{
}
bool GPUDevice::load_texture_info()
{
if (need_texture_info) {
/* Unset flag before copying, so this does not loop indefinitely if the copy below calls
* into 'move_textures_to_host' (which calls 'load_texture_info' again). */
need_texture_info = false;
texture_info.copy_to_device();
return true;
}
else {
return false;
}
}
void GPUDevice::init_host_memory(size_t preferred_texture_headroom,
size_t preferred_working_headroom)
{
/* Limit amount of host mapped memory, because allocating too much can
* cause system instability. Leave at least half or 4 GB of system
* memory free, whichever is smaller. */
size_t default_limit = 4 * 1024 * 1024 * 1024LL;
size_t system_ram = system_physical_ram();
if (system_ram > 0) {
if (system_ram / 2 > default_limit) {
map_host_limit = system_ram - default_limit;
}
else {
map_host_limit = system_ram / 2;
}
}
else {
VLOG_WARNING << "Mapped host memory disabled, failed to get system RAM";
map_host_limit = 0;
}
/* Amount of device memory to keep free after texture memory
* and working memory allocations respectively. We set the working
* memory limit headroom lower than the working one so there
* is space left for it. */
device_working_headroom = preferred_working_headroom > 0 ? preferred_working_headroom :
32 * 1024 * 1024LL; // 32MB
device_texture_headroom = preferred_texture_headroom > 0 ? preferred_texture_headroom :
128 * 1024 * 1024LL; // 128MB
VLOG_INFO << "Mapped host memory limit set to " << string_human_readable_number(map_host_limit)
<< " bytes. (" << string_human_readable_size(map_host_limit) << ")";
}
void GPUDevice::move_textures_to_host(size_t size, bool for_texture)
{
/* Break out of recursive call, which can happen when moving memory on a multi device. */
static bool any_device_moving_textures_to_host = false;
if (any_device_moving_textures_to_host) {
return;
}
/* Signal to reallocate textures in host memory only. */
move_texture_to_host = true;
while (size > 0) {
/* Find suitable memory allocation to move. */
device_memory *max_mem = NULL;
size_t max_size = 0;
bool max_is_image = false;
thread_scoped_lock lock(device_mem_map_mutex);
foreach (MemMap::value_type &pair, device_mem_map) {
device_memory &mem = *pair.first;
Mem *cmem = &pair.second;
/* Can only move textures allocated on this device (and not those from peer devices).
* And need to ignore memory that is already on the host. */
if (!mem.is_resident(this) || cmem->use_mapped_host) {
continue;
}
bool is_texture = (mem.type == MEM_TEXTURE || mem.type == MEM_GLOBAL) &&
(&mem != &texture_info);
bool is_image = is_texture && (mem.data_height > 1);
/* Can't move this type of memory. */
if (!is_texture || cmem->array) {
continue;
}
/* For other textures, only move image textures. */
if (for_texture && !is_image) {
continue;
}
/* Try to move largest allocation, prefer moving images. */
if (is_image > max_is_image || (is_image == max_is_image && mem.device_size > max_size)) {
max_is_image = is_image;
max_size = mem.device_size;
max_mem = &mem;
}
}
lock.unlock();
/* Move to host memory. This part is mutex protected since
* multiple backend devices could be moving the memory. The
* first one will do it, and the rest will adopt the pointer. */
if (max_mem) {
VLOG_WORK << "Move memory from device to host: " << max_mem->name;
static thread_mutex move_mutex;
thread_scoped_lock lock(move_mutex);
any_device_moving_textures_to_host = true;
/* Potentially need to call back into multi device, so pointer mapping
* and peer devices are updated. This is also necessary since the device
* pointer may just be a key here, so cannot be accessed and freed directly.
* Unfortunately it does mean that memory is reallocated on all other
* devices as well, which is potentially dangerous when still in use (since
* a thread rendering on another devices would only be caught in this mutex
* if it so happens to do an allocation at the same time as well. */
max_mem->device_copy_to();
size = (max_size >= size) ? 0 : size - max_size;
any_device_moving_textures_to_host = false;
}
else {
break;
}
}
/* Unset flag before texture info is reloaded, since it should stay in device memory. */
move_texture_to_host = false;
/* Update texture info array with new pointers. */
load_texture_info();
}
GPUDevice::Mem *GPUDevice::generic_alloc(device_memory &mem, size_t pitch_padding)
{
void *device_pointer = 0;
size_t size = mem.memory_size() + pitch_padding;
bool mem_alloc_result = false;
const char *status = "";
/* First try allocating in device memory, respecting headroom. We make
* an exception for texture info. It is small and frequently accessed,
* so treat it as working memory.
*
* If there is not enough room for working memory, we will try to move
* textures to host memory, assuming the performance impact would have
* been worse for working memory. */
bool is_texture = (mem.type == MEM_TEXTURE || mem.type == MEM_GLOBAL) && (&mem != &texture_info);
bool is_image = is_texture && (mem.data_height > 1);
size_t headroom = (is_texture) ? device_texture_headroom : device_working_headroom;
size_t total = 0, free = 0;
get_device_memory_info(total, free);
/* Move textures to host memory if needed. */
if (!move_texture_to_host && !is_image && (size + headroom) >= free && can_map_host) {
move_textures_to_host(size + headroom - free, is_texture);
get_device_memory_info(total, free);
}
/* Allocate in device memory. */
if (!move_texture_to_host && (size + headroom) < free) {
mem_alloc_result = alloc_device(device_pointer, size);
if (mem_alloc_result) {
device_mem_in_use += size;
status = " in device memory";
}
}
/* Fall back to mapped host memory if needed and possible. */
void *shared_pointer = 0;
if (!mem_alloc_result && can_map_host && mem.type != MEM_DEVICE_ONLY) {
if (mem.shared_pointer) {
/* Another device already allocated host memory. */
mem_alloc_result = true;
shared_pointer = mem.shared_pointer;
}
else if (map_host_used + size < map_host_limit) {
/* Allocate host memory ourselves. */
mem_alloc_result = alloc_host(shared_pointer, size);
assert((mem_alloc_result && shared_pointer != 0) ||
(!mem_alloc_result && shared_pointer == 0));
}
if (mem_alloc_result) {
assert(transform_host_pointer(device_pointer, shared_pointer));
map_host_used += size;
status = " in host memory";
}
}
if (!mem_alloc_result) {
if (mem.type == MEM_DEVICE_ONLY) {
status = " failed, out of device memory";
set_error("System is out of GPU memory");
}
else {
status = " failed, out of device and host memory";
set_error("System is out of GPU and shared host memory");
}
}
if (mem.name) {
VLOG_WORK << "Buffer allocate: " << mem.name << ", "
<< string_human_readable_number(mem.memory_size()) << " bytes. ("
<< string_human_readable_size(mem.memory_size()) << ")" << status;
}
mem.device_pointer = (device_ptr)device_pointer;
mem.device_size = size;
stats.mem_alloc(size);
if (!mem.device_pointer) {
return NULL;
}
/* Insert into map of allocations. */
thread_scoped_lock lock(device_mem_map_mutex);
Mem *cmem = &device_mem_map[&mem];
if (shared_pointer != 0) {
/* Replace host pointer with our host allocation. Only works if
* memory layout is the same and has no pitch padding. Also
* does not work if we move textures to host during a render,
* since other devices might be using the memory. */
if (!move_texture_to_host && pitch_padding == 0 && mem.host_pointer &&
mem.host_pointer != shared_pointer) {
memcpy(shared_pointer, mem.host_pointer, size);
/* A Call to device_memory::host_free() should be preceded by
* a call to device_memory::device_free() for host memory
* allocated by a device to be handled properly. Two exceptions
* are here and a call in OptiXDevice::generic_alloc(), where
* the current host memory can be assumed to be allocated by
* device_memory::host_alloc(), not by a device */
mem.host_free();
mem.host_pointer = shared_pointer;
}
mem.shared_pointer = shared_pointer;
mem.shared_counter++;
cmem->use_mapped_host = true;
}
else {
cmem->use_mapped_host = false;
}
return cmem;
}
void GPUDevice::generic_free(device_memory &mem)
{
if (mem.device_pointer) {
thread_scoped_lock lock(device_mem_map_mutex);
DCHECK(device_mem_map.find(&mem) != device_mem_map.end());
const Mem &cmem = device_mem_map[&mem];
/* If cmem.use_mapped_host is true, reference counting is used
* to safely free a mapped host memory. */
if (cmem.use_mapped_host) {
assert(mem.shared_pointer);
if (mem.shared_pointer) {
assert(mem.shared_counter > 0);
if (--mem.shared_counter == 0) {
if (mem.host_pointer == mem.shared_pointer) {
mem.host_pointer = 0;
}
free_host(mem.shared_pointer);
mem.shared_pointer = 0;
}
}
map_host_used -= mem.device_size;
}
else {
/* Free device memory. */
free_device((void *)mem.device_pointer);
device_mem_in_use -= mem.device_size;
}
stats.mem_free(mem.device_size);
mem.device_pointer = 0;
mem.device_size = 0;
device_mem_map.erase(device_mem_map.find(&mem));
}
}
void GPUDevice::generic_copy_to(device_memory &mem)
{
if (!mem.host_pointer || !mem.device_pointer) {
return;
}
/* If use_mapped_host of mem is false, the current device only uses device memory allocated by
* backend device allocation regardless of mem.host_pointer and mem.shared_pointer, and should
* copy data from mem.host_pointer. */
thread_scoped_lock lock(device_mem_map_mutex);
if (!device_mem_map[&mem].use_mapped_host || mem.host_pointer != mem.shared_pointer) {
copy_host_to_device((void *)mem.device_pointer, mem.host_pointer, mem.memory_size());
}
}
/* DeviceInfo */
CCL_NAMESPACE_END

View File

@@ -309,6 +309,93 @@ class Device {
static uint devices_initialized_mask;
};
/* Device, which is GPU, with some common functionality for GPU backends */
class GPUDevice : public Device {
protected:
GPUDevice(const DeviceInfo &info_, Stats &stats_, Profiler &profiler_)
: Device(info_, stats_, profiler_),
texture_info(this, "texture_info", MEM_GLOBAL),
need_texture_info(false),
can_map_host(false),
map_host_used(0),
map_host_limit(0),
device_texture_headroom(0),
device_working_headroom(0),
device_mem_map(),
device_mem_map_mutex(),
move_texture_to_host(false),
device_mem_in_use(0)
{
}
public:
virtual ~GPUDevice() noexcept(false);
/* For GPUs that can use bindless textures in some way or another. */
device_vector<TextureInfo> texture_info;
bool need_texture_info;
/* Returns true if the texture info was copied to the device (meaning, some more
* re-initialization might be needed). */
virtual bool load_texture_info();
protected:
/* Memory allocation, only accessed through device_memory. */
friend class device_memory;
bool can_map_host;
size_t map_host_used;
size_t map_host_limit;
size_t device_texture_headroom;
size_t device_working_headroom;
typedef unsigned long long texMemObject;
typedef unsigned long long arrayMemObject;
struct Mem {
Mem() : texobject(0), array(0), use_mapped_host(false)
{
}
texMemObject texobject;
arrayMemObject array;
/* If true, a mapped host memory in shared_pointer is being used. */
bool use_mapped_host;
};
typedef map<device_memory *, Mem> MemMap;
MemMap device_mem_map;
thread_mutex device_mem_map_mutex;
bool move_texture_to_host;
/* Simple counter which will try to track amount of used device memory */
size_t device_mem_in_use;
virtual void init_host_memory(size_t preferred_texture_headroom = 0,
size_t preferred_working_headroom = 0);
virtual void move_textures_to_host(size_t size, bool for_texture);
/* Allocation, deallocation and copy functions, with corresponding
* support of device/host allocations. */
virtual GPUDevice::Mem *generic_alloc(device_memory &mem, size_t pitch_padding = 0);
virtual void generic_free(device_memory &mem);
virtual void generic_copy_to(device_memory &mem);
/* total - amount of device memory, free - amount of available device memory */
virtual void get_device_memory_info(size_t &total, size_t &free) = 0;
virtual bool alloc_device(void *&device_pointer, size_t size) = 0;
virtual void free_device(void *device_pointer) = 0;
virtual bool alloc_host(void *&shared_pointer, size_t size) = 0;
virtual void free_host(void *shared_pointer) = 0;
/* This function should return device pointer corresponding to shared pointer, which
* is host buffer, allocated in `alloc_host`. The function should `true`, if such
* address transformation is possible and `false` otherwise. */
virtual bool transform_host_pointer(void *&device_pointer, void *&shared_pointer) = 0;
virtual void copy_host_to_device(void *device_pointer, void *host_pointer, size_t size) = 0;
};
CCL_NAMESPACE_END
#endif /* __DEVICE_H__ */

View File

@@ -53,8 +53,12 @@ void HIPDevice::set_error(const string &error)
}
HIPDevice::HIPDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
: Device(info, stats, profiler), texture_info(this, "texture_info", MEM_GLOBAL)
: GPUDevice(info, stats, profiler)
{
/* Verify that base class types can be used with specific backend types */
static_assert(sizeof(texMemObject) == sizeof(hipTextureObject_t));
static_assert(sizeof(arrayMemObject) == sizeof(hArray));
first_error = true;
hipDevId = info.num;
@@ -65,12 +69,6 @@ HIPDevice::HIPDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
need_texture_info = false;
device_texture_headroom = 0;
device_working_headroom = 0;
move_texture_to_host = false;
map_host_limit = 0;
map_host_used = 0;
can_map_host = 0;
pitch_alignment = 0;
/* Initialize HIP. */
@@ -91,7 +89,9 @@ HIPDevice::HIPDevice(const DeviceInfo &info, Stats &stats, Profiler &profiler)
/* hipDeviceMapHost for mapping host memory when out of device memory.
* hipDeviceLmemResizeToMax for reserving local memory ahead of render,
* so we can predict which memory to map to host. */
hip_assert(hipDeviceGetAttribute(&can_map_host, hipDeviceAttributeCanMapHostMemory, hipDevice));
int value;
hip_assert(hipDeviceGetAttribute(&value, hipDeviceAttributeCanMapHostMemory, hipDevice));
can_map_host = value != 0;
hip_assert(
hipDeviceGetAttribute(&pitch_alignment, hipDeviceAttributeTexturePitchAlignment, hipDevice));
@@ -460,305 +460,58 @@ void HIPDevice::reserve_local_memory(const uint kernel_features)
# endif
}
void HIPDevice::init_host_memory()
{
/* Limit amount of host mapped memory, because allocating too much can
* cause system instability. Leave at least half or 4 GB of system
* memory free, whichever is smaller. */
size_t default_limit = 4 * 1024 * 1024 * 1024LL;
size_t system_ram = system_physical_ram();
if (system_ram > 0) {
if (system_ram / 2 > default_limit) {
map_host_limit = system_ram - default_limit;
}
else {
map_host_limit = system_ram / 2;
}
}
else {
VLOG_WARNING << "Mapped host memory disabled, failed to get system RAM";
map_host_limit = 0;
}
/* Amount of device memory to keep is free after texture memory
* and working memory allocations respectively. We set the working
* memory limit headroom lower so that some space is left after all
* texture memory allocations. */
device_working_headroom = 32 * 1024 * 1024LL; // 32MB
device_texture_headroom = 128 * 1024 * 1024LL; // 128MB
VLOG_INFO << "Mapped host memory limit set to " << string_human_readable_number(map_host_limit)
<< " bytes. (" << string_human_readable_size(map_host_limit) << ")";
}
void HIPDevice::load_texture_info()
{
if (need_texture_info) {
/* Unset flag before copying, so this does not loop indefinitely if the copy below calls
* into 'move_textures_to_host' (which calls 'load_texture_info' again). */
need_texture_info = false;
texture_info.copy_to_device();
}
}
void HIPDevice::move_textures_to_host(size_t size, bool for_texture)
{
/* Break out of recursive call, which can happen when moving memory on a multi device. */
static bool any_device_moving_textures_to_host = false;
if (any_device_moving_textures_to_host) {
return;
}
/* Signal to reallocate textures in host memory only. */
move_texture_to_host = true;
while (size > 0) {
/* Find suitable memory allocation to move. */
device_memory *max_mem = NULL;
size_t max_size = 0;
bool max_is_image = false;
thread_scoped_lock lock(hip_mem_map_mutex);
foreach (HIPMemMap::value_type &pair, hip_mem_map) {
device_memory &mem = *pair.first;
HIPMem *cmem = &pair.second;
/* Can only move textures allocated on this device (and not those from peer devices).
* And need to ignore memory that is already on the host. */
if (!mem.is_resident(this) || cmem->use_mapped_host) {
continue;
}
bool is_texture = (mem.type == MEM_TEXTURE || mem.type == MEM_GLOBAL) &&
(&mem != &texture_info);
bool is_image = is_texture && (mem.data_height > 1);
/* Can't move this type of memory. */
if (!is_texture || cmem->array) {
continue;
}
/* For other textures, only move image textures. */
if (for_texture && !is_image) {
continue;
}
/* Try to move largest allocation, prefer moving images. */
if (is_image > max_is_image || (is_image == max_is_image && mem.device_size > max_size)) {
max_is_image = is_image;
max_size = mem.device_size;
max_mem = &mem;
}
}
lock.unlock();
/* Move to host memory. This part is mutex protected since
* multiple HIP devices could be moving the memory. The
* first one will do it, and the rest will adopt the pointer. */
if (max_mem) {
VLOG_WORK << "Move memory from device to host: " << max_mem->name;
static thread_mutex move_mutex;
thread_scoped_lock lock(move_mutex);
any_device_moving_textures_to_host = true;
/* Potentially need to call back into multi device, so pointer mapping
* and peer devices are updated. This is also necessary since the device
* pointer may just be a key here, so cannot be accessed and freed directly.
* Unfortunately it does mean that memory is reallocated on all other
* devices as well, which is potentially dangerous when still in use (since
* a thread rendering on another devices would only be caught in this mutex
* if it so happens to do an allocation at the same time as well. */
max_mem->device_copy_to();
size = (max_size >= size) ? 0 : size - max_size;
any_device_moving_textures_to_host = false;
}
else {
break;
}
}
/* Unset flag before texture info is reloaded, since it should stay in device memory. */
move_texture_to_host = false;
/* Update texture info array with new pointers. */
load_texture_info();
}
HIPDevice::HIPMem *HIPDevice::generic_alloc(device_memory &mem, size_t pitch_padding)
void HIPDevice::get_device_memory_info(size_t &total, size_t &free)
{
HIPContextScope scope(this);
hipDeviceptr_t device_pointer = 0;
size_t size = mem.memory_size() + pitch_padding;
hipError_t mem_alloc_result = hipErrorOutOfMemory;
const char *status = "";
/* First try allocating in device memory, respecting headroom. We make
* an exception for texture info. It is small and frequently accessed,
* so treat it as working memory.
*
* If there is not enough room for working memory, we will try to move
* textures to host memory, assuming the performance impact would have
* been worse for working memory. */
bool is_texture = (mem.type == MEM_TEXTURE || mem.type == MEM_GLOBAL) && (&mem != &texture_info);
bool is_image = is_texture && (mem.data_height > 1);
size_t headroom = (is_texture) ? device_texture_headroom : device_working_headroom;
size_t total = 0, free = 0;
hipMemGetInfo(&free, &total);
/* Move textures to host memory if needed. */
if (!move_texture_to_host && !is_image && (size + headroom) >= free && can_map_host) {
move_textures_to_host(size + headroom - free, is_texture);
hipMemGetInfo(&free, &total);
}
/* Allocate in device memory. */
if (!move_texture_to_host && (size + headroom) < free) {
mem_alloc_result = hipMalloc(&device_pointer, size);
if (mem_alloc_result == hipSuccess) {
status = " in device memory";
}
}
/* Fall back to mapped host memory if needed and possible. */
void *shared_pointer = 0;
if (mem_alloc_result != hipSuccess && can_map_host) {
if (mem.shared_pointer) {
/* Another device already allocated host memory. */
mem_alloc_result = hipSuccess;
shared_pointer = mem.shared_pointer;
}
else if (map_host_used + size < map_host_limit) {
/* Allocate host memory ourselves. */
mem_alloc_result = hipHostMalloc(
&shared_pointer, size, hipHostMallocMapped | hipHostMallocWriteCombined);
assert((mem_alloc_result == hipSuccess && shared_pointer != 0) ||
(mem_alloc_result != hipSuccess && shared_pointer == 0));
}
if (mem_alloc_result == hipSuccess) {
hip_assert(hipHostGetDevicePointer(&device_pointer, shared_pointer, 0));
map_host_used += size;
status = " in host memory";
}
}
if (mem_alloc_result != hipSuccess) {
status = " failed, out of device and host memory";
set_error("System is out of GPU and shared host memory");
}
if (mem.name) {
VLOG_WORK << "Buffer allocate: " << mem.name << ", "
<< string_human_readable_number(mem.memory_size()) << " bytes. ("
<< string_human_readable_size(mem.memory_size()) << ")" << status;
}
mem.device_pointer = (device_ptr)device_pointer;
mem.device_size = size;
stats.mem_alloc(size);
if (!mem.device_pointer) {
return NULL;
}
/* Insert into map of allocations. */
thread_scoped_lock lock(hip_mem_map_mutex);
HIPMem *cmem = &hip_mem_map[&mem];
if (shared_pointer != 0) {
/* Replace host pointer with our host allocation. Only works if
* HIP memory layout is the same and has no pitch padding. Also
* does not work if we move textures to host during a render,
* since other devices might be using the memory. */
if (!move_texture_to_host && pitch_padding == 0 && mem.host_pointer &&
mem.host_pointer != shared_pointer) {
memcpy(shared_pointer, mem.host_pointer, size);
/* A Call to device_memory::host_free() should be preceded by
* a call to device_memory::device_free() for host memory
* allocated by a device to be handled properly. Two exceptions
* are here and a call in OptiXDevice::generic_alloc(), where
* the current host memory can be assumed to be allocated by
* device_memory::host_alloc(), not by a device */
mem.host_free();
mem.host_pointer = shared_pointer;
}
mem.shared_pointer = shared_pointer;
mem.shared_counter++;
cmem->use_mapped_host = true;
}
else {
cmem->use_mapped_host = false;
}
return cmem;
}
void HIPDevice::generic_copy_to(device_memory &mem)
bool HIPDevice::alloc_device(void *&device_pointer, size_t size)
{
if (!mem.host_pointer || !mem.device_pointer) {
return;
}
HIPContextScope scope(this);
/* If use_mapped_host of mem is false, the current device only uses device memory allocated by
* hipMalloc regardless of mem.host_pointer and mem.shared_pointer, and should copy data from
* mem.host_pointer. */
thread_scoped_lock lock(hip_mem_map_mutex);
if (!hip_mem_map[&mem].use_mapped_host || mem.host_pointer != mem.shared_pointer) {
const HIPContextScope scope(this);
hip_assert(
hipMemcpyHtoD((hipDeviceptr_t)mem.device_pointer, mem.host_pointer, mem.memory_size()));
}
hipError_t mem_alloc_result = hipMalloc((hipDeviceptr_t *)&device_pointer, size);
return mem_alloc_result == hipSuccess;
}
void HIPDevice::generic_free(device_memory &mem)
void HIPDevice::free_device(void *device_pointer)
{
if (mem.device_pointer) {
HIPContextScope scope(this);
thread_scoped_lock lock(hip_mem_map_mutex);
DCHECK(hip_mem_map.find(&mem) != hip_mem_map.end());
const HIPMem &cmem = hip_mem_map[&mem];
HIPContextScope scope(this);
/* If cmem.use_mapped_host is true, reference counting is used
* to safely free a mapped host memory. */
hip_assert(hipFree((hipDeviceptr_t)device_pointer));
}
if (cmem.use_mapped_host) {
assert(mem.shared_pointer);
if (mem.shared_pointer) {
assert(mem.shared_counter > 0);
if (--mem.shared_counter == 0) {
if (mem.host_pointer == mem.shared_pointer) {
mem.host_pointer = 0;
}
hipHostFree(mem.shared_pointer);
mem.shared_pointer = 0;
}
}
map_host_used -= mem.device_size;
}
else {
/* Free device memory. */
hip_assert(hipFree(mem.device_pointer));
}
bool HIPDevice::alloc_host(void *&shared_pointer, size_t size)
{
HIPContextScope scope(this);
stats.mem_free(mem.device_size);
mem.device_pointer = 0;
mem.device_size = 0;
hipError_t mem_alloc_result = hipHostMalloc(
&shared_pointer, size, hipHostMallocMapped | hipHostMallocWriteCombined);
hip_mem_map.erase(hip_mem_map.find(&mem));
}
return mem_alloc_result == hipSuccess;
}
void HIPDevice::free_host(void *shared_pointer)
{
HIPContextScope scope(this);
hipHostFree(shared_pointer);
}
bool HIPDevice::transform_host_pointer(void *&device_pointer, void *&shared_pointer)
{
HIPContextScope scope(this);
hip_assert(hipHostGetDevicePointer((hipDeviceptr_t *)&device_pointer, shared_pointer, 0));
return true;
}
void HIPDevice::copy_host_to_device(void *device_pointer, void *host_pointer, size_t size)
{
const HIPContextScope scope(this);
hip_assert(hipMemcpyHtoD((hipDeviceptr_t)device_pointer, host_pointer, size));
}
void HIPDevice::mem_alloc(device_memory &mem)
@@ -823,8 +576,8 @@ void HIPDevice::mem_zero(device_memory &mem)
/* If use_mapped_host of mem is false, mem.device_pointer currently refers to device memory
* regardless of mem.host_pointer and mem.shared_pointer. */
thread_scoped_lock lock(hip_mem_map_mutex);
if (!hip_mem_map[&mem].use_mapped_host || mem.host_pointer != mem.shared_pointer) {
thread_scoped_lock lock(device_mem_map_mutex);
if (!device_mem_map[&mem].use_mapped_host || mem.host_pointer != mem.shared_pointer) {
const HIPContextScope scope(this);
hip_assert(hipMemsetD8((hipDeviceptr_t)mem.device_pointer, 0, mem.memory_size()));
}
@@ -951,19 +704,19 @@ void HIPDevice::tex_alloc(device_texture &mem)
return;
}
HIPMem *cmem = NULL;
Mem *cmem = NULL;
hArray array_3d = NULL;
size_t src_pitch = mem.data_width * dsize * mem.data_elements;
size_t dst_pitch = src_pitch;
if (!mem.is_resident(this)) {
thread_scoped_lock lock(hip_mem_map_mutex);
cmem = &hip_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
cmem = &device_mem_map[&mem];
cmem->texobject = 0;
if (mem.data_depth > 1) {
array_3d = (hArray)mem.device_pointer;
cmem->array = array_3d;
cmem->array = reinterpret_cast<arrayMemObject>(array_3d);
}
else if (mem.data_height > 0) {
dst_pitch = align_up(src_pitch, pitch_alignment);
@@ -1007,10 +760,10 @@ void HIPDevice::tex_alloc(device_texture &mem)
mem.device_size = size;
stats.mem_alloc(size);
thread_scoped_lock lock(hip_mem_map_mutex);
cmem = &hip_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
cmem = &device_mem_map[&mem];
cmem->texobject = 0;
cmem->array = array_3d;
cmem->array = reinterpret_cast<arrayMemObject>(array_3d);
}
else if (mem.data_height > 0) {
/* 2D texture, using pitch aligned linear memory. */
@@ -1095,8 +848,8 @@ void HIPDevice::tex_alloc(device_texture &mem)
texDesc.filterMode = filter_mode;
texDesc.flags = HIP_TRSF_NORMALIZED_COORDINATES;
thread_scoped_lock lock(hip_mem_map_mutex);
cmem = &hip_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
cmem = &device_mem_map[&mem];
hip_assert(hipTexObjectCreate(&cmem->texobject, &resDesc, &texDesc, NULL));
@@ -1111,9 +864,9 @@ void HIPDevice::tex_free(device_texture &mem)
{
if (mem.device_pointer) {
HIPContextScope scope(this);
thread_scoped_lock lock(hip_mem_map_mutex);
DCHECK(hip_mem_map.find(&mem) != hip_mem_map.end());
const HIPMem &cmem = hip_mem_map[&mem];
thread_scoped_lock lock(device_mem_map_mutex);
DCHECK(device_mem_map.find(&mem) != device_mem_map.end());
const Mem &cmem = device_mem_map[&mem];
if (cmem.texobject) {
/* Free bindless texture. */
@@ -1122,16 +875,16 @@ void HIPDevice::tex_free(device_texture &mem)
if (!mem.is_resident(this)) {
/* Do not free memory here, since it was allocated on a different device. */
hip_mem_map.erase(hip_mem_map.find(&mem));
device_mem_map.erase(device_mem_map.find(&mem));
}
else if (cmem.array) {
/* Free array. */
hipArrayDestroy(cmem.array);
hipArrayDestroy(reinterpret_cast<hArray>(cmem.array));
stats.mem_free(mem.device_size);
mem.device_pointer = 0;
mem.device_size = 0;
hip_mem_map.erase(hip_mem_map.find(&mem));
device_mem_map.erase(device_mem_map.find(&mem));
}
else {
lock.unlock();
@@ -1153,7 +906,7 @@ bool HIPDevice::should_use_graphics_interop()
* possible, but from the empiric measurements it can be considerably slower than using naive
* pixels copy. */
/* Disable graphics interop for now, because of driver bug in 21.40. See T92972 */
/* Disable graphics interop for now, because of driver bug in 21.40. See #92972 */
# if 0
HIPContextScope scope(this);

View File

@@ -18,7 +18,7 @@ CCL_NAMESPACE_BEGIN
class DeviceQueue;
class HIPDevice : public Device {
class HIPDevice : public GPUDevice {
friend class HIPContextScope;
@@ -26,36 +26,11 @@ class HIPDevice : public Device {
hipDevice_t hipDevice;
hipCtx_t hipContext;
hipModule_t hipModule;
size_t device_texture_headroom;
size_t device_working_headroom;
bool move_texture_to_host;
size_t map_host_used;
size_t map_host_limit;
int can_map_host;
int pitch_alignment;
int hipDevId;
int hipDevArchitecture;
bool first_error;
struct HIPMem {
HIPMem() : texobject(0), array(0), use_mapped_host(false)
{
}
hipTextureObject_t texobject;
hArray array;
/* If true, a mapped host memory in shared_pointer is being used. */
bool use_mapped_host;
};
typedef map<device_memory *, HIPMem> HIPMemMap;
HIPMemMap hip_mem_map;
thread_mutex hip_mem_map_mutex;
/* Bindless Textures */
device_vector<TextureInfo> texture_info;
bool need_texture_info;
HIPDeviceKernels kernels;
static bool have_precompiled_kernels();
@@ -81,17 +56,13 @@ class HIPDevice : public Device {
virtual bool load_kernels(const uint kernel_features) override;
void reserve_local_memory(const uint kernel_features);
void init_host_memory();
void load_texture_info();
void move_textures_to_host(size_t size, bool for_texture);
HIPMem *generic_alloc(device_memory &mem, size_t pitch_padding = 0);
void generic_copy_to(device_memory &mem);
void generic_free(device_memory &mem);
virtual void get_device_memory_info(size_t &total, size_t &free) override;
virtual bool alloc_device(void *&device_pointer, size_t size) override;
virtual void free_device(void *device_pointer) override;
virtual bool alloc_host(void *&shared_pointer, size_t size) override;
virtual void free_host(void *shared_pointer) override;
virtual bool transform_host_pointer(void *&device_pointer, void *&shared_pointer) override;
virtual void copy_host_to_device(void *device_pointer, void *host_pointer, size_t size) override;
void mem_alloc(device_memory &mem) override;

View File

@@ -51,7 +51,7 @@ static inline bool hipSupportsDevice(const int hipDevId)
hipDeviceGetAttribute(&major, hipDeviceAttributeComputeCapabilityMajor, hipDevId);
hipDeviceGetAttribute(&minor, hipDeviceAttributeComputeCapabilityMinor, hipDevId);
return (major >= 10);
return (major >= 9);
}
CCL_NAMESPACE_END

View File

@@ -73,6 +73,10 @@ const char *device_kernel_as_string(DeviceKernel kernel)
return "integrator_terminated_paths_array";
case DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY:
return "integrator_sorted_paths_array";
case DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS:
return "integrator_sort_bucket_pass";
case DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS:
return "integrator_sort_write_pass";
case DEVICE_KERNEL_INTEGRATOR_COMPACT_PATHS_ARRAY:
return "integrator_compact_paths_array";
case DEVICE_KERNEL_INTEGRATOR_COMPACT_STATES:

View File

@@ -247,6 +247,8 @@ class device_memory {
bool is_resident(Device *sub_device) const;
protected:
friend class Device;
friend class GPUDevice;
friend class CUDADevice;
friend class OptiXDevice;
friend class HIPDevice;

View File

@@ -21,6 +21,7 @@ class BVHMetal : public BVH {
API_AVAILABLE(macos(11.0))
vector<id<MTLAccelerationStructure>> blas_array;
vector<uint32_t> blas_lookup;
bool motion_blur = false;

View File

@@ -816,6 +816,11 @@ bool BVHMetal::build_TLAS(Progress &progress,
uint32_t instance_index = 0;
uint32_t motion_transform_index = 0;
// allocate look up buffer for wost case scenario
uint64_t count = objects.size();
blas_lookup.resize(count);
for (Object *ob : objects) {
/* Skip non-traceable objects */
if (!ob->is_traceable())
@@ -843,12 +848,15 @@ bool BVHMetal::build_TLAS(Progress &progress,
/* Set user instance ID to object index */
int object_index = ob->get_device_index();
uint32_t user_id = uint32_t(object_index);
int currIndex = instance_index++;
assert(user_id < blas_lookup.size());
blas_lookup[user_id] = accel_struct_index;
/* Bake into the appropriate descriptor */
if (motion_blur) {
MTLAccelerationStructureMotionInstanceDescriptor *instances =
(MTLAccelerationStructureMotionInstanceDescriptor *)[instanceBuf contents];
MTLAccelerationStructureMotionInstanceDescriptor &desc = instances[instance_index++];
MTLAccelerationStructureMotionInstanceDescriptor &desc = instances[currIndex];
desc.accelerationStructureIndex = accel_struct_index;
desc.userID = user_id;
@@ -894,7 +902,7 @@ bool BVHMetal::build_TLAS(Progress &progress,
else {
MTLAccelerationStructureUserIDInstanceDescriptor *instances =
(MTLAccelerationStructureUserIDInstanceDescriptor *)[instanceBuf contents];
MTLAccelerationStructureUserIDInstanceDescriptor &desc = instances[instance_index++];
MTLAccelerationStructureUserIDInstanceDescriptor &desc = instances[currIndex];
desc.accelerationStructureIndex = accel_struct_index;
desc.userID = user_id;

View File

@@ -55,6 +55,10 @@ void device_metal_info(vector<DeviceInfo> &devices)
info.denoisers = DENOISER_NONE;
info.id = id;
if (MetalInfo::get_device_vendor(device) == METAL_GPU_AMD) {
info.has_light_tree = false;
}
devices.push_back(info);
device_index++;
}

View File

@@ -74,6 +74,11 @@ class MetalDevice : public Device {
id<MTLBuffer> texture_bindings_3d = nil;
std::vector<id<MTLTexture>> texture_slot_map;
/* BLAS encoding & lookup */
id<MTLArgumentEncoder> mtlBlasArgEncoder = nil;
id<MTLBuffer> blas_buffer = nil;
id<MTLBuffer> blas_lookup_buffer = nil;
bool use_metalrt = false;
MetalPipelineType kernel_specialization_level = PSO_GENERIC;
@@ -105,6 +110,8 @@ class MetalDevice : public Device {
bool use_adaptive_compilation();
bool use_local_atomic_sort() const;
bool make_source_and_check_if_compile_needed(MetalPipelineType pso_type);
void make_source(MetalPipelineType pso_type, const uint kernel_features);

View File

@@ -105,6 +105,7 @@ MetalDevice::MetalDevice(const DeviceInfo &info, Stats &stats, Profiler &profile
}
case METAL_GPU_AMD: {
max_threads_per_threadgroup = 128;
use_metalrt = info.use_metalrt;
break;
}
case METAL_GPU_APPLE: {
@@ -192,6 +193,10 @@ MetalDevice::MetalDevice(const DeviceInfo &info, Stats &stats, Profiler &profile
arg_desc_as.dataType = MTLDataTypeInstanceAccelerationStructure;
arg_desc_as.access = MTLArgumentAccessReadOnly;
MTLArgumentDescriptor *arg_desc_ptrs = [[MTLArgumentDescriptor alloc] init];
arg_desc_ptrs.dataType = MTLDataTypePointer;
arg_desc_ptrs.access = MTLArgumentAccessReadOnly;
MTLArgumentDescriptor *arg_desc_ift = [[MTLArgumentDescriptor alloc] init];
arg_desc_ift.dataType = MTLDataTypeIntersectionFunctionTable;
arg_desc_ift.access = MTLArgumentAccessReadOnly;
@@ -204,14 +209,28 @@ MetalDevice::MetalDevice(const DeviceInfo &info, Stats &stats, Profiler &profile
[ancillary_desc addObject:[arg_desc_ift copy]]; /* ift_shadow */
arg_desc_ift.index = index++;
[ancillary_desc addObject:[arg_desc_ift copy]]; /* ift_local */
arg_desc_ift.index = index++;
[ancillary_desc addObject:[arg_desc_ift copy]]; /* ift_local_prim */
arg_desc_ptrs.index = index++;
[ancillary_desc addObject:[arg_desc_ptrs copy]]; /* blas array */
arg_desc_ptrs.index = index++;
[ancillary_desc addObject:[arg_desc_ptrs copy]]; /* look up table for blas */
[arg_desc_ift release];
[arg_desc_as release];
[arg_desc_ptrs release];
}
}
mtlAncillaryArgEncoder = [mtlDevice newArgumentEncoderWithArguments:ancillary_desc];
// preparing the blas arg encoder
MTLArgumentDescriptor *arg_desc_blas = [[MTLArgumentDescriptor alloc] init];
arg_desc_blas.dataType = MTLDataTypeInstanceAccelerationStructure;
arg_desc_blas.access = MTLArgumentAccessReadOnly;
mtlBlasArgEncoder = [mtlDevice newArgumentEncoderWithArguments:@[ arg_desc_blas ]];
[arg_desc_blas release];
for (int i = 0; i < ancillary_desc.count; i++) {
[ancillary_desc[i] release];
}
@@ -271,6 +290,11 @@ bool MetalDevice::use_adaptive_compilation()
return DebugFlags().metal.adaptive_compile;
}
bool MetalDevice::use_local_atomic_sort() const
{
return DebugFlags().metal.use_local_atomic_sort;
}
void MetalDevice::make_source(MetalPipelineType pso_type, const uint kernel_features)
{
string global_defines;
@@ -278,6 +302,10 @@ void MetalDevice::make_source(MetalPipelineType pso_type, const uint kernel_feat
global_defines += "#define __KERNEL_FEATURES__ " + to_string(kernel_features) + "\n";
}
if (use_local_atomic_sort()) {
global_defines += "#define __KERNEL_LOCAL_ATOMIC_SORT__\n";
}
if (use_metalrt) {
global_defines += "#define __METALRT__\n";
if (motion_blur) {
@@ -558,7 +586,7 @@ void MetalDevice::erase_allocation(device_memory &mem)
if (it != metal_mem_map.end()) {
MetalMem *mmem = it->second.get();
/* blank out reference to MetalMem* in the launch params (fixes crash T94736) */
/* blank out reference to MetalMem* in the launch params (fixes crash #94736) */
if (mmem->pointer_index >= 0) {
device_ptr *pointers = (device_ptr *)&launch_params;
pointers[mmem->pointer_index] = 0;
@@ -1231,6 +1259,33 @@ void MetalDevice::build_bvh(BVH *bvh, Progress &progress, bool refit)
if (@available(macos 11.0, *)) {
if (bvh->params.top_level) {
bvhMetalRT = bvh_metal;
// allocate required buffers for BLAS array
uint64_t count = bvhMetalRT->blas_array.size();
uint64_t bufferSize = mtlBlasArgEncoder.encodedLength * count;
blas_buffer = [mtlDevice newBufferWithLength:bufferSize options:default_storage_mode];
stats.mem_alloc(blas_buffer.allocatedSize);
for (uint64_t i = 0; i < count; ++i) {
[mtlBlasArgEncoder setArgumentBuffer:blas_buffer
offset:i * mtlBlasArgEncoder.encodedLength];
[mtlBlasArgEncoder setAccelerationStructure:bvhMetalRT->blas_array[i] atIndex:0];
}
count = bvhMetalRT->blas_lookup.size();
bufferSize = sizeof(uint32_t) * count;
blas_lookup_buffer = [mtlDevice newBufferWithLength:bufferSize
options:default_storage_mode];
stats.mem_alloc(blas_lookup_buffer.allocatedSize);
memcpy([blas_lookup_buffer contents],
bvhMetalRT -> blas_lookup.data(),
blas_lookup_buffer.allocatedSize);
if (default_storage_mode == MTLResourceStorageModeManaged) {
[blas_buffer didModifyRange:NSMakeRange(0, blas_buffer.length)];
[blas_lookup_buffer didModifyRange:NSMakeRange(0, blas_lookup_buffer.length)];
}
}
}
}

View File

@@ -19,6 +19,8 @@ enum {
METALRT_FUNC_SHADOW_BOX,
METALRT_FUNC_LOCAL_TRI,
METALRT_FUNC_LOCAL_BOX,
METALRT_FUNC_LOCAL_TRI_PRIM,
METALRT_FUNC_LOCAL_BOX_PRIM,
METALRT_FUNC_CURVE_RIBBON,
METALRT_FUNC_CURVE_RIBBON_SHADOW,
METALRT_FUNC_CURVE_ALL,
@@ -28,7 +30,13 @@ enum {
METALRT_FUNC_NUM
};
enum { METALRT_TABLE_DEFAULT, METALRT_TABLE_SHADOW, METALRT_TABLE_LOCAL, METALRT_TABLE_NUM };
enum {
METALRT_TABLE_DEFAULT,
METALRT_TABLE_SHADOW,
METALRT_TABLE_LOCAL,
METALRT_TABLE_LOCAL_PRIM,
METALRT_TABLE_NUM
};
/* Pipeline State Object types */
enum MetalPipelineType {

View File

@@ -87,6 +87,9 @@ struct ShaderCache {
break;
}
}
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS] = {1024, 1024};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS] = {1024, 1024};
}
~ShaderCache();
@@ -521,6 +524,8 @@ void MetalKernelPipeline::compile()
"__anyhit__cycles_metalrt_shadow_all_hit_box",
"__anyhit__cycles_metalrt_local_hit_tri",
"__anyhit__cycles_metalrt_local_hit_box",
"__anyhit__cycles_metalrt_local_hit_tri_prim",
"__anyhit__cycles_metalrt_local_hit_box_prim",
"__intersection__curve_ribbon",
"__intersection__curve_ribbon_shadow",
"__intersection__curve_all",
@@ -611,11 +616,17 @@ void MetalKernelPipeline::compile()
rt_intersection_function[METALRT_FUNC_LOCAL_BOX],
rt_intersection_function[METALRT_FUNC_LOCAL_BOX],
nil];
table_functions[METALRT_TABLE_LOCAL_PRIM] = [NSArray
arrayWithObjects:rt_intersection_function[METALRT_FUNC_LOCAL_TRI_PRIM],
rt_intersection_function[METALRT_FUNC_LOCAL_BOX_PRIM],
rt_intersection_function[METALRT_FUNC_LOCAL_BOX_PRIM],
nil];
NSMutableSet *unique_functions = [NSMutableSet
setWithArray:table_functions[METALRT_TABLE_DEFAULT]];
[unique_functions addObjectsFromArray:table_functions[METALRT_TABLE_SHADOW]];
[unique_functions addObjectsFromArray:table_functions[METALRT_TABLE_LOCAL]];
[unique_functions addObjectsFromArray:table_functions[METALRT_TABLE_LOCAL_PRIM]];
if (kernel_has_intersection(device_kernel)) {
linked_functions = [[NSArray arrayWithArray:[unique_functions allObjects]]

View File

@@ -25,6 +25,7 @@ class MetalDeviceQueue : public DeviceQueue {
virtual int num_concurrent_states(const size_t) const override;
virtual int num_concurrent_busy_states(const size_t) const override;
virtual int num_sort_partition_elements() const override;
virtual bool supports_local_atomic_sort() const override;
virtual void init_execution() override;

View File

@@ -315,6 +315,11 @@ int MetalDeviceQueue::num_sort_partition_elements() const
return MetalInfo::optimal_sort_partition_elements(metal_device_->mtlDevice);
}
bool MetalDeviceQueue::supports_local_atomic_sort() const
{
return metal_device_->use_local_atomic_sort();
}
void MetalDeviceQueue::init_execution()
{
/* Synchronize all textures and memory copies before executing task. */
@@ -477,6 +482,12 @@ bool MetalDeviceQueue::enqueue(DeviceKernel kernel,
if (metal_device_->bvhMetalRT) {
id<MTLAccelerationStructure> accel_struct = metal_device_->bvhMetalRT->accel_struct;
[metal_device_->mtlAncillaryArgEncoder setAccelerationStructure:accel_struct atIndex:2];
[metal_device_->mtlAncillaryArgEncoder setBuffer:metal_device_->blas_buffer
offset:0
atIndex:7];
[metal_device_->mtlAncillaryArgEncoder setBuffer:metal_device_->blas_lookup_buffer
offset:0
atIndex:8];
}
for (int table = 0; table < METALRT_TABLE_NUM; table++) {
@@ -527,6 +538,10 @@ bool MetalDeviceQueue::enqueue(DeviceKernel kernel,
if (bvhMetalRT) {
/* Mark all Accelerations resources as used */
[mtlComputeCommandEncoder useResource:bvhMetalRT->accel_struct usage:MTLResourceUsageRead];
[mtlComputeCommandEncoder useResource:metal_device_->blas_buffer
usage:MTLResourceUsageRead];
[mtlComputeCommandEncoder useResource:metal_device_->blas_lookup_buffer
usage:MTLResourceUsageRead];
[mtlComputeCommandEncoder useResources:bvhMetalRT->blas_array.data()
count:bvhMetalRT->blas_array.size()
usage:MTLResourceUsageRead];
@@ -553,13 +568,24 @@ bool MetalDeviceQueue::enqueue(DeviceKernel kernel,
/* See parallel_active_index.h for why this amount of shared memory is needed.
* Rounded up to 16 bytes for Metal */
shared_mem_bytes = (int)round_up((num_threads_per_block + 1) * sizeof(int), 16);
[mtlComputeCommandEncoder setThreadgroupMemoryLength:shared_mem_bytes atIndex:0];
break;
case DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS:
case DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS: {
int key_count = metal_device_->launch_params.data.max_shaders;
shared_mem_bytes = (int)round_up(key_count * sizeof(int), 16);
break;
}
default:
break;
}
if (shared_mem_bytes) {
assert(shared_mem_bytes <= 32 * 1024);
[mtlComputeCommandEncoder setThreadgroupMemoryLength:shared_mem_bytes atIndex:0];
}
MTLSize size_threadgroups_per_dispatch = MTLSizeMake(
divide_up(work_size, num_threads_per_block), 1, 1);
MTLSize size_threads_per_threadgroup = MTLSizeMake(num_threads_per_block, 1, 1);

View File

@@ -64,6 +64,12 @@ MetalGPUVendor MetalInfo::get_device_vendor(id<MTLDevice> device)
return METAL_GPU_INTEL;
}
else if (strstr(device_name, "AMD")) {
/* Setting this env var hides AMD devices thus exposing any integrated Intel devices. */
if (auto str = getenv("CYCLES_METAL_FORCE_INTEL")) {
if (atoi(str)) {
return METAL_GPU_UNKNOWN;
}
}
return METAL_GPU_AMD;
}
else if (strstr(device_name, "Apple")) {
@@ -96,6 +102,15 @@ vector<id<MTLDevice>> const &MetalInfo::get_usable_devices()
return usable_devices;
}
/* If the system has both an AMD GPU (discrete) and an Intel one (integrated), prefer the AMD
* one. This can be overridden with CYCLES_METAL_FORCE_INTEL. */
bool has_usable_amd_gpu = false;
if (@available(macos 12.3, *)) {
for (id<MTLDevice> device in MTLCopyAllDevices()) {
has_usable_amd_gpu |= (get_device_vendor(device) == METAL_GPU_AMD);
}
}
metal_printf("Usable Metal devices:\n");
for (id<MTLDevice> device in MTLCopyAllDevices()) {
string device_name = get_device_name(device);
@@ -111,8 +126,10 @@ vector<id<MTLDevice>> const &MetalInfo::get_usable_devices()
}
# if defined(MAC_OS_VERSION_13_0)
if (@available(macos 13.0, *)) {
usable |= (vendor == METAL_GPU_INTEL);
if (!has_usable_amd_gpu) {
if (@available(macos 13.0, *)) {
usable |= (vendor == METAL_GPU_INTEL);
}
}
# endif

View File

@@ -854,12 +854,14 @@ bool OptiXDevice::load_osl_kernels()
context, group_descs, 2, &group_options, nullptr, 0, &osl_groups[i * 2]));
}
OptixStackSizes stack_size[NUM_PROGRAM_GROUPS] = {};
vector<OptixStackSizes> osl_stack_size(osl_groups.size());
/* Update SBT with new entries. */
sbt_data.alloc(NUM_PROGRAM_GROUPS + osl_groups.size());
for (int i = 0; i < NUM_PROGRAM_GROUPS; ++i) {
optix_assert(optixSbtRecordPackHeader(groups[i], &sbt_data[i]));
optix_assert(optixProgramGroupGetStackSize(groups[i], &stack_size[i]));
}
for (size_t i = 0; i < osl_groups.size(); ++i) {
if (osl_groups[i] != NULL) {
@@ -907,13 +909,15 @@ bool OptiXDevice::load_osl_kernels()
0,
&pipelines[PIP_SHADE]));
const unsigned int css = std::max(stack_size[PG_RGEN_SHADE_SURFACE_RAYTRACE].cssRG,
stack_size[PG_RGEN_SHADE_SURFACE_MNEE].cssRG);
unsigned int dss = 0;
for (unsigned int i = 0; i < osl_stack_size.size(); ++i) {
dss = std::max(dss, osl_stack_size[i].dssDC);
}
optix_assert(optixPipelineSetStackSize(
pipelines[PIP_SHADE], 0, dss, 0, pipeline_options.usesMotionBlur ? 3 : 2));
pipelines[PIP_SHADE], 0, dss, css, pipeline_options.usesMotionBlur ? 3 : 2));
}
return !have_error();

View File

@@ -112,6 +112,13 @@ class DeviceQueue {
return 65536;
}
/* Does device support local atomic sorting kernels (INTEGRATOR_SORT_BUCKET_PASS and
* INTEGRATOR_SORT_WRITE_PASS)? */
virtual bool supports_local_atomic_sort() const
{
return false;
}
/* Initialize execution of kernels on this queue.
*
* Will, for example, load all data required by the kernels from Device to global or path state.

View File

@@ -71,6 +71,8 @@ PathTraceWorkGPU::PathTraceWorkGPU(Device *device,
device, "integrator_shader_mnee_sort_counter", MEM_READ_WRITE),
integrator_shader_sort_prefix_sum_(
device, "integrator_shader_sort_prefix_sum", MEM_READ_WRITE),
integrator_shader_sort_partition_key_offsets_(
device, "integrator_shader_sort_partition_key_offsets", MEM_READ_WRITE),
integrator_next_main_path_index_(device, "integrator_next_main_path_index", MEM_READ_WRITE),
integrator_next_shadow_path_index_(
device, "integrator_next_shadow_path_index", MEM_READ_WRITE),
@@ -207,33 +209,45 @@ void PathTraceWorkGPU::alloc_integrator_sorting()
integrator_state_gpu_.sort_partition_divisor = (int)divide_up(max_num_paths_,
num_sort_partitions_);
/* Allocate arrays for shader sorting. */
const int sort_buckets = device_scene_->data.max_shaders * num_sort_partitions_;
if (integrator_shader_sort_counter_.size() < sort_buckets) {
integrator_shader_sort_counter_.alloc(sort_buckets);
integrator_shader_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE] =
(int *)integrator_shader_sort_counter_.device_pointer;
integrator_shader_sort_prefix_sum_.alloc(sort_buckets);
integrator_shader_sort_prefix_sum_.zero_to_device();
}
if (device_scene_->data.kernel_features & KERNEL_FEATURE_NODE_RAYTRACE) {
if (integrator_shader_raytrace_sort_counter_.size() < sort_buckets) {
integrator_shader_raytrace_sort_counter_.alloc(sort_buckets);
integrator_shader_raytrace_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE] =
(int *)integrator_shader_raytrace_sort_counter_.device_pointer;
if (num_sort_partitions_ > 1 && queue_->supports_local_atomic_sort()) {
/* Allocate array for partitioned shader sorting using local atomics. */
const int num_offsets = (device_scene_->data.max_shaders + 1) * num_sort_partitions_;
if (integrator_shader_sort_partition_key_offsets_.size() < num_offsets) {
integrator_shader_sort_partition_key_offsets_.alloc(num_offsets);
integrator_shader_sort_partition_key_offsets_.zero_to_device();
}
integrator_state_gpu_.sort_partition_key_offsets =
(int *)integrator_shader_sort_partition_key_offsets_.device_pointer;
}
else {
/* Allocate arrays for shader sorting. */
const int sort_buckets = device_scene_->data.max_shaders * num_sort_partitions_;
if (integrator_shader_sort_counter_.size() < sort_buckets) {
integrator_shader_sort_counter_.alloc(sort_buckets);
integrator_shader_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE] =
(int *)integrator_shader_sort_counter_.device_pointer;
if (device_scene_->data.kernel_features & KERNEL_FEATURE_MNEE) {
if (integrator_shader_mnee_sort_counter_.size() < sort_buckets) {
integrator_shader_mnee_sort_counter_.alloc(sort_buckets);
integrator_shader_mnee_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE] =
(int *)integrator_shader_mnee_sort_counter_.device_pointer;
integrator_shader_sort_prefix_sum_.alloc(sort_buckets);
integrator_shader_sort_prefix_sum_.zero_to_device();
}
if (device_scene_->data.kernel_features & KERNEL_FEATURE_NODE_RAYTRACE) {
if (integrator_shader_raytrace_sort_counter_.size() < sort_buckets) {
integrator_shader_raytrace_sort_counter_.alloc(sort_buckets);
integrator_shader_raytrace_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE] =
(int *)integrator_shader_raytrace_sort_counter_.device_pointer;
}
}
if (device_scene_->data.kernel_features & KERNEL_FEATURE_MNEE) {
if (integrator_shader_mnee_sort_counter_.size() < sort_buckets) {
integrator_shader_mnee_sort_counter_.alloc(sort_buckets);
integrator_shader_mnee_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE] =
(int *)integrator_shader_mnee_sort_counter_.device_pointer;
}
}
}
}
@@ -451,8 +465,7 @@ void PathTraceWorkGPU::enqueue_path_iteration(DeviceKernel kernel, const int num
work_size = num_queued;
d_path_index = queued_paths_.device_pointer;
compute_sorted_queued_paths(
DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY, kernel, num_paths_limit);
compute_sorted_queued_paths(kernel, num_paths_limit);
}
else if (num_queued < work_size) {
work_size = num_queued;
@@ -511,11 +524,26 @@ void PathTraceWorkGPU::enqueue_path_iteration(DeviceKernel kernel, const int num
}
}
void PathTraceWorkGPU::compute_sorted_queued_paths(DeviceKernel kernel,
DeviceKernel queued_kernel,
void PathTraceWorkGPU::compute_sorted_queued_paths(DeviceKernel queued_kernel,
const int num_paths_limit)
{
int d_queued_kernel = queued_kernel;
/* Launch kernel to fill the active paths arrays. */
if (num_sort_partitions_ > 1 && queue_->supports_local_atomic_sort()) {
const int work_size = kernel_max_active_main_path_index(queued_kernel);
device_ptr d_queued_paths = queued_paths_.device_pointer;
int partition_size = (int)integrator_state_gpu_.sort_partition_divisor;
DeviceKernelArguments args(
&work_size, &partition_size, &num_paths_limit, &d_queued_paths, &d_queued_kernel);
queue_->enqueue(DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS, 1024 * num_sort_partitions_, args);
queue_->enqueue(DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS, 1024 * num_sort_partitions_, args);
return;
}
device_ptr d_counter = (device_ptr)integrator_state_gpu_.sort_key_counter[d_queued_kernel];
device_ptr d_prefix_sum = integrator_shader_sort_prefix_sum_.device_pointer;
assert(d_counter != 0 && d_prefix_sum != 0);
@@ -552,7 +580,7 @@ void PathTraceWorkGPU::compute_sorted_queued_paths(DeviceKernel kernel,
&d_prefix_sum,
&d_queued_kernel);
queue_->enqueue(kernel, work_size, args);
queue_->enqueue(DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY, work_size, args);
}
}

View File

@@ -70,9 +70,7 @@ class PathTraceWorkGPU : public PathTraceWork {
void enqueue_path_iteration(DeviceKernel kernel, const int num_paths_limit = INT_MAX);
void compute_queued_paths(DeviceKernel kernel, DeviceKernel queued_kernel);
void compute_sorted_queued_paths(DeviceKernel kernel,
DeviceKernel queued_kernel,
const int num_paths_limit);
void compute_sorted_queued_paths(DeviceKernel queued_kernel, const int num_paths_limit);
void compact_main_paths(const int num_active_paths);
void compact_shadow_paths();
@@ -135,6 +133,7 @@ class PathTraceWorkGPU : public PathTraceWork {
device_vector<int> integrator_shader_raytrace_sort_counter_;
device_vector<int> integrator_shader_mnee_sort_counter_;
device_vector<int> integrator_shader_sort_prefix_sum_;
device_vector<int> integrator_shader_sort_partition_key_offsets_;
/* Path split. */
device_vector<int> integrator_next_main_path_index_;
device_vector<int> integrator_next_shadow_path_index_;

View File

@@ -886,7 +886,7 @@ int RenderScheduler::get_num_samples_during_navigation(int resolution_divider) c
{
/* Special trick for fast navigation: schedule multiple samples during fast navigation
* (which will prefer to use lower resolution to keep up with refresh rate). This gives more
* usable visual feedback for artists. There are a couple of tricks though. */
* usable visual feedback for artists. */
if (is_denoise_active_during_update()) {
/* When denoising is used during navigation prefer using a higher resolution with less samples
@@ -896,25 +896,12 @@ int RenderScheduler::get_num_samples_during_navigation(int resolution_divider) c
return 1;
}
if (resolution_divider <= pixel_size_) {
/* When resolution divider is at or below pixel size, schedule one sample. This doesn't effect
* the sample count at this resolution division, but instead assists in the calculation of
* the resolution divider. */
return 1;
}
if (resolution_divider == pixel_size_ * 2) {
/* When resolution divider is the previous step to the final resolution, schedule two samples.
* This is so that rendering on lower resolution does not exceed time that it takes to render
* first sample at the full resolution. */
return 2;
}
/* Always render 4 samples, even if scene is configured for less.
* The idea here is to have enough information on the screen. Resolution divider of 2 allows us
* to have 4 time extra samples, so overall worst case timing is the same as the final resolution
* at one sample. */
return 4;
/* Schedule samples equal to the resolution divider up to a maximum of 4.
* The idea is to have enough information on the screen by increasing the sample count as the
* resolution is decreased. */
/* NOTE: Changing this formula will change the formula in
* `RenderScheduler::calculate_resolution_divider_for_time()`. */
return min(max(1, resolution_divider / pixel_size_), 4);
}
bool RenderScheduler::work_need_adaptive_filter() const
@@ -1100,9 +1087,10 @@ void RenderScheduler::update_start_resolution_divider()
/* TODO(sergey): Need to add hysteresis to avoid resolution divider bouncing around when actual
* render time is somewhere on a boundary between two resolutions. */
/* Never increase resolution to higher than the pixel size (which is possible if the scene is
* simple and compute device is fast). */
start_resolution_divider_ = max(resolution_divider_for_update, pixel_size_);
/* Don't let resolution drop below the desired one. It's better to be slow than provide an
* unreadable viewport render. */
start_resolution_divider_ = min(resolution_divider_for_update,
default_start_resolution_divider_);
VLOG_WORK << "Calculated resolution divider is " << start_resolution_divider_;
}
@@ -1187,24 +1175,24 @@ void RenderScheduler::check_time_limit_reached()
int RenderScheduler::calculate_resolution_divider_for_time(double desired_time, double actual_time)
{
/* TODO(sergey): There should a non-iterative analytical formula here. */
const double ratio_between_times = actual_time / desired_time;
int resolution_divider = 1;
/* We can pass `ratio_between_times` to `get_num_samples_during_navigation()` to get our
* navigation samples because the equation for calculating the resolution divider is as follows:
* `actual_time / desired_time = sqr(resolution_divider) / sample_count`.
* While `resolution_divider` is less than or equal to 4, `resolution_divider = sample_count`
* (This relationship is determined in `get_num_samples_during_navigation()`). With some
* substitution we end up with `actual_time / desired_time = resolution_divider` while the
* resolution divider is less than or equal to 4. Once the resolution divider increases above 4,
* the relationship of `actual_time / desired_time = resolution_divider` is no longer true,
* however the sample count retrieved from `get_num_samples_during_navigation()` is still
* accurate if we continue using this assumption. It should be noted that the interaction between
* `pixel_size`, sample count, and resolution divider are automatically accounted for and that's
* why `pixel_size` isn't included in any of the equations. */
const int navigation_samples = get_num_samples_during_navigation(
ceil_to_int(ratio_between_times));
/* This algorithm iterates through resolution dividers until a divider is found that achieves
* the desired render time. A limit of default_start_resolution_divider_ is put in place as the
* maximum resolution divider to avoid an unreadable viewport due to a low resolution.
* pre_resolution_division_samples and post_resolution_division_samples are used in this
* calculation to better predict the performance impact of changing resolution divisions as
* the sample count can also change between resolution divisions. */
while (actual_time > desired_time && resolution_divider < default_start_resolution_divider_) {
int pre_resolution_division_samples = get_num_samples_during_navigation(resolution_divider);
resolution_divider = resolution_divider * 2;
int post_resolution_division_samples = get_num_samples_during_navigation(resolution_divider);
actual_time /= 4.0 * pre_resolution_division_samples / post_resolution_division_samples;
}
return resolution_divider;
return ceil_to_int(sqrt(navigation_samples * ratio_between_times));
}
int calculate_resolution_divider_for_resolution(int width, int height, int resolution)

View File

@@ -412,11 +412,12 @@ if(WITH_CYCLES_CUDA_BINARIES)
# warn for other versions
if((CUDA_VERSION STREQUAL "101") OR
(CUDA_VERSION STREQUAL "102") OR
(CUDA_VERSION_MAJOR STREQUAL "11"))
(CUDA_VERSION_MAJOR STREQUAL "11") OR
(CUDA_VERSION_MAJOR STREQUAL "12"))
else()
message(WARNING
"CUDA version ${CUDA_VERSION_MAJOR}.${CUDA_VERSION_MINOR} detected, "
"build may succeed but only CUDA 11, 10.2 and 10.1 have been tested")
"build may succeed but only CUDA 12, 11, 10.2 and 10.1 have been tested")
endif()
# build for each arch
@@ -514,6 +515,16 @@ if(WITH_CYCLES_CUDA_BINARIES)
else()
message(STATUS "CUDA binaries for ${arch} require CUDA 10 or earlier, skipped.")
endif()
elseif(${arch} MATCHES ".*_3.")
if(DEFINED CUDA11_NVCC_EXECUTABLE)
set(cuda_nvcc_executable ${CUDA11_NVCC_EXECUTABLE})
set(cuda_toolkit_root_dir ${CUDA11_TOOLKIT_ROOT_DIR})
elseif("${CUDA_VERSION}" LESS 120) # Support for sm_35, sm_37 was removed in CUDA 12
set(cuda_nvcc_executable ${CUDA_NVCC_EXECUTABLE})
set(cuda_toolkit_root_dir ${CUDA_TOOLKIT_ROOT_DIR})
else()
message(STATUS "CUDA binaries for ${arch} require CUDA 11 or earlier, skipped.")
endif()
elseif(${arch} MATCHES ".*_7." AND "${CUDA_VERSION}" LESS 100)
message(STATUS "CUDA binaries for ${arch} require CUDA 10.0+, skipped.")
elseif(${arch} MATCHES ".*_8.")

View File

@@ -661,7 +661,8 @@ ccl_device void bsdf_blur(KernelGlobals kg, ccl_private ShaderClosure *sc, float
#endif
}
ccl_device_inline Spectrum bsdf_albedo(ccl_private const ShaderData *sd, ccl_private const ShaderClosure *sc)
ccl_device_inline Spectrum bsdf_albedo(ccl_private const ShaderData *sd,
ccl_private const ShaderClosure *sc)
{
Spectrum albedo = sc->weight;
/* Some closures include additional components such as Fresnel terms that cause their albedo to
@@ -685,7 +686,7 @@ ccl_device_inline Spectrum bsdf_albedo(ccl_private const ShaderData *sd, ccl_pri
albedo *= ((ccl_private const PrincipledSheenBsdf *)sc)->avg_value;
break;
case CLOSURE_BSDF_HAIR_PRINCIPLED_ID:
albedo *= bsdf_principled_hair_albedo(sc);
albedo *= bsdf_principled_hair_albedo(sd, sc);
break;
default:
break;

View File

@@ -478,10 +478,18 @@ ccl_device_inline float bsdf_principled_hair_albedo_roughness_scale(
return (((((0.245f * x) + 5.574f) * x - 10.73f) * x + 2.532f) * x - 0.215f) * x + 5.969f;
}
ccl_device Spectrum bsdf_principled_hair_albedo(ccl_private const ShaderClosure *sc)
ccl_device Spectrum bsdf_principled_hair_albedo(ccl_private const ShaderData *sd,
ccl_private const ShaderClosure *sc)
{
ccl_private PrincipledHairBSDF *bsdf = (ccl_private PrincipledHairBSDF *)sc;
return exp(-sqrt(bsdf->sigma) * bsdf_principled_hair_albedo_roughness_scale(bsdf->v));
const float cos_theta_o = cos_from_sin(dot(sd->wi, safe_normalize(sd->dPdu)));
const float cos_gamma_o = cos_from_sin(bsdf->extra->geom.w);
const float f = fresnel_dielectric_cos(cos_theta_o * cos_gamma_o, bsdf->eta);
const float roughness_scale = bsdf_principled_hair_albedo_roughness_scale(bsdf->v);
/* TODO(lukas): Adding the Fresnel term here as a workaround until the proper refactor. */
return exp(-sqrt(bsdf->sigma) * roughness_scale) + make_spectrum(f);
}
ccl_device_inline Spectrum

View File

@@ -519,14 +519,6 @@ ccl_device int bsdf_microfacet_ggx_setup(ccl_private MicrofacetBsdf *bsdf)
return SD_BSDF | SD_BSDF_HAS_EVAL;
}
/* Required to maintain OSL interface. */
ccl_device int bsdf_microfacet_ggx_isotropic_setup(ccl_private MicrofacetBsdf *bsdf)
{
bsdf->alpha_y = bsdf->alpha_x;
return bsdf_microfacet_ggx_setup(bsdf);
}
ccl_device int bsdf_microfacet_ggx_fresnel_setup(ccl_private MicrofacetBsdf *bsdf,
ccl_private const ShaderData *sd)
{
@@ -613,14 +605,6 @@ ccl_device int bsdf_microfacet_beckmann_setup(ccl_private MicrofacetBsdf *bsdf)
return SD_BSDF | SD_BSDF_HAS_EVAL;
}
/* Required to maintain OSL interface. */
ccl_device int bsdf_microfacet_beckmann_isotropic_setup(ccl_private MicrofacetBsdf *bsdf)
{
bsdf->alpha_y = bsdf->alpha_x;
return bsdf_microfacet_beckmann_setup(bsdf);
}
ccl_device int bsdf_microfacet_beckmann_refraction_setup(ccl_private MicrofacetBsdf *bsdf)
{
bsdf->alpha_x = saturatef(bsdf->alpha_x);

View File

@@ -90,8 +90,10 @@ ccl_device float schlick_fresnel(float u)
}
/* Calculate the fresnel color, which is a blend between white and the F0 color */
ccl_device_forceinline Spectrum
interpolate_fresnel_color(float3 L, float3 H, float ior, Spectrum F0)
ccl_device_forceinline Spectrum interpolate_fresnel_color(float3 L,
float3 H,
float ior,
Spectrum F0)
{
/* Compute the real Fresnel term and remap it from real_F0..1 to F0..1.
* The reason why we use this remapping instead of directly doing the

View File

@@ -10,7 +10,7 @@
#ifndef WITH_CYCLES_OPTIMIZED_KERNEL_AVX2
# define KERNEL_STUB
#else
/* SSE optimization disabled for now on 32 bit, see bug T36316. */
/* SSE optimization disabled for now on 32 bit, see bug #36316. */
# if !(defined(__GNUC__) && (defined(i386) || defined(_M_IX86)))
# define __KERNEL_SSE__
# define __KERNEL_SSE2__

View File

@@ -10,7 +10,7 @@
#ifndef WITH_CYCLES_OPTIMIZED_KERNEL_SSE2
# define KERNEL_STUB
#else
/* SSE optimization disabled for now on 32 bit, see bug T36316. */
/* SSE optimization disabled for now on 32 bit, see bug #36316. */
# if !(defined(__GNUC__) && (defined(i386) || defined(_M_IX86)))
# define __KERNEL_SSE2__
# endif

View File

@@ -10,7 +10,7 @@
#ifndef WITH_CYCLES_OPTIMIZED_KERNEL_SSE41
# define KERNEL_STUB
#else
/* SSE optimization disabled for now on 32 bit, see bug T36316. */
/* SSE optimization disabled for now on 32 bit, see bug #36316. */
# if !(defined(__GNUC__) && (defined(i386) || defined(_M_IX86)))
# define __KERNEL_SSE2__
# define __KERNEL_SSE3__

View File

@@ -401,6 +401,72 @@ ccl_gpu_kernel_threads(GPU_PARALLEL_SORTED_INDEX_DEFAULT_BLOCK_SIZE)
}
ccl_gpu_kernel_postfix
ccl_gpu_kernel_threads(GPU_PARALLEL_SORT_BLOCK_SIZE)
ccl_gpu_kernel_signature(integrator_sort_bucket_pass,
int num_states,
int partition_size,
int num_states_limit,
ccl_global int *indices,
int kernel_index)
{
#if defined(__KERNEL_LOCAL_ATOMIC_SORT__)
int max_shaders = context.launch_params_metal.data.max_shaders;
ccl_global ushort *d_queued_kernel = (ccl_global ushort *)
kernel_integrator_state.path.queued_kernel;
ccl_global uint *d_shader_sort_key = (ccl_global uint *)
kernel_integrator_state.path.shader_sort_key;
ccl_global int *key_offsets = (ccl_global int *)
kernel_integrator_state.sort_partition_key_offsets;
gpu_parallel_sort_bucket_pass(num_states,
partition_size,
max_shaders,
kernel_index,
d_queued_kernel,
d_shader_sort_key,
key_offsets,
(threadgroup int *)threadgroup_array,
metal_local_id,
metal_local_size,
metal_grid_id);
#endif
}
ccl_gpu_kernel_postfix
ccl_gpu_kernel_threads(GPU_PARALLEL_SORT_BLOCK_SIZE)
ccl_gpu_kernel_signature(integrator_sort_write_pass,
int num_states,
int partition_size,
int num_states_limit,
ccl_global int *indices,
int kernel_index)
{
#if defined(__KERNEL_LOCAL_ATOMIC_SORT__)
int max_shaders = context.launch_params_metal.data.max_shaders;
ccl_global ushort *d_queued_kernel = (ccl_global ushort *)
kernel_integrator_state.path.queued_kernel;
ccl_global uint *d_shader_sort_key = (ccl_global uint *)
kernel_integrator_state.path.shader_sort_key;
ccl_global int *key_offsets = (ccl_global int *)
kernel_integrator_state.sort_partition_key_offsets;
gpu_parallel_sort_write_pass(num_states,
partition_size,
max_shaders,
kernel_index,
num_states_limit,
indices,
d_queued_kernel,
d_shader_sort_key,
key_offsets,
(threadgroup int *)threadgroup_array,
metal_local_id,
metal_local_size,
metal_grid_id);
#endif
}
ccl_gpu_kernel_postfix
ccl_gpu_kernel_threads(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE)
ccl_gpu_kernel_signature(integrator_compact_paths_array,
int num_states,
@@ -579,7 +645,7 @@ ccl_device_inline void kernel_gpu_film_convert_half_write(ccl_global uchar4 *rgb
const int y,
const half4 half_pixel)
{
/* Work around HIP issue with half float display, see T92972. */
/* Work around HIP issue with half float display, see #92972. */
#ifdef __KERNEL_HIP__
ccl_global half *out = ((ccl_global half *)rgba) + (rgba_offset + y * rgba_stride + x) * 4;
out[0] = half_pixel.x;

View File

@@ -178,7 +178,7 @@ __device__
simd_lane_index, \
simd_group_index, \
num_simd_groups, \
simdgroup_offset)
(threadgroup int *)threadgroup_array)
#elif defined(__KERNEL_ONEAPI__)
# define gpu_parallel_active_index_array(num_states, indices, num_indices, is_active_op) \

View File

@@ -19,6 +19,115 @@ CCL_NAMESPACE_BEGIN
# define GPU_PARALLEL_SORTED_INDEX_DEFAULT_BLOCK_SIZE 512
#endif
#define GPU_PARALLEL_SORTED_INDEX_INACTIVE_KEY (~0)
#define GPU_PARALLEL_SORT_BLOCK_SIZE 1024
#if defined(__KERNEL_LOCAL_ATOMIC_SORT__)
# define atomic_store_local(p, x) \
atomic_store_explicit((threadgroup atomic_int *)p, x, memory_order_relaxed)
# define atomic_load_local(p) \
atomic_load_explicit((threadgroup atomic_int *)p, memory_order_relaxed)
ccl_device_inline void gpu_parallel_sort_bucket_pass(const uint num_states,
const uint partition_size,
const uint max_shaders,
const uint queued_kernel,
ccl_global ushort *d_queued_kernel,
ccl_global uint *d_shader_sort_key,
ccl_global int *partition_key_offsets,
ccl_gpu_shared int *buckets,
const ushort local_id,
const ushort local_size,
const ushort grid_id)
{
/* Zero the bucket sizes. */
if (local_id < max_shaders) {
atomic_store_local(&buckets[local_id], 0);
}
ccl_gpu_syncthreads();
/* Determine bucket sizes within the partitions. */
const uint partition_start = partition_size * uint(grid_id);
const uint partition_end = min(num_states, partition_start + partition_size);
for (int state_index = partition_start + uint(local_id); state_index < partition_end;
state_index += uint(local_size)) {
ushort kernel_index = d_queued_kernel[state_index];
if (kernel_index == queued_kernel) {
uint key = d_shader_sort_key[state_index] % max_shaders;
atomic_fetch_and_add_uint32(&buckets[key], 1);
}
}
ccl_gpu_syncthreads();
/* Calculate the partition's local offsets from the prefix sum of bucket sizes. */
if (local_id == 0) {
int offset = 0;
for (int i = 0; i < max_shaders; i++) {
partition_key_offsets[i + uint(grid_id) * (max_shaders + 1)] = offset;
offset = offset + atomic_load_local(&buckets[i]);
}
/* Store the number of active states in this partition. */
partition_key_offsets[max_shaders + uint(grid_id) * (max_shaders + 1)] = offset;
}
}
ccl_device_inline void gpu_parallel_sort_write_pass(const uint num_states,
const uint partition_size,
const uint max_shaders,
const uint queued_kernel,
const int num_states_limit,
ccl_global int *indices,
ccl_global ushort *d_queued_kernel,
ccl_global uint *d_shader_sort_key,
ccl_global int *partition_key_offsets,
ccl_gpu_shared int *local_offset,
const ushort local_id,
const ushort local_size,
const ushort grid_id)
{
/* Calculate each partition's global offset from the prefix sum of the active state counts per
* partition. */
if (local_id < max_shaders) {
int partition_offset = 0;
for (int i = 0; i < uint(grid_id); i++) {
int partition_key_count = partition_key_offsets[max_shaders + uint(i) * (max_shaders + 1)];
partition_offset += partition_key_count;
}
ccl_global int *key_offsets = partition_key_offsets + (uint(grid_id) * (max_shaders + 1));
atomic_store_local(&local_offset[local_id], key_offsets[local_id] + partition_offset);
}
ccl_gpu_syncthreads();
/* Write the sorted active indices. */
const uint partition_start = partition_size * uint(grid_id);
const uint partition_end = min(num_states, partition_start + partition_size);
ccl_global int *key_offsets = partition_key_offsets + (uint(grid_id) * max_shaders);
for (int state_index = partition_start + uint(local_id); state_index < partition_end;
state_index += uint(local_size)) {
ushort kernel_index = d_queued_kernel[state_index];
if (kernel_index == queued_kernel) {
uint key = d_shader_sort_key[state_index] % max_shaders;
int index = atomic_fetch_and_add_uint32(&local_offset[key], 1);
if (index < num_states_limit) {
indices[index] = state_index;
}
}
}
}
#endif /* __KERNEL_LOCAL_ATOMIC_SORT__ */
template<typename GetKeyOp>
__device__ void gpu_parallel_sorted_index_array(const uint state_index,

View File

@@ -172,17 +172,14 @@ ccl_device_intersect bool scene_intersect_local(KernelGlobals kg,
kernel_assert(!"Invalid ift_local");
return false;
}
# endif
metal::raytracing::ray r(ray->P, ray->D, ray->tmin, ray->tmax);
metalrt_intersector_type metalrt_intersect;
metalrt_intersect.force_opacity(metal::raytracing::forced_opacity::non_opaque);
bool triangle_only = !kernel_data.bvh.have_curves && !kernel_data.bvh.have_points;
if (triangle_only) {
metalrt_intersect.assume_geometry_type(metal::raytracing::geometry_type::triangle);
if (is_null_intersection_function_table(metal_ancillaries->ift_local_prim)) {
if (local_isect) {
local_isect->num_hits = 0;
}
kernel_assert(!"Invalid ift_local_prim");
return false;
}
# endif
MetalRTIntersectionLocalPayload payload;
payload.self = ray->self;
@@ -195,14 +192,48 @@ ccl_device_intersect bool scene_intersect_local(KernelGlobals kg,
}
payload.result = false;
typename metalrt_intersector_type::result_type intersection;
metal::raytracing::ray r(ray->P, ray->D, ray->tmin, ray->tmax);
# if defined(__METALRT_MOTION__)
metalrt_intersector_type metalrt_intersect;
typename metalrt_intersector_type::result_type intersection;
metalrt_intersect.force_opacity(metal::raytracing::forced_opacity::non_opaque);
bool triangle_only = !kernel_data.bvh.have_curves && !kernel_data.bvh.have_points;
if (triangle_only) {
metalrt_intersect.assume_geometry_type(metal::raytracing::geometry_type::triangle);
}
intersection = metalrt_intersect.intersect(
r, metal_ancillaries->accel_struct, 0xFF, ray->time, metal_ancillaries->ift_local, payload);
# else
metalrt_blas_intersector_type metalrt_intersect;
typename metalrt_blas_intersector_type::result_type intersection;
metalrt_intersect.force_opacity(metal::raytracing::forced_opacity::non_opaque);
bool triangle_only = !kernel_data.bvh.have_curves && !kernel_data.bvh.have_points;
if (triangle_only) {
metalrt_intersect.assume_geometry_type(metal::raytracing::geometry_type::triangle);
}
// if we know we are going to get max one hit, like for random-sss-walk we can
// optimize and accept the first hit
if (max_hits == 1) {
metalrt_intersect.accept_any_intersection(true);
}
int blas_index = metal_ancillaries->blas_userID_to_index_lookUp[local_object];
// transform the ray into object's local space
Transform itfm = kernel_data_fetch(objects, local_object).itfm;
r.origin = transform_point(&itfm, r.origin);
r.direction = transform_direction(&itfm, r.direction);
intersection = metalrt_intersect.intersect(
r, metal_ancillaries->accel_struct, 0xFF, metal_ancillaries->ift_local, payload);
r,
metal_ancillaries->blas_accel_structs[blas_index].blas,
metal_ancillaries->ift_local_prim,
payload);
# endif
if (lcg_state) {

View File

@@ -105,10 +105,11 @@ struct kernel_gpu_##name \
{ \
PARAMS_MAKER(__VA_ARGS__)(__VA_ARGS__) \
void run(thread MetalKernelContext& context, \
threadgroup int *simdgroup_offset, \
threadgroup atomic_int *threadgroup_array, \
const uint metal_global_id, \
const ushort metal_local_id, \
const ushort metal_local_size, \
const ushort metal_grid_id, \
uint simdgroup_size, \
uint simd_lane_index, \
uint simd_group_index, \
@@ -117,22 +118,24 @@ struct kernel_gpu_##name \
kernel void cycles_metal_##name(device const kernel_gpu_##name *params_struct, \
constant KernelParamsMetal &ccl_restrict _launch_params_metal, \
constant MetalAncillaries *_metal_ancillaries, \
threadgroup int *simdgroup_offset[[ threadgroup(0) ]], \
threadgroup atomic_int *threadgroup_array[[ threadgroup(0) ]], \
const uint metal_global_id [[thread_position_in_grid]], \
const ushort metal_local_id [[thread_position_in_threadgroup]], \
const ushort metal_local_size [[threads_per_threadgroup]], \
const ushort metal_grid_id [[threadgroup_position_in_grid]], \
uint simdgroup_size [[threads_per_simdgroup]], \
uint simd_lane_index [[thread_index_in_simdgroup]], \
uint simd_group_index [[simdgroup_index_in_threadgroup]], \
uint num_simd_groups [[simdgroups_per_threadgroup]]) { \
MetalKernelContext context(_launch_params_metal, _metal_ancillaries); \
params_struct->run(context, simdgroup_offset, metal_global_id, metal_local_id, metal_local_size, simdgroup_size, simd_lane_index, simd_group_index, num_simd_groups); \
params_struct->run(context, threadgroup_array, metal_global_id, metal_local_id, metal_local_size, metal_grid_id, simdgroup_size, simd_lane_index, simd_group_index, num_simd_groups); \
} \
void kernel_gpu_##name::run(thread MetalKernelContext& context, \
threadgroup int *simdgroup_offset, \
threadgroup atomic_int *threadgroup_array, \
const uint metal_global_id, \
const ushort metal_local_id, \
const ushort metal_local_size, \
const ushort metal_grid_id, \
uint simdgroup_size, \
uint simd_lane_index, \
uint simd_group_index, \
@@ -263,13 +266,25 @@ ccl_device_forceinline uchar4 make_uchar4(const uchar x,
# if defined(__METALRT_MOTION__)
# define METALRT_TAGS instancing, instance_motion, primitive_motion
# define METALRT_BLAS_TAGS , primitive_motion
# else
# define METALRT_TAGS instancing
# define METALRT_BLAS_TAGS
# endif /* __METALRT_MOTION__ */
typedef acceleration_structure<METALRT_TAGS> metalrt_as_type;
typedef intersection_function_table<triangle_data, METALRT_TAGS> metalrt_ift_type;
typedef metal::raytracing::intersector<triangle_data, METALRT_TAGS> metalrt_intersector_type;
# if defined(__METALRT_MOTION__)
typedef acceleration_structure<primitive_motion> metalrt_blas_as_type;
typedef intersection_function_table<triangle_data, primitive_motion> metalrt_blas_ift_type;
typedef metal::raytracing::intersector<triangle_data, primitive_motion>
metalrt_blas_intersector_type;
# else
typedef acceleration_structure<> metalrt_blas_as_type;
typedef intersection_function_table<triangle_data> metalrt_blas_ift_type;
typedef metal::raytracing::intersector<triangle_data> metalrt_blas_intersector_type;
# endif
#endif /* __METALRT__ */
@@ -282,6 +297,12 @@ struct Texture3DParamsMetal {
texture3d<float, access::sample> tex;
};
#ifdef __METALRT__
struct MetalRTBlasWrapper {
metalrt_blas_as_type blas;
};
#endif
struct MetalAncillaries {
device Texture2DParamsMetal *textures_2d;
device Texture3DParamsMetal *textures_3d;
@@ -291,6 +312,9 @@ struct MetalAncillaries {
metalrt_ift_type ift_default;
metalrt_ift_type ift_shadow;
metalrt_ift_type ift_local;
metalrt_blas_ift_type ift_local_prim;
constant MetalRTBlasWrapper *blas_accel_structs;
constant int *blas_userID_to_index_lookUp;
#endif
};

View File

@@ -139,6 +139,20 @@ TReturn metalrt_local_hit(constant KernelParamsMetal &launch_params_metal,
#endif
}
[[intersection(triangle, triangle_data )]] TriangleIntersectionResult
__anyhit__cycles_metalrt_local_hit_tri_prim(
constant KernelParamsMetal &launch_params_metal [[buffer(1)]],
ray_data MetalKernelContext::MetalRTIntersectionLocalPayload &payload [[payload]],
uint primitive_id [[primitive_id]],
float2 barycentrics [[barycentric_coord]],
float ray_tmax [[distance]])
{
//instance_id, aka the user_id has been removed. If we take this function we optimized the
//SSS for starting traversal from a primitive acceleration structure instead of the root of the global AS.
//this means we will always be intersecting the correct object no need for the userid to check
return metalrt_local_hit<TriangleIntersectionResult, METALRT_HIT_TRIANGLE>(
launch_params_metal, payload, payload.local_object, primitive_id, barycentrics, ray_tmax);
}
[[intersection(triangle, triangle_data, METALRT_TAGS)]] TriangleIntersectionResult
__anyhit__cycles_metalrt_local_hit_tri(
constant KernelParamsMetal &launch_params_metal [[buffer(1)]],
@@ -163,6 +177,17 @@ __anyhit__cycles_metalrt_local_hit_box(const float ray_tmax [[max_distance]])
return result;
}
[[intersection(bounding_box, triangle_data )]] BoundingBoxIntersectionResult
__anyhit__cycles_metalrt_local_hit_box_prim(const float ray_tmax [[max_distance]])
{
/* unused function */
BoundingBoxIntersectionResult result;
result.distance = ray_tmax;
result.accept = false;
result.continue_search = false;
return result;
}
template<uint intersection_type>
bool metalrt_shadow_all_hit(constant KernelParamsMetal &launch_params_metal,
ray_data MetalKernelContext::MetalRTIntersectionShadowPayload &payload,

View File

@@ -372,6 +372,16 @@ bool oneapi_enqueue_kernel(KernelContext *kernel_context,
kg, cgh, global_size, local_size, args, oneapi_kernel_integrator_sorted_paths_array);
break;
}
case DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS: {
oneapi_call(
kg, cgh, global_size, local_size, args, oneapi_kernel_integrator_sort_bucket_pass);
break;
}
case DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS: {
oneapi_call(
kg, cgh, global_size, local_size, args, oneapi_kernel_integrator_sort_write_pass);
break;
}
case DEVICE_KERNEL_INTEGRATOR_COMPACT_PATHS_ARRAY: {
oneapi_call(kg,
cgh,

View File

@@ -132,6 +132,9 @@ typedef struct IntegratorStateGPU {
/* Index of main path which will be used by a next shadow catcher split. */
ccl_global int *next_main_path_index;
/* Partition/key offsets used when writing sorted active indices. */
ccl_global int *sort_partition_key_offsets;
/* Divisor used to partition active indices by locality when sorting by material. */
uint sort_partition_divisor;
} IntegratorStateGPU;

View File

@@ -115,6 +115,13 @@ ccl_device_forceinline void integrator_path_init_sorted(KernelGlobals kg,
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], 1);
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
INTEGRATOR_STATE_WRITE(state, path, shader_sort_key) = key_;
# if defined(__KERNEL_LOCAL_ATOMIC_SORT__)
if (!kernel_integrator_state.sort_key_counter[next_kernel]) {
return;
}
# endif
atomic_fetch_and_add_uint32(&kernel_integrator_state.sort_key_counter[next_kernel][key_], 1);
}
@@ -130,6 +137,13 @@ ccl_device_forceinline void integrator_path_next_sorted(KernelGlobals kg,
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], 1);
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
INTEGRATOR_STATE_WRITE(state, path, shader_sort_key) = key_;
# if defined(__KERNEL_LOCAL_ATOMIC_SORT__)
if (!kernel_integrator_state.sort_key_counter[next_kernel]) {
return;
}
# endif
atomic_fetch_and_add_uint32(&kernel_integrator_state.sort_key_counter[next_kernel][key_], 1);
}

View File

@@ -209,14 +209,7 @@ ccl_device void osl_closure_microfacet_setup(KernelGlobals kg,
if (closure->distribution == make_string("ggx", 11253504724482777663ull) ||
closure->distribution == make_string("default", 4430693559278735917ull)) {
if (!closure->refract) {
if (closure->alpha_x == closure->alpha_y) {
/* Isotropic */
sd->flag |= bsdf_microfacet_ggx_isotropic_setup(bsdf);
}
else {
/* Anisotropic */
sd->flag |= bsdf_microfacet_ggx_setup(bsdf);
}
sd->flag |= bsdf_microfacet_ggx_setup(bsdf);
}
else {
sd->flag |= bsdf_microfacet_ggx_refraction_setup(bsdf);
@@ -225,14 +218,7 @@ ccl_device void osl_closure_microfacet_setup(KernelGlobals kg,
/* Beckmann */
else {
if (!closure->refract) {
if (closure->alpha_x == closure->alpha_y) {
/* Isotropic */
sd->flag |= bsdf_microfacet_beckmann_isotropic_setup(bsdf);
}
else {
/* Anisotropic */
sd->flag |= bsdf_microfacet_beckmann_setup(bsdf);
}
sd->flag |= bsdf_microfacet_beckmann_setup(bsdf);
}
else {
sd->flag |= bsdf_microfacet_beckmann_refraction_setup(bsdf);
@@ -258,9 +244,9 @@ ccl_device void osl_closure_microfacet_ggx_setup(
}
bsdf->N = ensure_valid_reflection(sd->Ng, sd->wi, closure->N);
bsdf->alpha_x = closure->alpha_x;
bsdf->alpha_x = bsdf->alpha_y = closure->alpha_x;
sd->flag |= bsdf_microfacet_ggx_isotropic_setup(bsdf);
sd->flag |= bsdf_microfacet_ggx_setup(bsdf);
}
ccl_device void osl_closure_microfacet_ggx_aniso_setup(
@@ -652,9 +638,9 @@ ccl_device void osl_closure_microfacet_beckmann_setup(
}
bsdf->N = ensure_valid_reflection(sd->Ng, sd->wi, closure->N);
bsdf->alpha_x = closure->alpha_x;
bsdf->alpha_x = bsdf->alpha_y = closure->alpha_x;
sd->flag |= bsdf_microfacet_beckmann_isotropic_setup(bsdf);
sd->flag |= bsdf_microfacet_beckmann_setup(bsdf);
}
ccl_device void osl_closure_microfacet_beckmann_aniso_setup(

View File

@@ -74,7 +74,8 @@ CCL_NAMESPACE_BEGIN
#define __VOLUME__
/* TODO: solve internal compiler errors and enable light tree on HIP. */
#ifdef __KERNEL_HIP__
/* TODO: solve internal compiler perf issue and enable light tree on Metal/AMD. */
#if defined(__KERNEL_HIP__) || defined(__KERNEL_METAL_AMD__)
# undef __LIGHT_TREE__
#endif
@@ -1508,6 +1509,8 @@ typedef enum DeviceKernel : int {
DEVICE_KERNEL_INTEGRATOR_ACTIVE_PATHS_ARRAY,
DEVICE_KERNEL_INTEGRATOR_TERMINATED_PATHS_ARRAY,
DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY,
DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS,
DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS,
DEVICE_KERNEL_INTEGRATOR_COMPACT_PATHS_ARRAY,
DEVICE_KERNEL_INTEGRATOR_COMPACT_STATES,
DEVICE_KERNEL_INTEGRATOR_TERMINATED_SHADOW_PATHS_ARRAY,

View File

@@ -13,6 +13,7 @@
#include "scene/light.h"
#include "scene/mesh.h"
#include "scene/object.h"
#include "scene/osl.h"
#include "scene/pointcloud.h"
#include "scene/scene.h"
#include "scene/shader.h"
@@ -25,7 +26,6 @@
#ifdef WITH_OSL
# include "kernel/osl/globals.h"
# include "kernel/osl/services.h"
#endif
#include "util/foreach.h"
@@ -1717,20 +1717,7 @@ void GeometryManager::device_update_displacement_images(Device *device,
/* If any OSL node is used for displacement, it may reference a texture. But it's
* unknown which ones, so have to load them all. */
if (has_osl_node) {
set<OSLRenderServices *> services_shared;
device->foreach_device([&services_shared](Device *sub_device) {
OSLGlobals *og = (OSLGlobals *)sub_device->get_cpu_osl_memory();
services_shared.insert(og->services);
});
for (OSLRenderServices *services : services_shared) {
for (auto it = services->textures.begin(); it != services->textures.end(); ++it) {
if (it->second->handle.get_manager() == image_manager) {
const int slot = it->second->handle.svm_slot();
bump_images.insert(slot);
}
}
}
OSLShaderManager::osl_image_slots(device, image_manager, bump_images);
}
#endif

View File

@@ -394,7 +394,7 @@ bool OSLShaderManager::osl_compile(const string &inputfile, const string &output
/* Compile.
*
* Mutex protected because the OSL compiler does not appear to be thread safe, see T92503. */
* Mutex protected because the OSL compiler does not appear to be thread safe, see #92503. */
static thread_mutex osl_compiler_mutex;
thread_scoped_lock lock(osl_compiler_mutex);
@@ -665,6 +665,27 @@ OSLNode *OSLShaderManager::osl_node(ShaderGraph *graph,
return node;
}
/* Static function, so only this file needs to be compile with RTTT. */
void OSLShaderManager::osl_image_slots(Device *device,
ImageManager *image_manager,
set<int> &image_slots)
{
set<OSLRenderServices *> services_shared;
device->foreach_device([&services_shared](Device *sub_device) {
OSLGlobals *og = (OSLGlobals *)sub_device->get_cpu_osl_memory();
services_shared.insert(og->services);
});
for (OSLRenderServices *services : services_shared) {
for (auto it = services->textures.begin(); it != services->textures.end(); ++it) {
if (it->second->handle.get_manager() == image_manager) {
const int slot = it->second->handle.svm_slot();
image_slots.insert(slot);
}
}
}
}
/* Graph Compiler */
OSLCompiler::OSLCompiler(OSLShaderManager *manager, OSL::ShadingSystem *ss, Scene *scene)

View File

@@ -92,6 +92,9 @@ class OSLShaderManager : public ShaderManager {
const std::string &bytecode_hash = "",
const std::string &bytecode = "");
/* Get image slots used by OSL services on device. */
static void osl_image_slots(Device *device, ImageManager *image_manager, set<int> &image_slots);
private:
void texture_system_init();
void texture_system_free();

View File

@@ -573,7 +573,7 @@ void ShaderManager::device_update_common(Device * /*device*/,
kfilm->is_rec709 = is_rec709;
}
void ShaderManager::device_free_common(Device *, DeviceScene *dscene, Scene *scene)
void ShaderManager::device_free_common(Device * /*device*/, DeviceScene *dscene, Scene * /*scene*/)
{
dscene->shaders.free();
}

View File

@@ -520,7 +520,7 @@ bool TileManager::write_tile(const RenderBuffers &tile_buffers)
/* If there is an overscan used for the tile copy pixels into single continuous block of memory
* without any "gaps".
* This is a workaround for bug in OIIO (https://github.com/OpenImageIO/oiio/pull/3176).
* Our task reference: T93008. */
* Our task reference: #93008. */
if (tile_params.window_x || tile_params.window_y ||
tile_params.window_width != tile_params.width ||
tile_params.window_height != tile_params.height) {

View File

@@ -73,16 +73,55 @@ ccl_device_inline float atomic_add_and_fetch_float(volatile ccl_global float *_s
return new_value.float_value;
}
# define atomic_fetch_and_add_uint32(p, x) \
atomic_fetch_add_explicit((device atomic_uint *)p, x, memory_order_relaxed)
# define atomic_fetch_and_sub_uint32(p, x) \
atomic_fetch_sub_explicit((device atomic_uint *)p, x, memory_order_relaxed)
# define atomic_fetch_and_inc_uint32(p) \
atomic_fetch_add_explicit((device atomic_uint *)p, 1, memory_order_relaxed)
# define atomic_fetch_and_dec_uint32(p) \
atomic_fetch_sub_explicit((device atomic_uint *)p, 1, memory_order_relaxed)
# define atomic_fetch_and_or_uint32(p, x) \
atomic_fetch_or_explicit((device atomic_uint *)p, x, memory_order_relaxed)
template<class T> ccl_device_inline uint32_t atomic_fetch_and_add_uint32(device T *p, int x)
{
return atomic_fetch_add_explicit((device atomic_uint *)p, x, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_sub_uint32(device T *p, int x)
{
return atomic_fetch_sub_explicit((device atomic_uint *)p, x, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_inc_uint32(device T *p)
{
return atomic_fetch_add_explicit((device atomic_uint *)p, 1, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_dec_uint32(device T *p)
{
return atomic_fetch_sub_explicit((device atomic_uint *)p, 1, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_or_uint32(device T *p, int x)
{
return atomic_fetch_or_explicit((device atomic_uint *)p, x, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_add_uint32(threadgroup T *p, int x)
{
return atomic_fetch_add_explicit((threadgroup atomic_uint *)p, x, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_sub_uint32(threadgroup T *p, int x)
{
return atomic_fetch_sub_explicit((threadgroup atomic_uint *)p, x, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_inc_uint32(threadgroup T *p)
{
return atomic_fetch_add_explicit((threadgroup atomic_uint *)p, 1, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_dec_uint32(threadgroup T *p)
{
return atomic_fetch_sub_explicit((threadgroup atomic_uint *)p, 1, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_or_uint32(threadgroup T *p, int x)
{
return atomic_fetch_or_explicit((threadgroup atomic_uint *)p, x, memory_order_relaxed);
}
ccl_device_inline float atomic_compare_and_swap_float(volatile ccl_global float *dest,
const float old_val,

View File

@@ -69,6 +69,9 @@ void DebugFlags::Metal::reset()
{
if (getenv("CYCLES_METAL_ADAPTIVE_COMPILE") != NULL)
adaptive_compile = true;
if (auto str = getenv("CYCLES_METAL_LOCAL_ATOMIC_SORT"))
use_local_atomic_sort = (atoi(str) != 0);
}
DebugFlags::OptiX::OptiX()

View File

@@ -97,6 +97,9 @@ class DebugFlags {
/* Whether adaptive feature based runtime compile is enabled or not. */
bool adaptive_compile = false;
/* Whether local atomic sorting is enabled or not. */
bool use_local_atomic_sort = true;
};
/* Get instance of debug flags registry. */

View File

@@ -74,7 +74,7 @@ ccl_device float fast_sinf(float x)
*
* Results on: [-2pi,2pi].
*
* Examined 2173837240 values of sin: 0.00662760244 avg ulp diff, 2 max ulp,
* Examined 2173837240 values of sin: 0.00662760244 avg ULP diff, 2 max ULP,
* 1.19209e-07 max error
*/
int q = fast_rint(x * M_1_PI_F);
@@ -256,11 +256,11 @@ ccl_device float fast_acosf(float x)
/* clamp and crush denormals. */
const float m = (f < 1.0f) ? 1.0f - (1.0f - f) : 1.0f;
/* Based on http://www.pouet.net/topic.php?which=9132&page=2
* 85% accurate (ulp 0)
* 85% accurate (ULP 0)
* Examined 2130706434 values of acos:
* 15.2000597 avg ulp diff, 4492 max ulp, 4.51803e-05 max error // without "denormal crush"
* 15.2000597 avg ULP diff, 4492 max ULP, 4.51803e-05 max error // without "denormal crush"
* Examined 2130706434 values of acos:
* 15.2007108 avg ulp diff, 4492 max ulp, 4.51803e-05 max error // with "denormal crush"
* 15.2007108 avg ULP diff, 4492 max ULP, 4.51803e-05 max error // with "denormal crush"
*/
const float a = sqrtf(1.0f - m) *
(1.5707963267f + m * (-0.213300989f + m * (0.077980478f + m * -0.02164095f)));
@@ -270,9 +270,8 @@ ccl_device float fast_acosf(float x)
ccl_device float fast_asinf(float x)
{
/* Based on acosf approximation above.
* Max error is 4.51133e-05 (ulps are higher because we are consistently off
* by a little amount).
*/
* Max error is 4.51133e-05 (ULPS are higher because we are consistently off
* by a little amount). */
const float f = fabsf(x);
/* Clamp and crush denormals. */
const float m = (f < 1.0f) ? 1.0f - (1.0f - f) : 1.0f;
@@ -290,9 +289,9 @@ ccl_device float fast_atanf(float x)
const float t = s * s;
/* http://mathforum.org/library/drmath/view/62672.html
* Examined 4278190080 values of atan:
* 2.36864877 avg ulp diff, 302 max ulp, 6.55651e-06 max error // (with denormals)
* 2.36864877 avg ULP diff, 302 max ULP, 6.55651e-06 max error // (with denormals)
* Examined 4278190080 values of atan:
* 171160502 avg ulp diff, 855638016 max ulp, 6.55651e-06 max error // (crush denormals)
* 171160502 avg ULP diff, 855638016 max ULP, 6.55651e-06 max error // (crush denormals)
*/
float r = s * madd(0.43157974f, t, 1.0f) / madd(madd(0.05831938f, t, 0.76443945f), t, 1.0f);
if (a > 1.0f) {
@@ -343,8 +342,8 @@ ccl_device float fast_log2f(float x)
int exponent = (int)(bits >> 23) - 127;
float f = __uint_as_float((bits & 0x007FFFFF) | 0x3f800000) - 1.0f;
/* Examined 2130706432 values of log2 on [1.17549435e-38,3.40282347e+38]:
* 0.0797524457 avg ulp diff, 3713596 max ulp, 7.62939e-06 max error.
* ulp histogram:
* 0.0797524457 avg ULP diff, 3713596 max ULP, 7.62939e-06 max error.
* ULP histogram:
* 0 = 97.46%
* 1 = 2.29%
* 2 = 0.11%
@@ -363,7 +362,7 @@ ccl_device float fast_log2f(float x)
ccl_device_inline float fast_logf(float x)
{
/* Examined 2130706432 values of logf on [1.17549435e-38,3.40282347e+38]:
* 0.313865375 avg ulp diff, 5148137 max ulp, 7.62939e-06 max error.
* 0.313865375 avg ULP diff, 5148137 max ULP, 7.62939e-06 max error.
*/
return fast_log2f(x) * M_LN2_F;
}
@@ -371,7 +370,7 @@ ccl_device_inline float fast_logf(float x)
ccl_device_inline float fast_log10(float x)
{
/* Examined 2130706432 values of log10f on [1.17549435e-38,3.40282347e+38]:
* 0.631237033 avg ulp diff, 4471615 max ulp, 3.8147e-06 max error.
* 0.631237033 avg ULP diff, 4471615 max ULP, 3.8147e-06 max error.
*/
return fast_log2f(x) * M_LN2_F / M_LN10_F;
}
@@ -392,12 +391,12 @@ ccl_device float fast_exp2f(float x)
/* Range reduction. */
int m = (int)x;
x -= m;
x = 1.0f - (1.0f - x); /* Crush denormals (does not affect max ulps!). */
x = 1.0f - (1.0f - x); /* Crush denormals (does not affect max ULPS!). */
/* 5th degree polynomial generated with sollya
* Examined 2247622658 values of exp2 on [-126,126]: 2.75764912 avg ulp diff,
* 232 max ulp.
* Examined 2247622658 values of exp2 on [-126,126]: 2.75764912 avg ULP diff,
* 232 max ULP.
*
* ulp histogram:
* ULP histogram:
* 0 = 87.81%
* 1 = 4.18%
*/
@@ -415,14 +414,14 @@ ccl_device float fast_exp2f(float x)
ccl_device_inline float fast_expf(float x)
{
/* Examined 2237485550 values of exp on [-87.3300018,87.3300018]:
* 2.6666452 avg ulp diff, 230 max ulp.
* 2.6666452 avg ULP diff, 230 max ULP.
*/
return fast_exp2f(x / M_LN2_F);
}
#if !defined(__KERNEL_GPU__) && !defined(_MSC_VER)
/* MSVC seems to have a code-gen bug here in at least SSE41/AVX, see
* T78047 and T78869 for details. Just disable for now, it only makes
* #78047 and #78869 for details. Just disable for now, it only makes
* a small difference in denoising performance. */
ccl_device float4 fast_exp2f4(float4 x)
{
@@ -454,7 +453,7 @@ ccl_device_inline float4 fast_expf4(float4 x)
ccl_device_inline float fast_exp10(float x)
{
/* Examined 2217701018 values of exp10 on [-37.9290009,37.9290009]:
* 2.71732409 avg ulp diff, 232 max ulp.
* 2.71732409 avg ULP diff, 232 max ULP.
*/
return fast_exp2f(x * M_LN10_F / M_LN2_F);
}
@@ -475,7 +474,7 @@ ccl_device float fast_sinhf(float x)
float a = fabsf(x);
if (a > 1.0f) {
/* Examined 53389559 values of sinh on [1,87.3300018]:
* 33.6886442 avg ulp diff, 178 max ulp. */
* 33.6886442 avg ULP diff, 178 max ULP. */
float e = fast_expf(a);
return copysignf(0.5f * e - 0.5f / e, x);
}
@@ -495,7 +494,7 @@ ccl_device float fast_sinhf(float x)
ccl_device_inline float fast_coshf(float x)
{
/* Examined 2237485550 values of cosh on [-87.3300018,87.3300018]:
* 1.78256726 avg ulp diff, 178 max ulp.
* 1.78256726 avg ULP diff, 178 max ULP.
*/
float e = fast_expf(fabsf(x));
return 0.5f * e + 0.5f / e;
@@ -506,7 +505,7 @@ ccl_device_inline float fast_tanhf(float x)
/* Examined 4278190080 values of tanh on [-3.40282347e+38,3.40282347e+38]:
* 3.12924e-06 max error.
*/
/* NOTE: ulp error is high because of sub-optimal handling around the origin. */
/* NOTE: ULP error is high because of sub-optimal handling around the origin. */
float e = fast_expf(2.0f * fabsf(x));
return copysignf(1.0f - 2.0f / (1.0f + e), x);
}
@@ -579,7 +578,7 @@ ccl_device_inline float fast_erfcf(float x)
{
/* Examined 2164260866 values of erfcf on [-4,4]: 1.90735e-06 max error.
*
* ulp histogram:
* ULP histogram:
*
* 0 = 80.30%
*/

View File

@@ -516,7 +516,7 @@ GHOST_TSuccess GHOST_ContextCGL::releaseNativeHandles()
/* OpenGL on Metal
*
* Use Metal layer to avoid Viewport lagging on macOS, see T60043. */
* Use Metal layer to avoid Viewport lagging on macOS, see #60043. */
static const MTLPixelFormat METAL_FRAMEBUFFERPIXEL_FORMAT = MTLPixelFormatBGRA8Unorm;
static const OSType METAL_CORE_VIDEO_PIXEL_FORMAT = kCVPixelFormatType_32BGRA;

View File

@@ -141,7 +141,7 @@ GHOST_TSuccess GHOST_ContextGLX::initializeDrawingContext()
/* -------------------------------------------------------------------- */
#else
/* Important to initialize only GLXEW (_not_ GLEW),
* since this breaks w/ Mesa's `swrast`, see: T46431. */
* since this breaks w/ Mesa's `swrast`, see: #46431. */
glxewInit();
#endif /* USE_GLXEW_INIT_WORKAROUND */

View File

@@ -302,7 +302,7 @@ bool GHOST_NDOFManager::setDevice(ushort vendor_id, ushort product_id)
switch (product_id) {
case 0xC62E: /* Plugged in. */
case 0xC62F: /* Wireless. */
case 0xC658: /* Wireless (3DConnexion Universal Wireless Receiver in WIN32), see T82412. */
case 0xC658: /* Wireless (3DConnexion Universal Wireless Receiver in WIN32), see #82412. */
{
device_type_ = NDOF_SpaceMouseWireless;
hid_map_button_num_ = 2;
@@ -341,7 +341,7 @@ bool GHOST_NDOFManager::setDevice(ushort vendor_id, ushort product_id)
hid_map_button_mask_ = int(~(UINT_MAX << hid_map_button_num_));
}
CLOG_INFO(LOG, 2, "%d buttons -> hex:%X", hid_map_button_num_, (uint)hid_map_button_mask_);
CLOG_INFO(LOG, 2, "%d buttons -> hex:%X", hid_map_button_num_, uint(hid_map_button_mask_));
return device_type_ != NDOF_UnknownDevice;
}
@@ -445,14 +445,14 @@ void GHOST_NDOFManager::updateButton(int button_number, bool press, uint64_t tim
2,
"button=%d, press=%d (out of range %d, ignoring!)",
button_number,
(int)press,
int(press),
hid_map_button_num_);
return;
}
const NDOF_ButtonT button = hid_map_[button_number];
if (button == NDOF_BUTTON_NONE) {
CLOG_INFO(
LOG, 2, "button=%d, press=%d (mapped to none, ignoring!)", button_number, (int)press);
LOG, 2, "button=%d, press=%d (mapped to none, ignoring!)", button_number, int(press));
return;
}
@@ -460,7 +460,7 @@ void GHOST_NDOFManager::updateButton(int button_number, bool press, uint64_t tim
2,
"button=%d, press=%d, name=%s",
button_number,
(int)press,
int(press),
ndof_button_names[button]);
GHOST_IWindow *window = system_.getWindowManager()->getActiveWindow();

Some files were not shown because too many files have changed in this diff Show More