This was set to maxcpu, which on an 8-core box would be 8; each project would then spawn 8 instances of cl.exe, making for up to 64 simultaneously running compiler instances and slowing the compile down instead of speeding it up.
The code initialized allocated memory by assigning KernelGlobals to it. However, that calls the assignment operator, which frees previous elements that were never allocated.
We unfortunately cannot fix this for previous versions of Blender, but at least the issue (Blender crashing on unknown IDProp types) should now be addressed for the future.
Unknown IDProp types are simply reset to the integer type, and their value reset to zero.
Previously, every RenderPass would have a bitfield that specified its type. That limits the number of passes to 32, which was reached a while ago.
However, most of the code already supported arbitrary RenderPasses since they were also used to store Multilayer EXR images.
Therefore, this commit completely removes the passflag from RenderPass and changes all code to use the unique pass name for identification.
Since Blender Internal relies on hardcoded passes, and to preserve compatibility, 32 pass names are reserved for the old hardcoded passes.
To support these arbitrary passes, the Render Result compositor node now adds dynamic sockets. For compatibility, the old hardcoded sockets are always stored and just hidden when the corresponding pass isn't available.
To use these changes, the Render Engine API now includes a function that allows render engines to add arbitrary passes to the render result. To be able to add options for these passes, addons can now add their own properties to SceneRenderLayers.
To keep the compositor input node updated, render engine plugins have to implement a callback that registers all the passes that will be generated.
From a user perspective, nothing should change with this commit.
Differential Revision: https://developer.blender.org/D2443
Differential Revision: https://developer.blender.org/D2444
Reduce thread divergence in kernel_shader_eval.
Rays are sorted in blocks of 2048 according to shader->id.
On R9 290 Classroom is ~30% faster, and Pabellon Barcelone is ~8% faster.
No sorting for CUDA split kernel.
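For reference, a sketch of the sorting idea in Python (illustrative only; the actual implementation runs inside the OpenCL split kernel on device arrays):

```python
BLOCK_SIZE = 2048  # rays per sorting block, matching the commit above

def sort_rays_by_shader(ray_indices, shader_ids):
    # Group rays that hit the same shader so that neighboring GPU
    # threads execute the same shader code (less divergence).
    for start in range(0, len(ray_indices), BLOCK_SIZE):
        block = ray_indices[start:start + BLOCK_SIZE]
        block.sort(key=lambda i: shader_ids[i])
        ray_indices[start:start + BLOCK_SIZE] = block
    return ray_indices
```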
Reviewers: sergey, maiself
Reviewed By: maiself
Differential Revision: https://developer.blender.org/D2598
Previously the logic was different for duplis and regular objects: regular objects were using render visibility when the Render Layer option is enabled, while duplis were always using viewport visibility when rendering from the viewport.
This was quite confusing because it caused different results in viewport and render when artists were expecting them to match 1:1.
This implements branched path tracing for the split kernel.
General approach is to store the ray state at a branch point, trace the
branched ray as normal, then restore the state as necessary before iterating
to the next part of the path. A state machine is used to advance the indirect
loop state, which avoids the need to add any new kernels. Each iteration the
state machine recreates as much state as possible from the stored ray to keep
overall storage down.
It's kind of hard to keep all the different integration loops in sync, so this needs lots of testing to make sure everything is working correctly. We should probably start trying to deduplicate the integration loops more now.
Non-branched BMW is ~2% slower, while Classroom is ~2% faster; other scenes could use more testing still.
Reviewers: sergey, nirved
Reviewed By: nirved
Subscribers: Blendify, bliblubli
Differential Revision: https://developer.blender.org/D2611
Negative scale on the camera is a nice trick to invert the rendered image on one axis at no extra CPU cost. It was implemented in the Decklink branch, but I introduced a typo when porting it to master. It is now fixed.
The change was initially needed for the Blender 2.8 branch, but the actual function was reverted there. So there is no reason to keep a dead, unused placeholder in the dependency graph.
This reverts commit fd69ba2255.
Avoid calculating a new split-index when re-fitting.
While checking if a knot can be removed, the index with the highest error
can be used as a candidate to replace the knot
(in the case it can't be removed).
Not sure if this is a proper fix, but I was getting frequent crashes, so committing this real quick just to make master stable again. Can be reverted later if there's a better fix. The changes to images really need a closer look...
Hi guys,
as one of my clients needs the possibility to have custom menu entries in the general right-click menu (all over Blender: in the node editor, properties, toolbars, ...) I talked with Campbell about expanding our hard-coded menu a bit. This is the outcome. As I only need those two, I currently support a button_prop and a button_pointer.
{F540397}
I tested the changes with a custom script where I added a custom entry and executed an operator on click - it seems to work exactly how it's intended to. The script: {F540435}
As I'm not too experienced with RNA stuff I would really appreciate any review.
Thanks very much to Campbell for his open ears & help on this issue!
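A minimal sketch of how an addon might hook into this, assuming the menu is exposed to Python as WM_MT_button_context and that button_prop/button_pointer are available on the context while the menu draws:

```python
import bpy
from bpy.types import Menu

# The menu class itself; addons extend it via append()/prepend().
class WM_MT_button_context(Menu):
    bl_label = "Unused"

    def draw(self, context):
        pass

def draw_custom_entry(self, context):
    layout = self.layout
    layout.separator()
    # context.button_prop / context.button_pointer describe the button
    # that was right-clicked; here we simply add an operator entry.
    layout.operator("wm.url_open", text="Open Manual").url = "https://docs.blender.org"

def register():
    bpy.utils.register_class(WM_MT_button_context)
    bpy.types.WM_MT_button_context.append(draw_custom_entry)
```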
Reviewers: campbellbarton, mont29
Reviewed By: campbellbarton, mont29
Subscribers: sybren, mont29
Tags: #addons
Differential Revision: https://developer.blender.org/D2612
The previous fix did not work for mixed textures. This one will over-allocate the information array, but that's better than not being able to render at all.
Some more cleanup and improvement is coming.
This patch allows for an unlimited number of textures in Cycles where the hardware allows. It replaces a number of static arrays with dynamic arrays and changes the way the flat_slot indices are calculated. Eventually, I'd like to get to a point where there are only flat slots left and textures of all kinds are stored in a single array.
Note that the arrays in DeviceScene are changed from containing device_vector<T> objects to device_vector<T>* pointers. Ideally, I'd like to store objects, but dynamic resizing of a std::vector in pre-C++11 calls the copy constructor, which for a good reason is not implemented for device_vector. Once we require C++11 for Cycles builds, we can implement a move constructor for device_vector and store objects again.
The limits for CUDA Fermi hardware still apply.
Reviewers: tod_baudais, InsigMathK, dingto, #cycles
Reviewed By: dingto, #cycles
Subscribers: dingto, smellslikedonkey
Differential Revision: https://developer.blender.org/D2650
Previously canceling a render done by the split kernel could cause artifacts
such as very bright or dark tiles. This was caused by unfinished samples
being included in the output buffer. To avoid this we now wait till all the
currently rendering samples have finished, up to a limit of twice the
expected time for them to finish (currently this is no more than 20 seconds, but usually it's much less). If samples still haven't finished by then, we stop anyway in case there's an endless loop occurring.
It was totally unclear whether the device is enabled or disabled.
Lots of people got fully lost in the current interface.
While the solution is not fully ideal, it at least solves the ambiguity in the interface.
Simple child hairs don't have a face index number assigned, so the
call to dm->getTessFaceData(dm, num, CD_MFACE) would cause a crash. To
work around this, UV and normal vectors are copied from the parent
hair.
I've also removed an unnecessary call to dm->getTessFaceArray(dm);
Reviewers: kevindietrich
Differential Revision: https://developer.blender.org/D2638
The function doesn't return whether the object is a shape at all, since
it also returns true for camera objects (and soon also for empties). It
returns true when objects of this type can be exported to Alembic at all.
This is now reflected in the name.
This works around long-outstanding issue T50176 with Cycles on MSVC 2015/x86. The root cause is still unknown though; it feels like a game of whack-a-mole.
Reviewers: sergey, dingto
Subscribers: Blendify
Tags: #cycles
Differential Revision: https://developer.blender.org/D2573
Using -cl-fast-relaxed-math assumes no NaN/Inf values in any expression.
This causes problems on overflow, division by zero, square root of negative number.
Comparisons with NaN or infinite value are affected as well.
This patch causes <2% slowdown on benchmark scenes.
Fix T50985: Rendering volume scatter with GPU OpenCL comes to a halt after a few seconds
HDF5 Alembic files are not officially supported by Blender. With this
commit, the HDF5 format is detected even when Blender is compiled without
HDF5 support, and the user is given an explanatory error message (rather than the generic "Could not open Alembic archive for reading").
The final goal is to make vectorized types much easier to maintain. The previous design had the following issues:
- Having all types and method implementations in one file made the source rather bloated and unfun to navigate.
- It was not possible to quickly glance at the available API for the type you are interested in.
- Adding more vectorization types would bloat the file even more, making things even trickier to follow.
Fixes performance issues of the C++ one with Windows MSVC debug builds...
Merely a translation of the msgfmt.cc code by @sergey, using BLI libs instead of C++'s stdlib.
Reviewers: sergey, campbellbarton, LazyDodo
Subscribers: sergey
Differential Revision: https://developer.blender.org/D2605
QtKit was removed in macOS Sierra, this patch disables WITH_CODEC_QUICKTIME
in Sierra and greater versions of macOS.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2645
By mistake, the code relied on ALEMBIC_ROOT_DIR being defined by the user
running the tests. Now CMake macros are used to correctly find the Alembic
root directory.
Matrix.decompose() should either return "location, orientation, size" or
"translation, rotation, scale". Since there are constructors for the former,
I've replaced "location" in the documentation with "translation".
The code is still the same, I just changed the documentation.
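For reference, the matching usage (mathutils API):

```python
from mathutils import Matrix

mat = Matrix.Translation((1.0, 2.0, 3.0))
translation, rotation, scale = mat.decompose()
# translation is a Vector, rotation a Quaternion, scale a Vector,
# matching the Matrix.Translation/Rotation/Scale constructors.
```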
The idea is to have some generic BSDF node which is subclassed by
"regular" BSDF nodes and uber shaders.
This way we can access special type and closure type for making
decisions somewhere else.
No longer passing time as float and constructing ISampleSelectors all
over the place. Instead, just construct an ISampleSelector once and
pass it along.
It is disabled by default, so should not affect existing configurations.
The main benefits of this are:
- Linux distros can use this to avoid library duplication and link the blender package against the gflags package from the system.
- It is easier to test whether Blender works with an updated version of Gflags prior to re-bundling the library.
This switches the internal color representation of the eye dropper from display space to linear. Any time a linear color is requested and the color is picked from a linear object, the result is now precise to the bit as the color gets patched through directly. Color space conversion now only happens when a color is picked from non-linear display space objects or when the color is requested to be returned in non-linear space.
In addition, this patch changes the DifferenceMatte node to interpret a tolerance of 0.0 as accepting colors that are identical bit by bit, as opposed to simply refusing all colors.
Replaced some STREQ(snode->tree_idname, ...) calls with ED_node_is_*() calls for improved readability, and fixed one case where the STREQ was used the wrong way.
That's a quick hack to address that specific case. The new pointer IDProp actually highlights a generic problem - datablocks using themselves - which is not really handled by current code; would consider this a not-so-urgent TODO though.
On GPU architectures, storing the design row in local memory improves performance due to lower global memory bandwidth requirements.
However, if the GPU doesn't have enough local memory available, occupancy suffers which makes it even slower than the global memory version.
On CUDA, the amount of available local memory (shared memory in CUDA terminology) can be controlled, but that's not possible on OpenCL. So, to avoid a huge performance hit when the local memory isn't enough, it's disabled on OpenCL.
The ABC_export and ABC_import functions both take an as_background_job parameter, and return a boolean.
When as_background_job=true, returns false immediately after scheduling
a background job. This was the old behaviour of this function, which makes
it very hard for scripts to do something with the data after the import
or export completes.
When as_background_job=false, performs the export synchronously, and
returns true when the export was ok, and false if there were any errors.
This allows further processing.
The Scene.alembic_export() function is deprecated, and will be removed from
Blender 2.8 in favour of calling the bpy.ops.wm.alembic_export() operator.
As such, it has been hard-coded to the old background job behaviour.
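A short sketch of the new synchronous usage from Python (operator name and parameter as described above; the filepath is illustrative):

```python
import bpy

# Blocks until the export completes; returns {'FINISHED'} on success.
result = bpy.ops.wm.alembic_export(
    filepath="/tmp/scene.abc",
    as_background_job=False,
)
if result == {'FINISHED'}:
    print("Alembic export done; safe to post-process the file now.")
```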
The export is still slower than needed, as the particle systems themselves
aren't disabled during the export. It's only the writing to the Alembic
file that's skipped.
We could not edit them, but we could still delete them, which makes no sense: API-defined properties are similar to class members, and removing them from single instances is pure garbage. And it was broken anyway.
Found by @a.romanov while checking on T51198, thanks.
Curve resolution isn't natively supported by Alembic, hence it is stored
in a user property "blender:resolution". I've looked at a Maya curves
example file, but that also didn't contain any information about curve
resolution.
Differential Revision: https://developer.blender.org/D2634
Reviewers: kevindietrich
The order number written to Alembic is the same as we use in memory, so
the +1 wasn't needed, at least according to the reference Maya exporter
maya/AbcExport/MayaNurbsCurveWriter.cpp, function
MayaNurbsCurveWriter::write(), in the Alembic source code.
Furthermore, when writing an array of nurb orders, the curve type should
be set to kVariableOrder, otherwise the importer will ignore it.
commit 90778901c9
Merge: 76eebd9 3bf0026
Author: Schoen <schoepas@deher1m1598.emea.adsint.biz>
Date: Mon Apr 3 07:52:05 2017 +0200
Merge branch 'master' into cycles_disney_brdf
commit 76eebd9379
Author: Schoen <schoepas@deher1m1598.emea.adsint.biz>
Date: Thu Mar 30 15:34:20 2017 +0200
Updated copyright for the new files.
commit 013f4a152a
Author: Schoen <schoepas@deher1m1598.emea.adsint.biz>
Date: Thu Mar 30 15:32:55 2017 +0200
Switched from multiplication of base and subsurface color to blending
between them using the subsurface parameter.
commit 482ec5d1f2
Author: Schoen <schoepas@deher1m1598.emea.adsint.biz>
Date: Mon Mar 13 15:47:12 2017 +0100
Fixed a bug that caused an additional white diffuse closure call when using
path tracing.
commit 26e906d162
Merge: 0593b8c 223aff9
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Feb 6 11:32:31 2017 +0100
Merge branch 'master' into cycles_disney_brdf
commit 0593b8c51b
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Feb 6 11:30:36 2017 +0100
Fixed the broken GLSL shader and implemented the Disney BRDF in the
real-time view port.
commit 8c7e11423b
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Fri Feb 3 14:24:05 2017 +0100
Fix to comply with strict compiler flags and some code cleanup
commit 17724e9d2d
Merge: 379ba34 520afa2
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Jan 24 09:59:58 2017 +0100
Merge branch 'master' into cycles_disney_brdf
commit 379ba346b0
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Jan 24 09:28:56 2017 +0100
Renamed the Disney BSDF to Principled BSDF.
commit f80dcb4f34
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Fri Dec 2 13:55:12 2016 +0100
Removed reflection call when roughness is low because of artifacts.
commit 732db8a57f
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Wed Nov 16 09:22:25 2016 +0100
Whether to use fresnel is now indicated via the type of the BSDF.
commit 0103659f5e
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Fri Nov 11 13:04:11 2016 +0100
Fixed an error in the clearcoat where it appeared too bright for default
light sources (like directional lights)
commit 0aa68f5335
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Nov 7 12:04:38 2016 +0100
Resolved inconsistencies in using tabs and spaces
commit f5897a9494
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Nov 7 08:13:41 2016 +0100
Improved the clearcoat part by using GTR1 instead of GTR2
commit 3dfc240e61
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Oct 31 11:31:36 2016 +0100
Use reflection BSDF for glossy reflections when roughness is 0.0 to
reduce computational expense and some code cleanup
Code cleanup includes:
- Code style cleanup and removed unused code
- Consolidated code in the bsdf_microfacet_multi_impl.h to reduce
some computational expense
commit a2dd0c5faf
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Wed Oct 26 08:51:10 2016 +0200
Fixed glossy reflections and refractions for low roughness values and
cleaned up the code.
For low roughness values, the reflections had some strange behavior.
commit 9817375912
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Oct 25 12:37:40 2016 +0200
Removed default values in setup functions and added extra functions for
GGX with fresnel.
commit bbc5d9d452
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Oct 25 11:09:36 2016 +0200
Switched from uniform to cosine hemisphere sampling for the diffuse and
the sheen part.
commit d52d8f2813
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Oct 24 16:17:13 2016 +0200
Removed the color parameters from the diffuse and sheen shader and use
them as closure weights instead.
commit 8f3d927385
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Oct 24 09:57:06 2016 +0200
Fixed the issue with artifacts when using anisotropy without linking the
tangent input to a tangent node.
commit d93f680db9
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Oct 24 09:14:51 2016 +0200
Added subsurface radius parameter to control the per-color-channel effective radius of the subsurface scattering.
commit c708c3e53b
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Oct 24 08:14:10 2016 +0200
Rearranged the inputs of the shader.
commit dfbfff9c38
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Fri Oct 21 09:27:05 2016 +0200
Put spaces in the parameter names of the shader node
commit e5a748ced1
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Fri Oct 21 08:51:20 2016 +0200
Removed code that isn't in use anymore
commit 75992bebc1
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Fri Oct 21 08:50:07 2016 +0200
Code style cleanup
commit 4dfcf455f7
Merge: 243a0e3 2cd6a89
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Thu Oct 20 10:41:50 2016 +0200
Merge branch 'master' into cycles_disney_brdf
commit 243a0e3eb8
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Thu Oct 20 10:01:45 2016 +0200
Switching between OSL and SVM is more consistent now when using the Disney BSDF.
There were some minor differences in the OSL implementation, e.g. the
refraction roughness was missing.
commit 2a5ac50922
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Sep 27 09:17:57 2016 +0200
Fixed a bug that caused transparency to be always white when using OSL and
selecting GGX as distribution of the Disney BSDF
commit e1fa862391
Merge: d0530a8 7f76f6f
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Sep 27 08:59:32 2016 +0200
Merge branch 'master' into cycles_disney_brdf
commit d0530a8af0
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Sep 27 08:53:18 2016 +0200
Cleanup the Disney BSDF implementation and removing unneeded files.
commit 3f4fc826bd
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Sep 27 08:36:07 2016 +0200
Unified the OSL implementation of the Disney clearcoat as a simple
microfacet shader like it was previously done in SVM
commit 4d3a0032ec
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Sep 26 12:35:36 2016 +0200
Enhanced performance for Disney materials without subsurface scattering
commit 3cd5eb56cf
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Fri Sep 16 08:47:56 2016 +0200
Fixed a bug in the Disney BSDF that caused specular reflections to be too
bright and diffuse is now reacting to the roughness again
- A normalization for the fresnel was missing which caused the specular
reflections to become too bright for the single-scatter GGX
- The roughness value for the diffuse BSSRDF part has always been
overwritten and thus always 0
- Also the performance for refractive materials with roughness=0.0 has
been improved
commit 7cb37d7119
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Thu Sep 8 12:24:43 2016 +0200
Added selection field to the Disney BSDF node for switching between
"Multiscatter GGX" and "GGX"
In the "GGX" mode there is an additional parameter for changing the
refraction roughness for materials with smooth surfaces and rough interns
(e.g. honey). With the "Multiscatter GGX" this effect can't be produced at
the moment and so here will be no separation of the two roughness values.
commit cdd29d06bb
Merge: 02c315a b40d1c1
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Sep 6 15:59:05 2016 +0200
Merge branch 'master' into cycles_disney_brdf
commit 02c315aeb0
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Sep 6 15:16:09 2016 +0200
Implemented the OSL part of the Disney shader
commit 5f880293ae
Merge: 630b80e b399a6d
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Fri Sep 2 10:53:36 2016 +0200
Merge branch 'master' into cycles_disney_brdf
commit 630b80e08b
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Fri Sep 2 10:52:13 2016 +0200
Fresnel in the microfacet multiscatter implementation improved
commit 0d9f4d7acb
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Fri Aug 26 11:11:05 2016 +0200
Fixed refraction roughness problem (refractions were always 100% rough)
and set IOR of clearcoat to 1.5
commit 9eed34c7d9
Merge: ef29aae ae475e3
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Aug 16 15:22:32 2016 +0200
Merge branch 'master' into cycles_disney_brdf
commit ef29aaee1a
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Aug 16 15:17:12 2016 +0200
Implemented the fresnel in the multi-scatter GGX for the Disney BSDF
- The specular/metallic part uses the multi-scatter GGX
- The fresnel of the metallic part is controlled by the specular value
- The color of the reflection part when using transparency can be
controlled by the specularTint value
commit 88567af085
Merge: cc267e5 285e082
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Wed Aug 3 15:05:09 2016 +0200
Merge branch 'master' into cycles_disney_brdf
commit cc267e52f2
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Wed Aug 3 15:00:25 2016 +0200
Implemented the Disney clearcoat as a variation of the microfacet bsdf,
removed the transparency roughness again and added an input for
anisotropic rotations
commit 81f6c06b1f
Merge: ece5a08 7065022
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Wed Aug 3 11:42:02 2016 +0200
Merge branch 'master' into cycles_disney_brdf
commit ece5a08e0d
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Jul 26 16:29:21 2016 +0200
Base color now applied again to the refraction of transparent Disney
materials
commit e3aff6849e
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Jul 26 16:05:19 2016 +0200
Added subsurface color parameter to the Disney shader
commit b3ca6d8a2f
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Jul 26 12:30:25 2016 +0200
Improvement of the SSS in the Disney shader
* Now the bump normal is correctly used for the SSS.
* SSS in Disney uses the Disney diffuse shader
commit d68729300e
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Jul 26 12:23:13 2016 +0200
Better calculation of the Disney diffuse part
Now the values for NdotL and NdotV are clamped to 0.0f for a better look when using normal maps.
commit cb6e500b12
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Jul 25 16:26:42 2016 +0200
Now one can disable specular reflections again by setting specular and metallic to 0 (this got broken in the previous commit)
commit bfb9cb11b5
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Jul 25 16:11:07 2016 +0200
fixed the Disney SSS and cleaned the initialization of the Disney shaders
commit 642c0fdad1
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Jul 25 16:09:55 2016 +0200
fixed an error that was caused by the missing LABEL_REFLECT in the Disney
diffuse shader
commit c10b484dca
Author: Jens Verwiebe <info@jensverwiebe.de>
Date: Fri Jul 22 01:15:21 2016 +0200
Rollback attempt to fix sss crashing, it prevented crash by disabling sss completely, thus useless
commit 462bba3f97
Author: Jens Verwiebe <info@jensverwiebe.de>
Date: Thu Jul 21 23:11:59 2016 +0200
Add an undef for sc_next for safety
commit 32d348577d
Author: Jens Verwiebe <info@jensverwiebe.de>
Date: Thu Jul 21 00:15:48 2016 +0200
Attempt to fix Disney SSS
commit dbad91ca6d
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Wed Jul 20 11:13:00 2016 +0200
Added a roughness parameter for refractions (for scattering of the rays
within an object)
With this, one can create a translucent material with a smooth surface and
with a milky look.
The final refraction roughness has to be calculated using the surface
roughness and the refraction roughness because those two are correlated
for refractions. If a ray hits a rough surface of a translucent material,
it is scattered while entering the surface. Then it is scattered further
within the object. The calculation I'm using is the following:
RefrRoughnessFinal = 1.0 - (1.0 - Roughness) * (1.0 - RefrRoughness)
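As a plain function (illustrative, directly mirroring the formula above):

```python
def final_refraction_roughness(roughness, refraction_roughness):
    # Combines surface and refraction roughness; either one alone
    # already makes the refraction rough, hence the complement product.
    return 1.0 - (1.0 - roughness) * (1.0 - refraction_roughness)
```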
commit 50ea5e3e34
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue Jun 7 10:24:50 2016 +0200
Disney BSDF is now supporting CUDA
commit 10974cc826
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue May 31 11:18:07 2016 +0200
Added parameters IOR and Transparency for refractions
With this, the Disney BRDF/BSSRDF is extended by the BTDF part.
commit 218202c090
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon May 30 15:08:18 2016 +0200
Added an additional normal for the clearcoat
With this normal one can simulate a thin layer of clearcoat by applying a
smoother normal map than the original to this input
commit dd139ead7e
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon May 30 12:40:56 2016 +0200
Switched to the improved subsurface scattering from Christensen and
Burley
commit 11160fa4e1
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon May 30 10:16:30 2016 +0200
Added Disney Sheen shader as a preparation to get to a BSSRDF
commit cee4fe0cc9
Merge: 4f955d0 6b5bab6
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon May 30 09:08:09 2016 +0200
Merge branch 'cycles_disney_brdf' of git.blender.org:blender into cycles_disney_brdf
Conflicts:
intern/cycles/kernel/closure/bsdf_disney_clearcoat.h
intern/cycles/kernel/closure/bsdf_disney_diffuse.h
intern/cycles/kernel/closure/bsdf_disney_specular.h
intern/cycles/kernel/closure/bsdf_util.h
intern/cycles/kernel/osl/CMakeLists.txt
intern/cycles/kernel/osl/bsdf_disney_clearcoat.cpp
intern/cycles/kernel/osl/bsdf_disney_diffuse.cpp
intern/cycles/kernel/osl/bsdf_disney_specular.cpp
intern/cycles/kernel/osl/osl_closures.h
intern/cycles/kernel/shaders/node_disney_bsdf.osl
intern/cycles/render/nodes.cpp
intern/cycles/render/nodes.h
commit 4f955d0523
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue May 24 16:38:23 2016 +0200
SVM and OSL are both working for the simple version of the Disney BRDF
commit 1f5c41874b
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue May 24 09:58:50 2016 +0200
Disney node can be used without SVM and started to cleanup the OSL implementation
There is still some wrong behavior in SVM for the Schlick Fresnel part of the specular and clearcoat
commit d4b814e930
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Wed May 18 10:22:29 2016 +0200
Switched from a parameter struct for Disney parameters to ShaderClosure params
commit b86a1f5ba5
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Wed May 18 10:19:57 2016 +0200
Added additional variables for storing parameters in the ShaderClosure struct
commit 585b886236
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue May 17 12:03:17 2016 +0200
added output parameter to the DisneyBsdfNode
That has been forgotten after removing the inheritance of BsdfNode
commit f91a286398
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue May 17 10:40:48 2016 +0200
removed BsdfNode class inheritance for DisneyBsdfNode
That's due to a naming difference. The Disney BSDF uses the name 'Base Color'
while the BsdfNode had a 'Color' input. That caused a text message to be
printed while rendering.
commit 30da91c9c5
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Wed May 4 16:08:10 2016 +0200
disney implementation cleaned
commit 30d41da0f0
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Wed May 4 13:23:07 2016 +0200
added the disney brdf as a shader node
commit 1f099fce24
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue May 3 16:54:49 2016 +0200
added clearcoat implementation
commit 00a1378b98
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Fri Apr 29 22:56:49 2016 +0200
disney diffuse and specular implemented
commit 6baa7a7eb7
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Apr 18 15:21:32 2016 +0200
disney diffuse is working correctly
commit d8fa169bf3
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Apr 18 08:41:53 2016 +0200
added vessel for disney diffuse shader
commit 6b5bab6cec
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Wed May 18 10:22:29 2016 +0200
Switched from a parameter struct for Disney parameters to ShaderClosure params
commit f6499c2676
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Wed May 18 10:19:57 2016 +0200
Added additional variables for storing parameters in the ShaderClosure struct
commit 7100640b65
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue May 17 12:03:17 2016 +0200
added output parameter to the DisneyBsdfNode
That has been forgotten after removing the inheritance of BsdfNode
commit 419ee54411
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue May 17 10:40:48 2016 +0200
removed BsdfNode class inheritance for DisneyBsdfNode
That's due to a naming difference. The Disney BSDF uses the name 'Base Color'
while the BsdfNode had a 'Color' input. That caused a text message to be
printed while rendering.
commit 6006f91e87
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Wed May 4 16:08:10 2016 +0200
disney implementation cleaned
commit 0ed0895914
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Wed May 4 13:23:07 2016 +0200
added the disney brdf as a shader node
commit 0630b742d7
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Tue May 3 16:54:49 2016 +0200
added clearcoat implementation
commit 9f3d39744b
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Fri Apr 29 22:56:49 2016 +0200
disney diffuse and specular implemented
commit 9b26206376
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Apr 18 15:21:32 2016 +0200
disney diffuse is working correctly
commit 4711a3927d
Author: Pascal Schoen <pascal_schoen@gmx.net>
Date: Mon Apr 18 08:41:53 2016 +0200
added vessel for disney diffuse shader
Differential Revision: https://developer.blender.org/D2313
It is possible to use surface-only nodes and connect them to the volume output. If there was something connected to the surface output, those extra connections will not change anything visually but will force volume features to be included into feature-adaptive kernels.
In fact, this exact reason seems to be causing the slowdown of the Barcelone file when comparing AMD OpenCL to NVidia CUDA.
Currently this is only supported for final F12 renders because of the current design of what gets optimized out when, and how the feature-adaptive kernel accesses the list of required features.
Reviewers: dingto, nirved, maiself, lukasstockner97, brecht
Reviewed By: brecht
Subscribers: bliblubli
Differential Revision: https://developer.blender.org/D2569
Remapping an ID to itself is nonsense here (it was actually triggering an assert in BKE_library code); just bail out early in the RNA callback in that case.
Sanitize a bit how the cache path is handled by fluidsim (there is much more to be done here though :( ), and forbid empty paths (we reset to the default path relative to the current .blend file in case it's empty).
If people really, really want to use the current OS-wise directory, they can at least use '.' as the path. ;)
When the options were moved to toolsettings, this part was missing. The problem was not the pointer, as suggested in D2629.
Thanks to Arvīds Kokins for his help fixing this bug
This avoids the unnecessary creation of bvhtree, which can be highly inefficient in some cases
(for example: in the `operator_modal_view3d_raycast.py` template)
Unfortunately this does break compatibility in that the viewport will look a
bit different depending on the settings, but the old behavior was simply not
usable for higher distances.
Object Info node can be useful to give some variation to a single material assigned to multiple instances. This patch adds support for Viewport and BI.
{F499530}
Example: {F499528}
Reviewers: merwin, brecht, dfelinto
Reviewed By: brecht
Subscribers: duarteframos, fclem, homyachetser, Evgeny_Rodygin, AlexKowel, yurikovelenov
Differential Revision: https://developer.blender.org/D2425
The U-resolution of the imported curves was kept at the default value
of 12, which is way too high for imported hair. We export hair at a
fairly high resolution already, so it's not needed to subdivide even
further when importing.
Of course this may have an impact on other curves that do require this
U-resolution to be higher. In that case the resolution can be
increased after importing.
I removed the default nu->orderu = num_verts, as that allowed every
point to influence the entire spline, which was more expensive for the
CPU, and unlikely to be needed. The orderu computations had off-by-one
errors in the curve importer, which are now also fixed. The correct
values are:
- Linear: orderu = 2
- Quadratic: orderu = 3
- Cubic: orderu = 4
These values are also what is stored in the Alembic file for curves of
type kVariableOrder, according to the reference Maya exporter
maya/AbcExport/MayaNurbsCurveWriter.cpp, function
MayaNurbsCurveWriter::write(), in the Alembic source code.
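In other words, order = degree + 1; as a tiny illustrative helper:

```python
def orderu_from_degree(degree):
    # NURBS order is degree + 1: linear (1) -> 2, quadratic (2) -> 3,
    # cubic (3) -> 4, matching the values stored for kVariableOrder.
    return degree + 1
```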
The result is a frame rate increase of roughly 100x (tested with one 100-hair test on one machine, so take it with a grain of salt).
This test checks that a set of cubes are exported with the correct
transform, both with flatten=True and flatten=False.
This commit also adds an easy-to-use superclass for upcoming Alembic unit tests.
This supports our common character animation workflow, where a character,
its rig, and the custom bone shapes are all part of a group. This group
is then linked into the scene, the rig is proxified and animated. Such
a group can now be exported. Use "Renderable objects only" to prevent
writing the custom bone shapes to the Alembic file.
The absence of datablock properties "will certainly be resolved soon as the need for them is becoming obvious" said the [[http://wiki.blender.org/index.php/Dev:Ref/Release_Notes/2.67/Python_Nodes|Python Nodes release notes]]. So this patch allows Python scripts to create ID Properties which reference datablocks.
This functionality is implemented for `PointerProperty` and now such properties can be created with Python.
In addition to the standard update callback, `PointerProperty` can have a `poll` callback (standard RNA) which is useful for search menus. For details see the test included in this patch.
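A minimal sketch of the new capability (the property name and poll function are illustrative):

```python
import bpy

def poll_mesh_only(self, obj):
    # Filter the search menu down to mesh objects.
    return obj.type == 'MESH'

bpy.types.Scene.target_object = bpy.props.PointerProperty(
    name="Target Object",
    type=bpy.types.Object,
    poll=poll_mesh_only,
)
```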
Original author: @artfunkel
Alexander (Blend4Web Team)
Reviewers: brecht, artfunkel, mont29, campbellbarton
Reviewed By: mont29, campbellbarton
Subscribers: jta, sergey, campbellbarton, wisaac, poseidon4o, mont29, homyachetser, Evgeny_Rodygin, AlexKowel, yurikovelenov, fjuhec, sharlybg, cardboard, duarteframos, blueprintrandom, a.romanov, BYOB, disnel, aditiapratama, bliblubli, dfelinto, lukastoenne
Maniphest Tasks: T37754
Differential Revision: https://developer.blender.org/D113
We can not re-use anything for such pools, because we will know nothing about whether
the main thread is sleeping or not. So we identify such threads as 0, but we don't
use main thread's TLS.
This fixes dead-locks and crashes reported by Luca when doing playblasts.
Global size depends on memory usage, which might change during rendering. Haven't seen it happen, but it seems possible that this could cause the global size to be different from what was used for allocating buffers.
The issue here is that the preferences are still used because both can be accessed from the 3D View, view menu. In the future, it is likely that the old mode will be removed (maybe 2.8?) but for now we want to keep both operational.
Differential Revision: https://developer.blender.org/D2320
Do not recompute both points' 2D coordinates for each segment; we can copy them over from the previous one... Does not give any measurable speedup offhand, though.
Couple of things here:
- Boost is not necessarily compiled into your /opt/lib and the system-wide version might have been used. The recent change in Alembic did not take this into account.
- Alembic needs some extra components of Boost. This part might be missing now for distros other than DEB.
The issue was caused by a recent change in inline policy.
There is some sort of memory corruption happening here; ASAN suggests it's a stack overflow issue. Not quite sure why it is happening though, and I was not able to solve anything here yet in the past hours.
Committing a fix which works, with a big TODO note.
The issue is visible on an AVX2 machine when rendering cycles_reports_test.
Doing this in a fully 'clean' way is far from obvious; especially on unregister, you often end up leaving nasty 'orphaned' keymap items referring to unregistered operators...
This seems to happen on Windows only; it has happened to Thomas and Nathan already. Thomas was showing a similar patch, but I do not see it committed. So committing now in order to get more developers and users happy.
Ever since we merged the extra texture types (half etc.) and the split kernel, the compile time for cycles_kernel has been going out of control.
It's currently sitting at a cool 1295.762 seconds with our standard compiler (2013/x64/release).
I'm not entirely sure why MSVC gets upset with it, but the inlining of matrix near the bottom of the tri-cubic 3D interpolator is the source of the issue; this patch excludes it from being inlined.
This patch brings it back down to a manageable 186 seconds (7x faster!!).
With the attached bzzt.blend that @sergey kindly provided I got the following results with builds with identical hashes:
58:51.73 buildbot
58:04.23 Patched
It's really close; the slight speedup could be explained by the switch instead of multiple if's (switches do generate more optimal code than a chain of if/else/if/else statements), but in all honesty it might just have been pure luck (dev box, very polluted, bad for benchmarks). Regardless, this patch doesn't seem to slow down anything in my limited testing.
{F532336}
{F532337}
Reviewers: brecht, lukasstockner97, juicyfruit, dingto, sergey
Reviewed By: brecht, dingto, sergey
Subscribers: InsigMathK, sergey
Tags: #cycles
Differential Revision: https://developer.blender.org/D2595
It's possible that cancellation occurred between the creation of the reader and the creation of the Blender object, in which case reader->object() returns a NULL pointer.
BKE_libblock_free_us() was called on the object data, which decrements
its user count, after which the same function was called on the object,
which decrements the user count of the object data again. This double
decrement was too much.
The state mask wasn't applied before comparison, giving false results. It shouldn't really happen that a ray state contains any flags that need to be masked away, but if it does happen it's better to not get stuck.
Previously, a GHash was used to store a flattened mapping of parent
information based on the Alembic hierarchy, and then that hash was used to
set parent pointers on Blender objects. This resulted in errors and
some duplicate objects. The new approach stores parent pointers while
traversing the Alembic hierarchy, which means that there is much more
information about the actual context of the Alembic object itself,
producing a more stable import.
Also replaced the bool param "to_yup" with "AbcAxisSwapMode mode", so that
it's more explicit that axes are swapped.
Also added unittests for create_swapped_rotation_matrix.
There was a problem with parent-child relations not getting set up
correctly when an Alembic object was both the transform for a mesh object
and the parent of other mesh objects.
convert_matrix() now only converts from Imath::M44d to float[4][4] (taking
different camera orientations into account). Import-time scaling is now
performed by the caller.
Also renamed AbcObjectReader::readObjectMatrix to
setupObjectTransform, as it does more than just reading the object
matrix; it also sets up an object constraint if the Alembic Xform is
animated.
Alembic is an interchange and caching format, that can contain custom
object schemas. Blender shouldn't crash (because of failing asserts) just
because it doesn't know such an object type.
It's a mapping from full path of an Alembic object to an AbcObjectReader*.
The fact that at some point it is used to construct parent-child relations
doesn't matter.
The importer was guessing whether an Alembic IXform object was part of a
child object, or should be represented as an Empty in Blender. By reversing
the order in which objects are visited, the children can now claim their
parent as part of the same object (so IPolyMesh claims its parent IXform
as part of the same Blender object). This results in much less guesswork.
I've also removed similar guesswork from the code that sets parent pointers,
by simply searching for the parent in a hierarchical way, instead of trying
to predict (again) which IXforms were turned into empties.
Also, visit_object() now actually visits the object -- previously it only
visited its children, and assumed the object it was called on was already
handled by a previous call.
create_transform_matrix(float[4][4]) did mostly the same as create_transform_matrix(Object *, float[4][4]), but more elegantly.
However, the former has some inconsistencies with the latter (which
are now merged and made explicit, turned out one was for z-up→y-up
while the other was for y-up→z-up), and was renamed to
copy_m44_axis_swap(...) to convey its purpose more clearly.
Furthermore, "loc" has been renamed to "trans", as matrices don't
store locations but translations; and more variables now have a src_
or dst_ prefix to denote whether they contain a matrix/vector in the
source or destination axis orientation.
AbcExporter::createTransformWriter() tries to predict the parent Xform
name, but if it cannot be found has multiple ways of creating it, possibly
under a different name than originally searched for.
The idea is to have a system where we properly log error messages and let the users know that errors occurred, redirecting them to the console for explanations. This is only implemented for the exporter since the
importer already has similar functionalities; however they shall
ultimately be unified in some way.
Reviewers: sybren, dfelinto
Differential Revision: https://developer.blender.org/D2541
This avoids write access happening in non-atomic manner in
Shader::tag_update which modifies the global managers. Even
for 1 byte data types it's quite dangerous.
The issue is coming from the fact that float3 is actually 16 bytes aligned
data type and the "padding" was not initialized. This caused memcmp() to
access non-initialized memory.
This provides us with a clearer API (so I don't have to use const_cast<>
in upcoming code). It also allows layering of different Alembic files,
so you can have a base file and load a separate file containing overrides.
Verbally approved by Dr. Sergey.
Alembic requires one of ALEMBIC_LIB_USES_BOOST, ALEMBIC_LIB_USES_TR1, or C++11, and silently defaults to the latter if the former two are OFF. Before this change, Alembic was only built without C++11 if OpenEXR was built at the same time. This dependency was both unnecessary and undocumented.
This function was modifying the texture datablock, which makes it unsafe to call from multiple threads. Now we pass the argument that we don't need nodes to the underlying functions.
There will still be a race condition in the noise texture, but that should at least be free from crashes. Doesn't mean we shouldn't fix it, though.
In this case the PyObject gets lost from pybm, and bm.free() does not invalidate the PyElem. This will cause Python's destructor to read invalid memory and crash.
The solution is to make a copy of the PyObject pointers before overwriting.
This area is a subject of reconsideration, so for now used simplest
way possible -- ensure depsgraph's nodes have proper layer flags
when going in and out of local mode.
The issue was apparently caused by -fno-finite-math-only added to kernel.cpp
CFLAGS. For now just removed this flag from the kernel (we don't really want
it there at this point, and we don't have it for SSE/AVX optimized kernels).
But surely more investigation is needed here.
The idea is to make include statements more explicit and obvious where the
file is coming from, additionally reducing chance of wrong header being
picked up.
For example, it was not obvious whether bvh.h was referring to the builder or traversal, whether node.h is a generic graph node or a shader node, and cases like that.
Surely this might look obvious to the active developers, but after some time of not touching the code it becomes less obvious where a file is coming from.
This was briefly mentioned in T50824 and it seems @brecht is fine with such explicitness, but we need to agree with all active developers before committing this.
Please note that this patch is lacking changes related to GPU/OpenCL support. This will be solved if/when we all agree this is a good idea to move forward.
Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner
Reviewed By: lukasstockner97, maiself, nirved, dingto
Subscribers: brecht
Differential Revision: https://developer.blender.org/D2586
Cross denoising is significantly slower, needs more memory and didn't really produce better results in my tests.
Gradient denoising sometimes helped, but tends to produce artifacts and was broken for a few weeks already anyways.
The extremely confusing "Filter strength" (negative values used to map to an absolute threshold of 10^(2*-strength), positive ones to a relative threshold of 10^(2*strength), 0 to relative 1e-3) was replaced by a checkbox that selects between an absolute threshold of 1 and a relative threshold of 1e-3.
Eventually, I'd like to completely remove the option, but it's not clear yet which one is the better approach.
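To make the confusion concrete, the removed mapping and its replacement as described above (illustrative Python, not the actual denoiser code):

```python
def old_filter_threshold(strength):
    # The removed "Filter strength" option.
    if strength < 0:
        return ("ABSOLUTE", 10.0 ** (2 * -strength))
    if strength > 0:
        return ("RELATIVE", 10.0 ** (2 * strength))
    return ("RELATIVE", 1e-3)

def new_filter_threshold(relative):
    # The replacement checkbox: relative 1e-3 or absolute 1.0.
    return ("RELATIVE", 1e-3) if relative else ("ABSOLUTE", 1.0)
```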
The intention of this commit is to address issues mentioned in the reports T43865, T50164 and T50452.
The code is based on Embree code with some extra vectorization to speed up single-ray to single-triangle intersection.
Unfortunately, such a fix is not coming for free. There is some slowdown for AVX2 processors, mainly due to different vectorization code, which causes a different number of instructions to be executed and different instructions-per-cycle counters. But on the other hand this commit makes pre-AVX2 platforms such as AVX and SSE4.1 a bit faster. The performance goes as follows:
            2.78c AVX2  2.78c AVX  Patch AVX2        Patch AVX
BMW           05:21.09   06:05.34  05:32.97 (+3.5%)  05:34.97 (-8.5%)
Classroom     16:55.36   18:24.51  17:10.41 (+1.4%)  17:15.87 (-6.3%)
Fishy Cat     08:08.49   08:36.26  08:09.19 (+0.2%)  08:12.25 (-4.7%)
Koro          11:22.54   11:45.24  11:13.25 (-1.5%)  11:43.81 (-0.3%)
Barcelone     14:18.32   16:09.46  14:15.20 (-0.4%)  14:25.15 (-10.8%)
On GPU the performance is about 1.5-2% slower in my tests on a GTX 1080, but I'm afraid we can't do much as part of this change here; consider it a price to pay for a more proper intersection check.
Made in collaboration with Maxym Dmytrychenko, big thanks to him!
Reviewers: brecht, juicyfruit, lukasstockner97, dingto
Differential Revision: https://developer.blender.org/D1574
That one was:
* Resetting non-ID pointers (lib_link_xxx funcs should only affect ID
pointers, everything else shall be done in direct_link_xxx func).
* Even worse, always calling lib_link_animdata, even when
LIB_TAG_NEED_LINK tag was unset...
We do not need any special handling anymore for usercount of images used
by faces/polygons (tpage stuff), since we have the 'real_user' handling,
which will gracefully cope with all possible situations.
So better not keep that ugly confusing useless special case.
Mainly:
* Add missing `IDP_LibLinkProperty()` calls for many ID types
(harmless currently, but better be consistent here!).
* Bring lib_link_xxx functions more in line with each other.
* Replace some long if/else by switch.
Simplifies code quite a bit, making it shorter and easier to extend.
Currently no functional changes for users, but is required for the
upcoming work of shadow catcher support with OpenCL.
It uses an idea of accumulating all possible light reachable across the light path (without taking shadow blocking into account) and accumulating total shaded light across the path. Dividing the second figure by the first one seems to give a good estimate of the shadow.
In fact, to my knowledge, it's something really similar to what is
happening in the denoising branch, so we are aligned here which is good.
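Schematically, the estimate described above (illustrative only, not the kernel code):

```python
def shadow_estimate(unoccluded_light, shaded_light):
    # Ratio of light actually received to light that would be received
    # with no blockers; 1.0 means unshadowed, lower means more shadow.
    if unoccluded_light == 0.0:
        return 1.0  # no light could reach the point anyway
    return shaded_light / unoccluded_light
```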
The workflow is the following:
- Create an object which matches real-life object on which shadow is
to be catched.
- Create an approximately similar material on that object.
This is needed to make indirect light properly affect CG objects in the scene.
- Mark object as Shadow Catcher in the Object properties.
Ideally, after doing that it will be possible to render the image and
simply alpha-over it on top of real footage.
Before, Cycles would first sync the shader exactly as shown in the UI, then determine and sync the used attributes and later optimize the shader.
Therefore, even completely unconnected nodes would cause unnecessary attributes to be synced.
The reason for this is to avoid frequent resyncs when editing shaders interactively, but it can still be avoided for noninteractive renders - which is what this commit does.
Reviewed by: sergey
Differential Revision: https://developer.blender.org/D2285
The problem there is that currently tiles get allocated on the GPU that's used to render them.
However, if a GPU is supposed to denoise a tile, it needs all 8 neighbors in its memory as well.
Therefore, the code now allocates and copies the tiles on the denoising GPU as well.
Moving to manual class registration means it's easier to accidentally miss registering classes.
Now detect missing class registration
and warn when running with `--debug-python`
For Windows 8.1 and X11 (Linux, BSD) now use the DPI specified by the operating
system, which previously only worked on macOS. For Windows this is handled per
monitor, for X11 this is based on Xft.dpi or xrandr --dpi. This should result
in appropriate font and button sizes by default in most cases.
The UI has been simplified to a single UI Scale factor relative to the automatic
DPI, instead of two DPI and Virtual Pixel Size settings. There is forward and
backwards compatibility for existing user preferences.
Reviewed By: brecht, LazyDodo
Differential Revision: https://developer.blender.org/D2539
This adds the ability to switch between different application-configurations
without interfering with Blender's normal operation.
This commit doesn't include any templates, so it's mostly to allow collaboration for the Blender 101 project and other custom configurations.
Application templates can be installed & selected from the file menu.
Other details:
- The `bl_app_template_utils` module handles template activation
(similar to `addon_utils`).
- The `bl_app_override` module is a general module
to assist scripts overriding parts of Blender in reversible way.
See docs:
https://docs.blender.org/manual/en/dev/advanced/app_templates.html
See patch: D2565
Use a fast-math-friendly version of this function.
We should probably avoid unsafe fast math, but that is to be done with real care, with all the benchmarks properly done.
For now, committing a much safer fix.
Ideally we need to find a way to remove such a static limit here, but it's not so
trivial to implement for texture nodes. Requires some bigger system redesign there.
Just raising limit for now, which is fine for modern systems.
For CPU and CUDA, it was possible to determine the pointers to the tile buffers on the host and just fill the TilesInfo there.
However, for OpenCL the actual pointer is only known inside the kernel, so a separate kernel for setting them is needed.
Denoising a pixel requires access to the other pixels surrounding it. On the CPU, this is solved by waiting for the neighboring tiles to be rendered before the central tile is denoised.
On the GPU, it was handled by rendering larger tiles internally and discarding the overscan area after denoising. That saved a bit of memory, but wasted computation (with 256x256 tiles and a half-window of 8, 13% of rendered pixels were never actually seen).
Also, supporting overscan tiles made the code more complex. So, this commit removes the overscan code and uses the CPU approach on GPUs as well.
It is unused now; if we want a similar function we should use the Pluecker intersection, which has the same performance with SSE optimization but is more watertight.
The title says it all actually. Gives up to 10% speedup on test scenes here
on i7-6800K.
Render times on GPU are unreliable here, but there might be some slowdown caused by the watertight nature of the intersections.
Avoid construction of temporary array and make utility function force-inlined.
Additionally avoid calling float4_to_float3 twice.
This brings render times to the same values as before current patch series.
This is preparation work for the followup commit which will move remaining parts of the Woop intersection logic to a utility file.
Doing it as a separate commit to keep changes more atomic and easier
to bisect when/if needed.
There are following benefits:
- Modifying intersection algorithm will not cause so much re-compilation.
- It works around header dependency hell and allows us to use vectorization
types much easier in there.
This removes the goal springs, in favor of simply calculating the goal forces on the vertices directly. The vertices already store all the necessary data for the goal forces, thus the springs were redundant, and just defined both ends as being the same vertex.
The main advantage of removing the goal springs, is an increase in flexibility, allowing us to much more nicely do some neat dynamic stuff with the goals/pins, such as animated vertex weights. But this also has the advantage of simpler code, and a slightly reduced memory footprint.
This also removes the `f`, `dfdx` and `dfdv` fields from the `ClothSpring` struct, as that data is only used by the solver, and is re-computed on each step, and thus does not need to be stored throughout the simulation.
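A hedged sketch of the direct per-vertex goal force (names are illustrative, not the actual solver code):

```python
def goal_force(position, goal_position, goal_weight, goal_stiffness):
    # Spring-like pull of a vertex toward its animated goal position,
    # scaled by the per-vertex goal weight -- no explicit spring needed.
    return goal_stiffness * goal_weight * (goal_position - position)
```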
Reviewers: sergey
Reviewed By: sergey
Tags: #physics
Differential Revision: https://developer.blender.org/D2514
This is already called by wm_init_userdef; in old code different initialization methods were used, but now it's not needed. Confusing, since prefs are loaded in this function without initializing temp.
There seems to be a compiler bug in MSVC 2013. The issue does not happen on Linux and does not happen on Windows when building with MSVC 2015.
Since it's really a pain to debug release builds with MSVC 2013, the AVX2 optimization is disabled for curve segments for this compiler.
Utility to get a file/dir in the path by index,
supporting negative indices to start from the end of the path.
Without this it wasn't straightforward to get a file's parent directory name from a filepath.
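The behavior, sketched in Python (the real utility is a C function in Blender's path library; this sketch only mirrors the described semantics):

```python
def path_name_at_index(path, index):
    # Split on either separator and drop empty parts, so both
    # "/a/b/c" and "a\\b\\c" index the same way.
    parts = [p for p in path.replace("\\", "/").split("/") if p]
    return parts[index]

path_name_at_index("/dir_a/dir_b/file.txt", -2)  # "dir_b": the file's parent
```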
When the new window didn't end up using the size stored in the preferences
the splash would not be centered (even outside the screen in some cases).
Now centered popups listen for window resizing.
For example, for RX480 you'll no longer see "Ellesmere" but will see
"AMD Radeon RX 480 Graphics" which makes more sense and allows to easily
distinguish which exact card it is when having multiple different cards
of Ellesmere codenames (i.e. RX480 and WX7100) in the same machine.
Previously, the code would only update the status string if the main status changed.
However, the main status did not include the remaining time, and therefore it wasn't updated until the amount of rendered tiles (which is part of the main status) changed.
This commit therefore makes the BlenderSession remember the time of the last status update and forces a status update if the last one was more than a second ago.
Reviewers: sergey
Differential Revision: https://developer.blender.org/D2465
Although this wasn't so obvious, since it
only showed up for factory settings and in the preferences window.
Panel display order depends on registration order;
sorry for the noise. On the bright side, we no longer need to move
classes around to re-arrange panels.
Instead of calling a function looping over the whole list of a given ID
type, do the whole loop over Main in the parent function, and call functions
writing a single datablock at a time.
This design is more in line with all other places in Blender where we
handle the whole content of Main (including readfile.c), and is much easier
to extend and add e.g. some generic processing of IDs before/after
writing, etc.
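Roughly, the new layout looks like this (an illustrative sketch; `write_id_data` stands in for the per-datablock writers):

```c
/* Sketch only: loop over all of Main's ID listbases in the parent
 * function, calling a writer for one datablock at a time. */
ListBase *lbarray[MAX_LIBARRAY];
int a = set_listbasepointers(mainvar, lbarray);
while (a--) {
  ID *id;
  for (id = lbarray[a]->first; id; id = id->next) {
    write_id_data(wd, id); /* hypothetical single-datablock writer */
  }
}
```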
From the user's point of view there should be no change at all; the only
difference is that data-block types won't be saved in the same order as before
(.blend file specs enforce no order here, so this is not an issue, but it could
trip up some third-party users of other, simplified .blend file readers).
Reviewers: sergey, campbellbarton
Differential Revision: https://developer.blender.org/D2510
Meshes w/o modifiers wouldn't have their derived mesh applied.
The check was to avoid a crash, but it's in fact meaningless,
since the modifier might be disabled, or there may be virtual modifiers.
Not sure why I did not put those in from the start... Actually *not* having an
undo point here can be problematic, since undoing some previous action
was trying to restore from a bad pointer (I think) in the UI, generating
asserts.
Note however that it's not a 'pure' undo, in that you may not find your
linked data in the exact same state as before deleting it, after an undo,
since it actually implies *reloading* the deleted libraries (and not
restoring from a previously stored memory dump).
Reported by @sergey, thanks.
There is no way currently to prevent the option from showing in the menu, so
instead report a warning to the user (and curse again the current nightmarish
system of operation in the outliner...).
Reported by @sergey, thanks.
Pass globals as a bare pointer, same as it used to be prior to the split kernel rework.
The AMD CPU platform and Intel OpenCL were complaining about this.
Perhaps we shouldn't pass globals as a pointer at all; this isn't something that is
really portable and can perhaps cause issues on 32 bit.
Internal change needed for template support.
Loading the user preferences first so it's possible
for preferences to control startup behavior.
In general it's useful to load preferences before data-files,
so we know security settings for eg.
The range is controlled using the following command line arguments:
--cycles-resumable-start-chunk
--cycles-resumable-end-chunk
Those are 1-based indices of the range for rendering.
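For example, the mapping from a 1-based chunk range to a sample range could look like this (a sketch only; the actual Cycles code may differ):

```c
/* Sketch: derive a sample range from a 1-based chunk range.
 * E.g. total_samples=250, num_chunks=10, start=2, end=4 renders
 * samples 25..99 (75 samples). */
static void resumable_chunk_range(int total_samples, int num_chunks,
                                  int start_chunk, int end_chunk,
                                  int *r_start_sample, int *r_num_samples)
{
  const int samples_per_chunk = total_samples / num_chunks;
  *r_start_sample = (start_chunk - 1) * samples_per_chunk;
  *r_num_samples = (end_chunk - start_chunk + 1) * samples_per_chunk;
}
```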
The thing I'm really starting to hate is the requirement to specify both
operation code and node type. Seems to be duplicated enums without real
need for that.
This addresses an issue raised by D2453 -
that there was no way to check if operators are run
multiple times in a row.
Actions that don't cause an UNDO event are still ignored.
Was using some threaded queue on top of the task pool, tssk...
Now properly using the task pool directly to crunch chunks of smooth fans.
No noticeable changes in speed.
Tried to completely get rid of the 'no threading with few loops' code,
but even just creating/freeing the task pool, without actually pushing
any task, is enough to make code 50% slower in worst case scenario (i.e.
few thousands of simple cube objects).
The root of the issue was in the custom normal code: so far it assumed that
we could only have one cyclic smooth fan around each vertex, which is...
blatantly wrong (again, e.g. two cones sharing the same vertex tip).
This required a rather deep change in how smooth fans/clnor spaces are processed;
it took me some time to find a 'good' solution.
Note that the new code is slightly slower than the previous one (maybe about 5%);
not much to be done here, I'm afraid.
Tested against all older report files I could find, seems OK.
- Add optional 'display_name' callback
so callers can construct own names.
- Add optional 'prop_filepath' argument
(for operators that don't use "filepath").
- Add doc-string.
- Use keyword only arguments.
While this compiler is not officially supported yet, getting it to work is
a nice thing because more and more AMD cards will fall under MESA driver.
It's also nice to use an explicit comparison with NULL, which makes it more
clear whether a variable is a boolean or a pointer. Even Rust enforces this!
Patch by Ian Bruce with own modifications.
The mesh convert operator can 'freeze' a mesh
(WYSIWYG, modifiers, shape keys etc).
However it's not very obvious that the way to perform this
operation is to convert a mesh to a mesh.
Expose this as 'Visual Geometry to Mesh' in the 'Apply' menu,
since this is where users might expect to see it.
Use expanded names for bmesh primitive operations
(urmv jvke semv jfke).
Use 'bmesh_kernel_' prefix,
these functions aren't intended for wide use so favor readability.
Remove BM_face_vert_separate;
it wasn't used and only skipped the step of finding the correct loop of the face.
It might be useful to keep the search string stored in some cases, but
in most cases it's not useful, just confusing. Especially if the string is taken
from a menu showing a different enum.
When the loop region passed in had no loops to edge-split from,
it was assumed nothing needed to be done.
This ignored the case where loops share a vertex
without any shared edges.
Now BM_face_loop_separate_multi behaves like BM_face_loop_separate.
Fixed error where faces remained connected by verts in BM_mesh_separate_faces.
This commit adds new features to the breakdowner, giving animators more
control over what gets interpolated by the breakdowner. Specifically:
"Just as G R S let you move rotate scale, and then X Y Z let you do that
in one desired axis, when using the Breakdower it would be great to be
able to add GRS and XYZ to constrain what transform / axis is being
breakdowned."
As requested here:
https://rightclickselect.com/p/animation/csbbbc/breakdowner-constrain-transform-and-axis
Notes:
* In addition to G/R/S, there's also B (Bendy Bone settings) and C (custom properties)
* Pressing G/R/S/B/C or X/Y/Z again will turn these constraints off again
- Connectivity length was overwritten by the distance to the closest selected element.
- Vertices used the 'island' center of the closest vertex,
  even if it wasn't connected.
Now optionally keep track of the original index used as the closest
connected distance.
To support this, optional support for islands of 1 vertex had to be added.
The changes introduced in rB3e628eefa9f55fac7b0faaec4fd4392c2de6b20e
made the non-subframe frame change behaviour less intuitive, by always
truncating downwards instead of rounding to the nearest frame.
This made the UI a lot less forgiving of pointing precision errors
(for example, as a result of hand shake, or using a tablet on a high-res screen).
This commit restores the old behaviour in this case only (subframe inspection
isn't affected by these changes).
Selection loop would draw the selection ignoring xray.
Now draw in a separate pass after clearing the depth buffer,
as with regular drawing.
Also disable depth sorting,
caller can sort the hit-list by depth if needed.
Single program generally compiles kernels faster (2-3 times), loads faster,
takes less drive space (2-3 times), and reduces the number of cached kernels.
Reduces memory allocation for split kernel.
This allows for faster rendering due to a bigger global size,
especially when GPU memory is limited.
Performance results:
R9 290 total render time:
                          Before   After   Change
  BMW                     4:37     4:34    -1.1 %
  Classroom               14:43    14:30   -1.5 %
  Fishy Cat               11:20    11:04   -2.4 %
  Koro                    12:11    12:04   -1.0 %
  Pabellon Barcelona      22:01    20:44   -5.8 %
  Pabellon Barcelona(*)   15:32    15:09   -2.5 %
(*) without glossy connected to volume
Decoupled ray marching is not supported yet.
Transparent shadows are always enabled for volume rendering.
Changes in kernel/bvh and kernel/geom are from Sergey.
This simplifies code significantly, and prepares it for the
record-all transparent shadow function in the split kernel.
Intended to replace legacy GL_SELECT, without the limitations of
sample queries which can't access depth information.
This commit adds VIEW3D_SELECT_PICK_NEAREST and VIEW3D_SELECT_PICK_ALL
which access the depth buffers to detect what's under the pointer,
so initial selection is always the closest item.
The performance of this method depends a lot on the OpenGL
implementation's glReadPixels.
Since reading depth can be slow, buffers are cached for object picking
so selecting re-uses depth data, performing 1 draw instead of 3
(for 24, 18, 10 px regions, picking with many items under the pointer).
Occlusion queries draw twice when picking nearest,
so worst case 6x draw calls per selection.
Even with these improvements, occlusion queries are faster on AMD hardware.
Depth selection is disabled by default, toggle option under select method.
May enable by default if this works well on different hardware.
Reviewed as D2543
The issue was caused by a sometimes-negative color returned by the filter node,
which seems to be caused by precision issues. I don't see any reason why we would want
negative colors in the output; they only cause issues later on.
By calculating the size of the state buffer in the kernel rather than the host
less code is needed and the size actually reflects the requested features.
Will also be a little faster in some cases because of larger global work size.
Because the split kernel can render multiple samples in parallel it is
necessary to have everything initialized before rendering of any samples
begins. The code that normally handles initialization of
`rng_state` (`kernel_path_trace_setup()`) only does so for the first sample,
which was causing artifacts in the split kernel due to uninitialized
`rng_state` for some samples.
Note that because the split kernel can render samples in parallel this
means that the split kernel is incompatible with the LCG.
This was only needed for the previous implementation of parallel samples. As
we don't have that any more it can be removed.
The real reason for removal, though, is this: `per_sample_output_buffers` was being
calculated too small and artifacts resulted. The tile buffer is already
the correct size and calculating the size for `per_sample_output_buffers`
is a bit difficult with the current layout of the code. As
`per_sample_output_buffers` was only needed for `sum_all_radiance`,
removing that kernel and writing output to the tile buffer directly
fixes the artifacts.
This is to help debug and track memory usage for generic buffers. We
have similar for textures already, since those require a name, but for
buffers the name is only for debugging purposes.
Simple workaround for some issues we've been having with AMD drivers hanging
and rendering systems unresponsive. Unfortunately this makes things a bit
slower, but it's better than having to do hard reboots. Will be removed when
drivers have been fixed.
Define CYCLES_DISABLE_DRIVER_WORKAROUNDS to disable for testing purposes.
This does a few things at once:
- Refactors host side split kernel logic into a new device
agnostic class `DeviceSplitKernel`.
- Removes tile splitting, a new work pool implementation takes its place and
allows as many threads as will fit in memory regardless of tile size, which
can give performance gains.
- Refactors split state buffers into one buffer, as well as reduces the
number of arguments passed to kernels. Means there's less code to deal
with overall.
- Moves kernel logic out of OpenCL kernel files so they can later be used by
other device types.
- Replaces OpenCL-specific APIs with new generic versions.
- Tiles can now be seen updating during rendering.
Suspended pools allow pushing a huge amount of initial tasks
without any threading synchronization, and hence without the overhead.
This gives a ~50% speedup of cached rigid body playback with the file from
T50027, and seems to have no negative effect in other scenes
here.
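The usage pattern is roughly the following (a sketch; signatures approximately follow the BLI_task API of the time):

```c
/* Sketch: tasks pushed to a suspended pool do not wake worker threads;
 * workers only start once we wait for the work. */
TaskScheduler *scheduler = BLI_task_scheduler_get();
TaskPool *pool = BLI_task_pool_create_suspended(scheduler, userdata);

for (int i = 0; i < num_chunks; i++) {
  BLI_task_pool_push(pool, chunk_func, &chunks[i], false, TASK_PRIORITY_HIGH);
}

BLI_task_pool_work_and_wait(pool); /* threads start crunching here */
BLI_task_pool_free(pool);
```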
The idea is to allow some amount of tasks to be pushed from a working
thread to its local queue, so we can acquire some work without doing a
whole mutex lock.
This should allow us to remove some hacks from the depsgraph which were
added there to keep threads alive.
This allows us to avoid TLS stored in the pool, which gives us the advantage of
using a pre-allocated tasks pool for pools created from a non-main thread.
Even on systems with slow pthread TLS it should not be a problem, because
we access it once at pool construction time. If we want to use this more
often (for example, to get rid of push_from_thread) we'll have to do much
more accurate benchmarking.
Basically move all thread-specific data (currently it's only task
memory pool) from a dedicated array of taskScheduler to TaskThread.
This way we can add more thread-specific data in the future with
less of a hassle.
This feature was adding extra complexity to task scheduling,
which required yet more variables to be worried about and to be
modified in an atomic manner, which resulted in the following issues:
- More complex code to maintain, which increases risks of
something going wrong when we modify the code.
- Extra barriers and/or locks during task scheduling, which
causes extra threading overhead.
- Unable to use some other implementations (such as TBB), even for
  comparison tests.
Notes about other changes.
There are two places where we really had to use that limit.
One of them is the single-threaded dependency graph. This will
now construct a single-threaded scheduler at evaluation time.
This shouldn't be a problem because it only happens when using
debugging command line arguments, and the code simply doesn't
run in regular Blender operation.
The code seems a bit duplicated here across the old and new
depsgraph, but I think it's OK since the old depsgraph is already
gone in the 2.8 branch and I don't see where else we might want
to use such a single-threaded scheduler.
When/if we'll want to do so, we can move it to a centralized
single-threaded scheduler in threads.c.
OpenGL render was a bit more tricky to port, but basically we
are using condition variables to wait for the background thread to
do all the work.
This slightly changes SDef behavior, by now respecting object transforms
at bind time, thus not requiring the objects to be aligned in their
respective local spaces, but instead using world space.
When rendering multi-view in side-by-side or top-bottom mode, we squash
the UI to half of its size and draw it twice on screen. That means the
cursor coordinates used for UI interaction don't match what's visible on
screen.
This commit is a little event system hack (tm) to fix this. It has some
small glitches with cursor grabbing, but nothing too bad.
We'll also use it for viewport HMD support.
D1350, thanks for the feedback @dfelinto!
It was only possible to separate all geometry from an intersection or none.
Made this into an enum with a third option, 'Cut' (now the default),
which keeps each side of the intersection separate
without splitting faces in half.
There was a bug in the intended code behaviour to always seek with a
pitch of 1.0 regardless of pitch/pitch animation/doppler effects.
Check the bug report for a more detailed explanation of problems
concerning pitch and seeking.
Comments said that the function was supposed to 'stop worker threads', but
it absolutely did not do anything like that; it was merely wiping out the TODO
queue of tasks from the given pool (kind of a subset of what
`BLI_task_pool_cancel()` does).
Misleading, and currently useless; we can always add it back if we need
it some day, but for now we try to simplify that area.
Freeing a pool was calling `BLI_task_pool_stop()`, which only clears the
pool's tasks that are in the TODO queue, without ensuring that no more tasks
from that pool are being processed in worker threads.
This could lead to random (and seldom) use-after-free crashes.
Now use `BLI_task_pool_cancel()` instead, which does wait for all tasks
being processed to finish before returning.
The paint slot name was not the same as what is displayed on the texture properties panel.
Instead, the slot type (e.g. "Diffuse Color") was used as the name.
Patch by Suchaaver (@minifigmaster125) with minor changes from @mont29.
Reviewers: mont29, sergey
Maniphest Tasks: T50704
Differential Revision: https://developer.blender.org/D2523
Can't say enough how much I hate those proxies... their duality (sharing
some aspects of both direct *and* indirect users) is a nightmare to handle. :(
Issue was that the VIEW_OT_manipulator operator calls the transform
operators and passes them its own operator properties. That means the
transform operator got passed properties that it doesn't have.
The custom poll function for surfacedeform_bind seems to have caused
issues when calling it from Python. Fixed by using the generic modifier
poll function, and setting the button to be active or not in the
Python UI code instead. (there might be a better way, but for now this
works fine)
The issue was introduced by 4df75e5, and it seems we just need to explicitly
add a new keymap item now.
There is still some difference from the old behavior, which is that planar
transform has been using precision movement since e138cde, and here I don't
see a nice solution currently: the change was requested here in the studio
and it's just a conflict in picking the Shift key for something which is not
supposed to be accurate.
At least now it's possible to invoke the planar constraint and simply
release Shift.
Not really happy with the per-pool threads limit, need to find a better
approach to that. But at least it's possible to get rid of half
of the nastiness here by removing the getter, which was only used in
an assert statement.
That piece of code was already well-tested, this code becomes
obsolete in the new depsgraph, and it no longer exists in the blender
2.8 branch.
This was only used for progress report, and it's wrong because:
- Pool might in theory be re-used by different tasks
- We should not make any decision based on scheduling stats
The proper way is for the task itself to take care of progress.
Do a "full" update on leaving sculpt mode, so we are sure scene will be brought
to a consistent state.
Ideally we'll only do that when there are objects which depends on geometry
without re-calculating self geometry, but that's a bit tricky currently.
The idea is to make it simpler to remove noise from scenes when some prop uses
the Sharp Glossy closure and causes noise in certain cases. Previously Sharp Glossy
was not affected by Filter Glossy at all, which was quite confusing.
Here is a file which demonstrates the issue: {F417797}
After applying the patch all the noise from the scene is gone.
This change also solves fireflies reported in T50700.
Reviewers: brecht, lukasstockner97
Differential Revision: https://developer.blender.org/D2416
Can't see any reason to call AUD exit early in WM_exit; that's a
low-level module that has no dependency on anything else in Blender, but
is a dependency of some other parts of Blender, so it should rather be
exited late in the process!
This adds an option to force fields of type "Force", which enables the
simulation of gravitational behavior (dist^-2 falloff).
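As a minimal sketch of the falloff itself (illustrative, not the actual force field code):

```c
#include <math.h>

/* Inverse-square (gravitation-like) falloff: |f| ~ strength / r^2. */
static float gravitation_falloff(float strength, float dist)
{
  const float d = fmaxf(dist, 1e-6f); /* guard the r == 0 singularity */
  return strength / (d * d);
}
```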
Patch by @AndreasE
Reviewers: #physics, LucaRood, mont29
Reviewed By: #physics, LucaRood, mont29
Tags: #physics
Differential Revision: https://developer.blender.org/D2389
- for rc/release: /api/2.79c/, zip file named blender_python_reference_2.79c_release.zip
- for dev: /api/master/, zip file named blender_python_reference_2_79_4.zip
Please never, ever use the same DNA var for two different things. Even worse
if they do not have the same type and range!
This only ensures issues (as described in the report, but also if
animating both RNA props using the same DNA var... yuck).
And we were not even saving any bytes in DNA; we could reuse some padding
there to store the two new needed vars (yes, two, since we cannot re-use the
existing one if we want to keep backward *and* forward compatibility).
Was actually harmless and not crashing, but I'd say more or less only
by luck: the NULL-check for region data would only evaluate to true for
the correct 3D View region. However, if we were to add region data to a
different region type in future, this would lead to undefined behavior
if executed in the wrong region.
So... Curve+shapekey was even more broken than it looked, this report was
actually a nice crasher (immediate crash in an ASAN build when trying to
edit a curve shapekey with some viewport rendering enabled).
There were actually two different issues here.
I) The less critical: rB6f1493f68fe was not fully fixing issues from
T50614. More specifically, if you updated obdata from editnurb
*without* freeing editnurb afterwards, you had a 'restored' (to
original curve) editnurb, without the edited shapekey modifications
anymore. This was fixed by tweaking again `calc_shapeKeys()` behavior in
`ED_curve_editnurb_load()`.
II) The crasher: in `ED_curve_editnurb_make()`, the call to
`init_editNurb_keyIndex()` was directly storing pointers of obdata
nurbs. Since those get freed every time `ED_curve_editnurb_load()` is
executed, it easily ended up being pointers to freed memory. This was
fixed by copying those data, which implied more complex handling code
for editnurbs->keyindex, and some reshuffling of a few functions to
avoid duplicating things between editor's editcurve.c and BKE's curve.c
Note that the separation of functions between editors and BKE area for
curve could use a serious update, it's currently messy to say the least.
Then again, that area is due to rework since a long time now... :/
Finally, aligned 'for_render' curve evaluation to mesh one - now
editing a shapekey will show in rendered viewports, if it does have some
weight (exactly as with shapekeys of meshes).
This was fixed ages ago for the interface case but not for the
command line. The thing here is that currently external engines
are relying on reports system to indicate that error happened
so suppressing reports storage in the background mode prevented
render pipeline from detecting errors happened.
This is all weak and I don't like it, but it's better than
delivering black frames from the farm.
New logic of split_faces was leaving mesh in a proper state
from Blender's point of view, but Cycles wanted loop normals
to be "flushed" to vertex normals.
Now we do such a flush from Cycles side again, so we don't
leave bad meshes behind.
Thanks Bastien for assistance here!
Finding which loop should share its vertex with which others is not easy
with regular Mesh data (mostly due to lack of advanced topology info, as
opposed with BMesh case).
Custom loop normals computing already does that - and can return 'loop
normal spaces', which among other things contain definitions of 'smooth
fans' of loops around vertices.
Using those makes it easy to find vertices (and then edges) that need
splitting.
This commit also adds support of non-autosmooth meshes, where we want to
split out flat faces from smooth ones.
The issue seems to be caused by vertex normal being re-calculated
to something else than loop normal, which also caused wrong loop
normals after re-calculation.
For now issue is solved by preserving CD_NORMAL for loops after
split_faces() is finished, so render engine can access original
proper value.
Logic of handling shapekeys when entering and leaving edit mode for
curves was... utterly broken.
Was leaving actual curve data with edited shapekey applied to it.
The release of these arrays should be at the programmer's discretion, since these arrays can continue to be used.
Only the expanded functions `bvhtree_from_mesh_edges_ex` and `bvhtree_from_mesh_looptri_ex` are currently being used in blender (in mesh_remap.c), and from what I could analyze, these changes can prevent a crash.
A group of object groups can be formed by means of the dupli_group option in
the Object properties window. The present revision extends the Selection by
Group option in the Freestyle Line Set so as to support not only flat object
groups but also nested groups.
We need to first split all vertices before we can reliably
check whether an edge can be reused or not.
There is still a known issue happening with an edge-fan mesh
with some faces being on the same plane.
This commit adds a way to debug Cycles motion blur issues which
are usually happening due to something crazy happening in between
of frames. Biggest trouble was that artists had no clue about
what's happening in subframes before they render. This is at
least inefficient workflow when dealing with motion blur shots
with complex animation.
Now there is an option in Time Line Editor which could be found
in View -> Show Subframe. This option will expose current frame
with its subframe to the time line editor header and it'll allow
scrubbing with a subframe precision in time line editor.
Please note that none of the tools in Blender are aware of
subframe, so they'll likely be using current integer frame still.
This is something we don't consider a bug for now, the whole
purpose for now is to give a tool for investigation. Eventually
we'll likely tweak all tools to be aware of subframe.
Hopefully now we can finish the movie here in the studio..
Now new edges will be properly created between original and
new split vertices.
Now topology is correct, but shading is still not quite right in
some special cases.
Doesn't currently change anything, but will be needed for some future
work here.
It uses existing padding in kernel BVH structure, so there is
nothing changed memory-wise.
The issue here was mainly coming from the minimal pixel width feature
which is quite commonly enabled in production shots.
This feature will use some probabilistic heuristic in the curve
intersection function to check whether we need to return the intersection
or not. This probability is calculated for every intersection check.
Now, when we use multiple BVH nodes for curve primitives we increase
the probability of that primitive being considered a good intersection
for us. This is similar to increasing the minimal width of the curve.
What is worse here is that the change in the intersection probability
fully depends on the exact layout of the BVH, meaning the probability
might change differently depending on the view angle, the way the builder
binned the primitives, and such. This makes it impossible to do a
simple check like dividing the probability by the number of BVH steps.
Another solution might have been to split the BVH into fully independent
trees, but that would increase memory usage of all the static
objects in the scene, which is also not desirable.
For now, the most simple but robust approach is used: store the BVH
primitives' time and test it in the curve intersection functions. This
solves the regression, but has two downsides:
- Uses more memory.
  This isn't surprising, and ANY solution to this problem will
  use more memory.
  What we still have to do is to avoid this memory increase for
  cases when we don't use BVH motion steps.
- Reduces the number of maximum available textures on pre-Kepler cards.
  There is not much we can do here; hardware gets old but we need
  to move forward on more modern hardware...
The change was delivering broken topology in certain cases.
The assumption that a new edge only connects new vertices was
wrong.
Reverting to a commit which was giving correct render results
but was using more memory.
This reverts commit af1e48e8ab.
The bug T46099 no longer applies since the addition of `dist_squared_to_projected_aabb_simple`.
Comments have also been added relating to an occlusion bug with the ruler. I'll investigate this.
When importing an Alembic file with grouped transforms, it would badly name the transforms, taking the name of the parent instead of its own.
Patch by @maxime.robinot
Differential Revision: https://developer.blender.org/D2507
Quite a simple fix for now which only deals with this case. Maybe we want to do
some "clipping" at image load time so regular textures wouldn't give NaN as
well.
This allows us to use faster math and still have reliable
isnan/isfinite tests.
Only do it for host side, kernels stays unchanged.
Thanks Lukas Stockner for the tip!
**DISCLAIMER**: This is more a code dump of a local branch, not something really finished or so. The underlying math is subject to rework since it's not quite physically based at all.
Publishing to start collaboration with other Cycles developers who are looking into solving this puzzle.
=== What do we consider a shadow catcher? ===
That's a good question actually, and there's no single formulation of what it exactly is; mathematically it's a bit malformed in the constraints we're working under. Ideally the shadow catcher is the difference between the image rendered without artificial objects and with them. Such an approach gives the best possible shadows, but takes 2x more time to render. So for good usability we need to make some assumptions, making the system a bit more biased but giving artists a useful tool.
Shadow catcher is mainly used by VFX artists to inject artificial objects into real footage. At least that's the definition we'll stick to
in Blender. Hence here's what a shadow catcher should be capable of doing:
- Receive shadows from other objects: be totally transparent when there are no shadows cast on it, be more opaque in shaded areas.
- Ignore self-shadowing and shading. Shadows caused by occlusion with itself already exist in the footage. The same applies to the
shading -- all shading caused by the material itself is also in the footage already.
- Interact with other objects in the scene. This sounds a bit tricky but actually makes sense. Consider a situation when one needs to put a sharp glossy object into the footage: you'll want objects from the real scene to be reflected in the artificial object. And often you'll want the object on which the shadow is cast to be reflected in such situations. Surely you can escape with copying the object and playing with ray visibility, but that's a complicated scene setup instead of making it simpler.
- Be affected with indirect light. Cycles is the GI render engine after all!
=== How to use the shadow catcher? ===
1. Create an object on which you want to receive shadow.
2. Create some basic material setup which is close to a real object.
3. Enable "Shadow Catcher" in Object buttons -> Cycles Settings.
4. Be happy! (hopefully, once we've debugged all the code)
=== What does this patch actually contain? ===
It contains all the bits which try to implement the definition of a shadow catcher given above. It tries to implement it all in a way that we don't need to make big changes in the ray integration loop, hence it has some tricky magic to deduce the received shadow from the light passes, and it will fail in certain situations, mainly when there is no direct lighting of the object at all. It is totally tweakable to become more artist friendly; I just didn't have enough time to try all the ideas and used whatever the latest semi-working formula was.
Major changes are in fact made around shadow_blocked() to exclude shading from self. This part is based on an older patch which tried to expose it to the user. Those exposed settings are somewhat malformed and shouldn't really be used. In fact, we should remove those settings from the interface.
=== Some pictures? ===
Sure, here's one from a hackish patch:
{F282085}
(This is a glossy monkey on a checker board floor; the floor is marked as a catcher. Here's the .blend file: {F282088})
Reviewers: lukasstockner97, juicyfruit, brecht
Subscribers: jensverwiebe, Nikos_prinio, brecht, lukasstockner97, borisdonalds, aliasguru, YAFU, forest-house, uli_k, aditiapratama, hype, davidandrade, printerkiller, jta, Davd, johnroper100, poor, lowercase, juang3d, GiantCowFIlms, iklsr, gandalf3, sasa42, saphires, duarteframos, madog, Lapineige, railla, zuggamasta, plasmasolutions, jesterking
Differential Revision: https://developer.blender.org/D1788
The issue was caused by usage of non-initialized image user, which
could have different settings, causing some random image being loaded
or not loaded at all.
This caused non-deterministic behavior of Cycles image loading because
it was querying image information from several places.
This fixes crash reported in T50616, but it's not a complete fix
because preview rendering in material is wrong (same wrong as in
2.78a release).
We now assert that we know the file version of libraries (needed for
do_version after the linking step), so for missing libraries, set dummy
numbers (actually using the version of the main .blend file).
Works similarly to the regular Cycles tests, just does an OpenGL render to
get the output image.
Seems to work fine with the only funny effect: Blender window will
pop up for each of the tests. This is current limitation of our
OpenGL context. Might be changed in the future.
Instead of implementing the full algorithm inside the device, the new code just calls the device in order to run specific kernels while handling all the high-level logic outside of the individual devices.
Seems CUDA failed to de-duplicate the array across multiple inlined
versions of shadow_blocked(). Helped it a bit with that now.
Gives about 100MB memory improvement on scenes after the previous
commit and brings the memory "regression" to only 100MB compared to
the master branch now.
This commit enables record-all behavior for transparent shadow
rays.
Render time differences go as follows:
GTX 1080 render time:
  BMW                   -0.5%
  Fishy Cat             -0.0%
  Pabellon Barcelona   -11.6%
  Classroom             +1.2%
  Koro                 -58.6%
Kernel will now use some extra VRAM memory to store the intersection
array (200MB on my configuration). This we can optimize out with some
further commits.
The idea is to record all possible transparent intersections when
shooting a transparent ray on the GPU (similar to what we were doing on
the CPU already).
This avoids the need to do whole ray-to-scene intersection queries
for each intersection, and speeds up a lot of cases, like transparent
hair, at the cost of extra memory.
This commit is a base ground for now, and this feature is kept
disabled until some further tweaks.
Now we break the traversal cycle and then perform volume attenuation
and the zero-throughput check. Not sure it makes any measurable difference
at this moment, but in the future it might help de-duplicating some
extra logic here.
Removed unnecessary call to DM_update_tessface_data(). This call is
already performed by DM_ensure_tessface(dm). The call being performed
twice caused a failing BLI_assert().
Reviewed by: Kévin Dietrich
With the new names the arguments (yup, zup) are in the same order as
they appear in the function name. The old names used copy_src_dst(dst,
src), which I found very confusing. Furthermore, now it is clear from
where to where the copy is made.
This makes the function names a little bit longer, though. If that is
a real issue, we can just name them zup_from_yup(zup, yup).
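As an illustration of the convention (the exact helper names in the Alembic code may differ slightly):

```c
/* Argument order matches the order of the spaces in the name.
 * Assumes non-overlapping input/output arrays. */
static void copy_yup_zup(float yup[3], const float zup[3])
{
  yup[0] = zup[0];
  yup[1] = zup[2];
  yup[2] = -zup[1];
}

static void copy_zup_yup(float zup[3], const float yup[3])
{
  zup[0] = yup[0];
  zup[1] = -yup[2];
  zup[2] = yup[1];
}
```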
Reviewed by: Kévin Dietrich
Speedup is mainly gained by multi-threading. Gives about a 3x
fps gain on an edit shot file.
There is still some room for improvements, will happen in one
of the upcoming commits.
Derived mesh for particles did not include tessellated faces when it
was expected to. Now added an explicit function to copy CDDM with tess
faces, without the need to re-tessellate the result.
It is not necessary to add the MOTO library dependency when we use
WITH_IK_SOLVER (now it uses Eigen) or WITH_MOD_BOOLEAN (it was
used by the bsp intern library some time ago, but it is not present in the
code anymore).
Reviewers: mont29, sergey
Subscribers: mont29, sergey
Differential Revision: https://developer.blender.org/D2477
No reason not to make this private to this file, and it gave a conflict
when using bpy as a module and loading it in a GLib application (which
also has a g_atexit var).
The title says it all actually. Use BLI task to loop over vertices
and distort their locations. Gives a 2x FPS increase in a file with
just a time-dependent Displace modifier on my desktop.
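The threaded loop follows the usual BLI_task parallel-range pattern, roughly as below (signatures approximate for the API of the time; the data names and threshold are made up):

```c
/* Sketch: per-vertex work goes into a callback, the range is split
 * across threads by the task scheduler. */
typedef struct DisplaceUserdata {
  float (*vertexCos)[3];
  /* ...whatever evaluation state the modifier needs... */
} DisplaceUserdata;

static void displace_vert_task(void *userdata, const int iter)
{
  DisplaceUserdata *data = userdata;
  /* displace data->vertexCos[iter] here */
  (void)data;
}

/* Only bother with threads when there is enough work to amortize
 * the scheduling overhead. */
BLI_task_parallel_range(0, numVerts, &data, displace_vert_task,
                        numVerts > 512);
```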
This version will give fewer spin locks and is now well-tested by render engines.
This should reduce the amount of threading overhead when having multiple objects
with the Displace modifier enabled.
In the future this will also help us threading the modifier.
There are more modifiers which could benefit from this, but let's first
investigate the new behavior with one of them.
We (the Microsoft C++ team) use the Blender project as part of our "Real world code" tests.
I noticed a place in WIN32 specific code (dvpapi.cpp:85) where a string literal is losing
its const-ness when being passed to BLI_dynlib_open(). This is not permitted when using the
/permissive- conformance compiler switch (see our blog
https://blogs.msdn.microsoft.com/vcblog/2016/11/16/permissive-switch/)
My suggested fix is to add const and propagate it where needed. Another possible fix would be
to explicitly cast away the const.
Reviewers: mont29, sergey, LazyDodo
Subscribers: Blendify, sergey, mont29, LazyDodo
Tags: #platform:_windows
Differential Revision: https://developer.blender.org/D2495
Instead of referencing the vertex first and testing the bitmap afterwards, test the bitmap first and reference the vertex after.
In a mesh with 31146 vertices and the entire bitmap disabled, the loop time is 243% faster.
With the whole bitmap enabled, the time becomes 463473% faster!!!
One possible reason for this huge difference in performance is that maybe the compiler is not inlining the function "BM_vert_at_index" (I don't know if the buildbot does this, but it's good to investigate).
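The change amounts to swapping two steps in the loop, roughly:

```c
/* Before: dereference the vertex, then test the bitmap. */
for (int i = 0; i < totvert; i++) {
  BMVert *v = BM_vert_at_index(bm, i); /* possibly not inlined */
  if (!BLI_BITMAP_TEST(bitmap, i)) {
    continue;
  }
  /* ... use v ... */
}

/* After: test the bitmap first, dereference only when needed. */
for (int i = 0; i < totvert; i++) {
  if (!BLI_BITMAP_TEST(bitmap, i)) {
    continue;
  }
  BMVert *v = BM_vert_at_index(bm, i);
  /* ... use v ... */
}
```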
Looks like `object_map` and `mem_arena` may be NULL sometimes...
Also, cleaned up the function pointer declarations of Nearest2dUserData;
those were warning in gcc. Please, *always* use typedef-defined
prototypes for function pointers, it is sooooo much cleaner and clearer
that way. And easy to convert from compatible functions too.
BKE_lamp_free was somehow missing the refactor of datablocks handling
(which, among other things, completely separated ID refcounting and
linking management from ID freeing itself).
Either forgot during development, or lost during merge...
Previously, the denoising kernels were just included with the other kernels.
However, that is not ideal, since the kernels already take very long to compile. Also, it isn't needed since the rendering and denoising kernels share basically no code.
So, this commit adds intern/cycles/filter/, which contains the filtering kernels.
The code looks for the closest element by comparing centers. In the case of islands, the center used for each vertex is the center of its island.
The solution here is to skip the search for islands when the operation is translation.
Pretty straightforward actually: just do not bother about the obdata part
of vgroups in that case, only copy the object part of it.
And let's curse once again that stuff spread across several types of
data-blocks...
The order was wrong from the semantic point of view, caused
by some legacy workarounds in Libmv. Didn't realize it was
not how things were expected to be used.
Issue was indeed in the join operation: the mesh in which we join all others
could be re-added to the final data after the others, leading to undesired
re-ordering of CD layers, and existing vertices etc. being shifted away
from their original indices, etc.
All kinds of more or less bad and undesired changes, fixed by always
re-inserting the destination mesh first.
Also cleaned up that code a bit, it was doing some rather
non-recommended things (like allocating zero-sized mem, doing its own
cooking to remove a data-block from main, etc.).
Tricky issue caused by CDDM_copy() copying the MFACE array but not MTFACE, which
confused logic later on.
Now we don't copy ANY tessellation unless it is requested.
Thanks Bastien for help and review!
The 'page' prop of the scroll up/down operators would get stuck once set by
the pageup/down keys... Now only take this prop into account if explicitly
set, not when its value is inherited from a previous run.
Since the features that are used for denoising may be highly correlated (for example, with a greyscale texture the three albedo channels will be identical), using them directly for fitting would be rather unstable.
Therefore, before performing the actual fit, a transformation into a reduced feature space is performed using Principal Component Analysis by calculating the eigendecomposition of X^t*X, where X is the feature matrix.
After doing that, the eigenvectors are the basis vectors of the new feature space, and the eigenvalues specify their "importance". Therefore, by discarding eigenvectors whose eigenvalues are low, it's possible to get rid of unnecessary dimensions.
Now, the question is which dimensions should be removed. The original WLR algorithm calculates a threshold based on the variance of the feature passes, with the goal of discarding noisy features. However, this implementation already prefilters the feature passes, so the (original) variance passes overestimate the actual variance a lot and discarding them isn't actually needed anymore.
Therefore, this commit replaces it with two simpler heuristics - either removing all eigenvalues below a certain threshold, or removing until a certain fraction of the energy in the eigenvalues is gone.
Which heuristic is used is chosen based on the sign of the filter strength, positive values choose the energy heuristic and negative values the absolute heuristic. In both cases, the threshold value is 10^(2*abs(filter strength)). If the default of zero is used, it uses the energy heuristic with a fraction of 10^-3.
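A rough sketch of the two truncation strategies (illustrative only, not the actual Cycles filter code; `threshold` is derived from the filter strength as described above):

```c
/* Returns the number of eigenvectors to keep; eigval[] is assumed
 * sorted in decreasing order. */
static int eigen_rank(const float *eigval, int n, float threshold,
                      int use_energy)
{
  if (use_energy) {
    /* Energy heuristic: drop trailing eigenvalues as long as the
     * discarded energy stays below the given fraction of the total. */
    float total = 0.0f, removed = 0.0f;
    int rank = n;
    for (int i = 0; i < n; i++) {
      total += eigval[i];
    }
    while (rank > 1 && removed + eigval[rank - 1] <= threshold * total) {
      removed += eigval[rank - 1];
      rank--;
    }
    return rank;
  }
  /* Absolute heuristic: keep only eigenvalues above the threshold. */
  int rank = 0;
  while (rank < n && eigval[rank] > threshold) {
    rank++;
  }
  return rank;
}
```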
Note that in some cases, especially motion blur and depth of field, this might cause new artifacts. These can be solved and I'll commit that soon. On the positive side, this change makes the denoiser handle hair/fur much better.
Strangely this change does not affect the performance very much.
Suzanne subdivided 6x (ortho view):
  Before: 0.00013983
  After:  0.00013920
But it makes it easier to read the code.
When the function that tests snap on multiple elements starts from the face and ends at the vertex, the transition between elements becomes much smoother.
Better to have clear way to tell whether flag is parameter for
BKE_library_foreach_ID_link(), parameter for its callback function, or
return value from this callback function.
Taking advantage of the area, the depth is decreased by 0.01 BU on each loop to give priority to elements in the order: Vertex > Edge > Face. This increases the threshold and improves snapping to multiple elements.
The previous solution used arbitrary values to determine whether the mouse was near the Bound Box or not (it simply scaled the Bound Box).
Now the same function that detects the distance from the BVHTree nodes to the mouse is used for the Bound Box.
This revision extends the functionality of the "Fill Range by Selection" button in
the "Distance from Camera/Object" modifiers so that only selected mesh vertices
in the edit mode are taken into account (instead of considering all vertices when
in the object mode) to compute the min & max distances from the reference.
This will give users much finer control on the range values.
Use new Main->relations ID usages mapping in BKE_library_make_local().
This allows a noticeable simplification of the code, and can be up to twice
as quick as the previous code (Make Local: All goes from 2 to 1 minute e.g. in a
huge production file with thousands of linked data-blocks).
Note that the new code has been successfully tested with several complex cases
(production files from Agent327), as well as some testcases from recent
bug reports related to that function. But as always, nothing beats real
usage by real users, so please check this before we release 2.79. ;)
Main areas that would be affected: Make Local operations (L shortcut in
3DView), and append from libraries.
Use Main->relations in BKE_library_foreach_ID_link(), when possible
(i.e. IDWALK_READONLY is set), and if the data is available of course.
This is a quite minor optimization, no sensible improvements are expected,
but it does not hurt either to avoid potentially tens of loops over e.g.
objects' constraints and modifiers, or heaps of drivers...
The new MainIDRelations stores two mappings, one from ID users to IDs
used, the other vice-versa.
That data is assumed to be short-lived runtime data; code creating it is
responsible for clearing it asap. It will be very useful in places where we
handle relations between IDs for a lot of them at once.
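The shape of the data is roughly as follows (a simplified sketch, not the exact declaration):

```c
/* Simplified: two hash maps over Main's IDs. */
typedef struct MainIDRelations {
  struct GHash *id_user_to_used; /* ID user -> IDs it references */
  struct GHash *id_used_to_user; /* ID used -> IDs referencing it */
} MainIDRelations;
```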
Note: This commit is not fully functional, that is, the infamous, ugly,
PoS non-ID nodetrees will not be handled correctly when building relations.
The fix needed here is a bit noisy, so it will be done in its own commit next.
This provides a slight improvement in performance in specific cases, such as when the observer is inside a high poly object and executes snap to edge or vertex
Distance calculation performed by the "Fill Range by Selection" button of the
"Distance from Camera" color, alpha and thickness modifiers was incorrect,
limiting the usefulness of the functionality.
The problem was that the distance between the camera and individual vertex
locations was calculated in the world space, which was inconsistent with the
distance calculation done by the modifiers in the camera space.
The new `isect_ray_aabb_v3_simple` function replaces `BKE_boundbox_ray_hit_check` and can be used on the BVHTree root (first AABB), so it is much more efficient.
In order to simplify the reading of these functions, the parameters `snap_to`, `mval`, `ray_start`, `ray_dir`, `view_proj` and `depth_range` are now stored in the struct `SnapData`.
Checking only whether mverts is the same as the base mesh one is not enough in
all cases; some modifiers (deform ones) can generate only new mvert
data, while keeping the others from the original mesh.
Now checking both mvert and medge, hopefully this will be enough to catch
all problematic cases this time.
Thanks @gaia for finding that problem. :)
Although the "BLI_bvhtree_find_nearest_to_ray" function is more practical than the generic "BLI_bvhtree_walk_dfs", it does not work to snap in perspective view. This makes it necessary to add "ifs" and functions that make the code difficult to understand
patch: D2474
This is a speed-up option which is mainly useful for the viewport. Gives a nice 2x
speedup in the barbershop scene when replacing GI with AO after the 2nd bounce,
without losing too much detail.
Reviewers: brecht
Subscribers: eyecandy, venomgfx
Differential Revision: https://developer.blender.org/D2383
We are not bumping file version, but we cannot have the doversion code running twice.
In this particular case it was crashing files, since we were setting node->storage to NULL, and later on accessing it.
The idea was to link something to a parent, but the point is:
we must not pass the owner deep down and then have any parent-type-related
logic implemented in the "children".
This is much more flexible solution which will allow doing some
more procedural features.
Reviewers: brecht, dfelinto, mont29
Reviewed By: mont29
Subscribers: Severin
Differential Revision: https://developer.blender.org/D2403
The freestyle data was never freed when removing a renderlayer.
```
blender -b --factory-startup --debug-memory --python-expr "import bpy;bpy.ops.scene.render_layer_add();bpy.context.scene.render.layers.active_index=0;bpy.ops.scene.render_layer_remove()"
```
Currently the tests don't run on windows for the following reasons:
1) render_graph_finalize has a linking issue due to missing a bunch of libraries (not sure why this is not an issue for linux).
2) This one is more interesting: in test/python/cmakelists.txt ${TEST_BLENDER_EXE_BARE} and ${TEST_BLENDER_EXE} are flat out wrong, but for some reason this doesn't matter for most tests, cause ctest will actually go out and look for the executable and fix the path for you, *BUT* only for the command; if you use them in any of the parameters it'll happily pass on the wrong path.
3) On linux you can just run a .py file; windows is not as awesome and needs to be told to run it with python.
4) Had to use the NAME/COMMAND long form of add_test, otherwise $<TARGET_FILE:blender> doesn't get expanded. Why? Beats me.
5) Missing idiff.exe for msvc2015/x64 in the libs folder.
This patch addresses 1-4, but given I have no working Linux build environment, I'm unsure if it'll break anything there.
5 has been fixed in rBL61751.
Reviewers: juicyfruit, brecht, sergey
Reviewed By: sergey
Subscribers: Blendify
Tags: #cycles, #automated_testing
Differential Revision: https://developer.blender.org/D2367
Blender's baking system currently doesn't support the topology used by
adaptive subdivision, and primitive ids will be wrong or out of range,
leading to crashes. Updating the baking system to support other
topologies would be a bit involved, so for now we simply disable
subdivision while baking to avoid crashes.
We started to run out of bits there, so now we separate the flags
which come from __object_flags from those which are either runtime or
come from __shader_flags.
Rule now is: SD_OBJECT_* flags are to be tested against new
object_flags field of ShaderData, all the rest flags are to
be tested against flags field of ShaderData.
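In code, the rule reads roughly like this (a sketch; field access simplified, flag names as in the kernel of the time):

```c
/* Object-level flags live in the object flags field... */
if (sd->object_flag & SD_OBJECT_TRANSFORM_APPLIED) {
  /* ...object-level state... */
}
/* ...while runtime/shader flags stay in the generic flag field. */
if (sd->flag & SD_BACKFACING) {
  /* ...runtime shading state... */
}
```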
There should be no user-visible changes, and the time difference
should be minimal. In fact, from tests here I can only see a hardly
measurable difference, and sometimes the new code is somewhat
faster (all within the noise floor, so hard to tell for sure).
Reviewers: brecht, dingto, juicyfruit, lukasstockner97, maiself
Differential Revision: https://developer.blender.org/D2428
The Cycles add-on did not actually support reloading correctly.
When you want to correctly reload sub-modules (i.e. modules of an add-on
which is a package), you need to use importlib; a mere import will do
nothing with already loaded modules (RNA classes are sort of
pre-registered when they are evaluated, through the meta-class system).
New options to define the style of the animation paths in order to get
better visibility in complex scenes.
It is now possible to define the color, thickness and several options relative
to the style of the lines used to draw the motion path.
This way we can stop traversing a BVH node early on.
Gives about a 2-2.5x render time improvement with 3 BVH steps.
Hopefully this gives no measurable performance loss for scenes with a
single BVH step.
Traversal is currently only implemented for QBVH, meaning old CPUs
and GPUs do not benefit from this change.
Similar to the previous commit, the statistics go as follows:

  BVH Steps   Render time (sec)   Memory usage (MB)
  0           46                  260
  1           27                  373
  2           18                  598
  3           15                  826
Scene used for the tests is the agent's body from one of the barber
shop scenes (no textures or anything, just a diffuse material).
Once again this is limited to the regular (non-spatial-split) BVH;
support of spatial split for this feature will come later.
The idea is to create several smaller BVH nodes for each of the motion
curve primitives. This acts as a forced spatial split for the single
primitive.
This gives a render time speedup of motion blurred hair at the cost
of extra memory usage. The numbers go as follows:

  BVH Steps   Render time (sec)   Memory usage (MB)
  0           258                 191
  1           123                 278
  2           69                  453
  3           43                  627
Scene used for the tests is the agent's hair from one of the barber
shop scenes.
Currently it's limited to scenes without spatial split enabled,
since the spatial split builder requires some changes to work properly
with motion steps coordinates.
Also fixed some issues with motion keys calculation:
- Clamp lower and upper limits of curves so we can safely call those
functions for the very first and very last curve segment.
- Fixed wrong indexing for the curve radius array.
- Fixed wrong motion attribute offset calculation.
Mimics how regular triangles are working and makes it more clear where
the stuff is located in the kernel.
Needed to have some forward declarations because of the current placement
of things in the kernel.
Following @AlonDan's feature request and @hjalti's screenshot yesterday,
I've decided to implement support for this to make it easier to scan which
keyframes correspond with which set of controls, especially when faced with
a large wall of keyframes.
In retrospect, I should've done this a long time ago!
Was a bit confusing to have transparent and translucent depth
exposed but no diffuse or glossy.
Reviewers: brecht
Subscribers: eyecandy
Differential Revision: https://developer.blender.org/D2399
This is important for the reliable behavior of isnan/isfinite/min/max
functions with NaN and non-finite values. Some of the issues
with fast math are possible to work around, but I didn't find a way to
have a reliable min/max implementation yet.
Please NEVER EVER use such a statement, it's only causing HUGE
issues. What is even worse: it's not always possible to immediately
see that the hell is coming from such a statement.
There are still some statements in the existing code; those are left
for a later cleanup.
- flushing hidden state ran when it didn't need to.
- flushing checks didn't early-exit when the first visible element was found.
- low level BM_*_hide API calls like this can skip iterators and
  loop over struct members directly.
No user-visible changes.
- face-create-extend option could add hidden verts and edges into
the selection history (invalid state).
- faces could be created that included existing hidden edges
that remained hidden (invalid state too).
- newly created faces could copy hidden flag from surrounding faces,
giving very confusing results (looks as if face creation failed).
Surprising nobody noticed these years-old bugs!
Experimental option for the Reproject Strokes operator to project strokes on to
geometry, instead of only doing this in a planar (i.e. parallel to viewplane) way.
The current implementation is quite rough, and may need to be improved before it
is really ready for use. Potential issues:
* Loss of precision (i.e. stair-stepping artifacts) from the 3D -> 2D -> 3D conversion,
  as we don't have a float version of one of the projection funcs
* Jagged depth if there are gaps, since it will default back to the 3d-cursor plane
if no geometry was found (instead of doing some fancy interpolation scheme)
* I'm not sure if it's that useful for adapting GP strokes to deforming geometry yet...
Now the eraser checks if there's an active frame with some strokes in it
before creating a new frame. There's no point in creating a new frame if
there are no strokes in the active frame (if one exists).
This still doesn't help much if there were strokes but they weren't touched though...
This operator adds a new frame with nothing in it on the current frame.
If there is already a frame there, all existing frames are shifted one frame later.
Quite often when animating, you may want a quick way to get a blank frame,
ready to start drawing something new. Or maybe you just need a quick way to
add a "placeholder" frame so that a suddenly-appearing element does not show
up before its time.
If the layers or the colors were renamed, the animation data was wrong
because the data path was not updated.
I have also fixed a possible stroke color name update issue when the name was
duplicated, by moving the rename function call after the unique name check.
Avoids possible jumps when one is trying to do some really precise tweak.
Quite a straightforward change for mouse input initialization: take the Shift
state into account. However, this will interfere with the axis exclusion
feature, which currently also uses Shift (the feature to move something in a
plane which doesn't have the selected axis). This is probably not such a commonly
used feature (nobody in the studio even knew of it) and the only downside
now would be that such a constrained movement will become accurate by
default. That's easy to deal with from the user side by just releasing the Shift key.
Reviewers: brecht, mont29, Severin
Differential Revision: https://developer.blender.org/D2418
To make it faster to try different interpolation curves, there's a new operator
"Remove Breakdowns" which will delete all breakdowns sandwiched by normal
keyframes (i.e. all the ones that the previous run of the Interpolation op created)
This commit introduces the ability to use the Robert Penner easing equations
or a Custom Curve to control the way that the "Interpolate Sequence" operator
interpolates between keyframes. Previously, it was only possible to get linear
interpolation between the gp frames.
Workflow:
1) Place current frame between a pair of GP keyframes
2) Open the "Interpolate" panel in the Toolshelf
3) Choose the interpolation type (under "Sequence Options")
4) Adjust settings (e.g. if you're using "Custom Curve", use the curvemap widget
to define the way that the interpolation proceeds)
5) Click "Sequence" to interpolate
6) Play back/scrub the animation to see if you've got the result you want
7) If you need to make some tweaks, undo, or delete the generated keyframes,
then repeat the process again from step 4 until you've got the desired result.
The "gp_sculpt" settings should be strictly for stroke sculpting, and not abused by
other tools. (Similarly, if other general GP tools need one-off options, those should
go into the normal toolsettings->gpencil_flag)
Furthermore, this paves the way for introducing new settings for controlling the way
that GP interpolation takes place (e.g. with easing equations, or a custom curvemap)
* Reshuffled some blocks of code for better ease of navigation/flow in the file
* Improved some tooltips
* Removed "Helper" tag from some functions that serve bigger roles
* Fixed some errant formatting
The interpolation operators (and their associated code) occupied a significant
portion of gpencil_edit.c (which was getting a bit heavy). So, it's best to split
these out into a separate file to make things easier to handle, in preparation
for some further dev work.
Things like `BLI_uniquename` had nothing, but really nothing to do in
BLI_path_util files!
Also, got rid of the length limitation in `BLI_uniquename_cb`; we can use
alloca here to avoid the overhead of malloc while keeping the usable size
unconstrained (within reasonable limits, of course).
Just store bones that could not get renamed to the desired flipped name on the
first try into a temp list, and try to rename them a second time.
This is a rather simple solution that will induce 'over numbering' in case you
flip a bone to another unselected bone's name (since the number will be
incremented in both rename attempts), but this seems an acceptable minor
glitch for a corner-case situation that does not have any good
resolution anyway.
Also, set the `strip_numbers` option of `BKE_deform_flip_side_name` to
false, otherwise chains of bones with the same names would get their numbers
completely messed up after name flipping.
Based on work by @dfelinto in D2456 (https://developer.blender.org/D2456), thanks.
This adds two functions to project 3d coordinates onto a 3d plane,
to get 2d coordinates, essentially eliminating the plane's normal axis
from the coordinates.
Reviewed By: mont29
Differential Revision: https://developer.blender.org/D2460
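The underlying idea can be illustrated with a hedged NumPy sketch (not the actual BLI functions; names and signatures here are illustrative):
```
import numpy as np

def project_to_plane_2d(co, plane_co, plane_no):
    # Express a 3D point as 2D coordinates on a plane, i.e. eliminate
    # the plane's normal axis (illustrative sketch only).
    n = plane_no / np.linalg.norm(plane_no)
    # Build two orthonormal axes spanning the plane.
    helper = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(n, helper)
    u /= np.linalg.norm(u)
    v = np.cross(n, u)
    rel = np.asarray(co, dtype=float) - np.asarray(plane_co, dtype=float)
    return np.array([rel @ u, rel @ v])
```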
It is quite likely in a triangulated mesh that the actual island edge
belongs to a different triangle than the current pixel; for example
consider corners of a triangulated axis aligned rectangle face that
have the additional edge: a pixel there will have to be assigned to
one of the triangles, but one of the edges of the original rectangle
can only be accessed through the other triangle.
Thus for robust operation it is necessary to do a recursive search.
The search is limited by requiring that it only goes through edges
that bring it closer to the target point, and also by depth as a
safeguard.
Differential Revision: https://developer.blender.org/D2409
The code requires the pixel on the other side of the seam to be assigned
precisely to the expected triangle. This can cause false negatives around
vertices, where a pixel is likely to touch multiple triangles and thus
cannot be said to unambiguously belong to any one of them, so check
distance to the intended triangle and accept the result if it's close.
1. Forcibly symmetrize the neighbor relations, so that if A is neighbor
of B, B is neighbor of A. The existing code is guaranteed to violate
this if texture resolution is different between the sides of a seam.
2. In texture mode dynamic paint adds a 1 pixel wide border around the
islands. These pixels aren't really part of the dynamic paint domain
and thus by design can't have symmetrical neighbor relations. This
means they can't be treated by effects like normal pixels.
The simplest way to handle it in a consistent way is to exclude
them from effects, but add an additional pass that recomputes them
as average of their non-border neighbors, located on both sides of
the seam.
This avoids intersecting the AABBs of different curve primitives,
which results in fewer ray-to-primitive intersections.
This gives about 30% speedup of hair rendering in the barber
shop scenes here. There is still some work to be done on those
files to solve major speed issues on certain frames.
This way we can have different limits for regular and motion curves
which we'll do in one of the upcoming commits in order to gain some
percents of speedup.
The reasoning here is that motion curves usually intersect
lots of other bounding boxes, which makes it inefficient to have
a single primitive in the leaf node.
The maximal number of elements is supposed to be inclusive. That is what
was always meant in this file and what @brecht considered still
the case in 6974b69c61.
In fact, the commit message to that change mentions that we allowed
up to 2 curve primitives per leaf while in fact it was doing up to 1
curve primitive.
Making it a real maximum of 2 primitives gives about 5% slowdown for the
koro.blend scene. This is the reason why BVHParams.max_curve_leaf_size
was changed to 1 by this change.
Since the beginning of time, hair settings in Cycles were global for
the whole scene but were located in the particle context. This caused
quite some trickery to get shots set up for the movies here in the
studio, forcing artists to create a dummy particle system to change
the hair settings of the shot.
While ideally these settings should properly become per-particle-system,
for the time being it will save sweat and blood to move the settings to
the scene context.
Reviewers: brecht
Subscribers: jtheninja, eyecandy, venomgfx, Blendify
Differential Revision: https://developer.blender.org/D2287
Made them closer to how GTest shows the output, so reading test logs
is easier now (at least feels more uniform).
Additionally, now we know how much time tests are taking, so we can tweak
samples/resolution to reduce the render time of slow tests.
It is now also possible to enable colored messages using magic
CYCLESTEST_COLOR environment variable. This makes it even easier to
visually grep failed/passed tests using `ctest -R cycles -V`.
Previously, the prefiltering NLM kernel was implemented just as it's described in the paper:
For every pixel P, loop over every pixel Q in the search window. Then, loop over the small patches around them, calculate the average difference, and use that to compute the weight of Q for the denoised result at P.
However, that gives you a time complexity of O(N^2 * R^2 * F^2), where N is the image size, R the search window and F the patch size...
So, this patch implements the clever idea from "A Simple Trick to Speed Up and Improve the Non-Local Means" - by reformulating the loop, it's actually possible to skip a lot of computation and replace it with a separable box filter convolution. This reduces complexity to O(N^2 * R^2 * F), and the amount of pixel differences calculated even to O(N^2 * R^2)!
Furthermore, by applying a second box-filter pass after calculating the weights, we get the "patchwise NLM" improvement basically for free!
This is CPU-only so far, but that will change soon.
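To illustrate the reformulation, a minimal NumPy sketch (not the actual kernel code; `uniform_filter` stands in for the separable box filter, and `h` is an illustrative filter strength):
```
import numpy as np
from scipy.ndimage import uniform_filter

def nlm_denoise(img, R=4, F=2, h=0.1):
    # For every offset t in the search window, a per-pixel squared
    # difference is box-filtered once, yielding patch distances for
    # *all* pixels at once - this is the O(F^2) -> O(1) trick.
    acc = np.zeros_like(img)
    wsum = np.zeros_like(img)
    for dy in range(-R, R + 1):
        for dx in range(-R, R + 1):
            shifted = np.roll(img, (dy, dx), axis=(0, 1))
            diff = (img - shifted) ** 2
            dist = uniform_filter(diff, size=2 * F + 1)  # separable box filter
            w = np.exp(-dist / (h * h))
            acc += w * shifted
            wsum += w
    return acc / wsum
```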
Since the type of int4 depends on whether SSE is enabled, the SSE kernels expect a different type than the device code.
Therefore, the content must be passed as a pointer...
Reusing PROP_TEXTEDIT_UPDATE instead of adding a new property flag just for search strings. Currently it's only used for search strings anyway so seems fine for now.
Fixes T50336.
This splits `interp_weights_face_v3` into `interp_weights_tri_v3` and
`interp_weights_quad_v3`, in order to properly handle three sided polygons
without needing a useless extra index in your weight array. This also
improves clarity and consistency with other math_geom functions, thus
reducing potential future errors.
Reviewed By: mont29
Differential Revision: https://developer.blender.org/D2461
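For illustration, a hedged Python sketch of triangle weights via sub-triangle areas (Blender's actual implementation may differ):
```
import numpy as np

def interp_weights_tri(p, a, b, c):
    # Barycentric weights: each corner's weight is the area of the
    # sub-triangle opposite to it, divided by the total area
    # (valid for p inside the triangle).
    def area(u, v, w):
        return 0.5 * np.linalg.norm(np.cross(v - u, w - u))
    total = area(a, b, c)
    return np.array([area(p, b, c), area(p, c, a), area(p, a, b)]) / total
```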
The issue was that we used to compare number of vertices for mesh after the auto
smooth was applied (at the center of the shutter time) with number of vertices
prior to the auto smooth applied. This caused false-positive consideration of a
mesh as changing topology.
Now we do the autosplit as early as possible, and do it from the Blender side,
so Cycles does not need to re-implement splitting on its side.
This way a render engine can request the mesh to be auto-split and not
worry about implementing this functionality on its own.
Please note that this split is to be performed prior to tessellation.
The denoising code started out as an implementation of WLR, but the NLM mode is working so much better that I decided to remove the WLR mode completely.
This allows to get rid of a significant amount of complexity and code.
Also, the NFOR mode is removed - the name is misleading, most of the ideas behind the NFOR paper are actually what powers the NLM mode. NFOR mode was just an experiment with removing the T-SVD feature space reduction, and it turned out that the experiment had failed.
Other than implementing a `mid_v3_v3_array` function, this removes
`cent_tri_v3` and `cent_quad_v3` in favor of `mid_v3_v3v3v3` and
`mid_v3_v3v3v3v3` respectively.
Reviewed By: mont29
Differential Revision: https://developer.blender.org/D2459
When layout has only small buttons (buttons with icon and without label)
its size should be fixed. Code was modified to be able to add a new UI_ITEM_MIN
flag which indicates that the layout has only small fixed-width buttons.
Patch by @raa, with minor style edits by @mont29.
Reviewers: Severin, mont29
Reviewed By: mont29
Tags: #bf_blender, #user_interface
Differential Revision: https://developer.blender.org/D2423
Am pretty sure node update should not touch the Main database like that,
but for now let's allow it; I guess the hack is needed for things like
Sverchok. ;)
If the active object is in weight paint mode, but some armatures in pose mode, 'manipulate center points' still affects the transformation. See bd2034a749.
Also removed redundant check, we basically did the same check for paint modes twice.
If a very low wetness absolute alpha brush is used with spread and
drying effects enabled, some pixels will rapidly accumulate paint.
This happens because paint drying code applies a minimal wetness
threshold that causes the paint to instantly dry out.
Specifically, every frame the brush adds paint at the specified
absolute alpha and wetness set to the minimal threshold, spread
drops it below threshold, and finally drying moves all paint to
the dry layer. This drastically accelerates the rate of flow of
paint into the affected pixels.
Fortunately, the reason paint spread actually ends up decreasing
wetness turns out to be a simple floating point precision problem,
which can be easily fixed by restructuring the affected expression.
Reported on IRC by dfelinto, thanks.
Root of the issue was that opening a new text file would create the
datablock with one user, when the Text editor is actually a 'user one' user.
This was leaving Text datablocks with an inconsistent user count, and
generating asserts in the BKE_library area.
Also changed a weird piece of code related to that extra user thing in
main remapping func.
Main issue here was that in the old usercount system 'user_real' simply did
not allow that kind of thing to work. With the new pair of 'USER_EXTRA'
tags, it becomes possible to handle the case correctly, by merely refining
checks about indirectly used objects when removing them from a scene.
Incidentally, found another related bug: 'link group objects to scene' was not
incrementing objects' usercount - bad, very very bad!
The settings.frame_start RNA was clamping frame start to frame end when frame start was bigger than frame end.
The fix is simply to set frame end first.
This is a hacky fix for a regression introduced sometime after 2.76.
The "Strip Time" setting on NLA Strips could not be edited without the
value immediately jumping back to the current FCurve value (or 0.0 if no
keyframes existed); even enabling autokey wouldn't let you key the property.
Until we have proper overrides (that only lose their values on frame change),
it's best that this setting is editable, even if it does mean you have to
manually change the frame to see the updated values.
Sometimes it can be useful to be able to keep onion skins visible in the
OpenGL renders and/or when doing animation playback. In particular, there
are two use cases where this is quite useful:
1) For creating a cheap motion-blur effect, especially when the before/after
values are also animated.
2) If you've animated a shot with onion skinning enabled, the poses may end
up looking odd if the ghosts are not shown (as you may have been accounting
for the ghosts when making the compositions).
This option can be found as the small "camera" toggle between the "Use Onion Skinning"
and "Use Custom Colors" options.
This is a regression introduced in rB5bd9e832
It looks more like a hack than a proper fix, but the shader logic
changed a lot for blender2.8, so I would rather do the elegant fix
there, while leaving master working.
If we ever do a 2.78b (or 2.79) this should get in.
That code was a joke, letting some invalid utf8 bytes pass, returning
wrong offset for some invalid sequences, not to mention length and
pointer easily going out of sync, NULL final byte being 'forgotten' by
memcpy, etc. etc.
The miracle here is that we could survive using this for so long!
Probably because we do not use utf-8 sanitizing enough in Blender,
actually... :/
This test should ensure we correctly detect all invalid utf-8 sequences in a given string.
DISCLAIMER:
Do not run this with current code - you'll either laugh or cry, nearly *all* checks fail!
Based on utf-8 decoder stress-test (https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt)
by Markus Kuhn <http://www.cl.cam.ac.uk/~mgk25/> - 2015-08-28 - CC BY 4.0
This is the same issue as was fixed with T39486: the adjustment pass
that tries to equalize different widths at either end of an edge
sometimes causes the widths to get bigger and bigger.
The previous fix was to let "clamp_overlap" do double duty as a way
to limit this behavior. But clearly this is undiscoverable, as the
current bug report shows. So I put in an "auto-limiting" mode that
detects when adjustments are going crazy and then acts as if
clamp_overlap were set.
The reason we can't always act as if clamp_overlap is set is that
certain models (e.g., Bent_test in regression tests) look bad if
that is enabled.
Include the idea that Blender may fail to launch it even if the path is correct,
in some cases (dear Windows...).
Based on idea from @lijenstina and @blendify (D2349), thanks.
Over time roll and orbit would scale the quaternion
which is documented as unit length.
In practice any errors would be subtle,
but better normalize as other operators do.
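A minimal sketch of the idea (hypothetical helpers, not Blender's code): re-normalize after composing rotations so the error cannot accumulate.
```
import numpy as np

def quat_mul(a, b):
    # Hamilton product of two (w, x, y, z) quaternions.
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ])

def apply_view_rotation(view_quat, delta_quat):
    q = quat_mul(view_quat, delta_quat)
    return q / np.linalg.norm(q)  # keep unit length, so drift can't build up
```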
Basic idea is to store the fileversion in the Library datablock, split
Main again by libraries after lib linking, run do_versions_after_liblink on
those separated Mains, and merge again.
This allows us to still have correct versions for each data-block in that
second do_versions step.
Note that this is not used currently in master (might be soon, though),
but is needed for 2.8 work.
The main scheduler would be created way before the `-t` argument was
parsed, since that was on the fourth pass! Moved it to the first pass of argparse;
that kind of stuff should be initialized ASAP on startup.
The code that was used for the T-SVD before came from the WLR reference implementation,
but had numerical problems on Windows and would often cause NaNs.
This commit replaces it with a new implementation using Eigendecomposition based on the Jacobi Eigenvalue Method.
That should:
- Give a slight performance boost (probably not noticeable, since the T-SVD was no bottleneck to begin with)
- Improve numerical accuracy of the results (not very important either since the eigenvalues are only compared against a threshold)
- FINALLY solve the black spot issue on Windows
- Slightly reduce memory usage (singular values are now constructed on the diagonal of the input matrix) with the potential of more in the future (now only the lower-triangular part is required).
- Resolve potential licensing issues - the specific file containing the original code didn't come with any licensing information, and the main file contains an apparently custom license...
When linking data-blocks from the same library in several steps, the already
linked data-blocks of the same lib would go through the versioning code again...
Note: only fixed for libraries, I can't imagine how this could happen
with local data...
Data transfer was not checking if the required geometry existed, thus
causing a segfault when it didn't. This adds the required checks, and
reports errors if geometry is missing.
This also replaces instances of the words "polygon" and "loop" in error
messages with "face" and "corner" respectively, to be consistent with
the rest of the existing UI.
Reviewed By: mont29
Differential Revision: http://developer.blender.org/D2410
When appending a datablock, the default brushes were not created, and were
only created when drawing new strokes. Now the default brushes are created
when drawing strokes, if necessary.
It's now possible to change the shortcut that enables planar transformation with the transform manipulators (shift+LMB on axis).
This actually fixes the workaround added in rB20681f49801fd. Thing is that we needed to allow using the manipulators, even if a modifier key is held so things like snapping work right away. That's why normal LMB behavior uses KM_ANY. However, event handling would always execute the KM_ANY keymap handler because it's iterated over first. Simply solved this by registering the KM_SHIFT keymap item first, so it has priority over the KM_ANY one.
Enum properties with icon only flag should use minimum/fixed width in expanded layouts (alignment=UI_LAYOUT_ALIGN_EXPAND).
Differential Revision: https://developer.blender.org/D2415 by @raa (only made some really minor corrections)
This matches the behavior of Multiscatter GGX and could become handy later on
when/if we decide it would be beneficial to replace one closure with another.
Reviewers: lukasstockner97, brecht
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2413
There were 16 bits reserved for the primitive type, while we only need 4.
Reviewers: brecht
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2401
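The idea in miniature (illustrative Python, not the actual Cycles encoding):
```
PRIM_TYPE_BITS = 4  # 4 bits are enough for all primitive types

def pack_prim(index, prim_type):
    # Low bits hold the type; everything above is free for the index.
    return (index << PRIM_TYPE_BITS) | prim_type

def unpack_prim(packed):
    return packed >> PRIM_TYPE_BITS, packed & ((1 << PRIM_TYPE_BITS) - 1)
```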
There was completely wrong logic in the comparison: it was actually accessing
memory past the array boundary. Ran the test manually and the figure seems
correct to me now.
Spotted by @LazyDodo, thanks!
This code has not been used for a long time, if ever.
Most of the code was removed in rB1d3609262704f88c9e30b2cebdb236110b25cdc9,
however this part was forgotten.
Adds two buttons to context (RMB) menu of path buttons:
* "Open File Externally" to open a file in an external app (only visible if path contains a filename)
* "Open Location Externally" to open a path in an external file browser
The functionality for this was already there, just hidden behind Shift/Alt-clicking the file_browse button (folder icon next to the path button).
This gives us 9 flags available again for properties (we had none anymore),
and also makes things slightly cleaner.
To simplify (and make more clear the differences between mere properties
and function parameters), also added RNA_def_parameter_flags function (and
its clear counterpart), to be used instead of RNA_def_property_flag for
function parameters.
This patch is also a big cleanup (some RNA function definitions were
still using 'prop' PropertyRNA pointer, etc.).
And yes, am aware this will be annoying for all branches, but we really need
to get new flags available for properties (will need at least one for override, etc.).
Reviewers: sergey, Severin
Subscribers: dfelinto, brecht
Differential Revision: https://developer.blender.org/D2400
Was a waaaaayyyyy too generic name for such a specific func; renamed
to the much more descriptive BKE_libblock_relink_to_newid().
In the near future (a few weeks, to limit as much as possible silent mismatches
in branches), will rename BKE_libblock_relink_ex to BKE_libblock_relink;
this is the real generic data-block relinking func!
Discard the whole volume stack on the last bounce (but keep
world volume if present).
Volumes are expected to be closed manifold meshes, meaning that if a
ray entered the volume, there should be an intersection event of the
ray exiting the volume. The case when a ray hits nothing but there
are still non-world volumes in the stack can happen in any of the
following cases.
1. Mesh is not closed manifold.
Such configurations are not really supported anyway and should
not be used.
The previous code would have considered the infinite length of the
ray to sample across, so the render result wasn't really correct
anyway.
2. The exit intersection is farther away than the camera far
clip distance.
This case also will behave differently now, but previously it
wasn't really correct either, so it's not like we're breaking
something which was working as expected.
3. We missed the exit event due to intersection precision issues.
This is exactly the case which this patch fixes, avoiding the
fireflies.
4. The volume has Camera-only visibility (all the rest of the
visibility options are turned off).
This is what could be considered a regression, but it could be
solved quite easily by checking the volume stack's object flags
and keeping entries which don't have Volume Scatter visibility
(or even better: ensuring Volume Scatter visibility for objects
with a volume closure).
Fixes T46108: Cycles - Overlapping emissive volumes generates unexpected bright hotspots around the intersection
Also fixes fireflies appearing on the edges of a cube with an
emissive volume.
Reviewers: juicyfruit, brecht
Reviewed By: brecht
Maniphest Tasks: T46108
Differential Revision: https://developer.blender.org/D2212
The `ui_item_enum_expand` function replaces all of a pie menu's sub-layouts with a radial layout. It should replace only the root layout.
To reproduce the issue, paste the code below into Blender's text editor and press the Run Script button.
```
import bpy

class VIEW3D_PIE_template(bpy.types.Menu):
    bl_label = "Select Mode"

    def draw(self, context):
        layout = self.layout.menu_pie()
        layout.column().prop(
            context.scene.render.image_settings, "color_mode", expand=True)

def register():
    bpy.utils.register_class(VIEW3D_PIE_template)

def unregister():
    bpy.utils.unregister_class(VIEW3D_PIE_template)

if __name__ == "__main__":
    register()
    bpy.ops.wm.call_menu_pie(name="VIEW3D_PIE_template")
```
Differential Revision: https://developer.blender.org/D2394 by @raa
Reading the rest of the code, it's obvious we want to start at YOFF lines
from the start of rect2i, so we have to also multiply by the number of
components.
Also did some minor cleanup.
Code was not accounting for the possibility that the width or height of the
given buffers may be smaller than XOFF/YOFF...
Note that I seriously doubt the drop code actually works (as in, gives
expected results) when applied to tiles like it seems to be done
currently, but this is a much more complex (and involved) topic.
This adds a short message to the smoke, remesh and boolean modifiers' UI
when trying to use them while their compilation was turned off. This was
already implemented for the fluid and ocean simulation modifiers.
This also makes the 'quick fluid' and 'quick smoke' operators abort and
report an error when trying to use them while unavailable.
This kind of keeps threads "warmer" and should in theory give better
cache coherency, bringing some % of speedup. It was already tested a
few months ago and gave a few % of speedup in the barber shop scene, but was
reverted due to some bone popping. The popping is now fixed, so it
should be fine to use the new scheduling policy.
This sets forces to zero when Nabla is zero and a grayscale texture is
used, or when texture mode is Gradient or Curl.
Nabla equal to zero was causing a division by zero, and forces ended up
being set to `nan`.
Reviewed By: mont29
Differential Revision: http://developer.blender.org/D2393
Own error when changing the order:
moving experimental features last made some sense,
but caused them to be listed twice.
Reorder and comment to avoid it happening again.
- Expand overly dense & confusing delta assignments.
- Replace bit shift with multiply.
Also link to 'clipped' version of this function
which may be useful to add later.
The Progress system in Cycles had two limitations so far:
- It just counted tiles, but ignored their size. For example, when rendering a 600x500 image with 512x512 tiles, the right 88x500 tile would count for 50% of the progress, although it only covers 15% of the image.
- Scene update time was incorrectly counted as rendering time - therefore, the remaining time started very long and gradually decreased.
This patch fixes both problems:
First of all, the Progress now has a function to ignore time spans, and that is used to ignore scene update time.
The larger change is the tile size: Instead of counting samples per tile, so that the final value is num_samples*num_tiles, the code now counts every sample for every pixel, so that the final value is num_samples*num_pixels.
Along with that, some unused variables were removed from the Progress and Session classes.
Reviewers: brecht, sergey, #cycles
Subscribers: brecht, candreacchio, sergey
Differential Revision: https://developer.blender.org/D2214
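The gist of the new accounting, as a hedged sketch (names are illustrative):
```
def render_progress(finished_pixel_samples, width, height, num_samples):
    # Count every sample of every pixel, so an 88x500 edge tile
    # contributes by its area rather than as "one tile out of two".
    total = width * height * num_samples  # num_samples * num_pixels
    return finished_pixel_samples / total
```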
Quite handy for debugging.
Unfortunately, this doesn't support viewport tweaks yet, since those
require GLSL for colorspace conversion. Maybe this will be implemented
as well one day in the future...
They are defined for MSVC but seem to be missing in GCC and Clang 3.8.
Maybe some further tweaks to the policy of when to define those functions
are needed, but this should be fine for now.
I can no longer reproduce the crash with either of the files where
the crash was originally visible. This is something where other
changes (light threshold, sampling) had an effect and made the code
work as it is supposed to. Could have been an optimizer issue
or something like that.
Let's see if we hit the same issue again.
`IMB_remakemipmap` may 'shrink' the mipmap list without actually freeing
anything, so we need to check all possible levels in `imb_freemipmapImBuf`
to avoid memory leaks, not only those currently used.
In ccgDM and emDM, looptri array recalculation was being handled
directly by `*DM_getLoopTriArray` (`getLoopTriArray` callback), while
`*DM_recalcLoopTri` (`recalcLoopTri` callback) was doing nothing.
This results in the array not being recalculated when other functions
that depend on the array data called the recalc function.
This moves all the recalculation code to `*DM_recalcLoopTri` and makes
`*DM_getLoopTriArray` call that.
This commit also makes a minor change to the `getNumLoopTri` function,
so that it returns the correct number without having to recalculate the
looptri array.
Reviewed By: mont29
Differential Revision: https://developer.blender.org/D2375
There were two cases where correlation issues were obvious:
- File from T38710 was giving issues in 2.78a again
- File from T50116 was having totally different shadow between
sample 1 and sample 32.
Use a more simplified version of the CMJ hash, which seems to give
a nicely randomized value that solves the correlation.
This commit will break all unit test files, but it's a bug fix
so perhaps OK to commit this.
This also fixes T41143: Sobol gives nonuniform noise
Proper science paper about hash function is coming.
Reviewers: brecht
Reviewed By: brecht
Subscribers: lukasstockner97
Differential Revision: https://developer.blender.org/D2385
Most of them are harmless implicit conversions (e.g. Alembic deals with
doubles for storing time information when Blender uses both ints and
floats/doubles) or class/struct mismatch on forward declarations.
This aims at always ensuring that ID.newid (and relevant LIB_TAG_NEW)
stay in clean (i.e. cleared) state by default.
To achieve this, instead of clearing after every ID copy call (which would be
horribly noisy, and bad for performance), we try to completely remove
the setting of id->newid by default when copying a new ID.
This implies that areas actually needing that info (mainly, object editing
area (make single user...) and make local area) have to ensure they set
it themselves as needed.
This is far from simple change, many complex code paths to consider, so
will need some serious testing. :/
Was groundwork for some more improvements here, but got dragged
into some other studio maintenance job.
The plan would be to enable exposure/gamma control for fallback mode
which will definitely be really handy for development and might be
handy for cases when OCIO config can not be read.
Crash is due to mismatching loop and face counts between the Alembic
data and the Blender derivedmesh, which does not appear so
straightforward to fix (the crash happens deep in the derivedmesh code).
So for now, try to detect if the topology has changed and if so, both
only read vertices (vertex colors and UVs won't be read, as tied to face
loops) and add a warning message in the modifier's UI to let the user
know.
The idea is simple: cache the PD resolution from the cache_point_density() RNA
function, because that one is supposed to be called while the database is
locked for original synchronization.
Ideally we would also pass the array size to the sampling function, but
it turned out to be quite problematic because the API only accepts the int
type, and passing size_t might cause some weird behavior.
This is a way to avoid possible memory corruption when render threads work
in parallel with the UI thread.
It does not guarantee complete safety, but makes things easier to check anyway.
To bring the API more into line with the UI (and the general expected behaviour of
Blender when it comes to adding stuff), newly created layers and palettes will be
made the active ones by default. It's possible to override this behaviour still
(e.g. in cases where you're auto-generating a large number of them), but otherwise,
this change will help prevent errors like T50123.
When there were no prior palettes, creating a new one didn't automatically make it active.
This caused problems when trying to rename the color, as the RNA code assumed that if there's
a color, it must come from the active palette.
This commit partially fixes the problem by ensuring that if there are no palettes, the first
one will always be made active.
- Remove 'rotate_m2', unlike 'rotate_m4' it created a new matrix
duplicating 'angle_to_mat2' - now used instead.
(better avoid matching functions having different behavior).
- Add 'axis_angle_to_mat4_single',
convenience wrapper for 'axis_angle_to_mat3_single'.
- Replace 'unit_m4(), rotate_m4()' with a single call to 'axis_angle_to_mat4_single'.
This is no longer needed since moving to MPoly/MLoop data structure.
Also use 3x3 matrix for transforming instead of quaternion
(slightly better performance).
Was giving huge artifacts in the barber shop file here in the studio.
Maybe not a fully optimal solution, but committing it for now to have a
closer look later.
All objects were being parented to a single instance of each parent
object, instead of their respective instances, when using dupliverts or
dupligroups.
Behavior was caused by the `persistent_id[0]` (vertex/face id) being
ignored when computing `parent_gh` hash, which caused all instances to
have the same hash, and thus only the first one was included.
Reviewed By: mont29
Differential Revision: https://developer.blender.org/D2370
Was affecting armatures' pose drawing code, could try to draw with
non-updated pose, which may contain NULL bone pointers (e.g. after some
data-block management tool execution, like make local, remapping, etc.).
Main intention is to give some quick way to control a scene's memory
usage by clamping textures which are too big. This is really handy
at the early production stages, when you first create really nice-looking
hi-res textures, and only once it all works and is approved
start investing time into optimizing your scene.
This is a new option in the Scene Simplify panel, and it acts as
follows: when the texture size is bigger than the given value, it'll
be scaled down by half until it fits into the given limit.
There are various possible improvements, such as:
- Use threaded scaling using our own task manager.
This is actually one of the main reasons why the image resize is
manually implemented instead of using OIIO's resize. The other
reason is that the API seems limited for constructing a 3D texture
description easily.
- Vectorization of uchar4/float4/half4 textures.
- Use something smarter than box filter.
Was playing with some other filters, but not sure they are
really better: they kind of cause fuzzier edges.
Even with such a TODOs in the code the option is already quite
useful.
Reviewers: brecht
Reviewed By: brecht
Subscribers: jtheninja, Blendify, gregzaal, venomgfx
Differential Revision: https://developer.blender.org/D2362
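The clamping rule itself is simple; a hedged sketch of the halving loop:
```
def clamp_texture_size(width, height, limit):
    # Halve the resolution until both dimensions fit the Simplify limit.
    while width > limit or height > limit:
        width = max(1, width // 2)
        height = max(1, height // 2)
    return width, height

# e.g. clamp_texture_size(4000, 2000, 1024) -> (1000, 500)
```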
This will make triple buffer used by default for such configuration.
Ideally we would switch to triple buffer on all platforms, but let's
do it in 2.8 branch and don't open can of worms in master now.
This should solve issues like T49945.
Previously, a build of Cycles Standalone was needed for animation denoising.
Now, it is possible to call it from Python Scripts in Blender or directly from the command line (through --python-expr).
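For example (hedged: assuming the operator is exposed as `bpy.ops.cycles.denoise_animation`; paths are placeholders):
```
import bpy

# Denoise an already-rendered animation that was saved with denoising
# data passes; file paths here are placeholders.
bpy.ops.cycles.denoise_animation(
    input_filepath="//render/",
    output_filepath="//denoised/",
)
```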
This is very confusing, in fact, and the RNA tooltip was wrong:
BKE_object_make_local_ex actually ensures we never have several proxies
of the same object, since it always clears the proxy when it has to copy the
object to make it local...
What that RNA function is probably missing, though, is same logic as in
BKE_library_make_local to actually remap proxy from old linked object to
new local one.
I) The `clear_proxy` parameter was not assigned to parm in the RNA define code,
so the 'pyfunc optional' flag was set on the `new_id` parameter of the `user_remap`
func - super ugly!
II) The `clear_proxy` parameter itself, when set to False, would allow
leaving the .blend file in an invalid state (more than one proxy of the same
object), which should never, ever be allowed in the RNA API imho. Left the API
untouched for now, just disabled any effect from this parameter (hence
always clearing the proxy when copying).
There is some define conflict between system headers and clew,
so delay the include of clew.h as much as possible.
This is something which needed to be done in the code before
the refactor, hopefully such change will still work.
For the multi-GPU case users still have to reconfigure the devices they want to use.
Based on patch from Lukas Stockner.
Differential Revision: https://developer.blender.org/D2347
This can be used together with camera culling to keep nearby objects visible in
reflections, using a minimum distance within which objects are visible. It is
also useful to cull small objects far from the camera.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2332
When the parent object matrix changes after the layer was parented, the
inverse matrix for strokes must be updated when editing strokes, or the
transformations will be wrong.
We need to check node tree links are still valid, after we remapped
some NodeGroup.
Note: In fact, we have to run that for *all* ID types, since nodes may
use any kind of data-block (in theory)... :/
Forward compatibility code should never, ever be run during undo saving.
Note: related to T49991 (but does not fix it either, crash now happens
when doing a real file save...).
Adding a torus in edit-mode with 'Generate UVs', for example,
would either create another UV layer with the default name or
switch to the default UV layer name if it exists.
Now the existing UV layer is used if present.
A '1' threshold value would only allow access to a third of the basic
'color space' (from black to white, from 0.0 to 1.0 component values),
when you expect it to access the whole range.
Unfortunately, this needs a subversion bump to allow already defined
brushes to keep the exact same behavior!
Also did not change the default value (0.2) for new brushes; keeping the
current one makes more sense here, I think.
Thanks to @LucaRood for confirming the issue.
Empty images were implemented to expand (and eventually replace)
the background images functionalities. If we are ever to drop
background images "image empties" should support stereo/multi-view as well.
The render passes for grease pencil are not necessary when rendering from
the sequencer.
This fix solves the GPF, but we need to rethink the complete render
process for grease pencil and integrate it better into the render and
composition workflow.
Thanks to Dalai Felinto for helping in the debugging and fixing of the
problem.
It seems to me the icons are unused:
- VICO_VIEW3D_VEC
- VICO_EDIT_VEC
- VICO_EDITMODE_VEC_DEHLT
- VICO_EDITMODE_VEC_HLT
- VICO_DISCLOSURE_TRI_RIGHT_VEC
- VICO_DISCLOSURE_TRI_DOWN_VEC
- VICO_MOVE_UP_VEC
- VICO_MOVE_DOWN_VEC
- VICO_X_VEC
Since their code contains immediate mode GL calls and they seem to be unused, I thought we could remove them.
Reviewers: mont29
Reviewed By: mont29
Subscribers: merwin
Tags: #bf_blender_2.8
Differential Revision: https://developer.blender.org/D2356
On second and third thought, this should have been done that way since
the beginning: cases where you just delete a few data-blocks without any
serious knowledge of their usages are much, much more frequent than
cases where you are deleting thousands of data-blocks and are sure they
are not used anywhere anymore...
Own fault, but really frustrated that this topic was only raised the day
after 2.78a was released. :(
This has nothing to do here (freeing is not unlinking/remapping!), and
was actually redoing something already taken care of by
`BKE_libblock_relink_ex()` call in `BKE_libblock_free_ex()`.
Also, gives some noticeable speedup when removing datablocks with
do_unlink=True, about 5 to 10% quicker e.g. when deleting all objects
from a py console, in a big production file...
This commit reverts part of a fix for T33275, but things are:
- I can not reproduce the original issue at all, so this doesn't seem to
cause any regressions.
- It is a really bad idea to do delayed initialization in a threaded
environment; it's a straight way to some nasty issues.
- We can't do things like this anyway because we go more granular,
meaning such a delayed initialization will fail in the case of
having several IK solvers (unless they properly accommodate the
changed bone head).
- Verified the fix with various files from the Mango project, and all of
them seem to work nicely with the new dependency graph now (the old depsgraph
has some flickering, but it's not related to DEG itself, rather to
an environment with lots of proxies and threaded evaluation, and it
is not a new behavior).
This reverts commit 9b5a32cbfb.
Apparently it is possible to have another thread mucking around with the hash.
Needs deeper investigation; for the time being, reverting to prevent crashes.
This commit fixes two issues:
- UV/Image editor uvs menu did not match the 3D View's which was changed in rB2b240b043078
- Circle select tool was missing in particle edit mode
Reviewers: Severin
Differential Revision: https://developer.blender.org/D2329
Added BKE_libblock_free_data_ex(), which takes a special do_id_user
argument that basically indicates whether the main database was already
taken care of regarding "dead" pointers.
Gives about 40% speedup of main database free with quadbot scene
(3.4sec vs. 5.4 sec on quite powerful desktop).
Just fixing crash itself. Actually operator shouldn't run in most editors (not in dopesheet either I guess), but don't want to spend time on that right now.
This option makes an operator not push a task to the undo stack if the previously stored element is the same operator, or part of the same undo group.
The main usage is for animation, so you can change frames to inspect the
poses, and revert the previous pose without having to roll back tons of
"change frame" operator, or even see the undo stack full.
This complements rB13ee9b8e
Design with help by Sergey Sharybin.
Reviewers: sergey, mont29
Reviewed By: mont29, sergey
Subscribers: pyc0d3r, hjalti, Severin, lowercase, brecht, monio, aligorith, hadrien, jbakker
Differential Revision: https://developer.blender.org/D2330
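A hedged sketch of how an operator would opt in to this behavior (illustrative operator, assuming the `UNDO_GROUPED` option and `bl_undo_group` attribute added by this change):
```
import bpy

class VIEW3D_OT_nudge(bpy.types.Operator):
    bl_idname = "view3d.nudge"
    bl_label = "Nudge"
    # 'UNDO_GROUPED': don't push a new undo step if the previously
    # stored element came from this same undo group.
    bl_options = {'REGISTER', 'UNDO_GROUPED'}
    bl_undo_group = "NUDGE_GROUP"

    def execute(self, context):
        context.object.location.x += 0.1
        return {'FINISHED'}
```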
- `bmesh_radial_faceloop_find_first` & `bmesh_disk_faceedge_find_first`
can be replaced with a single call to a new function:
`bmesh_disk_faceloop_find_first`
- `bmesh_disk_faceedge_find_first` called `bmesh_radial_facevert_check`
which isn't needed, since either the current or next loop in the
cycle is attached to the edge we're looking for.
The code was templated already, so don't see big reason to have
3 versions of templated functions. It was giving some extra code
to maintain, and in fact already had divergence in supporting huge
image resolutions (a missing size_t cast in byte image loading).
There should be no changes visible by artists.
Just return the face or NULL, like BM_edge_exists(),
Also for BM_face_exists_overlap & bm_face_exists_tri_from_loop_vert.
No functional changes.
Old code did some partial overlap checks where this made some sense,
but it's since been removed.
The mismatching alloc/dealloc would upset ASan, and the undefined elements would cause FPEs in the NLM code (the result was fine due to masking, but the intermediate values weren't).
That change (enabling __KERNEL_SSE__ for AVX(2)) broke the ABI between the kernel and device code when int4 was passed as an argument.
Now, a pointer to the first element is passed instead.
Object freeing may in some cases access its obdata (in case it has some
caches, e.g.); since here obdata may have already been freed, let's set the
object's data pointer to NULL (probably not an ideal solution, but we don't
care much - those form archipelagos of unused linked datablocks,
we nuke 'em all anyway).
Also fix a stupid mistake in one of my own recent commits (using an ID we
had just freed, tsst...).
This should make it easier to sculpt at high resolutions; the downside is that the new way of calculating the maximum edge length is a bit less intuitive. The maximum edge length used to be calculated as blender_unit * percentage_value; now it's blender_unit / value (see the sketch below).
Reused an old DNA struct member, but had to bump the subversion to ensure correct compatibility conversion. Also changed the default value slightly (would have had to set it to 3.333... otherwise).
Was Requested by @monio (see https://rightclickselect.com/p/sculpting/zpbbbc/dyntopo-better-scale-input-in-constant-detail-mode) and I think it's worth testing.
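In other words (a hedged sketch of the two formulas described above):
```
# Old constant-detail size: a percentage of the Blender unit.
def max_edge_length_old(blender_unit, percentage):
    return blender_unit * (percentage / 100.0)

# New constant-detail size: the unit divided by the detail value,
# so higher values give finer detail.
def max_edge_length_new(blender_unit, detail):
    return blender_unit / detail
```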
Do not set 'real user' on groups every time we run the first clearing loop.
And do properly, fully clear LIB_TAG_DOIT (this is not yet enforced in
existing code, but would love to get to that stage in the future, so let's
do it at least with new code!).
Basic idea is to split first loop in two, and run checks before making
anything actually local, to detect data-blocks that we can directly make
local (because we are sure they are only used by already/future local
datablocks).
This allows avoiding a lot of overhead in the later 'cleanup' steps of this
function; here with the barbershop shot it's four times quicker (from 190s to 48s).
We are still far from the instantaneous results of MakeLocal in 2.77,
but in that version main characters lose their connection to their
armature and remain static after makelocal, so guess new code is still
better. ;)
There are probably more optimizations possible here, but would rather
polish this area of code once we get rid of proxies, those really
make it a nightmare to work on.
If the OpenGL render with grease pencil is run from the VSE with the current
frame outside the visible frames, the render pass is wrong and the render
must be canceled because there is nothing to render. Related to T49975.
Before this commit, the brush set was created with the first stroke
drawing, but if the user created the datablock or the layer manually
(not by drawing), the brush list was empty.
This commit complements the python fix by Sergey:
https://developer.blender.org/rB89c1f9db37cc1becdd437fcfdb1877306cc2b329
Culprit here was once more proxies. Think what was happening here was:
1) Both proxy and proxified armatures' PoseChannels were cleared
(needed after remapping due to Bone pointers being stored in pchans).
2) Proxy PoseChannels got rebuilt in `BKE_pose_rebuild_ex()`, which ends,
in proxy cases, by actually replacing rebuilt pchans by those from
the proxified object... which has not yet been rebuilt.
Fixed the issue by merely adding the bone pointer to the data copied from
the original pchan into the new 'from proxy' one... Sounds much, much safer and
saner anyway; that way we can be sure the bone pointer is actually pointing
to a bone of the object's armature (this is supposed to be the same
Armature datablock between proxy and proxified objects, but that may not
always be true, especially during the makelocal process).
Did a similar trick as in the old dependency graph: tag invisible relations for update.
Might need some re-consideration, see the comment.
This should solve our issues with powerlib addon here in the studio.
The new dependency graph expects a strict separation between the node and relation
builders, meaning that if we try to create a relation with an object which is not
in the graph yet, we'll get an error in the depsgraph.
So far, object nodes were created from the bases of the current scene, which caused
missing objects in the graph in certain cases.
Didn't find a better approach than to simply ensure object nodes exist when we know
they'll be used by the relation builder.
`kernel_path.h` and `kernel_path_branched.h` have a lot of conditional code and
it was kind of hard to tell what code belonged to which directive. Should be
easier to read now.
Proxified objects can never be local, we can totally ignore them here.
This 'fixes' the asserts related to usercount when trying to remap poselib
of localized proxified objects (not sure what exactly was going on wrong here,
but proxies are a giant can of worms for sane data-blocks handling anyway :/).
This new `bpy.types.ID.make_local(clear_proxies=True)` allows Python
code to press the "Make Local" button on any ID block. I chose
`clear_proxies=True` as the default, since it's the default behaviour
of `id_make_local()` (defined in `library.c`).
The caller does need to take care of ensuring that linked-in objects
don't refer to local data, and that proxies aren't broken.
Reviewers: sergey, mont29
Reviewed By: mont29
Subscribers: dfelinto
Differential Revision: https://developer.blender.org/D2346
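Example usage (hedged: the object name is a placeholder, and the sketch assumes the function returns the localized datablock):
```
import bpy

# Press the "Make Local" button on a linked datablock from Python.
linked = bpy.data.objects["Suzanne"]  # placeholder: some linked object
local = linked.make_local(clear_proxies=True)
```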
Add subframe to the animated seed hash calculation.
Should be no difference for the regular files, only for cases when scene is
rendered from sequencer with a speed effect, which is not really a common thing.
This is a late follow-up commit to the light sample threshold changes which
caused difference in rendering all existing .blend files which is not something
we are happy about: it is fine to use new optimized defaults for new files, but
existing ones should always be rendering in the same way as they used to be.
Sorry for the inconveniece, but such thing should have been done to begin with.
If this setting was modified it will not be reset to zero.
Now all render tests should be passing again.
P.S. Also really annoying to bump subversion for such reasons, but currently we
don't have better way to achieve what we want.
Lamp Data node requires shadow sample array which is only enabled when
Shadows are enabled in the shading settings.
This commit prevents crash but might not give expected render results
in such a configuration.
After the CUDA dynload changes, having the CUDA toolkit installed became
required in order to compile Cycles. This only happened due to a wrong
default value for the option.
The idea is simple: when falling back to one of the nodes which was partially
handled, we "resume" checking outgoing relations from the index at which we stopped.
This saves about 15-20% of depsgraph construction time.
This is a more proper way to go:
- Avoids re-compilation of all dependent files when the implementation changes
without a changed API,
- The linker should have a much easier time now de-duplicating and getting rid
of redundant implementations.
The idea here is to address the issue that a name on its own is not
always unique: for example, when adding driver operations, the
name used for nodes is the RNA path (and multiple drivers can
write to different array indices of the path). Basically, now
it's possible to pass an extra integer value to distinguish
operations in such cases.
So now we've switched from using sprintf() to construct a unique
operation name to passing the RNA path and array index.
There should be no functional changes yet, but this work is
required for further work about replacing string with const
char*.
There is no real reason to have nodes storing a heap-allocated name
and description. Doing this increases the number of allocations during
dependency graph building, which usually means some slowness.
We're temporarily losing some eyecandy in the graphviz visualizer,
but that we can bring back as a part of the graphviz dump (which happens
much less often than a depsgraph build).
This will happen in multiple commits for the ease of bisect in the
future just in case this causes any regression. This commit contains
ID creation API changes.
Bullet spring constraint already supports rotational springs, but
they are not exposed in blender UI, likely due to a simple oversight.
Supporting them is as simple as adding a few DNA/RNA properties
with appropriate UI and passing them on to Bullet.
Reviewers: sergof
Reviewed By: sergof
Differential Revision: https://developer.blender.org/D2331
Previously, it was only possible to choose a single GPU or all of that type (CUDA or OpenCL).
Now, a toggle button is displayed for every device.
These settings are tied to the PCI Bus ID of the devices, so they're consistent across hardware addition and removal (but not when swapping/moving cards).
From the code perspective, the more important change is that now, the compute device properties are stored in the Addon preferences of the Cycles addon, instead of directly in the User Preferences.
This allows for a cleaner implementation, removing the Cycles C API functions that were called by the RNA code to specify the enum items.
Note that this change is neither backwards- nor forwards-compatible, but since it's only a User Preference no existing files are broken.
Reviewers: #cycles, brecht
Reviewed By: #cycles, brecht
Subscribers: brecht, juicyfruit, mib2berlin, Blendify
Differential Revision: https://developer.blender.org/D2338
With this fix, using a MIS map resolution equal to the image size for closest interpolation, or twice the size for linear interpolation, gets rid of all fireflies.
Previously, a much higher resolution was needed to get acceptable noise levels.
There are more DLLs hanging out in the ucrt folder, but I just grabbed the ones Blender requested (not sure if that's a wise idea, but it seems to work).
Reviewers: sergey, juicyfruit
Reviewed By: juicyfruit
Differential Revision: https://developer.blender.org/D2335
We have to clear `newid` of all datablocks, not only object ones.
Note that this whole thing is still using some kind of older, primitive
'ID remapping'; would like to see whether we can replace it with the new,
more generic one, but that's for another day.
Code would try to add the same key multiple times to `parent_gh` (for this
ghash, many dupli-objects may generate the same key).
This was making the tool unusable in debug builds.
Also optimize things a bit by avoiding the creation of parent_gh when only
`use_base_parent` is set.
New code dealing with getting rid of library-only cycles of data-blocks
could add the same datablock several times to the list of candidates. Now
this is avoided, and pointers are further cleaned up as a double-safety
measure.
Feature request during bconf; makes sense to have it even as a hack for
now, since this is probably one of the most common use cases. This should
be redone in bmesh once we have proper custom normals handling in edit mode...
The issue was caused by image ID nodes not being in the depsgraph.
Now, tricky part: we only add nodes but do not add relations yet. Reasoning:
- It's currently important to only call editor's ID update callback to solve
the issue, without need to flush changes somewhere deeper.
- Adding relations might cause some unwanted updates, so will leave that for
a later investigation.
Basically, the problem here was that the transform that's used to bring texture coordinates
to world space is either fetched while setting up the shader (when Object Motion is enabled) or
fetched when needed (otherwise). That helps to save ShaderData memory on OpenCL when Object Motion isn't needed.
Now, if OM is enabled, the Lamp transform can just be stored inside the ShaderData as well. The original commit just assumed it is.
However, when it's not (on OpenCL by default, for example), there is no easy way to fetch it when needed, since the ShaderData doesn't
store the Lamp index.
So, for now the lamps just don't support local texture coordinates anymore when Object Motion is disabled.
To fix and support this properly, one of the following could be done:
- Just always pre-fetch the transform. Downside: Memory Usage increases when not using OM on OpenCL
- Add a variable to ShaderData that stores the Lamp ID to allow fetching it when needed
- Store the Lamp ID inside prim or object. Problem: Cycles currently checks these for whether an object was hit - these checks would need to be changed.
- Enable OM whenever a Texture Coordinate's Normal output is used. Downside: Might not actually be needed.
The animation system has separate fcurves for each array element, and the
dependency graph creates separate nodes for each fcurve. This is
needed to keep granularity of updates, but causes issues because the
animation system will actually write the whole array to the property when
modifying a single value (this is a limitation of the RNA API).
Worked around by adding an operation relation between array drivers,
so we never write the same array from multiple threads.
It was possible to have synchronization issues when accumulating smooth
normals to a vertex, causing shading artifacts during playback.
Bug found by Dalai, thanks!
They were not real issues; it's just that some areas of the code tried to create
relations between non-existing nodes without checking whether such
relations are really needed.
Now it should be easier to see real bugs printed.
Hopefully should be no regressions here.
Some platforms are having a hard time using this linker, so added an option
to not use it. The option is an advanced one and enabled by default, so it
should not cause any changes for current users.
Request from Hjalti Hjalmarsson for the animation work.
Basically a common part of the workflow of animation is to change the pose, scrub back and forth a few times and roll back the changes when unsatisfied.
However if you go back and forth too many times the UNDO stack would be full, and it would not be possible to bring back the previous pose.
I'm leaving clip_editor change frames as it is for now. But we can
probably change the behaviour there as well.
Issue here was that the py API code was keeping references (pointers) to the
linked data-blocks, which can actually be duplicated and then deleted
during the 'make local' process...
Would have liked to find a better way than passing an optional GHash to get
the oldid->newid mapping, but could not think of a better idea.
Radial append/remove had swapped args and *slightly* different behavior.
- bmesh_radial_append(edge, loop)
- bmesh_radial_loop_remove(loop, edge)
Match logic for append/remove,
Logic for the one case where the edge needs to be left untouched
has been moved to: `bmesh_radial_loop_unlink`.
This is yet another debug option that allows rendering an arbitrary
simulation field by using a color ramp to inspect its voxel values.
Note that when using this, fire rendering is turned off.
Reviewers: plasmasolutions, gottfried
Differential Revision: https://developer.blender.org/D1733
This allows saving a memory copy, which will be particularly useful for network rendering.
Reviewers: sergey, brecht, dingto, juicyfruit, maiself
Differential Revision: https://developer.blender.org/D2323
In scenes with many lights, some of them might have a very small contribution to some pixels, but the shadow rays are traced anyways.
To avoid that, this patch adds probabilistic termination to light samples - if the contribution before checking for shadowing is below a user-defined threshold, the sample will be discarded with probability (1 - (contribution / threshold)) and otherwise kept, but weighted more to remain unbiased.
This is the same approach that's also used in path termination based on length.
Note that the rendering remains unbiased with this option, it just adds a bit of noise - but if the setting is used moderately, the speedup gained easily outweighs the additional noise.
Reviewers: #cycles
Subscribers: sergey, brecht
Differential Revision: https://developer.blender.org/D2217
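The termination rule is standard Russian roulette; a hedged sketch:
```
import random

def light_sample_weight(contribution, threshold):
    # Samples below the threshold are kept with probability
    # contribution/threshold and reweighted by 1/p, so the estimate
    # stays unbiased; 0.0 means the shadow ray is skipped entirely.
    if threshold <= 0.0 or contribution >= threshold:
        return 1.0
    p = contribution / threshold
    return 1.0 / p if random.random() < p else 0.0
```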
This option allows creating a smoother transition between Bricks and Mortar - 0 applies no smoothing, and 1 smooths across the whole mortar width.
Mainly useful for displacement textures.
The new default value for the smoothing option is 0.1 to give some smoothing that helps with antialiasing, but existing nodes are loaded with smoothing 0 to preserve compatibility.
Reviewers: sergey, dingto, juicyfruit, brecht
Reviewed By: brecht
Subscribers: Blendify, nutel
Differential Revision: https://developer.blender.org/D2230
When using the Normal output of the Texture Coordinate node on Point and Spot lamps, the coordinates now depend on the rotation of the lamp.
On Area lamps, the Parametric output of the Geometry node now returns UV coordinates on the area lamp.
Credit for the Area lamp part goes to Stefan Werner (from D1995).
constraints.
This avoids traversing the archive every time object data is needed and
gives an overall consistent ~2x speedup here with files containing
between 136 and 500 Alembic objects. Also this somewhat nicely de-
duplicates code between data creation (upon import) and data streaming
(modifiers and constraints).
The only worrying part is what happens when a CacheFile is deleted and/or
has its path changed. For now, we traverse the whole scene and for each
object using the CacheFile we free the pointer and NULL-ify it (see
BKE_cachefile_clean), but at some point this should be reconsidered to
make use of the dependency graph.
The 'local' layers were not correctly set when redoing 'add object'
addons using object_utils.py helper (we always want to restore layers
from view in local view, even if we set 'real' layers from operator
afterwards).
Issue was happening when removal of custom icons was done while they
were still being rendered by the preview job.
Now add a 'deferred deletion' system, to prevent the main thread from
deleting preview images until the loading thread is done with them.
Note that ideally, calling `ED_preview_kill_jobs()` on custom icon
removal would have been simpler, but we don't have easy access to
context here...
Oh man, is it a compiler bug? Is it something we do stupid?
For now more crap to prevent crashes. During the conference will talk to
Maxym about how we can troubleshoot such weird issues.
Basically don't use rcp() in areas which seem to be critical after
a second look. Also disabled some multiplication operators, not sure
yet why they might be a problem.
Tomorrow will be setting up a full test with all cases which were
buggy in our farm to see if this fix is complete.
There are some precision issues for big magnitude coordinates which started
to give weird behavior in release builds. Some weird memory usage in BVH
which is tricky to nail down because it only happens in release builds and GDB
reports all variables as optimized out when trying to use RelWithDebInfo.
There are two things in this commit:
- Attempt to make the vectorized code closer to the original one, hoping that it'll
eliminate the precision issue.
This seems to work for transform_point().
- A similar trick did not work for transform_direction(), even though the absolute
error there is much smaller. For now that function is disabled, needs a more
careful look.
Rewrite the current range-tree API used by dyn-topo undo
to avoid inefficiencies from stdc++'s set use.
- every call to `take_any` (called for all verts & faces)
performed a remove and a re-insert on the set.
- further range adjustment also took 2x btree edits.
This patch inlines a btree which is modified in-place,
so common resizing operations don't need to perform a remove & insert.
Ranges are stored in a list so `take_any` can access the first item
without a btree lookup.
Since range-tree isn't a bottleneck in sculpting, this only gives minor speedups.
Measured approx ~15% overall faster calculation for sculpting,
although this timing doesn't include GPU updates and depends on how
much the edits fragment the range-tree.
Existing method was fine for basic polygons but didn't scale well
because it was checking all coordinates for every y-pixel.
Here's an optimized version.
Basic logic remains the same; this just maintains an ordered list of intersections,
tracking in-out points, to avoid re-computing every row.
This means sorting is only done when out-of-order segments are found,
and the segments only need to be re-ordered if they cross each other.
Speedup isn't linear, test with full-screen complex lasso gave 11x speedup.
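For reference, a hedged sketch of the baseline even-odd scanline idea in isolation (the committed optimization further keeps the per-row intersection list ordered incrementally instead of re-sorting it from scratch each row):
```
def scanline_spans(polygon, height):
    # polygon: list of (x, y) vertex tuples; yields (y, x_start, x_end)
    # spans using the even-odd rule.
    for y in range(height):
        xs = []
        for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
            if (y1 <= y < y2) or (y2 <= y < y1):  # edge crosses this row
                xs.append(x1 + (y - y1) * (x2 - x1) / (y2 - y1))
        xs.sort()  # this per-row sort is what the new code avoids
        for x_start, x_end in zip(xs[::2], xs[1::2]):
            yield y, x_start, x_end
```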
Do this for the new dependency graph: was missing handling of OB_UPDATE_TIME in tag update.
Hopefully it's all still correct.
The old dependency graph needs work, but I'm tempted to call it unsupported and move on
to the 2.8 branch.
Several ideas here:
- Optimize calculation of near_{x,y,z} in a way that does not require
3 if() statements per update, which avoids the negative effect of wrong
branch prediction (see the sketch after this list).
- Optimization of direction clamping for BVH.
- Optimization of point/direction transform.
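A minimal sketch of the first idea (hypothetical names): the near/far slab index per axis is derived once per ray from the direction signs, instead of re-branching on every traversal step:
```
def near_far_indices(ray_dir):
    # For each axis, index 0 selects the lower bounding-box slab and 1 the
    # upper one; computing this once per ray replaces the per-update if()s.
    near = [0 if d >= 0.0 else 1 for d in ray_dir]
    far = [1 - n for n in near]
    return near, far
```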
Brings ~1.5% speedup again depending on the scene (unfortunately, this
speedup can't be summed across all previous commits because the speedup of
each of the changes varies from scene to scene, but it still seems to
be a nice solid speedup of a few percent on Linux, and a bigger speedup was
reported on Windows).
Once again, thanks Maxym for the inspiration!
Still TODO: We have multiple places where we need to calculate near
x,y,z indices in BVH; for now it's only done for the main BVH traversal.
Will try to move this calculation to a utility function and see if
it can be easily re-used across all the BVH flavors.
Similar to the previous commit, avoid negative effect of bad branch prediction.
Gives measurable performance up to ~2% in tests here.
Once again, thanks to Maxym Dmytrychenko!
The idea here is to avoid if statements which could cause wrong
branch prediction.
Gives a bit of measurable speedup up to ~1%. Still nice :)
Inspired by Maxym Dmytrychenko, thanks!
Seems CMake will rearrange and copy libraries which are passed to the linker
when one of the libraries is listed twice (for example, -lz from the png libraries
and -l for blender itself). This was causing libopenimageio to be added somewhere
at the end of the linking flags without -ldl following it, which was causing
linking issues.
Similar to the regular triangle intersection case. Gives about 3% speedup rendering
an SSS object on my desktop.
Question: how to avoid such code duplication in a nice way without speed loss?
This confuses the hell out of guarded allocators, because it is possible
for an allocation to happen before Blender's guarded allocator is
fully initialized.
This was causing crashes and assert failures when running blender
with fully guarded memory allocator.
The initialization order of global stats and node types was not strictly
defined, and it was possible to have node types initialized first and
stats after that. Zeroing out the stats then wipes the record of memory
already allocated for the node types, causing an assert failure when
de-initializing them.
It was possible to have non-initialized unaligned BVH split
to be used when regular BVH split SAH was inf. Now we ensure
that unaligned splitter is only used when it's really initialized.
It's a regression and should be in 2.78a.
Material linking might and does change the way the drawObject is calculated,
but does not tag the drawObject for recalculation in any way.
Now use the dependency graph to tag the draw object for recalculation. Currently do
this using the OB_RECALC_DATA tag since tagging is not very granular yet. In the
future we can easily introduce more granular tagging in the new dependency
graph.
Simple and safe for 2.78a.
Use a context manager for the output file itself, and a whole try/except block
to at least catch and print errors in the file.
Also some minor tweaks to previous 'list add-ons' commit.
Note that volume rendering is not supported yet, this is a step towards that.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2299
Now, the strokes can be locked to a plane set at the cursor location.
This option allows the artist to rotate the view and draw while keeping the
strokes flat over the surface. This option is similar to the surface option,
but doesn't need an object.
The option is only valid for the 3D view and strokes in CURSOR mode.
When ED_screen_animation_play is called from wm_event_do_handlers, ScrArea *sa = CTX_wm_area(C) is NULL in ED_screen_animation_timer.
Inform the audio system in CTX_data_main_set that a new Main has been set.
Not really possible to precisely detect all cases in which they should or
should not be active, but at least now it won't show as disabled when it
actually has some effects.
Previously an error message would be printed whenever the OpenCL build produced output.
However, some frameworks seem to print extra information even if the build succeeded, so now the actual returned error is checked as well.
When --debug-cycles is activated, the build output will always be printed, otherwise it only gets printed if there was an error.
This would cause Alembic to throw an exception and fail exporting
animations because it was trying to recreate and overwrite the
attributes for each frame.
Also use the operator as part of the UI keymap now, to deduplicate code and let
users configure a custom shortcut.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2303
This patch resolves the following warnings:
```
Warning C4028 formal parameter 1 different from declaration blenkernel\intern\ocean.c 764
Warning C4098 'attach_stabilization_baseline_data': 'void' function returning a value blenkernel\intern\tracking_stabilize.c 139
Warning C4028 formal parameter 3 different from declaration blenkernel\intern\cachefile.c 148
Warning C4028 formal parameter 3 different from declaration blenkernel\intern\paint.c 413
Warning C4028 formal parameter 1 different from declaration blenkernel\intern\editderivedmesh.c 591
Warning C4028 formal parameter 3 different from declaration blenkernel\intern\library_remap.c 709
Warning C4028 formal parameter 1 different from declaration blenkernel\intern\ocean.c 754
Warning C4028 formal parameter 1 different from declaration blenkernel\intern\ocean.c 758
Warning C4028 formal parameter 1 different from declaration blenkernel\intern\ocean.c 759
Warning C4028 formal parameter 1 different from declaration blenkernel\intern\ocean.c 763
Warning C4028 formal parameter 1 different from declaration blenkernel\intern\ocean.c 764
Warning C4028 formal parameter 1 different from declaration blenkernel\intern\ocean.c 765
Warning C4028 formal parameter 1 different from declaration blenkernel\intern\ocean.c 769
Warning C4028 formal parameter 1 different from declaration blenkernel\intern\ocean.c 770
Warning C4028 formal parameter 1 different from declaration blenkernel\intern\DerivedMesh.c 3458
```
It's mostly cases where the signature in the .h and the actual implementation in the .c do not match, and a bunch of functions that do not match the TaskRunFunction declaration because they leave out the __restrict keyword.
Reviewers: brecht, juicyfruit, sergey
Reviewed By: sergey
Subscribers: Blendify
Differential Revision: https://developer.blender.org/D2268
Blender doesn't necessarily crash when Python doesn't keep references to
the returned strings. As a result, someone that implements this incorrectly
could be lulled into a false sense of correctness by Blender not crashing.
Previously the editor would always try to only show UV faces with the same exact active
image or image texture, which is quite difficult to control on production shaders, where
each material can have multiple objects assigned.
The idea of this commit is to add an option which allows easy control over what to display
when "Draw Other Objects" is enabled, so currently we can have the old behavior ("Same Image")
or tell the editor to show everything ("All"). In the future we can extend it with such filters
as "Same Material" and things like that.
Hopefully this will help @eyecandy's workflow of texturing.
The issue was caused by a wrong default value for the brush particle count,
which was clamped on display from 0 to 1. This is technically a regression,
but how to port this to 2.78a?
The problem here was, as the title says, that the two kernels were swapped.
Since shader evaluation is only used for building the sampling map when World MIS is enabled, rendering without it would still work fine, although baking was also broken.
The previous refactor changed the code to use a separate logging mechanism to support multithreaded compilation.
However, since that's not supported by any frameworks yet, it just resulted in bad logging behaviour.
So, this commit changes the logging to go directly to stdout/stderr once again by default.
Couple of issues here:
* Missing initialization for 3D view keyframe options for "Reset to Default Theme"
* Alpha values not reset correctly on "Reset to Default Theme"
* Alpha values of timeline keyframe options not reset correctly for old files
Also corrected old version patches even though they're overridden later, to avoid more issues in case people copy this code.
Corrections to d7af7a1e04 and 8d573aa0ec
This was giving some speedup but made intersection tests fail
from the watertight point of view.
Needs deeper investigation, but need to quickly get it fixed for
the studio.
Regression from rB69b66d549bcc8, which was supposed to be a non-functional
change, so not sure why the search menu was reduced here? For now, restore
the 2.77 width.
Seems to be a bug in the original implementation of a830280: code was always
using tangent space instead of the UV map because it had the same name. Now
prefer UVMap over tangent because this is how Cycles works. At least it's
closer to it.
Not sure if the save+reload issue is still relevant after this fix, that
needs to be double-checked.
Thanks @dfelinto for looking into the report and simplifying the case.
Should be included into 2.78a.
This allows appending of an entire scene from another blend file into this one,
even when that blend file contains proxified armatures.
This replaces the approach from commit 1cdc54dc7d.
Thanks @sergey for the help.
Column flow layout was abusing ui_item_fit in a weird way, which was
broken for last-column items.
Now rather use our own code, which basically spreads the available width as
equally as possible between all columns.
Seems to be a rounding error. Hopefully the new code handles the error fixed back in
SVN revision 28901 and still gives a proper frame number for Hjalti.
What could possibly go wrong here..
This gives about 5% speedup on AVX2 kernels (other kernels still
have SSE disabled for math operations) and solves the slowdown
of the koro scene mentioned in the previous commit.
The title says it all actually. This commit also contains
changes to pass float3 as const reference in affected functions.
This should make MSVC happier without breaking OpenCL because it's
only done in areas which are ifdef-ed for non-OpenCL.
Another patch based on inspiration from Maxym Dmytrychenko, thanks!
This commit basically vectorizes existing code using AVX2 instructions
(without modifying algorithm itself). This gives quite nice speedups:
BMW: -8%
Classroom: -5%
Cat: -5%
Koro: +1%
Barcelona: -8%
That's on Linux machine, reported performance improvement on Windows
goes up to 20%.
Not currently sure why Koro is somewhat slower because it mainly uses
curve intersection tests; could it be time noise? Or something with the
cache utilization perhaps? In any case, the speedup in other scenes makes
me think that the current state is acceptable for an initial implementation.
This is again inspired by Maxym Dmytrychenko.
Based on existing ssef data type and to my knowledge it's also what happens in
Embree nowadays.
Inspired by Maxym Dmytrychenko and required for the upcoming triangle
intersection commit.
Hopefully the copyright message is correct.
When a ray hits a curve segment with an SSS shader it was possible to have
an uninitialized hit_P variable used for sampling.
Seems that was the reason for our headache about the difference between AVX2
and SSE4 render results here, so now we can revert all the nasty
ifdef-ed inline policies.
Original fix in this area was not really complete (but was the safest at
the release time). Now all the crazy configurations of slots going out
of sync should be handled here.
For now, we merely add an option that sets CXXFLAGS envvar with
'--std=c++11' option.
There is no check done to ensure compatibility with the system
libraries, mainly because:
- It is anything but trivial to get this information in a generic and
reliable way.
- Currently even cutting edge distributions may still distribute some c++98
libraries.
- With recent stdlibc++, both ABIs are supported together, which means
that incompatibilities are rather unlikely.
To summarize: if your system is recent and built with gcc-5.1 or newer,
you should not experience too much trouble with c++11.
It was possible to have two viewports opened and use Ctrl-0
to make different objects the active camera for the viewport. This
worked fine for viewports which had a camera decoupled from the scene,
but if a viewport was locked to the scene camera it was possible to run into
a situation where two different viewports were locked to the scene camera but
had different v3d->camera pointers.
Apparently, the whole G.is_break is not used by OpenGL render, meaning
this flag will not be cleared before running the operator. This was
causing missing file output after pressing Esc once, for the rest of
the Blender session.
Reworked logic in the few places that still called this. Deleted the "GLSL not supported" fallbacks.
Also removed some nearby checks for ARB_multitexture and OpenGL 1.1. Blender 2.77 removed checks like this, but game engine still has some.
We were checking for number of tasks from given pool already active, and
then atomically increasing it if allowed - this is not correct, number
could be increased by another thread between check and atomic op!
Atomic primitives are nice, but you must be very careful with *how* you
use them... Now we atomically increase counter, check result, and if we
end up over max value, abort and decrease counter again.
Spotted by Sergey, thanks!
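A sketch of the fixed pattern (hypothetical names; Python ints aren't atomic, so a lock stands in for the real atomic fetch-and-add primitive):
```
import threading

class AtomicCounter:
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def add_and_fetch(self, delta):
        with self._lock:  # stands in for an atomic fetch-and-add
            self._value += delta
            return self._value

def try_acquire_task_slot(counter, max_active):
    # Racy version (the bug): "if value < max_active: increment".
    # Fixed version: reserve first, then validate, undoing on overshoot.
    if counter.add_and_fetch(1) > max_active:
        counter.add_and_fetch(-1)  # over the limit: release reservation
        return False
    return True
```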
Since the collision modifier cannot be disabled, it causes a constant
hit on the viewport animation playback FPS. Most of this overhead can
be automatically removed in the case when the collider is static.
The updates are only skipped when the collider was stationary during
the preceding update as well, so the state is stored in a field.
Knowing that the collider is static can also be used to disable similar
BVH updates for substeps in the actual cloth simulation code.
Differential Revision: https://developer.blender.org/D2277
Previously, if the rendering was much faster than saving (for example,
when transcoding stuff via VSE), it was possible to have 100s of frames
in memory.
This isn't ideal because of limited amount of RAM, so need to have
some sort of limit. This is exactly what is implemented in this commit.
By the design of the task scheduler it was possible that tasks from somewhere
in the middle of the scheduled list would be handled first.
For example, one thread might be iterating over the scheduled list and
ignoring tasks because another thread is working on a task from the
same pool. However, if that other thread finishes its task before the iteration
is over, the current thread will pick up a task from somewhere in the middle
of the list.
This isn't a problem in general case, but for movie rendering we do need
to have strict order of frames.
This allows appending of an entire scene from another blend file into
this one, even when that blend file contains proxified armatures.
Since the proxified object needs to be linked (not local), this will
only work when the "Localize all" checkbox is disabled. The appended
proxy object should also not be referenced from anything in a library
(for example in a constraint). Referencing it from the appended data
should be fine.
Fixes T49495.
Pretty much the same reason as for the 'from' pointer of shapekeys - runtime
data creating loops and 'ghost' dependencies between datablocks.
We need to handle them in cases like remapping, but shall not take them
into account when checking dependencies between datablocks... :/
The initial idea was to use Ctrl+E to interpolate strokes because this is
similar to Pose breakdown, but the Ctrl+E keymap is used to invert the
grease pencil sculpt effect.
The new keymap is Ctrl+Alt+E in order to fix the conflict.
New recursive iteration over IDs in BKE_library_foreach_ID_link() was
broken by the infamous nodetree case. We cannot really recursively call
this function in that case, so better to defer handling of
non-datablock NodeTrees as if they were real IDs here.
Also fixed the initial ID not being stored as handled; in rare cases this
could also lead to infinite looping.
To be backported to 2.78a.
Mostly this is making inlining match CUDA 7.5 in a few performance critical
places. The end result is that performance is now better than before, possibly
due to less register spilling or other CUDA 8.0 compiler improvements.
On benchmarks scenes, there are 3% to 35% render time reductions. Stack memory
usage is reduced a little too.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D2269
We *always* want to increase the material user count when counting from the Object (and not
the Data), because in that case we are moving the material from the object to the temp
generated mesh; the material can never be 'borrowed' in that case.
To be backported to 2.78a
If I didn't miss anything these are indeed not used. Old themes should still work (will only print info on redundant theme defines into console), but updated non-contrib themes already.
Fixes for clamp-omp, fewer shared variables, fix some cases of threads writing
to the same memory location. Issue found by Jens Verwiebe, who reports 30%
speedup with 16 core CPU, when using this with a recent clang-omp version.
- Explicitly specify the platform for msbuild, to facilitate builds with just the Visual C++ Build Tools installed.
- When vs2013 is not found, try looking for 2015 as a fallback
- Clear up any batch variables that might have been set from previous runs
This is also an important mathematical operation that can be folded
if it is known that one argument is a certain constant. For colors
the operation is provided as a Gamma node.
The SVM Gamma node needs a small fix to make it follow the 0 ^ 0 == 1
rule, same as the Power node, or the Gamma node itself in OSL mode.
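A hedged sketch of that convention on plain tuples (hypothetical helper; the real folding operates on SVM graph nodes, and non-negative channels are assumed):
```
def fold_gamma(color, gamma):
    # Gamma node: out = color ** gamma, following the 0 ** 0 == 1 rule.
    # (Python's ** already yields 1.0 for 0.0 ** 0.0; the explicit branch
    # just documents the convention the folded node must preserve.)
    return tuple(1.0 if (c == 0.0 and gamma == 0.0) else c ** gamma
                 for c in color)
```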
Reviewers: #cycles
Differential Revision: https://developer.blender.org/D2263
'camera' Object pointer of TimeMarkers is a 'temp' hack since Durian project...
Would need to be either made definitive now, or removed/reworked/whatever.
But since we intend to use that object pointer for other needs, and current code
could lead to crashing .blend files, for now let's fix that mess (was missing
some bits in read code, and also totally ignored in libquery code).
Should be safe for 2.78a.
It will discard the whole tile, but it's still kind of more friendly than
a fully locked interface (sort of) until the tile is fully sampled.
Sorry if it causes a PITA to merge for the opencl split work, but this issue
was bothering me a lot when collecting benchmarks.
This change makes it so that when the sequences within a Scene strip are
evaluated, they use the Scene that they come from as the context, as opposed
to the Scene that the Scene strip is in. This is necessary, for example, in the
case of the MulticamSelector, where it needs to reference strips in the original
Scene as opposed to the Scene where the Scene strip is located.
Patch by @Matt (HyperSphere), thanks!
Previously it was falling back to just the path after the #include
statement was finished. Now we fall back to the proper current
file name after dealing with the preprocessor statement.
Some of the files were wrongly attributing code to some other
organizations and in few places proper attribution was missing.
This is mainly either a copy-paste error (when new file was
created from an existing one and header wasn't updated) or due
to some refactor which split non-original-BF code with purely
BF code.
Should clear up some of the confusion around attribution.
- The build folder name used to depend on the order of the parameters; this is now normalized to
"build_windows_[Release/Full/Lite/Headless/Cycles/Bpy]_[x86/x64]_vc[12/14]_[Release/Debug]" regardless of the order of the parameters.
- Use CUDA 8 for all kernels when building the release convenience target with Visual Studio 2015.
In Windows, the event dispatching code is throwing away the wheel scroll count value.
No matter how fast you move the wheel, it only makes a one-notch scroll event.
This patch converts the wheel event into multiple 1-notch wheel events.
This also corrects the handling of smooth-scroll mouse wheels (which can report smaller than 1-notch wheel movement) by accumulating the small wheel delta values.
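A hedged sketch of the accumulation logic (hypothetical names; WHEEL_DELTA is the standard 120-unit notch on Windows):
```
WHEEL_DELTA = 120  # one full notch

class WheelAccumulator:
    def __init__(self):
        self.acc = 0

    def feed(self, raw_delta):
        # Accumulate sub-notch deltas; return how many whole 1-notch
        # events to dispatch (signed, truncated toward zero).
        self.acc += raw_delta
        notches = int(self.acc / WHEEL_DELTA)
        self.acc -= notches * WHEEL_DELTA
        return notches
```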
Reviewers: djnz, shadowrom, elubie, #platform:_windows, sergey, juicyfruit, brecht
Reviewed By: shadowrom, elubie, #platform:_windows, brecht
Subscribers: dingto, elubie, brachi, brecht
Differential Revision: https://developer.blender.org/D143
Another case of float imprecision leading to an endless loop. Increasing the 'noise threshold' a bit seems to work OK.
Not a regression, but might be nice to have in 2.78a.
Not sure where this comes from, but code was converting BMEdge* to BMVert* to check oflags,
i.e. not accessing correct memory.
Regression, to be backported to 2.78a.
Apparently the keying sets system doesn't support subclassing
KeyingSetInfo subclasses. I have added a note to the top of the file to
indicate this to future developers.
Regression caused by rBbcc863993ad: write code was assuming dw->ob was always valid,
which is no longer the case right after reading a file, e.g.
Another good example of how bad it is to use 'hidden' dependencies between datablocks. :(
And another fix to be backported to 2.78a. :(((
Previously the pose library used the WholeCharacter key set, which ignores
selection and adds keys for almost all bones in the rig. This is a very
slow operation on complex rigs. With this patch, only selected bones are
keyed, defaulting to keying all bones when none are selected.
Note that this fixes the FIXME previously mentioned in the source.
Actually two errors here:
* Properties editor wasn't refreshing on (NC_SCENE | ND_RENDER_OPTIONS) notifiers
* Was using notifier info bits wrongly, needs to send two separate notifiers
Decided to remove ND_RENDER_OPTIONS rather than adding properties editor scene context refresh for it, this is more than a render option change.
Duplicates can happen at UV seams in case of resolution mismatch
or other complications. It's better not to store them in case it
confuses some math later on.
Reviewers: mont29
Differential Revision: https://developer.blender.org/D2261
1. When adding one pixel border to UV islands prefer grid directions.
This is more logical as it means neighbor pixels will commonly
share a border, opposed to just corners.
2. Don't subtract 0.5 when converting float UVs to int in computing
adjacency across a UV seam. Pixels cover a square with corners
at int positions and adding/subtracting 0.5 is only for dealing
with the center point of a pixel.
3. Use the neighbour_pixel field from the correct pixel, and check
that it doesn't point back to the origin point.
4. In the connected UV case, ensure that the returned index actually
refers to a valid active pixel.
This fixes paint spread not traversing some UV seams, while at the
same time spreading to random unrelated points on other seams.
The first problem is primarily fixed by 2, while the second one
is addressed by item 4.
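A small sketch of the conversion from item 2 (hypothetical helper, assuming UVs in [0, 1)):
```
def uv_to_pixel(u, v, width, height):
    # Pixels cover half-open squares with corners at integer positions,
    # so plain truncation (no 0.5 offset) yields the covering pixel.
    return int(u * width), int(v * height)
```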
Differential Revision: https://developer.blender.org/D2261
Moved that code back and forth a few times while creating the rB776a8548f03a fix,
and ended up forgetting to update it correctly for the function where it ended up.
Last-minute fixes are never a good thing... Now we already have a real reason for a 2.78 'a' release :(
Code was not getting a correct boundbox in some cases (e.g. a group only instancing other groups); now
compute our own bbox in those cases.
Based on patch by @lichtwerk, but extended the fix to include linked datablocks in some cases
(linked objects in local groups, linked objects in local scene, etc.), this was also broken in existing code.
Reviewers: mont29
Subscribers: duarteframos
Differential Revision: https://developer.blender.org/D2257
Regression from rBa4a968f: we would adjust the current point's wetness without actually protecting it
in the new multi-threaded context, leading to a concurrent access mess.
Now delay applying the wetness reduction to the current point until the end of the function, which allows us to avoid having
to lock the current point twice together with the neighbor one (and reduces spinlock waiting too).
To be backported to 2.78a.
Never call a function that might recompute a DM in an RNA itemf callback (or any UI-related func in general)!
There was an XXX comment asking if this was OK - well, no, it was not. :P
Could be ported back to some 2.78 flavour should we need it.
Was only happening with new dependency graph.
The issue here is that scene's depsgraph layers will be 0 unless
it was ever visible. Worked around by checking for a 0 layer in the
update_tagged of the new depsgraph. This currently kind of follows the
logic of visible_layers, but is weak.
Committing so the studio is unlocked here, will re-evaluate this later.
Now using new system dedicated to that kind of cases, id_ensure_real_user(), instead.
That way, usercount of Scenes is handled correctly at deletion time.
Reported by @sergey over IRC, thanks.
The light sampling functions calculate the light sampling PDF for the case that the light has been randomly selected out of all lights.
However, since BPT handles lamps and meshlights separately, this isn't the case. So, to avoid a wrong result, the code just included the 0.5 factor in the throughput.
In theory, however, the correction should be made to the sampling probability, which needs to be doubled. Now, for the regular calculation, that's no real difference since the throughput is divided by the pdf.
However, it does matter for the MIS calculation - it's unbiased both ways, but including the factor in the PDF instead of the throughput should give slightly better results.
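In formulas (a sketch assuming the balance heuristic, with f the sampled contribution, p_L the light-sample pdf and p_B the BSDF-sample pdf):
```latex
\[
  C = \frac{f(x)\, w_L(x)}{p_L(x)},
  \qquad
  w_L(x) = \frac{p_L(x)}{p_L(x) + p_B(x)}.
\]
% Without MIS the two choices coincide: 0.5 f / p_L = f / (2 p_L).
% With MIS they differ, since w_L depends on p_L itself: doubling p_L
% (the probability of picking this lamp among lamps and meshlights)
% corrects the weight too, while halving f leaves w_L untouched.
```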
Reviewers: sergey, brecht, dingto, juicyfruit
Differential Revision: https://developer.blender.org/D2258
- WITH_SMOKE macro was not defined, so some code was not compiled, though
it was still accessible from the UI
- some UI elements were disappearing due to bad indentation; also reworked
the UI code to not hide but rather disable/grey out buttons in the UI
- Display thickness was not used due to a bad manual merge of the code
from the patch.
This basically exposes to the UI a function that was only available
through a debug macro; the purpose is obviously to help debugging
simulations. It adds ways to draw the vectors either as colored needles
or as arrows showing the direction of the vectors. The colors are based
on the magnitude of the underlying vectors.
Reviewers: plasmasolutions, gottfried
Differential Revision: https://developer.blender.org/D1733
Current approach uses view aligned slicing to generate polygons for GL
texturing such that the generated polygons are always facing the view
plane. Now it is also possible to use object aligned slicing, which
creates polygons by slicing the object perpendicular to whichever axis
is facing the view plane the most. It is also possible to create a
single slice for inspecting the volume, or for 2D rendering effects.
Settings for this, along with a density multiplier setting, are to be
found in a newly added "Smoke Display Settings" panel in the smoke
domain properties tab.
Reviewers: plasmasolutions, gottfried
Differential Revision: https://developer.blender.org/D1733
Using context.active_gpencil_brush to access the active Grease Pencil brush
would result in a crash if trying to rename the brush, because the "ID" pointer
was not set.
To be backported to 2.78
This is something that was guaranteed in give_current_material(); just
copied some range checking logic from there.
Not sure what a proper fix would be here though.
In fact, it was the whole remapping process that was broken in the logic bricks area,
due to terrible design of links between those bricks...
Object copying was also broken in that case, fixed as well.
To be backported to 2.78.
Note that the issue was probably there for ages, hidden behind dirty hacks
used in the previous append code (though likely visible in some corner cases).
Listen kids: do not, never, ever, do what has been done for links between logic bricks. Never. Ever.
Even as pure runtime data it would have been bad, but as stored data...
The optimization attempt with BKE_library_idtype_can_use_idtype() was not taking into account
the fact that drivers may link virtually against any datablock...
Has to be rethought, but that's for after the 2.78 release; this commit is safe to backport.
Regression caused by rBb27ba26: we would always tag datablocks for update in G.main,
ignoring the given bmain; now always use the given one instead.
To be backported to 2.78.
This commit allows RNA properties to return additional info on their editable state which may then be displayed in tooltips. To show how it works, it also adds some info for the editable check of proxies. For generally un-editable properties or properties of a linked data-block, RNA returns default strings.
| {F362785} | {F362786} | {F362787} |
Reviewed by brecht, thanks!
Differential Revision: https://developer.blender.org/D2243
Drawing used the colors for select (TH_EDGE_SELECT/TH_VERTEX_SELECT), which was inconsistent with crease, seam, sharp, .. (which all have their own theme color -- it also was a bit hard to read).
NOTE: UI team usually doesn't allow adding more theme options, this is an exception.
Differential Revision: https://developer.blender.org/D2234
It's now possible to change the shortcut for invoking the eyedropper while hovering a button (E by default). Also removed the keymap editor entry for the modal eyedropper keymap, it's now automatically appended to the eyedropper shortcut.
This patch changes a couple of things in the video output encoding.
{F362527}
- Clearer separation between container and codec. No more "format", as this is
too ambiguous. As a result, codecs were removed from the container list.
- Added FFmpeg speed presets, so the user can choose from the range "Very
slow" to "Ultra fast". By default no preset is used.
- Added Constant Rate Factor (CRF) mode, which allows changing the bit-rate
depending on the desired quality and the input. This generally produces the
best quality videos, at the expense of not knowing the exact bit-rate and
file size.
- Added optional maximum of non-B-frames between B-frames (`max_b_frames`).
- Presets were adjusted for these changes, and new presets added. One of the
new presets is [recommended](https://trac.ffmpeg.org/wiki/Encode/VFX#H.264)
for reviewing videos, as it allows players to scrub through it easily. Might
be nice in weeklies. This preset also requires control over the
`max_b_frames` setting.
GUI-only changes:
- Renamed "MPEG" in the output file format menu with "FFmpeg", as this is more
accurate. After all, FFmpeg is used when this option is chosen, which can
also output non-MPEG files.
- Certain parts of the GUI are disabled when not in use:
- bit rate options are not used when a constant rate factor is given.
- audio bitrate & volume are not used when no audio is exported.
Note that I did not touch `BKE_ffmpeg_preset_set()`. There are currently two
preset systems for FFmpeg (`BKE_ffmpeg_preset_set()` and the Python preset
system). Before we do more work on `BKE_ffmpeg_preset_set()`, I think it's a
good idea to determine whether we want to keep it at all.
After this patch has been accepted, I'd be happy to go through the code and
remove any then-obsolete bits, such as the handling of "XVID" as a container
format.
Reviewers: sergey, mont29, brecht
Subscribers: mpan3, Blendify, brecht, fsiddi
Tags: #bf_blender
Differential Revision: https://developer.blender.org/D2242
For some reason (which I can't recall), baking was doing backface
culling. Since Cycles itself doesn't ignore backfaces (nor does Blender
Internal), they should be visible.
Steps to reproduce:
* Go to modifier context in properties editor
* Add modifier, collapse it
* Press down LMB over collapse button of modifier, hold it
* Drag over pin-icon in properties editor (to keep fixed data-block displayed)
* Drag outside of window bounds (should crash)
This could've also been solved by getting the space data from the callback arguments instead of the context, but this fix is much nicer (though not totally un-risky).
Do not close and re-open the file in case it's compressed; the gzip module can now directly take a file object as a parameter.
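A hedged sketch of the idea (hypothetical helper; gzip.GzipFile accepts an already-open binary file via its fileobj parameter):
```
import gzip

def open_maybe_compressed(path):
    f = open(path, "rb")
    if f.read(2) == b"\x1f\x8b":  # gzip magic number
        f.seek(0)
        return gzip.GzipFile(fileobj=f)  # wrap the same file object
    f.seek(0)
    return f
```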
Differential Revision: https://developer.blender.org/D2235
There might be some extra missing points here, but it's all rather
a TODO than a real bug and can be tweaked further once issues are
actually discovered.
We raised the minimum to GL 2.1 in Blender 2.77, and dropped support for older GPUs (pre-2012 Intel mostly). On Windows you get a popup message, but on Mac we simply crashed. Every Mac has a builtin software renderer for GL 2.1 so let's use that when the GPU is not capable!
Run blender --debug-gpu to see version detection & software fallback.
Quick fix for now, need to unlock the studio here as well.
A proper fix would be to modify the API a bit and pass flags which will
prevent expand being called on bmain, perhaps. But this we should discuss
a bit.
We were calling BLI_remlink and then BLI_insertlinkbefore/after quite often. BLI_listbase_link_move simplifies the code a bit and makes it easier to follow. It also returns whether the link position has changed, which can be used to avoid unnecessary updates.
Added it to a number of list reorder operators for now and made use of return value. Behavior shouldn't be changed.
Also some minor cleanup.
Was spawning error popup each time user tried to move a stroke higher or lower than the list allowed. We don't do that anywhere else and it's not really useful info for the user. So rather not bother her.
Idea here is to select the lowest isolation level that won't compromise quality.
By using the lowest level we save memory and processing time. This will also
help avoid precision issues that have been showing up from using the highest
level (T49179, T49257).
This is a pretty simple heuristic that gives ok results. There's more we could
do here, such as filtering for vertices/edges adjacent to geometric features that
need isolation instead of checking them all, but the logic there could get a
bit involved.
There's potential for slight popping of edges during animation if the dice
rate is low, but I don't think this should be a problem since low dice rates
really shouldn't be used in animation anyways.
Reviewed By: brecht, sergey
Differential Revision: https://developer.blender.org/D2240
Crashes occurred immediately when clicking on "OpenGL render image", because previously a task pool was only created when rendering an animation. Solved by introducing a variable is_animation in the openglrender and omitting the task_pool call when it's not an animation.
@sergey: Please check my changes, moved the pool_ok and the lock into the is_animation clause.
The buildbot machine was updated to the new SDK, which seems to have
QTKit removed.
Until we've installed an older SDK or ported our code to the new
AVFramework, disabling QuickTime.
These may be exposed in UI (keymap editor & redo panel), so better avoid using identifiers like "UP" "DOWN". They are redundant anyway (already displayed).
Replace the W shortcut for subdivision with a new menu for edit specials
in order to keep consistency in the UI.
Subdivision is not used all the time, so it's better to assign this
shortcut to the menu.
In some situations the artist needs to subdivide a stroke created with
few points, especially for sculpting.
The subdivision is done for any pair of continuous selected points in
the same stroke.
The operator can be activated in edit mode with W key and has a
parameter for number of cuts.
The idea is to have a dedicated thread which is responsible for all the
file writing, so a slow disk will not slow down
OpenGL itself.
Gives really nice speedup around 1.5x when exporting barber shop layout
file to h264 video.
any object
There were a couple of crashes caused by stupid typos in
rB631af9f930d2fd2c76751204ff22239aa95f761d and
rB78ea06fea4a74181c25254ed72d50d8a743b6954, as well as a shameful lack
of 'testing before committing', which only affects exporting.
One crash was due to using RNA_boolean_get instead of RNA_enum_get, the
other one was a tricky case of order of deletion happening in the
destructors of AbcExporter and ArchiveWriter.
Should not affect RC or release.
It was annoyingly slow to do a roundtrip from a byte OpenGL render to
a float render result and back to a byte image format (which is used
in 99% of cases for the OpenGL previews).
Now we use render result's rect32 to store render result which is
already supposed to be in the display space.
Gives about 30% speed improvement for OpenGL previews here.
Previously converting from linear space to SRGB was doing rather
slow inverted 1D lookup. Adding explicit inverse LUT gives 20%
speedup of OpenGL render.
Next question is: why do we even bother with sRGB conversion here?
OpenGL is already in the proper space, so in theory we can avoid
quite some color space conversions. In any case, having this case
optimized is nice anyway.
New features:
1) Release target that checks for both cuda 7.5 and 8 with WITH_CYCLES_CUDA_BINARIES=ON and CYCLES_CUDA_BINARIES_ARCH=sm_20;sm_21;sm_30;sm_35;sm_37;sm_50;sm_52;sm_60;sm_61 options set.
2) Option to switch between x86 and x64 builds, the default remains (auto detect the architecture) but can be overridden.
3) Option to switch between vs12(2013) and vs14(2015) default is 2013.
Reviewers: juicyfruit, sergey
Reviewed By: sergey
Tags: #platform:_windows
Differential Revision: https://developer.blender.org/D2180
It was quite weak to consider all scripted expressions time-dependent.
The current solution is somewhat better but still crappy. Not sure how we can make
it really nice.
Stupid mistake wrapping path validation code inside a BLI_assert, which means it was
only called in Debug builds...
Found by Sergey, thanks.
Should be backported to 2.78.
Those 'never null' ID pointers are really a PITA to handle... luckily we don't have much of those around!
Found by Sybren, thanks.
Should be backported to 2.78.
This is an internal pointer helper for scene evaluation and tools; though exposed to the bpy API,
it can give false 'dependency cycles' in bpy.data.user_map() results.
This is a followup to rBe007552442634 really; both should be backported to 2.78.
This is an internal pointer helper for scene evaluation and tools; it's not exposed to the bpy API anyway,
and can give false 'dependency cycles' in bpy.data.user_map() results.
Found by sybren in his Splode work.
Uses similar way of storing temp data as object copy paste, just
uses different read entrypoint which does not modify current bmain.
This gives the ability to easily copy-paste poses from one blender to
another one.
Hopefully doesn't introduce user-measurable differences.
Request from Peer here in the studio.
Reviewers: mont29
Reviewed By: mont29
Subscribers: hjalti, fsiddi
Differential Revision: https://developer.blender.org/D2229
This reverts commit ecbfa31caa.
The original commit broke logic in nodes re-fitting. That area can
access non-existing children momentarily. Not sure what the best
solution would be here; for now simply reverting the change.
Problem was zero length normal caused by a precision issue in patch evaluation.
This is somewhat of a quick fix, but is better than allowing possible NaNs to
occur and cause problems elsewhere.
Both spot and area lights have large regions where they're not visible.
Therefore, this patch stops the light sampling code when one of these cases (outside of the spotlight cone or behind the area light) occurs, before the lamp shader is evaluated.
In the case of the area light, the solid angle sampling can also be skipped.
In a test scene with Sample All Lights and 18 Area lamps and 9 Spot lamps that all point away from the area that the camera sees, render time drops from 12sec to 5sec.
Reviewers: brecht, sergey, dingto, juicyfruit
Differential Revision: https://developer.blender.org/D2216
Small issues in GHOST
- use NSApplicationDelegate protocol for our app delegate
- make sure NSApp is initialized before using
(cherry picked from commit df7be04ca6)
Regression from rB036c006cefe471. We can't use self here; self is bpy.app, not the pydescriptor of the python path getsetter...
So for now, do not try to replace the getsetter by the actual value in bpy.app's dict,
just return a static var generated on the first run.
Should be safe for 2.78.
I) Filename was not put in the temp Main generated to save selected data only;
this was breaking read code when trying to open the partial file, leading to a missing
filename in the final loaded Main data.
II) Read code would confuse partial .blend files with Undo ones when they had no screen in them
(which happens to 99.999% of partial .blend files, I guess).
Reported by @sybren, thanks.
Should be safe enough for 2.78 release.
The title says it all actually. From tests with the barber shop scene here,
this gives a 2-3x speedup for shader compilation on my oldie i7 machine. The
gain is mainly due to the textures metadata query from jpeg files (which
seems to require de-compression before metadata can be read). But in
theory it could give nice improvements for scenes with huge node trees
as well (i'm talking about node trees of the complexity of a fractal, which
we had reports about in the past).
Reviewers: juicyfruit, dingto, lukasstockner97, brecht
Reviewed By: brecht
Subscribers: monio, Blendify
Differential Revision: https://developer.blender.org/D2215
The issue was caused by some false-positive empty non-AABB intersection.
Tried to tweak it a bit so it does not record intersection anymore.
Hopefully will work for all platforms. Tested here on iMac and Debian.
The idea is to allow certain animation channels to be always visible in
animation editors. So, for example, one can pin Camera animation to the
editor so it is always possible to refine/tweak camera animation when
animating something else in the scene.
There is probably some more polishing required, and some current
limitations could be solved in the future but should be a good starting
point already.
Currently this only works for objects without recursing into deeper datablocks
(so for example, it's not possible to pin object material animation).
Studio request by Colin Levy.
Basically just moves cached kernels from ~/.config/blender/BLENDER_VERSION to
~/.cache/cycles/kernels. This has the following benefits:
- Follows XDG specification more closely,
not as if it's totally crucial or measurable by users, but still nice.
- Prevents unexpected sizes of the config folder, and makes disk space usage more
predictable for users.
- Allows sharing kernels across multiple Blender versions,
which makes debugging easier at times close to a release.
- The "Copy Previous Settings" operator will no longer be copying possibly
gigabytes of cached kernels, which used to lead to really nasty disk usage
and annoying delays when copying settings.
- In the future we can have some smart logic to clear old unused cached
kernels.
Currently only done for Linux and OSX. Windows still follows old "cache"
folder logic, but it's not really important for now because we don't
support kernel compilation on this platform yet.
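A sketch of the resulting lookup on Linux (hypothetical helper, assuming the XDG fallback rules implied above):
```
import os

def cycles_kernel_cache_dir():
    # Honor XDG_CACHE_HOME when set, else fall back to ~/.cache,
    # per the XDG Base Directory specification.
    base = os.environ.get("XDG_CACHE_HOME") or os.path.expanduser("~/.cache")
    return os.path.join(base, "cycles", "kernels")
```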
Reviewers: dingto, juicyfruit, brecht
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2197
Constant folding was removing all nodes connected to the displacement output
if they evaluated to a constant, causing there to be no valid graph for
displacement even when there was displacement to be applied, and sometimes
caused crashes.
Using ones complement for detecting if transform has been applied was confusing
and led to several bugs. With this proper checks are made.
Also added a few transforms where they were missing, mostly affecting baking
and displacement when `P` is used in the shader (previously `P` was in the
wrong space for these shaders)
Also removed `TIME_INVALID` as this may have resulted in incorrect
transforms in some cases.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2192
Bump mapping was happening in world space while displacement happens in object
space, causing shading errors when displacement type was used with bump mapping.
To fix this the proper transforms are added to bump nodes. This is only done
for automatic bump mapping however, to avoid visual changes from other uses of
bump mapping. It would be nice to do this for all bump mapping to be consistent
but that will have to wait till we can break compatibility.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2191
Now the factor works similarly to other Blender areas, to make the factor
more consistent for artists. The value 0% means equal to the original
stroke, 100% equal to the final stroke (50% means halfway). Any value below
0% or greater than 100% creates an overshoot of the stroke.
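The factor is effectively an unclamped linear interpolation, sketched below on scalar values (hypothetical helper):
```
def interpolate(v_orig, v_final, factor):
    # factor 0.0 returns the original stroke value, 1.0 the final one,
    # 0.5 the halfway point; values outside [0, 1] overshoot.
    return v_orig + factor * (v_final - v_orig)
```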
The existing code uses the input value count of the first channel
for all of them. If the first channel is the largest, it leads to
a crash-causing buffer overrun in the memcpy below. Likely this was
left over from the time when only one channel was supported.
As a crash fix, probably should go into 2.78.
This commit changes two things:
* It adds more keysyms preferably taken from XLookupKeysym rather than XLookupString (namely, all numpad ones).
* It falls back to keysyms from XLookupKeysym in other cases, when XLookupString does not produce anything we know of.
Finding the correct balance here is far from easy, but I think we are coming rather close to it now...
Root of the issue is that the active render index became wrong. This is the actual
thing to be fixed, but as usual this is quite tricky to reproduce. Since such
a bad situation might have happened more often and the fix isn't really difficult or
intrusive, let's avoid the crash for now.
Can be revisited once we figure out the root of the issue.
Nice for the 2.78 release.
Own fault in the new ID management work: thought rebuilding the DAG itself was
enough to actually update the whole scene, but we actually need to tag datablocks
for update as well when we change (or remove) one of their ID pointers...
OCD commit, but cleans the code a bit:
- the first `if 0` block was supposed to draw collision objects but is
vastly outdated as most of the SmokeCollisionSettings member variables
were removed a few years ago and collision objects are drawn like other
objects anyway. Also it was committed already commented out back in
2009.
- the second `if 0` block was doing pretty much the same thing as the
few lines above it.
Most of the time, Lamps in Cycles are just a constant emission closure, no texturing etc. Therefore, running a full shader evaluation is wasteful.
To avoid that, Cycles now detects these constant emission shaders and stores their value in the lamp data along with a flag in the shader.
Then, at runtime, if this flag is set, the lamp code just uses this value and only runs the full shader evaluation if it is necessary.
In scenes with a lot of lamps and with "Sample all direct/indirect" enabled, this saves up to 20% of rendering time in my tests.
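A hedged sketch of the fast path (hypothetical types and names; the real code uses a shader flag and spectral values inside the kernel):
```
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Color = Tuple[float, float, float]

@dataclass
class Lamp:
    shader: Callable[[], Color]         # full shader evaluation
    constant_emission: Optional[Color]  # cached value, or None

def lamp_emission(lamp: Lamp) -> Color:
    # Fast path: detected constant-emission shader, skip evaluation.
    if lamp.constant_emission is not None:
        return lamp.constant_emission
    return lamp.shader()                # slow path: full evaluation
```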
Reviewers: #cycles
Differential Revision: https://developer.blender.org/D2193
For proper indexing to work we need to use unaligned node with
identity transform instead of aligned nodes when doing refit.
To be backported to 2.78 release.
This is really a hack-fix actually; not sure why `get_pointcache_keys_for_time()` seems to assume
it will always find a key for a given part index, at least for the current frame, and whether this assumption
is wrong or whether the bug happens elsewhere...
Anyway, this is to be wiped out in 2.8, so no point losing too much time on it; for now merely
returning unchanged (i.e. zero'ed) ParticleKeys in case index2 is invalid. Won't hurt anyway;
even if this did not crash in release builds, it would be returning gibberish values.
Weirdly enough, this version of XCode seems to have static_assert()
even when NOT using C++11. This is totally weird and counter-intuitive,
since static_assert() is supposed to be a C++11-only feature.
Can XCode stop using the future, please? :)
The two SVM nodes added with e7ea1ae78c caused a slowdown on AMD cards when rendering with OpenCL, whether displacement was used or not.
In the Barcelona Pavillon scene on an RX480, this would cause a 12% slowdown.
Therefore, this commit adds an additional flag for feature-adaptive compilation so that the new SVM nodes are only enabled when they are needed (node tree connected to the Displacement output and Displacement type set to Both).
Also, the nodes were added to shaders even when the Displacement Type was set to Bump (the default), which was unnecessary and is fixed now.
Thanks to linda2 on IRC for reporting and testing and to maiself for help with the displacement shader code.
This fix might be relevant for 2.78, but it should be tested further before including it.
When drawing with Grease Pencil "continous drawing" for a long time
(i.e. basically, drawing a very large number of strokes), it could be
possible to cause lower-specced machines to run out of RAM and start
swapping. This was because there was no limit on the number of undo
states that the GP undo code was storing; since the undo states grow
exponentially on each stroke (i.e. each stroke results in another undo
state which contains all the existing strokes AND the newest stroke), this
could cause issues when taken to the extreme.
See commit's comments for details, but this boils down to: do not try to use
purely runtime cache data as a 'real' ID pointer in readcode, it's likely
doomed to fail in some cases, and is bad practice in any case!
This fix implies the dupliweight's object will be invalid until the first scene update
(i.e. first particles evaluation).
Two new modal operators to create a grease pencil interpolate drawing
for one frame or a complete sequence between two frames. For drawing
the temporary strokes in the viewport, two drawing handlers have been
added to manage 3D and 2D stuff.
Video: https://youtu.be/qxYwO5sSg5Y
The operator shortcuts are Ctrl+E and Ctrl+Shift+E. During the modal
operator, the interpolation can be adjusted using the mouse (moving
left/right) or the wheel mouse.
The issue was that although all of the image is available, the prefiltering system didn't use the area outside of the
current tile, which caused visible seams.
This commit reverts fba2b77c2a since it turned out that it actually doesn't help with speed at all - I screwed up the original benchmarking...
Considering that there is no real performance difference, the increased complexity isn't worth it.
The approach that is used to find the global bandwidth is:
- Run the reconstruction filter for different bandwidths and estimate bias and variance
- Fit analytic bias and variance models to these bandwidth-bias/variance pairs using least-squares
- Minimize the MSE term (Bias^2 + Variance) analytically using the fitted models
The models used in the LWR paper are:
- Bias(h) = a + b*h^2
- Variance(h) = (c + d*h^(-k))/n
, where (a, b, c, d) are the parameters to be fitted, h is the global bandwidth, k is the rank and n is the number of samples.
Classic linear least squares is used to find a, b, c and d.
Then, the paper states that MSE(h) = (Bias(h)^2 + Variance(h)) is minimal for h = (k*d / (4*b^2*n))^(1/(k+4)).
Now, what is suspicious about this term is that a and c don't appear.
c makes sense - after all, its contribution to the variance is independent of h.
a, however, does not - after all, the Bias term is squared, so a term that depends on both h and a exists.
It turns out that this minimization term is wrong for these models, but instead correct when using Bias(h) = b*h^2 (without constant offset).
That model also makes intuitive sense, since the bias goes to zero as filter strength (bandwidth) does so.
Similarly, the variance model should go to zero as h goes towards infinity, since infinite filter strength would eliminate all possible noise.
Therefore, this commit changes the bias and variance models to not include the constant term any more.
The change in result can be significant - in my test scene, the average bandwidth halved.
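As a sanity check (a sketch using the commit's own models, with the constant bias term dropped), minimizing the MSE indeed reproduces the paper's formula:
```latex
\[
  \mathrm{MSE}(h) = \mathrm{Bias}(h)^2 + \mathrm{Var}(h)
                  = b^2 h^4 + \frac{c + d\,h^{-k}}{n}
\]
\[
  \frac{\mathrm{d}\,\mathrm{MSE}}{\mathrm{d}h}
    = 4 b^2 h^3 - \frac{k\,d}{n}\, h^{-k-1} = 0
  \;\Longrightarrow\;
  h = \left(\frac{k\,d}{4\, b^2 n}\right)^{1/(k+4)}
\]
```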
The previous algorithm was:
- Fetch buffer data into the feature vector which was in shared (faster) memory
- Use the feature vector to calculate the weight and the design_row, which was stored in local (slower) memory
- Update the Gramian matrix using the design_row
Now, the problem there is that the most expensive part in terms of memory accesses is the third step, which means that having the design_row in shared memory would be a great improvement.
However, shared memory is extremely limited - for good performance, the number of elements per thread should be odd (to avoid bank comflicts), but even going from the 11 floats that the feature vector needs to 13 already significantly hurts the occupancy.
Therefore, in order to make room for the design_row, it would be great to get rid of the feature vector.
That's the first part of the commit: By changing the order in which the design_row is built, the first two steps can be merged so that the design_row is constructed directly from the buffer data instead of going through the feature vector.
This has a disadvantage - the old design_row construction had an early-abort for zero weights, which was pretty common. With the new structure, that's no longer possible. However, this is less of a problem on GPUs due to divergence - to save any time, all 32 threads in the warp would have to abort anyway.
Now the feature vector doesn't take up memory anymore, but the design_row is still too big - it has up to 23 elements, which is far too many.
It has a useful property, though - the first element is always one, and the last 11 elements are just the squares of the first 11. So, storing 11 floats is enough to have all information, and the squaring can be performed when the design_row is used.
Therefore, the second part of the commit adds specialized functions that accept this reduced design_row and account for these missing elements.
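As a rough illustration (hypothetical names, not the actual kernel code), such a specialized function can expand the reduced design_row on the fly when updating the Gramian:

    #define FULL_ROW_SIZE 23
    #define REDUCED_ROW_SIZE 11

    /* Reconstruct element i of the full design row from the 11 stored floats:
     * element 0 is an implicit 1, elements 1..11 are the stored features and
     * elements 12..22 are their squares. */
    inline float design_row_element(const float *row, int i)
    {
        if (i == 0)
            return 1.0f;
        if (i <= REDUCED_ROW_SIZE)
            return row[i - 1];
        const float f = row[i - REDUCED_ROW_SIZE - 1];
        return f * f;
    }

    /* Accumulate weight * row * row^T into the (symmetric) Gramian matrix. */
    inline void gramian_add_reduced(float *gramian, const float *row, float weight)
    {
        for (int i = 0; i < FULL_ROW_SIZE; i++) {
            const float ri = design_row_element(row, i);
            for (int j = i; j < FULL_ROW_SIZE; j++)
                gramian[i * FULL_ROW_SIZE + j] += weight * ri * design_row_element(row, j);
        }
    }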
The missed factor caused the NLM filtering of the buffer variance to essentially reduce to a simple box filter,
which overblurred the buffer variance and therefore caused problems with sharp edges in the shadow buffer.
This commit contains essentially a complete overhaul of the CUDA denoising kernels.
One of the main changes is splitting up the huge estimate_params kernel into multiple smaller ones:
- One kernel calculates the reduced feature space transform.
- One kernel estimates the feature bandwidths.
- One kernel estimates bias and variance for a given global bandwidth. This kernel is executed multiple times for different global bandwidths.
- One kernel calculates the optimal global bandwidth.
This improves UI responsiveness since the individual kernel launches are shorter.
Also, smaller kernels are always a good thing on GPUs - from register allocation to warp divergence.
The next major improvement concerns the transform - before this commit, transform loads from global memory were the main bottleneck.
First of all, it's now stored in a SoA layout instead of AoS, which makes all transform loads coalesced.
Furthermore, the transform pointer is declared as "const float* __restrict__" instead of "float*", which allows NVCC to cache the transform reads. Since only the first kernel writes the transforms, this increases speed again.
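Schematically (illustrative indexing, not the exact code), the layout change looks like this:

    /* AoS: each pixel's transform is stored contiguously, so adjacent threads
     * reading element e touch addresses TRANSFORM_SIZE floats apart. */
    float t_aos = transform[pixel_index * TRANSFORM_SIZE + e];

    /* SoA: element e of all pixels is contiguous, so adjacent threads read
     * adjacent floats and the loads coalesce; with const + __restrict__ the
     * compiler can also route them through the read-only cache. */
    float t_soa = transform[e * num_pixels + pixel_index];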
The third major change is that the feature vector, which is used in every per-pixel loop, is now stored in shared memory.
Since the feature vector is involved in a lot of operations, this improves performance again.
On the other hand, shared memory is rather limited on Kepler and older, so even the 11 floats per thread are already a lot.
With the default "16KB shared - 48KB L1 Cache" split on a GTX780, occupancy is only 12.5% - way too low.
With "48KB shared - 16KB L1 Cache", occupancy is back up at 50%, but of course there are more cache misses - in the end, though, the benefits of having the feature vector local make up for that.
I expect the performance boost to be even higher on Maxwell and Pascal, since these have much larger shared memory and L1.
This commit changes the denoising kernel to actually use the additional frames.
The required changes are surprisingly small - one additional feature contains
the frame to which the pixel belongs, and the per-pixel loop now iterates over frames first.
This commit adds an option to the BufferParams that specifies how many frames are stored in there.
The frames share all other parameters, such as size and passes.
Frames are not stored in order - instead, the first frame is the primary frame, so that all code that uses
the RenderBuffers still works as expected, but code parts that can use the additional frames may do so.
The Standalone Denoising mode now comes with an option to specify the frame range that will be used for denoising.
When doing so, the input filename isn't an actual file name - it has to contain a placeholder of the form "%Xd", where X is the number of digits to which frame numbers are zero-padded. That placeholder is replaced by the padded frame number before loading.
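For example, "render_%4d.exr" with frame 17 would load "render_0017.exr". A minimal sketch of that substitution (hypothetical helper, no error handling):

    #include <cstdio>
    #include <cstdlib>
    #include <string>

    /* Replace the "%Xd" placeholder in pattern with the zero-padded frame number. */
    std::string substitute_frame(const std::string &pattern, int frame)
    {
        const size_t p = pattern.find('%');
        const size_t d = pattern.find('d', p);
        const int padding = atoi(pattern.substr(p + 1, d - p - 1).c_str());
        char num[32];
        snprintf(num, sizeof(num), "%0*d", padding, frame);
        return pattern.substr(0, p) + num + pattern.substr(d + 1);
    }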
So far, no code actually uses the additional frames yet - that will come in the next commits.
The code is supposed to implement replacements for a few SSE4.1-specific functions so that they can be used with SSE3 as well.
Therefore, it was enabled when __KERNEL_SSE3__ was set, but __KERNEL_SSE4__ wasn't.
However, __KERNEL_SSE4__ is never set anywhere - the correct one is __KERNEL_SSE41__.
Because of that, the replacements were also enabled for SSE4.1 and better (AVX), where they're not needed and only slow things down.
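The fix therefore boils down to testing the correct macro; a sketch of the intended guard:

    /* Provide the replacements only when compiling for SSE3 without SSE4.1 -
     * the old guard tested the nonexistent __KERNEL_SSE4__ instead. */
    #if defined(__KERNEL_SSE3__) && !defined(__KERNEL_SSE41__)
    /* ... SSE3 replacement implementations ... */
    #endif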
To use it, call it with "./cycles --denoise --samples <sample number> --output <denoised_file.png> <rendered_image.exr>".
You need to enter the sample number that the image was rendered with - others will work as well, but might produce artifacts.
The input image can be generated by rendering with "Keep denoising data" enabled (denoising itself isn't needed) and saving the result as Multilayer EXR.
For now, this is mainly useful for quicker testing without re-rendering and profiling, not so much for regular users.
However, the next step will be to implement inter-frame denoising for animations, which will provide a significant quality boost.
At first, the denoising kernel just directly accessed the RenderBuffers.
However, that introduced some addressing complexity since the filter window might cover multiple tiles, each with a separate buffer.
Apart from the addressing overhead, this also made it pretty much impossible to SIMDify the CPU code.
When feature prefiltering was added, it changed the buffer addressing.
First, it copied the various parts of different buffers into one continuous array. Then, it operated directly on that array.
With these changes, the only thing the regular buffer addressing was still needed for was the color image.
Now, this commit also copies the color image into the prefiltered buffer. Therefore, it's not really just a prefiltered buffer anymore, but actually contains all the data needed to denoise.
This allows redesigning and cleaning up the kernel-device interface, which is also done in this commit.
Advantages are:
- Lower addressing overhead - every pixel is only addressed once to copy the data to the denoising buffer, and once to store the final result - instead of hundreds of accesses per pixel when looping over the filter window.
- Lower code complexity - one array with standard scanline addressing makes the code a lot cleaner.
- For GPUs: More memory access coherence since the passes are stored in SoA layout instead of AoS (like the regular RenderBuffers are).
- For CPUs: Possibility to use SIMD instructions in the future due to the SoA layout.
The disadvantage is slightly higher memory usage - 22 floats per pixel instead of 16.
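To illustrate the addressing simplification (hypothetical names): with the SoA scanline layout, a pass value is one multiply-add away, with no tile lookup involved:

    /* Fetch pass p at pixel (x, y) from the single denoising buffer (sketch). */
    inline float denoise_buffer_get(const float *buffer, int p, int x, int y, int w, int h)
    {
        return buffer[(p * h + y) * w + x];
    }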
This commit doesn't include the CUDA changes yet.
As soon as any feature pass sample is NaN, every pixel which contains that sample in its filter window will be black in the filtered result.
Ideally no NaNs should be generated in the first place, but there are quite a few cases where they are generated in Cycles and now become visible.
So, as a temporary fix, NaNs are now replaced with zero when storing the passes. Ideally these NaNs should be fixed for good, of course.
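A minimal sketch of the store-time guard (illustrative, the actual pass-writing code differs):

    #include <cmath>

    /* Temporary workaround (sketch): replace NaNs with zero when storing a
     * pass sample, so one bad sample can't black out a whole filter window. */
    inline float filter_nan(float value)
    {
        return std::isnan(value) ? 0.0f : value;
    }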
This commit adds prefiltering to all feature passes, instead of just the shadow pass.
Feature passes are supposed to be noise-free, but effects like Depth of Field, Motion Blur or slightly glossy shaders can still produce noticeable amounts of noise.
The new WITH_CYCLES_DEBUG_FPE build option enables floating-point exceptions in the CPU kernels.
That way, the debugger stops as soon as an invalid calculation is performed, which makes it a lot easier to track these issues.
Note that the option may cause problems in combination with the --debug-fpe runtime option.
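On Linux/glibc, enabling the traps typically looks like this (a sketch; the build option presumably wires up something equivalent in the CPU kernel entry points):

    /* Sketch (glibc-specific): trap invalid operations, division by zero and
     * overflow so the debugger stops at the faulting FP instruction. */
    #define _GNU_SOURCE
    #include <fenv.h>

    void enable_fpe(void)
    {
        feenableexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW);
    }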
This commit finally adds the prefiltered shadow feature to the main denoising algorithm.
Doing so improves detail preservation a lot: Although the main focus is sharp shadow edges, it actually also helps for Ambient-Occlusion-like and geometric details.
The only issue is that some geometric edges might be a bit noisier after denoising, but that will be fixed in the future by downweighting the shadow feature
when the geometric changes (normals and depth features) are strong.
The previous commit already generates the features, but they're quite noisy, which is unacceptable for a LWR feature since it leads to noise in the result.
The filtering algorithm is:
1. filter_divide_shadow:
- Divide the R and G channels of both passes to get two noisy shadow passes
- Scale the B channels and combine them to get the approximate Sample Variance
- Compute the squared difference of the A and B divided passes to get the correct, but also noisy, Buffer Variance
- Compute the squared difference of the A and B Sample Variances to get the variance of the Sample Variance estimate
2. filter_non_local_means:
- Smooth the Buffer Variance using Non-Local Means with weights derived from the Sample Variance pass
3. filter_non_local_means:
- Smooth the A and B shadow passes using Non-Local Means with weights derived from the other pass (B to smooth A, A to smooth B) and from the smooth buffer variance.
4. filter_combine_halves:
- Compute the squared difference of the A and B smoothed shadow passes to estimate the residual variance in the channels.
5. filter_non_local_means:
- Smooth the two passes again using each other and the residual variance for weights.
6. filter_combine_halves:
- Average the two double-smoothed passes to obtain the final shadow feature used for the LWR algorithm.
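Sketched as a host-side dispatch sequence (the kernel names are from the steps above, the parameter lists are illustrative only):

    /* 1: produce shadow halves A/B, sample variance, buffer variance, and
     *    the variance of the sample variance estimate. */
    filter_divide_shadow(buffers, &a, &b, &sample_var, &buffer_var, &sample_var_var);
    /* 2: smooth the buffer variance, guided by the sample variance. */
    filter_non_local_means(&buffer_var, &sample_var, &sample_var_var, &smooth_var);
    /* 3: cross-smooth the halves, each guided by the other and smooth_var. */
    filter_non_local_means(&a, &b, &smooth_var, &a_smooth);
    filter_non_local_means(&b, &a, &smooth_var, &b_smooth);
    /* 4: residual variance from the squared difference of the halves. */
    filter_combine_halves(&a_smooth, &b_smooth, NULL, &residual_var);
    /* 5: smooth again using the residual variance. */
    filter_non_local_means(&a_smooth, &b_smooth, &residual_var, &a_final);
    filter_non_local_means(&b_smooth, &a_smooth, &residual_var, &b_final);
    /* 6: average the two double-smoothed halves into the final feature. */
    filter_combine_halves(&a_final, &b_final, &shadow_feature, NULL);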
Although the algorithm might sound rather slow, that's not the case. This can be seen by reducing the half window: Doing so reduces the time LWR takes, but the prefiltering stays the same. So, since the time used can be reduced drastically with the half window, prefiltering can't be the bottleneck.
Also, the amount of repeated smoothing sounds like it would destroy fine details. However, that is not the case: Due to taking variance into account and the remarkable quality of the NLM filter, details that only span a couple of pixels are still preserved without blurring.
The final feature isn't used yet - that will be added in the next commit.
This commit makes the path tracing kernel generate the needed data for the shadow passes and fill them.
The approach used here is quite similar to a shadowcatcher:
The R channel of the passes records the total energy of direct light queries, no matter whether they were shadowed or not.
The G channel records the amount of energy that actually was unoccluded. Therefore, dividing G by R produces an accurate pass containing the amount of shadowing (between zero and one, since G can never be larger than R).
The B channel contains an approximation of the variance of the shadow info.
However, since Var[E[Unoccluded]/E[Full]] (which is what would be needed) is impossible to compute analytically from the samples, Var[Unoccluded/Full] is used instead (Variance of the individual ratios at each sample, instead of the variance of the ratio of the mean values after sampling).
Although that's biased and actually might not even be close, it is still of use in prefiltering: The correct variance can be estimated (with a lot of noise) from the difference of the A and B passes, and the approximation in the B channel can be used for the weights used to prefilter the noisy, but correct, variance.
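A minimal sketch of the R/G accumulation per light sample (hypothetical names; the B-channel update is omitted since the commit only describes it approximately):

    struct ShadowPass { float r, g, b; };

    /* Accumulate one direct light query into the shadow pass (sketch). */
    void shadow_pass_accumulate(ShadowPass *pass, float full, float unoccluded)
    {
        pass->r += full;       /* total energy of the light query */
        pass->g += unoccluded; /* energy that passed the shadow ray */
        /* pass->b: per-sample approximation of Var[unoccluded/full], see above. */
    }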
These two passes will both hold the same feature, but one will be filled with data from even samples
and the other with data from odd samples. That makes it possible to estimate the buffer variance and prefilter it better later on.
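In code, the split is essentially sample-parity routing (illustrative):

    /* Even samples accumulate into half-buffer A, odd samples into B. */
    float *half_buffer = (sample & 1) ? buffer_b : buffer_a;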
As explained in the code, the bandwidth variables actually store the inverse of the bandwidth
to save a couple of divides. However, if the bandwidth is supposed to be multiplied by a factor,
the stored variable must be divided by that factor - which is what was missing here: scaling h by f means storing (1/h)/f, not (1/h)*f.
This commit reorders the denoising features: Instead of "Normals, Texture, Depth, Screen position"
the order now is "Screen position, Depth, Normals, Texture" like in the old demo code, which significantly
improves result quality.
As explained previously, CPUs currently get the out-of-tile pixels for denoising
from neighbor tiles, while GPUs just render bigger tiles internally.
However, implementation differences can make the GPU version (aka "overscan" rendering)
fail while the CPU code works.
Therefore, this commit adds an environment variable check: if CPU_OVERSCAN is defined,
the already-present CPU single-tile overscan mode is enabled for easier debugging.
Note that this has no benefits at all for regular use and will be removed later!
The tile highlighting is still a bit random and the progress bar isn't showing either,
but the basic live update works.
To avoid lots of duplicated code, editors/render and editors/space_image now share two functions.
One of the first steps of the algorithm is to apply a truncated SVD to a
matrix formed by the features in order to reduce the dimensionality of the
problem and decorrelate dimensions.
The truncation threshold, according to the paper, is defined by twice the
spectral norm of the matrix that is formed like the first one, but from the
variances of the features.
The reference implementation doesn't compute the spectral norm, but instead
computes the Frobenius norm and multiplies it with sqrt(Rank)/2.
That makes sense since the Frobenius norm is guaranteed to lie somewhere
between one and sqrt(Rank) times the spectral norm.
However, it's still an approximation. Therefore, in this commit I've tried
to directly compute the spectral norm since the runtime performance is currently
mainly limited by the per-pixel loops, so a small constant increase shouldn't matter too much.
In order to compute it, the code just constructs the Gramian matrix of the variance matrix
and computes the square root of its largest eigenvalue, which is found via power iteration.
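A self-contained sketch of that computation (illustrative, not the kernel code): the spectral norm of A equals the square root of the largest eigenvalue of A^T A, which power iteration finds:

    #include <cmath>
    #include <vector>

    /* Spectral norm of an m x n matrix A (row-major) via power iteration on
     * its Gramian G = A^T A. Assumes A is nonzero. */
    float spectral_norm(const std::vector<float> &A, int m, int n)
    {
        std::vector<float> G(n * n, 0.0f);
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                for (int k = 0; k < m; k++)
                    G[i * n + j] += A[k * n + i] * A[k * n + j];

        /* Power iteration: v converges to the dominant eigenvector of G,
         * and ||G*v|| (with ||v|| = 1) converges to the largest eigenvalue. */
        std::vector<float> v(n, 1.0f), Gv(n);
        float eigenvalue = 0.0f;
        for (int iteration = 0; iteration < 100; iteration++) {
            for (int i = 0; i < n; i++) {
                Gv[i] = 0.0f;
                for (int j = 0; j < n; j++)
                    Gv[i] += G[i * n + j] * v[j];
            }
            float norm = 0.0f;
            for (int i = 0; i < n; i++)
                norm += Gv[i] * Gv[i];
            norm = sqrtf(norm);
            eigenvalue = norm;
            for (int i = 0; i < n; i++)
                v[i] = Gv[i] / norm;
        }
        /* The largest eigenvalue of A^T A is the squared spectral norm of A. */
        return sqrtf(eigenvalue);
    }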
I haven't tested the code too much yet, but it seems that the improvement is quite negligible
and not really worth the effort. Still, it might be interesting to tweak it further.
This commit finally uses all the work from the earlier commits to implement
the postprocess callback, which triggers a denoising pass on the current
RenderResult.
Note that this does not work on GPUs yet - they can be used for rendering,
but the Device setting has to be switched to CPU rendering before using the
postprocess button.
One other remaining problem is that the Image editor view isn't updated automatically,
you have to switch to another pass/layer and back to see the change.
Also, the feature should eventually be implemented as a Job to get a progress bar and a
responsive UI during denoising.
This mode just creates the Device, generates tiles and runs the denoising kernel
on each tile.
Compared to the regular mode of operation, a lot is missing: No interactivity, no
scene syncing, no progressive rendering etc. However, these features aren't needed
for the denoise-after-render feature, and so this mode saves a lot of code when
calling it from the bindings.
Internally, it uses one single large buffer to hold the image, instead of a small buffer
per tile. That requires some changes to the TileManager and is also the reason for the
earlier region-of-interest commit.
This commit adds a function that takes an existing RenderResult and copies the passes
to newly created RenderBuffers. That will later be used to copy the rendered image
back for post-processing denoising.
The problem here was that regular passes are stored as (signed) integers.
Therefore, the pass with bit 31 set (the Cycles debug pass) is stored as a negative value.
On its own, that's fine - however, when that pass type is implicitly cast to uint64_t
in a function call, the compiler apparently first sign-extends it and then reinterprets
it as unsigned, so the result is 0xffffffff80000000 instead of only bit 31.
To get around that issue, the type is now explicitly cast to an unsigned 32-bit integer first and
then implicitly zero-extended to uint64_t.
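In code, the difference looks roughly like this:

    #include <cstdint>
    #include <cstdio>

    int main()
    {
        int pass = (int)(1u << 31);  /* bit 31 set - negative as a signed int */

        /* Sign-extension happens first: all upper 32 bits get set. */
        uint64_t wrong = (uint64_t)pass;            /* 0xffffffff80000000 */

        /* Casting through uint32_t first keeps only bit 31. */
        uint64_t right = (uint64_t)(uint32_t)pass;  /* 0x0000000080000000 */

        printf("%llx vs %llx\n", (unsigned long long)wrong, (unsigned long long)right);
        return 0;
    }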
This commit implements the Postprocessing API in the Cycles Python bindings
and the poll function in the actual Cycles code.
The actual postprocessing doesn't do any processing yet.
This commit adds a general operator for postprocessing render results in the Image editor.
To do so, the render API is extended by two functions:
- can_postprocess checks whether the current render result can be postprocessed by the engine.
For the denoiser, this will check whether the required passes have been rendered.
- postprocess is executed when the user runs the operator. For the denoiser, this will do the actual denoising.
This commit turns the extended pass types into an enum for nicer code.
Also, the feature passes are now scaled like the regular ones, which means that the passes
are shown correctly in the Image editor.
This commit adds support for tile overscan - rendering a larger tile internally
and only showing its center area. That is needed for GPU denoising since the regular
approach of keeping the neighbor tiles in memory would require far too much memory.
Since tiles are generally quite large on GPUs, the added overhead isn't too large.
This change can help a lot with shadow edges, but adds artifacts to smoother areas.
In the future, I'll look into adaptively selecting the polynomial order.
This commit finally implements the selective denoising pass writing.
With this commit, the denoising feature passes and therefore the changes to the
regular Cycles kernels should be finished.
This commit refactors how the integration result is stored: Instead of summing up
the PathRadiance in the integration function and returning the final color, the integration
function now fills a PathRadiance passed to it and just returns the alpha value.
The main kernel function then passes that PathRadiance to kernel_write_result, a new function
which then handles summing, clamping and storing of light and combined passes.
This commit by itself shouldn't change existing behaviour, but is needed for the upcoming
selective denoising.
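Schematically, the main kernel entry now looks like this (simplified; the real signatures differ):

    /* The integrator fills the PathRadiance and only returns alpha ... */
    PathRadiance L;
    float alpha = kernel_path_integrate(kg, &rng, sample, ray, buffer, &L);

    /* ... and one central function sums, clamps and writes all passes. */
    kernel_write_result(kg, buffer, sample, &L, alpha);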
With this commit, the newly added passes finally get some content.
The three explicitly stored features are the surface normal, the albedo of the surface
and the path length from the camera. These features will be used by the denoiser to "understand"
where differences in lighting come from - for example, the normal pass allows the denoiser to
smooth the noise on a wall, but keep the edge in the corner of the room perfectly sharp.
To preserve small detail like bumpmapped reflections, the actual normal used for shading
is stored instead of the surface normal which is used for the regular Normal pass.
The main purpose of the albedo pass is to preserve fine texture detail, but can also help to detect
object borders.
The depth pass helps for some edges where both surfaces have the same orientation
(so that normals don't help), but its variance also helps to detect depth-of-field blurring.
The real image passes aren't stored yet because they still require a bit of refactoring.
To produce better results for sharp reflections/refractions, the denoise features are only written at the first
rough/diffuse bounce. To determine whether the current bounce is rough/diffuse or not, the roughness of the individual
closures will be used.
Also, the PathState tracks the total length of the path, for the same reason (it might not be written at the first bounce).
This commit adds the add_pass function to the renderer API, which allows a renderer
to add a pass to a specific render layer or to all of them. Since RNA only has 32-bit integers as
argument types, the position of the passtype bit is passed instead of the actual passtype.
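For example (illustrative): for a pass flag beyond bit 31, the caller passes the bit index and the receiver rebuilds the 64-bit flag:

    #include <cstdint>

    /* Hypothetical example: an extended pass that uses bit 40. RNA can only
     * pass 32-bit ints, so the bit index travels through the API ... */
    const int bit_position = 40;

    /* ... and the receiver reconstructs the 64-bit pass flag. */
    const uint64_t passtype = 1ull << bit_position;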
When additional RenderResults are allocated, all additional passes added to the main result
are added to it as well.
This commit extends the number of possible pass types to 64 bit. However, it only
affects the structures used for storage during and after rendering, not the SceneRenderLayer
that's visible to the user (due to various limitations to 32 bit integers in RNA).
Therefore, the main purpose of the extended pass types is to be allocated by the renderer based on some other setting.
This commit adds the necessary parameters to the RenderLayer panel and syncs them to Cycles.
It would be nicer to define the RNA properties from Cycles Python code like most other ones, but
since that's not possible for RenderLayers, it has to be added in the DNA :/
This commit changes the TileManager and Session so that one tile can be processed
in multiple steps, here rendering and denoising. The reason for that is that a tile
can only be denoised as soon as all its neighbors are rendered (since the filter window
will cross tile borders) and only be freed as soon as all its neighbors are denoised
(otherwise they wouldn't have the out-of-tile data either).
Therefore, Tiles now have a State and are actually stored for the whole render process.
Tile Highlighting also needed a bit of tweaking to un-highlight tiles between rendering
and denoising.
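A minimal sketch of the resulting state machine (hypothetical names):

    /* Tiles persist for the whole render and advance through states. */
    enum TileState {
        TILE_TODO,      /* not yet rendered */
        TILE_RENDERED,  /* rendered; denoisable once all neighbors are rendered */
        TILE_DENOISED,  /* denoised; freeable once all neighbors are denoised */
        TILE_DONE       /* buffers released */
    };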
This commit renames the PATH_TRACE task to RENDER and adds subtypes to the RenderTile, for now
PATH_TRACE and DENOISE. The reason for doing this instead of simply adding a new DENOISE
task is that it (1) allows reusing the acquire_tile system etc. and (2) allows denoising tiles
while others are still rendering (if DENOISE were its own task, it would have to wait until
PATH_TRACE ran out of tiles).
The task isn't used yet - that's for the upcoming commits.