Compare commits

593 Commits

Author SHA1 Message Date
78f1476d93 Some more cleanup, get rid of hack to protect against NULL RNA pointerprops. 2017-04-04 16:13:55 +02:00
5937ebba44 Merge branch 'master' into datablock_idprops 2017-04-04 16:04:35 +02:00
9170e49250 Fix missing protection of RNA_pointer_as_string() against NULL pointers.
Odd that issue was never reached before? Looks like it hit in
datablock_idprops branch though...
2017-04-04 16:04:13 +02:00
92aeb84fde Cycles: Tag shaders for update after the threading part is over
This avoids write access happening in non-atomic manner in
Shader::tag_update which modifies the global managers. Even
for 1 byte data types it's quite dangerous.
2017-04-04 15:43:12 +02:00
7b149bfde6 Depsgraph: Use atomic operation to tag the changed ID 2017-04-04 15:43:12 +02:00
5ce95df2c6 Cycles: Fix uninitialized memory access when comparing curve mapping nodes
The issue is coming from the fact that float3 is actually 16 bytes aligned
data type and the "padding" was not initialized. This caused memcmp() to
access non-initialized memory.
2017-04-04 15:43:12 +02:00
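The padding problem described in this commit can be illustrated with a small simulation. This is only a sketch in Python, not the Cycles code: `pack_float3` is a hypothetical helper that lays out three floats in 16 bytes, and the commit does not say how the fix was made, so zeroing the padding is an assumption about one common remedy.

```python
import struct


def pack_float3(x, y, z, padding=b"\x00\x00\x00\x00"):
    """Pack three floats into 16 bytes: 12 bytes of data plus 4 of padding."""
    return struct.pack("<3f4s", x, y, z, padding)


# Same x, y, z values, but the padding bytes hold leftover garbage.
a = pack_float3(1.0, 2.0, 3.0, padding=b"\xde\xad\xbe\xef")
b = pack_float3(1.0, 2.0, 3.0, padding=b"\x12\x34\x56\x78")

bytewise_equal = a == b              # what a memcmp()-style compare sees
fieldwise_equal = a[:12] == b[:12]   # compares only the real data

# Zeroing the padding makes the byte-wise comparison well defined.
a_zeroed = pack_float3(1.0, 2.0, 3.0)
b_zeroed = pack_float3(1.0, 2.0, 3.0)
```

With garbage in the padding the byte-wise comparison reports a difference even though the three components match, which is why the padding has to be initialized (or skipped) before comparing raw bytes.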
e74254dd48 Some minor cleanup. 2017-04-04 15:41:39 +02:00
a63a31dd12 install_deps: removed leftover compile_HDF5 command
It was a leftover from when Alembic with HDF5 was still officially
supported.
2017-04-04 14:50:58 +02:00
8a60d84327 Bumped Alembic library version to 1.7.1
This provides us with a clearer API (so I don't have to use const_cast<>
in upcoming code). It also allows layering of different Alembic files,
so you can have a base file and load a separate file containing overrides.

Verbally approved by Dr. Sergey.
2017-04-04 12:55:38 +02:00
ffac92e385 Buildbot: Update master config 2017-04-04 12:52:54 +02:00
b93ddfd8ac Alembic: force ALEMBIC_LIB_USES_BOOST=ON when not using C++11
Alembic requires one of ALEMBIC_LIB_USES_BOOST, ALEMBIC_LIB_USES_TR1, or
C++11, and silently defaults to the latter if the former two are OFF.

Before this change, Alembic was only built without C++11 if OpenEXR
was built at the same time. This dependency was both unnecessary and
undocumented.
2017-04-04 12:41:44 +02:00
ca5ccf5cd4 Task: Remove non-atomic pool suspended flag assignment
This was already done a few lines above by an atomic fetch-and-and.
2017-04-04 12:32:15 +02:00
4f7eb3ad12 Buildbot: Update master config 2017-04-04 12:15:35 +02:00
e131783384 Merge branch 'master' into datablock_idprops 2017-04-04 10:38:21 +02:00
6347b0b0aa Some cleanup regarding STRUCT_CONTAINS_DATABLOCK_IDPROPERTIES.
This whole system (how to prevent IDP_ID-forbidden structs to get
assigned IDP_ID-allowed IDProps) still seems somewhat brittle to me,
took me some time to wrap my head around it, but... could not find
any better way to do that so far.
2017-04-04 10:29:14 +02:00
728f75c6a7 Buildbot: Some more tweaks to master config 2017-04-03 15:51:31 +02:00
e741804ce3 Buildbot: Update bundled version of server configuration 2017-04-03 15:36:52 +02:00
cc93a66e71 Buildbot: Special branch trickery for linux slaves 2017-04-03 15:04:16 +02:00
2aa0215ec5 Point all submodules to master branch
This way it should be safe to use `git submodule update --remote`.
2017-04-03 14:54:51 +02:00
d27ef3913a Buildbot: Some special tricks for Blender 2.8 slave 2017-04-03 14:49:07 +02:00
54a60eff24 Fix blender player 2017-04-03 12:31:33 +02:00
ab347c8380 Fix T51115: Bump node is broken when the displacement socket is used 2017-04-03 10:51:00 +02:00
368b74315a Collada - add flag to limit precision of exported data, mainly to simplify debugging 2017-04-03 10:48:00 +02:00
f65d6ea954 fix: collada - do proper conversion from int to bool (as with other nearby parameters) 2017-04-03 10:45:24 +02:00
9cf2a581ab Merge branch 'master' into datablock_idprops 2017-04-02 18:08:48 +02:00
3bf0026bec fix: T50412 - collada: Replaced precision local limit function by blender's own implementation 2017-04-01 15:29:50 +02:00
e1fb080743 Cleanup: style 2017-04-01 12:09:17 +11:00
d424c8041d Cleanup: Some minor styling. 2017-03-31 23:56:18 +02:00
3c76da79b4 Merge branch 'master' into datablock_idprops 2017-03-31 19:41:09 +02:00
4a3708b7ea Add missing handling of sequencer's strips IDProperties. 2017-03-31 19:40:12 +02:00
6c42079b78 Depsgraph: Correction for the previous local view commit
Need to flush layers from components back to ID node.
2017-03-31 17:08:18 +02:00
25ab3aac9d Fix threading conflicts in multitex_ext_safe()
This function was modifying texture datablock, which makes the call
unsafe for call from multiple threads. Now we pass the argument that
we don't need nodes to the underlying functions.

There will be still race condition in noise texture, but that should
at least be free from crashes. Doesn't mean we shouldn't fix it tho.
2017-03-31 17:08:18 +02:00
90df1142a3 Cycles: Solve threading conflict in shader synchronization
Update tag might access links (when checking for attributes) and
the links might be in the middle of rebuild in simplification
logic.
2017-03-31 17:08:18 +02:00
27d20a04b5 Fix unreported bug in Blender Render: using unnormalized normal in normal map node in the same way as in baking 2017-03-31 17:53:55 +03:00
26549b5ba2 Cleanup: simpler to define 'no datablock idprop' flag in RNA struct definitions.
That's cleaner & easier to read than to do it for every child class in
register functions...
2017-03-31 15:27:56 +02:00
ff693959d8 WM: Previous commit broke common-case loading new file
Handle this in the operator
2017-03-31 23:48:10 +11:00
7f7c807a92 Keep current app-template when selecting 'New File' 2017-03-31 22:06:36 +11:00
39e1121698 Merge branch 'master' into datablock_idprops 2017-03-31 12:19:30 +02:00
e5fa738ce9 UI cleanup: simplify Icon handling of uiDefAutoButR for PROP_POINTER.
Comes from D113, but really not related to the patch's topic!
2017-03-31 12:14:27 +02:00
aebd8a7328 More cleanup and fixes! 2017-03-31 10:57:01 +02:00
8072d4bd66 Merge branch 'master' into datablock_idprops 2017-03-31 10:44:55 +02:00
952f31b0d8 Fix bunch of missing/incorrect handling of IDProps.
Mainly in readfile.c and library_query.c.
Plus some other minor fixes and cleanup.
2017-03-31 10:42:45 +02:00
4b7d95290f Cycles: More fixes after include changes 2017-03-31 10:12:13 +02:00
d097c2a1b3 Fix T51072: The reference of a pyobject may be being overwritten in bm_mesh_remap_cd_update
In this case the Pyobject gets lost from pybm, and bm.free() does not invalidate the PyElem.
This will cause the destructor of python to read invalid memory and crash.

The solution is to make a copy of the pyobjects pointers before overwriting.
2017-03-31 01:01:16 -03:00
8bd61ea54d Correct string formatting (error in recent change) 2017-03-31 09:48:57 +11:00
14c2083460 Cleanup: warnings 2017-03-31 09:48:57 +11:00
01e0b38b66 Grr, fix stupid typo in previous commit...
Always build before committing :|
2017-03-30 23:06:38 +02:00
314ccf6494 Merge branch 'master' into datablock_idprops 2017-03-30 23:00:12 +02:00
4cfac9edab Cleanup/fix bad code in IDP_SetIndexArray()
Mainly, using index before checking for its validity...
2017-03-30 22:52:53 +02:00
abb876d84b Cleanup - resync with master, this needs to be fixed in master first! 2017-03-30 21:19:17 +02:00
b7464ec6a2 Fix another case of bad logic in recurrent function handling IDProps.
And some generic cleanup/styling/etc.
2017-03-30 21:18:04 +02:00
5b3b0b4778 Redraw parent popup when the child popup is closed 2017-03-30 16:48:18 +03:00
843be91002 Depsgraph: Fix missing updates when in local view
This area is a subject of reconsideration, so for now used simplest
way possible -- ensure depsgraph's nodes have proper layer flags
when going in and out of local mode.
2017-03-30 14:42:55 +02:00
e444a41545 Merge branch 'master' into datablock_idprops 2017-03-30 12:56:31 +02:00
e446935265 Fix always-unlinking in ID free function.
Since IDProps now handle ID usages, makes sense to pass do_id_user flag
to some new IDP_FreeProperty_ex() as well...
2017-03-30 12:52:07 +02:00
a88801b99b Cycles: Fix missing kernel re-compilation after recent changes
Reported by Mai in IRC, thanks!
2017-03-30 11:45:30 +02:00
ced8fff5de Fix T51051: Incorrect render on 32bit Linux
The issue was apparently caused by -fno-finite-math-only added to kernel.cpp
CFLAGS. For now just removed this flag from the kernel (we don't really want
it there at this point, and we don't have it for SSE/AVX optimized kernels).

But surely more investigation is needed here.
2017-03-30 11:37:31 +02:00
9b1564a862 Cycles: Cleanup, rename RegularBVH to BinaryBVH
Makes it more explicit what the structure is from its name.
2017-03-30 09:47:27 +02:00
31e6249256 Mirror Modifier: Add offsets for mirrored UVs
The mirror modifier now has two fields that specify a -1 to 1 offset for
the U and V axes when mirroring their coordinates.

D1844 by @circuitfox
2017-03-30 13:15:02 +11:00
66ef0b8834 Cycles: Fix compilation error of app after the include directories change 2017-03-29 16:54:41 +02:00
171c39cc19 Merge branch 'master' into datablock_idprops 2017-03-29 16:25:09 +02:00
48fa2c83eb Cycles: Attempt to work around compilation errors of CUDA on sm_2x 2017-03-29 16:22:51 +02:00
aae70f182b Cleanup, minor fixes and serious simplification of idprops.c
Mostly, get rid of id_(un)register, we can just use mere id_us_plus/min
as anywhere else in code now. Also, unlink function was not actually
used.
2017-03-29 16:21:40 +02:00
603aafc9dc Fix missing handling of IDProps of nodetree's IO sockets. 2017-03-29 16:19:00 +02:00
be17445714 Cycles: Cleanup, indentation 2017-03-29 15:41:56 +02:00
cc7386ec6b Cycles: Remove toolkit-specific workaround from kernel 2017-03-29 15:07:53 +02:00
5af4e1ca15 Cycles: Only use CUDA 8.0 as officially supported one
This deprecates CUDA 7.5.
2017-03-29 15:06:47 +02:00
270df9a60f Cycles: Cleanup, don't use m_ prefix for public properties 2017-03-29 14:45:49 +02:00
30bed91b78 Cycles: Fix compilation error with visibility flag disabled 2017-03-29 14:28:45 +02:00
0579eaae1f Cycles: Make all #include statements relative to cycles source directory
The idea is to make include statements more explicit and obvious where the
file is coming from, additionally reducing chance of wrong header being
picked up.

For example, it was not obvious whether bvh.h was referring to builder
or traversal, whether node.h is a generic graph node or a shader node
and cases like that.

Surely this might look obvious for the active developers, but after some
time of not touching the code it becomes less obvious where file is coming
from.

This was briefly mentioned in T50824 and seems @brecht is fine with such
explicitness, but need to agree with all active developers before committing
this.

Please note that this patch is lacking changes related to GPU/OpenCL
support. This will be solved if/when we all agree this is a good idea to move
forward.

Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner

Reviewed By: lukasstockner97, maiself, nirved, dingto

Subscribers: brecht

Differential Revision: https://developer.blender.org/D2586
2017-03-29 13:41:11 +02:00
61db9ee27a Cycles: Attempt to workaround compilation error on new CUDA toolkit and sm_2x 2017-03-29 11:50:17 +02:00
c2d3bb7090 Remove non-bmesh case from test 2017-03-29 20:11:54 +11:00
dd662c74ae Fix skin mark operator
Accessed custom-data layer offset before creating.
2017-03-29 20:11:54 +11:00
a7ca991841 Fix crash closing window in background mode 2017-03-29 20:11:54 +11:00
cb6ec44fc7 Fix missing NULL check in gpencil poll
Also de-duplicate poll functions
2017-03-29 20:11:54 +11:00
6332e0e1f7 Use 'empty' option for clearing factory settings 2017-03-29 20:11:54 +11:00
df7f6a3e2e Option to load startup file with empty-data
Useful for batch conversion and tests.
2017-03-29 20:11:54 +11:00
b3f9ae0125 Buildbot: Revert previous change, older toolkit has same exact behavior 2017-03-29 10:48:10 +02:00
15ff75d06b Buildbot: Use older NVCC on 32bit linux
Newer toolkit has some weird issue with cross-compiling 32bit kernels
from 64bit environment.
2017-03-29 10:21:17 +02:00
ac43e5cc87 Buildbot: Remove global hardcoded NVCC path
This was initially needed for heterogeneous setup of two toolkits which
we no longer need.
2017-03-29 10:16:41 +02:00
286adfde38 Cycles: Bring back preview AA samples when using BPT
This was removed in 93426cb. Please be more accurate when
changing interface.
2017-03-29 09:12:26 +02:00
93543e6695 Fix mistake with last master merge... 2017-03-29 08:41:42 +02:00
4c7f4e4662 PyAPI: minor path init simplification 2017-03-29 15:07:41 +11:00
4f69dca547 Fix 'bl_app_override' wrapping multiple times.
Calling `SomeClass.draw(self, context)` instead of `self.draw()`
would try to wrap the argument `self` multiple times, causing an error.
2017-03-29 14:31:14 +11:00
d808557d15 Fix memory leak re-registering operators
Re-registering an operator used by the keymap would leak memory.
Reloading scripts, for example, leaked over ~1600 blocks.
2017-03-29 13:35:15 +11:00
02b2094847 PyAPI: check modules are registered before unregister
Needed since templates may unregister classes.

Also replace old modules on reloading.
2017-03-29 12:38:02 +11:00
93426cb295 Fix T51068: Place props in their own row
This allows the props to extend into the blank space that is to the right.
2017-03-28 16:33:05 -04:00
6ea54fe9ff Cycles: Switch to reformulated Pluecker ray/triangle intersection
The intention of this commit is to address issues mentioned in the
reports T43865,T50164 and T50452.

The code is based on Embree code with some extra vectorization
to speed up single ray to single triangle intersection.

Unfortunately, such a fix is not coming for free. There is some
slowdown for AVX2 processors, mainly due to different vectorization
code, which caused different number of instructions to be executed
and different instructions-per-cycle counters. On the other hand,
this commit makes pre-AVX2 platforms such as AVX and SSE4.1 a bit
faster. The performance is as follows:

              2.78c AVX2   2.78c AVX   Patch AVX2         Patch AVX
BMW            05:21.09     06:05.34    05:32.97 (+3.5%)   05:34.97 (-8.5%)
Classroom      16:55.36     18:24.51    17:10.41 (+1.4%)   17:15.87 (-6.3%)
Fishy Cat      08:08.49     08:36.26    08:09.19 (+0.2%)   08:12.25 (-4.7%)
Koro           11:22.54     11:45.24    11:13.25 (-1.5%)   11:43.81 (-0.3%)
Barcelone      14:18.32     16:09.46    14:15.20 (-0.4%)   14:25.15 (-10.8%)

On GPU the performance is about 1.5-2% slower in my tests on GTX1080
but I'm afraid we can't do much as a part of this change here and
consider it a price to pay for more proper intersection check.

Made in collaboration with Maxym Dmytrychenko, big thanks to him!

Reviewers: brecht, juicyfruit, lukasstockner97, dingto

Differential Revision: https://developer.blender.org/D1574
2017-03-28 17:26:47 +02:00
f512c8c2a8 Merge branch 'master' into datablock_idprops 2017-03-28 15:51:14 +02:00
69aa6577b3 Forgot those IDP_LibLinkProperty call on node sockets IDProps in previous commit... 2017-03-28 14:38:00 +02:00
a5f316e999 Merge branch 'master' into datablock_idprops
Conflicts:
	source/blender/blenloader/intern/readfile.c
2017-03-28 14:35:18 +02:00
59bb4ca1b0 Fix: Icon offset for pie buttons 2017-03-28 13:44:02 +03:00
3f61280327 Cycles: Pass m128 vectors by const reference 2017-03-28 11:01:11 +02:00
e1909958d9 Fix lib_link_cachefile.
That one was:
* Resetting non-ID pointers (lib_link_xxx funcs should only affect ID
  pointers, everything else shall be done in direct_link_xxx func).
* Even worse, always calling lib_link_animdata, even when
  LIB_TAG_NEED_LINK tag was unset...
2017-03-28 10:15:52 +02:00
bed327f1ce Bring back lib_link_mesh() in 'order' with other libdata liblink functions.
We do not need any special handling anymore for usercount of images used
by faces/polygons (tpage stuff), since we have the 'real_user' handling,
which will gracefully cope with all possible situations.

So better not keep that ugly confusing useless special case.
2017-03-28 10:10:15 +02:00
39172c6f34 readfile.c: Cleanup lib_link code a bit.
Mainly:
* Add missing `IDP_LibLinkProperty()` calls for many ID types
  (harmless currently, but better be consistent here!).
* Bring lib_link_xxx functions more in line with each other.
* Replace some long if/else by switch.
2017-03-28 10:03:59 +02:00
1b5acbb329 Correct splash size check 2017-03-28 17:07:37 +11:00
5ce120b865 Fix columns with fixed width 2017-03-28 00:07:31 +03:00
6a5e92c022 Cleanup: Use upper case consistently in adaptive feature compile logging. 2017-03-27 22:52:33 +02:00
7a65f9b171 Cleanup: Resolve todo in CUDA voxel image code. 2017-03-27 22:36:26 +02:00
0df33cc52d Cycles UI: Avoid abbreviation for Hair Extension.
Since 2.5x we should try to avoid such abbreviations in the UI, except for common terms like Min / Max as much as possible.
2017-03-27 21:59:29 +02:00
0cfc557c5d Cycles: Move Shadow Catcher UI option next to Ray Visibility.
Previously it was beneath the Performance UI label, which was incorrect. It's better suited next to Ray Visibility.
2017-03-27 21:51:56 +02:00
43b255a2a1 Cleanup: remove 'generic' ID liblink function in readfile.c
We do need some generic ID handling refactor here (as was recently done
for writefile.c), but this out of scope of this patch - and not the way
to do it.
2017-03-27 16:50:15 +02:00
bd053ac7ba Cycles: Correct ifdef around float3 intrinsics 2017-03-27 16:13:07 +02:00
eef52b2818 Merge branch 'master' into datablock_idprops 2017-03-27 15:19:50 +02:00
60ac61b13f Fix obvious mistake in logic of two functions recursively handling IDProps.
They were expecting IDP_group as parameter, but then recursively
calling themselves with IDProps from groups and arrays, which can be of
any type...
2017-03-27 15:17:29 +02:00
38589ce6ef Fix bad auto-generated UI for ID IDProps.
`RNA_path_resolve()` tries to 'dereference' pointer properties, was not
a problem before but now that we do have pointer IDProps we want to get
property here, not its pointed data.
2017-03-27 14:52:15 +02:00
95e0cb499b Cleanup and minor changes.
No functional change expected, mostly:
* Some renaming, bit of style editing, cleanup...
* Removing some useless diff from master.
2017-03-27 14:49:40 +02:00
2a05292efa Correct for Py3.5 2017-03-27 21:34:21 +11:00
db0bada1c3 Merge branch 'master' into datablock_idprops 2017-03-27 10:48:21 +02:00
8d48ea0233 Cycles: Make shadow catcher an optional feature for OpenCL
Solves majority of speed regression on AMD OpenCL.
2017-03-27 10:47:14 +02:00
Hristo Gueorguiev
e07ffcbd1c Cycles: Add OpenCL support for shadow catcher feature
The title says it all actually.
2017-03-27 10:46:59 +02:00
Hristo Gueorguiev
8ada7f7397 Cycles: Remove ccl_addr_space from RNG passed to functions
Simplifies code quite a bit, making it shorter and easier to extend.
Currently no functional changes for users, but is required for the
upcoming work of shadow catcher support with OpenCL.
2017-03-27 10:46:28 +02:00
d14e39622a Cycles: First implementation of shadow catcher
It uses an idea of accumulating all possible light reachable across the
light path (without taking shadow blocked into account) and accumulating
total shaded light across the path. Dividing second figure by first one
seems to be giving good estimate of the shadow.

In fact, to my knowledge, it's something really similar to what is
happening in the denoising branch, so we are aligned here which is good.

The workflow is as follows:

- Create an object which matches real-life object on which shadow is
  to be caught.

- Create approximate similar material on that object.

  This is needed to make indirect light properly affecting CG objects
  in the scene.

- Mark object as Shadow Catcher in the Object properties.

Ideally, after doing that it will be possible to render the image and
simply alpha-over it on top of real footage.
2017-03-27 10:46:03 +02:00
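The ratio this commit describes can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Cycles kernel code: `shadow_estimate` and its sample layout are hypothetical, and returning 1.0 when no light is reachable is an assumption the commit message does not cover.

```python
def shadow_estimate(samples):
    """Estimate a shadow factor from (unoccluded, shaded) light pairs.

    Each sample holds the light that could reach the catcher with shadow
    blocking ignored, and the light that arrives once blocking is applied.
    """
    total_unoccluded = sum(u for u, _ in samples)
    total_shaded = sum(s for _, s in samples)
    if total_unoccluded <= 0.0:
        # Assumption: with no reachable light, report the point as unshadowed.
        return 1.0
    # "Dividing second figure by first one": shaded light over reachable light.
    return total_shaded / total_unoccluded


# Two light paths of equal strength, one of them fully blocked.
half_shadowed = shadow_estimate([(1.0, 1.0), (1.0, 0.0)])
```

A value near 1.0 means the catcher receives almost all the light it could, and a value near 0.0 means it is almost fully in shadow.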
5aaa643947 Cycles: Optimize shaders earlier to skip unnecessary attributes for noninteractive rendering
Before, Cycles would first sync the shader exactly as shown in the UI, then determine and sync the used attributes and later optimize the shader.
Therefore, even completely unconnected nodes would cause unnecessary attributes to be synced.

The reason for this is to avoid frequent resyncs when editing shaders interactively, but it can still be avoided for noninteractive renders - which is what this commit does.

Reviewed by: sergey

Differential Revision: https://developer.blender.org/D2285
2017-03-27 05:36:49 +02:00
086320a62e CMake: WITH_PYTHON_SECURITY=OFF was ignored
Allow auto-execution to be enabled,
also move this to user-prefs versioning code.
2017-03-27 13:02:41 +11:00
356aacab6b Add back missing include 2017-03-27 09:14:40 +11:00
505b3b7328 Fix padding and align calculation for box layouts 2017-03-26 18:02:11 +03:00
2830f687aa Cleanup: line length, assignment 2017-03-26 21:52:25 +11:00
4bdb2d4885 Fix: Ignore min flag for rows that require all available width 2017-03-26 12:19:01 +03:00
fa63515c37 Fix: Use "round" instead of "floor" in snapping UI to pixels 2017-03-26 12:16:04 +03:00
001fce167a Fix: Button's label can be NULL 2017-03-26 12:04:16 +03:00
ed072e1dcd re-adds the include "BLI_math.h" to customdata
It was removed here rBd52191616b5f
2017-03-26 04:08:16 -03:00
15143a7464 Cleanup: simplify script path assignment 2017-03-26 11:31:39 +11:00
8c0682a93c PyAPI: add missing class registration 2017-03-26 11:28:16 +11:00
f8e02c75ba PyAPI: debug-python check for missing class register
Moving to manual class registration means its easier to accidentally
miss registering classes.

Now detect missing class registration
and warn when running with `--debug-python`
2017-03-26 11:28:10 +11:00
393efccb19 Fix GHOST crash on X11 with recent DPI changes on some systems. 2017-03-25 19:32:50 +01:00
Wouter
fe3fb23697 Automatic DPI for all platforms, per monitor DPI for Windows.
For Windows 8.1 and X11 (Linux, BSD) now use the DPI specified by the operating
system, which previously only worked on macOS. For Windows this is handled per
monitor, for X11 this is based on Xft.dpi or xrandr --dpi. This should result
in appropriate font and button sizes by default in most cases.

The UI has been simplified to a single UI Scale factor relative to the automatic
DPI, instead of two DPI and Virtual Pixel Size settings. There is forward and
backwards compatibility for existing user preferences.

Reviewed By: brecht, LazyDodo

Differential Revision: https://developer.blender.org/D2539
2017-03-25 11:22:16 +01:00
86730f1f35 Remove support for py app-templates
Only zip-files make sense here.
2017-03-25 18:14:00 +11:00
7cb2974182 Cleanup: imports, indentation, long lines 2017-03-25 11:07:48 +11:00
a6f74453b6 Fix unreported: inaccuracy of interpolation of custom color layers due to float truncation
Same solution from rBd23459f51640 but now in `layerInterp_mcol`
Also a cleaning was done in the includes
2017-03-24 20:06:43 -03:00
f68145011f WM: Application Templates
This adds the ability to switch between different application-configurations
without interfering with Blender's normal operation.

This commit doesn't include any templates,
so its mostly to allow collaboration for the Blender 101 project
and other custom configurations.

Application templates can be installed & selected from the file menu.

Other details:

- The `bl_app_template_utils` module handles template activation
  (similar to `addon_utils`).
- The `bl_app_override` module is a general module
  to assist scripts overriding parts of Blender in reversible way.

See docs:
https://docs.blender.org/manual/en/dev/advanced/app_templates.html

See patch: D2565
2017-03-25 10:04:04 +11:00
a7f16c17c2 Fix various i18n ambiguous issues reported in T43295. 2017-03-24 20:02:15 +01:00
dab3865c0b Fix UI message issue, and style cleanup (!) 2017-03-24 20:02:15 +01:00
e9770adf63 Cycles: Remove obsolete variable from the TileManager 2017-03-24 19:44:05 +01:00
8e58e197fd Fix T50995: Wrong freestyle render with new depgraph
The issue was caused by 0371ef1.
2017-03-24 16:33:26 +01:00
5b45715f8a Cycles: Correct isfinite check used in integrator
Use fast-math friendly version of this function.

We should probably avoid unsafe fast math, but this is to be done with
real care with all the benchmarks properly done.

For now committing a much safer fix.
2017-03-24 15:39:33 +01:00
6aa972ebd4 Fix/workaround T51007: Material viewport mode crash on node with more than 64 outputs
Ideally we need to find a way to remove such a static limit here, but it's not so
trivial to implement for texture nodes. Requires some bigger system redesign there.

Just raising limit for now, which is fine for modern systems.
2017-03-24 14:36:00 +01:00
467d824f80 Fix T50238: Cycles: difference in texture position between OpenGL and Cycles render 2017-03-24 12:24:14 +01:00
e32710d2d7 Buildbot: Use proper NVCC path
In fact, we could probably remove this option all together.
2017-03-24 10:27:41 +01:00
85a5fbf2ce Cycles: Workaround incorrect SSS with CUDA toolkit 8.0.61 2017-03-24 10:08:18 +01:00
a14fb77fee Update CLEW to the latest version 2017-03-24 09:43:03 +01:00
d52191616b Fix for last fix of fix: (unsigned)char is limited to 255
setting char as value outside its range will wrap
2017-03-24 04:35:17 -03:00
178708f142 Fix of last commit. Clamp values that will be used! 2017-03-24 04:13:16 -03:00
d23459f516 Fix T51038: layerInterp_mloopcol was casting instead of rounding the interpolated RGBA channels
Casting to int truncates a floating-point number, that is, it loses the fractional part.
2017-03-24 04:06:30 -03:00
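The difference between casting and rounding, together with the clamping added by the two follow-up commits listed above, can be sketched in Python. The function names here are hypothetical and this is not the `layerInterp_mloopcol` code; it only shows the arithmetic on a single 8-bit channel.

```python
import math


def interp_channel_truncate(values, weights):
    """Interpolate one 8-bit channel and truncate, as a plain int cast does."""
    return int(sum(v * w for v, w in zip(values, weights)))


def interp_channel_round(values, weights):
    """Interpolate one 8-bit channel, round to nearest, and clamp to 0..255."""
    total = sum(v * w for v, w in zip(values, weights))
    rounded = math.floor(total + 0.5)
    # Clamp so the result always fits in an unsigned char.
    return max(0, min(255, rounded))


values = [255, 254]
weights = [0.5, 0.5]
truncated = interp_channel_truncate(values, weights)  # 254.5 becomes 254
rounded = interp_channel_round(values, weights)       # 254.5 becomes 255
```

Truncation always biases the result downward, so repeated interpolation darkens colors slightly. Rounding removes that bias, and the clamp guards against weights that sum to more than one.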
bc0b5d611c Cleanup: minor edits to path test
No need for redundant ID's and correct arg order
2017-03-24 17:48:22 +11:00
50f9fc7a53 BLI_path_util: Add BLI_path_join
There weren't any convenient ways to join multiple paths in C
that accounted for corner cases.
2017-03-24 17:40:35 +11:00
0453c807e0 Add: BKE_appdir_folder_id_ex
Allows getting the path without using a static string.
2017-03-24 10:35:58 +11:00
6a6566a7fc Cleanup: line-length 2017-03-24 10:11:01 +11:00
096602d3a8 bpy.path.display_name: strip spaces
Useful for Python module paths that can't start with numbers.
2017-03-24 06:55:44 +11:00
05b7591c62 BLI_path_util: Add string versions of SEP, ALTSEP
This allows for adding separators in string literals.
2017-03-24 05:23:03 +11:00
9af6f40e4d addon_utils: add disable_all function 2017-03-24 05:20:26 +11:00
a96110e710 Cycles: Remove old non-optimized triangle intersection function
It is unused now and if we want similar function we should use
Pluecker intersection which is same performance with SSE optimization
but which is more watertight.
2017-03-23 17:59:34 +01:00
27248c8636 Cycles: Remove unused macro 2017-03-23 17:59:02 +01:00
ba8c7d2ba1 Cycles: Use SSE-optimized version of triangle intersection for motion triangles
The title says it all actually. Gives up to 10% speedup on test scenes here
on i7-6800K.

Render times on GPU are unreliable here, but there might be some slowdown
caused by watertight nature of intersections.
2017-03-23 17:58:03 +01:00
a1348dde2e Cycles: Fix speed regression on GPU
Avoid construction of temporary array and make utility function force-inlined.
Additionally avoid calling float4_to_float3 twice.

This brings render times to the same values as before current patch series.
2017-03-23 17:45:19 +01:00
2a5d7b5b1e Cycles: Use utility function for SSS triangle intersection
This effectively de-duplicates triangle intersection logic implemented
for both regular triangle and SSS triangle.
2017-03-23 17:45:19 +01:00
a5b6742ed2 Cycles: Move watertight triangle intersection to an utility file
This way the code can be reused more easily.
2017-03-23 17:45:19 +01:00
f8a999c965 Cycles: Move triangle intersection precalc to an util file
This is a preparation work for the followup commit which will move
remaining parts of Woop intersection logic to an utility file.

Doing it as a separate commit to keep changes more atomic and easier
to bisect when/if needed.
2017-03-23 17:45:19 +01:00
b797a5ff78 Cycles: Cleanup, move utility function to utility file
Was an old TODO, this function is handy for some math utilities as well.
2017-03-23 17:45:19 +01:00
aa0602130b Cycles: Cleanup, code style and comments 2017-03-23 17:45:19 +01:00
1c5cceb7af Cycles: Move intersection math to own header file
This has the following benefits:

- Modifying intersection algorithm will not cause so much re-compilation.
- It works around header dependency hell and allows us to use vectorization
  types much easier in there.
2017-03-23 17:45:19 +01:00
e8ff06186e Cycles: Cleanup, inline AVX register construction from kernel global data
Currently should be no functional changes, preparing for some upcoming refactor.
2017-03-23 17:45:19 +01:00
5c06ff8bb9 Cycles: Cleanup, remove unused function 2017-03-23 17:45:19 +01:00
e04970b392 Fix player stubs (tm) 2017-03-23 15:47:23 +01:00
2c78b0c71f Collada - Export: now use bind_mat and rest_mat custom properties (when the use_bind_info option is enabled and the properties exist) 2017-03-23 14:14:23 +01:00
b48ba0909a Collada - Import: now add bind_mat and rest_mat as custom properties (when the use_bind_info option is enabled) 2017-03-23 14:14:23 +01:00
476f5c473a Collada - remove no longer used functions (moved to collada_utils) 2017-03-23 14:14:23 +01:00
51d4743033 Collada - Added support for custom bind matrix (using new bind_mat custom property) 2017-03-23 14:14:22 +01:00
6cfa962986 Collada - removed TransformBase baseclass (not needed for anything) 2017-03-23 14:14:22 +01:00
7c094f6079 Collada - Added some helper functions into collada_utils, for common usage in the collada module 2017-03-23 14:14:22 +01:00
092d673689 Added new option for storing bindpose matrix, see T50412 2017-03-23 14:14:22 +01:00
339d0170d1 collada: Simplify reading Node Matrix 2017-03-23 14:14:22 +01:00
1729dd9998 collada: Make sure that bone use_connect is set to false when connect type is not defined in Import 2017-03-23 14:14:22 +01:00
33e32c341a collada: add extern 'C' for c header includes 2017-03-23 14:14:22 +01:00
ec3989441f fix: collada - Connected bones get their tails set to wrong location when fix leaf nodes option is enabled 2017-03-23 14:14:22 +01:00
1978ac65c4 collada: use local variable to avoid repeated call of bone chain_length_calculator 2017-03-23 14:14:22 +01:00
89631485cc collada: use vector copy function instead of direct assigning 2017-03-23 14:14:22 +01:00
1600b93fb8 UI: allow to extend camera as a menu
Needed for T46853
2017-03-23 20:45:02 +11:00
4f4a484b9b Cloth refactor: Remove goal springs and some other cleanup
This removes the goal springs, in favor of simply calculating the goal forces on the vertices directly. The vertices already store all the necessary data for the goal forces, thus the springs were redundant, and just defined both ends as being the same vertex.

The main advantage of removing the goal springs, is an increase in flexibility, allowing us to much more nicely do some neat dynamic stuff with the goals/pins, such as animated vertex weights. But this also has the advantage of simpler code, and a slightly reduced memory footprint.

This also removes the `f`, `dfdx` and `dfdv` fields from the `ClothSpring` struct, as that data is only used by the solver, and is re-computed on each step, and thus does not need to be stored throughout the simulation.

Reviewers: sergey

Reviewed By: sergey

Tags: #physics

Differential Revision: https://developer.blender.org/D2514
2017-03-23 03:52:46 -03:00
4d82d525f8 Cycles: Fix building for some compilers 2017-03-23 00:14:48 -04:00
a63ba2739e Cleanup: remove redundant temp dir init
This is already called by wm_init_userdef, in old code
different initialization methods were used but now it's not needed.

Confusing since prefs are loaded in this function that don't initialize temp.
2017-03-23 15:05:42 +11:00
12b62b58e1 Cleanup: minor wm_homefile_read simplification
Logic in this function is a bit scattered,
minor changes to avoid confusion.

Also rename 'from_memory' to 'use_factory_settings'.
2017-03-23 10:42:09 +11:00
762319e911 fix redundant assignment
Thanks clang for the warning.
2017-03-22 16:26:53 -04:00
d8b34a17ac Cleanup: remove BLI_getlastdir
Replace with BLI_path_name_at_index
2017-03-23 06:33:30 +11:00
c7a4f96f88 Pydoc: Change Wikipedia links and grammar in mathutils matrix code 2017-03-22 14:54:22 -04:00
2ba1868c3f Cleanup/optimization: Simplify some usages of uiItemFullO/_ptr, avoid multiple search of same op. 2017-03-22 19:42:19 +01:00
387ba87ad3 Cleanup: ignore open-blend as startup/prefs basis
No reason startup/prefs would ever be blend-file relative.
2017-03-23 05:24:05 +11:00
dc5007648c Depsgraph: Fix missing relations update tag when typing #frame
New depsgraph requires relations to be updated after drivers changes.
2017-03-22 14:44:45 +01:00
Stefan Werner
412220c8d3 Cycles: fixed warnings 2017-03-22 12:28:01 +01:00
797b1d5053 Fix T51024: Switch install_deps to set OSL_ROOT_DIR instead of CYCLES_OSL.
Patch by @alekulyn, thanks.

Differential Revision: https://developer.blender.org/D2571
2017-03-22 12:05:43 +01:00
2b44db4cfc Fix/workaround T50533: Transparency shader doesn't cast shadows with curve segments
There seems to be a compiler bug in MSVC2013. The issue does not happen on Linux and
does not happen on Windows when building with MSVC2015.

Since it's really a pain to debug release builds with MSVC2013, the AVX2 optimization
is disabled for curve segments for this compiler.
2017-03-22 11:37:23 +01:00
8563d3b254 Create correct node after image file drag&drop for Blender Render 2017-03-22 12:00:33 +03:00
d0253b2ea4 BLI_path_util: add BLI_path_name_at_index
Utility to get a file/dir in the path by index,
supporting negative indices to start from the end of the path.

Without this it wasn't straightforward to get
a file's parent directory name from a filepath.
2017-03-22 19:34:43 +11:00
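
A minimal Python sketch of the indexing behaviour described for BLI_path_name_at_index above. The helper name `path_name_at_index` and its "return None when out of range" convention are assumptions made for illustration; the actual C utility's signature is not shown in this log.

```python
def path_name_at_index(path, index):
    """Return the path component at `index`, or None if out of range.

    Negative indices count from the end of the path, so -1 is the last
    component and -2 is the name of its parent directory.
    Hypothetical helper for illustration, not the Blender C API.
    """
    # Accept both separator styles and drop empty components that come
    # from leading, trailing, or doubled separators.
    parts = [p for p in path.replace("\\", "/").split("/") if p]
    if -len(parts) <= index < len(parts):
        return parts[index]
    return None


# Getting a file's parent directory name, the use case mentioned above.
parent_name = path_name_at_index("/home/user/project/file.blend", -2)
```

With the example path, `parent_name` is the component just before the file name.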
253281f9d6 Fix for splash not opening centered
When the new window didn't end up using the size stored in the preferences
the splash would not be centered (even outside the screen in some cases).

Now centered popups listen for window resizing.
2017-03-22 13:53:54 +11:00
a3f48d65df Datablock ID Properties
Summary:
The absence of datablock properties "will certainly be resolved soon as the need for them is becoming obvious" said the [[http://wiki.blender.org/index.php/Dev:Ref/Release_Notes/2.67/Python_Nodes|Python Nodes release notes]]. So this patch allows Python scripts to create ID Properties which reference datablocks.
This functionality is implemented for `PointerProperty` and now such properties can be created with Python.

In addition to the standard update callback, `PointerProperty` can have a `poll` callback (standard RNA) which is useful for search menus. For details see the test included in this patch.

Original author: @artfunkel

Alexander (Blend4Web Team)

Reviewers: brecht, artfunkel, mont29

Subscribers: poseidon4o, mont29, homyachetser, Evgeny_Rodygin, AlexKowel, yurikovelenov, fjuhec, sharlybg, cardboard, Asticles, duarteframos, blueprintrandom, a.romanov, BYOB, disnel, aditiapratama, bliblubli, dfelinto, lukastoenne

Maniphest Tasks: T37754

Differential Revision: https://developer.blender.org/D113
2017-03-21 17:23:26 +03:00
a0f16e12a0 Cycles: Use more friendly GPU device name for AMD cards
For example, for RX480 you'll no longer see "Ellesmere" but will see
"AMD Radeon RX 480 Graphics" which makes more sense and allows to easily
distinguish which exact card it is when there are multiple different cards
with the Ellesmere codename (i.e. RX480 and WX7100) in the same machine.
2017-03-21 12:01:11 +01:00
7780a108b3 Cycles: Simplify some extra OpenCL query code 2017-03-21 12:01:03 +01:00
a41240439b Update CLEW to latest version
Needed to get access to some AMD extensions.
2017-03-21 12:01:03 +01:00
fceb1d0781 Cycles: Cleanup, add some utility functions to shorten access to low level API
Should be no functional changes.
2017-03-21 12:01:03 +01:00
eb1a57b12c Cycles: Fix wrong vector allocation in the mesh sync code 2017-03-21 04:30:08 +01:00
8fff6cc2f5 Cycles: Fix building of OpenCL kernels
There's no overloading of functions in OpenCL so we can't make use of
`safe_normalize` with `float2`.
2017-03-20 22:55:52 -04:00
13d8661503 Fix T51012: Surface modifier was not working with curves
This prevented the Force Field Curve Shape from working
2017-03-20 18:51:32 -03:00
3c4df13924 Fix T50268: Cycles allows to select unsupported GPUs for OpenCL 2017-03-20 15:37:27 +01:00
d544a61e8a Cycles: Update remaining time once per second without waiting for a tile change
Previously, the code would only update the status string if the main status changed.
However, the main status did not include the remaining time, and therefore it wasn't updated until the amount of rendered tiles (which is part of the main status) changed.

This commit therefore makes the BlenderSession remember the time of the last status update and forces a status update if the last one was more than a second ago.

Reviewers: sergey

Differential Revision: https://developer.blender.org/D2465
2017-03-20 15:28:36 +01:00
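
The update rule described in the commit above can be sketched roughly as follows. This is only an illustration in Python, not the BlenderSession code: the class name `StatusThrottle`, the injectable `clock`, and the use of `>=` for the one-second check are all assumptions.

```python
import time


class StatusThrottle:
    """Decide when a status string should be pushed to the UI.

    Hypothetical stand-in for the logic described above: update when the
    main status changes, or when the last update is at least `interval`
    seconds old.
    """

    def __init__(self, interval=1.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock            # injectable so the logic can be tested
        self.last_status = None
        self.last_update_time = None

    def should_update(self, status):
        now = self.clock()
        changed = status != self.last_status
        stale = (self.last_update_time is None
                 or now - self.last_update_time >= self.interval)
        if changed or stale:
            # Remember what was shown and when, as the commit describes.
            self.last_status = status
            self.last_update_time = now
            return True
        return False
```

Remembering the time of the last update is what lets the remaining-time display refresh even while the tile count, and therefore the main status, stays the same.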
a201b99c5a Fix T50975: Cycles: Light sampling threshold inadvertently clamps negative lamps 2017-03-20 14:48:55 +01:00
6b86b446d3 Cleanup: useless call to glRasterPos before view3d_cached_text_draw_add()
Probably some leftover from much older code?
2017-03-20 14:36:06 +01:00
18bf900b31 Fix T50990: Random black pixels in Cycles when rendering material with Multiscatter GGX 2017-03-20 12:07:41 +01:00
06159e6a58 Correct unintended splash on loading startup 2017-03-20 12:46:20 +11:00
dbc8b81ecf User Preferences: Split out addon and keymap free 2017-03-20 12:42:19 +11:00
eaf88f564c Remove register_module use in Cycles 2017-03-20 12:16:51 +11:00
fa11d41113 Cleanup: especially non pep8 parts of Py UI 2017-03-20 09:49:35 +11:00
df76616d74 Usual UI/i18n message fixes.
Please provide valid description for SurfaceDeform modifier tooltip.
Such place-holders should not pass final checks before merging in master!
2017-03-19 17:31:07 +01:00
19d493ee10 Moving classes to separate listing broke panel order
Although this wasn't so obvious since it
only showed up for factory settings and in the preferences window.

Panel display order depends on registration order,
Sorry for the noise. On the bright side we no longer need to move
classes around to re-arrange panels.
2017-03-20 02:37:55 +11:00
84935998a7 Add missing classes from recent commit 2017-03-20 02:07:24 +11:00
56d3cc9341 PyAPI: ID Property tests 2017-03-19 03:57:40 +11:00
9bdda427e6 PyAPI: remove bpy.utils.register_module()
In preparation for it being removed, see: T47811
2017-03-18 20:03:24 +11:00
2fbc50e4c1 Alternate fix for T50899
object_get_derived_final shouldn't have been assuming mesh objects.

It's even valid to use a curve as a target for a shrink-wrap modifier.
2017-03-18 18:33:01 +11:00
3ceb68c833 Missing from recent commit 2017-03-18 12:33:59 +11:00
e392bb4937 PyAPI: add BPY_execute_string_as_string
Utility to execute a string and get the resulting string,
matching BPY_execute_string_as_number.

Not used just yet but generally useful function.
2017-03-18 12:19:03 +11:00
d863b5182e Cleanup: use return args last and 'r_' prefix. 2017-03-18 09:39:36 +11:00
9d873fc3de Various icon adjustments 2017-03-17 16:57:53 +03:00
ea3d7a7f58 Fix T50968: Cycles crashes when image datablock points to a directory
See more details about the root cause here:

  https://github.com/OpenImageIO/oiio/pull/1640
2017-03-17 14:47:12 +01:00
d6b4fb6429 Cycles: Fix mistake in previous split kernel commits
Own stupid mistake. Reported by nirved in IRC, thanks!
2017-03-17 11:55:59 +01:00
502c4be56e fix: redraw dope sheet / action editor when pose bone selection changes 2017-03-17 11:04:13 +01:00
a58350b07f Cycles: Cleanup, indentation 2017-03-17 10:25:37 +01:00
98b81493f3 Refactor writefile handling of data-blocks.
Instead of calling a function that loops over the whole list of a given ID
type, do the whole loop over Main in the parent function, and call functions
that write a single datablock at a time.

This design is more in line with all other places in Blender where we
handle the whole content of Main (including readfile.c), and is much easier
to extend, e.g. to add some generic processing of IDs before/after
writing.

From the user's point of view there should be no change at all. The only difference is
that data-block types won't be saved in the same order as before (the .blend
file spec enforces no order here, so this is not an issue, but it could
trip up third-party tools that use other, simplified .blend file readers).

Reviewers: sergey, campbellbarton

Differential Revision: https://developer.blender.org/D2510
2017-03-17 10:02:08 +01:00
e361adbca2 Cycles: Fix compilation error of LCG RNG 2017-03-17 09:58:08 +01:00
439a277aa5 Cycles: Silence strict compiler warning 2017-03-17 09:56:44 +01:00
2cae58524c Cycles: Improve memory usage of CPU split kernel by using smaller global size 2017-03-17 01:54:10 -04:00
60a344b43d Cycles: Fix handling of barriers 2017-03-17 01:54:04 -04:00
b27e224276 Mesh Convert: remove meaningless modifier check
Meshes w/o modifiers wouldn't have their derived mesh applied.
Check was to avoid crash but it's in fact meaningless,
since the modifier might be disabled, or there may be virtual modifiers.
2017-03-17 10:10:55 +11:00
750c0dd4de Fix T50950: Converting meshes fails w/ boolean 2017-03-17 09:58:05 +11:00
d4d8da28fc Add BKE_blendfile_userdef_read_from_memory
Needed to read user-preferences from in-memory startup.blend

Also skip data-blocks when reading preferences.
2017-03-17 07:01:48 +11:00
b2d3956e7b Add support for loading preference struct
Previously it would always load into 'U' global.
Needed for loading & merging template preferences.
2017-03-17 05:20:50 +11:00
db04980678 PyAPI: Menu.path_menu: Add path filter callback
Needed if we want to filter based on filenames (not just extension).
2017-03-17 05:20:50 +11:00
f7793bd53c Correct reading missing property 2017-03-17 05:20:50 +11:00
1f65ab606b Fix missing undo pushes in outliner's new datablock management operations.
Not sure why I did not put those from start... Actually *not* having an
undo point here can be problematic, since undoing some previous action
was trying to restore from bad pointer (I think) in UI, generating
asserts.

Note however that it's not a 'pure' undo, in that you may not find your
linked data in exact same state as before deleting it, after an undo,
since it actually implies *reloading* the deleted libraries (and not
restoring from a previously stored memory dump).

Reported by @sergey, thanks.
2017-03-16 17:05:48 +01:00
fa9bd04483 Fix outliner contextual menu allowing to delete indirect libraries.
There is no way currently to prevent the option from showing in menu, so
instead report a warning to user (and curse again current nightmarish
system of operation in outliner...).

Reported by @sergey, thanks.
2017-03-16 17:05:48 +01:00
0434053f13 Depsgraph: Fixed crash with curve bevel indirect dupligroups
Need to expand all object's dupli-groups, not only the dupli-groups
of objects directly linked to the scene.
2017-03-16 15:35:21 +01:00
68e58f1991 Depsgraph: Use string and vector in the DEG namespace only 2017-03-16 15:35:21 +01:00
6d8875bd61 Depsgraph: Don't use explicit values in runtime only enum
Lower risk of forgetting to update some values here.
2017-03-16 15:35:21 +01:00
c4e07dd29b Cleanup: differentiate startup/prefs in home-file reading 2017-03-17 00:42:13 +11:00
aad9dd2f1b Support passing in UserDef for free function
Needed so we can load and free non-global user preferences.
2017-03-17 00:18:20 +11:00
1cad64900e Cycles: Define ccl_local variables in kernel functions
Declaring ccl_local in a device function is not supported
by certain compilers.
2017-03-16 11:27:17 +01:00
1ff753baa4 Cycles: Workaround for compilation error caused by passing KernelGlobals
Pass globals as a bare pointer, same as it used to be prior to split kernel rework.

AMD CPU platform and Intel OpenCL were complaining about this.

Perhaps we shouldn't pass globals as a pointer at all; this isn't really
portable and could perhaps cause issues on 32 bit.
2017-03-16 11:27:17 +01:00
26620f3f87 Cycles: Avoid some ccl_local in various kernels 2017-03-16 11:27:17 +01:00
4833a71621 Cycles: Adjust global size for OpenCL CPU devices to make them faster 2017-03-16 06:11:42 -04:00
d68a84d1d2 Fix BGE building.
When you use typedef'ed enum, you need to define all supported values
explicitly in enum, else compiler goes grumpy...
2017-03-16 10:30:02 +01:00
375ede0f3f Comments: wmOperator.cancel & modal 2017-03-16 06:39:09 +11:00
68496c0b38 Missed BGE in recent commit 2017-03-16 06:28:20 +11:00
c832354e33 Load user-preferences before startup file
Internal change needed for template support.
Loading the user preferences first so it's possible
for preferences to control startup behavior.

In general it's useful to load preferences before data-files,
so we know security settings, for example.
2017-03-16 04:02:24 +11:00
c44cdd5905 Cycles: Allow rendering a range of resumable chunks
The range is controlled using the following command line arguments:

  --cycles-resumable-start-chunk
  --cycles-resumable-end-chunk

These are 1-based indices of the chunk range to render.
2017-03-15 16:00:01 +01:00
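
As a rough illustration of what a 1-based, inclusive chunk range means, the Python sketch below maps such a range to a half-open range of samples. How Cycles actually divides samples between chunks is not stated in this log; the even split with the remainder going to the last chunk, the function name `chunk_range_to_samples`, and the `num_chunks` parameter are all assumptions.

```python
def chunk_range_to_samples(num_samples, num_chunks, start_chunk, end_chunk):
    """Map a 1-based inclusive chunk range to a [first, last) sample range.

    Hypothetical helper: assumes samples are split evenly between chunks,
    with any remainder assigned to the final chunk.
    """
    if not (1 <= start_chunk <= end_chunk <= num_chunks):
        raise ValueError("chunk range must satisfy 1 <= start <= end <= num_chunks")
    per_chunk = num_samples // num_chunks
    # Chunk 1 starts at sample 0, hence the "- 1" for the 1-based index.
    first = (start_chunk - 1) * per_chunk
    # The last chunk absorbs the remainder of an uneven division.
    last = num_samples if end_chunk == num_chunks else end_chunk * per_chunk
    return first, last
```

The only part taken from the commit message is the indexing convention: the first chunk is number 1, not 0.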
c5dba540d7 Cycles: Use argument parser for resumable render feature
Currently there are no functional changes, but we will be adding
a couple more options here soon.
2017-03-15 16:00:01 +01:00
Dalai Felinto
81dc8dd42a Fix bug on Blender version string
Reported by Pablo Vazquez (venomgfx) over irc.
2017-03-15 15:42:01 +01:00
Dalai Felinto
c6c85a8c6b Move Blender version string handling to its own function
Planning to use this util function in 2.8 for doversioning (to communicate converted layers)
2017-03-15 14:10:20 +01:00
af1d9ecd40 Fix strict compiler warning in the previous commit 2017-03-15 12:48:07 +01:00
9ad252d157 Fix T50938: Cache not being reset when changing simulation settings with new depsgraph
The thing I'm really starting to hate is the requirement to specify both
operation code and node type. Seems to be duplicated enums without real
need for that.
2017-03-15 11:10:42 +01:00
6a5487e021 BGE: Fix blenderplayer stub.
Add dummy definition of WM_operator_is_repeat.
2017-03-14 21:17:40 +00:00
f13c729b26 WM: free operators when repeating
Needed since the active operator isn't ensured to be the last.
2017-03-15 05:37:42 +11:00
647fb6ef1e fix D2552: Collada - Follow up change to complete the fix in rBda6cd7762810 (use unique id for bones with same name in different armatures) 2017-03-14 19:31:25 +01:00
4877c9362a Collada simplify: avoid duplicate negation in boolean 2017-03-14 19:31:25 +01:00
76ec329dd1 WM: add Operator.is_repeat() check for C & Py
This addresses an issue raised by D2453 -
that there was no way to check if operators are run
multiple times in a row.

Actions are still ignored that don't cause an UNDO event.
2017-03-15 03:57:01 +11:00
1208792adb WM: store operators with undo flag
This is needed so we can tell if operators are executed repeatedly.
2017-03-15 03:57:01 +11:00
582f9ddeb7 Update path_menu for recent API change 2017-03-15 03:57:01 +11:00
5ba51de84a Cycles: Cleanup, indentation 2017-03-14 16:54:16 +01:00
43f7d5643f Fix T50926: python crashes with path containing utf8 characters.
Default text encoding is platform-dependent in Python, and Windows
usually does not use utf-8 as default...
2017-03-14 16:04:45 +01:00
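
The usual way to sidestep a platform-dependent default encoding in Python is to pass the encoding explicitly when opening a text file. The sketch below shows that general pattern only; the helper name `read_text_utf8` is made up for illustration, and the actual change made in this commit is not shown in the log.

```python
import os
import tempfile


def read_text_utf8(filepath):
    """Read a text file as UTF-8 regardless of the platform default."""
    # Without encoding=..., open() falls back to the locale's preferred
    # encoding, which on Windows is often not UTF-8.
    with open(filepath, "r", encoding="utf-8") as f:
        return f.read()


# Round-trip a non-ASCII string through a temporary file.
tmp_dir = tempfile.mkdtemp()
tmp_path = os.path.join(tmp_dir, "example.txt")
with open(tmp_path, "w", encoding="utf-8") as f:
    f.write("caf\u00e9 \u65e5\u672c\u8a9e")
text = read_text_utf8(tmp_path)
```

Writing and reading with the same explicit encoding keeps the round trip independent of the machine's locale.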
Jon Allee
da6cd77628 fix D2552: Collada - use unique id for bones with same name but in different armatures. Co-authored-by: Gaia <gaia.clary@machiniamtrix.org> 2017-03-14 14:35:51 +01:00
f3ff03b3c1 CLNor: rework threaded computation.
Was using some threaded queue on top of task pool, tssk...

Now properly using the task pool directly to crunch chunks of smooth fans.

No noticeable changes in speed.

Tried to completely get rid of the 'no threading with few loops' code,
but even just creating/freeing the task pool, without actually pushing
any task, is enough to make code 50% slower in worst case scenario (i.e.
few thousands of simple cube objects).
2017-03-14 12:54:57 +01:00
284701e371 CLNor code: use averaged debug timing. 2017-03-14 12:54:57 +01:00
1410ea0478 Fix T50876: Cycles Crash - Cycles crashes before sampling when certain meshes have autosmooth enabled.
The root of the issue was in custom normal code, so far it assumed that
we could only have one cyclic smooth fan around each vertex, which is...
blatantly wrong (again, the two cones sharing same vertex tip e.g.).

This required a rather deep change in how smooth fans/clnor spaces are processed,
took me some time to find a 'good' solution.

Note that new code is slightly slower than previous one (maybe about 5%),
not much to be done here, am afraid.

Tested against all older report files I could find, seems OK.
2017-03-14 12:54:57 +01:00
521133682c Fix own mistake in recent 'edge split' refactor.
We can have some vertices to split, while not having any edge (think
about two cones sharing the same tip vertex e.g.).
2017-03-14 12:54:57 +01:00
8dd0355c21 Cycles: Try to avoid infinite loops by catching invalid ray states 2017-03-14 06:22:57 -04:00
0ee1cdab7e WM: Option to load startup w/o closing the splash
Not user visible, needed for switching templates.
2017-03-14 21:05:00 +11:00
4c5374d46a PyAPI: extend Menu.path_menu
- Add optional 'display_name' callback
  so callers can construct own names.
- Add optional 'prop_filepath' argument
  (for operators that don't use "filepath").
- Add doc-string.
- Use keyword only arguments.
2017-03-14 21:00:55 +11:00
810982a95c Fix T50932: depth picking w/ pose-bone constraints 2017-03-14 18:02:27 +11:00
76acaefdd7 Cycles: Cleanup, wipe obviously outdated parts of split kernel comments 2017-03-13 17:16:16 +01:00
0c72008592 fix msvc warnings about unknown opencl pragmas 2017-03-13 10:08:14 -06:00
aa36c73c33 Cycles: Add missing header in the file 2017-03-13 16:59:09 +01:00
2b3cc24388 Fix T50920: Adds missing edges on return of bisect operator 2017-03-13 09:22:11 -03:00
Hristo Gueorguiev
f169ff8b88 Fix T50925: Add AO approximation to split kernel 2017-03-13 11:15:58 +01:00
8794a43b68 Cycles: Make MESA compiler more happy
While this compiler is not officially supported yet, getting it to work is
a nice thing because more and more AMD cards will fall under MESA driver.

It's also nice to use explicit comparison with NULL, which makes it more
clear whether variable is a boolean or pointer. Even Rust enforces this!

Patch by Ian Bruce with own modifications.
2017-03-13 09:57:25 +01:00
e8021f5e3b UI: expose mesh conversion in apply menu
The mesh convert operator can 'freeze' a mesh
(WYSIWYG, modifiers, shape keys etc).
However it's not very obvious that the way to perform this
operation is to convert a mesh to a mesh.

Expose this as 'Visual Geometry to Mesh' in the 'Apply' menu,
since this is where users might expect to see it.
2017-03-13 07:33:24 +11:00
10404e3e56 Comments: minor clarification 2017-03-13 07:18:28 +11:00
b759d3c9c5 fix T50923: Inconsistent default values and wrong order of parameters in api call 2017-03-12 20:31:51 +01:00
18ed060bc3 Fix T50930 Typo in 'jpeg2k_codec' description 2017-03-12 13:56:25 -04:00
6521307dcd BMesh: rename cryptic functions
Use expanded names for bmesh primitive operations
(urmv jvke semv jfke).

Use 'bmesh_kernel_' prefix,
these functions aren't intended for wide use so favor readability.

Remove BM_face_vert_separate,
it wasn't used and only skipped step of finding correct loop of face.
2017-03-13 04:39:20 +11:00
f28376d8d9 Cleanup: style 2017-03-13 04:39:20 +11:00
Julian Eisel
8ca11f5b72 UI: Always open enum-search popups with empty search string
It might be useful to keep the search string stored in some cases, but
in most it's not useful but confusing. Especially if the string is taken
from a menu showing a different enum.
2017-03-12 18:14:43 +01:00
3f94836922 Fix T50788: blender startup crash on macOS with some types of volumes available. 2017-03-12 18:03:15 +01:00
68ca973f7f Fix T50628: gray out cycles device menu when no device configured only for GPU Compute. 2017-03-12 18:00:17 +01:00
76015f98ae Fix icon alignment for pie buttons 2017-03-11 22:34:09 +03:00
bcc8c04db4 Cleanup: code style & cmake 2017-03-12 02:47:53 +11:00
98045648ab Add support for Objects in Drive variable Rotational Difference
This brings the behaviour in line with the manual:
https://docs.blender.org/manual/en/dev/animation/drivers/drivers_panel.html#driver-variables
2017-03-11 10:43:23 -03:00
304315d181 BMesh: Fix BM_face_loop_separate_multi
When the loop region passed in had no loops to edge-split from,
it was assumed nothing needed to be done.

This ignored the case where loops share a vertex
without any shared edges.

Now BM_face_loop_separate_multi behaves like BM_face_loop_separate.

Fixed error where faces remained connected by verts in BM_mesh_separate_faces.
2017-03-11 23:26:44 +11:00
ce155ad2f6 Correct recent bmesh separate addition
- Was setting flag incorrectly to avoid re-use.
- Check edge has loops before accessing.
2017-03-11 23:26:13 +11:00
96868a3941 Fix T50888: Numeric overflow in split kernel state buffer size calculation
Overflow led to the state buffer being too small and the split kernel to
get stuck doing nothing forever.
2017-03-11 05:39:28 -05:00
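
To illustrate the kind of failure described in the commit above, the Python sketch below emulates a buffer size computed in a fixed-width integer. The commit does not say which integer type overflowed, so the 32-bit width is an assumption, and the element count and per-element size are made-up numbers chosen only to exceed 4 GiB.

```python
def buffer_size_u32(num_elements, bytes_per_element):
    """Emulate the product being computed in an unsigned 32-bit integer."""
    return (num_elements * bytes_per_element) & 0xFFFFFFFF


def buffer_size_u64(num_elements, bytes_per_element):
    """Emulate the same product computed in an unsigned 64-bit integer."""
    return (num_elements * bytes_per_element) & 0xFFFFFFFFFFFFFFFF


# Illustrative values: 4 Mi work items, 1536 bytes of state each (6 GiB).
num_elements = 4 * 1024 * 1024
bytes_per_element = 1536

wrapped = buffer_size_u32(num_elements, bytes_per_element)
correct = buffer_size_u64(num_elements, bytes_per_element)
```

Here `wrapped` comes out at 2 GiB while `correct` is 6 GiB, so a buffer allocated from the narrow result would be too small for the state it is meant to hold.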
5afe4c787f BMesh: add BM_mesh_separate_faces
Fast-path for bmesh split operator which duplicates and deletes.
Use when only separating faces, currently used by the intersect tool.
2017-03-11 20:50:32 +11:00
5be8adf8c0 Makefile: set tab width=4 2017-03-11 20:48:12 +11:00
f667593b6a Fix text and icon positioning issue on high DPI, after recent changes in 32c5f3d. 2017-03-11 04:47:22 +01:00
2d3c44389a Fix OpenCL warnings about doubles on some platforms. 2017-03-11 00:55:23 +01:00
c374e9f1f5 Breakdowner - Constrain Transform and Axis
This commit adds new features to the breakdowner, giving animators more
control over what gets interpolated by the breakdowner. Specifically:

   "Just as G R S let you move rotate scale, and then X Y Z let you do that
   in one desired axis, when using the Breakdower it would be great to be
   able to add GRS and XYZ to constrain what transform / axis is being
   breakdowned."

As requested here:
https://rightclickselect.com/p/animation/csbbbc/breakdowner-constrain-transform-and-axis


Notes:
* In addition to G/R/S, there's also B (Bendy Bone settings) and C (custom properties)
* Pressing G/R/S/B/C or X/Y/Z again will turn these constraints off again
2017-03-11 11:53:45 +13:00
b6713dcbe5 rBa81ea408367abe2f33b351ff6dcc6b09170fd088 "object" -> "target" 2017-03-10 13:54:06 -03:00
a81ea40836 fix T50899: Even though the Shrinkwrap options hide the possibility of using a non-mesh target, you can still circumvent this... Causing Crash 2017-03-10 13:51:04 -03:00
103ae04fbc Correct glPixelTransfer function 2017-03-11 03:03:47 +11:00
15eb83c8b3 Fix T50900: Text-Blocks created from "Edit Source" have zero users. 2017-03-10 15:43:33 +01:00
9d6acc34a1 Fix useless allocation of edge_vectors in threaded case of loop split generation. 2017-03-10 15:43:33 +01:00
59fd21296a Cycles: Cleanup, extra semicolon and space 2017-03-10 15:38:30 +01:00
17b3097205 Adjust kmi header 2017-03-10 15:10:40 +03:00
6038583909 Cleanup: struct flags for select picking 2017-03-10 21:47:43 +11:00
3dbb560331 Cleanup: rename drawObjectSelect
After adding draw_object_select, noticed a similar name.
Rename drawObjectSelect to draw_object_selected_outline.
2017-03-10 21:27:33 +11:00
12e681909f Fix T47690: Connected PET w/ individual origins
- Connectivity length was overwritten by distance to closest selected.
- Vertices used the 'island' center of the closest vertex,
  even if it wasn't connected.

Now optionally keep track of the original index used for the closest
connected distance.

To support this needed to add optional support for islands of 1 vertex.
2017-03-10 20:27:23 +11:00
4a2cde3f0e Cycles: Enable SSS and volumes for CUDA and Nvidia OpenCL split kernel 2017-03-10 02:09:41 -05:00
17689f8bb6 Fix T50904: Imprecise timeline frame selection using mouse
The changes introduced in rB3e628eefa9f55fac7b0faaec4fd4392c2de6b20e
made the non-subframe frame change behaviour less intuitive, by always
truncating downwards instead of rounding to the nearest frame.
This made the UI a lot less forgiving of pointing precision errors
(for example, as a result of hand shake, or using a tablet on a highres screen)

This commit restores the old behaviour in this case only (subframe inspection
isn't affected by these changes)
2017-03-10 15:07:17 +13:00
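
The difference between the two behaviours in the commit above can be shown with a small Python sketch. The function names are hypothetical, and using floor(x + 0.5) for "round to nearest" is an assumption about the exact rounding rule; the real implementation is not shown in this log.

```python
import math


def frame_truncated(mouse_frame):
    """Always truncate downwards, the less forgiving behaviour."""
    return math.floor(mouse_frame)


def frame_rounded(mouse_frame):
    """Round to the nearest whole frame, the restored behaviour."""
    # floor(x + 0.5) rather than round(), which rounds halves to even.
    return math.floor(mouse_frame + 0.5)


# A pointer position just short of frame 10.
pointer = 9.9
picked_truncated = frame_truncated(pointer)
picked_rounded = frame_rounded(pointer)
```

With truncation a pointer that is almost on frame 10 still selects frame 9, which is why small pointing errors were so noticeable.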
62cc226101 3D View: x-ray support for depth picking
Selection loop would draw the selection ignoring xray.
Now draw in a separate pass after clearing the depth buffer,
as with regular drawing.

Also disable depth sorting,
caller can sort the hit-list by depth if needed.
2017-03-10 05:00:49 +11:00
Hristo Gueorguiev
9de9f25b24 Cycles: add single program debug option for split kernel
Single program generally compiles kernels faster (2-3 times), loads faster,
takes less drive space (2-3 times), and reduces the number of cached kernels.
2017-03-09 17:09:37 +01:00
Hristo Gueorguiev
06c051363b Cycles: split kernel_shadow_blocked to AO & DL parts
Reduces memory allocation for split kernel.

This allows for faster rendering due to bigger global size,
especially when GPU memory is limited.

Performance results:

                         R9 290 total render time
                        Before    After   Change
BMW                      4:37      4:34   -1.1 %
Classroom               14:43     14:30   -1.5 %
Fishy Cat               11:20     11:04   -2.4 %
Koro                    12:11     12:04   -1.0 %
Pabellon Barcelona      22:01     20:44   -5.8 %
Pabellon Barcelona(*)   15:32     15:09   -2.5 %

(*) without glossy connected to volume
2017-03-09 17:09:37 +01:00
Hristo Gueorguiev
e8b5a5bf5b Cycles: Speedup transparent shadows in split kernel
This commit enables record-all transparent shadows rays.

Performance results:

               R9 290 render time (without synchronization), seconds
                        Before    After   Change
BMW                      261.5    262.5   +0.4 %
Classroom                869.6    867.3   -0.3 %
Fishy Cat                657.4    639.8   -2.7 %
Koro                    1909.8    692.8  -63.7 %
Pabellon Barcelona      1633.3   1238.0  -24.2 %
Pabellon Barcelona(*)   1158.1    903.8  -22.0 %

(*) without glossy connected to volume
2017-03-09 17:09:37 +01:00
Hristo Gueorguiev
57e26627c4 Cycles: SSS and Volume rendering in split kernel
Decoupled ray marching is not supported yet.

Transparent shadows are always enabled for volume rendering.

Changes in kernel/bvh and kernel/geom are from Sergey.
This simplifies code significantly, and prepares it for
record-all transparent shadow function in split kernel.
2017-03-09 17:09:37 +01:00
Dalai Felinto
6c942db30d Remove (ifdef) draw_documentation from text_draw.c
This was no longer supported.
2017-03-09 17:02:35 +01:00
88e8e7a074 3D View: wrap GPU_select cache calls
Avoids including GPU_select and makes it more clear that the cache is
needed for view3d_opengl_select calls.

Also use typed enum for select mode.
2017-03-09 20:47:37 +11:00
4ab322fdd2 3D View: use cache for armature select 2017-03-09 09:25:33 +11:00
c837bd5ea5 Cycles: Fix CUDA build error for some compilers
Needed to include `util_types.h` before using `uint`.
2017-03-08 16:44:43 -05:00
45b764e95b 3D View: new method of OpenGL selection
Intended to replace legacy GL_SELECT, without the limitations of
sample queries which can't access depth information.

This commit adds VIEW3D_SELECT_PICK_NEAREST and VIEW3D_SELECT_PICK_ALL
which access the depth buffers to detect what's under the pointer,
so initial selection is always the closest item.

The performance of this method depends a lot on the OpenGL
implementation's glReadPixels.

Since reading depth can be slow, buffers are cached for object picking
so selecting re-uses depth data, performing 1 draw instead of 3
(for 24, 18, 10 px regions, picking with many items under the pointer).

Occlusion queries draw twice when picking nearest,
so worst case 6x draw calls per selection.

Even with these improvements, occlusion queries are faster on AMD hardware.

Depth selection is disabled by default, toggle option under select method.
May enable by default if this works well on different hardware.

Reviewed as D2543
2017-03-09 06:22:02 +11:00
817e975dee Fix T50849: Transparent background produces artifacts in this compositing setup
The issue was caused by sometimes negative color returned by the filter node.

Seems to be caused by precision issues. Don't see any reason why we would want
negative colors in output. Those only cause issues later on.
2017-03-08 15:56:50 +01:00
97c4c2689f Cycles: Make the message more obvious about which initialization failed 2017-03-08 13:57:21 +01:00
05dfe9c318 Fix T49603: Blender/Cycles 2.78 CUDA error on Jetson-TX1~
Patch by Bruno d'Arcangeli (@arcangeli), thanks!
2017-03-08 13:38:01 +01:00
c24d045a23 OpenGL Select: integer rect for passing region 2017-03-08 23:23:39 +11:00
9af0c8b00a Cleanup: replace short -> int for selection hits 2017-03-08 23:23:39 +11:00
6f3f891c58 Rename BLI_rct*_init_pt_size -> radius 2017-03-08 23:23:39 +11:00
75cb4850f0 Cycles: Use 1-based line number for #line directives
AMD CPU platform was complaining about #line 0 directives in the code.
2017-03-08 12:45:18 +01:00
ecfbfe478b Cycles: Log which device kernels are being loaded for 2017-03-08 12:33:51 +01:00
712f7c3640 Cycles: Make it possible to access KernelGlobals from split data initialization function 2017-03-08 11:02:54 +01:00
ef7c36f5ed Cycles: Cleanup, remove residue of previous split kernel data
This is all in split data state array.
2017-03-08 10:26:29 +01:00
a095611eb8 Fix T50886: Blender crashes on render
Was a mistake in one of the previous TLS commits.

See comment in the pool_create to see some details why it was crashing.
2017-03-08 09:41:38 +01:00
3505be8361 update theme back to black re: T50869 2017-03-08 18:31:24 +11:00
64751552f7 Cycles: Fix indentation 2017-03-08 01:31:32 -05:00
fe7cc94dfa Cycles: Fix strict warning about unused variable 2017-03-08 01:31:32 -05:00
306034790f Cycles: Calculate size of split state buffer kernel side
By calculating the size of the state buffer in the kernel rather than the host
less code is needed and the size actually reflects the requested features.

Will also be a little faster in some cases because of larger global work size.
2017-03-08 01:31:30 -05:00
997e345bd2 Cycles: Fix crash after failed kernel build
Pointers to kernels were uninitialized leading to freeing of random memory
addresses. Another reason it would be good to use smart pointers.
2017-03-08 01:31:09 -05:00
18e50927f7 Cycles: Faster building of split kernel
Simple change to make it so that only kernels that have been modified are
rebuilt. Might only be useful during development.
2017-03-08 01:31:09 -05:00
223f45818e Cycles: Initialize rng_state for split kernel
Because the split kernel can render multiple samples in parallel it is
necessary to have everything initialized before rendering of any samples
begins. The code that normally handles initialization of
`rng_state` (`kernel_path_trace_setup()`) only does so for the first sample,
which was causing artifacts in the split kernel due to uninitialized
`rng_state` for some samples.

Note that because the split kernel can render samples in parallel this
means that the split kernel is incompatible with the LCG.
2017-03-08 01:31:09 -05:00
cd7d5669d1 Cycles: Remove sum_all_radiance kernel
This was only needed for the previous implementation of parallel samples. As
we don't have that any more it can be removed.

Real reason for removal tho is this: `per_sample_output_buffers` was being
calculated too small and artifacts resulted. The tile buffer is already
the correct size and calculating the size for `per_sample_output_buffers`
is a bit difficult with the current layout of the code. As
`per_sample_output_buffers` was only needed for `sum_all_radiance`,
removing that kernel and writing output to the tile buffer directly
fixes the artifacts.
2017-03-08 01:31:07 -05:00
4cf501b835 Cycles: Split path initialization into own kernel
This makes it easier to initialize things correctly in the data_init kernel
before they are needed by path tracing.
2017-03-08 01:30:43 -05:00
5b8f1c8d34 Cycles: Separate kernel loading time from render time 2017-03-08 01:24:55 -05:00
b78e543af9 Cycles: Add names to buffer allocations
This is to help debug and track memory usage for generic buffers. We
have similar for textures already since those require a name, but for
buffers the name is only for debugging purposes.
2017-03-08 01:24:55 -05:00
817873cc83 Cycles: CUDA implementation of split kernel 2017-03-08 01:24:53 -05:00
0892352bfe Cycles: CPU implementation of split kernel 2017-03-08 00:52:41 -05:00
352ee7c3ef Cycles: Remove ccl_fetch and SOA 2017-03-08 00:52:41 -05:00
a87766416f Cycles: Report device maximum allocation and detected global size 2017-03-08 00:52:41 -05:00
365a4239c5 Cycles: Workaround for driver hangs
Simple workaround for some issues we've been having with AMD drivers hanging
and rendering systems unresponsive. Unfortunately this makes things a bit
slower, but it's better than having to do hard reboots. Will be removed when
drivers have been fixed.

Define CYCLES_DISABLE_DRIVER_WORKAROUNDS to disable for testing purposes.
2017-03-08 00:52:41 -05:00
230c00d872 Cycles: OpenCL split kernel refactor
This does a few things at once:

- Refactors host side split kernel logic into a new device
  agnostic class `DeviceSplitKernel`.
- Removes tile splitting, a new work pool implementation takes its place and
  allows as many threads as will fit in memory regardless of tile size, which
  can give performance gains.
- Refactors split state buffers into one buffer, as well as reduces the
  number of arguments passed to kernels. Means there's less code to deal
  with overall.
- Moves kernel logic out of OpenCL kernel files so they can later be used by
  other device types.
- Replaced OpenCL specific APIs with new generic versions
- Tiles can now be seen updating during rendering
2017-03-08 00:52:41 -05:00
520b53364c Cycles: Add OpenCL kernel for zeroing memory buffers
Transferring memory to the device was very slow and there's really no
need when only zeroing a buffer.
2017-03-08 00:52:41 -05:00
dfd6055eb0 Cycles: Add more atomic operations 2017-03-08 00:52:41 -05:00
bc652766e8 Cycles: Expose passes size to device tasks
This is needed so devices can know the size of a tile buffer before any
tiles are acquired.
2017-03-08 00:52:41 -05:00
0f56f7a811 Cycles: Allow device_memory to be used directly
This is useful when there's no host side memory attached to the buffer.
2017-03-08 00:52:41 -05:00
9e566b06e3 Task scheduler: Add concept of suspended pools
Suspended pools allow pushing a huge amount of initial tasks
without any threading synchronization, and hence without its overhead.

This gives ~50% speedup of cached rigid body with file from
T50027 and seems to have no negative effect in other scenes
here.
2017-03-07 17:32:01 +01:00
347410a322 Depsgraph: Remove workarounds from depsgraph for keeping threads alive
This is something that should be done in the task scheduler instead
with local thread queues so we handle this in a single place.
2017-03-07 17:32:01 +01:00
55c2cd85f0 Task scheduler: Initial implementation of local tasks queues
The idea is to allow some amount of tasks to be pushed from a working
thread to its local queue, so we can acquire some work without taking
a full mutex lock.

This should allow us to remove some hacks from depsgraph which were
added there to keep threads alive.
2017-03-07 17:32:01 +01:00
2f722f1a49 Task scheduler: Use real pthread's TLS to access active thread's data
This allows us to avoid TLS stored in pool which gives us advantage of
using pre-allocated tasks pool for the pools created from non-main thread.

Even on systems with slow pthread TLS it should not be a problem because
we access it once at a pool construction time. If we want to use this more
often (for example, to get rid of push_from_thread) we'll have to do much
more accurate benchmark.
2017-03-07 17:32:01 +01:00
a07ad02156 Task scheduler: Refactor the way we store thread-specific data
Basically move all thread-specific data (currently it's only task
memory pool) from a dedicated array of taskScheduler to TaskThread.
This way we can add more thread-specific data in the future with
less of a hassle.
2017-03-07 17:32:01 +01:00
9522f8acf0 Task scheduler: Remove per-pool threads limit
This feature was adding extra complexity to task scheduling,
which required extra variables that had to be modified
atomically. This resulted in the following issues:

- More complex code to maintain, which increases risks of
  something going wrong when we modify the code.

- Extra barriers and/or locks during task scheduling, which
  causes extra threading overhead.

- Unable to use some other implementation (such as TBB) even for
  the comparison tests.

Notes about other changes.

There are two places where we really had to use that limit.

One of them is the single threaded dependency graph. This will
now construct a single-threaded scheduler at evaluation time.
This shouldn't be a problem because it only happens when using
debugging command line arguments and the code simply doesn't
run in regular Blender operation.

The code seems a bit duplicated here across old and new
depsgraph, but think it's OK since the old depsgraph is already
gone in 2.8 branch and i don't see where else we might want
to use such a single-threaded scheduler.

When/if we'll want to do so, we can move it to a centralized
single-threaded scheduler in threads.c.

OpenGL render was a bit more tricky to port, but basically we
are using condition variables to wait for the background thread to
do all the work.
2017-03-07 17:32:01 +01:00
35d78121f0 Fix typo in command line arg list 2017-03-07 09:07:58 -05:00
Julian Eisel
af076031d6 Update keymap presets for recent transform manipulator changes
Part of T50565.
2017-03-07 11:54:40 +01:00
Julian Eisel
ca796f872e Once more T50565: Allow using planar constraints for scale manipulator 2017-03-07 11:23:07 +01:00
15fa806160 Rigid body: fix viewport not updating on properties change. 2017-03-06 16:25:47 +01:00
f1c764fd8f Fix width calculation for split layouts 2017-03-06 16:35:56 +03:00
0e995e0bfe Cycles: Fix strict -Wpedantic warnings with GCC
Patch by Stefan Werner, thanks!
2017-03-06 14:18:26 +01:00
b498db06eb Task scheduler: Cleanup, use BLI_assert() instead of assert() 2017-03-06 11:33:27 +01:00
3623f32b48 FFmpeg: Update for the deprecated API in 3.2.x
Should be no functional changes.
2017-03-06 10:34:57 +01:00
355ad008a2 Surface Deform Modifier: Respect object transforms at bind time
This slightly changes SDef behavior, by now respecting object transforms
at bind time, thus not requiring the objects to be aligned in their
respective local spaces, but instead using world space.
2017-03-06 03:43:26 -03:00
Julian Eisel
80444effc6 Multi-View: Map cursor coordinates to visual coordinates
When rendering multi-view in side-by-side or top-bottom mode, we squash
the UI to half of its size and draw it twice on screen. That means the
cursor coordinates used for UI interaction don't match what's visible on
screen.
This commit is a little event system hack (tm) to fix this. It has some
small glitches with cursor grabbing, but nothing too bad.
We'll also use it for viewport HMD support.

D1350, thanks for the feedback @dfelinto!
2017-03-06 01:32:35 +01:00
e72af060ab CMake: confine WIN32 options 2017-03-06 04:05:00 +11:00
5f98cd6360 Cleanup: typos 2017-03-05 23:36:49 +11:00
a461216885 BMesh: Add 'cut' separate mode for intersect tool
It was only possible to separate all geometry from an intersection or none.

Made this into an enum with a 3rd option to 'Cut', (now default)
which keeps each side of the intersection separate
without splitting faces in half.
2017-03-05 23:36:46 +11:00
3caeb51d7f Fix T50855: Intersect (knife) w/o separate doesn't select 2017-03-05 22:28:16 +11:00
f75b52eca1 Fix T50843: Pitched Audio renders incorrectly in VSE
There was a bug in the intended code behaviour to always seek with a
pitch of 1.0 regardless of pitch/pitch animation/doppler effects.

Check the bug report for a more detailed explanation of problems
concerning pitch and seeking.
2017-03-05 12:19:32 +01:00
4a4d71414e BLI_rect: add init from point functions
Initialize a rectangle from point+size.
2017-03-05 20:51:23 +11:00
2089a17f7e Fix T50838: Surface Deform DM use after free issue
Implemented fix suggested by @sergey in T50838.
2017-03-04 03:16:50 -03:00
6b9d73e8a7 Cleanup: expose struct for ED_view3d_mats_rv3d_* 2017-03-04 13:32:40 +11:00
7b92b64742 Fix own previous commit, sorry about that :( 2017-03-03 17:23:22 +01:00
2e8398c095 Get rid of BLI_task_pool_stop().
Comments said that function was supposed to 'stop worker threads', but
it absolutely did not do anything like that, was merely wiping out TODO
queue of tasks from given pool (kind of subset of what
`BLI_task_pool_cancel()` does).

Misleading, and currently useless, we can always add it back if we need
it some day, but for now we try to simplify that area.
2017-03-03 17:16:39 +01:00
18c2a44333 Fix ugly mistake in BLI_task - freeing while some tasks are still being processed.
Freeing pool was calling `BLI_task_pool_stop()`, which only clears
pool's tasks that are in TODO queue, without ensuring no more tasks
from that pool are being processed in worker threads.

This could lead to random (and seldom) use-after-free crashes.

Now use `BLI_task_pool_cancel()` instead, which does wait for all tasks
being processed to finish before returning.
2017-03-03 17:12:03 +01:00
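The difference between the two calls described in 18c2a44333 can be sketched with a toy pool. This is a minimal Python illustration, not Blender's BLI_task code: `ToyPool`, `run_one`, `stop`, `cancel`, `running` and `demo` are hypothetical names, and the semantics are assumed from the commit message (stop only clears the TODO queue, cancel also waits for tasks that are already running).

```python
import threading
from collections import deque


class ToyPool:
    """Hypothetical stand-in for a task pool (not the BLI_task API)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._todo = deque()   # tasks not yet picked up by a worker
        self._running = 0      # tasks currently being processed

    def push(self, fn):
        with self._cond:
            self._todo.append(fn)

    def run_one(self):
        """Worker side: take one queued task, if any, and run it."""
        with self._cond:
            if not self._todo:
                return False
            fn = self._todo.popleft()
            self._running += 1
        try:
            fn()
        finally:
            with self._cond:
                self._running -= 1
                self._cond.notify_all()
        return True

    def stop(self):
        """Only wipes the TODO queue; running tasks are left alone."""
        with self._cond:
            self._todo.clear()

    def cancel(self):
        """Wipes the TODO queue, then waits for running tasks to finish."""
        with self._cond:
            self._todo.clear()
            while self._running > 0:
                self._cond.wait()

    def running(self):
        with self._cond:
            return self._running


def demo():
    pool = ToyPool()
    started = threading.Event()
    release = threading.Event()

    def slow_task():
        started.set()
        release.wait()

    pool.push(slow_task)
    worker = threading.Thread(target=pool.run_one)
    worker.start()
    started.wait()

    pool.stop()
    after_stop = pool.running()    # the task is still being processed

    # Let the task finish a little later, so cancel() really has to wait.
    threading.Timer(0.05, release.set).start()
    pool.cancel()
    after_cancel = pool.running()  # nothing is being processed any more

    worker.join()
    return after_stop, after_cancel
```

In this sketch, freeing the pool right after `stop()` would pull state out from under `slow_task`, while freeing after `cancel()` is safe because no task from the pool is still running.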
5f05dac28f Update comment which remained in an old place 2017-03-03 16:36:21 +01:00
17cf423f30 Cleanup: Indentation 2017-03-03 15:53:55 +01:00
91ce13e90d Fix T50842: NLA Influence Curve draws out of bounds when it exceeds the 0-1 range 2017-03-04 01:24:21 +13:00
c0d0ef142f Cleanup: GPU_select never took NULL rect 2017-03-03 22:24:08 +11:00
25de610876 Cleanup: redundant header, use const, short -> bool 2017-03-03 22:24:08 +11:00
cdfae957f2 When creating texture/image in Texture Paint mode, both datablocks should get the same name
The paint slot name was not the same as what is displayed on the texture properties panel.
Instead, the slot type (e.g. "Diffuse Color") was used as the name.

Patch by Suchaaver (@minifigmaster125) with minor changes from @mont29.

Reviewers: mont29, sergey

Maniphest Tasks: T50704

Differential Revision: https://developer.blender.org/D2523
2017-03-03 10:50:01 +01:00
810d7d4694 Cycles: Fix possibly uninitialized variable
Hopefully this was the reason for randomly disappearing textures in our renders.
2017-03-03 10:10:26 +01:00
df88d54284 Fix T49655: Reloading library breaks proxies.
Can't say enough how much I hate those proxies... their duality (sharing
some aspects of both direct *and* indirect users) is a nightmare to handle. :(
2017-03-03 08:52:19 +01:00
42cb93205c Fix own stupid mistake in recent mesh 'split_faces' rework.
Was assigning new edge index to ml_prev->e, and then assigning ml_prev->e
to orig_index...
2017-03-02 17:22:03 +01:00
Julian Eisel
a78717a72d Fix duplicated 'Accurate' property for manipulator keymap item
Is already added through Transform_Properties
2017-03-02 13:39:01 +01:00
Julian Eisel
e7dc46d278 Fix weird "use_planar_constraint" button in redo panel
Issue was that the VIEW_OT_manipulator operator calls the transform
operators and passes them its own operator properties. That means the
transform operator got properties passed that it doesn't have.
2017-03-02 13:37:42 +01:00
a83a68b9b6 Threads: Use atomics instead of spin when entering threaded malloc 2017-03-02 12:42:34 +01:00
87f8bb8d1d Fix another part of T50565: Planar constraints were always initialized to accurate transform
Now it is defined by keymap.
2017-03-02 12:18:07 +01:00
499faa8b11 Fix second part T50565: Using planar transform once makes it enabled by default
Was caused by property being saved by the operator manager.
2017-03-02 11:20:57 +01:00
856077618a Fix T50830: Wrong context when calling surfacedeform_bind
The custom poll function for surfacedeform_bind seems to have caused
issues when calling it from Python. Fixed by using the generic modifier
poll function, and setting the button to be active or not in the
Python UI code instead. (there might be a better way, but for now this
works fine)
2017-03-01 17:56:10 -03:00
193827e59b Correct comment
Thanks to @dingto for noticing.
2017-03-01 14:12:03 -05:00
49c99549eb Cleanup: Use .enabled instead of .active 2017-03-01 13:06:19 -05:00
278fce1170 Fix T50565: Planar constraints don't work properly with non-Blender key configurations
The issue was introduced by 4df75e5 and seems we just need to explicitly
add new keymap item now.

There is still some difference from old behavior, which is planar transform
is using precision movement since e138cde and here i don't see nice solution
currently: the change was requested here in the studio and it's just a
conflict in picking shift key for something which is not supposed to be
accurate.

At least now it's possible to invoke planar constraint and simply unhold
shift.
2017-03-01 18:00:54 +01:00
7fcae7ba60 Task scheduler: Remove query for the pool's number of threads
Not really happy of per-pool threads limit, need to find better
approach to that. But at least it's possible to get rid of half
of the nastiness here by removing getter which was only used in
an assert statement.

That piece of code was already well-tested, and this code becomes
obsolete in the new depsgraph and no longer exists in the blender
2.8 branch.
2017-03-01 18:00:54 +01:00
ecee40e919 All drop-down buttons should use the same width 2017-03-01 19:30:18 +03:00
714e85b534 Cleanup: code-style, duplicate header 2017-03-02 00:16:36 +11:00
32c5f3d772 Fix text and icon positioning issues 2017-03-01 16:11:21 +03:00
f0cf15b5c6 Task scheduler: Remove counter of done tasks
This was only used for progress report, and it's wrong because:

- Pool might in theory be re-used by different tasks
- We should not make any decision based on scheduling stats

Proper way is to take care of progress by the task itself.
2017-03-01 12:45:51 +01:00
351c9239ed Cleanup: Use explicit unsigned int in atomics 2017-03-01 12:01:19 +01:00
c1012c6c3a Cleanup: update copyright and Blender description 2017-02-28 12:04:43 -05:00
87f236cd10 Cycles: Fix division by zero in volume code which was producing -nan 2017-02-28 17:33:06 +01:00
efe78d824e Fix/workaround T48549: Crash baking high-to-low-poly normal map in cycles
For now only prevent crash.
2017-02-28 14:08:33 +01:00
a581b65822 Fix T49936: Cycles point density gets its bounding box from basis shape key 2017-02-28 12:41:56 +01:00
6d1ac79514 Cleanup: Grey --> Gray 2017-02-27 19:33:57 -05:00
4fa4132e45 Surface Deform Modifier (SDef)
Implementation of the SDef modifier, which allows meshes to be bound by
surface, thus allowing things such as cloth simulation proxies.

User documentation: https://wiki.blender.org/index.php/User:Lucarood/SurfaceDeform

Reviewers: mont29, sergey

Subscribers: Severin, dfelinto, plasmasolutions, kjym3

Differential Revision: https://developer.blender.org/D2462
2017-02-27 13:49:14 -03:00
cd5c853307 Fix memory leak when making duplicates real and parent had constraints
Thanks Bastien for help!
2017-02-27 17:46:41 +01:00
2342cd0a0f Fix/workaround T50677: Shrinkwrap constraint doesn't get updated when target mesh gets modified
Do a "full" update on leaving sculpt mode, so we are sure scene will be brought
to a consistent state.

Ideally we'll only do that when there are objects which depend on geometry
without re-calculating self geometry, but that's a bit tricky currently.
2017-02-27 16:27:53 +01:00
691ffb60b9 Similar to previous commit, but for object constraints 2017-02-27 16:19:52 +01:00
bf7006c15a Depsgraph: Shrinkwrap constraint actually depends on geometry 2017-02-27 16:00:39 +01:00
5acac13eb4 Cycles: Fix compilation error on vanilla Ubuntu 16.10
Patch by @swerner, thanks!
2017-02-27 15:22:51 +01:00
f1b21d5960 Fix T50634: Hair Primitive as Triangles + Hair shader with a texture = crash
Attributes were not resized after pushing new triangles to the mesh.
2017-02-27 15:21:14 +01:00
209a64111e Fix part of T50634: Hair Primitive as Triangles + Hair shader with a texture = crash
Wrong formula was used to calculate needed verts and tris to be reserved.
2017-02-27 15:21:14 +01:00
00ceb6d2f4 Cycles: Make it clearer that values never change by using const qualifier 2017-02-27 15:21:14 +01:00
f7d67835e9 Cleanup: typo in struct name 2017-02-28 00:38:33 +11:00
cc78690be3 Cycles: Forgot this in previous commit 2017-02-27 12:54:35 +01:00
238db604c5 Cycles: Add more logs about what's going on in shader optimization 2017-02-27 12:38:24 +01:00
845ba1a6fb Cycles: Experiment with replacing Sharp Glossy with GGX when Filter Glossy is used
The idea is to make it simpler to remove noise from scenes when some prop uses
Sharp glossy closure and causes noise in certain cases. Previously Sharp Glossy
was not affected by Filter Glossy at all, which was quite confusing.

Here is a file which demonstrates the issue: {F417797}

After applying the patch all the noise from the scene is gone.

This change also solves fireflies reported in T50700.

Reviewers: brecht, lukasstockner97

Differential Revision: https://developer.blender.org/D2416
2017-02-27 12:33:59 +01:00
406398213c Fix missing break setting curve auto-handles 2017-02-27 13:35:03 +11:00
631ecbc4ca Fix unreported bug: Ensure you have the correct array directory even after the dm->release(dm) 2017-02-26 14:16:54 -03:00
112e4de885 Improve add-on UI error message
Show the paths of the duplicate addons

D791 by @gregzaal
2017-02-27 03:57:11 +11:00
0561aa771b Cleanup: minor changes to array_store
- remove unused struct member.
- misleading variable name.
2017-02-26 15:29:09 +11:00
5c3216e233 Fix compiling after a0b8a9f 2017-02-25 14:58:08 +01:00
d66d5790e9 Fix (unreported) missing update when adding constraint from RNA. 2017-02-25 11:38:02 +01:00
94ca09e01c Fix rows with fixed last item (D2524) 2017-02-25 13:18:41 +03:00
2c4564b044 Alembic: avoid crashing when reading non-indexed UV params. 2017-02-25 07:08:42 +01:00
a0b8a9fe68 Alembic: addition of a scope timer to perform basic profiling. 2017-02-25 07:08:42 +01:00
8c5826f59a Fix T50698: Cycles baking artifacts with transparent surfaces. 2017-02-25 03:12:53 +01:00
15f1072ee2 Fix build error with macOS / clang / c++11. 2017-02-25 03:12:53 +01:00
caaf5f0a09 Fix T50757: Alembic, assign imported materials to the object data
instead of to the object itself.
2017-02-24 21:19:52 +01:00
9062c086b4 Fix T50676: Crash on closing while frameserver rendering.
Can't see any reason to call AUD exit early in WM_exit, that's a
low-level module that has no dependency on anything else in Blender, but
is dependency of some other parts of Blender, so it should rather be
exited late in the process!
2017-02-24 14:58:38 +01:00
1e29286c8c Cycles: Fix compilation warning with CUDA on OSX 2017-02-24 14:33:10 +01:00
f49e28bae7 Cycles: Fix non-zero exit status when rendering animation from CLI and running out of memory 2017-02-24 14:25:38 +01:00
4c164487bc Add "Gravitation" option to "Force" type force fields
This adds an option to force fields of type "Force", which enables the
simulation of gravitational behavior (dist^-2 falloff).

Patch by @AndreasE

Reviewers: #physics, LucaRood, mont29

Reviewed By: #physics, LucaRood, mont29

Tags: #physics

Differential Revision: https://developer.blender.org/D2389
2017-02-23 19:23:39 -03:00
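The dist^-2 falloff named in 4c164487bc can be sketched as a one-line scaling. This is only an illustration of inverse-square falloff, not the code from D2389: the function name, the unscaled behavior when the option is off, and the zero-distance guard are all assumptions.

```python
def force_field_strength(strength, distance, gravitation=False):
    """Return the field strength felt at `distance` from the field origin.

    Assumption: with the option off, strength is not scaled by distance here.
    """
    if not gravitation:
        return strength
    if distance <= 0.0:
        # Assumption: avoid dividing by zero at the field origin.
        return 0.0
    # Inverse-square law: doubling the distance quarters the strength.
    return strength * distance ** -2
```

With a strength of 8.0, a point at distance 2.0 feels 2.0 and a point at distance 4.0 feels 0.5, which is the behavior one expects from a gravitational source.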
29859d0d5e Fix some more minor issue with updated py doc generation. 2017-02-23 22:31:21 +01:00
c067f1d0d4 Fix stupid mistake in previous commit for release builds of API doc. 2017-02-23 22:08:01 +01:00
c7ad27fc07 Update py API doc generation tools to comply to new name scheme on server.
- for rc/release: /api/2.79c/, zip file named blender_python_reference_2.79c_release.zip
 - for dev: /api/master/, zip file named blender_python_reference_2_79_4.zip
2017-02-23 21:45:20 +01:00
6a249bb000 Usual UI messages fixes... 2017-02-23 21:10:43 +01:00
50328b41a7 Cycles: Fix compilation error on 32bit Linux 2017-02-23 17:30:26 +01:00
4e12113bea Cycles: Fix wrong render results with texture limit and half-float textures 2017-02-23 14:46:22 +01:00
13e075600a Cycles: Add utility function to convert float to half
Handles overflow and underflow, but not NaN/inf.
2017-02-23 14:42:06 +01:00
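A float-to-half conversion with the properties listed in 13e075600a can be sketched on the raw bit patterns. This is a Python illustration under stated assumptions, not the Cycles utility itself: the function names are hypothetical, underflow is assumed to flush to zero, overflow is assumed to clamp to the largest finite half (65504), and the mantissa is truncated rather than rounded.

```python
import struct


def float_to_half_bits(value):
    """Convert a float to the 16-bit pattern of an IEEE 754 half.

    Assumptions: flush-to-zero on underflow, clamp-to-max on overflow,
    truncated mantissa. NaN/inf get no special treatment.
    """
    # Reinterpret the value as a 32-bit single precision pattern.
    u = struct.unpack("<I", struct.pack("<f", value))[0]
    sign_bit = (u & 0x80000000) >> 16
    exponent_bits = u & 0x7F800000
    if exponent_bits < 0x38800000:
        # Smaller than the smallest normal half (2^-14): flush to zero.
        value_bits = 0
    elif exponent_bits > 0x47000000:
        # Larger than the half range: clamp to the largest finite half.
        value_bits = 0x7BFF
    else:
        # Drop 13 mantissa bits, then rebias the exponent from 127 to 15.
        value_bits = ((u & 0x7FFFFFFF) >> 13) - 0x1C000
    return sign_bit | value_bits


def half_bits_to_float(bits):
    """Decode a 16-bit half pattern using struct's 'e' format."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]
```

Because infinity has the largest possible exponent, this sketch maps it to 65504 like any other overflow, which is one way to read "but not NaN/inf".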
9eb647f1c8 Fix T50656: Compositing node editor is empty, no nodes can be added 2017-02-23 11:23:49 +01:00
60592f6778 Fix T50748: Render Time incorrect when refreshing rendered preview in GPU mode 2017-02-23 10:51:06 +01:00
9dd194716b Fix T50736: Zero streaks in Glare node.
Please never, ever use same DNA var for two different things. Even worse
if they do not have same type and ranges!

This is only ensuring issues (as described in report, but also if
animating both RNA props using same DNA var... yuck).

And we were not even saving any byte in DNA, could reuse some padding
there to store the two new needed vars (yes, two, since we cannot re-use
existing one if we want to keep backward *and* forward compatibility).
2017-02-23 10:39:51 +01:00
Julian Eisel
7359cc1060 Fix possible crash in various 3D View operators
Was actually harmless and not crashing, but I'd say more or less only
by luck: the NULL-check for region data would only evaluate to true for
the correct 3D View region. However, if we were to add region data to a
different region type in future, this would lead to undefined behavior
if executed in the wrong region.
2017-02-23 02:14:27 +01:00
43299f9465 Columns should be expandable by default 2017-02-23 00:06:54 +03:00
5e1d4714fe Fix T50745: Shape key editing on bezier objects broken with Rendered Viewport Shading
So... Curve+shapekey was even more broken than it looked, this report was
actually a nice crasher (immediate crash in an ASAN build when trying to
edit a curve shapekey with some viewport rendering enabled).

There were actually two different issues here.

I) The less critical: rB6f1493f68fe was not fully fixing issues from
T50614. More specifically, if you updated obdata from editnurb
*without* freeing editnurb afterwards, you had a 'restored' (to
original curve) editnurb, without the edited shapekey modifications
anymore. This was fixed by tweaking again `calc_shapeKeys()` behavior in
`ED_curve_editnurb_load()`.

II) The crasher: in `ED_curve_editnurb_make()`, the call to
`init_editNurb_keyIndex()` was directly storing pointers of obdata
nurbs. Since those get freed every time `ED_curve_editnurb_load()` is
executed, it easily ended up being pointers to freed memory. This was
fixed by copying those data, which implied more complex handling code
for editnurbs->keyindex, and some reshuffling of a few functions to
avoid duplicating things between editor's editcurve.c and BKE's curve.c

Note that the separation of functions between editors and BKE area for
curve could use a serious update, it's currently messy to say the least.
Then again, that area is due to rework since a long time now... :/

Finally, aligned 'for_render' curve evaluation to mesh one - now
editing a shapekey will show in rendered viewports, if it does have some
weight (exactly as with shapekeys of meshes).
2017-02-22 21:56:49 +01:00
b637db2a7a Cleanup: remove unused orig_nu from keyIndex ghash of editcurves. 2017-02-22 21:56:49 +01:00
99947e2943 Use new api doc links
Differential Revision: https://developer.blender.org/D2522
2017-02-22 11:19:30 -05:00
75cc33fa20 Fix Cycles still saving render output when error happened
This was fixed ages ago for the interface case but not for the
command line. The thing here is that currently external engines
are relying on reports system to indicate that error happened
so suppressing reports storage in the background mode prevented
render pipeline from detecting errors happened.

This is all weak and i don't like it, but this is better than
delivering black frames from the farm.
2017-02-22 13:06:24 +01:00
36c4fc1ea9 Cycles: Fix shading with autosmooth and custom normals
New logic of split_faces was leaving mesh in a proper state
from Blender's point of view, but Cycles wanted loop normals
to be "flushed" to vertex normals.

Now we do such a flush from Cycles side again, so we don't
leave bad meshes behind.

Thanks Bastien for assistance here!
2017-02-22 10:54:36 +01:00
2c30fd83f1 Cycles: Additionally report all OpenCL cflags
This way we can control exact spaces and such added to the cflags
which is crucial to troubleshoot certain drivers.
2017-02-22 10:06:02 +01:00
ae1c1cd8c0 Refactor Mesh split_faces() code to use loop normal spaces.
Finding which loop should share its vertex with which others is not easy
with regular Mesh data (mostly due to lack of advanced topology info, as
opposed with BMesh case).

Custom loop normals computing already does that - and can return 'loop
normal spaces', which among other things contain definitions of 'smooth
fans' of loops around vertices.

Using those makes it easy to find vertices (and then edges) that needs
splitting.

This commit also adds support of non-autosmooth meshes, where we want to
split out flat faces from smooth ones.
2017-02-22 09:40:46 +01:00
3622074bf7 Fix Drawing nested box layouts (D2508) 2017-02-21 21:02:56 +03:00
4e9b17da4c Cycles: Speedup by avoiding extra calculations in noise texture when unneeded
Noise texture is now faster when the color socket is unused. Potential for
speedup spotted by @nutel.

Some performance results:

                     Render Time Before    After    Difference
Gooseberry benchmark         47:51.34    45:55.57       -4%
Koro                         12:24.92    12:18.46     -0.8%
Simple cube (Color socket)      48.53       48.72     +0.3%
Simple cube (Fac socket)        48.74       32.78    -32.7%
Goethe displacement           1:21.18     1:08.47    -15.6%
Cycles brick displacement     3:02.38     2:16.76    -25.0%
Large displacement scene     23:54.12    20:09.62    -15.6%

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D2513
2017-02-21 07:24:33 -05:00
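The Difference column of the table in 4e9b17da4c is the relative change in render time. A small sketch of that arithmetic follows; the helper names are hypothetical, and the time format (`m:ss.ss` or plain seconds) is assumed from the table.

```python
def to_seconds(text):
    """Parse 'm:ss.ss' (or plain 'ss.ss') into seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60.0 + float(part)
    return seconds


def percent_change(before, after):
    """Relative change of render time in percent; negative means faster."""
    b = to_seconds(before)
    a = to_seconds(after)
    return (a - b) / b * 100.0
```

For example, `percent_change("47:51.34", "45:55.57")` gives roughly -4.03, in line with the -4% listed for the Gooseberry benchmark.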
34a502c16a Cleanup: use proper link to the api 2017-02-20 20:21:57 -05:00
696836af1d Fix T50718: Regression: Split Normals Render Problem with Cycles
The issue seems to be caused by vertex normal being re-calculated
to something else than loop normal, which also caused wrong loop
normals after re-calculation.

For now issue is solved by preserving CD_NORMAL for loops after
split_faces() is finished, so render engine can access original
proper value.
2017-02-20 11:56:02 +01:00
75ce4ebc12 Mesh faces split: Add missing vertex normal copy 2017-02-20 11:47:43 +01:00
333dc8d60f Fix T50719: Memory usage won't reset to zero while re-rendering on two video cards
Was only visible with Persistent Images option ON.
2017-02-20 11:02:19 +01:00
9992e6a169 Fix a few compiler warnings with macOS / clang. 2017-02-18 23:59:34 +01:00
3f5b2e2682 Fix T50564: 3D view panning with scroll wheel inconsistent with dragging. 2017-02-18 22:41:56 +01:00
6f1493f68f Fix T50614: Curve doesn't restore initial form after deleting all its shapekeys
Logic of handling shapekeys when entering and leaving edit mode for
curves was... utterly broken.

Was leaving actual curve data with edited shapekey applied to it.
2017-02-17 18:55:52 +01:00
31123f09cd Remove unused functions related to distance between BoundBox and ray 2017-02-17 09:49:20 -03:00
d41451a0ca Forgotten in last commit: Check the allocation 2017-02-16 23:41:38 -03:00
6c59a3b37a Do not release the arrays used in the parameters of the expanded functions of bvhutils
The release of these arrays should be the programmer's discretion since these arrays can continue to be used.

Only the expanded functions `bvhtree_from_mesh_edges_ex` and `bvhtree_from_mesh_looptri_ex` are currently being used in blender (in mesh_remap.c), and from what I could analyze, these changes can prevent a crash.
2017-02-16 22:55:01 -03:00
7819d36d4e Make File: Print 'blender.exe' at the end of the path to run from 2017-02-16 17:08:33 -05:00
99a6bbf7dd Cleanup: Spelling, Spaces --> Tabs, Whitespace 2017-02-16 17:06:03 -05:00
21eae869ad UI: Move 'relations extras' right below 'relations'
Differential Revision: https://developer.blender.org/D2218
2017-02-16 12:02:32 -05:00
306acb7dda Fix T50687: Cycles baking time estimate and progress bar doesn't work / progress when baking with high samples 2017-02-16 17:15:08 +01:00
26c8d559fe Register test for mesh.split_faces() 2017-02-16 15:36:00 +01:00
6468cb5f9c Faces split: Don't leave CD_NORMAL after split
This is supposed to be a temporary layer.

If someone needs loop normals after split it should explicitly
ask for that.
2017-02-16 11:00:17 +01:00
5cbaf56b26 Cycles tests: Commit blender.git side changes 2017-02-16 10:36:22 +01:00
fc185fb1d2 CDDM Copy: Only tag data layers dirty if we ignored tessellation data
This solves assert failure in CustomData_from_bmeshpoly() happening with
broom.blend file from barber shop SVN.
2017-02-16 09:55:44 +01:00
809ed38075 Cleanup: Indentation 2017-02-16 09:16:20 +01:00
781507d2dd Freestyle: Feature edge selection by nested object groups.
A group of object groups can be formed by means of the dupli_group option in
the Object properties window.  The present revision extends the Selection by
Group option in the Freestyle Line Set so as to support not only flat object
groups but also nested groups.
2017-02-16 10:53:11 +09:00
9b3d415f6a Fix more corner cases failing in mesh faces split
Now we handle properly case with edge-fan meshes, which should
fix bad topology calculated for cash register which was causing
crashes in the studio.
2017-02-15 23:09:31 +01:00
40e5bc15e9 Fix wrong edges created by split faces
We need to first split all vertices before we can reliably
check whether edge can be reused or not.

There is still a known issue happening with an edge-fan mesh
with some faces being on the same plane.
2017-02-15 21:41:25 +01:00
41e0085fd3 [Alembic] Fix msvc warning - C4138 '*/' found outside of comment 2017-02-15 12:40:41 -07:00
e22d4699cb Cycles: Cleanup, style 2017-02-15 20:33:49 +01:00
13d31b1604 Fix T50542: Wrong metadata frame when using OpenGL render 2017-02-15 17:09:49 +01:00
3e628eefa9 Motion blur investigation feature
This commit adds a way to debug Cycles motion blur issues which
are usually happening due to something crazy happening in between
of frames. Biggest trouble was that artists had no clue about
what's happening in subframes before they render. This is at
least inefficient workflow when dealing with motion blur shots
with complex animation.

Now there is an option in Time Line Editor which could be found
in View -> Show Subframe. This option will expose current frame
with its subframe to the time line editor header and it'll allow
scrubbing with a subframe precision in time line editor.

Please note that none of the tools in Blender are aware of
subframe, so they'll likely be using current integer frame still.

This is something we don't consider a bug for now, the whole
purpose for now is to give a tool for investigation. Eventually
we'll likely tweak all tools to be aware of subframe.

Hopefully now we can finish the movie here in the studio..
2017-02-15 16:19:05 +01:00
efbe47f9cd Fix T50662: Auto-split affects smooth mesh when it shouldn't
Seems to be a precision error comparing proper floating point
normal with the one coming from short.
2017-02-15 15:21:15 +01:00
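The kind of precision error described in efbe47f9cd can be reproduced with a small round trip. This is a sketch, not the actual fix: the function names are hypothetical, the 32767 scale factor is an assumption about how normals are packed into shorts, and the tolerance of 1e-4 is chosen only for illustration.

```python
def normal_to_short(normal):
    """Pack a float normal into signed 16-bit components (assumed scale)."""
    # int() truncates toward zero, like a C cast to short.
    return tuple(int(c * 32767.0) for c in normal)


def short_to_normal(packed):
    """Unpack short components back to floats."""
    return tuple(c * (1.0 / 32767.0) for c in packed)


def normals_match(a, b, eps=1e-4):
    """Compare two normals component-wise with a tolerance."""
    return all(abs(x - y) <= eps for x, y in zip(a, b))
```

A normal such as `(0.6, 0.0, 0.8)` does not survive the round trip bit for bit, so an exact comparison against the float normal reports a difference even on a smooth mesh, while a comparison with a tolerance does not.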
fe47163a1e Cycles: Fix CUDA compilation error after recent changes 2017-02-15 15:01:08 +01:00
20283bfa0b Fix wrong loop normals left after face splitting
Let's keep all data in a consistent state, so we don't have any
issues later on.

This solves rendering artifacts mentioned in the previous commit.
2017-02-15 14:58:49 +01:00
dd79f907a7 Mesh: Re-implement face split solving issue mentioned earlier
Now new edges will be properly created between original and
new split vertices.

Now topology is correct, but shading is still not quite right in
some special cases.
2017-02-15 14:49:42 +01:00
8b8c0d0049 Cycles: Don't calculate primitive time if BVH motion steps are not used
Solves memory regression with the default configuration.
2017-02-15 12:59:31 +01:00
6cdc954e8c Cycles: Pass special flag whether BVH motion steps are used
Doesn't currently change anything, but will be needed for some future
work here.

It uses existing padding in kernel BVH structure, so there is
nothing changed memory-wise.
2017-02-15 12:45:06 +01:00
dc7bbd731a Cycles: Fix wrong hair render results when using BVH motion steps
The issue here was mainly coming from minimal pixel width feature
which is quite commonly enabled in production shots.

This feature will use some probabilistic heuristic in the curve
intersection function to check whether we need to return intersection
or not. This probability is calculated for every intersection check.
Now, when we use multiple BVH nodes for curve primitives we increase
probability of that primitive to be considered a good intersection
for us. This is similar to increasing minimal width of curve.

What is worse here is that the change in the intersection probability
fully depends on the exact layout of the BVH, meaning the probability might
change differently depending on the view angle, the way the builder
binned the primitives and such. This makes it impossible to do a
simple check like dividing the probability by the number of BVH steps.

Other solution might have been to split BVH into fully independent
trees, but that will increase memory usage of all the static
objects in the scenes, which is also not something desirable.

For now the simplest but robust approach is used: store the BVH primitive
time and test it in the curve intersection functions. This solves the
regression, but has two downsides:

- Uses more memory.

  which isn't surprising, and ANY solution to this problem will
  use more memory.

  What we still have to do is to avoid this memory increase for
  cases when we don't use BVH motion steps.

- Reduces number of maximum available textures on pre-kepler cards.

  There is not much we can do here, hardware gets old but we need
  to move forward on more modern hardware..
2017-02-15 12:45:04 +01:00
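The "store primitive time and test it" approach from this commit can be sketched roughly as follows. This is an illustration only, not the Cycles kernel code: the function names and the representation of BVH entries as (primitive_id, (time_min, time_max)) pairs are assumptions made for the example.

```python
def primitive_active_at(prim_time, ray_time):
    """Return True if ray_time falls inside the primitive's time range."""
    time_min, time_max = prim_time
    return time_min <= ray_time <= time_max


def candidate_hits(entries, ray_time):
    """Keep only the BVH entries whose time range contains the ray time.

    `entries` is a list of (primitive_id, (time_min, time_max)) pairs,
    standing in for the BVH leaves a ray would visit.
    """
    return [prim for prim, prim_time in entries
            if primitive_active_at(prim_time, ray_time)]


# One curve referenced by three BVH motion-step entries.
entries = [(7, (0.0, 0.25)), (7, (0.25, 0.75)), (7, (0.75, 1.0))]
print(candidate_hits(entries, 0.5))
```

With the time test in place, a ray at a given time is checked against the curve through a single entry, so the probabilistic minimal-width check is not repeated once per motion step.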
088c6a17ba Cycles: Fix missing initialization of triangle BVH steps
Likely was harmless for Blender, but better be safe here.
2017-02-15 12:44:52 +01:00
5723aa8c02 Cycles: Fix wrong pointiness caused by precision issues 2017-02-15 12:40:13 +01:00
b36e26bbce Revert "Mesh: Solve incorrect result of mesh.split_faces()"
The change was delivering broken topology for certain cases.
The assumption that new edge only connects new vertices was
wrong.

Reverting to a commit which was giving correct render results
but was using more memory.

This reverts commit af1e48e8ab.
2017-02-15 12:40:13 +01:00
384b7e18f1 UI: Wireframe modifier- make crease grayed out when disabled 2017-02-14 23:48:35 -05:00
402b0aa59b Comments: notes on polyfill2d, minor corrections 2017-02-15 14:17:06 +11:00
af1e48e8ab Mesh: Solve incorrect result of mesh.split_faces()
This function was keeping original edges and was creating some
extra vertices, which is not something we really want.
2017-02-14 17:02:22 +01:00
737a3b8a0a Mesh: Cleanup, use shorter version of loop 2017-02-14 16:27:09 +01:00
324d057b25 Mesh: Use faster calculation of previous loop 2017-02-14 16:27:09 +01:00
4d325693e1 BKE_boundbox_ensure_minimum_dimensions is no longer necessary
The bug T46099 no longer applies since the addition of `dist_squared_to_projected_aabb_simple`.
Comments relating to an occlusion bug with the ruler have also been added. I'll investigate this.
2017-02-14 10:25:00 -03:00
6c104f62b9 transform_snap_object: Remove do_bb parameter. It is always true 2017-02-14 09:38:20 -03:00
54102ab36e Alembic: fix naming of imported transforms.
When importing an Alembic file with grouped transforms, it would badly name the transforms, taking the name of the parent instead of its own.

Patch by @maxime.robinot

Differential Revision: https://developer.blender.org/D2507
2017-02-14 08:15:13 +01:00
930186d3df Cycles: Optimize sorting of transparent intersections on CUDA 2017-02-13 18:24:45 +01:00
21dbfb7828 Cycles: Fix wrong transparent shadows with CUDA
Was a bug in recent optimization commit.
2017-02-13 18:22:10 +01:00
581c819013 Cycles: Fix wrong shading on GPU when background has NaN pixels and MIS enabled
Quite simple fix for now which only deals with this case. Maybe we want to do
some "clipping" on image load time so regular textures wouldn't give NaN as
well.
2017-02-13 16:32:55 +01:00
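The "clipping on image load time" idea mentioned in this commit could look roughly like the sketch below. It is purely hypothetical: the function name, the pixel representation as tuples of floats, and the choice of 0.0 as the replacement value are all assumptions, not what Cycles does.

```python
import math


def sanitize_pixels(pixels, fallback=0.0):
    """Replace NaN and infinite channel values with a finite fallback."""
    return [tuple(c if math.isfinite(c) else fallback for c in px)
            for px in pixels]


pixels = [(1.0, float("nan"), 0.5), (float("inf"), 0.0, 2.0)]
print(sanitize_pixels(pixels))
```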
81eee0f536 Cycles: Use fast math without finite optimization
This allows us to use faster math and still have reliable
isnan/isfinite tests.

Only do it for host side, kernels stay unchanged.

Thanks Lukas Stockner for the tip!
2017-02-13 16:25:35 +01:00
37afa965a4 Fix T50655: Pointiness is too slow to calculate
Optimize vertex de-duplication the same way as we do for Remove Doubles.
2017-02-13 12:00:10 +01:00
594015fb7e Cycles: Use Cycles-side mesh instead of C++ RNA
Those are now matching and it's faster to skip C++ RNA to
calculate pointiness.
2017-02-13 10:40:05 +01:00
9148ce9f3c F-Curve normalization: Do proper curve min/max instead of handle min/max
Would be cool to find some way to cache the results.
2017-02-13 10:02:04 +01:00
5552e83b53 Cycles: Don't use built-in API for image sequences in preview mode
Our Python API is not ready for such things at all. Better be slower
but more correct until we improve our API.
2017-02-11 22:24:59 +01:00
e76364adcd Image: Fix non-deterministic behavior of image sequence loading
The issue was caused by usage of non-initialized image user, which
could have different settings, causing some random image being loaded
or not loaded at all.

This caused non-deterministic behavior of Cycles image loading because
it was querying image information from several places.

This fixes crash reported in T50616, but it's not a complete fix
because preview rendering in material is wrong (same wrong as in
2.78a release).
2017-02-11 22:19:49 +01:00
1ac6e4c7a2 UI: Redesign the VSE multicam strip
Idea from https://rightclickselect.com/p/sequencer/zfbbbc/sequencer-panels-update by @pauloup

|{F434631}|{F434624}|
|Before |After|

Test file:
{F434643}
2017-02-11 11:35:02 -05:00
3ede515b5b Use dummy versioning numbers for missing libraries.
We now assert that we know the file version of libraries (needed for
do_version after linking step), so for missing libraries, set dummy
numbers (using version of main .blend file actually).
2017-02-10 22:50:45 +01:00
9d8a9cacc3 De-duplicate min/max calculation in F-Curve normalization 2017-02-10 18:10:26 +01:00
e33e58bf23 CTests: Initial work to cover Cycles nodes with OpenGL tests
Works similar to regular Cycles tests, just does OpenGL render to
get output image.

Seems to work fine with the only funny effect: Blender window will
pop up for each of the tests. This is current limitation of our
OpenGL context. Might be changed in the future.
2017-02-10 14:52:54 +01:00
e991af0934 Cleanup: Trailing whitespace 2017-02-10 14:08:12 +01:00
cd4309ced0 Cycles: Cleanup, move EdgeMap to blender_util
It's a better place for such a utility structure. Still not fully ideal though.
2017-02-10 13:34:10 +01:00
0178915ce9 Cycles: Make an utility class for edge map
Simplifies some logic.
2017-02-10 13:34:09 +01:00
fd7e9f7974 Cycles: Fix pointiness attribute giving wrong results with autosplit
Basically made the algorithm to handle vertices with the same coordinate
as a single vertex.
2017-02-10 13:34:09 +01:00
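Treating vertices with the same coordinate as a single vertex, as this commit describes, can be sketched with a coordinate-keyed dictionary. This is a simplified illustration with a hypothetical function name; it uses exact coordinate equality, whereas the real code may compare with a tolerance.

```python
def merge_coincident_vertices(coords):
    """Map every vertex index to the first index sharing its coordinate."""
    first_seen = {}
    remap = []
    for index, co in enumerate(coords):
        # setdefault returns the stored index if this coordinate was seen.
        remap.append(first_seen.setdefault(co, index))
    return remap


# Vertices 2 and 3 duplicate vertices 0 and 1, as after a face split.
coords = [(0, 0, 0), (1, 0, 0), (0, 0, 0), (1, 0, 0), (2, 0, 0)]
print(merge_coincident_vertices(coords))
```

A per-vertex attribute can then be accumulated on the representative index and copied back to every duplicate, so split vertices end up with the same value.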
d395d81bfc Cycles: Cleanup: Use less indentation by inverting condition 2017-02-10 13:34:09 +01:00
0b65b889ef Cycles: Calculate all vertex attribute after faces generation
This way the calculation is not spread over multiple places.
2017-02-10 13:34:09 +01:00
b26da8b467 Cycles: Cleanup: use vector instead of bare malloc
This way memory is more "manageable" and easier to follow.
2017-02-10 13:34:09 +01:00
b929eef8c5 Alembic: fixed mistake in bounding box computation
By performing the Z-up to Y-up conversion, the change in sign of the
Z-coordinate swaps "minimum" and "maximum".
2017-02-10 11:54:00 +01:00
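The min/max swap described in this commit can be reproduced with a small sketch. The axis mapping (x, y, z) -> (x, z, -y) is an assumption about the Z-up to Y-up conversion, and the Python functions below are stand-ins written for illustration, not the Alembic exporter's code.

```python
def yup_from_zup(zup):
    """Convert a Z-up point to Y-up, assuming (x, y, z) -> (x, z, -y)."""
    x, y, z = zup
    return (x, z, -y)


def bbox_yup_from_zup(bb_min, bb_max):
    """Convert a Z-up bounding box to Y-up, re-deriving min/max per axis."""
    a = yup_from_zup(bb_min)
    b = yup_from_zup(bb_max)
    new_min = tuple(min(p, q) for p, q in zip(a, b))
    new_max = tuple(max(p, q) for p, q in zip(a, b))
    return new_min, new_max


bb_min, bb_max = (-1, -2, -3), (4, 5, 6)
naive_min = yup_from_zup(bb_min)  # (-1, -3, 2): last axis is no longer the minimum
naive_max = yup_from_zup(bb_max)  # (4, 6, -5)
print(bbox_yup_from_zup(bb_min, bb_max))
```

Converting the two corners independently leaves the negated axis with its minimum larger than its maximum, which is why the corners have to be re-sorted per axis after the conversion.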
38155c7d3c Do not override text 2017-02-09 16:25:04 -05:00
bb1367cdaf Fix T50629 -- Add remove doubles to the cleanup menu
Also move it up in the vertices menu
2017-02-09 16:18:33 -05:00
e523cde574 Cleanup: Remove commented code
Code has been commented from before 2010 and relates to old Background image code.
2017-02-09 09:26:57 -05:00
d2f4900d1a Use a smaller cross icon for clearing search box contents 2017-02-09 19:08:58 +13:00
351eb4fad1 More tweaks to Normalisation options in Graph Editor
* Added a new dedicated icon for normalize
* Only use an icon for "Auto"
2017-02-09 18:59:51 +13:00
316d23f2ba Graph Editor: Replace Normalise/Auto checkboxes with toggle buttons
These take less space, fit in better with rest of the UI, and make their relationship clearer
2017-02-09 17:10:49 +13:00
117d90b3da Fix: GPencil delete operators did not respect color locking 2017-02-09 17:10:48 +13:00
b16fd22018 Cycles: Fix regression with transparent shadows in volume 2017-02-08 14:00:48 +01:00
da31a82832 Cycles: Solve speed regression by casting opaque ray first 2017-02-08 14:00:48 +01:00
04cf1538b5 Cycles: Fix compilation error on OpenCL 2017-02-08 14:00:48 +01:00
31a025f51e Cycles: Split shadow functions to avoid some duplicated calculations 2017-02-08 14:00:48 +01:00
dde40989f3 Cycles: Store shadow intersections in the kernel globals
Seems CUDA failed to de-duplicate the array across multiple inlined
versions of the shadow_blocked(). Helped it a bit with that now.

Gives about 100MB memory improvement on scenes after the previous
commit and brings the memory "regression" down to only 100MB compared to
the master branch now.
2017-02-08 14:00:48 +01:00
7447950bc3 Cycles: Speedup transparent shadows on CUDA
This commit enables record-all behavior of transparent shadows
rays.

Render times difference goes as following:

               GTX 1080 render time
BMW                  -0.5%
Fishy Cat            -0.0%
Pabellon Barcelona   -11.6%
Classroom            +1.2%
Koro                 -58.6%

Kernel will now use some extra VRAM memory to store the intersection
array (200MB on my configuration). This we can optimize out with some
further commits.
2017-02-08 14:00:48 +01:00
9830eeb44b Cycles: Implement record-all transparent shadow function for GPU
The idea is to record all possible transparent intersections when
shooting a transparent ray on GPU (similar to what we were already
doing on CPU).

This avoids the need for whole ray-to-scene intersection queries
for each intersection and greatly speeds up cases like transparent
hair, at the cost of extra memory.

This commit lays the groundwork for now, and the feature is kept
disabled until some further tweaks.
2017-02-08 14:00:48 +01:00
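The record-all idea can be sketched as a toy model: all hits along the shadow ray are collected in one pass, sorted by distance, and then attenuated in order. Everything here is a simplification made for illustration. The function name is hypothetical, hits are reduced to (distance, transmittance) pairs, and a transmittance of 0.0 stands for an opaque surface.

```python
def shadow_throughput(hits):
    """Attenuate a shadow ray through hits recorded in a single traversal.

    `hits` is a list of (distance, transmittance) pairs. A transmittance
    of 0.0 represents an opaque surface that blocks the light entirely.
    """
    throughput = 1.0
    for _, transmittance in sorted(hits):
        throughput *= transmittance
        if throughput == 0.0:
            break  # fully blocked, remaining hits do not matter
    return throughput


print(shadow_throughput([(3.0, 0.5), (1.0, 0.8)]))
print(shadow_throughput([(1.0, 0.5), (2.0, 0.0), (3.0, 0.9)]))
```

For scalar transmittance the sort does not change the product. It is kept in the sketch because order starts to matter once effects between surfaces, such as volume attenuation, are taken into account.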
9c3d202e56 Cycles: Use an utility function to sort intersections array 2017-02-08 14:00:48 +01:00
58a10122d0 Cycles: Make GPU version of shadow_blocked() closer to CPU
Now we break the traversal cycle and then perform volume attenuation
and check with zero throughput. Not sure it makes any measurable sense
at this moment, but in the future it might help de-duplicating some
extra logic here.
2017-02-08 14:00:48 +01:00
98a1855803 Cycles: De-duplicate transparent shadows attenuation
Fair amount of code was duplicated for CPU and GPU, now we are
using inlined function to avoid such duplication.
2017-02-08 14:00:48 +01:00
8cda364d6f Fix T49249: Alembic export with multiple hair systems crashes Blender
Removed unnecessary call to DM_update_tessface_data(). This call is
already performed by DM_ensure_tessface(dm). The call being performed
twice caused a failing BLI_assert().

Reviewed by: Kévin Dietrich
2017-02-08 12:26:36 +01:00
ac38d5652b Alembic export: avoid infinite loops trying to find parent objects.
Also added some assertions for debugging purposes

Reviewed by: Kévin Dietrich
2017-02-08 12:26:36 +01:00
95e7f93fa2 Alembic export: only create transform writer if the object should be exported
Reviewed by: Kévin Dietrich
2017-02-08 12:26:36 +01:00
b320873382 Alembic: #undef'ed the correct macro
TEST_RET is not defined anywhere in Blender's sources, and LAYER_CMP
is no longer used after this function ends.
2017-02-08 12:26:36 +01:00
ce9df09067 Alembic: Use getXForm() in check, because it's used in rest of the function too
This makes the code within the function consistent.
2017-02-08 12:26:36 +01:00
82df7100c8 Alembic: Renamed copy_zup_yup to copy_yup_from_zup (and same for zup_from_yup)
With the new names the arguments (yup, zup) are in the same order as
they appear in the function name. The old names used copy_src_dst(dst,
src), which I found very confusing. Furthermore, now it is clear from
where to where the copy is made.

This makes the function names a little bit longer, though. If that is
a real issue, we can just name them zup_from_yup(zup, yup).

Reviewed by: Kévin Dietrich
2017-02-08 12:26:36 +01:00
69dbeeca48 Cleanup: Use const qualifier in some of color management code 2017-02-07 17:49:54 +01:00
b641d016e1 Sequencer: Some extra speedup in color space conversion
Use the new utility from color management which multi-threads byte to
float conversion.

Gives extra 10% speedup from quick tests.
2017-02-07 17:49:54 +01:00
ce629c5dd9 Color management: Add utility function to convert byte to float with processor applied 2017-02-07 17:49:54 +01:00
e5bb005369 Sequencer: Speedup conversion to sequencer space
Speedup is mainly gained by multi-threading. Gives about 3x
fps gain on an edit shot file.

There is still some room for improvements, will happen in one
of the upcoming commits.
2017-02-07 17:49:54 +01:00
5d6177111d Color management: Implement threaded byte buffer conversion
The title says it all actually: now we can convert byte buffer
directly, without need of temporary float buffer.
2017-02-07 17:49:54 +01:00
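Splitting a byte buffer into chunks and converting each chunk in a worker can be sketched as below. This only illustrates the chunking pattern: the function names and chunk size are assumptions, the real code also applies a color processor, and Python threads will not give the speedup that native threads do.

```python
from concurrent.futures import ThreadPoolExecutor


def bytes_to_floats(chunk):
    """Convert a chunk of 8-bit values to floats in the 0..1 range."""
    return [b / 255.0 for b in chunk]


def convert_threaded(byte_buffer, chunk_size=4096, workers=4):
    """Convert a byte buffer to floats, one chunk per task."""
    chunks = [byte_buffer[i:i + chunk_size]
              for i in range(0, len(byte_buffer), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, so pixels stay in place.
        converted = pool.map(bytes_to_floats, chunks)
        return [value for chunk in converted for value in chunk]


buffer = bytes(range(256)) * 10
floats = convert_threaded(buffer, chunk_size=100)
print(len(floats), floats[0], floats[255])
```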
03be3102c7 Param is_cached not being used in bvhtree_from_mesh_edges_setup_data
This could cause bugs in the memory release
2017-02-07 11:03:10 -03:00
03544eccb4 Fix missing hair after rendering with different viewport/render settings
Derived mesh for particles did not include tessellated faces when it
was expected to. Now added explicit function to copy CDDM with tess
faces without need to re-tessellate the result.
2017-02-07 14:21:29 +01:00
53896d4235 Fix T49253: Cycles blackbody is wrong on AVX2 CPU on Windows
Seems to be bug in optimizer, but managed to reshuffle in a way
which should also give some speedup.
2017-02-07 13:05:19 +01:00
1158800d1b PIL_time_utildefines: also show total time in TIMEIT_AVERAGED. 2017-02-07 10:14:46 +01:00
f7eaaf35b4 Fix (unreported) Object previews being written even for skipped objects. 2017-02-06 20:58:18 +01:00
e217839fd3 Cleanup writefile code a bit.
Modernize some of it a bit, saves quite some lines of blabla (using
while instead of for loops... tsssts...).
2017-02-06 20:43:14 +01:00
dbdc346e9f CMake: Remove MOTO library dependency when it is not needed
It is not necessary to add MOTO library dependency when we use
WITH_IK_SOLVER (now it uses Eigen) or we use WITH_MOD_BOOLEAN (it was
used by bsp intern library some time ago but it is not present in the
code anymore).

Reviewers: mont29, sergey

Subscribers: mont29, sergey

Differential Revision: https://developer.blender.org/D2477
2017-02-06 19:29:42 +01:00
0170c682fe Specify the correct size of the BVHTree of edges
~edge_num~ edges_num_active
Not always all the edges enter in the build
2017-02-06 14:59:31 -03:00
e3f99329d8 Standardization and style for BKE_bvhutils
Add `bvhtree_from_mesh_edges_ex` and callbacks to nearest_to_ray (Similar to the other functions of this code)
2017-02-06 14:11:06 -03:00
ac8348d033 Fix 'public' global 'g_atexit' var in Blender.
No reason to not make this private to this file, and it gave conflict
when using bpy as module and loading it in a GLib application (which
also has a g_atexit var).
2017-02-06 17:42:30 +01:00
9e97b00873 Fix compilation error after recent change 2017-02-06 15:29:13 +01:00
c7f40caa2c Add shortcuts for unsigned int, short, long and char
Feel free to use those in the new code.

And stay away from simple "unsigned".
2017-02-06 15:04:13 +01:00
c5cc9e046d Use hash instead of linear lookup in armature deform
This avoids calling linear lookup 100s of time when dealing with
real-life character.

Still some tweaks possible.
2017-02-06 14:47:36 +01:00
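The difference between a linear lookup and a hash lookup can be sketched as follows. The function names and the bone-name example are hypothetical and only illustrate the general pattern of building the table once and reusing it for every query.

```python
def find_bone_linear(bone_names, name):
    """Scan the whole list for every query: O(n) per lookup."""
    for index, candidate in enumerate(bone_names):
        if candidate == name:
            return index
    return -1


def build_bone_lookup(bone_names):
    """Build a name -> index table once, so later lookups are O(1)."""
    return {name: index for index, name in enumerate(bone_names)}


bone_names = ["root", "spine", "neck", "head"]
lookup = build_bone_lookup(bone_names)
print(find_bone_linear(bone_names, "neck"), lookup.get("neck", -1))
```

The table costs one pass over the names up front, which pays off as soon as the same set of names is queried many times, for example once per vertex group.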
d0015cba02 Multi-thread displace modifier
The title says it all actually. Use BLI task to loop over vertices
and distort their locations. Gives 2x FPS increase in a file with
just time-dependent displace modifier on my desktop.
2017-02-06 14:21:29 +01:00
89f3837d68 Displace modifier: Use special version of texture sampling
This version will give fewer spin locks and is now well-tested by render engines.

This should reduce amount of threading overhead when having multiple objects
with displace modifier enabled.

In the future this will also help us threading the modifier.

There are more modifiers which could benefit from this, but let's first
investigate the new behavior with one of them.
2017-02-06 12:37:08 +01:00
385fe4f0ce Add special texture sampling function which takes image pool argument
Using image pool will reduce number of thread locks when acquiring image.
Useful when it's needed to sample texture fewzillion times a second.
2017-02-06 12:23:03 +01:00
223aff987a Fix memory leak when building without audaspace 2017-02-06 11:18:20 +01:00
Phil Christensen
351c409317 C++ conformance fixes (MSVC /permissive-)
We (the Microsoft C++ team) use the Blender project as part of our "Real world code" tests.
I noticed a place in WIN32 specific code (dvpapi.cpp:85) where a string literal is losing
its const-ness when being passed to BLI_dynlib_open().  This is not permitted when using the
/permissive- conformance compiler switch (see our blog
https://blogs.msdn.microsoft.com/vcblog/2016/11/16/permissive-switch/)

My suggested fix is to add const and propagate it where needed.  Another possible fix would be
to explicitly cast away the const.

Reviewers: mont29, sergey, LazyDodo

Subscribers: Blendify, sergey, mont29, LazyDodo

Tags: #platform:_windows

Differential Revision: https://developer.blender.org/D2495
2017-02-06 10:44:56 +01:00
22156d951d fix T50602: Avoid crash when executing transform_snap_context_project_view3d_mixed with dist_px NULL 2017-02-06 01:01:39 -03:00
da08aa4b96 Cleaning of the last commit: lack of attention with the debug of time X(
This was a stupid mistake
2017-02-04 19:06:41 -03:00
75aa866211 Optimize BVHTree creation of vertices that have BLI_bitmap test
Instead of referencing the vertex first and testing the bitmap afterwards, test the bitmap first and reference the vertex after.

In a mesh with 31146 vertices and the entire bitmap disabled, the loop time is 243% faster
With all bitmap enabled, the time becomes 463473% faster!!!

One possible reason for this huge difference in performance is that the compiler may not be inlining the function "BM_vert_at_index" (I don't know if buildbot does this, but it's worth investigating).
2017-02-04 19:01:29 -03:00
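The reordering described in this commit, testing the cheap bitmap before doing the more expensive vertex lookup, can be sketched like this. The names are hypothetical, and the lookup counter exists only to make the saved work visible in the example.

```python
def collect_enabled(get_vertex, mask):
    """Look up only the vertices whose mask bit is set."""
    result = []
    for index, enabled in enumerate(mask):
        if not enabled:
            continue  # cheap test first, skip the lookup entirely
        result.append(get_vertex(index))
    return result


lookups = []


def get_vertex(index):
    """Stand-in for an expensive vertex lookup that records each call."""
    lookups.append(index)
    return (float(index), 0.0, 0.0)


mask = [True, False, False, True, False]
vertices = collect_enabled(get_vertex, mask)
print(vertices, lookups)
```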
47caf343c0 fix T50592: Scene.raycast not working
Ray_start and ray_normal values were being ignored
2017-02-04 18:17:15 -03:00
a2c469edc2 Fix (unreported) crash in new snap code.
Looks like `object_map` and `mem_arena` may be NULL sometimes...

Also, cleaned up function pointers declaration of Nearest2dUserData,
those were raising warnings in gcc. Please, *always* use typedef defined
prototypes for function pointers, it is sooooo much cleaner and clearer
that way. And easy to convert from compatible functions too.
2017-02-04 21:51:27 +01:00
6663099810 Fix T50590: BI lamp doesn't hold a texture in this case.
BKE_lamp_free was somehow missing the refactor of datablocks handling
(which, among other things, completely separated ID refcounting and
linking management from ID freeing itself).

Either forgot during development, or lost during merge...
2017-02-04 21:31:52 +01:00
c367e23d46 Snap System: Use callbacks to differentiate how vertices of DerivedMeshes and BMeshes are referenced
Previously the object type was passed in the `userdata`, and the same function branched on the type to obtain the coordinates of the vertices
2017-02-03 20:08:57 -03:00
a0561a05ef Remove flag: SNAP_OBJECT_USE_CACHE from snap_context
Since the cache is created in one way or another, this flag is not really making a difference
More details here: D2496
2017-02-03 19:03:31 -03:00
21f3767809 fix T46892: snap to closest point now works with Individual Origins
The code looks for the closest element between its centers. In the case of islands, the center of each vertex is the center of the island.
The solution here is to skip the search for islands when the operation is translation
2017-02-03 13:15:44 -03:00
0b4a9caf51 Forgotten in commit ddf99214dc
In object mode, the rotation matrix needs to be restored to the initial value if a snap point is not found
2017-02-03 12:57:02 -03:00
0e459ad1a3 Buildbot: Re-enable cuda support for OSX 2017-02-03 16:11:05 +01:00
52696a0d3f Fix T50125: Shortcut keys missing in menus for Clear Location, Rotation, and Scale.
Menu entries and shortcuts did not have exact same behavior, now they do
(using shortcuts' behavior).
2017-02-03 16:10:00 +01:00
f3a7104adb Fix T49860: Copying vgroups between objects sharing the same obdata was not possible.
Pretty straight forward actually, just do not bother about obdata part
of vgroups in that case, only copy object part of it.

And let's curse once again this stuff spread across several types of
data-blocks...
2017-02-03 15:47:44 +01:00
a1820afa30 Depsgraph: Add some extra debug prints on eval 2017-02-03 14:05:59 +01:00
030e99588d Tests: Use proper order for EXPECT_EQ() 2017-02-03 12:03:59 +01:00
aea17a612d Tests: Use EXPECT_FALSE() instead of EXPECT_EQ(foo, false) 2017-02-03 11:52:47 +01:00
dc1b45ff1a Tests: Use EXPECT_TRUE() instead of EXPECT_EQ(foo, true) 2017-02-03 11:52:29 +01:00
e1e85454ea Cycles: Cleanup, order of arguments to EXPECT_EQ
The order was wrong from the semantic point of view, caused
by some legacy workarounds in Libmv. Didn't realize it was
not how things were expected to be used.
2017-02-03 11:35:34 +01:00
103f2655ab Explode modifier: Don't tessellate DM if we are not going to apply modifier 2017-02-03 11:03:47 +01:00
ddf99214dc fix T49494: snap_align_rotation should use a local pivot to make the transformation
The problem was simple, just transform the global coordinates of t->tsnap.snapTarget to local coordinates.
(Some comments were added to the code)
2017-02-03 02:27:57 -03:00
813 changed files with 25533 additions and 14703 deletions

4
.gitmodules vendored
View File

@@ -2,15 +2,19 @@
path = release/scripts/addons
url = ../blender-addons.git
ignore = all
branch = master
[submodule "release/scripts/addons_contrib"]
path = release/scripts/addons_contrib
url = ../blender-addons-contrib.git
ignore = all
branch = master
[submodule "release/datafiles/locale"]
path = release/datafiles/locale
url = ../blender-translations.git
ignore = all
branch = master
[submodule "source/tools"]
path = source/tools
url = ../blender-dev-tools.git
ignore = all
branch = master

View File

@@ -445,6 +445,7 @@ option(WITH_BOOST "Enable features depending on boost" ON)
# Unit testsing
option(WITH_GTESTS "Enable GTest unit testing" OFF)
option(WITH_OPENGL_TESTS "Enable OpenGL related unit testing (Experimental)" OFF)
# Documentation
@@ -518,18 +519,20 @@ endif()
option(WITH_LEGACY_DEPSGRAPH "Build Blender with legacy dependency graph" ON)
mark_as_advanced(WITH_LEGACY_DEPSGRAPH)
# Use hardcoded paths or find_package to find externals
option(WITH_WINDOWS_FIND_MODULES "Use find_package to locate libraries" OFF)
mark_as_advanced(WITH_WINDOWS_FIND_MODULES)
if(WIN32)
# Use hardcoded paths or find_package to find externals
option(WITH_WINDOWS_FIND_MODULES "Use find_package to locate libraries" OFF)
mark_as_advanced(WITH_WINDOWS_FIND_MODULES)
option(WITH_WINDOWS_CODESIGN "Use signtool to sign the final binary." OFF)
mark_as_advanced(WITH_WINDOWS_CODESIGN)
option(WITH_WINDOWS_CODESIGN "Use signtool to sign the final binary." OFF)
mark_as_advanced(WITH_WINDOWS_CODESIGN)
set(WINDOWS_CODESIGN_PFX CACHE FILEPATH "Path to pfx file to use for codesigning.")
mark_as_advanced(WINDOWS_CODESIGN_PFX)
set(WINDOWS_CODESIGN_PFX CACHE FILEPATH "Path to pfx file to use for codesigning.")
mark_as_advanced(WINDOWS_CODESIGN_PFX)
set(WINDOWS_CODESIGN_PFX_PASSWORD CACHE STRING "password for pfx file used for codesigning.")
mark_as_advanced(WINDOWS_CODESIGN_PFX_PASSWORD)
set(WINDOWS_CODESIGN_PFX_PASSWORD CACHE STRING "password for pfx file used for codesigning.")
mark_as_advanced(WINDOWS_CODESIGN_PFX_PASSWORD)
endif()
# avoid using again
option_defaults_clear()
@@ -924,7 +927,7 @@ if(WITH_X11)
if(WITH_X11_ALPHA)
find_library(X11_Xrender_LIB Xrender ${X11_LIB_SEARCH_PATH})
mark_as_advanced(X11_Xrender_LIB)
if (X11_Xrender_LIB)
if(X11_Xrender_LIB)
list(APPEND PLATFORM_LINKLIBS ${X11_Xrender_LIB})
else()
set(WITH_X11_ALPHA OFF)

View File

@@ -1,4 +1,4 @@
# -*- mode: gnumakefile; tab-width: 8; indent-tabs-mode: t; -*-
# -*- mode: gnumakefile; tab-width: 4; indent-tabs-mode: t; -*-
# vim: tabstop=4
#
# ##### BEGIN GPL LICENSE BLOCK #####
@@ -113,7 +113,7 @@ CMAKE_CONFIG = cmake $(BUILD_CMAKE_ARGS) \
# X11 spesific
ifdef DISPLAY
CMAKE_CONFIG_TOOL = cmake-gui
else
else
CMAKE_CONFIG_TOOL = ccmake
endif
@@ -127,7 +127,7 @@ all: .FORCE
# # if test ! -f $(BUILD_DIR)/CMakeCache.txt ; then \
# # $(CMAKE_CONFIG); \
# # fi
# # do this always incase of failed initial build, could be smarter here...
@$(CMAKE_CONFIG)

View File

@@ -360,7 +360,7 @@ OPENVDB_FORCE_REBUILD=false
OPENVDB_SKIP=false
# Alembic needs to be compiled for now
ALEMBIC_VERSION="1.6.0"
ALEMBIC_VERSION="1.7.1"
ALEMBIC_VERSION_MIN=$ALEMBIC_VERSION
ALEMBIC_FORCE_BUILD=false
ALEMBIC_FORCE_REBUILD=false
@@ -2236,9 +2236,6 @@ compile_ALEMBIC() {
return
fi
compile_HDF5
PRINT ""
# To be changed each time we make edits that would modify the compiled result!
alembic_magic=2
_init_alembic
@@ -2266,6 +2263,12 @@ compile_ALEMBIC() {
cmake_d="-D CMAKE_INSTALL_PREFIX=$_inst"
# Without Boost or TR1, Alembic requires C++11.
if [ "$USE_CXX11" != true ]; then
cmake_d="$cmake_d -D ALEMBIC_LIB_USES_BOOST=ON"
cmake_d="$cmake_d -D ALEMBIC_LIB_USES_TR1=OFF"
fi
if [ -d $INST/boost ]; then
cmake_d="$cmake_d -D BOOST_ROOT=$INST/boost"
cmake_d="$cmake_d -D USE_STATIC_BOOST=ON"
@@ -2285,8 +2288,6 @@ compile_ALEMBIC() {
cmake_d="$cmake_d -D USE_STATIC_HDF5=OFF"
cmake_d="$cmake_d -D ALEMBIC_ILMBASE_LINK_STATIC=OFF"
cmake_d="$cmake_d -D ALEMBIC_SHARED_LIBS=OFF"
cmake_d="$cmake_d -D ALEMBIC_LIB_USES_BOOST=ON"
cmake_d="$cmake_d -D ALEMBIC_LIB_USES_TR1=OFF"
INFO "ILMBASE_ROOT=$INST/openexr"
fi
@@ -4252,7 +4253,7 @@ print_info() {
PRINT " $_3"
_buildargs="$_buildargs $_1 $_2 $_3"
if [ -d $INST/osl ]; then
_1="-D CYCLES_OSL=$INST/osl"
_1="-D OSL_ROOT_DIR=$INST/osl"
PRINT " $_1"
_buildargs="$_buildargs $_1"
fi

View File

@@ -4,10 +4,10 @@
# <pep8 compliant>
# List of the branches being built automatically overnight
NIGHT_SCHEDULE_BRANCHES = [None]
NIGHT_SCHEDULE_BRANCHES = [None, "blender2.8"]
# List of the branches available for force build
FORCE_SCHEDULE_BRANCHES = ["master", "gooseberry", "experimental-build"]
FORCE_SCHEDULE_BRANCHES = ["master", "blender2.8", "experimental-build"]
"""
Stock Twisted directory lister doesn't provide any information about last file
@@ -127,7 +127,14 @@ def schedule_force_build(name):
project=forcesched.FixedParameter(name="project", default="", hide=True)),
# For now, hide other codebases.
forcesched.CodebaseParameter(hide=True, codebase="blender-translations"),
forcesched.CodebaseParameter(hide=True, codebase="blender-addons"),
forcesched.CodebaseParameter(
codebase="blender-addons",
branch=forcesched.ChoiceStringParameter(
name="branch", choices=["master", "blender2.8"], default="master"),
repository=forcesched.FixedParameter(name="repository", default="", hide=True),
project=forcesched.FixedParameter(name="project", default="", hide=True),
revision=forcesched.FixedParameter(name="revision", default="", hide=True),
),
forcesched.CodebaseParameter(hide=True, codebase="blender-addons-contrib"),
forcesched.CodebaseParameter(hide=True, codebase="blender-dev-tools"),
forcesched.CodebaseParameter(hide=True, codebase="lib svn")],
@@ -139,11 +146,15 @@ def schedule_build(name, hour, minute=0):
scheduler_name = "nightly " + name
if current_branch:
scheduler_name += ' ' + current_branch
# Use special addons submodule branch when building blender2.8 branch.
addons_branch = "master"
if current_branch == "blender2.8":
addons_branch = "blender2.8"
c['schedulers'].append(timed.Nightly(name=scheduler_name,
codebases={
"blender": {"repository": ""},
"blender-translations": {"repository": "", "branch": "master"},
"blender-addons": {"repository": "", "branch": "master"},
"blender-addons": {"repository": "", "branch": addons_branch},
"blender-addons-contrib": {"repository": "", "branch": "master"},
"blender-dev-tools": {"repository": "", "branch": "master"},
"lib svn": {"repository": "", "branch": "trunk"}},
@@ -225,8 +236,7 @@ def git_step(branch=''):
def git_submodules_update():
command = ['git', 'submodule', 'foreach', '--recursive',
'git', 'pull', 'origin', 'master']
command = ['git', 'submodule', 'update', '--remote']
return ShellCommand(name='Submodules Update',
command=command,
description='updating',
@@ -235,7 +245,10 @@ def git_submodules_update():
def lib_svn_step(dir):
return SVN(name='lib svn',
name = "lib svn"
if dir == "darwin":
name = "C++11 lib svn"
return SVN(name=name,
baseURL='https://svn.blender.org/svnroot/bf-blender/%%BRANCH%%/lib/' + dir,
codebase='lib svn',
mode='update',
@@ -264,6 +277,9 @@ def generic_builder(id, libdir='', branch='', rsync=False):
f = BuildFactory()
if libdir != '':
f.addStep(lib_svn_step(libdir))
# Special trick to make sure we always have all the libs.
if libdir.startswith("darwin"):
f.addStep(lib_svn_step("darwin"))
for submodule in ('blender-translations',
'blender-addons',
@@ -286,7 +302,7 @@ def generic_builder(id, libdir='', branch='', rsync=False):
f.addStep(FileUpload(name='upload',
slavesrc='buildbot_upload.zip',
masterdest=filename,
maxsize=150 * 1024 * 1024,
maxsize=180 * 1024 * 1024,
workdir='install'))
f.addStep(MasterShellCommand(name='unpack',
command=['python2.7', unpack_script, filename],

View File

@@ -67,6 +67,9 @@ def get_platform(filename):
def get_branch(filename):
if filename.startswith("blender-2.8"):
return "blender2.8"
tokens = filename.split("-")
branch = ""

View File

@@ -72,10 +72,8 @@ if 'cmake' in builder:
# Set up OSX architecture
if builder.endswith('x86_64_10_6_cmake'):
cmake_extra_options.append('-DCMAKE_OSX_ARCHITECTURES:STRING=x86_64')
cmake_extra_options.append('-DCUDA_NVCC_EXECUTABLE=/usr/local/cuda8-hack/bin/nvcc')
cmake_extra_options.append('-DWITH_CODEC_QUICKTIME=OFF')
cmake_extra_options.append('-DCMAKE_OSX_DEPLOYMENT_TARGET=10.6')
build_cubins = False
elif builder.startswith('win'):
@@ -93,7 +91,6 @@ if 'cmake' in builder:
elif builder.startswith('win32'):
bits = 32
cmake_options.extend(['-G', 'Visual Studio 12 2013'])
cmake_extra_options.append('-DCUDA_NVCC_EXECUTABLE:FILEPATH=C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v8.0/bin/nvcc.exe')
elif builder.startswith('linux'):
tokens = builder.split("_")
@@ -113,8 +110,6 @@ if 'cmake' in builder:
cuda_chroot_name = 'buildbot_' + deb_name + '_x86_64'
targets = ['player', 'blender', 'cuda']
cmake_extra_options.append('-DCUDA_NVCC_EXECUTABLE=/usr/local/cuda-8.0/bin/nvcc')
cmake_options.append("-C" + os.path.join(blender_dir, cmake_config_file))
# Prepare CMake options needed to configure cuda binaries compilation.

View File

@@ -111,7 +111,8 @@ if builder.find('cmake') != -1:
if builder.endswith('vc2015'):
platform += "-vc14"
builderified_name = 'blender-{}-{}-{}'.format(blender_full_version, git_hash, platform)
if branch != '':
# NOTE: Blender 2.8 is already respected by blender_full_version.
if branch != '' and branch != 'blender2.8':
builderified_name = branch + "-" + builderified_name
os.rename(result_file, "{}.zip".format(builderified_name))
@@ -177,7 +178,8 @@ if builder.find('cmake') != -1:
blender_hash,
blender_glibc,
blender_arch)
if branch != '':
# NOTE: Blender 2.8 is already respected by blender_full_version.
if branch != '' and branch != 'blender2.8':
package_name = branch + "-" + package_name
upload_filename = package_name + ".tar.bz2"
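
Note on the hunk above: the branch prefix is now skipped for 'blender2.8', since the hunk's own comment says that branch is already reflected in blender_full_version. A minimal sketch of that rule, assuming a hypothetical helper name (prefixed_name) that does not exist in the buildbot scripts:

```python
def prefixed_name(base_name, branch):
    """Prefix base_name with the branch, except for '' and 'blender2.8'.

    Hypothetical helper for illustration only; the real scripts inline
    this condition instead of calling a function.
    """
    if branch != '' and branch != 'blender2.8':
        return branch + "-" + base_name
    return base_name


if __name__ == "__main__":
    # Example names are made up for the demonstration.
    print(prefixed_name("blender-2.78-abc123-linux", "my_branch"))
    print(prefixed_name("blender-2.80-abc123-linux", "blender2.8"))
```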

View File

@@ -56,7 +56,7 @@ if(EXISTS ${SOURCE_DIR}/.git)
string(REGEX REPLACE "[\r\n]+" ";" _git_contains_branches "${_git_contains_branches}")
string(REGEX REPLACE ";[ \t]+" ";" _git_contains_branches "${_git_contains_branches}")
foreach(_branch ${_git_contains_branches})
if (NOT "${_branch}" MATCHES "\\(HEAD.*")
if(NOT "${_branch}" MATCHES "\\(HEAD.*")
set(MY_WC_BRANCH "${_branch}")
break()
endif()

View File

@@ -1574,24 +1574,24 @@ macro(openmp_delayload
endmacro()
MACRO(WINDOWS_SIGN_TARGET target)
if (WITH_WINDOWS_CODESIGN)
if (!SIGNTOOL_EXE)
if(WITH_WINDOWS_CODESIGN)
if(!SIGNTOOL_EXE)
error("Codesigning is enabled, but signtool is not found")
else()
if (WINDOWS_CODESIGN_PFX_PASSWORD)
if(WINDOWS_CODESIGN_PFX_PASSWORD)
set(CODESIGNPASSWORD /p ${WINDOWS_CODESIGN_PFX_PASSWORD})
else()
if ($ENV{PFXPASSWORD})
if($ENV{PFXPASSWORD})
set(CODESIGNPASSWORD /p $ENV{PFXPASSWORD})
else()
message( FATAL_ERROR "WITH_WINDOWS_CODESIGN is on but WINDOWS_CODESIGN_PFX_PASSWORD not set, and environment variable PFXPASSWORD not found, unable to sign code.")
message(FATAL_ERROR "WITH_WINDOWS_CODESIGN is on but WINDOWS_CODESIGN_PFX_PASSWORD not set, and environment variable PFXPASSWORD not found, unable to sign code.")
endif()
endif()
add_custom_command(TARGET ${target}
POST_BUILD
COMMAND ${SIGNTOOL_EXE} sign /f ${WINDOWS_CODESIGN_PFX} ${CODESIGNPASSWORD} $<TARGET_FILE:${target}>
VERBATIM
)
POST_BUILD
COMMAND ${SIGNTOOL_EXE} sign /f ${WINDOWS_CODESIGN_PFX} ${CODESIGNPASSWORD} $<TARGET_FILE:${target}>
VERBATIM
)
endif()
endif()
ENDMACRO()

View File

@@ -1,5 +1,7 @@
set(PROJECT_DESCRIPTION "Blender is a very fast and versatile 3D modeller/renderer.")
set(PROJECT_COPYRIGHT "Copyright (C) 2001-2012 Blender Foundation")
string(TIMESTAMP CURRENT_YEAR "%Y")
set(PROJECT_DESCRIPTION "Blender is the free and open source 3D creation suite software.")
set(PROJECT_COPYRIGHT "Copyright (C) 2001-${CURRENT_YEAR} Blender Foundation")
set(PROJECT_CONTACT "foundation@blender.org")
set(PROJECT_VENDOR "Blender Foundation")
@@ -38,8 +40,8 @@ unset(MY_WC_HASH)
# Force Package Name
execute_process(COMMAND date "+%Y%m%d" OUTPUT_VARIABLE CPACK_DATE OUTPUT_STRIP_TRAILING_WHITESPACE)
string(TOLOWER ${PROJECT_NAME} PROJECT_NAME_LOWER)
if (MSVC)
if ("${CMAKE_SIZEOF_VOID_P}" EQUAL "8")
if(MSVC)
if("${CMAKE_SIZEOF_VOID_P}" EQUAL "8")
set(PACKAGE_ARCH windows64)
else()
set(PACKAGE_ARCH windows32)
@@ -48,7 +50,7 @@ else(MSVC)
set(PACKAGE_ARCH ${CMAKE_SYSTEM_PROCESSOR})
endif()
if (CPACK_OVERRIDE_PACKAGENAME)
if(CPACK_OVERRIDE_PACKAGENAME)
set(CPACK_PACKAGE_FILE_NAME ${CPACK_OVERRIDE_PACKAGENAME}-${PACKAGE_ARCH})
else()
set(CPACK_PACKAGE_FILE_NAME ${PROJECT_NAME_LOWER}-${MAJOR_VERSION}.${MINOR_VERSION}.${PATCH_VERSION}-git${CPACK_DATE}.${BUILD_REV}-${PACKAGE_ARCH})
@@ -135,4 +137,3 @@ unset(MINOR_VERSION)
unset(PATCH_VERSION)
unset(BUILD_REV)

View File

@@ -33,7 +33,7 @@ endmacro()
macro(windows_find_package package_name
)
if(WITH_WINDOWS_FIND_MODULES)
find_package( ${package_name})
find_package(${package_name})
endif(WITH_WINDOWS_FIND_MODULES)
endmacro()

View File

@@ -681,7 +681,7 @@ Image classes
.. attribute:: zbuff
Use depth component of render as grey scale color - suitable for texture source.
Use depth component of render as grayscale color - suitable for texture source.
:type: bool
@@ -817,7 +817,7 @@ Image classes
.. attribute:: zbuff
Use depth component of viewport as grey scale color - suitable for texture source.
Use depth component of viewport as grayscale color - suitable for texture source.
:type: bool
@@ -1260,8 +1260,8 @@ Filter classes
.. class:: FilterGray
Filter for gray scale effect.
Proportions of R, G and B contributions in the output gray scale are 28:151:77.
Filter for grayscale effect.
Proportions of R, G and B contributions in the output grayscale are 28:151:77.
.. attribute:: previous

View File

@@ -427,9 +427,9 @@ if BLENDER_REVISION != "Unknown":
BLENDER_VERSION_DOTS += " " + BLENDER_REVISION # '2.62.1 SHA1'
BLENDER_VERSION_PATH = "_".join(blender_version_strings) # '2_62_1'
if bpy.app.version_cycle == "release":
BLENDER_VERSION_PATH = "%s%s_release" % ("_".join(blender_version_strings[:2]),
bpy.app.version_char) # '2_62_release'
if bpy.app.version_cycle in {"rc", "release"}:
# '2_62a_release'
BLENDER_VERSION_PATH = "%s%s_release" % ("_".join(blender_version_strings[:2]), bpy.app.version_char)
# --------------------------DOWNLOADABLE FILES----------------------------------
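
Note on the hunk above: release candidates now get the same "_release" path as final releases, and the path includes the version character. A small sketch of the resulting naming, assuming a hypothetical function name (version_path) and plain arguments standing in for the bpy.app values:

```python
def version_path(version_strings, version_char, version_cycle):
    """Return the documentation path suffix for a version.

    version_strings stands in for blender_version_strings, and the other
    two arguments for bpy.app.version_char / bpy.app.version_cycle.
    """
    if version_cycle in {"rc", "release"}:
        return "%s%s_release" % ("_".join(version_strings[:2]), version_char)
    return "_".join(version_strings)


if __name__ == "__main__":
    print(version_path(["2", "62", "1"], "a", "release"))
    print(version_path(["2", "62", "1"], "", "alpha"))
```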

View File

@@ -96,6 +96,11 @@ def main():
rsync_base = "rsync://%s@%s:%s" % (args.user, args.rsync_server, args.rsync_root)
blenver = blenver_zip = ""
api_name = ""
branch = ""
is_release = False
# I) Update local mirror using rsync.
rsync_mirror_cmd = ("rsync", "--delete-after", "-avzz", rsync_base, args.mirror_dir)
subprocess.run(rsync_mirror_cmd, env=dict(os.environ, RSYNC_PASSWORD=args.password))
@@ -108,19 +113,24 @@ def main():
subprocess.run(doc_gen_cmd)
# III) Get Blender version info.
blenver = blenver_zip = ""
getver_file = os.path.join(tmp_dir, "blendver.txt")
getver_script = (""
"import sys, bpy\n"
"with open(sys.argv[-1], 'w') as f:\n"
" f.write('%d_%d%s_release\\n' % (bpy.app.version[0], bpy.app.version[1], bpy.app.version_char)\n"
" if bpy.app.version_cycle in {'rc', 'release'} else '%d_%d_%d\\n' % bpy.app.version)\n"
" f.write('%d_%d_%d' % bpy.app.version)\n")
" is_release = bpy.app.version_cycle in {'rc', 'release'}\n"
" branch = bpy.app.build_branch.split()[0].decode()\n"
" f.write('%d\\n' % is_release)\n"
" f.write('%s\\n' % branch)\n"
" f.write('%d.%d%s\\n' % (bpy.app.version[0], bpy.app.version[1], bpy.app.version_char)\n"
" if is_release else '%s\\n' % branch)\n"
" f.write('%d_%d%s_release' % (bpy.app.version[0], bpy.app.version[1], bpy.app.version_char)\n"
" if is_release else '%d_%d_%d' % bpy.app.version)\n")
get_ver_cmd = (args.blender, "--background", "-noaudio", "--factory-startup", "--python-exit-code", "1",
"--python-expr", getver_script, "--", getver_file)
subprocess.run(get_ver_cmd)
with open(getver_file) as f:
blenver, blenver_zip = f.read().split("\n")
is_release, branch, blenver, blenver_zip = f.read().split("\n")
is_release = bool(int(is_release))
os.remove(getver_file)
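
Note on step III above: the helper script now writes four newline-separated fields (release flag, branch, directory name, zip name) with no trailing newline, and the caller splits them back apart. The round trip can be sketched outside Blender as follows; the function names are hypothetical and the arguments stand in for the bpy.app attributes used by the real script:

```python
def format_version_info(version, version_char, version_cycle, branch):
    """Build the four-field text that the getver script writes."""
    is_release = version_cycle in {"rc", "release"}
    fields = [
        "%d" % is_release,  # '1' or '0'
        branch,
        "%d.%d%s" % (version[0], version[1], version_char) if is_release else branch,
        ("%d_%d%s_release" % (version[0], version[1], version_char)
         if is_release else "%d_%d_%d" % tuple(version)),
    ]
    return "\n".join(fields)


def parse_version_info(text):
    """Split the text back into (is_release, branch, blenver, blenver_zip)."""
    is_release, branch, blenver, blenver_zip = text.split("\n")
    return bool(int(is_release)), branch, blenver, blenver_zip


if __name__ == "__main__":
    text = format_version_info((2, 78, 4), "c", "release", "master")
    print(parse_version_info(text))
```

Because the unpacking expects exactly four fields, a branch name containing a newline or a trailing newline in the file would raise ValueError in this sketch.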
# IV) Build doc.
@@ -132,7 +142,7 @@ def main():
os.chdir(curr_dir)
# V) Cleanup existing matching dir in server mirror (if any), and copy new doc.
api_name = "blender_python_api_%s" % blenver
api_name = blenver
api_dir = os.path.join(args.mirror_dir, api_name)
if os.path.exists(api_dir):
shutil.rmtree(api_dir)
@@ -150,19 +160,15 @@ def main():
os.rename(zip_path, os.path.join(api_dir, "%s.zip" % zip_name))
# VII) Create symlinks and html redirects.
#~ os.symlink(os.path.join(DEFAULT_SYMLINK_ROOT, api_name, "contents.html"), os.path.join(api_dir, "index.html"))
os.symlink("./contents.html", os.path.join(api_dir, "index.html"))
if blenver.endswith("release"):
symlink = os.path.join(args.mirror_dir, "blender_python_api_current")
if is_release:
symlink = os.path.join(args.mirror_dir, "current")
os.remove(symlink)
os.symlink("./%s" % api_name, symlink)
with open(os.path.join(args.mirror_dir, "250PythonDoc/index.html"), 'w') as f:
f.write("<html><head><title>Redirecting...</title><meta http-equiv=\"REFRESH\""
"content=\"0;url=../%s/\"></head><body>Redirecting...</body></html>" % api_name)
else:
symlink = os.path.join(args.mirror_dir, "blender_python_api_master")
os.remove(symlink)
os.symlink("./%s" % api_name, symlink)
elif branch == "master":
with open(os.path.join(args.mirror_dir, "blender_python_api/index.html"), 'w') as f:
f.write("<html><head><title>Redirecting...</title><meta http-equiv=\"REFRESH\""
"content=\"0;url=../%s/\"></head><body>Redirecting...</body></html>" % api_name)

View File

@@ -1,5 +1,5 @@
Project: OpenCL Wrangler
URL: https://github.com/OpenCLWrangler/clew
License: Apache 2.0
Upstream version: 309a653
Upstream version: 27a6867
Local modifications: None

View File

@@ -369,7 +369,7 @@ typedef unsigned int cl_GLenum;
#endif
/* Define basic vector types */
/* WOrkaround for ppc64el platform: conflicts with bool from C++. */
/* Workaround for ppc64el platform: conflicts with bool from C++. */
#if defined( __VEC__ ) && !(defined(__PPC64__) && defined(__LITTLE_ENDIAN__))
#include <altivec.h> /* may be omitted depending on compiler. AltiVec spec provides no way to detect whether the header is required. */
typedef vector unsigned char __cl_uchar16;
@@ -2765,11 +2765,40 @@ CLEW_FUN_EXPORT PFNCLGETGLCONTEXTINFOKHR __clewGetGLContextInfoKH
#define CL_DEVICE_GPU_OVERLAP_NV 0x4004
#define CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV 0x4005
#define CL_DEVICE_INTEGRATED_MEMORY_NV 0x4006
#define CL_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT_NV 0x4007
#define CL_DEVICE_PCI_BUS_ID_NV 0x4008
#define CL_DEVICE_PCI_SLOT_ID_NV 0x4009
/*********************************
* cl_amd_device_attribute_query *
*********************************/
#define CL_DEVICE_PROFILING_TIMER_OFFSET_AMD 0x4036
#define CL_DEVICE_TOPOLOGY_AMD 0x4037
#define CL_DEVICE_BOARD_NAME_AMD 0x4038
#define CL_DEVICE_GLOBAL_FREE_MEMORY_AMD 0x4039
#define CL_DEVICE_SIMD_PER_COMPUTE_UNIT_AMD 0x4040
#define CL_DEVICE_SIMD_WIDTH_AMD 0x4041
#define CL_DEVICE_SIMD_INSTRUCTION_WIDTH_AMD 0x4042
#define CL_DEVICE_WAVEFRONT_WIDTH_AMD 0x4043
#define CL_DEVICE_GLOBAL_MEM_CHANNELS_AMD 0x4044
#define CL_DEVICE_GLOBAL_MEM_CHANNEL_BANKS_AMD 0x4045
#define CL_DEVICE_GLOBAL_MEM_CHANNEL_BANK_WIDTH_AMD 0x4046
#define CL_DEVICE_LOCAL_MEM_SIZE_PER_COMPUTE_UNIT_AMD 0x4047
#define CL_DEVICE_LOCAL_MEM_BANKS_AMD 0x4048
#define CL_DEVICE_THREAD_TRACE_SUPPORTED_AMD 0x4049
#define CL_DEVICE_GFXIP_MAJOR_AMD 0x404A
#define CL_DEVICE_GFXIP_MINOR_AMD 0x404B
#define CL_DEVICE_AVAILABLE_ASYNC_QUEUES_AMD 0x404C
#ifndef CL_DEVICE_TOPOLOGY_TYPE_PCIE_AMD
#define CL_DEVICE_TOPOLOGY_TYPE_PCIE_AMD 1
typedef union
{
struct { cl_uint type; cl_uint data[5]; } raw;
struct { cl_uint type; cl_char unused[17]; cl_char bus; cl_char device; cl_char function; } pcie;
} cl_device_topology_amd;
#endif
/*********************************
* cl_arm_printf extension

View File

@@ -15,7 +15,7 @@
typedef HMODULE CLEW_DYNLIB_HANDLE;
#define CLEW_DYNLIB_OPEN LoadLibrary
#define CLEW_DYNLIB_OPEN LoadLibraryA
#define CLEW_DYNLIB_CLOSE FreeLibrary
#define CLEW_DYNLIB_IMPORT GetProcAddress
#else
@@ -223,7 +223,7 @@ int clewInit()
__clewSetCommandQueueProperty = (PFNCLSETCOMMANDQUEUEPROPERTY )CLEW_DYNLIB_IMPORT(module, "clSetCommandQueueProperty");
#endif
__clewCreateBuffer = (PFNCLCREATEBUFFER )CLEW_DYNLIB_IMPORT(module, "clCreateBuffer");
__clewCreateSubBuffer = (PFNCLCREATESUBBUFFER )CLEW_DYNLIB_IMPORT(module, "clCreateBuffer");
__clewCreateSubBuffer = (PFNCLCREATESUBBUFFER )CLEW_DYNLIB_IMPORT(module, "clCreateSubBuffer");
__clewCreateImage = (PFNCLCREATEIMAGE )CLEW_DYNLIB_IMPORT(module, "clCreateImage");
__clewRetainMemObject = (PFNCLRETAINMEMOBJECT )CLEW_DYNLIB_IMPORT(module, "clRetainMemObject");
__clewReleaseMemObject = (PFNCLRELEASEMEMOBJECT )CLEW_DYNLIB_IMPORT(module, "clReleaseMemObject");

View File

@@ -114,7 +114,7 @@ extern "C" {
#define cuGLGetDevices cuGLGetDevices_v2
/* Types. */
#if defined(__x86_64) || defined(AMD64) || defined(_M_AMD64)
#if defined(__x86_64) || defined(AMD64) || defined(_M_AMD64) || defined (__aarch64__)
typedef unsigned long long CUdeviceptr;
#else
typedef unsigned int CUdeviceptr;

View File

@@ -34,7 +34,7 @@ add_subdirectory(mikktspace)
add_subdirectory(glew-mx)
add_subdirectory(eigen)
if (WITH_GAMEENGINE_DECKLINK)
if(WITH_GAMEENGINE_DECKLINK)
add_subdirectory(decklink)
endif()
@@ -62,7 +62,7 @@ if(WITH_IK_ITASC)
add_subdirectory(itasc)
endif()
if(WITH_IK_SOLVER OR WITH_GAMEENGINE OR WITH_MOD_BOOLEAN)
if(WITH_GAMEENGINE)
add_subdirectory(moto)
endif()

View File

@@ -101,11 +101,11 @@ ATOMIC_INLINE size_t atomic_fetch_and_add_z(size_t *p, size_t x);
ATOMIC_INLINE size_t atomic_fetch_and_sub_z(size_t *p, size_t x);
ATOMIC_INLINE size_t atomic_cas_z(size_t *v, size_t old, size_t _new);
ATOMIC_INLINE unsigned atomic_add_and_fetch_u(unsigned *p, unsigned x);
ATOMIC_INLINE unsigned atomic_sub_and_fetch_u(unsigned *p, unsigned x);
ATOMIC_INLINE unsigned atomic_fetch_and_add_u(unsigned *p, unsigned x);
ATOMIC_INLINE unsigned atomic_fetch_and_sub_u(unsigned *p, unsigned x);
ATOMIC_INLINE unsigned atomic_cas_u(unsigned *v, unsigned old, unsigned _new);
ATOMIC_INLINE unsigned int atomic_add_and_fetch_u(unsigned int *p, unsigned int x);
ATOMIC_INLINE unsigned int atomic_sub_and_fetch_u(unsigned int *p, unsigned int x);
ATOMIC_INLINE unsigned int atomic_fetch_and_add_u(unsigned int *p, unsigned int x);
ATOMIC_INLINE unsigned int atomic_fetch_and_sub_u(unsigned int *p, unsigned int x);
ATOMIC_INLINE unsigned int atomic_cas_u(unsigned int *v, unsigned int old, unsigned int _new);
/* WARNING! Float 'atomics' are really faked ones, those are actually closer to some kind of spinlock-sync'ed operation,
* which means they are only efficient if collisions are highly unlikely (i.e. if probability of two threads

View File

@@ -113,58 +113,58 @@ ATOMIC_INLINE size_t atomic_cas_z(size_t *v, size_t old, size_t _new)
/******************************************************************************/
/* unsigned operations. */
ATOMIC_INLINE unsigned atomic_add_and_fetch_u(unsigned *p, unsigned x)
ATOMIC_INLINE unsigned int atomic_add_and_fetch_u(unsigned int *p, unsigned int x)
{
assert(sizeof(unsigned) == LG_SIZEOF_INT);
assert(sizeof(unsigned int) == LG_SIZEOF_INT);
#if (LG_SIZEOF_INT == 8)
return (unsigned)atomic_add_and_fetch_uint64((uint64_t *)p, (uint64_t)x);
return (unsigned int)atomic_add_and_fetch_uint64((uint64_t *)p, (uint64_t)x);
#elif (LG_SIZEOF_INT == 4)
return (unsigned)atomic_add_and_fetch_uint32((uint32_t *)p, (uint32_t)x);
return (unsigned int)atomic_add_and_fetch_uint32((uint32_t *)p, (uint32_t)x);
#endif
}
ATOMIC_INLINE unsigned atomic_sub_and_fetch_u(unsigned *p, unsigned x)
ATOMIC_INLINE unsigned int atomic_sub_and_fetch_u(unsigned int *p, unsigned int x)
{
assert(sizeof(unsigned) == LG_SIZEOF_INT);
assert(sizeof(unsigned int) == LG_SIZEOF_INT);
#if (LG_SIZEOF_INT == 8)
return (unsigned)atomic_add_and_fetch_uint64((uint64_t *)p, (uint64_t)-((int64_t)x));
return (unsigned int)atomic_add_and_fetch_uint64((uint64_t *)p, (uint64_t)-((int64_t)x));
#elif (LG_SIZEOF_INT == 4)
return (unsigned)atomic_add_and_fetch_uint32((uint32_t *)p, (uint32_t)-((int32_t)x));
return (unsigned int)atomic_add_and_fetch_uint32((uint32_t *)p, (uint32_t)-((int32_t)x));
#endif
}
ATOMIC_INLINE unsigned atomic_fetch_and_add_u(unsigned *p, unsigned x)
ATOMIC_INLINE unsigned int atomic_fetch_and_add_u(unsigned int *p, unsigned int x)
{
assert(sizeof(unsigned) == LG_SIZEOF_INT);
assert(sizeof(unsigned int) == LG_SIZEOF_INT);
#if (LG_SIZEOF_INT == 8)
return (unsigned)atomic_fetch_and_add_uint64((uint64_t *)p, (uint64_t)x);
return (unsigned int)atomic_fetch_and_add_uint64((uint64_t *)p, (uint64_t)x);
#elif (LG_SIZEOF_INT == 4)
return (unsigned)atomic_fetch_and_add_uint32((uint32_t *)p, (uint32_t)x);
return (unsigned int)atomic_fetch_and_add_uint32((uint32_t *)p, (uint32_t)x);
#endif
}
ATOMIC_INLINE unsigned atomic_fetch_and_sub_u(unsigned *p, unsigned x)
ATOMIC_INLINE unsigned int atomic_fetch_and_sub_u(unsigned int *p, unsigned int x)
{
assert(sizeof(unsigned) == LG_SIZEOF_INT);
assert(sizeof(unsigned int) == LG_SIZEOF_INT);
#if (LG_SIZEOF_INT == 8)
return (unsigned)atomic_fetch_and_add_uint64((uint64_t *)p, (uint64_t)-((int64_t)x));
return (unsigned int)atomic_fetch_and_add_uint64((uint64_t *)p, (uint64_t)-((int64_t)x));
#elif (LG_SIZEOF_INT == 4)
return (unsigned)atomic_fetch_and_add_uint32((uint32_t *)p, (uint32_t)-((int32_t)x));
return (unsigned int)atomic_fetch_and_add_uint32((uint32_t *)p, (uint32_t)-((int32_t)x));
#endif
}
ATOMIC_INLINE unsigned atomic_cas_u(unsigned *v, unsigned old, unsigned _new)
ATOMIC_INLINE unsigned int atomic_cas_u(unsigned int *v, unsigned int old, unsigned int _new)
{
assert(sizeof(unsigned) == LG_SIZEOF_INT);
assert(sizeof(unsigned int) == LG_SIZEOF_INT);
#if (LG_SIZEOF_INT == 8)
return (unsigned)atomic_cas_uint64((uint64_t *)v, (uint64_t)old, (uint64_t)_new);
return (unsigned int)atomic_cas_uint64((uint64_t *)v, (uint64_t)old, (uint64_t)_new);
#elif (LG_SIZEOF_INT == 4)
return (unsigned)atomic_cas_uint32((uint32_t *)v, (uint32_t)old, (uint32_t)_new);
return (unsigned int)atomic_cas_uint32((uint32_t *)v, (uint32_t)old, (uint32_t)_new);
#endif
}

View File

@@ -365,6 +365,7 @@ bool AUD_SoftwareDevice::AUD_SoftwareHandle::seek(float position)
if(!m_status)
return false;
m_pitch->setPitch(m_user_pitch);
m_reader->seek((int)(position * m_reader->getSpecs().rate));
if(m_status == AUD_STATUS_STOPPED)

View File

@@ -22,6 +22,7 @@ if(WITH_CYCLES_NATIVE_ONLY)
-DWITH_KERNEL_NATIVE
)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=native")
set(CYCLES_KERNEL_FLAGS "-march=native")
elseif(NOT WITH_CPU_SSE)
set(CXX_HAS_SSE FALSE)
set(CXX_HAS_AVX FALSE)
@@ -59,10 +60,13 @@ elseif(WIN32 AND MSVC)
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} /Ox")
set(CMAKE_CXX_FLAGS_RELWITHDEBINFO "${CMAKE_CXX_FLAGS_RELWITHDEBINFO} /Ox")
set(CMAKE_CXX_FLAGS_MINSIZEREL "${CMAKE_CXX_FLAGS_MINSIZEREL} /Ox")
set(CYCLES_KERNEL_FLAGS "/fp:fast -D_CRT_SECURE_NO_WARNINGS /GS-")
elseif(CMAKE_COMPILER_IS_GNUCC)
check_cxx_compiler_flag(-msse CXX_HAS_SSE)
check_cxx_compiler_flag(-mavx CXX_HAS_AVX)
check_cxx_compiler_flag(-mavx2 CXX_HAS_AVX2)
set(CYCLES_KERNEL_FLAGS "-ffast-math")
if(CXX_HAS_SSE)
set(CYCLES_SSE2_KERNEL_FLAGS "-ffast-math -msse -msse2 -mfpmath=sse")
set(CYCLES_SSE3_KERNEL_FLAGS "-ffast-math -msse -msse2 -msse3 -mssse3 -mfpmath=sse")
@@ -74,10 +78,12 @@ elseif(CMAKE_COMPILER_IS_GNUCC)
if(CXX_HAS_AVX2)
set(CYCLES_AVX2_KERNEL_FLAGS "-ffast-math -msse -msse2 -msse3 -mssse3 -msse4.1 -mavx -mavx2 -mfma -mlzcnt -mbmi -mbmi2 -mf16c -mfpmath=sse")
endif()
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -ffast-math -fno-finite-math-only")
elseif(CMAKE_CXX_COMPILER_ID MATCHES "Clang")
check_cxx_compiler_flag(-msse CXX_HAS_SSE)
check_cxx_compiler_flag(-mavx CXX_HAS_AVX)
check_cxx_compiler_flag(-mavx2 CXX_HAS_AVX2)
set(CYCLES_KERNEL_FLAGS "-ffast-math")
if(CXX_HAS_SSE)
set(CYCLES_SSE2_KERNEL_FLAGS "-ffast-math -msse -msse2")
set(CYCLES_SSE3_KERNEL_FLAGS "-ffast-math -msse -msse2 -msse3 -mssse3")
@@ -89,6 +95,7 @@ elseif(CMAKE_CXX_COMPILER_ID MATCHES "Clang")
if(CXX_HAS_AVX2)
set(CYCLES_AVX2_KERNEL_FLAGS "-ffast-math -msse -msse2 -msse3 -mssse3 -msse4.1 -mavx -mavx2 -mfma -mlzcnt -mbmi -mbmi2 -mf16c")
endif()
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -ffast-math -fno-finite-math-only")
endif()
if(CXX_HAS_SSE)

View File

@@ -1,14 +1,6 @@
set(INC
.
../bvh
../device
../graph
../kernel
../kernel/svm
../render
../subd
../util
..
)
set(INC_SYS
)

View File

@@ -16,15 +16,15 @@
#include <stdio.h>
#include "device.h"
#include "device/device.h"
#include "util_args.h"
#include "util_foreach.h"
#include "util_path.h"
#include "util_stats.h"
#include "util_string.h"
#include "util_task.h"
#include "util_logging.h"
#include "util/util_args.h"
#include "util/util_foreach.h"
#include "util/util_path.h"
#include "util/util_stats.h"
#include "util/util_string.h"
#include "util/util_task.h"
#include "util/util_logging.h"
using namespace ccl;

View File

@@ -16,29 +16,29 @@
#include <stdio.h>
#include "buffers.h"
#include "camera.h"
#include "device.h"
#include "scene.h"
#include "session.h"
#include "integrator.h"
#include "render/buffers.h"
#include "render/camera.h"
#include "device/device.h"
#include "render/scene.h"
#include "render/session.h"
#include "render/integrator.h"
#include "util_args.h"
#include "util_foreach.h"
#include "util_function.h"
#include "util_logging.h"
#include "util_path.h"
#include "util_progress.h"
#include "util_string.h"
#include "util_time.h"
#include "util_transform.h"
#include "util_version.h"
#include "util/util_args.h"
#include "util/util_foreach.h"
#include "util/util_function.h"
#include "util/util_logging.h"
#include "util/util_path.h"
#include "util/util_progress.h"
#include "util/util_string.h"
#include "util/util_time.h"
#include "util/util_transform.h"
#include "util/util_version.h"
#ifdef WITH_CYCLES_STANDALONE_GUI
#include "util_view.h"
#include "util/util_view.h"
#endif
#include "cycles_xml.h"
#include "app/cycles_xml.h"
CCL_NAMESPACE_BEGIN

View File

@@ -20,31 +20,31 @@
#include <algorithm>
#include <iterator>
#include "node_xml.h"
#include "graph/node_xml.h"
#include "background.h"
#include "camera.h"
#include "film.h"
#include "graph.h"
#include "integrator.h"
#include "light.h"
#include "mesh.h"
#include "nodes.h"
#include "object.h"
#include "osl.h"
#include "shader.h"
#include "scene.h"
#include "render/background.h"
#include "render/camera.h"
#include "render/film.h"
#include "render/graph.h"
#include "render/integrator.h"
#include "render/light.h"
#include "render/mesh.h"
#include "render/nodes.h"
#include "render/object.h"
#include "render/osl.h"
#include "render/shader.h"
#include "render/scene.h"
#include "subd_patch.h"
#include "subd_split.h"
#include "subd/subd_patch.h"
#include "subd/subd_split.h"
#include "util_debug.h"
#include "util_foreach.h"
#include "util_path.h"
#include "util_transform.h"
#include "util_xml.h"
#include "util/util_debug.h"
#include "util/util_foreach.h"
#include "util/util_path.h"
#include "util/util_transform.h"
#include "util/util_xml.h"
#include "cycles_xml.h"
#include "app/cycles_xml.h"
CCL_NAMESPACE_BEGIN

View File

@@ -1,12 +1,6 @@
set(INC
../graph
../render
../device
../kernel
../kernel/svm
../util
../subd
..
../../glew-mx
../../guardedalloc
../../mikktspace

View File

@@ -107,7 +107,13 @@ def engine_exit():
engine.exit()
classes = (
CyclesRender,
)
def register():
from bpy.utils import register_class
from . import ui
from . import properties
from . import presets
@@ -122,12 +128,15 @@ def register():
properties.register()
ui.register()
presets.register()
bpy.utils.register_module(__name__)
for cls in classes:
register_class(cls)
bpy.app.handlers.version_update.append(version_update.do_versions)
def unregister():
from bpy.utils import unregister_class
from . import ui
from . import properties
from . import presets
@@ -138,4 +147,6 @@ def unregister():
ui.unregister()
properties.unregister()
presets.unregister()
bpy.utils.unregister_module(__name__)
for cls in classes:
unregister_class(cls)

View File

@@ -50,6 +50,24 @@ def _workaround_buggy_drivers():
_cycles.opencl_disable()
def _configure_argument_parser():
import argparse
parser = argparse.ArgumentParser(description="Cycles Addon argument parser")
parser.add_argument("--cycles-resumable-num-chunks",
help="Number of chunks to split sample range into",
default=None)
parser.add_argument("--cycles-resumable-current-chunk",
help="Current chunk of samples range to render",
default=None)
parser.add_argument("--cycles-resumable-start-chunk",
help="Start chunk to render",
default=None)
parser.add_argument("--cycles-resumable-end-chunk",
help="End chunk to render",
default=None)
return parser
def _parse_command_line():
import sys
@@ -57,25 +75,22 @@ def _parse_command_line():
if "--" not in argv:
return
argv = argv[argv.index("--") + 1:]
parser = _configure_argument_parser()
args, unknown = parser.parse_known_args(argv[argv.index("--") + 1:])
num_resumable_chunks = None
current_resumable_chunk = None
# TODO(sergey): Add some nice error prints if argument is not used properly.
idx = 0
while idx < len(argv) - 1:
arg = argv[idx]
if arg == '--cycles-resumable-num-chunks':
num_resumable_chunks = int(argv[idx + 1])
elif arg == '--cycles-resumable-current-chunk':
current_resumable_chunk = int(argv[idx + 1])
idx += 1
if num_resumable_chunks is not None and current_resumable_chunk is not None:
import _cycles
_cycles.set_resumable_chunks(num_resumable_chunks,
current_resumable_chunk)
if args.cycles_resumable_num_chunks is not None:
if args.cycles_resumable_current_chunk is not None:
import _cycles
_cycles.set_resumable_chunk(
int(args.cycles_resumable_num_chunks),
int(args.cycles_resumable_current_chunk))
elif args.cycles_resumable_start_chunk is not None and \
args.cycles_resumable_end_chunk:
import _cycles
_cycles.set_resumable_chunk_range(
int(args.cycles_resumable_num_chunks),
int(args.cycles_resumable_start_chunk),
int(args.cycles_resumable_end_chunk))
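
Note on the hunk above: the hand-written argv loop is replaced by argparse with parse_known_args(), so unrelated arguments after "--" are ignored instead of rejected. The sketch below reproduces that flow without Blender. It returns a tuple instead of calling into _cycles, the function names are hypothetical, and, unlike the hunk, it checks the end chunk with an explicit "is not None":

```python
import argparse


def build_parser():
    """Parser for the four resumable-render options named in the hunk."""
    parser = argparse.ArgumentParser(description="Sketch of the chunk parser")
    parser.add_argument("--cycles-resumable-num-chunks", default=None)
    parser.add_argument("--cycles-resumable-current-chunk", default=None)
    parser.add_argument("--cycles-resumable-start-chunk", default=None)
    parser.add_argument("--cycles-resumable-end-chunk", default=None)
    return parser


def resolve_chunks(argv):
    """Return ('chunk', n, cur), ('range', n, start, end), or None."""
    if "--" not in argv:
        return None
    # Only arguments after "--" belong to the script; unknown ones are kept aside.
    args, _unknown = build_parser().parse_known_args(argv[argv.index("--") + 1:])
    if args.cycles_resumable_num_chunks is None:
        return None
    num_chunks = int(args.cycles_resumable_num_chunks)
    if args.cycles_resumable_current_chunk is not None:
        return ("chunk", num_chunks, int(args.cycles_resumable_current_chunk))
    if (args.cycles_resumable_start_chunk is not None and
            args.cycles_resumable_end_chunk is not None):
        return ("range", num_chunks,
                int(args.cycles_resumable_start_chunk),
                int(args.cycles_resumable_end_chunk))
    return None


if __name__ == "__main__":
    print(resolve_chunks(["blender", "-b", "--",
                          "--cycles-resumable-num-chunks", "4",
                          "--cycles-resumable-current-chunk", "2"]))
```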
def init():

View File

@@ -82,12 +82,23 @@ class AddPresetSampling(AddPresetBase, Operator):
preset_subdir = "cycles/sampling"
classes = (
AddPresetIntegrator,
AddPresetSampling,
)
def register():
pass
from bpy.utils import register_class
for cls in classes:
register_class(cls)
def unregister():
pass
from bpy.utils import unregister_class
for cls in classes:
unregister_class(cls)
if __name__ == "__main__":
register()

View File

@@ -665,8 +665,10 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
cls.debug_use_cpu_sse3 = BoolProperty(name="SSE3", default=True)
cls.debug_use_cpu_sse2 = BoolProperty(name="SSE2", default=True)
cls.debug_use_qbvh = BoolProperty(name="QBVH", default=True)
cls.debug_use_cpu_split_kernel = BoolProperty(name="Split Kernel", default=False)
cls.debug_use_cuda_adaptive_compile = BoolProperty(name="Adaptive Compile", default=False)
cls.debug_use_cuda_split_kernel = BoolProperty(name="Split Kernel", default=False)
cls.debug_opencl_kernel_type = EnumProperty(
name="OpenCL Kernel Type",
@@ -693,6 +695,8 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
update=devices_update_callback
)
cls.debug_opencl_kernel_single_program = BoolProperty(name="Single Program", default=False, update=devices_update_callback);
cls.debug_use_opencl_debug = BoolProperty(name="Debug OpenCL", default=False)
@classmethod
@@ -1092,6 +1096,12 @@ class CyclesObjectSettings(bpy.types.PropertyGroup):
default=1.0,
)
cls.is_shadow_catcher = BoolProperty(
name="Shadow Catcher",
description="Only render shadows on this object, for compositing renders into real footage",
default=False,
)
@classmethod
def unregister(cls):
del bpy.types.Object.cycles

View File

@@ -86,12 +86,10 @@ def use_sample_all_lights(context):
return cscene.sample_all_lights_direct or cscene.sample_all_lights_indirect
def show_device_selection(context):
type = get_device_type(context)
if type == 'NETWORK':
def show_device_active(context):
cscene = context.scene.cycles
if cscene.device != 'GPU':
return True
if not type in {'CUDA', 'OPENCL'}:
return False
return context.user_preferences.addons[__package__].preferences.has_active_device()
@@ -186,9 +184,6 @@ class CyclesRender_PT_sampling(CyclesButtonsPanel, Panel):
sub.label(text="AA Samples:")
sub.prop(cscene, "aa_samples", text="Render")
sub.prop(cscene, "preview_aa_samples", text="Preview")
sub.separator()
sub.prop(cscene, "sample_all_lights_direct")
sub.prop(cscene, "sample_all_lights_indirect")
col = split.column()
sub = col.column(align=True)
@@ -205,6 +200,10 @@ class CyclesRender_PT_sampling(CyclesButtonsPanel, Panel):
sub.prop(cscene, "subsurface_samples", text="Subsurface")
sub.prop(cscene, "volume_samples", text="Volume")
col = layout.column(align=True)
col.prop(cscene, "sample_all_lights_direct")
col.prop(cscene, "sample_all_lights_indirect")
if not (use_opencl(context) and cscene.feature_set != 'EXPERIMENTAL'):
layout.row().prop(cscene, "sampling_pattern", text="Pattern")
@@ -270,7 +269,7 @@ class CyclesRender_PT_geometry(CyclesButtonsPanel, Panel):
row = col.row()
row.prop(ccscene, "minimum_width", text="Min Pixels")
row.prop(ccscene, "maximum_width", text="Max Ext.")
row.prop(ccscene, "maximum_width", text="Max Extension")
class CyclesRender_PT_light_paths(CyclesButtonsPanel, Panel):
@@ -788,6 +787,8 @@ class CyclesObject_PT_cycles_settings(CyclesButtonsPanel, Panel):
if ob.type != 'LAMP':
flow.prop(visibility, "shadow")
layout.prop(cob, "is_shadow_catcher")
col = layout.column()
col.label(text="Performance:")
row = col.row()
@@ -1518,15 +1519,18 @@ class CyclesRender_PT_debug(CyclesButtonsPanel, Panel):
row.prop(cscene, "debug_use_cpu_avx", toggle=True)
row.prop(cscene, "debug_use_cpu_avx2", toggle=True)
col.prop(cscene, "debug_use_qbvh")
col.prop(cscene, "debug_use_cpu_split_kernel")
col = layout.column()
col.label('CUDA Flags:')
col.prop(cscene, "debug_use_cuda_adaptive_compile")
col.prop(cscene, "debug_use_cuda_split_kernel")
col = layout.column()
col.label('OpenCL Flags:')
col.prop(cscene, "debug_opencl_kernel_type", text="Kernel")
col.prop(cscene, "debug_opencl_device_type", text="Device")
col.prop(cscene, "debug_opencl_kernel_single_program", text="Single Program")
col.prop(cscene, "debug_use_opencl_debug", text="Debug")
@@ -1633,7 +1637,7 @@ def draw_device(self, context):
split = layout.split(percentage=1/3)
split.label("Device:")
row = split.row()
row.active = show_device_selection(context)
row.active = show_device_active(context)
row.prop(cscene, "device", text="")
if engine.with_osl() and use_cpu(context):
@@ -1712,17 +1716,75 @@ def get_panels():
return panels
classes = (
CYCLES_MT_sampling_presets,
CYCLES_MT_integrator_presets,
CyclesRender_PT_sampling,
CyclesRender_PT_geometry,
CyclesRender_PT_light_paths,
CyclesRender_PT_motion_blur,
CyclesRender_PT_film,
CyclesRender_PT_performance,
CyclesRender_PT_layer_options,
CyclesRender_PT_layer_passes,
CyclesRender_PT_views,
Cycles_PT_post_processing,
CyclesCamera_PT_dof,
Cycles_PT_context_material,
CyclesObject_PT_motion_blur,
CyclesObject_PT_cycles_settings,
CYCLES_OT_use_shading_nodes,
CyclesLamp_PT_preview,
CyclesLamp_PT_lamp,
CyclesLamp_PT_nodes,
CyclesLamp_PT_spot,
CyclesWorld_PT_preview,
CyclesWorld_PT_surface,
CyclesWorld_PT_volume,
CyclesWorld_PT_ambient_occlusion,
CyclesWorld_PT_mist,
CyclesWorld_PT_ray_visibility,
CyclesWorld_PT_settings,
CyclesMaterial_PT_preview,
CyclesMaterial_PT_surface,
CyclesMaterial_PT_volume,
CyclesMaterial_PT_displacement,
CyclesMaterial_PT_settings,
CyclesTexture_PT_context,
CyclesTexture_PT_node,
CyclesTexture_PT_mapping,
CyclesTexture_PT_colors,
CyclesParticle_PT_textures,
CyclesRender_PT_bake,
CyclesRender_PT_debug,
CyclesParticle_PT_CurveSettings,
CyclesScene_PT_simplify,
)
def register():
from bpy.utils import register_class
bpy.types.RENDER_PT_render.append(draw_device)
bpy.types.VIEW3D_HT_header.append(draw_pause)
for panel in get_panels():
panel.COMPAT_ENGINES.add('CYCLES')
for cls in classes:
register_class(cls)
def unregister():
from bpy.utils import unregister_class
bpy.types.RENDER_PT_render.remove(draw_device)
bpy.types.VIEW3D_HT_header.remove(draw_pause)
for panel in get_panels():
if 'CYCLES' in panel.COMPAT_ENGINES:
panel.COMPAT_ENGINES.remove('CYCLES')
for cls in classes:
unregister_class(cls)

View File

@@ -14,13 +14,13 @@
* limitations under the License.
*/
#include "camera.h"
#include "scene.h"
#include "render/camera.h"
#include "render/scene.h"
#include "blender_sync.h"
#include "blender_util.h"
#include "blender/blender_sync.h"
#include "blender/blender_util.h"
#include "util_logging.h"
#include "util/util_logging.h"
CCL_NAMESPACE_BEGIN

View File

@@ -14,18 +14,18 @@
* limitations under the License.
*/
#include "attribute.h"
#include "camera.h"
#include "curves.h"
#include "mesh.h"
#include "object.h"
#include "scene.h"
#include "render/attribute.h"
#include "render/camera.h"
#include "render/curves.h"
#include "render/mesh.h"
#include "render/object.h"
#include "render/scene.h"
#include "blender_sync.h"
#include "blender_util.h"
#include "blender/blender_sync.h"
#include "blender/blender_util.h"
#include "util_foreach.h"
#include "util_logging.h"
#include "util/util_foreach.h"
#include "util/util_logging.h"
CCL_NAMESPACE_BEGIN
@@ -411,6 +411,7 @@ static void ExportCurveTrianglePlanes(Mesh *mesh, ParticleCurveData *CData,
}
}
mesh->resize_mesh(mesh->verts.size(), mesh->triangles.size());
mesh->attributes.remove(ATTR_STD_VERTEX_NORMAL);
mesh->attributes.remove(ATTR_STD_FACE_NORMAL);
mesh->add_face_normals();
@@ -434,8 +435,8 @@ static void ExportCurveTriangleGeometry(Mesh *mesh,
if(CData->curve_keynum[curve] <= 1 || CData->curve_length[curve] == 0.0f)
continue;
numverts += (CData->curve_keynum[curve] - 2)*2*resolution + resolution;
numtris += (CData->curve_keynum[curve] - 2)*resolution;
numverts += (CData->curve_keynum[curve] - 1)*resolution + resolution;
numtris += (CData->curve_keynum[curve] - 1)*2*resolution;
}
}
@@ -545,6 +546,7 @@ static void ExportCurveTriangleGeometry(Mesh *mesh,
}
}
mesh->resize_mesh(mesh->verts.size(), mesh->triangles.size());
mesh->attributes.remove(ATTR_STD_VERTEX_NORMAL);
mesh->attributes.remove(ATTR_STD_FACE_NORMAL);
mesh->add_face_normals();
@@ -890,7 +892,7 @@ void BlenderSync::sync_curves(Mesh *mesh,
}
/* obtain general settings */
bool use_curves = scene->curve_system_manager->use_curves;
const bool use_curves = scene->curve_system_manager->use_curves;
if(!(use_curves && b_ob.mode() != b_ob.mode_PARTICLE_EDIT)) {
if(!motion)
@@ -898,11 +900,11 @@ void BlenderSync::sync_curves(Mesh *mesh,
return;
}
int primitive = scene->curve_system_manager->primitive;
int triangle_method = scene->curve_system_manager->triangle_method;
int resolution = scene->curve_system_manager->resolution;
size_t vert_num = mesh->verts.size();
size_t tri_num = mesh->num_triangles();
const int primitive = scene->curve_system_manager->primitive;
const int triangle_method = scene->curve_system_manager->triangle_method;
const int resolution = scene->curve_system_manager->resolution;
const size_t vert_num = mesh->verts.size();
const size_t tri_num = mesh->num_triangles();
int used_res = 1;
/* extract particle hair data - should be combined with connecting to mesh later*/
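
The vertex and triangle counts reserved in ExportCurveTriangleGeometry can be read as plain tube arithmetic: a curve with K keys and R vertices per ring has K rings and K - 1 segments, and each segment is a band of R quads, i.e. 2*R triangles. The sketch below only restates that arithmetic; `TubeCounts` and `tube_counts` are hypothetical names, not part of Cycles, and the "ring" interpretation is an assumption drawn from the formulas themselves.

```cpp
#include <cassert>

/* Hypothetical helper type, not part of Cycles. */
struct TubeCounts {
	int verts;
	int tris;
};

/* Count vertices and triangles for one curve turned into a tube.
 * Curves with one key or fewer produce no geometry, mirroring the
 * `curve_keynum <= 1` check in the exporter. */
TubeCounts tube_counts(int num_keys, int resolution)
{
	TubeCounts counts = {0, 0};
	if(num_keys <= 1) {
		return counts;
	}
	/* (K - 1) rings plus the closing ring: K * R vertices in total. */
	counts.verts = (num_keys - 1)*resolution + resolution;
	/* (K - 1) segments, each a band of R quads split into 2 triangles. */
	counts.tris = (num_keys - 1)*2*resolution;
	return counts;
}
```

For a curve with 4 keys at resolution 3, this gives 12 vertices and 18 triangles.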

View File

@@ -14,8 +14,8 @@
* limitations under the License.
*/
#include "CCL_api.h"
#include "util_logging.h"
#include "blender/CCL_api.h"
#include "util/util_logging.h"
void CCL_init_logging(const char *argv0)
{

View File

@@ -15,21 +15,22 @@
*/
#include "mesh.h"
#include "object.h"
#include "scene.h"
#include "camera.h"
#include "render/mesh.h"
#include "render/object.h"
#include "render/scene.h"
#include "render/camera.h"
#include "blender_sync.h"
#include "blender_session.h"
#include "blender_util.h"
#include "blender/blender_sync.h"
#include "blender/blender_session.h"
#include "blender/blender_util.h"
#include "subd_patch.h"
#include "subd_split.h"
#include "subd/subd_patch.h"
#include "subd/subd_split.h"
#include "util_foreach.h"
#include "util_logging.h"
#include "util_math.h"
#include "util/util_algorithm.h"
#include "util/util_foreach.h"
#include "util/util_logging.h"
#include "util/util_math.h"
#include "mikktspace.h"
@@ -525,69 +526,177 @@ static void attr_create_uv_map(Scene *scene,
}
/* Create vertex pointiness attributes. */
/* Compare vertices by sum of their coordinates. */
class VertexAverageComparator {
public:
VertexAverageComparator(const array<float3>& verts)
: verts_(verts) {
}
bool operator()(const int& vert_idx_a, const int& vert_idx_b)
{
const float3 &vert_a = verts_[vert_idx_a];
const float3 &vert_b = verts_[vert_idx_b];
if(vert_a == vert_b) {
/* Special case for doubles, so we ensure ordering. */
return vert_idx_a > vert_idx_b;
}
const float x1 = vert_a.x + vert_a.y + vert_a.z;
const float x2 = vert_b.x + vert_b.y + vert_b.z;
return x1 < x2;
}
protected:
const array<float3>& verts_;
};
static void attr_create_pointiness(Scene *scene,
Mesh *mesh,
BL::Mesh& b_mesh,
bool subdivision)
{
if(mesh->need_attribute(scene, ATTR_STD_POINTINESS)) {
const int numverts = b_mesh.vertices.length();
AttributeSet& attributes = (subdivision)? mesh->subd_attributes: mesh->attributes;
Attribute *attr = attributes.add(ATTR_STD_POINTINESS);
float *data = attr->data_float();
int *counter = new int[numverts];
float *raw_data = new float[numverts];
float3 *edge_accum = new float3[numverts];
/* Calculate pointiness using single ring neighborhood. */
memset(counter, 0, sizeof(int) * numverts);
memset(raw_data, 0, sizeof(float) * numverts);
memset(edge_accum, 0, sizeof(float3) * numverts);
BL::Mesh::edges_iterator e;
int i = 0;
for(b_mesh.edges.begin(e); e != b_mesh.edges.end(); ++e, ++i) {
int v0 = b_mesh.edges[i].vertices()[0],
v1 = b_mesh.edges[i].vertices()[1];
float3 co0 = get_float3(b_mesh.vertices[v0].co()),
co1 = get_float3(b_mesh.vertices[v1].co());
float3 edge = normalize(co1 - co0);
edge_accum[v0] += edge;
edge_accum[v1] += -edge;
++counter[v0];
++counter[v1];
}
i = 0;
BL::Mesh::vertices_iterator v;
for(b_mesh.vertices.begin(v); v != b_mesh.vertices.end(); ++v, ++i) {
if(counter[i] > 0) {
float3 normal = get_float3(b_mesh.vertices[i].normal());
float angle = safe_acosf(dot(normal, edge_accum[i] / counter[i]));
raw_data[i] = angle * M_1_PI_F;
if(!mesh->need_attribute(scene, ATTR_STD_POINTINESS)) {
return;
}
const int num_verts = b_mesh.vertices.length();
/* STEP 1: Find out duplicated vertices and point duplicates to a single
* original vertex.
*/
vector<int> sorted_vert_indeices(num_verts);
for(int vert_index = 0; vert_index < num_verts; ++vert_index) {
sorted_vert_indeices[vert_index] = vert_index;
}
VertexAverageComparator compare(mesh->verts);
sort(sorted_vert_indeices.begin(), sorted_vert_indeices.end(), compare);
/* This array stores index of the original vertex for the given vertex
* index.
*/
vector<int> vert_orig_index(num_verts);
for(int sorted_vert_index = 0;
sorted_vert_index < num_verts;
++sorted_vert_index)
{
const int vert_index = sorted_vert_indeices[sorted_vert_index];
const float3 &vert_co = mesh->verts[vert_index];
bool found = false;
for(int other_sorted_vert_index = sorted_vert_index + 1;
other_sorted_vert_index < num_verts;
++other_sorted_vert_index)
{
const int other_vert_index =
sorted_vert_indeices[other_sorted_vert_index];
const float3 &other_vert_co = mesh->verts[other_vert_index];
/* We are too far away now, there can't be any more duplicates. */
if((other_vert_co.x + other_vert_co.y + other_vert_co.z) -
(vert_co.x + vert_co.y + vert_co.z) > 3 * FLT_EPSILON)
{
break;
}
else {
raw_data[i] = 0.0f;
/* Found duplicate. */
if(len_squared(other_vert_co - vert_co) < FLT_EPSILON) {
found = true;
vert_orig_index[vert_index] = other_vert_index;
break;
}
}
/* Blur vertices to approximate 2 ring neighborhood. */
memset(counter, 0, sizeof(int) * numverts);
memcpy(data, raw_data, sizeof(float) * numverts);
i = 0;
for(b_mesh.edges.begin(e); e != b_mesh.edges.end(); ++e, ++i) {
int v0 = b_mesh.edges[i].vertices()[0],
v1 = b_mesh.edges[i].vertices()[1];
data[v0] += raw_data[v1];
data[v1] += raw_data[v0];
++counter[v0];
++counter[v1];
if(!found) {
vert_orig_index[vert_index] = vert_index;
}
for(i = 0; i < numverts; ++i) {
data[i] /= counter[i] + 1;
}
/* Make sure we always point to the very first original vertex. */
for(int vert_index = 0; vert_index < num_verts; ++vert_index) {
int orig_index = vert_orig_index[vert_index];
while(orig_index != vert_orig_index[orig_index]) {
orig_index = vert_orig_index[orig_index];
}
delete [] counter;
delete [] raw_data;
delete [] edge_accum;
vert_orig_index[vert_index] = orig_index;
}
sorted_vert_indeices.free_memory();
/* STEP 2: Calculate vertex normals taking into account their possible
* duplicates which get "welded" together.
*/
vector<float3> vert_normal(num_verts, make_float3(0.0f, 0.0f, 0.0f));
/* First we accumulate all vertex normals in the original index. */
for(int vert_index = 0; vert_index < num_verts; ++vert_index) {
const float3 normal = get_float3(b_mesh.vertices[vert_index].normal());
const int orig_index = vert_orig_index[vert_index];
vert_normal[orig_index] += normal;
}
/* Then we normalize the accumulated result and flush it to all duplicates
* as well.
*/
for(int vert_index = 0; vert_index < num_verts; ++vert_index) {
const int orig_index = vert_orig_index[vert_index];
vert_normal[vert_index] = normalize(vert_normal[orig_index]);
}
/* STEP 3: Calculate pointiness using single ring neighborhood. */
vector<int> counter(num_verts, 0);
vector<float> raw_data(num_verts, 0.0f);
vector<float3> edge_accum(num_verts, make_float3(0.0f, 0.0f, 0.0f));
BL::Mesh::edges_iterator e;
EdgeMap visited_edges;
int edge_index = 0;
memset(&counter[0], 0, sizeof(int) * counter.size());
for(b_mesh.edges.begin(e); e != b_mesh.edges.end(); ++e, ++edge_index) {
const int v0 = vert_orig_index[b_mesh.edges[edge_index].vertices()[0]],
v1 = vert_orig_index[b_mesh.edges[edge_index].vertices()[1]];
if(visited_edges.exists(v0, v1)) {
continue;
}
visited_edges.insert(v0, v1);
float3 co0 = get_float3(b_mesh.vertices[v0].co()),
co1 = get_float3(b_mesh.vertices[v1].co());
float3 edge = normalize(co1 - co0);
edge_accum[v0] += edge;
edge_accum[v1] += -edge;
++counter[v0];
++counter[v1];
}
for(int vert_index = 0; vert_index < num_verts; ++vert_index) {
const int orig_index = vert_orig_index[vert_index];
if(orig_index != vert_index) {
/* Skip duplicates, they'll be overwritten later on. */
continue;
}
if(counter[vert_index] > 0) {
const float3 normal = vert_normal[vert_index];
const float angle =
safe_acosf(dot(normal,
edge_accum[vert_index] / counter[vert_index]));
raw_data[vert_index] = angle * M_1_PI_F;
}
else {
raw_data[vert_index] = 0.0f;
}
}
/* STEP 4: Blur vertices to approximate 2 ring neighborhood. */
AttributeSet& attributes = (subdivision)? mesh->subd_attributes: mesh->attributes;
Attribute *attr = attributes.add(ATTR_STD_POINTINESS);
float *data = attr->data_float();
memcpy(data, &raw_data[0], sizeof(float) * raw_data.size());
memset(&counter[0], 0, sizeof(int) * counter.size());
edge_index = 0;
visited_edges.clear();
for(b_mesh.edges.begin(e); e != b_mesh.edges.end(); ++e, ++edge_index) {
const int v0 = vert_orig_index[b_mesh.edges[edge_index].vertices()[0]],
v1 = vert_orig_index[b_mesh.edges[edge_index].vertices()[1]];
if(visited_edges.exists(v0, v1)) {
continue;
}
visited_edges.insert(v0, v1);
data[v0] += raw_data[v1];
data[v1] += raw_data[v0];
++counter[v0];
++counter[v1];
}
for(int vert_index = 0; vert_index < num_verts; ++vert_index) {
data[vert_index] /= counter[vert_index] + 1;
}
/* STEP 5: Copy attribute to the duplicated vertices. */
for(int vert_index = 0; vert_index < num_verts; ++vert_index) {
const int orig_index = vert_orig_index[vert_index];
data[vert_index] = data[orig_index];
}
}
@@ -656,9 +765,6 @@ static void create_mesh(Scene *scene,
generated[i++] = get_float3(v->undeformed_co())*size - loc;
}
/* Create needed vertex attributes. */
attr_create_pointiness(scene, mesh, b_mesh, subdivision);
/* create faces */
vector<int> nverts(numfaces);
vector<int> face_flags(numfaces, FACE_FLAG_NONE);
@@ -671,6 +777,15 @@ static void create_mesh(Scene *scene,
int shader = clamp(f->material_index(), 0, used_shaders.size()-1);
bool smooth = f->use_smooth() || use_loop_normals;
if(use_loop_normals) {
BL::Array<float, 12> loop_normals = f->split_normals();
for(int i = 0; i < n; i++) {
N[vi[i]] = make_float3(loop_normals[i * 3],
loop_normals[i * 3 + 1],
loop_normals[i * 3 + 2]);
}
}
/* Create triangles.
*
* NOTE: Autosmooth is already taken care of.
@@ -704,7 +819,7 @@ static void create_mesh(Scene *scene,
int shader = clamp(p->material_index(), 0, used_shaders.size()-1);
bool smooth = p->use_smooth() || use_loop_normals;
vi.reserve(n);
vi.resize(n);
for(int i = 0; i < n; i++) {
/* NOTE: Autosmooth is already taken care of. */
vi[i] = b_mesh.loops[p->loop_start() + i].vertex_index();
@@ -718,6 +833,7 @@ static void create_mesh(Scene *scene,
/* Create all needed attributes.
* The calculate functions will check whether they're needed or not.
*/
attr_create_pointiness(scene, mesh, b_mesh, subdivision);
attr_create_vertex_color(scene, mesh, b_mesh, nverts, face_flags, subdivision);
attr_create_uv_map(scene, mesh, b_mesh, nverts, face_flags, subdivision, subdivide_uvs);
@@ -1178,4 +1294,3 @@ void BlenderSync::sync_mesh_motion(BL::Object& b_ob,
}
CCL_NAMESPACE_END
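
The duplicate-welding step of attr_create_pointiness (sort vertex indices by coordinate sum, then scan forward only while the sums stay within a small tolerance) can be tried in isolation. The following is a minimal sketch under stated assumptions: `Vec3`, `coord_sum`, `dist_squared` and `weld_duplicates` are hypothetical stand-ins rather than Cycles types, and the comparator is simplified to "sum, then index" so that it is a strict weak ordering, which differs from VertexAverageComparator.

```cpp
#include <algorithm>
#include <cassert>
#include <cfloat>
#include <numeric>
#include <vector>

/* Hypothetical stand-in for Cycles' float3; only what the sketch needs. */
struct Vec3 {
	float x, y, z;
};

static float coord_sum(const Vec3& v)
{
	return v.x + v.y + v.z;
}

static float dist_squared(const Vec3& a, const Vec3& b)
{
	const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
	return dx*dx + dy*dy + dz*dz;
}

/* For every vertex, return the index of the representative vertex
 * shared by all of its duplicates. */
std::vector<int> weld_duplicates(const std::vector<Vec3>& verts)
{
	const int num_verts = (int)verts.size();
	/* Sort indices by coordinate sum so that duplicates end up close
	 * to each other in the sorted order. */
	std::vector<int> sorted(num_verts);
	std::iota(sorted.begin(), sorted.end(), 0);
	std::sort(sorted.begin(), sorted.end(), [&verts](int a, int b) {
		const float sum_a = coord_sum(verts[a]);
		const float sum_b = coord_sum(verts[b]);
		return (sum_a != sum_b) ? (sum_a < sum_b) : (a < b);
	});
	/* Every vertex starts out as its own representative. */
	std::vector<int> orig(num_verts);
	std::iota(orig.begin(), orig.end(), 0);
	for(int i = 0; i < num_verts; ++i) {
		const int vert = sorted[i];
		for(int j = i + 1; j < num_verts; ++j) {
			const int other = sorted[j];
			/* Sums only grow from here on, so stop once out of range. */
			if(coord_sum(verts[other]) - coord_sum(verts[vert]) > 3 * FLT_EPSILON) {
				break;
			}
			if(dist_squared(verts[other], verts[vert]) < FLT_EPSILON) {
				orig[vert] = other;
				break;
			}
		}
	}
	/* Links always point forward in sorted order, so following them
	 * terminates at a vertex that represents itself. */
	for(int i = 0; i < num_verts; ++i) {
		int index = orig[i];
		while(index != orig[index]) {
			index = orig[index];
		}
		orig[i] = index;
	}
	return orig;
}
```

Because the inner loop stops as soon as the coordinate sums drift apart, the scan stays short for typical meshes, although many vertices with near-equal sums can still make it quadratic in the worst case.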

View File

@@ -14,24 +14,24 @@
* limitations under the License.
*/
#include "camera.h"
#include "integrator.h"
#include "graph.h"
#include "light.h"
#include "mesh.h"
#include "object.h"
#include "scene.h"
#include "nodes.h"
#include "particles.h"
#include "shader.h"
#include "render/camera.h"
#include "render/integrator.h"
#include "render/graph.h"
#include "render/light.h"
#include "render/mesh.h"
#include "render/object.h"
#include "render/scene.h"
#include "render/nodes.h"
#include "render/particles.h"
#include "render/shader.h"
#include "blender_object_cull.h"
#include "blender_sync.h"
#include "blender_util.h"
#include "blender/blender_object_cull.h"
#include "blender/blender_sync.h"
#include "blender/blender_util.h"
#include "util_foreach.h"
#include "util_hash.h"
#include "util_logging.h"
#include "util/util_foreach.h"
#include "util/util_hash.h"
#include "util/util_logging.h"
CCL_NAMESPACE_BEGIN
@@ -343,6 +343,13 @@ Object *BlenderSync::sync_object(BL::Object& b_parent,
object_updated = true;
}
PointerRNA cobject = RNA_pointer_get(&b_ob.ptr, "cycles");
bool is_shadow_catcher = get_boolean(cobject, "is_shadow_catcher");
if(is_shadow_catcher != object->is_shadow_catcher) {
object->is_shadow_catcher = is_shadow_catcher;
object_updated = true;
}
/* object sync
* transform comparison should not be needed, but duplis don't work perfectly
* in the depsgraph and may not signal changes, so this is a workaround */

View File

@@ -16,9 +16,9 @@
#include <cstdlib>
#include "camera.h"
#include "render/camera.h"
#include "blender_object_cull.h"
#include "blender/blender_object_cull.h"
CCL_NAMESPACE_BEGIN

View File

@@ -17,8 +17,8 @@
#ifndef __BLENDER_OBJECT_CULL_H__
#define __BLENDER_OBJECT_CULL_H__
#include "blender_sync.h"
#include "util_types.h"
#include "blender/blender_sync.h"
#include "util/util_types.h"
CCL_NAMESPACE_BEGIN

View File

@@ -14,14 +14,14 @@
* limitations under the License.
*/
#include "mesh.h"
#include "object.h"
#include "particles.h"
#include "render/mesh.h"
#include "render/object.h"
#include "render/particles.h"
#include "blender_sync.h"
#include "blender_util.h"
#include "blender/blender_sync.h"
#include "blender/blender_util.h"
#include "util_foreach.h"
#include "util/util_foreach.h"
CCL_NAMESPACE_BEGIN

View File

@@ -16,21 +16,21 @@
#include <Python.h>
#include "CCL_api.h"
#include "blender/CCL_api.h"
#include "blender_sync.h"
#include "blender_session.h"
#include "blender/blender_sync.h"
#include "blender/blender_session.h"
#include "util_foreach.h"
#include "util_logging.h"
#include "util_md5.h"
#include "util_opengl.h"
#include "util_path.h"
#include "util_string.h"
#include "util_types.h"
#include "util/util_foreach.h"
#include "util/util_logging.h"
#include "util/util_md5.h"
#include "util/util_opengl.h"
#include "util/util_path.h"
#include "util/util_string.h"
#include "util/util_types.h"
#ifdef WITH_OSL
#include "osl.h"
#include "render/osl.h"
#include <OSL/oslquery.h>
#include <OSL/oslconfig.h>
@@ -67,8 +67,10 @@ bool debug_flags_sync_from_scene(BL::Scene b_scene)
flags.cpu.sse3 = get_boolean(cscene, "debug_use_cpu_sse3");
flags.cpu.sse2 = get_boolean(cscene, "debug_use_cpu_sse2");
flags.cpu.qbvh = get_boolean(cscene, "debug_use_qbvh");
flags.cpu.split_kernel = get_boolean(cscene, "debug_use_cpu_split_kernel");
/* Synchronize CUDA flags. */
flags.cuda.adaptive_compile = get_boolean(cscene, "debug_use_cuda_adaptive_compile");
flags.cuda.split_kernel = get_boolean(cscene, "debug_use_cuda_split_kernel");
/* Synchronize OpenCL kernel type. */
switch(get_enum(cscene, "debug_opencl_kernel_type")) {
case 0:
@@ -104,6 +106,7 @@ bool debug_flags_sync_from_scene(BL::Scene b_scene)
}
/* Synchronize other OpenCL flags. */
flags.opencl.debug = get_boolean(cscene, "debug_use_opencl_debug");
flags.opencl.single_program = get_boolean(cscene, "debug_opencl_kernel_single_program");
return flags.opencl.device_type != opencl_device_type ||
flags.opencl.kernel_type != opencl_kernel_type;
}
@@ -641,7 +644,7 @@ static PyObject *debug_flags_reset_func(PyObject * /*self*/, PyObject * /*args*/
Py_RETURN_NONE;
}
static PyObject *set_resumable_chunks_func(PyObject * /*self*/, PyObject *args)
static PyObject *set_resumable_chunk_func(PyObject * /*self*/, PyObject *args)
{
int num_resumable_chunks, current_resumable_chunk;
if(!PyArg_ParseTuple(args, "ii",
@@ -676,6 +679,53 @@ static PyObject *set_resumable_chunks_func(PyObject * /*self*/, PyObject *args)
Py_RETURN_NONE;
}
static PyObject *set_resumable_chunk_range_func(PyObject * /*self*/, PyObject *args)
{
int num_chunks, start_chunk, end_chunk;
if(!PyArg_ParseTuple(args, "iii",
&num_chunks,
&start_chunk,
&end_chunk)) {
Py_RETURN_NONE;
}
if(num_chunks <= 0) {
fprintf(stderr, "Cycles: Bad value for number of resumable chunks.\n");
abort();
Py_RETURN_NONE;
}
if(start_chunk < 1 || start_chunk > num_chunks) {
fprintf(stderr, "Cycles: Bad value for start chunk number.\n");
abort();
Py_RETURN_NONE;
}
if(end_chunk < 1 || end_chunk > num_chunks) {
fprintf(stderr, "Cycles: Bad value for end chunk number.\n");
abort();
Py_RETURN_NONE;
}
if(start_chunk > end_chunk) {
fprintf(stderr, "Cycles: End chunk should not be lower than start one.\n");
abort();
Py_RETURN_NONE;
}
VLOG(1) << "Initialized resumable render: "
<< "num_resumable_chunks=" << num_chunks << ", "
<< "start_resumable_chunk=" << start_chunk << ", "
<< "end_resumable_chunk=" << end_chunk;
BlenderSession::num_resumable_chunks = num_chunks;
BlenderSession::start_resumable_chunk = start_chunk;
BlenderSession::end_resumable_chunk = end_chunk;
printf("Cycles: Will render chunks %d to %d of %d\n",
start_chunk,
end_chunk,
num_chunks);
Py_RETURN_NONE;
}
static PyObject *get_device_types_func(PyObject * /*self*/, PyObject * /*args*/)
{
vector<DeviceInfo>& devices = Device::available_devices();
@@ -715,7 +765,8 @@ static PyMethodDef methods[] = {
{"debug_flags_reset", debug_flags_reset_func, METH_NOARGS, ""},
/* Resumable render */
{"set_resumable_chunks", set_resumable_chunks_func, METH_VARARGS, ""},
{"set_resumable_chunk", set_resumable_chunk_func, METH_VARARGS, ""},
{"set_resumable_chunk_range", set_resumable_chunk_range_func, METH_VARARGS, ""},
/* Compute Device selection */
{"get_device_types", get_device_types_func, METH_VARARGS, ""},

View File

@@ -16,36 +16,38 @@
#include <stdlib.h>
#include "background.h"
#include "buffers.h"
#include "camera.h"
#include "device.h"
#include "integrator.h"
#include "film.h"
#include "light.h"
#include "mesh.h"
#include "object.h"
#include "scene.h"
#include "session.h"
#include "shader.h"
#include "render/background.h"
#include "render/buffers.h"
#include "render/camera.h"
#include "device/device.h"
#include "render/integrator.h"
#include "render/film.h"
#include "render/light.h"
#include "render/mesh.h"
#include "render/object.h"
#include "render/scene.h"
#include "render/session.h"
#include "render/shader.h"
#include "util_color.h"
#include "util_foreach.h"
#include "util_function.h"
#include "util_hash.h"
#include "util_logging.h"
#include "util_progress.h"
#include "util_time.h"
#include "util/util_color.h"
#include "util/util_foreach.h"
#include "util/util_function.h"
#include "util/util_hash.h"
#include "util/util_logging.h"
#include "util/util_progress.h"
#include "util/util_time.h"
#include "blender_sync.h"
#include "blender_session.h"
#include "blender_util.h"
#include "blender/blender_sync.h"
#include "blender/blender_session.h"
#include "blender/blender_util.h"
CCL_NAMESPACE_BEGIN
bool BlenderSession::headless = false;
int BlenderSession::num_resumable_chunks = 0;
int BlenderSession::current_resumable_chunk = 0;
int BlenderSession::start_resumable_chunk = 0;
int BlenderSession::end_resumable_chunk = 0;
BlenderSession::BlenderSession(BL::RenderEngine& b_engine,
BL::UserPreferences& b_userpref,
@@ -68,6 +70,7 @@ BlenderSession::BlenderSession(BL::RenderEngine& b_engine,
background = true;
last_redraw_time = 0.0;
start_resize_time = 0.0;
last_status_time = 0.0;
}
BlenderSession::BlenderSession(BL::RenderEngine& b_engine,
@@ -93,6 +96,7 @@ BlenderSession::BlenderSession(BL::RenderEngine& b_engine,
background = false;
last_redraw_time = 0.0;
start_resize_time = 0.0;
last_status_time = 0.0;
}
BlenderSession::~BlenderSession()
@@ -989,10 +993,14 @@ void BlenderSession::update_status_progress()
if(substatus.size() > 0)
status += " | " + substatus;
if(status != last_status) {
double current_time = time_dt();
/* When rendering in a window, redraw the status at least once per second to keep the elapsed and remaining time up-to-date.
* For headless rendering, only report when something significant changes to keep the console output readable. */
if(status != last_status || (!headless && (current_time - last_status_time) > 1.0)) {
b_engine.update_stats("", (timestatus + scene + status).c_str());
b_engine.update_memory_stats(mem_used, mem_peak);
last_status = status;
last_status_time = current_time;
}
if(progress != last_progress) {
b_engine.update_progress(progress);
@@ -1342,9 +1350,21 @@ void BlenderSession::update_resumable_tile_manager(int num_samples)
return;
}
int num_samples_per_chunk = (int)ceilf((float)num_samples / num_resumable_chunks);
int range_start_sample = num_samples_per_chunk * (current_resumable_chunk - 1);
int range_num_samples = num_samples_per_chunk;
const int num_samples_per_chunk = (int)ceilf((float)num_samples / num_resumable_chunks);
int range_start_sample, range_num_samples;
if(current_resumable_chunk != 0) {
/* Single chunk rendering. */
range_start_sample = num_samples_per_chunk * (current_resumable_chunk - 1);
range_num_samples = num_samples_per_chunk;
}
else {
/* Ranged-chunks. */
const int num_chunks = end_resumable_chunk - start_resumable_chunk + 1;
range_start_sample = num_samples_per_chunk * (start_resumable_chunk - 1);
range_num_samples = num_chunks * num_samples_per_chunk;
}
/* Make sure we don't overshoot. */
if(range_start_sample + range_num_samples > num_samples) {
range_num_samples = num_samples - range_start_sample;
}
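
The chunk-to-sample mapping used by update_resumable_tile_manager can be checked with small numbers. This is a minimal sketch, assuming 1-based inclusive chunk indices as validated by set_resumable_chunk_range_func; `SampleRange` and `chunk_sample_range` are hypothetical names, and single-chunk rendering is modelled as a range whose start and end chunk are equal. The clamp here limits the range to the samples remaining after its start and never lets the count go negative.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

/* Hypothetical helper type, not part of Cycles. */
struct SampleRange {
	int start;
	int count;
};

/* Map an inclusive, 1-based range of chunks onto a range of samples. */
SampleRange chunk_sample_range(int num_samples,
                               int num_chunks,
                               int start_chunk,
                               int end_chunk)
{
	/* Round up so that all chunks together cover every sample. */
	const int samples_per_chunk =
	        (int)std::ceil((float)num_samples / num_chunks);
	SampleRange range;
	range.start = samples_per_chunk * (start_chunk - 1);
	range.count = samples_per_chunk * (end_chunk - start_chunk + 1);
	/* The last chunk may be shorter than the others: keep only the
	 * samples that remain after the range start. */
	if(range.start + range.count > num_samples) {
		range.count = std::max(0, num_samples - range.start);
	}
	return range;
}
```

With 10 samples split into 3 chunks, each chunk holds 4 samples, so chunk 3 starts at sample 8 and is clamped to the 2 samples that are left.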

View File

@@ -17,12 +17,12 @@
#ifndef __BLENDER_SESSION_H__
#define __BLENDER_SESSION_H__
#include "device.h"
#include "scene.h"
#include "session.h"
#include "bake.h"
#include "device/device.h"
#include "render/scene.h"
#include "render/session.h"
#include "render/bake.h"
#include "util_vector.h"
#include "util/util_vector.h"
CCL_NAMESPACE_BEGIN
@@ -113,6 +113,7 @@ public:
string last_status;
string last_error;
float last_progress;
double last_status_time;
int width, height;
double start_resize_time;
@@ -137,6 +138,10 @@ public:
/* Current resumable chunk index to render. */
static int current_resumable_chunk;
/* Alternative to single-chunk rendering to render a range of chunks. */
static int start_resumable_chunk;
static int end_resumable_chunk;
protected:
void do_write_update_render_result(BL::RenderResult& b_rr,
BL::RenderLayer& b_rlay,

View File

@@ -14,20 +14,23 @@
* limitations under the License.
*/
#include "background.h"
#include "graph.h"
#include "light.h"
#include "nodes.h"
#include "osl.h"
#include "scene.h"
#include "shader.h"
#include "render/background.h"
#include "render/graph.h"
#include "render/light.h"
#include "render/nodes.h"
#include "render/osl.h"
#include "render/scene.h"
#include "render/shader.h"
#include "blender_texture.h"
#include "blender_sync.h"
#include "blender_util.h"
#include "blender/blender_texture.h"
#include "blender/blender_sync.h"
#include "blender/blender_util.h"
#include "util_debug.h"
#include "util_string.h"
#include "util/util_debug.h"
#include "util/util_foreach.h"
#include "util/util_string.h"
#include "util/util_set.h"
#include "util/util_task.h"
CCL_NAMESPACE_BEGIN
@@ -609,7 +612,8 @@ static ShaderNode *add_node(Scene *scene,
bool is_builtin = b_image.packed_file() ||
b_image.source() == BL::Image::source_GENERATED ||
b_image.source() == BL::Image::source_MOVIE ||
b_engine.is_preview();
(b_engine.is_preview() &&
b_image.source() != BL::Image::source_SEQUENCE);
if(is_builtin) {
/* for builtin images we're using image datablock name to find an image to
@@ -662,7 +666,8 @@ static ShaderNode *add_node(Scene *scene,
bool is_builtin = b_image.packed_file() ||
b_image.source() == BL::Image::source_GENERATED ||
b_image.source() == BL::Image::source_MOVIE ||
b_engine.is_preview();
(b_engine.is_preview() &&
b_image.source() != BL::Image::source_SEQUENCE);
if(is_builtin) {
int scene_frame = b_scene.frame_current();
@@ -1162,6 +1167,9 @@ void BlenderSync::sync_materials(bool update_all)
/* material loop */
BL::BlendData::materials_iterator b_mat;
TaskPool pool;
set<Shader*> updated_shaders;
for(b_data.materials.begin(b_mat); b_mat != b_data.materials.end(); ++b_mat) {
Shader *shader;
@@ -1197,9 +1205,37 @@ void BlenderSync::sync_materials(bool update_all)
shader->displacement_method = (experimental) ? get_displacement_method(cmat) : DISPLACE_BUMP;
shader->set_graph(graph);
shader->tag_update(scene);
/* By simplifying the shader graph as soon as possible, some
* redundant shader nodes might be removed which prevents loading
* unnecessary attributes later.
*
* However, since graph simplification also accounts for e.g. mix
* weight, this would cause frequent expensive resyncs in interactive
* sessions, so for those sessions optimization is only performed
* right before compiling.
*/
if(!preview) {
pool.push(function_bind(&ShaderGraph::simplify, graph, scene));
/* NOTE: Update shaders out of the threads since those routines
* are accessing and writing to a global context.
*/
updated_shaders.insert(shader);
}
else {
/* NOTE: Update tagging can access links which are being
* optimized out.
*/
shader->tag_update(scene);
}
}
}
pool.wait_work();
foreach(Shader *shader, updated_shaders) {
shader->tag_update(scene);
}
}
/* Sync World */

View File

@@ -14,29 +14,29 @@
* limitations under the License.
*/
#include "background.h"
#include "camera.h"
#include "film.h"
#include "graph.h"
#include "integrator.h"
#include "light.h"
#include "mesh.h"
#include "nodes.h"
#include "object.h"
#include "scene.h"
#include "shader.h"
#include "curves.h"
#include "render/background.h"
#include "render/camera.h"
#include "render/film.h"
#include "render/graph.h"
#include "render/integrator.h"
#include "render/light.h"
#include "render/mesh.h"
#include "render/nodes.h"
#include "render/object.h"
#include "render/scene.h"
#include "render/shader.h"
#include "render/curves.h"
#include "device.h"
#include "device/device.h"
#include "blender_sync.h"
#include "blender_session.h"
#include "blender_util.h"
#include "blender/blender_sync.h"
#include "blender/blender_session.h"
#include "blender/blender_util.h"
#include "util_debug.h"
#include "util_foreach.h"
#include "util_opengl.h"
#include "util_hash.h"
#include "util/util_debug.h"
#include "util/util_foreach.h"
#include "util/util_opengl.h"
#include "util/util_hash.h"
CCL_NAMESPACE_BEGIN

View File

@@ -22,15 +22,15 @@
#include "RNA_access.h"
#include "RNA_blender_cpp.h"
#include "blender_util.h"
#include "blender/blender_util.h"
#include "scene.h"
#include "session.h"
#include "render/scene.h"
#include "render/session.h"
#include "util_map.h"
#include "util_set.h"
#include "util_transform.h"
#include "util_vector.h"
#include "util/util_map.h"
#include "util/util_set.h"
#include "util/util_transform.h"
#include "util/util_vector.h"
CCL_NAMESPACE_BEGIN

View File

@@ -14,7 +14,7 @@
* limitations under the License.
*/
#include "blender_texture.h"
#include "blender/blender_texture.h"
CCL_NAMESPACE_BEGIN

View File

@@ -18,7 +18,7 @@
#define __BLENDER_TEXTURE_H__
#include <stdlib.h>
#include "blender_sync.h"
#include "blender/blender_sync.h"
CCL_NAMESPACE_BEGIN

View File

@@ -17,14 +17,15 @@
#ifndef __BLENDER_UTIL_H__
#define __BLENDER_UTIL_H__
#include "mesh.h"
#include "render/mesh.h"
#include "util_map.h"
#include "util_path.h"
#include "util_set.h"
#include "util_transform.h"
#include "util_types.h"
#include "util_vector.h"
#include "util/util_algorithm.h"
#include "util/util_map.h"
#include "util/util_path.h"
#include "util/util_set.h"
#include "util/util_transform.h"
#include "util/util_types.h"
#include "util/util_vector.h"
/* Hacks to hook into Blender API
* todo: clean this up ... */
@@ -78,7 +79,7 @@ static inline BL::Mesh object_to_mesh(BL::BlendData& data,
me.calc_normals_split();
}
else {
me.split_faces();
me.split_faces(false);
}
}
if(subdivision_type == Mesh::SUBDIVISION_NONE) {
@@ -173,22 +174,19 @@ static inline void curvemapping_color_to_array(BL::CurveMapping& cumap,
if(rgb_curve) {
BL::CurveMap mapI = cumap.curves[3];
for(int i = 0; i < size; i++) {
float t = min_x + (float)i/(float)(size-1) * range_x;
data[i][0] = mapR.evaluate(mapI.evaluate(t));
data[i][1] = mapG.evaluate(mapI.evaluate(t));
data[i][2] = mapB.evaluate(mapI.evaluate(t));
const float t = min_x + (float)i/(float)(size-1) * range_x;
data[i] = make_float3(mapR.evaluate(mapI.evaluate(t)),
mapG.evaluate(mapI.evaluate(t)),
mapB.evaluate(mapI.evaluate(t)));
}
}
else {
for(int i = 0; i < size; i++) {
float t = min_x + (float)i/(float)(size-1) * range_x;
data[i][0] = mapR.evaluate(t);
data[i][1] = mapG.evaluate(t);
data[i][2] = mapB.evaluate(t);
data[i] = make_float3(mapR.evaluate(t),
mapG.evaluate(t),
mapB.evaluate(t));
}
}
}
@@ -786,6 +784,35 @@ struct ParticleSystemKey {
}
};
class EdgeMap {
public:
EdgeMap() {
}
void clear() {
edges_.clear();
}
void insert(int v0, int v1) {
get_sorted_verts(v0, v1);
edges_.insert(std::pair<int, int>(v0, v1));
}
bool exists(int v0, int v1) {
get_sorted_verts(v0, v1);
return edges_.find(std::pair<int, int>(v0, v1)) != edges_.end();
}
protected:
void get_sorted_verts(int& v0, int& v1) {
if(v0 > v1) {
swap(v0, v1);
}
}
set< std::pair<int, int> > edges_;
};
CCL_NAMESPACE_END
#endif /* __BLENDER_UTIL_H__ */
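
The EdgeMap helper added to blender_util.h treats an edge as an unordered pair of vertex indices by sorting the two indices before storing them. The standalone sketch below shows the same idea outside of Cycles; `UndirectedEdgeSet` is a hypothetical name, and returning a bool from `insert` is an addition made for this illustration only.

```cpp
#include <cassert>
#include <set>
#include <utility>

/* Hypothetical standalone variant of the EdgeMap idea: a set of
 * undirected edges, so (a, b) and (b, a) are the same edge. */
class UndirectedEdgeSet {
public:
	/* Returns true when the edge was not stored before. */
	bool insert(int v0, int v1)
	{
		return edges_.insert(sorted_pair(v0, v1)).second;
	}

	bool exists(int v0, int v1) const
	{
		return edges_.find(sorted_pair(v0, v1)) != edges_.end();
	}

	void clear()
	{
		edges_.clear();
	}

private:
	/* Store the smaller index first so lookups ignore direction. */
	static std::pair<int, int> sorted_pair(int v0, int v1)
	{
		if(v0 > v1) {
			std::swap(v0, v1);
		}
		return std::make_pair(v0, v1);
	}

	std::set< std::pair<int, int> > edges_;
};
```

Once duplicated vertices are remapped to a shared index, several mesh edges can collapse onto the same index pair, and a set like this lets each such edge be counted only once.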

View File

@@ -1,12 +1,6 @@
set(INC
.
../graph
../kernel
../kernel/svm
../render
../util
../device
..
)
set(INC_SYS

View File

@@ -15,25 +15,25 @@
* limitations under the License.
*/
#include "mesh.h"
#include "object.h"
#include "scene.h"
#include "curves.h"
#include "render/mesh.h"
#include "render/object.h"
#include "render/scene.h"
#include "render/curves.h"
#include "bvh.h"
#include "bvh_build.h"
#include "bvh_node.h"
#include "bvh_params.h"
#include "bvh_unaligned.h"
#include "bvh/bvh.h"
#include "bvh/bvh_build.h"
#include "bvh/bvh_node.h"
#include "bvh/bvh_params.h"
#include "bvh/bvh_unaligned.h"
#include "util_debug.h"
#include "util_foreach.h"
#include "util_logging.h"
#include "util_map.h"
#include "util_progress.h"
#include "util_system.h"
#include "util_types.h"
#include "util_math.h"
#include "util/util_debug.h"
#include "util/util_foreach.h"
#include "util/util_logging.h"
#include "util/util_map.h"
#include "util/util_progress.h"
#include "util/util_system.h"
#include "util/util_types.h"
#include "util/util_math.h"
CCL_NAMESPACE_BEGIN
@@ -67,7 +67,7 @@ BVH *BVH::create(const BVHParams& params, const vector<Object*>& objects)
if(params.use_qbvh)
return new QBVH(params, objects);
else
return new RegularBVH(params, objects);
return new BinaryBVH(params, objects);
}
/* Building */
@@ -81,6 +81,7 @@ void BVH::build(Progress& progress)
pack.prim_type,
pack.prim_index,
pack.prim_object,
pack.prim_time,
params,
progress);
BVHNode *root = bvh_build.run();
@@ -256,6 +257,10 @@ void BVH::pack_instances(size_t nodes_size, size_t leaf_nodes_size)
pack.leaf_nodes.resize(leaf_nodes_size);
pack.object_node.resize(objects.size());
if(params.num_motion_curve_steps > 0 || params.num_motion_triangle_steps > 0) {
pack.prim_time.resize(prim_index_size);
}
int *pack_prim_index = (pack.prim_index.size())? &pack.prim_index[0]: NULL;
int *pack_prim_type = (pack.prim_type.size())? &pack.prim_type[0]: NULL;
int *pack_prim_object = (pack.prim_object.size())? &pack.prim_object[0]: NULL;
@@ -264,6 +269,7 @@ void BVH::pack_instances(size_t nodes_size, size_t leaf_nodes_size)
uint *pack_prim_tri_index = (pack.prim_tri_index.size())? &pack.prim_tri_index[0]: NULL;
int4 *pack_nodes = (pack.nodes.size())? &pack.nodes[0]: NULL;
int4 *pack_leaf_nodes = (pack.leaf_nodes.size())? &pack.leaf_nodes[0]: NULL;
float2 *pack_prim_time = (pack.prim_time.size())? &pack.prim_time[0]: NULL;
/* merge */
foreach(Object *ob, objects) {
@@ -309,6 +315,7 @@ void BVH::pack_instances(size_t nodes_size, size_t leaf_nodes_size)
int *bvh_prim_type = &bvh->pack.prim_type[0];
uint *bvh_prim_visibility = &bvh->pack.prim_visibility[0];
uint *bvh_prim_tri_index = &bvh->pack.prim_tri_index[0];
float2 *bvh_prim_time = bvh->pack.prim_time.size()? &bvh->pack.prim_time[0]: NULL;
for(size_t i = 0; i < bvh_prim_index_size; i++) {
if(bvh->pack.prim_type[i] & PRIMITIVE_ALL_CURVE) {
@@ -324,6 +331,9 @@ void BVH::pack_instances(size_t nodes_size, size_t leaf_nodes_size)
pack_prim_type[pack_prim_index_offset] = bvh_prim_type[i];
pack_prim_visibility[pack_prim_index_offset] = bvh_prim_visibility[i];
pack_prim_object[pack_prim_index_offset] = 0; // unused for instances
if(bvh_prim_time != NULL) {
pack_prim_time[pack_prim_index_offset] = bvh_prim_time[i];
}
pack_prim_index_offset++;
}
}
@@ -414,64 +424,64 @@ static bool node_bvh_is_unaligned(const BVHNode *node)
{
const BVHNode *node0 = node->get_child(0),
*node1 = node->get_child(1);
return node0->is_unaligned() || node1->is_unaligned();
return node0->is_unaligned || node1->is_unaligned;
}
RegularBVH::RegularBVH(const BVHParams& params_, const vector<Object*>& objects_)
BinaryBVH::BinaryBVH(const BVHParams& params_, const vector<Object*>& objects_)
: BVH(params_, objects_)
{
}
void RegularBVH::pack_leaf(const BVHStackEntry& e,
const LeafNode *leaf)
void BinaryBVH::pack_leaf(const BVHStackEntry& e,
const LeafNode *leaf)
{
assert(e.idx + BVH_NODE_LEAF_SIZE <= pack.leaf_nodes.size());
float4 data[BVH_NODE_LEAF_SIZE];
memset(data, 0, sizeof(data));
if(leaf->num_triangles() == 1 && pack.prim_index[leaf->m_lo] == -1) {
if(leaf->num_triangles() == 1 && pack.prim_index[leaf->lo] == -1) {
/* object */
data[0].x = __int_as_float(~(leaf->m_lo));
data[0].x = __int_as_float(~(leaf->lo));
data[0].y = __int_as_float(0);
}
else {
/* triangle */
data[0].x = __int_as_float(leaf->m_lo);
data[0].y = __int_as_float(leaf->m_hi);
data[0].x = __int_as_float(leaf->lo);
data[0].y = __int_as_float(leaf->hi);
}
data[0].z = __uint_as_float(leaf->m_visibility);
data[0].z = __uint_as_float(leaf->visibility);
if(leaf->num_triangles() != 0) {
data[0].w = __uint_as_float(pack.prim_type[leaf->m_lo]);
data[0].w = __uint_as_float(pack.prim_type[leaf->lo]);
}
memcpy(&pack.leaf_nodes[e.idx], data, sizeof(float4)*BVH_NODE_LEAF_SIZE);
}
void RegularBVH::pack_inner(const BVHStackEntry& e,
const BVHStackEntry& e0,
const BVHStackEntry& e1)
void BinaryBVH::pack_inner(const BVHStackEntry& e,
const BVHStackEntry& e0,
const BVHStackEntry& e1)
{
if(e0.node->is_unaligned() || e1.node->is_unaligned()) {
if(e0.node->is_unaligned || e1.node->is_unaligned) {
pack_unaligned_inner(e, e0, e1);
} else {
pack_aligned_inner(e, e0, e1);
}
}
void RegularBVH::pack_aligned_inner(const BVHStackEntry& e,
const BVHStackEntry& e0,
const BVHStackEntry& e1)
void BinaryBVH::pack_aligned_inner(const BVHStackEntry& e,
const BVHStackEntry& e0,
const BVHStackEntry& e1)
{
pack_aligned_node(e.idx,
e0.node->m_bounds, e1.node->m_bounds,
e0.node->bounds, e1.node->bounds,
e0.encodeIdx(), e1.encodeIdx(),
e0.node->m_visibility, e1.node->m_visibility);
e0.node->visibility, e1.node->visibility);
}
void RegularBVH::pack_aligned_node(int idx,
const BoundBox& b0,
const BoundBox& b1,
int c0, int c1,
uint visibility0, uint visibility1)
void BinaryBVH::pack_aligned_node(int idx,
const BoundBox& b0,
const BoundBox& b1,
int c0, int c1,
uint visibility0, uint visibility1)
{
assert(idx + BVH_NODE_SIZE <= pack.nodes.size());
assert(c0 < 0 || c0 < pack.nodes.size());
@@ -498,26 +508,26 @@ void RegularBVH::pack_aligned_node(int idx,
memcpy(&pack.nodes[idx], data, sizeof(int4)*BVH_NODE_SIZE);
}
void RegularBVH::pack_unaligned_inner(const BVHStackEntry& e,
const BVHStackEntry& e0,
const BVHStackEntry& e1)
void BinaryBVH::pack_unaligned_inner(const BVHStackEntry& e,
const BVHStackEntry& e0,
const BVHStackEntry& e1)
{
pack_unaligned_node(e.idx,
e0.node->get_aligned_space(),
e1.node->get_aligned_space(),
e0.node->m_bounds,
e1.node->m_bounds,
e0.node->bounds,
e1.node->bounds,
e0.encodeIdx(), e1.encodeIdx(),
e0.node->m_visibility, e1.node->m_visibility);
e0.node->visibility, e1.node->visibility);
}
void RegularBVH::pack_unaligned_node(int idx,
const Transform& aligned_space0,
const Transform& aligned_space1,
const BoundBox& bounds0,
const BoundBox& bounds1,
int c0, int c1,
uint visibility0, uint visibility1)
void BinaryBVH::pack_unaligned_node(int idx,
const Transform& aligned_space0,
const Transform& aligned_space1,
const BoundBox& bounds0,
const BoundBox& bounds1,
int c0, int c1,
uint visibility0, uint visibility1)
{
assert(idx + BVH_UNALIGNED_NODE_SIZE <= pack.nodes.size());
assert(c0 < 0 || c0 < pack.nodes.size());
@@ -543,7 +553,7 @@ void RegularBVH::pack_unaligned_node(int idx,
memcpy(&pack.nodes[idx], data, sizeof(float4)*BVH_UNALIGNED_NODE_SIZE);
}
void RegularBVH::pack_nodes(const BVHNode *root)
void BinaryBVH::pack_nodes(const BVHNode *root)
{
const size_t num_nodes = root->getSubtreeSize(BVH_STAT_NODE_COUNT);
const size_t num_leaf_nodes = root->getSubtreeSize(BVH_STAT_LEAF_COUNT);
@@ -620,7 +630,7 @@ void RegularBVH::pack_nodes(const BVHNode *root)
pack.root_index = (root->is_leaf())? -1: 0;
}
void RegularBVH::refit_nodes()
void BinaryBVH::refit_nodes()
{
assert(!params.top_level);
@@ -629,7 +639,7 @@ void RegularBVH::refit_nodes()
refit_node(0, (pack.root_index == -1)? true: false, bbox, visibility);
}
void RegularBVH::refit_node(int idx, bool leaf, BoundBox& bbox, uint& visibility)
void BinaryBVH::refit_node(int idx, bool leaf, BoundBox& bbox, uint& visibility)
{
if(leaf) {
assert(idx + BVH_NODE_LEAF_SIZE <= pack.leaf_nodes.size());
@@ -759,18 +769,18 @@ static bool node_qbvh_is_unaligned(const BVHNode *node)
*node1 = node->get_child(1);
bool has_unaligned = false;
if(node0->is_leaf()) {
has_unaligned |= node0->is_unaligned();
has_unaligned |= node0->is_unaligned;
}
else {
has_unaligned |= node0->get_child(0)->is_unaligned();
has_unaligned |= node0->get_child(1)->is_unaligned();
has_unaligned |= node0->get_child(0)->is_unaligned;
has_unaligned |= node0->get_child(1)->is_unaligned;
}
if(node1->is_leaf()) {
has_unaligned |= node1->is_unaligned();
has_unaligned |= node1->is_unaligned;
}
else {
has_unaligned |= node1->get_child(0)->is_unaligned();
has_unaligned |= node1->get_child(1)->is_unaligned();
has_unaligned |= node1->get_child(0)->is_unaligned;
has_unaligned |= node1->get_child(1)->is_unaligned;
}
return has_unaligned;
}
@@ -785,19 +795,19 @@ void QBVH::pack_leaf(const BVHStackEntry& e, const LeafNode *leaf)
{
float4 data[BVH_QNODE_LEAF_SIZE];
memset(data, 0, sizeof(data));
if(leaf->num_triangles() == 1 && pack.prim_index[leaf->m_lo] == -1) {
if(leaf->num_triangles() == 1 && pack.prim_index[leaf->lo] == -1) {
/* object */
data[0].x = __int_as_float(~(leaf->m_lo));
data[0].x = __int_as_float(~(leaf->lo));
data[0].y = __int_as_float(0);
}
else {
/* triangle */
data[0].x = __int_as_float(leaf->m_lo);
data[0].y = __int_as_float(leaf->m_hi);
data[0].x = __int_as_float(leaf->lo);
data[0].y = __int_as_float(leaf->hi);
}
data[0].z = __uint_as_float(leaf->m_visibility);
data[0].z = __uint_as_float(leaf->visibility);
if(leaf->num_triangles() != 0) {
data[0].w = __uint_as_float(pack.prim_type[leaf->m_lo]);
data[0].w = __uint_as_float(pack.prim_type[leaf->lo]);
}
memcpy(&pack.leaf_nodes[e.idx], data, sizeof(float4)*BVH_QNODE_LEAF_SIZE);
@@ -813,7 +823,7 @@ void QBVH::pack_inner(const BVHStackEntry& e,
*/
if(params.use_unaligned_nodes) {
for(int i = 0; i < num; i++) {
if(en[i].node->is_unaligned()) {
if(en[i].node->is_unaligned) {
has_unaligned = true;
break;
}
@@ -838,15 +848,15 @@ void QBVH::pack_aligned_inner(const BVHStackEntry& e,
BoundBox bounds[4];
int child[4];
for(int i = 0; i < num; ++i) {
bounds[i] = en[i].node->m_bounds;
bounds[i] = en[i].node->bounds;
child[i] = en[i].encodeIdx();
}
pack_aligned_node(e.idx,
bounds,
child,
e.node->m_visibility,
e.node->m_time_from,
e.node->m_time_to,
e.node->visibility,
e.node->time_from,
e.node->time_to,
num);
}
@@ -907,16 +917,16 @@ void QBVH::pack_unaligned_inner(const BVHStackEntry& e,
int child[4];
for(int i = 0; i < num; ++i) {
aligned_space[i] = en[i].node->get_aligned_space();
bounds[i] = en[i].node->m_bounds;
bounds[i] = en[i].node->bounds;
child[i] = en[i].encodeIdx();
}
pack_unaligned_node(e.idx,
aligned_space,
bounds,
child,
e.node->m_visibility,
e.node->m_time_from,
e.node->m_time_to,
e.node->visibility,
e.node->time_from,
e.node->time_to,
num);
}
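
The hunks above thread an optional per-primitive time range (`prim_time`) through `pack_instances()`: the source pointer is NULL when an instance BVH carries no time data, and the copy is skipped in that case. A minimal sketch of that pattern follows. `Float2` and `append_prim_time` are hypothetical names invented for illustration, not Cycles API, and the sketch assumes the destination array was already sized by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/* Hypothetical stand-in for Cycles' float2; not the real type. */
struct Float2 {
	float x, y;
};

/* Copy per-primitive time ranges from one instance BVH into the merged
 * array, starting at `offset`. A source without time data is skipped,
 * mirroring the NULL check in the diff. Assumes `merged` is already large
 * enough. Returns the offset just past the processed primitives. */
static std::size_t append_prim_time(std::vector<Float2>& merged,
                                    std::size_t offset,
                                    const std::vector<Float2>& src,
                                    std::size_t num_prims)
{
	const Float2 *src_time = src.empty() ? NULL : &src[0];
	for(std::size_t i = 0; i < num_prims; i++) {
		if(src_time != NULL) {
			merged[offset] = src_time[i];
		}
		/* The offset advances either way, so other per-primitive arrays
		 * stay in step with this one. */
		offset++;
	}
	return offset;
}
```

Advancing the offset unconditionally keeps the time array index-compatible with the type, index and object arrays, which is why the diff places the check inside the existing loop rather than adding a second one.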

@@ -18,10 +18,10 @@
#ifndef __BVH_H__
#define __BVH_H__
#include "bvh_params.h"
#include "bvh/bvh_params.h"
#include "util_types.h"
#include "util_vector.h"
#include "util/util_types.h"
#include "util/util_vector.h"
CCL_NAMESPACE_BEGIN
@@ -68,6 +68,8 @@ struct PackedBVH {
array<int> prim_index;
/* mapping from BVH primitive index, to the object id of that primitive. */
array<int> prim_object;
/* Time range of BVH primitive. */
array<float2> prim_time;
/* index of the root node. */
int root_index;
@@ -108,15 +110,15 @@ protected:
virtual void refit_nodes() = 0;
};
/* Regular BVH
/* Binary BVH
*
* Typical BVH with each node having two children. */
class RegularBVH : public BVH {
class BinaryBVH : public BVH {
protected:
/* constructor */
friend class BVH;
RegularBVH(const BVHParams& params, const vector<Object*>& objects);
BinaryBVH(const BVHParams& params, const vector<Object*>& objects);
/* pack */
void pack_nodes(const BVHNode *root);

@@ -19,11 +19,11 @@
#include <stdlib.h>
#include "bvh_binning.h"
#include "bvh/bvh_binning.h"
#include "util_algorithm.h"
#include "util_boundbox.h"
#include "util_types.h"
#include "util/util_algorithm.h"
#include "util/util_boundbox.h"
#include "util/util_types.h"
CCL_NAMESPACE_BEGIN

@@ -18,10 +18,10 @@
#ifndef __BVH_BINNING_H__
#define __BVH_BINNING_H__
#include "bvh_params.h"
#include "bvh_unaligned.h"
#include "bvh/bvh_params.h"
#include "bvh/bvh_unaligned.h"
#include "util_types.h"
#include "util/util_types.h"
CCL_NAMESPACE_BEGIN

@@ -15,26 +15,26 @@
* limitations under the License.
*/
#include "bvh_binning.h"
#include "bvh_build.h"
#include "bvh_node.h"
#include "bvh_params.h"
#include "bvh/bvh_binning.h"
#include "bvh/bvh_build.h"
#include "bvh/bvh_node.h"
#include "bvh/bvh_params.h"
#include "bvh_split.h"
#include "mesh.h"
#include "object.h"
#include "scene.h"
#include "curves.h"
#include "render/mesh.h"
#include "render/object.h"
#include "render/scene.h"
#include "render/curves.h"
#include "util_algorithm.h"
#include "util_debug.h"
#include "util_foreach.h"
#include "util_logging.h"
#include "util_progress.h"
#include "util_stack_allocator.h"
#include "util_simd.h"
#include "util_time.h"
#include "util_queue.h"
#include "util/util_algorithm.h"
#include "util/util_debug.h"
#include "util/util_foreach.h"
#include "util/util_logging.h"
#include "util/util_progress.h"
#include "util/util_stack_allocator.h"
#include "util/util_simd.h"
#include "util/util_time.h"
#include "util/util_queue.h"
CCL_NAMESPACE_BEGIN
@@ -93,12 +93,14 @@ BVHBuild::BVHBuild(const vector<Object*>& objects_,
array<int>& prim_type_,
array<int>& prim_index_,
array<int>& prim_object_,
array<float2>& prim_time_,
const BVHParams& params_,
Progress& progress_)
: objects(objects_),
prim_type(prim_type_),
prim_index(prim_index_),
prim_object(prim_object_),
prim_time(prim_time_),
params(params_),
progress(progress_),
progress_start_time(0.0),
@@ -465,6 +467,9 @@ BVHNode* BVHBuild::run()
}
spatial_free_index = 0;
need_prim_time = params.num_motion_curve_steps > 0 ||
params.num_motion_triangle_steps > 0;
/* init progress updates */
double build_start_time;
build_start_time = progress_start_time = time_dt();
@@ -475,6 +480,12 @@ BVHNode* BVHBuild::run()
prim_type.resize(references.size());
prim_index.resize(references.size());
prim_object.resize(references.size());
if(need_prim_time) {
prim_time.resize(references.size());
}
else {
prim_time.resize(0);
}
/* build recursively */
BVHNode *rootnode;
@@ -849,11 +860,14 @@ BVHNode *BVHBuild::create_object_leaf_nodes(const BVHReference *ref, int start,
prim_type[start] = ref->prim_type();
prim_index[start] = ref->prim_index();
prim_object[start] = ref->prim_object();
if(need_prim_time) {
prim_time[start] = make_float2(ref->time_from(), ref->time_to());
}
uint visibility = objects[ref->prim_object()]->visibility;
BVHNode *leaf_node = new LeafNode(ref->bounds(), visibility, start, start+1);
leaf_node->m_time_from = ref->time_from();
leaf_node->m_time_to = ref->time_to();
leaf_node->time_from = ref->time_from();
leaf_node->time_to = ref->time_to();
return leaf_node;
}
else {
@@ -862,12 +876,12 @@ BVHNode *BVHBuild::create_object_leaf_nodes(const BVHReference *ref, int start,
BVHNode *leaf1 = create_object_leaf_nodes(ref+mid, start+mid, num-mid);
BoundBox bounds = BoundBox::empty;
bounds.grow(leaf0->m_bounds);
bounds.grow(leaf1->m_bounds);
bounds.grow(leaf0->bounds);
bounds.grow(leaf1->bounds);
BVHNode *inner_node = new InnerNode(bounds, leaf0, leaf1);
inner_node->m_time_from = min(leaf0->m_time_from, leaf1->m_time_from);
inner_node->m_time_to = max(leaf0->m_time_to, leaf1->m_time_to);
inner_node->time_from = min(leaf0->time_from, leaf1->time_from);
inner_node->time_to = max(leaf0->time_to, leaf1->time_to);
return inner_node;
}
}
@@ -891,11 +905,13 @@ BVHNode* BVHBuild::create_leaf_node(const BVHRange& range,
* can not control.
*/
typedef StackAllocator<256, int> LeafStackAllocator;
typedef StackAllocator<256, float2> LeafTimeStackAllocator;
typedef StackAllocator<256, BVHReference> LeafReferenceStackAllocator;
vector<int, LeafStackAllocator> p_type[PRIMITIVE_NUM_TOTAL];
vector<int, LeafStackAllocator> p_index[PRIMITIVE_NUM_TOTAL];
vector<int, LeafStackAllocator> p_object[PRIMITIVE_NUM_TOTAL];
vector<float2, LeafTimeStackAllocator> p_time[PRIMITIVE_NUM_TOTAL];
vector<BVHReference, LeafReferenceStackAllocator> p_ref[PRIMITIVE_NUM_TOTAL];
/* TODO(sergey): In theory we should be able to store references. */
@@ -918,6 +934,8 @@ BVHNode* BVHBuild::create_leaf_node(const BVHRange& range,
p_type[type_index].push_back(ref.prim_type());
p_index[type_index].push_back(ref.prim_index());
p_object[type_index].push_back(ref.prim_object());
p_time[type_index].push_back(make_float2(ref.time_from(),
ref.time_to()));
bounds[type_index].grow(ref.bounds());
visibility[type_index] |= objects[ref.prim_object()]->visibility;
@@ -947,9 +965,13 @@ BVHNode* BVHBuild::create_leaf_node(const BVHRange& range,
vector<int, LeafStackAllocator> local_prim_type,
local_prim_index,
local_prim_object;
vector<float2, LeafTimeStackAllocator> local_prim_time;
local_prim_type.resize(num_new_prims);
local_prim_index.resize(num_new_prims);
local_prim_object.resize(num_new_prims);
if(need_prim_time) {
local_prim_time.resize(num_new_prims);
}
for(int i = 0; i < PRIMITIVE_NUM_TOTAL; ++i) {
int num = (int)p_type[i].size();
if(num != 0) {
@@ -962,6 +984,9 @@ BVHNode* BVHBuild::create_leaf_node(const BVHRange& range,
local_prim_type[index] = p_type[i][j];
local_prim_index[index] = p_index[i][j];
local_prim_object[index] = p_object[i][j];
if(need_prim_time) {
local_prim_time[index] = p_time[i][j];
}
if(params.use_unaligned_nodes && !alignment_found) {
alignment_found =
unaligned_heuristic.compute_aligned_space(p_ref[i][j],
@@ -979,19 +1004,19 @@ BVHNode* BVHBuild::create_leaf_node(const BVHRange& range,
time_from = min(time_from, ref.time_from());
time_to = max(time_to, ref.time_to());
}
leaf_node->m_time_from = time_from;
leaf_node->m_time_to = time_to;
leaf_node->time_from = time_from;
leaf_node->time_to = time_to;
}
if(alignment_found) {
/* Need to recalculate leaf bounds with new alignment. */
leaf_node->m_bounds = BoundBox::empty;
leaf_node->bounds = BoundBox::empty;
for(int j = 0; j < num; ++j) {
const BVHReference &ref = p_ref[i][j];
BoundBox ref_bounds =
unaligned_heuristic.compute_aligned_prim_boundbox(
ref,
aligned_space);
leaf_node->m_bounds.grow(ref_bounds);
leaf_node->bounds.grow(ref_bounds);
}
/* Set alignment space. */
leaf_node->set_aligned_space(aligned_space);
@@ -1028,11 +1053,17 @@ BVHNode* BVHBuild::create_leaf_node(const BVHRange& range,
prim_type.reserve(reserve);
prim_index.reserve(reserve);
prim_object.reserve(reserve);
if(need_prim_time) {
prim_time.reserve(reserve);
}
}
prim_type.resize(range_end);
prim_index.resize(range_end);
prim_object.resize(range_end);
if(need_prim_time) {
prim_time.resize(range_end);
}
}
spatial_spin_lock.unlock();
@@ -1041,6 +1072,9 @@ BVHNode* BVHBuild::create_leaf_node(const BVHRange& range,
memcpy(&prim_type[start_index], &local_prim_type[0], new_leaf_data_size);
memcpy(&prim_index[start_index], &local_prim_index[0], new_leaf_data_size);
memcpy(&prim_object[start_index], &local_prim_object[0], new_leaf_data_size);
if(need_prim_time) {
memcpy(&prim_time[start_index], &local_prim_time[0], sizeof(float2)*num_new_leaf_data);
}
}
}
else {
@@ -1053,6 +1087,9 @@ BVHNode* BVHBuild::create_leaf_node(const BVHRange& range,
memcpy(&prim_type[start_index], &local_prim_type[0], new_leaf_data_size);
memcpy(&prim_index[start_index], &local_prim_index[0], new_leaf_data_size);
memcpy(&prim_object[start_index], &local_prim_object[0], new_leaf_data_size);
if(need_prim_time) {
memcpy(&prim_time[start_index], &local_prim_time[0], sizeof(float2)*num_new_leaf_data);
}
}
}
@@ -1062,8 +1099,8 @@ BVHNode* BVHBuild::create_leaf_node(const BVHRange& range,
*/
for(int i = 0; i < num_leaves; ++i) {
LeafNode *leaf = (LeafNode *)leaves[i];
leaf->m_lo += start_index;
leaf->m_hi += start_index;
leaf->lo += start_index;
leaf->hi += start_index;
}
/* Create leaf node for object. */
@@ -1092,17 +1129,17 @@ BVHNode* BVHBuild::create_leaf_node(const BVHRange& range,
return new InnerNode(range.bounds(), leaves[0], leaves[1]);
}
else if(num_leaves == 3) {
BoundBox inner_bounds = merge(leaves[1]->m_bounds, leaves[2]->m_bounds);
BoundBox inner_bounds = merge(leaves[1]->bounds, leaves[2]->bounds);
BVHNode *inner = new InnerNode(inner_bounds, leaves[1], leaves[2]);
return new InnerNode(range.bounds(), leaves[0], inner);
} else {
/* Should be doing more branches if more primitive types added. */
assert(num_leaves <= 5);
BoundBox inner_bounds_a = merge(leaves[0]->m_bounds, leaves[1]->m_bounds);
BoundBox inner_bounds_b = merge(leaves[2]->m_bounds, leaves[3]->m_bounds);
BoundBox inner_bounds_a = merge(leaves[0]->bounds, leaves[1]->bounds);
BoundBox inner_bounds_b = merge(leaves[2]->bounds, leaves[3]->bounds);
BVHNode *inner_a = new InnerNode(inner_bounds_a, leaves[0], leaves[1]);
BVHNode *inner_b = new InnerNode(inner_bounds_b, leaves[2], leaves[3]);
BoundBox inner_bounds_c = merge(inner_a->m_bounds, inner_b->m_bounds);
BoundBox inner_bounds_c = merge(inner_a->bounds, inner_b->bounds);
BVHNode *inner_c = new InnerNode(inner_bounds_c, inner_a, inner_b);
if(num_leaves == 5) {
return new InnerNode(range.bounds(), inner_c, leaves[4]);
@@ -1137,8 +1174,8 @@ void BVHBuild::rotate(BVHNode *node, int max_depth)
rotate(parent->children[c], max_depth-1);
/* compute current area of all children */
BoundBox bounds0 = parent->children[0]->m_bounds;
BoundBox bounds1 = parent->children[1]->m_bounds;
BoundBox bounds0 = parent->children[0]->bounds;
BoundBox bounds1 = parent->children[1]->bounds;
float area0 = bounds0.half_area();
float area1 = bounds1.half_area();
@@ -1158,8 +1195,8 @@ void BVHBuild::rotate(BVHNode *node, int max_depth)
BoundBox& other = (c == 0)? bounds1: bounds0;
/* transpose child bounds */
BoundBox target0 = child->children[0]->m_bounds;
BoundBox target1 = child->children[1]->m_bounds;
BoundBox target0 = child->children[0]->bounds;
BoundBox target1 = child->children[1]->bounds;
/* compute cost for both possible swaps */
float cost0 = merge(other, target1).half_area() - child_area[c];
@@ -1191,7 +1228,7 @@ void BVHBuild::rotate(BVHNode *node, int max_depth)
InnerNode *child = (InnerNode*)parent->children[best_child];
swap(parent->children[best_other], child->children[best_target]);
child->m_bounds = merge(child->children[0]->m_bounds, child->children[1]->m_bounds);
child->bounds = merge(child->children[0]->bounds, child->children[1]->bounds);
}
CCL_NAMESPACE_END
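
Several hunks in this file set a node's `time_from`/`time_to` to the union of its children's ranges (minimum of the starts, maximum of the ends). The sketch below isolates that rule. `TimeRange` and `merge_time` are hypothetical helpers written for illustration and do not exist in the Cycles sources.

```cpp
#include <algorithm>
#include <cassert>

/* Hypothetical, simplified holder for the two time fields from the diff. */
struct TimeRange {
	float time_from, time_to;
};

/* Smallest range that covers both inputs: the earliest start and the
 * latest end, as done for inner nodes in create_object_leaf_nodes(). */
static TimeRange merge_time(const TimeRange& a, const TimeRange& b)
{
	TimeRange result;
	result.time_from = std::min(a.time_from, b.time_from);
	result.time_to = std::max(a.time_to, b.time_to);
	return result;
}
```

Because the merged range can only grow, a parent's range always contains the ranges of everything beneath it.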

@@ -20,13 +20,13 @@
#include <float.h>
#include "bvh.h"
#include "bvh_binning.h"
#include "bvh_unaligned.h"
#include "bvh/bvh.h"
#include "bvh/bvh_binning.h"
#include "bvh/bvh_unaligned.h"
#include "util_boundbox.h"
#include "util_task.h"
#include "util_vector.h"
#include "util/util_boundbox.h"
#include "util/util_task.h"
#include "util/util_vector.h"
CCL_NAMESPACE_BEGIN
@@ -48,6 +48,7 @@ public:
array<int>& prim_type,
array<int>& prim_index,
array<int>& prim_object,
array<float2>& prim_time,
const BVHParams& params,
Progress& progress);
~BVHBuild();
@@ -112,6 +113,9 @@ protected:
array<int>& prim_type;
array<int>& prim_index;
array<int>& prim_object;
array<float2>& prim_time;
bool need_prim_time;
/* Build parameters. */
BVHParams params;

@@ -15,12 +15,12 @@
* limitations under the License.
*/
#include "bvh.h"
#include "bvh_build.h"
#include "bvh_node.h"
#include "bvh/bvh.h"
#include "bvh/bvh_build.h"
#include "bvh/bvh_node.h"
#include "util_debug.h"
#include "util_vector.h"
#include "util/util_debug.h"
#include "util/util_vector.h"
CCL_NAMESPACE_BEGIN
@@ -62,12 +62,12 @@ int BVHNode::getSubtreeSize(BVH_STAT stat) const
}
return cnt;
case BVH_STAT_ALIGNED_COUNT:
if(!is_unaligned()) {
if(!is_unaligned) {
cnt = 1;
}
break;
case BVH_STAT_UNALIGNED_COUNT:
if(is_unaligned()) {
if(is_unaligned) {
cnt = 1;
}
break;
@@ -75,7 +75,7 @@ int BVHNode::getSubtreeSize(BVH_STAT stat) const
if(!is_leaf()) {
bool has_unaligned = false;
for(int j = 0; j < num_children(); j++) {
has_unaligned |= get_child(j)->is_unaligned();
has_unaligned |= get_child(j)->is_unaligned;
}
cnt += has_unaligned? 0: 1;
}
@@ -84,7 +84,7 @@ int BVHNode::getSubtreeSize(BVH_STAT stat) const
if(!is_leaf()) {
bool has_unaligned = false;
for(int j = 0; j < num_children(); j++) {
has_unaligned |= get_child(j)->is_unaligned();
has_unaligned |= get_child(j)->is_unaligned;
}
cnt += has_unaligned? 1: 0;
}
@@ -95,12 +95,12 @@ int BVHNode::getSubtreeSize(BVH_STAT stat) const
for(int i = 0; i < num_children(); i++) {
BVHNode *node = get_child(i);
if(node->is_leaf()) {
has_unaligned |= node->is_unaligned();
has_unaligned |= node->is_unaligned;
}
else {
for(int j = 0; j < node->num_children(); j++) {
cnt += node->get_child(j)->getSubtreeSize(stat);
has_unaligned |= node->get_child(j)->is_unaligned();
has_unaligned |= node->get_child(j)->is_unaligned;
}
}
}
@@ -113,12 +113,12 @@ int BVHNode::getSubtreeSize(BVH_STAT stat) const
for(int i = 0; i < num_children(); i++) {
BVHNode *node = get_child(i);
if(node->is_leaf()) {
has_unaligned |= node->is_unaligned();
has_unaligned |= node->is_unaligned;
}
else {
for(int j = 0; j < node->num_children(); j++) {
cnt += node->get_child(j)->getSubtreeSize(stat);
has_unaligned |= node->get_child(j)->is_unaligned();
has_unaligned |= node->get_child(j)->is_unaligned;
}
}
}
@@ -126,10 +126,10 @@ int BVHNode::getSubtreeSize(BVH_STAT stat) const
}
return cnt;
case BVH_STAT_ALIGNED_LEAF_COUNT:
cnt = (is_leaf() && !is_unaligned()) ? 1 : 0;
cnt = (is_leaf() && !is_unaligned) ? 1 : 0;
break;
case BVH_STAT_UNALIGNED_LEAF_COUNT:
cnt = (is_leaf() && is_unaligned()) ? 1 : 0;
cnt = (is_leaf() && is_unaligned) ? 1 : 0;
break;
default:
assert(0); /* unknown mode */
@@ -157,7 +157,7 @@ float BVHNode::computeSubtreeSAHCost(const BVHParams& p, float probability) cons
for(int i = 0; i < num_children(); i++) {
BVHNode *child = get_child(i);
SAH += child->computeSubtreeSAHCost(p, probability * child->m_bounds.safe_area()/m_bounds.safe_area());
SAH += child->computeSubtreeSAHCost(p, probability * child->bounds.safe_area()/bounds.safe_area());
}
return SAH;
@@ -165,15 +165,15 @@ float BVHNode::computeSubtreeSAHCost(const BVHParams& p, float probability) cons
uint BVHNode::update_visibility()
{
if(!is_leaf() && m_visibility == 0) {
if(!is_leaf() && visibility == 0) {
InnerNode *inner = (InnerNode*)this;
BVHNode *child0 = inner->children[0];
BVHNode *child1 = inner->children[1];
m_visibility = child0->update_visibility()|child1->update_visibility();
visibility = child0->update_visibility()|child1->update_visibility();
}
return m_visibility;
return visibility;
}
void BVHNode::update_time()
@@ -184,8 +184,8 @@ void BVHNode::update_time()
BVHNode *child1 = inner->children[1];
child0->update_time();
child1->update_time();
m_time_from = min(child0->m_time_from, child1->m_time_from);
m_time_to = max(child0->m_time_to, child1->m_time_to);
time_from = min(child0->time_from, child1->time_from);
time_to = max(child0->time_to, child1->time_to);
}
}
@@ -209,7 +209,7 @@ void LeafNode::print(int depth) const
for(int i = 0; i < depth; i++)
printf(" ");
printf("leaf node %d to %d\n", m_lo, m_hi);
printf("leaf node %d to %d\n", lo, hi);
}
CCL_NAMESPACE_END

@@ -18,9 +18,9 @@
#ifndef __BVH_NODE_H__
#define __BVH_NODE_H__
#include "util_boundbox.h"
#include "util_debug.h"
#include "util_types.h"
#include "util/util_boundbox.h"
#include "util/util_debug.h"
#include "util/util_types.h"
CCL_NAMESPACE_BEGIN
@@ -46,16 +46,16 @@ class BVHParams;
class BVHNode
{
public:
BVHNode() : m_is_unaligned(false),
m_aligned_space(NULL),
m_time_from(0.0f),
m_time_to(1.0f)
BVHNode() : is_unaligned(false),
aligned_space(NULL),
time_from(0.0f),
time_to(1.0f)
{
}
virtual ~BVHNode()
{
delete m_aligned_space;
delete aligned_space;
}
virtual bool is_leaf() const = 0;
@@ -63,30 +63,26 @@ public:
virtual BVHNode *get_child(int i) const = 0;
virtual int num_triangles() const { return 0; }
virtual void print(int depth = 0) const = 0;
bool is_unaligned() const { return m_is_unaligned; }
inline void set_aligned_space(const Transform& aligned_space)
{
m_is_unaligned = true;
if(m_aligned_space == NULL) {
m_aligned_space = new Transform(aligned_space);
is_unaligned = true;
if(this->aligned_space == NULL) {
this->aligned_space = new Transform(aligned_space);
}
else {
*m_aligned_space = aligned_space;
*this->aligned_space = aligned_space;
}
}
inline Transform get_aligned_space() const
{
if(m_aligned_space == NULL) {
if(aligned_space == NULL) {
return transform_identity();
}
return *m_aligned_space;
return *aligned_space;
}
BoundBox m_bounds;
uint m_visibility;
// Subtree functions
int getSubtreeSize(BVH_STAT stat=BVH_STAT_NODE_COUNT) const;
float computeSubtreeSAHCost(const BVHParams& p, float probability = 1.0f) const;
@@ -95,13 +91,18 @@ public:
uint update_visibility();
void update_time();
bool m_is_unaligned;
// Properties.
BoundBox bounds;
uint visibility;
// TODO(sergey): Can be stored as 3x3 matrix, but better to have some
// utilities and type defines in util_transform first.
Transform *m_aligned_space;
bool is_unaligned;
float m_time_from, m_time_to;
/* TODO(sergey): Can be stored as 3x3 matrix, but better to have some
* utilities and type defines in util_transform first.
*/
Transform *aligned_space;
float time_from, time_to;
};
class InnerNode : public BVHNode
@@ -111,20 +112,20 @@ public:
BVHNode* child0,
BVHNode* child1)
{
m_bounds = bounds;
this->bounds = bounds;
children[0] = child0;
children[1] = child1;
if(child0 && child1)
m_visibility = child0->m_visibility|child1->m_visibility;
visibility = child0->visibility|child1->visibility;
else
m_visibility = 0; /* happens on build cancel */
visibility = 0; /* happens on build cancel */
}
explicit InnerNode(const BoundBox& bounds)
{
m_bounds = bounds;
m_visibility = 0;
this->bounds = bounds;
visibility = 0;
children[0] = NULL;
children[1] = NULL;
}
@@ -140,12 +141,12 @@ public:
class LeafNode : public BVHNode
{
public:
LeafNode(const BoundBox& bounds, uint visibility, int lo, int hi)
LeafNode(const BoundBox& bounds, uint visibility, int lo, int hi)
: lo(lo),
hi(hi)
{
m_bounds = bounds;
m_visibility = visibility;
m_lo = lo;
m_hi = hi;
this->bounds = bounds;
this->visibility = visibility;
}
LeafNode(const LeafNode& s)
@@ -157,14 +158,13 @@ public:
bool is_leaf() const { return true; }
int num_children() const { return 0; }
BVHNode *get_child(int) const { return NULL; }
int num_triangles() const { return m_hi - m_lo; }
int num_triangles() const { return hi - lo; }
void print(int depth) const;
int m_lo;
int m_hi;
int lo;
int hi;
};
CCL_NAMESPACE_END
#endif /* __BVH_NODE_H__ */
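
Dropping the `m_` prefix means constructor parameters now share names with the members they initialize, which is why the header switches to `this->bounds = bounds` and to the `lo(lo)` initializer form. A minimal sketch of both spellings is shown below, using hypothetical `Node` and `Leaf` types that are much simpler than the real `BVHNode` and `LeafNode`.

```cpp
#include <cassert>

/* Hypothetical base with a single inherited member. */
struct Node {
	unsigned int visibility;
	Node() : visibility(0) {}
};

struct Leaf : public Node {
	int lo, hi;

	/* In a member initializer list, the name before the parentheses is
	 * looked up as a member and the name inside as the parameter, so
	 * `lo(lo)` is unambiguous. */
	Leaf(unsigned int visibility, int lo, int hi)
	: lo(lo), hi(hi)
	{
		/* Inside the body the parameter shadows the inherited member, so
		 * `this->` is required; a bare `visibility = visibility` would
		 * assign the parameter to itself. */
		this->visibility = visibility;
	}

	int num_triangles() const { return hi - lo; }
};
```

Inherited members cannot appear in a derived class's initializer list, which is presumably why `bounds` and `visibility` are still assigned in the constructor body in the diff.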

@@ -18,9 +18,9 @@
#ifndef __BVH_PARAMS_H__
#define __BVH_PARAMS_H__
#include "util_boundbox.h"
#include "util/util_boundbox.h"
#include "kernel_types.h"
#include "kernel/kernel_types.h"
CCL_NAMESPACE_BEGIN
@@ -104,6 +104,7 @@ public:
primitive_mask = PRIMITIVE_ALL;
num_motion_curve_steps = 0;
num_motion_triangle_steps = 0;
}
/* SAH costs */

@@ -15,12 +15,12 @@
* limitations under the License.
*/
#include "bvh_build.h"
#include "bvh_sort.h"
#include "bvh/bvh_build.h"
#include "bvh/bvh_sort.h"
#include "util_algorithm.h"
#include "util_debug.h"
#include "util_task.h"
#include "util/util_algorithm.h"
#include "util/util_debug.h"
#include "util/util_task.h"
CCL_NAMESPACE_BEGIN

@@ -15,14 +15,14 @@
* limitations under the License.
*/
#include "bvh_build.h"
#include "bvh_split.h"
#include "bvh_sort.h"
#include "bvh/bvh_build.h"
#include "bvh/bvh_split.h"
#include "bvh/bvh_sort.h"
#include "mesh.h"
#include "object.h"
#include "render/mesh.h"
#include "render/object.h"
#include "util_algorithm.h"
#include "util/util_algorithm.h"
CCL_NAMESPACE_BEGIN

@@ -18,8 +18,8 @@
#ifndef __BVH_SPLIT_H__
#define __BVH_SPLIT_H__
#include "bvh_build.h"
#include "bvh_params.h"
#include "bvh/bvh_build.h"
#include "bvh/bvh_params.h"
CCL_NAMESPACE_BEGIN

@@ -15,17 +15,17 @@
*/
#include "bvh_unaligned.h"
#include "bvh/bvh_unaligned.h"
#include "mesh.h"
#include "object.h"
#include "render/mesh.h"
#include "render/object.h"
#include "bvh_binning.h"
#include "bvh/bvh_binning.h"
#include "bvh_params.h"
#include "util_boundbox.h"
#include "util_debug.h"
#include "util_transform.h"
#include "util/util_boundbox.h"
#include "util/util_debug.h"
#include "util/util_transform.h"
CCL_NAMESPACE_BEGIN

@@ -17,7 +17,7 @@
#ifndef __BVH_UNALIGNED_H__
#define __BVH_UNALIGNED_H__
#include "util_vector.h"
#include "util/util_vector.h"
CCL_NAMESPACE_BEGIN

@@ -1,12 +1,6 @@
set(INC
.
../graph
../kernel
../kernel/svm
../kernel/osl
../util
../render
..
../../glew-mx
)
@@ -33,6 +27,7 @@ set(SRC
device_cuda.cpp
device_multi.cpp
device_opencl.cpp
device_split_kernel.cpp
device_task.cpp
)
@@ -56,6 +51,7 @@ set(SRC_HEADERS
device_memory.h
device_intern.h
device_network.h
device_split_kernel.h
device_task.h
)

@@ -17,18 +17,18 @@
#include <stdlib.h>
#include <string.h>
#include "device.h"
#include "device_intern.h"
#include "device/device.h"
#include "device/device_intern.h"
#include "util_debug.h"
#include "util_foreach.h"
#include "util_half.h"
#include "util_math.h"
#include "util_opengl.h"
#include "util_time.h"
#include "util_types.h"
#include "util_vector.h"
#include "util_string.h"
#include "util/util_debug.h"
#include "util/util_foreach.h"
#include "util/util_half.h"
#include "util/util_math.h"
#include "util/util_opengl.h"
#include "util/util_time.h"
#include "util/util_types.h"
#include "util/util_vector.h"
#include "util/util_string.h"
CCL_NAMESPACE_BEGIN
@@ -48,11 +48,11 @@ std::ostream& operator <<(std::ostream &os,
os << "Max nodes group: " << requested_features.max_nodes_group << std::endl;
/* TODO(sergey): Decode bitflag into list of names. */
os << "Nodes features: " << requested_features.nodes_features << std::endl;
os << "Use hair: "
os << "Use Hair: "
<< string_from_bool(requested_features.use_hair) << std::endl;
os << "Use object motion: "
os << "Use Object Motion: "
<< string_from_bool(requested_features.use_object_motion) << std::endl;
os << "Use camera motion: "
os << "Use Camera Motion: "
<< string_from_bool(requested_features.use_camera_motion) << std::endl;
os << "Use Baking: "
<< string_from_bool(requested_features.use_baking) << std::endl;
@@ -80,7 +80,7 @@ Device::~Device()
void Device::pixels_alloc(device_memory& mem)
{
mem_alloc(mem, MEM_READ_WRITE);
mem_alloc("pixels", mem, MEM_READ_WRITE);
}
void Device::pixels_copy_from(device_memory& mem, int y, int w, int h)


@@ -19,15 +19,15 @@
#include <stdlib.h>
#include "device_memory.h"
#include "device_task.h"
#include "device/device_memory.h"
#include "device/device_task.h"
#include "util_list.h"
#include "util_stats.h"
#include "util_string.h"
#include "util_thread.h"
#include "util_types.h"
#include "util_vector.h"
#include "util/util_list.h"
#include "util/util_stats.h"
#include "util/util_string.h"
#include "util/util_thread.h"
#include "util/util_types.h"
#include "util/util_vector.h"
CCL_NAMESPACE_BEGIN
@@ -121,6 +121,9 @@ public:
/* Use Transparent shadows */
bool use_transparent;
/* Use various shadow tricks, such as shadow catcher. */
bool use_shadow_tricks;
DeviceRequestedFeatures()
{
/* TODO(sergey): Find more meaningful defaults. */
@@ -137,6 +140,7 @@ public:
use_integrator_branched = false;
use_patch_evaluation = false;
use_transparent = false;
use_shadow_tricks = false;
}
bool modified(const DeviceRequestedFeatures& requested_features)
@@ -153,7 +157,8 @@ public:
use_volume == requested_features.use_volume &&
use_integrator_branched == requested_features.use_integrator_branched &&
use_patch_evaluation == requested_features.use_patch_evaluation &&
use_transparent == requested_features.use_transparent);
use_transparent == requested_features.use_transparent &&
use_shadow_tricks == requested_features.use_shadow_tricks);
}
/* Convert the requested features structure to build options,
@@ -194,9 +199,12 @@ public:
if(!use_patch_evaluation) {
build_options += " -D__NO_PATCH_EVAL__";
}
if(!use_transparent) {
if(!use_transparent && !use_volume) {
build_options += " -D__NO_TRANSPARENT__";
}
if(!use_shadow_tricks) {
build_options += " -D__NO_SHADOW_TRICKS__";
}
return build_options;
}
};
@@ -228,13 +236,21 @@ public:
DeviceInfo info;
virtual const string& error_message() { return error_msg; }
bool have_error() { return !error_message().empty(); }
virtual void set_error(const string& error)
{
if(!have_error()) {
error_msg = error;
}
fprintf(stderr, "%s\n", error.c_str());
fflush(stderr);
}
virtual bool show_samples() const { return false; }
/* statistics */
Stats &stats;
/* regular memory */
virtual void mem_alloc(device_memory& mem, MemoryType type) = 0;
virtual void mem_alloc(const char *name, device_memory& mem, MemoryType type) = 0;
virtual void mem_copy_to(device_memory& mem) = 0;
virtual void mem_copy_from(device_memory& mem,
int y, int w, int h, int elem) = 0;
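
The get_build_options() change in this file turns each disabled feature into a -D__NO_*__ define, and the new condition keeps transparency compiled in whenever volumes are requested. A minimal sketch of that mapping follows. FeatureFlagsSketch is a hypothetical, trimmed-down struct that models only three of the flags; it is not the real DeviceRequestedFeatures.

```cpp
#include <string>

// Hypothetical stand-in for DeviceRequestedFeatures: only the flags touched
// by this hunk are modelled, everything else is left out.
struct FeatureFlagsSketch {
    bool use_volume = false;
    bool use_transparent = false;
    bool use_shadow_tricks = false;

    // Collect the "-D__NO_*__" defines for features that are switched off.
    std::string get_build_options() const
    {
        std::string build_options;
        // Transparency is only compiled out when volumes are off as well.
        if(!use_transparent && !use_volume) {
            build_options += " -D__NO_TRANSPARENT__";
        }
        if(!use_shadow_tricks) {
            build_options += " -D__NO_SHADOW_TRICKS__";
        }
        return build_options;
    }
};
```

With all flags at their defaults the sketch emits both defines; setting use_volume alone is enough to drop __NO_TRANSPARENT__.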


@@ -20,36 +20,124 @@
/* So ImathMath is included before our kernel_cpu_compat. */
#ifdef WITH_OSL
/* So no context pollution happens from indirectly included windows.h */
# include "util_windows.h"
# include "util/util_windows.h"
# include <OSL/oslexec.h>
#endif
#include "device.h"
#include "device_intern.h"
#include "device/device.h"
#include "device/device_intern.h"
#include "device/device_split_kernel.h"
#include "kernel.h"
#include "kernel_compat_cpu.h"
#include "kernel_types.h"
#include "kernel_globals.h"
#include "kernel/kernel.h"
#include "kernel/kernel_compat_cpu.h"
#include "kernel/kernel_types.h"
#include "kernel/split/kernel_split_data.h"
#include "kernel/kernel_globals.h"
#include "osl_shader.h"
#include "osl_globals.h"
#include "kernel/osl/osl_shader.h"
#include "kernel/osl/osl_globals.h"
#include "buffers.h"
#include "render/buffers.h"
#include "util_debug.h"
#include "util_foreach.h"
#include "util_function.h"
#include "util_logging.h"
#include "util_opengl.h"
#include "util_progress.h"
#include "util_system.h"
#include "util_thread.h"
#include "util/util_debug.h"
#include "util/util_foreach.h"
#include "util/util_function.h"
#include "util/util_logging.h"
#include "util/util_map.h"
#include "util/util_opengl.h"
#include "util/util_progress.h"
#include "util/util_system.h"
#include "util/util_thread.h"
CCL_NAMESPACE_BEGIN
class CPUDevice;
class CPUSplitKernel : public DeviceSplitKernel {
CPUDevice *device;
public:
explicit CPUSplitKernel(CPUDevice *device);
virtual bool enqueue_split_kernel_data_init(const KernelDimensions& dim,
RenderTile& rtile,
int num_global_elements,
device_memory& kernel_globals,
device_memory& kernel_data_,
device_memory& split_data,
device_memory& ray_state,
device_memory& queue_index,
device_memory& use_queues_flag,
device_memory& work_pool_wgs);
virtual SplitKernelFunction* get_split_kernel_function(string kernel_name, const DeviceRequestedFeatures&);
virtual int2 split_kernel_local_size();
virtual int2 split_kernel_global_size(device_memory& kg, device_memory& data, DeviceTask *task);
virtual uint64_t state_buffer_size(device_memory& kg, device_memory& data, size_t num_threads);
};
class CPUDevice : public Device
{
static unordered_map<string, void*> kernel_functions;
static void register_kernel_function(const char* name, void* func)
{
kernel_functions[name] = func;
}
static const char* get_arch_name()
{
#ifdef WITH_CYCLES_OPTIMIZED_KERNEL_AVX2
if(system_cpu_support_avx2()) {
return "cpu_avx2";
}
else
#endif
#ifdef WITH_CYCLES_OPTIMIZED_KERNEL_AVX
if(system_cpu_support_avx()) {
return "cpu_avx";
}
else
#endif
#ifdef WITH_CYCLES_OPTIMIZED_KERNEL_SSE41
if(system_cpu_support_sse41()) {
return "cpu_sse41";
}
else
#endif
#ifdef WITH_CYCLES_OPTIMIZED_KERNEL_SSE3
if(system_cpu_support_sse3()) {
return "cpu_sse3";
}
else
#endif
#ifdef WITH_CYCLES_OPTIMIZED_KERNEL_SSE2
if(system_cpu_support_sse2()) {
return "cpu_sse2";
}
else
#endif
{
return "cpu";
}
}
template<typename F>
static F get_kernel_function(string name)
{
name = string("kernel_") + get_arch_name() + "_" + name;
unordered_map<string, void*>::iterator it = kernel_functions.find(name);
if(it == kernel_functions.end()) {
assert(!"kernel function not found");
return NULL;
}
return (F)it->second;
}
friend class CPUSplitKernel;
public:
TaskPool task_pool;
KernelGlobals kernel_globals;
@@ -57,10 +145,15 @@ public:
#ifdef WITH_OSL
OSLGlobals osl_globals;
#endif
bool use_split_kernel;
DeviceRequestedFeatures requested_features;
CPUDevice(DeviceInfo& info, Stats &stats, bool background)
: Device(info, stats, background)
{
#ifdef WITH_OSL
kernel_globals.osl = &osl_globals;
#endif
@@ -105,6 +198,28 @@ public:
{
VLOG(1) << "Will be using regular kernels.";
}
use_split_kernel = DebugFlags().cpu.split_kernel;
if(use_split_kernel) {
VLOG(1) << "Will be using split kernel.";
}
kernel_cpu_register_functions(register_kernel_function);
#ifdef WITH_CYCLES_OPTIMIZED_KERNEL_SSE2
kernel_cpu_sse2_register_functions(register_kernel_function);
#endif
#ifdef WITH_CYCLES_OPTIMIZED_KERNEL_SSE3
kernel_cpu_sse3_register_functions(register_kernel_function);
#endif
#ifdef WITH_CYCLES_OPTIMIZED_KERNEL_SSE41
kernel_cpu_sse41_register_functions(register_kernel_function);
#endif
#ifdef WITH_CYCLES_OPTIMIZED_KERNEL_AVX
kernel_cpu_avx_register_functions(register_kernel_function);
#endif
#ifdef WITH_CYCLES_OPTIMIZED_KERNEL_AVX2
kernel_cpu_avx2_register_functions(register_kernel_function);
#endif
}
~CPUDevice()
@@ -117,9 +232,20 @@ public:
return (TaskScheduler::num_threads() == 1);
}
void mem_alloc(device_memory& mem, MemoryType /*type*/)
void mem_alloc(const char *name, device_memory& mem, MemoryType /*type*/)
{
if(name) {
VLOG(1) << "Buffer allocate: " << name << ", "
<< string_human_readable_number(mem.memory_size()) << " bytes. ("
<< string_human_readable_size(mem.memory_size()) << ")";
}
mem.device_pointer = mem.data_pointer;
if(!mem.device_pointer) {
mem.device_pointer = (device_ptr)malloc(mem.memory_size());
}
mem.device_size = mem.memory_size();
stats.mem_alloc(mem.device_size);
}
@@ -144,6 +270,10 @@ public:
void mem_free(device_memory& mem)
{
if(mem.device_pointer) {
if(!mem.data_pointer) {
free((void*)mem.device_pointer);
}
mem.device_pointer = 0;
stats.mem_free(mem.device_size);
mem.device_size = 0;
@@ -196,8 +326,14 @@ public:
void thread_run(DeviceTask *task)
{
if(task->type == DeviceTask::PATH_TRACE)
thread_path_trace(*task);
if(task->type == DeviceTask::PATH_TRACE) {
if(!use_split_kernel) {
thread_path_trace(*task);
}
else {
thread_path_trace_split(*task);
}
}
else if(task->type == DeviceTask::FILM_CONVERT)
thread_film_convert(*task);
else if(task->type == DeviceTask::SHADER)
@@ -258,7 +394,7 @@ public:
{
path_trace_kernel = kernel_cpu_path_trace;
}
while(task.acquire_tile(this, tile)) {
float *render_buffer = (float*)tile.buffer;
uint *rng_state = (uint*)tile.rng_state;
@@ -294,6 +430,49 @@ public:
thread_kernel_globals_free(&kg);
}
void thread_path_trace_split(DeviceTask& task)
{
if(task_pool.canceled()) {
if(task.need_finish_queue == false)
return;
}
RenderTile tile;
CPUSplitKernel split_kernel(this);
/* allocate buffer for kernel globals */
device_memory kgbuffer;
kgbuffer.resize(sizeof(KernelGlobals));
mem_alloc("kernel_globals", kgbuffer, MEM_READ_WRITE);
KernelGlobals *kg = (KernelGlobals*)kgbuffer.device_pointer;
*kg = thread_kernel_globals_init();
requested_features.max_closure = MAX_CLOSURE;
if(!split_kernel.load_kernels(requested_features)) {
thread_kernel_globals_free((KernelGlobals*)kgbuffer.device_pointer);
mem_free(kgbuffer);
return;
}
while(task.acquire_tile(this, tile)) {
device_memory data;
split_kernel.path_trace(&task, tile, kgbuffer, data);
task.release_tile(tile);
if(task_pool.canceled()) {
if(task.need_finish_queue == false)
break;
}
}
thread_kernel_globals_free((KernelGlobals*)kgbuffer.device_pointer);
mem_free(kgbuffer);
}
void thread_film_convert(DeviceTask& task)
{
float sample_scale = 1.0f/(task.sample + 1);
@@ -501,6 +680,10 @@ protected:
inline void thread_kernel_globals_free(KernelGlobals *kg)
{
if(kg == NULL) {
return;
}
if(kg->transparent_shadow_intersections != NULL) {
free(kg->transparent_shadow_intersections);
}
@@ -515,8 +698,175 @@ protected:
OSLShader::thread_free(kg);
#endif
}
virtual bool load_kernels(DeviceRequestedFeatures& requested_features_) {
requested_features = requested_features_;
return true;
}
};
/* split kernel */
class CPUSplitKernelFunction : public SplitKernelFunction {
public:
CPUDevice* device;
void (*func)(KernelGlobals *kg, KernelData *data);
CPUSplitKernelFunction(CPUDevice* device) : device(device), func(NULL) {}
~CPUSplitKernelFunction() {}
virtual bool enqueue(const KernelDimensions& dim, device_memory& kernel_globals, device_memory& data)
{
if(!func) {
return false;
}
KernelGlobals *kg = (KernelGlobals*)kernel_globals.device_pointer;
kg->global_size = make_int2(dim.global_size[0], dim.global_size[1]);
for(int y = 0; y < dim.global_size[1]; y++) {
for(int x = 0; x < dim.global_size[0]; x++) {
kg->global_id = make_int2(x, y);
func(kg, (KernelData*)data.device_pointer);
}
}
return true;
}
};
CPUSplitKernel::CPUSplitKernel(CPUDevice *device) : DeviceSplitKernel(device), device(device)
{
}
bool CPUSplitKernel::enqueue_split_kernel_data_init(const KernelDimensions& dim,
RenderTile& rtile,
int num_global_elements,
device_memory& kernel_globals,
device_memory& data,
device_memory& split_data,
device_memory& ray_state,
device_memory& queue_index,
device_memory& use_queues_flags,
device_memory& work_pool_wgs)
{
typedef void(*data_init_t)(KernelGlobals *kg,
ccl_constant KernelData *data,
ccl_global void *split_data_buffer,
int num_elements,
ccl_global char *ray_state,
ccl_global uint *rng_state,
int start_sample,
int end_sample,
int sx, int sy, int sw, int sh, int offset, int stride,
ccl_global int *Queue_index,
int queuesize,
ccl_global char *use_queues_flag,
ccl_global unsigned int *work_pool_wgs,
unsigned int num_samples,
ccl_global float *buffer);
data_init_t data_init;
#ifdef WITH_CYCLES_OPTIMIZED_KERNEL_AVX2
if(system_cpu_support_avx2()) {
data_init = kernel_cpu_avx2_data_init;
}
else
#endif
#ifdef WITH_CYCLES_OPTIMIZED_KERNEL_AVX
if(system_cpu_support_avx()) {
data_init = kernel_cpu_avx_data_init;
}
else
#endif
#ifdef WITH_CYCLES_OPTIMIZED_KERNEL_SSE41
if(system_cpu_support_sse41()) {
data_init = kernel_cpu_sse41_data_init;
}
else
#endif
#ifdef WITH_CYCLES_OPTIMIZED_KERNEL_SSE3
if(system_cpu_support_sse3()) {
data_init = kernel_cpu_sse3_data_init;
}
else
#endif
#ifdef WITH_CYCLES_OPTIMIZED_KERNEL_SSE2
if(system_cpu_support_sse2()) {
data_init = kernel_cpu_sse2_data_init;
}
else
#endif
{
data_init = kernel_cpu_data_init;
}
KernelGlobals *kg = (KernelGlobals*)kernel_globals.device_pointer;
kg->global_size = make_int2(dim.global_size[0], dim.global_size[1]);
for(int y = 0; y < dim.global_size[1]; y++) {
for(int x = 0; x < dim.global_size[0]; x++) {
kg->global_id = make_int2(x, y);
data_init((KernelGlobals*)kernel_globals.device_pointer,
(KernelData*)data.device_pointer,
(void*)split_data.device_pointer,
num_global_elements,
(char*)ray_state.device_pointer,
(uint*)rtile.rng_state,
rtile.start_sample,
rtile.start_sample + rtile.num_samples,
rtile.x,
rtile.y,
rtile.w,
rtile.h,
rtile.offset,
rtile.stride,
(int*)queue_index.device_pointer,
dim.global_size[0] * dim.global_size[1],
(char*)use_queues_flags.device_pointer,
(uint*)work_pool_wgs.device_pointer,
rtile.num_samples,
(float*)rtile.buffer);
}
}
return true;
}
SplitKernelFunction* CPUSplitKernel::get_split_kernel_function(string kernel_name, const DeviceRequestedFeatures&)
{
CPUSplitKernelFunction *kernel = new CPUSplitKernelFunction(device);
kernel->func = device->get_kernel_function<void(*)(KernelGlobals*, KernelData*)>(kernel_name);
if(!kernel->func) {
delete kernel;
return NULL;
}
return kernel;
}
int2 CPUSplitKernel::split_kernel_local_size()
{
return make_int2(1, 1);
}
int2 CPUSplitKernel::split_kernel_global_size(device_memory& /*kg*/, device_memory& /*data*/, DeviceTask * /*task*/) {
return make_int2(64, 1);
}
uint64_t CPUSplitKernel::state_buffer_size(device_memory& kernel_globals, device_memory& /*data*/, size_t num_threads) {
KernelGlobals *kg = (KernelGlobals*)kernel_globals.device_pointer;
return split_data_buffer_size(kg, num_threads);
}
unordered_map<string, void*> CPUDevice::kernel_functions;
Device *device_cpu_create(DeviceInfo& info, Stats &stats, bool background)
{
return new CPUDevice(info, stats, background);
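
The kernel_functions table added to CPUDevice maps a full kernel name of the form "kernel_<arch>_<name>" to a function pointer, and get_kernel_function() builds that name from the detected architecture before looking it up. The idea can be sketched in isolation as below. KernelRegistrySketch, SampleKernel and the two sample kernels are hypothetical names; the architecture string is passed in explicitly instead of being derived from CPU feature checks; and the sketch stores a generic function pointer rather than the void* used in the diff, so the cast back is covered by standard C++.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for CPUDevice's kernel table.
class KernelRegistrySketch {
public:
    typedef void (*GenericFn)();

    // Register a kernel under its full name, e.g. "kernel_cpu_avx2_demo".
    template<typename F>
    void add(const std::string& full_name, F func)
    {
        functions_[full_name] = reinterpret_cast<GenericFn>(func);
    }

    // Look up "kernel_<arch>_<name>"; returns a null pointer when missing.
    template<typename F>
    F get(const std::string& arch, const std::string& name) const
    {
        const std::string full_name = "kernel_" + arch + "_" + name;
        std::unordered_map<std::string, GenericFn>::const_iterator it =
                functions_.find(full_name);
        if(it == functions_.end()) {
            return nullptr;
        }
        // Cast back to the exact type the kernel was registered with.
        return reinterpret_cast<F>(it->second);
    }

private:
    std::unordered_map<std::string, GenericFn> functions_;
};

// Two toy kernels standing in for the scalar and AVX2 builds of one kernel.
typedef void (*SampleKernel)(int *out);
void sample_kernel_scalar(int *out) { *out = 1; }
void sample_kernel_avx2(int *out) { *out = 2; }
```

A caller would register both variants once and then request, for example, get<SampleKernel>("cpu_avx2", "demo"); the caller is responsible for asking for the same function type that was registered.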


@@ -15,32 +15,36 @@
*/
#include <climits>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "device.h"
#include "device_intern.h"
#include "device/device.h"
#include "device/device_intern.h"
#include "device/device_split_kernel.h"
#include "buffers.h"
#include "render/buffers.h"
#ifdef WITH_CUDA_DYNLOAD
# include "cuew.h"
#else
# include "util_opengl.h"
# include "util/util_opengl.h"
# include <cuda.h>
# include <cudaGL.h>
#endif
#include "util_debug.h"
#include "util_logging.h"
#include "util_map.h"
#include "util_md5.h"
#include "util_opengl.h"
#include "util_path.h"
#include "util_string.h"
#include "util_system.h"
#include "util_types.h"
#include "util_time.h"
#include "util/util_debug.h"
#include "util/util_logging.h"
#include "util/util_map.h"
#include "util/util_md5.h"
#include "util/util_opengl.h"
#include "util/util_path.h"
#include "util/util_string.h"
#include "util/util_system.h"
#include "util/util_types.h"
#include "util/util_time.h"
#include "kernel/split/kernel_split_data_types.h"
CCL_NAMESPACE_BEGIN
@@ -78,6 +82,31 @@ int cuewCompilerVersion(void)
} /* namespace */
#endif /* WITH_CUDA_DYNLOAD */
class CUDADevice;
class CUDASplitKernel : public DeviceSplitKernel {
CUDADevice *device;
public:
explicit CUDASplitKernel(CUDADevice *device);
virtual uint64_t state_buffer_size(device_memory& kg, device_memory& data, size_t num_threads);
virtual bool enqueue_split_kernel_data_init(const KernelDimensions& dim,
RenderTile& rtile,
int num_global_elements,
device_memory& kernel_globals,
device_memory& kernel_data_,
device_memory& split_data,
device_memory& ray_state,
device_memory& queue_index,
device_memory& use_queues_flag,
device_memory& work_pool_wgs);
virtual SplitKernelFunction* get_split_kernel_function(string kernel_name, const DeviceRequestedFeatures&);
virtual int2 split_kernel_local_size();
virtual int2 split_kernel_global_size(device_memory& kg, device_memory& data, DeviceTask *task);
};
class CUDADevice : public Device
{
public:
@@ -258,16 +287,21 @@ public:
return DebugFlags().cuda.adaptive_compile;
}
bool use_split_kernel()
{
return DebugFlags().cuda.split_kernel;
}
/* Common NVCC flags which stay the same regardless of shading model,
* kernel sources md5 and only depend on compiler or compilation settings.
*/
string compile_kernel_get_common_cflags(
const DeviceRequestedFeatures& requested_features)
const DeviceRequestedFeatures& requested_features, bool split=false)
{
const int cuda_version = cuewCompilerVersion();
const int machine = system_cpu_bits();
const string kernel_path = path_get("kernel");
const string include = kernel_path;
const string source_path = path_get("source");
const string include_path = source_path;
string cflags = string_printf("-m%d "
"--ptxas-options=\"-v\" "
"--use_fast_math "
@@ -276,7 +310,7 @@ public:
"-I\"%s\"",
machine,
cuda_version,
include.c_str());
include_path.c_str());
if(use_adaptive_compilation()) {
cflags += " " + requested_features.get_build_options();
}
@@ -287,6 +321,11 @@ public:
#ifdef WITH_CYCLES_DEBUG
cflags += " -D__KERNEL_DEBUG__";
#endif
if(split) {
cflags += " -D__SPLIT__";
}
return cflags;
}
@@ -306,21 +345,21 @@ public:
cuda_error_message("CUDA nvcc compiler version could not be parsed.");
return false;
}
if(cuda_version < 75) {
if(cuda_version < 80) {
printf("Unsupported CUDA version %d.%d detected, "
"you need CUDA 7.5 or newer.\n",
"you need CUDA 8.0 or newer.\n",
major, minor);
return false;
}
else if(cuda_version != 75 && cuda_version != 80) {
else if(cuda_version != 80) {
printf("CUDA version %d.%d detected, build may succeed but only "
"CUDA 7.5 and 8.0 are officially supported.\n",
"CUDA 8.0 is officially supported.\n",
major, minor);
}
return true;
}
string compile_kernel(const DeviceRequestedFeatures& requested_features)
string compile_kernel(const DeviceRequestedFeatures& requested_features, bool split=false)
{
/* Compute cubin name. */
int major, minor;
@@ -329,7 +368,8 @@ public:
/* Attempt to use kernel provided with Blender. */
if(!use_adaptive_compilation()) {
const string cubin = path_get(string_printf("lib/kernel_sm_%d%d.cubin",
const string cubin = path_get(string_printf(split ? "lib/kernel_split_sm_%d%d.cubin"
: "lib/kernel_sm_%d%d.cubin",
major, minor));
VLOG(1) << "Testing for pre-compiled kernel " << cubin << ".";
if(path_exists(cubin)) {
@@ -339,18 +379,19 @@ public:
}
const string common_cflags =
compile_kernel_get_common_cflags(requested_features);
compile_kernel_get_common_cflags(requested_features, split);
/* Try to use locally compiled kernel. */
const string kernel_path = path_get("kernel");
const string kernel_md5 = path_files_md5_hash(kernel_path);
const string source_path = path_get("source");
const string kernel_md5 = path_files_md5_hash(source_path);
/* We include cflags into md5 so changing cuda toolkit or changing other
* compiler command line arguments makes sure cubin gets re-built.
*/
const string cubin_md5 = util_md5_string(kernel_md5 + common_cflags);
const string cubin_file = string_printf("cycles_kernel_sm%d%d_%s.cubin",
const string cubin_file = string_printf(split ? "cycles_kernel_split_sm%d%d_%s.cubin"
: "cycles_kernel_sm%d%d_%s.cubin",
major, minor,
cubin_md5.c_str());
const string cubin = path_cache_get(path_join("kernels", cubin_file));
@@ -383,9 +424,10 @@ public:
return "";
}
const char *nvcc = cuewCompilerPath();
const string kernel = path_join(kernel_path,
path_join("kernels",
path_join("cuda", "kernel.cu")));
const string kernel = path_join(
path_join(source_path, "kernel"),
path_join("kernels",
path_join("cuda", split ? "kernel_split.cu" : "kernel.cu")));
double starttime = time_dt();
printf("Compiling CUDA kernel ...\n");
@@ -433,7 +475,7 @@ public:
return false;
/* get kernel */
string cubin = compile_kernel(requested_features);
string cubin = compile_kernel(requested_features, use_split_kernel());
if(cubin == "")
return false;
@@ -466,8 +508,14 @@ public:
}
}
void mem_alloc(device_memory& mem, MemoryType /*type*/)
void mem_alloc(const char *name, device_memory& mem, MemoryType /*type*/)
{
if(name) {
VLOG(1) << "Buffer allocate: " << name << ", "
<< string_human_readable_number(mem.memory_size()) << " bytes. ("
<< string_human_readable_size(mem.memory_size()) << ")";
}
cuda_push_context();
CUdeviceptr device_pointer;
size_t size = mem.memory_size();
@@ -504,7 +552,9 @@ public:
void mem_zero(device_memory& mem)
{
memset((void*)mem.data_pointer, 0, mem.memory_size());
if(mem.data_pointer) {
memset((void*)mem.data_pointer, 0, mem.memory_size());
}
cuda_push_context();
if(mem.device_pointer)
@@ -617,7 +667,7 @@ public:
/* Data Storage */
if(interpolation == INTERPOLATION_NONE) {
if(has_bindless_textures) {
mem_alloc(mem, MEM_READ_ONLY);
mem_alloc(NULL, mem, MEM_READ_ONLY);
mem_copy_to(mem);
cuda_push_context();
@@ -641,7 +691,7 @@ public:
cuda_pop_context();
}
else {
mem_alloc(mem, MEM_READ_ONLY);
mem_alloc(NULL, mem, MEM_READ_ONLY);
mem_copy_to(mem);
cuda_push_context();
@@ -1258,25 +1308,48 @@ public:
/* Upload Bindless Mapping */
load_bindless_mapping();
/* keep rendering tiles until done */
while(task->acquire_tile(this, tile)) {
int start_sample = tile.start_sample;
int end_sample = tile.start_sample + tile.num_samples;
if(!use_split_kernel()) {
/* keep rendering tiles until done */
while(task->acquire_tile(this, tile)) {
int start_sample = tile.start_sample;
int end_sample = tile.start_sample + tile.num_samples;
for(int sample = start_sample; sample < end_sample; sample++) {
if(task->get_cancel()) {
if(task->need_finish_queue == false)
break;
}
path_trace(tile, sample, branched);
tile.sample = sample + 1;
task->update_progress(&tile, tile.w*tile.h);
}
task->release_tile(tile);
}
}
else {
DeviceRequestedFeatures requested_features;
if(!use_adaptive_compilation()) {
requested_features.max_closure = 64;
}
CUDASplitKernel split_kernel(this);
split_kernel.load_kernels(requested_features);
while(task->acquire_tile(this, tile)) {
device_memory void_buffer;
split_kernel.path_trace(task, tile, void_buffer, void_buffer);
task->release_tile(tile);
for(int sample = start_sample; sample < end_sample; sample++) {
if(task->get_cancel()) {
if(task->need_finish_queue == false)
break;
}
path_trace(tile, sample, branched);
tile.sample = sample + 1;
task->update_progress(&tile, tile.w*tile.h);
}
task->release_tile(tile);
}
}
else if(task->type == DeviceTask::SHADER) {
@@ -1329,8 +1402,223 @@ public:
{
task_pool.cancel();
}
friend class CUDASplitKernelFunction;
friend class CUDASplitKernel;
};
/* redefine the cuda_assert macro so it can be used outside of the CUDADevice class
* now that the definition of that class is complete
*/
#undef cuda_assert
#define cuda_assert(stmt) \
{ \
CUresult result = stmt; \
\
if(result != CUDA_SUCCESS) { \
string message = string_printf("CUDA error: %s in %s", cuewErrorString(result), #stmt); \
if(device->error_msg == "") \
device->error_msg = message; \
fprintf(stderr, "%s\n", message.c_str()); \
/*cuda_abort();*/ \
device->cuda_error_documentation(); \
} \
} (void)0
/* split kernel */
class CUDASplitKernelFunction : public SplitKernelFunction{
CUDADevice* device;
CUfunction func;
public:
CUDASplitKernelFunction(CUDADevice *device, CUfunction func) : device(device), func(func) {}
/* enqueue the kernel, returns false if there is an error */
bool enqueue(const KernelDimensions &dim, device_memory &/*kg*/, device_memory &/*data*/)
{
return enqueue(dim, NULL);
}
/* enqueue the kernel, returns false if there is an error */
bool enqueue(const KernelDimensions &dim, void *args[])
{
device->cuda_push_context();
if(device->have_error())
return false;
/* we ignore dim.local_size for now, as this is faster */
int threads_per_block;
cuda_assert(cuFuncGetAttribute(&threads_per_block, CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK, func));
int xthreads = (int)sqrt(threads_per_block);
int ythreads = (int)sqrt(threads_per_block);
int xblocks = (dim.global_size[0] + xthreads - 1)/xthreads;
int yblocks = (dim.global_size[1] + ythreads - 1)/ythreads;
cuda_assert(cuFuncSetCacheConfig(func, CU_FUNC_CACHE_PREFER_L1));
cuda_assert(cuLaunchKernel(func,
xblocks , yblocks, 1, /* blocks */
xthreads, ythreads, 1, /* threads */
0, 0, args, 0));
device->cuda_pop_context();
return !device->have_error();
}
};
CUDASplitKernel::CUDASplitKernel(CUDADevice *device) : DeviceSplitKernel(device), device(device)
{
}
uint64_t CUDASplitKernel::state_buffer_size(device_memory& /*kg*/, device_memory& /*data*/, size_t num_threads)
{
device_vector<uint64_t> size_buffer;
size_buffer.resize(1);
device->mem_alloc(NULL, size_buffer, MEM_READ_WRITE);
device->cuda_push_context();
uint threads = num_threads;
CUdeviceptr d_size = device->cuda_device_ptr(size_buffer.device_pointer);
struct args_t {
uint* num_threads;
CUdeviceptr* size;
};
args_t args = {
&threads,
&d_size
};
CUfunction state_buffer_size;
cuda_assert(cuModuleGetFunction(&state_buffer_size, device->cuModule, "kernel_cuda_state_buffer_size"));
cuda_assert(cuLaunchKernel(state_buffer_size,
1, 1, 1,
1, 1, 1,
0, 0, (void**)&args, 0));
device->cuda_pop_context();
device->mem_copy_from(size_buffer, 0, 1, 1, sizeof(uint64_t));
device->mem_free(size_buffer);
return *size_buffer.get_data();
}
bool CUDASplitKernel::enqueue_split_kernel_data_init(const KernelDimensions& dim,
RenderTile& rtile,
int num_global_elements,
device_memory& /*kernel_globals*/,
device_memory& /*kernel_data*/,
device_memory& split_data,
device_memory& ray_state,
device_memory& queue_index,
device_memory& use_queues_flag,
device_memory& work_pool_wgs)
{
device->cuda_push_context();
CUdeviceptr d_split_data = device->cuda_device_ptr(split_data.device_pointer);
CUdeviceptr d_ray_state = device->cuda_device_ptr(ray_state.device_pointer);
CUdeviceptr d_queue_index = device->cuda_device_ptr(queue_index.device_pointer);
CUdeviceptr d_use_queues_flag = device->cuda_device_ptr(use_queues_flag.device_pointer);
CUdeviceptr d_work_pool_wgs = device->cuda_device_ptr(work_pool_wgs.device_pointer);
CUdeviceptr d_rng_state = device->cuda_device_ptr(rtile.rng_state);
CUdeviceptr d_buffer = device->cuda_device_ptr(rtile.buffer);
int end_sample = rtile.start_sample + rtile.num_samples;
int queue_size = dim.global_size[0] * dim.global_size[1];
struct args_t {
CUdeviceptr* split_data_buffer;
int* num_elements;
CUdeviceptr* ray_state;
CUdeviceptr* rng_state;
int* start_sample;
int* end_sample;
int* sx;
int* sy;
int* sw;
int* sh;
int* offset;
int* stride;
CUdeviceptr* queue_index;
int* queuesize;
CUdeviceptr* use_queues_flag;
CUdeviceptr* work_pool_wgs;
int* num_samples;
CUdeviceptr* buffer;
};
args_t args = {
&d_split_data,
&num_global_elements,
&d_ray_state,
&d_rng_state,
&rtile.start_sample,
&end_sample,
&rtile.x,
&rtile.y,
&rtile.w,
&rtile.h,
&rtile.offset,
&rtile.stride,
&d_queue_index,
&queue_size,
&d_use_queues_flag,
&d_work_pool_wgs,
&rtile.num_samples,
&d_buffer
};
CUfunction data_init;
cuda_assert(cuModuleGetFunction(&data_init, device->cuModule, "kernel_cuda_path_trace_data_init"));
if(device->have_error()) {
return false;
}
CUDASplitKernelFunction(device, data_init).enqueue(dim, (void**)&args);
device->cuda_pop_context();
return !device->have_error();
}
SplitKernelFunction* CUDASplitKernel::get_split_kernel_function(string kernel_name, const DeviceRequestedFeatures&)
{
CUfunction func;
device->cuda_push_context();
cuda_assert(cuModuleGetFunction(&func, device->cuModule, (string("kernel_cuda_") + kernel_name).data()));
if(device->have_error()) {
device->cuda_error_message(string_printf("kernel \"kernel_cuda_%s\" not found in module", kernel_name.data()));
return NULL;
}
device->cuda_pop_context();
return new CUDASplitKernelFunction(device, func);
}
int2 CUDASplitKernel::split_kernel_local_size()
{
return make_int2(32, 1);
}
int2 CUDASplitKernel::split_kernel_global_size(device_memory& /*kg*/, device_memory& /*data*/, DeviceTask */*task*/)
{
/* TODO(mai): implement something here to detect ideal work size */
return make_int2(256, 256);
}
bool device_cuda_init(void)
{
#ifdef WITH_CUDA_DYNLOAD
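
CUDASplitKernelFunction::enqueue() in this file derives a square thread block from the kernel's maximum threads per block and then rounds the block count up so the grid covers the whole global size. The arithmetic can be sketched on the host without any CUDA calls. LaunchDimsSketch and compute_launch_dims are hypothetical names, and max_threads_per_block stands in for the value the diff queries with cuFuncGetAttribute.

```cpp
#include <cmath>

// Hypothetical container for the launch configuration computed below.
struct LaunchDimsSketch {
    int xthreads, ythreads;  // threads per block in x and y
    int xblocks, yblocks;    // blocks in the grid in x and y
};

// Mirror of the block/grid arithmetic in the enqueue() hunk.
inline LaunchDimsSketch compute_launch_dims(int global_w,
                                            int global_h,
                                            int max_threads_per_block)
{
    LaunchDimsSketch dims;
    // Square block: the truncated square root keeps x*y within the limit.
    dims.xthreads = (int)std::sqrt((double)max_threads_per_block);
    dims.ythreads = dims.xthreads;
    // Ceiling division so partially filled blocks at the edges are included.
    dims.xblocks = (global_w + dims.xthreads - 1) / dims.xthreads;
    dims.yblocks = (global_h + dims.ythreads - 1) / dims.ythreads;
    return dims;
}
```

Because the grid is rounded up, some threads can fall outside the global size; a kernel launched this way is assumed to guard against out-of-range indices itself.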


@@ -28,10 +28,10 @@
* other devices this is a pointer to device memory, where we will copy memory
* to and from. */
#include "util_debug.h"
#include "util_half.h"
#include "util_types.h"
#include "util_vector.h"
#include "util/util_debug.h"
#include "util/util_half.h"
#include "util/util_types.h"
#include "util/util_vector.h"
CCL_NAMESPACE_BEGIN
@@ -48,7 +48,8 @@ enum DataType {
TYPE_UINT,
TYPE_INT,
TYPE_FLOAT,
TYPE_HALF
TYPE_HALF,
TYPE_UINT64,
};
static inline size_t datatype_size(DataType datatype)
@@ -59,6 +60,7 @@ static inline size_t datatype_size(DataType datatype)
case TYPE_UINT: return sizeof(uint);
case TYPE_INT: return sizeof(int);
case TYPE_HALF: return sizeof(half);
case TYPE_UINT64: return sizeof(uint64_t);
default: return 0;
}
}
@@ -160,6 +162,11 @@ template<> struct device_type_traits<half4> {
static const int num_elements = 4;
};
template<> struct device_type_traits<uint64_t> {
static const DataType data_type = TYPE_UINT64;
static const int num_elements = 1;
};
/* Device Memory */
class device_memory
@@ -180,10 +187,27 @@ public:
/* device pointer */
device_ptr device_pointer;
protected:
device_memory() {}
device_memory()
{
data_type = device_type_traits<uchar>::data_type;
data_elements = device_type_traits<uchar>::num_elements;
data_pointer = 0;
data_size = 0;
device_size = 0;
data_width = 0;
data_height = 0;
data_depth = 0;
device_pointer = 0;
}
virtual ~device_memory() { assert(!device_pointer); }
void resize(size_t size)
{
data_size = size;
data_width = size;
}
protected:
/* no copying */
device_memory(const device_memory&);
device_memory& operator = (const device_memory&);
@@ -198,16 +222,8 @@ public:
{
data_type = device_type_traits<T>::data_type;
data_elements = device_type_traits<T>::num_elements;
data_pointer = 0;
data_size = 0;
device_size = 0;
data_width = 0;
data_height = 0;
data_depth = 0;
assert(data_elements > 0);
device_pointer = 0;
}
virtual ~device_vector() {}
@@ -266,6 +282,7 @@ public:
data_height = 0;
data_depth = 0;
data_size = 0;
device_pointer = 0;
}
size_t size()
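
With the default constructor and resize() now public, a device_memory can describe a buffer that has no host data behind it. The CPU device hunks earlier in this comparison handle that case by allocating the storage with malloc() in mem_alloc() and freeing it in mem_free() only when the device owns it. A minimal sketch of that ownership rule follows. BufferSketch, sketch_mem_alloc and sketch_mem_free are hypothetical names, and the stats bookkeeping from the diff is omitted.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical, reduced version of device_memory.
struct BufferSketch {
    void *data_pointer = nullptr;       // caller-owned host data, may be null
    std::uintptr_t device_pointer = 0;  // address the device works on
    std::size_t data_size = 0;          // requested size in bytes
    std::size_t device_size = 0;        // size currently allocated
};

// Alias the host data when present, otherwise allocate device-owned storage.
inline void sketch_mem_alloc(BufferSketch& mem)
{
    mem.device_pointer = reinterpret_cast<std::uintptr_t>(mem.data_pointer);
    if(!mem.device_pointer) {
        mem.device_pointer =
                reinterpret_cast<std::uintptr_t>(std::malloc(mem.data_size));
    }
    mem.device_size = mem.data_size;
}

// Free only storage that sketch_mem_alloc() itself allocated.
inline void sketch_mem_free(BufferSketch& mem)
{
    if(mem.device_pointer) {
        if(!mem.data_pointer) {
            std::free(reinterpret_cast<void*>(mem.device_pointer));
        }
        mem.device_pointer = 0;
        mem.device_size = 0;
    }
}
```

The check on data_pointer in the free path is what keeps the device from releasing memory that belongs to the caller.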


@@ -17,17 +17,17 @@
#include <stdlib.h>
#include <sstream>
#include "device.h"
#include "device_intern.h"
#include "device_network.h"
#include "device/device.h"
#include "device/device_intern.h"
#include "device/device_network.h"
#include "buffers.h"
#include "render/buffers.h"
#include "util_foreach.h"
#include "util_list.h"
#include "util_logging.h"
#include "util_map.h"
#include "util_time.h"
#include "util/util_foreach.h"
#include "util/util_list.h"
#include "util/util_logging.h"
#include "util/util_map.h"
#include "util/util_time.h"
CCL_NAMESPACE_BEGIN
@@ -106,11 +106,11 @@ public:
return true;
}
void mem_alloc(device_memory& mem, MemoryType type)
void mem_alloc(const char *name, device_memory& mem, MemoryType type)
{
foreach(SubDevice& sub, devices) {
mem.device_pointer = 0;
sub.device->mem_alloc(mem, type);
sub.device->mem_alloc(name, mem, type);
sub.ptr_map[unique_ptr] = mem.device_pointer;
}
@@ -162,6 +162,7 @@ public:
void mem_free(device_memory& mem)
{
device_ptr tmp = mem.device_pointer;
stats.mem_free(mem.device_size);
foreach(SubDevice& sub, devices) {
mem.device_pointer = sub.ptr_map[tmp];
@@ -170,7 +171,6 @@ public:
}
mem.device_pointer = 0;
stats.mem_free(mem.device_size);
}
void const_copy_to(const char *name, void *host, size_t size)
@@ -202,6 +202,7 @@ public:
void tex_free(device_memory& mem)
{
device_ptr tmp = mem.device_pointer;
stats.mem_free(mem.device_size);
foreach(SubDevice& sub, devices) {
mem.device_pointer = sub.ptr_map[tmp];
@@ -210,7 +211,6 @@ public:
}
mem.device_pointer = 0;
stats.mem_free(mem.device_size);
}
void pixels_alloc(device_memory& mem)

View File

@@ -14,12 +14,12 @@
* limitations under the License.
*/
#include "device.h"
#include "device_intern.h"
#include "device_network.h"
#include "device/device.h"
#include "device/device_intern.h"
#include "device/device_network.h"
#include "util_foreach.h"
#include "util_logging.h"
#include "util/util_foreach.h"
#include "util/util_logging.h"
#if defined(WITH_NETWORK)
@@ -87,8 +87,14 @@ public:
snd.write();
}
void mem_alloc(device_memory& mem, MemoryType type)
void mem_alloc(const char *name, device_memory& mem, MemoryType type)
{
if(name) {
VLOG(1) << "Buffer allocate: " << name << ", "
<< string_human_readable_number(mem.memory_size()) << " bytes. ("
<< string_human_readable_size(mem.memory_size()) << ")";
}
thread_scoped_lock lock(rpc_lock);
mem.device_pointer = ++mem_counter;
@@ -481,7 +487,7 @@ protected:
mem.data_pointer = 0;
/* perform the allocation on the actual device */
device->mem_alloc(mem, type);
device->mem_alloc(NULL, mem, type);
/* store a mapping to/from client_pointer and real device pointer */
pointer_mapping_insert(client_pointer, mem.device_pointer);

View File

@@ -33,12 +33,12 @@
#include <sstream>
#include <deque>
#include "buffers.h"
#include "render/buffers.h"
#include "util_foreach.h"
#include "util_list.h"
#include "util_map.h"
#include "util_string.h"
#include "util/util_foreach.h"
#include "util/util_list.h"
#include "util/util_map.h"
#include "util/util_string.h"
CCL_NAMESPACE_BEGIN

View File

@@ -16,12 +16,12 @@
#ifdef WITH_OPENCL
#include "opencl/opencl.h"
#include "device/opencl/opencl.h"
#include "device_intern.h"
#include "device/device_intern.h"
#include "util_foreach.h"
#include "util_logging.h"
#include "util/util_foreach.h"
#include "util/util_logging.h"
CCL_NAMESPACE_BEGIN

View File

@@ -0,0 +1,306 @@
/*
* Copyright 2011-2016 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include "device/device_split_kernel.h"
#include "kernel/kernel_types.h"
#include "kernel/split/kernel_split_data_types.h"
#include "util/util_time.h"
CCL_NAMESPACE_BEGIN
static const double alpha = 0.1; /* alpha for rolling average */
DeviceSplitKernel::DeviceSplitKernel(Device *device) : device(device)
{
current_max_closure = -1;
first_tile = true;
avg_time_per_sample = 0.0;
kernel_path_init = NULL;
kernel_scene_intersect = NULL;
kernel_lamp_emission = NULL;
kernel_do_volume = NULL;
kernel_queue_enqueue = NULL;
kernel_indirect_background = NULL;
kernel_shader_eval = NULL;
kernel_holdout_emission_blurring_pathtermination_ao = NULL;
kernel_subsurface_scatter = NULL;
kernel_direct_lighting = NULL;
kernel_shadow_blocked_ao = NULL;
kernel_shadow_blocked_dl = NULL;
kernel_next_iteration_setup = NULL;
kernel_indirect_subsurface = NULL;
kernel_buffer_update = NULL;
}
DeviceSplitKernel::~DeviceSplitKernel()
{
device->mem_free(split_data);
device->mem_free(ray_state);
device->mem_free(use_queues_flag);
device->mem_free(queue_index);
device->mem_free(work_pool_wgs);
delete kernel_path_init;
delete kernel_scene_intersect;
delete kernel_lamp_emission;
delete kernel_do_volume;
delete kernel_queue_enqueue;
delete kernel_indirect_background;
delete kernel_shader_eval;
delete kernel_holdout_emission_blurring_pathtermination_ao;
delete kernel_subsurface_scatter;
delete kernel_direct_lighting;
delete kernel_shadow_blocked_ao;
delete kernel_shadow_blocked_dl;
delete kernel_next_iteration_setup;
delete kernel_indirect_subsurface;
delete kernel_buffer_update;
}
bool DeviceSplitKernel::load_kernels(const DeviceRequestedFeatures& requested_features)
{
#define LOAD_KERNEL(name) \
kernel_##name = get_split_kernel_function(#name, requested_features); \
if(!kernel_##name) { \
return false; \
}
LOAD_KERNEL(path_init);
LOAD_KERNEL(scene_intersect);
LOAD_KERNEL(lamp_emission);
LOAD_KERNEL(do_volume);
LOAD_KERNEL(queue_enqueue);
LOAD_KERNEL(indirect_background);
LOAD_KERNEL(shader_eval);
LOAD_KERNEL(holdout_emission_blurring_pathtermination_ao);
LOAD_KERNEL(subsurface_scatter);
LOAD_KERNEL(direct_lighting);
LOAD_KERNEL(shadow_blocked_ao);
LOAD_KERNEL(shadow_blocked_dl);
LOAD_KERNEL(next_iteration_setup);
LOAD_KERNEL(indirect_subsurface);
LOAD_KERNEL(buffer_update);
#undef LOAD_KERNEL
current_max_closure = requested_features.max_closure;
return true;
}
size_t DeviceSplitKernel::max_elements_for_max_buffer_size(device_memory& kg, device_memory& data, uint64_t max_buffer_size)
{
uint64_t size_per_element = state_buffer_size(kg, data, 1024) / 1024;
return max_buffer_size / size_per_element;
}
bool DeviceSplitKernel::path_trace(DeviceTask *task,
RenderTile& tile,
device_memory& kgbuffer,
device_memory& kernel_data)
{
if(device->have_error()) {
return false;
}
/* Get local size */
size_t local_size[2];
{
int2 lsize = split_kernel_local_size();
local_size[0] = lsize[0];
local_size[1] = lsize[1];
}
/* Set global size */
size_t global_size[2];
{
int2 gsize = split_kernel_global_size(kgbuffer, kernel_data, task);
/* Make sure that set work size is a multiple of local
* work size dimensions.
*/
global_size[0] = round_up(gsize[0], local_size[0]);
global_size[1] = round_up(gsize[1], local_size[1]);
}
/* Number of elements in the global state buffer */
int num_global_elements = global_size[0] * global_size[1];
assert(num_global_elements % WORK_POOL_SIZE == 0);
/* Allocate all required global memory once. */
if(first_tile) {
first_tile = false;
/* Calculate max groups */
/* Denotes the maximum work groups possible w.r.t. current requested tile size. */
unsigned int max_work_groups = num_global_elements / WORK_POOL_SIZE + 1;
/* Allocate work_pool_wgs memory. */
work_pool_wgs.resize(max_work_groups * sizeof(unsigned int));
device->mem_alloc("work_pool_wgs", work_pool_wgs, MEM_READ_WRITE);
queue_index.resize(NUM_QUEUES * sizeof(int));
device->mem_alloc("queue_index", queue_index, MEM_READ_WRITE);
use_queues_flag.resize(sizeof(char));
device->mem_alloc("use_queues_flag", use_queues_flag, MEM_READ_WRITE);
ray_state.resize(num_global_elements);
device->mem_alloc("ray_state", ray_state, MEM_READ_WRITE);
split_data.resize(state_buffer_size(kgbuffer, kernel_data, num_global_elements));
device->mem_alloc("split_data", split_data, MEM_READ_WRITE);
}
#define ENQUEUE_SPLIT_KERNEL(name, global_size, local_size) \
if(device->have_error()) { \
return false; \
} \
if(!kernel_##name->enqueue(KernelDimensions(global_size, local_size), kgbuffer, kernel_data)) { \
return false; \
}
tile.sample = tile.start_sample;
/* for exponential increase between tile updates */
int time_multiplier = 1;
while(tile.sample < tile.start_sample + tile.num_samples) {
/* to keep track of how long it takes to run a number of samples */
double start_time = time_dt();
/* initial guess to start rolling average */
const int initial_num_samples = 1;
/* approx number of samples per second */
int samples_per_second = (avg_time_per_sample > 0.0) ?
int(double(time_multiplier) / avg_time_per_sample) + 1 : initial_num_samples;
RenderTile subtile = tile;
subtile.start_sample = tile.sample;
subtile.num_samples = min(samples_per_second, tile.start_sample + tile.num_samples - tile.sample);
if(device->have_error()) {
return false;
}
/* Reset state memory here, as the global size for the data_init
* kernel might not be large enough to do it in the kernel.
*/
device->mem_zero(work_pool_wgs);
device->mem_zero(split_data);
device->mem_zero(ray_state);
if(!enqueue_split_kernel_data_init(KernelDimensions(global_size, local_size),
subtile,
num_global_elements,
kgbuffer,
kernel_data,
split_data,
ray_state,
queue_index,
use_queues_flag,
work_pool_wgs))
{
return false;
}
ENQUEUE_SPLIT_KERNEL(path_init, global_size, local_size);
bool activeRaysAvailable = true;
while(activeRaysAvailable) {
/* Do path-iteration on the host: enqueue the path-iteration kernels. */
for(int PathIter = 0; PathIter < 16; PathIter++) {
ENQUEUE_SPLIT_KERNEL(scene_intersect, global_size, local_size);
ENQUEUE_SPLIT_KERNEL(lamp_emission, global_size, local_size);
ENQUEUE_SPLIT_KERNEL(do_volume, global_size, local_size);
ENQUEUE_SPLIT_KERNEL(queue_enqueue, global_size, local_size);
ENQUEUE_SPLIT_KERNEL(indirect_background, global_size, local_size);
ENQUEUE_SPLIT_KERNEL(shader_eval, global_size, local_size);
ENQUEUE_SPLIT_KERNEL(holdout_emission_blurring_pathtermination_ao, global_size, local_size);
ENQUEUE_SPLIT_KERNEL(subsurface_scatter, global_size, local_size);
ENQUEUE_SPLIT_KERNEL(direct_lighting, global_size, local_size);
ENQUEUE_SPLIT_KERNEL(shadow_blocked_ao, global_size, local_size);
ENQUEUE_SPLIT_KERNEL(shadow_blocked_dl, global_size, local_size);
ENQUEUE_SPLIT_KERNEL(next_iteration_setup, global_size, local_size);
ENQUEUE_SPLIT_KERNEL(indirect_subsurface, global_size, local_size);
ENQUEUE_SPLIT_KERNEL(queue_enqueue, global_size, local_size);
ENQUEUE_SPLIT_KERNEL(buffer_update, global_size, local_size);
if(task->get_cancel()) {
return true;
}
}
/* Decide if we should exit path-iteration in host. */
device->mem_copy_from(ray_state, 0, global_size[0] * global_size[1] * sizeof(char), 1, 1);
activeRaysAvailable = false;
for(int rayStateIter = 0; rayStateIter < global_size[0] * global_size[1]; ++rayStateIter) {
int8_t state = ray_state.get_data()[rayStateIter];
if(state != RAY_INACTIVE) {
if(state == RAY_INVALID) {
/* Something went wrong, abort to avoid looping endlessly. */
device->set_error("Split kernel error: invalid ray state");
return false;
}
/* Not all rays are RAY_INACTIVE. */
activeRaysAvailable = true;
break;
}
}
if(task->get_cancel()) {
return true;
}
}
double time_per_sample = ((time_dt()-start_time) / subtile.num_samples);
if(avg_time_per_sample == 0.0) {
/* start rolling average */
avg_time_per_sample = time_per_sample;
}
else {
avg_time_per_sample = alpha*time_per_sample + (1.0-alpha)*avg_time_per_sample;
}
#undef ENQUEUE_SPLIT_KERNEL
tile.sample += subtile.num_samples;
task->update_progress(&tile, tile.w*tile.h*subtile.num_samples);
time_multiplier = min(time_multiplier << 1, 10);
if(task->get_cancel()) {
return true;
}
}
return true;
}
CCL_NAMESPACE_END
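
The sample loop in path_trace() sizes each batch of samples from a rolling average of the measured time per sample. A minimal standalone sketch of that bookkeeping follows; the names SampleBatcher, next_batch and record are hypothetical and do not exist in Cycles, and the constants simply mirror the ones visible in the diff above.

```cpp
#include <algorithm>
#include <cassert>

/* Hypothetical helper, not part of Cycles: mirrors the rolling-average
 * bookkeeping done inline in DeviceSplitKernel::path_trace(). */
struct SampleBatcher {
	double alpha;               /* weight of the newest measurement */
	double avg_time_per_sample; /* 0.0 means "no measurement yet" */
	int time_multiplier;        /* rough seconds of work per batch */

	SampleBatcher() : alpha(0.1), avg_time_per_sample(0.0), time_multiplier(1) {}

	/* Number of samples to run next, clamped to what is left. */
	int next_batch(int samples_left) const
	{
		assert(samples_left >= 0);
		int guess = (avg_time_per_sample > 0.0)
		        ? int(double(time_multiplier) / avg_time_per_sample) + 1
		        : 1; /* initial guess before any timing exists */
		return std::min(guess, samples_left);
	}

	/* Fold the measured time of a finished batch into the average. */
	void record(double elapsed_seconds, int num_samples)
	{
		assert(num_samples > 0);
		double time_per_sample = elapsed_seconds / num_samples;
		if(avg_time_per_sample == 0.0) {
			/* First measurement starts the rolling average. */
			avg_time_per_sample = time_per_sample;
		}
		else {
			avg_time_per_sample = alpha*time_per_sample
			                    + (1.0 - alpha)*avg_time_per_sample;
		}
		/* Double the batch duration after every update, capped at 10. */
		time_multiplier = std::min(time_multiplier << 1, 10);
	}
};
```

Keeping the average across tiles, as the member avg_time_per_sample does, means later tiles can start with a sensible batch size instead of a single sample.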

View File

@@ -0,0 +1,132 @@
/*
* Copyright 2011-2016 Blender Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef __DEVICE_SPLIT_KERNEL_H__
#define __DEVICE_SPLIT_KERNEL_H__
#include "device/device.h"
#include "render/buffers.h"
CCL_NAMESPACE_BEGIN
/* When allocating global memory in chunks, we may not be able to
* allocate exactly "CL_DEVICE_MAX_MEM_ALLOC_SIZE" bytes, since some
* bytes may be needed for aligning the chunks of memory.
* This is the amount of memory that we dedicate for that purpose.
*/
#define DATA_ALLOCATION_MEM_FACTOR 5000000 //5MB
/* Types used for split kernel */
class KernelDimensions {
public:
size_t global_size[2];
size_t local_size[2];
KernelDimensions(size_t global_size_[2], size_t local_size_[2])
{
memcpy(global_size, global_size_, sizeof(global_size));
memcpy(local_size, local_size_, sizeof(local_size));
}
};
class SplitKernelFunction {
public:
virtual ~SplitKernelFunction() {}
/* enqueue the kernel, returns false if there is an error */
virtual bool enqueue(const KernelDimensions& dim, device_memory& kg, device_memory& data) = 0;
};
class DeviceSplitKernel {
private:
Device *device;
SplitKernelFunction *kernel_path_init;
SplitKernelFunction *kernel_scene_intersect;
SplitKernelFunction *kernel_lamp_emission;
SplitKernelFunction *kernel_do_volume;
SplitKernelFunction *kernel_queue_enqueue;
SplitKernelFunction *kernel_indirect_background;
SplitKernelFunction *kernel_shader_eval;
SplitKernelFunction *kernel_holdout_emission_blurring_pathtermination_ao;
SplitKernelFunction *kernel_subsurface_scatter;
SplitKernelFunction *kernel_direct_lighting;
SplitKernelFunction *kernel_shadow_blocked_ao;
SplitKernelFunction *kernel_shadow_blocked_dl;
SplitKernelFunction *kernel_next_iteration_setup;
SplitKernelFunction *kernel_indirect_subsurface;
SplitKernelFunction *kernel_buffer_update;
/* Global memory variables [porting]: this memory is used for
* co-operation between different kernels. Data written by one
* kernel will be available to another kernel via this global
* memory.
*/
device_memory split_data;
device_vector<uchar> ray_state;
device_memory queue_index; /* Array of size num_queues * sizeof(int) that tracks the size of each queue. */
/* Flag to make the scene_intersect and lamp_emission kernels use queues. */
device_memory use_queues_flag;
/* Approximate time it takes to complete one sample */
double avg_time_per_sample;
/* Work pool with respect to each work group. */
device_memory work_pool_wgs;
/* clos_max value for which the kernels have been loaded currently. */
int current_max_closure;
/* Set to true in the constructor and set to false in path_trace() once the global buffers have been allocated. */
bool first_tile;
public:
explicit DeviceSplitKernel(Device* device);
virtual ~DeviceSplitKernel();
bool load_kernels(const DeviceRequestedFeatures& requested_features);
bool path_trace(DeviceTask *task,
RenderTile& rtile,
device_memory& kgbuffer,
device_memory& kernel_data);
virtual uint64_t state_buffer_size(device_memory& kg, device_memory& data, size_t num_threads) = 0;
size_t max_elements_for_max_buffer_size(device_memory& kg, device_memory& data, uint64_t max_buffer_size);
virtual bool enqueue_split_kernel_data_init(const KernelDimensions& dim,
RenderTile& rtile,
int num_global_elements,
device_memory& kernel_globals,
device_memory& kernel_data_,
device_memory& split_data,
device_memory& ray_state,
device_memory& queue_index,
device_memory& use_queues_flag,
device_memory& work_pool_wgs) = 0;
virtual SplitKernelFunction* get_split_kernel_function(string kernel_name, const DeviceRequestedFeatures&) = 0;
virtual int2 split_kernel_local_size() = 0;
virtual int2 split_kernel_global_size(device_memory& kg, device_memory& data, DeviceTask *task) = 0;
};
CCL_NAMESPACE_END
#endif /* __DEVICE_SPLIT_KERNEL_H__ */
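
The sizes stored in KernelDimensions are expected to arrive already padded: path_trace() rounds each global dimension up to a multiple of the matching local dimension before building it. A small sketch of that rounding, assuming a hypothetical helper round_up_to_multiple (Cycles uses its own round_up utility, whose exact definition is not shown in this diff):

```cpp
#include <cassert>
#include <cstddef>

/* Hypothetical helper: smallest multiple of `multiple` that is >= x.
 * `multiple` must be non-zero. */
static size_t round_up_to_multiple(size_t x, size_t multiple)
{
	assert(multiple > 0);
	return ((x + multiple - 1) / multiple) * multiple;
}

/* Pad a 2D global work size so that every work group is full. */
static void pad_global_size(const size_t requested[2],
                            const size_t local_size[2],
                            size_t global_size[2])
{
	global_size[0] = round_up_to_multiple(requested[0], local_size[0]);
	global_size[1] = round_up_to_multiple(requested[1], local_size[1]);
}
```

With the local size of 64x1 used by the OpenCL split kernel, a requested size of 100x3 would be padded to 128x3 under this sketch.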

View File

@@ -17,12 +17,12 @@
#include <stdlib.h>
#include <string.h>
#include "device_task.h"
#include "device/device_task.h"
#include "buffers.h"
#include "render/buffers.h"
#include "util_algorithm.h"
#include "util_time.h"
#include "util/util_algorithm.h"
#include "util/util_time.h"
CCL_NAMESPACE_BEGIN

View File

@@ -17,11 +17,11 @@
#ifndef __DEVICE_TASK_H__
#define __DEVICE_TASK_H__
#include "device_memory.h"
#include "device/device_memory.h"
#include "util_function.h"
#include "util_list.h"
#include "util_task.h"
#include "util/util_function.h"
#include "util/util_list.h"
#include "util/util_task.h"
CCL_NAMESPACE_BEGIN
@@ -51,6 +51,8 @@ public:
int shader_filter;
int shader_x, shader_w;
int passes_size;
explicit DeviceTask(Type type = PATH_TRACE);
int get_subtask_count(int num, int max_size = 0);

View File

@@ -16,40 +16,40 @@
#ifdef WITH_OPENCL
#include "device.h"
#include "device/device.h"
#include "util_map.h"
#include "util_param.h"
#include "util_string.h"
#include "util/util_map.h"
#include "util/util_param.h"
#include "util/util_string.h"
#include "clew.h"
CCL_NAMESPACE_BEGIN
/* Define CYCLES_DISABLE_DRIVER_WORKAROUNDS to disable workarounds for testing */
#ifndef CYCLES_DISABLE_DRIVER_WORKAROUNDS
/* Work around AMD driver hangs by ensuring each command is finished before doing anything else. */
# undef clEnqueueNDRangeKernel
# define clEnqueueNDRangeKernel(a, b, c, d, e, f, g, h, i) \
clFinish(a); \
CLEW_GET_FUN(__clewEnqueueNDRangeKernel)(a, b, c, d, e, f, g, h, i); \
clFinish(a);
# undef clEnqueueWriteBuffer
# define clEnqueueWriteBuffer(a, b, c, d, e, f, g, h, i) \
clFinish(a); \
CLEW_GET_FUN(__clewEnqueueWriteBuffer)(a, b, c, d, e, f, g, h, i); \
clFinish(a);
# undef clEnqueueReadBuffer
# define clEnqueueReadBuffer(a, b, c, d, e, f, g, h, i) \
clFinish(a); \
CLEW_GET_FUN(__clewEnqueueReadBuffer)(a, b, c, d, e, f, g, h, i); \
clFinish(a);
#endif /* CYCLES_DISABLE_DRIVER_WORKAROUNDS */
#define CL_MEM_PTR(p) ((cl_mem)(uintptr_t)(p))
/* Macro declarations used with split kernel */
/* Macro to enable/disable work-stealing */
#define __WORK_STEALING__
#define SPLIT_KERNEL_LOCAL_SIZE_X 64
#define SPLIT_KERNEL_LOCAL_SIZE_Y 1
/* This value may be tuned according to the scene we are rendering.
*
* Modifying PATH_ITER_INC_FACTOR value proportional to number of expected
* ray-bounces will improve performance.
*/
#define PATH_ITER_INC_FACTOR 8
/* When allocating global memory in chunks, we may not be able to
* allocate exactly "CL_DEVICE_MAX_MEM_ALLOC_SIZE" bytes, since some
* bytes may be needed for aligning the chunks of memory.
* This is the amount of memory that we dedicate for that purpose.
*/
#define DATA_ALLOCATION_MEM_FACTOR 5000000 //5MB
struct OpenCLPlatformDevice {
OpenCLPlatformDevice(cl_platform_id platform_id,
const string& platform_name,
@@ -90,6 +90,54 @@ public:
cl_device_id device_id);
static void get_usable_devices(vector<OpenCLPlatformDevice> *usable_devices,
bool force_all = false);
static bool use_single_program();
/* ** Some handy shortcuts to low level cl*GetInfo() functions. ** */
/* Platform information. */
static bool get_num_platforms(cl_uint *num_platforms, cl_int *error = NULL);
static cl_uint get_num_platforms();
static bool get_platforms(vector<cl_platform_id> *platform_ids,
cl_int *error = NULL);
static vector<cl_platform_id> get_platforms();
static bool get_platform_name(cl_platform_id platform_id,
string *platform_name);
static string get_platform_name(cl_platform_id platform_id);
static bool get_num_platform_devices(cl_platform_id platform_id,
cl_device_type device_type,
cl_uint *num_devices,
cl_int *error = NULL);
static cl_uint get_num_platform_devices(cl_platform_id platform_id,
cl_device_type device_type);
static bool get_platform_devices(cl_platform_id platform_id,
cl_device_type device_type,
vector<cl_device_id> *device_ids,
cl_int* error = NULL);
static vector<cl_device_id> get_platform_devices(cl_platform_id platform_id,
cl_device_type device_type);
/* Device information. */
static bool get_device_name(cl_device_id device_id,
string *device_name,
cl_int* error = NULL);
static string get_device_name(cl_device_id device_id);
static bool get_device_type(cl_device_id device_id,
cl_device_type *device_type,
cl_int* error = NULL);
static cl_device_type get_device_type(cl_device_id device_id);
/* Get a somewhat more readable device name.
* The main difference is with AMD OpenCL, which only gives the code name
* as the regular device name. This returns a saner device name
* using some extensions.
*/
static string get_readable_device_name(cl_device_id device_id);
};
/* Thread safe cache for contexts and programs.
@@ -248,6 +296,7 @@ public:
bool device_initialized;
string platform_name;
string device_name;
bool opencl_error(cl_int err);
void opencl_error(const string& message);
@@ -266,10 +315,10 @@ public:
/* Has to be implemented by the real device classes.
* The base device will then load all these programs. */
virtual void load_kernels(const DeviceRequestedFeatures& requested_features,
virtual bool load_kernels(const DeviceRequestedFeatures& requested_features,
vector<OpenCLProgram*> &programs) = 0;
void mem_alloc(device_memory& mem, MemoryType type);
void mem_alloc(const char *name, device_memory& mem, MemoryType type);
void mem_copy_to(device_memory& mem);
void mem_copy_from(device_memory& mem, int y, int w, int h, int elem);
void mem_zero(device_memory& mem);
@@ -326,16 +375,39 @@ protected:
class ArgumentWrapper {
public:
ArgumentWrapper() : size(0), pointer(NULL) {}
template <typename T>
ArgumentWrapper() : size(0), pointer(NULL)
{
}
ArgumentWrapper(device_memory& argument) : size(sizeof(void*)),
pointer((void*)(&argument.device_pointer))
{
}
template<typename T>
ArgumentWrapper(device_vector<T>& argument) : size(sizeof(void*)),
pointer((void*)(&argument.device_pointer))
{
}
template<typename T>
ArgumentWrapper(T& argument) : size(sizeof(argument)),
pointer(&argument) { }
pointer(&argument)
{
}
ArgumentWrapper(int argument) : size(sizeof(int)),
int_value(argument),
pointer(&int_value) { }
pointer(&int_value)
{
}
ArgumentWrapper(float argument) : size(sizeof(float)),
float_value(argument),
pointer(&float_value) { }
pointer(&float_value)
{
}
size_t size;
int int_value;
float float_value;

View File

@@ -16,15 +16,15 @@
#ifdef WITH_OPENCL
#include "opencl.h"
#include "device/opencl/opencl.h"
#include "kernel_types.h"
#include "kernel/kernel_types.h"
#include "util_foreach.h"
#include "util_logging.h"
#include "util_md5.h"
#include "util_path.h"
#include "util_time.h"
#include "util/util_foreach.h"
#include "util/util_logging.h"
#include "util/util_md5.h"
#include "util/util_path.h"
#include "util/util_time.h"
CCL_NAMESPACE_BEGIN
@@ -82,9 +82,10 @@ OpenCLDeviceBase::OpenCLDeviceBase(DeviceInfo& info, Stats &stats, bool backgrou
cpPlatform = platform_device.platform_id;
cdDevice = platform_device.device_id;
platform_name = platform_device.platform_name;
device_name = platform_device.device_name;
VLOG(2) << "Creating new Cycles device for OpenCL platform "
<< platform_name << ", device "
<< platform_device.device_name << ".";
<< device_name << ".";
{
/* try to use cached context */
@@ -113,12 +114,16 @@ OpenCLDeviceBase::OpenCLDeviceBase(DeviceInfo& info, Stats &stats, bool backgrou
}
cqCommandQueue = clCreateCommandQueue(cxContext, cdDevice, 0, &ciErr);
if(opencl_error(ciErr))
if(opencl_error(ciErr)) {
opencl_error("OpenCL: Error creating command queue");
return;
}
null_mem = (device_ptr)clCreateBuffer(cxContext, CL_MEM_READ_ONLY, 1, NULL, &ciErr);
if(opencl_error(ciErr))
if(opencl_error(ciErr)) {
opencl_error("OpenCL: Error creating memory buffer for NULL");
return;
}
fprintf(stderr, "Device init success\n");
device_initialized = true;
@@ -147,10 +152,8 @@ OpenCLDeviceBase::~OpenCLDeviceBase()
void CL_CALLBACK OpenCLDeviceBase::context_notify_callback(const char *err_info,
const void * /*private_info*/, size_t /*cb*/, void *user_data)
{
char name[256];
clGetDeviceInfo((cl_device_id)user_data, CL_DEVICE_NAME, sizeof(name), &name, NULL);
fprintf(stderr, "OpenCL error (%s): %s\n", name, err_info);
string device_name = OpenCLInfo::get_device_name((cl_device_id)user_data);
fprintf(stderr, "OpenCL error (%s): %s\n", device_name.c_str(), err_info);
}
bool OpenCLDeviceBase::opencl_version_check()
@@ -191,6 +194,8 @@ string OpenCLDeviceBase::device_md5_hash(string kernel_custom_build_options)
bool OpenCLDeviceBase::load_kernels(const DeviceRequestedFeatures& requested_features)
{
VLOG(2) << "Loading kernels for platform " << platform_name
<< ", device " << device_name << ".";
/* Verify if device was initialized. */
if(!device_initialized) {
fprintf(stderr, "OpenCL: failed to initialize device.\n");
@@ -206,11 +211,14 @@ bool OpenCLDeviceBase::load_kernels(const DeviceRequestedFeatures& requested_fea
base_program.add_kernel(ustring("convert_to_half_float"));
base_program.add_kernel(ustring("shader"));
base_program.add_kernel(ustring("bake"));
base_program.add_kernel(ustring("zero_buffer"));
vector<OpenCLProgram*> programs;
programs.push_back(&base_program);
/* Call actual class to fill the vector with its programs. */
load_kernels(requested_features, programs);
if(!load_kernels(requested_features, programs)) {
return false;
}
/* Parallel compilation is supported by Cycles, but currently all OpenCL frameworks
* serialize the calls internally, so it's not much use right now.
@@ -242,8 +250,14 @@ bool OpenCLDeviceBase::load_kernels(const DeviceRequestedFeatures& requested_fea
return true;
}
void OpenCLDeviceBase::mem_alloc(device_memory& mem, MemoryType type)
void OpenCLDeviceBase::mem_alloc(const char *name, device_memory& mem, MemoryType type)
{
if(name) {
VLOG(1) << "Buffer allocate: " << name << ", "
<< string_human_readable_number(mem.memory_size()) << " bytes. ("
<< string_human_readable_size(mem.memory_size()) << ")";
}
size_t size = mem.memory_size();
cl_mem_flags mem_flag;
@@ -311,8 +325,61 @@ void OpenCLDeviceBase::mem_copy_from(device_memory& mem, int y, int w, int h, in
void OpenCLDeviceBase::mem_zero(device_memory& mem)
{
if(mem.device_pointer) {
memset((void*)mem.data_pointer, 0, mem.memory_size());
mem_copy_to(mem);
if(base_program.is_loaded()) {
cl_kernel ckZeroBuffer = base_program(ustring("zero_buffer"));
size_t global_size[] = {1024, 1024};
size_t num_threads = global_size[0] * global_size[1];
cl_mem d_buffer = CL_MEM_PTR(mem.device_pointer);
cl_ulong d_offset = 0;
cl_ulong d_size = 0;
while(d_offset < mem.memory_size()) {
d_size = std::min<cl_ulong>(num_threads*sizeof(float4), mem.memory_size() - d_offset);
kernel_set_args(ckZeroBuffer, 0, d_buffer, d_size, d_offset);
ciErr = clEnqueueNDRangeKernel(cqCommandQueue,
ckZeroBuffer,
2,
NULL,
global_size,
NULL,
0,
NULL,
NULL);
opencl_assert_err(ciErr, "clEnqueueNDRangeKernel");
d_offset += d_size;
}
}
if(mem.data_pointer) {
memset((void*)mem.data_pointer, 0, mem.memory_size());
}
if(!base_program.is_loaded()) {
void* zero = (void*)mem.data_pointer;
if(!mem.data_pointer) {
zero = util_aligned_malloc(mem.memory_size(), 16);
memset(zero, 0, mem.memory_size());
}
opencl_assert(clEnqueueWriteBuffer(cqCommandQueue,
CL_MEM_PTR(mem.device_pointer),
CL_TRUE,
0,
mem.memory_size(),
zero,
0,
NULL, NULL));
if(!mem.data_pointer) {
util_aligned_free(zero);
}
}
}
}
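
The new mem_zero() clears a device buffer in several launches, each covering at most a fixed number of bytes, and advances an offset until the whole buffer has been covered. The loop structure can be sketched on the host as below. This is only an illustration: zero_range and zero_in_chunks are hypothetical names, and a plain memset stands in for the zero_buffer OpenCL kernel launch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

/* Hypothetical stand-in for one zero_buffer kernel launch: clears
 * `size` bytes starting at `offset`. */
static void zero_range(std::vector<unsigned char>& buffer, size_t offset, size_t size)
{
	assert(offset + size <= buffer.size());
	std::memset(buffer.data() + offset, 0, size);
}

/* Walk the buffer in chunks of at most `chunk_bytes` and return how
 * many launches were needed. */
static size_t zero_in_chunks(std::vector<unsigned char>& buffer, size_t chunk_bytes)
{
	assert(chunk_bytes > 0);
	size_t offset = 0;
	size_t launches = 0;
	while(offset < buffer.size()) {
		/* The last chunk may be shorter than chunk_bytes. */
		size_t size = std::min(chunk_bytes, buffer.size() - offset);
		zero_range(buffer, offset, size);
		offset += size;
		launches++;
	}
	return launches;
}
```

In the diff, the chunk size is derived from the number of threads in a 1024x1024 launch times sizeof(float4); the sketch leaves it as a parameter so that the clamping of the final chunk is easy to see.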
@@ -337,7 +404,7 @@ void OpenCLDeviceBase::const_copy_to(const char *name, void *host, size_t size)
device_vector<uchar> *data = new device_vector<uchar>();
data->copy((uchar*)host, size);
mem_alloc(*data, MEM_READ_ONLY);
mem_alloc(name, *data, MEM_READ_ONLY);
i = const_mem_map.insert(ConstMemMap::value_type(name, data)).first;
}
else {
@@ -356,7 +423,7 @@ void OpenCLDeviceBase::tex_alloc(const char *name,
VLOG(1) << "Texture allocate: " << name << ", "
<< string_human_readable_number(mem.memory_size()) << " bytes. ("
<< string_human_readable_size(mem.memory_size()) << ")";
mem_alloc(mem, MEM_READ_ONLY);
mem_alloc(NULL, mem, MEM_READ_ONLY);
mem_copy_to(mem);
assert(mem_map.find(name) == mem_map.end());
mem_map.insert(MemMap::value_type(name, mem.device_pointer));
@@ -460,7 +527,7 @@ void OpenCLDeviceBase::film_convert(DeviceTask& task, device_ptr buffer, device_
#define KERNEL_TEX(type, ttype, name) \
set_kernel_arg_mem(ckFilmConvertKernel, &start_arg_index, #name);
#include "kernel_textures.h"
#include "kernel/kernel_textures.h"
#undef KERNEL_TEX
start_arg_index += kernel_set_args(ckFilmConvertKernel,
@@ -511,7 +578,7 @@ void OpenCLDeviceBase::shader(DeviceTask& task)
#define KERNEL_TEX(type, ttype, name) \
set_kernel_arg_mem(kernel, &start_arg_index, #name);
#include "kernel_textures.h"
#include "kernel/kernel_textures.h"
#undef KERNEL_TEX
start_arg_index += kernel_set_args(kernel,

View File

@@ -16,15 +16,15 @@
#ifdef WITH_OPENCL
#include "opencl.h"
#include "device/opencl/opencl.h"
#include "buffers.h"
#include "render/buffers.h"
#include "kernel_types.h"
#include "kernel/kernel_types.h"
#include "util_md5.h"
#include "util_path.h"
#include "util_time.h"
#include "util/util_md5.h"
#include "util/util_path.h"
#include "util/util_time.h"
CCL_NAMESPACE_BEGIN
@@ -43,11 +43,12 @@ public:
return true;
}
virtual void load_kernels(const DeviceRequestedFeatures& /*requested_features*/,
virtual bool load_kernels(const DeviceRequestedFeatures& /*requested_features*/,
vector<OpenCLProgram*> &programs)
{
path_trace_program.add_kernel(ustring("path_trace"));
programs.push_back(&path_trace_program);
return true;
}
~OpenCLDeviceMegaKernel()
@@ -83,7 +84,7 @@ public:
#define KERNEL_TEX(type, ttype, name) \
set_kernel_arg_mem(ckPathTraceKernel, &start_arg_index, #name);
#include "kernel_textures.h"
#include "kernel/kernel_textures.h"
#undef KERNEL_TEX
start_arg_index += kernel_set_args(ckPathTraceKernel,

File diff suppressed because it is too large

View File

@@ -16,11 +16,12 @@
#ifdef WITH_OPENCL
#include "opencl.h"
#include "device/opencl/opencl.h"
#include "util_logging.h"
#include "util_path.h"
#include "util_time.h"
#include "util/util_logging.h"
#include "util/util_md5.h"
#include "util/util_path.h"
#include "util/util_time.h"
using std::cerr;
using std::endl;
@@ -234,7 +235,7 @@ string OpenCLCache::get_kernel_md5()
thread_scoped_lock lock(self.kernel_md5_lock);
if(self.kernel_md5.empty()) {
self.kernel_md5 = path_files_md5_hash(path_get("kernel"));
self.kernel_md5 = path_files_md5_hash(path_get("source"));
}
return self.kernel_md5;
}
@@ -309,6 +310,8 @@ bool OpenCLDeviceBase::OpenCLProgram::build_kernel(const string *debug_src)
string build_options;
build_options = device->kernel_build_options(debug_src) + kernel_build_options;
VLOG(1) << "Build options passed to clBuildProgram: '"
<< build_options << "'.";
cl_int ciErr = clBuildProgram(program, 0, NULL, build_options.c_str(), NULL, NULL);
/* show warnings even if build is successful */
@@ -336,12 +339,13 @@ bool OpenCLDeviceBase::OpenCLProgram::build_kernel(const string *debug_src)
bool OpenCLDeviceBase::OpenCLProgram::compile_kernel(const string *debug_src)
{
string source = "#include \"kernels/opencl/" + kernel_file + "\" // " + OpenCLCache::get_kernel_md5() + "\n";
string source = "#include \"kernel/kernels/opencl/" + kernel_file + "\"\n";
/* We compile kernels consisting of many files. Unfortunately, OpenCL
* kernel caches do not seem to recognize changes in included files,
* so we force a recompile on changes by adding the md5 hash of all files.
*/
source = path_source_replace_includes(source, path_get("kernel"));
source = path_source_replace_includes(source, path_get("source"));
source += "\n// " + util_md5_string(source) + "\n";
if(debug_src) {
path_write_text(*debug_src, source);
@@ -352,10 +356,10 @@ bool OpenCLDeviceBase::OpenCLProgram::compile_kernel(const string *debug_src)
cl_int ciErr;
program = clCreateProgramWithSource(device->cxContext,
1,
&source_str,
&source_len,
&ciErr);
1,
&source_str,
&source_len,
&ciErr);
if(ciErr != CL_SUCCESS) {
add_error(string("OpenCL program creation failed: ") + clewErrorString(ciErr));
@@ -438,7 +442,11 @@ void OpenCLDeviceBase::OpenCLProgram::load()
if(!program) {
add_log(string("OpenCL program ") + program_name + " not found in cache.", true);
string basename = "cycles_kernel_" + program_name + "_" + device_md5 + "_" + OpenCLCache::get_kernel_md5();
/* need to create source to get md5 */
string source = "#include \"kernel/kernels/opencl/" + kernel_file + "\"\n";
source = path_source_replace_includes(source, path_get("source"));
string basename = "cycles_kernel_" + program_name + "_" + device_md5 + "_" + util_md5_string(source);
basename = path_cache_get(path_join("kernels", basename));
string clbin = basename + ".clbin";
@@ -544,6 +552,11 @@ bool OpenCLInfo::use_debug()
return DebugFlags().opencl.debug;
}
bool OpenCLInfo::use_single_program()
{
return DebugFlags().opencl.single_program;
}
bool OpenCLInfo::kernel_use_advanced_shading(const string& platform)
{
/* keep this in sync with kernel_types.h! */
@@ -587,11 +600,20 @@ bool OpenCLInfo::device_supported(const string& platform_name,
const cl_device_id device_id)
{
cl_device_type device_type;
clGetDeviceInfo(device_id,
CL_DEVICE_TYPE,
sizeof(cl_device_type),
&device_type,
NULL);
if(!get_device_type(device_id, &device_type)) {
return false;
}
string device_name;
if(!get_device_name(device_id, &device_name)) {
return false;
}
/* It is possible to have Iris GPU on AMD/Apple OpenCL framework
* (aka, it will not be on Intel framework). This isn't supported
* and needs an explicit blacklist.
*/
if(strstr(device_name.c_str(), "Iris")) {
return false;
}
if(platform_name == "AMD Accelerated Parallel Processing" &&
device_type == CL_DEVICE_TYPE_GPU)
{
@@ -705,39 +727,30 @@ void OpenCLInfo::get_usable_devices(vector<OpenCLPlatformDevice> *usable_devices
return;
}
cl_int error;
vector<cl_device_id> device_ids;
cl_uint num_devices = 0;
vector<cl_platform_id> platform_ids;
cl_uint num_platforms = 0;
/* Get devices. */
if(clGetPlatformIDs(0, NULL, &num_platforms) != CL_SUCCESS ||
num_platforms == 0)
{
/* Get platforms. */
if(!get_platforms(&platform_ids, &error)) {
FIRST_VLOG(2) << "Error fetching platforms: "
<< string(clewErrorString(error));
first_time = false;
return;
}
if(platform_ids.size() == 0) {
FIRST_VLOG(2) << "No OpenCL platforms were found.";
first_time = false;
return;
}
platform_ids.resize(num_platforms);
if(clGetPlatformIDs(num_platforms, &platform_ids[0], NULL) != CL_SUCCESS) {
FIRST_VLOG(2) << "Failed to fetch platform IDs from the driver..";
first_time = false;
return;
}
/* Devices are numbered consecutively across platforms. */
for(int platform = 0; platform < num_platforms; platform++) {
for(int platform = 0; platform < platform_ids.size(); platform++) {
cl_platform_id platform_id = platform_ids[platform];
char pname[256];
if(clGetPlatformInfo(platform_id,
CL_PLATFORM_NAME,
sizeof(pname),
&pname,
NULL) != CL_SUCCESS)
{
string platform_name;
if(!get_platform_name(platform_id, &platform_name)) {
FIRST_VLOG(2) << "Failed to get platform name, ignoring.";
continue;
}
string platform_name = pname;
FIRST_VLOG(2) << "Enumerating devices for platform "
<< platform_name << ".";
if(!platform_version_check(platform_id)) {
@@ -745,39 +758,28 @@ void OpenCLInfo::get_usable_devices(vector<OpenCLPlatformDevice> *usable_devices
<< " due to too old compiler version.";
continue;
}
num_devices = 0;
cl_int ciErr;
if((ciErr = clGetDeviceIDs(platform_id,
device_type,
0,
NULL,
&num_devices)) != CL_SUCCESS || num_devices == 0)
if(!get_platform_devices(platform_id,
device_type,
&device_ids,
&error))
{
FIRST_VLOG(2) << "Ignoring platform " << platform_name
<< ", failed to fetch number of devices: " << string(clewErrorString(ciErr));
<< ", failed to fetch list of devices: "
<< string(clewErrorString(error));
continue;
}
device_ids.resize(num_devices);
if(clGetDeviceIDs(platform_id,
device_type,
num_devices,
&device_ids[0],
NULL) != CL_SUCCESS)
{
if(device_ids.size() == 0) {
FIRST_VLOG(2) << "Ignoring platform " << platform_name
<< ", failed to fetch devices list.";
<< ", it has no devices.";
continue;
}
for(int num = 0; num < num_devices; num++) {
cl_device_id device_id = device_ids[num];
char device_name[1024] = "\0";
if(clGetDeviceInfo(device_id,
CL_DEVICE_NAME,
sizeof(device_name),
&device_name,
NULL) != CL_SUCCESS)
{
FIRST_VLOG(2) << "Failed to fetch device name, ignoring.";
for(int num = 0; num < device_ids.size(); num++) {
const cl_device_id device_id = device_ids[num];
string device_name;
if(!get_device_name(device_id, &device_name, &error)) {
FIRST_VLOG(2) << "Failed to fetch device name: "
<< string(clewErrorString(error))
<< ", ignoring.";
continue;
}
if(!device_version_check(device_id)) {
@@ -789,24 +791,28 @@ void OpenCLInfo::get_usable_devices(vector<OpenCLPlatformDevice> *usable_devices
device_supported(platform_name, device_id))
{
cl_device_type device_type;
if(clGetDeviceInfo(device_id,
CL_DEVICE_TYPE,
sizeof(cl_device_type),
&device_type,
NULL) != CL_SUCCESS)
{
if(!get_device_type(device_id, &device_type, &error)) {
FIRST_VLOG(2) << "Ignoring device " << device_name
<< ", failed to fetch device type.";
<< ", failed to fetch device type: "
<< string(clewErrorString(error));
continue;
}
FIRST_VLOG(2) << "Adding new device " << device_name << ".";
string readable_device_name =
get_readable_device_name(device_id);
if(readable_device_name != device_name) {
FIRST_VLOG(2) << "Using more readable device name: "
<< readable_device_name;
}
FIRST_VLOG(2) << "Adding new device "
<< readable_device_name << ".";
string hardware_id = get_hardware_id(platform_name, device_id);
usable_devices->push_back(OpenCLPlatformDevice(platform_id,
platform_name,
device_id,
device_type,
device_name,
hardware_id));
usable_devices->push_back(OpenCLPlatformDevice(
platform_id,
platform_name,
device_id,
device_type,
readable_device_name,
hardware_id));
}
else {
FIRST_VLOG(2) << "Ignoring device " << device_name
@@ -817,6 +823,252 @@ void OpenCLInfo::get_usable_devices(vector<OpenCLPlatformDevice> *usable_devices
first_time = false;
}
bool OpenCLInfo::get_platforms(vector<cl_platform_id> *platform_ids,
cl_int *error)
{
/* Reset from possible previous state. */
platform_ids->resize(0);
cl_uint num_platforms;
if(!get_num_platforms(&num_platforms, error)) {
return false;
}
/* Get actual platforms. */
cl_int err;
platform_ids->resize(num_platforms);
if((err = clGetPlatformIDs(num_platforms,
&platform_ids->at(0),
NULL)) != CL_SUCCESS) {
if(error != NULL) {
*error = err;
}
return false;
}
if(error != NULL) {
*error = CL_SUCCESS;
}
return true;
}
vector<cl_platform_id> OpenCLInfo::get_platforms()
{
vector<cl_platform_id> platform_ids;
get_platforms(&platform_ids);
return platform_ids;
}
bool OpenCLInfo::get_num_platforms(cl_uint *num_platforms, cl_int *error)
{
cl_int err;
if((err = clGetPlatformIDs(0, NULL, num_platforms)) != CL_SUCCESS) {
if(error != NULL) {
*error = err;
}
*num_platforms = 0;
return false;
}
if(error != NULL) {
*error = CL_SUCCESS;
}
return true;
}
cl_uint OpenCLInfo::get_num_platforms()
{
cl_uint num_platforms;
if(!get_num_platforms(&num_platforms)) {
return 0;
}
return num_platforms;
}
bool OpenCLInfo::get_platform_name(cl_platform_id platform_id,
string *platform_name)
{
char buffer[256];
if(clGetPlatformInfo(platform_id,
CL_PLATFORM_NAME,
sizeof(buffer),
&buffer,
NULL) != CL_SUCCESS)
{
*platform_name = "";
return false;
}
*platform_name = buffer;
return true;
}
string OpenCLInfo::get_platform_name(cl_platform_id platform_id)
{
string platform_name;
if (!get_platform_name(platform_id, &platform_name)) {
return "";
}
return platform_name;
}
bool OpenCLInfo::get_num_platform_devices(cl_platform_id platform_id,
cl_device_type device_type,
cl_uint *num_devices,
cl_int *error)
{
cl_int err;
if((err = clGetDeviceIDs(platform_id,
device_type,
0,
NULL,
num_devices)) != CL_SUCCESS)
{
if(error != NULL) {
*error = err;
}
*num_devices = 0;
return false;
}
if(error != NULL) {
*error = CL_SUCCESS;
}
return true;
}
cl_uint OpenCLInfo::get_num_platform_devices(cl_platform_id platform_id,
cl_device_type device_type)
{
cl_uint num_devices;
if(!get_num_platform_devices(platform_id,
device_type,
&num_devices))
{
return 0;
}
return num_devices;
}
bool OpenCLInfo::get_platform_devices(cl_platform_id platform_id,
cl_device_type device_type,
vector<cl_device_id> *device_ids,
cl_int* error)
{
/* Reset from possible previous state. */
device_ids->resize(0);
/* Get number of devices to pre-allocate memory. */
cl_uint num_devices;
if(!get_num_platform_devices(platform_id,
device_type,
&num_devices,
error))
{
return false;
}
/* Get actual device list. */
device_ids->resize(num_devices);
cl_int err;
if((err = clGetDeviceIDs(platform_id,
device_type,
num_devices,
&device_ids->at(0),
NULL)) != CL_SUCCESS)
{
if(error != NULL) {
*error = err;
}
return false;
}
if(error != NULL) {
*error = CL_SUCCESS;
}
return true;
}
vector<cl_device_id> OpenCLInfo::get_platform_devices(cl_platform_id platform_id,
cl_device_type device_type)
{
vector<cl_device_id> devices;
get_platform_devices(platform_id, device_type, &devices);
return devices;
}
bool OpenCLInfo::get_device_name(cl_device_id device_id,
string *device_name,
cl_int* error)
{
char buffer[1024];
cl_int err;
if((err = clGetDeviceInfo(device_id,
CL_DEVICE_NAME,
sizeof(buffer),
&buffer,
NULL)) != CL_SUCCESS)
{
if(error != NULL) {
*error = err;
}
*device_name = "";
return false;
}
if(error != NULL) {
*error = CL_SUCCESS;
}
*device_name = buffer;
return true;
}
string OpenCLInfo::get_device_name(cl_device_id device_id)
{
string device_name;
if(!get_device_name(device_id, &device_name)) {
return "";
}
return device_name;
}
bool OpenCLInfo::get_device_type(cl_device_id device_id,
cl_device_type *device_type,
cl_int* error)
{
cl_int err;
if((err = clGetDeviceInfo(device_id,
CL_DEVICE_TYPE,
sizeof(cl_device_type),
device_type,
NULL)) != CL_SUCCESS)
{
if(error != NULL) {
*error = err;
}
*device_type = 0;
return false;
}
if(error != NULL) {
*error = CL_SUCCESS;
}
return true;
}
cl_device_type OpenCLInfo::get_device_type(cl_device_id device_id)
{
cl_device_type device_type;
if(!get_device_type(device_id, &device_type)) {
return 0;
}
return device_type;
}
string OpenCLInfo::get_readable_device_name(cl_device_id device_id)
{
char board_name[1024];
if(clGetDeviceInfo(device_id,
CL_DEVICE_BOARD_NAME_AMD,
sizeof(board_name),
&board_name,
NULL) == CL_SUCCESS)
{
return board_name;
}
/* Fallback to standard device name API. */
return get_device_name(device_id);
}
CCL_NAMESPACE_END
#endif
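The `get_platforms()` and `get_platform_devices()` helpers added above share one shape: ask for the count, size the vector, fetch the entries, and report the status through an optional out-parameter, with a convenience overload that returns the vector directly. The sketch below shows that shape against a mock call, `mock_get_ids`, which is hypothetical and only imitates the argument order of `clGetPlatformIDs`. It also skips the second call when the count is zero, since `vector::at(0)` throws on an empty vector; whether a real driver can report success with zero entries is not something the diff states.

```cpp
#include <cstddef>
#include <vector>

typedef int status_t;
const status_t STATUS_SUCCESS = 0;
const status_t STATUS_FAILURE = -1;

// Mock driver state, standing in for the OpenCL runtime (hypothetical).
static std::vector<int> g_mock_ids;
static bool g_mock_fail = false;

// Imitates the (capacity, out_array, out_count) calling convention.
status_t mock_get_ids(unsigned capacity, int *ids, unsigned *count)
{
    if (g_mock_fail) {
        return STATUS_FAILURE;
    }
    if (count != nullptr) {
        *count = static_cast<unsigned>(g_mock_ids.size());
    }
    if (ids != nullptr) {
        for (unsigned i = 0; i < capacity && i < g_mock_ids.size(); ++i) {
            ids[i] = g_mock_ids[i];
        }
    }
    return STATUS_SUCCESS;
}

// Two-call enumeration: first the count, then the entries.
// `error` is optional and always receives the last status when given.
bool get_ids(std::vector<int> *ids, status_t *error = nullptr)
{
    ids->clear();  // Reset from possible previous state.
    unsigned count = 0;
    status_t err = mock_get_ids(0, nullptr, &count);
    if (err == STATUS_SUCCESS && count > 0) {
        ids->resize(count);
        err = mock_get_ids(count, ids->data(), nullptr);
    }
    if (error != nullptr) {
        *error = err;
    }
    if (err != STATUS_SUCCESS) {
        ids->clear();
        return false;
    }
    return true;
}

// Convenience overload: an empty vector means "none found or query failed".
std::vector<int> get_ids()
{
    std::vector<int> ids;
    get_ids(&ids);
    return ids;
}
```

The bool-returning form suits callers that want to log the error code, as `get_usable_devices()` does; the value-returning form trades that detail for brevity.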


@@ -1,7 +1,6 @@
set(INC
.
../util
..
)
set(SRC


@@ -14,12 +14,12 @@
* limitations under the License.
*/
#include "node.h"
#include "node_type.h"
#include "graph/node.h"
#include "graph/node_type.h"
#include "util_foreach.h"
#include "util_param.h"
#include "util_transform.h"
#include "util/util_foreach.h"
#include "util/util_param.h"
#include "util/util_transform.h"
CCL_NAMESPACE_BEGIN


@@ -16,11 +16,11 @@
#pragma once
#include "node_type.h"
#include "graph/node_type.h"
#include "util_map.h"
#include "util_param.h"
#include "util_vector.h"
#include "util/util_map.h"
#include "util/util_param.h"
#include "util/util_vector.h"
CCL_NAMESPACE_BEGIN


@@ -16,8 +16,8 @@
#pragma once
#include "util_map.h"
#include "util_param.h"
#include "util/util_map.h"
#include "util/util_param.h"
CCL_NAMESPACE_BEGIN


@@ -14,9 +14,9 @@
* limitations under the License.
*/
#include "node_type.h"
#include "util_foreach.h"
#include "util_transform.h"
#include "graph/node_type.h"
#include "util/util_foreach.h"
#include "util/util_transform.h"
CCL_NAMESPACE_BEGIN


@@ -16,12 +16,12 @@
#pragma once
#include "node_enum.h"
#include "graph/node_enum.h"
#include "util_map.h"
#include "util_param.h"
#include "util_string.h"
#include "util_vector.h"
#include "util/util_map.h"
#include "util/util_param.h"
#include "util/util_string.h"
#include "util/util_vector.h"
CCL_NAMESPACE_BEGIN


@@ -14,11 +14,11 @@
* limitations under the License.
*/
#include "node_xml.h"
#include "graph/node_xml.h"
#include "util_foreach.h"
#include "util_string.h"
#include "util_transform.h"
#include "util/util_foreach.h"
#include "util/util_string.h"
#include "util/util_transform.h"
CCL_NAMESPACE_BEGIN


@@ -16,11 +16,11 @@
#pragma once
#include "node.h"
#include "graph/node.h"
#include "util_map.h"
#include "util_string.h"
#include "util_xml.h"
#include "util/util_map.h"
#include "util/util_string.h"
#include "util/util_xml.h"
CCL_NAMESPACE_BEGIN


@@ -1,10 +1,7 @@
remove_extra_strict_flags()
set(INC
.
../util
osl
svm
..
)
set(INC_SYS
@@ -13,19 +10,28 @@ set(INC_SYS
set(SRC
kernels/cpu/kernel.cpp
kernels/cpu/kernel_split.cpp
kernels/opencl/kernel.cl
kernels/opencl/kernel_state_buffer_size.cl
kernels/opencl/kernel_split.cl
kernels/opencl/kernel_data_init.cl
kernels/opencl/kernel_path_init.cl
kernels/opencl/kernel_queue_enqueue.cl
kernels/opencl/kernel_scene_intersect.cl
kernels/opencl/kernel_lamp_emission.cl
kernels/opencl/kernel_background_buffer_update.cl
kernels/opencl/kernel_do_volume.cl
kernels/opencl/kernel_indirect_background.cl
kernels/opencl/kernel_shader_eval.cl
kernels/opencl/kernel_holdout_emission_blurring_pathtermination_ao.cl
kernels/opencl/kernel_subsurface_scatter.cl
kernels/opencl/kernel_direct_lighting.cl
kernels/opencl/kernel_shadow_blocked.cl
kernels/opencl/kernel_shadow_blocked_ao.cl
kernels/opencl/kernel_shadow_blocked_dl.cl
kernels/opencl/kernel_next_iteration_setup.cl
kernels/opencl/kernel_sum_all_radiance.cl
kernels/opencl/kernel_indirect_subsurface.cl
kernels/opencl/kernel_buffer_update.cl
kernels/cuda/kernel.cu
kernels/cuda/kernel_split.cu
)
set(SRC_BVH_HEADERS
@@ -68,6 +74,7 @@ set(SRC_HEADERS
kernel_path_common.h
kernel_path_state.h
kernel_path_surface.h
kernel_path_subsurface.h
kernel_path_volume.h
kernel_projection.h
kernel_queues.h
@@ -88,6 +95,10 @@ set(SRC_KERNELS_CPU_HEADERS
kernels/cpu/kernel_cpu_image.h
)
set(SRC_KERNELS_CUDA_HEADERS
kernels/cuda/kernel_config.h
)
set(SRC_CLOSURE_HEADERS
closure/alloc.h
closure/bsdf.h
@@ -182,6 +193,7 @@ set(SRC_UTIL_HEADERS
../util/util_hash.h
../util/util_math.h
../util/util_math_fast.h
../util/util_math_intersect.h
../util/util_static_assert.h
../util/util_transform.h
../util/util_texture.h
@@ -189,17 +201,25 @@ set(SRC_UTIL_HEADERS
)
set(SRC_SPLIT_HEADERS
split/kernel_background_buffer_update.h
split/kernel_buffer_update.h
split/kernel_data_init.h
split/kernel_direct_lighting.h
split/kernel_do_volume.h
split/kernel_holdout_emission_blurring_pathtermination_ao.h
split/kernel_indirect_background.h
split/kernel_indirect_subsurface.h
split/kernel_lamp_emission.h
split/kernel_next_iteration_setup.h
split/kernel_path_init.h
split/kernel_queue_enqueue.h
split/kernel_scene_intersect.h
split/kernel_shader_eval.h
split/kernel_shadow_blocked.h
split/kernel_shadow_blocked_ao.h
split/kernel_shadow_blocked_dl.h
split/kernel_split_common.h
split/kernel_sum_all_radiance.h
split/kernel_split_data.h
split/kernel_split_data_types.h
split/kernel_subsurface_scatter.h
)
# CUDA module
@@ -227,8 +247,9 @@ if(WITH_CYCLES_CUDA_BINARIES)
endif()
# build for each arch
set(cuda_sources kernels/cuda/kernel.cu
set(cuda_sources kernels/cuda/kernel.cu kernels/cuda/kernel_split.cu
${SRC_HEADERS}
${SRC_KERNELS_CUDA_HEADERS}
${SRC_BVH_HEADERS}
${SRC_SVM_HEADERS}
${SRC_GEOM_HEADERS}
@@ -237,15 +258,22 @@ if(WITH_CYCLES_CUDA_BINARIES)
)
set(cuda_cubins)
macro(CYCLES_CUDA_KERNEL_ADD arch experimental)
if(${experimental})
set(cuda_extra_flags "-D__KERNEL_EXPERIMENTAL__")
set(cuda_cubin kernel_experimental_${arch}.cubin)
macro(CYCLES_CUDA_KERNEL_ADD arch split experimental)
if(${split})
set(cuda_extra_flags "-D__SPLIT__")
set(cuda_cubin kernel_split)
else()
set(cuda_extra_flags "")
set(cuda_cubin kernel_${arch}.cubin)
set(cuda_cubin kernel)
endif()
if(${experimental})
set(cuda_extra_flags ${cuda_extra_flags} -D__KERNEL_EXPERIMENTAL__)
set(cuda_cubin ${cuda_cubin}_experimental)
endif()
set(cuda_cubin ${cuda_cubin}_${arch}.cubin)
if(WITH_CYCLES_DEBUG)
set(cuda_debug_flags "-D__KERNEL_DEBUG__")
else()
@@ -258,13 +286,19 @@ if(WITH_CYCLES_CUDA_BINARIES)
set(cuda_version_flags "-D__KERNEL_CUDA_VERSION__=${cuda_nvcc_version}")
set(cuda_math_flags "--use_fast_math")
if(${split})
set(cuda_kernel_src "/kernels/cuda/kernel_split.cu")
else()
set(cuda_kernel_src "/kernels/cuda/kernel.cu")
endif()
add_custom_command(
OUTPUT ${cuda_cubin}
COMMAND ${cuda_nvcc_command}
-arch=${arch}
${CUDA_NVCC_FLAGS}
-m${CUDA_BITS}
--cubin ${CMAKE_CURRENT_SOURCE_DIR}/kernels/cuda/kernel.cu
--cubin ${CMAKE_CURRENT_SOURCE_DIR}${cuda_kernel_src}
-o ${CMAKE_CURRENT_BINARY_DIR}/${cuda_cubin}
--ptxas-options="-v"
${cuda_arch_flags}
@@ -272,8 +306,7 @@ if(WITH_CYCLES_CUDA_BINARIES)
${cuda_math_flags}
${cuda_extra_flags}
${cuda_debug_flags}
-I${CMAKE_CURRENT_SOURCE_DIR}/../util
-I${CMAKE_CURRENT_SOURCE_DIR}/svm
-I${CMAKE_CURRENT_SOURCE_DIR}/..
-DCCL_NAMESPACE_BEGIN=
-DCCL_NAMESPACE_END=
-DNVCC
@@ -291,7 +324,12 @@ if(WITH_CYCLES_CUDA_BINARIES)
foreach(arch ${CYCLES_CUDA_BINARIES_ARCH})
# Compile regular kernel
CYCLES_CUDA_KERNEL_ADD(${arch} FALSE)
CYCLES_CUDA_KERNEL_ADD(${arch} FALSE FALSE)
if(WITH_CYCLES_CUDA_SPLIT_KERNEL_BINARIES)
# Compile split kernel
CYCLES_CUDA_KERNEL_ADD(${arch} TRUE FALSE)
endif()
endforeach()
add_custom_target(cycles_kernel_cuda ALL DEPENDS ${cuda_cubins})
@@ -309,36 +347,50 @@ endif()
include_directories(${INC})
include_directories(SYSTEM ${INC_SYS})
set_source_files_properties(kernels/cpu/kernel.cpp PROPERTIES COMPILE_FLAGS "${CYCLES_KERNEL_FLAGS}")
set_source_files_properties(kernels/cpu/kernel_split.cpp PROPERTIES COMPILE_FLAGS "${CYCLES_KERNEL_FLAGS}")
if(CXX_HAS_SSE)
list(APPEND SRC
kernels/cpu/kernel_sse2.cpp
kernels/cpu/kernel_sse3.cpp
kernels/cpu/kernel_sse41.cpp
kernels/cpu/kernel_split_sse2.cpp
kernels/cpu/kernel_split_sse3.cpp
kernels/cpu/kernel_split_sse41.cpp
)
set_source_files_properties(kernels/cpu/kernel_sse2.cpp PROPERTIES COMPILE_FLAGS "${CYCLES_SSE2_KERNEL_FLAGS}")
set_source_files_properties(kernels/cpu/kernel_sse3.cpp PROPERTIES COMPILE_FLAGS "${CYCLES_SSE3_KERNEL_FLAGS}")
set_source_files_properties(kernels/cpu/kernel_sse41.cpp PROPERTIES COMPILE_FLAGS "${CYCLES_SSE41_KERNEL_FLAGS}")
set_source_files_properties(kernels/cpu/kernel_split_sse2.cpp PROPERTIES COMPILE_FLAGS "${CYCLES_SSE2_KERNEL_FLAGS}")
set_source_files_properties(kernels/cpu/kernel_split_sse3.cpp PROPERTIES COMPILE_FLAGS "${CYCLES_SSE3_KERNEL_FLAGS}")
set_source_files_properties(kernels/cpu/kernel_split_sse41.cpp PROPERTIES COMPILE_FLAGS "${CYCLES_SSE41_KERNEL_FLAGS}")
endif()
if(CXX_HAS_AVX)
list(APPEND SRC
kernels/cpu/kernel_avx.cpp
kernels/cpu/kernel_split_avx.cpp
)
set_source_files_properties(kernels/cpu/kernel_avx.cpp PROPERTIES COMPILE_FLAGS "${CYCLES_AVX_KERNEL_FLAGS}")
set_source_files_properties(kernels/cpu/kernel_split_avx.cpp PROPERTIES COMPILE_FLAGS "${CYCLES_AVX_KERNEL_FLAGS}")
endif()
if(CXX_HAS_AVX2)
list(APPEND SRC
kernels/cpu/kernel_avx2.cpp
kernels/cpu/kernel_split_avx2.cpp
)
set_source_files_properties(kernels/cpu/kernel_avx2.cpp PROPERTIES COMPILE_FLAGS "${CYCLES_AVX2_KERNEL_FLAGS}")
set_source_files_properties(kernels/cpu/kernel_split_avx2.cpp PROPERTIES COMPILE_FLAGS "${CYCLES_AVX2_KERNEL_FLAGS}")
endif()
add_library(cycles_kernel
${SRC}
${SRC_HEADERS}
${SRC_KERNELS_CPU_HEADERS}
${SRC_KERNELS_CUDA_HEADERS}
${SRC_BVH_HEADERS}
${SRC_CLOSURE_HEADERS}
${SRC_SVM_HEADERS}
@@ -360,24 +412,33 @@ endif()
#add_custom_target(cycles_kernel_preprocess ALL DEPENDS ${KERNEL_PREPROCESSED})
#delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${KERNEL_PREPROCESSED}" ${CYCLES_INSTALL_PATH}/kernel)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel.cl" ${CYCLES_INSTALL_PATH}/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_data_init.cl" ${CYCLES_INSTALL_PATH}/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_queue_enqueue.cl" ${CYCLES_INSTALL_PATH}/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_scene_intersect.cl" ${CYCLES_INSTALL_PATH}/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_lamp_emission.cl" ${CYCLES_INSTALL_PATH}/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_background_buffer_update.cl" ${CYCLES_INSTALL_PATH}/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_shader_eval.cl" ${CYCLES_INSTALL_PATH}/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_holdout_emission_blurring_pathtermination_ao.cl" ${CYCLES_INSTALL_PATH}/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_direct_lighting.cl" ${CYCLES_INSTALL_PATH}/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_shadow_blocked.cl" ${CYCLES_INSTALL_PATH}/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_next_iteration_setup.cl" ${CYCLES_INSTALL_PATH}/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_sum_all_radiance.cl" ${CYCLES_INSTALL_PATH}/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/cuda/kernel.cu" ${CYCLES_INSTALL_PATH}/kernel/kernels/cuda)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_HEADERS}" ${CYCLES_INSTALL_PATH}/kernel)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_BVH_HEADERS}" ${CYCLES_INSTALL_PATH}/kernel/bvh)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_CLOSURE_HEADERS}" ${CYCLES_INSTALL_PATH}/kernel/closure)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_SVM_HEADERS}" ${CYCLES_INSTALL_PATH}/kernel/svm)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_GEOM_HEADERS}" ${CYCLES_INSTALL_PATH}/kernel/geom)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_UTIL_HEADERS}" ${CYCLES_INSTALL_PATH}/kernel)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_SPLIT_HEADERS}" ${CYCLES_INSTALL_PATH}/kernel/split)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_state_buffer_size.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_split.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_data_init.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_path_init.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_queue_enqueue.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_scene_intersect.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_lamp_emission.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_do_volume.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_indirect_background.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_shader_eval.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_holdout_emission_blurring_pathtermination_ao.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_subsurface_scatter.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_direct_lighting.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_shadow_blocked_ao.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_shadow_blocked_dl.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_next_iteration_setup.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_indirect_subsurface.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/opencl/kernel_buffer_update.cl" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/opencl)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/cuda/kernel.cu" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/cuda)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "kernels/cuda/kernel_split.cu" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/cuda)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_KERNELS_CUDA_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/kernels/cuda)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_BVH_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/bvh)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_CLOSURE_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/closure)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_SVM_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/svm)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_GEOM_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/geom)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_UTIL_HEADERS}" ${CYCLES_INSTALL_PATH}/source/util)
delayed_install(${CMAKE_CURRENT_SOURCE_DIR} "${SRC_SPLIT_HEADERS}" ${CYCLES_INSTALL_PATH}/source/kernel/split)
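The reworked `CYCLES_CUDA_KERNEL_ADD` macro above composes the cubin file name from three inputs: `kernel`, an optional `_split`, an optional `_experimental`, then `_${arch}.cubin`. A small sketch of the same naming rule follows; the function name `cubin_name` is hypothetical, and it is an assumption here that whatever loads the binaries at runtime has to rebuild the name in the same way.

```cpp
#include <string>

// Mirrors the naming rule of the CMake macro:
//   kernel[_split][_experimental]_<arch>.cubin
std::string cubin_name(const std::string& arch, bool split, bool experimental)
{
    std::string name = split ? "kernel_split" : "kernel";
    if (experimental) {
        name += "_experimental";
    }
    return name + "_" + arch + ".cubin";
}
```

Keeping the rule in one place makes it harder for the build side and the loading side to drift apart when another variant is added.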


@@ -27,43 +27,43 @@
CCL_NAMESPACE_BEGIN
#include "bvh_types.h"
#include "kernel/bvh/bvh_types.h"
/* Common QBVH functions. */
#ifdef __QBVH__
# include "qbvh_nodes.h"
# include "kernel/bvh/qbvh_nodes.h"
#endif
/* Regular BVH traversal */
#include "bvh_nodes.h"
#include "kernel/bvh/bvh_nodes.h"
#define BVH_FUNCTION_NAME bvh_intersect
#define BVH_FUNCTION_FEATURES 0
#include "bvh_traversal.h"
#include "kernel/bvh/bvh_traversal.h"
#if defined(__INSTANCING__)
# define BVH_FUNCTION_NAME bvh_intersect_instancing
# define BVH_FUNCTION_FEATURES BVH_INSTANCING
# include "bvh_traversal.h"
# include "kernel/bvh/bvh_traversal.h"
#endif
#if defined(__HAIR__)
# define BVH_FUNCTION_NAME bvh_intersect_hair
# define BVH_FUNCTION_FEATURES BVH_INSTANCING|BVH_HAIR|BVH_HAIR_MINIMUM_WIDTH
# include "bvh_traversal.h"
# include "kernel/bvh/bvh_traversal.h"
#endif
#if defined(__OBJECT_MOTION__)
# define BVH_FUNCTION_NAME bvh_intersect_motion
# define BVH_FUNCTION_FEATURES BVH_INSTANCING|BVH_MOTION
# include "bvh_traversal.h"
# include "kernel/bvh/bvh_traversal.h"
#endif
#if defined(__HAIR__) && defined(__OBJECT_MOTION__)
# define BVH_FUNCTION_NAME bvh_intersect_hair_motion
# define BVH_FUNCTION_FEATURES BVH_INSTANCING|BVH_HAIR|BVH_HAIR_MINIMUM_WIDTH|BVH_MOTION
# include "bvh_traversal.h"
# include "kernel/bvh/bvh_traversal.h"
#endif
/* Subsurface scattering BVH traversal */
@@ -71,12 +71,12 @@ CCL_NAMESPACE_BEGIN
#if defined(__SUBSURFACE__)
# define BVH_FUNCTION_NAME bvh_intersect_subsurface
# define BVH_FUNCTION_FEATURES BVH_HAIR
# include "bvh_subsurface.h"
# include "kernel/bvh/bvh_subsurface.h"
# if defined(__OBJECT_MOTION__)
# define BVH_FUNCTION_NAME bvh_intersect_subsurface_motion
# define BVH_FUNCTION_FEATURES BVH_MOTION|BVH_HAIR
# include "bvh_subsurface.h"
# include "kernel/bvh/bvh_subsurface.h"
# endif
#endif /* __SUBSURFACE__ */
@@ -85,18 +85,18 @@ CCL_NAMESPACE_BEGIN
#if defined(__VOLUME__)
# define BVH_FUNCTION_NAME bvh_intersect_volume
# define BVH_FUNCTION_FEATURES BVH_HAIR
# include "bvh_volume.h"
# include "kernel/bvh/bvh_volume.h"
# if defined(__INSTANCING__)
# define BVH_FUNCTION_NAME bvh_intersect_volume_instancing
# define BVH_FUNCTION_FEATURES BVH_INSTANCING|BVH_HAIR
# include "bvh_volume.h"
# include "kernel/bvh/bvh_volume.h"
# endif
# if defined(__OBJECT_MOTION__)
# define BVH_FUNCTION_NAME bvh_intersect_volume_motion
# define BVH_FUNCTION_FEATURES BVH_INSTANCING|BVH_MOTION|BVH_HAIR
# include "bvh_volume.h"
# include "kernel/bvh/bvh_volume.h"
# endif
#endif /* __VOLUME__ */
@@ -105,30 +105,30 @@ CCL_NAMESPACE_BEGIN
#if defined(__SHADOW_RECORD_ALL__)
# define BVH_FUNCTION_NAME bvh_intersect_shadow_all
# define BVH_FUNCTION_FEATURES 0
# include "bvh_shadow_all.h"
# include "kernel/bvh/bvh_shadow_all.h"
# if defined(__INSTANCING__)
# define BVH_FUNCTION_NAME bvh_intersect_shadow_all_instancing
# define BVH_FUNCTION_FEATURES BVH_INSTANCING
# include "bvh_shadow_all.h"
# include "kernel/bvh/bvh_shadow_all.h"
# endif
# if defined(__HAIR__)
# define BVH_FUNCTION_NAME bvh_intersect_shadow_all_hair
# define BVH_FUNCTION_FEATURES BVH_INSTANCING|BVH_HAIR
# include "bvh_shadow_all.h"
# include "kernel/bvh/bvh_shadow_all.h"
# endif
# if defined(__OBJECT_MOTION__)
# define BVH_FUNCTION_NAME bvh_intersect_shadow_all_motion
# define BVH_FUNCTION_FEATURES BVH_INSTANCING|BVH_MOTION
# include "bvh_shadow_all.h"
# include "kernel/bvh/bvh_shadow_all.h"
# endif
# if defined(__HAIR__) && defined(__OBJECT_MOTION__)
# define BVH_FUNCTION_NAME bvh_intersect_shadow_all_hair_motion
# define BVH_FUNCTION_FEATURES BVH_INSTANCING|BVH_HAIR|BVH_MOTION
# include "bvh_shadow_all.h"
# include "kernel/bvh/bvh_shadow_all.h"
# endif
#endif /* __SHADOW_RECORD_ALL__ */
@@ -137,18 +137,18 @@ CCL_NAMESPACE_BEGIN
#if defined(__VOLUME_RECORD_ALL__)
# define BVH_FUNCTION_NAME bvh_intersect_volume_all
# define BVH_FUNCTION_FEATURES BVH_HAIR
# include "bvh_volume_all.h"
# include "kernel/bvh/bvh_volume_all.h"
# if defined(__INSTANCING__)
# define BVH_FUNCTION_NAME bvh_intersect_volume_all_instancing
# define BVH_FUNCTION_FEATURES BVH_INSTANCING|BVH_HAIR
# include "bvh_volume_all.h"
# include "kernel/bvh/bvh_volume_all.h"
# endif
# if defined(__OBJECT_MOTION__)
# define BVH_FUNCTION_NAME bvh_intersect_volume_all_motion
# define BVH_FUNCTION_FEATURES BVH_INSTANCING|BVH_MOTION|BVH_HAIR
# include "bvh_volume_all.h"
# include "kernel/bvh/bvh_volume_all.h"
# endif
#endif /* __VOLUME_RECORD_ALL__ */
@@ -202,8 +202,9 @@ ccl_device_intersect bool scene_intersect(KernelGlobals *kg,
}
#ifdef __SUBSURFACE__
/* Note: ray is passed by value to work around a possible CUDA compiler bug. */
ccl_device_intersect void scene_intersect_subsurface(KernelGlobals *kg,
const Ray *ray,
const Ray ray,
SubsurfaceIntersection *ss_isect,
int subsurface_object,
uint *lcg_state,
@@ -212,7 +213,7 @@ ccl_device_intersect void scene_intersect_subsurface(KernelGlobals *kg,
#ifdef __OBJECT_MOTION__
if(kernel_data.bvh.have_motion) {
return bvh_intersect_subsurface_motion(kg,
ray,
&ray,
ss_isect,
subsurface_object,
lcg_state,
@@ -220,7 +221,7 @@ ccl_device_intersect void scene_intersect_subsurface(KernelGlobals *kg,
}
#endif /* __OBJECT_MOTION__ */
return bvh_intersect_subsurface(kg,
ray,
&ray,
ss_isect,
subsurface_object,
lcg_state,
@@ -229,30 +230,63 @@ ccl_device_intersect void scene_intersect_subsurface(KernelGlobals *kg,
#endif
#ifdef __SHADOW_RECORD_ALL__
ccl_device_intersect bool scene_intersect_shadow_all(KernelGlobals *kg, const Ray *ray, Intersection *isect, uint max_hits, uint *num_hits)
ccl_device_intersect bool scene_intersect_shadow_all(KernelGlobals *kg,
const Ray *ray,
Intersection *isect,
int skip_object,
uint max_hits,
uint *num_hits)
{
# ifdef __OBJECT_MOTION__
if(kernel_data.bvh.have_motion) {
# ifdef __HAIR__
if(kernel_data.bvh.have_curves)
return bvh_intersect_shadow_all_hair_motion(kg, ray, isect, max_hits, num_hits);
if(kernel_data.bvh.have_curves) {
return bvh_intersect_shadow_all_hair_motion(kg,
ray,
isect,
skip_object,
max_hits,
num_hits);
}
# endif /* __HAIR__ */
return bvh_intersect_shadow_all_motion(kg, ray, isect, max_hits, num_hits);
return bvh_intersect_shadow_all_motion(kg,
ray,
isect,
skip_object,
max_hits,
num_hits);
}
# endif /* __OBJECT_MOTION__ */
# ifdef __HAIR__
if(kernel_data.bvh.have_curves)
return bvh_intersect_shadow_all_hair(kg, ray, isect, max_hits, num_hits);
if(kernel_data.bvh.have_curves) {
return bvh_intersect_shadow_all_hair(kg,
ray,
isect,
skip_object,
max_hits,
num_hits);
}
# endif /* __HAIR__ */
# ifdef __INSTANCING__
if(kernel_data.bvh.have_instancing)
return bvh_intersect_shadow_all_instancing(kg, ray, isect, max_hits, num_hits);
if(kernel_data.bvh.have_instancing) {
return bvh_intersect_shadow_all_instancing(kg,
ray,
isect,
skip_object,
max_hits,
num_hits);
}
# endif /* __INSTANCING__ */
return bvh_intersect_shadow_all(kg, ray, isect, max_hits, num_hits);
return bvh_intersect_shadow_all(kg,
ray,
isect,
skip_object,
max_hits,
num_hits);
}
#endif /* __SHADOW_RECORD_ALL__ */
@@ -357,7 +391,7 @@ ccl_device_inline float3 ray_offset(float3 P, float3 Ng)
#endif
}
#if defined(__SHADOW_RECORD_ALL__) || defined (__VOLUME_RECORD_ALL__)
#if defined(__VOLUME_RECORD_ALL__) || (defined(__SHADOW_RECORD_ALL__) && defined(__KERNEL_CPU__))
/* ToDo: Move to another file? */
ccl_device int intersections_compare(const void *a, const void *b)
{
@@ -373,5 +407,28 @@ ccl_device int intersections_compare(const void *a, const void *b)
}
#endif
CCL_NAMESPACE_END
#if defined(__SHADOW_RECORD_ALL__)
ccl_device_inline void sort_intersections(Intersection *hits, uint num_hits)
{
#ifdef __KERNEL_GPU__
/* Use bubble sort which has more friendly memory pattern on GPU. */
bool swapped;
do {
swapped = false;
for(int j = 0; j < num_hits - 1; ++j) {
if(hits[j].t > hits[j + 1].t) {
struct Intersection tmp = hits[j];
hits[j] = hits[j + 1];
hits[j + 1] = tmp;
swapped = true;
}
}
--num_hits;
} while(swapped);
#else
qsort(hits, num_hits, sizeof(Intersection), intersections_compare);
#endif
}
#endif /* __SHADOW_RECORD_ALL__ */
CCL_NAMESPACE_END
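
Review note: the GPU branch of sort_intersections above is a bubble sort with early exit, keyed on the hit distance t. A minimal standalone sketch of the same idea follows. The names Hit and sort_hits_by_distance are hypothetical stand-ins and not part of the kernel. The sketch also guards the case of fewer than two hits, since the count is unsigned and subtracting 1 from zero would wrap around.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for the kernel's Intersection; only `t` is used for ordering.
struct Hit {
    float t;   // distance along the ray
    int prim;  // illustrative payload, not a kernel field layout
};

// Bubble sort by ascending t with early exit when a pass makes no swap.
// Assumption: a null pointer is only acceptable together with n == 0.
inline void sort_hits_by_distance(Hit *hits, std::size_t n)
{
    assert(hits != nullptr || n == 0);
    if (n < 2) {
        return;  // nothing to sort, and avoids unsigned wrap-around below
    }
    bool swapped;
    do {
        swapped = false;
        for (std::size_t j = 0; j + 1 < n; ++j) {
            if (hits[j].t > hits[j + 1].t) {
                std::swap(hits[j], hits[j + 1]);
                swapped = true;
            }
        }
        --n;  // the largest remaining element has reached its final slot
    } while (swapped);
}
```

Adjacent swaps touch only neighbouring elements, which matches the commit's stated motivation of a friendlier memory pattern on the GPU. The hit arrays here are bounded by max_hits, so the quadratic worst case stays small.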


@@ -17,8 +17,8 @@
// TODO(sergey): Look into avoiding use of a full Transform and using a 3x3
// matrix and a 3-vector instead, which might be faster.
ccl_device_forceinline Transform bvh_unaligned_node_fetch_space(KernelGlobals *kg,
int node_addr,
int child)
int node_addr,
int child)
{
Transform space;
const int child_addr = node_addr + child * 3;
@@ -31,12 +31,12 @@ ccl_device_forceinline Transform bvh_unaligned_node_fetch_space(KernelGlobals *k
#if !defined(__KERNEL_SSE2__)
ccl_device_forceinline int bvh_aligned_node_intersect(KernelGlobals *kg,
const float3 P,
const float3 idir,
const float t,
const int node_addr,
const uint visibility,
float dist[2])
const float3 P,
const float3 idir,
const float t,
const int node_addr,
const uint visibility,
float dist[2])
{
/* fetch node data */
@@ -78,14 +78,14 @@ ccl_device_forceinline int bvh_aligned_node_intersect(KernelGlobals *kg,
}
ccl_device_forceinline int bvh_aligned_node_intersect_robust(KernelGlobals *kg,
const float3 P,
const float3 idir,
const float t,
const float difl,
const float extmax,
const int node_addr,
const uint visibility,
float dist[2])
const float3 P,
const float3 idir,
const float t,
const float difl,
const float extmax,
const int node_addr,
const uint visibility,
float dist[2])
{
/* fetch node data */
@@ -203,13 +203,13 @@ ccl_device_forceinline bool bvh_unaligned_node_intersect_child_robust(
}
ccl_device_forceinline int bvh_unaligned_node_intersect(KernelGlobals *kg,
const float3 P,
const float3 dir,
const float3 idir,
const float t,
const int node_addr,
const uint visibility,
float dist[2])
const float3 P,
const float3 dir,
const float3 idir,
const float t,
const int node_addr,
const uint visibility,
float dist[2])
{
int mask = 0;
float4 cnodes = kernel_tex_fetch(__bvh_nodes, node_addr+0);
@@ -233,15 +233,15 @@ ccl_device_forceinline int bvh_unaligned_node_intersect(KernelGlobals *kg,
}
ccl_device_forceinline int bvh_unaligned_node_intersect_robust(KernelGlobals *kg,
const float3 P,
const float3 dir,
const float3 idir,
const float t,
const float difl,
const float extmax,
const int node_addr,
const uint visibility,
float dist[2])
const float3 P,
const float3 dir,
const float3 idir,
const float t,
const float difl,
const float extmax,
const int node_addr,
const uint visibility,
float dist[2])
{
int mask = 0;
float4 cnodes = kernel_tex_fetch(__bvh_nodes, node_addr+0);
@@ -265,13 +265,13 @@ ccl_device_forceinline int bvh_unaligned_node_intersect_robust(KernelGlobals *kg
}
ccl_device_forceinline int bvh_node_intersect(KernelGlobals *kg,
const float3 P,
const float3 dir,
const float3 idir,
const float t,
const int node_addr,
const uint visibility,
float dist[2])
const float3 P,
const float3 dir,
const float3 idir,
const float t,
const int node_addr,
const uint visibility,
float dist[2])
{
float4 node = kernel_tex_fetch(__bvh_nodes, node_addr);
if(__float_as_uint(node.x) & PATH_RAY_NODE_UNALIGNED) {
@@ -296,15 +296,15 @@ ccl_device_forceinline int bvh_node_intersect(KernelGlobals *kg,
}
ccl_device_forceinline int bvh_node_intersect_robust(KernelGlobals *kg,
const float3 P,
const float3 dir,
const float3 idir,
const float t,
const float difl,
const float extmax,
const int node_addr,
const uint visibility,
float dist[2])
const float3 P,
const float3 dir,
const float3 idir,
const float t,
const float difl,
const float extmax,
const int node_addr,
const uint visibility,
float dist[2])
{
float4 node = kernel_tex_fetch(__bvh_nodes, node_addr);
if(__float_as_uint(node.x) & PATH_RAY_NODE_UNALIGNED) {
@@ -442,19 +442,19 @@ ccl_device_forceinline int bvh_aligned_node_intersect_robust(
}
ccl_device_forceinline int bvh_unaligned_node_intersect(KernelGlobals *kg,
const float3 P,
const float3 dir,
const ssef& isect_near,
const ssef& isect_far,
const int node_addr,
const uint visibility,
float dist[2])
const float3 P,
const float3 dir,
const ssef& isect_near,
const ssef& isect_far,
const int node_addr,
const uint visibility,
float dist[2])
{
Transform space0 = bvh_unaligned_node_fetch_space(kg, node_addr, 0);
Transform space1 = bvh_unaligned_node_fetch_space(kg, node_addr, 1);
float3 aligned_dir0 = transform_direction(&space0, dir),
aligned_dir1 = transform_direction(&space1, dir);;
aligned_dir1 = transform_direction(&space1, dir);
float3 aligned_P0 = transform_point(&space0, P),
aligned_P1 = transform_point(&space1, P);
float3 nrdir0 = -bvh_inverse_direction(aligned_dir0),
@@ -503,20 +503,20 @@ ccl_device_forceinline int bvh_unaligned_node_intersect(KernelGlobals *kg,
}
ccl_device_forceinline int bvh_unaligned_node_intersect_robust(KernelGlobals *kg,
const float3 P,
const float3 dir,
const ssef& isect_near,
const ssef& isect_far,
const float difl,
const int node_addr,
const uint visibility,
float dist[2])
const float3 P,
const float3 dir,
const ssef& isect_near,
const ssef& isect_far,
const float difl,
const int node_addr,
const uint visibility,
float dist[2])
{
Transform space0 = bvh_unaligned_node_fetch_space(kg, node_addr, 0);
Transform space1 = bvh_unaligned_node_fetch_space(kg, node_addr, 1);
float3 aligned_dir0 = transform_direction(&space0, dir),
aligned_dir1 = transform_direction(&space1, dir);;
aligned_dir1 = transform_direction(&space1, dir);
float3 aligned_P0 = transform_point(&space0, P),
aligned_P1 = transform_point(&space1, P);
float3 nrdir0 = -bvh_inverse_direction(aligned_dir0),
@@ -574,17 +574,17 @@ ccl_device_forceinline int bvh_unaligned_node_intersect_robust(KernelGlobals *kg
}
ccl_device_forceinline int bvh_node_intersect(KernelGlobals *kg,
const float3& P,
const float3& dir,
const ssef& isect_near,
const ssef& isect_far,
const ssef& tsplat,
const ssef Psplat[3],
const ssef idirsplat[3],
const shuffle_swap_t shufflexyz[3],
const int node_addr,
const uint visibility,
float dist[2])
const float3& P,
const float3& dir,
const ssef& isect_near,
const ssef& isect_far,
const ssef& tsplat,
const ssef Psplat[3],
const ssef idirsplat[3],
const shuffle_swap_t shufflexyz[3],
const int node_addr,
const uint visibility,
float dist[2])
{
float4 node = kernel_tex_fetch(__bvh_nodes, node_addr);
if(__float_as_uint(node.x) & PATH_RAY_NODE_UNALIGNED) {
@@ -612,19 +612,19 @@ ccl_device_forceinline int bvh_node_intersect(KernelGlobals *kg,
}
ccl_device_forceinline int bvh_node_intersect_robust(KernelGlobals *kg,
const float3& P,
const float3& dir,
const ssef& isect_near,
const ssef& isect_far,
const ssef& tsplat,
const ssef Psplat[3],
const ssef idirsplat[3],
const shuffle_swap_t shufflexyz[3],
const float difl,
const float extmax,
const int node_addr,
const uint visibility,
float dist[2])
const float3& P,
const float3& dir,
const ssef& isect_near,
const ssef& isect_far,
const ssef& tsplat,
const ssef Psplat[3],
const ssef idirsplat[3],
const shuffle_swap_t shufflexyz[3],
const float difl,
const float extmax,
const int node_addr,
const uint visibility,
float dist[2])
{
float4 node = kernel_tex_fetch(__bvh_nodes, node_addr);
if(__float_as_uint(node.x) & PATH_RAY_NODE_UNALIGNED) {


@@ -18,7 +18,7 @@
*/
#ifdef __QBVH__
# include "qbvh_shadow_all.h"
# include "kernel/bvh/qbvh_shadow_all.h"
#endif
#if BVH_FEATURE(BVH_HAIR)
@@ -45,6 +45,7 @@ ccl_device_inline
bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
const Ray *ray,
Intersection *isect_array,
const int skip_object,
const uint max_hits,
uint *num_hits)
{
@@ -100,9 +101,6 @@ bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
gen_idirsplat_swap(pn, shuf_identity, shuf_swap, idir, idirsplat, shufflexyz);
#endif /* __KERNEL_SSE2__ */
IsectPrecalc isect_precalc;
triangle_intersect_precalc(dir, &isect_precalc);
/* traversal loop */
do {
do {
@@ -189,6 +187,16 @@ bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
while(prim_addr < prim_addr2) {
kernel_assert((kernel_tex_fetch(__prim_type, prim_addr) & PRIMITIVE_ALL) == p_type);
#ifdef __SHADOW_TRICKS__
uint tri_object = (object == OBJECT_NONE)
? kernel_tex_fetch(__prim_object, prim_addr)
: object;
if(tri_object == skip_object) {
++prim_addr;
continue;
}
#endif
bool hit;
/* todo: specialized intersect functions which don't fill in
@@ -198,9 +206,9 @@ bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
switch(p_type) {
case PRIMITIVE_TRIANGLE: {
hit = triangle_intersect(kg,
&isect_precalc,
isect_array,
P,
dir,
PATH_RAY_SHADOW,
object,
prim_addr);
@@ -309,12 +317,11 @@ bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
object = kernel_tex_fetch(__prim_object, -prim_addr-1);
# if BVH_FEATURE(BVH_MOTION)
bvh_instance_motion_push(kg, object, ray, &P, &dir, &idir, &isect_t, &ob_itfm);
isect_t = bvh_instance_motion_push(kg, object, ray, &P, &dir, &idir, isect_t, &ob_itfm);
# else
bvh_instance_push(kg, object, ray, &P, &dir, &idir, &isect_t);
isect_t = bvh_instance_push(kg, object, ray, &P, &dir, &idir, isect_t);
# endif
triangle_intersect_precalc(dir, &isect_precalc);
num_hits_in_instance = 0;
isect_array->t = isect_t;
@@ -354,22 +361,17 @@ bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
bvh_instance_pop_factor(kg, object, ray, &P, &dir, &idir, &t_fac);
# endif
triangle_intersect_precalc(dir, &isect_precalc);
/* scale isect->t to adjust for instancing */
for(int i = 0; i < num_hits_in_instance; i++) {
(isect_array-i-1)->t *= t_fac;
}
}
else {
float ignore_t = FLT_MAX;
# if BVH_FEATURE(BVH_MOTION)
bvh_instance_motion_pop(kg, object, ray, &P, &dir, &idir, &ignore_t, &ob_itfm);
bvh_instance_motion_pop(kg, object, ray, &P, &dir, &idir, FLT_MAX, &ob_itfm);
# else
bvh_instance_pop(kg, object, ray, &P, &dir, &idir, &ignore_t);
bvh_instance_pop(kg, object, ray, &P, &dir, &idir, FLT_MAX);
# endif
triangle_intersect_precalc(dir, &isect_precalc);
}
isect_t = tmax;
@@ -400,6 +402,7 @@ bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
ccl_device_inline bool BVH_FUNCTION_NAME(KernelGlobals *kg,
const Ray *ray,
Intersection *isect_array,
const int skip_object,
const uint max_hits,
uint *num_hits)
{
@@ -408,6 +411,7 @@ ccl_device_inline bool BVH_FUNCTION_NAME(KernelGlobals *kg,
return BVH_FUNCTION_FULL_NAME(QBVH)(kg,
ray,
isect_array,
skip_object,
max_hits,
num_hits);
}
@@ -418,6 +422,7 @@ ccl_device_inline bool BVH_FUNCTION_NAME(KernelGlobals *kg,
return BVH_FUNCTION_FULL_NAME(BVH)(kg,
ray,
isect_array,
skip_object,
max_hits,
num_hits);
}
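
Review note: the __SHADOW_TRICKS__ hunk above resolves the owning object of each primitive (the per-primitive lookup at top level, the current instance otherwise) and skips the primitive when that object equals skip_object. A minimal standalone sketch of that filter follows. The names kObjectNone and candidate_prims are hypothetical, and plain int indices stand in for the kernel's texture fetches and uint object ids.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for OBJECT_NONE; the kernel's actual value is not assumed here.
const int kObjectNone = -1;

// Returns the indices of primitives that remain intersection candidates.
// prim_object[i]   : owning object of primitive i (stand-in for the __prim_object lookup)
// instance_object  : object of the instance being traversed, or kObjectNone at top level
// skip_object      : object whose primitives must not be tested
inline std::vector<std::size_t> candidate_prims(const std::vector<int> &prim_object,
                                                int instance_object,
                                                int skip_object)
{
    std::vector<std::size_t> kept;
    for (std::size_t i = 0; i < prim_object.size(); ++i) {
        // Inside an instance every primitive belongs to the instance's object.
        const int owner = (instance_object == kObjectNone) ? prim_object[i]
                                                           : instance_object;
        if (owner == skip_object) {
            continue;  // mirrors the `++prim_addr; continue;` in the hunk
        }
        kept.push_back(i);
    }
    return kept;
}
```

With owners {0, 1, 1, 2} and skip_object 1 at top level, only primitives 0 and 3 remain. Inside an instance of object 1, every primitive is skipped.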


@@ -18,7 +18,7 @@
*/
#ifdef __QBVH__
# include "qbvh_subsurface.h"
# include "kernel/bvh/qbvh_subsurface.h"
#endif
#if BVH_FEATURE(BVH_HAIR)
@@ -75,16 +75,16 @@ void BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
if(!(object_flag & SD_OBJECT_TRANSFORM_APPLIED)) {
#if BVH_FEATURE(BVH_MOTION)
Transform ob_itfm;
bvh_instance_motion_push(kg,
subsurface_object,
ray,
&P,
&dir,
&idir,
&isect_t,
&ob_itfm);
isect_t = bvh_instance_motion_push(kg,
subsurface_object,
ray,
&P,
&dir,
&idir,
isect_t,
&ob_itfm);
#else
bvh_instance_push(kg, subsurface_object, ray, &P, &dir, &idir, &isect_t);
isect_t = bvh_instance_push(kg, subsurface_object, ray, &P, &dir, &idir, isect_t);
#endif
object = subsurface_object;
}
@@ -109,9 +109,6 @@ void BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
gen_idirsplat_swap(pn, shuf_identity, shuf_swap, idir, idirsplat, shufflexyz);
#endif
IsectPrecalc isect_precalc;
triangle_intersect_precalc(dir, &isect_precalc);
/* traversal loop */
do {
do {
@@ -197,9 +194,9 @@ void BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
for(; prim_addr < prim_addr2; prim_addr++) {
kernel_assert(kernel_tex_fetch(__prim_type, prim_addr) == type);
triangle_intersect_subsurface(kg,
&isect_precalc,
ss_isect,
P,
dir,
object,
prim_addr,
isect_t,


@@ -18,7 +18,7 @@
*/
#ifdef __QBVH__
# include "qbvh_traversal.h"
# include "kernel/bvh/qbvh_traversal.h"
#endif
#if BVH_FEATURE(BVH_HAIR)
@@ -104,9 +104,6 @@ ccl_device_noinline bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
gen_idirsplat_swap(pn, shuf_identity, shuf_swap, idir, idirsplat, shufflexyz);
#endif
IsectPrecalc isect_precalc;
triangle_intersect_precalc(dir, &isect_precalc);
/* traversal loop */
do {
do {
@@ -238,9 +235,9 @@ ccl_device_noinline bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
BVH_DEBUG_NEXT_INTERSECTION();
kernel_assert(kernel_tex_fetch(__prim_type, prim_addr) == type);
if(triangle_intersect(kg,
&isect_precalc,
isect,
P,
dir,
visibility,
object,
prim_addr))
@@ -354,11 +351,10 @@ ccl_device_noinline bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
object = kernel_tex_fetch(__prim_object, -prim_addr-1);
# if BVH_FEATURE(BVH_MOTION)
bvh_instance_motion_push(kg, object, ray, &P, &dir, &idir, &isect->t, &ob_itfm);
isect->t = bvh_instance_motion_push(kg, object, ray, &P, &dir, &idir, isect->t, &ob_itfm);
# else
bvh_instance_push(kg, object, ray, &P, &dir, &idir, &isect->t);
isect->t = bvh_instance_push(kg, object, ray, &P, &dir, &idir, isect->t);
# endif
triangle_intersect_precalc(dir, &isect_precalc);
# if defined(__KERNEL_SSE2__)
Psplat[0] = ssef(P.x);
@@ -391,11 +387,10 @@ ccl_device_noinline bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
/* instance pop */
# if BVH_FEATURE(BVH_MOTION)
bvh_instance_motion_pop(kg, object, ray, &P, &dir, &idir, &isect->t, &ob_itfm);
isect->t = bvh_instance_motion_pop(kg, object, ray, &P, &dir, &idir, isect->t, &ob_itfm);
# else
bvh_instance_pop(kg, object, ray, &P, &dir, &idir, &isect->t);
isect->t = bvh_instance_pop(kg, object, ray, &P, &dir, &idir, isect->t);
# endif
triangle_intersect_precalc(dir, &isect_precalc);
# if defined(__KERNEL_SSE2__)
Psplat[0] = ssef(P.x);
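
Review note: the hunks above change bvh_instance_push, bvh_instance_pop and their motion variants from updating the distance through a float pointer to taking it by value and returning the new value. A minimal standalone sketch of that calling convention follows. The names instance_push_t and instance_pop_t are hypothetical, and the scaling by a single len factor is an assumed simplification rather than the kernel's actual transform math.

```cpp
#include <cassert>
#include <cfloat>

// Assumed simplification: entering an instance rescales ray distances by `len`.
// The distance is passed by value and the updated distance is returned,
// so the caller writes `t = instance_push_t(t, len);`.
inline float instance_push_t(float t, float len)
{
    assert(len > 0.0f);
    // FLT_MAX means "no hit yet" and is passed through unchanged.
    return (t != FLT_MAX) ? t * len : FLT_MAX;
}

// Leaving the instance undoes the rescaling.
inline float instance_pop_t(float t, float len)
{
    assert(len > 0.0f);
    return (t != FLT_MAX) ? t / len : FLT_MAX;
}
```

A caller that does not need the distance can pass FLT_MAX and discard the result, which is the shape of the `bvh_instance_pop(..., FLT_MAX)` call in the shadow-all hunk. Returning the value also removes the need for a temporary such as ignore_t.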


@@ -18,7 +18,7 @@
*/
#ifdef __QBVH__
# include "qbvh_volume.h"
# include "kernel/bvh/qbvh_volume.h"
#endif
#if BVH_FEATURE(BVH_HAIR)
@@ -97,9 +97,6 @@ bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
gen_idirsplat_swap(pn, shuf_identity, shuf_swap, idir, idirsplat, shufflexyz);
#endif
IsectPrecalc isect_precalc;
triangle_intersect_precalc(dir, &isect_precalc);
/* traversal loop */
do {
do {
@@ -194,9 +191,9 @@ bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
continue;
}
triangle_intersect(kg,
&isect_precalc,
isect,
P,
dir,
visibility,
object,
prim_addr);
@@ -238,13 +235,11 @@ bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
int object_flag = kernel_tex_fetch(__object_flag, object);
if(object_flag & SD_OBJECT_HAS_VOLUME) {
# if BVH_FEATURE(BVH_MOTION)
bvh_instance_motion_push(kg, object, ray, &P, &dir, &idir, &isect->t, &ob_itfm);
isect->t = bvh_instance_motion_push(kg, object, ray, &P, &dir, &idir, isect->t, &ob_itfm);
# else
bvh_instance_push(kg, object, ray, &P, &dir, &idir, &isect->t);
isect->t = bvh_instance_push(kg, object, ray, &P, &dir, &idir, isect->t);
# endif
triangle_intersect_precalc(dir, &isect_precalc);
# if defined(__KERNEL_SSE2__)
Psplat[0] = ssef(P.x);
Psplat[1] = ssef(P.y);
@@ -281,13 +276,11 @@ bool BVH_FUNCTION_FULL_NAME(BVH)(KernelGlobals *kg,
/* instance pop */
# if BVH_FEATURE(BVH_MOTION)
bvh_instance_motion_pop(kg, object, ray, &P, &dir, &idir, &isect->t, &ob_itfm);
isect->t = bvh_instance_motion_pop(kg, object, ray, &P, &dir, &idir, isect->t, &ob_itfm);
# else
bvh_instance_pop(kg, object, ray, &P, &dir, &idir, &isect->t);
isect->t = bvh_instance_pop(kg, object, ray, &P, &dir, &idir, isect->t);
# endif
triangle_intersect_precalc(dir, &isect_precalc);
# if defined(__KERNEL_SSE2__)
Psplat[0] = ssef(P.x);
Psplat[1] = ssef(P.y);

Some files were not shown because too many files have changed in this diff