Commit Graph

156 Commits

Author SHA1 Message Date
d2bb0e660b Fix T46207: Slow OpenCL GPU bake and blown out baking Cycles render 2016-05-31 17:48:42 +02:00
4388b29e98 Cycles: Add human readable sizes to debug output
Some of these values can get quite large and are hard to read, adding this
makes it easy to read them at a glance.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D2039
2016-05-31 06:13:54 -04:00
3aa74828ab Cycles: Cleanup, indentation and braces 2016-02-03 15:00:55 +01:00
9815f8a623 Cycles: Cleanup of OpenCL split kernel routines
The idea is to switch from allocating separate buffers for shader data's
structure of arrays to allocating one huge memory block and do some index
trickery to make it accessed as SOA.

This saves quite reasonable amount of lines of code in device_opencl and
also makes it possible to get rid of special declaration of ShaderData
structure.

As a side effect it also makes it easier to experiment with SOA vs. AOS
for split kernel.

Works fine here on NVidia GTX580, Intel CPU amd AMD Fiji cards.

Reviewers: #cycles, brecht, juicyfruit, dingto

Differential Revision: https://developer.blender.org/D1593
2016-01-30 00:23:06 +01:00
25aea19323 Cycles: Remove some unused variables from split kernel function 2016-01-29 18:54:46 +01:00
e2161ca854 Cycles: Remove few function arguments needed only for the split kernel
Use KernelGlobals to access all the global arrays for the intermediate
storage instead of passing all this storage things explicitly.

Tested here with Intel OpenCL, NVIDIA GTX580 and AMD Fiji, didn't see
any artifacts, so guess it's all good.

Reviewers: juicyfruit, dingto, lukasstockner97

Differential Revision: https://developer.blender.org/D1736
2016-01-28 18:59:27 +01:00
5f31089957 Cycles: Make OpenCL's argument wrapper able to get int/float values directly 2016-01-28 15:03:42 +01:00
19adfd3176 Cycles: Fix OpenCL kernel compilation after the bake commit
There is no function pointers in OpenCL specification. For as long
as we want to support this platform we should follow the specifications.

While the code is not totally optimal now, it should not be that huge
of performance issue on CPU since it does jump tables just nicely, so
it's not that much extra computation here.
2016-01-19 22:53:19 +01:00
52e34ffe33 Cycles: Pass missing shader filter argument to CUDA and OpenCL kernels 2016-01-19 22:53:19 +01:00
ac7aefd7c2 Cycles: Use special debug panel to fine-tune debug flags
This panel is only visible when debug_value is set to 256 and has no
affect in other cases. However, if debug value is not set to this
value, environment variables will be used to control which features
are enabled, so there's no visible changes to anyone in fact.

There are some changes needed to prevent devices re-enumeration on
every Cycles session create.

Reviewers: juicyfruit, lukasstockner97, dingto, brecht

Reviewed By: lukasstockner97, dingto

Differential Revision: https://developer.blender.org/D1720
2016-01-12 16:21:30 +05:00
81a253a0d5 Cycles OpenCL: Change environment flags for testing.
CYCLES_OPENCL_TEST was removed, there was an insonsistency between
opencl_kernel_use_split() and opencl_get_usable_devices().

From now on, to test non whitelisted devices please use either
CYCLES_OPENCL_MEGA_KERNEL_TEST or CYCLES_OPENCL_SPLIT_KERNEL_TEST.
2016-01-07 00:14:04 +01:00
83e73a2100 Cycles: Refactor how we pass bounce info to light path node.
This commit changes the way how we pass bounce information to the Light
Path node. Instead of manualy copying the bounces into ShaderData, we now
directly pass PathState. This reduces the arguments that we need to pass
around and also makes it easier to extend the feature.

This commit also exposes the Transmission Bounce Depth to the Light Path
node. It works similar to the Transparent Depth Output: Replace a
Transmission lightpath after X bounces with another shader, e.g a Diffuse
one. This can be used to avoid black surfaces, due to low amount of max
bounces.

Reviewed by Sergey and Brecht, thanks for some hlp with this.

I tested compilation and usage on CPU (SVM and OSL), CUDA, OpenCL Split
and Mega kernel. Hopefully this covers all devices. :)
2016-01-06 23:43:29 +01:00
da49ee30b0 Fix T47100: OpenCL compilation warnings due to missing space in the argument list 2016-01-03 23:13:49 +05:00
3918c8b9a5 Cycles: Optionally output luminance from the shader evaluation kernel
This makes it possible to move some parts of evaluation from host to the device
and hopefully reduce memory usage by avoid having full RGBA buffer on the host.

Reviewers: juicyfruit, lukasstockner97, brecht

Reviewed By: lukasstockner97, brecht

Differential Revision: https://developer.blender.org/D1702
2015-12-30 19:04:04 +05:00
0ae2ade17a Cycles; Fix typo in the comment 2015-12-28 19:01:26 +05:00
a106da7f1d Cycles: Move build options constructions to DeviceRequestedFeatures
This way it's easier to re-use requested features logic across multiple
device implementations.
2015-11-21 21:42:31 +05:00
9aafec1ce1 Cycles: Avoid multiple spaces in OpenCL build options
This should solve some compilation errors with compilation on OSX.
2015-11-21 21:33:08 +05:00
9bce104c8c Cycles: Partially revert previous commit
Apparently removing kernel arguments broke NVidia OpenCL.

Needs more investigation, for the time being revering changes which caused problem.
2015-11-01 21:01:12 +05:00
dc9e0b819b Cycles: Remove unused argument from the split kernel functions
Should be no functional changes, just simplifies operation with kernels.
2015-11-01 17:22:42 +05:00
84e8b05e97 Cycles: Minor code style cleanup 2015-11-01 15:40:17 +05:00
3300b1f232 Cycles: Add option to force mega kernel to be used
This way it's possible to test mega kernel on various hardware.

That being said mega kernel seems to work on Fiji card here in the studio.
2015-10-29 21:52:56 +05:00
350cf8ea7f Cycles: Cleanup, whitespace around keywords 2015-10-08 19:08:28 +05:00
3804a3660e Cleanup: Typo fixes in OpenCL log messages. 2015-09-24 15:34:41 +02:00
2672ee77a0 Cleanup: spelling/style 2015-08-23 21:12:48 +10:00
d22153425a Cycles: Enable some extra debug prints for OpenCL kernel loading 2015-08-11 18:03:54 +02:00
3fba620858 Cycles: Prepare for more image extension types support
Basically just replace boolean periodic flag with extension type enum in the
device API.
2015-07-28 14:14:24 +02:00
1788293a01 Fix T45381: Crash Blender 2.75 in Win7 x64 AMD card
Previous fix didn't work well enough because on Windows Python has different
environment than Blender ans setting variables in there made no effect from
Blender point of view.
2015-07-23 12:10:38 +02:00
4bca8a6bc5 Fix T45484: Regression OpenCL split: access violation
That was a primary school error caused by moving statements inside assert()
which effectivly disabled crucial code in release builds.
2015-07-18 23:30:19 +02:00
45b5bf034b Cycles; Make baking a feature-specific option
This means render devices now might skip building baking kernels in cases when
only actual render-related functionality is used.

For now it's only implemented for OpenCL split kernel device and mainly needed
to work around some compiler-specific bugs which crashes on building the kernel.

Using OpenCL for baking might still crash the driver, but at least there is now
higher probability of that GPU will be usable to render the scene.

Real fix should actually be done in the driver side.
2015-07-18 16:02:08 +02:00
36a952e3e4 Cycles: Use feature-selective base kernel compilation when using split kernel
The idea is to make all kernels as small as possible to work around possible
issues with buggy drivers which might fail building feature-complete kernels.

It's indeed just a workaround to make at last simple test scenes to render
on OpenCL. Real fix should happen from the driver side.
2015-07-18 16:02:08 +02:00
5e4a8c6a87 Cycles: Some cleanup if OpenCL base kernel load_kernel()
Hopefully makes it less clumzy, should be no functional changes still.
2015-07-18 16:02:08 +02:00
025eda57da Cycles: Make OpenCL cache follow out code style a bit closer 2015-07-18 16:02:08 +02:00
548e650252 Cycles: Merging of patch from OSX went wrong in the previous change
That's what happens when you can't commit from a system you're making
changes at and someone is behind your back...

Sorry for the noise.
2015-07-15 15:12:19 +02:00
2b97ad348c Cycles: Missed this in the previous commit 2015-07-15 15:11:02 +02:00
56bf25d219 Cycles: Enable OpenCL rendering on Apple OSX
Requires having latest El Capitan beta 3 OSX due to ome crucial fixes made in the
compiler. Supports same features as NVidia OpenCL apart from CMJ (there's no
experimental feature set support in megakernel yet).

Uses megakernel internally, which works much better than the split kernel. Split
kernel is not supported on OSX still, needs to be investigated still.

Some more details can be found there:

  http://wiki.blender.org/index.php/Dev:2.6/Source/Render/Cycles/OpenCL#AMD_on_OSX
2015-07-15 14:20:59 +02:00
a79d47b14e Cycles: Add logging to detected OpenCL platforms and devices
Happens on verbosity level 2, should help looking into some of the
bug reports in the tracker.
2015-07-14 09:56:00 +02:00
3dc86f586c Cycles: Add debug print about CLEW initialization status 2015-07-07 14:37:12 +02:00
37539962fe Cycles: Add an option to force disable all OpenCL devices
This way it's possible to disable OpenCL devices for AMD devices
which are considered whitelisted.
2015-07-07 14:18:45 +02:00
36426c3ee2 Cycles: Code cleanup, double semicolon 2015-07-03 15:44:57 +02:00
c864f5d140 Cycles: Error enqueueing split kernels should no longer cause infinite loop 2015-07-03 12:13:38 +02:00
78de47ca24 Cycles: Fix zero-size buffer allocation with OpenCL devices
This is not really supported by OpenCL but might happen in certain
configurations. There might be some remained cases when this happens
but so far can not find any,
2015-07-01 11:56:48 +02:00
09dc470982 Cycles: Rework the way how OpenCL devices are created
It was annoying copy-paste happened across OpenCL device constructor, device
enumeration and split kernel checks. Now those areas are using an utility
function which returns pairs of platform and device IDs for devices which are
supported by Cycles and enumeration is happening inside that list.

This makes it so filtering is happening in a single place, so there's no need
to keep 3 different functions in sync.

This commit also fixes a bug with wrong enumeration of devices caused by recent
fixes. Those fixes were in fact wrong and only happened to appear to be working
on laptop with optimus card on Linux. Root of those issues is in fact in bad
Linux driver for optimus cards.
2015-06-27 15:13:08 +02:00
c40759e678 Cleanup: warnings 2015-06-24 18:42:16 +10:00
4ed6605d65 Cycles: Don't show devices which does not support OpenCL 1.1 in the menu
They'll be checked for the version later and that check will fail anyway,
so better to not allow user to see unsupported device in the list.

Also corrected one more issue with the device enumeration.
2015-06-18 11:26:22 +02:00
ae3e37b899 Cycles: Fix wrong numbering of OpenCL devices when some of them are skipped
Skipped devices did not reflect in the device number, which might result in bad
array indices.

This might also resolve T45037, and need to be ported to a release branch.
2015-06-17 11:35:39 +02:00
2ebaa69676 Cycles: Move requested feature conversion to an own function
This way it could be used for the shader/baking kernels easily n the future.
making those kernels more optimal.
2015-06-08 11:15:40 +02:00
8c2750bc82 Cycles: Remove round-up trickery for max closure in split OpenCL kernel
Round-up was only enabled for viewport render, which was for a long time hardcoded to
use 64 closures. This was done in order to avoid unnecessary kernel re-compilations
when tweaking the shader tree.

We could enable selective closure compilation in the viewport later if it'll give
measurable speed improvements, but even then round-up is to happen outside of the
device level,

This commit also removes early output which happened in cases when max closure did
not change. It was wrong because other requested kernel features might have been
changed.
2015-06-08 11:15:39 +02:00
27ed75271c Cycles: Make hair, object and motion blur selective compiled into OpenCL
This features are now based on the scene settings, so scenes without those features
used are rendered even faster.

This gives about 30% speedup on the AMD A10 APU here, but at the same time it does
not mean such an improvement will happen on all the hardware. That being said, the
Tonga device here seems to have no measurable difference.

In any case it seems handy to have for the future, when we'll want to support SSS
in the kernel or to port selective compilation/split kernel to CUDA devices.
2015-06-08 11:15:39 +02:00
28f798f86e Cycles: Initial support for OpenCL capabilities reports
For now it's just generic information, still need to expose memory, workgorup
sizes and so on.
2015-06-05 14:17:30 +02:00
9d4d55e78b Cycles: Strip meaningless empty output form the MVidia OpenCL compiler 2015-06-01 19:49:53 +05:00