Commit Graph

709 Commits

Author SHA1 Message Date
4690281b17 Cycles: Add implementation of clip extension mode
For now there's no OpenCL support, it'll come later.
2015-07-28 14:36:08 +02:00
3fba620858 Cycles: Prepare for more image extension types support
Basically just replace boolean periodic flag with extension type enum in the
device API.
2015-07-28 14:14:24 +02:00
038d6ce2cc Cycles: Correction to image extension setting commit
Technically it was all wrong and it should have been called Extend instead
of Clip. Got confused by the naming in different libraries.

More options are still to come.
2015-07-28 13:41:09 +02:00
f2c54df625 Cycles: Expose image image extension mapping to the image manager
Currently only two mappings are supported by API, which is Repeat (old behavior)
and new Clip behavior. Internally this extension is being converted to periodic
flag which was already supported but wasn't exposed.

There's no support for OpenCL yet because of the way how we pack images into a
single texture.

Those settings are not exposed to UI or anywhere else and there should be no
functional changes so far.
2015-07-21 21:58:19 +02:00
dc3563ff48 Cycles: Implement camera zoom motion blur
Works totally similar to camera motion blur and majority of the changes are
related on just passing extra arguments to sync() functions.

Couple of things still to look into:

- Motion pass will not include motion caused by the zoom.
- Only perspective cameras are supported currently.
- Motion is being interpolated on projected coordinates, which might give
  different results from constructing projection matrix from interpolated
  field of view.

  This could be good enough for us, but we need to consider improving this
  at some point.

Reviewers: juicyfruit, dingto

Reviewed By: dingto

Differential Revision: https://developer.blender.org/D1383
2015-07-21 17:40:03 +02:00
b0df19667f Fix T45317: Cycles material preview unnecessarily re-rendering
The issue was caused by wrong fix for T22741 which forced redraws on any window
event, like Expose. Use proper NV_WM | ND_UNDO listener instead,
2015-07-21 11:26:42 +02:00
7d10798af2 Cycles: Add voxel texture sampler shader node
The idea of this node is to sampling of 3D voxels at a given coordinate
supporting different mapping strategies (world space mapping, object
local space etc).

Currently not in use, it's a preparation step for supporting point density
textures.
2015-07-18 22:09:20 +02:00
cf14437ac9 Cycles: Log requested device features
Useful to have this always logged because otherwise it's needed to remove cached
kernels and check build flags to see which features are enabled.
2015-07-18 16:02:09 +02:00
45b5bf034b Cycles; Make baking a feature-specific option
This means render devices now might skip building baking kernels in cases when
only actual render-related functionality is used.

For now it's only implemented for OpenCL split kernel device and mainly needed
to work around some compiler-specific bugs which crashes on building the kernel.

Using OpenCL for baking might still crash the driver, but at least there is now
higher probability of that GPU will be usable to render the scene.

Real fix should actually be done in the driver side.
2015-07-18 16:02:08 +02:00
686e8e452c Fix T45390: Cycles experimental displacement method ignores scaling when render
From artists perspective it makes sense to always apply displacement in a local
space.

TODO: Double-check that BVH is being packed properly. From quick tests seems it's
all fine, but might be missing some obvious failure still.
2015-07-13 15:24:56 +02:00
e245a57640 Fix T45227: Light optimization commit broke world MIS 2015-06-28 20:47:35 +02:00
9260c0c2ba Cycles: Ignore light which has no contribution to the scene
This commit makes it so light which has zero energy or doesn't has
emission shader at all is being ignored by the path tracing.
2015-06-27 15:13:08 +02:00
ddeb8c595f Cleanup: Fix a typo in world MIS.
Found by Lukas Stockner, thanks!
2015-06-26 21:36:28 +02:00
e1fd7b9ca9 Cycles: Report currently sampling tile when CPU is working on the last tile
This is mainly useful for the render farms output when logging might stop at
the "Path Tracing Tile N/N" string, which makes it a bit difficult to follow
what exactly is happening (node going crazy, hardware issues or just last tile
is too much heavy).

This is more like an experiment, might be changed in the future.
2015-06-18 16:15:37 +02:00
0f42b8aee0 Cycles: Fix compilation error with motion blur disabled on CPU 2015-06-13 18:16:32 +02:00
596eadf0e1 Cycles: Add debug pass which shows number of instance pushes during camera ray intersection
TODO: We might want to refactor debug passes into PASS_DEBUG and some
debug_type (similar to Blender's side passes) to avoid issue of running
out of bits.
2015-06-12 00:12:03 +02:00
b666593775 Cycles: Remove Bump Node from the graph, if Height input is not connected.
This way we can avoid building the split kernel with NODE_FEATURE_BUMP enabled, in case we don't need it.
2015-06-11 23:09:38 +02:00
2bd6de5bbb Cycles: Add debug pass showing average number of ray bounces per pixel
Quite straightforward implementation, but still needs some work for the split
kernel. Includes both regular and split kernel implementation for that.

The pass is not exposed to the interface yet because it's currently not really
easy to have same pass listed in the menu multiple times.
2015-06-11 14:53:15 +02:00
27ed75271c Cycles: Make hair, object and motion blur selective compiled into OpenCL
This features are now based on the scene settings, so scenes without those features
used are rendered even faster.

This gives about 30% speedup on the AMD A10 APU here, but at the same time it does
not mean such an improvement will happen on all the hardware. That being said, the
Tonga device here seems to have no measurable difference.

In any case it seems handy to have for the future, when we'll want to support SSS
in the kernel or to port selective compilation/split kernel to CUDA devices.
2015-06-08 11:15:39 +02:00
23b068ce8a Fix T44922: Split kernel renders black when using Bump node
Was missing feature detection in the BumpNode in the previous selective nodes
compilation commit.
2015-06-02 11:53:10 +05:00
27c1262e21 Fix T44908: Blender crashes when trying to use cycles experimental displacement
The issue was caused by the reshuffle needed to make objects flags have proper
object's bounding box to solve regressions in SSS objects intersecting volumes.

There's actually a feedback loop happening here, which is now solved in quite
naive way -- for the true displacement we consider all objects are capable of
intersecting volumes, synchronize object flags prior to displacement shader
tasks runs and then re-update object flags for proper bounding box.

Not sure what will be the proper solution here, we can't do preliminary check
of intersection for displacement shader, but on the other hand we don't really
need this flag for displacement shader anyway.
2015-06-02 00:04:30 +05:00
3127d47029 Cycles: Fix wrong max nodes group used for the viewport render 2015-06-01 19:49:53 +05:00
ecd4ee75af Cycles: Implement selective nodes compilation
This commits finishes initial selective nodes compilation into kernel, which
helps a lot performance-wise for AMD OpenCL kernels.

Split by node groups is based on statistics from simple scenes like BMW and
more complex scenes like mango and gooseberry production files. Further
tweaks are always possible, but it should be a good starting point.

TODO: Still need to ignore unused nodes when calculating requested shader
features.
2015-06-01 19:49:52 +05:00
c0235da53c Cycles: Fix some typos in the selective modes compilation 2015-06-01 19:49:52 +05:00
f45f2ac687 Cycles: Fix missing features gathering from the bump graph 2015-06-01 19:49:52 +05:00
4d8cf1329d Cycles: Add bump feature for selective nodes compilation
For now it is unused in the kernel, actual usage will come with
the next commits.
2015-06-01 19:49:52 +05:00
14251e8b45 Cycles: Shader node features are to be inherited from the base class 2015-06-01 19:49:52 +05:00
da34136de1 Cycles: Check for validity of the tiles arrays in progressive refine
In certain configurations (for example when start resolution is set to small
value for background render and progressive refine enabled) number of tiles
might change in the tile manager. This situation will confuse progressive
refine feature and likely cause crash.

We might also add some settings verification in the session constructor, but
having an assert with brief explanation about what's wrong should already be
much better than nothing.
2015-05-19 12:42:07 +05:00
f868be6295 Cycles: Check for whether update/write callbacks are set prior to calling them
This changes the progressive refine part, regular update was already checking
for whether callbacks are set.
2015-05-19 12:42:07 +05:00
0a60c7d8ee Cycles: Fix missing camera-in-volume update when using certain render layers configurations 2015-05-14 19:08:13 +05:00
0e80eb82e0 Cycles: Resize light_data after possible light removal. 2015-05-14 01:13:40 +02:00
67eb2c7897 Cycles: Remove Emission shaders from the graph if color or strength is 0. 2015-05-14 01:13:40 +02:00
f0f481031c Fix T44616: Cycles crashes loading 42k by 21k textures
Simple integer overflow issue.

TODO(sergey): Check on CPU cubic sampling, it might also need size_t.
2015-05-12 18:48:55 +05:00
4fc3188112 Cycles: Get rid of one more OpenGL matrix manipulation/push/pop. 2015-05-11 16:41:18 +02:00
7f4479da42 Cycles: OpenCL kernel split
This commit contains all the work related on the AMD megakernel split work
which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus
some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely
someone else which we're forgetting to mention.

Currently only AMD cards are enabled for the new split kernel, but it is
possible to force split opencl kernel to be used by setting the following
environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.

Not all the features are supported yet, and that being said no motion blur,
camera blur, SSS and volumetrics for now. Also transparent shadows are
disabled on AMD device because of some compiler bug.

This kernel is also only implements regular path tracing and supporting
branched one will take a bit. Branched path tracing is exposed to the
interface still, which is a bit misleading and will be hidden there soon.

More feature will be enabled once they're ported to the split kernel and
tested.

Neither regular CPU nor CUDA has any difference, they're generating the
same exact code, which means no regressions/improvements there.

Based on the research paper:

  https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf

Here's the documentation:

  https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit

Design discussion of the patch:

  https://developer.blender.org/T44197

Differential Revision: https://developer.blender.org/D1200
2015-05-09 19:52:40 +05:00
f680c1b54a Cycles: Communicate number of closures and nodes feature set to the device
This way device can actually make a decision of how it can optimize the kernel
in order to make it most efficient.
2015-05-09 19:28:00 +05:00
6fc1669679 Cycles: Initial work towards selective nodes support compilation
The goal is to be able to compile kernel with nodes which are actually needed
to render current scene, hence improving performance of the kernel,

The idea is:

- Have few node groups, starting with a group which contains nodes are used
  really often, and then couple of groups which will be extension of this one.

- Have feature-based nodes disabling, so it's possible to disable nodes related
  to features which are not used with the currently used nodes group.

This commit only lays down needed routines for this approach, actual split will
happen later after gathering statistics from bunch of production scenes.
2015-05-09 19:22:16 +05:00
17c95d0a96 Cycles: Add utility function to count maximum number of closures used by session
This will be used by split kernel in order to compile most optimal kernel.

Maximum number of closures is actually being cached in the session, so viewport
rendering will not trigger kernel re-loading when number of closures goes down.
2015-05-09 19:17:49 +05:00
5068f7dc01 Cycles: Add utility function to graph to query number of closures used in it
Currently unused but will be needed soon for the split kernel work.
2015-05-09 19:13:32 +05:00
b3299bace0 Cycles: Pass requested tile size to the device via device task
This is currently unused but crucial for things like calculating amount of
device memory required to deal with the tasks.

Maybe not really best place to store it, but consider it good enough for now.
2015-05-09 19:09:07 +05:00
0e4ddaadd4 Cycles: Change the way how we pass requested capabilities to the device
Previously we only had experimental flag passed to device's load_kernel() which
was all fine. But since we're gonna to have some extra parameters passed there
it makes sense to wrap them into a single struct, which will make it easier to
pass stuff around.
2015-05-09 19:05:49 +05:00
7eac672e4f Cycles: Set default closure values to some of the nodes
Previously it was only set at compilation time which is all fine but does
not let us to check which closure the node corresponds to prior to the
compilation.
2015-05-09 19:04:09 +05:00
900fc43bb4 Cleanup: Remove unused ray type flags.
They were added for completeness, but it seems we don't need them.
2015-05-08 12:10:26 +02:00
7201f6d14c Cycles: Use curve approximation for blackbody instead of lookup table
Now we calculate color in range 800..12000 using an approximation a/x+bx+c for R and G and ((at + b)t + c)t + d) for B.
Max absolute error for RGB for non-lut function is less than 0.0001, which is enough to get the same 8 bit/channel color as for OSL with a noticeable performance difference.
However there is a slight visible difference between previous non-OSL implementation because of lookup table interpolation and offset-by-one mistake.
The previous implementation gave black color outside of soft range (t > 12000), now it gives the same color as for 12000.

Also blackbody node without input connected is being converted to value input at shader compile time.

Reviewers: dingto, sergey

Reviewed By: dingto

Subscribers: nutel, brecht, juicyfruit

Differential Revision: https://developer.blender.org/D1280
2015-05-05 06:11:54 +00:00
e5f3193df3 Cycles: Fix wrong order in object flags calculations
Object flags are depending on bounding box which is only available after
mesh synchronization.

This was broken since 7fd4c44 which happened quite close to the release
and oddly enough was not sopped by anyone. Render test is coming for this.

Was spotted by Thomas Dinges while working on another patch.
2015-04-30 01:09:48 +05:00
f478c2cfbd Cycles: Added support for light portals
This patch adds support for light portals: objects that help sampling the
environment light, therefore improving convergence. Using them tor other
lights in a unidirectional pathtracer is virtually useless.

The sampling is done with the area-preserving code already used for area lamps.
MIS is used both for combination of different portals and for combining portal-
and envmap-sampling.

The direction of portals is considered, they aren't used if the sampling point
is behind them.

Reviewers: sergey, dingto, #cycles

Reviewed By: dingto, #cycles

Subscribers: Lapineige, nutel, jtheninja, dsisco11, januz, vitorbalbio, candreacchio, TARDISMaker, lichtwerk, ace_dragon, marcog, mib2berlin, Tunge, lopataasdf, lordodin, sergey, dingto

Differential Revision: https://developer.blender.org/D1133
2015-04-28 01:30:16 +05:00
ae7d84dbc1 Cycles: Use native saturate function for CUDA
This more a workaround for CUDA optimizer which can't optimize clamp(x, 0, 1)
into a single instruction and uses 4 instructions instead.

Original patch by @lockal with own modification:

  Don't make changes outside of the kernel. They don't make any difference
  anyway and term saturate() has a bit different meaning outside of kernel.

This gives around 2% of speedup in Barcelona file, but in more complex shader
setups with lots of math nodes with clamping speedup could be much nicer.

Subscribers: dingto

Projects: #cycles

Differential Revision: https://developer.blender.org/D1224
2015-04-28 00:38:32 +05:00
bc160d8a85 Cleanup: Code style. 2015-04-26 00:42:26 +02:00
8dd055cd47 Cleanup: Update Lookup table comments. 2015-04-26 00:06:38 +02:00
828abaf11c Cycles: Split BVH nodes storage into inner and leaf nodes
This way we can get rid of inefficient memory usage caused by BVH boundbox
part being unused by leaf nodes but still being allocated for them. Doing
such split allows to save 6 of float4 values for QBVH per leaf node and 3
of float4 values for regular BVH per leaf node.

This translates into following memory save using 01.01.01.G rendered
without hair:

                   Device memory size   Device memory peak   Global memory peak
Before the patch:  4957                 5051                 7668
With the patch:    4467                 4562                 7332

The measurements are done against current master. Still need to run speed tests
and it's hard to predict if it's faster or not: on the one hand leaf nodes are
now much more coherent in cache, on the other hand they're not so much coherent
with regular nodes anymore.

Reviewers: brecht, juicyfruit

Subscribers: venomgfx, eyecandy

Differential Revision: https://developer.blender.org/D1236
2015-04-20 17:29:51 +05:00