Float2 are now a new type for attributes in Cycles. Before, the choices
for attribute storage were float and float3, the latter padded to
float4. This meant that UV maps were inflated to twice the size
necessary.
Reviewers: brecht, sergey
Reviewed By: brecht
Subscribers: #cycles
Tags: #cycles
Differential Revision: https://developer.blender.org/D4409
There is a generic function to retrieve float and float3 attributes
`primitive_attribute_float` and primitive_attribute_float3`. Inside
these functions an prioritised if-else construction checked where
the attribute is stored and then retrieved from that location.
Actually the calling function most of the time already knows where
the data is stored. So we could simplify this by splitting these
functions and remove the check logic.
This patch splits the `primitive_attribute_float?` functions into
`primitive_surface_attribute_float?` and `primitive_volume_attribute_float?`.
What leads to less branching and more optimum kernels.
The original function is still being used by OSL and `svm_node_attr`.
This will reduce the compilation time and render time for kernels.
Especially in production scenes there is a lot of benefit.
Impact in compilation times
job | scene_name | previous | new | percentage
-------+-----------------+----------+-------+------------
t61513 | empty | 10.63 | 10.66 | 0%
t61513 | bmw | 17.91 | 17.65 | 1%
t61513 | fishycat | 19.57 | 17.68 | 10%
t61513 | barbershop | 54.10 | 24.41 | 55%
t61513 | classroom | 17.55 | 16.29 | 7%
t61513 | koro | 18.92 | 18.05 | 5%
t61513 | pavillion | 17.43 | 16.52 | 5%
t61513 | splash279 | 16.48 | 14.91 | 10%
t61513 | volume_emission | 36.22 | 21.60 | 40%
Impact in render times
job | scene_name | previous | new | percentage
-------+-----------------+----------+--------+------------
61513 | empty | 21.06 | 20.35 | 3%
61513 | bmw | 198.44 | 190.05 | 4%
61513 | fishycat | 394.20 | 401.25 | -2%
61513 | barbershop | 1188.16 | 912.39 | 23%
61513 | classroom | 341.08 | 340.38 | 0%
61513 | koro | 472.43 | 471.80 | 0%
61513 | pavillion | 905.77 | 899.80 | 1%
61513 | splash279 | 55.26 | 54.86 | 1%
61513 | volume_emission | 62.59 | 61.70 | 1%
There is also a possitive impact when using CPU and CUDA, but they are small.
I didn't split the hair logic from the surface logic due to:
* Hair and surface use same attribute types. It was not clear if it could be
splitted when looking at the code only.
* Hair and surface are quick to compile and to read. So the benefit is quite
small.
Differential Revision: https://developer.blender.org/D4375
This means the shader can now be used for procedural texturing. New
settings on the node are Samples, Inside, Local Only and Distance.
Original patch by Lukas with further changes by Brecht.
Differential Revision: https://developer.blender.org/D3479
I've limited it to just the RGB<->XYZ stuff for now, correct image handling is the next step.
Reviewers: brecht, sergey
Differential Revision: https://developer.blender.org/D3478
There is one legit place in the code where memcpy was used as an
optimization trick. Was needed for older version of GCC, but now
it should be re-evaluated and checked if it still helps to have
that trick.
In other places it's somewhat lazy programming to zero out all
object members. That is absolutely unsafe, at the moment when
less trivial class is used as a member in that object things
will break.
Other cases were using memcpy into an object which comes from
an external library. We don't control that object, and we can
not guarantee it will always be safe for such memory tricks
and debugging bugs caused by such low level access is far fun.
Ideally we need to use more proper C++, but needs to be done with
big care, including benchmarks of each change, For now do
annoying but simple cast to void*.
This patch adds support for IES files, a file format that is commonly used to store the directional intensity distribution of light sources.
The new IES node is supposed to be plugged into the Strength input of the Emission node of the lamp.
Since people generating IES files do not really seem to care about the standard, the parser is flexible enough to accept all test files I have tried.
Some common weirdnesses are distributing values over multiple lines that should go into one line, using commas instead of spaces as delimiters and adding various useless stuff at the end of the file.
The user interface of the node is similar to the script node, the user can either select an internal Text or load a file.
Internally, IES files are handled similar to Image textures: They are stored in slots by the LightManager and each unique IES is assigned to one slot.
The local coordinate system of the lamp is used, so that the direction of the light can be changed. For UI reasons, it's usually best to add an area light,
rotate it and then change its type, since especially the point light does not immediately show its local coordinate system in the viewport.
Reviewers: #cycles, dingto, sergey, brecht
Reviewed By: #cycles, dingto, brecht
Subscribers: OgDEV, crazyrobinhood, secundar, cardboard, pisuke, intrah, swerner, micah_denn, harvester, gottfried, disnel, campbellbarton, duarteframos, Lapineige, brecht, juicyfruit, dingto, marek, rickyblender, bliblubli, lockal, sergey
Differential Revision: https://developer.blender.org/D1543
This save a little memory and copying in the kernel by storing only a 4x3
matrix instead of a 4x4 matrix. We already did this in a few places, and
those don't need to be special exceptions anymore now.
It seems to be useful still in cases where the particle are distributed in
a particular order or pattern, to colorize them along with that. This isn't
really well defined, but might as well avoid breaking backwards compatibility
for now.
The algorithm averages normals from nearby surfaces. It uses the same
sampling strategy as BSSRDFs, casting rays along the normal and two
orthogonal axes, and combining the samples with MIS.
The main concern here is that we are introducing raytracing inside
shader evaluation, which could be quite bad for GPU performance and
stack memory usage. In practice it doesn't seem so bad though.
Note that using this feature can easily slow down renders 20%, and
that if you care about performance then it's better to use a bevel
modifier. Mainly this is useful for baking, and for cases where the
mesh topology makes it difficult for the bevel modifier to work well.
Differential Revision: https://developer.blender.org/D2803
* Use common TextureInfo struct for all devices, except CUDA fermi.
* Move image sampling code to kernels/*/kernel_*_image.h files.
* Use arrays for data textures on Fermi too, so device_vector<Struct> works.
Fishy cat benchmark was rendering with wrong shadows. Cause is unclear,
adding printf or rearranging code seems to avoid this issue, possibly a
compiler bug. This reverts the fix and solves the OSL bug elsewhere.
The idea is to make include statements more explicit and obvious where the
file is coming from, additionally reducing chance of wrong header being
picked up.
For example, it was not obvious whether bvh.h was refferring to builder
or traversal, whenter node.h is a generic graph node or a shader node
and cases like that.
Surely this might look obvious for the active developers, but after some
time of not touching the code it becomes less obvious where file is coming
from.
This was briefly mentioned in T50824 and seems @brecht is fine with such
explicitness, but need to agree with all active developers before committing
this.
Please note that this patch is lacking changes related on GPU/OpenCL
support. This will be solved if/when we all agree this is a good idea to move
forward.
Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner
Reviewed By: lukasstockner97, maiself, nirved, dingto
Subscribers: brecht
Differential Revision: https://developer.blender.org/D2586
We started to run out of bits there, so now we separate flags
which came from __object_flags and which are either runtime or
coming from __shader_flags.
Rule now is: SD_OBJECT_* flags are to be tested against new
object_flags field of ShaderData, all the rest flags are to
be tested against flags field of ShaderData.
There should be no user-visible changes, and time difference
should be minimal. In fact, from tests here can only see hardly
measurable difference and sometimes the new code is somewhat
faster (all within a noise floor, so hard to tell for sure).
Reviewers: brecht, dingto, juicyfruit, lukasstockner97, maiself
Differential Revision: https://developer.blender.org/D2428
Was a bit confusing to have transparent and translucent depth
exposed but no diffuse or glossy.
Reviewers: brecht
Subscribers: eyecandy
Differential Revision: https://developer.blender.org/D2399
When using the Normal output of the Texture Coordinate node on Point and Spot lamps, the coordinates now depend on the rotation of the lamp.
On Area lamps, the Parametric output of the Geometry node now returns UV coordinates on the area lamp.
Credit for the Area lamp part goes to Stefan Werner (from D1995).
Mostly this is making inlining match CUDA 7.5 in a few performance critical
places. The end result is that performance is now better than before, possibly
due to less register spilling or other CUDA 8.0 compiler improvements.
On benchmarks scenes, there are 3% to 35% render time reductions. Stack memory
usage is reduced a little too.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D2269
Object coordinates can now be used in the displacement shader and will give
correct results, where as before bump mapping was calculated from the displace
positions and resulted in incorrect shading.
This works by evaluating the shader in two parts, first bump then surface, and
setting the shader state to match what it would be if the surface was
undisplaced for the bump shader evaluation. Currently only `P` is set as if
undisplaced, but other shader variables could be set as well, such as `I` or
`time`. Since these aren't set to anything meaningful for displacement I left
them out of this patch, we can decide what to do with them separately.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2156
Adds a descriptor for attributes that can easily be passed around and extended
to contain more data. Will be used for attributes on subdivision meshes.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2110
This adds support for ngons and attributes on subdivision meshes. Ngons are
needed for proper attribute interpolation as well as correct Catmull-Clark
subdivision. Several changes are made to achieve this:
- new primitive `SubdFace` added to `Mesh`
- 3 more textures are used to store info on patches from subd meshes
- Blender export uses loop interface instead of tessface for subd meshes
- `Attribute` class is updated with a simplified way to pass primitive counts
around and to support ngons.
- extra points for ngons are generated for O(1) attribute interpolation
- curves are temporally disabled on subd meshes to avoid various bugs with
implementation
- old unneeded code is removed from `subd/`
- various fixes and improvements
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2108
BVH traversal is not really that much a geometry and we've got
quite some traversals now. Makes sense to keep them separate in
the name of source structure clarity.
Compile time per kernel increased alot after recent image commits, re-shuffle some code to fix this.
Patch by "LazyDodo".
Differential Revision: https://developer.blender.org/D2012
This commit changes the way how we pass bounce information to the Light
Path node. Instead of manualy copying the bounces into ShaderData, we now
directly pass PathState. This reduces the arguments that we need to pass
around and also makes it easier to extend the feature.
This commit also exposes the Transmission Bounce Depth to the Light Path
node. It works similar to the Transparent Depth Output: Replace a
Transmission lightpath after X bounces with another shader, e.g a Diffuse
one. This can be used to avoid black surfaces, due to low amount of max
bounces.
Reviewed by Sergey and Brecht, thanks for some hlp with this.
I tested compilation and usage on CPU (SVM and OSL), CUDA, OpenCL Split
and Mega kernel. Hopefully this covers all devices. :)
Originally we thought it's needed in order to distinguish builtin file from
filename which starts with '@', but the filepath is actually full path there
and it's unlikely to have file system where '@' is a proper root character.
Surprisingly this does not give visible speed differences, but it's still
nice to get rid of redundant check.