Make sure we don't perform any implicit address space conversion.
A bit annoying, but less intrusive approaches (like using temp private
variable in .cl kernel) do not work correct here.
Using generic address space will help from code side here, but will
be somewhat slower due to extra things happening as far as i know.
As far as I can see, the second issue there was that the functions receive a pointer to a member variable of the
ShaderData, which is stored in global memory. However, this means that the pointer points to global memory as well,
therefore OpenCL requires the ccl_addr_space "keyword" in front of the pointer.
With this commit, the OpenCL kernels build on Linux with the Intel CPU OpenCL runtime - however, they already did
without the change and I don't have an AMD card, so I can't really test whether the AMD runtime is happy as well now.
Those (one per ID type!) were uselessly duplicated, and badly inconsistent
(some types were actually unlinking before deletion, others were only working if already unlinked!).
Now we use same func and same API for all types, by default deletion is performed only if ID is no more used,
set `do_unlink` parameter to True to always delete ID even if still in use.
Only exception now is with Scene, since we always want to keep at least one!
Note that this will change default behavior of some types (since unlinking is never done anymore by default).
This commit adds a new distribution to the Glossy, Anisotropic and Glass BSDFs that implements the
multiple-scattering microfacet model described in the paper "Multiple-Scattering Microfacet BSDFs with the Smith Model".
Essentially, the improvement is that unlike classical GGX, which only models single scattering and assumes
the contribution of multiple bounces to be zero, this new model performs a random walk on the microsurface until
the ray leaves it again, which ensures perfect energy conservation.
In practise, this means that the "darkening problem" - GGX materials becoming darker with increasing
roughness - is solved in a physically correct and efficient way.
The downside of this model is that it has no (known) analytic expression for evalation. However, it can be
evaluated stochastically, and although the correct PDF isn't known either, the properties of MIS and the
balance heuristic guarantee an unbiased result at the cost of slightly higher noise.
Reviewers: dingto, #cycles, brecht
Reviewed By: dingto, #cycles, brecht
Subscribers: bliblubli, ace_dragon, gregzaal, brecht, harvester, dingto, marcog, swerner, jtheninja, Blendify, nutel
Differential Revision: https://developer.blender.org/D2002
The function that assigns names to socket types missed an entry, therefore all entries after it were mapped to the wrong name.
Long-term, it might be a better solution to use a map to avoid issues like these, but for now this fix works.
In the OSL node compilation code for the Environment Texture, is_linear was used as a socket.
However, there was no socket for it, which caused Blender to crash.
Adding a socket doesn't really make sense since it's an internal value and not a parameter
of the node, so it now just uses the variable directly.
The problem here was that there are five path types internally (diffuse, glossy, transmission, subsurface and volume scatter), but subsurface isn't exposed to the user.
This caused some weird behaviour - if all four types are disabled on the lamp, Cycles doesn't even try sampling it, but if any type was active, the lamp would illuminate
the cube since none of the options set subsurface to zero.
In the future, it might be reasonable to add subsurface visibility as an option - but for now the weird and inconsistent behaviour can be fixed simply by setting both
diffuse and subsurface to zero if the user disables diffuse visibility.
It is not possible to use a set split by name as valid input to
check_node_input_traversed - it needs a complete set of all nodes visited so
far. On the other hand, the merge comparison loop should only check nodes that
were not just visited, but found unique. This means that there should really be
two separate data structures.
Without the fix, check_node_input_traversed actually never returns true, so
only nodes without any inputs are processed.
For non-branched path tracing with a GTX 960 and CUDA 7.5, this gives a small reduction
in stack usage but mainly: 8% faster render on BMW, 5% on pabellon, 13% on classroom.
This is an initial commit for half texture support in Cycles.
It adds the basic infrastructure inside of the ImageManager and support for these textures on CPU.
Supported:
* Half Float OpenEXR images (can be used for e.g HDRs or Normalmaps) now use 1/2 the memory, when loaded via disk (OIIO).
ToDo:
Various things like support for inbuilt half textures, GPU... will come later, step by step.
Part of my GSoC 2016.
Unsigned int is not supported by OSL as far as i concerned, so should not
really matter here. However, might be wrong and perhaps more proper idea
would be so set it as regular int?
You can capture and stream video in the BGE using the DeckLink video
cards from Black Magic Design. You need a card and Desktop Video software
version 10.4 or above to use these features in the BGE.
Many thanks to Nuno Estanquiero who tested the patch extensively
on a variety of Decklink products, it wouldn't have been possible without
his help.
You can find a brief summary of the decklink features here: https://wiki.blender.org/index.php/Dev:Source/GameEngine/Decklink
The full API details and samples are in the Python API documentation.
bge.texture.VideoDeckLink(format, capture=0):
Use this object to capture a video stream. the format argument describes
the video and pixel formats and the capture argument the card number.
This object can be used as a source for bge.texture.Texture so that the frame
is sent to the GPU, or by itself using the new refresh method to get the video
frame in a buffer.
The frames are usually not in RGB but in YUV format (8bit or 10bit); they
require a shader to extract the RGB components in the GPU. Details and sample
shaders in the documentation.
3D video capture is supported: the frames are double height with left and right
eyes in top-bottom order. The 'eye' uniform (see setUniformEyef) can be used to
sample the 3D frame when the BGE is also in stereo mode. This allows to composite
a 3D video stream with a 3D scene and render it in stereo.
In Windows, and if you have a nVidia Quadro GPU, you can benefit of an additional
performance boost by using 'GPUDirect': a method to send a video frame to the GPU
without going through the OGL driver. The 'pinned memory' OGL extension is also
supported (only on high-end AMD GPU) with the same effect.
bge.texture.DeckLink(cardIdx=0, format=""):
Use this object to send video frame to a DeckLink card. Only the immediate mode
is supported, the scheduled mode is not implemented.
This object is similar to bge.texture.Texture: you need to attach a image source
and call refresh() to compute and send the frame to the card.
This object is best suited for video keying: a video stream (not captured) flows
through the card and the frame you send to the card are displayed above it (the
card does the compositing automatically based on the alpha channel).
At the time of this commit, 3D video keying is supported in the BGE but not in the
DeckLink card due to a color space issue.
The assembler version in Windows used to return the previous value
of the variable while all the other versions return the new value.
This is now fixed for consistency.
Note: this bug had no effect on blender because no part of the code
use the return value of these functions, but the future BGE DeckLink
module makes use of it to implement reference counter.
bge.logic.setRender(flag) to enable/disable render.
The render pass is enabled by default but it can be disabled with
bge.logic.setRender(False).
Once disabled, the render pass is skipped and a new logic frame starts
immediately. Note that VSync no longer limits the fps when render is off
but the 'Use Frame Rate' option in the Render Properties still does.
To run as many frames as possible, untick the option
This function is useful when you don't need the default render, e.g.
when doing offscreen render to an alternate device than the monitor.
Note that without VSync, you must limit the frame rate by other means.
fbo = bge.render.offScreenCreate(width,height,[,samples=0][,target=bge.render.RAS_OFS_RENDER_BUFFER])
Use this method to create an offscreen buffer of given size, with given MSAA
samples and targetting either a render buffer (bge.render.RAS_OFS_RENDER_BUFFER)
or a texture (bge.render.RAS_OFS_RENDER_TEXTURE). Use the former if you want to
retrieve the frame buffer on the host and the latter if you want to pass the render
to another context (texture are proper OGL object, render buffers aren't)
The object created by this function can only be used as a parameter of the
bge.texture.ImageRender() constructor to send the the render to the FBO rather
than to the frame buffer. This is best suited when you want to create a render
of specific size, or if you need an image with an alpha channel.
bge.texture.<imagetype>.refresh(buffer=None, format="RGBA", ts=-1.0)
Without arg, the refresh method of the image objects is pretty much a no-op, it
simply invalidates the image so that on next texture refresh, the image will
be recalculated.
It is now possible to pass an optional buffer object to transfer the image (and
recalculate it if it was invalid) to an external object. The object must implement
the 'buffer protocol'. The image will be transfered as "RGBA" or "BGRA" pixels
depending on format argument (only those 2 formats are supported) and ts is an
optional timestamp in the image depends on it (e.g. VideoFFmpeg playing a video file).
With this function you don't need anymore to link the image object to a Texture
object to use: the image object is self-sufficient.
bge.texture.ImageRender(scene, camera, fbo=None)
Render to buffer is possible by passing a FBO object (see offScreenCreate).
bge.texture.ImageRender.render()
Allows asynchronous render: call this method to render the scene but without
extracting the pixels yet. The function returns as soon as the render commands
have been send to the GPU. The render will proceed asynchronously in the GPU
while the host can perform other tasks.
To complete the render, you can either call refresh() directly of refresh the texture
to which this object is the source. Asynchronous render is useful to achieve optimal
performance: call render() on frame N and refresh() on frame N+1 to give as much as
time as possible to the GPU to render the frame while the game engine can perform other tasks.
Support negative scale on camera.
Camera scale was previously ignored in the BGE.
It is now injected in the modelview matrix as a vertical or horizontal flip
of the scene (respectively if scaleY<0 and scaleX<0).
Note that the actual value of the scale is not used, only the sign.
This allows to flip the image produced by ImageRender() without any performance
degradation: the flip is integrated in the render itself.
Optimized image transfer from ImageRender to buffer.
Previously, images that were transferred to the host were always going through
buffers in VideoTexture. It is now possible to transfer ImageRender
images to external buffer without intermediate copy (i.e. directly from OGL to buffer)
if the attributes of the ImageRender objects are set as follow:
flip=False, alpha=True, scale=False, depth=False, zbuff=False.
(if you need to flip the image, use camera negative scale)
A new option '-a' can be passed to the blenderplayer. It forces the
framebuffer to have an alpha channel.
This can be used in VideoTexture to return a image with alpha channel
with ImageViewport (provided alpha is set to True on the ImageViewport
object and that the background color alpha channel is 0, which is the
default).
Without the -a option, the frame buffer has no alpha channel and
ImageViewport always returns an opaque image, no matter what.
In Linux, the player window will be rendered transparently over
the desktop.
In Windows, the player window is still rendered opaque because
transparency of the window is only possible using the 'compositing'
functions of Windows. The code is there but not enabled (look for
WIN32_COMPOSITING) because 1) it doesn't work so well 2) it requires
a DLL that is only available on Vista and up.
give precedence to AA over Swap copy:
Certain GPU (intel) will not allow MSAA together with swap copy.
Previously, swap copy had priority over MSAA: fewer AA samples would be
chosen if it was the condition to get swap copy. This patch reverse the
logic: swap copy will be abandonned if another swap method (undefined or
exchange) will provide the number of AA samples requested. If no AA
samples is requested, swap copy still has the priority of course.