This patch adds a new Cycles device with similar functionality to the
existing GPU devices. Kernel compilation and runtime interaction happen
via oneAPI DPC++ compiler and SYCL API.
This implementation is primarly focusing on Intel® Arc™ GPUs and other
future Intel GPUs. The first supported drivers are 101.1660 on Windows
and 22.10.22597 on Linux.
The necessary tools for compilation are:
- A SYCL compiler such as oneAPI DPC++ compiler or
https://github.com/intel/llvm
- Intel® oneAPI Level Zero which is used for low level device queries:
https://github.com/oneapi-src/level-zero
- To optionally generate prebuilt graphics binaries: Intel® Graphics
Compiler All are included in Linux precompiled libraries on svn:
https://svn.blender.org/svnroot/bf-blender/trunk/lib The same goes for
Windows precompiled binaries but for the graphics compiler, available
as "Intel® Graphics Offline Compiler for OpenCL™ Code" from
https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html,
for which path can be set as OCLOC_INSTALL_DIR.
Being based on the open SYCL standard, this implementation could also be
extended to run on other compatible non-Intel hardware in the future.
Reviewed By: sergey, brecht
Differential Revision: https://developer.blender.org/D15254
Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com>
Co-authored-by: Stefan Werner <stefan.werner@intel.com>
GLEW does not support GLX and EGL at the same time, and the distribution version
is likely to have GLX.
This also refactors the code so all OpenGL related CMake options are together.
Differential Revision: https://developer.blender.org/D12034
Add WITH_GHOST_WAYLAND_DBUS option, so Blender can be built without
DBUS support. Currently it's only used to access the cursor theme.
Without this the "default" cursors are used instead.
Disabling this since it adds an additional dependency for a minor gain
in functionality, with the benefit of removing a library requirement.
There is also a problem where Blender hangs on startup for ~5 seconds
when DBUS isn't running. Eventually it would be good to be able to avoid
this problem without a build option.
This implements client-side window decorations for moving and resizing
windows and HiDPI support.
This functionality depends on the external project 'libdecor' that is
currently a build option: WITH_GHOST_WAYLAND_LIBDECOR.
Reviewed by: brecht, campbellbarton
Ref D7989
Building against the existing 3.1 libraries should continue to work, until
the precompiled libraries are committed for all platforms.
* Enable WebP by default.
* Update Windows for new library file names.
* Automatically clear outdated CMake cache variables when upgrading to new
libraries.
* Fix static library linking order issues on Linux for OpenEXR and OpenVDB.
Implemented by Ray Molenkamp, Sybren Stüvel and Brecht Van Lommel.
Ref T95206
Currently only supports single image frames (no animation possible).
If quality slider is set to 100 then lossless compression will be used,
otherwise lossy compression is used.
Gives about 35% reduction of filesize save when re-saving splash screens with lossless
compression.
Also saves much faster, up to 15x faster than PNG with a better compression ratio as a plus.
Note, this is currently left disabled until we have WebP libs (see T95206)
For testing precompiled libs can be downloaded from Google:
https://storage.googleapis.com/downloads.webmproject.org/releases/webp/index.html
Differential Revision: https://developer.blender.org/D1598
Atomic operations performed by the C++ standard library might require
libatomic on platforms which do not have hardware support for those
operations.
This change makes it that such configurations are automatically detected
and -latomic is added when needed.
Differential Revision: https://developer.blender.org/D14106
I noticed that there were a few variables that should not be visible per default.
It seems to me to simply be an oversight, so I went ahead and cleaned them up.
Reviewed By: Sybren, Ray molenkamp
Differential Revision: http://developer.blender.org/D14132
Use a shorter/simpler license convention, stops the header taking so
much space.
Follow the SPDX license specification: https://spdx.org/licenses
- C/C++/objc/objc++
- Python
- Shell Scripts
- CMake, GNUmakefile
While most of the source tree has been included
- `./extern/` was left out.
- `./intern/cycles` & `./intern/atomic` are also excluded because they
use different header conventions.
doc/license/SPDX-license-identifiers.txt has been added to list SPDX all
used identifiers.
See P2788 for the script that automated these edits.
Reviewed By: brecht, mont29, sergey
Ref D14069
This reverts commit 086f191169.
There was apparently a problem using APPEND which wasn't referenced
in the commit log.
Added comment noting the reason for the discrepancy.
This was already done for APPLE & WIN32, which would
reference these libraries twice.
Now append BROTLI_LIBRARIES to FREETYPE_LIBRARIES when they're
required for linking.
No functional changes as all references to FREETYPE_LIBRARIES also
used BROTLI_LIBRARIES.
When LIBDIR existed, searching for system libraries would always
first search 'LIBDIR'.
This meant "WITH_SYSTEM_*" would still prefer LIBDIR versions of
libraries if they exist.
The presence of LIBDIR also ignored the setting for WITH_STATIC_LIBS
which is now restored to the cached value once pre-compiled libraries
have been handled.
The Brotli library only needs to be explicitly linked when using the
statically linked libraries. When using system libs they're shared, and
the .so loading mechanism takes care of dependencies.
Add `libbrotlidec-static.a` and `libbrotlicommon-static.a` to the CMake
`$FREETYPE_LIBRARIES` variable; they'll be required when the Linux libs
for the FreeType upgrade lands (D13448).
The order of libraries is different compared to the similar lines in the
Windows and Apple CMake files, to prevent linker errors on Linux.
Since the option to enable linkers are booleans,
it's possible to enable them all at once.
Now only the first enabled + available linker is used
(with priority given to link is with better performance).
Set the linker using CMAKE_*_LINKER_FLAGS instead of {C/CXX}FLAGS.
There is no advantage in using the CFLAGS to set the linker, it has the
downside of triggering a full rebuild when changing the linker.
Tested building Blender and the bpy.so Python module.
Ref D13833
Reviewed by: sergey, brecht
Can give considerably faster linking, especially on system with many
cores.
The mold linker recently reached 1.0, see:
https://github.com/rui314/mold
The current stable release of GCC can't use this linker via
-fuse-ld=mold, so this patch uses the "-B" argument to add a binary
directory containing an alternate "ld" command that points to
"mold" (which is part of the default mold installation).
Some timing tests for linking full builds for AMD TR 3970X:
- BFD: 20.78 seconds.
- LLD: 12.16 seconds.
- GOLD: 7.21 seconds.
- MOLD: 2.53 seconds.
Ref D13807
Reviewed by: sergey, brecht
And change install_deps.sh to build shared (instead of static) FFMPEG
libraries, for consistency with other library dependencies and to simplify
the logic. This may require users of install_deps.sh to rebuild FFMPEG.
This is the last step that lets us get rid of LIBPATH variables and
link_directories() entirely, as recommended by the CMake docs.
Some fixes were needed in the find FFMPEG module to make it actually work,
this code was unused up to now.
Followup to D8855.
Differential Revision: https://developer.blender.org/D9177
And change install_deps.sh to build shared (instead of static) FFMPEG
libraries, for consistency with other library dependencies and to simplify
the logic. This may require users of install_deps.sh to rebuild FFMPEG.
This is the last step that lets us get rid of LIBPATH variables and
link_directories() entirely, as recommended by the CMake docs.
Some fixes were needed in the find FFMPEG module to make it actually work,
this code was unused up to now.
Followup to D8855.
Differential Revision: https://developer.blender.org/D9177
This patch implements the vector types (i.e:`float2`) by making heavy
usage of templating. All vector functions are now outside of the vector
classes (inside the `blender::math` namespace) and are not vector size
dependent for the most part.
In the ongoing effort to make shaders less GL centric, we are aiming
to share more code between GLSL and C++ to avoid code duplication.
####Motivations:
- We are aiming to share UBO and SSBO structures between GLSL and C++.
This means we will use many of the existing vector types and others
we currently don't have (uintX, intX). All these variations were
asking for many more code duplication.
- Deduplicate existing code which is duplicated for each vector size.
- We also want to share small functions. Which means that vector
functions should be static and not in the class namespace.
- Reduce friction to use these types in new projects due to their
incompleteness.
- The current state of the `BLI_(float|double|mpq)(2|3|4).hh` is a
bit of a let down. Most clases are incomplete, out of sync with each
others with different codestyles, and some functions that should be
static are not (i.e: `float3::reflect()`).
####Upsides:
- Still support `.x, .y, .z, .w` for readability.
- Compact, readable and easilly extendable.
- All of the vector functions are available for all the vectors types
and can be restricted to certain types. Also template specialization
let us define exception for special class (like mpq).
- With optimization ON, the compiler unroll the loops and performance
is the same.
####Downsides:
- Might impact debugability. Though I would arge that the bugs are
rarelly caused by the vector class itself (since the operations are
quite trivial) but by the type conversions.
- Might impact compile time. I did not saw a significant impact since
the usage is not really widespread.
- Functions needs to be rewritten to support arbitrary vector length.
For instance, one can't call `len_squared_v3v3` in
`math::length_squared()` and call it a day.
- Type cast does not work with the template version of the `math::`
vector functions. Meaning you need to manually cast `float *` and
`(float *)[3]` to `float3` for the function calls.
i.e: `math::distance_squared(float3(nearest.co), positions[i]);`
- Some parts might loose in readability:
`float3::dot(v1.normalized(), v2.normalized())`
becoming
`math::dot(math::normalize(v1), math::normalize(v2))`
But I propose, when appropriate, to use
`using namespace blender::math;` on function local or file scope to
increase readability.
`dot(normalize(v1), normalize(v2))`
####Consideration:
- Include back `.length()` method. It is quite handy and is more C++
oriented.
- I considered the GLM library as a candidate for replacement. It felt
like too much for what we need and would be difficult to extend / modify
to our needs.
- I used Macros to reduce code in operators declaration and potential
copy paste bugs. This could reduce debugability and could be reverted.
- This touches `delaunay_2d.cc` and the intersection code. I would like
to know @howardt opinion on the matter.
- The `noexcept` on the copy constructor of `mpq(2|3)` is being removed.
But according to @JacquesLucke it is not a real problem for now.
I would like to give a huge thanks to @JacquesLucke who helped during this
and pushed me to reduce the duplication further.
Reviewed By: brecht, sergey, JacquesLucke
Differential Revision: https://developer.blender.org/D13791
This patch implements the vector types (i.e:float2) by making heavy
usage of templating. All vector functions are now outside of the vector
classes (inside the blender::math namespace) and are not vector size
dependent for the most part.
In the ongoing effort to make shaders less GL centric, we are aiming
to share more code between GLSL and C++ to avoid code duplication.
Motivations:
- We are aiming to share UBO and SSBO structures between GLSL and C++.
This means we will use many of the existing vector types and others we
currently don't have (uintX, intX). All these variations were asking
for many more code duplication.
- Deduplicate existing code which is duplicated for each vector size.
- We also want to share small functions. Which means that vector functions
should be static and not in the class namespace.
- Reduce friction to use these types in new projects due to their
incompleteness.
- The current state of the BLI_(float|double|mpq)(2|3|4).hh is a bit of a
let down. Most clases are incomplete, out of sync with each others with
different codestyles, and some functions that should be static are not
(i.e: float3::reflect()).
Upsides:
- Still support .x, .y, .z, .w for readability.
- Compact, readable and easilly extendable.
- All of the vector functions are available for all the vectors types and
can be restricted to certain types. Also template specialization let us
define exception for special class (like mpq).
- With optimization ON, the compiler unroll the loops and performance is
the same.
Downsides:
- Might impact debugability. Though I would arge that the bugs are rarelly
caused by the vector class itself (since the operations are quite trivial)
but by the type conversions.
- Might impact compile time. I did not saw a significant impact since the
usage is not really widespread.
- Functions needs to be rewritten to support arbitrary vector length. For
instance, one can't call len_squared_v3v3 in math::length_squared() and
call it a day.
- Type cast does not work with the template version of the math:: vector
functions. Meaning you need to manually cast float * and (float *)[3] to
float3 for the function calls.
i.e: math::distance_squared(float3(nearest.co), positions[i]);
- Some parts might loose in readability:
float3::dot(v1.normalized(), v2.normalized())
becoming
math::dot(math::normalize(v1), math::normalize(v2))
But I propose, when appropriate, to use
using namespace blender::math; on function local or file scope to
increase readability. dot(normalize(v1), normalize(v2))
Consideration:
- Include back .length() method. It is quite handy and is more C++
oriented.
- I considered the GLM library as a candidate for replacement.
It felt like too much for what we need and would be difficult to
extend / modify to our needs.
- I used Macros to reduce code in operators declaration and potential
copy paste bugs. This could reduce debugability and could be reverted.
- This touches delaunay_2d.cc and the intersection code. I would like to
know @Howard Trickey (howardt) opinion on the matter.
- The noexcept on the copy constructor of mpq(2|3) is being removed.
But according to @Jacques Lucke (JacquesLucke) it is not a real problem
for now.
I would like to give a huge thanks to @Jacques Lucke (JacquesLucke) who
helped during this and pushed me to reduce the duplication further.
Reviewed By: brecht, sergey, JacquesLucke
Differential Revision: http://developer.blender.org/D13791
* Don't link embree / OSL when WITH_CYCLES is disabled
* Simplify lite config by disabling Cycles as a whole using this
* Remove code handling the removed WITH_CYCLES_NETWORK option
Compressing blendfiles can help save a lot of disk space, but the slowdown
while loading and saving is a major annoyance.
Currently Blender uses Zlib (aka gzip aka Deflate) for compression, but there
are now several more modern algorithms that outperform it in every way.
In this patch, I decided for Zstandard aka Zstd for several reasons:
- It is widely supported, both in other programs and libraries as well as in
general-purpose compression utilities on Unix
- It is extremely flexible - spanning several orders of magnitude of
compression speeds depending on the level setting.
- It is pretty much on the Pareto frontier for all of its configurations
(meaning that no other algorithm is both faster and more efficient).
One downside of course is that older versions of Blender will not be able to
read these files, but one can always just re-save them without compression or
decompress the file manually with an external tool.
The implementation here saves additional metadata into the compressed file in
order to allow for efficient seeking when loading. This is standard-compliant
and will be ignored by other tools that support Zstd.
If the metadata is not present (e.g. because you manually compressed a .blend
file with another tool), Blender will fall back to sequential reading.
Saving is multithreaded to improve performance. Loading is currently not
multithreaded since it's not easy to predict the access patterns of the
loading code when seeking is supported.
In the future, we might want to look into making this more predictable or
disabling seeking for the main .blend file, which would then allow for
multiple background threads that decompress data ahead of time.
The compression level was chosen to get sizes comparable to previous versions
at much higher speeds. In the future, this could be exposed as an option.
Reviewed By: campbellbarton, brecht, mont29
Differential Revision: https://developer.blender.org/D5799
This is causing issues for some users launching Blender, because EGL indirectly
requires GLVND, which is not installed by default on e.g. Ubuntu.
This reverts commit 0b18a618b8.
Fixes T90374
Ref D12034
This will replace GLX with EGL for X11. GLEW does not support GLX and EGL
at the same time. Most distributions build GLEW with GLX support, so we
have to use the externally provided GLEW and build with EGL support.
This effectively sets WITH_SYSTEM_GLEW to OFF for all Linux configurations.
Differential Revision: https://developer.blender.org/D12034
This bumps OSL to 1.11.10.0. OSL Has a new build time
dependency: Clang, and more importantly it expects
clang and llvm to share a library folder, which it
previously for us did not.
This patch changes:
-OSL Update to 1.11.10.0
-refactor the llvm/clang/clang-tools-extra builds into the llvm
build using the llvm-project tarball for building that has all
of the subprojects in it.
-update ispc/openmp builds since clang no longer its own dependency
and they have to depend on the llvm build now.
-Update the windows builder to use the 64 bit host tools since it
ran out of ram linking clang
-Since OSL now needs clang to link successfully a findclang.cmake
has been provided for linux/OSX
Differential Revision: https://developer.blender.org/D10212
Reviewed By: brecht, sebbas, sybren
* WITH_CPU_SSE was renamed to WITH_CPU_SIMD, and now covers both SSE and Neon.
* For macOS sse2neon.h is included as part of the precompiled libraries.
* For Linux it is enabled if the sse2neon.h header file is detected. However
this library does not have official releases and is not shipped with any Linux
distribution, so manual installation and configuration is required to get this
working.
Ref D8237, T78710
Ref T84819
Build System
============
This is an API breaking new version, and the updated code only builds with
OpenColorIO 2.0 and later. Adding backwards compatibility was too complicated.
* Tinyxml was replaced with Expat, adding a new dependency.
* Yaml-cpp is now built as a dependency on Unix, as was already done on Windows.
* Removed currently unused LCMS code.
* Pystring remains built as part of OCIO itself, since it has no good build system.
* Linux and macOS check for the OpenColorIO verison, and disable it if too old.
Ref D10270
Processors and Transforms
=========================
CPU processors now need to be created to do CPU processing. These are cached
internally, but the cache lookup is not fast enough to execute per pixel or
texture sample, so for performance these are now also exposed in the C API.
The C API for transforms will no longer be needed afer all changes, so remove
it to simplify the API and fallback implementation.
Ref D10271
Display Transforms
==================
Needs a bit more manual work constructing the transform. LegacyViewingPipeline
could also have been used, but isn't really any simpler and since it's legacy
we better not rely on it.
We moved more logic into the opencolorio module, to simplify the API. There is
no need to wrap a dozen functions just to be able to do this in C rather than C++.
It's also tightly coupled to the GPU shader logic, and so should be in the same
module.
Ref D10271
GPU Display Shader
==================
To avoid baking exposure and gamma into the GLSL shader and requiring slow
recompiles when tweaking, we manually apply them in the shader. This leads
to some logic duplicaton between the CPU and GPU display processor, but it
seems unavoidable.
Caching was also changed. Previously this was done both on the imbuf and
opencolorio module levels. Now it's all done in the opencolorio module by
simply matching color space names. We no longer use cacheIDs from OpenColorIO
since computing them is expensive, and they are unlikely to match now that
more is baked into the shader code.
Shaders can now use multiple 2D textures, 3D textures and uniforms, rather
than a single 3D texture. So allocating and binding those adds some code.
Color space conversions for blending with overlays is now hardcoded in the
shader. This was using harcoded numbers anyway, if this every becomes a
general OpenColorIO transform it can be changed, but for now there is no
point to add code complexity.
Ref D10273
CIE XYZ
=======
We need standard CIE XYZ values for rendering effects like blackbody emission.
The relation to the scene linear role is based on OpenColorIO configuration.
In OpenColorIO 2.0 configs roles can no longer have the same name as color
spaces, which means our XYZ role and colorspace in the configuration give an
error.
Instead use the new standard aces_interchange role, which relates scene linear
to a known scene referred color space. Compatibility with the old XYZ role is
preserved, if the configuration file has no conflicting names.
Also includes a non-functional change to the configuraton file to use an
XYZ-to-ACES matrix instead of REC709-to-ACES, makes debugging a little easier
since the matrix is the same one we have in the code now and that is also
found easily in the ACES specs.
Ref D10274
Since GPencil changes depending on libharu may be committed to master
before SVN libraries for all platforms are in place, avoid build issues.
Extension of D9928.
Reviewed By: sybren
Differential Revision: https://developer.blender.org/D10280
When building with more aggressive optimization flags, GCC will add FMA
(Fused Multiply Add) instructions that will slightly alter the floating
point operation results.
This causes some automated tests to fail in blender.
In clang and the intel compiler ffp-contract is set to off per default
it seems from my research. (They do not have the exact same setting,
but the default seems to match the off behavior)
Reviewed By: Brecht
Differential Revision: http://developer.blender.org/D9047
This adds an option (WITH_COMPILER_CCACHE) to build using Ccache if it's
found. Makefiles-based, Ninja-based and Xcode generators are supported.
Pass `-DWITH_COMPILER_CCACHE=ON` to cmake to enable Ccache.
Utility option in GNUmakefile is also added: for e.g.,
`make ninja ccache`.
Reviewed By: brecht, ankitm
Differential Revision: https://developer.blender.org/D9665