This repository has been archived on 2023-10-09. You can view files and clone it, but cannot push or open issues or pull requests.
Files
blender-archive/source/blender/render/intern/raytrace/rayobject_vbvh.cpp

207 lines
5.2 KiB
C++
Raw Normal View History

/*
* ***** BEGIN GPL LICENSE BLOCK *****
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version 2
* of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software Foundation,
2010-02-12 13:34:04 +00:00
* Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
*
* The Original Code is Copyright (C) 2009 Blender Foundation.
* All rights reserved.
*
* The Original Code is: all of this file.
*
Raytrace modifications from the Render Branch. These should not have any effect on render results, except in some cases with you have overlapping faces, where the noise seems to be slightly reduced. There are some performance improvements, for simple scenes I wouldn't expect more than 5-10% to be cut off the render time, for sintel scenes we got about 50% on average, that's with millions of polygons on intel quad cores. This because memory access / cache misses were the main bottleneck for those scenes, and the optimizations improve that. Interal changes: * Remove RE_raytrace.h, raytracer is now only used by render engine again. * Split non-public parts rayobject.h into rayobject_internal.h, hopefully makes it clearer how the API is used. * Added rayintersection.h to contain some of the stuff from RE_raytrace.h * Change Isect.vec/labda to Isect.dir/dist, previously vec was sometimes normalized and sometimes not, confusing... now dir is always normalized and dist contains the distance. * Change VECCOPY and similar to BLI_math functions. * Force inlining of auxiliary functions for ray-triangle/quad intersection, helps a few percentages. * Reorganize svbvh code so all the traversal functions are in one file * Don't do test for root so that push_childs can be inlined * Make shadow a template parameter so it doesn't need to be runtime checked * Optimization in raytree building, was computing bounding boxes more often than necessary. * Leave out logf() factor in SAH, makes tree build quicker with no noticeable influence on raytracing on performance? * Set max childs to 4, simplifies traversal code a bit, but also seems to help slightly in general. * Store child pointers and child bb just as fixed arrays of size 4 in nodes, nearly all nodes have this many children, so overall it actually reduces memory usage a bit and avoids a pointer indirection.
2011-02-05 13:41:29 +00:00
* Contributor(s): André Pinto.
*
* ***** END GPL LICENSE BLOCK *****
*/
Raytrace modifications from the Render Branch. These should not have any effect on render results, except in some cases with you have overlapping faces, where the noise seems to be slightly reduced. There are some performance improvements, for simple scenes I wouldn't expect more than 5-10% to be cut off the render time, for sintel scenes we got about 50% on average, that's with millions of polygons on intel quad cores. This because memory access / cache misses were the main bottleneck for those scenes, and the optimizations improve that. Interal changes: * Remove RE_raytrace.h, raytracer is now only used by render engine again. * Split non-public parts rayobject.h into rayobject_internal.h, hopefully makes it clearer how the API is used. * Added rayintersection.h to contain some of the stuff from RE_raytrace.h * Change Isect.vec/labda to Isect.dir/dist, previously vec was sometimes normalized and sometimes not, confusing... now dir is always normalized and dist contains the distance. * Change VECCOPY and similar to BLI_math functions. * Force inlining of auxiliary functions for ray-triangle/quad intersection, helps a few percentages. * Reorganize svbvh code so all the traversal functions are in one file * Don't do test for root so that push_childs can be inlined * Make shadow a template parameter so it doesn't need to be runtime checked * Optimization in raytree building, was computing bounding boxes more often than necessary. * Leave out logf() factor in SAH, makes tree build quicker with no noticeable influence on raytracing on performance? * Set max childs to 4, simplifies traversal code a bit, but also seems to help slightly in general. * Store child pointers and child bb just as fixed arrays of size 4 in nodes, nearly all nodes have this many children, so overall it actually reduces memory usage a bit and avoids a pointer indirection.
2011-02-05 13:41:29 +00:00
2011-02-27 19:31:27 +00:00
/** \file blender/render/intern/raytrace/rayobject_vbvh.cpp
* \ingroup render
*/
int tot_pushup = 0;
int tot_pushdown = 0;
int tot_hints = 0;
#include <assert.h>
Raytrace modifications from the Render Branch. These should not have any effect on render results, except in some cases with you have overlapping faces, where the noise seems to be slightly reduced. There are some performance improvements, for simple scenes I wouldn't expect more than 5-10% to be cut off the render time, for sintel scenes we got about 50% on average, that's with millions of polygons on intel quad cores. This because memory access / cache misses were the main bottleneck for those scenes, and the optimizations improve that. Interal changes: * Remove RE_raytrace.h, raytracer is now only used by render engine again. * Split non-public parts rayobject.h into rayobject_internal.h, hopefully makes it clearer how the API is used. * Added rayintersection.h to contain some of the stuff from RE_raytrace.h * Change Isect.vec/labda to Isect.dir/dist, previously vec was sometimes normalized and sometimes not, confusing... now dir is always normalized and dist contains the distance. * Change VECCOPY and similar to BLI_math functions. * Force inlining of auxiliary functions for ray-triangle/quad intersection, helps a few percentages. * Reorganize svbvh code so all the traversal functions are in one file * Don't do test for root so that push_childs can be inlined * Make shadow a template parameter so it doesn't need to be runtime checked * Optimization in raytree building, was computing bounding boxes more often than necessary. * Leave out logf() factor in SAH, makes tree build quicker with no noticeable influence on raytracing on performance? * Set max childs to 4, simplifies traversal code a bit, but also seems to help slightly in general. * Store child pointers and child bb just as fixed arrays of size 4 in nodes, nearly all nodes have this many children, so overall it actually reduces memory usage a bit and avoids a pointer indirection.
2011-02-05 13:41:29 +00:00
#include "MEM_guardedalloc.h"
Raytrace modifications from the Render Branch. These should not have any effect on render results, except in some cases with you have overlapping faces, where the noise seems to be slightly reduced. There are some performance improvements, for simple scenes I wouldn't expect more than 5-10% to be cut off the render time, for sintel scenes we got about 50% on average, that's with millions of polygons on intel quad cores. This because memory access / cache misses were the main bottleneck for those scenes, and the optimizations improve that. Interal changes: * Remove RE_raytrace.h, raytracer is now only used by render engine again. * Split non-public parts rayobject.h into rayobject_internal.h, hopefully makes it clearer how the API is used. * Added rayintersection.h to contain some of the stuff from RE_raytrace.h * Change Isect.vec/labda to Isect.dir/dist, previously vec was sometimes normalized and sometimes not, confusing... now dir is always normalized and dist contains the distance. * Change VECCOPY and similar to BLI_math functions. * Force inlining of auxiliary functions for ray-triangle/quad intersection, helps a few percentages. * Reorganize svbvh code so all the traversal functions are in one file * Don't do test for root so that push_childs can be inlined * Make shadow a template parameter so it doesn't need to be runtime checked * Optimization in raytree building, was computing bounding boxes more often than necessary. * Leave out logf() factor in SAH, makes tree build quicker with no noticeable influence on raytracing on performance? * Set max childs to 4, simplifies traversal code a bit, but also seems to help slightly in general. * Store child pointers and child bb just as fixed arrays of size 4 in nodes, nearly all nodes have this many children, so overall it actually reduces memory usage a bit and avoids a pointer indirection.
2011-02-05 13:41:29 +00:00
#include "BLI_math.h"
Raytrace modifications from the Render Branch. These should not have any effect on render results, except in some cases with you have overlapping faces, where the noise seems to be slightly reduced. There are some performance improvements, for simple scenes I wouldn't expect more than 5-10% to be cut off the render time, for sintel scenes we got about 50% on average, that's with millions of polygons on intel quad cores. This because memory access / cache misses were the main bottleneck for those scenes, and the optimizations improve that. Interal changes: * Remove RE_raytrace.h, raytracer is now only used by render engine again. * Split non-public parts rayobject.h into rayobject_internal.h, hopefully makes it clearer how the API is used. * Added rayintersection.h to contain some of the stuff from RE_raytrace.h * Change Isect.vec/labda to Isect.dir/dist, previously vec was sometimes normalized and sometimes not, confusing... now dir is always normalized and dist contains the distance. * Change VECCOPY and similar to BLI_math functions. * Force inlining of auxiliary functions for ray-triangle/quad intersection, helps a few percentages. * Reorganize svbvh code so all the traversal functions are in one file * Don't do test for root so that push_childs can be inlined * Make shadow a template parameter so it doesn't need to be runtime checked * Optimization in raytree building, was computing bounding boxes more often than necessary. * Leave out logf() factor in SAH, makes tree build quicker with no noticeable influence on raytracing on performance? * Set max childs to 4, simplifies traversal code a bit, but also seems to help slightly in general. * Store child pointers and child bb just as fixed arrays of size 4 in nodes, nearly all nodes have this many children, so overall it actually reduces memory usage a bit and avoids a pointer indirection.
2011-02-05 13:41:29 +00:00
#include "BLI_memarena.h"
#include "BLI_utildefines.h"
#include "BKE_global.h"
Raytrace modifications from the Render Branch. These should not have any effect on render results, except in some cases with you have overlapping faces, where the noise seems to be slightly reduced. There are some performance improvements, for simple scenes I wouldn't expect more than 5-10% to be cut off the render time, for sintel scenes we got about 50% on average, that's with millions of polygons on intel quad cores. This because memory access / cache misses were the main bottleneck for those scenes, and the optimizations improve that. Interal changes: * Remove RE_raytrace.h, raytracer is now only used by render engine again. * Split non-public parts rayobject.h into rayobject_internal.h, hopefully makes it clearer how the API is used. * Added rayintersection.h to contain some of the stuff from RE_raytrace.h * Change Isect.vec/labda to Isect.dir/dist, previously vec was sometimes normalized and sometimes not, confusing... now dir is always normalized and dist contains the distance. * Change VECCOPY and similar to BLI_math functions. * Force inlining of auxiliary functions for ray-triangle/quad intersection, helps a few percentages. * Reorganize svbvh code so all the traversal functions are in one file * Don't do test for root so that push_childs can be inlined * Make shadow a template parameter so it doesn't need to be runtime checked * Optimization in raytree building, was computing bounding boxes more often than necessary. * Leave out logf() factor in SAH, makes tree build quicker with no noticeable influence on raytracing on performance? * Set max childs to 4, simplifies traversal code a bit, but also seems to help slightly in general. * Store child pointers and child bb just as fixed arrays of size 4 in nodes, nearly all nodes have this many children, so overall it actually reduces memory usage a bit and avoids a pointer indirection.
2011-02-05 13:41:29 +00:00
#include "rayintersection.h"
#include "rayobject.h"
#include "rayobject_rtbuild.h"
#include "reorganize.h"
#include "bvh.h"
#include "vbvh.h"
Raytrace modifications from the Render Branch. These should not have any effect on render results, except in some cases with you have overlapping faces, where the noise seems to be slightly reduced. There are some performance improvements, for simple scenes I wouldn't expect more than 5-10% to be cut off the render time, for sintel scenes we got about 50% on average, that's with millions of polygons on intel quad cores. This because memory access / cache misses were the main bottleneck for those scenes, and the optimizations improve that. Interal changes: * Remove RE_raytrace.h, raytracer is now only used by render engine again. * Split non-public parts rayobject.h into rayobject_internal.h, hopefully makes it clearer how the API is used. * Added rayintersection.h to contain some of the stuff from RE_raytrace.h * Change Isect.vec/labda to Isect.dir/dist, previously vec was sometimes normalized and sometimes not, confusing... now dir is always normalized and dist contains the distance. * Change VECCOPY and similar to BLI_math functions. * Force inlining of auxiliary functions for ray-triangle/quad intersection, helps a few percentages. * Reorganize svbvh code so all the traversal functions are in one file * Don't do test for root so that push_childs can be inlined * Make shadow a template parameter so it doesn't need to be runtime checked * Optimization in raytree building, was computing bounding boxes more often than necessary. * Leave out logf() factor in SAH, makes tree build quicker with no noticeable influence on raytracing on performance? * Set max childs to 4, simplifies traversal code a bit, but also seems to help slightly in general. * Store child pointers and child bb just as fixed arrays of size 4 in nodes, nearly all nodes have this many children, so overall it actually reduces memory usage a bit and avoids a pointer indirection.
2011-02-05 13:41:29 +00:00
#include <queue>
#include <algorithm>
2012-05-15 18:45:20 +00:00
#define DFS_STACK_SIZE 256
2012-05-15 18:45:20 +00:00
struct VBVHTree {
RayObject rayobj;
VBVHNode *root;
MemArena *node_arena;
float cost;
RTBuilder *builder;
};
/*
* Cost to test N childs
*/
2012-05-15 18:45:20 +00:00
struct PackCost {
float operator()(int n)
{
return n;
}
};
template<>
void bvh_done<VBVHTree>(VBVHTree *obj)
{
rtbuild_done(obj->builder, &obj->rayobj.control);
//TODO find a away to exactly calculate the needed memory
MemArena *arena1 = BLI_memarena_new(BLI_MEMARENA_STD_BUFSIZE, "vbvh arena");
2012-05-15 18:45:20 +00:00
BLI_memarena_use_malloc(arena1);
//Build and optimize the tree
if (1) {
2012-04-29 15:47:02 +00:00
VBVHNode *root = BuildBinaryVBVH<VBVHNode>(arena1, &obj->rayobj.control).transform(obj->builder);
if (RE_rayobjectcontrol_test_break(&obj->rayobj.control)) {
BLI_memarena_free(arena1);
return;
}
if (root) {
reorganize(root);
remove_useless(root, &root);
bvh_refit(root);
pushup(root);
pushdown(root);
obj->root = root;
}
else
obj->root = NULL;
}
else {
2012-05-15 18:45:20 +00:00
/* TODO */
#if 0
MemArena *arena2 = BLI_memarena_new(BLI_MEMARENA_STD_BUFSIZE, "vbvh arena2");
2012-05-15 18:45:20 +00:00
BLI_memarena_use_malloc(arena2);
//Finds the optimal packing of this tree using a given cost model
//TODO this uses quite a lot of memory, find ways to reduce memory usage during building
OVBVHNode *root = BuildBinaryVBVH<OVBVHNode>(arena2).transform(obj->builder);
2012-04-29 15:47:02 +00:00
VBVH_optimalPackSIMD<OVBVHNode, PackCost>(PackCost()).transform(root);
obj->root = Reorganize_VBVH<OVBVHNode>(arena1).transform(root);
BLI_memarena_free(arena2);
2012-05-15 18:45:20 +00:00
#endif
}
//Cleanup
2012-05-15 18:45:20 +00:00
rtbuild_free(obj->builder);
obj->builder = NULL;
obj->node_arena = arena1;
obj->cost = 1.0;
}
template<int StackSize>
static int intersect(VBVHTree *obj, Isect *isec)
{
//TODO renable hint support
if (RE_rayobject_isAligned(obj->root)) {
if (isec->mode == RE_RAY_SHADOW)
2012-05-15 18:45:20 +00:00
return bvh_node_stack_raycast<VBVHNode, StackSize, false, true>(obj->root, isec);
Raytrace modifications from the Render Branch. These should not have any effect on render results, except in some cases with you have overlapping faces, where the noise seems to be slightly reduced. There are some performance improvements, for simple scenes I wouldn't expect more than 5-10% to be cut off the render time, for sintel scenes we got about 50% on average, that's with millions of polygons on intel quad cores. This because memory access / cache misses were the main bottleneck for those scenes, and the optimizations improve that. Interal changes: * Remove RE_raytrace.h, raytracer is now only used by render engine again. * Split non-public parts rayobject.h into rayobject_internal.h, hopefully makes it clearer how the API is used. * Added rayintersection.h to contain some of the stuff from RE_raytrace.h * Change Isect.vec/labda to Isect.dir/dist, previously vec was sometimes normalized and sometimes not, confusing... now dir is always normalized and dist contains the distance. * Change VECCOPY and similar to BLI_math functions. * Force inlining of auxiliary functions for ray-triangle/quad intersection, helps a few percentages. * Reorganize svbvh code so all the traversal functions are in one file * Don't do test for root so that push_childs can be inlined * Make shadow a template parameter so it doesn't need to be runtime checked * Optimization in raytree building, was computing bounding boxes more often than necessary. * Leave out logf() factor in SAH, makes tree build quicker with no noticeable influence on raytracing on performance? * Set max childs to 4, simplifies traversal code a bit, but also seems to help slightly in general. * Store child pointers and child bb just as fixed arrays of size 4 in nodes, nearly all nodes have this many children, so overall it actually reduces memory usage a bit and avoids a pointer indirection.
2011-02-05 13:41:29 +00:00
else
2012-05-15 18:45:20 +00:00
return bvh_node_stack_raycast<VBVHNode, StackSize, false, false>(obj->root, isec);
Raytrace modifications from the Render Branch. These should not have any effect on render results, except in some cases with you have overlapping faces, where the noise seems to be slightly reduced. There are some performance improvements, for simple scenes I wouldn't expect more than 5-10% to be cut off the render time, for sintel scenes we got about 50% on average, that's with millions of polygons on intel quad cores. This because memory access / cache misses were the main bottleneck for those scenes, and the optimizations improve that. Interal changes: * Remove RE_raytrace.h, raytracer is now only used by render engine again. * Split non-public parts rayobject.h into rayobject_internal.h, hopefully makes it clearer how the API is used. * Added rayintersection.h to contain some of the stuff from RE_raytrace.h * Change Isect.vec/labda to Isect.dir/dist, previously vec was sometimes normalized and sometimes not, confusing... now dir is always normalized and dist contains the distance. * Change VECCOPY and similar to BLI_math functions. * Force inlining of auxiliary functions for ray-triangle/quad intersection, helps a few percentages. * Reorganize svbvh code so all the traversal functions are in one file * Don't do test for root so that push_childs can be inlined * Make shadow a template parameter so it doesn't need to be runtime checked * Optimization in raytree building, was computing bounding boxes more often than necessary. * Leave out logf() factor in SAH, makes tree build quicker with no noticeable influence on raytracing on performance? * Set max childs to 4, simplifies traversal code a bit, but also seems to help slightly in general. * Store child pointers and child bb just as fixed arrays of size 4 in nodes, nearly all nodes have this many children, so overall it actually reduces memory usage a bit and avoids a pointer indirection.
2011-02-05 13:41:29 +00:00
}
else
2012-05-15 18:45:20 +00:00
return RE_rayobject_intersect( (RayObject *) obj->root, isec);
}
template<class Tree>
static void bvh_hint_bb(Tree *tree, LCTSHint *hint, float *UNUSED(min), float *UNUSED(max))
{
//TODO renable hint support
{
2011-09-25 12:31:21 +00:00
hint->size = 0;
2012-05-15 18:45:20 +00:00
hint->stack[hint->size++] = (RayObject *)tree->root;
}
}
2012-09-18 03:15:12 +00:00
#if 0 /* UNUSED */
static void bfree(VBVHTree *tree)
{
if (tot_pushup + tot_pushdown + tot_hints + tot_moves) {
if (G.debug & G_DEBUG) {
printf("tot pushups: %d\n", tot_pushup);
printf("tot pushdowns: %d\n", tot_pushdown);
printf("tot moves: %d\n", tot_moves);
printf("tot hints created: %d\n", tot_hints);
}
tot_pushup = 0;
tot_pushdown = 0;
tot_hints = 0;
tot_moves = 0;
}
bvh_free(tree);
}
2012-09-18 03:15:12 +00:00
#endif
/* the cast to pointer function is needed to workarround gcc bug: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11407 */
template<class Tree, int STACK_SIZE>
static RayObjectAPI make_api()
{
static RayObjectAPI api =
{
2012-05-15 18:45:20 +00:00
(RE_rayobject_raycast_callback) ((int (*)(Tree *, Isect *)) & intersect<STACK_SIZE>),
(RE_rayobject_add_callback) ((void (*)(Tree *, RayObject *)) & bvh_add<Tree>),
(RE_rayobject_done_callback) ((void (*)(Tree *)) & bvh_done<Tree>),
(RE_rayobject_free_callback) ((void (*)(Tree *)) & bvh_free<Tree>),
(RE_rayobject_merge_bb_callback)((void (*)(Tree *, float *, float *)) & bvh_bb<Tree>),
(RE_rayobject_cost_callback) ((float (*)(Tree *)) & bvh_cost<Tree>),
(RE_rayobject_hint_bb_callback) ((void (*)(Tree *, LCTSHint *, float *, float *)) & bvh_hint_bb<Tree>)
};
return api;
}
template<class Tree>
2012-05-15 18:45:20 +00:00
RayObjectAPI *bvh_get_api(int maxstacksize)
{
2012-04-29 15:47:02 +00:00
static RayObjectAPI bvh_api256 = make_api<Tree, 1024>();
if (maxstacksize <= 1024) return &bvh_api256;
assert(maxstacksize <= 256);
return 0;
}
RayObject *RE_rayobject_vbvh_create(int size)
{
2012-04-29 15:47:02 +00:00
return bvh_create_tree<VBVHTree, DFS_STACK_SIZE>(size);
}