This repository has been archived on 2023-10-09. You can view files and clone it. You cannot open issues or pull requests or push a commit.
blender-archive/intern/cycles/device/oneapi/queue.h
Michael Jones (Apple) 8dd7b5b26b Cycles: Metal integrator state size tuning
This patch tunes the integrator state sizing for Metal (`num_concurrent_states` and `num_concurrent_busy_states`).

On all GPU architectures, we adjust the busy:total states ratio to 1:4, which gives better rendering performance than the previous 1:16 ratio (independent of total state count). This gives a small performance uplift (e.g. 2-3% on M1 Ultra).
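
As a rough illustration of the ratio change, the busy-state count can be derived from the total state count by an integer division. This is a minimal sketch, not the Cycles implementation: `busy_states_from_total` and `kBusyToTotalDivisor` are hypothetical names introduced here, and the only value taken from the commit message is the 1:4 ratio.

```cpp
// Hypothetical helper, not part of Cycles: derive the number of busy states
// from the total number of concurrent states using a fixed busy:total ratio.
// The divisor 4 reflects the 1:4 ratio named in the commit message; the
// previous behaviour would correspond to a divisor of 16.
constexpr int kBusyToTotalDivisor = 4;

constexpr int busy_states_from_total(const int num_total_states)
{
  // Integer division: any remainder is dropped, so the busy count never
  // exceeds a quarter of the total.
  return num_total_states / kBusyToTotalDivisor;
}
```

With a total of 1048576 states, this sketch yields 262144 busy states, where a 1:16 ratio would yield 65536.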

Additionally, for M2 architectures, we double the overall state size if there is available headroom. Combined with the first change, we can expect an uplift of close to 10% in future, as this results in larger dispatch sizes and minimises work submission overheads. To determine the available headroom accurately, we defer the calculation of `num_concurrent_states` and `num_concurrent_busy_states` until the time of integrator state allocation (i.e. after all of the scene data has been allocated). We also refactor `alloc_integrator_soa` to calculate an *exact* single-state-size in a first pass, right before allocating the integrator SoA buffers in a second pass.
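
The two-pass idea can be sketched as follows: a first pass sums the per-state size of every SoA member, and the result is then used to decide whether a doubled state count still fits in the remaining memory. This is only a sketch under stated assumptions. `StateMember`, `single_state_size`, and `choose_num_states` are hypothetical names, and the headroom test (doubled allocation must fit entirely in the free bytes) is an assumed criterion; the commit message does not say how headroom is actually measured.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of one SoA member: the bytes it needs per state.
struct StateMember {
  const char *name;
  std::size_t element_size;
};

// First pass: sum the per-state bytes over all members to get an exact
// single-state size, before any buffer is allocated.
std::size_t single_state_size(const std::vector<StateMember> &members)
{
  std::size_t total = 0;
  for (const StateMember &member : members) {
    total += member.element_size;
  }
  return total;
}

// Decide the state count once the exact state size is known. Assumption:
// the count is doubled only if the doubled allocation fits in the headroom.
int choose_num_states(const int base_states,
                      const std::size_t state_size,
                      const std::size_t headroom_bytes)
{
  const std::size_t doubled_bytes = static_cast<std::size_t>(base_states) * 2 * state_size;
  return (doubled_bytes <= headroom_bytes) ? base_states * 2 : base_states;
}
```

The second pass, allocating one buffer per member sized `element_size * num_states`, would then follow with the chosen count. Deferring the decision until scene data is allocated matters because only then is the headroom figure meaningful.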

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16313
2022-10-24 17:14:33 +01:00


/* SPDX-License-Identifier: Apache-2.0
 * Copyright 2021-2022 Intel Corporation */

#pragma once

#ifdef WITH_ONEAPI

#  include "device/kernel.h"
#  include "device/memory.h"
#  include "device/queue.h"

#  include "device/oneapi/device.h"
#  include "kernel/device/oneapi/kernel.h"

CCL_NAMESPACE_BEGIN

class OneapiDevice;
class device_memory;

/* Base class for OneAPI queues. */
class OneapiDeviceQueue : public DeviceQueue {
 public:
  explicit OneapiDeviceQueue(OneapiDevice *device);
  ~OneapiDeviceQueue();

  virtual int num_concurrent_states(const size_t state_size) const override;

  virtual int num_concurrent_busy_states(const size_t state_size) const override;

  virtual void init_execution() override;

  virtual bool enqueue(DeviceKernel kernel,
                       const int kernel_work_size,
                       DeviceKernelArguments const &args) override;

  virtual bool synchronize() override;

  virtual void zero_to_device(device_memory &mem) override;
  virtual void copy_to_device(device_memory &mem) override;
  virtual void copy_from_device(device_memory &mem) override;

 protected:
  OneapiDevice *oneapi_device_;
  KernelContext *kernel_context_;
};

CCL_NAMESPACE_END

#endif /* WITH_ONEAPI */