| 
									
										
										
										
											2022-09-20 10:59:12 +02:00
										 |  |  | /* SPDX-License-Identifier: GPL-2.0-or-later */ | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | #pragma once
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | /** \file
 | 
					
						
							|  |  |  |  * \ingroup bli | 
					
						
							|  |  |  |  * | 
					
						
							|  |  |  |  * The goal of "lazy threading" is to avoid using threads unless one can reasonably assume that it | 
					
						
							|  |  |  |  * is worth distributing work over multiple threads. Using threads can lead to worse overall | 
					
						
							|  |  |  |  * performance by introducing inter-thread communication overhead. Keeping all work on a single | 
					
						
							|  |  |  |  * thread reduces this overhead to zero and also makes better use of the CPU cache. | 
					
						
							|  |  |  |  * | 
					
						
							|  |  |  |  * Functions like #parallel_for also solve this to some degree by using a "grain size". When the | 
					
						
							|  |  |  |  * number of individual tasks is too small, no multi-threading is used. This works very well when | 
					
						
							|  |  |  |  * there are many homogeneous tasks that can be expected to take approximately the same time. | 
					
						
							|  |  |  |  * | 
					
						
							|  |  |  |  * The situation becomes more difficult when: | 
					
						
							|  |  |  |  * - The individual tasks are not homogeneous, i.e. they take different amounts of time to compute. | 
					
						
							|  |  |  |  * - It is practically impossible to guess how long each task will take in advance. | 
					
						
							|  |  |  |  * | 
					
						
							|  |  |  |  * Given those constraints, a single grain size cannot be determined. One could just schedule all | 
					
						
							|  |  |  |  * tasks individually but that would create a lot of overhead when the tasks happen to be very | 
					
						
							|  |  |  |  * small. While TBB will keep all tasks on a single thread if the other threads are busy, if they | 
					
						
							| 
									
										
										
										
											2022-09-21 11:57:20 +10:00
										 |  |  |  * are idle they will start stealing the work even if that's not beneficial for overall | 
					
						
							| 
									
										
										
										
											2022-09-20 10:59:12 +02:00
										 |  |  |  * performance. | 
					
						
							|  |  |  |  * | 
					
						
							|  |  |  |  * This file provides a simple API that allows a task scheduler to properly handle tasks whose size | 
					
						
							|  |  |  |  * is not known in advance. The key idea is this: | 
					
						
							|  |  |  |  * | 
					
						
							|  |  |  |  * > By default, all work stays on a single thread. If an individual task notices that it is about | 
					
						
							|  |  |  |  * > start a computation that will take a while, it notifies the task scheduler further up on the | 
					
						
							|  |  |  |  * > stack. The scheduler then allows other threads to take over other tasks that were originally | 
					
						
							|  |  |  |  * > meant for the current thread. | 
					
						
							|  |  |  |  * | 
					
						
							|  |  |  |  * This way, when all tasks are small, no threading overhead has to be paid for. Whenever there is | 
					
						
							|  |  |  |  * a task that keeps the current thread busy for a while, the other tasks are moved to a separate | 
					
						
							|  |  |  |  * thread so that they can be executed without waiting for the long computation to finish. | 
					
						
							|  |  |  |  * | 
					
						
							|  |  |  |  * Consequently, the earlier a task knows during it execution that it will take a while, the | 
					
						
							|  |  |  |  * better. That's because if it is blocking anyway, it's more efficient to move the other tasks to | 
					
						
							|  |  |  |  * another thread earlier. | 
					
						
							|  |  |  |  * | 
					
						
							|  |  |  |  * To make this work, three things have to be solved: | 
					
						
							|  |  |  |  * 1. The task scheduler has to be able to start single-threaded and become multi-threaded after | 
					
						
							|  |  |  |  *    tasks have started executing. This has to be solved in the specific task scheduler. | 
					
						
							|  |  |  |  * 2. There has to be a way for the currently running task to tell the task scheduler that it is | 
					
						
							|  |  |  |  *    about to perform a computation that will take a while and that it would be reasonable to move | 
					
						
							|  |  |  |  *    other tasks to other threads. This part is implemented in the API provided by this file. | 
					
						
							|  |  |  |  * 3. Individual tasks have to decide when a computation is long enough to justify talking to the | 
					
						
							|  |  |  |  *    scheduler. This is always based on heuristics that have to be fine tuned over time. One could | 
					
						
							|  |  |  |  *    assume that this means adding new work-size checks to many parts in Blender, but that's | 
					
						
							|  |  |  |  *    actually not necessary, because these checks exist already in the form of grain sizes passed | 
					
						
							|  |  |  |  *    to e.g. #parallel_for. The assumption here is that when the task thinks the current work load | 
					
						
							|  |  |  |  *    is big enough to justify using threads, it's also big enough to justify using another thread | 
					
						
							|  |  |  |  *    for waiting tasks on the current thread. | 
					
						
							|  |  |  |  */ | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | #include "BLI_function_ref.hh"
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | namespace blender::lazy_threading { | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | /**
 | 
					
						
							|  |  |  |  * Tell task schedulers on the current thread that it is about to start a long computation | 
					
						
							|  |  |  |  * and that other waiting tasks should better be moved to another thread if possible. | 
					
						
							|  |  |  |  */ | 
					
						
							|  |  |  | void send_hint(); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | /**
 | 
					
						
							|  |  |  |  * Used by the task scheduler to receive hints from current tasks that they will take a while. | 
					
						
							|  |  |  |  * This should only be allocated on the stack. | 
					
						
							|  |  |  |  */ | 
					
						
							|  |  |  | class HintReceiver { | 
					
						
							|  |  |  |  public: | 
					
						
							|  |  |  |   /**
 | 
					
						
							|  |  |  |    * The passed in function is called when a task signals that it will take a while. | 
					
						
							|  |  |  |    * \note The function has to stay alive after the call to the constructor. So one must not pass a | 
					
						
							|  |  |  |    * lambda directly into this constructor but store it in a separate variable on the stack first. | 
					
						
							|  |  |  |    */ | 
					
						
							|  |  |  |   HintReceiver(FunctionRef<void()> fn); | 
					
						
							|  |  |  |   ~HintReceiver(); | 
					
						
							|  |  |  | }; | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2022-11-06 15:04:47 +01:00
										 |  |  | /**
 | 
					
						
							|  |  |  |  * Used to make sure that lazy-threading hints don't propagate through task isolation. This is | 
					
						
							|  |  |  |  * necessary to avoid deadlocks when isolated regions are used together with e.g. task pools. For | 
					
						
							|  |  |  |  * more info see the comment on #BLI_task_isolate. | 
					
						
							|  |  |  |  */ | 
					
						
							|  |  |  | class ReceiverIsolation { | 
					
						
							|  |  |  |  public: | 
					
						
							|  |  |  |   ReceiverIsolation(); | 
					
						
							|  |  |  |   ~ReceiverIsolation(); | 
					
						
							|  |  |  | }; | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2022-09-20 10:59:12 +02:00
										 |  |  | }  // namespace blender::lazy_threading
 |