To make the task pool more usable for running multiple stages of tasks, fix local queue handling in BLI_task_pool_work_and_wait. The local queue must be empty once the wait loop has finished; otherwise the "wait" part of the function's contract is not fulfilled. To guarantee this, check for and run any tasks still in the local queue before entering the wait loop. Also add a new function that resets the suspended state of the pool, so the same pool can be reused for the next stage.
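
The intended ordering can be illustrated with a minimal single-threaded sketch. This is a toy model, not the actual Blender implementation: every name here (ToyPool, toy_pool_push, toy_pool_work_and_wait, toy_pool_reset_suspended) is hypothetical, there are no worker threads, and the sketch assumes that work-and-wait clears the suspended flag, which is what makes a reset function useful between stages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_LOCAL_QUEUE_MAX 8

typedef void (*ToyTaskFn)(void *userdata);

typedef struct ToyTask {
  ToyTaskFn run;
  void *userdata;
} ToyTask;

/* Hypothetical toy pool: one local queue plus a suspended flag. */
typedef struct ToyPool {
  ToyTask local_queue[TOY_LOCAL_QUEUE_MAX];
  size_t local_len;
  size_t num_pending; /* Tasks queued but not yet run. */
  bool is_suspended;
} ToyPool;

void toy_pool_init(ToyPool *pool, bool start_suspended)
{
  pool->local_len = 0;
  pool->num_pending = 0;
  pool->is_suspended = start_suspended;
}

/* While suspended, tasks are queued locally. Otherwise the task runs
 * right away, standing in for "handed to a worker thread". */
bool toy_pool_push(ToyPool *pool, ToyTaskFn run, void *userdata)
{
  if (!pool->is_suspended) {
    run(userdata);
    return true;
  }
  if (pool->local_len == TOY_LOCAL_QUEUE_MAX) {
    return false; /* Queue full, task rejected. */
  }
  pool->local_queue[pool->local_len].run = run;
  pool->local_queue[pool->local_len].userdata = userdata;
  pool->local_len++;
  pool->num_pending++;
  return true;
}

void toy_pool_work_and_wait(ToyPool *pool)
{
  /* Assumption of this sketch: waiting un-suspends the pool. */
  pool->is_suspended = false;

  /* Run everything in the local queue BEFORE waiting. */
  for (size_t i = 0; i < pool->local_len; i++) {
    ToyTask task = pool->local_queue[i];
    task.run(task.userdata);
    pool->num_pending--;
  }
  pool->local_len = 0;

  /* A real pool would block here until worker threads finish. The toy
   * has no workers, so nothing can be pending at this point. */
  assert(pool->num_pending == 0);
  assert(pool->local_len == 0);
}

/* Put the pool back into the suspended state for the next stage. */
void toy_pool_reset_suspended(ToyPool *pool)
{
  pool->is_suspended = true;
}

void toy_increment(void *userdata)
{
  (*(int *)userdata)++;
}

/* Two stages on one pool; returns the number of tasks that ran. */
int toy_run_two_stages(void)
{
  int counter = 0;
  ToyPool pool;
  toy_pool_init(&pool, true);

  toy_pool_push(&pool, toy_increment, &counter);
  toy_pool_push(&pool, toy_increment, &counter);
  toy_pool_work_and_wait(&pool); /* Stage 1. */

  toy_pool_reset_suspended(&pool);
  toy_pool_push(&pool, toy_increment, &counter);
  toy_pool_work_and_wait(&pool); /* Stage 2. */

  return counter;
}
```

In this sketch, toy_run_two_stages shows the multi-stage use case: each stage queues tasks on a suspended pool, work-and-wait drains the local queue before reaching the point where it would wait, and the reset call re-suspends the pool so the next stage can queue again.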