Check for number of workers before soft failing the task. #104195

Merged
Sybren A. Stüvel merged 15 commits from Nitin-Rawat-1/flamenco:104190-job-stuck into main 2023-04-20 11:53:43 +02:00

15 Commits

Author SHA1 Message Date
Nitin Rawat
349b7704be Merge remote-tracking branch 'upstream/main' into 104190-job-stuck 2023-04-17 21:05:00 +05:30
Nitin Rawat
0b2c4349ec Merge remote-tracking branch 'upstream/main' into 104190-job-stuck 2023-04-14 20:38:06 +05:30
Nitin Rawat
c6974e97d2 Merge remote-tracking branch 'upstream/main' into 104190-job-stuck 2023-04-14 14:41:42 +05:30
Nitin Rawat
94848c7e05 Merge remote-tracking branch 'upstream/main' into 104190-job-stuck 2023-04-10 11:21:59 +05:30
Nitin Rawat
1da9c33f22 WorkersLeftToRun should return the UUID of the test worker which is actually failing the task in the test. 2023-04-10 11:21:14 +05:30
Nitin Rawat
e0f1400f4d Merge remote-tracking branch 'upstream/main' into 104190-job-stuck 2023-04-04 19:32:20 +05:30
Nitin Rawat
5c101c47fb Merge remote-tracking branch 'upstream/main' into 104190-job-stuck 2023-04-04 17:59:38 +05:30
Nitin Rawat
ff0a36d19d Add test to check the job failure condition when number of workers available for the job is less than failure threshold. 2023-04-04 12:54:34 +05:30
Nitin Rawat
03533c1e49 Tests for TaskUpdate need to be updated. 2023-04-04 11:49:40 +05:30
Nitin Rawat
ac88d57ede We should also hard fail the task when numFailed == threshold 2023-04-04 10:01:44 +05:30
Nitin Rawat
ad96e3bb25 Merge remote-tracking branch 'upstream/main' into 104190-job-stuck 2023-04-04 07:32:36 +05:30
Nitin Rawat
5ceafb1a9f Merge remote-tracking branch 'upstream/main' into 104190-job-stuck 2023-03-22 08:51:06 +05:30
Nitin Rawat
65c6be1fe2 Merge remote-tracking branch 'upstream/main' into 104190-job-stuck 2023-03-21 18:29:10 +05:30
Nitin Rawat
6e24e0be3b revise the conditions for job failure 2023-03-17 15:13:22 +05:30
Nitin Rawat
9fdf5aa7c5 Manager: fixed issue #104190 job getting stuck with fewer workers than soft-failed threshold
Before soft-failing, check the number of workers to decide whether the job should be failed or not.
2023-03-09 23:45:16 +05:30
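
The commit messages above describe the decision in words: hard-fail once the number of failures reaches the threshold, and also hard-fail when no workers are left to retry the task. A minimal sketch of that decision follows. It is not the code from this pull request. The names `TaskOutcome`, `SoftFail`, `HardFail`, and `decideTaskFailure` are hypothetical, and `workersLeft` is assumed to count only the workers that have not yet failed this task.

```go
package main

import "fmt"

// TaskOutcome is a hypothetical result type used only in this sketch.
type TaskOutcome string

const (
	SoftFail TaskOutcome = "soft-failed" // task may be retried by another worker
	HardFail TaskOutcome = "failed"      // task is failed for good
)

// decideTaskFailure sketches the decision described in the commit messages.
//
//	numFailed:   workers that have failed this task so far, including the current one
//	threshold:   soft-fail threshold configured on the Manager
//	workersLeft: workers that could still pick up this task (assumption: workers
//	             that already failed it are not counted)
func decideTaskFailure(numFailed, threshold, workersLeft int) TaskOutcome {
	// Hard-fail when the threshold is reached, not only when it is exceeded.
	if numFailed >= threshold {
		return HardFail
	}
	// With nobody left to retry, a soft-failed task would wait forever,
	// which is the "job stuck" situation from issue #104190.
	if workersLeft == 0 {
		return HardFail
	}
	return SoftFail
}

func main() {
	// Two workers, threshold of three, both workers have failed the task.
	fmt.Println(decideTaskFailure(2, 3, 0)) // prints "failed"
	// One failure so far, two other workers can still try.
	fmt.Println(decideTaskFailure(1, 3, 2)) // prints "soft-failed"
}
```

The first call mirrors the reported scenario: with fewer workers than the threshold, the failure count can never reach the threshold, so the check on remaining workers is what lets the task fail instead of waiting indefinitely.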