Support pausing jobs #104313
@ -167,7 +167,7 @@ func (sm *StateMachine) jobStatusIfAThenB(
|
||||
}
|
||||
|
||||
// isJobPausingComplete returns true when the job status is pause-requested and there are no more active tasks.
|
||||
David-Zhang-10 marked this conversation as resolved
Outdated
|
||||
func (sm *StateMachine) isJobPausingComplete(ctx context.Context, logger zerolog.Logger, job *persistence.Job) (bool, error) {
|
||||
func (sm *StateMachine) isJobPausingComplete(ctx context.Context, job *persistence.Job) (bool, error) {
|
||||
David-Zhang-10 marked this conversation as resolved
Outdated
Sybren A. Stüvel
commented
Right now, the structure of the function is such that the entire function body is contained in a conditional. This is not a good idea, as it unnecessarily increases the complexity. Flipping the condition of the first
Right now, the structure of the function is such that the entire function body is contained in a conditional. This is not a good idea, as it unnecessarily increases the complexity. Flipping the condition of the first `if` will un-indent the remaining code:
```go
if job.Status != api.JobStatusPauseRequested {
return false, nil
}
// rest of the code
```
|
||||
if job.Status != api.JobStatusPauseRequested {
|
||||
return false, nil
|
||||
}
|
||||
@ -175,12 +175,7 @@ func (sm *StateMachine) isJobPausingComplete(ctx context.Context, logger zerolog
|
||||
if err != nil {
|
||||
return false, err
|
||||
}
|
||||
David-Zhang-10 marked this conversation as resolved
Outdated
Sybren A. Stüvel
commented
I think it would be be better to leave the logging to the caller of this function. Also what is logged here is not entirely true -- the job is still in state By removing the logging, you can reduce the final I think it would be be better to leave the logging to the caller of this function. Also what is logged here is not entirely true -- the job is still in state `pause-requested`, not in state `paused`, so logging that it **is** paused is incorrect.
By removing the logging, you can reduce the final `return` to just `return numActive == 0, nil`.
|
||||
if numActive == 0 {
|
||||
// There is no active task, and the job is in pause-requested status, so we can pause the job.
|
||||
logger.Info().Msg("No more active tasks, job is paused")
|
||||
return true, nil
|
||||
}
|
||||
return false, nil
|
||||
return numActive == 0, nil
|
||||
}
|
||||
|
||||
// updateJobOnTaskStatusCanceled conditionally escalates the cancellation of a task to cancel the job.
|
||||
@ -198,7 +193,7 @@ func (sm *StateMachine) updateJobOnTaskStatusCanceled(ctx context.Context, logge
|
||||
}
|
||||
|
||||
// Deal with the special case when the job is in pause-requested status.
|
||||
toBePaused, err := sm.isJobPausingComplete(ctx, logger, job)
|
||||
toBePaused, err := sm.isJobPausingComplete(ctx, job)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
@ -232,7 +227,7 @@ func (sm *StateMachine) updateJobOnTaskStatusFailed(ctx context.Context, logger
|
||||
failLogger.Info().Msg("task failed, but not enough to fail the job")
|
||||
|
||||
// Deal with the special case when the job is in pause-requested status.
|
||||
toBePaused, err := sm.isJobPausingComplete(ctx, logger, job)
|
||||
toBePaused, err := sm.isJobPausingComplete(ctx, job)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
@ -256,7 +251,7 @@ func (sm *StateMachine) updateJobOnTaskStatusCompleted(ctx context.Context, logg
|
||||
}
|
||||
|
||||
// Deal with the special case when the job is in pause-requested status.
|
||||
toBePaused, err := sm.isJobPausingComplete(ctx, logger, job)
|
||||
toBePaused, err := sm.isJobPausingComplete(ctx, job)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
@ -511,9 +506,9 @@ func (sm *StateMachine) pauseTasks(
|
||||
}
|
||||
|
||||
// If pausing was requested, it has now happened, so the job can transition.
|
||||
toBePaused, err := sm.isJobPausingComplete(ctx, logger, job)
|
||||
toBePaused, err := sm.isJobPausingComplete(ctx, job)
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("error when accessing number of active tasks")
|
||||
return "", err
|
||||
}
|
||||
if toBePaused {
|
||||
logger.Info().Msg("all tasks of job paused, job can go to 'paused' status")
|
||||
David-Zhang-10 marked this conversation as resolved
Outdated
Sybren A. Stüvel
commented
This hides the underlying error, making things hard to debug. When wrapping an error, always use the Also, this function shouldn't know what's going on inside of This hides the underlying error, making things hard to debug. When wrapping an error, always use the `%w` format specifier to include the underlying error. This will make sure that functions like `errors.Is()` and `errors.As()` can do their work. You can find more about this at https://pkg.go.dev/errors
Also, this function shouldn't know what's going on inside of `shouldJobBePaused()`, so if you want to have "when accessing number of active tasks" in the error message, that should be done inside `shouldJobBePaused()`, not here.
|
||||
|
I think the function name can be improved. The name
shouldJobBePaused
describes what should be done by the caller. I always try to pick a function name that matches what the function does. So for this one, I feel thatisJobPausingComplete
would be better.Also a documentation comment would be good. In Go, these always start with
// {function name} ...
. It should describe that this returns true when the job status ispause-requested
and there are no more active tasks.