Support pausing jobs #104313

Manually merged
Sybren A. Stüvel merged 28 commits from David-Zhang-10/flamenco:paused-job-status into main 2024-07-01 17:53:44 +02:00
Showing only changes of commit 4b20f8fad8 - Show all commits

View File

@ -167,7 +167,7 @@ func (sm *StateMachine) jobStatusIfAThenB(
} }
// isJobPausingComplete returns true when the job status is pause-requested and there are no more active tasks. // isJobPausingComplete returns true when the job status is pause-requested and there are no more active tasks.
David-Zhang-10 marked this conversation as resolved Outdated

I think the function name can be improved. The name shouldJobBePaused describes what should be done by the caller. I always try to pick a function name that matches what the function does. So for this one, I feel that isJobPausingComplete would be better.

Also a documentation comment would be good. In Go, these always start with // {function name} .... It should describe that this returns true when the job status is pause-requested and there are no more active tasks.

I think the function name can be improved. The name `shouldJobBePaused` describes what should be done by the caller. I always try to pick a function name that matches what the function does. So for this one, I feel that `isJobPausingComplete` would be better. Also a documentation comment would be good. In Go, these always start with `// {function name} ...`. It should describe that this returns true when the job status is `pause-requested` and there are no more active tasks.
func (sm *StateMachine) isJobPausingComplete(ctx context.Context, logger zerolog.Logger, job *persistence.Job) (bool, error) { func (sm *StateMachine) isJobPausingComplete(ctx context.Context, job *persistence.Job) (bool, error) {
David-Zhang-10 marked this conversation as resolved Outdated

Right now, the structure of the function is such that the entire function body is contained in a conditional. This is not a good idea, as it unnecessarily increases the complexity. Flipping the condition of the first if will un-indent the remaining code:

	if job.Status != api.JobStatusPauseRequested {
		return false, nil
	}

  // rest of the code
Right now, the structure of the function is such that the entire function body is contained in a conditional. This is not a good idea, as it unnecessarily increases the complexity. Flipping the condition of the first `if` will un-indent the remaining code: ```go if job.Status != api.JobStatusPauseRequested { return false, nil } // rest of the code ```
if job.Status != api.JobStatusPauseRequested { if job.Status != api.JobStatusPauseRequested {
return false, nil return false, nil
} }
@ -175,12 +175,7 @@ func (sm *StateMachine) isJobPausingComplete(ctx context.Context, logger zerolog
if err != nil { if err != nil {
return false, err return false, err
} }
David-Zhang-10 marked this conversation as resolved Outdated

I think it would be be better to leave the logging to the caller of this function. Also what is logged here is not entirely true -- the job is still in state pause-requested, not in state paused, so logging that it is paused is incorrect.

By removing the logging, you can reduce the final return to just return numActive == 0, nil.

I think it would be be better to leave the logging to the caller of this function. Also what is logged here is not entirely true -- the job is still in state `pause-requested`, not in state `paused`, so logging that it **is** paused is incorrect. By removing the logging, you can reduce the final `return` to just `return numActive == 0, nil`.
if numActive == 0 { return numActive == 0, nil
// There is no active task, and the job is in pause-requested status, so we can pause the job.
logger.Info().Msg("No more active tasks, job is paused")
return true, nil
}
return false, nil
} }
// updateJobOnTaskStatusCanceled conditionally escalates the cancellation of a task to cancel the job. // updateJobOnTaskStatusCanceled conditionally escalates the cancellation of a task to cancel the job.
@ -198,7 +193,7 @@ func (sm *StateMachine) updateJobOnTaskStatusCanceled(ctx context.Context, logge
} }
// Deal with the special case when the job is in pause-requested status. // Deal with the special case when the job is in pause-requested status.
toBePaused, err := sm.isJobPausingComplete(ctx, logger, job) toBePaused, err := sm.isJobPausingComplete(ctx, job)
if err != nil { if err != nil {
return err return err
} }
@ -232,7 +227,7 @@ func (sm *StateMachine) updateJobOnTaskStatusFailed(ctx context.Context, logger
failLogger.Info().Msg("task failed, but not enough to fail the job") failLogger.Info().Msg("task failed, but not enough to fail the job")
// Deal with the special case when the job is in pause-requested status. // Deal with the special case when the job is in pause-requested status.
toBePaused, err := sm.isJobPausingComplete(ctx, logger, job) toBePaused, err := sm.isJobPausingComplete(ctx, job)
if err != nil { if err != nil {
return err return err
} }
@ -256,7 +251,7 @@ func (sm *StateMachine) updateJobOnTaskStatusCompleted(ctx context.Context, logg
} }
// Deal with the special case when the job is in pause-requested status. // Deal with the special case when the job is in pause-requested status.
toBePaused, err := sm.isJobPausingComplete(ctx, logger, job) toBePaused, err := sm.isJobPausingComplete(ctx, job)
if err != nil { if err != nil {
return err return err
} }
@ -511,9 +506,9 @@ func (sm *StateMachine) pauseTasks(
} }
// If pausing was requested, it has now happened, so the job can transition. // If pausing was requested, it has now happened, so the job can transition.
toBePaused, err := sm.isJobPausingComplete(ctx, logger, job) toBePaused, err := sm.isJobPausingComplete(ctx, job)
if err != nil { if err != nil {
return "", fmt.Errorf("error when accessing number of active tasks") return "", err
} }
if toBePaused { if toBePaused {
logger.Info().Msg("all tasks of job paused, job can go to 'paused' status") logger.Info().Msg("all tasks of job paused, job can go to 'paused' status")
David-Zhang-10 marked this conversation as resolved Outdated

This hides the underlying error, making things hard to debug. When wrapping an error, always use the %w format specifier to include the underlying error. This will make sure that functions like errors.Is() and errors.As() can do their work. You can find more about this at https://pkg.go.dev/errors

Also, this function shouldn't know what's going on inside of shouldJobBePaused(), so if you want to have "when accessing number of active tasks" in the error message, that should be done inside shouldJobBePaused(), not here.

This hides the underlying error, making things hard to debug. When wrapping an error, always use the `%w` format specifier to include the underlying error. This will make sure that functions like `errors.Is()` and `errors.As()` can do their work. You can find more about this at https://pkg.go.dev/errors Also, this function shouldn't know what's going on inside of `shouldJobBePaused()`, so if you want to have "when accessing number of active tasks" in the error message, that should be done inside `shouldJobBePaused()`, not here.