Worker Tags for clustering #104204

Closed
opened 2023-04-04 15:31:43 +02:00 by Sybren A. Stüvel · 5 comments

Flamenco 3.3 will have 'worker tags'. Workers can be assigned to any number of tags. A job can be submitted with a specific tag, or without a tag.

From the perspective of the jobs:

  • Tagless jobs can be handled by any worker.
  • Jobs assigned a tag can only be handled by workers with that tag.

From the perspective of the workers:

  • Tagless workers can only handle tagless jobs.
  • Workers with one or more tags handle jobs for those tags, as well as tagless jobs.
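
The four rules above reduce to a single eligibility check. What follows is a minimal sketch in Go, not Flamenco's actual scheduler code: the function name `canHandle` and the convention that an empty string means "tagless job" are assumptions made for illustration.

```go
package main

import "fmt"

// canHandle reports whether a worker with the given tags may run a job.
// Hypothetical helper: an empty jobTag stands for a tagless job, and a nil or
// empty workerTags slice stands for a tagless worker.
func canHandle(workerTags []string, jobTag string) bool {
	if jobTag == "" {
		// Tagless jobs can be handled by any worker, tagged or not.
		return true
	}
	// A tagged job needs a worker that carries that exact tag.
	for _, t := range workerTags {
		if t == jobTag {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canHandle(nil, ""))                         // tagless worker, tagless job
	fmt.Println(canHandle(nil, "gpu"))                      // tagless worker, tagged job
	fmt.Println(canHandle([]string{"gpu", "night"}, "gpu")) // tagged worker, matching job
	fmt.Println(canHandle([]string{"gpu"}, ""))             // tagged worker, tagless job
	fmt.Println(canHandle([]string{"cpu"}, "gpu"))          // tagged worker, other tag
}
```

Both perspectives collapse into the same predicate: the job's tag decides everything, and the worker's tags only matter when the job has one.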

To do

The basic implementation is there, but there is still work to be done to make this release-ready:

  • [x] API for CRUD operations of worker tags.
  • [x] API for assigning workers to tags.
  • [x] Web interface for CRUD operations of worker tags.
  • [x] Web interface for assigning workers to tags.
  • [x] SocketIO broadcasting of worker tag changes (i.e. the CRUD operations on the tags themselves).
  • [x] Web interface handling of those SocketIO messages.
  • [x] Task scheduler adjustments.
  • [x] Job cancelling when there are no more workers available should take tags into account.
  • [ ] Documentation.
  • [x] #104223: Update the code to change 'worker cluster' (which was an earlier implementation/naming) to 'worker tag' (which is what was decided upon a little bit later)
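
The to-do item about job cancelling can be illustrated with a small sketch. It assumes a simplified `Worker` record with a `Usable` flag (for example, online and not blocked for the job). The type, its fields, and the function names are hypothetical and do not reflect Flamenco's real data model.

```go
package main

import "fmt"

// Worker is a hypothetical, simplified worker record.
type Worker struct {
	Name   string
	Tags   []string
	Usable bool // assumed to be false when the worker cannot take tasks of this job
}

// hasTag reports whether tag occurs in tags.
func hasTag(tags []string, tag string) bool {
	for _, t := range tags {
		if t == tag {
			return true
		}
	}
	return false
}

// eligibleWorkers returns the names of usable workers that may run a job with
// the given tag. An empty jobTag stands for a tagless job.
func eligibleWorkers(workers []Worker, jobTag string) []string {
	var names []string
	for _, w := range workers {
		if !w.Usable {
			continue
		}
		if jobTag == "" || hasTag(w.Tags, jobTag) {
			names = append(names, w.Name)
		}
	}
	return names
}

// shouldCancel reports whether no usable worker is left for the job. Workers
// that lack the job's tag are not counted as available.
func shouldCancel(workers []Worker, jobTag string) bool {
	return len(eligibleWorkers(workers, jobTag)) == 0
}

func main() {
	workers := []Worker{
		{Name: "alpha", Tags: []string{"gpu"}, Usable: false},
		{Name: "beta", Tags: []string{"cpu"}, Usable: true},
		{Name: "gamma", Usable: true},
	}
	fmt.Println(shouldCancel(workers, "gpu")) // only alpha has the tag, and it is unusable
	fmt.Println(shouldCancel(workers, "cpu")) // beta is still available
	fmt.Println(shouldCancel(workers, ""))    // tagless job: beta and gamma are available
}
```

Without the tag check, the "gpu" job in this example would keep waiting on beta and gamma, even though neither is allowed to run it.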
Sybren A. Stüvel added the Type / To Do label 2023-04-04 15:31:59 +02:00
Sybren A. Stüvel added this to the v3.3 milestone 2023-04-04 15:33:52 +02:00
Collaborator

I'll start helping with the web interface for the worker clusters once I have the previous problem solved. Also, I can help with documentation as well 👍 Any standard for the documentation?

Perhaps better discussed directly, but I would like to suggest rebranding "clusters" to "tags". While clusters is a common term in HPC, I feel that "tags" is a more accessible term and it is semantically closer to what the feature does. Also, the concept of "tagging" is well known in web UIs.

To recap, from the user perspective:

  • create tags, which reflect hardware capabilities, availability, etc.
  • assign zero or more tags to each worker
  • assign zero or more tags to a job
Author
Owner

> I'll start helping with the web interface for the worker clusters once I have the previous problem solved.

Thanks!

> Also, I can help with documentation as well 👍 Any standard for the documentation?

The docs are in https://projects.blender.org/studio/flamenco/src/branch/main/web/project-website, which is a website made with Hugo (https://gohugo.io/) and geekdocs (https://geekdocs.de/).

> Perhaps better discussed directly, but I would like to suggest rebranding "clusters" to "tags".

Sure, I'd be fine with that too.
Eveline Anderson was assigned by Sybren A. Stüvel 2023-06-02 16:51:29 +02:00
Sybren A. Stüvel changed title from Worker Clusters to Worker Clusters / Tags 2023-06-23 16:21:23 +02:00
Sybren A. Stüvel changed title from Worker Clusters / Tags to Worker Tags for clustering 2023-06-23 16:22:11 +02:00

Great to see progress on this. If I recall correctly, Sybren mentioned something about a data migration for existing installs (for example at Blender Studio) that are using the daily build. Would that still be necessary?

Author
Owner

Not sure how that would work with GORM (https://gorm.io/). I'm sure it can be done, I just personally don't know how. Since the data model itself didn't change much, I think it's easier to just do a single-line SQL statement manually.
Reference: studio/flamenco#104204