Gitea: Split 'data' storage into separate pools #34

Open
opened 2023-02-14 15:47:07 +01:00 by Arnd Marijnissen · 2 comments

Given that the repo-archive disk filled up super-quick all of a sudden, it's clear that there's quite some potential to DDoS the site simply by uploading a lot, creating a large repo or, as in the previous case, doing a lot of download requests.

It'd be good to split up the current 'data' pool into separate pools, three at minimum and four with the optional split listed below, each with different semantics (and storage requirements). Some of this can be done by using separate Ceph mounts or by using quotas instead.

  • database: small, speed critical, uptime is paramount
  • Repositories-blender: the main repositories that the system's designed to handle, important, large
  • Repositories-other: the community projects that can be on a different speed and backup regime (optional, can be done later, too)
  • repo-archive: this storage is ephemeral; it can be lower-speed or should at least be quota'd.
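
To illustrate the quota idea, here is a minimal Python sketch that reports whether a pool's directory has grown past a limit. It assumes each pool is mounted as an ordinary directory; the `Pool` class, the mount paths and the byte limits are hypothetical and not taken from the actual deployment. On CephFS the enforcement itself would more likely come from a quota set on the directory, so this is only a monitoring-style check.

```python
import os
import stat
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    path: str         # hypothetical mount point of the pool
    quota_bytes: int  # hypothetical limit, not a real deployment value


def dir_usage(path: str) -> int:
    """Sum the sizes of all regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for fname in files:
            try:
                st = os.lstat(os.path.join(root, fname))
            except OSError:
                continue  # file disappeared while walking
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total


def over_quota(pool: Pool) -> bool:
    """Return True when the pool's directory uses more than its quota."""
    return dir_usage(pool.path) > pool.quota_bytes


if __name__ == "__main__":
    # Hypothetical layout mirroring the list above.
    pools = [
        Pool("repo-archive", "/data/gitea/repo-archive", 50 * 1024**3),
    ]
    for pool in pools:
        if os.path.isdir(pool.path):
            print(pool.name, "over quota:", over_quota(pool))
```

Note that hard-linked files are counted once per link here, so the reported usage can overstate what is actually on disk.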
Arnd Marijnissen added the Type Deployment label 2023-02-14 15:47:07 +01:00

One thing I wonder about is if blender and other repositories are in separate pools, are hard links between them still possible to avoid every fork taking up space? Or are hard links already not working?

> One thing I wonder about is if blender and other repositories are in separate pools, are hard links between them still possible to avoid every fork taking up space?

With gitea.com, hard links across pools did not work, although that could just be because we didn't look into it much, and it may well have been possible with appropriate tuning.
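
One way to check this on a given setup is to simply try it. The sketch below is a hypothetical probe, not something from gitea.com or this deployment: it creates a temporary file in one directory and attempts to hard-link it into another. A cross-filesystem attempt fails with `EXDEV`, which is reported as `False`; any other error is re-raised. The example paths are made up.

```python
import errno
import os
import tempfile


def can_hardlink(src_dir: str, dst_dir: str) -> bool:
    """Try to hard-link a temporary file from src_dir into dst_dir.

    Returns False if the OS reports a cross-device link (EXDEV),
    True if the link succeeds. Both probe files are cleaned up.
    """
    fd, src = tempfile.mkstemp(dir=src_dir)
    os.close(fd)
    dst = os.path.join(dst_dir, os.path.basename(src) + ".link")
    try:
        os.link(src, dst)
    except OSError as exc:
        if exc.errno == errno.EXDEV:
            return False
        raise  # permission problems etc. are not a "no", so surface them
    else:
        os.unlink(dst)
        return True
    finally:
        os.unlink(src)


if __name__ == "__main__":
    # Hypothetical mount points for the two repository pools.
    a, b = "/data/repositories-blender", "/data/repositories-other"
    if os.path.isdir(a) and os.path.isdir(b):
        print("hard links possible:", can_hardlink(a, b))
```

Running this against the two candidate mount points before committing to the split would show whether the chosen layout still allows hard links between them.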

For repo archive, issue attachments, LFS, etc., you may wish to look into Ceph's radosgw and its S3-compatible API, together with the storage settings described at https://docs.gitea.io/en-us/config-cheat-sheet/#storage-storage. That way you can manage that specific storage outside of the VM config.

S3 storage can't be used for the issues search index, code search index, and git repos themselves (and a few other minor things).
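
As a rough illustration of what such a setup could look like, the Python sketch below renders a `[storage]` fragment for Gitea's `app.ini`. The key names are the ones I recall from the linked cheat sheet for the `minio` (S3-compatible) storage type and should be verified against the Gitea version in use; the endpoint, bucket and credentials are placeholders, and `render_storage_config` is a hypothetical helper.

```python
import configparser
import io


def render_storage_config(endpoint: str, bucket: str,
                          access_key: str, secret_key: str) -> str:
    """Render an app.ini [storage] fragment pointing at an S3-compatible endpoint."""
    cp = configparser.ConfigParser()
    cp.optionxform = str  # keep the upper-case key names as written
    cp["storage"] = {
        "STORAGE_TYPE": "minio",          # Gitea's S3-compatible backend type
        "MINIO_ENDPOINT": endpoint,       # e.g. the radosgw host:port (placeholder)
        "MINIO_ACCESS_KEY_ID": access_key,
        "MINIO_SECRET_ACCESS_KEY": secret_key,
        "MINIO_BUCKET": bucket,
        "MINIO_USE_SSL": "true",
    }
    buf = io.StringIO()
    cp.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # All values below are placeholders, not real deployment settings.
    print(render_storage_config("rgw.example.internal:7480", "gitea",
                                "ACCESS_KEY", "SECRET_KEY"))
```

Git repositories and the search indexes would still need local (or Ceph-mounted) storage, as noted above.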

Bart van der Braak added this to the DevOps Progress Board project 2024-07-16 12:57:38 +02:00
Bart van der Braak changed title from Deployment: Split 'data' storage into separate pools to Gitea: Split 'data' storage into separate pools 2024-07-17 15:12:43 +02:00
Bart van der Braak added the Service Gitea label 2024-08-01 11:15:39 +02:00
Reference: infrastructure/blender-projects-platform#34