Gitea server running out of disk space due to archive downloads #29
Reference: infrastructure/blender-projects-platform#29
It appears that something is downloading a large number of archives, that is, the .tar.gz or .bundle files that can be found, for example, on this page by clicking the "..." menu next to the Git URL:
https://projects.blender.org/blender/blender/
These archives are being downloaded for various revisions, and each one is about 500 MB for the Blender repository. Each archive is cached for some time, so the disk usage adds up quickly.
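To get a feel for how quickly the cache grows, something like the following could be run against the archive cache directory. This is only a rough sketch: the function name `cache_usage_by_age` is made up for illustration, and the location of the archive cache on the server is an assumption that needs to be checked against the actual Gitea data path.

```python
import time
from pathlib import Path


def cache_usage_by_age(cache_dir, now=None, bucket_hours=24):
    """Sum file sizes under cache_dir, grouped by age bucket.

    Bucket 0 holds files modified within the last `bucket_hours`,
    bucket 1 the period before that, and so on.
    Returns {bucket_index: total_bytes}.
    """
    now = time.time() if now is None else now
    totals = {}
    for path in Path(cache_dir).rglob("*"):
        if not path.is_file():
            continue
        stat = path.stat()
        # Clamp to zero in case a file has an mtime slightly in the future.
        age_hours = max(0.0, now - stat.st_mtime) / 3600
        bucket = int(age_hours // bucket_hours)
        totals[bucket] = totals.get(bucket, 0) + stat.st_size
    return totals
```

Calling `cache_usage_by_age("<path to the archive cache>")` and printing the sorted result shows how many bytes were produced per day, and whether files older than the expected retention are still around.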
The cache is supposed to be cleared regularly by cron.archive_cleanup, but that doesn't seem to be working and is leaving older files behind.
Even if it were working, generating 100s of GB every day just to discard it is not good either. So we should block whatever is downloading these archives, or at least block it from downloading projects.blender.org/*/*/archive/*.
Are you able to check the logs and see what is triggering these archive generation events? Is it a search crawler, some other automated tooling, or user behaviour?
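As a starting point for the log check, here is a minimal sketch that counts archive requests per user agent and per client IP. It assumes the web server in front of Gitea writes access logs in the common "combined" format, which may not match the actual setup; `count_archive_requests` is a hypothetical helper, and the path pattern simply mirrors `/*/*/archive/*` from above.

```python
import re
from collections import Counter

# Assumed "combined" access log format:
# ip - user [time] "METHOD path HTTP/x" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Matches /<owner>/<repo>/archive/<anything>.
ARCHIVE_PATH = re.compile(r'^/[^/]+/[^/]+/archive/.+')


def count_archive_requests(lines):
    """Return (Counter by user agent, Counter by client IP) for archive hits."""
    by_agent = Counter()
    by_ip = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        # Skip lines that do not parse or are not archive downloads.
        if not m or not ARCHIVE_PATH.match(m.group("path")):
            continue
        by_agent[m.group("agent")] += 1
        by_ip[m.group("ip")] += 1
    return by_agent, by_ip
```

Feeding it an open log file and looking at `by_agent.most_common(10)` should make it fairly obvious whether a single crawler, a handful of tools, or many regular users are behind the downloads.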
Based on the answers to the above, perhaps some rate limiting could be added, and known crawlers could be blocked from the archive endpoint.
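To illustrate the idea, here is a rough sketch of the decision logic: requests to the archive endpoint are refused for blocked user agents and limited per client with a sliding window. Everything here is hypothetical (the `ArchiveGate` class, the agent substrings, the limits); in practice this would more likely be configured in the reverse proxy than written in Python.

```python
import re
import time
from collections import defaultdict, deque

# Matches /<owner>/<repo>/archive/<anything>.
ARCHIVE_PATH = re.compile(r'^/[^/]+/[^/]+/archive/.+')

# Hypothetical substrings; replace with the agents actually seen in the logs.
BLOCKED_AGENT_SUBSTRINGS = ("examplebot", "crawler")


class ArchiveGate:
    """Sliding-window limiter that only applies to archive downloads."""

    def __init__(self, max_requests=5, window_seconds=3600.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client -> timestamps of allowed hits

    def allow(self, client, path, user_agent, now=None):
        # Everything outside the archive endpoint is left alone.
        if not ARCHIVE_PATH.match(path):
            return True
        agent = user_agent.lower()
        if any(s in agent for s in BLOCKED_AGENT_SUBSTRINGS):
            return False
        now = time.monotonic() if now is None else now
        hits = self._hits[client]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

The limits should be chosen based on what the logs show, so that a normal user downloading an occasional archive is not affected.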
Closing as a duplicate of #32.