Make Shaman compatible with symlinked storage directories #99965

Open
opened 2022-07-25 12:54:27 +02:00 by Sybren A. Stüvel · 5 comments

Shaman gets confused when it's configured to a storage directory that's a symlink to some other place. The effect is that the garbage collector will compare the actual, physical path of a file with the symlinked storage path, which will fail. As a result, files will be garbage collected even when they're still in use.

This surfaced when running the unit tests on macOS. The TestGCComponents unit test (in pkg/shaman/cleanup_test.go) uses a temporary directory (via ioutil.TempDir(...)) that on a Mac will be something like /var/folders/.... However, /var is a symlink to /private/var, and thus the Shaman Garbage Collector will be comparing /private/var/folders/blabla symlinks to /var/folders/blabla paths.

Marking as high priority as this can cause loss of files. However, I expect it to be unlikely to happen in practice, as Flamenco should just be configured with the actual storage path.

Shaman gets confused when it's configured to a storage directory that's a symlink to some other place. The effect is that the garbage collector will compare the actual, physical path of a file with the symlinked storage path, which will fail. As a result, files will be garbage collected even when they're still in use. This surfaced when running the unit tests on macOS. The `TestGCComponents` unit test (in `pkg/shaman/cleanup_test.go`) uses a temporary directory (via `ioutil.TempDir(...)`) that on a Mac will be something like `/var/folders/...`. However, `/var` is a symlink to `/private/var`, and thus the Shaman Garbage Collector will be comparing `/private/var/folders/blabla` symlinks to `/var/folders/blabla` paths. Marking as high priority as this can cause loss of files. However, I expect it to be unlikely to happen in practice, as Flamenco should just be configured with the actual storage path.
Author
Owner

Changed status from 'Needs Triage' to: 'Confirmed'

Changed status from 'Needs Triage' to: 'Confirmed'
Author
Owner

Added subscriber: @dr.sybren

Added subscriber: @dr.sybren
Sybren A. Stüvel added
Type
Bug
and removed
Type
Report
labels 2023-02-17 11:14:08 +01:00
Contributor

To recreate this problem (on Linux workstation) in the Flamenco Shaman unit tests, comment out L38-41 in
Then create a symlinked TempDir() path.

$> ls -l /tmp/
total 124
drwxrwxr-x 7 mcook mcook  4096 Jul 13 22:47 99965_real_dir
lrwxrwxrwx 1 mcook mcook    19 Jul 13 22:24 99965_sym_dir -> /tmp/99965_real_dir

func CreateTestConfig() (conf Config, cleanup func()) {
        //tempDir, err := ioutil.TempDir("", "shaman-test-")
	tempDir, err := ioutil.TempDir("/tmp/99965_sym_dir", "shaman-test-")
	if err != nil {
		panic(err)
	}

	// tempDir, err = filepath.EvalSymlinks(tempDir)
	// if err != nil {
	// 	panic(err)
	// }

	conf = Config{
...

This will generate GC unit tests errors like...

go test -v ./pkg/shaman/
{"level":"info","package":"shaman/test","path":"/tmp/99965_sym_dir/shaman-test-1291574843/file-store/stored/80/b749c27b2fef7255e7e7b3c2029b03b31299c75ff1f1c72732081c70a713a3/7488.blob","time":"2023-07-13T20:47:22-04:00","message":"deleting unused file"}
    cleanup_test.go:194: 
        	Error Trace:	cleanup_test.go:194
        	Error:      	unable to find file "/tmp/99965_sym_dir/shaman-test-1291574843/file-store/stored/59/0c148428d5c35fab3ebad2f3365bb469ab9c531b60831f3e826c472027a0b9/3367.blob"
        	Test:       	TestGCComponents
        	Messages:   	file should exist after GC
--- FAIL: TestGCComponents (0.08s)
To recreate this problem (on Linux workstation) in the Flamenco Shaman unit tests, comment out L38-41 in Then create a symlinked TempDir() path. ``` $> ls -l /tmp/ total 124 drwxrwxr-x 7 mcook mcook 4096 Jul 13 22:47 99965_real_dir lrwxrwxrwx 1 mcook mcook 19 Jul 13 22:24 99965_sym_dir -> /tmp/99965_real_dir ``` ``` func CreateTestConfig() (conf Config, cleanup func()) { //tempDir, err := ioutil.TempDir("", "shaman-test-") tempDir, err := ioutil.TempDir("/tmp/99965_sym_dir", "shaman-test-") if err != nil { panic(err) } // tempDir, err = filepath.EvalSymlinks(tempDir) // if err != nil { // panic(err) // } conf = Config{ ... ``` This will generate GC unit tests errors like... ``` go test -v ./pkg/shaman/ ``` ``` {"level":"info","package":"shaman/test","path":"/tmp/99965_sym_dir/shaman-test-1291574843/file-store/stored/80/b749c27b2fef7255e7e7b3c2029b03b31299c75ff1f1c72732081c70a713a3/7488.blob","time":"2023-07-13T20:47:22-04:00","message":"deleting unused file"} cleanup_test.go:194: Error Trace: cleanup_test.go:194 Error: unable to find file "/tmp/99965_sym_dir/shaman-test-1291574843/file-store/stored/59/0c148428d5c35fab3ebad2f3365bb469ab9c531b60831f3e826c472027a0b9/3367.blob" Test: TestGCComponents Messages: file should exist after GC --- FAIL: TestGCComponents (0.08s) ```
Contributor

Marking as high priority as this can cause loss of files. However, I expect it to be unlikely to happen in practice, as Flamenco should just be configured with the actual storage path.

I actually created the scenario for this accidental deletion when I was supporting more than one type of shared-storage system (SMB for Artists workstations + NFS for worker cluster on the same Flamenco Manager workstation. I had symlinked one to the other and could have easily chosen Flamenco Manager to use the sym-linked path.

> Marking as high priority as this can cause loss of files. However, I expect it to be unlikely to happen in practice, as Flamenco should just be configured with the actual storage path. I actually created the scenario for this accidental deletion when I was supporting more than one type of shared-storage system (SMB for Artists workstations + NFS for worker cluster on the same Flamenco Manager workstation. I had symlinked one to the other and could have easily chosen Flamenco Manager to use the sym-linked path.
Author
Owner

That's good to know. What was the reason for symlinking, rather than simply configuring both Samba and NFS to expose the same path?

That's good to know. What was the reason for symlinking, rather than simply configuring both Samba and NFS to expose the same path?
Sybren A. Stüvel added
Priority
Normal
and removed
Priority
High
labels 2024-02-09 08:09:59 +01:00
Sign in to join this conversation.
No Milestone
No Assignees
2 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: studio/flamenco#99965
No description provided.