Suppose two git-annex symlinks (locked files) point to the same key. If you unlock one symlink, and annex.thin is true, editing the file will change the contents pointed to by the second, supposedly locked, symlink.
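The effect can be reproduced without git-annex at all, since it comes down to how hard links behave. The following is a minimal sketch, not git-annex's actual code: the `objects` directory, the file names, and the use of a bare SHA-256 hex digest as the key are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def demo_thin_corruption():
    """Edit a hard-linked 'unlocked' file and report what the 'locked' symlink now sees."""
    with tempfile.TemporaryDirectory() as repo:
        # Hypothetical stand-in for .git/annex/objects; not the real layout.
        objdir = os.path.join(repo, "objects")
        os.mkdir(objdir)

        content = b"original content\n"
        key = hashlib.sha256(content).hexdigest()
        obj = os.path.join(objdir, key)
        with open(obj, "wb") as f:
            f.write(content)

        # The locked file is a symlink to the object.
        locked = os.path.join(repo, "locked_file")
        os.symlink(obj, locked)

        # The unlocked file is a hard link to the same object,
        # which is the situation annex.thin sets up.
        unlocked = os.path.join(repo, "unlocked_file")
        os.link(obj, unlocked)

        # Appending through the hard link modifies the shared inode.
        with open(unlocked, "ab") as f:
            f.write(b"an edit\n")

        # Return the key and the hash of what the locked symlink now reads.
        return key, sha256_of(locked)
```

After the edit, the content reachable through the locked symlink no longer hashes to its key, even though the locked file itself was never touched.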
Maybe, with annex.thin, when one copy of a file is unlocked, all must be unlocked?
Also, with annex.thin, the invariant that .git/annex/objects/aa/bb/*/* contains content with given key gets broken if the file is edited. Might that affect other things, like metadata lookup? git-annex-fsck reports this as an error.
git-annex-drop succeeds but does not actually drop the file.
Also, even if the current repo is trusted, with annex.thin, an unlocked file should not count as a trusted copy.
Confirmed this is still a problem. git-annex fsck does detect and deal with it, e.g. by deleting the corrupted object.
It seems like it would be hard for git-annex to make other files that use the same object be unlocked. Consider a repo with one file that is unlocked and uses an object. Then a git merge adds another, locked file, using the same object. git-annex didn't have a chance to run at all, and now the stage is set for this problem to happen if the user appends to the unlocked file.
In a way, the docs for annex.thin do warn the user about this. If you squint just right:
But, git-annex goes out of its way to avoid 2 unlocked files being hardlinked when using annex.thin. So it seems wrong that a locked file and an unlocked file will be hard linked, and that the locked file can get corrupted.
I've made the docs warn about it better.
Ugh, I had closed this as not solvable, but on second thought, it's a very real wart and if it's not solvable that doesn't make it not a problem.
And it does seem that it could be solved in some outside the box way.
For example, what if thin files were not hard links to the object file? Now, they have to hard link to somewhere, to prevent git checkout from deleting the only copy of a thin file. But it could hard link to a different name in the object directory. One that symlinks do not point to.
Then git-annex fix or fsck would need to notice when a symlink points to an object file that is missing, and copy the thin file to it to populate it. And, something might also want to reap object files that have become only used by thin files.
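A rough sketch of how that idea might behave, under stated assumptions: the `.thin` suffix, the `populate_missing_object` helper, and the bare SHA-256 key are all hypothetical and are not anything git-annex implements. The hash check before copying is also an added assumption, on the reasoning that a thin file that has already been edited should not be used to populate the object. Reaping is not covered.

```python
import hashlib
import os
import shutil
import tempfile

THIN_SUFFIX = ".thin"  # hypothetical alternate name that symlinks never point to


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def populate_missing_object(objdir, key):
    """Populate objdir/key from its thin counterpart, as a fix/fsck pass might."""
    obj = os.path.join(objdir, key)
    thin = obj + THIN_SUFFIX
    if os.path.exists(obj):
        return "present"
    if not os.path.exists(thin):
        return "missing"
    if sha256_of(thin) != key:
        return "modified"  # thin file was edited; do not copy it (assumption)
    shutil.copyfile(thin, obj)  # a copy, not a link, so later edits cannot reach it
    return "populated"


def demo_separate_thin_name():
    """Show that the locked symlink survives an edit to the thin file."""
    with tempfile.TemporaryDirectory() as repo:
        objdir = os.path.join(repo, "objects")
        os.mkdir(objdir)

        content = b"original content\n"
        key = hashlib.sha256(content).hexdigest()

        # The thin file hard links to the alternate name, not to the object itself.
        thin = os.path.join(objdir, key + THIN_SUFFIX)
        with open(thin, "wb") as f:
            f.write(content)
        unlocked = os.path.join(repo, "unlocked_file")
        os.link(thin, unlocked)

        # A locked symlink arrives (say, via a merge) pointing at the missing object.
        locked = os.path.join(repo, "locked_file")
        os.symlink(os.path.join(objdir, key), locked)

        status = populate_missing_object(objdir, key)

        # Editing the unlocked file now only touches the thin inode.
        with open(unlocked, "ab") as f:
            f.write(b"an edit\n")

        return status, sha256_of(locked) == key
```

The cost of this approach is the extra copy made when the object is populated, which gives up the space saving of annex.thin for any key that also has a locked file.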