Somewhat of a follow-up to https://git-annex.branchable.com/bugs/add_config_var_preventing_adjusted_branch_mode/, where we ended up in adjusted branch mode and now want to get back to the original indirect mode using the thaw/freeze commands.
Checking out the master branch is not sufficient, since .git/annex/objects uses a different layout, I guess to ensure that symlinks do not jeopardize the actual annex storage on systems without read-only protection. But we need some command to migrate the .git/annex/objects layout. Maybe it is already there and I just failed to find it.
But adjusted branches do not affect the location of git-annex object files.
The git-annex adjust man page says to use git checkout to switch back, and it certainly does work.
If you are having a problem with this, you need to explain what the problem is and what is happening...
Ok, so the object location usually used in bare repositories..
One way that could happen is if core.symlinks=false and annex.crippledfilesystem=true. Then it does use the bare form of object filenames, which is kind of ok since it's not going to be using symlinks in that repository.
Also, before 2016 (commit 2d00523609def535588b693a00d4092768e1c3c6), git-annex used those names whenever annex.crippledfilesystem=true, no matter what core.symlinks was set to. So if the files are that old..
This does seem to point to there needing to be a way to migrate the object files in a repository to the right names. It might be a reasonable thing for git-annex fsck to do, when it sees a symlink to an object file that is in the other location.
The rationale for using the bare object layout when on a crippled filesystem was given in f1b0a4b404ed835f1c4a27a92352180be8564f8a. Basically it may be more portable. Not a strong rationale at all, as the later change to not do it when symlinks are supported shows. But I don't think it's worth changing at this point.
So teaching fsck to move object files to the preferred location seems the best way. It will also deal with the situation where a bare repository gets converted by the user into a non-bare repo.
As well as moving the object file, fsck will need to move any other associated files, including the object lock file. It may as well move the whole object directory.
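A minimal sketch of moving the whole object directory as a unit, written in Python rather than git-annex's own Haskell. The function name and both directory arguments are hypothetical. How git-annex computes the old and the preferred location is not shown, and the sketch does no locking and ignores the read-only permissions (thaw/freeze) that annexed content normally has.

```python
import os


def move_object_dir(old_dir, new_dir):
    """Move an object directory, with everything in it, to new_dir.

    Hypothetical helper for illustration only; performs no locking.
    Returns True if the directory was moved, False otherwise.
    """
    # Nothing to do if there is no old directory, and never clobber
    # something that already exists at the preferred location.
    if not os.path.isdir(old_dir) or os.path.exists(new_dir):
        return False
    os.makedirs(os.path.dirname(new_dir), exist_ok=True)
    # Renaming the directory takes the object file and any associated
    # files (such as a lock file) along in a single step.
    os.rename(old_dir, new_dir)
    # Remove the parent directory left behind in the old layout, if it
    # is now empty; otherwise leave it alone.
    try:
        os.rmdir(os.path.dirname(old_dir))
    except OSError:
        pass
    return True
```

Moving the directory rather than the individual files keeps the object file and its associated files together, but it does nothing about another process that has already looked up the old location.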
Locking is a concern for implementing this in fsck. There would be a race where another process that is locking the object file sees the object file in the old location, so tries to lock it in the old location, but by then the object file has been moved.
Experimentally: In v10, moving the object file after it has checked its location in preparation for locking for drop results in it making a separate lock file in the old object directory. That lock file remains after the drop succeeds. In v8/v9, it seems to not create the object file when trying to lock it. (Based on reading the code, I thought perhaps it would!) In v8-v10, moving the object directory in the race when it's locking content in place causes the lock to fail; it does not create any lock file or object file.
So, v10 post drop lock file cleanup is the problem. Or at least one problem, there could be other points in the race than the one I tested that have other behavior. This seems like an ugly race to insert fsck into the middle of; it would be much preferable if fsck could somehow avoid such races when moving the object directory. But how?
fsck could lock the object file for drop, and then rather than removing it, move it to a holding location. Then it could move the object file into the right place the same as get does. This should avoid the race. Interrupting fsck at the wrong time would leave the object file in this holding location though. Re-running fsck would need to recover from this situation. Putting it in .git/annex/tmp/ might make sense, although git-annex get does not necessarily recover when the object file is located there.
If fsck locks the content for removal, then moves it to the preferred location, how is that any different from git-annex first dropping content and then very quickly retrieving another copy and storing it in the other location? The only difference is timing, but things like being suspended and resumed can affect timing.
So, if there is a problem with fsck doing that, there would also be a more general problem, that could occur in other circumstances, even if only rarely.
One way to see the general problem happen would be to have two processes trying to drop the same object. One process finds the object location, then stalls. Meanwhile, the second process drops the object. Then the first process resumes, and locks for removal. Per comment #5 this will result in a dangling lock file in the object directory. I have not managed to get this to happen yet though.
A fix for the general problem is to make it not create the object directory when opening the object lock file. So I've made that change.
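The idea behind that change can be sketched in Python (git-annex itself is written in Haskell; the function names and the lock file path are hypothetical, and no actual lock is taken here). If opening the lock file first recreates its parent directory, a process that resumes after the object was dropped leaves a dangling lock file behind. If it does not, the open simply fails.

```python
import os


def open_lock_creating_dir(lock_path):
    """Problematic variant: recreates a removed object directory."""
    os.makedirs(os.path.dirname(lock_path), exist_ok=True)
    return os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)


def open_lock_existing_dir(lock_path):
    """Variant after the change: never creates the object directory.

    Returns a file descriptor, or None when the object directory no
    longer exists (for example because the object was dropped or moved).
    """
    try:
        return os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    except FileNotFoundError:
        return None
```

With the second variant the caller has to treat None as "content is not here", which is the same outcome as finding the object missing in the first place.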
Made git-annex fsck move the object files to the preferred location for the repository type.
You can run it with --fast and it should solve your problem. I'm still not certain what circumstance led to you having the problem, but unless I hear back I'll assume it was something like an old version of git-annex. So will close this bug with this as the fix.
FWIW: confirming that git-annex fsck --fast worked out nicely on a sample test repo.
Re the question above on how we got there: it is trivial. While having no globally defined thawcontent-command/freezecontent-command, we created a new repo with git annex 10.20230126-1~ndall+1 (so, recent). So we ended up in adjusted branches mode, with .git/config having core.symlinks = false and annex.crippledfilesystem = true.
Then we wanted to move back to "normal" -- enabled those thaw/freeze config options, ran git config --unset core.symlinks; git config --unset annex.crippledfilesystem, ran git annex fsck --fast, and it seems we got it all alright.