Please describe the problem.
- Some files remain symlinked after an aborted `git annex add` and a completed `git annex unannex`.
- These files are present in `.git/annex/objects`, but `git annex unused` does not find them. Running `git annex whereused --key=SHA256E...` returns nothing.
To restore the files and remove them from the git-annex objects folder, manual workarounds or hacks are needed, such as adding a file again with `git annex add` and then trying to remove it again.
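When checking an individual leftover symlink, the key to pass to `git annex whereused --key=` can be read from the link target, since an annexed symlink points at `.git/annex/objects/.../KEY/KEY`. A minimal sketch, assuming a POSIX shell with `readlink` available; the function name `annex_key_of` is hypothetical and not a git-annex command:

```shell
# annex_key_of FILE: print the git-annex key that a symlink points at.
# Hypothetical helper, not part of git-annex. It relies on the annex
# object layout, where the link target ends in .../KEY/KEY.
annex_key_of() {
    # readlink fails on anything that is not a symlink
    target=$(readlink "$1") || return 1
    case "$target" in
        */annex/objects/*) basename "$target" ;;
        *) return 1 ;;   # symlink, but not into the annex object store
    esac
}
```

It could be used as `git annex whereused --key="$(annex_key_of somefile)"`, where `somefile` stands for one of the leftover symlinks.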
What steps will reproduce the problem?
- run `git annex add` and abort the operation midway (this was on a directory with a large number of files, ~3K, and running with the 12-jobs command switch)
- run `git annex unannex` until done
- find that some of the files that were added were restored, while others are still symlinked but are not tracked by git annex
What version of git-annex are you using? On what operating system?
Debian Bookworm / git-annex version: 10.20240227-1
Please provide any additional information below.
Similar report from another user here: https://git-annex.branchable.com/forum/File_still_symlinked_after_git_annex_unannex/
Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Yes, using it extensively for a few years with terabytes of data
A solution that involves running `git annex add` is also described at the link below: https://git-annex.branchable.com/forum/git_annex_add_crash_and_subsequent_recovery/#comment-4f5af644597a055624009c5bbb9aca3f
So one needs to find the files that are symlinks into the git-annex objects folder and run `git annex add` / `git annex unused`.
- I can handle that with a script, though it would be nice to have a built-in method.

Additional notes:
- It would be nice to be able to stop `git annex add` gracefully on long-running jobs. Is there a way to do it now? It looks like Ctrl-C resulted in a broken state. Would Ctrl-Z work better?

The reason `git-annex unused` does not show these files as unused is that it looks at non-staged files as well as staged files. There is a good reason for it to do that. Consider:

If that said that the object used by `bar` was unused, the user might drop that object, and then they would be surprised and unhappy when `bar` turned into a broken link. So the object is in fact still used, even though only by an unstaged file.
On the other hand, `git-annex unannex` only operates on files that are staged in git. It would be possible for it to also operate on annexed symlinks that are not staged.
But it seems to me there are other ways to get into that situation where it's not clear that the user would want `git-annex unannex` to do anything. Consider:

That unannex does nothing. If it instead replaced the symlink with a copy of the file, and the file was large, the user might be surprised to have a lot more disk space being used than they did before.
It seems easy enough to recover an interrupted `git-annex add` by either running `git add` on the symlinks, or re-running the `git-annex add`, which will add the symlinks and pick up where it was interrupted.
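
Finding the leftover symlinks for either recovery route can be scripted. A minimal sketch, assuming a POSIX shell with `find` and `readlink`, and file names that contain no newlines; `list_annex_symlinks` is a hypothetical helper name, not a git-annex command:

```shell
# list_annex_symlinks [DIR]: print every symlink under DIR (default: .)
# whose target points into an annex/objects directory.
# Hypothetical helper, not part of git-annex.
list_annex_symlinks() {
    # Skip the .git directory itself; only the work tree is of interest.
    find "${1:-.}" -path '*/.git' -prune -o -type l -print |
    while IFS= read -r link; do
        case "$(readlink "$link")" in
            */annex/objects/*) printf '%s\n' "$link" ;;
        esac
    done
}
```

The output could then be fed to `git add`, for example `list_annex_symlinks | while IFS= read -r f; do git add -- "$f"; done`. Symlinks that are already tracked are listed as well; running `git add` on them again should be harmless.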