Perhaps stupidly I added some very large bare git repos into a git-annex.
This took a very long time, used lots of memory, and then crashed. I didn't catch the error (which is annoying) - sorry about that. IIRC it is the same error you get if you Ctrl-C the addition.
I ran "git annex add ." a second time and eventually killed it (I perhaps should have waited - I now think it was working).
A "git annex unannex" fixed up some files, but somehow I managed to end up with tonnes of files all sym-linked into the git annex object directory but not recognised as annexed files. I'm assuming that they didn't make it into git annex's meta-data layer (or equivalent).
Commands such as "git annex {fsck,whereis,unannex} weirdfile" immediately returned without error.
I've now spent a lot of time manually copying the files back. I did the following; not the cleverest approach, but I was a little panicky about my data...
find . -type l -exec mv \{} \{}.link \; #Move link names out of the way
find . -type l -exec cp \{} \{}.cp \; #Copy follows links so we can copy target back to link location
find . -type f -name "*.link.cp" | xargs -n 1 rename 's/\.link\.cp//' #Change to original name
find . -type l -exec rm \{} \; #Ditch the links
git annex unused
git annex dropunused `seq 9228`
9228 files were found to be unused, which gives an idea of the number of "lost" files, for want of a better term.
A pretty poor bug report as these things go. Anyone any idea what might have happened (it didn't seem space or memory related)? Or how I might have fixed it a little more cleverly?
For reference I am using stable Debian, git annex version 3.20111011.
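For anyone who does end up copying files back by hand, the four find passes above could probably be collapsed into one. This is only a sketch: the function name replace_links_with_copies is made up for illustration, it assumes GNU readlink (for -f), and it deliberately skips dangling links and anything under a .git directory.

```shell
# Replace each symlink under the given directory (default: .) with a copy
# of the file it points to. Hypothetical helper, not part of git-annex.
replace_links_with_copies() {
    find "${1:-.}" -path '*/.git' -prune -o -type l -exec sh -c '
        for link do
            # Resolve the link; skip it if the target is not a regular file.
            target=$(readlink -f "$link") || continue
            [ -f "$target" ] || continue
            # Copy next to the link first, then rename over the link, so the
            # link is only replaced once the copy has succeeded.
            cp "$target" "$link.tmp.$$" && mv -f "$link.tmp.$$" "$link"
        done' sh {} +
}
```

Because the names are handed to the inner shell by find itself, file names with spaces or newlines are not split along the way. Try it on a scratch copy before trusting it with real data.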
Ah HA! Looks like I found the cause of this.
Spot the file name with a newline character in it! This causes the error message above. It seems that the files preceding this badly named file are sym-linked but not registered.
Perhaps a bug?
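In case it helps anyone check their own tree for this before adding it, here is a rough sketch of how such file names can be found. It builds a throwaway directory (the name demo and both file names are invented for the example) and counts the names that contain a newline.

```shell
# Work in a throwaway directory so the sketch is safe to run anywhere.
demo=$(mktemp -d)
touch "$demo/ordinary.txt"
touch "$demo/bad
name.txt"

# A variable holding a single newline, so the pattern works in plain sh.
nl='
'

# Print matching names NUL-terminated, then count the NULs: one per file.
bad_count=$(find "$demo" -name "*${nl}*" -print0 | tr -cd '\0' | wc -c)
echo "files with a newline in their name: $bad_count"

rm -rf "$demo"
```

On a real tree, pointing find at the repository instead of the demo directory and using -print would list the offending names.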
The bug with newlines is now fixed.
Thought I'd mention how to clean up from interrupting "git annex add". When you do that, it doesn't get a chance to "git add" the files it's added (this is normally done at the end, or sometimes at points in the middle when you're adding a lot of files). Which is also why fsck, whereis, and unannex wouldn't operate on them, since they only deal with files in git.
So the first step is to manually use "git add" on any symlinks. Then, "git commit" as usual. At that point, "git annex unannex" would get you back to your starting state.
Ah - very good to know that recovery is easier than the method I used.
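The recovery steps described above might look roughly like this as a script. It is a sketch under assumptions: the helper name stage_symlinks is made up, and it treats every symlink outside .git as one left behind by the interrupted add, which may not be true in every repository.

```shell
# Stage every symlink under the current directory, which is the manual
# "git add" step. Hypothetical helper, not a git-annex command.
stage_symlinks() {
    # Skip .git itself; hand the remaining symlinks to git add in batches.
    find . -path ./.git -prune -o -type l -exec git add -- {} +
}

# Possible usage, run from the top of the affected repository:
#   stage_symlinks
#   git commit -m "record files from interrupted git annex add"
#   git annex unannex    # only if you want to get back to the starting state
```

Using -exec with + means git add is not run at all when there are no symlinks, and unusual file names are passed through without being split.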
I wonder if it could be made a feature to automatically and safely recover/resume from an interrupted "git add"? That is, could "git annex add" recover when run a second time?