Hi,

I have been attempting to use git annex to archive about 300gb of data (a USB disk that is a copy of a laptop drive). I have an S3 special remote set up as a backup remote right now (the plan is that once I feel the data is safe on S3, I will change the remote from backup to archive and get rid of the local copies). I am using the WORM backend. I think something has gone wrong:
- As of right now I have about 200mb of data on S3, but 'git annex copy --to S3 .' runs for a while, then exits, without any indication that it has uploaded any additional files.
- .git/ is around 200gb in size
- 'git annex add .' adds files for a while, then exits; running it again does the same thing.
- At this point I decided to try to start over from scratch, so I ran 'git annex direct' to get the files put back in the working directory; this runs for a while and then hangs.
This has been my first outing with git annex, so I am somewhat at a loss as to what I can poke at to figure out what is going on. It would be nice to be able to get the data back to its original state, but more than that I would like to understand what I did wrong, and I would appreciate any pointers or advice.
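For what it's worth, the setup was roughly along these lines (the exact options are from memory, so treat them as approximate; the bucket name is a placeholder):

    # use the WORM backend for newly added files
    echo '* annex.backend=WORM' >> .gitattributes

    # S3 special remote named "S3"; credentials taken from the
    # AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables
    git annex initremote S3 type=S3 encryption=none bucket=my-backup-bucket

    # treat it as a backup remote for now; the plan was to move it to
    # the archive group once everything was uploaded
    git annex group S3 backup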
When you say that commands run for a while and then exit, are they exiting successfully? With a nonzero exit status? With an error message?
What OS? What git-annex version?
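Something like the following would capture all of that (just the usual shell exit status check, nothing git-annex specific):

    git annex version                                 # git-annex version
    uname -a                                          # OS and architecture
    git annex copy --to S3 . ; echo "exit status: $?"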
The commands would exit successfully.
I realize now that if I wanted to start from scratch, the correct thing to do was probably 'git annex uninit', so I am running that now, but it often fails with an error like:
So I wrote a little shell loop to run 'git annex uninit', replace the offending symlink with the contents out of .git/, and run 'git annex uninit' again. This is going slowly and seems to be getting slower, so I guess uninit does some kind of scan of all the files that is interrupted every time it hits the above error and has to start again from scratch; as I clear those errors one by one, uninit takes longer and longer to reach the next one. So what I really need is a script that fixes up the links in a single scan.
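Something along these lines ought to do it in one pass (an untested sketch: it assumes the content is still present under .git/annex/objects, and it copies rather than moves so nothing in .git is disturbed):

    # replace every annexed symlink in the work tree with the content it
    # points to, skipping any link whose object file is missing
    find . -path ./.git -prune -o -type l -print | while IFS= read -r link; do
        target=$(readlink -f "$link")
        case "$target" in
            */.git/annex/objects/*)
                if [ -f "$target" ]; then
                    rm -- "$link"
                    cp -- "$target" "$link"
                fi
                ;;
        esac
    done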
So, this seems to suggest that 'git annex add' exited before it got around to checking all the files it had added into the git repository. I don't see how that could happen if the command exited successfully. If it were interrupted, sure.
Normally I'd recommend that 'git annex add' just be re-run to recover from an interrupted add. But if it's just going to exit for some reason again, I guess that won't work. I think you need to strace a git-annex command like 'git-annex copy --to S3' and see if there's some clue why it's exiting unexpectedly.
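Something like this, for example (-o logs to a file, -f follows child processes; then look at the tail of the log for what happened right before it exited):

    strace -f -o /tmp/annex-copy.trace git annex copy --to S3 .
    tail -n 100 /tmp/annex-copy.trace

Running it with git-annex's --debug switch might also turn up a clue without needing strace at all.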