Hi,

I have been attempting to use git annex to archive about 300gb of data (a USB disk that is a copy of a laptop drive). I have an S3 special remote set up as a backup remote right now (the plan is that once I feel the data is safe on S3, I will change the remote from backup to archive and get rid of the local copies). I am using the WORM backend. I think something has gone wrong:
- As of right now I have about 200mb of data on S3, but 'git annex copy --to S3 .' runs for a while, then exits, without any indication that it has uploaded any additional files.
- .git/ is around 200gb in size
- 'git annex add .' adds files for a while, then exits; running it again does the same thing.
- At this point I decided to try to start over from scratch, so I ran 'git annex direct' to get the files put back in the working directory; this runs for a while and then hangs.
This has been my first outing with git annex, so I am somewhat at a loss as to what I can poke at to figure out what is going on. It would be nice to be able to get the data back to its original state, but more than that I would like to understand what I did wrong, and I would appreciate any pointers or advice.
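For what it's worth, the setup was roughly along these lines (the exact options are from memory, so treat them as approximate; the bucket name is a placeholder):

    # use the WORM backend for newly added files
    echo '* annex.backend=WORM' >> .gitattributes

    # S3 special remote named "S3"; credentials taken from the
    # AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables
    git annex initremote S3 type=S3 encryption=none bucket=my-backup-bucket

    # treat it as a backup remote for now; the plan was to move it to
    # the archive group once everything was uploaded
    git annex group S3 backup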
When you say that commands run for a while and then exit, are they exiting successfully? With a nonzero exit status? With an error message?
What OS? What git-annex version?
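Something like the following would capture all of that (just the usual shell exit status check, nothing git-annex specific):

    git annex version                                 # git-annex version
    uname -a                                          # OS and architecture
    git annex copy --to S3 . ; echo "exit status: $?"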
The commands would exit successfully.
I realize now that if I wanted to start from scratch, the correct thing to do was probably 'git annex uninit', so I am running that now, but it often fails with an error like:
So I wrote a little shell loop to run 'git annex uninit', replace the offending symlink with the contents out of .git/, and run 'git annex uninit' again. This is going slowly and seems to be getting slower, so I guess uninit does some kind of scan of all the files that is interrupted every time it hits the above error and has to start again from scratch; as I clear those errors one by one, uninit takes longer and longer to reach the next one. So what I really need is a script that fixes up the links in a single scan.
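Something along these lines ought to do it in one pass (an untested sketch: it assumes the content is still present under .git/annex/objects, and it copies rather than moves so nothing in .git is disturbed):

    # replace every annexed symlink in the work tree with the content it
    # points to, skipping any link whose object file is missing
    find . -path ./.git -prune -o -type l -print | while IFS= read -r link; do
        target=$(readlink -f "$link")
        case "$target" in
            */.git/annex/objects/*)
                if [ -f "$target" ]; then
                    rm -- "$link"
                    cp -- "$target" "$link"
                fi
                ;;
        esac
    done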
So, this seems to suggest that 'git annex add' exited before it got around to checking all the files it had added into the git repository. I don't see how that could happen if the command exited successfully. If it were interrupted, sure.
Normally I'd recommend that 'git annex add' just be re-run to recover from an interrupted add. But if it's just going to exit for some reason again, I guess that won't work. I think you need to strace a git-annex command like 'git-annex copy --to S3' and see if there's some clue why it's exiting unexpectedly.
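Something like this, for example (-o logs to a file, -f follows child processes; then look at the tail of the log for what happened right before it exited):

    strace -f -o /tmp/annex-copy.trace git annex copy --to S3 .
    tail -n 100 /tmp/annex-copy.trace

Running it with git-annex's --debug switch might also turn up a clue without needing strace at all.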