I'm storing hundreds of gigabytes of data on an S3 remote, and often when I try to copy to my remote using this type of command:

git annex copy newdir/* --to my-s3-remote

I'll get a little way into uploading some large file (which is stored in chunks) and then see something like this:

copy newdir/file1.tgz (gpg) (checking my-s3-remote...) (to my-s3-remote...)
3%        2.2MB/s 11h14m
  ErrorMisc "<socket: 16>: Data.ByteString.hGetLine: timeout (Operation timed out)"
failed
copy newdir/file2.tgz (checking my-s3-remote...) (to my-s3-remote...)
15%        2.3MB/s 3h40m
  ErrorMisc "<socket: 16>: Data.ByteString.hGetLine: resource vanished (Connection reset by peer)"
failed
copy newdir/file3.tgz (checking my-s3-remote...) (checking my-s3-remote...) (checking my-s3-remote...) (checking my-s3-remote...) (checking my-s3-remote...) (checking my-s3-remote...) (checking my-s3-remote...) ok
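
For context, the remote is an encrypted, chunked S3 special remote, set up with something roughly like the following (the chunk size, bucket name, and encryption scheme here are placeholders, not my exact settings):

git annex initremote my-s3-remote type=S3 encryption=shared chunk=100MiB bucket=my-bucket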

One common cause of these failures is an intermittent Internet connection on my end, but it can happen even when my connection seems steady. I'm willing to chalk that up to network problems elsewhere, though.

If I just keep hitting "up, enter" to re-execute the command each time it fails, eventually everything gets up there.

But this can take weeks: with these big files I'll often let an upload run overnight, then wake up in the morning and find, to my dismay, that it has failed again.
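
I could of course script the retrying myself. A minimal sketch of what I mean, assuming git annex copy exits non-zero whenever any transfer fails (which it appears to do), with an arbitrary 60-second pause between attempts:

# keep re-running the same copy command until it exits successfully
until git annex copy newdir/* --to my-s3-remote; do
    echo "copy failed, retrying in 60 seconds..."
    sleep 60    # arbitrary delay; adjust to taste
done

That would just automate the "up, enter" routine, though.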

My questions:

  • Is there a way to make it retry automatically? I'm sure that on any of these errors, an immediate automatic retry would almost certainly work.

  • If not, is there at least a way to make it pick up where it left off? Even though I'm using chunks, it seems to start the file over again.

Thanks.