As shown by the benchmarks here, there is some overhead for each file transfer to an rsync special remote, to set up the connection. The idea is to extend git-annex-shell with a command or commands that don't use rsync for transferring objects, and that can handle transferring or otherwise operating on multiple objects inside a single TCP session.

This might only be used when a transfer doesn't need to be resumed; git-annex could fall back to rsync for resuming.

Of course, when talking with a git-annex-shell that does not support this new command, git-annex would still need to fall back to the old commands using rsync. It should also remember, for the rest of the session, that the remote doesn't support the new command.
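
A minimal sketch of that session-level fallback, assuming hypothetical names (RemoteName, runAction) and assuming that an unsupported command can be told apart from an ordinary failure:

```haskell
import Data.IORef
import qualified Data.Set as S

newtype RemoteName = RemoteName String
	deriving (Eq, Ord)

-- Remotes whose git-annex-shell turned out not to support the new command.
type UnsupportedCache = IORef (S.Set RemoteName)

newUnsupportedCache :: IO UnsupportedCache
newUnsupportedCache = newIORef S.empty

-- Try the fast stdio protocol first; if the remote's git-annex-shell
-- does not know the command (Nothing), remember that and use rsync,
-- both now and for the rest of the session.
runAction :: UnsupportedCache -> RemoteName -> IO (Maybe Bool) -> IO Bool -> IO Bool
runAction cache r viastdio viarsync = do
	unsupported <- readIORef cache
	if r `S.member` unsupported
		then viarsync
		else do
			res <- viastdio
			case res of
				Just ok -> return ok
				Nothing -> do
					modifyIORef' cache (S.insert r)
					viarsync
```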

It could use sftp, but that seems rather difficult; sftp-server would need to be locked down to only write annexed objects to the right repository. And using sftp would mean that git-annex would have to figure out the filenames to use for annexed objects in the remote repository, rather than letting git-annex-shell on the remote work that out.

So, it seems better to not use sftp, and instead roll our own simple file transfer protocol.

So, "git-annex-shell -c p2pstdio" would speak a protocol over stdin/stdout that essentially contains the commands inannex, lockcontent, dropkey, recvkey, and sendkey.

P2P.Protocol already implements a similar protocol, used over tor. That protocol even supports resuming interrupted transfers. It has some things, such as auth, that this wouldn't need, but it would be good to unify with it as much as possible.
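
For instance, resuming could work the way it does there: the receiver checks how much of the object an interrupted transfer already left in its temp file, and asks the sender to start from that offset. A small sketch, where only doesFileExist and getFileSize are real (from System.Directory) and the surrounding design is assumed:

```haskell
import System.Directory (doesFileExist, getFileSize)

-- How far into the object a previously interrupted transfer got,
-- i.e. the Offset to request in a GET message.
resumeOffset :: FilePath -> IO Integer
resumeOffset tmpfile = do
	exists <- doesFileExist tmpfile
	if exists
		then getFileSize tmpfile
		else return 0
```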


Some benchmark results with the new protocol:

Dropping 200 files from a remote over a satellite internet connection, the time decreased from 364s to 183s.

Dropping 1000 files from a remote over ssh to localhost with -J4, the time decreased from 20s to 6s. Without -J4, it decreased from 41s to 10s.

(By comparison, dropping 1000 files from a remote on the same filesystem took 12s, so remote access over localhost now seems faster! Possibly there's a little more concurrency when git-annex and git-annex-shell are both running?)

Transferring a 30 MB file over ssh to localhost, the time decreased from 3.288s to 3.031s. Perhaps the improvement comes from avoiding rsync's several levels of checksumming?

Dropping 1000 files locally, with each needing to be locked on an ssh localhost remote, the time decreased from 30s to 7s.