git annex sync supports git remote groups, that should be expanded to all commands supporting --to and --from. And since there will be a [Remote] list, could also support --to foo --to bar etc.


Started working on this in multifromto branch.

Problem: Command.Move uses onlyActionOn, so only allows one remote to process a key at a time. Seems that onlyActionOn needs to be expanded to include the remote.

Problem: The concurrency is not right yet. For example, git-annex copy --to foo --to bar -J2 will start actions copying a file to both remotes. Assume the copy to bar finishes first. Then it will start an action copying the next file to foo, despite a copy to foo already running.

To fix this, it would help to look at Annex.activeremotes when starting the next action. If foo is busy, start the action on bar instead. It would be possible to build up a queue of actions that have not yet been started on a remote. That risks using a lot of memory if one remote is very slow. The queue would need to be capped at some amount, and when full, delay until the laggard remotes catches up.

Bonus: git annex sync --content -J2 already works, but it has the same problem described above and ought to be able to be fixed the same way.

Also worth noting that with --auto and -J, git-annex may make more transfers than preferred content settings demand, because it will start several transfers to different remotes at once. If only one copy is needed amoung all the remotes, it won't notice and a copy will be sent to all remotes. I think this is something the user can understand though?

Looking at commands that support --to and --from and what each should do, there is a lot of diversity.

git annex move --to of course removes the local copy. So if moving to multiple remotes it would need to delay that removal until it's sent to all of them. And it really only ought to try to remove the local copy once at the end, not once per remote moved to.

git annex move --from ought to spread the load amoung remotes with -Jn, and once a file is downloaded, it needs to try to remove it from all the remotes.

git annex mirror --from mirrors one remote; mirroring from multiple remotes does not really make any sense. mirror --to multiple could be done.

git annex unused --from seems unlikely to make sense with multiple remotes, since it would result in a list of keys distributed amoung them, and what would be done with that? Perhaps a git annex drop --from multiple remotes, but that would be innefficient. A shell script looping over remotes makes more sense if the user wants to drop unused from multiple remotes.

git annex get/fsck/copy/export/transferkey/drop/dropunused all make sense to support multiple remotes. But with -Jn the operations that get files behave differently than the operations that drop files. The gets want to balance load amoung the remotes, while the drops and uploads need to run each action over each remote.

Seems two runners are needed with different concurrency behavior, one that balances the load amoung remotes, and one that runs the same action against multiple remotes concurrently.