Hi,
So I have yet another idea to speed up git annex. For now only for the 2nd pass of git annex sync --content --all.

  1. Check which remotes are currently available (i.e. online and connected).
  2. If any of the (available) remotes doesn't have a commit recorded (See below), do a full sync.
  3. If numcopies.log, mincopies.log, trust.log, group.log, preferred-content.log, required-content.log, group-preferred-content.log or transitions.log (on the git-annex branch) changed since the last time we synced, do a full sync (since if any of those logs changed, it may affect the preferred-content expressions and we need to reevaluate every key).
  4. If every check passed, do an incremental sync.

Full sync:

  1. Do a normal (full) git annex sync.
  2. For every remote that we synced with, record the commit id of the current tip of the git-annex branch.
    • Record the commit id only if every file was successfully transferred/dropped.

Incremental sync:

  1. In the 2nd pass of git annex sync --content --all, only look at keys whose location log changed since the last (full or incremental) sync via git diff-tree -r --name-only <lowest recorded commit id of all remotes> git-annex.
  2. Again, update the commit id of remotes that we successfully synced with.