git annex sync --content behaves differently for directory and SSH remotes
I use directory remotes on two external hard drives to sync between them. Each drive is always mounted at the same mount point.
When syncing from one USB drive to another, git-annex iterates over all files in the annex. This takes a long time even if only one file was added.
When running git-annex on one of the USB drives and syncing over SSH, git-annex only performs the necessary file transfers and does not appear to iterate over all files.
I'd like to have the SSH behavior also for syncs between USB drives. Is that possible?
Also, how does git-annex make the decision?
UPDATE: adding the output of git-annex as requested by Joey
% git annex sync --content usb-sg2
commit
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.
[master 3bREDACTED4b] git-annex in usb-sg1
2619 files changed, 2619 insertions(+)
create mode 120000 import/REDACTED
...REDACTED...
create mode 120000 import/REDACTED
ok
pull usb-sg2
From /media/thk/thk-sg1/annex-sg1
7dREDACTED45..caREDACTEDfb git-annex -> usb-sg2/git-annex
84REDACTED33..3bREDACTED4b master -> usb-sg2/master
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.
ok
copy import/REDACTING ok
[From here on, the above line repeats for every file in the annex]
What specifically are you seeing when it "iterates over all files in the annex"?
sync --content necessarily has to look at each file in turn in order to determine a) whether the file's content is present locally and b) whether the file's content is present on the remote.
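To make the per-file check concrete, here is a minimal Python sketch of that loop. It is only a toy model and not git-annex's actual implementation: the `Repo` class and the `sync_content` function are hypothetical names, content presence is modelled as a plain set, and things like preferred content settings are ignored. It only illustrates why adding a single file still means visiting every file.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Hypothetical stand-in for a repository; not a git-annex API."""
    name: str
    present: set = field(default_factory=set)  # files whose content is stored here


def sync_content(files, local, remote):
    """Visit every annexed file once and transfer content where it is missing.

    Returns (number of files checked, list of transfers performed).
    """
    checked = 0
    transfers = []
    for path in files:
        checked += 1
        in_local = path in local.present    # a) is the content present locally?
        in_remote = path in remote.present  # b) is the content present on the remote?
        if in_local and not in_remote:
            remote.present.add(path)
            transfers.append(("copy", path, remote.name))
        elif in_remote and not in_local:
            local.present.add(path)
            transfers.append(("get", path, remote.name))
    return checked, transfers


if __name__ == "__main__":
    # 1000 annexed files; the remote is missing only the newest one.
    files = [f"import/file{i}" for i in range(1000)]
    local = Repo("usb-sg1", set(files))
    remote = Repo("usb-sg2", set(files[:-1]))
    checked, transfers = sync_content(files, local, remote)
    print(f"checked {checked} files, performed {len(transfers)} transfer(s)")
```

In this model the number of files checked always equals the size of the annex, no matter how few transfers are needed.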
Actually, I believe the difference might just be that with an SSH remote, a process is started on the remote side that iterates over all files, while with a directory remote the iteration is done in the same process. Now that I look at it more closely, the SSH sync seems to take a similar amount of time; it just does not print every file it iterates over.