Hi,
some time ago, I accidentally copied (instead of moved) some files from the archive to the non-archive part of my (indirect, type 'client') repository, with the effect that the assistant kept downloading the files and then immediately dropping them. This was not really surprising once I found the duplicate folder, but maybe git annex could detect this case and refuse to run in circles, or at least complain about it.
Best Karsten
Update: The current status is that this is fixed for direct mode.
In indirect mode, the startup scan will still download and then drop content if files outside and inside the archive directory have the same content. It doesn't loop like it did in direct mode; it only happens once (or once per duplicate file, really). It is still potentially annoying, and a bug. --Joey
This is a known problem.
It seems possible to fix it for direct mode. After all, direct mode tracks all files associated with a key, so it could expose this to preferred content expressions, and the expression could check whether any of the associated files is in an archive directory.
I'm unsure how to deal with it in indirect mode, short of making indirect mode do all the same tracking direct mode does, or otherwise building a key-to-file lookup table.
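To make the idea of a key-to-file lookup table concrete, here is a rough Python sketch. It is not git-annex code; the function names, the `file_keys` mapping, and the rule that any directory component named `archive` counts as an archive directory are all assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def in_archive(path):
    """Assumption: a file is archived if any parent directory is named 'archive'."""
    return "archive" in PurePosixPath(path).parts[:-1]


def build_key_index(file_keys):
    """Invert a {path: key} mapping into {key: set of paths} (hypothetical helper)."""
    index = defaultdict(set)
    for path, key in file_keys.items():
        index[key].add(path)
    return index


def any_associated_file_archived(key, index):
    """Check all files sharing this key, not just the one being scanned."""
    return any(in_archive(p) for p in index.get(key, ()))


# Three files share key "K"; one of them lives under archive/.
file_keys = {"1": "K", "archive/2": "K", "3": "K", "4": "J"}
index = build_key_index(file_keys)
print(any_associated_file_archived("K", index))
print(any_associated_file_archived("J", index))
```

With such an index, a preferred content check could reach one decision per key instead of one per file. What that decision should be when the associated files disagree is left open here.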
This turns out to be much worse in direct mode than in indirect mode.
In indirect mode, it only does extra work during the full startup scan. Suppose there are 3 files with the same content, 1, archive/2, and 3. It will download 1, and then will drop archive/2, and then will download 3. This certainly is not ideal, especially when the file content is large.
In direct mode, it continually and repeatedly downloads and then drops the files, as long as it's running. Which is beyond unacceptable.
What seems to be going on is that when archive/2 gets dropped, it necessarily needs to convert 1 and 3 to broken symlinks. But the watcher then sees those file changes, thinks these are new or renamed files that have appeared, and promptly re-downloads them. That, in turn, triggers an update of archive/2, to convert it back from a symlink to a direct mode file, and that in turn is noticed by the watcher. Round and round we go!
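The indirect mode startup scan described above (files 1, archive/2, and 3 with the same content) can be mimicked with a toy model. This is only a sketch under stated assumptions, not how git-annex is implemented: `startup_scan` and `in_archive` are hypothetical names, and the scan is assumed to visit files in order and decide per file whether the content is wanted.

```python
def in_archive(path):
    """Assumption: a file is archived if any parent directory is named 'archive'."""
    return "archive" in path.split("/")[:-1]


def startup_scan(files):
    """Toy per-file scan. files is an ordered list of (path, key) pairs.

    Each file is judged on its own path only, which is what causes the
    redundant transfers when several files share one key.
    """
    present = set()  # keys whose content is currently held locally
    actions = []
    for path, key in files:
        wanted = not in_archive(path)
        if wanted and key not in present:
            present.add(key)
            actions.append(("download", path))
        elif not wanted and key in present:
            present.discard(key)
            actions.append(("drop", path))
    return actions


for action in startup_scan([("1", "K"), ("archive/2", "K"), ("3", "K")]):
    print(action)
```

The model reproduces the download, drop, download sequence for a single pass. It does not model the direct mode watcher, which reacts to the resulting file changes and so repeats the cycle.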