Currently, git-annex pull drops unwanted files from the remote it pulls
from. I wonder if this is a good choice?
I was surprised by this behavior today myself. I was thinking of pull as
conceptually not modifying the state of the remote, only getting data from
it. That's what git pull does, after all.
My use case was that I knew the remote didn't want a file, but I wanted to leave the file on the remote for a little while longer. So I did a pull, rather than a full sync.
It's also the case that git-annex push drops unwanted files from the
local repository. The same analogy to git push would say it should not do
that.
Separating out these behaviors would have pull drop unwanted files from the local repository, while push drops unwanted files from the remote. The latter seems unambiguously what the user would want; the former might be surprising to some, but one of pull/push needs to drop from local in order for them combined to be the same as sync.
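To make the proposed split concrete, here is a toy model of it. This is only a sketch: repositories are plain sets of keys, the "wanted" sets stand in for preferred content, and every function name here is hypothetical rather than anything in git-annex. It also assumes a drop is only allowed when the other repository still holds a copy.

```python
# Toy model of the proposed split; none of these names are git-annex APIs.
# A repository is a set of keys; a "wanted" set stands in for preferred content.

def transfer(src, dst, dst_wanted):
    """Copy into dst every key that src holds and dst wants."""
    dst |= (src & dst_wanted)

def drop_unwanted(repo, wanted, other):
    """Drop keys repo does not want, but only if `other` still has a copy."""
    for key in sorted(repo - wanted):
        if key in other:
            repo.discard(key)

def proposed_pull(local, remote, local_wanted):
    """Proposed pull: only the local repository is modified."""
    transfer(remote, local, local_wanted)
    drop_unwanted(local, local_wanted, remote)

def proposed_push(local, remote, remote_wanted):
    """Proposed push: only the remote is modified."""
    transfer(local, remote, remote_wanted)
    drop_unwanted(remote, remote_wanted, local)

local, remote = {"a", "b"}, {"b", "c"}
local_wanted, remote_wanted = {"a", "c"}, {"a"}

proposed_pull(local, remote, local_wanted)
remote_after_pull = set(remote)   # the pull left the remote untouched
local_after_pull = set(local)

proposed_push(local, remote, remote_wanted)
```

In this model the pull leaves the remote exactly as it was, which matches the use case above of keeping an unwanted file on the remote a little longer, and running pull followed by push touches both sides the way a sync would.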
Looking at 5df89d58c7d43b5cd26829cb8c4699e02fc352f3 that
implemented pull and push, I think this behavior was emergent, not
designed. The existing git-annex sync --pull happened to drop unwanted
content from the remote and git-annex pull inherited that behavior.
Looking back to 1cc1f9f4e5e3e974ddec069b2a6a3edf0893c369 that
implemented --pull, it also doesn't seem to have considered what to do
about dropping.
Note that there is some risk of a wider behavior change than expected if
implementing this. handleDropsFrom drops from remotes first, and from the
local repository last. So if a file is unwanted by both local and remote,
and both start with a copy, git-annex pull will drop it from the remote,
then be unable to drop it from the local, and so it will stay on the local
repo. If it were changed to only drop from the local repo, it would be able
to drop it from local, and the file would stay on the remote. It's not
clear to me that either behavior is better than the other; both are legal
solutions to that preferred content situation, of course. It might be
possible to only document this behavior change; a user who has set up
such a preferred content configuration can of course change it to something
that picks the repository where they want the copy kept.
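The ordering effect described above can be sketched with a small simulation. This is a toy model, not git-annex code: the function names are hypothetical, and it assumes the reason the second drop fails is that at least one other copy (numcopies of 1) must remain.

```python
# Toy simulation of drop ordering; names are hypothetical, not git-annex APIs.
# Assumption: a drop is refused unless `numcopies` other copies remain.

def can_drop(key, holder, repos, numcopies=1):
    """True if enough other repositories still hold `key`."""
    others = sum(1 for name, keys in repos.items()
                 if name != holder and key in keys)
    return others >= numcopies

def drop_in_order(key, order, unwanted_by, repos):
    """Try to drop `key` from each repository in `order` that does not want it."""
    for name in order:
        if name in unwanted_by and key in repos[name] and can_drop(key, name, repos):
            repos[name].discard(key)
    return repos

unwanted_by = {"local", "remote"}

# Current behavior: remotes first, local last.
current = drop_in_order("f", ["remote", "local"], unwanted_by,
                        {"local": {"f"}, "remote": {"f"}})

# Changed behavior: pull only considers dropping from the local repository.
proposed = drop_in_order("f", ["local"], unwanted_by,
                         {"local": {"f"}, "remote": {"f"}})
```

With the current ordering the file ends up only on the local repository; with the change it ends up only on the remote. Either way one copy survives, which is why both outcomes satisfy the preferred content settings.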
--Joey