annex merge --remotes

At the moment, 'annex merge' does not accept any parameter to specify which remotes to consider; it merges all of them. In some cases it might be desirable to merge information from only some remotes (e.g. when I keep some annex remote "private" or something like that).

closing because there is a way to do it with git config, and no better way for git-annex to do it. --Joey

comment 1

There's been discussion of keeping private forks of git-annex repositories before.

IIRC, the remote.name.annex-sync and remote.name.annex-readonly settings can accomplish that.

For example, if the private repository A has a remote B, A can set annex-readonly for B, and this will prevent A from pushing any data to B. A can still pull from B. If A is on a locked-down machine that B cannot itself access, this guarantees that the changes in A remain private. I think this is the best way to handle this kind of scenario.

If B has A as a remote, then B could set annex-sync to false, which would prevent it from pulling from A, and so B would never merge in git-annex branches from A, at least unless A pushed them to B. Of course, in this scenario, a manual git pull A on B bypasses the protection.
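
As a rough sketch of the two settings described above, the configuration could be applied as follows. The scratch directories created with mktemp are an assumption made only so the commands can run anywhere; in practice A and B would be existing git-annex repositories.

```shell
# Sketch only: empty scratch repositories stand in for A and B.
set -e
scratch=$(mktemp -d)
git init -q "$scratch/A"
git init -q "$scratch/B"

# In A: B is a remote that A may pull from, but git-annex must not change.
git -C "$scratch/A" remote add B "$scratch/B"
git -C "$scratch/A" config remote.B.annex-readonly true

# In B: A is a remote that git annex sync should skip.
git -C "$scratch/B" remote add A "$scratch/A"
git -C "$scratch/B" config remote.A.annex-sync false
```

Neither setting stops a manual git pull, as noted above.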

It might make sense to make git-annex merge honor annex-ignore, and skip merging branches that belong to a remote, even if they were somehow pulled down. Unfortunately, git's remote branch name mapping can be quite complicated; IIRC it's not as simple as skipping branches remotes/B/*

Comment by joey — Mon Aug 8 15:32:19 2016

comment 2

For a git annex merge --remotes to be useful, there would need to be a config to disable the automatic merging of git-annex branches, which all git-annex commands do when they notice it needs to be done. So, this needs to be a git config and not a switch, so it can also control the automatic git-annex branch merging.

Using remote.name.annex-ignore as the config does not make sense on second look, because that gets set automatically when the remote is on eg github.

Using remote.name.annex-sync=false as the config makes some sense, although as noted above, that prevents git annex sync from fetching from the remote already, so unless git pull is run manually, the existing config should suffice.

To implement that, git-annex would have to parse the remote.name.fetch config in order to tell what name a remote's git-annex branch is fetched to. I am reluctant to do this for several reasons:

- The syntax of remote.name.fetch is only documented by example. It's not clear what's supposed to be done if eg, the * appears twice in a branch name or different numbers of times on the left and right hand sides.
- Two remotes can have remote.name.fetch set such that the same remote tracking branch is locally used for fetches from both remotes. So git-annex would not know if such a branch should be synced or not.
- remote.name.fetch can be overridden when using git fetch or git pull at the command line, so again git-annex can't know for sure what remote a given tracking branch came from.
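
To make the second reason concrete, here is a small sketch showing that git accepts two remotes whose fetch refspecs point at the same tracking namespace. The remote names one and two, the shared namespace, and the scratch paths are all hypothetical and chosen only for illustration.

```shell
# Sketch only: the remotes "one" and "two" are hypothetical and never contacted.
set -e
scratch=$(mktemp -d)
git init -q "$scratch/repo"
git -C "$scratch/repo" remote add one "$scratch/one"
git -C "$scratch/repo" remote add two "$scratch/two"

# Point both remotes at the same local tracking namespace.
git -C "$scratch/repo" config remote.one.fetch '+refs/heads/*:refs/remotes/shared/*'
git -C "$scratch/repo" config remote.two.fetch '+refs/heads/*:refs/remotes/shared/*'

# Show the resulting refspecs; git raises no objection to the overlap.
git -C "$scratch/repo" config --get-regexp '^remote\..*\.fetch$'
```

With this configuration, a branch such as refs/remotes/shared/git-annex could have been written by a fetch from either remote, which is why the tracking branch name alone does not identify its origin.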

Four approaches that could work:

1. Add a config that is a list of remote tracking branches, and make git annex merge and the automatic git-annex branch merging merge only those tracking branches. For example, annex.allowmerge=refs/remotes/origin/git-annex refs/remotes/origin/master

   Doable, but it seems this would be an annoying list to maintain, especially when new branches are made.

2. Embed some information in a branch that can be looked at to determine that git-annex should not auto-merge this branch. Note that this would need to be done for both the git-annex branch and the regular branch. The latter seems particularly hard to do.

3. Configure remote.name.fetch so that the remote git-annex branch is either not fetched, or fetched to a tracking branch that does not end in /git-annex. I think this is possible to do, but due to the lack of documentation for that config, it would take some experimentation to find how to do it. This would prevent the automatic merging of that branch by git-annex.

   And if you make the remote master branch be fetched to eg refs/remotes/name/master/nomerge then git annex merge won't merge that into master.

4. Prevent adding a remote to a repository if that remote contains private information that you don't want to get merged into the local repository. This still seems like the best solution to me; if the information is private it should not be possible to fetch it from the remote.
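
One possible way to try the "not fetched" variant of the third approach is a negative refspec. This is a sketch under assumptions that go beyond the discussion above: it relies on negative refspecs, which need Git 2.29 or later, and the scratch repositories and demo identity exist only so the commands are runnable. The remote is called name to match the remote.name.fetch placeholder.

```shell
# Sketch only: requires Git 2.29+ for the negative ("^") refspec.
set -e
scratch=$(mktemp -d)

# Stand-in remote with a master branch and a git-annex branch.
git init -q "$scratch/remote"
git -C "$scratch/remote" symbolic-ref HEAD refs/heads/master
git -C "$scratch/remote" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"
git -C "$scratch/remote" branch git-annex

# Local repository: keep the default refspec, then exclude git-annex.
git init -q "$scratch/local"
git -C "$scratch/local" remote add name "$scratch/remote"
git -C "$scratch/local" config --add remote.name.fetch '^refs/heads/git-annex'
git -C "$scratch/local" fetch -q name

# List the tracking branches that the fetch created.
git -C "$scratch/local" for-each-ref --format='%(refname)' refs/remotes/name
```

This excludes only the one branch named in the negative refspec. The remote master branch is still fetched to its usual tracking branch, so it does not cover the master/nomerge part of the approach.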

Comment by joey — Thu Oct 13 17:27:36 2016

comment 3

This came up again in https://git-annex.branchable.com/tips/local_caching_of_annexed_files/ and there it was sufficient to configure remote.name.fetch so that no branches were fetched from the cache remote.

Approach #3 can be implemented using:

fetch = refs/heads/master:refs/remotes/private/nomerge/master

This prevents git-fetch from fetching the git-annex branch, and it makes the remote master branch fetch into a name that git-annex won't automatically merge into master.
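
The effect of that refspec on what gets fetched can be checked with a throwaway setup like the one below. It is only a sketch: the scratch repositories and the demo commit identity are assumptions added to make it runnable, and it exercises plain git fetch rather than git-annex itself.

```shell
# Sketch only: scratch repositories stand in for the real ones.
set -e
scratch=$(mktemp -d)

# Stand-in for the private remote, with master and git-annex branches.
git init -q "$scratch/private"
git -C "$scratch/private" symbolic-ref HEAD refs/heads/master
git -C "$scratch/private" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"
git -C "$scratch/private" branch git-annex

# Local repository: replace the default refspec with the one shown above.
git init -q "$scratch/local"
git -C "$scratch/local" remote add private "$scratch/private"
git -C "$scratch/local" config remote.private.fetch \
    'refs/heads/master:refs/remotes/private/nomerge/master'
git -C "$scratch/local" fetch -q private

# List what was fetched; no tracking branch for git-annex should appear.
git -C "$scratch/local" for-each-ref --format='%(refname)' refs/remotes/private
```

Because git remote add writes a single default fetch line, the plain git config call replaces it rather than adding a second one.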

Comment by joey — Mon Aug 6 15:47:56 2018

comment 4

Based on my last comment, I think, if you still need this, you should try configuring remote.name.fetch to avoid fetching the git-annex branches you don't want to merge.

If that's not sufficient, follow up and we can think about the other options I discussed earlier.

Comment by joey — Wed Jan 29 15:10:09 2020
