multiple urls for the same UUID

I've been doing a sort of experiment but I'm not sure if it's working or, really, how to even tell.

I have two macbooks that are both configured as clients as well as a USB HDD, an rsync endpoint on a home NAS, and a glacier endpoint.

For the purposes of this example, lets call the macbooks "chrissy" and "brodie". Chrissy's was initially configured with a remote for brodie with the url as

ssh://Brodie.88195848.members.btmm.icloud.com./Users/akraut/Desktop/annex

This allows me to leverage the "Back To My Mac" free IPv6 roaming I get from Apple. Now, occasionally, that dns resolution fails. Since I'm frequently on the same network, I can also use the mDNS address of brodie.local. which is much more reliable.

So my brilliant/terrible idea was to put this in my git config:

[remote "brodie"]
    url = ssh://Brodie.88195848.members.btmm.icloud.com./Users/akraut/Desktop/annex
    fetch = +refs/heads/*:refs/remotes/brodie/*
    annex-uuid = BF4BCA6D-9252-4B5B-BE12-36DD755FAF4B
    annex-cost-command = /Users/akraut/Desktop/annex/tools/annex-cost6.sh Brodie.88195848.members.btmm.icloud.com.
[remote "brodie-local"]
    url = ssh://brodie.local./Users/akraut/Desktop/annex
    fetch = +refs/heads/*:refs/remotes/brodie/*
    annex-uuid = BF4BCA6D-9252-4B5B-BE12-36DD755FAF4B
    annex-cost-command = /Users/akraut/Desktop/annex/tools/annex-cost.sh brodie.local.

Is there any reason why I shouldn't do this? Is annex smart enough to know that it can reach the same remote through both urls? Will the cost calculations be considered and the "local" url chosen if it's cost is less than the other?

(I posted the annex-cost.sh stuff at Calculating Annex Cost by Ping Times.)

RSS Atom

comment 1

Yes, this is absolutely supported. If it's trying to get a file and it fails to access the first remote, it'll even automatically fall back to using the second. I do similar things with my own remote, for example I have a foo that uses ssh with a foo-local that uses NFS.

Comment by joeyh.name — Sun Dec 30 04:19:48 2012

Remove comment

multiple ways to access a special remote?

The use case is like this: an external USB drive could be accessed both locally (as a special directory remote) or it could be mounted remotely (then it would need to be an rsync special remote). Is there a way to handle this?

Failing that, can one switch the same remote from being a directory to an rsync without having to --move all of the content?

Finally, what's the best way to check directory remote's directory= parameter?

Comment by Michael — Wed Jul 10 05:52:28 2013

Remove comment

comment 3

@Michael, like I said above, this use case is completely supported. That's why git-annex uses UUIDs to uniquely identify repositories, no matter where or how many urls are used for them.

Just set up the remotes you need, and if they end up pointing to the same repository by different routes, git-annex will automatically notice.

The directory= parameter used when initializing a directory remote is only used to set up the remote in the .git/config file. It is not stored anywhere else, since the directory could be mounted at different locations on different computers, eg when a drive is moved between computers.

Comment by joeyh.name — Wed Jul 10 16:38:04 2013

Remove comment

comment 4

Very cool, thanks Joey.

Comment by Michael — Wed Jul 10 21:26:06 2013

Remove comment

annex sync when inside an itinerant repository

I'm having trouble with this where the different remotes have the same URL but different UUIDs. My situation is a repository on a USB drive that can be plugged into one of two machines and used to transport large files between them. On each machine there is a local repository in a consistent location, so I can rely on paths to things in the repos being consistent across machines. Each repo obviously has a different UUID. The USB repo has remotes for local filesystem access and remotes for over-the-network access as a convenience - something like this:

[remote "host1"]
    url = /m/stuff
    fetch = +refs/heads/*:refs/remotes/host1/*
    annex-uuid = ce6175ba-4a0d-49e6-88b1-615dac7a37c1
[remote "host1-net"]
    url = ssh://host1.network/m/stuff
    fetch = +refs/heads/*:refs/remotes/host1/*
    annex-uuid = ce6175ba-4a0d-49e6-88b1-615dac7a37c1
[remote "host2"]
    url = /m/stuff
    fetch = +refs/heads/*:refs/remotes/host2/*
    annex-uuid = f7e3fbe8-f7f5-4231-a885-a72a46680d0b
[remote "host2-net"]
    url = ssh://host2.network/m/stuff
    fetch = +refs/heads/*:refs/remotes/host2/*
    annex-uuid = f7e3fbe8-f7f5-4231-a885-a72a46680d0b

The over-the-network path is useful for keeping everything in sync, but it doesn't have enough bandwidth to sensibly sync the content as well.

If I run 'git annex sync' in this repository while it's attached to host1 I'd hope it would sync with host1 and host2-net, as those are the URLs through which the two repositories can be reached. What actually happens is that it syncs with all of the repositories and updates the annex-uuid of remote 'host2' to be the UUID of the host1 repository. It also obviously gets a bit confused because it updates the remote branches for host2 from host1.

Is there some way to configure it so that sync works with all repositories based on unique uuid values, rather than all remotes?

Comment by Mark — Thu Feb 12 12:54:21 2015

Remove comment

comment 6

I had the same problem, and I solved it using a host specific directory with symlink:

On host "host1", I've a directory named "/home/me/host1/" that contain a symlink "mygitrepos" to "/home/me/mygitrepos/". On host "host2", I've a directory named "/home/me/host2/" that contain a symlink "mygitrepos" to "/home/me/mygitrepos/".

On the usb drive, the remote are set as:

[remote \"host1\"]
    url = /home/me/host1/mygitrepos
    fetch = +refs/heads/*:refs/remotes/host1/*
    annex-uuid = ce6175ba-4a0d-49e6-88b1-615dac7a37c1
[remote \"host2\"]
    url = /home/me/host2/mygitrepos
    fetch = +refs/heads/*:refs/remotes/host2/*
    annex-uuid = f7e3fbe8-f7f5-4231-a885-a72a46680d0b

(I didn't set the net remote, but it should work). With this added indirection I protect myself against git (and git-annex) confusion.

Comment by Remi — Thu Feb 12 14:47:38 2015

Remove comment

comment 7

You have two configured git remotes, host1 and host2, with the same "url = /m/stuff". Only one of these remotes can be accessed at a time, depending on where the drive is docked.

So, why not just combine those two remote configs into a single remote. Call it "host".

git-annex will automatically notice when the uuid of the repository pointed to by "host" changes, and it will update the .git/config appropriately.

That BTW, happens to be just how I use git-annex with my own USB drive, and it works great.

git annex sync will still try to sync with "host1-net" and "host2-net", as well as which ever one of the two "host" points to. There's a small redundancy there, but since it will sync with "host" first, as it knows local file access is less expensive, the redundant sync will not involve much work.

Comment by joey — Thu Feb 12 20:25:12 2015

Remove comment

comment 8

Ah, ok. So it doesn't cause any issues that the host/* remote branches will also keep getting swapped from one repository to another? The operation of git annex sync is sufficiently (and happily) opaque to me, so I was concerned that this might break some of its basic assumptions.

Comment by Mark — Fri Feb 13 13:55:21 2015

Remove comment

comment 9

It's entirely expected and normal for git-annex to update the UUID of a remote with url = somepath when it notices that the repo at somepath has changed.

This is what you want to happen. If git-annex didn't notice and react to the UUID change, its location tracking information (for UUID A) would be inconsistent with the actual status of the repo (using UUID B).

Comment by joey — Tue Feb 17 21:43:16 2015

Remove comment

Add a comment