Hi joey,
I was recently moving many of my git annex repos around. This caused all of the (partly auto-generated) repo descriptions (e.g. user@host:path) to be outdated, making it more difficult to re-add them later from another host. Updating all of them manually was really error prone and tedious.
What do you think about having git annex update the repo description "from time to time"? git annex sync|assist|... could check if the current repo description matches what the auto-generated string would be and update it accordingly.
I see the following problems with this:
- Adding this check to each and every git annex command is probably not a good idea. Maybe only git annex sync and git annex assist. The overhead might be negligible though.
- How to detect if the current description was really auto-generated and not user-specified? git annex could parse it with a regex (e.g. (?P<user>[^@]+)@(?P<host>[^:]+):(?P<path>.*)$) and if that matches could assume it was auto-generated. Feels a little fragile though.
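To make the regex idea concrete, here is a minimal Python sketch of the detection step. It is not git-annex code (git-annex is written in Haskell); the function name parse_auto_description is hypothetical, and the assumption that auto-generated descriptions always look like user@host:path is taken from the example above.

```python
import re

# Assumption: an auto-generated description has the shape user@host:path.
AUTO_DESCRIPTION = re.compile(r"(?P<user>[^@]+)@(?P<host>[^:]+):(?P<path>.*)$")


def parse_auto_description(description):
    """Return the user/host/path parts if the description looks
    auto-generated, or None if it does not match the pattern."""
    match = AUTO_DESCRIPTION.match(description)
    return match.groupdict() if match else None
```

A hand-written description such as "backup drive, see me@example.com: shelf 3" also matches this pattern, which illustrates the fragility mentioned above.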
Maybe the whole auto-updating idea is not ideal, but a new command like git annex redescribe or git annex autodescribe or git annex describe --auto could be introduced, so users can run it periodically or on-demand. Following the discussion on 'git annex sync defaulting to syncing content', I have a feeling that people wouldn't like git annex messing with their repo descriptions 😉. Ideally, auto-describing would (optionally) also be able to update remotes' descriptions properly.
All of this could also be done by a third-party program, but having this functionality in git annex itself would be handy.
I think a good thing to do if you plan to be moving a repo around is to describe it with something that does not depend on its current location.
For example, suppose I am making a repo on a removable USB drive. I'm gonna literally move that from place to place by plugging it into different computers. So the default user@host:/mntpoint description is not a good one for that repository. Instead I use something like "2 tb passport USB drive", or even better I slap a sticker on that drive and give it a real name and use that as the description.
I guess that someone who was moving a USB drive back and forth between 2 computers would not be enthused if git-annex started updating the description after each move. Even if that prevented the description from being wrong half the time.
I do think that git-annex describe --auto is a reasonable idea.
Note that git-annex describe remote --auto has a small problem when the remote is a git ssh remote: there may be multiple hostnames, and the one that happens to be used locally might not be as fully qualified as the "right" one. Or it may even be an ssh host alias, which can't be converted to a FQDN.
For special remotes, git-annex initremote does not set a description at all, and whatever hostname might be used for one is hidden underneath an abstraction layer anyway. So it couldn't do anything useful for those.
So that limits it to local git repositories.
I do that too (adding unique information about the storage medium, e.g. the HDD's manufacturer, serial number and human-readable description), but still it is important (to me) to have a (be it partly) copy-pastable link/path that simplifies re-adding the remote later elsewhere. In my case, I moved three HDDs (some of them internal, some connected via USB) to a different host and had to change the mountpoints for consistency. On all of these I stored many (>10) git annex repos, so that makes >30 remote descriptions I had to update manually -- probably forgot a lot of those.
Automating this would have helped a lot. A quick'n'dirty way to get FQDN(s) is this:
nslookup "$(curl -s icanhazip.com)" | perl -ne 'print if s|^.*name\s+=\s+(.*)\.$|$1|g'
(doesn't work for DynDNS though...).
I'd agree with you if you said that something like this is too specific to have git annex do it ('it' being updating descriptions of remotes). But if git annex describe here --auto would check if here's description looks auto-generated and, if yes, update it with the default auto-generated one, that would already help. Maybe just git annex describe --auto would suffice, as the here is kind of redundant -- any other remote wouldn't work like this.