Please describe the problem.
When a remote is located on a device (network) that systemd is configured to automount but fails to do so, every git-annex command blocks/waits until the automount times out.
Commands that have to access such a remote (e.g., sync, move) are allowed to block, but commands that only operate on the local repository (e.g., version, add, calckey, find) or on another remote (sync not-doesnotexist, move --to=not-doesnotexist) should not.
The Bash completion is also affected and blocks at every tab.
Probably related: git-annex causes idle hard drives that are present (not missing) and configured as remotes to spin up for no reason, even for local commands and completions.
What steps will reproduce the problem?
Add a non-existing mount point to /etc/fstab:
/dev/sdoesnotexist /mnt/doesnotexist ext4 defaults,noauto,x-systemd.automount,x-systemd.device-timeout=10 0 0
Add a remote pointing to a path on /mnt/doesnotexist:
$ git remote add doesnotexist /mnt/doesnotexist/path/to/repository
Use any git-annex command and wait for at least x-systemd.device-timeout:
$ time git-annex version > /dev/null
real 0m10.433s
user 0m0.171s
sys 0m0.028s
What version of git-annex are you using? On what operating system?
git-annex version: 6.20171214-g61b515d71d
build flags: Assistant Webapp Pairing Testsuite S3(multipartupload)(storageclasses) WebDAV Inotify DBus DesktopNotify ConcurrentOutput TorrentParser MagicMime Feeds
dependency versions: aws-0.18 bloomfilter-2.0.1.0 cryptonite-0.24 DAV-1.3.1 feed-1.0.0.0 ghc-8.2.2 http-client-0.5.7.1 persistent-sqlite-2.6.4 torrent-10000.1.1 uuid-1.3.13 yesod-1.4.5
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 SHA1E SHA1 MD5E MD5 WORM URL
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav tahoe glacier ddar hook external
local repository version: 5
supported repository versions: 3 5 6
upgrade supported from repository versions: 0 1 2 3 4 5
operating system: linux x86_64
Please provide any additional information below.
strace always includes a call to stat("/mnt/doesnotexist/path/to/repository").
Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
I’m very happy with git-annex (thanks) and use it frequently enough to notice this behavior.
fixed via the remote.<name>.annex-checkuuid config setting that can disable this behavior. --Joey
There are a couple of parts to this, so let's get this one out of the way first: Tab completion etc should not be looking at remotes.
It seems that even git annex --help does for some reason; so does stuff like git annex examinekey. So it's happening in a core code-path.
Ah, ok.. Git.Config.read uses Git.Construct.fromRemotes, which uses Git.Construct.fromAbsPath, which stats the remote directory to handle ".git" canonicalization.
Fixed this part of it; now only when the remoteList is built does it stat remotes.
With the above dealt with, the remaining problem is with commands like git annex whereis or git annex info, which don't really act on any remote, but still need to examine the remotes as part of building the remoteList.
git-annex supports remotes that point to a mount point that might have different drives mounted at it at different times. So, it needs to check the git config of the remote each time, to see what repository is currently there.
Even commands like "whereis" and "info" have output that depends on what repository a remote is currently pointing to. In some cases, "whereis" might not output anything that depends on a given remote, so in theory it could avoid looking at the config of that remote. And a command like "git annex copy --to origin" doesn't really need to look at the configs of any other remotes.
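To make that check concrete, here is a minimal sketch. It assumes the local repository caches the remote's UUID as remote.<name>.annex-uuid and that the repository at the mount point records its own as annex.uuid. git-annex does the comparison internally, not with these commands; the throwaway repositories, the remote name "drive", and the UUID are made up for illustration.

```shell
# Throwaway repository standing in for whatever is currently mounted.
remote=$(mktemp -d)
git init -q "$remote"
git -C "$remote" config annex.uuid 00000000-0000-0000-0000-000000000001  # made-up UUID

# Throwaway local repository that remembers the UUID it last saw there.
local_repo=$(mktemp -d)
git init -q "$local_repo"
git -C "$local_repo" remote add drive "$remote"
git -C "$local_repo" config remote.drive.annex-uuid 00000000-0000-0000-0000-000000000001

# Reading the remote's config is the step that touches the mount point,
# and so the step that blocks when an automount hangs.
cached=$(git -C "$local_repo" config --get remote.drive.annex-uuid)
current=$(git -C "$remote" config --get annex.uuid)

if [ "$cached" = "$current" ]; then
    echo "same repository"
else
    echo "different repository mounted"
fi
```

Because the second read goes to the remote's path, any command that builds the full remoteList pays for it once per such remote.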
But to avoid unnecessarily checking the git configs of remotes that a command does not use, each use of the current remoteList would need to be replaced with something else that does the minimal needed work, instead of building the whole remoteList. I think this would be quite complicated.
And, I don't know that it would address the bug report adequately, even if it were done. Running git annex info would still block waiting for the automount; git annex whereis would only sometimes block, depending on where content is.
So instead of that approach, perhaps a config setting will do? A per-remote config that tells git-annex that only one repository should ever be mounted at its location. That would make git-annex avoid checking the git config of that remote each time, except when it's actually storing/dropping content on it.
There would still be some cases where a git-annex command blocks somewhat unexpectedly on the automount.
For one, git annex drop can need to check if content is in a remote, and so would block, despite not acting directly on that remote.
And, git annex get of a file that's located in both such a locally automounted remote and a network remote will default to trying the local remote first, and so would block.
The cost of the automounted remote could be adjusted to make these commands prefer some other remote, but then you've configured git-annex to not use the automounted remote much, which is probably not what you really want to do if it's a fast drive.
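For reference, the cost adjustment could look like the following sketch. It assumes the remote.<name>.annex-cost setting, where lower costs are preferred and the defaults are 100 for local repositories and 200 for remote ones. The value 250 and the throwaway repository are only for illustration.

```shell
# Throwaway repository so the example does not touch a real one.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git remote add doesnotexist /mnt/doesnotexist/path/to/repository

# Make the automounted remote more expensive than a default network
# remote (200), so that it is tried after network remotes.
git config remote.doesnotexist.annex-cost 250
git config --get remote.doesnotexist.annex-cost
```

In a real repository only the git config line is needed.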
Of course, there are also ways to automount removable drives when they get plugged in, rather than using automounts that block on access, and so neatly avoid all blocking problems.
Added remote.<name>.annex-checkuuid config, which can be set to false to disable the default checking of the uuid. It will still check before making any modification of the remote repository.
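For the remote from the bug report, setting it could look like this sketch. The throwaway repository is only there to keep the example self-contained; in a real repository only the git config line is needed.

```shell
# Throwaway repository so the example does not touch a real one.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git remote add doesnotexist /mnt/doesnotexist/path/to/repository

# Tell git-annex not to check the remote's uuid on every run.
git config remote.doesnotexist.annex-checkuuid false
git config --get remote.doesnotexist.annex-checkuuid
```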
There may still be situations where using this kind of automount is suboptimal with git-annex, as outlined in comment 3, but I think this is as far as it makes sense to change git-annex to deal with them.