When running multiple git-annex commands in parallel on different worktrees linked to the same repo, they all may be writing to the git-annex branch. Could this potentially lead to errors/undefined behaviors, or is git-annex designed with this usage in mind?
git-annex is fully safe both with parallel --jobs and with multiple git-annex programs run at the same time. It makes extensive use of locking.
(This assumes that git is also safe when multiple git jobs are running; to the best of my knowledge it is, although its locking around the index file is not very fine-grained.)
Thanks!
I do see some failures during concurrent usage, with the error message
fatal: Unable to create '/data/ilya-work/benchmarks/viral-ngs-benchmarks/.git/index.lock': File exists.
Another git process seems to be running in this repository, e.g. an editor opened by 'git commit'. Please make sure all processes are terminated then try again. If it still fails, a git process may have crashed in this repository earlier: remove the file manually to continue.
It seems that git does not have a built-in wait for the index lock to be released (see https://stackoverflow.com/questions/36208630/is-there-a-way-to-make-git-automatically-retry-commands-if-index-lock-exists). Maybe, for greater robustness, git-annex could add such a wait when it calls git.
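As a stopgap on the caller's side, the wait suggested above could be sketched as a small wrapper that reruns a git command while the failure mentions index.lock. This is only an illustration, not something git or git-annex provides: the name `run_git_with_retry` and its `attempts`, `delay`, and `runner` parameters are made up for this example, and detecting the condition by looking for "index.lock" in stderr is an assumption based on the error message quoted above.

```python
import subprocess
import time


def run_git_with_retry(args, cwd=None, attempts=5, delay=0.5, runner=subprocess.run):
    """Run a git command, retrying while another process holds index.lock.

    Hypothetical helper for illustration; not part of git or git-annex.
    `runner` defaults to subprocess.run and can be swapped out for testing.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        result = runner(["git", *args], cwd=cwd, capture_output=True, text=True)
        # Assumption: lock contention shows up as a failure whose stderr
        # names the index.lock file, as in the message quoted above.
        locked = result.returncode != 0 and "index.lock" in result.stderr
        if not locked:
            break
        if attempt < attempts - 1:
            # Exponential backoff before the next try.
            time.sleep(delay * (2 ** attempt))
    # Either the command ran without lock contention, or we gave up and
    # hand back the last failing result for the caller to inspect.
    return result
```

For example, `run_git_with_retry(["add", "somefile"])` would retry up to five times. Note that this only helps when the lock is held by a live process; a stale index.lock left behind by a crashed git still has to be removed by hand.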
As far as different worktrees go, git-annex mostly doesn't distinguish between multiple worktrees. At the level of annexed content and the git-annex branch, there is no difference between one worktree and another, so it will be just the same as running multiple git-annex commands in the same worktree.
During parallel running of git-annex commands, I also get errors like
git-annex: .git/annex/othertmp/inge59014-3: getFileStatus: does not exist (No such file or directory) failed
git-annex: .git/annex/othertmp/ingest-assemble_den59014-8: removeLink: does not exist (No such file or directory) failed
Is this also now fixed by "only allow one git queue to be flushed at a time", or is it a separate issue?
Also, is it correct that, when using git-annex, linked worktrees must be on the same filesystem as the main worktree, because they need to be on the same filesystem as .git/annex/othertmp?
withOtherTmp file escapes
I don't think you need to use the same filesystem, but it will be slower when git-annex can't hard link and has to fall back to copying.
I think that has a good chance of being out of date; I can't think of any use of that temp directory that doesn't gracefully handle the case of a hard link into it failing.
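To make the "hard link, else copy" fallback concrete, here is a minimal sketch of the general pattern. It is not git-annex's actual code (git-annex is written in Haskell); the function names `same_filesystem` and `link_or_copy` and the set of error codes treated as "linking is not possible here" are assumptions made for this example.

```python
import errno
import os
import shutil

# Assumed set of errors that mean "cannot hard link here": different
# filesystems, linking not permitted, or too many links to the file.
LINK_FALLBACK_ERRNOS = {errno.EXDEV, errno.EPERM, errno.EMLINK}


def same_filesystem(path_a, path_b):
    """Return True if both existing paths are on the same device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def link_or_copy(src, dst):
    """Hard link src to dst, falling back to a copy if linking is impossible.

    Returns "link" or "copy" to tell the caller which path was taken.
    Other errors (for example, dst already existing) are re-raised.
    """
    try:
        os.link(src, dst)
        return "link"
    except OSError as e:
        if e.errno not in LINK_FALLBACK_ERRNOS:
            raise
        # Slower path: duplicate the data and metadata instead of linking.
        shutil.copy2(src, dst)
        return "copy"
```

A worktree on another filesystem would take the "copy" branch every time, which is where the slowdown mentioned above would come from; `same_filesystem(worktree, os.path.join(gitdir, "annex"))` with your own paths is a quick way to check which case applies.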