Please describe the problem.
A very basic operation gets stuck (isn't there a unit test for this?)
What steps will reproduce the problem?
#!/bin/bash
export PS4='> '
set -x
set -eu
cd "$(mktemp -d "${TMPDIR:-/tmp}/dl-XXXXXXX")"
(
mkdir src
cd src
git init
git annex init
touch 123
git add 123
git commit -m 123 123
)
#git clone --shared src dest
git clone src dest
(
cd dest
#git annex init
# would stall unless we git annex init above with 8.20200810+git5-gb41f77445-1~ndall+1
git annex get 123
)
ls -lLi {dest,src}/123
What version of git-annex are you using? On what operating system?
8.20200810+git47-g27329f0bb-1~ndall+1
Please provide any additional information below.
$> bash check-annex-hardlink.sh
> set -eu
>> mktemp -d /home/yoh/.tmp/dl-XXXXXXX
> cd /home/yoh/.tmp/dl-sgm3azv
> mkdir src
> cd src
> git init
Initialized empty Git repository in /home/yoh/.tmp/dl-sgm3azv/src/.git/
> git annex init
init (scanning for unlocked files...)
ok
(recording state in git...)
> touch 123
> git add 123
> git commit -m 123 123
[master (root-commit) 8e97f2f] 123
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 123
> git clone src dest
Cloning into 'dest'...
done.
> cd dest
> git annex get 123
(merging origin/git-annex into git-annex...)
(scanning for unlocked files...)
^C
Note: the script uses git add instead of git annex add (not that get should have gotten stuck).

Your script doesn't stall on my end. I tried with the latest git-annex release, 8.20200810, and with the commit you report (27329f0bb).
Thank you Kyle! Your comment reminded me that yesterday, while I was trying to reproduce a stalling issue on NFS (not yet reported here), I had set pidlock=true globally!
Here is an adjusted script which sets it in the clone and causes the stall (also changed to use git annex add instead of git add)
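The adjusted script itself is not reproduced here. Going by the description, the changes relative to the first script would amount to roughly the following fragment. This is a sketch, not the original script, and where exactly the config line was placed is an assumption.

```bash
# In src: add the file to the annex rather than to git directly.
git annex add 123
git commit -m 123 123

# In dest, before `git annex get 123`: enable pid locking for this clone only
# (assumed placement; the original adjusted script is not shown).
git config annex.pidlock true
```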
This is the same class of bug that 82448bdf39bbe2b4fb6e0bac0735b845d52e189a fixed.
After that fix, I audited for other bugs in this class and tried to fix them too, in 96f6aa39dda335bf9aa25ed8a67756e31bd307c2. What that commit missed is that runsGitAnnexChildProcess only affects running git commands, whereas in all 3 places touched by that commit, git-annex runs other, non-git commands that in turn run git-annex. So all 3 of those can still deadlock like this. Worse, that commit made them always deadlock, rather than only theoretically deadlock when some other part of git-annex happened to have taken the pid lock at the same time.
(Not really a deadlock, I think: it waits for annex.pidlocktimeout and then fails after 5 minutes.)
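The failure mode described above can be illustrated with a small bash sketch. This is not git-annex code: `try_lock`, `demo`, and the `DEMO_LOCK_HELD` variable are hypothetical names, a lock directory stands in for the pid lock, and subshells stand in for child processes. Passing the "parent holds the lock" fact through an environment variable is only one possible approach, assumed here for illustration.

```bash
#!/bin/bash
# Illustration only: a parent that holds a lock runs a child that wants the
# same lock. try_lock, demo and DEMO_LOCK_HELD are hypothetical names.
try_lock() {
    local lockdir=$1 timeout=$2 waited=0
    # A child that has been told its parent holds the lock skips taking it.
    if [ "${DEMO_LOCK_HELD:-}" = "$lockdir" ]; then
        return 0
    fi
    # mkdir is atomic: it succeeds only if nobody else holds the lock.
    until mkdir "$lockdir" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # gave up after the timeout
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

demo() {
    local lockdir
    lockdir="$(mktemp -d)/pidlock"
    try_lock "$lockdir" 1 || return 2    # the parent takes the lock

    # Child unaware of the parent's lock: waits for the timeout, then fails.
    if (try_lock "$lockdir" 1); then
        echo "unaware child: ok"
    else
        echo "unaware child: timed out"
    fi

    # Child told that the parent holds the lock: proceeds immediately.
    if (DEMO_LOCK_HELD="$lockdir"; try_lock "$lockdir" 1); then
        echo "informed child: ok"
    else
        echo "informed child: timed out"
    fi

    rmdir "$lockdir"
}
```

Calling `demo` shows the unaware child giving up after the (here 1 second) timeout, which mirrors the wait-then-fail behaviour rather than a true deadlock.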
I can trigger it with annex.pidlock=true. It looks like the hang starts with 96f6aa39d (add runsGitAnnexChildProcess calls, 2020-06-17).
Although this is now fixed, it shows that when annex.pidlock blocks for a long time, a message should be displayed. Perhaps git-annex could first try to take the lock, and only display the message if it needs to wait for it.
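The "try first, only announce when waiting" idea could look something like the following bash sketch. It is a hypothetical helper, not how git-annex implements locking: `take_lock` is an invented name, a lock directory stands in for the pid lock, and the timeout is given in whole seconds.

```bash
#!/bin/bash
# Hypothetical helper, not git-annex code: take a lock directory, printing a
# notice only when the first attempt fails and we actually have to wait.
take_lock() {
    local lockdir=$1 timeout=$2 waited=0
    # Fast path: mkdir is atomic, so success means we own the lock. Stay quiet.
    if mkdir "$lockdir" 2>/dev/null; then
        return 0
    fi
    # Slow path: someone else holds the lock, so tell the user before blocking.
    echo "waiting up to ${timeout}s for lock $lockdir ..." >&2
    while [ "$waited" -lt "$timeout" ]; do
        sleep 1
        waited=$((waited + 1))
        if mkdir "$lockdir" 2>/dev/null; then
            return 0
        fi
    done
    echo "timed out waiting for lock $lockdir" >&2
    return 1
}
```

With this shape, the common uncontended case prints nothing, and a user who hits a held lock sees why the command appears to hang instead of staring at a silent terminal.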