Please describe the problem.
eg from this recent run
Tests
Repo Tests v10 adjusted unlocked branch
Init Tests
init: OK (0.43s)
add: OK (0.83s)
sop crypto: OK
upgrade: OK (0.52s)
conflict resolution (uncommitted local file): OK (4.99s)
adjusted branch merge regression: OK (1.09s)
describe: OK (0.62s)
fsck (local untrusted): OK (1.60s)
lock --force: OK (2.29s)
drop (untrusted remote): OK (1.69s)
view: OK (0.91s)
git-remote-annex: FAIL (3.01s)
./Test/Framework.hs:86:
git clone from special remote failed with unexpected exit code (transcript follows)
Cloning into 'clonedir'...
Detected a filesystem without fifo support.
Disabling ssh connection caching.
Detected a crippled filesystem.
Entering an adjusted branch where files are unlocked as this filesystem does not support locked files.
Switched to branch 'adjusted/master(unlocked)'
error: Untracked working tree file 'bar.c' would be overwritten by merge.
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'
Use -p '/git-remote-annex/' to rerun this test only.
1 out of 12 tests failed (17.99s)
Overall, this seems to have started failing about a week ago:
167 T Nov 17 GitHub Actions datalad/git-annex daily summary: 20 PASSED, 10 FAILED, 1 ABSENT
238 T Nov 16 GitHub Actions datalad/git-annex daily summary: 20 PASSED, 10 FAILED, 1 ABSENT
348 T Nov 15 GitHub Actions datalad/git-annex daily summary: 23 PASSED, 7 FAILED, 1 ABSENT
890 T Nov 14 GitHub Actions datalad/git-annex daily summary: 23 PASSED, 7 FAILED, 1 ABSENT
1676 T Nov 13 GitHub Actions datalad/git-annex daily summary: 22 PASSED, 8 FAILED, 1 ABSENT
2032 T Nov 12 GitHub Actions datalad/git-annex daily summary: 23 PASSED, 7 FAILED, 1 ABSENT
2561 T Nov 11 GitHub Actions datalad/git-annex daily summary: 30 PASSED, 1 ABSENT
although the first failure, on OSX, was a bit different:
Repo Tests v10 locked
Init Tests
init: OK (0.43s)
add: OK (1.17s)
sop crypto: OK
upgrade: OK (0.62s)
conflict resolution (uncommitted local file): OK (5.93s)
adjusted branch merge regression: OK (7.74s)
describe: OK (0.92s)
fsck (local untrusted): OK (1.87s)
lock --force: OK (1.64s)
drop (untrusted remote): OK (1.38s)
view: OK (1.48s)
git-remote-annex: FAIL (2.95s)
./Test/Framework.hs:86:
git clone from special remote failed with unexpected exit code (transcript follows)
Cloning into 'clonedir'...
git-annex: No git repository found in this remote.
Use -p '/git-remote-annex/' to rerun this test only.
This is a new test.
Looks like it's found a legitimate bug in git-remote-annex. When the filesystem is crippled, the git-annex init checks out an adjusted branch, which here happens in the middle of git's own checkout and so legitimately confuses git.
I can reproduce this on a FAT filesystem, cloning from eg a directory special remote. Fixed this.
(The OSX failure is something else.)
Re the OSX failure, it seems that somehow the manifest key is not being found when the test is run on OSX. I don't know why. There is nothing in this code that should be OSX-specific.
Unfortunately I do not have access to any OSX system to try to investigate this. The "datalads-mac" I used to use does not seem to exist anymore.
Of course, this test could be skipped on OSX.
It does occur to me that this could somehow be exposing a deeper problem on OSX with exporttree special remotes. I have split the failing test in two, so we'll see if both fail, or only the exporttree one.
Aha, this test on ubuntu is failing the same way as the OSX test:
https://github.com/datalad/git-annex/actions/runs/11905453897/job/33176247387
It seems that "custom-config1" only involves an annex.stalldetection setting, if I am reading the workflow file right. I was not able to reproduce the failure with that config set though.
joey@datalads-imac2
fromsmaug
My arm64-ancient build failed today in the same way as the OSX build is failing, so I should be able to debug it there.
Huh ok, so git-remote-annex is failing to push, which is why clone later fails. And for whatever reason git doesn't propagate the error, which is why this is not visible in the transcript.
That build uses git 2.30.2. In that version, git bundle --stdin was broken and didn't read refs from stdin at all. It also had other bugs. I think it's best not to try to support git-remote-annex with that version of git at all, given those bugs.
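A version gate like the one described above could look roughly like this. It is only a sketch: the function names are made up for illustration, and the minimum version is passed in by the caller because this thread does not say which git release fixed git bundle --stdin.

```python
import re


def parse_git_version(text):
    """Extract (major, minor, patch) from `git --version` style output.

    A missing patch component is treated as 0.
    """
    m = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if m is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(p) if p is not None else 0 for p in m.groups())


def git_new_enough(version_text, minimum):
    """True when the reported git version is at least `minimum`.

    `minimum` is a (major, minor, patch) tuple chosen by the caller;
    this sketch does not know the real cutoff.
    """
    return parse_git_version(version_text) >= minimum
```

For example, with a placeholder cutoff of (2, 31, 0), which is an assumption and not something established here, `git_new_enough("git version 2.30.2", (2, 31, 0))` is False.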
That probably won't help with the OSX failure, which is with a very new git version. So I also made the test suite capture the git push output even when it exits successfully, so it can display it when the git pull fails. That should show what the problem is there.
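The idea of keeping the push output around even when the push exits successfully can be sketched as follows. This is not the test suite's actual code (that lives in Haskell); the function names here are hypothetical and the commands are whatever the caller passes in.

```python
import subprocess
import sys


def run_captured(cmd):
    """Run cmd, returning (exit_code, combined stdout and stderr text)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return proc.returncode, proc.stdout


def push_then_clone(push_cmd, clone_cmd):
    """Run the push, then the clone; return (ok, transcript).

    The push output is kept even when the push exits 0, and is added
    to the transcript only if the later clone fails.
    """
    push_code, push_out = run_captured(push_cmd)
    if push_code != 0:
        return False, "push failed:\n" + push_out
    clone_code, clone_out = run_captured(clone_cmd)
    if clone_code != 0:
        return False, (
            "clone failed:\n" + clone_out
            + "\noutput of the earlier (successful) push:\n" + push_out
        )
    return True, ""
```

With this shape, a push that prints an error but still exits 0 is no longer invisible: its output shows up next to the failure it caused.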
And here's why it's failing still on OSX and that 1 ubuntu "custom-config1" run:
Fascinating. It seems that git-remote-annex has been run twice. The first time seemed to do something successfully, since it reported the "Full remote url". Probably that first run is git using it to see what refs are on the remote.
The second time, git ran git-remote-annex with only 1 argument, rather than the expected 2. Why would git do that? And only in these few situations?
According to gitremote-helpers:
But that does not apply. The docs don't seem to give any other reason why the second argument would be omitted. Although the docs do say it's optional.
I've improved git-remote-annex output in this situation, so it will show what the first parameter is. That might help figure out what git is trying to do here.
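As a rough illustration of that kind of diagnostic, here is a small sketch of a remote helper accepting either one or two arguments and reporting the first one when the url is missing. The function names and message wording are invented for the example and do not reflect git-remote-annex's real output.

```python
def parse_helper_args(argv):
    """Split a remote helper's arguments into (remote, url).

    argv excludes the program name. url is None when it was omitted.
    Any other number of arguments raises ValueError.
    """
    if len(argv) == 2:
        return argv[0], argv[1]
    if len(argv) == 1:
        return argv[0], None
    raise ValueError("expected 1 or 2 arguments, got %r" % (argv,))


def describe_invocation(argv):
    """Describe how the helper was invoked, naming the first parameter."""
    remote, url = parse_helper_args(argv)
    if url is None:
        return "only 1 parameter given: %r (no url)" % remote
    return "remote %r, url %r" % (remote, url)
```

Calling `describe_invocation(["transferrer"])` makes it obvious that the lone argument is not a remote name at all.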
Apparently git is running "git-remote-annex transferrer".
This must be due to git-remote-annex running "$0 transferrer" instead of "git-annex transferrer"!
In the usual case, when git-remote-annex is a symlink to git-annex, getExecutablePath returns "git-annex". But, if git-remote-annex is a hardlink or copy, that returns "git-remote-annex" instead.
And in the linux standalone tarball and OSX app, it does not use getExecutablePath but getProgName, so it is "git-remote-annex" there as well.
And the specific reason these test cases are failing is because they have annex.stalldetection set, which needs to run the transferrer.
Fixed this.
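The mix-up described above can be sketched like this. It is one possible way to guard against it, not necessarily the fix that was committed; `command_for_subprocess` is a hypothetical name, and the sketch assumes git-annex sits in the same directory as git-remote-annex.

```python
import os


def command_for_subprocess(invoked_as, subcommand):
    """Build the argv for re-running git-annex with a subcommand.

    invoked_as is whatever the process learned about its own name or
    path. When that is the helper name, the git-annex program next to
    it is used instead, so git is never asked to run
    "git-remote-annex transferrer".
    """
    directory, name = os.path.split(invoked_as)
    if name == "git-remote-annex":
        name = "git-annex"
    return [os.path.join(directory, name), subcommand]
```

Whether the program was started as git-annex or as a hardlinked git-remote-annex, `command_for_subprocess(name, "transferrer")` yields a command whose first element is git-annex.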