Please describe the problem.
original report: https://github.com/datalad/datalad/issues/3583
What steps will reproduce the problem?
The initial get -J8 largely fails (it is OK for 1 file, in one thread I guess), but a subsequent run succeeds:
hopa:/tmp
$> git clone https://github.com/OpenNeuroDatasets/ds000248 && cd ds000248 && git annex init && git annex get -J 8 sub-01
Cloning into 'ds000248'...
remote: Enumerating objects: 161, done.
remote: Total 161 (delta 0), reused 0 (delta 0), pack-reused 161
Receiving objects: 100% (161/161), 18.55 KiB | 863.00 KiB/s, done.
Resolving deltas: 100% (20/20), done.
CHANGES acq-epi_T1w.json acq-flipangle05_run-01_MEFLASH.json acq-flipangle30_run-01_MEFLASH.json dataset_description.json derivatives/ sub-01/ sub-emptyroom/
init (merging origin/git-annex into git-annex...)
(recording state in git...)
Remote origin not usable by git-annex; setting annex-ignore
(Auto enabling special remote s3-PUBLIC...)
ok
(recording state in git...)
get sub-01/meg/sub-01_task-audiovisual_run-01_meg.fif (from s3-PUBLIC...)
Remote is configured to use versioning, but no S3 version ID is recorded for this key
unknown export location
Unable to access these remotes: s3-PUBLIC
Try making some of these repositories available:
82a4b182-753f-4d93-a59e-20cfdd4d4237 -- [s3-PUBLIC]
e3612a8a-0c48-4374-9bfb-888f4010be54 -- root@1f69c4ed80cf:/datalad/ds000248
(Note that these git remotes have annex-ignore set: origin)
failed
get sub-01/anat/sub-01_acq-flipangle30_run-01_MEFLASH.nii.gz (from s3-PUBLIC...)
Remote is configured to use versioning, but no S3 version ID is recorded for this key
unknown export location
Unable to access these remotes: s3-PUBLIC
Try making some of these repositories available:
82a4b182-753f-4d93-a59e-20cfdd4d4237 -- [s3-PUBLIC]
e3612a8a-0c48-4374-9bfb-888f4010be54 -- root@1f69c4ed80cf:/datalad/ds000248
(Note that these git remotes have annex-ignore set: origin)
failed
get sub-01/anat/sub-01_T1w.nii.gz (from s3-PUBLIC...)
Remote is configured to use versioning, but no S3 version ID is recorded for this key
unknown export location
Unable to access these remotes: s3-PUBLIC
Try making some of these repositories available:
82a4b182-753f-4d93-a59e-20cfdd4d4237 -- [s3-PUBLIC]
e3612a8a-0c48-4374-9bfb-888f4010be54 -- root@1f69c4ed80cf:/datalad/ds000248
(Note that these git remotes have annex-ignore set: origin)
failed
get sub-01/anat/sub-01_acq-flipangle05_run-01_MEFLASH.nii.gz (from s3-PUBLIC...)
Remote is configured to use versioning, but no S3 version ID is recorded for this key
(checksum...) ok
(recording state in git...)
git-annex: get: 3 failed
$> git annex get -J 8 sub-01
get sub-01/anat/sub-01_T1w.nii.gz (from s3-PUBLIC...)
Remote is configured to use versioning, but no S3 version ID is recorded for this key
(checksum...) ok
get sub-01/anat/sub-01_acq-flipangle30_run-01_MEFLASH.nii.gz (from s3-PUBLIC...)
Remote is configured to use versioning, but no S3 version ID is recorded for this key
(checksum...) ok
get sub-01/meg/sub-01_task-audiovisual_run-01_meg.fif (from s3-PUBLIC...)
Remote is configured to use versioning, but no S3 version ID is recorded for this key
(checksum...) ok
(recording state in git...)
What version of git-annex are you using? On what operating system?
7.20190708+git36-g32d526164-1~ndall+1
fixed in 7.20200219-129-ge52034150 AKA 8.20200226~12 --yarikoptic
Looking in the git-annex branch of this repository for information about a key, such as MD5E-s10555001--f8bc87e8841634b3d2f9ac0ba85d0a83.nii.gz.log, which is one of the files that fails to download, all there is is this:
So location log says it's in s3-PUBLIC, but in fact no S3 version id has been recorded.
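For anyone wanting to repeat this inspection, a minimal sketch follows. It assumes a clone that already has a local git-annex branch (git annex init merges it from origin, as in the transcript above). The helper name annex_branch_files is hypothetical, and only plain git commands are used.

```shell
# Hypothetical helper: list the files on the git-annex branch whose path
# mentions a given key (the location log and any related per-key files).
# $1 = path to the repository, $2 = the key, without the .log suffix.
annex_branch_files() {
    git -C "$1" ls-tree -r --name-only git-annex | grep -F -- "$2"
}

# Usage inside a clone, followed by printing one of the listed files:
#   annex_branch_files . MD5E-s10555001--f8bc87e8841634b3d2f9ac0ba85d0a83.nii.gz
#   git show "git-annex:<path printed by the helper>"
```

If only a .log file is listed for the key, that matches the situation described here: a location log entry with no recorded S3 version id.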
And there were old bugs that prevented the recording of the S3 version id. Notably, it used to be possible to set versioning=yes on an existing S3 remote, and the files already stored in it necessarily lacked version ids then. That has been fixed.
So, is it still possible to reproduce creating a repository with this problem?
No, it would not be easily possible to reproduce such a dataset, but there is an ample number of them now available from openneuro. Another example is ds000115, where no versioning information seems to be available either.
But it seems to be getting files just fine on a fast box with fast network (smaug) and annex 7.20190819+git2-g908476a9b-1~ndall+1
while it fails to get all files upon initial attempt on my laptop with annex 7.20200204+git62-gcc4521068-1~ndall+1
or 7.20190819+git2-g908476a9b-1~ndall+1 as well. So it seems to boil down to slow(er) networking causing annex to give up too quickly.
Running get with --debug shows only a single Request being sent, and adding -c annex.retry=2 seems to be sufficient to work around the issue; get then works for me even on the laptop.
It turns out I had annex.retry 3 set at the user level (~/.gitconfig) on the fast box, which is why I didn't experience it there (the disadvantage of all the workaround customizations, heh heh). Removing that setting reproduces it on the fast box as well, so it doesn't have much to do with network speed etc.
How do I produce such a repo? I thought that git-annex had fixed the problem that made it not include the S3 versioning information. I don't want to see a lot of repos being created with that information missing.
Anyway, the S3 version ID is a red herring; the failure is actually caused by the export db not getting populated from the git-annex branch before some threads try to use it. Remote.Helper.ExportImport has an updateexportdb that lets one thread update the db, but other threads don't block waiting for it. Easily fixed.
Sorry if I was not clear. Those repositories were produced in 2018, when the available/used version of git annex was not yet "fixed". Newer datasets have .rmet files, e.g. I just checked https://github.com/OpenNeuroDatasets/ds002596 (full of .rmet files in the git-annex branch ;))
Thanks. FTR in 7.20200219-129-ge52034150
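For git-annex versions that predate the fix, the annex.retry workaround reported earlier in this thread can be sketched as follows. This is only an illustration: it records the setting in a scratch repository so that it runs without git-annex or network access, and the exact placement of -c in the commented one-off command is an assumption (the thread only says that -c annex.retry=2 was added).

```shell
# Sketch only: record annex.retry in a scratch repository's config.
# In a real clone, run the "git config" line inside that clone instead,
# or use "git config --global annex.retry 3" for a user-level setting.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config annex.retry 2
retry=$(git -C "$repo" config --get annex.retry)
echo "annex.retry=$retry"

# One-off form (needs git-annex and network access, so it is left
# commented out here; the placement of -c is an assumption):
#   git -c annex.retry=2 annex get -J 8 sub-01
```

A user-level setting hides the failure on every repository on that machine, which is how the problem went unnoticed on the fast box, so the per-invocation form is the safer choice while testing.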