Please describe the problem.
original report https://github.com/dandi/dandisets/issues/168
- likely to happen due to dealing with lots of non-large files
happens only for non-large .json files (due to
* annex.largefiles=((mimeencoding=binary)and(largerthan=0))
)our log for a sample file looks like
(base) dandi@drogon:/mnt/backup/dandi/dandisets/000108/.git/dandi/logs$ grep 'sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-6_SPIM.json' sync-20220608125450Z-1944539.log
2022-06-08T09:34:27-0400 [INFO ] backups2datalad sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-6_SPIM.json: Syncing
2022-06-08T09:34:27-0400 [INFO ] backups2datalad sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-6_SPIM.json: Not in dataset; will add
2022-06-08T09:34:27-0400 [DEBUG ] backups2datalad sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-6_SPIM.json: Fetching bucket URL
2022-06-08T09:34:27-0400 [DEBUG ] backups2datalad sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-6_SPIM.json: About to query S3
2022-06-08T09:34:28-0400 [DEBUG ] backups2datalad sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-6_SPIM.json: Got bucket URL
2022-06-08T09:34:28-0400 [INFO ] backups2datalad sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-6_SPIM.json: File is text; sending off for download from https://dandiarchive.s3.amazonaws.com/blobs/4a8/75e/4a875eb7-ab7c-4d54-b962-671b7c35a9d6?versionId=5dwackcdjuYtMeNZUh23k0rIpU7tRqbn
2022-06-08T09:34:28-0400 [INFO ] backups2datalad sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-6_SPIM.json: Downloading from https://dandiarchive.s3.amazonaws.com/blobs/4a8/75e/4a875eb7-ab7c-4d54-b962-671b7c35a9d6?versionId=5dwackcdjuYtMeNZUh23k0rIpU7tRqbn
2022-06-08T09:34:28-0400 [TRACE ] backups2datalad Sending to addurl command: 'https://dandiarchive.s3.amazonaws.com/blobs/4a8/75e/4a875eb7-ab7c-4d54-b962-671b7c35a9d6?versionId=5dwackcdjuYtMeNZUh23k0rIpU7tRqbn sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-6_SPIM.json\n'
2022-06-08T09:34:28-0400 [TRACE ] backups2datalad Decoded line from addurl command: '{"command":"addurl","note":"to sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-6_SPIM.json\\nnon-large file; adding content to git repository","success":false,"input":["https://dandiarchive.s3.amazonaws.com/blobs/4a8/75e/4a875eb7-ab7c-4d54-b962-671b7c35a9d6?versionId=5dwackcdjuYtMeNZUh23k0rIpU7tRqbn sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-6_SPIM.json"],"error-messages":["git-annex: user error (xargs [\\"-0\\",\\"git\\",\\"--git-dir=.git\\",\\"--work-tree=.\\",\\"--literal-pathspecs\\",\\"-c\\",\\"filter.annex.smudge=\\",\\"-c\\",\\"filter.annex.clean=\\",\\"-c\\",\\"filter.annex.process=\\",\\"add\\",\\"-f\\",\\"--\\"] exited 123)"],"file":"sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-6_SPIM.json"}\n'
2022-06-08T09:34:28-0400 [ERROR ] backups2datalad sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-6_SPIM.json: download failed: git-annex: user error (xargs ["-0","git","--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","filter.annex.smudge=","-c","filter.annex.clean=","-c","filter.annex.process=","add","-f","--"] exited 123)
- that file is downloaded but not added to git (according to git status)
- dedicated
git add
on that file after proceeds fine
seems to happen for some files more than for the others (grepping across logs across multiple runs) -- 38 different files failed, some up to 4 times
(base) dandi@drogon:/mnt/backup/dandi/dandisets/000108/.git/dandi/logs$ awk '{print $5;}' < /tmp/108-download-fails.log | sort | uniq -c | sort -n | nl | tail
29 2 sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-LEC_run-1_chunk-7_SPIM.json:
30 2 sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-1_SPIM.json:
31 2 sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-5_SPIM.json:
32 2 sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-YO_run-1_chunk-6_SPIM.json:
33 2 sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-YO_run-1_chunk-7_SPIM.json:
34 2 sub-MITU01/ses-20220315h17m32s36/micr/sub-MITU01_ses-20220315h17m32s36_sample-25_stain-NN_run-1_chunk-1_SPIM.json:
35 2 sub-MITU01/ses-20220315h17m32s36/micr/sub-MITU01_ses-20220315h17m32s36_sample-25_stain-NN_run-1_chunk-7_SPIM.json:
36 3 sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-LEC_run-1_chunk-4_SPIM.json:
37 4 sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-6_SPIM.json:
38 4 sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-7_SPIM.json:
- ATM there is about 3k of staged "new file" .json files and 4 not staged (those errors)
- invocation was
git-annex addurl --batch --with-files --jobs 5 --json --json-error-messages --json-progress --raw
What steps will reproduce the problem?
haven't tried to create reproducer yet. may be I could just grep log for all those addurls and provide text file to feed annex addurls..
What version of git-annex are you using? On what operating system?
10.20220525+git57-ge796080f3-1~ndall+1
before that was some other recent -- same problem
(base) dandi@drogon:/mnt/backup/dandi/dandisets/000108$ /home/dandi/git-annexes/10.20220525+git57-ge796080f3-1~ndall+1/usr/lib/git-annex.linux/git --version
git version 2.30.2
(base) dandi@drogon:/mnt/backup/dandi/dandisets/000108$ /home/dandi/git-annexes/10.20220525+git57-ge796080f3-1~ndall+1/usr/lib/git-annex.linux/git-annex version
git-annex version: 10.20220525+git57-ge796080f3-1~ndall+1
...
Please advise on how to debug or share ideas on what could lead to it or what to try.
Well, this is a case where
git add
is failing for some reason. It seems very likely that it output some error message to stderr, so I'd look for that in the log. It's not clear to me if your grep would show that.If
git add
was silent on stderr, then I'd move on to thinking it must have crashed (or have a weird bug). The 123 exit code is from xargs and only indicates thatgit add
exited nonzero, so we don't know what the actual exit code was. But it seems thatgit add
did not die of a segfault; it if had the exit code from xargs would be 125.Since git-annex is batching up some number of files and sending them all to a single
git add
, the files that come last would fail to be added if any earlier file somehow causedgit add
to give up. So that could explain why it happens more often for some files.