projects/dandi/bugs-done/addurl failure has empty error-messagesjwodderhttp://git-annex.branchable.com/projects/dandi/bugs-done/addurl_failure_has_empty_error-messages/git-annexikiwiki2023-01-05T17:30:31Zcomment 1http://git-annex.branchable.com/projects/dandi/bugs-done/addurl_failure_has_empty_error-messages/comment_1_86b9cb50635c8ef8cf5918ab9a76d013/joey2023-01-05T17:30:31Z2021-10-27T16:23:52Z
<p>Is it reproducible with a particular url? Does it only happen with -J?</p>
<p>Version would also be good to know. There were recent relevant
changes eg <a href="http://source.git-annex.branchable.com/?p=source.git;a=commitdiff;h=4f42292b13dc5a6664eeb19b5c9d48991eaef292">4f42292b13dc5a6664eeb19b5c9d48991eaef292</a>.</p>
<p>I've spent a while hunting for a code path where it fails without
displaying a warning, and have not found one. But since the code in addurl
is structured as "return Nothing and hopefully display a warning
beforehand", rather than as "throw an error", it's certainly possible that
such a path exists.</p>
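<p>To illustrate why that structure can lose messages, here is a minimal sketch in Python (not git-annex's actual Haskell code). All of the names, including <code>take_lock</code> and <code>fetch</code>, are hypothetical. When failure is signalled by returning <code>None</code>, any branch that forgets to warn first fails silently; raising an error cannot.</p>

```python
import logging

log = logging.getLogger("addurl_sketch")  # hypothetical logger name


def download_returning_none(url, fetch):
    """Signal failure by returning None, after (hopefully) warning."""
    try:
        return fetch(url)
    except OSError as e:
        log.warning("download of %s failed: %s", url, e)
        return None


def lock_then_download(url, take_lock, fetch):
    """One branch here returns None without warning first."""
    if not take_lock(url):
        return None  # silent failure: nothing was reported on this branch
    return download_returning_none(url, fetch)


def lock_then_download_raising(url, take_lock, fetch):
    """With exceptions, every failure carries its own message."""
    if not take_lock(url):
        raise RuntimeError(
            "transfer already in progress, or unable to take transfer lock"
        )
    return fetch(url)
```

<p>In the first style the caller sees only <code>None</code> and has nothing to put into an error message unless every failing branch remembered to emit one.</p>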
comment 2http://git-annex.branchable.com/projects/dandi/bugs-done/addurl_failure_has_empty_error-messages/comment_2_2a151ccc0d9ef464df1452adc6ca449a/jwodder2023-01-05T17:30:31Z2021-10-27T18:16:43Z
It appears that the problem occurs whenever one tries to download the same URL to two different paths at the same time. When this occurs, one of the downloads fails, and though its "error-messages" is empty, its "notes" field reads, "transfer already in progress, or unable to take transfer lock".
comment 3http://git-annex.branchable.com/projects/dandi/bugs-done/addurl_failure_has_empty_error-messages/comment_3_1732be4c1d48acefcf3174cf8a7c8434/jwodder2023-01-05T17:30:31Z2021-10-27T18:19:23Z
As to your questions, I am using git-annex 8.20211011 on macOS 11.6. The problem does not occur when the <code>--jobs</code> option is omitted, but that's not viable for the current project we're using git-annex for.
comment 4http://git-annex.branchable.com/projects/dandi/bugs-done/addurl_failure_has_empty_error-messages/comment_4_74e37e96d01bfb8b9521000d5faa7e53/joey2023-01-05T17:30:31Z2021-10-27T18:40:48Z
<p>Aha, that makes sense! addurl constructs a url-based Key to use while
downloading, and the key transfer machinery prevents redundant downloads
of the same Key at the same time.</p>
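<p>A rough Python model of that behaviour, assuming a per-Key lock that is taken without blocking. The helper names and the key format are invented for illustration and are not git-annex's real ones. Because the key depends only on the url, two paths that share a url contend for the same lock, and the loser fails immediately.</p>

```python
import hashlib
import threading

_locks_guard = threading.Lock()
_transfer_locks = {}  # key -> threading.Lock, one per in-flight key


def url_key(url):
    """Hypothetical stand-in for a url-based key: derived from the url only."""
    return "urlkey-" + hashlib.sha256(url.encode()).hexdigest()[:16]


def try_transfer(url, path, do_download):
    """Download url to path unless the same key is already being transferred."""
    key = url_key(url)
    with _locks_guard:
        lock = _transfer_locks.setdefault(key, threading.Lock())
    # Non-blocking acquire: a second transfer of the same key is refused.
    if not lock.acquire(blocking=False):
        return (path, False,
                "transfer already in progress, or unable to take transfer lock")
    try:
        do_download(url, path)
        return (path, True, "")
    finally:
        lock.release()
```

<p>Note that the destination path plays no part in the lock, which is why "same url, different paths" is enough to trigger the failure under <code>--jobs</code>.</p>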
<p>Arguably, the problem is not where the message gets put, but that
it fails when adding an url to two different paths at the same time.</p>
<p>I have, though, moved that message so it will appear in error-messages.</p>
comment 5http://git-annex.branchable.com/projects/dandi/bugs-done/addurl_failure_has_empty_error-messages/comment_5_1b87390e69204b42878930cc1614437e/joey2023-01-05T17:30:31Z2021-10-27T18:56:23Z
<p>The best solution I can find is for it to notice when another thread is
downloading the same url, and wait until it finishes. Then proceed
with downloading the url for a second time.</p>
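<p>The waiting approach might look roughly like this in Python. This is a sketch under the same assumptions as before: a lock per url, with hypothetical <code>fetch</code> and <code>store</code> parameters, and it is not the actual implementation. The second thread blocks until the first is done and then fetches the url again.</p>

```python
import threading

_guard = threading.Lock()
_url_locks = {}  # url -> threading.Lock


def download_waiting(url, path, fetch, store):
    """Wait for any in-flight download of url, then download it ourselves."""
    with _guard:
        lock = _url_locks.setdefault(url, threading.Lock())
    # Blocking acquire: wait for the other thread instead of failing.
    with lock:
        # The url is fetched again even if another thread has just fetched it.
        store[path] = fetch(url)
    return True
```

<p>The cost is one extra download per duplicated url; the benefit is that no state about the finished transfer has to be shared between threads.</p>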
<p>It's not very satisfying to re-download. But once the url Key is downloaded,
it does not keep that url Key populated, but hashes the content and moves
the content to the final Key. It would be a real complication to
communicate, across threads, what Key the content ended up at, and have the
waiting thread use that. And addurl is already complicated well beyond a
point I am comfortable with.</p>
<p>Also, the content of an url can of course change over time. If I feed
"$url foo" into git-annex addurl --batch -J10 and then some time
later, I feed "$url bar", I might expect that file bar gets whatever
content the url has now, not the content that the url had back when I added
the same url to file foo. And if I cared about avoiding re-downloading,
I could add the url to the first file, and then copy the annex link to the
second file myself.</p>
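<p>For a caller that does want to avoid the re-download, the manual approach could be planned on the client side along these lines. This is only a sketch: <code>plan_batch</code> is a hypothetical helper, and how the annex link is actually copied is left to the caller. Each url is fed to the batch once, in the "$url file" form used above, and the remaining paths are recorded as copies of the first.</p>

```python
def plan_batch(pairs):
    """Split (url, path) pairs into batch input lines and link copies.

    Returns (to_add, to_copy): to_add holds "url path" lines for the first
    path seen for each url; to_copy holds (source_path, dest_path) pairs
    for every later path that reuses an already-seen url.
    """
    first_path = {}
    to_add, to_copy = [], []
    for url, path in pairs:
        if url in first_path:
            to_copy.append((first_path[url], path))
        else:
            first_path[url] = path
            to_add.append(f"{url} {path}")
    return to_add, to_copy
```

<p>This trades the "content as of now" behaviour for fewer downloads, so it only fits when the url's content is expected to be the same for every path.</p>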
<p>Implemented this approach.</p>