projects/datalad/bugs-done/WEBDAV export has wrong subdirectory content | git-annex | http://git-annex.branchable.com/projects/datalad/bugs-done/WEBDAV_export_has_wrong_subdirectory_content/ | git-annex | ikiwiki | 2023-01-05T17:30:31Z
comment 1 | http://git-annex.branchable.com/projects/datalad/bugs-done/WEBDAV_export_has_wrong_subdirectory_content/comment_1_e4b3597a6be74b10f03af3cdbc9ab4b2/ | joey | 2023-01-05T17:30:31Z | 2021-03-12T18:14:59Z
<p>Unless this only happens on the one webdav server, my guess is
it involves the kind of weird way DAV handles collections.</p>
<p>In particular, the content is being written to a temp file,
which is in the webdav root, and then it runs:</p>
<pre><code>inLocation src $ moveContentM (B8.fromString newurl)
</code></pre>
<p>where src = the webdav root. It may be that the server ignores the path in newurl and
just puts the file into the same collection.</p>
comment 2 | http://git-annex.branchable.com/projects/datalad/bugs-done/WEBDAV_export_has_wrong_subdirectory_content/comment_2_aa11bebaa9ca341434fbc1382bbbbee4/ | joey | 2023-01-05T17:30:31Z | 2021-03-12T18:27:26Z
<p>Well, I tried with the box.com webdav server, and did not reproduce the
problem there; exported files went into the proper subdirectories.</p>
<p>I think there's some chance that the webdav server you're trying to use is
just broken in its handling of moving from one collection to another.
Or perhaps the webdav spec can be interpreted multiple ways and this is
falling into an edge case.</p>
<p>Here's --debug of it working:</p>
<pre><code>export box.com sub/t
[2021-03-12 14:29:33.044254918] getProps sub/t
[2021-03-12 14:29:35.56315446] putContent git-annex-webdav-tmp-SHA256E-s3--98ea6e4f216f2fb4b69fff9b3a44842c38686ca685f3f55dc48c5d3fb1107be4
[2021-03-12 14:29:37.751164566] delContent sub/t
[2021-03-12 14:29:38.695664147] getProps sub
[2021-03-12 14:29:40.638445948] moveContent git-annex-webdav-tmp-SHA256E-s3--98ea6e4f216f2fb4b69fff9b3a44842c38686ca685f3f55dc48c5d3fb1107be4 https://dav.box.com/dav/git-annex/sub/t
</code></pre>
<p>Does it look significantly different in your case?</p>
<p>Unfortunately, we don't have an easy way to get an HTTP trace, but this
effectively shows the API calls made to the DAV library, and the moveContent looks
like it would generate an HTTP request like this:</p>
<pre><code>MOVE /git-annex-webdav-tmp-SHA256E-s3...
Destination: https://dav.box.com/dav/git-annex/sub/t
</code></pre>
<p>That seems fine; there are similar examples in the webdav spec
of moving from one directory to another.</p>
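<p>As a rough illustration of the request shape described above, the following Python sketch builds (but does not send) a WebDAV MOVE request with a <code>Destination</code> header. The helper name <code>build_move_request</code> and the temp file name <code>git-annex-webdav-tmp-EXAMPLE</code> are placeholders invented for this example; this is not how git-annex or its DAV library constructs the request.</p>

```python
from urllib.request import Request


def build_move_request(base_url, src_path, dest_url):
    """Build (without sending) a WebDAV MOVE request.

    base_url and src_path together name the resource to move;
    dest_url is the absolute URL placed in the Destination header.
    """
    src_url = base_url.rstrip("/") + "/" + src_path.lstrip("/")
    req = Request(src_url, method="MOVE")
    req.add_header("Destination", dest_url)
    return req


# Placeholder temp file name; the real one embeds the git-annex key.
req = build_move_request(
    "https://dav.box.com/dav/git-annex",
    "git-annex-webdav-tmp-EXAMPLE",
    "https://dav.box.com/dav/git-annex/sub/t",
)
print(req.get_method(), req.full_url)
print("Destination:", req.get_header("Destination"))
```

<p>The source and destination here sit in different collections (the root and <code>sub</code>), which is the case suspected of triggering the server's misbehavior.</p>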
comment 3 | http://git-annex.branchable.com/projects/datalad/bugs-done/WEBDAV_export_has_wrong_subdirectory_content/comment_3_22290ea94d5e547a199411d696657708/ | joey | 2023-01-05T17:30:31Z | 2021-03-12T18:48:38Z
<p>If the problem does involve moving between collections, git-annex could
avoid it by storing the temp file in the subdirectory in the first
place, and only renaming it to its final name once transferred.</p>
<p>I've implemented that and committed it along with this comment,
and it works when tested against box.com again. Can you try it with
the 4shared server?</p>
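<p>A minimal Python sketch of the idea, assuming the temp file is named by prefixing the key with <code>git-annex-webdav-tmp-</code> as in the debug output: the temp file is placed in the same collection as the final destination, so the later MOVE never crosses collections. The functions <code>temp_location</code> and <code>store_plan</code> are hypothetical and only model the path handling, not git-annex's actual implementation.</p>

```python
import posixpath

# Prefix seen in the debug output; everything else here is illustrative.
TMP_PREFIX = "git-annex-webdav-tmp-"


def temp_location(dest, key):
    """Return a temp path in the same collection as dest."""
    collection = posixpath.dirname(dest)
    tmp_name = TMP_PREFIX + key
    # A file at the top of the export has no collection component,
    # so it gets a bare name rather than a "./"-prefixed one.
    return posixpath.join(collection, tmp_name) if collection else tmp_name


def store_plan(dest, key):
    """List the requests needed to store key at dest via a temp file."""
    tmp = temp_location(dest, key)
    return [("PUT", tmp), ("MOVE", tmp, dest)]


print(store_plan("sub/t", "KEY"))
print(store_plan("t", "KEY"))
```

<p>With this layout the rename only changes the last path component, which servers are more likely to handle consistently.</p>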
Progress report | http://git-annex.branchable.com/projects/datalad/bugs-done/WEBDAV_export_has_wrong_subdirectory_content/comment_4_b6c36a9f466cbd0a09ae83ada7bc728c/ | mih | 2023-01-05T17:30:31Z | 2021-03-13T16:19:17Z
<p>Thx for the fixes! This seems to be working nicely, but only for subdirectories. All files that are not in the root of a repository are placed in their respective subdirectories, whether or not they are annexed.</p>
<p>However, the export fails for all files in the root. Using <code>git-annex version: 8.20210311-gecee702b3</code> I see:</p>
<pre><code>% git annex export HEAD --to 4shared --json --json-error-messages
{"command":"export 4shared","success":true,"input":[],"error-messages":[],"file":".datalad/.gitattributes"}
{"command":"export 4shared","success":true,"input":[],"error-messages":[],"file":".datalad/config"}
{"command":"export 4shared","success":false,"input":[],"error-messages":[" DAV failure: Status {statusCode = 409, statusMessage = \"Conflict\"} \"<html><body><h1>Conflict</h1></body></html>\" HTTP request: \"PUT\" \"/dummy/./git-annex-webdav-tmp-GIT--c3aaefef9a2470b31ba9213350046ff7cde75737\""],"file":".gitattributes"}
{"command":"export 4shared","success":false,"input":[],"error-messages":[" DAV failure: Status {statusCode = 409, statusMessage = \"Conflict\"} \"<html><body><h1>Conflict</h1></body></html>\" HTTP request: \"PUT\" \"/dummy/./git-annex-webdav-tmp-MD5E-s4--971658bc2f5bdee5660844a83b5bf0a2.txt\""," DAV failure: Status {statusCode = 409, statusMessage = \"Conflict\"} \"<html><body><h1>Conflict</h1></body></html>\" HTTP request: \"PUT\" \"/dummy/./git-annex-webdav-tmp-MD5E-s4--971658bc2f5bdee5660844a83b5bf0a2.txt\""],"file":"inannexroot.txt"}
{"command":"export 4shared","success":false,"input":[],"error-messages":[" DAV failure: Status {statusCode = 409, statusMessage = \"Conflict\"} \"<html><body><h1>Conflict</h1></body></html>\" HTTP request: \"PUT\" \"/dummy/./git-annex-webdav-tmp-GIT--ec28779b36c1a65a3fb8ca1b1cad32c8b1f0fd45\""],"file":"ingit.txt"}
git-annex: export: 3 failed
</code></pre>
<p>and with <code>--debug</code> on a subsequent attempt:</p>
<pre><code>...
[2021-03-13 17:10:40.082377685] getProps .datalad/.gitattributes
{"command":"export 4shared","success":true,"input":[],"error-messages":[],"file":".datalad/.gitattributes"}
[2021-03-13 17:10:41.109311219] getProps .datalad/config
{"command":"export 4shared","success":true,"input":[],"error-messages":[],"file":".datalad/config"}
[2021-03-13 17:10:41.274971242] getProps .gitattributes
[2021-03-13 17:10:41.477825807] putContent ./git-annex-webdav-tmp-GIT--c3aaefef9a2470b31ba9213350046ff7cde75737
{"command":"export 4shared","success":false,"input":[],"error-messages":[" DAV failure: Status {statusCode = 409, statusMessage = \"Conflict\"} \"<html><body><h1>Conflict</h1></body></html>\" HTTP request: \"PUT\" \"/dummy/./git-annex-webdav-tmp-GIT--c3aaefef9a2470b31ba9213350046ff7cde75737\""],"file":".gitattributes"}
[2021-03-13 17:10:41.684686094] getProps inannexroot.txt
[2021-03-13 17:10:42.125435547] putContent ./git-annex-webdav-tmp-MD5E-s4--971658bc2f5bdee5660844a83b5bf0a2.txt
[2021-03-13 17:10:42.299969611] putContent ./git-annex-webdav-tmp-MD5E-s4--971658bc2f5bdee5660844a83b5bf0a2.txt
{"command":"export 4shared","success":false,"input":[],"error-messages":[" DAV failure: Status {statusCode = 409, statusMessage = \"Conflict\"} \"<html><body><h1>Conflict</h1></body></html>\" HTTP request: \"PUT\" \"/dummy/./git-annex-webdav-tmp-MD5E-s4--971658bc2f5bdee5660844a83b5bf0a2.txt\""," DAV failure: Status {statusCode = 409, statusMessage = \"Conflict\"} \"<html><body><h1>Conflict</h1></body></html>\" HTTP request: \"PUT\" \"/dummy/./git-annex-webdav-tmp-MD5E-s4--971658bc2f5bdee5660844a83b5bf0a2.txt\""],"file":"inannexroot.txt"}
[2021-03-13 17:10:42.64284096] getProps ingit.txt
[2021-03-13 17:10:42.808582229] putContent ./git-annex-webdav-tmp-GIT--ec28779b36c1a65a3fb8ca1b1cad32c8b1f0fd45
{"command":"export 4shared","success":false,"input":[],"error-messages":[" DAV failure: Status {statusCode = 409, statusMessage = \"Conflict\"} \"<html><body><h1>Conflict</h1></body></html>\" HTTP request: \"PUT\" \"/dummy/./git-annex-webdav-tmp-GIT--ec28779b36c1a65a3fb8ca1b1cad32c8b1f0fd45\""],"file":"ingit.txt"}
...
</code></pre>
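<p>Output like the above can be summarized mechanically. The following Python sketch, which is an illustration and not part of git-annex or DataLad, collects the failed files from <code>--json --json-error-messages</code> output and skips interleaved <code>--debug</code> lines. The function name <code>failed_exports</code> is made up, and the error message in the sample input is abbreviated.</p>

```python
import json


def failed_exports(output):
    """Map each failed file to its list of error messages."""
    failures = {}
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # debug lines and the final summary are not JSON
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not record.get("success", False):
            messages = record.get("error-messages", [])
            failures[record.get("file")] = [m.strip() for m in messages]
    return failures


# Sample input modeled on the output above (error text abbreviated).
sample = "\n".join([
    '[2021-03-13 17:10:41.109311219] getProps .datalad/config',
    '{"command":"export 4shared","success":true,"input":[],'
    '"error-messages":[],"file":".datalad/config"}',
    '{"command":"export 4shared","success":false,"input":[],'
    '"error-messages":[" DAV failure: (abbreviated)"],"file":"ingit.txt"}',
    'git-annex: export: 3 failed',
])
print(failed_exports(sample))
```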
4shared-specific issue | http://git-annex.branchable.com/projects/datalad/bugs-done/WEBDAV_export_has_wrong_subdirectory_content/comment_5_201fb2f59e3143145f9f0b07a5f3ac30/ | mih | 2023-01-05T17:30:31Z | 2021-03-13T16:26:12Z
I was just able to export the very same repository, with identical remote parameters, to the WebDAV endpoint of a Nextcloud instance. The reported behavior therefore seems to be specific to 4shared, or is at least not a general issue.
comment 6 | http://git-annex.branchable.com/projects/datalad/bugs-done/WEBDAV_export_has_wrong_subdirectory_content/comment_6_e1051429a09bf5d7e5887d0eb378d72f/ | joey | 2023-01-05T17:30:31Z | 2021-03-16T18:13:45Z
<p>I think I see how my change broke exporting to the top directory of the
repo. I've committed a fix for that.</p>
comment 7 | http://git-annex.branchable.com/projects/datalad/bugs-done/WEBDAV_export_has_wrong_subdirectory_content/comment_7_e50ed095c335c616b8e27fe20dae00d9/ | joey | 2023-01-05T17:30:31Z | 2021-03-16T18:19:17Z
<p>I think I see another way that the same webdav server misbehavior could
happen, since there is also a rename. When a file at the top of the
repo is exported to webdav, then moved into a subdirectory, and the
export is run again, git-annex will rename it on the server to avoid re-uploading.</p>
<p>mih, can you check if the 4shared server breaks in that situation?</p>
<p>(I notice there's a special case in that code path already for the box.com
webdav server (see <a href="http://source.git-annex.branchable.com/?p=source.git;a=commitdiff;h=bf48ba4ef7aeb69d5efca7c04068ff7752f57f3f">bf48ba4ef7aeb69d5efca7c04068ff7752f57f3f</a>)
which apparently also had problems with renames. Although as I noted
upthread, I didn't see the problem reported in this bug report when I
tested it against box.com.)</p>
Works! | http://git-annex.branchable.com/projects/datalad/bugs-done/WEBDAV_export_has_wrong_subdirectory_content/comment_8_c1f5a71c217b965ece29fe8d9a01e6b2/ | mih | 2023-01-05T17:30:31Z | 2021-03-17T08:20:08Z
<p>I tested export to 4shared with 8.20210311-g02e74c010 and it is working nicely now!</p>
<p>Thx!</p>
comment 9 | http://git-annex.branchable.com/projects/datalad/bugs-done/WEBDAV_export_has_wrong_subdirectory_content/comment_9_b25f2c71675f4d8f6a6e22bb5cd98b23/ | joey | 2023-01-05T17:30:31Z | 2021-03-17T13:44:02Z
Can you test the rename case mentioned in comment #7?
comment 10 | http://git-annex.branchable.com/projects/datalad/bugs-done/WEBDAV_export_has_wrong_subdirectory_content/comment_10_0598b1ccc9d7002d6b7fa08f29f8edc9/ | mih | 2023-01-05T17:30:31Z | 2021-03-17T17:03:34Z
<p>Sorry, forgot about that.</p>
<p>I <code>git mv</code>'ed a file from the root into a subdirectory, and re-exported. It fails:</p>
<p>DAV failure: Status {statusCode = 500, statusMessage = "Internal Server Error"} "<h1>Server Error</h1>" HTTP request: "MOVE" "/dltest20/some%20space/git-annex-webdav-tmp-MD5E-s4--ba1f2511fc30423bdbb183fe33f3dd0f"</p>
<p>On the server-side the file is removed from the root, but the target directory only has the tmp files.</p>
comment 11 | http://git-annex.branchable.com/projects/datalad/bugs-done/WEBDAV_export_has_wrong_subdirectory_content/comment_11_03ea7014854be4fa9f40a8815616b81d/ | joey | 2023-01-05T17:30:31Z | 2021-03-22T16:27:10Z
<p>Ok, failed kind of like I expected, although I have to say the filenames
don't entirely make sense to me.
("git-annex-webdav-tmp" is not used in filenames when renaming an exported
file, only when storing it in the first place)</p>
<p>So what seems to make sense for git-annex to do when renaming an exported
file is: Try to rename, and if it fails, delete the source and the
destination -- since it has no idea what may have been left in either place
-- and then fall back to uploading the file again instead. I have
implemented that.</p>
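<p>The fallback described above can be sketched in Python as follows. This is an illustration under stated assumptions, not git-annex's code: the remote interface (<code>move</code>, <code>delete</code>, <code>put</code>), the <code>RemoteError</code> exception, and the <code>FlakyRemote</code> test double are all invented here, and ignoring errors during cleanup is a choice made for the example.</p>

```python
class RemoteError(Exception):
    """Raised by the hypothetical remote when an operation fails."""


def rename_export(remote, src, dest, upload):
    """Try to rename src to dest; on failure, clean up and re-upload.

    Returns "renamed" or "reuploaded".
    """
    try:
        remote.move(src, dest)
        return "renamed"
    except RemoteError:
        # Nothing is known about what the failed rename left behind,
        # so remove both ends before uploading a fresh copy.
        for path in (src, dest):
            try:
                remote.delete(path)
            except RemoteError:
                pass  # assumption: the path may already be gone
        upload(dest)
        return "reuploaded"


class FlakyRemote:
    """In-memory stand-in whose MOVE drops the source and then fails."""

    def __init__(self, files):
        self.files = dict(files)

    def move(self, src, dest):
        self.files.pop(src, None)
        raise RemoteError("500 Internal Server Error")

    def delete(self, path):
        if path not in self.files:
            raise RemoteError("404 Not Found")
        del self.files[path]

    def put(self, path, data):
        self.files[path] = data


remote = FlakyRemote({"a.txt": b"data"})
result = rename_export(
    remote, "a.txt", "sub/a.txt", lambda p: remote.put(p, b"data")
)
print(result, sorted(remote.files))
```

<p>The test double mimics the reported failure mode, where the file disappears from the root even though the MOVE returns an error.</p>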