I am using a free 4shared account to test the webdav special remote export.
My repository looks like this:
% tree
.
├── ingit.txt
└── subdir
    └── inannex.txt -> ../.git/annex/objects/Kp/FZ/MD5E-s7--3b158c5b0a18c247ebad28c09fc3e180.txt/MD5E-s7--3b158c5b0a18c247ebad28c09fc3e180.txt
1 directory, 2 files
My webdav setup:
% git cat-file -p git-annex:remote.log
93522a6c-8e9f-47a1-a578-b6a18f82d429 encryption=none exporttree=yes name=4shared type=webdav url=https://webdav.4shared.com/datalad-tester/6350cc6b-2af7-41db-89cf-96c3d41f29cc timestamp=1615568357.193179854s
I can export without error, but the resulting layout on the server does not match the local worktree.
.
├── ingit.txt
├── inannex.txt
└── subdir
The annexed file is in the root, and the subdirectory exists, but is empty.
Git annex itself isn't happy with the result either:
% git annex fsck -f 4shared
fsck subdir/inannex.txt (fixing location log)
** Based on the location log, subdir/inannex.txt
** was expected to be present, but its content is missing.
failed
(recording state in git...)
git-annex: fsck: 1 failed
On repeated upload attempts the situation remains identical: the annexed file is misplaced into the root.
% git annex export HEAD --to 4shared --json --json-error-messages
{"command":"export 4shared","success":true,"input":[],"error-messages":[],"file":null}
I have not attempted a replication with another webdav service yet.
Thx!
Unless this only happens on the one webdav server, my guess is it involves the kind of weird way DAV handles collections.
In particular, the content is being written to a temp file, which is in the webdav root, and then it runs:
where src = the webdav root. It may be that the server ignores the path in newurl and just puts the file into the same collection.
Well, I tried with the box.com webdav server, and did not reproduce the problem there; exported files went into the proper subdirectories.
I think there's some chance that the webdav server you're trying to use is just broken in its handling of moving from one collection to another. Or perhaps the webdav spec can be interpreted multiple ways and this is falling into an edge case.
Here's --debug of it working:
Does it look significantly different in your case?
Unfortunately we don't have an easy way to get an HTTP trace, but this shows effectively the API calls made to the DAV library, and the moveContent looks like it would generate an HTTP request like this:
Which seems fine, there are similar examples in the webdav spec of moving from one directory to another.
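For readers unfamiliar with what a move between collections looks like on the wire, here is a rough sketch. It is not the trace from this thread; the host and paths are made up, and it only builds the request without sending it. As far as I understand the webdav spec, MOVE names the source in the request URL and the target in a Destination header:

```python
from urllib.request import Request


def build_move_request(src_url: str, dest_url: str, overwrite: bool = True) -> Request:
    """Build (but do not send) a WebDAV MOVE request.

    The source is the request URL; the target goes into the Destination
    header. Overwrite is "T" or "F".
    """
    return Request(
        src_url,
        method="MOVE",
        headers={
            "Destination": dest_url,
            "Overwrite": "T" if overwrite else "F",
        },
    )


# Hypothetical URLs: a temp file in the export root moved into a subdirectory.
req = build_move_request(
    "https://dav.example.com/export/git-annex-webdav-tmp-KEY",
    "https://dav.example.com/export/subdir/inannex.txt",
)
print(req.get_method(), req.full_url)
print("Destination:", req.get_header("Destination"))
```

A server that only looks at the last path component of Destination would produce exactly the misplaced file described above.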
If the problem does involve moving between collections, it could avoid the problem by storing the temp file into the subdirectory in the first place, and only renaming it to its final name once transferred.
I've implemented that and committed it along with this comment. It works when tested against box.com again; can you try it with the 4shared server?
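To illustrate why the temp file location matters, here is a small simulation. It is only a sketch, not the git-annex code (which is Haskell): QuirkyDavServer is a hypothetical in-memory stand-in, and its behaviour (keeping the source collection and taking only the file name from the destination) is my guess at what the server does, not something confirmed.

```python
import posixpath

TMP_NAME = "git-annex-webdav-tmp-KEY"  # placeholder temp file name


class QuirkyDavServer:
    """Hypothetical stand-in for a server that mishandles cross-collection MOVE."""

    def __init__(self):
        self.files = {}

    def put(self, path, data):
        self.files[path] = data

    def move(self, src, dest):
        # Assumed misbehaviour: ignore the destination's collection and
        # keep the file in the collection the source was in.
        actual = posixpath.join(posixpath.dirname(src), posixpath.basename(dest))
        self.files[actual] = self.files.pop(src)


def export_via_root_temp(server, dest, data):
    """Old approach: temp file in the root, then move across collections."""
    server.put(TMP_NAME, data)
    server.move(TMP_NAME, dest)


def export_via_sibling_temp(server, dest, data):
    """New approach: temp file next to the destination, then a plain rename."""
    tmp = posixpath.join(posixpath.dirname(dest), TMP_NAME)
    server.put(tmp, data)
    server.move(tmp, dest)


old = QuirkyDavServer()
export_via_root_temp(old, "subdir/inannex.txt", b"content")
print(sorted(old.files))  # ['inannex.txt']

new = QuirkyDavServer()
export_via_sibling_temp(new, "subdir/inannex.txt", b"content")
print(sorted(new.files))  # ['subdir/inannex.txt']
```

With the temp file stored next to its destination, the move never crosses a collection boundary, so this particular quirk cannot bite.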
Thx for the fixes! This seems to be working nicely, but only for subdirectories. All files that are not in the root of a repository are placed in their respective subdirectories, whether or not they are annexed.
However, the export errors for all files in the root. Using
git-annex version: 8.20210311-gecee702b3
I see:
and with --debug on a subsequent attempt:
I think I see how my change broke exporting to the top directory of the repo. I've committed a fix for that.
I think I see another way that the same webdav server misbehavior could happen, since there is also a rename. When a file is in the top of the repo, is exported to webdav, then is moved into a subdirectory, and the export is run again, git-annex will rename it on the server to avoid re-uploading.
mid, can you check if the 4shared server breaks in that situation?
(I notice there's a special case in that code path already for the box.com webdav server (see bf48ba4ef7aeb69d5efca7c04068ff7752f57f3f) which apparently also had problems with renames. Although as I noted upthread, I didn't see the problem reported in this bug report when I tested it against box.com.)
I tested export to 4shared with 8.20210311-g02e74c010 and it is working nicely now!
Thx!
Sorry, forgot about that.
I git mv'ed a file from the root into a subdirectory, and re-exported. It fails:
DAV failure: Status {statusCode = 500, statusMessage = "Internal Server Error"} "
Server Error
" HTTP request: "MOVE" "/dltest20/some%20space/git-annex-webdav-tmp-MD5E-s4--ba1f2511fc30423bdbb183fe33f3dd0f"
On the server-side the file is removed from the root, but the target directory only has the tmp files.
Ok, failed kind of like I expected, although I have to say the filenames don't entirely make sense to me. ("git-annex-webdav-tmp" is not used in filenames when renaming an exported file, only when storing it in the first place)
So what seems to make sense for git-annex to do when renaming an exported file is: Try to rename, and if it fails, delete the source and the destination -- since it has no idea what may have been left in either place -- and then fall back to uploading the file again instead. I have implemented that.
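The fallback logic can be sketched roughly as follows. This is not the actual implementation; FlakyRenameRemote, RenameFailed and rename_export are made-up names, and the simulated failure (source gone, junk left at the destination, then an error) is only an assumption loosely modelled on the report above.

```python
class RenameFailed(Exception):
    """Raised by the hypothetical remote when a rename goes wrong."""


class FlakyRenameRemote:
    """In-memory stand-in for a remote whose rename fails half-way."""

    def __init__(self):
        self.files = {}

    def store(self, path, data):
        self.files[path] = data

    def remove(self, path):
        # Removing a missing file is not an error here.
        self.files.pop(path, None)

    def rename(self, src, dest):
        # Assumed failure mode: the source disappears, something incomplete
        # is left at the destination, and the server reports an error.
        self.files.pop(src, None)
        self.files[dest] = b""
        raise RenameFailed("500 Internal Server Error")


def rename_export(remote, src, dest, local_content):
    """Try to rename; on failure clean up both ends and upload again."""
    try:
        remote.rename(src, dest)
        return "renamed"
    except RenameFailed:
        # No idea what was left in either place, so delete both.
        remote.remove(src)
        remote.remove(dest)
        remote.store(dest, local_content)
        return "reuploaded"


remote = FlakyRenameRemote()
remote.store("file.txt", b"data")
result = rename_export(remote, "file.txt", "subdir/file.txt", b"data")
print(result, remote.files)
```

The cost of the fallback is one extra upload, which seems acceptable given that the rename is only an optimisation to avoid re-uploading.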