Please describe the problem.
When copying large files to an S3 special remote, I get a cryptic error message -- what can cause this?
What steps will reproduce the problem?
git annex copy --all --to s3_dssi-bioinfo-shlyakhi_sccrispr-data
What version of git-annex are you using? On what operating system?
$ git annex version
git-annex version: 10.20220724-ge30d846
build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite S3 WebDAV
dependency versions: aws-0.22 bloomfilter-2.0.1.0 cryptonite-0.29 DAV-1.3.4 feed-1.3.2.0 ghc-8.10.7 http-client-0.7.9 persistent-sqlite-2.13.0.3 torrent-10000.1.1 uuid-1.3.15 yesod-1.6.1.2
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg hook external
operating system: linux x86_64
supported repository versions: 8 9 10
upgrade supported from repository versions: 0 1 2 3 4 5 6 7 8 9 10
local repository version: 8
$ uname -a
Linux ctchpcpx212.merck.com 3.10.0-1160.6.1.el7.x86_64 #1 SMP Tue Nov 17 13:59:11 UTC 2020 x86_64 GNU/Linux
Please provide any additional information below.
copy SHA256E-s67657--d4ab2864cf2d94fb9eeb74721860da93bea55c45194b6c678d5c66040126edfb.csv ok
copy SHA256E-s10246716468--aa082f7efc476b2903330d57261d16d884653e97e5d007e559ef5ae6a03a52ba.bam (to s3_dssi-bioinfo-shlyakhi_sccrispr-data...)
13:1 (567)-13:8 (574): Expected end element for: Name {nameLocalName = "meta", nameNamespace = Nothing, namePrefix = Nothing}, but received: EventEndElement (Name {nameLocalName = "head", nameNamespace = Nothing, namePrefix = Nothing})
13:1 (567)-13:8 (574): Expected end element for: Name {nameLocalName = "meta", nameNamespace = Nothing, namePrefix = Nothing}, but received: EventEndElement (Name {nameLocalName = "head", nameNamespace = Nothing, namePrefix = Nothing})
failed
Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
git-annex fills a real need, and none of the alternate tools I've looked at fill it as well. I've used it at three different workplaces to manage data, and been able to adapt it to every use case. It also represents a staggering amount of work by one person.
This error message is from xml-conduit, which is used by aws to parse a response from the S3 server. It seems that response is probably malformed XML.
What S3 server is this using?
You may be able to get more information about the http request and response by using the --debug flag.
Re-enabling the remote with
partsize=1GiB
seems to fix the issue. Maybe, make that the default for S3?Are there reasons to use
chunk
in addition topartsize
, or doespartsize
obviate the need for chunking?Is this remote configured with chunk? (
git-annex info s3_dssi-bioinfo-shlyakhi_sccrispr-data
)Either chunk or partsize should work. partsize is disabled by default because there are a lot of S3 implementations besides AWS, that may not support it.
Generally, it's probably better to use partsize rather than chunk if the S3 server supports it.