bugs/Hybrid encryption can't generate the right key after moving filesgit-annexhttp://git-annex.branchable.com/bugs/Hybrid_encryption_can__39__t_generate_the_right_key_after_moving_files/git-annexikiwiki2017-06-09T17:11:39Zcomment 1http://git-annex.branchable.com/bugs/Hybrid_encryption_can__39__t_generate_the_right_key_after_moving_files/comment_1_bbc1e7205d7701afd405c6e62a1c0aa3/yibe2017-06-04T20:07:31Z2017-06-04T20:07:31Z
<p>Hello,</p>
<p>I have no idea why git-annex fails to decrypt your file, but as for the two different HMAC keys, I guess you have chunking enabled on that remote (at least when you uploaded the file) and that first HMAC key is the right key for the 1st chunk of your file. That decryption script does not take chunks into account, so you were only able to generate the second HMAC key, which should be the right key if the file was uploaded without chunking enabled. I've just <a href="http://git-annex.branchable.com/tips/Decrypting_files_in_special_remotes_without_git-annex/#comment-ea2df7b4739f3d66c169bf297e339e9d">posted</a> a modified version of the script that supports chunks.</p>
comment 2http://git-annex.branchable.com/bugs/Hybrid_encryption_can__39__t_generate_the_right_key_after_moving_files/comment_2_52a1800537f1244ad6cf417c5f25ebe0/joey2017-06-06T16:21:07Z2017-06-06T15:38:24Z
<p>Your remote has chunking enabled, so git-annex first tries generating an
HMAC for a chunked key. When decrypting the content fails for some reason,
it falls back to trying the HMAC for an unchunked key. This is done because
chunking can be enabled after some content has been uploaded to a remote,
so it always tries the unchunked location just in case.</p>
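<p>The lookup order described above can be sketched roughly as follows. This is an illustration only, not git-annex's actual code: the function names are hypothetical, and the <code>-S&lt;chunksize&gt;-C&lt;chunknumber&gt;</code> key fields are an assumption about how chunk keys are named.</p>

```python
def chunk_key(key: str, chunk_size: int, chunk_number: int) -> str:
    """Build the key name for one chunk of `key`.

    Assumption: a chunk key is the original key with -S<chunksize> and
    -C<chunknumber> fields added before the "--" separator.
    """
    fields, sep, name = key.partition("--")
    if not sep:
        raise ValueError("key has no '--' separator: %r" % key)
    return f"{fields}-S{chunk_size}-C{chunk_number}--{name}"


def candidate_keys(key: str, chunk_size: int) -> list:
    """Keys to try, in order: the first chunk, then the unchunked key.

    The unchunked key is the fallback, because chunking may have been
    enabled after the content was uploaded.
    """
    return [chunk_key(key, chunk_size, 1), key]
```

<p>Each candidate is then run through the HMAC step separately, which is why two different HMAC keys show up in the log.</p>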
<p>It looks like gpg is successfully decrypting the hybrid encryption key
that's embedded in your git repository. That is the first, successful
call to gpg in your log.</p>
<p>The "bad key" error then comes when gpg is asked to use the hybrid
encryption key to decrypt the content. This seems to indicate it's not
using the same hybrid encryption key that was used to encrypt it.</p>
<p>However, the fact that it was able to generate the right HMAC key to
download the content indicates that it did get the right hybrid encryption
key (since half of that key is used to generate the HMAC).</p>
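<p>As a rough sketch of that relationship, the name of the encrypted object on the remote can be derived from the decrypted hybrid key like this. The 256-byte split, the HMAC-SHA1 default, and the <code>GPGHMACSHA1--</code> prefix are assumptions based on the standalone decryption script discussed in this thread; the function name is hypothetical.</p>

```python
import hashlib
import hmac

# Assumption: the first 256 bytes of the decrypted hybrid key are the HMAC
# secret, and the remainder is the passphrase used for the content itself.
HMAC_SECRET_LEN = 256


def encrypted_key_name(cipher: bytes, key: str) -> str:
    """Return the name under which `key` would be stored on the remote."""
    secret = cipher[:HMAC_SECRET_LEN]
    digest = hmac.new(secret, key.encode("utf-8"), hashlib.sha1).hexdigest()
    return "GPGHMACSHA1--" + digest
```

<p>Under these assumptions, a matching name on the remote only confirms the first half of the hybrid key. The second half is not involved in the name at all, so a correct name says nothing about whether the stored content is intact.</p>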
<p>So hmm, I don't understand what is going on.</p>
<p>Are you able to retrieve the same file successfully when the rclone.conf
is configured to use the Amazon Cloud Drive? (Assuming that the content of
the file is still present over there.)</p>
comment 3http://git-annex.branchable.com/bugs/Hybrid_encryption_can__39__t_generate_the_right_key_after_moving_files/comment_3_414bd487619e46b25e8d9a57d9aef1f6/interfect2017-06-09T01:40:00Z2017-06-09T01:40:00Z
<p>Unfortunately even though the content is still over at Amazon, ACD can no longer be accessed through rclone, so I can't get Git Annex to go over there and download it.</p>
<p>With the new chunk-supporting decryption script (which I further modified to actually use the timestamps on the log entries), I am able to generate the right key for my test file. I was also able to generate the key for and decrypt another test file, and testing with git annex again shows that I can successfully fsck other files in the remote.</p>
<p>I think what is happening is that my small file I was testing with somehow became corrupted or was modified while on Amazon's servers. I downloaded from Amazon before the transfer and from Google after and did a diff, and both files are identical, so I think I moved over corrupt data.</p>
<p>It looks like git annex is just successfully doing its job and identifying some data corruption here. The bug can probably be closed.</p>
comment 4http://git-annex.branchable.com/bugs/Hybrid_encryption_can__39__t_generate_the_right_key_after_moving_files/comment_4_81abd2627672911ba6367effbf5c487b/interfect2017-06-09T01:48:15Z2017-06-09T01:48:15Z
<p>I'm actually seeing a lot of files (mostly smaller ones, for some reason) failing to fsck because GPG decryption failed. I can't tell at the moment whether they got corrupted in the transfer, or corrupted in the initial upload somehow. I'm pretty sure the problem here isn't with git annex itself, or more people would have noticed, but I'm definitely going to be fscking my cloud remotes more frequently.</p>
comment 5http://git-annex.branchable.com/bugs/Hybrid_encryption_can__39__t_generate_the_right_key_after_moving_files/comment_5_ce0342e38a87017ad58c9a79b17d759a/joey2017-06-09T17:11:39Z2017-06-09T17:04:09Z
<p>"I think what is happening is that my small file I was testing with somehow
became corrupted or was modified while on Amazon's servers."</p>
<p>That was also kind of my guess. It's hard to imagine how
the way that a file is downloaded from The Cloud changes
how git-annex decrypts it. As long as the content is the same,
the decryption step should behave identically no matter where
the file is downloaded from.</p>
<p>But, multiple small files getting corrupted seems like it must
have a cause other than a bit flip. Perhaps something about how
they were transferred between the two clouds corrupted them.</p>
<p>I suppose there could also be a bug in git-annex or rclone that somehow
corrupts uploads of small files. Perhaps something to do with chunking.
What does <code>git annex info theremote --fast</code> say about its configuration?</p>
<p>What is the range of sizes of small files that you've found to be
corrupted? Is there a cut-off point after which all larger files are
not corrupted? Are any small files not corrupted?</p>