Started work on gcrypt support.

The first question is, should git-annex leave it up to gcrypt to transport the data to the encrypted repository on a push/pull? gcrypt hooks into git nicely to make that just work. However, if I go this route, it limits the places the encrypted git repositores can be stored to regular git remotes (and rsync). The alternative is to somehow use gcrypt to generate/consume the data, but use the git-annex special remotes to store individual files. Which would allow for a git repo stored on S3, etc. For now, I am going with the simple option, but I have not ruled out trying to make the latter work. It seems it would need changes to gcrypt though.

Next question: Given a remote that uses gcrypt, how do I determine the annex.uuid of that repository. I found a nice solutuon to this. gcrypt has its own gcrypt-id, and I convert it to a UUID in a reproducible, and even standards-compliant way. So the same encrypted remote will automatically get the same annex.uuid wherever it's used. Nice. Does mean that git-annex cannot find a uuid until git pull or git push has been used, to let gcrypt get the gcrypt-id. Implemented that.

The next step is actually making git-annex store data on gcrypt remotes. And it needs to store it encrypted of course. It seems best to avoid needing a git annex initremote for these gcrypt remotes, and just have git-annex automatically encrypt data stored on them. But I don't know. Without initializing them like a special remote is, I'm limited to using the gpg keys that gcrypt is configured to encrypt to, and cannot use the regular git-annex hybrid encryption scheme. Also, I need to generate and store a nonce anyway to HMAC ecrypt keys. (Or modify gcrypt to put enough entropy in gcrypt-id that I can use it?)

Another concern I have is that gcrypt's own encryption scheme is simply to use a list of public keys to encrypt to. It would be nicer if the full set of git-annex encryption schemes could be used. Then the webapp could use shared encryption to avoid needing to make the user set up a gpg key, or hybrid encryption could be used to add keys later, etc.

But I see why gcrypt works the way it does. Otherwise, you can't make an encrypted repo with a friend set as one of the particpants and have them be able to git clone it. Both hybrid and shared encryption store a secret inside the repo, which is not accessible if it's encrypted using that secret. There are use cases where not being able to blindly clone a gcrypt repo would be ok. For example, you use the assistant to pair with a friend and then set up an encrypted repo in the cloud for both of you to use.

Anyway, for now, I will need to deal with setting up gpg keys etc in the assistant. I don't want to tackle full gpgkeys yet. Instead, I think I will start by adding some simple stuff to the assistant:

  • When adding a USB drive, offer to encrypt the repository on the drive so that only you can see it.
  • When adding a ssh remote make a similar offer.
  • Add a UI to add an arbitrary git remote with encryption. Let the user paste in the url to an empty remote they have, which could be to eg github. (In most cases this won't be used for annexed content..)
  • When the user has no gpg key, prompt to set one up. (Securely!)
  • Maybe have an interface to add another gpg key that can access the gcrypt repo. Note that this will need to re-encrypt and re-push the whole git history.