git-remote-gcrypt adds support for encrypted remotes to git. Combine this with git-annex encrypting the files it stores in a remote, and you can fully encrypt all the data stored on a remote.
Here are some ways you can use this awesome stuff.
This page will show how to set it up at the command line, but the git-annex assistant can also be used to help you set up encrypted git repositories.
prerequisites
Install git-remote-gcrypt.
Set up a gpg key. You might consider generating a special purpose key just for this use case, since you may end up wanting to put the key on multiple machines that you would not trust with your main gpg key.
The examples below use "$mykey" where you should put your gpg keyid.
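If you are not sure what your keyid is, gpg can list it. The sketch below is only an illustration: long_keyids is a made-up helper name, and it assumes the colon-separated listing that gpg prints with --with-colons, where the fifth field of each "sec" record is the long key id.

```shell
# long_keyids (hypothetical helper, not part of gpg or git-annex):
# reads `gpg --with-colons` output on stdin and prints the long key id
# (field 5) of every secret primary key ("sec" record).
long_keyids() {
    awk -F: '$1 == "sec" { print $5 }'
}

# Typical use, left commented out because it needs a gpg keyring:
# gpg --list-secret-keys --with-colons | long_keyids
```

Any id it prints can be used wherever the examples below say "$mykey".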
encrypted backup drive
Let's make a USB drive into an encrypted backup repository. It will contain both the full contents of your git repository, and all the files you instruct git-annex to store on it, and everything will be encrypted so that only you can see it.
Here's how to set up the encrypted repository:
git init --bare /mnt/encryptedbackup
git annex initremote encryptedbackup type=gcrypt gitrepo=/mnt/encryptedbackup keyid=$mykey
git annex sync encryptedbackup
(Remember to replace "$mykey" with the keyid of your gpg key.)
This uses the gcrypt special remote to encrypt pushes to the git remote, and git-annex will also encrypt the files it stores there.
Now you can copy (or even move) files to the repository. After sending files to it, you'll probably want to do a sync, which pushes the git repository changes to it as well.
git annex copy --to encryptedbackup ...
git annex sync encryptedbackup
Note that if you lose your gpg key, it will be impossible to get the data out of your encrypted backup. You need to find a secure way to store a backup of your gpg key. Printing it out and storing it in a safe deposit box, for example.
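One way to get something you can print or keep offline is to export the secret key in ASCII-armored form. This is a minimal sketch, assuming a hypothetical helper named backup_gpg_key and an output path of your own choosing; --armor and --export-secret-keys are standard gpg options, and gpg may prompt for the key's passphrase.

```shell
# backup_gpg_key KEYID OUTFILE (hypothetical helper)
# Writes an ASCII-armored copy of the secret key to OUTFILE. The umask is
# tightened in a subshell so that a newly created file is readable only
# by the current user; an already existing file keeps its old permissions.
backup_gpg_key() {
    keyid=$1
    outfile=$2
    (
        umask 077
        gpg --armor --export-secret-keys "$keyid" > "$outfile"
    )
}

# Example, commented out because it needs a real key:
# backup_gpg_key "$mykey" /path/to/safe/media/backup-key.asc
```

Treat the exported file as carefully as the key itself: anyone who has it and the passphrase can read the backup.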
You can actually specify keyid= as many times as you like to allow any one of a set of gpg keys to access this repository. So you could add a friend's key, or another gpg key you have.
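For instance, the initremote line above could be given several keyid= parameters. This is only a sketch: keyid_args is a hypothetical helper, and "$friendkey" stands for a second keyid that is assumed here for illustration.

```shell
# keyid_args KEYID... (hypothetical helper): print one keyid=... parameter
# per key id given, one per line. Call it with at least one key id.
keyid_args() {
    printf 'keyid=%s\n' "$@"
}

# Example use, commented out because it needs git-annex and a real repository.
# The command substitution is left unquoted on purpose so that each
# keyid=... becomes its own argument:
# git annex initremote encryptedbackup type=gcrypt gitrepo=/mnt/encryptedbackup \
#     $(keyid_args "$mykey" "$friendkey")
```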
To restore from the backup, just plug the drive into any machine that has the gpg key used to encrypt it, and then:
git clone gcrypt::/mnt/encryptedbackup restored
cd restored
git annex enableremote encryptedbackup gitrepo=/mnt/encryptedbackup
git annex get --from encryptedbackup
encrypted git-annex repository on a ssh server
If you have a server that has ssh and rsync installed on it, you can set up an encrypted repository there. It works just like the encrypted drive, except without the cable.
This example uses rsync urls in a form supported by git-remote-gcrypt since version 1.4. Older versions won't work with the urls used here; consult its documentation if you have to use an old version.
First, on the server, run:
git init --bare encryptedrepo
Now, in your existing git-annex repository, set up the encrypted remote:
git annex initremote encryptedrepo type=gcrypt gitrepo=rsync://my.server/home/me/encryptedrepo keyid=$mykey
git annex sync encryptedrepo
(Remember to replace "$mykey" with the keyid of your gpg key.)
This uses the gcrypt special remote to encrypt pushes to the git remote, and git-annex will also encrypt the files it stores there. Data is transferred using rsync over ssh.
If you're going to be sharing this repository with others, be sure to also include their keyids, by specifying keyid= repeatedly.
Now you can copy (or even move) files to the repository. After sending files to it, you'll probably want to do a sync, which pushes the git repository changes to it as well.
git annex copy --to encryptedrepo ...
git annex sync encryptedrepo
Anyone who has access to the repo and has one of the keys used to encrypt it can check it out:
git clone gcrypt::rsync://my.server/home/me/encryptedrepo myrepo
cd myrepo
git annex enableremote encryptedrepo gitrepo=rsync://my.server/home/me/encryptedrepo
git annex get --from encryptedrepo
private encrypted git remote on a git-lfs hosting site
Some git repository hosting sites do not support git-annex, but do support the similar git-lfs for storing large files alongside a git repository. git-annex can use the git-lfs protocol to store files in such repositories, and with gcrypt, everything stored in the remote can be encrypted.
First, make a new, empty git repository on the hosting site. Get the ssh clone url for the repository, which might look like "git@github.com:username/somerepo.git".
Then, in your git-annex repository, set up the encrypted remote:
git annex initremote lfstest type=git-lfs url=gcrypt::git@github.com:username/somerepo.git keyid=$mykey
(Remember to replace "$mykey" with the keyid of your gpg key.)
This uses the git-lfs special remote, and the gcrypt:: prefix on the url makes pushes be encrypted with gcrypt.
private encrypted git remote on a git hosting site
You can use gcrypt to store your git repository in encrypted form on any hosting site that supports git. Only you can decrypt its contents. Using it this way, git-annex does not store large files on the hosting site; it's only used to store your git repository itself.
git remote add encrypted gcrypt::ssh://hostingsite/myrepo.git
git push encrypted master git-annex
Now you can carry on using git-annex with your new repository. For example,
git annex sync
will sync with it.
To check out the repository from the hosting site, use the same gcrypt:: url you used when setting it up:
git clone gcrypt::ssh://hostingsite/myrepo.git
multiuser encrypted git remote on a git hosting site
Suppose two users want to share an encrypted git remote. Both of you need to set up the remote, and configure gcrypt to encrypt it so that both of you can see it.
git remote add sharedencrypted gcrypt::ssh://hostingsite/myrepo.git
git config remote.sharedencrypted.gcrypt-participants "$mykey $friendkey"
git push sharedencrypted master git-annex
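Since each participant runs the same steps, they can be wrapped in a small function. This is a sketch only: setup_shared_remote is a hypothetical name, and the remote name, url and key ids are the placeholders from the example above. The setting it writes is gcrypt-participants, a space-separated list of key ids.

```shell
# setup_shared_remote NAME URL KEYID... (hypothetical helper)
# Adds a gcrypt remote and lists every key id allowed to decrypt it.
# Run it inside the git repository that should get the remote.
# "$*" joins the remaining arguments with spaces into a single value.
setup_shared_remote() {
    name=$1
    url=$2
    shift 2
    git remote add "$name" "gcrypt::$url" &&
    git config "remote.$name.gcrypt-participants" "$*"
}

# Example, commented out so that sourcing this file changes nothing:
# setup_shared_remote sharedencrypted ssh://hostingsite/myrepo.git "$mykey" "$friendkey" &&
#     git push sharedencrypted master git-annex
```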
Hi,
I think the current behavior of the special remote is a bit annoying when one has several pgp keys.
Indeed, I've followed the encrypted backup drive example specifying the id of a dedicated key in the initremote step, so far so good. Doing that, I was prompted for my key phrase by the gnome keyring daemon, as expected.
The annoying part starts right at the git annex sync step. Indeed, when git-remote-gcrypt tries to decrypt the manifest from the encrypted remote, rather than trying only the key specified during the initremote step, it tries all my (secret) keys. This means that I get prompted for the key phrase of all those keys (minus the correct one which is already unlocked...).
In the future, it might be possible to avoid this by letting gcrypt fetch a preferred key from git config and use it with the --try-secret-key option available in gnupg 2.1.x. But for 1.x or 2.0.x, the simpler option --default-key does not seem to alter the order in which keys are tried to decrypt the manifest. Also, it does not seem to be a problem with the gnome keyring daemon, but rather a gpg problem: when the daemon is replaced by the standard gpg-agent, the same problem occurs.
Meanwhile, is there any way to avoid this problem?
I'm answering myself :-). A possible solution to the annoying passphrase prompts with current gnupg is to use a specialized secret keyring. First export the secret key used for this repository into a dedicated keyring as follows:
gpg --export-secret-keys keyid | gpg --import --no-default-keyring --secret-keyring mygitannexsecret.gpg
This will create a keyring in $HOME/.gnupg with only the specific key.
Then, in the git-remote-gcrypt shell script, gpg should be called as follows
gpg --no-default-keyring --secret-keyring mygitannexsecret.gpg -q -d ...
when decrypting the manifest in order to try only the specific key. This behavior can be easily triggered via some git configuration variable.
Any comment?
The way I would want to setup git-annex (assistant) is "Wuala/Spideroak style": two computers with a full checkout of the repository, changes automatically being synced between them, even if the two computers are never online simultaneously, and encryption should be done locally: the (special) remote should not be able to view file listings or content.
Do I understand it correctly that the gcrypt remote is the only way to make this happen? I tried to create such a setup via the webapp but failed. Adding the repository and remote (via "Encrypt with GnuPG key") on the first computer went OK*, but trying to enable that remote on the other computer fails: clicking enable asks me for the SSH password, but after that I just get redirected to a blank screen, with nothing to see in the logfile after the successful call to ssh-keygen. No entry for the second computer is being added to authorized_keys on the remote.
Perhaps this is because at this point the assistant is unable to actually parse the content of the encrypted repository? I tried importing the private key that was used while creating the repository on the other computer, but that made no difference.
Thinking about this for a while, I believe gpg keys aren't actually particularly suited for this usecase. Even without the bug above, one would either have to awkwardly copy a private key to all hosts that are syncing to the repository; or, every time a new (or reinstalled) host wants to sync the repository, you would manually have to add the new keyid to the config and do the forced push + GCRYPT_FULL_REPACK, presumably having to reupload your entire history. Apart from this, having to backup a private key (outside of your git-annex based backups!) would be quite inconvenient.
How would you feel about adding a new mode of operation where encryption is simply based on a passphrase? We could symmetrically encrypt the repository with a keyfile that's stored in the repository itself, protecting the keyfile with a passphrase which, if stored at all, would be stored on the individual computers, outside of the repository.
*although it erroneously used "E0D2F776E7F674E3" as key-id while the actual id is E7F674E3; where did that other half come from?
Isn't that what the regular shared-encryption remote already does? Except it doesn't put a passphrase on the key, because anyone who has access to the local repo wouldn't need access to the remote one anyway.
As Adam wrote, without a passphrase, this is the shared encryption method. With an encrypted key, this is more or less the hybrid (default) scheme. The thing is that you have to share a secret to have an encrypted remote. I don't use the webapp, so I don't know what's happening in your case, but this is how it should work with the command line tools. First Alice creates the encrypted remote with her pgp key. As far as I understand, git annex creates (via gpg) a key for a symmetric cipher which is stored in the repository, encrypted with Alice's public key. If Alice wants to share the repository with Bob, she must either give a key pair (so the private key also, of course) to Bob or ask Bob for his public key. In the first case, Bob can clone the repository directly (upon reception of the key pair), while in the second case, Alice has to activate Bob's public key with:
git annex enableremote myremote keyid+=bobsId
In this case, again as far as I understand, the symmetric key is reencrypted for both Alice and Bob in the repo.
I understand that you tried the first case with the webapp and that it did not work. I had a similar problem, documented at http://git-annex.branchable.com/bugs/git-annex-shell:_gcryptsetup_permission_denied. Maybe you could add some comments to this bug description?
This is the long id of your pgp key (16 characters as opposed to 8 for the short id).
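Put differently, the short id is just the last 8 hex digits of the long id. A small sketch of that relationship (the function name short_keyid is made up for illustration):

```shell
# short_keyid LONGID (hypothetical helper): print the last 8 characters of
# a long key id. Prints an empty line if the argument is shorter than 8.
short_keyid() {
    long=$1
    prefix=${long%????????}    # everything except the last 8 characters
    printf '%s\n' "${long#"$prefix"}"
}

short_keyid E0D2F776E7F674E3   # prints E7F674E3
```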
Thanks for the responses. Please correct me if I'm wrong, but the way I understood it, using the shared encryption scheme creates a conflict between "changes being synced between them, even if the two computers are never online simultaneously" and "encryption should be done locally: the (special) remote should not be able to view file listings or content."
If I use shared encryption "the webapp way", only the file contents will be rsynced to the remote, not the repository itself. This means that different hosts are unable to sync unless they are online simultaneously, so that commit data can be sent directly between them via XMPP. In practice, this would mean my hosts are never synced (because I don't keep my home computer running when I leave for work, and vice versa)
If I use shared encryption and additionally put the repository itself on a remote, that remote would have the keys to fully decrypt the repository, that's not acceptable.
Reading through the docs again, the hybrid scheme actually seems to be closer to what I want than the shared scheme, but it still has a major downside: the encryption only applies to the files themselves, so in order to get "offline sync" there still has to be a 'remote' for the repository itself, which will contain all your metadata unencrypted. It would also depend on the user being able to manually set up and back up a set of gpg keys instead of just memorizing a secure passphrase.
@Fabrice Looks like the bug you found could very well be the cause of the problem I had; I'll try it again when a new version is available.
I think you are (at least partially) right. Of course, the only way to completely sync computers that are not on at the same time is to use either a usb drive or a third, always-on computer. (I have to confess I did not understand this at first when I read the git annex docs, shame on me.) If you don't want to trust this computer completely (I don't, for instance), you must:
use an encrypted git repository on this computer;
and use either hybrid or pubkey encryption.
But contrary to what you seem to imply (I hope I understand you correctly), if you do that, the third computer can still figure out a few things (usage patterns, such as where connections come from), but that's all. You've got full sync and everything is encrypted, both the git part and the files handled by the annex. This applies only to encrypted git special remotes, as other remotes do not store the git part.
"We could symetrically encrypt the repository with a keyfile that's stored in the repository itself"
Then you would need to decrypt the repository in order to get the key you need to decrypt the repository. The impossibility of this design is why I didn't do that!
It would certainly be possible to store a non-encrypted gpg key alongside the repository encrypted with it, but then you have to rely on a passphrase for all your security.
You should file a bug report for the bug you saw.
Sorry, I meant that the file containing the symmetric encryption key should obviously not be used to encrypt itself; it would be stored in the repository "unencrypted" (but protected with a passphrase).
Exactly. I think such a mode would be a great addition. It might not be as secure as encryption based on a private key, depending on the passphrase strength, but it would certainly be a lot more convenient and portable (and still much more secure than the shared encryption method).
Hi there,
I tried to follow the instructions provided here, but I can't manage to get my repo encrypted. Here are the steps:
1) git init --bare Encrypted
2) git-annex init
3) git annex initremote encryptedbackup type=gcrypt gitrepo=~/tmp/Encrypted encryption=pubkey keyid=DXXXX
The last step takes a lot of time to run. Basically my key doesn't get used at all:
... instead a new pgp key is generated. How come?!
Any help would be appreciated.
Thx and cheers,
cyneox
@Peter, in your example, it is going to use your gpg key to encrypt files. gpg is being used to generate a 256 bit random value (not a key), which will be used as a random seed for HMAC scrambling of the keys stored in the encrypted special remote.
If that's taking too long to generate for your liking, you can pass --fast, which will make gpg use /dev/urandom to generate it rather than /dev/random.
I'm a bit confused about how gcrypted repositories actually work with git-annex. As far as I can tell, using git-remote-gcrypt with pure git produces a directory containing a couple of files with names looking like hashes. These files contain the whole repository - I checked by cloning the encrypted repo.
With git-annex: The instructions on this page suggest first creating a bare repo (creating a normal git repo layout). When I do this and then do the next step (git annex initremote ...) git-remote-gcrypt complains that the repository doesn't exist (which is correct from its point of view, as there is no encrypted repo yet) and creates a new one (so now there are both structures of a bare git repo and an encrypted repo alongside in the same directory). The setup sort of works, but the bare git repo is never touched after that (or at least it shouldn't be, as it has nothing to do with the encrypted repo).
I've also tried following the instructions, only skipping the first step entirely (ie. no bare repo created). As far as I can tell, git-remote-gcrypt will run "fine" (will create a new encrypted repo), but git-annex itself complains that "could not lock config file <dir>/.git/config" and quits. Interestingly enough, the following gets around this "problem" and also results in a working setup (~/annex is a git-annex repo).
Then
git clone gcrypt::/tmp/test restored
will successfully recover the whole git-annex repo into restored.
So finally the question: is creating a bare git repo really necessary; and if not, is writing into .git/config necessary?
@flabbergast, you seem to be confused about how git-remote-gcrypt stores its data. The data is stored as git commits inside a bare repository. That is why the instructions say to create a bare repository first. (I think it's also possible to use git-remote-gcrypt in a rsync mode where it just uploads encrypted files to an empty directory and does not use a bare git repository, but git-annex does not use it like that.)
Your mkdir and touch commands effectively create a bare git repository too.
If you're having a problem, I suggest filing a bug report (not a comment on this page) with the full details. The examples shown on this page have been tested, and work.
I just created the same setup again that I attempted 1-2 years ago, with the latest self-contained git-annex build. Two clients, one SSH server, using gcrypt. Setup worked flawlessly now (although the process of having to manually export the generated GPG key and import it into all clients is still very awkward); changes made on a client are immediately detected and synced with the server. However, the changes made on client A are never automatically propagated to client B. They are picked up when I restart the annex assistant on client B, but never automatically.
Is this a bug or simply not supported? I read about XMPP being deprecated in favor of notifications via the annex-shell, but I couldn't find a post detailing these changes.
I'm having a problem generating a proper annex map where one of the remotes is gcrypt. Specifically, the map command fails as per below. I will concede that it's not clear though if the failure is that of encryptedown, or the third and final remote dsown, which is an rsync one.
jonas@silk:~/own$ git annex map
map /home/jonas/own ok
map encryptedown failed
jonas@silk:~/own$ git annex info encryptedown
remote: encryptedown
description: VM / Montreal / gcrypt [encryptedown]
uuid: 51ee1422-67d5-56f5-83f3-2718c9996080
cost: 250.0
type: gcrypt
repository location: ssh://198.50.187.233/home/jonas/encryptedown/
encryption: encrypted (to gpg keys: 0C5161298A31D11A) (hybrid mode)
chunking: none
jonas@silk:~/own$ git annex info dsown
remote: dsown
description: DiskStation NAS / Asgatan / rsync [dsown]
uuid: ef3b81aa-47bd-41b4-8672-e371742306cf
cost: 250.0
type: rsync
url: diskstation:/volume1/silkbackup/
encryption: encrypted (encryption key stored in git repository)
chunking: none
I just discovered that cloning a gcrypt encrypted repository over ssh and enabling the remote afterwards somehow messes up the git config:
git clone gcrypt::ssh://user@ip.com:/mnt/encrypted_backup
cd encrypted_backup
git annex enableremote encrypted_backup gitrepo=/.../encrypted_backup
leads to the following in the .git/config of the just cloned repository:
...
[remote "origin"]
    url = gcrypt::ssh://user@ip.com:/mnt/encrypted_backup
    gcrypt-id = :id:12312312
    fetch = +refs/heads/*:refs/remotes/origin/*
[remote "encrypted_backup"]
    url = gcrypt::ssh://user@ip.com:/mnt/encrypted_backup
    fetch = +refs/heads/*:refs/remotes/server/*
    gcrypt-participants = keyid
    gcrypt-signingkey = keyid
    gcrypt-publish-participants = true
    gcrypt-id = :id:adsasd
    annex-gcrypt = shell
    annex-uuid = 312312312
...
Note that for the remote "origin" some config like the signingkey is missing compared to the remote "encrypted_backup".
Then, running
git annex sync --content
leads to an error saying
"gcrypt: Failed to decrypt manifest!"
during the push process. After that I am not able to sync the repository anymore, even with the original repository, which initiated the remote. The encrypted_backup is then somehow messed up.
Removing the "origin" remote via
git remote remove origin
solves the problem for me. But that command has to be run right before the first sync, pull or push command! Otherwise the repository can no longer be synced.
Details
I followed the instructions to create a new encrypted ssh-accessible remote (gcrypt, hybrid). As an additional detail, I have a key generated with the instructions from https://alexcabal.com/creating-the-perfect-gpg-keypair. Everything worked until I tried to set up another machine (which has the same encryption GPG subkey available) to use the repo. After that, neither machine could sync successfully any longer. Here is the result of the commands on the second machine (partially omitted):
Looking at the output, I was convinced that the following line explained the issue:
So I did everything again and took copies of the repo on the server after each interaction. I found out that when the remote is initialized, the encrypted "session key" contains the information on the recipients, but once the remote is synced with the cloned repo, the session key is modified.
I took a look at the encrypted session key in the remote repo before and after (I cloned it directly without trying to decrypt). Here, I've replaced the real key id with "DEADBEEFDEADBEEF":
This means that the recipient information is not included in the file. In practice this means that you need to check if you are the recipient by trying to decrypt the file. Unfortunately, it seems that no key is actually tried. The file is decrypted successfully with:
Solution
As the core issue is that GnuPG fails to use the correct key if recipients are not known, we can tell gpg which keys to try. This can be done by editing ~/.gnupg/gpg.conf. Set the 'default-key' option or, if the key you are using for this purpose is not your default key, set the 'try-secret-key' option.
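As a sketch of that edit: the helper name add_try_secret_key is made up for illustration, DEADBEEFDEADBEEF is the placeholder id used above, and try-secret-key is only understood by GnuPG 2.1 or later.

```shell
# add_try_secret_key KEYID [CONF] (hypothetical helper)
# Appends "try-secret-key KEYID" to gpg.conf unless that exact line is
# already there. CONF defaults to gpg.conf in the GnuPG home directory.
add_try_secret_key() {
    keyid=$1
    conf=${2:-${GNUPGHOME:-$HOME/.gnupg}/gpg.conf}
    if ! grep -Fqx -- "try-secret-key $keyid" "$conf" 2>/dev/null; then
        printf 'try-secret-key %s\n' "$keyid" >> "$conf"
    fi
}

# Example, commented out so nothing is written by accident:
# add_try_secret_key DEADBEEFDEADBEEF
```

Calling it twice with the same id leaves a single line in the file, so it is safe to rerun.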
The other issue is that the recipients are removed from the session key. You can make sure that the correct recipients are kept by setting the following options for the remote in the cloned repo (.git/config):
I suspect that many users who have multiple keys may run into this issue. The previous comment also seems to be a variant of this.