Hello, the teams I work with have repositories for tracking CI pipelines and build scripts. There are binary resources, and sensitive information, that we would like to somehow be able to store with the repo, but in a secure fashion. Would a scenario like this be feasible with git-annex?
- create an annex attached to existing code repositories, with s3 as the special remote.
- each developer is able to read from and add to the encrypted bucket using either the key they sign commits with or an ssh key
We already reject non-signed commits, and are not public-facing in our repositories or accessible without credentials to s3. The developers with access to the repository are all of the same access level internal to the company with permission to do what they must with the keys. I'm sorry if this is an obvious 'yes' or 'no' question. Using git-annex privately as a file store for myself thus far has been excellent.
I think you could probably achieve what you need to (depending on your specific needs).
There are some general notes on encryption at http://git-annex.branchable.com/encryption/ and http://git-annex.branchable.com/design/encryption/, and some insights into `git-annex` internals with respect to encryption here: https://git-annex.branchable.com/tips/Decrypting_files_in_special_remotes_without_git-annex/.
I think you could set up s3 as a special remote named sensitive-s3 using `git annex initremote` with `encryption=hybrid`, `keyid=DEV1_KEYID` and `embedcreds=no`.
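A minimal sketch of that setup, wrapped in a shell function so the key ID is a parameter. The function name `init_sensitive_s3` is made up for illustration, and the sketch assumes `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are already exported in your environment; bucket and region options are left at their defaults.

```shell
# Sketch only: create an encrypted S3 special remote called sensitive-s3.
# Assumes AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are exported and that
# the key ID passed in names a key in your GPG keyring.
init_sensitive_s3() {
    keyid="$1"   # e.g. DEV1_KEYID
    git annex initremote sensitive-s3 type=S3 \
        encryption=hybrid keyid="$keyid" embedcreds=no
}

# Usage, from inside the repository:
#   init_sensitive_s3 DEV1_KEYID
```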
`DEV1_KEYID` is a name that the user's GPG keyring can recognize (sounds like you already have those). `hybrid` encryption means (I think?) that the DEV1_KEYID GPG public key is used to encrypt a symmetric cipher that is stored in the repo. All the content on sensitive-s3 will be encrypted using the symmetric cipher. By default DEV1_KEYID will also be used to encrypt `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`, which are then stored encrypted in the repo; `embedcreds=no` means don't store that info in the repo.
If you want to grant another developer access to `sensitive-s3`, then run something like `git annex enableremote sensitive-s3 keyid+=DEV2_KEYID`. I can't find any documentation of what happens next, but I assume `git-annex` re-encrypts the symmetric cipher using multi-key encryption so that both DEV1_KEYID and DEV2_KEYID can decrypt it using either of their private keys. Because `git-annex` doesn't actually encrypt your files using GPG keys when using `hybrid` encryption, you don't need to re-upload or re-encrypt any files. `git-annex` is only using the GPG keys to grant access to a small encrypted file containing a symmetric cipher that is used for the actual encryption of files.
The one main drawback with this design is that it is difficult to revoke access. If you want, at a later date, to revoke DEV2's access to sensitive-s3, you can't do that using any built-in `git-annex` feature. You could give each dev their own AWS creds up-front, then at the very least you could revoke those on AWS. If you need to, you could also delete the old cipher, generate a new one, and re-upload all files with a new cipher that only the remaining developers have access to.
Another workflow is to use `encryption=pubkey`. Again init the repo on s3, and add the keys of all your devs. Then files on sensitive-s3 will be encrypted using (I think) multi-key encryption that any of the devs can decrypt using their private key.
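A rough sketch of the pubkey variant, assuming the remote is initialised with the first developer's key and the others are added afterwards with `keyid+=`. The function name `init_pubkey_s3` is hypothetical, and `embedcreds=no` is simply carried over from the hybrid setup rather than something this workflow requires.

```shell
# Sketch only: pubkey-encrypted S3 remote shared by several developers.
# First argument: key ID used to initialise the remote.
# Remaining arguments: further developers' key IDs, added one at a time.
init_pubkey_s3() {
    first="$1"
    shift
    git annex initremote sensitive-s3 type=S3 \
        encryption=pubkey keyid="$first" embedcreds=no
    for keyid in "$@"; do
        git annex enableremote sensitive-s3 keyid+="$keyid"
    done
}

# Usage, before any files are copied to the remote:
#   init_pubkey_s3 DEV1_KEYID DEV2_KEYID
```

The same `enableremote ... keyid+=` form is what grants a second developer access to a hybrid remote; the difference is only in whether existing files need re-uploading afterwards.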
If you want to remove a dev later, then you would have to tell git-annex to remove their key, drop all files from sensitive-s3 (since they are readable by the revoked dev), then re-upload all files.
The drawback of `encryption=pubkey` is that you can't easily add a key for a newly added developer later on. If you want to add a new developer's key after you have uploaded files to s3, you will have to drop all the files, add the new key to git-annex, make sure all your devs do a `git-annex sync` to get the new set of public keys to use for encryption, then re-upload all the files to s3.
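Removing a developer and adding one after files exist both come down to the same drop, change key, re-upload sequence, so they can be sketched as one function. This is an untested outline: `rekey_sensitive_s3` is a made-up name, and it assumes every annexed file is also present in the local repository, so that dropping from s3 loses nothing and there is content to re-upload.

```shell
# Sketch only: change who can decrypt a pubkey-encrypted remote.
# $1 is the key change to apply, for example:
#   "keyid-=DEV2_KEYID"  to remove a developer
#   "keyid+=DEV3_KEYID"  to add one
# Assumes all annexed content is available locally before running.
rekey_sensitive_s3() {
    change="$1"
    git annex drop --from sensitive-s3 . &&
    git annex enableremote sensitive-s3 "$change" &&
    git annex sync &&
    git annex copy --to sensitive-s3 .
}

# Usage:
#   rekey_sensitive_s3 "keyid-=DEV2_KEYID"
```

The other developers still need to run their own sync before uploading anything, otherwise they will keep encrypting to the old set of keys.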