git-annex extends git's usual remotes with special remotes, which are not git repositories. This way you can set up a remote using, say, Amazon S3, and use git-annex to transfer files into the cloud.
First, export your Amazon AWS credentials:
# export AWS_ACCESS_KEY_ID="08TJMT99S3511WOZEP91"
# export AWS_SECRET_ACCESS_KEY="s3kr1t"
Now, create a gpg key, if you don't already have one. This will be used to encrypt everything stored in S3, for your privacy. Once you have a gpg key, run gpg --list-secret-keys to look up its key id, something like "2512E3C7".
Next, create the S3 remote, and describe it.
# git annex initremote cloud type=S3 chunk=1MiB keyid=2512E3C7
initremote cloud (encryption setup with gpg key C910D9222512E3C7) (checking bucket) (creating bucket in US) (gpg) ok
# git annex describe cloud "at Amazon's US datacenter"
describe cloud ok
The configuration for the S3 remote is stored in git, so making another repository use the same S3 remote is easy:
# export AWS_ACCESS_KEY_ID="08TJMT99S3511WOZEP91"
# export AWS_SECRET_ACCESS_KEY="s3kr1t"
# git pull laptop
# git annex enableremote cloud
enableremote cloud (gpg) (checking bucket) ok
Notice that to enable an existing S3 remote, you have to provide the Amazon AWS credentials, because they were not stored in the repository. (It is possible to configure git-annex to store them there, but that is not the default.)
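Since enabling the remote depends on the credentials being exported, a small guard can catch a missing variable before git-annex runs. This is only a sketch: check_aws_env is a hypothetical helper name, not part of git-annex, and it only checks that both variables are exported and non-empty, not that they are valid.

```shell
# Hypothetical helper (not part of git-annex): succeed only if both AWS
# credential variables are exported and non-empty.
check_aws_env() {
    # A child shell only sees exported variables, which is also all that
    # git-annex would see when it is started from this shell.
    if ! sh -c '[ -n "$AWS_ACCESS_KEY_ID" ] && [ -n "$AWS_SECRET_ACCESS_KEY" ]'; then
        echo "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be exported first" >&2
        return 1
    fi
}

# Usage in the second repository (requires git-annex):
#   check_aws_env && git annex enableremote cloud
```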
further reading
See S3 for details about configuring S3 remotes.
See public Amazon S3 remote for how to set up an Amazon S3 remote that can be used by the public, without them needing AWS credentials.
If you want to publish files to S3 so they can be accessed without using git-annex, see publishing your files to the public.
AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY?
You can use git annex enableremote to change an existing remote's configuration. So this should work:
Jack, if you don't want to use encryption you can use encryption=none as documented here.
I'm not sure exactly what you're trying to do, but please note that your files won't be easily available on S3: they will be named as git-annex keys, with long and unreadable names such as "SHA256E-s6311--c7533fdd259d872793b7298cbb56a1912e80c52a845661b0b9ff391c65ee2abc.html" instead of "index.html".
I don't know if this is what Jack wanted, but you can upload your files to S3 and let them be accessible through a public URL.
First, go to (or create) the bucket you will use at S3 and add a public get policy to it:
Then set up your special remote with the options encryption=none, bucket='BUCKETNAME', chunk=0 (and any others you want).
Your files will be accessible through http://BUCKETNAME.s3-website-LOCATION.amazonaws.com/KEY where LOCATION is the one specified through the datacenter option and KEY is the SHA-SOMETHING hash of the file, created by git annex and accessible if you run git annex lookupkey FILEPATH.
This way you can share a link to each file you have at your S3 remote.
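The URL pattern described in this comment can be assembled with a small shell function. This is a sketch: annex_public_url is a hypothetical name, and "us-east-1" in the usage lines is only a placeholder for whatever datacenter the remote was created with.

```shell
# Hypothetical helper: print the public URL for a git-annex key stored in a
# public S3 bucket, following the pattern given in the comment above.
# Arguments: bucket name, datacenter location, git-annex key.
annex_public_url() {
    bucket=$1
    location=$2
    key=$3
    printf 'http://%s.s3-website-%s.amazonaws.com/%s\n' "$bucket" "$location" "$key"
}

# Usage inside an annexed repository (requires git-annex; the bucket name and
# location are placeholders):
#   key=$(git annex lookupkey index.html)
#   annex_public_url BUCKETNAME us-east-1 "$key"
```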
I use github as my central git repository and I would like to use S3 to store large files with annex. Since the s3 remote in .git/config is not stored in github, how do I make sure I reconnect to the same s3 bucket in case I delete my local clone? Reinitializing the remote will create a completely new bucket.
It would also be a good idea to centralize git-annex folders inside a single bucket so I keep the global namespace under control and can narrow down the permissioning.
Lemao, make sure you have pushed your git-annex branch to your central git repository.
When you clone that repo elsewhere, you can add the S3 remote by running git annex enableremote cloud (replace "cloud" with whatever name you originally picked when you used git annex initremote to set up the S3 remote in the first place).
git-annex stores the necessary configuration of the S3 remote on the git-annex branch.
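One way to confirm that the git-annex branch really reached the central repository before deleting a local clone is to ask that repository directly. This is a sketch using plain git: remote_has_annex_branch is a hypothetical helper name, and "origin" in the usage lines is assumed to be the central repository.

```shell
# Hypothetical helper: succeed if the repository at $1 (a remote name, URL,
# or path) has a branch named git-annex.
remote_has_annex_branch() {
    # --exit-code makes git ls-remote fail when no matching ref is found.
    git ls-remote --exit-code --heads "$1" refs/heads/git-annex >/dev/null 2>&1
}

# Usage in the original repository ("origin" is an assumed remote name):
#   remote_has_annex_branch origin || git push origin git-annex
```

If the check passes, a fresh clone should be able to run git annex enableremote with the original remote name, provided the AWS credentials are exported.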
Even after enableremote I can't get from s3.
This is after all branches are pushed from my original repo. Any suggestions?
RE: my last comment
The reason I couldn't get it to work is that I didn't have proper read access to the bucket. My bad for not checking first, but it would be great if there were a clearer error message from git-annex and/or a way to get more detailed information on the s3 extension (-d doesn't do much).
Regardless git-annex is pretty cool, thanks to all the maintainers for their hard work.
Hi, I am trying to enable access to my s3 area from a clone. I am running into this issue:
My gpg key is available :
I would expect this to pop up a dialog asking me for my passphrase, as it will when I run the gpg command from a term.
Any ideas?
@james, since your keyring apparently contains your secret key, the problem may be in the configuration of your gpg agent or pinentry program. If the agent is unable to use pinentry for some reason, gpg will complain that the secret key is unavailable since it is unable to get the passphrase to unlock it.
I had a similar problem with gpg2 the other day: http://bugs.debian.org/791379
For me, the git annex initremote amazon-s3 encryption=shared embedcreds=yes [1] command hung for several minutes after printing
Turns out the problem was that I was low on entropy. Figured this out by running
per this bug comment. According to this blog post a solution is to
The git annex initremote command had finished by the time I found that solution, but I verified that it made gpg --gen-random 2 work.
System: Ubuntu 16.04 with Git Annex 5.20151208-1build1 installed via package manager.
[1] I'm using AWS credentials that are restricted to a specific bucket, so I'm not worried about the conjunction encryption=shared and embedcreds=yes.
You can also use the --fast option to make git-annex use less entropy when generating the encryption key. That's a little less secure, but probably still secure enough.