This special remote type stores file contents in a bucket in Amazon S3 or a similar service.

See using Amazon S3 and Internet Archive via S3 for usage examples.

configuration

The standard environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are used to supply login credentials for Amazon. You need to set these only when running git annex initremote, as they will be cached in a file only you can read inside the local git repository.
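
For instance, a minimal sketch (the remote name "mys3" and the credential values are placeholders):

    # Placeholder values; substitute your own Amazon credentials.
    export AWS_ACCESS_KEY_ID="your-access-key-id"
    export AWS_SECRET_ACCESS_KEY="your-secret-access-key"
    git annex initremote mys3 type=S3 encryption=shared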

A number of parameters can be passed to git annex initremote to configure the S3 remote; a combined example follows the list.

  • encryption - One of "none", "hybrid", "shared", or "pubkey". See encryption.

  • keyid - Specifies the gpg key to use for encryption.

  • chunk - Enables chunking when storing large files. chunk=1MiB is a good starting point for chunking.

  • embedcreds - Optional. Set to "yes" to embed the login credentials inside the git repository, which allows other clones to also access them. This is the default when gpg encryption is enabled; the credentials are stored encrypted and only those with the repository's keys can access them.

    It is not the default when using shared encryption, or no encryption. Think carefully about who can access your repository before using embedcreds without gpg encryption.

  • datacenter - Defaults to "US". Other values include "EU", "us-west-1", "us-west-2", "ap-southeast-1", "ap-southeast-2", and "sa-east-1".

  • storageclass - Default is "STANDARD". If you have configured git-annex to preserve multiple copies, consider setting this to "REDUCED_REDUNDANCY" to save money.

  • host and port - Specify these in order to use a different, S3-compatible service.

  • bucket - S3 requires that buckets have a globally unique name, so by default, a bucket name is chosen based on the remote name and UUID. This can be specified to pick a bucket name.

  • partsize - Amazon S3 only accepts uploads up to a certain file size, and storing larger files requires a multipart upload process.

    Setting partsize=1GiB is recommended for Amazon S3 when not using chunking; this will cause multipart uploads to be done using parts up to 1GiB in size. Note that setting partsize to less than 100MiB will cause Amazon S3 to reject uploads.

    This is not enabled by default, since other S3 implementations may not support multipart uploads or may have different limits, but it can be enabled or changed at any time.

    NOTE: there is a bug that depends on the version of the AWS library in use. See this comment (the latest as of now).

  • fileprefix - By default, git-annex places files in a tree rooted at the top of the S3 bucket. When this is set, it's prefixed to the filenames used. For example, you could set it to "foo/" in one special remote, and to "bar/" in another special remote, and both special remotes could then use the same bucket.

  • x-amz-meta-* are passed through as http headers when storing keys in S3.
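
As a combined example, here is a sketch that uses several of the parameters above; the remote name, gpg key id, and fileprefix are placeholders:

    # Placeholder names throughout; adjust to your own setup.
    git annex initremote mys3 type=S3 \
        encryption=hybrid keyid=2512E3C7 embedcreds=yes \
        chunk=1MiB datacenter=EU \
        storageclass=REDUCED_REDUNDANCY fileprefix=foo/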

Just noting that the environment variables ANNEX_S3_ACCESS_KEY_ID and ANNEX_S3_SECRET_ACCESS_KEY seem to have been changed to AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
Comment by Matt Tue May 29 12:40:25 2012
Thanks, I've fixed that. (You could have too.. this is a wiki ;)
Comment by joeyh.name Tue May 29 19:10:46 2012
Thanks! Being new here, I didn't want to overstep my boundaries. I've gone ahead and made a small edit and will do so elsewhere as needed.
Comment by Matt Wed May 30 00:26:33 2012

it'd be really nice to be able to configure an S3 remote of the form <bucket>/<folder> (not really a folder, of course, just the usual prefix trick used to simulate folders in S3). The remote = bucket architecture does not scale at all in terms of the number of repositories.

how hard would it be to support this?

thanks, this is the only thing that's holding us back from using git-annex, nice tool!

Comment by Eduardo Thu Aug 9 10:52:07 2012
I guess this could be useful if you already have a lot of buckets in use at S3, or if you want to be able to have a lot of distinct S3 special remotes. Implemented the fileprefix setting. Note that I have not tested it beyond checking that it builds, since I let my S3 account expire. Your testing would be appreciated.
Comment by joeyh.name Thu Aug 9 18:01:06 2012

Any chance I could bribe you to set up Rackspace Cloud Files support? We are using them and would hate to have an S3 bucket only for this.

https://github.com/rackspace/python-cloudfiles

Comment by alan Thu Aug 23 21:00:11 2012
Joey, I'm curious to understand how future-proof an S3 remote is. Can I restore my files without git-annex?
Comment by Eric Sun Jan 20 09:21:50 2013

If encryption is not used, the files are stored in S3 as-is, and can be accessed directly. They are stored in a hashed directory structure, named by their keys rather than their original filenames. To get back to the original filenames, a copy of the git repo is also needed.
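
For example, here is a hedged sketch of such a recovery using the AWS command line tools, assuming an unencrypted remote; the bucket name is a placeholder:

    # Inside a clone of the git repo, map a filename to its git-annex key:
    key=$(git annex lookupkey path/to/file)
    # Locate the object in the bucket; the layout may involve a fileprefix
    # or hashed directories, so search for the key name:
    aws s3 ls --recursive "s3://bucketname/" | grep "$key"
    # Download the object directly, bypassing git-annex (here assuming the
    # object is stored at the top of the bucket under its key name):
    aws s3 cp "s3://bucketname/$key" path/to/file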

With encryption, you need the gpg key used in the encryption, or, for shared encryption, a symmetric key which is stored in the git repo.

See future proofing for non-S3 specific discussion of this topic.

Comment by joeyh.name Sun Jan 20 20:37:09 2013

How do I recover a special remote from a clone, please? I see that remote.log has most of the details, but my remote is not configured on my clone and I see no obvious way to do it. And I used embedcreds, but the only credentials I can see are stored in .git/annex/creds/, so they did not survive the clone. I'm confused because the documentation here for embedcreds says that clones should have access.

As a workaround, copying the remote definition over from .git/config, as well as the credentials from .git/annex/creds/, seems to work. Is there some other way I'm supposed to do this, or is this the intended way?

Comment by basak Wed May 22 18:32:05 2013

You can enable a special remote on a clone by running git annex enableremote $name, where $name is the name you used to originally create the special remote. (Older versions of git-annex used git annex initremote to enable the special remote on the clone.)
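
For example, assuming the special remote was originally created with the name "mys3":

    # If embedcreds was not used, supply the credentials again first:
    export AWS_ACCESS_KEY_ID="your-access-key-id"
    export AWS_SECRET_ACCESS_KEY="your-secret-access-key"
    git annex enableremote mys3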

(Just in case, I have verified that embedcreds does cause the cipher= to be stored in the remote.log. It does.)

Comment by joey Thu May 23 20:04:03 2013

Thanks Joey - initremote on my slightly older version appears to work. I'll use enableremote when I can.

(Just in case, I have verified that embedcreds does cause the cipher= to be stored in the remote.log. It does.)

This doesn't do what I expect. The documentation suggests that my S3 login credentials would be stored. I understand that the cipher would be stored; but isn't that a separate concept? Instead, I'm being asked to set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY; my understanding was that git-annex would keep them in the repository for me, so that I wouldn't have to set them again after running initremote. This works locally, but the credentials don't survive cloning. I'm using encryption=shared; does this affect anything? Or am I using a version of git-annex (3.20121112ubuntu3) that's too old?

Comment by basak Fri May 24 09:38:40 2013
Ah -- no, your AWS creds are not stored. While some other special remotes, like webdav, can store all the necessary credentials, it's not done for AWS. I didn't want git-annex to be responsible for someone accidentally publishing their AWS creds to their friends, since that could cost them a lot of money.
Comment by joey Fri May 24 15:33:12 2013

That's not what the documentation here says! It even warns me: "Think carefully about who can access your repository before using embedcreds without gpg encryption."

My use case:

Occasional use of EC2, and a desire to store some persistent stuff in S3, since the dataset is large and I have limited bandwidth. I want to destroy the EC2 instance when I'm not using it, leaving the data in S3 for later.

If I use git-annex to manage the S3 store, then I get the ability to clone the repository and destroy the instance. Later, I can start a new instance, push the repo back up, and would like to be able to then pull the data back out of S3 again.

I'd really like the login credentials to persist in the repository (as the documentation here says they should). Even if I have to add a --yes-i-know-my-s3-credentials-will-end-up-available-to-anyone-who-can-see-my-git-repo flag. This is because I use some of my git repos to store private data, too.

If I use an Amazon IAM policy as follows, I can generate a set of credentials that are limited to access to a particular prefix of a specific S3 bucket only - effectively creating a sandboxed area just for git-annex:

{
  "Statement": [
    {"Sid": "Stmt1368780615583",
     "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
     "Effect": "Allow",
     "Resource": ["arn:aws:s3:::bucketname/prefix/*"]},
    {"Sid": "Stmt1368781573129",
     "Action": ["s3:GetBucketLocation"],
     "Effect": "Allow",
     "Resource": ["arn:aws:s3:::bucketname"]}
  ]
}

Doing this means that I have a different set of credentials for every annex, so it would be really useful to be able to have these stored and managed within the repository itself. Each set is limited to what the annex stores, so there is no bigger compromise to worry about beyond the compromise of the data that the annex itself manages.

Comment by basak Fri May 24 15:47:14 2013

I apologise for incorrect information. I was thinking about defaults when using the webapp.

I have verified that embedcreds=yes stores the AWS creds, always.
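
So for the use case above, a sketch with placeholder names (note the warning in the embedcreds documentation: with encryption=shared, anyone who can read the git repository can read the embedded creds):

    git annex initremote mys3 type=S3 encryption=shared embedcreds=yes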

Comment by joey Fri May 24 16:45:25 2013
Is it possible to change the S3 endpoint host? I'm running a radosgw with S3 support, which I'd like to define as an S3 remote for git-annex.
Comment by Tobias Fri Aug 23 08:59:32 2013
Yes, you can specify the host to use when setting up the remote. It's actually documented earlier on this very page, if you search for "host". Any S3-compatible host will probably work -- the Internet Archive's S3 does, for example.
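
A hedged sketch for a radosgw or other S3-compatible endpoint (hostname, port, and names are placeholders):

    git annex initremote myrados type=S3 encryption=none \
        host=radosgw.example.com port=80 bucket=mybucket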
Comment by joeyh.name Fri Aug 23 17:39:56 2013