Please describe the problem.
When I attempt to create an S3 remote against my garage1 cluster, it errors with the following:
$ git annex initremote garage type=S3 encryption=none host=my-s3-endpoint.domain.com protocol=https bucket=git-annex requeststyle=path datacenter=garage signature=v4
initremote garage (checking bucket...) (creating bucket in garage...)
git-annex: S3Error {s3StatusCode = Status {statusCode = 400, statusMessage = "Bad Request"}, s3ErrorCode = "AuthorizationHeaderMalformed", s3ErrorMessage = "Authorization header malformed, expected scope: 20230118/my-s3-endpoint.domain.com/s3/aws4_request", s3ErrorResource = Just "/git-annex/", s3ErrorHostId = Nothing, s3ErrorAccessKeyId = Nothing, s3ErrorStringToSign = Nothing, s3ErrorBucket = Nothing, s3ErrorEndpointRaw = Nothing, s3ErrorEndpoint = Nothing}
failed
initremote: 1 failed
$ git annex initremote garage type=S3 encryption=none host=my-s3-endpoint.domain.com protocol=https bucket=git-annex requeststyle=path datacenter=garage
initremote garage (checking bucket...) (creating bucket in garage...)
git-annex: S3Error {s3StatusCode = Status {statusCode = 400, statusMessage = "Bad Request"}, s3ErrorCode = "InvalidRequest", s3ErrorMessage = "Bad request: Unsupported authorization method", s3ErrorResource = Just "/git-annex/", s3ErrorHostId = Nothing, s3ErrorAccessKeyId = Nothing, s3ErrorStringToSign = Nothing, s3ErrorBucket = Nothing, s3ErrorEndpointRaw = Nothing, s3ErrorEndpoint = Nothing}
failed
initremote: 1 failed
Garage appears to support v4 signatures: https://garagehq.deuxfleurs.fr/documentation/reference-manual/s3-compatibility/#high-level-features - and other S3 tooling works against the endpoint.
What version of git-annex are you using? On what operating system?
Fedora Silverblue 37 / git-annex-10.20221212-1.fc37.x86_64
Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Yes, many years ago - now trying to get it up and running with my self-hosted S3 endpoint.
fixed although it needs git-annex to be built against a not yet released version of aws. --Joey
I took a look at the credentialV4 structure at https://github.com/aristidb/aws/blob/9bdc4ee018d0d9047c0434eeb21e2383afaa9ccf/Aws/Core.hs#L621 and found it curious that it has the region inside the scope (as the garage code does). However, in my error message from git-annex, the hostname of the S3 service is what's inside the scope instead of the 'garage' region name.
I therefore adjusted the garage API's configuration to have the FQDN as the region and then... git-annex Just Worked.
I believe the fix for this is:
...however I cannot test it myself right now as it's failing to compile on another bit of code:
I think this can all be removed as only the 'region' should be in the S3.* calls:
But now I get this as an error and I do not know why:
Firstly, this code is working, as far as I know, when accessing AWS. And I want to be very careful to avoid breaking that. So patching all that out would have to be very carefully examined and/or tested.
I've verified that the s3Endpoint is supposed to be a hostname. It's used as such in s3SignQuery:
If s3Endpoint were just "garage" that would break.
s3SignQuery also has region = s3ExtractRegion s3Endpoint. s3ExtractRegion parses a hostname like "s3-foo.amazonaws.com" to "foo", and that is used as the region (or "scope" as you've called it).
git-annex makes sure to set s3Endpoint to a hostname. When the default AWS hostname is used, it converts the datacenter=foo value to a hostname like s3-foo.amazonaws.com and sets s3Endpoint to that. When some other host= is provided, it sets s3Endpoint to that hostname, ignoring the datacenter= value.
This seems like the only thing git-annex can do. Your proposed patch would make a configuration of "datacenter=us-west-1" set s3Endpoint to "us-west-1" and s3SignQuery would use that as the host.
So I think this is probably a bug or shortcoming of the aws library. It seems that to fix this, the aws library would need to have a way to specify a "scope" or "region" separate from the s3Endpoint. And git-annex would then need to expose that as something other than datacenter=
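As a rough illustration of the hostname-to-region step discussed above, here is a minimal shell sketch. It is not the aws library's code: the function name extract_region is hypothetical, and the pass-through behaviour for unrecognised hostnames is an assumption inferred from the scope shown in the error message in this report.

```shell
#!/bin/sh
# Hypothetical stand-in for the hostname-to-region step; not the aws library's code.
extract_region() {
    host=$1
    case "$host" in
        s3-*.amazonaws.com)
            # Strip the "s3-" prefix and the ".amazonaws.com" suffix,
            # so s3-foo.amazonaws.com becomes foo.
            region=${host#s3-}
            printf '%s\n' "${region%.amazonaws.com}"
            ;;
        *)
            # Assumption: an unrecognised hostname is passed through unchanged,
            # which is how a custom endpoint would end up in the signing scope.
            printf '%s\n' "$host"
            ;;
    esac
}

extract_region s3-foo.amazonaws.com
extract_region my-s3-endpoint.domain.com
```

With a self-hosted endpoint there is no region embedded in the hostname to extract, which is why a separate region setting is needed.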
I've filed an issue on aws about this: https://github.com/aristidb/aws/issues/283
It's working, because presumably AWS accept 's3.amazon.com' as a region setting for the scope in their cloud.
My S3 endpoint is indeed the FQDN of my self-hosted S3 endpoint. My "AWS region" is "garage".
To give exact examples, this is what I configure in my environment when interacting with garage with any other S3 utility:
export AWS_ACCESS_KEY_ID=foo
export AWS_SECRET_ACCESS_KEY=bar
export AWS_DEFAULT_REGION='garage'
export AWS_ENDPOINT='https://my-s3-endpoint.example.com/'
The region bit is what's added to the credentialV4 in Core.hs (and what AWS appear to also accept as s3.amazon.com? I do not have an AWS account to test this). To make git-annex work, I need to do the equivalent of this (after changing garage's configuration):
export AWS_DEFAULT_REGION='my-s3-endpoint.example.com'
Once I do this, awscli and every other tool breaks:
$ aws s3 ls s3://
Provided region_name 'my-s3-endpoint.example.com' doesn't match a supported format.
It might be best to deprecate datacenter= and add region= directly to match AWS parlance
What I'd really like to test is just being able to set datacenter=garage but with my change it's not accepted as a [Char].
I'd also like to clarify this bit in that what I mean by "scope" is that line in the error message from the backend:
git-annex is incorrectly putting the FQDN/endpoint of the service in the scope. What garage and the linked Haskell library expect after the date is the region.
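For reference, a Signature Version 4 credential scope has the form date/region/service/aws4_request. The sketch below only assembles that string for the two cases discussed here; the helper name credential_scope is made up for illustration, and the date is the one from the error message above.

```shell
#!/bin/sh
# credential_scope DATE REGION
# Hypothetical helper: prints a SigV4 credential scope for the s3 service.
credential_scope() {
    printf '%s/%s/s3/aws4_request\n' "$1" "$2"
}

# Scope when the endpoint hostname is used in the region position:
credential_scope 20230118 my-s3-endpoint.domain.com

# Scope when the configured region name is used instead:
credential_scope 20230118 garage
```

The two strings differ only in the second field, which is the part the server checks against its configured region.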
I doubt it; that's not what aws sends to it.
Just to be clear, comment #5 is me fully understanding the problem and escalating the issue to the relevant library. Further speculation would probably muddy the water.
Thanks for clarifying that "scope" is internal jargon of garage though. Also it's good to know that AWS_DEFAULT_REGION is a commonly used environment variable for it.
Implemented this in aws: https://github.com/aristidb/aws/pull/284, which should be released as version 0.24.
git-annex will support region= when built with that version of aws.
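As a sketch of what that could look like, the command from the original report can be adapted by replacing datacenter=garage with region=garage. This assumes a git-annex built against aws 0.24 or later, reuses the placeholder hostname and bucket from the report, and has not been run here.

```shell
# Assumes git-annex built with aws >= 0.24; host and bucket are the
# placeholders from the original report, not real values.
git annex initremote garage type=S3 encryption=none \
    host=my-s3-endpoint.domain.com protocol=https bucket=git-annex \
    requeststyle=path region=garage signature=v4
```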
Thank you for the changes Joey, I can confirm that it's working and I've added this to the garage docs: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/502
I did however have to change git-annex's stack.yaml as follows to make this compile:
On my machine, it wasn't happy about the aeson version without the lts bump and I also hit the bug at https://github.com/snoyberg/http-client/issues/482 hence the http-client change.
I believe that this is actually an AWS thing, as it's extensively used in their Python and Rust SDKs: https://github.com/awslabs/aws-sdk-rust/blob/66423e05991ee831696bc32fe3e452694cf0d231/sdk/s3/src/config.rs#L98