Creating a special S3 remote to hold files shareable by URL

(In this example, I'll assume you are creating an S3 bucket named public-annex and a git-annex special remote named public-s3 that stores its files in that bucket; substitute your own names when doing this for real.)

Set up your special S3 remote with (at least) these options:

git annex initremote public-s3 type=S3 encryption=none bucket=public-annex chunk=0 public=yes

This way git-annex will upload the files to this remote (when you run git annex copy [FILES...] --to public-s3) without encrypting them and without chunking them. And, thanks to public=yes, they will be accessible by anyone with the link.
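
For example, a typical upload then looks like this (the file name is just a placeholder):

git annex add report.pdf
git annex copy report.pdf --to public-s3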

(Note that public=yes was added in git-annex version 5.20150605. If you have an older version, it will be silently ignored, and you will instead need to use the AWS dashboard to configure a public get policy for the bucket.)

Following the example, the files will be accessible at https://public-annex.s3.amazonaws.com/KEY, where KEY is the key git-annex created for the file, which you can discover by running

git annex lookupkey FILEPATH

This way you can share a link to each file you have at your S3 remote.
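
For instance, composing the pieces above, this one-liner (bash syntax) prints the shareable URL for a single file:

echo "https://public-annex.s3.amazonaws.com/$(git annex lookupkey path/to/file)"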

Sharing all links in a folder

To share all the links in a given folder, for example, you can go to that folder and run the following (this example uses the fish shell; an equivalent bash loop is shown after it):

for filename in *
    echo $filename": https://public-annex.s3.amazonaws.com/"(git annex lookupkey $filename)
end
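
In bash, an equivalent loop would be:

for filename in *; do
    echo "$filename: https://public-annex.s3.amazonaws.com/$(git annex lookupkey "$filename")"
done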

Sharing all links matching certain metadata

The same applies to any filter you can express with git annex find.

For example, let's share links to all the files whose author's name starts with "Mario" and which are actually stored at your public-s3 remote. This time, instead of just a list of links, we will output a markdown-formatted list of the filenames linked to their S3 URLs:

for filename in (git annex find --metadata "author=Mario*" --and --in public-s3)
    echo "* ["$filename"](https://public-annex.s3.amazonaws.com/"(git annex lookupkey $filename)")"
end
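
In case your files aren't tagged yet, this kind of metadata can be set with git-annex itself (the author value and file name here are just placeholders):

git annex metadata --set author="Mario Rossi" some-file.pdf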

Very useful.

Sharing links with time-limited URLs

By using pre-signed URLs it is possible to limit how long a URL stays valid for retrieving an object. To enable this, use a private S3 bucket for the remote, and then pre-sign the actual URL with the sign_s3_url.bash script from AWS-Tools. Example:

key=$(git annex lookupkey "$fname")
sign_s3_url.bash --region 'eu-west-1' --bucket 'mybuck' --file-path "$key" --aws-access-key-id XX --aws-secret-access-key XX --method 'GET' --minute-expire 10
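
If you have the AWS CLI installed and configured, its built-in presigner is an alternative to the script above (same bucket name as in the example; the expiry here is given in seconds):

key=$(git annex lookupkey "$fname")
aws s3 presign "s3://mybuck/$key" --expires-in 600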

Adding the S3 URL as a source

Assuming all files in the current directory are available on S3, this will register the public S3 URL for each file in git-annex, making them available to everyone through git-annex:

git annex find --in public-s3 | while read -r file ; do
  key=$(git annex lookupkey "$file")
  echo "$key" "https://public-annex.s3.amazonaws.com/$key"
done | git annex registerurl

registerurl was introduced in git-annex 5.20150317.
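
To check that the registration worked, git annex whereis should now list the web special remote, with the S3 URL, as an additional location for a file (the file name is just a placeholder):

git annex whereis some-file.pdf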

Manually configuring a public get policy

Here is how to manually configure a public get policy for a bucket in the AWS dashboard.

{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "AllowPublicRead",
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::public-annex/*"
    }
  ]
}

This should not be necessary if using a new enough version of git-annex, which can instead be configured with public=yes.
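
The same policy can also be applied from the command line with the AWS CLI, assuming the JSON above has been saved as policy.json:

aws s3api put-bucket-policy --bucket public-annex --policy file://policy.json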