This tutorial shows how to set up a centralized repository hosted on GitHub.

GitHub does not currently let git-annex store the contents of large files there. This doesn't prevent using git-annex with GitHub; it just means you have to set up some other centralized location for the large files.

set up the repository, and make a checkout

I've created a repository for technical talk videos, which you can fork on GitHub. Or make your own repository on GitHub now.

On your laptop, install git-annex, and clone the repository:

# git clone git@github.com:joeyh/techtalks.git
# cd techtalks

Tell git-annex to use the repository, and describe where this clone is located:

# git annex init 'my laptop'
init my laptop ok

add files to the repository

Add some files, however you obtained them.

# git annex add *.mp4
add Haskell_Amuse_Bouche-b9OVqxmI.mp4 (checksum) ok
(Recording state in git...)
# git commit -m "added a video. I have not watched it yet but it sounds interesting"

Some files are available on the web, so git-annex can download them directly:

# git annex addurl http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg
addurl kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg
(downloading http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg ...)
(checksum...) ok
(Recording state in git...)
# git commit -a -m 'added a screencast I made'

Feel free to rename the files, etc, using normal git commands:

# git mv Haskell_Amuse_Bouche-b9OVqxmI.mp4 Haskell_Amuse_Bouche.mp4
# git mv kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg git-annex_coding_in_haskell.ogg
# git commit -m 'better filenames'

Now push your changes back to the central repository on GitHub. As well as pushing the master branch, remember to push the git-annex branch, which is used to track the file contents. You can do this push manually as shown below, or you can just run git annex sync to do the same thing.

# git push origin master git-annex
To git@github.com:joeyh/techtalks.git
 * [new branch]      master -> master
 * [new branch]      git-annex -> git-annex
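
To confirm that both branches reached the central repository, one option is to inspect the output of git ls-remote --heads origin. The helper below is only a sketch: has_both_branches is a hypothetical name, not part of git or git-annex, and it assumes the usual "<sha> <ref>" line format on stdin.

```shell
#!/bin/sh
# Sketch: check that both branches exist on the central repository.
# Reads `git ls-remote --heads` output on stdin; each line is "<sha>\t<ref>".
# has_both_branches is a hypothetical helper name used for illustration.

has_both_branches() {
    awk '
        $2 == "refs/heads/master"    { master = 1 }
        $2 == "refs/heads/git-annex" { annex = 1 }
        END { exit !(master && annex) }
    '
}
```

With the function defined in your shell, it could be used like this:

# git ls-remote --heads origin | has_both_branches && echo 'both branches pushed'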

That push went fast, because it didn't upload large videos to GitHub. To check this, you can ask git-annex where the contents of the videos are:

# git annex whereis
whereis Haskell_Amuse_Bouche.mp4 (1 copy) 
    767e8558-0955-11e1-be83-cbbeaab7fff8 -- here
ok
whereis git-annex_coding_in_haskell.ogg (2 copies) 
    00000000-0000-0000-0000-000000000001 -- web
    767e8558-0955-11e1-be83-cbbeaab7fff8 -- here
ok
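
Files with only one copy are the ones other clones cannot reach yet. As a rough sketch, the whereis output can be filtered for them. The helper name single_copy_files is hypothetical, and the pattern assumes the output format shown above, which may differ between git-annex versions.

```shell
#!/bin/sh
# Sketch: print the files that `git annex whereis` reports as having one copy.
# Reads whereis output on stdin, assuming header lines of the form
# "whereis <file> (1 copy)" as in the transcript above.
# single_copy_files is a hypothetical helper name used for illustration.

single_copy_files() {
    sed -n 's/^whereis \(.*\) (1 copy) *$/\1/p'
}
```

With the function defined in your shell, it could be used like this:

# git annex whereis | single_copy_files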

make more checkouts

So far you have a central repository and a checkout on a laptop. You, or anyone you allow to, can clone the central repository and use git-annex with it.

But, since GitHub doesn't currently support storing large files there with git-annex, other checkouts of your repository won't be able to access the files you added to the repository on your laptop.

# git clone git@github.com:myrepo/techtalks.git
# cd techtalks
# git annex get Haskell_Amuse_Bouche.mp4
get Haskell_Amuse_Bouche.mp4

  Try making some of these repositories available:
    767e8558-0955-11e1-be83-cbbeaab7fff8 -- my laptop
failed

add a special remote

So, to complete your setup, you need to set up a repository where git-annex can store the contents of large files. This is often done by setting up a special remote. One free option is explained in using box.com as a special remote. Another useful approach is explained in public Amazon S3 remote.

Once you have the special remote set up on your laptop, you can send files to it:

# git annex copy --to myspecialremote Haskell_Amuse_Bouche.mp4
copy Haskell_Amuse_Bouche.mp4 (to myspecialremote...)
100% 255.11kB/s
ok

You can also git annex move files to it, to free up space on your laptop. And then you can git annex get files back to your laptop later on, as desired.

After you use git-annex to move files around, remember to sync, which will broadcast the updated location information.

# git annex sync

After setting up the special remote and storing some files on it, you can download them on other clones. You'll first need to enable the same special remote on the clones.

# git annex sync
# git annex enableremote myspecialremote
# git annex get git-annex_coding_in_haskell.ogg
100% 255.11kB/s
ok
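
If you set up clones often, the three steps above could be wrapped in a small script. This is only a sketch: run and setup_clone are hypothetical helper names, the defaults are the remote and file names used in this tutorial, and DRY_RUN is an illustrative variable that makes the script print the commands instead of running them.

```shell
#!/bin/sh
# Sketch: bring a fresh clone up to date and fetch one file.
# run and setup_clone are hypothetical helpers, not git-annex commands.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: setup_clone [remote] [file]
# Defaults are the names used in this tutorial.
setup_clone() {
    remote=${1:-myspecialremote}
    file=${2:-git-annex_coding_in_haskell.ogg}
    # Stop at the first step that fails.
    run git annex sync &&
    run git annex enableremote "$remote" &&
    run git annex get "$file"
}
```

Running DRY_RUN=1 setup_clone inside the clone prints the three commands without executing them, which is a cheap way to check the arguments first.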

take it further

You can add remotes for each direct connection between machines you find you need -- so make the laptop have the desktop as a remote, and the desktop have the laptop as a remote, and then on either machine git-annex can access files stored on the other.
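
Such a direct connection is an ordinary git remote. The sketch below adds one if it is not already configured. add_machine_remote is a hypothetical helper name, and the remote name and URL in the usage line are placeholders for your own machines.

```shell
#!/bin/sh
# Sketch: register another machine's clone as a git remote.
# add_machine_remote is a hypothetical helper; the name and URL
# passed to it are placeholders, not values from this tutorial.

# Usage: add_machine_remote <name> <url>
add_machine_remote() {
    name=$1
    url=$2
    # Leave an existing remote of the same name untouched.
    if git remote get-url "$name" >/dev/null 2>&1; then
        echo "remote $name already exists"
    else
        git remote add "$name" "$url"
    fi
}
```

On the laptop, with the function defined in your shell, this might look like:

# add_machine_remote desktop ssh://desktop.example/~/techtalks
# git annex sync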