This tutorial shows how to set up a centralized repository hosted on GitHub.
GitHub does not currently let git-annex store the contents of large files there. This doesn't prevent using git-annex with GitHub, it just means you have to set up some other centralized location for the large files.
set up the repository, and make a checkout
I've created a repository for technical talk videos, which you can fork on Github. Or make your own repository on GitHub now.
On your laptop, install git-annex, and clone the repository:
# git clone email@example.com:joeyh/techtalks.git # cd techtalks
Tell git-annex to use the repository, and describe where this clone is located:
# git annex init 'my laptop' init my laptop ok
add files to the repository
Add some files, obtained however.
# git annex add *.mp4 add Haskell_Amuse_Bouche-b9OVqxmI.mp4 (checksum) ok (Recording state in git...) # git commit -m "added a video. I have not watched it yet but it sounds interesting"
This file is available on the web; so git-annex can download it:
# git annex addurl http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg addurl kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg (downloading http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg ...) (checksum...) ok (Recording state in git...) # git commit -a -m 'added a screencast I made'
Feel free to rename the files, etc, using normal git commands:
# git mv Haskell_Amuse_Bouche-b9OVqxmI.mp4 Haskell_Amuse_Bouche.mp4 # git mv kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg git-annex_coding_in_haskell.ogg # git commit -m 'better filenames'
Now push your changes back to the central repository on GitHub. As well as
pushing the master branch, remember to push the git-annex branch, which is
used to track the file contents. You can do this push manually as shown
below, or you can just run
git annex sync to do the same thing.
# git push origin master git-annex To firstname.lastname@example.org:joeyh/techtalks.git * [new branch] master -> master * [new branch] git-annex -> git-annex
That push went fast, because it didn't upload large videos to GitHub. To check this, you can ask git-annex where the contents of the videos are:
# git annex whereis whereis Haskell_Amuse_Bouche.mp4 (1 copy) 767e8558-0955-11e1-be83-cbbeaab7fff8 -- here ok whereis git-annex_coding_in_haskell.ogg (2 copies) 00000000-0000-0000-0000-000000000001 -- web 767e8558-0955-11e1-be83-cbbeaab7fff8 -- here ok
make more checkouts
So far you have a central repository, and a checkout on a laptop. You, or anyone you allow to can clone the central repository, and use git-annex with it.
But, since GitHub doesn't currently support storing large files there with git-annex, other checkouts of your repository won't be able to access the files you added to the repository on your laptop.
# git clone email@example.com:myrepo/techtalks.git # git annex get Haskell_Amuse_Bouche-b9OVqxmI.mp4 get Haskell_Amuse_Bouche-b9OVqxmI.mp4 Try making some of these repositories available: 767e8558-0955-11e1-be83-cbbeaab7fff8 -- my laptop failed
add a special remote
So, to complete your setup, you need to set up a repository where git-annex can store the contents of large files. This is often done by setting up a special remote. One free option is explained in using box.com as a special remote. Another useful approach is explained in public Amazon S3 remote.
Once you have the special remote set up on your laptop, you can send files to it:
# git annex copy --to myspecialremote Haskell_Amuse_Bouche-b9OVqxmI.mp4 copy Haskell_Amuse_Bouche-b9OVqxmI.mp4 (to myspecialremote...) 100% 255.11kB/s ok
You can also
git annex move files to it, to free up space on your laptop.
And then you can
git annex get files back to your laptop later on, as
After you use git-annex to move files around, remember to sync, which will broadcast its updated location information.
# git annex sync
After setting up the special remote and storing some files on it, you can download them on other clones. You'll first need to enable the same special remote on the clones.
# git annex sync # git annex enableremote myspecialremote # git annex get git-annex_coding_in_haskell.ogg 100% 255.11kB/s ok
take it farther
You can add remotes for each direct connection between machines you find you need -- so make the laptop have the desktop as a remote, and the desktop have the laptop as a remote, and then on either machine git-annex can access files stored on the other.