basic use

The web can be used as a special remote too.

# git annex addurl http://example.com/video.mpeg
addurl example.com_video.mpeg (downloading http://example.com/video.mpeg)
########################################################## 100.0%
ok

Now the file is downloaded, and has been added to the annex like any other file. So it can be renamed, copied to other repositories, and so on.

To add a lot of urls at once, just list them all as parameters to git annex addurl.
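
When the list is long, it can be easier to keep the urls in a text file and hand them to addurl in one go. The following is a minimal sketch: the helper name addurls_from_file and the ANNEX variable are hypothetical and not part of git-annex.

```shell
# Hypothetical helper, not part of git-annex: add every url listed in a file,
# one url per line. Blank lines and lines starting with '#' are skipped.
# ANNEX defaults to "git annex"; set ANNEX=echo to preview the command instead.
# Urls containing quotes, backslashes, or whitespace would need extra care,
# because xargs treats those characters specially.
addurls_from_file() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" |
        xargs ${ANNEX:-git annex} addurl
}
```

For example, ANNEX=echo addurls_from_file urls.txt shows the addurl command that would be run, without downloading anything.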

trust issues

Note that git-annex assumes that, if the web site does not 404, and has the right file size, the file is still present on the web, and this counts as one copy of the file. If the file still seems to be present on the web, it will let you remove your last copy, trusting it can be downloaded again:

# git annex drop example.com_video.mpeg
drop example.com_video.mpeg (checking http://example.com/video.mpeg) ok

If you don't trust the web to this degree, just let git-annex know:

# git annex untrust web
untrust web ok

As a result, git-annex will hang onto files:

# git annex drop example.com_video.mpeg
drop example.com_video.mpeg (unsafe) 
  Could only verify the existence of 0 out of 1 necessary copies
  Also these untrusted repositories may contain the file:
    00000000-0000-0000-0000-000000000001  -- web
  (Use --force to override this check, or adjust numcopies.)
failed

attaching urls to existing files

You can also attach urls to any file already in the annex:

# git annex addurl --file my_cool_big_file http://example.com/cool_big_file
addurl my_cool_big_file ok
# git annex whereis my_cool_big_file
whereis my_cool_big_file (2 copies) 
00000000-0000-0000-0000-000000000001 -- web
27a9510c-760a-11e1-b9a0-c731d2b77df9 -- here

configuring addurl filenames

By default, addurl will generate a filename for you. You can use --file= to specify the filename to use.

If you're adding a bunch of related files to a directory, or just don't like the default filenames generated by addurl, you can use --pathdepth to specify how many parts of the url are put in the filename. A positive number drops that many parts from the beginning of the url (the hostname counts as the first part), while a negative number keeps only that many parts from the end.

# git annex addurl http://example.com/videos/2012/01/video.mpeg
addurl example.com_videos_2012_01_video.mpeg (downloading http://example.com/videos/2012/01/video.mpeg)
# git annex addurl http://example.com/videos/2012/01/video.mpeg --pathdepth=2
addurl 2012_01_video.mpeg (downloading http://example.com/videos/2012/01/video.mpeg)
# git annex addurl http://example.com/videos/2012/01/video.mpeg --pathdepth=-2
addurl 01_video.mpeg (downloading http://example.com/videos/2012/01/video.mpeg)
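
The effect of --pathdepth in the examples above can be modelled with a small shell function. This only approximates the behaviour shown in the transcript and is not git-annex's implementation: url_to_name is a hypothetical name, and git-annex performs additional filename sanitization that this sketch ignores.

```shell
# Hypothetical helper approximating the filenames shown in the examples above.
# usage: url_to_name URL [PATHDEPTH]
url_to_name() {
    # Strip the scheme, then let awk split the remainder on "/".
    printf '%s\n' "${1#*://}" | awk -F/ -v depth="${2:-0}" '{
        start = 1
        if (depth > 0) start = depth + 1        # drop that many leading parts
        if (depth < 0) start = NF + depth + 1   # keep only that many trailing parts
        if (start < 1) start = 1
        name = ""; sep = ""
        for (i = start; i <= NF; i++) { name = name sep $i; sep = "_" }
        print name
    }'
}

url_to_name http://example.com/videos/2012/01/video.mpeg
url_to_name http://example.com/videos/2012/01/video.mpeg 2
url_to_name http://example.com/videos/2012/01/video.mpeg -2
```

The three calls correspond to the three addurl commands in the transcript, so their output can be compared against the filenames shown there.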

videos

There's support for downloading videos from sites like YouTube, Vimeo, and many more. This relies on yt-dlp to download the videos.

When you have yt-dlp installed, you can just git annex addurl http://youtube.com/foo and it will detect that it is a video and download the video content for offline viewing.

(However, this is disabled by default as it can be a security risk. See the documentation of annex.security.allowed-ip-addresses in git-annex for details.)

Later, in another clone of the repository, you can run git annex get on the file and it will also be downloaded with yt-dlp. This works even if the video host has transcoded or otherwise changed the video in the meantime; the assumption is that these video files are equivalent.

There is an annex.youtube-dl-options configuration setting that can be used to pass parameters to yt-dlp. For example, you could set git config annex.youtube-dl-options "--format worst" to configure it to download low quality videos from YouTube.
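
As an illustration, the commands below apply both settings in a throwaway repository. This is a sketch, not a recommendation: the scratch directory is an assumption made so the example is self-contained, and the value all for annex.security.allowed-ip-addresses is the one described in the git-annex documentation of that setting, which should be read first because of the security implications.

```shell
# Work in a scratch repository so nothing real is affected (assumption for this sketch).
repo=$(mktemp -d)
cd "$repo" && git init -q

# Per the git-annex documentation, yt-dlp is only used once the
# IP address restrictions have been lifted like this.
git config annex.security.allowed-ip-addresses all

# Pass extra parameters to yt-dlp; "--format worst" selects low quality video.
git config annex.youtube-dl-options "--format worst"

# Show what was stored.
git config --get-regexp '^annex\.'
```

In a real repository only the two git config lines are needed, and both settings are local to that repository.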

To download all the videos in a YouTube channel, you can use git-annex importfeed --scrape with the url to the channel, or you can find the RSS feed for the channel, and git-annex importfeed that url (without --scrape).

bittorrent

The bittorrent special remote lets git-annex also download the content of torrent files, and magnet links to torrents.

You can simply pass the url to a torrent to git annex addurl the same as any other url.

You have to have aria2 and bittornado (or the original bittorrent) installed for this to work.

podcasts

Podcasts are downloaded using git annex importfeed. See downloading podcasts.

configuring which url is used when there are several

An annexed file can have content at multiple urls that git-annex knows about, and git-annex may use any of those urls for downloading a file.

If some urls are especially fast, or especially slow, you might want to configure which urls git-annex prefers to use first, or should only use as a last resort. To accomplish that, you can create additional web special remotes that are configured to only be used for some urls, and that have a different cost than the web special remote.

For example, suppose that you want to prioritize using urls on "fasthost.com".

git-annex initremote --sameas=web fasthost type=web urlinclude='*//fasthost.com/*' cost=150

Now, git-annex get of a file that is on both fasthost.com and another url will prefer to use the fasthost special remote, rather than the web special remote (which has a higher cost of 200), and so will use the fasthost.com url. If that url is not available, it will fall back to the web special remote, and use the other url.

Suppose that you want to avoid using urls on "slowhost.com", except as a last resort.

git-annex initremote --sameas=web slowhost type=web urlinclude='*//slowhost.com/*' cost=300

Now, git-annex get of a file that is on both slowhost.com and another url will first try the fasthost remote. If fasthost does not support the url, it will next try the regular "web" remote, which avoids using urls that match the configuration of either fasthost or slowhost. Finally, if it's unable to get the file from some other url, it will use the slowhost remote to get it from the slow url.
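
The selection described above can be modelled in a few lines of shell. This is a hypothetical model for illustration, not git-annex code: remote_for_url and try_order are made-up names, the glob patterns and costs are copied from the two initremote commands, and 200 is the cost of the web special remote mentioned earlier.

```shell
# Hypothetical model, not git-annex code: which remote handles a url, and its cost.
remote_for_url() {
    case "$1" in
        *//fasthost.com/*) echo "fasthost 150" ;;
        *//slowhost.com/*) echo "slowhost 300" ;;
        *)                 echo "web 200" ;;   # urls not matched by the other remotes
    esac
}

# Print the remotes in the order they would be tried for a file's urls:
# lowest cost first.
try_order() {
    for url in "$@"; do
        remote_for_url "$url"
    done | sort -k2,2n | cut -d' ' -f1
}

try_order http://slowhost.com/f http://example.com/f http://fasthost.com/f
```

For a file with all three urls, the model lists fasthost first, then web, then slowhost, matching the order in the text.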