quvi does not seem maintained (last upstream release in 2013) and it supports many fewer videos than youtube-dl does.

The difficulty with using youtube-dl is it, by design, does not provide a way to probe if it supports an url, other than running it and seeing if it finds a video at the url. This would make git annex addurl significantly slower if it ran youtube-dl to probe every url.

It is possible to use youtube-dl to download arbitrary non-video files; it stores the file to disk just as wget or curl. But, that's well outside its intended use case, and so it does not feel like a good idea to make git-annex depend on using youtube-dl to download generic urls. (Also, youtube-dl has bugs with downloading non-video urls, see for example http://bugs.debian.org/874321)

So, switching to youtube-dl would probably need a new switch, like git annex addurl --rip that enables using it.

(Importfeed only treats links in the feed as video urls, not enclosures, so this problem does not affect it and it would not need a new switch.)

That would need changes to users' workflows. git-annex could keep supporting quvi for some time, and warn when it uses quvi, to help with the transition.

Alternatively, git-annex addurl could download the url first, and then check the file to see if it looks like html. If so, run youtube-dl (which unfortunately has to download it again) and see if it manages to rip media from it. This way, addurl of non-html files does not have extra overhead, and the redundant download is fairly small compared to ripping the media. Only the unusual case where addurl is being used on html that does not contain media becomes more expensive.

However, for --relaxed, running youtube-dl --get-filename would be significantly more expensive since it hits the network. It seems that --relaxed would need to change to not rip videos; users who want that could use --fast.

--fast already hits the network, but if it uses youtube-dl --get-filename, it would fall afoul of bugs like http://bugs.debian.org/874321, although those can be worked around (/dev/null stderr in cast youtube-dl crashes)

Another gotcha is playlists. youtube-dl downloads playlists automatically. But, git-annex needs to record an url that downloads a single file so that git annex get works right. So, playlists will need to be disabled when git-annex runs youtube-dl. But, --no-playlist does not always disable playlists. Best option seems to be --no-playlist --playlist-items 0 which works for non-playlists, and downloads only 1 item from playlists (hopefully a fairly stable item, but who knows..).

(git annex importfeed handles youtube playlist downloads, but needs the user to find the url to the rss feed for the playlist. Youtube still has these, although it makes them hard to find.)

Another gotcha is that youtube-dl's -o option does not fully determine the filename it downloads to. Sometims it will tack on an additional extension (seen with youtube videos where it added a ".mkv"). And --get-filename does not report the actual filename when that happens. This seems to be due to format merging by ffmpeg; with -f best, it does not merge and so does not do that. https://github.com/rg3/youtube-dl/issues/14864

To do disk free space checking will need a different technique than git-annex normally uses, because youtube-dl does not provide an easy way to query for size. Could use --dump-json, but that would require downloading the web page yet again, so too expensive.. and, the json seems to have "filesize: null" for youtube videos. What does work is the --max-filesize option, which makes youtube-dl abort if it's too big.

done --Joey