git-annex now has support for storing arbitrary metadata about annexed files. For example, this can be used to tag files, to record the author of a file, etc. The metadata is synced around between repositories with the other information git-annex keeps track of.

One nice way to use the metadata is through views. You can ask git-annex to create a view of files in the currently checked out branch that have certain metadata. Once you're in a view, you can move and copy files to adjust their metadata further. Rather than the traditional hierarchical directory structure, views are dynamic; you can easily refine or reorder a view.

Let's get started by setting some tags on files. (Year and month metadata doesn't need to be tagged by hand: run git config annex.genmetadata true, and git-annex will set it for you when adding files.) No views yet, just some metadata:

# git annex metadata --tag todo work/2014/*
# git annex metadata --untag todo work/2014/done/*
# git annex metadata --tag urgent work/2014/presentation_for_tomorrow.odt
# git annex metadata --tag done work/2013/* work/2014/done/*
# git annex metadata --tag work work
# git annex metadata --tag video videos
# git annex metadata --tag work videos/
# git annex metadata --tag done videos/old
# git annex metadata --tag new videos/lotsofcats.ogv
# git annex metadata --tag sound podcasts
# git annex metadata --tag done podcasts/*/old
# git annex metadata --tag new podcasts/*/recent

So, you had a bunch of different kinds of files sorted into a directory structure. But that didn't really reflect how you approach the files. Adding some tags lets you categorize the files in different ways.

Ok, metadata is in place, but how to use it? Time to change views!

# git annex view tag=*
view  (searching...)

Switched to branch 'views/_'

This searched for all files with any tag, and created a new git branch that sorts the files into directories according to their tags.

Notice that a single file may appear in multiple directories, one for each of its tags. For example, lotsofcats.ogv is in both new/ and video/.

# tree -d

Ah, but you're at work now, and don't want to be distracted by cat videos. Time to filter the view:

# git annex vfilter tag=work
Switched to branch 'views/(work)/_'

Now only the work files are in the view, and they're otherwise categorized according to their other tags. So you can check the urgent/ directory to see what's next, and look in todo/ for other work related files.
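
As a rough mental model of the two commands above (not git-annex's implementation), a view groups files by tag, and a filter keeps only the files carrying a given tag before grouping them by their remaining tags. The Python sketch below uses made-up file names and tags; view_by_tag and filter_then_view are hypothetical helpers, and dropping the filtered tag from the directory list is an assumption based on the description above.

```python
from collections import defaultdict

# Hypothetical metadata: file name -> set of tags (illustrative values only).
TAGS = {
    "presentation_for_tomorrow.odt": {"work", "todo", "urgent"},
    "lotsofcats.ogv": {"video", "new"},
    "old_report.odt": {"work", "done"},
}

def view_by_tag(tags_by_file):
    """Group files into one directory per tag.

    A file with several tags is listed once under each of them.
    """
    view = defaultdict(set)
    for name, tags in tags_by_file.items():
        for tag in tags:
            view[tag].add(name)
    return dict(view)

def filter_then_view(tags_by_file, required):
    """Keep only files carrying `required`, then group by their other tags."""
    kept = {
        name: tags - {required}
        for name, tags in tags_by_file.items()
        if required in tags
    }
    return view_by_tag(kept)

everything = view_by_tag(TAGS)                # models: git annex view tag=*
work_only = filter_then_view(TAGS, "work")    # models: git annex vfilter tag=work

print(sorted(everything))
print(sorted(work_only))
```

In this model the cat video shows up under both new and video in the unfiltered view, and disappears entirely once the view is filtered to work.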

Now that you're in a tag-based view, you can move files around between the directories, and when you commit your changes to git, their tags will be updated.

# git mv 'urgent/presentation_for_tomorrow_{work;2014}.odt' ../done
# git commit -m "a good day's work"
metadata tag-=urgent
metadata tag+=done
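
To make that output concrete, here is a small Python sketch of the bookkeeping: moving a file out of one tag directory and into another drops the first tag and adds the second, leaving the file's other tags alone. The helper name move_between_tag_dirs and the sample tags are made up for illustration; this only models what the commit output above reports, not how git-annex works internally.

```python
def move_between_tag_dirs(tags, old_dir, new_dir):
    """Return (updated_tags, changes) for a move from old_dir to new_dir.

    `changes` mirrors the "tag-=..." / "tag+=..." lines shown on commit.
    """
    updated = set(tags)
    changes = []
    if old_dir != new_dir:
        updated.discard(old_dir)
        changes.append(f"tag-={old_dir}")
        updated.add(new_dir)
        changes.append(f"tag+={new_dir}")
    return updated, changes

# Hypothetical starting tags for the presentation file.
before = {"work", "todo", "urgent"}
after, changes = move_between_tag_dirs(before, "urgent", "done")

for line in changes:
    print("metadata", line)
```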

You can return to a previous view by running git annex vpop. If you pop all the way out of all views, you'll be back on the regular git branch you originally started from. You can also use git checkout to switch between views and other branches.


Beyond simple tags and directories, you can add whatever kinds of metadata you like, and use that metadata in more elaborate views. For example, let's add a year field.

# git checkout master
# git annex metadata --set year=2014 work/2014
# git annex metadata --set year=2013 work/2013
# git annex view year=* tag=*

Now you're in a view with two levels of directories, first by year and then by tag.

# tree -d
|-- 2014
|   |-- work
|   |-- todo
|   |-- urgent
|   `-- done
`-- 2013
    |-- work
    `-- done

Oh, did you want it the other way around? Easy!

# git annex vcycle
# tree -d
|-- work
|   |-- 2014
|   `-- 2013
|-- todo
|   `-- 2014
|-- urgent
|   `-- 2014
`-- done
    |-- 2014
    `-- 2013
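
One way to picture a multi-level view is as every combination of a file's values for the chosen fields, joined in the order the fields were given. The Python sketch below is only a model under that assumption: view_dirs and cycle are hypothetical names, the metadata is invented, and treating a cycle as a rotation of the field list is a guess that happens to equal a plain swap when there are two fields.

```python
from itertools import product

def view_dirs(metadata, fields):
    """Return the directories a file would occupy in a view nested by `fields`.

    `metadata` maps field name -> set of values. A file missing any of the
    fields gets no directory at all in this model.
    """
    values = [sorted(metadata.get(field, ())) for field in fields]
    return {"/".join(combo) for combo in product(*values)}

def cycle(fields):
    """Rotate the field order (assumed behaviour; a swap for two fields)."""
    return fields[1:] + fields[:1]

# Hypothetical metadata for one file.
meta = {"year": {"2014"}, "tag": {"work", "todo"}}

by_year = view_dirs(meta, ["year", "tag"])
by_tag = view_dirs(meta, cycle(["year", "tag"]))

print(sorted(by_year))
print(sorted(by_tag))
```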

location fields

Let's switch to a view containing only new podcasts. And since the podcasts are organized into one subdirectory per show, let's include those subdirectories in the view.

# git checkout master
# git annex view tag=new podcasts/=*
# tree -d

That's an example of using part of the directory layout of the original branch to inform the view. Every file gets fields automatically set up corresponding to the directory it's in. So a file "foo/bar/baz/file" has fields "/=foo", "foo/=bar", and "foo/bar/=baz". These location fields can be used the same as other metadata to construct the view.
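
The rule for location fields can be sketched in a few lines of Python: each directory level becomes a field named after the path leading up to it (or "/" at the top), with the directory name as its value. The function name location_fields is made up, and this is just an illustration of the rule as described here, not code from git-annex.

```python
from pathlib import PurePosixPath

def location_fields(path):
    """Return the location fields of a file as "field=value" strings."""
    parts = PurePosixPath(path).parent.parts
    fields = []
    for depth, name in enumerate(parts):
        # The field is the path above this directory, with a trailing slash;
        # at the top level that is just "/".
        field = "/".join(parts[:depth]) + "/"
        fields.append(f"{field}={name}")
    return fields

for entry in location_fields("foo/bar/baz/file"):
    print(entry)
```

Under this rule, a file stored at podcasts/someshow/recent/episode.ogg (a hypothetical path) would have the field "podcasts/" set to "someshow", which is what podcasts/=* picks up in the view above.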

This has probably only scratched the surface of what you can do with views.

I have played around with views and found out that I can create new tags by creating directories in the view, and that I can create files in those new directories that are not contained in the original working tree. The behaviour of git annex in this situation is a bit strange.

Assume for example you have a file "foo" with tag "t1" and switch to the tag view. Then create a directory "t2" and a file "bar" in it. Add the file, sync, and switch back to the master branch. If you enter the tag view again, the directory "t2" will have vanished, i.e. your newly created file is gone, too. This is not surprising, as the file has never been added to the original working tree. However, another "git annex sync" will restore the file.

I am unsure what behaviour I would expect. Maybe it shouldn't be possible to add files to a view in the first place, or newly created files might be collected in a separate branch. On the other hand, it seems reasonable to add a new file with a new tag at the same time. Anyway, I found it confusing that I can seemingly lose a file like this. It took me a bit of time to figure out that another sync recovers the file.

Comment by Reiner Mon Mar 24 21:11:31 2014
For example, I'd like to have a view that only contains files present in this git-annex repository (no dangling symlinks).
Comment by Michael Sun Jun 8 03:55:24 2014