git-annex now has support for storing arbitrary metadata about annexed files. For example, this can be used to tag files, to record the author of a file, etc. The metadata is synced around between repositories with the other information git-annex keeps track of.
One nice way to use the metadata is through views. You can ask git-annex to create a view of files in the currently checked out branch that have certain metadata. Once you're in a view, you can move and copy files to adjust their metadata further. Rather than the traditional hierarchical directory structure, views are dynamic; you can easily refine or reorder a view.
Tip: to avoid needing to manually tag files with the year (and month), run git config annex.genmetadata true, and git-annex will do it for you when adding files.
Let's get started by setting some tags on files. No views yet, just some metadata:
# git annex metadata --tag todo work/2014/*
# git annex metadata --untag todo work/2014/done/*
# git annex metadata --tag urgent work/2014/presentation_for_tomorrow.odt
# git annex metadata --tag done work/2013/* work/2014/done/*
# git annex metadata --tag work work
# git annex metadata --tag video videos
# git annex metadata --tag work videos/operating_heavy_machinery.mov
# git annex metadata --tag done videos/old
# git annex metadata --tag new videos/lotsofcats.ogv
# git annex metadata --tag sound podcasts
# git annex metadata --tag done podcasts/*/old
# git annex metadata --tag new podcasts/*/recent
So, you had a bunch of different kinds of files sorted into a directory structure. But that didn't really reflect how you approach the files. Adding some tags lets you categorize the files in different ways.
Ok, metadata is in place, but how to use it? Time to change views!
# git annex view tag=*
view (searching...)
Switched to branch 'views/_'
ok
This searched for all files with any tag, and created a new git branch that sorts the files according to their tags.
# tree -d
work
todo
urgent
done
new
video
sound
Notice that a single file may appear in multiple directories depending on its tags. For example, lotsofcats.ogv is in both new/ and video/.
Ah, but you're at work now, and don't want to be distracted by cat videos. Time to filter the view:
# git annex vfilter tag=work
vfilter
Switched to branch 'views/(work)/_'
ok
Now only the work files are in the view, and they're otherwise categorized according to their other tags. So you can check the urgent/ directory to see what's next, and look in todo/ for other work related files.
Now that you're in a tag based view, you can move files around between the directories, and when you commit your changes to git, their tags will be updated.
# git mv 'urgent/presentation_for_tomorrow_{work;2014}.odt' done/
# git commit -m "a good day's work"
metadata tag-=urgent
metadata tag+=done
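As a rough mental model of the output above: in a view with a single level of tag directories, the directory a file sits in stands for one of its tags, so moving it from one directory to another drops the old tag and adds the new one. The sketch below only illustrates that mapping. The function name tag_changes is hypothetical, it is not how git-annex is implemented, and it assumes both paths are given relative to the top of the view.

```shell
#!/bin/sh
# Hypothetical helper, not part of git-annex: given the old and new path of
# a file inside a single-level tag view, print the tag changes that the move
# implies. The first path component is treated as the tag.
tag_changes() {
    # Moving within the same tag directory changes no metadata.
    if [ "${1%%/*}" = "${2%%/*}" ]; then
        return 0
    fi
    printf 'tag-=%s\n' "${1%%/*}"
    printf 'tag+=%s\n' "${2%%/*}"
}

# Usage: the move from the example above.
tag_changes 'urgent/presentation_for_tomorrow_{work;2014}.odt' \
            'done/presentation_for_tomorrow_{work;2014}.odt'
# prints:
# tag-=urgent
# tag+=done
```

The file name is quoted because an unquoted ; would otherwise end the shell command early.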
You can return to a previous view by running git annex vpop. If you pop all the way out of all views, you'll be back on the regular git branch you originally started from. You can also use git checkout to switch between views and other branches.
fields
Beyond simple tags and directories, you can add whatever kinds of metadata you like, and use that metadata in more elaborate views. For example, let's add a year field.
# git checkout master
# git annex metadata --set year=2014 work/2014
# git annex metadata --set year=2013 work/2013
# git annex view year=* tag=*
Now you're in a view with two levels of directories, first by year and then by tag.
# tree -d
2014
|-- work
|-- todo
|-- urgent
`-- done
2013
|-- work
`-- done
Oh, did you want it the other way around? Easy!
# git annex vcycle
# tree -d
work
|-- 2014
`-- 2013
todo
`-- 2014
urgent
`-- 2014
done
|-- 2014
`-- 2013
location fields
Let's switch to a view containing only new podcasts. And since the podcasts are organized into one subdirectory per show, let's include those subdirectories in the view.
# git checkout master
# git annex view tag=new podcasts/=*
# tree -d
This_Developers_Life
Escape_Pod
GitMinutes
The_Haskell_Cast
StarShipSofa
That's an example of using part of the directory layout of the original branch to inform the view. Every file gets fields automatically set up corresponding to the directory it's in. So a file "foo/bar/baz/file" has fields "/=foo", "foo/=bar", and "foo/bar/=baz". These location fields can be used the same way as other metadata to construct the view.
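To make the naming pattern concrete, here is a small sketch that derives the location fields from a path, following the "foo/bar/baz/file" example above. The function name location_fields is hypothetical and this is not git-annex's own code. It assumes the path is relative to the top of the repository and has no leading "./".

```shell
#!/bin/sh
# Hypothetical helper, not part of git-annex: print one location field per
# directory component of a path, in the form described above.
# The body runs in a subshell so the IFS and set -f changes do not leak out.
location_fields() (
    case $1 in
        */*) dir=${1%/*} ;;   # strip the file name, keep the directories
        *)   exit 0 ;;        # a top-level file has no location fields
    esac
    set -f      # do not glob-expand components while splitting
    IFS=/       # split the directory part on slashes
    prefix=""
    for part in $dir; do
        [ -n "$part" ] || continue
        printf '%s/=%s\n' "$prefix" "$part"
        prefix="${prefix:+$prefix/}$part"
    done
)

# Usage:
location_fields foo/bar/baz/file
# prints:
# /=foo
# foo/=bar
# foo/bar/=baz
```

Under the same pattern, a file in podcasts/Escape_Pod/ would have a "podcasts/" field with the value Escape_Pod, which is what podcasts/=* matches in the view above.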
This has probably only scratched the surface of what you can do with views.
I have played around with views and found out that I can create new tags by creating directories in the view, and that I can create files in those new directories that are not contained in the original working tree. The behaviour of git annex in this situation is a bit strange.
Assume for example you have a file "foo" with tag "t1" and switch to the tag view. Then create a directory "t2" and a file "bar" in it. Add the file, sync, and switch back to the master branch. If you enter the tag view again, the directory "t2" will have vanished, i.e. your newly created file is gone, too. This is not surprising, as the file has never been added to the original working tree. However, another "git annex sync" will restore the file.
I am unsure what behaviour I would expect. Maybe it shouldn't be possible to add files to a view in the first place, or newly created files might be collected in a separate branch. On the other hand, it seems reasonable to add a new file with a new tag at the same time. Anyway, I found it confusing that I can seemingly lose a file like this. It took me a bit of time to figure out that another sync recovers the file.
Hi,
Would it be hard to handle the wildcard character in the location view before the = sign?
works but
does not.