Last night I got git annex watch
to also handle deletion of files.
This was not as tricky as feared; the key is using git rm --ignore-unmatch
,
which avoids most problematic situations (such as a just deleted file
being added back before git is run).
Also fixed some races when git annex watch
is doing its startup scan of
the tree, which might be changed as it's being traversed. Now only one
thread performs actions at a time, so inotify events are queued up during
the scan, and dealt with once it completes. It's worth noting that inotify
can only buffer so many events .. Which might have been a problem except
for a very nice feature of Haskell's inotify interface: It has a thread
that drains the limited inotify buffer and does its own buffering.
Right now, git annex watch
is not as fast as it could be when doing
something like adding a lot of files, or deleting a lot of files.
For each file, it currently runs a git command that updates the index.
I did some work toward coalescing these into one command (which git annex
already does normally). It's not quite ready to be turned on yet,
because of some races involving git add
that become much worse
if it's delayed by event coalescing.
And races were the theme of today. Spent most of the day really
getting to grips with all the fun races that can occur between
modification happening to files, and git annex watch
. The inotify
page now has a long list of known races, some benign, and several,
all involving adding files, that are quite nasty.
I fixed one of those races this evening. The rest will probably involve
moving away from using git add
, which necessarily examines the file
on disk, to directly shoving the symlink into git's index.
BTW, it turns out that dvcs-autosync
has grappled with some of these same
races: http://comments.gmane.org/gmane.comp.version-control.home-dir/665
I hope that git annex watch
will be in a better place to deal with them,
since it's only dealing with git, and with a restricted portion of it
relevant to git-annex.
It's important that git annex watch
be rock solid. It's the foundation
of the git annex assistant. Users should not need to worry about races
when using it. Most users won't know what race conditions are. If only I
could be so lucky!