day 336 pid locks

Been working today on getting git-annex to fall back from nice posix fcntl locks to pid locks when the former are not supported. There will be an annex.pidlock setting to control this. Mostly useful, I think, for networked file systems like NFS and Lustre. While these do support posix locks, I guess it can sometimes be hard to get some big server configured appropriately, especially when you don't admin it and just want to use git-annex there.

Of course, the fun part about pid locks is that it can be pretty hard to tell if one is stale or not. Especially when using a networked filesystem, because then the pid in question can be running on a different computer.
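To make the staleness problem concrete, here is a minimal sketch in Python (not git-annex's actual Haskell code). It assumes a made-up lock file format of "hostname pid" on one line; the function names are hypothetical too. A pid can only be checked for liveness on the machine that wrote it, so for a lock written on another host the honest answer is "unknown".

```python
import os
import socket

def write_pid_lock(path):
    """Create the pid lock atomically; raises FileExistsError if it exists."""
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    with os.fdopen(fd, "w") as f:
        # Assumed format: "<hostname> <pid>"
        f.write(f"{socket.gethostname()} {os.getpid()}\n")

def read_pid_lock(path):
    """Return (hostname, pid) as recorded in the lock file."""
    with open(path) as f:
        host, pid = f.read().strip().rsplit(" ", 1)
    return host, int(pid)

def pid_alive(pid):
    """Probe a local pid with signal 0, which sends nothing."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True

def lock_status(path):
    """Return 'held', 'stale', or 'unknown' for an existing pid lock."""
    host, pid = read_pid_lock(path)
    if host != socket.gethostname():
        # The pid lives on another computer; it cannot be probed from here.
        return "unknown"
    return "held" if pid_alive(pid) else "stale"
```

Even the local check is only a heuristic: if the pid has been reused by an unrelated process, a stale lock looks held.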

Even if you do figure out that a pid lock is stale, how do you then take over a stale pid lock, without racing with another process that also wants to take it over? This was the truly tricky question of the day.

I have a possibly slightly novel approach to solve that: Put a more modern lock file someplace else (eg, /dev/shm) and use that lock file to lock the pid lock file. Then you can quickly tell, locally, whether a local pid lock file is stale, and take it over safely. Of course, if the pid lock is not held by a local process, this still has to fall back to the inevitable retry-and-timeout-and-fail.
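Here is roughly how that could look, as a Python sketch under several assumptions rather than what git-annex actually does. The side lock uses flock() (swapped in for fcntl locks so that one process can demonstrate the conflict), its file name is a hash of the pid lock's path, the pid lock format is "hostname pid", and all function names are invented for illustration. The idea is that the kernel drops the side lock when its holder dies, so whoever holds the side lock knows that any pid lock written by this host is stale.

```python
import fcntl
import hashlib
import os
import socket

SIDE_LOCK_DIR = "/dev/shm"  # assumption: a local, non-networked filesystem

def side_lock_path(pid_lock_path, side_dir=SIDE_LOCK_DIR):
    # One side lock file per pid lock file, named by a hash of its path.
    digest = hashlib.sha256(os.path.abspath(pid_lock_path).encode()).hexdigest()
    return os.path.join(side_dir, f"pidlock-{digest}.lck")

def try_side_lock(pid_lock_path, side_dir=SIDE_LOCK_DIR):
    """Take the local side lock without blocking; return its fd or None."""
    fd = os.open(side_lock_path(pid_lock_path, side_dir),
                 os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd

def acquire(pid_lock_path, side_dir=SIDE_LOCK_DIR):
    """Return the side lock fd on success, or None if the caller must retry."""
    side_fd = try_side_lock(pid_lock_path, side_dir)
    if side_fd is None:
        return None  # a live local process holds the lock
    host = socket.gethostname()
    tmp = f"{pid_lock_path}.tmp.{host}.{os.getpid()}"
    with open(tmp, "w") as f:
        f.write(f"{host} {os.getpid()}\n")
    acquired = False
    try:
        try:
            os.link(tmp, pid_lock_path)  # atomic create; fails if lock exists
            acquired = True
        except FileExistsError:
            with open(pid_lock_path) as f:
                owner_host = f.read().strip().rsplit(" ", 1)[0]
            if owner_host == host:
                # We hold the side lock, so the local owner must be dead.
                os.replace(tmp, pid_lock_path)
                acquired = True
    except FileNotFoundError:
        pass  # the lock vanished while we were looking; caller retries
    finally:
        if os.path.exists(tmp):
            os.unlink(tmp)
    if not acquired:
        os.close(side_fd)  # lock is held from another host
        return None
    return side_fd

def release(pid_lock_path, side_fd):
    os.unlink(pid_lock_path)
    os.close(side_fd)  # closing the fd drops the flock
```

A caller would loop on acquire() with a delay and an overall timeout, which is the retry-and-timeout-and-fail path for locks held from another computer.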

I hope the result will work pretty well, although git-annex will not support as fine-grained concurrency when using pid locks. Will find out tomorrow when I run today's code!