Please describe the problem.
git-annex will randomly crash.
What steps will reproduce the problem?
Unknown. Keeping git-annex running for an extended period, and failing to sync properly over XMPP (not sure if that is relevant, but given that this hasn't been found before, it might be).
What version of git-annex are you using? On what operating system?
git-annex version: 4.20130521-g20710d4 (And multiple prior versions)
Please provide any additional information below.
.git/annex/daemon.log upload: http://paste.ubuntu.com/5694813/
I could find no debug.log?
moreinfo until it's reproduced with a current version. --Joey
Note that this seems to be the same kind of untrappable crash as in "git annex daemon crashes when authenticating with jabber.de". --Joey
I have seen this once on a similar system (family computer; XMPP being used). Unfortunately it could be coming from anywhere, and it's not at all clear how a crash in one thread could take it all down, since there are global top-level per-thread exception handlers that should run and log which thread crashed, and normally seem to do this quite well.
I may need to make a management process that ensures the assistant stays alive.
I have also seen this happen when a computer is shutting down. But presumably in that case it's not really a bug.
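To illustrate the management process idea mentioned above, here is a minimal sketch of a supervisor loop. It is not how git-annex does or would do this; the function name `supervise` and its `max_restarts` and `delay` parameters are made up for the example, and the rule "restart only on a non-zero exit code" is an assumption.

```python
import subprocess
import time


def supervise(argv, max_restarts=3, delay=1.0):
    """Run argv and restart it whenever it exits abnormally.

    Hypothetical helper: returns the list of exit codes seen, one per run.
    """
    codes = []
    while True:
        # Blocks until the child exits; a negative code means it was
        # killed by a signal.
        code = subprocess.call(argv)
        codes.append(code)
        if code == 0:
            break  # clean exit: assume the user stopped it on purpose
        if len(codes) > max_restarts:
            break  # give up rather than restart forever
        time.sleep(delay)
    return codes
```

The argv would be whatever command keeps the assistant in the foreground on your system; which command that is, is left open here.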
One thing you might try is to see what is using socket 16 while it's running, assuming the socket will be the same. (Also, if you've had repeated crashes, it would be good to know whether it's 16 each time.) You could do this by looking at
/proc/$pid/fd/16
Also, check the old logs, .git/annex/daemon.log.*
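If it helps, the /proc lookup suggested above can be scripted so it can be run periodically while the assistant is up. This is a Linux-only sketch; the helper names `describe_fd` and `list_fds` are invented for the example.

```python
import os


def describe_fd(pid, fd):
    """Return what /proc/<pid>/fd/<fd> points at, or None if unavailable."""
    try:
        return os.readlink(f"/proc/{pid}/fd/{fd}")
    except OSError:
        # The process is gone, the fd is closed, or we lack permission.
        return None


def list_fds(pid):
    """Map each open fd number of a process to its target (Linux only)."""
    result = {}
    try:
        names = os.listdir(f"/proc/{pid}/fd")
    except OSError:
        return result
    for name in names:
        target = describe_fd(pid, int(name))
        if target is not None:
            result[int(name)] = target
    return result
```

Calling `describe_fd(pid, 16)` with the assistant's pid shows what fd 16 is; a pipe shows up as something like `pipe:[12345]` and a socket as `socket:[12345]`.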
And this time it was socket 27.
Sadly happened during the night and I didn't monitor the socket when it happened.
I got a similar crash:
I was able to determine that fd 18/19 is reliably used for one of the git cat-file processes on this system. It's quite likely that fd 16/17 would be similar. fd 27 less likely. (But could easily be some other less long running git command.)
This would be consistent with git cat-file crashing as it's trying to write to it and read from it.
This took down the transferscanner thread, but the assistant continued running.
I tried, as an experiment, killing one of the git cat-file child processes of the assistant. As hypothesized, that led to the same thing I saw logged before.
So, why might git commands be dying, and which commands? It would be pretty easy for git-annex to detect git cat-file dying, and restart it. Other commands would be more difficult. Still, this might be a git bug which would best be fixed there. It would be good to get a core dump from git.
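As a rough illustration of the "detect it dying and restart it" idea, here is a sketch of a wrapper around a long-running, line-oriented child process. It is not git-annex's code (git-annex is written in Haskell); the class name `RestartingCoprocess` is invented, and the one-line-in, one-line-out protocol and the single retry are simplifying assumptions.

```python
import subprocess


class RestartingCoprocess:
    """Hypothetical wrapper: a line-oriented child that is respawned if it dies."""

    def __init__(self, argv):
        self.argv = argv
        self.restarts = 0
        self._spawn()

    def _spawn(self):
        self.proc = subprocess.Popen(
            self.argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def _exchange(self, line):
        # Send one request line and read one reply line.
        self.proc.stdin.write(line + b"\n")
        self.proc.stdin.flush()
        reply = self.proc.stdout.readline()
        if not reply:
            raise EOFError("coprocess closed its output")
        return reply.rstrip(b"\n")

    def _cleanup(self):
        # Closing stdin may raise again if unsent data is still buffered.
        for stream in (self.proc.stdin, self.proc.stdout):
            try:
                stream.close()
            except OSError:
                pass
        self.proc.kill()  # does nothing if the child was already reaped
        self.proc.wait()

    def request(self, line):
        try:
            return self._exchange(line)
        except (BrokenPipeError, EOFError):
            # The child died: reap it, start a fresh one, retry once.
            self._cleanup()
            self.restarts += 1
            self._spawn()
            return self._exchange(line)

    def close(self):
        self._cleanup()
```

With `["git", "cat-file", "--batch-check"]` as argv, each object name written gets a one-line reply, so it fits this simplified protocol. Plain `--batch` replies with a header plus the object contents and would need a different reader.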
I'm sure I can still provoke it on some of my machines. But this weekend is completely blocked for me.
I'll update everything to the latest version and give you an updated log. But that will probably not happen until the middle of next week.
I haven't been able to replicate this on two of my computers with the latest git-annex (as of this message).
It seemed to happen more on my Mac OS X Lion machine, though, and nightlies haven't been built for some time.
So I'm waiting for an updated package for OS X. If you would rather clean up, feel free to close this; I can just open it again if I hit it again.