Originally I was trying to reproduce datalad/issues/3653, assuming that multiple files pointed to the same key. That was not the case, but my attempt revealed another bug: annex's inability to "obtain" files in parallel when multiple of them point to the same key:
Setup of original repo:
```
/tmp > mkdir src; (cd src; git init; git annex init; dd if=/dev/zero of=1 count=1024 bs=1024; for f in {2..10}; do cp 1 $f; done ; git annex add *; git commit -m added; )
Initialized empty Git repository in /tmp/src/.git/
init (scanning for unlocked files...)
ok
(recording state in git...)
1024+0 records in
1024+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00106651 s, 983 MB/s
add 1
ok
add 10
ok
add 2
ok
add 3
ok
add 4
ok
add 5
ok
add 6
ok
add 7
ok
add 8
ok
add 9
ok
(recording state in git...)
[master (root-commit) 63b1163] added
 10 files changed, 10 insertions(+)
 create mode 120000 1
 create mode 120000 10
 create mode 120000 2
 create mode 120000 3
 create mode 120000 4
 create mode 120000 5
 create mode 120000 6
 create mode 120000 7
 create mode 120000 8
 create mode 120000 9
```
And here is what happens when we try to get the same key in parallel:
```
/tmp > git clone src dst; (cd dst; git annex get -J 5 *; )
Cloning into 'dst'...
done.
(merging origin/git-annex into git-annex...)
(recording state in git...)
(scanning for unlocked files...)
get 2 (from origin...) (checksum...)
git-annex: thread blocked indefinitely in an STM transaction
failed
git-annex: thread blocked indefinitely in an MVar operation
```
It felt like an old issue, but I failed to find a trace of it in a quick lookup.
Reproduced.
After building git-annex with the DebugLocks flag, I got this:
Which points to pickRemote and ensureOnlyActionOn. But pickRemote does no STM actions when there's only 1 remote, so it must really be the latter.
Also, I notice that when 5 files to get are provided, it crashes, but with fewer than 5, it succeeds. Even this trivial case crashes:
```
git annex get -J1 1 2
```
Ok, I see the bug. ensureOnlyActionOn does an STM retry if it finds in the activekeys map that some other thread is operating on the same key. But there is no running STM transaction that will update the map. So, STM detects that the retry would deadlock.
It's not really a deadlock, because once the other thread finishes, it will update the map to remove itself. But STM can't know that. The solution will be to not use STM for waiting on the other thread.
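A minimal standalone illustration of the failure mode (nothing to do with git-annex's actual code): a retry on a TVar that no other thread can ever write makes the GHC runtime throw BlockedIndefinitelyOnSTM, which is the "thread blocked indefinitely in an STM transaction" message above.

```haskell
import Control.Concurrent.STM

main :: IO ()
main = do
    tv <- newTVarIO False
    -- No other thread holds a reference to tv, so nothing can ever
    -- write to it; the runtime detects that this retry can never
    -- wake up and throws BlockedIndefinitelyOnSTM.
    atomically $ do
        done <- readTVar tv
        if done then return () else retry
```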
Hmm, I tried the obvious approach, using an MVar semaphore to wait for the thread, but that just resulted in more STM and MVar deadlocks.
I don't understand why, even after puzzling over it for two hours. I did instrument all calls to atomically, and it looks, unfortunately, like the one in finishCommandActions is deadlocking. If the problem extends beyond ensureOnlyActionOn, it may be much more complicated.
Patch that does not work and I don't know why.
Tried going back to c04b2af3e1a8316e7cf640046ad0aa68826650ed, which is before the separation of perform and cleanup stages. The same code was in onlyActionOn back then. And the test case does not crash.
So, that gives a good commit to start a bisection. Which will probably find the bug was introduced in the separation of perform and cleanup stages, because that added a lot of STM complexity.
(Have to cherry-pick 018b5b81736a321f3eb9762a2afb7124e19dbdf9 onto those old commits to make them build with current libraries.)
Simplified version of the patch above, which converts ensureOnlyActionOn to not use STM at all and is significantly simpler.
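For illustration, a sketch of the general shape such a non-STM ensureOnlyActionOn could take (the Key stand-in and the map layout here are assumptions, not the actual patch):

```haskell
import Control.Concurrent.MVar
import Control.Exception (finally)
import qualified Data.Map as M

type Key = String  -- stand-in; git-annex has a real Key type

-- Map from keys to MVars that get filled when the action on the
-- key finishes.
type ActiveKeys = MVar (M.Map Key (MVar ()))

-- Wait (in plain IO, no STM retry) until no other thread is acting
-- on the key, then run the action while registered as its owner.
ensureOnlyActionOn :: ActiveKeys -> Key -> IO a -> IO a
ensureOnlyActionOn active k action = go
  where
    go = do
        r <- modifyMVar active $ \m -> case M.lookup k m of
            Nothing -> do
                done <- newEmptyMVar
                return (M.insert k done m, Right done)
            Just done -> return (m, Left done)
        case r of
            -- Another thread owns the key; block until it signals
            -- completion, then try to register again.
            Left done -> readMVar done >> go
            Right done -> action `finally` do
                modifyMVar_ active (return . M.delete k)
                putMVar done ()
```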
With this patch, the test case still STM deadlocks. So this seems to be proof that the actual problem is not in ensureOnlyActionOn.
finishCommandActions is reaching the retry case, and STM deadlocks there. The WorkerPool is getting into a state where allIdle is False, and is not leaving it, perhaps due to an earlier STM deadlock. (There seem to be two different ones.)
Also, I notice with --json-error-messages:
So the thread that actually gets to run on the key is somehow reaching a STM deadlock.
Which made me wonder if that thread deadlocks on enteringStage. And it seems so. If Command.Get is changed to use commandStages rather than transferStages, the test case succeeds.
Like finishCommandActions, enteringStage has an STM retry if it needs to wait for something to happen to the WorkerPool. So again it looks like the WorkerPool is getting screwed up.
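Both waits boil down to the same pattern, roughly like this (a hypothetical simplification; the real WorkerPool carries more state): block inside STM until the pool reaches the desired shape, retrying otherwise.

```haskell
import Control.Concurrent.STM

data Stage = TransferStage | VerifyStage deriving (Eq, Show)

-- Stand-in for the WorkerPool: a count of idle slots per stage.
type Pool = TVar [(Stage, Int)]

-- Block until an idle slot for the stage is free, then claim it.
-- If no other thread ever frees a matching slot, this retry is
-- where "blocked indefinitely in an STM transaction" surfaces.
claimSlot :: Pool -> Stage -> STM ()
claimSlot poolv stage = do
    pool <- readTVar poolv
    case lookup stage pool of
        Just n | n > 0 -> writeTVar poolv
            [ (s, if s == stage then c - 1 else c) | (s, c) <- pool ]
        _ -> retry
```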
Not sure if feasible, but maybe a catch-all deadlock breaker could be implemented to mask this and other deadlocks?
The moon landing software had something like this, and it worked pretty well...
Added tracing of changes to the WorkerPool.
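(Roughly along these lines; a hypothetical helper, not necessarily the actual instrumentation: every change to the pool's TVar goes through a wrapper that logs the resulting state.)

```haskell
import Control.Concurrent.STM
import Debug.Trace (traceIO)

-- Apply a change to the pool's TVar and log the new state, so
-- every transition shows up interleaved with the workers' output.
traceChange :: Show pool => String -> TVar pool -> (pool -> pool) -> IO ()
traceChange label poolv f = do
    new <- atomically $ do
        p <- f <$> readTVar poolv
        writeTVar poolv p
        return p
    traceIO (label ++ ": " ++ show new)
```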
Transfer starts for file 1
Transfer complete, verifying starts.
This second thread is being started to process file 2. It starts in TransferStage, but it will be blocked from doing anything by ensureOnlyActionOn.
All files have threads to process them started, so finishCommandActions starts up. It will retry since the threads are still running.
The first thread is done with verification, and the stage is being restored to transfer.
The 0 means that there are 0 spareVals. Normally, the number of spareVals should be the same as the number of IdleWorkers, so it should be 1. It's 0 because the thread is in the process of changing between stages.
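Schematically, that transient window looks like this (hypothetical sketch; the names are stand-ins): the spare value is taken out in one transaction and put back in a later one, so a reader between the two sees spareVals one short of IdleWorkers.

```haskell
import Control.Concurrent.STM

changeStage :: TVar [val] -> IO () -> IO ()
changeStage sparev work = do
    v <- atomically $ do
        vs <- readTVar sparev
        case vs of
            (x:rest) -> writeTVar sparev rest >> return x
            []       -> retry
    work  -- during this window spareVals is one short
    atomically $ modifyTVar' sparev (v :)
```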
The thread should at this point be waiting for an idle TransferStage slot to become available. The second thread still holds that slot. It seems that wait never completes, because a trace I had after that wait never got printed.
It retries again, because of the active worker and also because spareVals is not the same as IdleWorkers.
Deadlock.
Looks like that second thread that got into transfer stage never leaves it, and then the first thread, which wants to restore back to transfer stage, is left waiting forever for it. And so is finishCommandActions.
Aha! The second thread is in fact still in ensureOnlyActionOn. So it's waiting on the first thread to finish. But the first thread can't transition back to TransferStage because the second thread has stolen it.
Now it makes sense.
So... One way to fix this would be to add a new stage, used for threads that are just starting. Then the second thread would be in StartStage, and the first thread would not be prevented from transitioning back to TransferStage. Would need to make sure that, once a thread leaves StartStage, it never transitions back to it.
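A hypothetical sketch of the shape that could take (these constructor names are assumptions, not the actual git-annex types):

```haskell
-- A dedicated stage for freshly spawned threads, so a thread parked
-- in ensureOnlyActionOn does not hold a TransferStage slot that
-- another thread needs back.
data WorkerStage
    = StartStage     -- new: thread spawned, not yet doing real work
    | TransferStage
    | VerifyStage
    | CleanupStage
    deriving (Eq, Show)

-- A thread may leave StartStage, but must never return to it.
validTransition :: WorkerStage -> WorkerStage -> Bool
validTransition _ StartStage = False
validTransition _ _          = True
```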