Information in this article comes from:
ghc -threaded ...: Assumed in this article. Without -threaded, you lose concurrency during FFI calls.
foreign import … defaults to foreign import … safe.
foreign import … safe allows the C call to call Haskell, is concurrent if -threaded, and incurs a bit more cost.
foreign import … unsafe is the opposite.
+RTS -N2 means that all your thousands of Haskell threads are cramped into 2 capabilities, so at most 2 OS threads run Haskell code at any moment. OS threads running safe C calls do not have capabilities assigned.
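To check these settings from inside a program, here is a minimal sketch. It uses only rtsSupportsBoundThreads and getNumCapabilities from Control.Concurrent. How you pass +RTS -N2 is an assumption about your build: you may need to link with -rtsopts for the program to accept it.

```haskell
import Control.Concurrent

main :: IO ()
main = do
    -- True only when the program was linked with -threaded.
    putStrLn ("linked with -threaded: " ++ show rtsSupportsBoundThreads)
    -- The number of capabilities, i.e. the N in +RTS -N.
    n <- getNumCapabilities
    putStrLn ("capabilities: " ++ show n)
```

Compiling with ghc -threaded and running with +RTS -N2 should report 2 capabilities; without any -N, the default is 1.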
Without much effort from the Haskell programmer, multiple Haskell threads calling C together already works: they don't block each other, and they don't block unrelated Haskell threads. The Haskell programmer only needs to add -threaded and delete unsafe.
Here is how GHC does it. An OS thread holding a capability is happily churning through Haskell threads. Suddenly one unbound Haskell thread makes a safe call to C. (The story of the bound case is in the next section.) This Haskell thread is suspended, its OS thread gives up the capability and runs the C code, and some other OS thread takes over the capability and picks up the other Haskell threads. Everyone is happy.
An unsafe C call does not involve a transfer of capability. Therefore the Haskell threads queued on that capability, and garbage collection along with them, are put on hold as collateral damage.
When a safe C call finishes, the original OS thread eventually regains a capability to resume the calling Haskell thread (and picks up other Haskell threads).
Some of the above are probably technical details we don't have to worry about. For example, we don't mind which OS thread is chosen to run C, and we don't mind which OS thread is chosen to resume the caller Haskell thread. The point we care about is that one OS thread runs C and another OS thread runs Haskell.
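The difference between safe and unsafe can be observed with a small experiment. The sketch below is not from GHC's documentation; it imports the libc function sleep twice, once safe and once unsafe, and counts how often a background Haskell thread gets to run during each call. The names sleepSafe, sleepUnsafe and ticksDuring are made up for illustration.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent
import Control.Monad (forever)
import Data.IORef
import Foreign.C

-- Both imports bind the same libc function; only the safety level differs.
foreign import ccall safe   "unistd.h sleep" sleepSafe   :: CUInt -> IO CUInt
foreign import ccall unsafe "unistd.h sleep" sleepUnsafe :: CUInt -> IO CUInt

-- Count how many times a background Haskell thread wakes up while the
-- given action runs in the calling thread.
ticksDuring :: IO a -> IO Int
ticksDuring call = do
    counter <- newIORef (0 :: Int)
    ticker <- forkIO (forever (threadDelay 100000 >> modifyIORef' counter (+1)))
    _ <- call
    n <- readIORef counter
    killThread ticker
    return n

main :: IO ()
main = do
    s <- ticksDuring (sleepSafe 1)
    putStrLn ("ticks during safe call:   " ++ show s)
    u <- ticksDuring (sleepUnsafe 1)
    putStrLn ("ticks during unsafe call: " ++ show u)
```

Compiled with ghc -threaded and run with a single capability, I would expect the safe call to allow several ticks and the unsafe call to allow few or none, because the unsafe call keeps the only capability. The exact counts depend on scheduling, so treat them as indicative only.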
The following example spawns 2 Haskell threads to make 2 slow C calls; meanwhile the main thread still has something to say. All of them have their say at the scheduled times. We also hear that two OS threads run the two C calls.
To compile on Linux: ghc -threaded main.hs slow.c
main.hs:
import Control.Concurrent
import Control.Exception(finally)
import Foreign.C

mforkIO action = do
    done <- newEmptyMVar
    forkIO (action `finally` putMVar done ())
    return (takeMVar done)

main = do
    w1 <- mforkIO (thread_code 3)
    threadDelay 100000
    w2 <- mforkIO (thread_code 2)
    threadDelay 1000000
    putStrLn "haskell thread here"
    w2
    w1

thread_code :: CUInt -> IO ()
thread_code n = do
    ht <- myThreadId
    putStrLn (show ht ++ " starts")
    slow n
    putStrLn (show ht ++ " ends")

foreign import ccall safe slow :: CUInt -> IO ()
slow.c (Linux only):
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdio.h>

unsigned get_ostid(void)
{
  return syscall(SYS_gettid);
}
/* yes, I gamble that pid_t is essentially a word. */

void slow(unsigned n)
{
  printf("slow sleeps in OS thread %u for %u seconds\n", get_ostid(), n);
  sleep(n);
}
Result:
ThreadId 4 starts
slow sleeps in OS thread 4323 for 3 seconds
ThreadId 5 starts
slow sleeps in OS thread 4324 for 2 seconds
[1 second later]
haskell thread here
[1 second later]
ThreadId 5 ends
[1 second later]
ThreadId 4 ends
C calls happening in unpredictably chosen OS threads defeat some C libraries; such a library requires you to choose one OS thread and make all your library C calls there. This is the sole cause of all of the visible complications of GHC threading.
The complication is nicely contained by bound Haskell threads. When a bound Haskell thread is created, it is associated with an OS thread permanently. Every C call from this bound Haskell thread is run in that associated OS thread. So, if you make all library C calls from this bound Haskell thread, they all go to the same OS thread. The library is happy.
(Nominally, Haskell code in this thread may still be run in whichever OS thread bears a capability, but this is unlikely in current GHC. So beware that a bound Haskell thread costs more for context switching.)
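One way to arrange that all library calls go through one bound Haskell thread is a dedicated worker that runs submitted actions. The following is a sketch under my own assumptions, not an API from any library: Worker, startWorker and onWorker are hypothetical names, and the actions submitted here are stand-ins for real library C calls.

```haskell
import Control.Concurrent
import Control.Exception (SomeException, throwIO, try)
import Control.Monad (forever)

-- Hypothetical helper type: a queue of jobs served by one dedicated thread.
newtype Worker = Worker (Chan (IO ()))

-- Start the dedicated thread. forkOS needs -threaded; otherwise fall back
-- to forkIO, since the non-threaded RTS runs everything in one OS thread.
startWorker :: IO Worker
startWorker = do
    queue <- newChan
    let fork = if rtsSupportsBoundThreads then forkOS else forkIO
    _ <- fork (forever (readChan queue >>= \job -> job))
    return (Worker queue)

-- Run an action on the worker thread and wait for its result,
-- re-throwing any exception in the caller.
onWorker :: Worker -> IO a -> IO a
onWorker (Worker queue) action = do
    result <- newEmptyMVar
    writeChan queue (try action >>= putMVar result)
    outcome <- takeMVar result
    case outcome of
        Left e  -> throwIO (e :: SomeException)
        Right x -> return x

main :: IO ()
main = do
    w <- startWorker
    t1 <- onWorker w myThreadId
    t2 <- onWorker w myThreadId
    b  <- onWorker w isCurrentThreadBound
    putStrLn ("same Haskell thread both times: " ++ show (t1 == t2))
    putStrLn ("that thread is bound: " ++ show b)
```

Any thread may call onWorker; the submitted action, and therefore any C call inside it, runs in the worker thread. The cost is one queue hand-off per call.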
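Three ways to obtain a bound Haskell thread: call forkOS, which always creates a fresh OS thread for the association; use the main thread, which is itself bound; or have C call Haskell, which is covered later in this article.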
Why does forkOS always create a fresh OS thread for the association? For concurrency: two forkOS'ed Haskell threads calling C at the same time necessitates two OS threads.
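To see which threads are bound without writing any C, here is a minimal sketch using isCurrentThreadBound from Control.Concurrent. The helper inThread is a made-up name. The program guards the forkOS case with rtsSupportsBoundThreads because forkOS fails when the program is not linked with -threaded.

```haskell
import Control.Concurrent

-- Run an action in a thread made by the given fork function and
-- return its result to the caller.
inThread :: (IO () -> IO ThreadId) -> IO a -> IO a
inThread fork action = do
    box <- newEmptyMVar
    _ <- fork (action >>= putMVar box)
    takeMVar box

main :: IO ()
main = do
    inMain <- isCurrentThreadBound
    putStrLn ("main thread bound?   " ++ show inMain)
    viaIO <- inThread forkIO isCurrentThreadBound
    putStrLn ("forkIO thread bound? " ++ show viaIO)
    if rtsSupportsBoundThreads
        then do
            viaOS <- inThread forkOS isCurrentThreadBound
            putStrLn ("forkOS thread bound? " ++ show viaOS)
        else putStrLn "not linked with -threaded; forkOS is unavailable"
```

A thread made by forkIO is never bound. With -threaded, the main thread and the forkOS thread should both report that they are bound.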
The following example first shows that an unbound Haskell thread can make C calls in different OS threads at different times. (I force it by exploiting a technical detail in the previous section.) Then it tests that a forkOS bound Haskell thread makes two C calls in the same OS thread (immune to my exploit); meanwhile another forkOS bound Haskell thread butts in.
To compile on Linux: ghc -threaded main.hs slow.c
main.hs:
import Control.Concurrent
import Control.Exception(finally)
import Foreign.C

mforkIO action = do
    done <- newEmptyMVar
    forkIO (action `finally` putMVar done ())
    return (takeMVar done)

mforkOS action = do
    done <- newEmptyMVar
    forkOS (action `finally` putMVar done ())
    return (takeMVar done)

main = do
    wait_ibm <- mforkIO ibm
    wait_ibm
    wait_ibm <- mforkOS ibm
    threadDelay 500000
    forkOS (do x <- get_ostid
               putStrLn ("another forkOS calls C in " ++ show x)
           )
    wait_ibm

-- ibm = I've Been Moved!
ibm = do
    b <- isCurrentThreadBound
    let msg = "ibm " ++ (if b then "" else "un") ++ "bound calls C in "
    x <- get_ostid
    putStrLn (msg ++ show x)
    wait_sleep <- mforkIO (sleep 2 >> return ())
    threadDelay 1000000
    x <- get_ostid
    putStrLn (msg ++ show x)
    wait_sleep

foreign import ccall safe get_ostid :: IO CUInt
foreign import ccall safe sleep :: CUInt -> IO CUInt
slow.c (Linux only):
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

unsigned get_ostid(void)
{
  return syscall(SYS_gettid);
}
/* yes, I gamble that pid_t is essentially a word. */
Result:
ibm unbound calls C in 5193
ibm unbound calls C in 5194
ibm bound calls C in 5196
another forkOS calls C in 5197
ibm bound calls C in 5196
C calling Haskell works without extra effort from the Haskell programmer (or the C programmer). Firstly, multiple C OS threads calling Haskell is concurrent. Secondly, if the called Haskell calls C, i.e., C → Haskell → C, the 2nd C code is run in the same OS thread as the 1st C code. So C libraries with thread-locality requirements are happy.
The most popular use case of C → Haskell → C is with GUI libraries and OpenGL: the 1st C is the event loop, the Haskell is an event handler you supply, and the 2nd C is your event handler giving commands to the library. The library requires the event loop and the commands to be in the same OS thread.
Here is how GHC does it. When C calls Haskell, the GHC RTS creates a fresh bound Haskell thread associated with the calling OS thread, to run the called Haskell. From what we now know about bound threads, everything just works when the called Haskell calls C.
This mechanism is also how multiple bound Haskell threads end up sharing the same OS thread. For example if we have this call chain:
C → Haskell → C → Haskell → C → Haskell → C → Haskell
then we have 4 bound Haskell threads associated with the same OS thread. This is harmless because at least 3 of them are suspended; only the last one is active and may make yet another C call. In fact, we also understand that it is important that all 4 C calls and any further ones are in the same OS thread, stacked upon each other.
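A C → Haskell call can also be tried without writing a C file, by handing a Haskell callback to a libc function. The sketch below is my own illustration: it passes a Haskell comparator to qsort, and the comparator reports the Haskell thread it runs in. The names mkCompare, c_qsort, compareInts and sortViaC are made up. It assumes a platform where C int matches CInt and qsort comes from the standard C library.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent
import Foreign
import Foreign.C

type Compare = Ptr CInt -> Ptr CInt -> IO CInt

-- Turn a Haskell function into a C function pointer.
foreign import ccall "wrapper"
    mkCompare :: Compare -> IO (FunPtr Compare)

-- qsort calls back into Haskell, so the import must be safe.
foreign import ccall safe "stdlib.h qsort"
    c_qsort :: Ptr CInt -> CSize -> CSize -> FunPtr Compare -> IO ()

-- The callback: compare two ints and report where it is running.
compareInts :: Compare
compareInts pa pb = do
    a <- peek pa
    b <- peek pb
    t <- myThreadId
    bound <- isCurrentThreadBound
    putStrLn ("callback in " ++ show t ++ ", bound: " ++ show bound)
    -- Map LT, EQ, GT to -1, 0, 1.
    return (fromIntegral (fromEnum (compare a b)) - 1)

-- Sort a list by copying it to a C array and calling qsort on it.
sortViaC :: [CInt] -> IO [CInt]
sortViaC xs = do
    cmp <- mkCompare compareInts
    sorted <- withArrayLen xs $ \len ptr -> do
        c_qsort ptr (fromIntegral len) (fromIntegral (sizeOf (undefined :: CInt))) cmp
        peekArray len ptr
    freeHaskellFunPtr cmp
    return sorted

main :: IO ()
main = do
    t <- myThreadId
    putStrLn ("main is " ++ show t)
    sortViaC [3, 1, 2] >>= print
```

The last line printed is [1,2,3]. With -threaded, I would expect each callback line to show a ThreadId different from main's and to report that it is bound, in line with the mechanism described above. How many times qsort invokes the comparator is up to the C library.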
The following example has a C function and a Haskell function recursively calling each other, showing that every call into Haskell is another bound Haskell thread, and they all use the same OS thread for C calls.
To compile on Linux: ghc -threaded main.hs slow.c
main.hs:
import Control.Concurrent
import Foreign.C
import Foreign.Ptr

foreign import ccall safe get_ostid :: IO CUInt

hthreadinfo prefix = do
    t <- myThreadId
    putStr (prefix ++ ": haskell " ++ show t)
    b <- isCurrentThreadBound
    if b
      then do
        n <- get_ostid
        putStrLn (" bound to os thread " ++ show n)
      else putStrLn " unbound"

main = do
    -- recall that main is also run in a bound thread
    haskell 5

foreign import ccall safe cfunc :: FunPtr (IO ()) -> IO ()
foreign import ccall "wrapper" ptr_for_cfunc :: IO () -> IO (FunPtr (IO ()))

haskell 0 = return ()
haskell n = do
    hthreadinfo ("T minus " ++ show n)
    ptr <- ptr_for_cfunc (haskell (n-1))
    cfunc ptr
    freeHaskellFunPtr ptr
    ht <- myThreadId
    putStrLn (show ht ++ " done")
slow.c (Linux only):
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>
#include <HsFFI.h>

unsigned get_ostid(void)
{
  return syscall(SYS_gettid);
}
/* yes, I gamble that pid_t is essentially a word. */

void cfunc(HsFunPtr callback)
{
  callback();
}
Result:
T minus 5: haskell ThreadId 3 bound to os thread 3942
T minus 4: haskell ThreadId 4 bound to os thread 3942
T minus 3: haskell ThreadId 5 bound to os thread 3942
T minus 2: haskell ThreadId 6 bound to os thread 3942
T minus 1: haskell ThreadId 7 bound to os thread 3942
ThreadId 7 done
ThreadId 6 done
ThreadId 5 done
ThreadId 4 done
ThreadId 3 done
I have shown examples with Haskell as the main program. You can use C as the main program too; in fact, you can create OS threads on the C side, and call Haskell from them. All of the above still works.