This article assumes x86 Linux and has been tested on:
Linux | CPU | GHC | cabal-install | GCC |
---|---|---|---|---|
Ubuntu 17.04 | x86_64 | 8.2.1 | 2.0.0.0 | 6.3.0 |
Ubuntu 16.10 | x86_64 | 8.0.2 | 1.24.0.2 | 6.2.0 |
Ubuntu 15.10 | x86_64 | 7.10.3 | 1.22.6.0 | 5.2.1 |
Take for example this Haskell module that exports to C and will go into the shared library. In general there can be multiple modules.
    -- This file is Eval.hs.
    module Eval() where

    import Foreign.C
    import Foreign
    ...

    foreign export ccall "eval" c_eval :: CString -> Ptr CInt -> IO (Ptr CInt)
    c_eval s r = do
        cs <- peekCAString s
        case hs_eval cs of
            Nothing -> return nullPtr
            Just x -> do
                poke r x
                return r

    hs_eval :: String -> Maybe CInt
    ...
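Seen from C, the export above has the prototype `int *eval(const char *, int *)`: it returns its second argument on success and a null pointer on failure. Since hs_eval is elided here, the following is only a hypothetical C stand-in with the same contract. The name `mock_eval` and its accept-one-integer behaviour are assumptions made for illustration, not the author's evaluator. Something like it can be handy for exercising a C caller before the Haskell library is built.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in with the same contract as the exported eval:
   on success, store the result in *r and return r; on failure, return NULL.
   It only accepts a single decimal integer, optionally surrounded by
   whitespace; the real hs_eval is elided in the article. */
int *mock_eval(const char *s, int *r)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return NULL;            /* no digits, or out of range for int */
    while (*end == ' ' || *end == '\t' || *end == '\n')
        end++;                  /* allow trailing whitespace, e.g. from fgets */
    if (*end != '\0')
        return NULL;            /* trailing garbage */
    *r = (int)v;
    return r;
}
```

A caller checks the returned pointer rather than the stored value, exactly as it would with the real export.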
Some C code must call hs_init near the beginning and hs_exit near the end. There are many choices for this. Here is one: include one C module with a constructor and a destructor in the shared library; they will be called automatically.
    /* This file is hsbracket.c. */
    #include <HsFFI.h>

    static void my_enter(void) __attribute__((constructor));
    static void my_enter(void)
    {
        static char *argv[] = { "libEval.so", 0 }, **argv_ = argv;
        static int argc = 1;
        hs_init(&argc, &argv_);
    }

    static void my_exit(void) __attribute__((destructor));
    static void my_exit(void)
    {
        hs_exit();
    }
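The constructor and destructor attributes are a GCC/Clang extension, and the mechanism can be tried without any GHC headers. The sketch below is a stand-in under that assumption: the `demo_*` names are hypothetical, and a counter takes the place of the hs_init and hs_exit calls.

```c
#include <stdio.h>

/* Counts how many times the constructor has run; it stands in for the
   effect of hs_init so that the sketch needs no GHC headers. */
static int demo_init_count = 0;

/* Runs automatically before main, or when the shared library is loaded. */
static void demo_enter(void) __attribute__((constructor));
static void demo_enter(void)
{
    demo_init_count++;
}

/* Runs automatically at normal exit, or when the library is unloaded. */
static void demo_exit(void) __attribute__((destructor));
static void demo_exit(void)
{
    demo_init_count--;          /* undo what the constructor did */
}

/* Lets callers observe that the constructor has already run. */
int demo_init_count_so_far(void)
{
    return demo_init_count;
}
```

By the time any ordinary code calls `demo_init_count_so_far`, it should already return 1, which is the property hsbracket.c relies on for hs_init.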
In general, the major choices differ in which code calls hs_init and hs_exit.
Here is how to build the shared library (if your GHC is 8.2.1):
ghc -O2 -dynamic -shared -fPIC -o libEval.so Eval.hs hsbracket.c -lHSrts-ghc8.2.1
A few words on this long command:
* -fPIC means generate position-independent code; this is important for creating a shared library.
* -dynamic means link against shared libraries of other packages, e.g., base.
* -shared means create a shared library of self.
* -dynamic -shared is not redundant; they refer to opposite sides. (Can you omit -dynamic to request static libraries of other packages? Not really, they were not built with -fPIC. In particular it is illegal on x86_64.)
* -lHSrts-ghc8.2.1 means link against the GHC RTS; somehow it needs to be specified explicitly, and worse, when you use another GHC version, you have to edit it. You likely need a configuration script. (Use -lHSrts_thr-ghc8.2.1 if you want the threaded RTS, i.e., the equivalent of -threaded.)

If you want more separate compilation:
    ghc -O2 -fPIC -c hsbracket.c
    ghc -O2 -dynamic -shared -fPIC -o libEval.so Eval.hs hsbracket.o -lHSrts-ghc8.2.1
Cabalization is also possible. With Cabal and cabal-install 2.0 onwards, there is specific support for building an export-to-C library. It automatically links against the right version of GHC RTS (unfortunately just the unthreaded one), and it lets you specify the name and Linux-style version string of the *.so file. The package description file goes like this:
    -- This file is eval.cabal.
    name: eval
    version: 2.0
    ...
    build-type: Simple
    cabal-version: >=2.0

    foreign-library Eval
      type: native-shared
      lib-version-info: 2:0:0
      -- if os(Windows)
      --   options: standalone
      --   mod-def-file: Eval.def
      other-modules: Eval
      c-sources: hsbracket.c
      build-depends: base >= 4.8
      default-language: Haskell2010
The section type is foreign-library, and it needs a name, which will become part of the name of the *.so file. I chose “Eval”, and this means I will get the filename “libEval.so” (possibly plus version strings).
The type field has to be present and set to native-shared to mean you want a shared library. (In the future, native-static will be supported to mean static libraries.)
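The lib-version-info field (optional) is unfamiliar to many of us, but if you have worked with libtool, it's exactly that. The format is usually summarized as “current:revision:age”, and my value is 2:0:0, meaning the following:
* current = 2: the most recent interface number that the library implements
* revision = 0: the implementation number of the current interface
* age = 0: how many earlier interface numbers are still supported, here none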
With this set, cabal-install will name the file libEval.so.2.0.0 and produce the symlinks libEval.so.2 and libEval.so as per Linux convention. (In general, the Linux-style version string is (current - age).age.revision.)
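The conversion from a current:revision:age triple to the Linux-style version string can be sketched in a few lines of C. The function name `linux_version_string` and its error convention are hypothetical, chosen only for this illustration; it is not part of Cabal or libtool.

```c
#include <stdio.h>

/* Format the Linux-style version string for a libtool-style
   current:revision:age triple, i.e. (current - age).age.revision.
   Returns 0 on success, -1 if the triple is invalid or buf is too small. */
int linux_version_string(int current, int revision, int age,
                         char *buf, size_t buflen)
{
    int n;

    if (current < 0 || revision < 0 || age < 0 || age > current)
        return -1;
    n = snprintf(buf, buflen, "%d.%d.%d", current - age, age, revision);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

For 2:0:0 this gives 2.0.0, matching the filename libEval.so.2.0.0; a triple such as 5:3:2 gives 3.2.3.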
Alternatively, the optional lib-version-linux field (not shown) lets you give the Linux-style version string directly. The corresponding symlinks will also be made.
If neither field is present, then the filename will be simply libEval.so.
On Windows, the clause options: standalone must be present. (On other platforms, at least the standalone part must be absent.) It means the produced DLL does not depend on other Haskell DLLs such as base or the RTS. There is also an optional mod-def-file field for the *.def file. I know nothing about this.
The Haskell code is to be listed under other-modules. The other fields are as usual.
A simple “cabal install” will build and install to $HOME/.cabal/lib by default. There are many ways to override, such as adding --prefix=/xxx/yyy.
See also Cabal User's Guide - foreign libraries and Libtool Manual - Libtool's versioning system.
With older Cabal, I hijack the library section for the shared library, and I use Cabal's configuration hook to detect and request the correct GHC RTS version number. The package description file goes like this:
    -- This file is eval.cabal.
    name: eval
    version: 2.0
    ...
    build-type: Configure
    cabal-version: >=1.10
    -- >=1.10 because of default-language below
    extra-source-files: configure
    extra-tmp-files: eval.buildinfo

    library
      default-language: Haskell2010
      exposed-modules: Eval
      -- other-modules:
      c-sources: hsbracket.c
      build-depends: base >= 4.8
We use build-type Configure to ask cabal to run our configure script:
    #!/bin/sh
    while [ $# -ne 0 ]; do
        case $1 in
            --with-compiler=*)
                v=`${1#--with-compiler=} --numeric-version`
                cat > eval.buildinfo <<EOF
    extra-libraries: HSrts-ghc$v
    EOF
                break
                ;;
            *)
                shift
        esac
    done
This shell script receives parameters from cabal, including a “--with-compiler=” parameter. Assuming the parameter refers to GHC (either just “ghc” or a more elaborate path), we can run it with --numeric-version to find its version number. Then we can generate a line “extra-libraries: HSrts-ghc<version>” to link in the RTS. Put this line in a file called eval.buildinfo (generally <pkgname>.buildinfo), and cabal will honour it.
(The full story on build-type Configure is in the Cabal User Guide. Exercise: write a configure script to find out what parameters it receives from cabal. It does not have to generate a <pkgname>.buildinfo file.)
For consistency across users who use “cabal install” and users who use “Setup.hs”, we are also required to provide a Setup.hs that does the same thing as build-type Configure:
    import Distribution.Simple
    main = defaultMainWithHooks autoconfUserHooks
Now we can build our library with:

    cabal configure
    cabal build
The shared library file is now dist/build/libHSeval-2.0-DjSc3QQ1qhyJIL3fbPKMJ3-ghc8.2.1.so. This name is determined by package name, package version, API hash, and GHC version. Copy it out and add a “libEval.so” symlink if you want. (Renaming is no good, Linux will look for the presence of the birth filename.)
Here is a C main program that links against the shared library at build time, calculator.c:
    #include <stdio.h>

    extern int *eval(const char *, int *);

    #define INPUT_LEN 500

    int main(int argc, char*argv[])
    {
        int r;
        char input[INPUT_LEN];

        for (;;) {
            printf("> ");
            if (fgets(input, INPUT_LEN, stdin) == NULL) break;
            if (eval(input, &r) != NULL) {
                printf("%d\n", r);
            } else {
                printf("syntax error\n");
            }
        }
        printf("\n");
        return 0;
    }
It can be built with:

    gcc -O2 -c calculator.c
    gcc -o calculator calculator.o -L. -lEval -Wl,-rpath,'$ORIGIN'
A few words on two options:
* -L. means look for libraries in the current directory at build time; hopefully libEval.so is there.
* -Wl,-rpath,'$ORIGIN' means look for libraries in the same directory as the executable at run time; hopefully libEval.so will be there.
* If it is not up to you to add -Wl,-rpath,'$ORIGIN', then you will have to copy libEval.so to a standard location or use LD_LIBRARY_PATH.

How to find libEval.so and all the other Haskell libraries at run time is a long story, told in a separate section.
Here is a C main program that does not link against the shared library at build time, but will load it by handwritten code at run time, calculoader.c:
    #include <stdio.h>
    #include <dlfcn.h>

    #define INPUT_LEN 500

    int main(void)
    {
        int r;
        char input[INPUT_LEN];
        void *libEval;
        int *(*eval)(const char *, int *);

        if ( (libEval = dlopen("libEval.so", RTLD_LAZY)) == NULL
             || (eval = dlsym(libEval, "eval")) == NULL ) {
            printf("calculoader says: %s\n", dlerror());
            return 1;
        }
        for (;;) {
            printf("> ");
            if (fgets(input, INPUT_LEN, stdin) == NULL) break;
            if (eval(input, &r) != NULL) {
                printf("%d\n", r);
            } else {
                printf("syntax error\n");
            }
        }
        printf("\n");
        dlclose(libEval);
        return 0;
    }
It can be built with:

    gcc -O2 -c calculoader.c
    gcc -Wl,-rpath='$ORIGIN' -o calculoader calculoader.o -ldl
-Wl,-rpath,'$ORIGIN' also means look for libraries in the same directory as the executable at dlopen time; hopefully libEval.so will be there.
If it is not up to you to add -Wl,-rpath,'$ORIGIN', then the dlopen call will have to use a specific path, or you will have to copy libEval.so to a standard location, or use LD_LIBRARY_PATH.
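One way to give dlopen a specific path without hard-coding an absolute directory is to compute the executable's own directory at run time. The following is a minimal sketch under two assumptions: it is Linux-only because it reads the /proc/self/exe symlink, and the helper name `path_beside_executable` is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write "<directory of the running executable>/<name>" into buf.
   Linux-specific: relies on the /proc/self/exe symlink.
   Returns 0 on success, -1 on failure or if buf is too small. */
int path_beside_executable(const char *name, char *buf, size_t buflen)
{
    char exe[4096];
    char *slash;
    ssize_t len;
    int n;

    len = readlink("/proc/self/exe", exe, sizeof exe - 1);
    if (len <= 0 || (size_t)len >= sizeof exe - 1)
        return -1;                   /* unreadable, or possibly truncated */
    exe[len] = '\0';                 /* readlink does not terminate */
    slash = strrchr(exe, '/');
    if (slash == NULL)
        return -1;
    *slash = '\0';                   /* keep only the directory part */
    n = snprintf(buf, buflen, "%s/%s", exe, name);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

In calculoader.c, the resulting string could then be passed to dlopen in place of the bare "libEval.so".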
There are many choices for where to store shared libraries and how to find them at run time. First, let me describe the choice taken above (including GHC and cabal defaults):
The executable records the rpath $ORIGIN, which is a special string meaning the directory that holds the executable. I put libEval.so there, so it will be found there.
libEval.so records a lot of absolute directories in its rpath, e.g.,

    /usr/local/lib/ghc-8.2.1/base-4.10.0.0
    /usr/local/lib/ghc-8.2.1/rts

This is done by GHC and cabal defaults. This is how the shared libraries of base, RTS, etc. are found. You may recognize that these are exactly where GHC and Haskell libraries are installed.
So, this choice intends that the target computer is similar to the build computer in having the same GHC and Haskell packages installed at the same place, and libEval.so is installed in the same directory as the executable.
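The bigger picture (though not the biggest picture) is this search order, as documented in ld.so(8):
1. the rpath recorded in the file, if it is stored as DT_RPATH and no DT_RUNPATH is present
2. the environment variable LD_LIBRARY_PATH
3. the rpath recorded in the file, if it is stored as DT_RUNPATH
4. the cache /etc/ld.so.cache, which ldconfig builds from /etc/ld.so.conf
5. the standard locations such as /lib and /usr/lib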
Use readelf --dynamic to find what rpath is recorded in a file; use ldd to dry-run finding shared libraries. Both commands apply to both executables and shared libraries, e.g., ldd libEval.so and ldd calculator.
The GHC default rpath generation is represented by the option -dynload sysdep, and can be negated by -dynload deploy: don't record those rpath entries.
On top of that, you can make your own additions to rpath. Using GHC, it's -optl-Wl,-rpath=; using GCC, it's -Wl,-rpath=; using ld, it's -rpath=. The special string $ORIGIN refers to the directory containing self, and can be part of a longer path, e.g., $ORIGIN/mylibs.
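To make the meaning of $ORIGIN concrete, here is a simplified model in C of how one rpath entry is turned into a directory. It is only an illustration: the name `expand_origin` is hypothetical, and it handles just a leading $ORIGIN, whereas the real dynamic loader accepts more forms (such as ${ORIGIN}).

```c
#include <stdio.h>
#include <string.h>

/* Expand a leading "$ORIGIN" in one rpath entry, given the directory that
   holds the file carrying the rpath. Only the plain leading form is
   handled; any other entry is copied unchanged.
   Returns 0 on success, -1 if buf is too small. */
int expand_origin(const char *entry, const char *origin_dir,
                  char *buf, size_t buflen)
{
    static const char token[] = "$ORIGIN";
    const size_t token_len = sizeof token - 1;
    int n;

    if (strncmp(entry, token, token_len) == 0
        && (entry[token_len] == '/' || entry[token_len] == '\0'))
        n = snprintf(buf, buflen, "%s%s", origin_dir, entry + token_len);
    else
        n = snprintf(buf, buflen, "%s", entry);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With an executable in /opt/app/bin (a made-up location), the entry $ORIGIN/mylibs would resolve to /opt/app/bin/mylibs, while an absolute entry such as /usr/local/lib stays as it is.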