(c) Software Lab. Alexander Burger
This document describes how to call C functions in shared object files
(libraries) from PicoLisp, using the built-in native function -- possibly with the help of
the struct and lisp functions. It applies only to the 64-bit
version of PicoLisp.
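native calls a C function in a shared library. It tries to find the
library and the function by name, convert the Lisp arguments to C data, call
the function, and convert the returned C data back to Lisp.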
The direct return value of native is the Lisp representation of
the C function's return value. Further values, returned by reference from the C
function, are available in Lisp variables (symbol values).
struct is a helper function, which can be used to manipulate C
data structures in memory. It may take a scalar (a numeric representation of a C
value) to convert it to a Lisp item, or (more typically) a pointer to a memory
area to build and extract data structures. lisp allows you to
install callback functions, callable from C code, written in Lisp.
In combination, these three functions can interface PicoLisp to almost any C function.
The above steps are fully dynamic; native doesn't have (and
doesn't require) a priori knowledge about the library, the function or the
involved data. There is no need to write any glue code, interfaces or include
files. All functions can even be called interactively from the REPL.
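The arguments to native are, in order: the library, the function, an optional result specification, and optional arguments to the C function.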
The simplest form is a call to a function without return value and without arguments. If we assume a library "lib.so", containing a function with the prototype
void fun(void);
then we can call it as
(native "lib.so" "fun")
The first argument to native specifies the library. It is either
the name of a library (a symbol), or the handle of a previously
found library (a number).
As a special case, a transient symbol "@" can be passed for the
library name. It then refers to the current main program (instead of an external
library), and can be used for standard functions like "malloc" or
"printf".
native uses dlopen(3) internally to find and open
the library, and to obtain the handle. If the name contains a slash ('/'), then
it is interpreted as a (relative or absolute) pathname. Otherwise, the dynamic
linker searches for the library according to the system's environment and
directories. See the man page of dlopen(3) for further details.
If called with a symbolic argument, native automatically caches
the handle of the found library in the value of that symbol. The most natural
way is to pass the library name as a transient
symbol ("lib.so" above): The initial value of a transient symbol is
that symbol itself, so that native receives the library name upon
the first call. After successfully finding and opening the library,
native stores the handle of that library in the value of the passed
symbol ("lib.so"). As native evaluates its arguments
in the normal way, subsequent calls within the same transient scope will receive
the numeric value (the handle), and don't need to open and search the library
again.
The same rule applies to the second argument, the function. When called with
a symbol, native stores the function pointer in its value, so that
subsequent calls evaluate to that pointer, and native can directly
jump to the function.
native uses dlsym(3) internally to obtain the
function pointer. See the man page of dlsym(3) for further details.
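The lookup and caching described above can be sketched on the C side. This is only an illustration of the dlopen(3)/dlsym(3) pattern, not the code native actually runs: the names open_once, lookup_once and struct cached_sym are hypothetical, the RTLD flags are an assumption, and passing NULL to dlopen(3) is merely one way to address the main program (as "@" does).

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical cache entry: a symbol name plus the pointer found for it. */
struct cached_sym {
    const char *name;
    void *ptr;                 /* NULL until the first successful lookup */
};

/* Open a library only once. A NULL name asks dlopen(3) for the main
   program; the flags used here are an assumption of this sketch. */
void *open_once(void **handle, const char *name) {
    if (*handle == NULL)
        *handle = dlopen(name, RTLD_LAZY | RTLD_GLOBAL);
    return *handle;
}

/* Search a function only once; later calls reuse the stored pointer. */
void *lookup_once(void *handle, struct cached_sym *sym) {
    if (sym->ptr == NULL)
        sym->ptr = dlsym(handle, sym->name);
    return sym->ptr;
}
```

A caller would keep one handle per library and one cached_sym per function, which mirrors how the values of the transient symbols hold the handle and the function pointer after the first call.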
In most cases a program will call more than one function from a given
library. If we keep the code within the same transient scope (i.e. in the same
source file, and not separated by the ==== function), each library will be opened --
and each function searched -- only once.
(native "lib.so" "fun1")
(native "lib.so" "fun2")
(native "lib.so" "fun3")
After "fun1" was called, "lib.so" will be open, and
won't be re-opened for "fun2" and "fun3". Consider
the definition of helper functions:
(de fun1 ()
   (native "lib.so" "fun1") )
(de fun2 ()
   (native "lib.so" "fun2") )
(de fun3 ()
   (native "lib.so" "fun3") )
After any one of fun1, fun2 or fun3
was called, the symbol "lib.so" will hold the library handle. And
each function "fun1", "fun2" and
"fun3" will be searched only when called the first time.
Warning: Avoid putting more than one library into a single transient scope if two different functions with the same name might be called in two different libraries. Because the function pointer is cached in the symbol's value, the second call would otherwise (wrongly) go to the first function.
The (optional) third argument to native specifies the return
value. A C function can return many types of values, like integer or floating
point numbers, string pointers, or pointers to structures which in turn consist
of those types, and even other structures or pointers to structures.
native tries to cover most of them.
As described in the result specification,
the third argument should consist of a pattern which tells native
how to extract the proper value.
In the simplest case, the result specification is NIL like in
the examples so far. This means that either the C function returns
void, or that we are not interested in the value. The return value
of native will be NIL in that case.
If the result specification is one of the symbols B,
I or N, an integer number is returned, by interpreting
the result as a char (8 bit unsigned byte), int (32
bit signed integer), or long number (64 bit signed integer),
respectively. Other types (signed or unsigned numbers of different sizes) can be
produced from these types with logical and arithmetic operations if necessary.
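The kind of logical and arithmetic adjustment meant here can be illustrated in C. The two helpers below are hypothetical names chosen for this sketch; they show the arithmetic only, which on the Lisp side would be applied to the number returned by native.

```c
/* Reinterpret a signed 32-bit result (as delivered by the I pattern)
   as an unsigned 32-bit value by masking off the sign extension. */
long long as_unsigned32(int v) {
    return (long long)v & 0xFFFFFFFFLL;
}

/* Reinterpret an unsigned byte (as delivered by the B pattern)
   as a signed 8-bit value. */
int as_signed8(int b) {
    return b >= 128 ? b - 256 : b;
}
```

For instance, a C function returning unsigned int 4294967295 arrives through the I pattern as -1, and masking with FFFFFFFF (hexadecimal) recovers the unsigned value.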
If the result specification is the symbol C, the result is
interpreted as a 16 bit number, and a single-char transient symbol (string) is
returned.
A specification of S tells native to interpret the
result as a pointer to a C string (null terminated), and to return a transient
symbol (string).
If the result specification is a number, it will be used as a scale to
convert a returned double (if the number is positive) or
float (if the number is negative) to a scaled fixpoint number.
Examples for function calls, with their corresponding C prototypes:
(native "lib.so" "fun" 'I) # int fun(void);
(native "lib.so" "fun" 'N) # long fun(void);
(native "lib.so" "fun" 'N) # void *fun(void);
(native "lib.so" "fun" 'S) # char *fun(void);
(native "lib.so" "fun" 1.0) # double fun(void);
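The scaling of a floating point result can be pictured with a small C sketch. The function name to_fixpoint is hypothetical, the scale is taken as a plain positive factor (for example 1000 for three decimal places), and rounding half away from zero is an assumption of this sketch; the text above does not state how native rounds.

```c
/* Convert a C double to a scaled fixpoint integer.
   Rounding half away from zero is an assumption of this sketch. */
long to_fixpoint(double value, long scale) {
    double scaled = value * (double)scale;
    return (long)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}
```

With a scale of 1000, a returned double of 12.3 corresponds to the fixpoint number 12300.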
If the result specification is a list, it means that the C function returned a pointer to an array, or an arbitrary memory structure. The specification list should then consist of either the above primitive specifications (symbols or numbers), or of cons pairs of a primitive specification and a repeat count, to denote arrays of the given type.
Examples for function calls, with their corresponding pseudo C prototypes:
(native "lib.so" "fun" '(I . 8))       # int *fun(void);  // 8 integers
(native "lib.so" "fun" '(B . 16))      # unsigned char *fun(void);  // 16 bytes
(native "lib.so" "fun" '(I I))         # struct {int i; int j;} *fun(void);
(native "lib.so" "fun" '(I . 4))       # struct {int i[4];} *fun(void);
(native "lib.so" "fun" '(I (B . 4)))   # struct {
                                       #    int i;
                                       #    unsigned char c[4];
                                       # } *fun(void);
(native "lib.so" "fun"                 # struct {
   '(((B . 4) I) (S . 12) (N . 8)) )   #    struct {unsigned char c[4]; int i;}
                                       #    char *names[12];
                                       #    long num[8];
                                       # } *fun(void);
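To make the correspondence between a pattern and the memory layout concrete, here is a C sketch of a function matching the '(I (B . 4)) case, together with a helper that walks the returned memory the way such a pattern describes it. The names struct rec, make_rec and decode_rec are hypothetical, and the sketch assumes a 4-byte int with no padding between the two members, which holds for this particular struct on common platforms.

```c
#include <stddef.h>
#include <string.h>

/* The structure from the '(I (B . 4)) example. */
struct rec {
    int i;
    unsigned char c[4];
};

/* A C function returning a pointer to such a structure. */
struct rec *make_rec(void) {
    static struct rec r = { 7, { 1, 2, 3, 4 } };
    return &r;
}

/* Read the memory sequentially: first one int, then four bytes.
   This mirrors the order of the items in the pattern. */
void decode_rec(const void *p, int *i, unsigned char out[4]) {
    const unsigned char *m = p;
    memcpy(i, m, sizeof(int));
    memcpy(out, m + sizeof(int), 4);
}
```

Structures with padding between members would need the pattern to account for the padding bytes as well; that detail is outside what this sketch shows.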
If a returned structure has an element which is a pointer to some
other structure (i.e. not an embedded structure like in the last example above),
this pointer must first be obtained with an N pattern, and can
then be passed to struct for further
extraction.
The (optional) fourth and following arguments to native specify
the arguments to the C function.
Integer arguments (up to 64 bits, signed or unsigned char,
short, int or long) can be passed as they
are, as numbers.
(native "lib.so" "fun" NIL 123) # void fun(int);
(native "lib.so" "fun" NIL 1 2 3) # void fun(int, long, short);
String arguments can be specified as symbols. native allocates
memory for each string (with strdup(3)), passes the pointer to the
C function, and releases the memory (with free(3)) when done.
(native "lib.so" "fun" NIL "abc") # void fun(char*);
(native "lib.so" "fun" NIL 3 "def") # void fun(int, char*);
Note that the allocated string memory is released after the return
value is extracted. This allows a C function to return the argument string
pointer, perhaps after modifying the data in-place, and receive the new string
as the return value (with the S specification).
(native "lib.so" "fun" 'S "abc") # char *fun(char*);
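A C function of this shape might look as follows. The name upcase_in_place is hypothetical; the sketch only illustrates a function that modifies its string argument in place and returns the same pointer, so that the S specification can pick up the modified text before the temporary copy is freed.

```c
#include <ctype.h>

/* Convert the argument string to upper case in place and
   return the same pointer that was passed in. */
char *upcase_in_place(char *s) {
    for (char *p = s; *p != '\0'; ++p)
        *p = (char)toupper((unsigned char)*p);
    return s;
}
```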
Also note that specifying NIL as an argument passes an empty
string ("", which also reads as NIL in PicoLisp) to the C function.
Physically, this is a pointer to a NULL-byte, and is not a NULL-pointer.
Be sure to pass 0 (the number zero) if a NULL-pointer is desired.
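The difference matters to any C function that checks its pointer argument. The following sketch, with the hypothetical name classify_arg, shows the three cases such a function can tell apart.

```c
#include <stddef.h>

/* Distinguish a NULL-pointer, an empty string, and a non-empty string. */
int classify_arg(const char *s) {
    if (s == NULL)
        return 0;      /* what passing the number 0 produces */
    if (*s == '\0')
        return 1;      /* what passing NIL produces: pointer to a zero byte */
    return 2;          /* an ordinary string such as "abc" */
}
```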
Floating point arguments are specified as cons pairs, where the value is in
the CAR, and the CDR holds the fixpoint scale. If the scale is positive, the
number is passed as a double, otherwise as a float.
(native "lib.so" "fun" NIL               # void fun(double, float);
   (12.3 . 1.0) (4.56 . -1.0) )
Composite arguments are specified as nested list structures.
native allocates memory for each array or structure (with
malloc(3)), passes the pointer to the C function, and releases the
memory (with free(3)) when done.
This implies that such an argument can be both an input and an output value to a C function (pass by reference).
The CAR of the argument specification can be NIL (then it is an
input-only argument). Otherwise, it should be a variable which receives the
returned structure data.
The CADR of the argument specification must be a cons pair with the total size of the structure in its CAR. The CDR is ignored for input-only arguments; otherwise it should contain a result specification for the output value to be stored in the variable.
For example, a minimal case is a function that takes an integer reference, and stores the number '123' in that location:
void fun(int *i) {
   *i = 123;
}
We call native with a variable X in the CAR of the
argument specification, a size of 4 (i.e. sizeof(int)), and
I for the result specification. The stored value is Othew) # 'show' all internal symbols
inc> 67292896
*Dbg ((859 . "lib/db.l"))
leaf ((Tree) (let (Node (cdr (root Tree)) X) (while (val Node) (setq X (cadr @) Node (car @))) (cddr X)))
*Dbg ((173 . "lib/btree.l"))
nil 67284680
T (((@X) (@ not (-> @X))))
. # Stop
-> T
: (more '+Link) # Display a class
(+relation)
(dm mis> (Val Obj)
(and
Val
(nor (isa (: type) Val) (canQuery Val))
"Type error" ) )
(dm T (Var Lst)
(unless (=: type (car Lst)) (quit "No Link" Var))
(super Var (cdr Lst)) )
-> NIL
(msg 'any ['any ..]) -> any
Prints any with print, followed by all any
arguments (printed with prin) and a
newline, to standard error. The first any argument is returned.
: (msg (1 a 2 b 3 c) " is a mixed " "list")
(1 a 2 b 3 c) is a mixed list
-> (1 a 2 b 3 c)