abu@software-lab.de

Native C Calls

(c) Software Lab. Alexander Burger

This document describes how to call C functions in shared object files (libraries) from PicoLisp, using the built-in native function -- possibly with the help of the struct and lisp functions. It applies only to the 64-bit version of PicoLisp.


Overview

native calls a C function in a shared library. It tries to

  1. find a library by name
  2. find a function by name in the library
  3. convert the function's argument(s) from Lisp to C structures
  4. call the function's C code
  5. convert the function's return value(s) from C to Lisp structures

The direct return value of native is the Lisp representation of the C function's return value. Further values, returned by reference from the C function, are available in Lisp variables (symbol values).

struct is a helper function, which can be used to manipulate C data structures in memory. It may take a scalar (a numeric representation of a C value) to convert it to a Lisp item, or (more typically) a pointer to a memory area to build and extract data structures. lisp allows you to install callback functions, callable from C code, written in Lisp.

In combination, these three functions can interface PicoLisp to almost any C function.

The above steps are fully dynamic; native doesn't have (and doesn't require) a priory knowledge about the library, the function or the involved data. No need to write any glue code, interfaces or include files. All functions can even be called interactively from the REPL.


Syntax

The arguments to native are

  1. a library
  2. a function
  3. a return value specification
  4. optional arguments

The simplest form is a call to a function without return value and without arguments. If we assume a library "lib.so", containing a function with the prototype


void fun(void);

then we can call it as


(native "lib.so" "fun")


Libraries

The first argument to native specifies the library. It is either the name of a library (a symbol), or the handle of a previously found library (a number).

As a special case, a transient symbol "@" can be passed for the library name. It then refers to the current main program (instead of an external library), and can be used for standard functions like "malloc" or "printf".

native uses dlopen(3) internally to find and open the library, and to obtain the handle. If the name contains a slash ('/'), then it is interpreted as a (relative or absolute) pathname. Otherwise, the dynamic linker searches for the library according to the system's environment and directories. See the man page of dlopen(3) for further details.

If called with a symbolic argument, native automatically caches the handle of the found library in the value of that symbol. The most natural way is to pass the library name as a transient symbol ("lib.so" above): The initial value of a transient symbol is that symbol itself, so that native receives the library name upon the first call. After successfully finding and opening the library, native stores the handle of that library in the value of the passed symbol ("lib.so"). As native evaluates its arguments in the normal way, subsequent calls within the same transient scope will receive the numeric value (the handle), and don't need to open and search the library again.


Functions

The same rules applies to the second argument, the function. When called with a symbol, native stores the function pointer in its value, so that subsequent calls evaluate to that pointer, and native can directly jump to the function.

native uses dlsym(3) internally to obtain the function pointer. See the man page of dlsym(3) for further details.

In most cases a program will call more than one function from a given library. If we keep the code within the same transient scope (i.e. in the same source file, and not separated by the ==== function), each library will be opened -- and each function searched -- only once.


(native "lib.so" "fun1")
(native "lib.so" "fun2")
(native "lib.so" "fun3")

After "fun1" was called, "lib.so" will be open, and won't be re-opened for "fun2" and "fun3". Consider the definition of helper functions:


(de fun1 ()
   (native "lib.so" "fun1") )

(de fun2 ()
   (native "lib.so" "fun2") )

(de fun3 ()
   (native "lib.so" "fun3") )

After any one of fun1, fun2 or fun3 was called, the symbol "lib.so" will hold the library handle. And each function function "fun1", "fun2" and "fun3" will be searched only when called the first time.

Warning: It should be avoided to put more than one library into a single transient scope if there is a chance that two different functions with the same name will be called in two different libraries. Because of the function pointer caching, the second call would otherwise (wrongly) go to the first function.


Return Value

The (optional) third argument to native specifies the return value. A C function can return many types of values, like integer or floating point numbers, string pointers, or pointers to structures which in turn consist of those types, and even other structures or pointers to structures. native tries to cover most of them.

As described in the result specification, the third argument should consist of a pattern which tells native how to extract the proper value.

Primitive Types

In the simplest case, the result specification is NIL like in the examples so far. This means that either the C function returns void, or that we are not interested in the value. The return value of native will be NIL in that case.

If the result specification is one of the symbols B, I or N, an integer number is returned, by interpreting the result as a char (8 bit unsigned byte), int (32 bit signed integer), or long number (64 bit signed integer), respectively. Other (signed or unsigned numbers, and of different sizes) can be produced from these types with logical and arithmetic operations if necessary.

If the result specification is the symbol C, the result is interpreted as a 16 bit number, and a single-char transient symbol (string) is returned.

A specification of S tells native to interpret the result as a pointer to a C string (null terminated), and to return a transient symbol (string).

If the result specification is a number, it will be used as a scale to convert a returned double (if the number is positive) or float (if the number is negative) to a scaled fixpoint number.

Examples for function calls, with their corresponding C prototypes:


(native "lib.so" "fun" 'I)             # int fun(void);
(native "lib.so" "fun" 'N)             # long fun(void);
(native "lib.so" "fun" 'N)             # void *fun(void);
(native "lib.so" "fun" 'S)             # char *fun(void);
(native "lib.so" "fun" 1.0)            # double fun(void);

Arrays and Structures

If the result specification is a list, it means that the C function returned a pointer to an array, or an arbitrary memory structure. The specification list should then consist of either the above primitive specifications (symbols or numbers), or of cons pairs of a primitive specification and a repeat count, to denote arrays of the given type.

Examples for function calls, with their corresponding pseudo C prototypes:


(native "lib.so" "fun" '(I . 8))       # int *fun(void);  // 8 integers
(native "lib.so" "fun" '(B . 16))      # unsigned char *fun(void);  // 16 bytes

(native "lib.so" "fun" '(I I))         # struct {int i; int j;} *fun(void);
(native "lib.so" "fun" '(I . 4))       # struct {int i[4];} *fun(void);

(native "lib.so" "fun" '(I (B . 4)))   # struct {
                                       #    int i;
                                       #    unsigned char c[4];
                                       # } *fun(void);

(native "lib.so" "fun"                 # struct {
   '(((B . 4) I) (S . 12) (N . 8)) )   #    struct {unsigned char c[4]; int i;}
                                       #    char *names[12];
                                       #    long num[8];
                                       # } *fun(void);

If a returned structure has an element which is a pointer to some other structure (i.e. not an embedded structure like in the last example above), this pointer must be first obtained with a N pattern, which can then be passed to struct for further extraction.


Arguments

The (optional) fourth and following arguments to native specify the arguments to the C function.

Primitive Types

Integer arguments (up to 64 bits, signed or unsigned char, short, int or long) can be passed as they are: As numbers.


(native "lib.so" "fun" NIL 123)        # void fun(int);
(native "lib.so" "fun" NIL 1 2 3)      # void fun(int, long, short);

String arguments can be specified as symbols. native allocates memory for each string (with strdup(3)), passes the pointer to the C function, and releases the memory (with free(3)) when done.


(native "lib.so" "fun" NIL "abc")      # void fun(char*);
(native "lib.so" "fun" NIL 3 "def")    # void fun(int, char*);

Note that the allocated string memory is released after the return value is extracted. This allows a C function to return the argument string pointer, perhaps after modifying the data in-place, and receive the new string as the return value (with the S specification).


(native "lib.so" "fun" 'S "abc")       # char *fun(char*);

Also note that specifying NIL as an argument passes an empty string ("", which also reads as NIL in PicoLisp) to the C function. Physically, this is a pointer to a NULL-byte, and is not a NULL-pointer. Be sure to pass 0 (the number zero) if a NULL-pointer is desired.

Floating point arguments are specified as cons pairs, where the value is in the CAR, and the CDR holds the fixpoint scale. If the scale is positive, the number is passed as a double, otherwise as a float.


(native "lib.so" "fun" NIL             # void fun(double, float);
   (12.3 . 1.0) (4.56 . -1.0) )

Arrays and Structures

Composite arguments are specified as nested list structures. native allocates memory for each array or structure (with malloc(3)), passes the pointer to the C function, and releases the memory (with free(3)) when done.

This implies that such an argument can be both an input and an output value to a C function (pass by reference).

The CAR of the argument specification can be NIL (then it is an input-only argument). Otherwise, it should be a variable which receives the returned structure data.

The CADR of the argument specification must be a cons pair with the total size of the structure in its CAR. The CDR is ignored for input-only arguments, and should contain a result specification for the output value to be stored in the variable.

For example, a minimal case is a function that takes an integer reference, and stores the number '123' in that location:


void fun(int *i) {
   *i = 123;
}

We call native with a variable X in the CAR of the argument specification, a size of 4 (i.e. sizeof(int)), and I for the result specification. The stored value is Othew) # 'show' all internal symbols inc> 67292896 *Dbg ((859 . "lib/db.l")) leaf ((Tree) (let (Node (cdr (root Tree)) X) (while (val Node) (setq X (cadr @) Node (car @))) (cddr X))) *Dbg ((173 . "lib/btree.l")) nil 67284680 T (((@X) (@ not (-> @X)))) . # Stop -> T : (more '+Link) # Display a class (+relation) (dm mis> (Val Obj) (and Val (nor (isa (: type) Val) (canQuery Val)) "Type error" ) ) (dm T (Var Lst) (unless (=: type (car Lst)) (quit "No Link" Var)) (super Var (cdr Lst)) ) -> NIL

(msg 'any ['any ..]) -> any
Prints any with print, followed by all any arguments (printed with prin) and a newline, to standard error. The first any argument is returned.

: (msg (1 a 2 b 3 c) " is a mixed " "list")
(1 a 2 b 3 c) is a mixed list
-> (1 a 2 b 3 c)
./usr/share/doc/picolisp/doc/native.html0000644000000000000000000006604511772001370017145 0ustar rootroot Native C Calls abu@software-lab.de

Native C Calls

(c) Software Lab. Alexander Burger

This document describes how to call C functions in shared object files (libraries) from PicoLisp, using the built-in native function -- possibly with the help of the struct and lisp functions. It applies only to the 64-bit version of PicoLisp.


Overview

native calls a C function in a shared library. It tries to

  1. find a library by name
  2. find a function by name in the library
  3. convert the function's argument(s) from Lisp to C structures
  4. call the function's C code
  5. convert the function's return value(s) from C to Lisp structures

The direct return value of native is the Lisp representation of the C function's return value. Further values, returned by reference from the C function, are available in Lisp variables (symbol values).

struct is a helper function, which can be used to manipulate C data structures in memory. It may take a scalar (a numeric representation of a C value) to convert it to a Lisp item, or (more typically) a pointer to a memory area to build and extract data structures. lisp allows you to install callback functions, callable from C code, written in Lisp.

In combination, these three functions can interface PicoLisp to almost any C function.

The above steps are fully dynamic; native doesn't have (and doesn't require) a priory knowledge about the library, the function or the involved data. No need to write any glue code, interfaces or include files. All functions can even be called interactively from the REPL.


Syntax

The arguments to native are

  1. a library
  2. a function
  3. a return value specification
  4. optional arguments

The simplest form is a call to a function without return value and without arguments. If we assume a library "lib.so", containing a function with the prototype


void fun(void);

then we can call it as


(native "lib.so" "fun")


Libraries

The first argument to native specifies the library. It is either the name of a library (a symbol), or the handle of a previously found library (a number).

As a special case, a transient symbol "@" can be passed for the library name. It then refers to the current main program (instead of an external library), and can be used for standard functions like "malloc" or "printf".

native uses dlopen(3) internally to find and open the library, and to obtain the handle. If the name contains a slash ('/'), then it is interpreted as a (relative or absolute) pathname. Otherwise, the dynamic linker searches for the library according to the system's environment and directories. See the man page of dlopen(3) for further details.

If called with a symbolic argument, native automatically caches the handle of the found library in the value of that symbol. The most natural way is to pass the library name as a transient symbol ("lib.so" above): The initial value of a transient symbol is that symbol itself, so that native receives the library name upon the first call. After successfully finding and opening the library, native stores the handle of that library in the value of the passed symbol ("lib.so"). As native evaluates its arguments in the normal way, subsequent calls within the same transient scope will receive the numeric value (the handle), and don't need to open and search the library again.


Functions

The same rules applies to the second argument, the function. When called with a symbol, native stores the function pointer in its value, so that subsequent calls evaluate to that pointer, and native can directly jump to the function.

native uses dlsym(3) internally to obtain the function pointer. See the man page of dlsym(3) for further details.

In most cases a program will call more than one function from a given library. If we keep the code within the same transient scope (i.e. in the same source file, and not separated by the ==== function), each library will be opened -- and each function searched -- only once.


(native "lib.so" "fun1")
(native "lib.so" "fun2")
(native "lib.so" "fun3")

After "fun1" was called, "lib.so" will be open, and won't be re-opened for "fun2" and "fun3". Consider the definition of helper functions:


(de fun1 ()
   (native "lib.so" "fun1") )

(de fun2 ()
   (native "lib.so" "fun2") )

(de fun3 ()
   (native "lib.so" "fun3") )

After any one of fun1, fun2 or fun3 was called, the symbol "lib.so" will hold the library handle. And each function function "fun1", "fun2" and "fun3" will be searched only when called the first time.

Warning: It should be avoided to put more than one library into a single transient scope if there is a chance that two different functions with the same name will be called in two different libraries. Because of the function pointer caching, the second call would otherwise (wrongly) go to the first function.


Return Value

The (optional) third argument to native specifies the return value. A C function can return many types of values, like integer or floating point numbers, string pointers, or pointers to structures which in turn consist of those types, and even other structures or pointers to structures. native tries to cover most of them.

As described in the result specification, the third argument should consist of a pattern which tells native how to extract the proper value.

Primitive Types

In the simplest case, the result specification is NIL like in the examples so far. This means that either the C function returns void, or that we are not interested in the value. The return value of native will be NIL in that case.

If the result specification is one of the symbols B, I or N, an integer number is returned, by interpreting the result as a char (8 bit unsigned byte), int (32 bit signed integer), or long number (64 bit signed integer), respectively. Other (signed or unsigned numbers, and of different sizes) can be produced from these types with logical and arithmetic operations if necessary.

If the result specification is the symbol C, the result is interpreted as a 16 bit number, and a single-char transient symbol (string) is returned.

A specification of S tells native to interpret the result as a pointer to a C string (null terminated), and to return a transient symbol (string).

If the result specification is a number, it will be used as a scale to convert a returned double (if the number is positive) or float (if the number is negative) to a scaled fixpoint number.

Examples for function calls, with their corresponding C prototypes:


(native "lib.so" "fun" 'I)             # int fun(void);
(native "lib.so" "fun" 'N)             # long fun(void);
(native "lib.so" "fun" 'N)             # void *fun(void);
(native "lib.so" "fun" 'S)             # char *fun(void);
(native "lib.so" "fun" 1.0)            # double fun(void);

Arrays and Structures

If the result specification is a list, it means that the C function returned a pointer to an array, or an arbitrary memory structure. The specification list should then consist of either the above primitive specifications (symbols or numbers), or of cons pairs of a primitive specification and a repeat count, to denote arrays of the given type.

Examples for function calls, with their corresponding pseudo C prototypes:


(native "lib.so" "fun" '(I . 8))       # int *fun(void);  // 8 integers
(native "lib.so" "fun" '(B . 16))      # unsigned char *fun(void);  // 16 bytes

(native "lib.so" "fun" '(I I))         # struct {int i; int j;} *fun(void);
(native "lib.so" "fun" '(I . 4))       # struct {int i[4];} *fun(void);

(native "lib.so" "fun" '(I (B . 4)))   # struct {
                                       #    int i;
                                       #    unsigned char c[4];
                                       # } *fun(void);

(native "lib.so" "fun"                 # struct {
   '(((B . 4) I) (S . 12) (N . 8)) )   #    struct {unsigned char c[4]; int i;}
                                       #    char *names[12];
                                       #    long num[8];
                                       # } *fun(void);

If a returned structure has an element which is a pointer to some other structure (i.e. not an embedded structure like in the last example above), this pointer must be first obtained with a N pattern, which can then be passed to struct for further extraction.


Arguments

The (optional) fourth and following arguments to native specify the arguments to the C function.

Primitive Types

Integer arguments (up to 64 bits, signed or unsigned ch