!gcc -shared -fpic -o libexample.dylib example.c
How can compute intensive large language models (LLMs) run on consumer-grade laptops? C bindings are part of this magic, they create wrappers around the C code to make is accessible in higher-level languages like Python. While it might sound complicated, the concept is surprisingly accessible with the right approach. Let’s explore a simple example to how to utilize the ctypes
library to implement a C bindings in Python.
In my previous blog post, I demonstrated how you can use llama-cpp-python
to run a llama2-model using llama.cpp
. To understand how the interface between these 2 project works, I created a simple C library to unveil some of the underlying “magic”.
The Python ctypes
-library is the bridge between the to worlds, and in this notebook, we first create and compile a simple function written in C that accepts an int32_t
value and returns its square. Subsequently, we use this function from python to learn how to implement the C binding via the ctypes
-library.
While I wrote this notebook on macOS, the principles and techniques are universally applicable, with slight adjustments for Linux or Windows environments.
A final note before we get started: You can find the notebook version of this blog post on my GitHub.
Step 1: Create the C Code Library
Create a file named example.c
with the following content:
#include <stdint.h>
int32_t square(int32_t number) {
return number * number;
}
Next, compile this C code into a shared library via the terminal
# Linux / so -> shared object
gcc -shared -fpic -o libexample.so example.c
# macOS / dylib -> dynamic library
gcc -shared -fpic -o libexample.dylib example.c
# Windows / dll -> dynamic-link library
gcc -shared -o example.dll example.c
Before we run the command, let’s break it down:
gcc
stands for “GNU Compiler Collection”, and it can compile C by default.-shared
creates a “shared library”. In Python analogy, this is like a module which can be imported.-fpic
creates “Position-Independent Code” (PIC), removing any absolute memory references and making them relative.- For C developers, it is good practice to prefix the name of shared libraries (dynamic libraries) with “lib”, hence
example.c
becomeslibexample.xxx
Since I am running on a Mac, I use the following command to compile my library:
As a result, I get a new file called libexample.dylib
.
Step 2: Python Code
To call this function from Python, we need to do couple of steps.
First, we need to load the shared libaray via the ctypes.CDLL
-method to access our square function:
import ctypes
= ctypes.CDLL('./libexample.dylib') # Use appropriate file name on your system libexample
Next, let’s create an object which represents a 32-bit integer which corresponds to the C type int32_t
. This is the type we used in our example C code.
= ctypes.c_int32 c_int32_type
Preparing for the call to C, we need to specify the arguments and return type of the C function so that the variables can be converted correctly:
- The arguments
argtypes
are passed in a list, because there could be more arguments (depending on the function). - The result type
restype
is just a single value, because a function returns exactly one result.
= [c_int32_type]
libexample.square.argtypes = c_int32_type libexample.square.restype
We want to calculate the square of a number_to_be_squared
. This Python variable first needs to be converted into a proper int32 representation:
= 7
number_to_be_squared = c_int32_type(number_to_be_squared) input_value
Finally, we can call the C function:
= libexample.square(input_value)
result print(f"The square of {input_value.value} is {result}.")
The square of 7 is 49.
Note that the input type is a C type, therefore we need to use .value
to access the Python equivalent, and the result is automatically converted to a Python type:
print(f"The input value type is {type(input_value)}")
print(f"The result value type is {type(result)}")
The input value type is <class 'ctypes.c_int'>
The result value type is <class 'int'>
Wrapping up
This tiny example demonstrated how a C binding works and which steps are needed to call C code from Python. Although aligning the types between Python and C requires some effort, the payoff is significantly enhanced performance for compute-intensive tasks like neural net inference. While introducing additional complexity, the C binding llama-cpp-python
makes it possible to run a llama2-model via llama.cpp
directly from Python, even on a consumer laptop.