How to Implement a C Binding in Python with ctypes

c
binding
llama.cpp
llm
Author

Christian Wittmann

Published

February 16, 2024

How can compute intensive large language models (LLMs) run on consumer-grade laptops? C bindings are part of this magic, they create wrappers around the C code to make is accessible in higher-level languages like Python. While it might sound complicated, the concept is surprisingly accessible with the right approach. Let’s explore a simple example to how to utilize the ctypes library to implement a C bindings in Python.

In my previous blog post, I demonstrated how you can use llama-cpp-python to run a llama2-model using llama.cpp. To understand how the interface between these 2 project works, I created a simple C library to unveil some of the underlying “magic”.

The Python ctypes-library is the bridge between the to worlds, and in this notebook, we first create and compile a simple function written in C that accepts an int32_t value and returns its square. Subsequently, we use this function from python to learn how to implement the C binding via the ctypes-library.

While I wrote this notebook on macOS, the principles and techniques are universally applicable, with slight adjustments for Linux or Windows environments.

A final note before we get started: You can find the notebook version of this blog post on my GitHub.

Dalle: A Python snake winding around a letter C
Dalle: A Python snake winding around a letter C

Step 1: Create the C Code Library

Create a file named example.c with the following content:

#include <stdint.h>

int32_t square(int32_t number) {
    return number * number;
}

Next, compile this C code into a shared library via the terminal

# Linux / so -> shared object
gcc -shared -fpic -o libexample.so example.c
# macOS / dylib -> dynamic library
gcc -shared -fpic -o libexample.dylib example.c
# Windows / dll -> dynamic-link library
gcc -shared -o example.dll example.c

Before we run the command, let’s break it down:

  • gcc stands for “GNU Compiler Collection”, and it can compile C by default.
  • -shared creates a “shared library”. In Python analogy, this is like a module which can be imported.
  • -fpic creates “Position-Independent Code” (PIC), removing any absolute memory references and making them relative.
  • For C developers, it is good practice to prefix the name of shared libraries (dynamic libraries) with “lib”, hence example.c becomes libexample.xxx

Since I am running on a Mac, I use the following command to compile my library:

!gcc -shared -fpic -o libexample.dylib example.c

As a result, I get a new file called libexample.dylib.

Step 2: Python Code

To call this function from Python, we need to do couple of steps.

First, we need to load the shared libaray via the ctypes.CDLL-method to access our square function:

import ctypes

libexample = ctypes.CDLL('./libexample.dylib')  # Use appropriate file name on your system

Next, let’s create an object which represents a 32-bit integer which corresponds to the C type int32_t. This is the type we used in our example C code.

c_int32_type = ctypes.c_int32

Preparing for the call to C, we need to specify the arguments and return type of the C function so that the variables can be converted correctly:

  • The arguments argtypes are passed in a list, because there could be more arguments (depending on the function).
  • The result type restype is just a single value, because a function returns exactly one result.
libexample.square.argtypes = [c_int32_type]
libexample.square.restype = c_int32_type

We want to calculate the square of a number_to_be_squared. This Python variable first needs to be converted into a proper int32 representation:

number_to_be_squared = 7
input_value = c_int32_type(number_to_be_squared)

Finally, we can call the C function:

result = libexample.square(input_value)
print(f"The square of {input_value.value} is {result}.")
The square of 7 is 49.

Note that the input type is a C type, therefore we need to use .value to access the Python equivalent, and the result is automatically converted to a Python type:

print(f"The input value type is {type(input_value)}")
print(f"The result value type is {type(result)}")
The input value type is <class 'ctypes.c_int'>
The result value type is <class 'int'>

Wrapping up

This tiny example demonstrated how a C binding works and which steps are needed to call C code from Python. Although aligning the types between Python and C requires some effort, the payoff is significantly enhanced performance for compute-intensive tasks like neural net inference. While introducing additional complexity, the C binding llama-cpp-python makes it possible to run a llama2-model via llama.cpp directly from Python, even on a consumer laptop.