Embedding Python: Using Python in a multi-threaded program
Sunday 30th November 2014 8:54 AM

If your C/C++ program is multi-threaded and you want to use the embedded Python interpreter in multiple threads, it can be done, albeit a bit clunkily. While you can create multiple interpreters, only one of them can be running at any given time, and switching between them requires saving the state of the old interpreter, then restoring the state of the new one before letting it resume.
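
The low-level primitive behind this kind of switching is PyThreadState_Swap(). Here's a minimal sketch (pNewState is assumed to be a thread state returned by Py_NewInterpreter(), and the calling thread is assumed to already hold the GIL; the rest of this post uses PyEval_AcquireThread()/PyEval_ReleaseThread(), which combine the swap with the GIL handling):

// nb: the calling thread must hold the GIL
PyThreadState* pOldState = PyThreadState_Swap( pNewState ) ; // make pNewState's interpreter current
// ... run some Python code in that interpreter ...
PyThreadState_Swap( pOldState ) ; // switch the previous interpreter back in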

We also have to deal with Python's notorious GIL, or Global Interpreter Lock, which controls access to much of Python's internal code and ensures that only one thread is using it at any given time.
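
As an aside, if you only ever need the one interpreter that Py_Initialize() creates (i.e. you're not using the sub-interpreters described below), the simpler PyGILState API is usually all you need to call into Python from an arbitrary thread. A minimal sketch, assuming Python has already been initialized with thread support (nb: the PyGILState functions are not supported in combination with sub-interpreters):

void runInMainInterpreter( const char* pySource )
{
    PyGILState_STATE gilState = PyGILState_Ensure() ; // nb: acquires the GIL
    PyRun_SimpleString( pySource ) ;
    PyGILState_Release( gilState ) ; // nb: releases the GIL
}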

Initialization

Initializing and shutting down Python is a little different if you want to use it in a multi-threaded program.

// initialize Python
Py_Initialize() ;
PyEval_InitThreads() ; // nb: creates and locks the GIL
// NOTE: We save the current thread state, and restore it when we unload,
// so that we can clean up properly.
PyThreadState* pMainThreadState = PyEval_SaveThread() ; // nb: this also releases the GIL

We first call PyEval_InitThreads() to create the GIL, which will be needed to control thread access.

Then, since we will be swapping interpreters in and out during the course of our program, we need to save the state of the initial thread Python created for us. This is so that we can restore this state prior to shutting Python down.

// clean up
PyEval_RestoreThread( pMainThreadState ) ; // nb: this also locks the GIL
Py_Finalize() ;

Managing Python interpreters

Creating a new Python interpreter can be done like this.

// create a new interpreter 
PyEval_AcquireLock() ; // nb: get the GIL
PyThreadState* pThreadState = Py_NewInterpreter() ;
assert( pThreadState != NULL ) ;

To use an interpreter, we need to swap its state in.

PyEval_AcquireThread( pThreadState ) ;

And when we're done using it, swap it out.

PyEval_ReleaseThread( pThreadState ) ;

Once we're completely done with an interpreter, we need to release it.

// release the interpreter 
PyEval_AcquireThread( pThreadState ) ; // nb: this also locks the GIL
Py_EndInterpreter( pThreadState ) ;
PyEval_ReleaseLock() ; // nb: release the GIL
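
Since every use of an interpreter has to be bracketed by an acquire/release pair, it can be convenient to wrap the pair in a small RAII helper, so the release can't be forgotten on an early return. A minimal sketch (the class name is just for illustration):

// swaps an interpreter in on construction, and out again on destruction
class InterpreterLock
{
public:
    explicit InterpreterLock( PyThreadState* pThreadState )
        : mThreadState( pThreadState )
    {
        PyEval_AcquireThread( mThreadState ) ; // nb: this also locks the GIL
    }
    ~InterpreterLock()
    {
        PyEval_ReleaseThread( mThreadState ) ; // nb: this also releases the GIL
    }
private:
    PyThreadState* mThreadState ;
} ;

// usage:
//   {
//       InterpreterLock lock( pThreadState ) ;
//       PyRun_SimpleString( "print 'hello'" ) ;
//   } // nb: the interpreter is swapped out here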

A full example

Here's a full example that creates a bunch of threads, each one with its own Python interpreter.

The boilerplate code that initializes Python and creates the worker threads looks like this.

#include <Python.h>
#include <windows.h>
#include <vector>
#include <sstream>
#include <cassert>
#include <cstdlib>
using namespace std ;

DWORD WINAPI workerThread( LPVOID threadParam ) ;

int
main( int argc , char** argv )
{
    // initialize Python
    Py_Initialize() ;
    PyEval_InitThreads() ; // nb: creates and locks the GIL
    // NOTE: We save the current thread state, and restore it when we unload,
    // so that we can clean up properly.
    PyThreadState* pMainThreadState = PyEval_SaveThread() ; // nb: this also releases the GIL

    // create some worker threads
    vector<HANDLE> workerThreads ;
    for ( int i=0 ; i < 5 ; ++i )
    {
        HANDLE h = CreateThread( NULL , 0 , workerThread , (LPVOID)i , 0 , NULL ) ;
        assert( h != NULL ) ;
        workerThreads.push_back( h ) ;
    }

    // wait for the worker threads to finish 
    DWORD rc = WaitForMultipleObjects( workerThreads.size() , &workerThreads[0] , TRUE , INFINITE ) ;
    assert( rc == WAIT_OBJECT_0 ) ;

    // clean up
    PyEval_RestoreThread( pMainThreadState ) ; // nb: this also locks the GIL
    Py_Finalize() ;

    return 0 ;
}

The interesting stuff is in the worker threads.

DWORD WINAPI
workerThread( LPVOID threadParam )
{
    // initialize 
    int threadNo = (int)threadParam ;

    // create a new interpreter 
    PyEval_AcquireLock() ; // nb: get the GIL
    PyThreadState* pThreadState = Py_NewInterpreter() ;
    assert( pThreadState != NULL ) ;
    PyEval_ReleaseThread( pThreadState ) ; // nb: this also releases the GIL

    // do some Python stuff
    for ( int i=0 ; i < 5 ; ++i )
    {
        // switch in our interpreter
        PyEval_AcquireThread( pThreadState ) ;

        // execute some Python
        stringstream buf ;
        buf << "print \"Thread " << threadNo << ": pass " << 1+i << "\"" ;
        int rc = PyRun_SimpleString( buf.str().c_str() ) ;
        assert( rc == 0 ) ;

        // switch out our interpreter
        PyEval_ReleaseThread( pThreadState ) ;

        // sleep for a short time 
        Sleep( rand() % 100 ) ;
    }

    // release the interpreter 
    PyEval_AcquireThread( pThreadState ) ; // nb: this also locks the GIL
    Py_EndInterpreter( pThreadState ) ;
    PyEval_ReleaseLock() ; // nb: release the GIL

    return 0 ;
}

Once the worker thread has created its interpreter, it runs a short loop, swapping the interpreter in, running a bit of Python code (just a simple print statement), then swapping the interpreter out (so that another interpreter can run).

Note that when you run this program, you will probably get garbled output. This is because we have multiple threads, all trying to write to the console at the same time.
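
If you want clean output, one option is to serialize the threads' console access with a lock. Here's a minimal sketch using a Win32 CRITICAL_SECTION (gConsoleLock is an illustrative name, and must be initialized in main() before the worker threads are created). Note that the lock is taken before the interpreter is swapped in, and released after it has been swapped out, so a thread never waits for the lock while holding the GIL:

// declared at file scope; initialized once in main() with InitializeCriticalSection( &gConsoleLock )
static CRITICAL_SECTION gConsoleLock ;

// in the worker thread's loop, replacing the swap-in / run / swap-out steps above:
EnterCriticalSection( &gConsoleLock ) ; // take the console lock first...
PyEval_AcquireThread( pThreadState ) ; // ...then swap our interpreter in
int rc = PyRun_SimpleString( buf.str().c_str() ) ;
PyEval_ReleaseThread( pThreadState ) ; // swap our interpreter out...
LeaveCriticalSection( &gConsoleLock ) ; // ...then give up the console
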
Download the source code here.


3 Responses to this post

Excellent! Thank you.

Thanks for your write-up.
How would you handle the case where the code you run in this line is a long-running process?

line 22 int rc = PyRun_SimpleString( buf.str().c_str() ) ;

What I need is for each thread to go off and run an extensive, time-consuming operation. I don't want each thread to have to wait for the other threads to complete.

You would need to call it in a new (C++) thread.

You might be able to get away with sharing a Python interpreter object between threads, as long as you don't access it simultaneously (i.e. create it in one thread, use it in another), but in general, it's probably not a good idea to do that kind of thing. Depending on your application, you might want to set up a pool of worker threads, each of which creates its own interpreter object, and then send work requests to them (a rough sketch of this design is below).
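
Something along these lines (a rough sketch only, using C++11 threads rather than the Win32 calls used in the post; the class and member names are illustrative, and it assumes Python has already been initialized as shown at the top of the post, with the main thread state released):

#include <Python.h>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

class PythonWorkerPool
{
public:
    explicit PythonWorkerPool( size_t nWorkers )
    {
        for ( size_t i=0 ; i < nWorkers ; ++i )
            mWorkers.emplace_back( &PythonWorkerPool::workerLoop , this ) ;
    }
    ~PythonWorkerPool()
    {
        { std::lock_guard<std::mutex> lock( mMutex ) ; mShutdown = true ; }
        mCondVar.notify_all() ;
        for ( auto& t : mWorkers )
            t.join() ;
    }
    void submit( const std::string& pySource )
    {
        { std::lock_guard<std::mutex> lock( mMutex ) ; mQueue.push( pySource ) ; }
        mCondVar.notify_one() ;
    }
private:
    void workerLoop()
    {
        // create this worker's own interpreter (same pattern as in the post; error handling omitted)
        PyEval_AcquireLock() ; // nb: get the GIL
        PyThreadState* pThreadState = Py_NewInterpreter() ;
        PyEval_ReleaseThread( pThreadState ) ; // nb: this also releases the GIL
        for ( ; ; )
        {
            // wait for a work request (a string of Python source)
            std::string job ;
            {
                std::unique_lock<std::mutex> lock( mMutex ) ;
                mCondVar.wait( lock , [this]{ return mShutdown || !mQueue.empty() ; } ) ;
                if ( mQueue.empty() )
                    break ; // nb: we're shutting down
                job = mQueue.front() ;
                mQueue.pop() ;
            }
            // run the request in this worker's interpreter
            PyEval_AcquireThread( pThreadState ) ;
            PyRun_SimpleString( job.c_str() ) ;
            PyEval_ReleaseThread( pThreadState ) ;
        }
        // clean up this worker's interpreter
        PyEval_AcquireThread( pThreadState ) ; // nb: this also locks the GIL
        Py_EndInterpreter( pThreadState ) ;
        PyEval_ReleaseLock() ; // nb: release the GIL
    }
    std::vector<std::thread> mWorkers ;
    std::queue<std::string> mQueue ;
    std::mutex mMutex ;
    std::condition_variable mCondVar ;
    bool mShutdown = false ;
} ;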

You could also structure your program to call a main Python function that kicks off its own (Python) threads to do the work.
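
For example, something like this (a rough sketch only; the script is illustrative, and uses Python 2 syntax to match the post). The main thread runs a single script in the main interpreter, and that script starts Python threads and waits for them to finish:

// initialize Python (nb: the main thread holds the GIL after this)
Py_Initialize() ;
PyEval_InitThreads() ;

// a script that starts Python threads to do the (long-running) work
const char* pScript =
    "import threading\n"
    "\n"
    "def do_work( n ):\n"
    "    # ... long-running operation goes here ...\n"
    "    print 'worker %d done' % n\n"
    "\n"
    "threads = [ threading.Thread( target=do_work , args=(n,) ) for n in range(5) ]\n"
    "for t in threads: t.start()\n"
    "for t in threads: t.join()\n" ;

// run the script (nb: join() releases the GIL while waiting, so the Python threads can run)
int rc = PyRun_SimpleString( pScript ) ;
assert( rc == 0 ) ;

// clean up
Py_Finalize() ;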
