Multithreading

underscore.png

Chapter Overview


>Simple Example
>Change Lists
>Tutorial - Moving torus with multiple threads
>Barriers


Multithreading is often not very easy to handle and even more often it is dangerous and evil to debug. However, multithreading is powerful and very useful in many situations. At the very beginning I promised that multithreading is very easy with OpenSG. So now let's see if I can hold this promise. If I am right you will be finally rewarded for all these begin- and endEditCP blocks you have written (or copied) so far.

Simple Example

Well, let's start right of with something very simple, just to see how easy it can be to use multiple threads (if you know how ;) ).

As always, please make a copy of the framework file. We really need no geometry at all for this little demonstration, but the scene manager must have at least a root node. So create a group node or some simple geometry, whatever you want, and set it as root for the simple scene manager. This is necessary to suppress error messages. Try to execute, just to make sure everything works fine - there is nothing more annoying than searching hours for an error in your threads, where the real error lies somewhere else!

First there is, of course, a new include file, that needs to be added.

    #include <OpenSG/OSGThreadManager.h>

Like with the simple scene manager, too, the thread manager will include most other include files needed in this context.

First we need a function that will be executed by a thread. As we want to create two new threads, beside the main thread, we want two functions - paste them somewhere before the main function.

void printA(void *args){
    while (true)
        std::cout << "A" << std::endl;
}

void printB(void *args){
    while (true)
        std::cout << "B" << std::endl;
}

These functions will simply print A or B respectively until the application is terminated. Now we need to create and start the threads that will run both functions. Add the following code right before the main function!

    Thread* threadOne = dynamic_cast<Thread *>(ThreadManager::the()->getThread("One"));
    Thread* threadTwo = dynamic_cast<Thread *>(ThreadManager::the()->getThread("Two"));
    
    threadOne->runFunction(printA, 1, NULL);
    threadTwo->runFunction(printB, 1, NULL);

If you now run this application, watch the terminal. You will see many many A's and B's mixed together. Notice that you still can navigate in the window (well, you need some geometry to see that effect), because the OpenSG rendering (i.e. the GLUT main loop) runs in the main thread.

You can find the code in file 13multithreading.cpp, like always in the progs folder.

If you like to, replace the letters with a word or a sentence and watch what happens. I chose "Power Mac G5" and "Taschenrechner", the next image shows the terminal output.

thread_change.png

Two threads producing non-stop output

As you can see, the printing of "Taschenrechner" was completed correctly, but "Power Mac G5" suddenly started at " G5" and left out the beginning. Afterwards the output continues normally. What happend? Well, the execution of threads is handled by your operating system, which decides when a thread is halted and another stalled process is continued. If your threads are computing some stuff in the background that might be no problem at all, but if these are printing output to the terminal, things might get ugly like above. Most likely you want the output of the first thread finished, before the second starts.

However, possibly you are not easily able to reproduce this "error" because several hundred or even thousands words are printed before the active thread changes, so the error occurs rarely. You could pipe the result into a file, but if you do so, let the application run a few seconds only!

Things get even worse, if one thread writes data and another is writing or reading data on the same segment - this will crash your application in most cases or at least it will produce undesired results.

We will later be able to solve the output problem...

Change Lists

Well, it is time for some theory now. As I mentioned above, the real problem is the asynchronous read and write of data, which will happen for sure, if two or more threads are working on the scene graph. So how does OpenSG handle that problem? You already have used osg::FieldContainers all the time, well, at least objects that were derived from it - remember, all node cores are derived from field container, for example.

Every time a field container is created, not only one instance is created, but two per default. These multiple instances of one and the same field container are called aspects and every thread is associated with one single aspect. If data is written by a thread, only the corresponding aspect's data is changed at that time. The following figure illustrates this:

aspects.png

Two threads and one field container with two aspects

Here you have one field container of type Transform, which holds two aspects. Two threads are running and each has assigned a single aspect. If thread two now sets a new matrix that field in the appropriate threads aspect is changed - the other thread's field stays untouched.

Of course, this has the potential to cause heavy problems, as we now have inconsistent data, because thread one seems not to be aware of the new matrix thread two has set. That is what ChangeLists are for. Every thread has its own change list. Every time data is written, an entry is added to the corresponding change list. At some point, the threads needs to be synchronized and that is where a change list's content is read and the relevant data of this aspect is copied to the other aspect. The synchronization has to be initiated manually and can't be done automatically, as both threads have to be aware of it and be in state where they can accept changes to the graph.

Well, this will be easier to understand if we consider a little example. So here we go: imagine we want a simple torus (yes, again amazing geometry!) which should rotate around the y axis - so we need a new transformation matrix every frame. This time, computation of the matrix will be done in an additional thread. Please keep in mind that this is for didactic reasons only, as it would not be very smart to create own threads for such simple tasks ;)

Important Notice: Currently the dynamic configuration of multiple aspects is not fully implemented yet! In detail that means, that there is currently no way to change the number of aspects at runtime, there are always two aspects per default, so if you have three threads that need to read/write data independantly, you have to recompile the library. However that is unlikely... anyway if you do need that feature, you should have a look at the mailing list (see OpenSG Mailinglist).

Tutorial - Moving torus with multiple threads

Anyway, let's start right away with our 00framework.cpp file. You need an additional header and some global variables

#include <OpenSG/OSGThreadManager.h>

using namespace std;

SimpleSceneManager *mgr;
NodePtr scene;
//we will store the transformation globally - this
//is not necessary, but comfortable
TransformPtr trans;

Thread* animationThread;
Barrier *syncBarrier;

This should look familiar to you, except for the Barrier variable, which will be used later to synchronize the threads again.

The createScenegraph() function is not that interesting - we have a transform node with a single child, the torus. That's all

NodePtr createScenegraph(){
    // the scene must be created here
    NodePtr n = makeTorus(0.5,4,8,16);
    
    //add a simple Transformation
    trans = Transform::create();
    beginEditCP(trans);
        Matrix m;
        m.setIdentity();
        trans->setMatrix(m);
    endEditCP(trans);
    
    NodePtr transNode = Node::create();
    beginEditCP(transNode);
        transNode->setCore(trans);
        transNode->addChild(n);
    endEditCP(transNode);
    
    return transNode;
}

Next comes the function that will compute and set the new transformation. This function will be run in its own thread.

//this function will run in a thread and will simply
//rotate the torus by setting a new transformation matrix
void rotate(void *args){
    // we won't stop calculating new matrices....
    while(true){
        Real32 time = glutGet(GLUT_ELAPSED_TIME);
        Matrix m;
        m.setIdentity();
        m.setRotate(Quaternion(Vec3f(0,1,0), time/1000));
        
        beginEditCP(trans);
            trans->setMatrix(m);
        endEditCP(trans);
        // nothing unusual until here
        
        //well that's new...
        
        //wait until two threads are cought in the
        //same barrier
        syncBarrier->enter(2);
        
        //just the same again
        syncBarrier->enter(2);
    }
}

The display() function also needs to be replaced.

void display(void)
{
    // we wait here until the animation thread enters
    //the first barrier
    syncBarrier->enter(2);
    
    //now we sync data
    animationThread->getChangeList()->applyAndClear();
    
    // and again
    syncBarrier->enter(2);
        
    // now render...
    mgr->redraw();
}

Finally you need to add the following code between the call of createScenegraph() and glutMainLoop() in the main() function.

    //create the barrier, that will be used to
    //synchronize threads

    //instead of NULL you could provide a name
    syncBarrier = Barrier::get(NULL);

    mgr = new SimpleSceneManager;
    mgr->setWindow(gwin );
    mgr->setRoot  (scene);
    mgr->showAll();

    //create the thread that will run generation of new matrices
    animationThread = dynamic_cast<Thread *>(ThreadManager::the()->getThread("anim"));

    //do it...
    animationThread->runFunction(rotate, 1, NULL);

Note: If you're using versions after 1.2, the ChangeLists are set to read-only by default. To actually record any changes, you have to set them to read-write by adding the following command before osgInit:

#if OSG_MINOR_VERSION > 2
    ChangeList::setReadWriteDefault();
#endif

If you execute this application you will see an animated torus... well, amazing as this one looks exactly like the one from the second tutorial (Tutorial - it's moving!). However, this time it is a lot more impressive, as we have used two threads.

Maybe it is time to talk about what we have actually done here. We have a rough picture about what aspects and change lists are - and we have this working example now. The situation is as follows: The main thread is actually doing the same thing like always, that is setting up everything that is needed and rendering the graph until the application is terminated. The second thread only changes the contents of the transform core of the scenegraph - at least it's own aspect. Let us give the main thread number 1 (and therefore aspect 1, too) and the animation thread number 2 as well as aspect 2. Then, thread 2 is constantly changing the matrix value of the transform field container in aspect 2 while the value in aspect 1 is staying untouched. Thread 1, however, will render the graph and thus is using aspect 1 and therefore is using the old (unchanged) value.

You saw the application working, but now comment the line out that says

    animationThread->getChangeList()->applyAndClear();

found in the display() function.

The application will still run, but this time without any movement of the torus. That is why we just skipped synchronization of the two threads. Without this command thread 2 is doing a fine job, but thread 1 will never become aware of the changes. So the change list of thread 2 has to be applied on thread 1, thus all changes from aspect 2 are copied to aspect 1.

Alright, it seems that animationThread->getChangeList()->applyAndClear(); will do some synchronization between the threads. The next figure shows what happens on applying the change list

sync.png

Applying a change list

Well, I assume that you have not worked with threads before, so if you have already you can jump to the next passage right away. When working with threads some things might get a bit confusing. It is somehow like having two independent programs running, which are written in one and the same source file and can use the same global variables or objects. You have to keep in mind that two or more (nearly) independant applications can access the same data at the same time.

In this specific case, the synchronization command is called within the display() function, which is run by the main thread. That is, if we call animationThread->getChangeList()->applyAndClear();, we will get the change list from the other thread, which will have one entry (the changed matrix field). This change list will be applied to the current thread, the main thread. Remember, the main thread is also the one, that will render - so afterwards the new matrix can be taken into account when rendering. Additionally, applyAndClear() also removes all entries in the change list, so that it is empty again.

There is still one thing we have not talked about: Barriers. You might have noticed that we used an object of type osg::Barrier. This barrier is entered two times at the end of the animation thread and also two times during the display() function, where they appear to be like brackets around the sync command. All have one integer parameter of value two passed to them. The reason why these barriers have to be used is quite simple: during synchronization, that is when data is written from one aspect to another, no new data is allowed to be written (or even read), because this may cause inconsistent data. So all threads that are going to be synchronized have to stop until the process has finished. That is what barriers are for.

Barriers

When a barriers is entered via

    some_barrier->enter(2);

the thread will stop. The integer argument tells the system how many threads will have to enter this barrier until they are released again. In our example the main thread will stop during the display function() at the very fist line, until the animation thread also enters the first barrier at the end of the function(). At the moment that happens, both threads will continue to run. However the animation thread will be stopped at the next line already, where the main thread will first finish executing the synchronization until it enters the next barrier, where both will be released again.

barriers.png

Usage of barriers

Code is executed downwards along the red line, until the blue line marks a barrier. If and only if both threads have arrived at the blue barrier, execution of both threads continues.

Be careful to use the correct value as argument for entering barriers. If I would have accidently given a value of three instead of two, to any barrier, the application will wait there forever and never continue, because there is no third thread that could enter the barrier.

Now that you have a basic understanding of how threads work, you might come to the conclusion, that this is not very efficient, as every thread has to wait every frame until the other has also finished - yes, as I said before, this was for didactic reasons only and should not be used like that in the real world.

A little example where multithreading might be useful: imagine you have a virtual reality system, that has to load files from disk during runtime. If these files contain some bigger models, such as terrain or houses, loading might take some noticeable time. If you are running a single threaded application, the whole thing will stop (i.e. the picture will freeze) until loading is finished. In this case you could have a second thread for loading. This thread will be started when a model should be loaded. The model will be loaded into memory while the main application will continue to run. When the loading process has finished, both threads will be synchronized and then the additional thread will be terminated. Done.

Next Chapter : Clustering


Generated on 8 Feb 2010 for OpenSG by  doxygen 1.6.1