Parallel Game Code: DX12 and Vulkan-Ready, Usable with DX11 and OpenGL Now

Entry posted by patrickjp93 November 21, 2015

Seriously game devs, it's not that hard. Get with the times.

This would be part of the main class

#include <vector>

std::vector<VisualObject> movableObjects;
VisualObject rootObject; //assumed root of the scene graph

//initialize everything, set starting coordinates and directions
//other prep work

//FUNCTIONS
void update(float updateTimeMillis) {
    //move view point, make decisions on color changes, texture loading, etc.
    ...

    //check for collisions in parallel, using inlined function calls and manual
    //8-way loop unrolling so the compiler has a chance to vectorize with AVX (256-bit)
    //assume there is a mutex or semaphore to lock an object for analysis/deletion and a smart
    //function to skip if locked and come back built into the calls to inline function
    //collisionUpdate
    //ensure dummies exist if not in multiples of 8, otherwise the unrolled body indexes out of range
    const int n = static_cast<int>(movableObjects.size());
    #pragma omp parallel for
    for (int i = n - 1; i >= 7; i -= 8) {
        movableObjects[i].collisionUpdate(); //will run each type of object's unique function
        movableObjects[i-1].collisionUpdate();
        ...
        movableObjects[i-7].collisionUpdate();
    }

    //collided objects now destroyed or had states updated to change physical effects routines,
    //update all vertices of all objects in parallel
    #pragma omp parallel for
    for (int i = n - 1; i >= 7; i -= 8) {
        movableObjects[i].update(updateTimeMillis);
        movableObjects[i-1].update(updateTimeMillis);
        ...
        movableObjects[i-7].update(updateTimeMillis);
    }

    rootObject.draw(); //draw the scene and all objects in it
}
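The unrolled loops only work if the container size is a multiple of 8 or padded with dummies. Here is a minimal sketch of the same parallel update without manual unrolling, so any size works and the compiler is left to unroll or vectorize as it sees fit. `Mover` and `updateAll` are hypothetical names standing in for `VisualObject` and the update step; they are not from the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for VisualObject: a 1D position and velocity only.
struct Mover {
    float x = 0.0f;
    float vx = 0.0f;
    void update(float updateTimeMillis) { x += vx * updateTimeMillis; }
};

// Updates every object exactly once, whatever the container size.
// The pragma is ignored (the loop runs serially) unless OpenMP is
// enabled at compile time, e.g. with -fopenmp on GCC or Clang.
void updateAll(std::vector<Mover>& objects, float updateTimeMillis) {
    assert(updateTimeMillis >= 0.0f); // a negative time step would be a caller bug
    const long long n = static_cast<long long>(objects.size());
    #pragma omp parallel for
    for (long long i = 0; i < n; ++i) {
        // Each iteration touches a distinct element, so no locking is needed here.
        objects[static_cast<std::size_t>(i)].update(updateTimeMillis);
    }
}
```

Calling `updateAll(objects, dt)` on a vector of 13 objects updates all 13; nothing is skipped and no index goes negative.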

And this would be the chief graphics object from which nearly every other graphics object should inherit.

#include <algorithm>
#include <vector>
#include <omp.h>

class VisualObject {
public:
    VisualObject* parent = nullptr;
    std::vector<VisualObject> children;
    int omp_max_thread_count = omp_get_max_threads();
    //if quad I5, will get 4, if quad I7, will get 8

    void addChild(VisualObject v) {
        v.setParent(this);
        children.push_back(v);
    }

    void draw() {
        //draw children in parallel using dynamic scheduling in case some objects
        //have complex functions that take far longer than others, auto scaling with core count
        //assumes children.size() is a multiple of 8 (pad with dummies otherwise)
        const int n = static_cast<int>(children.size());
        #pragma omp parallel for schedule(dynamic)
        for (int i = 0; i < n - 7; i += 8) {
            children[i].draw();
            children[i+1].draw();
            ...
            children[i+7].draw();
        }
    }

    struct {
        bool operator()(const VisualObject& v1, const VisualObject& v2) const {
            return v1.complexity() > v2.complexity(); //descending: most expensive first
        }
    } VOComparator;

    //Sort Visual Objects by draw complexity into equal-sized buckets for the draw() function
    //to handle, most expensive draws first per thread.
    void loadBalance() {
        //If on GCC/Clang/ICC, compile with -fopenmp and -D_GLIBCXX_PARALLEL to get parallel sort
        std::sort(children.begin(), children.end(), VOComparator);
        const int total = static_cast<int>(children.size());
        const int chunkSize = total / omp_max_thread_count;
        std::vector<VisualObject> sortedSet(children.size());
        #pragma omp parallel for
        for (int i = 0; i < omp_max_thread_count; i++) {
            //There is room for loop unrolling as long as you check to ensure your unroll length is no larger than
            //chunkSize and you either have dummies to fill the empty space or a cleanup function for the
            //remainder under a given chunkSize
            for (int j = 0; j < chunkSize; j++) {
                sortedSet[i * chunkSize + j] = children[j * omp_max_thread_count + i];
            }
        }
        //copy the leftover objects that did not fill a whole chunk
        for (int k = chunkSize * omp_max_thread_count; k < total; k++) {
            sortedSet[k] = children[k];
        }
        children = sortedSet;
    } //end loadBalance
}; //end VisualObject class
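To make the bucket layout produced by loadBalance concrete, here is a small serial sketch of the same idea: sort by cost in descending order, then deal the sorted objects out round-robin so each thread's contiguous bucket starts with its most expensive item. `Item`, `cost`, and the free function `loadBalance` are hypothetical stand-ins for `VisualObject`, `complexity()`, and the member function; the handling of leftovers (appended at the end) is one possible choice, not something the class above prescribes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for VisualObject: only the draw cost matters here.
struct Item {
    int cost;
};

// Returns the items reordered so that bucket t (a contiguous run of
// chunkSize elements) holds every threadCount-th item of the
// cost-sorted list, most expensive first. Items that do not fill a
// whole chunk are appended after the last bucket.
std::vector<Item> loadBalance(std::vector<Item> items, std::size_t threadCount) {
    assert(threadCount > 0);
    std::sort(items.begin(), items.end(),
              [](const Item& a, const Item& b) { return a.cost > b.cost; });
    const std::size_t chunkSize = items.size() / threadCount;
    std::vector<Item> balanced;
    balanced.reserve(items.size());
    for (std::size_t t = 0; t < threadCount; ++t) {
        for (std::size_t j = 0; j < chunkSize; ++j) {
            balanced.push_back(items[j * threadCount + t]);
        }
    }
    // Remainder: whatever was left after the full chunks were dealt out.
    for (std::size_t k = chunkSize * threadCount; k < items.size(); ++k) {
        balanced.push_back(items[k]);
    }
    return balanced;
}
```

With costs 1 through 7 and two threads, the first bucket receives costs 7, 5, 3 and the second 6, 4, 2, with the leftover cost-1 item at the end, so both buckets carry a similar total.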

There, 80+% of your CPU-side optimization is done for you. Quit your bellyaching, start over, and do it right this time. There's no excuse when it's this easy.
