Seriously, game devs, it's not that hard. Get with the times.
This would be part of the main class
And this would be the chief graphics object, from which nearly every other graphics object should inherit.
There, 80+% of your CPU-side optimization is done for you. Quit your bellyaching, start over, and do it right this time. There's no excuse when it's this easy.
It's not difficult to rearrange this with native C++ threads.
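For what it's worth, a minimal sketch of that kind of layout on `std::thread` might look like the following. Everything in it is an assumption for illustration: the subsystem split (physics, audio, render), the tick periods, and the names `run_loop`, `run_engine_for`, and `TickCounts` are hypothetical, and the per-tick work is a placeholder counter rather than real engine code.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical result type: how many ticks each subsystem completed.
struct TickCounts {
    long physics;
    long audio;
    long render;
};

// One subsystem loop: do a tick of placeholder work, then sleep for `period`.
// The do-while means every subsystem ticks at least once before it can exit.
void run_loop(const std::atomic<bool>& running, std::atomic<long>& ticks,
              std::chrono::milliseconds period) {
    do {
        ticks.fetch_add(1);  // stand-in for the real per-tick update
        std::this_thread::sleep_for(period);
    } while (running.load());
}

// Starts one thread per subsystem, lets them run for `duration`, then shuts down.
TickCounts run_engine_for(std::chrono::milliseconds duration) {
    assert(duration.count() >= 0);

    std::atomic<bool> running{true};
    std::atomic<long> physics{0}, audio{0}, render{0};

    // Assumed tick periods; a real engine would tune these per subsystem.
    std::vector<std::thread> workers;
    workers.emplace_back(run_loop, std::cref(running), std::ref(physics),
                         std::chrono::milliseconds(2));
    workers.emplace_back(run_loop, std::cref(running), std::ref(audio),
                         std::chrono::milliseconds(5));
    workers.emplace_back(run_loop, std::cref(running), std::ref(render),
                         std::chrono::milliseconds(8));

    std::this_thread::sleep_for(duration);
    running.store(false);  // ask every loop to finish its current tick and exit

    for (auto& w : workers) {
        w.join();
    }
    return TickCounts{physics.load(), audio.load(), render.load()};
}
```

Calling `run_engine_for(std::chrono::milliseconds(100))` from `main` runs the three loops side by side and returns the tick counts once all threads have joined. The point of the shape is that each subsystem lives in its own loop, so swapping the placeholder for real work does not touch the others.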
The benefit to starting with this template is easy compartmentalization, easy debugging, easy expandability, and it's not overly difficult to optimize or rebalance if one thread is getting bogged down. You can rapidly figure out where in the system you're choking and can find ways to alleviate it.
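One way to find the thread that's choking is to time each tick per thread and compare averages. This is only a sketch under assumptions: `ThreadStats`, `timed_tick`, and `busiest` are hypothetical names, each thread is assumed to own its stats object exclusively, and the stats are assumed to be read only after the threads have joined (so no locking is shown).

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical per-thread counters; each subsystem thread owns exactly one.
struct ThreadStats {
    std::string name;
    std::uint64_t ticks = 0;
    std::chrono::nanoseconds busy{0};

    // Average time spent working per tick, in milliseconds.
    double avg_ms() const {
        if (ticks == 0) return 0.0;
        const double total_ms =
            std::chrono::duration<double, std::milli>(busy).count();
        return total_ms / static_cast<double>(ticks);
    }
};

// Runs one tick of `work` and adds its wall-clock duration to `stats`.
template <typename Work>
void timed_tick(ThreadStats& stats, Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto elapsed = std::chrono::steady_clock::now() - start;
    stats.busy += std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed);
    ++stats.ticks;
}

// Returns the entry with the highest average tick time, or nullptr if empty.
const ThreadStats* busiest(const std::vector<ThreadStats>& all) {
    const ThreadStats* worst = nullptr;
    for (const auto& s : all) {
        if (worst == nullptr || s.avg_ms() > worst->avg_ms()) {
            worst = &s;
        }
    }
    return worst;
}
```

Wrapping each loop body in `timed_tick` and checking `busiest` after a run gives a rough first answer to which thread needs rebalancing. It measures wall-clock time only, so time spent blocked inside the tick counts as busy.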
If you're building a game and you don't start with something similar to this, I blame you for the mediocrity of ga