
Struggling with OpenGL

I'm trying to create a rendering engine similar to Minecraft's (i.e. it saves the world in 16×16 chunks and all that stuff), and I've just begun trying to generate multiple chunks into a std::vector.

The issue is that I'm assigning each sub-chunk (16×16×16 blocks, 8 per chunk at the moment) its own vertex buffer, but some of the chunks seem to end up with buffers that are already in use. Is there a limit on the number of vertex buffers in OpenGL? If so, how can I get around it?

-Parts-

Core i7 8086k 4.0GHz (4.6GHz OC) - ASUS z390 ROG Maximus XI code - Corsair H115i - 16GB (2 x 8GB) G.SKILL Trident Z RGB 3600MHz - Windforce GTX 1050Ti - Corsair HX750i - Corsair Obsidian 500D RGB SE

 

-Upgrades when I get the money-

Undecided on the GPU - ASUS ROG PG278QR 1440p 144Hz (165Hz OC)


The number of buffers you can have is limited by the amount of VRAM you have. You'll need to swap data in and out of buffers if you don't have enough memory.

Also, don't use the STL containers in a game engine if you want predictable performance.

Game engines tend to use custom and specialized allocators to minimize memory footprint and maximize performance.


1 minute ago, Trinopoty said:

The number of buffers you can have is limited by the amount of VRAM you have. You'll need to swap data in and out of buffers if you don't have enough memory.

Also, don't use the STL containers in a game engine if you want predictable performance.

Game engines tend to use custom and specialized allocators to minimize memory footprint and maximize performance.

I haven't come across the STL before; what is it?

Also, I believe I've narrowed the problem down to a timing issue: if I increase the number of chunks, the number of allocated buffers ends up being about a quarter of what it should be.



I got it working! The fix was to create all the chunk objects first and only then allocate a buffer for each one.



2 hours ago, RushFan said:

I haven't come across STLs before, what are they? 

 

The Standard Template Library: std::vector, std::map, and the other containers from the C++ standard library.

2 hours ago, Trinopoty said:

The number of buffers you can have is limited by the amount of VRAM you have. You'll need to swap data in and out of buffers if you don't have enough memory.

Also, don't use the STL containers in a game engine if you want predictable performance.

Game engines tend to use custom and specialized allocators to minimize memory footprint and maximize performance.

It's not nearly as much of a problem as it used to be; it really comes down to the use case now. You can get more than acceptable performance out of the standard containers if you're not pushing anything too crazy and you're not developing for consoles.

CPU: Intel i7 - 5820k @ 4.5GHz, Cooler: Corsair H80i, Motherboard: MSI X99S Gaming 7, RAM: Corsair Vengeance LPX 32GB DDR4 2666MHz CL16,

GPU: ASUS GTX 980 Strix, Case: Corsair 900D, PSU: Corsair AX860i 860W, Keyboard: Logitech G19, Mouse: Corsair M95, Storage: Intel 730 Series 480GB SSD, WD 1.5TB Black

Display: BenQ XL2730Z 2560x1440 144Hz


6 hours ago, RushFan said:

I'm trying to create a rendering engine similar to that of minecraft (i.e saves the world in 16*16 chunks and all that stuff) and I've just began to start trying to generate multiple chunks into a std::vector.

The issue is that I'm assigning each sub-chunk (16*16*16 blocks, 8 per chunk at the moment) its own vertex buffer, but some of the chunks seem to be assigning themselves buffers that are already in use. Is there a limit on the number of vertex buffers in OpenGL? If so, how could I bypass this?

Too few details given, but it might be this (note: I'm not familiar with the details of the Minecraft engine, but the following is generic and applies in most cases/engines).

 

Game worlds can be huge: way more instances of "things" than could ever be held directly in GPU buffers.

 

The state (position, orientation) and occurrences of stuff are held in normal (system) memory.

 

From there, things are rendered to the GPU in multiple stages (the engine's rendering pipeline, part of which is in OpenGL, but there is at least one and usually more stages before that; the one mandatory stage is what transfers a selection of the world to OpenGL in order to render it on the GPU).

 

Basically, depending on position, viewpoint, and the near and far z ranges (the frustum), a sub-selection of the "world" in system memory is made. This eliminates a lot of stuff that would otherwise be sent to OpenGL (stuff behind the viewer need not be sent, to take the simplest example). That's stage 1.

 

Stage 2: the transform matrices need to be traversed, bringing everything into its final location and orientation (this can be a tree operation, since a thing attached to another thing inherits the transform of its parent, for example). Note that the eventual draw order can deviate from this traversal order to optimize performance. The traversed objects now have their matrices ready for drawing; call this set of selected and prepared objects a "display list". That's stage 2.

 

Stage 3 can be sorting on: the vertex buffer the object *references* (not a buffer per object instance! It is prepared once, during a one-time initialisation phase beforehand, not each frame or per instance), the texture buffer(s) the object references, and the object's required draw state and shader (for example no z-test, or translucency, which require proper ordering to look good).

This sort will (a) reduce shader, texture/buffer swaps and draw-state changes, which helps immensely with performance, and (b) can be used to enforce draw order for things like translucent objects, which need to be drawn last in order to show the objects behind them.

 

Stage 4: the rendering pass. The sorted display list is traversed and sent to OpenGL, which queues the commands to the GPU.

 

Nice thing: once sent, while the GPU is processing/drawing the stuff it received from OpenGL/the driver, stages 1 through 3 can already be at work preparing the next batch.

 

Omitted is "stage 0", which would be updating the game state: handling collisions, movement, and so on.

 

Moral of the story: if there are 100,000 cubes in a world using 4 different textures, only 1 vertex buffer is ever needed, and only 4 texture objects. OpenGL is for holding the objects to draw, which is usually a tiny subset of the objects in the world.

 

The same principles apply, up to an extent, to the static parts of a world (buildings, walls), although there are more options there (too many and/or too complex to elaborate on here).

 

There are endless (and far more advanced) variations on the above, but hopefully it paints a picture (pun intended 😂) and you find some use in it, even if only as inspiration to read more on the subject.

 


18 hours ago, Bartholomew said:

Moral of the story: if there are 100,000 cubes in a world using 4 different textures, only 1 vertex buffer is ever needed, and only 4 texture objects. OpenGL is for holding the objects to draw, which is usually a tiny subset of the objects in the world.

 

No, I was only sending the vertices to be rendered to the GPU buffer, not the whole object lol.

Turns out my problem was with std::vector reallocating its storage once it outgrows its capacity, which moves the elements and invalidates any pointers into them. I didn't really get why at first, but I needed to resize it to accommodate all of the chunks being held before anything was written to it. I thought it'd handle this automatically, but no, of course not.


