2019-12-01

Programmable vertex pulling

No more complex vertex array abstractions

So I finally managed to invest some time in implementing programmable vertex pulling in my engine. I can really recommend building an abstraction over persistently mapped buffers that lets you define structured buffers of generic structs and then use them on both the CPU and the GPU side as a simple array of things.
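
To give a rough idea of what such an abstraction can look like, here is a minimal CPU-only sketch in C++. It is an illustration, not code from the engine: the names `StructuredBuffer` and `PointLight` are made up, and the base pointer is assumed to come from wherever your mapped memory lives (in OpenGL that would be a buffer mapped with `GL_MAP_PERSISTENT_BIT`). Here any block of CPU memory works, so the sketch runs without a GL context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical typed view over a raw byte range. In a renderer, `base` would
// be the pointer obtained from a persistently mapped buffer (an assumption);
// in this sketch it can point at any CPU memory.
template <typename T>
class StructuredBuffer {
    static_assert(std::is_trivially_copyable<T>::value,
                  "elements are copied byte-wise, so T must be trivially copyable");

public:
    StructuredBuffer(std::byte* base, std::size_t sizeInBytes)
        : base_(base), count_(sizeInBytes / sizeof(T)) {}

    // Number of whole elements that fit into the byte range.
    std::size_t size() const { return count_; }

    // Copy one element into the buffer at the given index.
    void set(std::size_t index, const T& value) {
        assert(index < count_);
        std::memcpy(base_ + index * sizeof(T), &value, sizeof(T));
    }

    // Copy one element out of the buffer.
    T get(std::size_t index) const {
        assert(index < count_);
        T value;
        std::memcpy(&value, base_ + index * sizeof(T), sizeof(T));
        return value;
    }

private:
    std::byte* base_;
    std::size_t count_;
};

// Example element type (hypothetical); every member is 16 bytes wide.
struct PointLight {
    float position[4];
    float color[4];
};
```

With a `std::vector<std::byte>` standing in for the mapped range, `StructuredBuffer<PointLight> lights(storage.data(), storage.size());` followed by `lights.set(2, light);` writes the third element in place. Using `memcpy` rather than casting the pointer keeps the sketch free of aliasing and alignment assumptions.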

Nothing comes for free: I find it quite difficult to handle any layout other than std430, because std430 matches what your C or C++ code is doing, as long as you restrict yourself to members with 16-byte alignment, I think. My struct framework doesn't do any alignment, so I just added dummy members where appropriate in order to meet the layout requirements. After that, the struct definitions in GLSL have to match your structs on the CPU side, and the only thing left to do for your vertices is:


struct VertexPacked {
    vec4 position;
    vec4 texCoord;
    vec4 normal;
};
layout(std430, binding=7) buffer _vertices {
    VertexPacked vertices[];
};

...


int vertexIndex = gl_VertexID;
VertexPacked vertex = vertices[vertexIndex];
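
For the CPU side, a matching struct could look like the following C++ sketch. It assumes every member is padded to a full 16-byte `vec4`, as in the GLSL struct above; the `Vec4` type and the `packVertex` helper are hypothetical names, and which components count as padding (here `texCoord.z`/`w` and the `w` of position and normal) is my assumption. The `static_assert`s make the compiler check the offsets that std430 gives a struct of three `vec4` members (0, 16, 32, with an array stride of 48).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 16-byte aligned four-float vector, mirroring a GLSL vec4.
struct alignas(16) Vec4 {
    float x, y, z, w;
};

// CPU-side mirror of the GLSL VertexPacked struct.
struct VertexPacked {
    Vec4 position;
    Vec4 texCoord;  // assumption: only x and y carry data, z and w are padding
    Vec4 normal;
};

// Compile-time checks that the C++ layout matches std430 for three vec4s.
static_assert(sizeof(Vec4) == 16, "Vec4 must be 16 bytes");
static_assert(offsetof(VertexPacked, position) == 0, "position at offset 0");
static_assert(offsetof(VertexPacked, texCoord) == 16, "texCoord at offset 16");
static_assert(offsetof(VertexPacked, normal) == 32, "normal at offset 32");
static_assert(sizeof(VertexPacked) == 48, "array stride must be 48 bytes");

// Hypothetical helper: widen tightly packed attributes to the padded layout.
VertexPacked packVertex(const float pos[3], const float uv[2], const float n[3]) {
    return VertexPacked{
        {pos[0], pos[1], pos[2], 1.0f},
        {uv[0], uv[1], 0.0f, 0.0f},
        {n[0], n[1], n[2], 0.0f}};
}
```

If the two definitions ever drift apart, the build fails instead of the shader silently reading garbage.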


Combined with persistent mapping, you can get rid of any layout fiddling, synchronization, buffering, mapping... and it just works.

Regarding performance: I am using an array-of-structs approach because it is the simplest to use. The performance in my test scenes (for example Sponza) is identical to the traditional approach. I measured no performance differences on an Intel UHD Graphics 620.

Having free indexed access to vertices in your shaders can be beneficial in other situations as well. For example, you can implement a kd-tree-accelerated ray tracer in a compute shader that uses indices into your regular vertex array.
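
As a rough illustration of that last idea, here is a CPU-side C++ sketch of the part that touches the vertex array: testing a ray against the triangles referenced by one kd-tree leaf. The kd-tree traversal itself is left out, the names (`Vec3`, `intersect`, `closestHitInLeaf`) are hypothetical, and the intersection test is the standard Möller-Trumbore algorithm, which I picked for the example. In a compute shader the same loop would read from the vertex buffer by index.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <optional>
#include <vector>

struct Vec3 { float x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Moeller-Trumbore ray/triangle test. Returns the distance along the ray,
// or nothing if the ray misses the triangle or is parallel to it.
std::optional<float> intersect(Vec3 origin, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2) {
    const float eps = 1e-7f;
    Vec3 e1 = sub(v1, v0);
    Vec3 e2 = sub(v2, v0);
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < eps) return std::nullopt;  // ray parallel to triangle
    float inv = 1.0f / det;
    Vec3 s = sub(origin, v0);
    float u = dot(s, p) * inv;
    if (u < 0.0f || u > 1.0f) return std::nullopt;
    Vec3 q = cross(s, e1);
    float v = dot(dir, q) * inv;
    if (v < 0.0f || u + v > 1.0f) return std::nullopt;
    float t = dot(e2, q) * inv;
    if (t <= eps) return std::nullopt;              // hit is behind the origin
    return t;
}

// A leaf stores triangle ids; triangle i uses indices[3*i .. 3*i+2], which
// point into the same vertex array the rasterizer draws from.
std::optional<float> closestHitInLeaf(Vec3 origin, Vec3 dir,
                                      const std::vector<Vec3>& vertices,
                                      const std::vector<std::uint32_t>& indices,
                                      const std::vector<std::uint32_t>& leafTriangles) {
    std::optional<float> closest;
    for (std::uint32_t tri : leafTriangles) {
        Vec3 v0 = vertices[indices[3 * tri + 0]];
        Vec3 v1 = vertices[indices[3 * tri + 1]];
        Vec3 v2 = vertices[indices[3 * tri + 2]];
        std::optional<float> t = intersect(origin, dir, v0, v1, v2);
        if (t && (!closest || *t < *closest)) closest = t;
    }
    return closest;
}
```

The point is that the leaf only stores small integer ids, so the acceleration structure needs no copy of the geometry.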