One thing that baffles me is how so much of what you read about graphics is not really what it is supposed to be.
I am trying to make a shader that renders a sphere from a quad.
It is quite a trivial operation:
- basically, you transform the particle to view space,
- build the quad in view space,
- calculate the radius by projecting the origin and a corner, and subtracting the difference; this gives the radius in pixel space (see the sketch below).
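Here is a minimal GLSL sketch of that projection step (the names projMatrix and viewportSize are my placeholders, not from the engine; this is a helper you would call from the vertex shader):

Code:
uniform mat4 projMatrix;
uniform vec2 viewportSize; // framebuffer size in pixels

// project the view-space center and a point one radius away along x,
// then measure the horizontal screen-space difference in pixels
float RadiusInPixels(vec3 viewCenter, float radius)
{
    vec4 p0 = projMatrix * vec4(viewCenter, 1.0);
    vec4 p1 = projMatrix * vec4(viewCenter + vec3(radius, 0.0, 0.0), 1.0);
    float x0 = (p0.x / p0.w) * 0.5 * viewportSize.x; // perspective divide to NDC, then to pixels
    float x1 = (p1.x / p1.w) * 0.5 * viewportSize.x;
    return abs(x1 - x0);
}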
In the pixel shader:
- calculate the pixel's distance to the origin, and if it is bigger than the radius, reject the pixel,
- calculate the normal at the surface of the sphere at that pixel, and apply lighting (see the sketch after this list),
- the end.
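A minimal sketch of those last steps, assuming uv spans [-1, 1] across the quad and lightDirView is the light direction in view space (both placeholder names, not from the engine):

Code:
#version 330 core

in vec2 uv;                 // [-1, 1] across the quad
uniform vec3 lightDirView;  // light direction in view space
uniform vec4 diffuseColor;
out vec4 fragColor;

void main()
{
    // squared distance from the quad center; outside the unit disk means outside the sphere
    float r2 = dot(uv, uv);
    if (r2 > 1.0)
        discard; // reject the pixel

    // reconstruct the view-space normal on the sphere surface at this pixel
    vec3 normal = vec3(uv, sqrt(1.0 - r2));

    // simple Lambert lighting
    float diffuse = max(dot(normal, -lightDirView), 0.0);
    fragColor = vec4(diffuseColor.rgb * diffuse, diffuseColor.a);
}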
The first step is to write a billboard renderer that makes a quad out of each particle, like the code below.
Code:
ndFloat32 radius = particle->GetParticleRadius();
radius *= 16.0f;

// quad corners in view space: two triangles, six vertices
ndVector quad[] =
{
    ndVector(-radius,  radius, ndFloat32(0.0f), ndFloat32(0.0f)),
    ndVector(-radius, -radius, ndFloat32(0.0f), ndFloat32(0.0f)),
    ndVector( radius,  radius, ndFloat32(0.0f), ndFloat32(0.0f)),
    ndVector( radius,  radius, ndFloat32(0.0f), ndFloat32(0.0f)),
    ndVector(-radius, -radius, ndFloat32(0.0f), ndFloat32(0.0f)),
    ndVector( radius, -radius, ndFloat32(0.0f), ndFloat32(0.0f)),
};

for (ndInt32 i = 0; i < positions.GetCount(); i++)
{
    // transform the particle center to view space
    const ndVector p(viewMatrix.TransformVector(positions[i]));

    // offset each corner in view space, then transform back to world space
    ndInt32 j = i * 6;
    pointBuffer[j + 0] = viewMatrix.UntransformVector(p + quad[0]);
    pointBuffer[j + 1] = viewMatrix.UntransformVector(p + quad[1]);
    pointBuffer[j + 2] = viewMatrix.UntransformVector(p + quad[2]);
    pointBuffer[j + 3] = viewMatrix.UntransformVector(p + quad[3]);
    pointBuffer[j + 4] = viewMatrix.UntransformVector(p + quad[4]);
    pointBuffer[j + 5] = viewMatrix.UntransformVector(p + quad[5]);
}

glDrawArrays(GL_TRIANGLES, 0, pointBuffer.GetCount());
As you can see, neither the vertex shader nor the pixel shader can do that, simply because they cannot emit new vertices.
Now, for a 64k particle buffer, the point count becomes 384k points (about a 4 MB buffer).
This part alone is what makes the rendering so slow.
You would think that the ideal candidate for this is a geometry shader, which would reduce the CPU bandwidth by a factor of 6.
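For reference, this is roughly what that geometry shader would look like (assuming the vertex shader passes the particle center through in view space; projMatrix and radius are placeholder uniforms):

Code:
#version 330 core

layout(points) in;
layout(triangle_strip, max_vertices = 4) out;

uniform mat4 projMatrix;
uniform float radius;

out vec2 uv; // [-1, 1] across the quad, consumed by the sphere pixel shader

void main()
{
    // particle center in view space (assumed output of the vertex shader)
    vec4 center = gl_in[0].gl_Position;

    // emit the four corners as one triangle strip instead of six vertices
    const vec2 corners[4] = vec2[](
        vec2(-1.0, -1.0), vec2(1.0, -1.0),
        vec2(-1.0,  1.0), vec2(1.0,  1.0));

    for (int i = 0; i < 4; i++)
    {
        uv = corners[i];
        gl_Position = projMatrix * (center + vec4(uv * radius, 0.0, 0.0));
        EmitVertex();
    }
    EndPrimitive();
}

With this, the CPU uploads one vertex per particle and the expansion happens on the GPU.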
But now enter the self-appointed experts, who all agree that geometry shaders are bad because they write back to memory.
To me, writing back to memory on the GPU should still be better than the CPU building the buffer and copying it across the PCI bus.
I read one expert who says it is better to have a compute shader build the triangle list, then issue a GL synchronization point, and then draw the triangle list from a shared resource.
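That approach would look something like the sketch below: a compute shader does the same expansion as the CPU loop above, but into a shader storage buffer, and the "synchronization point" is a memory barrier before the draw call reads that buffer as vertex data (the buffer bindings and invViewMatrix are my assumptions):

Code:
#version 430 core

layout(local_size_x = 64) in;

layout(std430, binding = 0) readonly buffer Particles { vec4 positions[]; };
layout(std430, binding = 1) writeonly buffer Quads { vec4 vertices[]; };

uniform mat4 viewMatrix;
uniform mat4 invViewMatrix; // inverse of viewMatrix, uploaded by the host
uniform float radius;
uniform int particleCount;

void main()
{
    uint i = gl_GlobalInvocationID.x;
    if (i >= uint(particleCount))
        return;

    // same idea as the CPU loop: offset in view space, transform back
    vec4 p = viewMatrix * vec4(positions[i].xyz, 1.0);
    const vec2 quad[6] = vec2[](
        vec2(-1.0,  1.0), vec2(-1.0, -1.0), vec2(1.0,  1.0),
        vec2( 1.0,  1.0), vec2(-1.0, -1.0), vec2(1.0, -1.0));

    for (int k = 0; k < 6; k++)
        vertices[i * 6u + uint(k)] = invViewMatrix * (p + vec4(quad[k] * radius, 0.0, 0.0));
}

On the host side the sequence would be glDispatchCompute, then glMemoryBarrier(GL_VERTEX_ATTRIB_ARRAY_BARRIER_BIT), then glDrawArrays sourcing that same buffer as a vertex buffer.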
I have no doubt that they are probably correct, but that begs the question: what are geometry shaders good for, then?