by Julio Jerez » Mon Sep 03, 2018 6:38 pm
OK, I think I fixed it now. If you sync the Newton SDK and rebuild it, it should be fine.
This was a legacy bug that goes back to the very early versions of the engine.
Basically, the collision tree caches the polygon face normals. So when I said that this kind of error usually comes from the face normal, and that it looked like float32 rounding,
I was correct, but the source of the bug was not the input data; it was a bug in the collision tree builder.
The collision tree caches the face normals because there is a lot of redundancy among them, and by precalculating them we can do the intermediate calculations in higher precision and only round to single precision in the final result. This trick is used a lot in Newton. In fact, some calculations are done in up to 256 bits of precision using adaptive arithmetic.
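To illustrate the idea of computing in higher precision and rounding only at the end, here is a minimal sketch. It is not the engine's code: Vec3f, Vec3d and FaceNormal are hypothetical names made up for this example, and it handles a single triangle only.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical types for this sketch; Newton uses its own vector classes.
struct Vec3f { float x, y, z; };
struct Vec3d { double x, y, z; };

// Computes a unit face normal for triangle (a, b, c).
// All intermediate math is done in double; the result is rounded
// to float only once, in the return statement.
inline Vec3f FaceNormal(const Vec3f& a, const Vec3f& b, const Vec3f& c)
{
    const Vec3d e0{double(b.x) - a.x, double(b.y) - a.y, double(b.z) - a.z};
    const Vec3d e1{double(c.x) - a.x, double(c.y) - a.y, double(c.z) - a.z};

    // Cross product e0 x e1 in double precision.
    const Vec3d n{e0.y * e1.z - e0.z * e1.y,
                  e0.z * e1.x - e0.x * e1.z,
                  e0.x * e1.y - e0.y * e1.x};

    const double len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    if (len == 0.0) {
        return Vec3f{0.0f, 0.0f, 0.0f};  // degenerate triangle, no normal
    }
    return Vec3f{float(n.x / len), float(n.y / len), float(n.z / len)};
}
```

For example, FaceNormal({0,0,0}, {1,0,0}, {0,1,0}) gives the +z axis. The point is only that the subtraction, cross product and normalization never pass through float32 before the final conversion.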
This is done in the function void dgPolygonSoupDatabaseBuilder::End(bool optimize).
You can see that the intermediate calculation is done using dgDouble. My mistake was in the last function call, the one that removes the duplicate normals. I was calling:
m_normalCount = dgVertexListToIndexList(&m_normalPoints[0].m_x, sizeof (dgBigVector), 3, m_faceCount, &m_normalIndex[0], dgFloat32 (1.0e-4f));
Since these are normalized values, we can use a fixed tolerance. We need to use a tolerance, or else some values may come out incorrect because of rounding errors (it is possible for two values to compare the wrong way because of rounding error, so that a conditional picks the wrong branch and produces a bad sort).
Anyway, the value (1.0e-4) dates from when this was all done with 32-bit floats, which have about 6 decimal digits of precision. After I changed the code to use doubles, I forgot to change that value to something smaller.
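To put numbers on that, here is a small sketch that expresses a tolerance as a multiple of the spacing between representable values near 1.0 (the machine epsilon), which is the relevant scale for normalized components. ToleranceInEpsilons is a hypothetical helper written for this post, and IEEE 754 float and double are assumed.

```cpp
#include <cassert>
#include <limits>

// On IEEE 754 hardware float keeps 6 decimal digits and double keeps 15.
static_assert(std::numeric_limits<float>::digits10 == 6, "expects IEEE 754 float");
static_assert(std::numeric_limits<double>::digits10 == 15, "expects IEEE 754 double");

// Hypothetical helper: how many epsilon-sized steps of type T fit in
// the given tolerance. Epsilon is the gap between 1.0 and the next
// representable value of T.
template <typename T>
constexpr double ToleranceInEpsilons(double tolerance)
{
    return tolerance / double(std::numeric_limits<T>::epsilon());
}
```

ToleranceInEpsilons<float>(1.0e-4) is roughly 840, so the old value is a modest margin over float32 noise. ToleranceInEpsilons<double>(1.0e-4) is on the order of 4.5e11, which shows how much precision the same value throws away once the data is stored as doubles.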
The result was that normals that were close enough were fused, and an average was used as the representative.
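The fusing effect can be reproduced with a naive weld. This is only a sketch under stated assumptions: WeldNormals and NearlyEqual are hypothetical names, the search is a plain O(n^2) loop, and it keeps the first normal seen as the representative instead of an average. It is not how dgVertexListToIndexList is implemented.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Normal { double x, y, z; };  // hypothetical type for this sketch

// True when every component differs by at most tol.
inline bool NearlyEqual(const Normal& a, const Normal& b, double tol)
{
    return std::fabs(a.x - b.x) <= tol &&
           std::fabs(a.y - b.y) <= tol &&
           std::fabs(a.z - b.z) <= tol;
}

// Collapses normals that are within tol of an earlier one.
// unique receives the surviving normals, index maps every input
// normal to its entry in unique. Returns the number of survivors.
inline std::size_t WeldNormals(const std::vector<Normal>& in, double tol,
                               std::vector<Normal>& unique,
                               std::vector<std::size_t>& index)
{
    assert(tol >= 0.0);
    unique.clear();
    index.clear();
    for (const Normal& n : in) {
        std::size_t i = 0;
        while (i < unique.size() && !NearlyEqual(unique[i], n, tol)) {
            ++i;
        }
        if (i == unique.size()) {
            unique.push_back(n);  // first time this normal is seen
        }
        index.push_back(i);
    }
    return unique.size();
}
```

With two inputs whose x components differ by 5.0e-5, a tolerance of 1.0e-4 returns one normal, so both faces end up sharing it, while a tolerance of 1.0e-6 keeps the two apart.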
Making the tolerance 100 times smaller does not make the bug go away, but the error becomes 100 times smaller. The error can be made even smaller, but that would result in some unhealthy calculations when using single precision like you are doing. I am going to ask you to trust me on that. I am not saying it lightly; this comes from many years of numerical analysis, and I can tell you that it is a fool's errand to try to do exact arithmetic with floating point.
Anyway, if you sync and try again, the error should be within a bound of +-1e-5
Thanks for that bug report; this would probably never have been fixed without such a precise repro case.