Crash in dgCollisionConvexPolygon

by **Julio Jerez** » Mon Mar 30, 2020 10:41 am

If you use static libraries, it is not going to work, because the new libraries use standard C++11 features such as threads, atomics and thread-local storage.
This prompted me to deprecate support for VS 2010.
Even when building with VS 2013, I have to install 5 compiler updates.
Soon we will have to move on and deprecate Visual Studio 2013 as well, as Microsoft abandons support for it in Windows 10.

I assume you are using Windows 7, so my guess is that you are making changes to the engine to compile it with VS 2008, but that is a futile effort.

Visual Studio 2015 and 2017 are free now, so there is no reason not to upgrade.
The only way it can possibly work is if you use a DLL that links to the static runtime, but you can do that yourself with CMake if you install VS 2015 or 2017.

by **anothertime12** » Mon Mar 30, 2020 12:51 pm

Julio Jerez wrote: I assume you are using Windows 7, so my guess is that you are making changes to the engine to compile it with VS 2008, but that is a futile effort.

Actually I downloaded CMake and ran the GUI for that - I didn't have to change anything at all.

Julio Jerez wrote: Visual Studio 2015 and 2017 are free now, so there is no reason not to upgrade.

Professional isn't free.. and so far our older copy of professional fills our needs.

I suppose using the DLL is the next best option - but is there a place to download those precompiled?

by **JernejL** » Sat Apr 18, 2020 2:57 pm

I can gladly compile DLLs for you. Let me know which CPU and variant you need (single or double precision, 32 or 64 bit, debug yes / no) and send me a PM!

by **anothertime12** » Sat Sep 26, 2020 6:43 am

Hi All - apologies for this text dump of a post.

I've been using the new compilations that JernejL kindly provided but I'm still getting all the same issues I used to.

All the settings are the same as the example I posted a while ago.

I'm setting:
NewtonSetSolverIterations(,8);
NewtonSetNumberOfSubsteps(,5);
NewtonSetThreadsCount(,1);
NewtonUpdate is being called 60 times per second.

The gravity being applied is 9.8 * 2 (felt right for this game).

Ball and world material (every object currently uses the same material):
NewtonMaterialSetDefaultElasticity(,0.25f);
NewtonMaterialSetDefaultFriction(,2.1f,0.9f);
NewtonMaterialSetDefaultSoftness(,0.25f);
I used to set a surface thickness of 0.1f but I don't now.

I have also been told that newton requires you to force the floating point control word.. I'm not keen on this but I have tried wrapping all my calls to NewtonUpdate so that every frame it's guaranteed to be set, IE:
_controlfp_s(NULL,_MCW_EM,_MCW_EM);
NewtonUpdate(,1.0f/60.0f);

The issue I'm getting is - sometimes when rolling around the level I get a hard crash inside the newton.dll as described below:

> newton.dll!dgCollisionConvexPolygon::CalculateContactToConvexHullContinue(const dgWorld * const world=0x04794c20, const dgCollisionInstance * const parentMesh=0x0479bda0, dgCollisionParamProxy & proxy={...}) Line 567 + 0x1d bytes C++
newton.dll!dgWorld::CalculateConvexToNonConvexContactsContinue(dgCollisionParamProxy & proxy={...}) Line 2041 C++
newton.dll!dgWorld::CalculateConvexToNonConvexContacts(dgCollisionParamProxy & proxy={...}) Line 1779 + 0x11 bytes C++
newton.dll!dgWorld::ConvexContacts(dgBroadPhase::dgPair * const pair=0x0479c650, dgCollisionParamProxy & proxy={...}) Line 1183 + 0x5 bytes C++
newton.dll!dgWorld::CalculateContacts(dgBroadPhase::dgPair * const pair=0x0479c650, int threadIndex=0, bool ccdMode=true, bool intersectionTestOnly=false) Line 1289 C++
newton.dll!dgWorld::CollideContinue(const dgCollisionInstance * const collisionSrcA=0x04ce4580, const dgMatrix & matrixA={...}, const dgVector & velocA={...}, const dgVector & omegaA={...}, const dgCollisionInstance * const collisionSrcB=0x04ce4700, const dgMatrix & matrixB={...}, const dgVector & velocB={...}, const dgVector & omegaB={...}, float & retTimeStep=0.0033333185, dgTriplex * const points=0x0479f490, dgTriplex * const normals=0x0479f3d0, float * const penetration=0x0479f550, __int64 * const attibuteA=0x0479f2d0, __int64 * const attibuteB=0x0479f350, int maxContacts=6, int threadIndex=0) Line 1403 C++
newton.dll!dgContact::EstimateCCD(float timestep=0.00000000) Line 214 + 0x53 bytes C++
newton.dll!dgWorldDynamicUpdate::BuildClusters(float timestep=0.00000000) Line 389 + 0x13 bytes C++
newton.dll!dgWorldDynamicUpdate::UpdateDynamics(float timestep=0.00000000) Line 110 C++
newton.dll!dgWorld::StepDynamics(float timestep=0.00000000) Line 940 C++
newton.dll!dgWorld::RunStep() Line 1002 C++
newton.dll!dgWorld::TickCallback(int threadID=1474240929) Line 1035 C++
newton.dll!dgMutexThread::Execute(int threadID=1) Line 59 C++
newton.dll!dgThread::dgThreadSystemCallback(void * threadData=0x04c7d330) Line 202 C++
newton.dll!std::_LaunchPad<std::_Bind<1,void *,void * (__cdecl*const)(void *),dgThread *> >::_Go() Line 187 + 0xb bytes C++
newton.dll!_Call_func(void * _Data=0x0018fbd0) Line 28 + 0xb bytes C++
newton.dll!_callthreadstartex() Line 376 + 0x6 bytes C
newton.dll!_threadstartex(void * ptd=0x04c63a28) Line 354 + 0x5 bytes C

The actual offending line seems to be a SIMD / SSE division part way into the function - there's an exception on a divss:

73045F80 ja dgCollisionConvexPolygon::CalculateContactToConvexHullContinue+2B5h (73045F85h)
73045F82 movaps xmm2,xmm0
73045F85 movaps xmm6,xmmword ptr [ebp-450h]
73045F8C movaps xmm7,xmmword ptr [ebp-480h]
73045F93 movaps xmm1,xmmword ptr [ebp-440h]
73045F9A shufps xmm2,xmm2,0
73045F9E mulps xmm2,xmmword ptr [ebp-4A0h]
73045FA5 mulps xmm1,xmm2
73045FA8 movaps xmm0,xmm2
73045FAB mulps xmm0,xmm7
73045FAE movaps xmm3,xmm2
73045FB1 mulps xmm3,xmm6
73045FB4 haddps xmm1,xmm1
73045FB8 haddps xmm0,xmm0
73045FBC haddps xmm3,xmm3
73045FC0 haddps xmm1,xmm1
73045FC4 haddps xmm0,xmm0
73045FC8 haddps xmm3,xmm3
73045FCC movaps xmmword ptr [ebp-470h],xmm2
73045FD3 xorps xmm2,xmm2
73045FD6 unpcklps xmm1,xmm2
73045FD9 unpcklps xmm3,xmm0
73045FDC movaps xmm0,xmmword ptr [__xmm@1e3ce5081e3ce5081e3ce5081e3ce508 (73095660h)]
73045FE3 unpcklps xmm3,xmm1
73045FE6 subps xmm3,xmmword ptr [__xmm@00000000000000000000000000000000 (730953E0h)]
73045FED movaps xmm1,xmmword ptr [dgVector::m_one (730BFFB0h)]
73045FF4 andps xmm3,xmmword ptr [dgVector::m_triplexMask (730BFF80h)]
73045FFB movaps xmm2,xmm3
73045FFE xorps xmm0,xmm3
73046001 andps xmm2,xmmword ptr [dgVector::m_signMask (730BFF90h)]
73046008 cmpltps xmm2,xmmword ptr [__xmm@322bcc77322bcc77322bcc77322bcc77 (73095690h)]
73046010 andps xmm0,xmm2
73046013 xorps xmm0,xmm3
73046016 xorps xmm3,xmm3
73046019 divps xmm1,xmm0
7304601C movaps xmm0,xmm3
7304601F andps xmm1,xmmword ptr [dgVector::m_triplexMask (730BFF80h)]
73046026 movaps xmmword ptr [ebp-450h],xmm1
7304602D movaps xmm1,xmm5
73046030 cmpleps xmm1,xmm3
73046034 cmpleps xmm0,xmm4
73046038 orps xmm1,xmm0
7304603B andps xmm1,xmm2
7304603E movmskps eax,xmm1
73046041 test al,7
73046043 je dgCollisionConvexPolygon::CalculateContactToConvexHullContinue+37Fh (7304604Fh)
73046045 movss xmm4,dword ptr [__real@3f99999a (7309515Ch)]
7304604D jmp dgCollisionConvexPolygon::CalculateContactToConvexHullContinue+3ECh (730460BCh)
7304604F subps xmm4,xmm3
73046052 movaps xmm2,xmmword ptr [__xmm@3f8000003f8000003f8000003f800000 (730957C0h)]
73046059 subps xmm5,xmm3
7304605C mulps xmm4,xmmword ptr [ebp-450h]
73046063 mulps xmm5,xmmword ptr [ebp-450h]
7304606A movaps xmm0,xmm4
7304606D minps xmm0,xmm5
73046070 maxps xmm4,xmm5
73046073 maxps xmm3,xmm0
73046076 minps xmm2,xmm4
73046079 movaps xmm0,xmm3
7304607C shufps xmm0,xmm3,0D2h
73046080 maxps xmm3,xmm0
73046083 movaps xmm0,xmm2
73046086 shufps xmm0,xmm2,0D2h
7304608A minps xmm2,xmm0
7304608D movaps xmm0,xmm3
73046090 shufps xmm0,xmm3,0D2h
73046094 maxps xmm3,xmm0
73046097 movaps xmm0,xmm2
7304609A shufps xmm0,xmm2,0D2h
7304609E movaps xmm4,xmm3
730460A1 minps xmm2,xmm0
730460A4 movaps xmm0,xmmword ptr [__xmm@3f99999a3f99999a3f99999a3f99999a (730957E0h)]
730460AB cmpltps xmm4,xmm2
730460AF xorps xmm0,xmm3
730460B2 andps xmm4,xmm0
730460B5 xorps xmm4,xmmword ptr [__xmm@3f99999a3f99999a3f99999a3f99999a (730957E0h)]
730460BC movss xmm0,dword ptr [__real@3f800000 (73095138h)]
730460C4 xor edx,edx
730460C6 comiss xmm0,xmm4
730460C9 jbe 730467DF
730460CF movaps xmm1,xmmword ptr [ebp-490h]
730460D6 addps xmm1,xmmword ptr [ebp-4B0h]
730460DD movaps xmm4,xmmword ptr [ebp-470h]
730460E4 movaps xmm2,xmm4
730460E7 mulps xmm1,xmmword ptr [dgVector::m_half (730BFFD0h)]
730460EE movaps xmm3,xmm1
730460F1 movaps xmm0,xmm1
730460F4 shufps xmm0,xmm1,0
730460F8 shufps xmm3,xmm1,55h
730460FC mulps xmm3,xmmword ptr [ebp-440h]
73046103 mulps xmm0,xmm6
73046106 shufps xmm1,xmm1,0AAh
7304610A mulps xmm1,xmm7
7304610D addps xmm3,xmm0
73046110 movaps xmm0,xmmword ptr [edi+0A0h]
73046117 mulps xmm2,xmm0
7304611A addps xmm3,xmm1
7304611D movaps xmmword ptr [ebp-480h],xmm2
73046124 addps xmm3,xmmword ptr [ebp-500h]
7304612B movaps xmm1,xmm3
7304612E movaps xmmword ptr [ebp-450h],xmm3
73046135 subps xmm1,xmmword ptr [edi+0B0h]
7304613C mulps xmm1,xmm0
7304613F movaps xmm0,xmm2
73046142 haddps xmm0,xmm2
73046146 haddps xmm1,xmm1
7304614A haddps xmm0,xmm0
7304614E haddps xmm1,xmm1
73046152 divss xmm1,xmm0 <- issue occurs here

My processor should support the SSE and AVX variants of this instruction, so I assume it's dividing by zero: it's a floating-point trap exception rather than an instruction-decoding one.

by **Julio Jerez** » Sat Sep 26, 2020 9:41 am

Oh yes, there is a divide by zero bug there.

It is in this code.

Code:
    .....
    dgVector relStep (relativeVelocity.Scale(dgMax (proxy.m_timestep, dgFloat32 (1.0e-12f))));
    dgFastRayTest ray(dgVector(dgFloat32(0.0f)), polygonMatrix.UnrotateVector(relStep));
    dgFloat32 distance = ray.BoxIntersect(minBox, maxBox);

    dgInt32 count = 0;
    if (distance < dgFloat32(1.0f)) {
        bool inside = false;
        dgVector sphOrigin(polygonMatrix.TransformVector((hullBoxP1 + hullBoxP0) * dgVector::m_half));
        dgVector pointInPlane (sphOrigin - relStep.Scale (m_normal.DotProduct(sphOrigin - m_localPoly[0]).GetScalar() / m_normal.DotProduct(relStep).GetScalar()));

You can see that dgVector relStep (relativeVelocity.Scale(dgMax (proxy.m_timestep, dgFloat32 (1.0e-12f)))); checks that the time step is not zero, but never checks whether the vector relativeVelocity itself was already zero.

It then proceeds to calculate the projection of the point onto the plane of collision along the relative velocity vector, and that will generate a divide by zero, which will be captured by the exception handler.
I will fix it later.
Thanks for the detailed bug report.
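
A minimal sketch of the kind of guard described above, assuming a plain stand-in vector type rather than the engine's dgVector. The names Vec3, projectAlongStep and the epsilon threshold are hypothetical and only illustrate the idea; this is not the fix that was committed to the engine.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the engine's vector type (not dgVector).
struct Vec3 {
    float x, y, z;
};

inline Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 scale(const Vec3& v, float s) { return {v.x * s, v.y * s, v.z * s}; }
inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Projects `origin` onto the plane through `planePoint` with normal `normal`,
// moving along `step`. Returns false, leaving `out` untouched, when `step` is
// zero or (nearly) parallel to the plane, which is the case that would
// otherwise divide by zero.
inline bool projectAlongStep(const Vec3& origin, const Vec3& planePoint,
                             const Vec3& normal, const Vec3& step,
                             Vec3& out, float epsilon = 1.0e-12f) {
    assert(epsilon > 0.0f);
    const float denom = dot(normal, step);
    if (std::fabs(denom) < epsilon) {
        return false;  // no well-defined intersection along this step
    }
    const float t = dot(normal, sub(origin, planePoint)) / denom;
    out = sub(origin, scale(step, t));
    return true;
}
```

The caller would skip the continuous-collision projection for that polygon whenever the function returns false, for example when the relative velocity is exactly zero.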

On this:

I have also been told that newton requires you to force the floating point control word.. I'm not keen on this but I have tried wrapping all my calls to NewtonUpdate so that every frame it's guaranteed to be set, IE:
_controlfp_s(NULL,_MCW_EM,_MCW_EM);

Why do you think setting the float mask for exception handling is bad? If it were not for that, this crash would be very hard to catch, because the denominator will probably not be an exact zero, and the result of the divide will be an overflowed infinite number, which will not crash but will produce bad numerical results, leading to NaNs in other places, which will then make the library crash or, worse, malfunction.
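
To illustrate the silent failure mode described here, the following sketch divides with the traps left masked (the default floating-point environment in standard C++) and inspects the status flags afterwards. It uses an exact zero denominator for simplicity, and the names DivideReport and divideWithMaskedTraps are made up for this example. It relies on volatile variables to keep the arithmetic at run time, which is an assumption about typical compilers rather than a guarantee of the language.

```cpp
#include <cfenv>

struct DivideReport {
    float quotient;      // numerator / denominator
    bool divByZeroFlag;  // FE_DIVBYZERO status flag was raised by the divide
    float followUp;      // quotient - quotient, NaN when quotient is infinite
    bool invalidFlag;    // FE_INVALID status flag was raised by the follow-up
};

inline DivideReport divideWithMaskedTraps(float numerator, float denominator) {
    // volatile keeps the compiler from folding the arithmetic at compile time.
    volatile float n = numerator;
    volatile float d = denominator;

    std::feclearexcept(FE_ALL_EXCEPT);
    volatile float q = n / d;  // no trap: the program keeps running
    const bool divByZero = std::fetestexcept(FE_DIVBYZERO) != 0;

    volatile float r = q - q;  // inf - inf produces NaN further down the line
    const bool invalid = std::fetestexcept(FE_INVALID) != 0;

    return {q, divByZero, r, invalid};
}
```

With the traps masked, divideWithMaskedTraps(1.0f, 0.0f) returns normally with an infinite quotient and a NaN follow-up value; only the status flags record that anything went wrong.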

The engine sets three exception traps.

Code:
    #define DG_FLOAT_EXECTIONS_MASK (EM_INVALID | EM_DENORMAL | EM_ZERODIVIDE)
    dgFloatExceptions(dgUnsigned32 mask = DG_FLOAT_EXECTIONS_MASK);

You can comment out the place where the control word is set, but I would not recommend that. Physics libraries are heavy on linear algebra calculations.
There is a process named pivoting which takes two vectors and combines them with this operation:

A = B + a * C

Upper case letters are vectors and lower case letters are scalar values.
The object of this operation is to zero out one of the components of vector A.

The problem with this is that in floating point you never get an exact zero. Instead you get a denormal float value, which is a legal float but not a normalized one.

A trick to get around the denormal value is, after the operation, to simply write an exact zero into the element that is expected to be zero. However, because this is linear algebra with long vectors, it is possible that some other elements of A are also near-zero denormal values, and that is the real problem.
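
The pivoting step and the "write the exact zero" trick can be sketched as follows. The function name eliminate and the use of std::array are choices made for this example only; the engine's solver is not written this way.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Computes A = B + a * C with the scalar a chosen so that A[pivot] cancels,
// then writes an exact zero into that slot instead of trusting the arithmetic.
// Assumes C[pivot] != 0; choosing a valid pivot is the caller's job.
template <std::size_t N>
std::array<float, N> eliminate(const std::array<float, N>& B,
                               const std::array<float, N>& C,
                               std::size_t pivot) {
    assert(pivot < N);
    const float a = -B[pivot] / C[pivot];
    std::array<float, N> A{};
    for (std::size_t i = 0; i < N; ++i) {
        A[i] = B[i] + a * C[i];
    }
    A[pivot] = 0.0f;  // overwrite any tiny residue left by rounding
    return A;
}
```

Note that only the pivot element is cleaned up. Any other element of A that happens to land near zero is left as computed, which is the remaining problem the post points out.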

This is not so bad when using double precision, but it happens quite frequently when using float, because floats are just truncated doubles, so a value that is not denormal in double precision is denormal when written back to a register or memory.
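
A small check of that last point, assuming the default floating-point modes (no flush-to-zero) and IEEE 754 float and double types. The helper name becomesDenormalAsFloat is hypothetical.

```cpp
#include <cmath>

// True when `value` is an ordinary (normal) double but turns into a denormal
// (subnormal) once it is narrowed to single precision.
inline bool becomesDenormalAsFloat(double value) {
    volatile float narrowed = static_cast<float>(value);  // force a real float store
    const float stored = narrowed;
    return std::fpclassify(value) == FP_NORMAL &&
           std::fpclassify(stored) == FP_SUBNORMAL;
}
```

For example, 1.0e-40 is a perfectly normal double, but it lies below the smallest normal float (about 1.18e-38), so becomesDenormalAsFloat(1.0e-40) is expected to return true, while 1.0e-30 stays normal in both formats.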

Some CPUs can deal with denormals in hardware, but the large majority do not, and the OS sets up an expensive exception routine to resolve the problem at the kernel level.
This causes the performance of the math to be really slow.

To solve this there is a feature called
_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);

What this does is that the floating-point hardware sets any denormal value to zero when it is going to be written to memory or to a register.
And that is what caused the divide by zero crash.
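
A minimal sketch of the effect, assuming an x86 target where float math is done with SSE instructions (the default on x86-64). The wrapper multiplyWithFlushMode is a made-up name; _MM_GET_FLUSH_ZERO_MODE and _MM_SET_FLUSH_ZERO_MODE are the standard macros from xmmintrin.h.

```cpp
#include <xmmintrin.h>  // _MM_SET_FLUSH_ZERO_MODE and friends, x86 SSE only

// Multiplies two floats with flush-to-zero switched on or off for the
// duration of the call, and restores the previous mode before returning.
inline float multiplyWithFlushMode(float a, float b, bool flushToZero) {
    const unsigned int saved = _MM_GET_FLUSH_ZERO_MODE();
    _MM_SET_FLUSH_ZERO_MODE(flushToZero ? _MM_FLUSH_ZERO_ON : _MM_FLUSH_ZERO_OFF);

    volatile float x = a;  // volatile keeps the multiply at run time
    volatile float y = b;
    volatile float product = x * y;

    _MM_SET_FLUSH_ZERO_MODE(saved);
    return product;
}
```

With flush-to-zero off, 1.0e-20f * 1.0e-20f gives a tiny non-zero denormal. With it on, the same product comes back as exactly 0.0f, so a later division by that product becomes a true divide by zero, which the zero-divide trap then catches.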

In my opinion setting the float mask is a good thing, but I understand why some people think it is not.

The way you are trying to reset the control word mask is not doing anything, because the control word is per-thread functionality. So setting it before NewtonUpdate simply sets it on the calling thread but not on the actual engine threads.
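
The per-thread nature of the floating-point state can be shown with standard C++ alone. This sketch uses the rounding mode as a portable stand-in for the exception mask, since both live in the same per-thread control state. The names RoundingModes and changeRoundingOnWorker are hypothetical, and the program may need to be linked with thread support on some toolchains.

```cpp
#include <cfenv>
#include <thread>

struct RoundingModes {
    int callerBefore;  // calling thread's mode before the worker runs
    int workerMode;    // mode seen inside the worker after it changed it
    int callerAfter;   // calling thread's mode after the worker has finished
};

// Changes the rounding mode on a worker thread and reports what the calling
// thread sees. The calling thread's own mode is expected to stay untouched.
inline RoundingModes changeRoundingOnWorker() {
    RoundingModes result{};
    result.callerBefore = std::fegetround();
    std::thread worker([&result] {
        std::fesetround(FE_UPWARD);  // affects only this thread
        result.workerMode = std::fegetround();
    });
    worker.join();
    result.callerAfter = std::fegetround();
    return result;
}
```

The same reasoning applies in the other direction: a change made on the calling thread after the engine's worker threads already exist does not reach those threads.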

I can add a function that you can call after the engine initialization to reset it back to the default value, if you want, but I strongly recommend against that. Again, it simply makes bugs harder to find and makes some float operations slower unnecessarily.

So even in a final release product you do not want your float operations to slow down, because each time a denormal value is generated a kernel exception is produced to repair the denormal value.

by **Julio Jerez** » Sat Sep 26, 2020 10:04 am

if you sync, the divide by zero bug is fixed.

by **anothertime12** » Sat Sep 26, 2020 12:17 pm

Thank you for fixing so quickly... when JernejL updates the precompiled downloads I'll test it.

To address the two questions you have:

Float exceptions - I always find it difficult to guarantee rounding modes / exception traps and other control words because I'm often finding libraries which require them to be set a certain way.. and then a second library which needs something different - especially when some WILL set them as needed and then others won't - so they can actually break each other depending on the order you call their functions.

As for the per thread control word - I was under the impression that if setting the engine to only use 1 thread, it would use the main thread and update would be synchronous - do I need to set it to 0 for this behaviour?

by **Julio Jerez** » Sat Sep 26, 2020 1:19 pm

As for the per thread control word - I was under the impression that if setting the engine to only use 1 thread, it would use the main thread and update would be synchronous - do I need to set it to 0 for this behaviour?

In Newton 3.14, the engine always runs on its own thread. You can set it to run on the main thread or the parent thread by uncommenting DG_USE_THREAD_EMULATION in the dgType.h file.

Code:
    // by default newton run on a separate thread and
    // optionally concurrent with the calling thread,
    // it also uses a thread job pool for multi core systems.
    // define DG_USE_THREAD_EMULATION on the command line for
    // platform that do not support hardware multi threading or
    // if the and application want to control threading at the application level
    //#define DG_USE_THREAD_EMULATION

My suggestion is that you let it run on its own thread. It will not be async unless you select that mode. This way you get the round-denormal-to-zero behaviour, which is very important, and it will not conflict with other libraries. Believe me, having the exceptions on is a good thing.

The reason this is not the default in IEEE is that the standard predates modern hardware.
At the time the standard was written, float operations were done with software routines that took thousands of clock cycles per operation. Even on high-end workstations with math coprocessors, adds and multiplies took hundreds of cycles.
These days float operations are faster than integer operations and processors can execute tens of them, sometimes even a few dozen, per clock cycle. So a hardware interrupt is really bad.

I believe that in x64 mode, setting the control word is no longer allowed, so the function does nothing. But it still lets you set round-denormal-to-zero, which is really, really important for modern SIMD operations.

by **JernejL** » Tue Sep 29, 2020 3:11 am

It took some time because Julio has moved files in preparation for work on Newton 4, but I got the compilation project working and have new, fresh Newton 3 DLL binaries for you here:

https://github.com/JernejL/NewtonBinaries
