potential division by zero in BrushTireModel()

Report any bugs here and we'll post fixes

Moderators: Sascha Willems, Thomas

Postby JoeJ » Sat Aug 13, 2022 2:46 pm

Code:
   if (ndAbs(tireSpeed) > ndFloat32(0.9f))
   {
      // apply brush tire model only when center travels faster than 10 miles per hour
      const ndVector contactVeloc0(tireBody->GetVelocityAtPoint(contactPoint.m_point) - tireBody->GetVelocity() + tireVeloc);
      const ndVector contactVeloc1(otherBody->GetVelocityAtPoint(contactPoint.m_point));
      const ndVector relVeloc(contactVeloc0 - contactVeloc1);

      const ndFloat32 relSpeed = ndAbs(relVeloc.DotProduct(contactPoint.m_dir1).GetScalar());
      const ndFloat32 longitudialSlip = relSpeed / tireSpeed;

      // calculate side slip ratio
      const ndFloat32 sideSpeed = ndAbs(relVeloc.DotProduct(contactPoint.m_dir0).GetScalar());
      const ndFloat32 lateralSlip = sideSpeed / (relSpeed + ndFloat32(1.0f)); // Unhandled exception at 0x00007FF7E8DE8FC9 in Realtime.exe: 0xC0000090: Floating-point invalid operation (parameters: 0x0000000000000000, 0x0000000000009D31).



I just got this exception, but I can't reproduce it in debug mode to check whether relSpeed + 1 sums to zero.
I guess this can happen, so you might want to handle this case.
It started after I replaced a floor box with a larger test terrain composed of my mesh segments.
Now the floor is lower, so the car falls from a height of maybe 15 meters and hits the terrain at high speed. The crash happens on collision.

But sometimes the app crashes immediately when the simulation starts, while the car is still in the air, here:
Code:
inline void ndSkeletonContainer::ndNode::CalculateInertiaMatrix(ndSpatialMatrix* const bodyMassArray) const
{
   ndSpatialMatrix& bodyMass = bodyMassArray[m_index];
   
   bodyMass = ndSpatialMatrix(ndFloat32(0.0f)); // <- memory access violation from reading at 0
   if (m_body->GetInvMass() != ndFloat32(0.0f))
   {
//...
   }
}

I can't make sense of this. I assume it comes from accessing bodyMassArray, but I saw that m_index is 0, and the first mass element of the mass array was 0 too.
So that's pretty fishy. Maybe some memory corruption on my side. Ignore it.

I will now offset the car's position so it rests on the floor on startup, to see if I can get a stable simulation.
I assume both problems will go away for me then, so I just wanted to report them in case they ring a bell for you.

The car is pretty awesome, btw :D
JoeJ
 
Posts: 1453
Joined: Tue Dec 21, 2010 6:18 pm

Re: potential division by zero in BrushTireModel()

Postby Julio Jerez » Sat Aug 13, 2022 7:23 pm

Exception 0xC0000090 is an invalid operation; that's a sign of memory corruption or a bad instruction.
I assume your CPU supports SSE2, so that should not be a problem. Are you using cmake to build the library? If so, try commenting out this cmake option:

set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} /arch:SSE2")

I just found out that if I do not pass that option, every version of VS generates scalar SSE code.

I noticed this when dealing with the neural net code: without the option the compiler does generate SSE2 code, but only the scalar version. When passing that option, it does the vectorization and generates much better code.

Please see if removing SSE2 fixes it; if so, I can make it optional.
Julio Jerez
Moderator
 
Posts: 12249
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Re: potential division by zero in BrushTireModel()

Postby Julio Jerez » Sat Aug 13, 2022 8:45 pm

I looked at the code: both sideSpeed and relSpeed are the result of an abs function,
so they are either zero or positive.
I add 1.0 (m/s) (probably too high, and it makes the tire act like a soft truck tire)
to relSpeed in the side slip ratio to prevent the divisor from being too low.

So unless the values are undefined, it should never generate an exception.
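
The "unless the values are undefined" case can be made explicit. Below is a minimal sketch, not code from the library: `SafeLateralSlip` is a hypothetical helper that mirrors the lateral slip formula from the snippet above and rejects NaN or infinite inputs first. Returning zero slip for undefined input is an assumption made only for illustration.

```cpp
#include <cmath>

// Hypothetical helper, not part of Newton: same formula as the lateral slip
// in the quoted snippet, with a guard for non-finite inputs.
inline float SafeLateralSlip(float sideSpeed, float relSpeed)
{
    // abs() of NaN is still NaN, so the "zero or positive" argument
    // only holds when both inputs are finite.
    if (!std::isfinite(sideSpeed) || !std::isfinite(relSpeed))
    {
        return 0.0f; // assumption: treat undefined input as "no slip"
    }
    // Both terms are >= 0 here, so the denominator is at least 1.0.
    return std::fabs(sideSpeed) / (std::fabs(relSpeed) + 1.0f);
}
```

With finite inputs the denominator can never reach zero, so a floating-point fault at that line points at NaN or infinity coming in from earlier in the step.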

I added the SSE option as a cmake parameter; please try selecting NEWTON_NO_SSE2_INTRINSICS
to see if this is provoked by illegal instructions.

You need to sync.
Glad you like the vehicle.

Re: potential division by zero in BrushTireModel()

Postby JoeJ » Sun Aug 14, 2022 3:57 am

Julio Jerez wrote: exception 0xC0000090 is an invalid operation, that's a sign of memory corruption. I assume your CPU supports SSE2 so that should not be a problem. Are you using cmake to build the library?


I have a Ryzen 2700, which has SSE2 and AVX.
But I'm not using cmake or a DLL. Instead I have a local copy of your source and added it to my Visual Studio project.
Can I perhaps disable SSE2 with some preprocessor define?

Julio Jerez wrote: I look at the code, and if you look at it both sideSpeed and relSpeed are the result of an abs function
so they are either zero or positive.

I didn't notice that, so it's probably a NaN.
Not sure if I can still reproduce this crash. After repositioning the car it was gone.
I've noticed similar things when picking up the car with the picking joint and will keep an eye on that...

The second crash, the access violation, is still there.
But only if I use Clang. No problems with MSVC.
This reminded me that I had a similar case just a few weeks ago.
It was another project, and I got a crash indicating memory corruption, with the debug info shown probably being bogus.
I came to the conclusion that my VS installation or its current Clang support is broken, and that I can't use Clang at the moment. Maybe some VS update or a future reinstall will fix it.
It has felt broken since I updated to VS 2022.

Now the same thing has happened with my realtime / Newton project.
So I'll just keep using MSVC for now and ignore it.
But I'll try to disable SSE2, just to see what happens...

Re: potential division by zero in BrushTireModel()

Postby Julio Jerez » Sun Aug 14, 2022 6:25 am

Oh, I see. I have not compiled with Clang other than to make sure it builds.

On the SSE2 question: if you are not using cmake and are building the project by hand, then it will not be using SSE2 unless you forced it explicitly, since SSE2 is not a default option.
I select it in cmake because Visual Studio will only vectorize SIMD code when that option is selected. Otherwise it uses the scalar instructions.

The library uses SSE2 SIMD intrinsics, but only in ndVector and ndBigVector. There is also a lot of generic linear algebra code that is straight vanilla C++ templates, and that code can benefit a lot from compiler autovectorization because it works on long sequences of vectors of floats, where compiler overhead is negligible.

You can try printf in release mode to see if the numbers are becoming NaN or infinite.
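
The printf check suggested above could look roughly like this. `CheckFinite` is a hypothetical helper name, not something from the library; it only uses `std::isnan`, `std::isinf` and `std::printf` from the standard library.

```cpp
#include <cmath>
#include <cstdio>

// Hypothetical debug helper: reports a NaN or infinite value by name.
// Returns true when the value is finite, false otherwise.
inline bool CheckFinite(const char* name, float value)
{
    if (std::isnan(value))
    {
        std::printf("%s is NaN\n", name);
        return false;
    }
    if (std::isinf(value))
    {
        std::printf("%s is infinite (%f)\n", name, value);
        return false;
    }
    return true;
}
```

For example, calling `CheckFinite("relSpeed", relSpeed);` right before the division in the tire model would show whether the bad value is already present at that point.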

Re: potential division by zero in BrushTireModel()

Postby JoeJ » Sun Aug 14, 2022 6:39 am

Ah ok. I still need to experiment with multithreading and the SSE solver; currently I have neither enabled.
JoeJ wrote: I've noticed similar things when picking up the car with the picking joint and will keep an eye on that...

That's just an assert.
If I pick up the car with a custom joint similar to the Kinematic Controller, I get the assert at the end of ndSkeletonContainer::InitLoopMassMatrix():
Code:
dAssert(dTestPSDmatrix(m_auxiliaryRowCount - m_blockSize, m_auxiliaryRowCount, &m_massMatrix11[m_auxiliaryRowCount * m_blockSize + m_blockSize]));


If I comment it out, everything seems fine. I can lift the car, push it against static geometry, let it fall down and drive. No jitter or anything bad to see.

Should I worry about the assert? The picking joint forces are not too high, I think. I can barely lift the car up because the max friction rows have low limits.

Re: potential division by zero in BrushTireModel()

Postby Julio Jerez » Sun Aug 14, 2022 12:04 pm

That could be serious, because it shows that somehow the joint arrangement is producing an ill-conditioned or a singular matrix.
If the matrix were singular, meaning one or more rows are a linear combination of other rows, the solver would blow up, so that does not seem to be the problem.

If the matrix is ill-conditioned, meaning two rows are almost parallel, then the solver will iteratively increase the main diagonal of all rows until it is capable of factorizing it. This results in a softer, damped solution.
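
The idea of increasing the main diagonal until the factorization succeeds can be sketched as follows. This is not the library's code: it is a generic Cholesky attempt on a small row-major matrix, and the names `TryCholesky`, `RegularizedCholesky`, the pivot tolerance and the 1% growth per round are all assumptions chosen for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// In-place Cholesky of the n x n row-major matrix 'a' (lower triangle).
// Returns false as soon as a pivot is not clearly positive, which means
// the matrix is not positive definite to working precision.
inline bool TryCholesky(std::vector<double>& a, std::size_t n)
{
    for (std::size_t j = 0; j < n; ++j)
    {
        double diag = a[j * n + j];
        for (std::size_t k = 0; k < j; ++k) diag -= a[j * n + k] * a[j * n + k];
        if (!(diag > 1.0e-12)) return false; // assumed tolerance
        const double pivot = std::sqrt(diag);
        a[j * n + j] = pivot;
        for (std::size_t i = j + 1; i < n; ++i)
        {
            double sum = a[i * n + j];
            for (std::size_t k = 0; k < j; ++k) sum -= a[i * n + k] * a[j * n + k];
            a[i * n + j] = sum / pivot;
        }
    }
    return true;
}

struct RegularizedResult
{
    int rounds;                 // diagonal increases needed, -1 on failure
    std::vector<double> factor; // lower triangle holds the Cholesky factor
};

// Scale every diagonal entry a little more each round until Cholesky works.
inline RegularizedResult RegularizedCholesky(std::vector<double> matrix, std::size_t n, int maxRounds = 32)
{
    double scale = 1.0;
    for (int round = 0; round <= maxRounds; ++round)
    {
        std::vector<double> work(matrix);
        for (std::size_t i = 0; i < n; ++i) work[i * n + i] *= scale;
        if (TryCholesky(work, n)) return RegularizedResult{ round, work };
        scale *= 1.01; // assumed growth factor per round
    }
    return RegularizedResult{ -1, matrix };
}
```

A well-conditioned matrix factorizes in round 0. A matrix with two identical rows needs at least one diagonal increase, and the factor it yields belongs to a slightly different, softer system than the one requested, which is the damping effect described above.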

Normally the pick joint will add a closed loop to any hierarchy of joints and bodies,
so for that there are four types of joints that the solver can use as a hint for how to treat joints when building the hierarchy. Those types are:

Code:
enum ndJointBilateralSolverModel
{
   m_jointIterativeSoft,
   m_jointkinematicOpenLoop,
   m_jointkinematicCloseLoop,
   m_jointkinematicAttachment,
   m_jointModesCount
};


m_jointIterativeSoft: for joints that do not go to the joint solver. Sometimes these are useful for soft effects.

m_jointkinematicOpenLoop: for joints that form an open loop. In general the solver makes a best guess at which joints are open loop when more than one joint can be selected. But when the application knows which of two candidate joints should close the loop, it can tag that joint as close loop so that it is not selected. This is the default type for bilateral joints.

m_jointkinematicCloseLoop: for when a joint forms a closed loop and the application wants this particular joint to be the loop joint. An example is the vehicle.
The chassis has a motor, which is a rigid body attached to the chassis with a joint. It also has a gear box,
which is also a rigid body attached to the chassis with a joint.
Then there is a clutch and torque converter, which is a joint that connects the motor and the gear box.
This joint will form a loop, but it is not desirable that the motor joint or the gear box joint is selected as the close loop, so the joint is tagged as m_jointkinematicCloseLoop so that the conflict is resolved in favor of the motor and gear box joints.

m_jointkinematicAttachment: a special case of m_jointkinematicCloseLoop. Basically this makes the joint a close loop, but the solver does not have to reconstruct the factorizations.

If you made your own kinematic pick joint and you did not subclass from ndJointKinematicController, then the joint will be tagged as m_jointkinematicOpenLoop.
So if you are not doing it already, you can just make the joint a kinematic attachment by calling this
SetSolverModel(m_jointkinematicAttachment);

in the constructor.
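
As a rough illustration of where that call goes, here is a self-contained sketch. Only the enum and the `SetSolverModel(m_jointkinematicAttachment)` call come from this thread. `JointBaseStandIn` and `CustomPickJoint` are hypothetical stand-ins so the snippet compiles without the library; a real joint would derive from the library's own joint base class instead.

```cpp
// Enum copied from the listing above.
enum ndJointBilateralSolverModel
{
    m_jointIterativeSoft,
    m_jointkinematicOpenLoop,
    m_jointkinematicCloseLoop,
    m_jointkinematicAttachment,
    m_jointModesCount
};

// Hypothetical stand-in for the library's joint base class.
class JointBaseStandIn
{
public:
    void SetSolverModel(ndJointBilateralSolverModel model) { m_solverModel = model; }
    ndJointBilateralSolverModel GetSolverModel() const { return m_solverModel; }

private:
    // Open loop is described above as the default for bilateral joints.
    ndJointBilateralSolverModel m_solverModel = m_jointkinematicOpenLoop;
};

// Hypothetical custom pick joint: tags itself as an attachment on construction.
class CustomPickJoint : public JointBaseStandIn
{
public:
    CustomPickJoint()
    {
        SetSolverModel(m_jointkinematicAttachment);
    }
};
```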

Re: potential division by zero in BrushTireModel()

Postby JoeJ » Sun Aug 14, 2022 2:24 pm

I had it set as m_jointkinematicAttachment already.
I have now changed this to m_jointIterativeSoft, and there are no asserts and it works fine.

Maybe my joint causes parallel rows, because I do not use a stable parent body orientation or world-space axes for the rows.
Instead I get the primary axis from the linear / angular errors, and the second from projecting velocities onto that plane.
So the row directions fluctuate if the error is small.

I need the alignment to the error to prevent the joint from wobbling.
On the primary error row I solve the control problem of driving to the target position and velocity using a given constant acceleration. It works very well: it never overshoots and feels robust and smart. :)
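
The controller itself is not shown in this thread. As a guess at the general idea only, one common way to approach a target at rest without overshooting under a constant acceleration limit is to cap the approach speed at the value from which that deceleration still stops exactly at the target (from v^2 = 2ad). The names `StoppingSpeed` and `StepSpeed` are hypothetical, and a moving target would need the target velocity added on top.

```cpp
#include <cmath>

// Hypothetical helper: signed speed along the error axis from which a
// constant deceleration 'accel' (> 0) stops exactly at the target.
inline float StoppingSpeed(float error, float accel)
{
    // v^2 = 2 * a * d, with d = |error|
    const float speed = std::sqrt(2.0f * accel * std::fabs(error));
    return (error < 0.0f) ? -speed : speed;
}

// One velocity update: move toward the stopping speed without changing
// the speed by more than accel * timestep.
inline float StepSpeed(float currentSpeed, float error, float accel, float timestep)
{
    const float target = StoppingSpeed(error, accel);
    const float maxDelta = accel * timestep;
    float delta = target - currentSpeed;
    if (delta > maxDelta) delta = maxDelta;
    if (delta < -maxDelta) delta = -maxDelta;
    return currentSpeed + delta;
}
```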

But maybe I should switch to stable world-space row directions if the error is already small.
Also, if I remove the picking joint, the picked bodies often receive some kind of impulse, nudging them in some unexpected direction. Maybe I have a bug somewhere...
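
The switch to stable row directions for small errors could be sketched like this. It is an illustration under assumptions, not the joint's actual code: `Vec3`, `PrimaryRowDirection`, the fixed world X axis as fallback and the threshold of 1.0e-3 are all hypothetical choices.

```cpp
#include <cmath>

struct Vec3
{
    float x, y, z;
};

inline float Length(const Vec3& v)
{
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// Pick the primary row direction: the normalized error when it is large
// enough to give a stable direction, otherwise a fixed world-space axis.
inline Vec3 PrimaryRowDirection(const Vec3& error, float minError = 1.0e-3f)
{
    const float length = Length(error);
    if (!(length > minError)) // also true when length is NaN
    {
        return Vec3{ 1.0f, 0.0f, 0.0f }; // assumed fallback: world X axis
    }
    return Vec3{ error.x / length, error.y / length, error.z / length };
}
```

Keeping the fallback axis fixed means the rows no longer rotate with noise in a near-zero error, which is the fluctuation suspected of producing nearly parallel rows.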

