How to debug random state after kinematic/dynamic collision

Report any bugs here and we'll post fixes

Moderators: Sascha Willems, Thomas

How to debug random state after kinematic/dynamic collision

Postby oliver » Wed Mar 08, 2017 5:20 am

Hi,

tl;dr: which file/function should I start with, if I wanted to find out where the body
transform (or velocity) gets corrupted?

I have an obscure bug where the collision between a kinematic and dynamic body
results in random states for both bodies, ie their transforms, linear/angular
velocities contain what seems to be random numbers.

So far I have been unable to reproduce this problem in a standalone program,
but I can often trigger it as part of a bigger application. My question to the list:
which Newton methods would make for a good starting point to populate with
debug assert on eg the velocity (I know it must be small, no matter what)?

Some background on the setup: the collision shapes are compounds, even though
they contain only one shape at the moment (a sphere and box, respectively). The
dynamic body is initially at rest. Neither body has an angular velocity, and
both have unit mass and inertia. The simulation (slowly) moves the kinematic
sphere towards the dynamic box. Everything is fine until they touch. Once they
do, the state (ie transform, linear/angular velocity) of _both_ bodies is
corrupt.

This does not always happen on the first collision, but it rarely survives 3
collisions. In rare cases, the bug does not trigger at all, but the kinematic
body pushes the dynamic one, as intended.

I have also tried to rewind the simulation, ie. store the transforms and
velocities at each step, and reload it when the error occurs (with- and
without invalidating the cache). It always triggers again. However, when
I load those values at the start of the program, the bug does not trigger.

The bug has never triggered when both bodies are dynamic.

I am a bit at a loss with this. The corrupt state is the only lead I have
so far, hence my question as to where I should start placing asserts. Any
suggestions would be greatly appreciated.

Btw, I use a debug build with threads disabled. I am on Ubuntu 16.04, my
Newton version is ~2 weeks old.
oliver
 
Posts: 36
Joined: Sat Sep 17, 2016 10:31 am

Re: How to debug random state after kinematic/dynamic collis

Postby Julio Jerez » Wed Mar 08, 2017 9:47 am

the first thing is to see if It can be reproduced.
one trick I use to do debug difficult bugs to write a log,

in file ../coreLibrary_300\source\physics\dgBroadPhase.cpp
find this code
Code: Select all
#if 0
   static dgInt32 xxx;
   xxx ++;
   for (dgBodyMasterList::dgListNode* node = masterList->GetLast(); node; node = node->GetPrev()) {
      dgDynamicBody* const body = (dgDynamicBody*)node->GetInfo().GetBody();
      if ((body->GetType() == dgBody::m_dynamicBody) && (body->GetInvMass().m_w > dgFloat32 (0.0f))) {
         #if 0


enable the forts and second if def and that will record the state of each active body as I simulates.
then when the bug happens, change the second #if to zero, and see if the bug is reproducible.

if the bug happens then is a matter of finding the pair that causes the bug, for the I use the body unique ID to do conditionals dgTrace of its state until I can narrow the frame that the bug occurs.

you can also use the unique ID to identify If the bug happen to the same body, but inspecting it when it crashes. that was you can narrow the log even further
Julio Jerez
Moderator
Moderator
 
Posts: 12249
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Re: How to debug random state after kinematic/dynamic collis

Postby oliver » Wed Mar 08, 2017 10:44 pm

Hi Julio,

thank you for the suggestion. Here is what I have found so far, in case you have any ideas.

For debugging purposes I stripped the code from all materials, never modify the
sleeping state, and never destroy bodies/cshapes. Now the code occasionally
segfaults in dgMemoryAllocator because "cashe->m_next" points to invalid memory.


Code: Select all
#0  0x00007fffe0854b7c in dgMemoryAllocator::Malloc (this=0x4852ac0, memsize=432)
    at /home/oliver/projects/ds2/dependencies/newton-dynamics/coreLibrary_300/source/core/dgMemory.cpp:283
#1  0x00007fffe0854f68 in dgMalloc (size=432, allocator=0x4852ac0) at /home/oliver/projects/ds2/dependencies/newton-dynamics/coreLibrary_300/source/core/dgMemory.cpp:505
#2  0x00007fffe086bf8e in dgBody::operator new (size=432, allocator=0x4852ac0)
    at /home/oliver/projects/ds2/dependencies/newton-dynamics/coreLibrary_300/source/physics/dgBody.h:82
#3  0x00007fffe0868cd7 in dgWorld::CreateKinematicBody (this=0x4852c80, collision=0x8d0b540, matrix=...)
    at /home/oliver/projects/ds2/dependencies/newton-dynamics/coreLibrary_300/source/physics/dgWorld.cpp:536
#4  0x00007fffe093cbf2 in NewtonCreateKinematicBody (newtonWorld=0x4852c80, collisionPtr=0x8d0b540, matrixPtr=0x8d36fc0)
    at /home/oliver/projects/ds2/dependencies/newton-dynamics/coreLibrary_300/source/newton/Newton.cpp:4292


An invalid pointer would explain the random values, and how the simulation goes bonkers.

The bug, if it occurs, only materialises via CreateKinematicBody, never via
CreateDynamicBody. However, their memory allocation strategy is identical, as
far as I can tell from your code.

I will keep you posted.
oliver
 
Posts: 36
Joined: Sat Sep 17, 2016 10:31 am

Re: How to debug random state after kinematic/dynamic collis

Postby Julio Jerez » Thu Mar 09, 2017 8:00 am

can you either hack any of the demos in the sandbox, of maybe serialize your scene and play in the sand body.

Is possible that with all the changes to 3.14 that I left a bug regarding the treatment the kinematic bodies. kinematic bodies are assumed to not make it to the solver, but is possible that with the new changes they are and when they get casted to dynamics they override some data they should not.
since the are few demos using kinematic bodies is difficult to say what could be wrong.
Julio Jerez
Moderator
Moderator
 
Posts: 12249
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Re: How to debug random state after kinematic/dynamic collis

Postby oliver » Fri Mar 10, 2017 12:26 am

The sandbox does not work in my dev environment (library conflict).

Based on your suggestion, I did serialise the scene before and after the crash.
When I load that scene it never causes a problem.

My setup code still depends on external libraries, and may well contain a
pointer bug that is responsible for all this. A dependency free version is WIP,
hopefully later today, or tomorrow.

One thing I did notice, however, is that Valgrind reports a conditional jump
in the Newton solver based on an uninitialised variable.

The jump in question is in dgContact.cpp:

Code: Select all
dgFloat32 restitution = (vRel <= dgFloat32 (0.0f)) ? (dgFloat32 (1.0f) + row->m_restitution) : dgFloat32 (1.0f);


Not sure if this is relevant, but I thought you might want to know. Here is the Valgrind context (the
line numbers may not be 100% accurate since I added some comments to your code while reading it):

Code: Select all
==23595== Conditional jump or move depends on uninitialised value(s)
==23595==    at 0x4FC559D: dgContact::JointAccelerations(dgJointAccelerationDecriptor*) (dgContact.cpp:313)
==23595==    by 0x4FD1D7D: dgWorldDynamicUpdate::CalculateClusterReactionForces(dgBodyCluster const*, int, float, float) const (dgWorldDynamicsSimpleSolver.cpp:998)
==23595==    by 0x4FCBEBE: dgWorldDynamicUpdate::ResolveClusterForces(dgBodyCluster*, int, float) const (dgWorldDynamicsSimpleSolver.cpp:48)
==23595==    by 0x4FC97C9: dgWorldDynamicUpdate::CalculateClusterReactionForcesKernel(void*, void*, int) (dgWorldDynamicUpdate.cpp:774)
==23595==    by 0x4EEB30E: dgThreadHive::QueueJob(void (*)(void*, void*, int), void*, void*) (dgThreadHive.cpp:166)
==23595==    by 0x4FC64FD: dgWorldDynamicUpdate::UpdateDynamics(float) (dgWorldDynamicUpdate.cpp:142)
==23595==    by 0x4F4CCA1: dgWorld::StepDynamics(float) (dgWorld.cpp:920)
==23595==    by 0x4F4CEB7: dgWorld::RunStep() (dgWorld.cpp:967)
==23595==    by 0x4F4D058: dgWorld::Update(float) (dgWorld.cpp:995)
==23595==    by 0x50185D3: Newton::UpdatePhysics(float) (NewtonClass.cpp:53)
==23595==    by 0x50199A6: NewtonUpdate (Newton.cpp:648)
==23595==    by 0x461A57: main (kinbug.cpp:131)
==23595==  Uninitialised value was created by a heap allocation
==23595==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==23595==    by 0x4F3816C: dgGlobalAllocator::__malloc__(unsigned int) (dgMemory.cpp:105)
==23595==    by 0x4F377FA: dgMemoryAllocator::MallocLow(int, int) (dgMemory.cpp:202)
==23595==    by 0x4F37975: dgMemoryAllocator::Malloc(int) (dgMemory.cpp:248)
==23595==    by 0x4F37F67: dgMalloc(unsigned long, dgMemoryAllocator*) (dgMemory.cpp:508)
==23595==    by 0x4F41355: dgCollision::operator new(unsigned long, dgMemoryAllocator*) (dgCollision.h:231)
==23595==    by 0x4F4AC09: dgWorld::dgWorld(dgMemoryAllocator*, int) (dgWorld.cpp:308)
==23595==    by 0x501829F: Newton::Newton(float, dgMemoryAllocator*, int) (NewtonClass.cpp:40)
==23595==    by 0x50193B1: NewtonCreateEx (Newton.cpp:183)
==23595==    by 0x464A0D: NS_DS2Newton::DS2Newton::DS2Newton(Events::RelaySet const*, int) (newton_ds2.cpp:92)
==23595==    by 0x4619D3: main (kinbug.cpp:99)
oliver
 
Posts: 36
Joined: Sat Sep 17, 2016 10:31 am

Re: How to debug random state after kinematic/dynamic collis

Postby oliver » Fri Mar 10, 2017 4:27 am

I now have a bare bones Newton scene with nothing but a kinematic sphere, and a
dynamic box. The bodies touch, and I run a single time step. Here is what I see.

Valgrind is happy if the kinematic body is not collidable.

Valgrind throws many "Conditional jump or move depends on uninitialised
value(s)" warnings when the kinematic body _is_ collidable.

kinematic bodies are assumed to not make it to the solver


Do you think the Valgrind errors indicate that this is precisely what happens?
oliver
 
Posts: 36
Joined: Sat Sep 17, 2016 10:31 am

Re: How to debug random state after kinematic/dynamic collis

Postby Julio Jerez » Fri Mar 10, 2017 6:50 am

umm you say that valging indicate a error on this line

Code: Select all
dgJacobianMatrixElement* const row = &rowMatrix[k];

         dgVector relVeloc (row->m_Jt.m_jacobianM0.m_linear.CompProduct4(bodyVeloc0) + row->m_Jt.m_jacobianM0.m_angular.CompProduct4(bodyOmega0) + row->m_Jt.m_jacobianM1.m_linear.CompProduct4(bodyVeloc1) + row->m_Jt.m_jacobianM1.m_angular.CompProduct4(bodyOmega1));
         dgFloat32 vRel = relVeloc.AddHorizontal().GetScalar();
         dgFloat32 aRel = row->m_deltaAccel;

         if (row->m_normalForceIndex < 0) {

// valgring said this is wrong but this is impossible
            dgFloat32 restitution = (vRel <= dgFloat32 (0.0f)) ? (dgFloat32 (1.0f) + row->m_restitution) : dgFloat32 (1.0f);


but that is no possible, if you look a the variable involve, they are local variables define and initialized right before that line, with no possible path.

vRel is
dgFloat32 vRel = relVeloc.AddHorizontal().GetScalar();

and the other variable is row->m_restitution which is read from the contact material, if that is wrong then the entire row entry will also be wrong.

I do not see how another library can makes that malfunction since this is deep inside the solver unless is deleting one of the two bodies from a callback.

since serialization appear to fix it, can you can you run windows in your system so that you can hack that scene by hand and try it in the sandbox?

Do you think the Valgrind errors indicate that this is precisely what happens?


no, I do not think so, the code there is not addressing bodies is addressing information from a contact joint.
When contact are created the get the information from the contact calculation.
the joint has pointer to the two point to the two bodies, and also index to index of the two bodies proxies representation the bodies in the solver pass.
function void dgContact::JointAccelerations(dgJointAccelerationDecriptor* const params)
is a constraint callback the calculate the jacobian derivative of the contact for the solver mass matrix. it only references the bodies velocities at the beginning.
the only way if wool be wrong is if the whole contact joint is invalided, which could be possible but very unlikely.

can you inspect the operator this, when the crash happens and see if the member functions of the
are valid.
Julio Jerez
Moderator
Moderator
 
Posts: 12249
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Re: How to debug random state after kinematic/dynamic collis

Postby oliver » Fri Mar 10, 2017 11:56 pm

I am not a Valgrind expert, and false positives are certainly possible.
Nevertheless, I thought this warning would be easy to fix by initialising
the memory returned by the memory allocator. To that end I added
`memset(ptr, 0, size)` to dgMemoryAllocator::MallocLow so that it now reads

Code: Select all
void* const ptr = m_malloc(dgUnsigned32 (size));

// Oli: initialise memory.
memset(ptr, 0, size);


Now Valgrind does not complain anymore.

However, I can make the program reliably segfault with an initial value of 99
instead of zero (ie `memset(ptr, 99, size)`). Should the initial value matter?
Have I initialised in the wrong spot?

since serialization appear to fix it, can you can you run windows in your system so that you can hack that scene by hand and try it in the sandbox?


Sorry, I cannot help you with Windows or the sandbox, but the program I
used for the last test is below (it is really just a kinematic sphere touching
a dynamic box, and a single time step).

Code: Select all
#include <stdio.h>
#include "Newton.h"
#include "dVector.h"
#include "dMatrix.h"

using namespace std;


int main (int argc, const char * argv[]) {
  NewtonWorld *world = NewtonCreate();

  // Dummy transform matrix.
  dMatrix tm(dGetIdentityMatrix());

  // Create kinematic sphere.
  NewtonCollision *cs = nullptr;
  cs = NewtonCreateSphere(world, 2, 0, nullptr);
  auto kin = NewtonCreateKinematicBody(world, cs, &tm[0][0]);
  NewtonDestroyCollision(cs);

  // Create dynamic box.
  cs = NewtonCreateBox(world, 0.5, 0.5, 0.5, 0, nullptr);
  auto dyn = NewtonCreateDynamicBody(world, cs, &tm[0][0]);
  NewtonDestroyCollision(cs);
  cs = nullptr;

  // Both bodies have unit mass and inertia.
  NewtonBodySetMassMatrix(kin, 1, 1, 1, 1);
  NewtonBodySetMassMatrix(dyn, 1, 1, 1, 1);

  // Place the dynamic box to the right of the sphere, such that they touch.
  tm[3] = dVector(0, 0, 0);
  NewtonBodySetMatrix(kin, &tm[0][0]);
  tm[3] = dVector(2, 0, 0);
  NewtonBodySetMatrix(dyn, &tm[0][0]);

  // If enabled, Valgrind reports a conditional jump based on an uninitialised value.
  #if 1
  NewtonBodySetCollidable(kin, true);
  #endif

  // Step the world.
  NewtonUpdate(world, 1.0f / 60);

  // Tear down.
  NewtonMaterialDestroyAllGroupID(world);
  NewtonDestroyAllBodies(world);
  NewtonDestroy(world);

  return 0;
}

oliver
 
Posts: 36
Joined: Sat Sep 17, 2016 10:31 am

Re: How to debug random state after kinematic/dynamic collis

Postby Julio Jerez » Sat Mar 11, 2017 8:09 am

oh but that small program that you made helps a lot.

I paste in the tutorial part of the and I have the seg fault yet by there are some serious asserts that you should have gotten before that segment fault.

this is the first one

Code: Select all
         dgDynamicBody* const body = (dgDynamicBody*)bodyArray[i].m_body;
         dgAssert(body->IsRTTIType(dgBody::m_dynamicBodyRTTI));
         if (!body->m_equilibrium) {
            dgAssert (body->m_invMass.m_w > dgFloat32 (0.0f));
            body->AddDampingAcceleration(timestep);
            body->CalcInvInertiaMatrix ();
//            body->ApplyGyroTorque();
            internalForces[i].m_linear = dgVector::m_zero;
            internalForces[i].m_angular = dgVector::m_zero;
         } else {
            internalForces[i].m_linear = body->m_externalForce.CompProduct4(dgVector::m_negOne) ;
            internalForces[i].m_angular = body->m_externalTorque.CompProduct4(dgVector::m_negOne) ;
         }


I am debugging those and see if the, than for eth program to expose the bug.

I also added memory initialization to a signature but only in debug mode, we do not really wan to initialize memory for any other reason than for debug purposes.

on the dgAssert it soudl resolve to assert on all platform expect windows.

Code: Select all
#ifdef DG_DISABLE_ASSERT
   #define dgAssert(x)
#else
   #if defined (_WIN_32_VER) || defined (_WIN_64_VER)
      #define dgAssert(x) _ASSERTE(x)
   #else
      #ifdef _DEBUG
         #define dgAssert(x) assert(x)
      #else
         #define dgAssert(x)
      #endif
   #endif
#endif


do you have DG_DISABLE_ASSERT defines?
Julio Jerez
Moderator
Moderator
 
Posts: 12249
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Re: How to debug random state after kinematic/dynamic collis

Postby Julio Jerez » Sat Mar 11, 2017 10:18 am

ok I pasted your program in the tutorial
../newton-dynamics\applications\tutorialsSDK300\NewtonTutorials
this allowed me to fix all the place when I assumed the body was dynamics and make the fix.

I said before that kinematic bodies do not make to the solve, I was wrong, what I meant was that kinematic bodies do make to the solver as dynamics bodies because the have mass, by the do no make to the integrations phase.
this is the only way I could figure to treat a static body as it was dynamics for force calculation with other objects, by treat as static what moving it around by the user.

because the island has static and dynamic bodies, the bug was that dynamics bodies had the acceleration variable is the wrong place and when casted to dynamics, I overrode memory.
the solution was to move the coming data to the base class.

anyway I believe the problem is fixed now,
please sync and try again.

again, Thanks for the repro program, that was in fact a bad bug.
question what are you doing with kinematic bodies?
Julio Jerez
Moderator
Moderator
 
Posts: 12249
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Re: How to debug random state after kinematic/dynamic collis

Postby oliver » Sun Mar 12, 2017 4:32 am

Thank you for fixing this. The latest version now behaves as it should in terms of kinematic bodies. However, it broke the collision response for fast moving bodies.

I turn on CCD for all my bodies and they used to collide just fine. Now, they freeze upon contact.

Here is the program. It creates two identical boxes, separated by 5m on the x-axis. The left box has a velocity of 30 m/s and will smash into the right box after 9 time steps. Then both bodies just freeze.

Code: Select all
#include <stdio.h>
#include "Newton.h"
#include "dVector.h"
#include "dMatrix.h"

using namespace std;


int main (int argc, const char * argv[]) {
  NewtonWorld *world = NewtonCreate();

  // Dummy transform matrix.
  dMatrix tm(dGetIdentityMatrix());

  // Create two identical boxes.
  NewtonCollision *cs = nullptr;
  cs = NewtonCreateBox(world, 0.5, 0.5, 0.5, 0, nullptr);
  auto left = NewtonCreateDynamicBody(world, cs, &tm[0][0]);
  NewtonDestroyCollision(cs);

  cs = NewtonCreateBox(world, 0.5, 0.5, 0.5, 0, nullptr);
  auto right = NewtonCreateDynamicBody(world, cs, &tm[0][0]);
  NewtonDestroyCollision(cs);
  cs = nullptr;

  // Both bodies have unit mass and inertia.
  NewtonBodySetMassMatrix(left, 1, 1, 1, 1);
  NewtonBodySetMassMatrix(right, 1, 1, 1, 1);

  // Activate CCD.
  NewtonBodySetContinuousCollisionMode(left, 1);
  NewtonBodySetContinuousCollisionMode(right, 1);

  // Place the boxes side by side on x-axis.
  tm[3] = dVector(0, 0, 0);
  NewtonBodySetMatrix(left, &tm[0][0]);
  tm[3] = dVector(5, 0, 0);
  NewtonBodySetMatrix(right, &tm[0][0]);

  // Assign velocity to left box such that it will hit the right box.
  dVector vLin(30, 0, 0);
  NewtonBodySetVelocity(left, &vLin[0]);

  // Step the world and print the x-position of both bodies.
  for (int ii=0; ii < 20; ii++) {
    dVector lpos, rpos;
    NewtonUpdate(world, 1.0f / 60);
    NewtonBodyGetPosition(left, &lpos[0]);
    NewtonBodyGetPosition(right, &rpos[0]);
    printf("%02d Left=%.3f   Right=%.3f\n", ii, lpos[0], rpos[0]);
  }

  // Tear down.
  NewtonMaterialDestroyAllGroupID(world);
  NewtonDestroyAllBodies(world);
  NewtonDestroy(world);

  return 0;
}


Here is the output I get for the x-position of the left and right body after each time step. Note how
the left one simply stops moving at step 8, and the right one never picks up any velocity at all from
the impact.

00: Left=0.499 Right=5.000
01: Left=0.997 Right=5.000
02: Left=1.494 Right=5.000
03: Left=1.990 Right=5.000
04: Left=2.484 Right=5.000
05: Left=2.978 Right=5.000
06: Left=3.471 Right=5.000
07: Left=3.963 Right=5.000
08: Left=4.453 Right=5.000
09: Left=4.453 Right=5.000
10: Left=4.453 Right=5.000


The problem goes away when I disable CCD, or lower the velocity. Is that now the expected behaviour?
oliver
 
Posts: 36
Joined: Sat Sep 17, 2016 10:31 am

Re: How to debug random state after kinematic/dynamic collis

Postby Julio Jerez » Sun Mar 12, 2017 9:37 am

Oh yes I forget that to get is going I comment put the ccd solver part.
I will uncomment it today, they is some special consideration I need to make there
are you getting the assert? there is one right at the beginning.
Code: Select all
         if (body->IsRTTIType (dgBody::m_dynamicBodyRTTI)) {
            dgAssert (body->m_invMass.m_w);
            dgAssert (0);
/*
            const dgFloat32 accel2 = body->m_accel.DotProduct3(body->m_accel);
            const dgFloat32 alpha2 = body->m_alpha.DotProduct3(body->m_alpha);
            const dgFloat32 speed2 = body->m_veloc.DotProduct3(body->m_veloc);
            const dgFloat32 omega2 = body->m_omega.DotProduct3(body->m_omega);
Julio Jerez
Moderator
Moderator
 
Posts: 12249
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Re: How to debug random state after kinematic/dynamic collis

Postby oliver » Sun Mar 12, 2017 8:49 pm

The original dgAssert did not work on Linux. I (think I) fixed this (PR on GH).

Now the assert triggers, assuming you mean the one in dgDynamicsSimpleSolver.
oliver
 
Posts: 36
Joined: Sat Sep 17, 2016 10:31 am

Re: How to debug random state after kinematic/dynamic collis

Postby oliver » Tue Mar 14, 2017 1:46 am

I saw you added the CCD part again. Does this mean I can re-enable it for all my bodies, or are you still
working on it?
oliver
 
Posts: 36
Joined: Sat Sep 17, 2016 10:31 am

Re: How to debug random state after kinematic/dynamic collis

Postby Julio Jerez » Tue Mar 14, 2017 8:39 am

yes, please try again.
Julio Jerez
Moderator
Moderator
 
Posts: 12249
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Next

Return to Bugs and Fixes

Who is online

Users browsing this forum: No registered users and 2 guests

cron