Segfault with too many bodies

Postby oliver » Sat Sep 24, 2016 1:56 am

Hi,

I am learning Newton at the moment. It all works well for as long as the world
does not contain too many bodies, but segfaults beyond a certain limit.

For me, that limit is ~40k non-penetrating bodies, or ~5k penetrating bodies.
The segfault happens in `dgBroadPhase::UpdateContactsBroadPhaseEnd` or
`dgBroadPhase::ImproveFitness`, respectively.

Am I doing something wrong? Is there a limit to the number of bodies one
can/should have in the world?

The code below reproduces this for me (Intel CPU, 16GB Ram, Ubuntu 16.04, Debug
build), where `spacing` is either 10.0f or 0.1f.

Code: Select all
#include <stdio.h>
#include <stdlib.h>
#include <vector>
#include "Newton.h"


int main (int argc, const char * argv[])
{
  // Specify the number of bodies to create, and their spacing.
  const int numBodies = 100000;
  const dFloat spacing = 10.0f;

  NewtonWorld* const world = NewtonCreate();
  std::vector<NewtonCollision*> cshapes(numBodies);
  std::vector<NewtonBody*> bodies(numBodies);
  std::vector<dFloat*> TMs(numBodies);

  for (int ii=0; ii < numBodies; ii++) {
    // Create neutral transform matrix for current body.
    dFloat   const ref_TM[16] = {
      1, 0, 0, 0,
      0, 1, 0, 0,
      0, 0, 1, 0,
      0, 0 ,0, 1
    };
    TMs.at(ii) = new dFloat[16];
    for (int jj=0; jj < 16; jj++) TMs.at(ii)[jj] = ref_TM[jj];

    // Update the position to enforce inter-body spacing.
    TMs.at(ii)[13] = ii * spacing;

    // Create dynamic body from collision shape.
    cshapes.at(ii) = NewtonCreateBox(world, 1, 1, 1, 0, NULL);
    bodies.at(ii) = NewtonCreateDynamicBody(world, cshapes.at(ii), TMs.at(ii));
    NewtonBodySetMassMatrix(bodies.at(ii), 1.0f, 1, 1, 1);
  }

  // Segfaults here.
  NewtonUpdate(world, 1.0f/60);

  NewtonDestroyAllBodies(world);
  NewtonDestroy(world);

  return 0;
}
oliver
 
Posts: 36
Joined: Sat Sep 17, 2016 10:31 am

Re: Segfault with too many bodies

Postby JoeJ » Sat Sep 24, 2016 8:21 am

Try to destroy the collision shape after body creation:
Code: Select all
    // Create dynamic body from collision shape.
    NewtonCollision* const cshape = NewtonCreateBox(world, 1, 1, 1, 0, NULL); // no need to store this
    bodies.at(ii) = NewtonCreateDynamicBody(world, cshape, TMs.at(ii));
    NewtonBodySetMassMatrix(bodies.at(ii), 1.0f, 1, 1, 1);
    // Release the shape once the body has been created.
    NewtonDestroyCollision(cshape);


I don't know the internals and can't explain it exactly,
but I guess that if multiple bodies have the same shape, you can create them all from the same shape object to reduce memory costs (but still delete it after creating all the bodies).

Usually you should get an assert if you forget to delete a shape.
JoeJ
 
Posts: 1453
Joined: Tue Dec 21, 2010 6:18 pm

Re: Segfault with too many bodies

Postby Julio Jerez » Sat Sep 24, 2016 12:47 pm

What Joe says is correct: you need to release the shapes after you use them, otherwise you leave a memory leak. In this demo the leak is not that bad, though, because the engine caches shapes with the same properties, so in this case it only leaks one shape.

The program crashes because Newton uses the thread stack for some intermediate computations.
Since the test tries to simulate 100,000 bodies, the intermediate arrays are larger than the default thread stack size, and the function alloca simply overruns the stack.
I have not found a way to add an assert for that; neither Linux nor Windows provides debug functionality for stack allocations.

In Newton the engine solver runs on its own thread, and you control the thread stack size by changing the define
#define DG_ENGINE_STACK_SIZE (1024 * 1024)
in file ..\coreLibrary_300\source\physics\dgWorld.cpp

It is hard-coded because I never designed the engine to handle so many objects at once.

If you change the define to, say, (1024 * 1024 * 10),
that will commit 10 megabytes of stack, which should be sufficient.

If you are going to do such extreme scenes, maybe you can commit 100 megabytes of stack for the engine thread.
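
To illustrate the stack-size point, here is a minimal POSIX sketch of running a worker on a thread whose stack size is chosen explicitly. This is not Newton's actual code: the names `worker`, `runWithStack`, `kScratchBytes` and `kPageBytes` are made up for illustration, and only the pthread calls are real APIs. The 8 MB scratch buffer stands in for the solver's intermediate arrays; it would overrun a 1 MB stack but fits in a 16 MB one.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>

// Hypothetical scratch size: stands in for the intermediate arrays a solver
// might keep on the stack for a very large scene.
constexpr std::size_t kScratchBytes = 8u * 1024u * 1024u;
constexpr std::size_t kPageBytes = 4096;

// Thread entry point: writes and reads one byte per 4 KiB of a large stack
// buffer and reports the number of bytes touched through `arg`.
static void* worker(void* arg) {
  volatile unsigned char scratch[kScratchBytes];
  unsigned long touched = 0;
  for (std::size_t i = 0; i < kScratchBytes; i += kPageBytes) {
    scratch[i] = 1;
    touched += scratch[i];
  }
  *static_cast<unsigned long*>(arg) = touched;
  return nullptr;
}

// Runs `worker` on a thread that has `stackBytes` of stack.
// Returns the number of bytes touched, or 0 if the thread could not be created.
unsigned long runWithStack(std::size_t stackBytes) {
  assert(stackBytes > kScratchBytes);  // a smaller stack would overflow

  pthread_attr_t attr;
  if (pthread_attr_init(&attr) != 0) return 0;

  unsigned long touched = 0;
  pthread_t thread;
  const bool ok = pthread_attr_setstacksize(&attr, stackBytes) == 0 &&
                  pthread_create(&thread, &attr, worker, &touched) == 0;
  pthread_attr_destroy(&attr);
  if (!ok) return 0;

  pthread_join(thread, nullptr);
  return touched;
}
```

Calling `runWithStack(16u * 1024u * 1024u)` should complete normally, whereas the same worker on a 1 MB stack would typically crash with a segfault rather than a clean error, which matches the behaviour described above.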
Julio Jerez
Moderator
 
Posts: 12249
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Re: Segfault with too many bodies

Postby oliver » Sat Sep 24, 2016 11:05 pm

Thank you for the quick response, Joe and Julio.

I did not realise that Newton makes an internal copy of the transform matrix
and collision shape. That would have simplified the example. :)

As for the stack size: Julio's suggestion works. The number of bodies before
segfaulting seems to increase linearly with it. I was able to simulate 1M
non-overlapping bodies with a stack size of 128MB.

Would it make sense to have the stack size as a `dgWorld` constructor
parameter? This would be useful for people who are interested in large,
non-real time simulations. Newton seems to be able to handle it fine.
oliver
 
Posts: 36
Joined: Sat Sep 17, 2016 10:31 am

Re: Segfault with too many bodies

Postby Julio Jerez » Sun Sep 25, 2016 9:09 am

oliver wrote:Would it make sense to have the stack size as a `dgWorld` constructor
parameter? This would be useful for people who are interested in large,
non-real time simulations. Newton seems to be able to handle it fine.

Yes, that could be a good idea. I will see if I can add a function rather than a parameter.
Julio Jerez
Moderator
 
Posts: 12249
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Re: Segfault with too many bodies

Postby Julio Jerez » Sun Sep 25, 2016 11:01 am

OK, it turns out that adding it as a function is a lot of work, because once the threads are created some parameters cannot be changed; I would have to destroy a lot of stuff and recreate it.
Like you said, I took the simpler approach of adding a new constructor.
Code: Select all
NewtonWorld* NewtonCreate ();
NewtonWorld* NewtonCreateEx (int stackSizeInMegabytes);


Are you simulating millions of bodies? What are you doing, if I can ask, and how long does it take to execute? Note that a lot of bodies will easily generate rounding errors in single precision; you may want to try double precision.
Julio Jerez
Moderator
 
Posts: 12249
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Re: Segfault with too many bodies

Postby oliver » Sun Sep 25, 2016 9:44 pm

Thank you for considering it, Julio. I had a look at your code myself and I
think the ctor argument would be the simplest thing that works.

Regarding performance: here is the breakdown with 1M non-penetrating spheres
that are all pulled in the same direction (without colliding).

Code: Select all
Threads    Create Bodies    Iteration 0    Iteration 1+
      1         1,043 ms       9,375 ms        1,816 ms
      2         1,031 ms       8,955 ms        1,350 ms
      3         1,070 ms       8,780 ms        1,150 ms
      4         1,029 ms       8,692 ms        1,100 ms
      8         1,034 ms       8,674 ms        1,070 ms

I notice that the thread count makes a smaller difference than I expected. Various
comments in the source code suggest that multiple threads may not result in better
performance. May I ask why? Cache pollution?
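
For reference, the scaling in the table can be quantified as speedup and parallel efficiency. The sketch below is only arithmetic on the "Iteration 1+" timings quoted above; the function names `speedup`, `efficiency` and `printScaling` are made up for illustration.

```cpp
#include <cassert>
#include <cstdio>

// Speedup of a multi-threaded run relative to the single-thread run.
double speedup(double singleThreadMs, double multiThreadMs) {
  assert(multiThreadMs > 0.0);
  return singleThreadMs / multiThreadMs;
}

// Parallel efficiency: the fraction of ideal (linear) scaling achieved.
double efficiency(double singleThreadMs, double multiThreadMs, int threads) {
  assert(threads > 0);
  return speedup(singleThreadMs, multiThreadMs) / threads;
}

// Prints one line per row of the "Iteration 1+" column of the table.
void printScaling() {
  const int threads[] = {1, 2, 3, 4, 8};
  const double ms[] = {1816.0, 1350.0, 1150.0, 1100.0, 1070.0};
  for (int i = 0; i < 5; i++) {
    std::printf("%d threads: speedup %.2f, efficiency %.2f\n",
                threads[i],
                speedup(ms[0], ms[i]),
                efficiency(ms[0], ms[i], threads[i]));
  }
}
```

With these numbers, eight threads give a speedup of about 1.7 over one thread, which is roughly 21% parallel efficiency.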

As for what I am doing: having fun with the engine, learning how to use it, and
gauging its limits... I am an engineer, after all :)

As for double precision: I was unable to compile it (I will create a new report
for it). I have to ask, though, what is the connection between the number of
bodies and the accuracy? Or is that only a concern for collision/constraint
resolution?
oliver
 
Posts: 36
Joined: Sat Sep 17, 2016 10:31 am

Re: Segfault with too many bodies

Postby Julio Jerez » Mon Sep 26, 2016 9:33 am

Thread count only makes a difference with good multi-core CPUs.
By this I mean cores that are in fact real, independent units with plenty of cache memory.
Intel Core i7 is good; AMD is not so good in my experience.

Also, as there are more and more threads, all cores share the same bus and the same memory, because they are working on the same problem. As there are more cores the gain diminishes, because all you are doing is measuring the bandwidth of the memory bus and system memory.
Multiple cores are effective as long as the part of the scene being worked on can be kept in the CPU cache.
The scene you are testing is so large that it flushes the cache many times over, even in one pass.
This is the only reason GPUs are better than CPUs at this: they have very large memory bandwidth.

In my experience, using three or four threads is possibly the best combination; beyond that I have never seen a net gain.
Julio Jerez
Moderator
 
Posts: 12249
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Re: Segfault with too many bodies

Postby JoeJ » Mon Sep 26, 2016 12:02 pm

oliver wrote:I have to ask, though, what is the connection between the number of
bodies and the accuracy? Or is that only a concern for collision/constraint
resolution?


You should test stacks of bodies with gravity. (After some time the bodies might fall asleep and performance should go up.)
Testing non-intersecting bodies without any constraints to solve makes no sense, neither for performance nor for accuracy.

To get better accuracy, you need to set more iterations with NewtonSetSolverModel, e.g. NewtonSetSolverModel (world, 7).
0 is for the exact solver.
JoeJ
 
Posts: 1453
Joined: Tue Dec 21, 2010 6:18 pm

Re: Segfault with too many bodies

Postby Julio Jerez » Mon Sep 26, 2016 1:36 pm

oliver wrote:I have to ask, though, what is the connection between the number of
bodies and the accuracy? Or is that only a concern for collision/constraint
resolution?

Collision/constraint resolution? Yes, but it is more than that: in the demo that you posted, objects were located at distances of about 10 million units from the origin.
Numbers that large cannot be held with enough precision in single-precision floats, so even integration will have a lot of rounding errors.
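
The precision issue can be checked directly. The sketch below is standard C++ only and does not use Newton; the helper names `spacingAbove` and `integrate` are made up for illustration, as are the velocity of 1 unit/s and the 1/60 s timestep. It measures the gap between adjacent representable values at 10 million units and runs a simple explicit Euler position update in both precisions.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Gap between x and the next representable value above it.
template <typename T>
T spacingAbove(T x) {
  return std::nextafter(x, std::numeric_limits<T>::infinity()) - x;
}

// Explicit Euler position update: repeats position += velocity * dt.
template <typename T>
T integrate(T position, T velocity, T dt, int steps) {
  assert(steps >= 0);
  for (int i = 0; i < steps; i++) {
    position += velocity * dt;
  }
  return position;
}
```

At 10 million units a float can only change in increments of 1.0, so `integrate(1.0e7f, 1.0f, 1.0f / 60.0f, 60)` returns the starting position unchanged: each 1/60 step is rounded away. The same call with double arguments advances by about one unit, as expected.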
Julio Jerez
Moderator
 
Posts: 12249
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Re: Segfault with too many bodies

Postby oliver » Tue Sep 27, 2016 10:15 pm

Thank you for the explanations about the threads.

Julio Jerez wrote:Collision/constraint resolution? Yes, but it is more than that: in the demo that you posted, objects were located at distances of about 10 million units from the origin.
Numbers that large cannot be held with enough precision in single-precision floats, so even integration will have a lot of rounding errors.


You are, of course, correct. I completely forgot about that.

Btw, I tried out the new NewtonCreateEx function and it works a treat - thank you for that.
oliver
 
Posts: 36
Joined: Sat Sep 17, 2016 10:31 am

