newton on gpu or discrete HPC

A place to discuss everything related to Newton Dynamics.

Moderators: Sascha Willems, walaber

Re: newton on gpu or discrete HPC

Postby Julio Jerez » Thu Sep 27, 2018 3:29 pm

ah ok, the two call load.
two things.
The name sse4.2 is another copy-and-paste bug.
I will change it to sse.
The crash is also another copy-and-paste bug: I copied the newtonnSse4.2 folder, renamed it, then changed some of the SSE4 intrinsics, like fmadd, to the equivalent sequence with SSE,
but I probably missed some others.
It runs on my system because my CPU supports them, but yours doesn't, therefore it crashes.

I will check which other intrinsics are SSE4 specific and change them, and that should fix it.
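
As an aside on the crash described above: a plugin built for an instruction set the CPU lacks can be skipped before it is ever loaded. Below is a minimal sketch of that idea, not Newton's actual loader. The `CpuFeatures` struct, the `ChoosePlugin` function and the plugin names are all hypothetical; in a real build the flags would presumably be filled from CPUID (for example `__cpuid` on MSVC or `__builtin_cpu_supports` on GCC/Clang).

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of what the host CPU can execute.
// Assumption: a real loader would fill this from CPUID at startup.
struct CpuFeatures {
    bool sse2 = false;
    bool sse42 = false;
    bool avx = false;
    bool avx2 = false;
};

// Return the most capable plugin the CPU can run, falling back to the
// built-in solver. The names are illustrative, not Newton's file names.
inline std::string ChoosePlugin(const CpuFeatures& cpu) {
    if (cpu.avx2) return "plugin_avx2";
    if (cpu.avx) return "plugin_avx";
    if (cpu.sse42) return "plugin_sse42";
    if (cpu.sse2) return "plugin_sse";
    return "default_solver";
}
```

With a check like this, a machine without AVX would never be offered an AVX build, so a mislabeled or miscompiled plugin fails at selection time instead of with an illegal instruction.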
Julio Jerez
Moderator
Posts: 11039
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Postby Julio Jerez » Thu Sep 27, 2018 4:18 pm

Um, I do not see any intrinsic that isn't allowed with SSE.
The only suspicious ones are the functions _mm_add_ps and _mm_cvtss_f32, but I believe these are legal to use; besides, the horizontal add in dgVector uses them.
Code:
   DG_INLINE float AddHorizontal() const
   {
      __m128 tmp0 (_mm_add_ps (m_low, m_high));
      __m128 tmp1 (_mm_hadd_ps (tmp0, tmp0));
      __m128 tmp2 (_mm_hadd_ps (tmp1, tmp1));
      return _mm_cvtss_f32 (tmp2);
   }

Can you run with the default solver?

Also, the stack trace seems to be from a release build. Can you run a debug build, and when it crashes, click Show Disassembly?
Sync first; the SSE4 one is renamed now.
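
A note on the snippet above: as far as I know, `_mm_hadd_ps` is an SSE3 intrinsic (declared in `pmmintrin.h`), while `_mm_add_ps` and `_mm_cvtss_f32` are plain SSE. If the goal is a build that needs nothing beyond SSE, a horizontal add can be written with SSE1 intrinsics only. The following is a sketch under that assumption, not the dgVector code; `HorizontalAdd4` and the `HAS_SSE1` macro are hypothetical names, and a scalar fallback is included for non-x86 targets.

```cpp
#include <cassert>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  // SSE1 intrinsics only
#define HAS_SSE1 1
#else
#define HAS_SSE1 0
#endif

// Sum the four floats pointed to by v.
inline float HorizontalAdd4(const float* v) {
#if HAS_SSE1
    __m128 x = _mm_loadu_ps(v);            // [v0, v1, v2, v3]
    __m128 hi = _mm_movehl_ps(x, x);       // [v2, v3, v2, v3]
    __m128 pair = _mm_add_ps(x, hi);       // [v0+v2, v1+v3, ., .]
    // Copy lane 1 of pair into every lane so it can be added to lane 0.
    __m128 odd = _mm_shuffle_ps(pair, pair, _MM_SHUFFLE(1, 1, 1, 1));
    __m128 total = _mm_add_ss(pair, odd);  // lane 0 = (v0+v2) + (v1+v3)
    return _mm_cvtss_f32(total);
#else
    // Scalar fallback using the same summation order as the SSE path.
    return (v[0] + v[2]) + (v[1] + v[3]);
#endif
}
```

Calling `HorizontalAdd4` on an array holding 1, 2, 3 and 4 returns 10.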

Postby JoeJ » Fri Sep 28, 2018 12:50 am

Assertion persists (I was always running debug):
disasm.JPG
JoeJ
Posts: 1147
Joined: Tue Dec 21, 2010 6:18 pm

Postby JoeJ » Fri Sep 28, 2018 12:54 am

After deleting dx12.dll, 'newtonSSE_d' is now properly displayed and I can run with that.
After deleting sse.dll I can run the default solver as well.

Postby JoeJ » Fri Sep 28, 2018 1:01 am

I tried to edit dgVectorSimd.h:

   DG_INLINE dgVector operator^ (const dgVector& data) const
   {
      //return _mm_xor_ps (m_type, data.m_type);
      return data.m_type;
   }

but nothing changes. The instruction must come from somewhere else.

Postby JoeJ » Fri Sep 28, 2018 1:06 am

Then I edited the project options of dgNedwonDx12 and set EnableEnhancedInstructionSet to SSE (was AVX)...

Yup, works: 'gpu experimental'

So you could do some benchmarking to see if it is worth having multiple permutations for the DX plugin... :roll:

Postby Julio Jerez » Fri Sep 28, 2018 1:18 am

Oh, I see the problem now from the assembly in the listing.
The instructions with a letter v in front are AVX code,
yet another copy and paste from the AVX script and editing.

I believe I have it fixed now.
Please sync again when you get time.
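
For context on why editing a single intrinsic did not help: when a project is compiled with AVX enabled, the compiler is free to emit the v-prefixed (VEX-encoded) forms for ordinary SSE intrinsics too, so the setting affects the whole translation unit. A hedged sketch of how a build could report or guard against this using predefined compiler macros follows. `CompiledInstructionSet` and `SSE_ONLY_PLUGIN` are hypothetical names, not part of Newton; `__AVX__`, `__AVX2__`, `__SSE2__` and `_M_IX86_FP` are macros defined by the compilers themselves.

```cpp
#include <cassert>

// Hypothetical guard: an SSE-only plugin could define SSE_ONLY_PLUGIN so
// that a wrong project setting becomes a build error, not a runtime crash.
#if defined(SSE_ONLY_PLUGIN) && defined(__AVX__)
#error "SSE-only plugin is being compiled with AVX code generation enabled"
#endif

// Report the instruction set this translation unit was compiled for,
// judged only from the compiler's predefined macros.
inline const char* CompiledInstructionSet() {
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP == 2)
    return "SSE2";
#elif defined(__SSE__) || (defined(_M_IX86_FP) && _M_IX86_FP == 1)
    return "SSE";
#else
    return "generic";
#endif
}
```

Printing the result of `CompiledInstructionSet()` from each plugin at load time would make a mismatch between the plugin name and its actual build flags visible immediately.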

Postby JoeJ » Fri Sep 28, 2018 2:21 am

Before I try, did you notice the DX12 project settings allow AVX?

JoeJ wrote:Then I edited the project options of dgNedwonDx12 and set EnableEnhancedInstructionSet to SSE (was AVX)...

Yup, works: 'gpu experimental'

Postby Julio Jerez » Fri Sep 28, 2018 8:19 am

Yes, that was the bug.

The CMake script now sets both to SSE2 for 32-bit builds and leaves it unset for 64-bit builds.

Postby JoeJ » Fri Sep 28, 2018 4:24 pm

Works fine now. I've built all 5 plugIns this time, and selection works without problems :)
(It chooses DX and displays GPU name)

Postby Julio Jerez » Fri Sep 28, 2018 5:32 pm

So Joe, you were never able to run the plugins, were you?
Did you run the pyramid stack?
It is a 100 x 100 pyramid :shock: :? :o :lol: :twisted:
and it stays up at 12 iterations, no cheats. It even goes to sleep.

Postby JoeJ » Fri Sep 28, 2018 5:55 pm

Julio Jerez wrote:it even goes to sleep.


How long do you wait until it sleeps?
I've been watching for 2-3 minutes, but it's still working hard :)
Runtime is about 500ms with SSE.

Oh, now the pyramid collapses, looks like timelapsed sand.

Still no sleep after collapse. I'll upload a screenshot - seems our results differ...

Postby JoeJ » Fri Sep 28, 2018 5:56 pm

pyramid.JPG


I'm going to sleep now, but for real :P

Postby Julio Jerez » Fri Sep 28, 2018 7:34 pm

12 iterations seems to be the cutoff point where it may or may not go to sleep.
I have noticed some randomness, I think caused by multithreading: some runs go to sleep and others collapse.
But on the two systems I tested, setting the iteration count to 16 makes it go to sleep each time.
I committed it at 16 iterations, and I will make a small video so that we have a reference to compare to.

Postby JoeJ » Sat Sep 29, 2018 2:08 am

I've tried some settings, but even with 1 thread and 20 iterations it does not sleep with the SSE plugIn.

After deleting all plugIns:
It does sleep with 1 thread / 20 iter
No sleep with 4 threads / 16 iter
Does sleep with 4 threads / 20 iter

It's really an edge case, e.g. if I change the settings while the simulation is running I get different results - need to restart it.

Maybe there is a difference in accuracy for older and newer CPUs? That would explain it.
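
One possible contributor to run-to-run and machine-to-machine differences like these, offered only as a guess since the thread does not confirm it, is that floating-point addition is not associative: a different thread count or a different instruction sequence can change the order in which partial results are combined, and therefore the rounded result. A minimal sketch of the effect follows; `SumInOrder` is a hypothetical helper, not solver code.

```cpp
#include <cassert>

// Add the values strictly left to right in single precision.
// volatile forces every partial sum to be rounded to float, so the
// compiler cannot keep extra precision or reorder the additions.
inline float SumInOrder(const float* values, int count) {
    volatile float acc = 0.0f;
    for (int i = 0; i < count; ++i) {
        acc = acc + values[i];
    }
    return acc;
}
```

Summing 1.0e8f, 1.0f, -1.0e8f in that order gives 0, because 1.0f is lost when it is added to 1.0e8f. Summing 1.0e8f, -1.0e8f, 1.0f gives 1. In a stack that sits right at the sleep threshold, differences of this kind could plausibly be enough to tip the outcome one way or the other.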