ARM Neon Support and Android x86

Post by execomrt » Fri Sep 21, 2018 10:29 am

Hello,
I'm currently adapting the new code for ARM Neon support and better Android support.
The mandate is to:

- Support SSE4+ for Intel 64-bit only, and SSE3 for x86 (which is end of life)
- Support SSE on Android and Mac OS X (it is currently disabled on Android)
- Support ARM Neon on Android, iOS and Windows ARM
- Support ARM 64 instructions (which have a single instruction for horizontal adds)

This will include new defines for detecting the architecture:
a new preprocessor symbol, DG_ARCH, which takes a different value depending on the compilation target,
such as DG_ARCH_SCALAR, DG_ARCH_SSE_3, DG_ARCH_SSE_4_1, DG_ARCH_NEON or DG_ARCH_NEON_64.
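
As a rough illustration of the DG_ARCH idea, here is a minimal sketch of compile-time detection. The numeric values and the dgArchName helper are assumptions made up for this example (the thread names the macros but not their values), and the checks rely only on macros that compilers predefine:

```cpp
// Hypothetical numeric values: the post names these macros but not their values.
#define DG_ARCH_SCALAR   0
#define DG_ARCH_SSE_3    1
#define DG_ARCH_SSE_4_1  2
#define DG_ARCH_NEON     3
#define DG_ARCH_NEON_64  4

// Pick one value per build from macros the compiler predefines.
#if defined(__aarch64__) || defined(_M_ARM64)
  #define DG_ARCH DG_ARCH_NEON_64   // 64-bit ARM target
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
  #define DG_ARCH DG_ARCH_NEON      // 32-bit ARM built with Neon enabled
#elif defined(__SSE4_1__) || defined(__AVX__)
  #define DG_ARCH DG_ARCH_SSE_4_1   // e.g. GCC/Clang -msse4.1, or an AVX build
#elif defined(__SSE3__)
  #define DG_ARCH DG_ARCH_SSE_3     // e.g. GCC/Clang -msse3
#else
  #define DG_ARCH DG_ARCH_SCALAR    // plain C++ fallback
#endif

// Illustrative helper (not part of Newton) that reports the selected path.
inline const char* dgArchName() {
    switch (DG_ARCH) {
        case DG_ARCH_NEON_64: return "NEON_64";
        case DG_ARCH_NEON:    return "NEON";
        case DG_ARCH_SSE_4_1: return "SSE_4_1";
        case DG_ARCH_SSE_3:   return "SSE_3";
        default:              return "SCALAR";
    }
}
```

Note that Visual Studio does not predefine __SSE3__ or __SSE4_1__, so with this sketch an MSVC x86 build without /arch:AVX falls through to the scalar path; a real header would need compiler-specific handling there.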

I now have this working on Android x86 (32- and 64-bit), and ARM is also working.

I will post the patch here this weekend.

Could some people test the patch, especially on various versions of Visual Studio, Linux, iOS and Android, and report some feedback?

It is also possible to enable the PowerPC AltiVec version, but is anyone working on that?

Thanks,
execomrt
 
Posts: 8
Joined: Fri Sep 21, 2018 8:41 am

Re: ARM Neon Support and Android x86

Post by Dave Gravel » Fri Sep 21, 2018 1:28 pm

Yes, sure, I can test it in my projects, but only on ARM.
I have 7 Android devices for testing, from older models to the latest one.

I currently work with a two-year-old Newton SDK in my projects.
I'm going to finish updating my projects to the latest Newton SDK over the weekend.

Thanks for the contribution; I can report feedback later.
You search a nice physics solution, if you can read this message you're at the good place :wink:
OrionX3D Projects & Demos:
https://www.facebook.com/dave.gravel1
http://orionx3d.googlepages.com/
https://www.youtube.com/user/EvadLevarg/videos
Dave Gravel
 
Posts: 712
Joined: Sat Apr 01, 2006 9:31 pm
Location: Quebec in Canada.

Re: ARM Neon Support and Android x86

Post by Julio Jerez » Fri Sep 21, 2018 2:03 pm

Nice, we will have optimized ARM, cool stuff.

I am making another pass eliminating stuff like DG_SSE4_INSTRUCTIONS_SET, which
has not been used at all since not too many people have the same PC system, and now we are supporting plugins that encapsulate that kind of support.

I will also eliminate some of the functions that operate on three floats of a four-float vector,
stuff like
Code: Select all
   DG_INLINE dgFloat32 DotProduct3 (const dgVector& A) const
   {
      dgVector tmp (A & m_triplexMask);
      dgAssert ((m_w * tmp.m_w) == dgFloat32 (0.0f));
      return (*this * tmp).AddHorizontal().GetScalar();
   }


It actually ends up being slower, because in most cases where a three-component dot product is needed the w member of the vector is already a one or a zero, so the client is in a better position to decide what to do.
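
To make the point concrete, here is a minimal scalar sketch. Vec4, Dot4 and Dot3Masked are hypothetical stand-ins written for this example, not Newton's dgVector API. When the caller already knows that w is zero on one side, the plain four-component dot product gives the same result without the extra masking step:

```cpp
// Hypothetical scalar stand-in for a four-float vector; not Newton's dgVector.
struct Vec4 {
    float x, y, z, w;
};

// Four-component dot product: one multiply per lane plus a horizontal add.
inline float Dot4(const Vec4& a, const Vec4& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

// Three-component variant in the style of DotProduct3: it clears w on one
// operand first, which is extra work on every call.
inline float Dot3Masked(const Vec4& a, const Vec4& b) {
    Vec4 tmp = {b.x, b.y, b.z, 0.0f};
    return Dot4(a, tmp);
}
```

With a = (1, 2, 3, 0) and b = (4, 5, 6, 1), both functions return the same value, because the zero in a.w already cancels the fourth lane.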

On this:
"It is also possible to enable the PowerPC altivec version, but is there anyone working on that ?"

We can do away with AltiVec; it is not likely that PowerPC is going to make a comeback anytime soon.
Julio Jerez
Moderator
 
Posts: 10984
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Re: ARM Neon Support and Android x86

Post by Julio Jerez » Fri Sep 21, 2018 2:22 pm

OK, I removed all references to __ppc__ and SSE4.2,
and also removed all references to the function dgVector CompProduct3 (const dgVector& A) const.

Next I will remove the references to DotProduct3 and DotScale3.
It has to be done slowly so that no side effects are introduced, but it streamlines the vector class.

Re: ARM Neon Support and Android x86

Post by Dave Gravel » Sun Sep 23, 2018 3:14 am

Hi, this is what I currently get with the latest Newton SDK plus the latest Android ARM Neon addition.

Edited:
Oh, I thought the Neon patch was already present in the SDK, lol, sorry.
There is no ARM Neon because it is not present in the SDK.
Now I see that __ARMCC_VERSION is only used to avoid SSE3 and above.
Maybe the Newton SDK needs some updates around the ANDROID tag because of the 32- and 64-bit versions.
Some info and Neon code here: https://developer.android.com/ndk/guides/cpu-arm-neon
Maybe I would be better off testing on another device.
This one is getting old and is only 32-bit if I'm not wrong, but from what I see it runs faster with the older Newton SDK.
I have just finished updating the SDK again, I tried every possible setup, and I get pretty much the same result.

I'm not able to compile with the hard ABI; I need to use softfp.
I tried to set everything to hard mode, but nothing seems to work in this mode.
The normal soft mode is very slow.
With softfp the speed is OK, but something is slower compared with my old tests.
I get better results with the two-year-old Newton SDK.
I need to do more tests and see if I can gain more speed,
because at this speed I'm very limited in what I can do.
When I pick an object it seems to glitch more too.
Maybe that comes from the Newton setup in my demo, because of the SDK library update.

Video, new, using the latest SDK:
https://www.youtube.com/watch?v=wVD3WPJOqik


Video, two years old:
https://www.youtube.com/watch?v=TZEewV1p-io
Last edited by Dave Gravel on Sun Sep 23, 2018 4:55 pm, edited 4 times in total.

Re: ARM Neon Support and Android x86

Post by Julio Jerez » Sun Sep 23, 2018 12:02 pm

OK, on my side I am cleaning up some of the legacy functions that are not useful anymore.

I committed the replacement of Scale3 and Scale4;
now there is only Scale, which simply multiplies the vector by a constant.

Next I will rename CrossProduct3 to just CrossProduct
and remove DotProduct3; this one has to be done with care.

Finally I will add the functions MulAdd and NegMulAdd, which do nothing special for x86 but are very powerful on other architectures.
On ARM systems this is where they get their strength from, while older SSE has no multiply-add until the latest versions, and it does make a difference with AVX2.
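
A minimal sketch of what such a pair of functions could look like at the intrinsic level. MulAdd4 and NegMulAdd4 are hypothetical free functions written for this example (Newton's real versions are dgVector members and are not shown in this thread), and the assumed semantics are acc + a * b and acc - a * b on four floats:

```cpp
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
  #include <arm_neon.h>
#elif defined(__SSE__) || defined(_M_X64)
  #include <immintrin.h>
  #define SKETCH_HAS_SSE 1
#endif

// out[i] = acc[i] + a[i] * b[i] for four floats.
inline void MulAdd4(const float* acc, const float* a, const float* b, float* out) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    // vmlaq_f32(c, x, y) computes c + x * y per lane.
    vst1q_f32(out, vmlaq_f32(vld1q_f32(acc), vld1q_f32(a), vld1q_f32(b)));
#elif defined(__FMA__)
    // _mm_fmadd_ps(x, y, c) computes x * y + c with a single rounding.
    _mm_storeu_ps(out, _mm_fmadd_ps(_mm_loadu_ps(a), _mm_loadu_ps(b), _mm_loadu_ps(acc)));
#elif defined(SKETCH_HAS_SSE)
    // Older SSE has no multiply-add, so this is a separate multiply and add.
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(acc), _mm_mul_ps(_mm_loadu_ps(a), _mm_loadu_ps(b))));
#else
    for (int i = 0; i < 4; ++i) out[i] = acc[i] + a[i] * b[i];
#endif
}

// out[i] = acc[i] - a[i] * b[i] for four floats.
inline void NegMulAdd4(const float* acc, const float* a, const float* b, float* out) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    // vmlsq_f32(c, x, y) computes c - x * y per lane.
    vst1q_f32(out, vmlsq_f32(vld1q_f32(acc), vld1q_f32(a), vld1q_f32(b)));
#elif defined(__FMA__)
    // _mm_fnmadd_ps(x, y, c) computes -(x * y) + c with a single rounding.
    _mm_storeu_ps(out, _mm_fnmadd_ps(_mm_loadu_ps(a), _mm_loadu_ps(b), _mm_loadu_ps(acc)));
#elif defined(SKETCH_HAS_SSE)
    _mm_storeu_ps(out, _mm_sub_ps(_mm_loadu_ps(acc), _mm_mul_ps(_mm_loadu_ps(a), _mm_loadu_ps(b))));
#else
    for (int i = 0; i < 4; ++i) out[i] = acc[i] - a[i] * b[i];
#endif
}
```

The unaligned loads and stores keep the sketch safe for any float array; a vector class with aligned storage would not need them.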

Re: ARM Neon Support and Android x86

Post by Julio Jerez » Wed Sep 26, 2018 5:20 pm

I added two functions that are important for ARM and newer x86:
MulAdd and NegMulAdd. This is where those architectures get their performance, because a multiply-add does twice the amount of floating-point work for the same cost.
They do nothing special for now, but when you get there just remember those two new functions.

I will complete them for all classes later tonight.

In theory a compiler should recognize that kind of optimization, but I have never seen one do it.
At least I know Visual Studio does not, and code generated using Clang in Visual Studio does not do it either if you use intrinsics.

Re: ARM Neon Support and Android x86

Post by Dave Gravel » Sat Sep 29, 2018 11:31 am

This is my result with the Newton SDK from two days ago.

https://www.youtube.com/watch?v=uNZRQriT3yc


Demo with a stack of 296 objects, compatible with Android 5.0.2 Lollipop and newer + OpenGL ES 3.0:
https://sites.google.com/site/orionx3d/ ... 01_apk.zip

Info on installing an external APK [use at your own risk]:
https://www.wikihow.tech/Install-APK-Files-on-Android
Last edited by Dave Gravel on Sat Sep 29, 2018 12:00 pm, edited 1 time in total.

Re: ARM Neon Support and Android x86

Post by Julio Jerez » Sat Sep 29, 2018 11:45 am

Hey, that's really cool.
What kind of hardware is that running on?

Edit:
Never mind, it says right there: quad-core ARM 7.

Questions:
Is this using Neon?
Is it multithreaded?

Re: ARM Neon Support and Android x86

Post by Dave Gravel » Sat Sep 29, 2018 11:54 am

I have Neon active for some math, but enabling or disabling it doesn't really change anything for Newton.
I use multithreading and the async update mode.

This is my Newton setup:

NewtonSetParallelSolverOnLargeIsland(nworld, 1);
NewtonSetSolverModel(nworld, 1);
NewtonSetThreadsCount(nworld, 3);
NewtonSetNumberOfSubsteps(nworld, 2);

The CPU in this device is a bit slow, so I update Newton at 1/30.
Two or three threads seem to give the best result on this device.

Re: ARM Neon Support and Android x86

Post by Dave Gravel » Sun Sep 30, 2018 3:07 am

Finally with joints now!! :D
https://www.youtube.com/watch?v=NRk5_F0-jfc


Download the demo APK, updated for landscape view:
https://sites.google.com/site/orionx3d/ ... mo_apk.zip

Edited:
In this video the demo is running on a Samsung S9+:
https://www.youtube.com/watch?v=Zh-8EbwZGg4

Re: ARM Neon Support and Android x86

Post by execomrt » Mon Oct 01, 2018 11:02 am

Here is the new patch (tested against the latest snapshot, from October 1st):
- Changed DotProduct4 to DotProduct, and CrossProduct
- Added MulAdd/MulSub (two versions, with FMA or regular; FMA has more precision)
- Removed dgVectorSSE.h (doing an include)
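
The precision remark about FMA can be illustrated with a small scalar sketch. MulAddRegular and MulAddFused are hypothetical names for this example, not functions from the patch. The regular version rounds the product to float before the add, while the fused version rounds only once at the end:

```cpp
#include <cmath>

// a * b + c with two roundings: the product is rounded to float first.
// The volatile store keeps the compiler from fusing the two operations.
inline float MulAddRegular(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

// a * b + c with a single rounding at the end.
inline float MulAddFused(float a, float b, float c) {
    return std::fma(a, b, c);
}
```

For example, with a = b = 1 + 2^-12 and c = -(1 + 2^-11), the exact result is 2^-24. The fused version returns exactly that, while the regular version returns 0 because the 2^-24 term is lost when the product is rounded to float.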
Attachments
ARM PATCH 2.zip
(22.55 KiB) Downloaded 120 times
Last edited by execomrt on Mon Oct 01, 2018 3:25 pm, edited 1 time in total.

Re: ARM Neon Support and Android x86

Post by execomrt » Mon Oct 01, 2018 11:07 am

[quote="Julio Jerez"]Nice, we will have optimized ARM, cool stuff.

I am making another pass eliminating stuff like DG_SSE4_INSTRUCTIONS_SET which
[/quote]

Hello,

The patch uses a single define, DG_ARCH; I've replaced those checks with #if DG_ARCH >= DG_ARCH_SSE_4_1.
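
A minimal sketch of how such a guard might select a code path. The numeric values and the Dot4 function are assumptions for this example; only the relative order of the x86 values matters for the comparison to work:

```cpp
// Hypothetical values; only their relative order matters for the comparison.
#define DG_ARCH_SCALAR   0
#define DG_ARCH_SSE_3    1
#define DG_ARCH_SSE_4_1  2

// Normally set by the build or by a detection header; defaults to scalar here.
#ifndef DG_ARCH
  #define DG_ARCH DG_ARCH_SCALAR
#endif

#if DG_ARCH >= DG_ARCH_SSE_4_1
  #include <smmintrin.h>   // SSE4.1 intrinsics such as _mm_dp_ps
#endif

// Four-float dot product with an SSE4.1 path and a scalar fallback.
inline float Dot4(const float* a, const float* b) {
#if DG_ARCH >= DG_ARCH_SSE_4_1
    // Mask 0xF1: multiply all four lanes and write the sum to lane 0.
    return _mm_cvtss_f32(_mm_dp_ps(_mm_loadu_ps(a), _mm_loadu_ps(b), 0xF1));
#else
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
#endif
}
```

A comparison like this only makes sense if the ARM values are checked separately or kept outside the x86 range, and the SSE4.1 path also needs the matching compiler flag (for example -msse4.1 on GCC and Clang).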

I haven't done the ARM 64 double-precision float version yet.
The cross product is not optimized yet, because it doesn't optimize easily.

Re: ARM Neon Support and Android x86

Post by Dave Gravel » Mon Oct 01, 2018 1:52 pm

I quickly tested this with my project.
I have the latest Newton SDK from this morning, and here the dgVectorArmNeon.h file does not compile.
Maybe it depends on a flag somewhere, or I need to make modifications in my project.
The Neon header from the NDK already compiles for my GLM math, so I'm not sure why the modification doesn't compile in my project.
I'll wait to see how to use it with the latest Newton SDK and then retry it in my project.

Re: ARM Neon Support and Android x86

Post by execomrt » Mon Oct 01, 2018 2:01 pm

Hello,
Please use the updated ARM PATCH 2.zip, which addresses the compilation options with the latest snapshot and adds the two new functions (MulAdd, MulSub).
Last edited by execomrt on Mon Oct 01, 2018 3:26 pm, edited 1 time in total.
