Newton on Android NDK (ARM-NEON)

Postby Krystian » Thu Jul 21, 2011 4:29 pm

Hi,

Has anyone successfully ported/built Newton for the Android NDK, and would you be able to share your build code/Android.mk or some knowledge of how to make it work? :)
I'm interested in using (learning) Newton on Android (2.3+) with the NDK (r5), pure C/C++, without Dalvik/JNI/Java, and only for devices that support NEON (armeabi-v7a).

Currently I've made a small test build for Newton 2.33 that contains:
Code: Select all
(Application.mk)
APP_ABI := armeabi-v7a
APP_PLATFORM := android-9

and
Code: Select all
(Android.mk based on coreLibrary_200\projets\linux32\makefile)
...
# Android build
LOCAL_PATH := $(call my-dir)
include $(CLEAR_VARS)

LOCAL_MODULE    := newton
LOCAL_SRC_FILES := $(DG_SRCS) $(DG_PHYSICS_SRCS) $(DG_NEWTON_SRCS)
TARGET_ARCH_ABI := armeabi-v7a
LOCAL_ARM_NEON  := true

include $(BUILD_SHARED_LIBRARY)
...

But I have a problem building dgSimd_Instrutions.h:
Code: Select all
#define simd_type __m128
...

From what I've read, __m128 is an SSE type that is the same as/similar to float32x4_t on ARM NEON.
So the question is: do I need to 'translate' all of the simd_Xxx definitions in dgSimd_Instrutions.h to match ARM NEON? Or is there a special/magic flag for the gcc compiler that handles this translation?

I know that Newton+AndroidNDK is possible because of @martinsm http://www.newtondynamics.com/forum/vie ... =75#p46105

PS. I'm building on Win7 x64 for my Samsung Galaxy S2, and I'm a novice in C/CPP/GCC/makefiles :)
Warning: I'm learning C/CPP/ARM-NEON/Newton/AndroidNDK/OpenGL ES, so whatever I post here, keep in mind that I can be totally wrong ;)
Krystian
 
Posts: 26
Joined: Thu Jul 21, 2011 3:11 pm
Location: Poland

Re: Newton on Android NDK (ARM-NEON)

Postby martinsm » Fri Jul 22, 2011 10:51 am

AFAIK Newton does not support NEON (currently). To support it you would need to port the SSE SIMD code to NEON, if you really want NEON.
So your best option is to build the library without SIMD support. To do that, add the _SCALAR_ARITHMETIC_ONLY define. Here is the LOCAL_CFLAGS line from my makefile:
Code: Select all
LOCAL_CFLAGS    := -ffast-math -freciprocal-math -funsafe-math-optimizations -fsingle-precision-constant -D_LINUX_VER -D_LINUX_VER_64 -D_SCALAR_ARITHMETIC_ONLY
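
(For illustration only: a minimal sketch of how a define like _SCALAR_ARITHMETIC_ONLY typically gates a SIMD path at compile time. Only the define name comes from this thread; demoDot4 and DEMO_USE_SSE are hypothetical names and are not taken from the Newton sources.)

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration, not Newton's real source: take the scalar path
// when _SCALAR_ARITHMETIC_ONLY is defined or when SSE is not available.
#if !defined(_SCALAR_ARITHMETIC_ONLY) && defined(__SSE__)
  #include <xmmintrin.h>
  #define DEMO_USE_SSE 1
#else
  #define DEMO_USE_SSE 0
#endif

// Dot product of two 4-float vectors (hypothetical helper name).
inline float demoDot4(const float* a, const float* b)
{
    assert(a != nullptr && b != nullptr);
    float lanes[4];
#if DEMO_USE_SSE
    // SSE path: multiply four lanes at once, then store them for the sum.
    _mm_storeu_ps(lanes, _mm_mul_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
#else
    // Scalar path: plain C++, builds on any target including ARM.
    for (std::size_t i = 0; i < 4; ++i) lanes[i] = a[i] * b[i];
#endif
    return (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
}
```

With -D_SCALAR_ARITHMETIC_ONLY on the compiler command line the SSE header is never included, which is why the same sources can then build for ARM.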

From there on it should build just fine. Maybe some quick tweaks will be needed.

_LINUX_VER_64 was needed to force Newton not to use some code parts that Android could not compile; it doesn't mean that the code will be built as 64-bit. Maybe this is fixed in newer revisions.

Also don't forget to switch code generation to ARM opcodes (instead of the default THUMB); this will give you a nice 20-30% performance boost:
Code: Select all
LOCAL_ARM_MODE  := arm


Note that not all armeabi-v7a devices support NEON. The most notable exception is Tegra 2, which includes the Motorola Xoom, Asus Transformer and other devices.
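
Because of that, it can be worth checking for NEON at run time before picking a NEON build. A minimal sketch, assuming the NDK's cpufeatures helper library (cpu-features.h) is available on Android; the demo* functions and the library names "newton_neon" / "newton_scalar" are hypothetical:

```cpp
#include <cassert>
#include <string>

#if defined(__ANDROID__)
  #include <cpu-features.h>   // from the NDK "cpufeatures" module
#endif

// Returns true when the CPU we are running on reports NEON support.
inline bool demoHasNeon()
{
#if defined(__ANDROID__)
    // 32-bit ARM on Android: ask the NDK helper library at run time.
    return android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM &&
           (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON) != 0;
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
    return true;    // non-Android ARM build compiled with NEON enabled
#else
    return false;   // e.g. a desktop x86 build
#endif
}

// Hypothetical policy: choose which build of the physics library to load.
inline std::string demoPickLibrary(bool hasNeon)
{
    return hasNeon ? "newton_neon" : "newton_scalar";
}
```

On Android this also needs the cpufeatures module imported in Android.mk and added to LOCAL_STATIC_LIBRARIES.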
martinsm
 
Posts: 86
Joined: Mon Dec 19, 2005 3:15 pm
Location: Latvia

Re: Newton on Android NDK (ARM-NEON)

Postby Krystian » Fri Jul 22, 2011 2:47 pm

Thank you, it worked. I managed to compile it without LOCAL_CFLAGS (using your first hint about _SCALAR_ARITHMETIC_ONLY), just by changing/"fixing" the sources, but I see that it was not a good/proper way.
There were about 15-20 changes to the sources (some ifdef/include/etc. fixes), but with your LOCAL_CFLAGS it should be better.

Edit: (it now compiles with ZERO modifications to the source, so this is obsolete ->) I haven't tested it with LOCAL_CFLAGS yet, but I think that some #includes will still need to be changed, because for example the file NewtonStdAfx.h includes:
Code: Select all
#include "dg.h"
#include "dgPhysics.h"

but I had to change it to:
Code: Select all
#include "../core/dg.h"
#include "../physics/dgPhysics.h"


My target is still to make Newton+SIMD work with the Android NDK :) Of course when (or "if") I do it, the sources/patches will be posted here.

I've read about LOCAL_ARM_MODE in ANDROID-MK.html, but it didn't say anything about a performance boost. Thanks for the hint, 20-30% from a single line is really nice! I don't understand why it isn't set to ARM by default. To save space or memory? IMHO the point of using the Android NDK is to get better speed (or to port some apps more easily, mainly games).

I'm aware of the Tegra 2 issue (that's why I chose the SGS2 instead of, for example, the LG Optimus 2X). I'm currently on a learning path, so whatever I do will take at least 6-12 months (or more), and by then new high-end Android devices should come with NEON (just a hope).

Thanks again for your help, and special thanks to Julio Jerez for great Newton!
Last edited by Krystian on Sun Jul 24, 2011 12:07 pm, edited 1 time in total.

Re: Newton on Android NDK (ARM-NEON)

Postby martinsm » Fri Jul 22, 2011 6:28 pm

Krystian wrote:I don't understand why it isn't set to ARM by default. To save space or memory?

Yes, exactly. THUMB code is considerably smaller than the equivalent ARM code (roughly 30% smaller is the commonly cited figure). ARM devices usually have limited memory, especially L1 cache (32KiB for data and 32KiB for code, which is very small). For simple applications it is reasonable to use THUMB. For games, not so much.

Re: Newton on Android NDK (ARM-NEON)

Postby Krystian » Sun Jul 24, 2011 5:15 pm

Larger instructions mean a somewhat higher chance of a cache miss in very specific code, but I assume it's still much better to use ARM code generation than THUMB (for games).

Thanks to martinsm's help (or more likely his almost complete solution for getting Newton to compile on the Android NDK), my current Android.mk for building Newton 2.x (latest rev. 687) is in the attachment. It should be placed in {newton}\coreLibrary_200\projets\android\Android.mk
There is no need to make any changes to the sources, but there was one workaround with _LINUX_VER_64, as martinsm said, for a problem with cpuid().
Update: It doesn't compile with Newton 3.x; I made a mistake during the compile test. (Retracted: "Note that it also compiles without any problems with Newton 3.x (coreLibrary_300), but I haven't yet tested whether it runs correctly, though I don't expect any problems.")
Of course it's without NEON/SIMD support (for now?).

My current main Android.mk to use Newton looks like this:
Code: Select all
LOCAL_PATH := $(call my-dir)
include $(CLEAR_VARS)

LOCAL_MODULE      := NewtonTest
LOCAL_SRC_FILES      := main.cpp
LOCAL_C_INCLUDES   := $(LOCAL_PATH)/newton/coreLibrary_200/source/newton
LOCAL_LDLIBS      := -llog -landroid
LOCAL_STATIC_LIBRARIES   := android_native_app_glue Newton
LOCAL_ARM_MODE      := arm
TARGET_ARCH_ABI      := armeabi-v7a

include $(BUILD_SHARED_LIBRARY)

include $(LOCAL_PATH)/newton/coreLibrary_200/projets/android/Android.mk

$(call import-module,android/native_app_glue)


Maybe these steps to make Newton work with the Android NDK are relatively easy for someone who already has experience with makefiles and/or Android, but for me it was a great practical lesson.

PS. I can confirm what is said on the main page: "Newton Physics engine is cross-platform..." :)

(get the updated attachment from a few posts below)
Last edited by Krystian on Wed Aug 03, 2011 2:02 pm, edited 3 times in total.

Re: Newton on Android NDK (ARM-NEON)

Postby Julio Jerez » Sun Jul 24, 2011 6:13 pm

Can I put that package on the google download?
Julio Jerez
Moderator
 
Posts: 12452
Joined: Sun Sep 14, 2003 2:18 pm
Location: Los Angeles

Re: Newton on Android NDK (ARM-NEON)

Postby Krystian » Sun Jul 24, 2011 6:40 pm

Of course. I'm not sure if I did everything as it should be; we'll see if anybody reports something.
Thanks for making Newton code portable :)

Re: Newton on Android NDK (ARM-NEON)

Postby Krystian » Sat Jul 30, 2011 10:52 am

It doesn't compile with Newton 3.x, of course. I probably made some kind of mistake during the compile test (it's possible that I hadn't saved the .mk file while doing that test... my bad).
And as far as I can see, Newton 3.x is SIMD-only (for now?) and also doesn't compile with __USE_DOUBLE_PRECISION__ (I used it just for the compile test).

I've started porting the NGD 2.x simd_Xxx macros to NEON, but I've found that NEON seems to lack an equivalent of _mm_shuffle_ps (instruction SHUFPS), which in general is a very simple instruction, but very powerful/useful and widely used in Newton. I think it's possible to work around this by simulating its behaviour, but my concern is whether that would add too much overhead.
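
To make the semantics concrete, here is a minimal scalar sketch of what SHUFPS does, which is the behaviour any NEON workaround would have to reproduce. DemoVec4, demoShuffle and demoShuffleMask are hypothetical names; a real port would operate on NEON registers with intrinsics rather than on a float array.

```cpp
#include <cassert>

// Hypothetical stand-in for a 4-float SIMD register (lane 0 is the lowest).
struct DemoVec4 { float v[4]; };

// Scalar emulation of SHUFPS / _mm_shuffle_ps(a, b, imm):
// the two low result lanes are picked from a, the two high lanes from b,
// each selected by one 2-bit field of imm.
inline DemoVec4 demoShuffle(const DemoVec4& a, const DemoVec4& b, unsigned imm)
{
    assert(imm < 256u);
    DemoVec4 r;
    r.v[0] = a.v[imm & 3u];
    r.v[1] = a.v[(imm >> 2) & 3u];
    r.v[2] = b.v[(imm >> 4) & 3u];
    r.v[3] = b.v[(imm >> 6) & 3u];
    return r;
}

// Same bit layout as the _MM_SHUFFLE(z, y, x, w) macro from xmmintrin.h.
constexpr unsigned demoShuffleMask(unsigned z, unsigned y, unsigned x, unsigned w)
{
    return (z << 6) | (y << 4) | (x << 2) | w;
}
```

Since the selector is a compile-time constant in practice, a port does not need one general routine; each mask that the engine actually uses could be mapped to whatever short instruction sequence suits it best.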

But I see that Newton 3 has a simd_128 class, which could be much easier to make work (at least partially, as a first step) with NEON.
Are there any plans to make a non-SIMD version of the simd_128 class (Newton 3), like _SCALAR_ARITHMETIC_ONLY previously, or has someone already implemented it and can share the code? If not, I will probably start trying to implement it next weekend. It could be a good starting point for me to port simd_128 to ARM NEON.

What are the current state and plans for Newton 3 :) ? Is there anything more/less than in this post?

PS. In the attachment I've made a new patch for NGD 2.x rev 687 (mk files for the Android NDK), with a small change so that it no longer requires the _LINUX_VER_64 definition.

Update1: I've managed to compile Newton 3.x on Android (with a few fixes/translations/commenting out a few lines), but ... because the simd_128 class (for ARM NEON) is for now almost a stub, it doesn't work correctly.

Update2: I'm now translating SSE to NEON; more than 85% of simd_128 is translated to NEON. I've just noticed that two constructors of simd_128 (SSE) don't work properly (however, they seem not to be used in Newton now, do they?):
Code: Select all
DG_INLINE simd_128 (dgInt32 a): m_type (_mm_set_ps1 (*(dgFloat32*)&a)) {}
DG_INLINE simd_128 (dgInt32 ix, dgInt32 iy, dgInt32 iz, dgInt32 iw): m_type(_mm_set_ps(*(dgFloat32*)&iw, *(dgFloat32*)&iz, *(dgFloat32*)&iy, *(dgFloat32*)&ix)) {}
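
It is not stated here why these constructors misbehave. One common suspect with casts like *(dgFloat32*)&a is strict aliasing, and that is only a guess. A minimal sketch of an aliasing-safe way to reinterpret integer bits as a float, assuming 32-bit IEEE-754 floats (demoBitsToFloat and demoFloatToBits are hypothetical helpers, not Newton code):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copies the 32 bits of an integer into a float without pointer casts.
inline float demoBitsToFloat(std::int32_t bits)
{
    static_assert(sizeof(float) == sizeof(std::int32_t), "float must be 32 bits");
    float f;
    std::memcpy(&f, &bits, sizeof f);   // defined behaviour, unlike *(float*)&bits
    return f;
}

// The reverse direction: read back the raw bit pattern of a float.
inline std::int32_t demoFloatToBits(float f)
{
    static_assert(sizeof(float) == sizeof(std::int32_t), "float must be 32 bits");
    std::int32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Compilers generally turn such a fixed-size memcpy into a plain register move, so this form should not cost anything compared with the cast.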


Update3: I've translated simd_128 to NEON, and my simple tests comparing simd_128 (SSE in VS2008) with simd_128 (ARM NEON) give almost exactly the same results. There is a small difference in division, but I can 'fix' that one by doing two reciprocal steps (vrecpsq_f32). The second difference is in InvSqrt; I've left that one as it is.
For example InvSqrt(3.0): SSE=0.577350, NEON=0.577347 (I don't think it's a relevant difference). Of course it may not be (/isn't?) well optimized, as I'm new to NEON/SSE and I'm not counting cycles yet ;)
But the main problem I've faced is too many ICEs in GCC when compiling the translated simd_128 with Newton. I'll get back to this problem next week.
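
The "reciprocal steps" mentioned in Update3 are Newton-Raphson refinements of a rough hardware estimate. A minimal scalar sketch of the arithmetic, assuming a starting estimate is already available (on NEON it would come from vrecpeq_f32 or vrsqrteq_f32; the demo* names are hypothetical):

```cpp
#include <cassert>
#include <cmath>

// One Newton-Raphson step for 1/d:  x' = x * (2 - d * x).
// The (2 - d * x) factor is what NEON's VRECPS instruction computes.
inline float demoRefineRecip(float d, float x)
{
    return x * (2.0f - d * x);
}

// One Newton-Raphson step for 1/sqrt(d):  x' = x * (3 - d * x * x) / 2.
// The (3 - a * b) / 2 factor is what NEON's VRSQRTS instruction computes.
inline float demoRefineRsqrt(float d, float x)
{
    return x * (3.0f - d * x * x) * 0.5f;
}

// Relative error of an approximation, used to watch the convergence.
inline float demoRelError(float approx, float exact)
{
    return std::fabs(approx - exact) / std::fabs(exact);
}
```

Each step roughly squares the relative error, so a small remaining gap such as 0.577347 versus 0.577350 is the kind of difference that one additional refinement step would normally be expected to close.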

Attachments
Netwon2_AndroidNDK_NoSimd_rev687.zip
(1.76 KiB) Downloaded 542 times

Re: Newton on Android NDK (ARM-NEON)

Postby syl » Mon Sep 05, 2011 4:40 pm

Thanks for your work! Useful :) Also, did you really finish the SIMD work, and do you think core_300 could be used then?
syl
 
Posts: 7
Joined: Sat Sep 03, 2011 9:45 am

Re: Newton on Android NDK (ARM-NEON)

Postby Julio Jerez » Mon Sep 05, 2011 5:04 pm

Core 300 is much better for porting the SIMD code. I actually made the class with multiplatform in mind.

Core 200 uses macros, but that is very limiting because macros do not support operator overloading, so the code is ugly and error prone.
The class is by far better, and surprisingly the class generates the exact same binary as the macros, and in some cases an even better one.

Also, the class is prepared for using the 256-bit SIMD code, and that is where the benefit starts to shine, since it requires very little code change.

Re: Newton on Android NDK (ARM-NEON)

Postby Krystian » Mon Sep 05, 2011 6:11 pm

I haven't moved on with core300 on the NDK since Update3 (read above); I'm getting ICEs (Internal Compiler Errors) in GCC...
The simd_128 class is 'translated' to ARM NEON and works correctly in a small test, but compiling Core300 gives ICEs. I'll post that class here this week; maybe someone will be able to work around (/fix) those ICEs (and optimize the translation of simd_128 to NEON). I'll also post a new patch for Newton with Android.mk files for dMath, dContainers, dCustomJoints, tinyxml and dScene (also with very small 'fixes' to the code to compile it under GCC).

Julio: "I actually made the class with multiplatform in mind." Thanks, the translation was fairly easy, even for me, who never worked with SIMD/VFP/NEON before :) In core200, when I tried to do that translation, my main problem was the lack of a SHUFPS equivalent in NEON.

@Julio
What's the status of Core300? Is it 'ready' to replace Core200, I mean does 'everything' work as in Core200? I'm asking because I saw in one of the Newton tutorials that, for example, car wheels 'do not work' in Core300 (I mean there is no collision with the terrain).

Re: Newton on Android NDK (ARM-NEON)

Postby Julio Jerez » Tue Sep 06, 2011 1:42 am

Krystian wrote:Julio: "I actually made the class with multiplatform in mind." Thanks, the translation was fairly easy, even for me, who never worked with SIMD/VFP/NEON before :) In core200, when I tried to do that translation, my main problem was the lack of a SHUFPS equivalent in NEON.


Yes, that was the idea. Shuffle operations are hard to deal with using macros, but a class can easily use the best possible instructions.
For example, on AltiVec it would use the correct permute instruction, and on newer Intel it can also use the new shuffle instructions, but on old Intel it can simply emulate the shuffle in C code, because the shuffle on older SIMD is very primitive. The end user will not have to worry about it.
I like that you successfully ported it to NEON/VFP.

Krystian wrote:@Julio
What's the status of Core300? Is it 'ready' to replace Core200, I mean does 'everything' work as in Core200? I'm asking because I saw in one of the Newton tutorials that, for example, car wheels 'do not work' in Core300 (I mean there is no collision with the terrain).

Core 300 is by far superior to core 200; however, I have not converted all the features yet. I took a break until I complete the new scripting language I am working on.

I am actually writing a true hybrid that uses the best features of Java, C# and Objective-C. It will be integrated with core 300 and the level editor.
I am having too many problems with the visual interface in C++. I realized that the trick is to make a language with built-in GUI functionality that integrates seamlessly with C++ and C.
After I have the script language in a stable state, I will resume the core 300 integration.

However, core 300 is usable as it is now, but not all features are ported yet.
It has far superior multithreading support. The multithreading was changed from a data-parallel base to a first-in, first-served threading system.
This may have some problems on platforms with slow atomics or with no atomics at all,
but all we need to worry about is that the atoms dispatched to each thread have enough instructions to outweigh the time the cores wait on the semaphore.
In core 300, 100% of the engine is multithreaded, while in core 200 the solver had a very hard time separating the work among cores, and only part of the broad phase is multithreaded.
So far, in all my tests core 300 is faster than core 200 when using multithreading, and in some cases the difference is up to three times better.

I have run tests in core 300 with 8 and 10 thousand falling objects, and it still runs at interactive rates using the 4 cores available.
Also, in core 300 the number of threads no longer has to be equal to the number of cores for best performance; in core 300 I can set 32 threads and, surprisingly, it runs even better.
The scene with 8 thousand objects runs much, much slower in core 200 than it does in core 300 when using, for example, 16 threads on my Intel Core i7 with 8 hardware threads.
Amazingly, even hyper-threaded systems yield better performance with the new threading system.

Re: Newton on Android NDK (ARM-NEON)

Postby Julio Jerez » Tue Sep 06, 2011 2:05 am

Krystian wrote:But I see that Newton 3 has a simd_128 class, which could be much easier to make work (at least partially, as a first step) with NEON.
Are there any plans to make a non-SIMD version of the simd_128 class (Newton 3), like _SCALAR_ARITHMETIC_ONLY previously, or has someone already implemented it and can share the code? If not, I will probably start trying to implement it next weekend. It could be a good starting point for me to port simd_128 to ARM NEON.


That is very possible; I was thinking about that. We can make a simd_128 using floats, and that way people can simply use the SIMD version on platforms that do not have SIMD code.
My first thought was to just write the class and have only one solver, but that is a mistake, because the engine needs both solutions, SIMD and floats, and in the future even larger vectors like Intel AVX.
But making a simd_128 that emulates the class with floats is a very good solution for a quick integration.
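
As a rough idea of what such a float-backed class could look like, here is a minimal sketch. DemoSimd128 and its members are hypothetical and do not mirror the real simd_128 interface; it only shows how operator overloading lets the calling code stay the same whether the lanes live in a SIMD register or in a plain float array.

```cpp
#include <cassert>

// Hypothetical float-only stand-in for a 4-lane SIMD class.
class DemoSimd128
{
public:
    DemoSimd128(float x, float y, float z, float w) : m_v{x, y, z, w} {}
    explicit DemoSimd128(float a) : m_v{a, a, a, a} {}   // broadcast one value

    // Lane-wise arithmetic, written once here instead of in every caller.
    DemoSimd128 operator+(const DemoSimd128& b) const
    {
        return DemoSimd128(m_v[0] + b.m_v[0], m_v[1] + b.m_v[1],
                           m_v[2] + b.m_v[2], m_v[3] + b.m_v[3]);
    }
    DemoSimd128 operator-(const DemoSimd128& b) const
    {
        return DemoSimd128(m_v[0] - b.m_v[0], m_v[1] - b.m_v[1],
                           m_v[2] - b.m_v[2], m_v[3] - b.m_v[3]);
    }
    DemoSimd128 operator*(const DemoSimd128& b) const
    {
        return DemoSimd128(m_v[0] * b.m_v[0], m_v[1] * b.m_v[1],
                           m_v[2] * b.m_v[2], m_v[3] * b.m_v[3]);
    }

    // Read a single lane (0..3).
    float operator[](int i) const { assert(i >= 0 && i < 4); return m_v[i]; }

    // Sum of all four lanes.
    float AddHorizontal() const { return (m_v[0] + m_v[1]) + (m_v[2] + m_v[3]); }

private:
    float m_v[4];
};
```

An expression such as a * b + a then reads the same as it would with an SSE- or NEON-backed class, which is the advantage over macros that was described earlier in the thread.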

Re: Newton on Android NDK (ARM-NEON)

Postby Krystian » Sun Sep 11, 2011 5:55 am


Re: Newton on Android NDK (ARM-NEON)

Postby pHySiQuE » Sun Sep 18, 2011 10:42 pm

I am trying to build this with Android NDK 2.2. I get this error:
dgVector.h:88: error: ISO C++ forbids declaration of '__m128' with no type

I did add this in the makefile:
Code: Select all
LOCAL_CFLAGS    := -ffast-math -freciprocal-math -funsafe-math-optimizations -fsingle-precision-constant -D_LINUX_VER -D_LINUX_VER_64 -D_SCALAR_ARITHMETIC_ONLY
pHySiQuE
 
Posts: 608
Joined: Fri Sep 02, 2011 9:54 pm
