Raydium 3D Game Engine

Official forum for everything about Raydium, ManiaDrive, MeMak, ...

All times are UTC




Posted: Tue Aug 18, 2009 11:47 am
Joined: Thu Sep 29, 2005 2:59 pm
Posts: 828
$ uname -m -p -i -o
x86_64 unknown unknown GNU/Linux

It looks like uname only reports the bitness of the system (not of the CPU), which is actually fine for us. However, it cannot tell whether the CPU is AMD or Intel.
Can we detect an AMD processor without reading /proc/cpuinfo?

Update: Argh... I noticed the patch is wrong, don't use it! (I forced an option.)


Posted: Tue Aug 18, 2009 11:55 am
Joined: Wed May 06, 2009 2:06 pm
Posts: 30
I tested patch-amd64.c

With 32-bit Linux:
configure and make passed without errors.

With 64-bit (non-AMD) Linux:
configure succeeds, but make stops with "/usr/bin/ld: raydium/php/libs/libphp5.a(lt30-main.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC"

Why do you want to know whether it is AMD or Intel?


Posted: Tue Aug 18, 2009 12:00 pm
Joined: Thu Sep 29, 2005 2:59 pm
Posts: 828
I thought the problem happened only on AMD...


Posted: Tue Aug 18, 2009 12:09 pm
Joined: Sun Mar 16, 2003 2:53 am
Posts: 2591
Location: gnniiiii (Scrat)
Well, I'm a bit lost too. The -fPIC option is not desirable on "generic" (let's say Intel) x86_64, for performance reasons... Why do you end up with such a message, aapo?

There are perhaps two different problems (only a guess). Here's what I think:
- AMD64 needs -fPIC for both PHP and Raydium
- Intel x86_64 (and amd64 ?) may need another fix. See my last post here: viewtopic.php?f=7&t=575&hilit=lib64


Posted: Tue Aug 18, 2009 1:24 pm
Joined: Wed May 06, 2009 2:06 pm
Posts: 30
I talked about AMD, but that was my mistake: I have a 64-bit Intel.

If I just run configure and make, the build fails and suggests using -fPIC; with it I can compile (once I also fix the INT32 issue). I do not know why this works.

I tested ./configure --libdir=/usr/lib64 and it changes nothing: both the INT32 issue and the "relocation R_X86_64_32 against" error remain.


Posted: Tue Aug 18, 2009 7:14 pm
Joined: Thu Sep 29, 2005 2:59 pm
Posts: 828
New patch.
- This should work fine with AMD64 (not Intel yet).
- No forced options this time.
- The X include is loaded just before the jpeg include.
- uname is used for the bitness, but /proc is still used for the vendor (alternatives?)

St has confirmed that Mac is not affected by this problem at all.
Quote:
"The iCompile script included in the Raydium SDK for Mac OS X builds universal binaries, which covers all Mac architectures (and sub-architectures) available"


Aapo, what exact Intel CPU model do you have? Do you have any ideas for adding your case on top of this patch?

Patch: http://www.sendspace.com/file/s09qxj


Posted: Wed Aug 19, 2009 6:27 am
Joined: Tue Jul 08, 2008 2:37 am
Posts: 181
Here's the full answer I wrote on IRC yesterday to vicente's question about how to determine the processor type under Mac OS X, and how to guess whether it is an AMD64:

There's no Apple system available with the kind of architecture you want to check for (AMD64). Newer Mac systems ship with Intel-based chips.
The iCompile script included in the Raydium SDK for Mac OS X builds universal binaries, which cover all Mac architectures (and sub-architectures) available, so the binaries are bigger than on other systems like GNU/Linux.
Supported platforms on Mac OS X are IA-32, x86-64 and PowerPC (32-bit & 64-bit).
ARMv6 and ARMv7-A binaries for the iPhone OS are created by the iCompile script in the Raydium SDK for the iPhone OS.
There's no need to make any Raydium-related changes to support 64-bit platforms on any Apple system. The dependencies and binaries are fully universal, which means they run on every Apple machine.
So, IMHO, you don't have to guess whether the processor is an AMD64, because we'll build for every arch available and we don't have/need/provide a Raydium configure script on any Apple system, see Raydium R822 configure (M).
I hope this answers your question. :)

A long time ago in a galaxy far, far away I created a Makefile for a game project which I've abandoned, perhaps you can use some detection parts from it.


Posted: Wed Aug 19, 2009 6:49 am
Joined: Wed May 06, 2009 2:06 pm
Posts: 30
I know very little about the differences between Intel and AMD, but this page
http://www.intel.com/software/products/compilers/docs/flin/main_for/copts/common_options/option_fpic.htm
says:
Quote:
On systems using IA-32 or Intel® 64 architecture, -fpic must be used when building shared objects.


Why do you think -fpic is only for AMD?

The patch works if I use only the x86_64 check.


Posted: Wed Aug 19, 2009 8:19 am
Joined: Thu Sep 29, 2005 2:59 pm
Posts: 828
Hmm, I have no idea about the real usefulness and advantages/disadvantages of -fpic, but it looks like there are two different points of view here.
Aapo says -fpic should be enabled on all 64-bit architectures, but Xfennec says it shouldn't be, because it hurts performance. Have I understood correctly?


Posted: Wed Aug 19, 2009 8:58 am
Joined: Sun Mar 16, 2003 2:53 am
Posts: 2591
Location: gnniiiii (Scrat)
Vicente, you're right :)

The -fPIC option lets GCC avoid absolute addresses for global variables. Absolute referencing is OK on 32 bits, but uses twice as much memory on 64 bits (of course). So the idea behind -fPIC is simply to use (32-bit) offset referencing. It costs a few more instructions and reduces the allocation "page" space to 2 GB, but allows binaries to stay small.

Short story: -fPIC produces "relocatable" code, so it needs memory offset mapping, something that looks quite AMD64-specific/native to me. Let's sort it out.

Here's a sample C file :
Code:
int global = 0;

int test(int value)
{
    global = value;
    return ++global;
}

int main(void)
{
    test(29);
    return 0;
}


First compilation, without fPIC:
gcc a.c
objdump -d a.out
Test function looks like:

Code:
08048344 <test>:
 8048344:       55                      push   %ebp
 8048345:       89 e5                   mov    %esp,%ebp
 8048347:       8b 45 08                mov    0x8(%ebp),%eax
 804834a:       a3 58 95 04 08          mov    %eax,0x8049558
 804834f:       a1 58 95 04 08          mov    0x8049558,%eax
 8048354:       83 c0 01                add    $0x1,%eax
 8048357:       a3 58 95 04 08          mov    %eax,0x8049558
 804835c:       a1 58 95 04 08          mov    0x8049558,%eax
 8048361:       5d                      pop    %ebp
 8048362:       c3                      ret


Second test with fPIC:
gcc -fPIC a.c
objdump -d a.out

Code:
08048374 <test>:
 8048374:       55                      push   %ebp
 8048375:       89 e5                   mov    %esp,%ebp
 8048377:       e8 66 00 00 00          call   80483e2 <__i686.get_pc_thunk.cx>
 804837c:       81 c1 1c 12 00 00       add    $0x121c,%ecx
 8048382:       8b 91 fc ff ff ff       mov    -0x4(%ecx),%edx
 8048388:       8b 45 08                mov    0x8(%ebp),%eax
 804838b:       89 02                   mov    %eax,(%edx)
 804838d:       8b 81 fc ff ff ff       mov    -0x4(%ecx),%eax
 8048393:       8b 00                   mov    (%eax),%eax
 8048395:       8d 50 01                lea    0x1(%eax),%edx
 8048398:       8b 81 fc ff ff ff       mov    -0x4(%ecx),%eax
 804839e:       89 10                   mov    %edx,(%eax)
 80483a0:       8b 81 fc ff ff ff       mov    -0x4(%ecx),%eax
 80483a6:       8b 00                   mov    (%eax),%eax
 80483a8:       5d                      pop    %ebp
 80483a9:       c3                      ret


This result is from a 32-bit Intel. Even without particular asm skills, it's quite clear that -fPIC generates insane things, since the -fPIC version of the test() function calls another internal function (__i686.get_pc_thunk.cx) and does strange things with the ecx register. 16 instructions, versus 10 without -fPIC.

It's (a lot) slower, and the binary is larger! :) (In fact, I don't care about binary size; with Raydium, our only goal is speed.) In this case, -fPIC is a total failure.

So, please, let's run the same test on AMD64 (where GCC should use some CPU features to generate "light" relocatable code) and on Intel x86_64, where I think such features are not available, but I may be completely wrong :)

edit: made the post a bit more readable.


Posted: Wed Aug 19, 2009 9:09 am
Joined: Thu Sep 29, 2005 2:59 pm
Posts: 828
AMD64 without -fpic:

Code:
00000000004004ac <test>:
  4004ac:   55                      push   %rbp
  4004ad:   48 89 e5                mov    %rsp,%rbp
  4004b0:   89 7d fc                mov    %edi,-0x4(%rbp)
  4004b3:   8b 45 fc                mov    -0x4(%rbp),%eax
  4004b6:   89 05 6c 0b 20 00       mov    %eax,0x200b6c(%rip)        # 601028 <global>
  4004bc:   8b 05 66 0b 20 00       mov    0x200b66(%rip),%eax        # 601028 <global>
  4004c2:   83 c0 01                add    $0x1,%eax
  4004c5:   89 05 5d 0b 20 00       mov    %eax,0x200b5d(%rip)        # 601028 <global>
  4004cb:   8b 05 57 0b 20 00       mov    0x200b57(%rip),%eax        # 601028 <global>
  4004d1:   c9                      leaveq
  4004d2:   c3                      retq   


with -fpic:

Code:
00000000004004fc <test>:
  4004fc:   55                      push   %rbp
  4004fd:   48 89 e5                mov    %rsp,%rbp
  400500:   89 7d fc                mov    %edi,-0x4(%rbp)
  400503:   48 8b 15 d6 0a 20 00    mov    0x200ad6(%rip),%rdx        # 600fe0 <_DYNAMIC+0x1a8>
  40050a:   8b 45 fc                mov    -0x4(%rbp),%eax
  40050d:   89 02                   mov    %eax,(%rdx)
  40050f:   48 8b 05 ca 0a 20 00    mov    0x200aca(%rip),%rax        # 600fe0 <_DYNAMIC+0x1a8>
  400516:   8b 00                   mov    (%rax),%eax
  400518:   8d 50 01                lea    0x1(%rax),%edx
  40051b:   48 8b 05 be 0a 20 00    mov    0x200abe(%rip),%rax        # 600fe0 <_DYNAMIC+0x1a8>
  400522:   89 10                   mov    %edx,(%rax)
  400524:   48 8b 05 b5 0a 20 00    mov    0x200ab5(%rip),%rax        # 600fe0 <_DYNAMIC+0x1a8>
  40052b:   8b 00                   mov    (%rax),%eax
  40052d:   c9                      leaveq
  40052e:   c3                      retq   


We need more 64-bit testers to see whether -fpic is a must on all systems.


Posted: Wed Aug 19, 2009 9:41 am
Joined: Thu Sep 29, 2005 2:59 pm
Posts: 828
One more test, with CPU detection flags following AMD's (minimal) recommendations for GCC > 4.3:

gcc -fPIC -march=native -O2 a.c
Code:
0000000000400520 <test>:
  400520:   48 8b 15 b9 0a 20 00    mov    0x200ab9(%rip),%rdx        # 600fe0 <_DYNAMIC+0x1a8>
  400527:   8d 47 01                lea    0x1(%rdi),%eax
  40052a:   89 02                   mov    %eax,(%rdx)
  40052c:   c3                      retq   
  40052d:   0f 1f 40 00             nopl   0x0(%rax)
  400531:   66 66 66 66 66 66 2e    nopw   %cs:0x0(%rax,%rax,1)
  400538:   0f 1f 84 00 00 00 00
  40053f:   00
 


Posted: Wed Aug 19, 2009 10:11 am
Joined: Sun Mar 16, 2003 2:53 am
Posts: 2591
Location: gnniiiii (Scrat)
Vicente, can you try using -O3 directly, with and without -fPIC?
The other point is -march=native: the result is way better on your AMD64 system, but it means the binary is AMD64-specific :/ We should stick to 32 and 64 bits, not 32/Intel64/AMD64, don't you think?


Posted: Wed Aug 19, 2009 10:16 am
Joined: Thu Sep 29, 2005 2:59 pm
Posts: 828
Here it is.

gcc -O3 a.c
Code:
00000000004004b0 <test>:
  4004b0:   8d 47 01                lea    0x1(%rdi),%eax
  4004b3:   89 05 6f 0b 20 00       mov    %eax,0x200b6f(%rip)        # 601028 <global>
  4004b9:   c3                      retq   
  4004ba:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
 


Update: I had missed the "with and without -fPIC" part.

gcc -O3 -fPIC a.c
Code:
0000000000400500 <test>:
  400500:   48 8b 15 d9 0a 20 00    mov    0x200ad9(%rip),%rdx        # 600fe0 <_DYNAMIC+0x1a8>
  400507:   8d 47 01                lea    0x1(%rdi),%eax
  40050a:   89 02                   mov    %eax,(%rdx)
  40050c:   c3                      retq   
  40050d:   0f 1f 00                nopl   (%rax)


Posted: Wed Aug 19, 2009 10:49 am
Joined: Sun Mar 16, 2003 2:53 am
Posts: 2591
Location: gnniiiii (Scrat)
OK, thanks. Your results show that -fPIC is well supported on AMD64, even without asking GCC to use AMD64-specific features (-march=native), so it probably means that Intel x86_64 implements these features too (and that I was wrong on this point).

aapo, can you confirm this using the same tests ? (-O3 with and without -fPIC)

Another important point here is the huge effect of -O3 optimizations on 64-bit architectures; we should probably make it a default. I'll have a look at this last point, since I was thinking about creating a "raydium-config" binary/script providing cflags/libs/includes/... to the Makefile and *comp.sh.

