Porting Death Rally - help needed


SinanSam460

Posted on: 2023/2/14 11:39 #1

Not too shy to talk

@all

I have compiled BSzili's BE port of Death Rally using a gcc cross-compiler and newlib.

https://github.com/BSzili/dRally

The game compiles fine, but there is a strange problem: it runs without crashes on the A1222, while on the X5000 and Sam460 it crashes in this section of the source code.

What could the problem be?


...

extern __BYTE__ ___1e6ed0h[];

...

double dRMath_cos(double dval){ return cos(dval); }

...



static void helper55(int n, float * xf, float * yf, double angle){

    struct_35e_t *     s_35e = (struct_35e_t *)___1e6ed0h;

    angle = ((double)s_35e[n].Direction+angle)*A_PI/180.0;

    *xf = ((double)s_35e[n].XLocation+12.0*dRMath_sin(angle));
    *yf = ((double)s_35e[n].YLocation+12.0*dRMath_cos(angle)*L0_83);
}

Sinan - AmigaOS4 Beta-Tester
- AmigaOne X5000
- AmigaOne A1222
- Sam460ex

geennaam

Re: Porting Death Rally - help needed

Posted on: 2023/2/14 12:02 #2

Quite a regular

@SinanSam460

I'm not really much of a coder, so I would not use this kind of construction myself without knowing exactly what the resulting behaviour is.

But are you sure that it is wise to fill a pointer to a float (32 bits) with a double (64 bits) result? I can imagine that the compiler would at least complain about a missing cast.

The difference in behaviour could come from an unintended compiler decision for this construction versus the FP emulation code on the A1222.

Furthermore, the issue could be somewhere else as well. Make sure that n is within bounds, for example.

On the positive side: it is good to see that the A1222 actually behaves as intended by the original coders without having a compatible FPU.

Edited by geennaam on 2023/2/14 12:18:24

joerg

Re: Porting Death Rally - help needed

Posted on: 2023/2/14 17:30 #3

Just can't stay away

@SinanSam460
Quote:

Game runs fine without crashes on A1222
While it crashes on X5000 and Sam460 with a crash related to this section in the source code ?

What kind of crash? If it's an alignment exception, extern __BYTE__ ___1e6ed0h[]; is probably not 32-bit (float) or 64-bit (double) aligned. That may crash with an alignment exception on a real FPU, but not with an FPU emulator that uses integer accesses.
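One way to rule out the array base as the culprit is to give the byte storage the alignment of the widest type that will be viewed through it. The following is only a minimal sketch: `byte_t`, `aligned_bytes_t`, `g_storage` and `is_aligned` are hypothetical names for illustration and do not come from the dRally source.

```c
#include <stddef.h>
#include <stdint.h>

typedef unsigned char byte_t;   /* stand-in for the port's __BYTE__ type */

/* Hypothetical storage: sharing a union with a double makes the byte array
   start at an address that is aligned at least as strictly as a double is
   on this platform. */
typedef union {
    double force_alignment;
    byte_t bytes[64];
} aligned_bytes_t;

static aligned_bytes_t g_storage;

/* Returns 1 when p is a multiple of `alignment` bytes, 0 otherwise. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

An aligned base only helps if every offset used inside the array is also a multiple of the access size, so a check like `is_aligned(g_storage.bytes + offset, sizeof(float))` is still worth running in a debug build.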

SinanSam460

Re: Porting Death Rally - help needed

Posted on: 2023/2/14 21:39 #4

Not too shy to talk

@joerg

Crash is as follows:


Crash occured in module drally_linux at address 0x7F2822F0
Type of crash: alignment exception
Alert number: 0x80000003

Register dump:
GPR (General Purpose Registers):
   0: 7F2822D4 5B018220 00000000 00000001 5B24264E 5B242652 01F50000 01F50000
   8: 3FC53923 5B24264E 3FE53923 5B018220 000001C8 5B02C534 5B3265A0 00000001
  16: 00000000 00000000 5FD5D9D0 00000000 5B018DD8 7F21A558 0000000D 5B326588
  24: 00000000 5B326578 00000000 00000000 0000000F 5B242540 00000003 5B018220

FPR (Floating Point Registers, NaN = Not a Number):
   0:              nan         0.788011      1.78442e-17                0
   4:       4.5036e+15       4.5036e+15                0                0
   8:     -2.75573e-07      2.48016e-05         0.165806              0.5
  12:         0.788011         0.333333             -nan    -1.39828e+308
  16:    -3.49614e+307     -4.1353e+307              nan    -2.19577e+306
  20:              nan    -1.62943e+308      1.3926e+308     -2.68448e+38
  24:     1.83323e+307    -6.05159e+302      6.61829e+37    -7.34871e+307
  28:    -2.59471e+306     4.90317e+306     7.97058e+307              413

FPSCR (Floating Point Status and Control Register): 0x82024000

SPRs (Special Purpose Registers):
           Machine State (msr) : 0x0002F030
                Condition (cr) : 0x00000010
      Instruction Pointer (ip) : 0x7F2822F0
       Xtended Exception (xer) : 0x02080000
                   Count (ctr) : 0x5DFD3040
                     Link (lr) : 0x020A9C2C
            DSI Status (dsisr) : 0x59920DE0
            Data Address (dar) : 0x0183FB84

680x0 emulated registers:
DATA: 00000001 00000001 00000000 00000000 00000000 00000000 00000000 00000000
ADDR: 6FFA6000 A38BAC00 00000000 00000000 00000000 00000000 00000000 5B0182E0
FPU0:                0                0                0                0
FPU4:                0                0                0                0

Symbol info:
Instruction pointer 0x7F2822F0 belongs to module "drally_linux" (PowerPC)
Symbol: helper55 + 0xBC in section 1 offset 0x0006B2EC

Stack trace:
    [race___54668h.c:63] drally_linux:helper55()+0xbc (section 1 @ 0x6B2EC)
    [race___54668h.c:63] drally_linux:helper55()+0xa0 (section 1 @ 0x6B2D0)
    [race_main.c:497] drally_linux:race_main()+0x13d8 (section 1 @ 0x4F874)
    [___33010h.c:754] drally_linux:___33010h_cdecl()+0x27e8 (section 1 @ 0x1D8E8)
    [___3266ch.c:255] drally_linux:___3266ch()+0xafc (section 1 @ 0x29644)
    [shop_main.c:219] drally_linux:shop_main()+0x450 (section 1 @ 0x8122C)
    [menu___194a8h.c:100] drally_linux:menu___194a8h()+0x280 (section 1 @ 0x7C9CC)
    drally_linux:menu___194a8h_2()+0x18 (section 1 @ 0x7CCB4)
    [menu_main.c:221] drally_linux:menu_main()+0x5dc (section 1 @ 0x7C690)
    [___3e720h.c:69] drally_linux:___3e720h()+0xec (section 1 @ 0xC524)
    [drally.c:46] drally_linux:main()+0x50 (section 1 @ 0x35A4)
    native kernel module newlib.library.kmod+0x00002624
    native kernel module newlib.library.kmod+0x00003350
    native kernel module newlib.library.kmod+0x00003874
    drally_linux:_start()+0x1e0 (section 1 @ 0x3280)
    native kernel module kernel.debug+0x00081084
    native kernel module kernel.debug+0x000810fc

PPC disassembly:
 7f2822e8: fc000018   frsp              f0,f0
 7f2822ec: 813f001c   lwz               r9,28(r31)
*7f2822f0: d0090000   stfs              f0,0(r9)
 7f2822f4: 813f0018   lwz               r9,24(r31)
 7f2822f8: 1d29035e   mulli             r9,r9,862


geennaam

Re: Porting Death Rally - help needed

Posted on: 2023/2/14 22:52 #5

Quite a regular

@SinanSam460

Assuming the asterisk marks the failing instruction and I am reading the disassembly correctly, your code is trying to store a single-precision float result (f0) at a non-aligned memory location, 0x5B24264E (r9). Since a double is converted to single precision just before the destination is loaded and the store is done, the destination is likely *xf or *yf.

Also f0 is NaN. So something is really wrong here.

Can you copy and paste the contents of the disassembly tab of the grim reaper window? It might give a hint where things start to go wrong.
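The reading of the marked instruction can be double-checked by decoding the raw instruction word from the crash log. This is a small sketch of a PowerPC D-form decoder; `dform_t` and `decode_dform` are hypothetical helper names, and only the field layout (6-bit primary opcode, two 5-bit register fields, signed 16-bit displacement) is taken from the PowerPC instruction format.

```c
#include <stdint.h>

/* Fields of a PowerPC D-form instruction word (used by stfs, lfs, lwz, ...). */
typedef struct {
    unsigned opcode;  /* primary opcode, top 6 bits (52 = stfs, 32 = lwz) */
    unsigned rs;      /* source/target register (the FPR for stfs) */
    unsigned ra;      /* base address register */
    int16_t  d;       /* signed 16-bit displacement */
} dform_t;

/* Split a 32-bit instruction word into its D-form fields. */
static dform_t decode_dform(uint32_t insn)
{
    dform_t f;
    f.opcode = (insn >> 26) & 0x3Fu;
    f.rs     = (insn >> 21) & 0x1Fu;
    f.ra     = (insn >> 16) & 0x1Fu;
    f.d      = (int16_t)(insn & 0xFFFFu);
    return f;
}
```

Decoding 0xd0090000 this way gives opcode 52 with register fields 0 and 9 and displacement 0, which matches `stfs f0,0(r9)` in the Grim Reaper output.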

Edited by geennaam on 2023/2/14 23:11:33
Edited by geennaam on 2023/2/14 23:15:03
Edited by geennaam on 2023/2/14 23:20:29
Edited by geennaam on 2023/2/14 23:20:56

SinanSam460

Re: Porting Death Rally - help needed

Posted on: 2023/2/15 5:35 #6

Not too shy to talk

@geennaam

Here is a screenshot of the PPC disassembly window:

[Image: screenshot of the PPC disassembly window]

Edited by SinanSam460 on 2023/2/15 20:21:54


geennaam

Re: Porting Death Rally - help needed

Posted on: 2023/2/15 10:23 #7

Quite a regular

@SinanSam460

Is this source code reverse engineered? That would at least explain the weird naming.

I am pretty sure that the problem is elsewhere in the code.

The disassembly shows that *xf points to a 16-bit aligned address (0x5B24264E) where it must be 32-bit aligned. The pointer itself is stored at a 32-bit aligned address (0x5B01823C).


lfd f1,40(r31)      Load f1 (double) from address 0x5B018248
bl 0x7F73668C       Branch and link to sin()
fmr f12,f1          Copy f1 to f12 (result of the sine function)
lis r9,23527        Load immediate shifted (r9 = 0x5BE70000)
lfd f0,-7736(r9)    Load f0 with a double from address 0x5BE6E1C8
fmul f0,f12,f0      f0 = 0.788011 * f0
fadd f0,f31,f0      f0 = 413.0 + f0
frsp f0,f0          Round double to single precision
lwz r9,28(r31)      Load r9 with the word at address 0x5B01823C
stfs f0,0(r9)       Store the float in f0 to 0x5B24264E

The double loads are 64-bit aligned, so that is OK.
The value loaded from (double)s_35e[n].XLocation seems to be 413.0. Strange, because this is supposed to be a __BYTE__.
The NaN is most likely the result of something bogus loaded from 0x5BE6E1C8 by lfd f0,-7736(r9). This should have been 12.0.

The question remains why there is an alignment issue only on the X5000 and Sam460, and specifically on AmigaOS4, because as I understand it the MOS version runs fine.
But this is a question for the compiler experts.
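The addresses in this walkthrough can be re-derived with a few lines of integer arithmetic. This is a sketch only; `lis_disp_address` and `misalignment` are hypothetical helper names, and the inputs are the immediates and register values quoted from the crash log.

```c
#include <stdint.h>

/* Effective address of a D-form access that follows `lis rX,hi`:
   (hi << 16) plus the sign-extended 16-bit displacement. */
static uint32_t lis_disp_address(uint16_t hi, int16_t disp)
{
    return ((uint32_t)hi << 16) + (uint32_t)(int32_t)disp;
}

/* Number of bytes by which `address` misses a multiple of `access_size`
   (0 means the access is naturally aligned). */
static uint32_t misalignment(uint32_t address, uint32_t access_size)
{
    return address % access_size;
}
```

With these, `lis_disp_address(23527, -7736)` is 0x5BE6E1C8, which is a multiple of 8, while `misalignment(0x5B24264E, 4)` is 2: the double load is aligned and the single-precision store is off by two bytes.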

SinanSam460

Re: Porting Death Rally - help needed

Posted on: 2023/2/15 10:37 #8

Not too shy to talk

@geennaam

Thanks for the analysis. Yes, Death Rally is reverse engineered.

Is there a gcc switch that handles these kinds of operations?


kas1e

Re: Porting Death Rally - help needed

Posted on: 2023/2/15 10:44 #9

Home away from home

@geennaam
Quote:

The question remains why there is only an alignment issue on the X5k and sam460. And then specifically on AmigaOS4. Because as I understand it, the MOS version runs fine.
But this is a question for the compiler experts.

I ran into this X5000-only alignment issue when working on the Irrlicht Engine port, where one of the loaders was written without any regard for alignment.
I then asked the developers of our kernel and was told that the PowerPC architecture does not allow _ANY_ unaligned access. That is, 16-bit accesses must be 16-bit aligned, 32-bit accesses must be 32-bit aligned, etc.

But while alignment exceptions are expected when floats are accessed at unaligned addresses, the OS4 kernel does have an emulator for unaligned floating point access; it's just pretty slow (and works at a pretty high abstraction level).
It is also enabled on all machines (including the X5000), since the emulation is part of exec and not the HAL.

The problem we have on the X5000 is probably caused by 4 missing opcodes (lfs, lfsu, stfs, stfsu), which still need to be implemented for the X5000; this hasn't been done yet.

While unaligned memory access looks like a bad thing coming from bad code, real life says it is better to handle this situation without a crash, even if it will be some milliseconds slower.

Mathias created a simple test case which can be checked on all machines:


#include <stdio.h>

int main(int argc, char **argv)
{
    // Declare a 16-byte buffer, it will be aligned on 16 bytes
    printf("A buffer contains the same 4-byte pattern at index 1 (unaligned) and 8 (aligned)\n");
    char buffer[16] = {0, 60, 127, 113, 58, 5, 6, 7, 60, 127, 113, 58, 12, 13, 14, 15};
    volatile char * ptr;

    // Read the reference pattern at an aligned address (buffer + 8)
    ptr = buffer + 8;
    printf("Read the reference pattern at an aligned address (buffer + 8) (addr = %p)\n", ptr);
    printf("float = %f\n", *(float *)ptr);

    // Read the same pattern at an unaligned address (buffer + 1)
    ptr = buffer + 1;
    printf("Read the same pattern at an unaligned address (buffer + 1) (addr = %p)\n", ptr);
    printf("float = %f\n", *(float *)ptr);

    return 0;
}

So while it works on some machines, it crashes on the X5000 for sure. Probably the reason it works on machines other than the X5000 is that the PA6T CPU in the X1000 allows unaligned floating point access.

I created a bug report about a year or two ago, so the devs are aware.

@Sinan
You say it crashes on the X1000 too? Take care if you use any AltiVec parts, because if so, it will crash on the X1000 as well.

We could all take the code I posted and check it on different machines; I can do so on X1000, X5000, Sam460 and Pegasos2, if anyone is interested.
But that will not change the kernel for us, of course, and the faster way is probably to deal with the unaligned code in the game itself.
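Dealing with the unaligned code in the game itself usually means never dereferencing a `float *` that might be misaligned, and copying the bytes instead. The sketch below shows the common `memcpy` idiom; `load_float_unaligned` and `store_float_unaligned` are hypothetical helper names, and this is not necessarily how the dRally sources were eventually fixed.

```c
#include <string.h>

/* Read a float from an address with no alignment guarantee.  The bytes are
   copied into a properly aligned local, so the code never promises the
   compiler that `src` is suitable for a direct float load. */
static float load_float_unaligned(const void *src)
{
    float value;
    memcpy(&value, src, sizeof value);
    return value;
}

/* Write a float to an address with no alignment guarantee. */
static void store_float_unaligned(void *dst, float value)
{
    memcpy(dst, &value, sizeof value);
}
```

In the test case above, `load_float_unaligned(buffer + 1)` would replace `*(float *)ptr`. How the compiler expands the copy depends on target and flags, so the result should still be verified on the affected machines.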

Join us to improve dopus5!
AmigaOS4 on youtube

SinanSam460

Re: Porting Death Rally - help needed

Posted on: 2023/2/15 10:56 #10

Not too shy to talk

@kas1e

I don't have an X1000, but it crashes on the X5000/Sam460 and runs on the A1222.

If you are interested, here are the sources with my makefile.

https://drive.google.com/file/d/1zZKbp ... moM7e0XH/view?usp=sharing

Death Rally is free to play on Steam (for Windows)


SinanSam460

Re: Porting Death Rally - help needed

Posted on: 2023/2/15 11:12 #11

Not too shy to talk

@kas1e

I tried this code on WinUAE

A buffer contains the same 4-byte pattern at index 1 (unaligned) and 8 (aligned)
Read the reference pattern at an aligned address (buffer + 8) (addr = 0x3d7f0ce4)
float = 0.015591
Read the same pattern at an unaligned address (buffer + 1) (addr = 0x3d7f0cdd)
float = 0.015591
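The printed value can be checked by hand: read big-endian, the bytes 60, 127, 113, 58 form the word 0x3C7F713A, which as an IEEE-754 single has exponent 120 and therefore a value just under 2^-6. A small sketch that does the same conversion portably (assuming `float` is an IEEE-754 binary32 type; `float_from_be_bytes` is a hypothetical helper name):

```c
#include <stdint.h>
#include <string.h>

/* Assemble four bytes in big-endian order (as on PowerPC) into a 32-bit
   word and reinterpret that bit pattern as a float. */
static float float_from_be_bytes(const unsigned char b[4])
{
    uint32_t bits = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                    ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Feeding it the pattern from the test buffer yields approximately 0.015591, matching both lines of the output, so the aligned and unaligned reads returned the same data.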

I will try on real Amigas this week.

About bug report, can you tell me which one ?


m3x

Re: Porting Death Rally - help needed

Posted on: 2023/2/15 11:32 #12

Just popping in

@SinanSam460

the gcc option:

-mstrict-align

may help ?

Max Tretene, ACube Systems Srl, Soft3

geennaam

Re: Porting Death Rally - help needed

Posted on: 2023/2/15 12:53 #13

Quite a regular

@kas1e

Quote:

The problem which we have on x5000, is probably because of
missing 4 opcodes (lfs, lfsu, stfs, stfsu) which need to implement for x5k, but this wasn't done yet.

Do you mean that this misalignment handling code has not been implemented for those four instructions?

As you say, non-aligned allocations will result in performance issues anyway and should therefore be avoided.

It was my understanding that all allocations in OS4 are 32-bit aligned by default. So potentially only doubles and uint64 values could have alignment issues when you don't force the correct alignment. Or is this only true for IExec->AllocVecTags() calls and not for the standard C malloc()-like calls?

I must admit that I only use AllocVecTags() because it gives me as much control as possible at this abstraction level. As a hardware guy I have trust issues with compilers and OSes.

Looking at the names of the source files already gives me a headache. But forcing the memory allocations to be correctly aligned shouldn't be that hard.
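An aligned allocation is not always enough on its own. The crash log contains `mulli r9,r9,862`, and the type is named struct_35e_t (0x35e = 862), which suggests that the array elements are 862 bytes apart. The arithmetic below is only an illustration of that stride: `ELEMENT_SIZE`, `element_offset` and `field_misalignment` are hypothetical names, and the field offsets inside the real struct are not known from this thread.

```c
#include <stddef.h>

#define ELEMENT_SIZE 862u  /* 0x35e, from `mulli r9,r9,862` in the crash log */

/* Byte offset of element n from the start of the array. */
static size_t element_offset(size_t n)
{
    return n * ELEMENT_SIZE;
}

/* Misalignment (0 .. alignment-1) of a field located `field_offset` bytes
   into element n, assuming the array base itself is perfectly aligned. */
static size_t field_misalignment(size_t n, size_t field_offset, size_t alignment)
{
    return (element_offset(n) + field_offset) % alignment;
}
```

Because 862 is not a multiple of 4, a field that is 4-byte aligned in element 0 is off by 2 in element 1, aligned again in element 2, and so on, regardless of how the base was allocated.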

geennaam

Re: Porting Death Rally - help needed

Posted on: 2023/2/15 12:59 #14

Quite a regular

@m3x

Edit: Isn't this only for eg. alignments of variables within structures? And not for mallocs?
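As far as I understand the GCC documentation, -mstrict-align tells the compiler not to assume that the hardware handles unaligned references, so it matters for accesses the compiler already knows may be misaligned, such as members of packed structures. A dereference of a plain `float *` is still assumed to be naturally aligned. The sketch below contrasts the two cases; `packed_record`, `read_member` and `read_raw` are hypothetical names, and `__attribute__((packed))` is a GCC/Clang extension.

```c
#include <string.h>

/* Packed removes the padding after `tag`, so `value` sits at offset 1 and
   the compiler knows from the type that it may be misaligned. */
struct __attribute__((packed)) packed_record {
    unsigned char tag;
    float value;
};

/* Member access through the packed type: the compiler can choose an access
   sequence that is safe for the target and the flags in use. */
static float read_member(const struct packed_record *r)
{
    return r->value;
}

/* Raw bytes: casting to `float *` would promise natural alignment, so the
   bytes are copied instead of being dereferenced directly. */
static float read_raw(const unsigned char *p)
{
    float v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

This would be consistent with the flag making no difference for code that casts byte arrays to struct or float pointers, although that has not been verified against the dRally build.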

Edited by geennaam on 2023/2/15 14:00:38
Edited by geennaam on 2023/2/15 14:01:23

joerg

Re: Porting Death Rally - help needed

Posted on: 2023/2/15 17:53 #15

Just can't stay away

@geennaam
Quote:

It was my understanding that all allocations in OS4 are default 32bits aligned.

Yes, but strange things like in this code (casting a __BYTE__ array to some struct with float/double members) may of course still fail with an alignment exception.

Ancient AmigaOS 4.x GCC bug, but I wouldn't be surprised at all if it was never fixed (by making -mstrict-align the default): https://sourceforge.net/p/adtools/bugs/14/

@kas1e
Quote:

I then asked the developers of our kernel, and was told that the PowerPC architecture does not allow _ANY_ unaligned access. That is 16 bit must be 16 bit aligned, 32 bit must be 32 bit aligned, etc.

This is only partially true.
Any integer access has to be (at least) 16-bit aligned, and only accessing odd addresses will cause an alignment exception on all systems.
But all FPU accesses have to be at least 32-bit aligned, with the exception of the 440ep (and probably 460 CPUs with an "external FPU" too), and maybe POWER CPUs as well, where 64-bit alignment is required, or more correctly for the 440ep: FPU accesses must never cross a cache line boundary. The kernel has an alignment exception handler only for the 440 (and probably 460) CPUs, where this is a problem for code that works correctly on other CPUs.

Quote:

While unaligned memory access looks like a bad thing from bad code, the real live says that better to handle this situation without crash, even if it will be some milliseconds slower.

No, it's a bug in the code which has to be fixed, and it's not just a PowerPC/POWER issue; for example, unaligned FPU accesses don't work on x86/x64 CPUs either.

Edit: WarpOS software had a lot of wrongly aligned FPU accesses, partially because of the HUNK executable format used. Therefore my powerpc.library includes an alignment exception handler that emulates all FPU load/store instructions using integer ones, but the AmigaOS 4.x kernel doesn't do that.
On the A1222 there is AFAIK no FPU and the FPU instructions are emulated using integer accesses instead, just like in my powerpc.library, which is the reason it doesn't crash on the A1222, even if there are very likely some other bugs in the code, like the NaN in f0.
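To illustrate why an emulation built on integer accesses is insensitive to alignment, here is a rough sketch of a single-precision store done with byte writes only. It is not the code of powerpc.library or of the A1222 emulation; `emulated_store_float_be` is a hypothetical name, and the big-endian byte order is chosen to match PowerPC.

```c
#include <stdint.h>
#include <string.h>

/* Store a float the way an integer-based emulation of stfs could: take the
   32-bit bit pattern and write it one byte at a time, most significant byte
   first.  Byte writes have no alignment requirement. */
static void emulated_store_float_be(unsigned char *dst, float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);   /* reinterpret the float's bits */
    dst[0] = (unsigned char)(bits >> 24);
    dst[1] = (unsigned char)(bits >> 16);
    dst[2] = (unsigned char)(bits >> 8);
    dst[3] = (unsigned char)bits;
}
```

Storing 1.0f (bit pattern 0x3F800000) at an odd address writes the bytes 3F 80 00 00 without ever issuing a word-sized or FPU access.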

Edited by joerg on 2023/2/15 18:19:09
Edited by joerg on 2023/2/15 18:24:09
Edited by joerg on 2023/2/15 18:41:48

SinanSam460

Re: Porting Death Rally - help needed

Posted on: 2023/2/15 19:50 #16

Not too shy to talk

@m3x

Unfortunately it doesn't help...


SinanSam460

Re: Porting Death Rally - help needed

Posted on: 2023/2/15 19:53 #17

Not too shy to talk

@kas1e

I ran this small program on the Sam460 and it crashes.
That means it will also crash on the X5000.

It works on WinUAE and on the A1222 (I guess because the FPU is emulated).

Quote:

Mathias created a simple test case which can be checked on all machines:

#include <stdio.h>

int main(int argc, char **argv)
{
    // Declare a 16-byte buffer, it will be aligned on 16 bytes
    printf("A buffer contains the same 4-byte pattern at index 1 (unaligned) and 8 (aligned)\n");
    char buffer[16] = {0, 60, 127, 113, 58, 5, 6, 7, 60, 127, 113, 58, 12, 13, 14, 15};
    volatile char * ptr;

    // Read the reference pattern at an aligned address (buffer + 8)
    ptr = buffer + 8;
    printf("Read the reference pattern at an aligned address (buffer + 8) (addr = %p)\n", ptr);
    printf("float = %f\n", *(float *)ptr);

    // Read the same pattern at an unaligned address (buffer + 1)
    ptr = buffer + 1;
    printf("Read the same pattern at an unaligned address (buffer + 1) (addr = %p)\n", ptr);
    printf("float = %f\n", *(float *)ptr);

    return 0;
}


SinanSam460

Re: Porting Death Rally - help needed

Posted on: 2023/2/15 20:20 #18

Not too shy to talk

@all

Good news :)

BSzili told me to apply this commit:

https://github.com/enriquesomolinos/dR ... 0195d63178756bdac334a31ee

And the game is now working on the Sam460 :)


geennaam

Re: Porting Death Rally - help needed

Posted on: 2023/2/15 20:42 #19

Quite a regular

@SinanSam460

Great news. Looking forward to the game. Looks a bit like Micro Machines.

BSzili

Re: Porting Death Rally - help needed

Posted on: 2023/2/16 5:26 #20

Quite a regular

@SinanSam460
Credit where credit is due, it was urxp who gave you the tip on this.

This is just like television, only you can see much further.
