A1222 support in the SDK and problems


walkero

Posted on: 3/27 19:25 #1

Site Builder

I am starting this thread because I saw some messages about the SPE support in the last released SDK. Because I want to make it even better in the next releases, I would like you to post here the problems you have, along with examples or ideas.

Let's have a discussion on how to improve SDK support for SPE-based systems.


flash

Re: A1222 support in the SDK and problems

Posted on: 3/27 20:23 #2

Just popping in

@walkero

Thanks for starting this topic.

Here is my test case:


#include <stdio.h>

#define SPE 1

#if SPE
#include <spe.h>
#endif /* SPE */

float sum (float, float, float);

int main()
{
    double x,y,z,result;

    x = 1.0f;
    y = 1.0f;
    z = 1.0f;

    result = sum (x,y,z);

    printf ("should be 3: %f\n", result);

    return 0;
}

float sum (float a, float b, float c)
{
    return a + b + c;
}

The following is the makefile


CC=gcc
CFLAGS_SPE=-mcpu=8540 -mtune=8540 -mspe -mabi=spe -mfloat-gprs=double
OBJECTS=A1222_SPE_floats.o

A1222_SPE_floats: $(OBJECTS)
	$(CC) $(OBJECTS) $(CFLAGS_SPE) -o A1222_SPE_floats

A1222_SPE_floats.o: A1222_SPE_floats.c
	$(CC) $(INCLUDES) $(CFLAGS_SPE) -c A1222_SPE_floats.c -o A1222_SPE_floats.o

The result on the A1222 is 0; it should be 3.
Compiling the same code for a regular FPU gives the correct result.

The compiler used is the GCC 6 included in the latest (your) OS4 SDK.
Maybe I did something wrong, so any feedback is much appreciated.

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/4 8:37 #3

Not too shy to talk

INTRO - this post will be long; skip ahead if you want.
First, I have to thank everyone involved in A1222+ production, and all the coders who tried to compile something for SPE. Special thanks to Hans, flash, HunoPPC and MickJT (they kindly answered my stupid questions and requests).
The main reason for my thanks is that I have now been forced to return to programming in C, after twenty-five years of bash, Terraform and Ansible.

I read the P1022 CPU manual and found that SPE is an interesting feature: it is not only an embedded FPU, but also two SIMD engines (int + float). Not as powerful as AltiVec, but still nice.

1. Flash's example
Now back to flash's (and my) problem above:
The output from printf() (and from any external function that takes float parameters by value) is corrupt because float parameters are passed in registers, and the P1022 with SPE has no FPRs; it uses 64-bit GPRs instead.

So when the example above is compiled for SPE, it puts the "result" value in a 64-bit GPR. But printf() is compiled for the standard PowerPC ABI, so it reads the parameter from an FPR!
Since the P1022 has no FPRs, an exception is generated, AmigaOS runs the LTE FPU emulator, and the emulator returns the value of the emulated FPR, which holds whatever previous emulated operations left there. That value is of course totally irrelevant to our function call...

2. How to fix it
The answer is in Hans's nice Taborizing guide - see paragraph 2. Unfortunately, the guide is too short and is not written for blondes who did C coding 25 years ago.
So I had to install CodeBench (thx @walkero for the example) and the SDK, and start experimenting with some examples on a live A1222.

As Hans wrote, we cannot call standard PowerPC-compiled functions with float arguments from SPE code. But we can call functions that take a pointer to float (float*), because the value passed is an address, and addresses are stored in GPRs in both SPE and PowerPC code.
So for every standard function we need to call, we can write a replacement that takes its float arguments by reference.

3. Solution
Instead of using printf() directly in SPE code, I used a custom replacement, _pritnf__SPEfp_1(). It has no body here, only a declaration. All of this code is compiled with SPE:
> gcc -c -mcpu=8540 -mfloat-gprs=double -mabi=spe -mspe SPEfloat-printf-output.c -o SPEfloat-printf-output.o


/* spefloatprint main file */
/* compile with: */
/* gcc -c -mcpu=8540 -mfloat-gprs=double -mabi=spe -mspe */

#include <stdio.h>

/* transition function for float + double parameters */
int _pritnf__SPEfp_1 ( const char*, double* );

int main () {
    double LX, LY;

    /* all FPU-heavy code without external function calls should be here */
    /* ... */

    LX=1.5;
    LY=15.5;
    int i;

    /* calls to standard powerpc FP code have to be made this way, through a transition function */
    i=_pritnf__SPEfp_1 ("Hodnota SPE LX=%lf\n", &LX );
    i=_pritnf__SPEfp_1 ("Hodnota SPE LX=%lf\n", &LY );
    return (i);
}

Now we call the function with &LX, which is an address stored in a GPR.
Next we write the body of _pritnf__SPEfp_1(): simple code that calls printf(), compiled for the PowerPC ABI.
> gcc -c -mcpu=powerpc printf__SPEfp_1.c -o printf__SPEfp_1.o


/* spefloatprint transition function */
/* compile with: */
/* gcc -c -mcpu=powerpc  */

#include <stdio.h>

int _pritnf__SPEfp_1 ( const char* string, double* LD ) {
    /* receive a pointer to the FP parameter from SPE code, and pass the value directly to the standard powerpc function */
    double X;
    X=*LD;

    printf( string, X );
    return (1);
}

This code uses the PowerPC ABI, so it puts the float value into a standard FPR and simply calls printf(). Of course, all of this FPR access is emulated in LTE, but it works with consistent data.

The last steps are simple:
> gcc -c -mcpu=powerpc printf__SPEfp_1.c -o printf__SPEfp_1.o
> gcc SPEfloat-printf-output.o printf__SPEfp_1.o -o SPEfloat-printf-output
> SPEfloat-printf-output
Hodnota SPE LX=1.500000
Hodnota SPE LX=15.500000

I did it!

So it is the same as Hans's advice, but for those of us who are slower...

4. What next?
This is only a workaround to bridge the PowerPC ABI and the SPE ABI and still use the hardwired embedded FPU.
Btw, did you know that today's modern ARM CPUs have exactly the same problem with different FPUs and FPRs? But they have a solution: the gcc option "-mfloat-abi=softfp", which generates code that passes floating-point values in integer registers. Much the same as what we did above.

We haven't done anything yet with the main SPE feature: the SIMD units.
That is for next time. It is very time-consuming to do with only Basics of SIMD Programming for AltiVec + spe.h + SPEProgrammingEnvironmentsManual.

If you find some SPE SIMD examples, it will be great....

5. The end
The P1022 is not a crippled CPU - it is more powerful than a G3!!!
It just takes some work and time to use it properly...

And my last question: where is the best place for such discussion and examples? Here, on os4coding, or elsewhere?

Edited by sailor on 2024/4/4 9:21:10

AmigaOS3: Amiga 1200
AmigaOS4: Micro A1-C, AmigaOne XE, Pegasos II, Sam440ep, Sam440ep-flex, AmigaOne X1000
MorphOS: Efika 5200b, Pegasos I, Pegasos II, Powerbook, Mac Mini, iMac, Powermac Quad

nbache

Re: A1222 support in the SDK and problems

Posted on: 4/4 22:49 #4

Just can't stay away

@sailor

Thank you. Interesting and well explained.

I could even follow it, although I haven't coded C for almost as long as you.

BTW, I think you have a minor typo/copy-paste-error here:


   i=_pritnf__SPEfp_1 ("Hodnota SPE LX=%lf\n", &LX );
   i=_pritnf__SPEfp_1 ("Hodnota SPE LX=%lf\n", &LY );

In the second of the two lines you want the string to be "Hodnota SPE LY=%lf\n", right?

Best regards,

Niels

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/5 9:52 #5

Not too shy to talk

@nbache
Of course, it is copy-paste syndrome.

Fortunately, it is only the text; the &LX and &LY values are OK.

geennaam

Re: A1222 support in the SDK and problems

Posted on: 4/5 10:37 #6

Quite a regular

@sailor

Isn't there a typo in the printf function name as well? Or is it really called _pritnf__SPEfp_1 instead of _printf__SPEfp_1?

Quote:

If you find some SPE SIMD examples, it will be great....

I think SPE is programmed in a similar way to AltiVec or any other vector unit. There are just fewer vector types available, since it is only 64 bits wide.

Take a look at GCC's spe.h to see what is at your disposal. That header also contains the built-in functions available for vector operations. If you are lucky, it is just a matter of adjusting AltiVec code.
This programming manual gives more information about those built-in functions and data types and how to use them in C/C++: https://www.st.com/resource/en/program ... al-stmicroelectronics.pdf

Edited by geennaam on 2024/4/5 11:16:14

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/5 12:33 #7

Not too shy to talk

@geennaam

This time it is not a typo, only a less-than-ideal name.

All of the code above is very dirty - I wrote it in the evening in a hurry. My dogs were hungry.

In the future I will probably put all of a project's transition functions in one file and use one naming convention, something like:
origname_SPE(), or origname_SPE_1() if it takes a different number of parameters than the original. Like the printf() replacement above, which can print only one double.
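
As a rough sketch of that naming idea, a single transition file might look like the following. The file name spe_transition.c and the two-double variant printf_SPE_2() are hypothetical; printf_SPE_1() simply mirrors the one-double wrapper shown earlier in the thread under the proposed naming scheme. The assumption is that this file is compiled with -mcpu=powerpc, like the wrapper above, and that SPE code only sees the prototypes.

```c
/* spe_transition.c - hypothetical single file collecting all transition
 * functions of a project. Assumption: this file is built for the plain
 * PowerPC ABI (gcc -c -mcpu=powerpc), never with the SPE flags. */
#include <stdio.h>

/* printf() replacement that prints exactly one double.
 * The double is received by address, so only GPRs are involved in the
 * call from SPE code. Returns whatever printf() returns. */
int printf_SPE_1(const char *format, const double *a)
{
    return printf(format, *a);
}

/* Same idea for two doubles (hypothetical second variant). */
int printf_SPE_2(const char *format, const double *a, const double *b)
{
    return printf(format, *a, *b);
}
```

From SPE code the call would then look like printf_SPE_1("LX=%f\n", &LX); the suffix tells the reader how many doubles the wrapper expects.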

Yes, SPE programming is the same as other SIMD programming. SPE just has shorter vectors and fewer functions.
If we have AltiVec code, things are relatively simple, with one exception: the AltiVec permute unit instructions. There is no SPE equivalent, so it has to be checked carefully what the code actually does...

And the naming convention in spe.h is terrible, so I should make a cheat sheet first. I am using the SPEPEM and also the manual from the link. Thanks.

Edited by sailor on 2024/4/5 12:49:00

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/6 12:41 #8

Not too shy to talk

@all
Please, where can I find the PowerPC calling conventions for gcc/AmigaOS?

Something like this asm AIX stuff?
Or some other info about stack values and so on...

joerg

Re: A1222 support in the SDK and problems

Posted on: 4/6 12:54 #9

Just can't stay away

@sailor
AmigaOS 4.x uses the SysV PowerPC ABI, should be the same as AIX.
Only exceptions are R2 and R13, which aren't used at all (default) or as relative (small) data pointer: R13 when using -msdata, R2 when using -mbaserel.
http://refspecs.linux-foundation.org/elf/elfspec_ppc.pdf pages 30-32

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/6 16:30 #10

Not too shy to talk

@joerg
Thank you! It is exactly what I needed.

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/17 20:14 #11

Not too shy to talk

I have another question about SPE code:

If I need the math library, with #include <math.h> and "-lm", the standard PowerPC math code gets linked. I.e. it cannot be called directly from SPE code.
For now I am using the same workaround as for printf (above): the SPE code calls a transition function by pointer; the transition function is PowerPC code and calls the math function normally.
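
A minimal sketch of such a math transition function, assuming the same by-pointer convention as the printf wrapper above. The names ldexp_SPE() and frexp_SPE() are made up for illustration; ldexp() and frexp() are ordinary <math.h> functions and only stand in for whichever math call is needed. The result also goes through a pointer because, as far as I understand the ABI, a double returned by value from PowerPC code would come back in an FPR as well.

```c
/* Hypothetical math transition functions. Assumption: compiled for the
 * plain PowerPC ABI (gcc -c -mcpu=powerpc) and linked with -lm. */
#include <math.h>

/* result = x * 2^exponent; input and output doubles travel by address. */
void ldexp_SPE(const double *x, int exponent, double *result)
{
    *result = ldexp(*x, exponent);
}

/* Splits *x into a mantissa and a power-of-two exponent; for nonzero
 * finite input the mantissa magnitude lies in [0.5, 1). */
void frexp_SPE(const double *x, double *mantissa, int *exponent)
{
    *mantissa = frexp(*x, exponent);
}
```

Wrappers for sqrt(), sin() and similar functions would follow the same shape: pointers in, pointer out, no floating-point value passed or returned by value.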

But is there some workaround in the AmigaOS4 SDK itself?
I tried to use -lsoft-fp, but it does not exist: "ld: cannot find -lsoft-fp"..

And please, what is "SDK:clib2/lib/soft-float/libm.a" for? Is it some preparation for soft-fp calls, or something else? And how do I use it?

Thanks for any tips and clues.

flash

Re: A1222 support in the SDK and problems

Posted on: 4/17 21:05 #12

Just popping in

@sailor

As a workaround, you need to pass float parameters by reference and not by value.
Another solution is to pass them using heap space and not the stack.
To do this, you can use an array of floats and pass its base address to the function.
Another solution is to pass a struct with float variables as members.
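
The array and struct variants could look roughly like this. struct point3, format_point_SPE() and sum_floats_SPE() are illustrative names, not SDK functions; the assumption is again that this file is compiled for the standard PowerPC ABI while the caller is SPE code.

```c
#include <stddef.h>
#include <stdio.h>

/* Floats grouped in a struct: the caller passes one address. */
struct point3 {
    float x, y, z;
};

/* Formats the three members into buf; returns snprintf()'s result. */
int format_point_SPE(char *buf, size_t size, const struct point3 *p)
{
    return snprintf(buf, size, "%.1f %.1f %.1f", p->x, p->y, p->z);
}

/* Floats in an array: base address plus element count. The result is
 * written through a pointer too, so no float crosses the call by value. */
void sum_floats_SPE(const float *values, size_t count, float *result)
{
    float total = 0.0f;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    *result = total;
}
```

In both cases the only things that cross the ABI boundary are addresses and integers, which live in GPRs on either side.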

Anyway, the printf function is also bugged on the A1222 and needs to be fixed for floats.

I hope someone is working on it and a new SDK will be released soon.

Hans

Re: A1222 support in the SDK and problems

Posted on: 4/18 3:48 #13

Home away from home

@sailor

Quote:

The answer is in Hans's nice Taborizing guide - see paragraph 2. Unfortunately, the guide is too short and is not written for blondes who did C coding 25 years ago.

How could I make it more accessible to "blondes who did C coding 25 years ago"?

Hans

Join Kea Campus' Amiga Corner and support Amiga content creation
https://keasigmadelta.com/ - see more of my work

joerg

Re: A1222 support in the SDK and problems

Posted on: 4/18 6:23 #14

Just can't stay away

@sailor
Quote:

I tried to use -lsoft-fp, but it does not exist: "ld: cannot find -lsoft-fp"..

You have to use gcc -msoft-float ...
But if you do, not only your own code but all the libraries you are using, incl. the C library, have to be compiled with -msoft-float as well. You can't mix FPU code with soft-float code, at least not without workarounds similar to the ones you are using for SPE now.

There is probably no soft-float version of newlib. There used to be soft-float versions of clib2, but in case they are no longer available, rebuilding clib2 or clib4 with -msoft-float should be no big problem.
With a soft-float C library and your own code built with -msoft-float, you don't need workarounds for functions like printf() either, but of course an SPE C library for the A1222 would be much better than a soft-float one, which uses integer instructions and registers for float/double.

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/18 8:15 #15

Not too shy to talk

@flash
Quote:

flash wrote:
@sailor
As a workaround, you need to pass float parameters by reference and not by value.
Another solution is to pass them using heap space and not the stack.
To do this, you can use an array of floats and pass its base address to the function.
Another solution is to pass a struct with float variables as members.

Anyway, the printf function is also bugged on the A1222 and needs to be fixed for floats.

I am using the first workaround, and I wrote my own printf alternative for printing floats from SPE code.

@Hans
Quote:

Hans wrote:
@sailor
How could I make it more accessible to "blondes who did C coding 25 years ago"?

I am sorry Hans, it was a joke. It only means that reading your article was not enough for me, and I had to make some examples and experiments before I understood it completely. Nothing else needed. Thanks.

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/18 8:50 #16

Not too shy to talk

@joerg
Quote:

joerg wrote:
You have to use gcc -msoft-float ...
But if you do, not only your own code but all the libraries you are using, incl. the C library, have to be compiled with -msoft-float as well. You can't mix FPU code with soft-float code, at least not without workarounds similar to the ones you are using for SPE now.

There is probably no soft-float version of newlib. There used to be soft-float versions of clib2, but in case they are no longer available, rebuilding clib2 or clib4 with -msoft-float should be no big problem.
With a soft-float C library and your own code built with -msoft-float, you don't need workarounds for functions like printf() either, but of course an SPE C library for the A1222 would be much better than a soft-float one, which uses integer instructions and registers for float/double.

Thank you for the detailed info.
And please, how do I use a soft-float C library?
Is it something like: "gcc -mcrt=clib2 -msoft-float .... -lm" ?
Or some other way?

The SPE code ( -mcpu=8540 -mabi=spe ) is always soft-float, regardless of the C library used.
And how are floating-point parameters passed when I use "-mcpu=powerpc -msoft-float"? Via GPRs? They are 32-bit in the PowerPC ABI. Or via the stack? Or float via GPR and double via the stack?
I read wiki.amigaos.net, but there is not much about SPE there.

Hans

Re: A1222 support in the SDK and problems

Posted on: 4/18 9:38 #17

Home away from home

@sailor

Quote:

I am sorry Hans, it was a joke. It only means that reading your article was not enough for me, and I had to make some examples and experiments before I understood it completely. Nothing else needed. Thanks.

Ah, okay. Suggestions for improvements are still welcome.

Hans


flash

Re: A1222 support in the SDK and problems

Posted on: 4/18 11:11 #18

Just popping in

@sailor

The SPE code ( -mcpu=8540 -mabi=spe ) is always hard-float.
Part of the integer register file of the P1022 CPU is used for vector/scalar floats, using a hardware decoding engine.
It can act like a sort of "AltiVec unit" or "FPU".
The problem is that even though GCC 6 can handle it, the resulting code is bugged in Amigaland.

Also, the libc libraries (newlib/clib) need to be recompiled for P1022 support; we need a new SDK to produce the right binaries for the A1222 without any workarounds.

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/18 12:10 #19

Not too shy to talk

@flash

It is not so simple.

From the point of view of the SPE embedded FPU it is HARD float, because it uses native SPE instructions and registers (these are not FPRs, but 64-bit GPRs).
From the point of view of PowerPC code it is SOFT float, because it uses GPRs and none of the PowerPC floating-point instructions.

From the point of view of gcc, SPE code is SOFT float:


gcc -mcpu=8540 -mspe -mabi=spe -mfloat-gprs=double -c Stream2_mh.c -o Stream2_mh.o
gcc -mcpu=powerpc -c spe_float_transition.c -o spe_float_transition.o
gcc Stream2_mh.o spe_float_transition.o -o Stream2_mh
ld: Warning: spe_float_transition.o uses hard float, Stream2_mh uses soft float

Stream2_mh.c is a benchmark compiled with SPE; spe_float_transition.c contains the functions that have to be called with PowerPC float parameters, like printf.
gcc treats SPE code as soft-float.

If I remember correctly, compiling with an additional -mhard-float:


gcc -mcpu=8540 -mspe -mabi=spe -mfloat-gprs=double -mhard-float -c Stream2_mh.c -o Stream2_mh.o

generates some error - but I can check it again.

And the gcc 6.4.0 online docs say:
Quote:

-msoft-float
-mhard-float
Generate code that does not use (uses) the floating-point register set. Software floating-point emulation is provided if you use the -msoft-float option, and pass the option to GCC when linking.

And since the embedded SPE FPU has no floating-point register set, it is treated as soft.

This is only my explanation, and of course I can be wrong. I started to play with this only after the A1222+ arrived, so there is a lot more to study.

flash

Re: A1222 support in the SDK and problems

Posted on: 4/18 12:44 #20

Just popping in

@sailor

Use the GCC -S switch to generate assembler source code and look inside it.

You'll see SPE code when using these flags: -mcpu=8540 -mabi=spe -mfloat-gprs=double

With soft-float you'll see standard integer PowerPC instructions, where math operations are only emulated (much slower).
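
A tiny file like the one below is enough for that comparison. The file name fpmul.c and the function scale() are placeholders, and the commands in the comment are assembled from flags quoted in this thread. As far as I know, the SPE build should show embedded floating-point instructions such as efdmul, a classic FPU build fmul, and a -msoft-float build a call to a libgcc helper such as __muldf3; treat those mnemonics as things to look for rather than guaranteed output.

```c
/* fpmul.c - hypothetical test file for inspecting generated assembler.
 * Assumed commands, built from the flags quoted in this thread:
 *   gcc -S -mcpu=8540 -mspe -mabi=spe -mfloat-gprs=double fpmul.c -o fpmul_spe.s
 *   gcc -S -mcpu=powerpc fpmul.c -o fpmul_fpu.s
 *   gcc -S -mcpu=powerpc -msoft-float fpmul.c -o fpmul_soft.s
 */

/* One double multiplication, so the interesting instruction is easy to spot. */
double scale(double value, double factor)
{
    return value * factor;
}
```

Comparing the three .s files side by side shows which register set and which instructions each build uses for the same multiplication.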
