Hello, everyone! I've been lurking here for years, as there seems to be a high proportion of OS4 programmers here, and the signal to noise ratio is much higher than some other Amiga forums. I've finally gotten around to joining. The occasion that pushed me over the line is the release of my latest project, Profyler (in the upload queue at OS4Depot and Aminet).
There have been a number of discussions about profilers for OS4 here in the past, including the current one about getting gprof to work, so there seems to be some interest in the subject. Inspired by that and by my use of the SAS/C sprof and GUIprof profilers in the past, I've created Profyler to provide a similar capability for OS4 programmers (with, thanks to MUI, a nicer user interface).
I won't go into all the features (and limitations) here, as you can check out the ReadMe to get all the details. Profyler is not going to be the one profiler to rule them all, as it has some significant limitations. But it's one more tool in the OS4 programmer's toolbox, one I hope is useful.
@msteed Welcome to our forum. Thank you so much for this release. As you already know, tools like this are much needed and we are going to test it to the limits.
Personally, I still have to learn how these kinds of tools work, which is part of my journey. Sorry in advance if I ask noob questions.
Have I told you that I love passpocket too? I believe so... ;)
@msteed OMG! I mean OMG for real! Who would have expected such a big piece of work being done behind the scenes?
I've just read the whole guide, and reading it alone is an act of satisfaction :) The whole thing looks like something that can't be done in a few days. Did you spend half a year or more on it?
Loved the "Profilers 101" part, where you describe everything in terms of AOS4. That really feels cool :) Hieronimus is mentioned, which is kind of timely: Matthias (Corto) is working on it right now and making good progress on supporting more hardware and making it better than before. And Tequila from Capehill is mentioned too, which means it's not only a few of us who know about it :)
And the whole guide is done so nicely that anyone who wants to know the what-when-how-why of Profyler will find all the answers. It really feels like something from the old commercial times, of a high standard.
At the moment I've only tried the test apps that come with it (the ones from the "quick start" first, of course), and everything works as expected on the X5000 (at least with the beta kernel). I have the different tabs, all the necessary fields, etc.
I have a question after reading the whole guide, about the "Location" tab, which is now turned off because ObtainDebugSymbol does not return this information. Sorry, I'm not that familiar with the details of this function, but is it supposed to return the location and doesn't (so it's a bug that needs to be fixed), or is that a feature we would like to have? I ask because then we can create a bug report for the kernel guys.
Not back, really; I've been here all along (just not posting to forums). I wrote my first Amiga program in AmigaBasic on my A1000 (with no hard drive) back in the late 80s, and uploaded it to CompuServe's Amiga forums via a 2400 baud modem. I've been writing Amiga programs ever since.
@walkero
Yes, you've mentioned PassPocket in your emails. I'll likely have some noob questions of my own, when it comes to using the forum here.
Naturally, I used PassPocket as a test subject for Profyler (if you look closely at the source for the latest version you'll see the profiling macros in there).
@IamSONIC
Glad it works for you, and with a much newer version of GCC. I can only do so much testing on my one OS4 system, so I'll have to depend on the folks here to test it more widely.
I notice you've got some execution times down in the 500 - 600 ns range. The fastest I've ever seen on my X1000 is around 1 us. I don't know if the difference is due to the X5000 (is it that much faster?), or perhaps the newer version of GCC is more efficient at function calls. Interesting.
@kas1e
It was hard to keep quiet during all the past discussions here about OS4 profilers (that's where I heard about Tequila), but I hate to pre-announce stuff as then I feel like I'm under pressure to get it done. Better to sneak up on people.
I've actually been working on Profyler for a year and a half, though there were a number of times where I took a month or two off to work on other projects (both Amiga-related and not).
Regarding ObtainDebugSymbol: I suspect it's more of an unimplemented feature than an actual bug. The DebugSymbol struct it returns has fields for SourceFileName and SourceLineNumber, but there's never anything in them. Perhaps they planned to add that in the future and never got around to it.
You can turn the Location column on using the list browser's pop-up menu; it might be worth checking on some other Amiga systems/kernels to see if the location information is missing on them as well.
At least on the X5000 with the latest beta kernel, it is still empty. So I will make a bug report/feature request about it.
I notice you've got some execution times down in the 500 - 600 ns range. The fastest I've ever seen on my X1000 is around 1 us. I don't know if the difference is due to the X5000 (is it that much faster?), or perhaps the newer version of GCC is more efficient at function calls. Interesting.
According to this blog, the X5000 seems to be faster per MHz than the X1000:
"(the core used in the P50x0 line achieves 3.0 DMIPS per Mhz as opposed to the 2.2 DMIPS per Mhz that the PA6T does)"
Also, SysMon's Benchmark tab (RageMem) gives us some numbers:
I suppose that newer versions of GCC are better optimized for runtime performance. Unfortunately I've never done any benchmarking in this area. Maybe sTix can provide some more in-depth information here.
Here's some Profyler Data from my machine:
Profile Data for MultiThread
Function Name                     # of Calls   Incl. Time  Incl. %   Incl. Avg.   Excl. Time  Excl. %   Excl. Avg.
-------------------------------  -----------  -----------  -------  -----------  -----------  -------  -----------
main                                       1      1.342 s  100.0 %      1.342 s      1.001 s   74.6 %      1.001 s
MyPrintf                                   2   341.408 ms   25.4 %   170.704 ms   341.408 ms   25.4 %   170.704 ms
Profile Data for SOLibTest
Function Name                     # of Calls   Incl. Time  Incl. %   Incl. Avg.   Excl. Time  Excl. %   Excl. Avg.
-------------------------------  -----------  -----------  -------  -----------  -----------  -------  -----------
main                                       1    65.039 ms  100.0 %    65.039 ms     2.125 us    0.0 %     2.125 us
MyLib_Hello                                1    65.082 us    0.1 %    65.082 us     1.123 us    0.0 %     1.123 us
MyLib_MyPrintf                             2    65.035 ms  100.0 %    32.517 ms    65.035 ms  100.0 %    32.517 ms
MyLib_World                                1    64.972 ms   99.9 %    64.972 ms     1.123 us    0.0 %     1.123 us
Profile Data for LinkLibTest
Function Name                     # of Calls   Incl. Time  Incl. %   Incl. Avg.   Excl. Time  Excl. %   Excl. Avg.
-------------------------------  -----------  -----------  -------  -----------  -----------  -------  -----------
main                                       1   169.812 ms  100.0 %   169.812 ms     1.724 us    0.0 %     1.724 us
MyLib_Hello                                1    57.784 us    0.0 %    57.784 us     1.203 us    0.0 %     1.203 us
MyLib_MyPrintf                             2   169.808 ms  100.0 %    84.904 ms   169.808 ms  100.0 %    84.904 ms
MyLib_World                                1   169.753 ms  100.0 %   169.753 ms     1.163 us    0.0 %     1.163 us
Profile Data for Recursion
Function Name                     # of Calls   Incl. Time  Incl. %   Incl. Avg.   Excl. Time  Excl. %   Excl. Avg.
-------------------------------  -----------  -----------  -------  -----------  -----------  -------  -----------
CR                                         2   127.788 ms   99.9 %    63.894 ms     2.045 us    0.0 %     1.023 us
main                                       1   127.878 ms  100.0 %   127.878 ms     1.764 us    0.0 %     1.764 us
MyPrintf                                  13   127.859 ms  100.0 %     9.835 ms   127.859 ms  100.0 %     9.835 ms
Recurse                                   11     7.231 ms    5.7 %   657.370 us    14.155 us    0.0 %     1.287 us
Wazzup                                     1   127.204 ms   99.5 %   127.204 ms     1.684 us    0.0 %     1.684 us
Profile Data for StructorPlus
Function Name                     # of Calls   Incl. Time  Incl. %   Incl. Avg.   Excl. Time  Excl. %   Excl. Avg.
-------------------------------  -----------  -----------  -------  -----------  -----------  -------  -----------
__static_initialization_and_des            2   105.220 ms   99.5 %    52.610 ms     2.446 us    0.0 %     1.223 us
_GLOBAL__sub_D_TheGlobalClass              1   181.533 us    0.2 %   181.533 us     1.203 us    0.0 %     1.203 us
_GLOBAL__sub_I_TheGlobalClass              1   105.042 ms   99.3 %   105.042 ms     1.885 us    0.0 %     1.885 us
Hello()                                    1   151.458 us    0.1 %   151.458 us     1.243 us    0.0 %     1.243 us
main                                       1   522.784 us    0.5 %   522.784 us     5.333 us    0.0 %     5.333 us
MyPrintf(char*)                            5   105.729 ms  100.0 %    21.146 ms   105.729 ms  100.0 %    21.146 ms
TestClass::TestClass()                     2   105.225 ms   99.5 %    52.613 ms     2.687 us    0.0 %     1.344 us
TestClass::~TestClass()                    2   358.654 us    0.3 %   179.327 us     2.326 us    0.0 %     1.163 us
About "ObtainDebugSymbol() returns empty fields": is everything compiled with -gstabs? (I mean not -g, but exactly -gstabs.)
I was just told that all that's needed is for the executable to be compiled and linked with -gstabs, or there will be nothing to display. And that both Reaper (serial) and GrimReaper (GUI) already use ObtainDebugSymbol() to extract that information. So if it wasn't working, both Reapers would fail.
If you can show a simple Hello World example of it failing, that would be helpful (so I can just forward it to the kernel team, and they can fix it if there is an issue, or at least provide us with more info).
I know the X5000 is faster, but I didn't think it was almost twice as fast. And indeed, the MIPS and DMIPS numbers you quote suggest that it's 30% to 50% faster. Yet comparing your profile numbers to those on my X1000 (with the test programs in my case compiled with GCC 8.3.0) shows that your numbers are almost twice as fast.
(Note that you can't readily compare anything that calls printf(), since the speed of writing to a console window depends greatly on how the test program was run, which you didn't indicate. You get one set of numbers if you run it by typing its name in a shell, a different set if you type 'run' and then the program's name in the shell, and a third set if you run it from Workbench by double-clicking it. So when comparing I concentrated on the "average exclusive" times, which don't count calls to MyPrintf().)
That leaves GCC as the unknown quantity. I haven't installed GCC 11 yet, so I can't try it to see what difference I get. Maybe you could try running the test programs as provided with Profyler, which are compiled with GCC 8.3.0, and comparing those results to the ones you get when they're compiled with GCC 11.1.0.
I've seen the location information sometimes in the GR window, but I thought maybe it did its own parsing of the debug data using elf.library, rather than using ObtainDebugSymbol().
As you can see in the makefile for the test programs, I do use -gstabs in the compiler arguments. I didn't think I needed to add it to the linker arguments as well, but I just gave that a try. It didn't make any difference; still no location data.
I'll see if I can come up with a simple, stand-alone program that displays the problem.
Thanks for developing this, Mike - the more programmers' tools, the merrier! And as others have noted above, the manual is an example of some well-written documentation! Quite a rare thing these days: as most OS4 software is written single-handedly, the developer usually has little time/energy left to write up good docs as well.