if (&process_to_inspect->pr_Task != NULL)
{
    IExec->SuspendTask(&process_to_inspect->pr_Task, 0);
    uint32 result = IDebug->StackTrace(&process_to_inspect->pr_Task, hookstack);
    IExec->RestartTask(&process_to_inspect->pr_Task, 0);
}
}
Now, when it only does Suspend/Restart, everything works with no problems. CPU load is at 100%, though, and everything crawls like a slideshow (opening windows, etc.; the 100% load is because of the loop), but OK, it works.
But once I keep the "StackTrace" call in, it shows me a few lines of stack trace (5-10) and then everything freezes (without a crashlog). Sometimes it gives me more lines, sometimes fewer.
Quote:
Now, when it only does Suspend/Restart, everything works with no problems. CPU load is at 100%, though, and everything crawls like a slideshow (opening windows, etc.; the 100% load is because of the loop), but OK, it works.
Setting process_to_inspect to NULL again after SuspendTask()/StackTrace()/RestartTask() seems to be missing.
Now, what makes me curious is: why do only some of the CallHookPkt calls from the patched function get an actual stack trace?
And now it is clear which libs/components are involved when one calls CallHookPkt (or maybe it's not CallHookPkt that causes it, but the running of any binary? That needs testing).
Is it possible to translate addresses into function names so the stack trace output is more readable? Or is that only possible if the library is built with -gstabs?
That's just an over-complicated way of using if (NULL != process_to_inspect)
(Not 100% sure about the way that's interpreted by the compiler, but at least "&(process_to_inspect->pr_Task)" would be identical to just "process_to_inspect".)
Quote:
Is it possible to translate addresses into function names so the stack trace output is more readable? Or is that only possible if the library is built with -gstabs?
Well, you should be able to find more info than this; you're only showing the module. Not everything is stripped.
And even if it is stripped, addresses can be looked up in the interface: you know the layout of the interfaces, as documented in the inline4 header files. You can match by closest address, or by comparing the address between interfaces.
(NutsAboutAmiga)
Basilisk II for AmigaOS4 AmigaInputAnywhere Excalibur and other tools and apps.
@kas1e Constantly polling the process_to_inspect global variable in an infinite loop is not only very inefficient, as you've found, but also not safe in a multitasking environment, because another process could call your hooked function while it's still collecting a stack trace, and that would overwrite process_to_inspect while the stack trace function is running. You really should send the process id via a message or some other IPC mechanism and make the stack tracing function reentrant (i.e. not using any globals) so it won't break if invoked by different processes at the same time. Invoking this with a message would both avoid 100% CPU usage from busy waiting and avoid needing Suspend/Restart, as the traced process would wait for the message.
I don't know how to get more detailed info but you probably would need to at least run debug kernel to get info on kernel functions and may need debug versions of libraries too which may not be available.
I don't have access to the sources, but using m68k libamiga.a CallHook() in PPC native code is impossible. It can only be used for emulated m68k hook functions called by emulated m68k code. Only IUtility->CallHookPkt() can be used from PPC native code; it checks whether the hook function is PPC native (direct function call) or emulated m68k code (executing it with the m68k emulator).
Is there an OS4 PPC libamiga.a with a CallHook() function? Would be complete nonsense (unless all it does is calling IUtility->CallHookPkt()), but OTOH I wouldn't be surprised if clib2 has something like that.
It seems that in the end we do also have a CallHook() separate from CallHookPkt(), and not from amiga.lib but from utility.library. See interfaces/utility.h:
Like that? Shouldn't I somehow detect the type and number of the arguments coming in, and then pass on only those? I mean, I can't do "return Original_CallHook(Self, hook, object, ...)", and I can't assume the type and number of arguments. How do I deal with that?
I always hated this varargs stuff, and I was very sure nothing more than CallHookPkt was involved, as if CallHookPkt() were the one and only :)
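On the question of forwarding "..." without knowing the argument types, here is a minimal portable C sketch of the usual pattern: the variadic wrapper never inspects its arguments, it only captures them in a va_list and hands that to a variant of the original that accepts a va_list. All names below (original_sum_v, patched_sum, patched_calls) are hypothetical and only illustrate the idea; this is not the utility.library code, and whether CallHook() can be forwarded to CallHookPkt() in the same spirit is an assumption to verify on OS4.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical "original": the va_list-taking variant that does the work.
 * Only this function knows the type and number of the arguments. */
static long original_sum_v(int count, va_list ap)
{
    long sum = 0;
    int i;

    assert(count >= 0);
    for (i = 0; i < count; i++)
        sum += va_arg(ap, long);
    return sum;
}

/* Counts how often the patch was entered; tracing would go here. */
static int patched_calls = 0;

/* Hypothetical patch: it forwards the arguments untouched, so it does
 * not need to know their types or how many there are. */
long patched_sum(int count, ...)
{
    va_list ap;
    long result;

    patched_calls++;
    va_start(ap, count);
    result = original_sum_v(count, ap);
    va_end(ap);
    return result;
}
```

Calling patched_sum(3, 1L, 2L, 3L) goes through the wrapper and returns whatever the original computes. The wrapper itself stays the same no matter what is passed after count.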
Anyway, that all means that CallHookPkt() and CallHook(), both from utility.library, are separate functions and both need to be patched independently, as CallHookPkt does not call CallHook() for sure (I asked an OS4 dev about it, he checked the sources, and they are indeed separate functions).
It looks like in ancient times it was in libamiga.a, then it was in clib2 too (but only for 68k), and when the first PPC versions of utility.library appeared, they put CallHook() inside it too (though I'm not sure about that, but it looks like it; otherwise why have two different CallHook() functions, one in amiga.lib and one in utility.library?).
At least that may explain why we don't see as many CallHook calls as you might expect?
stacktrace
Btw, checking Alfkil's SpotLess source code, I found this:
See, there he points out that getting a stack trace via IDebug is the "unsafe" way (dunno why). Instead, he gets an ElfHandle and then uses IElf->SymbolQuery to get the same info one gets from IDebug->StackTrace.
@balaton
Yeah, busy-looping takes 100% CPU, which I kind of fixed by adding IDOS->Delay(1) in the loop, but that's of course ugly crap :)
Though there is one thing I still don't understand: when I simply run my CallHookPkt() test case, I get about 150 calls of CallHookPkt() from the patched function, but only 12 stack traces. I expected the same number: one stack trace for each call of the patched CallHookPkt.
Quote:
I don't know how to get more detailed info but you probably would need to at least run debug kernel to get info on kernel functions and may need debug versions of libraries too which may not be available.
Sadly, the debug kernel isn't built with -gstabs. It is only "debug" in the sense that it has more IExec->DebugPrintF calls to print to serial when required.
But on the other hand, LiveForIt has a point: can't I just work out the function name from offsets in memory? The DebugSymbol structure (from exec/debug.h) does have this field: "Offset into the module", so I can probably make some internal resolver? Like checking whether the module name is kernel, then taking the offset, comparing it with the list of functions from the include files, and printing the correct one in human-readable form?
@all Current version, which kind of works, but with no IPC:
Sometimes this version simply hangs forever and stops output on serial: the system continues to work, but the test binary never exits and I can't interrupt the traces with Ctrl+C. Sometimes it works. Often it works after a reboot, but the second run behaves like this.
Anyway, how do I do proper IPC there? I mean, I've never done it before; are there any templates to follow? Like:
patched_function()
{
    get_proc + check_name
    if (my_name) {
        send message ?
    }
}
main()
{
    while(1) {
        wait for message
    }
}
thanks!
Edited by kas1e on 2024/5/6 (9 times, from 3:06:53 to 5:54:06)
Quote:
IDebug is an Exec thing, and yeah, I use libauto: I tested with a standalone binary (without patching involved) and IDebug is handled fine by it, and the stack trace works for, let's say, FindTask("Workbench"). But not when I trace my own process: it simply hangs the binary I want to trace and stack-trace right after the first CallHookPkt() call.
Okay, I didn't know libauto would pick it up. So it must open all interfaces of Exec? Doesn't look efficient opening things you don't always need.
Quote:
Maybe, when we are in the patched function, I should obtain IDebug again?
Well that would incur more overhead and would likely be unnecessary. Unless it was not thread safe. I think pulling it from your global IDebug after your main code has opened it is good enough.
Quote:
I.e. purely setting up any hook inside the patched CallHookPkt() = crash. Probably that's what Joerg means when he says "Exclude own hook from patched CallHookPkt to avoid endless loop and crash"? But then, my task is explicitly "serial.device", so not the main one...
It should be fine to allocate a hook. I didn't see it before but the "IDebug->StackTrace(&process->pr_Task,hook)" would have gone into a circular loop I imagine when it ended up calling the hook and then called Patched_CallHookPkt again.
Quote:
It seems that in the end we do also have a CallHook() separate from CallHookPkt(), and not from amiga.lib but from utility.library. See interfaces/utility.h:
Yes, another function you may need to patch.
Quote:
I always hated this varargs stuff, and I was very sure nothing more than CallHookPkt was involved, as if CallHookPkt() were the one and only :)
Never knew why there was the "Tags" functions and then "Attr" or "TagList" functions. Some had their own entry. All of them in 68K land would have been an address to a tag list at the ABI level. But at the API level they over complicated it. I didn't see why two functions were needed that would be doing exactly the same thing.
For PPC, where arguments normally travel in registers and can't just be stacked, so the prologue must lay them out in a stack frame, I can see it being needed. But the va stuff messes up all code as far as I can see, regardless of CPU.
Quote:
See, there he points out that getting a stack trace via IDebug is the "unsafe" way (dunno why). Instead, he gets an ElfHandle and then uses IElf->SymbolQuery to get the same info one gets from IDebug->StackTrace.
That could be because you may pick up an illegal address off the stack trace. You may need to verify it with TypeOfMem(). That's what I started doing with unknown addresses.
Quote:
But on the other hand, LiveForIt has a point: can't I just work out the function name from offsets in memory? The DebugSymbol structure (from exec/debug.h) does have this field: "Offset into the module", so I can probably make some internal resolver? Like checking whether the module name is kernel, then taking the offset, comparing it with the list of functions from the include files, and printing the correct one in human-readable form?
I've had this idea for years to do something like that. Like a program that can read a stack trace and replace all vague offsets with real functions. An advanced addr2line, so to speak. Exec and others don't change much. But I think an automated way of reading interfaces would be needed that builds up a list itself, which can then be used as a reference in a program to do a reverse name lookup on a stack trace.
Quote:
Anyway, how do I do proper IPC there? I mean, I've never done it before; are there any templates to follow? Like:
A few ways would be:
* Just use one global structure from the main program, and allocate a signal in the main program. Protect the structure with a semaphore or mutex, so every time you want to use it you lock it, fill it in and signal the master task. The master task processes it, and the patch waits for the master to signal back when done.
* Just use a local structure in the patch. Send it to your master task as a message and wait for it to signal back. Like the above, but safe from race conditions.
* Allocate a block of memory for the structure on each call. Then use the above method.
The first is like what you are doing now, but you avoid the busy loops by using signals. Of course you would be locking the data first.
To avoid extra locking code, the second one may be easier.
The third has some slight overhead but works like the second.
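The second option can be sketched in portable C with POSIX threads standing in for Exec tasks. This is only an analogy under stated assumptions: struct trace_request, request_trace(), tracer_main() and run_demo() are made-up names, the mutex/condition-variable pair plays the role of a message port, and on AmigaOS the same shape would presumably be built from IExec->PutMsg(), WaitPort(), GetMsg() and ReplyMsg(). The point it shows is that the request lives on the caller's stack and the caller blocks until the tracer replies, so no global holds the "current" process and no Suspend/Restart is needed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical request, standing in for an Exec Message. */
struct trace_request {
    struct trace_request *next;
    int caller_id;   /* stand-in for the caller's struct Task pointer */
    int done;        /* set by the tracer when it "replies" */
};

static pthread_mutex_t lock    = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  posted  = PTHREAD_COND_INITIALIZER; /* request arrived */
static pthread_cond_t  replied = PTHREAD_COND_INITIALIZER; /* request handled */
static struct trace_request *queue_head = NULL;
static int traces_done = 0;

/* Called from the patched function. The request is a local variable,
 * so concurrent callers cannot overwrite each other's data. */
void request_trace(int caller_id)
{
    struct trace_request req = { NULL, caller_id, 0 };

    pthread_mutex_lock(&lock);
    req.next = queue_head;            /* simple LIFO queue */
    queue_head = &req;
    pthread_cond_signal(&posted);
    while (!req.done)                 /* block until the tracer replies */
        pthread_cond_wait(&replied, &lock);
    pthread_mutex_unlock(&lock);
}

/* Tracer loop: sleeps until a request arrives, handles it, replies. */
static void *tracer_main(void *arg)
{
    int remaining = *(int *)arg;

    pthread_mutex_lock(&lock);
    while (remaining > 0) {
        struct trace_request *req;

        while (queue_head == NULL)
            pthread_cond_wait(&posted, &lock);
        req = queue_head;
        queue_head = req->next;
        /* The stack trace of req->caller_id would be collected here;
         * the caller is blocked, so its stack is not changing. */
        traces_done++;
        remaining--;
        req->done = 1;
        pthread_cond_broadcast(&replied);
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

static void *caller_main(void *arg)
{
    int i, id = *(int *)arg;

    for (i = 0; i < 50; i++)
        request_trace(id);
    return NULL;
}

/* Three callers make 50 requests each; returns the number of traces. */
int run_demo(void)
{
    pthread_t tracer, callers[3];
    int ids[3] = { 1, 2, 3 };
    int total = 150, i, rc;

    traces_done = 0;
    rc = pthread_create(&tracer, NULL, tracer_main, &total);
    assert(rc == 0);
    for (i = 0; i < 3; i++) {
        rc = pthread_create(&callers[i], NULL, caller_main, &ids[i]);
        assert(rc == 0);
    }
    for (i = 0; i < 3; i++)
        pthread_join(callers[i], NULL);
    pthread_join(tracer, NULL);
    return traces_done;
}
```

With this shape every request gets exactly one trace, because a caller cannot issue its next request before the previous one has been answered.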
Although it's rather verbose this is not so much a trick but good technique. The "&process_to_inspect->pr_Task" looks redundant but this in fact avoids a cast. Being that every Process is a Task but not every Task is a Process, means you can't just throw one generic pointer around, so sometimes you need Task and other times Process. This is a good way to use one Process pointer for both.
I need to remember this way because, looking at my code, I've got a Task that I'm currently doing a "(struct Process *)" cast on and it just looks messy. BCPL pointers are bad enough.
Quote:
I've had this idea for years to do something like that. Like a program that can read a stack trace and replace all vague offsets with real functions. An advanced addr2line, so to speak. Exec and others don't change much. But I think an automated way of reading interfaces would be needed that builds up a list itself, which can then be used as a reference in a program to do a reverse name lookup on a stack trace.
But those offsets didn't look sane to me. I was expecting something like numbers starting from 0 and counting up, pointing to the function's position in the list of functions as described in the interface. Or is it not that kind of "offset"?
And OK, I know the offsets of the functions for each library, but what about the "kernel" module, for example? We can't know which sub-library's calls it made (exec/utility/etc.), so we can't tell which function it is just from the name "kernel" and an offset. That can probably work with libraries only, but we need a normal offset, because I currently can't work out what this offset means, for example:
Quote:
But those offsets didn't look sane to me. I was expecting something like numbers starting from 0 and counting up, pointing to the function's position in the list of functions as described in the interface. Or is it not that kind of "offset"?
I can see two offsets at play here. The interface jump table function offsets. And the binary code offsets.
Quote:
And OK, I know the offsets of the functions for each library, but what about the "kernel" module, for example? We can't know which sub-library's calls it made (exec/utility/etc.), so we can't tell which function it is just from the name "kernel" and an offset. That can probably work with libraries only, but we need a normal offset, because I currently can't work out what this offset means, for example:
The base call offset from the interface jump should be on the bottom of the stack trace at the point a program calls an OS function. It's true we can't know what other calls it makes. And, the OS won't necessarily call itself from the external function interface, but will make a direct function call.
But usually what we see with module kernel+0x is the actual code offset into the binary. Same as with -gstabs code you see the name with function and offset. So offset would be offset from function code prologue.
Quote:
I didn't get what this "offset" means. In the headers it is called "Offset into the module", but that doesn't look like offsets in the jump table?
No, because it's gone beyond that. Depending how it implements it the function jump should be in stack but it may be optimised away. Creating a stack frame on every jump call would be inefficient so the API call logic may do the minimum to call the function.
Not sure if GCC or whatever compiler can be told to build a stack frame on an API call. Like the opposite of -fomit-frame-pointer.
Quote:
So... how can we get the jump-table offset of the library function that was called?
With only an offset, this is where it gets complicated. I'd say we'd need to walk the list of interface functions and build up a list of offsets from the base address where the ELF is loaded. They may not be in order either, so functions cannot be expected to appear in sequential order unless they really are. Sequential order would be handy for calculating a size window for each function's code block, but the table of offsets to addresses can be ordered later. However, the address from each jump might be just enough to do it. With a bit of math, the nearest function to that offset can be located.
I think I've made this look complicated. This is probably easier than my manual thinking of doing it by hand. There are likely other ways of doing this in the outside world. I mean, it's all based on ELF anyway, which is common. The answer may be on Stack Overflow.
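The "nearest function to that offset" step could look something like the sketch below. It assumes a table of (offset, name) pairs has already been built by some other means, for example by walking an interface's function pointers and subtracting the module's load address; that part is not shown. struct sym_entry, nearest_symbol(), demo_table and demo_lookup() are hypothetical names, and the offsets in the table are made up, not real kernel values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry: offset of a function's first instruction
 * from the module's load address, plus the function's name. */
struct sym_entry {
    unsigned long offset;
    const char   *name;
};

/* Returns the entry with the greatest offset that is <= `offset`, or
 * NULL if `offset` lies before every known function. The table does
 * not need to be sorted. */
const struct sym_entry *nearest_symbol(const struct sym_entry *table,
                                       size_t count, unsigned long offset)
{
    const struct sym_entry *best = NULL;
    size_t i;

    assert(table != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (table[i].offset <= offset &&
            (best == NULL || table[i].offset > best->offset))
            best = &table[i];
    }
    return best;
}

/* Made-up, deliberately unsorted table for demonstration only. */
static const struct sym_entry demo_table[] = {
    { 0x2400, "func_c" },
    { 0x1000, "func_a" },
    { 0x1a80, "func_b" },
};

/* Name of the function that most likely contains `offset`, or "?". */
const char *demo_lookup(unsigned long offset)
{
    const struct sym_entry *e = nearest_symbol(
        demo_table, sizeof demo_table / sizeof demo_table[0], offset);

    return e != NULL ? e->name : "?";
}
```

One caveat with closest-match lookup: if the code at that offset belongs to an internal function that is not in the table, the result is the preceding exported function, which is a plausible but wrong answer. Showing the distance from the match (like name+0x10) makes such cases easier to spot.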
@Hypex In other words, even if we know that we can do something with the jump table and the interfaces, we still don't know for now how to detect which function of the stack-traced library (at least for a library) was called :(
Quote:
And, the OS won't necessarily call itself from the external function interface, but will make a direct function call.
At least 99% of the OS4 code doesn't use direct function calls but always goes through the interface of the library.
Quote:
Not sure if GCC or whatever compiler can be told to build a stack frame on an API call.
On OS4 VARARGS68K and va_(get|start)linearva does that. It was required for compatibility to emulated m68k code passing the arguments on the stack and it's used for example for all of the varargs TagItem functions. Something like
@LiveForIt In general your argument is valid, but for the &process->pr_Task case the offsetof(struct Process, pr_Task) is 0 and therefore can't cause problems.
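The offsetof point can be checked with a small stand-alone example. MockTask and MockProcess below are hypothetical stand-ins, not the real exec/dos definitions; the only property they share with the real structures, as stated above, is that the task is the first member of the process structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct Task and struct Process. */
struct MockTask    { int tc_State; };
struct MockProcess { struct MockTask pr_Task; int pr_Extra; };

/* Gets the embedded task without a cast. */
struct MockTask *task_of(struct MockProcess *p)
{
    return &p->pr_Task;
}

/* Returns 1 if the embedded task sits at the very start of the
 * process structure, i.e. both pointers hold the same address. */
int first_member_aliases_struct(void)
{
    struct MockProcess proc = { { 0 }, 0 };

    assert(offsetof(struct MockProcess, pr_Task) == 0);
    return (void *)task_of(&proc) == (void *)&proc;
}
```

So &p->pr_Task is a type-correct way to pass a process where a task is expected. As a NULL test it is less suitable: with p == NULL the expression is, strictly speaking, not defined by ISO C, so testing p itself is the cleaner check.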
Quote:
Yeah, busy-looping takes 100% CPU, which I kind of fixed by adding IDOS->Delay(1) in the loop, but that's of course ugly crap :)
Though there is one thing I still don't understand: when I simply run my CallHookPkt() test case, I get about 150 calls of CallHookPkt() from the patched function, but only 12 stack traces. I expected the same number: one stack trace for each call of the patched CallHookPkt.
First call to hook sets process_to_inspect. Then another process calls CallHookPkt and your hook function overwrites process_to_inspect. This happens a few times until your main process gets a chance to run or finishes the delay and notices process_to_inspect is != NULL and suspends the last process that overwrote the global variable. Then it goes on to collect the stack trace; if unlucky, there was already another call from another process, so process_to_inspect now points to some other process, not the one which was suspended. It then collects the trace of that process and restarts some random process at the end, depending on whether there were more calls that replaced process_to_inspect while the stack trace was running.
This won't work this way. You need to make the trace collection function not depend on any globals and pass the process id to trace via a message or some other way that's not overwritten by another call to your hook function. Since the hook function can also be invoked by several tasks simultaneously (it can be preempted while running and another process could call it again) it also cannot use globals.
You could add locking to avoid these problems and only allow one instance of the hook and stack collection to run at a time but that would stall every other caller until stack collection is finished and create a bottleneck in your tracer. It's better to make the trace collection reentrant so it can be called multiple times on different processes without breaking, and connect it to a message port so it can be invoked through a message. Then the hook function only needs to send the message and wait for the reply, which is again something that's independent of all other calls. It can construct the message in a local stack variable, so multiple tasks can invoke this simultaneously and the separate calls can proceed without blocking each other.
@Balaton Thanks for the detailed answer! But one small nitpick:
Quote:
First call to hook sets process_to_inspect. Then another process calls CallHookPkt and your hook function overwrites process_to_inspect. This happens a few times until your main process gets a chance to run or finishes the delay and notices process_to_inspect is != NULL and suspends the last process that overwrote the global variable.
But my patched CallHookPkt checks that the caller is exactly the process from my binary. All other processes, tasks and whatever are skipped and go straight to the original CallHookPkt without setting process_to_inspect to anything, so it stays NULL. It is only set to non-NULL if the caller is a process, and only if that process has the cli_command name of my choosing (so it is set only for CallHookPkt calls coming from my binary's process).
Maybe the issue is not "another process overwrites process_to_inspect", but that the same process (my traced binary) overwrites it and the main one just doesn't have time to handle them all. I.e. my binary invokes 150 CallHookPkt calls and the patched CallHookPkt handles them fine; the main task with its global just has no time to handle them all as expected, but they are all still from the same process (my binary).
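That hypothesis is easy to model with a single-threaded simulation. It is only a sketch under assumptions: the hook is taken not to block, and the poller is taken to get a turn once every calls_per_poll hook calls. The value 12 used in the usage note is picked purely so the result resembles the numbers reported above; it is not a measurement. hook_called(), poll_once() and simulate() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Single-slot global, as in the polling design discussed above. */
static const void *process_to_inspect = NULL;

/* Patched function: records the caller, overwriting any earlier value. */
static void hook_called(const void *caller)
{
    process_to_inspect = caller;
}

/* One turn of the polling loop: at most one trace per turn, no matter
 * how many calls were recorded since the last turn. */
static int poll_once(void)
{
    if (process_to_inspect != NULL) {
        process_to_inspect = NULL;   /* "trace" it, then clear the slot */
        return 1;
    }
    return 0;
}

/* The hook fires `calls` times; the poller runs once after every
 * `calls_per_poll` calls. Returns the number of traces collected. */
int simulate(int calls, int calls_per_poll)
{
    static const int traced_process = 0;  /* dummy caller identity */
    int traces = 0;
    int i;

    assert(calls_per_poll > 0);
    process_to_inspect = NULL;
    for (i = 1; i <= calls; i++) {
        hook_called(&traced_process);
        if (i % calls_per_poll == 0)
            traces += poll_once();
    }
    return traces;
}
```

simulate(150, 12) yields 12 traces for 150 calls, while simulate(150, 1), where the poller keeps up with every call, yields 150. A single slot can only ever remember the most recent call, so even with one traced process the trace count follows how often the poller runs, not how often the hook fires.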
Quote:
You need to make the trace collection function not depend on any globals and pass the process id to trace via a message
Yeah, that's the way I will go now.
Quote:
You could add locking to avoid these problems and only allow one instance of the hook and stack collection to run at a time but that would stall every other caller until stack collection is finished and create a bottleneck in your tracer.
At least I can try this one first, to see how it works with the bottleneck, and then make the trace collection reentrant.
Also, you should check whether you're in Forbid or not, as this can give you the option to freeze the trapped task for a better stack trace. Just restart it in main().