I updated Hieronymus yesterday (birthday release!). It is available
here on os4depot.
Hieronymus is a statistical profiler: it periodically samples what is running, then decodes that information to find out in which functions the CPU time was spent.
It is useful for big programs, to know which parts (functions) to optimize and to see whether the time is spent in the program itself or in libraries (and which ones).
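To make the idea of "sample, then decode" concrete, here is a minimal sketch in Python. It is not how Hieronymus itself is implemented; the symbol table, the offsets and the helper names (resolve, profile) are all made up for illustration. Each sampled offset is mapped to the function whose start address is the closest one at or below it, and the hits are counted per function.

```python
import bisect
from collections import Counter

# Hypothetical symbol table: (start offset, function name), sorted by offset.
# These offsets are invented for the example, not taken from a real binary.
SYMBOLS = [
    (0x1000, "init"),
    (0x1400, "decode"),
    (0x2000, "render"),
]
STARTS = [start for start, _ in SYMBOLS]

def resolve(offset):
    """Return the function whose start is the closest one at or below offset."""
    i = bisect.bisect_right(STARTS, offset) - 1
    return SYMBOLS[i][1] if i >= 0 else "<NOT FOUND>"

def profile(samples):
    """Count how many sampled offsets fall in each function."""
    return Counter(resolve(pc) for pc in samples)

# Pretend the sampler fired six times and recorded these offsets.
samples = [0x1410, 0x1450, 0x2004, 0x1404, 0x0800, 0x1010]
hits = profile(samples)
for name, count in hits.most_common():
    print(f"{name}: {count} samples, {100 * count // len(samples)}%")
```

The more samples land in a function, the more CPU time it is assumed to have used, which is why a longer duration gives more reliable numbers.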
It is really easy to use! Just make sure your program has its symbols and run "Hieronymus duration=10", for example.
I improved the output (entries are now grouped by function name) and added the number of cache misses on G3/G4 (this information is harder to use, but at least we get it). I also raised the maximum duration to a much bigger value.
Here is an example of output:
Hieronymus got 360 samples in 6 seconds.
Report of cache misses :
L1 inst cache misses : 2195632
L2 inst cache misses : 72082
L1 data cache misses : 4005281
L2 data cache misses : 381402
Detailed results :
count = 0207, percent = 57, name = dosbox
Offset = 0x000c4060, Count = 0026, Function = _ZN7THEOPL315YMF262UpdateOneEiPsi
Offset = 0x0000b1a4, Count = 0014, Function = _Z6get_CFv
Offset = 0x0001a340, Count = 0071, Function = _Z19CPU_Core_Normal_Runv
Offset = 0x000937e4, Count = 0045, Function = _Z9mem_readbj
Offset = 0x0009532c, Count = 0020, Function = _Z9mem_readwj
Offset = 0x00019b90, Count = 0008, Function = _Z10EA_16_46_nv
Offset = 0x000aafc0, Count = 0006, Function = _ZN22VGA_ChainedVGA_Handler6writebEjj
Offset = 0x000951b8, Count = 0001, Function = _Z9mem_readdj
Offset = 0x0009386c, Count = 0001, Function = <NOT FOUND>
Offset = 0x00096fd8, Count = 0001, Function = _ZN12MixerChannel14AddSamples_s16EjPs
Offset = 0x00019be4, Count = 0001, Function = _Z10EA_16_45_nv
Offset = 0x000189e8, Count = 0002, Function = _Z8DoString9STRING_OP
Offset = 0x000195a0, Count = 0002, Function = _Z10EA_32_04_nv
Offset = 0x00093848, Count = 0002, Function = _Z10mem_writebjh
Offset = 0x0001955c, Count = 0002, Function = _Z10EA_16_06_nv
Offset = 0x000ab000, Count = 0001, Function = _ZN22VGA_LFBChanges_Handler6writewEjj
Offset = 0x0000b694, Count = 0002, Function = _Z6get_ZFv
Offset = 0x00093874, Count = 0001, Function = _Z10MEM_SetLFBjjP11PageHandler
Offset = 0x00094c80, Count = 0001, Function = _Z10mem_writewjt
count = 0138, percent = 38, name = Kickstart/kernel
Offset = 0x00017bb8, Count = 0001, Function = InternalAddTail
Offset = 0x0000cf68, Count = 0127, Function = HAL_TaskPostTerm
Offset = 0x00014f58, Count = 0008, Function = <NOT FOUND>
Offset = 0x000072bc, Count = 0002, Function = _impl_Supervisor
count = 0006, percent = 01, name = Kickstart/rtg.library
Offset = 0x00055634, Count = 0006, Function = <NOT FOUND>
count = 0009, percent = 02, name = Kickstart/timer.device.kmod
Offset = 0x0000203c, Count = 0009, Function = <NOT FOUND>
Summarized results :
Percent | Program
57 | Games:dosbox-0.72/dosbox
38 | SYS:Kickstart/kernel
1 | SYS:Kickstart/rtg.library
2 | SYS:Kickstart/timer.device.kmod
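The numbers in this report can be cross-checked with a few lines of Python. The counts below are copied from the report; the only assumption is that the percentages are truncated rather than rounded, which is what the figures suggest (207 out of 360 is 57.5% and is shown as 57).

```python
# Per-module sample counts copied from the report above.
counts = {
    "dosbox": 207,
    "Kickstart/kernel": 138,
    "Kickstart/rtg.library": 6,
    "Kickstart/timer.device.kmod": 9,
}
total = sum(counts.values())  # should match the sample count in the header

# Assumption: percentages are computed with integer division (truncation).
percents = {name: 100 * n // total for name, n in counts.items()}
for name, pct in percents.items():
    print(f"{pct:2d} | {name}")

rate = total / 6  # samples per second over the 6 second run
print(f"{total} samples, {rate:.0f} per second")
```

Because of the truncation, the percentages add up to 98 and not 100.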
What do we see?
- DosBox consumes 57% of the CPU time, which confirms what the CPU docky showed
- The rest of the time is mostly spent in the kernel, in HAL_TaskPostTerm (I suppose it is related to IDLE)
- In DosBox, the heavy functions are CPU_Core_Normal_Run, mem_readb, mem_readw and YMF262UpdateOne (audio); these are the same kind of results that oprofile gives on Linux.
- As for cache misses, it would be necessary to run Hieronymus again and compare the values. In any case, they are of limited interest as long as we don't know which functions caused the cache misses ...
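To pick out the heavy functions without reading the whole list, the detailed lines can be sorted by count. This is only a sketch: it assumes the line format is exactly the one printed in the report, and the helper name hottest is hypothetical. Only the first five dosbox lines are included here.

```python
import re

# A few lines copied from the dosbox section of the report.
REPORT = """\
Offset = 0x000c4060, Count = 0026, Function = _ZN7THEOPL315YMF262UpdateOneEiPsi
Offset = 0x0000b1a4, Count = 0014, Function = _Z6get_CFv
Offset = 0x0001a340, Count = 0071, Function = _Z19CPU_Core_Normal_Runv
Offset = 0x000937e4, Count = 0045, Function = _Z9mem_readbj
Offset = 0x0009532c, Count = 0020, Function = _Z9mem_readwj
"""

# Assumed line format: "Offset = <hex>, Count = <digits>, Function = <name>"
LINE = re.compile(r"Offset = (0x[0-9a-f]+), Count = (\d+), Function = (.+)")

def hottest(report, module_total):
    """Return (function, count, percent of the module) sorted by count."""
    rows = []
    for line in report.splitlines():
        m = LINE.match(line.strip())
        if m:
            rows.append((int(m.group(2)), m.group(3)))
    rows.sort(reverse=True)
    return [(name, count, 100 * count // module_total) for count, name in rows]

# 207 is the total number of samples attributed to dosbox in the report.
ranking = hottest(REPORT, module_total=207)
for name, count, pct in ranking:
    print(f"{count:4d} {pct:3d}% {name}")
```

The names stay mangled here; a demangler such as c++filt from binutils can turn them back into readable C++ names.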