I updated Hieronymus yesterday (birthday release!). It is available here on os4depot.
Hieronymus is a statistical profiler: it periodically samples what is running, then decodes that information to find out in which functions the CPU time was spent.
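To make the idea concrete, here is a minimal Python sketch of the attribution step. It is not Hieronymus's actual code: the symbol table, the sampled offsets and the helper names (resolve, attribute) are made up for illustration, and it assumes that each function extends until the next symbol starts.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical symbol table: (start offset, function name), sorted by offset.
SYMBOLS = [
    (0x1000, "init"),
    (0x1400, "decode"),
    (0x2000, "render"),
]
STARTS = [start for start, _ in SYMBOLS]


def resolve(offset):
    """Return the name of the function containing offset, or <NOT FOUND>."""
    # Find the last symbol whose start is <= offset.
    index = bisect_right(STARTS, offset) - 1
    if index < 0:
        return "<NOT FOUND>"
    return SYMBOLS[index][1]


def attribute(samples):
    """Count how many sampled offsets fall inside each function."""
    return Counter(resolve(offset) for offset in samples)


# Made-up program-counter offsets, one per sampling tick.
samples = [0x1404, 0x1410, 0x2008, 0x1404, 0x0800, 0x17FC]
hits = attribute(samples)
for name, count in hits.most_common():
    print(f"{name:12s} count={count:4d} percent={100 * count // len(samples):3d}")
```

Samples taken at different addresses inside the same function end up in one bucket, so the function where the program spends most of its time rises to the top. A real profiler also needs symbol sizes, otherwise the last symbol catches every address above it.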
It is useful with big programs, to find out which parts (functions) to optimize and to see whether the time is spent in the program itself or in libraries (and which ones).
It is really easy to use! Just make sure your program has its symbols, then run "Hieronymus duration=10", for example.
I improved the output (samples are now grouped by function name) and added the number of cache misses on G3/G4 (this information is harder to use, but at least we get it). I also raised the maximum duration to a much bigger value.
Here is an example of the output:
Hieronymus got 360 samples in 6 seconds.
Report of cache misses :
L1 inst cache misses : 2195632
L2 inst cache misses : 72082
L1 data cache misses : 4005281
L2 data cache misses : 381402
Detailed results :
count = 0207, percent = 57, name = dosbox
  Offset = 0x000c4060, Count = 0026, Function = _ZN7THEOPL315YMF262UpdateOneEiPsi
  Offset = 0x0000b1a4, Count = 0014, Function = _Z6get_CFv
  Offset = 0x0001a340, Count = 0071, Function = _Z19CPU_Core_Normal_Runv
  Offset = 0x000937e4, Count = 0045, Function = _Z9mem_readbj
  Offset = 0x0009532c, Count = 0020, Function = _Z9mem_readwj
  Offset = 0x00019b90, Count = 0008, Function = _Z10EA_16_46_nv
  Offset = 0x000aafc0, Count = 0006, Function = _ZN22VGA_ChainedVGA_Handler6writebEjj
  Offset = 0x000951b8, Count = 0001, Function = _Z9mem_readdj
  Offset = 0x0009386c, Count = 0001, Function = <NOT FOUND>
  Offset = 0x00096fd8, Count = 0001, Function = _ZN12MixerChannel14AddSamples_s16EjPs
  Offset = 0x00019be4, Count = 0001, Function = _Z10EA_16_45_nv
  Offset = 0x000189e8, Count = 0002, Function = _Z8DoString9STRING_OP
  Offset = 0x000195a0, Count = 0002, Function = _Z10EA_32_04_nv
  Offset = 0x00093848, Count = 0002, Function = _Z10mem_writebjh
  Offset = 0x0001955c, Count = 0002, Function = _Z10EA_16_06_nv
  Offset = 0x000ab000, Count = 0001, Function = _ZN22VGA_LFBChanges_Handler6writewEjj
  Offset = 0x0000b694, Count = 0002, Function = _Z6get_ZFv
  Offset = 0x00093874, Count = 0001, Function = _Z10MEM_SetLFBjjP11PageHandler
  Offset = 0x00094c80, Count = 0001, Function = _Z10mem_writewjt
count = 0138, percent = 38, name = Kickstart/kernel
  Offset = 0x00017bb8, Count = 0001, Function = InternalAddTail
  Offset = 0x0000cf68, Count = 0127, Function = HAL_TaskPostTerm
  Offset = 0x00014f58, Count = 0008, Function = <NOT FOUND>
  Offset = 0x000072bc, Count = 0002, Function = _impl_Supervisor
count = 0006, percent = 01, name = Kickstart/rtg.library
  Offset = 0x00055634, Count = 0006, Function = <NOT FOUND>
count = 0009, percent = 02, name = Kickstart/timer.device.kmod
  Offset = 0x0000203c, Count = 0009, Function = <NOT FOUND>
Summarized results :
Percent | Program
   57   | Games:dosbox-0.72/dosbox
   38   | SYS:Kickstart/kernel
    1   | SYS:Kickstart/rtg.library
    2   | SYS:Kickstart/timer.device.kmod
What do we see?
- DosBox consumes 57% of the CPU time, which confirms what the CPU docky was showing.
- The rest of the time, the kernel is in HAL_TaskPostTerm (I suppose this is related to idle time).
- In DosBox, the heaviest functions are CPU_Core_Normal_Run, mem_readb, mem_readw and YMF262UpdateOne (audio). These are the same kind of results that oprofile gives on Linux.
- As for cache misses, one would need to run Hieronymus again and compare the values. That said, these numbers are of limited use as long as we don't know which functions caused the misses.
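Since the detailed results are not sorted, the heaviest functions are easier to spot after ranking them by sample count. Here is a small Python sketch that parses an excerpt of the report above and does that. The regular expressions are only inferred from the example output, and parse is a hypothetical helper, not something Hieronymus provides.

```python
import re

# Excerpt of the report shown above.
REPORT = """\
count = 0207, percent = 57, name = dosbox
  Offset = 0x000c4060, Count = 0026, Function = _ZN7THEOPL315YMF262UpdateOneEiPsi
  Offset = 0x0001a340, Count = 0071, Function = _Z19CPU_Core_Normal_Runv
  Offset = 0x000937e4, Count = 0045, Function = _Z9mem_readbj
  Offset = 0x0009532c, Count = 0020, Function = _Z9mem_readwj
count = 0138, percent = 38, name = Kickstart/kernel
  Offset = 0x0000cf68, Count = 0127, Function = HAL_TaskPostTerm
  Offset = 0x00014f58, Count = 0008, Function = <NOT FOUND>
"""

# Line formats assumed from the example output.
MODULE = re.compile(r"^count = (\d+), percent = (\d+), name = (.+)$")
FUNCTION = re.compile(
    r"^\s+Offset = (0x[0-9a-fA-F]+), Count = (\d+), Function = (.+)$"
)


def parse(report):
    """Return {module: [(count, function), ...]}, highest count first."""
    modules = {}
    current = None
    for line in report.splitlines():
        module = MODULE.match(line)
        if module:
            current = modules.setdefault(module.group(3), [])
            continue
        function = FUNCTION.match(line)
        if function and current is not None:
            current.append((int(function.group(2)), function.group(3)))
    for rows in modules.values():
        rows.sort(key=lambda row: row[0], reverse=True)
    return modules


TOTAL_SAMPLES = 360  # from the "got 360 samples" line of the report
ranked = parse(REPORT)
for module, rows in ranked.items():
    print(module)
    for count, function in rows:
        share = 100 * count / TOTAL_SAMPLES
        print(f"  {count:4d}  {share:5.1f}%  {function}")
```

The C++ names stay mangled here. Piping the result through a demangler such as c++filt would make them readable, for example _Z9mem_readbj becomes mem_readb(unsigned int).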