I've again spent time trying to optimize dosbox. It is difficult because the core is almost impossible to change. I made some changes that may bring improvements, but I need people to test them, mostly on the Sam, where a performance difference will be easier to see.
You can download it here. I will only finalize it and put it on os4depot if it shows improvements.
Tried it. Tested the game "Mortal Kombat 3". Visually everything looks the same. Maybe loading is a bit faster, and the emulator itself starts a bit faster. But the game itself (in terms of graphics rendering speed) is the same. All of that on a Peg2 at 1 GHz.
A few days ago I also tested dosbox (win32 version) on a PC that also had a 1 GHz CPU. It feels 4-5 times faster compared with the AOS4 version. Mortal Kombat 3 just runs insanely fast. I know it is not really fair to compare an x86 emulator on x86 with an x86 emulator on PPC, but still, the CPU speed is the same...
One man on a Russian forum said this (maybe it can help you in the future with dosbox porting): --- On win32, dosbox uses a "recompiler". For now that is only done for x86 and ARM (and that is why it works so fast). In general, if someone writes such a recompiler for PPC, then dosbox will fly on PPC too. And that should not be so hard, you just need to learn a bit of PPC asm ---
Not sure whether it will help or not...
PS: By the way, I also tried dosbox on AROS with the diskmags we talked about before (Hugi, etc.) that show wrong colors. On AROS it works fine, which means the problem is 100% in the big/little endian handling, when it starts working in protected mode.
Just tried Mortal Kombat 1 but it's the same. Of course I need more testing and need to explore all the various configurations, not to mention the games' details...
I'll keep you informed, as I need this emulator like daily bread.
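To illustrate how an endian mistake could produce wrong colors, here is a minimal sketch. It is not dosbox code: the helper names (read_le16, read_be16, unpack_rgb565) and the RGB565 pixel format are assumptions chosen only for the example. It shows that the same two bytes of guest memory decode to a different color when they are read in big-endian order.

```cpp
#include <cstdint>

// Hypothetical helpers, not taken from dosbox.

// Read a 16-bit little-endian value (x86 guest order) byte by byte,
// so the result is the same on any host.
inline uint16_t read_le16(const uint8_t* p) {
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

// What a plain 16-bit load would yield on a big-endian host such as PPC.
inline uint16_t read_be16(const uint8_t* p) {
    return static_cast<uint16_t>((p[0] << 8) | p[1]);
}

struct Rgb { unsigned r, g, b; };

// Split an RGB565 pixel into its components (format assumed for the example).
inline Rgb unpack_rgb565(uint16_t px) {
    return { (px >> 11) & 0x1Fu, (px >> 5) & 0x3Fu, px & 0x1Fu };
}
```

With the guest bytes 0x00 0xF8 (pure red, 0xF800, as an x86 program stores it), read_le16 gives pure red while read_be16 gives a green/blue mix, which is the kind of color error described above.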
No, I didn't use ibm_perflibs, because in dosbox almost all the time is spent in the core (decoding x86 and doing heavy computation). I tried to reorganize some things to avoid cache misses (putting data that is accessed together on the same cache line, prefetching data), improve branch prediction, add ... but I am not sure the effect is visible. Only some peripheral parts can be optimized (which means a few percent).
The big improvement would be to write a dynamic core (called recompiler in a previous post) that produces PPC code. Do you know why dosbox is so slow on PPC? Not because PPC is bad; it is even very good (I was impressed: sometimes I thought I had found something to optimize and the compiler had already done it!). It is slow compared to the x86 versions because it does not use JIT technology.
I will look at the code again to see how hard it is, but nothing is documented and the code is a mess, even if in the end the program is very good...
Elwood: About Hieronymus, I didn't port it to OS4, I created it. And yes, again with dosbox, I can confirm it lists the same slow functions as oprofile on Linux.
kas1e: About Hugi and x-ray, I tried of course, and even on Linux PPC there is a problem with colors. I checked the SDL surfaces and their masks; they are good. I think these programs access the hardware in a different way... I don't know if we could fix that.
What is the value of the Cycles parameter in the config file? Did you set it to "max"? If I remember correctly, I chose 3000 cycles. Maybe you could decrease this value.
With 100% CPU, we know that there is not enough CPU power, so by reducing the workload, to 80% for example, we could see a difference.
@others: Any results? Not interested in dosbox?
Oh... for people who will test, please tell me which machine you use. Thanks!
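For readers who wonder what this kind of tuning looks like, here is a rough sketch. It is not the actual dosbox change: HotState, step and checksum are made-up names, the 64-byte alignment is an assumed cache line size, and the hints rely on the GCC builtins __builtin_expect and __builtin_prefetch (with a no-op fallback for other compilers).

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__GNUC__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define PREFETCH(p) __builtin_prefetch(p)
#else
#define LIKELY(x)   (x)
#define PREFETCH(p) ((void)0)
#endif

// Hypothetical hot CPU state: fields touched together are kept together
// and aligned so they start on one cache line (64 bytes is an assumption).
struct alignas(64) HotState {
    uint32_t regs[8];
    uint32_t eip;
    uint32_t flags;
};

// Branch hint: tell the compiler the common path (any opcode but HLT, 0xF4).
inline uint32_t step(HotState& s, uint8_t opcode) {
    if (LIKELY(opcode != 0xF4)) {
        s.eip += 1;   // toy behaviour: just advance the instruction pointer
        return 1;
    }
    return 0;         // rare path
}

// Prefetch: request the next 64-byte chunk while summing the current one.
inline uint32_t checksum(const uint8_t* data, size_t n) {
    uint32_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if ((i % 64) == 0 && i + 64 < n) PREFETCH(data + i + 64);
        sum += data[i];
    }
    return sum;
}
```

None of these hints change the result, only how the work is laid out for the CPU, which is why the gain tends to be a few percent at best.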
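To give a feeling for why a dynamic core helps, here is a toy sketch. It is not x86, not PPC code generation, and not how dosbox does it: the guest instruction set (0 = stop, 1 = add immediate, anything else = subtract immediate) and all names are invented for the example. A real dynamic core emits native machine code; this sketch only caches pre-decoded blocks, which shows the same idea of paying the decoding cost once per block instead of on every pass.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Toy guest CPU and a pre-decoded operation (handler + operand).
struct Cpu { int32_t acc = 0; };
using Handler = void (*)(Cpu&, int32_t);
struct Op { Handler fn; int32_t imm; };

static void op_add(Cpu& c, int32_t v) { c.acc += v; }
static void op_sub(Cpu& c, int32_t v) { c.acc -= v; }

// Decode guest bytes starting at pc until the stop opcode (0).
// 'decodes' counts how many instructions had to be decoded.
static std::vector<Op> translate(const std::vector<uint8_t>& mem,
                                 size_t pc, unsigned& decodes) {
    std::vector<Op> block;
    while (pc + 1 < mem.size() && mem[pc] != 0) {
        ++decodes;
        block.push_back({mem[pc] == 1 ? op_add : op_sub, mem[pc + 1]});
        pc += 2;
    }
    return block;
}

// Cache of translated blocks, keyed by guest address.
struct BlockCache {
    std::unordered_map<size_t, std::vector<Op>> blocks;
    unsigned decodes = 0;

    void run(Cpu& cpu, const std::vector<uint8_t>& mem, size_t pc) {
        auto it = blocks.find(pc);
        if (it == blocks.end())  // first visit: translate once
            it = blocks.emplace(pc, translate(mem, pc, decodes)).first;
        for (const Op& op : it->second) op.fn(cpu, op.imm);  // replay
    }
};
```

Running the same block several times through BlockCache::run decodes it only on the first visit. A pure interpreter would decode every instruction on every pass, and games spend most of their time in loops.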
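For reference, the setting in question would look roughly like this in the config file, assuming the usual dosbox.conf layout with a [cpu] section (3000 is the value mentioned above, taken from memory; the exact file name and defaults in this build may differ):

```
[cpu]
cycles=3000
```

Lowering the number makes the emulated CPU slower but leaves the host with some headroom.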
With 100% CPU, we know that there is not enough CPU power
(Off-topic) It comes as a sort of surprise how easily you can reach 100% CPU usage on the SAM (667 MHz at least). Try loading a long ASCII text file into MultiView and then use the scroller to move back and forth.