This thread over at AW.net caught my attention. More specifically, this:
Quote:
Another disturbing fact I discovered was that during normal play with benchmark of the same video in 8000kbit mpeg2 format, 76% of the CPU time went into the cgx_wpa VO driver! That just has to be because of a superslow RAM->GFX speed because simple color conversion (which uses altivec btw) and copying just cannot be 3 times heavier than decoding the mpeg2 data.
That would put the RAM->VRAM transfer rate at under 130 MiB/s if it were running at 30 fps. However, the frame rate is even lower. He estimates it at about half, which would be ~65 MiB/s. That sounds ridiculously low for an A1-X1000, so I'm wondering two things:
- What is MPlayer using to do the transfer? WritePixelArray()? Or some custom copy routine?
- Which version of rtg.library were these tests run with? WritePixelArray() achieves ~400 MiB/s with 53.30, but only ~110 MiB/s with the older 41.4355 on the A1-X1000.
If the latest rtg.library was used, then I can only conclude that MPlayer's cgx_wpa must be doing something pretty inefficient when converting YUV to RGB and copying it to VRAM.
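For anyone who wants to redo the estimate above with their own numbers, here is a minimal sketch of the arithmetic in C. The function name and parameters are made up for illustration, and the frame size and pixel format of the benchmark video are not stated in this thread, so whatever values you pass in are your own assumptions. The idea is simply: bytes copied per second, divided by the fraction of wall-clock time spent in the video output driver.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of MPlayer or any Amiga library).
 * Estimates the RAM->VRAM throughput in MiB/s that a video output
 * driver is effectively achieving, given:
 *   width, height     - frame size in pixels (assumed by the caller)
 *   bytes_per_pixel   - size of one output pixel, e.g. 3 or 4
 *   fps               - frames actually displayed per second
 *   time_fraction     - share of wall-clock time spent in the driver,
 *                       e.g. 0.76 for 76%
 */
double required_copy_rate_mib_s(int width, int height, int bytes_per_pixel,
                                double fps, double time_fraction)
{
    assert(time_fraction > 0.0 && time_fraction <= 1.0);

    /* Total bytes pushed to the display per second of playback. */
    double bytes_per_second = (double)width * height * bytes_per_pixel * fps;

    /* Those bytes were moved in only time_fraction of each second. */
    return bytes_per_second / time_fraction / (1024.0 * 1024.0);
}
```

Because the result scales linearly with fps, halving the frame rate halves the estimate, which is how the ~130 MiB/s figure drops to ~65 MiB/s.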
Strange that nobody is interested in this! It's a very important thing for developers to discuss; it could bring us a faster video player that sits closer to our operating system and takes advantage of AmigaOS 4 features.
Sam 460EX, 2Gb Ram, Radeon R7 250, AmigaOS4.1 FE A4000 PPC604@233, Mediator A1200 PPC603@160, Mediator uA1 G3@800, 512 Mb [sold]
I assume this is all about the MUI MPlayer port from Fab? As far as I know, only Fab's port has both cgx and p96 drivers, while the old MPlayer has only p96 ones.
If so, the relevant parts of the drivers and video output are here:
- video_out and cgx_common: general ones
- vo_cgx_wpa: cgx driver
- vo_p96pip*: p96 driver, which I just took from the old MPlayer and adapted a bit
It seems a pretty safe bet that it's the MUI MPlayer. Looking at the cgx_wpa code, it is using WritePixelArray(). So, that brings me to the second question that I asked: which rtg.library version was it tested with?
How can one find out what version was out at a certain time? Seeing that the post was made on 16 December last year, I believe a newer one has been released since then? Or am I way off?
Antique wrote: How can one find out what version was out at a certain time? Seeing that the post was made on 16 December last year, I believe a newer one has been released since then? Or am I way off?
In a shell window, enter: version rtg.library
IIRC, version 53.30 was released via AmiUpdate shortly after Deniil started that thread. So, you're right that he was probably using version 41.4355. It would be interesting to see whatever tests he did repeated with the latest version.
I am using rtg.library 41.4355 (2011-07-14). I'll do an update and test again
Edit: Except I can't, because AmiUpdate crashes.
Edit2: OK, managed to update, but I honestly can't see any difference whatsoever on the borderline videos. I will test with the benchmark when I get time.
Edited by Deniil on 2013/2/20 22:46:21
Software developer for Amiga OS3 and OS4. Develops for OnyxSoft and the Amiga using E and C and occasionally C++
Deniil wrote: Edit2: OK, managed to update, but I honestly can't see any difference whatsoever on the borderline videos. I will test with the benchmark when I get time.
Let's see what the benchmarks say. If it makes no real difference, then the bottleneck isn't the RAM => VRAM transfer operation.
EDIT: BTW, which version of MPlayer are you using for the tests? There may be versions floating around that use an experimental custom RAM => VRAM copy routine instead of WritePixelArray().
If you see this line: [swscaler @ 0x4cdaf840]using unscaled yuv420p -> rgb24 special converter
I'm pretty sure it doesn't scale because it fits perfectly on a 1280x720 screen, and there is no way to even force it to scale without overlay. I have many clips I'd like to scale up but it refuses.
It's best not to make any assumptions. Have a look at part of the cgx_wpa code:
static int draw_slice(uint8_t *image[], int stride[], int w,int h,int x,int y)
{
#if 1
uint8_t *dst[3];
int dstStride[3];
Note the use of a function called sws_scale_ordered(). I have no idea if this does or doesn't make a difference. However, all I can say is that the performance benchmark doesn't stack up with what the RAM => VRAM speed should be, even when taking into account the YUV => RGB conversion.
My main point is that you have no idea what MPlayer is actually doing behind the scenes. Did you try the -noaspect option?
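To give a feel for how much work the "simple color conversion" is per pixel, here is a rough sketch of a YUV => RGB conversion in C. This is NOT the code that MPlayer or swscale uses (those have their own optimised, partly AltiVec, routines); it is just the widely used BT.601 video-range integer approximation, and the function names are invented for this example.

```c
#include <stdint.h>

/* Clamp an intermediate result to the 0..255 range of an 8-bit channel. */
uint8_t clamp_u8(int v)
{
    if (v < 0)   return 0;
    if (v > 255) return 255;
    return (uint8_t)v;
}

/*
 * Illustrative only: convert one BT.601 video-range sample
 * (Y in 16..235, U/V centred on 128) to 8-bit R, G, B using
 * fixed-point integer arithmetic scaled by 256.
 */
void yuv_to_rgb_pixel(uint8_t y, uint8_t u, uint8_t v, uint8_t rgb[3])
{
    int c = (int)y - 16;   /* luma with the black-level offset removed */
    int d = (int)u - 128;  /* blue-difference chroma, signed */
    int e = (int)v - 128;  /* red-difference chroma, signed */

    /* +128 rounds to nearest before dividing by the 256 scale factor. */
    rgb[0] = clamp_u8((298 * c + 409 * e + 128) / 256);            /* R */
    rgb[1] = clamp_u8((298 * c - 100 * d - 208 * e + 128) / 256);  /* G */
    rgb[2] = clamp_u8((298 * c + 516 * d + 128) / 256);            /* B */
}
```

That is a handful of multiplies and adds per pixel, which is why a conversion plus copy step taking far more CPU time than the MPEG-2 decode looks suspicious.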
Haven't tried the -noaspect switch yet but will do when I get home.
Another annoying thing I noticed with the cgx_wpa driver is that it can't handle 852x480 without inverting colours. 854 works though. Too bad I'm too bored and lazy to set up and compile it myself...