As far as YUV->RGB conversion goes, from what I can gather from the source code, avcodec already returns the video frame buffer in ARGB32. I tried to change it so that compositing does the conversion instead, but that resulted in a nice big crash (kas1e might have more details).
While hardware compositing now returns successfully, I am currently blitting the video buffers to a bitmap, then compositing to the bigger bitmap, and blitting again to a buffer suitable for cairo rendering. I was hoping the overhead added by this would be magnitudes lower than scaling the data on the CPU; it seems I was wrong, or I still need to remove a few steps, for instance allocating/freeing both the src and dst bitmaps for each frame.
If the data alignment is correct, I may also lock the destination buffer and pass the image data directly to cairo without blitting it to another buffer again; something like the sketch below.
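A minimal sketch of that idea (hypothetical variable names; cairo's stride-alignment rules would have to match the lock's bytes-per-row):

#include <cairo.h>

/* Assumption: lockedBase/lockedBytesPerRow come from locking the
 * destination bitmap. Only safe if cairo's required stride for this
 * width equals the locked bytes-per-row. */
if (cairo_format_stride_for_width(CAIRO_FORMAT_ARGB32, width)
        == (int)lockedBytesPerRow)
{
    cairo_surface_t *surface = cairo_image_surface_create_for_data(
        lockedBase, CAIRO_FORMAT_ARGB32, width, height,
        (int)lockedBytesPerRow);

    /* ... render with cairo ... */

    cairo_surface_destroy(surface);
    /* ... then unlock the bitmap ... */
}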
If someone's willing to try changing this to see if it behaves better, please do so; I can't commit to the task this week.
Meanwhile, I've found where it asks for the desired pixel format; it is at Odyssey-master/odyssey-r155188-1.23/Source/WebKit/OrigynWebBrowser/Api/MorphOS/owbbrowserclass.cpp, line 3923:
(copied from the part where it sets up the overlay window)
I think this is the part that was missing for getting the correct YUV format and letting compositing do the conversion. Please be aware that I am not familiar with the YUV byte-size details, so this might need some adjustments in Odyssey-master/odyssey-r155188-1.23/BAL/Media/WebCore/MorphOS/BCMediaPlayerPrivateMorphOS.cpp
Btw, is compositing supported by the older cards we have, like the Radeon 9250, etc.? If I remember right, it is supported there as well? (I mean, will adding compositing acceleration work everywhere, both on new Radeons and on the old ones?)
IIRC, nobody has implemented composited video in the old atiradeon.chip driver. So, an overlay fallback might still be worth writing for the old cards.
Quote:
Just to understand it better: by "Use CompositeTags() to do the YUV=>RGB conversion when blitting to the screen", do you mean "YUV=>RGB conversion and scaling on the fly"?
I.e., does just using CompositeTags() by itself mean the YUV->RGB conversion will be done, so the programmer need not worry about it and doesn't have to pass any special flags to the function?
Yes. You give it a YUV input bitmap and an RGB destination, and the conversion just happens (if the driver supports it). There are extra flags to control a few options (e.g., the YUV=>RGB matrix to use), but the basic operation is as simple as a regular RGB=>RGB CompositeTags().
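For illustration, a basic call could look roughly like this (just a sketch; srcbm is assumed to be a YUV bitmap in VRAM, dstbm an RGB one, and the exact tag set should be double-checked against the graphics.library autodocs):

#include <proto/graphics.h>
#include <graphics/composite.h>

/* Sketch: scale-blit a YUV source onto an RGB destination; the
 * driver does the YUV=>RGB conversion on the fly. */
IGraphics->CompositeTags(COMPOSITE_Src, srcbm, dstbm,
    COMPTAG_SrcWidth,  srcWidth,
    COMPTAG_SrcHeight, srcHeight,
    COMPTAG_ScaleX,    COMP_FLOAT_TO_FIX((float)dstWidth  / srcWidth),
    COMPTAG_ScaleY,    COMP_FLOAT_TO_FIX((float)dstHeight / srcHeight),
    COMPTAG_DestX,     dstX,
    COMPTAG_DestY,     dstY,
    COMPTAG_Flags,     COMPFLAG_SrcFilter | COMPFLAG_IgnoreDestAlpha
                     | COMPFLAG_HardwareOnly,
    TAG_DONE);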
@Ami603 Quote:
As far as YUV->RGB conversion goes, from what I can gather from the source code, avcodec already returns the video frame buffer in ARGB32. I tried to change it so that compositing does the conversion instead, but that resulted in a nice big crash (kas1e might have more details).
The avcodec library will convert to ARGB32 *if* you ask it to. You'll want to disable that if you want to take advantage of composited video.
Likewise, you'll want to avoid feeding the result back into Cairo, because that means copying the RGB frame from VRAM back into RAM, which is going to be rather slow.
EDIT: I see from your followup post that you'd already found out how to get the video's actual pixel format from avcodec...
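For reference, the usual libavcodec side of this looks roughly like the sketch below (hedged: the exact names depend on the avcodec version Odyssey bundles; older releases spell the constant PIX_FMT_YUV420P):

#include <libavcodec/avcodec.h>

/* Sketch: after decoding, use the frame's native planar data instead
 * of converting to ARGB32 via swscale. 'frame' is the decoded AVFrame,
 * 'codecCtx' its AVCodecContext. */
if (codecCtx->pix_fmt == AV_PIX_FMT_YUV420P) {
    uint8_t *sY  = frame->data[0]; int stY  = frame->linesize[0];
    uint8_t *sCb = frame->data[1]; int stCb = frame->linesize[1];
    uint8_t *sCr = frame->data[2]; int stCr = frame->linesize[2];
    /* ... copy these planes into the locked YUV bitmap ... */
}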
@All Please don't forget to add changes via #ifdef __amigaos4__, to make porting the 1.25 version later easier (so things can mostly be copy+pasted) and to keep the ability to build it cross-platform. MorphOS parts can be kept either via #ifndef __amigaos4__ or via #ifdef __morphos__, i.e. a pattern like the sketch below.
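/* Sketch of the pattern: */
#ifdef __amigaos4__
    /* new OS4 path: composited video via graphics.library */
#else
    /* original MorphOS path: CGX overlay, kept as-is */
#endif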
We have gotten pretty far with Ami603 now. The only file we touch now is owbbrowserclass.cpp, in the hope of recreating for OS4 the same thing it does for MorphOS.
Currently what we have is: pressing the fullscreen button works, it opens fullscreen, switching back works, etc. But when we try to do the bitmap copy, we crash in that Blit() method.
Without the copy we actually have all functionality working: we can go to/from fullscreen, the video isn't interrupted and keeps playing when we switch between window and fullscreen (it's just that in fullscreen everything is black since no copy is done, but the music can be heard). So all we need is a correct copy (see the amigaos4 ifdefs in DEFSMETHOD(OWBBrowser_VideoBlit)).
But we can already see that the CPU load is reduced a lot (3x), so decoding of frames certainly happens, and much faster. Bitmaps are allocated just once now, and rendering should go directly to VRAM without going through Cairo. There is no conversion to RGB either, since compositing does that, so we can hope that pure blitting shouldn't take away more resources, and we can expect a pretty big speedup compared with the "cairo window".
Now Ami603 has simply run out of time, and we need your help (and help from others, of course) to check the new owbbrowserclass.cpp file and find out what we are doing wrong there :). Vicente does the bitmap copy differently, so we need to figure out what's up.
I've taken a quick look at the code. Is it the actual BltBitMap() call that's crashing? Or OWBBrowser_VideoBlit()?
If it's the latter, then the problem is most likely that the code assumes that the U & V planes are completely separate, whereas Picasso96 stores them in an interleaved format.** So, the following won't work properly:

CopyMem(sCb, pCb, (ptCb * h) >> 1);
CopyMem(sCr, pCr, (ptCr * h) >> 1);

This will read beyond the boundaries of the source frame, and write beyond the end of the destination bitmap, because stCb != ptCb and stCr != ptCr.
The interleaved format is like this:

YYYYYY
YYYYYY
YYYYYY
YYYYYY
UUUVVV   <<== EDIT: Could also be VVVUUU. Make no assumptions.
UUUVVV
Where Y, U, & V are pixels for each of the channels. Notice how UBytesPerRow == VBytesPerRow == YBytesPerRow, and each row contains both U & V data. I recommend adding a check for this, and using different code to do the copy. Try disabling the if(stY == ptY...) section, which will force it to use the "else do" code below. If that works, then switching to something like the following should work for Picasso96's interleaved format:
// Copy the Y channel
CopyMem(sY, pY, ptY * h);

// Now copy U & V (interleaved: each row holds both Cb and Cr data)
do {
    CopyMem(sCb, pCb, w2);
    CopyMem(sCr, pCr, w2);
    sCb += stCb;
    sCr += stCr;
    pCb += ptCb;
    pCr += ptCr;
    h -= 2;
} while (h > 0);
Just a few side comments:
- The BltBitMap() call should use the srcbm/destbm size rather than msg->width/msg->height (see the sketch after this post)
- Some video codecs allow you to set the Y, U, & V destination buffers, which would allow you to decode directly into srcbm (after getting it working first...)
- I see no reason to use vertex arrays for compositing. Standard rect compositing should do just fine
Hans
** This has tripped multiple people up in the past. I wish Picasso96 had gone with completely separate planes, but it didn't.
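Regarding the first side comment above, a sketch of what I mean (GetBitMapAttr() and BltBitMap() are standard graphics.library; the variable names are assumed from the code under discussion):

#include <proto/graphics.h>

/* Sketch: clamp the blit size to what both bitmaps can actually hold,
 * instead of trusting msg->width/msg->height blindly. */
uint32 sw = IGraphics->GetBitMapAttr(srcbm,  BMA_WIDTH);
uint32 sh = IGraphics->GetBitMapAttr(srcbm,  BMA_HEIGHT);
uint32 dw = IGraphics->GetBitMapAttr(destbm, BMA_WIDTH);
uint32 dh = IGraphics->GetBitMapAttr(destbm, BMA_HEIGHT);
uint32 w  = (sw < dw) ? sw : dw;
uint32 h  = (sh < dh) ? sh : dh;

IGraphics->BltBitMap(srcbm, 0, 0, destbm, 0, 0, w, h, 0xC0, 0xFF, NULL);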
I'm getting some strange YUV memory addresses and bytes-per-row values after locking the private bitmap. Also, the base address is NULL. Possibly my own bugs, but just mentioning it in case it happens there as well. I'll have to test Hans's original example later.
Try debugging the parameters you are using and be aware that kprintf is only defined to DebugPrintF in some files - this bit me yesterday and I got some confusing serial logs.
Quote:
Try debugging the parameters you are using and be aware that kprintf is only defined to DebugPrintF in some files - this bit me yesterday and I got some confusing serial logs.
Yeah, for the next beta I want to fully get rid of the clib/debug_protos.h include, kprintf(), and libdebug.a. I'm just thinking about how to do it best: either replace kprintf everywhere in all files, or add some #define kprintf DebugPrintF in a .h file which gets included in every file where kprintf() is called. It would of course still say "kprintf", which can be misleading, but then there's no need to change a lot of code. I don't know what is better for portability later.
Maybe it's better to create some cross_debug.h file with the defines for kprintf/DebugPrintF, and change kprintf() in all the code to some DebugPrintF() function?
Imho the right way is the latter; we can do it like we do in dopus5, with a "debug.h" file, which is:
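Something in this direction (a hypothetical sketch, not the actual dopus5 file):

/* debug.h -- route debug output to the right function per platform */
#ifndef CROSS_DEBUG_H
#define CROSS_DEBUG_H

#ifdef __amigaos4__
  #include <proto/exec.h>
  /* OS4: exec's DebugPrintF() writes to the serial/debug output */
  #define kprintf IExec->DebugPrintF
#else
  /* MorphOS/68k: kprintf() comes from clib/debug_protos.h + libdebug.a */
  #include <clib/debug_protos.h>
#endif

#endif /* CROSS_DEBUG_H */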
Quote:
IIRC, nobody has implemented composited video in the old atiradeon.chip driver. So, an overlay fallback might still be worth writing for the old cards.
I have a Radeon 9250 (I don't know if it's the SE). What is the difference between Workbench compositing, the vo_comp option of mplayer, or the compositing mode in the OS4 SDL2 port, which all work with these cards, and the composited video you're referring to?
Workbench: window transparency effects using alpha channel.
SDL2: scaling and rotating of textures, alpha blending. Currently it supports only RGBA, but if YUV support were added, then somebody could use SDL to implement video player backends (see the sketch below). IIRC the OGLES2 renderer supports YUV already.
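For reference, this is what a player backend would call if the renderer grew YUV texture support (standard SDL2 API; w/h and the plane pointers come from the decoder):

#include <SDL2/SDL.h>

/* Sketch: stream decoded planar YUV frames into a texture and draw it. */
SDL_Texture *tex = SDL_CreateTexture(renderer, SDL_PIXELFORMAT_IYUV,
                                     SDL_TEXTUREACCESS_STREAMING, w, h);
SDL_UpdateYUVTexture(tex, NULL,
                     yPlane, yPitch,
                     uPlane, uPitch,
                     vPlane, vPitch);
SDL_RenderCopy(renderer, tex, NULL, NULL);
SDL_RenderPresent(renderer);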
The problem with LockBitMapTags and the strange plane pointers (and the DSI) seems to be that CGX also has this function, and it was being used instead of the graphics.library one! After some hacking (inline4/cybergraphics.h) there is some video without crashing. I suppose this hack will help there, too?
Now this mess just needs to be cleaned up. If it were up to me, I would start replacing CGX calls with graphics.library ones and put the IGraphics-> etc. there. Or at least start reducing CGX usage.
IF we cannot get the correct YUV data via CGX, of course.
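The graphics.library call I'm aiming at looks roughly like this (a sketch; the LBM_* tag names should be double-checked against the autodocs, the point being that the explicit IGraphics-> prefix keeps CGX's same-named inline from hijacking the call):

#include <proto/graphics.h>

uint8 *base = NULL;
uint32 bytesPerRow = 0;

APTR lock = IGraphics->LockBitMapTags(bm,
    LBM_BaseAddress, &base,
    LBM_BytesPerRow, &bytesPerRow,
    TAG_DONE);
if (lock != NULL) {
    /* ... access the pixel data ... */
    IGraphics->UnlockBitMap(lock);
}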
We just tried to comment out that block with the copy and add it like you showed, i.e. ...
Just realized that my code example won't work, because it assumes that the Y channel's source and destination widths are the same. So forget that, and go back to the original "else do" code.
EDIT: It will work if you put it in the "if (stY == ptY && w == msg->width)" section, where the source and destination bytes-per-row are the same.
@Capehill Ah, that explains why it was crashing. Best get rid of all CGX calls and rely on the graphics.library alone.
Quote:
The problem with LockBitMapTags and the strange plane pointers (and the DSI) seems to be that CGX also has this function, and it was being used instead of the graphics.library one! After some hacking (inline4/cybergraphics.h) there is some video without crashing. I suppose this hack will help there, too?
Now this mess just needs to be cleaned up. If it were up to me, I would start replacing CGX calls with graphics.library ones and put the IGraphics-> etc. there. Or at least start reducing CGX usage.
IF we cannot get the correct YUV data via CGX, of course.
I remember that when I started to use some graphics functions instead of the CGX ones, I began having issues in other places. Also, when combining -D__USE_INLINE__ with IGraphics-> inside some namespace (webcore, or namespace wtf), I had compile issues too.
But sure, you are right, the mess needs to be cleaned out, at least in owbbrowserclass.cpp for a start.
Maybe we can just rename the current owbbrowserclass.cpp to owbbrowserclass_morphos.cpp, and make our own without CGX usage at all (where possible)?
The code is just too CGX-based all over, so replacing it all can be a long task.