Any altivec experts? (H.264 codec)


Hans

Posted on: 2014/11/6 1:13 #1

Home away from home

I've been thinking about ways to improve video playback further without having to wait for HW decoding. I've already mentioned enabling direct-rendering, which should help a little.

However, comparing the PowerPC code in ffmpeg to x86, it's clear that the H.264 codec in particular is only partially altivec optimized (hint: look at the h264_* files in both directories).

So, do we have any altivec experts in the community who would be interested in checking this out?

Hans

P.S., feel free to ask MorphOS developers too, since any improvements would benefit people in both communities that have altivec machines (G4/G5/PA6T CPUs).

Join Kea Campus' Amiga Corner and support Amiga content creation
https://keasigmadelta.com/ - see more of my work

Belxjander

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/6 7:11 #2

Just popping in

@Hans

I'd be willing to donate a little attention to this... which would provide an initial base for more knowledgeable coders to update if a 100% native assembly routine is required (I only really know how to optimize for 020/040 processors, and the same rules appear to work well on the PPC from what I have tried).

K-L

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/6 8:13 #3

Just can't stay away

Hans: LiveForIt seems to be a great AltiVec expert (with his latest MPlayer version).

Maybe in the Mac world or Linux world...

--
AmigaONE X1000 and Radeon RX 560

Hans

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/6 8:34 #4

Home away from home

@Belxjander

Quote:

I'd be willing to donate a little attention to this... which would provide an initial base for more knowledgeable coders to update if a 100% native assembly routine is required (I only really know how to optimize for 020/040 processors, and the same rules appear to work well on the PPC from what I have tried).

This isn't really about using assembly, but about using the vector instructions in the altivec unit (like the SSE instructions in x86 processors). Ever used anything like that before?

You can look at the code that I linked to in my first post, to see if you think that you could contribute.
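To make the distinction concrete, here is a minimal sketch of what "using the vector instructions" looks like from C. It assumes GCC-style AltiVec intrinsics (enabled with `-maltivec`, which defines `__ALTIVEC__`), 16-byte aligned buffers, and a length that is a multiple of 16. The function name `avg_pixels` is made up for illustration and is not an ffmpeg function; a rounded average of this kind is typical of what pixel-averaging steps in video codecs need.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __ALTIVEC__
#include <altivec.h>
#endif

/* Rounded average of two pixel rows: dst[i] = (a[i] + b[i] + 1) >> 1.
 * Hypothetical helper. Assumes all three pointers are 16-byte aligned
 * and that n is a multiple of 16. */
void avg_pixels(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
#ifdef __ALTIVEC__
    /* Vector path: one vec_avg handles 16 pixels at a time. */
    for (size_t i = 0; i < n; i += 16) {
        vector unsigned char va = vec_ld(0, a + i);
        vector unsigned char vb = vec_ld(0, b + i);
        vec_st(vec_avg(va, vb), 0, dst + i);
    }
#else
    /* Plain C path: one pixel per iteration. */
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);
#endif
}
```

Both paths produce the same bytes; the vector path simply does 16 of them per instruction, which is where the potential speed-up comes from.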

@K-L
Quote:

Hans: LiveForIt seems to be a great AltiVec expert (with his latest MPlayer version).

I doubt that LiveForIt touched any altivec-specific code when he created his version. The partially altivec optimized code was already there. Plus, you shouldn't expect him to do everything. There are only so many hours in the day, and he has other things to do too.

Of course, he's welcome to have a look at the H.264 SIMD (SSE/Altivec/etc.) code, if he wants to.

Hans


corto

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/6 9:00 #5

Not too shy to talk

@Hans: I think that a request could be made to the ffmpeg team. I also want to point out a possible opportunity: the developer behind freevec.org specializes in SIMD and AltiVec, and he recently offered his services (as paid work).

Hans

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/6 22:48 #6

Home away from home

@corto

Quote:

I think that a request could be made to the ffmpeg team.

Would that really make a difference? I mean, they already know that the altivec code isn't as developed as the x86 & ARM counterparts.

The impression that I get is that there is little interest in improving the PowerPC-specific code due to the small number of desktop/mobile-devices that use it. This is why I'm asking here.

Quote:

I also want to point out a possible opportunity: the developer behind freevec.org specializes in SIMD and AltiVec, and he recently offered his services (as paid work).

That might be worthwhile, provided that enough people are interested in footing the bill.

At this stage I have no idea how much difference could be made with more altivec-optimized functions, nor do I know how much work (i.e., cost) it would take. If we had detailed profiling/benchmark data from the x86/ARM code showing what difference the extra SIMD-optimized functions make, then we could probably guess how much performance could be gained. However, I have never seen such data.

Hans


LiveForIt

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/7 0:01 #7

Home away from home

@Hans

Quote:

Hans: LiveForIt seems to be a great AltiVec expert (with his latest MPlayer version).

Nope, I know some PPC assembler, but I don't know how to write AltiVec code.
I know in theory how it works, that's all.

I know how to enable AltiVec in GCC and how to set the defines that switch between the already optimized AltiVec code and the standard C code.
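A rough sketch of that kind of switch, under assumptions: the define `HAVE_ALTIVEC`, the `DSPTable` struct, and all function names below are illustrative only, loosely modelled on the common pattern of filling a table of function pointers with C versions and then overriding entries when an optimized version is compiled in. The AltiVec variant assumes 16-byte aligned pointers.

```c
#include <stdint.h>

#ifndef HAVE_ALTIVEC
#define HAVE_ALTIVEC 0  /* illustrative build-time define, normally set by the build system */
#endif

typedef void (*add_bytes_fn)(uint8_t *dst, const uint8_t *src, int n);

/* Hypothetical table of routines that the rest of the decoder would call through. */
typedef struct {
    add_bytes_fn add_bytes;
    const char  *impl;  /* which variant was selected */
} DSPTable;

/* Standard C version: dst[i] += src[i], wrapping modulo 256. */
static void add_bytes_c(uint8_t *dst, const uint8_t *src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

#if HAVE_ALTIVEC
#include <altivec.h>
/* AltiVec version: 16 bytes per iteration, scalar loop for the tail.
 * Assumes dst and src are 16-byte aligned. */
static void add_bytes_altivec(uint8_t *dst, const uint8_t *src, int n)
{
    int i = 0;
    for (; i + 16 <= n; i += 16) {
        vector unsigned char d = vec_ld(0, dst + i);
        vector unsigned char s = vec_ld(0, src + i);
        vec_st(vec_add(d, s), 0, dst + i);
    }
    for (; i < n; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}
#endif

/* Start with the C routines, then override whatever has an optimized version. */
void dsp_init(DSPTable *t)
{
    t->add_bytes = add_bytes_c;
    t->impl = "c";
#if HAVE_ALTIVEC
    t->add_bytes = add_bytes_altivec;
    t->impl = "altivec";
#endif
}
```

With this layout, a function that has no AltiVec version simply keeps its C entry, which is what "only partially optimized" means in practice.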

What I do know is how to fix bugs and implement stuff. I like graphics and like to be creative with it.

Quote:

There are only so many hours in the day, and he has other things to do too.

Yes, I have. Besides, I get sick of working on the same thing for too long.

Quote:

Of course, he's welcome to have a look at the H.264 SIMD (SSE/Altivec/etc.) code, if he wants to.

Right now I'm busy talking to Reimar, one of the core developers working on MPlayer. I hope he can give me some hints on what to improve.

Maybe there is something to improve in the draw_slice() function; I'm hoping that he might have some ideas.

Right now I have to convert between interleaved and non-interleaved video. I'm hoping there is a way to force FFmpeg to use interleaved yuv420p.

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.

Hans

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/7 1:07 #8

Home away from home

@LiveForIt

Quote:

Right now I have to convert between interleaved and non-interleaved video. I'm hoping there is a way to force FFmpeg to use interleaved yuv420p.

I remember discussing this with one or two people before (possibly you too). Anyway, so long as their code treats the Y, U & V planes as totally independent (i.e., uses the pointer and bytes-per-row of each independently), then interleaved/non-interleaved is entirely irrelevant.

Treating them independently also means that the code needs to keep its mitts off the padding area (which is where the other chroma plane is stored in interleaved mode). To be honest, poking about in the padding area is a pointless waste of CPU resources, so there really is no reason for the code to be doing that, interleaved or not.

Code that follows the rules above can handle both interleaved and non-interleaved bitmaps without any special cases.
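A minimal sketch of those rules, with a hypothetical `copy_plane` helper (not taken from MPlayer or ffmpeg): each plane is described only by its own pointer and bytes-per-row, and only `width` bytes are written per row, so anything past that (padding, or the other chroma plane in interleaved mode) is left alone.

```c
#include <stdint.h>
#include <string.h>

/* Copy one plane row by row. Each side has its own pointer and its own
 * bytes-per-row; only `width` bytes of every row are touched. */
void copy_plane(uint8_t *dst, int dst_bpr,
                const uint8_t *src, int src_bpr,
                int width, int height)
{
    for (int y = 0; y < height; y++)
        memcpy(dst + (size_t)y * dst_bpr,
               src + (size_t)y * src_bpr,
               (size_t)width);
}
```

For example, with a 2x2 chroma plane, a non-interleaved source has bytes-per-row 2, while an interleaved destination holds U and V side by side in rows of 4 bytes. Calling `copy_plane(buf, 4, u, 2, 2, 2)` and `copy_plane(buf + 2, 4, v, 2, 2, 2)` fills both halves with the same code and no special case.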

If the decoder can do direct-rendering (VideoLAN uses direct-rendering with H.264, so it should), then it should already be making no silly assumptions, as it won't know in advance what the locations and bytes-per-row of each plane will be.

Anyway, this has nothing to do with using altivec for DCT/iDCT, deblocking, motion compensation, etc.

Hans


LiveForIt

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/7 3:51 #9

Home away from home

@Hans

Quote:

I remember discussing this with one or two people before (possibly you too)

Yes, and I want to go over it again, just in case there is something I have missed. This time I'm talking with the MPlayer developers to see what they have to say.

Quote:

so long as their code treats the Y, U & V as totally independent

Well, yes, sure, but I don't want to copy each individual line in the draw_slice() function if it's not absolutely necessary. I want to treat it as a block of memory.

Sure, using pointers and bytes per row works fine with direct rendering.

But if possible I want to transfer from bitmap format A to bitmap format A; I don't want to translate from A to B if it's not needed.

I have been looking at nv12 as an interleaved mode, but it's interleaved along the x axis, not the y axis:
"uvuvuvuv", not "uuuuvvvv".

The interleaved yuv420 format you have implemented looks like IMC4. It works fine, but it would be even better if it were also used by FFmpeg/MPlayer internally, I think.

MPlayer's "i420" is not interleaved; it's just like yv12 but with U and V swapped. MPlayer's "iyuv" format is, I think, just like i420 again.
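To keep these layouts apart, here is a small sketch that computes where a chroma sample lives in each of them. It assumes a tightly packed 4:2:0 frame (even width and height, no row padding). The enum and function names are made up for illustration, and `LAYOUT_UV_SIDE_BY_SIDE` is only my reading of the "uuuuvvvv" row arrangement described above, not a definition taken from Picasso96.

```c
#include <stddef.h>

typedef enum {
    LAYOUT_I420,            /* Y plane, then U plane, then V plane */
    LAYOUT_YV12,            /* Y plane, then V plane, then U plane */
    LAYOUT_NV12,            /* Y plane, then rows of U,V byte pairs: "uvuvuvuv" */
    LAYOUT_UV_SIDE_BY_SIDE  /* Y plane, then rows of U half + V half: "uuuuvvvv" */
} Layout;

/* Byte offset of the U sample (want_v == 0) or V sample (want_v != 0)
 * at chroma position (cx, cy) in a tightly packed w x h frame. */
size_t chroma_offset(Layout l, int w, int h, int cx, int cy, int want_v)
{
    size_t ysize = (size_t)w * (size_t)h;
    size_t cw = (size_t)w / 2, ch = (size_t)h / 2;
    size_t csize = cw * ch;
    size_t x = (size_t)cx, y = (size_t)cy;

    switch (l) {
    case LAYOUT_I420:
        return ysize + (want_v ? csize : 0) + y * cw + x;
    case LAYOUT_YV12:
        return ysize + (want_v ? 0 : csize) + y * cw + x;
    case LAYOUT_NV12:
        return ysize + y * (2 * cw) + 2 * x + (want_v ? 1 : 0);
    case LAYOUT_UV_SIDE_BY_SIDE:
        return ysize + y * (2 * cw) + (want_v ? cw : 0) + x;
    }
    return 0;
}
```

I420 and YV12 differ only in which chroma plane comes first, while NV12 and the side-by-side layout both use chroma rows twice as wide but arrange the bytes within each row differently.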

I don't understand why there are so many formats in MPlayer with different format names where the data is organized in the same way. It makes no sense to me.

Edited by LiveForIt on 2014/11/7 4:45:06


Hans

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/7 5:24 #10

Home away from home

@LiveForIt

Quote:

Well, yes, sure, but I don't want to copy each individual line in the draw_slice() function if it's not absolutely necessary. I want to treat it as a block of memory.

Sure, using pointers and bytes per row works fine with direct rendering.

You may want to ask them about using direct rendering with H.264, because the CODEC_CAP_DR1 flag is set for that codec. That would avoid the whole bitmap copying issue, which would still be there even if the source bitmap were interleaved (the bytes-per-row could still be mismatched).
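The bytes-per-row point can be sketched as follows. This is a hypothetical helper, not MPlayer's draw_slice(): a single block copy is only safe when both sides have a bytes-per-row equal to the visible width, so any mismatch forces the row-by-row path regardless of how the planes are laid out.

```c
#include <stdint.h>
#include <string.h>

/* Copy a slice of `height` rows. Returns 1 if a single block copy was
 * possible, 0 if the rows had to be copied one at a time. */
int copy_slice(uint8_t *dst, int dst_bpr,
               const uint8_t *src, int src_bpr,
               int width, int height)
{
    if (dst_bpr == width && src_bpr == width) {
        /* No padding on either side: the slice is one contiguous block. */
        memcpy(dst, src, (size_t)width * (size_t)height);
        return 1;
    }
    /* Mismatched bytes-per-row: copy only the visible bytes of each row. */
    for (int y = 0; y < height; y++)
        memcpy(dst + (size_t)y * dst_bpr,
               src + (size_t)y * src_bpr,
               (size_t)width);
    return 0;
}
```

Direct rendering sidesteps this copy altogether, because the decoder writes into the destination rows in the first place.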

Quote:

The interleaved yuv420 format you have implemented looks like IMC4. It works fine, but it would be even better if it were also used by FFmpeg/MPlayer internally, I think.

I didn't actually implement any particular format. The Radeon HD driver's rendering code could handle non-interleaved YUV bitmaps just as easily (and swapped U & V planes, etc.). The layout of the Y, U & V planes in memory is decided by Picasso96.

AFAIK, Picasso96 uses interleaved U & V planes because that's the layout that Radeon 7xxx/9xxx cards use (which is what we had at the time that YUV420p support was added).

Quote:

I don't understand why there are so many formats in MPlayer with the same names where the data is organized in the same way. It makes no sense to me.

Yes, that is confusing. It's probably not their fault, but the result of multiple companies developing codecs choosing their own formats and names.

Hans


AmigaBlitter

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/7 8:03 #11

Quite a regular

@Hans

Good initiative, Hans.

Using altivec, btw, will cut out the AmigaOne 500 and Sam owners.

For those, I would like to suggest that you check out the internal DSP of the PPC 440 and 460. This DSP has 24 instructions that can improve audio/video decoding.

Here are some interesting documents you could check:

https://www-01.ibm.com/chips/techlib/t ... PowerPC_440_Embedded_Core

this is especially interesting:
https://www-01.ibm.com/chips/techlib/t ... Optimized_dsp_440_app.pdf

Something similar exists for the 460 too.

Retired

corto

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/7 8:22 #12

Not too shy to talk

@AmigaBlitter

Quote:

Using altivec, btw, will cut out the AmigaOne 500 and Sam owners.

For those, I would like to suggest that you check out the internal DSP of the PPC 440 and 460. This DSP has 24 instructions that can improve audio/video decoding.

Here are some interesting documents you could check:

https://www-01.ibm.com/chips/techlib/t ... PowerPC_440_Embedded_Core

this is especially interesting:
https://www-01.ibm.com/chips/techlib/t ... Optimized_dsp_440_app.pdf

I did read these docs, and I also profiled ffmpeg on the 440 years ago. I tried to optimize, but the effects were not visible. The ffmpeg developers know how to program, and I think the code is already efficient.
Many other CPU features could be used, but I'm afraid the MAC instructions won't be enough.

By the way, there is already a macro in ffmpeg that uses one of these MAC instructions in some places.

Looking again at this topic would be another interesting task!

Quote:

Something similar exists for the 460 too.

Right. The CPU core is basically the same.

SinanSam460

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/7 19:32 #13

Not too shy to talk

@AmigaBlitter
Is the PPC 460 powerful enough to decode 720p?

Sinan - AmigaOS4 Beta-Tester
- AmigaOne X5000
- AmigaOne A1222
- Sam460ex

Anonymous

Re: Any altivec experts? (H.264 codec)

#14

@SinanSam460

Quote:

Is the PPC 460 powerful enough to decode 720p?

Isn't that moot now?
I thought the decoding stuff has been given to the GPU with the upcoming new gfx driver?

So it would depend on the gfx card you use... I think?

Hans

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/7 21:10 #15

Home away from home

@Raziel

Quote:

Isn't that moot now?
I thought the decoding stuff has been given to the GPU with the upcoming new gfx driver?

No. Composited video shifts the YUV => RGB conversion to the GPU, which improves performance by eliminating the conversion and reducing the RAM => VRAM copy bandwidth.

The CPU still does the decoding.
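For a sense of what moves to the GPU, here is a sketch of the per-pixel work and the per-frame upload sizes. It assumes the common BT.601 limited-range integer approximation and a 32-bit RGB target; neither detail is stated in this thread, and the function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* One pixel of YUV => RGB (BT.601, limited range, 8-bit fixed point).
 * Division is used instead of a shift so negative intermediates are
 * well defined; they clamp to 0 either way. */
void yuv_to_rgb(uint8_t y, uint8_t u, uint8_t v, uint8_t rgb[3])
{
    int c = y - 16, d = u - 128, e = v - 128;
    rgb[0] = clamp_u8((298 * c + 409 * e + 128) / 256);
    rgb[1] = clamp_u8((298 * c - 100 * d - 208 * e + 128) / 256);
    rgb[2] = clamp_u8((298 * c + 516 * d + 128) / 256);
}

/* Bytes copied from RAM to VRAM per frame for each representation. */
size_t frame_bytes_yuv420(int w, int h) { return (size_t)w * (size_t)h * 3 / 2; }
size_t frame_bytes_rgb32(int w, int h)  { return (size_t)w * (size_t)h * 4; }
```

Under these assumptions a 1280x720 frame is 1,382,400 bytes as YUV420 versus 3,686,400 bytes as 32-bit RGB, so uploading the YUV data and converting on the GPU saves both the per-pixel arithmetic and well over half of the copy.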

Hans


Spectre660

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/7 21:38 #16

Quite a regular

@SinanSam460

Try the mplayer benchmark on your Sam460:

http://download.blender.org/peach/big ... uck_bunny_720p_stereo.avi

mplayer -benchmark -nosound -ao null -vo null -lavdopts skiploopfilter=none big_buck_bunny_720p_stereo.avi

I think that the result would have to be 596s or less for realtime CPU decoding (18.20% of an AMD FX 4300 quad core 2.8GHz = 108.4732s).

Using the benchmark figures that I have for the Prometheus 1080p clip (23.778s and 399.791s), the Sam460 may take up to 1825s to complete the big_buck_bunny benchmark though.

Edited by Spectre660 on 2014/11/7 21:54:17
Edited by Spectre660 on 2014/11/7 21:56:37
Edited by Spectre660 on 2014/11/7 22:10:15
Edited by Spectre660 on 2014/11/7 22:32:57

LiveForIt

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/8 0:10 #17

Home away from home

@Hans

Quote:

I didn't actually implement any particular format. The Radeon HD driver's rendering code could handle non-interleaved YUV bitmaps just as easily (and swapped U & V planes, etc.). The layout of the Y, U & V planes in memory is decided by Picasso96.

Yes, so if I can somehow make a "fake" bitmap (user-defined bitmap), fill in the Y, U, V pointers with the slices[x] coming from draw_slice(), and set the BytesPerRow to the strides[x], I can somehow prevent some memcpy's from the codec.

But does it not need to be padded for DMA operation?

Quote:

because that's the layout that Radeon 7xxx/9xxx cards use

In that case I guess it's an industry standard.


Hans

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/8 0:20 #18

Home away from home

@LiveForIt

Quote:

Yes, so if I can somehow make a "fake" bitmap (user-defined bitmap), fill in the Y, U, V pointers with the slices[x] coming from draw_slice(), and set the BytesPerRow to the strides[x], I can somehow prevent some memcpy's from the codec.

Please do not try something like this. You should treat bitmaps as black boxes and, therefore, not go creating fake bitmaps or poking around in their internals. Those internals can be changed at any time.

Hans


tommysammy

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/8 4:38 #19

Quite a regular

@Spectre660
X1000:Playing big_buck_bunny_720p_stereo.avi.
libavformat version 55.33.100 (internal)
AVI file format detected.
[aviheader] Video stream found, -vid 0
[aviheader] Audio stream found, -aid 1
VIDEO: [MP42] 1280x720 24bpp 24.000 fps 3556.5 kbps (434.1 kbyte/s)
Clip info:
Software: MEncoder 2:1.0~rc2-0ubuntu13
Load subtitles in
==========================================================================
Opening video decoder: [ffmpeg] FFmpeg's libavcodec codec family
libavcodec version 55.52.102 (internal)
Selected video codec: [ffmp42] vfm: ffmpeg (FFmpeg MSMPEG-4 v2)
==========================================================================
Audio: no sound
Starting playback...
libmpcodecs/vd.c:mpcodecs_config_vo:d
Movie-Aspect is undefined - no prescaling applied.
VO: [null] 1280x720 => 1280x720 Planar YV12
V: 0.0 1/ 1 ??% ??% ??,?% 0 0
Select error: No such file or directory
V: 0.0 2/ 2 ??% ??% ??,?% 0 0
[VD_FFMPEG] DRI failure.
V: 596.4 14315/14315 17% 0% 0.0% 0 0

BENCHMARKs: VC: 106.197s VO: 0.045s A: 0.000s Sys: 15.797s = 122.038s
BENCHMARK%: VC: 87.0190% VO: 0.0367% A: 0.0000% Sys: 12.9442% = 100.0000%
kill the cache process - start
mpctx->stream 3c311cd8

*** free_stream() called ***
free_stream:537
kill the cache process - end

Exiting... (End of file)

Amiga600/Vampire2/PrismaMegaMix/32GB CF Card/2x Rys Mk2/A604n/IndivisionECS/Gotek

tommysammy

Re: Any altivec experts? (H.264 codec)

Posted on: 2014/11/8 5:06 #20

Quite a regular

@Spectre660
X1000:Playing Prometheus.mp4.
libavformat version 55.33.100 (internal)
libavformat file format detected.
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x4580dd10]stream 0, timescale not set
[lavf] stream 0: video (h264), -vid 0
[lavf] stream 1: audio (aac), -aid 0, -alang eng
[lavf] stream 2: video (mjpeg), -vid 1
VIDEO: [H264] 1920x816 24bpp 23.976 fps 6828.3 kbps (833.5 kbyte/s)
Clip info:
major_brand: mp42
minor_version: 0
compatible_brands: mp42isomavc1
creation_time: 2011-12-23 08:24:51
genre: Trailer
artist: 20th Fox
title: Prometheus - Trailer
encoder: HandBrake 4344svn 2011111001
date: 2012
Load subtitles in
==========================================================================
Opening video decoder: [ffmpeg] FFmpeg's libavcodec codec family
libavcodec version 55.52.102 (internal)
Selected video codec: [ffh264] vfm: ffmpeg (FFmpeg H.264)
==========================================================================
Audio: no sound
Starting playback...
libmpcodecs/vd.c:mpcodecs_config_vo:d
Movie-Aspect is 2.35:1 - prescaling to correct movie aspect.
VO: [null] 1920x816 => 1920x816 Planar YV12
V: 0.0 0/ 0 ??% ??% ??,?% 0 0
Select error: No such file or directory
V: 69.3 0/ 0 116% 0% 0.0% 0 0

BENCHMARKs: VC: 80.950s VO: 0.007s A: 0.000s Sys: 2.854s = 83.811s
BENCHMARK%: VC: 96.5861% VO: 0.0087% A: 0.0000% Sys: 3.4052% = 100.0000%
kill the cache process - start
mpctx->stream 4e204cd8

*** free_stream() called ***
free_stream:537
kill the cache process - end

Exiting... (End of file)
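
As a worked check of the two X1000 runs above, the benchmark totals can be compared with the clip lengths. This sketch assumes that the final "V:" value in each status line (596.4 and 69.3) is the video position in seconds when the run ended; the helper names are made up for illustration.

```c
/* How many times faster than real time the whole run was.
 * A value above 1.0 means the clip decoded faster than it plays. */
double realtime_factor(double clip_seconds, double benchmark_seconds)
{
    return clip_seconds / benchmark_seconds;
}

/* Average frames decoded per second of wall-clock time. */
double decoded_fps(long frames, double benchmark_seconds)
{
    return (double)frames / benchmark_seconds;
}
```

With the figures from the logs, `realtime_factor(596.4, 122.038)` is about 4.9 for the 720p MSMPEG-4 clip (roughly 117 frames per second over 14315 frames), while `realtime_factor(69.3, 83.811)` is about 0.83 for the 1920x816 H.264 trailer, i.e. slower than real time under that assumption.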

