What is the expected outcome and what would make people accept such a challenge as solved?
E.g. pick a video stream (Big Buck Bunny?) and a base hardware spec, and expect it to play with no frames dropped?
Picking criteria is going to be difficult. While I'd certainly love it if the Prometheus trailer could play without dropping frames on my A1-X1000, I estimate that this would require a 33% performance improvement (from 18 to 24 fps). Given that the H.264 code already has partial AltiVec acceleration, I don't know if that's realistic.
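To make the 18 to 24 fps estimate concrete, here is a minimal sketch of the arithmetic. The function names are made up for illustration and are not from mplayer or ffmpeg. It shows that a 33% throughput increase is the same thing as cutting the time spent per frame by 25%.

```c
#include <assert.h>

/* Time available (or spent) per frame, in milliseconds, at a given frame rate. */
double frame_time_ms(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}

/* Throughput increase needed to get from current_fps to target_fps, in percent. */
double required_speedup_percent(double current_fps, double target_fps)
{
    assert(current_fps > 0.0);
    return (target_fps / current_fps - 1.0) * 100.0;
}

/* Reduction in per-frame processing time needed, in percent. */
double required_time_cut_percent(double current_fps, double target_fps)
{
    assert(target_fps > 0.0);
    return (1.0 - current_fps / target_fps) * 100.0;
}
```

With 18 and 24 fps as inputs, the per-frame time has to drop from roughly 55.6 ms to roughly 41.7 ms, so about 14 ms has to be saved on every frame.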
If we had benchmark results for x86, ARM, & PPC giving the performance difference between SIMD and non-SIMD, then we might be able to guess how much extra performance we could get.
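Once such numbers exist, the guess could be made along these lines. This is only a rough sketch with hypothetical function names, and it assumes (optimistically) that the PPC build could reach the same SIMD-over-scalar ratio as a reference platform, which ignores differences in memory bandwidth, clock speed and how much of the decoder is vectorizable at all.

```c
#include <assert.h>

/* Speedup factor that SIMD gives over scalar code on one platform
 * (e.g. 2.0 means the SIMD build decodes twice as many frames per second). */
double simd_gain(double fps_simd, double fps_scalar)
{
    assert(fps_scalar > 0.0);
    return fps_simd / fps_scalar;
}

/* Rough projection of the PPC frame rate if its SIMD gain matched the
 * reference platform's gain. All four inputs are benchmark results that
 * still have to be measured; nothing here is real data. */
double projected_ppc_fps(double ppc_fps_scalar,
                         double ref_fps_simd, double ref_fps_scalar)
{
    return ppc_fps_scalar * simd_gain(ref_fps_simd, ref_fps_scalar);
}

/* How much of the reference platform's gain the current PPC build already
 * achieves, as a fraction (1.0 means no headroom left by this measure). */
double gain_achieved_fraction(double ppc_fps_simd, double ppc_fps_scalar,
                              double ref_fps_simd, double ref_fps_scalar)
{
    return simd_gain(ppc_fps_simd, ppc_fps_scalar)
         / simd_gain(ref_fps_simd, ref_fps_scalar);
}
```

If the projected figure for the Prometheus trailer came out below 24 fps even under this optimistic assumption, that would be a strong hint that SIMD work alone is not enough.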
I think it would be a good idea to compare mplayer's VC benchmark speed with and without AltiVec, then do the same on PC hardware with and without SSE/MMX, to find out the potential speed increase.
Thanks for the benchmarks. What we need now are similar benchmarks for x86/ARM machines using mplayer versions both with and without SIMD (and without HW decoding). I'm not even sure if such builds are available, so we might need someone who understands their build system to create custom versions.
In this context, benchmarking with video output (VO) doesn't make sense: other operating systems like Linux do not have the same video outputs as we do. This is about optimizing video codec decoding (VC), and the CPU usage of the video output affects the VC score.
(NutsAboutAmiga)
Displaying the video I get 22 fps; video and sound fullscreen gives 21 fps. That's not far off the 23.976 fps it needs, but as most videos are 25 or 30 fps it's still got quite a way to go.
Okay, do we have anyone who is able to create the custom mplayer builds to benchmark x86/ARM machines with and without SIMD enabled (and without HW acceleration)? We really need to be able to compare the AmigaOS altivec performance increase against what a fully SIMD-optimized platform achieves.
OK, over the past few days I took a close look at the ffmpeg codebase, basically the AltiVec and ARM (NEON) trees. My first impression was that the AltiVec port was seriously lacking, as there were far fewer files. However, a closer look showed that the functions were implemented, but inside the .c files rather than in separate files as in the NEON port. Still, not all of them were implemented; in particular, I could not find AltiVec code for the ff_pred16x16_vert_* type of functions (found in libavcodec/arm/h264pred_init_arm.c). So these would be the ones I would tackle first.
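For readers unfamiliar with what such a function does: assuming "vert" refers to the vertical mode of H.264 16x16 intra prediction, it fills a 16x16 block by copying the row of pixels directly above it into every row of the block. Below is a minimal scalar sketch of that operation. The name pred16x16_vertical_ref and its signature are hypothetical and this is not ffmpeg's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar reference for 16x16 vertical intra prediction.
 * 'block' points at the top-left pixel of the 16x16 block and 'stride' is
 * the distance in bytes between two picture rows. The row directly above
 * the block must already be decoded and readable. */
void pred16x16_vertical_ref(uint8_t *block, ptrdiff_t stride)
{
    assert(block != NULL);

    /* The 16 reference pixels sit one row above the block. */
    const uint8_t *above = block - stride;

    /* Every row of the block becomes a copy of that reference row. */
    for (int row = 0; row < 16; row++)
        memcpy(block + row * stride, above, 16);
}
```

Since one row is exactly 16 bytes, it fits in a single 128-bit vector register, so a SIMD version would typically load the reference row once and store it 16 times (with AltiVec, something like one vec_ld followed by 16 vec_st calls, subject to alignment requirements).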
I suggest allocating ~35 hours initially for this task alone, and then taking another look at it. Note that this might not mean an actual week, as I already have a day job. Since you asked me for a public quote: my rate is usually 30 EUR/hour, but as I promised, and since working on AltiVec is a pleasure, I'm willing to offer a discount, at 23 EUR/hour (if invoicing within the EU, VAT will be deducted, and I would have to invoice someone for that amount). So, in total, 805 EUR.
If the wasted processing time is in the decoder, I suggest that we bypass the mplayer layer and use ffmpeg only. With mplayer, we would have to keep track of the mplayer version, the operating systems and their versions, etc. It will also be easier to compare against x86 and ARM by building ffmpeg for them with and without SIMD.
Let's choose:
- an ffmpeg revision
- 3 videos (the 1080p Prometheus trailer being the first one) to check that different parts of the code are exercised
- 2 or 3 pieces of hardware
The community was able to raise a similar amount in order to buy a computer for a guy who never produced anything meaningful on it, so I guess the amount is doable.