Author |
Message |
iam
Joined: 28 Nov 2007 Posts: 3
|
Posted: Wed Nov 28, 2007 7:22 am Post subject: |
|
|
Any updates on the medialib project?
No post from unsolo since October, and not much got updated on the wiki either... I hope this project isn't dead... Really looking forward to H.264 support... eventually... |
|
|
|
unsolo
Joined: 16 Apr 2007 Posts: 155 Location: OSLO Norway
|
Posted: Wed Nov 28, 2007 9:59 am Post subject: |
|
|
It's far from dead..
I can inform you that we are currently working on the following:
EXA driver for accelerated X support
Xv driver for video acceleration in X
the mplayer-vo is under further development
And we have started looking into ways of accelerating MPEG-2, MPEG-4 and H.264 decoding using existing programs such as ffmpeg.
There is also always room for more people :)
Cheers
unsolo _________________ Don't do it alone. |
|
|
|
popper
Joined: 15 Jan 2007 Posts: 9
|
Posted: Sun Dec 02, 2007 3:30 am Post subject: |
|
|
It's good to see even more steady progress, unsolo. Perhaps
Lu has mentioned it, or you have already read this thread:
http://www.powerdeveloper.org/forums/viewtopic.php?t=1410
It might be a good way to pull some new blood into the effort if you're thinking of combining parts of both codebases/projects, whatever Markos and the teams decide to work on first...
At least I assume there's enough of interest in both projects to co-operate, or perhaps combine efforts, to make better progress and have fun at the same time... :) |
|
|
|
d-range
Joined: 26 Oct 2007 Posts: 60
|
Posted: Sun Dec 02, 2007 9:35 pm Post subject: |
|
|
unsolo wrote: | And we have started looking into ways of accelerating MPEG-2, MPEG-4 and H.264 decoding using existing programs such as ffmpeg.
There is also always room for more people :) |
I'm still looking into the video decoding stuff, but I more or less scaled down my focus from h264 to mpeg1/2, as there is lots of overlap in all of the mpeg/h26x decoding processes, and I need to have more basic video decoder experience before I can seriously think about h264.
Anyway, I'm not sure accelerating the existing ffmpeg codecs is the way to go for the PS3. The PS3 architecture is an almost perfect fit for very, very high performance video decoding, but with the way the ffmpeg codecs are set up, it is impossible to get there. These codecs are all optimized for either single-threaded x86 or symmetric dual-thread x86 execution. You cannot efficiently parallelize them for the Cell without ending up rewriting everything.
You can lift out stuff like IDCT/dequant, color conversion, motion compensation and deblocking and write spu-medialib code for it, and that will reduce their computational cost. But you will end up with a decoder that does some parts of the decoding process very, very fast while being throttled by its data dependencies, i.e. getting stuff in and out of the SPUs and combining the results for the next step.
I have a few papers about decoder setups for architectures like the Cell. In short: the ffmpeg codecs are not optimized for multicore (>2 cores) processing, and they use a functional partitioning of the decoding process (i.e. a pipeline-like setup). This is good for PC architectures, because there is no communication overhead: all decoder stages can access the same RAM. Also, typical multicore PC setups have symmetric cores, so it does not matter which task you put on which core. The PS3, however, would benefit from a mixed data-partitioning/functional-partitioning scheme, where each SPU implements its own pipeline for a subset of the full frame data. This reduces communication overhead and maximizes parallelism. The PPU can handle entropy decoding and macroblock parsing better than the SPUs, and the SPUs can do all the other stuff.
For practical purposes, hacking ffmpeg with some SPU code is a good first step, but I'm not convinced it can pull off full HD H.264 decoding at full framerates. But it might be bearable. My own 'goal' would be a decoder that is optimized for the Cell, and nothing else. I think that way it can do full HD H.264 decoding with ample room to spare. |
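To make the mixed data/functional partitioning idea a bit more concrete, here is a minimal sketch in Python (not SPU code). It assumes a 1080p stream coded as 68 macroblock rows and 6 usable SPUs; the stage names and the helpers `partition_rows` and `plan` are made up for illustration and do not come from ffmpeg or spu-medialib.

```python
# Hypothetical sketch: split a frame into contiguous bands of macroblock
# rows, one band per SPU, and let every SPU run the whole post-parsing
# pipeline on its own band (data partitioning + functional pipeline).

# Illustrative stage names only; a real decoder would map these to kernels.
STAGES = ("dequant_idct", "motion_compensation", "deblock")


def partition_rows(mb_rows, n_workers):
    """Return n_workers (start, end) bands covering mb_rows rows, end exclusive."""
    base, extra = divmod(mb_rows, n_workers)
    bands = []
    start = 0
    for worker in range(n_workers):
        # The first `extra` workers take one additional row each.
        size = base + (1 if worker < extra else 0)
        bands.append((start, start + size))
        start += size
    return bands


def plan(mb_rows, n_workers):
    """Attach the full stage list to every band."""
    return [
        {"worker": w, "rows": band, "stages": STAGES}
        for w, band in enumerate(partition_rows(mb_rows, n_workers))
    ]


if __name__ == "__main__":
    # Assumption: 1080p is coded as 1088 lines, i.e. 68 rows of 16x16 macroblocks.
    for entry in plan(68, 6):
        print(entry)
```

The bands are not truly independent: motion compensation can reference pixels outside a band, and deblocking crosses band edges, so a real scheme would still need some exchange at the borders.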
|
|
|
unsolo
Joined: 16 Apr 2007 Posts: 155 Location: OSLO Norway
|
Posted: Mon Dec 03, 2007 12:28 am Post subject: |
|
|
Provided the Cell's SPEs do both inter- and intra-frame decoding, the PPC processor is left more or less only with the task of decoding the bitstream. Hopefully that will be enough. _________________ Don't do it alone. |
|
|
|
d-range
Joined: 26 Oct 2007 Posts: 60
|
Posted: Mon Dec 03, 2007 1:28 am Post subject: |
|
|
unsolo wrote: | Provided the Cell's SPEs do both inter- and intra-frame decoding, the PPC processor is left more or less only with the task of decoding the bitstream. Hopefully that will be enough. |
If you build it efficiently, it will be. You will want to limit ppu<->ram<->spu traffic and data dependencies as much as possible. That requires careful data partitioning and scheduling, which means you will end up rewriting almost all of the ffmpeg codec. Which is not necessarily a bad thing btw, but it's too messy for me. |
|
|
|
unsolo
Joined: 16 Apr 2007 Posts: 155 Location: OSLO Norway
|
Posted: Mon Dec 03, 2007 6:32 am Post subject: |
|
|
You have 24 GB/s to go on there..
In comparison, a YUV420 frame is 3.1 MB in 1080p,
so even if you split it and over-DMA so that you transfer 4 times as much data as needed, it's still fine.. _________________ Don't do it alone. |
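The arithmetic behind those numbers can be checked with a few lines of Python. This is only a rough sketch: the 30 fps frame rate is an assumption (the post does not give one), and it counts a single transfer of each output frame multiplied by the 4x over-transfer factor, ignoring reference-frame reads.

```python
def yuv420_frame_bytes(width, height):
    """One luma byte per pixel plus two chroma planes at quarter resolution."""
    return width * height * 3 // 2


FRAME_BYTES = yuv420_frame_bytes(1920, 1080)   # about 3.1 MB
FPS = 30                                       # assumed frame rate
OVERTRANSFER = 4                               # "4 times as much data as needed"
BANDWIDTH = 24 * 10**9                         # 24 GB/s, figure quoted in the post

traffic = FRAME_BYTES * FPS * OVERTRANSFER     # bytes per second
fraction = traffic / BANDWIDTH                 # share of the quoted bandwidth

if __name__ == "__main__":
    print(f"frame:    {FRAME_BYTES / 1e6:.2f} MB")
    print(f"traffic:  {traffic / 1e6:.1f} MB/s")
    print(f"fraction: {fraction:.2%} of 24 GB/s")
```

Under these assumptions the frame traffic stays below 2% of the quoted bandwidth.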
|
|
|
d-range
Joined: 26 Oct 2007 Posts: 60
|
Posted: Mon Dec 03, 2007 8:04 pm Post subject: |
|
|
unsolo wrote: | You have 24 GB/s to go on there..
In comparison, a YUV420 frame is 3.1 MB in 1080p,
so even if you split it and over-DMA so that you transfer 4 times as much data as needed, it's still fine.. |
That's 24 GB/s of bandwidth, but bandwidth is not the problem. You still need to feed everything to the SPEs in time, otherwise you'll stall them. With the limited local memory of the SPEs and the different data dependencies for inter and intra prediction, you will need to arrange macroblocks in data partition order to satisfy all data dependencies, and implement adequate buffering between entropy decoding on the PPU and inter/intra prediction on the SPEs. There are extra complications in pel reconstruction from the IDCT output and the prediction from the reference images, because those also need to be available just in time. It's all possible, but you need more than a naive port of the ffmpeg decoder. |
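One well-known way to order macroblocks so that their dependencies are always met is a wavefront schedule. The sketch below is one possible scheme, not necessarily the "data partition order" meant in the post. It assumes each macroblock depends on its left, top-left, top and top-right neighbours (the H.264 intra-prediction neighbourhood); the function names and the 4x3 grid are hypothetical.

```python
def dependencies(x, y, mb_cols):
    """Neighbours that must be decoded before macroblock (x, y)."""
    candidates = [(x - 1, y), (x - 1, y - 1), (x, y - 1), (x + 1, y - 1)]
    return [(cx, cy) for cx, cy in candidates if 0 <= cx < mb_cols and cy >= 0]


def wavefront_order(mb_cols, mb_rows):
    """Group macroblocks into waves; blocks in one wave are independent.

    The wave index x + 2*y is strictly larger than the index of every
    dependency, so waves can be processed in ascending order.
    """
    waves = {}
    for y in range(mb_rows):
        for x in range(mb_cols):
            waves.setdefault(x + 2 * y, []).append((x, y))
    return [waves[k] for k in sorted(waves)]


if __name__ == "__main__":
    # Tiny 4x3 macroblock grid, for illustration only.
    for index, wave in enumerate(wavefront_order(4, 3)):
        print(index, wave)
```

All macroblocks inside one wave could be handed to different SPEs at the same time, with the PPU's entropy decoder only needing to stay ahead of the current wave.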
|
|
|
unsolo
Joined: 16 Apr 2007 Posts: 155 Location: OSLO Norway
|
Posted: Sat Dec 15, 2007 4:37 am Post subject: |
|
|
I wouldn't worry too much ...
I'm saying it's doable,
very very very doable..
and I'm always right :)
BTW I'm working on a FIFO for the SPEs that should/could allow for more than enough unique tasks to be transferred to the SPEs _________________ Don't do it alone. |
|
|
|
Arwin
Joined: 12 Jul 2005 Posts: 426
|
|
|
|
|