MPEG-4



Hi,

Suppose an audio-visual (AV) scene is made up of a person talking in
front of a background of mountains.  Based on the standards, my
interpretation is that when this scene is broken down into AV objects,
there will be one object representing the person (video) and a separate
object representing the audio from the person, and that each of these
objects is then encoded separately (there will also be a separate
object representing the background, etc.).  Is this interpretation
correct?

If yes, how is lip sync maintained?  That is, how are the time bases used
in the two (audio and video) encoders kept in sync?  In other words, will
the OCR in the SL-packetized streams for the audio and video parts be
the same, so that the CTS time stamps can be used for lip sync?
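
To make the question concrete, here is a rough Python sketch of what I
assume the receiver does.  It is only my guess, not taken from the
standard: it assumes both streams share one time base (so CTS values
are directly comparable), that each stream has its own time stamp
resolution in ticks per second, and it ignores time stamp wrap-around.
The names AccessUnit, cts_seconds and av_skew are made up for
illustration.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class AccessUnit:
    """One decoded unit (hypothetical model, not a structure from the spec)."""
    stream: str        # label only, e.g. "audio" or "video"
    cts_ticks: int     # composition time stamp, in ticks
    resolution: int    # assumed time stamp resolution, ticks per second


def cts_seconds(au: AccessUnit) -> Fraction:
    """Convert a CTS in ticks to seconds on the (assumed) shared time base."""
    return Fraction(au.cts_ticks, au.resolution)


def av_skew(audio: AccessUnit, video: AccessUnit) -> Fraction:
    """Audio CTS minus video CTS in seconds; zero means they compose together."""
    return cts_seconds(audio) - cts_seconds(video)


# Two units that should be presented at the same instant (2 s),
# even though the streams use different tick rates.
audio = AccessUnit("audio", cts_ticks=88200, resolution=44100)
video = AccessUnit("video", cts_ticks=180000, resolution=90000)
skew = av_skew(audio, video)
```

If the two streams did not share a time base, I do not see how a
comparison like av_skew could be meaningful, which is really what I am
asking.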

Sorry for the ton of questions.  Expert help will be greatly
appreciated.

Thanks,

Arvind

Reply arvind.phd (4) 12/6/2006 12:47:58 AM

