Emerging Wireless Multimedia Services and Technologies – Part 3

3.8 RTP Payload Types

As already mentioned, RTP is a protocol framework that allows the support of new encodings and features. Each particular RTP/RTCP-based application is accompanied by one or more documents:

• a profile specification document, which defines a set of payload type codes and their mapping to payload formats (e.g., media encodings) – a profile for audio and video data may be found in the companion RFC 3551 [27];
• payload format specification documents, which define how a particular payload, such as an audio or video encoding, is to be carried in RTP.

3.8.1 RTP Profiles for Audio and Video Conferences (RFC 3551)

RFC 3551 lists a set of audio and video encodings used within audio and video conferences with minimal or no session control. Each audio and video encoding comprises:

• a particular media data compression or representation, called the payload type, plus
• a payload format for encapsulation within RTP.

RFC 3551 reserves payload type numbers in the ranges 1–95 and 96–127 for static and dynamic assignment, respectively. The set of static payload type (PT) assignments is provided in Tables 3.7 and 3.8 (see column PT). Payload type 13 indicates the Comfort Noise (CN) payload format specified in RFC 3389. Some of the payload formats are specified in RFC 3551 itself, while others are specified in separate RFCs. RFC 3551 also assigns to each encoding a short name (see column 'Short encoding name') which may be used by higher-level control protocols, such as the Session Description Protocol (SDP), RFC 2327 [25], to identify encodings selected for a particular RTP session.

Mechanisms for defining dynamic payload type bindings have been specified in the Session Description Protocol (SDP) and in other protocols, such as ITU-T Recommendation H.323/H.245. These mechanisms associate the registered name of the encoding/payload format, along with any additional required parameters such as the RTP timestamp clock rate and the number of channels, with a payload type number. The association is effective only for the duration of the RTP session in which the dynamic payload type binding is made; the numbers can therefore be reused for different encodings in different sessions, so the number-space limitation is avoided.

3.8.1.1 Audio

RTP Clock Rate

The RTP clock rate used for generating the RTP timestamp is independent of the number of channels and the encoding; it usually equals the number of sampling periods per second. For N-channel encodings, each sampling period (say, 1/8000 of a second) generates N samples.

If multiple audio channels are used, channels are numbered left to right, starting at one. In RTP audio packets, information from lower-numbered channels precedes that from higher-numbered channels. Samples for all channels belonging to a single sampling instant must be within the same packet. The interleaving of samples from different channels depends on the encoding.

The sampling frequency is drawn from the set 8000, 11 025, 16 000, 22 050, 24 000, 32 000, 44 100 and 48 000 Hz. However, most audio encodings are defined for a more restricted set of sampling frequencies.

For packetized audio, the default packetization interval has a duration of 20 ms or one frame, whichever is longer, unless otherwise noted in Table 3.7 (column 'Default ms/packet').
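As a concrete illustration of the dynamic payload type binding and clock-rate rules described above, the following Python sketch parses an SDP 'a=rtpmap' attribute line and derives the RTP timestamp increment for the default 20 ms packetization interval. The binding of payload type 97 to AMR is purely illustrative, and real SDP handling involves further attributes (e.g. 'a=fmtp') not shown here.

```python
def parse_rtpmap(line):
    """Parse an SDP attribute of the form
    'a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]',
    which is how a dynamic payload type number is bound to an encoding
    for the duration of one RTP session."""
    pt, spec = line.split(":", 1)[1].split(" ", 1)
    fields = spec.strip().split("/")
    name = fields[0]
    clock_rate = int(fields[1])
    channels = int(fields[2]) if len(fields) > 2 else 1
    return int(pt), name, clock_rate, channels


# Hypothetical binding of dynamic payload type 97 to AMR with an 8000 Hz clock.
pt, name, clock_rate, channels = parse_rtpmap("a=rtpmap:97 AMR/8000/1")

# With the default 20 ms packetization interval, the RTP timestamp advances
# by clock_rate * 0.020 ticks per packet (160 ticks for an 8000 Hz clock).
ticks_per_packet = clock_rate * 20 // 1000
print(pt, name, clock_rate, channels, ticks_per_packet)   # 97 AMR 8000 1 160
```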
The packetization interval determines the minimum end-to-end delay; longer packets introduce less header overhead but higher delay, and make packet loss more noticeable. For non-interactive applications such as lectures, or for links with severe bandwidth constraints, a higher packetization delay may be used. A receiver should accept packets representing between 0 and 200 ms of audio data; this restriction allows reasonable buffer sizing at the receiver.

Table 3.7 Payload types (PT) and properties for audio encodings (n/a: not applicable)

PT     Short encoding   Sample     Bits/     Sampling (clock)   ms/     Default     Channels
       name             or frame   sample    rate (Hz)          frame   ms/packet
0      PCMU             Sample     8         var.                       20          1
1      Reserved
2      Reserved
3      GSM              Frame      n/a       8000               20      20          1
4      G723             Frame      n/a       8000               30      30          1
5      DVI4             Sample     4         8000                       20          1
6      DVI4             Sample     4         16 000                     20          1
7      LPC              Frame      n/a       8000               20      20          1
8      PCMA             Sample     8         8000                       20          1
9      G722             Sample     8         16 000                     20          1
10     L16              Sample     16        44 100                     20          2
11     L16              Sample     16        44 100                     20          1
12     QCELP            Frame      n/a       8000               20      20          1
13     CN                                    8000                                   1
14     MPA              Frame      n/a       90 000             var.
15     G728             Frame      n/a       8000               2.5     20          1
16     DVI4             Sample     4         11 025                     20          1
17     DVI4             Sample     4         22 050                     20          1
18     G729             Frame      n/a       8000               10      20          1
19     Reserved
20–23  Unassigned
dyn    G726-40          Sample     5         8000                       20          1
dyn    G726-32          Sample     4         8000                       20          1
dyn    G726-24          Sample     3         8000                       20          1
dyn    G726-16          Sample     2         8000                       20          1
dyn    G729D            Frame      n/a       8000               10      20          1
dyn    G729E            Frame      n/a       8000               10      20          1
dyn    GSM-EFR          Frame      n/a       8000               20      20          1
dyn    L8               Sample     8         Variable                   20          Variable
dyn    RED
dyn    VDVI             Sample     Variable  Variable                   20          1

Table 3.8 Payload types (PT) for video and combined encodings

PT       Short encoding name   Clock rate (Hz)
24       Unassigned
25       CelB                  90 000
26       JPEG                  90 000
27       Unassigned
28       nv                    90 000
29       Unassigned
30       Unassigned
31       H261                  90 000
32       MPV                   90 000
33       MP2T                  90 000
34       H263                  90 000
35–71    Unassigned
72–76    Reserved
77–95    Unassigned
96–127   Dynamic
dyn      H263-1998             90 000

Sample- and Frame-Based Encodings

In sample-based encodings, each audio sample is represented by a fixed number of bits. An RTP audio packet may contain any number of audio samples, subject to the constraint that the number of bits per sample times the number of samples per packet yields an integral octet count. The duration of an audio packet is determined by the number of samples in the packet.

For sample-based encodings producing one or more octets per sample, samples from different channels sampled at the same sampling instant are packed in consecutive octets. For example, for a two-channel encoding the octet sequence is: (left channel, first sample), (right channel, first sample), (left channel, second sample), (right channel, second sample). The packing of sample-based encodings producing less than one octet per sample is encoding-specific. The RTP timestamp reflects the instant at which the first sample in the packet was sampled, that is, the oldest information in the packet.

Frame-based encodings encode a fixed-length block of audio into another block of compressed data, typically also of fixed length. For frame-based encodings, the sender may choose to combine several such frames into a single RTP packet. The receiver can tell the number of frames contained in an RTP packet, provided that all the frames have the same length, by dividing the RTP payload length by the audio frame size that is defined as part of the encoding. For frame-based codecs, the channel order is defined for the whole block; that is, for two-channel audio, left and right samples are coded independently, with the encoded frame for the left channel preceding that for the right channel.
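The packing rules above can be illustrated with a short Python sketch, assuming the two-channel L16 encoding (payload type 10): 16-bit samples in network byte order, channels interleaved per sampling instant, oldest sample first. The frame-count helper mirrors the rule that a receiver divides the payload length by the fixed frame size; the 33-octet frame size in the usage example is only indicative.

```python
import struct

def pack_l16_stereo(left, right):
    """Build an RTP payload for two-channel L16: for each sampling instant
    the left-channel sample precedes the right-channel one, and samples are
    16-bit signed integers in network byte order."""
    assert len(left) == len(right)
    payload = bytearray()
    for l_sample, r_sample in zip(left, right):
        payload += struct.pack("!hh", l_sample, r_sample)
    return bytes(payload)

def frames_in_packet(payload_length, frame_size):
    """For a frame-based encoding with a fixed frame size, the receiver
    infers the number of frames by dividing the payload length by the
    frame size defined as part of the encoding."""
    if payload_length % frame_size != 0:
        raise ValueError("payload is not a whole number of frames")
    return payload_length // frame_size

payload = pack_l16_stereo([0, 100, -100], [1, 101, -101])
print(len(payload))                  # 12 octets: 3 sampling instants x 2 channels x 2 octets
print(frames_in_packet(99, 33))      # e.g. 3 frames of a 33-octet frame-based codec
```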
All frame-oriented audio codecs are able to encode and decode several consecutive frames within a single packet. Since the frame size for the frame-oriented codecs is given, there is no need to use a separate designation for the same encoding with a different number of frames per packet. RTP packets contain a number of frames which are inserted according to their age, so that the oldest frame (to be played first) is inserted immediately after the RTP packet header. The RTP timestamp reflects the instant at which the first sample in the first frame was sampled, that is, the oldest information in the packet.

Silence Suppression

Since the ability to suppress silence is one of the primary motivations for using packets to transmit voice, the RTP header carries both a sequence number and a timestamp to allow a receiver to distinguish between lost packets and periods of time when no data are transmitted. Discontinuous transmission (silence suppression) may be used with any audio payload format.

The audio encodings are listed below.

• DVI4: DVI4 uses an adaptive delta pulse code modulation (ADPCM) encoding scheme that was specified by the Interactive Multimedia Association (IMA) as the 'IMA ADPCM wave type'. However, the encoding defined in RFC 3551 as DVI4 differs in three respects from the IMA specification.
• G722: G722 is specified in ITU-T Recommendation G.722, '7 kHz audio-coding within 64 kbit/s'. The G.722 encoder produces a stream of octets, each of which shall be octet-aligned in an RTP packet.
• G723: G723 is specified in ITU-T Recommendation G.723.1, 'Dual-rate speech coder for multimedia communications transmitting at 5.3 and 6.3 kbit/s'. The G.723.1 5.3/6.3 kbit/s codec was defined by the ITU-T as a mandatory codec for ITU-T H.324 GSTN videophone terminal applications.
• G726-40, G726-32, G726-24 and G726-16: ITU-T Recommendation G.726 describes, among others, the algorithm recommended for conversion of a single 64 kbit/s A-law or mu-law PCM channel encoded at 8000 samples/s to and from a 40, 32, 24 or 16 kbit/s channel.
• G729: G729 is specified in ITU-T Recommendation G.729, 'Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear prediction (CS-ACELP)'.
• GSM: GSM (Groupe Spécial Mobile) denotes the European GSM 06.10 standard for full-rate speech transcoding, ETS 300 961, which is based on RPE/LTP (residual pulse excitation/long-term prediction) coding at a rate of 13 kbit/s.
• GSM-EFR: GSM-EFR denotes GSM 06.60 enhanced full-rate speech transcoding, specified in ETS 300 726.
• L8: L8 denotes linear audio data samples, using 8 bits of precision with an offset of 128, that is, the most negative signal is encoded as zero.
• L16: L16 denotes uncompressed audio data samples, using a 16-bit signed representation with 65 535 equally divided steps between the minimum and maximum signal level, ranging from −32 768 to 32 767.
• LPC: LPC designates an experimental linear predictive encoding.
• MPA: MPA denotes MPEG-1 or MPEG-2 audio encapsulated as elementary streams. The encoding is defined in ISO standards ISO/IEC 11172-3 and 13818-3. The encapsulation is specified in RFC 2250.
• PCMA and PCMU: PCMA and PCMU are specified in ITU-T Recommendation G.711. Audio data is encoded as eight bits per sample, after logarithmic scaling. PCMU denotes mu-law scaling, PCMA A-law scaling.
• QCELP: The Electronic Industries Association (EIA) and Telecommunications Industry Association (TIA) standard IS-733, 'TR45: High Rate Speech Service Option for Wideband Spread Spectrum Communications Systems', defines the QCELP audio compression algorithm for use in wireless CDMA applications.
• RED: The redundant audio payload format 'RED' is specified by RFC 2198. It defines a means by which multiple redundant copies of an audio packet may be transmitted in a single RTP stream.
• VDVI: VDVI is a variable-rate version of DVI4, yielding speech bit rates between 10 and 25 kbit/s.
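To make the 'eight bits per sample after logarithmic scaling' description of G.711 concrete, the following sketch applies the ideal continuous mu-law companding curve to 16-bit linear samples. G.711 itself specifies a piecewise-linear (segmented) approximation of this curve with a particular bit layout, so this illustrates the principle rather than a bit-exact PCMU encoder.

```python
import math

MU = 255.0   # compression parameter used by mu-law (PCMU)

def mu_law_compress(sample):
    """Map a signed 16-bit linear sample to an 8-bit value using the ideal
    mu-law curve y = sign(x) * ln(1 + mu*|x|) / ln(1 + mu).  G.711 uses a
    segmented approximation of this curve, omitted here for brevity."""
    x = max(-1.0, min(1.0, sample / 32768.0))          # normalise to [-1, 1]
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1.0) / 2.0 * 255))           # quantise to 8 bits

print(mu_law_compress(0), mu_law_compress(32767), mu_law_compress(-32768))
# Quiet samples get most of the 8-bit code space; loud ones are compressed.
```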
3.8.1.2 Video

This section describes the video encodings that are defined in RFC 3551 and gives the abbreviated names used for their identification. These video encodings and their payload types are listed in Table 3.8.

All of these video encodings use an RTP timestamp frequency of 90 000 Hz, the same as the MPEG presentation time stamp frequency. This frequency yields exact integer timestamp increments for the typical 24 (HDTV), 25 (PAL), 29.97 (NTSC) and 30 (HDTV) Hz frame rates, as well as for the 50, 59.94 and 60 Hz field rates. While 90 kHz is the recommended rate for future video encodings used within this profile, other rates may be used as well. However, it is not sufficient to use the video frame rate (typically between 15 and 30 Hz), because that does not provide adequate resolution for typical synchronization requirements when calculating the RTP timestamp corresponding to the NTP timestamp in an RTCP SR packet. The timestamp resolution must also be sufficient for the jitter estimate contained in the receiver reports.

For most of these video encodings, the RTP timestamp encodes the sampling instant of the video image contained in the RTP data packet. If a video image occupies more than one packet, the timestamp is the same on all of those packets. Packets from different video images are distinguished by their different timestamps. Most of these video encodings also specify that the marker bit of the RTP header is set to one in the last packet of a video frame and otherwise set to zero; thus, it is not necessary to wait for a following packet with a different timestamp to detect that a new frame should be displayed.
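The following sketch shows why the 90 kHz clock is convenient, computing the exact per-frame timestamp increment for the frame rates mentioned above (29.97 Hz is represented exactly as 30000/1001). The increments follow directly from the clock rate; only the little script itself is illustrative.

```python
from fractions import Fraction

CLOCK = 90_000   # RTP timestamp frequency used by the video encodings in Table 3.8

for frame_rate in (Fraction(24), Fraction(25), Fraction(30000, 1001), Fraction(30)):
    increment = Fraction(CLOCK) / frame_rate   # exact ticks per video frame
    print(f"{float(frame_rate):6.2f} fps -> {increment} ticks per frame")

# 24.00 fps -> 3750, 25.00 fps -> 3600, 29.97 fps -> 3003, 30.00 fps -> 3000
```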
The video encodings are listed below.

• CelB: The CELL-B encoding is a proprietary encoding proposed by Sun Microsystems. The byte stream format is described in RFC 2029.
• JPEG: The encoding is specified in ISO Standards 10918-1 and 10918-2. The RTP payload format is as specified in RFC 2435.
• H261: The encoding is specified in ITU-T Recommendation H.261, 'Video codec for audiovisual services at p × 64 kbit/s'. The packetization and RTP-specific properties are described in RFC 2032.
• H263: The encoding is specified in the 1996 version of ITU-T Recommendation H.263, 'Video coding for low bit rate communication'. The packetization and RTP-specific properties are described in RFC 2190.
• H263-1998: The encoding is specified in the 1998 version of ITU-T Recommendation H.263, 'Video coding for low bit rate communication'. The packetization and RTP-specific properties are described in RFC 2429.
• MPV: MPV designates the use of MPEG-1 and MPEG-2 video elementary streams as specified in ISO Standards ISO/IEC 11172 and 13818-2, respectively. The RTP payload format is as specified in RFC 2250. The MIME registration for MPV in RFC 3555 specifies a parameter that may be used with MIME or SDP to restrict the selection of the type of MPEG video.
• MP2T: MP2T designates the use of MPEG-2 transport streams, for either audio or video. The RTP payload format is described in RFC 2250.
• nv: The encoding is implemented in the program 'nv', version 4, developed at Xerox PARC.

Table 3.9 summarizes the RFCs defined for RTP profiles and payload formats.

Table 3.9 RFCs for RTP profiles and payload formats

Protocols and payload formats
RFC 1889   RTP: A transport protocol for real-time applications (obsoleted by RFC 3550)
RFC 1890   RTP profile for audio and video conferences with minimal control (obsoleted by RFC 3551)
RFC 2035   RTP payload format for JPEG-compressed video (obsoleted by RFC 2435)
RFC 2032   RTP payload format for H.261 video streams
RFC 2038   RTP payload format for MPEG1/MPEG2 video (obsoleted by RFC 2250)
RFC 2029   RTP payload format of Sun's CellB video encoding
RFC 2190   RTP payload format for H.263 video streams
RFC 2198   RTP payload for redundant audio data
RFC 2250   RTP payload format for MPEG1/MPEG2 video
RFC 2343   RTP payload format for bundled MPEG
RFC 2429   RTP payload format for the 1998 version of ITU-T Rec. H.263 video (H.263+)
RFC 2431   RTP payload format for BT.656 video encoding
RFC 2435   RTP payload format for JPEG-compressed video
RFC 2733   An RTP payload format for generic forward error correction
RFC 2736   Guidelines for writers of RTP payload format specifications
RFC 2793   RTP payload for text conversation
RFC 2833   RTP payload for DTMF digits, telephony tones and telephony signals
RFC 2862   RTP payload format for real-time pointers
RFC 3016   RTP payload format for MPEG-4 audio/visual streams
RFC 3047   RTP payload format for ITU-T Recommendation G.722.1
RFC 3119   A more loss-tolerant RTP payload format for MP3 audio
RFC 3158   RTP testing strategies
RFC 3189   RTP payload format for DV format video
RFC 3190   RTP payload format for 12-bit DAT, 20- and 24-bit linear sampled audio
RFC 3267   RTP payload format and file storage format for the Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) audio codecs
RFC 3389   RTP payload for comfort noise
RFC 3497   RTP payload format for Society of Motion Picture and Television Engineers (SMPTE) 292M video
RFC 3550   RTP: A transport protocol for real-time applications
RFC 3551   RTP profile for audio and video conferences with minimal control
RFC 3555   MIME type registration of RTP payload formats
RFC 3557   RTP payload format for European Telecommunications Standards Institute (ETSI) European Standard ES 201 108 distributed speech recognition encoding
RFC 3558   RTP payload format for Enhanced Variable Rate Codecs (EVRC) and Selectable Mode Vocoders (SMV)
RFC 3640   RTP payload format for transport of MPEG-4 elementary streams
RFC 3711   The secure real-time transport protocol
RFC 3545   Enhanced compressed RTP (CRTP) for links with high delay, packet loss and reordering
RFC 3611   RTP Control Protocol Extended Reports (RTCP XR)

Repairing losses
RFC 2354   Options for repair of streaming media

Others
RFC 3009   Registration of parity FEC MIME types
RFC 3556   Session Description Protocol (SDP) bandwidth modifiers for RTP Control Protocol (RTCP) bandwidth
RFC 2959   Real-time transport protocol management information base
RFC 2508   Compressing IP/UDP/RTP headers for low-speed serial links
RFC 2762   Sampling of the group membership in RTP
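A receiver can exploit the timestamp and marker-bit rules of Section 3.8.1.2 to regroup packets into displayable frames. The sketch below assumes the packets have already been RTP-parsed and reordered into (timestamp, marker, payload) tuples; real depacketizers must also handle loss concealment and the payload-format-specific headers.

```python
def reassemble_frames(packets):
    """Group in-order RTP packets into video frames: packets of one frame
    carry the same timestamp, and the marker bit is set on the last packet
    of the frame, so a frame can be released without waiting for a packet
    bearing a new timestamp."""
    frame = bytearray()
    frame_ts = None
    for timestamp, marker, payload in packets:
        if frame_ts is not None and timestamp != frame_ts:
            # A new timestamp without a preceding marker usually means the
            # last packet of the previous frame was lost; flush what we have.
            yield frame_ts, bytes(frame)
            frame = bytearray()
        frame_ts = timestamp
        frame.extend(payload)
        if marker:                       # last packet of this video frame
            yield frame_ts, bytes(frame)
            frame, frame_ts = bytearray(), None

packets = [(3000, 0, b"a"), (3000, 1, b"b"), (6000, 1, b"c")]
print(list(reassemble_frames(packets)))   # [(3000, b'ab'), (6000, b'c')]
```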
3.9 RTP in 3G

This section summarizes the supported media types in 3G and the RTP implementation issues for 3G, as reported in 3GPP TR 26.937 [2], TR 26.234 [33] and TR 22.233 [34].

Figure 3.10 shows the basic entities involved in a 3G Packet-Switched Streaming Service (PSS). Clients initiate the service and connect to the selected content server. Content servers, apart from prerecorded content, can generate live content, e.g. video from a concert or TV (see Table 3.10 for potential services over PSS). User profile and terminal capability data can be stored in a network server and will be accessed at the initial set-up. The user profile provides the PSS service with the user's preferences, while the terminal capabilities are used by the PSS service to decide whether or not the client is capable of receiving the streamed content. Portals are servers allowing convenient access to streamed media content; for instance, a portal might offer content browse and search facilities, or, in the simplest case, it is simply a Web/WAP page with a list of links to streaming content. The content itself is usually stored in content servers, which can be located elsewhere in the network.

[Figure 3.10 Network elements involved in a 3G packet-switched streaming service: streaming clients, content servers, content cache, portals, user and terminal profiles, the IP network, and the 3GPP core network (SGSN, GGSN) with UTRAN/GERAN radio access.]

3.9.1 Supported Media Types in 3GPP

In the 3GPP Packet-Switched Streaming Service (PSS), the communication between the client and the streaming servers, including session control and transport of media data, is IP-based. Thus, the RTP/UDP/IP and HTTP/TCP/IP protocol stacks have been adopted for the transport of continuous media and discrete media, respectively. The supported continuous media types are restricted to the following set:

• AMR narrow-band speech codec, RTP payload format according to RFC 3267 [28];
• AMR-WB (wideband) speech codec, RTP payload format according to RFC 3267 [28];
• MPEG-4 AAC audio codec, RTP payload format according to RFC 3016 [29];
• MPEG-4 video codec, RTP payload format according to RFC 3016 [29];
• H.263 video codec, RTP payload format according to RFC 2429 [30].

The usage scenarios for the above continuous media are: (1) voice-only streaming (AMR at 12.2 kbps); (2) high-quality voice/low-quality music streaming (AMR-WB at 23.85 kbps); (3) music-only streaming (AAC at 52 kbps); (4) voice and video streaming (AMR at 7.95 kbps + video at 44 kbps); (5) voice and video streaming (AMR at 4.75 kbps + video at 30 kbps).

During streaming, the packets are encapsulated using the RTP/UDP/IP protocols. The total header overhead consists of: IP header, 20 bytes for IPv4 (IPv6 would add a further 20 bytes of overhead); UDP header, 8 bytes; RTP header, 12 bytes.

Table 3.10 Potential services over PSS

Infotainment
  Video on demand, including TV
  Audio on demand, including news, music, etc.
  Multimedia travel guide
  Karaoke – song words change colour to indicate when to sing
  Multimedia information services: sports, news, stock quotes, traffic
  Weather cams – give information on other parts of the country or the world

Edutainment
  Distance learning – video stream of teacher or learning material together with teacher's voice or audio track
  'How to?' services – manufacturers show how to program the VCR at home

Corporate
  Field engineering information – junior engineer gets access to online manuals to show how to repair, say, the central heating system
  Surveillance of business premises or private property (real-time and non-real-time)

M-commerce
  Multimedia cinema ticketing application
  Online shopping – product presentations could be streamed to the user, who could then buy online
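The header figures above translate directly into bearer overhead. The sketch below computes the gross bit rate for a media stream, assuming one RTP packet per packetization interval, no RTP payload header and no IP/UDP/RTP header compression; the 12.2 kbps AMR example corresponds to usage scenario (1).

```python
IP_HEADER, UDP_HEADER, RTP_HEADER = 20, 8, 12   # bytes (IPv4, Section 3.9.1)
HEADERS = IP_HEADER + UDP_HEADER + RTP_HEADER   # 40 header bytes per packet

def gross_bitrate_kbps(media_kbps, packet_interval_ms=20):
    """Media bit rate plus RTP/UDP/IPv4 header bit rate, assuming one RTP
    packet per packetization interval and no header compression."""
    packets_per_second = 1000.0 / packet_interval_ms
    header_kbps = packets_per_second * HEADERS * 8 / 1000.0
    return media_kbps + header_kbps

# Voice-only streaming with AMR at 12.2 kbps and 20 ms packets:
# 50 packets/s * 40 header bytes = 16 kbps of headers on top of the media.
print(gross_bitrate_kbps(12.2))   # -> 28.2
```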
The supported discrete media types (which use the HTTP/TCP/IP stack) for scene description, text, bitmap graphics and still images are as follows:

• Still images: ISO/IEC JPEG [35] together with JFIF [36] decoders are supported. The support for ISO/IEC JPEG only applies to the following modes: baseline DCT, non-differential, Huffman coding; and progressive DCT, non-differential, Huffman coding.
• Bitmap graphics: GIF87a [40], GIF89a [41] and PNG [42].
• Synthetic audio: the Scalable Polyphony MIDI (SP-MIDI) content format defined in the Scalable Polyphony MIDI Specification [45] and the device requirements defined in the Scalable Polyphony MIDI Device 5-to-24 Note Profile for 3GPP [46] are supported. SP-MIDI content is delivered in the structure specified in Standard MIDI Files 1.0 [47], either in format 0 or format 1.
• Vector graphics: the SVG Tiny profile [43, 44] shall be supported; in addition, the SVG Basic profile [43, 44] may be supported.
• Text: the text decoder is intended to enable formatted text in a SMIL presentation. The UTF-8 [38] and UCS-2 [37] character coding formats are supported. A PSS client shall support text formatted according to the XHTML Mobile Profile [32, 48], and the rendering of a SMIL presentation where text is referenced with the SMIL 2.0 'text' element together with the SMIL 2.0 'src' attribute.
• Scene description: the 3GPP PSS uses a subset of SMIL 2.0 [39] as the format of the scene description. PSS clients and servers with support for scene descriptions support the 3GPP PSS SMIL Language Profile (defined in the 3GPP TS 26.234 specification [33]). This profile is a subset of the SMIL 2.0 Language Profile, but a superset of the SMIL 2.0 Basic Language Profile. It should be noted that not all streaming sessions are required to use SMIL; for some types of sessions, e.g. those consisting of a single continuous media stream or of two media streams synchronized by using RTP timestamps, SMIL may not be needed.
• Presentation description: SDP is used as the format of the presentation description for both PSS clients and servers. PSS servers shall provide, and clients interpret, the SDP syntax according to the SDP specification [25] and appendix C of [24]. The SDP delivered to the PSS client shall declare the media types to be used in the session, using a codec-specific MIME media type for each medium.

3.9.2 RTP Implementation Issues for 3G

3.9.2.1 Transport and Transmission

Media streams can be packetized using different strategies. For example, video encoded data could be encapsulated using:

• one slice of a target size per RTP packet;
• one Group of Blocks (GOB), that is, a row of macroblocks, per RTP packet;
• one frame per RTP packet.

Speech data could be encapsulated using an arbitrary (but reasonable) number of speech frames per RTP packet, using bit or byte alignment, along with options such as interleaving.
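As a minimal illustration of the speech packetization options just listed, the sketch below bundles a configurable number of equal-length speech frames into one octet-aligned RTP payload, oldest frame first and without interleaving. Real payload formats such as the AMR format of RFC 3267 additionally carry a payload header and a table of contents, which are omitted here.

```python
def bundle_speech_frames(frames, frames_per_packet=1):
    """Yield RTP payloads, each being the concatenation of up to
    `frames_per_packet` consecutive speech frames (oldest first).
    Octet-aligned, no interleaving, no payload-format header."""
    for i in range(0, len(frames), frames_per_packet):
        yield b"".join(frames[i:i + frames_per_packet])

frames = [bytes([n]) * 33 for n in range(6)]      # six dummy 33-octet frames
payloads = list(bundle_speech_frames(frames, frames_per_packet=3))
print(len(payloads), [len(p) for p in payloads])  # 2 payloads of 99 octets each
```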
The transmission of RTP packets can take place in two different ways: (1) VBRP (Variable Bit Rate Packet) transmission, in which the transmission time of a packet depends solely on the timestamp of the video frame to which the packet belongs, so the video rate variation is directly reflected on the channel; and (2) CBRP (Constant Bit Rate Packet) transmission, in which the delay between sending consecutive packets is continuously adjusted to maintain a near-constant rate.

3.9.2.2 Maximum and Minimum RTP Packet Size

RFC 3550 (RTP) [26] does not impose a maximum size for RTP packets. However, when RTP packets are sent over the radio link of a 3GPP PSS, limiting the maximum size of RTP packets can be advantageous. Two types of bearers can be envisaged for streaming, using either acknowledged mode (AM) or unacknowledged mode (UM) Radio Link Control (RLC). AM uses retransmissions over the radio link, whereas UM does not. In UM, large RTP packets are more susceptible to losses over the radio link than small RTP packets, since the loss of a single segment may result in the loss of the entire packet. In AM, on the other hand, large RTP packets result in a larger delay jitter than small packets, as it is more likely that more segments have to be retransmitted.

Fragmentation is one more reason for limiting packet sizes. It is well known that fragmentation causes:

• an increased bandwidth requirement, due to the additional header overhead;
• increased delay, because of the segmentation and re-assembly operations.

Implementers should consider avoiding/preventing fragmentation at any link of the transmission path from the streaming server to the streaming client. For the above reasons it is recommended that the maximum size of RTP packets be limited, taking the wireless link into account. This will decrease the RTP packet loss rate, particularly for RLC in UM; for RLC in AM the delay jitter will be reduced, permitting the client to use a smaller receiving buffer.

It should also be noted that too small RTP packets could result in excessive overhead (if IP/UDP/RTP header compression is not applied) or in unnecessary load at the streaming server. While there are no theoretical limits on the use of small packet sizes, implementers must be aware of the implications of using too small RTP packets. The use of such packets has three drawbacks: (1) the RTP/UDP/IP packet header overhead becomes too large compared with the media data; (2) the bandwidth requirement for the bearer allocation increases, for a given media bit rate; (3) the packet rate increases considerably, producing challenging situations for the server, the network and the mobile client.

As an example, Figure 3.11 shows a chart with the bandwidth partition between RTP payload media data and RTP/UDP/IP headers for RTP payload sizes of 14, 32, 61, 100, 200, 500, 750, 1000 and 1250 bytes. The example assumes IPv4, and the space occupied by RTP payload headers is considered to be included in the RTP payload. The smallest RTP payload sizes (14, 32 and 61 bytes) are examples related to the minimum payload sizes for AMR at 4.75 kbps and 12.20 kbps and for AMR-WB at 23.85 kbps (one speech frame per packet).

[Figure 3.11 Bandwidth share of RTP payload and RTP/UDP/IPv4 headers for different packet sizes.]
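The bandwidth split of Figure 3.11 can be reproduced with a few lines of Python; the payload sizes and the 40-byte header total come from the text, and only the print formatting is added here.

```python
HEADERS = 40   # RTP (12) + UDP (8) + IPv4 (20) header bytes per packet

for payload in (14, 32, 61, 100, 200, 500, 750, 1000, 1250):
    overhead = 100.0 * HEADERS / (HEADERS + payload)
    print(f"{payload:5d}-byte RTP payload -> {overhead:4.1f}% header overhead")

# 14 bytes -> 74.1%, 100 bytes -> 28.6%, 750 bytes -> 5.1%, 1250 bytes -> 3.1%
```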
As Figure 3.11 shows, small packet sizes (payloads of 100 bytes or less) yield an RTP/UDP/IPv4 header overhead of 29 to 74%, whereas for large packets (payloads of 750 bytes or more) the header overhead is only 3 to 5%.

When transporting video using RTP, large RTP packets may be avoided by splitting a video frame into more than one RTP packet. Then, to be able to decode the packets following a lost packet within the same video frame, it is recommended that synchronization information be inserted at the start of each such RTP packet. For H.263 this implies the use of GOBs with non-empty GOB headers and, in the case of MPEG-4 video, the use of video packets (resynchronization markers). If the optional Slice Structured mode (Annex K) of H.263 is in use, GOBs are replaced by slices.

References

[1] S. V. Raghavan and S. K. Tripathi, Networked Multimedia Systems: Concepts, Architecture and Design, Prentice Hall, 1998.
[2] 3GPP TR 26.937, Technical Specification Group Services and System Aspects; Transparent end-to-end PSS; RTP usage model (Rel. 6, 03-2004).
[3] V. Varsa and M. Karczewicz, Long window rate control for video streaming, Proceedings of the 11th International Packet Video Workshop, Kyungju, South Korea.
[4] J.-C. Bolot and A. Vega-Garcia, The case for FEC-based error control for packet audio in the Internet, ACM Multimedia Systems.
[5] IETF RFC 2354, Options for Repair of Streaming Media, C. Perkins and O. Hodson, June 1998.
[6] V. Jacobson, Congestion avoidance and control. In Proceedings of the SIGCOMM '88 Conference on Communications Architectures and Protocols, 1988.
[7] IETF RFC 2001, TCP Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery Algorithms.
[8] D. M. Chiu and R. Jain, Analysis of the increase and decrease algorithms for congestion avoidance in computer networks, Computer Networks and ISDN Systems, 17, 1989, 1–14.
[9] C. Bormann, L. Cline, G. Deisher, T. Gardos, C. Maciocco, D. Newell, J. Ott, S. Wenger and C. Zhu, RTP payload format for the 1998 version of ITU-T Recommendation H.263 video (H.263+).
[10] D. Budge, R. McKenzie, W. Mills, W. Diss and P. Long, Media-independent error correction using RTP.
[11] S. Floyd and K. Fall, Promoting the use of end-to-end congestion control in the Internet, IEEE/ACM Transactions on Networking, August 1999.
[12] M. Handley, An examination of Mbone performance, USC/ISI Research Report ISI/RR-97-450, April 1997.
[13] M. Handley and J. Crowcroft, Network text editor (NTE): A scalable shared text editor for the Mbone. In Proceedings of ACM SIGCOMM '97, Cannes, France, September 1997.
[14] V. Hardman, M. A. Sasse, M. Handley and A. Watson, Reliable audio for use over the Internet. In Proceedings of INET '95, 1995.
[15] I. Kouvelas, O. Hodson, V. Hardman and J. Crowcroft, Redundancy control in real-time Internet audio conferencing. In Proceedings of AVSPN '97, Aberdeen, Scotland, September 1997.
[16] J. Nonnenmacher, E. Biersack and D. Towsley, Parity-based loss recovery for reliable multicast transmission. In Proceedings of ACM SIGCOMM '97, Cannes, France, September 1997.
[17] IETF RFC 2198, RTP Payload for Redundant Audio Data, C. Perkins, I. Kouvelas, O. Hodson, V. Hardman, M. Handley, J.-C. Bolot, A. Vega-Garcia and S. Fosse-Parisis, September 1997.
[18] J. L. Ramsey, Realization of optimum interleavers, IEEE Transactions on Information Theory, IT-16, 338–345.
[19] J. Rosenberg and H. Schulzrinne, An A/V profile extension for generic forward error correction in RTP.
[20] M. Yajnik, J. Kurose and D. Towsley, Packet loss correlation in the Mbone multicast network. In Proceedings of the IEEE Global Internet Conference, November 1996.
[21] I. Busse, B. Deffner and H. Schulzrinne, Dynamic QoS control of multimedia applications based on RTP, May.
[22] J. Bolot and T. Turletti, Experience with rate control mechanisms for packet video in the Internet, ACM SIGCOMM Computer Communication Review, 28(1), 4–15.
[23] S. McCanne, V. Jacobson and M. Vetterli, Receiver-driven layered multicast. In Proceedings of ACM SIGCOMM '96, Stanford, CA, August 1996.
[24] IETF RFC 2326, Real Time Streaming Protocol (RTSP), H. Schulzrinne, A. Rao and R. Lanphier, April 1998.
[25] IETF RFC 2327, SDP: Session Description Protocol, M. Handley and V. Jacobson, April 1998.
[26] IETF RFC 3550, RTP: A Transport Protocol for Real-Time Applications.
[...]
