- WebRTC Video Codecs
- Non WebRTC supported Video codecs
- H.265 / HEVC
- WebRTC Audio Codecs
- G.711 (PCMA and PCMU)
- DTMF and ‘audio/telephone-event’ media type
- Stats for Audio Media track
- Stats for Datachannel
A codec defines how a media stream is compressed and decompressed. For peers to exchange media successfully, they need to agree on a common set of codecs for the session. The list of codecs is sent between the peers as part of the offer and answer, carried as SDP just as in SIP.
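The idea of agreeing on a common set can be sketched in a few lines. This is a simplified illustration, not the full offer/answer procedure: the function name and the codec lists are hypothetical, and a real negotiation also compares clock rates and format parameters.

```python
def common_codecs(offered, supported):
    """Return the offered codecs the answerer also supports, in offer order."""
    supported_set = {name.lower() for name in supported}
    return [name for name in offered if name.lower() in supported_set]

offer = ["VP8", "VP9", "H264"]         # hypothetical offer, in preference order
answerer = ["h264", "vp8"]             # hypothetical answerer capabilities
print(common_codecs(offer, answerer))  # ['VP8', 'H264']
```

If the resulting list is empty, the peers have no codec in common and no media can flow for that track.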
WebRTC provides containerless, bare MediaStreamTrack objects, and the codecs for these tracks are not mandated by the WebRTC API itself. Instead, the codecs are specified by two separate RFCs:
- RFC 7874 WebRTC Audio Codec and Processing Requirements specifies at least the Opus codec as well as G.711’s PCMA and PCMU formats.
- RFC 7742 WebRTC Video Processing and Codec Requirements specifies support for VP8 and H.264’s Constrained Baseline profile for video.
In WebRTC, video is protected using Datagram Transport Layer Security (DTLS) / Secure Real-time Transport Protocol (SRTP). In this article we are going to discuss audio/video codec processing requirements only.
WebRTC is free and open source, and its working bodies promote royalty-free codecs too. The IETF RTCWEB working group makes sure that royalty-free codecs are mandatory to implement, while other codecs can be optional in WebRTC non-browsers.
WebRTC browsers MUST implement the VP8 video codec as described in RFC 6386 and H.264 Constrained Baseline, as required by RFC 7742 (WebRTC Video Processing and Codec Requirements).
Most of the codecs below use a lossy, discrete cosine transform (DCT) based algorithm for encoding. A sample SDP offer from Chrome v80 on Linux includes these profiles:
m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 102 122 127 121 125 107 108 109 124 120 123
a=rtcp-mux
a=rtcp-rsize
a=rtpmap:96 VP8/90000
a=rtcp-fb:96 goog-remb
a=rtcp-fb:96 transport-cc
a=rtcp-fb:96 ccm fir
a=rtcp-fb:96 nack
a=rtcp-fb:96 nack pli
a=rtpmap:97 rtx/90000
a=fmtp:97 apt=96
a=rtpmap:98 VP9/90000
a=rtcp-fb:98 goog-remb
a=rtcp-fb:98 transport-cc
a=rtcp-fb:98 ccm fir
a=rtcp-fb:98 nack
a=rtcp-fb:98 nack pli
a=fmtp:98 profile-id=0
a=rtpmap:99 rtx/90000
a=fmtp:99 apt=98
a=rtpmap:100 VP9/90000
a=rtcp-fb:100 goog-remb
a=rtcp-fb:100 transport-cc
a=rtcp-fb:100 ccm fir
a=rtcp-fb:100 nack
a=rtcp-fb:100 nack pli
a=fmtp:100 profile-id=2
a=rtpmap:101 rtx/90000
a=fmtp:101 apt=100
a=rtpmap:102 H264/90000
a=rtcp-fb:102 goog-remb
a=rtcp-fb:102 transport-cc
a=rtcp-fb:102 ccm fir
a=rtcp-fb:102 nack
a=rtcp-fb:102 nack pli
a=fmtp:102 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42001f
a=rtpmap:122 rtx/90000
a=fmtp:122 apt=102
a=rtpmap:127 H264/90000
a=rtcp-fb:127 goog-remb
a=rtcp-fb:127 transport-cc
a=rtcp-fb:127 ccm fir
a=rtcp-fb:127 nack
a=rtcp-fb:127 nack pli
a=fmtp:127 level-asymmetry-allowed=1;packetization-mode=0;profile-level-id=42001f
a=rtpmap:121 rtx/90000
a=fmtp:121 apt=127
a=rtpmap:125 H264/90000
a=rtcp-fb:125 goog-remb
a=rtcp-fb:125 transport-cc
a=rtcp-fb:125 ccm fir
a=rtcp-fb:125 nack
a=rtcp-fb:125 nack pli
a=fmtp:125 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42e01f
a=rtpmap:107 rtx/90000
a=fmtp:107 apt=125
a=rtpmap:108 H264/90000
a=rtcp-fb:108 goog-remb
a=rtcp-fb:108 transport-cc
a=rtcp-fb:108 ccm fir
a=rtcp-fb:108 nack
a=rtcp-fb:108 nack pli
a=fmtp:108 level-asymmetry-allowed=1;packetization-mode=0;profile-level-id=42e01f
a=rtpmap:109 rtx/90000
a=fmtp:109 apt=108
a=rtpmap:124 red/90000
a=rtpmap:120 rtx/90000
a=fmtp:120 apt=124
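The a=rtpmap and a=fmtp lines of an offer like this one can be read mechanically. The following is a minimal sketch that assumes one attribute per line; the function name and the shortened sample are illustrative only, not part of any WebRTC library.

```python
import re

RTPMAP = re.compile(r"^a=rtpmap:(\d+) ([^/]+)/(\d+)")
APT = re.compile(r"^a=fmtp:(\d+) apt=(\d+)")

def parse_video_codecs(sdp):
    """Return (payload type -> codec name, rtx payload type -> protected payload type)."""
    codecs, rtx_for = {}, {}
    for line in sdp.splitlines():
        line = line.strip()
        m = RTPMAP.match(line)
        if m:
            codecs[int(m.group(1))] = m.group(2)
            continue
        m = APT.match(line)
        if m:
            rtx_for[int(m.group(1))] = int(m.group(2))
    return codecs, rtx_for

# Shortened sample, taken from the first lines of the offer
sample = """\
a=rtpmap:96 VP8/90000
a=rtpmap:97 rtx/90000
a=fmtp:97 apt=96
a=rtpmap:98 VP9/90000
a=fmtp:98 profile-id=0
"""
codecs, rtx_for = parse_video_codecs(sample)
print(codecs)   # {96: 'VP8', 97: 'rtx', 98: 'VP9'}
print(rtx_for)  # {97: 96}
```

Each rtx entry is a retransmission stream tied by its apt parameter to the codec payload type it protects, which is why every codec in the offer is followed by an rtx line.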
VP8 was developed by On2 and open-sourced by Google after it acquired the company.
The reference encoder library is libvpx.
- Supported containers – 3GP, Ogg, WebM
- (+) supports simulcast
- (+) Now free of royalty fees.
- (+) No limit on frame rate or data rate
Maximum resolution of 16384×16384 pixels.
VP8 encoders must limit the streams they send to conform to the values indicated by receivers in the corresponding max-fr and max-fs SDP attributes.
Encode and decode pixels with an implied 1:1 (square) aspect ratio.
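The max-fs and max-fr limits mentioned above can be checked with simple arithmetic. The sketch below assumes, as I read RFC 7741, that max-fs counts 16×16 macroblocks, that each dimension is additionally capped at int(sqrt(max-fs * 8)) macroblocks, and that max-fr is in frames per second; the function name is hypothetical.

```python
import math

def vp8_stream_allowed(width, height, fps, max_fs, max_fr):
    """Check a stream against a receiver's max-fs (macroblocks) and max-fr (frames/s)."""
    mb_wide = math.ceil(width / 16)
    mb_high = math.ceil(height / 16)
    frame_size = mb_wide * mb_high
    # Assumed per-dimension cap: int(sqrt(max-fs * 8)) macroblocks
    max_dim = math.isqrt(max_fs * 8)
    return (frame_size <= max_fs and mb_wide <= max_dim
            and mb_high <= max_dim and fps <= max_fr)

print(vp8_stream_allowed(1280, 720, 30, max_fs=3600, max_fr=30))   # True
print(vp8_stream_allowed(1920, 1080, 30, max_fs=3600, max_fr=30))  # False
```

A 1280×720 frame is 80 × 45 = 3600 macroblocks, so it exactly fits max-fs=3600, while 1920×1080 needs 120 × 68 = 8160 and must be downscaled by the sender.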
Video Processor 9 (VP9) is the successor to the older VP8 and is comparable to HEVC, as both achieve similar bit rates.
- Supported containers are MP4, Ogg, WebM
- (+) Open and free of royalties and any other licensing requirements
H.264 / AVC: the Constrained Baseline Profile (CBP) is the profile compliant with WebRTC.
- proprietary, patented codec, maintained by MPEG / ITU
Implementations must support Constrained Baseline Profile Level 1.2 and should support H.264 Constrained High Profile Level 1.3. Constrained Baseline is a subset of the Main profile, suited to low delay and low complexity, and therefore to devices with less processing power such as mobile phones.
Multiview Video Coding – can have multiple views of the same scene, such as stereoscopic video.
Other profiles, which are not supported, are Baseline (BP), Extended (XP), Main (MP), High (HiP), Progressive High (ProHiP), High 10 (Hi10P), High 4:2:2 (Hi422P) and High 4:4:4 Predictive.
- supported containers are 3GP, MP4, WebM
- Optional SDP format parameters that raise limits above the signaled level: max-mbps, max-smbps, max-fs, max-cpb, max-dpb, and max-br
- sprop-parameter-sets: H.264 allows sequence and picture information to be sent both in-band and out-of-band. WebRTC implementations must signal this information in-band.
- Supplemental Enhancement Information (SEI) “filler payload” and “full frame freeze” messages (used during video switching in MCU streams)
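The profile-level-id values seen in H.264 fmtp lines, such as 42001f and 42e01f in the Chrome offer, pack the profile, constraint flags and level into three bytes. The following is a rough decoder under simplifying assumptions: it only names a few common profiles, treats profile_idc 66 with constraint_set1 set as Constrained Baseline, and does not handle level 1b or Constrained High. The function name is hypothetical.

```python
def decode_profile_level_id(value):
    """Split a 6-hex-digit H.264 profile-level-id into a profile name and level."""
    raw = bytes.fromhex(value)
    if len(raw) != 3:
        raise ValueError("profile-level-id must be 6 hex digits")
    profile_idc, profile_iop, level_idc = raw
    names = {66: "Baseline", 77: "Main", 100: "High"}
    name = names.get(profile_idc, f"profile_idc {profile_idc}")
    # constraint_set1_flag is bit 6 of the middle byte
    if profile_idc == 66 and profile_iop & 0x40:
        name = "Constrained Baseline"
    # level_idc is ten times the level number (level 1b is not handled here)
    return name, level_idc / 10

print(decode_profile_level_id("42e01f"))  # ('Constrained Baseline', 3.1)
print(decode_profile_level_id("42001f"))  # ('Baseline', 3.1)
```

Read this way, the two profile-level-id values in the offer differ only in their constraint flags, and both advertise level 3.1.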
AV1 is an open format designed by the Alliance for Open Media. It is royalty-free and especially designed for internet video, the HTML video element and WebRTC.
- higher data compression rates than VP9 and H.265/HEVC
It offers three profiles with increasing support for color depths and chroma subsampling: Main, High, and Professional.
- supports HDR
- supports Variable Frame Rate
- Supported containers are ISOBMFF, MPEG-TS, MP4, WebM
Stats for Video based media stream track
timestamp 04/05/2020, 14:25:59
ssrc 3929649593
isRemote false
mediaType video
kind video
trackId RTCMediaStreamTrack_sender_2
transportId RTCTransport_0_1
codecId RTCCodec_1_Outbound_96
[codec] VP8 (payloadType: 96)
firCount 0
pliCount 9
nackCount 476
qpSum 912936
[qpSum/framesEncoded] 32.86666666666667
mediaSourceId RTCVideoSource_2
packetsSent 333664
[packetsSent/s] 29.021823604499957
retransmittedPacketsSent 0
bytesSent 342640589
[bytesSent/s] 3685.7715977714947
headerBytesSent 8157584
retransmittedBytesSent 0
framesEncoded 52837
[framesEncoded/s] 30.022576142586164
keyFramesEncoded 31
totalEncodeTime 438.752
[totalEncodeTime/framesEncoded_in_ms] 3.5333333333331516
totalEncodedBytesTarget 335009905
[totalEncodedBytesTarget/s] 3602.7091371103397
totalPacketSendDelay 20872.8
[totalPacketSendDelay/packetsSent_in_ms] 6.89655172416302
qualityLimitationReason bandwidth
qualityLimitationResolutionChanges 20
encoderImplementation libvpx
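Most of these counters are cumulative, so useful figures come from dividing one counter by another or from differencing two snapshots. The sketch below uses the cumulative values from the trace for lifetime averages; the two snapshots used for the rate are hypothetical, and the helper names are mine. The bracketed values in the trace appear to be computed over the most recent sampling interval, which would explain why they differ from lifetime averages.

```python
def per_unit(total, count):
    """Average of a cumulative counter per counted unit; None if nothing was counted."""
    return total / count if count else None

def rate(prev, curr, key, seconds):
    """Per-second rate of a cumulative counter between two snapshots."""
    return (curr[key] - prev[key]) / seconds

# Cumulative values copied from the video trace
stats = {"qpSum": 912936, "framesEncoded": 52837, "totalEncodeTime": 438.752}
avg_qp = per_unit(stats["qpSum"], stats["framesEncoded"])
avg_encode_ms = 1000 * per_unit(stats["totalEncodeTime"], stats["framesEncoded"])
print(round(avg_qp, 2), round(avg_encode_ms, 2))

# Two hypothetical snapshots taken one second apart
prev = {"framesEncoded": 52807}
curr = {"framesEncoded": 52837}
print(rate(prev, curr, "framesEncoded", 1.0))  # 30.0
```

Interval-based rates react quickly to congestion, while lifetime averages smooth it out, so monitoring dashboards usually plot the former.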
Video codecs that WebRTC does not support need active real-time media transcoding.
H.263 is already used for video conferencing on PSTN (Public Switched Telephone Network), RTSP, and SIP (IP-based videoconferencing) systems.
- suited for low-bandwidth networks
- (-) not compatible with WebRTC
- but many media gateways include real-time transcoding between H.263-based SIP systems and VP8-based WebRTC ones to enable video communication between them
H.265 / HEVC
Proprietary format covered by a number of patents. Licensing is managed by MPEG LA.
- Container – MP4
Interoperability between non-WebRTC-compatible and WebRTC-compatible endpoints
With the rise of the Internet of Things, many endpoints, especially IP cameras connected to Raspberry Pi-like SoCs (systems on chip), want to stream directly to the browser within their own private network or even over a public network using TURN / STUN.
The figure below shows how such a call flow is possible between an IP camera (such as a baby cam) and a parent monitoring it from a WebRTC-capable mobile phone browser. The process includes streaming the content from the IoT device as an RTSP stream and using real-time transcoding between H.264 and VP8.
WebRTC endpoints are required to implement the audio codecs Opus and PCMA / PCMU, along with Comfort Noise and DTMF events.
Trace for audio codecs supported in Chrome (Version 80.0.3987.149 (Official Build) (64-bit) on Ubuntu):
m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
Opus is a lossy audio compression format developed by the Internet Engineering Task Force (IETF), targeting a broad range of interactive real-time applications over the Internet, from speech to music. It supports multiple compression algorithms.
- Constant and variable bitrate encoding – 6 kbit/s to 510 kbit/s
- frame sizes – 2.5 ms to 60 ms
- sampling rates – 8 kHz (with 4 kHz bandwidth) to 48 kHz (with 20 kHz bandwidth, where the entire hearing range of the human auditory system can be reproduced).
- containers – Ogg, WebM, MPEG-TS, MP4
As an open format standardized through RFC 6716, a reference implementation is provided under the 3-clause BSD license. All known software patents which cover Opus are licensed under royalty-free terms.
- (+) flexible, suited for speech (via SILK) and music (via CELT)
- (+) support for mono and stereo
- (+) built-in FEC (Forward Error Correction), thus resilient to packet loss
- (+) compression adjustability for unpredictable networks
- (-) Highly CPU intensive (unsuitable for embedded devices like the Raspberry Pi)
- (-) processing and memory intensive
For all cases where the endpoint is able to process audio at a sampling rate higher than 8 kHz, RFC 7874 recommends that Opus be offered before PCMA/PCMU.
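In SDP, offering one codec before another simply means its payload type appears earlier in the m= line. Below is a minimal sketch of that reordering on a hypothetical m=audio line where 111 is assumed to be Opus (as in the Chrome trace) and 0 and 8 are the static payload types for PCMU and PCMA. In a browser, the standard way to achieve this is RTCRtpTransceiver.setCodecPreferences rather than editing SDP text.

```python
def prefer_payload_types(m_line, preferred):
    """Move the given payload types to the front of an SDP m= line."""
    parts = m_line.split()
    header, payloads = parts[:3], parts[3:]
    front = [pt for pt in preferred if pt in payloads]
    rest = [pt for pt in payloads if pt not in front]
    return " ".join(header + front + rest)

m_audio = "m=audio 9 UDP/TLS/RTP/SAVPF 0 8 111 13"  # hypothetical offer
print(prefer_payload_types(m_audio, ["111"]))
# m=audio 9 UDP/TLS/RTP/SAVPF 111 0 8 13
```

Payload types that are not present in the line are ignored, so the same preference list can be applied to offers from different browsers.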
AAC (Advanced Audio Coding)
Part of the MPEG-4 standard. It uses lossy compression and has a number of profiles suiting each use case, from high-quality surround sound to low-fidelity audio for speech-only use.
- supported containers – MP4, ADTS, 3GP
G.711 (PCMA and PCMU)
G.711 is an ITU standard (1972) for audio compression. It is primarily used in telephony.
The ITU published it as Pulse Code Modulation (PCM) with either µ-law or A-law encoding.
It is vital for interfacing with the standard telecom network and carriers. G.711 PCM (A-law) is known as PCMA and G.711 PCM (µ-law) is known as PCMU.
It is the required standard in many voice-based systems and technologies, for example in the H.320 and H.323 specifications.
- Fixed 64 kbit/s bit rate
- supports 3GP container formats
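The fixed 64 kbit/s rate follows from 8000 samples per second at 8 bits per sample. The sketch below shows that arithmetic together with the continuous µ-law companding curve. Note the assumptions: real G.711 uses a piecewise-linear approximation of this curve with 8-bit code words, so this is not a bit-exact encoder, and the 20 ms packet duration is only a common choice.

```python
import math

MU = 255  # µ-law parameter

def mu_law_compress(x):
    """Continuous µ-law curve for a sample x in [-1.0, 1.0]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

bit_rate = 8000 * 8                     # samples/s * bits/sample
payload_bytes_20ms = 8000 * 20 // 1000  # one byte per sample, 20 ms per packet
print(bit_rate, payload_bytes_20ms)     # 64000 160
print(round(mu_law_compress(0.01), 3))  # quiet samples get a large share of the range
```

Companding spends more of the 8-bit range on quiet samples, which is why speech stays intelligible at only 8 bits per sample.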
G.722 is an ITU standard (1988) encoded using Adaptive Differential Pulse Code Modulation (ADPCM), which is suited for voice compression.
- 7 kHz wideband audio codec
- Bit rates of 48, 56 and 64 kbit/s
- containers used 3GP, AMR-WB
G.722 offers improved speech quality due to a wider speech bandwidth of 50–7000 Hz, compared to 300–3400 Hz for G.711.
Comfort noise (CN)
Artificial background noise used to fill gaps in a transmission instead of pure silence. It prevents jarring silence and RTP timeouts.
CN should be used for streams encoded with G.711 or any other supported codec that does not provide its own CN. Use of Discontinuous Transmission (DTX) / CN by senders is optional.
Internet Low Bitrate Codec (iLBC)
An open-source narrowband speech codec for VoIP and streaming audio.
- 8 kHz sampling frequency with a bitrate of 15.2 kbps for 20ms frames and 13.33 kbps for 30ms frames.
- Defined by IETF RFCs 3951 and 3952.
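The two iLBC bit rates follow directly from the encoded frame sizes defined in RFC 3951: 38 bytes for a 20 ms frame and 50 bytes for a 30 ms frame. A small check, with a hypothetical helper name:

```python
def bitrate_kbps(frame_bytes, frame_ms):
    """Bit rate in kbit/s for fixed-size frames of the given duration."""
    return frame_bytes * 8 / frame_ms

print(bitrate_kbps(38, 20))            # 15.2
print(round(bitrate_kbps(50, 30), 2))  # 13.33
```

The longer frame is more efficient per second of audio but adds 10 ms of packetization delay.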
Internet Speech Audio Codec (iSAC)
iSAC: A wideband and super wideband audio codec for VoIP and streaming audio. It is designed for voice transmissions which are encapsulated within an RTP stream.
- 16 kHz or 32 kHz sampling frequency
- adaptive and variable bit rate of 12 to 52 kbps.
Speex is a patent-free audio compression format designed for speech, and also a free software speech codec used in VoIP applications and podcasts. It may be considered obsolete, with Opus as its official successor.
AMR-WB (Adaptive Multi-Rate Wideband) is a patented wideband speech coding standard that provides improved speech quality. This codec is generally available on mobile phones.
- wider speech bandwidth of 50–7000 Hz.
- data rate is between 6-12 kbit/s, and the
DTMF and ‘audio/telephone-event’ media type
Endpoints may send DTMF events at any time and should suppress in-band dual-tone multi-frequency (DTMF) tones, if any.
DTMF events list
| Event ID | Event name |
| 0 | DTMF digit "0" |
| 1 | DTMF digit "1" |
| 2 | DTMF digit "2" |
| 3 | DTMF digit "3" |
| 4 | DTMF digit "4" |
| 5 | DTMF digit "5" |
| 6 | DTMF digit "6" |
| 7 | DTMF digit "7" |
| 8 | DTMF digit "8" |
| 9 | DTMF digit "9" |
| 10 | DTMF digit "*" |
| 11 | DTMF digit "#" |
| 12 | DTMF digit "A" |
| 13 | DTMF digit "B" |
| 14 | DTMF digit "C" |
| 15 | DTMF digit "D" |
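The event codes in this list are what travels on the wire inside a telephone-event RTP payload. The sketch below packs one such 4-byte payload following the RFC 4733 layout (event, end bit, reserved bit, 6-bit volume, 16-bit duration). The function name and the default volume are my own choices, and RTP headers and retransmission of the final packet are left out.

```python
import struct

# Event code for each DTMF digit, matching the list above
DTMF_EVENTS = {ch: code for code, ch in enumerate("0123456789*#ABCD")}

def telephone_event_payload(digit, duration, volume=10, end=False):
    """Pack one RFC 4733 telephone-event payload (4 bytes)."""
    event = DTMF_EVENTS[digit.upper()]
    if not 0 <= volume <= 63 or not 0 <= duration <= 0xFFFF:
        raise ValueError("volume is 6 bits, duration is 16 bits")
    flags = (0x80 if end else 0) | volume  # E bit, R bit left at 0, 6-bit volume
    return struct.pack("!BBH", event, flags, duration)

print(DTMF_EVENTS["#"])                                   # 11
print(telephone_event_payload("5", 800, end=True).hex())  # 058a0320
```

The duration is expressed in RTP timestamp units, so 800 corresponds to 100 ms at the 8000 Hz clock rate commonly used for telephone-event.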
Stats for Audio Media track
Stats for Audio Media include
timestamp 04/05/2020, 14:25:59
ssrc 3005719707
isRemote false
mediaType audio
kind audio
trackId RTCMediaStreamTrack_sender_1
transportId RTCTransport_0_1
codecId RTCCodec_0_Outbound_111
[codec] opus (payloadType: 111)
mediaSourceId RTCAudioSource_1
packetsSent 88277
[packetsSent/s] 50.03762690431027
retransmittedPacketsSent 0
bytesSent 1977974
[bytesSent/s] 150.11288071293083
headerBytesSent 2118648
retransmittedBytesSent 0
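A few per-packet figures can be derived from these counters. The sketch below uses the values from the audio trace and assumes, as in the WebRTC statistics definitions, that bytesSent counts payload bytes only while headerBytesSent counts header bytes; the helper name is hypothetical.

```python
packets_sent = 88277
bytes_sent = 1977974         # payload bytes (assumed to exclude headers)
header_bytes_sent = 2118648  # header bytes
packets_per_second = 50.03762690431027

def per_packet(total_bytes, packets):
    """Average bytes per packet; 0.0 when no packets were sent."""
    return total_bytes / packets if packets else 0.0

avg_payload = per_packet(bytes_sent, packets_sent)
avg_header = per_packet(header_bytes_sent, packets_sent)
packet_interval_ms = 1000 / packets_per_second

print(round(avg_payload, 2), avg_header, round(packet_interval_ms, 2))
```

Roughly 50 packets per second corresponds to 20 ms Opus frames, and in this trace the headers account for slightly more bytes than the audio payload itself.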
m=application 9 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
Stats for Datachannel
Statistics RTCDataChannel_1
timestamp 04/05/2020, 14:25:59
label sctp
protocol
datachannelid 1
state open
messagesSent 1
[messagesSent/s] 0
bytesSent 228
[bytesSent/s] 0
messagesReceived 1
[messagesReceived/s] 0
bytesReceived 228
[bytesReceived/s] 0
- RFC 7874 WebRTC Audio Codec and Processing Requirements https://tools.ietf.org/html/rfc7874
- RFC 6386 VP8 Data Format and Decoding Guide
- RFC 6236 Negotiation of Generic Image Attributes in the Session Description Protocol (SDP)
- RFC 7742 WebRTC Video Processing and Codec Requirements https://tools.ietf.org/html/rfc7742
- RFC 6716 Definition of the Opus Audio Codec https://tools.ietf.org/html/rfc6716
Quick links: if you are new to WebRTC, read the Introduction to WebRTC at https://telecom.altanai.com/2013/08/02/what-is-webrtc/
Layers of WebRTC is at https://telecom.altanai.com/2013/07/31/webrtc/