AEC (Acoustic Echo Cancellation) and AGC (Automatic Gain Control) in WebRTC


Echo is the sound of your own voice reverberating. If its amplitude is high and the delay exceeds 25 ms, it becomes disruptive to the conversation. Echo can be acoustic or hybrid. Echo cancellers need to eliminate the echo while still preserving call quality and without disrupting tones such as DTMF.

Acoustic Echo 

Acoustic echo is undesired voiceband energy, usually background or reflected sound, that transfers from the speaker to the microphone and into the communication network. It is mostly found in hands-free sets and speakerphones. In a multiparty call it can also arise from mismatched volume levels, challenging network conditions at one party, background noise, double talk, or even the proximity between the user and the microphone.

Hybrid / Electronic Echo in PSTN phones

In the public telephone system, local-loop wiring uses two-wire connections carrying bidirectional voice signals. In a PBX, a two-to-four-wire conversion is done by a hybrid circuit; since the hybrid never achieves a perfect impedance match, part of the signal is reflected back, producing hybrid echo.


Echo Cancellation

An efficient echo canceller should cancel the entire echo tail without introducing packet loss. It needs to adapt to changing IP network bandwidth, and the algorithm should work equally well in conference scenarios where there may be more than one echo source. Benchmarking measures such as MOS (Mean Opinion Score) are used to gauge the results. Voice-quality enhancement technologies are often integrated into AEC modules, such as:

  • Automatic Gain Control (AGC)
  • Noise Reduction
  • Comfort Noise Generator (CNG)
  • Non-linear processor
  • Tone disabler for SS7 and DTMF tones
Automatic Echo Cancellation

WebRTC Echo Cancellation

WebRTC actively detects and removes echo, especially echo from the local system's own playback.

Noise Suppression in WebRTC

Noise suppression automatically filters the audio to remove background noise.

Automatic Gain Control (AGC)

AGC works like a circuit: when the average audio level is low, it raises the gain, and when the audio level is high, it brings it down.

  • (+) AGC frees the user from manually tuning the audio level.
  • (-) During a pause, AGC still tries to bring the audio level to the standard setting, making background noise louder.
  • (-) Subsequent audio processing makes gain control progressively worse.

Audio Compressor: because of the drawbacks of AGC, audio compressors carry out the operation more sophisticatedly by looking at the amplitude of the sound.

(-) not ideal for music, which has varying sound amplitude.
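
Browsers expose such a compressor through the Web Audio API's DynamicsCompressorNode. A minimal sketch, with illustrative (not recommended) parameter values:

// Sketch: route a microphone stream through a compressor (Web Audio API)
async function compressMic() {
    const ctx = new AudioContext();
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    const source = ctx.createMediaStreamSource(stream);
    const compressor = ctx.createDynamicsCompressor();
    compressor.threshold.value = -30; // dB level where compression starts
    compressor.ratio.value = 4;       // 4:1 compression above the threshold
    compressor.attack.value = 0.003;  // seconds to react to a level rise
    compressor.release.value = 0.25;  // seconds to recover afterwards
    const dest = ctx.createMediaStreamDestination();
    source.connect(compressor).connect(dest);
    return dest.stream; // can be fed to an RTCPeerConnection sender
}

A peak limiter (next) is essentially the same node configured with a very high ratio (for example 20:1) and a fast attack.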

Audio Peak Limiter: limiters simply keep the audio from exceeding a set maximum level.

(+) well suited to keeping loud noises, such as a door slam, from entering the processing pipeline.

Audio Expanders: increase the dynamic (loudness) range of audio that has been overly processed.

(+) suited for over-compressed audio transmissions such as satellite relays.

Audio Filters: attenuate audio frequencies either above or below certain points within the audio range.

AGC in WebRTC

navigator.mediaDevices.getSupportedConstraints();

aspectRatio: true
autoGainControl: true
brightness: true
channelCount: true
colorTemperature: true
contrast: true
deviceId: true
echoCancellation: true
exposureCompensation: true
exposureMode: true
exposureTime: true
facingMode: true
focusDistance: true
focusMode: true
frameRate: true
groupId: true
height: true
iso: true
latency: true
noiseSuppression: true
pan: true
pointsOfInterest: true
resizeMode: true
sampleRate: true
sampleSize: true
saturation: true
sharpness: true
tilt: true
torch: true
whiteBalanceMode: true
width: true
zoom: true

getUserMedia can then be invoked with various values of autoGainControl; a minimal sketch (the constraint set is illustrative):
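
// Sketch: request audio with AGC explicitly enabled or disabled,
// then inspect what the browser actually applied
async function captureWithAGC(enableAGC) {
    const stream = await navigator.mediaDevices.getUserMedia({
        audio: {
            autoGainControl: enableAGC,
            echoCancellation: true,
            noiseSuppression: true
        },
        video: false
    });
    console.log(stream.getAudioTracks()[0].getSettings());
    return stream;
}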


WebRTC Audio/Video Codecs


Codecs define how a media stream is compressed and decompressed. For peers to exchange media successfully, they need to agree on a common set of codecs for the session. The codec lists are exchanged as part of the offer and answer (the SDP, as in SIP).

WebRTC provides containerless, bare MediaStreamTrack objects, and the codecs for these tracks are not mandated by the WebRTC API itself. Instead, the codecs are specified by two separate RFCs:

  1. RFC 7874, WebRTC Audio Codec and Processing Requirements, specifies at least the Opus codec as well as G.711's PCMA and PCMU formats.
  2. RFC 7742, WebRTC Video Processing and Codec Requirements, specifies support for VP8 and H.264's Constrained Baseline profile for video.
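
In the browser, the codec set that will be offered can be inspected before negotiation; a minimal sketch:

// Sketch: list the video codecs this browser can send
const caps = RTCRtpSender.getCapabilities('video');
caps.codecs.forEach(c => console.log(c.mimeType, c.sdpFmtpLine || ''));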

In WebRTC, media is protected using Datagram Transport Layer Security (DTLS) and the Secure Real-time Transport Protocol (SRTP). In this article we discuss the audio/video codec processing requirements only.

WebRTC is free and open source, and its working bodies promote royalty-free codecs. The RTCWEB working group at the IETF ensures that non-royalty-bearing codecs are mandatory, while other codecs can remain optional in WebRTC non-browsers.

WebRTC browsers MUST implement the VP8 video codec as described in RFC 6386 and H.264 Constrained Baseline as described in RFC 7742.

WebRTC Video Codec and Processing Requirements
Media Flow in WebRTC Call

WebRTC Video Codecs

Most of the codecs below follow a lossy DCT (discrete cosine transform) based algorithm for encoding. A sample SDP offer from Chrome v80 on Linux includes these profiles:

m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 102 122 127 121 125 107 108 109 124 120 123
a=rtcp-mux
a=rtcp-rsize

a=rtpmap:96 VP8/90000
a=rtcp-fb:96 goog-remb
a=rtcp-fb:96 transport-cc
a=rtcp-fb:96 ccm fir
a=rtcp-fb:96 nack
a=rtcp-fb:96 nack pli
a=rtpmap:97 rtx/90000
a=fmtp:97 apt=96

a=rtpmap:98 VP9/90000
a=rtcp-fb:98 goog-remb
a=rtcp-fb:98 transport-cc
a=rtcp-fb:98 ccm fir
a=rtcp-fb:98 nack
a=rtcp-fb:98 nack pli
a=fmtp:98 profile-id=0
a=rtpmap:99 rtx/90000
a=fmtp:99 apt=98

a=rtpmap:100 VP9/90000
a=rtcp-fb:100 goog-remb
a=rtcp-fb:100 transport-cc
a=rtcp-fb:100 ccm fir
a=rtcp-fb:100 nack
a=rtcp-fb:100 nack pli
a=fmtp:100 profile-id=2
a=rtpmap:101 rtx/90000
a=fmtp:101 apt=100

a=rtpmap:102 H264/90000
a=rtcp-fb:102 goog-remb
a=rtcp-fb:102 transport-cc
a=rtcp-fb:102 ccm fir
a=rtcp-fb:102 nack
a=rtcp-fb:102 nack pli
a=fmtp:102 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42001f
a=rtpmap:122 rtx/90000
a=fmtp:122 apt=102

a=rtpmap:127 H264/90000
a=rtcp-fb:127 goog-remb
a=rtcp-fb:127 transport-cc
a=rtcp-fb:127 ccm fir
a=rtcp-fb:127 nack
a=rtcp-fb:127 nack pli
a=fmtp:127 level-asymmetry-allowed=1;packetization-mode=0;profile-level-id=42001f
a=rtpmap:121 rtx/90000
a=fmtp:121 apt=127

a=rtpmap:125 H264/90000
a=rtcp-fb:125 goog-remb
a=rtcp-fb:125 transport-cc
a=rtcp-fb:125 ccm fir
a=rtcp-fb:125 nack
a=rtcp-fb:125 nack pli
a=fmtp:125 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42e01f
a=rtpmap:107 rtx/90000
a=fmtp:107 apt=125

a=rtpmap:108 H264/90000
a=rtcp-fb:108 goog-remb
a=rtcp-fb:108 transport-cc
a=rtcp-fb:108 ccm fir
a=rtcp-fb:108 nack
a=rtcp-fb:108 nack pli
a=fmtp:108 level-asymmetry-allowed=1;packetization-mode=0;profile-level-id=42e01f
a=rtpmap:109 rtx/90000
a=fmtp:109 apt=108
a=rtpmap:124 red/90000
a=rtpmap:120 rtx/90000
a=fmtp:120 apt=124

VP8

Developed by On2 and later acquired and open-sourced by Google. The reference encoder is the libvpx library.

  • Supported containers – 3GP, Ogg, WebM
  • (+) supports simulcast
  • (+) now free of royalty fees
  • (+) no limit on frame rate or data rate

Maximum resolution of 16384×16384 pixels.

VP8 encoders must limit the streams they send to conform to the values indicated by receivers in the corresponding max-fr and max-fs SDP attributes.

Encoders and decoders assume an implied 1:1 (square) pixel aspect ratio.

VP9

VP9 is the successor to the older VP8 and is comparable to HEVC, as both achieve similar bit rates.

  • Supported containers – MP4, Ogg, WebM
  • (+) open and free of royalties and any other licensing requirements

H264/AVC constrained

AVC's Constrained Baseline (CBP) profile is the one compliant with WebRTC.

  • proprietary, patented codec, maintained by MPEG / ITU

WebRTC uses Constrained Baseline Profile Level 1.2, with H.264 Constrained High Profile Level 1.3 also supported. Constrained Baseline is a subset of the Main profile, suited to low-delay, low-complexity decoding on lower-powered devices such as mobile phones.

Multiview Video Coding – can carry multiple views of the same scene, such as stereoscopic video.

Other profiles, which are not supported, are Baseline (BP), Extended (XP), Main (MP), High (HiP), Progressive High (ProHiP), High 10 (Hi10P), High 4:2:2 (Hi422P) and High 4:4:4 Predictive.

  • supported containers are 3GP, MP4, WebM

Parameter settings:

  • packetization-mode
  • max-mbps, max-smbps, max-fs, max-cpb, max-dpb, and max-br
  • sprop-parameter-sets: H.264 allows sequence and picture information to be sent both in-band and out-of-band. WebRTC implementations must signal this information in-band.
  • Supplemental Enhancement Information (SEI) "filler payload" and "full frame freeze" messages (used while switching video in MCU streams)
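
If an application wants one codec (say H.264) chosen ahead of the others, it can reorder the negotiated list on a transceiver with setCodecPreferences; a minimal sketch:

// Sketch: prefer H264 on a video transceiver before creating the offer
const pc = new RTCPeerConnection();
const transceiver = pc.addTransceiver('video');
const { codecs } = RTCRtpReceiver.getCapabilities('video');
const h264 = codecs.filter(c => c.mimeType === 'video/H264');
const others = codecs.filter(c => c.mimeType !== 'video/H264');
transceiver.setCodecPreferences([...h264, ...others]);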

AV1 (AOMedia Video 1)

An open format designed by the Alliance for Open Media. It is royalty free and designed especially for internet video (the HTML video element and WebRTC).

  • higher data compression rates than VP9 and H.265/HEVC

It offers 3 profiles in increasing support for color depth and chroma subsampling:
1. main,
2. high, and
3. professional

  • supports HDR
  • supports Variable Frame Rate
  • supported containers are ISOBMFF, MPEG-TS, MP4, WebM

Stats for a video-based media stream track

timestamp 04/05/2020, 14:25:59
ssrc 3929649593
isRemote false
mediaType video
kind video
trackId RTCMediaStreamTrack_sender_2
transportId RTCTransport_0_1
codecId RTCCodec_1_Outbound_96
[codec] VP8 (payloadType: 96)
firCount 0
pliCount 9
nackCount 476
qpSum 912936
[qpSum/framesEncoded] 32.86666666666667
mediaSourceId RTCVideoSource_2
packetsSent 333664
[packetsSent/s] 29.021823604499957
retransmittedPacketsSent 0
bytesSent 342640589
[bytesSent/s] 3685.7715977714947
headerBytesSent 8157584
retransmittedBytesSent 0
framesEncoded 52837
[framesEncoded/s] 30.022576142586164
keyFramesEncoded 31
totalEncodeTime 438.752
[totalEncodeTime/framesEncoded_in_ms] 3.5333333333331516
totalEncodedBytesTarget 335009905
[totalEncodedBytesTarget/s] 3602.7091371103397
totalPacketSendDelay 20872.8
[totalPacketSendDelay/packetsSent_in_ms] 6.89655172416302
qualityLimitationReason bandwidth
qualityLimitationResolutionChanges 20
encoderImplementation libvpx
Graph for Video Track in chrome://webrtc-internals
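
The same counters can be read programmatically with getStats(); a minimal sketch (pc is an existing RTCPeerConnection):

// Sketch: log outbound video RTP statistics once
async function logOutboundVideo(pc) {
    const stats = await pc.getStats();
    stats.forEach(report => {
        if (report.type === 'outbound-rtp' && report.kind === 'video') {
            console.log(report.packetsSent, report.bytesSent,
                        report.framesEncoded, report.qualityLimitationReason);
        }
    });
}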

Non WebRTC supported Video codecs

These need active realtime media transcoding.

H.263

H.263 is already used for video conferencing on PSTN (Public Switched Telephone Network), RTSP, and SIP (IP-based videoconferencing) systems.

  • suited for low-bandwidth networks
  • (-) not compatible with WebRTC
    • but many media gateways include realtime transcoding between H.263-based SIP systems and VP8-based WebRTC ones to enable video communication between them

H.265 / HEVC

A proprietary format covered by a number of patents. Licensing is managed by MPEG LA.

  • Container – MP4

Interoperability between non-WebRTC-compatible and WebRTC-compatible endpoints

With the rise of the Internet of Things, many endpoints, especially IP cameras connected to Raspberry Pi-like SoCs (systems on chip), want to stream directly to the browser within their own private network, or even on a public network using TURN / STUN.

The figure below shows how such a call flow is possible between an IP camera (such as a baby cam) and a parent monitoring it on a WebRTC-supported mobile phone browser. The process involves streaming the content from the IoT device as an RTSP stream and transcoding between H264 and VP8 in real time.

Interoperability between non-WebRTC-compatible and WebRTC-compatible endpoints

WebRTC Audio Codecs


WebRTC endpoints should implement the audio codecs Opus and PCMA / PCMU, along with Comfort Noise and DTMF events.

Trace of the audio codecs supported in Chrome (Version 80.0.3987.149, official 64-bit build, on Ubuntu):

m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126

a=rtcp-mux
a=rtpmap:111 opus/48000/2
a=rtcp-fb:111 transport-cc
a=fmtp:111 minptime=10;useinbandfec=1
a=rtpmap:103 ISAC/16000
a=rtpmap:104 ISAC/32000
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:106 CN/32000
a=rtpmap:105 CN/16000
a=rtpmap:13 CN/8000
a=rtpmap:110 telephone-event/48000
a=rtpmap:112 telephone-event/32000
a=rtpmap:113 telephone-event/16000
a=rtpmap:126 telephone-event/8000

Opus

Opus is a lossy audio compression format developed by the Internet Engineering Task Force (IETF), targeting a broad range of interactive real-time applications over the Internet, from speech to music, and supporting multiple compression algorithms.

  • Constant and variable bitrate encoding – 6 kbit/s to 510 kbit/s
  • frame sizes – 2.5 ms to 60 ms
  • sampling rates – 8 kHz (with 4 kHz bandwidth) to 48 kHz (with 20 kHz bandwidth, where the entire hearing range of the human auditory system can be reproduced).
  • containers – Ogg, WebM, MPEG-TS, MP4

As an open format standardized through RFC 6716, a reference implementation is provided under the 3-clause BSD license. All known software patents which cover Opus are licensed under royalty-free terms.

  • (+) flexible, suited for speech (via SILK) and music (via CELT)
  • (+) support for mono and stereo
  • (+) in-built FEC (Forward Error Correction), making it resilient to packet loss
  • (+) compression adjustability for unpredictable networks
  • (-) highly CPU intensive (unsuitable for embedded devices like the RPi)
  • (-) processing and memory intensive

For all cases where the endpoint can process audio at a sampling rate higher than 8 kHz, it is recommended that Opus be offered before PCMA/PCMU.

AAC (Advanced Audio Coding)

Part of the MPEG-4 standard. Lossy compression, but with a number of profiles suiting each use case, from high-quality surround sound to low-fidelity audio for speech-only use.

  • supported containers – MP4, ADTS, 3GP

G.711 (PCMA and PCMU)

G.711 is an ITU standard (1972) for audio compression. It is primarily used in telephony.

The ITU published Pulse Code Modulation (PCM) with either µ-law or A-law encoding; it is vital for interfacing with the standard telecom network and carriers. G.711 PCM with A-law encoding is known as PCMA, and G.711 PCM with µ-law encoding as PCMU.

It is the required standard in many voice-based systems and technologies, for example in H.320 and H.323 specifications.

  • fixed 64 kbit/s bit rate
  • supports 3GP container formats

G.722

An ITU standard (1988), encoded using Adaptive Differential Pulse Code Modulation (ADPCM), which is suited to voice compression.

  • 7 kHz wideband audio codec
  • Bitrate 48, 56 and 64 kbit/s.
  • containers used 3GP, AMR-WB

G.722 improves speech quality thanks to a wider speech bandwidth of up to 50–7000 Hz, compared to 300–3400 Hz for G.711.

Comfort noise (CN)

Artificial background noise used to fill gaps in a transmission instead of pure silence. It prevents jarring silences and RTP timeouts.

It should be used for streams encoded with G.711 or any other supported codec that does not provide its own CN. Use of Discontinuous Transmission (DTX) / CN by senders is optional.

Internet Low Bitrate Codec (iLBC)

An open-source narrowband speech codec for VoIP and streaming audio.

  • 8 kHz sampling frequency with a bitrate of 15.2 kbps for 20ms frames and 13.33 kbps for 30ms frames.
  • Defined by IETF RFCs 3951 and 3952.

Internet Speech Audio Codec (iSAC)

A wideband and super-wideband audio codec for VoIP and streaming audio, designed for voice transmissions encapsulated within an RTP stream.

  • 16 kHz or 32 kHz sampling frequency
  • adaptive and variable bit rate of 12 to 52 kbps.

Speex

A patent-free audio compression format designed for speech, and also a free-software speech codec used in VoIP applications and podcasts. It may be considered obsolete, with Opus as its official successor.

AMR-WB (Adaptive Multi-Rate Wideband) is a patented wideband speech coding standard that provides improved speech quality. The codec is generally available on mobile phones.

  • wider speech bandwidth of 50–7000 Hz
  • data rate between 6 and 12 kbit/s

DTMF and ‘audio/telephone-event’ media type

Endpoints may send DTMF events at any time and should suppress any in-band dual-tone multi-frequency (DTMF) tones.

DTMF events list

| 0 | DTMF digit "0"
| 1 | DTMF digit "1"
| 2 | DTMF digit "2"
| 3 | DTMF digit "3"
| 4 | DTMF digit "4"
| 5 | DTMF digit "5"
| 6 | DTMF digit "6"
| 7 | DTMF digit "7"
| 8 | DTMF digit "8"
| 9 | DTMF digit "9"
| 10 | DTMF digit "*"
| 11 | DTMF digit "#"
| 12 | DTMF digit "A"
| 13 | DTMF digit "B"
| 14 | DTMF digit "C"
| 15 | DTMF digit "D"
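
In the browser these events are generated with RTCDTMFSender rather than as in-band tones; a minimal sketch (pc is an existing RTCPeerConnection with an audio sender):

// Sketch: send DTMF digits on the first audio sender
const sender = pc.getSenders().find(s => s.track && s.track.kind === 'audio');
if (sender && sender.dtmf) {
    sender.dtmf.insertDTMF('1234#', 100, 70); // 100 ms tones, 70 ms gaps
    sender.dtmf.ontonechange = e => console.log('played tone:', e.tone);
}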

Stats for Audio Media track

Stats for Audio Media include

  • headerBytesSent
  • packetsSent
  • bytesSent
timestamp 04/05/2020, 14:25:59
ssrc 3005719707
isRemote false
mediaType audio
kind audio
trackId RTCMediaStreamTrack_sender_1
transportId RTCTransport_0_1
codecId RTCCodec_0_Outbound_111
[codec] opus (payloadType: 111)
mediaSourceId RTCAudioSource_1
packetsSent 88277
[packetsSent/s] 50.03762690431027
retransmittedPacketsSent 0
bytesSent 1977974
[bytesSent/s] 150.11288071293083
headerBytesSent 2118648
retransmittedBytesSent 0
Graphs in chrome://webrtc-internals for Audio

DataChannel

m=application 9 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
b=AS:30
a=ice-ufrag:blj+
a=ice-pwd:Ytdofc24WZYWRAnyNSNhuF4F
a=ice-options:trickle
a=fingerprint:sha-256 18:2F:B9:13:A1:BA:33:0C:D0:59:DB:83:9A:EA:38:0B:D7:DC:EC:50:20:6E:89:54:CC:E8:70:10:80:2B:8C:EE
a=setup:active
a=mid:2
a=sctp-port:5000
a=max-message-size:262144
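
A minimal sketch of how such an m=application section comes about (the label and message text are illustrative):

// Sketch: create an SCTP data channel and exchange one message
const pc = new RTCPeerConnection();
const dc = pc.createDataChannel('sctp');
dc.onopen = () => dc.send('hello over SCTP');
dc.onmessage = e => console.log('received:', e.data);
// the generated offer carries the m=application ... webrtc-datachannel line
pc.createOffer().then(offer => pc.setLocalDescription(offer));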

Stats for Datachannel

Statistics RTCDataChannel_1
timestamp 04/05/2020, 14:25:59
label sctp
protocol
datachannelid 1
state open
messagesSent 1
[messagesSent/s] 0
bytesSent 228
[bytesSent/s] 0
messagesReceived 1
[messagesReceived/s] 0
bytesReceived 228
[bytesReceived/s] 0

References:

Quick links: if you are new to WebRTC, read Introduction to WebRTC at https://telecom.altanai.com/2013/08/02/what-is-webrtc/

Layers of WebRTC at https://telecom.altanai.com/2013/07/31/webrtc/

Unified Plan SDP format in WebRTC

Until recently, a customised or proprietary extension could signal multiple media streams within one m-section of an SDP, experimenting with the media-level "msid" (Media Stream Identifier) attribute, which associates RTP streams described in different media descriptions with the same MediaStream. With the transition to Unified Plan, such applications will experience breaking changes.

The previous SDP format implementation, called "Plan B", was transitioned to "Unified Plan" in 2019.

Whom does it affect?

  • applications that use multiple media tracks within one m-line in the SDP, such as a video stream and screen sharing simultaneously
  • applications that munge SDP, or use MCUs or SFUs
  • applications that use the track-based APIs addTrack, removeTrack, and sender.replaceTrack, or the legacy addStream / removeStream, and the exposed senders and receivers to edit tracks and their encoding parameters

Whom does it not affect?

  • any application that has only a single audio and a single video track.

The semantics in use are visible in the RTCPeerConnection configuration, e.g.:

{ iceServers: [], iceTransportPolicy: all, bundlePolicy: balanced, rtcpMuxPolicy: require, iceCandidatePoolSize: 0, sdpSemantics: "unified-plan" }

Plan B vs Unified Plan

Multiple media streams may be required for cases such as a video stream and a screen-share stream in the same SDP, or in specific SFU scenarios.

In Plan B this results in one "m=" line of SDP being used for video and one for audio, while within the video m= section multiple "a=ssrc" lines are listed for the multiple media tracks.

In Unified Plan, every media track is assigned a separate "m=" section; hence for simultaneous video and screen sharing, two video m-sections are created, as sketched below.
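
A minimal sketch of that case, assuming a browser that defaults to Unified Plan:

// Sketch: camera + screen share as two tracks -> two m=video sections
async function shareCameraAndScreen() {
    const pc = new RTCPeerConnection();
    const cam = await navigator.mediaDevices.getUserMedia({ video: true });
    const scr = await navigator.mediaDevices.getDisplayMedia({ video: true });
    pc.addTrack(cam.getVideoTracks()[0], cam);
    pc.addTrack(scr.getVideoTracks()[0], scr);
    const offer = await pc.createOffer();
    console.log(offer.sdp); // contains two separate m=video sections
    return pc;
}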

Interoperability between unified plan and plan B

A mismatch in SDP semantics (between Plan B and Unified Plan) usually results in one of the following:

  1. A Unified Plan client receives an offer generated by a Plan B client – the Unified Plan client must reject the offer with a failed setRemoteDescription() error.
  2. A Plan B client receives an offer generated by a Unified Plan client – only the first track in every "m=" section is used and the other tracks are ignored.


continue : Streaming / broadcasting Live Video call to non webrtc supported browsers and media players

This blog post continues the attempts, outcomes and problems in building a WebRTC-to-RTP media framework that successfully streams / broadcasts WebRTC content to non-WebRTC-supported browsers (Safari / IE) and media players (VLC).


Attempt 4: Stream the content to a WebRTC endpoint which is hidden in a video call. Pick the stream from the VP8 object URL and send it to a streaming server.

This process involved the following components :

  • WebRTC API: simplewebrtc on Chrome
  • Transfer mechanism from client to streaming server: WebRTC media channel

Problems: no streaming server was able to handle direct WebRTC input and stream it onto the network.


Attempt 4.1: Stream the content to a WebRTC endpoint. Bridge the WebRTC endpoint to an RTP endpoint using the Kurento APIs.

Use the RTP port and IP address as input to an ffmpeg, GStreamer or VLC terminal command and output a live H264 stream on another IP and port.

This process involved the following components :

  • API: Kurento
  • Transfer mechanism: HTML5 WebRTC client -> application server hosting Java -> media server -> application for WebRTC-media-to-RTP-media conversion -> RTP player

Screenshots of attempts with Wowza to stream RTP from a IP and port


Problems: the stream was black, which means 100% loss.

Lesson learned: RTP is not suitable for over-the-internet transmission, especially through firewalls.


Attempt 4.2: Build a WebRTC-endpoint-to-HTTP-endpoint bridge in Kurento and force the video and audio encoding to be H264 and PCMU.

Code snippet for adding constraints to the output media via the pipeline, forcing the choice of codecs (H264 for video and PCMU for audio):

MediaPipeline pipeline = kurento.createMediaPipeline();
WebRtcEndpoint webRtcEndpoint = new WebRtcEndpoint.Builder(pipeline).build();
HttpGetEndpoint httpEndpoint=new HttpGetEndpoint.Builder(pipeline).build();

org.kurento.client.Fraction fr= new org.kurento.client.Fraction(1, 30);
VideoCaps vc= new VideoCaps(VideoCodec.H264,fr);
httpEndpoint.setVideoFormat(vc);

AudioCaps ac= new AudioCaps(AudioCodec.PCMU, 65536);
httpEndpoint.setAudioFormat(ac);

webRtcEndpoint.connect(httpEndpoint);

Alternatively, one can opt to use a GStreamer filter to force the output into raw format.

// basic media operation of 1 pipeline and 2 endpoints
MediaPipeline pipeline = kurento.createMediaPipeline();
WebRtcEndpoint webRtcEndpoint = new WebRtcEndpoint.Builder(pipeline).build();
RtpEndpoint rtpEndpoint = new RtpEndpoint.Builder(pipeline).build();

// adding GStreamer filters
GStreamerFilter filter1 = new GStreamerFilter.Builder(pipeline, "videorate max-rate=30").withFilterType(FilterType.VIDEO).build();
GStreamerFilter filter2 = new GStreamerFilter.Builder(pipeline, "capsfilter caps=video/x-h264,width=1280,height=720,framerate=30/1").withFilterType(FilterType.VIDEO).build();
GStreamerFilter filter3 = new GStreamerFilter.Builder(pipeline, "capsfilter caps=audio/x-mpeg,layer=3,rate=48000").withFilterType(FilterType.AUDIO).build();

// connecting all points to one another
webRtcEndpoint.connect (filter1);
filter1.connect (filter2);
filter2.connect (filter3);
filter3.connect (rtpEndpoint);

// RTP SDP offer and answer
String requestRTPsdp = rtpEndpoint.generateOffer();
rtpEndpoint.processAnswer(requestRTPsdp);

End result: the output is still WebM based and doesn't work on H264 clients.


Attempt 5: Use an RTP SDP endpoint (i.e. an SDP file valid for a given session) and use it to play the WebRTC media over the Wowza streaming server

This process involved the following components

  1. WebRTC stream and object URL of the blob containing VP8 media
  2. Kurento WebRTC endpoint bridge to generate the SDP
  3. Wowza streaming server

Snippet used in Kurento to generate an SDP file from the WebRTC-to-RTP bridge:

@RequestMapping(value = "/rtpsdp", method = RequestMethod.POST)
private String processRequestrtpsdp(@RequestBody String sdpOffer)
throws IOException, URISyntaxException, InterruptedException {

// basic media operation of 1 pipeline and 2 endpoints
MediaPipeline pipeline = kurento.createMediaPipeline();
WebRtcEndpoint webRtcEndpoint = new WebRtcEndpoint.Builder(pipeline).build();
RtpEndpoint rtpEndpoint = new RtpEndpoint.Builder(pipeline).build();

// connecting all points to one another
webRtcEndpoint.connect (rtpEndpoint);

// RTP SDP offer and answer
String requestRTPsdp = rtpEndpoint.generateOffer();
rtpEndpoint.processAnswer(requestRTPsdp);

// write the SDP connector to an external file
PrintWriter out = new PrintWriter("/tmp/test.sdp");
out.println(requestRTPsdp);
out.close();

HttpGetEndpoint httpEndpoint = new HttpGetEndpoint.Builder(pipeline).build();
PlayerEndpoint player = new PlayerEndpoint.Builder(pipeline, requestRTPsdp).build();
httpEndpoint.connect(rtpEndpoint);
player.connect(httpEndpoint);

// Playing media and opening the default desktop browser
player.play();
String videoUrl = httpEndpoint.getUrl();
System.out.println(" ------- video URL -------------"+ videoUrl);

// send the response to front client
String responseSdp = webRtcEndpoint.processOffer(sdpOffer);

return responseSdp;
}

End result: Wowza does not recognize the WebRTC SDP and does not play the video.

screenshot of wowza with SDP input


Attempt 5.1: Use an RTP SDP endpoint (i.e. an SDP file valid for a given session) and use it to play the WebRTC media in the default Ubuntu media player

The SDP file formed contains content such as:

v=0
o=- 3631611195 3631611195 IN IP4 192.168.0.119
s=Kurento Media Server
c=IN IP4 192.168.0.119
t=0 0
m=audio 42802 RTP/AVP 98 99 0
a=rtpmap:98 OPUS/48000/2
a=rtpmap:99 AMR/8000/1
a=rtpmap:0 PCMU/8000
a=ssrc:2713728673 cname:user59375791@host-ad1117df
m=video 35946 RTP/AVP 96 97 100 101
a=rtpmap:96 H263-1998/90000
a=rtpmap:97 VP8/90000
a=rtpmap:100 MP4V-ES/90000
a=rtpmap:101 H264/90000
a=ssrc:93449274 cname:user59375791@host-ad1117df

End result: the player does not properly recognize the WebRTC SDP; the media plays deformed.

screenshot of playing from a SDP file


Attempt 5.2: Use an RTP SDP endpoint (i.e. an SDP file valid for a given session) and use it to play the WebRTC media in VLC using socket input

End result: nothing plays.

screenshot of VLC connected to play from socket and failure to play anything


Attempt 5.3: Create a WebRTC endpoint and connect it to an RTP endpoint via media pipelines. Also make the RTP SDP offer and answer the same. Play with ffmpeg / ffplay / gst playbin.

String requestRTPsdp = rtpEndpoint.generateOffer();
rtpEndpoint.processAnswer(requestRTPsdp);

Write the requestRTPsdp to a file to obtain an RTP connector endpoint with Application/SDP. It plays okay with gst playbin (10 secs, without audio). A successful attempt to play from gst playbin:

gst-launch -vvv playbin uri=file:///tmp/test.sdp 

but it refuses to be played by VLC, ffplay and even Wowza. Errors were generated with:

ffmpeg -i test.sdp -vcodec copy -acodec copy -f mpegts output-file.ts

or

ffmpeg -re -i test.sdp -vcodec h264 -acodec mp3 -f mpegts "udp://192.168.4.26:5000"

End result: this results in "Could not find codec parameters for stream 1 (video: h263, none)". Other error types are "Could not write header for output file" and "output file is empty, nothing was encoded".

Error screenshots of trying to play the RTP SDP file with ffmpeg


Attempt 6: Use a WebRTC-capable media and streaming server (e.g. Kurento) to pick up a live stream of VP8.

Convert the VP8 to H264 (ffmpeg / RTP endpoint).

Convert the H264 to MP4 using an MP4 parser and pass it to a streaming server (Wowza).

End result: yes, it did work on Mozilla, but with considerable lag.


Update: thankfully, updates to the WebRTC standards mandated support for PCMU and the AVC/H264 CB profile in the browser's media stack, removing the need to build a transcoder between WebRTC and non-WebRTC endpoints from scratch.

  • Video codecs: RFC 7742 specifies that all WebRTC-compatible browsers must support VP8 and H.264's Constrained Baseline profile for video.
  • Audio codecs: RFC 7874 specifies that browsers must support at least the Opus codec as well as G.711's PCMA and PCMU formats.

The latest WebRTC specification lists a set of codecs which all compliant browsers are required to support; this includes Chrome 52+, Firefox, Safari and Edge.

References :

  1. RFC 7742: WebRTC Video Processing and Codec Requirements
  2. RFC 7874: WebRTC Audio Codec and Processing Requirements

Read more about WebRTC audio/video codecs

Streaming / broadcasting Live Video call to non webrtc supported browsers and media players

As the title of this article suggests, I am going to pen down my attempts at streaming / broadcasting a live WebRTC video call to non-WebRTC-supported browsers and media players such as VLC, ffplay, the default video player in Linux, etc.

Some of the high-level architectures for streaming WebRTC video to multiple endpoints can be viewed in the post below.

Aim: I will be attempting to create a lightweight WebRTC-to-raw/H264 transcoder by making my own media engine which takes input from a WebRTC peerconnection or getUserMedia. I am sharing my past experiments in the hope of helping someone whose objective may be the same, since many non-WebRTC-supported endpoints (RPi, kiosks, mobile browsers) could benefit heavily from WebRTC streaming. Even if your objective is not the same as mine, you may gain some insight into what not to do when making a media transcoder.


Attempt 1: use a one-to-many broadcasting API in JS

<table class="visible">
<tr>
<td style="text-align: right;">
<input type="text" id="conference-name" placeholder="Broadcast Name"> </td>
<td> <select id="broadcasting-option"> <option>Audio + Video</option> <option>Only Audio</option> <option>Screen</option> </select> </td>
<td> <button id="start-conferencing">Start Broadcasting</button> </td> </tr>
</table>
<table id="rooms-list" class="visible"></table>
<div id="participants"></div>
<script src="RTCPeerConnection-v1.5.js"></script>
<script src="firebase.js"></script>
<script src="broadcast.js"></script>
<script src="broadcast-ui.js"></script>

It uses the API from webrtc-experiment.com. The broadcast is in one direction only; the viewers are never asked for their mic / webcam permission.

Problem: the broadcast works for WebRTC browsers only and doesn't support non-WebRTC players / browsers.

Attempt 1.1: Stream the media directly to nodejs through a websocket

window.addEventListener('DOMContentLoaded', function () {

    var v = document.getElementById('v');
    navigator.getUserMedia = (navigator.getUserMedia ||
        navigator.webkitGetUserMedia ||
        navigator.mozGetUserMedia ||
        navigator.msGetUserMedia);

    if (navigator.getUserMedia) {
// Request access to video only
        navigator.getUserMedia(
            {
                video: true,
                audio: false
            },
            function (stream) {
                var url = window.URL || window.webkitURL;
                v.src = url ? url.createObjectURL(stream) : stream;
                v.play();

                var ws = new WebSocket('ws://localhost:3000', 'echo-protocol');
                waitForSocketConnection(ws, function () {

                    console.log(" url.createObjectURL(stream)-----", url.createObjectURL(stream))
                    ws.send(stream);

                    console.log("message sent!!!");
                });

            },
            function (error) {
                alert('Something went wrong. (error code ' + error.code + ')');
                return;
            }
        );
    } else {
        alert('Sorry, the browser you are using doesn\'t support getUserMedia');
        return;
    }
});

//Make the function wait until the connection is made...
function waitForSocketConnection(socket, callback) {
    setTimeout(
        function () {
            if (socket.readyState === 1) {
                console.log("Connection is made")
                if (callback != null) {
                    callback();
                }
                return;

            } else {
                console.log("wait for connection...")
                waitForSocketConnection(socket, callback);
            }

        }, 5); // wait 5 milisecond for the connection...
}

Problem: the video arrives as a buffer and does not play.

Attempt 2: Record the WebRTC media (5 secs each) into chunks in webm format -> transfer them to the other end -> append the chunks together like a regular file

This process involved the following components :

  • Recorder JavaScript library: RecordRTC.js
  • Transfer mechanism: record using RecordRTC.js -> send to the media server at the other end -> stitch the small webm files into a big one at runtime and play
  • Programs:

Code for video recorder

navigator.getUserMedia(videoConstraints, function (stream) {

    video.onloadedmetadata = function () {
        video.width = 320;
        video.height = 240;

        var options = {
            type: isRecordVideo ? 'video' : 'gif',
            video: video,
            canvas: {
                width: canvasWidth_input.value,
                height: canvasHeight_input.value
            }
        };

        recorder = window.RecordRTC(stream, options);
        recorder.startRecording();
    };
    video.src = URL.createObjectURL(stream);
}, function () {
    if (document.getElementById('record-screen').checked) {
        if (location.protocol === 'http:')
            alert('https is mandatory to capture screen.');
        else
            alert('Multi-capturing of screen is not allowed.Have you enabled flag: "Enable screen capture support in getUserMedia"?');
    } else
        alert('Webcam access is denied.');
});

Code for the video appender

var FILE1 = '1.webm';
var FILE2 = '2.webm';
var FILE3 = '3.webm';
var FILE4 = '4.webm';
var FILE5 = '5.webm';

var NUM_CHUNKS = 5;
var video = document.querySelector('video');

window.MediaSource = window.MediaSource || window.WebKitMediaSource;
if (!!!window.MediaSource) {
    alert('MediaSource API is not available');
}

var mediaSource = new MediaSource();
video.src = window.URL.createObjectURL(mediaSource);

function callback(e) {
    var sourceBuffer = mediaSource.addSourceBuffer('video/webm; codecs="vorbis,vp8"');
    GET(FILE1, function (uInt8Array) {
        var file = new Blob([uInt8Array], {type: 'video/webm'});
        var i = 1;
        (function readChunk_(i) {
            var reader = new FileReader();
            reader.onload = function (e) {
                sourceBuffer.appendBuffer(new Uint8Array(e.target.result));
                if (i == NUM_CHUNKS) mediaSource.endOfStream();
                else {
                    if (video.paused) {
                        video.play(); // Start playing after 1st chunk is appended.
                    }
                    readChunk_(++i);
                }
            };
            reader.readAsArrayBuffer(file);
        })(i); // Start the recursive call by self calling.
    });
}

mediaSource.addEventListener('sourceopen', callback, false);
mediaSource.addEventListener('webkitsourceopen', callback, false);
mediaSource.addEventListener('webkitsourceended', function (e) {
    logger.log('mediaSource readyState: ' + this.readyState);
}, false);

// function get the video via XHR
function GET(url, callback) {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', url, true);
    xhr.responseType = 'arraybuffer';
    xhr.send();
    xhr.onload = function (e) {
        if (xhr.status != 200) {
            alert("Unexpected status code " + xhr.status + " for " + url);
            return false;
        }
        callback(new Uint8Array(xhr.response));
    };
}

Shortcomings of this approach

  1. The webm files failed to play in most media players.
  2. The recorder can only record either a video or an audio file at a time.

Attempt 2 (continued): Chunking and media proxy

Since the previous approach failed to work on WebRTC endpoints, the next iteration was to channel the WebRTC media via a nodejs server, thus disrupting the peer-to-peer media stream in favour of a centralized / proxied media stream. This would enable me to obtain raw media packets from the stream using low-level C-based VP8 decoder libraries and then re-encode them to H264 or other media formats suitable for the endpoints.

In theory, the media could be re-encoded using the openH264 library and the frames could then be sent to the players:

// note: hypothetical sketch only - addSourceBuffer() takes just a MIME
// string, and a pluggable VP9Decoder does not exist in the MSE API
let mediaSource = new MediaSource();
let sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs=vp9',
    new VP9Decoder());
let buffer = await loadBuffer();
sourceBuffer.appendBuffer(buffer);

Extending further for uncompressed video:

// note: hypothetical sketch only - 'video/raw' is not a registered MSE type
let mediaSource = new MediaSource();
let sourceBuffer = mediaSource.addSourceBuffer('video/raw; codecs=yuv420p');
for (let p of demuxPackets()) {
    let frame = await codec.decode(p);
    sourceBuffer.appendBuffer(frame);
}

At least, that was the plan.

Attempt 2.1: Record the WebRTC media (5 secs each) into chunks in webm format (RecordRTC.js) -> use the Kurento JS client (kws-media-api.js) to make an HTTP endpoint for the recorded webm files -> append the chunks together like a regular file at runtime

// UI elements
function getByID(id) {
    return document.getElementById(id);
}

var recordAudio = getByID('record-audio'),
    recordVideo = getByID('record-video'),
    stopRecordingAudio = getByID('stop-recording-audio'),
    stopRecordingVideo = getByID('stop-recording-video'),
    broadcasting = getByID('broadcasting');

var canvasWidth_input = getByID('canvas-width-input'),
    canvasHeight_input = getByID('canvas-height-input');

var video = getByID('video');
var audio = getByID('audio');

// Audio video constraints
var videoConstraints = {
    audio: false,
    video: {
        mandatory: {},
        optional: []
    }
};

var audioConstraints = {
    audio: true,
    video: false
};

// Recording and stop recording - to be converted into real-time capture and chunking
const ws_uri = 'ws://localhost:8888/kurento';
var URL_SMALL = "http://localhost:8080/streamtomp4/approach1/5561840332.webm";

var audioStream;
var recorder;

recordAudio.onclick = function () {
    if (!audioStream)
        navigator.getUserMedia(audioConstraints, function (stream) {
            if (window.IsChrome) stream = new window.MediaStream(stream.getAudioTracks());
            audioStream = stream;
            audio.src = URL.createObjectURL(audioStream);
            audio.muted = true;
            audio.play();
            // "audio" is a default type
            recorder = window.RecordRTC(stream, {
                type: 'audio'
            });
            recorder.startRecording();
        }, function () {
        });
    else {
        audio.src = URL.createObjectURL(audioStream);
        audio.muted = true;
        audio.play();
        if (recorder) recorder.startRecording();
    }
    window.isAudio = true;
    this.disabled = true;
    stopRecordingAudio.disabled = false;
};

Recording and stopping video recording into small media files (chunks):

recordVideo.onclick = function () {
    recordVideoOrGIF(true);
};
stopRecordingAudio.onclick = function () {
    this.disabled = true;
    recordAudio.disabled = false;
    audio.src = '';

    if (recorder)
        recorder.stopRecording(function (url) {
            audio.src = url;
            audio.muted = false;
            audio.play();

            document.getElementById('audio-url-preview').innerHTML = '<a href="' + url + '" target="_blank">Recorded Audio URL</a>';
        });
};
function recordVideoOrGIF(isRecordVideo) {
    navigator.getUserMedia(videoConstraints, function (stream) {

        video.onloadedmetadata = function () {
            video.width = 320;
            video.height = 240;

            var options = {
                type: isRecordVideo ? 'video' : 'gif',
                video: video,
                canvas: {
                    width: canvasWidth_input.value,
                    height: canvasHeight_input.value
                }
            };

            recorder = window.RecordRTC(stream, options);
            recorder.startRecording();
        };
        video.src = URL.createObjectURL(stream);
    }, function () {
        if (document.getElementById('record-screen').checked) {
            if (location.protocol === 'http:')
                alert('https is mandatory to capture screen.');
            else
                alert('Multi-capturing of screen is not allowed. Capturing process is denied. Are you enabled flag: "Enable screen capture support in getUserMedia"?');
        } else
            alert('Webcam access is denied.');
    });

    window.isAudio = false;

    if (isRecordVideo) {
        recordVideo.disabled = true;
        stopRecordingVideo.disabled = false;
    } else {
        recordGIF.disabled = true;
        stopRecordingGIF.disabled = false;
    }
}

stopRecordingVideo.onclick = function () {
    this.disabled = true;
    recordVideo.disabled = false;

    if (recorder)
        recorder.stopRecording(function (url) {
            video.src = url;
            video.play();
            document.getElementById('video-url-preview').innerHTML = '<a href="' + url + '" target="_blank">Recorded Video URL</a>';

        });
};

Broadcasting the chunks to media engine

function onerror(error) {
    console.log(" error occured");
    console.error(error);
}

broadcast.onclick = function () {
var videoOutput = document.getElementById("videoOutput");
KwsMedia(ws_uri, function (error, kwsMedia) {
    if (error) return onerror(error);
    // Create pipeline
    kwsMedia.create('MediaPipeline', function (error, pipeline) {
        if (error) return onerror(error);
        // Create pipeline media elements (endpoints & filters)
        pipeline.create('PlayerEndpoint', {uri: URL_SMALL}, function (error, player) {
                if (error) return console.error(error);

                pipeline.create('HttpGetEndpoint', function (error, httpGet) {
                    if (error) return onerror(error);
                    // Connect media element between them
                    player.connect(httpGet, function (error, pipeline) {
                        if (error) return onerror(error);
                        // Set the video on the video tag
                        httpGet.getUrl(function (error, url) {
                            if (error) return onerror(error);
                            videoOutput.src = url;
                            console.log(url);
                            // Start player
                            player.play(function (error) {
                                if (error) return onerror(error);
                                console.log('player.play');
                            });
                        });
                    });

                    // Subscribe to HttpGetEndpoint EOS event
                    httpGet.on('EndOfStream', function (event) {
                        console.log("EndOfStream event:", event);
                    });
                });
            });
    });
}, onerror);
}

Problem: dissecting the live video into small files and appending them to each other on reception is an expensive, time- and resource-consuming process. It also involves heavy buffering and other problems pertaining to real-time streaming.

Attempt 2.2: Send the recorded chunks of webm to a port on a Linux server. Use socket programming to pick up these individual files and play them with the VLC player from a UDP port of the Linux server


End result: small file containers play, but slow buffering makes this approach unsuitable for streaming file chunks and appending them as a single file.

Attempt 2.3: Send the recorded chunks of webm to a socket on a Linux server. Use socket programming to pick up these individual webm files and convert them to H264 format so that they can be sent to a media server.

This process involved the following components :

  • Recorder JavaScript library: RecordRTC.js
  • Transfer mechanism: WebRTC endpoint -> call handler (record in chunks) -> ffmpeg / gstreamer to put it on RTP -> streaming server like Wowza -> viewers
  • Programs: HTML webpage with a WebSocket connection -> nodejs program to write content from the websocket to a Linux socket -> nodejs program to read that socket and print the content on the console

Snippet to transfer the recorded webm files over a websocket to the nodejs program:

// Make the function wait until the connection is made.
function waitForSocketConnection(socket, callback) {
    setTimeout(
        function () {
            if (socket.readyState === 1) {
                console.log("Connection is made")
                if (callback != null) 
                    callback();
            } else {
                console.log("wait for connection...")
                waitForSocketConnection(socket, callback);
            }
        }, 5); // wait 5 milisecond for the connection...
}

function previewFile() {
    var preview = document.querySelector('img');
    var file = document.querySelector('input[type=file]').files[0];
    var reader = new FileReader();

    reader.onloadend = function () {
        preview.src = reader.result;
        console.log(" reader result ", reader.result);

        var video = document.getElementById("v");
        video.src = reader.result;
        console.log(" video played ");

        var ws = new WebSocket('ws://localhost:3000', 'echo-protocol');
        waitForSocketConnection(ws, function () {
            ws.send(reader.result);
            console.log("message sent!!!");
        });

    }

    if (file) {
        // converts to base64 encoded string of the file data
        //reader.readAsDataURL(file);
        reader.readAsBinaryString(file);
    } else {
        preview.src = "";
    }
}

Program for the Linux socket sender, which creates the socket for the webm files in nodejs:

var net = require('net');
var fs = require('fs');
var socketPath = '/tmp/tfxsocket';
var http = require('http');
var stream = require('stream');
var util = require('util');

var WebSocketServer = require('ws').Server;
var port = 3000;
var serverUrl = "localhost";

var socket;
/*----------http server -----------*/
var server = http.createServer(function (request, response) {});
server.listen(port, serverUrl);
console.log('HTTP Server running at ', serverUrl, port);

/*------websocket server ----------*/
var wss = new WebSocketServer({server: server});

wss.on("connection", function (ws) {
    console.log("websocket connection open");
    ws.on('message', function (message) {
        console.log(" stream recived from broadcast client on port 3000 ");
        var s = require('net').Socket();
        s.connect(socketPath);
        s.write(message);
        console.log(" send the stream to socketPath", socketPath);
    });

    ws.on("close", function () {
        console.log("websocket connection close")
    });
});

Program for the Linux socket listener using nodejs. Here the socket is at /tmp/mysocket:

var net = require('net');
var client = net.createConnection("/tmp/mysocket");
client.on("connect", function() {
    console.log("connected to mysocket");
});
client.on("data", function(data) {
    console.log(data);
});
client.on('end', function() {
    console.log('server disconnected');
});

Output 1: Video Buffer displayed


Output 2: payload from the video displayed, showing that the pipeline works but produces no output yet.


ffmpeg command for transferring the content from the socket to a UDP IP and port:

ffmpeg -i unix://tmp/mysocket -f format udp://192.168.0.119:8083

Problems of this approach: the video only passed through the socket and contained no usable information when we tried to play it / show it on the console.


Attempt 3: Use an existing media engine like Kurento to do the transcoding for me

Send the live WebRTC stream from a Kurento WebRTC endpoint to a Kurento HTTP endpoint, then play it using the Mozilla VLC web plugin.

The VLC Mozilla plugin can be embedded by:

<embed type="application/x-vlc-plugin" name="video2"
autoplay="yes" loop="no" hidden="no"
target="rtp://@192.165.0.119:8086" />

screenshot of failure on part of Mozilla VLC plugin to play from a WebRTC endpoint


Problem: the VLC Mozilla plugin was unable to play the video, and Mozilla-only playback was a difficult option for most consumers.

Continued in the next article: Streaming / broadcasting Live Video call to non WebRTC supported browsers and media players

More articles on similar topics

WebRTC Media Streams and Quality metrics


Media Stream Tracks in WebRTC

The MediaStreamTrack interface typically represents a stream of data of audio or video and a MediaStream may contain zero or more MediaStreamTrack objects.

The RTCRtpSender and RTCRtpReceiver objects can be used by the application to get finer-grained control over the transmission and reception of MediaStreamTracks.
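
A minimal sketch (pc is an existing RTCPeerConnection):

// Sketch: enumerate the tracks being sent and received
pc.getSenders().forEach(s => console.log('sending:', s.track && s.track.kind));
pc.getReceivers().forEach(r => console.log('receiving:', r.track.kind));
// fine-grained control, e.g. swapping the outgoing camera track:
// await sender.replaceTrack(newTrack);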

Media Flow in VoIP system
Media Flow in WebRTC Call

Video Streams

Video Capture insync with hardware’s capabilities

WebRTC-compatible browsers are required to support white balance, light level and autofocus from the video source.

Video Capture Resolution

The minimum WebRTC video attributes, unless specified otherwise in the SDP (Session Description Protocol), are 20 FPS and a resolution of 320 x 240 pixels.

Mid-stream resolution changes are also supported, such as when the screen source comes from desktop sharing.

SDP attributes for resolution, frame rate, and bitrate

SDP allows for codec-independent indication of preferred video resolutions using a=imageattr to indicate the maximum resolution that is acceptable. 

The sender must limit the encoded resolution to the indicated maximum size, as the receiver may not be capable of handling higher resolutions.

Dynamic FPS control based on actual hardware encoding

The video source capture should adjust the frame rate according to low bandwidth, poor light conditions and the hardware-supported rate, rather than forcing a higher FPS.

Stream Orientation

Endpoints should support generating the R0 and R1 bits of the Coordination of Video Orientation (CVO) mechanism and sharing them with the peer.

Audio Streams

Audio Level

The audio level for speech transmission should be normalized, to avoid users having to manually adjust playback and to facilitate mixing in conferencing applications.

Normalization considers frequencies above 300 Hz, regardless of the sampling rate used. It can be adapted to avoid clipping, either by lowering the gain to a level below -19 dBm0 or through the use of a compressor.

GAIN calculation

  • If the endpoint has control over the entire audio-capture path, like a regular phone, the gain should be adjusted so that an average speaker has a level of 2600 (-19 dBm0) for active speech.
  • If the endpoint does not have control over the entire audio-capture path, like a software endpoint, it SHOULD use automatic gain control (AGC) to dynamically adjust the level to 2600 (-19 dBm0) +/- 6 dB.
  • For music or desktop-sharing applications, the level SHOULD NOT be automatically adjusted, and the endpoint SHOULD allow the user to set the gain manually.
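
A minimal sketch of observing the capture level with the Web Audio API (the dBm0 targets themselves are enforced by the platform's AGC, not by this snippet):

// Sketch: log a running RMS level of the microphone signal
async function monitorMicLevel() {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    const ctx = new AudioContext();
    const analyser = ctx.createAnalyser();
    ctx.createMediaStreamSource(stream).connect(analyser);
    const buf = new Float32Array(analyser.fftSize);
    setInterval(() => {
        analyser.getFloatTimeDomainData(buf);
        const rms = Math.sqrt(buf.reduce((s, x) => s + x * x, 0) / buf.length);
        console.log('RMS level:', rms.toFixed(4));
    }, 500);
}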

Acoustic Echo Cancellation (AEC)

Endpoints should allow echo-control mechanisms.

SDP signaling and negotiation for media plane

Media-plane adaptation is done at the SBC for network-carried media; it should be done for all network-hosted media services that face peer-to-peer media.

The high-level architecture elements of WebRTC media streams consist of:

  • Encryption, RTP Multiplexing, Support for ICE
  • Audio – Interworking of differing WebRTC and codec sets
  • Video – Use of VP8, Support for H.264
  • Data – Support of MSRP ( RCS standard for messaging over DataChannel API)

Media Source

RTCVideoSource_4 (media-source)

timestamp	03/01/2022, 23:07:05
trackIdentifier	1bcab53d-1eca-41d1-a96a-00f1458c9b1b
kind	video
width	640
height	480
frames	7556
framesPerSecond	30

RTCAudioSource_3 (media-source)

timestamp	        03/01/2022, 23:06:26
trackIdentifier	        12cb979c-b40f-4de7-8b50-be6f4425e0b2
kind	                audio
audioLevel	        0.020599993896298106
totalAudioEnergy	1.8476431267450812
[Audio_Level_in_RMS]	0.02541394245734895
totalSamplesDuration	213.66999999995065
echoReturnLoss	        -0.11197675950825214
echoReturnLossEnhancement 8.111690521240234

Peer-to-Peer Media Stream

Direct connection to media servers and media gateways.

Use a common codec set wherever possible to eliminate transcoding. Use regionalized transcoding where a common codec is not available; real-time video transcoding is expensive and impacts performance.

Ongoing standards/device/network work needs to be done to expand the common codec set. WebRTC codec standards have not been finalized yet; the WebRTC target is to support royalty-free codecs within its standards.

| Media | WebRTC | Legacy |
| Audio | G.711, Opus | G.711, AMR, AMR-WB (G.722.2) |
| Audio – Extended | – | G.729a[b], G.726 |
| Video | VP8 | H.264/AVC |

Supporting common codecs between VoLTE devices and WebRTC endpoints requires one or more of the following:

  1. Support of WebRTC codecs on 3GPP/GSMA
  2. Support of 3GPP/GSMA codecs on WebRTC
  3. WebRTC browser support of codecs native to the device

RTP streams and RTCP stats: Outbound Video

Chrome browser on Ubuntu OS

RTCOutboundRTPVideoStream_3305924664 (outbound-rtp)

timestamp	03/01/2022, 22:23:32
ssrc	3305924664
kind	video
trackId	RTCMediaStreamTrack_sender_4
transportId	RTCTransport_0_1
codecId	RTCCodec_1_Outbound_96
[codec]	VP8 (96)
mediaType	video
mediaSourceId	RTCVideoSource_4
packetsSent	171360
[packetsSent/s]	204.02266754223697
retransmittedPacketsSent	620
[retransmittedPacketsSent/s]	0
bytesSent	177210957
[bytesSent_in_bits/s]	1680050.6587655507
headerBytesSent	4218672
[headerBytesSent_in_bits/s]	39812.423281967494
retransmittedBytesSent	668008
[retransmittedBytesSent_in_bits/s]	0
framesEncoded	22003
[framesEncoded/s]	30.00333346209367
keyFramesEncoded	14
totalEncodeTime	418.017
[totalEncodeTime/framesEncoded_in_ms]	9.533333333333378
totalEncodedBytesTarget	0
[totalEncodedBytesTarget_in_bits/s]	0
framesSent	22003
[framesSent/s]	30.00333346209367
hugeFramesSent	1
totalPacketSendDelay	29963.73
[totalPacketSendDelay/packetsSent_in_ms]	31.62745098039772
qualityLimitationReason	none
qualityLimitationDurations	{bandwidth:0,cpu:174895,none:717684,other:0}
qualityLimitationResolutionChanges	0
encoderImplementation	libvpx
firCount	0
pliCount	2
nackCount	161
remoteId	RTCRemoteInboundRtpVideoStream_3305924664
frameWidth	640
frameHeight	480
framesPerSecond	30
qpSum	151000
[qpSum/framesEncoded]	9.3

RTCP statistics RTCRemoteInboundRtpVideoStream_3305924664 (remote-inbound-rtp)

timestamp	03/01/2022, 22:25:29
ssrc	984864038
kind	audio
transportId	RTCTransport_0_1
codecId	RTCCodec_0_Outbound_111
jitter	0.026854166666666665
packetsLost	19
localId	RTCOutboundRTPAudioStream_984864038
roundTripTime	0.048
fractionLost	0
totalRoundTripTime	8.932
roundTripTimeMeasurements	201

Frames

Initially the graphs fluctuate; after considerable time (10 minutes in my case) the quality of the media stream adjusts to the network conditions and the variations (peaks and dips) flatten out.

Packets

Initially
After some time

Bytes Send And Received

Initially
After some time has passed

Headers

Initially
After some time has passed

Outbound Audio from Ubuntu Chrome Browser

RTCOutboundRTPAudioStream_984864038 (outbound-rtp)

timestamp	03/01/2022, 22:13:26
ssrc	984864038
kind	audio
trackId	RTCMediaStreamTrack_sender_3
transportId	RTCTransport_0_1
codecId	RTCCodec_0_Outbound_111
[codec]	opus (111, minptime=10; useinbandfec=1)
mediaType	audio
mediaSourceId	RTCAudioSource_3
packetsSent	14292
[packetsSent/s]	50.003051944088384
retransmittedPacketsSent	0
[retransmittedPacketsSent/s]	0
bytesSent	1151754
[bytesSent_in_bits/s]	32449.980589635597
headerBytesSent	400176
[headerBytesSent_in_bits/s]	11200.683635475798
retransmittedBytesSent	0
[retransmittedBytesSent_in_bits/s]	0
nackCount	0
remoteId	RTCRemoteInboundRtpAudioStream_984864038

RTCP statistics RTCRemoteInboundRtpAudioStream_984864038 (remote-inbound-rtp)

timestamp	03/01/2022, 22:17:05
ssrc	984864038
kind	audio
transportId	RTCTransport_0_1
codecId	RTCCodec_0_Outbound_111
jitter	0.002
packetsLost	3
localId	RTCOutboundRTPAudioStream_984864038
roundTripTime	0.023
fractionLost	0
totalRoundTripTime	4.344
roundTripTimeMeasurements	98

Inbound Video from Android WebRTC Browser

RTCInboundRTPVideoStream_3384287918 (inbound-rtp)

timestamp	03/01/2022, 22:55:35
ssrc	3384287918
kind	video
trackId	RTCMediaStreamTrack_receiver_4
transportId	RTCTransport_0_1
mediaType	video
jitter	0.027
packetsLost	78
packetsReceived	79545
[packetsReceived/s]	0
bytesReceived	77156700
[bytesReceived_in_bits/s]	0
headerBytesReceived	1978716
[headerBytesReceived_in_bits/s]	0
jitterBufferDelay	2284.024
[jitterBufferDelay/jitterBufferEmittedCount_in_ms]	0
jitterBufferEmittedCount	13100
framesReceived	13101
[framesReceived/s]	0
[framesReceived - framesDecoded]	0
framesDecoded	13101
[framesDecoded/s]	0
keyFramesDecoded	1
[keyFramesDecoded/s]	0
framesDropped	0
totalDecodeTime	94.229
[totalDecodeTime/framesDecoded_in_ms]	0
totalInterFrameDelay	442.0259999999831
[totalInterFrameDelay/framesDecoded_in_ms]	0
totalSquaredInterFrameDelay	20.370232000000772
[interFrameDelayStDev_in_ms]	0
decoderImplementation	libvpx
firCount	0
pliCount	2
nackCount	51
codecId	RTCCodec_1_Inbound_96
[codec]	VP8 (96)
lastPacketReceivedTimestamp	1641276962171
[lastPacketReceivedTimestamp]	03/01/2022, 22:16:02
frameWidth	480
frameHeight	640
framesPerSecond	4
qpSum	97949
[qpSum/framesDecoded]	0
estimatedPlayoutTimestamp	3850268134980
[estimatedPlayoutTimestamp]	03/01/2092, 22:55:42


Inbound Audio from Android WebRTC Browser

RTCInboundRTPAudioStream_579305270 (inbound-rtp)

timestamp	03/01/2022, 22:50:14
ssrc	579305270
kind	audio
trackId	RTCMediaStreamTrack_receiver_3
transportId	RTCTransport_0_1
mediaType	audio
jitter	0.003
packetsLost	208
packetsDiscarded	0
packetsReceived	124469
[packetsReceived/s]	50.03320990953163
fecPacketsReceived	0
fecPacketsDiscarded	0
bytesReceived	4433321
[bytesReceived_in_bits/s]	14209.431614306981
headerBytesReceived	3485132
[headerBytesReceived_in_bits/s]	11207.439019735084
jitterBufferDelay	17887008
[jitterBufferDelay/jitterBufferEmittedCount_in_ms]	113.79999999996896
jitterBufferEmittedCount	119485440
totalSamplesReceived	118645920
[totalSamplesReceived/s]	48031.88151315036
concealedSamples	689415
[concealedSamples/s]	0
[concealedSamples/totalSamplesReceived]	0
silentConcealedSamples	338882
[silentConcealedSamples/s]	0
concealmentEvents	230
insertedSamplesForDeceleration	33841
[insertedSamplesForDeceleration/s]	0
removedSamplesForAcceleration	1562246
[removedSamplesForAcceleration/s]	0
totalAudioEnergy	4.458078675648182
[Audio_Level_in_RMS]	0
totalSamplesDuration	2472.2900000075438
codecId	RTCCodec_0_Inbound_111
[codec]	opus (111, minptime=10; useinbandfec=1)
lastPacketReceivedTimestamp	1641279014658
[lastPacketReceivedTimestamp]	03/01/2022, 22:50:14
audioLevel	0
remoteId	RTCRemoteOutboundRTPAudioStream_579305270
estimatedPlayoutTimestamp	3850267813642
[estimatedPlayoutTimestamp]

RTCP statistics RTCRemoteOutboundRTPAudioStream_579305270 (remote-outbound-rtp)

timestamp	03/01/2022, 22:48:47
ssrc	579305270
kind	audio
transportId	RTCTransport_0_1
codecId	RTCCodec_0_Inbound_111
packetsSent	120306
bytesSent	4285534
localId	RTCInboundRTPAudioStream_579305270
remoteTimestamp	1641278927459
[remoteTimestamp]	03/01/2022, 22:48:47
reportsSent	480
roundTripTimeMeasurements	0
totalRoundTripTime	0

Comparison of media stream QoS metrics between laptop browser and mobile browser

Chrome browser on laptop                      Mobile Chrome browser
higher frames received (30)                   lower frames received (20)
lower jitter (0.002) and packet loss (3)      higher jitter (0.003) and packet loss (208)

Bundled Streams

The same port is used for all media streams. For example, port 9 is used for audio and video as well as their RTCP feedback in the snippet below.

a=group:BUNDLE 0 1
a=extmap-allow-mixed
a=msid-semantic: WMS kAGMqdVh7lL70CVUVZQblgjPYsuhOAiGY3ii
m=audio 9 UDP/TLS/RTP/SAVPF 111 63 103 104 9 0 8 106 105 13 110 112 113 126 (33 more lines)
a=rtcp:9 IN IP4 0.0.0.0
a=sendrecv
a=msid:kAGMqdVh7lL70CVUVZQblgjPYsuhOAiGY3ii ed96e925-4425-467b-a099-8fb2e0c67b88
a=rtcp-mux
...
m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 102 122 127 121 125 107 108 109 35 36 120 119 124 (100 more lines)
a=rtcp:9 IN IP4 0.0.0.0
a=sendrecv
a=msid:kAGMqdVh7lL70CVUVZQblgjPYsuhOAiGY3ii f45d3002-2866-44ce-a807-ac59a4f6708c
a=rtcp-mux

Same CNAME: when multiple media streams run over a single bundled transport, each stream keeps its own SSRC, but a common CNAME ties them to the same synchronization context.

a=ssrc:1683985800 cname:OYWXQ35YL2Hh+eUX
a=ssrc:1384669066 cname:OYWXQ35YL2Hh+eUX
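
Bundling behavior can also be requested up front in the RTCPeerConnection configuration. A minimal sketch; both fields are standard RTCConfiguration options:

// Ask the browser to negotiate a single transport for all media,
// matching the a=group:BUNDLE line in the SDP above.
const pc = new RTCPeerConnection({
  bundlePolicy: 'max-bundle',
  rtcpMuxPolicy: 'require'  // mux RTP and RTCP on one port (a=rtcp-mux)
});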

Peer-to-Peer Data Transfer

The DataChannel API of WebRTC allows bidirectional communication of arbitrary data between peers. It uses an API modeled on WebSockets and has very low latency.

  • (+) DataChannel is p2p and also end-to-end encrypted, leading to higher privacy
  • (+) built-in security due to p2p transfer
  • (+) higher throughput than text transfer via a messaging server
  • (+) lower latency, as p2p transfer takes the shortest route

SCTP is the protocol that carries peer-to-peer data channels in WebRTC. It can be configured for reliability and ordered delivery, and it provides flow and congestion control for data messages.
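
A minimal usage sketch, where pc and remotePc are assumed, already-connected peer connections; the channel label and reliability settings are illustrative:

// Partially reliable, unordered channel: SCTP stops retransmitting
// a message after 3 attempts and may deliver messages out of order.
const channel = pc.createDataChannel('file-transfer', {
  ordered: false,
  maxRetransmits: 3
});

channel.onopen = () => channel.send('hello');
channel.onmessage = event => console.log('received:', event.data);

// On the remote peer's connection, the channel arrives via an event:
remotePc.ondatachannel = event => {
  event.channel.onmessage = e => console.log('remote received:', e.data);
};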

Data Channel Metrics

timestamp	03/01/2022, 23:13:13
label	sctp
protocol	
dataChannelIdentifier	1
state	open
messagesSent	42
[messagesSent/s]	0
bytesSent	1962750
[bytesSent_in_bits/s]	0
messagesReceived	31
[messagesReceived/s]	0
bytesReceived	4712
[bytesReceived_in_bits/s]	0

After sharing 2 files of 1.5 MB each

Bitrate

WebRTC changes bitrate, resolution, and framerate dynamically to accommodate network conditions, policy constraints, or user-equipment capability. The higher the bitrate, the higher the media quality. A sender-side bitrate cap can also be applied, as sketched below.
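
A minimal sketch of capping the outgoing video bitrate through sender parameters, assuming an existing RTCPeerConnection named pc with a video track attached:

// Find the video sender and limit its encoding bitrate.
const sender = pc.getSenders().find(s => s.track && s.track.kind === 'video');
const params = sender.getParameters();
if (!params.encodings || params.encodings.length === 0) {
  params.encodings = [{}];
}
params.encodings[0].maxBitrate = 500000;  // 500 kbps, roughly VGA quality
sender.setParameters(params);             // returns a Promise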

Bitrate of Audio Codecs

Lossy formats
– iLBC (narrowband) 13.33, 15.20 kbit/s
– iSAC (wideband) 10–52 kbit/s
– GSM-EFR 12.2 kbit/s
– AAC 8–529 kbit/s (stereo)
– AMR-WB (G.722.2) 6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85 kbit/s
– Opus 6–510 kbit/s

(-) higher bitrate consumes more bandwidth
(-) can cause congestion on the network route

ITU-T formats
– G.711 64 kbit/s
– G.711.1 (MDCT, A-law, μ-law) 64, 80, 96 kbit/s
– G.722 64 kbit/s (comprising 48, 56 or 64 kbit/s audio and 16, 8 or 0 kbit/s auxiliary data)

Lossless formats (such as Dolby TrueHD, MPEG-4 ALS) consume much larger bitrates.

Bitrate of Video

QVGA 200–500 kbps
VGA 400–800 kbps
720p+ > 800 kbps
4K (60 fps) > 20 Mbps

Packet Loss

Packet loss can cause choppy audio and distorted, blurry or frozen video.
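
Cumulative loss can be derived from the inbound-rtp stats; for the inbound audio dump above, 208 packets lost out of 124677 total is roughly 0.17%. A minimal sketch, with pc an assumed RTCPeerConnection:

async function logPacketLoss(pc) {
  const report = await pc.getStats();
  report.forEach(stats => {
    if (stats.type === 'inbound-rtp') {
      // Loss percentage over everything sent so far.
      const total = stats.packetsReceived + stats.packetsLost;
      const lossPct = total > 0 ? (100 * stats.packetsLost) / total : 0;
      console.log(`${stats.kind} loss: ${lossPct.toFixed(2)}%`);
    }
  });
}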

Audio Packet loss

Video Packet loss

Jitter

Jitter is the variation in packet delay within an otherwise predictable, steady rate of delivery. Rising jitter can indicate route changes, growing congestion, and similar network issues.
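
The jitter value in the stats dumps follows the interarrival jitter estimator of RFC 3550: J(i) = J(i-1) + (|D(i-1,i)| - J(i-1)) / 16, where D is the difference in relative transit time between consecutive packets. A minimal sketch of that running estimate:

let jitter = 0;
let prevTransit = null;

// Call once per received packet; both times in the same clock units.
function onPacket(arrivalTime, rtpTimestamp) {
  const transit = arrivalTime - rtpTimestamp;   // relative transit time
  if (prevTransit !== null) {
    const d = Math.abs(transit - prevTransit);  // |D(i-1, i)|
    jitter += (d - jitter) / 16;                // smoothed per RFC 3550
  }
  prevTransit = transit;
}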

Audio

Jitter for Audio

Video

Jitter for Video

Round Trip Time

High RTT is indicative of network congestion and causes delays.
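
Both an instantaneous roundTripTime and a running average are available from the remote-inbound-rtp report; in one of the dumps above, totalRoundTripTime 8.932 over 201 measurements works out to roughly 44 ms. A minimal sketch, with pc an assumed RTCPeerConnection:

async function logAverageRtt(pc) {
  const report = await pc.getStats();
  report.forEach(stats => {
    if (stats.type === 'remote-inbound-rtp' &&
        stats.roundTripTimeMeasurements > 0) {
      // Average RTT in seconds, printed as milliseconds.
      const avg = stats.totalRoundTripTime / stats.roundTripTimeMeasurements;
      console.log(`${stats.kind} avg RTT: ${(avg * 1000).toFixed(1)} ms`);
    }
  });
}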

Audio RTT

Video RTT

Cumulative analysis of packets lost, RTT measurements, and total RTT in an internetwork scenario with peer-reflexive and relay ICE candidates

Deeper analysis of fractionLost, jitter, and RTT

The chart shows how jitter follows RTT

References:

Read more on SDP and its attributes