NAT traversal using STUN and TURN

WebRTC : Web-based real-time communication is a game changer for real-time communication systems. WebRTC is an open-source, royalty-free, unencumbered browser-based platform built on the browser's embedded media application programming interface (API). It lets developers add custom JavaScript and HTML5 to control media setup and flow. WebRTC has enabled developers to build apps, sites, widgets, plugins and extensions that deliver simultaneous audio, video, data and screen sharing in a peer-to-peer fashion.

Issues across networks : What often escapes attention is how media traverses the network. WebRTC sessions run smoothly when both peers are on the open public internet without any restrictions or firewall blocks. The real problem begins when one of the peers is behind a corporate/enterprise network, or uses an Internet service provider that imposes security restrictions. In such a case the default ICE candidates of WebRTC are not sufficient to set up bidirectional media streaming. What is required is a NAT (Network Address Translation) traversal mechanism that performs address discovery.

NAT and ICE solution : The STUN and TURN protocols handle session initiation with handshakes between peers in different network environments. When a firewall blocks the peer-to-peer connection discovered through STUN, the system falls back to a TURN server, which relays the traffic through the NAT.

Let's start from the beginning, with NAT and then ICE.


Network Address Translation (NAT) provides a mapping of internal to external IP addresses. It modifies the network address information of packets in transit across a traffic routing node, such as a router between networks.

A private address on the inside of the NAT is mapped to an external public address. Port address translation (PAT) resolves conflicts that arise when multiple hosts happen to use the same source port number to establish different external connections at the same time.
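
The mapping described above can be pictured as a small lookup table. The following is a minimal sketch, not how any particular router implements it: the class name `PatTable`, the starting port 40000 and the documentation-range IP addresses are all assumptions made for illustration.

```javascript
// Minimal sketch of a NAT/PAT translation table (illustrative only).
class PatTable {
  constructor(publicIp, firstPort = 40000) {
    this.publicIp = publicIp;
    this.nextPort = firstPort;
    this.outbound = new Map(); // "privateIp:port" -> public port
    this.inbound = new Map();  // public port -> { ip, port }
  }

  // Translate the source of an outgoing packet, creating a mapping if needed.
  translateOutbound(privateIp, privatePort) {
    const key = `${privateIp}:${privatePort}`;
    if (!this.outbound.has(key)) {
      const publicPort = this.nextPort++;
      this.outbound.set(key, publicPort);
      this.inbound.set(publicPort, { ip: privateIp, port: privatePort });
    }
    return { ip: this.publicIp, port: this.outbound.get(key) };
  }

  // Find the internal host that an incoming packet belongs to (or null).
  translateInbound(publicPort) {
    return this.inbound.get(publicPort) || null;
  }
}

// Two internal hosts use the same source port; PAT gives them distinct public ports.
const nat = new PatTable('203.0.113.7');
console.log(nat.translateOutbound('192.168.1.10', 5000));
console.log(nat.translateOutbound('192.168.1.11', 5000));
```

Both hosts share one public IP, and the public port alone tells the NAT where a reply should go.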

Some ways to achieve NAT traversal:

  • Application Layer Gateway (ALG)
  • Interactive Connectivity Establishment (ICE)
  • UPnP Internet Gateway Device Protocol
  • Proprietary SIP-based Session Border Controllers, and so on

Let us look in detail at ICE, which is the default implementation for WebRTC.

What is ICE and why is it used ?

The ICE (Interactive Connectivity Establishment) framework (mandatory by WebRTC standards) finds network interfaces and ports in the Offer/Answer model so that network information can be exchanged with the participating communication clients. ICE makes use of the Session Traversal Utilities for NAT (STUN) protocol and its extension, Traversal Using Relays around NAT (TURN).

ICE is defined by RFC 5245 – Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols.
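
ICE ranks the candidates it gathers by priority, which is why direct and STUN-discovered paths are tried before TURN relays. The sketch below follows the formula and the recommended type preferences from RFC 5245; the function name and the default arguments (local preference 65535, component 1) are assumptions chosen for the example.

```javascript
// Recommended type preferences from RFC 5245, section 4.1.2.2.
const TYPE_PREFERENCE = { host: 126, prflx: 110, srflx: 100, relay: 0 };

// priority = 2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
function candidatePriority(type, localPreference = 65535, componentId = 1) {
  if (!(type in TYPE_PREFERENCE)) {
    throw new Error(`unknown candidate type: ${type}`);
  }
  // Plain multiplication is used instead of bit shifts to stay clear of 32-bit overflow.
  return TYPE_PREFERENCE[type] * 2 ** 24 + localPreference * 2 ** 8 + (256 - componentId);
}

for (const type of ['host', 'srflx', 'relay']) {
  console.log(type, candidatePriority(type));
}
```

With these defaults a host candidate ranks above a server reflexive (STUN) candidate, which in turn ranks above a relay (TURN) candidate.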

Sample WebRTC offer holding ICE candidates :

type: offer, sdp: v=0
o=- 3475901263113717000 2 IN IP4
t=0 0
a=group:BUNDLE audio video data
a=msid-semantic: WMS dZdZMFQRNtY3unof7lTZBInzcRRylLakxtvc
m=audio 9 RTP/SAVPF 111 103 104 9 0 8 106 105 13 126
c=IN IP4
a=rtcp:9 IN IP4
a=fingerprint:sha-256 F1:A8:2E:71:4B:4E:FF:08:0F:18:13:1C:86:7B:FE:BA:BD:67:CF:B1:7F:19:87:33:6E:10:5C:17:42:0A:6C:15
a=rtpmap:111 opus/48000/2
a=fmtp:111 minptime=10
a=rtpmap:103 ISAC/16000
a=rtpmap:104 ISAC/32000
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:106 CN/32000
a=rtpmap:105 CN/16000
a=rtpmap:13 CN/8000
a=rtpmap:126 telephone-event/8000
m=video 9 RTP/SAVPF 100 116 117 96
c=IN IP4
a=rtcp:9 IN IP4
a=fingerprint:sha-256 F1:A8:2E:71:4B:4E:FF:08:0F:18:13:1C:86:7B:FE:BA:BD:67:CF:B1:7F:19:87:33:6E:10:5C:17:42:0A:6C:15
a=rtpmap:100 VP8/90000
a=rtcp-fb:100 ccm fir
a=rtcp-fb:100 nack
a=rtcp-fb:100 nack pli
a=rtcp-fb:100 goog-remb
a=rtpmap:116 red/90000
a=rtpmap:117 ulpfec/90000
a=rtpmap:96 rtx/90000
a=fmtp:96 apt=100
m=application 9 DTLS/SCTP 5000
c=IN IP4
a=fingerprint:sha-256 F1:A8:2E:71:4B:4E:FF:08:0F:18:13:1C:86:7B:FE:BA:BD:67:CF:B1:7F:19:87:33:6E:10:5C:17:42:0A:6C:15
a=sctpmap:5000 webrtc-datachannel 1024

Notice the ICE candidates under video and audio. Now take a look at the SDP answer

type: answer, sdp: v=0
o=- 6931590438150302967 2 IN IP4
t=0 0
a=group:BUNDLE audio video data
a=msid-semantic: WMS R98sfBPNQwC20y9HsDBt4to1hTFeP6S0UnsX
m=audio 1 RTP/SAVPF 111 103 104 0 8 106 105 13 126
c=IN IP4
a=rtcp:1 IN IP4
a=fingerprint:sha-256 7B:9A:A7:43:EC:17:BD:9B:49:E4:23:92:8E:48:E4:8C:9A:BE:85:D4:1D:D7:8B:0E:60:C2:AE:67:77:1D:62:70
a=rtpmap:111 opus/48000/2
a=fmtp:111 minptime=10
a=rtpmap:103 ISAC/16000
a=rtpmap:104 ISAC/32000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:106 CN/32000
a=rtpmap:105 CN/16000
a=rtpmap:13 CN/8000
a=rtpmap:126 telephone-event/8000
m=video 1 RTP/SAVPF 100 116 117 96
c=IN IP4
a=rtcp:1 IN IP4
a=fingerprint:sha-256 7B:9A:A7:43:EC:17:BD:9B:49:E4:23:92:8E:48:E4:8C:9A:BE:85:D4:1D:D7:8B:0E:60:C2:AE:67:77:1D:62:70
a=rtpmap:100 VP8/90000
a=rtcp-fb:100 ccm fir
a=rtcp-fb:100 nack
a=rtcp-fb:100 nack pli
a=rtcp-fb:100 goog-remb
a=rtpmap:116 red/90000
a=rtpmap:117 ulpfec/90000
a=rtpmap:96 rtx/90000
a=fmtp:96 apt=100
m=application 1 DTLS/SCTP 5000
c=IN IP4
a=fingerprint:sha-256 7B:9A:A7:43:EC:17:BD:9B:49:E4:23:92:8E:48:E4:8C:9A:BE:85:D4:1D:D7:8B:0E:60:C2:AE:67:77:1D:62:70
a=sctpmap:5000 webrtc-datachannel 1024

STUN vs TURN

STUN
  • address discovery for global IP:port
  • binary protocol
  • doesn't stay in path after connection
  • higher priority
  • server and peer reflexive ICE candidates

TURN
  • allocates its own address as interface to the client
  • extension of STUN
  • stays in path after connection
  • tunnels and relays media
  • lower priority
  • relay ICE candidates

Failed WebRTC ICE connection
Successful STUN connection
ICE candidate grid

Call Flow for STUN protocol exchange

Client -> Server : binding request with attributes – CHANGE-REQUEST

Server -> Client : binding response with attributes – MAPPED-ADDRESS, RESPONSE-ORIGIN, OTHER-ADDRESS, XOR-MAPPED-ADDRESS
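
To make the exchange above concrete, here is a minimal sketch of the two ends of it in Node.js: building the 20-byte header of a Binding Request and decoding an IPv4 XOR-MAPPED-ADDRESS value from the response. The function names are assumptions, and the request carries no attributes (so no CHANGE-REQUEST); it only shows the wire layout, not a full STUN client.

```javascript
const crypto = require('crypto');

const MAGIC_COOKIE = 0x2112a442; // fixed value in every STUN message header

// Build a Binding Request header: type, length, magic cookie, 12-byte transaction ID.
function buildBindingRequest(transactionId = crypto.randomBytes(12)) {
  const header = Buffer.alloc(20);
  header.writeUInt16BE(0x0001, 0);       // message type: Binding Request
  header.writeUInt16BE(0, 2);            // message length: no attributes in this sketch
  header.writeUInt32BE(MAGIC_COOKIE, 4);
  transactionId.copy(header, 8);
  return header;
}

// Decode the value of an XOR-MAPPED-ADDRESS attribute (IPv4 only).
// Layout: reserved byte, family byte, 2-byte x-port, 4-byte x-address.
function decodeXorMappedAddressIPv4(value) {
  if (value[1] !== 0x01) throw new Error('only the IPv4 family is handled here');
  const port = value.readUInt16BE(2) ^ (MAGIC_COOKIE >>> 16);
  const addr = (value.readUInt32BE(4) ^ MAGIC_COOKIE) >>> 0;
  const ip = [addr >>> 24, (addr >>> 16) & 0xff, (addr >>> 8) & 0xff, addr & 0xff].join('.');
  return { ip, port };
}

console.log(buildBindingRequest().toString('hex'));
console.log(decodeXorMappedAddressIPv4(Buffer.from('0001a147e112a643', 'hex')));
```

The sample bytes decode to 192.0.2.1 port 32853, an address from the documentation range.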

STUN call flow for WebRTC Offer Answer
WebRTC STUN binding request
WebRTC STUN Binding success response

WebRTC needs SDP Offer to be sent to B from A.
Client B uses this SDP offer to generate an SDP Answer for A.
The SDP ( as seen on chrome://webrtc-internals/ ) includes ICE candidates which map open ports in the firewalls.
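
Each ICE candidate travels as an `a=candidate:` attribute line. The sketch below splits such a line into its fields; the function name `parseCandidate` and the sample line (with documentation-range addresses) are assumptions for illustration, and extension fields other than `raddr`/`rport` are simply ignored.

```javascript
// Parse one "a=candidate:..." line into an object, or return null if it is not one.
function parseCandidate(line) {
  const parts = line.replace(/^a=/, '').replace(/^candidate:/, '').trim().split(/\s+/);
  if (parts.length < 8 || parts[6] !== 'typ') return null;
  const candidate = {
    foundation: parts[0],
    component: Number(parts[1]),
    transport: parts[2].toLowerCase(),
    priority: Number(parts[3]),
    ip: parts[4],
    port: Number(parts[5]),
    type: parts[7], // host, srflx, prflx or relay
  };
  // Optional name/value pairs, for example "raddr <ip> rport <port>".
  for (let i = 8; i + 1 < parts.length; i += 2) {
    if (parts[i] === 'raddr') candidate.relatedAddress = parts[i + 1];
    if (parts[i] === 'rport') candidate.relatedPort = Number(parts[i + 1]);
  }
  return candidate;
}

console.log(parseCandidate(
  'a=candidate:842163049 1 udp 1694498815 203.0.113.5 46154 typ srflx raddr 192.168.1.10 rport 46154 generation 0'
));
```

For a `srflx` candidate, `ip`/`port` is the public mapping learned through STUN, while `relatedAddress`/`relatedPort` is the private address behind the NAT.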

However, when both sides are behind symmetric NATs, the media flow gets blocked. In such a case TURN is used: the TURN server allocates a public IP and port that is mapped to the peer's internal IP and port. This relay path provides an alternative route, acting like a packet mirror. The peers can then open a DTLS connection over it and use that to key the DTLS-SRTP media streams.

NAT types

Some types of NAT are described below

Full Cone ( Normal)

All requests from the same internal IP address and port are mapped to the same external IP address and port. Any external host can send packets to the internal host by using the mapped external address.

Full cone ( credits wikipedia)

Restricted Cone

All requests from the same internal IP address and port are mapped to the same external IP address and port, but an external host can send packets to the internal host only if the internal host has previously sent a packet to that IP address.

Address Restricted cone ( credits wikipedia)

Port Restricted Cone

All requests from the same internal IP address and port are mapped to the same external IP address and port, but an external host can send packets to the internal host only if the internal host has previously sent a packet to that IP address and that port.

Port Restricted cone ( credits wikipedia)


Symmetric

All requests from the same internal IP address and port to a specific destination IP address and port are mapped to the same external IP address and port. Any traffic from the same internal IP and port to a different destination uses a new mapping. Only an external host that has received a packet from the internal host can send a UDP packet back to it.

Symmetric NAT ( credits wikipedia).
Address and Port-Dependent Mapping (“APDM”) and APDF (Address and Port-Dependent Filtering).
Web Archive 2018

Network Scenarios for NAT

To understand this better, consider the scenarios that determine NAT mapping behaviour. One can run tests using CLI or network analyzer tools and check the XOR-MAPPED-ADDRESS value of the Binding Response message that the client receives.

Mapping behaviour

  •  Endpoint-Independent Mapping NAT (EIM-NAT)
  • Address-Dependent Mapping NAT (ADM-NAT)
  • Address and Port-Dependent Mapping NAT (APDM-NAT)
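
The three mapping behaviours can be told apart by comparing the XOR-MAPPED-ADDRESS values returned by a few Binding Requests, in the style of the tests in RFC 5780. The sketch below assumes three results are already available: test I sent to the server's primary address and port, test II to its alternate address on the primary port, and test III to its alternate address and alternate port. The function names and sample addresses are illustrative assumptions.

```javascript
function sameEndpoint(x, y) {
  return x.ip === y.ip && x.port === y.port;
}

// Each argument is the { ip, port } taken from XOR-MAPPED-ADDRESS in one test.
function classifyMapping(test1, test2, test3) {
  if (sameEndpoint(test1, test2)) return 'EIM-NAT';  // same mapping for every destination
  if (sameEndpoint(test2, test3)) return 'ADM-NAT';  // mapping changes with destination address only
  return 'APDM-NAT';                                 // mapping changes with address and port
}

const mapped = (port) => ({ ip: '203.0.113.7', port });
console.log(classifyMapping(mapped(40000), mapped(40000), mapped(40000)));
console.log(classifyMapping(mapped(40000), mapped(40001), mapped(40001)));
console.log(classifyMapping(mapped(40000), mapped(40001), mapped(40002)));
```

In practice test III is only needed when tests I and II disagree.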

Filtering behaviour

  • Endpoint-Independent Filtering NAT (EIF-NAT)
  • Address-Dependent Filtering NAT (ADF-NAT)
  • Address and Port-Dependent Filtering NAT (APDF-NAT)
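
Filtering behaviour decides which inbound packets are let through an existing mapping. The following sketch models the three policies for a single internal host; the class name `FilteringNat` and the short mode labels `EIF`, `ADF` and `APDF` are assumptions for this example, not a real library.

```javascript
class FilteringNat {
  constructor(mode) {
    if (!['EIF', 'ADF', 'APDF'].includes(mode)) {
      throw new Error(`unknown filtering mode: ${mode}`);
    }
    this.mode = mode;
    this.sentTo = []; // remote endpoints the internal host has already sent packets to
  }

  // Record an outgoing packet; this is what opens the "hole".
  recordOutbound(remoteIp, remotePort) {
    this.sentTo.push({ ip: remoteIp, port: remotePort });
  }

  // Decide whether a packet arriving from remoteIp:remotePort is forwarded inside.
  allowsInbound(remoteIp, remotePort) {
    if (this.sentTo.length === 0) return false; // no mapping exists yet
    if (this.mode === 'EIF') return true;
    if (this.mode === 'ADF') return this.sentTo.some((e) => e.ip === remoteIp);
    return this.sentTo.some((e) => e.ip === remoteIp && e.port === remotePort);
  }
}

const portRestricted = new FilteringNat('APDF');
portRestricted.recordOutbound('198.51.100.9', 3478);
console.log(portRestricted.allowsInbound('198.51.100.9', 3478)); // same address and port
console.log(portRestricted.allowsInbound('198.51.100.9', 5000)); // same address, other port
```

Endpoint-independent filtering corresponds to the full cone description above, address-dependent to the restricted cone, and address and port-dependent to the port restricted cone.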

Hole Punching

As long as one end of the connection is able to determine the dynamic association created by the NAT for the other party and send data to it, hole punching can work.

Permissive NAT mapping techniques, which consistently map the same internal address/port to the same external address/port, are suitable for hole punching; examples are full cone, address-restricted and port-restricted NAT. Pure symmetric NATs, however, use inconsistent, destination-specific port mappings and thus cannot do hole punching.
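
As a rough illustration, the pairing of NAT kinds can be turned into a yes/no check. This sketch encodes a commonly cited rule of thumb that is not taken from the text above: a symmetric NAT can still be traversed when the other side is a full cone or address-restricted NAT, but not when it is port-restricted or symmetric. It ignores port prediction tricks, and the function and label names are assumptions.

```javascript
const NAT_KINDS = ['full-cone', 'address-restricted', 'port-restricted', 'symmetric'];

// Rule-of-thumb check: can two peers behind these NAT kinds punch a UDP hole?
function canHolePunch(natA, natB) {
  for (const kind of [natA, natB]) {
    if (!NAT_KINDS.includes(kind)) throw new Error(`unknown NAT kind: ${kind}`);
  }
  const symmetricCount = [natA, natB].filter((kind) => kind === 'symmetric').length;
  if (symmetricCount === 0) return true;  // both sides keep consistent mappings
  if (symmetricCount === 2) return false; // neither side can learn the other's port
  const other = natA === 'symmetric' ? natB : natA;
  return other !== 'port-restricted';     // port filtering rejects the unpredictable port
}

console.log(canHolePunch('full-cone', 'port-restricted'));
console.log(canHolePunch('symmetric', 'symmetric'));
```

When the check fails, a TURN relay is the fallback.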

1. No firewall present on either peer. Both are connected to the open public internet

Diagrammatic representation of  this shown as follows :

WebRTC signalling and media flow on Open public network

In this case there is no restriction on signalling or media flow, and the call takes place smoothly in p2p fashion.

2. One or both peers (there could be many in a multi-party conference call) are behind a firewall, a restrictive connection, or a router configured for an intranet

In such a case the signalling may pass using the default ICE candidates or a simple open-source Google STUN server such as

 { 'url': ""}]

Diagram :

WebRTC signalling when peers are behind  firewalls

However, the media is restricted, resulting in a black / empty / no-video situation for both peers. To combat this, a relay mechanism such as TURN is required, which essentially maps a public IP to the private IPs and thus creates an alternative route for media and data to flow through.

WebRTC media flow when peers are behind NAT . Uses TURN relay mechanism

Peer config should look like :

var configuration = {
  iceServers: [
    // "urls" is the field name in the current specification; older browsers used "url"
    { "urls": "stun::" },
    { "urls": "turn::" }
  ]
};

var pc = new RTCPeerConnection(configuration);
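
A small helper can assemble this configuration and, for testing the relay path, force all media through TURN. This is a sketch under assumptions: the helper name, its option names and the `example.org` server URLs are placeholders, while `urls`, `username`, `credential` and `iceTransportPolicy` are the standard `RTCConfiguration` fields.

```javascript
// Build an RTCConfiguration object from STUN/TURN settings.
function buildIceConfiguration({ stunUrl, turnUrl, turnUsername, turnCredential, relayOnly = false }) {
  const iceServers = [];
  if (stunUrl) iceServers.push({ urls: stunUrl });
  if (turnUrl) {
    iceServers.push({ urls: turnUrl, username: turnUsername, credential: turnCredential });
  }
  // 'relay' restricts ICE to TURN candidates; 'all' is the default behaviour.
  return { iceServers, iceTransportPolicy: relayOnly ? 'relay' : 'all' };
}

const config = buildIceConfiguration({
  stunUrl: 'stun:stun.example.org:3478',   // placeholder server
  turnUrl: 'turn:turn.example.org:3478',   // placeholder server
  turnUsername: 'demo-user',               // placeholder credentials
  turnCredential: 'demo-secret',
  relayOnly: true,
});
console.log(JSON.stringify(config, null, 2));
```

In a browser the result would be passed straight to `new RTCPeerConnection(config)`. Setting `relayOnly` is a quick way to confirm that the TURN server works before relying on it as a fallback.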

3. When the TURN server is also behind a firewall

The config file of the TURN server needs to be altered to map the public and private IPs. The diagrammatic description of this is as follows :

WebRTC media flow when peers are behind NAT and TURN server is behind NAT as well . TURN config files bind a public interface to private interface address.


continue : Streaming / broadcasting Live Video call to non webrtc supported browsers and media players

This post continues the account of attempts, outcomes and problems in building a WebRTC-to-RTP media framework that successfully streams / broadcasts WebRTC content to browsers without WebRTC support ( Safari / IE ) and to media players ( VLC ).

Attempt 4: Stream the content to a WebRTC endpoint which is hidden in a video call. Pick the stream from the VP8 object URL and send it to a streaming server

This process involved the following components :

  • WebRTC API : simplewebrtc on Chrome
  • Transfer mechanism from client to Streaming server:  webrtc media channel

Problems : No streaming server is able to take a direct WebRTC input and stream it on the network.

Attempt 4.1 : Stream the content to a WebRTC endpoint . Do WebRTC Endpoint to RTP Endpoint bridge using Kurento APIs. 

Use the RTP port and IP address as input to an ffmpeg, gstreamer or VLC terminal command, and output a live H264 stream on another IP address and port.

This process involved the following components :

  • API : Kurento
  • Transfer mechanism : HTML5 webrtc client -> application server hosting java -> media server -> application for webrtc media to RTP media conversion -> RTP player

Screenshots of attempts with Wowza to stream RTP from a IP and port


Problems : The stream was black which means 100% loss.

Lesson learned : RTP is not suitable for transmission over the internet, especially with firewalls

Attempt 4.2 : Build a WebRTC Endpoint to Http endpoint in kurento and force the video audio encoding to be that of H264 and PCMU.

Code snippet for adding constraints to output media via pipeline and forcing choice of codecs( H264 for video and PCMU for audio ).

MediaPipeline pipeline = kurento.createMediaPipeline();
WebRtcEndpoint webRtcEndpoint = new WebRtcEndpoint.Builder(pipeline).build();
HttpGetEndpoint httpEndpoint=new HttpGetEndpoint.Builder(pipeline).build();

org.kurento.client.Fraction fr= new org.kurento.client.Fraction(1, 30);
VideoCaps vc= new VideoCaps(VideoCodec.H264,fr);

AudioCaps ac= new AudioCaps(AudioCodec.PCMU, 65536);


Alternatively one can opt to use gstreamer filter to force the output in raw format.

// basic media operation of 1 pipeline and 2 endpoints
MediaPipeline pipeline = kurento.createMediaPipeline();
WebRtcEndpoint webRtcEndpoint = new WebRtcEndpoint.Builder(pipeline).build();
RtpEndpoint rtpEndpoint = new RtpEndpoint.Builder(pipeline).build();

// adding GStreamer filters
GStreamerFilter filter1 = new GStreamerFilter.Builder(pipeline, "videorate max-rate=30").withFilterType(FilterType.VIDEO).build();
GStreamerFilter filter2 = new GStreamerFilter.Builder(pipeline, "capsfilter caps=video/x-h264,width=1280,height=720,framerate=30/1").withFilterType(FilterType.VIDEO).build();
GStreamerFilter filter3 = new GStreamerFilter.Builder(pipeline, "capsfilter caps=audio/x-mpeg,layer=3,rate=48000").withFilterType(FilterType.AUDIO).build();

// connecting all points to one another
webRtcEndpoint.connect(filter1);
filter1.connect(filter2);
filter2.connect(filter3);
filter3.connect(rtpEndpoint);

// RTP SDP offer and answer
String requestRTPsdp = rtpEndpoint.generateOffer();

End result : The output is still webm based and does not work on H264 clients.

Attempt 5  : Use a RTP SDP Endpoint ( ie a SDP file valid for a given session ) and use it to play the WebRTC media over Wowza streaming server

This process involved the following components

  1. WebRTC Stream and object URL of the blob containing VP8 media
  2. Kurento  WebRTC Endpoint  bridge to generate SDP
  3. Wowza Streaming server

Snippet used for kurento to generate a SDP file from WebRTC to RTP bridge

@RequestMapping(value = "/rtpsdp", method = RequestMethod.POST)
private String processRequestrtpsdp(@RequestBody String sdpOffer)
        throws IOException, URISyntaxException, InterruptedException {

    // basic media operation of 1 pipeline and 2 endpoints
    MediaPipeline pipeline = kurento.createMediaPipeline();
    WebRtcEndpoint webRtcEndpoint = new WebRtcEndpoint.Builder(pipeline).build();
    RtpEndpoint rtpEndpoint = new RtpEndpoint.Builder(pipeline).build();

    // connecting all points to one another
    webRtcEndpoint.connect(rtpEndpoint);

    // RTP SDP offer and answer
    String requestRTPsdp = rtpEndpoint.generateOffer();

    // write the SDP connector to an external file
    PrintWriter out = new PrintWriter("/tmp/test.sdp");
    out.println(requestRTPsdp);
    out.close();

    HttpGetEndpoint httpEndpoint = new HttpGetEndpoint.Builder(pipeline).build();
    PlayerEndpoint player = new PlayerEndpoint.Builder(pipeline, requestRTPsdp).build();

    // Playing media and opening the default desktop browser
    String videoUrl = httpEndpoint.getUrl();
    System.out.println(" ------- video URL -------------" + videoUrl);

    // send the response to front client
    String responseSdp = webRtcEndpoint.processOffer(sdpOffer);

    return responseSdp;
}

End result : Wowza does not recognize the WebRTC SDP and cannot play the video

screenshot of wowza with SDP input


Attempt 5.1 : Use a RTP SDP Endpoint ( ie a SDP file valid for a given session ) and use it to play the WebRTC media over Default Ubuntu media player 

SDP file formed contains contents such as :

o=- 3631611195 3631611195 IN IP4
s=Kurento Media Server
c=IN IP4
t=0 0
m=audio 42802 RTP/AVP 98 99 0
a=rtpmap:98 OPUS/48000/2
a=rtpmap:99 AMR/8000/1
a=rtpmap:0 PCMU/8000
a=ssrc:2713728673 cname:user59375791@host-ad1117df
m=video 35946 RTP/AVP 96 97 100 101
a=rtpmap:96 H263-1998/90000
a=rtpmap:97 VP8/90000
a=rtpmap:100 MP4V-ES/90000
a=rtpmap:101 H264/90000
a=ssrc:93449274 cname:user59375791@host-ad1117df
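
When a player rejects an SDP file like this one, it helps to list exactly which codecs each media section advertises. The sketch below does that by reading the `m=` and `a=rtpmap:` lines; the function name `listCodecs` and the shortened sample SDP are assumptions for illustration.

```javascript
// Return one entry per m= section, with the codecs named by its a=rtpmap lines.
function listCodecs(sdp) {
  const media = [];
  let current = null;
  for (const line of sdp.split(/\r?\n/)) {
    const m = /^m=(\w+) (\d+) (\S+) (.*)$/.exec(line);
    if (m) {
      current = { kind: m[1], port: Number(m[2]), protocol: m[3], codecs: [] };
      media.push(current);
      continue;
    }
    const r = /^a=rtpmap:(\d+) ([^/]+)\/(\d+)/.exec(line);
    if (r && current) {
      current.codecs.push({ payloadType: Number(r[1]), name: r[2], clockRate: Number(r[3]) });
    }
  }
  return media;
}

const sample = [
  'm=audio 42802 RTP/AVP 98 99 0',
  'a=rtpmap:98 OPUS/48000/2',
  'a=rtpmap:0 PCMU/8000',
  'm=video 35946 RTP/AVP 96 97 100 101',
  'a=rtpmap:97 VP8/90000',
  'a=rtpmap:101 H264/90000',
].join('\n');

for (const section of listCodecs(sample)) {
  console.log(section.kind, section.port, section.codecs.map((c) => c.name).join(', '));
}
```

A generated SDP that lists several video codecs does not say which one is actually being sent, which is one reason a player may fail to pick the right decoder.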

End result : Wowza does not recognize the WebRTC SDP and cannot play the video; the media is deformed

screenshot of playing from a SDP file


Attempt 5.2 : Use a RTP SDP Endpoint ( ie a SDP file valid for a given session ) and use it to play the WebRTC media over VLC using socket input

End result : nothing plays

screenshot of VLC connected to play from socket and failure to play anything


Attempt 5.3: Create a WebRTC endpoint and connect it to an RTP endpoint via media pipelines. Also make the RTP SDP offer and answer it. Play with ffmpeg / ffplay / gst playbin

String requestRTPsdp = rtpEndpoint.generateOffer();

Write requestRTPsdp to a file to obtain an RTP connector endpoint of type application/sdp. It plays fine with gst playbin ( 10 secs, without audio ). A successful attempt to play from gst playbin:

gst-launch -vvv playbin uri=file:///tmp/test.sdp 

However, the file refuses to be played by VLC, ffplay and even Wowza. The errors were generated with

ffmpeg -i test.sdp -vcodec copy -acodec copy -f mpegts output-file.ts


ffmpeg -re -i test.sdp -vcodec h264 -acodec mp3 -f mpegts "udp://"

End result : This results in “Could not find codec parameter for stream1 ( video:h263, none )”. Other error types are “Could not write header for output file” and “output file is empty nothing was encoded”.

Error screenshots of trying to play the RTP SDP file with ffmpeg


Attempt 6 : Use a WebRTC capable media and streaming server ( eg Kurento )  to pick a live stream of VP8 .

Convert the VP8 to H264  ( ffmpeg / RTP endpoint )

Convert H264 to Mp4 using MP4 parser and pass to a streaming server  ( wowza)

End Result : Yes, it did work on Mozilla, but with considerable lag

Update : Thankfully, updates to the WebRTC standards mandated support for PCMU and the AVC/H.264 Constrained Baseline (CB) profile in the browser's media stack, removing the need to build a transcoder from scratch between WebRTC and non-WebRTC endpoints.

  • Video Codecs : RFC 7742 specifies that all WebRTC-compatible browsers must support VP8 and H.264’s Constrained Baseline profile for video.
  • Audio Codecs : RFC 7874 specifies that browsers must support at least the Opus codec as well as G.711’s PCMA and PCMU formats.

The latest WebRTC specification lists a set of codecs which all compliant browsers are required to support. Compliant browsers include Chrome 52, Firefox, Safari and Edge.
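
Given the codec names advertised in an SDP media section, a quick check can tell which of the mandatory codecs from RFC 7742 and RFC 7874 are missing. The function name and the way codec names are passed in are assumptions for this sketch.

```javascript
// Mandatory-to-implement codecs: video per RFC 7742, audio per RFC 7874.
const MANDATORY_CODECS = {
  video: ['VP8', 'H264'],
  audio: ['OPUS', 'PCMU', 'PCMA'],
};

// Return the mandatory codecs of the given kind that are absent from the offered names.
function missingMandatoryCodecs(kind, offeredNames) {
  const offered = new Set(offeredNames.map((name) => name.toUpperCase()));
  return (MANDATORY_CODECS[kind] || []).filter((codec) => !offered.has(codec));
}

// Codec names taken from the video and audio sections of the sample offer earlier on.
console.log(missingMandatoryCodecs('video', ['VP8', 'red', 'ulpfec', 'rtx']));
console.log(missingMandatoryCodecs('audio', ['opus', 'ISAC', 'G722', 'PCMU', 'PCMA', 'CN', 'telephone-event']));
```

The sample offer predates the requirement: its video section carries VP8 but no H264, which is exactly the gap the transcoding attempts were trying to bridge.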

References :

  1. RFC 7742: WebRTC Video Processing and Codec Requirements
  2. RFC 7874: WebRTC Audio Codec and Processing Requirements

Read more about Webrtc Audio Video Codecs

Streaming / broadcasting Live Video call to non webrtc supported browsers and media players

As the title of this article suggests, I am going to pen my attempts at streaming / broadcasting a live WebRTC video call to browsers and media players without WebRTC support, such as VLC, ffplay, the default video player in Linux etc.

Some of the high-level architectures for streaming WebRTC video to multiple endpoints can be viewed in the post below.

Aim : I will be attempting to create a lightweight WebRTC to raw/H264 transcoder by making my own media engine which takes input from a WebRTC peerconnection or getUserMedia. I am sharing my past experiments in the hope of helping someone with the same objective, since many endpoints without WebRTC support ( RPi, kiosks, mobile browsers ) could benefit heavily from WebRTC streaming. Even if your objective is not the same as mine, you may gain some insight into what not to do when making a media transcoder.

Attempt 1 : Use a one-to-many broadcasting API in JS

<table class="visible">
<tr>
<td style="text-align: right;">
<input type="text" id="conference-name" placeholder="Broadcast Name"> </td>
<td> <select id="broadcasting-option"> <option>Audio + Video</option> <option>Only Audio</option> <option>Screen</option> </select> </td>
<td> <button id="start-conferencing">Start Broadcasting</button> </td> </tr>
</table>
<table id="rooms-list" class="visible"></table>
<div id="participants"></div>
<script src="RTCPeerConnection-v1.5.js"></script>
<script src="firebase.js"></script>

It uses a broadcasting API. The broadcast is in one direction only; the viewers are never asked for their mic / webcam permission.

Problem : The broadcast is for WebRTC browsers only and does not support non-WebRTC players / browsers

Attempt 1.1: Stream the media directly to nodejs through a websocket

window.addEventListener('DOMContentLoaded', function () {

    var v = document.getElementById('v');
    navigator.getUserMedia = (navigator.getUserMedia ||
        navigator.webkitGetUserMedia ||
        navigator.mozGetUserMedia ||
        navigator.msGetUserMedia);

    if (navigator.getUserMedia) {
        // Request access to video only
        navigator.getUserMedia(
            {
                video: true,
                audio: false
            },
            function (stream) {
                var url = window.URL || window.webkitURL;
                v.src = url ? url.createObjectURL(stream) : stream;

                var ws = new WebSocket('ws://localhost:3000', 'echo-protocol');
                waitForSocketConnection(ws, function () {

                    console.log(" url.createObjectURL(stream)-----", url.createObjectURL(stream));
                    ws.send(url.createObjectURL(stream)); // assumed payload: the object URL logged above
                    console.log("message sent!!!");
                });
            },
            function (error) {
                alert('Something went wrong. (error code ' + error.code + ')');
            }
        );
    } else {
        alert('Sorry, the browser you are using doesn\'t support getUserMedia');
    }
});

// Make the function wait until the connection is made...
function waitForSocketConnection(socket, callback) {
    setTimeout(
        function () {
            if (socket.readyState === 1) {
                console.log("Connection is made");
                if (callback != null) {
                    callback();
                }
            } else {
                console.log("wait for connection...");
                waitForSocketConnection(socket, callback);
            }
        }, 5); // wait 5 milliseconds for the connection...
}

Problem : The video is in the form of a buffer and does not play

Attempt 2: Record the WebRTC media ( 5 secs each ) into chunks of webm format -> transfer them to the other end -> append the chunks together like a regular file

This process involved the following components :

  • Recorder Javascript library : RecordJs
  • Transfer mechanism : Record using RecordRTC.js -> send to other end for media server -> stitching together the small webm files into big one at runtime and play
  • Programs :

Code for video recorder

navigator.getUserMedia(videoConstraints, function (stream) {

    video.onloadedmetadata = function () {
        video.width = 320;
        video.height = 240;

        var options = {
            type: isRecordVideo ? 'video' : 'gif',
            video: video,
            canvas: {
                width: canvasWidth_input.value,
                height: canvasHeight_input.value
            }
        };

        recorder = window.RecordRTC(stream, options);
        recorder.startRecording();
    };
    video.src = URL.createObjectURL(stream);
}, function () {
    if (document.getElementById('record-screen').checked) {
        if (location.protocol === 'http:')
            alert('https is mandatory to capture screen.');
        else
            alert('Multi-capturing of screen is not allowed. Have you enabled flag: "Enable screen capture support in getUserMedia"?');
    } else
        alert('Webcam access is denied.');
});

Code for video append-er

var FILE1 = '1.webm';
var FILE2 = '2.webm';
var FILE3 = '3.webm';
var FILE4 = '4.webm';
var FILE5 = '5.webm';
var FILES = [FILE1, FILE2, FILE3, FILE4, FILE5];

var NUM_CHUNKS = 5;
var video = document.querySelector('video');

window.MediaSource = window.MediaSource || window.WebKitMediaSource;
if (!window.MediaSource) {
    alert('MediaSource API is not available');
}

var mediaSource = new MediaSource();
video.src = window.URL.createObjectURL(mediaSource);

function callback(e) {
    var sourceBuffer = mediaSource.addSourceBuffer('video/webm; codecs="vorbis,vp8"');
    (function readChunk_(i) {
        // Fetch the i-th recorded file and append it once it has been read.
        GET(FILES[i - 1], function (uInt8Array) {
            var file = new Blob([uInt8Array], {type: 'video/webm'});
            var reader = new FileReader();
            reader.onload = function (e) {
                // Move on only after the source buffer has finished this append.
                sourceBuffer.addEventListener('updateend', function onUpdateEnd() {
                    sourceBuffer.removeEventListener('updateend', onUpdateEnd);
                    if (i == NUM_CHUNKS) mediaSource.endOfStream();
                    else {
                        if (video.paused) {
                            video.play(); // Start playing after 1st chunk is appended.
                        }
                        readChunk_(i + 1);
                    }
                });
                sourceBuffer.appendBuffer(new Uint8Array(e.target.result));
            };
            reader.readAsArrayBuffer(file);
        });
    })(1); // Start the recursive call by self calling.
}

mediaSource.addEventListener('sourceopen', callback, false);
mediaSource.addEventListener('webkitsourceopen', callback, false);
mediaSource.addEventListener('webkitsourceended', function (e) {
    console.log('mediaSource readyState: ' + this.readyState);
}, false);

// function get the video via XHR
function GET(url, callback) {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', url, true);
    xhr.responseType = 'arraybuffer';
    xhr.onload = function (e) {
        if (xhr.status != 200) {
            alert("Unexpected status code " + xhr.status + " for " + url);
            return false;
        }
        callback(new Uint8Array(xhr.response));
    };
    xhr.send();
}

Shortcomings of this approach

  1. The webm files failed to play on most of the media players
  2. The recorder can record either a video file or an audio file at a time, not both.

Attempt 2. Chunking and media proxy

Since the previous approach failed to support non-WebRTC endpoints, the next iteration was to channel the WebRTC media via a nodejs server, thus disrupting the peer-to-peer media stream in favour of a centralized / proxied media stream. This would enable me to obtain raw media packets from the stream using low-level C-based VP8 decoder libraries and then re-encode them to H264 or other media formats suitable for the endpoints.

In theory, media could be re-encoded using the openH264 library and the frames could then be sent to players

let mediaSource = new MediaSource();
let sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs=vp9', 
    new VP9Decoder());
let buffer = await loadBuffer();

Further extending for uncompressed video

let mediaSource = new MediaSource();
let sourceBuffer = mediaSource.addSourceBuffer('video/raw; codecs=yuv420p');
for (let p of demuxPackets()) {
    let frame = await codec.decode(p);
}

At least that was the plan.

Attempt 2.1: Record the WebRTC media ( 5 secs each ) into chunks of webm format ( RecordRTC.js ) -> Use the Kurento JS script ( kws-media-api.js ) to make an HTTP endpoint to the recorded webm files -> append the chunks together like a regular file at runtime

// UI elements
function getByID(id) {
    return document.getElementById(id);
}

var recordAudio = getByID('record-audio'),
    recordVideo = getByID('record-video'),
    stopRecordingAudio = getByID('stop-recording-audio'),
    stopRecordingVideo = getByID('stop-recording-video'),
    broadcasting = getByID('broadcasting');

var canvasWidth_input = getByID('canvas-width-input'),
    canvasHeight_input = getByID('canvas-height-input');

var video = getByID('video');
var audio = getByID('audio');

// Audio video constraints
var videoConstraints = {
    audio: false,
    video: {
        mandatory: {},
        optional: []
    }
};

var audioConstraints = {
    audio: true,
    video: false
};

// Recording and stop recording - to be converted into real time capture and chunking
const ws_uri = 'ws://localhost:8888/kurento';
var URL_SMALL = "http://localhost:8080/streamtomp4/approach1/5561840332.webm";

var audioStream;
var recorder;

recordAudio.onclick = function () {
    if (!audioStream)
        navigator.getUserMedia(audioConstraints, function (stream) {
            if (window.IsChrome) stream = new window.MediaStream(stream.getAudioTracks());
            audioStream = stream;
            audio.src = URL.createObjectURL(audioStream);
            audio.muted = true;
            // "audio" is a default type
            recorder = window.RecordRTC(stream, {
                type: 'audio'
            });
            recorder.startRecording();
        }, function () {
            // microphone access denied or unavailable: nothing is recorded
        });
    else {
        audio.src = URL.createObjectURL(audioStream);
        audio.muted = true;
        audio.play();
        if (recorder) recorder.startRecording();
    }
    window.isAudio = true;
    this.disabled = true;
    stopRecordingAudio.disabled = false;
};

Recording and stop recording video into small media files ( chunks )

recordVideo.onclick = function () {
    recordVideoOrGIF(true);
};

stopRecordingAudio.onclick = function () {
    this.disabled = true;
    recordAudio.disabled = false;
    audio.src = '';

    if (recorder)
        recorder.stopRecording(function (url) {
            audio.src = url;
            audio.muted = false;

            document.getElementById('audio-url-preview').innerHTML = '<a href="' + url + '" target="_blank">Recorded Audio URL</a>';
        });
};

function recordVideoOrGIF(isRecordVideo) {
    navigator.getUserMedia(videoConstraints, function (stream) {

        video.onloadedmetadata = function () {
            video.width = 320;
            video.height = 240;

            var options = {
                type: isRecordVideo ? 'video' : 'gif',
                video: video,
                canvas: {
                    width: canvasWidth_input.value,
                    height: canvasHeight_input.value
                }
            };

            recorder = window.RecordRTC(stream, options);
            recorder.startRecording();
        };
        video.src = URL.createObjectURL(stream);
    }, function () {
        if (document.getElementById('record-screen').checked) {
            if (location.protocol === 'http:')
                alert('<https> is mandatory to capture screen.');
            else
                alert('Multi-capturing of screen is not allowed. Capturing process is denied. Have you enabled flag: "Enable screen capture support in getUserMedia"?');
        } else
            alert('Webcam access is denied.');
    });

    window.isAudio = false;

    if (isRecordVideo) {
        recordVideo.disabled = true;
        stopRecordingVideo.disabled = false;
    } else {
        // recordGIF / stopRecordingGIF are not declared in this snippet
        recordGIF.disabled = true;
        stopRecordingGIF.disabled = false;
    }
}

stopRecordingVideo.onclick = function () {
    this.disabled = true;
    recordVideo.disabled = false;

    if (recorder)
        recorder.stopRecording(function (url) {
            video.src = url;
            document.getElementById('video-url-preview').innerHTML = '<a href="' + url + '" target="_blank">Recorded Video URL</a>';
        });
};


Broadcasting the chunks to media engine

function onerror(error) {
    console.log(" error occurred");
    console.error(error);
}

broadcasting.onclick = function () {
    var videoOutput = document.getElementById("videoOutput");
    KwsMedia(ws_uri, function (error, kwsMedia) {
        if (error) return onerror(error);
        // Create pipeline
        kwsMedia.create('MediaPipeline', function (error, pipeline) {
            if (error) return onerror(error);
            // Create pipeline media elements (endpoints & filters)
            pipeline.create('PlayerEndpoint', {uri: URL_SMALL}, function (error, player) {
                if (error) return console.error(error);

                pipeline.create('HttpGetEndpoint', function (error, httpGet) {
                    if (error) return onerror(error);
                    // Connect media element between them
                    player.connect(httpGet, function (error, pipeline) {
                        if (error) return onerror(error);
                        // Set the video on the video tag
                        httpGet.getUrl(function (error, url) {
                            if (error) return onerror(error);
                            videoOutput.src = url;
                            // Start player
                            player.play(function (error) {
                                if (error) return onerror(error);
                            });
                        });
                    });

                    // Subscribe to HttpGetEndpoint EOS event
                    httpGet.on('EndOfStream', function (event) {
                        console.log("EndOfStream event:", event);
                    });
                });
            });
        });
    }, onerror);
};

Problem: splitting the live video into small files and appending them to each other on reception is an expensive, time- and resource-consuming process. It also involves heavy buffering and other problems that come with real-time streaming.

Attempt 2.2: Send the recorded WebM chunks to a port on a Linux server. Use socket programming to pick up these individual files and play them with VLC player from a UDP port of the Linux server.

Screenshot from 2015-01-22 15:32:51

End result: the small file containers play, but slow buffering makes this approach unsuitable for streaming file chunks and appending them into a single file.

Attempt 2.3: Send the recorded WebM chunks to a socket on a Linux server. Use socket programming to pick up these individual WebM files and convert them to H.264 so that they can be sent to a media server.

This process involved the following components :

  • Recorder JavaScript library: RecordJs
  • Transfer mechanism: WebRTC endpoint -> call handler (record in chunks) -> ffmpeg / gstreamer to put it on RTP -> streaming server like Wowza -> viewers
  • Programs: HTML webpage with a WebSocket connection -> nodejs program to write content from the websocket to a Linux socket -> nodejs program to read that socket and print the content on the console

Snippet to transfer the webm recorder files over websocket to nodejs program

// Make the function wait until the connection is made.
function waitForSocketConnection(socket, callback) {
    setTimeout(
        function () {
            if (socket.readyState === 1) {
                console.log("Connection is made");
                if (callback != null) {
                    callback();
                }
            } else {
                console.log("wait for connection...");
                waitForSocketConnection(socket, callback);
            }
        }, 5); // wait 5 milliseconds for the connection...
}

function previewFile() {
    var preview = document.querySelector('img');
    var file = document.querySelector('input[type=file]').files[0];
    var reader = new FileReader();

    reader.onloadend = function () {
        preview.src = reader.result;
        console.log(" reader result ", reader.result);

        var video = document.getElementById("v");
        video.src = reader.result;
        console.log(" video played ");

        var ws = new WebSocket('ws://localhost:3000', 'echo-protocol');
        waitForSocketConnection(ws, function () {
            // Assumption: the send call is missing from the original snippet
            ws.send(reader.result);
            console.log("message sent!!!");
        });
    };

    if (file) {
        // converts to base64 encoded string of the file data
        reader.readAsDataURL(file);
    } else {
        preview.src = "";
    }
}

Program for Linux Sockets sender which creates the socket for the webm files in nodejs

var net = require('net');
var fs = require('fs');
var socketPath = '/tmp/tfxsocket';
var http = require('http');
var stream = require('stream');
var util = require('util');

var WebSocketServer = require('ws').Server;
var port = 3000;
var serverUrl = "localhost";

var socket;
/*----------http server -----------*/
var server = http.createServer(function (request, response) {});
server.listen(port, serverUrl);
console.log('HTTP Server running at ', serverUrl, port);

/*------websocket server ----------*/
var wss = new WebSocketServer({server: server});

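wss.on("connection", function (ws) {
    console.log("websocket connection open");
    ws.on('message', function (message) {
        console.log(" stream received from broadcast client on port 3000 ");
        var s = new net.Socket();
        // Assumption: the connect/write lines are missing from the original
        // snippet; this forwards each message to the Unix socket at socketPath.
        s.connect(socketPath, function () {
            s.end(message);
        });
        console.log(" send the stream to socketPath", socketPath);
    });

    ws.on("close", function () {
        console.log("websocket connection close");
    });
});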

Program for the Linux socket listener using nodejs. Here the socket is at /tmp/mysocket

var net = require('net');
var client = net.createConnection("/tmp/mysocket");
client.on("connect", function() {
    console.log("connected to mysocket");
});
client.on("data", function(data) {
    // Print the content read from the socket on the console
    console.log(data.toString());
});
client.on('end', function() {
    console.log('server disconnected');
});

Output 1: Video Buffer displayed

Screenshot from 2015-01-22 15:35:06 (copy)

Output 2 : Payload from Video displayed that shows the pipeline works but no output yet.

Screenshot from 2015-01-23 12:57:35

ffmpeg format for transferring the content from the socket to a UDP IP and port

ffmpeg -i unix://tmp/mysocket -f format udp://

Problems with this approach: the video was only in transit through the socket and carried no usable content when we tried to play it or print it to the console.

Attempt 3: Use an existing media engine like Kurento to do the transcoding for me

Send the live WebRTC stream from Kurento WebRTC endpoint to Kurento HTTP endpoint then play using Mozilla VLC web plugin

VLC mozilla plugin can be embedded by :

<embed type="application/x-vlc-plugin"
autoplay="yes" loop="no" hidden="no"
target="rtp://@" />

Screenshots of the Mozilla VLC plugin failing to play from a WebRTC endpoint

Screenshot from 2015-01-29 10:37:06
Screenshot from 2015-01-29 10:37:17
Screenshot from 2015-01-29 12:06:14

Problem: the VLC Mozilla plugin was unable to play the video, and Mozilla-only playback was a difficult option for most consumers.

Continued in the next article: Streaming / broadcasting Live Video call to non webrtc supported browsers and media players


TFX platform

So I haven’t written anything worthy in a while, just published some posts that were lying around in my drafts. Here I write about the main thing: something awesome that I was trying to accomplish in the last quarter.

<< TFX is now live in chrome store , open and free for public use . No signin or account required , no advertisements   : >>

TFX Sessions is a plug-and-play platform for VoIP (voice over IP) scenarios. Intrinsically it is a very lightweight API package, shipped in the form of a Chrome Extension. It is a turn-key solution when parties want instant audio/video communication without any sign-in, plugin installation or additional downloads. Additionally, TFX Sessions is packaged with some interesting plugins which give the communicating parties an interactive and immersive experience, as in a face-to-face meeting.

There is a market requirement for an utterly simple WebRTC API that has everything needed to build bigger aggregate projects, but the available solutions are either too basic or much too complex. So I initially started writing my own getUserMedia APIs, but left it midway and picked up the SimpleWebRTC API instead for want of time. Then I focused on the main crux of the project, which was the widget API and ease of integration.

How Does TFX Sessions Work ?

  • Signalling channel establishes the session using Offer- Answer Model
  • Browser’s media APIs, like getUserMedia and PeerConnection, are used for media flow
  • Media only flows peer to peer
TFX WebRTC platform architecture . socket io signalling

A widget is essentially any web project that wants communication over a WebRTC channel. Once the platform was ready I had core APIs, widgets and a signalling server. Then came up the subject of enterprise networks blocking my communication stream. Time for TURN (Coturn in my case).

TFX create / join room
TFX startup screen

Components of TFX

Client Side Components of tangoFX :

broplug API
In-house master library for TangoFX. It makes up the TFX Sessions platform and masks the low-level WebRTC and socket.io functions.
It provides simple-to-use handles for developing interesting plugins on the platform.
It handles WebRTC support detection, peer configuration, WildEmitter, utils, event emitters, JSON formatting, WebSocket, socket namespaces, transport, XHR and more.
Exports and listeners for real-time, event-based bidirectional communication
JS library for client-side scripting
HTML and CSS-based design templates for typography, forms, buttons, navigation and other interface components.

Server Side components

Signalling Server -signal master server for webrtc signalling
TURN -Coturn
TURN protocol based media traversal for connecting media across restricting domains ie firewalls, network policies etc .
redis Data structure to maintain current and lost sessions

TFXSessions Components


So here is the final architecture of TFX chrome extension widget based platform .

  • The client side contains widgets, core extension APIs, Chrome’s WebRTC APIs, the client for signalling, HTML5, jQuery, JavaScript, and CSS for styling.
  • The server side of the solution contains the server for signalling, manuals and other help/support materials, the HTTPS certificate and the TURN server implementation for NAT.
TFX whitepaper v2.0
TFX platform Server client components . WebRTC media and socketio communication . Build as chrome Extension

Salient Features

  • The underlying technology of TangoFX is WebRTC with socket-based signalling. It also adheres to the latest standards of W3C, IETF and ITU on internet telephony.
  • TangoFX Sessions is extremely scalable and flexible due to the abstraction between communication and service development. This makes it a piece of cake for any web developer using the TangoFX interface to add his/her own service easily and quickly without diving into the nitty-gritty.
  • TangoFX is currently packaged as a Chrome extension, supported on the Chrome browser on desktop operating systems like Windows, Mac and Linux.
  • The call is private to both parties as it is peer-to-peer, meaning that the media / information exchanged by the parties over TangoFX does not pass through an intervening server as in other existing internet calling solutions.
  • TangoFX is very adaptive to slow internet and can be used across all kinds of networks, such as corporate to public, without being affected by firewalls or restricting policies.

TFX Widget Screens

Alright, so that’s there. Tada, the platform is alive and kicking. It is in beta stage right now, with intensive testing going on. Here are some screenshots from my own developer version.

TFX recording widget
TFX face detection and overlay widget
TFX multilingual communication
TFX screen-sharing
TFX video Filters
TFX audio visualizer
TFX text messaging widget
TFX cross domain access . flicker here
TFX draw widget
TFX code widget supports many programming languages
TFX webrtc dynamic stats
TFX introduction widget

Note that the widgets described above have been made with the help of third party APIs.

TFX Sessions Summary

We saw that TFX is a WebRTC-based communication and collaboration solution. It is built on open standards from W3C, IETF, Google etc.
It is scalable and customizable, and offers an immersive and interactive experience.
It provides an easy-to-build widgets framework using TangoFX APIs.

TFX User Manual :

TFX Developer’s Manual :

TangoFX v1 demo on youtube
TangoFX research paper on
TangoFX article on Linkedin

WebRTC SDKs Analysis

In the last few months, I have been keenly tracking how the course for WebRTC is turning out. In my opinion, it is an incredible game-changer and a market disrupter for the telco industry plagued by licensed codecs and heavy call session control software.

Contrary to my expectations the fundamental holes in WebRTC specification are still the same with less being done to fulfil them – Desktop sharing on Chrome Extension, media compatibility with desktop popular H264 being prominent obstacles to adoption. Of course, now there exists an abundance of interactive use-cases for WebRTC APIs from gaming to telemedicine. However, none of the applications is complete and standalone since each uses a new gateway to connect to their existing platform or service.

As new WebRTC SDKs and open-source platforms surface, many seem to be wrapping around the same old WebRTC functions (getUserMedia, data channel and peer connection) with few or no add-ons. I am listing some popular working ones in this blog, but there still exists no concrete, stable, reliable guide to set up the backbone network (yes, I am referring to media interconversion, relay, TURN and STUN servers). It is left to a telecom software engineer/developer to find and figure out the best integrations to configure session handling and PSTN or desktop application compatibility.

Some commercial off-the-shelf service providers have begun to extend interconnecting gateways (SBCs) for their backend for Web-JavaScript based WebRTC implementations, but there are concerns about end-to-end encryption and media management as the media passes via a transcoding media server and many points of relay. This, in my opinion, completely defeats the objective of WebRTC’s peer-to-peer communication, which by design is supposed to be independent of a centralised server setup. WebRTC was meant to do *everything you can’t do with proprietary communication tools and networks*.

Well moving on , here are some nice API implementations of WebRTC ( only for Websockets no SIP/WebSockets ) which can be quickly used by web developers to create peer to peer media session on web endpoints via a WebRTC supported browser web page.



Neat process of setting up offer-answer and SDP . Notice the Relay candidate gathering 


Session Description (SDP) for the WebRTC peer with audio / video codecs and other session specifications such as bitrate, framerate, codec profile, RTP specs etc.


No over the top media control which is good as media flows end to end here without any centralized media server .


Also related to SimpleWebRTC, which is a lightweight MIT-licensed library providing wrappers around the core WebRTC API to support application building while hiding the lower-level peer connection and session management from the developer.




Similar offer-answer handshake and SDP exchange.

talky4 talky5 talky6

There seems to be better noise control management which could be my browser acting on my fluctuating network bandwidth as well . 

3. tokbox


More control from UI on Media settings which is provided as part of getusermedia in WebRTC specification. Read more about the webrtc APIs in my other writeup 


Similar to the previous WebRTC session (totally dependent on my own network and CPU), independent of any third-party control.

As my peer's network degraded, my WebRTC session automatically adjusted, based on RTCP feedback, to send a lower resolution and framerate.


4. webrtcdevelopment

Open-source MIT-licensed library to spin up WebRTC calls quickly and easily from a Chrome-supported browser. It is a fork of the best features from multiple libraries such as apprtc, webrtcexperiments and simplewebrtc, and is maintained by an open community of users including me.

WebRTC Media Stack is explained in following articles

Call Continuity from Mobile GSM network to WebRTC

In the present age of IP telephony, when telecom convergence is the big thing all around the world, the need of the hour is to enable fixed and mobile Service Providers (SPs) to monetize the subscriber’s phone number by extending it to new web-based services. SPs can offer a WebRTC Communicator endpoint that uses the same phone number as the subscriber’s fixed or mobile phone.

Advanced features enable calls to be transferred between fixed-line, mobile and WebRTC endpoints.

Find the diagram depicting this below :

Transfer mobile call to WebRTC session

SPs can offer 3rd Party WebRTC endpoints to access the user’s phone number and subscription . E.g. enable web applications such as Facebook, Amazon or Netflix to allow their users to make/receive calls or messages directly from the web applications

Revenue Streams :

  • Monthly fee for access to WebRTC endpoints and for receiving calls from 3rd party WebRTC endpoints
  • One-time upgrade fee for accessing the web service integration with the telecom network, like a plan upgrade

Brownie points

  • No software is required to be downloaded on the subscriber’s computer, tablet or mobile phone
  • No desktop support required for the service provider

Plans For Consumer Customers:

  • Subscribers can use the WebRTC endpoints on their computers, tablets or mobile phones as a fixed-line device at home, as a desktop solution when away from home and to avoid international tolls when traveling
  • Subscribers can connect their web services (e.g. Websites , Facebook, Amazon, Netflix) to their fixed or mobile services subscriptions using their SP-provided phone number

Plans For SP Enterprise Customers:

  • Enterprises can deploy a WebRTC endpoint for their employees that provides a single corporate communications endpoint that can be connected to any of the corporation’s UC/PBX and Call Recording systems
  • Employees can use the WebRTC endpoint as their office phone at work, home or when traveling
  • Connects to all leading UC/PBX and Recording platforms simultaneously
  • Enterprises can deploy a single WebRTC endpoint across all their UC/PBX and Recording platforms – current and future
  • Easy for IT departments to deploy – no software is required to be downloaded to employees’ computers, tablets or mobile phones
  • Enables corporate policies and features from the WebRTC endpoint including
  • Displaying the corporate identity
  • Routing calls via corporate networks
  • Tracking and Recording calls and messages

WebRTC Media Streams and Quality metrics

Media Stream Tracks in WebRTC

The MediaStreamTrack interface typically represents a stream of data of audio or video and a MediaStream may contain zero or more MediaStreamTrack objects.

The objects RTCRtpSender and RTCRtpReceiver can be used by the application to get more fine grained control over the transmission and reception of MediaStreamTracks.

Media Flow in VoIP system
Media Flow in WebRTC Call

Video Streams

Video capture in sync with the hardware’s capabilities

WebRTC compatible browsers are required to support white balance, light level and autofocus from the video source.

Video Capture Resolution

Unless specified otherwise in SDP (Session Description Protocol), the minimum WebRTC video attributes are 20 FPS and a resolution of 320 x 240 pixels.

Mid-stream resolution changes are also supported, such as when the source is a screen capture from desktop sharing.

SDP attributes for resolution, frame rate, and bitrate

SDP allows for codec-independent indication of preferred video resolutions using a=imageattr to indicate the maximum resolution that is acceptable. 

The sender must limit the encoded resolution to the indicated maximum size, as the receiver may not be capable of handling higher resolutions.
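As a rough sketch of how a sender could honour that limit, the snippet below reads the largest "recv" size from an a=imageattr line and scales a capture size down to fit. The sample line, the function names and the payload type 96 are my own illustration, and the parser only handles fixed sizes such as [x=640,y=480], not the ranges the SDP grammar also allows.

```javascript
// Parse the "recv" sets of an a=imageattr line and return the largest one.
// Simplification: only fixed sizes such as [x=640,y=480] are handled;
// ranges and step values allowed by the SDP grammar are skipped.
function maxRecvResolution(line) {
  const recv = /\brecv((?:\s+\[[^\]]*\])+)/.exec(line);
  if (!recv) return null;
  let best = null;
  for (const set of recv[1].match(/\[[^\]]*\]/g)) {
    const m = /x=(\d+),y=(\d+)/.exec(set);
    if (!m) continue;
    const size = { width: Number(m[1]), height: Number(m[2]) };
    if (!best || size.width * size.height > best.width * best.height) best = size;
  }
  return best;
}

// Scale a capture size down (never up) so that it fits inside the maximum.
function fitWithin(source, max) {
  const scale = Math.min(1, max.width / source.width, max.height / source.height);
  return {
    width: Math.floor(source.width * scale),
    height: Math.floor(source.height * scale),
  };
}

// Hypothetical attribute line, as a receiver might advertise it.
const attr = 'a=imageattr:96 send [x=1280,y=720] recv [x=640,y=480] [x=320,y=240]';
const limit = maxRecvResolution(attr);
console.log(limit, fitWithin({ width: 1280, height: 720 }, limit));
```

Scaling by a single factor keeps the aspect ratio of the source, which is why a 1280 x 720 capture ends up narrower than the 640 x 480 limit rather than being stretched to it.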

Dynamic FPS control based on actual hardware encoding

The video source capture adjusts the frame rate according to low bandwidth, poor light conditions and the hardware-supported rate rather than forcing a higher FPS.

Stream Orientation

Endpoints support generating the R0 and R1 bits of the Coordination of Video Orientation (CVO) mechanism and sharing them with the peer.

Audio Streams

Audio Level

Audio level for speech transmission to avoid users having to manually adjust the playback and to facilitate mixing in conferencing applications.

Normalization is considering frequencies above 300 Hz, regardless of the sampling rate used. Can be adapted to avoid clipping, either by lowering the gain to a level below -19 dBm0 or through the use of a compressor.

GAIN calculation

  • If the endpoint has control over the entire audio-capture path like a regular phone
    the gain should be adjusted in such a way that an average speaker would have a level of 2600 (-19 dBm0) for active speech.
  • If the endpoint does not have control over the entire audio capture like software endpoint
    then the endpoint SHOULD use automatic gain control (AGC) to dynamically adjust the level to 2600 (-19 dBm0) +/- 6 dB.
  • For music- or desktop-sharing applications, the level SHOULD NOT be automatically adjusted, and the endpoint SHOULD allow the user to set the gain manually.
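The gain rules above boil down to a small calculation. Here is a minimal sketch, assuming the measured speech level is on the same 16-bit scale as the 2600 target; the function names are mine, not from any WebRTC API.

```javascript
// Target level from the text: 2600 on a 16-bit scale (-19 dBm0).
const TARGET_LEVEL = 2600;
const AGC_TOLERANCE_DB = 6;

// Gain in dB needed to move a measured level to the target level.
function gainToTargetDb(measuredLevel, targetLevel = TARGET_LEVEL) {
  if (measuredLevel <= 0) throw new RangeError('measuredLevel must be positive');
  return 20 * Math.log10(targetLevel / measuredLevel);
}

// True when the measured level already sits within +/- 6 dB of the target,
// which is the window the AGC is expected to hold.
function withinAgcWindow(measuredLevel) {
  return Math.abs(gainToTargetDb(measuredLevel)) <= AGC_TOLERANCE_DB;
}

console.log(gainToTargetDb(1300).toFixed(2)); // gain for a talker at half the target level
console.log(withinAgcWindow(2000), withinAgcWindow(1000));
```

For music or desktop sharing this calculation would simply not be applied, since the level should be left to the user.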

Acoustic Echo Cancellation (AEC)

Endpoints allow echo control mechanisms

SDP signaling and negotiation for media plane

Media plane adaptation is done at the SBC for network-carried media; it should be done for all network-hosted media services which face peer-to-peer media.

The high-level architecture elements of WebRTC media streams consist of

  • Encryption, RTP Multiplexing, Support for ICE
  • Audio – Interworking of differing WebRTC and codec sets
  • Video – Use of VP8, Support for H.264
  • Data – Support of MSRP ( RCS standard for messaging over DataChannel API)

Media Source

RTCVideoSource_4 (media-source)

timestamp	03/01/2022, 23:07:05
trackIdentifier	1bcab53d-1eca-41d1-a96a-00f1458c9b1b
kind	video
width	640
height	480
frames	7556
framesPerSecond	30

RTCAudioSource_3 (media-source)

timestamp	        03/01/2022, 23:06:26
trackIdentifier	        12cb979c-b40f-4de7-8b50-be6f4425e0b2
kind	                audio
audioLevel	        0.020599993896298106
totalAudioEnergy	1.8476431267450812
[Audio_Level_in_RMS]	0.02541394245734895
totalSamplesDuration	213.66999999995065
echoReturnLoss	        -0.11197675950825214
echoReturnLossEnhancement 8.111690521240234
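The [Audio_Level_in_RMS] row in the dump is not a raw counter; it is derived from totalAudioEnergy and totalSamplesDuration. A small sketch of that derivation between two snapshots, assuming both objects come from consecutive getStats() reads (the snapshot values below are made up):

```javascript
// RMS audio level over the interval between two stats snapshots:
// sqrt(delta totalAudioEnergy / delta totalSamplesDuration).
function rmsAudioLevel(prev, curr) {
  const energy = curr.totalAudioEnergy - prev.totalAudioEnergy;
  const duration = curr.totalSamplesDuration - prev.totalSamplesDuration;
  if (duration <= 0) return 0;
  return Math.sqrt(energy / duration);
}

// Hypothetical snapshots taken one second apart.
const earlier = { totalAudioEnergy: 1.8470, totalSamplesDuration: 212.67 };
const later = { totalAudioEnergy: 1.8476, totalSamplesDuration: 213.67 };
console.log(rmsAudioLevel(earlier, later));
```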

Peer-to-Peer Media Stream

Direct connection to media servers and media gateways.

Use a common codec set wherever possible to eliminate transcoding. Use regionalized transcoding where a common codec is not available. Real-time video transcoding is expensive and impacts performance.

On-going standards/device/network work needs to be done to expand common codec set. WebRTC codec standards have not been finalized yet. WebRTC target is to support royalty free codecs within its standards.

Audio: G.711, Opus | G.711, AMR, AMR-WB (G.722.2)
Audio – Extended: G.729a[b], G.726

Supporting common codecs between VoLTE devices and WebRTC endpoints requires one or more of the following:

  1. Support of WebRTC codecs on 3GPP/GSMA
  2. Support of 3GPP/GSMA codecs on WebRTC
  3. WebRTC browser support of codecs native to the device

RTP streams and RTCP stats Outbound Video

Chrome Browser on ubuntu OS

RTCOutboundRTPVideoStream_3305924664 (outbound-rtp)

timestamp	03/01/2022, 22:23:32
ssrc	3305924664
kind	video
trackId	RTCMediaStreamTrack_sender_4
transportId	RTCTransport_0_1
codecId	RTCCodec_1_Outbound_96
[codec]	VP8 (96)
mediaType	video
mediaSourceId	RTCVideoSource_4
packetsSent	171360
[packetsSent/s]	204.02266754223697
retransmittedPacketsSent	620
[retransmittedPacketsSent/s]	0
bytesSent	177210957
[bytesSent_in_bits/s]	1680050.6587655507
headerBytesSent	4218672
[headerBytesSent_in_bits/s]	39812.423281967494
retransmittedBytesSent	668008
[retransmittedBytesSent_in_bits/s]	0
framesEncoded	22003
[framesEncoded/s]	30.00333346209367
keyFramesEncoded	14
totalEncodeTime	418.017
[totalEncodeTime/framesEncoded_in_ms]	9.533333333333378
totalEncodedBytesTarget	0
[totalEncodedBytesTarget_in_bits/s]	0
framesSent	22003
[framesSent/s]	30.00333346209367
hugeFramesSent	1
totalPacketSendDelay	29963.73
[totalPacketSendDelay/packetsSent_in_ms]	31.62745098039772
qualityLimitationReason	none
qualityLimitationDurations	{bandwidth:0,cpu:174895,none:717684,other:0}
qualityLimitationResolutionChanges	0
encoderImplementation	libvpx
firCount	0
pliCount	2
nackCount	161
remoteId	RTCRemoteInboundRtpVideoStream_3305924664
frameWidth	640
frameHeight	480
framesPerSecond	30
qpSum	151000
[qpSum/framesEncoded]	9.3
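The bracketed rows such as [bytesSent_in_bits/s] are rates that the stats page derives from the cumulative counters. A minimal sketch of the same arithmetic, assuming two snapshots of an outbound-rtp report with millisecond timestamps (the snapshot values are hypothetical):

```javascript
// Rate of a cumulative byte counter between two stats snapshots.
// `timestamp` is in milliseconds.
function bitsPerSecond(prev, curr, field) {
  const seconds = (curr.timestamp - prev.timestamp) / 1000;
  if (seconds <= 0) return 0;
  return ((curr[field] - prev[field]) * 8) / seconds;
}

// Average encode time per frame, in milliseconds, over the same interval.
function encodeMsPerFrame(prev, curr) {
  const frames = curr.framesEncoded - prev.framesEncoded;
  if (frames <= 0) return 0;
  return ((curr.totalEncodeTime - prev.totalEncodeTime) / frames) * 1000;
}

// Hypothetical snapshots taken one second apart.
const before = { timestamp: 1000, bytesSent: 177000000, framesEncoded: 21973, totalEncodeTime: 417.731 };
const after = { timestamp: 2000, bytesSent: 177210000, framesEncoded: 22003, totalEncodeTime: 418.017 };
console.log(bitsPerSecond(before, after, 'bytesSent'), encodeMsPerFrame(before, after));
```

In a browser the two snapshots would come from successive calls to getStats() on the peer connection; here they are plain objects so the arithmetic can be checked on its own.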

RTCP statistics RTCRemoteInboundRtpVideoStream_3305924664 (remote-inbound-rtp)

timestamp	03/01/2022, 22:25:29
ssrc	984864038
kind	audio
transportId	RTCTransport_0_1
codecId	RTCCodec_0_Outbound_111
jitter	0.026854166666666665
packetsLost	19
localId	RTCOutboundRTPAudioStream_984864038
roundTripTime	0.048
fractionLost	0
totalRoundTripTime	8.932
roundTripTimeMeasurements	201



After a considerable time (10 minutes in my case) the quality of the media stream adjusts to network conditions, and the variations (peaks and dips) flatten out.

After some time has passed

Bytes Sent and Received

After some time has passed

Outbound Audio from Ubuntu Chrome Browser

RTCOutboundRTPAudioStream_984864038 (outbound-rtp)

timestamp	03/01/2022, 22:13:26
ssrc	984864038
kind	audio
trackId	RTCMediaStreamTrack_sender_3
transportId	RTCTransport_0_1
codecId	RTCCodec_0_Outbound_111
[codec]	opus (111, minptime=10;useinbandfec=1)
mediaType	audio
mediaSourceId	RTCAudioSource_3
packetsSent	14292
[packetsSent/s]	50.003051944088384
retransmittedPacketsSent	0
[retransmittedPacketsSent/s]	0
bytesSent	1151754
[bytesSent_in_bits/s]	32449.980589635597
headerBytesSent	400176
[headerBytesSent_in_bits/s]	11200.683635475798
retransmittedBytesSent	0
[retransmittedBytesSent_in_bits/s]	0
nackCount	0
remoteId	RTCRemoteInboundRtpAudioStream_984864038

RTCP statistics RTCRemoteInboundRtpAudioStream_984864038 (remote-inbound-rtp)

timestamp	03/01/2022, 22:17:05
ssrc	984864038
kind	audio
transportId	RTCTransport_0_1
codecId	RTCCodec_0_Outbound_111
jitter	0.002
packetsLost	3
localId	RTCOutboundRTPAudioStream_984864038
roundTripTime	0.023
fractionLost	0
totalRoundTripTime	4.344
roundTripTimeMeasurements 98	
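The instantaneous roundTripTime can be noisy, so it helps to also look at the average over the call, which follows from the two cumulative fields in the report. A small sketch, with the function name being my own:

```javascript
// Average round trip time in milliseconds from a remote-inbound-rtp report.
// totalRoundTripTime is in seconds; returns null when nothing was measured.
function averageRttMs(report) {
  if (!report.roundTripTimeMeasurements) return null;
  return (report.totalRoundTripTime / report.roundTripTimeMeasurements) * 1000;
}

// Values taken from the two remote-inbound-rtp dumps above.
console.log(averageRttMs({ totalRoundTripTime: 8.932, roundTripTimeMeasurements: 201 }));
console.log(averageRttMs({ totalRoundTripTime: 4.344, roundTripTimeMeasurements: 98 }));
```

Both dumps work out to an average of roughly 44 ms, even though the instantaneous readings were 48 ms and 23 ms.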

Inbound Video from Android Webrtc Browser

RTCInboundRTPVideoStream_3384287918 (inbound-rtp)

timestamp	03/01/2022, 22:55:35
ssrc	3384287918
kind	video
trackId	RTCMediaStreamTrack_receiver_4
transportId	RTCTransport_0_1
mediaType	video
jitter	0.027
packetsLost	78
packetsReceived	79545
[packetsReceived/s]	0
bytesReceived	77156700
[bytesReceived_in_bits/s]	0
headerBytesReceived	1978716
[headerBytesReceived_in_bits/s]	0
jitterBufferDelay	2284.024
[jitterBufferDelay/jitterBufferEmittedCount_in_ms]	0
jitterBufferEmittedCount	13100
framesReceived	13101
[framesReceived/s]	0
[framesReceived-framesDecoded]	0
framesDecoded	13101
[framesDecoded/s]	0
keyFramesDecoded	1
[keyFramesDecoded/s]	0
framesDropped	0
totalDecodeTime	94.229
[totalDecodeTime/framesDecoded_in_ms]	0
totalInterFrameDelay	442.0259999999831
[totalInterFrameDelay/framesDecoded_in_ms]	0
totalSquaredInterFrameDelay	20.370232000000772
[interFrameDelayStDev_in_ms]	0
decoderImplementation	libvpx
firCount	0
pliCount	2
nackCount	51
codecId	RTCCodec_1_Inbound_96
[codec]	VP8 (96)
lastPacketReceivedTimestamp	1641276962171
[lastPacketReceivedTimestamp]	03/01/2022, 22:16:02
frameWidth	480
frameHeight	640
framesPerSecond	4
qpSum	97949
[qpSum/framesDecoded]	0
estimatedPlayoutTimestamp	3850268134980
[estimatedPlayoutTimestamp]	03/01/2092, 22:55:42
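The 2092 date shown for estimatedPlayoutTimestamp looks odd next to the other 2022 timestamps. My reading is that the raw value is an NTP timestamp in milliseconds (counted from 1900) that the stats page renders as if it were Unix time (counted from 1970). A short sketch of the conversion under that assumption:

```javascript
// Offset between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
const NTP_TO_UNIX_OFFSET_MS = 2208988800 * 1000;

// Convert an NTP timestamp in milliseconds to a JavaScript Date.
function ntpMsToDate(ntpMs) {
  return new Date(ntpMs - NTP_TO_UNIX_OFFSET_MS);
}

// Raw value from the inbound video dump above.
console.log(ntpMsToDate(3850268134980).toISOString()); // 2022-01-04T06:55:34.980Z
```

Converted this way the value lands within a second of the report's own timestamp once the local time zone of the capture is taken into account, which supports the NTP interpretation.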


Inbound Audio from Android Webrtc Browser

RTCInboundRTPAudioStream_579305270 (inbound-rtp)

timestamp	03/01/2022, 22:50:14
ssrc	579305270
kind	audio
trackId	RTCMediaStreamTrack_receiver_3
transportId	RTCTransport_0_1
mediaType	audio
jitter	0.003
packetsLost	208
packetsDiscarded	0
packetsReceived	124469
[packetsReceived/s]	50.03320990953163
fecPacketsReceived	0
fecPacketsDiscarded	0
bytesReceived	4433321
[bytesReceived_in_bits/s]	14209.431614306981
headerBytesReceived	3485132
[headerBytesReceived_in_bits/s]	11207.439019735084
jitterBufferDelay	17887008
[jitterBufferDelay/jitterBufferEmittedCount_in_ms]	113.79999999996896
jitterBufferEmittedCount	119485440
totalSamplesReceived	118645920
[totalSamplesReceived/s]	48031.88151315036
concealedSamples	689415
[concealedSamples/s]	0
[concealedSamples/totalSamplesReceived]	0
silentConcealedSamples	338882
[silentConcealedSamples/s]	0
concealmentEvents	230
insertedSamplesForDeceleration	33841
[insertedSamplesForDeceleration/s]	0
removedSamplesForAcceleration	1562246
[removedSamplesForAcceleration/s]	0
totalAudioEnergy	4.458078675648182
[Audio_Level_in_RMS]	0
totalSamplesDuration	2472.2900000075438
codecId	RTCCodec_0_Inbound_111
[codec]	opus (111, minptime=10;useinbandfec=1)
lastPacketReceivedTimestamp	1641279014658
[lastPacketReceivedTimestamp]	03/01/2022, 22:50:14
audioLevel	0
remoteId	RTCRemoteOutboundRTPAudioStream_579305270
estimatedPlayoutTimestamp	3850267813642
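Raw packetsLost counts are hard to compare across streams with different packet rates, so a loss percentage is more useful. A minimal sketch using the counters from the two inbound dumps above; the helper names are mine, and the jitter buffer helper gives a whole-call average rather than the per-interval value the stats page shows.

```javascript
// Packet loss as a percentage of the packets the receiver expected.
function lossPercent(report) {
  const expected = report.packetsLost + report.packetsReceived;
  return expected > 0 ? (report.packetsLost / expected) * 100 : 0;
}

// Whole-call average time spent in the jitter buffer, in milliseconds.
// jitterBufferDelay is a running total in seconds.
function meanJitterBufferDelayMs(report) {
  if (!report.jitterBufferEmittedCount) return 0;
  return (report.jitterBufferDelay / report.jitterBufferEmittedCount) * 1000;
}

// Counters copied from the inbound-rtp dumps above.
const inboundVideo = { packetsLost: 78, packetsReceived: 79545, jitterBufferDelay: 2284.024, jitterBufferEmittedCount: 13100 };
const inboundAudio = { packetsLost: 208, packetsReceived: 124469 };
console.log(lossPercent(inboundVideo).toFixed(3), lossPercent(inboundAudio).toFixed(3));
console.log(meanJitterBufferDelayMs(inboundVideo).toFixed(1));
```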

RTCP statistics RTCRemoteOutboundRTPAudioStream_579305270 (remote-outbound-rtp)

timestamp	03/01/2022, 22:48:47
ssrc	579305270
kind	audio
transportId	RTCTransport_0_1
codecId	RTCCodec_0_Inbound_111
packetsSent	120306
bytesSent	4285534
localId	RTCInboundRTPAudioStream_579305270
remoteTimestamp	1641278927459
[remoteTimestamp]	03/01/2022, 22:48:47
reportsSent	480
roundTripTimeMeasurements	0
totalRoundTripTime	0

Comparison of media stream QoS metrics between laptop browser and mobile browser

Chrome browser on laptop | Mobile Chrome browser
higher frames received (30) | lower frames received (20)
lower jitter (0.002) and packet loss (3) | higher jitter (0.003) and packet loss (208)

Bundled Streams

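The same port is used for all media streams. For example, port 9 is used for audio and video as well as their RTCP feedback in the snippet below (port 9 here is only a placeholder in the SDP; the actual transport addresses come from the ICE candidates).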

a=group:BUNDLE 0 1
a=msid-semantic: WMS kAGMqdVh7lL70CVUVZQblgjPYsuhOAiGY3ii
m=audio 9 UDP/TLS/RTP/SAVPF 111 63 103 104 9 0 8 106 105 13 110 112 113 126 (33 more lines)
a=rtcp:9 IN IP4
a=msid:kAGMqdVh7lL70CVUVZQblgjPYsuhOAiGY3ii ed96e925-4425-467b-a099-8fb2e0c67b88
m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 102 122 127 121 125 107 108 109 35 36 120 119 124 (100 more lines)
a=rtcp:9 IN IP4
a=msid:kAGMqdVh7lL70CVUVZQblgjPYsuhOAiGY3ii f45d3002-2866-44ce-a807-ac59a4f6708c

Same CNAME: when multiple media streams are bundled over a single transport, each stream keeps its own SSRC, and the streams from one endpoint share a common CNAME.

a=ssrc:1683985800 cname:OYWXQ35YL2Hh+eUX
a=ssrc:1384669066 cname:OYWXQ35YL2Hh+eUX
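To check these properties on a real offer or answer, the SDP can be scanned line by line. The sketch below is a hand-rolled parser for just the three line types shown above, not a full SDP library, and the sample SDP is a shortened version of the snippet in this section.

```javascript
// Collect the BUNDLE group, the m-line ports and the SSRCs per CNAME.
function summarizeBundle(sdp) {
  const summary = { bundle: [], media: [], ssrcsByCname: {} };
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith('a=group:BUNDLE ')) {
      summary.bundle = line.slice('a=group:BUNDLE '.length).trim().split(/\s+/);
    } else if (line.startsWith('m=')) {
      const [kind, port, proto] = line.slice(2).split(' ');
      summary.media.push({ kind, port: Number(port), proto });
    } else {
      const ssrc = /^a=ssrc:(\d+) cname:(\S+)/.exec(line);
      if (ssrc) {
        const list = summary.ssrcsByCname[ssrc[2]] || [];
        list.push(Number(ssrc[1]));
        summary.ssrcsByCname[ssrc[2]] = list;
      }
    }
  }
  return summary;
}

// Shortened sample built from the lines shown in this section.
const sampleSdp = [
  'a=group:BUNDLE 0 1',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111 63',
  'a=ssrc:1683985800 cname:OYWXQ35YL2Hh+eUX',
  'm=video 9 UDP/TLS/RTP/SAVPF 96 97',
  'a=ssrc:1384669066 cname:OYWXQ35YL2Hh+eUX',
].join('\r\n');
const bundleSummary = summarizeBundle(sampleSdp);
console.log(bundleSummary.bundle, bundleSummary.media, bundleSummary.ssrcsByCname);
```

For the sample this shows two media sections on the same port and two different SSRCs grouped under one CNAME.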

Peer to Peer Data Transfer

Data Channel API of WebRTC allows bidirectional communication of arbitrary data between peers. Its API is modelled on WebSockets and it has very low latency.

  • (+) DataChannel is p2p and is also end-to-end encrypted, leading to higher privacy
  • (+) built-in security due to p2p transfer
  • (+) higher throughput than text transfer via a messaging server
  • (+) lower latency as p2p transfer takes the shortest route

SCTP is the protocol that opens connections for peer-to-peer data channel support in WebRTC. It can be configured for reliability and ordered delivery. It provides flow and congestion control for the data messages.
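Reliability and ordering are chosen per channel through the options passed to createDataChannel. The helper below is a sketch of my own that builds such an options object and rejects the one combination the API does not allow (setting both maxRetransmits and maxPacketLifeTime); the channel labels in the comments are hypothetical.

```javascript
// Build the options dictionary for RTCPeerConnection.createDataChannel().
// Leaving both limits unset gives a reliable channel.
function dataChannelInit({ ordered = true, maxRetransmits, maxPacketLifeTime } = {}) {
  if (maxRetransmits !== undefined && maxPacketLifeTime !== undefined) {
    throw new TypeError('maxRetransmits and maxPacketLifeTime cannot both be set');
  }
  const init = { ordered };
  if (maxRetransmits !== undefined) init.maxRetransmits = maxRetransmits;
  if (maxPacketLifeTime !== undefined) init.maxPacketLifeTime = maxPacketLifeTime;
  return init;
}

// Reliable, ordered: suited to file transfer.
const reliable = dataChannelInit();
// Unordered with no retransmissions: suited to game state or cursor positions.
const lossy = dataChannelInit({ ordered: false, maxRetransmits: 0 });
console.log(reliable, lossy);

// In a browser, with pc being an RTCPeerConnection:
//   const files = pc.createDataChannel('files', reliable);
//   const cursor = pc.createDataChannel('cursor', lossy);
```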

Data Channel Metrics

timestamp  03/01/2022, 23:13:13
label	   sctp
dataChannelIdentifier	1
state	    open
messagesSent	42
[messagesSent/s]	0
bytesSent   1962750
[bytesSent_in_bits/s]	0
messagesReceived	31
[messagesReceived/s]	0
bytesReceived	4712
[bytesReceived_in_bits/s]	0
After sharing 2 files of 1.5 Mb each


WebRTC changes bitrate, resolution and framerate dynamically to accommodate network conditions, policy constraints or user equipment capability. The higher the bitrate, the higher the media quality.

Bitrate of Audio Codecs

Lossy formats
– iLBC (narrowband) 13.33, 15.20 kbit/s
– iSAC (wideband) 10–52 kbit/s
– GSM-EFR 12.2 kbit/s
– AAC 8–529 kbit/s (stereo)
– AMR-WB (G.722.2) 6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85 kbit/s
– Opus 6–510 kbit/s

(-) higher bitrate consumes more bandwidth
(-) can cause congestion on network route

ITU-T formats
– G.711 64 kbit/s
– G.711.1 ( MDCT, A-law, μ-law) 64, 80, 96 kbit/s
– G.722 64 kbit/s (comprises 48, 56 or 64 kbit/s audio
and 16, 8 or 0 kbit/s auxiliary data)

Lossless formats (such as Dolby trueHD, MPEG-4 ALS)
consume much larger bitrates.

Bitrate of Video

QVGA 200-500 kbps
VGA 400 – 800 kbps
720p+ > 800 kbps
4K (60fps) > 20 Mbps

Packet Loss

Packet loss can cause choppy audio and distorted, blurry or frozen video.
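
To make the idea concrete, loss over an interval can be estimated from the RTP sequence numbers that actually arrived. This is a simplified sketch with a hypothetical helper name; it ignores the 16-bit sequence number wraparound that a real receiver has to handle.

```python
def loss_fraction(received_seqs: list[int]) -> float:
    """Fraction of packets missing between the lowest and highest sequence number seen.

    Simplification: assumes no 16-bit wraparound inside the interval.
    """
    if not received_seqs:
        return 0.0
    unique = set(received_seqs)  # duplicates must not count as extra packets
    expected = max(unique) - min(unique) + 1
    lost = expected - len(unique)
    return lost / expected


# Packets 102, 105 and 106 never arrived.
arrived = [100, 101, 103, 104, 107]
print(loss_fraction(arrived))
```

Here 8 packets were expected and 5 arrived, so the fraction lost is 3/8.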

Audio Packet loss

Video Packet loss


Jitter is the packet delay variation in an otherwise predictable normal rate of delay. This could indicate route changes, growing congestion etc.
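
For reference, RTP receivers usually report the interarrival jitter estimate defined in RFC 3550: for each packet, the change in transit time compared with the previous packet is smoothed into a running value with a gain of 1/16. A minimal sketch, assuming send and receive timestamps are already in the same unit (milliseconds here); the function name is hypothetical.

```python
def interarrival_jitter(send_times: list[float], recv_times: list[float]) -> float:
    """Running jitter estimate in the style of RFC 3550: J += (|D| - J) / 16."""
    jitter = 0.0
    for i in range(1, len(send_times)):
        # D = difference in spacing at the receiver versus spacing at the sender.
        d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        jitter += (abs(d) - jitter) / 16.0
    return jitter


# Packets are sent every 20 ms; the third arrives 5 ms late, the fourth is back on time.
sent = [0.0, 20.0, 40.0, 60.0]
received = [50.0, 70.0, 95.0, 110.0]
print(interarrival_jitter(sent, received))
```

A constant network delay yields zero jitter; only variation in delay moves the estimate.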


Jitter for Audio


Jitter for Video

Round Trip Time

High RTT is indicative of network congestion and causes delays.
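
In RTP, the sender typically derives RTT from RTCP receiver reports as described in RFC 3550: RTT = A - LSR - DLSR, where A is the arrival time of the report, LSR is the timestamp of the last sender report echoed back, and DLSR is the delay the receiver held it. All three are expressed in units of 1/65536 second. The sketch below assumes those three 32-bit values are already available; the function name and the sample values are illustrative.

```python
def rtt_seconds(arrival: int, lsr: int, dlsr: int) -> float:
    """Round trip time from RTCP receiver report fields.

    arrival, lsr and dlsr are 32-bit values in units of 1/65536 second.
    """
    rtt_units = (arrival - lsr - dlsr) & 0xFFFFFFFF  # stay within 32 bits on wraparound
    return rtt_units / 65536.0


# Illustrative values: report arrives at 46864.500 s, LSR is 46853.125 s,
# and the receiver held the report for 5.250 s.
rtt = rtt_seconds(arrival=0xB7108000, lsr=0xB7052000, dlsr=0x00054000)
print(rtt)
```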

Audio RTT

Video RTT

Cumulative analysis of packets lost, RTT measurement and total RTT in an internetwork scenario with peer-reflexive and relay ICE candidates

Deeper Analysis of fraction lost , jitter and RTT

The chart shows how jitter follows RTT

References :

Read more on SDP and its attributes

Regulatory/Legal Considerations and CALEA with WebRTC development

This post deals with some lesser known real world implications of developing WebRTC, integrating it with a telecom service provider's network and bringing the solution into action. The regulatory and legal constraints are usually brought to light after the product is in action, and are mostly the result of short-sightedness. The following is a list of factors that must be kept in mind while a WebRTC solution is in its development stages.

  • WebRTC services from a telecom provider depend on the access technology, which may differ if the user is accessing the network through a third party Wi-Fi hotspot.
  • User/network type may also dictate if decryption of the media is possible/required.
  • For Peer-to-Peer paths, media could be extracted through the use of network probes or other methodology

Then there are other considerations tied to specific services. For example, if WebRTC is used to create softphone software permitting users to receive calls from or originate calls to the PSTN, the current view is to treat this as a fully interconnected VoIP service subject to all the rules that apply to the PSTN, regardless of the technologies employed.


The Communications Assistance for Law Enforcement Act (CALEA) is a United States wiretapping law passed in 1994, during the presidency of Bill Clinton.

  • CALEA requirement for an LTE user may be very different than the CALEA requirements for a user accessing the network through a third party Wi-Fi hotspot.
  • For media going through the SBC, CALEA may use a design similar to existing CALEA designs.
calea intercept infrastructure

Read more on WebRTC Security here which discusses SOP (same origin policy), CORS (cross origin resource sharing), JSONP, ICE, location sharing, screensharing, long term access to camera and microphone, SRTP DTLS as well as best practices for secure communication

VoIP and WebRTC platform security largely depends on the underlying protocols such as SIP. SIP is a robust and time tested VoIP protocol to facilitate VoIP calls. Learn more about SIP security against attacks like

  • Registration Hijacking
  • Impersonating a Server
  • Tampering with message bodies
  • mid-session threats like tearing down session
  • Denial of Service and Amplification

Also security mechanisms like

  • Full encryption vs hop by hop encryption
  • Transport and Network Layer Security
  • SIP over TLS
  • SRTP

Read more about Certificates, compliances and Security in VoIP, which summarizes

  • HIPAA (Health Insurance Portability and Accountability Act) ,
  • SOX( Sarbanes Oxley Act of 2002) ,
  • Privacy Related Compliance certificates like COPPA (Children’s Online Privacy Protection Act ) of 1998  ,
  • CPNI (Customer Proprietary Network Information) 2007 ,
  • GDPR (General Data Protection Regulation)  in European Union 2018,
  • California Consumer Privacy Act (CCPA) 2019 ,
  • Personal Data Protection Bill (PDP) – India 2018 and
  • also specifications against Robocalls and SPIT (SPAM over Internet Telephony) among others

Read about General Data Protection Regulation (GDPR) in VoIP

STIR/SHAKEN – Secure Telephony Identity Revisited / Signature-based Handling of Asserted information using toKENs

Transformation towards IMS (Total IP)

The telecommunications industry has been going through a significant transformation over the past few years. At the outset, incumbent operators focused mainly on basic voice services and still remained profitable, due to the limited number of players in the space and the huge initial investment required.

However, with the advent of competitive vendors, a rising consumer base and the introduction of cost effective IP based technologies, a major revolution has come about. This has enabled operators to move away from their traditional business models and to maintain and enhance their subscriber base by providing better and cheaper voice, multimedia and data services, in order to grab the biggest possible share of this multi-billion dollar industry.

The evolution of the telecom industry has been accelerating all the time. Next-generation operators want to keep pace with rapidly changing technology by adapting to market needs and looking at the system and business process from multiple perspectives concurrently. Communication Service Providers (CSPs) need to keep several factors in mind before proposing any solution. They need to deploy solutions which are highly automated, highly flexible and cater to customer needs, coupled with ultra low operating costs.

By hosting new services on the new platform and combining new and old services, CSPs aim to provide service bundles that would generate new revenue streams. This process is largely dependent on the IMS (IP Multimedia Subsystem) architecture.

Transformation towards IMS (Total IP)

Optimization in the operator landscape evolves as a result of synergistic technologies that come together to address the operator's innovation and cost optimization needs for a better user experience. The following sections discuss different technological evolutions that are affecting the overall operator ecosystem, with a focus on the Service Layer.

Legacy to IP transformation

This section broadly covers the aspects of migration from a legacy IN solution to a new age JAIN SLEE framework based one. It applies to legacy IN hosting mostly voice based services such as VPN, Access Screening, Number Portability, SIP Trunking and Call Gapping.

Most operator environments have seen a rise in the number of service delivery platforms (SDPs). The complexity of telecom networks has also increased manifold, hence CSPs are facing multiple challenges. Increased effort and cost are required for maintaining all the SDP platforms. These platforms generally come from different vendors and cater to different technologies, which limits the scalability and flexibility of the operator landscape. The extra effort required to sustain the life cycle of each platform, and the challenges of integrating incompatible SDPs with proprietary designs, have been stumbling blocks in the progress of CSPs across the world.

To overcome these challenges there is a trend in the market to move towards SDP consolidation, wherein instead of maintaining several SDPs with proprietary designs, CSPs prefer maintaining a single SDP or a smaller number of SDPs with standardized interfaces.

SDP consolidation

As illustrated in the above figure there is a transition that is taking place in the industry towards consolidation of service delivery session control. This would provide a cost effective sustenance of existing applications and the rapid creation and deployment of new services leading to increased revenue recognition by CSPs.

  • Agile Development
  • Innovative services
  • open SOA based architectures
  • IN/NGN Platform and Services
  • Reuse of existing investments in legacy service platforms
  • low cost of new service development
  • faster time to market
  • Monetize investment in Network Infrastructure uplift – SIP trunking, VoLTE etc.

Services that should be covered in the scope of migration from fixed line to IP telephony are:

  • Virtual Private Network (VPN) : An Intelligent Network (IN) service, which offers the functions of a private telephone network. The basic idea behind this service is that business customers are offered the benefits of a (physical) private network, but spared from owning and maintaining it.
  • Access Screening (ASC) : An IN service which gives operators the possibility to screen (allow or bar) the incoming traffic and decide the call routing, especially when subscribers choose an alternate route/carrier/access network (also called Equal Access) for long distance calls, either on a call by call basis or pre-selected.
  • Number Portability (NP) : An IN service which allows subscribers to retain their subscriber number while changing their service provider, location, equipment or type of subscribed telephony service. Both geographic numbers and non-geographic numbers are supported by the NP service.

WebRTC based Unified Communication platform

A WebRTC solution can be used for delivering In Context Voice, which provides new monetizing benefits to the Enterprise customers of Service Providers. This includes the following components:

  • WebRTC Gateway implementation for interconnection with legacy SIP
  • Enhancement of WebRTC Client with new features like Cloud Address Book, Conferencing & Social Networking hooks.
  • Cloud based solutions


Challenges in Migration to IMS  (Total IP )

I have long been advocating the benefits of migration to IMS from a current fixed line / legacy / proprietary VoIP / SS7 based system. However, I decided to write this post on the challenges in migrating to an IMS system from a telecom provider's view. Though I could think of many, I have jotted down the major 4. They are as follows:

Data Migration challenges

  • Establishing a common data model definition
  • Data migration seamlessly
  • Configuration management
  • Extracting data from multiple sources and vendors , that includes legacy systems
  • Extracting data due to its large scale and volume


  • Creating an effective knowledge share and transfer for live operations
  • Training in fallback plans, standards and policies .

Customer impact

  • Minimized customer outage
  • Enhance customer experience by delivering quality services on schedule
  • Ensuring security of customer’s confidential data
  • Transfer of customer services without any impact.

Testing in replicated environment

  • Physical pre-transfer test
  • Reducing cycle time
  • Verification and validation at every change in data environment
  • Detect production issues early in the test -lifecycle

Fallback plans

  • Pilot program and real network simulation for ensuring preparedness
  • Tracking changes in new network

WebRTC compatible android client

This post describes the requirements for creating a SIP phone application on Android that uses the same codecs as WebRTC (PCMA, PCMU, VP8). In my project demonstrating WebRTC interoperability (presence, audio / video call, messaging) with a native Android client, I had to develop a lightweight Android SIP application, customized to match the look and feel of the WebRTC web application. This also allows the services added to the WebRTC client, such as geolocation, visual voice mail, phonebook and call control options, to be set from the Android application as well.

Aim :

Android webrtc- sip client development , using sipml5 stack implemented through web services and native android programming .  

Software Used:

⦁ Eclipse IDE
⦁ Java SE Development Kit 7.0
⦁ Android SDK

Tasks :

⦁ Authorization of a user, based on his/her credentials (Database local to the application).

⦁ Navigation Drawer on the home page which shows a menu giving the user various options like:
⦁ View Home Page
⦁ View Contact List
⦁ View/Edit My Profile
⦁ View My Location
⦁ Sign Out

⦁ Phonebook sync : Importing contact list of the Android Phone into the application. Editing user profile with values like  User Name ,  Password ,  Domain. 

⦁ Inclusion of a Web View in the application which currently opens the desired webpage(

⦁ Geolocation: Showing a marker for the current location of the user in Google Maps. Displaying the address of the user in a Toast message.


⦁ Audio / Video call capability 


figure 1 : Login page , figure 2 : Call page , Figure 3 : Menu bar 

Future Roadmap:

⦁ Connecting the application to a database which sits on the cloud.
⦁ Based on the entries in the database the user will be able to:
⦁ Login to the application.
⦁ View or edit his/her details in the My Profile Section.
⦁ Understanding codes of sample applications for making SIP calls from Android OS like:
⦁ SipDroid
⦁ SipDemo
⦁ IMSDroid
⦁ Modifying the existing application to be able to make SIP calls like one of the apps listed above.

Modules :

Development Done:
  1. Development of an authorization page connecting the application to a local database from where values are inserted and retrieved.
  2. Development of navigation drawer where additional options for the application will be displayed making it a user friendly application.
Development Planned:

1.Connectivity to a cloud database.  

2. App engine on cloud.

3. Importing contacts from phone address book .

4. Offline storage of profile details and a few call logs.




Difference between WebRTC and plugin based communication

A lot of service providers, i.e. telecom operators, had devised their own ways to provide web based communication even before WebRTC was born. With time, as WebRTC has become stronger, more secure and more resilient to failure, they have come around to migrating their existing systems from the previous closed box native APIs to open source WebRTC APIs.

The first figure (given below) depicts a communication platform built over plugins and proprietary APIs using HTTP REST based signaling.

Web Communication Service Architecture over HTTP/ REST API

As the migration took place, the proprietary API components were replaced by open standard based entities: plugins were replaced by WebRTC APIs, and HTTP REST based signalling was replaced by SIP (Session Initiation Protocol).

Web Communication Service Architecture over WebRTC SIP

Note that the telecom operator network did not have to be transformed to integrate the WebRTC elements.

Interoperability between WebRTC, SIP phones and softphones

WebRTC is a disruptive technology for telephony and cloud based communication services. It will change the landscape and foster the growth of new innovative VoIP services that will be device agnostic and future ready.

Role of SIP servers ?

The SIP server converts the SIP transport from the WebSocket protocol to UDP, TCP or TLS, which are supported by all legacy networks. It also facilitates the use of rich services such as phonebook synchronisation, file sharing and OAuth in the client.

How does WebRTC Solution traverse through FireWalls ?

NAT traversal across firewalls is achieved via TURN/STUN through ICE candidate gathering. Current ice_servers are : and

What audio and video codecs are supported by WebRTC client side alone ?

Without the role of Media Server WebRTC solution supports Opus , PCMA , PCMU for audio and VP8 for video call.

RTCBreaker if enabled provides a third party B2BUA agent that performs certain level of codec conversion to H.264, H.263, Theora or MP4V-ES for non WebRTC supported agents.

What video resolution is supported by WebRTC solution ?

The browser will try to find the best video size between max and min based on the camera capabilities.

Options are : sqcif | qcif | qvga | cif | hvga | vga | 4cif | svga | 480p | 720p | 16cif | 1080p

We can also predefine the video size such as minWidth, minHeight, maxWidth, maxHeight.
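
As a sketch of how the named sizes relate to the min/max constraints, the table below maps each option to conventional pixel dimensions and then picks the largest camera mode that fits. The dimensions are the commonly used values for these names and are an assumption here (480p in particular is taken as 852x480; other widths are also in use). The selection logic is a simplification, not the browser's actual algorithm, and both function names are hypothetical.

```python
# Conventional dimensions for the named sizes (assumed values, see note above).
VIDEO_SIZES = {
    "sqcif": (128, 96), "qcif": (176, 144), "qvga": (320, 240), "cif": (352, 288),
    "hvga": (480, 320), "vga": (640, 480), "4cif": (704, 576), "svga": (800, 600),
    "480p": (852, 480), "720p": (1280, 720), "16cif": (1408, 1152), "1080p": (1920, 1080),
}


def constraints_for(min_size: str, max_size: str) -> dict[str, int]:
    """Translate two named sizes into minWidth/minHeight/maxWidth/maxHeight."""
    min_w, min_h = VIDEO_SIZES[min_size]
    max_w, max_h = VIDEO_SIZES[max_size]
    return {"minWidth": min_w, "minHeight": min_h, "maxWidth": max_w, "maxHeight": max_h}


def best_camera_mode(camera_modes, min_size: str, max_size: str):
    """Largest camera mode (by pixel count) inside the bounds, or None if nothing fits."""
    c = constraints_for(min_size, max_size)
    fitting = [
        (w, h) for (w, h) in camera_modes
        if c["minWidth"] <= w <= c["maxWidth"] and c["minHeight"] <= h <= c["maxHeight"]
    ]
    return max(fitting, key=lambda wh: wh[0] * wh[1]) if fitting else None


camera = [(320, 240), (640, 480), (1920, 1080)]
print(constraints_for("qvga", "720p"))
print(best_camera_mode(camera, "qvga", "720p"))
```

With a camera that offers 320x240, 640x480 and 1920x1080, bounds of qvga to 720p select 640x480, since 1080p exceeds the maximum.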

What bandwidth is required to run WebRTC solution ?

We can set the maximum audio and video bandwidth to use, or rely on the browser’s ability to set it by default at runtime. Setting it changes the outgoing SDP to include a “b=AS:” attribute. The browser negotiates the right value using RTCP-REMB and congestion control.
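
The effect on the SDP can be sketched as plain text processing: a `b=AS:<kbps>` line is placed in each media section, after the connection line and before the first attribute line. This is only an illustration of the resulting SDP and not how any particular client implements it. The function name `set_max_bandwidth`, the sample SDP and the 64 / 512 kbps limits are all assumptions.

```python
def set_max_bandwidth(sdp: str, limits_kbps: dict[str, int]) -> str:
    """Insert or replace b=AS lines in the media sections named in limits_kbps."""
    out: list[str] = []
    pending = None   # b=AS line not yet written for the current media section
    limited = False  # True while inside a media section that gets a limit
    for line in sdp.splitlines():
        if line.startswith("m="):
            if pending:  # previous section had no a= lines
                out.append(pending)
            media = line[2:].split(" ", 1)[0]
            limited = media in limits_kbps
            pending = f"b=AS:{limits_kbps[media]}" if limited else None
        elif limited and line.startswith("b=AS:"):
            continue  # drop the old limit; the new one replaces it
        elif pending and line.startswith("a="):
            out.append(pending)  # b= must come before the a= lines
            pending = None
        out.append(line)
    if pending:
        out.append(pending)
    return "\r\n".join(out) + "\r\n"


offer = "\r\n".join([
    "v=0",
    "m=audio 9 UDP/TLS/RTP/SAVPF 111",
    "c=IN IP4 0.0.0.0",
    "a=mid:0",
    "m=video 9 UDP/TLS/RTP/SAVPF 96",
    "c=IN IP4 0.0.0.0",
    "a=mid:1",
]) + "\r\n"

limits = {"audio": 64, "video": 512}
munged = set_max_bandwidth(offer, limits)
print(munged)
```

Running the function again on its own output leaves the SDP unchanged, because an existing `b=AS:` line in a limited section is replaced rather than duplicated.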

List of Web based SIP clients

SIPML5 client by Doubango


Telestax WebRTC client


SIPJS with flash network support



MIT license

SIP phones in Ubuntu / Linux

SFL phone


Yate SIP phone



There are ready made builds of Linphone for Windows, Mac and mobile


Alternatively, one can also build Linphone from source

Installation of Linphone v4.1 for Desktop

apt-get install libqt53dcore5:amd64 libqt53dextras5:amd64 libqt53dinput5:amd64 libqt53dlogic5:amd64 libqt53dquick5:amd64 libqt53dquickextras5:amd64 libqt53dquickinput5:amd64 libqt53dquickrender5:amd64 libqt53drender5:amd64 libqt5concurrent5:amd64 libqt5core5a:amd64 libqt5dbus5:amd64 libqt5designer5:amd64 libqt5designercomponents5:amd64 libqt5gui5:amd64 libqt5help5:amd64 libqt5multimedia5:amd64 libqt5multimedia5-plugins:amd64 libqt5multimediawidgets5:amd64 libqt5network5:amd64 libqt5opengl5:amd64 libqt5opengl5-dev:amd64 libqt5positioning5:amd64 libqt5printsupport5:amd64 libqt5qml5:amd64 libqt5quick5:amd64 libqt5quickcontrols2-5:amd64 libqt5quickparticles5:amd64 libqt5quicktemplates2-5:amd64 libqt5quicktest5:amd64 libqt5quickwidgets5:amd64 libqt5script5:amd64 libqt5scripttools5:amd64 libqt5sensors5:amd64 libqt5serialport5:amd64 libqt5sql5:amd64 libqt5sql5-sqlite:amd64 libqt5svg5:amd64 libqt5svg5-dev:amd64 libqt5test5:amd64 libqt5webchannel5:amd64 libqt5webengine-data libqt5webenginecore5:amd64 libqt5webenginewidgets5:amd64 libqt5webkit5:amd64 libqt5widgets5:amd64 libqt5x11extras5:amd64 libqt5xml5:amd64 libqt5xmlpatterns5:amd64 qt5-default:amd64 qt5-doc qt5-gtk-platformtheme:amd64 qt5-qmake:amd64 qt5-qmltooling-plugins:amd64

Besides these, don't forget to also install pip and pystache, which is a templating system

sudo apt install python-pip
pip install pystache

And Doxygen, which is a tool for generating documentation from annotated C++ sources

apt install doxygen

Yasm assembler

sudo apt install yasm


sudo apt-get install -y v4l-utils

Get source Code

git clone --recursive

To build without video or v4l support

sudo cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo -DENABLE_V4L=0

Then run the build

sudo cmake --build . --target all




CMakeFiles/EP_ms2.dir/build.make:118: recipe for target '/home/altanai/linphone-desktop/WORK/WORK/desktop/Stamp/EP_ms2/EP_ms2-configure' failed
make[8]: *** [/home/altanai/linphone-desktop/WORK/WORK/desktop/Stamp/EP_ms2/EP_ms2-configure] Error 1
CMakeFiles/Makefile2:115: recipe for target 'CMakeFiles/EP_ms2.dir/all' failed
make[7]: *** [CMakeFiles/EP_ms2.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make[6]: *** [all] Error 2


[ 57%] Performing configure step for 'EP_ms2'
loading initial cache file /home/altanai/linphone-desktop/WORK/WORK/desktop//tmp/EP_ms2/EP_ms2-cache-RelWithDebInfo.cmake
CMake Error at CMakeLists.txt:322 (message):
Could not find a support sound driver API. Use -DENABLE_SOUND=NO if you
don't care about having sound.

Install sound drivers

sudo apt-get install libpulse-dev pulseaudio libasound2-dev pavucontrol alsa-lib 

Failing on MS compilation on Performing configure step for ‘EP_ms2’

Ref :

Windows Operating system SIP software

Xlite is a well known SIP softphone for Windows desktop


Xlite new version


Kapanga SIP softphone

It is also runnable on Linux desktop through Windows compatibility software like Wine


FreeSwitch Communicator

comes along with the Freeswitch Media Server


Boghe SIP RCS client


Jitsi SIP phone


MAC SIP software

idoubs desktop SIP RCS client for Mac


iOS SIP phone applications

Linphone for ios


Android SIP applications

Sipdroid for Android



Supports the SIP stack and is compatible with most SIP servers

WebRTC SIP / IMS solution

We started with WebRTC in the winter of 2012. At that time it just looked like new tech jargon that might fade away when newer ones came along. In many ways WebRTC’s buzz has died down since its massive adoption, but I nevertheless still see a lot of potential and development around it.

What really is WebRTC ? I made an entry on it  here .

Around Nov – Dec 2012, the team and I spent time learning the nitty-gritty of HTML5 based media operation and the JavaScript SIP stack of SIPML. I remember that toward the end of the year, i.e. before Christmas, we were done with the explanation and education aspects of WebRTC, a technology that will revolutionise communication in ages to come, or at least so say the numerous blogs and documents I have read so far.

Use cases for WebRTC range across a wide variety; of them, the most revenue generating ones are around video conferencing with realtime HD audio-video-data streams.

To bridge the flow between a WebRTC client and a PSTN endpoint via IMS, interworking between WebRTC media standards and codecs and those of the gateways in IMS is critical. For instance, WebRTC mandates secure RTP (SRTP), so the media engine / gateway should be able to support it and connect it with RTP from PSTN endpoints.

client BOB -> webrtc2sip Gateway -> SIP server -> client Alice

This flow can be understood with the call flow of a simple SIP INVITE initiated from one HTML page towards another, which passes through the gateway to the IMS world, the SIP telecom application server, database, nodes of the IMS environment etc.

For the purpose of a simple explanation, a simplified call flow can be depicted as,


A very high level architecture of solution deployment in IMS world could be

solution arch2

As the solution matures into a full-fledged project, the alpha version has been released with the following feature set. The WebRTC platform suite offers an easily deployable solution to enable communication.

Alpha Release WebRTC platform Suite

  • Single Sign On
  • Login with id and password to access all services
  • Audio / Video Call
    • Call Hold / Call Transfer
  • Messaging:
    • SIP Instant Messaging
    • Message to Facebook Messenger
    • Message delivered as Email
  • Chatroom
    • group chat between multiple users . Room is created for set of users .
  • Video Conferencing
    • video chat between multiple parties . Room is created for set of users .
  • File Transfer
    • Sharing of files from local to remote , in peer-to-peer and broadcasting fashion .
  • Third party Webservices
    • Widgets like calendar , weather , stocks , twitter are embedded.
  • Visual Voice Mail
    • Record and deliver voice message to recipients voice mail inbox which can be accessed/ played from web client .
  • Phonebook
    • cloud integration
    • add new entries
    • add photos to contacts identity
    • import contacts from google account
  • Click to Call :
    • Drop down list of contacts form mail call console
    • 2 step Click to call from Phonebook
  • Presence :
    • Publish online / offline status
    • Use Subscribe / notify requests of SIP
  • WebSocket to SIP Gateway
    • Conversion between the signal coming from the WebRTC and SIP client to the IMS core
    • Conversion of “voice/video ” media between sRTP and RTP
    • Conversion of other media (data channel) towards MSRP and Transcoding.
    • Support of ICE procedure
    • Implementation of a STUN server
  • QoS Support


  • Logs
    • calls logs
    • Message logs
  • User Profile
    • user details like address , email and social networking accounts
    • Phonenumber for GSM integration through SMS
    • User’s Media storage like Pictures , profile picture , Audio , video
    • File sharing documents storage for future access in the same format
  • Real Time and Offline Analytics
    • service usage with graphical and tabular history trends
  • Session Management
    • Single Sign-on
    • Forgot password regeneration using secure question
    • Registration of new user account
    • Logout and clearance of session parameters
  • Security
    • No redirection to any page through url entry without valid session
    • No going back to home page after logout by back button on browser
    • No data vulnerability
    • Multiple login through different devices handled
  • OAuth
    • Login via IMAP / token through facebook and Google
  • Phonebook with Presence functionality inbuilt
  • Directory Service based on country / region
  • Geolocation of approximate location detection of device logged in and visibility to others
WebRTC client deployment view , accessible devices , network elements
WebRTC deployment overview and interaction with other network elements such as gateway, cloud storage, SIP server, IMS

Commercial release features specs for WebRTC over IMS

  • Integration with new age CSP deployments like VoLTE, ViLTE, VoWiFi
  • Multi vendor support
  • Interactive webrtc services
  • Media Services
    • Automated Natural language Speech recognition
    • Semantic processing via ML
    • Enhanced incall services replacing IVR ( touch -tone)
    • VQE (voice Quality Enhancements)
    • Encoding and Decoding – Multiple Codec Support
    • Transcoding
    • Silence Suppression
  • Security via TLS, encryption and AAA
  • Http, NFS caching
  • NAT using Xirsys TURN
  • Recording, playback and media file compression
  • active frame selection
  • DTMF (Dual Tone Multi Frequency)
    • SIP info messages (out-of-band)
    • SIP notify messages (out-of-band)
    • Inband DTMF not supported yet
  • Audio
    • mixing
    • announcements ( VXML, MSML )
    • filters
    • gain control ( AGC using webrtc stack)
    • noise suppression ( webrtc stack)
    • speakers notification
    • Narrowband, Wideband, and Super Wideband
    • dynamic sample rate
  • Video
    • continuous presence ( Face detection )
    • floor control
    • video lipsync (sync)
    • speaker tile selection
  • VQE (Voice Quality Enhancement )
    • Acoustic Echo Cancelation
    • noise reduction
    • noise line detection
    • noise gating
    • Packet Loss concealment
  • Call analytics
    • progress analysis
    • MOS , R-factor ( derived from latency , jitter , packet loss )
  • CDR (Call detail records ) and accounting
  • Lawful interception
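
For the MOS / R-factor item in the call analytics list above, a rough estimate can be derived from latency, jitter and packet loss. The sketch below uses a widely circulated approximation for the R-factor (it is not the full ITU-T G.107 E-model and not necessarily what this platform computes), followed by the standard G.107 conversion from R to MOS. Function names and sample inputs are illustrative.

```python
def r_factor(latency_ms: float, jitter_ms: float, loss_percent: float) -> float:
    """Approximate R-factor from network metrics (simplified, assumed heuristic)."""
    # Jitter is weighted double and 10 ms is added as a nominal codec delay.
    effective_latency = latency_ms + 2.0 * jitter_ms + 10.0
    if effective_latency < 160.0:
        r = 93.2 - effective_latency / 40.0
    else:
        r = 93.2 - (effective_latency - 120.0) / 10.0
    r -= 2.5 * loss_percent  # each percent of loss costs 2.5 R points
    return max(0.0, min(100.0, r))


def mos_from_r(r: float) -> float:
    """Convert an R-factor to an estimated MOS using the G.107 mapping."""
    if r <= 0.0:
        return 1.0
    if r >= 100.0:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


good = mos_from_r(r_factor(latency_ms=50, jitter_ms=5, loss_percent=0))
lossy = mos_from_r(r_factor(latency_ms=50, jitter_ms=5, loss_percent=5))
print(round(good, 2), round(lossy, 2))
```

The same call with 5 percent packet loss scores noticeably lower than the loss-free one, which is the kind of trend such analytics are meant to surface.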

Updating this article 2019

There was a long journey from traditional telecom architectures to NFV cloud based architectures (like OpenStack), supported over web, 4G, LTE or other upcoming networks. Many OTT providers prefer using the public cloud over an NFV data centre.

Multinode / multiedge computing platforms like the Media Resource Function are expected to meet the need for quick delivery with additional features like hardware accelerated media, algorithms for optimised data flow (packetization, decongesting, security) etc. With the decomposed architecture they can better utilise the

  • CPU – contains couple of cores optimised for sequential serial processing such as   graphics or video processing
  • GPU – contains many smaller cores to accelerate creation of images for computer display . Can include texture mapping, image rotation, translation, shading or more enhanced features like motion compensation, calculation of inverse DCT, etc. for accelerated video decoding.
  • DSP- processing data representing analog signals

Although IMS based solutions are more suited to telephony applications and CSPs (communication service providers like telecom companies), similar or identical architectures are widely finding their way into newly developed cloud communication solutions supporting tens of millions of subscribers and hyperscale deployment. These could be built around applications such as

  • HD (High Definition ) calls
  • UCC ( conf , draw-board, speech recognition , realtime streaming)
  • immersive experiences ( Augmented reality , virtual reality , face recognition , tracking )
  • contextual communication ( transcription etc)
  • video content delivery with deep media analytics

The demand these days is for a decentralised system of pools of servers (media and signalling) that can scale independently to match peak traffic at any moment, with, of course, carrier class performance. These flexible solutions reduce not only complexity but also OpEx.


Unified Communicator and Collaborator for Enterprise

Modular enterprise communicator solution for enterprise based communication and collaboration . Use sipml5 client side library to provide webRTC based media stream capture and propagation from client side without external plugins.

Github Repo –

Unified Communications and Collaborations ( UC&C ) –