JavaScript Session Establishment Protocol (JSEP) in WebRTC handshake

This article explains the intricacies of the offer/answer flow in the WebRTC handshake and JSEP. You can read the following articles on WebRTC as a prerequisite before reading this one:

WebRTC API – PeerConnection , getUserMedia , DataChannel , DataStats

JSEP (JavaScript Session Establishment Protocol)

JSEP (JavaScript Session Establishment Protocol) is used during signalling, through the W3C RTCPeerConnection API, to set up a multimedia session. The session description specifies the critical components of setting up a session between the local and remote sides, such as transport ports, protocols and profiles. JSEP also handles the interaction with the ICE state machine.

Offer/Answer Exchange Flow

prereq : Setup Client side for the caller
PeerConnectionFactory to generate PeerConnections
PeerConnection for every connection to remote peer
MediaStream audio and video from client device

  1. The side initiating the session creates an offer using the createOffer() API
aPromise = myPeerConnection.createOffer([options]);

options is of type RTCOfferOptions, with the following fields:

  • iceRestart
  • offerToReceiveAudio ( legacy)
  • offerToReceiveVideo ( legacy)
  • voiceActivityDetection

2. The application then stores the offer as its local description using the setLocalDescription() API

 myPeerConnection.createOffer().then(function(offer) {
    return myPeerConnection.setLocalDescription(offer);
})

3. The offer is sent to the remote side using the application's choice of signalling ( SIP , WS , HTTP, XMPP .. )

4. The remote party stores it using the setRemoteDescription() API

myPeerConnection.setRemoteDescription(sdp)
.then(function () {
  return createMyStream();
})

5. The remote party generates an answer using the createAnswer() API

aPromise = myPeerConnection.createAnswer([options]);

6. The remote party stores the answer as its local description using the setLocalDescription() API

7. The answer is transferred to the initiator side using the choice of signalling ( SIP , WS , HTTP, XMPP .. ) again

8. The initiating side stores it using the setRemoteDescription() API
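The whole exchange can be sketched as three small functions. This is only a minimal sketch under stated assumptions: `signaling` is a hypothetical object with a `send()` method standing in for whichever channel the application picks (SIP, WS, HTTP, XMPP), and `pc` is expected to be an RTCPeerConnection. Any object exposing the same four methods works too, which keeps the sketch easy to try outside a browser.

```javascript
// Caller side (steps 1-3): create the offer, store it locally, send it.
// `signaling` is a hypothetical transport wrapper with a send(message) method.
async function sendOffer(pc, signaling) {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  signaling.send({ type: offer.type, sdp: offer.sdp });
  return offer;
}

// Callee side: store the remote offer, generate the answer,
// store it locally and send it back over the same signalling channel.
async function handleOffer(pc, signaling, offer) {
  await pc.setRemoteDescription(offer);
  const answer = await pc.createAnswer();
  await pc.setLocalDescription(answer);
  signaling.send({ type: answer.type, sdp: answer.sdp });
  return answer;
}

// Caller side again: store the callee's answer, which completes the exchange.
async function handleAnswer(pc, answer) {
  await pc.setRemoteDescription(answer);
}
```

Passing the peer connection and the signalling wrapper in as arguments is a design choice of this sketch, not something WebRTC requires. It just makes the order of the calls easy to see.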

Interfaces of webrtc and tracks to stream addition

Process to perform webrtc handshake

WebRTC call setup and incoming call flow between remote peer , PeerConnectionFactory , PeerConnection and application

setup a call
receive a call

Signalling state Transitions on PeerConnection

When the caller creates a new RTCPeerConnection() , the RTCSignalingState is “stable” , as both the remote and local descriptions are empty.

The caller then calls createOffer() to obtain the offer SDP and stores it locally with setLocalDescription(offer) ; the RTCSignalingState becomes “have-local-offer” . After that the caller sends the offer to the callee over the signalling channel.

Similarly, the callee starts in the “stable” state. When it receives the offer and stores it using setRemoteDescription(offer) , its state becomes “have-remote-offer” .

If the callee generates a provisional answer for the caller and stores it locally, its state transitions to “have-local-pranswer” . The pranswer SDP is sent to the caller over the signalling channel again.

The caller stores the callee’s pranswer SDP and its state updates to “have-remote-pranswer” .

Once the final answer has been applied and no offer/answer exchange is in progress, the state changes back to “stable” .

The state changes to “closed” if the RTCPeerConnection is closed.

img : https://w3c.github.io/webrtc-pc
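The transitions can be written down as a small lookup table. The sketch below is a simplified model and not the browser's implementation: the table and function names are made up for illustration, and rollback and the “closed” state are left out.

```javascript
// Transition table for RTCSignalingState, keyed by the current state and by
// "<local|remote>:<description type>". A missing entry is an invalid move.
const SIGNALING_TRANSITIONS = {
  "stable": {
    "local:offer": "have-local-offer",
    "remote:offer": "have-remote-offer",
  },
  "have-local-offer": {
    "local:offer": "have-local-offer",
    "remote:pranswer": "have-remote-pranswer",
    "remote:answer": "stable",
  },
  "have-remote-offer": {
    "remote:offer": "have-remote-offer",
    "local:pranswer": "have-local-pranswer",
    "local:answer": "stable",
  },
  "have-local-pranswer": {
    "local:pranswer": "have-local-pranswer",
    "local:answer": "stable",
  },
  "have-remote-pranswer": {
    "remote:pranswer": "have-remote-pranswer",
    "remote:answer": "stable",
  },
};

// source is "local" (setLocalDescription) or "remote" (setRemoteDescription).
function nextSignalingState(state, source, type) {
  const next = (SIGNALING_TRANSITIONS[state] || {})[`${source}:${type}`];
  if (!next) {
    throw new Error(`Invalid transition: ${source} ${type} in state ${state}`);
  }
  return next;
}

// Caller's view of a call with a provisional answer.
let callerState = "stable";
callerState = nextSignalingState(callerState, "local", "offer");
callerState = nextSignalingState(callerState, "remote", "pranswer");
callerState = nextSignalingState(callerState, "remote", "answer");
console.log(callerState);
```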

Detailed Offer / Answer SDP

Local Offer created by side initiating the session / Caller

The first offer, called the initial offer, can carry dummy data in the connection line, such as 0.0.0.0, to prevent leaking a local IP address

c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0

“o=” line contains <username> <sess-id> <sess-version> <nettype> <addrtype> <unicast-address>

o=- 4445251981417004127 2 IN IP4 127.0.0.1

shows “-” as the username and 4445251981417004127 as the session id. The “s=” line (session name) is likewise set to “-” .

“t=” line shows <start time> <stop time>

t=0 0
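The “o=” fields described above can be pulled apart with a few lines of code. This is a small sketch with a made-up function name. The session id is kept as a string on purpose, because values such as 4445251981417004127 are larger than what a JavaScript Number can hold exactly.

```javascript
// Split an "o=" line into
// <username> <sess-id> <sess-version> <nettype> <addrtype> <unicast-address>.
function parseOriginLine(line) {
  if (!line.startsWith("o=")) {
    throw new Error("not an o= line: " + line);
  }
  const parts = line.slice(2).trim().split(/\s+/);
  if (parts.length !== 6) {
    throw new Error("o= line must have exactly 6 fields");
  }
  const [username, sessionId, sessionVersion, netType, addrType, unicastAddress] = parts;
  return {
    username,
    sessionId, // string: may exceed Number.MAX_SAFE_INTEGER
    sessionVersion: Number(sessionVersion),
    netType,
    addrType,
    unicastAddress,
  };
}

const origin = parseOriginLine("o=- 4445251981417004127 2 IN IP4 127.0.0.1");
console.log(origin.username, origin.sessionId, origin.sessionVersion);
```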

Full session Block example

type: offer, sdp: v=0
o=- 4445251981417004127 2 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE 0 1 2
a=msid-semantic: WMS DYVK4IA4kA8LvnIYWjXhRzMgSGicnwVutWE2

Media Section : An m= section is generated for each RtpTransceiver that has been added to the PeerConnection. For the initial offer, since no ports are available yet, the dummy port 9 is used. However, if the m= section is bundle-only, the port value is set to 0. Later the port value is set to the port of the default ICE candidate.

The <proto> field “UDP/TLS/RTP/SAVPF” is followed by the list of codecs in order of priority.

The “c=” line in the m= section must likewise be filled with the dummy value “IN IP4 0.0.0.0” , as no candidates are available yet.

ICE

a=ice-options:trickle

Transport

“a=ice-ufrag” , “a=ice-pwd” , “a=fingerprint” , “a=setup” , “a=tls-id”

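Media Stream Identification attribute “a=mid:”

Each media format on the m= line gets an “a=rtpmap” line, and an “a=fmtp” line where the format has parameters. Where retransmission is used, an additional “a=rtpmap” for “rtx” carries the clock rate of the primary codec, and its “a=fmtp” references the payload type of the primary codec. “a=rtcp-fb” specifies the supported RTCP feedback.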

a=rtpmap:111 opus/48000/2
a=rtcp-fb:111 transport-cc
a=fmtp:111 minptime=10;useinbandfec=1

Audio Block example

m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:JDMg
a=ice-pwd:6OARDQ8U/orhtXZbfN+ars37
a=ice-options:trickle
a=fingerprint:sha-256 1D:C8:1F:18:D2:AB:B7:68:CC:DC:A8:8D:6B:1D:70:11:06:E9:19:D2:22:CE:A5:F3:BE:82:00:ED:99:58:20:4A
a=setup:actpass
a=mid:0
a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
a=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:5 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=extmap:6 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
a=sendrecv
a=msid:DYVK4IA4kA8LvnIYWjXhRzMgSGicnwVutWE2 7525d75c-ffe7-4038-8b71-653d249e63bb
a=rtcp-mux
a=rtpmap:111 opus/48000/2
a=rtcp-fb:111 transport-cc
a=fmtp:111 minptime=10;useinbandfec=1
a=rtpmap:103 ISAC/16000
a=rtpmap:104 ISAC/32000
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:106 CN/32000
a=rtpmap:105 CN/16000
a=rtpmap:13 CN/8000
a=rtpmap:110 telephone-event/48000
a=rtpmap:112 telephone-event/32000
a=rtpmap:113 telephone-event/16000
a=rtpmap:126 telephone-event/8000
a=ssrc:3968544080 cname:da0nYe1oYR8AvVNp
a=ssrc:3968544080 msid:DYVK4IA4kA8LvnIYWjXhRzMgSGicnwVutWE2 7525d75c-ffe7-4038-8b71-653d249e63bb
a=ssrc:3968544080 mslabel:DYVK4IA4kA8LvnIYWjXhRzMgSGicnwVutWE2
a=ssrc:3968544080 label:7525d75c-ffe7-4038-8b71-653d249e63bb

Video section removed for simplicity

A data block, i.e. an m= section for data, is created if a data channel has been created.

The “a=sctp-port” line references the SCTP port number, set to 5000 here

The “a=max-message-size” line is set to 262144 here

Data Block example

m=application 9 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
a=ice-ufrag:JDMg
a=ice-pwd:6OARDQ8U/orhtXZbfN+ars37
a=ice-options:trickle
a=fingerprint:sha-256 1D:C8:1F:18:D2:AB:B7:68:CC:DC:A8:8D:6B:1D:70:11:06:E9:19:D2:22:CE:A5:F3:BE:82:00:ED:99:58:20:4A
a=setup:actpass
a=mid:2
a=sctp-port:5000
a=max-message-size:262144
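Since every block above starts at an “m=” line, an SDP blob can be cut into its session block and media blocks with a simple loop. The following is a rough sketch with made-up helper names, not a full SDP parser.

```javascript
// Split an SDP string into session-level lines and one array of lines per
// m= section. Lines before the first m= line belong to the session block.
function splitSdpSections(sdp) {
  const lines = sdp.split(/\r?\n/).filter((line) => line.length > 0);
  const session = [];
  const media = [];
  for (const line of lines) {
    if (line.startsWith("m=")) {
      media.push([line]); // a new media section starts here
    } else if (media.length > 0) {
      media[media.length - 1].push(line);
    } else {
      session.push(line);
    }
  }
  return { session, media };
}

// Break "m=<media> <port> <proto> <fmt> ..." into its fields.
function parseMediaLine(line) {
  const [kind, port, proto, ...formats] = line.slice(2).split(" ");
  return { kind, port: Number(port), proto, formats };
}

const sampleSdp = [
  "v=0",
  "o=- 4445251981417004127 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111 103",
  "a=mid:0",
  "m=application 9 UDP/DTLS/SCTP webrtc-datachannel",
  "a=mid:2",
].join("\r\n");

const sections = splitSdpSections(sampleSdp);
console.log(sections.media.map((section) => parseMediaLine(section[0]).kind));
```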

Subsequent Offers

When createOffer is called a second (or later) time, or is called after a local description has already been installed, the processing is different because ICE candidates may already have been gathered. The fields of the “o=” line stay the same, except that <session-version> is incremented when the new offer differs from the previous one.

Additionally, the m= sections are updated if an RtpTransceiver is added or removed.

Each “m=” and “c=” line MUST be filled in with the port, relevant RTP profile, and address of the default candidate for the m= section.

If the m= section is not bundled into another m= section, update the “a=rtcp” line with the port and address of the default RTCP candidate, and add the “a=candidate” lines, followed by “a=end-of-candidates” once gathering has finished.

Local Answer created by side receiving the session/ Callee

When createAnswer is called for the first time after a remote description has been provided, the result is known as the initial answer. 

 Each offered m= section will have an associated RtpTransceiver

The callee can reject an m= section by setting the port in the m= line to 0. It does so if none of the offered media formats are supported, if the RtpTransceiver is stopped, etc.

For the initial answer, the dummy port value 9 is set, as no ICE candidate is available yet. Similarly, the “c=” line must contain the “dummy” value “IN IP4 0.0.0.0” too.

The <proto> field MUST be set to exactly match the <proto> field for the corresponding m= line in the offer.
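The two rules above, rejection through port 0 and the matching <proto> field, are easy to check mechanically. Here is a hedged sketch that compares the m= lines of an answer against those of the offer. The function names are made up, and the input is assumed to be the m= lines in their original order.

```javascript
// Extract <media>, <port> and <proto> from an "m=" line.
function parseMLine(line) {
  const [media, port, proto] = line.slice(2).trim().split(/\s+/);
  return { media, port: Number(port), proto };
}

// Compare each answered m= line with the offered one at the same index.
function compareAnswerToOffer(offerMLines, answerMLines) {
  if (offerMLines.length !== answerMLines.length) {
    throw new Error("answer must contain one m= section per offered m= section");
  }
  return offerMLines.map((offerLine, index) => {
    const offer = parseMLine(offerLine);
    const answer = parseMLine(answerMLines[index]);
    return {
      index,
      media: offer.media,
      rejected: answer.port === 0, // port 0 means the section was rejected
      protoMatches: answer.proto === offer.proto,
    };
  });
}

const report = compareAnswerToOffer(
  ["m=audio 9 UDP/TLS/RTP/SAVPF 111 103", "m=video 9 UDP/TLS/RTP/SAVPF 96 97"],
  ["m=audio 9 UDP/TLS/RTP/SAVPF 111", "m=video 0 UDP/TLS/RTP/SAVPF 96"]
);
console.log(report);
```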

type: answer, sdp: v=0
o=- 5730481682283561642 3 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE 0 1 2
a=msid-semantic: WMS KGmQ9mTmvTaWlHTQ0B0YP36QIxOYNeB3i2nT

Audio section

m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:MgKS
a=ice-pwd:X3oTkKO/v7GVgd/CDC3e9B7c
a=ice-options:trickle
a=fingerprint:sha-256 B9:9C:8A:A9:E9:09:0C:FB:52:2A:D3:18:7B:A9:D4:EC:B3:00:77:72:27:51:EC:5F:82:BE:11:7F:C7:CF:43:43
a=setup:active
a=mid:0
a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
a=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:5 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=extmap:6 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
a=sendrecv
a=msid:KGmQ9mTmvTaWlHTQ0B0YP36QIxOYNeB3i2nT e817fe0f-1cc0-4901-9fd9-e810289cc85d
a=rtcp-mux
a=rtpmap:111 opus/48000/2
a=rtcp-fb:111 transport-cc
a=fmtp:111 minptime=10;useinbandfec=1
a=rtpmap:103 ISAC/16000
a=rtpmap:104 ISAC/32000
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:106 CN/32000
a=rtpmap:105 CN/16000
a=rtpmap:13 CN/8000
a=rtpmap:110 telephone-event/48000
a=rtpmap:112 telephone-event/32000
a=rtpmap:113 telephone-event/16000
a=rtpmap:126 telephone-event/8000
a=ssrc:3260997313 cname:FxLUKuXrLQe0r1rn

Video section removed for simplicity

Data stream

m=application 9 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
b=AS:30
a=ice-ufrag:MgKS
a=ice-pwd:X3oTkKO/v7GVgd/CDC3e9B7c
a=ice-options:trickle
a=fingerprint:sha-256 B9:9C:8A:A9:E9:09:0C:FB:52:2A:D3:18:7B:A9:D4:EC:B3:00:77:72:27:51:EC:5F:82:BE:11:7F:C7:CF:43:43
a=setup:active
a=mid:2
a=sctp-port:5000
a=max-message-size:262144

Subsequent Answers

 The port value would normally be set to the port of the default ICE candidate for this m= section. For the example above

m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126

will be changed to carry the relevant port and address, such as

type: offer, sdp: v=0
o=- 6407282338169184323 3 IN IP4 54.190.54.190
s=-
t=0 0
a=group:BUNDLE 0 1 2
a=msid-semantic: WMS bSrCUCFybGovIy0FUhPTZAr9ToRmx8I09nEj
m=audio 55375 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
c=IN IP4 54.190.54.190
a=rtcp:9 IN IP4 0.0.0.0
a=candidate:2880323124 1 udp 2122260223 54.190.54.190 55375 typ host generation 0 network-id 1 network-c

Similarly, the m= lines for video and data also get ports

m=video 53877 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 102 122 127 121 125 107 108 109 124 120 123 119 114 115 116
c=IN IP4 54.190.54.190
a=rtcp:9 IN IP4 0.0.0.0
a=candidate:2880323124 1 udp 2122260223 54.190.54.190 53877 typ host generation 0 network-id 1 network-cost 10
..
m=application 57991 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 54.190.54.190
a=candidate:2880323124 1 udp 2122260223 54.190.54.190 57991 typ host generation 0 network-id 1 network-cost 10

If the answer contains any “a=ice-options” attributes where “trickle” is listed as an attribute, update the PeerConnection canTrickle property to be true. 
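A browser does this check internally, but the rule itself is simple enough to sketch. The helper below is a made-up illustration that only looks for the “trickle” token on any “a=ice-options” line of an SDP string.

```javascript
// Returns true if any a=ice-options line lists "trickle" among its tokens.
function answerSupportsTrickle(sdp) {
  const prefix = "a=ice-options:";
  return sdp
    .split(/\r?\n/)
    .filter((line) => line.startsWith(prefix))
    .some((line) => line.slice(prefix.length).trim().split(/\s+/).includes("trickle"));
}

console.log(answerSupportsTrickle("v=0\r\na=ice-ufrag:MgKS\r\na=ice-options:trickle\r\n"));
```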

Modifying Offer/answer SDP

SDP returned from createOffer or createAnswer MUST NOT be changed before passing it to setLocalDescription.
After calling setLocalDescription with an offer or answer, the application MAY modify the SDP to reduce its capabilities before sending it to the far side
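As an illustration of reducing capabilities before sending, the sketch below drops one payload type from an SDP string. It is deliberately naive and the function name is made up: it removes the payload type from every m= line and deletes its “a=rtpmap”, “a=fmtp” and “a=rtcp-fb” lines, but it does not look at which m= section a line belongs to, nor at “rtx” entries that point at the removed codec.

```javascript
// Remove a payload type from the m= lines and drop its attribute lines.
function removePayloadType(sdp, payloadType) {
  const pt = String(payloadType);
  // Matches a=rtpmap:<pt>, a=fmtp:<pt> and a=rtcp-fb:<pt> for exactly this pt.
  const attribute = new RegExp(`^a=(rtpmap|fmtp|rtcp-fb):${pt}(\\s|$)`);
  return sdp
    .split(/\r?\n/)
    .filter((line) => !attribute.test(line))
    .map((line) => {
      if (!line.startsWith("m=")) return line;
      const [media, port, proto, ...formats] = line.split(" ");
      return [media, port, proto, ...formats.filter((f) => f !== pt)].join(" ");
    })
    .join("\r\n");
}

const localSdp = [
  "m=audio 9 UDP/TLS/RTP/SAVPF 111 103",
  "a=rtpmap:111 opus/48000/2",
  "a=rtcp-fb:111 transport-cc",
  "a=fmtp:111 minptime=10;useinbandfec=1",
  "a=rtpmap:103 ISAC/16000",
].join("\r\n");

// The description passed to setLocalDescription stays untouched;
// only the copy sent to the far side is reduced.
const sdpToSend = removePayloadType(localSdp, 103);
console.log(sdpToSend);
```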

Assume we have an MCU at some location and want the video stream to be relayed via a media server.

SDP Parsing

An SDP description consists of a sequence of lines, each a key=value pair. To parse it, the SDP is read line by line and converted into a data structure that holds the deserialized information.

JSEP SDP bears a lot of similarity to SIP SDP, explained here : SIP and SDP Messages Explained

Session-Level Parsing

Lines “v=” , “o=” , “b=” and “a=” are processed. The “i=” , “u=” , “e=” , “p=” , “t=” , “r=” , “z=” and “k=” lines are not used by this specification; they MUST be checked for syntax but their values are not used. The “c=” line is checked for syntax and for ICE mismatch detection.

The “a=” attribute could be : “a=group” , “a=ice-lite” , “a=ice-pwd” , “a=ice-options” , “a=fingerprint” , “a=setup” , “a=tls-id” , “a=identity” , “a=extmap”

Media Section Parsing

The “m=” line is parsed for <media> , <port> , <proto> and <fmt>

Attributes “a=” can be

“a=rtpmap” or “a=fmtp”

“a=rtpmap” maps an RTP payload type number to a media encoding name that identifies the payload format.

a=rtpmap:<payload type> <encoding name>/<clock rate> [/<encoding parameters>]
m=audio 49230 RTP/AVP 96 97 98
a=rtpmap:96 L8/8000
a=rtpmap:97 L16/8000
a=rtpmap:98 L16/11025/2
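The “a=rtpmap” syntax shown above maps directly onto a regular expression. A small sketch follows, with a made-up function name. The optional encoding parameter is returned as a string, or null when absent.

```javascript
// Parse "a=rtpmap:<payload type> <encoding name>/<clock rate>[/<encoding parameters>]".
function parseRtpmap(line) {
  const match = /^a=rtpmap:(\d+) ([^/\s]+)\/(\d+)(?:\/(\S+))?$/.exec(line.trim());
  if (!match) {
    throw new Error("not an a=rtpmap line: " + line);
  }
  const [, payloadType, encodingName, clockRate, encodingParameters] = match;
  return {
    payloadType: Number(payloadType),
    encodingName,
    clockRate: Number(clockRate),
    encodingParameters: encodingParameters ?? null, // e.g. channel count for audio
  };
}

const l16 = parseRtpmap("a=rtpmap:98 L16/11025/2");
console.log(l16.payloadType, l16.encodingName, l16.clockRate, l16.encodingParameters);
```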

“a=ptime” , “a=maxptime”

Direction as “a=sendrecv” , “a=recvonly” , “a=sendonly” , “a=inactive”

Muxing as “a=rtcp-mux” ,

“a=rtcp-mux-only”

RTCP attributes “a=rtcp” , “a=rtcp-rsize”

Line “c=” is checked .

Line “b=” for bandwidth , <bwtype>

Attributes for “a=” could be “a=ice-ufrag” , “a=ice-pwd” , “a=ice-options” , “a=candidate” , “a=remote-candidate” , “a=end-of-candidates” and “a=fingerprint”

Semantics Verification

Interactive Connectivity Establishment (ICE) for NAT traversal

Protocols using offer/answer are difficult to operate through Network Address Translators (NATs), since they carry the IP addresses and ports of media sources and sinks within their messages. Real-time media also puts emphasis on reduced latency and decreased packet loss.

ICE is an extension to the offer/answer model. It works by including a multiplicity of IP addresses and ports in SDP offers and answers, which are then tested by peer-to-peer connectivity checks.
The addresses are gathered and the checks performed using STUN and its extension TURN.
ICE also allows for address selection for multi-homed and dual-stack hosts.

ICE allows the agents to discover enough information about their topologies to potentially find one or more paths by which they can communicate. Then it systematically tries all possible pairs (in a carefully sorted order) until it finds one or more that work.

ICE Gathering

Caller and callee perform checks to finalize the protocol and routing needed to establish a peer connection. A number of candidates are proposed until both sides mutually agree upon one. The PeerConnection then uses that candidate's details to initiate the connection.

While applying a local description at the media engine level, if an m= section is new, the WebRTC media stack begins gathering candidates for it.

RTCPeerConnection exposes canTrickleIceCandidates . ICE trickling is the process of continuing to send candidates after the initial offer or answer has already been sent to the other peer.

The ICE transport role determines which side chooses the candidate pair

The ICE layer sets one peer as the controlling agent and the other as the controlled agent. The controlling agent makes the final decision as to which candidate pair to choose.

Final selected candidate in SDP

a=group:BUNDLE 0 1 2
a=msid-semantic: WMS 9Cv3eIelHVuhxrGfxSvUsfokNu4eb4R9PYw2

m=audio 59937 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
c=IN IP4 x.x.x.x
a=rtcp:9 IN IP4 0.0.0.0
a=candidate:2880323124 1 udp 2122260223 x.x.x.x 59937 typ host generation 0 network-id 1 network-cost 10
a=candidate:3844981444 1 tcp 1518280447 x.x.x.x 9 typ host tcptype active generation 0 network-id 1 network-cost 10

An agent identifies all its CANDIDATES , each of which is a transport address. Types:

  • HOST CANDIDATE – obtained directly from a local interface, which could be Wifi, Virtual Private Network (VPN) or Mobile IP (MIP)
    If an agent is multihomed ( private and public networks) , it obtains a candidate from each IP address and includes all candidates in its offer.
  • Additional candidates obtained using STUN or TURN. Types
    • translated addresses on the public side of a NAT (SERVER REFLEXIVE CANDIDATES)
    • addresses on TURN servers (RELAYED CANDIDATES)

Mapping Server Reflexive address

Agent sends the TURN Allocate request from IP address and port X:x,
NAT will create a binding X1′:x1′, mapping this server reflexive candidate to the host candidate X:x ( BASE).
Outgoing packets sent from the host candidate will be translated by the NAT to the server reflexive candidate.
Incoming packets sent to the server reflexive candidate will be translated by the NAT to the host candidate and forwarded to the agent.

Allocate request and response from TURN – informing the agent of this relayed candidate

Only STUN based Binding

The agent sends a STUN Binding request to its STUN server, which discovers the server reflexive candidate and sends it back in the Binding response.

STUN Binding request for connectivity checks on CANDIDATE PAIRS

The candidates are carried in attributes in the SDP offer. The remote peer also follows this process, gathering and sending its own sorted list of candidates. Hence CANDIDATE PAIRS from both sides are formed.

PEER REFLEXIVE CANDIDATES – connectivity checks can produce additional candidates, especially around symmetric NAT

Since the same address is used for STUN and media ( RTP/RTCP) , demultiplexing based on packet contents helps to identify which one is which.

Checks : ICE checks are performed in a specific sequence, so that high-priority candidate pairs are checked first.

TRIGGERED CHECKS – accelerates the process of finding a valid candidate
ORDINARY CHECKS – agent works through ordered prioritised check list by sending a STUN request for the next candidate pair on the list periodically.
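The ordering comes from the priority formulas of RFC 8445. The sketch below computes a candidate priority and a pair priority. The type preferences are the values recommended by the RFC (browsers may use different ones, especially for relay candidates), and the local preference 32542 is not a standard value: it is simply worked back from the host and srflx candidates that appear in the examples of this article.

```javascript
// Type preferences recommended by RFC 8445.
const TYPE_PREFERENCE = { host: 126, prflx: 110, srflx: 100, relay: 0 };

// priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)
function candidatePriority(type, localPreference, componentId) {
  return TYPE_PREFERENCE[type] * 2 ** 24 + localPreference * 2 ** 8 + (256 - componentId);
}

// pair priority = 2^32 * MIN(G, D) + 2 * MAX(G, D) + (G > D ? 1 : 0)
// G: priority of the controlling agent's candidate, D: the controlled agent's.
// BigInt is used because the result does not fit into a 53-bit integer safely.
function pairPriority(controllingPriority, controlledPriority) {
  const g = BigInt(controllingPriority);
  const d = BigInt(controlledPriority);
  const min = g < d ? g : d;
  const max = g > d ? g : d;
  return (min << 32n) + 2n * max + (g > d ? 1n : 0n);
}

const hostPriority = candidatePriority("host", 32542, 1);
const srflxPriority = candidatePriority("srflx", 32542, 1);
console.log(hostPriority, srflxPriority);
console.log(pairPriority(hostPriority, srflxPriority).toString());
```

With these inputs a host-to-host pair sorts ahead of a host-to-srflx pair, which is why direct paths get checked first.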

The check list also tracks frozen pairs: pairs that share a foundation for a media stream stay Frozen until a related check succeeds.

Each candidate pair in the check list has a foundation and a state. States for candidates pairs

1. Waiting: A check has not been performed for this pair, and can be performed as soon as it is the highest-priority Waiting pair on the check list.

2. In-Progress: A check has been sent for this pair, but the transaction is in progress.

3. Succeeded: A check for this pair was already done and produced a successful result.

4. Failed: A check for this pair was already done and failed, either never producing any response or producing an unrecoverable failure response.

5. Frozen: A check for this pair hasn’t been performed, and it can’t yet be performed until some other check succeeds, allowing this pair to unfreeze and move into the Waiting state.
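The interplay of the Waiting and Frozen states can be modelled in a few lines. This is a heavily simplified sketch with made-up names and a plain array as the check list. Real agents also handle timers, triggered checks and several check lists.

```javascript
// Pick the highest-priority pair that is currently in the Waiting state.
function nextPairToCheck(checkList) {
  const waiting = checkList.filter((pair) => pair.state === "Waiting");
  if (waiting.length === 0) return null;
  return waiting.reduce((best, pair) => (pair.priority > best.priority ? pair : best));
}

// On success, unfreeze every Frozen pair that shares the same foundation.
function markSucceeded(checkList, pair) {
  pair.state = "Succeeded";
  for (const other of checkList) {
    if (other.state === "Frozen" && other.foundation === pair.foundation) {
      other.state = "Waiting";
    }
  }
}

const checkList = [
  { id: "a", foundation: "1", priority: 300, state: "Waiting" },
  { id: "b", foundation: "1", priority: 200, state: "Frozen" },
  { id: "c", foundation: "2", priority: 100, state: "Waiting" },
];

const first = nextPairToCheck(checkList); // pair "a"
first.state = "In-Progress";
markSucceeded(checkList, first); // pair "b" moves from Frozen to Waiting
const second = nextPairToCheck(checkList); // pair "b", now ahead of "c"
console.log(first.id, second.id);
```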

Example of ICE gather state

icegatheringstatechange – gathering

icecandidate (host)
sdpMid: 0, sdpMLineIndex: 0, candidate: candidate:1511920713 1 udp 2122260223 192.168.0.2 58122 typ host generation 0 ufrag vzpn network-id 1 network-cost 10

icecandidate (srflx)
sdpMid: 0, sdpMLineIndex: 0, candidate: candidate:4081163164 1 udp 1686052607 106.51.26.168 37542 typ srflx raddr 192.168.0.2 rport 58122 generation 0 ufrag vzpn network-id 1 network-cost 10

icecandidate (host)
sdpMid: 0, sdpMLineIndex: 0, candidate: candidate:345893049 1 tcp 1518280447 192.168.0.2 9 typ host tcptype active generation 0 ufrag vzpn network-id 1 network-cost 10

icecandidate (relay)
sdpMid: 0, sdpMLineIndex: 0, candidate: candidate:2130406062 1 udp 41886207 74.125.39.44 27190 typ relay raddr 106.51.26.168 rport 37542 generation 0 ufrag vzpn network-id 1 network-cost 10

icecandidate (relay)
sdpMid: 0, sdpMLineIndex: 0, candidate: candidate:3052096874 1 udp 25108479 172.217.163.158 28049 typ relay raddr 106.51.26.168 rport 37543 generation 0 ufrag vzpn network-id 1 network-cost 10

icegatheringstatechange – complete
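The candidate strings logged above follow a fixed layout: foundation, component, transport, priority, address, port, then “typ” and the candidate type, followed by key/value extensions. A rough parsing sketch follows. The function name is made up, and the extensions are assumed to always come in key/value pairs, as they do in these logs.

```javascript
// Parse a candidate string such as the ones carried in icecandidate events.
function parseCandidate(candidate) {
  const fields = candidate
    .replace(/^a=/, "")
    .replace(/^candidate:/, "")
    .trim()
    .split(/\s+/);
  const [foundation, component, protocol, priority, address, port, typKeyword, type, ...rest] = fields;
  if (typKeyword !== "typ" || type === undefined) {
    throw new Error("malformed candidate: " + candidate);
  }
  // Remaining tokens are extension pairs: raddr, rport, tcptype, generation, ...
  const extensions = {};
  for (let i = 0; i + 1 < rest.length; i += 2) {
    extensions[rest[i]] = rest[i + 1];
  }
  return {
    foundation,
    component: Number(component),
    protocol: protocol.toLowerCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    type,
    extensions,
  };
}

const srflx = parseCandidate(
  "candidate:4081163164 1 udp 1686052607 106.51.26.168 37542 typ srflx raddr 192.168.0.2 rport 58122 generation 0 ufrag vzpn network-id 1 network-cost 10"
);
console.log(srflx.type, srflx.address, srflx.port, srflx.extensions.raddr);
```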

Example Candidate Checking

iceconnectionstatechange : checking

setRemoteDescription L type: answer, sdp: v=0

m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 110 112 113 126
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:ydvf
a=ice-pwd:mb4ousBoT6B0l//ljjD/9Z/M
a=ice-options:trickle


m=video 9 UDP/TLS/RTP/SAVPF 98 100 96 97 99 101 102 122 127 121 125 107 108 109 124 120 123 119 114 115 116
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:ydvf
a=ice-pwd:mb4ousBoT6B0l//ljjD/9Z/M
a=ice-options:trickle

addIceCandidate (host)
sdpMid: , sdpMLineIndex: 0, candidate: candidate:1511920713 1 udp 2122260223 192.168.0.2 56060 typ host generation 0 ufrag ydvf network-id 1 network-cost 10

iceconnectionstatechange : connected

Candidate Nomination for Media Path

Selecting low-latency media paths can use various techniques, such as actual round-trip time (RTT) measurement.
The controlling agent gets to nominate which candidate pairs will be used for media amongst the ones that are valid. There are two ways:
regular nomination and aggressive nomination

TBD

To read More on WebRTC Communication as a platform

WebRTC Media Stack

WebRTC service’s

References :

http://w3c.github.io/webrtc-pc/ WebRTC 1.0: Real-time Communication Between Browsers – W3C Editor’s Draft 31 August 2019
RFC 5245 Inter

WebRTC compatible android client

This post describes the requirements for creating a SIP phone application on Android over the same codecs as WebRTC ( PCMA , PCMU , VP8 ) . In my project demonstrating WebRTC interoperability ( presence , audio / video call , message ) with a native Android client, I had to develop a lightweight Android SIP application, customized to the look and feel of the WebRTC web application. This also enables services added to the WebRTC client, such as geolocation , visual voice mail , phonebook and call control options, to be set from the Android application as well.

Aim :

Android WebRTC-SIP client development, using the sipml5 stack through web services and native Android programming.

Software Used:

⦁ Eclipse IDE
⦁ Java SE Development Kit 7.0
⦁ Android SDK

Tasks :

⦁ Authorization of a user, based on his/her credentials (Database local to the application).

⦁ Navigation Drawer on the home page which shows a menu giving the user various options like:
⦁ View Home Page
⦁ View Contact List
⦁ View/Edit My Profile
⦁ View My Location
⦁ Sign Out

⦁ Phonebook sync : Importing contact list of the Android Phone into the application. Editing user profile with values like  User Name ,  Password ,  Domain. 

⦁ Inclusion of a Web View in the application which currently opens the desired webpage(http://sipml5.org/call.htm).

⦁ Geolocation: Showing a marker for the current location of the user in Google Maps. Displaying the address of the user in a Toast message.


⦁ Audio / Video call capability 


figure 1 : Login page , figure 2 : Call page , Figure 3 : Menu bar 

Future Roadmap:

⦁ Connecting the application to a database which sits on the cloud.
⦁ Based on the entries in the database the user will be able to:
⦁ Login to the application.
⦁ View or edit his/her details in the My Profile Section.
⦁ Understanding codes of sample applications for making SIP calls from Android OS like:
⦁ SipDroid
⦁ SipDemo
⦁ IMSDroid
⦁ Modifying the existing application to be able to make SIP calls like one of the apps listed above.

Modules :

Development Done:
  1. Development of an authorization page connecting the application to a local database from where values are inserted and retrieved.
  2. Development of navigation drawer where additional options for the application will be displayed making it a user friendly application.
Development Planned:

1. Connectivity to a cloud database.

2. App engine on cloud.

3. Importing contacts from phone address book.

4. Offline storage of profile details and a few call logs.

Architecture:

webrtc_android_enviornment

……………………………………………………………………………………..

Difference between WebRTC and plugin based communication

A lot of service providers, i.e. telecom operators, had devised their own ways to provide web based communication even before WebRTC was born. With time, as WebRTC has become stronger, more secure and more resilient to failure, they have come around to migrating their existing systems from the previous closed-box native APIs to open source WebRTC APIs.

The first figure ( given below ) depicts a communication platform built over plugins and proprietary APIs using HTTP REST based signalling.

Web Communication Service Architecture over HTTP/ REST API

As the migration took place, the proprietary components were replaced by open standard based entities: plugins were replaced by WebRTC APIs, and HTTP REST based signalling was replaced by SIP ( Session Initiation Protocol ) .

Web Communication Service Architecture over WebRTC SIP

Note that the telecom operator network did not have to be transformed to integrate the WebRTC elements.

WebRTC business benefits to OTT and telecom carriers

Historically, RTC has been corporate and complex, requiring expensive audio and video technologies to be licensed or developed in house. Integrating RTC technology with existing content, data and services has been difficult and time consuming, particularly on the web.
Now, with WebRTC, the operator finally gets a chance to shift the focus from OTT ( Over The Top ) service providers like Skype , Google chat , WebEx etc. , which were otherwise eating away at the operator's revenue, to its very own WebRTC client-server solution, hence making the VoIP calls chargeable while at the same time being available from any client ( web or softphone based ).

To know more about what webrtc is read : https://telecom.altanai.com/2013/08/02/what-is-webrtc/

To read about how webrtc integrates with the SIP/IMS systems read https://telecom.altanai.com/2013/10/02/webrtc-solution/

OTT ( Over The Top ) Applications

Where are we Now ?

WebRTC has now implemented open standards for real-time, plugin-free video, audio and data communication.

Many web services already use RTC, but need downloads, native apps or plugins. These include Skype, Facebook (which uses Skype Flash ) and Google Hangouts (which uses the Google Talk plugin).
Downloading, installing and updating plugins such as Flash , Java etc. can be complex, error prone and annoying.

Plugins can be difficult to deploy, debug, troubleshoot, test and maintain, and may require licensing and integration with complex, expensive technology. It’s often difficult to persuade people to install plugins in the first place, bookmark them or keep them activated at all times.

WebRTC support across various browsers , pic source : caniuse.com

API support from browser

  • PeerConnection API
  • getUserMedia
  • WebAudio Integration
  • dataChannels
  • TURN support
  • Echo cancellation
  • MediaStream API
  • Multiple Streams
  • Simulcast
  • Screen Sharing
  • mediaConstraints
  • Stream re-broadcasting
  • getStats API
  • ORTC API
  • H.264 video
  • VP8 video
  • Solid interoperability
  • srcObject in media element
  • Promise based getUserMedia
  • Promise based PeerConnection API

WebRTC trends

WebRTC business users graph , pic source : Disruptive Analysis

The APIs and standards of WebRTC can democratize and decentralize tools for content creation and communication—for telephony, gaming, video production, music making, news gathering and many other applications.

pic source: iswebrtcreadyyet.com

What is WebRTC?


WebRTC 1.0: Real-time Communication Between Browsers – W3C Candidate Recommendation 13 December 2019 https://www.w3.org/TR/webrtc/

Read more on the layers of WebRTC and their functionalities here : WebRTC layers

Open Source WebRTC SDK and its implementation steps https://github.com/altanai/webrtc

What is WebRTC ?

WebRTC (Web Real-Time Communication) is an API definition drafted by the World Wide Web Consortium (W3C) that supports browser-to-browser applications for voice calling, video chat, and P2P file sharing without the need of either internal or external plugins.

  • Enables browser to browser media streaming over the secure RTP profile
  • Standardized on an API level at the W3C and at the protocol level at the IETF
  • Enables web browsers with Real-Time Communications (RTC) capabilities
  • Written in C++ and JavaScript
  • BSD-style license
  • Free, open project available in all major browsers

As of the 2019 update the W3C defines it as

a set of ECMAScript APIs in WebIDL to allow media to be sent to and received from another browser or device implementing the appropriate set of real-time protocols. This specification is being developed in conjunction with a protocol specification developed by the IETF RTCWEB group and an API specification to get access to local media devices.

 The following is the browser side stack for WebRTC media.

WebRTC media stack Solution Architecture
WebRTC Media Stack

Open and Free Codecs

Codecs handle the compression and decompression of the media stream. For peers to exchange media successfully, they need to agree upon a common set of codecs for the session. The list of codecs is sent between the peers as part of the offer and answer, carried as SDP, as in SIP.

WebRTC uses bare MediaStreamTrack objects for each track being shared from one peer to another. The codecs associated with those tracks are not mandated by the WebRTC specification.

For video, as per RFC 7742 WebRTC Video Processing and Codec Requirements , the mandatory codecs to be supported by WebRTC clients are : VP8 and H.264’s Constrained Baseline profile

For audio, as per RFC 7874 WebRTC Audio Codec and Processing Requirements , browsers must support the Opus codec as well as G.711’s PCMA and PCMU formats.
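As a quick illustration, the mandatory audio codecs can be looked up among the “a=rtpmap” lines of an SDP. The helper below is a made-up sketch that scans the whole SDP string rather than only the audio m= section, which is good enough for a sanity check.

```javascript
// Encoding names of the mandatory-to-implement audio codecs (RFC 7874).
const MANDATORY_AUDIO_CODECS = ["opus", "pcmu", "pcma"];

// Returns the mandatory audio codecs that do not appear in any a=rtpmap line.
function missingMandatoryAudioCodecs(sdp) {
  const offered = new Set();
  for (const line of sdp.split(/\r?\n/)) {
    const match = /^a=rtpmap:\d+ ([^/\s]+)\//.exec(line);
    if (match) offered.add(match[1].toLowerCase());
  }
  return MANDATORY_AUDIO_CODECS.filter((name) => !offered.has(name));
}

const audioSdp = [
  "m=audio 9 UDP/TLS/RTP/SAVPF 111 0",
  "a=rtpmap:111 opus/48000/2",
  "a=rtpmap:0 PCMU/8000",
].join("\r\n");

console.log(missingMandatoryAudioCodecs(audioSdp));
```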

Video Resolution handling

Unless the SDP specifically signals otherwise, the web browser receiving a WebRTC video stream must be able to handle video at at least 20 FPS at a minimum resolution of 320 pixels wide by 240 pixels tall.

In the best scenarios ( available bandwidth and media devices ), VP8 sets no practical upper limit on the resolution of the video stream, so the stream can go as far as the codec's maximum resolution of 16384×16384 pixels.

Independent of Signalling

WebRTC does not specify any signalling / telecommunication protocol, and it is up to the adopter to perform the offer/answer exchange in any way deemed fit for the use case. For example, a web-only application may use plain WebSockets, whereas an app that must be compatible with telecom endpoints should use SIP as the protocol.

Read more about WebRTC handshakes :

NAT-traversal technologies such as ICE, STUN, and TURN

I have written in detail about TURN based WebRTC flow diagrams here :

https://telecom.altanai.com/2015/03/11/nat-traversal-using-stun-and-turn/. The post describes the ICE ( Interactive Connectivity Establishment ) framework, which is mandatory by WebRTC standards. It finds network interfaces and ports in the offer/answer model in order to exchange network information with the participating communication clients. ICE makes use of the Session Traversal Utilities for NAT (STUN) protocol and its extension, Traversal Using Relay NAT (TURN).

NAT and TURN Relay

Learn about hosting / integrating different TURN servers for WebRTC

TURN server for WebRTC – RFC5766-TURN-Server , Coturn , Xirsys – https://telecom.altanai.com/2015/03/28/turn-server-for-webrtc-rfc5766-turn-server-coturn-xirsys/

Why is WebRTC important ?

Significantly better video quality : WebRTC video quality is noticeably better than Flash.
Up to 6x faster connection times : Using JavaScript WebSockets, also an HTML5 standard, improves session connection times and accelerates delivery of other OpenTok events.
Reduced audio/video latency : WebRTC offers significant improvements in latency, enabling more natural and effortless conversations.
Freedom from Flash : With WebRTC and JavaScript WebSockets, you no longer need to rely on Flash for browser-based RTC.
Native HTML5 elements : Customize the look and feel and work with video like you would any other element on a web page with the new video tag in HTML5.

The major players behind the conception and advancement of WebRTC standards and libraries are:

IETF, W3C, Java community, GSMA. The idea is to develop a lightweight browser-based call console to make SIP calls from a web page. This was successfully achieved using fundamental technologies such as JavaScript, HTML5, WebSockets and TCP/UDP, together with an open source SIP server. It is good to note that no extra extension, plugin or gateway, such as Flash support, is required. It also offers cross-platform support, including Mozilla Firefox, Chrome and so on.

Peer to peer Communication

WebRTC forms a p2p communication channel between all the peers. This means that as the participant count grows, the call turns into a mesh networking topology in which each participant has an incoming and an outgoing stream for each of its peers.
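
The cost of a full mesh can be estimated with simple arithmetic: with n participants, each client holds n - 1 peer connections and the call as a whole needs n(n - 1)/2 of them. The sketch below computes these figures, assuming every participant sends exactly one stream to every other participant; the function name meshLoad and its result fields are made up for this example.

```javascript
// Estimates per-client and total load for a full-mesh call.
function meshLoad(participants) {
  if (!Number.isInteger(participants) || participants < 2) {
    throw new RangeError("A call needs at least two participants");
  }
  const peersPerClient = participants - 1;
  return {
    connectionsPerClient: peersPerClient,
    outgoingStreamsPerClient: peersPerClient,
    incomingStreamsPerClient: peersPerClient,
    // Each connection is shared by two peers, hence the division by two.
    totalConnections: (participants * peersPerClient) / 2,
  };
}
```

A two-party call needs a single connection, while a five-party mesh already needs ten, with every client encoding and uploading its media four times.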

Two party call p2p

[Figure: two-party peer-to-peer call]

Multiparty Call and mesh network

[Figure: multiparty call, mesh-based WebRTC video conferencing]

In the special case of broadcasting to a large number of viewers (without outgoing media streams), it is recommended to set up a Multipoint Control Unit (MCU), which relays the incoming stream to a large number of users without putting traffic load on the client from which the stream actually originates.

Important note:

1. It should be noted that these diagrams do not depict ICE and NAT traversal and have been simplified for better understanding. In real-world scenarios a STUN and TURN server is involved almost all the time.

More on TURN Servers is given here : NAT traversal using STUN and TURN

2. WebRTC also mandates the use of a secure origin (https) on any web page that invokes getUserMedia to capture user media such as audio and video; the same restriction applies to location access.
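
The secure-origin rule can be approximated in a few lines. The sketch below is a simplification that assumes only https pages and plain-http loopback addresses used during development count as secure; the function name canCaptureMediaFrom is hypothetical. Inside a real page, the browser's own window.isSecureContext flag is the authoritative check.

```javascript
// Approximates the browser's secure-context rule for a page URL.
function canCaptureMediaFrom(pageUrl) {
  const { protocol, hostname } = new URL(pageUrl);
  const isLoopback = hostname === "localhost" || hostname === "127.0.0.1";
  return protocol === "https:" || (protocol === "http:" && isLoopback);
}
```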

Browser Adoption

As of March 2020, WebRTC is supported on the following browsers:

  • Desktop PC
    Microsoft Edge 12+
    Google Chrome 28+
    Mozilla Firefox 22+
    Safari 11+
    Opera 18+
    Vivaldi 1.9+
  • Android
    Google Chrome 28+ (enabled by default since 29)
    Mozilla Firefox 24+
    Opera Mobile 12+
  • Chrome OS
  • Firefox OS
  • BlackBerry 10
  • iOS
    MobileSafari/WebKit (iOS 11+)
  • Tizen 3.0

Furthermore , read about the Steps for building and deploying WebRTC solution – https://telecom.altanai.com/2014/12/04/steps-for-building-and-deploying-webrtc-solution/

TURN based media Relay

WebRTC APIs

Javascript functions  to access and process the browser media stack

getUserMedia

acquires the audio and video media (e.g., by accessing a device’s camera and microphone)

Properties

ondevicechange

Methods

enumerateDevices()
getDisplayMedia()
getSupportedConstraints()
getUserMedia()

navigator.mediaDevices.getUserMedia({ audio: true, video: true })
.then(function(stream) {
  var video = document.querySelector('video');
  // Older browsers may not have srcObject
  if ("srcObject" in video) {
    video.srcObject = stream;
  } else {
    // Avoid using this in new browsers, as it is going away.
    video.src = window.URL.createObjectURL(stream);
  }
  video.onloadedmetadata = function(e) {
    video.play();
  };
})
.catch(function(err) {
  console.log(err.name + ": " + err.message);
});

DOMException errors on getUserMedia

Rejections of the returned promise are made by passing a DOMException error object to the promise’s failure handler. Possible errors are:

AbortError
Although the user and operating system both granted access to the hardware device, a problem occurred which prevented the device from being used.

NotAllowedError
One or more of the requested source devices cannot be used at this time. This will happen if the browsing context is insecure (http instead of https), if the user has specified that the current browsing instance or session is not permitted access to the device, or if the user has denied all access to user media devices globally.

NotFoundError
No media tracks of the type specified were found that satisfy the given constraints.

NotReadableError
Although the user granted permission to use the matching devices, a hardware error occurred at the operating system, browser, or Web page level which prevented access to the device.

OverconstrainedError
No candidate devices met the requested criteria. The error object has a constraint property, whose string value is the name of a constraint which could not be met, and a message property containing a human-readable string explaining the problem.

Example constraints:

// front is a boolean chosen by the application; true selects the user-facing camera
var constraints = { video: { facingMode: (front ? "user" : "environment") } };

SecurityError
User media support is disabled on the Document on which getUserMedia() was called.

TypeError
The list of constraints specified is empty, or has all constraints set to false.
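
The error names listed above can be turned into user-facing hints in the failure handler of getUserMedia(). The sketch below is one way to do that; the hint wording and the names GET_USER_MEDIA_HINTS and describeGetUserMediaError are assumptions for this example, and only the error names themselves come from the list above.

```javascript
// Maps DOMException names raised by getUserMedia() to short hints.
const GET_USER_MEDIA_HINTS = new Map([
  ["AbortError", "A problem prevented the device from being used."],
  ["NotAllowedError", "Permission was denied or the page is not served over https."],
  ["NotFoundError", "No media tracks of the requested type were found."],
  ["NotReadableError", "A hardware error prevented access to the device."],
  ["OverconstrainedError", "No device satisfies the requested constraints."],
  ["SecurityError", "User media support is disabled on this document."],
  ["TypeError", "The constraints are empty or all set to false."],
]);

// `error` only needs a `name` property, as DOMException and TypeError both have.
function describeGetUserMediaError(error) {
  const hint =
    GET_USER_MEDIA_HINTS.get(error.name) ||
    "Unexpected error while accessing media devices.";
  return error.name + ": " + hint;
}
```

In a page, this would be used as navigator.mediaDevices.getUserMedia(constraints).catch(function (err) { console.log(describeGetUserMediaError(err)); }).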

RTCPeerConnection

enables audio and video communication between peers. It performs signal processing, codec handling, peer-to-peer communication, security, and bandwidth management.

Properties

canTrickleIceCandidates
connectionState
currentLocalDescription
currentRemoteDescription
getDefaultIceServers()
iceConnectionState
iceGatheringState
localDescription
onaddstream
onconnectionstatechange
ondatachannel
onicecandidate
oniceconnectionstatechange
onicegatheringstatechange
onidentityresult
onnegotiationneeded
onremovestream
onsignalingstatechange
ontrack
peerIdentity
pendingLocalDescription
pendingRemoteDescription
remoteDescription
sctp
signalingState

Methods

addIceCandidate()
addStream()
addTrack()
close()
createAnswer()
createDataChannel()
createOffer()
generateCertificate()
getConfiguration()
getIdentityAssertion()
getReceivers()
getSenders()
getStats()
getStreamById()
getTransceivers()
removeStream()
removeTrack()
restartIce()
setConfiguration()
setIdentityProvider()
setLocalDescription()
setRemoteDescription()

Signalling state transitions diagram (source: W3C)

RTC Signalling states

stable
There is no offer/answer exchange in progress. This is also the initial state, in which case the local and remote descriptions are empty.

have-local-offer
Local description, of type “offer”, has been successfully applied.

have-remote-offer
Remote description, of type “offer”, has been successfully applied.

have-local-pranswer
Remote description of type “offer” has been successfully applied and a local description of type “pranswer” has been successfully applied.

have-remote-pranswer
Local description of type “offer” has been successfully applied and a remote description of type “pranswer” has been successfully applied.

closed
The RTCPeerConnection has been closed; its [[IsClosed]] slot is true.
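
The states above can be read as a small state machine driven by setLocalDescription() and setRemoteDescription(). The sketch below is a simplified reading of it: it ignores the closed state and the implicit rollback behaviour, and the table name SIGNALING_TRANSITIONS and function name nextSignalingState are invented for this example rather than taken from the API.

```javascript
// Key format: "<current state>|<local or remote>|<description type>".
const SIGNALING_TRANSITIONS = new Map([
  ["stable|local|offer", "have-local-offer"],
  ["stable|remote|offer", "have-remote-offer"],
  ["have-local-offer|local|offer", "have-local-offer"],
  ["have-local-offer|remote|pranswer", "have-remote-pranswer"],
  ["have-local-offer|remote|answer", "stable"],
  ["have-local-offer|local|rollback", "stable"],
  ["have-remote-offer|remote|offer", "have-remote-offer"],
  ["have-remote-offer|local|pranswer", "have-local-pranswer"],
  ["have-remote-offer|local|answer", "stable"],
  ["have-remote-offer|remote|rollback", "stable"],
  ["have-local-pranswer|local|pranswer", "have-local-pranswer"],
  ["have-local-pranswer|local|answer", "stable"],
  ["have-remote-pranswer|remote|pranswer", "have-remote-pranswer"],
  ["have-remote-pranswer|remote|answer", "stable"],
]);

// Returns the next signalling state, or throws where a browser would
// reject the description as invalid for the current state.
function nextSignalingState(current, source, type) {
  const next = SIGNALING_TRANSITIONS.get(current + "|" + source + "|" + type);
  if (next === undefined) {
    throw new Error(
      "Invalid transition from " + current + " on " + source + " " + type
    );
  }
  return next;
}
```

On the caller side, a plain offer/answer exchange walks stable, have-local-offer and back to stable; on the callee side it walks stable, have-remote-offer and back to stable.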

RTCSdpType

offer
An RTCSdpType of offer indicates that a description MUST be treated as an [SDP] offer.

pranswer
An RTCSdpType of pranswer indicates that a description MUST be treated as an [SDP] answer, but not a final answer.

answer
An RTCSdpType of answer indicates that a description MUST be treated as an [SDP] final answer, and the offer-answer exchange MUST be considered complete. A description used as an SDP answer may be applied as a response to an SDP offer or as an update to a previously sent SDP pranswer.

rollback
An RTCSdpType of rollback indicates that a description MUST be treated as canceling the current SDP negotiation and moving the [SDP] offer back to what it was in the previous stable state.

RTCConfiguration

Defines a set of parameters that configure how the peer-to-peer communication via RTCPeerConnection is established or re-established.

iceServers of type sequence<RTCIceServer>
array of objects describing servers available to be used by ICE, such as STUN and TURN servers.

iceTransportPolicy of type RTCIceTransportPolicy.
Indicates which candidates the ICE Agent is allowed to use.
Types :

  • relay
    ICE Agent uses only media relay candidates such as candidates passing through a TURN server.
  • all
    The ICE Agent can use any type of candidate when this value is specified.

bundlePolicy of type RTCBundlePolicy.
media-bundling policy to use when gathering ICE candidates. The bundle policy affects which media tracks are negotiated if the remote endpoint is not bundle-aware, and what ICE candidates are gathered. If the remote endpoint is bundle-aware, all media tracks and data channels are bundled onto the same transport.
Types :

  • balanced
    Gather ICE candidates for each media type in use (audio, video, and data). If the remote endpoint is not bundle-aware, negotiate only one audio and video track on separate transports.
  • max-compat
    Gather ICE candidates for each track. If the remote endpoint is not bundle-aware, negotiate all media tracks on separate transports.
  • max-bundle
    Gather ICE candidates for only one track. If the remote endpoint is not bundle-aware, negotiate only one media track.

rtcpMuxPolicy of type RTCRtcpMuxPolicy.
rtcp-mux policy to use when gathering ICE candidates.

certificates of type sequence<RTCCertificate>
A set of certificates that the RTCPeerConnection uses to authenticate.

iceCandidatePoolSize of type octet, defaulting to 0
Size of the prefetched ICE pool as defined in [JSEP]
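
Putting the members above together, a configuration object might look like the sketch below. The server URLs, username and credential are placeholders invented for this example, and buildPeerConfiguration is a hypothetical helper; only the member names and their allowed values come from the description above.

```javascript
// Builds a plain configuration object using the members described above.
// The STUN/TURN addresses and credentials are placeholders, not real servers.
function buildPeerConfiguration(options = {}) {
  const { relayOnly = false } = options;
  return {
    iceServers: [
      { urls: "stun:stun.example.org:3478" },
      {
        urls: "turn:turn.example.org:3478",
        username: "demo-user",
        credential: "demo-password",
      },
    ],
    // "relay" forces all media through TURN, "all" allows any candidate type.
    iceTransportPolicy: relayOnly ? "relay" : "all",
    bundlePolicy: "balanced",
    rtcpMuxPolicy: "require",
    iceCandidatePoolSize: 0,
  };
}
```

In a browser, the result would be passed to the constructor, for example new RTCPeerConnection(buildPeerConfiguration({ relayOnly: true })).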

RTCDataChannel

allows bidirectional communication of arbitrary data between peers. Its API is modelled on that of WebSockets, and it has very low latency.

getStats

allows the web application to retrieve a set of statistics about WebRTC sessions. These statistics are described in a separate W3C document.
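
As a minimal sketch of how such statistics might be consumed, the function below walks a stats report and totals the packet counters of its inbound-rtp entries. It accepts any object with a values() method, so a Map works as a stand-in for the report that getStats() resolves with; the function name summarizeInboundRtp and the lossRatio field are assumptions for this example.

```javascript
// `report` is anything iterable via values(), such as an RTCStatsReport or a Map.
function summarizeInboundRtp(report) {
  let packetsReceived = 0;
  let packetsLost = 0;
  for (const stat of report.values()) {
    if (stat.type === "inbound-rtp") {
      packetsReceived += stat.packetsReceived || 0;
      packetsLost += stat.packetsLost || 0;
    }
  }
  const total = packetsReceived + packetsLost;
  return {
    packetsReceived,
    packetsLost,
    lossRatio: total === 0 ? 0 : packetsLost / total,
  };
}
```

In a page, this could be called as myPeerConnection.getStats().then(summarizeInboundRtp).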

Peer to Peer DTMF

-tbd

Call Setup between WebRTC Endpoints

Updates in W3C, 13 Dec 2019

Over the years since its adoption, many of the associated technologies have been deprecated from WebRTC based platforms and environments, some of which are:

OAuth as a credential method for ICE servers
Negotiated RTCRtcpMuxPolicy (previously marked at risk)
voiceActivityDetection
RTCCertificate.getSupportedAlgorithms()
RTCRtpEncodingParameters: ptime, maxFrameRate, codecPayloadType, dtx, degradationPreference
RTCRtpDecodingParameters: encodings
RTCDataChannel.priority

Some of the newly added features include:

restartIce() method added to RTCPeerConnection
Introduced the concept of “perfect negotiation”, with an example to solve signalling races.
Implicit rollback in setRemoteDescription to solve races.
Implicit offer/answer creation in setLocalDescription to solve races.
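
The core of “perfect negotiation” is a small decision about what to do when both sides send an offer at the same time: one peer is designated polite and yields, the other is impolite and ignores the incoming offer. The sketch below isolates that decision as a pure function; it follows the shape of the example in the W3C document, but the function name shouldIgnoreOffer and the way the inputs are passed are assumptions made here.

```javascript
// Decides whether an incoming description should be ignored because it
// collides with an offer this peer is already making.
function shouldIgnoreOffer({ polite, makingOffer, signalingState, descriptionType }) {
  const offerCollision =
    descriptionType === "offer" &&
    (makingOffer || signalingState !== "stable");
  // Only the impolite peer ignores a colliding offer; the polite peer
  // accepts it and lets its own pending offer be rolled back.
  return !polite && offerCollision;
}
```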

References :

WebRTC 1.0: Real-time Communication Between Browsers – W3C Candidate Recommendation 13 December 2019 – https://www.w3.org/TR/webrtc/

WebRTC Stack Architecture and layers

WebRTC stands for Web Real-Time Communications and introduces a real-time media framework in the browser core, alongside associated JavaScript APIs for controlling the media framework and HTML5 tags for displaying the media.

If you are new to WebRTC , read what is WebRTC ? From a technical point of view, WebRTC will hide all the complexity of real-time media behind a very simple JavaScript API. 

Codec Confusion :

Video Codecs

Currently VP8 is the codec of choice since it is royalty-free. In mobility today, the codec of choice is H.264. H.264 is not royalty-free, but it is native in most mobile handsets due to its high performance.

Audio Codecs

Opus is a lossy audio compression format developed by the Internet Engineering Task Force (IETF) targeting a broad range of interactive real-time applications over the Internet, from speech to music. As an open format standardized through RFC 6716, a reference implementation is provided under the 3-clause BSD license. All known software patents which cover Opus are licensed under royalty-free terms.

G.711 is an ITU (International Telecommunications Union) standard for audio compression. It is primarily used in telephony. The standard was released in 1972. It is the required standard in many voice-based systems and technologies, for example in H.320 and H.323 specifications.

Speex is a patent-free audio compression format designed for speech and also a free software speech codec that is used in VoIP applications and podcasts. Some consider Speex obsolete, with Opus as its official successor, but since significant content is out there using Speex, it will not disappear anytime soon.

G.722 is an ITU standard 7 kHz wideband audio codec operating at 48, 56 and 64 kbit/s. It was approved by ITU-T in 1988. G.722 provides improved speech quality due to a wider speech bandwidth of 50–7000 Hz compared to the 300–3400 Hz of G.711.

AMR-WB (Adaptive Multi-Rate Wideband) is a patented wideband speech coding standard that provides improved speech quality due to a wider speech bandwidth of 50–7000 Hz. Its data rate is between 6.6 and 23.85 kbit/s, and the codec is generally available on mobile phones.

Architecture :

WebRTC offers web application developers the ability to write rich, realtime multimedia applications (think video chat) on the web, without requiring plugins, downloads or installs. Its purpose is to help build a strong RTC platform that works across multiple web browsers, across multiple platforms.

[Figure: WebRTC public architecture diagram]

Web API – An API to be used by third-party developers for developing web-based video chat-like applications.

WebRTC Native C++ API – An API layer that enables browser makers to easily implement the Web API proposal

Transport / Session – The session components are built by re-using components from libjingle, without using or requiring the XMPP/jingle protocol.

RTP Stack – A network stack for RTP, the Real-time Transport Protocol.

STUN/ICE – A component allowing calls to use the STUN and ICE mechanisms to establish connections across various types of networks.

Session Management – An abstracted session layer, allowing for call setup and management. This leaves the protocol implementation decision to the application developer.

VoiceEngine – VoiceEngine is a framework for the audio media chain, from sound card to the network.

iSAC / iLBC / Opus

iSAC: A wideband and super wideband audio codec for VoIP and streaming audio. iSAC uses 16 kHz or 32 kHz sampling frequency with an adaptive and variable bit rate of 12 to 52 kbps.

iLBC: A narrowband speech codec for VoIP and streaming audio. Uses 8 kHz sampling frequency with a bitrate of 15.2 kbps for 20ms frames and 13.33 kbps for 30ms frames. Defined by IETF RFCs 3951 and 3952.

Opus: Supports constant and variable bitrate encoding from 6 kbit/s to 510 kbit/s, frame sizes from 2.5 ms to 60 ms, and various sampling rates from 8 kHz (with 4 kHz bandwidth) to 48 kHz (with 20 kHz bandwidth, where the entire hearing range of the human auditory system can be reproduced). Defined by IETF RFC 6716.
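
The iLBC bitrates quoted above follow directly from its frame sizes. As a worked check, the sketch below converts a frame size in bytes and a frame duration in milliseconds into kbit/s; the 38-byte and 50-byte frame sizes are the ones given in RFC 3951, and the function name codecBitrateKbps is made up for this example.

```javascript
// Bitrate in kbit/s for a codec that emits `frameBytes` every `frameMs` milliseconds.
// Bits per millisecond is numerically the same as kbit/s.
function codecBitrateKbps(frameBytes, frameMs) {
  return (frameBytes * 8) / frameMs;
}

const ilbc20ms = codecBitrateKbps(38, 20); // 304 bits every 20 ms
const ilbc30ms = codecBitrateKbps(50, 30); // 400 bits every 30 ms
```

The first value works out to 15.2 kbit/s and the second to roughly 13.33 kbit/s, matching the figures in the iLBC description.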

NetEQ for Voice– A dynamic jitter buffer and error concealment algorithm used for concealing the negative effects of network jitter and packet loss. Keeps latency as low as possible while maintaining the highest voice quality.

Acoustic Echo Canceler (AEC) – The Acoustic Echo Canceler is a software-based signal processing component that removes, in real-time, the acoustic echo resulting from the voice being played out coming into the active microphone.

Noise Reduction (NR) – The Noise Reduction component is a software-based signal processing component that removes certain types of background noise usually associated with VoIP. (Hiss, fan noise, etc…)

Video Engine – VideoEngine is a framework for the video media chain, from the camera to the network, and from the network to the screen.

VP8  – Video codec from the WebM Project. Well suited for RTC as it is designed for low latency.

Video Jitter Buffer – Dynamic Jitter Buffer for video. Helps conceal the effects of jitter and packet loss on overall video quality.

Image enhancements – For example, removes video noise from the image captured by the webcam.

W3C contribution

  • Media Stream Functions

API for connecting processing functions to media devices and network connections, including media manipulation functions.

  • Audio Stream Functions

An extension of the Media Stream Functions to process audio streams (e.g. automatic gain control, mute functions and echo cancellation).

  • Video Stream Functions

An extension of the Media Stream Functions to process video streams (e.g. bandwidth limiting, image manipulation or “video mute“).

  • Functional Component 

 API to query presence of WebRTC components in an implementation, instantiate them and connect them to media streams.

  • P2P Connection Functions

API functions to support establishing signalling protocol-agnostic peer-to-peer connections between Web browsers

  • API specification Availability

WebRTC 1.0: Real-time Communication Between Browsers –  Draft 3 June 2013 available

  • Implementation Library: WebRTC Native APIs

Media Capture and Streams – Draft 16 May 2013

  • Supported by Chrome , Firefox, Opera in desktop of all OS ( Linux, Windows , Mac )
  • Supported by Chrome , Firefox  in Mobile browsers ( android )

IETF contribution

Communication model

Security model

Firewall and NAT traversal

Media functions

Functionality such as media codecs, security algorithms, etc.

Media formats

Transport of non media data between clients

Input to W3C for APIs development

Interworking with legacy VoIP equipment

WG RFC   Date

  • draft-ietf-rtcweb-audio-02      2013-08-02
  • draft-ietf-rtcweb-data-channel-05      2013-07-15
  • draft-ietf-rtcweb-data-protocol-00      2013-07-15
  • draft-ietf-rtcweb-jsep-03      2013-02-27
  • draft-ietf-rtcweb-overview-07      2013-08-14
  • draft-ietf-rtcweb-rtp-usage-07     2013-07-15
  • draft-ietf-rtcweb-security-05      2013-07-15
  • draft-ietf-rtcweb-security-arch-07      2013-07-15
  • draft-ietf-rtcweb-transports-00      2013-08-19
  • draft-ietf-rtcweb-use-cases-and-reqs-11      2013-06-27
  • Plus over 20 discussion RFC drafts

What will be the outcome of WebRTC Adoption?

In simple words, it’s a phenomenal change in decentralizing communication platforms away from proprietary vendors who heavily depended on patented and royalty-bound technologies and protocols. It will revolutionize internet telephony. It will also emerge as platform-independent (i.e. any browser, any desktop operating system, any mobile operating system).

WebRTC allows anybody to add real-time communication to their web page as simply as adding a table.

Read More about webRTC business benefits


Update 2020 – This article was written very early in 2013, while WebRTC was still being standardised and was not yet widely adopted, its inception dating only to 2012.

Many more articles have been written since then to explain the details and applications of WebRTC. A list of these is below:

For SIP IMS and WebRTC

Read about STUN and TURN, which form a critical part of any WebRTC based communication system

Security of WebRTC based CaaS and CPaaS

WebRTC APIs


 WEBRTC CALL BETWEEN BROWSER AND SIP PHONE

Call Between Web client and SIP client

  1. HTML5 and WebRTC enabled Web Client :

We are using an open source HTML5 SIP client written entirely in JavaScript to keep it light and easy to integrate with the SIP server. No extension, plugin or gateway is needed to initiate the call from the web client. The media stack relies on WebRTC. The client can be used to connect to any SIP or IMS network from an HTML5 and WebRTC enabled browser to make and receive audio/video calls and instant messages.

  2. Proxy Server / WS to UDP Translator :

For the proposed solution we use a freeware lightweight SIP server which, besides acting as a normal SIP server and registrar, can also act as a translator engine that converts SIP over WS messages to SIP over UDP. Since one of the requirements is to terminate the call on a hard phone, such as a Turret, that supports only SIP over UDP, the translator is needed in the overall picture. Through this component, use cases such as initiating a call from the web browser and terminating it at the hard phone become possible.

  3. Soft Phone / SIP Client :

We are using the Boghe IMS client as the soft phone, since it supports the audio codecs required to talk with the web client, namely PCMU and PCMA.
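
Whether the two ends share PCMU or PCMA can be read from the audio media line of the SDP, since both codecs have static RTP payload types (0 for PCMU and 8 for PCMA, as assigned in RFC 3551). The sketch below extracts them; the function name g711CodecsInSdp is hypothetical, and the SDP in the usage note is a shortened, made-up sample.

```javascript
// Static RTP payload types from RFC 3551: 0 = PCMU, 8 = PCMA.
const G711_PAYLOAD_TYPES = new Map([
  ["0", "PCMU"],
  ["8", "PCMA"],
]);

// Returns the G.711 codec names offered on the first audio media line.
function g711CodecsInSdp(sdp) {
  const audioLine = sdp
    .split(/\r?\n/)
    .find((line) => line.startsWith("m=audio "));
  if (!audioLine) {
    return [];
  }
  // Layout: m=audio <port> <proto> <payload type> <payload type> ...
  const payloadTypes = audioLine.trim().split(/\s+/).slice(3);
  return payloadTypes
    .filter((pt) => G711_PAYLOAD_TYPES.has(pt))
    .map((pt) => G711_PAYLOAD_TYPES.get(pt));
}
```

For an SDP whose audio line reads m=audio 9 UDP/TLS/RTP/SAVPF 111 0 8, the function returns PCMU and PCMA, so a call to a G.711-only soft phone can be negotiated.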

Working with the components discussed above, we have successfully established the following use-case scenarios.

  1. Call initiated from the browser and terminated on the browser :

(a)   Signalling Part – Initial handshake is done and the call is established. (Captured with Wireshark)

(b)   Media Part – SDP is exchanged, as captured by Wireshark, and both clients can exchange voice.

  2. Call initiated from the browser and terminated on the softphone, and vice versa :

(a)   Signalling Part – Initial handshake is done and the call is established. (Captured with Wireshark)

(b)   Media Part – SDP is exchanged, as captured by Wireshark, and both clients can exchange voice, but with some dependency on the machine being used.

  3. Call initiated from the softphone and terminated on the softphone :

(a)   Signalling Part – Initial handshake is done and the call is established. (Captured with Wireshark)

(b)   Media Part – No hiccups; it works fine.


The structure for multi network traversal using ICE – STUN and TURN is described in the following diagram .

Call Between Web client and SIP client (1)

You can read more about NAT traversal using STUN and TURN here .

Detailed TURN server for WebRTC – RFC5766-TURN-Server , Coturn , Xirsys is here .