JavaScript Session Establishment Protocol (JSEP) in WebRTC handshake

JSEP (JavaScript Session Establishment Protocol)
Offer/Answer Excahange Flow
Perform WebRTC handshake
- Outgoing Call
- Incoming Call
Signalling state Transitions on PeerConnection
Detailed Offer / Answer SDP
Subsequent Offers
Subsequent Answers
Modifying Offer/answer SDP
SDP Parsing
- Session level parsing
- Media Section parsing
Interactive Connectivity Establishment (ICE) for NAT traversal
- ICE Gathering
- Mapping Server Reflexive address
- STUN Binding request for connectivity checks on CANDIDATE PAIRS
- Example of ICE gather state

This article is aimed at explaining the intricacies and detailed offer answer flow in webrtc handshake and JSEP. You can read the following articles on WebRTC as a prereq before reading through this one. WebRTC has API s namely – Peerconnection , getUserMedia , Datachannel and getStats.

JSEP (JavaScript Session Establishment Protocol)

JSEP is used during signalling via w3c’s recommended RTCPeerConnectionAPI interface to set up a multimedia session. The multimedia session description specifies the critical components of setting up a session between local and remote such as transport ports, protocol, profiles. It also handles the interaction with the ICE state machine.

Offer/Answer Excahange Flow

prereq : Setup Client side for the caller
PeerConnectionFactory to generate PeerConnections
PeerConnection for every connection to remote peer
MediaStream audio and video from client device

Side initiating the session creates a offer by CreateOffer() API

aPromise = myPeerConnection.createOffer([options]);

options is type of RTC Offer Options

iceRestart
offerToReceiveAudio ( legacy)
offerToReceiveVideo ( legacy)
voiceActivityDetection

2. The application then stores the offer in local config as setLocalDescriptionAPI()

 myPeerConnection.createOffer().then(function(offer) {
    return myPeerConnection.setLocalDescription(offer);
})

3. Offer is sent to remote side using its choice of signalling ( SIP , WS , HTTP, XMPP .. )

4. Remote party stores it use setRemoteDescription() API

myPeerConnection.setRemoteDescription(sdp)
.then(function () {
  return createMyStream();
})

4. Remote part generates an answer using createAnswer() API

aPromise = RTCPeerConnection.createAnswer([options]);

5. Remote party stores the answer in its local config using setLocalDescription() API

6. Answer is transferred to Initiator side using choice of signalling ( SIP , WS , HTTP, XMPP .. ) again

7. Initiating side stores it use setRemoteDescription() API

Interfaces of webrtc and tracks to stream addition

Perform webRTC handshake

Webrtc call setup and incoming call callflow between remote peer , peerconnection factory , peerconnection and application

Outgoing Call: setting up a call with remote by sending an offer. Wait for the remote’s answer to process it to create the session.

Incoming Call : Receive remote’s offer and process to reply with an answer.

Signalling state Transitions on PeerConnection

As the caller initiates a new RTCPeerConnection() , the RTCSignalingState state is “stable” as remote and local descriptions are empty

As the caller initiates call and calls createOffer() , he now has offer SDP and procced to store offer locally with setLocalDescription(offer) the RTCSignalingState state is “have-local-offer” . After than caller send the offer to callee over signalling channel

Simillarily as the calle recives the offer, it starts with RTCSignalingState stable and then proceeds to store the Remote’s offer using setRemoteDescription(offer), its state is now “have-remote-offer”

The callee generates a provsional answer and for caller and stores it locally , state transitiosn to “have-local-pranswer“. The pranswer SDP is send to caller over signalling channel again .

Caller stores the callee’s pr answer SDP and state updates to “have-remote-pranswer”

Once there is no offer/answer exchange in progress the state again changes to ” stable “.

State schanges to “closed” if RTCpeerConnection is closed

Detailed Offer / Answer SDP

Local Offer created by side initiating the session / Caller

The first offfer called initial offer can have dummy date for contact line such as 0.0.0.0 to prevent leaking a local Ip address

c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0

“o=” line contains <username> <sess-id> <sess-version> <nettype> <addrtype> <unicast-address>

o=- 4445251981417004127 2 IN IP4 127.0.0.

shows username – and 4445251981417004127 as session id. Same username “-” is specified in “s=” line

“t=” line shows <start time> <stop time>

t=0 0

Full session Block example

type: offer, sdp: v=0
o=- 4445251981417004127 2 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE 0 1 2
a=msid-semantic: WMS DYVK4IA4kA8LvnIYWjXhRzMgSGicnwVutWE2

Media Section : An m= section is generated for each RtpTransceiver that has been added to the PeerConnection. For the initial offer since no ports are available yet , dummy port 9 can be sadded. However if it is bundle only then port value is set to 0. Later the port value will be set to the port value of default ICE candidate.

DTLS filed “UDP/TLS/RTP/SAVPF” is followed by the list of codecs in order of priority.

“c=” line in msection too must be filled with dummy values if IP 0.0.0.0 as no candidates are available yet .

ICE

a=ice-options:trickle

Transport

“a=ice-ufrag” , “a=ice-pwd” , “a=fingerprint” , “a=setup” , “a=tls-id”

Media Stream Identification attribute “a-mid:”

For each media format on the m= line, “a=rtpmap” for “rtx” with the clock rate of codec and “a=fmtp” to reference the payload type of the primary codec. “a=rtcp-fb” specified RTCP feedback

a=rtpmap:111 opus/48000/2
a=rtcp-fb:111 transport-cc
a=fmtp:111 minptime=10;useinbandfec=1

Audio Block exmaple

m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:JDMg
a=ice-pwd:6OARDQ8U/orhtXZbfN+ars37
a=ice-options:trickle
a=fingerprint:sha-256 1D:C8:1F:18:D2:AB:B7:68:CC:DC:A8:8D:6B:1D:70:11:06:E9:19:D2:22:CE:A5:F3:BE:82:00:ED:99:58:20:4A
a=setup:actpass
a=mid:0
a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
a=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:5 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=extmap:6 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
a=sendrecv
a=msid:DYVK4IA4kA8LvnIYWjXhRzMgSGicnwVutWE2 7525d75c-ffe7-4038-8b71-653d249e63bb
a=rtcp-mux
a=rtpmap:111 opus/48000/2
a=rtcp-fb:111 transport-cc
a=fmtp:111 minptime=10;useinbandfec=1
a=rtpmap:103 ISAC/16000
a=rtpmap:104 ISAC/32000
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:106 CN/32000
a=rtpmap:105 CN/16000
a=rtpmap:13 CN/8000
a=rtpmap:110 telephone-event/48000
a=rtpmap:112 telephone-event/32000
a=rtpmap:113 telephone-event/16000
a=rtpmap:126 telephone-event/8000
a=ssrc:3968544080 cname:da0nYe1oYR8AvVNp
a=ssrc:3968544080 msid:DYVK4IA4kA8LvnIYWjXhRzMgSGicnwVutWE2 7525d75c-ffe7-4038-8b71-653d249e63bb
a=ssrc:3968544080 mslabel:DYVK4IA4kA8LvnIYWjXhRzMgSGicnwVutWE2
a=ssrc:3968544080 label:7525d75c-ffe7-4038-8b71-653d249e63bb

// remove video section for simplicity

Data Block is created if data channle has been created with m= section for data.

“a=sctp-port” line referencing the SCTP port number set to 5000

“a=max-message-size” set to 262144 here

Data Block example

m=application 9 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
a=ice-ufrag:JDMg
a=ice-pwd:6OARDQ8U/orhtXZbfN+ars37
a=ice-options:trickle
a=fingerprint:sha-256 1D:C8:1F:18:D2:AB:B7:68:CC:DC:A8:8D:6B:1D:70:11:06:E9:19:D2:22:CE:A5:F3:BE:82:00:ED:99:58:20:4A
a=setup:actpass
a=mid:2
a=sctp-port:5000
a=max-message-size:262144

Subsequent Offers

When createOffer is called a second (or later) time, or is called after a local description has already been installed, the processig is different due to gathered ICE candidates . However the <session-version> is not changed .

Additionally m section is updated if RtpTransceiver is added or removed

Each “m=” and c=” line MUST be filled in with the port, relevant RTP profile, and address of the default candidate for the m= section

If the m= section is not bundled into another m= section, update the “a=rtcp” with port and address of RTCP camdidate and add “a=camdidate” with “a=end-of-candidates”

Local Answer created by side receiving the session/ Callee

When createAnswer is called for the first time after a remote description has been provided, the result is known as the initial answer.

Each offered m= section will have an associated RtpTransceiver

Remote Destination / Callee can reject the m section by setting port in m line to 0 . It can reject msection if neither of the offered media format are supported , RtpTransceiver is stoopped etc.

For the initial offer the dummy port value of 9 is set as no ICE candudate is avaible yet. Simillarly “c=” line must contain the “dummy” value “IN IP4 0.0.0.0” too.

The <proto> field MUST be set to exactly match the <proto> field for the corresponding m= line in the offer.

type: answer, sdp: v=0
o=- 5730481682283561642 3 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE 0 1 2
a=msid-semantic: WMS KGmQ9mTmvTaWlHTQ0B0YP36QIxOYNeB3i2nT

Audio section

m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:MgKS
a=ice-pwd:X3oTkKO/v7GVgd/CDC3e9B7c
a=ice-options:trickle
a=fingerprint:sha-256 B9:9C:8A:A9:E9:09:0C:FB:52:2A:D3:18:7B:A9:D4:EC:B3:00:77:72:27:51:EC:5F:82:BE:11:7F:C7:CF:43:43
a=setup:active
a=mid:0
a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
a=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:5 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=extmap:6 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
a=sendrecv
a=msid:KGmQ9mTmvTaWlHTQ0B0YP36QIxOYNeB3i2nT e817fe0f-1cc0-4901-9fd9-e810289cc85d
a=rtcp-mux
a=rtpmap:111 opus/48000/2
a=rtcp-fb:111 transport-cc
a=fmtp:111 minptime=10;useinbandfec=1
a=rtpmap:103 ISAC/16000
a=rtpmap:104 ISAC/32000
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:106 CN/32000
a=rtpmap:105 CN/16000
a=rtpmap:13 CN/8000
a=rtpmap:110 telephone-event/48000
a=rtpmap:112 telephone-event/32000
a=rtpmap:113 telephone-event/16000
a=rtpmap:126 telephone-event/8000
a=ssrc:3260997313 cname:FxLUKuXrLQe0r1rn

Video section removed for simplicity

Data stream

m=application 9 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
b=AS:30
a=ice-ufrag:MgKS
a=ice-pwd:X3oTkKO/v7GVgd/CDC3e9B7c
a=ice-options:trickle
a=fingerprint:sha-256 B9:9C:8A:A9:E9:09:0C:FB:52:2A:D3:18:7B:A9:D4:EC:B3:00:77:72:27:51:EC:5F:82:BE:11:7F:C7:CF:43:43
a=setup:active
a=mid:2
a=sctp-port:5000
a=max-message-size:262144

Subsequent Answers

Port value would normally be set to the port of the default ICE candidate for this m= section. For the exmaple above

m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126

will be changes with relevant port adress such as

type: offer, sdp: v=0
o=- 6407282338169184323 3 IN IP4 54.190.54.190
s=-
t=0 0
a=group:BUNDLE 0 1 2
a=msid-semantic: WMS bSrCUCFybGovIy0FUhPTZAr9ToRmx8I09nEj
m=audio 55375 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
c=IN IP4 54.190.54.190
a=rtcp:9 IN IP4 0.0.0.0
a=candidate:2880323124 1 udp 2122260223 54.190.54.190 55375 typ host generation 0 network-id 1 network-c

Simillarly m video and data line will also get ports

m=video 53877 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 102 122 127 121 125 107 108 109 124 120 123 119 114 115 116
c=IN IP4 54.190.54.190
a=rtcp:9 IN IP4 0.0.0.0
a=candidate:2880323124 1 udp 2122260223 54.190.54.190 53877 typ host generation 0 network-id 1 network-cost 10
..
m=application 57991 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 54.190.54.190
a=candidate:2880323124 1 udp 2122260223 54.190.54.190 57991 typ host generation 0 network-id 1 network-cost 10

If the answer contains any “a=ice-options” attributes where “trickle” is listed as an attribute, update the PeerConnection canTrickle property to be true.

Modifying Offer/answer SDP

SDP returned from createOffer or createAnswer MUST NOT be changed before passing it to setLocalDescription. After calling setLocalDescription with an offer or answer, the application MAY modify the SDP to reduce its capabilities before sending it to the far side.

Assume we have a MCU at location and want the video stream to relay via a Media Server.

SDP Parsing

SDP is used for session parsing and contians sequence of line with key value pairs. SDP is read, line-by-line, and converted to a data structure that contains the deserialized information.

JSEP SDP bears a lot of simillarity to SIP SDP explained here : SIP and SDP Messages Explained

SIP and SDP Messages Explained

Session-Level Parsing

Line “v=” , “o=”,”b=” and “a=” are processed . The “i=”, “u=”, “e=”, “p=”, “t=”, “r=”, “z=”, and “k=” lines are not used by this specification; they MUST be checked for syntax but their values are not used. Line “c=” is checked for syntax and ICE mismatch detection
“a= ” attribute could be : “a=group” , “s=”ice-lite” , “a=ice-pwd”, “a=ice-options” , “a=fingerprint”, “a=setup” , a=tls-id”, “a=identity” , “a=extmap”

Media Section Parsing

Line “m=” for media , proto , port , fmt in RTP

Attributes “a=” can be :

“a=rtpmap” or “a=fmtp” : map from an RTP payload type number to a media encoding name that identifies the payload format.

a=rtpmap:<payload type> <encoding name>/<clock rate> [/<encoding parameters>]

m=audio 49230 RTP/AVP 96 97 98
a=rtpmap:96 L8/8000
a=rtpmap:97 L16/8000
a=rtpmap:98 L16/11025/2

Packetization parameters as “a=ptime” , “a=maxptime” which define the length of each RTP packet.

Direction as “a=sendrecv” , a=recvonly , a=sendonly , a=inactive“

Muxing as “a=rtcp-mux” , “a=rtcp-mux-only”

RTCP attributes “a=rtcp” , “a=rtcp-rsize”

Line “c=” is checked.

Line “b=” for bandiwtdh , bwtype

Attribites for “a=” could be “a=ice-ufrag”, “a=”ice-pwd”, “a=ice-options” , “a=candidate”, “a=remote-candidate” , a=end-of-candidates” and “a=fingerprint”

Interactive Connectivity Establishment (ICE) for NAT traversal

Protocols using offer/answer are difficult to operate through Network Address Translators (NATs) since flow of media packets require IP addresses and ports of media sources and sinks within their messages. Also realtime media emphasises on reduced latency and decreased packet loss .

An extension to the offer/answer model, and works by including a multiplicity of IP addresses and ports in SDP offers and answers, which are then tested for connectivity by peer-to-peer connectivity checks.
Checks done by STUN and TURN, also allows for address selection for multi-homed and dual-stack hosts

ICE allows the agents to discover enough information about their topologies to potentially find one or more paths by which they can communicate. Then it systematically tries all possible pairs (in a carefully sorted order) until it finds one or more that work.

ICE Gathering

Caller and callee performs checks to finalize the protocol and routing needed to establish a peer connection . Number of candudates are proposed till they mutually agree upon one . Peerconnection then uses that candiadte detaisl to initiate the connection .

While Applying a Local Description at the media engine level if m= section is new, WebRTC media stacks begins gathering candidates for it.

RTCPeerconnection specified canTrickleIceCandidates. ICE trickling is the process of continuing to send candidates after the initial offer or answer has already been sent to the other peer.

ICE TransportRole is responsible for Choosing a candidate pair.

ICE layer sets one peer as controlling and other as controlled agent. The controling agent makes the final decision as to which candidate pair to choose.

Final selected canduadte in SDP

a=group:BUNDLE 0 1 2
a=msid-semantic: WMS 9Cv3eIelHVuhxrGfxSvUsfokNu4eb4R9PYw2
m=audio 59937 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
c=IN IP4 x.x.x.x
a=rtcp:9 IN IP4 0.0.0.0
a=candidate:2880323124 1 udp 2122260223 x.x.x.x 59937 typ host generation 0 network-id 1 network-cost 10
a=candidate:3844981444 1 tcp 1518280447 x.x.x.x 9 typ host tcptype active generation 0 network-id 1 network-cost 10

An agent identifies all CANDIDATE whic is a transport address. Types:

HOST CANDIDATE – directly from a local interface which could be Wifi, Virtual Private Network (VPN) or Mobile IP (MIP)
if an agent is multihomed ( private and public networks) , it obtains a candidate from each IP address and includes all candidates in its offer.
STUN or TURN to obtain additional candidates. Types
- translated addresses on the public side of a NAT (SERVER REFLEXIVE CANDIDATES)
- addresses on TURN servers (RELAYED CANDIDATES)

Mapping Server Reflexive address

Steps for mappling Server Reflexive Address

Agent sends the TURN Allocate request from IP address and port X:x,
NAT will create a binding X1′:x1′, mapping this server reflexive candidate to the host candidate X:x ( BASE).
Outgoing packets sent from the host candidate will be translated by the NAT to the server reflexive candidate.
Incoming packets sent to the server reflexive candidate will be translated by the NAT to the host candidate and forwarded to the agent.

Allocate Request and response fom TURN – Informing the agent of this relayed candidate

Only STUN based Binding

agent sends a STUN Binding request to its STUN server which will get server reflexive candidate and send back Binding response.

STUN Binding request for connectivity checks on CANDIDATE PAIRS

The candidates are carried in attributes in the SDP offer . The remote peer also follows this process and gather and send lits own sorted list of candidates. Hence CANDIDATE PAIRS from both sides are formed.

PEER REFLEXIVE CANDIDATES – connectivity checks can produce aditional candidates espceialy around symmetric NAT

Since the same address is used for STUN. and media ( RTP/RTCP) Demultiplexing based on packet contents helps to identify which one is which.

Checks : ICE checks are performed in a specific sequence, so that high-priority candidate pairs are checked first.

TRIGGERED CHECKS – accelerates the process of finding a valid candidate
ORDINARY CHECKS – agent works through ordered prioritised check list by sending a STUN request for the next candidate pair on the list periodically.

Checks ensure maintaining frozen candidates and pairs with some foundation for media stream. Each candidate pair in the check list has a foundation and a state. States for candidates pairs

1.Waiting: A check has not been performed for this pair, and can be performed as soon as it is the highest-priority Waiting pair onthe check list.

2. In-Progress: A check has been sent for this pair, but the transaction is in progress.

3. Succeeded: A check for this pair was already done and produced a successful result.

4. Failed: A check for this pair was already done and failed, either never producing any response or producing an unrecoverable failure response.

5. Frozen: A check for this pair hasn’t been performed, and it can’t yet be performed until some other check succeeds, allowing this pair to unfreeze and move into the Waiting state.

ICE gather state

icegatheringstatechange – gathering

icecandidate (host)
sdpMid: 0, sdpMLineIndex: 0, candidate: candidate:1511920713 1 udp 2122260223 192.168.0.2 58122 typ host generation 0 ufrag vzpn network-id 1 network-cost 10

icecandidate (srflx)
sdpMid: 0, sdpMLineIndex: 0, candidate: candidate:4081163164 1 udp 1686052607 106.51.26.168 37542 typ srflx raddr 192.168.0.2 rport 58122 generation 0 ufrag vzpn network-id 1 network-cost 10

icecandidate (host)
sdpMid: 0, sdpMLineIndex: 0, candidate: candidate:345893049 1 tcp 1518280447 192.168.0.2 9 typ host tcptype active generation 0 ufrag vzpn network-id 1 network-cost 10

icecandidate (relay)
sdpMid: 0, sdpMLineIndex: 0, candidate: candidate:2130406062 1 udp 41886207 74.125.39.44 27190 typ relay raddr 106.51.26.168 rport 37542 generation 0 ufrag vzpn network-id 1 network-cost 10

icecandidate (relay)
sdpMid: 0, sdpMLineIndex: 0, candidate: candidate:3052096874 1 udp 25108479 172.217.163.158 28049 typ relay raddr 106.51.26.168 rport 37543 generation 0 ufrag vzpn network-id 1 network-cost 10

icegatheringstatechange – complete

Candidate Checking

iceconnectionstatechange : checking

setRemoteDescription L type: answer, sdp: v=0
…
m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 110 112 113 126
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:ydvf
a=ice-pwd:mb4ousBoT6B0l//ljjD/9Z/M
a=ice-options:trickle
…
m=video 9 UDP/TLS/RTP/SAVPF 98 100 96 97 99 101 102 122 127 121 125 107 108 109 124 120 123 119 114 115 116
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:ydvf
a=ice-pwd:mb4ousBoT6B0l//ljjD/9Z/M
a=ice-options:trickle

addIceCandidate (host)
sdpMid: , sdpMLineIndex: 0, candidate: candidate:1511920713 1 udp 2122260223 192.168.0.2 56060 typ host generation 0 ufrag ydvf network-id 1 network-cost 10

iceconnectionstatechange : connected

Candidate Nomination for Media Path

Selecting low-latency media paths can use various techniques such as actual round-trip time (RTT) measurement. Controlling agent gets to nominate which candidate pairs will get used for media amongst the ones that are valid. There are 2 ways : regular nomination and aggressive nomination.

WebRTC Media Stack

WebRTC service’s

References :

[1] WebRTC 1.0: Real-time Communication Between Browsers – W3C Editor’s Draft 31 August 2019 http://w3c.github.io/webrtc-pc/
[2] RFC 5245 Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols

Kamailio DNS and NAT

DNS sub-system in Kamailio
- DNS failover
- DNS load balancing
- NAT ( Network Address Translation)
NAT ( Network Address Translation)
Why is NAT is important in SIP?
Types of NAT solutions
NAT behaviours
RTP NAT
Fixing NAT
NAT Traversal Module
- Why use keepalive when Registrations are already there for NATing ?
- How keepalives work for NATing ?
- function nat_keepalive()
- Params
- Functions
- client_nat_test()
  - fix_contact()
  - nat_keepalive()
- Pseudo Variables
- Statistics
NATHelper Module
- NAT pinging types
  - UDP packet
  - SIP request
- params
- functions
- Pseudo Variables
- RPC Commands

In this article, we discuss Nating in a SIP Server like Kamailio. Types of NAT pings, their behaviour and types. Also some implementation of some of the Kamailios modules like

NAT Traversal module
NAT helper module
STUN module

A repository for extensive application of kamailio modules for various usecases is https://github.com/altanai/kamailioexamples

DNS sub-system in Kamailio

To resolve hostname into IPs Kamailio can do either of below

use libresolv and a combination of the locally configured DNS server /etc/hosts and the local Network Information Service (NIS/YP a.s.o) or
cache the query results and first look into internal cache

DNS failover – if destination resolves to multiple addresses tm can try all of them until it finds one to which it can successfully send the packet or it exhausts all of them, with internal DNS cache. Also used when the destination host doesn’t send any reply to a forwarded invite within the SIP timeout interval (tm fr_timer parameter).

DNS load balancing – SRV based load balancing with weight value in the DNS SRV record.

(-) Only the locally configured DNS server (usually in /etc/resolv.conf) is
used for the requests (/etc/hosts and the local Network Information Service are ignored).
- optional: disable the DNS cache (use_dns_cache=off or compile without -DUSE_DNS_CACHE).
(-) DNS cache uses extra memory
- optional: disable the DNS cache.
(-) DNS failover introduces a very small performance penalty
- optional: disable the DNS failover (use_dns_failover=off).
(-) DNS failover increases the memory usage (the internal structures used to represent the transaction are bigger when the DNS failover support is compiled).
- optional: compile without DNS failover support (DUSE_DNS_FAILOVER).Turning it off from the config file is not enough in this case (the extra memory will still be used).

NAT ( Network Address Translation)

Network address translation replaces the IP address within packets with a different IP address which internet endpoints can relate with. This enables multiple hosts in a private subnet with their pwn private address ( 10.x.x.x or 192.x.x.x etc ) to share single public IP address interface, to access the Internet.

NAT is bidirectional- If the private ip:port got translated to public ip:port on the inside interface while entering outside internet, on arriving from outside interface it will get translated from public ip:port to private ip:port.

For a SBC ( Session border controller ) or where the kamailio server is directly customer facing, where you dont have a private line or VPN to clients, then it is often encountered with NATed endpoints. Read more about NAT traversal using STUN and TURN here

NAT traversal using STUN and TURN

Why is NAT is important in SIP?

These characteristics of SIP design and operation flows demonstrate why NAT solutions are so important

RFC 3261 for SIP presumed end-to-end reachability and does not specify much around ANT issues .
No NLRI (Network Layer Reachability Information) translation layer exists, such as DNS or ARP
SIP is designed to used RTP which uses dynamically allocated ports to stream media.
It is comparable to FTP which creates ephemeral connections on unpredictable dynamic ports to send multiplexed data and “metadata”, instead of protocol like HTTP where all data is sent on same connection.
UDP (default transport for SIP) is connection less and session tracking requires these be mapped onto a statelful flow, rigorous keepalives and other such techniques like using TCP instead have their own tradeoffs
since sip packets put network and transport information right on sip header they are limited by the rateability and awareness of their network interface thereby prevent other endpoint from reaching its ip or port

Types of NAT solutions

Client-side NAT traversal – clients are responsible for identifying their WAN NLRI and adding ip and port to navigate them in outside world
Server-side NAT traversal – SIP server should discover the client’s WAN addressing while clients continue to work transparently behind NAT. Requires that DIP server look at the source and destination ip and port of actual packets instead of relying on the encapsulated sip headers and SDP body.
ALG (Application Layer Gateways) – mostly applied at router itself. wodk by susbtitung public IP/port information inplace of provate and vice versa for return packets. Limitataions – they dont provide a fullproof fix example they may fix Via but not the Contact address or SDP body or RTP ports

WebRTC signalling and media flow on Open public network — Open network leading to smooth p2p media stream

Far-end NAT traversal solution ( TURN server)

WebRTC media flow when peers are behind NAT . Uses TURN relay mechanism — public private ip mapping , firewalls and private network obstruct p2p media flow. TURN is useful to relay media pckets

NAT behaviours

Cone NAT	Symmetric NAT
Local client performs an outbound connection to a remote UA and a dynamic rule is created for the destination IP tuple, allowing the remote machine to connect back.	Local client allows inbound connections from a specific source IP address and port, also NAT assigns a new random source port for each destination IP tuple
Further subdivied into: 1. Full Cone NAT 2. Restricted Cone NAT – all requests from the same internal IP address and port are mapped to the same external IP address and port. -3. Port-Restricted Cone NAT

RTP NAT

NAT not only applies to sip signalling packets but also to RTP. Even RTP packets are not able to transverse accross private -public network interfaces to the right place across a NAT’d connection.

To solve two-way media RTP performs RTP latching, where client listens for at least one RTP frame arriving at the destination port it advertised, and harvests the source IP and port from that packet and uses that for the return RTP path. RTP latching works out of the box for public RTP endpoints but not for ones behind NAT.

It is thus recommended to use an intermediate RTP relay such as RTPengine on kamailio. It is controlled via a UDP control socket by kamailio as an external process. More on installation and descrition of RTP engine on kamailio is covered here. When RTPengine control module receives RTP offer /answer from Kamailio, it opens a pair of RTP/RTCP ports to receive traffic and substitues in SDP. Doing so for both ends makes RTP engine come in the path of media stream packets for both directions.

A first INVITE has no no corresponding on NAT ( as no port has been allocated yet ) so the c= contact and m = media lines would look like

c=IN IP4 192.0.2.13.
m=audio 23767 RTP/AVP 0 101.

We can force RTP to go through a proxy by changing c line and m line to

c=IN IP4 address-of-proxy  
m=audio port-on-proxy RTP/AVP 0 101

A separate daemon called an RTP proxy (with a public IP address) that both user agents can send their audio to, can be setup by calling force_rtp_proxy().

Fixing NAT

When the client is behind NAT, following needs to be taken careof to provide smooth operation

Ensuring Tranactional replies are sent to correct source address ( maybe using ;rport param and forcerport() method ) instead of just relying on via header transport protocol and port.
example:

if (client_nat_test("3")){
    //CALL RE-INVITE/UPDATE Nat DETECTED $ci\n");
    force_rport();
    fix_contact();
    ...
}

also Change Media ip address to public IP

if(nat_uac_test("8") && search("Content-type: application/sdp")) {
        // RE-INVITE/UPDATE CALL fix SDP- NAT
        fix_nated_sdp("2");
}

Any far-end NAT traversal solution ( TURN server) if employed should stay in path of entire Dialog not just for initial INVITE transaction which many times results in ACK being dropped. This can be achived by adding Record-Route header of rr module to the initial INVITE request itself

Set the advertised address of the public-facing inetrface to the Public NAT IP using “listen” parameter

Ensure contact URI is NAT processed by using NATHelper modules which rewrites the domain portion of the Contact URI to contain the source IP and port of the request or reply. add_contact_alias([ip_addr, port, proto]) in NAThelper module which adds “;alias=ip~port~transport” parameter to the contact URI containing either received ip, port, and transport protocol or those given as parameters

Implement RTP proxy which performs NAT for streams such as rtpengine module

NAT Traversal Module

Provides far-end NAT traversal to kamailio’s SIP signalling. Its role is

detect user agents behind NAT
manipulate SIP headers so that user agents can continue working behind NAT transparently
keepalives to UA behind NAT to preserve their visibility in network

Some pros and cons of NATTravsersal module

(+) detect even UAs behind multiple cascaded NAT boxes, complex distributed env with multiple proxies
(+) handle env where incoming and outgoing paths are diff for SIP messages
(+) handle cases when routing path may even change between consecutive dialogs
(+) can work for other than registered UA’s also
(-) built for IPv4 NAT handling not adapted to support IPv6 session keepalives.

Why use keepalive when Registrations are already there for NATing ?

NAT binding works for registered users who want incoming calls. However for cases like outgoing calls or for presence subscription notifications, failings registration implies inability to receive further in-dialog messages after the NAT binding expires. This artificial binding for registrations makes system unreliable and volatile as it doesnot guarantee the delivery of in-dialog messages for outgoing calls without registration renewal. Therefore keepalive are adopted which also works for unregistered users.
Minimizes the traffic as only border proxies send keepalives which send keepalives statelessly, instead of having to relay messages generated by the registrars.
Also for situations when DNS resolves diff proxies for outgoing or incoming path traditional register based keepalives fail to associate or dissociate correct routes.

How keepalives work for NATing ?

This mechanism works by sending a SIP request to a user agent behind NAT to make that user agent send back a reply. The purpose is to have packets sent from inside the NAT to the proxy often enough to prevent the NAT box from timing out the connection.

Module sends Keeplaives to preserve their visibility only in :

Registration – for user agent that have registered to for incoming calls, triggering keepalive for a REGISTER request.
Subscription – for presence agents that have subscribed to some events for receiving back notifications with SUBSCRIBE request.
Dialogs – for user agents that have initiated an outgoing call for receiving further in-dialog messages.
When all the conditions to keepalive a NAT endpoint will disappear, that endpoint will be removed from the list with the NAT endpoints that need to be kept alive.

function nat_keepalive()

the function needs to be called on proxy directly interacting with UA behind NAT.
call only once for the requests (REGISTER, SUBSCRIBE or outgoing INVITEs) that triggers the need for network visibility.
call before the request gets either a stateless reply or it is relayed with t_relay()
for outgoing INVITE , it triggers dialog tracing for that dialog and will use the dialog callbacks to detect changes in the dialog state.

Dependencies – sl , tm and dialog module

Params

keepalive_interval – time interval between sending a keepalive message to all the endpoints that need being kept alive. A negative value or zero will disable the keepalive functionality.

modparam("nat_traversal", "keepalive_interval", 30) // 30 seconds keeplaive inetrval

keepalive_method – SIP method to use to send keepalive messages.usual ones are NOTIFY and OPTIONS. Default value is “NOTIFY”.

modparam("nat_traversal", "keepalive_method", "OPTIONS")

keepalive_from – SIP URI to use in the From header of the keepalive requests. default sip:keepalive@proxy_ip,with IP address of the outgoing interface

modparam("nat_traversal", "keepalive_from", "sip:keepalive@altanai.com")

keepalive_extra_headers – extra headers that should be added to the keepalive messages. Header must also include the CRLF (\r\n) line separator. Multiple headers can be specified by concatenating with \r\n separator.

modparam("nat_traversal", "keepalive_extra_headers", "User-Agent: Kamailio\r\nX-MyHeader: some_value\r\n")

keepalive_state_file – filename where information about the NAT endpoints and the conditions for which they are being kept alive is saved . It is used when Kamailio starts to restore its internal state and continue to send keepalive messages to the NAT endpoints that have not expired in the meantime. Also used at kamailio restart as it avoids losing keepalive state information about the NAT endpoints.

modparam("nat_traversal", "keepalive_state_file", "/var/run/kamailio/keepalive_state")

Functions

client_nat_test ()– Check if the client is behind NAT. Tests to be performed gievn by int can be :
1 – tests if client has a private IP address or one from shared address space in the Contact field of the SIP message.
2 – tests if client has contacted Kamailio from an address that is different from the one in the Via field.
4 – tests if client has a private IP address or one from shared address space in the top Via field of the SIP message.

For example calling client_nat_test(“3”) will perform test 1 and test 2 and return true if at least one succeeds, otherwise false.

fix_contact() – replace the IP and port in the Contact header with the IP and port the SIP message was received from. Usually called after a succesfull call to client_nat_test(type)

if (client_nat_test("3")) {
    fix_contact();
}

nat_keepalive() – Triggers keepalive functionality for the source address of the request. When called it only sets some internal flags, which will trigger later the addition of the endpoint to the keepalive list if a positive reply is generated/received (for REGISTER and SUBSCRIBE) or when the dialog is started/replied (for INVITEs). For this reason, it can be called early or late in the script. The only condition is to call it before replying to the request or before sending it to another proxy. If the request needs to be sent to another proxy, t_relay() must be used to be able to intercept replies via TM or dialog callbacks.

If stateless forwarding is used, the keepalive functionality will not work. Also for outgoing INVITEs, record_route() should also be used to make sure the proxy that keeps the caller endpoint alive stays in the path.

if ((method=="REGISTER" || method=="SUBSCRIBE" ||
    (method=="INVITE" && !has_totag())) && client_nat_test("3"))
{
    nat_keepalive();
}

Pseudo Variables

$keepalive.socket(nat_endpoint)
$source_uri

Statistics

keepalive_endpoints – total number of NAT endpoints that are being kept alive.
registered_endpoints – NAT endpoints kept alive for registrations
subscribed_endpoints – NAT endpoints kept alive for subscriptions.
dialog_endpoints – Indicates how many of the NAT endpoints are kept alive for taking part in an INVITE dialog.

NATHelper Module

NAT traversal and reuse of TCP connections.
Helps symmetric UAs who are not able to determine their public address.

NAT pinging types

UDP packet 4 bytes (zero filled) UDP packets are sent to the contact address.	SIP request a stateless SIP request is sent to the UDP contact address.
(+) low bandwitdh traffic, (+) easy to generate by Kamailio;	(+) bidirectional traffic through NAT (+) since each PING request from Kamailio (inbound traffic) will force the SIP client to generate a SIP reply (outbound traffic) – the NAT bind will be surely kept open.
(-) unidirectional traffic through NAT (inbound – from outside to inside); if many NATs do update the bind timeout only on outbound traffic, the bind may expire and closed.	(-) higher bandwitdh traffic (-) more expensive (as time) to generate by Kamailio;

Dependencies – usrloc

Params

force_socket – Socket to be used when sending NAT pings for UDP communication.

modparam("nathelper", "force_socket", "127.0.0.1:5060")

natping_interval
ping_nated_only
natping_processes – How many timer processes should be created by the module for the exclusive task of sending the NAT pings.
natping_socket
received_avp – AVP used to store the URI containing the received IP, port, and protocol by fix_nated_register
sipping_bflag
sipping_from
sipping_method
natping_disable_bflag
nortpproxy_str
keepalive_timeout
udpping_from_path
append_sdp_oldmediaip
filter_server_id

Functions

fix_nated_contact() – rewrites the “Contact” header field with request’s source address:port pair
fix_nated_sdp() – adds the active direction indication to SDP and updates ource ip address information too
add_rcv_param() – add a received parameter to the “Contact” header fields or the Contact URI.
fix_nated_register() exports the request’s source address:port into an AVP to be used during save()
nat_uac_test()- check if client’s request originated behind a nat
is_rfc1918()
add_contact_alias() – Adds an “;alias=ip~port~transport” parameter to the contact URI
handle_ruri_alias() – Checks if the Request URI has an “alias” parameter and if so, removes it and sets the “$du” based on its value.
set_contact_alias()

Pseudo Variables

$rr_count – Number of Record Routes in received SIP request or reply.
$rr_top_count – If topmost Record Route in received SIP request or reply is a double Record Route, value of $rr_top_count is 2.

RPC Commands

nathelper.enable_ping

References :

[1] https://www.kamailio.org/docs/modules/5.1.x/modules/nat_traversal.html
[2] https://kamailio.org/docs/modules/5.1.x/modules/nathelper.html
[3] https://www.kamailio.org/docs/modules/devel/modules/stun.html
[4] RFC1631
[5] RFC 4787

Kamailio as Inbound/Outbound proxy or Session Border Controller (SBC)

TURN server for WebRTC – RFC5766-TURN , Coturn, Xirsys , Twillio

STUN (Session Traversal Utilities for NAT) and TURN (Traversal Using Relays around NAT) are protocols that can be used to provide NAT traversal for VoIP and WebRTC. These projects provide a VoIP media traffic NAT traversal server and gateway.

TURN Server is a VoIP media traffic NAT traversal server and gateway.

This article describes working with some of the TURN servers

rfc5766-turn-server ( legacy )
COTURN
Xirsys TURN service
Twillio TURN Server
Asterisk ICE resolution
Kamailio NAT and STUN modules in Server

We know that a STUN only server such as the one below , may not work under firewalls like in enterperises or even in universities resulting in black video , one way video , inconcsistent streaming or even no video experience .

// google STUN 
var iceservers_array=[{urls: 'stun:stun.l.google.com:19302'}] ;

To overcome this we rely on a publicaly avaiable TURN server which can step in and do ICE exchange to seup relay routes for the media stream. Some of the options for both self hosted and TURN as a service are described below .

rfc5766-turn-server

It is a legacy project mainly archives for reference( links at the end of section). This is a VoIP gateway for inter network communication which is opensource MIT licensed .

platforms supported : Any client platform is supported, including Android, iOS, Linux, OS X, Windows, and Windows Phone. This project can be successfully used on other *NIX platforms ( Aamazon EC2) too. It supports flat file or Database based user management system ( MySQL , postgress , redis ). The source code project contains , TURN server , TURN client messaging library and some sample scripts to test various modules like protocol , relay , security etc .

Protocols : Protocols between the TURN client and the TURN server – UDP, TCP, TLS, and DTLS. Relay protocol – UDP , TCP .

Authentication : The authentication mechanism is using key which is calculated over the user name, the realm, and the user password. Key for the HMAC depends on whether long-term or short-term credentials are in use. For long-term credentials, the key is 16 bytes: key = MD5(username “:” realm “:” SASLprep(password))

Installation : Since I used my Ubuntu Software center for installing the RFC turn server 5677 .

More information is on Ubuntu Manuals : http://manpages.ubuntu.com/manpages/trusty/man1/turnserver.1.html

The content got stored inside /usr/share/rfc5766-turn-server. Also install mysql for record keeping

sudo apt-get install mysql-server

Intall MySQL workbench to monitor the values feed into the turn database server in MysqL. connect to MySQL instance using the following screenshot

The database formed with mysql after successful operation is as follows . We shall notice that the initial db is absolutely null

Terminal Commands

These terminal command ( binary images ) get stored inside etc/init.d after installing

turnadmin – Its turn relay administration tool used for generating , updating keys and passwords . For generating a key to get long term crdentaial use -k command and for aading or updateing a long -term user use the -a command. Therefore a simple command to generate a key is

format : turnadmin -k -u -r -p

 turnadmin -k -u turnwebrtc -r mycompany.com -p turnwebrtc

The generated key is displayed in console . For example the following screenshot shows this :

To fill in user with long term credentails

Format : turnadmin -a [-b | -e | -M | -N ] -u -r -p

turnadmin -a -M "host=localhost dbname=turn user=turn password=turn" -u altanai -r mycompany.com -p 123456

Check the values reflected in MySQL workbench for long term user table . ( screenshot depicts two entries for altanai and turnwebrtc user )

you can also check it on console using the -l command

format :turnadmin -l –mysql-userdb=””

turnadmin -l --mysql-userdb="host=127.0.0.1 dbname=turn user=turnwebrtc password=turnwebrtc connect_timeout=30"

or we can also check using the terminal based mySQL client

mysql> use turn;
Database changed

mysql> select * from turnusers_lt;
+------------+----------------------------------+
| name | hmackey |
+------------+----------------------------------+
| altanai | 57bdc681481c4f7626bffcde292c85e7 |
| turnwebrtc | 6066cbe0b5ee14439b2ddfc177268309 |
+------------+----------------------------------+
2 rows in set (0.00 sec)

turnserver – Its command to handle the turnserver itself . We can use the simple turnserver command to start it without any db support using just turnserver. Screenshot for this is

We can use a database like mysql to start it with db connection string

Format : turnserver –mysql-userdb=””

turnserver --mysql-userdb="host=127.0.0.1 dbname=turn user=turnwebrtc password=turnwebrtc connect_timeout=30"

turnutils_uclient: emulates multiple UDP,TCP,TLS or DTLS clients.

turnutils_peer: simple stateless UDP-only “echo” server. For every incoming UDP packet, it simply echoes it back.

turnutils_stunclient: simple STUN client example that implements RFC 5389 ( using STUN as endpoint to determine the IP address and port allocated to it , keep-alive , check connectivity etc) and RFC 5780 (experimental NAT Behavior Discovery STUN usage) .

turnutils_rfc5769check: checks the correctness of the STUN/TURN protocol implementation. This program will perform several checks and print the result on the screen. It will exit with 0 status if everything is OK, and with (-1) if there was an error in the protocol implementation.

Test

1. Test vectors from RFC 5769 to double-check that our STUN/TURN message encoding algorithms work properly. Run the utility to check all protocols :

$ cd examples
$ ./scripts/rfc5769.sh

2. TURN functionality test (bare minimum TURN example).

If everything compiled properly, then the following programs must run together successfully, simulating TURN network routing in local loopback networking environment:

console 1 :

$ ./scripts/basic/relay.sh

console2 :

$ ./scripts/peer.sh

If the client application produces output and in approximately 22 seconds prints the jitter, loss and round-trip-delay statistics, then everything is fine.

Usage

iceServers:[
 { 'url': 'stun: altanai@mycompany.com'},
 { 'url': 'turn: altanai@mycompany.com', 'credential': '123456'}]

Insert the above piece of code on peer connection config . Now call from one network environment to another . For example call from a enterprise network behind a Wifi router to a public internet datacard webrtc agent . The call should connect with video flowing smoothly between the two .

tooltips — TURN working accross Enterprise firrewall on a WebRTC video platform written with SimpeWebRTC lbrary . Project tango FX

website : https://code.google.com/p/rfc5766-turn-server/

Download the executable from : http://turnserver.open-sys.org/downloads/v3.2.5.4/

coturn

Project Coturn evolved from rfc5766-turn-server project with many new advanced TURN specs beyond the original RFC 5766 document. The databses supported are : SQLite , MySQL , PostgreSQL , Redis , MongoDB

Protocols :

The implementation fully supports the following client-to-TURN-server protocols: UDP , TCP , TLS SSL3/TLS1.0/TLS1.1/TLS1.2; ECDHE , DTLS versions 1.0 and 1.2. Supported relay protocols UDP (per RFC 5766) and TCP (per RFC 6062)

Authetication :

Supported message integrity digest algorithms:

HMAC-SHA1, with MD5-hashed keys (as required by STUN and TURN standards)
HMAC-SHA256, with SHA256-hashed keys (an extension to the STUN and TURN specs)

Supported TURN authentication mechanisms:

‘classic’ long-term credentials mechanism;
TURN REST API (a modification of the long-term mechanism, for time-limited secret-based authentication, for WebRTC applications:http://tools.ietf.org/html/draft-uberti-behave-turn-rest-00);
experimental third-party oAuth-based client authorization option;

Installation

Install libopenssl and libevent plus its dev or extra libraries .
OpenSSL has to be installed before libevent2 for TLS beacuse When libevent builds it checks whether OpenSSL has been already installed, and its version.

Download coturn readonly from

svn checkout http://coturn.googlecode.com/svn/trunk/ coturn-read-only

extract the tar contents
$ tar xvfz turnserver-.tar.gz

go inside the extracted folder and run the following command to build
$ ./configure
$ make
$ make install

Adding users in the format using turnadmin
$ Sudo turnadmin -a -u -r -p

$ Sudo turnadmin -a -u altanai -r myserver.com -p 123456

Start the turn Server using turnserver from inside of /etc/init.d using the start command

$ sudo /etc/init.d/coturn start

The logs are usually stored in /var/log . Screenshot of log file

The default configured port is 3478.If other port is needed, change the file /etc/turnserver.conf

Usuage:

Specify the values in Peer Connection

iceServers:[
{ ‘url’: ‘stun: altanai@myserver.com’},
{ ‘url’: ‘turn: altanai@myserver.com’, ‘credential’: ‘123456’}]

Specifications:

TURN specs:

RFC 5766 -Traversal Using Relays around NAT (TURN):Relay Extensions to Session Traversal Utilities for NAT (STUN)
RFC 6062 -Traversal Using Relays around NAT (TURN) Extensions for TCP Allocations
RFC 6156 – IPv6 extension for TURN
RFC 7443 – ALPN support for STUN & TURN
Datagram Transport Layer Security (DTLS) as Transport for Traversal Using Relays around NAT (TURN) . It facilitate the resolution of TURN URIs into the IP address and port of TURN servers supporting DTLS as a transport protocol.
Mobile ICE (MICE) – Mobility with TURN . It minimizes traffic disruption caused by changing IP address during a mobility event by shorter network path .
TURN REST API (http://tools.ietf.org/html/draft-uberti-behave-turn-rest-00)
Origin field in TURN (Multi-tenant TURN Server) (https://tools.ietf.org/html/draft-ietf-tram-stun-origin-05)
TURN Bandwidth draft specs (http://tools.ietf.org/html/draft-thomson-tram-turn-bandwidth-01)
TURN-bis (with dual allocation) draft specs (http://tools.ietf.org/html/draft-ietf-tram-turnbis-01)
Third-party authorization support (http://tools.ietf.org/html/draft-ietf-tram-turn-third-party-authz-13).

STUN specs:

RFC 3489 – STUN – Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs)
RFC 5389 – Session Traversal Utilities for NAT (STUN)
RFC 5769 – test vectors for STUN protocol testing
RFC 5780 – NAT behavior discovery support
RFC 7443 – Application-Layer Protocol Negotiation (ALPN) Labels for STUN and TURN

ICE :

RFC 5245 – ICE
RFC 5768 – ICE–SIP
RFC 6336 – ICE–IANA Registry
RFC 6544 – ICE–TCP
RFC 5928 – TURN Resolution Mechanism

website : https://code.google.com/p/coturn/

Chorme WebRTC Internals Page for COTURN connection

Xirsys

Xirsys is a provider for WebRTC infrastructure which included stun and turn server hosting as well .

The process of using their services includes singing up for a account and choosing whether you want a paid service capable of handling more calls simultaneously or free one handling only upto 10 concurrent turn connections .

The dashboard appears like this :

To receive the api one need to make a one time call to their service , the result of which contains the keys to invoke the turn services from webrtc script .

window.addEventListener("load", function (e) {
    let xhr = new XMLHttpRequest();
    xhr.onreadystatechange = function ($evt) {
        if (xhr.readyState == 4 && xhr.status == 200) {
            let res = JSON.parse(xhr.responseText);
            console.log("response: ", res);
            iceservers_array.push(res.v.iceServers);
           alert( iceservers_array);
        }
    };
    xhr.open("PUT", "https://global.xirsys.net/_turn/webrtc", true);
    xhr.setRequestHeader("Authorization","Basic " + btoa("altanai:<sec rettoken>"));
    xhr.setRequestHeader("Content-Type","application/json");
    xhr.send(JSON.stringify({"format": "urls"}));
});

The resulting output should look like ( my keys are hidden with a red rectangle ofcourse )

The process of adding a TURN / STUN to your webrtc script in JS is as follows :

iceServers:[
 {"url":"stun:turn2.xirsys.com"},
 {"username":"< put your API username>","url":"turn:turn2.xirsys.com:443?transport=udp","credential":"< put your API credentail>"},
 {"username":"< put your API username>","url":"turn:turn2.xirsys.com:443?transport=tcp","credential":"< put your API credentail>"}]

website : http://xirsys.com/technology/

Twillio TURN Server

// before unload update status on main site
window.addEventListener("load", function (e) {
    let xhr = new XMLHttpRequest();
    xhr.onreadystatechange = function ($evt) {
        if (xhr.readyState == 4 && xhr.status == 200) {
            let res = JSON.parse(xhr.responseText);
            console.log("response: ", res);
            iceservers_array = res.iceServers;
            console.log("iceservers_array: ", iceservers_array);
            CallSessionBegins();
        }
    };
    xhr.open("POST", "https://<ourdomain>:3000/token", true);
    xhr.setRequestHeader("Content-Type","application/json");
    xhr.send(JSON.stringify({"format": "urls"}));
});

Asterisk TURN module and service

Asterisk is an open source carrer grade SIP server which also provides Firewall traversal . A github repo containing some asterisk dialplan examples is https://github.com/altanai/asteriskexamples. An article discussing Asterisk and its implementation

Asterisk – installation and dial plans for WebRTC supported PJSIP clients

Asterisk is a framework or toolkit designed for VOIP systems . It can support Enterprise communication systems like PBXs, call distributors, VoIP gateways , conference bridges etc . It is open source and free to use . It is developed in C and runs in linux . Technically , Asterisk has protocol support for many…

Kamailio NAT and STUN modules in Server

It is also interesting to note that majority of popular open source SIP servers also have an implementation , libabary or module for STUN/TURN such as Kamailio’s STUN module https://www.kamailio.org/docs/modules/devel/modules/stun.html

You can read more about the kamailio DNS and NATing here

Kamailio DNS and NAT

DNS sub-system in Kamailio DNS failover DNS load balancing NAT ( Network Address Translation) NAT ( Network Address Translation) Why is NAT is important in SIP? Types of NAT solutions NAT behaviours RTP NAT Fixing NAT NAT Traversal Module Why use keepalive when Registrations are already there for NATing ? How keepalives work for NATing…

NAT traversal using STUN and TURN

NAT
What is ICE and why is it used ?
Call Flow for STUN protocol exchange
NAT types
Network Scenarios for NAT
Hole Punching

WebRTC : Web-based real-time communications is a gamechanger for real-time communication systems. WebRTC is one such open-source, royalty-free, unencumbered browser-based platform using the browser’s embedded media application programming interface (API). It allows developers to add custom JavaScript & HTML5 to control the media setup and flow. WebRTC has enabled developers to build apps, sites, widgets, plugins and extensions capable of delivering simultaneous audio, video, data, and screen-sharing capability in a peer to peer fashion.

Issues accross Networks : But something which escapes our attention is how media is traversing across the network. Of course, the webrtc sessions run smoothly when both the peers are on the open public internet without any restrictions or firewall blocks. But the real problem begins when one of the peers is behind a Corporate/Enterprise network or using a different Internet service provider with some security restrictions. In such a case the normal ICE capability of WebRTC is not sufficient to set up a bidirectional media streaming setup. For network restriction what is required is a NAT ( Network Address Traversal) mechanism that performs address discovery.

NAT and ICE Solution : STUN and TURN server protocols handle session initiations with handshakes between peers in different network environments. In the case of a firewall blocking a STUN peer-to-peer connection, the system fallback to a TURN server which provides the necessary traversing mechanism through the NAT.

Lets study from the start ie ICE.

NAT

Network Address Translation provides a mapping of internal to external IP addresses. This helps in network address modification for packets which in transit accross a tarfic routinig node such as inter networks.

A private address on the inside of the NAT is mapped to an external public address. Port address translation (PAT) resolves conflicts that arise when multiple hosts happen to use the same source port number to establish different external connections at the same time.

Some ways to acheive this

Application Layer Gateway (ALG)
Interactive Connectivity Establishment ( ICE )
UPnP Internet Gateway Device Protocol
propertiary SIP based Session Border Controller, so on

Lets us just look at ICE in detail which is the default implementation for WebRTC

What is ICE and why is it used ?

ICE (Interactive Connectivity Establishment ) framework ( mandatory by WebRTC standards ) find network interfaces and ports in Offer / Answer Model to exchange network based information with participating communication clients. ICE makes use of the Session Traversal Utilities for NAT (STUN) protocol and its extension, Traversal Using Relay NAT (TURN)

ICE is defined by RFC 5245 – Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols.

Sample WebRTC offer holding ICE candidates :

type: offer, sdp: v=0
o=- 3475901263113717000 2 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE audio video data
a=msid-semantic: WMS dZdZMFQRNtY3unof7lTZBInzcRRylLakxtvc
m=audio 9 RTP/SAVPF 111 103 104 9 0 8 106 105 13 126
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:/v5dQj/qdvKXthQ2
a=ice-pwd:CvSEjVc1z6cMnhjrLlcbIxWK
a=ice-options:google-ice
a=fingerprint:sha-256 F1:A8:2E:71:4B:4E:FF:08:0F:18:13:1C:86:7B:FE:BA:BD:67:CF:B1:7F:19:87:33:6E:10:5C:17:42:0A:6C:15
a=setup:actpass
a=mid:audio
a=sendrecv
a=rtcp-mux
a=rtpmap:111 opus/48000/2
a=fmtp:111 minptime=10
a=rtpmap:103 ISAC/16000
a=rtpmap:104 ISAC/32000
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:106 CN/32000
a=rtpmap:105 CN/16000
a=rtpmap:13 CN/8000
a=rtpmap:126 telephone-event/8000
a=maxptime:60
m=video 9 RTP/SAVPF 100 116 117 96
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:/v5dQj/qdvKXthQ2
a=ice-pwd:CvSEjVc1z6cMnhjrLlcbIxWK
a=ice-options:google-ice
a=fingerprint:sha-256 F1:A8:2E:71:4B:4E:FF:08:0F:18:13:1C:86:7B:FE:BA:BD:67:CF:B1:7F:19:87:33:6E:10:5C:17:42:0A:6C:15
a=setup:actpass
a=mid:video
a=sendrecv
a=rtcp-mux
a=rtpmap:100 VP8/90000
a=rtcp-fb:100 ccm fir
a=rtcp-fb:100 nack
a=rtcp-fb:100 nack pli
a=rtcp-fb:100 goog-remb
a=rtpmap:116 red/90000
a=rtpmap:117 ulpfec/90000
a=rtpmap:96 rtx/90000
a=fmtp:96 apt=100
m=application 9 DTLS/SCTP 5000
c=IN IP4 0.0.0.0
a=ice-ufrag:/v5dQj/qdvKXthQ2
a=ice-pwd:CvSEjVc1z6cMnhjrLlcbIxWK
a=ice-options:google-ice
a=fingerprint:sha-256 F1:A8:2E:71:4B:4E:FF:08:0F:18:13:1C:86:7B:FE:BA:BD:67:CF:B1:7F:19:87:33:6E:10:5C:17:42:0A:6C:15
a=setup:actpass
a=mid:data
a=sctpmap:5000 webrtc-datachannel 1024

Notice the ICE candidates under video and audio. Now take a look at the SDP answer

type: answer, sdp: v=0
o=- 6931590438150302967 2 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE audio video data
a=msid-semantic: WMS R98sfBPNQwC20y9HsDBt4to1hTFeP6S0UnsX
m=audio 1 RTP/SAVPF 111 103 104 0 8 106 105 13 126
c=IN IP4 0.0.0.0
a=rtcp:1 IN IP4 0.0.0.0
a=ice-ufrag:WM/FjMA1ClvNb8xm
a=ice-pwd:8yy1+7x0PoHZCSX2aOVZs2Oq
a=fingerprint:sha-256 7B:9A:A7:43:EC:17:BD:9B:49:E4:23:92:8E:48:E4:8C:9A:BE:85:D4:1D:D7:8B:0E:60:C2:AE:67:77:1D:62:70
a=setup:active
a=mid:audio
a=sendrecv
a=rtcp-mux
a=rtpmap:111 opus/48000/2
a=fmtp:111 minptime=10
a=rtpmap:103 ISAC/16000
a=rtpmap:104 ISAC/32000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:106 CN/32000
a=rtpmap:105 CN/16000
a=rtpmap:13 CN/8000
a=rtpmap:126 telephone-event/8000
a=maxptime:60
m=video 1 RTP/SAVPF 100 116 117 96
c=IN IP4 0.0.0.0
a=rtcp:1 IN IP4 0.0.0.0
a=ice-ufrag:WM/FjMA1ClvNb8xm
a=ice-pwd:8yy1+7x0PoHZCSX2aOVZs2Oq
a=fingerprint:sha-256 7B:9A:A7:43:EC:17:BD:9B:49:E4:23:92:8E:48:E4:8C:9A:BE:85:D4:1D:D7:8B:0E:60:C2:AE:67:77:1D:62:70
a=setup:active
a=mid:video
a=sendrecv
a=rtcp-mux
a=rtpmap:100 VP8/90000
a=rtcp-fb:100 ccm fir
a=rtcp-fb:100 nack
a=rtcp-fb:100 nack pli
a=rtcp-fb:100 goog-remb
a=rtpmap:116 red/90000
a=rtpmap:117 ulpfec/90000
a=rtpmap:96 rtx/90000
a=fmtp:96 apt=100
m=application 1 DTLS/SCTP 5000
c=IN IP4 0.0.0.0
b=AS:30
a=ice-ufrag:WM/FjMA1ClvNb8xm
a=ice-pwd:8yy1+7x0PoHZCSX2aOVZs2Oq
a=fingerprint:sha-256 7B:9A:A7:43:EC:17:BD:9B:49:E4:23:92:8E:48:E4:8C:9A:BE:85:D4:1D:D7:8B:0E:60:C2:AE:67:77:1D:62:70
a=setup:active
a=mid:data
a=sctpmap:5000 webrtc-datachannel 1024

STUN	TURN
address discovery for global IP:port	allocates its own address as interface to the client
binary protocol	extension of STUN
doesnt stay in path after connection	stays in path after connection. tunnels and relays media
higher priority	lower priority
server and peer reflexive ICE candidates	relay ICE candidates

Call Flow for STUN protocol exchange

Client -> Server : binding request with attributes – CHANGE-REQUEST

Server -> Cient : binding response with attributes – MAPPED-ADDRESS, RESPONSE-ORIGIN, OTHER-ADDRESS, XOR-MAPPED-ADDRESS

WebRTC needs SDP Offer to be sent to B from A.
Client B uses this SDP offer to generate an SDP Answer for A.
The SDP ( as seen on chrome://webrtc-internals/ ) includes ICE candidates which map open ports in the firewalls.

However, in case both sides are symmetric NATs, the media flow gets blocked. For such a case TURN is used which tries to give a public IP and port mapped to internal IP and port. This relay path provides an alternative routing mechanism like a packet mirror. It can open a DTLS connection and use it to key the SRTP-DTLS media streams.

NAT types

Some types of NAT are described below

Full Cone ( Normal)

All requests from the same internal IP address and port are mapped to the same external IP address and port. It also allows external hosts to send packet to internal host by using the mapped external address.

Restricted Cone

All requests from the same internal IP address and port are mapped to the same external IP address and port, but external hosts can send packet to internal host only if internal host had previously sent a packet to that IP address.

Port Restricted Cone

Symmetric

All requests from the same internal IP address and port, to a specific destination IP address and port, are mapped to the same external IP address and port. Any traffic from same internal IP+port to a different destination uses a new mapping. Also external hosts which receives a packet can send a UDP packet back to internal host.

Web Archieve 2018
credit https://web.archive.org/web/20180829131023/http://nattest.net.in.tum.de/results.php

Network Scenarios for NAT

In order to Understand this better consider various scenarios that determine the NAT Mapping Behavior one could run tests using cli or network analyzer tools and checking checking the XOR-MAPPED-ADDRESS value of the Binding Response message that the client receives

Mapping behaviour

Endpoint-Independent Mapping NAT (EIM-NAT)
Address-Dependent Mapping NAT (ADM-NAT)
Address and Port-Dependent Mapping NAT (APDM-NAT)

Filtering behaviour

Endpoint-Independent Filtering NAT (EIF-NAT)
Address-Dependent Filtering NAT (ADF-NAT)
Address and Port-Dependent Filtering NAT (APDF-NAT)

Hole Punching

As long as one end of the connection is able to determine the dynamic association of thee other [arty by NAT and send data , hole punching can work.

Permissive NAT mapping techniques which map the same internal address/port consistently to an external address/port are suitable for hole punching such as full cone , address or port restricted NAT. However pure symmetric NAT have inconsistent destination specific port mapping and thus cannot do hole punching.

1 . No Firewall present on either peer. Both connected to open public internet

Diagrammatic representation of this shown as follows :

In this case there is no restriction to signal or media flow and the call takes places smoothly in p2p fashion.

2. Either one or both the peer ( could be many in case of multi conf call ) are present behind a firewall or restrictive connection or router configured for intranet

In such a case the signal may pass with the use of default ICE candidates or simple ppensource google Stun server such as

iceServers:[
 { 'url': "stun:stun.l.google.com:19302"}]

Diagram :

WebRTC signalling when peers are behind firewalls

However the media is restricted resulting in a black / empty / no video situation for both peers . To combat such situation a relay mechanism such as TURN is required which essentially maps public ip to private ips thus creating a alternative route for media and data to flow through .

Peer config should look like :

var configuration =  {
  iceServers: [
 { "url':"stun::"},
 { "url":"turn::"}
  ]};

var pc = new RTCPeerConnection(configuration);

3. When the TURN server is also behind a firewall

The config file of the turn server need to be altered to map the public and private IP. The diagrammatic description of this is as follows :

WebRTC media flow when peers are behind NAT and TURN server is behind NAT as well . TURN config files bind a public interface to private interface address. — WebRTC media flow when peers are behind NAT and TURN server is behind NAT as well . TURN config files bind a public interface to private interface address .

References :

RFC 3489 STUN – Simple Traversal of User Datagram Protocol (UDP)Through Network Address Translators (NATs)
RFC 5928 Traversal Using Relays around NAT (TURN) Resolution Mechanism
webrtchacks https://webrtchacks.com/symmetric-nat/
netmanias https://www.netmanias.com/en/post/techdocs/6067/nat-network-protocol/nat-behavior-discovery-using-stun-rfc-5780
wikipedia https://en.wikipedia.org/wiki/Network_address_translation

SIP VoIP system architecture basics

Infrastructure-requirements
Integral Components of a VOIP SIP based architecture
- SIP Gateways
- Registrar Server
- Proxy Server
  - Stateless Proxy Server
  - Stateful Proxy Server
  - Forking Proxy Server
- Redirect Server
- Application Server
Adding Media Management
- DTMF( Dual tone Multi Frequency )
- TTS ( Text to Speech )
Developing SIP based applications
- Basic SIP methods
- Extending SIP headers
- Call routing Scripts
SIP platform Development
- Performance factors
- Security considerations
- Collecting and Processing PCAPS
NAT and DNS
- Near End NAT traversal
  - STUN
  - TURN
  - ICE
- Far End NAT traversal
CDR processing and Billing
- Data Streams for billing service
- Call Rate and Accounting
Cross platform and integration to External Telecommunication provider landscape
- Adhere to Standard
- Database Integration
- Call RatConsistency of Call Records and duplicated charging records at various endpoints

A VOIP/CPaaS solution is designed to accommodate the signalling and media both along with integration leads to various external endpoints such as various SIP phones ( desktop, softphones, webRTC ), telecom carriers, different VoIP networks providers, enterprise applications ( Skype, Microsoft Lync ), Trunks etc.

A sufficiently capable SIP platform should have

Audio calls ( optionally video ) service using SIP gateways
Media services (such as recording , conferencing, voicemail, and IVR )
Messaging and presence ( could be using SIP SIMPLE, SMS , messahing service from third parties)
Developing SIP based applications : Programmable services through standardized APIs and development of new modules
NAT and DNS near-end and far-end NAT traversal for signalling and media flows
Telemetry for Sessions , Registry, Location and lookup service
CDR Processing and Billing : Backend for CDR and accounts ( can use Redis, Kafka , MySQL, PostgreSQL, Oracle, Radius, LDAP, Diameter)
Serial and parallel forking, load balancing , proxying
Cross platform and integration to External Telecommunication provider landscape
- Interconnectivity with other IP multimedia systems, VoLTE ( optional interconnection with other types of communications networks as GSM or PSTN/ISDN).
- support for VoIP signalling protocols (SIP, H,323, SCCP, MGCP, IAX) and telephony signalling protocols ( ISDN/SS7, FXS/FXO, Sigtran ) either internally via pluggable modules or externally via gateways .

Performnace factors :	Security considerations :
High availability using redundant servers in standby Load balancing IPv4 and IPv6 network layer support TCP , UDP , SCTP transport layer protocol support DNS lookups and hop by hop connectvity	authentication, authorization, and accounting (AAA) Digest authentication and credentials fetched from backend Media Encryption TLS and SRTP support Topology hidding to prevent disclosing IP form internal components in via and route headers Firewalls , blacklist, filters , peak detectors to prevent Dos and Ddos attacks

The article only outlines SIP system architecture from 3 viewpoints :

Infrastructure standpoint
Vore voice engineering perspective
External components required to run and system

Infrastructure Requirements

Data Centers with BCP ( Business Continuity Planning ) and DR ( Disaster Recovery )
Servers and Clusters for faster and parallel calculating
Virtualization
VMs to make a distributed computing environment with HA ( high availability ) and DRS ( Distributed Resource Scheduling )
Storage
SAN with built-in redundancy for the resiliency of data.
WORM compliant NAS for storing voice archives over a retention period.
Racks, power supplies, battery backups, cages etc.
Networking
DMZs ( Demilitarized Zones) which are interfacing areas between internal servers in the green zone and outside network
VLANs for segregation between tenants.
Connectivity through the public Internet as well as through VPN or dedicated optical fibre network for security.
Firewall configuration
Load Balancer ( Layer 7 )
Reverse Proxies for the security of internal IPs and port
Security controls In compliance with ISO/IEC 27000 family – Information security management systems
PKI Infrastructure to manage digital certificates
Key management with HSM ( hardware security module )
truster CA ( Certificate Authority ) to issue publicly signed certificate for TLS ( Https, wss etc)
OWASP ( Open Web Application Security Project ) rules compliance

Integral Components of a VOIP SIP based architecture

Call Controller
Media Manager
Recording
Softclients
logs and PCAP archives
CDR generators
Session Borer Controllers ( SBCs)

A SIP server can be moulded to take up any role based on the libraries and programs that run on it such as gateway server, call manager, load balancer etc. This in turn defines its placement in overall VoIP communication architecture. For example
– stateless proxy servers are placed on the border,
– application and B2BUA server at the core

SIP Gateways

A SIP gateway is an application that interfaces a SIP network to a network utilising another signalling protocol. In terms of the SIP protocol, a gateway is just a special type of user agent, where the user agent acts on behalf of another protocol rather than a human. A gateway terminates the signalling path and can also terminate the media path .

sip gaeways — To PSTN for telephony inter-working
To H.323 for IP Telephony inter-working
Client – originates message
Server – responds to or forwards message

Logical SIP entities are:

User Agent Client (UAC): Initiates SIP requests ….
User Agent Server (UAS): Returns SIP responses ….
Network Servers ….

Registrar Server

A registrar server accepts SIP REGISTER requests; all other requests receive a 501 Not Implemented response. The contact information from the request is then made available to other SIP servers within the same administrative domain, such as proxies and redirect servers. In a registration request, the To header field contains the name of the resource being registered, and the Contact header fields contain the contact or device URIs.

Proxy Server

A SIP proxy server receives a SIP request from a user agent or another proxy and acts on behalf of the user agent in forwarding or responding to the request. Just as a router forwards IP packets at the IP layer, a SIP proxy forwards SIP messages at the application layer.

Typically proxy server ( inbound or outbound) have no media capabilities and ignore the SDP . They are mostly bypassed once dialog is established but can add a record-route .
A proxy server usually also has access to a database or a location service to aid it in processing the request (determining the next hop).

1. Stateless Proxy Server
A proxy server can be either stateless or stateful. A stateless proxy server processes each SIP request or response based solely on the message contents. Once the message has been parsed, processed, and forwarded or responded to, no information (such as dialog information) about the message is stored. A stateless proxy never retransmits a message, and does not use any SIP timers

2. Stateful Proxy Server
A stateful proxy server keeps track of requests and responses received in the past, and uses that information in processing future requests and responses. For example, a stateful proxy server starts a timer when a request is forwarded. If no response to the request is received within the timer period, the proxy will retransmit the request, relieving the user agent of this task.

3 . Forking Proxy Server
A proxy server that receives an INVITE request, then forwards it to a number of locations at the same time, or forks the request. This forking proxy server keeps track of each of the outstanding requests and the response. This is useful if the location service or database lookup returns multiple possible locations for the called party that need to be tried.

Redirect Server

A redirect server is a type of SIP server that responds to, but does not forward, requests. Like a proxy server, a redirect server uses a database or location service to lookup a user. The location information, however, is sent back to the caller in a redirection class response (3xx), which, after the ACK, concludes the transaction. Contact header in response indicates where request should be tried .

Application Server

The heart of all call routing setup. It loads and executes scripts for call handling at runtime and maintains transaction states and dialogs for all ongoing calls . Usually the one to rewrite SIP packets adding media relay servers, NAT . Also connects external services like Accounting , CDR , stats to calls .

Adding Media Management

Media processing is usually provided by media servers in accordance to the SIP signalling. Bridges, call recording, Voicemail, audio conferencing, and interactive voice response (IVR) are commomly used. Read more about Media Architecture here

RFC 6230 Media Control Channel Framework decribes framework and protocol for application deployment where the application programming logic and media processing are distributed.

Any one such service could be a combination of many smaller services within such as Voicemail is a combitional of prompt playback, runtime controls, Dual-Tone Multi-Frequency (DTMF) collection, and media recording. RFC 6231 Interactive Voice Response (IVR) Control Package for the Media Control Channel Framework.

RealTime Transport protocol (RTP) and supporting protocols

Media Architecture , RTP topologies

DTMF( Dual tone Multi Frequency )

delivery options:

Inband – With Inband digits are passed along just like the rest of your voice as normal audio tones with no special coding or markers using the same codec as your voice does and are generated by your phone.
Outband – Incoming stream delivers DTMF signals out-of-audio using either SIP-INFO or RFC-2833 mechanism, independently of codecs – in this case, the DTMF signals are sent separately from the actual audio stream.

TTS ( Text to Speech )

Alexa Text-to-Speech (TTS) + Amazon Polly

Ivona – multiple language text to speech converter with ssml scripts such as below

      <speak>
          <p>
              <s><prosody rate="slow">IVONA</prosody> means highest quality speech
              synthesis in various languages.</s>
              <s>It offers both male and female radio quality voices <break/> at a
              sampling rate of 22 kHz <break/> which makes the IVONA voices a
              perfect tool for professional use or individual needs.</s>
          </p>
      </speak>

check ivona status

service ivona-tts-http status
 tail -f /var/log/tts.log

Developing SIP based applications

Basic SIP methods

SIP defines basic methods such as INVITE, ACK and BYE which can pretty much handle simple call routing with some more advanced processoes too like call forwarding/redirection, call hold with optional Music on hold, call parking, forking, barge etc.

Extending SIP headers

Newer SIP headers defined by more updated SIP RFC’s contina INFO, PRACK, PUBLISH, SUBSCRIBY, NOTIFY, MESSAGE, REFER, UPDATE. But more methods or headers can be added to baseline SIP packets for customization specific to a particular service provider. In case where a unrecognized SIP header is found on a SIP proxy which it either does not suppirt or doesnt understand, it will simply forward it to the specified endpoint.

Call routing Scripts

Interfaces for programming SIP call routing include :
– Call Processing Language—SIP CPL,
– Common Gateway Interface—SIP CGI,
– SIP Servlets,
– Java API for Integrated Networks—JAIN APIs etc .

Some known SIP stacks :

SailFin – SIP servlet container uses GlassFish open source enterprise Application Server platform (GPLv2), obsolete since merger from Sun Java to Oracle.

Mobicents – supports both JSLEE 1.1 and SIP Servlets 1.1 (GPLv2)

Cipango – extension of SIP Servlets to the Jetty HTTP Servlet engine thus compliant with both SIP Servlets 1.1 and HTTP Servlets 2.5 standards.

WeSIP – SIP and HTTP ( J2EE) converged application server build on OpenSER SIP platform

Additionally SIP stacks are supported on almost all popular SIP programming lanaguges which can be imported as lib and used for building call routing scripts to be mounted on SIP servers or endpoints such as :

PJSIP in C

JSSIP Javascript

Sofia in kamailio , Freswitch

Some popular SIP server also have proprietary scripting language such as –
Asterisk Gateway Interface (AGI) , application interface for extending the dialplan with your functionality in the language you choose – PHP, Perl, C, Java, Unix Shell and others

SIP platform Development

audio calls ( optionally video )
media services such as conferencing, voicemail, and IVR,
messaging as IM and presence based on SIMPLE,
programmable services through standardized APIs and development of new modules
near-end and far-end NAT traversal for signalling and media flows
interconnectivity with other IP multimedia systems, VoLTE ( optional interconnection with other types of communications networks as GSM or PSTN/ISDN)
Registry, location and lookup service
Serial and parallel forking

SIP Servlets – Develop and Deploy

Service Creation Environment (SCE ) for SIP Applications

A sufficiently capable SIP platform shoudl consist of following features :

Performance factors :

High availability using redundant servers in standby
Load balancing
IPv4 and IPv6 support

Security considerations :

digest authentication and credentials fetched from backend
Media Encryption
TLS and SRTP support
Topology hiding to prevent disclosng IP form internal components in via and route headers
Firewalls , blacklist, filters , peak detectors to prevent Dos and Ddos attacks .

Collecting and Processing PCAPS

VoIP monitor – network packet sniffer with commercial frontend for SIP RTP RTCP SKINNY(SCCP) MGCP WebRTC VoIP protocols

it uses a passive network sniffer (like tcpdump or wireshark) to analyse packets in realtime and transforms all SIP calls with associated RTP streams into database CDR record which is sent over the TCP to MySQL server (remote or local). If enabled saving SIP / RTP packets the sniffer stores each VoIP call into separate files in native pcap format (to local storage).

sngrep
tcpdump
custom made pcap capture and uploader

EEP (formely HEP) Extensible Encapsulation Protocol with HOMER

NAT and DNS

To adapt SIP to modern IP networks with inter network traversal ICE, far and near-end NAT traversal solutions are used. Network Address traversal is crtical to traffic flow between private public network and from behind firewalls and policy controlled networks
One can use any of the VOVIDA-based STUN server, mySTUN , TurnServer, reStund , CoTURN , NATH (PJSIP NAT Helper), ReTURN, or ice4j

Near-end NAT traversal

STUN (session traversal utilities for NAT) – UA itself detect presence of a NAT and learn the public IP address and port assigned using Nating. Then it replaces device local private IP address with it in the SIP and SDP headers. Implemented via STUN, TURN, and ICE.
limitations are that STUN doesnt work for symmetric NAT (single connection has a different mapping with a different/randomly generated port) and also with situations when there are multiple addresses of a end point.

TURN (traversal using relay around NAT) or STUN relay – UA learns the public IP address of the TURN server and asks it to relay incoming packets. Limitatiosn since it handled all incoming and outgong traffic, it must scale to meet traffic requirments and should not become the bottle neck junction or single point of failure.

ICE (interactive connectivity establishment) – UA gathers “candidates of communication” with priorities offered by the remote party. After this client pairs local candidates with received peer candidates and performs offer-answer negotiating by trying connectivity of all pairs, therefore maximising success. The types of candidates :
– host candidate who represents clients’ IP addresses,
– server reflexive candidate for the address that has been resolved from STUN
– and a relayed candidate for the address which has been allocated from a TURN relay by the client.

Far-end NAT traversal

UA is not concerned about NAT at all and communicated using its local IP port. The border controller implies a NAT handling components such as an application layer gateway (ALG) or universal plug and play (UPnP) etc which resolves the private and public network address mapping by act as a back to back user agent (B2BUA).
Far end NAT can also be enabled by deploying a public SIP server which performs media relay (RTP Proxy/Media proxy).

Limitations of this approach
(-) security risks as they are operating in the public network
(-) enabling reverse traffic from UAS to UAC behind NAT.

A keep-alive mechanism is used to keep NAT translations of communications between SIP endpoint and its serving SIP servers opened , so that this NAT translation can be reused for routing. It contains client-to-server “ping” keep-alive and corresponding server-to-client “pong” messages. The 2 keep-alive mechanisms: a CRLF keep-alive and a STUN keep-alive message exchange.

The 3 types of SIP URIs,

address of record (AOR)
fully qualified domain name (FQDN)
globally routable user agent (UA) URI
SIP uniform resource identifiers (URIs) are identified based on DNS resolution since the URI after @ symbol contains hostname , port and protocl for the next hop.

Adding record route headers for locating the correct SIP server for a SIP message can be done by :
– DNS service record (DNS SRV)
– naming authority pointer (NAPTR) DNS resource record

Steps for SIP endpoints locating SIP server

From SIP packet get the NAPTR record to get the protocl to be used
Inspect SRV record to fetch port to use
Inspect A/AAA record to get IPv4 or IPv6 addresses
ref : RFC 3263 – Locating SIP Servers
Can use BIND9 server for DNS resolution supports NAPTR/SRV, ENUM, DNSSEC, multidomains, and private trees or public trees.

NAT traversal using STUN and TURN

CDR Processing and Billing

CDR store call detail records along with proof of call with tiemstamps, orignation, destination, duaration, rate etc. At the end of month or any other term, the aggregated CDR are cumulatively processed to generate the bill for a user. This heavy data stream needs to be accurately processed and this can be achived by using data-pipelines like AWS kinesis or Kafka eventstore.

The prime requirnment for the system is to handle enormous amount of call records data in relatime , cater to a number of producers and consumers.

For security the data is obfuscated into blob using base 64 encoding.

For good consistency only a single shard should be rsponsible to process one user account’s bill.

Data Streams for billing service

AWS Kinesis – Kinesis Data Streams is sued for for rapid and continuous data intake and aggregation. The type of data used can include IT infrastructure log data, application logs, social media, market data feeds, and web clickstream data. It supports data sharding (ie number of call records grouped) and uses a partition Key ( string MD5 hash) to determine which shard the record goes to.

(+) This system can handle high volume of data in realtime and produce call uuid specfic reults which can be consumed by consumers waiting for the processed results

(-) If not consumed with a pre-specified time duration the processed results expire and are irretrivable . Self implement publisher to store teh processed reults from kisesis stream to data stores like Redis / RDBMS or other storge locations like s3 , dynamo DB. If pieline crashes during operation , data is lost

(-) Data stream should have low latency igesting contnous data from producer and presenting data to consumer.

Call Rate and Accounting

Generally data streams proecssing are used for crtical and voluminious service usage like for
– metering/billing
– server activity,
– website clicks,
– geo-location of devices, people, and physical goods

Call Rates are very crticial for billing and charging the calls . Any updates from the customer or carriers or individuals need to propagate automatically and quickly to avoid discrpencies and neagtive margins. CDRs need to be processed sequentially and incrementally on a record-by-record basis or over sliding time windows, and used for a wide variety of analytics including correlations, aggregations, filtering, and sampling.

To acheieve this the follow setup is ideal to use the new input rate sheet values via web UI console or POST API and propagate it quickly to main DB via AWS SQS which is a queing service and AWS lamda which is a serverless trigger based system . This ensures that any new input rates are updates in realtime and maintin fallback values in s3 bucket too

Call Rate and Accounting using task pipes , lambda serverless and qiueing service. Uses s3 buckets , AWS lambda, AWS SQS and AWS RDS. — Call Rate and Accounting using task pipes , lambda serverless and qiueing service

Cross platform and integration to External Telecommunication provider landscape

It is an advantage to plan for ahead for connection with IMS such as openIMS, support for Voip signalling protocols (SIP, H,323, SCCP, MGCP, IAX) and telephony signalling protocls ( ISDN/SS7, FXS/FXO, Sigtran ) either internally via pluggable modules or externally via gateways or for SIP trunking integration via OTT providers/ cloud telephony.

Adhere to Standard

The obvious starting milestone before making a full-scale carrier-grade, SIP-based VoIP system is to start by building a PBX for intra-enterprise communication. There are readily available solutions to make an IP telephony PBX Kamailio, FreeSWITCH, asterisk, Elastix, SipXecs. It is important to use the standard protocol and widely acceptable media formats and codecs to ensure interoperability and reduce compute and delay involved in protocol or media transcoding.

Database Integration

Need backend , cache , databse integration to npt only store routing rules with temporary varaible values but also aNeed backend, cache, database integration to not only store routing rules with temporary variable values but also account details, call records details, access control lists etc. Should therefore extend integration with text-based DB, Redis, MySQL, PostgreSQL, OpenLDAP, and OpenRadius.

Consistency of Call Records and duplicated charging records at various endpoints

In current Voip scenarios a call may be passing thorugh various telco providers , ISP and cloud telephony serviIn current VoIP scenarios, a call may be passing through various telco providers, ISP and cloud telephony service providers where each system maintains its own call records and billing. This in my opinion is duplication and can be avoided by sharing a consistent data store possible in the blockchain. This is an experimental idea that I have further explored in this article

Blockchain integarted with Voice over IP platforms

There are other external components to setup a VOIP solution apart from Core voice Servers and gateways like the ones listed below, I will try to either add a detailed overall architecture diagram here or write about them in an seprate article. Keep watching this space for updates

Payment Gateways
Billing and Invoice
Fraud Prevention
Contacts Integration
Call Analytics
API services
Admin Module
Number Management ( DIDs ) and porting
Call Tracking
Single Sign On and User Account Management with Oauth and SAML
Dashboards and Reporting
Alert Management
Continuous Deployment
Automated Validation
Queue System
External cache

References :

[1]AWS kinesis –https://docs.aws.amazon.com/streams/latest/dev/introduction.html
[2]AWazon docs streaming data – https://aws.amazon.com/streaming-data/
[3]VOIP monitor Archietcture – https://www.voipmonitor.org/doc/Architecture
[4]TTS Ivona – http://developer.ivona.com/en/ttsresources/ssml/ssml.html

SIP solutioning and architectures is a subsequent article after SIP introduction, which can be found here.

SIP ( Session Initiation Protocol )

Read about VoIP/ OTT / Telecom Solution startup’s strategy for Building a scalable flexible SIP platform which includes :

Scalable and Flexible SIP platform building
Cluster SIP telephony Server for High Availability
Failure Recovery
Multi-tier cluster architecture
Role Abstraction / Micro-Service based architecture
Distributed Event management and Event-Driven architecture
Containerization
Autoscaling Cloud Servers
Open standards and Data Privacy
Flexibility for inter-working – NextGen911 , IMS , PSTN
security and Operational Efficiencies

VoIP/ OTT / Telecom Solution startup’s strategy for building a scalable flexible SIP platform

	Boris Ivanov on Asterisk – installation…
	Paras Kumar on Hosted IP-PBX and SBC
	altanai on Hosted IP-PBX and SBC
	Debra Olsen on Streaming / broadcasting Live…
	Things to know about… on WebRTC
	Hugo K on FreeSwitch SIP and Media …
	Bert H on Evolution of voice Commun…

Local Offer created by side initiating the session / Caller

Local Answer created by side receiving the session/ Callee

Session-Level Parsing

Media Section Parsing

Why is NAT is important in SIP?

Types of NAT solutions

NAT behaviours

RTP NAT

Fixing NAT

Statistics

NAT pinging types

Related

rfc5766-turn-server

Terminal Commands

Test

Usage

Protocols :

Authetication :

Installation

Usuage:

Specifications:

Related

Related

SIP Gateways

Registrar Server

Proxy Server

Redirect Server

Application Server

DTMF( Dual tone Multi Frequency )

TTS ( Text to Speech )

Basic SIP methods

Extending SIP headers

Call routing Scripts

Collecting and Processing PCAPS

Near-end NAT traversal

Far-end NAT traversal

Data Streams for billing service

Call Rate and Accounting

Database Integration