- JSEP (JavaScript Session Establishment Protocol)
- Offer/Answer Exchange Flow
- Perform WebRTC handshake
- Outgoing Call
- Incoming Call
- Signalling state Transitions on PeerConnection
- Detailed Offer / Answer SDP
- Subsequent Offers
- Subsequent Answers
- Modifying Offer/answer SDP
- SDP Parsing
- Session level parsing
- Media Section parsing
- Interactive Connectivity Establishment (ICE) for NAT traversal
- ICE Gathering
- Mapping Server Reflexive address
- STUN Binding request for connectivity checks on CANDIDATE PAIRS
- Example of ICE gather state
This article explains the detailed offer/answer flow of the WebRTC handshake and JSEP. You can read the following articles on WebRTC as a prerequisite before reading this one. WebRTC exposes these main APIs: PeerConnection, getUserMedia, DataChannel and getStats.
JSEP (JavaScript Session Establishment Protocol)
JSEP is used during signalling, via the W3C-recommended RTCPeerConnection API, to set up a multimedia session. The session description specifies the critical components of a session between the local and remote sides, such as transport ports, protocols and profiles. JSEP also handles the interaction with the ICE state machine.
Offer/Answer Exchange Flow
Prerequisite: set up the client side for the caller
PeerConnectionFactory to generate PeerConnections
PeerConnection for every connection to a remote peer
MediaStream with audio and video from the client device
1. The side initiating the session creates an offer with the createOffer() API
aPromise = myPeerConnection.createOffer([options]);
options is of type RTCOfferOptions:
- iceRestart
- offerToReceiveAudio (legacy)
- offerToReceiveVideo (legacy)
- voiceActivityDetection
2. The application then stores the offer as its local description with the setLocalDescription() API
myPeerConnection.createOffer().then(function(offer) { return myPeerConnection.setLocalDescription(offer); })
3. The offer is sent to the remote side using the application's choice of signalling (SIP, WS, HTTP, XMPP ...)
4. The remote party stores it using the setRemoteDescription() API
myPeerConnection.setRemoteDescription(sdp).then(function () { return createMyStream(); })
5. The remote party generates an answer using the createAnswer() API
aPromise = myPeerConnection.createAnswer([options]);
6. The remote party stores the answer as its local description using the setLocalDescription() API
7. The answer is transferred to the initiating side, again using the chosen signalling (SIP, WS, HTTP, XMPP ...)
8. The initiating side stores it using the setRemoteDescription() API
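The exchange above can be sketched as three small functions, one for each leg. This is a minimal sketch rather than a complete client: `pc` is assumed to be an RTCPeerConnection (or any object exposing the same four methods), and `sendToRemote` is a hypothetical callback standing in for whatever signalling transport the application uses.

```javascript
// Caller: create the offer, store it locally, then hand it to signalling.
// `sendToRemote` is a hypothetical callback (SIP, WS, HTTP, XMPP ...).
async function startCall(pc, sendToRemote) {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  sendToRemote(offer);
}

// Callee: store the remote offer, create the answer, store it, send it back.
async function handleOffer(pc, offer, sendToRemote) {
  await pc.setRemoteDescription(offer);
  const answer = await pc.createAnswer();
  await pc.setLocalDescription(answer);
  sendToRemote(answer);
}

// Caller: store the remote answer, which completes the exchange.
async function handleAnswer(pc, answer) {
  await pc.setRemoteDescription(answer);
}
```

Keeping signalling behind a callback mirrors the fact that WebRTC itself does not define how the offer and answer travel between the two sides.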
Interfaces of webrtc and tracks to stream addition

Perform WebRTC handshake
WebRTC call setup and incoming-call flow between remote peer, PeerConnectionFactory, PeerConnection and application
Outgoing Call: set up a call with the remote side by sending an offer, then wait for the remote’s answer and process it to create the session.

Incoming Call: receive the remote’s offer and process it to reply with an answer.

Signalling state Transitions on PeerConnection
When the caller creates a new RTCPeerConnection(), the RTCSignalingState is “stable”, as the remote and local descriptions are empty.
When the caller starts the call with createOffer(), it obtains the offer SDP and proceeds to store it locally with setLocalDescription(offer); the RTCSignalingState is now “have-local-offer”. After that the caller sends the offer to the callee over the signalling channel.
Similarly, the callee starts with RTCSignalingState “stable”; once it receives the offer and stores the remote’s offer using setRemoteDescription(offer), its state is “have-remote-offer”.
The callee may generate a provisional answer for the caller and store it locally; its state then transitions to “have-local-pranswer”. The pranswer SDP is sent to the caller over the signalling channel.
The caller stores the callee’s pranswer SDP and its state updates to “have-remote-pranswer”.

Once a final answer has been applied and no offer/answer exchange is in progress, the state changes back to “stable”.
The state changes to “closed” if the RTCPeerConnection is closed.
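The transitions described above can be written down as a lookup table. The sketch below is an illustration of the state diagram, not browser code: the table and the `nextSignalingState` helper are hypothetical names, and rollback and the “closed” state are left out for brevity.

```javascript
// RTCSignalingState transitions, keyed by the current state and then by
// "<local|remote>:<description type>". Rollback and "closed" are omitted.
const SIGNALING_TRANSITIONS = {
  'stable': {
    'local:offer': 'have-local-offer',
    'remote:offer': 'have-remote-offer',
  },
  'have-local-offer': {
    'local:offer': 'have-local-offer',
    'remote:pranswer': 'have-remote-pranswer',
    'remote:answer': 'stable',
  },
  'have-remote-offer': {
    'remote:offer': 'have-remote-offer',
    'local:pranswer': 'have-local-pranswer',
    'local:answer': 'stable',
  },
  'have-local-pranswer': {
    'local:pranswer': 'have-local-pranswer',
    'local:answer': 'stable',
  },
  'have-remote-pranswer': {
    'remote:pranswer': 'have-remote-pranswer',
    'remote:answer': 'stable',
  },
};

// Returns the state reached after applying a description, or throws if the
// description type is not allowed in the current state.
function nextSignalingState(state, source, type) {
  const next = (SIGNALING_TRANSITIONS[state] || {})[`${source}:${type}`];
  if (!next) {
    throw new Error(`Invalid transition: ${source} ${type} in state ${state}`);
  }
  return next;
}

// Caller's view of an exchange that includes a provisional answer.
let callerState = 'stable';
callerState = nextSignalingState(callerState, 'local', 'offer');
callerState = nextSignalingState(callerState, 'remote', 'pranswer');
callerState = nextSignalingState(callerState, 'remote', 'answer');
console.log(callerState); // "stable"
```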
Detailed Offer / Answer SDP
Local Offer created by side initiating the session / Caller
The first offer, called the initial offer, can carry dummy data in the connection line, such as 0.0.0.0, to prevent leaking a local IP address.
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
“o=” line contains <username> <sess-id> <sess-version> <nettype> <addrtype> <unicast-address>
o=- 4445251981417004127 2 IN IP4 127.0.0.
Here the username is “-” and the session id is 4445251981417004127. The session name in the “s=” line is likewise set to “-”.
“t=” line shows <start time> <stop time>
t=0 0
Full session Block example
type: offer, sdp: v=0
o=- 4445251981417004127 2 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE 0 1 2
a=msid-semantic: WMS DYVK4IA4kA8LvnIYWjXhRzMgSGicnwVutWE2
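To make the “o=” field layout concrete, here is a small sketch that splits an origin line into its six fields. The function name `parseOriginLine` is hypothetical. The session id and version are kept as strings because values such as 4445251981417004127 exceed the range JavaScript numbers can represent exactly.

```javascript
// Splits an "o=" line into its six fields:
// <username> <sess-id> <sess-version> <nettype> <addrtype> <unicast-address>
function parseOriginLine(line) {
  if (!line.startsWith('o=')) {
    throw new Error('not an o= line');
  }
  const fields = line.slice(2).trim().split(/\s+/);
  if (fields.length !== 6) {
    throw new Error('an o= line must have exactly 6 fields');
  }
  const [username, sessionId, sessionVersion, netType, addrType, unicastAddress] = fields;
  // sessionId and sessionVersion stay strings: they can be larger than 2^53.
  return { username, sessionId, sessionVersion, netType, addrType, unicastAddress };
}

const origin = parseOriginLine('o=- 4445251981417004127 2 IN IP4 127.0.0.1');
console.log(origin.sessionId);      // "4445251981417004127"
console.log(origin.sessionVersion); // "2"
```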
Media Section: An m= section is generated for each RtpTransceiver that has been added to the PeerConnection. For the initial offer, since no ports are available yet, the dummy port 9 can be used. However, if the section is bundle-only, the port value is set to 0. Later the port will be set to the port of the default ICE candidate.
The proto field “UDP/TLS/RTP/SAVPF” (RTP over DTLS) is followed by the list of codec payload types in order of priority.
The “c=” line in the m= section must also be filled with the dummy value IN IP4 0.0.0.0, as no candidates are available yet.
ICE
a=ice-options:trickle
Transport
“a=ice-ufrag” , “a=ice-pwd” , “a=fingerprint” , “a=setup” , “a=tls-id”
Media Stream Identification attribute “a=mid:”
For each media format on the m= line there are “a=rtpmap” and “a=fmtp” lines. Where retransmission is used, an additional “a=rtpmap” line for “rtx” carries the clock rate of the primary codec, and its “a=fmtp” line references the payload type of the primary codec. “a=rtcp-fb” lines specify the RTCP feedback mechanisms.
a=rtpmap:111 opus/48000/2
a=rtcp-fb:111 transport-cc
a=fmtp:111 minptime=10;useinbandfec=1
Audio Block example
m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:JDMg
a=ice-pwd:6OARDQ8U/orhtXZbfN+ars37
a=ice-options:trickle
a=fingerprint:sha-256 1D:C8:1F:18:D2:AB:B7:68:CC:DC:A8:8D:6B:1D:70:11:06:E9:19:D2:22:CE:A5:F3:BE:82:00:ED:99:58:20:4A
a=setup:actpass
a=mid:0
a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
a=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:5 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=extmap:6 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
a=sendrecv
a=msid:DYVK4IA4kA8LvnIYWjXhRzMgSGicnwVutWE2 7525d75c-ffe7-4038-8b71-653d249e63bb
a=rtcp-mux
a=rtpmap:111 opus/48000/2
a=rtcp-fb:111 transport-cc
a=fmtp:111 minptime=10;useinbandfec=1
a=rtpmap:103 ISAC/16000
a=rtpmap:104 ISAC/32000
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:106 CN/32000
a=rtpmap:105 CN/16000
a=rtpmap:13 CN/8000
a=rtpmap:110 telephone-event/48000
a=rtpmap:112 telephone-event/32000
a=rtpmap:113 telephone-event/16000
a=rtpmap:126 telephone-event/8000
a=ssrc:3968544080 cname:da0nYe1oYR8AvVNp
a=ssrc:3968544080 msid:DYVK4IA4kA8LvnIYWjXhRzMgSGicnwVutWE2 7525d75c-ffe7-4038-8b71-653d249e63bb
a=ssrc:3968544080 mslabel:DYVK4IA4kA8LvnIYWjXhRzMgSGicnwVutWE2
a=ssrc:3968544080 label:7525d75c-ffe7-4038-8b71-653d249e63bb
// remove video section for simplicity
A data block, with an m= section for data, is created if a data channel has been created.
The “a=sctp-port” line references the SCTP port number, set to 5000 here.
“a=max-message-size” is set to 262144 here.
Data Block example
m=application 9 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
a=ice-ufrag:JDMg
a=ice-pwd:6OARDQ8U/orhtXZbfN+ars37
a=ice-options:trickle
a=fingerprint:sha-256 1D:C8:1F:18:D2:AB:B7:68:CC:DC:A8:8D:6B:1D:70:11:06:E9:19:D2:22:CE:A5:F3:BE:82:00:ED:99:58:20:4A
a=setup:actpass
a=mid:2
a=sctp-port:5000
a=max-message-size:262144
Subsequent Offers
When createOffer is called a second (or later) time, or is called after a local description has already been installed, the processing is different because ICE candidates may already have been gathered. The fields of the “o=” line stay the same, except that the <session-version> is incremented when the new offer differs from the previous one.
Additionally, the m= sections are updated if an RtpTransceiver has been added or removed.
Each “m=” and “c=” line MUST be filled in with the port, relevant RTP profile, and address of the default candidate for the m= section.
If the m= section is not bundled into another m= section, update the “a=rtcp” line with the port and address of the RTCP candidate, and add the “a=candidate” lines, followed by “a=end-of-candidates” once gathering has completed.
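The browser performs this update itself when it builds a subsequent offer. The sketch below only illustrates the rule for the “m=” and “c=” lines; `applyDefaultCandidate` and the shape of its `candidate` argument are hypothetical, not part of any WebRTC API.

```javascript
// Rewrites the m= port and the c= address of one media section so that they
// carry the default candidate instead of the dummy values 9 and 0.0.0.0.
// `candidate` is a hypothetical { address, port } object.
function applyDefaultCandidate(sectionLines, candidate) {
  return sectionLines.map((line) => {
    if (line.startsWith('m=')) {
      const parts = line.split(' ');
      parts[1] = String(candidate.port); // second token of m= is the port
      return parts.join(' ');
    }
    if (line.startsWith('c=')) {
      const addrType = candidate.address.includes(':') ? 'IP6' : 'IP4';
      return `c=IN ${addrType} ${candidate.address}`;
    }
    return line; // all other lines are left untouched
  });
}

const updatedSection = applyDefaultCandidate(
  ['m=audio 9 UDP/TLS/RTP/SAVPF 111 103', 'c=IN IP4 0.0.0.0', 'a=mid:0'],
  { address: '54.190.54.190', port: 55375 },
);
console.log(updatedSection[0]); // "m=audio 55375 UDP/TLS/RTP/SAVPF 111 103"
console.log(updatedSection[1]); // "c=IN IP4 54.190.54.190"
```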
Local Answer created by side receiving the session/ Callee
When createAnswer is called for the first time after a remote description has been provided, the result is known as the initial answer.
Each offered m= section will have an associated RtpTransceiver
The remote side / callee can reject an m= section by setting the port in the m= line to 0. It can reject the m= section if none of the offered media formats are supported, if the RtpTransceiver is stopped, etc.
For the initial answer the dummy port value of 9 is set, as no ICE candidate is available yet. Similarly, the “c=” line must contain the dummy value “IN IP4 0.0.0.0”.
The <proto> field MUST be set to exactly match the <proto> field for the corresponding m= line in the offer.
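Two of the answer rules above, rejecting a section with port 0 and keeping the proto field identical to the offer, can be shown with plain string handling. This is a minimal sketch; `parseMLine`, `rejectMLine` and `protoMatches` are hypothetical helper names.

```javascript
// Splits "m=<media> <port> <proto> <fmt> ..." into its parts.
function parseMLine(line) {
  const [media, port, proto, ...formats] = line.slice(2).trim().split(/\s+/);
  return { media, port: Number(port), proto, formats };
}

// Builds the m= line of a rejected section: same media, proto and formats
// as the offer, but with the port set to 0.
function rejectMLine(offerMLine) {
  const { media, proto, formats } = parseMLine(offerMLine);
  return `m=${media} 0 ${proto} ${formats.join(' ')}`;
}

// Checks that the answer's proto field exactly matches the offer's.
function protoMatches(offerMLine, answerMLine) {
  return parseMLine(offerMLine).proto === parseMLine(answerMLine).proto;
}

console.log(rejectMLine('m=video 9 UDP/TLS/RTP/SAVPF 96 97'));
// "m=video 0 UDP/TLS/RTP/SAVPF 96 97"
```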
type: answer, sdp: v=0
o=- 5730481682283561642 3 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE 0 1 2
a=msid-semantic: WMS KGmQ9mTmvTaWlHTQ0B0YP36QIxOYNeB3i2nT
Audio section
m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:MgKS
a=ice-pwd:X3oTkKO/v7GVgd/CDC3e9B7c
a=ice-options:trickle
a=fingerprint:sha-256 B9:9C:8A:A9:E9:09:0C:FB:52:2A:D3:18:7B:A9:D4:EC:B3:00:77:72:27:51:EC:5F:82:BE:11:7F:C7:CF:43:43
a=setup:active
a=mid:0
a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
a=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:5 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=extmap:6 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
a=sendrecv
a=msid:KGmQ9mTmvTaWlHTQ0B0YP36QIxOYNeB3i2nT e817fe0f-1cc0-4901-9fd9-e810289cc85d
a=rtcp-mux
a=rtpmap:111 opus/48000/2
a=rtcp-fb:111 transport-cc
a=fmtp:111 minptime=10;useinbandfec=1
a=rtpmap:103 ISAC/16000
a=rtpmap:104 ISAC/32000
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:106 CN/32000
a=rtpmap:105 CN/16000
a=rtpmap:13 CN/8000
a=rtpmap:110 telephone-event/48000
a=rtpmap:112 telephone-event/32000
a=rtpmap:113 telephone-event/16000
a=rtpmap:126 telephone-event/8000
a=ssrc:3260997313 cname:FxLUKuXrLQe0r1rn
Video section removed for simplicity
Data stream
m=application 9 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
b=AS:30
a=ice-ufrag:MgKS
a=ice-pwd:X3oTkKO/v7GVgd/CDC3e9B7c
a=ice-options:trickle
a=fingerprint:sha-256 B9:9C:8A:A9:E9:09:0C:FB:52:2A:D3:18:7B:A9:D4:EC:B3:00:77:72:27:51:EC:5F:82:BE:11:7F:C7:CF:43:43
a=setup:active
a=mid:2
a=sctp-port:5000
a=max-message-size:262144
Subsequent Answers
The port value would normally be set to the port of the default ICE candidate for this m= section. For the example above,
m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
will be changed to carry the relevant port and address, such as
type: offer, sdp: v=0
o=- 6407282338169184323 3 IN IP4 54.190.54.190
s=-
t=0 0
a=group:BUNDLE 0 1 2
a=msid-semantic: WMS bSrCUCFybGovIy0FUhPTZAr9ToRmx8I09nEj
m=audio 55375 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
c=IN IP4 54.190.54.190
a=rtcp:9 IN IP4 0.0.0.0
a=candidate:2880323124 1 udp 2122260223 54.190.54.190 55375 typ host generation 0 network-id 1 network-c
Similarly, the m= lines for video and data also get ports
m=video 53877 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 102 122 127 121 125 107 108 109 124 120 123 119 114 115 116
c=IN IP4 54.190.54.190
a=rtcp:9 IN IP4 0.0.0.0
a=candidate:2880323124 1 udp 2122260223 54.190.54.190 53877 typ host generation 0 network-id 1 network-cost 10
..
m=application 57991 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 54.190.54.190
a=candidate:2880323124 1 udp 2122260223 54.190.54.190 57991 typ host generation 0 network-id 1 network-cost 10
If the answer contains any “a=ice-options” attributes where “trickle” is listed as an attribute, update the PeerConnection canTrickle property to be true.
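The browser evaluates this condition internally and exposes the result as a property. As an illustration of the check itself, the sketch below scans an SDP string for a “trickle” token in any “a=ice-options” line; `supportsTrickle` is a hypothetical helper name.

```javascript
// Returns true if any a=ice-options line in the SDP lists "trickle".
function supportsTrickle(sdp) {
  const prefix = 'a=ice-options:';
  return sdp
    .split(/\r?\n/)
    .filter((line) => line.startsWith(prefix))
    .some((line) => line.slice(prefix.length).trim().split(/\s+/).includes('trickle'));
}

const answerSdp = [
  'v=0',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111',
  'a=ice-ufrag:MgKS',
  'a=ice-options:trickle',
].join('\r\n');
console.log(supportsTrickle(answerSdp)); // true
```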
Modifying Offer/answer SDP
SDP returned from createOffer or createAnswer MUST NOT be changed before passing it to setLocalDescription. After calling setLocalDescription with an offer or answer, the application MAY modify the SDP to reduce its capabilities before sending it to the far side.
Assume we have an MCU and want the video stream to be relayed via a media server.

SDP Parsing
SDP is used for session description and contains a sequence of lines with key-value pairs. The SDP is read line by line and converted to a data structure that contains the deserialized information.
JSEP SDP bears a lot of similarity to SIP SDP, explained here: SIP and SDP Messages Explained
Session-Level Parsing
- Lines “v=”, “o=”, “b=” and “a=” are processed. The “i=”, “u=”, “e=”, “p=”, “t=”, “r=”, “z=”, and “k=” lines are not used by this specification; they MUST be checked for syntax but their values are not used. Line “c=” is checked for syntax and ICE mismatch detection.
- The “a=” attribute could be: “a=group”, “a=ice-lite”, “a=ice-pwd”, “a=ice-options”, “a=fingerprint”, “a=setup”, “a=tls-id”, “a=identity”, “a=extmap”
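A first parsing step is to separate the session-level lines from the media sections, since everything before the first “m=” line belongs to the session. The sketch below shows one way to do that; `splitSdp` and `parseAttribute` are hypothetical helper names and no syntax validation is attempted.

```javascript
// Splits an SDP blob into session-level lines and one array per m= section.
function splitSdp(sdp) {
  const lines = sdp.split(/\r?\n/).filter((line) => line.length > 0);
  const session = [];
  const media = [];
  for (const line of lines) {
    if (line.startsWith('m=')) {
      media.push([line]);                    // an m= line opens a new section
    } else if (media.length === 0) {
      session.push(line);                    // still at session level
    } else {
      media[media.length - 1].push(line);    // belongs to the current section
    }
  }
  return { session, media };
}

// Splits "a=<name>:<value>" or a bare flag "a=<name>" into name and value.
function parseAttribute(line) {
  const body = line.slice(2);
  const colon = body.indexOf(':');
  return colon === -1
    ? { name: body, value: null }
    : { name: body.slice(0, colon), value: body.slice(colon + 1) };
}

const parsedSdp = splitSdp([
  'v=0', 'o=- 4445251981417004127 2 IN IP4 127.0.0.1', 's=-', 't=0 0',
  'a=group:BUNDLE 0 1',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111', 'a=mid:0',
  'm=application 9 UDP/DTLS/SCTP webrtc-datachannel', 'a=mid:1',
].join('\r\n'));
console.log(parsedSdp.session.length, parsedSdp.media.length); // 5 2
```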
Media Section Parsing
Line “m=” carries the media type, port, proto and format list (fmt); for RTP the formats are payload type numbers.
Attributes “a=” can be:
- “a=rtpmap” or “a=fmtp”: “a=rtpmap” maps an RTP payload type number to a media encoding name that identifies the payload format; “a=fmtp” carries format-specific parameters.
a=rtpmap:<payload type> <encoding name>/<clock rate> [/<encoding parameters>]
m=audio 49230 RTP/AVP 96 97 98
a=rtpmap:96 L8/8000
a=rtpmap:97 L16/8000
a=rtpmap:98 L16/11025/2
- Packetization parameters “a=ptime” and “a=maxptime”, which define the duration of media carried in each RTP packet.
- Direction as “a=sendrecv”, “a=recvonly”, “a=sendonly”, “a=inactive”
- Muxing as “a=rtcp-mux”, “a=rtcp-mux-only”
- RTCP attributes “a=rtcp”, “a=rtcp-rsize”
- Line “c=” is checked.
- Line “b=” for bandwidth, bwtype
- Attributes for “a=” could be “a=ice-ufrag”, “a=ice-pwd”, “a=ice-options”, “a=candidate”, “a=remote-candidate”, “a=end-of-candidates” and “a=fingerprint”
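As an illustration of media-section parsing, the sketch below reads the “m=” line, the “a=rtpmap” lines and the direction attribute of one section. The helper names are hypothetical, and falling back to “sendrecv” when no direction attribute is present is an assumption based on the usual SDP default.

```javascript
// Parses "a=rtpmap:<payload type> <encoding name>/<clock rate>[/<params>]".
// Returns null for any line that is not an rtpmap attribute.
function parseRtpmap(line) {
  const match = /^a=rtpmap:(\d+) ([^/\s]+)\/(\d+)(?:\/(\S+))?$/.exec(line);
  if (!match) return null;
  return {
    payloadType: Number(match[1]),
    encodingName: match[2],
    clockRate: Number(match[3]),
    encodingParameters: match[4] || null,
  };
}

const DIRECTIONS = ['sendrecv', 'sendonly', 'recvonly', 'inactive'];

// `lines` is one media section, starting with its m= line.
function parseMediaSection(lines) {
  const [media, port, proto, ...formats] = lines[0].slice(2).trim().split(/\s+/);
  const codecs = lines.map(parseRtpmap).filter(Boolean);
  const directionLine = lines.find(
    (line) => line.startsWith('a=') && DIRECTIONS.includes(line.slice(2)),
  );
  return {
    media,
    port: Number(port),
    proto,
    formats,
    codecs,
    // Assumption: treat a missing direction attribute as sendrecv.
    direction: directionLine ? directionLine.slice(2) : 'sendrecv',
  };
}

const audioSection = parseMediaSection([
  'm=audio 49230 RTP/AVP 96 97 98',
  'a=rtpmap:96 L8/8000',
  'a=rtpmap:97 L16/8000',
  'a=rtpmap:98 L16/11025/2',
  'a=recvonly',
]);
console.log(audioSection.codecs[2]);
// { payloadType: 98, encodingName: 'L16', clockRate: 11025, encodingParameters: '2' }
```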
Interactive Connectivity Establishment (ICE) for NAT traversal
Protocols using offer/answer are difficult to operate through Network Address Translators (NATs), since they carry the IP addresses and ports of media sources and sinks within their messages. Realtime media also places emphasis on low latency and low packet loss.
ICE is an extension to the offer/answer model and works by including a multiplicity of IP addresses and ports in SDP offers and answers, which are then tested for connectivity by peer-to-peer connectivity checks.
The addresses are obtained and checked using STUN and TURN; ICE also allows for address selection for multi-homed and dual-stack hosts.
ICE allows the agents to discover enough information about their topologies to potentially find one or more paths by which they can communicate. Then it systematically tries all possible pairs (in a carefully sorted order) until it finds one or more that work.
ICE Gathering
The caller and callee perform checks to settle on the protocol and route needed to establish a peer connection. A number of candidates are proposed until the two sides mutually agree upon one. The PeerConnection then uses that candidate’s details to initiate the connection.
While applying a local description at the media engine level, if an m= section is new, the WebRTC media stack begins gathering candidates for it.
RTCPeerConnection exposes canTrickleIceCandidates. ICE trickling is the process of continuing to send candidates after the initial offer or answer has already been sent to the other peer.
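In application code, trickling usually means forwarding each locally gathered candidate over signalling as soon as it appears, and adding each received candidate to the PeerConnection. The sketch below assumes `pc` is an RTCPeerConnection; `sendToRemote` and the message shape `{ type, candidate }` are hypothetical and depend entirely on the signalling layer in use.

```javascript
// Sender: forward every gathered candidate as soon as it is available.
// An icecandidate event whose candidate is null marks the end of gathering.
function sendTrickledCandidates(pc, sendToRemote) {
  pc.onicecandidate = (event) => {
    if (event.candidate) {
      sendToRemote({ type: 'candidate', candidate: event.candidate });
    } else {
      sendToRemote({ type: 'end-of-candidates' });
    }
  };
}

// Receiver: hand each trickled candidate to the local PeerConnection.
async function receiveTrickledCandidate(pc, message) {
  if (message.type === 'candidate') {
    await pc.addIceCandidate(message.candidate);
  }
}
```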
The ICE transport role determines which side chooses the candidate pair.
The ICE layer sets one peer as the controlling agent and the other as the controlled agent. The controlling agent makes the final decision as to which candidate pair to choose.
Final selected candidate in SDP
a=group:BUNDLE 0 1 2
a=msid-semantic: WMS 9Cv3eIelHVuhxrGfxSvUsfokNu4eb4R9PYw2
m=audio 59937 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126
c=IN IP4 x.x.x.x
a=rtcp:9 IN IP4 0.0.0.0
a=candidate:2880323124 1 udp 2122260223 x.x.x.x 59937 typ host generation 0 network-id 1 network-cost 10
a=candidate:3844981444 1 tcp 1518280447 x.x.x.x 9 typ host tcptype active generation 0 network-id 1 network-cost 10
An agent identifies all its CANDIDATES, each of which is a transport address. Types:
- HOST CANDIDATE: obtained directly from a local interface, which could be Wifi, Virtual Private Network (VPN) or Mobile IP (MIP). If an agent is multihomed (private and public networks), it obtains a candidate from each IP address and includes all candidates in its offer.
- STUN or TURN is used to obtain additional candidates. Types:
  - translated addresses on the public side of a NAT (SERVER REFLEXIVE CANDIDATES)
  - addresses on TURN servers (RELAYED CANDIDATES)
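The candidate type appears after the `typ` keyword of each “a=candidate” line. The sketch below splits such a line into its fixed fields and collects the trailing name/value pairs; `parseCandidate` is a hypothetical helper and does only minimal validation.

```javascript
// Parses "candidate:<foundation> <component> <transport> <priority>
// <address> <port> typ <type> [<name> <value>]..." (with or without "a=").
function parseCandidate(line) {
  const text = line.replace(/^a=/, '').replace(/^candidate:/, '');
  const parts = text.trim().split(/\s+/);
  if (parts.length < 8 || parts[6] !== 'typ') {
    throw new Error('malformed candidate line');
  }
  const [foundation, component, transport, priority, address, port, , type, ...rest] = parts;
  // Everything after the type is a list of name/value pairs (raddr, rport ...).
  const extensions = {};
  for (let i = 0; i + 1 < rest.length; i += 2) {
    extensions[rest[i]] = rest[i + 1];
  }
  return {
    foundation,
    component: Number(component),
    transport: transport.toLowerCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    type, // "host", "srflx", "prflx" or "relay"
    extensions,
  };
}

const srflx = parseCandidate(
  'candidate:4081163164 1 udp 1686052607 106.51.26.168 37542 typ srflx ' +
  'raddr 192.168.0.2 rport 58122 generation 0 ufrag vzpn network-id 1 network-cost 10',
);
console.log(srflx.type, srflx.extensions.raddr); // srflx 192.168.0.2
```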
Mapping Server Reflexive address
Steps for mapping the server reflexive address
- The agent sends the TURN Allocate request from IP address and port X:x.
- The NAT creates a binding X1′:x1′, mapping this server reflexive candidate to the host candidate X:x (the BASE).
- Outgoing packets sent from the host candidate will be translated by the NAT to the server reflexive candidate.
- Incoming packets sent to the server reflexive candidate will be translated by the NAT to the host candidate and forwarded to the agent.

Allocate request and response from TURN, informing the agent of this relayed candidate
STUN-only binding
The agent sends a STUN Binding request to its STUN server, which observes the server reflexive address and returns it in the Binding response.
STUN Binding request for connectivity checks on CANDIDATE PAIRS
The candidates are carried in attributes in the SDP offer. The remote peer follows the same process, gathering and sending its own sorted list of candidates. CANDIDATE PAIRS are then formed from the candidates of both sides.
PEER REFLEXIVE CANDIDATES: connectivity checks can produce additional candidates, especially around symmetric NATs.
Since the same address is used for STUN and media (RTP/RTCP), demultiplexing based on packet contents identifies which one is which.
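One common way to demultiplex is to look at the first byte of each incoming packet. The sketch below uses the first-byte ranges from RFC 7983, which is an assumption on my part since the article does not name a scheme; `classifyPacket` is a hypothetical helper and does not try to tell RTP from RTCP.

```javascript
// Classifies a packet by its first byte, using the ranges of RFC 7983:
//   0..3     STUN
//   20..63   DTLS
//   128..191 RTP or RTCP
function classifyPacket(firstByte) {
  if (firstByte >= 0 && firstByte <= 3) return 'stun';
  if (firstByte >= 20 && firstByte <= 63) return 'dtls';
  if (firstByte >= 128 && firstByte <= 191) return 'rtp-or-rtcp';
  return 'unknown';
}

console.log(classifyPacket(0x00)); // "stun" (a Binding request starts with 0x00)
console.log(classifyPacket(0x16)); // "dtls" (handshake record type 22)
console.log(classifyPacket(0x80)); // "rtp-or-rtcp" (RTP version 2, no padding)
```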
Checks : ICE checks are performed in a specific sequence, so that high-priority candidate pairs are checked first.
- TRIGGERED CHECKS – accelerates the process of finding a valid candidate
- ORDINARY CHECKS – agent works through ordered prioritised check list by sending a STUN request for the next candidate pair on the list periodically.
Each candidate pair in the check list has a foundation and a state. Pairs start out Frozen; for each foundation one pair is moved to Waiting, and a successful check unfreezes the other pairs with the same foundation. States for candidate pairs:
1. Waiting: A check has not been performed for this pair, and can be performed as soon as it is the highest-priority Waiting pair on the check list.
2. In-Progress: A check has been sent for this pair, but the transaction is in progress.
3. Succeeded: A check for this pair was already done and produced a successful result.
4. Failed: A check for this pair was already done and failed, either never producing any response or producing an unrecoverable failure response.
5. Frozen: A check for this pair hasn’t been performed, and it can’t yet be performed until some other check succeeds, allowing this pair to unfreeze and move into the Waiting state.
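The order of the check list comes from a pair priority computed from the two candidate priorities. The sketch below uses the formula from RFC 5245 (section 5.7.2), where G is the controlling agent's candidate priority and D the controlled agent's: 2^32 * MIN(G,D) + 2 * MAX(G,D) + (G > D ? 1 : 0). BigInt is used because the result does not fit in a regular JavaScript number; the helper names and the pair object shape are hypothetical.

```javascript
// Pair priority per RFC 5245: 2^32*MIN(G,D) + 2*MAX(G,D) + (G>D ? 1 : 0).
function pairPriority(controllingPriority, controlledPriority) {
  const g = BigInt(controllingPriority);
  const d = BigInt(controlledPriority);
  const min = g < d ? g : d;
  const max = g > d ? g : d;
  return (1n << 32n) * min + 2n * max + (g > d ? 1n : 0n);
}

// Returns a copy of the check list sorted from highest to lowest priority.
// Each pair is a hypothetical { controlling, controlled } object holding
// the two candidate priorities.
function sortCheckList(pairs) {
  return [...pairs].sort((a, b) => {
    const pa = pairPriority(a.controlling, a.controlled);
    const pb = pairPriority(b.controlling, b.controlled);
    return pb > pa ? 1 : pb < pa ? -1 : 0;
  });
}

console.log(pairPriority(5, 3)); // 12884901899n
```

With real candidate priorities, a host/host pair sorts ahead of a host/srflx pair, which in turn sorts ahead of a host/relay pair.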

ICE gather state
icegatheringstatechange – gathering
icecandidate (host)
sdpMid: 0, sdpMLineIndex: 0, candidate: candidate:1511920713 1 udp 2122260223 192.168.0.2 58122 typ host generation 0 ufrag vzpn network-id 1 network-cost 10
icecandidate (srflx)
sdpMid: 0, sdpMLineIndex: 0, candidate: candidate:4081163164 1 udp 1686052607 106.51.26.168 37542 typ srflx raddr 192.168.0.2 rport 58122 generation 0 ufrag vzpn network-id 1 network-cost 10
icecandidate (host)
sdpMid: 0, sdpMLineIndex: 0, candidate: candidate:345893049 1 tcp 1518280447 192.168.0.2 9 typ host tcptype active generation 0 ufrag vzpn network-id 1 network-cost 10
icecandidate (relay)
sdpMid: 0, sdpMLineIndex: 0, candidate: candidate:2130406062 1 udp 41886207 74.125.39.44 27190 typ relay raddr 106.51.26.168 rport 37542 generation 0 ufrag vzpn network-id 1 network-cost 10
icecandidate (relay)
sdpMid: 0, sdpMLineIndex: 0, candidate: candidate:3052096874 1 udp 25108479 172.217.163.158 28049 typ relay raddr 106.51.26.168 rport 37543 generation 0 ufrag vzpn network-id 1 network-cost 10
icegatheringstatechange – complete
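The priority values in the log above can be taken apart with the candidate priority formula from RFC 5245 (section 4.1.2.1): priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID). The sketch below reverses that formula; `decomposePriority` is a hypothetical helper, and the type preferences it reveals are simply what this particular log shows.

```javascript
// Reverses priority = 2^24*typePref + 2^8*localPref + (256 - componentId).
function decomposePriority(priority) {
  return {
    typePreference: Math.floor(priority / 2 ** 24),
    localPreference: Math.floor(priority / 2 ** 8) % 2 ** 16,
    componentId: 256 - (priority % 2 ** 8),
  };
}

console.log(decomposePriority(2122260223));
// host:  { typePreference: 126, localPreference: 32542, componentId: 1 }
console.log(decomposePriority(1686052607));
// srflx: { typePreference: 100, localPreference: 32542, componentId: 1 }
console.log(decomposePriority(41886207));
// relay: { typePreference: 2, localPreference: 32545, componentId: 1 }
```

The type preference dominates the value, which is why host candidates outrank server reflexive ones and relayed candidates come last.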
Candidate Checking
iceconnectionstatechange : checking
setRemoteDescription L type: answer, sdp: v=0
…
m=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 110 112 113 126
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:ydvf
a=ice-pwd:mb4ousBoT6B0l//ljjD/9Z/M
a=ice-options:trickle
…
m=video 9 UDP/TLS/RTP/SAVPF 98 100 96 97 99 101 102 122 127 121 125 107 108 109 124 120 123 119 114 115 116
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:ydvf
a=ice-pwd:mb4ousBoT6B0l//ljjD/9Z/M
a=ice-options:trickle
addIceCandidate (host)
sdpMid: , sdpMLineIndex: 0, candidate: candidate:1511920713 1 udp 2122260223 192.168.0.2 56060 typ host generation 0 ufrag ydvf network-id 1 network-cost 10
iceconnectionstatechange : connected

Candidate Nomination for Media Path
Low-latency media paths can be selected using various techniques, such as actual round-trip time (RTT) measurement. The controlling agent nominates which of the valid candidate pairs will be used for media. There are two ways to do this: regular nomination and aggressive nomination.
References :
- [1] WebRTC 1.0: Real-time Communication Between Browsers – W3C Editor’s Draft 31 August 2019 http://w3c.github.io/webrtc-pc/
- [2] RFC 5245 Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols