When I first wrote this article on SIP and related VoIP technologies, I was a newbie in the VoIP domain, probably just out of college. Over the past decade, seeing the steady traffic to these articles, I have tried to keep them updated with new RFC standards and market trends.
In this updated version (2019), the main points described are:
- SIP transactions, dialog, branch
- Record Routing
- Strict routing
- Loose routing
- System components in SIP based VoIP (requests and responses)
- SIP Transport Layer
- Session Description Protocol (SDP)
- Mobility and Location Service
- Network Address Translator (NAT)
- SIP Call Flows
- Call Redirection
- Click to Dial
- SIP for Instant Messaging and Presence Leveraging Extensions (SIMPLE)
The Session Initiation Protocol (SIP) is a multimedia signalling protocol that has evolved into the de facto communication standard for IP telephony.
Even today it is the primary protocol for many real-time communication platforms that integrate with telecom carriers and provide cloud and IP based services for applications such as robo/mass calls for advertising, API based calls like OTP generators, and IVR announcements with DTMF input such as customer care centres. In fact, it would not be far from the truth to say that the converged platforms we find today are a result of SIP integrating with the IP world.
Converged platforms integrate audio, video, data, presence, instant messaging, voicemail and conference services into a single network.
- SIP is the key component for building an advanced converged IP communication platform or a rich multimedia real-time communication service.
- It can be used to create programmable APIs and complex call routing VoIP applications such as a PBX or an SBC.
- It is supported by many high quality open source and freeware SIP clients, servers, proxies and tools such as Kamailio, Asterisk, FreeSWITCH, SIPp and JAIN SIP.
- It is also supported by most standardised VoIP hardware and network vendors such as Cisco, Microsoft, Avaya and Radvision.
- It is standardised in RFC 3261.
SIP (Session Initiation Protocol) negotiates a session between two parties. It primarily exchanges headers that are used to set up a call session. Below is an example of an outgoing telephone call, shown as the dissected SIP INVITE request:
Session Initiation Protocol (INVITE)
Request-Line: INVITE sip:firstname.lastname@example.org;transport=tcp SIP/2.0
Method: INVITE
Request-URI: email@example.com;transport=tcp
Request-URI User Part: altanai
Request-URI Host Part: telecomcompany.com
[Resent Packet: False]
Message Header
Via: SIP/2.0/TCP 220.127.116.11:5080;rport;branch=z9hG4bKceX7a2H2866cN
Transport: TCP
Sent-by Address: 18.104.22.168
Sent-by port: 5080
RPort: rport
Branch: z9hG4bKceX7a2H2866cN
Max-Forwards: 41
From: "+16014801797" <sip:+firstname.lastname@example.org>;tag=7HKgjNQ6y2FSj
SIP Display info: "+16014801797"
SIP from address: sip:+email@example.com
SIP from address User Part: +16014801797
E.164 number (MSISDN): 16014801797
Country Code: Americas (1)
SIP from address Host Part: 22.214.171.124
SIP from tag: 7HKgjNQ6y2FSj
To: <sip:firstname.lastname@example.org;transport=tcp>
SIP to address: sip:email@example.com;transport=tcp
SIP to address User Part: altanai
SIP to address Host Part: telecomcompany.com
SIP To URI parameter: transport=tcp
Call-ID: e10306be-0cfd-4b38-af3c-b2ada0827cef
CSeq: 126144925 INVITE
Contact: <sip:firstname.lastname@example.org:5080;transport=tcp>
User-Agent: phone1
Allow: INVITE, ACK, BYE, CANCEL, OPTIONS, MESSAGE, INFO, UPDATE, REFER, NOTIFY
Supported: path, replaces
Allow-Events: talk, hold, conference, refer
Privacy: none
Content-Type: application/sdp
Content-Disposition: session
Content-Length: 249
SIP Display info: "+16014801797"
SIP PAI Address: sip:+email@example.com
The SIP philosophy :
- reuse Internet addressing (URLs, DNS, proxies)
- utilize rich Internet feature set
- reuse HTTP coding
- text based
- makes no assumptions about the underlying protocol: TCP, UDP, X.25, frame, ATM, etc.
- support of multicast
A SIP URI can take the form sip:firstname.lastname@example.org (RFC 2543) or sips:email@example.com (secure, using TLS over TCP, RFC 3261). Additionally, SIP URI resolution can be either
- DNS SRV based, such as firstname.lastname@example.org, where the SRV records for the domain "telecomcompany.com" locate the SIP servers, or
- FQDN (fully qualified domain name) / contact / IP address based, such as email@example.com or altanai@us-west1-prod-server, neither of which needs a DNS SRV lookup for routing.
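To make the URI structure concrete, here is a minimal sketch of splitting a SIP or SIPS URI into scheme, user, host, port and parameters. The `SipUri` type and `parse_sip_uri` function are hypothetical names chosen for illustration, the regular expression is a simplification (it does not handle IPv6 literals, escaped characters or URI headers), and the address used in the usage note is built from the user and host parts shown in the dissection above.

```python
import re
from typing import Dict, NamedTuple, Optional


class SipUri(NamedTuple):
    scheme: str            # "sip" or "sips"
    user: Optional[str]    # user part, if present
    host: str              # domain, hostname or IPv4 address
    port: Optional[int]    # explicit port, if present
    params: Dict[str, str] # URI parameters such as transport=tcp or lr


# Simplified grammar: scheme ":" [user "@"] host [":" port] *(";" param)
_URI_RE = re.compile(
    r"^(?P<scheme>sips?):"
    r"(?:(?P<user>[^@;]+)@)?"
    r"(?P<host>[^:;]+)"
    r"(?::(?P<port>\d+))?"
    r"(?P<params>(?:;[^;]+)*)$"
)


def parse_sip_uri(uri: str) -> SipUri:
    """Split a SIP/SIPS URI into its parts (illustrative, not RFC-complete)."""
    match = _URI_RE.match(uri)
    if match is None:
        raise ValueError(f"not a SIP URI: {uri!r}")
    params: Dict[str, str] = {}
    for item in match.group("params").split(";"):
        if not item:
            continue
        name, _, value = item.partition("=")
        params[name.lower()] = value  # flag parameters such as ;lr get ""
    port = int(match.group("port")) if match.group("port") else None
    return SipUri(match.group("scheme"), match.group("user"),
                  match.group("host"), port, params)
```

For example, `parse_sip_uri("sip:altanai@telecomcompany.com;transport=tcp")` yields the user `altanai`, the host `telecomcompany.com` and the parameter `transport=tcp`. Whether the host is then resolved through DNS SRV or used directly is a separate step that this sketch does not cover.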
Tags are pseudo-random values inserted in the To and From headers to uniquely identify a call leg.
Max-Forwards is a count decremented by each proxy that forwards the request. When the count reaches zero, the request is discarded and a 483 Too Many Hops response is sent. It is used for stateless loop detection.
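The Max-Forwards rule can be sketched in a few lines. This is only an illustration of the counting logic described above; `proxy_forward` and `hops_until_rejected` are hypothetical helper names, and a real proxy does much more work per hop.

```python
from typing import Tuple


def proxy_forward(max_forwards: int) -> Tuple[str, int]:
    """Decide what a proxy does with a request carrying this Max-Forwards.

    Returns ("reject", 483) when the budget is exhausted, otherwise
    ("forward", new_value) with the decremented count.
    """
    if max_forwards <= 0:
        return ("reject", 483)  # 483 Too Many Hops
    return ("forward", max_forwards - 1)


def hops_until_rejected(initial: int = 70) -> int:
    """Count how many proxies forward the request before one rejects it."""
    hops = 0
    value = initial
    while True:
        action, result = proxy_forward(value)
        if action == "reject":
            return hops
        value = result
        hops += 1
```

A request that loops between proxies is therefore dropped after a bounded number of hops, without any proxy having to remember that it has seen the request before.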
Content-Type indicates the type of the message body attachment. In this case it is application/sdp, but others could be text/plain, application/cpl+xml, etc.
Content-Length indicates the octet (byte) count of the message body.
Firewalls can sometimes block SIP packets, change TCP to UDP, or change the IP address of the packets. Record-Route can be used to ensure that the firewall proxy stays in the signalling path: clients and servers copy the Record-Route values into the Route header for all subsequent messages.
Message body is separated from SIP header fields by a blank line (CRLF).
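As a rough illustration of this framing, the sketch below splits a raw SIP message at the first blank line and checks the body against Content-Length. The function name `parse_sip_message` is hypothetical, header names are lower-cased for lookup, and folded (multi-line) headers and compact header forms are deliberately ignored.

```python
from typing import Dict, List, Tuple


def parse_sip_message(raw: bytes) -> Tuple[str, Dict[str, List[str]], bytes]:
    """Split a SIP message into start line, headers and body."""
    # The first empty line (CRLF CRLF) separates the headers from the body.
    head, separator, body = raw.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("no blank line between headers and body")

    lines = head.decode("utf-8").split("\r\n")
    start_line = lines[0]

    # A header may appear several times (for example Via), so keep a list.
    headers: Dict[str, List[str]] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.setdefault(name.strip().lower(), []).append(value.strip())

    # Content-Length counts the octets of the body only.
    declared = int(headers.get("content-length", ["0"])[0])
    if declared != len(body):
        raise ValueError(f"Content-Length {declared} != body size {len(body)}")
    return start_line, headers, body
```

Checking the declared length matters most on stream transports such as TCP, where Content-Length is what tells the receiver where one message ends and the next begins.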
A SIP transaction occurs between a UAC and a UAS and consists of one request, its provisional responses (if any) and its final response.
All transactions are independent of each other. Each transaction is uniquely identified by the branch ID on the Via header and the CSeq header.
Via: SIP/2.0/UDP <server ip>:5060;branch=z9hG4bKcb16.c47db56d6d8eb62677a0f0dc733cd73d.0 ... CSeq: 1 INVITE
For the ACK given below, tid=-d8754z-deea18278a05ce16-1---d8754z-
T 2017/06/06 06:56:03.656614 :37126 -> :5060 [AP]
ACK sip:9876543210@:5080;transport=tcp SIP/2.0.
Via: SIP/2.0/TCP :38834;branch=z9hG4bK-d8754z-deea18278a05ce16-1---d8754z-;rport.
Max-Forwards: 70.
To: :5080>;tag=fdc0b562c1d44395f53d16b622397a3f-589d.
From: >;tag=b5327b03.
Call-ID: MTllYjkyZjczMjhjM2I5OGE4MTgzZDUxODVjYmM0YzY.
CSeq: 1 ACK.
Content-Length: 0.
For the CANCEL given below, tid=-d8754z-04665556a3f8c928-1---d8754z-
T 2017/06/06 06:53:09.643301 :37126 -> :5060 [AP]
CANCEL sip:9876543210@:5080;transport=tcp SIP/2.0.
Via: SIP/2.0/TCP :38834;branch=z9hG4bK-d8754z-04665556a3f8c928-1---d8754z-;rport.
Max-Forwards: 70.
To: :5080>.
From: >;tag=c0869612.
Call-ID: NTJhMGU1ZTA1NTAyZTYzZmUzMWQ0NjQ2MjIwYTE0MmI.
CSeq: 1 CANCEL.
User-Agent: Bria 3 release 3.5.5 stamp 71243.
Content-Length: 0.
The branch parameter is a transaction identifier. Responses relating to a request can be correlated with it because they contain the same transaction identifier.
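A minimal sketch of this correlation, assuming the Via and CSeq header values have already been extracted as strings: the key is the branch parameter of the topmost Via plus the method from CSeq. `transaction_key` is a hypothetical helper; the `z9hG4bK` prefix is the magic cookie that RFC 3261 branch values start with, and the Via value in the usage note is copied from the ACK trace in this article.

```python
import re
from typing import Tuple

MAGIC_COOKIE = "z9hG4bK"  # every RFC 3261 branch value starts with this


def transaction_key(via_value: str, cseq_value: str) -> Tuple[str, str]:
    """Build a lookup key for matching a response to its request."""
    match = re.search(r";branch=([^;,\s]+)", via_value)
    if match is None:
        raise ValueError("Via header has no branch parameter")
    branch = match.group(1)
    if not branch.startswith(MAGIC_COOKIE):
        raise ValueError("branch lacks the RFC 3261 magic cookie")
    # CSeq looks like "1 INVITE": a sequence number followed by the method.
    method = cseq_value.split()[1].upper()
    return (branch, method)
```

With the ACK shown earlier, `transaction_key("SIP/2.0/TCP :38834;branch=z9hG4bK-d8754z-deea18278a05ce16-1---d8754z-;rport", "1 ACK")` returns the branch together with `ACK`. A response whose topmost Via and CSeq produce the same key belongs to that transaction.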
A dialog is the peer-to-peer relationship between two SIP endpoints, containing a sequence of transactions.
The initiator of the session that generates the establishing INVITE generates the unique Call-ID and From tag. In the response to the INVITE, the user agent answering the request will generate the To tag. The combination of the local tag (contained in the From header field), remote tag (contained in the To header field), and the Call-ID uniquely identifies the established session, known as a dialog. This dialog identifier is used by both parties to identify this call because there could be multiple calls set up between them.
A dialog is uniquely identified by the Call-ID header, the local tag and the remote tag. The dialog ID differs at the two ends, since what is local for one end is remote for the other.
Example: notice the To and From tags in the INVITE and its 200 OK. The dialog ID for the INVITE is 97576NjQ5MTBlNjVjNDQ0MzFmOTEyZGEzYWJjZjQxYjcyYzc70edc66c. The first INVITE does not bear a To tag.
INVITE sip:1234567890@ SIP/2.0
Via: SIP/2.0/UDP :59583;branch=z9hG4bK-524287-1---22728813bce01a15;rport
Max-Forwards: 70
Contact: :59583>
To: >
From: >;tag=70edc66c
Call-ID: 97576NjQ5MTBlNjVjNDQ0MzFmOTEyZGEzYWJjZjQxYjcyYzc
CSeq: 1 INVITE
Allow: OPTIONS, SUBSCRIBE, NOTIFY, INVITE, ACK, CANCEL, BYE, REFER, INFO
Content-Type: application/sdp
Supported: replaces
User-Agent: X-Lite release 5.5.0 stamp 97576
Content-Length: 210

v=0
o=- 1559804173873191 1 IN IP4
s=X-Lite release 5.5.0 stamp 97576
c=IN IP4
t=0 0
m=audio 49750 RTP/AVP 8 101
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=sendrecv
At the answering end the To and From tags swap roles, so the dialog ID is 97576NjQ5MTBlNjVjNDQ0MzFmOTEyZGEzYWJjZjQxYjcyYzcStNBKgjjXS84r70edc66c
SIP/2.0 200 OK
Via: SIP/2.0/UDP :59583;branch=z9hG4bK-524287-1---22728813bce01a15;rport=10973;received=
From: >;tag=70edc66c
To: >;tag=StNBKgjjXS84r
Call-ID: 97576NjQ5MTBlNjVjNDQ0MzFmOTEyZGEzYWJjZjQxYjcyYzc
CSeq: 1 INVITE
Contact: :5060;transport=udp>
User-Agent: FreeSWITCH-mod_sofia/1.9.0-742-8f1b7e0~64bit
Accept: application/sdp
Allow: INVITE, ACK, BYE, CANCEL, OPTIONS, MESSAGE, INFO, UPDATE, REGISTER, REFER, NOTIFY, PUBLISH, SUBSCRIBE
Supported: timer, path, replaces
Allow-Events: talk, hold, conference, presence, as-feature-event, dialog, line-seize, call-info, sla, include-session-description, presence.winfo, message-summary, refer
Session-Expires: 120;refresher=uas
Content-Type: application/sdp
Content-Disposition: session
Content-Length: 222
Remote-Party-ID: "1234567890" >;party=calling;privacy=off;screen=no

v=0
o=FreeSWITCH 1559778909 1559778910 IN IP4
s=FreeSWITCH
c=IN IP4
t=0 0
m=audio 25266 RTP/AVP 8 101
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=ptime:20
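The way the two ends derive their dialog IDs from the same three values can be sketched as follows. The function names are hypothetical; the Call-ID and tags are the ones from the INVITE and 200 OK in this example.

```python
from typing import Tuple

DialogId = Tuple[str, str, str]  # (Call-ID, local tag, remote tag)


def dialog_id_at_caller(call_id: str, from_tag: str, to_tag: str) -> DialogId:
    """The caller (UAC) generated the From tag, so that one is local."""
    return (call_id, from_tag, to_tag)


def dialog_id_at_callee(call_id: str, from_tag: str, to_tag: str) -> DialogId:
    """The callee (UAS) generated the To tag, so that one is local."""
    return (call_id, to_tag, from_tag)


CALL_ID = "97576NjQ5MTBlNjVjNDQ0MzFmOTEyZGEzYWJjZjQxYjcyYzc"
FROM_TAG = "70edc66c"       # set by the caller in the INVITE
TO_TAG = "StNBKgjjXS84r"    # added by the callee in the 200 OK

caller_view = dialog_id_at_caller(CALL_ID, FROM_TAG, TO_TAG)
callee_view = dialog_id_at_callee(CALL_ID, FROM_TAG, TO_TAG)
```

Joining `callee_view` into one string gives the ID quoted above for the answering end, while `caller_view` has the two tags in the opposite order. Both tuples describe the same dialog.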
All requests sent within a dialog are by default sent directly from one user agent to the other. Only requests outside a dialog traverse SIP proxies. This approach makes a SIP network more scalable because only a small number of SIP messages hit the proxies.
However, some deployments need the proxies to stay in the path of in-dialog requests, for example for accounting at call termination or when NAT traversal is being carried out. For these cases a proxy inserts a Record-Route header field containing its own address into the SIP message. Messages sent within the dialog will then traverse all SIP proxies that inserted a Record-Route header field.
The server copies the Record-Route header field unchanged into the response (Record-Route is only relevant for 2xx responses), i.e. the receiving endpoint mirrors the list of proxies back in its response.
Strict routing: the Request-URI is rewritten at every hop, i.e. it always contains the URI of the next hop, so the original Request-URI must be saved as the last Route header field. Defined in RFC 2543.
Loose routing: the Request-URI is no longer overwritten; it always contains the URI of the destination user agent, thereby keeping the target separate from the route. A loose router is marked with the ;lr parameter. If a message contains any Route header fields, it is sent to the URI in the topmost Route header field. Defined in RFC 3261.
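The difference between the two routing styles is easiest to see side by side. The sketch below is a simplified model of how a sender builds an in-dialog request from the remote target and the route set; the function names, the dictionary layout and the `example.com` proxy addresses are assumptions made for illustration.

```python
from typing import Dict, List, Union

Routed = Dict[str, Union[str, List[str]]]


def build_loose(remote_target: str, route_set: List[str]) -> Routed:
    """Loose routing: the Request-URI stays the destination user agent."""
    send_to = route_set[0] if route_set else remote_target
    return {
        "request_uri": remote_target,  # never overwritten
        "route": list(route_set),      # proxies still to be visited
        "send_to": send_to,            # topmost Route wins
    }


def build_strict(remote_target: str, route_set: List[str]) -> Routed:
    """Strict routing: the Request-URI is always the next hop."""
    if not route_set:
        return {"request_uri": remote_target, "route": [],
                "send_to": remote_target}
    first, rest = route_set[0], list(route_set[1:])
    return {
        "request_uri": first,             # rewritten to the next hop
        "route": rest + [remote_target],  # real target saved as last Route
        "send_to": first,
    }
```

With a remote target `sip:bob@192.0.2.4` and the route set `["sip:p1.example.com;lr", "sip:p2.example.com;lr"]`, both variants send the request to the first proxy, but only the loose variant still shows the real destination in the Request-URI.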
Components of SIP based VoIP Solution
SIP Request methods :
- INVITE: Initiates negotiation to establish a session (dialog). Usually contains an SDP payload. Another INVITE sent during an existing session (dialog) is called a re-INVITE. A re-INVITE can be used to
  - hold / resume a call
  - change session parameters and codecs in the middle of a call
- ACK: Acknowledges the final response to an INVITE request, completing the three-way handshake. If the INVITE did not carry a session description, the ACK must carry one.
- BYE: Ends a session (dialog).
- CANCEL: Cancels a session (dialog) before it is established.
- REGISTER: Registers a user location (host name, IP) on a registrar SIP server.
- OPTIONS: Communicates information about the capabilities of the calling and receiving SIP phones (methods, extensions, codecs, etc.).
- PRACK: Provisional acknowledgement for a provisional response such as 183 (Session Progress). PRACK applies only to 101-199 responses.
- SUBSCRIBE: Subscribes to notifications from the notifier. An Expires value of 0 unsubscribes.
- NOTIFY: Notifies the subscriber of a new event.
- PUBLISH: Publishes an event to the server.
- INFO: Sends mid-session information.
- REFER: Asks the recipient to issue a request, typically a call transfer.
- MESSAGE: Transports instant messages.
- UPDATE: Modifies the state of a session (dialog).
Some SIP responses :
1xx = Informational SIP Responses
183 Session Progress
2xx = Success Responses
200 OK – Shows that the request was successful
3xx = Redirection Responses
4xx = Request Failures
404 Not Found
405 Method Not Allowed
407 Proxy Authentication Required
408 Request Timeout
480 Temporarily Unavailable
481 Call/Transaction Does Not Exist
482 Loop Detected
483 Too Many Hops
486 Busy Here
487 Request Terminated
488 Not Acceptable Here
5xx = Server Errors
500 Server Internal Error
503 Service Unavailable
6xx = Global Failures
600 Busy Everywhere
604 Does Not Exist Anywhere
606 Not Acceptable
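Since the first digit decides how a response is handled, a small helper is enough to classify any status code. This is an illustrative sketch; the names `response_class` and `is_final` are hypothetical.

```python
RESPONSE_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Request Failure",
    5: "Server Error",
    6: "Global Failure",
}


def response_class(status: int) -> str:
    """Map a SIP status code to its class using the first digit."""
    if not 100 <= status <= 699:
        raise ValueError(f"not a SIP status code: {status}")
    return RESPONSE_CLASSES[status // 100]


def is_final(status: int) -> bool:
    """1xx responses are provisional; everything from 200 up ends the transaction."""
    return status >= 200
```

For instance, `response_class(183)` gives `Informational` and `response_class(486)` gives `Request Failure`. A client that receives an unknown code can fall back on the class in the same way.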
SIP call flow diagram for call setup and termination, using RTP for media and RTCP for control. Read about SIP messages in depth here
SIP Transport Layers
We know the ISO OSI layers, which serve as a standard model for data communications.
- Physical Layer: Ethernet, USB, IEEE 802.11 WiFi, Bluetooth, BLE
- Data Link Layer: ARP (Address Resolution Protocol), PPP (Point-to-Point Protocol), MAC (Media Access Control), ATM, Frame Relay
- Network Layer: IP (IPv4 / IPv6), ICMP, IPsec
- Transport Layer: TCP, UDP, SCTP
- Session Layer: PPTP (Point-to-Point Tunnelling Protocol), NFS, SOCKS
- Presentation Layer: codecs and encodings such as JPEG, GIF, SSL
- Application Layer: application level protocols used by, for example, a call manager or softphone, such as HTTP, FTP, DNS, SIP, RTSP, RTP
SDP ( Session Description Protocol)
SIP can carry many kinds of MIME attachments; one of them is SDP. SDP uses RTP/AVP profiles for common media types. It is specified by RFC 4566, and its offer/answer usage with SIP by RFC 3264. It defines media information and capabilities such as codecs and termination points.
It contains the connection headers used for establishing the session. Sample SDP payload (dissected) for a SIP INVITE like the one above:
Session Description Protocol Version (v): 0
Owner/Creator, Session Id (o): FreeSWITCH 1532932581 1532932582 IN IP4 126.96.36.199
Owner Username: FreeSWITCH
Session ID: 1532932581
Session Version: 1532932582
Owner Network Type: IN
Owner Address Type: IP4
Owner Address: 188.8.131.52
Session Name (s): FreeSWITCH
Connection Information (c): IN IP4 184.108.40.206
Connection Network Type: IN
Connection Address Type: IP4
Connection Address: 220.127.116.11
Time Description, active time (t): 0 0
Session Start Time: 0
Session Stop Time: 0
Media Description, name and address (m): audio 29398 RTP/AVP 0 101
Media Type: audio
Media Port: 29398
Media Protocol: RTP/AVP
Media Format: ITU-T G.711 PCMU
Media Format: DynamicRTP-Type-101
Media Attribute (a): rtpmap:0 PCMU/8000
Media Attribute Fieldname: rtpmap
Media Format: 0
MIME Type: PCMU
Sample Rate: 8000
Media Attribute (a): rtpmap:101 telephone-event/8000
Media Attribute Fieldname: rtpmap
Media Format: 101
MIME Type: telephone-event
Sample Rate: 8000
Media Attribute (a): fmtp:101 0-16
Media Attribute Fieldname: fmtp
Media Format: 101 [telephone-event]
Media format specific parameters: 0-16
Media Attribute (a): silenceSupp:off - - - -
Media Attribute Fieldname: silenceSupp
Media Attribute Value: off - - - -
Media Attribute (a): ptime:20
Media Attribute Fieldname: ptime
Media Attribute Value: 20
v=0 is the protocol version and marks the start of the SDP content.
o=FreeSWITCH 1532932581 1532932582 IN IP4 18.104.22.168 is the session origin: the owner's username, session ID, session version and address.
c=IN IP4 22.214.171.124 is the connection information; it specifies the IP address of the session.
m= is the media description: media type audio, port 29398, RTP/AVP profile, payload types 0 and 101.
The rtpmap attributes map payload type 0 to the codec PCMU with a sampling rate of 8000 Hz, and payload type 101 to telephone-event.
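The fields discussed above can be pulled out of a raw SDP body with a few lines of code. This is a minimal sketch rather than a complete SDP parser: `parse_sdp` is a hypothetical name, only the v=, c=, m= and a=rtpmap lines are interpreted, and the sample uses the documentation address 192.0.2.10 as a stand-in for a real media address.

```python
from typing import Any, Dict


def parse_sdp(sdp: str) -> Dict[str, Any]:
    """Extract version, connection address and media sections from SDP."""
    session: Dict[str, Any] = {"media": []}
    current = None  # the media section currently being filled, if any
    for line in sdp.splitlines():
        if len(line) < 2 or line[1] != "=":
            continue  # every SDP line has the form <type>=<value>
        key, value = line[0], line[2:].strip()
        if key == "v":
            session["version"] = int(value)
        elif key == "c" and current is None:
            # c=<nettype> <addrtype> <address>, session level
            session["connection"] = value.split()[-1]
        elif key == "m":
            # m=<media> <port> <proto> <payload types...>
            media, port, proto, *formats = value.split()
            current = {"type": media, "port": int(port), "proto": proto,
                       "formats": formats, "rtpmap": {}}
            session["media"].append(current)
        elif key == "a" and value.startswith("rtpmap:") and current is not None:
            # a=rtpmap:<payload type> <encoding>/<clock rate>
            payload, encoding = value[len("rtpmap:"):].split(None, 1)
            current["rtpmap"][payload] = encoding
    return session


SAMPLE_SDP = "\r\n".join([
    "v=0",
    "o=FreeSWITCH 1532932581 1532932582 IN IP4 192.0.2.10",
    "s=FreeSWITCH",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 29398 RTP/AVP 0 101",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:101 telephone-event/8000",
    "a=fmtp:101 0-16",
    "a=ptime:20",
]) + "\r\n"
```

Calling `parse_sdp(SAMPLE_SDP)` gives one audio section on port 29398 whose payload types 0 and 101 map to PCMU/8000 and telephone-event/8000, matching the dissection above.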
Authentication, security, confidentiality and integrity form the basic requirements of any communication system. To protect against account hijacking and denial of service attacks, SIP uses the HTTP digest authentication mechanism with nonces and challenges, along with the 407 Proxy Authentication Required and 401 Unauthorized responses. The sender has to resend the request with an MD5 hash computed over the nonce and the password (the password is never sent in the clear). This protects the password from eavesdroppers and, through the nonce, limits replay attacks.
Challenge / Response Scheme:
- The client sends REGISTER and receives a 407 (or 401) challenge with a nonce
- The client sends REGISTER again with the MD5 hash (password + nonce) and gets a 200 OK
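The hash in the second step is more than a plain MD5 of password and nonce. Below is a sketch of the basic HTTP digest computation as defined in RFC 2617, in its simplest form without the qop, nc and cnonce extensions. The function name `digest_response` is hypothetical, and the credentials, realm and nonce in the usage note are made-up illustration values.

```python
import hashlib


def _md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str) -> str:
    """Compute the 'response' value for a Digest Authorization header.

    Basic scheme without qop:
      HA1 = MD5(username:realm:password)
      HA2 = MD5(method:uri)
      response = MD5(HA1:nonce:HA2)
    """
    ha1 = _md5_hex(f"{username}:{realm}:{password}")
    ha2 = _md5_hex(f"{method}:{uri}")
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")
```

A client challenged with realm `telecomcompany.com` and nonce `abc123` would call `digest_response("altanai", "telecomcompany.com", "secret", "REGISTER", "sip:telecomcompany.com", "abc123")` and place the result in its retried REGISTER. The server repeats the same computation with its stored credentials and compares the two values, so the password itself never travels over the network.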
To prevent spoofing, i.e. impersonating a server, SIP provides server authentication too. This is required by ITSPs (Internet telephony service providers).
End-to-end encryption is achieved through TLS and SRTP. More on SIP security here.
Mobility and Location Service
To provide session mobility, SIP endpoints send REGISTER requests to their respective registrars as they move, updating their location.
As a user changes terminals, they register with the appropriate server.
The location server tracks the location of the user.
Redirect servers prioritise the possible locations of the user.
Users keep the same services as at their home server while mobile.
The call is processed by the home servers using Record-Route.
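A registrar backed by a location service can be modelled as a table from an address-of-record to its currently valid contacts. The sketch below is an in-memory illustration only; the class `LocationService`, its method names and the sample addresses in the usage note are assumptions, and a real registrar would also authenticate the REGISTER and persist its bindings.

```python
import time
from typing import Dict, List, Optional


class LocationService:
    """Tracks where each user (address-of-record) can currently be reached."""

    def __init__(self) -> None:
        # address-of-record -> {contact URI: absolute expiry timestamp}
        self._bindings: Dict[str, Dict[str, float]] = {}

    def register(self, aor: str, contact: str, expires: int,
                 now: Optional[float] = None) -> None:
        """Handle a REGISTER: add, refresh or (expires == 0) remove a binding."""
        now = time.time() if now is None else now
        contacts = self._bindings.setdefault(aor, {})
        if expires == 0:
            contacts.pop(contact, None)
        else:
            contacts[contact] = now + expires

    def lookup(self, aor: str, now: Optional[float] = None) -> List[str]:
        """Return the contacts that have not expired yet."""
        now = time.time() if now is None else now
        return [contact for contact, expiry in self._bindings.get(aor, {}).items()
                if expiry > now]
```

When a user moves from an office phone to a mobile, the new device registers `sip:altanai@198.51.100.7` for the same address-of-record and the old binding simply expires or is removed with an expiry of 0, so the home proxy always looks up a current location.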
NAT ( Network Address Translator)
A Network Address Translator, defined by RFC 3022, conserves public address space, since most packets are exchanged inside the private network itself.
Most internet users, whether on WiFi, 3G/LTE, a home AP or any other telecom data network run by a TSP or ISP, are assigned a private IP address that is unreachable from the outside world. The private address blocks are reserved by the Internet Assigned Numbers Authority (IANA): 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16.
Therefore, when they access the Internet, this address is converted into a globally unique public IP address by a NAT for external communication.
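Whether an address falls into one of these private blocks can be checked with Python's standard `ipaddress` module. The helper name `is_rfc1918` is hypothetical; the three networks are the ones listed above.

```python
import ipaddress

# The three private IPv4 blocks listed above.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address lies inside one of the private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)
```

A SIP server can use such a check to detect that the address a client advertises, for example in its Contact header or SDP, is not reachable from the public internet and that NAT handling is needed.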
SIP Issues around NAT
NATs modify IP addresses (Layer 3), but SIP and SDP are Layer 7 protocols, so their contents pass through a plain NAT unchanged.
SIP Via:, From: and Contact: headers carry non-routable private addresses.
SDP states that the originator wishes to receive media at a non-routable private address.
If a destination on the public internet tries to send SIP or RTP traffic to those private addresses, the traffic will be dropped by the first router.
Solutions are to use an Application Level Gateway (ALG), STUN, or Universal Plug and Play (UPnP).
The goal is to rewrite all SIP/SDP source addresses:
- SIP Via:, From: and Contact: headers use the public NAT address
- SDP addresses use the public NAT address
- Use SIP over TCP
Use draft-ietf-sip-symmetric-response-00 (later published as RFC 3581) and "symmetric" SIP/RTP:
Use the same UDP port number for incoming and outgoing traffic.
Hold ports open for the call duration.
Send a UDP packet typically every 30 seconds.
SIP over UDP uses a 30 second re-INVITE, REGISTER or OPTIONS.
RTP sends at a much higher frequency by default.
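Symmetric response routing, as standardised in RFC 3581, works by having the server record where a request really came from. The sketch below shows that step on a Via header value: if the client included an empty `rport` parameter, the server fills in the observed source port and adds `received` with the observed source address. `stamp_via` is a hypothetical helper, and the addresses in the usage note are documentation placeholders.

```python
import re

# Matches an rport parameter that has no value yet.
_EMPTY_RPORT = re.compile(r";rport(?=;|$)")


def stamp_via(via_value: str, source_ip: str, source_port: int) -> str:
    """Fill in rport and received on the topmost Via of a received request."""
    if _EMPTY_RPORT.search(via_value) is None:
        return via_value  # the client did not ask for symmetric responses
    stamped = _EMPTY_RPORT.sub(f";rport={source_port}", via_value, count=1)
    return f"{stamped};received={source_ip}"
```

For a client behind a NAT sending from `192.168.1.5:59583`, `stamp_via("SIP/2.0/UDP 192.168.1.5:59583;branch=z9hG4bK-524287-1---22728813bce01a15;rport", "203.0.113.7", 10973)` produces a Via ending in `;rport=10973;received=203.0.113.7`. The response is then sent back to that public address and port, which is the pattern visible in the Via of the 200 OK earlier in this article.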
NAPT ( Network Address Port Translator )
- Can map multiple private IP addresses and ports to one public IP address and ports
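The mapping a NAPT maintains can be pictured as a two-way table between private endpoints and ports on the single public address. This is a toy model for illustration; the class `Napt`, its method names, the starting port 40000 and the sample addresses are assumptions, and real devices also expire idle mappings, which is why the keep-alive traffic described above is needed.

```python
import itertools
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class Napt:
    """Maps many private (ip, port) pairs onto ports of one public IP."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self._outbound: Dict[Endpoint, int] = {}  # private endpoint -> public port
        self._inbound: Dict[int, Endpoint] = {}   # public port -> private endpoint

    def translate_outbound(self, private_ip: str, private_port: int) -> Endpoint:
        """Return the public endpoint used for a packet leaving the network."""
        key = (private_ip, private_port)
        if key not in self._outbound:
            port = next(self._ports)
            self._outbound[key] = port
            self._inbound[port] = key
        return (self.public_ip, self._outbound[key])

    def translate_inbound(self, public_port: int) -> Optional[Endpoint]:
        """Return the private endpoint for an incoming packet, if mapped."""
        return self._inbound.get(public_port)
```

Two phones at `192.168.1.5:5060` and `192.168.1.6:5060` end up as two different ports on the same public address. A packet arriving on a port with no mapping is dropped, which is what happens to media sent to an address that was never opened from the inside.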
Location Server – used by the proxy server and redirect server to obtain the location of the called user (one or more addresses)
Registration Server (Registrar) – accepts registration requests from client applications. Generally, the service is offered by the proxy server or redirect server
DNS Server – used to locate the proxy server or redirect server
The call INVITE is sent, but since the redirect server responds with 302 Moved Temporarily, a new destination address is returned. The INVITE is then forwarded to another proxy server, which connects the SIP endpoints after consulting the redirect server.
At this stage we see the call getting connected to the SIP endpoint via two proxy servers. The redirect server does not stay in the path once the initial SIP request has been sent.
After communication, the endpoints send BYE to terminate the session.
This call flow deals with the use case where a user may be registered from multiple SIP phones (perhaps one at home, one in the car and one at the office desk) and wants all registered phones to ring, i.e. the call is forked to multiple endpoints.
In the above diagram we can see a forked INVITE going to both SIP phones. Both of them reply with 100 Trying and 180 Ringing, but only one gets answered by the user.
After one endpoint sends 200 OK and connects the session, the other receives a CANCEL from the SIP server.
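The decision a forking proxy makes can be sketched as follows: walk through the responses in arrival order, and as soon as one branch answers with a 2xx, cancel every branch that has not yet given a final response. The function `resolve_fork` and the contact labels in the usage note are hypothetical; a real proxy tracks one client transaction per branch rather than a simple list.

```python
from typing import Iterable, List, Optional, Tuple


def resolve_fork(contacts: Iterable[str],
                 responses: Iterable[Tuple[str, int]]
                 ) -> Tuple[Optional[str], List[str]]:
    """Return (winning contact, contacts that must receive CANCEL).

    `responses` is a sequence of (contact, status code) in arrival order.
    """
    pending = set(contacts)
    for contact, status in responses:
        if status < 200:
            continue                 # 100 Trying, 180 Ringing: still pending
        pending.discard(contact)     # this branch has a final response
        if 200 <= status < 300:
            return contact, sorted(pending)
    return None, []                  # nobody answered, nothing to cancel yet
```

With the contacts `home`, `car` and `office`, where the office phone replies 486 Busy Here and the car phone answers with 200 OK, the result is `("car", ["home"])`: only the phone that is still ringing gets a CANCEL.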
Click to Dial
A web or desktop application that speaks HTTP can fire an API call, which is interpreted by the controller or SIP server, which then places the call.
The API can carry parameters for the to and from SIP addresses, as well as any authentication token required for API authentication and validation.
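What the controller does with such an API call might look roughly like this. Everything here is hypothetical and for illustration only: the parameter names `from`, `to` and `token`, the `VALID_TOKENS` set, the status codes returned and the `originate` action are assumptions, not the interface of any particular product.

```python
from typing import Any, Dict

# Hypothetical token store; a real service would verify tokens properly.
VALID_TOKENS = {"demo-token"}


def _is_sip_uri(value: Any) -> bool:
    return isinstance(value, str) and value.startswith(("sip:", "sips:"))


def click_to_dial(params: Dict[str, Any]) -> Dict[str, Any]:
    """Validate a click-to-dial API request and describe the call to place."""
    if params.get("token") not in VALID_TOKENS:
        return {"status": 401, "error": "invalid token"}
    caller, callee = params.get("from"), params.get("to")
    if not (_is_sip_uri(caller) and _is_sip_uri(callee)):
        return {"status": 400, "error": "from/to must be SIP URIs"}
    # The SIP server would now call both parties and bridge the two legs.
    return {"status": 202, "action": "originate", "from": caller, "to": callee}
```

A web page button could then post `{"from": "sip:agent@telecomcompany.com", "to": "sip:altanai@telecomcompany.com", "token": "demo-token"}` and receive a 202 status once the request has been accepted for dialling.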
SIP for Instant Messaging and Presence Leveraging Extensions (SIMPLE)
- Several vendors intend to implement SIMPLE
- Provides for presence and buddy lists
- Instant messaging in the enterprise
- Telephony-enabled user lists
Using SIP based call routing algorithms and flows, one can build a carrier grade communication solution. SIP solutions can hook up with existing telecom networks and service providers to remain backward compatible. SIP also has largely untapped potential to integrate with any external IP application or service to provide converged, customised control over both the signalling and media planes.