Update :
At the time of writing this article on SIP and related VoIP technologies I was a newbie in the VoIP domain, probably just out of college. However, over the past decade, seeing the steady traffic to these articles, I have tried to keep them updated with new RFC standards and market trends.
SIP (Session Initiation Protocol) negotiates a session between 2 parties. It primarily exchanges headers that are used for setting up a call session, as in the following example of an outgoing telephone call taken from a SIP INVITE. It is a L
Session Initiation Protocol (INVITE)
Request-Line: INVITE sip:altanai@telecomcompany.com;transport=tcp SIP/2.0
Method: INVITE
Request-URI: altanai@telecomcompany.com;transport=tcp
Request-URI User Part: altanai
Request-URI Host Part: telecomcompany.com
[Resent Packet: False]
Message Header
Via: SIP/2.0/TCP 1.2.3.4:5080;rport;branch=z9hG4bKceX7a2H2866cN
Transport: TCP
Sent-by Address: 1.2.3.4
Sent-by port: 5080
RPort: rport
Branch: z9hG4bKceX7a2H2866cN
Max-Forwards: 41
From: "+16014801797" <sip:+16014801797@1.2.3.4>;tag=7HKgjNQ6y2FSj
SIP Display info: "+16014801797"
SIP from address: sip:+16014801797@1.2.3.4
SIP from address User Part: +16014801797
E.164 number (MSISDN): 16014801797
Country Code: Americas (1)
SIP from address Host Part: 1.2.3.4
SIP from tag: 7HKgjNQ6y2FSj
To: <sip:altanai@telecomcompany.com;transport=tcp>
SIP to address: sip:altanai@telecomcompany.com;transport=tcp
SIP to address User Part: altanai
SIP to address Host Part: telecomcompany.com
SIP To URI parameter: transport=tcp
Call-ID: e10306be-0cfd-4b38-af3c-b2ada0827cef
CSeq: 126144925 INVITE
Contact: <sip:mod_sofia@1.2.3.4:5080;transport=tcp>
User-Agent: phone1
Allow: INVITE, ACK, BYE, CANCEL, OPTIONS, MESSAGE, INFO, UPDATE, REFER, NOTIFY
Supported: path, replaces
Allow-Events: talk, hold, conference, refer
Privacy: none
Content-Type: application/sdp
Content-Disposition: session
Content-Length: 249
SIP Display info: "+16014801797"
SIP PAI Address: sip:+16014801797@1.2.3.4
The SIP philosophy :
- reuse Internet addressing (URLs, DNS, proxies)
- utilize rich Internet feature set
- reuse HTTP coding
- text based
- makes no assumptions about underlying protocol: TCP, UDP, X.25, frame, ATM, etc
- support of multicast
A SIP URI can be either in the format sip:altanai@telecomcompany.com (RFC 2543) or sips:altanai@telecomcompany.com (secure, with TLS over TCP, RFC 3261). Additionally, SIP URI resolution can be either
- DNS SRV based, such as altanai@telecomcompany.com with SIP server locating records for the domain “telecomcompany.com”, or
- FQDN (fully qualified domain name) / contact / IP address based, such as altanai@2.2.2.2 or altanai@us-west1-prod-server. Neither of these needs an SRV lookup for routing.
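The parts of a SIP URI described above can be pulled apart with a few string operations. This is only a minimal sketch: the function name `parse_sip_uri` and the returned dictionary layout are assumptions made for illustration, and it ignores IPv6 literals, passwords and URI headers after `?`.

```python
def parse_sip_uri(uri: str) -> dict:
    """Split a sip:/sips: URI into scheme, user, host, port and parameters."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() not in ("sip", "sips"):
        raise ValueError(f"not a SIP URI: {uri!r}")
    # Drop any '?headers' part, then split off ';param=value' URI parameters.
    hostpart, *raw_params = rest.split("?", 1)[0].split(";")
    user, at, hostport = hostpart.rpartition("@")
    host, colon, port = hostport.partition(":")
    # 'name=value' becomes (name, value); a bare 'name' becomes (name, "").
    params = dict(p.partition("=")[::2] for p in raw_params)
    return {
        "scheme": scheme.lower(),
        "secure": scheme.lower() == "sips",
        "user": user if at else None,
        "host": host,
        "port": int(port) if colon else None,
        "params": params,
    }


if __name__ == "__main__":
    print(parse_sip_uri("sip:altanai@telecomcompany.com;transport=tcp"))
    print(parse_sip_uri("sip:mod_sofia@1.2.3.4:5080;transport=tcp"))
```

Whether the host part then goes through a DNS SRV lookup or is used directly is a routing decision made after this parsing step.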
Tags are pseudo-random numbers inserted in the To or From headers to uniquely identify a call leg.
Max-Forwards is a count decremented by each proxy that forwards the request. When the count reaches zero, the request is discarded and a 483 Too Many Hops response is sent. It is used for stateless loop detection.
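The Max-Forwards rule can be sketched as a tiny function that a proxy would apply before forwarding. The function name and the return convention are hypothetical; a real proxy would operate on the parsed request rather than on a bare integer.

```python
def forward_max_forwards(value: int):
    """Apply the Max-Forwards rule at one proxy hop.

    Returns (new_value, None) when the request may be forwarded, or
    (None, "483 Too Many Hops") when the count is already exhausted.
    """
    if value <= 0:
        return None, "483 Too Many Hops"
    return value - 1, None


if __name__ == "__main__":
    hops_left = 2  # deliberately small so the rejection is visible
    for proxy in ("proxy-a", "proxy-b", "proxy-c"):  # hypothetical proxy names
        hops_left, error = forward_max_forwards(hops_left)
        if error:
            print(proxy, "rejects with", error)
            break
        print(proxy, "forwards with Max-Forwards:", hops_left)
```

Because the check needs no per-call state, a looping request is eventually stopped even by stateless proxies.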
Content-Type indicates the type of the message body attachment. In this case it is application/sdp, but others could be text/plain, application/cpl+xml, etc.
Content-Length indicates the octet (byte) count of the message body.
Firewalls can sometimes block SIP packets, change TCP to UDP or change the IP address of the packets. The Record-Route header can be used to ensure that the firewall proxy stays in the path: clients and servers copy the Record-Route entries and put them in the Route header of all subsequent messages.
Message body is separated from SIP header fields by a blank line (CRLF).
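The header/body split and the Content-Length check can be illustrated with a short parser. This is a sketch under simplifying assumptions: the function name `parse_sip_message` is made up for this example, header names are lower-cased, and folded (multi-line) headers and compact header forms are not handled.

```python
def parse_sip_message(raw: bytes):
    """Split a SIP message into start line, headers and body."""
    # The first blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # A header may appear several times (e.g. Via), so keep a list.
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    declared = int(headers.get("content-length", ["0"])[0])
    if len(body) < declared:
        raise ValueError("body shorter than Content-Length")
    return start_line, headers, body[:declared]


if __name__ == "__main__":
    body = b"v=0\r\n"
    raw = (
        b"INVITE sip:altanai@telecomcompany.com;transport=tcp SIP/2.0\r\n"
        b"Max-Forwards: 41\r\n"
        b"Content-Type: application/sdp\r\n"
        b"Content-Length: " + str(len(body)).encode() + b"\r\n"
        b"\r\n" + body
    )
    start, headers, payload = parse_sip_message(raw)
    print(start)
    print(headers["content-type"], len(payload))
```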

SIP transaction
A SIP transaction occurs between a UAC (User Agent Client) and a UAS (User Agent Server). The SIP transaction comprises all messages from the first request sent from the UAC to the UAS up to a final response (non-1xx) sent from the UAS to the UAC.
Branch
The branch parameter is a transaction identifier. Responses relating to a request can be correlated with it because they contain the same transaction identifier.
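As a rough illustration of branch-based correlation, the sketch below generates a branch value and matches a response to a request. The helper names are hypothetical. The `z9hG4bK` prefix is the magic cookie that RFC 3261 requires at the start of every branch, and the CSeq method is compared as well because a CANCEL reuses the branch of the request it cancels.

```python
import secrets

MAGIC_COOKIE = "z9hG4bK"  # mandatory branch prefix in RFC 3261


def new_branch() -> str:
    """Create a fresh, unpredictable branch value for a new transaction."""
    return MAGIC_COOKIE + secrets.token_hex(8)


def via_branch(via_value: str):
    """Extract the branch parameter from a Via header value, or None."""
    for param in via_value.split(";")[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "branch":
            return value.strip()
    return None


def same_transaction(request_via, response_via, request_method, response_cseq_method):
    """A response belongs to a request if the branch and the CSeq method match."""
    branch = via_branch(request_via)
    return (
        branch is not None
        and branch == via_branch(response_via)
        and request_method == response_cseq_method
    )


if __name__ == "__main__":
    via = "SIP/2.0/TCP 1.2.3.4:5080;rport;branch=z9hG4bKceX7a2H2866cN"
    print(via_branch(via))
    print(same_transaction(via, via, "INVITE", "INVITE"))
    print(new_branch())
```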
Dialog
The initiator of the session that generates the establishing INVITE generates the unique Call-ID and From tag. In the response to the INVITE, the user agent answering the request will generate the To tag. The combination of the local tag (contained in the From header field), remote tag (contained in the To header field), and the Call-ID uniquely identifies the established session, known as a dialog. This dialog identifier is used by both parties to identify this call because there could be multiple calls set up between them.
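A dialog identifier can be modelled as a simple three-part key. The sketch below is illustrative only: the class and function names are assumptions, and the To tag in the usage example is invented because the INVITE shown earlier does not carry one yet. Note that local and remote swap sides depending on which party builds the key.

```python
from typing import NamedTuple


class DialogId(NamedTuple):
    call_id: str
    local_tag: str
    remote_tag: str


def dialog_id_for_caller(call_id: str, from_tag: str, to_tag: str) -> DialogId:
    """The caller generated the From tag, so that is its local tag."""
    return DialogId(call_id, local_tag=from_tag, remote_tag=to_tag)


def dialog_id_for_callee(call_id: str, from_tag: str, to_tag: str) -> DialogId:
    """The callee generated the To tag, so that is its local tag."""
    return DialogId(call_id, local_tag=to_tag, remote_tag=from_tag)


if __name__ == "__main__":
    call_id = "e10306be-0cfd-4b38-af3c-b2ada0827cef"
    from_tag = "7HKgjNQ6y2FSj"
    to_tag = "example-to-tag"  # hypothetical, added by the answering user agent
    active_dialogs = {dialog_id_for_caller(call_id, from_tag, to_tag): "call-1"}
    print(active_dialogs)
```

Using the tuple as a dictionary key lets either side keep several simultaneous calls with the same peer apart.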
Components of SIP Solution

SIP Request methods :
- INVITE : Initiates negotiation to establish a session (dialog). Usually contains an SDP payload. Another INVITE during an existing session (dialog) is called a re-INVITE. A re-INVITE can be used to
  - hold / resume a call
  - change session parameters and codecs in the middle of a call
- ACK : Acknowledges the final response to an INVITE request, completing the 3 way handshake. If the INVITE did not contain a media description, then the ACK must contain it.
- BYE : Ends a session (dialog).
- CANCEL : Cancels a session (dialog) before it is established.
- REGISTER : Registers a user location (host name, IP) on a registrar SIP server.
- OPTIONS : Communicates information about the capabilities of the calling and receiving SIP phones (methods, extensions, codecs etc).
- PRACK : Provisional acknowledgement for a provisional response such as 183 (Session Progress). PRACK applies only to 101-199 responses.
- SUBSCRIBE : Subscribes for notifications from the notifier. Can use Expires: 0 to unsubscribe.
- NOTIFY : Notifies the subscriber of a new event.
- PUBLISH : Publishes an event to the server.
- INFO : Sends mid-session information.
- REFER : Asks the recipient to issue a call transfer.
- MESSAGE : Transports instant messages.
- UPDATE : Modifies the state of a session (dialog).
Some SIP responses :
1xx = Informational SIP Responses
100 Trying
180 Ringing
183 Session Progress
2xx = Success Responses
200 OK – Shows that the request was successful
3xx = Redirection Responses
4xx = Request Failures
401 Unauthorized
404 Not Found
405 Method Not Allowed
407 Proxy Authentication Required
408 Request Timeout
480 Temporarily Unavailable
481 Call/Transaction Does Not Exist
482 Loop Detected
483 Too Many Hops
486 Busy Here
487 Request Terminated
488 Not Acceptable Here
5xx = Server Errors
500 Server Internal Error
503 Service Unavailable
6xx = Global Failures
600 Busy Everywhere
603 Decline
604 Does Not Exist Anywhere
606 Not Acceptable
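Since the class of a response is given by its first digit, code can branch on the class without knowing every individual status. A minimal sketch, with hypothetical function names:

```python
RESPONSE_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Request Failure",
    5: "Server Error",
    6: "Global Failure",
}


def classify_response(code: int) -> str:
    """Map a SIP status code to its response class."""
    if not 100 <= code <= 699:
        raise ValueError(f"not a SIP status code: {code}")
    return RESPONSE_CLASSES[code // 100]


def is_final(code: int) -> bool:
    """1xx responses are provisional; everything from 200 up ends the transaction."""
    return code >= 200


if __name__ == "__main__":
    for code in (100, 180, 200, 302, 486, 503, 603):
        print(code, classify_response(code), "final" if is_final(code) else "provisional")
```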
SIP callflow diagram for a Call Setup and termination using RTP for media and RTCP for control.

SIP Transport Layers
We know the ISO OSI layers, which serve as a standard model for data communications.

- Physical Layer : Ethernet , USB , IEEE 802.11 WiFi, Bluetooth , BLE
- Data Link Layer : ARP ( Address Resolution Protocol ) , PPP ( point to point protocol ) , MAC ( Media Access control ) , ATM , Frame Relay
- Network Layer : IP (IPv4 / IPv6), ICMP, IPsec
- Transport : TCP , UDP , SCTP
- Session : PPTP ( Point to point tunnelling protocol) , NFS, SOCKS
- Presentation : Encodings and formats such as JPEG, GIF, SSL
- Application : Application level, like call manager / softphone, such as HTTP, FTP, DNS, SIP, RTSP, RTP
SDP ( Session Description Protocol)
SIP can carry many kinds of MIME attachments; one of them is SDP. It uses RTP/AVP profiles for common media types. SDP itself is specified by RFC 4566, and its offer/answer use in SIP by RFC 3264. It defines media information and capabilities such as codecs and termination points.
It contains the connection headers used for establishing the session. Sample SDP payload for the SIP INVITE above:
Session Description Protocol Version (v): 0
Owner/Creator, Session Id (o): FreeSWITCH 1532932581 1532932582 IN IP4 1.2.3.4
Owner Username: FreeSWITCH
Session ID: 1532932581
Session Version: 1532932582
Owner Network Type: IN
Owner Address Type: IP4
Owner Address: 1.2.3.4
Session Name (s): FreeSWITCH
Connection Information (c): IN IP4 1.2.3.4
Connection Network Type: IN
Connection Address Type: IP4
Connection Address: 1.2.3.4
Time Description, active time (t): 0 0
Session Start Time: 0
Session Stop Time: 0
Media Description, name and address (m): audio 29398 RTP/AVP 0 101
Media Type: audio
Media Port: 29398
Media Protocol: RTP/AVP
Media Format: ITU-T G.711 PCMU
Media Format: DynamicRTP-Type-101
Media Attribute (a): rtpmap:0 PCMU/8000
Media Attribute Fieldname: rtpmap
Media Format: 0
MIME Type: PCMU
Sample Rate: 8000
Media Attribute (a): rtpmap:101 telephone-event/8000
Media Attribute Fieldname: rtpmap
Media Format: 101
MIME Type: telephone-event
Sample Rate: 8000
Media Attribute (a): fmtp:101 0-16
Media Attribute Fieldname: fmtp
Media Format: 101 [telephone-event]
Media format specific parameters: 0-16
Media Attribute (a): silenceSupp:off - - - -
Media Attribute Fieldname: silenceSupp
Media Attribute Value: off - - - -
Media Attribute (a): ptime:20
Media Attribute Fieldname: ptime
Media Attribute Value: 20
v=0 is the protocol version and marks the start of the SDP content.
o=FreeSWITCH 1532932581 1532932582 IN IP4 1.2.3.4 is the session origin: the owner's name, session id, session version and the owner's address.
c=IN IP4 1.2.3.4 is the connection information. It specifies the IP address of the session.
m=audio 29398 RTP/AVP 0 101 is the media description: media type audio, port 29398, RTP/AVP profile, payload types 0 and 101.
The rtpmap attributes map payload type 0 to codec PCMU with a sampling rate of 8000 Hz, and payload type 101 to telephone-event.
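To make the field-by-field reading concrete, the sketch below parses the raw text form of the SDP shown in the dissection. It is a simplified illustration, not a complete SDP parser: the function name and the result layout are assumptions, only `c=`, `m=` and `a=rtpmap` lines are interpreted, and media-level `c=` lines are not distinguished from the session-level one.

```python
SAMPLE_SDP = """v=0
o=FreeSWITCH 1532932581 1532932582 IN IP4 1.2.3.4
s=FreeSWITCH
c=IN IP4 1.2.3.4
t=0 0
m=audio 29398 RTP/AVP 0 101
a=rtpmap:0 PCMU/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=silenceSupp:off - - - -
a=ptime:20
"""


def parse_sdp(text: str) -> dict:
    """Collect the connection address and per-media payload type mappings."""
    session = {"connection_address": None, "media": []}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition("=")
        if key == "c":
            # c=<nettype> <addrtype> <address>
            session["connection_address"] = value.split()[2]
        elif key == "m":
            # m=<media> <port> <proto> <fmt> ...
            media, port, proto, *formats = value.split()
            session["media"].append(
                {"type": media, "port": int(port), "proto": proto,
                 "formats": formats, "rtpmap": {}}
            )
        elif key == "a" and value.startswith("rtpmap:") and session["media"]:
            # a=rtpmap:<payload type> <encoding name>/<clock rate>[/<params>]
            payload_type, encoding = value[len("rtpmap:"):].split(None, 1)
            parts = encoding.split("/")
            session["media"][-1]["rtpmap"][payload_type] = (parts[0], int(parts[1]))
    return session


if __name__ == "__main__":
    print(parse_sdp(SAMPLE_SDP))
```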
SIP Authorization
Authentication, security, confidentiality and integrity form the basic requirements for any communication system.
To protect against account hijacking and denial-of-service attacks, SIP uses the HTTP digest authentication mechanism. Here the SIP request is answered with a challenge carrying a nonce. The sender has to resend the request with an MD5 hash computed over the nonce and the password (the password is never sent in clear), which protects the password against eavesdropping on the path.
Challenge / Response Scheme :
- The client sends REGISTER and receives a 401 Unauthorized (or a 407 from a proxy) challenge with a nonce
- The client sends REGISTER again with the MD5 hash (password + nonce) and gets a 200 OK
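The hash in the second step is more than a plain MD5 of password and nonce. A minimal sketch of the basic digest calculation (the RFC 2617 form without the optional `qop`, `cnonce` and `nc` fields) looks like this; the function names, credentials, realm and nonce below are made-up example values.

```python
import hashlib


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce) -> str:
    """Compute the 'response' value of a Digest Authorization header (no qop)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # who is asking
    ha2 = md5_hex(f"{method}:{uri}")                 # what is being asked
    return md5_hex(f"{ha1}:{nonce}:{ha2}")           # bound to the server's nonce


if __name__ == "__main__":
    # All values below are hypothetical and only illustrate the call.
    response = digest_response(
        username="altanai",
        realm="telecomcompany.com",
        password="secret",
        method="REGISTER",
        uri="sip:telecomcompany.com",
        nonce="server-generated-nonce",
    )
    print(response)
```

Because the nonce is mixed into the hash, a response captured for one challenge is of no use against a different nonce.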
To prevent spoofing, i.e. impersonating a server, SIP provides server authentication too. This is required by ITSPs (Internet telephony service providers).
Mobility
To provide session mobility, SIP endpoints send a REGISTER request to their respective registrar as they move, updating their location.
As a user changes terminals, they register with the appropriate server
The location server tracks the location of the user
Redirect servers prioritize the possible locations of the user
Users keep the same services, hosted at the home server, while mobile
Calls are processed by the home servers using Record-Route
NAT
Network Address Translator, defined by RFC 3022, conserves public address space, as most packets are exchanged inside a private network anyway.
Most Internet users, whether they are on WiFi, 3G/LTE, a home AP or any other data packet network run by a TSP or ISP, are assigned a private IP address, which is unreachable from the outside world. The private address blocks reserved by the Internet Assigned Numbers Authority (IANA) are 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16.
Therefore, when they access the Internet, this address is translated into a globally unique public IP address by a NAT for external communication.
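Whether an address falls into one of the three private blocks can be checked with Python's standard `ipaddress` module. The helper name `is_rfc1918` is an assumption for this sketch, and the blocks are listed explicitly rather than relying on a broader notion of "private".

```python
import ipaddress

PRIVATE_BLOCKS = [
    ipaddress.ip_network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(address: str) -> bool:
    """True if the address lies in one of the three private IPv4 blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)


if __name__ == "__main__":
    for addr in ("192.168.1.10", "172.31.255.255", "172.32.0.1", "8.8.8.8"):
        print(addr, is_rfc1918(addr))
```

A SIP endpoint can use such a check to notice that the address it is about to advertise in Contact or SDP will not be reachable from outside.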

SIP Issues around NAT
NATs modify IP addresses (Layer 3). SIP and SDP are Layer 7 protocols, so the addresses written inside their payload pass through the NAT unchanged:
- SIP Via:, From: and Contact: headers carry non-routable private addresses
- SDP states that the originator wishes to receive media at non-routable private addresses
- If a destination on the public Internet tries to send SIP or RTP traffic to those private addresses, the traffic is dropped by the first router
Solutions are to use either an Application Level Gateway (ALG), STUN or Universal Plug and Play (UPnP) to rewrite all SIP/SDP source addresses:
- SIP Via:, From: and Contact: headers use the public NAT address
- SDP addresses use the public NAT address
- Use SIP over TCP
Alternatively, use draft-ietf-sip-symmetric-response-00 and “symmetric” SIP/RTP:
- Use the same UDP port number for incoming and outgoing traffic
- Hold ports open for the call duration
- Send a UDP packet typically every 30 seconds
- SIP over UDP uses a 30 second re-INVITE, REGISTER or OPTIONS
- RTP sends at a much higher frequency by default
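The address rewriting step can be sketched for the SDP part as follows. This is a toy illustration of what an ALG-style rewrite does, not how any particular product implements it: the function name is hypothetical, only the `o=` and `c=` lines are touched, and the addresses are example values.

```python
def rewrite_sdp_address(sdp: str, private_ip: str, public_ip: str) -> str:
    """Replace the private address in o= and c= lines with the public one."""
    rewritten = []
    for line in sdp.splitlines():
        if line.startswith(("o=", "c=")):
            fields = line.split(" ")
            # In both line types the address is the last field.
            if fields[-1] == private_ip:
                fields[-1] = public_ip
            line = " ".join(fields)
        rewritten.append(line)
    return "\r\n".join(rewritten) + "\r\n"


if __name__ == "__main__":
    sdp = (
        "v=0\r\n"
        "o=phone1 1 1 IN IP4 192.168.1.10\r\n"
        "s=call\r\n"
        "c=IN IP4 192.168.1.10\r\n"
        "t=0 0\r\n"
        "m=audio 29398 RTP/AVP 0 101\r\n"
    )
    print(rewrite_sdp_address(sdp, "192.168.1.10", "203.0.113.5"))
```

After such a rewrite the body length usually changes, so the Content-Length header of the SIP message has to be recomputed as well.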
NAPT ( Network Address Port Translator )
- Can map multiple private IP addresses and ports to one public IP address and ports
SIP Flows
Registration
Location Server – Used by the Proxy Server and Redirect Server to obtain the location of the called user (one or more addresses)
Registration Server – Accepts registration requests from the client applications. Generally, the service is offered by the Proxy Server or Redirect Server
DNS Server – Used to locate the Proxy Server or Redirect Server

Call Redirection
The call INVITE is sent, but the Redirect Server responds with 302 Moved Temporarily and returns a new destination address. The INVITE is then sent to another proxy server, which connects the SIP endpoints after consulting the Redirect Server.

At this stage we see the call getting connected to the SIP endpoint via 2 proxy servers. The redirect server does not stay in the path once the initial SIP request has been sent.

After communication, the endpoints send BYE to terminate the session.

Forking
This callflow deals with the use case where a user may be registered from multiple SIP phones (perhaps one home phone, one in the car, one office desk phone etc) and wants to receive a ring on all registered phones, i.e. fork a call to multiple endpoints.

In the above diagram we can see a forked INVITE going to both SIP phones. Both of them reply with 100 Trying and 180 Ringing, but only one gets answered by the user.

After one endpoint sends 200 OK and connects to the session, the other receives a CANCEL from the SIP server.
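The forking decision can be sketched as a small function that walks through the responses in arrival order. It is a simplified model under stated assumptions: the function name, the contact URIs and the event format are invented for illustration, and timers, retransmissions and multiple simultaneous 200 OK responses are ignored.

```python
def resolve_fork(contacts, events):
    """Pick the branch that answers first and list the branches to CANCEL.

    contacts: the registered contacts the INVITE was forked to.
    events:   (contact, status_code) pairs in the order they arrive.
    Returns (winner, contacts_to_cancel); winner is None if nobody answered.
    """
    pending = list(contacts)
    for contact, code in events:
        if contact not in pending:
            continue
        if 200 <= code < 300:
            # First 2xx wins; every branch still pending gets a CANCEL.
            pending.remove(contact)
            return contact, pending
        if code >= 300:
            # This branch ended on its own (busy, declined, ...).
            pending.remove(contact)
    return None, pending


if __name__ == "__main__":
    contacts = ["sip:altanai@home-phone", "sip:altanai@office-desk"]  # hypothetical
    events = [
        (contacts[0], 100), (contacts[1], 100),
        (contacts[0], 180), (contacts[1], 180),
        (contacts[1], 200),
    ]
    winner, to_cancel = resolve_fork(contacts, events)
    print("answered by:", winner)
    print("send CANCEL to:", to_cancel)
```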

Click to Dial
A web or desktop application that speaks HTTP can fire an API call, which is interpreted by the controller or SIP server, and the call is placed.

The API can contain parameters for the to and from SIP addresses as well as any authentication token that is required for API authentication and validation.
Source code for some of the SIP applications can be found on GitHub:
https://github.com/altanai/sip-servlets
SIMPLE
SIP for Instant Messaging and Presence Leveraging Extensions (SIMPLE)
- several vendors intend to implement SIMPLE
- provides for presence and buddy lists
- Instant Messaging in the enterprise
- telephony enabled user lists
Using SIP based call routing algorithms and flows, one can build a carrier grade communication solution. SIP solutions can hook up with existing telecom networks and service providers to be backward compatible. SIP also has untapped potential to integrate with any external IP application or service to provide converged, customised control for both the signalling and media planes.
References :
- SIP by Henning Schulzrinne Dept. of Computer Science Columbia University New York
- International Institute of Telecommunications 2000-2004
- Introduction to SIP by Patrick Ferriter from ZULTYS
- Internet Draft, IETF, RFC 2543
- NTU – Internet Telephony based on SIP