I set out to build a SIP application development environment a year ago and worked towards it earnestly. Sadly, I was not able to complete the job, yet I have decided to share a few things about it here.
Aim :
Develop an SCE (Service Creation Environment) that addresses all aspects of the lifecycle of a service, right from creation/development, orchestration and execution/delivery to assurance and migration/upgrade of services.
Similar market products :
Open/cloud Rhino
Mobicents and Telestax
Limitations of open source/other market products:
Free versions of the Service Creation Environments do not offer High Availability.
High Cost of Deployment grade versions.
Solution Description
I propose an in-house Java based Service Creation Environment, “SLC SCE”. The SLC SCE will enable creation of JAIN SLEE based SIP services. It can be used to develop and deploy carrier-grade applications that use SS7 and IMS based protocols such as INAP, CAP, Diameter and SIP, as well as IT / Web protocols such as HTTP and XML.
Benefits:
Service Agility
Significantly Lower price points
Open Standards eliminate Legacy SCP Lock-in
Timeline
Java-based service creation environment (SCE) – 1.5 Months
Graphical User Interface (GUI) and schematic representations to help in the design, maintenance and support of applications – 1.5 months
SIP Resource Adapter – 1 month
Architecture
Service Creation Environment (SCE) for SIP Applications
In essence it encompasses the idea of developing the following
SIP stack
JavaScript APIs
Java Libraries for calling SIP stack
Eclipse plugin to work with the SIP application development process
Visual Interface to view the logic of application and possible errors / flaws
SDKs (Service Development Kits), which are development environments themselves
Extra Effort required to put in to make the venture successful
Demo applications for basic SIP logic like call screening and call rerouting.
Tutorials to create, deploy and run an application from scratch, aimed at all audiences, i.e. web developers, telecom engineers, full stack developers etc.
Some open source implementations on public repositories like GitHub, Google Code, SourceForge
Active problem solving on Stack Overflow, CodeRanch, Google Groups and other forums.
We have already learned about the SIP user agent and the SIP network server. A SIP client initiates a call and a SIP server routes the call. The registrar is responsible for name resolution and user location. A SIP proxy receives calls and sends them to their destination or next hop.
Presence is a user’s reachability and willingness to communicate, expressed as current status information. Users subscribe to an event and receive notifications. The components in presence are:
Presence user agent
Presence agent
Presence server
Watcher
SIP was initially introduced as a signaling protocol, but it lacked a method to emulate constant communication and status updates between entities.
Three more methods were introduced, namely PUBLISH, SUBSCRIBE and NOTIFY.
SUBSCRIBE requests should be sent by watchers to the presence server
The presence agent should authenticate and send an acknowledgement
State changes should be notified to the subscriber
Agents should be able to allow or terminate subscriptions
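The four requirements above can be sketched as a toy in-memory presence server. This is only an illustration: the class name `PresenceServer`, its methods and the `authorized` flag are hypothetical, and no real SIP messages are sent or parsed.

```python
from collections import defaultdict


class PresenceServer:
    """Toy presence server: models SUBSCRIBE / PUBLISH / NOTIFY in memory."""

    def __init__(self):
        self.watchers = defaultdict(set)  # presentity -> watchers subscribed to it
        self.state = {}                   # presentity -> last published status
        self.sent = []                    # log of NOTIFYs: (watcher, presentity, status)

    def subscribe(self, watcher, presentity, authorized=True):
        # SUBSCRIBE: refuse unauthorized watchers, otherwise acknowledge with 200
        if not authorized:
            return 403
        self.watchers[presentity].add(watcher)
        # a new subscription triggers an initial NOTIFY with the current state
        self._notify(watcher, presentity)
        return 200

    def unsubscribe(self, watcher, presentity):
        # terminating the subscription stops further notifications
        self.watchers[presentity].discard(watcher)

    def publish(self, presentity, status):
        # PUBLISH: store the new state and NOTIFY every active watcher
        self.state[presentity] = status
        for watcher in sorted(self.watchers[presentity]):
            self._notify(watcher, presentity)

    def _notify(self, watcher, presentity):
        status = self.state.get(presentity, "unknown")
        self.sent.append((watcher, presentity, status))
```

For instance, after `server.subscribe("bob", "alice")` and `server.publish("alice", "online")`, the `sent` log holds one initial notification and one state-change notification for bob.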
Presence is a way to have sustained stateful communication. SIP user agents can use the presence service to know about other users’ online status. A presence deployment must conform to security standards.
WebRTC is a disruptive technology for telephony and cloud based communication services. It will change the landscape and foster the growth of new, innovative VoIP services that will be device agnostic and future ready.
Role of SIP servers ?
The SIP server converts the SIP transport from the WebSocket protocol to UDP, TCP or TLS, which are supported by all legacy networks. It also facilitates the use of rich services such as phonebook synchronisation, file sharing and OAuth in the client.
How does WebRTC Solution traverse through FireWalls ?
NAT traversal across firewalls is achieved via TURN/STUN through ICE candidate gathering. Current ice_servers are: stun:stun.l.google.com:19302 and turn:user@numb.viagenie.ca
What audio and video codecs are supported by WebRTC client side alone ?
Without a media server, the WebRTC solution supports Opus, PCMA and PCMU for audio and VP8 for video calls.
RTCBreaker, if enabled, provides a third-party B2BUA agent that performs a certain level of codec conversion to H.264, H.263, Theora or MP4V-ES for agents that do not support WebRTC.
What video resolution is supported by WebRTC solution ?
The browser will try to find the best video size between max and min based on the camera capabilities.
We can also predefine the video size such as minWidth, minHeight, maxWidth, maxHeight.
What bandwidth is required to run WebRTC solution ?
We can set the maximum audio and video bandwidth to use, or rely on the browser’s ability to set it by default at runtime. This will change the outgoing SDP to include a “b=AS:” line. The browser negotiates the right value using RTCP-REMB and congestion control.
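As a rough illustration of what capping the bandwidth does to the outgoing SDP, the sketch below inserts (or replaces) a `b=AS:` line in one media section. The function name `set_bandwidth` is hypothetical, and placing the line after any `i=` / `c=` lines of the section is a simplification chosen for this example.

```python
def set_bandwidth(sdp: str, media: str, kbps: int) -> str:
    """Return the SDP with a b=AS:<kbps> line in the given media section."""
    out = []
    in_section = False  # inside the targeted m= section
    pending = False     # the b=AS line still has to be written
    for line in sdp.splitlines():
        if line.startswith("m="):
            if pending:
                # previous target section ended before we could insert
                out.append(f"b=AS:{kbps}")
            in_section = pending = line.startswith(f"m={media} ")
            out.append(line)
            continue
        if in_section and line.startswith("b=AS:"):
            continue  # drop the old value, the new one replaces it
        if pending and not line.startswith(("i=", "c=")):
            out.append(f"b=AS:{kbps}")
            pending = False
        out.append(line)
    if pending:
        out.append(f"b=AS:{kbps}")
    return "\r\n".join(out) + "\r\n"
```

Calling `set_bandwidth(sdp, "video", 512)` leaves the audio section untouched and adds `b=AS:512` to the video section only.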
List of Web based SIP clients
sipML5 client by Doubango
Telestax WebRTC client
SIPJS with flash network support
JSSIP
MIT license
SIP phones in Ubuntu / Linux
SFL phone
Yate SIP phone
Linphone
There are ready-made builds of Linphone for Windows, Mac and mobile
Alternatively, one can also build Linphone from source
[ 57%] Performing configure step for 'EP_ms2'
loading initial cache file /home/altanai/linphone-desktop/WORK/WORK/desktop//tmp/EP_ms2/EP_ms2-cache-RelWithDebInfo.cmake
CMake Error at CMakeLists.txt:322 (message):
Could not find a support sound driver API. Use -DENABLE_SOUND=NO if you
don't care about having sound.
SIP is a widely adopted application layer protocol used in VoIP calls and conferencing applications, and in IMS architecture or pure packet-switched networks.
The traditional SIP methods for call setup are INVITE and ACK, and for teardown CANCEL or BYE; however, with wider adoption, newer methods specific to services were added, such as:
MESSAGE – method for instant message based services
SUBSCRIBE, NOTIFY – standardised by the event notification extension (RFC 3265) and used by the presence event package (RFC 3856)
PUBLISH – to push presence information to the network
The SIP requests and responses are outlined in the tables below.
Request Message – Description
REGISTER – A client uses this message to register an address with a SIP server.
INVITE – A user or service uses this message to let another user/service participate in a session. The body of this message would include a description of the session to which the callee is being invited.
ACK – This is used only for INVITE, indicating that the client has received a final response to an INVITE request.
CANCEL – This is used to cancel a pending request.
BYE – A user agent uses this message to terminate the call.
OPTIONS – This is used to query a server about its capabilities.
Response Message
Code – Category – Description
1xx – Provisional – The request has been received and processing is continuing.
2xx – Success – The action was successfully received, understood, and accepted.
3xx – Redirection – Further action is required to process this request.
4xx – Client Error – The request contains bad syntax or cannot be fulfilled at this server.
5xx – Server Error – The server failed to fulfill an apparently valid request.
6xx – Global Failure – The request cannot be fulfilled at any server.
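The response classes in the table map directly onto the first digit of the status code. A minimal sketch of that mapping follows; the names `RESPONSE_CLASSES`, `response_class` and `is_final` are made up for this example.

```python
RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def response_class(code: int) -> str:
    """Map a SIP status code to its category using the hundreds digit."""
    if not 100 <= code <= 699:
        raise ValueError(f"not a SIP status code: {code}")
    return RESPONSE_CLASSES[code // 100]


def is_final(code: int) -> bool:
    """Everything except 1xx ends the transaction."""
    return response_class(code) != "Provisional"
```

With this, `response_class(486)` gives "Client Error" and `is_final(180)` is false, since 180 Ringing is only provisional.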
SIP headers
Display names
From – the originator’s SIP URI
CSeq or Command Sequence – contains an integer and a method name. The CSeq number is incremented for each new request within a dialog and is a traditional sequence number.
Contact – SIP URI that represents a direct route to the originator, usually composed of a username at a fully qualified domain name (FQDN); IP addresses are also permitted. The Contact header field tells other elements where to send future requests.
Max-Forwards – limits the number of hops a request can make on the way to its destination. It consists of an integer that is decremented by one at each hop.
Content-Length – an octet (byte) count of the message body.
Content-Disposition – describes how the message body or, for multipart messages, a message body part is to be interpreted by the UAC or UAS. It extends the MIME Content-Type.
Disposition Types :
“session” – body part describes a session, for either calls or early (pre-call) media
“render” – body part should be displayed or otherwise rendered to the user.
“icon” – body part contains an image suitable as an iconic representation of the caller or callee
“alert” – body part contains information, such as an audio clip, that should be rendered by the user agent to alert the user to the receipt of a request
Accept
Accept – acceptable formats like application/sdp or currency/dollars
Header field     where  proxy  ACK  BYE  CAN  INV  OPT  REG
Accept           R             -    o    -    o    m*   o
Accept           2xx           -    -    -    o    m*   o
Accept           415           -    c    -    c    c    c
An empty Accept header field means that no formats are acceptable.
Accept-Encoding
Accept-Encoding  R             -    o    -    o    o    o
Accept-Encoding  2xx           -    -    -    o    m*   o
Accept-Encoding  415           -    c    -    c    c    c
Accept-Language : languages for reason phrases, session descriptions, or status responses carried as message bodies in the response.
Accept-Language: da, en-gb;q=0.8, en;q=0.7
Accept-Language R - o - o o o
Accept-Language 2xx - - - o m* o
Accept-Language 415 - c - c c c
Tag – globally unique and cryptographically random, with at least 32 bits of randomness. Tags help identify a dialog, which is the combination of the Call-ID along with two tags (from the To and From headers).
Call-ID – uniquely identifies a session
Contact – SIP URL alternative for direct routing
Encryption
Expires – when the message content is no longer valid
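To make the tag and dialog rules concrete, here is a small sketch that generates a tag with 32 bits of cryptographic randomness and combines Call-ID plus the two tags into a dialog identifier. The helper names `new_tag` and `dialog_id` are illustrative only.

```python
import secrets


def new_tag() -> str:
    """Return a tag with 32 bits of cryptographic randomness (8 hex chars)."""
    return secrets.token_hex(4)  # 4 bytes = 32 bits


def dialog_id(call_id: str, local_tag: str, remote_tag: str) -> tuple:
    """A dialog is identified by the Call-ID plus the local and remote tags."""
    return (call_id, local_tag, remote_tag)


# The caller's local tag is the From tag; the callee's local tag is the To tag.
from_tag, to_tag = new_tag(), new_tag()
caller_view = dialog_id("163784@host.domain.com", from_tag, to_tag)
callee_view = dialog_id("163784@host.domain.com", to_tag, from_tag)
```

Both endpoints hold the same three values, only with local and remote tags swapped, which is why each side can recognise in-dialog requests from the other.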
Mandatory SIP headers
INVITE sip:altanai@domain.com SIP/2.0
Via: SIP/2.0/UDP host.domain.com:5060
From: Bob <sip:bob@domain.com>
To: Altanai <sip:altanai@domain.com>
Call-ID: 163784@host.domain.com
CSeq: 1 INVITE
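A minimal sketch of checking a request for the headers shown above. The names `parse_request` and `MANDATORY` are hypothetical, the parser is deliberately naive (exact header-name match, no compact forms, repeated headers overwrite each other), and the checked set is just the five headers from the example; RFC 3261 additionally requires Max-Forwards in requests.

```python
MANDATORY = ("Via", "From", "To", "Call-ID", "CSeq")


def parse_request(raw: str):
    """Split a SIP request into start line fields, headers and missing headers."""
    head = raw.split("\r\n\r\n", 1)[0]      # ignore the body, if any
    lines = head.split("\r\n")
    method, uri, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        if not line:
            continue
        # only the first colon separates name from value (values contain colons)
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    missing = [name for name in MANDATORY if name not in headers]
    return method, uri, version, headers, missing


raw = (
    "INVITE sip:altanai@domain.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP host.domain.com:5060\r\n"
    "From: Bob <sip:bob@domain.com>\r\n"
    "To: Altanai <sip:altanai@domain.com>\r\n"
    "Call-ID: 163784@host.domain.com\r\n"
    "CSeq: 1 INVITE\r\n"
    "\r\n"
)
method, uri, version, headers, missing = parse_request(raw)
```

For the example request `missing` is empty; removing the Call-ID line would make it `["Call-ID"]`.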
Informational headers
Call-Info – additional information about the caller or callee, for example through a web page. The “card” purpose provides a business card, for example in vCard or LDIF format. Additional tokens can be registered with IANA.
Priority – indicates the urgency of the request as perceived by the client. It can have the values “non-urgent”, “normal”, “urgent”, and “emergency”, but additional values can be defined elsewhere.
Subject: A tornado is heading our way!
Priority: emergency
or
Subject: Weekend plans
Priority: non-urgent
Subject – a summary, or an indication of the nature of the call
Subject: Need more boxes
s: Tech Support
Supported – enumerates all the extensions supported. It contains a list of option tags.
Supported: 100rel
k: 100rel
Unsupported – features not supported
Unsupported: foo
User-Agent – information about the UAC originating the request.
User-Agent: Softphone Beta1.5
Organization – conveys the name of the organization to which the SIP element issuing the request or response belongs.
Organization: AltanaiTelecom Co.
Warning – additional information about the status of a response. List of warn-codes:
300 Incompatible network protocol:
301 Incompatible network address formats:
302 Incompatible transport protocol:
303 Incompatible bandwidth units:
304 Media type not available:
305 Incompatible media format:
306 Attribute not understood:
307 Session description parameter not understood:
330 Multicast not available:
331 Unicast not available:
370 Insufficient bandwidth:
399 Miscellaneous warning:
1xx and 2xx have been taken by HTTP/1.1.
Warning: 307 isi.edu "Session parameter 'foo' not understood"
Warning: 301 isi.edu "Incompatible network address type 'E.164'"
Authentication and Authorization related headers
Authentication-Info – mutual authentication with HTTP Digest. A UAS MAY include this header field in a 2xx response to a request that was successfully authenticated using digest based on the Authorization header field.
A session timer limits the time period over which a stateful proxy must maintain state information. The options are:
User agents must tear down the call after the expiration of the timer, or
The caller can send re-INVITEs to refresh the timer, enabling a “keep alive” mechanism for SIP.
SDP (Session Description Protocol)
SIP can bear many kinds of MIME attachments; one such is SDP. It is a standard format for exchanging media, metadata and other transport related attributes between the participants before establishing a VoIP call.
An SDP session description is entirely textual, using the ISO 10646 character set in UTF-8 encoding, and is described by the application/sdp media type.
It should be noted that SDP itself does not incorporate a transport protocol and can be used with different protocols like the Session Announcement Protocol (SAP), SIP, HTTP, electronic mail with MIME extensions, RTSP etc.
In the case of SIP, SDP is encapsulated inside the SIP packet and uses the offer/answer model to convey information about media streams in a multimedia session.
An SDP body contains 2 parts: a session section starting with the v= line and media sections each starting with an m= line. Media and transport information can contain the type of media, like video or audio; the transport protocol, like RTP/UDP/IP or H.320; and the format of the media, such as H.261 video, MPEG video, etc.
Session Description in SDP
Protocol version ( v= ) – mostly version 0
Session name ( s= ) and session information ( i= ) – the session name is textual and can be a single space or even s=- but must not be empty. Session information is optional textual information about the session.
URI of description ( u= )
Email Address and Phone Number (“e=” and “p=”)
Both are optional free-text strings that SHOULD be in the ISO-10646 character set with UTF-8 encoding.
Note that, if given, phone numbers SHOULD follow the international public telecommunication number specification (ITU-T Recommendation E.164) and be preceded by a “+”. Spaces and hyphens may be used to split up a phone field to aid readability if desired.
Connection Data ( c= ) – connection information; not required at session level if included in all media descriptions, in which case the media-specific connection data overrides the overall session connection data
c= <net-type> <addr-type> <connection-address>
c=IN IP4 172.31.90.251
If the session is multicast, the connection address will be an IP multicast group address. A TTL should be present with an IPv4 multicast address. If the connection is unicast, the address contains the unicast IP address of the expected data source, data relay or data sink.
Bandwidth ( b= ) interpreted as kilobits per second by default
b= <bwtype> : <bandwidth>
Encryption Keys ( k= ) – only if the SDP is exchanged over a secure and trusted channel can keys be exchanged in this SDP field, although this practice is not recommended.
k=clear:<encryption key>
k=base64:<encoded encryption key>
k=uri:<URI to obtain key>
k=prompt
Attributes ( a= )
extends the SDP with values like flags
a=inactive , a=sendonly , a=sendrecv , a=recvonly
Mapping the encoder spec from the RTP payload type:
a=rtpmap:<payload type> <encoding name>/<clock rate> [/<encoding parameters>]
If the <stop-time> is set to zero, then the session is not bounded, though it will not become active until after the <start-time>. If the <start-time> is also zero, the session is regarded as permanent.
t=0 0
Repeat Times ( r= )
zero or more repeat times for scheduling a session
r= <repeat interval> <active duration> <offsets from start-time>
Useful for scheduling sessions across the transition from daylight saving time to standard time and vice versa.
Media Description in SDP
For RTP, the default is that only the even-numbered ports are used for data with the corresponding one-higher odd ports used for the RTCP belonging to the RTP session
m= <media> <port> <proto> <fmt> …
m=audio 20098 RTP/AVP 0 101
will stream RTP on 20098 and RTCP on 20099
When multiple transport ports are needed, pairs of RTP and RTCP streams are specified as
m= <media> <port>/ <number of ports> <proto> <fmt> …
m=audio 20098/2 RTP/AVP 0 101
will stream one pair on RTP 20098, RTCP 20099 and another on RTP 20100, RTCP 20101
If non-contiguous ports are required, they must be signalled using a separate attribute, for example “a=rtcp:”
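The even/odd port convention described above can be sketched in a few lines. The function name `rtp_rtcp_ports` is hypothetical, and the sketch only covers the default contiguous layout, not ports overridden via “a=rtcp:”.

```python
def rtp_rtcp_ports(m_line: str):
    """Return the (RTP, RTCP) port pairs implied by an SDP m= line."""
    fields = m_line[2:].split()           # strip the leading "m="
    port_spec = fields[1]                 # "<port>" or "<port>/<number of ports>"
    port, _, count = port_spec.partition("/")
    base, pairs = int(port), int(count or 1)
    # each pair uses an even RTP port and the next odd port for RTCP
    return [(base + 2 * i, base + 2 * i + 1) for i in range(pairs)]
```

For `m=audio 20098/2 RTP/AVP 0 101` this yields the two pairs (20098, 20099) and (20100, 20101) mentioned in the text.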
Additional SDP features: in addition to normal unicast sessions, SDP can also convey a multicast group address for media on an IP multicast session. Private (encrypted SDP) and public sessions are not treated differently by SDP; that distinction is entirely a function of the implementing mechanism, like SIP or SAP. Optional SDP params include URI, categorisation “a=cat:”, internationalisation etc.
Example 1 : Typical audio call SIP INVITE, showing the SIP headers followed by the SDP body
INVITE sip:01150259917040@x.x.x.x SIP/2.0
Via: SIP/2.0/UDP x.x.x.x:5060;branch=z9hG4bK400fc6e6
From: "123456789" <sip:123456789@x.x.x.x>;tag=as42e2ecf6
To: <sip:01150259917040@x.x.x.x>
Contact: <sip:123456789@x.x.x.x>
Call-ID: 2485823e63b290b47c042f20764d990a@x.x.x.x.x
CSeq: 102 INVITE
User-Agent: MatrixSwitch
Date: Thu, 22 Dec 2005 18:38:28 GMT
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER
Content-Type: application/sdp
Content-Length: 268
v=0
o=root 14040 14040 IN IP4 x.x.x.x
s=session
c=IN IP4 x.x.x.x
t=0 0
m=audio 26784 RTP/AVP 0 8 18 101
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:18 G729/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=fmtp:18 annexb=no
c=* (connection information - optional if included at session-level)
b=* (bandwidth information)
a=* (zero or more media attribute lines)
The above SDP shows 4 supported payload types on the audio stream: 0 PCMU, 8 PCMA, 18 G729 and finally 101, used for telephone events. It also shows RTP/AVP as the RTP profile and does not contain any m=video line, which shows that this endpoint does not want a video call, only an audio one.
Example 2 : Video call SIP INVITE from Linphone
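The same reading of the SDP can be done programmatically. The sketch below, with hypothetical helper names `audio_codecs` and `wants_video`, pairs each payload type on the m=audio line with its a=rtpmap entry and checks for a video section.

```python
def audio_codecs(sdp: str):
    """List (payload type, encoding) pairs offered on the audio stream."""
    payloads, names = [], {}
    for line in sdp.splitlines():
        if line.startswith("m=audio "):
            payloads = line.split()[3:]          # formats follow media, port, proto
        elif line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            names[pt] = encoding
    return [(int(pt), names.get(pt, "unknown")) for pt in payloads]


def wants_video(sdp: str) -> bool:
    """True if the description contains an m=video section."""
    return any(line.startswith("m=video ") for line in sdp.splitlines())


sdp = "\r\n".join([
    "v=0",
    "o=root 14040 14040 IN IP4 x.x.x.x",
    "s=session",
    "c=IN IP4 x.x.x.x",
    "t=0 0",
    "m=audio 26784 RTP/AVP 0 8 18 101",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:8 PCMA/8000",
    "a=rtpmap:18 G729/8000",
    "a=rtpmap:101 telephone-event/8000",
]) + "\r\n"
codecs = audio_codecs(sdp)
```

Here `codecs` lists the four payload types in offer order and `wants_video(sdp)` is false, matching the audio-only interpretation.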
SIP URI Params
The Internet Assigned Numbers Authority (IANA) Uniform Resource Identifier (URI) Parameters registry defines the URI params that can be used along with the SIP scheme.
For example, the comp=sigcomp URI param indicates that the request has to be compressed using SigComp.
transport-param
SIP can use any network transport protocol. Parameter names are defined for UDP (RFC 768), TCP (RFC 761), and SCTP (RFC 2960). For a SIPS URI, the transport parameter MUST indicate a reliable transport.
The maddr param gives the server address (destination address, port, transport) to be contacted for this user, overriding any address derived from the host field.
Although discouraged, the maddr URI param has been used as a simple form of loose source routing. It allows a URI to specify a proxy that must be traversed en route to the destination.
The ttl parameter determines the time-to-live value of the UDP multicast packet and MUST only be used if maddr is a multicast address and the transport protocol is UDP.
sip:alice@atlanta.com;maddr=239.255.255.1;ttl=15
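The maddr/ttl rule can be checked mechanically. The sketch below uses hypothetical helpers `parse_sip_uri` and `ttl_allowed`; it is a simplified parser (no escaping, no header parsing after “?”) and it assumes UDP when no transport param is present, which is an assumption of this example rather than a rule from the text.

```python
import ipaddress


def parse_sip_uri(uri: str) -> dict:
    """Naively split a SIP URI into scheme, user, host and URI params."""
    scheme, _, rest = uri.partition(":")
    rest, _, _headers = rest.partition("?")      # headers part is ignored here
    hostpart, *raw_params = rest.split(";")
    user, _, host = hostpart.rpartition("@")
    params = {}
    for item in raw_params:
        key, _, value = item.partition("=")
        params[key.lower()] = value
    return {"scheme": scheme, "user": user, "host": host, "params": params}


def ttl_allowed(params: dict) -> bool:
    """ttl only makes sense with a multicast maddr and UDP transport."""
    maddr = params.get("maddr")
    if maddr is None:
        return False
    try:
        multicast = ipaddress.ip_address(maddr).is_multicast
    except ValueError:
        return False                             # maddr is a host name, not an IP
    # assumption: a missing transport param is treated as UDP
    return multicast and params.get("transport", "udp").lower() == "udp"


uri = parse_sip_uri("sip:alice@atlanta.com;maddr=239.255.255.1;ttl=15")
```

For the example URI, `uri["params"]` contains both maddr and ttl, and `ttl_allowed(uri["params"])` is true because 239.255.255.1 is a multicast address.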
cause param
“cause” EQUAL Status-Code
404 Unknown/Not available
486 User busy
408 No reply
302 Unconditional
487 Deflection during alerting
480 Deflection immediate response
503 Mobile subscriber not reachable
380 Service number translation (RFC 8119, Section 2)
A provisional response tells its recipient that the associated request was received but that the result of the processing is not known yet, which can happen when processing does not finish immediately. The sender must stop retransmitting the request upon reception of a provisional response.
100 Trying
180 Ringing : triggers local ringing at the caller’s device
181 Call is Being Forwarded : used before transferring to another UA, such as during forking or transfer to a voicemail server
182 Queued
183 Session Progress : conveys information. The header fields or SDP body have more details about the call. Also used for announcements and IVR + DTMF, where it is followed by “early media”.
199 Early Dialog Terminated
2xx—Successful Responses
Final responses express the result of processing the associated request and terminate the transaction.
200 OK
202 Accepted
204 No Notification
3xx—Redirection Responses
A redirection response gives information about the user’s new location or an alternative service that the caller should try for the call. It is used when the server cannot satisfy the call and wants the caller to try elsewhere. After this, the caller is supposed to resend the request to the new location.
300 Multiple Choices
301 Moved Permanently
302 Moved Temporarily
305 Use Proxy
380 Alternative Service
4xx—Client Failure Responses
Negative final responses indicating that the request could not be processed due to the caller’s fault, for reasons such as bad syntax or a request that cannot be fulfilled at that server.
400 Bad Request
401 Unauthorized
402 Payment Required
403 Forbidden
404 Not Found
405 Method Not Allowed
406 Not Acceptable
407 Proxy Authentication Required
408 Request Timeout
409 Conflict
410 Gone
411 Length Required
412 Conditional Request Failed
413 Request Entity Too Large
414 Request-URI Too Long
415 Unsupported Media Type
416 Unsupported URI Scheme
417 Unknown Resource-Priority
420 Bad Extension
421 Extension Required
422 Session Interval Too Small
423 Interval Too Brief
424 Bad Location Information
428 Use Identity Header
429 Provide Referrer Identity
430 Flow Failed
433 Anonymity Disallowed
436 Bad Identity-Info
437 Unsupported Certificate
438 Invalid Identity Header
439 First Hop Lacks Outbound Support
470 Consent Needed
480 Temporarily Unavailable
481 Call/Transaction Does Not Exist
482 Loop Detected
483 Too Many Hops
484 Address Incomplete
485 Ambiguous
486 Busy Here
487 Request Terminated
488 Not Acceptable Here
489 Bad Event
491 Request Pending
493 Undecipherable
494 Security Agreement Required
5xx—Server Failure Responses
Negative responses indicating that the fault is on the server’s side, for cases where the server cannot or does not want to respond to the request.
500 Server Internal Error
501 Not Implemented
502 Bad Gateway
503 Service Unavailable
504 Server Time-out
505 Version Not Supported
513 Message Too Large
580 Precondition Failure
6xx—Global Failure Responses
The server has definitive information that the request cannot be fulfilled at any server.
600 Busy Everywhere
603 Decline
604 Does Not Exist Anywhere
606 Not Acceptable
Mandatory SIP headers in a SIP response
SIP/2.0 200 OK
Via: SIP/2.0/UDP host.domain.com:5060
From: Bob <sip:bob@domain.com>
To: Altanai <sip:altanai@domain.com>
Call-ID: 163784@host.domain.com
CSeq: 1 INVITE
Via, From, To, Call-ID and CSeq are copied from the request; the UAS adds a tag to the To header if the request did not carry one.
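A rough sketch of how a response can be assembled from the request headers. The function `build_response` and its `to_tag` parameter are hypothetical; the sketch covers only the five headers listed above plus an empty body, and adding the To tag follows RFC 3261 behaviour for a UAS.

```python
COPIED = ("Via", "From", "To", "Call-ID", "CSeq")


def build_response(request_headers: dict, code: int, reason: str, to_tag=None) -> str:
    """Build a bodiless SIP response by copying headers from the request."""
    lines = [f"SIP/2.0 {code} {reason}"]
    for name in COPIED:
        value = request_headers[name]
        # the UAS contributes its own tag to To when the request had none
        if name == "To" and to_tag and ";tag=" not in value:
            value += f";tag={to_tag}"
        lines.append(f"{name}: {value}")
    lines.append("Content-Length: 0")
    return "\r\n".join(lines) + "\r\n\r\n"


request_headers = {
    "Via": "SIP/2.0/UDP host.domain.com:5060",
    "From": "Bob <sip:bob@domain.com>",
    "To": "Altanai <sip:altanai@domain.com>",
    "Call-ID": "163784@host.domain.com",
    "CSeq": "1 INVITE",
}
response = build_response(request_headers, 200, "OK", to_tag="a6c85cf")
```

The resulting message starts with `SIP/2.0 200 OK` and repeats the request's Via, From, Call-ID and CSeq values unchanged.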
3. UAS receives re-INVITE but waits for user intervention
The UAS receives a re-INVITE to add video but, instead of rejecting it, prompts the user for permission.
So the UAS provides a null IP address instead of setting the stream to ‘inactive’, because inactive streams still need to exchange RTP Control Protocol (RTCP) traffic.
Later, if the user rejects the addition of the video stream, the UAS sends an UPDATE request (6) setting the port of the video stream to zero in its offer.
Call RatConsistency of Call Records and duplicated charging records at various endpoints
A VoIP/CPaaS solution is designed to accommodate both signalling and media, along with integrations to various external endpoints such as SIP phones (desktop, softphones, WebRTC), telecom carriers, different VoIP network providers, enterprise applications (Skype, Microsoft Lync), trunks etc.
A sufficiently capable SIP platform should have
Audio calls ( optionally video ) service using SIP gateways
Media services (such as recording , conferencing, voicemail, and IVR )
Messaging and presence (could be using SIP SIMPLE, SMS, or a messaging service from third parties)
Interconnectivity with other IP multimedia systems, VoLTE ( optional interconnection with other types of communications networks as GSM or PSTN/ISDN).
Support for VoIP signalling protocols (SIP, H.323, SCCP, MGCP, IAX) and telephony signalling protocols (ISDN/SS7, FXS/FXO, Sigtran), either internally via pluggable modules or externally via gateways.
Performance factors :
High availability using redundant servers in standby
Load balancing
IPv4 and IPv6 network layer support
TCP, UDP, SCTP transport layer protocol support
DNS lookups and hop-by-hop connectivity
Security considerations :
Authentication, authorization, and accounting (AAA)
Digest authentication with credentials fetched from the backend
Media encryption
TLS and SRTP support
Topology hiding to prevent disclosing IPs of internal components in Via and Route headers
Firewalls, blacklists, filters and peak detectors to prevent DoS and DDoS attacks
The article only outlines SIP system architecture from 3 viewpoints :
Data Centers with BCP ( Business Continuity Planning ) and DR ( Disaster Recovery )
Servers and Clusters for faster and parallel calculating
Virtualization VMs to make a distributed computing environment with HA ( high availability ) and DRS ( Distributed Resource Scheduling )
Storage SAN with built-in redundancy for the resiliency of data. WORM compliant NAS for storing voice archives over a retention period.
Racks, power supplies, battery backups, cages etc.
Networking
DMZs (Demilitarized Zones), which are interfacing areas between internal servers in the green zone and the outside network
VLANs for segregation between tenants
Connectivity through the public Internet as well as through VPN or a dedicated optical fibre network for security
Firewall configuration
Load Balancer ( Layer 7 )
Reverse proxies for the security of internal IPs and ports
Security controls In compliance with ISO/IEC 27000 family – Information security management systems
PKI Infrastructure to manage digital certificates
Key management with HSM ( hardware security module )
Trusted CA (Certificate Authority) to issue publicly signed certificates for TLS (HTTPS, WSS etc.)
A SIP server can be moulded to take up any role based on the libraries and programs that run on it, such as gateway server, call manager, load balancer etc. This in turn defines its placement in the overall VoIP communication architecture. For example, stateless proxy servers are placed on the border, while application and B2BUA servers sit at the core.
SIP platform components
SIP Gateways
A SIP gateway is an application that interfaces a SIP network to a network utilising another signalling protocol. In terms of the SIP protocol, a gateway is just a special type of user agent, where the user agent acts on behalf of another protocol rather than a human. A gateway terminates the signalling path and can also terminate the media path .
To PSTN for telephony inter-working
To H.323 for IP Telephony inter-working
Client – originates message
Server – responds to or forwards message
Logical SIP entities are:
User Agent Client (UAC): Initiates SIP requests ….
User Agent Server (UAS): Returns SIP responses ….
Network Servers ….
Registrar Server
A registrar server accepts SIP REGISTER requests; all other requests receive a 501 Not Implemented response. The contact information from the request is then made available to other SIP servers within the same administrative domain, such as proxies and redirect servers. In a registration request, the To header field contains the name of the resource being registered, and the Contact header fields contain the contact or device URIs.
Proxy Server
A SIP proxy server receives a SIP request from a user agent or another proxy and acts on behalf of the user agent in forwarding or responding to the request. Just as a router forwards IP packets at the IP layer, a SIP proxy forwards SIP messages at the application layer.
Typically proxy servers (inbound or outbound) have no media capabilities and ignore the SDP. They are mostly bypassed once a dialog is established, but can add a Record-Route header to stay in the signalling path.
A proxy server usually also has access to a database or a location service to aid it in processing the request (determining the next hop).
1. Stateless Proxy Server: A proxy server can be either stateless or stateful. A stateless proxy server processes each SIP request or response based solely on the message contents. Once the message has been parsed, processed, and forwarded or responded to, no information (such as dialog information) about the message is stored. A stateless proxy never retransmits a message, and does not use any SIP timers.
2. Stateful Proxy Server: A stateful proxy server keeps track of requests and responses received in the past, and uses that information in processing future requests and responses. For example, a stateful proxy server starts a timer when a request is forwarded. If no response to the request is received within the timer period, the proxy will retransmit the request, relieving the user agent of this task.
3. Forking Proxy Server: A proxy server that receives an INVITE request, then forwards it to a number of locations at the same time, or forks the request. This forking proxy server keeps track of each of the outstanding requests and the responses. This is useful if the location service or database lookup returns multiple possible locations for the called party that need to be tried.
Redirect Server
A redirect server is a type of SIP server that responds to, but does not forward, requests. Like a proxy server, a redirect server uses a database or location service to look up a user. The location information, however, is sent back to the caller in a redirection class response (3xx), which, after the ACK, concludes the transaction. The Contact header in the response indicates where the request should be tried.
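The redirect behaviour can be sketched as a lookup that returns a 3xx status plus Contact values instead of forwarding anything. The `redirect` function, the `LOCATIONS` table and its addresses are invented for illustration, and answering 404 for an unknown user is one possible choice for this example.

```python
# Hypothetical location service: request URI -> registered contact URIs
LOCATIONS = {
    "sip:altanai@domain.com": ["sip:altanai@192.0.2.10:5060"],
    "sip:bob@domain.com": ["sip:bob@192.0.2.20:5060", "sip:bob@192.0.2.21:5060"],
}


def redirect(request_uri: str, location_db: dict):
    """Answer with (status, reason, contacts) without forwarding the request."""
    contacts = location_db.get(request_uri, [])
    if not contacts:
        return 404, "Not Found", []
    if len(contacts) == 1:
        return 302, "Moved Temporarily", contacts
    # several candidate locations: let the caller choose
    return 300, "Multiple Choices", contacts
```

The caller is then expected to retry the request against the returned Contact URIs itself.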
Application Server
The heart of the whole call routing setup. It loads and executes scripts for call handling at runtime and maintains transaction states and dialogs for all ongoing calls. It is usually the one to rewrite SIP packets, adding media relay servers and handling NAT. It also connects external services like accounting, CDR and stats to calls.
Media processing is usually provided by media servers in accordance with the SIP signalling. Bridges, call recording, voicemail, audio conferencing, and interactive voice response (IVR) are commonly used.
RFC 6230, Media Control Channel Framework, describes a framework and protocol for application deployment where the application programming logic and media processing are distributed.
Any one such service could be a combination of many smaller services; for example, voicemail is a combination of prompt playback, runtime controls, Dual-Tone Multi-Frequency (DTMF) collection, and media recording. See RFC 6231, Interactive Voice Response (IVR) Control Package for the Media Control Channel Framework.
Inband – with inband, digits are passed along just like the rest of your voice, as normal audio tones with no special coding or markers, using the same codec as your voice does, and are generated by your phone.
Outband – the incoming stream delivers DTMF signals out-of-audio using either SIP INFO or the RFC 2833 mechanism, independently of codecs; in this case, the DTMF signals are sent separately from the actual audio stream.
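For the RFC 2833 mechanism, each DTMF digit travels as a 4-byte telephone-event payload: event code, an end bit plus volume, and a duration in timestamp units. A minimal sketch of packing that payload follows; the function name `telephone_event` and the default volume are choices made for this example, and the surrounding RTP header is not built here.

```python
import struct

# Event codes for DTMF: digits 0-9, then * # A B C D
EVENTS = {str(d): d for d in range(10)}
EVENTS.update({"*": 10, "#": 11, "A": 12, "B": 13, "C": 14, "D": 15})


def telephone_event(digit: str, duration: int, volume: int = 10, end: bool = False) -> bytes:
    """Pack one telephone-event payload (event, E|R|volume, duration)."""
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    # bit 7 is the end-of-event marker, bit 6 is reserved (0), low 6 bits are volume
    flags = (0x80 if end else 0x00) | volume
    return struct.pack("!BBH", EVENTS[digit], flags, duration)
```

With an 8000 Hz clock, as in `a=rtpmap:101 telephone-event/8000`, a duration of 800 corresponds to 100 ms of key press.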
TTS ( Text to Speech )
Alexa Text-to-Speech (TTS) + Amazon Polly
Ivona – multi-language text-to-speech converter driven by SSML scripts such as the one below
<speak><p><s><prosody rate="slow">IVONA</prosody> means highest quality speech
synthesis in various languages.</s><s>It offers both male and female radio quality voices <break/> at a
sampling rate of 22 kHz <break/> which makes the IVONA voices a
perfect tool for professional use or individual needs.</s></p></speak>
check ivona status
service ivona-tts-http status
tail -f /var/log/tts.log
SIP defines basic methods such as INVITE, ACK and BYE, which can pretty much handle simple call routing along with some more advanced processes like call forwarding/redirection, call hold with optional music on hold, call parking, forking, barge etc.
Extending SIP headers
Newer SIP methods defined by later SIP RFCs include INFO, PRACK, PUBLISH, SUBSCRIBE, NOTIFY, MESSAGE, REFER and UPDATE. More methods or headers can be added to baseline SIP packets for customization specific to a particular service provider. When a SIP proxy finds a SIP header that it either does not support or does not understand, it will simply forward it to the specified endpoint.
Call routing Scripts
Interfaces for programming SIP call routing include :
Call Processing Language (SIP CPL)
Common Gateway Interface (SIP CGI)
SIP Servlets
Java API for Integrated Networks (JAIN APIs) etc.
Some known SIP stacks :
SailFin – SIP servlet container uses GlassFish open source enterprise Application Server platform (GPLv2), obsolete since merger from Sun Java to Oracle.
Mobicents – supports both JSLEE 1.1 and SIP Servlets 1.1 (GPLv2)
Cipango – extension of SIP Servlets to the Jetty HTTP Servlet engine thus compliant with both SIP Servlets 1.1 and HTTP Servlets 2.5 standards.
WeSIP – SIP and HTTP ( J2EE) converged application server built on OpenSER SIP platform
Additionally SIP stacks are available in almost all popular programming languages; they can be imported as libraries and used for building call routing scripts to be mounted on SIP servers or endpoints, such as :
PJSIP in C
JSSIP Javascript
Sofia-SIP, used in FreeSWITCH
Some popular SIP servers also have their own scripting interfaces, such as the Asterisk Gateway Interface (AGI), an application interface for extending the dialplan with your functionality in the language you choose – PHP, Perl, C, Java, Unix Shell and others
A sufficiently capable SIP platform should consist of the following features :
Performance factors :
High availability using redundant servers in standby
Load balancing
IPv4 and IPv6 support
Security considerations :
digest authentication and credentials fetched from backend
Media Encryption
TLS and SRTP support
Topology hiding to prevent disclosing IPs of internal components in Via and Route headers
Firewalls , blacklist, filters , peak detectors to prevent DoS and DDoS attacks .
Collecting and Processing PCAPS
VoIP monitor – network packet sniffer with commercial frontend for SIP RTP RTCP SKINNY(SCCP) MGCP WebRTC VoIP protocols
It uses a passive network sniffer (like tcpdump or wireshark) to analyse packets in realtime and transforms all SIP calls with associated RTP streams into database CDR records, which are sent over TCP to a MySQL server (remote or local). If saving of SIP / RTP packets is enabled, the sniffer stores each VoIP call into a separate file in native pcap format (on local storage).
To adapt SIP to modern IP networks with inter-network traversal, ICE and far-end and near-end NAT traversal solutions are used. Network address traversal is critical to traffic flow between private and public networks and from behind firewalls and policy controlled networks
One can use any of the VOVIDA-based STUN server, mySTUN , TurnServer, reStund , CoTURN , NATH (PJSIP NAT Helper), ReTURN, or ice4j
Near-end NAT traversal
STUN (session traversal utilities for NAT) – The UA itself detects the presence of a NAT and learns the public IP address and port assigned by the NAT. Then it replaces the device's local private IP address with it in the SIP and SDP headers. Implemented via STUN, TURN, and ICE. Limitations are that STUN does not work for symmetric NAT (each connection has a different mapping with a different/randomly generated port) and also in situations where an endpoint has multiple addresses.
TURN (traversal using relay around NAT) or STUN relay – The UA learns the public IP address of the TURN server and asks it to relay incoming packets. Limitation: since it handles all incoming and outgoing traffic, it must scale to meet traffic requirements and should not become the bottleneck junction or single point of failure.
ICE (interactive connectivity establishment) – UA gathers “candidates of communication” with priorities offered by the remote party. After this client pairs local candidates with received peer candidates and performs offer-answer negotiating by trying connectivity of all pairs, therefore maximising success. The types of candidates : – host candidate who represents clients’ IP addresses, – server reflexive candidate for the address that has been resolved from STUN – and a relayed candidate for the address which has been allocated from a TURN relay by the client.
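To make the candidate priorities mentioned above concrete, here is a small sketch of the priority formula recommended in RFC 8445 (type preference, local preference and component ID combined into one 32-bit value). The function names and sample addresses are illustrative assumptions, not part of any particular ICE library.

```python
# Recommended type preferences from RFC 8445: host > prflx > srflx > relay.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}

def candidate_priority(cand_type, component_id=1, local_preference=65535):
    """priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)."""
    return (TYPE_PREFERENCE[cand_type] << 24) + (local_preference << 8) + (256 - component_id)

def order_candidates(candidates):
    """Sort (type, address) tuples from highest to lowest priority."""
    return sorted(candidates, key=lambda c: candidate_priority(c[0]), reverse=True)

gathered = [
    ("relay", "203.0.113.5:3478"),    # allocated on a TURN relay
    ("host", "192.168.1.20:5004"),    # local interface address
    ("srflx", "198.51.100.7:40000"),  # learned through STUN
]
for cand_type, address in order_candidates(gathered):
    print(cand_type, address, candidate_priority(cand_type))
```

With these defaults a host candidate is tried before a server reflexive one, and a relayed candidate comes last, which matches the intent of preferring the most direct path.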
Far-end NAT traversal
The UA is not concerned about NAT at all and communicates using its local IP and port. The border controller includes a NAT handling component such as an application layer gateway (ALG) or universal plug and play (UPnP) etc., which resolves the private and public network address mapping by acting as a back to back user agent (B2BUA). Far-end NAT traversal can also be enabled by deploying a public SIP server which performs media relay (RTP Proxy/Media proxy).
Limitations of this approach (-) security risks as they are operating in the public network (-) enabling reverse traffic from UAS to UAC behind NAT.
A keep-alive mechanism is used to keep NAT translations of communications between SIP endpoint and its serving SIP servers opened , so that this NAT translation can be reused for routing. It contains client-to-server “ping” keep-alive and corresponding server-to-client “pong” messages. The 2 keep-alive mechanisms: a CRLF keep-alive and a STUN keep-alive message exchange.
The 3 types of SIP URIs,
address of record (AOR)
fully qualified domain name (FQDN)
globally routable user agent (UA) URI
SIP uniform resource identifiers (URIs) are resolved through DNS, since the part of the URI after the @ symbol contains the hostname, port and protocol for the next hop.
Adding record route headers for locating the correct SIP server for a SIP message can be done by : – DNS service record (DNS SRV) – naming authority pointer (NAPTR) DNS resource record
Steps for SIP endpoints locating SIP server
From SIP packet get the NAPTR record to get the protocol to be used
Inspect SRV record to fetch port to use
Inspect A/AAAA record to get IPv4 or IPv6 addresses
ref : RFC 3263 – Locating SIP Servers
Can use BIND9 server for DNS resolution; it supports NAPTR/SRV, ENUM, DNSSEC, multidomains, and private trees or public trees.
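The lookup order from RFC 3263 (NAPTR for the transport, SRV for host and port, A/AAAA for the address) can be sketched as below. To keep it runnable without a DNS server, the records are made-up in-memory data for example.com, and the SRV choice is simplified to lowest priority then highest weight, whereas a real client applies weighted random selection.

```python
# Hypothetical zone data; a real client would query DNS instead.
NAPTR_RECORDS = {
    "example.com": [
        # (order, preference, service, replacement)
        (20, 50, "SIP+D2U", "_sip._udp.example.com"),
        (10, 50, "SIP+D2T", "_sip._tcp.example.com"),
    ],
}
SRV_RECORDS = {
    # (priority, weight, port, target)
    "_sip._tcp.example.com": [
        (20, 10, 5060, "sip2.example.com"),
        (10, 60, 5060, "sip1.example.com"),
    ],
    "_sip._udp.example.com": [(10, 10, 5060, "sip1.example.com")],
}
ADDRESS_RECORDS = {
    "sip1.example.com": ["192.0.2.10"],
    "sip2.example.com": ["192.0.2.20"],
}
TRANSPORT_BY_SERVICE = {"SIP+D2U": "udp", "SIP+D2T": "tcp", "SIPS+D2T": "tls"}

def locate_sip_server(domain):
    """Return (transport, address, port) for the next hop of a SIP domain."""
    # Step 1: NAPTR record with the lowest order/preference selects the transport.
    _order, _pref, service, replacement = min(NAPTR_RECORDS[domain])
    transport = TRANSPORT_BY_SERVICE[service]
    # Step 2: SRV record gives the target host and port (simplified selection).
    _prio, _weight, port, target = min(
        SRV_RECORDS[replacement], key=lambda r: (r[0], -r[1])
    )
    # Step 3: A/AAAA record gives the IP address of the target.
    address = ADDRESS_RECORDS[target][0]
    return transport, address, port

print(locate_sip_server("example.com"))
```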
CDRs store call detail records along with proof of call, including timestamps, origination, destination, duration, rate etc. At the end of the month or any other term, the aggregated CDRs are cumulatively processed to generate the bill for a user. This heavy data stream needs to be processed accurately, which can be achieved by using data pipelines like AWS Kinesis or a Kafka event store.
The prime requirement for the system is to handle an enormous amount of call record data in realtime and cater to a number of producers and consumers.
The data is packed into a blob using Base64 encoding. Base64 only obfuscates the payload and is not encryption.
For good consistency, only a single shard should be responsible for processing one user account’s bill.
Data Streams for billing service
AWS Kinesis – Kinesis Data Streams is used for rapid and continuous data intake and aggregation. The type of data used can include IT infrastructure log data, application logs, social media, market data feeds, and web clickstream data. It supports data sharding (i.e. a number of call records grouped) and uses a partition key (a string, hashed with MD5) to determine which shard the record goes to.
(+) This system can handle a high volume of data in realtime and produce call-UUID-specific results which can be consumed by consumers waiting for the processed results
(-) If not consumed within a pre-specified time duration, the processed results expire and are irretrievable. One has to self-implement a publisher to store the processed results from the Kinesis stream in data stores like Redis / RDBMS or other storage locations like S3 or DynamoDB. If the pipeline crashes during operation, data is lost
(-) The data stream should have low latency while ingesting continuous data from the producer and presenting data to the consumer.
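The idea of routing all records of one account to the same shard through an MD5-hashed partition key can be sketched as follows. This is a simplified model that assumes equally sized hash ranges per shard; the function name and the account identifiers are hypothetical, and it does not call the AWS API.

```python
import hashlib
from collections import defaultdict

def shard_for(partition_key, shard_count):
    """Map a partition key to a shard index via its 128-bit MD5 hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    range_size = 2**128 // shard_count  # assumes evenly split hash key ranges
    return min(hash_value // range_size, shard_count - 1)

# Using the account id as the partition key keeps one account's CDRs together.
cdrs = [
    {"account": "acct-1001", "duration": 62},
    {"account": "acct-2002", "duration": 15},
    {"account": "acct-1001", "duration": 240},
]
by_shard = defaultdict(list)
for cdr in cdrs:
    by_shard[shard_for(cdr["account"], shard_count=4)].append(cdr)

for shard, records in sorted(by_shard.items()):
    print(shard, records)
```

Because the mapping is deterministic, every record keyed by the same account lands on the same shard, which is what makes single-shard bill processing per account possible.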
Call Rate and Accounting
Generally, data stream processing is used for critical and voluminous service usage such as – metering/billing – server activity, – website clicks, – geo-location of devices, people, and physical goods
Call rates are very critical for billing and charging the calls. Any updates from the customer, carriers or individuals need to propagate automatically and quickly to avoid discrepancies and negative margins. CDRs need to be processed sequentially and incrementally on a record-by-record basis or over sliding time windows, and used for a wide variety of analytics including correlations, aggregations, filtering, and sampling.
To achieve this, the following setup is ideal: accept the new rate sheet values via a web UI console or POST API and propagate them quickly to the main DB via AWS SQS, which is a queuing service, and AWS Lambda, which is a serverless trigger-based system. This ensures that any new input rates are updated in realtime, while fallback values are maintained in an S3 bucket too
Call Rate and Accounting using task pipes , Lambda serverless and queuing service
It is an advantage to plan ahead for connection with IMS such as OpenIMS, support for VoIP signalling protocols (SIP, H.323, SCCP, MGCP, IAX) and telephony signalling protocols ( ISDN/SS7, FXS/FXO, SIGTRAN ), either internally via pluggable modules or externally via gateways, or for SIP trunking integration via OTT providers / cloud telephony.
Adhere to Standard
The obvious starting milestone before making a full-scale carrier-grade, SIP-based VoIP system is to build a PBX for intra-enterprise communication. There are readily available solutions to make an IP telephony PBX, such as Kamailio, FreeSWITCH, Asterisk, Elastix and sipXecs. It is important to use the standard protocol and widely accepted media formats and codecs to ensure interoperability and reduce the compute and delay involved in protocol or media transcoding.
Database Integration
Need backend, cache, database integration to not only store routing rules with temporary variable values but also account details, call records details, access control lists etc. Should therefore extend integration with text-based DB, Redis, MySQL, PostgreSQL, OpenLDAP, and OpenRadius.
Consistency of Call Records and duplicated charging records at various endpoints
In current VoIP scenarios, a call may be passing through various telco providers, ISP and cloud telephony service providers where each system maintains its own call records and billing. This in my opinion is duplication and can be avoided by sharing a consistent data store possible in the blockchain. This is an experimental idea that I have further explored in this article
There are other external components to setup a VoIP solution apart from core voice servers and gateways, like the ones listed below. I will try to either add a detailed overall architecture diagram here or write about them in a separate article. Keep watching this space for updates
Payment Gateways
Billing and Invoice
Fraud Prevention
Contacts Integration
Call Analytics
API services
Admin Module
Number Management ( DIDs ) and porting
Call Tracking
Single Sign On and User Account Management with Oauth and SAML
Update : At the time of writing this article on SIP and related VoIP technologies I was a newbie in the VoIP domain, probably just out of college. However over the past decade, looking at the steady traffic to these articles, I have tried updating the same with new RFC standards and market trends. This is an updated version (2019).
The Session Initiation Protocol (SIP) is a multimedia signalling protocol that has evolved into the de facto communication standard for IP telephony. Today it forms the primary protocol for many Real Time Communication platforms which are integrated with telecom carriers and provide Cloud and IP based Services for applications such as robo/mass calls for advertising, API based calls like OTP generators, IVR announcements with DTMF input like customer care centres etc. In fact it would not be far from the truth to say that the converged platforms we find today are a result of SIP integrating with the IP world.
Converged platforms integrate audio, video, data, presence, instant messaging, voicemails and conference services into a single network . SIP is the key component to build an advanced converged IP communication platform or rich multimedia Real time communication service.
SIP can be used to create programmable APIs and complex call routing VoIP scripts such as PBX , SBC etc.
It has the support of many high quality open source and freeware SIP clients, servers, proxies and tools such as Kamailio, Asterisk, FreeSWITCH, SIPp, JAIN SIP etc. It is also supported on most standardised VoIP hardware and networks such as Cisco, Microsoft, Avaya, and Radvision.
It is standardized by the Internet Engineering Task Force (IETF) in documents such as RFC 3261, which describes SIP v2. Architecturally, the SIP request-response format (404, 301) is very similar to HTTP, and its addressing scheme resembles SMTP (sip:altanai@company.com).
We know the ISO OSI layers which serve as a standard model for data communications .
Physical Layer : Ethernet , USB , IEEE 802.11 WiFi, Bluetooth , BLE
Data Link Layer : ARP ( Address Resolution Protocol ) , PPP ( point to point protocol ) , MAC ( Media Access control ) , ATM , Frame Relay
Network Layer : IP (IPv4 / IPv6), ICMP, IPsec
Transport : TCP , UDP , SCTP
Session : PPTP ( Point to point tunnelling protocol) , NFS, SOCKS
Presentation : Codecs such as JPEG , GIF , SSL
Application : Application level like call manager / softphone, such as HTTP , FTP , DNS , SIP , RTSP , RTP
SIP is an application layer protocol
SIP and SDP as Application layer protocols
SIP ( Session Initiation Protocol) negotiates a session between 2 parties. It primarily exchanges headers that are used for setting up a call session, as in this example of an outgoing telephone call taken from a SIP INVITE.
Session Initiation Protocol (INVITE)
Request-Line: INVITE sip:altanai@telecomcompany.com;transport=tcp SIP/2.0
Method: INVITE
Request-URI: altanai@telecomcompany.com;transport=tcp
Request-URI User Part: altanai
Request-URI Host Part: telecomcompany.com
[Resent Packet: False]
Message Header
Via: SIP/2.0/TCP 1.2.3.4:5080;rport;branch=z9hG4bKceX7a2H2866cN
Transport: TCP
Sent-by Address: 1.2.3.4
Sent-by port: 5080
RPort: rport
Branch: z9hG4bKceX7a2H2866cN
Max-Forwards: 41
From: "+16014801797" <sip:+16014801797@1.2.3.4>;tag=7HKgjNQ6y2FSj
SIP Display info: "+16014801797"
SIP from address: sip:+16014801797@1.2.3.4
SIP from address User Part: +16014801797
E.164 number (MSISDN): 16014801797
Country Code: Americas (1)
SIP from address Host Part: 1.2.3.4
SIP from tag: 7HKgjNQ6y2FSj
To: <sip:altanai@telecomcompany.com;transport=tcp>
SIP to address: sip:altanai@telecomcompany.com;transport=tcp
SIP to address User Part: altanai
SIP to address Host Part: telecomcompany.com
SIP To URI parameter: transport=tcp
Call-ID: e10306be-0cfd-4b38-af3c-b2ada0827cef
CSeq: 126144925 INVITE
Contact: <sip:mod_sofia@1.2.3.4:5080;transport=tcp>
User-Agent: phone1
Allow: INVITE, ACK, BYE, CANCEL, OPTIONS, MESSAGE, INFO, UPDATE, REFER, NOTIFY
Supported: path, replaces
Allow-Events: talk, hold, conference, refer
Privacy: none
Content-Type: application/sdp
Content-Disposition: session
Content-Length: 249
SIP Display info: "+16014801797"
SIP PAI Address: sip:+16014801797@1.2.3.4
The SIP philosophy :
reuse Internet addressing (URLs, DNS, proxies)
utilize rich Internet feature set
reuse HTTP coding
text based
makes no assumptions about underlying protocol: TCP, UDP, X.25, frame, ATM, etc
support of multicast
A SIP URI can either be in the format sip:altanai@telecomcompany.com (RFC 2543) or sips:altanai@telecomcompany.com (secure, with TLS over TCP, RFC 3261). Additionally SIP URI resolution can either be
DNS SRV based such as altanai@telecomcompany.com with SIP server locating records for domain “telecomcompany.com” or
FQDN ( Fully qualified domain name ) / contact / IP address based such as altanai@2.2.2.2 or altanai@us-west1-prod-server . Both of which do not need any resolution for routing.
Tags are pseudo-random numbers inserted in To or From headers to uniquely identify a call leg
Max-Forwards is a count decremented by each proxy that forwards the request. When the count goes to zero, the request is discarded and a 483 Too Many Hops response is sent. Used for stateless loop detection.
Content-Type indicates the type of message body attachment. In this case application/sdp, but others could be text/plain, application/cpl+xml, etc.
Content-Length indicates the octet (byte) count of the message body
Contact gives a direct route to contact the sender, composed of a SIP URI with a user name and IP or FQDN. Used for later requests to directly reach the destination, such as ACK after INVITE
Via gives the last SIP hop as IP, transport, and transaction-specific parameters, along with the branch that identifies the transaction. Each proxy adds an additional Via header. Finally, the Via headers are used to route back the responses. This ensures that responses do not have to rely on DNS and location tables to find their way back.
Firewalls can sometimes block SIP packets, change TCP to UDP or change the IP address of the packets. Record-Route can be used to ensure the firewall proxy stays in the path. Clients and servers copy Record-Route and put it in the Route header for all subsequent messages.
Message body is separated from SIP header fields by a blank line (CRLF).
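As a minimal sketch of how the start line, headers and body described above fit together, the snippet below splits a raw SIP message on the blank line and collects the headers. The parser is a hypothetical helper, it ignores header line folding and compact header names, and the sample reuses a few headers from the INVITE shown earlier.

```python
def parse_sip_message(raw):
    """Split a raw SIP message into (start_line, headers, body)."""
    # The blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; values such as Via contain colons too.
        name, _, value = line.partition(":")
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return start_line, headers, body

raw_invite = "\r\n".join([
    "INVITE sip:altanai@telecomcompany.com;transport=tcp SIP/2.0",
    "Via: SIP/2.0/TCP 1.2.3.4:5080;rport;branch=z9hG4bKceX7a2H2866cN",
    "Max-Forwards: 41",
    "Call-ID: e10306be-0cfd-4b38-af3c-b2ada0827cef",
    "CSeq: 126144925 INVITE",
    "Content-Type: application/sdp",
    "",
    "v=0\r\n",
])

start_line, headers, body = parse_sip_message(raw_invite)
print(start_line)
print(headers["via"], headers["max-forwards"])
```

Header values are kept in lists because headers such as Via can legitimately appear several times in one message.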
INVITE : Initiates negotiation to establish a session ( dialog). Usually contains SDP payload.
Another INVITE during an existing session ( dialog) is called a re-INVITE. A re-INVITE can be used to hold / resume a call and to change session parameters and codecs in the middle of a call
ACK : Acknowledge an INVITE request by completing the 3 way handshake.
If an INVITE did not contain the media description (SDP) then the ACK must contain it .
BYE : Ends a session ( dialog).
CANCEL : Cancels a session( dialog) before it establishes .
REGISTER : Registers a user location (host name, IP) on a registrar SIP server.
OPTIONS : Communicates information about the capabilities of the calling and receiving SIP phones ( methods , extensions , codecs etc )
PRACK : Provisional Acknowledgement for a provisional response such as 183 ( Session Progress). PRACK only applies to 101-199 responses .
SUBSCRIBE : Subscribes for notification from the notifier. Can use Expires: 0 to unsubscribe.
NOTIFY : Notifies the subscriber of a new event.
PUBLISH : Publishes an event to the Server.
INFO : Sends mid session information.
REFER : Asks the recipient to issue call transfer.
MESSAGE : Transports Instant Messages.
UPDATE : Modifies the state of a session ( dialog).
SIP responses :
1xx = Informational SIP Responses
100 Trying, 180 Ringing, 183 Session Progress
2xx = Success Responses
200 OK – Shows that the request was successful
3xx = Redirection Responses
4xx = Request Failures
401 Unauthorized, 404 Not Found, 405 Method Not Allowed, 407 Proxy Authentication Required, 408 Request Timeout, 480 Temporarily Unavailable, 481 Call/Transaction Does Not Exist, 482 Loop Detected, 483 Too Many Hops, 486 Busy Here, 487 Request Terminated, 488 Not Acceptable Here
5xx = Server Errors
500 Server Internal Error, 503 Service Unavailable
6xx = Global Failures
600 Busy Everywhere, 603 Decline, 604 Does Not Exist Anywhere, 606 Not Acceptable
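The response classes listed above follow directly from the first digit of the status code, which a small helper can illustrate. The function names are hypothetical.

```python
RESPONSE_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Request Failure",
    5: "Server Error",
    6: "Global Failure",
}

def classify_response(code):
    """Return the response class name for a SIP status code."""
    if not 100 <= code <= 699:
        raise ValueError(f"not a SIP status code: {code}")
    return RESPONSE_CLASSES[code // 100]

def is_final(code):
    """1xx responses are provisional; everything from 200 upwards is final."""
    return code >= 200

for code in (100, 183, 200, 302, 486, 503, 603):
    print(code, classify_response(code), "final" if is_final(code) else "provisional")
```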
SIP callflow diagram for a Call Setup and termination using RTP for media and RTCP for control.
SIP can bear many kinds of MIME attachments , one such is SDP. SDP contains session metadata used for establishing the session. It defines media information and capabilities such as codecs and formats , timestamps , termination points like address , ports. Additionally it can also convey other details like bandwidth and contact for the node acting as proxy for the session.
Session Description Protocol Version (v): 0
Owner/Creator, Session Id (o): FreeSWITCH 1532932581 1532932582 IN IP4 1.2.3.4
Owner Username: FreeSWITCH
Session ID: 1532932581
Session Version: 1532932582
Owner Network Type: IN
Owner Address Type: IP4
Owner Address: 1.2.3.4
Session Name (s): FreeSWITCH
Connection Information (c): IN IP4 1.2.3.4
Connection Network Type: IN
Connection Address Type: IP4
Connection Address: 1.2.3.4
Time Description, active time (t): 0 0
Session Start Time: 0
Session Stop Time: 0
Media Description, name and address (m): audio 29398 RTP/AVP 0 101
Media Type: audio
Media Port: 29398
Media Protocol: RTP/AVP
Media Format: ITU-T G.711 PCMU
Media Format: DynamicRTP-Type-101
Media Attribute (a): rtpmap:0 PCMU/8000
Media Attribute Fieldname: rtpmap
Media Format: 0
MIME Type: PCMU
Sample Rate: 8000
Media Attribute (a): rtpmap:101 telephone-event/8000
Media Attribute Fieldname: rtpmap
Media Format: 101
MIME Type: telephone-event
Sample Rate: 8000
Media Attribute (a): fmtp:101 0-16
Media Attribute Fieldname: fmtp
Media Format: 101 [telephone-event]
Media format specific parameters: 0-16
Media Attribute (a): silenceSupp:off - - - -
Media Attribute Fieldname: silenceSupp
Media Attribute Value: off - - - -
Media Attribute (a): ptime:20
Media Attribute Fieldname: ptime
Media Attribute Value: 20
v=0 indicates the start of the SDP content.
o=FreeSWITCH 1532932581 1532932582 IN IP4 1.2.3.4 , is session origin and owner’s name
c=IN IP4 1.2.3.4 is the connection data specifying the IP address of the session.
m= is the media description: media type audio, port 29398, profile RTP/AVP, payload formats 0 and 101
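The dissected SDP above corresponds to a handful of plain "type=value" lines, and a rough parser for the fields discussed here could look like the sketch below. The raw text is reconstructed from the dissection, and parse_sdp is a hypothetical helper that only understands the c=, m= and a=rtpmap lines.

```python
SDP = "\r\n".join([
    "v=0",
    "o=FreeSWITCH 1532932581 1532932582 IN IP4 1.2.3.4",
    "s=FreeSWITCH",
    "c=IN IP4 1.2.3.4",
    "t=0 0",
    "m=audio 29398 RTP/AVP 0 101",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:101 telephone-event/8000",
    "a=fmtp:101 0-16",
    "a=ptime:20",
])

def parse_sdp(sdp):
    """Extract the connection address and media descriptions from an SDP body."""
    session = {"connection": None, "media": []}
    for line in sdp.splitlines():
        if len(line) < 2 or line[1] != "=":
            continue  # skip anything that is not a type=value line
        kind, value = line[0], line[2:]
        if kind == "c":
            session["connection"] = value.split()[2]  # "IN IP4 <address>"
        elif kind == "m":
            media_type, port, proto, *formats = value.split()
            session["media"].append({
                "type": media_type, "port": int(port),
                "proto": proto, "formats": formats, "rtpmap": {},
            })
        elif kind == "a" and value.startswith("rtpmap:") and session["media"]:
            payload_type, encoding = value[len("rtpmap:"):].split(" ", 1)
            session["media"][-1]["rtpmap"][payload_type] = encoding
    return session

parsed = parse_sdp(SDP)
print(parsed["connection"], parsed["media"][0]["port"], parsed["media"][0]["rtpmap"])
```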
SIP transaction consists of a single request and any responses to that request, which include zero or more provisional responses and one or more final responses.
A transaction consists of a Request, any non-final (1xx) Responses received, and a final Response (2xx, 3xx, 4xx, 5xx, or 6xx). For a successful (2xx) final response to an INVITE, the ACK is not considered part of this transaction and is a new transaction.
When the final response is non-successful, such as an INVITE answered with 100 and then 405, the ACK is part of the same transaction.
Hence, for positive replies (2xx), a new transaction is created for the ACK with the new Contact header, and it can be sent straight to the UAS bypassing the proxy.
For negative replies, the ACK stays part of the INVITE transaction, hence the request is sent to the same proxy as the INVITE.
Examples
for ACK given below , tid=-d8754z-deea18278a05ce16-1—d8754z-
SIP entities that have notion of transactions are called stateful.
Branch
The branch parameter is a transaction identifier. Responses relating a request can be correlated because they will contain the same transaction identifier.
Dialog
The peer-to-peer relationship between 2 SIP endpoints, containing a sequence of transactions, is called the dialog. The initiator of the session that generates the establishing INVITE generates the unique Call-ID and From tag. In the response to the INVITE, the user agent answering the request will generate the To tag. The combination of the local tag (contained in the From header field), remote tag (contained in the To header field), and the Call-ID uniquely identifies the established session, known as a dialog. This dialog identifier is used by both parties to identify this call because there could be multiple calls set up between them.
A dialog is uniquely identified by: Call-ID header , remote-tag and local-tag.
DialogId is different for both ends since local and remote for both ends are different.
Example : Notice the To and From tag ids in the INVITE and its 200 OK. The dialog id for the INVITE is 97576NjQ5MTBlNjVjNDQ0MzFmOTEyZGEzYWJjZjQxYjcyYzc70edc66c. Since it is the first INVITE, it does not bear the To tag.
The combination of the To tag, From tag, and Call-ID completely defines a peer-to-peer SIP relationship between endpoints and is referred to as a dialog.
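A small sketch of the dialog identifier described above: the same three values are seen from both ends, but local and remote tags swap places. The Call-ID and From tag are taken from the INVITE example in this article; the To tag and the helper names are made up for illustration.

```python
from typing import NamedTuple

class DialogId(NamedTuple):
    call_id: str
    local_tag: str
    remote_tag: str

def dialog_id_for_caller(call_id, from_tag, to_tag):
    """On the caller (UAC) side the From tag is local and the To tag is remote."""
    return DialogId(call_id, local_tag=from_tag, remote_tag=to_tag)

def dialog_id_for_callee(call_id, from_tag, to_tag):
    """On the callee (UAS) side the To tag is local and the From tag is remote."""
    return DialogId(call_id, local_tag=to_tag, remote_tag=from_tag)

call_id = "e10306be-0cfd-4b38-af3c-b2ada0827cef"
from_tag = "7HKgjNQ6y2FSj"
to_tag = "a6c85cf"  # hypothetical tag added by the answering user agent

caller_view = dialog_id_for_caller(call_id, from_tag, to_tag)
callee_view = dialog_id_for_callee(call_id, from_tag, to_tag)
print(caller_view)
print(callee_view)
```

Using the identifier as a dictionary key is a convenient way for a user agent to look up the state of one call among several parallel calls with the same peer.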
All requests sent within a dialog are by default sent directly from one user agent to the other. Only requests outside a dialog traverse SIP proxies. This approach makes SIP network more scalable because only a small number of SIP messages hit the proxies.
However, some requests must stay on the path of the proxies, for example for accounting at call termination or when NAT traversal is being handled by a proxy. For these, a proxy inserts a Record-Route header field containing its own address into the SIP message. Messages sent within the dialog will then traverse all SIP proxies that put a Record-Route header field into the message.
The server copies the Record-Route header field unchanged into the response (Record-Route is only relevant for 2xx responses), i.e. the recipient endpoint will also mirror the proxies for the response.
without Record Routing
with record routing
Strict Routing
Strict routing rewrites the Request-URI, i.e. the Request-URI always contains the URI of the next hop, so it is necessary to save the original Request-URI as the last Route header field. Defined in RFC 2543.
Loose routing
The Request-URI is no longer overwritten; it always contains the URI of the destination user agent, thereby keeping the target separated from the route (indicated by the ;lr parameter). If there are any Route header fields in a message, the message is sent to the URI from the topmost Route header field. Defined in RFC 3261.
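The difference between the two routing styles can be sketched as a function that takes the remote target and the route set and decides the Request-URI, the Route headers and the next hop. This is a simplified illustration with a hypothetical function name; URIs are plain strings without angle brackets and the ;lr check is a naive parameter match.

```python
def apply_route_set(request_uri, route_set):
    """Return (request_uri, route_headers, next_hop) for a request in a dialog."""
    if not route_set:
        # No proxies recorded themselves: send straight to the remote target.
        return request_uri, [], request_uri
    top = route_set[0]
    params = [p.lower() for p in top.split(";")[1:]]
    if "lr" in params:
        # Loose routing: Request-URI keeps the target, routes are left intact.
        return request_uri, list(route_set), top
    # Strict routing: the top route becomes the Request-URI and the original
    # target is saved as the last Route header field.
    return top, list(route_set[1:]) + [request_uri], top

target = "sip:bob@192.0.2.4"
loose = apply_route_set(target, ["sip:p1.example.com;lr", "sip:p2.example.com;lr"])
strict = apply_route_set(target, ["sip:p1.example.com", "sip:p2.example.com;lr"])
print(loose)
print(strict)
```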
SIP Authorization
Authentication, security, confidentiality and integrity form the basic requirements for any communication system. To protect against account hijacking and denial-of-service attacks, SIP uses the HTTP digest authentication mechanism with nonces and challenges, carried in 407 Proxy Authentication Required and 401 Unauthorized responses. The sender has to resend the request with an MD5 hash computed over the nonce and password ( the password is never sent in the clear ), so an eavesdropper cannot read the password off the wire.
Challenge / Response Scheme :
UA sends REGISTER and receives a 401 / 407 challenge + nonce
UA sends REGISTER again + MD5 hash (password + nonce) and gets a 200 OK
REGISTER using HTTP Digest for authentication using TLS transport, challenge is in form
Here qop is Quality Of Protection param indicating quality of protection that the client has applied to the message. qop=1 (enabled) will help you to avoid replay attacks.
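The hash sent in the second REGISTER follows the HTTP digest scheme: HA1 covers the credentials, HA2 covers the method and URI, and both are hashed together with the nonce. The sketch below shows that computation with and without qop=auth; the function name and the sample credentials are hypothetical.

```python
import hashlib

def _md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def digest_response(username, realm, password, method, uri, nonce,
                    qop=None, nc="00000001", cnonce=""):
    """Compute the 'response' value of a Digest Authorization header."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")
    ha2 = _md5_hex(f"{method}:{uri}")
    if qop == "auth":
        # With qop, the nonce count and client nonce protect against replays.
        return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")

response = digest_response(
    username="altanai", realm="telecomcompany.com", password="secret",
    method="REGISTER", uri="sip:telecomcompany.com",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
    qop="auth", cnonce="0a4f113b",
)
print(response)
```

The server performs the same computation with its stored credentials and compares the result, so the password itself never travels over the network.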
Cancellation of Registration – The UA sends a REGISTER request with Expires: 0 and Contact: * to apply to all contacts. Since the user is already authenticated, it is not challenged again .
To prevent spoofing, i.e. impersonating the server, SIP provides server authentication too. Required by ITSPs ( Internet telephony service providers ) .
End to end encryption is achieved through TLS and SRTP.
According to RFC 3263 Session Initiation Protocol (SIP): Locating SIP Servers, if the proxy finds that the request is for an outside domain, it will take the help of a DNS server to resolve the IP address of the target domain and forward the request. Then the target domain proxy uses the registrar’s discovery services to find out if the user is present in the host via a location table entry. If found, the request reaches the user .
To provide session mobility, SIP endpoints send REGISTER requests to their respective registrar as they move and update their location. As users change terminals, they register themselves with the appropriate server:
Location server tracks the location of the user
Redirect servers prioritise the possible locations of the user
Users keep the same services as located at the home server; while mobile, the call is processed by home servers using Record-Route
A Network Address Translator (NAT), defined by RFC 3022, conserves public address space, as most packets are exchanged inside a private network itself.
Most internet users, whether they are using WiFi, 3G/LTE, a home AP, or any other telecom data packet network run by a TSP or ISP, are assigned a private IP address, which is unreachable from the outside world. The private address blocks, reserved by the Internet Assigned Numbers Authority (IANA), are 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16.
Therefore, when they access the Internet, this address is converted into a globally unique public IP address through a NAT for external communication
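Whether an address advertised in SIP or SDP falls inside one of these private blocks can be checked with Python's standard ipaddress module. This is a small sketch; is_private_address is a hypothetical helper name.

```python
import ipaddress

# The three private address blocks listed above.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_private_address(address):
    """True if the IPv4 address lies in one of the private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)

for candidate in ("192.168.1.20", "172.31.255.1", "172.32.0.1", "8.8.8.8"):
    print(candidate, is_private_address(candidate))
```

A user agent or border element could run such a check on the Contact header or the SDP connection address to decide whether NAT traversal is needed.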
SIP Issues around NAT
NATs modify IP addresses (Layer 3), while SIP/SDP are Layer 7 protocols whose payload is transparent to the NAT
SIP Via:, From: and Contact: headers use non-routable private addresses
SDP states that the originator wishes to receive media at non-routable private addresses
If a destination on the public internet tries to send SIP or RTP traffic to those private addresses, the traffic will be dropped by the first router
Solutions are to use either an Application Level Gateway (ALG), STUN, or Universal Plug and Play (UPnP)
To rewrite all SIP/SDP source addresses
SIP Via:, From: and Contact: headers use public NAT address
SDP addresses use NAT public address
Use SIP over TCP
Use draft-ietf-sip-symmetric-response-00 and “Symmetric” SIP/RTP
Use same UDP port number for incoming/outgoing
Hold ports open for call duration
Send UDP packet typically every 30 seconds
SIP over UDP uses 30 second re-INVITE, REGISTER or OPTIONS
RTP sends at much higher frequency by default
NAPT ( Network Address Port Translator ) – Can map multiple private IP addresses and ports to one public IP address and ports
To adapt SIP to modern IP networks with inter-network traversal, ICE and far-end and near-end NAT traversal solutions are used. Network address traversal is critical to traffic flow between private and public networks and from behind firewalls and policy controlled networks
STUN (session traversal utilities for NAT) – The UA itself detects the presence of a NAT and learns the public IP address and port assigned by the NAT. Then it replaces the device's local private IP address with it in the SIP and SDP headers. Implemented via STUN, TURN, and ICE. (-) does not work for symmetric NAT (each connection has a different mapping with a different/randomly generated port) (-) does not work when an endpoint has multiple addresses.
TURN (traversal using relay around NAT) or STUN relay – The UA learns the public IP address of the TURN server and asks it to relay incoming packets. (-) since it handles all incoming and outgoing traffic, it must scale to meet traffic requirements and should not become the bottleneck junction or single point of failure.
The UA is not concerned about NAT at all and communicates using its local IP and port. The border controller includes a NAT handling component such as an application layer gateway (ALG) or universal plug and play (UPnP) etc., which resolves the private and public network address mapping by acting as a back to back user agent (B2BUA).
ICE (interactive connectivity establishment) – UA gathers “candidates of communication” with priorities offered by the remote party. After this client pairs local candidates with received peer candidates and performs offer-answer negotiating by trying connectivity of all pairs, therefore maximising success. The types of candidates – host candidate who represents clients’ IP addresses, – server reflexive candidate for the address that has been resolved from STUN – relayed candidate for the address which has been allocated from a TURN relay by the client.
Far-end NAT traversal can also be enabled by deploying a public SIP server which performs media relay (RTP Proxy/Media proxy).
(-) security risks from operating in the public network (-) enabling reverse traffic from UAS to UAC behind NAT.
A keep-alive mechanism is used to keep NAT translations of communications between SIP endpoint and its serving SIP servers opened , so that this NAT translation can be reused for routing. It contains client-to-server “ping” keep-alive and corresponding server-to-client “pong” messages. The 2 keep-alive mechanisms: a CRLF keep-alive and a STUN keep-alive message exchange.
Location Server – Used by the Proxy Server and Redirect Server to obtain the location of the called user (one or more addresses)
Registration Server – Accepts registration requests from the client applications . Generally, the service is offered by the Proxy Server or Redirect Server
DNS Server – Used to locate the Proxy Server or Redirect Server using NAPTR or SRV records
The 3 types of SIP URIs,
address of record (AOR)
fully qualified domain name (FQDN)
globally routable user agent (UA) URI
SIP uniform resource identifiers (URIs) are identified based on DNS resolution since the URI after @ symbol contains hostname , port and protocol for the next hop.
Adding record route headers for locating the correct SIP server for a SIP message can be done by :
DNS service record (DNS SRV)
naming authority pointer (NAPTR) DNS resource record
Steps for SIP endpoints locating SIP server
From SIP packet get the NAPTR record to get the protocol to be used
Inspect SRV record to fetch port to use
Inspect A/AAAA record to get IPv4 or IPv6 addresses
ref : RFC 3263 – Locating SIP Servers
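The lookup steps above can be sketched as follows. To keep the example self-contained it does not query a real DNS server: `SAMPLE_DNS` is a hypothetical in-memory table standing in for one, and the record values are invented (modelled on the style of the RFC 3263 examples). The NAPTR service names `SIP+D2U`, `SIP+D2T` and `SIPS+D2T` are the ones defined in RFC 3263.

```python
# NAPTR service field -> transport, as defined in RFC 3263.
NAPTR_SERVICES = {"SIP+D2U": "udp", "SIP+D2T": "tcp", "SIPS+D2T": "tls"}

# Hypothetical records standing in for a real DNS server.
SAMPLE_DNS = {
    ("example.com", "NAPTR"): [
        {"order": 50, "preference": 50, "service": "SIPS+D2T", "replacement": "_sips._tcp.example.com"},
        {"order": 90, "preference": 50, "service": "SIP+D2T", "replacement": "_sip._tcp.example.com"},
        {"order": 100, "preference": 50, "service": "SIP+D2U", "replacement": "_sip._udp.example.com"},
    ],
    ("_sips._tcp.example.com", "SRV"): [
        {"priority": 0, "weight": 1, "port": 5061, "target": "server1.example.com"},
    ],
    ("_sip._tcp.example.com", "SRV"): [
        {"priority": 0, "weight": 1, "port": 5060, "target": "server1.example.com"},
        {"priority": 0, "weight": 2, "port": 5060, "target": "server2.example.com"},
    ],
    ("server1.example.com", "A"): ["192.0.2.1"],
    ("server2.example.com", "A"): ["192.0.2.2"],
}


def pick_naptr(records, supported):
    """Step 1: lowest order, then lowest preference, among supported transports."""
    usable = [r for r in records if NAPTR_SERVICES.get(r["service"]) in supported]
    if not usable:
        return None
    return min(usable, key=lambda r: (r["order"], r["preference"]))


def order_srv(records):
    """Step 2: lowest priority first.

    Simplification: equal priorities are ordered by descending weight,
    whereas RFC 2782 asks for a weighted random choice.
    """
    return sorted(records, key=lambda r: (r["priority"], -r["weight"]))


def locate(domain, dns, supported=("udp", "tcp", "tls")):
    """Return (transport, address, port) tuples to try, in order."""
    naptr = pick_naptr(dns.get((domain, "NAPTR"), []), supported)
    if naptr is None:
        return []
    transport = NAPTR_SERVICES[naptr["service"]]
    targets = []
    for srv in order_srv(dns.get((naptr["replacement"], "SRV"), [])):
        for rtype in ("AAAA", "A"):  # Step 3: resolve the SRV target
            for addr in dns.get((srv["target"], rtype), []):
                targets.append((transport, addr, srv["port"]))
    return targets


if __name__ == "__main__":
    print(locate("example.com", SAMPLE_DNS))
    print(locate("example.com", SAMPLE_DNS, supported=("udp", "tcp")))
```

A production client would replace the dictionary with real DNS queries and would also handle the fallbacks RFC 3263 describes for domains that publish no NAPTR or SRV records.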
A BIND9 server can be used for DNS resolution; it supports NAPTR/SRV, ENUM, DNSSEC, multiple domains, and private or public trees.
The caller sends an INVITE, but the Redirect Server responds with 302 Moved Temporarily and returns a new destination address. The INVITE is then forwarded to another proxy server, which connects the SIP endpoints after consulting the Redirect Server.
At this stage we see the call getting connected to the SIP endpoint via 2 proxy servers. The Redirect Server does not stay in the path once the initial SIP request has been sent.
After the communication ends, the endpoints send BYE to terminate the session.
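As a rough sketch of the redirect step, the function below pulls the new destination(s) out of the Contact header of a 3xx response so that the INVITE can be retried there. The function name and the sample message are assumptions for illustration, and the header handling is simplified (no folded header lines, no commas inside quoted display names).

```python
def redirect_targets(response_text):
    """Return the Contact URIs of a 3xx response, or [] for any other response."""
    lines = response_text.split("\r\n")
    status = lines[0].split(" ", 2)  # e.g. ["SIP/2.0", "302", "Moved Temporarily"]
    if len(status) < 2 or not status[1].startswith("3"):
        return []
    targets = []
    for line in lines[1:]:
        if line == "":  # blank line marks the end of the headers
            break
        name, _, value = line.partition(":")
        if name.strip().lower() in ("contact", "m"):  # "m" is the compact form
            for part in value.split(","):
                part = part.strip()
                if "<" in part and ">" in part:
                    targets.append(part[part.index("<") + 1:part.index(">")])
                elif part:
                    targets.append(part.split(";")[0])
    return targets


# Hypothetical 302 response from a redirect server.
SAMPLE_302 = (
    "SIP/2.0 302 Moved Temporarily\r\n"
    "Via: SIP/2.0/UDP client.example.com:5060;branch=z9hG4bK74bf9\r\n"
    "Contact: <sip:bob@chicago.example.com;transport=tcp>\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

if __name__ == "__main__":
    print(redirect_targets(SAMPLE_302))
```

The caller acknowledges the 302 with an ACK and then sends a fresh INVITE to the returned URI, which is why the redirect server drops out of the signalling path.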
This call flow deals with the use case where a user may be registered from multiple SIP phones (perhaps one at home, one in the car and one on the office desk) and wants all registered phones to ring, i.e. the call is forked to multiple endpoints.
In the above diagram we can see a forked INVITE going to both SIP phones. Both of them reply with 100 Trying and 180 Ringing, but only 1 gets answered by the user.
After one endpoint sends 200 OK and joins the session, the other receives a CANCEL from the SIP server.
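The forking behaviour can be modelled with a toy simulation like the one below. It only records the order of messages a forking proxy would send and receive; there is no real SIP stack involved, and the function name and contact URIs are hypothetical.

```python
def fork_call(contacts, answered_by):
    """Simulate parallel forking and return the message log.

    `contacts` are the registered contact URIs for the called user.
    `answered_by` is the contact that picks up, or None if nobody answers.
    """
    log = [("INVITE", c) for c in contacts]        # proxy forks the INVITE
    log += [("180 Ringing", c) for c in contacts]  # every phone rings
    if answered_by in contacts:
        log.append(("200 OK", answered_by))        # first answer wins
        # The remaining branches are cancelled by the proxy.
        log += [("CANCEL", c) for c in contacts if c != answered_by]
    return log


if __name__ == "__main__":
    phones = ["sip:bob@home.example.com", "sip:bob@office.example.com"]
    for message, contact in fork_call(phones, "sip:bob@home.example.com"):
        print(message, contact)
```

In a real deployment the cancelled phone answers the CANCEL with 200 OK and the original INVITE with 487 Request Terminated; the simulation leaves those out for brevity.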
Using SIP-based call routing algorithms and flows, one can build a carrier-grade communication solution. SIP solutions can hook into existing telecom networks and service providers to stay backward compatible. They also have largely untapped potential to integrate with any external IP application or service, providing converged, customised control over both the signalling and media planes.
SIP Registration – Successful New Registration, Update of Contact List, Request for Current Contact List, Cancellation of Registration, Unsuccessful Registration
SIP Session Establishment
Session Establishment Through Two Proxies,
Session with Multiple Proxy Authentication,
Successful Session with Proxy Failure,
Session Through a SIP ALG,
Session via Redirect and Proxy Servers with SDP in ACK,
Session with re-INVITE (IP Address Change),
Unsuccessful No Answer, Unsuccessful Busy, Unsuccessful No Response from User Agent, Unsuccessful Temporarily Unavailable,
Security Considerations
RFC 5359 – Session Initiation Protocol Service Examples
It describes services such as
Call Hold, Consultation Hold, Music on Hold,
Transfer – Unattended, Transfer – Attended, Transfer – Instant Messaging,