WebRTC stands for Web Real-Time Communication. It introduces a real-time media framework in the browser core, alongside JavaScript APIs for controlling that media framework and HTML5 tags for displaying the media.
If you are new to WebRTC, read What is WebRTC? first. From a technical point of view, WebRTC hides all the complexity of real-time media behind a very simple JavaScript API.
Codec Confusion :
Video Codecs
Currently VP8 is the codec of choice since it is royalty-free. On mobile, the codec of choice today is H.264. H.264 is not royalty-free, but it is native in most mobile handsets due to its high performance.
Audio Codecs
Opus is a lossy audio compression format developed by the Internet Engineering Task Force (IETF), targeting a broad range of interactive real-time applications over the Internet, from speech to music. It is an open format standardized through RFC 6716, and a reference implementation is provided under the 3-clause BSD license. All known software patents which cover Opus are licensed under royalty-free terms.
G.711 is an ITU (International Telecommunication Union) standard for audio compression. It is primarily used in telephony. The standard was released in 1972. It is the required standard in many voice-based systems and technologies, for example in the H.320 and H.323 specifications.
Speex is a patent-free audio compression format designed for speech, and also a free software speech codec that is used in VoIP applications and podcasts. Some consider Speex obsolete, with Opus as its official successor, but since significant content is out there using Speex, it will not disappear anytime soon.
G.722 is an ITU standard 7 kHz wideband audio codec operating at 48, 56 and 64 kbit/s. It was approved by ITU-T in 1988. G.722 provides improved speech quality due to a wider speech bandwidth of 50–7000 Hz, compared to 300–3400 Hz for G.711.
AMR-WB (Adaptive Multi-Rate Wideband) is a patented wideband speech coding standard that provides improved speech quality due to a wider speech bandwidth of 50–7000 Hz. Its data rate is between 6.6 and 23.85 kbit/s, and the codec is generally available on mobile phones.
Architecture :
WebRTC offers web application developers the ability to write rich, real-time multimedia applications (think video chat) on the web, without requiring plugins, downloads or installs. Its purpose is to help build a strong RTC platform that works across multiple web browsers and across multiple platforms.
Web API – An API to be used by third-party developers for developing web-based video chat-like applications.
WebRTC Native C++ API – An API layer that enables browser makers to easily implement the Web API proposal.
Transport / Session – The session components are built by re-using components from libjingle, without using or requiring the XMPP/jingle protocol.
RTP Stack – A network stack for RTP, the Real-time Transport Protocol.
STUN/ICE – A component allowing calls to use the STUN and ICE mechanisms to establish connections across various types of networks.
Session Management – An abstracted session layer, allowing for call setup and management. This leaves the protocol implementation decision to the application developer.
VoiceEngine – VoiceEngine is a framework for the audio media chain, from sound card to the network.
iSAC / iLBC / Opus
iSAC: A wideband and super wideband audio codec for VoIP and streaming audio. iSAC uses 16 kHz or 32 kHz sampling frequency with an adaptive and variable bit rate of 12 to 52 kbps.
iLBC: A narrowband speech codec for VoIP and streaming audio. Uses 8 kHz sampling frequency with a bitrate of 15.2 kbps for 20ms frames and 13.33 kbps for 30ms frames. Defined by IETF RFCs 3951 and 3952.
Opus: Supports constant and variable bitrate encoding from 6 kbit/s to 510 kbit/s, frame sizes from 2.5 ms to 60 ms, and various sampling rates from 8 kHz (with 4 kHz bandwidth) to 48 kHz (with 20 kHz bandwidth, where the entire hearing range of the human auditory system can be reproduced). Defined by IETF RFC 6716.
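As a quick worked check of the iLBC figures quoted above, the payload carried in one frame follows from bitrate × frame duration. This is only an illustrative sketch; the helper name payload_bytes is an assumption made for this example and does not come from any codec library.

```python
def payload_bytes(bitrate_bps: float, frame_ms: float) -> int:
    """Payload size in bytes for one frame at a constant bitrate."""
    bits = bitrate_bps * frame_ms / 1000.0  # bits produced during one frame
    return round(bits / 8)                  # whole bytes per frame


# iLBC modes quoted in the text: 15.2 kbps / 20 ms and 13.33 kbps / 30 ms
ilbc_20ms = payload_bytes(15_200, 20)
ilbc_30ms = payload_bytes(13_330, 30)

print(ilbc_20ms, ilbc_30ms)  # prints "38 50"
```

The lower bitrate of the 30 ms mode comes from spreading a slightly larger frame over a longer time, at the cost of extra packetisation delay.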
NetEQ for Voice – A dynamic jitter buffer and error concealment algorithm used for concealing the negative effects of network jitter and packet loss. Keeps latency as low as possible while maintaining the highest voice quality.
Acoustic Echo Canceler (AEC) – The Acoustic Echo Canceler is a software-based signal processing component that removes, in real time, the acoustic echo resulting from the voice being played out coming into the active microphone.
Noise Reduction (NR) – The Noise Reduction component is a software-based signal processing component that removes certain types of background noise usually associated with VoIP (hiss, fan noise, etc.).
Video Engine – VideoEngine is a framework for the video media chain, from the camera to the network, and from the network to the screen.
VP8 – Video codec from the WebM Project. Well suited for RTC as it is designed for low latency.
Video Jitter Buffer – Dynamic jitter buffer for video. Helps conceal the effects of jitter and packet loss on overall video quality.
Image enhancements – For example, removing video noise from the image captured by the webcam.
W3C contribution
Media Stream Functions
API for connecting processing functions to media devices and network connections, including media manipulation functions.
Audio Stream Functions
An extension of the Media Stream Functions to process audio streams (e.g. automatic gain control, mute functions and echo cancellation).
Video Stream Functions
An extension of the Media Stream Functions to process video streams (e.g. bandwidth limiting, image manipulation or “video mute“).
Functional Component
API to query presence of WebRTC components in an implementation, instantiate them and connect them to media streams.
P2P Connection Functions
API functions to support establishing signalling protocol-agnostic peer-to-peer connections between Web browsers
API specification Availability
WebRTC 1.0: Real-time Communication Between Browsers – Draft 3 June 2013 available
Implementation Library: WebRTC Native APIs
Media Capture and Streams – Draft 16 May 2013
Supported by Chrome, Firefox and Opera on desktop for all OSes (Linux, Windows, Mac)
Supported by Chrome and Firefox in mobile browsers (Android)
IETF contribution
Communication model
Security model
Firewall and NAT traversal
Media functions
Functionality such as media codecs, security algorithms, etc.
In simple words, it is a phenomenal change in decentralizing communication platforms away from proprietary vendors who depended heavily on patented, royalty-bound technologies and protocols. It will revolutionize internet telephony. It will also emerge as platform-independent (i.e. any browser, any desktop operating system, any mobile operating system).
WebRTC allows anybody to add real-time communication to their web page as simply as adding a table.
Update 2020 – This article was written very early in 2013, while WebRTC was still being standardised and was not yet widely adopted; WebRTC itself only began in 2012.
Many more articles have been written since then to explain WebRTC in more detail and to cover its applications.
– signalling at the protocol level (such as SIP, MGCP and SS7)
•For telephony, data and wireless communications networks, the Java APIs defined through.
– service portability
– network independence
– open development
•A Service Logic Execution Environment (SLEE) is a high-throughput, low-latency, event-processing application environment.
•JAIN SLEE is designed specifically to allow implementations of a standard to meet the stringent requirements of communications applications (such as network-signaling applications).
Goals of JAIN SLEE are:
– Portable services and network independence.
– Hosting on an extensible platform.
– services and SLEE platform available from many vendors.
Key Features are :
•Industry standard: JSLEE is the industry-agreed standard for an application server that meets the specific needs of telecommunications networks.
•Network independence: The JSLEE programming model enables network independence for the application developer. The model is independent of any particular network protocol, API or network topology.
•Converged services: JSLEE provides the means to create genuinely converged services, which can run across multiple network technologies.
•Network migrations: As JSLEE provides a generic, horizontal platform across many protocols, independent of the network technology, it provides the ideal enabler technology for smooth transition between networks.
•Global market – global services: JSLEE-compliant applications hosted on a JSLEE application server are network agnostic. A single platform can be used across disparate networks.
•Robust and reliable: As with the enterprise application server space, deploying applications on a standard application server that has been tested and deployed in many other networks reduces logic errors and produces more reliable applications.
A VoIP solution is designed to handle both signalling and media, along with integration towards various external endpoints such as SIP phones (desktop, softphones, WebRTC), telecom carriers, other VoIP network providers, enterprise applications (Skype, Microsoft Lync), trunks, etc.
A sufficiently capable SIP platform should consist of the following features :
audio calls ( optionally video )
media services such as conferencing, voicemail, and IVR,
messaging as IM and presence based on SIMPLE,
programmable services through standardized APIs and development of new modules
near-end and far-end NAT traversal for signalling and media flows
interconnectivity with other IP multimedia systems, VoLTE ( optional interconnection with other types of communications networks as GSM or PSTN/ISDN)
registry , location and lookup service
Backend support like Redis, MySQL, PostgreSQL, Oracle, Radius, LDAP, Diameter
serial and parallel forking
support for VoIP signalling protocols (SIP, H.323, SCCP, MGCP, IAX) and telephony signalling protocols (ISDN/SS7, FXS/FXO, Sigtran), either internally via pluggable modules or externally via gateways
Performance factors :
High availability using redundant servers in standby
Load balancing
IPv4 and IPv6 network layer support
TCP , UDP , SCTP transport layer protocol support
DNS lookups and hop-by-hop connectivity
Security considerations :
authentication, authorization, and accounting (AAA)
Digest authentication and credentials fetched from backend
Media Encryption
TLS and SRTP support
Topology hiding to prevent disclosing the IPs of internal components in Via and Route headers
Firewalls, blacklists, filters and peak detectors to prevent DoS and DDoS attacks
This article outlines SIP system architecture from 3 viewpoints :
from the infrastructure standpoint
from the core voice engineering perspective
and the accompanying external components required to run the system
Infrastructure Requirements
Data Centers with BCP ( Business Continuity Planning ) and DR ( Disaster Recovery )
Servers and Clusters for faster, parallel computation
Virtualization VMs to make a distributed computing environment with HA ( high availability ) and DRS ( Distributed Resource Scheduling )
Storage SAN with built-in redundancy for the resiliency of data. WORM compliant NAS for storing voice archives over a retention period.
Racks, power supplies, battery backups, cages etc.
Networking DMZs (Demilitarized Zones), which are interfacing areas between internal servers in the green zone and the outside network. VLANs for segregation between tenants. Connectivity through the public Internet as well as through VPN or a dedicated optical fibre network for security.
Firewall configuration
Load Balancer ( Layer 7 )
Reverse Proxies for the security of internal IPs and ports
Security controls In compliance with ISO/IEC 27000 family – Information security management systems
PKI Infrastructure to manage digital certificates
Key management with HSM ( hardware security module )
Trusted CA (Certificate Authority) to issue publicly signed certificates for TLS (HTTPS, WSS, etc.)
Integral Components of a VOIP SIP based architecture
Call Controller
Media Manager
Recording
Softclients
logs and PCAP archives
CDR generators
Session Border Controllers (SBCs)
The types of SIP servers are listed below. It is important to understand the roles a SIP server can be moulded to take up, which in turn define its placement in the overall VoIP communication platform, such as stateless proxy servers on the border, application and B2BUA servers at the core, etc.
SIP Gateways:
SIP platform components
A SIP gateway is an application that interfaces a SIP network to a network utilising another signalling protocol. In terms of the SIP protocol, a gateway is just a special type of user agent, where the user agent acts on behalf of another protocol rather than a human. A gateway terminates the signalling path and can also terminate the media path .
To PSTN for telephony inter-working
To H.323 for IP Telephony inter-working
Client – originates message
Server – responds to or forwards message
Logical SIP entities are:
User Agent Client (UAC): Initiates SIP requests ….
User Agent Server (UAS): Returns SIP responses ….
Network Servers ….
Registrar Server
A registrar server accepts SIP REGISTER requests; all other requests receive a 501 Not Implemented response. The contact information from the request is then made available to other SIP servers within the same administrative domain, such as proxies and redirect servers. In a registration request, the To header field contains the name of the resource being registered, and the Contact header fields contain the contact or device URIs.
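The registrar behaviour described above can be sketched in a few lines. This is a toy model under stated assumptions, not how any particular SIP server implements it: the class name Registrar and its methods are hypothetical, and expiry, authentication and real message parsing are left out.

```python
from typing import Dict, List, Tuple


class Registrar:
    """Toy registrar: maps the To URI (address of record) to Contact URIs."""

    def __init__(self) -> None:
        self.bindings: Dict[str, List[str]] = {}

    def handle(self, method: str, to_uri: str, contacts: List[str]) -> Tuple[int, str]:
        # A pure registrar only understands REGISTER; everything else gets 501.
        if method != "REGISTER":
            return 501, "Not Implemented"
        known = self.bindings.setdefault(to_uri, [])
        for contact in contacts:
            if contact not in known:
                known.append(contact)
        return 200, "OK"

    def lookup(self, aor: str) -> List[str]:
        # What a proxy or redirect server in the same domain would query.
        return list(self.bindings.get(aor, []))


registrar = Registrar()
print(registrar.handle("REGISTER", "sip:alice@example.com", ["sip:alice@192.0.2.4:5060"]))
print(registrar.handle("INVITE", "sip:alice@example.com", []))
print(registrar.lookup("sip:alice@example.com"))
```

The lookup method stands in for the location service that proxies and redirect servers consult when routing a request.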
Proxy Server
A SIP proxy server receives a SIP request from a user agent or another proxy and acts on behalf of the user agent in forwarding or responding to the request. Just as a router forwards IP packets at the IP layer, a SIP proxy forwards SIP messages at the application layer.
Typically, proxy servers (inbound or outbound) have no media capabilities and ignore the SDP. They are mostly bypassed once a dialog is established, but can add a Record-Route header to stay in the signalling path.
A proxy server usually also has access to a database or a location service to aid it in processing the request (determining the next hop).
1. Stateless Proxy Server A proxy server can be either stateless or stateful. A stateless proxy server processes each SIP request or response based solely on the message contents. Once the message has been parsed, processed, and forwarded or responded to, no information (such as dialog information) about the message is stored. A stateless proxy never retransmits a message, and does not use any SIP timers.
2. Stateful Proxy Server A stateful proxy server keeps track of requests and responses received in the past, and uses that information in processing future requests and responses. For example, a stateful proxy server starts a timer when a request is forwarded. If no response to the request is received within the timer period, the proxy will retransmit the request, relieving the user agent of this task.
3. Forking Proxy Server A proxy server that receives an INVITE request, then forwards it to a number of locations at the same time, or forks the request. This forking proxy server keeps track of each of the outstanding requests and the responses. This is useful if the location service or database lookup returns multiple possible locations for the called party that need to be tried.
Redirect Server
A redirect server is a type of SIP server that responds to, but does not forward, requests. Like a proxy server, a redirect server uses a database or location service to look up a user. The location information, however, is sent back to the caller in a redirection class response (3xx), which, after the ACK, concludes the transaction. The Contact header in the response indicates where the request should be tried next.
Application Server
The heart of the call routing setup. It loads and executes scripts for call handling at runtime and maintains transaction states and dialogs for all ongoing calls. It is usually the one to rewrite SIP packets, adding media relay servers and NAT handling. It also connects external services like accounting, CDR and stats to calls.
Developing SIP based applications
Basic SIP methods
SIP defines basic methods such as INVITE, ACK and BYE, which can pretty much handle simple call routing along with some more advanced processes too, like call forwarding/redirection, call hold with optional music on hold, call parking, forking, barge, etc.
Extending SIP headers
Newer SIP methods defined by later SIP RFCs include INFO, PRACK, PUBLISH, SUBSCRIBE, NOTIFY, MESSAGE, REFER and UPDATE. More methods or headers can be added to baseline SIP messages for customization specific to a particular service provider. When a SIP proxy encounters a SIP header that it does not support or understand, it simply forwards it to the specified endpoint.
Call routing Scripts
Interfaces for programming SIP call routing include : – Call Processing Language—SIP CPL, – Common Gateway Interface—SIP CGI, – SIP Servlets, – Java API for Integrated Networks—JAIN APIs etc .
Some known SIP stacks :
SailFin – SIP servlet container that uses the GlassFish open source enterprise application server platform (GPLv2); obsolete since Sun was acquired by Oracle.
Mobicents – supports both JSLEE 1.1 and SIP Servlets 1.1 (GPLv2)
Cipango – extension of SIP Servlets to the Jetty HTTP Servlet engine thus compliant with both SIP Servlets 1.1 and HTTP Servlets 2.5 standards.
WeSIP – SIP and HTTP (J2EE) converged application server built on the OpenSER SIP platform
Additionally, SIP stacks are available for almost all popular programming languages; they can be imported as libraries and used for building call routing scripts to be mounted on SIP servers or endpoints, such as :
PJSIP in C
JsSIP in JavaScript
Sofia-SIP, used by FreeSWITCH
Some popular SIP servers also provide their own scripting interfaces, such as the Asterisk Gateway Interface (AGI), an application interface for extending the dialplan with your own functionality in the language you choose: PHP, Perl, C, Java, Unix shell and others.
Adding Media Management
Media processing is usually provided by media servers in accordance with the SIP signalling. Bridges, call recording, voicemail, audio conferencing, and interactive voice response (IVR) are commonly used.
RFC 6230, Media Control Channel Framework, describes a framework and protocol for application deployment where the application programming logic and media processing are distributed.
Any one such service could be a combination of many smaller services; for example, voicemail is a combination of prompt playback, runtime controls, Dual-Tone Multi-Frequency (DTMF) collection, and media recording. RFC 6231 defines the Interactive Voice Response (IVR) Control Package for the Media Control Channel Framework.
RTP ( Real Time Transport Protocol )
RTP handles real-time multimedia transport end to end between network components. It is defined in RFC 3550.
Packet structure of RTP
The RTP header contains the timestamp, the identifier of the media source (SSRC), the payload type (codec) and the sequence number.
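To make the header fields above concrete, here is a minimal sketch that unpacks the 12-byte fixed RTP header laid out in RFC 3550. The names RtpHeader and parse_rtp_header are made up for this example, and CSRC lists and header extensions are deliberately ignored.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse only the 12-byte fixed RTP header (network byte order)."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 bytes")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,   # e.g. 0 is PCMU (G.711 mu-law)
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


# Hand-built sample packet header: version 2, payload type 0, made-up SSRC
sample = struct.pack("!BBHII", 0x80, 0, 1234, 160, 0xDEADBEEF)
header = parse_rtp_header(sample)
print(header)
```

The sequence number lets a receiver detect loss and reordering, while the timestamp drives playout timing in the jitter buffer.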
RTCP
– tbd
DTMF( Dual tone Multi Frequency )
delivery options:
Inband – With inband delivery, digits are passed along just like the rest of your voice: as normal audio tones generated by your phone, with no special coding or markers, using the same codec as your voice.
Outband – The incoming stream delivers DTMF signals out-of-audio using either the SIP INFO or the RFC 2833 mechanism, independently of codecs. In this case, the DTMF signals are sent separately from the actual audio stream.
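For the RFC 2833 option, each DTMF digit travels as a small 4-byte "telephone-event" payload rather than as audio. The sketch below decodes that payload; the function name parse_telephone_event and the returned dictionary keys are assumptions made for illustration.

```python
import struct

# Event codes 0-15 map to the 16 DTMF keys
EVENT_TO_DIGIT = "0123456789*#ABCD"


def parse_telephone_event(payload: bytes) -> dict:
    """Decode a 4-byte telephone-event payload (event, E/R/volume, duration)."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload is 4 bytes")
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    if event >= len(EVENT_TO_DIGIT):
        raise ValueError("not a DTMF event code")
    return {
        "digit": EVENT_TO_DIGIT[event],
        "end": bool(flags & 0x80),   # E bit: set on the final packet(s) of the event
        "volume": flags & 0x3F,      # tone power level, 6 bits
        "duration": duration,        # in RTP timestamp units
    }


# Digit "5", end-of-event, volume 10, duration 800 timestamp units
print(parse_telephone_event(struct.pack("!BBH", 5, 0x80 | 10, 800)))
```

Because the digit is carried as a number rather than as tones, it survives low-bitrate codecs that would distort inband DTMF.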
TTS ( Text to Speech )
Alexa Text-to-Speech (TTS) + Amazon Polly
Ivona – multi-language text-to-speech converter driven by SSML scripts such as the one below
<speak><p><s><prosody rate="slow">IVONA</prosody> means highest quality speech
synthesis in various languages.</s><s>It offers both male and female radio quality voices <break/> at a
sampling rate of 22 kHz <break/> which makes the IVONA voices a
perfect tool for professional use or individual needs.</s></p></speak>
check ivona status
service ivona-tts-http status
tail -f /var/log/tts.log
Collecting and Processing PCAPS
VoIPmonitor – network packet sniffer with a commercial frontend for SIP, RTP, RTCP, SKINNY (SCCP), MGCP and WebRTC VoIP protocols.
It uses a passive network sniffer (like tcpdump or Wireshark) to analyse packets in real time and transforms all SIP calls with their associated RTP streams into database CDR records, which are sent over TCP to a MySQL server (remote or local). If saving of SIP / RTP packets is enabled, the sniffer stores each VoIP call in a separate file in native pcap format (on local storage).
voip monitor
sngrep
tcpdump
custom made pcap capture and uploader
SIP platform Development
A sufficiently capable SIP platform should consist of the following features :
audio calls ( optionally video )
media services such as conferencing, voicemail, and IVR,
messaging as IM and presence based on SIMPLE,
programmable services through standardized APIs and development of new modules
near-end and far-end NAT traversal for signalling and media flows
interconnectivity with other IP multimedia systems, VoLTE ( optional interconnection with other types of communications networks as GSM or PSTN/ISDN)
registry , location and lookup service
serial and parallel forking
Performance factors :
High availability using redundant servers in standby
Load balancing
IPv4 and IPv6 support
Security considerations :
digest authentication and credentials fetched from backend
Media Encryption
TLS and SRTP support
Topology hiding to prevent disclosing the IPs of internal components in Via and Route headers
Firewalls, blacklists, filters and peak detectors to prevent DoS and DDoS attacks
Add NAT and DNS components
To adapt SIP to modern IP networks with inter-network traversal, ICE and far-end and near-end NAT traversal solutions are used. Network address translation (NAT) traversal is critical to traffic flow between private and public networks and from behind firewalls and policy-controlled networks.
One can use any of the VOVIDA-based STUN server, mySTUN, TurnServer, reStund, CoTURN, NATH (PJSIP NAT Helper), ReTURN, or ice4j.
Near-end NAT traversal
STUN (session traversal utilities for NAT) – The UA itself detects the presence of a NAT and learns the public IP address and port assigned by the NAT. It then replaces the device's local private IP address with the public one in the SIP and SDP headers. Near-end traversal is implemented via STUN, TURN, and ICE. The limitations are that STUN does not work for symmetric NAT (where each connection gets a different mapping with a different, randomly generated port), nor in situations where an endpoint has multiple addresses.
TURN (traversal using relays around NAT) or STUN relay – The UA learns the public IP address of the TURN server and asks it to relay incoming packets. The limitation is that, since it handles all incoming and outgoing traffic, it must scale to meet traffic requirements and must not become a bottleneck or single point of failure.
ICE (interactive connectivity establishment) – The UA gathers candidate addresses for communication, assigns them priorities and exchanges them with the remote party. The client then pairs its local candidates with the received peer candidates and performs connectivity checks on all pairs during offer-answer negotiation, thereby maximising the chance of success. The types of candidates are : – host candidate, which represents the client's own IP addresses, – server reflexive candidate, for the address that has been resolved via STUN, – and relayed candidate, for the address which has been allocated from a TURN relay by the client.
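The candidate priorities mentioned for ICE can be illustrated with the formula recommended in the ICE specification (RFC 8445): priority = 2^24 × type preference + 2^8 × local preference + (256 − component ID). The sketch below uses the recommended type preferences; the addresses are documentation examples and the function name candidate_priority is an assumption.

```python
# Recommended type preferences: direct paths are preferred over relayed ones
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type: str, component_id: int = 1,
                       local_pref: int = 65535) -> int:
    """Candidate priority; local_pref 65535 assumes a single network interface."""
    return ((TYPE_PREFERENCE[cand_type] << 24)
            + (local_pref << 8)
            + (256 - component_id))


# Hypothetical candidates gathered by one UA: (type, transport address)
candidates = [
    ("relay", "203.0.113.9:50000"),
    ("host", "192.168.1.10:5004"),
    ("srflx", "198.51.100.7:40000"),
]
ordered = sorted(candidates, key=lambda c: candidate_priority(c[0]), reverse=True)

for cand_type, addr in ordered:
    print(cand_type, addr, candidate_priority(cand_type))
```

Sorting by this value means host candidates are tried first and the TURN relay only serves as the fallback, which keeps relay load down.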
Far-end NAT traversal
The UA is not concerned about NAT at all and communicates using its local IP address and port. The border controller employs a NAT handling component, such as an application layer gateway (ALG) or universal plug and play (UPnP), which resolves the private and public network address mapping by acting as a back-to-back user agent (B2BUA). Far-end NAT traversal can also be enabled by deploying a public SIP server which performs media relay (RTP proxy / media proxy).
Limitations of this approach : – security risks, as these components operate in the public network – enabling reverse traffic from the UAS to a UAC behind NAT.
A keep-alive mechanism is used to keep the NAT translations of communications between a SIP endpoint and its serving SIP servers open, so that the NAT translation can be reused for routing. It consists of a client-to-server “ping” keep-alive and the corresponding server-to-client “pong” message. There are 2 keep-alive mechanisms: a CRLF keep-alive and a STUN keep-alive message exchange.
The 3 types of SIP URIs,
address of record (AOR)
fully qualified domain name (FQDN)
globally routable user agent (UA) URI
SIP uniform resource identifiers (URIs) are resolved using DNS, since the part of the URI after the @ symbol contains the hostname, port and protocol for the next hop.
Locating the correct SIP server for a SIP message can be done using : – DNS service records (DNS SRV) – naming authority pointer (NAPTR) DNS resource records
Steps for a SIP endpoint to locate a SIP server
Look up the NAPTR record for the domain in the SIP URI to get the protocol to be used
Inspect the SRV record to fetch the port to use
Inspect the A/AAAA record to get the IPv4 or IPv6 addresses
ref : RFC 3263 – Locating SIP Servers
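The three lookup steps above can be sketched without touching a real resolver by using small in-memory tables. Everything in these tables (the example.com records, hosts and addresses) is hypothetical, and the sketch skips the weighted random selection that SRV records normally call for.

```python
# Hypothetical DNS data for example.com, standing in for real lookups
NAPTR = {  # domain -> (order, preference, service, replacement)
    "example.com": [
        (20, 50, "SIP+D2U", "_sip._udp.example.com"),
        (10, 50, "SIP+D2T", "_sip._tcp.example.com"),
    ],
}
SRV = {  # SRV name -> (priority, weight, port, target)
    "_sip._tcp.example.com": [(20, 0, 5060, "sip2.example.com"),
                              (10, 60, 5060, "sip1.example.com")],
    "_sip._udp.example.com": [(10, 0, 5060, "sip1.example.com")],
}
ADDR = {  # host -> A / AAAA addresses
    "sip1.example.com": ["192.0.2.10"],
    "sip2.example.com": ["192.0.2.11", "2001:db8::11"],
}
SERVICE_TO_TRANSPORT = {"SIP+D2U": "udp", "SIP+D2T": "tcp", "SIPS+D2T": "tls"}


def locate_sip_server(domain: str):
    """Return (transport, address, port) for the preferred next hop."""
    # Step 1: NAPTR, lowest order (then preference) wins -> transport protocol
    _, _, service, srv_name = min(NAPTR[domain])
    transport = SERVICE_TO_TRANSPORT[service]
    # Step 2: SRV, lowest priority wins -> port and target host
    _, _, port, target = min(SRV[srv_name], key=lambda rec: rec[0])
    # Step 3: A/AAAA -> IP address of the target host
    address = ADDR[target][0]
    return transport, address, port


print(locate_sip_server("example.com"))  # prints "('tcp', '192.0.2.10', 5060)"
```

A real client would fall back to the next SRV target or NAPTR entry when the preferred server does not respond.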
A BIND9 server can be used for DNS resolution; it supports NAPTR/SRV, ENUM, DNSSEC, multiple domains, and private or public trees.
Cross platform and integration to External Telecommunication provider landscape
connection to IMS such as openIMS
support for VoIP signalling protocols (SIP, H.323, SCCP, MGCP, IAX) and telephony signalling protocols (ISDN/SS7, FXS/FXO, Sigtran), either internally via pluggable modules or externally via gateways
Database Integration
A backend, cache and database integration is needed, not only to store routing rules with temporary variable values but also account details, call record details, access control lists, etc. The platform should therefore extend integration to text-based DBs, Redis, MySQL, PostgreSQL, OpenLDAP, and OpenRadius.
The obvious starting milestone before building a full-scale, carrier-grade, SIP-based VoIP system is to build a PBX for intra-enterprise communication. There are readily available solutions for building an IP telephony PBX, such as Kamailio, FreeSWITCH, Asterisk, Elastix and sipXecs.
Call Rate and Accounting
Generally, data stream processing is used for critical and voluminous service usage data, such as – metering/billing – server activity – website clicks – geo-location of devices, people, and physical goods
Call rates are very critical for billing and charging calls. Any updates from the customer, carriers or individuals need to propagate automatically and quickly to avoid discrepancies and negative margins. CDRs need to be processed sequentially and incrementally on a record-by-record basis or over sliding time windows, and used for a wide variety of analytics including correlations, aggregations, filtering, and sampling.
To achieve this, the following setup is ideal: take the new rate sheet values as input via a web UI console or POST API and propagate them quickly to the main DB via AWS SQS, which is a queuing service, and AWS Lambda, which is a serverless, trigger-based system. This ensures that any new input rates are updated in real time, while fallback values are maintained in an S3 bucket too.
CDR Processing and Billing
CDRs store call detail records along with proof of the call, with timestamps, origination, destination, duration, rate, etc. At the end of the month or any other term, the aggregated CDRs are cumulatively processed to generate the bill for a user. This heavy data stream needs to be processed accurately, which can be achieved by using data pipelines like AWS Kinesis or a Kafka event store.
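The end-of-term aggregation described above can be sketched as a simple fold over CDRs. The record fields (user, duration_sec, rate_per_min), the sample values and the per-second billing granularity are all assumptions for illustration; a production pipeline would read these records from the stream instead of a list.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical CDRs; Decimal avoids floating point drift in money amounts
cdrs = [
    {"user": "alice", "duration_sec": 120, "rate_per_min": Decimal("0.02")},
    {"user": "alice", "duration_sec": 90, "rate_per_min": Decimal("0.02")},
    {"user": "bob", "duration_sec": 600, "rate_per_min": Decimal("0.05")},
]


def bill_per_user(records):
    """Sum duration x rate per user, rounded to cents at the end of the term."""
    totals = defaultdict(Decimal)  # missing users start at Decimal("0")
    for cdr in records:
        minutes = Decimal(cdr["duration_sec"]) / Decimal(60)
        totals[cdr["user"]] += minutes * cdr["rate_per_min"]
    return {user: total.quantize(Decimal("0.01")) for user, total in totals.items()}


print(bill_per_user(cdrs))
```

Rounding only once, after summation, keeps per-call rounding errors from accumulating across a month of calls.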
The prime requirement for the system is to handle an enormous amount of call record data in real time and to cater to a number of producers and consumers.
The record data is serialised into a blob using base64 encoding. Note that base64 is an encoding and not encryption, so sensitive fields still need to be encrypted.
AWS Kinesis – Kinesis Data Streams is used for rapid and continuous data intake and aggregation. The type of data used can include IT infrastructure log data, application logs, social media, market data feeds, and web clickstream data.
Pros of data streams
This system can handle a high volume of data in real time and produce call-UUID-specific results, which can be consumed by consumers waiting for the processed results.
Cons of data streams
If not consumed within a pre-specified time duration, the processed results expire and are irretrievable. One has to self-implement a publisher to store the processed results from the Kinesis stream in data stores like Redis / RDBMS or other storage locations like S3 or DynamoDB. If the pipeline crashes during operation, data is lost.
The data stream should have low latency while ingesting continuous data from producers and presenting data to consumers.
It should support data sharding, i.e. a number of call records are grouped, and a partition key (a string, hashed with MD5) determines which shard each record goes to.
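The partition-key idea can be sketched as follows: hash the key with MD5, read the digest as a 128-bit integer, and see which slice of the hash space it lands in. Splitting the space into equal slices and the function name shard_for are assumptions of this sketch; a real stream assigns explicit hash ranges to each shard.

```python
import hashlib

HASH_SPACE = 2 ** 128  # an MD5 digest is 128 bits


def shard_for(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, assuming equal-sized hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    return hash_value * shard_count // HASH_SPACE


# Using the call UUID as the partition key keeps all records of a call together
call_uuid = "e10306be-0cfd-4b38-af3c-b2ada0827cef"
print(shard_for(call_uuid, 4))
```

Choosing the call UUID as the key means every record of one call lands on the same shard, so per-call ordering is preserved for the consumer.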
There are other external components needed to set up a VoIP solution apart from the core voice servers and gateways, like the ones listed below. I will try to either add a detailed overall architecture diagram here or write about them in a separate article. Keep watching this space for updates.
Payment Gateways
Billing and Invoice
Fraud Prevention
Contacts Integration
Call Analytics
API services
Admin Module
Number Management ( DIDs ) and porting
Call Tracking
Single Sign On and User Account Management with Oauth and SAML
At the time of writing this article on SIP and related VoIP technologies I was a newbie in the VoIP domain, probably just out of college. However, over the past decade, looking at the steady traffic to these articles, I have tried updating them with new RFC standards and market trends.
In this updated version (2019) , the main points described are
SIP – Application layer protocol
SIP Requests
SIP responses
Session Description Protocol (SDP)
SIP transactions , dialog , branch
Record Routing
strict routing
loose routing
Mobility and Location Service
Network Address Translator ( NAT)
Far End Traversal
Near End Traversal
SIP Call Flows
Registration
Call Redirection
Forking
click to Dial
SIP for Instant Messaging and Presence Leveraging Extensions ( SIMPLE)
The Session Initiation Protocol (SIP) is a multimedia signalling protocol that has evolved into the de facto communication standard for IP telephony. Even today it forms the primary protocol for many real-time communication platforms which are integrated with telecom carriers and provide cloud and IP based services for applications such as robo/mass calls for advertising, API based calls like OTP generators, IVR announcements with DTMF input like customer care centres, etc. In fact, it would not be far from the truth to say that the converged platforms we find today are a result of SIP integrating with the IP world.
Converged platforms integrate audio, video, data, presence, instant messaging, voicemail and conference services into a single network. SIP is the key component for building an advanced converged IP communication platform or a rich multimedia real-time communication service.
SIP can be used to create programmable APIs and complex call routing VoIP scripts such as PBX, SBC, etc.
It has the support of many high quality open source and freeware SIP clients, servers, proxies and tools such as Kamailio, Asterisk, FreeSWITCH, SIPp, JAIN SIP, etc. It is also supported on most standardised VoIP hardware and networks, such as Cisco, Microsoft, Avaya, and Radvision.
It is standardized by the Internet Engineering Task Force (IETF) in RFCs such as RFC 3261, which describes SIP v2. Architecturally, the SIP request/response format (e.g. 404, 301) is very similar to HTTP, and its addressing scheme resembles SMTP (sip:altanai@company.com).
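As with HTTP, the first digit of a SIP status code gives its class; SIP adds a 6xx class on top of the familiar five. A small sketch of that mapping, using the class names from RFC 3261 (the function name sip_response_class is made up for this example):

```python
RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def sip_response_class(code: int) -> str:
    """Return the response class for a SIP status code (100-699)."""
    if not 100 <= code <= 699:
        raise ValueError("SIP status codes range from 100 to 699")
    return RESPONSE_CLASSES[code // 100]


for code in (180, 200, 301, 404, 503, 603):
    print(code, sip_response_class(code))
```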
SIP – Application layer protocol
We know the ISO OSI layers which servers as a standard model for data communications .
Physical Layer : Ethernet , USB , IEEE 802.11 WiFi, Bluetooth , BLE
Data Link Layer : ARP ( Address Resolution Protocol ) , PPP ( point to point protocol ) , MAC ( Media Access control ) , ATM , Frame Relay
Network Layer : IP (IPv4 / IPv6), ICMP, IPsec
Transport : TCP , UDP , SCTP
Session : PPTP ( Point to point tunnelling protocol) , NFS, SOCKS
Presentation : Codecs and formats such as JPEG , GIF , SSL
Application : Application level, like call manager / softphone, such as HTTP , FTP , DNS , SIP , RTSP , RTP
SIP is an application layer protocol
SIP and SDP as Application layer protocols
SIP (Session Initiation Protocol) negotiates a session between 2 parties. It primarily exchanges headers that are used for setting up a call session. Below is an example of an outgoing telephone call, shown as a dissected SIP INVITE.
Session Initiation Protocol (INVITE)
Request-Line: INVITE sip:altanai@telecomcompany.com;transport=tcp SIP/2.0
Method: INVITE
Request-URI: altanai@telecomcompany.com;transport=tcp
Request-URI User Part: altanai
Request-URI Host Part: telecomcompany.com
[Resent Packet: False]
Message Header
Via: SIP/2.0/TCP 1.2.3.4:5080;rport;branch=z9hG4bKceX7a2H2866cN
Transport: TCP
Sent-by Address: 1.2.3.4
Sent-by port: 5080
RPort: rport
Branch: z9hG4bKceX7a2H2866cN
Max-Forwards: 41
From: "+16014801797" <sip:+16014801797@1.2.3.4>;tag=7HKgjNQ6y2FSj
SIP Display info: "+16014801797"
SIP from address: sip:+16014801797@1.2.3.4
SIP from address User Part: +16014801797
E.164 number (MSISDN): 16014801797
Country Code: Americas (1)
SIP from address Host Part: 1.2.3.4
SIP from tag: 7HKgjNQ6y2FSj
To: <sip:altanai@telecomcompany.com;transport=tcp>
SIP to address: sip:altanai@telecomcompany.com;transport=tcp
SIP to address User Part: altanai
SIP to address Host Part: telecomcompany.com
SIP To URI parameter: transport=tcp
Call-ID: e10306be-0cfd-4b38-af3c-b2ada0827cef
CSeq: 126144925 INVITE
Contact: <sip:mod_sofia@1.2.3.4:5080;transport=tcp>
User-Agent: phone1
Allow: INVITE, ACK, BYE, CANCEL, OPTIONS, MESSAGE, INFO, UPDATE, REFER, NOTIFY
Supported: path, replaces
Allow-Events: talk, hold, conference, refer
Privacy: none
Content-Type: application/sdp
Content-Disposition: session
Content-Length: 249
SIP Display info: "+16014801797"
SIP PAI Address: sip:+16014801797@1.2.3.4
SIP makes no assumptions about the underlying transport protocol: it can run over TCP, UDP, X.25, Frame Relay, ATM, etc.
It also supports multicast.
A SIP URI can either be in the format sip:altanai@telecomcompany.com (RFC 2543) or sips:altanai@telecomcompany.com (secure, with TLS over TCP, RFC 3261). Additionally, SIP URI resolution can either be
DNS SRV based, such as altanai@telecomcompany.com with SIP server location records for the domain “telecomcompany.com”, or
FQDN (fully qualified domain name) / contact / IP address based, such as altanai@2.2.2.2 or altanai@us-west1-prod-server. Neither of these needs any resolution for routing.
Tags are pseudo-random numbers inserted in the To or From headers to uniquely identify a call leg.
Max-Forwards is a count decremented by each proxy that forwards the request. When the count reaches zero, the request is discarded and a 483 Too Many Hops response is sent. It is used for stateless loop detection.
Content-Type indicates the type of the message body attachment. In this case it is application/sdp, but others could be text/plain, application/cpl+xml, etc.
Content-Length indicates the octet (byte) count of the message body.
Contact gives a direct route to contact the sender, composed of a SIP URI with a user name and an IP address or FQDN. It is used for later requests to reach the destination directly, such as the ACK after an INVITE.
Via gives the last SIP hop as IP address, transport and transaction-specific parameters, along with the branch that identifies the transaction. Each proxy adds an additional Via header. Finally, the Via headers are used to route the responses back. This ensures that, after the initial request, the user agents don't have to rely on DNS and location tables to route the messages.
Firewalls can sometimes block SIP packets, change TCP to UDP or change the IP address of the packets. Record-Route can be used to ensure the firewall proxy stays in the path. Clients and servers copy Record-Route and put it in the Route header for all subsequent messages.
The message body is separated from the SIP header fields by a blank line (CRLF).
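As a rough illustration of the header rules above, here is a minimal Python sketch that splits a SIP message at the blank line and applies the Max-Forwards check. The function names and the shortened sample message are assumptions made for this example (the header values are taken from the INVITE shown earlier); a real stack also handles header folding, compact header names and multiple values per line.

```python
# Shortened copy of the INVITE above, with a one-line body for illustration.
RAW_INVITE = (
    "INVITE sip:altanai@telecomcompany.com;transport=tcp SIP/2.0\r\n"
    "Via: SIP/2.0/TCP 1.2.3.4:5080;rport;branch=z9hG4bKceX7a2H2866cN\r\n"
    "Max-Forwards: 41\r\n"
    "Call-ID: e10306be-0cfd-4b38-af3c-b2ada0827cef\r\n"
    "CSeq: 126144925 INVITE\r\n"
    "Content-Type: application/sdp\r\n"
    "Content-Length: 5\r\n"
    "\r\n"
    "v=0\r\n"
)


def parse_sip_message(raw):
    """Split a SIP message into start line, headers and body."""
    # The first blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive; a header may appear several times.
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return start_line, headers, body


def next_max_forwards(headers):
    """Return the decremented Max-Forwards, or None when 483 should be sent."""
    remaining = int(headers["max-forwards"][0])
    if remaining == 0:
        return None  # do not forward; answer 483 Too Many Hops
    return remaining - 1


start_line, headers, body = parse_sip_message(RAW_INVITE)
print(start_line)
print(headers["via"][0])
print(next_max_forwards(headers))
```

Keeping every header value in a list is a deliberate choice here, because headers such as Via are expected to appear once per hop.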
SIP Message Body
SIP Request methods :
INVITE : Initiates negotiation to establish a session (dialog). Usually contains an SDP payload. Another INVITE during an existing session (dialog) is called a re-INVITE. A re-INVITE can be used to hold / resume a call and to change session parameters and codecs in the middle of a call.
ACK : Acknowledges the final response to an INVITE request, completing the 3-way handshake. If the INVITE did not contain a media description, then the ACK must contain it.
BYE : Ends a session (dialog).
CANCEL : Cancels a session (dialog) before it is established.
REGISTER : Registers a user location (host name, IP) on a registrar SIP server.
OPTIONS : Communicates information about the capabilities of the calling and receiving SIP phones (methods, extensions, codecs etc.)
PRACK : Provisional acknowledgement for a provisional response such as 183 (Session Progress). PRACK applies only to 101-199 responses.
SUBSCRIBE : Subscribes for notifications from the notifier. Can use Expires: 0 to unsubscribe.
NOTIFY : Notifies the subscriber of a new event.
PUBLISH : Publishes an event to the Server.
INFO : Sends mid session information.
REFER : Asks the recipient to issue call transfer.
MESSAGE : Transports Instant Messages.
UPDATE : Modifies the state of a session ( dialog).
SIP responses :
1xx = Informational SIP Responses
100 Trying 180 Ringing 183 Session Progress
2xx = Success Responses
200 OK – Shows that the request was successful
3xx = Redirection Responses
4xx = Request Failures
401 Unauthorized 404 Not Found 405 Method Not Allowed 407 Proxy Authentication Required 408 Request Timeout 480 Temporarily Unavailable 481 Call/Transaction Does Not Exist 486 Busy Here 487 Request Terminated 488 Not Acceptable Here 482 Loop Detected 483 Too Many Hops
5xx = Server Errors
500 Server Internal Error 503 Service Unavailable
6xx = Global Failures
600 Busy Everywhere 603 Decline 604 Does Not Exist Anywhere 606 Not Acceptable
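The response classes listed above follow directly from the first digit of the status code. A small Python sketch of that mapping follows; the function names are made up for this example.

```python
RESPONSE_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Request Failure",
    5: "Server Error",
    6: "Global Failure",
}


def classify_response(status_code):
    """Map a SIP status code to its response class using the first digit."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a SIP status code: {status_code}")
    return RESPONSE_CLASSES[status_code // 100]


def is_final(status_code):
    """1xx responses are provisional; everything from 200 up ends the transaction."""
    return status_code >= 200


for code in (100, 180, 200, 302, 486, 503, 603):
    print(code, classify_response(code), "final" if is_final(code) else "provisional")
```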
SIP can carry many kinds of MIME attachments; one such is SDP. SDP contains the session metadata used for establishing the session. It defines media information and capabilities such as codecs and formats, timestamps, and termination points like addresses and ports. Additionally, it can convey other details like bandwidth and the contact for the node acting as proxy for the session.
Session Description Protocol Version (v): 0
Owner/Creator, Session Id (o): FreeSWITCH 1532932581 1532932582 IN IP4 1.2.3.4
Owner Username: FreeSWITCH
Session ID: 1532932581
Session Version: 1532932582
Owner Network Type: IN
Owner Address Type: IP4
Owner Address: 1.2.3.4
Session Name (s): FreeSWITCH
Connection Information (c): IN IP4 1.2.3.4
Connection Network Type: IN
Connection Address Type: IP4
Connection Address: 1.2.3.4
Time Description, active time (t): 0 0
Session Start Time: 0
Session Stop Time: 0
Media Description, name and address (m): audio 29398 RTP/AVP 0 101
Media Type: audio
Media Port: 29398
Media Protocol: RTP/AVP
Media Format: ITU-T G.711 PCMU
Media Format: DynamicRTP-Type-101
Media Attribute (a): rtpmap:0 PCMU/8000
Media Attribute Fieldname: rtpmap
Media Format: 0
MIME Type: PCMU
Sample Rate: 8000
Media Attribute (a): rtpmap:101 telephone-event/8000
Media Attribute Fieldname: rtpmap
Media Format: 101
MIME Type: telephone-event
Sample Rate: 8000
Media Attribute (a): fmtp:101 0-16
Media Attribute Fieldname: fmtp
Media Format: 101 [telephone-event]
Media format specific parameters: 0-16
Media Attribute (a): silenceSupp:off - - - -
Media Attribute Fieldname: silenceSupp
Media Attribute Value: off - - - -
Media Attribute (a): ptime:20
Media Attribute Fieldname: ptime
Media Attribute Value: 20
v=0 indicates the start of the SDP content.
o=FreeSWITCH 1532932581 1532932582 IN IP4 1.2.3.4 is the session origin: the owner's name, session ID, session version and address.
c=IN IP4 1.2.3.4 is the connection data, specifying the IP address of the session.
m=audio 29398 RTP/AVP 0 101 is the media description: media type audio, port 29398, RTP/AVP profile, and payload formats 0 and 101.
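To make the line-by-line structure concrete, the following Python sketch rebuilds the raw SDP text from the dissected fields above and parses it into a dictionary. The parser and its field names are assumptions for illustration only; it ignores most of the SDP grammar (repeated lines, bandwidth, encryption keys and so on).

```python
# Raw SDP reconstructed from the dissection shown above.
SDP_TEXT = "\r\n".join([
    "v=0",
    "o=FreeSWITCH 1532932581 1532932582 IN IP4 1.2.3.4",
    "s=FreeSWITCH",
    "c=IN IP4 1.2.3.4",
    "t=0 0",
    "m=audio 29398 RTP/AVP 0 101",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:101 telephone-event/8000",
    "a=fmtp:101 0-16",
    "a=silenceSupp:off - - - -",
    "a=ptime:20",
]) + "\r\n"


def parse_sdp(text):
    """Parse SDP into session-level fields plus a list of media sections."""
    session = {"media": []}
    current = session  # lines before the first m= belong to the session
    for line in text.splitlines():
        if len(line) < 2 or line[1] != "=":
            continue  # skip anything that is not a <type>=<value> line
        key, value = line[0], line[2:]
        if key == "m":
            media_type, port, proto, *formats = value.split()
            current = {
                "type": media_type,
                "port": int(port),
                "proto": proto,
                "formats": formats,
                "attributes": [],
            }
            session["media"].append(current)
        elif key == "a":
            current.setdefault("attributes", []).append(value)
        else:
            current[key] = value
    return session


sdp = parse_sdp(SDP_TEXT)
audio = sdp["media"][0]
print(sdp["c"], audio["port"], audio["formats"])
```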
ACK – For positive replies (2xx), the ACK is a new transaction addressed using the Contact header from the response, so it can be sent straight to the UAS, bypassing the proxy. For negative replies, the ACK stays part of the INVITE transaction, hence it is sent to the same proxy as the INVITE.
Branch
The branch parameter is a transaction identifier. Responses relating to a request can be correlated because they contain the same transaction identifier.
Dialog
A dialog is the peer-to-peer relationship between 2 SIP endpoints, containing a sequence of transactions.
The initiator of the session that generates the establishing INVITE generates the unique Call-ID and From tag. In the response to the INVITE, the user agent answering the request will generate the To tag. The combination of the local tag (contained in the From header field), remote tag (contained in the To header field), and the Call-ID uniquely identifies the established session, known as a dialog. This dialog identifier is used by both parties to identify this call because there could be multiple calls set up between them.
A dialog is uniquely identified by the Call-ID header, the remote tag and the local tag. The dialog ID is different at each end, since local and remote are swapped between the two ends.
Example : Notice the To and From tags in the INVITE and its 200 OK. The dialog ID for the INVITE is 97576NjQ5MTBlNjVjNDQ0MzFmOTEyZGEzYWJjZjQxYjcyYzc70edc66c. The first INVITE doesn't bear the To tag.
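The way each end builds its own dialog ID can be sketched in a few lines of Python. The Call-ID and From tag below are taken from the sample INVITE earlier in this post; the To tag "a6c85cf" is a made-up value standing in for whatever the answering user agent would generate, and the helper names are assumptions for this example.

```python
from typing import NamedTuple


class DialogId(NamedTuple):
    call_id: str
    local_tag: str
    remote_tag: str


def dialog_id_at_caller(call_id, from_tag, to_tag):
    """At the caller (UAC) the From tag is local and the To tag is remote."""
    return DialogId(call_id, local_tag=from_tag, remote_tag=to_tag)


def dialog_id_at_callee(call_id, from_tag, to_tag):
    """At the callee (UAS) the roles of the two tags are swapped."""
    return DialogId(call_id, local_tag=to_tag, remote_tag=from_tag)


CALL_ID = "e10306be-0cfd-4b38-af3c-b2ada0827cef"
FROM_TAG = "7HKgjNQ6y2FSj"
TO_TAG = "a6c85cf"  # hypothetical tag added by the callee in its 200 OK

caller_view = dialog_id_at_caller(CALL_ID, FROM_TAG, TO_TAG)
callee_view = dialog_id_at_callee(CALL_ID, FROM_TAG, TO_TAG)
print(caller_view)
print(callee_view)
```

Both views name the same dialog, but the tuples differ because local and remote are swapped.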
All requests sent within a dialog are by default sent directly from one user agent to the other. Only requests outside a dialog traverse SIP proxies. This approach makes SIP network more scalable because only a small number of SIP messages hit the proxies.
However, a few requests need to explicitly stay on the path of the proxies, for example for accounting at call termination or when NAT traversal is being carried out. For these, a proxy inserts a Record-Route header field containing its own address into the SIP message. Messages sent within the dialog will then traverse all SIP proxies that put a Record-Route header field into the message.
The server copies the Record-Route header field unchanged into the response (Record-Route is only relevant for 2xx responses), i.e. the recipient endpoint mirrors the list of proxies in its response.
Without record routing
With record routing
Strict Routing
The Request-URI is rewritten at each hop, i.e. the Request-URI always contains the URI of the next hop, so it is necessary to save the original Request-URI as the last Route header field. Defined in RFC 2543.
Loose routing
The Request-URI is no longer overwritten; it always contains the URI of the destination user agent, thereby keeping the target separated from the route. A proxy signals loose routing with the ;lr parameter in its URI. If there are any Route header fields in a message, the message is sent to the URI from the topmost Route header field. Defined in RFC 3261.
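The difference between the two routing styles can be sketched as a small Python function that decides what goes into the Request-URI and the Route headers of an in-dialog request. This is a simplified illustration: the function names and example URIs are assumptions, and details such as stripping disallowed URI parameters are left out.

```python
def is_loose_router(uri):
    """A URI marks a loose router when it carries the ;lr parameter."""
    params = uri.strip("<>").split(";")[1:]
    return any(p.split("=")[0].lower() == "lr" for p in params)


def build_request_routing(remote_target, route_set):
    """Return (request_uri, route_headers) for a request inside a dialog."""
    if not route_set:
        return remote_target, []
    if is_loose_router(route_set[0]):
        # Loose routing: the Request-URI stays the destination user agent and
        # the route set is carried unchanged in the Route headers.
        return remote_target, list(route_set)
    # Strict routing: the Request-URI becomes the next hop and the original
    # target is saved as the last Route header field.
    return route_set[0], list(route_set[1:]) + [remote_target]


TARGET = "sip:altanai@1.2.3.4:5080"
print(build_request_routing(TARGET, ["sip:p1.example.com;lr", "sip:p2.example.com;lr"]))
print(build_request_routing(TARGET, ["sip:p1.example.com", "sip:p2.example.com"]))
```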
SIP Authorization
Authentication, security, confidentiality and integrity form the basic requirements for any communication system. To protect against hijacking of a user account and denial of service attacks, SIP uses the HTTP digest authentication mechanism with nonces and challenges, together with the 407 Proxy Authentication Required and 401 Unauthorized responses. The sender has to resend the request with an MD5 hash computed from the nonce and password (the password is never sent in the clear), which protects the password from being captured by an eavesdropper or man-in-the-middle.
Challenge / Response Scheme :
The UA sends REGISTER and receives a 401 / 407 challenge containing a nonce
The UA sends REGISTER again with an MD5 hash computed over the password and nonce, and gets a 200 OK
REGISTER using HTTP Digest for authentication using TLS transport, challenge is in form
Here qop is the Quality of Protection parameter, indicating the quality of protection that the client has applied to the message. Enabling qop (with a value such as auth or auth-int) helps to avoid replay attacks.
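The hash sent in the second REGISTER can be sketched with Python's hashlib, following the HTTP digest scheme (MD5 over user, realm and password, combined with the nonce and a hash of the method and URI). The credentials, realm and nonce values below are invented for illustration, and only the no-qop and qop=auth variants are covered.

```python
import hashlib


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce,
                    qop=None, nc=None, cnonce=None):
    """Compute the digest 'response' value for an Authorization header."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # secret part, never sent
    ha2 = md5_hex(f"{method}:{uri}")                 # binds the hash to the request
    if qop == "auth":
        # With qop the nonce count and client nonce are mixed in, so a
        # captured response cannot simply be replayed.
        return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# Hypothetical account and challenge values, for illustration only.
print(digest_response(
    username="altanai",
    realm="telecomcompany.com",
    password="secret",
    method="REGISTER",
    uri="sip:telecomcompany.com",
    nonce="abc123",
    qop="auth",
    nc="00000001",
    cnonce="0a4f113b",
))
```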
Cancellation of Registration – The UA sends a REGISTER request with Expires: 0 and Contact: * to apply to all contacts. Since the user is already authenticated, it is not challenged again.
To prevent spoofing, i.e. impersonating a server, SIP provides server authentication too. It is required by ITSPs (Internet telephony service providers).
According to RFC 3263, Session Initiation Protocol (SIP): Locating SIP Servers, if the proxy finds that the request is for an outside domain, it uses a DNS server to resolve the IP address of the target domain and forwards the request. The target domain's proxy then uses the registrar's location service to find out whether the user is present, via a location table entry. If found, the request reaches the user.
To provide session mobility, SIP endpoints send REGISTER requests to their respective registrar as they move, updating their location:
As a user changes terminals, they register themselves with the appropriate server
The location server tracks the location of the user
Redirect servers prioritise the possible locations of the user
Users keep the same services as those located at the home server; while the user is mobile, the call is processed by the home servers using Record-Route
NAT ( Network Address Translator)
Network Address Translation is defined by RFC 3022 and conserves public IP address space, since most packets are exchanged inside a private network itself.
Most internet users, whether they are using WiFi, 3G/LTE, a home AP or any other telecom data packet network run by a TSP or ISP, are assigned a private IP address, which is unreachable from the outside world. The private address blocks reserved by the Internet Assigned Numbers Authority (IANA) are 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16.
Therefore, when they access the Internet, this address is converted into a globally unique public IP address by a NAT for external communication.
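A quick way to see whether an address advertised in a Contact header or SDP c= line falls into one of the private blocks above is Python's standard ipaddress module. The helper name below is an assumption for this example.

```python
import ipaddress

# The three private IPv4 blocks listed above.
PRIVATE_BLOCKS = [
    ipaddress.ip_network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_private_ipv4(address):
    """True when the address lies inside one of the private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)


for addr in ("192.168.1.10", "172.31.255.255", "172.32.0.1", "1.2.3.4"):
    print(addr, "private" if is_private_ipv4(addr) else "public")
```

A user agent or border element could use such a check to decide whether the address it is about to advertise needs to be replaced with the public NAT address.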
SIP Issues around NAT
NATs modify IP addresses at Layer 3, whereas SIP and SDP are Layer 7 protocols, so the addresses inside their payloads pass through the NAT unchanged
SIP Via:, From: and Contact: headers use non-routable private addresses
SDP states that the originator wishes to receive media at non-routable private addresses
If a destination on the public internet tries to send SIP or RTP traffic to those private addresses, the traffic will be dropped by the first router
Solutions are to use either an Application Level Gateway (ALG), STUN or Universal Plug and Play (UPnP)
To rewrite all SIP/SDP source addresses
SIP Via:, From: and Contact: headers use the public NAT address
SDP addresses use the public NAT address
Use SIP over TCP
Use draft-ietf-sip-symmetric-response-00 and “Symmetric” SIP/RTP
Use the same UDP port number for incoming and outgoing traffic
Hold ports open for the call duration
Send a UDP packet typically every 30 seconds
SIP over UDP uses a 30 second re-INVITE, REGISTER or OPTIONS
RTP sends at a much higher frequency by default
NAPT (Network Address Port Translator) – Can map multiple private IP addresses and ports to one public IP address and its ports.
To adapt SIP to modern IP networks with inter-network traversal, ICE and far-end and near-end NAT traversal solutions are used. NAT traversal is critical to traffic flow between private and public networks, and from behind firewalls and policy controlled networks.
One can use any of the VOVIDA-based STUN server, mySTUN, TurnServer, reStund, CoTURN, NATH (PJSIP NAT Helper), ReTURN or ice4j.
Near-end NAT traversal
STUN (session traversal utilities for NAT) – The UA itself detects the presence of a NAT and learns the public IP address and port assigned by the NAT. It then replaces the device's local private IP address with it in the SIP headers and SDP. Near-end traversal is implemented via STUN, TURN and ICE. The limitations are that STUN doesn't work with symmetric NAT (where each connection gets a different mapping with a different, randomly generated port), nor in situations where an endpoint has multiple addresses.
TURN (traversal using relays around NAT) or STUN relay – The UA learns the public IP address of the TURN server and asks it to relay incoming packets. Its limitation is that, since it handles all incoming and outgoing traffic, it must scale to meet traffic requirements and should not become a bottleneck or single point of failure.
ICE (interactive connectivity establishment) – The UA gathers communication candidates, each with a priority, and exchanges them with the remote party. The client then pairs its local candidates with the received peer candidates and performs connectivity checks on all pairs as part of the offer-answer negotiation, thereby maximising the chance of success. The types of candidates are: host candidates, which represent the client's own IP addresses; server reflexive candidates, for the address that has been resolved via STUN; and relayed candidates, for the address that has been allocated from a TURN relay by the client.
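The candidate priorities mentioned above are plain integers computed from a formula in the ICE specification. The Python sketch below uses the type preference values recommended there (126 for host, 100 for server reflexive, 0 for relayed); the function names and the default local preference of 65535 are choices made for this example.

```python
# Recommended type preferences from the ICE specification.
TYPE_PREFERENCE = {"host": 126, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_pref=65535, component_id=1):
    """priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID)."""
    return (2 ** 24) * TYPE_PREFERENCE[cand_type] + (2 ** 8) * local_pref + (256 - component_id)


def pair_priority(controlling, controlled):
    """Priority of a candidate pair, from the two candidates' priorities."""
    g, d = controlling, controlled
    return (2 ** 32) * min(g, d) + 2 * max(g, d) + (1 if g > d else 0)


local = {t: candidate_priority(t) for t in TYPE_PREFERENCE}
remote_host = candidate_priority("host")

# Order the pairs so the most direct path (host to host) is checked first.
pairs = sorted(local, key=lambda t: pair_priority(local[t], remote_host), reverse=True)
print(local)
print(pairs)
```

The ordering explains why a relayed path through TURN is only used when the more direct candidate pairs fail their connectivity checks.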
Far-end NAT traversal
The UA is not concerned about NAT at all and communicates using its local IP address and port. The border controller employs a NAT handling component such as an application layer gateway (ALG) or universal plug and play (UPnP), which resolves the private and public network address mapping by acting as a back-to-back user agent (B2BUA).
Far-end NAT traversal can also be enabled by deploying a public SIP server which performs media relay (RTP proxy / media proxy).
Limitations of this approach
security risks, as the servers are operating in a public network
enabling reverse traffic from the UAS to a UAC behind NAT
A keep-alive mechanism is used to keep open the NAT translations of communications between a SIP endpoint and its serving SIP servers, so that the NAT translation can be reused for routing. It consists of a client-to-server “ping” keep-alive and a corresponding server-to-client “pong” message. There are 2 keep-alive mechanisms: a CRLF keep-alive and a STUN keep-alive message exchange.
SIP Flows
Components of SIP based VoIP Solution
Registration
Location Server – Used by the Proxy Server and Redirect Server to obtain the location of the called user (one or more addresses)
Registration Server – Accepts registration requests from the client applications. Generally, the service is offered by the Proxy Server or Redirect Server
DNS Server – Used to locate the Proxy Server or Redirect Server using NAPTR or SRV records
The 3 types of SIP URIs,
address of record (AOR)
fully qualified domain name (FQDN)
globally routable user agent (UA) URI
SIP uniform resource identifiers (URIs) are resolved via DNS, since the part of the URI after the @ symbol contains the hostname, port and protocol for the next hop.
Locating the correct SIP server for a SIP message can be done using:
DNS service record (DNS SRV)
naming authority pointer (NAPTR) DNS resource record
Steps for SIP endpoints locating SIP server
From the domain in the SIP URI, look up the NAPTR record to get the protocol to be used
Inspect the SRV record to fetch the port to use
Inspect the A/AAAA record to get the IPv4 or IPv6 address
ref : RFC 3263 – Locating SIP Servers
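The three lookup steps can be sketched in Python against a small in-memory table instead of a real DNS server. Every record below (hostnames, priorities and the 192.0.2.x addresses) is invented for illustration, and the sketch simply takes the first record at each step rather than applying SRV weights or trying fallbacks.

```python
# Hypothetical zone data standing in for real DNS answers.
FAKE_DNS = {
    ("NAPTR", "telecomcompany.com"): [
        # (order, preference, service, replacement)
        (10, 50, "SIP+D2T", "_sip._tcp.telecomcompany.com"),
        (20, 50, "SIP+D2U", "_sip._udp.telecomcompany.com"),
    ],
    ("SRV", "_sip._tcp.telecomcompany.com"): [
        # (priority, weight, port, target)
        (20, 40, 5060, "sip2.telecomcompany.com"),
        (10, 60, 5060, "sip1.telecomcompany.com"),
    ],
    ("A", "sip1.telecomcompany.com"): ["192.0.2.10"],
    ("A", "sip2.telecomcompany.com"): ["192.0.2.20"],
}

# NAPTR service values used for SIP and the transport each one selects.
SERVICE_TO_TRANSPORT = {"SIP+D2U": "udp", "SIP+D2T": "tcp", "SIPS+D2T": "tls"}


def locate_sip_server(domain, dns=FAKE_DNS):
    """Resolve a SIP domain to (transport, ip_address, port)."""
    # Step 1: NAPTR picks the transport protocol (lowest order wins).
    _, _, service, srv_name = sorted(dns[("NAPTR", domain)])[0]
    # Step 2: SRV gives the port and target host (lowest priority value wins).
    _, _, port, target = sorted(dns[("SRV", srv_name)])[0]
    # Step 3: A/AAAA gives the address of the target host.
    address = dns[("A", target)][0]
    return SERVICE_TO_TRANSPORT[service], address, port


print(locate_sip_server("telecomcompany.com"))
```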
A BIND9 server can be used for DNS resolution; it supports NAPTR/SRV, ENUM, DNSSEC, multiple domains, and private or public trees.
Call Redirection
The caller sends an INVITE, but as the redirect server responds with 302 Moved Temporarily, a new destination address is returned. The INVITE is then sent to another proxy server, which connects the SIP endpoints after consultation with the redirect server.
At this stage we see the call getting connected to the SIP endpoint via 2 proxy servers. The redirect server doesn't stay in the path once the initial SIP request has been sent.
After communication, the endpoints send BYE to terminate the session.
Forking
This call flow deals with the use case where a user may be registered from multiple SIP phones (perhaps one home phone, one car phone and one office desk phone) and wants to receive a ring on all registered phones, i.e. fork a call to multiple endpoints.
In the above diagram we can see a forked INVITE going to both SIP phones. Both of them reply with 100 Trying and 180 Ringing, but only one gets answered by the user.
After one endpoint sends 200 OK and connects to the session, the other receives a CANCEL from the SIP server.
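The forking decision described above (first 200 OK wins, the rest are cancelled) can be modelled with a short Python function. The contact URIs and the order of responses are invented for this sketch, which only models the decision and sends no SIP messages.

```python
def fork_call(contacts, responses):
    """Return (answering_contact, contacts_to_cancel).

    contacts:  every registered contact the INVITE was forked to
    responses: (contact, status_code) tuples in order of arrival
    """
    pending = set(contacts)
    for contact, status in responses:
        if status < 200:
            continue  # 100 Trying / 180 Ringing: branch is still pending
        pending.discard(contact)  # a final response closes that branch
        if 200 <= status < 300:
            # First 2xx wins; every branch still ringing gets a CANCEL.
            return contact, sorted(pending)
    return None, []


contacts = ["sip:home@192.0.2.1", "sip:car@192.0.2.2", "sip:office@192.0.2.3"]
responses = [
    ("sip:home@192.0.2.1", 100),
    ("sip:car@192.0.2.2", 100),
    ("sip:office@192.0.2.3", 180),
    ("sip:car@192.0.2.2", 486),     # car phone is busy
    ("sip:office@192.0.2.3", 200),  # office phone answers
]
print(fork_call(contacts, responses))
```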
Click to Dial
A web or desktop application that speaks HTTP can fire an API call, which is interpreted by the controller or SIP server, and the call is placed.
The API can contain parameters for the to and from SIP addresses, as well as any authentication token required for API authentication and validation.
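As a rough sketch of such an API call, the Python snippet below builds (but does not send) an HTTP request carrying the to and from SIP addresses and a token. The endpoint URL, JSON field names and bearer-token scheme are all hypothetical; a real click-to-dial service defines its own.

```python
import json
import urllib.request


def build_click_to_dial_request(api_url, from_uri, to_uri, token):
    """Build the HTTP POST that asks the controller to set up a call."""
    payload = json.dumps({"from": from_uri, "to": to_uri}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # hypothetical auth scheme
        },
        method="POST",
    )


request = build_click_to_dial_request(
    "https://api.example.com/v1/calls",  # hypothetical endpoint
    from_uri="sip:+16014801797@1.2.3.4",
    to_uri="sip:altanai@telecomcompany.com",
    token="example-token",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would actually send it; that step is left out here because the endpoint does not exist.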
SIP for Instant Messaging and Presence Leveraging Extensions (SIMPLE)
several vendors who intend to implement SIMPLE
provides for presence and buddy lists
Instant Messaging in the enterprise
telephony enabled user lists
Using SIP based call routing algorithms and flows, one can build a carrier grade communication solution. SIP solutions can hook up with existing telecom networks and service providers to be backward compatible. SIP also has largely untapped potential to integrate with external IP applications or services to provide converged, customised control of both the signalling and media planes.
RFC 3665 – Session Initiation Protocol (SIP) Basic Call Flow Examples
It contains SIP implementation examples such as:
SIP Registration – Successful New Registration, Update of Contact List, Request for Current Contact List, Cancellation of Registration, Unsuccessful Registration
SIP Session Establishment – Successful Session Establishment, Session Establishment Through Two Proxies, Session with Multiple Proxy Authentication, Successful Session with Proxy Failure, Session Through a SIP ALG, Session via Redirect and Proxy Servers with SDP in ACK, Session with re-INVITE (IP Address Change), Unsuccessful No Answer, Unsuccessful Busy, Unsuccessful No Response from User Agent, Unsuccessful Temporarily Unavailable
Security Considerations
RFC 5359 – Session Initiation Protocol Service Examples
It contains the description for services like Call Hold, Consultation Hold, Music on Hold, Transfer – Unattended, Transfer – Attended, Transfer – Instant Messaging, Call Forwarding Unconditional, Call Forwarding – Busy, Call Forwarding – No Answer, 3-Way Conference – Third Party Is Added, 3-Way Conference – Third Party Joins, Find-Me , Call Management (Incoming Call Screening) , Call Management (Outgoing Call Screening) , Call Park , Call Pickup , Automatic Redial , Click to Dial
The rapidly changing scene of telecom operations brings to light many challenges faced by telcos and service providers as they cater to end users with swift and innovative services, while at the same time keeping operational costs at bay.
Today a client expects converged, orchestrated, harmonized applications bringing call control and customizable features all under one roof.
What is Unified Communication ?
Unified Communications solutions bring together voice, messaging, video, and desktop applications to enable companies to quickly adapt to market changes, increase productivity, improve competitive advantage and deliver a rich-media experience across any work space.
Components of Unified Communications.
The latest UCC solutions are based on open standards such as SIMPLE/XMPP protocols or socket.io and REST webservices
Communications: Voice, data, and video
Messaging: Voice, email, video, and IM
Conferencing: Online, audio, and video
Application integration: Microsoft Office and CRM
Presence: IP phone, desktop clients, and call connectors
Common user experience: Desktop, phone, and mobility
Services like: voicemail, IVR, auto attendant with VoiceXML
What is the need of Unified communication ?
Currently the modes of communication differ across users, such as email, SMS, VoIP calls, GSM calls, messages on other platforms etc.
These scattered forms of communication cannot be tracked and hamper fast decision making.
Challenges with Unified Communications
Adds complexity to an already complex infrastructure
Lack of standardization
Organization Infrastructure and Bandwidth Limitations
Integration of services from different application platforms like emails
migrating existing communication infrastructure like desk phones
Interfacing of telephony applications with Business Applications such as CRMs
Types of UCC solutions
There are broadly two types of UC&C solution: on-premise and cloud based. The fundamental difference is the location of the backend infrastructure supporting the communication system. Some more differences are outlined below:
On-premise
Hosted by the consumer / business unit itself
More customizability and flexibility
More investment and maintenance
Cloud based
Mostly SaaS (software as a service) in nature
Service provider offers its infrastructure to the consumer as a service
Bills monthly / yearly etc.
Quick setup
Lower upfront payment
Billing is either on a per-user basis or on consumption
Data is synced to cloud servers for storage and can be fetched from there when required, such as cloud synced call logs or contact book
Device / Platform Agnostic
The UCC clients are designed keeping mobility in mind. Thus UC solutions are made compatible with online provisioning / portal systems, native mobile apps for Android / iOS, and desktop apps for Linux, Mac, Windows etc.
UC&C + CRM
UC&C models integrate with Enterprise and Customer Relationship Management (CRM) systems, and therefore provide unified messaging across teamspace / workspace / workflow management systems. The communication is trackable and can be used for real-time notification and analytics.
This can boost sales and profitability through quick communication between customers / partners / eco-system / sales reps / developers / field agents and others who are part of the communication system.
SMEs adopt UCC solutions
SMEs (Small and Medium Enterprises) are the first to adopt UC&C due to the absence of an already established traditional communication infrastructure like PSTN lines, desk phones, special handsets etc.
Factors like :
quick setup
low budget of UC&C solutions
BYOD ( Bring your Own Device ) to work
enable SMEs to adopt UC&C solutions much faster and more readily.
It has multiple advantages like:
All information is stored in one place therefore easier to retrieve
Increased Productivity
Greater flexibility
Faster Response and Information delivery
Future of UCC
Context Driven + Real Time Analytics
Monetization by cross vertical integration
Integration with IOT
Automation for Billing and Operation – OSS/BSS
Machine Learning
Big Data Management
In conclusion, UC&C is aimed at providing interoperable communications with ubiquitous coverage for applications and devices such as desktops, IP phones, smartphones, smart watches, kiosks etc. It means web apps, native apps and IP phones having the ability to create, share and participate in integrated multimedia (like audio, video, desktop, files) collaboration.