VoIP manages call setup and teardown over the IP protocol. APIs can provide public or internal endpoints to create and manage calls and conferences, add on services like recording and transcription, or even handle auth and heartbeat. This article lists some external programmable call control APIs, internal APIs for billing and health, as well as rate limiting.
get CDR (filtered per call, or according to a specific date or account)
bulk export of CDR
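The CDR filters listed above can be sketched as a single function. This is a minimal illustration only: the `CDR` fields, the `filter_cdrs` name and its parameters are assumptions made for the example and are not taken from any specific platform's API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class CDR:
    # Hypothetical minimal call detail record
    call_id: str
    account_id: str
    start: datetime
    duration_sec: int


def filter_cdrs(
    cdrs: Iterable[CDR],
    call_id: Optional[str] = None,
    account_id: Optional[str] = None,
    since: Optional[datetime] = None,
    until: Optional[datetime] = None,
) -> List[CDR]:
    """Return the CDRs that match every filter that was supplied."""
    result = []
    for cdr in cdrs:
        if call_id is not None and cdr.call_id != call_id:
            continue
        if account_id is not None and cdr.account_id != account_id:
            continue
        if since is not None and cdr.start < since:
            continue
        if until is not None and cdr.start >= until:  # 'until' is exclusive
            continue
        result.append(cdr)
    return result


records = [
    CDR("c1", "acct-1", datetime(2024, 1, 1, 10, 0), 60),
    CDR("c2", "acct-2", datetime(2024, 1, 2, 11, 0), 120),
    CDR("c3", "acct-1", datetime(2024, 1, 3, 12, 0), 30),
]
per_account = filter_cdrs(records, account_id="acct-1")
```

A bulk export could reuse the same filter and stream the result to a file instead of returning it in one response.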
Internal API gateways
API Rate Limiter
Noisy neighbour is when one of the clients monopolizes the bandwidth, using most of the I/O, CPU or other resources, which can negatively affect the performance for other users. Throttling is a good way to solve this problem by limiting how much each client can consume.
Auto scaling: horizontal or vertical scaling can counter incoming traffic. (-) It takes time to scale out and thus cannot solve the noisy neighbour problem immediately.
Load balancer: the LB can limit the number of simultaneous requests. It can reject excess requests or send them to a queue for later processing. (-) The LB’s behaviour is indiscriminate (it cannot distinguish between the cost of different operations). (-) The LB cannot ensure uniform distribution of operations among all servers.
Rate limiter: can intelligently understand the cost of each operation and perform throttling.
A rate limiter should have low latency and be accurate and scalable.
Rate limiter inside the service process: (+) faster, no IPC (+) resistant to interprocess call failures (-) service memory needs to allocate space for rate limiters
Rate limiter as its own process outside the service, as a daemon: (+) programming language agnostic daemon (+) uses its own memory space, more predictable; widely used for auto discovery of service hosts
Token based Rate Limiting
provides admission control
Token bucket filter
define a user's quota in terms of average rate and burst capacity
Hierarchical Token Bucket ( HTB)
uses the deficit round-robin algorithm for fair queuing
Fair Queuing
give paying users a bandwidth fraction of 25%
priority queuing
decide 1 packet/ms for free or reduced-rate users
distributes that sender’s bandwidth among the other senders
CBQ (Class Based Queuing)
Shaping is performed using link idle time calculations based on the timing of dequeue events and underlying link bandwidth. Input classes that tried to send too much were restricted, unless the node was permitted to “borrow” bandwidth from a sibling.
Modular QoS Command-Line Interface (MQC) Shaping
implement traffic shaping for a specific type of traffic using a traffic policy
When the rate of packets matching the specified traffic classifier exceeds the rate limit, the device buffers the excess packets.
When there are sufficient tokens in the token bucket, the device forwards the buffered packets at an even rate.
When the buffer queue is full, the device discards the buffered packets.
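The token bucket and shaping behaviour described above can be sketched as follows. This is a simplified model under stated assumptions: one token per packet, a caller-supplied clock (`now`, in seconds), and tail drop of the arriving packet when the buffer is full. The class and method names are made up for the example.

```python
from collections import deque


class TokenBucketShaper:
    """Token bucket (average rate + burst) with a bounded buffer for excess packets."""

    def __init__(self, rate_per_sec, burst, buffer_size, now=0.0):
        self.rate = rate_per_sec        # average rate: tokens added per second
        self.burst = burst              # bucket capacity: largest allowed burst
        self.tokens = float(burst)      # start with a full bucket
        self.buffer = deque()           # packets waiting for tokens
        self.buffer_size = buffer_size
        self.last = now

    def _refill(self, now):
        # Add tokens for the elapsed time, never exceeding the burst capacity
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now

    def offer(self, packet, now):
        """Return 'forward', 'buffer' or 'drop' for an arriving packet."""
        self._refill(now)
        if not self.buffer and self.tokens >= 1:
            self.tokens -= 1
            return "forward"
        if len(self.buffer) < self.buffer_size:
            self.buffer.append(packet)  # shaping: delay until tokens are available
            return "buffer"
        return "drop"                   # assumption: tail drop when the buffer is full

    def drain(self, now):
        """Forward buffered packets for which tokens are now available."""
        self._refill(now)
        sent = []
        while self.buffer and self.tokens >= 1:
            self.tokens -= 1
            sent.append(self.buffer.popleft())
        return sent


shaper = TokenBucketShaper(rate_per_sec=1, burst=2, buffer_size=1)
decisions = [shaper.offer(p, now=0.0) for p in ("a", "b", "c", "d")]
released = shaper.drain(now=1.0)
```

With a burst of 2 and a buffer of 1, the first two packets pass, the third waits and the fourth is dropped; one second later a single token is available and the waiting packet is released.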
Throttling
delay the packet until the bucket is ready / shaping
drop the packet / Policing
mark the packet as non-compliant
Failure management on Rate Limiter
Node crash: just fewer requests are throttled
Leaky bucket
tokens can go negative
System Design for API gateway
Important points for designing an API gateway
Serialize data in a compact binary format
allocate a buffer in memory, build a frequency count hash table, and flush it once full or based on time to calculate counters
aggregation on API gateway on the fly
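The in-memory frequency count with size-based or time-based flushing can be sketched as below. The `FrequencyBuffer` name, the `sink` callback and the explicit `now` argument are assumptions for illustration; a real gateway would flush to its metrics or counter backend.

```python
from collections import Counter


class FrequencyBuffer:
    """Counts events per key in memory and flushes when full or when too old."""

    def __init__(self, max_keys, flush_interval_sec, sink, now=0.0):
        self.max_keys = max_keys
        self.flush_interval = flush_interval_sec
        self.sink = sink              # callable that receives a dict of counts
        self.counts = Counter()
        self.last_flush = now

    def record(self, key, now):
        self.counts[key] += 1
        full = len(self.counts) >= self.max_keys
        expired = now - self.last_flush >= self.flush_interval
        if full or expired:
            self.flush(now)

    def flush(self, now):
        if self.counts:
            self.sink(dict(self.counts))
            self.counts.clear()
        self.last_flush = now


flushed = []
buf = FrequencyBuffer(max_keys=2, flush_interval_sec=10, sink=flushed.append)
buf.record("acct-1", now=1.0)
buf.record("acct-1", now=2.0)
buf.record("acct-2", now=3.0)   # second distinct key: buffer is full, flush
buf.record("acct-3", now=20.0)  # more than 10 s since the last flush: flush
```

Aggregating on the fly this way means the gateway ships one counter update per key per flush instead of one per request.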
Frontend Service: lightweight web service, stateless, request validation, auth / authorization, TLS (SSL) termination, server side encryption, caching, rate limiting (throttling), request deduplication
Partitioned Service: caching layer between frontend and backend
Backend Service: replication, leader selection + quorum
Distributed messaging system (fast and slow paths) for API
A distributed messaging system such as Apache Kafka or AWS Kinesis internally splits a message stream across several partitions, where each partition can be placed on a single shard on a separate machine in a clustered system.
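A rough sketch of how a message key can be mapped to a partition is shown below. It uses MD5 from the standard library purely as a stable stand-in hash; Kafka's default partitioner uses a murmur2 hash of the key, and the `partition_for` helper here is hypothetical.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index in a stable, process-independent way."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# All events for one account land on the same partition, which preserves
# per-account ordering while spreading accounts across the cluster.
assignments = {f"acct-{i}": partition_for(f"acct-{i}", 8) for i in range(200)}
```

Python's built-in `hash()` is avoided on purpose because string hashes are randomized per process, which would send the same key to different partitions on different producers.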
I have been contemplating the points that make a developer successful at building solutions and services for a Telecom Application Server. The trend has shown many variations, from pure IN programs like VPN and prepaid billing logic, to SIP servlets for call parking and call completion, and from SIP servlets to JAIN SLEE open standard-based communication.
A cloud communication provider acts as a service provider between SMEs (Small and Medium Enterprises) and large scale telco carriers. An important concern for a cloud provider is to build a scalable and flexible platform. Let’s go in-depth to discuss how one can go about achieving scalability in SIP platforms.
A typical semi multi-geography scaled, read replica based / data sharding based distributed VoIP system is controlled by a router that distributes the traffic to various regions based on destination number prefix matching.
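The prefix-based routing step can be sketched as a longest-prefix match over the dialled number. The prefix table and region names below are invented for the example; a production router would load them from its routing database.

```python
def route_by_prefix(number, prefix_table, default=None):
    """Return the region for the longest prefix of `number` found in the table."""
    digits = number.lstrip("+")
    # Try the longest candidate prefix first so that "4420" wins over "44"
    for length in range(len(digits), 0, -1):
        region = prefix_table.get(digits[:length])
        if region is not None:
            return region
    return default


# Hypothetical prefix-to-region table
PREFIX_TABLE = {
    "1": "us-east",
    "44": "eu-west",
    "4420": "eu-london",
    "91": "ap-south",
}

region = route_by_prefix("+442071234567", PREFIX_TABLE)
```

A trie would avoid the repeated dictionary lookups for very large tables, but the lookup order and the result are the same.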
Clusters of SIP servers are great at providing high availability and resilience; however, they also add latency and management overhead.
Considerations for a clustered SIP application server setup
memory requirements to store the state for a given session and the increasing overhead of having more than two replicas within a partition.
Co-hosted virtual machines add resource contention and delay call establishment due to multi-node traversal.
In case of node failures or a planned reboot after an upgrade, traffic redirection needs to drain existing calls from the SIP server before bringing it down. This setup ensures that
no new calls are channelled to this server
the server waits for existing calls to end before rebooting.
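The two draining guarantees above can be modelled with a small state holder. This is only a sketch: `SipNode` and its methods are hypothetical names, and in practice the drain flag would typically be set in the load balancer or dispatcher rather than on the node itself.

```python
class SipNode:
    """Tracks active calls and refuses new ones once draining has started."""

    def __init__(self, name):
        self.name = name
        self.draining = False
        self.active_calls = set()

    def start_drain(self):
        self.draining = True

    def accept_call(self, call_id):
        if self.draining:
            return False              # no new calls are channelled to this server
        self.active_calls.add(call_id)
        return True

    def end_call(self, call_id):
        self.active_calls.discard(call_id)

    def safe_to_reboot(self):
        # Reboot only after draining started and every existing call has ended
        return self.draining and not self.active_calls


node = SipNode("sip-1")
node.accept_call("call-a")
node.start_drain()
rejected = not node.accept_call("call-b")
before = node.safe_to_reboot()
node.end_call("call-a")
after = node.safe_to_reboot()
```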
Fail fast and recover : The system should be reliable to not let a node failure propagate and become root cause for entire system failure due to corrupted data.
A Clustered SIP platform is quickly recoverable with containerized applications. Clear separation between stateless engine layer and session management or Data layer is critical to enable auto-reboot of failed nodes in engine layer.
It should be noted that, unlike HTTP based platforms, dialogue and transaction state variables are critical to SIP platforms, for example call duration for a CDR entry. Therefore, to survive a mid-call failure and auto-reboot, the state variables should be replicated to an external cache so that their values persist for correct billing.
Symmetrical Multi-Processing (SMP) architectures have
stateless “Engine Tier” processes all traffic and
distributes all transaction and session state to a “Data Tier.”
A very good example of this is the Oracle Communications Converged Application Server Cluster (OCCAS) which is composed of 3 tiers :
Message dispatcher,
Communication engine stateless
Datastore which is in-memory session store for the dialogues and ongoing transactions
An advantage of having stateless servers is that if the application server crashes or reboots, the session state is not lost as a new server can pick up the session information from an external session store .
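The benefit of keeping session state outside the engine can be illustrated with a toy example. `SessionStore` is a dictionary-backed stand-in for an external in-memory store, and `Engine`, `on_answer` and `on_bye` are hypothetical names; timestamps are plain seconds.

```python
class SessionStore:
    """Stand-in for an external in-memory store shared by all engine nodes."""

    def __init__(self):
        self._data = {}

    def put(self, call_id, state):
        self._data[call_id] = dict(state)

    def get(self, call_id):
        return dict(self._data[call_id])

    def delete(self, call_id):
        self._data.pop(call_id, None)


class Engine:
    """Stateless engine: keeps nothing about a call between messages."""

    def __init__(self, store):
        self.store = store

    def on_answer(self, call_id, answered_at):
        self.store.put(call_id, {"answered_at": answered_at})

    def on_bye(self, call_id, ended_at):
        state = self.store.get(call_id)
        self.store.delete(call_id)
        # The duration for the CDR comes from the stored state, not engine memory
        return {"call_id": call_id, "duration_sec": ended_at - state["answered_at"]}


store = SessionStore()
engine_1 = Engine(store)
engine_1.on_answer("call-1", answered_at=100.0)
del engine_1                      # simulate a crash of the first engine mid-call
engine_2 = Engine(store)          # a replacement engine picks up the session
cdr = engine_2.on_bye("call-1", ended_at=160.0)
```

The replacement engine produces the correct call duration even though it never saw the call being answered.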
The components for a well-performing highly scalable SIP architecture are abstracted in their role and responsibilities. We can have categories like
Load Balancer / Message Dispatcher
LB routes traffic based on an algorithm (round robin, hashing , priority based scheduling, weight-based scheduling ) among active and ready servers
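Two of the algorithms named above, weight-based scheduling and hashing, can be sketched as follows. The server tuples `(name, weight, ready)` and both function names are assumptions for the example. Hashing on the SIP Call-ID is shown because it keeps all messages of one call on the same server.

```python
import hashlib
from itertools import cycle


def weighted_round_robin(servers):
    """Cycle over ready servers, each repeated according to its weight."""
    expanded = [name for name, weight, ready in servers if ready for _ in range(weight)]
    if not expanded:
        raise ValueError("no ready servers")
    return cycle(expanded)


def pick_by_hash(call_id, servers):
    """Pick a ready server deterministically from the Call-ID."""
    ready = sorted(name for name, _, is_ready in servers if is_ready)
    if not ready:
        raise ValueError("no ready servers")
    digest = hashlib.sha256(call_id.encode("utf-8")).digest()
    return ready[int.from_bytes(digest[:8], "big") % len(ready)]


servers = [("sip-a", 2, True), ("sip-b", 1, True), ("sip-c", 5, False)]
rr = weighted_round_robin(servers)
first_six = [next(rr) for _ in range(6)]
sticky = pick_by_hash("call-1@example.invalid", servers)
```

Note that the simple modulo hash remaps many calls when the set of ready servers changes; consistent hashing is a common refinement.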
Backend Dynamic Routing and REST API services
Services that the application server calls during call flow execution, for tasks such as looking up the IP address associated with the caller or the screened numbers associated with the destination, for example via XML Remote Procedure Call (XML-RPC) or the AVAPI Service in Kamailio
OSS/BSS layer
This layer is responsible for jobs in relation to operations and billing and should take place in an independent system without affecting the session call flow or causing a high RTT. Some top features possible with defining this layer well are
There are other components in a typical VoIP microservices architecture, such as a heartbeat service, backend accounting service, security check service, REST API service, dynamic routing service, event notification service etc., which should be decoupled from each other, leading to a highly parallel programming approach.
To improve flexibility w.r.t. infrastructure binding, all server components, including edge components, proxies, engines and media servers, must be containerized in the form of images (for example Docker) for easy deployment via infrastructure tools like Kubernetes, Terraform or Chef cookbooks, and be efficiently controlled with an identity management tool and a CI/CD (continuous integration and delivery) tool like Travis or Jenkins.
Autoscaling Cloud Servers using containerized images
Autoscaled servers are provided by the majority of cloud infrastructure providers, such as AWS (Amazon Web Services) and Google Cloud Platform, which scale the capacity based on traffic in real time, also called elasticity. Any VoIP developer would notice patterns in voice traffic, such as lower traffic during holidays and night hours, when servers can be freed, and peaks during the day, when server capacity needs to scale up.
Additionally, traffic may spike when the setup is under a DDoS attack, not an uncommon thing for a SIP server; the server then needs to identify and block malicious source points and prevent unnecessary upscaling. There are 2 approaches to scaling
Scale UP / Vertical scaling: reusing the existing server and upgrading its performance to match the load requirements.
Scale OUT / Horizontal scaling: increasing the number of servers and adding their IPs to the load balancer to manage traffic. It should be noted that scaling up or down should be carried out incrementally to have better control over the resource-to-requirement ratio.
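Incremental scaling can be sketched as a capped step towards the desired capacity. The per-server call capacity, the step cap and the function name below are illustrative assumptions, not recommended values.

```python
import math


def next_server_count(current, concurrent_calls, calls_per_server,
                      max_step=2, min_servers=1):
    """Move towards the desired server count by at most `max_step` per evaluation."""
    desired = max(min_servers, math.ceil(concurrent_calls / calls_per_server))
    step = max(-max_step, min(max_step, desired - current))
    return current + step


# 1000 concurrent calls at 100 calls per server needs 10 servers,
# but each evaluation adds at most 2.
scaled_out = next_server_count(current=4, concurrent_calls=1000, calls_per_server=100)
scaled_in = next_server_count(current=6, concurrent_calls=100, calls_per_server=100)
```

Capping the step keeps a sudden spike, including one caused by an attack, from triggering a large and costly jump in capacity in a single evaluation.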
It is crucial for any voice traffic / media services provider to have state-of-the-art security for its content without disrupting data privacy norms.
SIP security topics such as authentication, authorization, impersonating a server, tampering with message bodies, mid-session threats like tearing down a session, denial of service and amplification, full encryption vs hop-by-hop encryption, transport and network layer security, HTTP authentication, SIP URI, nonce and SIP over TLS flows can be read at https://telecom.altanai.com/2020/04/12/sip-security/
While scaling out the infrastructure to extend the PoP (point of presence) across different geographies, define zones such as
red zone : public facing servers like load balancers
dmz zone (demilitarized zone) : interfacing servers between the private and public networks
green zone : private and secure internal servers which communicate over private IPs and should be unreachable from outside.
To further increase the efficiency of communication and transmission between green zone servers, set up a private VPC (Virtual Private Cloud) between them.
To establish itself as a dependable realtime communication provider, the product must follow standardised RFCs and stacks such as SIP RFC 3261 and the W3C draft for WebRTC peer connection. It is also good practice to stay updated with all recommendations by ITU and IANA and keep up with their implementation. For example: STIR/SHAKEN https://telecom.altanai.com/2020/01/08/cli-ncli-and-stir-shaken/
Many Communication service providers offer Voice over IP and related unified communication and collaboration platforms. A new VoIP provider needs to envision enhancements and innovations that meet the growing user expectation.
Easy to follow technical documentation and help, and quick responses to any technical question about the platform posted on Q&A sites (StackOverflow, Quora ..), tech forums (Google groups, Slack channels ..) and even Twitter handles to address issues.
Data Visualization Tools – Show overall call quality insights, call flows, stats, probable issues, fixes, spending/saving on user groups, duration, negative-positive margins, healthy/unhealthy calls, spams etc.
Graphical Event Timelines – time based events such as call setup, termination, codec negotiation, call redirection events
Drag and Drop Call Flow designer – As call routing logic becomes more complicated, with a large set of known and pre-defined operations (parking, routing, voicemail, forking, redirecting etc.), the call routing can be easily composed from these preset operations as UI blocks attached to a call flow chain, which results in calls being channelled as predefined by this call flow logic. This gives customers plenty of customizability and flexibility to design their calls.
Encourage users to use the services either for free or for a minimal price
Besides increasing the onboarding count and developing an international presence, this also helps gain good word of mouth and pays off in the long term.
Discounts and onboarding bonuses encourage users to try out services without signing up for long term contracts. The value could range from a $5-15 one-time onboarding bonus to use on services such as DID number purchase, outgoing telco calls or any other service add-on.
No or minimal onboarding cost
Toll-free minutes 50- 1000 minutes per month.
Competitive pricing: some of the entries below show an approximate pricing figure for various services (note these may be outdated and should not be used as references as they are).
Pay as you go pricing : Rate per minute (USD) plan, for example (from Google Voice)
Australia – Mobile ~$0.02
Portugal – Mobile ~$0.15
Switzerland – Mobile – Orange ~$0.11
United Kingdom – Mobile – Orange ~$0.02
United Kingdom – Mobile – Vodafone ~$0.01
Outbound calls to PSTN ~$0.015 per min (depending on telco and destination)
Incoming Voice Calls on a Local Number ~$0.0060 per min
Incoming Voice Calls on a Toll-Free Number ~$0.020 per min
VoIP Calls (In-App WebRTC & SIP) ~$0.003 per min
Addon Services
Call Recordings ~$0.0025 per min
Voicemail Detection, Call Analyzers, Call Transcription ~$0.015 – $0.050 per min (depends on external API cost or in-house R&D effort)
Automatic Speech Recognition ~$0.02 per 15 sec
Trunk Calls (heavy volume customers)
Inbound Voice Calls ~$0.0025/min
Outbound Voice Calls ~$0.0065/min
Toll-free Inbound Voice Calls ~$0.0135/min (toll-free numbers usually charge more than local numbers)
Services which should be offered on a non-chargeable basis :
Round the clock technical support
Compensation for Downtime
CDRs per account
IP to IP calls
Security Certificates in TLS and SRTP calls
Authentication and Authorization
Services that can be charged are
Value added services – Live Weather updates , horoscope update ..
Carrier Integration – trunk, PRI
Toll Free Numbers – DID numbers
Virtual Private Network (VPN) : An Intelligent Network (IN) service, which offers the functions of a private telephone network. The basic idea behind this service is that business customers are offered the benefits of a (physical) private network, but spared from owning and maintaining it
Access Screening(ASC): An IN service, which gives the operators the possibility to screen (allow/barring) the incoming traffic and decide the call routing, especially when the subscribers choose an alternate route/carrier/access network (also called Equal Access) for long distance calls on a call by call basis or pre-selected.
Number Portability(NP) : An IN service allows subscribers to retain their subscriber number while changing their service provider, location, equipment or type of subscribed telephony service. Both geographic numbers and non-geographic numbers are supported by the NP service.
Interworking among the services from legacy IN solution and IMS /IT. Allow the Operators to extend their basic offering with added services via low cost software and increases the ARPU for subscribers.
Next Gen 911
911-like emergency services are moving from traditional TDM networks to IP networks. However this poses some challenges, such as detecting the caller's geolocation and routing the call to the nearest servicing station or Public Safety Answering Point (PSAP)
Backward compatibility with existing legacy networks
PSTN-SIP gateways interface between the SIP platform and the SS7 signalling platform, and also convert the RTP stream to the analog waveforms required by PSTN endpoints
Internetworking with IMS
IMS is an IP telephony service architecture developed by the 3rd Generation Partnership Project (3GPP), the global cellular network standards organization that also standardized Third Generation (3G) services and Long Term Evolution (LTE) services
Develop on interactive and popular frameworks like WebRTC
Agile development and Service Oriented Architecture (SOA) are proven methods of delivering quality, up-to-date products and releases which can cater to evolving market demands. In short “Be Future ready while protecting the existing investments”
Make a WebRTC solution that offers a plug in free, device agnostic, network agnostic web based communication tool along with the server side implementation.
Enterprise communication agents integration – consider integration with Microsoft 365, Google Workspace, Skype for Business, Slack, WebEx
CRM integration – Salesforce, Zendesk
Business specific integration
Canvas for e-learning
telehealth platform for doctor consultation
A2P (application to person) messaging
Integration of the services with social media/networking enables new monetizing benefits to CSPs, especially in terms of advertising, gaining popularity, inviting new customers etc.
Enterprises seek to reach their customers with trusted telecom mediums such as phone calls/SMS. Telcos play an instrumental role in increasing the customer’s trust for an enterprise by means of updates over call and SMS in addition to emails and postal mail. The medium of VoIP services offers value addition in their present product/service delivery model for any firm whether it be an e-commerce firm or banking.
VoIP providers should develop an SDK-rich, dev-friendly arrangement that can facilitate onboarding SMEs (small-medium enterprises) through self-guided tutorials and quick setups.
Support developer base to aggregate, use open-standard services/technologies and tie them with other communication technologies suited to their business use-case using the VoIP platform as the medium of communication between web/mobile app endpoints and telecom endpoints.
Log aggregation and Analytics.
PagerDuty Alerts
Daily and Weekly backups and VM snapshots.
Automated sanity Tests
Centralized alert management, monitoring and admin dashboards.
Deployment automation / CICD
Tools and workflows for diagnostics, software upgrades, OS patches etc.
Customer support portal, provisioning Web Application
QoS : Media stats help us collect the call quality metrics which determine the overall user experience. Some frequently encountered issues include
Issue | Cause | Observance
High Packet Loss | 250 ms of audio duration lost in 5 sec | broken audio
High Jitter | jitter >= 30 ms in 5 sec | robotic audio
Low Audio Level | audio level < -80dB | inaudible
High RTT | RTT > 300 ms in 5 sec | lags
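The thresholds in the table can be applied to a 5-second media stats sample as in the sketch below. The dictionary keys (`lost_audio_ms`, `jitter_ms`, `audio_level_db`, `rtt_ms`) and the `diagnose` function are names assumed for this example; the loss threshold is read as "250 ms or more".

```python
def diagnose(stats):
    """Return (issue, observance) pairs for one 5-second media stats sample."""
    issues = []
    if stats.get("lost_audio_ms", 0) >= 250:
        issues.append(("High Packet Loss", "broken audio"))
    if stats.get("jitter_ms", 0) >= 30:
        issues.append(("High Jitter", "robotic audio"))
    if stats.get("audio_level_db", 0) < -80:
        issues.append(("Low Audio Level", "inaudible"))
    if stats.get("rtt_ms", 0) > 300:
        issues.append(("High RTT", "lags"))
    return issues


sample = {"lost_audio_ms": 300, "jitter_ms": 10, "audio_level_db": -30, "rtt_ms": 350}
found = diagnose(sample)
```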
Pro-active Issue Tracking via Call Metadata Analysis
Call details, even during the setup phase, continuation or re-INVITE/UPDATE phase, can suggest the probable outcomes based on previous results, such as bad call quality from certain geographic areas due to their known network or firewall issues, or high packet loss from certain handset device types. We can deduce well in advance what call quality stats will be generated from such calls.
Constraints which can be identified from the call setup details themselves include :
geography and number – which originating location the call was made from and to which destination
SIP devices – device related details, version of device (browser version etc.)
Chronological aspects of call – Initiation, ring start, pick up and end time.
call direction – inbound (coming from the carrier towards our VoIP platform) or outbound (call directed to the carrier from our VoIP platform)
Network type – network issues and quality score across network types
Constraints which can be identified during an ongoing call itself include :
Participants and their local time – ongoing RTCP from legs; the probability of long conferences is low in off hours
Call events – DTMF, XML, API calls, quality issues
The minor issues identified in an ongoing call's RTCP packets, such as increasing jitter or packet loss, can extrapolate to humanly perceivable bad audio quality after a while. Thus any suspected issues should be identified as early as they are traced and corrective action should be put in place.
Predicting Low Audio / Call quality
A predictive engine can forecast bad call quality, such as 408 timeouts, high RTT, low audio level, audio lag, one-way audio, MOS < 2.5 out of 5 etc.
The predictive engine can use targeted notifications pointing towards specific issues that can come up in a call in realtime and assign a technical rep to oversee or manually intervene. This can include scenarios such as an agent warning a customer that their bad audio quality is due to an outdated SIP device with slow codecs, and suggesting an upgrade to lightweight codecs suited to their bandwidth. This spares the customer a bad user experience and can happen without the customer reporting the issue themselves with feedback, RTP stats, PCAPs etc. It saves a lot of trouble and effort in call debugging.
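One possible input to such an engine is an estimated MOS. The sketch below uses the standard conversion from an E-model transmission rating factor R to MOS (ITU-T G.107) and flags calls under the 2.5 threshold mentioned above. How R is derived from delay, loss and codec is outside this sketch and is assumed to be computed elsewhere; the function names are hypothetical.

```python
def r_to_mos(r):
    """Convert an E-model R factor (0..100) to an estimated MOS (1.0..4.5)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)


def needs_attention(r, mos_threshold=2.5):
    """Flag a call whose estimated MOS falls below the alerting threshold."""
    return r_to_mos(r) < mos_threshold


default_quality = r_to_mos(93.2)   # R = 93.2 is the E-model default rating
poor_call = needs_attention(40)
good_call = needs_attention(80)
```

A flag raised this way could feed the targeted notification described above, so that a technical rep is assigned before the customer reports the problem.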
CSPs are looking into long term growth and profitability from new online services and media streaming services. A new VoIP provider could develop use-cases exceeding the existing use case of media stream rendering to create a differentiator, such as
Streaming
Conference bridges/mixers
Recording and playback
IPTV and VOD ( Video On Demand)
Voicemails , IVR , DTMF,
TTS( text to speech ),
realtime transcription / captioning
Speech recognition etc
Some more services that a new VoIP provider should consider