Certificates, compliances and Security in VoIP

This article describes various certificates and compliances, bills and acts on data privacy, security and prevention of robocalls, as adopted by countries around the world, pertaining to interconnected VoIP providers, telecommunications services, wireless telephone companies, etc.

Compliance certificates by Industry types

HIPAA (Health Insurance Portability and Accountability Act)

Deals with the privacy and security of personal medical records and electronic health care transactions.

Applicability: if the VoIP company handles medical information

Includes:

  • Voicemail transcription is not allowed
  • End-to-end encryption is required
  • Restrict the use of unsecured WiFi networks to prevent snooping
  • User security, strong password rules and a mandatory monthly password change
  • Secure firmware on VoIP phones
  • Maintain call and access logs

SOX (Sarbanes-Oxley Act of 2002)

Also known as SOX, Sarbox or the Public Company Accounting Reform and Investor Protection Act

Applicability: if managing the communications operations of a regulated, publicly traded company

Includes:

  • Retain records which include financial and other sensitive data
  • Control the ways employees are provided or denied access to records or data based on their roles and responsibilities
  • Have information audits done by a trusted third party
  • Retention and deletion of files such as audio files like voicemails, text messages, video clips, declared paper records, storage, and logs of communications activities
  • Physical and digital security controls around cloud-based VoIP applications and the networks

Privacy Related Compliance certificates

COPPA (Children’s Online Privacy Protection Act) of 1998

Prohibits deceptive marketing to children under the age of 13, and collecting their personal information without disclosure to their parents.

If any information is to be passed on to a third party, it must be easy for the child’s guardian to review and/or protect.

A 2011 amendment requires that the data collected be erased after a period of time.

In 2014, the FTC issued guidelines that apps and app stores require “verifiable parental consent.”

CPNI (Customer Proprietary Network Information) 2007

CPNI (Customer Proprietary Network Information) in the United States is the information that communication providers acquire about their subscribers. This is individually identifiable information created by a customer’s relationship with a provider, such as data about the frequency, duration, and timing of calls, the information on a customer’s bill, and call identifying information. The processing of this information is governed strictly by the FCC, and certification must be renewed on an annual basis.

A provider can pass that information along to marketers to sell other services, as long as the customer is notified.

In 2007, the FCC explicitly extended the application of the Commission’s CPNI rules of the Telecommunications Act of 1996 to providers of interconnected VoIP service.

CALEA

The Communications Assistance for Law Enforcement Act (CALEA) enables lawful electronic surveillance by imposing specific obligations on “telecommunications carriers” for assisting law enforcement, including delivering call interception and call identification functionality to the government with a minimum of interference to customer service and privacy.

Read more about CALEA and its role in VoIP in the article Regulatory and Legal Considerations with WebRTC development.

GDPR (General Data Protection Regulation) in the European Union, 2018

Supersedes the 1995 Data Protection Directive.

Establishes requirements for organizations that process data, defines the rights of individuals to manage their data, and outlines penalties for those who violate these rights.

No personal data may be processed unless this processing is done under one of six lawful bases specified by the regulation (consent, contract, public task, vital interest, legitimate interest or legal requirement). When the processing is based on consent the data subject has the right to revoke it at any time.

Controllers must notify supervisory authorities (SAs) of a personal data breach within 72 hours of learning of the breach.
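The 72-hour window can be tracked with simple date arithmetic. The following is a minimal sketch, assuming the time at which the breach was learned of is recorded as a timezone-aware timestamp; the function names are hypothetical and not part of any regulation or library.

```python
from datetime import datetime, timedelta, timezone

# 72-hour window as stated in the text above.
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(breach_detected_at: datetime) -> datetime:
    """Latest time by which the supervisory authority should be notified."""
    return breach_detected_at + NOTIFICATION_WINDOW


def hours_remaining(breach_detected_at: datetime, now: datetime) -> float:
    """Hours left before the deadline; negative once the deadline has passed."""
    remaining = notification_deadline(breach_detected_at) - now
    return remaining.total_seconds() / 3600
```

An alerting job could call hours_remaining periodically and escalate as the value approaches zero.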

California Consumer Privacy Act (CCPA) 2019

Grants consumers rights relating to the access to, deletion of, and sharing of personal information that is collected by businesses.

Allows consumers to know whether their personal data is sold or disclosed, and to whom.

Allows an opt-out right for sales of personal information.

Right to deletion – to request that a business delete any personal information about a consumer collected from that consumer.

Personal Data Protection Bill (PDP) – India 2018

This bill introduces various frameworks for protecting private and sensitive data, such as restrictions on the retention of personal data, the right to correction and erasure (such as the right to be forgotten), and prohibition and transparency of processing of personal data. It also classifies data fiduciaries, including certain social media intermediaries.

The Bill amends the Information Technology Act, 2000 to delete the provisions related to compensation payable by companies for failure to protect personal data.

Other data privacy acts similar to GDPR 

  • South Korea’s Personal Information Protection Act 2011
  • Brazil’s Lei Geral de Proteção de Dados (LGPD) 2020
  • Privacy Amendment (Notifiable Data Breaches) to Australia’s Privacy Act 2018
  • Japan’s Act on Protection of Personal Information 2017
  • Thailand Personal Data Protection Act (PDPA) 2020

Features offered by VoIP companies for data privacy

  • Access control and logging
  • Auto data redaction / account deletion policy
  • SIEM (Security Information and Event Management) alerts
  • Information security, encrypted storage for recordings and transcripts
  • Disclosing all third-party services that are involved in data processing
  • Role-based access control and two-factor authentication
  • Data security audits and appointing a data protection officer to oversee GDPR compliance

Against Robocalls and SPIT (SPAM over Internet Telephony)

Truth in Caller ID Act of 2009

Telephone Consumer Protection Act of 1991

Implementation of a Do Not Call registry against the use of robocalls, automatic dialers, and other methods of communication

Do-Not-Call Implementation Act of 2003

If a business has an established relationship with a customer, it can continue to call them for up to 18 months. If a consumer calls the company, say, to ask for information about a product or service, the company has three months to get back to them.

If the customer asks not to receive calls, the company must stop calling or be subject to fines.

Exemptions – calls from not-for-profit organisations, informational messages such as flight cancellations, calls from sales and debt collectors, etc.
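The calling windows described above can be expressed as a small eligibility check. This is only a simplified sketch of the 18-month and three-month windows and the opt-out rule as stated in the text; it ignores the exemptions, and the function and parameter names are hypothetical.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def may_call(today: date,
             last_transaction: Optional[date] = None,
             last_inquiry: Optional[date] = None,
             opted_out: bool = False) -> bool:
    """Return True if a call falls inside one of the windows described above."""
    if opted_out:
        # An explicit request to stop calling overrides both windows.
        return False
    if last_transaction is not None and today <= add_months(last_transaction, 18):
        return True
    if last_inquiry is not None and today <= add_months(last_inquiry, 3):
        return True
    return False
```

A dialer could run such a check before placing each outbound call and skip numbers for which it returns False.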

Personal Data Privacy and Security Act 2009

Implemented to curb identity theft and computer hacking. Sensitive personally identifiable information includes the victim’s name, social security number, home address, fingerprint/biometrics data, date of birth, and bank account numbers.

Any company that is breached must notify the affected individuals by mail, telephone, or email, and the message must include information on the company and how to get in touch with credit reporting agencies.

If the breach involves government or national security, the company must also contact the Secret Service within fourteen days.

TRACED Act (Telephone Robocall Abuse Criminal Enforcement and Deterrence) 2019

Canadian Radio-television and Telecommunications Commission (CRTC) 2018-32

A solution mechanism called STIR / SHAKEN (Secure Telephony Identity Revisited / Signature-based Handling of Asserted information using toKENs) has already been standardised and is being actively adopted; it is described in another article.

Emergency services 

FCC E911 / VoIP E911 rules

Unlike traditional telephone connections, which are tied to a physical location, VoIP’s packet-switched technology allows a particular number to be anywhere, making it more difficult to reach localised services such as the emergency numbers of Public Safety Answering Points (PSAPs). Thus, under FCC regulations as well as the New and Emerging Technologies 911 Improvement Act of 2008 (NET 911 Act), interconnected VoIP providers are required to provide 911 and E911 service.


VoIP system DevOps, operations and Infrastructure management, Automation


Overview of VoIP platform DevOps tools

This article focuses on the various tools required to operate and maintain a growing, large-scale VoIP platform, which mostly fall under the following areas:

  • PCAP collection
  • CI/CD on Jenkins pipelines
  • Configuration management using Chef cookbooks
  • Virtualization and containerization using Docker
  • Infrastructure management using Terraform / Kubernetes
  • Log analysis and alarming

PCAP Collection

Packet Capture (PCAP) is an API that captures live network packets. Besides tracking, audit and RTC visualizers, PCAP is widely used for debugging faults, such as when a production alarm fires on a high rate of failures.

Example use case: a production alert on a 503 SIP response, or a log entry from a gateway, is not as helpful as PCAP tracking of the session ID of the call across various endpoints in and out of the network to determine the point of failure.

Debugging involves:

  1. Pre-specified SIP / RTP and related protocols capture

PCAP capture examples

tcpdump -i any -w alltraffic.pcap
rtpbreak -P2 -t100 -T100 -d logz -r alltraffic.pcap

  2. Call session ID to uniquely identify failed calls among tens of thousands of packets

  3. An analyzer such as Wireshark or tshark to track the packets

TShark inspection examples

brew install --cask wireshark
tshark -r alltraffic.pcap -Y "sip.CSeq.method eq INVITE"

Some of the useful call specs captured from PCAP

  • DTMF – both in-band and out-of-band DTMF for every call, along with the timestamp.
  • Codec negotiations – extracting codecs from PCAP lets us
    1. Validate later whether there were codec changes without a prior SIP message
    2. If the call was hung up with a 488 error code, determine which codec caused it
  • SIP errors – track deviations from standard SIP messaging.
    1. Identify known erroneous SIP messaging scenarios, such as those seen in MITM or replay attacks
  • RTCP media stats – extract jitter, loss and RTT from RTCP reports for both the incoming and outgoing streams.
  • Identify media or ACK timeouts
    1. Check whether a party has not sent any media packet for > 60 s (media timeout threshold duration)
    2. Check whether a call was hung up due to an ACK timeout
  • Audio stream – after GDPR, take explicit permission from users before storing audio streams.
PCAP file analyzed in Wireshark ( PCAP source : https://wiki.wireshark.org/SampleCaptures#Sample_Captures)
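The media timeout check listed above can be sketched as a scan over the arrival times of one party's RTP packets. This is a minimal illustration, assuming the timestamps (in seconds) have already been extracted from the capture, for example with tshark; the function name is hypothetical.

```python
from typing import Iterable, List, Tuple

# Media timeout threshold of 60 s, as mentioned in the list above.
MEDIA_TIMEOUT_SECONDS = 60.0


def find_media_gaps(timestamps: Iterable[float],
                    threshold: float = MEDIA_TIMEOUT_SECONDS) -> List[Tuple[float, float]]:
    """Return (last_packet, next_packet) pairs whose spacing exceeds the threshold."""
    ordered = sorted(timestamps)
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > threshold:
            gaps.append((previous, current))
    return gaps
```

Running this per direction of a call shows which party stopped sending media, and when.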

Continuous Integration and Delivery Automation using Jenkins

CI/CD provides a continuous delivery hub and distributes work across multiple machines, helping drive builds, tests and deployments across multiple platforms.

Jenkins – a self-contained, Java-based program, extensible using plugins.

Jenkins pipeline – orchestrates and automates building a project in Jenkins.

Configuration management using Chef cookbooks

Alternatives include Puppet and Ansible, which are also cross-platform configuration management tools.

Compute virtualization and containerization using Docker

Docker containers can be used instead of virtual machines such as VirtualBox to isolate applications and stay OS and platform independent.
This makes distributed development possible and automates deployment.

  • stop Stop one or more running containers
  • top Display the running processes of a container
> docker top 4417600169e8
UID PID PPID C STIME TTY TIME CMD
root 9913 9888 0 08:50 ? 00:00:00 bash /point.sh
root 10083 9913 0 08:50 ? 00:00:01 /usr/sbin/worker
root 10092 10083 0 08:50 ? 00:00:02 /micro-service
  • unpause Unpause all processes within one or more containers
  • update Update configuration of one or more containers
  • wait Block until one or more containers stop, then print their exit codes

See all images

> docker images
REPOSITORY                  TAG                 IMAGE ID            CREATED             SIZE
sipcapture/homer-cron       latest              fb2243f90cde        3 hours ago         476MB
sipcapture/homer-kamailio   latest              f159d46a22f3        3 hours ago         338MB
sipcapture/heplify          latest              9f5280306809        21 hours ago        9.61MB
<none>                      <none>              edaa5c708b3a        

See all stats

>  docker stats
CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
f42c71741107        homer-cron          0.00%               52KiB / 994.6MiB      0.01%               2.3kB / 0B          602MB / 0B          0
0111765091ae        mysql               0.04%               452.2MiB / 994.6MiB   45.46%              1.35kB / 0B         2.06GB / 49.2kB     22

Run a command from within a Docker instance

docker exec -it <container-id> bash

First see all processes

docker ps

Select a container and enter its bash shell

docker exec -it 0472a5127fff bash

To edit or update a file inside a container, either install vim every time you log in to a fresh Docker container, like

apt-get update
apt-get install vim

or add this to the Dockerfile

RUN ["apt-get", "update"]
RUN ["apt-get", "install", "-y", "vim"]

Check whether ngrep is installed; if not, install it, then run ngrep to get SIP logs inside that Docker container

apt update
apt install ngrep
ngrep -p "14795778704" -W byline -d any port 5060

docker volume – volumes are used for persisting data generated by and used by Docker containers.
Docker volumes have advantages over bind mounts: they are easier to back up or migrate, are managed by Docker APIs, can be safely shared among multiple containers, etc.

docker stack – lets you manage a cluster of Docker containers through Docker Swarm; it can be defined via a docker-compose.yml file

docker service

  • create Create a new service
  • inspect Display detailed information on one or more services
  • logs Fetch the logs of a service or task
  • ls List services
  • ps List the tasks of one or more services
  • rm Remove one or more services
  • rollback Revert changes to a service’s configuration
  • scale Scale one or multiple replicated services
  • update Update a service

Run docker containers

Sample run command

docker run -it -d --name opensips -e ENV=dev imagename:2.2

-it attaches an interactive tty to the container
-e sets environment variables
-d runs the container in the background and prints the container ID

Remove docker entities

To remove all stopped containers, all unused networks, and all unused images (with -a, not just dangling ones):

docker system prune -a

To additionally remove all unused volumes

docker system prune --volumes

To remove all stopped containers

docker container prune

Sometimes Docker images keep piling up along with stopped containers, such as

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
                                        d1dcfe2438ae        15 minutes ago      753MB
                                        2d353828889b        16 hours ago        910MB
...

CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS                        PORTS               NAMES
0dd6698a7517        2d353828889b        "/entrypoint.sh"         13 minutes ago      Exited (137) 13 minutes ago                       hardcore_wozniak

To remove such images and their containers, first stop and remove the containers

docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)

then remove all dangling images

docker rmi  $(docker images -aq --filter dangling=true)

Infrastructure management using terraform

Terraform is used for building, changing and versioning infrastructure.
Infrastructure as Code – can manage anything from a single application to entire datacentres via configuration files, from which Terraform creates an execution plan.
It can manage low-level components such as compute instances, storage, and networking, as well as high-level components such as DNS entries, SaaS features, etc.
Resource Graph – builds a graph of all your resources

tfenv can be used to manage terraform versions

> brew unlink terraform
tfenv install 0.11.14
tfenv list 

Terraform configuration language

This is used for declaring resources and describing infrastructure; the associated files have a .tf or .tf.json file extension.
A group of resources can be gathered into a module. A Terraform configuration consists of a root module, where evaluation begins, along with a tree of child modules created when one module calls another.

Example: launch a single AWS EC2 instance, file server1.tf

provider "aws" {
  profile    = "default"
  region     = "us-east-1"
}

resource "aws_instance" "server1" {
  ami           = "ami-2757fxxx"
  instance_type = "t2.micro"
}

Note: AMI IDs are region specific.
The profile attribute here refers to a named profile in the AWS credentials file, ~/.aws/credentials

Terraform command line interface (CLI)

The engine for evaluating and applying Terraform configurations.
It uses plugins called providers that each define and manage a set of resource types.

Command usage: terraform [-version] [-help] <command> [args]

  • apply Builds or changes infrastructure
  • console Interactive console for Terraform interpolations
  • destroy Destroy Terraform-managed infrastructure
  • env Workspace management
  • fmt Rewrites config files to canonical format
  • get Download and install modules for the configuration
  • graph Create a visual graph of Terraform resources
  • import Import existing infrastructure into Terraform
  • init Initialize a Terraform working directory
  • output Read an output from a state file
  • plan Generate and show an execution plan
  • providers Prints a tree of the providers used in the configuration
  • refresh Update local state file against real resources
  • show Inspect Terraform state or plan
  • taint Manually mark a resource for recreation
  • untaint Manually unmark a resource as tainted
  • validate Validates the Terraform files
  • version Prints the Terraform version
  • workspace Workspace management
  • 0.12upgrade Rewrites pre-0.12 module source code for v0.12
  • debug Debug output management (experimental)
  • force-unlock Manually unlock the terraform state
  • push Obsolete command for Terraform Enterprise legacy (v1)
  • state Advanced state management

terraform init
Initialize a working directory containing Terraform configuration files.

terraform validate
Runs checks that verify whether a configuration is internally consistent, regardless of any provided variables or existing state.

Kubernetes

A container orchestration platform automating deployment, scaling, and management of containerized applications. It can deploy to a cluster of computers, automating the distribution and scheduling as well.

Service discovery and load balancing – gives Pods their own IP addresses and a single DNS name for a set of Pods, and can load-balance across them.

Automatic bin packing – Automatically places containers based on their resource requirements and other constraints, while not sacrificing availability. Mix critical and best-effort workloads in order to drive up utilization and save even more resources.

Storage orchestration – Automatically mount the storage system of your choice, whether from local storage, a public cloud provider such as GCP or AWS, or a network storage system such as NFS, iSCSI, Gluster, Ceph, Cinder, or Flocker.

Self-healing – Restarts containers that fail, replaces and reschedules containers when nodes die, kills containers that don’t respond to your user-defined health check, and doesn’t advertise them to clients until they are ready to serve.

Automated rollouts and rollbacks – progressively rolls out changes to your application or its configuration, while monitoring application health to ensure it doesn’t kill all your instances at the same time.

Secret and configuration management – Deploy and update secrets and application configuration without rebuilding your image and without exposing secrets in your stack configuration.

Batch execution – manage batch and CI workloads, replacing containers that fail, if desired.

Horizontal scaling – scale the application up and down with a simple command, with a UI, or automatically based on CPU usage.

Create a minikube cluster and deploy pods

Prerequisites: docker, curl, redis, others

Install minikube

curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
chmod +x minikube
install minikube /usr/local/bin

Install kubectl

snap install kubectl --classic
ln -s /snap/bin/kubectl /usr/local/bin

Setup Minikube

minikube start --vm-driver=none
minikube addons enable registry-creds
kubectl -n kube-system create secret generic registry-creds-ecr
kubectl -n kube-system create secret generic registry-creds-gcr
kubectl -n kube-system create secret generic registry-creds-dpr
minikube addons configure registry-creds
Starting Kubernetes…minikube version: v1.3.0
 commit: 43969594266d77b555a207b0f3e9b3fa1dc92b1f
 minikube v1.3.0 on Ubuntu 18.04
 Running on localhost (CPUs=2, Memory=2461MB, Disk=47990MB) …
 OS release is Ubuntu 18.04.2 LTS
 Preparing Kubernetes v1.15.0 on Docker 18.09.5 …
 kubelet.resolv-conf=/run/systemd/resolve/resolv.conf
 Pulling images …
 Launching Kubernetes …
 Done! kubectl is now configured to use "minikube"
 dashboard was successfully enabled
 Kubernetes Started 

Basic Commands

  • start Starts a local kubernetes cluster
  • status Gets the status of a local kubernetes cluster
  • stop Stops a running local kubernetes cluster
  • delete Deletes a local kubernetes cluster
  • dashboard Access the kubernetes dashboard running within the minikube cluster

Images Commands:

  • docker-env Sets up docker env variables; similar to ‘$(docker-machine env)’
  • cache Add or delete an image from the local cache.

Configuration and Management Commands:

  • addons Modify minikube’s kubernetes addons
  • config Modify minikube config
  • profile Profile gets or sets the current minikube profile
  • update-context Verify the IP address of the running cluster in kubeconfig.

Networking and Connectivity Commands:

  • service Gets the kubernetes URL(s) for the specified service in your local cluster
  • tunnel tunnel makes services of type LoadBalancer accessible on localhost

Advanced Commands:

  • mount Mounts the specified directory into minikube
  • ssh Log into or run a command on a machine with SSH; similar to ‘docker-machine ssh’
  • kubectl Run kubectl

Troubleshooting Commands:

  • ssh-key Retrieve the ssh identity key path of the specified cluster
  • ip Retrieves the IP address of the running cluster
  • logs Gets the logs of the running instance, used for debugging minikube, not user code.
  • update-check Print current and latest version number

kubectl

controls the Kubernetes cluster manager.

Basic Commands (Beginner):

  • create Create a resource from a file or from stdin.
  • expose Take a replication controller, service, deployment or pod and expose it as a new Kubernetes Service
  • run Run a particular image on the cluster
  • set Set specific features on objects
  • explain Documentation of resources
  • get Display one or many resources
  • edit Edit a resource on the server
  • delete Delete resources by filenames, stdin, resources and names, or by resources and label selector

Deploy Commands:

  • rollout Manage the rollout of a resource
  • scale Set a new size for a Deployment, ReplicaSet, Replication Controller, or Job
  • autoscale Auto-scale a Deployment, ReplicaSet, or ReplicationController

Cluster Management Commands:

  • certificate Modify certificate resources.
  • cluster-info Display cluster info
  • top Display Resource (CPU/Memory/Storage) usage.
  • cordon Mark node as unschedulable
  • uncordon Mark node as schedulable
  • drain Drain node in preparation for maintenance
  • taint Update the taints on one or more nodes

Troubleshooting and Debugging Commands:

  • describe Show details of a specific resource or group of resources
  • logs Print the logs for a container in a pod
  • attach Attach to a running container
  • exec Execute a command in a container
  • port-forward Forward one or more local ports to a pod
  • proxy Run a proxy to the Kubernetes API server
  • cp Copy files and directories to and from containers.
  • auth Inspect authorization

Advanced Commands:

  • diff Diff live version against would-be applied version
  • apply Apply a configuration to a resource by filename or stdin
  • patch Update field(s) of a resource using strategic merge patch
  • replace Replace a resource by filename or stdin
  • wait Experimental: Wait for a specific condition on one or many resources.
  • convert Convert config files between different API versions
  • kustomize Build a kustomization target from a directory or a remote url.

Settings Commands:

  • label Update the labels on a resource
  • annotate Update the annotations on a resource
  • completion Output shell completion code for the specified shell (bash or zsh)

Other Commands:

  • api-resources Print the supported API resources on the server
  • api-versions Print the supported API versions on the server, in the form of “group/version”
  • config Modify kubeconfig files
  • plugin Provides utilities for interacting with plugins.
  • version Print the client and server version information

DevOps monitoring tools nagios

Manage Docker configs

  • create Create a config from a file or STDIN
  • inspect Display detailed information on one or more configs
  • ls List configs
  • rm Remove one or more configs

Manage containers

  • attach Attach local standard input, output, and error streams to a running container
  • commit Create a new image from a container’s changes
  • cp Copy files/folders between a container and the local filesystem
  • create Create a new container
  • diff Inspect changes to files or directories on a container’s filesystem
  • exec Run a command in a running container
  • export Export a container’s filesystem as a tar archive
  • inspect Display detailed information on one or more containers
  • kill Kill one or more running containers
  • logs Fetch the logs of a container
  • ls List containers
  • pause Pause all processes within one or more containers
  • port List port mappings or a specific mapping for the container
  • prune Remove all stopped containers
  • rename Rename a container
  • restart Restart one or more containers
  • rm Remove one or more containers
  • run Run a command in a new container
  • start Start one or more stopped containers
  • stats Display a live stream of container(s) resource usage statistics
  • stop Stop one or more running containers
  • top Display the running processes of a container
  • unpause Unpause all processes within one or more containers
  • update Update configuration of one or more containers
  • wait Block until one or more containers stop, then print their exit codes

Alternatives: Sensu multi-cloud monitoring or Raygun

Monitoring, debugging, logs analysis and alarms

Aggregate logs into Logstash and provide search and filtering via Elasticsearch and Kibana. Alerts or notifications can also be triggered on specific keyword searches in logs, such as WARNING, ERROR or call_failed. Some common alert scenarios include:

SBC and proxy gateways failures – check states of VM instance

DNS caching alerts – Domain Name System (DNS) caching, Dynamic Host Configuration Protocol (DHCP) server, router advertisement and network boot alerts from a service such as dnsmasq

Disk usage alert – set up alerts for 80% usage and trigger an alarm to either manually prune or create automatic, timely archive backups.
Check the percentage of disk usage

df -h

Mostly it is either the log files or the pcap recordings which need to be archived in external storage.
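The 80% disk usage alert can also be checked programmatically. Below is a minimal sketch using the standard library's shutil.disk_usage; the function names are hypothetical, and the computed percentage may differ slightly from what df reports because df accounts for reserved blocks.

```python
import shutil
from typing import Tuple

# Alert threshold of 80% usage, as suggested in the text above.
DISK_ALERT_THRESHOLD = 80.0


def used_percent(total: int, used: int) -> float:
    """Percentage of the filesystem that is in use."""
    return 100.0 * used / total if total else 0.0


def disk_alert(path: str = "/", threshold: float = DISK_ALERT_THRESHOLD) -> Tuple[bool, float]:
    """Return (alert_needed, used_percentage) for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    percent = used_percent(usage.total, usage.used)
    return percent >= threshold, percent
```

A cron job could call disk_alert("/var/log") and notify the on-call engineer when the first element of the result is True.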

Use logrotate – it rotates, compresses, and mails system logs.

Sample logrotate config; to force a rotation with verbose output, run logrotate -vf /etc/logrotate.conf

/var/log/messages {
    rotate 5
    weekly
    postrotate
        /usr/bin/killall -HUP syslogd
    endscript
}

Elevated call failures with SIP 503 or call timeouts with SIP 408 – a high frequency of failed calls indicates an internal issue and must be followed up by smoke testing the entire system to identify any probable issue, such as undetected frequent crashes of an individual component or blacklisting by a destination endpoint.

sudo tail -f sip.log | grep 503

or

sudo tail -f sip.log | grep WARNING
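Beyond grepping, the share of failed responses can be computed and compared against a threshold. This is a rough sketch, assuming each log line contains a SIP status line such as "SIP/2.0 503 Service Unavailable"; real log formats vary, and the function names are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Sequence

# Matches the status line of a SIP response, e.g. "SIP/2.0 503 Service Unavailable".
STATUS_LINE = re.compile(r"SIP/2\.0 (\d{3})")


def count_sip_responses(lines: Iterable[str]) -> Counter:
    """Count SIP response codes found in the given log lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_LINE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def failure_ratio(counts: Counter, failure_codes: Sequence[int] = (408, 503)) -> float:
    """Fraction of responses whose code is one of failure_codes."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    failed = sum(counts[code] for code in failure_codes)
    return failed / total
```

An alert could then fire when failure_ratio exceeds a value chosen from the platform's normal baseline.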

Cron service or process alerts

 ps axf
  PID TTY      STAT   TIME COMMAND
    2 ?        S      0:00 [kthreadd]
    3 ?        I<     0:00  \_ [rcu_gp]
    4 ?        I<     0:00  \_ [rcu_par_gp]
    5 ?        I      0:00  \_ [kworker/0:0-eve]
    6 ?        I<     0:00  \_ [kworker/0:0H-kb]
    7 ?        I      0:00  \_ [kworker/0:1-eve]
    8 ?        I      0:00  \_ [kworker/u4:0-nv]
    9 ?        I<     0:00  \_ [mm_percpu_wq]
   10 ?        S      0:00  \_ [ksoftirqd/0]
   11 ?        I      0:00  \_ [rcu_sched]
   12 ?        S      0:00  \_ [migration/0]
   13 ?        S      0:00  \_ [cpuhp/0]
   14 ?        S      0:00  \_ [cpuhp/1]
   15 ?        S      0:00  \_ [migration/1]
   16 ?        S      0:00  \_ [ksoftirqd/1]
   17 ?        I      0:00  \_ [kworker/1:0-eve]
   18 ?        I<     0:00  \_ [kworker/1:0H-kb]

or check the cron status

service cron status
● cron.service - Regular background program processing daemon
   Loaded: loaded (/lib/systemd/system/cron.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2016-06-26 03:00:37 UTC; 1min 17s ago
     Docs: man:cron(8)
 Main PID: 845 (cron)
    Tasks: 1 (limit: 4383)
   CGroup: /system.slice/cron.service
           └─845 /usr/sbin/cron -f

Jun 26 03:00:37 ip-172-31-45-21 systemd[1]: Started Regular background program processing daemon.
Jun 26 03:00:37 ip-172-31-45-21 cron[845]: (CRON) INFO (pidfile fd = 3)
Jun 26 03:00:37 ip-172-31-45-21 cron[845]: (CRON) INFO (Running @reboot jobs)

Restart or start the cron service if required.

DB connections / connection pool process – keep listening for any alerts on DB connection failures, or even warnings, as these can be due to too many read operations, such as in a DDoS, and can escalate very quickly

netstat -nltp  | grep db 
tcp        0      0 0.0.0.0:5433            0.0.0.0:*               LISTEN      5792/db-server * 

Routine deep status checks are good practice too. Raise an alert if any check does not return the expected result.

Port check, unexpected result alert – regularly check whether servers are listening on ports such as 5060 for SIP

netstat -nltp | grep 5060
tcp        0      0 x.x.x.x:5060       0.0.0.0:*               LISTEN      8970/kamailio  
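The same port check can be run remotely from a monitoring host. The following is a minimal sketch that attempts a TCP connection, matching the TCP listeners shown by netstat -nltp above; SIP over UDP would need a different probe, and the function name is hypothetical.

```python
import socket


def is_tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A periodic job could call is_tcp_port_open with the server address and port 5060, and raise an alert when it returns False.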

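Cron zombie process checks – a zombie or defunct process is a process that has completed execution (via the exit system call) but still has an entry in the process table: it is a process in the “Terminated state”. List zombie processes and kill them by PID to free up their process table entries. Since a zombie has already exited, it is the parent process that has to reap it, so the PID to signal is usually that of the parent.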

kill -9 <PID1>
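Listing zombies can be automated by parsing ps output. This is a small sketch, assuming the output of ps -eo pid,ppid,stat,comm is passed in as text; a state starting with Z marks a zombie, and the parent PID is reported because that is the process that has to reap it. The function name is hypothetical.

```python
from typing import Dict, List, Union


def find_zombies(ps_output: str) -> List[Dict[str, Union[int, str]]]:
    """Parse `ps -eo pid,ppid,stat,comm` output and return zombie entries."""
    zombies = []
    # Skip the header line, then split each row into its four columns.
    for line in ps_output.strip().splitlines()[1:]:
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue
        pid, ppid, stat, command = fields
        if stat.startswith("Z"):
            zombies.append({"pid": int(pid), "ppid": int(ppid), "command": command})
    return zombies
```

The text can be obtained with subprocess.run(["ps", "-eo", "pid,ppid,stat,comm"], capture_output=True, text=True).stdout on a Linux host.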

Bulk calls checks – consult the ongoing-call commands of the application server.
For FreeSWITCH use fs_cli

fs_cli -x "show channels"

For Kamailio use kamcmd

kamcmd dlg.list

For Asterisk, watch the show channels command

watch -n 1 "sudo asterisk -vvvvvrx 'core show channels' | grep call"

In case of a DDoS or other malicious attack, identify the attacker IP and block it

iptables -I INPUT -s y.y.y.y -j DROP   

fail2ban can also be used

apt-get update && apt-get install fail2ban

Additionally, check how many dispatchers are responding on the outbound gateway

opensipsctl dispatcher dump

Process control supervisor or pm2 checks – supervisor is a Linux process control system that allows its users to monitor and control a number of processes

ps axf | grep supervisor

for pm2

> pm2 status
[PM2] Spawning PM2 daemon with pm2_home=/Users/altanai/.pm2
[PM2] PM2 Successfully daemonized
┌─────┬───────────┬─────────────┬─────────┬─────────┬──────────┬────────┬──────┬───────────┬──────────┬──────────┬──────────┬──────────┐
│ id │ name │ namespace │ version │ mode │ pid │ uptime │ ↺ │ status │ cpu │ mem │ user │ watching │

htop to check memory and CPU

Health and load on the reverse proxy or load balancer such as Nginx – perform a direct curl request to the host to check whether Nginx responds with a non 4xx / 5xx response

curl -v <public-fqdn-of-server> 

In case of an error response, restart

/etc/init.d/nginx restart

In case of config updates, reload Nginx

nginx -s reload

For an HTTP/SSL proxy daemon such as Tinyproxy, which is used for fast response, set MinSpareServers, MaxSpareServers, MaxClients, MaxRequestsPerChild etc. appropriately

VPN checks – restart firewalls or IPsec in case of issues

/etc/init.d/ipsec restart

Additionally also check ssh service

ps axf | grep sshd

restart sshd if required

SSL cert expiry checks – to keep operations running securely and prevent an abrupt termination, it is good practice to run regular certificate expiry checks for SSL certs, especially on secure HTTP endpoints like APIs and web servers, and also on SIP application servers for TLS. If any expiry is due in < 10 days, trigger an alert to renew the certs
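The expiry check can be sketched with Python's standard `ssl` module. This is a minimal illustration, not a full monitoring script: the helper names (`days_until_expiry`, `fetch_not_after`, `needs_renewal`) are assumptions, and the 10-day threshold simply follows the rule of thumb above. Note that `fetch_not_after` uses default certificate verification, so an already expired certificate surfaces as a connection error rather than a negative day count.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter timestamp.

    not_after uses the format returned by SSLSocket.getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400.0


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Open a TLS connection and return the peer certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def needs_renewal(not_after: str, threshold_days: float = 10.0,
                  now: Optional[float] = None) -> bool:
    """True when the certificate expires in fewer than threshold_days."""
    return days_until_expiry(not_after, now) < threshold_days
```

A scheduled job could call `needs_renewal(fetch_not_after(host))` for each API, web and SIP TLS endpoint and raise the alert when it returns `True`.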

Health of Task scheduling services such as RabbitMQ, Celery Distributed Task Queue – remote debugging of these can be set up via pdb which supports setting (conditional) breakpoints and single stepping at the source line level, inspection of stack frames, source code listing, and evaluation of arbitrary Python code in the context of any stack frame.

import pdb; pdb.set_trace()
python3 -m pdb myscript.py

It can also be set up via using the client libraries provided by these Queue services themselves

Cluster status – set up an efficient health check service which monitors the cluster status for High Availability. A JSON object depicting the status of cluster shards:

{
  "cluster_name" : "ABC-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 14,
  "number_of_data_nodes" : 6,
  "active_primary_shards" : 200,
  "active_shards" : 300,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0
}
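A health check service can turn a status document like the one above into alerts. The sketch below is an assumption about which fields are worth alerting on (status colour, timeout, unassigned shards, pending tasks); the function name `cluster_alerts` is illustrative.

```python
import json
from typing import List


def cluster_alerts(health_json: str) -> List[str]:
    """Return human-readable alerts for a cluster health JSON document."""
    health = json.loads(health_json)
    alerts = []
    status = health.get("status")
    if status != "green":
        alerts.append("cluster status is %s" % status)
    if health.get("timed_out"):
        alerts.append("health request timed out")
    unassigned = health.get("unassigned_shards", 0)
    if unassigned > 0:
        alerts.append("%d unassigned shards" % unassigned)
    pending = health.get("number_of_pending_tasks", 0)
    if pending > 0:
        alerts.append("%d pending tasks" % pending)
    return alerts
```

An empty list means every check passed; anything else can be forwarded to the alerting channel as-is.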

Status of Critical Application Server

fs_cli> show status
UP 0 years, 0 days, 0 hours, 58 minutes, 33 seconds, 15 milliseconds, 58 microseconds
FreeSWITCH (Version 1.6.20 git 987c9b9 2018-01-23 21:49:09Z 64bit) is ready
3 session(s) since startup
0 session(s) - peak 1, last 5min 1
0 session(s) per Sec out of max 30, peak 1, last 5min 1
1000 session(s) max
min idle cpu 0.00/80.83
Current Stack Size/Max 240K/8192K

Programming or syntax errors in the production environment – mostly arising due to incomplete QA/testing before pushing new changes to production. These should trigger alerts for dev teams and be met with hot patches.

Many application development frameworks have inbuilt libraries for debugging, exception handling and reporting, such as

  • backend service in Django
  • API service in Go

Distributed memory caching – Redis, Memcached : Redis info shows the master-slave configuration for all the instances as well as their memory and CPU status.

>redis-cli info
# Server
redis_version:6.0.4
redis_git_dirty:0
redis_mode:standalone
os:Darwin 18.7.0 x86_64
arch_bits:64
multiplexing_api:kqueue
atomicvar_api:atomic-builtin
gcc_version:4.2.1
tcp_port:6379

# Clients
connected_clients:1
client_recent_max_input_buffer:0
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0

# Memory
used_memory:1065648
used_memory_human:1.02M
number_of_cached_scripts:0
maxmemory:0
allocator_frag_bytes:1123680
allocator_rss_ratio:1.00
rss_overhead_bytes:37888
mem_fragmentation_ratio:2.16
active_defrag_running:0
lazyfree_pending_objects:0

# Persistence
loading:0
rdb_changes_since_last_save:0
module_fork_last_cow_size:0

# Stats
total_connections_received:1
total_commands_processed:0
..

# Replication
role:master
connected_slaves:0
..

# CPU
used_cpu_sys:0.011198
used_cpu_sys_children:0.000000

# Modules

# Cluster
cluster_enabled:0
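For automated checks, the text output of redis-cli info can be parsed into sections. A minimal sketch, assuming the `# Section` / `key:value` layout shown above; `parse_redis_info` and `summarise` are illustrative names, and which fields to watch is a judgment call.

```python
from typing import Dict


def parse_redis_info(text: str) -> Dict[str, Dict[str, str]]:
    """Parse `redis-cli info` output into {section: {key: value}}."""
    sections: Dict[str, Dict[str, str]] = {}
    current = "default"
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("#"):
            # "# Replication" starts a new section named "replication".
            current = line.lstrip("#").strip().lower()
            sections.setdefault(current, {})
        elif ":" in line:
            key, _, value = line.partition(":")
            sections.setdefault(current, {})[key] = value
        # Blank lines and anything without a colon are ignored.
    return sections


def summarise(info: Dict[str, Dict[str, str]]) -> Dict[str, object]:
    """Pick out the replication role, replica count and memory fragmentation."""
    replication = info.get("replication", {})
    memory = info.get("memory", {})
    return {
        "role": replication.get("role"),
        "connected_replicas": int(replication.get("connected_slaves", "0")),
        "mem_fragmentation_ratio": float(memory.get("mem_fragmentation_ratio", "0")),
    }
```

The summary can then be compared against expected values, for example alerting when a master reports fewer connected replicas than the deployment is supposed to have.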

SMS service using smsc on Kannel : From the kannel servers, you should see the PANIC error (most of the time Assertion error crashing kannel):

grep PANIC /var/log/kannel/bearerbox.log

If you are going to restart, flush the Redis cache

sudo redis-cli FLUSHALL
sudo redis-cli SAVE

restart kannel

sudo /etc/init.d/kannel restart

If the carriers are throttling the SMS requests, verify “ERROR” responses using

sudo grep -i "throttling" bearerbox.log

Alternatives include AWS logs services :

  • Scalyr logging
  • Sensu monitoring for multi-cloud monitoring using event pipeline

Read about VoIP/ OTT / Telecom Solution startup’s strategy for Building a scalable flexible SIP platform that includes :

  • Scalable and Flexible SIP platform building
  • Cluster SIP telephony Server for High Availability
  • Failure Recovery
  • Multi-tier cluster architecture
  • Role Abstraction / Micro-Service based architecture
  • Distributed Event management and Event-Driven architecture
  • Containerization
  • Autoscaling Cloud Servers
  • Open standards and Data Privacy
  • Flexibility for inter-working – NextGen911 , IMS , PSTN
  • security and Operational Efficiencies

Read more about SIP VoIP system Architecture which includes

  • Infrastructure Requirements
  • Integral Components of a VOIP SIP-based architecture
  • RTP ( Real-Time Transport Protocol ) / RTCP
  • SIP gateways, registrar, proxy, redirect, application
  • Developing SIP-based applications – basic call routing, media management
  • SIP platform Development – NAT and DNS , Cross-platform and integration to External Telecommunication provider landscape , Databases


Harmonization of services between generations of telecommunication core layers


A communication system can be made up of many components which are individually undergoing evolution, such as access layer generations and core layer upgrades. Harmonized and uniform open-standard-based service delivery platforms over a legacy proprietary codebase are the preferred choice for most service providers to protect the investment in their infrastructure and programming while keeping up with the shift in technology. I shall be editing this post to discuss more on the process of Service Harmonization. This saves the Telecom Service Provider the trouble of rewriting call logic with every telecom generation evolution, i.e. IN to SIP to Web based WebRTC phones.

The landscape shift for Telecommunication Service providers includes the transmission layer, which is moving from ATM / Frame Relay towards IP/MPLS; access layer hardware specific to POTS / PSTN / ISDN upgrading towards NGN and VoIP; and packet-switched next-gen softswitches based on SIP.

Telecommunication service Harmonization

The Service Harmonization Layer does the job of holding all new and legacy services while providing a uniform interface to interact with the access network regardless of the back-end call program logic. It involves consolidation of the service layers across IMS and legacy mobile networks, and orchestration to extend the capability of the underlying platform to support multiple IN variants. Diagrammatic depiction of scope of Service Harmonization.

Gateways based Harmonization

Service Broker based Harmonization

As CSPs evolve their networks for LTE, the resulting networks present tremendous challenges in voice services and application delivery. Realizing this opportunity, the telecom software industry has come forward with a purpose-built network element: the Service Broker, a solution specifically designed to overcome network architecture challenges and ensure voice service delivery from any network domain to any other network domain. Service Brokers are placed between the application layer and the control layer.

A service broker is a service abstraction layer between the network and application layer in a telecom environment. The SB ( Service Broker ) enables us to make use of existing applications and services from the Intelligent Network’s SCP ( Service Control Point ), the IMS Application Server as well as other sources in a harmonized manner.

Legacy switches vs Softswitches

Legacy switches are circuit-switched, monolithic, proprietary and expensive, while softswitches are packet-switched and have open interfaces. They are scalable and vendor-independent, which enables easy convergence. Softswitches form the basis for a service harmonization engine as they increase the granularity and distribution of processing power in the network.

Service Delivery Layer in Legacy vs Harmonized Services

The legacy service layer has a function-centric architecture with multiple domain-specific session types such as mobile calls, IPTV and broadband. The harmonized service delivery layer has open APIs and is essentially data-centric. This leads to fast and agile development and deployment of convergent services, specifically with the IMS system providing the framework for underlying network agnosticism across fixed and mobile.


SIP VoIP system architecture basics


A VoIP/CPaaS solution is designed to accommodate both signalling and media, along with integration to various external endpoints such as SIP phones ( desktop, softphones, WebRTC ), telecom carriers, different VoIP network providers, enterprise applications ( Skype, Microsoft Lync ), trunks etc.

A sufficiently capable SIP platform should have

  1. Audio calls ( optionally video ) service using SIP gateways
  2. Media services (such as recording , conferencing, voicemail, and IVR )
  3. Messaging and presence ( could be using SIP SIMPLE, SMS , messaging service from third parties)
  4. Developing SIP based applications : Programmable services through standardized APIs and development of new modules
  5. NAT and DNS : near-end and far-end NAT traversal for signalling and media flows
  6. Telemetry for Sessions , Registry, Location and lookup service
  7. CDR Processing and Billing : Backend for CDR and accounts ( can use Redis, Kafka , MySQL, PostgreSQL, Oracle, Radius, LDAP, Diameter)
  8. Serial and parallel forking, load balancing , proxying
  9. Cross platform and integration to External Telecommunication provider landscape
    • Interconnectivity with other IP multimedia systems, VoLTE ( optional interconnection with other types of communications networks as GSM or PSTN/ISDN).
    • support for VoIP signalling protocols (SIP, H.323, SCCP, MGCP, IAX) and telephony signalling protocols ( ISDN/SS7, FXS/FXO, Sigtran ) either internally via pluggable modules or externally via gateways .

Performance factors :

  • High availability using redundant servers in standby
  • Load balancing
  • IPv4 and IPv6 network layer support
  • TCP , UDP , SCTP transport layer protocol support
  • DNS lookups and hop by hop connectivity

Security considerations :

  • Authentication, authorization, and accounting (AAA)
  • Digest authentication and credentials fetched from backend
  • Media Encryption
  • TLS and SRTP support
  • Topology hiding to prevent disclosing IPs of internal components in Via and Route headers
  • Firewalls , blacklist, filters , peak detectors to prevent DoS and DDoS attacks

The article only outlines SIP system architecture from 3 viewpoints :

  1. Infrastructure standpoint
  2. Core voice engineering perspective
  3. External components required to run the system

Infrastructure Requirements

  • Data Centers with BCP ( Business Continuity Planning ) and DR ( Disaster Recovery )
  • Servers and Clusters for faster, parallel computation
  • Virtualization
    VMs to make a distributed computing environment with HA ( high availability ) and DRS ( Distributed Resource Scheduling )
  • Storage
    SAN with built-in redundancy for the resiliency of data.
    WORM compliant NAS for storing voice archives over a retention period.
  • Racks, power supplies, battery backups, cages etc.
  • Networking
    DMZs ( Demilitarized Zones)  which are interfacing areas between internal servers in the green zone and outside network
    VLANs for segregation between tenants.
    Connectivity through the public Internet as well as through VPN or dedicated optical fibre network for security.
  • Firewall configuration
  • Load Balancer ( Layer 7 )
  • Reverse Proxies for the security of internal IPs and port
  • Security controls In compliance with ISO/IEC 27000 family – Information security management systems
  • PKI Infrastructure to manage digital certificates
  • Key management with HSM ( hardware security module )
  • Trusted CA ( Certificate Authority ) to issue publicly signed certificates for TLS ( HTTPS, WSS etc )
  • OWASP ( Open Web Application Security Project )  rules compliance

Integral Components of a VOIP SIP based architecture

  • Call Controller
  • Media Manager
  • Recording
  • Softclients
  • logs and PCAP archives
  • CDR generators
  • Session Border Controllers ( SBCs )

A SIP server can be moulded to take up any role based on the libraries and programs that run on it such as gateway server, call manager, load balancer etc. This in turn defines its placement in overall VoIP communication architecture. For example
– stateless proxy servers are placed on the border,
– application and B2BUA server at the core

sip entities
SIP platform components

SIP Gateways

A SIP gateway is an application that interfaces a SIP network to a network utilising another signalling protocol. In terms of the SIP protocol, a gateway is just a special type of user agent, where the user agent acts on behalf of another protocol rather than a human. A gateway terminates the signalling path and can also terminate the media path .

sip gateways
To PSTN for telephony inter-working
To H.323 for IP Telephony inter-working
Client – originates message
Server – responds to or forwards message

Logical SIP entities are:

  • User Agent Client (UAC): Initiates SIP requests  ….
  • User Agent Server (UAS): Returns SIP responses ….
  • Network Servers ….

Registrar Server

A registrar server accepts SIP REGISTER requests; all other requests receive a 501 Not Implemented response. The contact information from the request is then made available to other SIP servers within the same administrative domain, such as proxies and redirect servers. In a registration request, the To header field contains the name of the resource being registered, and the Contact header fields contain the contact or device URIs.

registrar server

Proxy Server

A SIP proxy server receives a SIP request from a user agent or another proxy and acts on behalf of the user agent in forwarding or responding to the request. Just as a router forwards IP packets at the IP layer, a SIP proxy forwards SIP messages at the application layer.

Typically proxy servers ( inbound or outbound ) have no media capabilities and ignore the SDP. They are mostly bypassed once a dialog is established, but can stay in the signalling path by adding a Record-Route header.
A proxy server usually also has access to a database or a location service to aid it in processing the request (determining the next hop).

proxy server

1. Stateless Proxy Server
A proxy server can be either stateless or stateful. A stateless proxy server processes each SIP request or response based solely on the message contents. Once the message has been parsed, processed, and forwarded or responded to, no information (such as dialog information) about the message is stored. A stateless proxy never retransmits a message, and does not use any SIP timers.

2. Stateful Proxy Server
A stateful proxy server keeps track of requests and responses received in the past, and uses that information in processing future requests and responses. For example, a stateful proxy server starts a timer when a request is forwarded. If no response to the request is received within the timer period, the proxy will retransmit the request, relieving the user agent of this task.

3. Forking Proxy Server
A proxy server that receives an INVITE request, then forwards it to a number of locations at the same time, or forks the request. This forking proxy server keeps track of each of the outstanding requests and the response. This is useful if the location service or database lookup returns multiple possible locations for the called party that need to be tried.
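Parallel forking can be illustrated with a small asyncio sketch. It is a simulation under stated assumptions, not a SIP stack: `send_invite` stands in for "send the INVITE to one contact and wait for its final response code", `simulated_send` fakes three contacts with fixed delays, and a real proxy would send CANCEL on the branches that lose instead of just cancelling a task.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in: send an INVITE to one contact, return its final status code.
SendInvite = Callable[[str], Awaitable[int]]


async def fork_invite(contacts: List[str], send_invite: SendInvite
                      ) -> Tuple[Optional[str], Dict[str, int]]:
    """Fork an INVITE to all contacts in parallel; the first 2xx response wins."""
    tasks = {asyncio.ensure_future(send_invite(c)): c for c in contacts}
    responses: Dict[str, int] = {}
    winner: Optional[str] = None
    pending = set(tasks)
    while pending and winner is None:
        done, pending = await asyncio.wait(pending, return_when=asyncio.FIRST_COMPLETED)
        for task in done:
            contact = tasks[task]
            status = task.result()
            responses[contact] = status  # the proxy tracks every branch response
            if 200 <= status < 300 and winner is None:
                winner = contact
    for task in pending:
        task.cancel()  # a real proxy would send CANCEL on these branches
    if pending:
        await asyncio.gather(*pending, return_exceptions=True)
    return winner, responses


async def simulated_send(contact: str) -> int:
    """Fake endpoints: desk is busy, mobile answers, tablet would answer much later."""
    behaviour = {
        "sip:alice@desk": (0.01, 486),
        "sip:alice@mobile": (0.05, 200),
        "sip:alice@tablet": (5.0, 200),
    }
    delay, status = behaviour[contact]
    await asyncio.sleep(delay)
    return status
```

Running `asyncio.run(fork_invite(["sip:alice@desk", "sip:alice@mobile", "sip:alice@tablet"], simulated_send))` picks the mobile contact and never waits for the tablet branch.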

Redirect Server

A redirect server is a type of SIP server that responds to, but does not forward, requests. Like a proxy server, a redirect server uses a database or location service to look up a user. The location information, however, is sent back to the caller in a redirection class response (3xx), which, after the ACK, concludes the transaction. The Contact header in the response indicates where the request should be tried.

redirect server

Application Server

The heart of all call routing setup. It loads and executes scripts for call handling at runtime and maintains transaction states and dialogs for all ongoing calls. It is usually the one to rewrite SIP packets, adding media relay servers and handling NAT. It also connects external services like accounting, CDR and stats to calls.

Adding Media Management

Media processing is usually provided by media servers in accordance with the SIP signalling. Bridges, call recording, voicemail, audio conferencing, and interactive voice response (IVR) are commonly used. Read more about Media Architecture here

RFC 6230 Media Control Channel Framework describes a framework and protocol for application deployment where the application programming logic and media processing are distributed.

Any one such service could be a combination of many smaller services; for example, voicemail is a combination of prompt playback, runtime controls, Dual-Tone Multi-Frequency (DTMF) collection, and media recording. RFC 6231 defines the Interactive Voice Response (IVR) Control Package for the Media Control Channel Framework.

DTMF ( Dual Tone Multi Frequency )

delivery options:

  • Inband – digits are passed along as normal audio tones, just like the rest of your voice, with no special coding or markers. They use the same codec as your voice and are generated by your phone.
  • Out-of-band – the incoming stream delivers DTMF signals outside the audio, using either the SIP INFO or the RFC 2833 mechanism, independently of codecs. In this case the DTMF signals are sent separately from the actual audio stream.
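To make the in-band option concrete, the sketch below generates the audio samples a phone would mix into the voice stream for one key. The frequency grid is the standard DTMF keypad layout; the function names, the 8 kHz sample rate and the 0.5 amplitude are illustrative assumptions.

```python
import math
from typing import List, Tuple

ROW_FREQS = (697, 770, 852, 941)       # low-group frequencies in Hz
COL_FREQS = (1209, 1336, 1477, 1633)   # high-group frequencies in Hz
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key: str) -> Tuple[int, int]:
    """Return the (low, high) frequency pair for a single DTMF key."""
    if len(key) != 1:
        raise ValueError("expected a single key")
    for row, keys in enumerate(KEYPAD):
        col = keys.find(key)
        if col != -1:
            return ROW_FREQS[row], COL_FREQS[col]
    raise ValueError("not a DTMF key: %r" % key)


def dtmf_samples(key: str, duration_s: float = 0.1, sample_rate: int = 8000,
                 amplitude: float = 0.5) -> List[float]:
    """Generate in-band samples: the sum of the two tones, scaled to amplitude."""
    low, high = dtmf_frequencies(key)
    count = round(duration_s * sample_rate)
    step = 2.0 * math.pi / sample_rate
    return [
        amplitude * 0.5 * (math.sin(step * low * n) + math.sin(step * high * n))
        for n in range(count)
    ]
```

Because these samples travel through the voice codec, a lossy low-bitrate codec can distort them, which is one reason the out-of-band options exist.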

TTS ( Text to Speech )

Alexa Text-to-Speech (TTS) + Amazon Polly

Ivona – multiple language text to speech converter with SSML scripts such as below

      <speak>
          <p>
              <s><prosody rate="slow">IVONA</prosody> means highest quality speech
              synthesis in various languages.</s>
              <s>It offers both male and female radio quality voices <break/> at a
              sampling rate of 22 kHz <break/> which makes the IVONA voices a
              perfect tool for professional use or individual needs.</s>
          </p>
      </speak>

check ivona status

service ivona-tts-http status
tail -f /var/log/tts.log

Developing SIP based applications

Basic SIP methods

SIP defines basic methods such as INVITE, ACK and BYE, which can handle simple call routing as well as some more advanced processes like call forwarding/redirection, call hold with optional music on hold, call parking, forking, barge etc.

Extending SIP headers

Newer SIP methods defined by later SIP RFCs include INFO, PRACK, PUBLISH, SUBSCRIBE, NOTIFY, MESSAGE, REFER and UPDATE. More methods or headers can be added to baseline SIP packets for customization specific to a particular service provider. When a SIP proxy encounters a SIP header which it does not support or understand, it will simply forward it to the specified endpoint.

Call routing Scripts

Interfaces for programming SIP call routing include :
– Call Processing Language—SIP CPL,
– Common Gateway Interface—SIP CGI,
– SIP Servlets,
– Java API for Integrated Networks—JAIN APIs etc .

Some known SIP stacks :

SailFin – SIP servlet container that uses the GlassFish open source enterprise Application Server platform (GPLv2); obsolete since Sun was merged into Oracle.

Mobicents – supports both JSLEE 1.1 and SIP Servlets 1.1 (GPLv2)

Cipango – extension of SIP Servlets to the Jetty HTTP Servlet engine thus compliant with both SIP Servlets 1.1 and HTTP Servlets 2.5 standards.

WeSIP – SIP and HTTP ( J2EE ) converged application server built on the OpenSER SIP platform

Additionally, SIP stacks are available in almost all popular programming languages and can be imported as libraries for building call routing scripts to be mounted on SIP servers or endpoints, such as :

PJSIP in C

JsSIP in JavaScript

Sofia-SIP, used in FreeSWITCH

Some popular SIP servers also have their own scripting interfaces, such as –
Asterisk Gateway Interface (AGI) , an application interface for extending the dialplan with your functionality in the language you choose – PHP, Perl, C, Java, Unix Shell and others

SIP platform Development

  • audio calls ( optionally video )
  • media services such as conferencing, voicemail, and IVR,
  • messaging as IM and presence based on SIMPLE,
  • programmable services through standardized APIs and development of new modules
  • near-end and far-end NAT traversal for signalling and media flows
  • interconnectivity with other IP multimedia systems, VoLTE ( optional interconnection with other types of communications networks as GSM or PSTN/ISDN)
  • Registry, location and lookup service
  • Serial and parallel forking

A sufficiently capable SIP platform should consist of the following features :

Performance factors :

  • High availability using redundant servers in standby
  • Load balancing
  • IPv4 and IPv6 support

Security considerations :

  • digest authentication and credentials fetched from backend
  • Media Encryption
  • TLS and SRTP support
  • Topology hiding to prevent disclosing IPs of internal components in Via and Route headers
  • Firewalls , blacklist, filters , peak detectors to prevent DoS and DDoS attacks .

Collecting and Processing PCAPS

  • VoIP monitor – network packet sniffer with commercial frontend for SIP RTP RTCP SKINNY(SCCP) MGCP WebRTC VoIP protocols

It uses a passive network sniffer (like tcpdump or Wireshark) to analyse packets in realtime and transforms all SIP calls with associated RTP streams into database CDR records, which are sent over TCP to a MySQL server (remote or local). If saving of SIP / RTP packets is enabled, the sniffer stores each VoIP call in a separate file in native pcap format (on local storage).

voip monitor
  • sngrep
  • tcpdump
  • custom made pcap capture and uploader

NAT and DNS

To adapt SIP to modern IP networks with inter-network traversal, ICE and far-end and near-end NAT traversal solutions are used. NAT ( Network Address Translation ) traversal is critical to traffic flow between private and public networks, and from behind firewalls and policy controlled networks.
One can use any of the VOVIDA-based STUN server, mySTUN , TurnServer, reStund , CoTURN , NATH (PJSIP NAT Helper), ReTURN, or ice4j

Near-end NAT traversal

STUN (session traversal utilities for NAT) – the UA itself detects the presence of a NAT and learns the public IP address and port assigned by the NAT. It then replaces the device's local private IP address with these in the SIP and SDP headers. Near-end traversal is implemented via STUN, TURN, and ICE.
Limitations are that STUN does not work for symmetric NAT (where each connection has a different mapping with a different/randomly generated port), nor in situations where an endpoint has multiple addresses.

TURN (traversal using relay around NAT) or STUN relay – the UA learns the public IP address of the TURN server and asks it to relay incoming packets. The limitation is that, since it handles all incoming and outgoing traffic, it must scale to meet traffic requirements and should not become the bottleneck or a single point of failure.

ICE (interactive connectivity establishment) – the UA gathers prioritised “candidates of communication” and exchanges them with the remote party through offer-answer negotiation. The client then pairs local candidates with the received peer candidates and performs connectivity checks on all pairs, therefore maximising success. The types of candidates :
– host candidate, which represents the client's own IP addresses,
– server reflexive candidate for the address that has been resolved from STUN
– and a relayed candidate for the address which has been allocated from a TURN relay by the client.
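The candidate priorities can be computed with the formula from the ICE specification (RFC 8445): priority = 2^24 × type preference + 2^8 × local preference + (256 − component ID). The sketch below uses the type preference values recommended there; the function name and the default local preference of 65535 (a single-interface host) are illustrative choices.

```python
# Recommended type preferences from the ICE specification (RFC 8445).
TYPE_PREFERENCE = {
    "host": 126,   # the client's own address
    "prflx": 110,  # peer reflexive, discovered during connectivity checks
    "srflx": 100,  # server reflexive, learned from STUN
    "relay": 0,    # allocated on a TURN relay
}


def candidate_priority(candidate_type: str, local_preference: int = 65535,
                       component_id: int = 1) -> int:
    """ICE candidate priority; component_id 1 is conventionally RTP, 2 is RTCP."""
    if not 0 <= local_preference <= 65535:
        raise ValueError("local preference must fit in 16 bits")
    if not 1 <= component_id <= 256:
        raise ValueError("component ID must be between 1 and 256")
    type_pref = TYPE_PREFERENCE[candidate_type]
    return (type_pref << 24) + (local_preference << 8) + (256 - component_id)
```

With these defaults a host candidate gets 2130706431, a server reflexive candidate 1694498815 and a relayed candidate 16777215, so direct paths are tried before STUN-derived ones and the TURN relay is the last resort.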

Far-end NAT traversal

The UA is not concerned about NAT at all and communicates using its local IP and port. The border controller employs NAT handling components such as an application layer gateway (ALG) or universal plug and play (UPnP), which resolve the private and public network address mapping by acting as a back to back user agent (B2BUA).
Far end NAT can also be enabled by deploying a public SIP server which performs media relay (RTP Proxy/Media proxy).

Limitations of this approach
(-) security risks as they are operating in the public network
(-) enabling reverse traffic from UAS to UAC behind NAT.

A keep-alive mechanism is used to keep the NAT translations of communications between a SIP endpoint and its serving SIP servers open, so that this NAT translation can be reused for routing. It contains a client-to-server “ping” keep-alive and corresponding server-to-client “pong” messages. The 2 keep-alive mechanisms are a CRLF keep-alive and a STUN keep-alive message exchange.

The 3 types of SIP URIs,

  • address of record (AOR)
  • fully qualified domain name (FQDN)
  • globally routable user agent (UA) URI
    SIP uniform resource identifiers (URIs) are resolved via DNS, since the part of the URI after the @ symbol contains the hostname, port and protocol for the next hop.

Locating the correct SIP server for a SIP message can be done using :
– DNS service record (DNS SRV)
– naming authority pointer (NAPTR) DNS resource record

Steps for SIP endpoints locating SIP server

  1. From the domain in the SIP URI, get the NAPTR record to find the protocol to be used
  2. Inspect the SRV record to fetch the port to use
  3. Inspect the A/AAAA record to get the IPv4 or IPv6 addresses
    ref : RFC 3263 – Locating SIP Servers
    A BIND9 server can be used for DNS resolution; it supports NAPTR/SRV, ENUM, DNSSEC, multidomains, and private or public trees.
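The SRV step of the lookup above can be sketched without touching the network. The snippet builds the SRV query name for a transport and orders already-fetched SRV records. It is a simplification: the SRV specification calls for weighted random selection among records of equal priority, whereas this sketch sorts deterministically by priority and then by descending weight. `SrvRecord`, `srv_query_name`, `order_srv_targets` and the example.com hosts are all illustrative.

```python
from typing import List, NamedTuple


class SrvRecord(NamedTuple):
    priority: int  # lower value is tried first
    weight: int    # higher value gets more traffic within one priority
    port: int
    target: str


# SRV service labels used for SIP over each transport.
SRV_PREFIX = {"udp": "_sip._udp", "tcp": "_sip._tcp", "tls": "_sips._tcp"}


def srv_query_name(domain: str, transport: str) -> str:
    """Build the SRV query name for the transport chosen in the NAPTR step."""
    return "%s.%s" % (SRV_PREFIX[transport.lower()], domain)


def order_srv_targets(records: List[SrvRecord]) -> List[SrvRecord]:
    """Order SRV records by priority, then by descending weight (simplified)."""
    return sorted(records, key=lambda r: (r.priority, -r.weight))
```

Each target from the ordered list is then resolved through its A/AAAA record and tried in turn until one responds.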

CDR Processing and Billing

CDRs store call detail records along with proof of call, with timestamps, origination, destination, duration, rate etc. At the end of the month or any other term, the aggregated CDRs are cumulatively processed to generate the bill for a user. This heavy data stream needs to be accurately processed, which can be achieved by using data pipelines like AWS Kinesis or a Kafka event store.
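The aggregation itself is simple once the records are in hand. A minimal sketch, assuming per-second pro rata charging at a per-minute rate and rounding each account total to two decimals; the `Cdr` fields and `bill_accounts` are illustrative, and real rating rules (minimum durations, billing increments, currencies) will differ.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, Iterable, NamedTuple


class Cdr(NamedTuple):
    account: str
    origination: str
    destination: str
    start_ts: int           # call start, epoch seconds
    duration_s: int         # billable duration in seconds
    rate_per_min: Decimal   # rate applied to this call


def bill_accounts(cdrs: Iterable[Cdr]) -> Dict[str, Decimal]:
    """Sum duration x rate per account and round each total to 2 decimals."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    for cdr in cdrs:
        totals[cdr.account] += Decimal(cdr.duration_s) / 60 * cdr.rate_per_min
    return {account: total.quantize(Decimal("0.01"))
            for account, total in totals.items()}
```

Decimal is used instead of float so that sums of many small charges do not accumulate binary rounding errors.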

The prime requirement for the system is to handle an enormous amount of call record data in realtime and cater to a number of producers and consumers.

For security the data is obfuscated into a blob using base 64 encoding. Base64 is an encoding rather than encryption, so it obscures the data but does not protect it by itself.

For good consistency only a single shard should be responsible for processing one user account’s bill.

Data Streams for billing service

AWS Kinesis – Kinesis Data Streams is used for rapid and continuous data intake and aggregation. The type of data used can include IT infrastructure log data, application logs, social media, market data feeds, and web clickstream data. It supports data sharding (i.e. grouping a number of call records) and uses a partition key ( a string hashed with MD5 ) to determine which shard the record goes to. 

(+) This system can handle a high volume of data in realtime and produce call-UUID-specific results which can be consumed by consumers waiting for the processed results

(-) If not consumed within a pre-specified time duration, the processed results expire and are irretrievable. One must implement a publisher to store the processed results from the Kinesis stream in data stores like Redis / RDBMS or other storage locations like S3 or DynamoDB. If the pipeline crashes during operation, data is lost

(-) The data stream should have low latency when ingesting continuous data from producers and presenting data to consumers.
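The partition key mechanism is what keeps all records of one account on one shard. The sketch below mimics it locally: the MD5 digest of the partition key is read as a 128-bit integer and mapped onto shards. It assumes the hash key space is split evenly across shards, which is not guaranteed after resharding; `partition_hash` and `shard_for_key` are illustrative names, not AWS SDK calls.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 digests are 128-bit integers


def partition_hash(partition_key: str) -> int:
    """Interpret the MD5 digest of the partition key as a 128-bit integer."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    return partition_hash(partition_key) * shard_count // HASH_SPACE
```

Using the account identifier as the partition key means every CDR of that account hashes to the same shard, which is what allows a single shard to own one account's bill.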

Call Rate and Accounting

Generally, data stream processing is used for critical and voluminous service usage, such as
– metering/billing
– server activity,
– website clicks,
– geo-location of devices, people, and physical goods

Call rates are very critical for billing and charging the calls. Any updates from the customer, carriers or individuals need to propagate automatically and quickly to avoid discrepancies and negative margins. CDRs need to be processed sequentially and incrementally on a record-by-record basis or over sliding time windows, and used for a wide variety of analytics including correlations, aggregations, filtering, and sampling.

To achieve this, the following setup is ideal: take the new rate sheet values via a web UI console or POST API and propagate them quickly to the main DB via AWS SQS, which is a queueing service, and AWS Lambda, which is a serverless, trigger-based system. This ensures that any new input rates are updated in realtime, with fallback values maintained in an S3 bucket too

Call Rate and Accounting using task pipes, Lambda serverless and queueing service. Uses S3 buckets, AWS Lambda, AWS SQS and AWS RDS.

Cross platform and integration to External Telecommunication provider landscape

It is an advantage to plan ahead for connection with IMS such as OpenIMS, and for support of VoIP signalling protocols (SIP, H.323, SCCP, MGCP, IAX) and telephony signalling protocols ( ISDN/SS7, FXS/FXO, Sigtran ), either internally via pluggable modules or externally via gateways, or for SIP trunking integration via OTT providers / cloud telephony.

Adhere to Standard

The obvious starting milestone before making a full-scale, carrier-grade, SIP-based VoIP system is to build a PBX for intra-enterprise communication. There are readily available solutions to make an IP telephony PBX, such as Kamailio, FreeSWITCH, Asterisk, Elastix and SipXecs. It is important to use standard protocols and widely accepted media formats and codecs to ensure interoperability and reduce the compute and delay involved in protocol or media transcoding.

Database Integration

Need backend, cache, database integration to not only store routing rules with temporary variable values but also account details, call records details, access control lists etc. Should therefore extend integration with text-based DB, Redis, MySQL, PostgreSQL, OpenLDAP, and OpenRadius.

Consistency of Call Records and duplicated charging records at various endpoints

In current VoIP scenarios, a call may be passing through various telco providers, ISP and cloud telephony service providers where each system maintains its own call records and billing. This in my opinion is duplication and can be avoided by sharing a consistent data store, possibly in the blockchain. This is an experimental idea that I have further explored in this article


There are other external components needed to set up a VoIP solution apart from core voice servers and gateways, like the ones listed below. I will try to either add a detailed overall architecture diagram here or write about them in a separate article. Keep watching this space for updates

  • Payment Gateways
  • Billing and Invoice
  • Fraud Prevention
  • Contacts Integration
  • Call Analytics
  • API services
  • Admin Module
  • Number Management ( DIDs ) and porting
  • Call Tracking
  • Single Sign On and User Account Management with Oauth and SAML
  • Dashboards and Reporting
  • Alert Management
  • Continuous Deployment
  • Automated Validation
  • Queue System
  • External cache

SIP solutioning and architectures is a subsequent article after SIP introduction, which can be found here.

IMS , the revolution ahead

Vision :
To make a model that separates the services offered by fixed-line (traditional telcos), mobile (traditional cellular), and converged service providers (cable companies and others who provide triple-play services: voice, video, and data) from the access networks used to receive those services.

Layers :
IMS architecture is broken into distinct layers:
[Figure: IMS architecture layers]

Drivers :
Revenue streams for plain vanilla voice services are sharply falling, and the need of the hour is to propose smart, intuitive and creative services to keep the telecom market alive.