- Load Balancers
- MPLS
- Service discovery
- Replication
- Data Store Replication
- Quick Response / Low Latency
- Scalability
- Multiple PoPs (points of presence)
- Minimal latency and lowest amount of traffic via the public internet
- High availability (HA)
- 5 9’s in aggregate failures
- HA for Load Balancer (LB)
- HA for Call Control app server
- Media Server HA
- Security against malicious attacks
- MITM
- DDoS
- Identifying outages, logs and pcap analysis + alerting
- Bottlenecks
- Performance testing
- Distributed Data Store
- Distributed event management and event-driven architecture using streams
- Distributed cache for call control servers
- Circuits – fail fast, wait for circuit to recover before calling again
Load Balancers
The load balancer (LB) is the initial point of interaction between the client application and the core system. It is pivotal in distributing load across multiple servers and in connecting the client to the nearest VoIP/SIP application server to minimize latency. However, load balancers are also susceptible to security breaches and DoS attacks since they expose a public-facing interface. This section lists the protocols, types and algorithms popularly used in load balancers for VoIP systems.
| Software LB | Layer 4 / hardware LB |
| --- | --- |
| Nginx, Amazon ELB (Elastic Load Balancer) | F5 BIG-IP load balancer, Cisco Systems Catalyst, Barracuda load balancer, NetScaler |
| used by applications in the cloud, ADN (Application Delivery Network) | used by network address translators (NATs), DNS load balancing |
The load balancer (LB) pings each server for health status and greylists servers that are unhealthy (i.e. respond late), since they may be overloaded or experiencing congestion. The LB keeps monitoring them, rechecks after a while, and if a server is healthy again (i.e. it responds with a status update) the LB resumes sending traffic to it. LBs should also be distributed across data centres in a primary–secondary setup for HA.
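A minimal sketch of this greylist-and-recheck loop, assuming a caller-supplied `probe` callable (e.g. a SIP OPTIONS ping or an HTTP health request) and an arbitrary 30-second cool-off; this is illustrative, not any particular LB's implementation:

```python
import time

class HealthChecker:
    """Greylist servers that fail or respond late; re-probe them after a cool-off period."""
    def __init__(self, servers, probe, recheck_interval=30.0):
        self.probe = probe                       # callable(server) -> bool, e.g. an OPTIONS ping
        self.recheck_interval = recheck_interval
        self.healthy = set(servers)
        self.greylisted = {}                     # server -> time it was greylisted

    def run_once(self):
        now = time.time()
        for server in list(self.healthy):
            if not self.probe(server):
                self.healthy.discard(server)     # stop routing to an overloaded/unresponsive server
                self.greylisted[server] = now
        for server, since in list(self.greylisted.items()):
            if now - since >= self.recheck_interval and self.probe(server):
                del self.greylisted[server]
                self.healthy.add(server)         # resume sending traffic to it

# usage: checker = HealthChecker(["10.0.0.1", "10.0.0.2"], probe=lambda s: True)
```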
Networking protocol
| TCP load balancer | HTTP load balancer | SIP-based LB such as Kamailio / OpenSIPS |
| --- | --- | --- |
| forwards the packet without inspecting its content | terminates the connection and looks inside the request to make a load-balancing decision, for example by using a cookie or a header | domain specific to VoIP |
| (+) fast, can handle millions of requests per second | | (+) handles SIP routing based on SIP headers and prevents flooding attacks and other malicious malformed packets from reaching the application server |
Load balancing algorithms
- Weighted scheduling algorithm
- Round-robin algorithm
- Least connections first scheduling
- Least response time algorithm
- Hash-based algorithm (send requests based on a hashed value such as the client IP address or request URL) – see the sketch below
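As an illustration, here is a rough sketch of two of these strategies, least connections and hash-based selection. The class and function names are invented for this example:

```python
import hashlib

class LeastConnectionsBalancer:
    """Pick the backend with the fewest active connections (ties broken arbitrarily)."""
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def acquire(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1

def hash_based_pick(servers, key):
    """Hash-based selection: the same caller IP / request URI always maps to the same server."""
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return servers[digest % len(servers)]
```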
| Load balancer | Reverse proxy |
| --- | --- |
| like a forward proxy, it allows multiple clients to route traffic to an external server | accepts client requests on behalf of servers and returns the server's response to the client, i.e. routes traffic on behalf of multiple servers |
| balances load; endpoint for incoming traffic | public-facing endpoint for outgoing traffic; adds a level of abstraction and security, compression |
| | used in SBCs (session border controllers) and gateways |
Service Discovery
Client-side or backend service discovery uses a broadcasting or heartbeat mechanism to keep track of active servers and deactivate unresponsive or failed servers. Maintaining this list of active servers helps achieve faster connection times. Some approaches to service discovery:
- Mesh
- (-) exponentially increasing network traffic
- Gossip
- Distributed cache
- Coordination service (for leader selection)
- (-) requires a coordination service for leader selection
- (-) needs consensus
- (-) RAFT and pBFT for managing failures
- Random leader selection
- (+) quicker
- (-) may not guarantee exactly one leader
- (-) split-brain problem
Keepalive, unregistering unhealthy nodes
Systems such as Consul, Etcd, and Zookeeper can help services find each other by keeping track of registered names, addresses, and ports. Health checks help verify service integrity and are often done using an HTTP endpoint.
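For illustration only, a toy in-memory registry showing the keepalive/TTL idea (real deployments would use Consul, etcd or ZooKeeper); the `ttl` value and method names are assumptions:

```python
import time

class ServiceRegistry:
    """Toy registry: services register and send heartbeats; stale entries are dropped."""
    def __init__(self, ttl=15.0):
        self.ttl = ttl
        self.instances = {}                    # (name, address, port) -> last heartbeat time

    def register(self, name, address, port):
        self.instances[(name, address, port)] = time.time()

    heartbeat = register                        # a heartbeat simply refreshes the timestamp

    def resolve(self, name):
        now = time.time()
        # unregister unhealthy nodes whose last heartbeat is older than the TTL
        self.instances = {k: t for k, t in self.instances.items() if now - t <= self.ttl}
        return [(addr, port) for (n, addr, port) in self.instances if n == name]
```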
Replication
Usually there is a tradeoff between liveness and safety.
- Single-leader replication
- (-) vulnerable to data loss if the leader goes down before replication completes
- used in SQL databases
- Multi-leader replication
- Leaderless replication
- (-) increases latencies
- (-) quorum based on majority; cannot function if a majority of nodes are down (see the quorum sketch below)
- used in Cassandra
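The quorum rule behind leaderless replication can be stated in a couple of lines; the helper below merely illustrates the W + R > N overlap condition and is not any database's actual API:

```python
def quorum_ok(n, w, r):
    """Leaderless (Dynamo/Cassandra-style) quorum rule: reads and writes overlap when W + R > N."""
    return w + r > n

# N=3 with W=2, R=2 tolerates one unreachable replica and still returns fresh data
assert quorum_ok(3, 2, 2)
# W=1, R=1 favours latency/liveness but may return stale data (no overlap guaranteed)
assert not quorum_ok(3, 1, 1)
```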
Data Store Replication
For relational databases

For NoSQL database replication and HA

Quick Response / Low latency
Message format
| Textual message format | Binary message format |
| --- | --- |
| human readable, e.g. JSON, XML | harder to comprehend; needs a shared schema between sender and receiver to serialize and deserialize |
| names for every field add to the size | no field names, or only tags, which reduces message size |
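A quick comparison of the two formats using Python's standard json and struct modules; the call-record fields and the packed binary layout are invented for this example:

```python
import json
import struct

# A tiny call-record message encoded both ways (field names and layout are assumptions)
record = {"call_id": 12345, "duration_sec": 84, "status": 200}

textual = json.dumps(record).encode()                 # field names travel with every message
binary = struct.pack("!IHH", record["call_id"],       # shared schema: uint32, uint16, uint16
                     record["duration_sec"], record["status"])

print(len(textual), len(binary))                      # e.g. ~50 bytes vs 8 bytes
```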
Gateways for faster routing and caching to services
Gateways are a single entry point that routes user requests to backend services.
Separate hot storage from cold storage
Hot storage is frequently accessed data which must be kept near the server.
Cold storage is less frequently accessed data such as archives
- object storage
- slow access
Scalability
To make a system:
- scalable: use partitioning
- reliable: use replication and checkpointing so that data is not lost on failures
- fast: use in-memory storage
According to the CAP theorem, consistency and availability are difficult to achieve together, and there has to be a tradeoff according to requirements.
Partitioning
A partition strategy can be based on various keys, such as:
- name-based partition
- geographic partition
- hash of an identifier such as the name
- (-) can lead to hot partitions (high density in areas of frequently accessed identifiers)
- (-) high-density spots, for example all messages with a null key going to the same partition
- (-) doesn't scale
- event-time-based hash
- (+) data is spread evenly over time
To create a well-distributed partition scheme we could spread a hot partition into 2 partitions or dedicate partitions to frequently accessed items. An effective partitioning key has:
- Cardinality: the total number of unique keys for a use case. High cardinality leads to better distribution.
- high-cardinality keys: names, email addresses, URLs, since they have high variation
- low-cardinality keys: boolean flags such as gender M/F
- Selectivity: the number of messages with each key. High selectivity leads to hotspots, so low selectivity is better for even distribution (see the sketch below).
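A small sketch of hash-based partition assignment illustrating why cardinality matters; the key names and partition count are arbitrary:

```python
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    """Hash-based partitioning: a high-cardinality, low-selectivity key spreads load evenly."""
    digest = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return digest % num_partitions

# A high-cardinality key (e.g. a caller email or call-id) distributes well ...
print(partition_for("alice@example.com", 8))
# ... while a low-cardinality key (e.g. a boolean flag) can only ever hit 2 of the 8 partitions
print(partition_for("True", 8), partition_for("False", 8))
```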

Autoscaling
Scale out, not up!
Kubernetes is an open-source container-orchestration system for automating application deployment, scaling, and management, used in DevOps. I have covered this in detail in the article on VoIP and DevOps below.
Multiple PoPs (point of presence)
For a VoIP system catering to many clients across the globe, or accessing multiple carriers meant for different countries based on prefix matching, there should be a local PoP in the most-used regions. Typically these regions include the US East and West coasts, UK/Germany (London), Asia Pacific (Mumbai, Hong Kong) and Australia.

Minimal latency and lowest amount of traffic via the public internet
Creating multiple PoPs and enabling private traffic via VPN between them ensures that we use the backbone of our cloud provider (such as AWS) or datacentre instead of traversing the public internet, which is slower and less secure. Hopping over a private interface between the cloud servers, and maintaining a private connection with keepalives between them, helps optimize traffic flow while keeping RTT and latency low.

HA (High Availability)
Some factors affecting dependability are:
- Eventual consistency
- Multi-region failover
- Disaster recovery
A high-availability (HA) architecture implies dependability, usually via redundant application servers for backup: a primary and a standby. These are configured so that if the primary fails, the standby can take over its operations without significant loss of data or impact to business operations.
Downtime / SLA of 5 9’s in aggregate failures
Four 9s of availability (99.99%) on each service component gives a downtime of about 53 minutes per service each year. However, with 10 such components in the request path, the aggregate availability comes to (99.99%)^10 ≈ 99.9%, which is roughly 8–9 hours of downtime each year.

Thus, aggregate failure should be taken into consideration while designing reliable systems.
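The arithmetic behind that estimate, assuming 10 serially dependent components each at four 9s:

```python
# Rough arithmetic behind the aggregate-failure estimate above.
per_service = 0.9999                   # four 9s per component
components = 10

aggregate = per_service ** components  # ≈ 0.999, i.e. roughly three 9s end to end
downtime_hours = (1 - aggregate) * 365 * 24
print(round(aggregate, 4), round(downtime_hours, 1))   # 0.999, ~8.8 hours per year
```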
HA for Proxy / Load balancer (LB)
An LB is the first point of contact for outbound calls and usually does not save dialog information to memory or a database, but it still holds transaction information in memory. In case the LB crashes and has to restart, it should:
- have a quick startup time
- be able to handle in-dialog requests
- handle new incoming dialog requests in a stateless manner
- verify authentication/authorization details from requests even after restart
HA for Call Control app server
The app server is where all the business logic for call flow management resides, and it maintains the dialog information in memory.
Issues with in-memory call states: if the VM or server hosting the call control app server goes down or gets disconnected, live calls are affected, which in turn causes revenue loss, primarily because the state variable holding the call duration would not be passed on to the CDR/billing service upon termination of the call. For long-distance, multi-telco-endpoint calls running for hours this could be a significant loss.
- Standby app server configuration and shared memory: if the primary app server crashes, the standby app server should be ready to take its place and read the dialog states from shared memory.
- Live load-balanced secondary app server + external cache for state variables: a cluster of master–slave caches like Redis is a good way of maintaining the dialog state and reading from it once the app server recovers from a failed state or when a secondary server finds a variable missing from its local memory.
Media Server HA
Assume the Kamailio–RTPengine duo as app server and media server. These components can reside in the same or different VMs. In case of a media server crash, while restoring the restarted RTPengine or assigning a secondary backup RTPengine, it should load the state of all live calls without dropping any and causing loss of revenue. This is achieved by:
- an external cache such as Redis,
- quick switchover from the primary to the secondary/fallback media server, and
- floating IPs for media servers, which ensure call continuity in spite of failure on the active media server.
Architecturally it looks the same as fig above on HA for the SIP app server.
Security against malicious attacks
Attacks and security compromises pose a very significant threat to a VoIP platform.
MITM attacks
Man-in-the-middle attacks can be countered by:
- End-to-end encryption of media using SRTP and of signalling using TLS
- A strong SIP auth mechanism using challenges and credentials, where the password is composed of mixed alphanumeric characters and is at least 12 characters long
- Authorization / whitelisting based on IP ranges that adhere to CIDR notation
DDOS attacks
A DDoS attack renders a particular network element unavailable, usually by directing an excessive amount of network traffic at its interfaces.
DDoS uses multiple network hosts to flood a target host with a large amount of network traffic. It can be created by sending falsified SIP requests to other parties such that numerous transactions originating in the backwards direction converge on the target server, creating congestion.
It can be countered by:
- detecting flooding and queue build-up in traffic, and using Fail2ban to block offending sources (see the sketch below)
- challenging questionable requests with only a single 401 (Unauthorized) or 407 (Proxy Authentication Required)
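A rough sliding-window flood detector as one possible building block for the first point; the thresholds are arbitrary and the hand-off to Fail2ban/iptables is left as a comment rather than a real integration:

```python
import time
from collections import defaultdict, deque

class FloodDetector:
    """Sliding-window counter per source IP; offending IPs can be handed to Fail2ban/iptables."""
    def __init__(self, max_requests=100, window_sec=10.0):
        self.max_requests = max_requests
        self.window_sec = window_sec
        self.requests = defaultdict(deque)     # ip -> timestamps of recent SIP requests

    def allow(self, ip):
        now = time.time()
        window = self.requests[ip]
        window.append(now)
        while window and now - window[0] > self.window_sec:
            window.popleft()                   # drop timestamps outside the window
        return len(window) <= self.max_requests   # False -> flood suspected: challenge or block

# usage: if not detector.allow(src_ip): reply with a single 401/407, or drop and blacklist the IP
```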
Read about SIP security practices in detail at https://telecom.altanai.com/2020/04/12/sip-security/
Other important factors affecting security
- Keystores and certificate-expiry tracking
- Privileges and roles
- Test cases and code coverage
- Reviewer approval before code merge
- A window for QA setup and testing, to give the go-ahead before deployment
Identifying outages and Alerting
Raise event notification alerts to designated developers for any anomalous behaviour. It could be a call-based or SMS-based alert, depending on the severity of the situation.

Sources for alert manager
- Build failures (code crashes, Jenkins errors)
- Deployment failures (from Kubernetes, codechef, docker ...)
- Configuration errors (setting up VPNs etc.)
- Server logs
- Server health
- Homer alerts (SIP call responses 4xx, 5xx, 6xx)
- PCAP alerts (malformed SIP/SDP ...)
- Internal smoke tests (automated testing procedures done routinely to check live systems)
- Support tickets from customer complaints (treat these as high priority since they directly impact customers)
Bottlenecks
The test bed and QA framework play a very critical role in the final product's credibility and quality.
Performance Testing
- Stress testing: push the system to its breaking point
- Load testing: test at 2x to 3x the expected load
- Soak testing: apply a typical network load for a long time (to identify leaks)
Robust QA framework (stress and monkey testing) to identify potential bottlenecks before going live
A QA framework basically validates the services and call flows in a staging environment before pushing changes to production. Any architectural change should especially be validated thoroughly on the staging QA framework before making the cut. The qualities of an efficient QA platform are:
Generic nature – the QA framework should be adaptable to different environments such as dev, staging and prod.
Containerized – it should be easy to spin up the QA environment for large-scale or small-scale testing, hence it should be dockerized.
CI/CD integration and automation – integrate the test cases tightly with git post-push and pull-request creation.
Minimal external dependencies – keep as few external dependencies as possible; for example, a telecom carrier can be simulated using a PBX like FreeSWITCH or Asterisk.
Asynchronous runs – test cases should be able to run asynchronously, e.g. a separate SIPp XML script for each use case.
Sample test cases for VoIP
- Authentication before establishing a session
- Balance and account checks before establishing a session, such as whitelisting, blacklisting, restricted permissions in a particular geography
- Transport security and adaptability checks: TLS, UDP, TCP
- Codec support validation
- DTMF generation and detection
- Cross-checking CDR values with the actual call initiator and terminator party
- Cross-checking call UUIDs and stats
- Validating media and related timeouts
QA framework tools – Robot Framework
traffic monitor – VoIPmonitor
customer simulator – SIPp scripts
network traffic analyser – Wireshark
pcap collectors – tcpdump, sngrep
Distributed Data Store
A distributed database design could have many components. It could work on a static datastore like:
- SQL DB where the schema is important
- MySQL
- PostgreSQL
- Spanner – globally distributed database from Google
- NoSQL DB to store records as JSON
- Cassandra – distributed column-oriented database
- Cache for low-latency retrievals
- Memcached – distributed memory caching system
- Redis – distributed memory caching system with persistence and value types
- Data lakes for large volumes of data
- AWS S3 object storage
- Blob storage
- File system
- Google File System (GFS) – distributed file system
- Hadoop File System (HDFS) – open-source distributed file system
or work on real-time data streams:
- Batch processing (Hadoop MapReduce)
- Stream processing (Kafka + Spark)
- Kafka – pub/sub message queue
- Cloud-native stream processing (Kinesis)
Each component has its own pros and cons. The choice depends on the requirements and the scope of system behaviour, such as:
- user/customer usage and expectations,
- scale (reads and writes),
- performance,
- cost.
| Users/customers | Scale (read/write) | Performance | Cost |
| --- | --- | --- | --- |
| Who uses the system? How will the system be used? | Reads/writes per second? Size of data per request? CPS (calls or clicks per second)? | Write-to-read delay? p99 latency for read queries? | Should the design minimize the cost of development? Should the design minimize the cost of maintenance? |
| spikes in traffic | eventual consistency (prefer quick, stale data) as compared to no data at all | | |
| redundancy for failure management | | | |
Some fundamental constraints while designing a distributed data store:
p99 latency: 99% of the requests should be faster than the given latency. In other words, only 1% of the requests are allowed to be slower.
Request latency: min: 0.1, max: 7.2, median: 0.2, p95: 0.5, p99: 1.3
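For illustration, percentiles like the ones above can be computed from raw latency samples with a nearest-rank calculation; the sample values below are made up:

```python
import statistics

def percentile(samples, p):
    """p-th percentile of request latencies (nearest-rank method)."""
    ordered = sorted(samples)
    index = max(0, int(round(p / 100 * len(ordered))) - 1)
    return ordered[index]

latencies = [0.1, 0.2, 0.2, 0.3, 0.5, 0.5, 0.9, 1.3, 2.0, 7.2]   # seconds, made-up sample
# prints the median, p95 and p99 of this sample
print(statistics.median(latencies), percentile(latencies, 95), percentile(latencies, 99))
```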
Individual Events vs Aggregate Data
| Individual events (e.g. every click or every call metric) | Aggregate data (clicks per minute, outgoing calls per minute) |
| --- | --- |
| (+) fast writes (+) can customize/recalculate data from the raw events | (+) faster reads (+) data is ready for decision making / statistics |
| (-) slow reads (-) costlier for large-scale implementations (many events) | (-) can only query the data as it was aggregated (no raw data) (-) requires a data aggregation pipeline (-) hard to fix errors |
| suitable for real-time / on-the-fly data with a low expected delay (minutes) | suitable for batch processing in the background where a delay of minutes to hours is acceptable |
Push vs Pull Architecture
Push: a processing server manages the state of variables in memory and pushes them to the data store.
- (-) a crashed processing server means all data is lost
Pull: a temporary data structure such as a queue manages the stream of data, and the processing service pulls from it before pushing results to the data store (see the pull-worker sketch below).
- (+) a crash has no effect on the data held temporarily in the queue, and a new server can simply take over where the previous processing server left off
- (+) can use checkpointing
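A toy pull-side worker with checkpointing, using an in-process queue.Queue as a stand-in for a real message queue; the offset/checkpoint shape is an assumption for the example:

```python
import queue

def pull_worker(events: queue.Queue, checkpoint: dict, process):
    """Pull-based consumer: data sits safely in the queue; a restarted worker resumes after the last checkpoint."""
    while True:
        try:
            offset, event = events.get(timeout=1.0)
        except queue.Empty:
            break
        process(event)
        checkpoint["last_offset"] = offset     # a persisted checkpoint lets a new worker take over

# usage sketch
q = queue.Queue()
for i, e in enumerate(["call_started", "call_answered", "call_ended"]):
    q.put((i, e))
state = {}
pull_worker(q, state, process=print)
print(state)                                   # {'last_offset': 2}
```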
Popular DB storage technologies
| SQL | NoSQL |
| --- | --- |
| structured data with a strict schema; relational data with joins | semi-structured data; dynamic or flexible schema |
| (+) faster lookup by index | (+) suited to data-intensive workloads (+) high throughput for IOPS (input/output operations per second) |
| used for account information, transactions | best suited for rapid ingest of clickstream and log data, leaderboard or scoring data, metadata/lookup tables |
| | DynamoDB – document-oriented database from Amazon; MongoDB – document-oriented database |
A NoSQL database can be of type:
- Wide column
- Document
- Key-value
- Graph
Cassandra is a wide-column store and supports asynchronous masterless replication.
HBase is also a wide-column store and has master-based replication.
MongoDB is a document-oriented DB that uses leader-based replication.
SQL scaling patterns include:
- Federation / federated database system: transparently maps multiple autonomous database systems into a single virtual/federated database.
- (-) slow, since it accesses multiple data stores to get a value
- Sharding / horizontal partitioning
- Denormalization: even though normalization is more memory efficient, denormalization can enhance read performance by adding redundant precomputed data to the DB or grouping related data.
- Normalizing data reduces data-warehouse disk space by reducing data duplication and dimension cardinality. In its full definition, normalization is the process of discarding repeating groups, minimizing redundancy, eliminating composite keys for partial dependency and separating non-key attributes.
- SQL tuning: the "iterative process of improving SQL statement performance to meet specific, measurable, and achievable goals"
InfluxDB: to store time-series data
AWS Redshift
Apache Hadoop
Redis
Embedded data store: RocksDB
Message Queues (Buffering) vs Batch Processing
Distributed event management — monitoring and working on incoming real-time data instead of a stored database — is the preferred way to churn out real-time analysis and updates. The multiple ways to handle incoming data are:
- Batch processing – has a lag before producing results; not time critical
- Data streams – real-time responses
- Message queues – ensure timely sequencing and ordering
| Buffering | Batching |
| --- | --- |
| add events to a buffer that can be read | add events to a batch and send when the batch is full |
| (+) can handle each event | (+) cost effective (+) ensures throughput (-) if some events in a batch fail, should the whole batch fail? (-) not suited for real-time processing |
| | S3-like object storage + Hadoop MapReduce for processing |
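A minimal batching sketch matching the right-hand column: events accumulate and are flushed only when the batch is full. The flush callback (e.g. a write to object storage) is left to the caller:

```python
class Batcher:
    """Accumulate events and flush them when the batch is full."""
    def __init__(self, flush, batch_size=100):
        self.flush = flush                 # e.g. write the batch to S3 / a data lake
        self.batch_size = batch_size
        self.batch = []

    def add(self, event):
        self.batch.append(event)
        if len(self.batch) >= self.batch_size:
            self.flush(self.batch)         # whole batch succeeds or fails together
            self.batch = []

# usage: b = Batcher(flush=print, batch_size=3); [b.add(e) for e in range(7)]
```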
Timeout
- Connection timeout: use latency percentiles to calculate this
- Request timeout
Retries
- Exponential backoff: increase the waiting time on each retry
- Jitter: adds randomness to retry intervals to spread out the load (sketched below)
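A small retry helper combining both ideas (exponential backoff capped at a maximum, plus full jitter); the default attempt counts and delays are arbitrary:

```python
import random
import time

def retry_with_backoff(call, max_attempts=5, base_delay=0.2, max_delay=5.0):
    """Exponential backoff with full jitter: spread retries out so clients don't stampede together."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise                                    # give up after the last attempt
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(random.uniform(0, delay))         # jitter: random wait in [0, delay]
```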
Grouping events into object storage and Message Brokers

This is slower than stream processing but faster than batch processing.
Distributed Event management and Event Driven architecture using streams
In event-driven architecture a producer component performs an action which creates an event that a consumer/listener subscribes to and consumes.
- (+) time sensitive
- (+) asynchronous
- (+) decoupled
- (+) easy scaling and elasticity
- (+) heterogeneous
- (+) continuous

Expanding the stream pipeline

Event Streams decouple the source and sink applications. The event source and event sinks (such as webhooks) can asynchronously communicate with each other through events.
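A toy in-process illustration of that decoupling: the publisher of a call-ended event does not know which sinks consume it. In Kafka or Kinesis the delivery would of course be asynchronous and durable; the topic name and payload here are invented:

```python
from collections import defaultdict

class EventStream:
    """Toy in-process stream: sources publish without knowing which sinks (e.g. webhooks) consume."""
    def __init__(self):
        self.subscribers = defaultdict(list)      # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, event):
        for callback in self.subscribers[topic]:
            callback(event)                        # a real broker would deliver this asynchronously

stream = EventStream()
stream.subscribe("call.ended", lambda e: print("billing:", e))
stream.subscribe("call.ended", lambda e: print("analytics:", e))
stream.publish("call.ended", {"call_id": "abc", "duration": 42})
```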
Options for stream processing architectures
- Apache Kafka
- Apache Spark
- Amazon Kinesis
- Google Cloud Data Flow
- Spring Cloud Data Flow
Here is a post from earlier which discusses – Scalable and Flexible SIP platform building, Multi geography Scaled via Universal Router, Cluster SIP telephony Server for High Availability, Failure Recovery, Multi-tier cluster architecture, Role Abstraction / Micro-Service based architecture, Load Balancer / Message Dispatcher, Back end Dynamic Routing and REST API services, Containerization and Auto Deployment, Auto scaling Cloud Servers using containerized images.
Lambda Architecture
Lambda architecture layers stream processing on top of batch processing (MapReduce) plus a stream-processing engine. In a lambda architecture we can send events to the batch system and the stream-processing system in parallel, and the results are stitched together at query time.

Apache Kafka is used as the source; it is a framework implementation of a software bus using stream processing – a "high-throughput, low-latency platform for handling real-time data feeds".
Apache Spark: data partitioning and in-memory aggregation.
Distributed cache for call control Servers
| Dedicated cache cluster | Co-located cache |
| --- | --- |
| isolates the cache from the service; cache and service do not share memory and CPU; can scale independently; can be used by many microservices; flexibility in choosing hardware | doesn't require separate hardware; low operational and hardware cost; scales together with the service |

Choosing a cache host
- Mod function
- (-) behaves differently when a new client is added or one is removed; unsuitable for production
- Consistent hashing (Chord)
- maps each value to a point on a circle (sketched below)
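A compact consistent-hash ring sketch with virtual nodes, purely illustrative (a production setup would rely on the cache client library's own hashing); host names are made up:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Map cache keys and hosts onto the same circle; a key goes to the next host clockwise."""
    def __init__(self, hosts, vnodes=100):
        self.ring = []                              # sorted list of (point, host)
        for host in hosts:
            for i in range(vnodes):                 # virtual nodes smooth out the distribution
                self.ring.append((self._hash(f"{host}#{i}"), host))
        self.ring.sort()

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def get_host(self, key):
        point = self._hash(key)
        index = bisect.bisect(self.ring, (point,)) % len(self.ring)
        return self.ring[index][1]

ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
print(ring.get_host("call:abc123"))                 # adding/removing a host only remaps ~1/N of keys
```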

Cache Replacement
Least Recently Used Cache Replacement
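A minimal LRU cache sketch using OrderedDict, as an illustration of the eviction policy named above; the capacity value is arbitrary:

```python
from collections import OrderedDict

class LRUCache:
    """Evict the least recently used entry once capacity is reached."""
    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)          # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # drop the least recently used entry
```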

Consistency and High Availability in the Cache Setup
Read replicas live in a different data centre for disaster recovery.
Strong consistency using master–slave replication.

Circuits – fail fast, wait for circuit to recover before using again
Design patterns for a circuit-based setup gracefully handle exceptions using fallbacks.
Circuit breaker: stops the client from repeatedly trying to execute a failing call by calculating an error threshold.

Use isolated thread pools per circuit and ensure full recovery before calling the service again.
(+) A circuit-breaker event lets the entire circuit repair itself before attempting operations again.
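A bare-bones circuit-breaker sketch showing fail-fast with a fallback and a recovery window; the thresholds and timings are assumptions, and a real deployment would typically use library tooling (Hystrix-style) rather than hand-rolled code:

```python
import time

class CircuitBreaker:
    """Fail fast after repeated errors; retry only once the recovery window has passed."""
    def __init__(self, error_threshold=5, recovery_time=30.0):
        self.error_threshold = error_threshold
        self.recovery_time = recovery_time
        self.failures = 0
        self.opened_at = None

    def call(self, func, fallback):
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.recovery_time:
                return fallback()                   # circuit open: fail fast, use the fallback
            self.opened_at = None                   # half-open: allow a trial call
            self.failures = 0
        try:
            result = func()
            self.failures = 0
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.error_threshold:
                self.opened_at = time.time()        # trip the circuit
            return fallback()
```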
References:
- [1] Scaling Pinterest from 0 to 10s of billions of page views – http://highscalability.com/blog/2013/4/15/scaling-pinterest-from-0-to-10s-of-billions-of-page-views-a.html
- [2] Scaling databases with Django and HAProxy – http://engineering.hackerearth.com/2013/10/07/scaling-database-with-django-and-haproxy/
- [3] Mastering Chaos – A Netflix Guide to Microservices, QCon London
- [4] Event stream processing essentials – https://dzone.com/refcardz/event-stream-processing-essentials
- [5] p99 latency – https://stackoverflow.com/questions/12808934/what-is-p99-latency
- [6] Oracle docs on partitioning a stream – https://docs.oracle.com/en-us/iaas/Content/Streaming/Concepts/partitioningastream.htm
- [7] Mastering Chaos – A Netflix Guide to Microservices – https://www.youtube.com/watch?v=CZ3wIuvmHeM&ab_channel=InfoQ
- [8] Netflix cloud architecture – https://netflixtechblog.com/tagged/cloud-architecture