High availiability and Scalibility in VoIP platforms


Load Balancers

Load Balancer(LB) is the initial point of interaction between the client application and the core system. It is pivotal in the distribution of the load across multiple servers and ensuring the client is connected to the nearest VoIP/SIP application server to minimize latency. However, the load balancers are also susceptible to security breaches and DOS attacks as they have a public-facing interface. This section lists the protocols, types and algorithms used popularly in Load balancers of VoIP systems.

software LBLayer 4 / hardware LB
Nginx
Amazon ELB ( eleastic load balanecr)
F5 BIG-IP load balancer
CISCO system catalyst
Barracuda load balancer
NetScaler
used by applications in cloud
ADN (Application delivery network)
used by  network address translators (NATs) 
DNS load balancing
examples and roles of software and hardware based load balancers

Load Balancers(LB) ping each server for health status and greylists servers that are unhealthy( respond late) as they may be overloaded or experiencing congestion. The LB monitors it rechecks after a while and if a server is healthy ( ie if a server responds with responds with status update) it can resume sending traffic to it. LB should also be distributed to different data centres in primary-secondary setup for HA.

Networking protocol

TCP LoadbalancerHTTP load balancersSIP based LB as Kamailio/ Opensips
can forwrad the packet without inspecting the content of the packet.terminate the connection and look inside the request to make a load balcing decsiion for exmaple by using a cookie or a header.domain specific to VoIP
(+) fast, can handle million of req per second(+) handle SIP routing based on SIP headers and prevent flooding atacks and other malicious malformed packets from reaching application server

Load balancing algorithms

  • Weighted Scheduling Algorithm
  • Round Robin Algorithm
  • Least Connection First Scheduling
  • Lest response time algorithms
  • Hash based algorithm ( send req based on hashed value such as suing IP address of request URL)
LoadbalancerReverse Proxy
forward proxy server allows multiple clients to route traffic to an external serveraccepts clients requestd for server and also returns the server’s response to the client ie routes traffic on behalf of multiple servers.
Balances load and incoming traffic endpointpublic facing endpoint for outgoing traffic
 additional level of abstraction and security, compression
used in SBC (session border controllers) and gateways

Service Discovery

Client-side or even backend service discovery uses a broadcasting or heartbeat mechanism to keep track of active servers and deactivates unresponsive or failed servers. This process of maintaining active servers helps in faster connection time. Some approaches to Service Discovery

  1. Mesh
    1. (-) exponentially incresing network traffic
  2. Gossip
  3. Distributed cache
  4. Coordination service with Service
    • (-) requesres coordination service for leader selection
    • (-) needs consensus
    • (-) RAFT and pbFT for mnaging failures
  5. Random leader selection
    • (+) quicker
    • (-) may not gurantee one leader
    • (-)split brain problem

Keepalive, unregistering unhealthy nodes

Systems such as Consul, Etcd, and Zookeeper can help services find each other by keeping track of registered names, addresses, and ports. Health checks help verify service integrity and are often done using an HTTP endpoint.

Replication

Usuallay there is a tradeoff between liveness and safety.

  1. Single leader replication
    • (-) vulnerable to loss of data is leader goes down before replication completes
    • used to in sql
  2. multileader replication and
  3. leaderless replication
    • (-) increases latencies
    • (-) quorem based on majority , cannot function is majority node are not down
    • used in cassandra

Data Store Replication

For Relatonal Dataabase

For NoSQL databse replication and HA

Quick Response / Low latency

Message format

Textual Message formatBinary Message Format
human readbale like
json xml
diff to comprehend , need shared schema between sender and receiver to serilaize and deserialze ,
names for every field adds to size no field name or only tags , reduces message size

Gateways for faster routing and caching to services

gateways are single entry point to route user requests to backend services .

Separate hot storage from cold storage

hot storage is frquently accessed data which must be near to server

cold storage is less frequently accessed data such as archives

  • object storage
  • slow access

Scalability

To make a system :-

  • scalable : use partitioing
  • reliable : use replication and checkpointing to not loose data in failures
  • fast : use in -memory usage

According to CAP theorem Consistency and Availiability are difficult to achieve together and there has to be a tradeof acc to requirnments.

Partitioining

Partition strategy can be based on various ways such as :-

  • Name based partition
  • geographic partition
  • names’ hashed value based on identifier
    • (-) can lead to hot partitions ( high density in areas of freq accessible identioers )
    • (-) high density spots for example all messages with a null key to go to the same partition
    • (-) doesnt scale
  • event time based hash
    • (+) data is spread evenly over time

To create a well distributed partition we could spread hot partition into 2 partitions or dedicate partitions for freq accessible items. An effective partitioning keys uses

  • Cardinality : total num of unique keys for a usecase. High cardinality leads to better distribution.
    • high cardinatility keys : names , email address , url since they have high variatioln
    • low cardinatlity keys : boolean flags such as gender M/F
  • Selectivity : number of message with each key. High selectivity leads to hotspots and hence low selectivity is better for even distribution.

Autoscalling

Scale Out not Up !

Kubernetes is an open-source container-orchestration system for automating computer application deployment, scaling, and management used in DevOps. I have mentioned this in detail on the article on VoIP and DevOps below.

Multiple PoPs (point of presence)

for a VOIP system catering to many clients accross the globe or accessing multiple carriers meant for different counteries based on Prefix matching , there should be alocal PoP in most used regions . typically these regions include – US east – west coasts, UK – germany of London , Asia Pacific – Mumbai ,Hong Kong and Australia.

Minimal Latency and lowest amount of tarffic via public internet

Creating multiple POPs and enabling private traffic via VPN in between them ensures that we use the backbone of our cloud provider such as AWS or datacentre instead of traversing via public internet which is slower and more insecure. Hopping on a private interface between the cloud server and maintaining a private connection and keepalive between them helps optimize the traffic flow while keeping the RTT and latency low.

HA ( High-availability )

Some factors affecting Dependability are

  • Eventual Consistency
  • MultiRegion failover
  • Disaster Recovery

A high-availability (HA) architecture implies Dependability.Usually via existence of redundant applications servers for backups: a primary and a standby. These applications are configured so that if primary fails, the other can take over its operations without significant loss of data or impact to business operations.

Downtime / SLA of 5 9’s in aggregate failures

4 9’s of availiability on each service components gives a downtime of 53 mins per service each year. However in aggregate failure this could amlount to (99.99)10 = 99.9 downtime which is 8-10 hours each year.

Thus, aggregate failure should be taken into consideration while designing reliable systems.

HA for Proxy / Load balancer (LB)

A LB is the first point of contact for outbound calls and usually does not save the dialogue information into memory or database but still contain the transaction information in memory. In case the LB crashes and has to restart, it should

  • have a quick uptime
  • be able to handle in dialogue requests
  • handle new incoming dialogue requests in a stateless manner
  • verify auth/authorization details from requests even after restart

HA for Call Control app server

App server is where all the business logic for call flow management resides and it maintains the dialog information in memory.

Issues with in-memory call states : If the VM or server hosting the call control app server is down or disconnected, then live calls are affected, this, in turn, causes revenue loss. Primarily since the state variable holding the call duration would be able to pass onto the CDR/ billing service upon the termination of the call. For long-distance, multi telco endpoint calls running hours this could be a significant loss.

  • Standby app server configuration and shared memory : If the primary app server crashes the standby app server should be ready to take its place and reads the dialog states from the shared memory.
  • Live load balanced secondary app server + external cache for state varaibles : External cache for state variables: a cluster of master-slave caches like Redis is a good way of maintaining the dialogue state and reading from it once the app server recovers from a failed state or when a secondary server figures it has a missing variable in local memory.

Media Server HA

Assuming the kamailio-RTPengine duo as App server and Media Server. These components can reside in same or different VMs. Incase of media server crash, during the process of restoring restarted RTpengine or assigning a secondary backup RTpengine , it should load the state of all live calls without dropping any and causing loss of revenue . This is achived by

  • external cache such as Redis ,
  • quick switchover from primary to secondary/fallback media server and
  • floating IPs for media servers that ensures call continuity inspite of failure on active media server.

Architecturally it looks the same as fig above on HA for the SIP app server.

Security against malicious attacks

Attacks and security compromisation pose a very signficant threat to a VoIP platform.

MITM attacks

Man in midddle attacks can be counetred by

  • End to end encryption of media using SRTP and signals using TLS
  • Strong SIP auth mechanism using challenges and creds where password is composed of mixed alphanumeric charecters and atleast 12 digits long
  • Authorization / whitelisting based on IP which adheres to CIDR notation

DDOS attacks

DDOS renders a particular network element unavailable, usually by directing an excessive amount of network traffic at its interfaces.

dDOS – multiple network hosts to flood a target host with a large amount of network traffic. Can be created by sending falsified sip requests to other parties such that numerous transactions originating in the backwards direction comes to the target server created congestion.

Can be counetred by

  • detect flooding and q in traffic and use Fail2ban to block
  • challenge questionable requests with only a single 401 (Unauthorized) or 407 (Proxy Authentication Required)

Read about SIP security practices in deatils https://telecom.altanai.com/2020/04/12/sip-security/

Other important factors leading to security

  • Keystores and certificate expiry tracker
  • priveligges and roles
  • Test cases and code coverage
  • Reviewers approval before code merge
  • Window for QA setup and testing , to give go ahead before deployment

Identifying outages and Alerting

Raise Event notification alerts to designated developers for any anolous behavior. It could be call based or SMS basef alert based on the sevirity of the situtaion .

Logging and Alerting for a VoIP CPaaS platform .
Raise Event notification alerts to designated developers for any anolous behavior. It could be call based or SMS basef alert based on the sevirity of the situtaion.

Sources for alert manager

  • Build failed ( code crashes, Jenkins error)
  • Deployment failed ( from Kubernetes , codechef, docker ..)
  • configuration errors ( setting VPN etc )
  • Server logs
  • Server health
  • homer alerts ( SIP calls responses 4xx,5xx,6xx)
  • PCAP alerts ( Malformed SIP SDP ..)
  • Internal Smoke test ( auto testing procedure done routinely to check live systems )
  • Support tickets from customer complaints ( treat these as high priority since they are directly impacting customers)

Bottlenecks

The test bed and QA framework play a very crticial role in final product’s credibility and quality.

Performance Testing

  • Stress Testing : take to breaking
  • Load Testing : 2x to 3x testing
  • Soak Testing : typical network load to long time ( identify leaks )

Robust QA framework( stress and monkey testing) to identify potential bottlenecks before going live

A QA framework basically validates the services and callflows on staging envrionment before pushing changes to production. Any architectural changes should especially be validated throughly on staginng QA framework befire making the cut. The qualities of an efficient QA platform are :

Genric nature – QA framework should be adatable to different envrionments such as dev , staging , prod

Containerized – it should be easy to spn the QA env to do large scale or small scale testing and hence it should be dockerized

CICD Integration and Automation – integrate the testcases tightly with gt post push and pull request creation . Minimal Latency and lowest amount of tarffic via public internet

Keep as less external dependecies as possible for exmaple a telecom carrier can be simulated by using an PBX like freeswitch or asterix

Asynchronous Run – Test cases should be able to run asynchronously. Such as seprate sipp xml script for reach usecase

Sample Testcases for VoIP

  • Authentication before establish a session
  • Balance and account check before establishing a session like whitelisting , blacklisting , restricted permission in a particular geography
  • Transport security and adaptibility checks , TLS , UDP , TCP
  • codec support validation
  • DTMF and detection
  • Cross checking CDR values with actual call initiator and terminator party
  • cross checking call uuid and stats
  • Validating for media and related timeouts

QA frameworks tools – Robot framework

traffic monitor – VOIP monitor

customer simulator – sipP scripts

network traffic analyser – wireshark

pcap collevcter – tcpdump , sngrep

Distributed Data Store

A Distributed Database Design could have many components. It could work on static datastore like

  • SQL DB where schema is important
    • MySQL
    • postgress
    • Spanner – Globally-distributed database from Google
  • NoSQL DB for to store records in json
    • Cassandra – Distributed column-oriented database
  • Cache for low latency retrivals
    • Memcached – Distributed memory caching system
    • Redis – Distributed memory caching system with persistence and value types
  • Data lakes for heavy sized data
    • AWS s3 object storage
    • blob storage
  • File System
    • Google File System (GFS) – Distributed file system
    • Hadoop File System (HDFS) – Open source Distributed file system

or work on realtime data streams

  • Batch processing ( Hadoop Mapreduce)
  • Stream processing ( Kafka + spark)
    • Kafka – Pub/sub message queue
  • Cloud native stream processing ( kinesis)

Each component has its own pros and cons. The choice depends on requirnments and scope for system behaviour like

  • users/customer usuage and expectation ,
  • Scale ( read and write )
  • Performnace
  • Cost
Users/customersScale ( read / write)PerformanceCost
Who uses the system ?
How the system will be used?
Read / writes per second ?
Size of data per request ?
cps ( calls or click per second) ?
write to read delay ?
p99 latency for read querries ?
should design minimize the cost of development ?

should design mikn ize the cost of mantainance ?
spikes in traffic eventual consistency ( prefer quick stale data ) as compared to no data at all
redundancy for failure management

Some fundamental constrains while design distributed data structure :-

p99 latency : 99% of the requests should be faster than given latency. In other words only 1% of the requests are allowed to be slower.

Request latency:
    min: 0.1
    max: 7.2
    median: 0.2
    p95: 0.5
    p99: 1.3

Inidiviual Events vs Aggregate Data

Inidividual Events ( like every click or every call metric)Aggregate Data ( clicks per minute, outgoing calls per minute)
(+) fast write
(+) can customize/ recalculate data from raw
(+) faster reads
(+) data is fready for decision making / statistics
(-) slow reads
(-) costlier for large scale implementations ( many events )
(-) can only query in the data as was aggregates ( no raw )
(-) requires data aggregation pipeline
(-) hard to fix errors
suitable for realtime / data on fly
low expected data delay ( minutes )
suitable for batch processing in background where delay is acceptable from mintes to hours

Push vs Pull Architecture

Push : A processing server manages state of varaible in memory and pushes them to data store.

  • (-) crashed processingserver means all data is lost

Pull : A temporary data strcyture such as a queue manages the stream of data and processing service pull from it to process before pusging to data stoore.

  • (+) a crashed server has to effect on temporarily queue held data and new server can simply take on where previous processing server left.
  • (+) can use checkpointing
SQLNoSQL
Structured and Strict schema
Relational data with joins
Semi-structured data
Dynamic or flexible schema
(+) faster lookup by index(-) data intensive workload
(+) high throughput for IOPS (Input/output operations per second )
used for
Account information
transactions
best suitable for
Rapid ingest of clickstream and log data
Leaderboard or scoring data
Metadata/lookup tables
DynamoDB – Document-oriented database from Amazon
MongoDB – Document-oriented database

A NoSQL databse can be of type

  • Quorem
  • Document
  • Key value
  • Graph

Cassandra is wide column supports asyn master less replication

Hinge base also a quorem based db also has master based preplication

MongoDB documente orientd DB used leacder based replication

SQL scaling patterns include:

  • Federation/ federated database system : transparently maps multiple autonomous database systems into a single virtual/federated database.
    • (-) slow since it access multiple data storages to get the value
  • Sharding / horizontal partition
  • Denormalization : Even though normalization is more memory efficient denormalization can enhance read performance by additing redundant pre computed data in db or grouping related data.
    • Normalizing data reduces data warehouse disk space by reducing data duplication and dimension cardinality. In its full definition, normalization is the process of discarding repeating groups, minimizing redundancy, eliminating composite keys for partial dependency and separating non-key attributes.
  • SQL Tuning : “iterative process of improving SQL statement performance to meet specific, measurable, and achievable goals”

Influx DB : to store time series data

AWS Redshift

Apache Hadoop

Redis

Embeed Data : RocksDB

Message Queues(Buffering) vs Batch Processing

Distributed event management, monitoring and working on incoming realtime data instead of stored Database is the preferred way to churn realtime analysis and updates. The multiple ways to handle incoming data are

  1. Batch processing – has lags to produce results, not time crtical
  2. Data stream – realtime response
  3. Message Queues – ensures timely sequence and order
BufferingBatching
Add events to buffer that can be read Add events to batch and send when batch is full
(+) can handle each event(+) cost effective
(+) ensures throughput
(-) if some events in batch fail should whole batch fail ?
(-) not suited for real time processing
S3 like objects storage + Hadoop Mapreduce for processing

Timeout

  • Connection timeout : use latency percentiles to calculate this
  • Request timeout

Retries

  • exponential backoff : increase waiting time each try
  • jitter : adds rabdomness to retry intervals to spread out the load.

Grouping events into object storage and Message Brokers

slower than stream processing but faster than batch processing.

Distributed Event management and Event Driven architecture using streams

In event driven archietcture a produce components performs and action which creates an event thata consumer/listener would subscribes to consume.

  • (+) time sensitive
  • (+)Asynch
  • (+) Decoupled
  • (+) Easy scaling and Elasticity
  • (+) Heterogeneous
  • (+) contginious

Expanding the stream pipeline

Event Streams decouple the source and sink applications. The event source and event sinks (such as webhooks) can asynchronously communicate with each other through events.

Options for stream processing architectures

  • Apache Kafka
  • Apache Spark
  • Amazon kinesis
  • Google Cloud Data Flow
  • Spring Cloud Data Flow

Here is a post from earlier which discusses – Scalable and Flexible SIP platform building, Multi geography Scaled via Universal Router, Cluster SIP telephony Server for High Availability, Failure Recovery, Multi-tier cluster architecture, Role Abstraction / Micro-Service based architecture, Load Balancer / Message Dispatcher, Back end Dynamic Routing and REST API services, Containerization and Auto Deployment, Auto scaling Cloud Servers using containerized images.

Lambda Architecture

Stream processing on top of map reduce and stream processing engine. In lambda architecture we can send events to batch system and stream processing system in parallel. The results are stiched together at query time.

Lambda Archietcture : stream processing on top of map reduce and stream processing engine. Send events to batch system and stream processing system in parallel. The results are stiched together at query time.

Apache Kafka is used as source which is a framework implementation of a software bus using stream-processing. “.. high-throughput, low-latency platform for handling real-time data feeds”.

Apache Spark : Data partitioning and in memory aggregation.

Distributed cache for call control Servers

Dedicated Cache ClusterCo located cache
Isolates cache fro service
Cache and service do not share memory and CPU
can scale independently
can be used by many microservices
flexibility in choosing hardware
doesnt require seprate hardware
low operational and hardware cost
scales together with the service

Choosing cache host

  • Mod function
    • (-) behaves differently when a new client is added or one is removed , unsuitable for prod
  • Consistent hashing ( chord)
    • maps each value to a point on circle

Cache Replacement

Least Recently Used Cache Replacement

Consistency and High Availiability in Cache setup

ReadReplicas live in differenet data centre for disaster recovery.

Strong consistency using Master Slave

Circuits – fail fast, wait for circuit to recover before using again

Design patterns for a circuit base setup to gracefully handle exceptions using fallback.

Circuit breaker : stops client from repeatedly trying to exceute by calculate the error threshold.

Isolated thread pool in circuits and ensure full recovery before calling the service again.

(+) Circuit breaker event causes the entire circuit to repair itself before attempting operations.

References :

Hosted IP-PBX and SBC


SBC ( Session Borde Controllers ) are basically gateways that provide interconnectivity between the hosted IP-PBX of the enterprise to the outside world endpoints such as telco service provider, PSTN/ TDM , SIP trunking providers or even third party OTT provider apps like skype for business etc.

If you have a hosted IPPBX or PBX in your data-centre or on premise and you need controlled but heavy outflowing traffic, it is a good idea to integrate a resilient and efficient SBC to provide seamless interconnectivity.

Hosted PBX

For an enterprises such as an Trading floor or warehouse with multiple phone types , softphones , hardphones , turrets etc distributed across various geographies and zones a device agnostic architectural setup is prime . Listing the essentials for setting up such a system. Note supplementary services are data-services , logging , licensing etc are important but kept out of scope to keep focus on functional aspects .

An enterprise application usually is structured in tiers or layers

  • Client tier – the networks clients communication to the central java programs . Runs on client machines
  • web tier – state full communication between client and business tier . Runs in server machine.
  • business tier- handles the logic of the application. The business tier uses the Enterprise Java Bean (EJB) container, which manages the execution of the beans
  • data tier – encompasses DB drivers . Runs on separate machines for database storage

Event services for Line status notifications

providers lines status notification across enterprise for inter zone and softphone to hardphone .

Routing services

routing calls within enterprise and hardphone sites read more about resource zones later in the article

Call Control Manager (CCM)

Consolidated set of all service and component that make up the VOIP platform besides media handlers. It includes SIP adapters, bridge managers, call processing frameworks, API frameworks, healthchecks etc.

Call processing framework ( CPF)

Signalling and call routing logic, mostly in SIP and trunks. Manages identities such as Call Line information, Called Party Information, line status etc in shared memory.

Multiple shared Lines and their statuses

Incases where there is a need to process multiple calls from a single User agent device such as a softphone or hardphone ( common scenario for a turret phone) , the design involves assigning it multiple sip uris and each sip uri will establish a line.

When caller calls callee , the line is said to be BUSY , otherwise said to be IDLE. Transition of a shared sip line from IDLE to BUSY is transmitted to others via SIP PUBLISH as other UAs holding the same sip

Similarly any other event like transfer is propagated to other via SIP UPDATE

Clustering Call control managers

A Call Communication manager (CCM) from various zones should be able to cowork on call and session management and advanced features such as routing from home guest zone to home zone , call transfer , refer , barge etc. Designing a clustered setup will also provide elasticity , fail-over and high availability. Can use clustered , HA compliant framework such as Oracle Communication Application Server , suited for enterprise level deployments.

Call Replication and distributed memory management

A node will store two types of data: active sessions and passive sessions. The active sessions are used by the node and stored in cache. The passive sessions are the replicas from the other nodes’ active sessions. The passives sessions are stored on a persistent storage.

Controlling Line Calls using AOR and Resource Zones

When dealing with many SIP endpoints , now referred to as resource, it is best to assign the resources to their respective zones. Thus a resource’s status updates will be only updated by its active resource zone while can be read by any resource zone.

Incoming request Zone vs Active Resource Zone

For an Incoming request such a INVITE , check whether the zone sending the request is its active resource zone or not .If the Active Resource Zone is the same zone on which the INVITE came in, then the call is handled by that zone. If the Active Resource Zone is a different zone, then the call needs to be forwarded to the Active Resource Zone.

Bridges for Local Media connections

Although call signalling is handled by a resources active resource zone only, we can still create media bridges in local zone of the resource .

Local MM bridges are used to auto answer an incoming sip line call and create trunk , especially from hardphones which do not support provisional responses.

Interzone proxy Handler

proxies call control messages between active and non active resource zones. Primarily mapping the sip messages with all custom headers inbetween the communication device interfaces.

Dial Trunk using multiple dedicated SIP lines and connect via Media Bridge

To save up on call routing /connection time and to support te ability to add as many users on call at runtime , a dedicated media bridge is established for every call.

  • A sip line activated is auto-answered by MM , creates a trunk and waits for other endpoint to join the bridge. The flow is as follows :
  • As INVITE arrives for an IDLE sip line , it is connected to a trunk and auto answered by a local MM bridge .
  • Since the call is already answered , when caller dials number for callee , collect the DTMF digits over RTP using RFC 2833 DTMF events.
  • Run inter-digit timer for digit collection and detect end of dialing on timeout.
  • The dialed trunk connection is made and call is added to media bridge
  • When provisional responses are received on the trunk connection, generate in-band call progress tones (ringing, proceeding etc) via the MM
  • When the line answers, the progress tones have to be stopped and the called party gets bridged to the calling party via the media bridge.

Call Diversion involves forwarding calls from zone to another zone. joinjed parties get call UPDATE status and forward response .

Call barge is the processing of joining an ongoing call . The barge event is usually propagated to joined parities via SIP INFO. Private lines do not allow barge in and are exclusively reserved for only few users.

Interconnectivity provided by an SBC ( Session Border Controller)

Hold-Resume and Music on Hold in multi-line evironment

While a regular p2p call involves simple reinvite based hold and resume with varrying SDP, the scenario is slightly more detailed for hold resume on bridged trunk connection , as explained below.

As the calls made are on bridge , a hold signal involves a RE-INIVITE with held-SDP to media manager (MM). If hold status on trunk is 200 OK the hold status will be sent to other call interfaces connected on the trunk. Else if hold is denied, 403 is sent back to hold-initiates.

Music on hold is an one way RTP mostly from media server.

For a bridged scenarios , separate Music on hold bridges are kept on Media Managers. When an UA has to hold , it is removed from original bridge and place on music on hold bridge. To be unhold/ resume it is placed back into the orignal bridge from music on hold bridge.

Conference

user initiates conference, the conference feature can execute on the zone where the user was logged on, irrespective of zones where the other conference attendees join from . The Call processing framework of originators zone completes the SDP exchange to establish two-way speech path among all the parties.

Incases there are multiple connections from a zone , a local MM conference bridge can be created for them which would connect back to originators MM conf bridge . this two part conf bridge will be transparent to the sip line sand users .

For provisioning inputs and settings setup a Diagnostics , Administration and Configuration platform which can process APIs for data services , licences , alarms or do remote device control such as using SNMP.

Session Border Controllers (SBC) role for PBX

At network level SBC operations include

  • bridging multiple interfaces in different networks even between the IPv4 and IPv6 networks
  • auto NAT discovery and STUN
  • protocol conversion such as TLS to UDP etc
  • Flood detection and IP filtering

For SIP specific functionalities, SBC does

  • SIP validation involving checks on syntax and message contents also consistency checks are performed.
  • stateful and call aware. tracing, monitoring and checking for validitya and health of all the SIP messages
  • Topology hiding
  • Traffic filtering
  • Codec filtering , reordering , media pinning, transcoding, or call recording
  • Data replication brings High Availability (HA) with hot backups or even Active-Active solutions.

Traffic sharing and routing roles of SBC can include

  • IP-based and Digest-based authentication
  • limiting traffic by number of concurrent calls or calling rate.
  • Dialplan and/or Custom routing
  • Dispatching/Load-balancing to a backend cluster of servers

SBC’s can be physical hardware boxes or software based applications, as the name suggests their purpose is to control the session at border between the enterprise and external service provider.

SIP to PSTN – SIP is an IP protocol whereas PSTN is a TDM one , achieving interoperability is also the KRA of an SBC

SIP trunking – SBC provide a secure sip connectivity to connect calls to sip trunks which provide bulk calls functionality at a flat pricing.

support for various fixed or mobile endpoints – SBC ensure they are RFC compliant and can extend SIP to any kind of telecom endpoint like PSTN , GSM, fax , Skype , sipphone , IP phones etc.

NAT (Network address translator) – To meet the packet routing challenges across a firewall or even during private -public mapping. A combo of DHCP servers and NAT provider comes very handy to reroute or perform hole punching such that signalling and media packets are not dropped and meet the required endpoint. More about NAT here – NAT traversal using STUN and TURN.

Load balancing – Reverse proxies and Load balancers is a much adopted industry practise to mask the inner IPs of the VoIP platform and also route traffic appropriately between control and media server .

Security, QoS and Regulatory compliance – since SBCs are required to typically support a large array of clients they adhere to regulatory and industry accepted standards ,which also involves security features like AAA, TLS/SSL and other means for quality of assurance like logging and fault detection, preventing DDoS etc . In many cases SBC can also encrypt / decrypt RTP streams for probing , tapping or lawful inspection .

Terminating at carriers, PSTN and IP gateways

There are 2 ways to integrate IP calls to telecom provider endpoints such as GSM or LTE phones.

  • PRI lines
  • SIP trunks
convergence

Additional SBC features

Inaddition to above it is good to have if an SBC provides extra features like forking , emergency number dialing ( 911 ) or active directory integration . Real Time Analysis and monitoring of call and metrics are also expected from a SBC since they reside on edge of the network and are more vulnerable to threats . For example Dialogic Mediant SBC’s and gateways , Audio Codes SBCs

With the shift from on premise PBXs to cloud based VM or microservice architecture , SBC vendors adopt a lager umbrella of services also including automation scripts for checks , reporting tools / consoles , developer friendly APIs to manage sessions via SBC and even WebRTC gateways to connect browser endpoints .

UseCase Scenarios

Any VOIP dependant system which deals with bulksome voice / video traffic from external endpoints is a usages scenarios. Listing few

  • Contact Call centres
  • Remote work / offsite monitoring
  • CRM solution for sales/marketing
  • Connecting webrtc click to dial from webpage to enterprise representatives
  • connecting enterprise UCC clients to PSTN endpoints

The There are many more features and usecases for an IPPBX solution for an enterprise. The features of modern IP PBX systems are a big addon to internal secure telecom channel in an company and accross its various office.

Future of IP PBX

There has been a significant shift in replacing hard PBX systems with software-based IP PBX such as using Freeswitch, Asterisk or other commercial-grade SIP servers which seamlessly integrate into other business software such as CRM systems, task force management systems.
In recent times cloud telephony providers, particularly CPaaS platforms have revolutionized the IP telecommunication landscape with lightweight and feature-rich communication agents( web, native platform) and services such as programmable API to control call logic and services such as recording, IVR announcements, call parking, Automatic Queueing so on.

Kamailio

Asterisk

VoIP system DevOps, operations and Infrastructure management, Automation


Overview of VoIP platform DevOPS tools

This article is focussed around various tools required to operate and maintain a growing large scale VoIP Platform, which are mostly classified under following roles:

  • PCAP Collections
  • CICD on Jenkins pipeline
  • Configuration management using chef cookbooks
  • virtualization and containerization using Docker
  • Infrastructure management using terraform / Kubernetes
  • Logs Analysis and Alarming

PCAP Collection

Packet Capture (PCAP) is an API that captures live network packets. Besides tracking, audit and RTC visualizers, PCAP is widely used for debugging faults such as during production alarm on high failure occurrences.

Example usecase: Production alert on 503 SIP response or log entry from a gateway is not as helpful as PCAP tracking of the session ID of call across various endpoints in and out of the network to determine the point of failure.Debugging involves :

  1. Pre-specified SIP / RTP and related protocols capture 

Capture pcaps examples

tcpdump -i any -w alltraffic.pcap
rtpbreak -P2 -t100 -T100 -d logz -r alltraffic.pcap

2. Call SessionId to uniquely identify failed calls among tens of thousands of the packet 

3. Analyzer such as wireshark or tshark to track the packet

TShark inspection examples

brew cask install wireshark
tshark -r alltraffic.pcap -R "sip.CSeq.method eq INVITE"

Some of the useful call specs captured from PCAP

  • DTMF – Both in-band and out of band DTMF for every call, along with the time stamp.
  • Codec negotiations –  Extracting codecs from PCAP lets us 
    1. Validate later whether there were codec changes without prior SIP message,
    2. If the call has been hung up with 488 error code then it was due to which  codec 
  • SIP errors – track deviations from standard SIP messaging. 
    1. Identify known erroneous SIP messaging scenarios such as for MITM or replay attacks
  • RTCP Media stats – extract Jitter, Loss, RTT with RTCP reports for both the incoming and outgoing stream.
  • Identify Media or ACK Timeouts 
    1. Check whether a party has not sent any media packet for > 60 s (media time out threshold duration)
    2. When a call is hung up due to ACK time out.
  • Audio stream – After GDPR, take explicit permission from users before storing audio streams.
PCAP file analyzed in Wireshark ( PCAP source : https://wiki.wireshark.org/SampleCaptures#Sample_Captures)

Continuous Integration and Delivery Automation using Jenkins

CICD provides continous delivery hub , distribute work across multiple machines, helping drive builds, tests and deployments across multiple platforms .

Jenkins jobs is a self-contained Java-based program extensible using plugins.

Jenkins pieline– orchestrates and automates building project in Jenkins

Configuration management using chef cookbooks

Alternatives like puppet and Ansible, which are also a cross-platform configuration management platform

Compute virtualization and containerization using Docker

Docker containers can be used instead of virtual machines such as VirtualBox , to isolates applications and be OS and platform independent
Makes distributed development possible and automates the deployment possible

  • stop Stop one or more running containers
  • top Display the running processes of a container
> docker top 4417600169e8
UID PID PPID C STIME TTY TIME CMD
root 9913 9888 0 08:50 ? 00:00:00 bash /point.sh
root 10083 9913 0 08:50 ? 00:00:01 /usr/sbin/worker
root 10092 10083 0 08:50 ? 00:00:02 /micro-service
  • unpause Unpause all processes within one or more containers
  • update Update configuration of one or more containers
  • wait Block until one or more containers stop, then print their exit codes

see all iamges

> docker images
REPOSITORY                  TAG                 IMAGE ID            CREATED             SIZE
sipcapture/homer-cron       latest              fb2243f90cde        3 hours ago         476MB
sipcapture/homer-kamailio   latest              f159d46a22f3        3 hours ago         338MB
sipcapture/heplify          latest              9f5280306809        21 hours ago        9.61MB
<none>                      <none>              edaa5c708b3a        

See all stats

>  docker stats
CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
f42c71741107        homer-cron          0.00%               52KiB / 994.6MiB      0.01%               2.3kB / 0B          602MB / 0B          0
0111765091ae        mysql               0.04%               452.2MiB / 994.6MiB   45.46%              1.35kB / 0B         2.06GB / 49.2kB     22

Run command from within a docker instnace

docker exec -it  bash

First see all processes

docker ps

select a process and enter its bash

docker exec -it 0472a5127fff bash

to edit or update a file inside docker either install vim everytime u login in resh docker conainer like

apt-get update
apt-get install vim

or add this to dockerfile

RUN [“apt-get”, “update”]
RUN [“apt-get”, “install”, “-y”, “vim”]

see if ngrep is install , if not then install and run ngrep to get sip logs isnode that docker container

apt update
apt install ngrep
ngrep -p "14795778704" -W byline -d any port 5060

docker volume – Volumes are used for persisting data generated by and used by Docker containers.
docker volumes have advantages over blind mounts such as easier to backup or migrate , managed by docker APIs, can be safely shared among multiple containers etc

docker stack – Lets to manager a cluster of docker containers thorugh docker swarm can be defined via docker-compose.yml file

docker service

  • create Create a new service
  • inspect Display detailed information on one or more services
  • logs Fetch the logs of a service or task
  • ls List services
  • ps List the tasks of one or more services
  • rm Remove one or more services
  • rollback Revert changes to a service’s configuration
  • scale Scale one or multiple replicated services
  • update Update a service

Run docker containers

sample run command

docker run -it -d --name opensips -e ENV=dev imagename:2.2

-it flags attaches to an interactive tty in the container.
-e gives envrionment variables
-d runs it in background and prints container id

Remove docker entities

To remove all stopped containers, all dangling images, and all unused networks:

docker system prune -a

To remove all unused volumes

docker system prune --volumes

To remove all stopped containers

docker container prune
sometimes docker images keep piling with stopped congainer such as 

REPOSITORY                                                             TAG                 IMAGE ID            CREATED             SIZE                                                                              d1dcfe2438ae        15 minutes ago      753MB                                                                           2d353828889b        16 hours ago        910MB                                                          ...
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS                        PORTS               NAMES

0dd6698a7517        2d353828889b        "/entrypoint.sh"         13 minutes ago      Exited (137) 13 minutes ago                       hardcore_wozniak

to remove such images and their conainer , first stop and remove confainers

docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)

then remove all dangling images

docker rmi  $(docker images -aq --filter dangling=true)

Infrastructure management using terraform

Terraform is used for building, changing and versioning infrastructure.
Infra as Code – can run single application to datacentres via configuration files which create execution plan.
It can manage low-level components such as compute instances, storage, and networking, as well as high-level components such as DNS entries, SaaS features, etc.
Resource Graph – builds a graph of all your resources

tfenv can be used to manage terraform versions

> brew unlink terraform
tfenv install 0.11.14
tfenv list 

Terraform configuration language

This is used for declaring resources and descriptions of infrastructure and associated files have a .tf or .tf.json file extension
Group of resources can be gathered into a module. Terraform configuration consists of a root module, where evaluation begins, along with a tree of child modules created when one module calls another.

Example : launch a single AWS EC2 instance , fle server1.tf

provider "aws" {
  profile    = "default"
  region     = "us-east-1"
}

resource "aws_instance" "server1" {
  ami           = "ami-2757fxxx"
  instance_type = "t2.micro"
}

note : AMI IDs are region specific.
profile attribute here refers to the AWS Config File in ~/.aws/credentials

Terraform command line interface (CLI)

engine for evaluating and applying Terraform configurations.
uses plugins called providers that each define and manage a set of resource types

Command Usage: terraform [-version] [-help] [args]

  • apply Builds or changes infrastructure
  • console Interactive console for Terraform interpolations
  • destroy Destroy Terraform-managed infrastructure
  • env Workspace management
  • fmt Rewrites config files to canonical format
  • get Download and install modules for the configuration
  • graph Create a visual graph of Terraform resources
  • import Import existing infrastructure into Terraform
  • init Initialize a Terraform working directory
  • output Read an output from a state file
  • plan Generate and show an execution plan
  • providers Prints a tree of the providers used in the configuration
  • refresh Update local state file against real resources
  • show Inspect Terraform state or plan
  • taint Manually mark a resource for recreation
  • untaint Manually unmark a resource as tainted
  • validate Validates the Terraform files
  • version Prints the Terraform version
  • workspace Workspace management
  • 0.12upgrade Rewrites pre-0.12 module source code for v0.12
  • debug Debug output management (experimental)
  • force-unlock Manually unlock the terraform state
  • push Obsolete command for Terraform Enterprise legacy (v1)
  • state Advanced state management

terraform init
Initialize a working directory containing Terraform configuration files.

terraform validate
checks that verify whether a configuration is internally-consistent, regardless of any provided variables or existing state.

Kubernetes

container orchestration platform , automating deployment, scaling, and management of containerized applications. Can deploy to cluster of computers, automating the distribution and scheduling as well

Service discovery and load balancing – gives Pods their own IP addresses and a single DNS name for a set of Pods, and can load-balance across them.

Automatic bin packing – Automatically places containers based on their resource requirements and other constraints, while not sacrificing availability. Mix critical and best-effort workloads in order to drive up utilization and save even more resources.

Storage orchestration – Automatically mount the storage system of your choice, whether from local storage, a public cloud provider such as GCP or AWS, or a network storage system such as NFS, iSCSI, Gluster, Ceph, Cinder, or Flocker.

Self-healing – Restarts containers that fail, replaces and reschedules containers when nodes die, kills containers that don’t respond to your user-defined health check, and doesn’t advertise them to clients until they are ready to serve.

Automated rollouts and rollbacks – progressively rolls out changes to your application or its configuration, while monitoring application health to ensure it doesn’t kill all your instances at the same time.

Secret and configuration management – Deploy and update secrets and application configuration without rebuilding your image and without exposing secrets in your stack configuration.

Batch execution– manage batch and CI workloads, replacing containers that fail, if desired.

Horizontal scaling – Scale application up and down with a simple command, with a UI, or automatically based on CPU usage.

create minikube cluster and deploy pods

prerequisities : docker , curl , redis , others

install minikube

curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
chmod +x minikube
install minikube /usr/local/bin

Install kubectl

snap install kubectl --classic
ln -s /snap/bin/kubectl /usr/local/bin

Setup Minikube

minikube start --vm-driver=none
minikube addons enable registry-creds
kubectl -n kube-system create secret generic registry-creds-ecr
kubectl -n kube-system create secret generic registry-creds-gcr
kubectl -n kube-system create secret generic registry-creds-dpr
minikube addons configure registry-creds
Starting Kubernetes…minikube version: v1.3.0
 commit: 43969594266d77b555a207b0f3e9b3fa1dc92b1f
 minikube v1.3.0 on Ubuntu 18.04
 Running on localhost (CPUs=2, Memory=2461MB, Disk=47990MB) …
 OS release is Ubuntu 18.04.2 LTS
 Preparing Kubernetes v1.15.0 on Docker 18.09.5 …
 kubelet.resolv-conf=/run/systemd/resolve/resolv.conf
 Pulling images …
 Launching Kubernetes …
 Done! kubectl is now configured to use "minikube"
 dashboard was successfully enabled
 Kubernetes Started 

Basic Commands

  • start Starts a local kubernetes cluster
  • status Gets the status of a local kubernetes cluster
  • stop Stops a running local kubernetes cluster
  • delete Deletes a local kubernetes cluster
  • dashboard Access the kubernetes dashboard running within the minikube cluster

Images Commands:

  • docker-env Sets up docker env variables; similar to ‘$(docker-machine env)’
  • cache Add or delete an image from the local cache.

Configuration and Management Commands:

  • addons Modify minikube’s kubernetes addons
  • config Modify minikube config
  • profile Profile gets or sets the current minikube profile
  • update-context Verify the IP address of the running cluster in kubeconfig.

Networking and Connectivity Commands:

  • service Gets the kubernetes URL(s) for the specified service in your local cluster
  • tunnel tunnel makes services of type LoadBalancer accessible on localhost

Advanced Commands:

  • mount Mounts the specified directory into minikube
  • ssh Log into or run a command on a machine with SSH; similar to ‘docker-machine ssh’
  • kubectl Run kubectl

Troubleshooting Commands:

  • ssh-key Retrieve the ssh identity key path of the specified cluster
  • ip Retrieves the IP address of the running cluster
  • logs Gets the logs of the running instance, used for debugging minikube, not user code.
  • update-check Print current and latest version number

kubectl

controls the Kubernetes cluster manager.

Basic Commands (Beginner):

  • create Create a resource from a file or from stdin.
  • expose Take a replication controller, service, deployment or pod and expose it as a new Kubernetes Service
  • run Run a particular image on the cluster
  • set Set specific features on objects
  • explain Documentation of resources
  • get Display one or many resources
  • edit Edit a resource on the server
  • delete Delete resources by filenames, stdin, resources and names, or by resources and label selector

Deploy Commands:

  • rollout Manage the rollout of a resource
  • scale Set a new size for a Deployment, ReplicaSet, Replication Controller, or Job
  • autoscale Auto-scale a Deployment, ReplicaSet, or ReplicationController

Cluster Management Commands:

  • certificate Modify certificate resources.
  • cluster-info Display cluster info
  • top Display Resource (CPU/Memory/Storage) usage.
  • cordon Mark node as unschedulable
  • uncordon Mark node as schedulable
  • drain Drain node in preparation for maintenance
  • taint Update the taints on one or more nodes

Troubleshooting and Debugging Commands:

  • describe Show details of a specific resource or group of resources
  • logs Print the logs for a container in a pod
  • attach Attach to a running container
  • exec Execute a command in a container
  • port-forward Forward one or more local ports to a pod
  • proxy Run a proxy to the Kubernetes API server
  • cp Copy files and directories to and from containers.
  • auth Inspect authorization

Advanced Commands:

  • diff Diff live version against would-be applied version
  • apply Apply a configuration to a resource by filename or stdin
  • patch Update field(s) of a resource using strategic merge patch
  • replace Replace a resource by filename or stdin
  • wait Experimental: Wait for a specific condition on one or many resources.
  • convert Convert config files between different API versions
  • kustomize Build a kustomization target from a directory or a remote url.

Settings Commands:

  • label Update the labels on a resource
  • annotate Update the annotations on a resource
  • completion Output shell completion code for the specified shell (bash or zsh)

Other Commands:

  • api-resources Print the supported API resources on the server
  • api-versions Print the supported API versions on the server, in the form of “group/version”
  • config Modify kubeconfig files
  • plugin Provides utilities for interacting with plugins.
  • version Print the client and server version information

DevOps monitoring tools nagios

Manage Docker configs

  • create Create a config from a file or STDIN
  • inspect Display detailed information on one or more configs
  • ls List configs
  • rm Remove one or more configs

Manage containers

  • attach Attach local standard input, output, and error streams to a running container
  • commit Create a new image from a container’s changes
  • cp Copy files/folders between a container and the local filesystem
  • create Create a new container
  • diff Inspect changes to files or directories on a container’s filesystem
  • exec Run a command in a running container
  • export Export a container’s filesystem as a tar archive
  • inspect Display detailed information on one or more containers
  • kill Kill one or more running containers
  • logs Fetch the logs of a container
  • ls List containers
  • pause Pause all processes within one or more containers
  • port List port mappings or a specific mapping for the container
  • prune Remove all stopped containers
  • rename Rename a container
  • restart Restart one or more containers
  • rm Remove one or more containers
  • run Run a command in a new container
  • start Start one or more stopped containers
  • stats Display a live stream of container(s) resource usage statistics
  • stop Stop one or more running containers
  • top Display the running processes of a container
  • unpause Unpause all processes within one or more containers
  • update Update configuration of one or more containers
  • wait Block until one or more containers stop, then print their exit codes

Alternatives, Senu multi-cloud monitoring or Raygun

Monitoring, debugging, logs analysis and alarms

Aggregate logs into logstash and provide search and filtering via Elastic Search and Kibana. Can also trigger alerts or notifications on specific keyword searches in logs such as WARNING or ERRRO or call_failed. Some common alert scenarios include :

SBC and proxy gateways failures – check states of VM instance

DNS caching alerts – Domain Name System (DNS) caching, a Dynamic Host Configuration Protocol (DHCP) server, router advertisement and network boot alerts from service such as dnsmasq

Disk usage alert – setup alerts for 80% usage and trigger an alarm to either manually prune or create automatic timely archive backups.
check the percentage of DISK USAGE

df -h

Mostly it is either the logs file or pcap recorder which need to be archieved in external storage.

Use logrotate – it can rotates, compresses, and mails system logs

config file for logrorate – logrotate -vf /etc/logrotate.conf

/var/log/messages {
    rotate 5
    weekly
    postrotate
        /usr/bin/killall -HUP syslogd
    endscript
}

Elevated Call failure SIP 503 or Call timeout SIP 408 – high frequency of failed calls indicate an internal issue and must be followed up by smoke testing the entire system to identify any probable issue such as undetected frequent crashes of any individual component or any blacklisting by a destination endpoint etc

sudo tail -f sip.log | grep 503

or

sudo tail -f sip.log | grep WARNING

cron service or processed alerts

 ps axf
  PID TTY      STAT   TIME COMMAND
    2 ?        S      0:00 [kthreadd]
    3 ?        I<     0:00  \_ [rcu_gp]
    4 ?        I<     0:00  \_ [rcu_par_gp]
    5 ?        I      0:00  \_ [kworker/0:0-eve]
    6 ?        I<     0:00  \_ [kworker/0:0H-kb]
    7 ?        I      0:00  \_ [kworker/0:1-eve]
    8 ?        I      0:00  \_ [kworker/u4:0-nv]
    9 ?        I<     0:00  \_ [mm_percpu_wq]
   10 ?        S      0:00  \_ [ksoftirqd/0]
   11 ?        I      0:00  \_ [rcu_sched]
   12 ?        S      0:00  \_ [migration/0]
   13 ?        S      0:00  \_ [cpuhp/0]
   14 ?        S      0:00  \_ [cpuhp/1]
   15 ?        S      0:00  \_ [migration/1]
   16 ?        S      0:00  \_ [ksoftirqd/1]
   17 ?        I      0:00  \_ [kworker/1:0-eve]
   18 ?        I<     0:00  \_ [kworker/1:0H-kb]

or checks cron status

service cron status
● cron.service - Regular background program processing daemon
   Loaded: loaded (/lib/systemd/system/cron.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2016-06-26 03:00:37 UTC; 1min 17s ago
     Docs: man:cron(8)
 Main PID: 845 (cron)
    Tasks: 1 (limit: 4383)
   CGroup: /system.slice/cron.service
           └─845 /usr/sbin/cron -f

Jun 26 03:00:37 ip-172-31-45-21 systemd[1]: Started Regular background program processing daemon.
Jun 26 03:00:37 ip-172-31-45-21 cron[845]: (CRON) INFO (pidfile fd = 3)
Jun 26 03:00:37 ip-172-31-45-21 cron[845]: (CRON) INFO (Running @reboot jobs)

restart or start cron service if required

DB connections / connection pool process – keep listening for any alerts on DB connections failure or even warnings as this can be due to too many read operations such as in DDOS and can escalate very quickly

netstat -nltp  | grep db 
tcp        0      0 0.0.0.0:5433            0.0.0.0:*               LISTEN      5792/db-server * 

Routine deepstatus checks is a good practice too. Raise alert if any check doesnt result as expected.

Port check, unexpected result alert– Regular checks if servers are lsietning on ports such as 5060 for SIP

netstat -nltp | grep 5060
tcp        0      0 x.x.x.x:5060       0.0.0.0:*               LISTEN      8970/kamailio  

cron zombie process checks – zombie process or defunct process is a process that has completed execution (via the exit system call) but still has an entry in the process table: it is a process in the “Terminated state”. List xombie process and kill them with pid to free up .

kill -9 <PID1>

Bulk calls checks – consult ongoing call cmd commands for application server such as
For Freeswitch use

fs_ctl> show channels 

For kamailio use kamcmd

kamcmd dlg.list

For asterisk watch or show cmmand

watch -n 1 "sudo asterisk -vvvvvrx 'core show channels' | grep call"

Incase of DDOS or other macious attacker IP identification block the IP

iptables -I INPUT -s y.y.y.y -j DROP   

Can also use fail2ban

>apt-get update && apt-get install fail2ban

Additionally check how many dispatchers are responding on outbound gateway

opensipsctl dispatcher dump

Process control supervisor or pm2 checks – supervisor is a Linux Process Control System that allows its users to monitor and control a number of processes

ps axf | grep supervisor

for pm2

> pm2 status
[PM2] Spawning PM2 daemon with pm2_home=/Users/altanai/.pm2
[PM2] PM2 Successfully daemonized
┌─────┬───────────┬─────────────┬─────────┬─────────┬──────────┬────────┬──────┬───────────┬──────────┬──────────┬──────────┬──────────┐
│ id │ name │ namespace │ version │ mode │ pid │ uptime │ ↺ │ status │ cpu │ mem │ user │ watching │

htop to check memeory and CPU

Health and load on the reverse proxy, load balancer as Nginx – perform a direct curl request to host to check if Nginx responds with a non 4xx / 5xx response or not

curl -v <public-fqdn-of-server> 

Incase of error response , restart

/etc/init.d/nginx start

Incase of updates restart ngnix config

nginx -s reload

For HTTP/SSL proxy daemon such as tiny proxy which are used for fast resposne , set the MinSpareServers, MaxSpareServers , MaxClients , MaxRequestsPerChild etc appropriately

VPN checks – restart fireealls or IPsec incase of ssues

/etc/init.d/ipsec restart

Additionally also check ssh service

ps axf | grep sshd

restart sshd if required

SSL cert expiry checks – to keep the operations running securely and prevent and abrupt termination it is a good practise to run regular certificate expiry checks for SSL certs especially on secure HTTP endpoint like APIs , web server and also on SIP applications servers for TLS. If any expiry is due in < 10 days to trigger an alert to renew the certs

Health of Task scheduling services such as RabbitMQ, Celery Distributed Task Queue – remote debugging of these can be set up via pdb which supports setting (conditional) breakpoints and single stepping at the source line level, inspection of stack frames, source code listing, and evaluation of arbitrary Python code in the context of any stack frame.

import pdb; pdb.set_trace()
python3 -m pdb myscript.py

It can also be set up via using the client libraries provided by these Queue services themselves

Cluster status – setup an efficient health check service which monitors the cluster status for High Availability. JSON object depicting the status of cluster shards

{
  "cluster_name" : "ABC-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 14,
  "number_of_data_nodes" : 6,
  "active_primary_shards" : 200,
  "active_shards" : 300,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0
}

Status of Crticial Application Server

fscli > show status
UP 0 years, 0 days, 0 hours, 58 minutes, 33 seconds, 15 milliseconds, 58 microseconds
FreeSWITCH (Version 1.6.20 git 987c9b9 2018-01-23 21:49:09Z 64bit) is ready
3 session(s) since startup
0 session(s) - peak 1, last 5min 1
0 session(s) per Sec out of max 30, peak 1, last 5min 1
1000 session(s) max
min idle cpu 0.00/80.83
Current Stack Size/Max 240K/8192K

Programming or Syntax error in the production environment – mostly arising due to incomplete QA/testing before pushing new changes to production. Should trigger alerts for dev teams and meet with hot patches.

Many programing application development frameworks have inbuild libs for debugging , exceotion handling and reporting such as

  • backend service in Django
  • API service in Go

Distributed memory caching – redis , memcahe : Redis info shows the master -salve configuration for all the instances as well as their memeory and cpu status.

>redis-cli info
# Server
redis_version:6.0.4
redis_git_dirty:0
redis_mode:standalone
os:Darwin 18.7.0 x86_64
arch_bits:64
multiplexing_api:kqueue
atomicvar_api:atomic-builtin
gcc_version:4.2.1
tcp_port:6379

# Clients
connected_clients:1
client_recent_max_input_buffer:0
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0

# Memory
used_memory:1065648
used_memory_human:1.02M
number_of_cached_scripts:0
maxmemory:0
allocator_frag_bytes:1123680
allocator_rss_ratio:1.00
rss_overhead_bytes:37888
mem_fragmentation_ratio:2.16
active_defrag_running:0
lazyfree_pending_objects:0

# Persistence
loading:0
rdb_changes_since_last_save:0
module_fork_last_cow_size:0

# Stats
total_connections_received:1
total_commands_processed:0
..

# Replication
role:master
connected_slaves:0
..

# CPU
used_cpu_sys:0.011198
used_cpu_sys_children:0.000000

# Modules

# Cluster
cluster_enabled:0

SMS service using smsc on Kannel : From the kannel servers, you should see the PANIC error (most of the time Assertion error crashing kannel):

grep PANIC /var/log/kannel/bearerbox.log

IF you are going to restart , Flush redis cache

sudo redis-cli FLUSHALL
sudo redis-cli SAVE

restart kannel

sudo /etc/init.d/kannel restart

If the carriers are throttling the SMS request , verify “ERROR” responses using

sudo grep -i "throttling" bearerbox.log

Alternatives include AWS logs services :

  • Scalyr logging
  • Sensu monitoring for multi-cloud monitoring using event pipeline

Read about VoIP/ OTT / Telecom Solution startup’s strategy for Building a scalable flexible SIP platform that includes :

  • Scalable and Flexible SIP platform building
  • Cluster SIP telephony Server for High Availability
  • Failure Recovery
  • Multi-tier cluster architecture
  • Role Abstraction / Micro-Service based architecture
  • Distributed Event management and Event-Driven architecture
  • Containerization
  • Autoscaling Cloud Servers
  • Open standards and Data Privacy
  • Flexibility for inter-working – NextGen911 , IMS , PSTN
  • security and Operational Efficiencies

Read more about SIP VoIP system Architecture which includes

  • Infrastructure Requirements
  • Integral Components of a VOIP SIP-based architecture
  • RTP ( Real-Time Transport Protocol ) / RtCP
  • SIP gateways, registrar, proxy, redirect, application
  • Developing SIP-based applications – basic call routing, media management
  • SIP platform Development – NAt and DNS , Cross-platform and integration to External Telecommunication provider landscape , Databases

References :


sipP ( SIP testing tool )

SIPp is an opensource (GNU GPL license) performance testing tool for the SIP protocol and is widely used for Quality assurabce of callflows in voip applications for UAC / UASs cenarios.

It can emulate functioing of a sip phone such as REGISTER , establishes and releases multiple calls with the INVITE and BYE methods , send other SIP requests and wait for reponses based on dafult of custom xml scenario files.

Plus factor is the dynamic display of statistics about running tests (call rate, round trip delay, and message statistics), periodic CSV statistics dumps, TCP and UDP over multiple sockets or multiplexed with retransmission management, regular expressions and variables in scenario files, and dynamically adjustable call rates.

sipp -sn uac -d 10000 -s 9876543210 127.0.0.1:5060  -l 10

It is widley used as aperformnace and load testing tool since it can test SIP equipements like SIP proxies, B2BUAs, SIP media servers, SIP/x gateways, and SIP PBXes and can also emulate thousands of user agents calling your SIP system.

More on SIPp scripts and various exmaples can be read from

https://github.com/altanai/kamailioexamples/tree/master/sipp

Installation

Pre-requisites to compile SIPp are:
– C++ Compiler
– curses or ncurses library
– For TLS support: OpenSSL >= 0.9.8
– For pcap play support: libpcap and libnet
– For SCTP support: lksctp-tools
– For distributed pauses: Gnu Scientific Libraries

sudo apt-get install dh-autoreconf ncurses-dev libssl-dev libpcap-dev libncurses5-dev libsctp-dev lksctp-tools

Either get source code from git

git clone https://github.com/SIPp/sipp.git
cd sipp
cmake . -DUSE_SSL=1 -DUSE_SCTP=1 -DUSE_PCAP=1 -DUSE_GSL=1
make

or download readymade tar , then extract and build with options like

tar -xvzf sipp-xxx.tar.gz
cd sipp
./configure --with-sctp --with-pcap --with-openssl
make

Building certs for TLS based sipp UAS server

make master dir for all certs

mkdir certs 
chmod 0700 certs
cd certs

Make CA folder, create cert and check

mkdir demoCA
cd demoCA
mkdir newcerts
echo '01' > serial
touch index.txt
openssl req -new -x509 -extensions v3_ca -keyout key.pem -out cert.pem -days 3650

Validation of the contents of certs ( optional )

openssl x509 -in cert.pem -noout -text
openssl x509 -in cert.pem -noout -dates
openssl x509 -in cert.pem -noout -purpose

Make domain folder and create the certs for the sip domain name from parent and check

cd ..
mkdir 10.10.10.10
openssl req -new -nodes -keyout key.pem -out req.pem
cd ..
openssl ca -days 730 -out 10.10.10.10/cert.pem -keyfile demoCA/key.pem -cert demoCA/cert.pem -infiles 10.10.10.10/req.pem

Verify the generated certificate for for SIP domain

openssl x509 -in 10.10.10.10/cert.pem -noout -text

Run sipp

sipp -sn uas -p 5077 -t l1 -tls_key /home/ubuntu/certs/10.10.10.10/key.pem  -tls_cert /home/ubuntu/certs/10.10.10.10/cert.pem  -i 10.10.10.10

Verify installation

Run sipp with embedded server (uas) scenario:

sipp -sn uas

On the same host, run sipp with embedded client (uac) scenario:

sipp -sn uac 127.0.0.1 -trace_msg -trace_err
output for server 

 # sipp -sn uas

------------------------------ Scenario Screen -------- [1-9]: Change Screen --

  Port   Total-time  Total-calls  Transport
  5060      32.95 s           61  UDP
0 new calls during 0.874 s period      1 ms scheduler resolution
  19 calls                               Peak was 41 calls, after 28 s
  0 Running, 63 Paused, 12 Woken up
  0 dead call msg (discarded)          
  3 open sockets                        
                             Messages  Retrans   Timeout   Unexpected-Msg

----------> INVITE 61 0 0 0
<---------- 180 61 0 <---------- 200 61 0 0 ----------> ACK E-RTD1 61 0 0 0

----------> BYE 61 0 0 0
<---------- 200 61 0
[ 4000ms] Pause 61 0
------------------------------ Test Terminated --------------------------------
----------------------------- Statistics Screen ------- [1-9]: Change Screen --

  Start Time             | 2019-02-04    13:04:32.108663 1549265672.108663         
  Last Reset Time        | 2019-02-04    13:05:04.189720 1549265704.189720         
  Current Time           | 2019-02-04    13:05:05.065119 1549265705.065119         
-------------------------+---------------------------+--------------------------
  Counter Name           | Periodic value            | Cumulative value
-------------------------+---------------------------+--------------------------
  Elapsed Time           | 00:00:00:875000           | 00:00:32:956000          
  Call Rate              |    0.000 cps              |    1.851 cps             
-------------------------+---------------------------+--------------------------

  Incoming call created  |        0                  |       61                 

  OutGoi traceings 

———————————————– 2019-02-04 13:08:13.939148
UDP message sent (530 bytes):

INVITE sip:service@127.0.0.1:5060 SIP/2.0
Via: SIP/2.0/UDP 192.x.x.x:5061;branch=z9hG4bK-52422-25-0
From: sipp ;tag=52422SIPpTag0025
To: service
Call-ID: 25-52422@192.x.x.x
CSeq: 1 INVITE
Contact: sip:sipp@192.x.x.x:5061
Max-Forwards: 70
Subject: Performance Test
Content-Type: application/sdp
Content-Length: 135
v=0
o=user1 53655765 2353687637 IN IP4 192.x.x.x
s=-
c=IN IP4 192.x.x.x
t=0 0
m=audio 6004 RTP/AVP 0
a=rtpmap:0 PCMU/8000

———————————————– 2019-02-04 13:08:13.939310
UDP message received [321] bytes :

SIP/2.0 180 Ringing
Via: SIP/2.0/UDP 192.x.x.x:5061;branch=z9hG4bK-52422-1-0
From: sipp ;tag=52422SIPpTag001
To: service ;tag=52416SIPpTag011
Call-ID: 1-52422@192.x.x.x
CSeq: 1 INVITE
Contact: 
Content-Length: 0

———————————————– 2019-02-04 13:08:13.939905
UDP message received [486] bytes :

SIP/2.0 200 OK
Via: SIP/2.0/UDP 192.x.x.x:5061;branch=z9hG4bK-52422-1-0
From: sipp ;tag=52422SIPpTag001
To: service ;tag=52416SIPpTag011
Call-ID: 1-52422@192.x.x.x
CSeq: 1 INVITE
Contact: 
Content-Type: application/sdp
Content-Length:   135
v=0
o=user1 53655765 2353687637 IN IP4 192.x.x.x
s=-
c=IN IP4 192.x.x.x
t=0 0
m=audio 6000 RTP/AVP 0
a=rtpmap:0 PCMU/8000

———————————————– 2019-02-04 13:08:13.940159
UDP message sent (371 bytes):

ACK sip:service@127.0.0.1:5060 SIP/2.0
Via: SIP/2.0/UDP 192.x.x.x:5061;branch=z9hG4bK-52422-1-5
From: sipp ;tag=52422SIPpTag001
To: service ;tag=52416SIPpTag011
Call-ID: 1-52422@192.x.x.x
CSeq: 1 ACK
Contact: sip:sipp@192.x.x.x:5061
Max-Forwards: 70
Subject: Performance Test
Content-Length: 0

~ RTP

———————————————– 2019-02-04 13:08:13.941658
UDP message sent (371 bytes):

BYE sip:service@127.0.0.1:5060 SIP/2.0
Via: SIP/2.0/UDP 192.x.x.x:5061;branch=z9hG4bK-52422-1-7
From: sipp ;tag=52422SIPpTag001
To: service ;tag=52416SIPpTag011
Call-ID: 1-52422@192.x.x.x
CSeq: 2 BYE
Contact: sip:sipp@192.x.x.x:5061
Max-Forwards: 70
Subject: Performance Test
Content-Length: 0

———————————————– 2019-02-04 13:08:13.952888
UDP message received [313] bytes :

SIP/2.0 200 OK
Via: SIP/2.0/UDP 192.x.x.x:5061;branch=z9hG4bK-52422-1-7
From: sipp ;tag=52422SIPpTag001
To: service ;tag=52416SIPpTag011
Call-ID: 1-52422@192.x.x.x
CSeq: 2 BYE
Contact: 
Content-Length: 0

Time

---------------------------- Repartition Screen ------- [1-9]: Change Screen --
Average Response Time Repartition 1
0 ms <= n < 10 ms : 293 10 ms <= n < 20 ms : 9 20 ms <= n < 30 ms : 0 30 ms <= n < 40 ms : 0 40 ms <= n < 50 ms : 0 50 ms <= n < 100 ms : 0 100 ms <= n < 150 ms : 0 150 ms <= n < 200 ms : 0 n >= 200 ms : 0
Average Call Length Repartition
0 ms <= n < 10 ms : 0 10 ms <= n < 50 ms : 0 50 ms <= n < 100 ms : 0 100 ms <= n < 500 ms : 0 500 ms <= n < 1000 ms : 0 1000 ms <= n < 5000 ms : 262 5000 ms <= n < 10000 ms : 0 n >= 10000 ms : 0
------------------------------ Sipp Server Mode -------------------------------

Output for client

uac.xml
 
SIPp UAC Remote
 |(1) INVITE |
 |------------------>|
 |(2) 100 (optional) |
 |<------------------| 
 |(3) 180 (optional) | 
  |<------------------| 
|(4) 200             | 
|<------------------| 
|(5) ACK             | 
|------------------>|
 |                     |
 |(6) PAUSE             |
 |                     |
 |(7) BYE             |
 |------------------>|
 |(8) 200             |
 |<------------------|

sipp -sn uac 127.0.0.1 -trace_msg -trace_err
Resolving remote host ‘127.0.0.1’… Done.
—————————— Scenario Screen ——– [1-9]: Change Screen —
Call-rate(length) Port Total-time Total-calls Remote-host
10.0(0 ms)/1.000s 5061 17.32 s 98 127.0.0.1:5060(UDP)

3 new calls during 0.286 s period 1 ms scheduler resolution
0 calls (limit 30) Peak was 25 calls, after 10 s
0 Running, 101 Paused, 7 Woken up
0 dead call msg (discarded) 0 out-of-call msg (discarded)
3 open sockets

                             Messages  Retrans   Timeout   Unexpected-Msg
  INVITE ---------->         98        0         0                  
     100 <----------         0         0         0         0        
     180 <----------         98        0         0         0        
     183 <----------         0         0         0         0        
     200          98        0                            
   Pause [      0ms]         98                            0        
     BYE ---------->         98        0         0                  
     200 <----------         98        0         0         0        

—————————— Test Terminated ——————————–

----------------------------- Statistics Screen ------- [1-9]: Change Screen --

  Start Time             | 2019-02-04    13:08:03.908208 1549265883.908208         
  Last Reset Time        | 2019-02-04    13:08:20.954289 1549265900.954289         
  Current Time           | 2019-02-04    13:08:21.241152 1549265901.241152         

-------------------------+---------------------------+--------------------------
  Counter Name           | Periodic value            | Cumulative value

-------------------------+---------------------------+--------------------------
  Elapsed Time           | 00:00:00:286000           | 00:00:17:332000          

  Call Rate  

Tracings

———————————————– 2019-02-04 13:08:13.934840
UDP message received [527] bytes :

INVITE sip:service@127.0.0.1:5060 SIP/2.0
Via: SIP/2.0/UDP 192.x.x.x:5061;branch=z9hG4bK-52422-1-0
From: sipp ;tag=52422SIPpTag001
To: service 
Call-ID: 1-52422@192.x.x.x
CSeq: 1 INVITE
Contact: sip:sipp@192.x.x.x:5061
Max-Forwards: 70
Subject: Performance Test
Content-Type: application/sdp
Content-Length:   135
v=0
o=user1 53655765 2353687637 IN IP4 192.x.x.x
s=-
c=IN IP4 192.x.x.x
t=0 0
m=audio 6004 RTP/AVP 0
a=rtpmap:0 PCMU/8000

———————————————– 2019-02-04 13:08:13.936616
UDP message sent (321 bytes):

SIP/2.0 180 Ringing
Via: SIP/2.0/UDP 192.x.x.x:5061;branch=z9hG4bK-52422-1-0
From: sipp ;tag=52422SIPpTag001
To: service ;tag=52416SIPpTag011
Call-ID: 1-52422@192.x.x.x
CSeq: 1 INVITE
Contact: 
Content-Length: 0

———————————————– 2019-02-04 13:08:13.937003
UDP message sent (486 bytes):

SIP/2.0 200 OK
Via: SIP/2.0/UDP 192.x.x.x:5061;branch=z9hG4bK-52422-1-0
From: sipp ;tag=52422SIPpTag001
To: service ;tag=52416SIPpTag011
Call-ID: 1-52422@192.x.x.x
CSeq: 1 INVITE
Contact: 
Content-Type: application/sdp
Content-Length:   135
v=0
o=user1 53655765 2353687637 IN IP4 192.x.x.x
s=-
c=IN IP4 192.x.x.x
t=0 0
m=audio 6000 RTP/AVP 0
a=rtpmap:0 PCMU/8000

———————————————– 2019-02-04 13:08:13.948679
UDP message received [371] bytes :

ACK sip:service@127.0.0.1:5060 SIP/2.0
Via: SIP/2.0/UDP 192.x.x.x:5061;branch=z9hG4bK-52422-1-5
From: sipp ;tag=52422SIPpTag001
To: service ;tag=52416SIPpTag011
Call-ID: 1-52422@192.x.x.x
CSeq: 1 ACK
Contact: sip:sipp@192.x.x.x:5061
Max-Forwards: 70
Subject: Performance Test
Content-Length: 0

~ RTP

———————————————– 2019-02-04 13:08:13.949168
UDP message received [371] bytes :

BYE sip:service@127.0.0.1:5060 SIP/2.0
Via: SIP/2.0/UDP 192.x.x.x:5061;branch=z9hG4bK-52422-1-7
From: sipp ;tag=52422SIPpTag001
To: service ;tag=52416SIPpTag011
Call-ID: 1-52422@192.x.x.x
CSeq: 2 BYE
Contact: sip:sipp@192.x.x.x:5061
Max-Forwards: 70
Subject: Performance Test
Content-Length: 0

———————————————– 2019-02-04 13:08:13.949245
UDP message sent (313 bytes):

SIP/2.0 200 OK
Via: SIP/2.0/UDP 192.x.x.x:5061;branch=z9hG4bK-52422-1-7
From: sipp ;tag=52422SIPpTag001
To: service ;tag=52416SIPpTag011
Call-ID: 1-52422@192.x.x.x
CSeq: 2 BYE
Contact: 
Content-Length: 0

time

---------------------------- Repartition Screen ------- [1-9]: Change Screen --
Average Response Time Repartition 1
0 ms <= n < 10 ms : 657 10 ms <= n < 20 ms : 20 20 ms <= n < 30 ms : 0 30 ms <= n < 40 ms : 0 40 ms <= n < 50 ms : 0 50 ms <= n < 100 ms : 0 100 ms <= n < 150 ms : 0 150 ms <= n < 200 ms : 0 n >= 200 ms : 0
Average Call Length Repartition
0 ms <= n < 10 ms : 649 10 ms <= n < 50 ms : 28 50 ms <= n < 100 ms : 0 100 ms <= n < 500 ms : 0 500 ms <= n < 1000 ms : 0 1000 ms <= n < 5000 ms : 0 5000 ms <= n < 10000 ms : 0 n >= 10000 ms : 0
------ [+|-|*|/]: Adjust rate ---- [q]: Soft exit ---- [p]: Pause traffic -----

Last Error: Overload warning: the major watchdog timer 3000ms has been t…

UAC with Media

SIPp UAC            Remote
    |(1) INVITE         |
    |------------------>|
    |(2) 100 (optional) |
    |<------------------|
    |(3) 180 (optional) |
    |<------------------|
    |(4) 200            |
    |<------------------|
    |(5) ACK            |
    |------------------>|
    |                   |
    |(6) RTP send (8s)  |
    |==================>|
    |                   |
    |(7) RFC2833 DIGIT 1|
    |==================>|
    |                   |
    |(8) BYE            |
    |------------------>|
    |(9) 200            |
    |<------------------|

sipp Usage:

sipp remote_host[:remote_port] [options]

Run SIPp with embedded server (uas) scenario: ./sipp -sn uas On the same host, run SIPp with embedded client (uac) scenario: ./sipp -sn uac 127.0.0.1

Scenario file options:

  • -sd : Dumps a default scenario (embedded in the SIPp executable)
  • -sf : Loads an alternate XML scenario file. To learn more about XML scenario syntax, use the -sd option to dump embedded scenarios. They contain all the necessary help.
  • -oocsf : Load out-of-call scenario.
  • -oocsn : Load out-of-call scenario.
  • -sn : Use a default scenario (embedded in the SIPp executable). If this option is omitted, the Standard SipStone UAC scenario is loaded. Available values in this version: 
    • ‘uac’ : Standard SipStone UAC (default).
    • ‘uas’ : Simple UAS responder.
    • ‘regexp’ : Standard SipStone UAC – with regexp and variables.
    • ‘branchc’ : Branching and conditional branching in scenarios – client.
    • ‘branchs’ : Branching and conditional branching in scenarios – server.
    Default 3pcc scenarios (see -3pcc option):
    • ‘3pcc-C-A’ : Controller A side (must be started after all other 3pcc scenarios)
    • ‘3pcc-C-B’ : Controller B side.
    • ‘3pcc-A’ : A side.
    • ‘3pcc-B’ : B side.

IP, port and protocol options

  • -t : Set the transport mode:
    • u1: UDP with one socket (default),
    • un: UDP with one socket per call,
    • ui: UDP with one socket per IP address. The IP addresses must be defined in the injection file.
    • t1: TCP with one socket,
    • tn: TCP with one socket per call,
    • l1: TLS with one socket,
    • ln: TLS with one socket per call,
    • c1: u1 + compression (only if compression plugin loaded),
    • cn: un + compression (only if compression plugin loaded). This plugin is not provided with SIPp.
  • -i : Set the local IP address for ‘Contact:’,’Via:’, and ‘From:’ headers. Default is primary host IP address.
  • -p : Set the local port number. Default is a random free port chosen by the system 
  • -bind_local : Bind socket to local IP address, i.e. the local IP address is used as the source IP address. If SIPp runs in server mode it will only listen on the local IP address instead of all IP addresses.
  • -ci : Set the local control IP address
  • -cp : Set the local control port number. Default is 8888.
  • -max_socket : Set the max number of sockets to open simultaneously. This option is significant if you use one socket per call. Once this limit is reached, traffic is distributed over the sockets already opened. Default value is 50000
  • -max_reconnect : Set the the maximum number of reconnection.
  • -reconnect_close : Should calls be closed on reconnect?
  • -reconnect_sleep : How long (in milliseconds) to sleep between the close and reconnect?
  • -rsa : Set the remote sending address to host:port for sending the messages.
  • -tls_cert : Set the name for TLS Certificate file. Default is ‘cacert.pem
  • -tls_key : Set the name for TLS Private Key file. Default is ‘cakey.pem’
  • -tls_ca : Set the name for TLS CA file. If not specified, X509 verification is not activated.
  • -tls_crl : Set the name for Certificate Revocation List file. If not specified, X509 CRL is not activated.
  • -tls_version : Set the TLS protocol version to use (1.0, 1.1, 1.2) — default is autonegotiate

SIPp overall behavior options:

  • -v : Display version and copyright information.
  • -bg : Launch SIPp in background mode.
  • -nostdin : Disable stdin.
  • -plugin : Load a plugin.
  • -sleep : How long to sleep for at startup. Default unit is seconds.
  • -skip_rlimit : Do not perform rlimit tuning of file descriptor limits. Default: false.
  • -buff_size : Set the send and receive buffer size.
  • -sendbuffer_warn : Produce warnings instead of errors on SendBuffer failures.
  • -lost : Set the number of packets to lose by default (scenario specifications override this value).
  • -key : keyword value Set the generic parameter named “keyword” to “value”.
  • -set : variable value Set the global variable parameter named “variable” to “value”.
  • -tdmmap : Generate and handle a table of TDM circuits. A circuit must be available for the call to be placed. Format: -tdmmap {0-3}{99}{5-8}{1-31}
  • -dynamicStart : variable value Set the start offset of dynamic_id variable
  • -dynamicMax : variable value Set the maximum of dynamic_id variable 
  • -dynamicStep : variable value Set the increment of dynamic_id variable

Call behavior options:

  • -aa : Enable automatic 200 OK answer for INFO, NOTIFY, OPTIONS and UPDATE.
  • -base_cseq : Start value of [cseq] for each call.
  • -cid_str : Call ID string (default %u-%p@%s). %u=call_number, %s=ip_address, %p=process_number, %%=% (in any order).
  • -d : Controls the length of calls. More precisely, this controls the duration of ‘pause’ instructions in the scenario, if they do not have a ‘milliseconds’ section. Default value is 0 and default unit is milliseconds.
  • -deadcall_wait : How long the Call-ID and final status of calls should be kept to improve message and error logs (default unit is ms).
  • -auth_uri : Force the value of the URI for authentication. By default, the URI is composed of remote_ip:remote_port.
  • -au : Set authorization username for authentication challenges. Default is taken from -s argument
  • -ap : Set the password for authentication challenges. Default is ‘password’
  • -s : Set the username part of the request URI. Default is ‘service’.
  • -default_behaviors: Set the default behaviors that SIPp will use. Possible values are:
    • all Use all default behaviors
    • none Use no default behaviors
    • bye Send byes for aborted calls
    • abortunexp Abort calls on unexpected messages
    • pingreply Reply to ping requests If a behavior is prefaced with a -, then it is turned off. Example: all,-bye
  • -nd : No Default. Disable all default behavior of SIPp which are the following:
  • On UDP retransmission timeout, abort the call by sending a BYE or a CANCEL
  • On receive timeout with no ontimeout attribute, abort the call by sending a BYE or a CANCEL
  • On unexpected BYE send a 200 OK and close the call
  • On unexpected CANCEL send a 200 OK and close the call
  • On unexpected PING send a 200 OK and continue the call
  • On any other unexpected message, abort the call by sending a BYE or a CANCEL
  • -pause_msg_ign : Ignore the messages received during a pause defined in the scenario 
  • -callid_slash_ign: Don’t treat a triple-slash in Call-IDs as indicating an extra SIPp prefix.

Injection file options:

  • -inf : Inject values from an external CSV file during calls into the scenarios. First line of this file say whether the data is to be read in sequence (SEQUENTIAL), random (RANDOM), or user (USER) order. Each line corresponds to one call and has one or more ‘;’ delimited data fields. Those fields can be referred as [field0], [field1], … in the xml scenario file. Several CSV files can be used simultaneously (syntax: -inf f1.csv -inf f2.csv …)
  • -infindex : file field Create an index of file using field. For example -inf ../path/to/users.csv -infindex users.csv 0 creates an index on the first key.
  • -ip_field : Set which field from the injection file contains the IP address from which the client will send its messages. If this option is omitted and the ‘-t ui’ option is present, then field 0 is assumed. Use this option together with ‘-t ui’

RTP behaviour options:

  • -mi : Set the local media IP address (default: local primary host IP address)
  • -rtp_echo : Enable RTP echo. RTP/UDP packets received on port defined by -mp are echoed to their sender. RTP/UDP packets coming on this port + 2 are also echoed to their sender (used for sound and video echo).
  • -mb : Set the RTP echo buffer size (default: 2048).
  • -mp : Set the local RTP echo port number. Default is 6000.
  • -rtp_payload : RTP default payload type.
  • -rtp_threadtasks : RTP number of playback tasks per thread.
  • -rtp_buffsize : Set the rtp socket send/receive buffer size.

Call rate options:

  • -r : Set the call rate (in calls per seconds). This value can bechanged during test by pressing ‘+’, ‘_’, ‘*’ or ‘/’. Default is 10.
    • pressing ‘+’ key to increase call rate by 1 * rate_scale,
    • pressing ‘-‘ key to decrease call rate by 1 * rate_scale,
    • pressing ‘*’ key to increase call rate by 10 * rate_scale,
    • pressing ‘/’ key to decrease call rate by 10 * rate_scale.
  • -rp : Specify the rate period for the call rate. Default is 1 second and default unit is milliseconds. This allows you to have n calls every m milliseconds(by using -r n -rp m). Example: -r 7 -rp 2000 ==> 7 calls every 2 seconds. -r 10 -rp 5s => 10 calls every 5 seconds.
  • -rate_scale : Control the units for the ‘+’, ‘-‘, ‘*’, and ‘/’ keys.
  • -rate_increase : Specify the rate increase every -rate_interval units (default is seconds). This allows you to increase the load for each independent logging period. Example: -rate_increase 10 -rate_interval 10s ==> increase calls by 10 every 10 seconds.
  • -rate_max : 

If -rate_increase is set, then quit after the rate reaches this value. Example: -rate_increase 10 -rate_max 100 ==> increase calls by 10 until 100 cps is hit.

  • -rate_interval : Set the interval by which the call rate is increased. Defaults to the value of -fd.
  • -no_rate_quit : If -rate_increase is set, do not quit after the rate reaches -rate_max.
  • -l :  Set the maximum number of simultaneous calls. Once this limit is reached, traffic is decreased until the number of open calls goes down. Default: (3 * call_duration (s) * rate).
  • -m : Stop the test and exit when ‘calls’ calls are processed
  • -users : Instead of starting calls at a fixed rate, begin ‘users’ calls at startup, and keep the number of calls constant.

Retransmission and timeout options:

  • -recv_timeout : Global receive timeout. Default unit is milliseconds. If the expected message is not received, the call times out and is aborted.
  • -send_timeout : Global send timeout. Default unit is milliseconds. If a message is not sent (due to congestion), the call times out and is aborted.
  • -timeout : Global timeout. Default unit is seconds. If this option is set, SIPp quits after nb units (-timeout 20s quits after 20 seconds).
  • -timeout_error : SIPp fails if the global timeout is reached is set (-timeout option required).
  • -max_retrans : Maximum number of UDP retransmissions before call ends on timeout. Default is 5 for INVITE transactions and 7 for others.
  • -max_invite_retrans: Maximum number of UDP retransmissions for invite transactions before call ends on timeout.
  • -max_non_invite_retrans: Maximum number of UDP retransmissions for non-invite transactions before call ends on timeout.
  • -nr : Disable retransmission in UDP mode.
  • -rtcheck : Select the retransmission detection method: full (default) or loose.
  • -T2 : Global T2-timer in milli seconds

Third-party call control options:

  • -3pcc : Launch the tool in 3pcc mode (“Third Party call control”). The passed IP address depends on the 3PCC role.
    • When the first twin command is ‘sendCmd’ then this is the address of the remote twin socket. SIPp will try to connect to this address:port to send the twin command (This instance must be started after all other 3PCC scenarios). Example: 3PCC-C-A scenario.
    • When the first twin command is ‘recvCmd’ then this is the address of the local twin socket. SIPp will open this address:port to listen for twin command. Example: 3PCC-C-B scenario.
  • -master : 3pcc extended mode: indicates the master number
  • -slave : 3pcc extended mode: indicates the slave number
  • -slave_cfg : 3pcc extended mode: indicates the file where the master and slave addresses are stored

Performance and watchdog options:

  • -timer_resol
    Set the timer resolution. Default unit is milliseconds. This option has an impact on timers precision.Small values allow more precise scheduling but impacts CPU usage.If the compression is on, the value is set to 50ms. The default value is 10ms.
  • -max_recv_loops Set the maximum number of messages received read per cycle. Increase this value for high traffic level. The default value is 1000.
  • -max_sched_loops Set the maximum number of calls run per event loop. Increase this value for high traffic level. The default value is 1000.
  • -watchdog_interval : Set gap between watchdog timer firings. Default is 400.
  • -watchdog_reset : If the watchdog timer has not fired in more than this time period, then reset the max triggers counters. Default is 10 minutes.
  • -watchdog_minor_threshold: If it has been longer than this period between watchdog executions count a minor trip. Default is 500.
  • -watchdog_major_threshold: If it has been longer than this period between watchdog executions count a major trip. Default is 3000.
  • -watchdog_major_maxtriggers : How many times the major watchdog timer can be tripped before the test is terminated. Default is 10.
  • -watchdog_minor_maxtriggers: How many times the minor watchdog timer can be tripped before the test is terminated. Default is 120.

Tracing, logging and statistics options:

  • -f : Set the statistics report frequency on screen. Default is 1 and default unit is seconds.
  • -trace_stat : Dumps all statistics in <scenario_name>_.csv file. Use the ‘-h stat’ option for a detailed description of the statistics file content.
  • -stat_delimiter : Set the delimiter for the statistics file
  • -stf : Set the file name to use to dump statistics
  • -fd : Set the statistics dump log report frequency. Default is 60 and default unit is seconds.
  • -periodic_rtd : Reset response time partition counters each logging interval.
  • -trace_msg : Displays sent and received SIP messages in __messages.log
  • -message_file : Set the name of the message log file.
  • -message_overwrite: Overwrite the message log file (default true).
  • -trace_shortmsg : Displays sent and received SIP messages as CSV in <scenario file name>__shortmessages.log
  • -shortmessage_file: Set the name of the short message log file.
  • -shortmessage_overwrite: Overwrite the short message log file (default true).
  • -trace_counts : Dumps individual message counts in a CSV file.
  • -trace_err : Trace all unexpected messages in __errors.log.
  • -error_file : Set the name of the error log file.
  • -error_overwrite : Overwrite the error log file (default true).
  • -trace_error_codes: Dumps the SIP response codes of unexpected messages to <scenario file name>__error_codes.log.
  • -trace_calldebug : Dumps debugging information about aborted calls to <scenario_name>__calldebug.log file.
  • -calldebug_file : Set the name of the call debug file.
  • -calldebug_overwrite: Overwrite the call debug file (default true).
  • -trace_screen : Dump statistic screens in the <scenario_name>__screens.log file when quitting SIPp. Useful to get a final status report in background mode (-bg option).
  • -screen_file : Set the name of the screen file.
  • -screen_overwrite: Overwrite the screen file (default true).
  • -trace_rtt : Allow tracing of all response times in __rtt.csv.
  • -rtt_freq : freq is mandatory. Dump response times every freq calls in the log file defined by -trace_rtt. Default value is 200.
  • -trace_logs : Allow tracing of actions in __logs.log.
  • -log_file : Set the name of the log actions log file.
  • -log_overwrite : Overwrite the log actions log file (default true).
  • -ringbuffer_files: How many error, message, shortmessage and calldebug files should be kept after rotation?
  • -ringbuffer_size : How large should error, message, shortmessage and calldebug files be before they get rotated?
  • -max_log_size : What is the limit for error, message, shortmessage and calldebug file sizes.

Signal handling:

SIPp can be controlled using POSIX signals. The following signals are handled: USR1: Similar to pressing the ‘q’ key. It triggers a soft exit of SIPp. No more new calls are placed and all ongoing calls are finished before SIPp exits. Example: kill -SIGUSR1 732 USR2: Triggers a dump of all statistics screens in <scenario_name>__screens.log file. Especially useful in background mode to know what the current status is. Example: kill -SIGUSR2 732

Exit codes:

Upon exit (on fatal error or when the number of asked calls (-m option) is reached, SIPp exits with one of the following exit code: 0: All calls were successful 1: At least one call failed 97: Exit on internal command. Calls may have been processed 99: Normal exit without calls processed -1: Fatal error -2: Fatal error binding a socket

Debugging

Issue1  The commonName field needed to be supplied and was missing 

Solution Given the common name while generating the certs

Issue2 If cmake error appears such as “command not found: cmake” then 

solutionsudo apt-get install build-essential cmake

References :

Gstreamer

GStreamer ( LGPL )ia a media handling library written in C for applicatioan such as streaming , recording, playback , mixing and editing attributes etc. Even enhnaced applicaiosn such as tsrancoding , media ormat conversion , streaming servers for embeeded devices ( read more about Gstreamer in RPi in my srticle here).
It encompases various codecs, filters and is modular with plugins developement to enhance its capabilities. Media Streaming application developers use it as part of their framework at either the broadcaster’s end or as media player.

gst-launch-1.0 videotestsrc ! videoconvert ! autovideosink

More detailed reading :

GStreamer-1.8.1 rtsp server and client on ubuntu – Install and configuration for a RTSP Streaming server and Client https://telecom.altanai.com/2016/05/20/gstreamer-1-8-1-rtsp-server-and-client-on-ubuntu/

crtmpserver + ffmpeg –

https://telecom.altanai.com/2016/06/19/crtmpserver-ffmpeg

Streaming / broadcasting Live Video call to non webrtc supported browsers and media players

 attempts of streaming / broadcasting Live Video WebRTC call to non WebRTC supported browsers and media players such as VLC , ffplay , default video player in Linux etc .

https://telecom.altanai.com/2015/02/17/streaming-broadcasting-live-video-call-to-non-webrtc-supported-browsers-and-media-players/

continue : Streaming / broadcasting Live Video call to non webrtc supported browsers and media players

httontinuation to the attempts / outcomes and problems in building a WebRTC to RTP media framework that successfully stream / broadcast WebRTC content to non webrtc supported browsers ( safari / IE ) / media players ( VLC )

https://telecom.altanai.com/2015/02/26/continue-streaming-broadcasting-live-video-call-to-non-webrtc-supported-browsers-and-media-players/

TO continue with basics of gstreamer keep reading

To list all packages of Gstreamer

pkg-config --list-all | grep gstreamer
  • gstreamer-gl-1.0 GStreamer OpenGL Plugins Libraries – Streaming media framework, OpenGL plugins libraries
  • gstreamer-bad-video-1.0GStreamer bad video library – Bad video library for GStreamer elements
  • gstreamer-tag-1.0 GStreamer Tag Library – Tag base classes and helper functions
  • gstreamer-bad-base-1.0 GStreamer bad base classes – Bad base classes for GStreamer elements
  • gstreamer-net-1.0GStreamer networking library – Network-enabled GStreamer plug-ins and clocking
  • gstreamer-sdp-1.0 GStreamer SDP Library – SDP helper functions
  • gstreamer-1.0 GStreamer – Streaming media framework
  • gstreamer-bad-audio-1.0 GStreamer bad audio library, uninstalled – Bad audio library for GStreamer elements, Not Installedgstreamer-allocators-1.0 GStreamer Allocators Library – Allocators implementation
  • gstreamer-player-1.0 GStreamer Player – GStreamer Player convenience library
  • gstreamer-insertbin-1.0 GStreamer Insert Bin – Bin to automatically and insertally link elements
  • gstreamer-plugins-base-1.0 GStreamer Base Plugins Libraries – Streaming media framework, base plugins libraries
  • gstreamer-vaapi-glx-1.0 GStreamer VA-API (GLX) Plugins Libraries – Streaming media framework, VA-API (GLX) plugins librariesgstreamer-codecparsers-1.0 GStreamer codec parsers – Bitstream parsers for GStreamer elementsgstreamer-base-1.0 GStreamer base classes – Base classes for GStreamer elements
  • gstreamer-app-1.0 GStreamer Application Library – Helper functions and base classes for application integration
  • gstreamer-vaapi-drm-1.0 GStreamer VA-API (DRM) Plugins Libraries – Streaming media framework, VA-API (DRM) plugins librariesgstreamer-check-1.0 GStreamer check unit testing – Unit testing helper library for GStreamer modules
  • gstreamer-vaapi-1.0 GStreamer VA-API Plugins Libraries – Streaming media framework, VA-API plugins libraries
  • gstreamer-controller-1.0 GStreamer controller – Dynamic parameter control for GStreamer elements
  • gstreamer-video-1.0 GStreamer Video Library – Video base classes and helper functions
  • gstreamer-vaapi-wayland-1.0 GStreamer VA-API (Wayland) Plugins Libraries – Streaming media framework, VA-API (Wayland) plugins libraries
  • gstreamer-fft-1.0 GStreamer FFT Library – FFT implementation
  • gstreamer-mpegts-1.0 GStreamer MPEG-TS – GStreamer MPEG-TS support
  • gstreamer-pbutils-1.0 GStreamer Base Utils Library – General utility functions
  • gstreamer-vaapi-x11-1.0 GStreamer VA-API (X11) Plugins Libraries – Streaming media framework, VA-API (X11) plugins libraries
  • gstreamer-rtp-1.0 GStreamer RTP Library – RTP base classes and helper functions
  • gstreamer-rtsp-1.0 GStreamer RTSP Library – RTSP base classes and helper functions
  • gstreamer-riff-1.0 GStreamer RIFF Library – RIFF helper functions
  • gstreamer-audio-1.0 GStreamer Audio library – Audio helper functions and base classes
  • gstreamer-plugins-bad-1.0 GStreamer Bad Plugin libraries – Streaming media framework, bad plugins libraries
  • gstreamer-rtsp-server-1.0 gst-rtsp-server – GStreamer based RTSP server

At the time of writing this article Gstreamer an much early version in 1.X , which was newer than its then stable version 0.x. Since then the library has updated many fold. summarising release highlights for major versions as the blog was updated over time .

Project : Making and IP survillance system using gstreamer and Janus

To build a turn-key easily deployable surveillance solution 

Features :

  1. Paring of Android Mobile with box
  2. Live streaming from Box to Android
  3. Video Recording inside the  box
  4. Auto parsing of recorded video around motion detection 
  5. Event listeners 
  6. 2 way audio
  7. Inbuild Media Control Unit
  8. Efficient use of bandwidth 
  9. Secure session while live-streaming

Modules

  1. Authentication ( OTP / username- password)
  2. Livestreaming on Opus / vp8 
  3. Session Security and keepalives for live-streaming sessions
  4. Sync local videos to cloud storage 
  5. Record and playback with timeline and events 
  6. Parsing and restructuring video ( transcoding may also be required ) 
  7. Coturn server for NAT and ICE
  8. Web platform on box ( user interface )+ NoSQL
  9. Web platform on Cloud server ( Admin interface )+ NoSQL
  10.  REST APIs for third party add-ons ( Node based )
  11. Android demo app for receiving the live stream and feeds

Varrying experiments and working gstreamer commands

Local Network Stream 

To create /dev/video0

modprobe bcm2835-v4l2

To stream on rtspserver using rpicamsrc using h264 parse

./gst-rtsp-server-1.4.4/examples/test-launch --gst-debug=2 '(rpicamsrc num-buffers=5000 ! 'video/x-h264,width=1080,height=720,framerate=30/1' ! h264parse ! rtph264pay name=pay0 pt=96 )'

./test-launch “( tcpclientsrc host=127.0.0.1 port=5000 ! gdpdepay ! rtph264pay name=pay0 pt=96 )”

pipe raspivid to tcpserversink

raspivid -t 0 -w 800 -h 600 -fps 25 -g 5 -b 4000000 -vf -n -o - | gst-launch-1.0 -v fdsrc ! h264parse ! gdppay ! tcpserversink host=127.0.0.1 port=5000;

Stream Video over local Network with 15 fps

raspivid -n -ih -t 0 -rot 0 -w 1280 -h 720 -fps 15 -b 1000000 -o - | nc -l -p 5001

streaming video over local network with 30FPS and higher bitrate

raspivid -n -t 0 -rot 0 -w 1920 -h 1080 -fps 30 -b 5000000 -o - | nc -l -p 5001

Recording

Audio record to file
Using arecord :

arecord -D plughw:1 -c1 -r 48000 -f S16_LE -t wav -v file.wav;

Using pulse :
pulseAudio -D

gst-launch-1.0 -v pulsesrc device=hw:1 volume=8.0 ! audio/x-raw,format=S16LE ! audioconvert ! voaacenc bitrate=48000 ! aacparse ! flvmux ! filesink location = "testaudio.flv";

Video record to file ( mpg)

gst-launch-1.0 -e rpicamsrc bitrate=500000 ! 'video/x-h264,width=640,height=480’ ! mux. avimux name=mux ! filesink location=testvideo2.mpg;

Video record to file ( flv )

gst-launch-1.0 -e rpicamsrc bitrate=500000 ! video/x-h264,width=320,height=240,framerate=10/1 ! h264parse ! flvmux ! filesink location="testvieo.flv";

Video record to file ( h264)
gst-launch-1.0 -e rpicamsrc bitrate=500000 ! filesink location=”raw3.h264″;

Video record to file ( mp4)

gst-launch-1.0 -e rpicamsrc bitrate=500000 ! video/x-h264,width=320,height=240,framerate=10/1 ! h264parse ! mp4mux ! filesink location=video.mp4;

Audio + Video record to file ( flv)

gst-launch-1.0 -e /
rpicamsrc bitrate=500000 ! /
video/x-h264,width=320,height=240,framerate=10/1 ! h264parse ! muxout. /
pulsesrc volume=8.0 ! /
queue ! audioconvert ! voaacenc bitrate=65536 ! aacparse ! muxout. /
flvmux name=muxout streamable=true ! filesink location ='test44.flv';

Audio + Video record to file ( flv) using pulsesrc

gst-launch-1.0 -v --gst-debug-level=3 pulsesrc device="alsa_input.platform-asoc-simple-card.0.analog-stereo" volume=5.0 mute=FALSE ! audio/x-raw,format=S16LE,rate=48000,channels=1 ! audioresample ! audioconvert ! voaacenc ! aacparse ! flvmux ! filesink location="voicetest.flv";

Audio + Video record to file (mp4)

gst-launch-1.0 -e /
rpicamsrc bitrate=500000 ! /
video/x-h264,width=320,height=240,framerate=10/1 !s h264parse ! muxout. /
pulsesrc volume=4.0 ! /
queue ! audioconvert ! voaacenc ! muxout. /
flvmux name=muxout streamable=true ! filesink location = 'test224.mp4';

Streaming

stream raw Audio over RTMP to srtmpsink

gst-launch-1.0 pulsesrc device=hw:1 volume=8.0 ! /
audio/x-raw,format=S24LE ! audioconvert ! voaacenc bitrate=48000 ! aacparse ! flvmux ! rtmpsink location = “rtmp://192.168.0.3:1935/live/test”;

stream AACpparse Audio over RTMP to srtmpsink

gst-launch-1.0 -v --gst-debug-level=3 pulsesrc device="alsa_input.platform-asoc-simple-card.0.analog-stereo" volume=5.0 mute=FALSE ! audio/x-raw,format=S16LE,rate=48000,channels=1 ! audioresample ! audioconvert ! voaacenc ! aacparse ! flvmux ! rtmpsink location="rtmp://www.altani.com:1935/voice/1/test";

stream Video over RTMP

gst-launch-1.0 -e rpicamsrc bitrate=500000 ! /
video/x-h264,width=320,height=240,framerate=6/1 ! h264parse ! /
flvmux ! rtmpsink location = ‘rtmp://52.66.125.31:1935/live/test live=1’;

stream Audio + video over RTMP from rpicamsrc , framerate 10

gst-launch-1.0 rpicamsrc bitrate=500000 ! video/x-h264,width=320,height=240,framerate=10/1 ! h264parse ! muxout. pulsesrc volume=8.0 ! queue ! audioconvert ! voaacenc bitrate=65536 ! aacparse ! muxout. flvmux name=muxout streamable=true ! rtmpsink location ='rtmp://www.altanai.com/live/test44';

stream Audio + video over RTMP from rpicamsrc , framerate 30

gst-launch-1.0 rpicamsrc bitrate=500000 ! video/x-h264,width=1280,height=720,framerate=30/1 ! h264parse ! muxout. pulsesrc ! queue ! audioconvert ! voaacenc bitrate=65536 ! aacparse ! muxout. flvmux name=muxout ! queue ! rtmpsink location ='rtmp://www.altanai.com/live/test44';

VOD ( video On Demand )

Stream h264 file over RTMP

gst-launch-1.0 -e filesrc location="raw3.h264" ! video/x-h264 ! h264p
arse ! flvmux ! rtmpsink location = 'rtmp://www.altanai.com/live/test';

Stream flv file over RTMP

gst-launch-1.0 -e filesrc location=”testvieo.flv” ! /
video/x-h264,width=320,height=240,framerate=10/1 ! h264parse ! /
flvmux ! rtmpsink location = 'rtmp://192.168.0.3:1935/live/test';

Github Repo for Livestreaming

https://github.com/altanai/Livestreaming

Contains code for Android and ios Publishers , players on various platforms including HLS and Flash , streamings servers , Wowza playing mosules , webrtc broadcast

Gstreamer 1.8.0 – 24 March 2016

Features Hardware-accelerated zero-copy video decoding on Android

New video capture source for Android using the android.hardware.Camera API

Windows Media reverse playback support (ASF/WMV/WMA)

tracing system provides support for more sophisticated debugging tools

high-level GstPlayer playback convenience API

Initial support for the new Vulkan API

Improved Opus audio codec support: Support for more than two channels; MPEG-TS demuxer/muxer can handle Opus; sample-accurate encoding/decoding/transmuxing with Ogg, Matroska, ISOBMFF (Quicktime/MP4), and MPEG-TS as container; new codec utility functions for Opus header and caps handling in pbutils library. The Opus encoder/decoder elements were also moved to gst-plugins-base (from -bad), and the opus RTP depayloader/payloader to -good.

Asset proxy support in the GStreamer Editing Services

GStreamer 1.16.0 – 19 April 2019.

GStreamer WebRTC stack gained support for data channels for peer-to-peer communication based on SCTP, BUNDLE support, as well as support for multiple TURN servers.

AV1 video codec support for Matroska and QuickTime/MP4 containers and more configuration options and supported input formats for the AOMedia AV1 encoder

Closed Captions and other Ancillary Data in video

planar (non-interleaved) raw audio

GstVideoAggregator, compositor and OpenGL mixer elements are now in -base

New alternate fields interlace mode where each buffer carries a single field

WebM and Matroska ContentEncryption support in the Matroska demuxer

new WebKit WPE-based web browser source element

Video4Linux: HEVC encoding and decoding, JPEG encoding, and improved dmabuf import/export

Hardware-accelerated Nvidia video decoder gained support for VP8/VP9 decoding, whilst the encoder gained support for H.265/HEVC encoding.

Improvements to the Intel Media SDK based hardware-accelerated video decoder and encoder plugin (msdk): dmabuf import/export for zero-copy integration with other components; VP9 decoding; 10-bit HEVC encoding; video post-processing (vpp) support including deinterlacing; and the video decoder now handles dynamic resolution changes.

ASS/SSA subtitle overlay renderer can now handle multiple subtitles that overlap in time and will show them on screen simultaneously

Meson build feature-complete (with the exception of plugin docs) and it is now the recommended build system on all platforms. The Autotools build is scheduled to be removed in the next cycle.

GStreamer Rust bindings and Rust plugins module

GStreamer Editing Services allows directly playing back serialized edit list with playbin or (uri)decodebin

References :

https://gstreamer.freedesktop.org

OTT ( Over the Top ) Communication applications

Market trends are not in favour of Telecom Service /providers with increasing use of OTT ( Over The Top ) applications like WhatsApp, Facebook messenger, Google hangouts, skype, Viber, etc. OTT applications are often blamed to take a stake in voice traffic revenue by using IP calls where the telco could’ve charged based on its rate plan of call seconds. This especially intensifies for long-distance or international calls where customers can use OTT providers instead of expensive telco rate plans.

What is an OTT ?

An Over The Top ( OTT ) application is one which provides communication services over Internet . Therefore these bypass the communication billing system setup by a Telecom Operator , resulting in no gain or loss of revenue to Telecom Operator who is providing the Internet service to user in first place .

Hence we see that OTT are major source of concern for Telecom Operators whose traditional and obviously expensive ( when compared to OTTs free service ) billing models are facing disruption .


Telecom Regulatory bodies around the world

The telecom regulatory authorities in some of the countries are for example listed as :

  • Afghanistan Telecom Regulatory Authority (ATRA) – Afganistan
  • Australian Communications and Media Authority (ACMA) – Australia
  • Bangladesh Telecommunication Regulatory Commission (BTRC) – Bnagaladesh
  • Canadian Radio-television and Telecommunications Commission (CRTC) – Canada
  • Ministry of Information Industry (MII) – China
  • Autorité de Régulation des Communications Électroniques et des Postes (ARCEP) – France
  • Bundesnetzagentur (BNA) – Germany
  • Telecom Regulatory Authority of India (TRAI) – India
  • Ministry for Communications and Informatization of the Russian Federation (Minsvyaz) – Russia
  • Infocomm Development Authority of Singapore (IDA) – Singapore
  • Independent Communications Authority of South Africa (ICASA) – south Africa
  • Federal Communications Commission (FCC) , National Association of Regulatory Utility Commissioners (regulators of individual states) (NARUC) , CTIA – The Wireless Association (CTIA) – USA

Such telecom regulatory bodies get to decide whether to enforce differential price to end consumers for using OTT so that telecom service providers can benefit or keep the Internet fair and open by passing Net Neutrality Laws and Bills and amendments .

What is Net Neaurality ?

The fundamental principle of Net Neurality is that Telecom Operators should not block , slow down or charge consumers extra for using other services as their means of communication. This states that it is wrong to charge users above the regular data rates for using VOIP apps and other internet based communication services.

The following counteries have adopted principles of Net Neutrality by passing bills or making law .

  • Chile – Chile’s General Law of Telecommunications, “No [ISP] can block, interfere with, discriminate, hinder, nor restrict the right of any Internet user of using, send, receive, or offer any content, application, or legitimate service through the Internet, as well as any activity or legitimate use conducted through the Internet.”
  • Brazil – ” Internet Bill of Rights ” makes equal access to internet mandatory in Brazil .
  • Netherlands – Even European Union has adopted Netherlands’ Net Neutrality amendment which reads “traffic should be treated equally, without discrimination, restriction or interference, independent of the sender, receiver, type, content, device, service or application.”
  • USA – Citizens make ‘We the People’ platform to ‘Restore Net Neutrality By Directing the Federal Communications Commission (FCC) to Classify Internet Providers as ‘Common Carriers‘. Therefore not allowing them to either throttle speed by paid prioritization , discriminate in pricing or block any broadband access to legal content .  Above facts are from this tech.firstpost.com article.

 

Inspite of the fact that I Support Net Neutrality with all my heart , as a telecom engineer I understand the cost investment made by Telecom operators in providing am efficient communication network to its subscribers ( Access , Network and Application layers ). Therefor I do have my sympathies with the Telcos and to level out the wide ranging conflict between Telcos and  ISP ( Internet Service Providers ) , I pen down the following points which reflect the Telecom Operators Problems and also highlight the solutions that can be adopted to counteract the OTT threat .

Depleting revenue for Telco

  1. Messaging – OTT messaging cost operators $13.9 billion, or 9% of message revenue in 2013
  2. Voice – Voice services under threat from VOIP services like Skype, Viber
  3. OTT apps – Voice & Message apps have been the operator’s biggest headache. Its time Operator should launch its own OTT Services
  4. Data Traffic – The utilization is yet to reach its peak. Will face challenges from  WiFi access
  5. Critical Pain areas – Erosion of Operator’s revenue from voice and (especially) messaging

Telco’s OTT Application

At this stage, a telecom Service provider / Operator must enter the apps market and bring forth a Messenger which is more powerful, interactive and awesome than an OTT application. Fortunately, the Operator can always couple this application with his background telecom infrastructure to provide the edge in performance and functionalities.

Road block while developing a OTT application for a Telecom Service Provider :

  • Investment in Data Network is not being utilized due to lack of service
  • Reuse of Existing business Logic and extending the service reach across devices and networks is tough
  • Operator already has full fledged network Infrastructure in Place
  • Desire for minimum CAPEX while investing in new technologies
  • compete with OTT players and open new revenue streams is a challenge

Next we find the way of solving the problems and integrating them together to form a Solution .

OTT Application for Telecom Service provider

  • Introduce new services to benefit from investment on Data Plans and Bandwidth
  • Expose REST API to enable 3trd party Integration with existing network Infrastructure
  • Partner with individual OTT players to make new services  that do not compete on core competencies like billing etc
  • Use protocols like SIP that reduce CAPEX and have goto market more quickly
  • Go for enriched service that lead to better user experience

This write-up outlines the process of creating an OTT application for a Telecom Service Provider. Components for the application include cloud Address Book, Video Chatting, Location share, Contact synchronization ,REST-based thin client , OS and device agnostic etc shown in the figure below:

telco's OTT app
telco’s OTT app

The Application  is designed to close knit with Operator’s own infrastructure hence the crucial entities like Network Address Book , Location Service are synced and fetched from Backend Network .

OTT application Feature Overview

Smart Address Book

  • Automatic: Get contacts from Gmail, Facebook
  • Fast search by first, last name, frequently
  •   dialed number
  • Roadmap: View calendar events
  • Personal: Get image from Gmail and display in   contacts list

Geo Location

  • Share own location during chatting
  • Get map for calculating the distance between two chat users
  • Roadmap : Trigger device (say Switch on/off AC before reaching home) from a threshold distance away from home   location

Messaging

  • Ad-hoc Chat
  • Session Based Chat
  • Voice Input for texting
  • Presence information of contacts
  • RoadMap: Legacy message integration

Telephony

  • Voice call to mobile
  • Voice call to PSTN
  • Video call to other @imAll user
  • Share images during voice call to other

Device agnostic

  • Compatible with IOS, windows
  • Can run as native app on ipad
  • Can run as browser client on windows
  • RoadMap: native app for android, windows phone,blackberry10

Roadmap

features of Unified Communications ( UC)
features of Unified Communications ( UC)
  • To upgrade the application and provide enganced and enrich service support the I propose the following roadmap.
  • From plain vanilla voice and video calling ( supported by every other OTT application ) our application should progress towards  legacy telecom support whihc included PSTN , GSM , ISDN etc . This requires backbone of telecom network and a good setup for media codec conversion to suit various legacy media codecs .

Road Map  from Traditional to New age services 

  1. Voice and video calling
  2. Legacy services support like MMS and SMS
  3. Integration with 3rd party Vendors
  4. Give new enriched services like Multilingual support , file transfer , screen-sharing etc
  5. give facility to integrated web plugins for web calling

To keep the interest of customers it is essential that the application be supported on other popular OTT services like skype  , Gtalk . for exmaple a caller should be able to make call from Skype  / Gtalk to our application .Multilingual capabilities, support for larger protocol spectrum will just act like icing on the cake .

How does it benefit the Operator??

  1.  Saves on development cost and time
  2.  Device Agnostic OTT Applications
  3. Simplified Service deployment
  4. Saves licensing cost per client
  5. Reuses existing Messaging and   Address Book service logic.
  6. Open New Revenue Streams for operator
  7. No separate SIP stack required for the client
  8.  Faster Time to Market

Update : At the time of writing this post I did not anticipate the wave of change that bring focus on subjects like “net neutrality” , ” Save the internet” and “free internet”. However now most of the telcos providers have either joined the bandwagon by prividing SIP trunk endpoinst for cloud teelphony providers ( eg twilio, Google Calls) or have made their own IP call application for B2B customers.


Business Challenges for a telecom service provider

With the fast pace of telecom evolution both towards the access network front ( ie GSM , UMTS , 3G , 4G , LTE , VOLTE ) to core network side ( ie application servers , registrar , proxies , gateway , media server etc ) a CSP ( content service provider ) is trying hard to keep up with the user expectation . The user expects a plethora of services , reduced cost and high speed bandwidth . If this was not enough a CSP also has competition  OTT (   Over The Top ) Players who provide communication and messaging for FREE .

You can read on how OTT’s players are disruption the revenue streams of traditional telecom operators and how can Telco’s develop  their own OTT app , integrated with their backend system to answer to that challenge  here – OTT ( Over the Top ) Communication applications

The following points outline the major business challenges faced by telecom operators today .

Technology Evolution Challenges

  •  The increased data speeds and further more increasing hunger for the data overwhelms the existing network infrastructures.
  • Ensure uniform service experience across the network technologies to check the customer churn.
  • Access / Radio Technology independent delivery of services.
  • Enhance Reuse for exiting investments.

Multiple Service Platform Challenges

  • Typical network constitutes of Multiple Service Platforms increasing network complexity and integration challenges many fold.
  • Heterogeneous multiple SDP Solutions typically deployed to cater to Multiple Types of Networks/ Standards/Variants
  • Service Islands makes introduction of seamless services a challenging task for the CSP

Transport Upgrade and Convergence of Wireless Wireline

  • Retain investments in copper wire systems while migrating towards next generation Fiber Optic systems.
  • Severe competition among wire-line and wireless operators to provide latest services to retain subscriber base.
  • Fixed Mobile Convergence leading to a diminishing gap among the revenue shares of various operators in the space, and leading to losses for wire-line only players.