- PCAP Collection
- Continuous Integration and Delivery Automation using Jenkins
- Configuration management using chef cookbooks
- Compute virtualization and containerization using Docker
- Infrastructure management using terraform
- Monitoring, debugging, logs analysis and alarms
- SBC and proxy gateways failures
- DNS caching alerts
- Disk usage alert
- Elevated Call failure SIP 503 or Call timeout SIP 408
- cron service or process alerts
- DB connections / connection pool process alerts
- port check, unexpected result alert
- cron zombie process checks alerts
- Bulk calls checks
- Process control supervisor or pm2 checks
- Health and load on the reverse proxy, load balancer as Nginx alerts
- VPN checks
- SSL cert expiry checks
- Health of Task scheduling services such as RabbitMQ, Celery Distributed Task Queue
- Cluster status
- Status of Critical Application Server
- Programming or Syntax error in the production environment
- Distributed memory caching – Redis, Memcache
- SMS service using smsc on Kannel
This article focuses on the various tools required to operate and maintain a growing, large-scale VoIP platform, mostly classified under the following roles:
- PCAP Collections
- CICD on Jenkins pipeline
- Configuration management using chef cookbooks
- virtualization and containerization using Docker
- Infrastructure management using terraform / Kubernetes
- Logs Analysis and Alarming
PCAP Collection
Packet Capture (PCAP) is an API for capturing live network packets. Besides call tracking, audits and RTC visualizers, PCAPs are widely used for debugging faults, such as when a production alarm fires on a high rate of failures.
Example use case: a production alert on a SIP 503 response, or a log entry from a gateway, is not as helpful as tracking a call's session ID in PCAPs across the various endpoints in and out of the network to determine the point of failure. Debugging involves:
1. Pre-specified SIP / RTP and related protocol capture
PCAP capture examples
tcpdump -i any -w alltraffic.pcap
rtpbreak -P2 -t100 -T100 -d logz -r alltraffic.pcap
2. Call Session-ID to uniquely identify failed calls among tens of thousands of packets
3. An analyzer such as Wireshark or tshark to track the packets
TShark inspection examples
brew cask install wireshark
tshark -r alltraffic.pcap -R "sip.CSeq.method eq INVITE"
Some useful call details captured from PCAPs:
- DTMF – both in-band and out-of-band DTMF for every call, along with timestamps.
- Codec negotiations – extracting codecs from the PCAP lets us
- validate later whether codecs were changed without a prior SIP message,
- determine which codec caused a call to be hung up with a 488 error code.
- SIP errors – track deviations from standard SIP messaging.
- Identify known erroneous SIP messaging scenarios, such as MITM or replay attacks.
- RTCP media stats – extract jitter, loss and RTT from RTCP reports for both the incoming and outgoing streams.
- Identify media or ACK timeouts:
- check whether a party has not sent any media packets for > 60 s (the media timeout threshold),
- detect when a call is hung up due to an ACK timeout.
- Audio streams – under GDPR, take explicit permission from users before storing audio streams.
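Since tracking by session ID is the key step above, a small helper can build the tshark display filter for one call; a minimal sketch (the Call-ID value and pcap filename are illustrative):

```shell
# sip_call_filter CALL_ID: build a tshark display filter that selects all
# SIP messages belonging to one call, identified by its Call-ID header.
sip_call_filter() {
  echo "sip.Call-ID == \"$1\""
}
# usage (assumes alltraffic.pcap from the capture step above):
# tshark -r alltraffic.pcap -Y "$(sip_call_filter 1234abcd@10.0.0.1)"
```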
Continuous Integration and Delivery Automation using Jenkins
CI/CD provides a continuous delivery hub that distributes work across multiple machines, helping drive builds, tests and deployments across multiple platforms.
Jenkins is a self-contained, Java-based program that is extensible using plugins.
Jenkins pipeline – orchestrates and automates building, testing and deploying a project in Jenkins.
Configuration management using chef cookbooks
Alternatives include Puppet and Ansible, which are also cross-platform configuration management tools.
Compute virtualization and containerization using Docker
Docker containers can be used instead of virtual machines (such as VirtualBox) to isolate applications and stay OS- and platform-independent.
This makes distributed development possible and automates deployment.
Some useful docker container subcommands:
- stop Stop one or more running containers
- top Display the running processes of a container
> docker top 4417600169e8
UID PID PPID C STIME TTY TIME CMD
root 9913 9888 0 08:50 ? 00:00:00 bash /point.sh
root 10083 9913 0 08:50 ? 00:00:01 /usr/sbin/worker
root 10092 10083 0 08:50 ? 00:00:02 /micro-service
- unpause Unpause all processes within one or more containers
- update Update configuration of one or more containers
- wait Block until one or more containers stop, then print their exit codes
See all images
> docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
sipcapture/homer-cron latest fb2243f90cde 3 hours ago 476MB
sipcapture/homer-kamailio latest f159d46a22f3 3 hours ago 338MB
sipcapture/heplify latest 9f5280306809 21 hours ago 9.61MB
<none> <none> edaa5c708b3a
See all stats
> docker stats
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
f42c71741107 homer-cron 0.00% 52KiB / 994.6MiB 0.01% 2.3kB / 0B 602MB / 0B 0
0111765091ae mysql 0.04% 452.2MiB / 994.6MiB 45.46% 1.35kB / 0B 2.06GB / 49.2kB 22
Run a command from within a Docker container
docker exec -it <container-id> bash
First, list all running containers
docker ps
then select a container and enter its bash shell
docker exec -it 0472a5127fff bash
To edit or update a file inside Docker, either install vim each time you log in to a fresh container:
apt-get update
apt-get install vim
or add this to dockerfile
RUN ["apt-get", "update"]
RUN ["apt-get", "install", "-y", "vim"]
Check whether ngrep is installed; if not, install it, then run ngrep inside that Docker container to capture SIP logs:
apt update
apt install ngrep
ngrep -p "14795778704" -W byline -d any port 5060
docker volume – Volumes are used for persisting data generated by and used by Docker containers.
Docker volumes have advantages over bind mounts: they are easier to back up or migrate, are managed via the Docker API, and can be safely shared among multiple containers.
docker stack – lets you manage a cluster of Docker containers through Docker Swarm; the stack can be defined via a docker-compose.yml file
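As an illustration, such a stack could be defined in a minimal docker-compose.yml; the service names below are examples (the images appear in the docker images listing above), a sketch rather than a recommended production layout:

```yaml
version: "3.7"
services:
  kamailio:
    image: sipcapture/homer-kamailio:latest
    ports:
      - "5060:5060/udp"
    deploy:
      replicas: 2          # honoured by docker swarm when deployed as a stack
  db:
    image: mysql:5.7
    volumes:
      - dbdata:/var/lib/mysql   # named volume: data persists across containers
volumes:
  dbdata:
```

Deploy it with `docker stack deploy -c docker-compose.yml voip` (the stack name "voip" is arbitrary).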
docker service
- create Create a new service
- inspect Display detailed information on one or more services
- logs Fetch the logs of a service or task
- ls List services
- ps List the tasks of one or more services
- rm Remove one or more services
- rollback Revert changes to a service’s configuration
- scale Scale one or multiple replicated services
- update Update a service
Run docker containers
sample run command
docker run -it -d --name opensips -e ENV=dev imagename:2.2
The -it flags attach an interactive tty in the container.
-e sets environment variables.
-d runs the container in the background and prints the container id.
Remove docker entities
To remove all stopped containers, all dangling images, and all unused networks:
docker system prune -a
To remove all unused volumes
docker system prune --volumes
To remove all stopped containers
docker container prune
Sometimes Docker images keep piling up along with stopped containers, for example:
REPOSITORY TAG IMAGE ID CREATED SIZE
<none> <none> d1dcfe2438ae 15 minutes ago 753MB
<none> <none> 2d353828889b 16 hours ago 910MB
...
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0dd6698a7517 2d353828889b "/entrypoint.sh" 13 minutes ago Exited (137) 13 minutes ago hardcore_wozniak
To remove such images and their containers, first stop and remove the containers:
docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)
then remove all dangling images
docker rmi $(docker images -aq --filter dangling=true)
Infrastructure management using terraform
Terraform is used for building, changing and versioning infrastructure.
Infra as Code – can manage anything from a single application to entire datacentres via configuration files, from which an execution plan is created.
It can manage low-level components such as compute instances, storage, and networking, as well as high-level components such as DNS entries, SaaS features, etc.
Resource Graph – builds a graph of all your resources
tfenv can be used to manage terraform versions
> brew unlink terraform
tfenv install 0.11.14
tfenv list
Terraform configuration language
This is used for declaring resources and describing infrastructure; the associated files have a .tf or .tf.json file extension.
Group of resources can be gathered into a module. Terraform configuration consists of a root module, where evaluation begins, along with a tree of child modules created when one module calls another.
Example: launch a single AWS EC2 instance, file server1.tf
provider "aws" {
profile = "default"
region = "us-east-1"
}
resource "aws_instance" "server1" {
ami = "ami-2757fxxx"
instance_type = "t2.micro"
}
note : AMI IDs are region specific.
The profile attribute here refers to the AWS credentials file in ~/.aws/credentials.
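Building on server1.tf, hard-coded values can be lifted into input variables; a sketch (the variable name is illustrative, and the "${var.region}" interpolation syntax works on both pre- and post-0.12 Terraform):

```hcl
variable "region" {
  default = "us-east-1"
}

provider "aws" {
  profile = "default"
  region  = "${var.region}"
}
```

The default can then be overridden at plan time, e.g. `terraform plan -var 'region=us-west-2'`.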
Terraform command line interface (CLI)
The CLI is the engine for evaluating and applying Terraform configurations.
It uses plugins called providers, each of which defines and manages a set of resource types.
Command Usage: terraform [-version] [-help] [args]
- apply Builds or changes infrastructure
- console Interactive console for Terraform interpolations
- destroy Destroy Terraform-managed infrastructure
- env Workspace management
- fmt Rewrites config files to canonical format
- get Download and install modules for the configuration
- graph Create a visual graph of Terraform resources
- import Import existing infrastructure into Terraform
- init Initialize a Terraform working directory
- output Read an output from a state file
- plan Generate and show an execution plan
- providers Prints a tree of the providers used in the configuration
- refresh Update local state file against real resources
- show Inspect Terraform state or plan
- taint Manually mark a resource for recreation
- untaint Manually unmark a resource as tainted
- validate Validates the Terraform files
- version Prints the Terraform version
- workspace Workspace management
- 0.12upgrade Rewrites pre-0.12 module source code for v0.12
- debug Debug output management (experimental)
- force-unlock Manually unlock the terraform state
- push Obsolete command for Terraform Enterprise legacy (v1)
- state Advanced state management
terraform init
Initialize a working directory containing Terraform configuration files.
terraform validate
Runs checks that verify whether a configuration is internally consistent, regardless of any provided variables or existing state.
Kubernetes
Kubernetes is a container orchestration platform automating deployment, scaling, and management of containerized applications. It can deploy to a cluster of computers, automating distribution and scheduling as well.
Service discovery and load balancing – gives Pods their own IP addresses and a single DNS name for a set of Pods, and can load-balance across them.
Automatic bin packing – Automatically places containers based on their resource requirements and other constraints, while not sacrificing availability. Mix critical and best-effort workloads in order to drive up utilization and save even more resources.
Storage orchestration – Automatically mount the storage system of your choice, whether from local storage, a public cloud provider such as GCP or AWS, or a network storage system such as NFS, iSCSI, Gluster, Ceph, Cinder, or Flocker.
Self-healing – Restarts containers that fail, replaces and reschedules containers when nodes die, kills containers that don’t respond to your user-defined health check, and doesn’t advertise them to clients until they are ready to serve.
Automated rollouts and rollbacks – progressively rolls out changes to your application or its configuration, while monitoring application health to ensure it doesn’t kill all your instances at the same time.
Secret and configuration management – Deploy and update secrets and application configuration without rebuilding your image and without exposing secrets in your stack configuration.
Batch execution – manage batch and CI workloads, replacing containers that fail, if desired.
Horizontal scaling – Scale application up and down with a simple command, with a UI, or automatically based on CPU usage.
Create a minikube cluster and deploy pods
Prerequisites: docker, curl, redis, and others
install minikube
curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
chmod +x minikube
install minikube /usr/local/bin
Install kubectl
snap install kubectl --classic
ln -s /snap/bin/kubectl /usr/local/bin
Setup Minikube
minikube start --vm-driver=none
minikube addons enable registry-creds
kubectl -n kube-system create secret generic registry-creds-ecr
kubectl -n kube-system create secret generic registry-creds-gcr
kubectl -n kube-system create secret generic registry-creds-dpr
minikube addons configure registry-creds
Starting Kubernetes…
minikube version: v1.3.0
commit: 43969594266d77b555a207b0f3e9b3fa1dc92b1f
minikube v1.3.0 on Ubuntu 18.04
Running on localhost (CPUs=2, Memory=2461MB, Disk=47990MB) …
OS release is Ubuntu 18.04.2 LTS
Preparing Kubernetes v1.15.0 on Docker 18.09.5 …
kubelet.resolv-conf=/run/systemd/resolve/resolv.conf
Pulling images …
Launching Kubernetes …
Done! kubectl is now configured to use "minikube"
dashboard was successfully enabled
Kubernetes Started
Basic Commands
- start Starts a local kubernetes cluster
- status Gets the status of a local kubernetes cluster
- stop Stops a running local kubernetes cluster
- delete Deletes a local kubernetes cluster
- dashboard Access the kubernetes dashboard running within the minikube cluster
Images Commands:
- docker-env Sets up docker env variables; similar to ‘$(docker-machine env)’
- cache Add or delete an image from the local cache.
Configuration and Management Commands:
- addons Modify minikube’s kubernetes addons
- config Modify minikube config
- profile Profile gets or sets the current minikube profile
- update-context Verify the IP address of the running cluster in kubeconfig.
Networking and Connectivity Commands:
- service Gets the kubernetes URL(s) for the specified service in your local cluster
- tunnel tunnel makes services of type LoadBalancer accessible on localhost
Advanced Commands:
- mount Mounts the specified directory into minikube
- ssh Log into or run a command on a machine with SSH; similar to ‘docker-machine ssh’
- kubectl Run kubectl
Troubleshooting Commands:
- ssh-key Retrieve the ssh identity key path of the specified cluster
- ip Retrieves the IP address of the running cluster
- logs Gets the logs of the running instance, used for debugging minikube, not user code.
- update-check Print current and latest version number
kubectl
controls the Kubernetes cluster manager.
Basic Commands (Beginner):
- create Create a resource from a file or from stdin.
- expose Take a replication controller, service, deployment or pod and expose it as a new Kubernetes Service
- run Run a particular image on the cluster
- set Set specific features on objects
- explain Documentation of resources
- get Display one or many resources
- edit Edit a resource on the server
- delete Delete resources by filenames, stdin, resources and names, or by resources and label selector
Deploy Commands:
- rollout Manage the rollout of a resource
- scale Set a new size for a Deployment, ReplicaSet, Replication Controller, or Job
- autoscale Auto-scale a Deployment, ReplicaSet, or ReplicationController
Cluster Management Commands:
- certificate Modify certificate resources.
- cluster-info Display cluster info
- top Display Resource (CPU/Memory/Storage) usage.
- cordon Mark node as unschedulable
- uncordon Mark node as schedulable
- drain Drain node in preparation for maintenance
- taint Update the taints on one or more nodes
Troubleshooting and Debugging Commands:
- describe Show details of a specific resource or group of resources
- logs Print the logs for a container in a pod
- attach Attach to a running container
- exec Execute a command in a container
- port-forward Forward one or more local ports to a pod
- proxy Run a proxy to the Kubernetes API server
- cp Copy files and directories to and from containers.
- auth Inspect authorization
Advanced Commands:
- diff Diff live version against would-be applied version
- apply Apply a configuration to a resource by filename or stdin
- patch Update field(s) of a resource using strategic merge patch
- replace Replace a resource by filename or stdin
- wait Experimental: Wait for a specific condition on one or many resources.
- convert Convert config files between different API versions
- kustomize Build a kustomization target from a directory or a remote url.
Settings Commands:
- label Update the labels on a resource
- annotate Update the annotations on a resource
- completion Output shell completion code for the specified shell (bash or zsh)
Other Commands:
- api-resources Print the supported API resources on the server
- api-versions Print the supported API versions on the server, in the form of “group/version”
- config Modify kubeconfig files
- plugin Provides utilities for interacting with plugins.
- version Print the client and server version information
DevOps monitoring tools: Nagios
Manage Docker configs
- create Create a config from a file or STDIN
- inspect Display detailed information on one or more configs
- ls List configs
- rm Remove one or more configs
Manage containers
- attach Attach local standard input, output, and error streams to a running container
- commit Create a new image from a container’s changes
- cp Copy files/folders between a container and the local filesystem
- create Create a new container
- diff Inspect changes to files or directories on a container’s filesystem
- exec Run a command in a running container
- export Export a container’s filesystem as a tar archive
- inspect Display detailed information on one or more containers
- kill Kill one or more running containers
- logs Fetch the logs of a container
- ls List containers
- pause Pause all processes within one or more containers
- port List port mappings or a specific mapping for the container
- prune Remove all stopped containers
- rename Rename a container
- restart Restart one or more containers
- rm Remove one or more containers
- run Run a command in a new container
- start Start one or more stopped containers
- stats Display a live stream of container(s) resource usage statistics
- stop Stop one or more running containers
- top Display the running processes of a container
- unpause Unpause all processes within one or more containers
- update Update configuration of one or more containers
- wait Block until one or more containers stop, then print their exit codes
Alternatives: Sensu multi-cloud monitoring or Raygun
Monitoring, debugging, logs analysis and alarms
Aggregate logs into Logstash and provide search and filtering via Elasticsearch and Kibana. These can also trigger alerts or notifications on specific keyword searches in the logs, such as WARNING, ERROR or call_failed. Some common alert scenarios include:
SBC and proxy gateway failures – check the state of the VM instances
DNS caching alerts – Domain Name System (DNS) caching, a Dynamic Host Configuration Protocol (DHCP) server, router advertisement and network boot alerts from service such as dnsmasq
Disk usage alert – set up alerts at 80% usage and trigger an alarm to either manually prune or create timely automatic archive backups.
Check the percentage of disk usage:
df -h
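The df check above can be wrapped into a small cron-able alert; a sketch, assuming the 80% threshold suggested above:

```shell
# disk_alert [threshold]: print an ALERT line and exit non-zero when any
# mounted filesystem is above the given usage percentage (default 80).
disk_alert() {
  df -P | awk -v t="${1:-80}" 'NR > 1 {
    use = $5; sub(/%/, "", use)
    if (use + 0 > t) { print "ALERT:", $6, "at", use "%"; bad = 1 }
  } END { exit bad }'
}
# e.g. from cron: disk_alert 80 || <trigger the alarm / archive job>
```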
Mostly it is either the log files or PCAP recordings that need to be archived in external storage.
Use logrotate – it rotates, compresses, and mails system logs.
Logrotate config file – run with logrotate -vf /etc/logrotate.conf
/var/log/messages {
    rotate 5
    weekly
    postrotate
        /usr/bin/killall -HUP syslogd
    endscript
}
Elevated call failure SIP 503 or call timeout SIP 408 – a high frequency of failed calls indicates an internal issue and must be followed up by smoke testing the entire system to identify probable causes, such as undetected frequent crashes of an individual component, or blacklisting by a destination endpoint
sudo tail -f sip.log | grep 503
or
sudo tail -f sip.log | grep WARNING
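Such greps can feed a simple threshold check; a sketch (the log path and alert limit are illustrative):

```shell
# failure_count LOGFILE PATTERN: count log lines matching a SIP failure code
failure_count() {
  grep -c -- "$2" "$1"
}
# e.g. raise an alarm when a window of the log has too many 503s:
# [ "$(failure_count /var/log/sip.log 503)" -gt 50 ] && echo "raise 503 alarm"
```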
Cron service or process alerts – check the process tree:
ps axf
PID TTY STAT TIME COMMAND
2 ? S 0:00 [kthreadd]
3 ? I< 0:00 \_ [rcu_gp]
4 ? I< 0:00 \_ [rcu_par_gp]
5 ? I 0:00 \_ [kworker/0:0-eve]
6 ? I< 0:00 \_ [kworker/0:0H-kb]
7 ? I 0:00 \_ [kworker/0:1-eve]
8 ? I 0:00 \_ [kworker/u4:0-nv]
9 ? I< 0:00 \_ [mm_percpu_wq]
10 ? S 0:00 \_ [ksoftirqd/0]
11 ? I 0:00 \_ [rcu_sched]
12 ? S 0:00 \_ [migration/0]
13 ? S 0:00 \_ [cpuhp/0]
14 ? S 0:00 \_ [cpuhp/1]
15 ? S 0:00 \_ [migration/1]
16 ? S 0:00 \_ [ksoftirqd/1]
17 ? I 0:00 \_ [kworker/1:0-eve]
18 ? I< 0:00 \_ [kworker/1:0H-kb]
or check the cron status
service cron status
● cron.service - Regular background program processing daemon
Loaded: loaded (/lib/systemd/system/cron.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2016-06-26 03:00:37 UTC; 1min 17s ago
Docs: man:cron(8)
Main PID: 845 (cron)
Tasks: 1 (limit: 4383)
CGroup: /system.slice/cron.service
└─845 /usr/sbin/cron -f
Jun 26 03:00:37 ip-172-31-45-21 systemd[1]: Started Regular background program processing daemon.
Jun 26 03:00:37 ip-172-31-45-21 cron[845]: (CRON) INFO (pidfile fd = 3)
Jun 26 03:00:37 ip-172-31-45-21 cron[845]: (CRON) INFO (Running @reboot jobs)
restart or start cron service if required
DB connections / connection pool process alerts – keep listening for any alerts on DB connection failures, or even warnings, as these can be caused by too many read operations (such as during a DDoS) and can escalate very quickly
netstat -nltp | grep db
tcp 0 0 0.0.0.0:5433 0.0.0.0:* LISTEN 5792/db-server *
Routine deep-status checks are a good practice too; raise an alert if any check does not return the expected result.
Port check, unexpected result alert – regularly check whether servers are listening on the expected ports, such as 5060 for SIP
netstat -nltp | grep 5060
tcp 0 0 x.x.x.x:5060 0.0.0.0:* LISTEN 8970/kamailio
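The netstat check can be scripted so an alert fires when the expected listener is missing; a sketch that parses netstat-style output (port 5060 as in the example above):

```shell
# listening_on PORT: read netstat -nltp style output on stdin and succeed
# when some process holds a LISTEN socket on that port.
listening_on() {
  grep -E "[:.]$1 .*LISTEN" > /dev/null
}
# e.g.: netstat -nltp | listening_on 5060 || echo "SIP port down, raise alert"
```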
Cron zombie process checks – a zombie or defunct process is one that has completed execution (via the exit system call) but still has an entry in the process table: it is a process in the "Terminated" state. List zombie processes and clear them to free up the process table.
kill -9 <PID1>
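Note that a zombie cannot be reaped by kill -9 on its own PID (it has already exited); its entry disappears only when the parent waits on it or is itself restarted, so the kill above is for the hung parent. A sketch for listing defunct entries:

```shell
# list_zombies: print the PIDs of defunct (zombie) processes, if any
list_zombies() {
  ps -eo pid=,stat= | awk '$2 ~ /^Z/ { print $1 }'
}
# to clear them, signal the parent so it reaps the child, e.g.:
# for z in $(list_zombies); do kill -HUP "$(ps -o ppid= -p "$z")"; done
```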
Bulk call checks – consult the ongoing-call commands of the application server, for example:
For Freeswitch use
fs_cli> show channels
For kamailio use kamcmd
kamcmd dlg.list
For Asterisk, use watch or the show command
watch -n 1 "sudo asterisk -vvvvvrx 'core show channels' | grep call"
In case of DDoS or other malicious traffic, identify the attacker IP and block it
iptables -I INPUT -s y.y.y.y -j DROP
Can also use fail2ban
> apt-get update && apt-get install fail2ban
Additionally, check how many dispatchers are responding on the outbound gateway
opensipsctl dispatcher dump
Process control supervisor or pm2 checks – Supervisor is a Linux process control system that allows its users to monitor and control a number of processes
ps axf | grep supervisor
for pm2
> pm2 status
[PM2] Spawning PM2 daemon with pm2_home=/Users/altanai/.pm2
[PM2] PM2 Successfully daemonized
┌─────┬───────────┬─────────────┬─────────┬─────────┬──────────┬────────┬──────┬───────────┬──────────┬──────────┬──────────┬──────────┐
│ id │ name │ namespace │ version │ mode │ pid │ uptime │ ↺ │ status │ cpu │ mem │ user │ watching │
Use htop to check memory and CPU.
Health and load on the reverse proxy / load balancer such as Nginx – perform a direct curl request to the host to check whether Nginx responds with a non-4xx/5xx response
curl -v <public-fqdn-of-server>
In case of an error response, restart:
/etc/init.d/nginx start
In case of config updates, reload the Nginx config:
nginx -s reload
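The curl health check can be reduced to a status-code test; a sketch (the host name and restart action are illustrative):

```shell
# http_ok CODE: succeed for 2xx/3xx HTTP status codes, fail for 4xx/5xx
http_ok() {
  case "$1" in 2??|3??) return 0 ;; *) return 1 ;; esac
}
# e.g.: http_ok "$(curl -s -o /dev/null -w '%{http_code}' http://my-host/)" \
#         || /etc/init.d/nginx restart
```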
For HTTP/SSL proxy daemons such as tinyproxy, which are used for fast responses, set MinSpareServers, MaxSpareServers, MaxClients, MaxRequestsPerChild etc. appropriately
VPN checks – restart firewalls or IPsec in case of issues
/etc/init.d/ipsec restart
Additionally, check the ssh service
ps axf | grep sshd
restart sshd if required
SSL cert expiry checks – to keep operations running securely and prevent abrupt terminations, it is good practice to run regular certificate expiry checks for SSL certs, especially on secure HTTP endpoints such as APIs and web servers, and also on SIP application servers for TLS. If any expiry is due in < 10 days, trigger an alert to renew the certs.
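One way to script the expiry check is with openssl x509 -checkend; a sketch, assuming the < 10 day alert window above (file paths and the endpoint are illustrative):

```shell
# cert_expiring_soon CERT_FILE [days]: succeed (i.e. alert) when the
# certificate expires within the window (default 10 days).
cert_expiring_soon() {
  if openssl x509 -checkend $(( ${2:-10} * 86400 )) -noout -in "$1" > /dev/null 2>&1
  then
    return 1   # still valid beyond the window
  else
    return 0   # expiring within the window (or unreadable): raise the alert
  fi
}
# a live TLS endpoint can be checked the same way, e.g.:
# echo | openssl s_client -connect api.example.com:443 2>/dev/null |
#   cert_expiring_soon /dev/stdin 10 && echo "renew soon"
```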
Health of task scheduling services such as RabbitMQ or the Celery distributed task queue – remote debugging of these can be set up via pdb, which supports setting (conditional) breakpoints, single stepping at the source line level, inspection of stack frames, source code listing, and evaluation of arbitrary Python code in the context of any stack frame.
import pdb; pdb.set_trace()
python3 -m pdb myscript.py
It can also be set up using the client libraries provided by these queue services themselves.
Cluster status – set up an efficient health-check service that monitors the cluster status for high availability. A JSON object depicting the status of the cluster shards:
{
"cluster_name" : "ABC-cluster",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 14,
"number_of_data_nodes" : 6,
"active_primary_shards" : 200,
"active_shards" : 300,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0
}
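The field names suggest this is the output of an Elasticsearch-style cluster health endpoint (an assumption based on the shard fields); a probe could extract the status like so (jq '.status' would be more robust):

```shell
# cluster_status: pull the "status" field out of a cluster-health JSON
# document read from stdin.
cluster_status() {
  grep -o '"status"[^,]*' | sed 's/.*: *"\(.*\)"/\1/'
}
# e.g.: curl -s http://localhost:9200/_cluster/health | cluster_status
# alert unless it prints "green"
```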
Status of Critical Application Server
fs_cli> show status
UP 0 years, 0 days, 0 hours, 58 minutes, 33 seconds, 15 milliseconds, 58 microseconds
FreeSWITCH (Version 1.6.20 git 987c9b9 2018-01-23 21:49:09Z 64bit) is ready
3 session(s) since startup
0 session(s) - peak 1, last 5min 1
0 session(s) per Sec out of max 30, peak 1, last 5min 1
1000 session(s) max
min idle cpu 0.00/80.83
Current Stack Size/Max 240K/8192K
Programming or syntax errors in the production environment – mostly arising from incomplete QA/testing before new changes are pushed to production. These should trigger alerts for the dev teams and be met with hot patches.
Many application development frameworks have inbuilt libraries for debugging, exception handling and reporting, such as:
- backend service in Django
- API service in Go
Distributed memory caching – Redis, Memcache: Redis INFO shows the master-slave configuration for all instances, as well as their memory and CPU status.
>redis-cli info
# Server
redis_version:6.0.4
redis_git_dirty:0
redis_mode:standalone
os:Darwin 18.7.0 x86_64
arch_bits:64
multiplexing_api:kqueue
atomicvar_api:atomic-builtin
gcc_version:4.2.1
tcp_port:6379
# Clients
connected_clients:1
client_recent_max_input_buffer:0
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
# Memory
used_memory:1065648
used_memory_human:1.02M
number_of_cached_scripts:0
maxmemory:0
allocator_frag_bytes:1123680
allocator_rss_ratio:1.00
rss_overhead_bytes:37888
mem_fragmentation_ratio:2.16
active_defrag_running:0
lazyfree_pending_objects:0
# Persistence
loading:0
rdb_changes_since_last_save:0
module_fork_last_cow_size:0
# Stats
total_connections_received:1
total_commands_processed:0
..
# Replication
role:master
connected_slaves:0
..
# CPU
used_cpu_sys:0.011198
used_cpu_sys_children:0.000000
# Modules
# Cluster
cluster_enabled:0
SMS service using an SMSC on Kannel: on the Kannel servers, look for PANIC errors (most of the time an assertion error crashing Kannel):
grep PANIC /var/log/kannel/bearerbox.log
If you are going to restart, flush the Redis cache:
sudo redis-cli FLUSHALL
sudo redis-cli SAVE
restart kannel
sudo /etc/init.d/kannel restart
If the carriers are throttling the SMS requests, verify "ERROR" responses using:
sudo grep -i "throttling" bearerbox.log
Alternatives include AWS log services, as well as:
- Scalyr logging
- Sensu monitoring for multi-cloud monitoring using event pipeline
Read about a VoIP / OTT / telecom solution startup's strategy for building a scalable, flexible SIP platform, which includes:
- Scalable and Flexible SIP platform building
- Cluster SIP telephony Server for High Availability
- Failure Recovery
- Multi-tier cluster architecture
- Role Abstraction / Micro-Service based architecture
- Distributed Event management and Event-Driven architecture
- Containerization
- Autoscaling Cloud Servers
- Open standards and Data Privacy
- Flexibility for inter-working – NextGen911 , IMS , PSTN
- Security and Operational Efficiencies
Read more about SIP VoIP system architecture, which includes:
- Infrastructure Requirements
- Integral Components of a VOIP SIP-based architecture
- RTP (Real-Time Transport Protocol) / RTCP
- SIP gateways, registrar, proxy, redirect, application
- Developing SIP-based applications – basic call routing, media management
- SIP platform development – NAT and DNS, cross-platform work and integration with the external telecommunication provider landscape, databases
References:
- Terraform: https://www.terraform.io
- Kubernetes: https://kubernetes.io/ , https://kubernetes.io/docs/home
- Sensu: https://sensu.io/
- DevOps tools (Raygun blog): https://raygun.com/blog/best-devops-tools/
- Nginx: https://docs.nginx.com/nginx/admin-guide/basic-functionality/runtime-control/
- Python debugger: https://docs.python.org/dev/library/pdb.html#module-pdb
- Sensu Go: https://docs.sensu.io/sensu-go/latest/
- Memcached restartable cache: https://github.com/memcached/memcached/wiki/WarmRestart