Kamailio Transaction management and Transaction Module tm

Kamailio is essentially a transaction-stateful proxy with no dialog support built into the core. The TM module enables stateful processing of SIP transactions by maintaining a state machine for each one. State is a prerequisite for more complex logic such as accounting, forking and DNS resolution.

Although most of the Kamailio module related description is covered here, I wanted to keep a separate space to describe and explain how Kamailio handles transactions and in particular, the Transaction Module.

Note: This article was first written for v3.0 and has been updated several times since, most recently to match v5.1. If you see any outdated content or deprecated functions, please point them out to me in the comments. If you are new to Kamailio, this post is probably not a good starting point; read an introduction to Kamailio first. Kamailio is a powerful open-source SIP server with widespread application in telephony.

Kamailio can manage stateless replying as well as stateful processing – SIP transaction management. The difference between the two is below

Stateful

  • Processing is per SIP transaction.
  • Each SIP transaction is kept in memory so that any replies, failures, or retransmissions can be recognized.
  • The application understands transactions, for example:
    • it recognizes if a new INVITE message is a resend
    • it knows that a 200 OK response belongs to the initial INVITE, which it can handle in an onreply_route[x] block.
  • Uses: managing call state, routing, call control such as forward on busy, voicemail.

Stateless

  • Forwards each message in the dialog without any context.
  • It does not know that the call is ongoing. However, it can use the Call-ID to match INVITE and BYE.
  • Uses: load distribution, proxying.

Kamailio’s Transaction management

t_relay, t_relay_to_udp and t_relay_to_tcp are the main functions to set up transaction state, absorb retransmissions from upstream, generate downstream retransmissions and correlate replies to requests in Kamailio.

Lifecycle of Transaction

A transaction's lifecycle is controlled by several factors, including whether the transport is reliable (TCP) or unreliable, and whether it is an INVITE or non-INVITE transaction. A transaction is terminated either by a final response or when one of the timers that control it fires.

An ACK is considered part of the INVITE transaction when a negative (non-2xx) final response is received. When a positive (2xx) final response is received, the ACK is not considered part of the transaction.

Memory Management in Transactions

The Transaction Module keeps clones of received SIP messages in shared memory, whereas non-TM functions operate on the received message in private memory. Core operations (such as record_route) should therefore be called before setting up the transaction state (t_relay) when processing a message statefully; changes made afterwards apply only to the private copy, not to the clone that TM forwards.

An INVITE transaction will be kept in memory for a maximum of: max_inv_lifetime + fr_timer + wt_timer.
A non-INVITE transaction will be kept in memory for a maximum of: max_noninv_lifetime + wt_timer.
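
The two upper bounds above are simple sums, so they can be checked with a few lines of Python. This is only a sketch: the fr_timer and lifetime values are the defaults listed in the Parameters section of this article, while WT_TIMER_MS is an assumed value that you should replace with whatever your configuration uses.

```python
# Upper bound on how long TM keeps a transaction in memory, in milliseconds.
FR_TIMER_MS = 30_000              # default quoted in this article
MAX_INV_LIFETIME_MS = 180_000     # default quoted in this article
MAX_NONINV_LIFETIME_MS = 32_000   # default quoted in this article
WT_TIMER_MS = 5_000               # assumption, not stated in this article


def max_time_in_memory_ms(is_invite: bool) -> int:
    """Return the worst-case time a transaction stays in memory."""
    if is_invite:
        return MAX_INV_LIFETIME_MS + FR_TIMER_MS + WT_TIMER_MS
    return MAX_NONINV_LIFETIME_MS + WT_TIMER_MS


invite_bound = max_time_in_memory_ms(True)       # 180000 + 30000 + 5000
non_invite_bound = max_time_in_memory_ms(False)  # 32000 + 5000
```

With these numbers an INVITE transaction can stay in memory for up to 215 seconds, which is worth keeping in mind when sizing shared memory for high call rates.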

Branches

A single SIP INVITE request may be forked to multiple destinations. Together these destinations are called the destination set, and the individual elements within the destination set are called branches. A transaction can have more than one branch. For example, during DNS failover, each failed DNS SRV destination can introduce a new branch.

Serial, Parallel and Combined Forking 

By default Kamailio performs parallel forking, sending the message to all destinations and waiting for a response. It can also fork serially, i.e. send the requests one by one and wait for a response or timeout before sending the next.

By using priorities (q values from 0 to 1.0), Kamailio can also mix the two techniques: destinations are tried serially in priority order, and destinations with the same q value are tried in parallel. The destination URIs are loaded using the functions t_load_contacts() and t_next_contacts().

Parallel forking snippet

request_route {
  seturi("sip:a@example.com");
  append_branch("sip:b@example.com");
  append_branch("sip:c@example.com");
  append_branch("sip:d@example.com");

  t_relay();
  exit;
}

Mixed forking snippet

modparam("tm", "contacts_avp", "tm_contacts");
modparam("tm", "contact_flows_avp", "tm_contact_flows");

request_route {
  seturi("sip:a@example.com");               // lowest priority
  append_branch("sip:b@example.com", "0.5"); // should be tried in parallel with c
  append_branch("sip:c@example.com", "0.5"); // should be tried in parallel with b
  append_branch("sip:d@example.com", "1.0"); // highest priority, should be tried first

  t_load_contacts();   // order all branches by q value and store them in the AVP configured via modparam
  t_next_contacts();   // take the branches with the highest q value from the AVP

  t_on_failure("serial"); // on failure, continue with the next priority group
  t_relay();
  exit;
}

Code to stop when no more branches are found (t_next_contacts() returns -1) and send the reply upstream

 failure_route["serial"]
 {
   if (!t_next_contacts()) {
     exit;
   }
   t_on_failure("serial");
   t_relay();
 }
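
To make the ordering rule concrete, the mixed forking described above can be modeled outside Kamailio. The sketch below is not how TM is implemented; it only illustrates the idea that branches are sorted by descending q value and that branches sharing a q value form one parallel group. The function name and the q value of 0.0 for the first URI are assumptions made for the illustration.

```python
from itertools import groupby


def forking_plan(branches):
    """Group (uri, q) pairs into serial steps; each step is forked in parallel."""
    # Highest q first; sorted() is stable, so equal-q branches keep their order.
    ordered = sorted(branches, key=lambda b: b[1], reverse=True)
    return [
        [uri for uri, _ in group]
        for _, group in groupby(ordered, key=lambda b: b[1])
    ]


branches = [
    ("sip:a@example.com", 0.0),  # assumed lowest priority
    ("sip:b@example.com", 0.5),
    ("sip:c@example.com", 0.5),
    ("sip:d@example.com", 1.0),
]
plan = forking_plan(branches)
# plan[0] is tried first; the next step is only used if the previous one fails.
```

Here d is tried alone first, then b and c together, and finally a, which matches the intent of the comments in the mixed forking snippet.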

TM Module


Memory

TM keeps clones of received SIP messages in shared memory, whereas non-TM functions operate on the received message in private memory. Core operations (such as record_route) should therefore be called before setting up the transaction state (t_relay) when processing a message statefully.


Parameters

Various parameters are used to fine-tune how transactions are handled and timed out in Kamailio. Note that all timers are set in milliseconds.

  • fr_timer (integer) – timer hit when no final reply for a request or ACK for a negative INVITE reply arrives. Default 30000 ms (30 seconds).
  • fr_inv_timer (integer) – timer hit when no final reply for an INVITE arrives after a provisional message was received on branch. Default 120000 ms (120 seconds).
  • restart_fr_on_each_reply (integer) – if set, restart fr_inv_timer for an INVITE transaction on each provisional reply. Otherwise it is restarted only for the first provisional reply and then for increasing provisional replies. Turn it off when dealing with bad UAs that continuously retransmit 180s, not allowing the transaction to time out.
  • max_inv_lifetime (integer) – Maximum time an INVITE transaction is allowed to be active. It is measured from the time the transaction was created; once this timer is hit, the transaction is moved to either the wait state or the final response retransmission state. Default 180000 ms (180 seconds).
  • max_noninv_lifetime (integer) – Maximum time a non-INVITE transaction is allowed to be active. Default 32000 ms (32 seconds).
  • wt_timer (integer) – Time for which a transaction stays in memory to absorb delayed messages after it completed.
  • delete_timer (integer) – Time after which deletion of a to-be-deleted transaction that is still referenced by a process is retried. This is now obsolete; a transaction is deleted the moment it is no longer referenced.

Retry transmission timers

  • retr_timer1 (integer) – Initial retransmission period
  • retr_timer2 (integer) – Maximum retransmission period. The retransmission interval starts at retr_timer1, increases until it reaches this value, and stays constant after that.
  • noisy_ctimer (integer) – if set, INVITE transactions that time out (FR INV timer) will always be replied to. Otherwise they will be quietly dropped without any 408 branch timeout response.
  • auto_inv_100 (integer) – automatically send a 100 reply to INVITEs.
  • auto_inv_100_reason (string) – Set reason text of the automatically sent 100 to an INVITE.
  • unix_tx_timeout (integer) – Unix socket transmission timeout.
  • aggregate_challenges (integer) – if more than one branch received a 401 or 407 as final response, then all the WWW-Authenticate and Proxy-Authenticate headers from all the 401 and 407 replies will be aggregated in a new final response.
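
The interaction between retr_timer1 and retr_timer2 is easier to see as a sequence of intervals. The sketch below assumes that the interval doubles after every retransmission until it reaches the cap, which is the back-off RFC 3261 describes for its T1/T2 timers; the values 500 ms and 4000 ms are likewise assumptions taken from the RFC's T1 and T2, not from this article.

```python
def retransmission_intervals(retr_timer1_ms, retr_timer2_ms, count):
    """Return the first `count` retransmission intervals in milliseconds."""
    intervals = []
    current = retr_timer1_ms
    for _ in range(count):
        intervals.append(current)
        # Assumed back-off: double the interval, but never exceed the cap.
        current = min(current * 2, retr_timer2_ms)
    return intervals


schedule = retransmission_intervals(500, 4000, 6)
# schedule == [500, 1000, 2000, 4000, 4000, 4000]
```

Summing the schedule gives a rough idea of how many retransmissions fit before fr_timer expires on an unreplied branch.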

Blacklist

  • blst_503 (integer) – if set and the blacklist is enabled, the source of every 503 reply is added to the blacklist.
  • blst_503_def_timeout (integer) – blacklist interval if no “Retry-After” header is present
  • blst_503_min_timeout / blst_503_max_timeout (integer) – minimum and maximum blacklist interval respectively
  • blst_methods_add (unsigned integer) – Bitmap of method types that trigger blacklisting on transaction timeouts. By default only INVITE triggers blacklisting.
  • blst_methods_lookup (unsigned integer) – Bitmap of method types that are looked up in the blacklist before being forwarded statefully. By default every method except BYE is looked up.
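
One way to read the three blst_503 timeout parameters together is as a clamp: use the default when the 503 carries no Retry-After, otherwise keep the Retry-After value within the minimum and maximum. The sketch below illustrates that reading only; it is an assumption about how the parameters combine, and the numbers in the example are hypothetical.

```python
def blacklist_interval(retry_after, def_timeout, min_timeout, max_timeout):
    """Pick a blacklist interval for the source of a 503 reply.

    retry_after is None when the reply has no Retry-After header.
    All values are in whatever unit the module parameters use.
    """
    if retry_after is None:
        return def_timeout
    # Clamp the advertised value into [min_timeout, max_timeout].
    return max(min_timeout, min(retry_after, max_timeout))


# Hypothetical settings: default 60, minimum 10, maximum 3600.
no_header = blacklist_interval(None, 60, 10, 3600)
too_short = blacklist_interval(5, 60, 10, 3600)
too_long = blacklist_interval(7200, 60, 10, 3600)
```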

Reparse

  • reparse_invite (integer) – set if CANCEL and negative ACK requests are to be constructed from the INVITE message ( same record-set etc as INVITE ) which was sent out instead of building them from the received request.
  • reparse_on_dns_failover (integer) – SIP message after a DNS failover is constructed from the outgoing message buffer of the failed branch instead of from the received request.
  • ac_extra_hdrs (string) – Header fields prefixed by this parameter value are included in the CANCEL and negative ACK messages if they were present in the outgoing INVITE. Can be only used with reparse_invite=1.
  • on_sl_reply (string) – Sets reply route block, to which control is passed when a reply is received that has no associated transaction.
modparam("tm", "on_sl_reply", "stateless_replies")
...
onreply_route["stateless_replies"] {
    // return 0 to prevent stateless replies from being forwarded
    return 1; // pass to the core for stateless forwarding
}
  • xavp_contact (string) – name of XAVP storing the attributes per contact.
  • contacts_avp (string) – name of an XAVP that stores names of destination sets. Used by t_load_contacts() and t_next_contacts() for forking branches
  • contact_flows_avp (string) – name of an XAVP that stores the contacts that were skipped
  • fr_timer_avp (string) – overrides the value of fr_timer on a per-transaction basis; outdated
  • cancel_b_method (integer) – method to CANCEL an unreplied transaction branch. Params :
    • 0 will immediately stop the request (INVITE) retransmission on the branch so that unreplied branches will be terminated
    • 1 will keep retransmitting the request on unreplied branches.
    • 2 will send and retransmit CANCEL even on unreplied branches, stopping the request retransmissions.
  • unmatched_cancel (string) – sets how to forward CANCELs that do not match any transaction. Params :
    • 0 statefully
    • 1 statelessly
    • 2 dropping them
  • ruri_matching (integer) – try to match the request URI when doing SIP 1.0 transaction matching, as older SIP did not have the Via cookie introduced in RFC 3261
  • via1_matching (integer) – match the topmost “Via” header when doing SIP 1.0 transaction matching
  • callid_matching (integer) – match the callid when doing transaction matching.
  • pass_provisional_replies (integer)
  • default_code (integer) – Default response code sent by t_reply() ( 500 )
  • default_reason (string) – Default SIP reason phrase sent by t_reply() ( “Server Internal Error” )
  • disable_6xx_block (integer) – treat all 6xx replies like normal replies. Note that according to the RFC, receiving a 6xx cancels all running parallel branches and stops DNS failover and forking, so enabling this is not RFC-conformant.
  • local_ack_mode (integer) – where locally generated ACKs for 2xx replies to local transactions are sent. Params :
    • 0 – the ACK destination is chosen according to the next hop in the contact and the route set, and then DNS resolution is used on it
    • 1 – the ACK is sent to the same address as the corresponding INVITE branch
    • 2 – the ACK is sent to the source of the 2xx reply.
  • failure_reply_mode (integer) – how branches are managed and replies are selected for failure_route handling. Params :
    • 0 – all branches are kept
    • 1 – all branches are discarded
    • 2 – only the branches of previous leg of serial forking are discarded
    • 3 – all previous branches are discarded
    • if you do not want to drop all branches, use t_drop_replies() to drop them selectively
  • faked_reply_prio (integer) – how branch selection is done.
  • local_cancel_reason (boolean) – add reason headers for CANCELs generated due to receiving a final reply.
  • e2e_cancel_reason (boolean) – add reason headers for CANCELs generated due to receiving a CANCEL
  • remap_503_500 (boolean) – conversion of 503 response code to 500, as required by the RFC.
  • failure_exec_mode (boolean) – Add local failed branches in timer to be considered for failure routing blocks.
  • dns_reuse_rcv_socket (boolean) – reuse of the receive socket for additional branches added by DNS failover.
  • event_callback (str) – function in the KEMI configuration file (embedded scripting language such as Lua, Python, …) to be executed instead of the event_route[tm:local-request] block. The function receives a string parameter with the name of the event.
modparam("tm", "event_callback", "ksr_tm_event")
...
function ksr_tm_event(evname)
    KSR.info("===== TM module triggered event: " .. evname .. "\n");
    return 1;
end
  • relay_100 (str) – whether or not a SIP 100 response is proxied. This is not valid behavior when operating in stateful mode and is only useful in stateless mode.
  • rich_redirect (int) – adds branch info to 3xx class replies. Params :
    • 0 – no extra info is added (default)
    • 1 – include branch flags as contact header parameter
    • 2 – include path as contact uri Route header

Functions

These functions are the operational blocks and route handlers for transaction handling in Kamailio

  • t_relay([host, port]) – Relay a message statefully.
    Example: if t_relay fails, at least send a stateless reply to the UAC so that it is not kept waiting
if (!t_relay()) 
{ 
    sl_reply_error(); 
    break; 
};
  • t_relay_to_udp([ip, port]) / t_relay_to_tcp([ip, port]) – same as above, relay a message statefully but using specific protocol
if (some_condition)
    t_relay_to_udp("1.2.3.4", "5060"); # sent to 1.2.3.4:5060 over udp
else
    t_relay_to_tcp(); # relay to msg. uri, but over tcp
  • t_relay_to_tls([ip, port])
  • t_relay_to_sctp([ip, port])
  • t_on_failure(failure_route) – sets the failure route block, which handles a branch when a negative reply to the transaction is received. There the URI is reset to the value it had on relaying.
  • t_on_branch_failure(branch_failure_route) – sets the branch failure route block, which gets control when a negative response comes in for a transaction. There the URI is reset to the value it had on relaying.
  • t_on_reply(onreply_route) – gets control when a reply from transaction is received
  • t_on_branch(branch_route) – control is passed after forking (when a new branch is created)
  • t_newtran() – Creates a new transaction
  • t_reply(code, reason_phrase) – Sends a stateful reply after a transaction has been established.
  • t_send_reply(code, reason)
  • t_lookup_request() – Checks if a transaction exists
  • t_retransmit_reply()
  • t_release() – Remove transaction from memory
  • t_forward_nonack([ip, port]) – forward a non-ACK request statefully
  • t_forward_nonack_udp(ip, port) / t_forward_nonack_tcp(ip, port)
  • t_forward_nonack_tls(ip, port)
  • t_forward_nonack_sctp(ip, port)
  • t_set_fr(fr_inv_timeout [, fr_timeout]) – Sets the fr_inv_timeout and, optionally, the fr_timeout for the current transaction
  • t_reset_fr()
  • t_set_max_lifetime(inv_lifetime, noninv_lifetime) – Sets the maximum lifetime for the current INVITE or non-INVITE transaction, or for transactions created during the same script invocation
  • t_reset_max_lifetime()
  • t_set_retr(retr_t1_interval, retr_t2_interval) – Sets the retr_t1_interval and retr_t2_interval for the current transaction
  • t_reset_retr()
  • t_set_auto_inv_100(0|1) – switch automatically sending 100 replies to INVITEs on/off on a per transaction basis
  • t_branch_timeout() – Returns true if the failure route is executed for a branch that did timeout.
  • t_branch_replied()
  • t_any_timeout()
  • t_any_replied()
  • t_grep_status(“code”)
  • t_is_canceled()
  • t_is_expired()
  • t_relay_cancel()
  • t_lookup_cancel([1])
  • t_drop_replies([mode])
  • t_save_lumps()
  • t_load_contacts()
  • t_next_contacts()
  • t_next_contact_flow()
  • t_check_status(re)
  • t_check_trans() – check if a message belongs or is related to a transaction.
  • t_set_disable_6xx(0|1)
  • t_set_disable_failover(0|1)
  • t_set_disable_internal_reply(0|1)
  • t_replicate([params]) – Replicate the SIP request to a specific address.
  • t_relay_to(proxy, flags) – KSR.tm.t_relay()
  • t_set_no_e2e_cancel_reason(0|1)
  • t_is_set(target) – KEMI – KSR.tm.t_is_set() Return true if the attribute specified by ‘target’ is set for transaction. Target can be branch_route , failure_route and onreply_route.
-- req_method is assumed to hold the request method, e.g. from KSR.pv.get("$rm")
if not(KSR.tm.t_is_set("branch_route")>0) then
    KSR.tm.t_on_branch("ksr_branch_manage");
end

if not(KSR.tm.t_is_set("onreply_route")>0) then
    KSR.tm.t_on_reply("ksr_onreply_manage");
end

if not(KSR.tm.t_is_set("failure_route")>0) and (req_method == "INVITE") then
    KSR.tm.t_on_failure("ksr_failure_manage");
end
  • t_use_uac_headers()
  • t_is_retr_async_reply()
  • t_uac_send(method, ruri, nexthop, socket, headers, body)
  • t_get_status_code() – Return the status code for transaction or -1 in case of error or no status code was set.

Snippet to demo stateful handling of transactions

This program replies 200 OK to every REGISTER statelessly and creates a new transaction for every other request. It then checks for the username altanai: a match gets a custom hello reply, and any other username gets a different rejection reply.

# ------------------ module loading ----------------------------------
loadmodule "sl.so"   # stateless replies: sl_send_reply(), sl_reply_error()
loadmodule "tm.so"   # stateful processing: t_newtran(), t_reply()

request_route {
    # for testing purposes, simply okay all REGISTERs
    if (method=="REGISTER") {
        log(1, "REGISTER\n");
        sl_send_reply("200", "ok");
        exit;
    }

    # create transaction state with t_newtran(); abort if an error occurred
    if (t_newtran()) {
        log(1, "New Transaction created\n");
    } else {
        sl_reply_error();
        exit;
    }

    log(1, "New Transaction Arrived\n");

    # check for a matching username and send a custom message with t_reply()
    if (uri=~"altanai@") {
        if (!t_reply("409", "Well , hello altanai !")) {
            sl_reply_error();
        }
    } else {
        # every other username gets a rejection reply
        if (!t_reply("699", "Do not proceed with this one")) {
            sl_reply_error();
        }
    }
}

Raw RPC cmds

1. kamctl rpc tm.list

{  

   "jsonrpc":"2.0",
   "result":[  
      {  
         "cell":"0x7f0698d06488",
         "tindex":50969,
         "tlabel":163886326,
         "method":"INVITE",
         "from":"From: ;tag=dddab54e\r\n",
         "to":"To: \r\n",
         "callid":"Call-ID: NjkyYjJlNzJkNzQ1OTYyZjE2MDM2NjFlYWZkNjY4OWE\r\n",
         "cseq":"CSeq: 1",
         "uas_request":"yes",
         "tflags":65,
         "outgoings":2,
         "ref_count":1,
         "lifetime":29578635
      }
   ],
   "id":3922
}

2. kamctl rpc tm.stats

before call

{  
   "jsonrpc":"2.0",
   "result":{  
      "current":0,
      "waiting":0,
      "total":3,
      "total_local":0,
      "rpl_received":6,
      "rpl_generated":6,
      "rpl_sent":6,
      "6xx":0,
      "5xx":3,
      "4xx":0,
      "3xx":0,
      "2xx":0,
      "created":3,
      "freed":3,
      "delayed_free":0
   },
   "id":4119
}

during call

{  

   "jsonrpc":"2.0",
   "result":{  
      "current":1,
      "waiting":0,
      "total":4,
      "total_local":0,
      "rpl_received":7,
      "rpl_generated":7,
      "rpl_sent":7,
      "6xx":0,
      "5xx":3,
      "4xx":0,
      "3xx":0,
      "2xx":0,
      "created":4,
      "freed":3,
      "delayed_free":0
   },
   "id":4217
}

during call wait

{  
   "jsonrpc":"2.0",
   "result":{  
      "current":1,
      "waiting":1,
      "total":4,
      "total_local":0,
      "rpl_received":8,
      "rpl_generated":8,
      "rpl_sent":8,
      "6xx":0,
      "5xx":4,
      "4xx":0,
      "3xx":0,
      "2xx":0,
      "created":4,
      "freed":3,
      "delayed_free":0
   },
   "id":4275
}

after call is completed

{  
   "jsonrpc":"2.0",
   "result":{  
      "current":0,
      "waiting":0,
      "total":4,
      "total_local":0,
      "rpl_received":8,
      "rpl_generated":8,
      "rpl_sent":8,
      "6xx":0,
      "5xx":4,
      "4xx":0,
      "3xx":0,
      "2xx":0,
      "created":4,
      "freed":4,
      "delayed_free":0
   },
   "id":4333
}
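
In all four tm.stats snapshots above, created minus freed equals current. The sketch below parses two trimmed-down copies of that output and computes the difference. Treat the relationship as an observation on this sample rather than a documented guarantee; the helper name is made up for the illustration.

```python
import json


def transactions_in_memory(stats_json: str) -> int:
    """Return created - freed from a tm.stats JSON-RPC response."""
    result = json.loads(stats_json)["result"]
    return result["created"] - result["freed"]


# Trimmed copies of the "during call" and "after call" snapshots above.
during_call = '{"jsonrpc":"2.0","result":{"current":1,"created":4,"freed":3},"id":4217}'
after_call = '{"jsonrpc":"2.0","result":{"current":0,"created":4,"freed":4},"id":4333}'

still_held = transactions_in_memory(during_call)
all_released = transactions_in_memory(after_call)
```

A value that keeps growing between calls would be a hint that transactions are not being released.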

VoIP/ OTT / Telecom Solution startup’s strategy for building a scalable flexible SIP platform


I have been contemplating the points that make a developer successful at building solutions and services for a Telecom Application Server. The trend has shown many variations, from pure IN programs like VPN and prepaid billing logic, to SIP servlets for call parking and call completion, and from SIP servlets to JAIN SLEE open-standard-based communication.

Scalable and Flexible SIP platform building

This section has been updated in 2020

A cloud communication provider acts as a service provider between SMEs (Small and Medium Enterprises) and large-scale telco carriers. An important concern for a cloud provider is to build a scalable and flexible platform. Let’s discuss in depth how one can go about achieving scalability in SIP platforms.

Multi geography Scaled via Universal Router

A typical semi multi-geography scaled distributed VoIP system, based on read replicas or data sharding, is controlled by a router that distributes the traffic to various regions based on destination number prefix matching.
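
Destination number prefix matching is usually a longest-prefix lookup. The sketch below shows one simple way to do it. The prefix table, region names and function name are all hypothetical and only serve the illustration.

```python
# Hypothetical prefix table: dialing prefix -> region that serves it.
REGION_BY_PREFIX = {
    "1": "us-east",
    "44": "eu-west",
    "91": "ap-south",
    "9122": "ap-south-mumbai",
}


def pick_region(number, table=REGION_BY_PREFIX, default="default-region"):
    """Return the region whose prefix is the longest match for the number."""
    digits = number.lstrip("+")
    # Try the longest candidate prefix first, then shorten it digit by digit.
    for length in range(len(digits), 0, -1):
        region = table.get(digits[:length])
        if region is not None:
            return region
    return default


mumbai = pick_region("+912212345678")
india = pick_region("+919812345678")
unknown = pick_region("+33123456789")
```

A production router would more likely use a trie or a database lookup, but the selection rule stays the same.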

Cluster SIP telephony Server for High Availability

Clusters of SIP servers are great at providing high availability and resilience; however, they also add latency and management overhead.

Considerations for a clustered SIP application server setup

  • Memory requirements to store the state for a given session, and the increasing overhead of having more than two replicas within a partition.
  • Co-hosted virtual machines add resource contention and delay call establishment due to multi-node traversal.
  • In case of node failures or a planned reboot after an upgrade, traffic redirection requires draining existing calls from the SIP server before bringing it down. This setup ensures that
    • no new calls are channelled to this server
    • the server waits for existing calls to end before the reboot.
  • Fail fast and recover: the system should be reliable enough not to let a node failure propagate and become the root cause of an entire system failure due to corrupted data.
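
The draining behaviour in the list above can be expressed as a small state holder: once draining starts the node refuses new calls, and it is safe to reboot only when the last existing call has ended. This is a minimal sketch with made-up class and method names, not tied to any particular SIP server.

```python
class SipNode:
    """Tracks active calls on one SIP server and whether it is draining."""

    def __init__(self, name):
        self.name = name
        self.draining = False
        self.active_calls = set()

    def accept_call(self, call_id):
        """Admit a new call unless the node is draining."""
        if self.draining:
            return False
        self.active_calls.add(call_id)
        return True

    def end_call(self, call_id):
        self.active_calls.discard(call_id)

    def start_drain(self):
        self.draining = True

    def safe_to_reboot(self):
        """True once draining has started and no calls remain."""
        return self.draining and not self.active_calls


node = SipNode("sip-1")
node.accept_call("call-a")
node.start_drain()
rejected = not node.accept_call("call-b")  # new calls go elsewhere
before = node.safe_to_reboot()             # call-a is still running
node.end_call("call-a")
after = node.safe_to_reboot()
```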

Failure Recovery

A Clustered SIP platform is quickly recoverable with containerized applications. Clear separation between stateless engine layer and session management or Data layer is critical to enable auto-reboot of failed nodes in engine layer.

It should be noted that, unlike HTTP-based platforms, dialog and transaction state variables are critical to SIP platforms, for example call duration for a CDR entry. Therefore, for a mid-call failure and auto-reboot, the state variables should be replicated to an external cache so that the values persist for correct billing.

Multi-tier cluster architecture

Symmetrical Multi-Processing (SMP) architectures have

  • stateless “Engine Tier” processes all traffic and
  • distributes all transaction and session state to a “Data Tier.”

A very good example of this is the Oracle Communications Converged Application Server Cluster (OCCAS) which is composed of 3 tiers :

  1. Message dispatcher,
  2. Communication engine stateless
  3. Datastore which is in-memory session store for the dialogues and ongoing transactions

An advantage of having stateless servers is that if the application server crashes or reboots, the session state is not lost as a new server can pick up the session information from an external session store .

Role Abstraction / Micro-Service based architecture

The components for a well-performing highly scalable SIP architecture are abstracted in their role and responsibilities. We can have categories like

Load Balancer / Message Dispatcher

LB routes traffic based on an algorithm (round robin, hashing , priority based scheduling, weight-based scheduling ) among active and ready servers
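
Two of the algorithms named above are easy to sketch. Round robin simply cycles through the ready servers, while hashing can be keyed on the SIP Call-ID so that all messages of one call land on the same server. The server names are hypothetical, and the choice of SHA-256 as the hash is an assumption made for the example.

```python
import hashlib
from itertools import cycle

servers = ["sip-1", "sip-2", "sip-3"]  # hypothetical active and ready servers


def round_robin(pool):
    """Return an endless iterator that hands out servers in turn."""
    return cycle(pool)


def pick_by_call_id(call_id, pool):
    """Map a Call-ID to a server so that the same call always hits the same one."""
    digest = hashlib.sha256(call_id.encode("utf-8")).digest()
    return pool[int.from_bytes(digest[:8], "big") % len(pool)]


rr = round_robin(servers)
first_four = [next(rr) for _ in range(4)]
sticky = pick_by_call_id("NjkyYjJlNzJkNzQ1OTYy", servers)
```

Note that plain modulo hashing reshuffles most calls when the pool size changes; consistent hashing is the usual answer if that matters.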

Backend Dynamic Routing and REST API services 

Services that the application server calls during call flow execution, for tasks such as looking up the IP address associated with the caller or the screened numbers associated with the destination. Examples are XML Remote Procedure Call (XML-RPC) or the AVAPI service in Kamailio.

OSS/BSS layer 

This layer is responsible for jobs in relation to operations and billing and should take place in an independent system without affecting the session call flow or causing a high RTT. Some top features possible with defining this layer well are

  • POS CRM, Order Management, Loyalty, feedback, ticketing
  • Post Paid Billing , Inter-carrier Billing
  • BPM and EAI
  • Provisioning & Mediation
  • Number Management
  • Inventory
  • ERP, SCM
  • Commissions
  • Directory Enquiry
  • Payments & Collections
  • BI ( Business Intelligence)
  • Fraud and RAS
  • Pre-Paid Billing
  • Document Management
  • EBPP, Self Care

There are other components in a typical VoIP microservices architecture, such as a heartbeat service, backend accounting service, security check service, REST API service, dynamic routing service, event notification service etc. These should be decoupled from each other, which enables a highly parallel programming approach.

Containerization and Auto deployment

To improve flexibility with respect to infrastructure binding, all server components, including edge components, proxies, engines and media servers, must be containerized as images (for example Docker) for easy deployment via infrastructure tools like Kubernetes, Terraform or Chef cookbooks, and be efficiently controlled with an identity management tool and a CI/CD (continuous integration and delivery) tool like Travis or Jenkins.

Autoscaling Cloud Servers using containerized images

Autoscaled servers are provided by the majority of Cloud Infrastructure providers such as AWS ( Amazon Web Services ), Google Cloud platform which scale the capacity based on traffic in real-time also called elasticity. Any VoIP developer would notice patterns in voice traffic such as less during holidays/night hours where servers can be freed, whereas traffic peaks during days where server capacity needs to scale up.

Additionally, traffic may spike when the setup is under a DDoS attack, which is not uncommon for SIP servers. In that case the server needs to identify and block malicious source points and prevent unnecessary upscaling. There are 2 approaches to scaling

  • Scale UP / Vertical scaling: reusing the existing server and upgrading its performance to match the load requirements.
  • Scale OUT / Horizontal scaling: increasing the number of servers and adding their IPs to the load balancer to manage traffic.

It should be noted that scaling up or down should be carried out incrementally to have better control over the resource-to-requirement ratio.
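
Incremental scaling can be read as "change capacity by one step at a time and only when utilization leaves a comfort band". The sketch below illustrates that rule for horizontal scaling. The thresholds, limits and function name are assumptions chosen for the example, not values from any cloud provider.

```python
def next_server_count(current, utilization, low=0.30, high=0.75,
                      min_servers=2, max_servers=20):
    """Return the server count for the next interval, moving one step at most."""
    if utilization > high:
        return min(current + 1, max_servers)  # scale out by one server
    if utilization < low:
        return max(current - 1, min_servers)  # scale in by one server
    return current                            # inside the band: do nothing


peak = next_server_count(4, 0.90)
night = next_server_count(4, 0.10)
steady = next_server_count(4, 0.50)
```

Keeping a floor of at least two servers preserves redundancy even during quiet hours.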
Hardware resource map for Clustered Application server , Media Server Database cluster , LB , monitoring server

Security should be a priority

It is crucial for any voice traffic / media services provider to have state-of-the-art security without disrupting data privacy norms.

SIP security practices and threats, such as authentication, authorization, server impersonation, tampering with message bodies, mid-session threats like tearing down sessions, denial of service and amplification, full encryption vs hop-by-hop encryption, transport and network layer security, HTTP authentication, SIP URI, nonce, and SIP over TLS flows, can be read at https://telecom.altanai.com/2020/04/12/sip-security/

While scaling out the infrastructure to extend the PoPs (points of presence) across different geographies, define zones such as

  • red zone : public facing server like load balancers
  • dmz zone (demilitarized zone): interfacing servers between the private and public networks
  • green zone: private and secure internal servers, which communicate over private IPs and should be unreachable from outside.

To further increase the efficiency of communication and transmission between green zone servers, set up a private VPC (Virtual Private Cloud) between them.

Open standards

To establish itself as a dependable real-time communication provider, the product must follow standardized RFCs and stacks, such as SIP RFC 3261 and the W3C draft for WebRTC peer connection. It is also good practice to stay updated with all recommendations by ITU and IANA and to keep the implementation in line with them. For example: STIR/SHAKEN https://telecom.altanai.com/2020/01/08/cli-ncli-and-stir-shaken/

Data Privacy

Adhere to privacy and protection standards like GDPR, COPPA, HIPAA and CCPA. More details on VoIP certificates, compliances and security at https://telecom.altanai.com/2020/01/20/certificates-compliances-and-security-in-voip/

Product Innovation and Market Differentiator

Innovation + Experiment + Out of Box Thinking

Many Communication service providers offer Voice over IP and related unified communication and collaboration platforms. A new VoIP provider needs to envision enhancements and innovations that meet the growing user expectation.

  • Easy to follow technical documentation and help and quick response to any technical question about platform posted on QnA sites (StackOverflow, Quora .. ), tech forums ( Google groups, slack channels .. ) even Twitter handles to address issues.

Data Visualization Tools – Show overall call quality insights, call flows, stats, probable issues, fixes, spending/saving on user groups, duration, negative-positive margins, healthy/unhealthy calls, spams etc. 

Graphical Event Timelines – time-based events such as call setup, termination, codec negotiation and call redirection events

Drag and Drop Call Flow designer – Call routing logic becomes more complicated as the set of known, pre-defined operations (parking, routing, voicemail, forking, redirecting etc.) grows. The call routing can be easily composed from these preset operations as UI blocks attached to a call flow chain, which results in calls being channelled as predefined by this call flow logic. This gives customers plenty of customizability and design flexibility to design their calls.

Pricing Model

Encourage users to use the services either for free or for a minimal price

Besides increasing the onboarding count and developing an international presence, this also helps gain good word of mouth and pays off in the long term.

  • Discount, onboarding bonuses encourage users to try out services without signing up with long term contracts. The value could range from 5-15$ one-time onboarding prize to use services such as DID number purchase, outgoing telco call or purchasing any other service addon.
  • No or minimal onboarding cost
  • Toll-free minutes 50- 1000 minutes per month.

Competitive Pricing: some of the entries below show approximate pricing figures for various services (note that these may be outdated and should not be used as a reference as is).

  • Pay as you go pricing : Rate per minute (USD) plan for example( from google voice )

Australia – Mobile ~$0.02
Portugal – Mobile ~$0.15
Switzerland – Mobile – Orange ~$0.11
United Kingdom – Mobile – Orange ~$0.02
United Kingdom – Mobile – Vodafone ~$0.01

Outbound calls to PSTN ~$0.015 per min (depending on telco and destination)
Incoming Voice Calls on a Local Number ~$0.0060 per min
Incoming Voice Calls on a Toll-Free Number ~$0.020 per min
VoIP Calls (In-App WebRTC & SIP) ~$0.003 per min

  • Addon Services

Call Recordings ~$0.0025 per min
Voicemail Detection, Call Analyzers, Call Transcription ~$0.015 – $0.050 per min ( depends on external API cost or inhouse R&D effort)
Automatic Speech Recognition ~$0.02 per 15 sec

  • Trunk Calls ( heavy volume customers )

Inbound Voice Calls ~$0.0025/min
Outbound Voice Calls ~$0.0065/min
Tollfree Inbound Voice Calls ~$0.0135/min ( toll free numbers usually charge more than local numbers)

“Pay as you go” pricing model
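As a rough worked example of pay-as-you-go billing, the sketch below multiplies minutes used by the approximate per-minute rates listed above and subtracts a one-time onboarding credit. The function name `monthly_cost`, the service keys, and the idea that the credit is applied directly against usage are assumptions made for illustration, and the rates are the approximate figures from this post rather than any provider's actual price list.

```python
from decimal import Decimal

# Approximate per-minute rates in USD, copied from the lists above.
# The dictionary keys are hypothetical names chosen for this sketch.
RATES_PER_MIN = {
    "outbound_pstn": Decimal("0.015"),
    "inbound_local": Decimal("0.0060"),
    "inbound_tollfree": Decimal("0.020"),
    "voip": Decimal("0.003"),
    "recording": Decimal("0.0025"),
}


def monthly_cost(usage_minutes, onboarding_credit=Decimal("0")):
    """Return the amount due for the given minutes per service.

    usage_minutes maps a service key to the minutes used.
    onboarding_credit is an assumed one-time credit deducted from the total;
    the amount due never goes below zero.
    """
    total = sum(
        (RATES_PER_MIN[service] * Decimal(minutes)
         for service, minutes in usage_minutes.items()),
        Decimal("0"),
    )
    return max(total - onboarding_credit, Decimal("0"))


if __name__ == "__main__":
    usage = {"outbound_pstn": 1000, "inbound_local": 500, "recording": 1000}
    print(monthly_cost(usage))                # 15 + 3 + 2.5 USD of usage
    print(monthly_cost(usage, Decimal("5")))  # same usage with a $5 credit
```

Decimal is used instead of float so that fractions of a cent per minute add up without rounding drift.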

Services which should be offered on a non-chargeable basis:

  • Round the clock technical support
  • Compensation for Downtime
  • CDRs per account
  • IP to IP calls
  • Security Certificates in TLS and SRTP calls
  • Authentication and Authorization

Services that can be charged are

  • Value-added services – live weather updates, horoscope updates, etc.
  • Carrier Integration – trunk, PRI
  • Toll-Free Numbers – DID numbers
  • Virtual Private Network (VPN) : An Intelligent Network (IN) service, which offers the functions of a private telephone network. The basic idea behind this service is that business customers are offered the benefits of a (physical) private network, but spared from owning and maintaining it
  • Access Screening (ASC): An IN service which gives operators the possibility to screen (allow/bar) the incoming traffic and decide the call routing, especially when subscribers choose an alternate route/carrier/access network (also called Equal Access) for long-distance calls, either on a call-by-call basis or pre-selected.
  • Number Portability(NP) : An IN service allows subscribers to retain their subscriber number while changing their service provider, location, equipment or type of subscribed telephony service. Both geographic numbers and non-geographic numbers are supported by the NP service.

Flexibility for inter-working

Interworking among the services from legacy IN solutions and IMS/IT allows operators to extend their basic offering with added services via low-cost software and increases the ARPU for subscribers.

Next Gen 911

911-like emergency services are moving from traditional TDM networks to IP networks. However, this poses some challenges, such as detecting the caller's geolocation and routing the call to his/her nearest servicing station or Public Safety Answering Point (PSAP).
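To make the routing challenge concrete, here is a minimal sketch that picks the PSAP closest to a caller's coordinates using the haversine great-circle distance. The PSAP identifiers and coordinates are made up, and choosing by straight-line distance is a simplifying assumption of this sketch; real emergency routing is typically driven by service-area boundaries rather than nearest distance alone.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_psap(caller_lat, caller_lon, psaps):
    """Return the PSAP entry with the smallest distance to the caller."""
    return min(
        psaps,
        key=lambda p: haversine_km(caller_lat, caller_lon, p["lat"], p["lon"]),
    )


if __name__ == "__main__":
    # Hypothetical PSAP records, for illustration only.
    psaps = [
        {"id": "psap-a", "lat": 40.0, "lon": -74.0},
        {"id": "psap-b", "lat": 34.0, "lon": -118.0},
    ]
    print(nearest_psap(40.7, -74.0, psaps)["id"])
```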

Backward compatibility with existing legacy networks

PSTN-SIP gateways interface between the SIP platform and the SS7 signalling platform, and also convert the RTP stream to the analog waveforms required by PSTN endpoints.

Internetworking with IMS

IMS is an IP telephony service architecture developed by the 3rd Generation Partnership Project (3GPP), the global cellular network standards organization that also standardized Third Generation (3G) services and Long Term Evolution (LTE) services.

More about IMS (IP Multimedia Subsystem)

Develop on interactive and popular frameworks like WebRTC

Agile development and Service Oriented Architecture (SOA) are proven methods of delivering quality, up-to-date products and releases which can cater to evolving market demands. In short, “Be future ready while protecting the existing investments”.

Make a WebRTC solution that offers a plugin-free, device-agnostic, network-agnostic web-based communication tool along with the server-side implementation.


Read More about WebRTC Communication as a platform Service – https://telecom.altanai.com/2019/07/04/webrtc-cpaas-communication-platform-as-a-service/

External Integrations

  • Enterprise communication agents integration – consider integration with Microsoft 365, Google Workspace, Skype for Business, Slack, WebEx
  • CRM integration – Salesforce, Zendesk
  • Business-specific integration
    • Canvas for e-learning
    • Telehealth platform for doctor consultation
  • A2P (application to person) messaging

Integration of the services with social media/networking enables new monetization benefits for CSPs, especially in terms of advertising, gaining popularity, attracting new customers, etc.


Enterprises seek to reach their customers with trusted telecom mediums such as phone calls/SMS. Telcos play an instrumental role in increasing the customer’s trust for an enterprise by means of updates over call and SMS in addition to emails and postal mail. The medium of VoIP services offers value addition in their present product/service delivery model for any firm whether it be an e-commerce firm or banking.

VoIP providers should develop an SDK-rich, dev-friendly arrangement that can facilitate onboarding SMEs (small and medium enterprises) through self-guided tutorials and quick setups.

Support developer base to aggregate, use open-standard services/technologies and tie them with other communication technologies suited to their business use-case using the VoIP platform as the medium of communication between web/mobile app endpoints and telecom endpoints.

Operational Efficiencies

Log aggregation and Analytics.
PagerDuty Alerts
Daily and Weekly backups and VM snapshots.
Automated sanity Tests
Centralized alert management, monitoring and admin dashboards.
Deployment automation / CI/CD
Tools and workflows for diagnostics, software upgrades, OS patches etc.
Customer support portal , provisioning Web Application

Read about VoIP system DevOps, operations and Infrastructure management, Automation

Telemetrics

QoS: Media stats can help us collect the call quality metrics which determine the overall user experience. Some frequently encountered issues include

Issue | Cause | Observance
High Packet Loss | 250 ms of audio duration lost in 5 sec | broken audio
High Jitter | jitter >= 30 ms in 5 sec | robotic audio
Low Audio Level | audio level < -80 dB | inaudible
High RTT | RTT > 300 ms in 5 sec | lags
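The thresholds in the table can be turned into a simple per-window check. The sketch below is illustrative only: the function name, argument names, and issue labels are assumptions, each call is assumed to describe one 5-second window of media stats, and the packet loss row is read as "250 ms or more of audio lost".

```python
def classify_window(audio_lost_ms, jitter_ms, audio_level_db, rtt_ms):
    """Return the list of issues flagged for one 5-second stats window.

    Thresholds follow the table above; names and labels are hypothetical.
    """
    issues = []
    if audio_lost_ms >= 250:      # broken audio
        issues.append("high_packet_loss")
    if jitter_ms >= 30:           # robotic audio
        issues.append("high_jitter")
    if audio_level_db < -80:      # inaudible
        issues.append("low_audio_level")
    if rtt_ms > 300:              # lags
        issues.append("high_rtt")
    return issues


if __name__ == "__main__":
    # One healthy window and one degraded window.
    print(classify_window(audio_lost_ms=0, jitter_ms=5,
                          audio_level_db=-30, rtt_ms=80))
    print(classify_window(audio_lost_ms=300, jitter_ms=45,
                          audio_level_db=-30, rtt_ms=350))
```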

Proactive Issue Tracking via Call Metadata Analysis

Call details, even during the setup phase, continuation, or re-INVITE/UPDATE phase, can suggest the probable outcomes based on previous results, such as bad call quality from certain geographic areas due to their known network or firewall issues, or high packet loss from certain handset device types. We can deduce well in advance what call quality stats such calls will generate.

Constraints which can be identified from the call setup details themselves include:

  • Geography and number – the originating location the call was made from and its destination
  • SIP devices – device-related details, version of device (browser version, etc.)
  • Chronological aspects of the call – initiation, ring start, pick up and end time.
  • Call direction – inbound (coming from the carrier towards our VoIP platform) or outbound (call directed to the carrier from our VoIP platform)
  • Network type – network issues and quality score across network types

Constraints which can be identified during an ongoing call itself include:

  • Participants and their local time – ongoing RTCP from call legs; the probability of long conferences is low in off hours
  • Call events – DTMF, XML, API calls, quality issues

Minor issues identified in an ongoing call's RTCP packets, such as increasing jitter or packet loss, can extrapolate to humanly perceivable bad audio quality after a while. Thus any suspected issue should be identified as early as it can be traced, and corrective action should be put in place.
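One simple way to act on such a trend is to fit a straight line through recent jitter samples and estimate how many more reporting windows remain before a threshold is crossed. This is only a sketch under stated assumptions: samples are taken at evenly spaced windows, the trend is treated as linear, and the function name `windows_until_threshold` is hypothetical. It is not the method used by any particular monitoring product.

```python
import math


def windows_until_threshold(samples, threshold):
    """Estimate how many more windows until the metric reaches threshold.

    samples: metric values (e.g. jitter in ms) at evenly spaced windows.
    Returns 0 if the latest sample is already at or above the threshold,
    None if there are too few samples or the trend is not increasing.
    """
    n = len(samples)
    if n < 2:
        return None
    if samples[-1] >= threshold:
        return 0
    # Ordinary least-squares line y = intercept + slope * x, x = 0..n-1.
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing_x = (threshold - intercept) / slope
    return max(0, math.ceil(crossing_x - (n - 1)))


if __name__ == "__main__":
    jitter_ms = [10, 12, 14, 16, 18]  # made-up samples rising 2 ms per window
    print(windows_until_threshold(jitter_ms, threshold=30))
```

With a result in hand, an alert can be raised while the call is still acceptable instead of after the audio has already degraded.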

Predicting Low Audio / Call quality

A predictive engine can forecast bad call quality such as 408 timeouts, high RTT, low audio level, audio lag, one-way audio, MOS < 2.5 out of 5, etc.

The predictive engine can use targeted notifications pointing towards specific issues that can come up in a call in real time and assign a technical rep to oversee or manually intervene.
This can include scenarios such as an agent warning a customer that their bad audio quality is due to an outdated SIP device with slow codecs, and suggesting an upgrade to lightweight codecs suited to their bandwidth. This spares the customer a bad user experience and can happen without the customer reporting the issue themselves with feedback, RTP stats, PCAPs, etc. It saves a lot of trouble and effort in call debugging.

Media Processing

CSPs are looking into long-term growth and profitability from new online media streaming services. A new VoIP provider could develop use cases beyond the existing one of media stream rendering to create a differentiator, such as

  • Streaming
  • Conference bridges/mixers
  • Recording and playback
  • IPTV and VOD ( Video On Demand)
  • Voicemails, IVR, DTMF
  • TTS (text to speech)
  • realtime transcription / captioning
  • Speech recognition etc

Some more services that a new VoIP provider should consider

  • Feedback gathering and User satisfaction surveys
  • Quick issues detection and detailed RCA

References :

SIP: https://telecom.altanai.com/2013/07/13/sip-session-initiaion-protocol/

What is OTT – https://telecom.altanai.com/2014/10/24/developing-a-ott-over-the-top-communication-application/

WebRTC business benefits to OTT and telecom carriers – https://telecom.altanai.com/2013/08/02/webrtc-business-benefits/


E-Learning

It is true that the number of teachers today is not enough to teach the number of kids. For example, in India there is often 1 teacher for a class of 60 students in one subject. Also, the experience and output of learning from a human teacher can never be replaced by software, an ebook, or an application, no matter how user-friendly or informative it is.

In this post I am going to describe an e-learning platform which harnesses the power of the Internet for the purpose of distance education, and where students around the world volunteer to teach each other any subject they wish to. This will be made possible through a combination of real-time communication technologies like WebRTC and a plethora of knowledge repositories.

Aim :

A platform to connect volunteers (children) who teach each other a subject in a stipulated time through web-based real-time communication.

Working :

A student enrolls for a subject or a course; it could be anything from arithmetic to the French language. Another student who knows French, for example, finds this on the portal and signs up to be a teacher for that child. They can connect with each other at any time in an audio, video, messaging, file sharing, or screen sharing session through WebRTC and learn the subject. The students earn reward points.

e-learning service on WebRTC

Technologies  :

  • WebRTC for communication
  • MySQL for data storage
  • Apache Tomcat as Webserver
  • HTML5
  • CSS
  • JavaScript

Conclusion :

By encouraging a child to take responsibility and teach another child , we will not only encourage friendship between them but also give them a sense of accomplishment .