4G, Long Term Evolution (LTE), VoLTE and VoWiFi

LTE stands for Long Term Evolution. The name is a registered trademark owned by ETSI (European Telecommunications Standards Institute) for the wireless data communications technology, which is a development of the GSM/UMTS standards.

  • Both radio and core network evolution
  • All-IP packet-switched architecture
  • Standardised by 3GPP
  • Lower CAPEX and OPEX

LTE evolved from an earlier 3GPP system known as the Universal Mobile Telecommunication System (UMTS), which in turn evolved from the Global System for Mobile Communications (GSM). It is also aligned with 4G (fourth-generation mobile) requirements.

It is designed to be backward compatible with GSM/EDGE/UMTS/CDMA/WCDMA systems on existing 2G and 3G spectrum, and supports handover and roaming to existing mobile networks.

Motivation for evolution – Wireless/cellular technology standards are constantly evolving for better efficiency and performance. LTE evolved as a result of the rapid increase in mobile data usage, driven by applications such as voice over IP (VoIP), streaming multimedia, videoconferencing and cellular modems.

It provides packet-switched traffic with seamless mobility and higher QoS than its predecessors, along with high data rates, high throughput, low latency and a packet-optimized radio access technology that supports flexible bandwidth deployments.

Timeline of Evolution 

  • GSM: calls use circuit switching (CS) between two parties. Dedicated circuits are used for voice and SMS.
  • GPRS: packet switching (PS) is introduced for data services.
  • UMTS / 3G: network elements begin evolving towards PS. No changes to the core.
  • EPC / LTE / VoLTE: no circuit-switched domain at all.


Peak Data Rate

  • uplink – 75 Mbps (20 MHz bandwidth)
  • downlink – 150 Mbps (UE Category 4, 2×2 MIMO, 20 MHz bandwidth), 300 Mbps (UE Category 5, 4×4 MIMO, 20 MHz bandwidth)

Carrier bandwidth

Ranges from 1.4 MHz up to 20 MHz. Ultimately, the bandwidth used by a carrier depends on the frequency band and the amount of spectrum available to the network operator.
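To see how carrier bandwidth relates to the peak data rates quoted above, here is a rough back-of-the-envelope sketch. The resource-block counts per bandwidth and the symbol timing follow the standard LTE numerology, but the 25% overhead figure (`OVERHEAD_FRACTION`) is an assumption made for illustration and is not a value from this article. With that assumption, 20 MHz with 2×2 MIMO and 64QAM comes out at roughly 151 Mbps, which is close to the 150 Mbps Category 4 figure.

```python
# Rough LTE downlink peak-rate estimate from carrier bandwidth.
# Resource-block counts follow the standard LTE channel bandwidths;
# OVERHEAD_FRACTION is an illustrative assumption, not a value from this article.

RESOURCE_BLOCKS = {1.4: 6, 3: 15, 5: 25, 10: 50, 15: 75, 20: 100}  # MHz -> resource blocks
SUBCARRIERS_PER_RB = 12
SYMBOLS_PER_SUBFRAME = 14      # normal cyclic prefix, 1 ms subframe
SUBFRAMES_PER_SECOND = 1000
OVERHEAD_FRACTION = 0.25       # assumed share lost to control and reference signals


def raw_rate_bps(bandwidth_mhz, bits_per_symbol, layers):
    """Raw physical-layer bit rate before overhead and channel coding."""
    rbs = RESOURCE_BLOCKS[bandwidth_mhz]
    symbols_per_second = (rbs * SUBCARRIERS_PER_RB
                          * SYMBOLS_PER_SUBFRAME * SUBFRAMES_PER_SECOND)
    return symbols_per_second * bits_per_symbol * layers


def estimated_peak_mbps(bandwidth_mhz, bits_per_symbol=6, layers=2):
    """Approximate usable peak rate in Mbps (64QAM carries 6 bits per symbol)."""
    usable = raw_rate_bps(bandwidth_mhz, bits_per_symbol, layers) * (1 - OVERHEAD_FRACTION)
    return usable / 1e6


if __name__ == "__main__":
    for bw in sorted(RESOURCE_BLOCKS):
        print(f"{bw:>4} MHz: ~{estimated_peak_mbps(bw):.1f} Mbps with 2x2 MIMO, 64QAM")
```

The same function shows why narrow carriers are so much slower: a 1.4 MHz carrier has only 6 resource blocks against 100 for a 20 MHz carrier.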

Mobility: up to 350 km/h

Multiple Access Schemes

  • uplink: SC-FDMA (Single Carrier Frequency Division Multiple Access), 50 Mbps+ (20 MHz spectrum)
  • downlink: OFDMA (Orthogonal Frequency Division Multiple Access), 100 Mbps+ (20 MHz spectrum)
  • Multi-antenna technology: multi-user collaborative MIMO for the uplink; TxAA, spatial multiplexing and CDD with a maximum 4×4 array for the downlink


  • Cell range: 5 – 100 km, with slight degradation after 30 km
  • LTE architecture supports hard QoS and guaranteed bit rate (GBR) for radio bearers.


All interfaces between network nodes are IP based.

Duplexing – Time Division Duplex (TDD), Frequency Division Duplex (FDD) and half-duplex FDD

MIMO (Multiple Input Multiple Output) transmissions –

Allows the base station to transmit several data streams over the same carrier simultaneously.

Modulation Schemes

QPSK, 16QAM, 64QAM (optional)

LTE Architecture

Primarily composed of

1. User Equipment (UE)

  • Mobile Termination (MT)
  • Terminal Equipment (TE)
  • Universal Integrated Circuit Card (UICC): also known as the SIM card for LTE equipment. It runs an application known as the Universal Subscriber Identity Module (USIM).

2. Evolved UMTS Terrestrial Radio Access Network (E-UTRAN)

Handles the radio communications between the mobile and the evolved packet core. It is made up of evolved base stations, called eNodeB or eNB.

Role of eNB: sends and receives radio transmissions to and from all its mobiles using the analogue and digital signal processing functions of the LTE air interface. The eNB also controls the low-level operation of all its mobiles by sending them signalling messages such as handover commands.

3. Evolved Packet Core (EPC)

This subsystem resembles an IMS environment.

Packet Data Network (PDN) Gateway (P-GW) communicates with the outside world, similar to the role of the GGSN (gateway GPRS support node) and SGSN (serving GPRS support node) in UMTS and GSM.

Home Subscriber Server (HSS) is a central database that contains information about all the network operator’s subscribers. It is similar to the HLR/AAA in 2G/3G architecture.

Mobility Management Entity (MME) controls the high-level operation of the mobile by means of signalling messages.

A roaming user in a visited PLMN is connected to the E-UTRAN, MME and S-GW of the visited LTE network. However, LTE/SAE allows the P-GW of either the visited or the home network to be used.

For roaming prepaid charging, accounting flows are made to access prepaid customer data via P-GWs, or via the CSCF in an IMS environment.


LTE-Advanced (3GPP Release 10)

  • LTE devices capable of CAT6 (Category 6) speeds
  • Increased peak data rate – downlink 3 Gbps, uplink 1.5 Gbps (1 Gbps = 1000 Mbps)
  • Spectral efficiency raised from 16 bps/Hz in R8 to 30 bps/Hz in R10
  • Carrier Aggregation (CA)
  • Enhanced use of multi-antenna techniques
  • Support for Relay Nodes (RN)
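The peak rates and spectral efficiency figures above are tied together by simple arithmetic: peak rate equals peak spectral efficiency multiplied by the aggregated bandwidth. The sketch below assumes carrier aggregation of five 20 MHz component carriers (100 MHz in total) and an uplink peak spectral efficiency of 15 bps/Hz; neither number is stated in this article, so treat both as assumptions.

```python
# Peak rate = peak spectral efficiency x aggregated bandwidth.
# Assumptions (not stated in the article): 5 component carriers of 20 MHz each,
# and an uplink peak spectral efficiency of 15 bps/Hz.

COMPONENT_CARRIER_MHZ = 20
MAX_COMPONENT_CARRIERS = 5


def peak_rate_gbps(spectral_efficiency_bps_per_hz, carriers=MAX_COMPONENT_CARRIERS):
    """Peak data rate in Gbps for the given number of aggregated carriers."""
    bandwidth_hz = carriers * COMPONENT_CARRIER_MHZ * 1e6
    return spectral_efficiency_bps_per_hz * bandwidth_hz / 1e9


if __name__ == "__main__":
    print("Downlink:", peak_rate_gbps(30), "Gbps")   # 30 bps/Hz over 100 MHz
    print("Uplink:  ", peak_rate_gbps(15), "Gbps")   # 15 bps/Hz over 100 MHz
```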


Also read about the previous generations of telecom, namely 2G and 3G:

2G to 3G – generation of telecom

2G is referred to as the GSM era, and 2.5G as the GSM with GPRS era. Compared to its predecessor 1G, which used FDMA (Frequency Division Multiple Access) for channelization, 2G used TDMA and CDMA to divide the channels.

MIMO ( multiple-input and multiple-output )

SISO – Single Input Single Output
SIMO – Single Input Multiple Output
MISO – Multiple Input Single Output
MIMO – Multiple Input Multiple Output

MIMO multiplies the capacity of a radio link by using multiple transmitting and receiving antennas to exploit multipath propagation.
It is a key technology for achieving a vast increase in wireless communication capacity over a finite electromagnetic spectrum.

Antenna configuration – implies antenna spatial diversity by using arrays of multiple antennas on one or both ends of a wireless communication link. This helps to

  • boost channel capacity
  • combat multipath fading
  • enhance the signal-to-noise ratio
  • create multiple communication paths

Applies to Wi-Fi

  • IEEE 802.11n (Wi-Fi)
  • IEEE 802.11ac (Wi-Fi)

as well as cellular networks

  • HSPA+ (3G)
  • WiMAX (4G)
  • Long Term Evolution (4G LTE)

and power-line communication for 3-wire installations, as part of the ITU G.hn standard and the HomePlug AV2 specification.

  • Large capacity increases over given bandwidth and S/N resources
  • Greater throughput on bands below 6 GHz

Multi-user MIMO (MU-MIMO)

Simultaneous independent data links to multiple users over a common time-frequency resource.

Massive MIMO

Enables the expansion of the useful spectrum to microwave and millimeter wave bands within the framework of 5G cellular communication.

Microdiversity MIMO

MIMO modes

  • Diversity – Alamouti algorithm
  • Beamforming – create and aim the antenna pattern electronically
  • Spatial multiplexing – use of precoding and shaping to unravel the multipath signals
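As an illustration of the diversity mode, the following is a minimal, noise-free sketch of the Alamouti scheme for two transmit antennas and one receive antenna. The channel gains and QPSK-like symbols are arbitrary values chosen for the example, and the channel is assumed to stay constant over the two symbol slots.

```python
# Noise-free Alamouti 2x1 space-time block code (two transmit antennas, one receive antenna).
# The channel gains and symbols used in the demo are arbitrary illustrative values.

def alamouti_encode(s1, s2):
    """Return the symbols sent by (antenna 1, antenna 2) in two consecutive slots."""
    return [(s1, s2), (-s2.conjugate(), s1.conjugate())]


def channel(slots, h1, h2):
    """Flat-fading channel, assumed constant over both slots, with no noise."""
    return [h1 * a1 + h2 * a2 for a1, a2 in slots]


def alamouti_decode(r1, r2, h1, h2):
    """Linear combining; needs channel knowledge at the receiver only."""
    gain = abs(h1) ** 2 + abs(h2) ** 2
    s1_hat = (h1.conjugate() * r1 + h2 * r2.conjugate()) / gain
    s2_hat = (h2.conjugate() * r1 - h1 * r2.conjugate()) / gain
    return s1_hat, s2_hat


if __name__ == "__main__":
    h1, h2 = 0.8 + 0.3j, -0.2 + 0.9j
    s1, s2 = 1 + 1j, -1 + 1j
    r1, r2 = channel(alamouti_encode(s1, s2), h1, h2)
    print(alamouti_decode(r1, r2, h1, h2))
```

Because each symbol travels over both channel paths, a deep fade on one antenna does not wipe it out, which is the point of transmit diversity.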

Implementing MIMO in small portable devices remains a challenge for mobile equipment vendors.


MIMO can be divided into three main categories: precoding, spatial multiplexing (SM), and diversity coding.

Precoding

Precoding is multi-stream beamforming. In single-stream beamforming, the same signal is emitted from each of the transmit antennas with appropriate phase and gain weighting such that the signal power is maximized at the receiver input. This increases the received signal gain and reduces multipath fading.

In line-of-sight propagation, beamforming results in a well-defined directional pattern. However, conventional beams are not a good analogy in cellular networks, which are mainly characterized by multipath propagation. When the receiver has multiple antennas, the transmit beamforming cannot simultaneously maximize the signal level at all of the receive antennas, and precoding with multiple streams is often beneficial. Note that precoding requires knowledge of channel state information (CSI) at the transmitter and the receiver.
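The phase and gain weighting can be sketched for the simplest case of a single receive antenna. The example below uses maximum ratio transmission (MRT), one common way to choose the weights; it assumes perfect CSI at the transmitter, and the channel coefficients are arbitrary illustrative values.

```python
# Transmit beamforming with maximum ratio transmission (MRT) for one receive antenna.
# Channel coefficients are arbitrary illustrative values; perfect CSI at the transmitter is assumed.
import math


def mrt_weights(h):
    """Unit-power weights: the conjugate of the channel, normalised."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in h))
    return [x.conjugate() / norm for x in h]


def received_power(h, w):
    """Power of the noise-free received signal for a unit-power symbol."""
    return abs(sum(hi * wi for hi, wi in zip(h, w))) ** 2


if __name__ == "__main__":
    h = [0.6 + 0.2j, -0.3 + 0.7j, 0.1 - 0.5j, 0.4 + 0.4j]
    equal = [1 / math.sqrt(len(h))] * len(h)   # same total power, no phase alignment
    print("MRT power:  ", received_power(h, mrt_weights(h)))
    print("Equal power:", received_power(h, equal))
```

With MRT the contributions from all antennas add in phase at the receiver, so the received power equals the sum of the individual channel powers.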

Spatial multiplexing

A high-rate signal is split into multiple lower-rate streams, and each stream is transmitted from a different transmit antenna in the same frequency channel. If these signals arrive at the receiver antenna array with sufficiently different spatial signatures and the receiver has accurate CSI, it can separate these streams into (almost) parallel channels.

Spatial multiplexing is a powerful technique for increasing channel capacity at higher signal-to-noise ratios (SNR).
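The stream separation can be sketched for a 2×2 link with a zero-forcing receiver, which is one simple detection method among several. The example is noise-free, assumes the receiver knows the channel matrix exactly, and uses an arbitrary illustrative matrix `H`.

```python
# 2x2 spatial multiplexing with a zero-forcing receiver (noise-free sketch).
# H is an arbitrary illustrative channel matrix, assumed perfectly known at the receiver.

def transmit(H, x):
    """y = H x for a 2x2 channel: each receive antenna sees a mix of both streams."""
    return [H[0][0] * x[0] + H[0][1] * x[1],
            H[1][0] * x[0] + H[1][1] * x[1]]


def zero_forcing(H, y):
    """Separate the two streams by applying the inverse of H."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    if abs(det) < 1e-12:
        raise ValueError("channel matrix is singular: spatial signatures are not distinct")
    inv = [[H[1][1] / det, -H[0][1] / det],
           [-H[1][0] / det, H[0][0] / det]]
    return [inv[0][0] * y[0] + inv[0][1] * y[1],
            inv[1][0] * y[0] + inv[1][1] * y[1]]


if __name__ == "__main__":
    H = [[0.9 + 0.1j, 0.3 - 0.4j],
         [-0.2 + 0.5j, 0.7 + 0.6j]]
    x = [1 + 1j, -1 - 1j]          # one symbol per transmit antenna
    print(zero_forcing(H, transmit(H, x)))
```

The singular-matrix check corresponds to the condition in the text: if both streams arrive with the same spatial signature, the receiver cannot tell them apart.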

Diversity coding

Diversity coding is used when there is no channel knowledge at the transmitter. A single stream is transmitted, and the signal is emitted from each of the transmit antennas with full or near orthogonal coding. Diversity coding exploits the independent fading in the multiple antenna links to enhance signal diversity.

NLP ( Natural Language Processing ) in VoIP

NLP (Natural Language Processing) can be defined as the automatic manipulation of natural language (text or audio) using computer algorithms and software. NLP has great potential in cognitive computing and artificial intelligence, and with increasing human-to-machine interaction and advances in machine learning, it is set to revolutionize the Voice over IP space.

Note: some people confuse Natural Language Processing with Neuro-Linguistic Programming, which belongs to the field of psychology.

NLP evolved from linguistics, which is the study of language along with its semantics, phonetics and grammar. Every language has rules, and NLP uses mathematical formulations to understand them. Discrete mathematical formalisms will be discussed later in this article.

Inputs for NLP usually come through conversation, speech, correspondence, reading, print, written composition, dictation, publishing, translation, lip reading, signing etc.

Rule based vs Statistical NLP – In contrast to rule-based engines, which work on hard preset values using, for example, a decision tree, statistical models work in a more probabilistic fashion, which produces more reliable results even in unfamiliar scenarios.

Linear classifier vs Convolutional Neural Nets – CNNs are a powerful supervised deep learning technique. As opposed to a linear classifier, whose decision boundary in feature space is a linear function, a CNN increases model complexity by adding more layers.

NLP tasks


Syntax: grammar induction, lemmatization, morphological segmentation, part-of-speech tagging, parsing, sentence breaking, stemming, word segmentation, terminology extraction

Semantics: lexical semantics, distributional semantics, machine translation, named entity recognition (NER), natural language understanding and generation, relationship establishment, sentiment analysis, word sense disambiguation, OCR (optical character recognition), recognizing textual entailment

Speech: speech recognition, speech segmentation, text to speech, dialogue

Discourse: automatic summarization, coreference resolution, discourse analysis

Key techniques

Of the above, a few key techniques are worth pointing out.

Parts of speech (POS)

A primary task in NLP is to extract tokens and sentences, identify parts of speech (like nouns, verbs, adjectives) and create parse trees.

POS tagging is the process of marking up each word in a corpus with its corresponding part-of-speech tag. The tags are also used by lemmatizers, which reduce a word to its root form.

POS methods differ significantly from Bag-of-Words (BOW) methods, which disregard semantic relationships and only take into account words and their frequencies, whereas POS takes context and definition into consideration.

POS tagging techniques include lexical, rule-based, probabilistic and deep learning methods.
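To make the contrast with Bag-of-Words concrete, here is a minimal sketch of a BOW representation using only the Python standard library. The tokenization rule (lowercase, letters and apostrophes only) is a simplifying assumption. Two sentences with opposite meanings end up with identical bags, which is exactly the context that POS tagging and parsing try to preserve.

```python
# Bag-of-words: only words and their frequencies are kept, word order is lost.
from collections import Counter
import re


def bag_of_words(text):
    """Lowercase the text, split on non-letters, and count each token."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


if __name__ == "__main__":
    a = bag_of_words("The dog bites the man")
    b = bag_of_words("The man bites the dog")
    print(a)
    print("Same bag for both sentences:", a == b)
```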

Named entity recognition (NER)

Given a stream of text, determine which items in the text map to proper names, such as people or places, and what the type of each name is, such as person, location or organization. An example on raw text using spaCy is given below.

“Hello ! My name is Atanai and I work on Solution design and architecture, developed many custom WebRTC and SIP based solutions such as telecom applications, media stream inetgration into IOT,Unified communication-collaboration ,signalling gateways ,SBC etc. I passed out from Anna university with Betch degree in 2011 and currenlty stay in Bangalore India.”

The output of the analysis is:

Noun phrases: ['My name', 'Atanai', 'I', 'Solution design', 'architecture', 'many custom', 'WebRTC and SIP based solutions', 'telecom applications', 'media stream integration', 'IOT', 'Unified communication-collaboration', 'signalling gateways', 'I', 'Anna university', 'Betch degree', 'currently stay', 'Bangalore India'] 
Verbs: ['be', 'work', 'develop', 'base', 'signal', 'pass'] 
Atanai PERSON 
Betch NORP 
2011 DATE 
Bangalore India LOC

Sentiment Analysis

Understand the overall opinion, feeling, or attitude expressed in a given medium (speech, text or video).
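As a toy illustration only, the sketch below scores text against two small word lists. The lists are hypothetical and far too small for real use; production systems normally rely on a trained model or a much larger lexicon, and this simple count ignores negation and context.

```python
# Toy lexicon-based sentiment scorer. The word lists are hypothetical and tiny;
# a real system would use a trained model or a much larger lexicon.
import re

POSITIVE = {"good", "great", "clear", "helpful", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "choppy", "dropped", "unhappy", "terrible"}


def sentiment(text):
    """Return 'positive', 'negative' or 'neutral' from simple word counts."""
    words = re.findall(r"[a-z]+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(sentiment("The audio was clear and the agent was helpful"))
    print(sentiment("The call dropped twice and the audio was choppy"))
```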

NLP in action

NLP application layout

Steps to obtain insights and relevant information from an unclassified document, raw text file or speech-to-text content such as a recording from a VoIP meeting:

Step 1 : Upload a document, which could be an invoice, order, feedback, complaint or any other unstructured raw text

Step 2 : Collect the data from the document

  • use OCR (optical character recognition) for handwritten or signed components
  • perform search, indexing, duplicate detection etc
  • can use MNIST database as
  • phrase matching and vocabulary
  • can use translation APIs to translate from other languages

Step 3 : Collect meaningful data

  • perform Part of Speech (POS) tagging and chunking
  • topic discovery and modelling
  • tokenization and text classification, to obtain domain-specific entities from the document
  • can use a standard language model to collect relevant, frequently used words
  • NER (Named Entity Recognition) to validate names, places and locations
  • can extract time and date from the mentioned entities
  • build relationship graphs

Step 4 : Extract sentiments using a trained model

  • utilize regular expressions for pattern searching
  • sentiment analysis
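The pattern-searching part of these steps can be sketched with Python's built-in `re` module. The three patterns (ISO dates, 24-hour times, e-mail addresses) and the sample sentence are assumptions chosen for illustration; a real document pipeline would need patterns tuned to its own data.

```python
# Regular-expression pattern search over raw text (for example, a meeting transcript).
# The patterns and the sample sentence are illustrative assumptions.
import re

DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")           # ISO dates such as 2024-03-15
TIME_PATTERN = re.compile(r"\b(?:[01]?\d|2[0-3]):[0-5]\d\b")  # 24-hour times such as 14:30
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")


def extract_patterns(text):
    """Return every date, time and e-mail address found in the text."""
    return {
        "dates": DATE_PATTERN.findall(text),
        "times": TIME_PATTERN.findall(text),
        "emails": EMAIL_PATTERN.findall(text),
    }


if __name__ == "__main__":
    sample = "Follow-up call on 2024-03-15 at 14:30, send the notes to team@example.com today"
    print(extract_patterns(sample))
```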

General Applications:

NLP finds its way into many domains:

1. VoIP platforms, media servers and automatic summarization of conferences / meetings, such as “Minutes of Meeting” that highlight the key takeaways from a VoIP session

2. Automatic essay assessment and scripting in education settings

3. Image annotation, using metadata that describes digital images, for categorization and easy retrieval based on keywords

4. Spam filtering

5. Building automatic assistants and chatbots with speech recognition, and auto-suggest with sentence completion (Siri, Alexa, Google Voice etc)

6. Social media analytics, to track sentiment about a topic or identify influencers, for example in movie or restaurant reviews

NLP in VOIP system

This section builds on audio streams, either captured or live. Separate articles cover sound waves (the fundamental characteristics of analog waves), analog wave modulation (how frequency, phase or amplitude is modulated to carry information for propagation) and digital wave modulation (amplitude, frequency and phase shift keying).

Based on NLP and trained models on extracted features ,an unknown audio wave can be classified and possibly identified.
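A very small sketch of this idea follows: two simple features (RMS energy and zero-crossing rate) are extracted from each wave, and an unknown wave is assigned to the class with the nearest average feature vector. The synthetic sine tones stand in for real recordings, and the class names, frequencies and 8 kHz sample rate are assumptions for the example; real systems use richer features and trained models.

```python
# Classify an unknown audio wave from two simple extracted features
# (RMS energy and zero-crossing rate) using a nearest-centroid rule.
# The synthetic sine "recordings" and class names are illustrative assumptions.
import math

SAMPLE_RATE = 8000  # Hz, assumed narrowband sampling rate


def sine_wave(freq_hz, amplitude=1.0, duration_s=0.1):
    """Generate a synthetic tone as a list of samples."""
    n = int(SAMPLE_RATE * duration_s)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def features(samples):
    """Return (rms_energy, zero_crossing_rate) for a list of samples."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return rms, crossings / len(samples)


def train(labelled_waves):
    """Average the feature vectors of each class into one centroid."""
    centroids = {}
    for label, waves in labelled_waves.items():
        feats = [features(w) for w in waves]
        centroids[label] = tuple(sum(col) / len(col) for col in zip(*feats))
    return centroids


def classify(samples, centroids):
    """Pick the class whose centroid is closest in feature space."""
    f = features(samples)
    return min(centroids, key=lambda label: math.dist(f, centroids[label]))


if __name__ == "__main__":
    model = train({
        "low_tone": [sine_wave(210), sine_wave(330)],
        "high_tone": [sine_wave(1900), sine_wave(2300)],
    })
    print(classify(sine_wave(250), model))
    print(classify(sine_wave(2100), model))
```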

Replacing auto attendants with IVR


WebRTC CPaaS ( Communication Platform as a Service )

A CPaaS (Communication Platform as a Service) is a cloud-based communication platform, also described as a B2B cloud communications platform, that provides real-time communication capabilities. It should integrate easily with the customer’s external environment or application, without the customer having to worry about building backend infrastructure or interfaces.

Traditionally, with IP-protected protocols, licensed codecs and the need to maintain a signalling protocol stack and network interfaces, building a communication platform was a costly affair. Cisco, Facetime and Skype were the only OTT (over the top) players taking away from telcos’ call revenue.

However, with the advent of standardized, open source protocols and codecs, plenty of CPaaS providers have crowded the market, creating more supply than there is demand. A customer wanting to quickly integrate real-time communications into their platform has many options to choose from. This article provides an insight into how CPaaS solutions are architected and programmed.

Sample CPaaS architecture built on open source technologies

I have previously written an article on the steps for building and deploying a WebRTC solution, which covers standalone, cloud-hosted and TURN-based NAT handler systems.

A typical CPaaS solution provides

  • Call server + media server that can be interacted with via a UA
  • Communication clients like SIP phones, WebRTC clients, SDKs (software development kits) or libraries for desktop, embedded and/or mobile platforms
  • APIs that can trigger automated calls and perform preprogrammed routing
  • Rich documentation and samples to build various apps such as call centre solutions, interactive auto-attendants using IVR, DTMF, conference solutions etc
  • Some CPaaS providers also add features like transcription, transcoding, recording, playback etc to gain an edge over other CPaaS providers
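To give a feel for what preprogrammed routing behind an IVR auto-attendant might look like, here is a minimal sketch. The menu, SIP URIs, business hours and function name are all hypothetical; they do not correspond to any particular CPaaS provider's API, which would typically expose this logic through webhooks or a configuration console.

```python
# Preprogrammed call routing driven by DTMF input, as an IVR auto-attendant might do.
# The menu, SIP URIs and function names are hypothetical, not part of any real CPaaS API.

IVR_MENU = {
    "1": "sip:sales@example.com",
    "2": "sip:support@example.com",
    "0": "sip:operator@example.com",
}
DEFAULT_TARGET = "sip:voicemail@example.com"
BUSINESS_HOURS = range(9, 18)  # 09:00 to 17:59, assumed local time


def route_call(dtmf_digit, hour):
    """Return the SIP URI a call should be forwarded to."""
    if hour not in BUSINESS_HOURS:
        return DEFAULT_TARGET          # outside office hours everything goes to voicemail
    return IVR_MENU.get(dtmf_digit, DEFAULT_TARGET)


if __name__ == "__main__":
    print(route_call("2", 10))   # caller pressed 2 during office hours
    print(route_call("2", 22))   # same key late in the evening
```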

Datacentre vs Cloud server


Advantages of using a CPaaS vs building your own RTC platform

Tech insights and experience

Companies that have been catering to the telco and communication domain build robust solutions based on industry best practices, which beat a novice solution built in a fortnight any day.

Keeping up with emerging trends

Market trends like new codecs, rich communication services, multi-tenancy, contextual communication, NLP and other ML-based enhancements are provided by the CPaaS company.

Auto Scaling, High Availability

A firm specializing in CPaaS solutions has already thought of clustering and autoscaling to meet peak traffic requirements, and of backup/replication on standby servers to activate in case of failure.
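The sizing decision behind autoscaling can be sketched as a simple calculation. The per-server capacity, target utilisation and standby count below are hypothetical numbers chosen for illustration; real values depend on codecs, transcoding load and hardware.

```python
# Sizing a media-server cluster for peak traffic, the kind of calculation an autoscaler performs.
# Capacity per server, target utilisation and the standby count are hypothetical numbers.
import math


def servers_needed(concurrent_calls, calls_per_server=200, target_utilisation=0.7, standby=1):
    """Active servers that keep load under the target utilisation, plus standby replicas."""
    if concurrent_calls < 0:
        raise ValueError("concurrent_calls must be non-negative")
    active = max(1, math.ceil(concurrent_calls / (calls_per_server * target_utilisation)))
    return active + standby


if __name__ == "__main__":
    for calls in (100, 1000, 5000):
        print(f"{calls} concurrent calls -> {servers_needed(calls)} servers")
```

Keeping at least one standby server even at zero load reflects the high-availability goal: a replica is ready to take over if an active server fails.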


Using a CPaaS saves on human resources, infrastructure and time to market. It saves tremendously on underlying IT infrastructure and often provides flexible pricing models.

In a nutshell, I have come across many small startups trying to build a CPaaS solution from scratch, only to realise after weeks of trying to build an MVP that they are stuck with firewall, NAT, media quality or interoperability issues. Since there are so many solutions already on the market, it is best to use them as an underlying layer and build application services on top, such as call centre or CRM services.