Both radio and core network evolution
all-IP packet-switched architecture
standardized by 3GPP
lower CAPEX and OPEX involved
Evolved from Universal Mobile Telecommunication System (UMTS), which in turn evolved from the Global
Also aligned with 4G (fourth-generation mobile)
LTE is backward compatible with GSM/EDGE/UMTS/CDMA/WCDMA systems on existing 2G and 3G spectrum, including handover and roaming to existing mobile networks.
Motivation for evolution
Wireless/cellular technology standards are constantly evolving for better efficiency and performance.
LTE evolved as a result of the rapid increase in mobile data usage, driven by applications such as voice over IP (VoIP), streaming multimedia, videoconferencing, cellular modems, etc.
It provides packet-switched traffic with seamless mobility and higher QoS than its predecessors.
It also offers high data rates and throughput, low latency, and a packet-optimized radio access technology on flexible bandwidth deployments.
Multi-antenna technology: multi-user collaborative MIMO for uplink; TxAA, spatial multiplexing, CDD and a maximum 4×4 array for downlink
Coverage: 5 to 100 km, with slight degradation after 30 km
LTE architecture supports hard QoS and guaranteed bit rate (GBR) for radio bearers.
All interfaces between network nodes are IP based
Duplexing – Time Division Duplex (TDD), Frequency Division Duplex (FDD) and half-duplex FDD
Multiple Input Multiple Output (MIMO) transmissions – LTE devices have to support this. It allows the base station to transmit several data streams over the same carrier simultaneously.
Modulation schemes: QPSK, 16QAM, 64QAM (optional)
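The modulation schemes listed above differ in how many bits each transmitted symbol carries, which is log2 of the constellation size. A minimal sketch of that arithmetic follows; the function and dictionary names are illustrative only.

```python
import math

# Constellation sizes for the modulation schemes named above.
CONSTELLATION_POINTS = {"QPSK": 4, "16QAM": 16, "64QAM": 64}


def bits_per_symbol(scheme: str) -> int:
    """Return log2(M) for an M-point constellation."""
    return int(math.log2(CONSTELLATION_POINTS[scheme]))


for scheme in CONSTELLATION_POINTS:
    print(scheme, bits_per_symbol(scheme), "bits per symbol")
```

Higher-order schemes pack more bits into each symbol but need a cleaner channel, which is one reason a link falls back to QPSK when the signal is weak.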
Primarily composed of
User Equipment (UE)
Evolved UMTS Terrestrial Radio Access Network (E-UTRAN).
Evolved Packet Core (EPC).
LTE devices capable of CAT6 speeds (Category 6)
Increased peak data rate: downlink 3 Gbps, uplink 1.5 Gbps (1 Gbps = 1000 Mbps)
Spectral efficiency from 16 bps/Hz in R8 to 30 bps/Hz in R10
Carrier Aggregation (CA)
enhanced use of multi-antenna techniques
support for Relay Nodes (RN)
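The peak rate and spectral efficiency figures above are linked by a simple product: peak rate = spectral efficiency × bandwidth. The sketch below checks that relation. It assumes an aggregated bandwidth of 100 MHz (five 20 MHz component carriers), a figure that is not stated in the list above, and the function name is illustrative.

```python
def peak_rate_bps(spectral_efficiency_bps_per_hz: float, bandwidth_hz: float) -> float:
    """Peak data rate as spectral efficiency multiplied by bandwidth."""
    return spectral_efficiency_bps_per_hz * bandwidth_hz


# Assumption: carrier aggregation of five 20 MHz component carriers.
aggregated_bandwidth_hz = 5 * 20e6

# Downlink: 30 bps/Hz (the R10 figure quoted above) over 100 MHz.
downlink_bps = peak_rate_bps(30, aggregated_bandwidth_hz)
print("Downlink peak:", downlink_bps / 1e9, "Gbps")

# Uplink: work backwards from the quoted 1.5 Gbps to the implied efficiency.
implied_uplink_efficiency = 1.5e9 / aggregated_bandwidth_hz
print("Implied uplink efficiency:", implied_uplink_efficiency, "bps/Hz")
```

Under that bandwidth assumption the 30 bps/Hz figure reproduces the 3 Gbps downlink peak, and the 1.5 Gbps uplink peak corresponds to 15 bps/Hz.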
Multiplying the capacity of a radio link using multiple transmission and receiving antennas to exploit multipath propagation. Key technology for achieving a vast increase of wireless communication capacity over a finite electromagnetic spectrum.
Antenna configuration – antenna spatial diversity is achieved by using arrays of multiple antennas on one or both ends of a wireless communication link to boost channel capacity. It combats multipath fading, enhances the signal-to-noise ratio and creates multiple communication paths.
Applies to Wi-Fi (IEEE 802.11n, IEEE 802.11ac) as well as cellular networks: HSPA+ (3G), WiMAX (4G) and Long Term Evolution (4G LTE). It also applies to power-line communication for 3-wire installations as part of the ITU G.hn standard and the HomePlug AV2 specification.
Large capacity increases over given bandwidth and S/N resources. Greater throughput on bands below 6 GHz.
simultaneous independent data links to multiple users over a common time-frequency resource
enable the expansion of the useful spectrum to microwave and millimeter wave bands within the framework of 5G cellular communication.
MIMO modes (60m)
Diversity – Alamouti algorithm
Beam forming – create and aim the antenna pattern electronically
Spatial multiplex – use of precoding and shaping to unravel the multipath signals
challenges faced by mobile equipment vendors implementing MIMO in small portable devices.
Three main categories: precoding, spatial multiplexing (SM), and diversity coding.
Precoding is multi-stream beamforming. In single-stream beamforming, the same signal is emitted from each of the transmit antennas with appropriate phase and gain weighting such that the signal power is maximized at the receiver input. This increases the received signal gain and reduces multipath fading.
In line-of-sight propagation, beamforming results in a well-defined directional pattern. However, conventional beams are not a good analogy in cellular networks, which are mainly characterized by multipath propagation. When the receiver has multiple antennas, the transmit beamforming cannot simultaneously maximize the signal level at all of the receive antennas, and precoding with multiple streams is often beneficial. Note that precoding requires knowledge of channel state information (CSI) at the transmitter and the receiver.
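The phase and gain weighting described above can be sketched for the simplest case: several transmit antennas, one receive antenna, and a transmitter that knows the channel. The weights used here are maximum-ratio transmission (conjugate of the channel, normalized to unit total power), chosen as one common option for illustration. The channel coefficients are made-up values.

```python
import math


def mrt_weights(channel):
    """Maximum-ratio transmission: conjugate channel, scaled to unit total power."""
    norm = math.sqrt(sum(abs(h) ** 2 for h in channel))
    return [h.conjugate() / norm for h in channel]


def received_amplitude(channel, weights):
    """Magnitude of the noise-free signal at a single receive antenna."""
    return abs(sum(h * w for h, w in zip(channel, weights)))


# Hypothetical flat-fading coefficients for four transmit antennas.
channel = [0.8 + 0.6j, -0.3 + 0.4j, 0.5 - 0.5j, 0.1 + 0.9j]

beamformed = received_amplitude(channel, mrt_weights(channel))
# Same total transmit power, but equal weights with no phase alignment.
unweighted = received_amplitude(channel, [0.5] * len(channel))

print("with beamforming:", round(beamformed, 3))
print("equal weights:   ", round(unweighted, 3))
```

With the conjugate weights every path adds in phase at the receiver, so the received amplitude equals the norm of the channel vector. No other unit-power weighting can exceed it.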
In spatial multiplexing, a high-rate signal is split into multiple lower-rate streams and each stream is transmitted from a different transmit antenna in the same frequency channel. If these signals arrive at the receiver antenna array with sufficiently different spatial signatures and the receiver has accurate CSI, it can separate these streams into (almost) parallel channels.
Spatial multiplexing is a powerful technique for increasing channel capacity at higher signal-to-noise ratios (SNR).
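To make the stream separation concrete, here is a minimal noise-free 2×2 sketch. Two symbols are sent at once from two antennas, and the receiver, which is assumed to know the channel matrix, undoes the mixing by inverting it. This is a zero-forcing receiver, picked here for simplicity, and the channel values are hypothetical.

```python
def zero_forcing_2x2(H, y):
    """Recover two transmitted symbols from y = H x by inverting the 2x2 matrix H."""
    (a, b), (c, d) = H
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("channel matrix is singular; streams cannot be separated")
    x0 = (d * y[0] - b * y[1]) / det
    x1 = (-c * y[0] + a * y[1]) / det
    return [x0, x1]


# Hypothetical channel: H[i][j] is the gain from transmit antenna j to receive antenna i.
H = [[1.0 + 0.2j, 0.3 - 0.4j],
     [0.2 + 0.5j, 0.9 - 0.1j]]

sent = [1 + 1j, -1 + 1j]  # two QPSK symbols, one per transmit antenna

# Each receive antenna sees a mix of both streams.
received = [H[0][0] * sent[0] + H[0][1] * sent[1],
            H[1][0] * sent[0] + H[1][1] * sent[1]]

recovered = zero_forcing_2x2(H, received)
print(recovered)
```

If the two spatial signatures were nearly identical, the determinant would approach zero and the inversion would amplify noise, which is why sufficiently different signatures matter.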
Diversity coding is used when there is no channel knowledge at the transmitter. A single stream is transmitted, and the signal is emitted from each of the transmit antennas with full or near-orthogonal coding. Diversity coding exploits the independent fading in the multiple antenna links to enhance signal diversity.
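The Alamouti scheme mentioned earlier is the classic example of such orthogonal coding. Below is a minimal noise-free sketch with two transmit antennas and one receive antenna. The transmitter needs no channel knowledge, although the receiver is assumed to have channel estimates h1 and h2. All numeric values are hypothetical.

```python
def alamouti_encode(s1, s2):
    """Return (antenna 1, antenna 2) symbols for two consecutive time slots."""
    return [(s1, s2), (-s2.conjugate(), s1.conjugate())]


def channel_output(slots, h1, h2):
    """Noise-free signal at the single receive antenna in each time slot."""
    return [h1 * a1 + h2 * a2 for a1, a2 in slots]


def alamouti_combine(r1, r2, h1, h2):
    """Linear combining at the receiver; recovers both symbols independently."""
    gain = abs(h1) ** 2 + abs(h2) ** 2
    s1_hat = (h1.conjugate() * r1 + h2 * r2.conjugate()) / gain
    s2_hat = (h2.conjugate() * r1 - h1 * r2.conjugate()) / gain
    return s1_hat, s2_hat


s1, s2 = 1 + 1j, 1 - 1j              # two QPSK symbols
h1, h2 = 0.6 - 0.3j, -0.2 + 0.9j     # hypothetical channel gains

r1, r2 = channel_output(alamouti_encode(s1, s2), h1, h2)
print(alamouti_combine(r1, r2, h1, h2))
```

Each recovered symbol is scaled by |h1|² + |h2|² before normalization, so a deep fade on one antenna link alone does not wipe out the symbol.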
NLP (Natural Language Processing) can be defined as the automatic manipulation of natural language (text or audio) using computer algorithms and software. NLP has great potential in cognitive computing and artificial intelligence, and with increasing human-to-machine interaction and advances in machine learning, it is set to revolutionize the Voice over IP space.
Note: some people confuse natural language processing with neuro-linguistic programming, which belongs to psychology.
NLP evolved from linguistics, the study of language including its semantics, phonetics and grammar. Every language has rules, and NLP uses mathematical formulations to model them. Discrete mathematical formalisms will be discussed later in this article.
Inputs for NLP usually come through conversation, speech, correspondence, reading, print, written composition, dictation, publishing, translation, lip reading, signing, etc.
Rule-based vs statistical NLP – In contrast to rule-based engines, which work on hard preset values (for example, using a decision tree), statistical models work in a more probabilistic fashion, which tends to produce more reliable results even in unfamiliar scenarios.
Linear classifier vs convolutional neural nets – CNNs are a powerful supervised deep learning technique. As opposed to a linear classifier, whose decision boundary in feature space is a linear function, a CNN increases model complexity by adding more layers.
Grammar induction, lemmatization, morphological segmentation, part-of-speech tagging, parsing, sentence breaking, stemming, word segmentation, terminology extraction
lexical, distributional, machine translation, named entity recognition (NER), natural language understanding and generation, relationship establishment, sentiment analysis, word sense disambiguation, OCR (optical character recognition), recognizing textual entailment
speech recognition, speech segmentation, text to speech, dialogues
Of the above, it is worth pointing out a few key techniques.
Parts of speech (POS)
A primary task in NLP is to extract tokens and sentences, identify parts of speech (such as nouns, verbs and adjectives) and create parse trees.
POS tagging is the process of marking up each word in a corpus with its corresponding part-of-speech tag. The tags also help lemmatizers, which reduce a word to its root form.
POS methods differ significantly from bag-of-words (BOW) methods, which disregard semantic relationships and take into account only words and their frequencies, whereas POS tagging takes context and definition into consideration.
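A short sketch makes the bag-of-words limitation visible: two sentences with opposite meanings produce identical bags because only word counts survive. The tokenizer here is a deliberately simple regular expression, not a production one.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lower-case the text, split it into word tokens and count them."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


first = bag_of_words("The dog bit the man")
second = bag_of_words("The man bit the dog")

print(first)
print("Identical bags:", first == second)
```

Word order, and with it who did what to whom, is lost. That is the context POS tagging and parsing try to keep.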
POS tagging techniques include lexical, rule-based, probabilistic and deep learning methods.
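As a rough illustration of the lexical and rule-based end of that range, the sketch below looks each word up in a tiny lexicon and falls back to suffix rules. The lexicon, the rules and the tag names are all invented for this example. Real taggers use far larger resources or learned models.

```python
# Hypothetical mini-lexicon: word -> tag.
LEXICON = {"the": "DET", "a": "DET", "i": "PRON", "is": "VERB", "on": "ADP"}

# Hypothetical suffix rules, checked in order when the lexicon has no entry.
SUFFIX_RULES = [("ing", "VERB"), ("ed", "VERB"), ("ly", "ADV"), ("tion", "NOUN")]


def pos_tag(tokens):
    """Tag each token by lexicon lookup, then suffix rule, then default to NOUN."""
    tagged = []
    for token in tokens:
        word = token.lower()
        if word in LEXICON:
            label = LEXICON[word]
        else:
            label = next(
                (tag for suffix, tag in SUFFIX_RULES if word.endswith(suffix)),
                "NOUN",
            )
        tagged.append((token, label))
    return tagged


print(pos_tag("The engineer quickly designed a gateway".split()))
```

A tagger like this never looks at neighbouring words, so it cannot tell "record" the noun from "record" the verb. Probabilistic and deep learning taggers handle that by using context.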
Named entity recognition (NER)
Given a stream of text, determine which items in the text map to proper names, such as people or places, and what type each one is, such as person, location or organization. An example on raw text, analysed using spaCy, is given below.
“Hello ! My name is Atanai and I work on Solution design and architecture, developed many custom WebRTC and SIP based solutions such as telecom applications, media stream inetgration into IOT,Unified communication-collaboration ,signalling gateways ,SBC etc. I passed out from Anna university with Betch degree in 2011 and currenlty stay in Bangalore India.”
The resulting analysis is
Noun phrases: ['My name', 'Atanai', 'I', 'Solution design', 'architecture', 'many custom', 'WebRTC and SIP based solutions', 'telecom applications', 'media stream integration', 'IOT', 'Unified communication-collaboration', 'signalling gateways', 'I', 'Anna university', 'Betch degree', 'currently stay', 'Bangalore India']
Verbs: ['be', 'work', 'develop', 'base', 'signal', 'pass']
Bangalore India LOC
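The output above comes from a trained statistical pipeline. To show only the shape of the task without any external dependency, here is a much cruder stand-in: a dictionary (gazetteer) lookup that maps known phrases to entity types. The gazetteer entries are hypothetical, and the ORG label for the university is an assumption rather than something from the output above.

```python
import re

# Hypothetical gazetteer: known phrase -> entity type.
GAZETTEER = {
    "Bangalore India": "LOC",
    "Anna university": "ORG",
}


def find_entities(text):
    """Return (phrase, label) pairs for gazetteer phrases, in order of appearance."""
    found = []
    for phrase, label in GAZETTEER.items():
        for match in re.finditer(re.escape(phrase), text):
            found.append((match.start(), match.group(), label))
    return [(phrase, label) for _, phrase, label in sorted(found)]


sample = "I passed out from Anna university and currently stay in Bangalore India."
for phrase, label in find_entities(sample):
    print(phrase, label)
```

A lookup like this cannot recognise names it has never seen, which is the gap that trained NER models fill.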
Sentiment analysis – understand the overall opinion, feeling or attitude expressed in the given media (speech, text or video).
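One of the simplest ways to approximate this is a lexicon-based scorer that counts positive and negative words. The word lists below are invented for illustration. A trained model would normally replace this, but the sketch shows the input and output of the task.

```python
import re

# Hypothetical sentiment lexicons for call-quality feedback.
POSITIVE = {"good", "great", "clear", "helpful", "excellent"}
NEGATIVE = {"bad", "poor", "dropped", "choppy", "terrible"}


def sentiment_score(text):
    """Positive-word count minus negative-word count."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum((token in POSITIVE) - (token in NEGATIVE) for token in tokens)


def sentiment_label(text):
    """Map the score to a coarse label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment_label("The call quality was great and the agent was helpful"))
print(sentiment_label("The audio was choppy and the call dropped"))
```

Word counting misses negation and sarcasm ("not great"), which is one reason trained models are preferred in practice.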
NLP in action
Steps to obtain insights and relevant information from an unclassified document, raw text file or speech-to-text content such as a recording of a VoIP meeting
Step 1: Upload a document, which could be an invoice, order, feedback, complaint or any other unstructured raw text
Step 2: Collect the data from the document
use OCR (optical character recognition) for handwritten or signed components
perform search, indexing, duplicate detection, etc.
can use MNIST database as
phrase matching and vocabulary
can use translation APIs to translate from other languages
Step 3: Collect meaningful data
perform part-of-speech (POS) tagging and chunking
topic discovery and modelling
tokenization and text classification; obtain domain-specific entities from the document
can use a standard language model to collect relevant, frequently used words
NER (named entity recognition) to validate names, places and locations
can extract time and date from the mentioned entities
build relationship graphs
Step 4: Extract sentiments using a trained model
utilize regular expressions for pattern searching
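As a small sketch of the regular-expression step, the snippet below pulls dates and times out of raw text. It assumes ISO-style dates (YYYY-MM-DD) and 24-hour times (HH:MM). Real documents use many more formats, so these patterns are only a starting point.

```python
import re

# Assumed formats: ISO-style dates and 24-hour clock times.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
TIME_PATTERN = re.compile(r"\b(?:[01]?\d|2[0-3]):[0-5]\d\b")


def extract_dates_and_times(text):
    """Return every date-like and time-like string found in the text."""
    return {
        "dates": DATE_PATTERN.findall(text),
        "times": TIME_PATTERN.findall(text),
    }


sample = "Invoice 4471 was raised on 2023-05-14 and discussed on the 10:30 call."
print(extract_dates_and_times(sample))
```

The extracted strings can then be attached to the entities found by NER or used as nodes in the relationship graph.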
Applications of NLP find their way into many domains
1. VoIP platforms and media servers: automatic summarization of conferences and meetings, such as “Minutes of Meeting”, to highlight the key takeaways from a VoIP session.
2. Automatic essay assessment and scripting in educational settings.
3. Image annotation using metadata describing digital images, for categorization and easy retrieval based on keywords.
4. Spam filtering
5. Building automatic assistants and chatbots with speech recognition, and auto-suggest with sentence completion (Siri, Alexa, Google Voice, etc.)
6. Social media analytics, to track sentiment about a topic and identify influencers, for example in movie or restaurant reviews.
NLP in VoIP systems
Background on the fundamental characteristics of sound waves, on analog wave modulation (how frequency, phase and amplitude are modulated to carry information for propagation) and on digital wave modulation (amplitude, frequency and phase shift keying) is covered in separate articles. This section builds on top of audio streams, whether captured or live.
Classifying Call recordings
Sound waves bear multiple features such as
Pitch – frequency of a sound wave
Frequencies from 20 to 20000 Hz are audible to the human ear, while dogs can hear 50 to 45000 Hz. Frequencies below 20 Hz are infrasound; frequencies above 20000 Hz are ultrasound.
Loudness – amplitude of the sound wave
Amplitude, Frequency, Wavelength And Timbre
Statistical – mean, variance, skewness
zero-crossing rate (ZCR) – number of times in a sound sample that the amplitude of the sound wave changes sign
root-mean-square (RMS) – square root of the mean of the squared sample amplitudes
Harmonic Odd Even Ratio
Mel Frequency Cepstral Coefficient (MFCC)
Bark Scale etc
Based on NLP and models trained on the extracted features, an unknown audio wave can be classified and possibly identified.
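A few of the features listed above can be computed directly from raw samples. The sketch below derives the zero-crossing rate, RMS, mean and variance for a synthetic tone. The 440 Hz tone and 8 kHz sample rate are arbitrary test values, and the function names are illustrative.

```python
import math
import statistics


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for prev, cur in zip(samples, samples[1:]) if (prev < 0) != (cur < 0)
    )
    return crossings / (len(samples) - 1)


def root_mean_square(samples):
    """Square root of the mean of the squared amplitudes."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Hypothetical input: 0.1 s of a 440 Hz sine wave sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(800)]

features = {
    "zcr": zero_crossing_rate(tone),
    "rms": root_mean_square(tone),
    "mean": statistics.fmean(tone),
    "variance": statistics.pvariance(tone),
}
print(features)
```

For a pure tone the zero-crossing rate is close to twice the frequency divided by the sample rate, so it doubles as a rough pitch estimate. A feature vector like this is what a trained classifier would consume.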
A CPaaS (communication platform as a service) is a cloud-based communication platform that provides real-time communication capabilities. It should be easy to integrate with any external environment or application of the customer, without the customer worrying about building backend infrastructure or interfaces.
Traditionally, with intellectual-property-protected protocols, licensed codecs, and the need to maintain a signalling protocol stack and network interfaces, building a communication platform was a costly affair. Cisco, FaceTime and Skype were the only OTT (over-the-top) players taking away from telcos' call revenue.
However, with the advent of standardized, open source protocols and codecs, plenty of CPaaS providers have crowded the market, creating more supply than there is demand. A customer wanting to quickly integrate real-time communications into a platform has many options to choose from. This article provides an insight into how CPaaS solutions are architected and programmed.
Call server and media server that can be interacted with via a user agent (UA)
Communication clients such as SIP phones and WebRTC clients, and SDKs (software development kits) or libraries for desktop, embedded and/or mobile platforms.
APIs that can trigger automated calls and perform preprogrammed routing.
Rich documentation and samples to build various apps such as call centre solutions, interactive auto-attendants using IVR and DTMF, conference solutions, etc.
Some CPaaS providers also add features like transcription, transcoding, recording and playback to gain an edge over other CPaaS providers.
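The preprogrammed routing behind an IVR auto-attendant can be pictured as a lookup from the DTMF digit a caller presses to a destination. The sketch below is a toy model of that idea only. The menu entries, queue names and fallback action are invented and do not correspond to any provider's API.

```python
# Hypothetical IVR menu: DTMF digit -> destination.
IVR_MENU = {
    "1": "sales_queue",
    "2": "support_queue",
    "0": "operator",
}


def route_call(dtmf_digit, menu=IVR_MENU, fallback="replay_menu"):
    """Return the destination for a pressed digit, or the fallback action."""
    return menu.get(dtmf_digit, fallback)


for digit in ["1", "2", "9"]:
    print(digit, "->", route_call(digit))
```

On a real platform the same table would typically be expressed through the provider's call-control API or markup, with the media server collecting the digits.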
Advantages of using a CPaaS vs building your own RTC platform
Tech insights and experiences
Companies that have been catering to the telco and communication domain build robust solutions based on industry best practices, which beat a novice solution built in a fortnight any day.
Keeping up with emerging trends
Market trends such as new codecs, rich communication services, multi-tenancy, contextual communication, NLP and other ML-based enhancements are provided by the CPaaS company.
Auto Scaling, High Availability
A firm specializing in CPaaS solutions has already thought through clustering and autoscaling to meet peak traffic requirements, and backup/replication on standby servers that activate in case of failure.
CAPEX and OPEX
Using a CPaaS saves on human resources, infrastructure and time to market. It saves tremendously on underlying IT infrastructure and often provides flexible pricing models.
In a nutshell, I have come across many small startups that try to build a CPaaS solution from scratch, only to realise after weeks of working on an MVP that they are stuck with firewall, NAT, media quality or interoperability issues. Since so many solutions are already on the market, it is better to use one as the underlying layer and build application services, such as call centre or CRM services, on top of it.
Solution design and architecture, developed many custom WebRTC and SIP based solutions such as telecom applications, surveillance, IOT, Unified communication-collaboration , signalling gateways , SBC , soft turrets
Developed use cases on Machine Learning and Computer vision for VoIP and Media streaming platforms including - NLP , Image processing and Real Time Video Analytics etc
Core engineer for Plivo SIP Trunking “Zentrunk”
Core member of the team that architected and built IPC Unigy 360
Media broadcasting and transcoding for scalable Live Streaming Solutions
IMS integration( SIP, RTP, RTCP and related 3GPP IMS protocols )
Video Telephony on LTE/VOLTE , BLE and WiFi radio technologies
Built the framework and platform for Jiyo
Invented RamuDroid, an IoT road-cleaning robot
Ardent contributor to open source software such as TFX and blockchain VoIP, among many others.
Author of the book “ WebRTC Integrator's Guide“ by Packt Publishing
Frequent tech blogger and contributor to forums and communities, with many white papers and conference presentations to my credit, including an IEEE paper on Hybrid micro Grid.
Applied for 2 patents, one for an e-learning platform and another for image processing and mechanical design of a specific-purpose robot
Speaker at many tech conferences and meetups (jQuery Conf, Grace Hopper, Kranky Geek, Hackaday, Barcamp, Blockchain Summit, Plivo Tech Conference, etc.) and winner of many hackathons. Jury member for Smart India Hackathon.
Working on my second book titled “Innovations in Voice Over IP and its practical applications”.
“Multimedia Conferencing“ US patent - US20180284957A1