Metrics for monitoring a VoIP call can be obtained from any node in the media path of the call flow. They are mainly used for analysis through calculation and aggregation, and sometimes for real-time performance tracking and rectification as well.
Rating Factor (R-Factor) and Mean Opinion Score (MOS) are two commonly-used measurements of overall VoIP call quality.
R-Factor: A value derived from metrics such as latency, jitter, and packet loss per ITU‑T Recommendation G.107. It assesses the quality of experience for VoIP calls on your network. Typical scores range from 50 (bad) to 90 (excellent).
An R-Factor of 90 corresponds to a MOS of 4.3 (excellent).
An R-Factor of 50 corresponds to a MOS of 2.6 (bad).
MOS: A value derived from the R-Factor per ITU‑T Recommendation G.107 that measures VoIP call quality. PacketShaper measures MOS on a scale of 10-50. To convert to a standard MOS score (which uses a scale of 1-5), divide the PacketShaper MOS value by 10.
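As a rough illustration of how MOS follows from the R-Factor, the sketch below uses the R-to-MOS conversion curve published with the ITU-T G.107 E-model. The function names are made up for this example, and the PacketShaper reading of 43 is a hypothetical value.

```python
def r_factor_to_mos(r: float) -> float:
    """Map an R-Factor to an estimated MOS using the G.107 conversion curve."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6
    # The raw curve dips slightly below 1 for very small R, so clamp it.
    return max(1.0, mos)


def packetshaper_to_standard_mos(value: float) -> float:
    """Convert a PacketShaper MOS reading (scale 10-50) to the 1-5 scale."""
    return value / 10


print(round(r_factor_to_mos(90), 1))     # 4.3
print(round(r_factor_to_mos(50), 1))     # 2.6
print(packetshaper_to_standard_mos(43))  # 4.3
```

The two printed R-Factor results match the 90 / 4.3 and 50 / 2.6 pairs quoted above.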
ITU: The International Telecommunication Union is the United Nations specialised agency in the field of telecommunications, information and communication technologies (ICTs).
ITU-T: The ITU Telecommunication Standardisation Sector is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardising telecommunications on a worldwide basis.
MOS (Mean Opinion Score)
MOS is the terminology for audio, video and audiovisual quality expressions defined in ITU-T P.800.1. It refers to listening, talking or conversational quality, whether the scores originate from subjective or objective models.
It also provides identifiers for the audio bandwidth, the type of interface (electrical or acoustical) and the video resolution, such as:
MOS-AVQE for audiovisual quality;
MOS-CQE for estimated conversational quality;
MOS-LQE for listening quality;
MOS-TQE for talking quality;
MOS-VQE for video quality.
For Audio Signal Speech Quality / AV
– N denotes audio signals up to narrowband (300-3400 Hz)
– W is for audio signals up to wideband (50-7000 Hz)
– S is for audio signals up to super-wideband (20-14000 Hz)
– F is for fullband audio signals (10-20000 Hz)
For Listening Quality / LQO
Electrical measurements are performed at electrical interfaces only. In order to predict the listening quality as perceived by the user, assumptions for the terminals are made in terms of intermediate reference system (IRS) or corrected IRS frequency response. A sealed condition between the handset receiver and the user’s ear is assumed.
Acoustical measurements are performed at acoustical interfaces. In order to predict the listening quality as perceived by the user, this measurement includes the actual telephone set products provided by the manufacturer or vendor. Depending on the choice of the acoustical receiver in the laboratory test, there will be a more or less leaky condition between the handset’s receiver and the artificial ear.
Conversational Quality / CQ
It is calculated as the arithmetic mean of subjective judgments on a 5-point ACR quality scale.
Talking Quality / TQ
This describes the quality of a telephone call as it is perceived by the talking party only. Factors affecting TQ include echo, background noise, double talk, etc. It is calculated as the arithmetic mean of judgments on a 5-point ACR quality scale.
Video Quality / VQ
To account for differences in perceived quality between mobile and fixed devices and to allow proper handling of different use cases, the following identifiers are used:
– M for mobile screen such as a smartphone or tablet (approximately 25 cm or less)
– T for PC/TV monitors
It is calculated as the arithmetic mean of subjective judgments, typically on a 5-point quality scale.
Audio Visual Quality / AVQ
This refers to the quality of an audiovisual stream under the corresponding networking conditions. It is also calculated as the arithmetic mean of judgments on a 5-point ACR quality scale.
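The conversational, talking, video and audiovisual scores above all reduce to the same arithmetic: average the individual judgments on the 5-point ACR scale. A minimal sketch, where the function name and the sample ratings are hypothetical:

```python
def mean_opinion_score(ratings):
    """Return the arithmetic mean of ACR ratings (integers 1 to 5)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if rating not in (1, 2, 3, 4, 5):
            raise ValueError(f"not a valid ACR rating: {rating!r}")
    return sum(ratings) / len(ratings)


# Hypothetical panel of five listeners rating the same call.
print(mean_opinion_score([4, 5, 3, 4, 4]))  # 4.0
```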
Other parameters contributing to VoIP metric analysis
Latency is the time required for packets to travel from one end to the other, in milliseconds. If the sum of measured latency is 800 ms and the number of latency samples is 20, then the average latency is 40 ms. The headers of RTP packets carry timestamps, which can later also be used to calculate round-trip time.
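The averaging step can be written out as a short sketch. The function name and the sample values are hypothetical; the samples are chosen so that 20 of them sum to 800 ms, as in the example above.

```python
def average_latency_ms(samples_ms):
    """Average a list of one-way latency samples given in milliseconds."""
    if not samples_ms:
        raise ValueError("at least one latency sample is required")
    return sum(samples_ms) / len(samples_ms)


samples = [30, 50] * 10             # 20 samples, 800 ms in total
print(average_latency_ms(samples))  # 40.0
```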
Packet loss is the percentage of lost packets, calculated per RFC 3550 using RTP header sequence numbers.
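A simplified sketch of that calculation follows. It mirrors the RFC 3550 idea of comparing the number of packets expected (highest extended sequence number minus the first one, plus one) with the number actually received, including 16-bit wraparound. The function name is hypothetical, and details such as probation and restart handling from the RFC are left out.

```python
def packet_loss_percent(sequence_numbers):
    """Estimate loss from 16-bit RTP sequence numbers in arrival order."""
    if not sequence_numbers:
        return 0.0
    base_seq = sequence_numbers[0]
    max_seq = base_seq
    cycles = 0  # grows by 65536 each time the sequence number wraps
    received = 0
    for seq in sequence_numbers:
        received += 1
        delta = (seq - max_seq) % 65536
        if 0 < delta < 32768:  # packet is ahead of the highest one seen
            if seq < max_seq:  # the 16-bit counter wrapped around
                cycles += 65536
            max_seq = seq
    expected = cycles + max_seq - base_seq + 1
    lost = max(expected - received, 0)  # duplicates could make this negative
    return 100.0 * lost / expected


print(packet_loss_percent([100, 101, 103, 104]))  # 20.0
```

In the usage line, sequence number 102 never arrives, so one of five expected packets is lost.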
Jitter is the variation in the delay of received packets in a flow, measured by comparing the interval at which RTP packets were sent to the interval at which they were received.
For instance, if packet #1 and packet #2 leave 30 milliseconds apart and arrive 50 milliseconds apart, then the jitter is 20 milliseconds.
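The comparison of send and receive intervals can be sketched as below. The first function reproduces the per-packet difference from the example; the second applies the running estimate from RFC 3550, which smooths each difference with a gain of 1/16. The RFC smoothing is an addition not described in this article, and the function names and timestamps are hypothetical.

```python
def interval_differences(send_ms, recv_ms):
    """Per-packet difference between receive spacing and send spacing."""
    return [
        (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        for i in range(1, len(send_ms))
    ]


def smoothed_jitter(send_ms, recv_ms):
    """Running interarrival jitter estimate in the style of RFC 3550."""
    jitter = 0.0
    for diff in interval_differences(send_ms, recv_ms):
        jitter += (abs(diff) - jitter) / 16
    return jitter


# Two packets sent 30 ms apart that arrive 50 ms apart.
print(interval_differences([0, 30], [100, 150]))  # [20]
print(smoothed_jitter([0, 30], [100, 150]))       # 1.25
```

The smoothed value is much smaller than 20 ms after a single packet pair because the estimator only moves one sixteenth of the way toward each new difference.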
Methods for objective and subjective assessment of speech and video quality.
Mapping R-value to calculate MOS
To map the R value to MOS using the metrics defined above, a standard formula is used. First, the latency and jitter are added together with a defined value for computation time, resulting in the effective latency.
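One commonly quoted simplified approximation of this step is sketched below. The weights are assumptions that do not come from this article or from G.107 itself: jitter is counted twice, 10 ms is added for computation time, the base R value is 93.2, and each percent of packet loss costs 2.5 points. All names are hypothetical.

```python
def effective_latency_ms(latency_ms, jitter_ms,
                         jitter_weight=2.0, processing_ms=10.0):
    """Latency plus weighted jitter plus a fixed computation allowance."""
    return latency_ms + jitter_weight * jitter_ms + processing_ms


def approximate_r_factor(latency_ms, jitter_ms, loss_percent):
    """Simplified R estimate from latency, jitter and packet loss."""
    eff = effective_latency_ms(latency_ms, jitter_ms)
    if eff < 160:
        r = 93.2 - eff / 40
    else:
        r = 93.2 - (eff - 120) / 10
    r -= 2.5 * loss_percent           # assumed penalty per percent of loss
    return max(0.0, min(100.0, r))    # keep R within 0 to 100


print(effective_latency_ms(40, 20))                 # 90.0
print(round(approximate_r_factor(40, 20, 1.0), 2))  # 88.45
```

The resulting R value is then the input to the R-to-MOS mapping this section describes.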
This article focuses on setting up Sipwise rtpengine to proxy RTP traffic from a Kamailio app server. This is an updated version of the old article.
RTPengine is a proxy for RTP traffic and other UDP-based media traffic over either IPv4 or IPv6. It can even bridge between different IP networks and interfaces, and it can set the TOS/QoS field. It is multi-threaded and can advertise different addresses for operation behind NAT.
It offers in-kernel packet forwarding for low-latency and low-CPU performance.
Bridging between ICE-enabled and ICE-unaware user agents
Optionally acting only as additional ICE relay/candidate
Optionally forcing relay of media streams by removing other ICE candidates
SRTP (RFC 3711):
Support for SDES (RFC 4568) and DTLS-SRTP (RFC 5764)
AES-CM and AES-F8 ciphers, both in userspace and in kernel
HMAC-SHA1 packet authentication
Bridging between RTP and SRTP user agents
RTCP profile with feedback extensions (RTP/AVPF, RFC 4585 and 5124)
Arbitrary bridging between any of the supported RTP profiles (RTP/AVP, RTP/AVPF, RTP/SAVP, RTP/SAVPF)
RTP/RTCP multiplexing (RFC 5761) and demultiplexing
Breaking of BUNDLE’d media streams (draft-ietf-mmusic-sdp-bundle-negotiation)
Recording of media streams, decrypted if possible
Transcoding and repacketization
Playback of pre-recorded streams/announcements
Sipwise NGCP RTP Engine Source Code
The source structure of the Sipwise NGCP (Next Generation Communication Platform) rtpengine has 3 parts:
The userspace daemon and workhorse, the minimum requirement for anything to work. Running make will compile the binary, which will be called rtpengine.
The following packages, including their development headers, are required to compile the daemon:
GLib including GThread and GLib-JSON version 2.x
XMLRPC-C version 1.16.08 or higher
libcurl version 3.x or 4.x
libevent version 2.x
MySQL or MariaDB client library (optional for media playback and call recording daemon)
libiptc library for iptables management (optional)
ffmpeg codec libraries for transcoding (optional) such as libavcodec, libavfilter, libswresample
bcg729 for full G.729 transcoding support (optional)
Options for make: with_iptables_option, with_transcoding
The iptables extension is required for in-kernel packet forwarding. With the iptables development headers installed, issuing make will compile the plugin for iptables and ip6tables. The file will be called libxt_RTPENGINE.so and needs to be copied into the xtables module directory. The location of this directory can be determined through pkg-config xtables --variable=xtlibdir on newer systems, and/or is usually either /lib/xtables/ or /usr/lib/x86_64-linux-gnu/xtables/.
The kernel module is required for in-kernel packet forwarding. Compilation of the kernel module requires the kernel development headers to be installed in /lib/modules/$VERSION/build/, where $VERSION is the output of the command uname -r.
Successful compilation of the module will produce the file xt_RTPENGINE.ko. The module can be inserted into the running kernel manually through insmod xt_RTPENGINE.ko.
It is recommended to copy the module into /lib/modules/$VERSION/updates/, followed by running depmod -a.
After this, the module can be loaded by issuing modprobe xt_RTPENGINE.
To avoid the overhead involved in processing each individual RTP packet in userspace-only operation, especially as RTP traffic consists of many small packets at high rates, rtpengine provides a kernel module to offload the bulk of the packet forwarding duties from user space to kernel space. As CPU usage decreases, this also increases the number of concurrent calls that can be handled. In-kernel packet forwarding is implemented as an iptables module (x_tables) and has 2 parts: the xt_RTPENGINE kernel module and a plugin for the iptables and ip6tables command-line utilities.
The sequence of events for a newly established media stream is then:
Kamailio as SIP proxy controls rtpengine and signals it about a newly established call.
Rtpengine daemon allocates local UDP ports and sets up preliminary forward rules based on the info received from the SIP proxy.
An RTP packet is received on the local port.
It traverses the iptables chains and gets passed to the xt_RTPENGINE module.
The module doesn’t recognize it as belonging to an established stream and thus ignores it.
The packet continues normal processing and eventually ends up in the daemon’s receive queue.
The daemon reads it, processes it and forwards it. It also updates some internal data.
This userspace-only processing and forwarding continues for a little while, during which time information about additional streams and/or endpoints may be obtained from the SIP proxy.
After a few seconds, when the daemon is satisfied with what it has learned about the media endpoints, it pushes the forwarding rules to the kernel.
From this moment on, the kernel module will recognize incoming packets belonging to those streams and will forward them on its own. It will stop those packets from traversing the network stacks any further, so the daemon will not see them any more on its receive queues.
In-kernel forwarding is allowed to cease to work at any given time, either accidentally (e.g. by removal of the iptables rule) or deliberately (the daemon will do so in case of a re-invite), in which case forwarding falls back to userspace-only operation.
The Kernel Module
The kernel module supports multiple forwarding tables, identified through their ID number, by default 0 to 63.
Each running instance of the rtpengine daemon controls one such table. To load the module use modprobe xt_RTPENGINE, and to unload it use rmmod xt_RTPENGINE. With the module loaded, a new directory will appear in /proc/, namely /proc/rtpengine/, containing the pseudo-files control (to create and delete forwarding tables) and list (the list of currently active forwarding tables).
To manually create a forwarding table with ID 33, the following command can be used:
echo 'add 33' > /proc/rtpengine/control
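The same control-file write can be scripted. The sketch below is only an illustration: the helper names are hypothetical, the 0 to 63 range check follows the default table IDs mentioned above, and the del keyword for deleting a table is taken from the upstream rtpengine documentation rather than from this article. Writing to the real file requires root and a loaded xt_RTPENGINE module.

```python
from pathlib import Path

CONTROL_FILE = Path("/proc/rtpengine/control")


def table_command(action: str, table_id: int) -> str:
    """Build the text written to the control pseudo-file."""
    if action not in ("add", "del"):
        raise ValueError(f"unsupported action: {action!r}")
    if not 0 <= table_id <= 63:
        raise ValueError("table ID must be between 0 and 63")
    return f"{action} {table_id}"


def send_table_command(action: str, table_id: int,
                       control_file: Path = CONTROL_FILE) -> None:
    """Write the command followed by a newline, like echo does."""
    control_file.write_text(table_command(action, table_id) + "\n")


print(table_command("add", 33))  # add 33
```

Calling send_table_command("add", 33) would then have the same effect as the echo command.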
The iptables module
In order for the kernel module to be able to actually forward packets, an iptables rule must be set up to send packets into the module. Each such rule is associated with one forwarding table. In the simplest case, for forwarding table 33, this can be done through:
iptables -I INPUT -p udp -j RTPENGINE --id 33
The rules can be restricted to the UDP port range used by rtpengine, e.g. by supplying a parameter like --dport 30000:40000. If the kernel module receives a packet that it doesn’t recognize as belonging to an active media stream, it will simply ignore it and hand it back to the network stack for normal processing.
A typical start-up sequence including in-kernel forwarding might look like this:
To run multiple instances of rtpengine on the same machine, start multiple instances of the daemon with different command-line options (local addresses and listening ports), together with multiple different kernel forwarding tables.
For example, if one local network interface has address 10.64.73.31 and another has address 192.168.65.73, then the start-up sequence might look like this:
With this setup, the SIP proxy can choose which instance of rtpengine to talk to and thus which local interface to use by sending its control messages to either port 2223 or port 2224.
Currently transcoding is supported for audio streams. It can be turned off with the with_transcoding=no make option.
Normally rtpengine leaves codec negotiation up to the clients involved in the call and does not interfere. In this case, if the clients fail to agree on a codec, the call will fail.
Transcoding is controlled through options in the ng control protocol, such as transcode or ptime. If a codec is requested via the transcode option that was not originally offered, transcoding will be engaged for that call. With transcoding active for a call, all unsupported codecs will be removed from the SDP.
Transcoding happens in userspace only, so in-kernel packet forwarding will not be available for transcoded codecs. Codecs that are supported by both sides will simply be passed through transparently (unless repacketization is active). In-kernel packet forwarding will still be available for these codecs.
The codecs supported by rtpengine can be shown with the --codecs option:
PCMA: fully supported
PCMU: fully supported
G723: fully supported
G722: fully supported
QCELP: supported for decoding only
G729: supported for decoding only
speex: fully supported
GSM: fully supported
iLBC: not supported
opus: fully supported
vorbis: codec supported but lacks RTP definition
ac3: codec supported but lacks RTP definition
eac3: codec supported but lacks RTP definition
ATRAC3: supported for decoding only
ATRAC-X: supported for decoding only
AMR: supported for decoding only
AMR-WB: supported for decoding only
PCM-S16LE: codec supported but lacks RTP definition
PCM-U8: codec supported but lacks RTP definition
MP3: codec supported but lacks RTP definition
ng Control Protocol
The ng control protocol is an advanced control protocol that passes the SDP body from the SIP proxy to the rtpengine daemon, has the body rewritten in the daemon, and then passes it back to the SIP proxy to embed into the SIP message. It is based on the bencode standard and runs over UDP transport.
Each message passed between the SIP proxy and the media proxy consists of two parts: a message cookie (used to match requests to responses and to detect retransmissions) and a bencoded dictionary.
The dictionary of each request must contain at least one key called command. The corresponding value must be a string and determines the type of message. Currently the following commands are defined:
The response dictionary must contain at least one key called result. The value can be either ok (optionally accompanied by the key warning) or error (accompanied by error-reason). For the ping command, the additional value pong is allowed.
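To make the message layout concrete, here is a minimal sketch of a bencode encoder and decoder wrapped into ng-style messages. The function names and the cookie value are hypothetical. The single space between cookie and dictionary follows the upstream rtpengine documentation and is not stated in this article. No UDP socket is opened; the sketch only builds and parses the bytes.

```python
def bencode(value) -> bytes:
    """Encode ints, strings, bytes, lists and dicts as bencode."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode()
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        items = [(k.encode() if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])  # bencode dicts use sorted keys
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value; return (value, next_position)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            val, pos = bdecode(data, pos)
            result[key.decode()] = val
        return result, pos + 1
    colon = data.index(b":", pos)  # byte string: <length>:<bytes>
    start = colon + 1
    length = int(data[pos:colon])
    return data[start:start + length], start + length


def build_request(cookie: str, command: str, **params) -> bytes:
    """Cookie, one space, then the bencoded request dictionary."""
    return cookie.encode() + b" " + bencode({"command": command, **params})


def parse_response(message: bytes):
    """Split a response into its cookie and decoded dictionary."""
    cookie, _, body = message.partition(b" ")
    payload, _ = bdecode(body)
    return cookie.decode(), payload


print(build_request("5323_1", "ping"))             # b'5323_1 d7:command4:pinge'
print(parse_response(b"5323_1 d6:result4:ponge"))  # ('5323_1', {'result': b'pong'})
```

String values come back as bytes from the decoder, so a caller would compare the result against b"ok", b"error" or b"pong".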