VoIP Call Metric Monitoring and MOS (Mean Opinion Score)

Metrics for monitoring a VoIP call can be obtained from any node in the media path of the call flow. These metrics are essential for analysis via calculation and aggregation, and are increasingly used for real-time performance tracking and quality rectification. Comprehensive monitoring ensures optimal voice quality across all call legs and network conditions.

RTP Fundamentals

RTP (Real-time Transport Protocol) provides real-time media streams with payload type identification, packet sequencing, and timestamping headers. It forms the foundation of voice and video transmission over IP networks.

Source: RFC 3550 – RTP: A Transport Protocol for Real-Time Applications

Key RTP Header Fields

Sequence Number: Tracks the incremental succession of incoming packets from the sender and identifies out-of-order delivery
Timestamp: Used by the receiver to play back received samples at the appropriate time and interval, ensuring proper synchronization
Synchronization Source (SSRC): Identifies the synchronization source within the RTP session
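
To make the header fields above concrete, here is a minimal sketch that decodes the 12-byte fixed RTP header laid out in RFC 3550. The function name `parse_rtp_header` and the sample packet values are illustrative assumptions, not taken from any particular library.

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Decode the 12-byte fixed RTP header (RFC 3550, section 5.1)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    first, second, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": first >> 6,
        "marker": bool(second >> 7),
        "payload_type": second & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }

# Hypothetical packet: version 2, payload type 0, 160 bytes of payload.
sample = struct.pack("!BBHII", 0x80, 0x00, 1000, 160000, 0x1234ABCD) + b"\x00" * 160
header = parse_rtp_header(sample)
print(header)
```

A monitoring node would typically feed the decoded sequence number and timestamp into loss and jitter calculations, and use the SSRC to keep per-stream state.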

Call Legs in VoIP Sessions

In a typical VoIP call, there are two media paths:

Leg A: Between Caller and RTP Proxy
Leg B: Between RTP Proxy and Callee

Each leg is monitored independently for quality metrics and may experience different network conditions.

RTCP Protocol

RTCP (RTP Control Protocol) provides detailed monitoring of streams to session participants with statistical data and enhanced metrics for Quality of Service (QoS) and synchronization.

Source: RFC 3550 – RTCP Specification

RTCP Report Types

SR (Sender Report): Sent by active senders containing:
- Sender packet and octet counts
- Timestamp correlation for synchronization
- Reception quality feedback
RR (Receiver Report): Sent by receivers with:
- Fraction of packets lost
- Cumulative packet loss count
- Highest sequence number received
- Interarrival jitter
- Last Sender Report (LSR) timestamp
- Delay since Last Sender Report (DLSR)
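
The Receiver Report fields listed above travel in a 24-byte report block. The sketch below decodes one such block following the RFC 3550 layout; the function name `parse_report_block` and the sample values are assumptions made for illustration.

```python
import struct

def parse_report_block(block: bytes) -> dict:
    """Decode one 24-byte RTCP report block (RFC 3550, section 6.4.1)."""
    if len(block) < 24:
        raise ValueError("RTCP report block must be at least 24 bytes")
    ssrc, loss_word, highest_seq, jitter, lsr, dlsr = struct.unpack("!IIIIII", block[:24])
    cumulative_lost = loss_word & 0xFFFFFF
    if cumulative_lost & 0x800000:        # sign-extend the 24-bit signed counter
        cumulative_lost -= 1 << 24
    return {
        "ssrc": ssrc,
        "fraction_lost": (loss_word >> 24) / 256.0,  # 8-bit fixed-point fraction
        "cumulative_lost": cumulative_lost,
        "highest_seq": highest_seq,                   # extended highest sequence number
        "jitter": jitter,                             # in RTP timestamp units
        "lsr": lsr,                                   # middle 32 bits of the SR NTP timestamp
        "dlsr_seconds": dlsr / 65536.0,               # DLSR is in units of 1/65536 s
    }

# Hypothetical block: 25% fraction lost, 10 packets lost in total, DLSR of 0.5 s.
sample_block = struct.pack("!IIIIII", 0x1234ABCD, (64 << 24) | 10, 5000, 80, 0xAABB0000, 32768)
report = parse_report_block(sample_block)
print(report)
```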

Key RTCP Metrics

Packet Loss Rate: Percentage of packets lost during transmission
Packet Discard Rate: Percentage of packets discarded due to late arrival
Round Trip Time (RTT): Time for a packet to travel to destination and back
R-Factor: Numerical rating of voice quality (0-100 scale)
MOS-LQ: Mean Opinion Score for Listening Quality
MOS-CQ: Mean Opinion Score for Conversation Quality
Jitter Buffer Metrics: Current and maximum delay values

Further Reading: VoIP Metrics: RTP and RTCP

VoIP call quality metrics

RTP provides real-time media streams with payload type identification, packet sequencing, and timestamping headers.

sequence number: tracks the incremental succession of incoming packets from the sender and detects out-of-order delivery.
timestamp: used by the receiver to play back the received samples at the appropriate time and interval.

Note that all Synchronization Source (SSRC) identifier fields denote the synchronization source within the RTP session, such as both legs of a call session.

Certain aspects of RTP media and its RTCP metrics were discussed earlier. You can read more about RTCP and RTCP/AVPF here: RealTime Transport protocol (RTP) and RTP control protocol (RTCP)

RealTime Transport protocol (RTP) and supporting protocols

Other call related factors which are not specifically part of RTCP but provide information about call quality are

signal level
noise level
gap density, gap threshold
Burst density
residual echo return loss

Delays such as the following also have a significant influence on VoIP quality:

End system delay
Packetization delay
Setup delay (auth, TLS handshake, accessing mic/camera stream ..)
Queuing delay
Serialization delay
Network latency
End device processing delay, such as CPU load on the end device

It should be noted that, in addition to these values, which can be calculated algorithmically and with high precision, there are more subjective quality parameters that can only be evaluated manually (i.e. with a person listening on both ends), such as:

Robot voice
Perceptible sound but annoying speech quality

Quality Factors

Acoustic and Noise Parameters

Signal Level: Amplitude of the audio signal (-dBm or dBFS)
Noise Level: Background noise measurement and floor
Gap Density & Threshold: Frequency and duration of silence gaps
Burst Density: Concentration of packet loss events in time
Residual Echo Return Loss: Echo cancellation effectiveness (higher is better)

Delay Components

VoIP quality is significantly affected by various delay factors that accumulate throughout the media path:

End System Delay: Processing time at endpoints (codec, jitter buffer, DSP)
Packetization Delay: Time to accumulate audio samples into packets
Setup Delay: Time for authentication, TLS handshake, and media stream initialization
Queuing Delay: Time spent waiting in network buffers
Serialization Delay: Time to transmit a packet onto the physical link
Network Latency: Propagation delay through the network infrastructure
End Device Processing Delay: CPU processing time at the endpoint device

Subjective Quality Parameters

Some quality aspects can only be evaluated through subjective listening tests:

Robot Voice: Synthetic or unnatural speech characteristics
Annoying Speech Quality: Perceptible but irritating audio artifacts
Overall Listening Experience: End-user perception of call quality

Rating Factor (R-Factor)

Rating Factor (R-Factor) and Mean Opinion Score (MOS) are two commonly-used measurements of overall VoIP call quality.

What is R-Factor? This is a value derived from metrics such as latency, jitter, and packet loss per ITU‑T Recommendation G.107. Typical scores range from 50 (bad) to 90 (excellent).

It assesses the quality-of-experience for VoIP calls on your network on a scale from 0 to 100:

R-Factor Range	Quality Rating
90-100	Excellent
80-90	Good
70-80	Fair
60-70	Poor
50-60	Very Poor
0-50	Bad

Examples:

R-factor of 90 → MOS is 4.3 (Excellent)
R-factor of 50 → MOS is 2.6 (Bad)

MOS (Mean Opinion Score)

What is MOS? MOS is derived from the R-Factor per ITU‑T Recommendation G.107 and measures VoIP call quality. PacketShaper measures MOS using a scale of 10-50. To convert to a standard MOS score (which uses a scale of 1-5), divide the PacketShaper MOS value by 10.

MOS is terminology for audio, video and audiovisual quality expressions as per ITU-T P.800.1. It refers to listening, talking or conversational quality, whether they originate from subjective or objective models.

MOS Score	Quality Rating
4.3-5.0	Excellent
4.0-4.3	Very Good
3.6-4.0	Good
3.1-3.6	Fair
2.6-3.1	Poor
1.0-2.6	Bad

MOS Types

MOS is terminology for audio, video, and audiovisual quality expressions. Different variants measure specific aspects:

MOS-LQE: Listening Quality Estimate (one-way audio quality)
MOS-CQE: Conversational Quality Estimate (two-way conversation quality)
MOS-TQE: Talking Quality Estimate (speaker’s perception)
MOS-AVQE: Audiovisual Quality Estimate
MOS-VQE: Video Quality Estimate

Audio Signal Bandwidth Classifications

N (Narrow-band): 300-3400 Hz (traditional telephone)
W (Wide-band): 50-7000 Hz (enhanced voice)
S (Super-wide-band): 20-14000 Hz (high fidelity)
F (Full-band): 10-20000 Hz (ultimate audio quality)

Listening Quality Measurement Types

Electrical Measurement: Done at electrical interfaces using the Intermediate Reference System (IRS) assumptions

Acoustical Measurement: Done at acoustical interfaces using actual telephone set products, providing real-world perception metrics

Conversational Quality (CQ)

Calculated as the arithmetic mean value of subjective judgments on a 5-point ACR (Absolute Category Rating) quality scale, considering both directions of communication.

Talking Quality (TQ)

Describes the quality as perceived by the talking party only. Factors affecting TQ include:

Echo signal presence and magnitude
Background noise perception
Double-talk scenarios

Calculated based on arithmetic mean of 5-point ACR scale judgments.

Video Quality (VQ)

Accounts for differentiation in perceived quality:

M (Mobile): Smartphone/tablet screens (~25 cm or less)
T (TV/PC): Monitor or television displays

Calculated based on arithmetic mean value of subjective judgments on a 5-point quality scale.

Audiovisual Quality (AVQ)

Refers to quality of integrated audio-visual streams under corresponding networking conditions. Also calculated using 5-point ACR scale mean judgments.

Latency

Latency is primarily the time required for packets to travel from one end to the other, in milliseconds. For example, if the sum of measured latency is 800 ms and the number of latency samples is 20, then the average latency is 800 / 20 = 40 ms. RTP packet headers carry timestamps, and RTCP reports carry the LSR and DLSR fields, which can later be used to calculate round-trip time.
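
As a sketch of how round-trip time can be derived from RTCP, the function below applies the RFC 3550 rule: RTT is the arrival time of the Receiver Report minus LSR minus DLSR, with all three values expressed in units of 1/65536 seconds. The function name `rtt_from_receiver_report` is an illustrative assumption; the sample values are the worked example given in RFC 3550.

```python
def rtt_from_receiver_report(arrival: int, lsr: int, dlsr: int) -> float:
    """Return the round-trip time in milliseconds.

    arrival: middle 32 bits of the NTP time at which the report arrived
    lsr:     Last Sender Report timestamp echoed by the receiver
    dlsr:    Delay since Last Sender Report, as reported by the receiver
    All inputs are 32-bit values in units of 1/65536 seconds.
    """
    rtt_units = (arrival - lsr - dlsr) & 0xFFFFFFFF  # wrap like 32-bit arithmetic
    return rtt_units / 65536.0 * 1000.0

# Values from the RFC 3550 example: the result corresponds to 6.125 seconds.
rtt_ms = rtt_from_receiver_report(0xB7108000, 0xB7052000, 0x00054000)
print(rtt_ms)
```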

Propagation Delay by Medium

Terrestrial Cables:

Coaxial cable over FDM: 4-6 microseconds per km
Digital transmission submarine cables: 4-6 microseconds per km

Optical Fiber:

Digital transmission: ~5 microseconds per km
Includes delay from repeaters and regenerators

Satellite Communication: Delay varies significantly by altitude

400 km above earth: 12 ms
14,000 km above earth: 110 ms
36,000 km (geostationary): 260 ms
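
The per-kilometre figures above translate into one-way propagation delay by simple multiplication. The sketch below does that arithmetic; the table `US_PER_KM` and the function name are hypothetical, and the distances are made-up examples.

```python
# Microseconds per km (low, high), taken from the figures listed above.
US_PER_KM = {
    "coax": (4.0, 6.0),
    "fiber": (5.0, 5.0),
}

def propagation_delay_ms(distance_km: float, medium: str = "fiber") -> tuple:
    """Return the (min, max) one-way propagation delay in milliseconds."""
    low, high = US_PER_KM[medium]
    return (distance_km * low / 1000.0, distance_km * high / 1000.0)

# Hypothetical routes: a 6000 km fiber path and a 1000 km coaxial path.
fiber_delay = propagation_delay_ms(6000, "fiber")
coax_delay = propagation_delay_ms(1000, "coax")
print(fiber_delay, coax_delay)
```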

Equipment Delays

FDM Modem: ~0.75 ms
Transmultiplexer: ~1.5 ms
Exchanges (analog, digital, transit): 0.45-0.825 ms
Echo Cancellers: ~0.5 ms
DCME (Circuit manipulation, signal compression): 30-200 ms

RTT (Round Trip Time)

RTT is the time in milliseconds (ms) taken for data to travel to the target destination and back. In terms of SIP calls, it is the time for a transaction to complete between caller/client and callee/server. It is calculated as the difference between the time the packet was sent and the time the acknowledgment for it was received.

High RTT: The media stream, especially audio, should not suffer a delay higher than 150 ms, including all the processing delays at intermediate nodes and network latency. Anything above that results in poor quality. High RTT indicates poor network quality and results in audio lag.

RTT vs network ping: RTT can represent the full-path network latency experienced by the packets and can do away with frequent ICMP ping/echo probes to check network health. Note, however, that ICMP pings operate at the network layer, whereas this RTT is measured higher up, at the application layer.

RTT is used to calculate the RTO (retransmission timeout) in TCP, i.e. how long the sender should wait before resending an unacknowledged packet.

Factors affecting RTT include propagation delay, processing delay, queuing delay, and encoding delay. Propagation delay correlates with the

physical distance (inter-country/continent or intra),
medium of transmission (copper cables, fiber, wireless),
bandwidth available

Similarly, delay can build up over a large number of network hops such as routers and servers. Server response time also plays a critical role in RTT, as it depends on the server’s processing capacity and the nature of the request.

Star-based network topologies such as MCU, SFU, or TURN servers can introduce processing delays too, for activities such as mixing, encoding, and NATing.

Network congestion can amplify the RTT the most. Traffic levels must be monitored when RTT spikes, such as during DDoS attacks.

Overcoming large RTT can be achieved by

identifying the choke points of network
distributing the load evenly
ensuring scalability of the server side resources
ensuring points of presence (PoPs) in the geographic regions where the caller/callee is located, and routing through them rather than through the unreliable open public network

Note: the average RTT of the session is a misleading indicator of latency, as the RTT may be asymmetric between the two legs of the call.

Calculation of RTT

EffectiveLatency = AverageLatency + Jitter * 2 + 10

In RTPengine (the division by 1000 converts the stored RTT to milliseconds):
int eff_rtt = ssb->rtt / 1000 + ssb->jitter * 2 + 10;

Thus for RTT = 11338 and jitter = 0
eff_RTT = 11338/1000 + 0*2 + 10
= 11.338 + 10 = 21.338 (21 after integer truncation), which is a good score as it is well below 150 ms of latency

But for RTT = 129209 and jitter = 7
eff_RTT = 129209/1000 + 7*2 + 10
= 153.209 (153 after integer truncation), which is a bad score (> 150 ms)
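
The same calculation can be sketched in Python, mirroring the integer arithmetic of the C expression above. The function name `effective_rtt_ms` and the constant `MAX_ACCEPTABLE_MS` are illustrative assumptions; the 150 ms limit and both sample inputs come from the text.

```python
MAX_ACCEPTABLE_MS = 150  # delay budget mentioned in the text

def effective_rtt_ms(rtt_us: int, jitter_ms: int) -> int:
    """Effective latency: RTT in ms plus twice the jitter plus a fixed 10 ms."""
    return rtt_us // 1000 + jitter_ms * 2 + 10  # // mimics C integer division

good = effective_rtt_ms(11338, 0)    # 11 + 0 + 10 = 21
bad = effective_rtt_ms(129209, 7)    # 129 + 14 + 10 = 153

for value in (good, bad):
    verdict = "ok" if value <= MAX_ACCEPTABLE_MS else "too high"
    print(value, verdict)
```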

Packet Loss

Packet loss occurs when a packet does not successfully reach its destination. It can significantly degrade voice quality, causing choppy audio or complete dropouts.

Common Causes

Network Issues:

Unavailable network bandwidth or congestion
Buffer overload with insufficient queue space
ACL or firewall rules dropping packets
ISP rate limiting

Device Issues:

CPU unable to handle encryption/decryption at high speed
Low battery causing underperformance
Physical hardware damage (router, switch, cabling)
Softphone/hardphone limitations
Bluetooth headset range/signal weakness

Network Configuration:

High-priority packet preferences causing lower-priority drops

Radio Frequency Issues:

Interference from high voltage systems or microwaves in wireless networks
Weak signals causing packet drops
Late-arriving packets dropped by codec

User Experience: Manifests as chopped voice or complete dropout for moments

Impact on Audio

When packets are lost:

Listener hears clipped or missing words
Complete audio dropout during loss bursts
Codec attempts error concealment but quality suffers

Obtaining Packet Loss Details

RFC 3550 Method: Performed using RTP header sequence numbers. If sequence numbers are missing, the media stream monitor flags the corresponding packets as lost.

CDR Analysis: Can be derived from the difference between total packets and received packets.

RTCP XR Monitoring: RFC 3611 records real-time drops with detailed statistics.
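
The sequence-number method can be sketched as follows. This is a simplified version of the bookkeeping described in RFC 3550 Appendix A: it tracks the highest sequence number seen, counts 16-bit wraparounds, and compares expected against received packets. The function name `estimate_packet_loss` and the sample sequences are assumptions; duplicates are not filtered here, so they would reduce the reported loss.

```python
def estimate_packet_loss(sequence_numbers):
    """Return (lost, loss_fraction) for RTP sequence numbers in arrival order."""
    if not sequence_numbers:
        return 0, 0.0
    base_seq = sequence_numbers[0]
    max_seq = base_seq
    cycles = 0      # accumulated 16-bit wraparounds, in steps of 65536
    received = 0
    for seq in sequence_numbers:
        received += 1
        delta = (seq - max_seq) & 0xFFFF
        if delta < 0x8000:          # packet is at or ahead of the highest seen
            if seq < max_seq:       # the 16-bit counter wrapped around
                cycles += 0x10000
            max_seq = seq
        # otherwise the packet is late or reordered; it still counts as received
    expected = cycles + max_seq - base_seq + 1
    lost = expected - received
    return lost, lost / expected

# Hypothetical stream crossing the 16-bit wrap, missing 65535 and 2.
lost, fraction = estimate_packet_loss([65533, 65534, 0, 1, 3])
print(lost, fraction)
```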

Packet Loss Thresholds

Loss Rate	Quality
< 0.5%	Good
0.5% – 0.9%	Average
≥ 0.9%	Bad

Jitter

Jitter is the variation in the delay of received packets in a flow, measured by comparing the interval at which RTP packets were sent to the interval at which they were received.
For instance, if packet #1 and packet #2 leave 30 milliseconds apart and arrive 50 milliseconds apart, then the jitter is 20 milliseconds. If packets are transmitted every 15 ms and reach the destination every 15 ms, there is no variability and the jitter is 0.
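
RTCP does not report the raw difference for each packet pair but a smoothed estimate. The sketch below follows the RFC 3550 interarrival jitter rule, J = J + (|D| - J) / 16, where D is the change in spacing between consecutive packets. The function name and the sample timings are illustrative assumptions, and times are taken in milliseconds for readability rather than in RTP timestamp units.

```python
def interarrival_jitter(send_times_ms, recv_times_ms):
    """Smoothed jitter estimate in the style of RFC 3550, section 6.4.1."""
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        send_gap = send_times_ms[i] - send_times_ms[i - 1]
        recv_gap = recv_times_ms[i] - recv_times_ms[i - 1]
        d = recv_gap - send_gap                 # change in packet spacing
        jitter += (abs(d) - jitter) / 16.0      # 1/16 gain smooths out spikes
    return jitter

# Packets sent 30 ms apart but received 50 ms apart: D is 20 ms,
# and a single sample moves the smoothed estimate to 20 / 16 ms.
single_step = interarrival_jitter([0, 30], [100, 150])

# Packets sent and received every 15 ms: no variation at all.
steady = interarrival_jitter([0, 15, 30, 45], [5, 20, 35, 50])
print(single_step, steady)
```

The 1/16 gain means a single delayed packet barely moves the reported value, while sustained variation accumulates.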

Causes of jitter

Frames bigger than the jitter buffer size
Algorithms that back off after collisions by introducing delays in packet transmission on half-duplex interfaces
Even small jitter can get considerably worse on slow or congested links
Jitter can be introduced by bottlenecks near router buffers, rerouting or parallel routes to the same destination, load sharing, or route table changes that alter the path

Handling jitter:

Jitter below 30 ms is manageable with the help of jitter buffers in codecs. Above that, the codec starts to drop late-arriving packets and cannot reassemble or splice the packets into a smooth media stream effectively, causing media quality issues such as clipped audio.

Detecting Jitter

Passive Monitoring:

Analyze inter-packet gaps in Wireshark
RFC 3611 real-time jitter buffer usage metrics
RFC 7005 RTCP XR extension

Active Tools:

Network sniffers (Wireshark, tcpdump)
Path analyzers
Application Performance Monitoring (APM) tools
CDR analyzers
SNMP (Simple Network Management Protocol) collectors

Metric	Good	Average	Bad
Jitter	<= 10ms	10ms – 30ms	>=30ms
Packet Loss	< 0.5%	0.5% – 0.9%	>= 0.9%
Audio Level	>-40dB	-80dB to -40dB	< -80dB
RTT	< 200ms	200ms – 300ms	> 300ms

Ranges of good, average, and bad values for the attributes used when calculating the MOS score
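
The threshold table above can be applied mechanically. The following sketch classifies one set of measurements into the three bands; the function name `rate_call`, the dictionary keys, and the sample measurements are assumptions made for illustration.

```python
def rate_call(jitter_ms, packet_loss_pct, audio_level_db, rtt_ms):
    """Classify each metric as good, average, or bad per the table above."""
    def band(is_good, is_bad):
        if is_good:
            return "good"
        return "bad" if is_bad else "average"

    return {
        "jitter": band(jitter_ms <= 10, jitter_ms >= 30),
        "packet_loss": band(packet_loss_pct < 0.5, packet_loss_pct >= 0.9),
        "audio_level": band(audio_level_db > -40, audio_level_db < -80),
        "rtt": band(rtt_ms < 200, rtt_ms > 300),
    }

# Hypothetical measurements for one call leg.
ratings = rate_call(jitter_ms=5, packet_loss_pct=0.7, audio_level_db=-30, rtt_ms=350)
print(ratings)
```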

Ref : ITU P.800.1 : Mean opinion score (MOS) terminology

Methods for objective and subjective assessment of speech and video quality.

Scheduling for low bandwidth networks

The ability of the end application or the RTP proxy to deal with packet loss or delays depends on its processing techniques, particularly the encoding and buffering techniques used to cope with a high packet loss rate.

Mapping R-value to calculate MOS

To map the R value to MOS using the metrics defined above, a standard formula is used. First, the latency and jitter are added together with a defined value for computation time, resulting in the effective latency.

Step 1: Calculate effective latency

effectiveLatency = latency + (jitter * latencyImpact) + compTime

Step 2: Subtract effective latency from R-base value

R = 93 - (effectiveLatency / factorLatencyBased)

Step 3: Apply packet loss impact

R = R - (lostPackets * impact)

Step 4: Calculate MOS

MOS = 1 + (0.035 * R) + (R * (R - 60) * (100 - R) * 0.000007)
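
The four steps can be sketched end to end as below. The final conversion uses the ITU-T G.107 form, in which the cubic term is multiplied by R. The values of `latency_impact` (2) and `comp_time_ms` (10) follow the effective latency formula given earlier; `latency_factor` (40) and `loss_impact` (2.5) are assumed values chosen for illustration, since the text leaves them symbolic. Function names are hypothetical.

```python
def r_to_mos(r: float) -> float:
    """Convert an R-factor to MOS, clamping R to the 0-100 range."""
    r = max(0.0, min(100.0, r))
    return 1.0 + 0.035 * r + 0.000007 * r * (r - 60.0) * (100.0 - r)

def estimate_mos(latency_ms, jitter_ms, loss_pct,
                 latency_impact=2.0, comp_time_ms=10.0,
                 latency_factor=40.0, loss_impact=2.5):
    """Steps 1-4: effective latency, R base, loss penalty, MOS."""
    effective_latency = latency_ms + jitter_ms * latency_impact + comp_time_ms
    r = 93.0 - effective_latency / latency_factor   # latency_factor is assumed
    r -= loss_pct * loss_impact                     # loss_impact is assumed
    return r_to_mos(r)

print(round(r_to_mos(90), 1))   # matches the R-factor 90 example
print(round(r_to_mos(50), 1))   # matches the R-factor 50 example
print(round(estimate_mos(129.209, 7, 0), 2))
```

Keeping the constants as parameters makes it easy to compare different weightings against subjective test results.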

Media Stats and MOS on RTP engine Kamailio

Minimum edge Values

mos_min_pv
minimum encountered MOS value for the call.
range – 1.0 to 5.0.

mos_min_at_pv
timestamp of when the minimum MOS value was encountered during the call

mos_min_packetloss_pv
amount of packetloss in percent at the time the minimum MOS value was encountered

mos_min_roundtrip_pv
packet round-trip time in milliseconds at the time the minimum MOS value was encountered

mos_min_jitter_pv
amount of jitter in milliseconds at the time the minimum MOS value was encountered

Maximum edge Values

mos_max_pv
maximum encountered MOS value for the call.

mos_max_at_pv
timestamp of when the maximum MOS value was encountered during the call

mos_max_packetloss_pv
amount of packetloss in percent at maximum MOS moment

mos_max_roundtrip_pv
packet round-trip time in milliseconds at maximum MOS moment

mos_max_jitter_pv
amount of jitter in milliseconds at maximum MOS moment

Average Values

mos_average_pv : average (median) MOS value for the call. Range – 1.0 through 5.0.

mos_average_packetloss_pv : average (median) amount of packetloss in percent present throughout the call.

mos_average_jitter_pv : average (median) amount of jitter in milliseconds present throughout the call.

mos_average_roundtrip_pv : average (median) packet round-trip time present throughout the call.

mos_average_samples_pv : number of samples used to determine the other “average” MOS data points.

Labels

mos_A_label_pv : custom label used in rtpengine signalling.
If set, all the statistics pseudovariables with the A suffix will be filled in with statistics only from the call legs that match the label given in this variable.

A label’s min
mos_min_A_pv
mos_min_at_A_pv
mos_min_packetloss_A_pv
mos_min_jitter_A_pv
mos_min_roundtrip_A_pv

A label’s max
mos_max_A_pv
mos_max_at_A_pv
mos_max_packetloss_A_pv
mos_max_jitter_A_pv
mos_max_roundtrip_A_pv

A label’s average
mos_average_A_pv
mos_average_packetloss_A_pv
mos_average_jitter_A_pv
mos_average_roundtrip_A_pv
mos_average_samples_A_pv

B label’s min
mos_B_label_pv
mos_min_B_pv
mos_min_at_B_pv
mos_min_packetloss_B_pv
mos_min_jitter_B_pv
mos_min_roundtrip_B_pv

B label’s max
mos_max_B_pv
mos_max_at_B_pv
mos_max_packetloss_B_pv
mos_max_jitter_B_pv
mos_max_roundtrip_B_pv

B label’s average
mos_average_B_pv
mos_average_packetloss_B_pv
mos_average_jitter_B_pv
mos_average_roundtrip_B_pv
mos_average_samples_B_pv

Setting MOS collection on kamailio

Set the rtpengine module parameters in the Kamailio config to name the variables that hold the specific MOS values

# Overall average MOS (the min and max variables are set with their detail groups below)
modparam("rtpengine", "mos_average_pv", "$avp(mos_average)")

modparam("rtpengine", "mos_average_packetloss_pv", "$avp(mos_average_packetloss)")
modparam("rtpengine", "mos_average_jitter_pv", "$avp(mos_average_jitter)")
modparam("rtpengine", "mos_average_roundtrip_pv", "$avp(mos_average_roundtrip)")
modparam("rtpengine", "mos_average_samples_pv", "$avp(mos_average_samples)")

modparam("rtpengine", "mos_min_pv", "$avp(mos_min)")
modparam("rtpengine", "mos_min_at_pv", "$avp(mos_min_at)")
modparam("rtpengine", "mos_min_packetloss_pv", "$avp(mos_min_packetloss)")
modparam("rtpengine", "mos_min_jitter_pv", "$avp(mos_min_jitter)")
modparam("rtpengine", "mos_min_roundtrip_pv", "$avp(mos_min_roundtrip)")

modparam("rtpengine", "mos_max_pv", "$avp(mos_max)")
modparam("rtpengine", "mos_max_at_pv", "$avp(mos_max_at)")
modparam("rtpengine", "mos_max_packetloss_pv", "$avp(mos_max_packetloss)")
modparam("rtpengine", "mos_max_jitter_pv", "$avp(mos_max_jitter)")
modparam("rtpengine", "mos_max_roundtrip_pv", "$avp(mos_max_roundtrip)")

modparam("rtpengine", "mos_A_label_pv", "$avp(mos_A_label)")
modparam("rtpengine", "mos_average_packetloss_A_pv", "$avp(mos_average_packetloss_A)")
modparam("rtpengine", "mos_average_jitter_A_pv", "$avp(mos_average_jitter_A)")
modparam("rtpengine", "mos_average_roundtrip_A_pv", "$avp(mos_average_roundtrip_A)")
modparam("rtpengine", "mos_average_A_pv", "$avp(mos_average_A)")

modparam("rtpengine", "mos_B_label_pv", "$avp(mos_B_label)")
modparam("rtpengine", "mos_average_packetloss_B_pv", "$avp(mos_average_packetloss_B)")
modparam("rtpengine", "mos_average_jitter_B_pv", "$avp(mos_average_jitter_B)")
modparam("rtpengine", "mos_average_roundtrip_B_pv", "$avp(mos_average_roundtrip_B)")
modparam("rtpengine", "mos_average_B_pv", "$avp(mos_average_B)")

For individual leg labelling, fill in the labels

KSR.pv.sets("$avp(mos_A_label)","Aleg_label")
KSR.pv.sets("$avp(mos_B_label)","Bleg_label")

Gather the MOS stats in code. The given example is in Lua.
The values are filled in after invoking “rtpengine_delete”, “rtpengine_query”, or “rtpengine_manage”, if the command resulted in the deletion of the call (or call branch).

-- Log one MOS pseudo-variable. tostring() guards against a nil value,
-- which would otherwise make the string concatenation fail.
local function log_mos(name)
    KSR.log("info", name .. ": " .. tostring(KSR.pv.get("$avp(" .. name .. ")")) .. "\n")
end

-- Call-wide values: average, minimum, and maximum groups
local names = {
    "mos_average", "mos_average_packetloss", "mos_average_jitter",
    "mos_average_roundtrip", "mos_average_samples",
    "mos_min", "mos_min_at", "mos_min_packetloss", "mos_min_jitter", "mos_min_roundtrip",
    "mos_max", "mos_max_at", "mos_max_packetloss", "mos_max_jitter", "mos_max_roundtrip",
}
for _, name in ipairs(names) do
    log_mos(name)
end

-- Per-leg averages, logged only when the corresponding label is set
for _, leg in ipairs({"A", "B"}) do
    if KSR.pv.get("$avp(mos_" .. leg .. "_label)") ~= nil then
        log_mos("mos_average_packetloss_" .. leg)
        log_mos("mos_average_jitter_" .. leg)
        log_mos("mos_average_roundtrip_" .. leg)
        log_mos("mos_average_" .. leg)
    end
end

Sample obtained result for one leg

      "average MOS": {
        "MOS": 43,
        "round-trip time": 13430,
        "jitter": 0,
        "packet loss": 0,
        "samples": 4
      },
      "lowest MOS": {
        "MOS": 43,
        "round-trip time": 24184,
        "jitter": 0,
        "packet loss": 0,
        "reported at": 1590498085
      },
      "highest MOS": {
        "MOS": 44,
        "round-trip time": 8218,
        "jitter": 0,
        "packet loss": 0,
        "reported at": 1590498089
      },
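
A sketch of how such a result might be normalized for reporting is shown below. It assumes, based on the sample values and on the division by 1000 in the effective latency formula, that MOS is reported multiplied by 10 and that round-trip time is in microseconds; both are assumptions to verify against your rtpengine version. The function name `normalize` and the output keys are hypothetical, and the JSON is the sample above completed into a valid object.

```python
import json

# The sample above, wrapped in braces so that it parses as valid JSON.
SAMPLE = json.loads("""
{
  "average MOS": {"MOS": 43, "round-trip time": 13430, "jitter": 0, "packet loss": 0, "samples": 4},
  "lowest MOS":  {"MOS": 43, "round-trip time": 24184, "jitter": 0, "packet loss": 0, "reported at": 1590498085},
  "highest MOS": {"MOS": 44, "round-trip time": 8218,  "jitter": 0, "packet loss": 0, "reported at": 1590498089}
}
""")

def normalize(entry: dict) -> dict:
    """Convert one stats entry to a 1-5 MOS and a millisecond RTT."""
    return {
        "mos": entry["MOS"] / 10.0,                  # assumed: MOS reported x10
        "rtt_ms": entry["round-trip time"] / 1000.0, # assumed: RTT in microseconds
        "jitter_ms": entry["jitter"],
        "packet_loss_pct": entry["packet loss"],
    }

summary = {name: normalize(entry) for name, entry in SAMPLE.items()}
print(summary["average MOS"])
```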

CDR with MOS on Freeswitch

<?xml version="1.0"?>
					
<cdr core-uuid="[UUID]" switchname="freeswitch">
<channel_data>
	<state>
	<direction>
	<state_number>
	<flags>	
	<caps>
</channel_data>
					
<call-stats>			
// Audio Components
// Video Component
</call-stats>

// Variables 			

<app_log>			
	<application app_name="..." app_data="...">
	<application app_name="..." app_data="...">
</app_log>
				
// Callflow 
				
</cdr>

Audio

<audio>	
	<inbound>
		<raw_bytes>	
		<media_bytes>
		<packet_count>
		<media_packet_count>		
		<skip_packet_count>
		<jitter_packet_count>
		<dtmf_packet_count>	
		<cng_packet_count>		
		<flush_packet_count>
		<largest_jb_size>
		<jitter_min_variance>
		<jitter_max_variance>
		<jitter_loss_rate>
		<jitter_burst_rate>
		<mean_interval>
		<flaw_total>
		<quality_percentage>
		<mos>
	</inbound>				
	<outbound>
		<raw_bytes>
		<media_bytes>
		<packet_count>
		<media_packet_count>
		<skip_packet_count>
		<dtmf_packet_count>
		<cng_packet_count>
		<rtcp_packet_count>
		<rtcp_octet_count>
	</outbound>	
</audio>
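
As a sketch of how the inbound audio statistics could be pulled out of such a CDR, the snippet below uses Python's standard `xml.etree.ElementTree`. The sample document is hypothetical: it follows the element layout outlined above, but the values and the UUID are made up, and a real CDR contains many more elements. The function name `inbound_audio_quality` is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed CDR following the layout outlined above.
SAMPLE_CDR = """<?xml version="1.0"?>
<cdr core-uuid="hypothetical-uuid" switchname="freeswitch">
  <call-stats>
    <audio>
      <inbound>
        <packet_count>1500</packet_count>
        <skip_packet_count>3</skip_packet_count>
        <quality_percentage>98.50</quality_percentage>
        <mos>4.46</mos>
      </inbound>
    </audio>
  </call-stats>
</cdr>"""

def inbound_audio_quality(cdr_xml: str):
    """Return selected inbound audio stats, or None if the section is missing."""
    root = ET.fromstring(cdr_xml)
    inbound = root.find("./call-stats/audio/inbound")
    if inbound is None:
        return None

    def number(tag):
        text = inbound.findtext(tag)
        return float(text) if text else None

    return {
        "packet_count": number("packet_count"),
        "skip_packet_count": number("skip_packet_count"),
        "quality_percentage": number("quality_percentage"),
        "mos": number("mos"),
    }

stats = inbound_audio_quality(SAMPLE_CDR)
print(stats)
```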

Video

<video>	
	<inbound>
		<raw_bytes>
		<media_bytes>
		<packet_count>
		<media_packet_count>
		<skip_packet_count>
		<jitter_packet_count>
		<dtmf_packet_count>
		<cng_packet_count>
		<flush_packet_count>
		<largest_jb_size>
		<jitter_min_variance>
		<jitter_max_variance>
		<jitter_loss_rate>
		<jitter_burst_rate>
		<mean_interval>
		<flaw_total>
		<quality_percentage>
		<mos>
	</inbound>	
	<outbound>
		<raw_bytes>
		<media_bytes>
		<packet_count>
		<media_packet_count>
		<skip_packet_count>
		<dtmf_packet_count>
		<cng_packet_count>
		<rtcp_packet_count>
		<rtcp_octet_count>	
	</outbound>
</video>

The variables

<variables>		
<is_outbound>
<uuid><session_id><text_media_flow>
<direction>
<ep_codec_string>
<channel_name>
<secondary_recovery_module>
<verto_dvar_email><verto_dvar_avatar><jsock_uuid_str>
<verto_user><presence_id>
<verto_client_address><chat_proto>
<verto_host><event_channel_cookie>
<verto_profile_name>
<record_stereo><default_areacode><transfer_fallback_extension>
<toll_allow><accountcode><user_context><effective_caller_id_name><effective_caller_id_number>
<outbound_caller_id_name><outbound_caller_id_number><callgroup><user_name><domain_name>
<Event-Name>
<Core-UUID>
<FreeSWITCH-Hostname><FreeSWITCH-Switchname><FreeSWITCH-IPv4><FreeSWITCH-IPv6><Event-Date-Local><Event-Date-GMT><Event-Date-Timestamp>
<Event-Calling-File>
<Event-Calling-Function>
<Event-Calling-Line-Number>
<Event-Sequence>
<verto_remote_caller_id_name><verto_remote_caller_id_number>
<switch_r_sdp>

<call_uuid><open>
<rtp_secure_media>
<export_vars><conference_enter_sound>
<conference_exit_sound><video_banner_text>
<rtp_use_codec_string><remote_audio_media_flow>
<audio_media_flow>
<rtp_audio_recv_pt>
<rtp_use_codec_name> 
<rtp_use_codec_fmtp>
<rtp_use_codec_rate>
<rtp_use_codec_ptime>
<rtp_use_codec_channels>
<rtp_last_audio_codec_string>
<original_read_codec>
<original_read_rate>
<write_codec><write_rate>
<remote_audio_ip>
<remote_audio_port>
<remote_audio_rtcp_ip>
<remote_audio_rtcp_port>
<dtmf_type>
<remote_video_media_flow>
<video_media_flow>
<video_possible>
<rtp_video_pt>
<rtp_video_recv_pt>
<video_read_codec>
<video_read_rate><video_write_codec><video_write_rate><rtp_last_video_codec_string>
<rtp_use_video_codec_name>
<rtp_use_video_codec_rate>
<rtp_use_video_codec_ptime>
<remote_video_ip><remote_video_port>
<remote_video_rtcp_ip><remote_video_rtcp_port>
<local_media_ip><local_media_port>
<advertised_media_ip>
<rtp_use_timer_name><rtp_use_pt>
<rtp_use_ssrc><rtp_2833_send_payload>
<rtp_2833_recv_payload><remote_media_ip>
<remote_media_port><local_video_ip>
<local_video_port><rtp_use_video_pt><rtp_use_video_ssrc><rtp_local_sdp_str><current_application_data><current_application><send_silence_when_idle><rtp_has_crypto><endpoint_disposition><conference_name><conference_member_id><conference_moderator><conference_ghost><conference_uuid><video_width><video_height><video_fps><verto_hangup_disposition><read_codec><read_rate><hangup_cause><hangup_cause_q850>
<digits_dialed>
<start_stamp><profile_start_stamp><answer_stamp><progress_media_stamp><end_stamp>
<start_epoch><start_uepoch>
<profile_start_epoch><profile_start_uepoch>
<answer_epoch><answer_uepoch>
<bridge_epoch><bridge_uepoch>
<last_hold_epoch><last_hold_uepoch>
<hold_accum_seconds><hold_accum_usec><hold_accum_ms><resurrect_epoch><resurrect_uepoch>
<progress_epoch><progress_uepoch><progress_media_epoch><progress_media_uepoch>
<end_epoch><end_uepoch>
<last_app><last_arg><caller_id><duration><billsec><progresssec><answersec><waitsec><progress_mediasec>

<flow_billsec>
   <mduration><billmsec><progressmsec><answermsec><waitmsec><progress_mediamsec><flow_billmsec><uduration>  <billusec><progressusec><answerusec><waitusec><progress_mediausec>
<flow_billusec>

<rtp_audio_in_raw_bytes>
<rtp_audio_in_media_bytes>
<rtp_audio_in_packet_count>
<rtp_audio_in_media_packet_count>
<rtp_audio_in_skip_packet_count><rtp_audio_in_jitter_packet_count><rtp_audio_in_dtmf_packet_count>
<rtp_audio_in_cng_packet_count>
<rtp_audio_in_flush_packet_count>
<rtp_audio_in_largest_jb_size>
<rtp_audio_in_jitter_min_variance><rtp_audio_in_jitter_max_variance>
<rtp_audio_in_jitter_loss_rate>
<rtp_audio_in_jitter_burst_rate>
<rtp_audio_in_mean_interval>
<rtp_audio_in_flaw_total>
<rtp_audio_in_quality_percentage>
<rtp_audio_in_mos>
<rtp_audio_out_raw_bytes>
<rtp_audio_out_media_bytes>
<rtp_audio_out_packet_count>
<rtp_audio_out_media_packet_count><rtp_audio_out_skip_packet_count><rtp_audio_out_dtmf_packet_count>
<rtp_audio_out_cng_packet_count>
<rtp_audio_rtcp_packet_count>
<rtp_audio_rtcp_octet_count>
<rtp_video_in_raw_bytes>
<rtp_video_in_media_bytes>
<rtp_video_in_packet_count>
<rtp_video_in_media_packet_count>
<rtp_video_in_skip_packet_count><rtp_video_in_jitter_packet_count><rtp_video_in_dtmf_packet_count>
<rtp_video_in_cng_packet_count>
<rtp_video_in_flush_packet_count>
<rtp_video_in_largest_jb_size>
<rtp_video_in_jitter_min_variance><rtp_video_in_jitter_max_variance>
<rtp_video_in_jitter_loss_rate>
<rtp_video_in_jitter_burst_rate>
<rtp_video_in_mean_interval>
<rtp_video_in_flaw_total>
<rtp_video_in_quality_percentage>
<rtp_video_in_mos>
<rtp_video_out_raw_bytes>
<rtp_video_out_media_bytes>
<rtp_video_out_packet_count>
<rtp_video_out_media_packet_count><rtp_video_out_skip_packet_count><rtp_video_out_dtmf_packet_count>
<rtp_video_out_cng_packet_count>
<rtp_video_rtcp_packet_count>
<rtp_video_rtcp_octet_count>

</variables>

The Callflow components

<callflow dialplan="XML" unique-id="[UUID]" profile_index="1">
	
	<extension name="myconference" number="3500">		
		<application app_name="..." app_data="...">
	</extension>	
	<caller_profile>
		<username>
		<dialplan>
		<caller_id_name>
		<caller_id_number>
		<callee_id_name>
		<callee_id_number>
		<ani>
		<aniii>
		<network_addr>
		<rdnis>
		<destination_number>
		<uuid>
		<source>
		<context>
		<chan_name>
	</caller_profile>
				
			
	<times>
		<created_time>
		<profile_created_time>
		<progress_time>	
		<progress_media_time>
		<answered_time>
		<bridged_time>
		<last_hold_time>	
		<hold_accum_time>
		<hangup_time>
		<resurrect_time>	
		<transfer_time>	
	</times>
</callflow>

Standardizing bodies

ITU (The International Telecommunication Union) is the United Nations specialised agency in the field of telecommunications, information and communication technologies (ICTs).
ITU-T (ITU Telecommunication Standardisation Sector) is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardising telecommunications on a worldwide basis.

As the technology for packet switching has matured, voice quality on packet-switched networks has become mostly indistinguishable from that on circuit-switched networks. However, the flaws in a VoIP communication system reappear under poor network conditions and bad architecture design. Especially with applications that are greedy for network bandwidth, such as large-scale conferencing or HD streaming, the need for monitoring and quality control is very high, and it can only be met with the QoS parameters described above.

References

CDR on freeswitch
ITU-T G.114 TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU (05/2003) SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS , International telephone connections and circuits – General Recommendations on the transmission qua
Kamailio RTP engine https://www.kamailio.org/docs/modules/devel/modules/rtpengine.html
RFC 3550 – RTP: A Transport Protocol for Real-Time Applications
RFC 3611 – RTP Control Protocol (RTCP) Extended Reports (XR)
RFC 7005 – RTP Control Protocol (RTCP) Extended Report (XR) Block for De-Jitter Buffer Metric Reporting
ITU-T G.107 – The E-model: a computational model for use in transmission planning
ITU-T P.800.1 – Mean opinion score (MOS) terminology
ITU-T G.110 – Mean Opinion Score (MOS) calculation
VoIP Metrics: RTP and RTCP
