- RTP Fundamentals
- RTCP Protocol
- VoIP call quality metrics
- Quality Factors
- Rating Factor (R-Factor)
- MOS ( Mean Opinion Score )
- Latency
- Packet Loss
- Jitter
- Mapping R-value to calculate MOS
- Media Stats and MOS on RTP engine Kamailio
- Setting MOS collection on kamailio
- CDR with MOS on Freeswitch
Metrics for monitoring a VoIP call can be obtained from any node in the media path of the call flow. These metrics are essential for analysis via calculation and aggregation, and are increasingly used for real-time performance tracking and quality rectification. Comprehensive monitoring ensures optimal voice quality across all call legs and network conditions.
RTP Fundamentals
RTP (Real-time Transport Protocol) provides real-time media streams with payload type identification, packet sequencing, and timestamping headers. It forms the foundation of voice and video transmission over IP networks.
Source: RFC 3550 – RTP: A Transport Protocol for Real-Time Applications
Key RTP Header Fields
- Sequence Number: Tracks the incremental succession of incoming packets from the sender and identifies out-of-order delivery
- Timestamp: Used by the receiver to play back received samples at the appropriate time and interval, ensuring proper synchronization
- Synchronization Source (SSRC): Identifies the synchronization source within the RTP session
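As a rough illustration of where these fields sit, the sketch below unpacks the 12-byte fixed RTP header defined in RFC 3550. The function name `parse_rtp_header` and the sample packet values are made up for this example, and CSRC entries and header extensions are ignored.

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Unpack the 12-byte fixed RTP header (RFC 3550, section 5.1)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the fixed header")
    # Network byte order: two flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC
    first, second, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": first >> 6,
        "marker": bool(second & 0x80),
        "payload_type": second & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }

# Hypothetical packet: version 2, payload type 0, followed by 160 payload bytes
sample = struct.pack("!BBHII", 0x80, 0, 1000, 160000, 0x12345678) + b"\x00" * 160
header = parse_rtp_header(sample)
print(header)
```

A monitor would typically keep the last sequence number and timestamp per SSRC, so that gaps and reordering can be detected per stream.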
Call Legs in VoIP Sessions
In a typical VoIP call, there are two media paths:
- Leg A: Between Caller and RTP Proxy
- Leg B: Between RTP Proxy and Callee
Each leg is monitored independently for quality metrics and may experience different network conditions.
RTCP Protocol
RTCP (RTP Control Protocol) provides detailed monitoring of streams to session participants with statistical data and enhanced metrics for Quality of Service (QoS) and synchronization.
Source: RFC 3550 – RTCP Specification
RTCP Report Types
- SR (Sender Report): Sent by active senders, containing:
  - Sender packet and octet counts
  - Timestamp correlation for synchronization
  - Reception quality feedback
- RR (Receiver Report): Sent by receivers, with:
  - Fraction of packets lost
  - Cumulative packet loss count
  - Highest sequence number received
  - Interarrival jitter
  - Last Sender Report (LSR) timestamp
  - Delay since Last Sender Report (DLSR)
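The LSR and DLSR fields exist so that a sender can estimate round-trip time from a receiver report. RFC 3550 gives the calculation as RTT = A - LSR - DLSR, where A is the arrival time of the report and all three values use the middle 32 bits of an NTP timestamp (units of 1/65536 second). A minimal sketch, with the function name chosen for this example and the input values taken from the worked example in RFC 3550:

```python
def rtt_from_receiver_report(arrival: int, lsr: int, dlsr: int) -> float:
    """Round-trip time in seconds from 32-bit "middle NTP" values.

    arrival: time the receiver report arrived at the sender
    lsr:     Last SR timestamp echoed in the report
    dlsr:    delay the receiver held the SR before reporting
    """
    # The mask keeps the subtraction correct when the 32-bit clock wraps
    units = (arrival - lsr - dlsr) & 0xFFFFFFFF
    return units / 65536

rtt = rtt_from_receiver_report(0xB7108000, 0xB7052000, 0x00054000)
print(f"{rtt:.3f} s")  # 6.125 s
```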
Key RTCP Metrics
- Packet Loss Rate: Percentage of packets lost during transmission
- Packet Discard Rate: Percentage of packets discarded due to late arrival
- Round Trip Time (RTT): Time for a packet to travel to destination and back
- R-Factor: Numerical rating of voice quality (0-100 scale)
- MOS-LQ: Mean Opinion Score for Listening Quality
- MOS-CQ: Mean Opinion Score for Conversation Quality
- Jitter Buffer Metrics: Current and maximum delay values
Further Reading: VoIP Metrics: RTP and RTCP
VoIP call quality metrics
RTP provides real-time media streams with payload type identification, packet sequencing, and timestamping headers.
- Sequence number: tracks the incremental succession of incoming packets from the sender and detects out-of-order delivery.
- Timestamp: used by the receiver to play back the received samples at the appropriate time and interval.

Note that every Synchronization Source (SSRC) identifier field denotes the synchronization source within the RTP session, for example on both legs of a call session.


Certain aspects of RTP media and its RTCP metrics were discussed before; you can read more about RTCP and RTCP/AVPF in RealTime Transport protocol (RTP) and RTP control protocol (RTCP).
Other call-related factors that are not specifically part of RTCP but provide information about call quality are:
- signal level
- noise level
- gap density, gap threshold
- Burst density
- residual echo return loss
Delays such as the following also have a significant influence on VoIP quality:
- End system delay
- Packetization delay
- Setup delay (auth, TLS handshake, accessing mic/camera stream, ...)
- Queuing delay
- Serialization delay
- Network latency
- End device processing delay, such as the CPU of the end device
In addition to these values, which can be calculated algorithmically and with high precision, there are more subjective quality parameters that can only be evaluated manually (i.e. with a person listening on both ends), such as:
- Robot voice
- Perceptible sound but annoying speech quality
Quality Factors
Acoustic and Noise Parameters
- Signal Level: Amplitude of the audio signal (-dBm or dBFS)
- Noise Level: Background noise measurement and floor
- Gap Density & Threshold: Frequency and duration of silence gaps
- Burst Density: Concentration of packet loss events in time
- Residual Echo Return Loss: Echo cancellation effectiveness (higher is better)
Delay Components
VoIP quality is significantly affected by various delay factors that accumulate throughout the media path:
- End System Delay: Processing time at endpoints (codec, jitter buffer, DSP)
- Packetization Delay: Time to accumulate audio samples into packets
- Setup Delay: Time for authentication, TLS handshake, and media stream initialization
- Queuing Delay: Time spent waiting in network buffers
- Serialization Delay: Time to transmit a packet onto the physical link
- Network Latency: Propagation delay through the network infrastructure
- End Device Processing Delay: CPU processing time at the endpoint device
Subjective Quality Parameters
Some quality aspects can only be evaluated through subjective listening tests:
- Robot Voice: Synthetic or unnatural speech characteristics
- Annoying Speech Quality: Perceptible but irritating audio artifacts
- Overall Listening Experience: End-user perception of call quality
Rating Factor (R-Factor)
Rating Factor (R-Factor) and Mean Opinion Score (MOS) are two commonly-used measurements of overall VoIP call quality.
What is R-Factor? It is a value derived from metrics such as latency, jitter, and packet loss per ITU‑T Recommendation G.107. It assesses the quality of experience for VoIP calls on your network on a scale from 0 to 100; typical scores range from 50 (bad) to 90 (excellent):
| R-Factor Range | Quality Rating |
|---|---|
| 90-100 | Excellent |
| 80-90 | Good |
| 70-80 | Fair |
| 60-70 | Poor |
| 50-60 | Very Poor |
| 0-50 | Bad |
Examples:
- R-factor of 90 → MOS is 4.3 (Excellent)
- R-factor of 50 → MOS is 2.6 (Bad)
MOS ( Mean Opinion Score )
What is MOS? MOS measures VoIP call quality and is derived from the R-Factor per ITU‑T Recommendation G.107. PacketShaper measures MOS using a scale of 10-50. To convert to a standard MOS score (which uses a scale of 1-5), divide the PacketShaper MOS value by 10.
ITU-T P.800.1 defines MOS terminology for audio, video, and audiovisual quality expressions. A MOS value can refer to listening, talking, or conversational quality, whether it originates from subjective or objective models.
| MOS Score | Quality Rating |
|---|---|
| 4.3-5.0 | Excellent |
| 4.0-4.3 | Very Good |
| 3.6-4.0 | Good |
| 3.1-3.6 | Fair |
| 2.6-3.1 | Poor |
| 1.0-2.6 | Bad |
MOS Types
Different MOS variants measure specific aspects:
- MOS-LQE: Listening Quality Estimate (one-way audio quality)
- MOS-CQE: Conversational Quality Estimate (two-way conversation quality)
- MOS-TQE: Talking Quality Estimate (speaker’s perception)
- MOS-AVQE: Audiovisual Quality Estimate
- MOS-VQE: Video Quality Estimate
Audio Signal Bandwidth Classifications
- N (Narrow-band): 300-3400 Hz (traditional telephone)
- W (Wide-band): 50-7000 Hz (enhanced voice)
- S (Super-wide-band): 20-14000 Hz (high fidelity)
- F (Full-band): 10-20000 Hz (ultimate audio quality)
Listening Quality Measurement Types
Electrical Measurement: Done at electrical interfaces using the Intermediate Reference System (IRS) assumptions
Acoustical Measurement: Done at acoustical interfaces using actual telephone set products, providing real-world perception metrics
Conversational Quality (CQ)
Calculated as the arithmetic mean value of subjective judgments on a 5-point ACR (Absolute Category Rating) quality scale, considering both directions of communication.
Talking Quality (TQ)
Describes the quality as perceived by the talking party only. Factors affecting TQ include:
- Echo signal presence and magnitude
- Background noise perception
- Double-talk scenarios
Calculated based on arithmetic mean of 5-point ACR scale judgments.
Video Quality (VQ)
Accounts for differentiation in perceived quality:
- M (Mobile): Smartphone/tablet screens (~25 cm or less)
- T (TV/PC): Monitor or television displays
Calculated based on arithmetic mean value of subjective judgments on a 5-point quality scale.
Audiovisual Quality (AVQ)
Refers to quality of integrated audio-visual streams under corresponding networking conditions. Also calculated using 5-point ACR scale mean judgments.
Latency
Latency is primarily the time required for packets to travel from one end to the other, in milliseconds. For example, if the sum of measured latency is 800 ms and the number of latency samples is 20, then the average latency is 40 ms. RTP packet headers carry timestamps, and the timing fields in RTCP reports can later be used to calculate round-trip time.
Propagation Delay by Medium
Terrestrial Cables:
- Coaxial cable over FDM: 4-6 microseconds per km
- Digital transmission submarine cables: 4-6 microseconds per km
Optical Fiber:
- Digital transmission: ~5 microseconds per km
- Includes delay from repeaters and regenerators
Satellite Communication: Delay varies significantly by altitude
- 400 km above earth: 12 ms
- 14,000 km above earth: 110 ms
- 36,000 km (geostationary): 260 ms
Equipment Delays
- FDM Modem: ~0.75 ms
- Transmultiplexer: ~1.5 ms
- Exchanges (analog, digital, transit): 0.45-0.825 ms
- Echo Cancellers: ~0.5 ms
- DCME (Circuit manipulation, signal compression): 30-200 ms
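To see how these figures add up, the sketch below sums propagation and equipment delay for a hypothetical route. The 5 microseconds per km figure for fibre and the echo canceller delay come from the lists above; the route length, the choice of components, and the function names are assumptions made for illustration.

```python
FIBER_US_PER_KM = 5.0  # approximate fibre propagation delay from the list above

def propagation_delay_ms(distance_km: float, us_per_km: float = FIBER_US_PER_KM) -> float:
    """Propagation delay in milliseconds for a given route length."""
    return distance_km * us_per_km / 1000.0

def one_way_delay_ms(distance_km: float, equipment_ms: list) -> float:
    """Propagation delay plus the sum of per-device equipment delays."""
    return propagation_delay_ms(distance_km) + sum(equipment_ms)

# Hypothetical 6,000 km fibre route with two echo cancellers (~0.5 ms each)
total = one_way_delay_ms(6000, [0.5, 0.5])
print(f"one-way delay: {total:.1f} ms")
```

Codec, packetization, and jitter buffer delay at the endpoints come on top of this figure, so the network portion is only one part of the overall budget.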

RTT (Round Trip Time)
RTT is the time in milliseconds (ms) taken for data to travel to the target destination and back. In terms of SIP calls, it is the time for a transaction to complete between caller/client and callee/server. It is calculated as the difference between the time a packet was sent and the time its acknowledgment was received.
High RTT: A media stream, especially audio, should not suffer a delay higher than 150 ms, including all the processing delays at intermediate nodes and network latency. Anything above that is perceived as poor quality. High RTT indicates poor network quality and results in audio lag.
RTT vs network ping: RTT can represent the full-path network latency experienced by the media packets, which can do away with frequent ICMP ping/echo probes to check network health. Note, however, that ping (ICMP) operates at the network layer, while this RTT is measured higher up, at the application layer.
RTT is also used to calculate the RTO (retransmission timeout) in TCP, i.e. how long the sender should wait before resending an unacknowledged packet.
Factors affecting RTT include propagation delay, processing delay, queuing delay, and encoding delay. Propagation delay can correlate with the
- physical distance (intercontinental, inter-country, or local)
- medium of transmission (copper cable, fiber, wireless)
- bandwidth available

Similarly, delay can build up over a large number of network hops such as routers and servers. Server response time also plays a critical role in RTT, as it depends on the server's processing capacity and the nature of the request.
Star-based network topologies built around MCU, SFU, or TURN servers can introduce processing delays too, for activities such as mixing, encoding, and NAT handling.

Network congestion tends to amplify RTT the most. Traffic levels should be monitored whenever RTT spikes, such as during DDoS attacks.
A large RTT can be reduced by
- identifying the choke points of the network
- distributing the load evenly
- ensuring scalability of the server-side resources
- ensuring points of presence (PoPs) in the geographic regions where the caller and callee are located, and routing through them rather than over the unreliable public internet
Note: the average RTT of a session can be a misleading indicator of latency, as RTT may be asymmetric between the two legs of the call.
Calculation of effective latency from RTT
EffectiveLatency = AverageLatency + Jitter * 2 + 10
In RTPengine, where rtt is held in microseconds (hence the division by 1000) and jitter in milliseconds:
int eff_rtt = ssb->rtt / 1000 + ssb->jitter * 2 + 10;
Thus for RTT = 11338 and jitter = 0
eff_RTT = 11338/1000 + 0*2 + 10
= 11.338 + 10 = 21.338, which is a good score as it is well below 150 ms of latency (the integer arithmetic in the C code truncates this to 21)
But for RTT = 129209 and jitter = 7
eff_RTT = 129209/1000 + 7*2 + 10
= 153.209, which is a bad score (> 150 ms)
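The two worked cases can be reproduced with a few lines of Python. This is only a sketch of the formula quoted above, not RTPengine's own code; the function name and the `comp_time_ms` parameter are chosen for this example, and floating-point division is used so the fractional part stays visible.

```python
def effective_latency_ms(rtt_us: int, jitter_ms: int, comp_time_ms: int = 10) -> float:
    """Effective latency: RTT (converted from microseconds) + 2 * jitter + fixed allowance."""
    return rtt_us / 1000 + jitter_ms * 2 + comp_time_ms

LIMIT_MS = 150  # delay ceiling quoted in the text

for rtt_us, jitter_ms in [(11338, 0), (129209, 7)]:
    eff = effective_latency_ms(rtt_us, jitter_ms)
    verdict = "ok" if eff <= LIMIT_MS else "too high"
    print(f"rtt={rtt_us} us, jitter={jitter_ms} ms -> {eff:.3f} ms ({verdict})")
```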
Packet Loss
Packet loss occurs when a packet does not successfully reach its destination. It can significantly degrade voice quality, causing choppy audio or complete dropouts.
Common Causes
Network Issues:
- Unavailable network bandwidth or congestion
- Buffer overload with insufficient queue space
- ACL or firewall rules dropping packets
- ISP rate limiting
Device Issues:
- CPU unable to handle encryption/decryption at high speed
- Low battery causing underperformance
- Physical hardware damage (router, switch, cabling)
- Softphone/hardphone limitations
- Bluetooth headset range/signal weakness
Network Configuration:
- High-priority packet preferences causing lower-priority drops
Radio Frequency Issues:
- Interference from high voltage systems or microwaves in wireless networks
- Weak signals causing packet drops
- Late-arriving packets dropped by codec
User Experience: Manifests as chopped voice or complete dropout for moments
Impact on Audio
When packets are lost:
- Listener hears clipped or missing words
- Complete audio dropout during loss bursts
- Codec attempts error concealment but quality suffers
Obtaining Packet Loss Details
RFC 3550 Method: Uses the RTP header sequence numbers. If there are gaps in the sequence, the media stream monitor flags the missing packets as lost.
CDR Analysis: Derived from the difference between total (expected) packets and received packets.
RTCP XR Monitoring: RTCP Extended Reports (RFC 3611) record real-time drops with detailed statistics.
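The sequence-number method can be sketched as follows. This is a simplified take on the bookkeeping in the RFC 3550 appendix: it extends the 16-bit sequence number across wraparound and compares expected with received packets, but it ignores stream restarts and large jumps. The function name is chosen for this example, and `MAX_DROPOUT` mirrors the constant used in the RFC's sample code.

```python
RTP_SEQ_MOD = 1 << 16   # sequence numbers are 16 bits wide
MAX_DROPOUT = 3000      # largest forward gap still treated as in-order

def loss_from_sequence_numbers(seqs):
    """Return (expected, lost, loss_percent) for a list of received sequence numbers."""
    if not seqs:
        return 0, 0, 0.0
    base_seq = max_seq = seqs[0]
    cycles = 0
    received = 0
    for seq in seqs:
        received += 1
        udelta = (seq - max_seq) % RTP_SEQ_MOD
        if 0 < udelta < MAX_DROPOUT:
            if seq < max_seq:          # the 16-bit counter wrapped around
                cycles += RTP_SEQ_MOD
            max_seq = seq
        # otherwise: duplicate or reordered packet, highest sequence unchanged
    expected = cycles + max_seq - base_seq + 1
    lost = expected - received
    return expected, lost, 100.0 * lost / expected

# Packets 65535 and 2 never arrive; the counter wraps in between
expected, lost, pct = loss_from_sequence_numbers([65533, 65534, 0, 1, 3])
print(expected, lost, round(pct, 1))
```

Because duplicates also count as received, the loss figure from this method can be lower than the true number of missing packets.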
Packet Loss Thresholds
| Loss Rate | Quality |
|---|---|
| < 0.5% | Good |
| 0.5% – 0.9% | Average |
| ≥ 0.9% | Bad |
Jitter
Jitter is the variation in the delay of received packets in a flow, measured by comparing the interval at which RTP packets were sent to the interval at which they were received.
For instance, if packet #1 and packet #2 leave 30 milliseconds apart and arrive 50 milliseconds apart, then the jitter is 20 milliseconds. If packets are transmitted every 15 ms and reach the destination every 15 ms, there is no variability and the jitter is 0.
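Both cases can be checked with a short calculation. In the sketch below, `spacing_differences` returns the raw difference between receive spacing and send spacing for each packet pair, and `interarrival_jitter` applies the running estimator from RFC 3550, which moves the estimate by 1/16 of each new difference. The function names and sample timestamps are chosen for this example.

```python
def spacing_differences(send_ms, recv_ms):
    """Difference between receive spacing and send spacing for each consecutive pair."""
    return [
        (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        for i in range(1, len(send_ms))
    ]

def interarrival_jitter(send_ms, recv_ms):
    """Smoothed jitter estimate: J += (|D| - J) / 16 for each new difference D."""
    jitter = 0.0
    for d in spacing_differences(send_ms, recv_ms):
        jitter += (abs(d) - jitter) / 16
    return jitter

# Sent 30 ms apart, received 50 ms apart: a 20 ms difference
print(spacing_differences([0, 30], [100, 150]))            # [20]
# Sent every 15 ms and received every 15 ms: no variation
print(interarrival_jitter([0, 15, 30], [40, 55, 70]))      # 0.0
```

The smoothed estimate reacts slowly by design, so a single 20 ms difference only moves it to 1.25 ms; sustained variation is needed to push it higher.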
Causes of jitter
- Frames bigger than the jitter buffer size
- Algorithms that back off after collisions by introducing delays in packet transmission on half-duplex interfaces
- Even small jitter can get much worse on slow or congested links
- Jitter can be introduced by bottlenecks at router buffers, rerouting or parallel routes to the same destination, load sharing, or routing table changes that alter the path
Handling jitter:
Jitter below 30 ms is manageable with the help of jitter buffers in codecs. Above that, the codec starts to drop late-arriving packets and cannot splice the remaining packets into a smooth media stream, causing media quality issues such as clipped audio.
Detecting Jitter
Passive Monitoring:
- Analyze inter-packet gaps in Wireshark
- RFC 3611 real-time jitter buffer usage metrics
- RFC 7005 RTCP XR block for de-jitter buffer metrics
Active Tools:
- Network sniffers (Wireshark, tcpdump)
- Path analyzers
- Application Performance Monitoring (APM) tools
- CDR analyzers
- SNMP (Simple Network Management Protocol) collectors
| Metric | Good | Average | Bad |
|---|---|---|---|
| Jitter | <= 10 ms | 10 ms – 30 ms | >= 30 ms |
| Packet Loss | < 0.5% | 0.5% – 0.9% | >= 0.9% |
| Audio Level | > -40 dB | -80 dB to -40 dB | < -80 dB |
| RTT | < 200 ms | 200 ms – 300 ms | > 300 ms |
Ref: ITU-T P.800.1, Mean opinion score (MOS) terminology
Methods for objective and subjective assessment of speech and video quality.
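A monitoring script could apply the jitter, packet loss, and RTT thresholds from the table as in the sketch below. The dictionary keys and the function name are assumptions for this example, and the comparison operators follow the table's own boundary notation (for instance, exactly 10 ms of jitter still counts as Good).

```python
import operator

# (test for Good, test for Bad); anything in between is Average
THRESHOLDS = {
    "jitter_ms":       ((operator.le, 10),  (operator.ge, 30)),
    "packet_loss_pct": ((operator.lt, 0.5), (operator.ge, 0.9)),
    "rtt_ms":          ((operator.lt, 200), (operator.gt, 300)),
}

def rate(metric: str, value: float) -> str:
    """Classify one measurement as Good, Average, or Bad."""
    (good_op, good_limit), (bad_op, bad_limit) = THRESHOLDS[metric]
    if good_op(value, good_limit):
        return "Good"
    if bad_op(value, bad_limit):
        return "Bad"
    return "Average"

# Hypothetical measurements for one call leg
sample = {"jitter_ms": 12, "packet_loss_pct": 0.2, "rtt_ms": 340}
ratings = {name: rate(name, value) for name, value in sample.items()}
print(ratings)
```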
Scheduling for low bandwidth networks
The ability of the end application or the RTP proxy to deal with packet loss or delays depends on its processing techniques, particularly the encoding and buffering techniques used to cope with a high packet loss rate.
Mapping R-value to calculate MOS
To map MOS from the R value using the metrics defined above, a standard formula is used. First, latency and jitter are combined and a fixed allowance for computation time is added, resulting in the effective latency.
Step 1: Calculate effective latency
effectiveLatency = latency + (jitter * latencyImpact) + compTime
Step 2: Subtract effective latency from R-base value
R = 93 - (effectiveLatency / factorLatencyBased)
Step 3: Apply packet loss impact
R = R - (lostPackets * impact)
Step 4: Calculate MOS
MOS = 1 + (0.035 * R) + (R * (R - 60) * (100 - R) * 0.000007)
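The four steps can be put together in a short sketch. The jitter multiplier of 2 and the 10 ms allowance match the effective latency formula given earlier in this article; the base value of 93.2 (the steps write it as 93), the latency divisor of 40, and the packet loss impact of 2.5 per percent are assumptions borrowed from a commonly quoted simplification of the E-model, not values stated here. The MOS conversion multiplies the cubic term by R, as in ITU-T G.107, which reproduces the R = 90 and R = 50 examples given earlier.

```python
def r_factor(latency_ms, jitter_ms, loss_pct,
             latency_impact=2, comp_time_ms=10,
             r_base=93.2, latency_factor=40, loss_impact=2.5):
    """Steps 1-3: effective latency, latency penalty, packet loss penalty."""
    effective_latency = latency_ms + jitter_ms * latency_impact + comp_time_ms
    r = r_base - effective_latency / latency_factor
    r -= loss_pct * loss_impact
    return max(0.0, min(100.0, r))   # keep R on its 0-100 scale

def r_to_mos(r):
    """Step 4: R to MOS conversion, with R clamped to 0-100."""
    r = max(0.0, min(100.0, r))
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 0.000007

def estimate_mos(latency_ms, jitter_ms, loss_pct):
    return r_to_mos(r_factor(latency_ms, jitter_ms, loss_pct))

print(round(r_to_mos(90), 1))   # 4.3
print(round(r_to_mos(50), 1))   # 2.6
print(round(estimate_mos(latency_ms=13.43, jitter_ms=0, loss_pct=0), 2))
```

Exposing the constants as keyword arguments makes it easy to tune the penalties against listening tests without touching the conversion itself.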
Media Stats and MOS on RTP engine Kamailio
Minimum edge Values
mos_min_pv
minimum encountered MOS value for the call.
range – 1.0 to 5.0.
mos_min_at_pv
timestamp of when the minimum MOS value was encountered during the call
mos_min_packetloss_pv
amount of packetloss in percent at the time the minimum MOS value was encountered
mos_min_roundtrip_pv
packet round-trip time in milliseconds at the time the minimum MOS value was encountered
mos_min_jitter_pv
amount of jitter in milliseconds at the time the minimum MOS value was encountered
Maximum edge Values
mos_max_pv
maximum encountered MOS value for the call.
mos_max_at_pv
timestamp of when the maximum MOS value was encountered during the call
mos_max_packetloss_pv
amount of packetloss in percent at maximum MOS moment
mos_max_roundtrip_pv
packet round-trip time in milliseconds at maximum MOS moment
mos_max_jitter_pv
amount of jitter in milliseconds at maximum MOS moment
Average Values
mos_average_pv : average (median) MOS value for the call. Range – 1.0 through 5.0.
mos_average_packetloss_pv : average (median) amount of packetloss in percent present throughout the call.
mos_average_jitter_pv : average (median) amount of jitter in milliseconds present throughout the call.
mos_average_roundtrip_pv : average (median) packet round-trip time in milliseconds present throughout the call.
mos_average_samples_pv : number of samples used to determine the other “average” MOS data points.
Labels
mos_A_label_pv : custom label used in rtpengine signalling.
If set, all the statistics pseudovariables with the A suffix will be filled in with statistics only from the call legs that match the label given in this variable.
A label’s min
mos_min_A_pv
mos_min_at_A_pv
mos_min_packetloss_A_pv
mos_min_jitter_A_pv
mos_min_roundtrip_A_pv
A label’s max
mos_max_A_pv
mos_max_at_A_pv
mos_max_packetloss_A_pv
mos_max_jitter_A_pv
mos_max_roundtrip_A_pv
A label’s average
mos_average_A_pv
mos_average_packetloss_A_pv
mos_average_jitter_A_pv
mos_average_roundtrip_A_pv
mos_average_samples_A_pv
B label’s min
mos_B_label_pv
mos_min_B_pv
mos_min_at_B_pv
mos_min_packetloss_B_pv
mos_min_jitter_B_pv
mos_min_roundtrip_B_pv
B label’s max
mos_max_B_pv
mos_max_at_B_pv
mos_max_packetloss_B_pv
mos_max_jitter_B_pv
mos_max_roundtrip_B_pv
B label’s average
mos_average_B_pv
mos_average_packetloss_B_pv
mos_average_jitter_B_pv
mos_average_roundtrip_B_pv
mos_average_samples_B_pv
Setting MOS collection on kamailio
Set the rtpengine module parameters in the Kamailio config to name the variables that will hold the specific MOS values:
modparam("rtpengine", "mos_average_pv", "$avp(mos_average)")
modparam("rtpengine", "mos_average_packetloss_pv", "$avp(mos_average_packetloss)")
modparam("rtpengine", "mos_average_jitter_pv", "$avp(mos_average_jitter)")
modparam("rtpengine", "mos_average_roundtrip_pv", "$avp(mos_average_roundtrip)")
modparam("rtpengine", "mos_average_samples_pv", "$avp(mos_average_samples)")
modparam("rtpengine", "mos_min_pv", "$avp(mos_min)")
modparam("rtpengine", "mos_min_at_pv", "$avp(mos_min_at)")
modparam("rtpengine", "mos_min_packetloss_pv", "$avp(mos_min_packetloss)")
modparam("rtpengine", "mos_min_jitter_pv", "$avp(mos_min_jitter)")
modparam("rtpengine", "mos_min_roundtrip_pv", "$avp(mos_min_roundtrip)")
modparam("rtpengine", "mos_max_pv", "$avp(mos_max)")
modparam("rtpengine", "mos_max_at_pv", "$avp(mos_max_at)")
modparam("rtpengine", "mos_max_packetloss_pv", "$avp(mos_max_packetloss)")
modparam("rtpengine", "mos_max_jitter_pv", "$avp(mos_max_jitter)")
modparam("rtpengine", "mos_max_roundtrip_pv", "$avp(mos_max_roundtrip)")
modparam("rtpengine", "mos_A_label_pv", "$avp(mos_A_label)")
modparam("rtpengine", "mos_average_packetloss_A_pv", "$avp(mos_average_packetloss_A)")
modparam("rtpengine", "mos_average_jitter_A_pv", "$avp(mos_average_jitter_A)")
modparam("rtpengine", "mos_average_roundtrip_A_pv", "$avp(mos_average_roundtrip_A)")
modparam("rtpengine", "mos_average_A_pv", "$avp(mos_average_A)")
modparam("rtpengine", "mos_B_label_pv", "$avp(mos_B_label)")
modparam("rtpengine", "mos_average_packetloss_B_pv", "$avp(mos_average_packetloss_B)")
modparam("rtpengine", "mos_average_jitter_B_pv", "$avp(mos_average_jitter_B)")
modparam("rtpengine", "mos_average_roundtrip_B_pv", "$avp(mos_average_roundtrip_B)")
modparam("rtpengine", "mos_average_B_pv", "$avp(mos_average_B)")
For individual leg labelling, fill in the labels:
KSR.pv.sets("$avp(mos_A_label)","Aleg_label")
KSR.pv.sets("$avp(mos_B_label)","Bleg_label")
Gather the MOS stats in the routing script. The given example is in Lua.
The values are filled in after invoking “rtpengine_delete”, “rtpengine_query”, or “rtpengine_manage” if the command resulted in a deletion of the call (or call branch).
-- Log one MOS AVP by name; tostring() guards against nil when a value was not filled in
local function log_mos(name)
    KSR.log("info", name .. ": " .. tostring(KSR.pv.get("$avp(" .. name .. ")")) .. "\n")
end
-- Call-wide average, minimum and maximum values
for _, name in ipairs({
    "mos_average", "mos_average_packetloss", "mos_average_jitter",
    "mos_average_roundtrip", "mos_average_samples",
    "mos_min", "mos_min_at", "mos_min_packetloss", "mos_min_jitter", "mos_min_roundtrip",
    "mos_max", "mos_max_at", "mos_max_packetloss", "mos_max_jitter", "mos_max_roundtrip",
}) do
    log_mos(name)
end
-- Per-leg averages, logged only when the corresponding label was set
for _, leg in ipairs({ "A", "B" }) do
    if KSR.pv.get("$avp(mos_" .. leg .. "_label)") ~= nil then
        log_mos("mos_average_packetloss_" .. leg)
        log_mos("mos_average_jitter_" .. leg)
        log_mos("mos_average_roundtrip_" .. leg)
        log_mos("mos_average_" .. leg)
    end
end
Sample result obtained for one leg (MOS is reported multiplied by 10, and round-trip time in microseconds):
"average MOS": {
"MOS": 43,
"round-trip time": 13430,
"jitter": 0,
"packet loss": 0,
"samples": 4
},
"lowest MOS": {
"MOS": 43,
"round-trip time": 24184,
"jitter": 0,
"packet loss": 0,
"reported at": 1590498085
},
"highest MOS": {
"MOS": 44,
"round-trip time": 8218,
"jitter": 0,
"packet loss": 0,
"reported at": 1590498089
},
CDR with MOS on Freeswitch
<?xml version="1.0"?>
<cdr core-uuid="[UUID]" switchname="freeswitch">
  <channel_data>
    <state> <direction> <state_number> <flags> <caps>
  </channel_data>
  <call-stats>
    // Audio components
    // Video components
  </call-stats>
  // Variables
  <app_log>
    <application app_name="..." app_data="...">
    <application app_name="..." app_data="...">
  </app_log>
  // Callflow
</cdr>
Audio
<audio>
  <inbound>
    <raw_bytes> <media_bytes>
    <packet_count> <media_packet_count> <skip_packet_count> <jitter_packet_count>
    <dtmf_packet_count> <cng_packet_count> <flush_packet_count>
    <largest_jb_size> <jitter_min_variance> <jitter_max_variance>
    <jitter_loss_rate> <jitter_burst_rate>
    <mean_interval> <flaw_total> <quality_percentage> <mos>
  </inbound>
  <outbound>
    <raw_bytes> <media_bytes>
    <packet_count> <media_packet_count> <skip_packet_count>
    <dtmf_packet_count> <cng_packet_count>
    <rtcp_packet_count> <rtcp_octet_count>
  </outbound>
</audio>
Video
<video>
  <inbound>
    <raw_bytes> <media_bytes>
    <packet_count> <media_packet_count> <skip_packet_count> <jitter_packet_count>
    <dtmf_packet_count> <cng_packet_count> <flush_packet_count>
    <largest_jb_size> <jitter_min_variance> <jitter_max_variance>
    <jitter_loss_rate> <jitter_burst_rate>
    <mean_interval> <flaw_total> <quality_percentage> <mos>
  </inbound>
  <outbound>
    <raw_bytes> <media_bytes>
    <packet_count> <media_packet_count> <skip_packet_count>
    <dtmf_packet_count> <cng_packet_count>
    <rtcp_packet_count> <rtcp_octet_count>
  </outbound>
</video>
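Once a CDR has been written, the inbound statistics can be pulled out with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` and assumes the element nesting outlined above (`cdr` > `call-stats` > `audio` > `inbound`); the sample document and all of its values are invented for illustration, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed CDR; a real record carries many more elements
SAMPLE_CDR = """<?xml version="1.0"?>
<cdr core-uuid="example" switchname="freeswitch">
  <call-stats>
    <audio>
      <inbound>
        <packet_count>1500</packet_count>
        <skip_packet_count>3</skip_packet_count>
        <quality_percentage>98.50</quality_percentage>
        <mos>4.45</mos>
      </inbound>
    </audio>
  </call-stats>
</cdr>"""

def inbound_stats(cdr_xml: str, media: str = "audio") -> dict:
    """Return the inbound call-stats of one media type as a name -> float mapping."""
    root = ET.fromstring(cdr_xml)
    inbound = root.find(f"./call-stats/{media}/inbound")
    if inbound is None:
        return {}
    return {
        child.tag: float(child.text)
        for child in inbound
        if child.text and child.text.strip()
    }

stats = inbound_stats(SAMPLE_CDR)
print(stats["mos"], stats["quality_percentage"])
```

Passing `media="video"` reads the video block in the same way, returning an empty mapping when the call carried no video.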
The variables
<variables> <is_outbound> <uuid><session_id><text_media_flow> <direction> <ep_codec_string> <channel_name> <secondary_recovery_module> <verto_dvar_email><verto_dvar_avatar><jsock_uuid_str> <verto_user><presence_id> <verto_client_address><chat_proto> <verto_host><event_channel_cookie> <verto_profile_name> <record_stereo><default_areacode><transfer_fallback_extension> <toll_allow><accountcode><user_context><effective_caller_id_name><effective_caller_id_number> <outbound_caller_id_name><outbound_caller_id_number><callgroup><user_name><domain_name> <Event-Name> <Core-UUID> <FreeSWITCH-Hostname><FreeSWITCH-Switchname><FreeSWITCH-IPv4><FreeSWITCH-IPv6><Event-Date-Local><Event-Date-GMT><Event-Date-Timestamp> <Event-Calling-File> <Event-Calling-Function> <Event-Calling-Line-Number> <Event-Sequence> <verto_remote_caller_id_name><verto_remote_caller_id_number> <switch_r_sdp> <call_uuid><open> <rtp_secure_media> <export_vars><conference_enter_sound> <conference_exit_sound><video_banner_text> <rtp_use_codec_string><remote_audio_media_flow> <audio_media_flow> <rtp_audio_recv_pt> <rtp_use_codec_name> <rtp_use_codec_fmtp> <rtp_use_codec_rate> <rtp_use_codec_ptime> <rtp_use_codec_channels> <rtp_last_audio_codec_string> <original_read_codec> <original_read_rate> <write_codec><write_rate> <remote_audio_ip> <remote_audio_port> <remote_audio_rtcp_ip> <remote_audio_rtcp_port> <dtmf_type> <remote_video_media_flow> <video_media_flow> <video_possible> <rtp_video_pt> <rtp_video_recv_pt> <video_read_codec> <video_read_rate><video_write_codec><video_write_rate><rtp_last_video_codec_string> <rtp_use_video_codec_name> <rtp_use_video_codec_rate> <rtp_use_video_codec_ptime> <remote_video_ip><remote_video_port> <remote_video_rtcp_ip><remote_video_rtcp_port> <local_media_ip><local_media_port> <advertised_media_ip> <rtp_use_timer_name><rtp_use_pt> <rtp_use_ssrc><rtp_2833_send_payload> <rtp_2833_recv_payload><remote_media_ip> <remote_media_port><local_video_ip> 
<local_video_port><rtp_use_video_pt><rtp_use_video_ssrc><rtp_local_sdp_str><current_application_data><current_application><send_silence_when_idle><rtp_has_crypto><endpoint_disposition><conference_name><conference_member_id><conference_moderator><conference_ghost><conference_uuid><video_width><video_height><video_fps><verto_hangup_disposition><read_codec><read_rate><hangup_cause><hangup_cause_q850> <digits_dialed> <start_stamp><profile_start_stamp><answer_stamp><progress_media_stamp><end_stamp> <start_epoch><start_uepoch> <profile_start_epoch><profile_start_uepoch> <answer_epoch><answer_uepoch> <bridge_epoch><bridge_uepoch> <last_hold_epoch><last_hold_uepoch> <hold_accum_seconds><hold_accum_usec><hold_accum_ms><resurrect_epoch><resurrect_uepoch> <progress_epoch><progress_uepoch><progress_media_epoch><progress_media_uepoch> <end_epoch><end_uepoch> <last_app><last_arg><caller_id><duration><billsec><progresssec><answersec><waitsec><progress_mediasec> <flow_billsec> <mduration><billmsec><progressmsec><answermsec><waitmsec><progress_mediamsec><flow_billmsec><uduration> <billusec><progressusec><answerusec><waitusec><progress_mediausec> <flow_billusec> <rtp_audio_in_raw_bytes> <rtp_audio_in_media_bytes> <rtp_audio_in_packet_count> <rtp_audio_in_media_packet_count> <rtp_audio_in_skip_packet_count><rtp_audio_in_jitter_packet_count><rtp_audio_in_dtmf_packet_count> <rtp_audio_in_cng_packet_count> <rtp_audio_in_flush_packet_count> <rtp_audio_in_largest_jb_size> <rtp_audio_in_jitter_min_variance><rtp_audio_in_jitter_max_variance> <rtp_audio_in_jitter_loss_rate> <rtp_audio_in_jitter_burst_rate> <rtp_audio_in_mean_interval> <rtp_audio_in_flaw_total> <rtp_audio_in_quality_percentage> <rtp_audio_in_mos> <rtp_audio_out_raw_bytes> <rtp_audio_out_media_bytes> <rtp_audio_out_packet_count> <rtp_audio_out_media_packet_count><rtp_audio_out_skip_packet_count><rtp_audio_out_dtmf_packet_count> <rtp_audio_out_cng_packet_count> <rtp_audio_rtcp_packet_count> <rtp_audio_rtcp_octet_count> 
<rtp_video_in_raw_bytes> <rtp_video_in_media_bytes> <rtp_video_in_packet_count> <rtp_video_in_media_packet_count> <rtp_video_in_skip_packet_count><rtp_video_in_jitter_packet_count><rtp_video_in_dtmf_packet_count> <rtp_video_in_cng_packet_count> <rtp_video_in_flush_packet_count> <rtp_video_in_largest_jb_size> <rtp_video_in_jitter_min_variance><rtp_video_in_jitter_max_variance> <rtp_video_in_jitter_loss_rate> <rtp_video_in_jitter_burst_rate> <rtp_video_in_mean_interval ><rtp_video_in_flaw_total> <rtp_video_in_quality_percentage> <rtp_video_in_mos> <rtp_video_out_raw_bytes> <rtp_video_out_media_bytes> <rtp_video_out_packet_count> <rtp_video_out_media_packet_count><rtp_video_out_skip_packet_count><rtp_video_out_dtmf_packet_count> <rtp_video_out_cng_packet_count> <rtp_video_rtcp_packet_count> <rtp_video_rtcp_octet_count> </variables>
The Callflow components
<callflow dialplan="XML" unique-id="[UUID]" profile_index="1">
  <extension name="myconference" number="3500">
    <application app_name="..." app_data="...">
  </extension>
  <caller_profile>
    <username> <dialplan>
    <caller_id_name> <caller_id_number> <callee_id_name> <callee_id_number>
    <ani> <aniii> <network_addr> <rdnis> <destination_number>
    <uuid> <source> <context> <chan_name>
  </caller_profile>
  <times>
    <created_time> <profile_created_time> <progress_time> <progress_media_time>
    <answered_time> <bridged_time> <last_hold_time> <hold_accum_time>
    <hangup_time> <resurrect_time> <transfer_time>
  </times>
</callflow>
Standardizing bodies:
- ITU (the International Telecommunication Union) is the United Nations specialised agency in the field of telecommunications, information and communication technologies (ICTs).
- ITU-T (ITU Telecommunication Standardisation Sector) is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardising telecommunications on a worldwide basis.
As the technology for packet switching has matured, voice quality on circuit-switched and packet-switched networks has become mostly indistinguishable. However, the flaws of VoIP communication systems reappear under poor network conditions and bad architecture design. Especially for applications that are greedy for network bandwidth, such as large-scale conferencing or HD streaming, the need for monitoring and quality control is very high, and it can only be met with the QoS parameters described above.
References
- CDR on freeswitch
- ITU-T G.114 TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU (05/2003) SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS , International telephone connections and circuits – General Recommendations on the transmission qua
- Kamailio RTP engine https://www.kamailio.org/docs/modules/devel/modules/rtpengine.html
- RFC 3550 – RTP: A Transport Protocol for Real-Time Applications
- RFC 3611 – RTP Control Protocol (RTCP) Extended Reports (XR)
- RFC 7005 – RTP Control Protocol (RTCP) Extended Report (XR) Block for De-Jitter Buffer Metric Reporting
- ITU-T G.107 – The E-model: a computational model for use on transmission planning
- ITU-T P.800.1 – Mean opinion score (MOS) terminology
- ITU-T G.110 – Mean Opinion Score (MOS) calculation
- VoIP Metrics: RTP and RTCP
