Media Architecture, RTP topologies

With the sudden onset of Covid-19 and the growing trend of working from home, the demand for scalable conferencing solutions and virtual meeting rooms has skyrocketed. Here is my advice if you are building an auto-scalable conferencing solution.

This article is about media server setups that provide mid- to high-scale conferencing over SIP to various endpoints, including SIP softphones, PBXs, carrier/PSTN and WebRTC.

Point to Point

Endpoints communicating over unicast
RTP and RTCP traffic is private between sender and receiver, even if the endpoints contain multiple SSRCs in the RTP session.

    Advantages of P2P
    • Facilitates private communication between the parties

    Disadvantages of P2P
    • The number of streams between the participants is limited only by physical constraints such as bandwidth and the number of available ports

    Point to Point via Middlebox

    Same as above but with a middle-box involved


    Mostly used to provide interoperability between otherwise non-interoperable endpoints, for example by transcoding codecs or converting transports.
    The middlebox does not use an SSRC of its own and keeps the SSRC of an RTP stream across the translation.

    Subtypes of Middlebox:

    Transport/Relay Anchoring

    Handles roles such as NAT traversal by pinning the media path to a relay in the public address domain or to a TURN server

    Middleboxes for auditing or for privacy control of participants' IP addresses

    Other SBC (Session Border Controller)-like characteristics are also part of this topology setup

    Transport translator

    Interconnects networks, for example multicast to unicast

    Repacketizes media so that endpoints using other, non-RTP protocols can join the session

    Media translator

    Modifies the media inside RTP streams, commonly known as transcoding

    Can do up to full decoding and re-encoding of RTP streams

    In many cases it can also act on behalf of endpoints that do not support RTP, receiving and responding to feedback reports and performing FEC (forward error correction)

    Back-To-Back RTP Session

    Much like a translator middlebox, but it establishes a separate RTP session leg with each endpoint and bridges the two sessions.

    Takes complete responsibility for forwarding the correct RTP payload and maintaining the relation between the SSRCs and CNAMEs

    Advantages of Back-To-Back RTP Session
    • The B2BUA / media bridge takes responsibility for relaying media and manages congestion

    Disadvantages of Back-To-Back RTP Session
    • It can be subjected to a man-in-the-middle (MITM) attack or have a backdoor to eavesdrop on conversations

    Point to Point using Multicast

    Any-Source Multicast (ASM)

    Traffic from any participant sent to the multicast group address reaches all other participants

    Source-Specific Multicast (SSM)

    Only selected senders stream to the multicast group, which distributes the streams to the receivers

    Point to Multipoint using Mesh

    Many unicast RTP streams between the endpoints form a mesh

    Point to Multipoint + Translator

    Some more variants of this topology are Point to Multipoint with Mixer

    Media Mixing Mixer

    Receives RTP streams from several endpoints and selects the stream(s) to be included in a media-domain mix. The selection can be through static configuration or by dynamic, content-dependent means such as voice activation. The mixer then creates a single outgoing RTP stream from this mix.
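To make the mixing step concrete, here is a minimal sketch, assuming the selected streams have already been decoded to 16-bit PCM frames of equal sample rate. The class name PcmMixer and its method are hypothetical and not taken from any particular media server; a real mixer would also handle gain control and re-encoding.

```java
import java.util.List;

// Hypothetical helper: mixes decoded 16-bit PCM frames by summing the
// selected streams sample by sample and clipping to the 16-bit range.
final class PcmMixer {
    private PcmMixer() {}

    static short[] mix(List<short[]> frames, int frameSize) {
        short[] out = new short[frameSize];
        for (int i = 0; i < frameSize; i++) {
            int sum = 0;
            for (short[] frame : frames) {
                if (i < frame.length) {
                    sum += frame[i];
                }
            }
            // Clip instead of letting the sum wrap around.
            out[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
        }
        return out;
    }
}
```

The single outgoing frame returned by mix() is what the mixer would encode and send under its own SSRC.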

    Media Switching Mixer

    RTP mixer based on media switching avoids the media decoding and encoding operations in the mixer, as it conceptually forwards the encoded media stream.

    The Mixer can reduce bitrate or switch between sources like active speakers.

    SFU ( Selective Forwarding Unit)

    Middlebox can select which of the potential sources ( SSRC) transmitting media will be sent to each of the endpoints. This transmission is set up as an independent RTP Session.

    Extensively used in videoconferencing topologies with scalable video coding as well as simulcasting.
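The selection step of an SFU can be sketched as follows. This is only an illustration under stated assumptions: the audio level per SSRC and the maxStreams limit are made-up inputs, and SfuSelector is a hypothetical class, not the API of any real SFU.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of SFU stream selection: for one receiver, forward the
// loudest senders and never echo the receiver's own stream back to it.
final class SfuSelector {
    private SfuSelector() {}

    static List<Long> select(Map<Long, Integer> audioLevelBySsrc, long receiverSsrc, int maxStreams) {
        List<Long> candidates = new ArrayList<>();
        for (Long ssrc : audioLevelBySsrc.keySet()) {
            if (ssrc != receiverSsrc) {
                candidates.add(ssrc);
            }
        }
        // Loudest first; ties are broken by SSRC so the result is deterministic.
        candidates.sort((a, b) -> {
            int byLevel = Integer.compare(audioLevelBySsrc.get(b), audioLevelBySsrc.get(a));
            return byLevel != 0 ? byLevel : Long.compare(a, b);
        });
        return new ArrayList<>(candidates.subList(0, Math.min(maxStreams, candidates.size())));
    }
}
```

The packets of the selected SSRCs are forwarded without decoding, which is what distinguishes this from the mixing approach.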

    Advantages of SFU
    • Low latency and low jitter buffer requirement, because re-encoding is avoided

    Disadvantages of SFU
    • Unable to manage the network and control bitrate

    At a high level, one can safely assume that, given current average internet bandwidth, mesh architectures make sense for 3-6 peers; any number above that requires a centralized media architecture.

    Among the centralized media architectures, an SFU makes sense for roughly 6-15 people in a conference; if the number of participants exceeds that, it may need to switch to MCU mode.
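The rule of thumb above can be written down as a small decision function. This is a sketch only: the thresholds are the rough figures from the text (with 6 participants assigned to mesh, which is my assumption for the overlapping boundary), and TopologyChooser is a hypothetical name.

```java
// Illustrative thresholds taken from the rule of thumb in the text; not hard limits.
final class TopologyChooser {
    enum Topology { POINT_TO_POINT, MESH, SFU, MCU }

    private TopologyChooser() {}

    static Topology choose(int participants) {
        if (participants < 2) {
            throw new IllegalArgumentException("need at least two participants");
        }
        if (participants == 2) return Topology.POINT_TO_POINT;
        if (participants <= 6) return Topology.MESH;   // assumption: 6 still fits a mesh
        if (participants <= 15) return Topology.SFU;
        return Topology.MCU;
    }

    // Unidirectional media streams in a full mesh: every endpoint sends to every other one.
    static int meshStreams(int participants) {
        return participants * (participants - 1);
    }
}
```

The meshStreams() count shows why a mesh stops scaling: 4 peers need 12 unidirectional streams, while 10 peers would already need 90.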

    Other Hybrid Topologies

    There are various topologies for multi-endpoint conferences. Hybrid topologies include forwarding video while mixing audio, or auto-switching between configurations as load increases or decreases or according to a paid premium or free plan.

    Hybrid model

    Some endpoints receive forwarded streams while others receive mixed/composited streams.

    Serverless models

    Centralized topology in which one endpoint serves as an MCU or SFU.

    Used by Jitsi and Skype

    Point to Multipoint Using Video-Switching MCUs

    Much like an MCU, but it can also switch the bitrate and resolution of the forwarded stream based on the active speaker, the host or presenter, or floor-control rules.

    This setup can embed the characteristics of translator, selector and can even do congestion control based on RTCP

    To handle a multipoint conference scenario it acts as a translator forwarding the selected RTP stream under its own SSRC, with the appropriate CSRC values and modifies the RTCP RRs it forwards between the domains

    Cascaded SFUs

    Chaining SFUs reduces latency while also enabling scalability; however, it takes a toll on server network resources as well as endpoint resources.

    Transport Protocols

    Before getting into an in-depth discussion of all possible types of Media Architectures in VoIP systems, let us learn about TCP vs UDP

    TCP is a reliable, connection-oriented protocol that exchanges SYN and ACK segments to establish a connection between communicating parties. It sends packets in sequence, and individual packets can be resent when the receiver detects loss or out-of-order delivery. It is thus used for session creation due to its error correction and congestion control features.

    Once a session is established, the media automatically shifts to RTP over UDP. UDP is less reliable and guarantees neither delivery nor non-duplication, and it has no error correction, but it is used because packets of other protocols can be tunnelled, that is encapsulated, inside UDP packets. However, to provide E2E security, other methods for authentication and encryption are used.
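Since RTP rides inside the UDP payload, a receiver has to parse the RTP header itself. Below is a minimal sketch of reading the 12-byte fixed header (version, payload type, sequence number, timestamp, SSRC) from a UDP payload. The class name RtpHeader is hypothetical, and CSRC lists and header extensions are deliberately ignored.

```java
import java.nio.ByteBuffer;

// Parses only the 12-byte fixed RTP header; CSRCs and extensions are not handled.
final class RtpHeader {
    final int version;
    final int payloadType;
    final int sequenceNumber;
    final long timestamp;
    final long ssrc;

    private RtpHeader(int version, int payloadType, int sequenceNumber, long timestamp, long ssrc) {
        this.version = version;
        this.payloadType = payloadType;
        this.sequenceNumber = sequenceNumber;
        this.timestamp = timestamp;
        this.ssrc = ssrc;
    }

    static RtpHeader parse(byte[] udpPayload) {
        if (udpPayload.length < 12) {
            throw new IllegalArgumentException("RTP fixed header is 12 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(udpPayload); // network byte order (big-endian) by default
        int b0 = buf.get() & 0xFF;
        int b1 = buf.get() & 0xFF;
        int version = b0 >> 6;              // top two bits of the first byte
        int payloadType = b1 & 0x7F;        // low seven bits of the second byte
        int sequenceNumber = buf.getShort() & 0xFFFF;
        long timestamp = buf.getInt() & 0xFFFFFFFFL;
        long ssrc = buf.getInt() & 0xFFFFFFFFL;
        return new RtpHeader(version, payloadType, sequenceNumber, timestamp, ssrc);
    }
}
```

The sequence number and timestamp are what let the receiver detect the loss, duplication and reordering that UDP itself does not correct.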

    Audio PCAP storage and Privacy constraints for Media Servers

    A call session produces various traces for offline monitoring and analysis, which can include

    CDR (Call Detail Records) – to and from numbers, ring time, answer time, duration etc.

    Signalling PCAPs – usually collected from the SIP application server, containing the SIP requests, SDP and responses. They show the call flow sequence, for example who sent the INVITE and who sent the BYE or CANCEL, and how many times the call was updated or paused/resumed.

    Media stats – jitter, buffer, RTT and MOS for all legs, plus average values

    Audio PCAPs – the recording of the RTP stream and RTCP packets between the parties, which requires explicit consent from the customer or user. Without that consent, VoIP companies complying with GDPR cannot record the audio stream of calls and preserve it for any purpose such as audit, call quality debugging or inspection by themselves.

    Throwing more light on audio PCAP storage: assuming the user provides explicit permission, here is the approach for carrying out the recording and storage operations.

    Furthermore, strict access control, encryption and anonymisation of the media packets are necessary to obfuscate details of the call session.
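One simple form of anonymisation is masking the host part of participant IP addresses before a capture is stored. The sketch below zeroes the last octet of an IPv4 address; the choice of a /24 mask and the class name IpAnonymizer are my assumptions for illustration, not a compliance recommendation.

```java
// Hypothetical helper: zeroes the last octet of a dotted-quad IPv4 address.
final class IpAnonymizer {
    private IpAnonymizer() {}

    static String maskIpv4(String address) {
        String[] octets = address.split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("not a dotted-quad IPv4 address: " + address);
        }
        return octets[0] + "." + octets[1] + "." + octets[2] + ".0";
    }
}
```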

    References :

    To learn about the difference between media server topologies

    • centralized vs decentralised,
    • SFU vs MCU,
    • multicast vs unicast,

    Read – SIP conferencing and Media Bridges

    To read more about building a scalable VoIP server-side architecture and

    • Clustering the servers with a common cache for high availability and prompt failure recovery
    • Multitier architecture, i.e. separation between the data/session layer and the application server/engine layer
    • Microservice-based architecture, i.e. the difference between proxies like load balancer, SBC, backend services, OSS/BSS etc.
    • Containerization and autoscaling

    Read – VoIP/ OTT / Telecom Solution startup’s strategy for Building a scalable flexible SIP platform

    Wowza Secure URL params Authentication for streams in an application

    This post is useful for securing the publishers of a common application with a username and password specific to stream names. It uses ModuleCoreSecurity to prompt the user to supply credentials.

    The detailed code below checks the RTMP query string for parameters and performs two checks: is the user allowed to connect, and is the user allowed to stream on the given stream name.

    Initialize the HashMap containing publisher clients and the IApplicationInstance

    HashMap<Integer, String> publisherClients = null;
    IApplicationInstance appInstance = null;

    On app start, initialize the IApplicationInstance object.

    public void onAppStart(IApplicationInstance appInstance) {
        this.appInstance = appInstance;
    }

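    onConnect is called when any publisher tries to connect to the media server. At this event, collect the userName and clientId from the client.
    Check whether publisherClients contains the userName which the client has provided, else reject the connection.

    public void onConnect(IClient client, RequestFunction function, AMFDataList params) {
        AMFDataObj obj = params.getObject(2);
        AMFData data = obj.get("app");
        // Assumption: the app string looks like "appName?user=<name>"; the split pattern was lost in the original listing.
        String[] paramlist = data.toString().split("\\?");
        if (paramlist.length > 1) {
            String[] userParam = paramlist[1].split("=");
            String userName = userParam[1];
            if (this.publisherClients == null) {
                this.publisherClients = new HashMap<Integer, String>();
            }
            // The remaining checks were elided in the original listing; this sketch only remembers the user name.
            this.publisherClients.put(client.getClientId(), userName);
        } else {
            client.rejectConnection();
        }
    }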

    AMFDataItem: class for marshalling data between Wowza Pro server and Flash client.
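The query-string handling in onConnect can be tried outside Wowza with a small stand-alone sketch. It assumes the connect string has the form appName?key=value&key=value; the class name ConnectQuery and the parameter names in the usage note are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-alone sketch (no Wowza classes): extracts key=value pairs that follow
// the "?" in an application connect string.
final class ConnectQuery {
    private ConnectQuery() {}

    static Map<String, String> parse(String app) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = app.indexOf('?');
        if (q < 0 || q == app.length() - 1) {
            return params; // no query string supplied
        }
        for (String pair : app.substring(q + 1).split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }
}
```

For example, ConnectQuery.parse("live?user=testuser") yields a map with the single entry user=testuser, and a connect string without a query string yields an empty map, which is the case where the connection would be rejected.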

    When the user starts to publish a stream after a successful connection, the publish function is called. It extracts the stream name from the client (function extractStreamName()) and checks whether the user is allowed to stream on the given stream name (function isStreamNotAllowed()).

    public void publish(IClient client, RequestFunction function, AMFDataList params) {
        String streamName = extractStreamName(client, function, params);
        if (isStreamNotAllowed(client, streamName)) {
            sendClientOnStatusError(client, "NetStream.Publish.Denied", "Stream name not allowed for the logged in user: " + streamName);
            return; // do not fall through to the default publish handler
        }
        invokePrevious(client, function, params);
    }

    This function is called when a publisher disconnects from the server. It removes the client from publisherClients.

    public void onDisconnect(IClient client) {
        this.publisherClients.remove(client.getClientId());
    }

    The function to extract a streamname is

    public String extractStreamName(IClient client, RequestFunction function, AMFDataList params) {
        String streamName = params.getString(PARAM1);
        if (streamName != null) {
            // Split off the stream extension, keeping only the bare name.
            String streamExt = MediaStream.BASE_STREAM_EXT;
            String[] streamDecode = ModuleUtils.decodeStreamExtension(streamName, streamExt);
            streamName = streamDecode[0];
            streamExt = streamDecode[1];
        }
        return streamName;
    }

    The function to check if the stream name is allowed for the given user

    public boolean isStreamNotAllowed(IClient client, String streamName) {
        WMSProperties localWMSProperties = client.getAppInstance().getProperties();
        String allowedStreamName = localWMSProperties.getPropertyStr(this.publisherClients.get(client.getClientId()));
        // Strip any query string from the stream name before comparing.
        String sName = streamName;
        if (streamName.contains("?")) {
            sName = streamName.substring(0, streamName.lastIndexOf("?"));
        }
        return !sName.equalsIgnoreCase(allowedStreamName);
    }
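The comparison itself has no Wowza dependency, so it can be sketched and tested on its own. StreamNameCheck is a hypothetical helper that mirrors the logic above: drop any query string from the requested name, then compare case-insensitively with the name allowed for the user.

```java
// Stand-alone sketch of the stream-name check; returns true when publishing is allowed.
final class StreamNameCheck {
    private StreamNameCheck() {}

    static boolean isAllowed(String requested, String allowed) {
        if (requested == null || allowed == null) {
            return false; // nothing configured for this user, so deny
        }
        int q = requested.lastIndexOf('?');
        String bare = q >= 0 ? requested.substring(0, q) : requested;
        return bare.equalsIgnoreCase(allowed);
    }
}
```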

    When adding the application to the Wowza server, make sure that ModuleCoreSecurity is present under Modules in Application.xml

    <Description>Core Security Module for Applications</Description>

    Also ensure that property securityPublishRequirePassword is present under properties


    Add the user credentials as properties too. For example to give access to testuser with password 123456 to stream on myStream include the following ,


    Also include the mapping of user and password inside of conf/publish.password file

    # Publish password file (format [username][space][password])
    # username password

    testuser 123456
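The file format shown above (one username and password per line, separated by a space, with # starting a comment) is simple enough to parse in a few lines. The sketch below is only an illustration of that format; PublishPasswords is a hypothetical class and is not how Wowza itself reads the file.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser for the "[username][space][password]" format; skips comments and blank lines.
final class PublishPasswords {
    private PublishPasswords() {}

    static Map<String, String> parse(List<String> lines) {
        Map<String, String> credentials = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] parts = line.split("\\s+", 2);
            if (parts.length == 2) {
                credentials.put(parts[0], parts[1]);
            }
        }
        return credentials;
    }
}
```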

    IP Multimedia Subsystem ( IMS )

     IMS is an architectural framework for IP-based, multimedia-rich communications. It was standardized by a group called 3GPP, formed in 1998.
    It started as an enabler for 3rd generation mobile networks in the European market and later spread to wireline networks too. IMS became the key to Fixed Mobile Convergence (FMC).
    Based on IETF protocols (such as SIP, RTP, RTSP, COPS, DIAMETER, etc.), IMS is now crucial for controlling communication in an IP-based Next Generation Network (NGN).
    Communication service providers and telecom operators are migrating from circuit-switched networks to IMS technology with increasing bandwidth (5G) and user expectations.

    Why IMS ?

    Early TDM networks were not robust enough to support emerging technologies and data networking. There was a need to migrate from voice-only networks to triple-play networks (voice, video and data).

    Other factors included:

    • rapid service development
    • service availability in both home and roaming networks
    • wireline and wireless convergence

    For these reasons TDM became outdated and IMS gained support.


    What benefits does IMS bring ?

    It offers countless applications around rich multimedia services on wireless, packet-switched and even traditional circuit-switched networks.

    Easier to Create and Deploy New Applications and Services
    •  Enhanced applications are easier to develop due to open APIs and common network services.
    • Third-party developers can offer their own applications and use common network services, sharing profits with minimal risk
    •  New services involving concurrent sessions of multimedia (voice, video, and data) during the same call are now possible
    • Reduced time-to-market for new services is possible because service providers are not tied to the timescales and functions of their primary NEPs
    Capture New Subscribers, Retain Current Subscribers
    • Better voice quality for business applications, such as conferencing, is possible
    • Wireless applications (like SMS, and so on) can be offered to wire line or broadband subscribers.
    • Service providers can more easily offer bundled services.
    Lower Operating and Capital Costs
    • Cost-effective implementation of services is possible across multiple transports, such as Push-To-Talk (PTT), presence and Location-Based Services (LBS), Fixed-Mobile Convergence (FMC), mobile video services, and so on.
    • Common provisioning, management, and billing systems are supported for all networks.
    • Significantly lower transport costs result when moving from time-switched to packet-switched channels.
    • Service providers can take advantage of competitive offerings from multiple NEPs for most network elements.
    • IMS results in reduced expenses for delivering licensed content to subscribers of different types of devices, encodings, or networks.


    Another reason for the widespread adoption of IMS is that it follows standards and open interfaces from 3GPP and ETSI, and is flexible for policy control, OSS/BSS, value-added services etc.


    IMS features

    1. Abstraction from Underlying Network:
    IMS essentially leads towards an open and standardized network and interface, irrespective of the underlay network.
    2. Fixed/Mobile Convergence
    Interoperability with the Circuit Switched (CS) Mobile Application Part (MAP)
    3. Roaming
    Location awareness between home and visited network.
    4. Application Layer Call Control
    The IMS application layer has provision for defining proxy- or B2BUA-based call flow completion. This lets the operator introduce business logic into call sessions.
    IMS is supplemented by SIP (IETF), Diameter (IETF) and H.248 (ITU-T). The release cycle of IMS is as follows:
    • 2002-03-14 Rel-5 : IMS was introduced with SIP. QoS voice over MGW.
    • 2004-12-16 Rel-6 : Services like emergency, voice call continuity, IP-CAN (IP Connectivity Access Network)
    • 2005-09-28 Rel-7 : Single Radio Voice Call Continuity, multimedia telephony, eCall, ICS
    • 2008-12-11 Rel-8 : IMS centralized services, supplementary services and interworking between IMS and circuit-switched networks, charging, QoS
    • 2009-12-10 Rel-9 : IMS emergency numbers on GPRS, EPS (Evolved Packet System), custom alert tone, MM broadcast/multicast
    • 2011-03-23 Rel-10 : Home NodeB, M2M, roaming and inter-UE transfer
    • 2012-09-12 Rel-11 : tbd
    • 2014-09-17 Rel-12 : tbd
    • 2015-12-11 Rel-13 : tbd

    IMS Layers

    IMS is broadly divided into the 3 horizontal layers given below:


    • Transport / Media Endpoint Layer

    Unifies transports and media from analog, digital, or broadband formats to Real-time Transport Protocol (RTP) and SIP protocols. This is accomplished by media gateways and signaling gateways.

    It also includes media servers with media processing elements to allow for announcements, in-band signaling, and conferencing. These media servers are shared across all applications (voicemail, interactive response systems, push-to-talk, and so on), maximizing statistical use of the equipment and creating a common base of media services without “hard-coding” these services into the applications.

    • Session and Control Layer

    This layer arranges logical connections between various other network elements. It provides registration of end-points, routing of SIP messages, and overall coordination of media and signaling resources.

    The IMS core, which is part of this layer, primarily contains 2 important elements: the Call Session Control Function (CSCF) and the Home Subscriber Server (HSS) database. These are explained below.

    HSS Home Subscriber Server

    It is a database of user profiles and location information. It is responsible for name/address resolution and also for authorization/authentication.

    CSCF Call Session Control Function

    Handles most routing, session and security related operations for SIP messages. It is further divided into 3 parts:

    • Proxy CSCF: P-CSCF is the first point of contact for any SIP UA. It proxies UE requests to the subsystem.
    • Serving CSCF: S-CSCF is a powerful part of the IMS core, as it decides how a UE request will be forwarded to the application servers.
    • Interrogating CSCF: I-CSCF initiates the assignment of a user to an S-CSCF (by querying the HSS) during registration.

    • Application Services Layer

     The Application Services Layer contains multiple Application Servers (AS), such as:
    • Telephony Application Server (TAS) – for defining custom call flow logic
    • IP Multimedia Services Switching Function (IM-SSF)
    • Open Service Access Gateway (OSA-GW), and so on.

    Update on IMS :

    IMS has been mandated as the control architecture for Voice over LTE (VoLTE) networks. IMS is also being widely adopted to manage traffic for Voice over WiFi (VoWiFi) systems.

    FreeSwitch SIP and Media Server

    FreeSWITCH is free and open source communications software licensed under the Mozilla Public License. It is often the core of a voice platform, providing call routing and media control. Its core library, libfreeswitch, is capable of being embedded into other projects, as well as being used as a stand-alone application.

    FreeSWITCH is designed to route and interconnect popular communication protocols using audio, video, text, or any other form of media. First released in January 2006, FreeSWITCH has grown to become the world’s premier open source soft-switch platform. This versatile platform is used to power voice, video, and chat communications on devices ranging from single calls on a Raspberry Pi to large server clusters handling millions of calls. FreeSWITCH powers a number of commercial products from start-ups to carriers.


    It can perform the functions of (but is not limited to):

    • PBX Server (Transcoding B2BUA)
    • IVR & Announcement Server
    • Conference host
    • Voicemail
    • Session Border Controller
    • Text to Speech (TTS)
    • VOIP endpoint
    • Class 5 softswitch

    FreeSWITCH has a modular architecture which is both scalable and customisable. The most important modules are Endpoint, Dialplan and Application.

    An application is an instruction added to a particular dialplan extension. Data arguments are also passed to an application. Examples include set: configure an extension parameter; bridge: bridge a new channel to the existing one; answer: answer the call for a channel; hangup: hang up the current channel; or running an IVR menu.

    Protocols set up call legs/channels, negotiate codecs and stream media. The endpoint module helps to bridge channels between endpoints that support different protocols. SIP, the most popular protocol for VoIP sessions, is implemented by the mod_sofia module, while RTP is built into the FreeSWITCH core. WebRTC endpoints, whose media runs over SRTP, are supported through mod_verto.

    Architecture and Design of Freeswitch

    FreeSWITCH can form the basis of a complicated and sophisticated communications backend framework with a thousand CPS (calls per second). It can connect to VoIP (voice over IP) as well as PSTN (Public Switched Telephone Network) and PRI (Primary Rate Interface – used in enterprise communication).


    Data structures are opaque and operations can be invoked through APIs, with routines getting maximum reuse.

    Threaded Model

    Enables parallel operation as every connection has its own thread. Event handlers push incoming events into threads. Subsystems run in background threads.


    Channel Variables

    Channel variables are used to manipulate dialplan execution, to control call progress, and to provide options to applications. They play a pervasive role, as FreeSWITCH™ frequently consults channel variables as a way to customize processing prior to a channel’s creation, during call progress, and after the channel hangs up.


    • $${variable} is expanded once when FreeSWITCH™ first parses the configuration on startup or after invoking reloadxml. It is suitable for variables that do not change, such as the domain of a single-tenant FreeSWITCH™ server.

    <param name="domain" value="$${domain}"/>

    • ${variable} is expanded during each pass through the dialplan, so it is used for variables that are expected to change, such as the ${destination_number} or ${sip_to_user} fields.
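The difference between the two expansion phases can be illustrated with a small simulation. This is not FreeSWITCH code: VariableExpander is a hypothetical class, and replacing unknown variables with an empty string is my assumption. The first pass resolves $${...} once at load time, the second pass resolves ${...} per call.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative two-phase expansion: $${name} at configuration load, ${name} per call.
final class VariableExpander {
    private static final Pattern GLOBAL = Pattern.compile("\\$\\$\\{(\\w+)\\}");
    // The lookbehind keeps the per-call pass from touching a leftover $${name}.
    private static final Pattern CHANNEL = Pattern.compile("(?<!\\$)\\$\\{(\\w+)\\}");

    private VariableExpander() {}

    static String expandGlobals(String text, Map<String, String> globals) {
        return expand(GLOBAL, text, globals);
    }

    static String expandChannel(String text, Map<String, String> channelVars) {
        return expand(CHANNEL, text, channelVars);
    }

    private static String expand(Pattern pattern, String text, Map<String, String> values) {
        Matcher m = pattern.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Assumption: an unknown variable expands to the empty string.
            String value = values.getOrDefault(m.group(1), "");
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Running expandGlobals once on a dial string and then expandChannel for each call mirrors why $${domain} suits values that never change while ${destination_number} suits per-call values.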

    Setting a channel variable :

    <action application="set" data="rtp_secure_media=true"/>

    Reading a channel variable:

    <route service="E2U+SIP" regex="sip:(.*)" replace="sofia/${use_profile}/$1;transport=udp"/>

    Exporting channel variables in bridge operations

    • from one to another call leg using export_var
    • exporting to a list using export application
    <action application="export" data="dialed_extension=$1"/>

    Custom channel variables can be defined anytime too such as

    <action application="set" data="conference_auto_outcall_caller_id_name=Mad Boss"/>

    Channel variables can also be limited in scope to an extension. An example of passing some channel variables to the log application:

    <action application="log" data="INFO Inbound call CallUUID ${call_uuid} SIPCallID ${sip_call_id}- from ${caller_id_number} to ${destination_number}"/>

    If the conditions are not met, optional anti-actions are executed.

    <extension name="is_secure" continue="true">
        <!-- Only truly consider it secure if it is TLS and SRTP -->
        <condition field="${sip_via_protocol}" expression="tls"/>
        <condition field="${rtp_secure_media_confirmed}" expression="^true$">
            <action application="sleep" data="2000"/>
            <action application="playback" data="misc/call_secured.wav"/>
            <anti-action application="eval" data="not_secure"/>
        </condition>
    </extension>

    Inline actions are executed during the hunting phase of dialplan


    A dialplan is designed to look up a list of instructions from the central XML registry within FreeSWITCH. In general, dialplans are used to route a dialed call to an endpoint based on the extension and its conditions. When a matching extension is found, its actions are executed. Combinations of these can create detailed control and call flow plans. FreeSWITCH uses Perl-compatible regular expressions (PCRE) for pattern matching. A few dial string formats:

    • sofia/profile2/8765@ , will dial out 8765 at host using profile2
    • sofia/gateway/ , will dial through a Gateway (SIP Provider) to user 5432
    • sofia/profile2/8765@;transport=tcp , dialing with specific transport like TCP, UDP, TLS, or SCTP.
    • {absolute_codec_string=PCMU}sofia/external/sip:9106@${local_ip_v4}:5080 , to specify the codecs

    Speak Time and Date on Call

    When the dialed number matches the regular expression ^9172$, the call is answered and put to sleep for 1 second, the say application speaks the current date and time, and then the application hangs up.

    <extension name="speak_date_time">
        <condition field="destination_number" expression="^9172$">
            <action application="answer"/>
            <action application="sleep" data="1000"/>
            <action application="say" data="en CURRENT_DATE_TIME pronounced ${strepoch()}"/>
            <action application="hangup"/>
        </condition>
    </extension>

    There may be 3 kinds of contexts  :

    1. default  : used for all internal users  such as PBX . Local_Extension can route the call between internal users .
    2. public  : used by external world users such as DID
    3. features : other custom in call features using bind_meta_app application etc

    Call Routing based on destination number and forwarding to voice mail on no answer

    Configure the sip driver to use the custom context while processing the call such as ,

    <profile name="telco_custom_sipprofile">
        <param name="context" value="custom_sipcontext"/>

    When a call arrives for destination 501, the condition matches and the actions in this block are executed, as in the example below.
    Extension 501 rings; when not answered, the call sleeps for 1 second and is then forwarded to voicemail.

    If the call to 501 was answered, i.e. handed off, the further actions are not executed.

    <context name="custom_sipcontext">
        <extension name="501">
            <condition field="destination_number" expression="^501$">
                <action application="bridge" data="user/501"/>
                <action application="answer"/>
                <action application="sleep" data="1000"/>
                <action application="bridge" data="loopback/app=voicemail:default ${domain_name} ${dialed_extension}"/>
            </condition>
        </extension>
    </context>

    Call routing based on day and time

    <extension name="Time of day, day of week setup" continue="true">
        <condition wday="2-6" hour="8-16" break="never">
            <action application="set" data="office_status=open" inline="true"/>
            <anti-action application="set" data="office_status=closed" inline="true"/>
        </condition>
        <condition wday="2-6" time-of-day="1:30-2:30" break="never">
            <action application="set" data="office_status=lunch" inline="true"/>
        </condition>
    </extension>

    inline="true" makes the set action run during the hunting phase, so the channel variable is available to later conditions, while break="never" and continue="true" tell FreeSWITCH to keep looking for more condition matches in case of a failed or successful match respectively.
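The first condition above can be expressed as plain logic to check what it evaluates to for a given moment. The sketch assumes FreeSWITCH's weekday numbering of 1 (Sunday) to 7 (Saturday), so wday="2-6" means Monday to Friday, and that hour="8-16" covers 08:00 through 16:59. OfficeHours is a hypothetical class, and the lunch condition is left out.

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;

// Illustrative evaluation of wday="2-6" hour="8-16"; not FreeSWITCH code.
final class OfficeHours {
    private OfficeHours() {}

    // java.time counts Monday=1..Sunday=7; FreeSWITCH counts Sunday=1..Saturday=7.
    static int freeswitchWday(DayOfWeek day) {
        return day.getValue() % 7 + 1;
    }

    static String officeStatus(LocalDateTime now) {
        int wday = freeswitchWday(now.getDayOfWeek());
        int hour = now.getHour();
        boolean open = wday >= 2 && wday <= 6 && hour >= 8 && hour <= 16;
        return open ? "open" : "closed"; // action vs anti-action
    }
}
```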

    Match incoming network IP address with pre configured IP

    Store incoming number to $1 variable and bridge the call with custom profile . Read more about sip profiles in sections below .

    <extension name="ipmatch">
        <condition field="network_addr" expression="^198\.168\.1\.0$"/>
        <condition field="destination_number" expression="^(\d+)$">
            <action application="bridge" data="sofia/customprofile/$1@"/>
        </condition>
    </extension>

    Note: the value of $1 is not available outside of the condition block.
    Store captured values in standard variables

    <action application="set" data="domain_name=$${domain}"/>

    The following example stores destination_number (a FreeSWITCH variable) into ‘dialed_number’

    <extension name="ipmatch_variable">
        <condition field="destination_number" expression="^(\d+)$">
            <action application="set" data="dialed_number=$1"/>
        </condition>
        <condition field="network_addr" expression="^192\.168\.1\.1$">
            <action application="bridge" data="sofia/customprofile/${dialed_number}@"/>
        </condition>
    </extension>


    Media recording and playback in audio (wav)

    <extension name="recording">
        <condition field="destination_number" expression="^(4444)$">
            <action application="answer"/>
            <action application="set" data="playback_terminators=#"/>
            <action application="record" data="/tmp/audiofile.wav 20 200"/>
        </condition>
    </extension>
    <extension name="playback">
        <condition field="destination_number" expression="^(5555)$">
            <action application="answer"/>
            <action application="set" data="playback_terminators=#"/>
            <action application="playback" data="/tmp/audiofile.wav"/>
        </condition>
    </extension>

    Routing by listening on the audio stream for a touch-tone * followed by a single digit.

    If the called user dials *1, then the execute_extension::dx XML features command is executed.

    <extension name="Local_Extension">
        <condition field="destination_number" expression="^(10[01][0-9])$">
            <action application="export" data="dialed_extension=$1"/>
            <!-- bind_meta_app can have these args <key> [a|b|ab] [a|b|o|s] <app> -->
            <action application="bind_meta_app" data="1 b s execute_extension::dx XML features"/>
            <action application="bind_meta_app" data="2 b s record_session::${recordings_dir}/${caller_id_number}.${strftime(%Y-%m-%d-%H-%M-%S)}.wav"/>
            <action application="bind_meta_app" data="3 b s execute_extension::cf XML features"/>
            <action application="bind_meta_app" data="4 b s execute_extension::att_xfer XML features"/>
        </condition>
    </extension>

    The dx extension in features accepts the digits and proceeds as defined with the call

    <extension name="dx">
        <condition field="destination_number" expression="^dx$">
            <action application="answer"/>
            <action application="read" data="11 11 'tone_stream://%(10000,0,350,440)' digits 5000 #"/>
            <action application="execute_extension" data="is_transfer XML features"/>
        </condition>
    </extension>

    Some authentication and security related dialplan applications :-

    Checking user is authenticated before routing call , else respond 407

    <extension name="9191">
    <condition field="destination_number" expression="^9191$"/>
    <condition field="${sip_authorized}" expression="true">
        <anti-action application="respond" data="407"/>
        <action application="playback" data="misc/connected_securly.wav"/>
    </condition>
    </extension>

    Checking that the call uses TLS signalling and SRTP media, else evaluating not_secure

    <extension name="is_secure">
    <condition field="${sip_via_protocol}" expression="tls"/>
    <condition field="${rtp_secure_media_confirmed}" expression="^true$">
    <action application="sleep" data="1000"/>
    <action application="playback" data="misc/connected_securly.wav"/>
    <anti-action application="eval" data="not_secure"/>
    </condition>
    </extension>

    Catching invalid destinations or extensions

    Catch numbers which did not match any other extension. Add this extension at the bottom of the dialplan. It plays an invalid-extension prompt.

    <extension name="catchall">
    <condition field="destination_number" expression=".*" continue="true">
        <action application="playback" data="misc/invalid_extension.wav"/>
    </condition>
    </extension>

    Call screening and blocking dialplan applications

    Call Screening by name announcement

    The caller's name is stored in a WAV file named after the caller ID number

    <action application="set" data="call_screen_filename=/tmp/${caller_id_number}-name.wav"/>

    Answer the caller, prompt for their name and record it. Since playback_terminators is set to all the digits plus # and *, pressing any one of them ends the recording

    <action application="set" data="hangup_after_bridge=true" />
    <action application="answer"/>
    <action application="sleep" data="1000"/>
    <action application="phrase" data="voicemail_record_name"/>
    <action application="playback" data="tone_stream://%(500, 0, 640)"/>
    <action application="set" data="playback_terminators=#*0123456789"/>
    <action application="record" data="${call_screen_filename} 7 200 2"/>

    Bridge to the called party and announce the recorded name. If the called party presses 1 the call is connected, otherwise that leg is dropped.

    <action application="set" data="group_confirm_key=1"/>
    <action application="set" data="fail_on_single_reject=true"/>
    <action application="set" data="group_confirm_file=phrase:screen_confirm:${call_screen_filename}"/>
    <action application="set" data="continue_on_fail=true"/>
    <action application="bridge" data="user/$1"/>

    If the called party hangs up or does not accept, the dialplan continues (continue_on_fail=true) and the caller is connected to voicemail.

    <action application="voicemail" data="default ${domain} $1"/>

    Finally, hang up

    <action application="hangup"/>

    Block caller ID

    Dial *77 followed by the destination number. The call is then placed to that number with the caller ID withheld.

    <extension name="block_caller_id">
    <condition field="destination_number" expression="^\*77(\d+)$">
    <action application="privacy" data="full"/>
    <action application="set" data="sip_h_Privacy=id"/>
    <action application="set" data="privacy=yes"/>
    <action application="transfer" data="$1 XML default"/>
    </condition>
    </extension>

    Block certain codes

    Block certain NPAs (area codes) that you do not want to terminate, based on the area code of the caller ID, and respond with SIP 503 to the originator so that they can route-advance if they have another carrier to terminate to.

    <extension name="blocked_cid_npa">
    <condition field="caller_id_number" expression="^(\+1|1)?((876|809)\d{7})$">
    <action application="respond" data="503"/>
    <action application="hangup"/>
    </condition>
    </extension>
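A quick way to sanity-check which caller IDs this expression catches is to run it outside FreeSWITCH. The sketch below uses Python's re module as a stand-in for the dialplan regex engine, which behaves the same for a simple pattern like this one; the is_blocked helper and the sample numbers are made up for illustration.

```python
import re

# Expression copied from the blocked_cid_npa condition above.
BLOCKED_NPA = re.compile(r"^(\+1|1)?((876|809)\d{7})$")


def is_blocked(caller_id_number: str) -> bool:
    """Return True when the caller ID falls in one of the blocked area codes."""
    return BLOCKED_NPA.match(caller_id_number) is not None


if __name__ == "__main__":
    for number in ("+18765551234", "18095551234", "8765551234", "12125551234"):
        print(number, "blocked" if is_blocked(number) else "allowed")
```

The optional (\+1|1) prefix means the same area code is caught whether the carrier sends the number in +1, 1 or bare 10-digit form.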

    DID – Direct Inward Dialling via dialplan Public.xml

    Assume we have a DID number 676767 which is served by a telco provider over either a SIP trunk or PRI lines. When someone from the outside world calls this number, FreeSWITCH needs to route the call to an internal user, for example the user at extension 3003 (in the default context).

    <extension name="public_did">
    <condition field="destination_number" expression="^\+?1?(676767)$">
        <action application="set" data="domain_name=$${domain}"/>
        <action application="transfer" data="3003 XML default"/>
    </condition>
    </extension>

    If we are on a multi-domain setup, we need to set the domain correctly. $${domain} is the default domain set in vars.xml, but you can set it to any domain configured in the user directory. The optional characters in front of the DID number in the expression allow for country-code prefixes such as +1 or 1. The pattern has to be extended to accept other formats such as 91 or 0.
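To see which number formats the DID expression accepts, it can be tried in Python, whose re module handles this simple pattern the same way. DID_PATTERN is copied from the condition above. WIDER_PATTERN is a hypothetical variant of mine that also accepts 91 and a leading 0, shown only as one possible way to widen the match.

```python
import re
from typing import Optional

# Pattern copied from the public_did condition above.
DID_PATTERN = re.compile(r"^\+?1?(676767)$")

# Hypothetical wider pattern: optional +1/1, +91/91 or a leading 0.
WIDER_PATTERN = re.compile(r"^(?:\+?(?:1|91)|0)?(676767)$")


def extract_did(pattern: "re.Pattern[str]", destination: str) -> Optional[str]:
    """Return the captured DID (group 1) or None when there is no match."""
    match = pattern.match(destination)
    return match.group(1) if match else None


if __name__ == "__main__":
    for dest in ("+1676767", "1676767", "676767", "91676767", "0676767"):
        print(dest, extract_did(DID_PATTERN, dest), extract_did(WIDER_PATTERN, dest))
```

With the original pattern only the +1, 1 and bare forms match, so a number arriving as 91676767 would fall through to the next extension.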

    IVR (Interactive Voice Response) using Menu

    Main menu – uses a TTS engine and allows 3 attempts to respond, with a timeout of 10 seconds.
    On pressing 1 – bridge the call to the conference; on pressing 2 – transfer to 2222 in the default context.
    On pressing 3 – transfer using enum; on pressing 4 – play the submenu; on pressing 9 – go to the top menu.

    <menu name="demo_ivr"
        greet-long="say:Press 1 to join the conference, press 2 to transfer, 3 to transfer, 4 to go to another menu">
        <entry action="menu-exec-app" digits="1" param="bridge sofia/${domain}/"/>
        <entry action="menu-exec-app" digits="2" param="transfer 2222 XML default"/>
        <entry action="menu-exec-app" digits="3" param="transfer 1234*256 enum"/>
        <entry action="menu-sub" digits="4" param="demo_ivr_submenu"/>
        <entry action="menu-exec-app" digits="/^(10[01][0-9])$/" param="transfer $1 XML features"/>
        <entry action="menu-top" digits="9"/>
    </menu>

    Submenu – press * to return to the top menu, # to exit. The timeout is 15 seconds.

    <menu name="demo_ivr_submenu">
        <entry action="menu-top" digits="*"/>
        <entry action="menu-exit" digits="#"/>
    </menu>

    Find me Follow Me

    If a user has, say, 3 phones (home, office and car), an incoming call should ring them one after another until the user picks up the phone closest to them. leg_delay_start is the delay after which an endpoint starts ringing, and leg_timeout is how long that endpoint rings.
    Therefore, as per the sample below, the home phone rings first, after 5 seconds the office phone rings, and after 15 seconds the cellphone 987654321 rings. After 25 seconds the call ends.

    <action application="bridge" data="user/, 
    [leg_delay_start=15,leg_timeout=25] sofia/gateway/flowroute/987654321" />

    DID can bridge to multiple extensions or gateways sequentially in a hunt pattern

    <extension name="did_hunt">
    <condition field="destination_number" expression="87654321">
    <action application="set" data="hangup_after_bridge=true"/>
    <action application="set" data="continue_on_fail=true"/>
    <!-- this is needed to allow call_timeout to work after bridging to a gateway -->
    <action application="set" data="ignore_early_media=true"/>
    <!-- ring desk extension for 10 seconds. -->
    <action application="set" data="call_timeout=10"/>
    <action application="bridge" data="sofia/${domain}/1001"/>
    <!-- Now try cell phone, hangup after 13 -->
    <action application="set" data="call_timeout=13"/>
    <action application="bridge" data="sofia/gateway/voicepulse/987654321" />
    <!-- No answer, transfer to voicemail -->
    <action application="answer"/>
    <action application="sleep" data="1000"/>
    <action application="voicemail" data="default ${domain} 1001"/>
    </condition>
    </extension>

    Ring Multiple Targets

    <action application="bridge" 
    user/7010@${domain}, user/7022@${domain}, user/7007@${domain}, 

    Handle Failures and Early Media

    <action application="bridge" 
    data="{ ignore_early_media=true, 
    destination_out_of_order:2:1776.7 }

    The conditions used to detect a failure in early media are:
    user busy – 3 detections of the 480 Hz and 620 Hz tone pair, which is the standard busy tone.
    destination out of order – 2 detections of the 1776.7 Hz tone.
    Note that only these frequencies are detected and acted on; others are ignored.
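The conditions above can be pictured as a small data structure. The sketch below assumes each condition is written as cause:hits:freq[+freq] and that several conditions are joined with "!", which is the shape of the destination_out_of_order:2:1776.7 fragment in the sample. The user_busy entry in the example string is reconstructed from the description, not copied from the sample, and the parser itself is only an illustration.

```python
from typing import List, NamedTuple, Tuple


class ToneCondition(NamedTuple):
    cause: str                    # hangup cause to raise, e.g. user_busy
    hits: int                     # how many times the tone must be detected
    freqs_hz: Tuple[float, ...]   # one or more tone frequencies in Hz


def parse_early_media_fail(spec: str) -> List[ToneCondition]:
    """Parse 'cause:hits:f1+f2!cause:hits:f1' into ToneCondition tuples."""
    conditions = []
    for item in spec.split("!"):
        cause, hits, freqs = item.strip().split(":")
        freq_values = tuple(float(f) for f in freqs.split("+"))
        conditions.append(ToneCondition(cause, int(hits), freq_values))
    return conditions


if __name__ == "__main__":
    # Assumed example string built from the two conditions described above.
    spec = "user_busy:3:480+620!destination_out_of_order:2:1776.7"
    for condition in parse_early_media_fail(spec):
        print(condition)
```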


    A simple directory listing containing two groups with 2 users each

    <domain name="${domain}">
    <param name="dial-string" value="{^^:sip_invite_domain=${dialed_domain}:
    <variable name="record_stereo" value="true"/>
    <variable name="default_gateway" value="${default_provider}"/>
    <variable name="default_areacode" value="${default_areacode}"/>
    <variable name="transfer_fallback_extension" value="operator"/>
    <group name="default">
            <X-PRE-PROCESS cmd="include" data="default/*.xml"/>
    <group name="team1">
            <user id="1000" type="pointer"/>
            <user id="1001" type="pointer"/>
    <group name="team2">
            <user id="1002" type="pointer"/>
            <user id="1003" type="pointer"/>

    1001 user’s xml

    <user id="1001">
    <param name="password" value="${default_password}"/>
        <variable name="toll_allow" value="domestic,international,local"/>
        <variable name="accountcode" value="1001"/>
        <variable name="user_context" value="default"/>
        <variable name="effective_caller_id_name" value="Extension 1001"/>
        <variable name="effective_caller_id_number" value="1001"/>
        <variable name="outbound_caller_id_name" value="${outbound_caller_name}"/>
        <variable name="outbound_caller_id_number" value="${outbound_caller_id}"/>
        <variable name="callgroup" value="team1"/>

    Adding users

    /usr/src/freeswitch-debs/freeswitch# scripts/perl/add_user 3000
    perl: warning: Setting locale failed.
    perl: warning: Please check that your locale settings:
     LANGUAGE = (unset),
     LC_ALL = (unset),
     LC_CTYPE = "UTF-8",
     LANG = "en_US.UTF-8"
        are supported and installed on your system.
    perl: warning: Falling back to a fallback locale ("en_US.UTF-8").
    Added 3000 in file /usr/local/freeswitch/conf/directory/default/3000.xml 
    Operation complete. 1 user added.
    Be sure to reloadxml.
    Regular expression information:
                Sample regex for all new users: ^3000$
    Sample regex for all new AND current users: ^(10(0[0-9]|1[0-9]|20)|3000)$
    In the default configuration you can modify the expression in the condition for 'Local_Extension'.

    Adding a range of users, 3000 to 3010

    Since 3000 was already added previously, the script warns and skips it; the rest are added successfully

    root@ip-172-31-27-106:/usr/src/freeswitch-debs/freeswitch# scripts/perl/add_user -users=3000-3010 
    perl: warning: Setting locale failed.
    perl: warning: Please check that your locale settings:
     LANGUAGE = (unset),
     LC_ALL = (unset),
     LC_CTYPE = "UTF-8",
     LANG = "en_US.UTF-8"
        are supported and installed on your system.
    perl: warning: Falling back to a fallback locale ("en_US.UTF-8").
    User id 3000 already exists, skipping...
    Added 3001 in file /usr/local/freeswitch/conf/directory/default/3001.xml 
    Added 3002 in file /usr/local/freeswitch/conf/directory/default/3002.xml 
    Added 3003 in file /usr/local/freeswitch/conf/directory/default/3003.xml 
    Added 3004 in file /usr/local/freeswitch/conf/directory/default/3004.xml 
    Added 3005 in file /usr/local/freeswitch/conf/directory/default/3005.xml 
    Added 3006 in file /usr/local/freeswitch/conf/directory/default/3006.xml 
    Added 3007 in file /usr/local/freeswitch/conf/directory/default/3007.xml 
    Added 3008 in file /usr/local/freeswitch/conf/directory/default/3008.xml 
    Added 3009 in file /usr/local/freeswitch/conf/directory/default/3009.xml 
    Added 3010 in file /usr/local/freeswitch/conf/directory/default/3010.xml 
    Operation complete. 10 users added.
    Be sure to reloadxml.
    Regular expression information:
                Sample regex for all new users: ^30(0[123456789]|10)$
    Sample regex for all new AND current users: ^(10(0[0-9]|1[0-9]|20)|30(0[0-9]|10))$
    In the default configuration you can modify the expression in the condition for 'Local_Extension'.

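    After adding the users to the directory, they can make outbound calls but are not yet reachable for incoming calls. To enable that, we need to add them to the dialplan.

    Creating a dialplan entry for the newly added users in conf/dialplan/default.xml

    Update the existing Local_Extension condition <condition field="destination_number" expression="^(10[01][0-9])$"> to use the "new AND current users" regex printed by the script, that is <condition field="destination_number" expression="^(10(0[0-9]|1[0-9]|20)|30(0[0-9]|10))$">. This keeps the existing extensions reachable and keeps the full number in the first capture group, which Local_Extension uses as $1. The "new users" regex ^30(0[123456789]|10)$ on its own would capture only the last two digits.

    After this, go to the fs_cli command prompt and run reloadxml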
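One detail worth checking before editing the condition: Local_Extension exports dialed_extension=$1, so what the first capture group contains matters. The short Python check below runs the two sample regexes printed by add_user through the re module. The first_group helper is mine, for illustration only.

```python
import re
from typing import Optional

# The two sample regexes printed by the add_user script.
NEW_ONLY = re.compile(r"^30(0[123456789]|10)$")
NEW_AND_CURRENT = re.compile(r"^(10(0[0-9]|1[0-9]|20)|30(0[0-9]|10))$")


def first_group(pattern: "re.Pattern[str]", number: str) -> Optional[str]:
    """Return what $1 would hold for this number, or None if it does not match."""
    match = pattern.match(number)
    return match.group(1) if match else None


if __name__ == "__main__":
    for number in ("1001", "3000", "3001", "3010", "3011"):
        print(number, first_group(NEW_ONLY, number), first_group(NEW_AND_CURRENT, number))
```

With the new-users pattern, group 1 holds only the trailing two digits (for example 01 for 3001), while the combined pattern wraps the whole number in group 1.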


    Quick Installation on MacOS

    Download and run the dmg (screenshots attached).

    Building from source on Ubuntu 16.04 Xenial

    *experimental not suitable for production as per Freeswitch docs

    The master branch depends on video libraries which are not available as packages in the Debian distribution but are available from the FreeSWITCH repository. It requires the use of the devscripts and cowbuilder packages.

    apt-get install git devscripts cowbuilder

    Change to root and add freeswitch to sources.list

    wget -O - | apt-key add -
    echo "deb jessie main" > /etc/apt/sources.list.d/freeswitch.list
    echo "deb-src jessie main" >> /etc/apt/sources.list.d/freeswitch.list
    apt-get update

    apt-get build-dep freeswitch

    cd /usr/src/

    git clone -bv1.8 freeswitch

    cd freeswitch

    git config pull.rebase true

    Enter freeswitch directory and Build

    ./ -j
    make install

    For errors such as “The repository ‘ xenial InRelease’ is not signed.” and “The following signatures couldn’t be verified because the public key is not available: NO_PUBKEY 0xxxxxxx”, please note that Debian 8 is currently the only OS version officially supported by FreeSWITCH. Hence, if using AWS (Amazon Web Services), stick with Ubuntu 14, i.e. Ubuntu Server 14.04 LTS (HVM), SSD Volume Type, which is also free tier eligible.

      Ubuntu      |       Debian  
    18.04  bionic     buster  / sid   - 10
    17.10  artful     stretch / sid   - 9
    17.04  zesty      stretch / sid
    16.10  yakkety    stretch / sid
    16.04  xenial     stretch / sid
    15.10  wily       jessie  / sid   - 8
    15.04  vivid      jessie  / sid
    14.10  utopic     jessie  / sid
    14.04  trusty     jessie  / sid
    13.10  saucy      wheezy  / sid   - 7
    13.04  raring     wheezy  / sid
    12.10  quantal    wheezy  / sid
    12.04  precise    wheezy  / sid
    11.10  oneiric    wheezy  / sid
    11.04  natty      squeeze / sid   - 6
    10.10  maverick   squeeze / sid
    10.04  lucid      squeeze / sid

    Manual process of bootstrap and configure

    Once the build is successful, install libtool-bin, libcurl4-openssl-dev, libpcre3-dev, libspeex-dev, libspeexdsp-dev, libtiff5, libtiff5-dev, yasm for libvpx, and liblua5.1-0-dev for scripting

    For mod_enum support install libldns-dev or disable it in modules.conf

    We can either install libedit-dev (>= 2.11) or configure with --disable-core-libedit-support


    For errors around Lua files such as "Cannot find lua.h header file", just do apt-get install lua5.2 and lua5.2-dev and copy the header files manually to the FreeSWITCH languages folder, such as
    cp -R /usr/include/lua5.2/ src/mod/languages/mod_lua/
    or you can copy these one by one: lauxlib.h lua.h lua.hpp luaconf.h lualib.h

    ln -s /usr/lib/x86_64-linux-gnu/ llua
    sudo make install
    sudo make uhd-sounds-install
    sudo make uhd-moh-install
    sudo make samples

    If you want to make lua from source

    mkdir -p ~/Developing/third_party
    cd Developing
    tar xf lua-5.3.2.tar.gz
    cd lua-5.3.2
    make linux 
    sudo make install INSTALL_TOP=/usr/local
    cd ~/Developing/third_party/rtags/build
    cmake -DLUA_INCLUDE_DIR=/usr/local/include/ -DLUA_LIBRARY=/usr/local/lib/liblua.a ../
    aptitude install -y -r -o APT::Install-Suggests=true freeswitch-meta-vanilla
    cp -a /usr/share/freeswitch/conf/vanilla /etc/freeswitch
    /etc/init.d/freeswitch start

    To see if freeswitch is running  – ps aux | grep freeswitch


    To check listening ports – netstat -lnp | grep 5060. To watch SIP traffic on that port – ngrep -W byline -d any port 5060

    Type              Protocol   Port range      Source
    Custom TCP Rule   TCP        5080 – 5081     0.0.0.0/0
    Custom TCP Rule   TCP        5080 – 5081     ::/0
    Custom UDP Rule   UDP        16384 – 32768   0.0.0.0/0
    Custom UDP Rule   UDP        16384 – 32768   ::/0
    All traffic       All        All             0.0.0.0/0
    All traffic       All        All             ::/0
    Custom TCP Rule   TCP        8021            0.0.0.0/0
    Custom TCP Rule   TCP        8021            ::/0
    Custom UDP Rule   UDP        5060 – 5062     0.0.0.0/0
    Custom UDP Rule   UDP        5060 – 5062     ::/0
    Custom UDP Rule   UDP        5080 – 5081     0.0.0.0/0
    Custom UDP Rule   UDP        5080 – 5081     ::/0
    Custom TCP Rule   TCP        8081 – 8082     0.0.0.0/0
    Custom TCP Rule   TCP        8081 – 8082     ::/0
    Custom TCP Rule   TCP        5060 – 5061     0.0.0.0/0
    Custom TCP Rule   TCP        5060 – 5061     ::/0



    • ACL
    • Fail2Ban
    • IPtables

    Debugging and Call

    For internal calls, the originate API can be used to initiate calls, in the form originate ALEG BLEG

    originate {origination_caller_id_number=9999988888}sofia/internal/1004@ 91999998888 XML default CALLER_ID_NAME CALLER_ID_NUMBER

    This will make a call out to sip:1004@127.0.0.1 with the Caller ID number set to 9999988888, then it will send the call to the XML dialplan using context=default. The dialplan will then process the call to 91999998888 with the Caller ID name and number specified in the fields CALLER_ID_NAME and CALLER_ID_NUMBER.

    fs_cli> originate sofia/internal/1002@ &echo()
    switch_ivr_originate.c:2159 Parsing global variables
    switch_channel.c:1104 New Channel sofia/internal/1002@ [5188806e-cabd-4acc-b20b-00620c3362ec]
    mod_sofia.c:5026 (sofia/internal/1002@ State Change CS_NEW -> CS_INIT
    switch_core_state_machine.c:584 (sofia/internal/1002@ Running State Change CS_INIT (Cur 5 Tot 122559)
    switch_core_state_machine.c:627 (sofia/internal/1002@ State INIT
    mod_sofia.c:93 sofia/internal/1002@ SOFIA INIT
    sofia_glue.c:1299 sofia/internal/1002@ sending invite version: 1.9.0 -654-ed4920e 64bit
    Local SDP:
    o=FreeSWITCH 1538689496 1538689497 IN IP4
    c=IN IP4
    t=0 0
    m=audio 24636 RTP/AVP 9 0 8 101
    a=rtpmap:9 G722/8000
    a=rtpmap:0 PCMU/8000
    a=rtpmap:8 PCMA/8000
    a=rtpmap:101 telephone-event/8000
    a=fmtp:101 0-16
    m=video 19042 RTP/AVP 102
    a=rtpmap:102 VP8/90000
    a=rtcp-fb:102 ccm fir
    a=rtcp-fb:102 ccm tmmbr
    a=rtcp-fb:102 nack
    a=rtcp-fb:102 nack pli
    switch_core_state_machine.c:40 sofia/internal/1002@ Standard INIT
    switch_core_state_machine.c:48 (sofia/internal/1002@ State Change CS_INIT -> CS_ROUTING
    switch_core_state_machine.c:627 (sofia/internal/1002@ State INIT going to sleep
    switch_core_state_machine.c:584 (sofia/internal/1002@ Running State Change CS_ROUTING (Cur 4 Tot 122612)
    sofia.c:7291 Channel sofia/internal/1002@ entering state [calling][0]
    sofia.c:7291 Channel sofia/internal/1002@ entering state [terminated][503]

    Ref :

    My freeswitch contributor profile

    Freeswitch Wiki

    Freeswitch bitbucket Codebase