Unified Plan SDP format in WebRTC

Until recently, a customised or proprietary extension could signal multiple media streams within a single m= section of an SDP and experiment with the media-level "msid" (Media Stream Identifier) attribute, which associates RTP streams described in different media descriptions with the same MediaStream. With the transition to Unified Plan, however, such applications will experience breaking changes.

The previous SDP format implementation, called "Plan B", was transitioned to "Unified Plan" in 2019.

Who does it affect?

  • Applications that use multiple media tracks within a single m= line of the SDP, for example a video stream and screen sharing simultaneously
  • Applications that munge SDP, or that use MCUs or SFUs
  • Applications that use the track-based APIs addTrack, removeTrack, and sender.replaceTrack, or the legacy addStream / removeStream APIs, and that rely on the exposed senders and receivers to edit tracks and their encoding parameters

Who does it not affect?

  • This does not affect applications that have only a single audio and a single video track.

After the transition, a typical RTCConfiguration carries sdpSemantics set to "unified-plan":

    {
      iceServers: [],
      iceTransportPolicy: "all",
      bundlePolicy: "balanced",
      rtcpMuxPolicy: "require",
      iceCandidatePoolSize: 0,
      sdpSemantics: "unified-plan"
    }
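A minimal sketch, assuming a browser context, of how this configuration is consumed. The sdpSemantics flag was honoured during the Chrome transition period; browsers that are Unified Plan only simply ignore unknown configuration members:

    // The configuration above is a plain object handed to the constructor.
    const pc = new RTCPeerConnection({
      iceServers: [],
      bundlePolicy: "balanced",
      rtcpMuxPolicy: "require",
      sdpSemantics: "unified-plan"
    });
    // Transceivers are the Unified Plan building block: one per m= section.
    pc.addTransceiver("video");
    pc.createOffer().then((offer) => console.log(offer.sdp));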

Plan B vs Unified Plan

Multiple media streams may be required, for example when a video stream and a screen-share stream are carried in the same SDP, or in specific SFU scenarios.

In Plan B, one "m=" section of the SDP is used per media type (one for audio, one for video), and within the video "m=" section multiple "a=ssrc" lines are listed, one per media track.

In Unified Plan, every media track is assigned its own "m=" section. Hence, for simultaneous video and screen sharing, two video "m=" sections are created, as the sketch below illustrates.
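A minimal sketch, assuming a browser context: adding a camera track and a screen track to a Unified Plan connection yields one video "m=" section per track, whereas Plan B would have folded both into a single video section with separate "a=ssrc" lines:

    // Unified Plan: each addTrack() creates its own transceiver / m= section.
    const pc = new RTCPeerConnection();

    async function shareCameraAndScreen() {
      const camera = await navigator.mediaDevices.getUserMedia({ video: true });
      const screen = await navigator.mediaDevices.getDisplayMedia({ video: true });
      pc.addTrack(camera.getVideoTracks()[0], camera);
      pc.addTrack(screen.getVideoTracks()[0], screen);

      const offer = await pc.createOffer();
      // offer.sdp now contains two "m=video" sections, one for the camera
      // track and one for the screen track; under Plan B both would share
      // a single "m=video" section, distinguished only by a=ssrc lines.
      await pc.setLocalDescription(offer);
    }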

Interoperability between Unified Plan and Plan B

An SDP mismatch between Plan B and Unified Plan usually plays out in one of two ways:

  1. If a Unified Plan client receives an offer generated by a Plan B client, the Unified Plan client must reject the offer, which surfaces as a failed setRemoteDescription() call (see the sketch below).
  2. If a Plan B client receives an offer generated by a Unified Plan client, only the first track in every "m=" section is used and the remaining tracks are ignored.
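A minimal sketch of handling case 1 on a Unified Plan endpoint: if the remote offer came from a Plan B client, setRemoteDescription rejects, and the application should fail the call rather than silently mis-map tracks:

    // Apply a remote offer; an incompatible Plan B offer makes this reject.
    async function applyRemoteOffer(pc, sdp) {
      try {
        await pc.setRemoteDescription({ type: "offer", sdp });
      } catch (err) {
        console.error("Rejected remote offer (possible Plan B / Unified Plan mismatch):", err);
        throw err;
      }
    }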


Realtime Messaging Services Design


Functional Requirements

  1. One-to-one / group chat
  2. Support for multimedia – text / images / video / location
  3. Read receipts / message status
  4. Last seen
  5. Push notifications

Non-Functional Requirements

  1. Minimal latency / no perceptible lag
  2. HA (high availability) + fault tolerance
  3. Scalability (2 billion users, 1.6 billion monthly active users)
  4. Traffic of 64 billion messages / day
  5. Administrative requirements – GDPR and so on

Design Expectations

  1. Partition tolerance to handle a large amount of data using clusters.
  2. To create trust, reliability and consistency are critical as miscommunication will drain user confidence in the application.
  3. Resilience to recover from failures.
  4. Security and privacy: end-to-end encryption of messages, with SSL/TLS on the wire.
  5. Analytics and monitoring

The user application system could comprise a user profile service, a messaging service, and alerts / notifications.

A transient data store handles unsent messages before expiry: messages are stored temporarily and deleted once delivered to the user.

The frontend technology for the various mobile and desktop agents could be:

  • Android: Java / React Native
  • iOS: Swift / React Native
  • Web client: JavaScript/HTML/CSS/ with web frameworks such as Angular or React JS
  • Mac Desktop app: Swift/Objective-C
  • PC Desktop app: Electron, C/Java

Message Format

from: alice x.x.x.x
to: bob y.y.y.y
metadata:
    timestamp: 12 dec 2017 3:09:13:6678
    type: text
msgPayload: "Hi, how are you?"

“Message Read” Format

from: bob y.y.y.y
to: alice x.x.x.x
metadata:
    timestamp: 12 dec 2017 3:09:13:9070
    type: seen
msgPayload:

Primary Keys

  • User
    • UserId
    • Username
    • UserprofilePic
  • Groups
    • GId
    • UserId1, UserId2…
  • Messages
    • ToUserId
    • FromUserId
    • Ts
    • MediaUrl
  • Sessions
    • UserId
    • MsgServerId
  • LastSeen
    • UserId
    • Ts

API

Msg API

  • SendMessage(senderUId, receiverUID, msg)
  • GetMessages(UserId, count, timerange)
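A minimal sketch of this API surface in JavaScript, assuming a WebSocket transport for sends (see the protocol section below), the message format shown earlier, and a hypothetical chat.example.com endpoint for history reads:

    // Send one message over an established WebSocket connection.
    function sendMessage(socket, senderUId, receiverUID, msg) {
      socket.send(JSON.stringify({
        from: senderUId,
        to: receiverUID,
        metadata: { timestamp: Date.now(), type: "text" },
        msgPayload: msg
      }));
    }

    // Fetch message history over HTTPS, paginated by count and timerange.
    async function getMessages(userId, count, timerange) {
      const res = await fetch(
        `https://chat.example.com/messages?user=${userId}&count=${count}` +
        `&from=${timerange.from}&to=${timerange.to}`
      );
      return res.json();
    }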

Account API

  • registerUser(APIKey, phoneNumber, UserId)
  • loginUser(APIKey, phoneNumber, UserId, OTP)
  • validateAcc

Group API

  • createGrp(APIKey, groupInfo) returns GrpId
  • addUserToGrp(APIKey, UserId, GrpId)
  • removeUserFromGrp(UserId, GrpId)
  • createAdmin(APIKey, UserId, GrpId)

Other APIs should provide

  • authentication
  • monitoring
  • Load balancing
  • caching
  • request shaping
  • static responses

Messaging protocol

  • HTTP: slow, as the server closes the connection after each request; unsuitable for realtime messaging.
  • HTTP long poll / short poll: the client repeatedly polls the server requesting new information; unsuitable for realtime messaging.
  • WebSocket: a persistent connection that allows peer-to-peer-style exchange and server push; suitable for realtime messaging.
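A minimal sketch of the WebSocket choice, assuming a hypothetical wss endpoint: one persistent connection carries both client sends and server pushes, with no per-request reconnect:

    // chat.example.com is an assumed endpoint, not part of the design above.
    const socket = new WebSocket("wss://chat.example.com/ws");

    socket.addEventListener("open", () => {
      // hypothetical auth handshake once the persistent connection is up
      socket.send(JSON.stringify({ type: "auth", token: "<session-token>" }));
    });

    socket.addEventListener("message", (event) => {
      // server push: new messages arrive without the client polling
      const msg = JSON.parse(event.data);
      console.log("incoming:", msg);
    });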

High-Level Architecture

The overall high-level architecture consists of an interfacing WebSocket handler layer, isolated from the core messaging service and the message handler.

Session Management

Dedicated / private chat sessions: SessionId = <UserId1 + UserId2>

Group / shared chat sessions: SessionId = <prefix + randomId>

The SessionMessages schema can use the composite primary key <SessionId + timestamp>, as the sketch below illustrates.
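A minimal sketch of these conventions in JavaScript; sorting the two user ids makes the private session id identical regardless of who initiates, and crypto.randomUUID() (modern browsers / Node) supplies the random component:

    // Dedicated / private chat: deterministic id from the two participants.
    function privateSessionId(userId1, userId2) {
      return [userId1, userId2].sort().join("_");
    }

    // Group / shared chat: prefix plus a random component.
    function groupSessionId(prefix) {
      return `${prefix}_${crypto.randomUUID()}`;
    }

    // SessionMessages primary key: <SessionId + timestamp>.
    function messageKey(sessionId, timestampMs) {
      return `${sessionId}#${timestampMs}`;
    }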

Fan Out Message / Send to All

Routing Service -> Messaging Group -> Push Notification
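A minimal sketch of this fan-out path; the lookup and delivery helpers are assumptions standing in for the routing service, the Sessions table (UserId -> MsgServerId), and the push service covered in the next section:

    // Fan a message out to every member of a group.
    async function fanOut(groupId, message) {
      const members = await getGroupMembers(groupId);    // hypothetical: Groups table lookup
      for (const userId of members) {
        const session = await getActiveSession(userId);  // hypothetical: Sessions table lookup
        if (session) {
          // user online: route to the message server holding the connection
          await deliverViaMessageServer(session.msgServerId, userId, message);
        } else {
          // user offline: fall back to a push notification
          await sendPushNotification(userId, message);
        }
      }
    }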

Push Notification

  • APNS (Apple Push Notification Service), used for iPhone
  • GCM (Google Cloud Messaging) / FCM (Firebase Cloud Messaging)
  • WNS (Windows Notification Service)

The mobile agent talks to its push notification service with its device ID to obtain a push notification token.

The push notification token is then used by the messaging platform to send a push notification to the recipient.

Handling Load

External load balancers for the WebSocket handlers and user agents.

High load is shared by multiple message servers and API gateways behind internal load balancers in the DMZ (demilitarized zone).

Distributed datastore: API gateways distribute requests across servers using consistent hashing.

Sharded by GroupId as the primary index and UserId as the secondary index, as the sketch below illustrates.
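A minimal sketch of consistent hashing in Node.js; a production ring would add virtual nodes per server to smooth the distribution, and the server names are illustrative:

    const crypto = require("crypto");

    // Map a key to a point on a 32-bit hash ring.
    function ringPoint(key) {
      return parseInt(
        crypto.createHash("md5").update(key).digest("hex").slice(0, 8),
        16
      );
    }

    class ConsistentHashRing {
      constructor(servers) {
        this.ring = servers
          .map((server) => ({ point: ringPoint(server), server }))
          .sort((a, b) => a.point - b.point);
      }
      // The first server clockwise from the key's point owns the key.
      serverFor(key) {
        const p = ringPoint(key);
        const node = this.ring.find((n) => n.point >= p) || this.ring[0];
        return node.server;
      }
    }

    // Shard by GroupId (primary index).
    const ring = new ConsistentHashRing(["msg-server-1", "msg-server-2", "msg-server-3"]);
    console.log(ring.serverFor("group:42"));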

Distributed cache: write-through mechanism on Redis clusters, sketched below.
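A minimal sketch of the write-through pattern; redis and db stand in for a Redis client and the datastore client and are assumptions, not concrete libraries:

    // Write-through: every write lands in the cache and the datastore
    // together, so a cache hit can always be trusted.
    async function writeThrough(redis, db, key, value) {
      await db.put(key, value);                       // durable write first
      await redis.set(key, JSON.stringify(value));    // then refresh the cache
    }

    async function readThrough(redis, db, key) {
      const cached = await redis.get(key);
      if (cached !== null) return JSON.parse(cached); // cache hit
      const value = await db.get(key);                // miss: fall back to the store
      if (value !== undefined) await redis.set(key, JSON.stringify(value));
      return value;
    }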

Stream and log analysis: Kafka + Hadoop.

Scalability

Assume 1 billion users active per month and, at peak, 40 million messages in flight per second.

servers required = (messages per second × latency) / per-server limit on concurrent messages

servers required = 40,000,000 × 0.020 s / 100,000 = 8 servers
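The same estimate as a worked calculation (Little's law: in-flight work = arrival rate × latency):

    // Capacity estimate from the figures above.
    const messagesPerSecond = 40_000_000;   // peak arrival rate
    const latencySeconds = 0.020;           // 20 ms to process a message
    const perServerConcurrent = 100_000;    // concurrent messages one server can hold

    const inFlight = messagesPerSecond * latencySeconds;               // 800,000
    const serversRequired = Math.ceil(inFlight / perServerConcurrent); // 8
    console.log(serversRequired);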

Bottlenecks

  1. If the recipient of a message is offline / unavailable, delivery would otherwise be retried indefinitely
    • Solved using a transient message store that holds undelivered messages until the user can receive them or the message expires (see the sketch after this list)
    • The transient DB can be a FIFO
  2. Server failure
    • Replication of the messaging server for ongoing sessions, with 2f+1 replicas so that a quorum survives f failures
    • The client is automatically handed over to a new server when the existing server crashes
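A minimal sketch of the transient FIFO store from bottleneck 1; persistence and clustering are simplified away:

    // Hold undelivered messages per user, in arrival order, until delivery or expiry.
    class TransientStore {
      constructor(ttlMs) {
        this.ttlMs = ttlMs;
        this.queues = new Map(); // userId -> FIFO queue of pending entries
      }
      hold(userId, message) {
        if (!this.queues.has(userId)) this.queues.set(userId, []);
        this.queues.get(userId).push({ message, expiresAt: Date.now() + this.ttlMs });
      }
      // Called when the user reconnects: return unexpired messages in FIFO order.
      drain(userId) {
        const queue = this.queues.get(userId) || [];
        this.queues.delete(userId);
        return queue.filter((e) => e.expiresAt > Date.now()).map((e) => e.message);
      }
    }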

Optimizations

  1. Replication of transient storage, and a CDN for media (images / videos)
  2. To fetch new messages, use the last message as a pointer: record the message that was last read and fetch all messages with a greater sequence number (see the sketch after this list)
  3. Random authentication and challenge
  4. Proactive server restore and key refresh to prevent Byzantine attacks
  5. Integration with an SMS gateway
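A minimal sketch of optimization 2, assuming messages carry a monotonically increasing sequence number within a session; db.query is a placeholder for the datastore client, not a concrete library:

    // Fetch only messages newer than the last-read pointer.
    async function fetchNewMessages(db, sessionId, lastReadSeq) {
      return db.query(
        "SELECT * FROM SessionMessages WHERE sessionId = ? AND seq > ? ORDER BY seq ASC",
        [sessionId, lastReadSeq]
      ); // assumed query interface
    }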

Rich Communication Suite

I have discussed further add-on features for IM in terms of RCS (Rich Communication Suite) separately.