Encapsulating Protocols

  1. Secure Shell (SSH)
  2. Encapsulating protocols at Network Layer
    1. IP in IP
    2. Multiprotocol Label Switching (MPLS)
    3. ESP (Encapsulating Security Payload)
  3. Encapsulating protocols at Transport Layer
    1. TLS and Datagram Transport Layer Security (DTLS)
    2. Generic Routing Encapsulation (GRE)
    3. QUIC
    4. TCP encapsulation
  4. Subtle points of using encapsulation
    1. Path MTU and fragmentation
    2. Migration of inner payload from one to another protocol
    3. Prioritization and congestion post encapsulation
      1. Misordering
      2. Loss of ECN signal
    4. Classification and tagging
    5. State Synchronization in Multipath
      1. Applying policies
      2. Anti-replay window synchronization
  5. Protocol Design for an encapsulating protocol
    1. Traffic Flow Confidentiality (TFC)
    2. Segmentation of information into header and trailer
    3. Dynamically adjustable anti-replay window sizes
    4. Security constraints

Encapsulation is the process of encasing the payload sent by an endpoint into another protocol’s payload, attaching that protocol’s own header and trailer. This is applied to all data processed by the layers of the network stack. For example, an HTTP message (L7, application layer) is encapsulated under a TCP header (L4), then in an IP header (L3), and so on down to the physical layer.

Network stack encapsulation across layers

This article doesn’t deal with encapsulation across the general network stack but rather focuses on encapsulating protocols that try to mask the identity of the original payload from network middleboxes to provide anonymity or virtualized networking.

Core parts of a secure communication framework relying on encapsulation are

  • Encapsulation and decapsulation (encap and decap) libraries on the sender and receiver, as well as a means to exchange the metadata that enables encap and decap.
  • The payload, which is an original packet, optionally including upper-layer protocols, and which can be encrypted or plain.
  • Algorithms and tools to reorder packets that arrive out of order and to discard duplicates.
  • Mechanisms that provide confidentiality and detect malicious activity, such as active MITM attacks, passive eavesdropping, and replay.

Sequence numbers are monotonically increasing numbers that show the ordering of the packets in a stream. They do not necessarily start from 0. Most sequence numbers wrap around, which is especially useful for long-lived connections: after reaching the top of the valid range, the sequence number goes back to the beginning. Well known from TCP, sequence numbers have become common in most reliable communication protocols. Although primarily meant for putting incoming packets back in the correct order, these numbers can be used for other purposes as well.

  • Sequence numbers in headers such as ESP help maintain the anti-replay window, which prevents an attacker from replaying previously captured data. A packet whose sequence number falls behind the window, or repeats one already seen inside it, is either a late duplicate or a replay and can be dropped or scrutinized further.
  • ESP can also use 64-bit extended sequence numbers (ESN). Only the low-order 32 bits are transmitted, while the high-order bits are covered by the integrity check, which makes this a security enhancement.
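The anti-replay window described above can be sketched as a sliding bitmap. This is a minimal illustration, not the procedure of any particular specification; the class name `AntiReplayWindow`, the default window size of 64, and the convention that sequence numbers start at 1 are assumptions made for the example.

```python
class AntiReplayWindow:
    """Sliding-window replay check (illustrative sketch, names are hypothetical)."""

    def __init__(self, size: int = 64) -> None:
        self.size = size      # how many sequence numbers behind the highest are tracked
        self.highest = 0      # highest sequence number accepted so far (0 = none yet)
        self.bitmap = 0       # bit i set means (highest - i) was already seen

    def check_and_update(self, seq: int) -> bool:
        """Return True and record seq if it is new; False for replays or stale packets."""
        if seq <= 0:
            return False
        if seq > self.highest:
            shift = seq - self.highest
            if shift >= self.size:
                # Jumped past the whole window: only the new packet is remembered.
                self.bitmap = 1
            else:
                # Slide the window forward and mark the new highest as seen.
                self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False      # too old: to the left of the window
        if self.bitmap & (1 << offset):
            return False      # duplicate inside the window
        self.bitmap |= 1 << offset
        return True
```

A packet that arrives late but still inside the window is accepted once; the same sequence number arriving again is rejected.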

Security Association (SA) between the endpoints of an encapsulated path: a Security Association is a critical aspect of securing a communication path. It records the crypto algorithm, integrity algorithm, keys, etc., and is used especially in the case of IPsec.

Sharing Keys : Manual keying, keying based on static configuration, and, most recommended, key exchange protocols such as IKE (Internet Key Exchange) can be used to share secret keys.

Unique identification : The onus of uniquely identifying the multiple paths/streams, and the order of packets within each path/stream, lies with the creator of the header that is attached to the encapsulated payload. Such a unique identifier or set of attributes should be able to distinguish multiple coexisting tunnels. Some examples of unique identifiers in multiplexed, non-tunneling use cases are

  • HTTP/2 uses stream ID for multiplexed flows within a single connection
  • SIP uses Session ID

Similarly, examples of unique identifiers in tunneling use cases are

  • GRE uses the key field to distinguish between individual flows.
  • MPLS uses labels to identify different streams associated with different classes.
  • ESP uses the SPI, a 32-bit identifier that binds a packet to its security association (and thus to the SA’s cryptographic keys). It is used to demultiplex inbound traffic at the receiver’s end.
  • L2TP (Layer 2 Tunneling Protocol) uses a tunnel ID to identify coexisting tunnels.
  • QUIC uses a connection ID.

Others can rely on sequence numbers as counters or even destination address mappings to identify the path/stream. However, these approaches have many limitations. Most protocols would try to attach a new custom field.

More detail on some encapsulating protocols

Secure Shell (SSH)

While SSH is not generally thought of as a tunneling protocol, it does create a communication link over which remote access and file transfer happen securely. It therefore forms the crux of what is considered VPN functionality.

Encapsulating protocols at Network Layer

IP in IP

IP-in-IP variants, namely IPv4 in IPv4, IPv4 in IPv6, IPv6 in IPv4, and IPv6 in IPv6, are most commonly used for network virtualization (VPN) and other kinds of network-as-a-service offerings such as Secure Access Service Edge (SASE).

IPv6 over IPv4

Multiprotocol Label Switching (MPLS)

MPLS can transport IP packets (IPv4 and IPv6) over an IPv4 backbone. Originally designed for forwarding and routing, MPLS uses packet labels instead of traditional IP-based lookups to make next-hop forwarding decisions. This enables the creation of paths based on QoS. In contrast to ESP, which is applied at layer 3, MPLS operates at layer 2.5 (between layers 2 and 3).

Original packet and modified packet with MPLS header format
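As a rough sketch of what the MPLS header carries, the snippet below packs and unpacks a single 32-bit label stack entry: a 20-bit label, a 3-bit traffic class, a 1-bit bottom-of-stack flag, and an 8-bit TTL. The function names and the example label value are assumptions made for illustration.

```python
import struct
from typing import Dict, Union


def pack_mpls_entry(label: int, tc: int, bottom_of_stack: bool, ttl: int) -> bytes:
    """Pack one MPLS label stack entry into 4 bytes (network byte order)."""
    if not (0 <= label < (1 << 20) and 0 <= tc < 8 and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (tc << 9) | (int(bottom_of_stack) << 8) | ttl
    return struct.pack("!I", word)


def unpack_mpls_entry(data: bytes) -> Dict[str, Union[int, bool]]:
    """Unpack the first 4 bytes of data as an MPLS label stack entry."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "tc": (word >> 9) & 0x7,
        "bottom_of_stack": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }
```

A label switching router only needs these four bytes to pick the next hop, which is why the lookup is cheaper than a full IP route lookup.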

ESP (Encapsulating Security Payload)

ESP, part of the IPsec suite, provides confidentiality, integrity, and authenticity for the IP packets it encapsulates. An ESP packet contains

  • SPI (Security Parameters Index), which links the packet to an SA (security association) at the receiving endpoint
  • Sequence number, which is a counter used to prevent replay attacks
  • Payload data, which can be encrypted or plain
  • followed, in the trailer, by the Next Header field, which specifies the type of the data carried in the payload.

ESP can operate in transport mode (protecting the payload of an IP packet) or tunnel mode (protecting the entire IP packet).
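To make the role of the SPI and sequence number concrete, here is a small sketch that reads the first 8 bytes of an ESP packet and looks up the matching SA. The SA table is modeled as a plain dictionary keyed by SPI, and the function names and SA values are hypothetical; a real implementation would go on to run the anti-replay check, verify integrity, and decrypt.

```python
import struct
from typing import Dict, Optional, Tuple

# SPI (32 bits) followed by the sequence number (32 bits), network byte order.
ESP_HEADER = struct.Struct("!II")


def parse_esp_header(packet: bytes) -> Tuple[int, int, bytes]:
    """Return (spi, sequence_number, remaining_bytes) for an ESP packet."""
    if len(packet) < ESP_HEADER.size:
        raise ValueError("packet shorter than the 8-byte ESP header")
    spi, seq = ESP_HEADER.unpack_from(packet)
    return spi, seq, packet[ESP_HEADER.size:]


def lookup_sa(packet: bytes, sa_table: Dict[int, str]) -> Optional[Tuple[str, int]]:
    """Demultiplex an inbound packet to its SA by SPI; None if the SPI is unknown."""
    spi, seq, _ = parse_esp_header(packet)
    sa = sa_table.get(spi)
    if sa is None:
        return None
    return sa, seq
```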

Generic IP packet in transport mode and tunnel mode, with Authentication Header (AH) and with ESP

Without considering encryption or authentication overhead, the basic ESP header is 8 bytes in size. The ESP overhead is relatively smaller in transport mode than in tunnel mode, because tunnel mode also adds a new outer IP header.

IPv4 and IPv6 in transport and tunnel modes for ESP.

Encapsulating protocols at Transport Layer

TLS and Datagram Transport Layer Security (DTLS)

While TLS is designed for TCP, DTLS securely encapsulates datagrams over UDP. In contrast to ESP, which encapsulates traffic for VPN use cases, DTLS is mostly used to encapsulate real-time data traffic, such as in WebRTC and gaming.

Generic Routing Encapsulation (GRE)

GRE is another tunneling protocol. It is grouped here with the transport-layer protocols, although it is carried directly over IP (IP protocol number 47) rather than over TCP or UDP. It is agnostic to the layer 3 payload: it can tunnel any layer 3 protocol, from IPv6 and IPv4 to other raw formats. The “Protocol Type” field in the GRE header specifies the protocol type of the encapsulated packet. GRE has a minimal header structure with no out-of-the-box security such as encryption.

GRE tunnel forming a site to site VPN
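The minimal GRE header can be sketched as follows: 16 bits of flags and version, a 16-bit Protocol Type carrying the EtherType of the inner packet, and an optional 32-bit key when the key-present flag is set. The helper names are assumptions for illustration, and the parser ignores routing and other legacy options.

```python
import struct
from typing import Optional, Tuple

GRE_CHECKSUM_PRESENT = 0x8000
GRE_KEY_PRESENT = 0x2000
GRE_SEQUENCE_PRESENT = 0x1000
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD


def build_gre_header(protocol_type: int, key: Optional[int] = None) -> bytes:
    """Build a base GRE header, optionally with a 32-bit key identifying the flow."""
    flags = GRE_KEY_PRESENT if key is not None else 0
    header = struct.pack("!HH", flags, protocol_type)
    if key is not None:
        header += struct.pack("!I", key)
    return header


def parse_gre_header(data: bytes) -> Tuple[int, Optional[int], bytes]:
    """Return (protocol_type, key_or_None, inner_packet)."""
    flags, protocol_type = struct.unpack_from("!HH", data)
    offset = 4
    if flags & GRE_CHECKSUM_PRESENT:
        offset += 4                      # checksum plus reserved field
    key = None
    if flags & GRE_KEY_PRESENT:
        (key,) = struct.unpack_from("!I", data, offset)
        offset += 4
    if flags & GRE_SEQUENCE_PRESENT:
        offset += 4                      # sequence number, skipped in this sketch
    return protocol_type, key, data[offset:]
```

Nothing in this header is encrypted or authenticated, which is why GRE is commonly paired with a separate security layer when confidentiality matters.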

QUIC

QUIC encapsulates higher-layer protocols, such as HTTP/3, within its own transport layer over UDP. Besides efficient multiplexing and encryption, QUIC also excels at connection migration and quick handshakes.

In contrast to ESP, which is part of the IPsec suite of protocols aimed at lower, underlay-layer tunneling, QUIC is aimed at encapsulating application data such as web traffic. It leverages UDP itself, appending its own header with control information.

Note that MASQUE is another encapsulation protocol built over QUIC. As these are still nascent and evolving, I will update this section as more specifications are standardized.

TCP encapsulation

Many network middleboxes that filter traffic on public hotspots block all UDP traffic. As a result, UDP traffic, such as media streams for VoIP calls or even IKEv2 UDP packets, gets blocked. But middleboxes are likely to allow TCP connections through because they appear to be web traffic. The trade-offs of TCP encapsulation are

  • (+) Provides NAT support
  • (+) Avoids UDP fragmentation
  • (-) Adds the overhead of TCP or TLS

When designing a TCP-based encapsulation, it is recommended that initiators use TCP encapsulation only when traffic over UDP is blocked. The encapsulated messages can be carried as a stream over a single TCP connection. This way, any firewall or NAT mappings allocated for the TCP connection apply to all of the traffic associated with the encapsulated packets, which avoids a large number of round trips.
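Because TCP is a byte stream, datagrams carried over it need explicit framing so the receiver can find message boundaries. The sketch below uses a generic 2-byte length prefix that counts only the message body; this is an assumption chosen for illustration, not the exact wire format of any particular standard, and the function names are hypothetical.

```python
import struct
from typing import List, Tuple


def frame(message: bytes) -> bytes:
    """Prefix a datagram with its 16-bit length so it can ride on a TCP stream."""
    if len(message) > 0xFFFF:
        raise ValueError("message too large for a 16-bit length prefix")
    return struct.pack("!H", len(message)) + message


def deframe(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Split a received byte buffer into complete messages plus leftover bytes."""
    messages = []
    offset = 0
    while len(buffer) - offset >= 2:
        (length,) = struct.unpack_from("!H", buffer, offset)
        if len(buffer) - offset - 2 < length:
            break                        # partial message: wait for more bytes
        start = offset + 2
        messages.append(buffer[start:start + length])
        offset = start + length
    return messages, buffer[offset:]
```

The leftover bytes returned by `deframe` are meant to be prepended to the next chunk read from the socket.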

Subtle points of using encapsulation

In addition to encapsulation overhead and reachability, the following are concerns that occur in encapsulating the data and traversing through a complex network.

Path MTU and fragmentation

Path MTU discovery relies on ICMP messages, which can be blanket blocked by firewalls. This prevents the proper MTU from being set for the encapsulated packet and for the overall packet. Subsequently, the size of the encapsulated packet may exceed the path MTU, leading to fragmentation, which in turn leads to

  • latency in transmission
  • packets that middleboxes or VPN hubs cannot decipher or route
  • fragmented packets that cannot be processed or reassembled properly at the receiver, which results in packet loss, retransmission, and further congestion.
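The arithmetic behind this problem is simple: every encapsulation layer subtracts its header size from the space left for the inner packet. The sketch below assumes minimal header sizes (no IP options, base GRE header, ESP header only, without IV, padding, or integrity value) and a 1500-byte path MTU in the usage example; real overhead depends on the negotiated algorithms and options.

```python
from typing import Iterable

# Minimal per-layer overhead in bytes; optional fields add more.
OVERHEAD_BYTES = {
    "ipv4": 20,        # IPv4 header without options
    "ipv6": 40,        # fixed IPv6 header
    "gre": 4,          # base GRE header without key, checksum, or sequence
    "udp": 8,          # UDP header
    "esp_header": 8,   # SPI + sequence number only
}


def inner_mtu(path_mtu: int, layers: Iterable[str]) -> int:
    """Largest inner packet that fits in path_mtu after adding the given layers."""
    return path_mtu - sum(OVERHEAD_BYTES[layer] for layer in layers)


def needs_fragmentation(inner_len: int, path_mtu: int, layers: Iterable[str]) -> bool:
    """True if the encapsulated packet would exceed the path MTU."""
    return inner_len > inner_mtu(path_mtu, list(layers))
```

For example, `inner_mtu(1500, ["ipv4", "gre"])` leaves 1476 bytes for the inner packet, so a full-size 1500-byte inner packet would have to be fragmented or dropped.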

Migration of inner payload from one to another protocol

For IPv4 encapsulating other IP versions in a dual-stack implementation, the destination can sometimes decide to upgrade, such as from IPv4 to IPv6. The upgrade involves a period of using IPv4 and IPv6 simultaneously, followed by a gradual transition towards IPv6. The key differences between IPv4 and IPv6 are

  • the header size, which is 20 bytes (without options) for IPv4 and 40 bytes for IPv6.
  • only IPv4 lets routers fragment packets in transit; in IPv6, fragmentation is left to the sending host.
  • IPv6 has a much larger address space, so NAT is not needed as much as with the limited addressing of IPv4.

With migration in effect on a dual stack, the overlay depends on the underlay protocol’s ability to carry its traffic. For example, in IPv6 over IPv4, since IPv4 underlays are well adopted in network infrastructure, original payload packets with an IPv6 header use IPv4 as the encapsulating header to be carried across the tunnel. BGP and OSPF are common routing protocols for an IPv4 underlay network.

Overlay and underlay combinations for IPv4 and IPv6

  • IPv4 over IPv6
  • IPv4 over IPv4 (GRE)
  • IPv6 over IPv4 (6in4)
  • IPv6 over IPv6

Prioritization and congestion post encapsulation

Many network middleboxes (routers, cloud firewalls, gateways, and so on) implement traffic filtering, shaping, or queue management based on prioritization (for example, AQM at ISPs). Because encapsulated packets are masked, there is a high chance that middleboxes cannot ascertain their identity and, as a result, deprioritize them.

Misordering

To keep the endpoints connected, the packets need to be put back in the correct order with manageable latency at the receiving endpoint. This is especially critical when encapsulating packets for low-latency applications. For example, with a video codec, encapsulated keyframe packets arriving late can force all subsequent packets to wait in the receiver buffer. This cascades into a downward spiral of selective acknowledgments, retransmissions, and discarding of arrived packets, which can further intensify the problem.

Loss of ECN signal

Once a packet is encapsulated, the Explicit Congestion Notification (ECN) bits of the inner header are hidden from the routers along the tunnel, so congestion marks cannot be applied to the original flow. Some systems overcome this by copying the ECN field from the inner header of the original packet to the outer header. However, this field has to stay mutable as the packet traverses the many nodes of a large network, so it is not protected by the integrity algorithms. Because of the risk of misuse, this behavior, once a feature, has been dropped from the updated specifications of many encapsulating protocols, such as IPsec.

Classification and tagging

Classification and tagging of traffic within a stream of encapsulated packets is also a challenge. The payload of the encapsulated packets could contain mixed content that cannot be tagged to any specific class, which can be used to prioritize real-time or critical traffic. For example, DSCP tag propagation from inner to outer packets can help in this direction but runs the risk of traffic profiling by middleboxes.
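Both DSCP propagation and ECN copying come down to bit manipulation on the same byte of the IP header: the upper 6 bits carry the DSCP and the lower 2 bits carry ECN. The sketch below shows how an encapsulator might derive the outer byte from the inner one; the function names and the `copy_dscp` / `copy_ecn` switches are assumptions made for illustration, and a deployment would choose them by policy.

```python
from typing import Tuple


def split_tos(tos: int) -> Tuple[int, int]:
    """Split the 8-bit traffic class byte into (dscp, ecn)."""
    return tos >> 2, tos & 0x3


def join_tos(dscp: int, ecn: int) -> int:
    """Combine a 6-bit DSCP and a 2-bit ECN value into one byte."""
    return ((dscp & 0x3F) << 2) | (ecn & 0x3)


def outer_tos(inner_tos: int, copy_dscp: bool = True, copy_ecn: bool = True) -> int:
    """Derive the outer header byte, copying DSCP and/or ECN from the inner header."""
    dscp, ecn = split_tos(inner_tos)
    return join_tos(dscp if copy_dscp else 0, ecn if copy_ecn else 0)
```

Setting `copy_dscp=False` hides the traffic class from middleboxes at the cost of losing prioritization along the tunnel, which is the trade-off described above.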

State Synchronization in Multipath

Scaling to involve multiple streams in a session using encapsulation poses challenges. This problem is further amplified in the case of multi-sender and multi-receiver scenarios.

Applying policies

For a stateful, contextual, and intelligent decision-making process, a sender needs to leverage the multiple available paths. It needs to discover alternate reachable paths and collect and sync network metrics to use the resources matching the needs of time sensitivity, cost, etc.

Anti-replay window synchronization

Difficulty in synchronizing anti-replay windows when multiple paths are involved impacts load sharing of encapsulated traffic. Additional issues may involve multicore or distributed operations.

A short-term solution to the synchronization issues is to use a very large anti-replay window. However, making the window too big increases the possibility of accepting packets that are far apart in sequence, which leads to further unpredictability in ordering. In high-throughput scenarios, it may even be difficult for the CPU to keep the state of an immensely large window in cache, causing undue latency penalties.

Protocol Design for an encapsulating protocol

Traffic Flow Confidentiality (TFC)

Primarily, an encapsulating protocol needs to hide the traffic from outside visibility. This can be done by rewriting the source and destination information as well as encrypting and/or padding the payload so that it is not intelligible to middleboxes. This is also mentioned in RFC 4303 for ESP in tunnel mode.

Dummy Data : Other means to secure confidentiality could be to use dummy packets or even dummy streams.

Segmentation of information into header and trailer

  • Avoid leaking information about the payload
  • Enable reassembly in case of fragmentation

Dynamically adjustable anti-replay window sizes

A smaller window is faster to process and more secure but unsuitable for multi-legged sessions, while a large window has a performance impact and can weaken replay protection. By implementing a dynamically sized sliding window, the protocol can keep up with the instantaneous requirements of the network, such as keeping the window large under higher packet loss but shrinking it when the traffic is latency-sensitive and out-of-sequence packets are meant to be discarded.

  • Multiple replay windows for multiple paths
  • Synchronize sequences with minimal communication between the threads

Security constraints

Enabling multiple child SAs to be linked to a session is one way to overcome both the multipath and anti-replay issues mentioned above. The multiple child SAs in a parent SA can be thought of as child tunnels inside a parent tunnel, since each SA-associated path is uniquely identified and maintained through SPI-based identification. This removes the overhead of synchronizing sequence numbers for ordering in the multipath or multicore case. Such concepts have been proposed in a few IETF drafts using different terminologies, such as sub-tunnel, cluster-tunnel, etc.


Multihoming protocols and mobility

  1. Low Layer Multihoming
  2. Layer 3 Network Layer multihoming
  3. Layer 4 Transport layer multihoming
  4. Higher Layer Multihoming techniques
  5. Best Path Selection among multiple paths
    1. FIFO or Round Robin
    2. Weights ( predetermined or dynamically allocated)
    3. Prioritization algorithms
  6. MultiPath protocol design
    1. Service discovery
    2. Unique Identifiers
      1. Path / Route and Network identifiers
      2. Header metadata
    3. Pre – Registration
    4. Handover / Failover from one path to another without disruption
    5. Security association for paths

A multihoming protocol maintains simultaneous connections to multiple networks. Such a protocol enhances reliability, load balancing, and fault tolerance and makes an excellent candidate for the signaling plane, which carries lightweight packets that manage the connection for the data plane. The data plane is the actual data transfer protocol, which can carry multimedia such as audio-video content, games, streaming, and so on. Multihoming can be implemented at several levels of the network stack.

Low Layer Multihoming

While lower layers of the network stack do not generally have the logic to maintain statefulness and decision-making, they can still leverage multiple paths efficiently to provide redundancy and increased BDP.

Link Aggregation Control Protocol (LACP), an IEEE standard and part of the IEEE 802.3ad specification, bundles individual physical links of Ethernet connections into one logical link to increase throughput using multiple NICs.

Layer 3 Network Layer multihoming

BGP can be considered an example of multihoming since it processes multiple paths via logical addressing and sets up the routes.

IP multihoming : Virtual Router Redundancy Protocol (VRRP) and Hot Standby Router Protocol (HSRP) enable routers to share a virtual IP and MAC address.

Layer 4 Transport layer multihoming

Some protocols, such as Multipath TCP, use these multiple paths simultaneously for a single communication session. This can be used to improve bandwidth utilization (sometimes unfairly) and build resilience. While simultaneous usage of multiple links or paths provides great resilience, it can lead to asymmetrical routing issues.

Asymmetrical Routing

SCTP (Stream Control Transmission Protocol) is another example of a transport layer protocol that provides connection-oriented reliable communication while having multiple IP addresses and interfaces. With multiple paths dynamically added, the SCTP session can switch from one path to another in the event of a failover or can load balance between paths.

Higher Layer Multihoming techniques

AnyCast is a networking technique that lets multiple endpoints share the same IP and dynamically select the best destination path.

SIP Forking is an application layer technique to have multiple VoIP endpoints receive an incoming call either in parallel or sequentially. The first SIP phone to answer the call establishes the connection.

Equal-cost multipath (ECMP) is a routing technique that allows routers to distribute traffic across multiple equal-cost paths. This is often associated with OSPF (Open Shortest Path First) like implementations.

Imbalanced load sharing between the paths

Best Path Selection among multiple paths

Selection of the best path and routing decision in a multihomed scenario can be made by :

FIFO or Round Robin

In FIFO, the first path to respond is selected first to establish a connection, while Round Robin cycles through the paths in turn as packets arrive, which prevents starvation of any path.

Weights ( predetermined or dynamically allocated)

BGP can help provide weights and preferences to paths while exchanging routing and reachability information. For example, ASes (autonomous systems) that connect to multiple ISPs use BGP to advertise their IP prefixes through each link.
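Independent of how the weights are obtained, a sender can turn them into a deterministic schedule. The sketch below uses smooth weighted round-robin, a generic scheduling technique and not part of BGP itself; the path names and weights are made up for the example.

```python
from typing import Dict, List


def smooth_weighted_round_robin(weights: Dict[str, int], picks: int) -> List[str]:
    """Return a selection order in which each path appears in proportion to its weight."""
    current = {path: 0 for path in weights}
    total = sum(weights.values())
    order = []
    for _ in range(picks):
        # Every path earns credit equal to its weight on each round.
        for path, weight in weights.items():
            current[path] += weight
        # The path with the most credit is chosen and pays back the total.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        order.append(chosen)
    return order
```

With weights of 3 and 1, the heavier path is picked three times out of every four, with the lighter path interleaved rather than served only at the end.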

Prioritization algorithms

In an anycast scenario, where the same address is assigned to multiple destinations, the traffic can be assigned to the geographically nearest, shortest-RTT, or best-performing path. This is used, for example, to manage ingress traffic to a service across multiple data centers. A more sophisticated algorithm can also add compound metrics such as a derivative of loss, jitter, etc.

Cost and load balancing are often top considerations for path selection. More complex decisions can be based on instantaneous or even forecasted QoS. Resource utilization or carbon footprint is also a candidate for path selection. A fairness-based approach can also be built in to avoid starvation.
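A compound metric can be as simple as a weighted sum in which a lower score is better. The sketch below is one possible shape for such a selector; the `PathMetrics` fields, the weight values, and the path names are all assumptions chosen for illustration and would need tuning for any real traffic.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PathMetrics:
    name: str
    rtt_ms: float
    loss_pct: float
    jitter_ms: float


def score(path: PathMetrics, w_rtt: float = 1.0, w_loss: float = 50.0,
          w_jitter: float = 2.0) -> float:
    """Lower is better: a weighted sum of RTT, loss, and jitter."""
    return w_rtt * path.rtt_ms + w_loss * path.loss_pct + w_jitter * path.jitter_ms


def best_path(paths: Iterable[PathMetrics]) -> PathMetrics:
    """Pick the path with the lowest compound score."""
    return min(paths, key=score)
```

With these weights, a path with a higher RTT but almost no loss can beat a nearby path that drops 2% of its packets.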

Mobility management

Mobility management has long been used to provide network continuity. From telecom devices’ handover between base stations (home network and visitor network) to Mobile IP (via home agent and foreign agent), mobility is crucial to meet the needs of mobile devices for seamless connectivity.

In cases of session-based protocols, when the source IP changes after an established connection, the mobility of the protocol should kick in to discover and migrate to another reachable address for the endpoint. This is especially crucial in the case of wireless networks and mobile devices. Examples of such cases may be

  • when NAT mappings change and a host receives a new IP address.
  • A device with multiple interfaces or uplinks decides to tear down one of the interfaces and migrate to another one.
  • A user on a call travels through multiple networks, so the call is handed over from mobile data to Wi-Fi, etc.
  • Switching between IPv4 and IPv6 addresses

A mobile node can change its IP address each time it moves to a new network or uses a new uplink. However, a mobile node would not be able to maintain transport and higher-layer connections when it changes location. One way to overcome the loss of connection is to assign the node an independent “home address” that doesn’t change when the node’s actual IP address changes, as proposed in RFC 3775 for the Mobile IPv6 protocol.

Mobility without relying on home network requires shared state

A mobility management protocol should be able to seamlessly update the endpoint IP address of an existing session without re-running the full security protocol or handshake, or by using only a minimal subset of it.

MultiPath protocol design

To design a multipath protocol with multihoming and seamless session migration, any available network metrics and topologies should be identified. These help build mechanisms for discovery and selection. Multipath management is stateful, and the states are utilized to identify degradation, plan migration, and recovery. Multihoming protocol design can be divided into the following parts

Service discovery

The network endpoint needs to detect the presence of multiple paths to a common destination, such as a server. This can be achieved through link state information at the router, preferred policies and static routes, feedback packets sharing path alternatives, Hello messages, ICMP, or other monitoring techniques. Address Resolution Protocol (ARP), IPv6 Neighbor Discovery, and Neighbor Unreachability Detection are also instances of path discovery.

A successful discovery is followed by routability checks, which involve checking if a BGP session is established. If it is not, then check the advertised IP prefix by each link. At this stage, reviewing the routing table at the hub( or router) to ensure the entries corresponding to each IP prefix point to a valid next-hop address is also an option. Tools like ping and traceroute can be used to confirm that the packets can reach the multihomed network through each link.
This prevalidation leads to quick convergence from old to new path in case the network conditions change, and this avoids blackholing traffic and downtime.

Unique Identifiers

DHCP, a client-server protocol, assigns addresses dynamically to devices as they join a network. Even with rapid network changes, DHCP automates the address assignment from a pool and even relays across subnets. Since an IPv4 address (32 bits) is not necessarily unique across networks, a separate UUID (128 bits) needs to be generated per endpoint. In the case of IPv6-enabled endpoints, the IPv6 address itself can serve as a unique identifier.

Path / Route and Network identifiers

To identify migration from one network to another, it is important to address each network uniquely. While the details of most ISP infrastructure itself cannot be determined by analyzing a packet that traversed it, other means of analysis can help narrow the choices, such as IP address ranges and DNS resolutions. TTL ( Time to Live) and RTT (Round Trip Time) can also help make inferences on physical distance. Fragmentation and filtering can help narrow its behavior and policies. For all different kinds of networks identified between a common source and destination, there should be a unique identifier to address the paths. Network identifiers can be in the form of

  • Link-layer identifiers for an interface, such as IEEE 802 addresses on Ethernet links
  • Subnet prefixes, CIDR
  • Gateway identification
  • Home address/care-of address in Mobile IPv6
  • Index serialization
  • UUID generated from the network’s characteristics, and so on

Append the router’s address to an array of path identifiers. Such a prefix helps to identify the path. Examples:

  • SIP Via header : Via: SIP/2.0/UDP client.atlanta.com:5060;branch=z9hG45684bf9
  • SIP Route header : <sip:proxy1.atlanta.com;lr>, <sip:proxy2.atlanta.com;lr>

Some techniques also rely on cookies to identify the route.

Header metadata

In addition to the network identifiers themselves, more metadata needs to be part of the payload or the encapsulating header to determine that the packet is traversing between the appropriate source and destination. This can be in the form of the following:

  • unique pair of IP and port + optional magic number for NAT
  • tuple of transport type, IP, and port
  • unique ID derived from timestamp, IP, port, etc.
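One way to derive a stable identifier from such a tuple is a name-based UUID. The sketch below hashes the transport type, IP, and port into a version 5 UUID; the choice of `uuid.NAMESPACE_URL` and the `transport://ip:port` string layout are assumptions made for the example, and a timestamp could be mixed into the name when identifiers must differ across sessions.

```python
import uuid


def path_id(transport: str, ip: str, port: int,
            namespace: uuid.UUID = uuid.NAMESPACE_URL) -> uuid.UUID:
    """Derive a deterministic 128-bit identifier from a (transport, ip, port) tuple."""
    return uuid.uuid5(namespace, f"{transport}://{ip}:{port}")
```

Both endpoints can compute the same identifier independently, as long as they agree on the tuple and the namespace.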

Pre – Registration

As a mobile node detects a new network, it registers its current location with the network. This can be in the form of authentication as the mobile node enters a network, or of announcing its presence with a pub/sub notification. Examples:

  • SIP registration
  • acknowledge message for a new init
  • hello handshake

Handover / Failover from one path to another without disruption

A shift from one network to another can be triggered by various factors such as signal strength, load balancing, or network congestion. The decision to move over to a new network may also be gauged by algorithmically analyzing signal quality, interference, performance metrics, or cost and usage.

  • L2 handover: a change in link-layer connection, such as disconnecting from one wireless access point and connecting to another. Another example is the telecom network handover, which re-establishes link-layer connectivity instead of relying on upper layers to reconnect.
  • L3 handover: a change of the router to which the mobile node is connected.

Assuming endpoint 1 of the session has M addresses and endpoint 2 has N addresses, the connection should be able to migrate between any one of the M*N address pairs.
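The M*N candidate set can be enumerated directly as a Cartesian product. In the sketch below, the helper names and the use of a set of failed pairs to drive failover are assumptions made for illustration, and the addresses come from documentation ranges.

```python
from itertools import product
from typing import List, Optional, Sequence, Set, Tuple

AddressPair = Tuple[str, str]


def candidate_pairs(local_addrs: Sequence[str],
                    remote_addrs: Sequence[str]) -> List[AddressPair]:
    """All (local, remote) pairs the session could migrate between: M * N of them."""
    return list(product(local_addrs, remote_addrs))


def next_pair(pairs: Sequence[AddressPair],
              failed: Set[AddressPair]) -> Optional[AddressPair]:
    """Pick the first pair not yet known to be broken, or None if all have failed."""
    for pair in pairs:
        if pair not in failed:
            return pair
    return None
```

In practice the pairs would be ordered by the path selection policy before failover walks through them.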

Security association for paths

Although more related to IPsec tunnels, SAs (security associations) in this context refer to cryptographic certainty that the endpoints are authorized and authenticated. An SA uses nonces to randomize the key generation tokens. Additionally, anti-replay and rekeying mechanisms are in place to detect possible interference in the communication link. A successful security association should show that the binding is successful and the endpoints are now allowed to transfer data or establish a tunnel to transfer encapsulated encrypted data.

Token based validity : A token, often generated from a nonce and the session’s unique parameters, is often used to validate intermediate messages without having to cross-confirm every packet with the home network or profile database.
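A keyed hash is one common way to build such a token. The sketch below computes an HMAC-SHA256 over a nonce and a session identifier; the function names, the 16-byte nonce length, and the way the fields are concatenated are assumptions made for illustration rather than the scheme of any specific protocol.

```python
import hashlib
import hmac
import secrets


def issue_token(secret: bytes, session_id: str, nonce: bytes) -> bytes:
    """Bind a fixed-length nonce and the session's identifier under a shared secret."""
    return hmac.new(secret, nonce + session_id.encode("utf-8"), hashlib.sha256).digest()


def verify_token(secret: bytes, session_id: str, nonce: bytes, token: bytes) -> bool:
    """Recompute the token locally and compare in constant time."""
    return hmac.compare_digest(issue_token(secret, session_id, nonce), token)


# Usage: the verifier needs only the shared secret, not a lookup in a remote database.
shared_secret = secrets.token_bytes(32)
nonce = secrets.token_bytes(16)
token = issue_token(shared_secret, "session-1", nonce)
```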

ESP Header : In the case of encapsulated packets such as Encapsulating Security Payload (ESP), a non-null authentication value on the payload can provide information on the authenticity of the origin.

Pre – Registration / Validation : As a mobile node detects a new network, it registers its current location with the network. This can be in the form of authentication as a mobile node enters a network by sending its presence with pub/sub notification. Example: SIP registration, child SAs for various paths in IPSec
