Techniques for Direct Server Return with Stingray Traffic Manager

What is Direct Server Return?

Layer 2/3 Direct Server Return (DSR), also referred to as ‘triangulation’, is a network routing technique used in some load balancing situations:

  • Incoming traffic from the client is received by the load balancer and forwarded to a back-end node
  • Outgoing (return) traffic from the back-end node is sent directly to the client and bypasses the load balancer completely


Incoming traffic (blue) is routed through the load balancer, and return traffic (red) bypasses the load balancer

Direct Server Return is fundamentally different from the normal load balancing mode of operation, where the load balancer observes and manages both inbound and outbound traffic.

In contrast, there are two other common load balancing modes of operation:

  • NAT (Network Address Translation): layer 4 load balancers and simple layer 7 application delivery controllers use NAT to rewrite the destination address of individual network packets.  Network connections are load-balanced by the choice of destination address.

They often use a technique called ‘delayed binding’ to delay and inspect a new network connection before sending the packets to a back-end node; this allows them to perform content-based routing.  NAT-based load balancers can switch TCP streams, but have limited capabilities to inspect and rewrite network traffic.

  • Proxy: Modern general-purpose load balancers like Stingray Traffic Manager operate as full proxies.  The proxy mode of operation is the most compute-intensive, but current general purpose hardware is more than powerful enough to manage traffic at multi-gigabit speeds.

Whereas NAT-based load balancers manage traffic on a packet-by-packet basis, proxy-based load balancers can read entire request and responses.  They can manage and manipulate the traffic based on a full understanding of the transaction between the client and the application server.

Note that some load balancers can operate in a dual-mode fashion - a service can be handled either in a NAT-like fashion or in a proxy-like fashion.  This introduces a tradeoff between hardware performance and software sophistication - see SOL4707 - Choosing appropriate profiles for HTTP traffic for an example.  Stingray Traffic Manager operates only as a full proxy.

This article describes how the benefits of direct server return can be applied to a layer 7 traffic management device such as Stingray Traffic Manager.

Why use Direct Server Return?

Layer 2/3 Direct Server Return was very popular from 1995 to about 2000 because the load balancers of the time were seriously limited in performance and compute power; DSR uses fewer compute resources than a full NAT or proxy load balancer.  DSR is no longer necessary for high-performance services, as modern load balancers on modern hardware can easily handle multiple gigabits of traffic without it.

DSR is still an appealing option for organizations who serve large media files, or who have very large volumes of traffic.

Stingray Traffic Manager does not support a traditional DSR mode of operation, but it is straightforward to manage traffic to obtain a similar layer 7 DSR effect.

Disadvantages of Layer 2/3 Direct Server Return

There are a number of distinct limitations and disadvantages with DSR:

1. The load balancer does not observe the response traffic

The load balancer has no way of knowing if a back-end server has responded correctly to the remote client.   The server may have failed, or it may have returned a server error message.  An external monitoring service is necessary to verify the health and correct operation of each back-end server.

2. Proper load balancing is not possible

The load balancer has no idea of service response times so it is difficult for it to perform effective, performance-sensitive load balancing.

3. Session persistence is severely limited

Because the load balancer only observes the initial ‘SYN’ packet before it makes a load balancing decision, it can only perform session persistence based on the source IP address and port of the packet, i.e. the IP address of the remote client.

The load balancer cannot perform cookie-based session persistence, SSL session ID persistence, or any of the many other session persistence methods offered by other load balancers.
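To make the constraint concrete: the only persistence input available at this layer is the IP header of the initial SYN packet, so persistence degenerates to a hash of the source address.  The following Python sketch is purely illustrative (the node addresses are invented) and shows the strongest form of persistence a layer 2/3 DSR load balancer can offer:

```python
# Sketch: the only persistence available to a layer 2/3 DSR balancer is
# hashing the packet's source IP, because no payload is ever inspected.
# Node addresses are illustrative, not part of any real configuration.
import hashlib

NODES = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

def choose_node(source_ip: str) -> str:
    """Map a client IP to a back-end node; the same client always
    reaches the same node while the node list is unchanged."""
    digest = hashlib.md5(source_ip.encode()).digest()
    return NODES[int.from_bytes(digest, "big") % len(NODES)]

# The same source address always yields the same node:
assert choose_node("203.0.113.7") == choose_node("203.0.113.7")
```

Note that any change to the node list reshuffles most clients, and clients behind a large NAT all hash to the same node - two further weaknesses of source-IP persistence.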

4. Content-based routing is not possible

Again, because the load balancer does not observe the initial request, it cannot perform content based routing.

5. Limited traffic management and reporting

The load balancer cannot manage traffic, performing operations like SSL decryption, content compression, security checking, SYN cookies, bandwidth management, etc.  It cannot retry failed requests, or perform any traffic rewriting.  The load balancer cannot report on traffic statistics such as bandwidth sent.

6. DSR can only be used within a datacenter

There is no way to perform DSR between datacenters (other than proprietary tunnelling, which may be limited by ISP egress filtering).

In addition, many of the advanced capabilities of an application delivery controller that depend on inspection and modification (security, acceleration, caching, compression, scrubbing etc) cannot be deployed when a DSR mode is in use.

Performance of Direct Server Return

The performance benefits of DSR are often assumed to be greater than they really are.  Central to this doubt is the observation that client applications will send TCP ‘ACK’ packets via the load balancer in response to the data they receive from the server, and the volume of the ACK packets can overwhelm the load balancer.

Although ACK packets are small, in many cases the rated capacities of network hardware assume that all packets are the size of the maximum MTU (typically 1500 bytes).  A load balancer on a 100 Mbit network could therefore receive a little over 8,000 ACK packets per second.

On a low-latency network, ACK packets are relatively infrequent (roughly 1 ACK for every 4 data packets), but for large downloads over a high-latency network (e.g. 8 hops) the ratio of ACK to data packets closely approaches 1:1 as the server and client attempt to optimize the TCP session.  Therefore, over high-latency networks, a DSR-equipped load balancer will receive a similar volume of ACK packets to the volume of outgoing data packets (and the difference in size between ACK and data packets has little effect on packet-based load balancers).
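The figures above can be checked with simple arithmetic:

```python
# Back-of-envelope check: a fully loaded 100 Mbit/s link carrying
# maximum-MTU (1500-byte) frames moves ~8,333 packets per second, so a
# 1:1 ACK ratio returns a similar packet rate to the load balancer.
LINK_BPS = 100_000_000   # 100 Mbit/s
MTU_BITS = 1500 * 8      # 12,000 bits per full-MTU frame

full_mtu_packets_per_sec = LINK_BPS // MTU_BITS
print(full_mtu_packets_per_sec)  # 8333
```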

Stingray alternatives to Layer 2/3 DSR

There are two alternatives to direct server return:

Use Stingray Traffic Manager in its usual full proxy mode

Stingray Traffic Manager is comfortably able to manage many gigabits of traffic in its normal ‘proxy’ mode on appropriate hardware, and can be scaled horizontally for increased capacity.  In benchmarks, modern Intel- and AMD-based systems achieve tens of gigabits of fully load-balanced traffic, and up to twice as much when serving content from Stingray Traffic Manager’s content cache.

Redirect requests to the chosen origin server (a.k.a. Layer 7 DSR)

For the most common protocols (HTTP and RTSP), it is possible to handle the connection in ‘proxy’ mode and then redirect the client to the chosen server node once the load balancing and session persistence decision has been made.  For a large file download, the client then communicates directly with the server node, bypassing Stingray Traffic Manager completely:

  1. The client issues an HTTP or RTSP request to Stingray Traffic Manager
  2. Stingray Traffic Manager issues a ‘probe’ request to a back-end server selected from the pool
  3. Stingray Traffic Manager verifies that the back-end server returns a correct response
  4. Stingray Traffic Manager sends a 302 redirect to the client, telling it to retry the request against the chosen back-end server


Requests for small objects (blue) are proxied directly to the origin. 

Requests for large objects (red) elicit a lightweight probe to locate the resource,

and then the client is instructed (green) to retrieve the resource directly from the origin.

This technique would generally be used selectively.  Small file downloads (web pages, images, etc.) would be managed through Stingray Traffic Manager; only large files – embedded media, for example – would be handled in this redirect mode.  In either case, the initial HTTP session always runs through Stingray Traffic Manager.
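The probe-and-redirect flow can be sketched as follows.  This is illustrative Python, not Stingray behaviour: the choose_node and probe_node callbacks are invented stand-ins for the load balancing decision and health probe, and falling back to proxying on a failed probe is an assumption:

```python
# Sketch of the probe-and-redirect decision (all names hypothetical):
# small objects are proxied as usual; large-media paths trigger a
# lightweight probe and a 302 pointing the client at the chosen node.
def handle_request(path, choose_node, probe_node):
    if not path.startswith("/media/"):
        return ("PROXY", path)            # normal full-proxy handling
    node = choose_node(path)              # load balancing decision
    if not probe_node(node):              # steps 2-3: verify node health
        return ("PROXY", path)            # assumed fallback: just proxy
    # step 4: tell the client to fetch the object directly
    return ("REDIRECT", f"http://{node}{path}")

kind, target = handle_request(
    "/media/movie.mp4",
    choose_node=lambda p: "10.0.0.11",
    probe_node=lambda n: True,
)
print(kind, target)  # REDIRECT http://10.0.0.11/media/movie.mp4
```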

Layer 7 DSR with HTTP

Layer 7 DSR with HTTP is fairly straightforward.  In the following example, incoming requests that begin “/media” will be converted into simple probe requests and sent to the ‘Media Servers’ pool.  The Stingray Traffic Manager will determine which node was chosen, and send the client an explicit redirect to retrieve the requested content from the chosen node:

Request rule: Deploy the following TrafficScript request rule:

$path = http.getPath();

if( string.startsWith( $path, "/media/" ) ) {

   # Store the real path
   connection.data.set( "path", $path );

   # Convert the request to a lightweight HEAD for '/'
   http.setMethod( "HEAD" );
   http.setPath( "/" );

   pool.use( "Media Servers" );
}


Response rule: This rule reads the response from the server; load balancing and session persistence (if relevant) will ensure that we’ve connected with the optimal server node.  The rule only takes effect if we performed the request rewrite: in that case the stored $saved_path value will begin with ‘/media/’, so we can issue the redirect.

$saved_path = connection.data.get( "path" );

if( string.startsWith( $saved_path, "/media/" ) ) {

   $chosen_node = connection.getNode();
   http.redirect( "http://" . $chosen_node . $saved_path );
}


Layer 7 DSR with RTSP

An RTSP connection is a persistent TCP connection.  The client and server communicate with HTTP-like requests and responses. 

In this example, Stingray Traffic Manager will receive initial RTSP connections from remote clients and load-balance them on to a pool of media servers.  In the RTSP protocol, a media download is always preceded by a ‘DESCRIBE’ request from the client; Stingray Traffic Manager will replace the ‘DESCRIBE’ response with a 302 Redirect response that tells the client to connect directly to the back-end media server.

This code example has been tested with the QuickTime, Real and Windows Media clients, and against pools of QuickTime, Helix (Real) and Windows Media servers.

The details

Create a virtual server listening on port 554 (standard port for RTSP traffic).  Set the protocol type to be “RTSP”.

In this example, we have three pools of media servers, and we’re going to select the pool based on the User-Agent field in the RTSP request.  The pools are named “Helix Servers”, “QuickTime Servers” and “Windows Media Servers”.

Request rule: Deploy the following TrafficScript request rule:

$client = rtsp.getRequestHeader( "User-Agent" );

# Choose the pool based on the User-Agent

if( string.contains( $client, "RealMedia" ) ) {
   pool.use( "Helix Servers" );
} else if( string.contains( $client, "QuickTime" ) ) {
   pool.use( "QuickTime Servers" );
} else if( string.contains( $client, "WMPlayer" ) ) {
   pool.use( "Windows Media Servers" );
}


This rule uses pool.use() to specify which pool to use when Stingray Traffic Manager is ready to forward the request to a back-end server. 

Response rule: All of the work takes place in the response rule.  This rule reads the response from the server.  If the request was a ‘DESCRIBE’ method, the rule then replaces the response with a 302 redirect, telling the client to connect directly to the chosen back-end server. 

Add this rule as a response rule, setting it to run every time (not once).

# Wait for a DESCRIBE response since this contains the stream

$method = rtsp.getMethod();
if( $method != "DESCRIBE" ) break;

# Get the chosen node
$node = connection.getNode();

# Instruct the client to retry directly against the chosen node
$path = rtsp.getPath();
rtsp.redirect( "rtsp://" . $node . "/" . $path );

Appendix: How does DSR work?

It’s useful to have an appreciation of how DSR (and Delayed Binding) functions in order to understand some of its limitations (such as content inspection).

TCP overview

A simplified overview of a TCP connection is as follows:

Connection setup

  1. The client initiates a connection with a server by sending a ‘SYN’ packet.  The SYN packet contains a randomly generated client sequence number (along with other data).
  2. The server replies with a ‘SYN ACK’ packet, acknowledging the client’s SYN and sending its own randomly generated server sequence number.
  3. The client completes the TCP connection setup by sending an ACK packet to acknowledge the server’s SYN. 

The TCP connection setup is often referred to as a 3-way TCP handshake.  Think of it as the following conversation:

  1. Client: “Can you hear me?” (SYN)
  2. Server: “Yes.  Can you hear me?” (ACK, SYN)
  3. Client: “Yes” (ACK)
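The exchange above can be modelled in a few lines.  This is an illustrative Python toy, not a TCP implementation; real stacks add options, timers and retransmission:

```python
# Toy model of the 3-way handshake: each side picks a random initial
# sequence number (ISN) and acknowledges the other side's ISN + 1.
import random

def three_way_handshake():
    client_isn = random.randrange(2**32)                  # 1. SYN
    server_isn = random.randrange(2**32)
    syn_ack = {"seq": server_isn, "ack": client_isn + 1}  # 2. SYN/ACK
    ack = {"seq": client_isn + 1, "ack": server_isn + 1}  # 3. ACK
    return syn_ack, ack

syn_ack, ack = three_way_handshake()
assert ack["ack"] == syn_ack["seq"] + 1   # client acknowledged server's SYN
```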

Data transfer

Once the connection has been established by the 3-way handshake, the client and server exchange data packets with each other.  Because packets may be dropped or re-ordered, each packet contains a sequence number; the sequence number is incremented for each packet sent.

When a client receives intact data packets from the server, it sends back an ACK (acknowledgement) with the packet sequence number.  When a client acknowledges a sequence number, it is acknowledging it received all packets up to that number, so ACKs may be sent less frequently than data packets. 

The server may send several packets in sequence before it receives an ACK (the number is determined by the TCP “window size”), and will resend packets if they are not ACK’d rapidly enough.

Simple NAT-based Load Balancing

There are many variants for IP and MAC rewriting used in simple NAT-based load balancing.  The simplest NAT-based load balancing technique uses Destination-NAT (DNAT) and works as follows:

  1. The client initiates a connection by sending a SYN packet to the Virtual IP (VIP) that the load balancer is listening on
  2. The load balancer makes a load balancing decision and forwards the SYN packet to the chosen node.  It rewrites the destination IP address in the packet to the IP address of the node.  The load-balancer also remembers the load-balancing decision it made.
  3. The node replies with a SYN/ACK.  The load-balancer rewrites the source IP address to be the VIP and forwards the packet on to the remote client.
  4. As more packets flow between the client and the server, the load balancer checks its internal NAT table to determine how the IP addresses should be rewritten.

This implementation is very amenable to a hardware (ASIC) implementation.  The TCP connection is load-balanced on the first SYN packet; one of the implications is that the load balancer cannot inspect the content in the TCP connection before making the routing decision.
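The DNAT steps above can be sketched as a lookup table keyed by client address.  This is illustrative Python (the packet field names and addresses are invented), not how any real ASIC or kernel represents packets:

```python
# Sketch of a DNAT table: the balancer records its decision for each
# client connection and rewrites addresses in both directions.
VIP = "192.0.2.1"
nat_table = {}   # (client_ip, client_port) -> chosen node IP

def forward_inbound(packet, choose_node):
    key = (packet["src_ip"], packet["src_port"])
    if key not in nat_table:                 # first SYN: decide once
        nat_table[key] = choose_node()       # and remember the decision
    packet["dst_ip"] = nat_table[key]        # rewrite destination to node
    return packet

def forward_outbound(packet):
    packet["src_ip"] = VIP                   # reply appears to come from VIP
    return packet

pkt = {"src_ip": "203.0.113.7", "src_port": 40000, "dst_ip": VIP}
out = forward_inbound(pkt, choose_node=lambda: "10.0.0.11")
print(out["dst_ip"])  # 10.0.0.11
```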

Delayed Binding

Delayed binding is a variant of the DNAT load balancing method.  It allows the load balancer to inspect a limited amount of the content before making the load balancing decision.

  1. When the load balancer receives the initial SYN, it chooses a server sequence number and returns a SYN/ACK response
  2. The load balancer completes the TCP handshake with the remote client and reads the initial few data packets in the client’s request.
  3. The load balancer reassembles the request, inspects it, and makes the load-balancing decision.  It then makes a TCP connection to the chosen server (using DNAT, so the connection appears to come from the client’s source IP address) and writes the request to the server.
  4. Once the request has been written, the load balancer must splice the client-side and server-side connection together.  It does this by using DNAT to forward packets between the two endpoints, and by rewriting the sequence numbers chosen by the server so that they match the initial sequence numbers that the load balancer used.

This implementation is still amenable to hardware (ASIC) implementation.  However, layer 4-7 tasks such as detailed content inspection and content rewriting are beyond implementation in specialized hardware alone and are often implemented using software approaches (such as F5's FastHTTP profile), albeit with significant functional limitations.
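The sequence-number splice in step 4 amounts to applying a fixed offset in each direction.  The following Python sketch is illustrative only (real splicing also rewrites checksums and handles SACK and timestamps):

```python
# Sketch of the splice: the balancer chose its own server ISN during
# delayed binding, so every server-side sequence number must be shifted
# by a fixed delta before reaching the client, and client ACKs must be
# shifted back before reaching the server.
def make_splicer(balancer_isn, real_server_isn):
    delta = balancer_isn - real_server_isn
    def to_client(seq):          # server -> client direction
        return (seq + delta) % 2**32
    def to_server(ack):          # client ACKs -> server direction
        return (ack - delta) % 2**32
    return to_client, to_server

to_client, to_server = make_splicer(balancer_isn=1000, real_server_isn=5000)
assert to_client(5001) == 1001            # server's first data byte
assert to_server(to_client(5001)) == 5001 # round-trips cleanly
```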

Direct Server Return

Direct Server Return is most commonly implemented using MAC address translation (layer 2).

A MAC (Media Access Control) address is a unique, unchanging hardware address that is bound to a network card.  Network devices will read all network packets destined for their MAC address.

Network devices use ARP (address resolution protocol) to announce the MAC address that is hosting a particular IP address.  In a Direct Server Return configuration, the load balancer and the server nodes will all listen on the same VIP.  However, only the load balancer makes ARP broadcasts to tell the upstream router that the VIP maps to its MAC address.

  1. When a packet destined for the VIP arrives at the router, the router places it on the local network, addressed to the load balancer’s MAC address.  The load balancer picks that packet up.
  2. The load balancer then makes a load balancing decision, choosing which node to send it to.  The load balancer rewrites the MAC address in the packet and puts it back on the wire.
  3. The chosen node picks the packet up just as if it were addressed directly to it.
  4. When the node replies, it sends its packets directly to the remote client.  They are immediately picked up by the upstream router and forwarded on.

In this way, reply packets completely bypass the load balancer machine.
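The MAC-rewrite steps can be sketched as follows (illustrative Python; the frame representation and addresses are invented).  Note that only the layer 2 destination changes - the IP layer is never touched:

```python
# Sketch of layer 2 DSR: the balancer rewrites only the destination MAC.
# The IP addresses are untouched, so the chosen node (which is also
# configured with the VIP) replies straight to the client.
VIP = "192.0.2.1"
NODE_MACS = ["aa:bb:cc:00:00:01", "aa:bb:cc:00:00:02"]

def dsr_forward(frame, pick):
    assert frame["dst_ip"] == VIP          # frame arrived for the VIP
    frame["dst_mac"] = NODE_MACS[pick]     # step 2: MAC rewrite only
    return frame                           # back on the wire, IPs unchanged

frame = {"dst_mac": "balancer-mac", "src_ip": "203.0.113.7", "dst_ip": VIP}
out = dsr_forward(frame, pick=0)
assert out["dst_ip"] == VIP               # IP layer never rewritten
```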

Why content inspection is not possible

Content inspection (delayed binding) is not possible because it requires that the load balancer first complete the three-way handshake with the remote client, and possibly ACK some of the data packets.

When the load balancer then sends the first SYN to the chosen node, the node will respond with a SYN/ACK packet sent directly back to the remote client.  The load balancer is out of line and cannot suppress this SYN/ACK.  Additionally, the sequence number that the node selects cannot be translated to the one that the remote client is expecting.  There is no way to persuade the node to pick up the TCP connection from where the load balancer left off.

For similar reasons, SYN cookies cannot be used by the load balancer to offload SYN floods from the server nodes.

Alternative Implementations of Direct Server Return

There are two alternative implementations of DSR (see this 2002 paper entitled 'The State of the Art'), but neither is widely used any more:

  • TCP Tunnelling: IP tunnelling (aka IP encapsulation) can be used to tunnel the client IP packets from the load balancer to the server.  All client IP packets are encapsulated within IP datagrams, and the server runs a tunnel device (an OS driver and configuration) to strip off the datagram header before sending the client IP packet up the network stack.

This configuration does not support delayed binding, or any equivalent means of inspecting content before making the load balancing decision

  • TCP Connection Hopping: Resonate have implemented a proprietary protocol (Resonate Exchange Protocol, RXP) which interfaces deeply with the server node’s TCP stack.  Once a TCP connection has been established with the Resonate Central Dispatch load balancer and the initial data has been read, the load balancer can hand the response side of the connection off to the selected server node using RXP.  The RXP driver on the server suppresses the initial TCP handshake packets, and forces the use of the correct TCP sequence number. 

This uniquely allows for content-based routing and direct server return in one solution.

Neither of these methods is in wide use today.

Version history
Revision #: 1 of 1
Last update: 02-25-2013 05:07 AM