
Managing consistent caches across a Stingray Cluster - the remix!

The article Managing consistent caches across a Stingray Cluster describes in detail how to configure a pair of Stingray devices to operate together as a fully fault-tolerant cache.


The beauty of that configuration was that it minimized load on the origin servers: content was only requested from the origin servers when it had expired on both peers, and at most one request per 15 seconds (configurable) was sent per item of content.




The solution uses two Stingray Traffic Managers, with all incoming traffic directed to a single front-end traffic manager.


How could we extend this solution to support more than two traffic managers (for very high-availability requirements) with multiple active traffic managers?




The basic architecture of the solution is as follows:


  • We begin with a cluster of 3 Stingray Traffic Managers, named stm-1, stm-2 and stm-3, with a multi-hosted IP address distributing traffic across the three traffic managers
  • Incoming traffic is looped through all three traffic managers before being forwarded to the origin servers; the return traffic can then be cached by each traffic manager



  • If any of the traffic managers has a cached version of the response, it responds directly




Start from a working cluster.  In this example, the names 'stm-1', 'stm-2' and 'stm-3' resolve to the permanent IP addresses of each traffic manager; replace these with the hostnames of the machines in your cluster.  The origin servers are webserver1, webserver2 and webserver3.


Step 1: Create the basic pool and virtual server


Create a pool named 'website0', containing the addresses of the origin servers.


Create a virtual server that uses the 'discard' pool as its default pool.  Add a request rule to select 'website0':


pool.use( "website0" );


... and verify that you can browse your website through this virtual server.


Step 2: Create the additional pools


You will need to create N * (N-1) additional pools if you have N traffic managers in your cluster.


Pools website10, website20 and website30 contain the origin servers plus one of the nodes stm-1:80, stm-2:80 or stm-3:80.  Edit each pool and enable priority lists so that the stm node is used in preference to the origin servers:



Configuration for Pools website10 (left), website20 (middle) and website30 (right)


Pools website230, website310 and website120 contain the origin servers plus two of the nodes stm-1:80, stm-2:80 and stm-3:80.  Edit each pool and enable priority lists so that the stm nodes are each used in preference to the origin servers.


For example, pool website310 will contain nodes stm-3:80 and stm-1:80, and have the following priority list configuration:


(Priority list for website310: stm-3:80 at the highest priority, then stm-1:80, and the origin servers last.)
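As a sanity check on the N * (N-1) figure, here is a short Python sketch (illustrative only, not TrafficScript; the extra_pools helper is not part of the configuration) that derives the extra pool names from the chain-naming convention used in this article:

```python
# Sketch: enumerate the extra pools needed for an N-node cluster.
# Pool-naming convention from the article: 'website' + remaining chain + '0'.
def extra_pools(n):
    ids = [str(i + 1) for i in range(n)]
    pools = set()
    for start in range(n):
        chain = "".join(ids[start:] + ids[:start])  # a rotation, e.g. '231'
        while len(chain) > 1:
            chain = chain[1:]                       # each hop trims the head
            pools.add("website" + chain + "0")      # pool used for that hop
    return sorted(pools)

print(extra_pools(3))
# -> ['website10', 'website120', 'website20',
#     'website230', 'website30', 'website310']
print(len(extra_pools(3)))  # -> 6, i.e. N * (N-1) for N=3
```

For N=3 this reproduces exactly the six pools named in this article; the basic pool website0 (the empty chain) is counted separately.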


Step 3: Add the TrafficScript rule to route traffic through the three Stingrays


Enable trafficscript!variable_pool_use (Global Settings > Other Settings), then add the following TrafficScript request rule:


# Consistent cache with multiple active traffic managers

$tm = [
   'stm-1' => [ 'id' => '1', 'chain' => '123' ],
   'stm-2' => [ 'id' => '2', 'chain' => '231' ],
   'stm-3' => [ 'id' => '3', 'chain' => '312' ]
];

$me = sys.hostname();
$id = $tm[$me]['id'];

$chain = http.getHeader( 'X-Chain' );
if( !$chain ) $chain = $tm[$me]['chain'];

log.info( "Request " . http.getPath() . ": " . $me . ", id " . $id . ": chain: " . $chain );

do {
   $i = string.left( $chain, 1 );
   $chain = string.skip( $chain, 1 );
} while( $chain && $i != $id );

log.info( "Request " . http.getPath() . ": New chain is " . $chain . ", selecting pool 'website" . $chain . "0'" );

http.setHeader( 'X-Chain', $chain );

pool.use( 'website' . $chain . '0' );


Leave the debugging 'log.info()' statements in for the moment; you should comment them out when you deploy in production.


How does the rule work?


When traffic is received by a Traffic Manager (for example, the traffic manager with hostname stm-2), the rule selects the chain of traffic managers to process that request - traffic managers 2, 3 and 1.


  • It updates the chain by removing '2' from the start, and then selects pool 'website310'.


  • This pool selects stm-3 in preference, then stm-1 (if stm-3 has failed), and finally the origin servers if both devices have failed.


  • stm-3 will process the request, check the chain (which is now '31'), remove itself from the start of the chain and select pool 'website10'.


  • stm-1 will then select the origin servers.


This way, a route for the traffic is threaded through all of the working traffic managers in the cluster.
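The hop-by-hop walkthrough above can be traced offline.  The following is a plain-Python model of the rule's chain logic (illustrative only; the TM dict and next_pool helper mirror the TrafficScript rule but are not TrafficScript):

```python
# Model of the chain-trimming do/while loop in the TrafficScript rule.
TM = {
    "stm-1": {"id": "1", "chain": "123"},
    "stm-2": {"id": "2", "chain": "231"},
    "stm-3": {"id": "3", "chain": "312"},
}

def next_pool(hostname, x_chain=None):
    """Trim ids from the head of the chain up to and including our own id,
    then name the pool for the remaining chain ('website' + chain + '0')."""
    me = TM[hostname]
    chain = x_chain or me["chain"]      # no X-Chain header: use our default
    while True:
        i, chain = chain[:1], chain[1:]  # string.left / string.skip
        if not chain or i == me["id"]:
            break
    return chain, "website" + chain + "0"

# Trace a request that first arrives at stm-2:
print(next_pool("stm-2"))         # -> ('31', 'website310')
print(next_pool("stm-3", "31"))   # -> ('1', 'website10')
print(next_pool("stm-1", "1"))    # -> ('', 'website0')  i.e. origin servers
```

The model also shows why failover works: if stm-3 is down, pool website310 forwards the request to stm-1 instead, and next_pool("stm-1", "31") still terminates with the empty chain and pool website0.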


Testing the rule


Test the configuration with a single request at a time.  With this configuration, it can be very difficult to unravel what happened when multiple requests are in flight simultaneously.


Note that each traffic manager in the cluster logs its activity, but these logs are merged with only per-second accuracy, so entries will likely appear out of order.  For testing, you could add a 'connection.sleep( 2000 )' to the rule to avoid this problem.


Enable caching


Once you are satisfied that the configuration is forwarding each request through every traffic manager, and that failures are appropriately handled, then you can configure caching.  The details of the configuration are explained in the Managing consistent caches across a Stingray Cluster article:




Test the configuration using a simple, repeated GET for a cacheable object:


$ while sleep 1 ; do wget ; done


Just as in the Consistent Caches article, you'll see that all Stingrays have the content in their cache, and it's refreshed from one of the origin servers once every 15 seconds:

(Activity graph: all three Stingrays serving the content from cache, with a single origin request every 15 seconds.)




This configuration used a Multi-Hosted IP address to distribute traffic across the cluster.  It works just as well with single-hosted addresses, which can make testing somewhat easier because you control which traffic manager receives the initial request.


You could construct a similar configuration using failure pools rather than priority lists.  The disadvantage of failure pools is that Stingray would treat the failure of a Stingray node as a serious error (because an entire pool has failed), whereas with priority lists the failure of a node is reported as a warning.  A warning is more appropriate, because the configuration can easily accommodate the failure of one or two Stingray nodes.


Performance should not be unduly affected by the need to thread requests through multiple traffic managers.  All cacheable requests are served directly by the traffic manager that received the request.  The only requests that traverse multiple traffic managers are those that are not in the cache, either because the response is not cacheable or because it has expired according to the 'one check every 15 seconds' policy.
