Feature Brief: Stingray's Content Caching capability allows the traffic manager to operate as a high-performance proxy cache, with full control using TrafficScript.
When Stingray is deployed in a cluster, each Stingray device manages its own local cache. When an item expires from one Stingray's cache, that Stingray retrieves the item from the origin server the next time it is requested.
Each Stingray manages its own cache independently
In the majority of situations, this deployment pattern is entirely appropriate. In some specialized situations, however, it is preferable to have 100% consistency between the caches in the traffic managers and to absolutely minimize the load on the origin servers. This article describes the recommended deployment configuration to meet those two technical goals.
Stingray’s cache flood protection is used in this configuration to manage the effects of multiple simultaneous requests for the same content that is no longer fresh (as defined by the cache refresh time). If multiple requests arrive at the same instant, at most one request per second is forwarded to the origin servers; the remaining requests are served the existing cached copy while the refresh completes.
The configuration delivers reliable services with predictable performance, while minimizing infrastructure costs and eliminating the overspend otherwise needed to accommodate occasional spikes in traffic or visitor numbers.
Stingray Traffic Managers are deployed in a fault-tolerant pair, and external traffic is directed to a public-facing IP address (“Traffic IP address”). Both traffic managers are active, and the one that owns the traffic IP handles incoming traffic:
During normal operation, the primary traffic manager receives traffic and responds directly from its local cache. If a cache miss occurs (i.e. when priming the cache, or when content expires or needs refreshing), the traffic manager first checks with its secondary peer and retrieves the secondary’s cached version if that is still valid (not expired or needing a refresh). It caches the copy retrieved from the secondary according to the local cache policy.
If the secondary does not have valid content, then the resource will be retrieved from the origin servers (load balancing as appropriate) and cached by both traffic managers.
If the secondary traffic manager fails, the primary will continue to respond directly from its local cache whenever possible. If a cache miss occurs or a refresh is needed, content is retrieved from the origin servers and cached in the primary traffic manager’s cache.
When the secondary traffic manager recovers, it may have an out-of-date cache (in the event of a network failure), or the cache may be completely empty (in the event of a software restart). This cache is fully repopulated with the working set of documents within the configured refresh time. There is no risk of the traffic manager serving stale content, and the load on the origin server is not increased during the repopulation.
If the primary traffic manager fails, the secondary will take over with a fully primed cache and respond to all traffic. If a cache miss occurs or a refresh is needed, the secondary will retrieve the content from the origin servers and cache it locally.
When the primary traffic manager recovers, its cache may be out-of-date or completely empty (as above). The primary takes back responsibility for user traffic and will update its cache rapidly by retrieving fresh content from the secondary traffic manager on every cache miss or refresh. There is no risk of the primary traffic manager returning stale content, and the load on the origin servers is not increased during the repopulation.
The configuration instructions are based on the following context:
Configure two pools as follows:
Name: Origin Servers, pool members: webserver1:80, webserver2:80, webserver3:80
Name: Cache Servers, pool members: stingray-2:80, webserver1:80, webserver2:80, webserver3:80
Configure both pools to use an appropriate load balancing algorithm. Additionally, configure priority lists for the Cache Servers pool as follows:
Cache Servers pool: priority list configuration
This configuration ensures that when the Cache Servers pool is selected, all traffic is sent to stingray-2 if it is available, or is load-balanced across the three origin webservers if not.
Note: if any changes are made to the origin servers (nodes are added or removed), both the Origin Servers and Cache Servers pools must be updated.
Create an HTTP virtual server named Cache Service, and set the default pool to Origin Servers.
Note: This virtual server should generally listen on all IP addresses. If it’s necessary to restrict the IP addresses it listens on, it should listen on the public traffic IP and on the IP address that is resolved by ‘stingray-2’ (the first node in the Cache Servers pool).
Configure caching in the virtual server to cache content for the desired period (e.g. 999 seconds), and to refresh it (with a maximum of one request per second) once it has been cached for 15 seconds, as follows:
Caching settings for the Cache Service virtual server
Add a request rule named Synchronize Cache to the virtual server to select the Cache Servers pool (overriding the default Origin Servers pool).
This rule selects the Cache Servers pool when traffic is received on the primary traffic manager (‘stingray-1’).
Note: you will need to update this rule if the hostname for the primary traffic manager is not stingray-1.
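The body of the rule can be a single conditional. The following is a minimal TrafficScript sketch, assuming the primary's hostname is ‘stingray-1’; `sys.hostname()` returns the local traffic manager's hostname, and `pool.use()` overrides the pool selected for the request:

```
# Synchronize Cache: on the primary traffic manager, route cache
# misses via the secondary's cache ("Cache Servers" pool) instead of
# going straight to the origin servers.
if( sys.hostname() == "stingray-1" ) {
   pool.use( "Cache Servers" );
}
```

On the secondary (or any other cluster member), the condition is false, so the virtual server's default Origin Servers pool applies unchanged.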
This completes the configuration for the Cache Service virtual server and related pools and rule:
Testing this configuration must be done with due care because the presence of multiple caches can make debugging difficult. The following techniques are useful:
Test with a small sample set of content to verify that the cache policies function as desired. For example, the following command will repeatedly request a single cacheable resource:
bash$ while sleep 1 ; do wget http://host/cacheablecontent.gif ; done
Use the activity monitor to chart connections made to the origin servers:
Note that the activity monitor charts generally merge data from all traffic managers in a cluster; for clarity, this chart plots traffic individually per traffic manager. Observe that despite the front-end load of 1 request per second (not plotted), only one request every 15 seconds is sent to the origin server to refresh the cache. All requests originate from stingray-2.
If stingray-2 is suspended or shut down, the activity monitor report run from stingray-1 will verify that there is no increase in origin server traffic; the same applies if stingray-1 fails and the activity monitor chart is run from stingray-2.
The connections report will assist in identifying where traffic is routed.
In the following report, requests for the same 866-byte resource were issued once a second; stingray-1 received them and responded from its local cache (the ‘To’ column is ‘none’).
At 10:47:25, a cache refresh event occurred. Stingray-1 forwarded the request to stingray-2 (192.168.35.11). Stingray-2’s cache also required a refresh (because the caches are synchronized) so stingray-2 requested the content from one of the origin servers (192.168.207.103).
The synchronization solution effectively meets the goals of reliability, performance and minimization of load on the origin servers. It may be extended from an active-active pair of Stingray traffic managers to a larger cluster if required, which will increase the level of redundancy in the system, but at the expense of a small (probably insignificant) increase in latency as caches must be synchronized across a larger set of devices.
It is generally not a good idea to pre-prime a cache because the act of priming the cache puts a large one-time load on the origin servers. In the majority of situations, it is better to use this synchronization solution and allow the caches to fill on demand, in response to end user traffic.
If it is necessary to pre-prime the cache, this can be done using synthetic transactions to submit requests through the Stingray cluster. For example, ‘wget -r’ was used with success during advanced testing of this solution.
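As an illustration, a recursive crawl along the following lines could be submitted through the cluster's public traffic IP; the hostname and recursion depth below are placeholders, not values from the tested deployment:

```shell
# Hypothetical priming crawl: fetch the site recursively through the
# cluster so that the caches fill in response to the requests.
# --delete-after discards each file locally once it has been fetched.
wget -r -l 2 --delete-after http://www.example.com/
```

Because the requests pass through the traffic managers, the caches fill exactly as they would under real user traffic, subject to the same cache policy and flood protection.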