Pulse Secure vADC

Traffic Manager operates as a TCP/UDP proxy. It receives traffic from remote clients, processes it internally, and then hands it off to a pool. The pool makes a load-balancing decision to pick a back-end server, and then forwards the traffic to that server node.

Traffic Manager has two main methods to verify the correct operation of a back-end server. If it detects that a server has failed, it stops sending traffic to it until the server has recovered and can be re-introduced into the pool of servers.

Passive Health Monitoring is a basic technique that checks each response from a back-end server. It uses live traffic to detect a limited range of errors (it is not application-aware), and end-users may experience slow or broken responses while the health monitors confirm that the node has failed. Active Health Monitoring uses synthetic transactions to probe each node regularly and verify correct operation. It can detect a very wide range of application-level failures, and detects node failures independently of live traffic. In most cases, both health monitoring techniques are used together to assure the availability of the service.

Passive Health Monitoring

Every time Traffic Manager forwards a request to a back-end server, it verifies that a response is received. A number of tests are performed:

- If the connection is refused, or is not established within the max_connect_time setting (default 4 seconds), the request is considered to have failed.
- If the connection is closed prematurely, or if the beginning of a response is not received within max_reply_time seconds (default 30 seconds), the request is considered to have failed.
- SSL only: if the SSL handshake to the node fails, the request is considered to have failed.
- HTTP only: if an invalid HTTP response or a 503 status code is received, the request is considered to have failed.

The max_connect_time and max_reply_time settings are properties of the Connection Management settings in a Pool.

If these tests fail, the request may be retried against another node in the pool, depending on the idempotent status of the request. The request may be tried against all of the working nodes in the pool before the traffic manager gives up. If the traffic manager does give up, it either drops the connection or returns a custom error message (Virtual Server > Connection Management > Connection Error Settings).

If requests to a particular back-end server fail 3 times in a row (node_connection_attempts), Traffic Manager assumes that the back-end server has failed. Traffic Manager does not send the server any new requests for a period of time (node_fail_time, default 60 seconds), and then occasionally sends it speculative requests to determine whether or not it has recovered.

When a node recovers, the Perceptive load-balancing algorithm gradually ramps up traffic to the recovered node (see Tech Tip: Perceptive Load Balancing). The 'Fastest Response Time' algorithm has similar behavior; other load-balancing algorithms immediately load the node with new requests.

Summary

Passive Health Monitoring is basic and very easy to configure. However, it only performs simple tests to verify the health of a server, so it cannot detect application-level failures other than '503' HTTP responses. Furthermore, Passive Health Monitoring only detects failures when Traffic Manager attempts to forward a request to a back-end node, so some clients may experience a slow or broken response while Traffic Manager verifies whether the node has failed.

Active Health Monitoring

Traffic Manager may be configured with explicit health monitors. These health monitors perform periodic tests and verify the correct operation of each node. By using health monitors, the traffic manager can detect failures even when no traffic is being handled.

The tests are performed at configured intervals against each node in a pool (defined per health monitor). If the monitors fail sufficiently often, the node is marked as unavailable and the traffic manager stops sending traffic to it.

The monitors are held in the Monitors Catalog and can be applied to any pool. Each performs a specific test, ranging from simple operations, such as pinging each node, to more sophisticated tests which check that the appropriate port is open and that the node is able to serve specified data, such as your home page. You may also create custom health monitors to perform sophisticated tests against your servers.

Summary

Active Health Monitors are very effective at determining the correct operation of each back-end server and are able to detect a wide range of network and application failures.

Logging and debugging health monitoring

Whenever a node fails or recovers, a log message is written to Traffic Manager's event log, and an event is raised.

The Virtual Server setting log!server_connection_failures may be used to log request failures against back-end servers, and will help to track how passive health monitoring is functioning.

Each active health monitor has a verbose configuration setting that may be used to log the result of each test that is conducted.

Note that when a node fails, it may take several seconds for a health monitor to detect the failure (a combination of the delay, timeout and failures settings). Passive Health Monitoring will not detect the failure until Traffic Manager has attempted to send sufficient live traffic to the node.
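One practical footnote, expressed in TrafficScript: the failure behaviour described above can be surfaced directly in a rule. The sketch below serves a custom error page from a request rule when every node in the pool has failed, instead of dropping the connection. It is a minimal sketch, not part of the original article: the pool name "web servers" is an illustrative assumption, and pool.activenodes() and http.sendResponse() are the same functions used elsewhere in this collection.

# Minimal sketch: serve a friendly error page when all nodes have failed.
# The pool name "web servers" is hypothetical - substitute your own.
if( pool.activenodes( "web servers" ) == 0 ) {
   http.sendResponse( 503, "text/html",
      "<html><body>Service temporarily unavailable.</body></html>",
      "Retry-After: 60" );
}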
Find out more

Please refer to the Health Monitoring chapter in the User Manual (Product Documentation) for a more detailed description of the various methods used to measure server health.
I have several hundred websites that all use host headers in IIS. I would like to use a single virtual/public IP address and have the traffic manager select the appropriate pool based on the host header passed in. I've been using a TrafficScript rule similar to the code snippet below. Is there a more efficient way to code this? There will be several hundred pools and if-statements. Can you do case statements in TrafficScript?

$HostHeader = http.getHostHeader();
if( string.contains( $HostHeader, "site1.test.com" ) ) {
   pool.use( "Pool_site1.test.com_HTTP" );
} else if( string.contains( $HostHeader, "site2.test.com" ) ) {
   pool.use( "Pool_site2.test.com_HTTP" );
} else if( string.contains( $HostHeader, "site3.test.com" ) ) {
   pool.use( "Pool_site3.test.com_HTTP" );
} else if( string.contains( $HostHeader, "site4.test.com" ) ) {
   pool.use( "Pool_site4.test.com_HTTP" );
} else if( string.contains( $HostHeader, "site5.test.com" ) ) {
   pool.use( "Pool_site5.test.com_HTTP" );
} else if( string.contains( $HostHeader, "site6.test.com" ) ) {
   pool.use( "Pool_site6.test.com_HTTP" );
} else {
   http.changeSite( "http://www.test.com" );
}
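One way to avoid the long if/else chain, offered here as an editor's sketch rather than a tested answer from the thread: TrafficScript does not have a case/switch statement, but since the pool names above all follow the pattern "Pool_<hostname>_HTTP", the pool name can be computed directly from the host header. Note that passing a computed value to pool.use() requires the trafficscript!variable_pool_use global setting (mentioned in the testing notes of the MySQL proxy article later in this collection), and the guard against unrecognised host headers shown here is an assumption.

# Sketch: derive the pool name from the host header instead of
# enumerating hundreds of if-statements. Assumes every pool follows
# the "Pool_<hostname>_HTTP" naming convention, and that
# trafficscript!variable_pool_use is enabled.
$HostHeader = http.getHostHeader();
if( string.endsWith( $HostHeader, ".test.com" ) ) {
   pool.use( "Pool_" . $HostHeader . "_HTTP" );
} else {
   http.changeSite( "http://www.test.com" );
}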
Traffic Manager's Content Caching capability allows Traffic Manager to identify web page responses that are the same for each request and to remember ('cache') the content. The content may be 'static', such as a file on disk on the web server, or it may have been generated by an application running on the web server.

Why use Content Caching?

When another client asks for content that Traffic Manager has cached in its internal web cache, Traffic Manager can return the content directly to the client without having to forward the request to a back-end web server.

This has the effect of reducing the load on the back-end web servers, particularly if Traffic Manager has detected that it can cache content generated by complex applications which consume resources on the web server machine.

What are the pitfalls?

A content cache may store a document that should not be cached. Traffic Manager conforms to the caching recommendations of RFC 2616, which describe how web browsers and servers can specify cache behaviour. However, if a web server is misconfigured and does not provide the correct cache-control information, a TrafficScript or RuleBuilder rule can be used to override Traffic Manager's default caching logic.

A content cache may need a very large amount of memory to be effective. Depending on the spread of content for your service, and the proportion that is cacheable and frequently used compared to the long tail of less-used content, you may need a very large content cache to get the best possible hit rates. Traffic Manager allows you to specify precisely how much memory you wish to use for your cache, and to impose fine limits on the sizes of files to be cached and the duration that they should be cached for. Traffic Manager's 64-bit software overcomes the 2-4 GB limit of older solutions, and Traffic Manager can operate with a two-tier (in-memory and on-SSD) cache in situations where you need a very large cache and the cost of server memory is prohibitive.

How does it work?

Not all web content can be cached. Information in the HTTP request and the HTTP response drives Traffic Manager's decisions as to whether or not a request should be served from the web cache, and whether or not a response should be cached.

Requests

- Only HTTP GET and HEAD requests are cacheable; all other methods are not.
- The Cache-Control header in an HTTP request can force Traffic Manager to ignore the web cache and contact a back-end node instead.
- Requests that use HTTP basic-auth are not cacheable.

Responses

- The Cache-Control header in an HTTP response can indicate that the response should never be placed in the web cache. The header can also use the max-age value to specify how long the cached object may be cached for; this may cause a response to be cached for less than the configured webcache!time parameter.
- HTTP responses can use the Expires header to control how long to cache the response for. Note that the Expires header is less efficient than the max-age value in the Cache-Control response header.
- The Vary HTTP response header controls how variants of a resource are cached, and which variant is served from the cache in response to a new request.

If a web application wishes to prevent Traffic Manager from caching a response, it should add a 'Cache-Control: no-cache' header to the response.

Debugging Traffic Manager's Cache Behaviour

You can use the global setting webcache!verbose if you wish to debug your cache behaviour.
This setting is found in the Cache Settings section of the System > Global Settings page. If you enable this setting, Traffic Manager will add a header named 'X-Cache-Info' to the HTTP response to indicate how the cache policy has taken effect. You can inspect this header using Traffic Manager's access logging, or using the developer extensions in your web browser.

X-Cache-Info values

X-Cache-Info: cached
X-Cache-Info: caching
X-Cache-Info: not cacheable; request had a content length
X-Cache-Info: not cacheable; request wasn't a GET or HEAD
X-Cache-Info: not cacheable; request specified "Cache-Control: no-store"
X-Cache-Info: not cacheable; request contained Authorization header
X-Cache-Info: not cacheable; response had too large vary data
X-Cache-Info: not cacheable; response file size too large
X-Cache-Info: not cacheable; response code not cacheable
X-Cache-Info: not cacheable; response contains "Vary: *"
X-Cache-Info: not cacheable; response specified "Cache-Control: no-store"
X-Cache-Info: not cacheable; response specified "Cache-Control: private"
X-Cache-Info: not cacheable; response specified "Cache-Control: no-cache"
X-Cache-Info: not cacheable; response specified max-age <= 0
X-Cache-Info: not cacheable; response specified "Cache-Control: no-cache=..."
X-Cache-Info: not cacheable; response has already expired
X-Cache-Info: not cacheable; response is 302 without expiry time
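If you'd rather watch these decisions in the event log than in an access log or browser, a response rule along the following lines can surface the header. This is a minimal sketch, assuming webcache!verbose is enabled so that the header is present; http.getResponseHeader() and log.info() are standard TrafficScript functions.

# Sketch: record the cache decision for each response in the event log.
# Assumes the global setting webcache!verbose is enabled.
$info = http.getResponseHeader( "X-Cache-Info" );
if( $info ) {
   log.info( "Cache decision for " . http.getPath() . ": " . $info );
}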
Overriding Traffic Manager's default cache behaviour

Several TrafficScript and RuleBuilder cache control functions are available to control Traffic Manager's caching behaviour. In most cases, these functions eliminate the need to manipulate headers in the HTTP requests and responses.

http.cache.disable()

Invoking http.cache.disable() in a response rule prevents Traffic Manager from caching the response. The RuleBuilder 'Make response uncacheable' action has the same effect.

http.cache.enable()

Invoking http.cache.enable() in a response rule reverts the effect of a previous call to http.cache.disable(). It causes Traffic Manager's default caching logic to take effect. Note that it is possible to force Traffic Manager to cache a response that would normally be uncacheable by rewriting the headers of that response using TrafficScript or RuleBuilder (response rewriting occurs before cacheability testing).

http.cache.setkey()

The http.cache.setkey() function is used to differentiate between different versions of the same request, in much the same way that the Vary response header functions. It is used in request rules, but may also be used in response rules.

It is more flexible than the RFC 2616 Vary support, because it lets you partition requests on any calculated value - for example, different content based on whether the source address is internal or external, or whether the client's User-Agent header indicates an IE or Gecko-based browser. This capability is not available via RuleBuilder.

Simple control

http.cache.enable() and http.cache.disable() allow you to easily implement either a 'default on' or a 'default off' policy, where you either cache everything cacheable unless you explicitly disallow it, or only let Traffic Manager cache things you explicitly allow. For example, you may have identified a particular set of transactions out of a large working set that account for 90% of your web server usage, and you wish to cache just those requests, and not let less painful transactions knock them out of the cache.

Alternatively, you may be trying to cache a web-based application which is not HTTP-compliant, in that it does not properly mark up pages which are not cacheable, and caching them would break the application. In this scenario, you wish to enable caching only for particular code paths which you have tested not to break the application.

An example TrafficScript rule implementing a 'default off' policy might be:

# Only cache what we explicitly allow
http.cache.disable();

if( string.regexmatch( http.getPath(), "^/sales/(order|view).asp" ) ) {
   # These are our most painful pages for the DB, and are cacheable
   http.cache.enable();
}

RuleBuilder offers only the simple 'default on' policy, overridden either by the response headers or the 'Make response uncacheable' action.

Caching multiple resource versions for the same URL

Suppose that your web service returns different versions of your home page, depending on whether the client is coming from an internal network (10.0.0.0) or an external network. If you were to put a content cache in front of your web service, you would need to arrange for your web server to send a Cache-Control: no-cache header with each response so that the page is not cached. Use the following request rule to manipulate the request and set a 'cache key', so that Traffic Manager caches the two different versions of your page:

# We're only concerned about the home page...
if( http.getPath() != "/" ) break;

# Set the cache key depending on where the client is located
$ip = request.getRemoteIP();
if( string.ipmaskmatch( $ip, "10.0.0.0/8" ) ) {
   http.cache.setkey( "internal" );
} else {
   http.cache.setkey( "external" );
}

# Remove the Cache-Control response header - it's no longer needed!
http.removeResponseHeader( "Cache-Control" );

Forcing pages to be cached

You may have an application, say a JSP page, that says it is not cacheable, but you know that under certain circumstances it is, and you want to force Traffic Manager to cache this page because it is a heavy consumer of resources on the web server.

You can force Traffic Manager to cache such pages by rewriting their response headers; any TrafficScript rewrites happen before the content caching logic is invoked, so you can perform extremely fine-grained caching control by manipulating the HTTP response headers of pages you wish to cache.

In this example, we have a JSP page that sets a 'Cache-Control: no-cache' header, which prevents Stingray from caching the page. We can make this response cacheable by removing the Cache-Control header (and potentially the Expires header as well), for example:

if( http.getPath() == "/testpage.jsp" ) {
   # We know this request is cacheable; remove the 'Cache-Control: no-cache'
   http.removeResponseHeader( "Cache-Control" );
}

Granular cache timeouts

For extra control, you may wish instead to use the http.setResponseHeader() function to set a Cache-Control header with a max-age= parameter to specify exactly how long this particular piece of content should be cached for, or to add a Vary header to specify which parts of the input request this response depends on (e.g. user language, or a cookie). You can use these methods to set cache parameters on entire sets of URLs (e.g. all *.jsp) or on individual requests, for maximum flexibility. The RuleBuilder 'Set Response Cache Time' action has the same effect.
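As a hedged illustration of the max-age approach just described (the *.jsp pattern and the 10-second lifetime are illustrative assumptions, not values from the original article):

# Sketch: give all .jsp responses an explicit 10-second cache lifetime.
# The extension and the lifetime are illustrative assumptions.
if( string.endsWith( http.getPath(), ".jsp" ) ) {
   http.setResponseHeader( "Cache-Control", "max-age=10" );
}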
Read more

Stingray Product Documentation
Cache your website - just for one second?
Managing consistent caches across a Stingray Cluster
The following code uses Stingray's RESTful API to enable or disable a specific Virtual Server. The code is written in Perl. The program checks whether the Virtual Server "test vs" is enabled; if it is, it disables it, and if it is disabled, it enables it. A GET is done to retrieve the configuration data for the Virtual Server, and the "enabled" value in the "basic" properties section is checked. This is a boolean value, so if it is true it is set to false, and if it is false it is set to true. The changed data is then sent back to the server using a PUT.

startstopvs.pl

#!/usr/bin/perl
use REST::Client;
use MIME::Base64;
use JSON;
use URI::Escape;

# Since Stingray is using a self-signed certificate we don't need to verify it
$ENV{'PERL_LWP_SSL_VERIFY_HOSTNAME'} = 0;

my $vs = "test vs";

# Because there is a space in the virtual server name it must be escaped
my $url = "/api/tm/1.0/config/active/vservers/" . uri_escape($vs);

# Set up the connection
my $client = REST::Client->new();
$client->setHost("https://stingray.example.com:9070");
$client->addHeader("Authorization", "Basic " . encode_base64("admin:admin"));

# Get configuration data for the virtual server
$client->GET($url);

# Decode the JSON response. The result will be a hash
my $vsConfig = decode_json $client->responseContent();

if ($client->responseCode() == 200) {
    if ($vsConfig->{properties}->{basic}->{enabled}) {
        # The virtual server is enabled, disable it. We only need to send the data
        # that we are changing, so create a new hash with just this data.
        %newVSConfig = (properties => { basic => { enabled => JSON::false}});
        print "$vs is Enabled. Disable it.\n";
    } else {
        # The virtual server is disabled, enable it.
        %newVSConfig = (properties => { basic => { enabled => JSON::true}});
        print "$vs is Disabled. Enable it.\n";
    }
    $client->addHeader("Content-Type", "application/json");
    $client->PUT($url, encode_json(\%newVSConfig));
    $vsConfig = decode_json $client->responseContent();
    if ($client->responseCode() != 200) {
        print "Error putting virtual server config. status=" . $client->responseCode() .
              " Id=" . $vsConfig->{error_id} . ": " . $vsConfig->{error_text} . "\n";
    }
} else {
    print "Error getting virtual server config. status=" . $client->responseCode() .
          " Id=" . $vsConfig->{error_id} . ": " . $vsConfig->{error_text} . "\n";
}

Running the example

This code was tested with Perl 5.14.2 and version 249 of the REST::Client module. Run the Perl script as follows:

$ ./startstopvs.pl
test vs is Enabled. Disable it.

Notes

This program sends only the 'enabled' value to the server, by creating a new hash with just this value in the 'basic' properties section. Alternatively, the entire Virtual Server configuration could have been returned to the server with just the enabled value changed. Sending just the data that has changed reduces the chances of overwriting another user's changes if multiple programs are concurrently accessing the RESTful API.

Read More

The REST API Guide in the Product Documentation
Feature Brief: Traffic Manager's RESTful Control API
Collected Tech Tips: Using the RESTful Control API
This document provides step-by-step instructions on how to set up Brocade Virtual Traffic Manager for Microsoft Exchange 2013.
This short article explains how you can match the IP addresses of remote clients against a DNS blacklist. In this example, we'll use the Spamhaus XBL blacklist service (http://www.spamhaus.org/xbl/).

This article was updated following discussion and feedback from Ulrich Babiak - thanks!

Basic principles

The basic principle of a DNS-based blacklist such as Spamhaus' is as follows:

- Perform a reverse DNS lookup of the IP address in question, using xbl.spamhaus.org rather than the traditional in-addr.arpa domain.
- Entries that are not in the blacklist don't return a response (NXDOMAIN); entries that are in the blacklist return a particular IP/domain response indicating their status.

Important note: some public DNS servers don't respond to spamhaus.org lookups (see http://www.spamhaus.org/faq/section/DNSBL%20Usage#261). Ensure that Traffic Manager is configured to use a working DNS server.

Simple implementation

A simple implementation is as follows:

$ip = request.getRemoteIP();

# Reverse the IP, and append ".xbl.spamhaus.org"
$bytes = string.dottedToBytes( $ip );
$bytes = string.reverse( $bytes );
$query = string.bytesToDotted( $bytes ) . ".xbl.spamhaus.org";

if( $res = net.dns.resolveHost( $query ) ) {
   log.warn( "Connection from IP " . $ip . " should be blocked - status: " . $res );
   # Refer to the Zen return codes at http://www.spamhaus.org/zen/
}

This implementation will issue a DNS request on every request, but Traffic Manager caches DNS responses internally, so there's little risk that you will overload the target DNS server with duplicate requests (see the Traffic Manager DNS settings in the Global configuration).

You may wish to increase the dns!negative_expiry setting, because DNS lookups against non-blacklisted IP addresses will 'fail'.

A more sophisticated implementation may interpret the response codes and decide to block requests from proxies (the Spamhaus XBL list), while ignoring requests from known spam sources.

What if my DNS server is slow, or fails? What if I want to use a different resolver for the blacklist lookups?

One undesired consequence of this configuration is that it makes the DNS server a single point of failure and a performance bottleneck. Each unrecognised (or expired) IP address needs to be matched against the DNS server, and the connection is blocked while this happens. In normal usage, a single delay of 100ms or so against the very first request is acceptable, but a DNS failure (Stingray times out after 12 seconds by default) or slowdown is more serious.

In addition, Traffic Manager uses a single system-wide resolver for all DNS operations. If you are hosting a local cache of the blacklist, you'd want to separate DNS traffic accordingly.

Use Traffic Manager to manage the DNS traffic?

A potential solution would be to configure Traffic Manager to use itself (127.0.0.1) as a DNS resolver, and create a virtual server/pool listening on UDP:53. All locally-generated DNS requests would be delivered to that virtual server, which would then forward them to the real DNS server. The virtual server could inspect the DNS traffic and route blacklist lookups to the local cache, and other requests to a real DNS server.

You could then use a health monitor (such as the included dns.pl) to check the operation of the real DNS server and mark it as down if it has failed or times out after a short period.
In that event, the virtual server can determine that the pool is down ( pool.activenodes() == 0 ) and respond directly to the DNS request using a response generated by HowTo: Respond directly to DNS requests using libDNS.rts.

Re-implement the resolver

An alternative is to re-implement the TrafficScript resolver using Matthew Geldert's libDNS.rts (Interrogating and managing DNS traffic in Traffic Manager) TrafficScript library to construct the queries and analyse the responses. Then you can use TrafficScript's TCP functions to submit your DNS lookups to the local cache (unfortunately, we've not got a udp.send function yet!):

sub resolveHost( $host, $resolver ) {
   import libDNS.rts as dns;

   $packet = dns.newDnsObject();
   $packet = dns.setQuestion( $packet, $host, "A", "IN" );
   $data = dns.convertObjectToRawData( $packet, "tcp" );

   $sock = tcp.connect( $resolver, 53, 1000 );
   tcp.write( $sock, $data, 1000 );
   $rdata = tcp.read( $sock, 1024, 1000 );
   tcp.close( $sock );

   $resp = dns.convertRawDatatoObject( $rdata, "tcp" );

   if( $resp["answercount"] >= 1 ) return $resp["answer"][0]["host"];
}

Note that we're applying 1000ms timeouts to each network operation.

Let's try this, and compare the responses from OpenDNS and from Google's DNS servers. Our 'bad guy' is 201.116.241.246, so we're going to resolve 246.241.116.201.xbl.spamhaus.org:

$badguy = "246.241.116.201.xbl.spamhaus.org";

$text .= "Trying OpenDNS...\n";
$host = resolveHost( $badguy, "208.67.222.222" );
if( $host ) {
   $text .= $badguy . " resolved to " . $host . "\n";
} else {
   $text .= $badguy . " did not resolve\n";
}

$text .= "Trying Google...\n";
$host = resolveHost( $badguy, "8.8.8.8" );
if( $host ) {
   $text .= $badguy . " resolved to " . $host . "\n";
} else {
   $text .= $badguy . " did not resolve\n";
}

http.sendResponse( 200, "text/plain", $text, "" );

(This is just a snippet - remember to paste the resolveHost() implementation, and anything else you need, in here.)

This illustrates that OpenDNS resolves the spamhaus.org domain fine, and Google does not issue a response.

Caching the responses

This approach has one disadvantage: because it does not use Traffic Manager's resolver, it does not cache the responses, so you'll hit the resolver on every request unless you cache the responses yourself.

Here's a function that calls the resolveHost() function above and caches the result locally for 3600 seconds. It returns 'B' for a bad guy, and 'G' for a good guy:

sub getStatus( $ip, $resolver ) {
   $key = "xbl-spamhaus-org-" . $resolver . "-" . $ip; # Any key prefix will do

   $cache = data.get( $key );
   if( $cache ) {
      $status = string.left( $cache, 1 );
      $expiry = string.skip( $cache, 1 );

      if( $expiry < sys.time() ) {
         data.remove( $key );
         $status = "";
      }
   }

   if( !$status ) {
      # We don't have a (valid) entry in our cache, so look the IP up

      # Reverse the IP, and append ".xbl.spamhaus.org"
      $bytes = string.dottedToBytes( $ip );
      $bytes = string.reverse( $bytes );
      $query = string.bytesToDotted( $bytes ) . ".xbl.spamhaus.org";

      $host = resolveHost( $query, $resolver );

      if( $host ) {
         $status = "B";
      } else {
         $status = "G";
      }
      data.set( $key, $status . (sys.time()+3600) );
   }
   return $status;
}
The Pulse Virtual Traffic Manager Kernel Modules may be installed on a supported Linux system to enable advanced networking functionality: Multi-Hosted Traffic IP Addresses.

Notes:

- Earlier versions of this package contained two modules: ztrans (for IP Transparency) and zcluster (for Multi-Hosted Traffic IP Addresses). The Pulse Virtual Traffic Manager software has supported IP Transparency without requiring the ztrans kernel module since version 10.1, and the attached version of the Kernel Modules package only contains the zcluster module.
- The Kernel Module is pre-installed in Pulse Secure Virtual Traffic Manager Appliances, and in Cloud images where applicable.
- The Kernel Modules are not available for Solaris.

The Multi-hosted IP Module (zcluster)

The Multi-hosted IP Module allows a set of clustered Traffic Managers to share the same IP address. The module manipulates ARP requests to deliver connections to a multicast group that the machines in the cluster subscribe to. Responsibility for processing data is distributed across the cluster so that all machines process an equal share of the load. Refer to the User Manual (Pulse Virtual Traffic Manager Product Documentation) for details of how to configure multi-hosted Traffic IP addresses. zcluster is supported for kernel versions up to and including version 5.2.

Installation

Prerequisites

Your build machine must have the kernel header files and appropriate build tools to build kernel modules. You may build the modules on one machine and copy them to an identical machine if you wish to avoid installing build tools and kernel headers on your production traffic manager.

Installation

Unpack the kernel modules tarball, and cd into the directory created:

# tar -xzf pulse_vtm_modules_installer-2.14.tgz
# cd pulse_vtm_modules_installer-2.14

Review the README within for late-breaking news and to confirm kernel version compatibility.

As root, run the installation script install_modules.pl to install the zcluster module:

# ./install_modules.pl

If installation is successful, restart the vTM software:

# $ZEUSHOME/restart-zeus

If the installation fails, please refer to the error message given, and to the distribution-specific guidelines you will find in the README file inside the pulse_vtm_modules_installer package.

Kernel Upgrades

If you upgrade your kernel, you will need to re-run the install_modules.pl script to re-install the modules after the kernel upgrade is completed.

Latest Packages

Packages for the kernel modules are now available via the normal Pulse Virtual Traffic Manager download service.
When you need to scale out your MySQL database, replication is a good way to proceed. Database writes (UPDATEs) go to a 'master' server and are replicated across a set of 'slave' servers. Reads (SELECTs) are load-balanced across the slaves.

Overview

MySQL's replication documentation describes how to configure replication: MySQL Replication

A quick solution...

If you can modify your MySQL client application to direct 'write' (i.e. UPDATE) connections to one IP address/port and 'read' (i.e. SELECT) connections to another, then this problem is trivial to solve. This generally needs a code update (see Using Replication for Scale-Out).

You will need to direct the 'write' connections to the master database (or through a dedicated Traffic Manager virtual server), and direct the 'read' connections to a Traffic Manager virtual server (in 'generic server first' mode) that load-balances the connections across the pool of MySQL slave servers using the 'least connections' load-balancing method.

[Diagram: Routing connections from the application]

However, in most cases, you probably don't have that degree of control over how your client application issues MySQL connections; all connections are directed to a single IP:port. A load balancer will need to discriminate between different connection types and route them accordingly.

Routing MySQL traffic

A MySQL database connection is authenticated by a username and password. In most database designs, multiple users with different access rights are used; less privileged user accounts can only read data (issuing SELECT statements), and more privileged users can also perform updates (issuing UPDATE statements). A well-architected application with sound security boundaries will take advantage of these multiple user accounts, using the account with least privilege to perform each operation. This reduces the opportunities for attacks like SQL injection to subvert database transactions and perform undesired updates.

This article describes how to use Traffic Manager to inspect and manage MySQL connections, routing connections authenticated with privileged users to the master database and load-balancing other connections to the slaves.

[Diagram: Load-balancing MySQL connections]

Designing a MySQL proxy

Stingray Traffic Manager functions as an application-level (layer-7) proxy. Most protocols are relatively easy for layer-7 proxies like Traffic Manager to inspect and load-balance, and work 'out of the box' or with relatively little configuration. For more information, refer to the article Server First, Client First and Generic Streaming Protocols.

Proxying MySQL connections

MySQL is much more complicated to proxy and load-balance. When a MySQL client connects, the server immediately responds with a randomly generated challenge string (the 'salt'). The client then authenticates itself by responding with the username for the connection and a copy of the 'salt' encrypted using the corresponding password.

[Diagram: Connect and Authenticate in MySQL]

If the proxy is to route and load-balance based on the username in the connection, it needs to correctly authenticate the client connection first. When it finally connects to the chosen MySQL server, it will then have to re-authenticate the connection with the back-end server using a different salt.
Implementing a MySQL proxy in TrafficScript

In this example, we're going to proxy MySQL connections from two users - 'mysqlmaster' and 'mysqlslave' - directing connections to the 'SQL Master' and 'SQL Slaves' pools as appropriate.

The proxy is implemented using two TrafficScript rules ('mysql-request' and 'mysql-response') on a 'server-first' virtual server listening on port 3306 for MySQL client connections. Together, the rules implement a simple state machine that mediates between the client and server. The state machine authenticates and inspects the client connection before deciding which pool to direct the connection to. The rules need to know the encrypted password and desired pool for each user. The virtual server should be configured to send traffic to the built-in 'discard' pool by default.

The request rule

Configure the following request rule on a 'server first' virtual server. Edit the values at the top to reflect the encrypted passwords (copied from the MySQL users table) and desired pools:

sub encpassword( $user ) {
   # From the mysql users table - double-SHA1 of the password
   # Do not include the leading '*' in the long 40-byte encoded password
   if( $user == "mysqlmaster" ) return "B17453F89631AE57EFC1B401AD1C7A59EFD547E5";
   if( $user == "mysqlslave" )  return "14521EA7B4C66AE94E6CFF753453F89631AE57EF";
}

sub pool( $user ) {
   if( $user == "mysqlmaster" ) return "SQL Master";
   if( $user == "mysqlslave" )  return "SQL Slaves";
}

$state = connection.data.get( "state" );

if( !$state ) {
   # First time in; we've just received a fresh connection
   $salt1 = randomBytes( 8 );
   $salt2 = randomBytes( 12 );
   connection.data.set( "salt", $salt1.$salt2 );

   $server_hs = "\0\0\0\0" .              # length - fill in below
      "\012" .                            # protocol version
      "Stingray Proxy v0.9\0" .           # server version
      "\01\0\0\0" .                       # thread 1
      $salt1."\0" .                       # salt(1)
      "\054\242" .                        # capabilities
      "\010\02\0" .                       # lang and status
      "\0\0\0\0\0\0\0\0\0\0\0\0\0" .      # unused
      $salt2."\0";                        # salt(2)

   $l = string.length( $server_hs )-4;    # will be <= 255
   $server_hs = string.replaceBytes( $server_hs, string.intToBytes( $l, 1 ), 0 );

   connection.data.set( "state", "wait for clienths" );
   request.sendResponse( $server_hs );
   break;
}

if( $state == "wait for clienths" ) {
   # We've received the client handshake.
   $chs = request.get( 1 );
   $chs_len = string.bytesToInt( $chs );
   $chs = request.get( $chs_len + 4 );

   # user starts at byte 36; password follows after
   $i = string.find( $chs, "\0", 36 );
   $user = string.subString( $chs, 36, $i-1 );
   $encpasswd = string.subString( $chs, $i+2, $i+21 );

   $passwd2 = string.hexDecode( encpassword( $user ) );
   $salt = connection.data.get( "salt" );
   $passwd1 = string_xor( $encpasswd, string.hashSHA1( $salt.$passwd2 ) );

   if( string.hashSHA1( $passwd1 ) != $passwd2 ) {
      log.warn( "User '" . $user . "': authentication failure" );
      connection.data.set( "state", "authentication failed" );
      connection.discard();
   }

   connection.data.set( "user", $user );
   connection.data.set( "passwd1", $passwd1 );
   connection.data.set( "clienths", $chs );
   connection.data.set( "state", "wait for serverhs" );
   request.set( "" );

   # Select pool based on user
   pool.select( pool( $user ) );
   break;
}

if( $state == "wait for client data" ) {
   # Write the client handshake we remembered from earlier to the server,
   # and piggyback the request we've just received on the end
   $req = request.get();
   $chs = connection.data.get( "clienths" );
   $passwd1 = connection.data.get( "passwd1" );
   $salt = connection.data.get( "salt" );

   $encpasswd = string_xor( $passwd1, string.hashSHA1( $salt . string.hashSHA1( $passwd1 ) ) );
   $i = string.find( $chs, "\0", 36 );
   $chs = string.replaceBytes( $chs, $encpasswd, $i+2 );

   connection.data.set( "state", "do authentication" );
   request.set( $chs.$req );
   break;
}

# Helper function
sub string_xor( $a, $b ) {
   $r = "";
   while( string.length( $a ) ) {
      $a1 = string.left( $a, 1 ); $a = string.skip( $a, 1 );
      $b1 = string.left( $b, 1 ); $b = string.skip( $b, 1 );
      $r = $r . chr( ord( $a1 ) ^ ord( $b1 ) );
   }
   return $r;
}

The response rule

Configure the following as a response rule, set to run every time, for the MySQL virtual server:

$state = connection.data.get( "state" );
$authok = "\07\0\0\2\0\0\0\02\0\0\0";

if( $state == "wait for serverhs" ) {
   # Read server handshake, remember the salt
   $shs = response.get( 1 );
   $shs_len = string.bytesToInt( $shs )+4;
   $shs = response.get( $shs_len );
   $salt1 = string.substring( $shs, $shs_len-40, $shs_len-33 );
   $salt2 = string.substring( $shs, $shs_len-13, $shs_len-2 );
   connection.data.set( "salt", $salt1.$salt2 );

   # Write an authentication confirmation now to provoke the client
   # to send us more data (the first query). This will prepare the
   # state machine to write the authentication to the server
   connection.data.set( "state", "wait for client data" );
   response.set( $authok );
   break;
}

if( $state == "do authentication" ) {
   # We're expecting two responses.
   # The first is the authentication confirmation, which we discard.
   $res = response.get();
   $res1 = string.left( $res, 11 );
   $res2 = string.skip( $res, 11 );

   if( $res1 != $authok ) {
      $user = connection.data.get( "user" );
      log.info( "Unexpected authentication failure for " . $user );
      connection.discard();
   }

   connection.data.set( "state", "complete" );
   response.set( $res2 );
   break;
}

Testing your configuration

If you have several MySQL databases to test against, testing this configuration is straightforward. Edit the request rule to add the correct passwords and pools, and use the mysql command-line client to make connections:

$ mysql -h zeus -u username -p
Enter password: *******

Check the 'current connections' list in the Traffic Manager UI to see how it has connected each session to a back-end database server.

If you encounter problems, try the following steps:

- Ensure that trafficscript!variable_pool_use is set to 'Yes' on the Global Settings page of the UI. This setting allows you to use non-literal values in the pool.use() and pool.select() TrafficScript functions.
- Turn on the log!client_connection_failures and log!server_connection_failures settings in the Virtual Server > Connection Management configuration page; these settings configure the traffic manager to write detailed debug messages to the Event Log whenever a connection fails.
Then review your Traffic Manager event log and your MySQL logs in the event of an error.

Traffic Manager's access logging can be used to record every connection. You can use the special *{name}d log macro to record information stored using connection.data.set(), such as the username used in each connection.

Conclusion

This article has demonstrated how to build a fairly sophisticated protocol parser, where the Traffic Manager-based proxy performs full authentication and inspection before making a load-balancing decision. The protocol parser then performs the authentication again against the chosen back-end server. Once the client-side and server-side handshakes are complete, Traffic Manager simply forwards data back and forth between the client and the server.

This example addresses the problem of scaling out your MySQL database, giving load-balancing and redundancy for database reads (SELECTs). It does not address the problem of scaling out your master 'write' server - you need to address that by investing in a sufficiently powerful server, architecting your database and application to minimise the number and impact of write operations, or by selecting a full clustering solution.

The solution leaves a single point of failure, in the form of the master database. This problem could be effectively dealt with by creating a monitor that tests the master database for correct operation. If it detects a failure, the monitor could promote one of the slave databases to master status and reconfigure the 'SQL Master' pool to direct write (UPDATE) traffic to the new MySQL master server.

Acknowledgements

Ian Redfern's MySQL protocol description was invaluable in developing the proxy code.

Appendix - Password Problems?

This example assumes that you are using MySQL 4.1.x or later (it was tested with MySQL 5 clients and servers), and that your database has passwords in the 'long' 41-byte MySQL 4.1 (and later) format (see http://dev.mysql.com/doc/refman/5.0/en/password-hashing.html).

If you upgrade a pre-4.1 MySQL database to 4.1 or later, your passwords will remain in the pre-4.1 'short' format. You can verify what password format your MySQL database is using as follows:

mysql> select password from mysql.user where user='username';
+------------------+
| password         |
+------------------+
| 6a4ba5f42d7d4f51 |
+------------------+
1 row in set (0.00 sec)

mysql> update mysql.user set password=PASSWORD('password') where user='username';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select password from mysql.user where user='username';
+-------------------------------------------+
| password                                  |
+-------------------------------------------+
| *14521EA7B4C66AE94E6CFF753453F89631AE57EF |
+-------------------------------------------+
1 row in set (0.00 sec)

If you can't create 'long' passwords, your database may be stuck in 'short' password mode. Run the following command to resize the password table if necessary:

$ mysql_fix_privilege_tables --password=admin password

Check that 'old_passwords' is not set to '1' (see here) in your my.cnf configuration file. Check that the mysqld process isn't running with the --old-passwords option. Finally, ensure that the privileges you have configured apply to connections from the Stingray proxy. You may need to GRANT ... TO 'user'@'%', for example.
Request rule

The request rule below captures the start time of each request and stores it in a connection data value called "start":

$tm = sys.time.highres();
# Don't store $tm directly, use sprintf to preserve precision
connection.data.set( "start", string.sprintf( "%f", $tm ) );

Response rule

The following response rule then tests each response against a threshold, set here to 6 seconds. A log entry is written to the event log for each response that takes longer than the threshold to complete. Each log entry shows the response time in seconds, the back-end node used and the full URI of the request:

$THRESHOLD = 6;   # Response time in (integer) seconds above
                  # which requests are logged.

$start = connection.data.get( "start" );
$now = sys.time.highres();
$diff = ( $now - $start );

if( $diff > $THRESHOLD ) {
   $uri = http.getRawURL();
   $node = connection.getNode();
   log.info( "SLOW REQUEST (" . $diff . "s) " . $node . ":" . $uri );
}

The information in the event log is useful for identifying patterns in slow connections. For example, it might be that all log entries relate to RSS connections, indicating that there might be a problem with the RSS content.

Read more

Collected Tech Tips: TrafficScript examples
Top examples of Pulse vADC in action

Examples of how Pulse vADC (formerly SteelApp) can be deployed to address a range of application delivery challenges.

Modifying Content

- Simple web page changes - updating a copyright date
- Adding meta-tags to a website with Traffic Manager
- Tracking user activity with Google Analytics
- Google Analytics revisited
- Embedding RSS data into web content using Traffic Manager
- Add a Countdown Timer
- Using TrafficScript to add a Twitter feed to your web site
- Embedded Twitter Timeline
- Embedded Google Maps
- Watermarking PDF documents with Traffic Manager and Java Extensions
- Watermarking Images with Traffic Manager and Java Extensions
- Watermarking web content with Pulse vADC and TrafficScript

Prioritizing Traffic

- Evaluating and Prioritizing Traffic with Traffic Manager
- HowTo: Control Bandwidth Management
- Detecting and Managing Abusive Referers
- Using Pulse vADC to Catch Spiders
- Dynamic rate shaping slow applications
- Stop hot-linking and bandwidth theft!
- Slowing down busy users - driving the REST API from TrafficScript

Performance Optimization

- Cache your website - just for one second?
- HowTo: Monitor the response time of slow services
- HowTo: Use low-bandwidth content during periods of high load

Fixing Application Problems

- No more 404 Not Found...?
- Hiding Application Errors
- Sending custom error pages

Compliance Problems

- Satisfying EU cookie regulations using The cookiesDirective.js and TrafficScript

Security Problems

- The "Contact Us" attack against mail servers
- Protecting against Java and PHP floating point bugs
- Managing DDoS attacks with Traffic Manager
- Enhanced anti-DDoS using TrafficScript, Event Handlers and iptables
- How to stop 'login abuse', using TrafficScript
- Bind9 Exploit in the Wild...
- Protecting against the range header denial-of-service in Apache HTTPD
- Checking IP addresses against a DNS blacklist with Traffic Manager
- Heartbleed: Using TrafficScript to detect TLS heartbeat records
- TrafficScript rule to protect against "Shellshock" bash vulnerability (CVE-2014-6271)
- SAML 2.0 Protocol Validation with TrafficScript
- Disabling SSL v3.0 for SteelApp

Infrastructure

- Transparent Load Balancing with Traffic Manager
- HowTo: Launch a website at 5am
- Using Stingray Traffic Manager as a Forward Proxy
- Tunnelling multiple protocols through the same port
- AutoScaling Docker applications with Traffic Manager
- Elastic Application Delivery - Demo
- How to deploy Traffic Manager Cluster in AWS VPC

Other Solutions

- Building a load-balancing MySQL proxy with TrafficScript
- Serving Web Content from Traffic Manager using Python
- Serving Web Content from Traffic Manager using Java
- Virtual Hosting FTP services
- Managing WebSockets traffic with Traffic Manager
- TrafficScript can Tweet Too
- Instrument web content with Traffic Manager
- Antivirus Protection for Web Applications
- Generating Mandelbrot sets using TrafficScript
- Content Optimization across Equatorial Boundaries
With more services being delivered through a browser, it's safe to say web applications are here to stay. The rapid growth of web-enabled applications and an increasing number of client devices mean that organizations are dealing with more document transfer methods than ever before. Providing easy access to these applications (web mail, intranet portals, document storage, etc.) can expose vulnerable points in the network.

When it comes to security and protection, application owners typically cover the common threats and vulnerabilities. What is often overlooked happens to be one of the first things we learned about the internet: virus protection. Some application owners consider the response "we have virus scanners running on the servers" sufficient. These same owners implement security plans that involve extending protection as far as possible, but surprisingly allow a virus to be sent several layers deep into the architecture.

Pulse vADC can extend protection for your applications with unmatched software flexibility and scale. Utilize existing investments by installing Pulse vADC on your infrastructure (Linux, Solaris, VMware, Hyper-V, etc.) and integrating with existing antivirus scanners. Deploy Pulse vADC (available with many providers: Amazon, Azure, CoSentry, Datapipe, Firehost, GoGrid, Joyent, Layered Tech, Liquidweb, Logicworks, Rackspace, Sungard, Xerox, and many others) and externally proxy your applications to remove threats before they are in your infrastructure. Additionally, when serving as a forward proxy for clients, Pulse vADC can be used to mitigate virus propagation by scanning outbound content.

The Pulse Web Application Firewall ICAP Client Handler provides the possibility to integrate with an ICAP server. ICAP (Internet Content Adaptation Protocol) is a protocol aimed at providing simple object-based content vectoring for HTTP services. The Web Application Firewall acts as an ICAP client and passes requests to a specified ICAP server. This enables you to integrate with third-party products based on the ICAP protocol. In particular, you can use the ICAP Client Handler as a virus-scanner interface for scanning uploads to your web application.

Example Deployment

This deployment uses version 9.7 of the Pulse Traffic Manager with the open source applications ClamAV and c-icap installed locally. If utilizing a cluster of Traffic Managers, this deployment should be performed on all nodes of the cluster. Additionally, Traffic Manager could be utilized as an ADC to extend availability and performance across multiple external ICAP application servers. I would also like to credit Thomas Masso, Jim Young, and Brian Gautreau - thank you for your assistance!

"ClamAV is an open source (GPL) antivirus engine designed for detecting Trojans, viruses, malware and other malicious threats." - http://www.clamav.net/

"c-icap is an implementation of an ICAP server. It can be used with HTTP proxies that support the ICAP protocol to implement content adaptation and filtering services." - The c-icap project

Installation of ClamAV, c-icap, and libc-icap-mod-clamav

For this example, public repositories are used to install the packages on version 9.7 of the Traffic Manager virtual appliance with the default configuration. To install in a different manner or operating system, consult the ClamAV and c-icap documentation.
Run the following command (copy and paste) to back up the sources.list file:

cp /etc/apt/sources.list /etc/apt/sources.list.rvbdbackup

Run the following commands to update the sources.list file. (Tested with Traffic Manager virtual appliance version 9.7. For other Ubuntu releases, replace 'precise' with the installed release codename; run "lsb_release -sc" to find out your release.)

cat <<EOF >> /etc/apt/sources.list
deb http://ch.archive.ubuntu.com/ubuntu/ precise main restricted
deb-src http://ch.archive.ubuntu.com/ubuntu/ precise main restricted
deb http://us.archive.ubuntu.com/ubuntu/ precise universe
deb-src http://us.archive.ubuntu.com/ubuntu/ precise universe
deb http://us.archive.ubuntu.com/ubuntu/ precise-updates universe
deb-src http://us.archive.ubuntu.com/ubuntu/ precise-updates universe
EOF

Run the following command to retrieve the updated package lists:

apt-get update

Run the following command to install ClamAV, c-icap, and libc-icap-mod-clamav:

apt-get install clamav c-icap libc-icap-mod-clamav

Run the following command to restore your sources.list:

cp /etc/apt/sources.list.rvbdbackup /etc/apt/sources.list

Configure the c-icap ClamAV service

Run the following commands to add lines to /etc/c-icap/c-icap.conf:

cat <<EOF >> /etc/c-icap/c-icap.conf
Service clamav srv_clamav.so
ServiceAlias avscan srv_clamav?allow204=on&sizelimit=off&mode=simple
srv_clamav.ScanFileTypes DATA EXECUTABLE ARCHIVE GIF JPEG MSOFFICE
srv_clamav.MaxObjectSize 100M
EOF

Consult the ClamAV and c-icap documentation and customize the configuration and settings for ClamAV and c-icap (i.e. definition updates, ScanFileTypes, restricting c-icap access, etc.) for your deployment.

Just for fun, run the following command to manually update the ClamAV database:

/usr/bin/freshclam

Configure the ICAP Server to Start

This process can be completed a few different ways; for this example we are going to use the Event Alerting functionality of Traffic Manager to start the c-icap server when the Web Application Firewall is started.

1. Save the following bash script (for this example, start_icap.sh) on your computer:

#!/bin/bash
/usr/bin/c-icap
#END

2. Upload the script via the Traffic Manager UI under Catalogs > Extra Files > Action Programs (see Figure 1).

3. Create a new event type (for this example, named "Firewall Started") under System > Alerting > Manage Event Types. Select "appfirewallcontrolstarted: Application firewall started" and click Update to save (see Figure 2).

4. Create a new action (for this example, named "Start ICAP") under System > Alerting > Manage Actions. Select the "Program" radio button and click "Add Action" to save (see Figure 3).

5. Configure the "Start ICAP" action program to use the start_icap.sh script, and for this example adjust the timeout setting to 300. Click Update to save (see Figure 4).

6. Configure the alert mapping under System > Alerting to use the event type and action previously created. Click Update to save your changes (see Figure 5).

7. Restart the Application Firewall or reboot to automatically start the c-icap server. Alternatively, you can run the /usr/bin/c-icap command from the console, or select "Update and Test" under the "Start ICAP" alert configuration page of the UI to manually start c-icap.

Configure the Web Application Firewall

Within the Web Application Firewall UI, add and configure the ICAPClientHandler using the following attributes and values:
icap_server_location - 127.0.0.1
icap_server_resource - /avscan

Testing Notes

- Check the WAF application logs. Use full logging for the application configuration and enable_logging for the ICAPClientHandler. As with any system, use full logging with caution - the logs can fill fast!
- Check the c-icap logs (cat /var/log/c-icap/access.log and server.log). Note: changing the /etc/c-icap/c-icap.conf "DebugLevel" value to 9 is useful for testing and recording to /var/log/c-icap/server.log. You may want to change this back to 1 when you are done testing.
- The action settings page in the Traffic Manager UI (for this example, Alerting > Actions > Start ICAP) also provides an "Update and Test" button that allows you to trigger the action and start the c-icap server.
- Enable verbose logging for the "Start ICAP" action in the Traffic Manager for more information from the event mechanism. You may want to disable this setting when you are done testing.

Additional Information

Pulse Secure Virtual Traffic Manager
Pulse Secure Virtual Web Application Firewall
Product Documentation
RFC 3507 - Internet Content Adaptation Protocol (ICAP)
The c-icap project
Clam AntiVirus
We spend a great deal of time focusing on how to speed up customers' web services. We constantly research new techniques to load-balance traffic, optimise network connections and improve the performance of overloaded application servers. The techniques and options available from us (and yes, from our competitors too!) may seem bewildering at times. So I would like to spend a short time singing the praises of one specific feature, which I can confidently say will improve your website's performance above all others: caching your website.

"But my website is uncacheable! It's full of dynamic, changing pages. Caching is useless to me!"

We'll answer that objection soon, but first, it is worth a quick explanation of the two main styles of caching:

Client-side caching

Most people's experience of a web cache is in their web browser. Internet Explorer or Firefox will store copies of web pages on your hard drive, so if you visit a site again, the browser can load the content from disk instead of over the Internet.

There's another layer of caching going on, though. Your ISP may also be doing some caching. The ISP wants to save money on its bandwidth, and so puts a big web cache in front of everyone's Internet access. The cache keeps copies of the most-visited web pages, storing bits of many different websites. A popular and widely used open-source web cache is Squid.

However, not all web pages are cacheable near the client. Websites have dynamic content, so for example any web page containing personalized or changing information will not be stored in your ISP's cache. Generally the cache will fill up with 'static' content such as images, movies, etc. These get stored for hours or days. For your ISP, this is great, as these big files take up most of their precious bandwidth.

For someone running their own website, browser caching or ISP caching does not do much. They might save a little bandwidth from ISP image caching if they have lots of visitors from the same ISP, but the bulk of the website, including most of the content generated by their application servers, will not be cached, and their servers will still have lots of work to do.

Server-side caching (with Traffic Manager)

Here, the main aim is not to save bandwidth, but to accelerate your website. The traffic manager sits in your datacenter (or at your cloud provider), in front of your web and application servers. Access to your website is through the Traffic Manager software, so it sees both the requests and responses. Traffic Manager can then start to answer these requests itself, delivering cached responses. Your servers then have less work to do. Less work = faster responses = fewer servers needed = saves money!

"But I told you - my website isn't cacheable!"

There's a reason why your website is marked uncacheable. Remember the ISP caches...? They mustn't store your changing, constantly updating web pages. To enforce this, application servers send back instructions with every web page, the Cache-Control HTTP header, saying "don't cache this". Traffic Manager obeys these cache instructions too, because it's well-behaved.

But think - how often does your website really change? Take a very busy site, for example a popular news site. Its front page may be labelled as uncacheable so that visitors always see the latest news, since it changes as new stories are added. But new additions aren't happening every second of the day. What if the page was marked as cacheable - for just one second?
Visitors would still see the most up-to-date news, but the load on the site servers would plummet. Even if the website had as few as ten views in a second, this simple change would reduce the load on the app servers ten-fold.

This isn't an isolated example - there are plenty of others: think Twitter searches, auction listings, "live" graphing, and so on. All such content can be cached briefly without any noticeable change to the "liveness" of the site. Traffic Manager can deliver a cached version of your web page much faster than your application servers - not just because it is highly optimized, but because sending a cached copy of a page is so much less work than generating it from scratch.

So if this simple cache change is so great, why don't people use this technique more - surely app servers can mark their web pages as cacheable for one or two seconds without Traffic Manager's help, and those browser/ISP caches can then do the magic? Well, the browser caches aren't going to be any use - an individual isn't going to be viewing the same page on your website multiple times a second (and if they keep hitting the reload button, their page requests are not cacheable anyway). So how about those big ISP caches? Unfortunately, they aren't always clever enough either. Some see a web page marked as cacheable for a short time and will either:

Not cache it at all (it's going to expire soon, what's the point in keeping it?) or will cache it for much longer (if it is cacheable for 3 seconds, why not cache it for 300, right?)

Also, by leaving the caching to the client side, the cache hit rate gets worse. A user in France isn't going to be able to make use of a cached copy of your site stored in a US ISP's cache, for instance.

If you use Traffic Manager to do the caching, these issues can be solved. First, the cache is held in one place - your datacenter - so it is available to all visitors. Second, Traffic Manager can tweak the cache instructions for the page, so it caches the page while forcing other people not to. Here is what's going on:

Request arrives at Traffic Manager, which sends it on to your application server. App server sends the web page response back to the traffic manager. The page has a Cache-Control: no-cache header, since the app server thinks the page can't be cached. TrafficScript response rule identifies the page as one that can be cached, for a short time. It changes the cache instructions to Cache-Control: max-age=3, meaning that the page can now be cached for three seconds. Traffic Manager's web cache stores the page. Traffic Manager sends out the response to the user (and to anyone else for the next three seconds), but changes the cache instructions to Cache-Control: no-cache, to ensure downstream caches, ISP caches and web browsers do not try to cache the page further.

Result: a much faster web site, yet it still serves dynamic and constantly updating pages to viewers. Give it a try - you will be amazed at the performance improvements possible, even when caching for just a second. Remember, almost anything can be cached if you configure your servers correctly!

How to set up Traffic Manager

On the admin server, edit the virtual server that you want to cache, and click on the "Content Caching" link. Enable the cache. There are options here for the default cache time for pages. These can be changed as desired, but are primarily for the "ordinary" content that is cacheable normally, such as images.
The "webcache!control_out" setting allows you to change the Cache-Control header for your pages after they have been cached by the Traffic Manager software, so you can put "no-cache" here to stop others from caching your pages.   The "webcache!refresh_time" setting is a useful extra here. Set this to one second. This will smooth out the load on your app servers. When a cached page is about to expire (i.e. it's too old to stay cached) and a new request arrives, Traffic Manager will hand over a single request to your app servers, to see if there is a newer page available. Other requests continue to get served from the cache. This can prevent 'waves' of requests hitting your app servers when a page is about to expire from the cache.   Now, we need to make Traffic Manager cache the specific pages of your site that the app server claims are uncacheable. We do this using the RuleBuilder system for defining rules, so click on the "Catalogs" icon and then select the "Rules" tab. Now create a new RuleBuilder rule.   This rule needs to run for the specific web pages that you wish to make cacheable for short amounts of time. For an example, we'll make "/news" cacheable. Add a condition of "HTTP:URL Path" to match "/news", then add an action to set a HTTP response header. The rule should look like this:     Finally, add this rule as a response rule to your virtual server. That's it! Your site should now start to be cached. Just a final few words of caution:   Be selective in the pages that you mark as cacheable; remember that personalized pages (e.g. showing a username) cannot be cached otherwise other people will see those pages too! If necessary, some page redesign might be called for to split the content into "generic" and "user-specific" iframes or AJAX requests. Server-side caching saves you CPU time, not bandwidth. If your website is slow because you are hitting your site throughput limits, then other techniques are needed.
View full article
This article explains how to use Traffic Manager's REST Control API using the excellent requests Python library.   There are many ways to install the requests library.  On my test client (MacOSX), the following was sufficient:

$ sudo easy_install pip
$ sudo pip install requests

Resources

The REST API gives you access to the Traffic Manager Configuration, presented in the form of resources.  The format of the data exchanged using the RESTful API will depend on the type of resource being accessed:

Data for Configuration Resources, such as Virtual Servers and Pools, are exchanged in JSON format using the MIME type of "application/json", so when getting data on a resource with a GET request the data will be returned in JSON format and must be deserialized or decoded into a Python data structure.  When adding or changing a resource with a PUT request the data must be serialized or encoded from a Python data structure into JSON format. Files, such as rules and those in the extra directory, are exchanged in raw format using the MIME type of "application/octet-stream".

Working with JSON and Python

The json module provides functions for JSON serializing and deserializing.  To take a Python data structure and serialize it into JSON format use json.dumps(), and to deserialize a JSON formatted string into a Python data structure use json.loads().

Working with a RESTful API and Python

To make the programming easier, the program examples that follow utilize the requests library as the REST client. To use the requests library you first set up a requests session as follows, replacing <userid> and <password> with the appropriate values:

client = requests.Session()
client.auth = ('<userid>', '<password>')
client.verify = False

The last line prevents it from verifying that the certificate used by Traffic Manager is from a certificate authority, so that the self-signed certificate used by Traffic Manager will be allowed.  Once the session is set up, you can make GET, PUT and DELETE calls as follows:

response = client.get(<url>)
response = client.put(<url>, data = <serialized data>, headers = <headers>)
response = client.delete(<url>)

The URL for the RESTful API will be of the form:

https://<STM hostname or IP>:9070/api/tm/1.0/config/active/

followed by a resource type or a resource type and resource, so for example to get a list of all the pools from the Traffic Manager instance, stingray.example.com, it would be:

https://stingray.example.com:9070/api/tm/1.0/config/active/pools

And to get the configuration information for the pool, "testpool", the URL would be:

https://stingray.example.com:9070/api/tm/1.0/config/active/pools/testpool

For most Python environments, it will probably be necessary to install the requests library.  For some Python environments it may also be necessary to install the httplib2 module.

Data Structures

JSON responses from a GET or PUT are deserialized into a Python dictionary that always contains one element.   The key to this element will be:

'children' for lists of resources.  The value will be a Python list with each element in the list being a dictionary with the key, 'name', set to the name of the resource and the key, 'href', set to the URI of the resource. 'properties' for configuration resources.  The value will be a dictionary with each key/value pair being a section of properties, with the key set to the name of the section and the value being a dictionary containing the configuration values as key/value pairs.  Configuration values can be scalars, lists or dictionaries.
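Putting these pieces together, here is a minimal sketch that lists the names of all pools. The hostname and credentials are placeholders; the URL matches the form shown above:

import json
import requests

client = requests.Session()
client.auth = ('admin', 'password')   # placeholder credentials
client.verify = False                 # allow Traffic Manager's self-signed certificate

url = 'https://stingray.example.com:9070/api/tm/1.0/config/active/pools'
response = client.get(url)
if response.status_code == 200:
    # Deserialize the JSON response into a Python dictionary
    pools = json.loads(response.text)
    # 'children' holds one {'name': ..., 'href': ...} dictionary per pool
    for pool in pools['children']:
        print(pool['name'])
else:
    print('Error: status code %d' % response.status_code)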
Please see Feature Brief: Traffic Manager's RESTful Control API for examples of these data structures; a tool such as the Chrome REST Console can be used to see what the actual data looks like.   Read More   The REST API Guide in the Product Documentation Feature Brief: Traffic Manager's RESTful Control API Collected Tech Tips: Using the RESTful Control API
View full article
Distributed denial of service (DDoS) attacks are the worst nightmare of every web presence. Common wisdom has it that there is nothing you can do to protect yourself when a DDoS attack hits you. Nothing? Well, unless you have Stingray Traffic Manager. In this article we'll describe how Stingray helped a customer keep their site available to legitimate users when they came under massive attack from the "dark side".

What is a DDoS attack?

DDoS attacks have risen to considerable prominence even in mainstream media recently, especially after the BBC published a report on how botnets can be used to send SPAM or take web-sites down and another story detailing that even computers of UK government agencies are taken over by botnets.

A botnet is an interconnected group of computers, often home PCs running MS Windows, normally used by their legitimate owners but actually under the control of cyber-criminals who can send commands to programs running on those computers. The fact that their machines are controlled by somebody else is due to operating system or application vulnerabilities and is often unknown to the unsuspecting owners. When such a botnet member goes online, the malicious program starts to receive its orders. One such order can be to send SPAM emails, another to flood a certain web-site with requests, and so on.

There are quite a few scary reports about online extortion in which web-site owners are forced to pay money to avoid disruption to their services.

Why are DDoS attacks so hard to defend against?

The reason DDoS attacks are so hard to counter is that they are using the very service a web-site is providing and wants to provide: its content should be available to everybody. An individual PC connected to the internet via DSL usually cannot take down a server, because servers tend to have much more computing power and more networking bandwidth. By distributing the requests over as many different clients as possible, the attacker solves three problems in one go:

They get more bandwidth to hammer the server. The victim cannot thwart the attack by blocking individual IP addresses: that will only reduce the load by a negligible fraction. Also, clever DDoS attackers gradually change the clients sending the requests. It's impossible to keep up with this by manually adapting the configuration of the service. It's much harder to identify that a client is part of the attack because each individual client may be sending only a few requests per second.

How to Protect against DDoS Attacks?

There is an article on how to ban IP addresses of individual attackers here: Dynamic Defense Against Network Attacks. The mechanism described there involves a Java Extension that modifies Stingray Traffic Manager's configuration via a SOAP call to add an offending IP address to the list of banned IPs in a Service Protection Class. In principle, this could be used to block DDoS attacks as well. In reality it can't, because SOAP is a rather heavyweight protocol that involves far too much overhead to be able to run hundreds of times per second. (Stingray's Java support is highly optimized and can handle tens of thousands of requests per second.)

The performance of the Java/SOAP combination can be improved by leveraging the fact that all SOAP calls in the Stingray API are array-based. So a list of IP addresses can be gathered in TrafficScript and added to Stingray's configuration in one call.
But still, the blocking of unwanted requests would happen too late: at the application level rather than at the OS (or, preferably, the network) level. Therefore, the attacker could still inflict the load of accepting a connection, passing it up to Stingray Traffic Manager, checking the IP address inside Stingray Traffic Manager and so on. It's much better to find a way to block the majority of connections before they reach Stingray Traffic Manager.

Introducing iptables

Linux offers an extensive framework for controlling network connections called iptables. One of its features is that it allows an administrator to block connections based on many different properties and conditions. We are going to use it in a very simple way: to ignore connection initiations based on the IP address of their origin. iptables can handle thousands of such conditions, but of course it has an impact on CPU utilization. However, this impact is still much lower than having to accept the connection and cope with it at the application layer.

iptables checks its rules against each and every packet that enters the system (potentially also against packets that are forwarded by and created in the system, but we are not going to use that aspect here). What we want to impede are new connections from IP addresses that we know are part of the DDoS attack. No expensive processing should be done on packets belonging to connections that have already been established and on which data is being exchanged. Therefore, the first rule to add is to let through all TCP packets that do not establish a new connection, i.e. that do not have the SYN flag set:

# iptables -I INPUT -p tcp \! --syn -j ACCEPT

Once an IP address has been identified as 'bad', it can be blocked with the following command:

# iptables -A INPUT -s [ip_address] -j DROP

Using Stingray Traffic Manager and TrafficScript to detect and block the attack

The rule that protects the service from the attack consists of two parts: identifying the offending requests and blocking their origin IPs.

Identifying the Bad Guys: The Attack Signature

A gift shopping site that uses Stingray Traffic Manager to manage the traffic to their service recently noticed a surge of requests to their home page that threatened to take the web site down. They contacted us, and upon investigation of the request logs it became apparent that there were many requests with unconventional 'User-Agent' HTTP headers. A quick web search revealed that this was indicative of an automated distributed attack.

The first thing for the rule to do is therefore to look up the value of the User-Agent header in a list of agents that are known to be part of the attack:

sub isAttack()
{
   $ua = http.getHeader( "User-Agent" );
   if( $ua == "" || $ua == " " ) {
      #log.info("Bad Agent [null] from ". request.getRemoteIP());
      counter.increment(1,1);
      return 1;
   } else {
      $agentmd5 = resource.getMD5( "bad-agents.txt" );
      if( $agentmd5 != data.get( "agentmd5" ) ) {
         reloadBadAgentList( $agentmd5 );
      }
      if( data.get( "BAD" . $ua ) ) {
         #log.info("Bad agent ".$ua." from ". request.getRemoteIP());
         counter.increment(2,1);
         return 1;
      }
   }
   return 0;
}

The rule fetches the relevant header from the HTTP request and makes a quick check whether the client sent an empty User-Agent or just whitespace.
If so, a counter is incremented that can be used in the UI to track how many such requests were found, and then 1 is returned, indicating that this is indeed an unwanted request.

If a non-trivial User-Agent has been sent with the request, the list is queried. If the user-agent string has been marked as 'bad', another counter is incremented and again 1 is returned to the calling function. The techniques used here are similar to those in the more detailed HowTo: Store tables of data in TrafficScript article; when needed, the resource file is parsed and an entry in the system-wide data hash-table is created for each black-listed user agent.

This is accomplished by the following sub-routine:

sub reloadBadAgentList( $newmd5 )
{
   # do this first to minimize race conditions:
   data.set( "agentmd5", $newmd5 );
   $badagents = resource.get( "bad-agents.txt" );
   $i = 0;
   data.reset( "BAD" ); # clear old entries
   while( ( $j = string.find( $badagents, "\n", $i ) ) != -1 ) {
      $line = string.substring( $badagents, $i, $j-1 );
      $i = $j+1;
      $entry = "BAD" . string.trim( $line );
      log.info( "Adding bad UA '" . $entry . "'" );
      data.set( $entry, 1 );
   }
}

Most of the time, however, it won't be necessary to read from the file system because the list of 'bad' agents does not change often (if ever for a given botnet attack). You can download the file with the black-listed agents here, and there is even a web-page dedicated to user-agents, good and bad.

Configuring iptables from TrafficScript

Now that TrafficScript 'knows' that it is dealing with a request whose IP has to be blocked, this address must be added to the iptables 'INPUT' chain with target 'DROP'. The most lightweight way to get this information from inside Stingray Traffic Manager to somewhere else is to use the HTTP client functionality in TrafficScript provided by the function http.request.get(). Since many such 'evil' IP addresses are expected per second, it is a good idea to buffer up a certain number of IPs before making an HTTP request (the first of which will have some overhead due to TCP's three-way handshake, but of course much less than forking a new process; subsequent requests will be made over the kept-alive connection).

Here is the rule that accomplishes the task:

if( isAttack() ) {
   $ip = request.getRemoteIP();
   $iplist = data.get( "badiplist" );
   if( string.count( $iplist, "/" )+1 >= 10 ) {
      data.remove( "badiplist" );
      $url = "http://127.0.0.1:44252" . $iplist . "/" . $ip;
      http.request.get( $url, "", 5 );
   } else {
      data.set( "badiplist", $iplist . "/" . $ip );
   }
   connection.sleep( $sleep ); # $sleep is defined in the complete rule below
   connection.discard();
}

A simple 'Web Server' that Adds Rules for iptables

Now who is going to handle all those funny HTTP GET requests? We need a simple web-server that reads the URL, splits it up into the IPs to be blocked and adds them to iptables (unless they are already being blocked). On startup this process checks which addresses are already in the black-list to make sure they are not added again (which would be a waste of resources), makes sure that a fast path is taken for packets that do not correspond to new connections, and then listens for requests on a configurable port (in the rule above we used port 44252).

This daemon doesn't fork one iptables process per IP address to block. Instead, it uses the 'batch-mode' of the iptables framework, iptables-restore: you compile a list of rules and send all of them down to the kernel with a single commit command.
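To illustrate, the batch input that such a daemon feeds to iptables-restore looks roughly like this (a sketch with example addresses; the --noflush option appends the new rules without discarding those already installed):

# echo '*filter
-A INPUT -s 192.0.2.10 -j DROP
-A INPUT -s 192.0.2.11 -j DROP
COMMIT' | iptables-restore --noflush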
A lot of details (like IPv6 support, throttling etc) have been left out because they are not specific to the problem at hand, but can be studied by downloading the Perl code (attached) of the program.

To start this server you have to be root and invoke the following command:

# iptablesrd.pl

Preventing too many requests with Stingray Traffic Manager's Rate Shaping

As it turned out when dealing with the DDoS attack that plagued our client, the bottleneck in the whole process described up until now was the addition of rules to iptables. This is not surprising, as the kernel has to lock some of its internal structures before each such manipulation. On a moderately-sized workstation, for example, a few hundred transactions can be committed per second when starting from an empty rule set. Once there are, say, 10,000 IP addresses in the list, adding more becomes slower and slower, down to a few dozen per second at best. If we keep sending requests to the 'iptablesrd' web-server at a high rate, it won't be able to keep up with them. Basically, we have to take into account that this is the place where processing is channeled from a massively parallel, highly scalable process (Stingray) into the sequential, one-at-a-time mechanism that is needed to keep the iptables configuration consistent across CPUs.

Queuing up all these requests is pointless, as it will only eat resources on the server. It is much better to let Stingray Traffic Manager sleep on the connection for a short time (to slow down the attacker) and then close it. If the IP address continues to be part of the botnet, the next request will come soon enough and we can try and handle it then.

Luckily, Stingray comes with rate-shaping functionality that can be used in TrafficScript. Setting up a 'Rate' class in the 'Catalog' tab takes only a moment: create a class (here called 'DDoS Protect') and give it a maximum number of requests per second.

The Rate Class can now be used in the rule to restrict the number of HTTP requests Stingray makes per second:

if( rate.getBacklog( "DDoS Protect" ) < 1 ) {
   $url = "http://localhost:44252" . $iplist . "/" . $ip;
   rate.use( "DDoS Protect" );
   # our 'webserver' never sends a response
   http.request.get( $url, "", 5 );
}

Note that we simply don't do anything if the rate class already has a backlog, i.e. there are outstanding requests to block IPs. If there is no request queued up, we impose the rate limitation on the current connection and then send out the request.

The Complete Rule

To wrap this section up, here is the rule in full:

$sleep = 300; # in milliseconds
$maxbacklog = 1;
$ips_per_httprequest = 10;
$http_timeout = 5; # in seconds
$port = 44252; # keep in sync with argument to iptablesrd.pl

if( isAttack() ) {
   $ip = request.getRemoteIP();
   $iplist = data.get( "badiplist" );
   if( string.count( $iplist, "/" )+1 >= $ips_per_httprequest ) {
      data.remove( "badiplist" );
      if( rate.getBacklog( "ddos_protect" ) < $maxbacklog ) {
         $url = "http://127.0.0.1:" . $port . $iplist . "/" . $ip;
         rate.use( "ddos_protect" );
         # our 'webserver' never sends a response
         http.request.get( $url, "", $http_timeout );
      }
   } else {
      data.set( "badiplist", $iplist . "/" . $ip );
   }
   connection.sleep( $sleep );
   connection.discard();
}

$rawurl = http.getRawURL();
if( $rawurl == "/" ) {
   counter.increment(3, 1);
   # Small delay - shouldn't annoy real users, will at least slow down attackers
   connection.sleep( 100 );
   # Attackers will probably ignore the redirection. Real browsers will come back
   http.redirect( "/temp-redirection" );
}

# Re-write the URL before passing it on to the web servers
if( $rawurl == "/temp-redirection" ) {
   http.setPath( "/" );
}

sub isAttack()
{
   $ua = http.getHeader( "User-Agent" );
   if( $ua == "" || $ua == " " ) {
      counter.increment(1,1);
      return 1;
   } else {
      $agentmd5 = resource.getMD5( "bad-agents.txt" );
      if( $agentmd5 != data.get( "agentmd5" ) ) {
         reloadBadAgentList( $agentmd5 );
      }
      if( data.get( "BAD" . $ua ) ) {
         counter.increment(2,1);
         return 1;
      }
   }
   return 0;
}

sub reloadBadAgentList( $newmd5 )
{
   # do this first to minimize race conditions:
   data.set( "agentmd5", $newmd5 );
   $badagents = resource.get( "bad-agents.txt" );
   $i = 0;
   data.reset( "BAD" );
   while( ( $j = string.find( $badagents, "\n", $i ) ) != -1 ) {
      $line = string.substring( $badagents, $i, $j-1 );
      $i = $j+1;
      $entry = "BAD" . string.trim( $line );
      data.set( $entry, 1 );
   }
}

Note that there are a few tunables at the beginning of the rule. Also, since in the particular case of the gift shopping site all attack requests went to the home page ("/"), a small slowdown and subsequent redirect was added for that page.

Further Advice

The method described here can help mitigate the server-side effects of DDoS attacks. It is important, however, to adapt it to the particular nature of each attack and to the system Stingray Traffic Manager is running on. The most obvious adjustment is to change the isAttack() sub-routine to reliably detect attacks without blocking legitimate requests.

Beyond that, a careful eye has to be kept on the system to make sure Stingray strikes the right balance between adding bad IPs (which is expensive but keeps further requests from that IP out) and throwing away connections the attackers have managed to establish (which is cheap but won't block future connections from the same source). After a while, the rules for iptables will block all members of the botnet. However, botnets are dynamic; they change over time: new nodes are added while others drop out.

A useful improvement to the iptablesrd.pl process described above would therefore be to speculatively remove blocks if they were added a long time ago and/or if the number of entries crosses a certain threshold (which will depend on the hardware available).

Most DDoS attacks are short-lived, however, so it may suffice to just wait until it's over.

The further upstream in the network the attack can be blocked, the better. With the current approach, blocking occurs at the machine Stingray Traffic Manager is running on. If the upstream router can be remote-controlled (e.g. via SNMP), it would be preferable to do the blocking there. The web server we are using in this article can easily be adapted to such a scenario.
A word of warning and caution: The method presented here is no panacea that can protect against arbitrary attacks. A massive DDoS attack can, for instance, saturate the bandwidth of a server with a flood of SYN packets and such an attack can only be handled further upstream in the network. But Stingray Traffic Manager can certainly be used to scale down the damage inflicted upon a web presence and take a great deal of load from the back-end servers.   Footnote   The image at the top of the article is a graphical representation of the distribution of nodes on the internet produced by the opte project. It is protected by the Creative Commons License.
View full article
What is Direct Server Return? Layer 2/3 Direct Server Return (DSR), also referred to as 'triangulation', is a network routing technique used in some load balancing situations: Incoming traffic from the client is received by the load balancer and forwarded to a back-end node Outgoing (return) traffic from the back-end node is sent directly to the client and bypasses the load balancer completely Incoming traffic (blue) is routed through the load balancer, and return traffic (red) bypasses the load balancer Direct Server Return is fundamentally different from the normal load balancing mode of operation, where the load balancer observes and manages both inbound and outbound traffic. In contrast, there are two other common load balancing modes of operation: NAT (Network Address Translation): layer-4 load balancers and simple layer 7 application delivery controllers use NAT to rewrite the destination value of individual network packets.  Network connections are load-balanced by the choice of destination value. They often use a technique called 'delayed binding' to delay and inspect a new network connection before sending the packets to a back-end node; this allows them to perform content-based routing.  NAT-based load balancers can switch TCP streams, but have limited capabilities to inspect and rewrite network traffic. Proxy: Modern general-purpose load balancers like Stingray Traffic Manager operate as full proxies.  The proxy mode of operation is the most compute-intensive, but current general purpose hardware is more than powerful enough to manage traffic at multi-gigabit speeds. Whereas NAT-based load balancers manage traffic on a packet-by-packet basis, proxy-based load balancers can read entire requests and responses.  They can manage and manipulate the traffic based on a full understanding of the transaction between the client and the application server. Note that some load balancers can operate in a dual-mode fashion - a service can be handled either in a NAT-like fashion or in a Proxy-like fashion.  This introduces a tradeoff between hardware performance and software sophistication - see SOL4707 - Choosing appropriate profiles for HTTP traffic for an example.  Stingray Traffic Manager can only function in a Proxy-like fashion. This article describes how the benefits of direct server return can be applied to a layer 7 traffic management device such as Stingray Traffic Manager. Why use Direct Server Return? Layer 2/3 Direct Server Return was very popular from 1995 to about 2000 because the load balancers of the time were seriously limited in performance and compute power; DSR uses fewer compute resources than a full NAT or Proxy load balancer.  DSR is no longer necessary for high performance services, as modern load balancers on modern hardware can easily handle multiple gigabits of traffic without requiring DSR. DSR is still an appealing option for organizations that serve large media files, or that have very large volumes of traffic. Stingray Traffic Manager does not support a traditional DSR mode of operation, but it is straightforward to manage traffic to obtain a similar layer 7 DSR effect. Disadvantages of Layer 2/3 Direct Server Return There are a number of distinct limitations and disadvantages with DSR: 1. The load balancer does not observe the response traffic The load balancer has no way of knowing if a back-end server has responded correctly to the remote client.   The server may have failed, or it may have returned a server error message.
An external monitoring service is necessary to verify the health and correct operation of each back-end server. 2. Proper load balancing is not possible The load balancer has no idea of service response times, so it is difficult for it to perform effective, performance-sensitive load balancing. 3. Session persistence is severely limited Because the load balancer only observes the initial 'SYN' packet before it makes a load balancing decision, it can only perform session persistence based on the source IP address and port of the packet, i.e. the IP address of the remote client. The load balancer cannot perform cookie-based session persistence, SSL session ID persistence, or any of the many other session persistence methods offered by other load balancers. 4. Content-based routing is not possible Again, because the load balancer does not observe the initial request, it cannot perform content-based routing. 5. Limited traffic management and reporting The load balancer cannot manage traffic, performing operations like SSL decryption, content compression, security checking, SYN cookies, bandwidth management, etc.  It cannot retry failed requests, or perform any traffic rewriting.  The load balancer cannot report on traffic statistics such as bandwidth sent. 6. DSR can only be used within a datacenter There is no way to perform DSR between datacenters (other than proprietary tunnelling, which may be limited by ISP egress filtering). In addition, many of the advanced capabilities of an application delivery controller that depend on inspection and modification (security, acceleration, caching, compression, scrubbing etc) cannot be deployed when a DSR mode is in use. Performance of Direct Server Return The performance benefits of DSR are often assumed to be greater than they really are.  Central to this doubt is the observation that client applications will send TCP 'ACK' packets via the load balancer in response to the data they receive from the server, and the volume of the ACK packets can overwhelm the load balancer. Although ACK packets are small, in many cases the rated capacities of network hardware assume that all packets are the size of the maximum MTU (typically 1500 bytes).  A load balancer running on a 100 MBit network could receive a little over 8,000 ACK packets per second. On a low-latency network, ACK packets are relatively infrequent (1 ACK packet for every 4 data packets), but for large downloads over a high-latency network (8 hops) the ratio of ACK packets to data packets closely approaches 1:1 as the server and client attempt to optimize the TCP session.  Therefore, over high-latency networks, a DSR-equipped load balancer will receive a similar volume of ACK packets to the volume of outgoing data packets (and the difference in size between the ACK and data packets has little effect on packet-based load balancers). Stingray alternatives to Layer 2/3 DSR There are two alternatives to direct server return: Use Stingray Traffic Manager in its usual full proxy mode Stingray Traffic Manager is comfortably able to manage many gigabits of traffic in its normal 'proxy' mode on appropriate hardware, and can be scaled horizontally for increased capacity.  In benchmarks, modern Intel and AMD-based systems can achieve multiple 10's of Gbits of fully load-balanced traffic, and up to twice as much when serving content from Stingray Traffic Manager's content cache. Redirect requests to the chosen origin server (a.k.a.
Layer 7 DSR) For the most common protocols (HTTP and RTSP), it is possible to handle them in 'proxy' mode, and then redirect the client to the chosen server node once the load balancing and session persistence decision has been made.  For the large file download, the client communicates directly with the server node, bypassing Stingray Traffic Manager completely: Client issues HTTP or RTSP request to Stingray Traffic Manager Stingray Traffic Manager issues 'probe' request via pool to back-end server Stingray Traffic Manager verifies that the back-end server returns a correct response Stingray Traffic Manager sends a 302 redirect to the client, telling it to retry the request against the chosen back-end server Requests for small objects (blue) are proxied directly to the origin.  Requests for large objects (red) elicit a lightweight probe to locate the resource, and then the client is instructed (green) to retrieve the resource directly from the origin. This technique would generally be used selectively.  Small file downloads (web pages, images, etc) would be managed through the Stingray Traffic Manager.  Only large files (embedded media, for example) would be handled in this redirect mode.  For this reason, the HTTP session will always run through the Stingray Traffic Manager. Layer 7 DSR with HTTP Layer 7 DSR with HTTP is fairly straightforward.  In the following example, incoming requests that begin "/media" will be converted into simple probe requests and sent to the 'Media Servers' pool.  The Stingray Traffic Manager will determine which node was chosen, and send the client an explicit redirect to retrieve the requested content from the chosen node: Request rule: Deploy the following TrafficScript request rule:

$path = http.getPath();
if( string.startsWith( $path, "/media/" ) ) {
   # Store the real path
   connection.data.set( "path", $path );
   # Convert the request to a lightweight HEAD for '/'
   http.setMethod( "HEAD" );
   http.setPath( "/" );
   pool.use( "Media Servers" );
}

Response rule: This rule reads the response from the server; load balancing and session persistence (if relevant) will ensure that we've connected with the optimal server node.  The rule only takes effect if we did the request rewrite: in that case the $saved_path value will begin with '/media', so we can issue the redirect.

$saved_path = connection.data.get( "path" );
if( string.startsWith( $saved_path, "/media" ) ) {
   $chosen_node = connection.getNode();
   http.redirect( "http://" . $chosen_node . $saved_path );
}

Layer 7 DSR with RTSP An RTSP connection is a persistent TCP connection.  The client and server communicate with HTTP-like requests and responses.  In this example, Stingray Traffic Manager will receive initial RTSP connections from remote clients and load-balance them on to a pool of media servers.  In the RTSP protocol, a media download is always preceded by a 'DESCRIBE' request from the client; Stingray Traffic Manager will replace the 'DESCRIBE' response with a 302 Redirect response that tells the client to connect directly to the back-end media server. This code example has been tested with the Quicktime, Real and Windows media clients, and against pools of Quicktime, Helix (Real) and Windows media servers. The details Create a virtual server listening on port 554 (standard port for RTSP traffic).  Set the protocol type to be "RTSP". In this example, we have three pools of media servers, and we're going to select the pool based on the User-Agent field in the RTSP request.
The pools are named "Helix Servers", "QuickTime Servers" and "Windows Media Servers". Request rule: Deploy the following TrafficScript request rule:

$client = rtsp.getRequestHeader( "User-Agent" );

# Choose the pool based on the User-Agent
if( string.contains( $client, "RealMedia" ) ) {
   pool.select( "Helix Servers" );
} else if( string.contains( $client, "QuickTime" ) ) {
   pool.select( "QuickTime Servers" );
} else if( string.contains( $client, "WMPlayer" ) ) {
   pool.select( "Windows Media Servers" );
}

This rule uses pool.select() to specify which pool to use when Stingray is ready to forward the request to a back-end server.  Response rule: All of the work takes place in the response rule.  This rule reads the response from the server.  If the request was a 'DESCRIBE' method, the rule then replaces the response with a 302 redirect, telling the client to connect directly to the chosen back-end server.  Add this rule as a response rule, setting it to run every time (not once).

# Wait for a DESCRIBE response since this contains the stream
$method = rtsp.getMethod();
if( $method != "DESCRIBE" ) break;

# Get the chosen node
$node = connection.getNode();

# Instruct the client to retry directly against the chosen node
# ($path should contain the path from the client's DESCRIBE request,
# saved in a request rule - not shown)
rtsp.redirect( "rtsp://" . $node . "/" . $path );

Appendix: How does DSR work? It's useful to have an appreciation of how DSR (and Delayed Binding) functions in order to understand some of its limitations (such as content inspection). TCP overview A simplified overview of a TCP connection is as follows: Connection setup The client initiates a connection with a server by sending a 'SYN' packet.  The SYN packet contains a randomly generated client sequence number (along with other data). The server replies with a 'SYN ACK' packet, acknowledging the client's SYN and sending its own randomly generated server sequence number. The client completes the TCP connection setup by sending an ACK packet to acknowledge the server's SYN.  The TCP connection setup is often referred to as a 3-way TCP handshake.  Think of it as the following conversation: Client: "Can you hear me?" (SYN) Server: "Yes.  Can you hear me?" (ACK, SYN) Client: "Yes" (ACK) Data transfer Once the connection has been established by the 3-way handshake, the client and server exchange data packets with each other.  Because packets may be dropped or re-ordered, each packet contains a sequence number; the sequence number is incremented for each packet sent. When a client receives intact data packets from the server, it sends back an ACK (acknowledgement) with the packet sequence number.  When a client acknowledges a sequence number, it is acknowledging it received all packets up to that number, so ACKs may be sent less frequently than data packets.  The server may send several packets in sequence before it receives an ACK (determined by the "window size"), and will resend packets if they are not ACK'd rapidly enough. Simple NAT-based Load Balancing There are many variants for IP and MAC rewriting used in simple NAT-based load balancing.  The simplest NAT-based load balancing technique uses Destination-NAT (DNAT) and works as follows: The client initiates a connection by sending a SYN packet to the Virtual IP (VIP) that the load balancer is listening on The load balancer makes a load balancing decision and forwards the SYN packet to the chosen node.  It rewrites the destination IP address in the packet to the IP address of the node.  The load-balancer also remembers the load-balancing decision it made.
The node replies with a SYN/ACK.  The load-balancer rewrites the source IP address to be the VIP and forwards the packet on to the remote client. As more packets flow between the client and the server, the load balancer checks its internal NAT table to determine how the IP addresses should be rewritten. This implementation is very amenable to hardware (ASIC) implementation.  The TCP connection is load-balanced on the first SYN packet; one of the implications is that the load balancer cannot inspect the content in the TCP connection before making the routing decision. Delayed Binding Delayed binding is a variant of the DNAT load balancing method.  It allows the load balancer to inspect a limited amount of the content before making the load balancing decision. When the load balancer receives the initial SYN, it chooses a server sequence number and returns a SYN/ACK response The load balancer completes the TCP handshake with the remote client and reads the initial few data packets in the client's request. The load balancer reassembles the request, inspects it and makes the load-balancing decision.  It then makes a TCP connection to the chosen server, using DNAT (i.e., preserving the client's source IP address) and writes the request to the server. Once the request has been written, the load balancer must splice the client-side and server-side connections together.  It does this by using DNAT to forward packets between the two endpoints, and by rewriting the sequence numbers chosen by the server so that they match the initial sequence numbers that the load balancer used. This implementation is still amenable to hardware (ASIC) implementation.  However, layer 4-7 tasks such as detailed content inspection and content rewriting are beyond implementation in specialized hardware alone and are often implemented using software approaches (such as F5's FastHTTP profile), albeit with significant functional limitations. Direct Server Return Direct Server Return is most commonly implemented using MAC address translation (layer 2). A MAC (Media Access Control) address is a unique, unchanging hardware address that is bound to a network card.  Network devices will read all network packets destined for their MAC address. Network devices use ARP (address resolution protocol) to announce the MAC address that is hosting a particular IP address.  In a Direct Server Return configuration, the load balancer and the server nodes will all listen on the same VIP.  However, only the load balancer makes ARP broadcasts to tell the upstream router that the VIP maps to its MAC address. When a packet destined for the VIP arrives at the router, the router places it on the local network, addressed to the load balancer's MAC address.  The load balancer picks that packet up. The load balancer then makes a load balancing decision, choosing which node to send it to.  The load balancer rewrites the MAC address in the packet and puts it back on the wire. The chosen node picks the packet up just as if it were addressed directly to it. When the node replies, it sends its packets directly to the remote client.  They are immediately picked up by the upstream router and forwarded on. In this way, reply packets completely bypass the load balancer machine. Why content inspection is not possible Content inspection (delayed binding) is not possible because it requires that the load balancer first completes the three-way handshake with the remote source node, and possibly ACKs some of the data packets.
When the load balancer then sends the first SYN to the chosen node, the node will respond with a SYN/ACK packet directly back to the remote source.  The load balancer is out-of-line and cannot suppress this SYN/ACK.  Additionally, the sequence number that the node selects cannot be translated to the one that the remote client is expecting.  There is no way to persuade the node to pick up the TCP connection where the load balancer left off. For similar reasons, SYN cookies cannot be used by the load balancer to offload SYN floods from the server nodes. Alternative Implementations of Direct Server Return There are two alternative implementations of DSR (see this 2002 paper entitled 'The State of the Art'), but neither is widely used any more: TCP Tunnelling: IP tunnelling (aka IP encapsulation) can be used to tunnel the client IP packets from the load balancer to the server.  All client IP packets are encapsulated within IP datagrams, and the server runs a tunnel device (an OS driver and configuration) to strip off the datagram header before sending the client IP packet up the network stack. This configuration does not support delayed binding, or any equivalent means of inspecting content before making the load balancing decision. TCP Connection Hopping: Resonate have implemented a proprietary protocol (Resonate Exchange Protocol, RXP) which interfaces deeply with the server node's TCP stack.  Once a TCP connection has been established with the Resonate Central Dispatch load balancer and the initial data has been read, the load balancer can hand the response side of the connection off to the selected server node using RXP.  The RXP driver on the server suppresses the initial TCP handshake packets, and forces the use of the correct TCP sequence number.  This uniquely allows for content-based routing and direct server return in one solution. Neither of these methods is in wide use now.
View full article
Introduction

Do you ever face any of these requirements?

"I want to best-effort provide certain levels of service for certain users." "I want to prioritize some transactions over others." "I want to restrict the activities of certain types of users."

This article explains that to address these problems, you must consider the following questions:

"Under what circumstances do you want the policy to take effect?" "How do you wish to categorise your users?" "How do you wish to apply the differentiation?"

It then describes some of the measures you can take to monitor performance more deeply and apply prioritization to nominated traffic:

Service Level Monitoring – Measure system performance, and apply policies only when they are needed. Custom Logging - Log and analyse activity to record and validate policy decisions. Application traffic inspection - Determine source, user, content, value; XML processing with XPath searches and calculations. Request Rate Shaping - Apply fine-grained rate limits for transactions. Bandwidth Control - Allocate and reserve bandwidth. Traffic Routing and Termination - Route high and low priority traffic differently; Terminate undesired requests early Selective Traffic Optimization - Selective caching and compression.

Whether you are running an eCommerce web site, online corporate services or an internal intranet, there's always the need to squeeze more performance from limited resources and to ensure that your most valuable users get the best possible levels of service from the services you are hosting.

An example

Imagine that you are running a successful gaming service in a glamorous location.  The usage of your service is growing daily, and many of your long-term users are becoming very valuable.

Unfortunately, much of your bandwidth and server hits are taken up by competitors' robots that screen-scrape your betting statistics, and poorly-written bots that spam your gaming tables and occasionally place low-value bets. At certain times of the day, this activity is so great that it impacts the quality of the service you deliver, and your most valuable customers are affected.

Using Traffic Manager to measure, classify and prioritize traffic, you can construct a service policy that comes into effect when your web site begins to run slowly, enforcing different levels of service:

Competitors' screen-scraping robots are tightly restricted to one request per second each.  A ten-second delay reduces the value of the information they screen-scrape. Users who have not yet logged in are limited to a small proportion of your available bandwidth and directed to a pair of basic web servers, thus reserving capacity for users who are logged in. Users who have made large transactions in the past are tagged with a cookie and the performance they receive is measured.  If they are receiving poor levels of service (over 100ms response time), then some of the transaction servers are reserved for these high-value users and the activity of other users is shaped by a system-wide queue.

Whether you are operating a gaming service, a content portal, a B2B or B2C eCommerce site or an internal intranet, this kind of service policy can help ensure that key customers get the best possible service, minimize the churn of valuable users and prevent undesirable visitors from harming the service to the detriment of others.
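As a taste of what follows, here is a minimal TrafficScript sketch of the first measure above. The rate class name ('robots') and the crude robot test are assumptions for illustration; the rest of this article develops each technique properly:

# Throttle suspected screen-scraping robots, keyed by client IP
# so that each robot is rate-limited independently
$ua = http.getHeader( "User-Agent" );
if( string.contains( $ua, "bot" ) || string.contains( $ua, "scraper" ) ) {
   rate.use( "robots", request.getRemoteIP() );
}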
Designing a service policy

"I want to best-effort guarantee certain levels of service for certain users." "I want to prioritize some transactions over others." "I want to restrict the activities of certain users."

To address these problems, you must consider the following questions:

Under what circumstances do you want the policy to take effect? How do you wish to categorise your users? How do you wish to apply the differentiation?

One or more TrafficScript rules can be used to apply the policy.  They take advantage of the following features:

When does the policy take effect?

Service Level Monitoring – Measure system performance, and apply policies only when they are needed. Custom Logging - Log and analyse activity to record and validate policy decisions.

How are users categorized?

Application traffic inspection - Determine source, user, content, value; XML processing with XPath searches and calculations.

How are they given different levels of service?

Request Rate Shaping – Apply fine-grained rate limits for transactions. Bandwidth Control - Allocate and reserve bandwidth. Traffic Routing and Termination - Route high and low priority traffic differently; Terminate undesired requests early Selective Traffic Optimization - Selective caching and compression.

TrafficScript

Feature Brief: TrafficScript is the key to defining traffic management policies to implement these prioritization rules.  TrafficScript brings together functionality to monitor and classify behavior, and then applies functionality to impose the appropriate prioritization rules.

For example, the following TrafficScript request rule inspects HTTP requests.  If the request is for a .jsp page, the rule looks at the client's 'Priority' cookie and routes the request to the 'high-priority' or 'low-priority' server pools as appropriate:

$url = http.getPath();
if( string.endsWith( $url, ".jsp" ) ) {
   $cookie = http.getCookie( "Priority" );
   if( $cookie == "high" ) {
      pool.use( "high-priority" );
   } else {
      pool.use( "low-priority" );
   }
}

Generally, if you can describe the traffic management logic that you require, it is possible to implement it using TrafficScript.

Capability 1: Service Level Monitoring

Using Feature Brief: Service Level Monitoring, Traffic Manager can measure and react to changes in response times for your hosted services, by comparing response times to a desired time.

You configure Service Level Monitoring by creating a Service Level Monitoring Class (SLM Class).  The SLM Class is configured with the desired response time (for example, 100ms), and some thresholds that define actions to take.  For example, if fewer than 80% of requests meet the desired response time, Traffic Manager can log a warning; if fewer than 50% meet the desired time, Traffic Manager can raise a system alert.

Suppose that we were concerned about the performance of our Java servlets.  We can configure an SLM Class with the desired performance, and use it to monitor all requests for Java servlets:

$url = http.getPath();
if( string.startsWith( $url, "/servlet/" ) ) {
   connection.setServiceLevelClass( "Java servlets" );
}

You can then monitor the performance figures generated by the 'Java servlets' SLM class to discover the response times, and the proportion of requests that fall outside the desired response time.

Once requests are monitored by an SLM Class, you can discover the proportion of requests that meet (or fail to meet) the desired response time within a TrafficScript rule.
This makes it possible to implement TrafficScript logic that is only called when services are underperforming.

Example: Simple Differentiation

Suppose we had a TrafficScript rule that tested to see if a request came from a 'high value' customer.

When our service is running slowly, high-value customers should be sent to one server pool ('gold') and other customers sent to a lower-performing server pool ('bronze'). However, when the service is running at normal speed, we want to send all customers to all servers (the server pool named 'all servers').

The following TrafficScript rule describes how this logic can be implemented:

# Monitor all traffic with the 'response time' SLM class, which is
# configured with a desired response time of 200ms
connection.setServiceLevelClass( "response time" );

# Now, check the historical activity (last 10 seconds) to see if it's
# been acceptable (more than 90% of requests served within 200ms)
if( slm.conforming( "response time" ) > 90 ) {
   # select the 'all servers' server pool and terminate the rule
   pool.use( "all servers" );
}

# If we get here, things are running slowly

# Here, we decide a customer is 'high value' if they have a login cookie,
# so we penalize customers who are not logged in. You can put your own
# test here instead
$logincookie = http.getCookie( "Login" );
if( $logincookie ) {
   pool.use( "gold" );
} else {
   pool.use( "bronze" );
}

For a more sophisticated example of this technique, check out the article Dynamic rate shaping slow applications.

Capability 2: Application Traffic Inspection

There's no limit to how you can inspect and evaluate your traffic.  Traffic Manager lets you look at any aspect of a client's request, so that you can then categorize them as you need. For example:

# What is the client asking for?
$url = http.getPath();
# ... and the QueryString
$qs = http.getQueryString();

# Where has the client come from?
$referrer = http.getHeader( "Referer" );
$country = geo.getCountryCode( request.getRemoteIP() );

# What sort of browser is the client using?
$ua = http.getHeader( "User-Agent" );

# Is the client trying to spend more than $49.99?
if( http.getPath() == "/checkout.cgi" && http.getFormParam( "total" ) > 4999 ) ...

# What's the value of the CustomerName field in the XML purchase order
# in the SOAP request?
$body = http.getBody();
$name = xml.xpath.matchNodeSet( $body, "", "//Info/CustomerName/text()" );

# Take the name, post it to a database server with a web interface and
# inspect the response. Does the response contain the value 'Premium'?
$response = http.request.post( "http://my.database.server/query",
                               "name=" . string.htmlEncode( $name ) );
if( string.contains( $response, "Premium" ) ) { ... }

Remembering the Classification with a Cookie

Often, it only takes one request to identify the status of a user, but you want to remember this decision for all subsequent requests.  For example, if a user places an item in his shopping cart by accessing the URL '/cart.php', then you want to remember this information for all of his subsequent requests.

Adding a response cookie is the way to do this.
Remembering the Classification with a Cookie

Often, it only takes one request to identify the status of a user, but you want to remember this decision for all subsequent requests. For example, if a user places an item in his shopping cart by accessing the URL '/cart.php', then you want to remember this information for all of his subsequent requests.

Adding a response cookie is the way to do this. You can do this in either a Request or a Response Rule with the 'http.setResponseCookie()' function:

if( http.getPath() == "/cart.php" ) {
    http.setResponseCookie( "GotItems", "Yes" );
}

This cookie will be sent by the client on every subsequent request, so to test whether the user has placed items in his shopping cart, you just need to test for the presence of the 'GotItems' cookie in each request rule:

if( http.getCookie( "GotItems" ) ) { ... }

If necessary, you can encrypt and sign the cookie so that it cannot be spoofed or reused:

# Setting the cookie.
# Create an encryption key using the client's IP address and user agent.
# Encrypt the current time using the encryption key; it can only be
# decrypted using the same key
$key = http.getHeader( "User-Agent" ) . ":" . request.getRemoteIP();
$encrypted = string.encrypt( sys.time(), $key );
$encoded = string.hexencode( $encrypted );
http.setResponseHeader( "Set-Cookie", "GotItems=".$encoded );

# Validating the cookie
$isValid = 0;
if( $cookie = http.getCookie( "GotItems" ) ) {
    $encrypted = string.hexdecode( $cookie );
    $key = http.getHeader( "User-Agent" ) . ":" . request.getRemoteIP();
    $secret = string.decrypt( $encrypted, $key );
    # If the cookie has been tampered with, or the IP address or user
    # agent differ, string.decrypt will return an empty string.
    # If it worked and the data was less than 1 hour old, it's valid:
    if( $secret && sys.time()-$secret < 3600 ) {
        $isValid = 1;
    }
}

Capability 3: Request Rate Shaping

Having decided when to apply your service policy (using Service Level Monitoring), and classified your users (using Application Traffic Inspection), you now need to decide how to prioritize valuable users and penalize undesirable ones.

The Feature Brief: Bandwidth and Rate Shaping in Traffic Manager capability is used to apply maximum request rates:

On a global basis ("no more than 100 requests per second to my application servers");
On a very fine-grained per-user or per-class basis ("no user can make more than 10 requests per minute to any of my statistics pages").

You can construct a service policy that places limits on a wide range of events, with very fine-grained control over how events are identified. You can impose per-second and per-minute rates on these events.

For example:

You can rate-shape individual web spiders, to stop them overwhelming your web site. Each web spider, from each remote IP address, can be given a maximum request rate.
You can throttle individual SMTP connections, or groups of connections from the same client, so that each connection is limited to a maximum number of sent emails per minute. You may also rate-shape new SMTP connections, so that a remote client can only establish new connections at a particular rate.
You can apply a global rate-shape to the number of connections per second that are forwarded to an application.
You can identify individual users' attempts to log in to a service, and then impede any dictionary-based login attacks by restricting each user to a limited number of attempts per minute.

Request rate limits are imposed using the TrafficScript rate.use() function, and you can configure per-second and per-minute limits in the rate class. Both limits are applied (note that if the per-minute limit is more than 60 times the per-second limit, it has no effect).

Using a Rate Class

Rate classes function as queues. When the TrafficScript rate.use() function is called, the connection is suspended and added to the queue that the rate class manages.
Connections are then released from the queue according to the per-second and per-minute limits.

There is no limit to the size of the backlog of queued connections. For example, if 1000 requests arrived in quick succession to a rate class that admitted 10 per second, 990 of them would be immediately queued. Each second, 10 more requests would be released from the front of the queue.

While they are queued, connections may time out or be closed by the remote client. If this happens, they are immediately discarded.

You can use the rate.getBacklog() function to discover how many requests are currently queued. If the backlog is too large, you may decide to return an error page to the user rather than risk their connection timing out. For example, to rate-shape .jsp requests, but defer requests when the backlog gets too large:

$url = http.getPath();
if( string.endsWith( $url, ".jsp" ) ) {
    if( rate.getBacklog( "shape requests" ) > 100 ) {
        http.redirect( "http://mysite/too_busy.html" );
    } else {
        rate.use( "shape requests" );
    }
}

Rate Classes with Keys

In many circumstances, you may need to apply more fine-grained rate-shaping limits. For example, imagine a login page: we wish to limit how frequently each individual user can attempt to log in, to just 2 attempts per minute.

The rate.use() function can take an optional 'key' which identifies a specific instance of the rate class. This key can be used to create multiple, independent rate classes that share the same limits, but enforce them independently for each individual key.

For example, the 'login limit' class is restricted to 2 requests per minute, to limit how often each user can attempt to log in:

$url = http.getPath();
if( string.endsWith( $url, "login.cgi" ) ) {
    $user = http.getFormParam( "username" );
    rate.use( "login limit", $user );
}

This rule can help to defeat dictionary attacks where attackers try to brute-force crack a user's password. The rate-shaping limits are applied independently to each different value of $user. As each new user accesses the system, they are limited to 2 requests per minute, independently of all other users who share the "login limit" rate-shaping class.

For another example, check out The "Contact Us" attack against mail servers.

Applying service policies with rate shaping

Of course, once you've classified your users, you can apply different rate settings to different categories of users:

# If they have an odd-looking user agent, or if there's no Host header,
# the client is probably a web spider. Limit it to 1 request per second.
$ua = http.getHeader( "User-Agent" );
if( ( ! string.startsWith( $ua, "Mozilla/" )
      && ! string.startsWith( $ua, "Opera/" ) )
    || ! http.getHeader( "Host" ) ) {
    rate.use( "spiders", request.getRemoteIP() );
}

If the service is running slowly, rate-shape users who have not placed items into their shopping cart with a global limit, and rate-shape other users to 8 requests per second each:

if( slm.conforming( "timer" ) < 80 ) {
    $cookie = http.getCookie( "Cart" );
    if( ! $cookie ) {
        rate.use( "casual users" );
    } else {
        # Get a unique id for the user
        $cookie = http.getCookie( "JSESSIONID" );
        rate.use( "8 per second", $cookie );
    }
}

Capability 4: Bandwidth Shaping

Feature Brief: Bandwidth and Rate Shaping in Traffic Manager allows Traffic Manager to limit the number of bytes per second used by inbound or outbound traffic, for an entire service, or by the type of request.
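For example, a minimal sketch (the 'downloads' Bandwidth class and the .iso test are illustrative assumptions):

# Shape bulk downloads only; other responses remain unshaped
if( string.endsWith( http.getPath(), ".iso" ) ) {
    response.setBandwidthClass( "downloads" );
}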
Bandwidth limits are automatically shared and enforced across all the Traffic Managers in a cluster. Individual Traffic Managers take different proportions of the total limit, depending on the load on each, and unused bandwidth is equitably allocated across the cluster depending on the need of each machine.

Like Request Rate Shaping, you can use Bandwidth Shaping to limit the activities of subsets of your users. For example, you may have a 1 Gbit/s network connection which is being over-utilized by a certain type of client, affecting the responsiveness of the service. You may therefore wish to limit the bandwidth available to those clients to 20 Mbit/s.

Using Bandwidth Shaping

Like Request Rate Shaping, you configure a Bandwidth class with a maximum bandwidth limit. Connections are allocated to a class as follows:

response.setBandwidthClass( "class name" );

All of the connections allocated to the class share the same bandwidth limit.

Example: Managing Flash Floods

The following example helps to mitigate the 'Slashdot effect', a common example of a flash flood problem. In this situation, a web site is overwhelmed by traffic as a result of a high-profile link (for example, from the Slashdot news site), and the level of service that regular users experience suffers as a result.

The example looks at the 'Referer' header, which identifies where a user has come from to access a web site. If the user has come from 'slashdot.org', he is tagged with a cookie so that all of his subsequent requests can be identified, and he is allocated to a low-bandwidth class:

$referrer = http.getHeader( "Referer" );
if( string.contains( $referrer, "slashdot.org" ) ) {
    http.addResponseHeader( "Set-Cookie", "slashdot=1" );
    connection.setBandwidthClass( "slashdot" );
}
if( http.getCookie( "slashdot" ) ) {
    connection.setBandwidthClass( "slashdot" );
}

For a more in-depth discussion, check out Detecting and Managing Abusive Referers.

Capability 5: Traffic Routing and Termination

Different levels of service can be provided by different traffic routing, or in extreme events, by dropping some requests.

For example, some large media sites provide different levels of content: high-bandwidth rich media versions of news stories are served during normal usage, and low-bandwidth versions are served when traffic levels are extremely high. Many websites provide Flash-enabled and simple HTML versions of their home page and navigation.

This is also commonplace when presenting content to a range of browsing devices with different capabilities and bandwidth.

The switch between high and low bandwidth versions could take place as part of a service policy: as the service begins to under-perform, some (or all) users could be forced onto the low-bandwidth versions so that a better level of service is maintained.

# Forcibly change requests that begin /high/ to /low/
$url = http.getPath();
if( string.startsWith( $url, "/high" ) ) {
    $url = string.replace( $url, "/high", "/low" );
    http.setPath( $url );
}

Example: Ticket Booking Systems

Ticket booking systems for major events often suffer enormous floods of demand when tickets become available.

You can use Traffic Manager's request rate shaping system to limit how many visitors are admitted to the service, and if the service becomes overwhelmed, you can send back a 'please try again' message rather than keeping the user 'on hold' in the queue indefinitely.
Suppose the 'booking' rate shaping class is configured to admit 10 users per second, and that users enter the booking process by accessing the URL /bookevent?eventID=<id>. This rule ensures that no user is queued for more than 30 seconds, by keeping the queue length to no more than 300 users (10 users/second * 30 seconds):

# limit how users can book events
$url = http.getPath();
if( $url == "/bookevent" ) {
    # How many users are already queued?
    if( rate.getBacklog( "booking" ) > 300 ) {
        http.redirect( "http://www.mysite.com/too_busy.html" );
    } else {
        rate.use( "booking" );
    }
}

Example: Prioritizing Resource Usage

In many cases, resources are limited, and even when a site is overwhelmed, users' requests still need to be served.

Consider the following scenario:

The site runs a cluster of 4 identical application servers (servers '1' to '4');
Users are categorized into casual visitors and customers; customers have a 'Cart' cookie, and casual visitors do not.

Our goal is to give all users the best possible level of service, but if customers begin to get a poor level of service, we want to prioritize them over casual visitors. We desire that more than 80% of customers get responses within 100ms.

This can be achieved by splitting the 4 servers into 2 pools: the 'allservers' pool contains servers 1 to 4, and the 'someservers' pool contains servers 1 and 2 only.

When the service is poor for the customers, we will restrict the casual visitors to just the 'someservers' pool. This effectively reserves the additional servers 3 and 4 for the customers' exclusive use.

The following code uses the 'response' SLM class to measure the level of service that customers receive:

$customer = http.getCookie( "Cart" );
if( $customer ) {
    connection.setServiceLevelClass( "response" );
    pool.use( "allservers" );
} else {
    if( slm.conforming( "response" ) < 80 ) {
        pool.use( "someservers" );
    } else {
        pool.use( "allservers" );
    }
}

Capability 6: Selective Traffic Optimization

Some of Traffic Manager's features can be used to improve the end user's experience, but they take up resources on the system:

Pulse Web Accelerator (Aptimizer) rewrites page content for faster download and rendering, but is very CPU-intensive.
Content Compression reduces the bandwidth used in responses and gives better response times, but it takes considerable CPU resources and can degrade performance.
Feature Brief: Traffic Manager Content Caching can give much faster responses, and it is possible to cache multiple versions of content for each user. However, this consumes memory on the system.

All of these features can be enabled and disabled on a per-user basis, as part of a service policy.

Pulse Web Accelerator (Stingray Aptimizer)

Use the http.aptimizer.bypass() and http.aptimizer.use() TrafficScript functions to control whether Traffic Manager will employ the Aptimizer optimization module for web content.

Note that these functions only refer to optimizations to the base HTML document (e.g. index.html, or other content of type text/html); all other resources are served as appropriate. For example, if a client receives an aptimized version of the base content and then requests the image sprites, Traffic Manager will always serve up the sprites.
# Optimize web content for clients based in Australia
$ip = request.getRemoteIP();
if( geo.getCountry( $ip ) == "Australia" ) {
    http.aptimizer.use( "All", "Remote Users" );
}

Content Compression

Use the http.compress.enable() and http.compress.disable() TrafficScript functions to control whether or not Traffic Manager will compress response content to the remote client.

Note that Traffic Manager will only compress content if the remote browser has indicated that it supports compression.

On a lightly loaded system, it's appropriate to compress all response content whenever possible:

http.compress.enable();

On a system where the CPU usage is becoming too high, you can selectively compress content:

# Don't compress by default
http.compress.disable();
if( $isvaluable ) {
    # do compress in this case
    http.compress.enable();
}

Content Caching

Traffic Manager can cache multiple different versions of an HTTP response. For example, if your home page is generated by an application that customizes it for each user, Traffic Manager can cache each version separately, and return the correct version from the cache for each user who accesses the page.

Traffic Manager's cache has a limited size so that it does not consume too much memory and cause performance to degrade. You may wish to prioritize which pages you put in the cache, using the http.cache.disable() and http.cache.enable() TrafficScript functions.

Note: you also need to enable Content Caching in your Virtual Server configuration; otherwise the TrafficScript cache control functions will have no effect.

# Get the user name
$user = http.getCookie( "UserName" );

# Don't cache any pages by default:
http.cache.disable();

if( $isvaluable ) {
    # Do cache these pages for better performance.
    # Each user gets a different version of the page, so we need to cache
    # the page indexed by the user name.
    http.cache.setkey( $user );
    http.cache.enable();
}

Custom Logging

A service policy can be complicated to construct and implement. The TrafficScript functions log.info(), log.warn() and log.error() are used to write messages to the event log, and so are very useful debugging tools to assist in developing complex TrafficScript rules.

For example, the following code:

if( $isvaluable && slm.conforming( "timer" ) < 70 ) {
    log.info( "User ".$user." needs priority" );
}

... will append a message like the following to your error log file:

$ tail $ZEUSHOME/zxtm/log/errors
[20/Jan/2013:10:24:46 +0000] INFO rulename rulelogmsginfo vsname User Jack needs priority

You can also inspect your error log file by viewing the 'Event Log' on the Admin Server.

When you are debugging a rule, you can use log.info() to print out progress messages as the rule executes. The log.info() function takes a string parameter; you can construct complex strings by appending variables and literals together using the '.' operator:

$msg = "Received ".connection.getDataLen()." bytes.";
log.info( $msg );

The functions log.warn() and log.error() are similar to log.info(). They prefix error log messages with a higher priority, either "WARN" or "ERROR", and you can filter and act on these using the Event Handling system.

You should be careful when printing out connection data verbatim, because the connection data may contain control characters or other non-printable characters. You can encode data using either 'string.hexEncode()' or 'string.escape()'; you should use 'string.hexEncode()' if the data is binary, and 'string.escape()' if the data contains readable text with a small number of non-printable characters.
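For example, a minimal sketch (the use of request.get() and the control-character test are illustrative assumptions):

# Log buffered client data without writing raw control characters
# into the event log
$data = request.get();
if( string.regexmatch( $data, '[\000-\010\016-\037]' ) ) {
    # Data looks binary: log it hex-encoded
    log.info( "Client data (hex): " . string.hexEncode( $data ) );
} else {
    # Mostly readable text: escape any stray control characters
    log.info( "Client data: " . string.escape( $data ) );
}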
Conclusion

Traffic Manager is a powerful toolkit for network and application administrators. This white paper has described a number of techniques that use the tools in the kit to solve a range of traffic valuation and prioritization tasks.

For more examples of how Traffic Manager and TrafficScript can manipulate and prioritize traffic, check out the Top Examples of Traffic Manager in action on the Pulse Community.
The Pulse vADC Community Edition is a free-to-download, free-to-use, full-featured virtual application delivery controller (ADC) solution, which you can use immediately to build smarter applications.  
In Stingray, each virtual server is configured to manage traffic of a particular protocol. For example, the HTTP virtual server type expects to see HTTP traffic, and automatically applies a number of optimizations (keepalive pooling, HTTP upgrades, pipelining) and offers a set of HTTP-specific functionality (caching, compression, etc).

A virtual server is bound to a specific port number (e.g. 80 for HTTP, 443 for HTTPS) and a set of IP addresses. Although you can configure several virtual servers to listen on the same port, they must be bound to different IP addresses; you cannot have two virtual servers bound to the same IP:port pair, as Stingray would not know which virtual server to route the traffic to.

"But I need to use one port for several different applications!"

Sometimes, perhaps due to firewall restrictions, you can't publish services on arbitrary ports. Perhaps you can only publish services on ports 80 and 443; all other ports are judged unsafe and are firewalled off. Furthermore, it may not be possible to publish several external IP addresses.

You need to accept traffic for several different protocols on the same IP:port pair, and each protocol needs a particular virtual server to manage it. How can you achieve this?

The scenario

Let's imagine you are hosting several very different services:

A plain-text web application that needs an HTTP virtual server listening on port 80
A second web application listening for HTTPS traffic on port 443
An XML-based service load-balanced across several servers listening on port 180
SSH login to a back-end server (this is a 'server-first' protocol) listening on port 22

Clearly, you'll need four different virtual servers (one for each service), but due to firewall limitations, all traffic must be tunnelled to port 80 on a single IP address. How can you resolve this?

The solution - version 1

The solution is relatively straightforward for the first three protocols. They are all 'client-first' protocols (see Feature Brief: Server First, Client First and Generic Streaming Protocols), so Stingray can read the initial data written from the client.

Virtual servers to handle individual protocols

First, create three internal virtual servers, listening on unused private ports (I've added 7000 to the public ports). Each virtual server should be configured to manage its protocol appropriately, and to forward traffic to the correct target pool of servers. You can test each virtual server by directing your client application at the correct port (e.g. http://stingray-ip-address:7080/), provided that you can access the relevant port (e.g. you are behind the firewall).

For security, you can bind these virtual servers to localhost so that they can only be accessed from the Stingray device.

A public 'demultiplexing' virtual server

Create three 'loopback' pools (one for each protocol), directing traffic to localhost:7080, localhost:7180 and localhost:7443.

Create a 'public' virtual server listening on port 80 that interrogates traffic using the following rule, and then selects the appropriate pool based on the data the clients send. The virtual server should be 'client first', meaning that it will wait for data from the client connection before triggering any rules:
# Get what data we have...
$data = request.get();

# SSL/TLS record layer:
# handshake(22), ProtocolVersion.major(3), ProtocolVersion.minor(0-3)
if( string.regexmatch( $data, '^\026\003[\000-\003]' ) ) {
    # Looks like SSLv3 or TLS v1/2/3
    pool.use( "Internal HTTPS loopback" );
}

if( string.startsWithI( $data, "<?xml" ) ) {
    # Looks like our XML-based protocol
    pool.use( "Internal XML loopback" );
}

if( string.regexmatch( $data, "^(GET |POST |PUT |DELETE |OPTIONS |HEAD )" ) ) {
    # Looks like HTTP
    pool.use( "Internal HTTP loopback" );
}

log.info( "Request: '".$data."' unrecognised!" );
connection.discard();

The 'Detect protocol' rule is triggered once we receive client data.

Now you can target all your client applications at port 80, tunnel through the firewall and demultiplex the traffic on the Stingray device.

The solution - version 2

You may have noticed that we omitted SSH from the first version of the solution.

SSH is a challenging protocol to manage in this way because it is 'server-first': the client connects and waits for the server to respond with a banner (greeting) before writing any data on the connection. This means that we cannot use the approach above to identify the protocol type before we select a pool.

However, there's a good workaround. We can modify the solution presented above so that it waits for client data. If it does not receive any data within (for example) 5 seconds, then we assume that the connection is the server-first SSH type.

First, create an 'SSH' virtual server and pool listening on (for example) port 7022 and directing traffic to your target SSH server (for example, localhost:22 - the local SSH daemon on the Stingray host). Note that this is a 'Generic server first' virtual server type, because that's the appropriate type for SSH.

Second, create an additional 'loopback' pool named 'Internal SSH loopback' that forwards traffic to localhost:7022 (the SSH virtual server).

Thirdly, reconfigure the public port-80 virtual server to be 'Generic streaming' rather than 'Generic client first'. This means that it will run the request rule immediately on a client connection, rather than waiting for client data.

Finally, update the request rule to read the client data. Because request.get() returns whatever is in the network buffer for client data, we spin and poll this buffer every 10 ms until we either get some data, or we time out after 5 seconds.

# Get what data we have...
$data = request.get();
$count = 500;
while( $data == "" && $count-- > 0 ) {
    connection.sleep( 10 ); # milliseconds
    $data = request.get();
}

if( $data == "" ) {
    # We've waited long enough... this must be a server-first protocol
    pool.use( "Internal SSH loopback" );
}

# SSL/TLS record layer:
# handshake(22), ProtocolVersion.major(3), ProtocolVersion.minor(0-3)
if( string.regexmatch( $data, '^\026\003[\000-\003]' ) ) {
    # Looks like SSLv3 or TLS v1/2/3
    pool.use( "Internal HTTPS loopback" );
}

if( string.startsWithI( $data, "<?xml" ) ) {
    # Looks like our XML-based protocol
    pool.use( "Internal XML loopback" );
}

if( string.regexmatch( $data, "^(GET |POST |PUT |DELETE |OPTIONS |HEAD )" ) ) {
    # Looks like HTTP
    pool.use( "Internal HTTP loopback" );
}

log.info( "Request: '".$data."' unrecognised!" );
connection.discard();

This solution isn't perfect (the spin-and-poll may incur a hit for a busy service over a slow network connection), but it's an effective solution to the single-port firewall problem, and it explains how to tunnel SSH over port 80 (not that you'd ever do such a thing, would you?).
Read more

Check out Feature Brief: Server First, Client First and Generic Streaming Protocols for background.
The WebSockets example (libWebSockets.rts: Managing WebSockets traffic with Traffic Manager) uses a similar approach to demultiplex WebSockets and HTTP traffic.
This article describes how to gather activity statistics across a cluster of traffic managers using Perl, SOAP::Lite and Traffic Manager's SOAP Control API.

Overview

Each local Traffic Manager tracks a very wide range of activity statistics. These may be exported using SNMP or retrieved using the System/Stats interface in Traffic Manager's SOAP Control API.

When you use the Activity monitoring in Traffic Manager's Administration Interface, a collector process communicates with each of the Traffic Managers in your cluster, gathering the local statistics from each and merging them before plotting them on the activity chart ('Aggregate data across all traffic managers').

However, when you use the SNMP or Control API interfaces directly, you will only receive the statistics from the Traffic Manager machine you have connected to. If you want to get a cluster-wide view of activity using SNMP or the Control API, you will need to poll each machine and merge the results yourself.

Using Perl and SOAP::Lite to query the traffic managers and merge activity statistics

The following code sample determines the total TCP connection rate across the cluster as follows:

Connect to the named traffic manager and use the getAllClusterMachines() method to retrieve a list of all of the machines in the cluster;
Poll each machine in the cluster for its current value of TotalConn (the total number of TCP connections processed since startup);
Sleep for 10 seconds, then poll each machine again;
Calculate the number of connections processed by each traffic manager in the 10-second window, and calculate the per-second rate accurately using high-resolution time.

The code:

#!/usr/bin/perl -w

use SOAP::Lite 0.6;
use Time::HiRes qw( time sleep );

$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;

my $userpass    = "admin:admin";     # SOAP-capable authentication credentials
my $adminserver = "stingray:9090";   # Details of an admin server in the cluster
my $sampletime  = 10;                # Sample time (seconds)

sub getAllClusterMembers( $$ );
sub makeConnections( $$$ );
sub makeRequest( $$ );

my $machines = getAllClusterMembers( $adminserver, $userpass );
print "Discovered cluster members " . ( join ", ", @$machines ) . "\n";

my $connections = makeConnections( $machines, $userpass,
    "http://soap.zeus.com/zxtm/1.0/System/Stats/" );

# sample the value of getTotalConn
my $start = time();
my $res1 = makeRequest( $connections, "getTotalConn" );
sleep( $sampletime - (time() - $start) );
my $res2 = makeRequest( $connections, "getTotalConn" );

# Determine connection rate per traffic manager
my $totalrate = 0;
foreach my $z ( keys %{$res1} ) {
    my $conns   = $res2->{$z}->result  - $res1->{$z}->result;
    my $elapsed = $res2->{$z}->{time}  - $res1->{$z}->{time};
    my $rate = $conns / $elapsed;
    $totalrate += $rate;
}

print "Total connection rate across all machines: " .
    sprintf( '%.2f', $totalrate ) . "\n";

sub getAllClusterMembers( $$ ) {
    my( $adminserver, $userpass ) = @_;

    # Discover cluster members
    my $mconn = SOAP::Lite
        -> ns('http://soap.zeus.com/zxtm/1.0/System/MachineInfo/')
        -> proxy("https://$userpass\@$adminserver/soap")
        -> on_fault( sub {
               my( $conn, $res ) = @_;
               die ref $res ? $res->faultstring : $conn->transport->status;
           } );
    $mconn->proxy->ssl_opts( SSL_verify_mode => 0 );

    my $res = $mconn->getAllClusterMachines();

    # $res->result is a reference to an array of System.MachineInfo.Machine
    # objects. Pull out the name/port of the traffic managers in our cluster
    my @machines = grep s@https://(.*?)/@$1@,
        map { $_->{admin_server}; } @{$res->result};
    return \@machines;
}

sub makeConnections( $$$ ) {
    my( $machines, $userpass, $ns ) = @_;

    my %conns;
    foreach my $z ( @$machines ) {
        $conns{ $z } = SOAP::Lite
            -> ns( $ns )
            -> proxy("https://$userpass\@$z/soap")
            -> on_fault( sub {
                   my( $conn, $res ) = @_;
                   die ref $res ? $res->faultstring : $conn->transport->status;
               } );
        $conns{ $z }->proxy->ssl_opts( SSL_verify_mode => 0 );
    }
    return \%conns;
}

sub makeRequest( $$ ) {
    my( $conns, $req ) = @_;

    my %res;
    foreach my $z ( keys %$conns ) {
        my $r = $conns->{$z}->$req();
        $r->{time} = time();
        $res{$z} = $r;
    }
    return \%res;
}

Running the script

$ ./getConnections.pl
Discovered cluster members stingray1-ny:9090, stingray1-sf:9090
Total connection rate across all machines: 5.02
This technical brief describes recommended techniques for installing, configuring and tuning Traffic Manager. You should also refer to the Product Documentation for detailed instructions on the installation process of Traffic Manager software.

Getting started

Hardware and Software requirements for Traffic Manager
Pulse Virtual Traffic Manager Kernel Modules for Linux

Software Tuning

Tuning Stingray Traffic Manager
Tuning Traffic Manager for best performance
Tech Tip: Where to find a master list of the Traffic Manager configuration keys

Tuning the operating system kernel

The following instructions only apply to Traffic Manager software running on a customer-supplied Linux or Solaris kernel:

Tuning the Linux operating system for Traffic Manager
Routing and Performance tuning for Traffic Manager on Linux
Tuning the Solaris operating system for Traffic Manager

Debugging procedures for Performance Problems

Tech Tip: Debugging Techniques for Performance Investigation

Load Testing

Load Testing recommendations for Traffic Manager

Conclusion

The Traffic Manager software and the operating system kernels both seek to optimize the use of the resources available to them, and there is generally little additional tuning necessary except when running in heavily loaded or performance-critical environments.

When tuning is required, the majority of tunings relate to the kernel and TCP stack, and are common to all networked applications. Experience and knowledge you have of tuning web servers and other applications on Linux or Solaris can be applied directly to Traffic Manager tuning, and skills that you gain working with Traffic Manager can be transferred to other situations.
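As an illustrative sketch only (the values are placeholders, not recommendations - consult the tuning guides linked above for values appropriate to your workload), these are the kinds of Linux TCP and kernel settings typically reviewed, e.g. in /etc/sysctl.conf:

# Illustrative Linux TCP/kernel settings commonly reviewed when
# tuning networked applications (placeholder values):
net.core.somaxconn = 1024                    # maximum listen backlog
net.ipv4.tcp_max_syn_backlog = 4096          # SYN queue for busy listeners
net.ipv4.ip_local_port_range = 1024 65535    # ephemeral ports for back-end connections
fs.file-max = 1000000                        # system-wide file descriptor limit

Changes made in /etc/sysctl.conf can be applied with 'sysctl -p'.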
The importance of good application design

TCP and kernel performance tuning will only help to a small degree if the application running over HTTP is poorly designed. Heavyweight web pages with large quantities of referenced content and scripts tend to deliver a poorer user experience and limit the capacity of the network to support large numbers of users.

Traffic Manager's Web Content Optimization capability ("Aptimizer") applies best-practice rules for content optimization dynamically, as the content is delivered by Traffic Manager. It applies browser-aware techniques to reduce bandwidth and TCP round-trips (image, CSS, JavaScript and HTML minification, image resampling, CSS merging, image spriting), and it automatically applies URL versioning and far-future expiry to ensure that clients cache all content and never needlessly request an update for a resource that has not changed.

Traffic Manager's Aptimizer is a general-purpose solution that complements TCP tuning to give better performance and a better service level. If you're serious about optimizing web performance, you should apply a range of techniques from layers 2-4 (network) up to layer 7 and beyond, to deliver the best possible end-user experience while maximizing the capacity of your infrastructure.