Pulse Secure vADC
Introduction

Do you ever face any of these requirements?

- "I want to guarantee, on a best-effort basis, certain levels of service for certain users."
- "I want to prioritize some transactions over others."
- "I want to restrict the activities of certain types of users."

This article explains that to address these problems, you must consider the following questions:

- "Under what circumstances do you want the policy to take effect?"
- "How do you wish to categorise your users?"
- "How do you wish to apply the differentiation?"

It then describes some of the measures you can take to monitor performance more deeply and apply prioritization to nominated traffic:

- Service Level Monitoring - measure system performance, and apply policies only when they are needed.
- Custom Logging - log and analyse activity to record and validate policy decisions.
- Application Traffic Inspection - determine source, user, content and value; XML processing with XPath searches and calculations.
- Request Rate Shaping - apply fine-grained rate limits for transactions.
- Bandwidth Control - allocate and reserve bandwidth.
- Traffic Routing and Termination - route high- and low-priority traffic differently; terminate undesired requests early.
- Selective Traffic Optimization - selective caching and compression.

Whether you are running an eCommerce web site, online corporate services or an internal intranet, there's always the need to squeeze more performance from limited resources and to ensure that your most valuable users get the best possible levels of service from the services you are hosting.

An example

Imagine that you are running a successful gaming service in a glamorous location. The usage of your service is growing daily, and many of your long-term users are becoming very valuable.

Unfortunately, much of your bandwidth and many of your server hits are taken up by competitors' robots that screen-scrape your betting statistics, and by poorly-written bots that spam your gaming tables and occasionally place low-value bets. At certain times of the day, this activity is so great that it impacts the quality of the service you deliver, and your most valuable customers are affected.

Using Traffic Manager to measure, classify and prioritize traffic, you can construct a service policy that comes into effect when your web site begins to run slowly and enforces different levels of service:

- Competitors' screen-scraping robots are tightly restricted to one request per second each. A ten-second delay reduces the value of the information they screen-scrape.
- Users who have not yet logged in are limited to a small proportion of your available bandwidth and directed to a pair of basic web servers, thus reserving capacity for users who are logged in.
- Users who have made large transactions in the past are tagged with a cookie, and the performance they receive is measured. If they are receiving poor levels of service (over 100ms response time), then some of the transaction servers are reserved for these high-value users, and the activity of other users is shaped by a system-wide queue.

Whether you are operating a gaming service, a content portal, a B2B or B2C eCommerce site or an internal intranet, this kind of service policy can help ensure that key customers get the best possible service, minimize the churn of valuable users and prevent undesirable visitors from harming the service to the detriment of others. A sketch of such a policy appears below; the rest of this article explains each capability in turn.
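Here is a minimal sketch of how such a policy might be assembled in a single request rule. The class names ('site speed', 'robots', 'visitors') and pool names are assumptions for illustration: each class must first be created and tuned in the catalog, and the User-Agent test is a placeholder for however you identify the scrapers.

# Measure everything against the 'site speed' SLM class (e.g. 100ms target)
connection.setServiceLevelClass( "site speed" );

# Known screen-scraping robots are rate-shaped per source IP
# (the 'robots' rate class would be configured at 1 request/second)
$ua = http.getHeader( "User-Agent" );
if( string.contains( $ua, "scraper" ) ) {
   rate.use( "robots", request.getRemoteIP() );
}

# Only differentiate further when the site is actually running slowly
if( slm.conforming( "site speed" ) > 90 ) {
   pool.use( "all servers" );
}

# Users who are not logged in share a restricted bandwidth class and
# a pair of basic web servers
if( ! http.getCookie( "Login" ) ) {
   response.setBandwidthClass( "visitors" );
   pool.use( "basic servers" );
}

# Everyone else (logged-in, potentially high-value users) gets the
# full set of transaction servers
pool.use( "all servers" );

Note that pool.use() terminates rule processing, so the conforming case falls straight through to the full server pool.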
Designing a service policy

- "I want to guarantee, on a best-effort basis, certain levels of service for certain users."
- "I want to prioritize some transactions over others."
- "I want to restrict the activities of certain users."

To address these problems, you must consider the following questions:

- Under what circumstances do you want the policy to take effect?
- How do you wish to categorise your users?
- How do you wish to apply the differentiation?

One or more TrafficScript rules can be used to apply the policy. They take advantage of the following features:

When does the policy take effect?

- Service Level Monitoring - measure system performance, and apply policies only when they are needed.
- Custom Logging - log and analyse activity to record and validate policy decisions.

How are users categorized?

- Application Traffic Inspection - determine source, user, content and value; XML processing with XPath searches and calculations.

How are they given different levels of service?

- Request Rate Shaping - apply fine-grained rate limits for transactions.
- Bandwidth Control - allocate and reserve bandwidth.
- Traffic Routing and Termination - route high- and low-priority traffic differently; terminate undesired requests early.
- Selective Traffic Optimization - selective caching and compression.

TrafficScript

TrafficScript (see the Feature Brief: TrafficScript) is the key to defining traffic management policies that implement these prioritization rules. TrafficScript brings together functionality to monitor and classify behavior, and then applies functionality to impose the appropriate prioritization rules.

For example, the following TrafficScript request rule inspects HTTP requests. If the request is for a .jsp page, the rule looks at the client's 'Priority' cookie and routes the request to the 'high-priority' or 'low-priority' server pool as appropriate:

$url = http.getPath();
if( string.endsWith( $url, ".jsp" ) ) {
   $cookie = http.getCookie( "Priority" );
   if( $cookie == "high" ) {
      pool.use( "high-priority" );
   } else {
      pool.use( "low-priority" );
   }
}

Generally, if you can describe the traffic management logic that you require, it is possible to implement it using TrafficScript.

Capability 1: Service Level Monitoring

Using Service Level Monitoring (see the Feature Brief: Service Level Monitoring), Traffic Manager can measure and react to changes in response times for your hosted services by comparing response times to a desired time.

You configure Service Level Monitoring by creating a Service Level Monitoring class (SLM class). The SLM class is configured with the desired response time (for example, 100ms) and some thresholds that define actions to take. For example, if fewer than 80% of requests meet the desired response time, Traffic Manager can log a warning; if fewer than 50% meet the desired time, Traffic Manager can raise a system alert.

Suppose that we were concerned about the performance of our Java servlets. We can configure an SLM class with the desired performance, and use it to monitor all requests for Java servlets:

$url = http.getPath();
if( string.startsWith( $url, "/servlet/" ) ) {
   connection.setServiceLevelClass( "Java servlets" );
}

You can then monitor the performance figures generated by the 'Java servlets' SLM class to discover the response times, and the proportion of requests that fall outside the desired response time.

Once requests are monitored by an SLM class, you can discover the proportion of requests that meet (or fail to meet) the desired response time within a TrafficScript rule.
This makes it possible to implement TrafficScript logic that is only called when services are underperforming.

Example: Simple Differentiation

Suppose we had a TrafficScript rule that tested whether a request came from a 'high value' customer.

When our service is running slowly, high-value customers should be sent to one server pool ('gold') and other customers sent to a lower-performing server pool ('bronze'). However, when the service is running at normal speed, we want to send all customers to all servers (the server pool named 'all servers').

The following TrafficScript rule shows how this logic can be implemented:

# Monitor all traffic with the 'response time' SLM class, which is
# configured with a desired response time of 200ms
connection.setServiceLevelClass( "response time" );

# Now, check the historical activity (last 10 seconds) to see if it's
# been acceptable (more than 90% of requests served within 200ms)
if( slm.conforming( "response time" ) > 90 ) {
   # select the 'all servers' server pool and terminate the rule
   pool.use( "all servers" );
}

# If we get here, things are running slowly.
# Here, we decide a customer is 'high value' if they have a login cookie,
# so we penalize customers who are not logged in. You can put your own
# test here instead.
$logincookie = http.getCookie( "Login" );
if( $logincookie ) {
   pool.use( "gold" );
} else {
   pool.use( "bronze" );
}

For a more sophisticated example of this technique, check out the article Dynamic rate shaping slow applications.

Capability 2: Application Traffic Inspection

There's no limit to how you can inspect and evaluate your traffic. Traffic Manager lets you look at any aspect of a client's request, so that you can then categorize clients as you need. For example:

# What is the client asking for?
$url = http.getPath();
# ... and the query string
$qs = http.getQueryString();

# Where has the client come from?
$referrer = http.getHeader( "Referer" );
$country = geo.getCountryCode( request.getRemoteIP() );

# What sort of browser is the client using?
$ua = http.getHeader( "User-Agent" );

# Is the client trying to spend more than $49.99?
if( http.getPath() == "/checkout.cgi"
    && http.getFormParam( "total" ) > 4999 ) ...

# What's the value of the CustomerName field in the XML purchase order
# in the SOAP request?
$body = http.getBody();
$name = xml.xpath.matchNodeSet( $body, "", "//Info/CustomerName/text()" );

# Take the name, post it to a database server with a web interface and
# inspect the response. Does the response contain the value 'Premium'?
$response = http.request.post( "http://my.database.server/query",
   "name=" . string.htmlEncode( $name ) );
if( string.contains( $response, "Premium" ) ) { ... }

Remembering the Classification with a Cookie

Often, it only takes one request to identify the status of a user, but you want to remember this decision for all subsequent requests. For example, if a user places an item in his shopping cart by accessing the URL '/cart.php', then you want to remember this information for all of his subsequent requests.

Adding a response cookie is the way to do this.
You can do this in either a request or response rule with the 'http.setResponseCookie()' function:

if( http.getPath() == "/cart.php" ) {
   http.setResponseCookie( "GotItems", "Yes" );
}

This cookie will be sent by the client on every subsequent request, so to test whether the user has placed items in his shopping cart, you just need to test for the presence of the 'GotItems' cookie in each request rule:

if( http.getCookie( "GotItems" ) ) { ... }

If necessary, you can encrypt and sign the cookie so that it cannot be spoofed or reused:

# Setting the cookie.
# Create an encryption key using the client's IP address and user agent.
# Encrypt the current time using the encryption key; it can only be
# decrypted using the same key.
$key = http.getHeader( "User-Agent" ) . ":" . request.getRemoteIP();
$encrypted = string.encrypt( sys.time(), $key );
$encoded = string.hexencode( $encrypted );
http.setResponseHeader( "Set-Cookie", "GotItems=" . $encoded );

# Validating the cookie
$isValid = 0;
if( $cookie = http.getCookie( "GotItems" ) ) {
   $encrypted = string.hexdecode( $cookie );
   $key = http.getHeader( "User-Agent" ) . ":" . request.getRemoteIP();
   $secret = string.decrypt( $encrypted, $key );
   # If the cookie has been tampered with, or the IP address or user
   # agent differ, string.decrypt will return an empty string.
   # If it worked and the data was less than 1 hour old, it's valid:
   if( $secret && sys.time() - $secret < 3600 ) {
      $isValid = 1;
   }
}

Capability 3: Request Rate Shaping

Having decided when to apply your service policy (using Service Level Monitoring), and classified your users (using Application Traffic Inspection), you now need to decide how to prioritize valuable users and penalize undesirable ones.

The Request Rate Shaping capability (see the Feature Brief: Bandwidth and Rate Shaping in Traffic Manager) is used to apply maximum request rates:

- on a global basis ("no more than 100 requests per second to my application servers");
- on a very fine-grained per-user or per-class basis ("no user can make more than 10 requests per minute to any of my statistics pages").

You can construct a service policy that places limits on a wide range of events, with very fine-grained control over how events are identified. You can impose per-second and per-minute rates on these events.

For example:

- You can rate-shape individual web spiders, to stop them overwhelming your web site. Each web spider, from each remote IP address, can be given a maximum request rate.
- You can throttle individual SMTP connections, or groups of connections from the same client, so that each connection is limited to a maximum number of sent emails per minute. You may also rate-shape new SMTP connections, so that a remote client can only establish new connections at a particular rate.
- You can apply a global rate-shape to the number of connections per second that are forwarded to an application.
- You can identify individual users' attempts to log in to a service, and then impede any dictionary-based login attacks by restricting each user to a limited number of attempts per minute.

Request rate limits are imposed using the TrafficScript rate.use() function, and you can configure per-second and per-minute limits in the rate class. Both limits are applied (note that if the per-minute limit is more than 60 times the per-second limit, it has no effect).

Using a Rate Class

Rate classes function as queues. When the TrafficScript rate.use() function is called, the connection is suspended and added to the queue that the rate class manages.
Connections are then released from the queue according to the per-second and per-minute limits.

There is no limit to the size of the backlog of queued connections. For example, if 1,000 requests arrived in quick succession to a rate class that admitted 10 per second, 990 of them would be immediately queued. Each second, 10 more requests would be released from the front of the queue.

While they are queued, connections may time out or be closed by the remote client. If this happens, they are immediately discarded.

You can use the rate.getBacklog() function to discover how many requests are currently queued. If the backlog is too large, you may decide to return an error page to the user rather than risk their connection timing out. For example, to rate-shape .jsp requests, but defer requests when the backlog gets too large:

$url = http.getPath();
if( string.endsWith( $url, ".jsp" ) ) {
   if( rate.getBacklog( "shape requests" ) > 100 ) {
      http.redirect( "http://mysite/too_busy.html" );
   } else {
      rate.use( "shape requests" );
   }
}

Rate Classes with Keys

In many circumstances, you may need to apply more fine-grained rate-shaping limits. For example, imagine a login page: we wish to limit how frequently each individual user can attempt to log in, to just 2 attempts per minute.

The rate.use() function can take an optional 'key' which identifies a specific instance of the rate class. This key can be used to create multiple, independent rate classes that share the same limits, but enforce them independently for each individual key.

For example, the 'login limit' class is restricted to 2 requests per minute, to limit how often each user can attempt to log in:

$url = http.getPath();
if( string.endsWith( $url, "login.cgi" ) ) {
   $user = http.getFormParam( "username" );
   rate.use( "login limit", $user );
}

This rule can help to defeat dictionary attacks where attackers try to brute-force crack a user's password. The rate-shaping limits are applied independently to each different value of $user. As each new user accesses the system, they are limited to 2 requests per minute, independently of all other users who share the "login limit" rate-shaping class.

For another example, check out The "Contact Us" attack against mail servers.

Applying service policies with rate shaping

Of course, once you've classified your users, you can apply different rate settings to different categories of users:

# If they have an odd-looking user agent, or if there's no Host header,
# the client is probably a web spider. Limit it to 1 request per second.
$ua = http.getHeader( "User-Agent" );
if( ( ! string.startsWith( $ua, "Mozilla/" )
      && ! string.startsWith( $ua, "Opera/" ) )
    || ! http.getHeader( "Host" ) ) {
   rate.use( "spiders", request.getRemoteIP() );
}

If the service is running slowly, rate-shape users who have not placed items into their shopping cart with a global limit, and rate-shape other users to 8 requests per second each:

if( slm.conforming( "timer" ) < 80 ) {
   $cookie = http.getCookie( "Cart" );
   if( ! $cookie ) {
      rate.use( "casual users" );
   } else {
      # Get a unique id for the user
      $cookie = http.getCookie( "JSPSESSIONID" );
      rate.use( "8 per second", $cookie );
   }
}

Capability 4: Bandwidth Shaping

Bandwidth shaping (see the Feature Brief: Bandwidth and Rate Shaping in Traffic Manager) allows Traffic Manager to limit the number of bytes per second used by inbound or outbound traffic, for an entire service, or by the type of request.
Bandwidth limits are automatically shared and enforced across all the Traffic Managers in a cluster. Individual Traffic Managers take different proportions of the total limit, depending on the load on each, and unused bandwidth is equitably allocated across the cluster depending on the need of each machine.

Like request rate shaping, you can use bandwidth shaping to limit the activities of subsets of your users. For example, you may have a 1 Gbit/s network connection which is being over-utilized by a certain type of client, affecting the responsiveness of the service. You may therefore wish to limit the bandwidth available to those clients to 20 Mbit/s.

Using Bandwidth Shaping

As with request rate shaping, you configure a bandwidth class with a maximum bandwidth limit. Connections are allocated to a class as follows:

response.setBandwidthClass( "class name" );

All of the connections allocated to the class share the same bandwidth limit.

Example: Managing Flash Floods

The following example helps to mitigate the 'Slashdot effect', a common example of a flash flood problem. In this situation, a web site is overwhelmed by traffic as a result of a high-profile link (for example, from the Slashdot news site), and the level of service that regular users experience suffers as a result.

The example looks at the 'Referer' header, which identifies where a user has come from to access a web site. If the user has come from 'slashdot.org', he is tagged with a cookie so that all of his subsequent requests can be identified, and he is allocated to a low-bandwidth class:

$referrer = http.getHeader( "Referer" );
if( string.contains( $referrer, "slashdot.org" ) ) {
   http.addResponseHeader( "Set-Cookie", "slashdot=1" );
   connection.setBandwidthClass( "slashdot" );
}
if( http.getCookie( "slashdot" ) ) {
   connection.setBandwidthClass( "slashdot" );
}

For a more in-depth discussion, check out Detecting and Managing Abusive Referers.

Capability 5: Traffic Routing and Termination

Different levels of service can be provided by different traffic routing or, in extreme events, by dropping some requests.

For example, some large media sites provide different levels of content: high-bandwidth rich media versions of news stories are served during normal usage, and low-bandwidth versions are served when traffic levels are extremely high. Many websites provide Flash-enabled and simple HTML versions of their home page and navigation.

This is also commonplace when presenting content to a range of browsing devices with different capabilities and bandwidth.

The switch between high- and low-bandwidth versions could take place as part of a service policy: as the service begins to under-perform, some (or all) users could be forced onto the low-bandwidth versions so that a better level of service is maintained.

# Forcibly change requests that begin /high/ to /low/
$url = http.getPath();
if( string.startsWith( $url, "/high" ) ) {
   $url = string.replace( $url, "/high", "/low" );
   http.setPath( $url );
}

Example: Ticket Booking Systems

Ticket booking systems for major events often suffer enormous floods of demand when tickets become available.

You can use Stingray's request rate shaping system to limit how many visitors are admitted to the service, and if the service becomes overwhelmed, you can send back a 'please try again' message rather than keeping the user 'on hold' in the queue indefinitely.
Suppose the 'booking' rate shaping class is configured to admit 10 users per second, and that users enter the booking process by accessing the URL /bookevent?eventID=<id>. This rule ensures that no user is queued for more than 30 seconds, by keeping the queue length to no more than 300 users (10 users/second * 30 seconds):

# Limit how users can book events
$url = http.getPath();
if( $url == "/bookevent" ) {
   # How many users are already queued?
   if( rate.getBacklog( "booking" ) > 300 ) {
      http.redirect( "http://www.mysite.com/too_busy.html" );
   } else {
      rate.use( "booking" );
   }
}

Example: Prioritizing Resource Usage

In many cases, resources are limited, and when a site is overwhelmed, users' requests still need to be served.

Consider the following scenario:

- The site runs a cluster of 4 identical application servers (servers '1' to '4');
- Users are categorized into casual visitors and customers; customers have a 'Cart' cookie, and casual visitors do not.

Our goal is to give all users the best possible level of service, but if customers begin to get a poor level of service, we want to prioritize them over casual visitors. We desire that more than 80% of customers get responses within 100ms.

This can be achieved by splitting the 4 servers into 2 pools: the 'allservers' pool contains servers 1 to 4, and the 'someservers' pool contains servers 1 and 2 only.

When the service is poor for the customers, we will restrict the casual visitors to just the 'someservers' pool. This effectively reserves the additional servers 3 and 4 for the customers' exclusive use.

The following code uses the 'response' SLM class to measure the level of service that customers receive:

$customer = http.getCookie( "Cart" );
if( $customer ) {
   connection.setServiceLevelClass( "response" );
   pool.use( "allservers" );
} else {
   if( slm.conforming( "response" ) < 80 ) {
      pool.use( "someservers" );
   } else {
      pool.use( "allservers" );
   }
}

Capability 6: Selective Traffic Optimization

Some of Traffic Manager's features can be used to improve the end user's experience, but they take up resources on the system:

- Pulse Web Accelerator (Aptimizer) rewrites page content for faster download and rendering, but is very CPU-intensive.
- Content compression reduces the bandwidth used in responses and gives better response times, but it takes considerable CPU resources and can degrade performance.
- Content caching (see the Feature Brief: Traffic Manager Content Caching) can give much faster responses, and it is possible to cache multiple versions of content for each user. However, this consumes memory on the system.

All of these features can be enabled and disabled on a per-user basis, as part of a service policy.

Pulse Web Accelerator (Stingray Aptimizer)

Use the http.aptimizer.bypass() and http.aptimizer.use() TrafficScript functions to control whether Traffic Manager will employ the Aptimizer optimization module for web content.

Note that these functions only refer to optimizations to the base HTML document (e.g. index.html, or other content of type text/html); all other resources will be served as appropriate. For example, if a client receives an aptimized version of the base content and then requests the image sprites, Traffic Manager will always serve up the sprites.
# Optimize web content for clients based in Australia
$ip = request.getRemoteIP();
if( geo.getCountry( $ip ) == "Australia" ) {
   http.aptimizer.use( "All", "Remote Users" );
}

Content Compression

Use the http.compress.enable() and http.compress.disable() TrafficScript functions to control whether or not Traffic Manager will compress response content for the remote client.

Note that Traffic Manager will only compress content if the remote browser has indicated that it supports compression.

On a lightly loaded system, it's appropriate to compress all response content whenever possible:

http.compress.enable();

On a system where the CPU usage is becoming too high, you can selectively compress content:

# Don't compress by default
http.compress.disable();
if( $isvaluable ) {
   # Do compress in this case
   http.compress.enable();
}

Content Caching

Traffic Manager can cache multiple different versions of an HTTP response. For example, if your home page is generated by an application that customizes it for each user, Traffic Manager can cache each version separately, and return the correct version from the cache for each user who accesses the page.

Traffic Manager's cache has a limited size so that it does not consume too much memory and cause performance to degrade. You may wish to prioritize which pages you put in the cache, using the http.cache.disable() and http.cache.enable() TrafficScript functions.

Note: you also need to enable content caching in your virtual server configuration; otherwise the TrafficScript cache control functions will have no effect.

# Get the user name
$user = http.getCookie( "UserName" );

# Don't cache any pages by default:
http.cache.disable();

if( $isvaluable ) {
   # Do cache these pages for better performance.
   # Each user gets a different version of the page, so we need to cache
   # the page indexed by the user name.
   http.cache.setkey( $user );
   http.cache.enable();
}

Custom Logging

A service policy can be complicated to construct and implement.

The TrafficScript functions log.info(), log.warn() and log.error() are used to write messages to the event log, and so are very useful debugging tools to assist in developing complex TrafficScript rules.

For example, the following code:

if( $isvaluable && slm.conforming( "timer" ) < 70 ) {
   log.info( "User " . $user . " needs priority" );
}

... will append the following message to your error log file:

$ tail $ZEUSHOME/zxtm/log/errors
[20/Jan/2013:10:24:46 +0000] INFO rulename rulelogmsginfo vsname User Jack needs priority

You can also inspect your error log file by viewing the 'Event Log' on the Admin Server.

When you are debugging a rule, you can use log.info() to print out progress messages as the rule executes. The log.info() function takes a string parameter; you can construct complex strings by appending variables and literals together using the '.' operator:

$msg = "Received " . connection.getDataLen() . " bytes.";
log.info( $msg );

The functions log.warn() and log.error() are similar to log.info(). They prefix error log messages with a higher priority, either "WARN" or "ERROR", and you can filter and act on these using the Event Handling system.

You should be careful when printing out connection data verbatim, because the connection data may contain control characters or other non-printable characters.
You can encode data using either string.hexEncode() or string.escape(); you should use string.hexEncode() if the data is binary, and string.escape() if the data contains readable text with a small number of non-printable characters.

Conclusion

Traffic Manager is a powerful toolkit for network and application administrators. This white paper describes a number of techniques that use tools in the kit to solve a range of traffic valuation and prioritization tasks.

For more examples of how Traffic Manager and TrafficScript can manipulate and prioritize traffic, check out the Top Examples of Traffic Manager in action on the Pulse Community.
On 27th February 2006, we took part in VMware's launch of their Virtual Appliance initiative. Riverbed Stingray (or 'Zeus Extensible Traffic Manager / ZXTM') was the first ADC product packaged as a virtual appliance, and we were delighted to be a launch partner with VMware, and to gain certification in November 2006 when they opened up their third-party certification program.

We had to synchronize the release of our own community web content with VMware's website launch, which was scheduled for 9pm PST. That's 5am in our Cambridge UK dev labs!

With a simple bit of TrafficScript, we were able to test and review our new web content internally before the release, and make the new content live at 5am while everyone slept soundly in their beds.

Our problem...

The community website we operated was a reasonably sophisticated website. It was based on a blogging engine, and the configuration and content for the site was split between the filesystem and a local database. Content was served up from the database via the website and an RSS feed. To add a new section with new content to the site, it was necessary to coordinate a number of changes to the filesystem and the database together.

We wanted to make the new content live for external visitors at 5am on Monday 27th, but we also wanted the new content to be visible internally and to selected partners before the release, so that we could test and review it.

The obvious solution of scripting the database and filesystem changes to take place at 5am was not satisfactory. It was hard to test on the live site, and it did not let us publish the new site internally beforehand.

How we did it

We had a couple of options.

We have a staging system that we use to develop new code and content before putting it on the live site. This system has its own database and filesystem, and when we publish a change, we copy the new settings to the live site manually. We could have elected to run the new site on the staging system, and use Stingray to direct traffic to the live or staging server as appropriate:

if( $usenew ) {
   pool.use( "Staging server" );
} else {
   pool.use( "DMZ Server" );
}

However, this option would have exposed our staging website (running on a developer's desktop behind the DMZ) to live traffic, and created a vulnerable single point of failure. Instead, we modified the current site so that it could select the database to use based on the presence of an HTTP header:

$usenew = 0;

# For requests from internal users, always use the new site
$remoteip = request.getremoteip();
if( string.ipmaskmatch( $remoteip, "10.0.0.0/8" ) ) {
   $usenew = 1;
}

# If it's after 5am on the 27th, always use the new site.
# Fix this before the 1st of the following month!
if( sys.time.monthday() == 27 && sys.time.hour() >= 5 ) {
   $usenew = 1;
}
if( sys.time.monthday() > 27 ) {
   $usenew = 1;
}

http.removeHeader( "NEWSITE" );
if( $usenew ) {
   http.addHeader( "NEWSITE", "1" );
}

PHP code

// The PHP code overrides the database host
$DB_HOST = "livedb.internal.zeus.com";
if( isset( $_ENV['HTTP_NEWSITE'] ) &&
    ( $_ENV['HTTP_HOST'] == 'knowledgehub.zeus.com' ) ) {
   $DB_HOST = "stagedb.internal.zeus.com";
}

This way, only the secured DMZ webserver processed external traffic, but it would use the internal staging database for the new content.

Did it work?

Of course it did! Because we used Stingray to categorize the traffic, we could safely test the new content, confident that the switchover would be seamless.
No one was awake at 5am when the site went live, but traffic to the site jumped after the launch.
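As the comment in the rule notes, the date-based check needs manual cleanup after launch. A more robust variant (a sketch, not the rule we used; the epoch timestamp shown is a hypothetical value you would compute for your own launch time) compares sys.time() against an absolute Unix timestamp:

# Launch time as seconds since the Unix epoch, e.g. computed with
# `date -d "2006-02-27 05:00 GMT" +%s` (hypothetical value shown)
$launch = 1141016400;

$usenew = 0;

# Internal users always see the new site
if( string.ipmaskmatch( request.getremoteip(), "10.0.0.0/8" ) ) {
   $usenew = 1;
}

# Everyone sees the new site once the launch time has passed;
# no end-of-month cleanup required
if( sys.time() >= $launch ) {
   $usenew = 1;
}

http.removeHeader( "NEWSITE" );
if( $usenew ) {
   http.addHeader( "NEWSITE", "1" );
}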
Introduction

Many DDoS attacks work by exhausting the resources available to a website for handling new connections. In most cases, the tool used to generate this traffic can make HTTP requests and follow HTTP redirects, but lacks the sophistication to store cookies. As such, one of the most effective ways of combatting DDoS attacks is to drop connections from clients that don't store cookies during a redirect.

Before you Proceed

It's important to point out that using the solution herein may prevent at least the following legitimate uses of your website (and possibly others):

- Visits by user-agents that do not support cookies, or where cookies are disabled for any reason (such as privacy); some people may think that your website has gone down!
- Visits by internet search engine web-crawlers; this will prevent new content on your website from appearing in search results!

If either of the above items concern you, I would suggest seeking advice (either from the community, or through your technical support channels). A sketch of one way to exempt known crawlers appears at the end of this article.

Solution Planning

Implementing a solution in pure TrafficScript will prevent traffic from reaching the web servers. But attackers are still free to consume connection-handling resources on the traffic manager. To make the solution more robust, we can use iptables to block traffic a bit earlier in the network stack. This solution presents us with a couple of challenges:

- TrafficScript cannot execute shell commands, so how do we add rules to iptables?
- Assuming we don't want to permanently block all IP addresses that are involved in a DDoS attack, how can we expire the rules?

Even though TrafficScript cannot directly run shell commands, the Event Handling system can. We can use the event.emit() TrafficScript function to send jobs to a custom event handler shell script that adds an iptables rule blocking the offending IP address. To expire each rule, we can use the at command to schedule a job that removes it. This hands the scheduling and running of that job over to the control of the OS (which is something it was designed to do).

The overall plan looks like this:

- Write a TrafficScript rule that emits a custom event when it detects a client that doesn't store cookies during a redirect
- Write a shell script that takes as its input: an --eventtype argument (the event handler includes this automatically), a --duration argument (to define the length of time that an IP address stays blocked for), and a string of information that includes the IP address that is to be blocked
- Create an event handler for the events that our TrafficScript is going to emit

TrafficScript Code

$cookie = http.getCookie( "DDoS-Test" );
if ( ! $cookie ) {
   # Either it's the visitor's first time to the site, or they don't support cookies
   $test = http.getFormParam( "cookie-test" );
   if ( $test != "1" ) {
      # It's their first time. Set the cookie, redirect to the same page
      # and add a query parameter so we know they have been redirected.
      # Note: if they supplied a query string or used a POST,
      # we'll respond with a bare redirect
      $path = http.getPath();
      http.sendResponse( "302 Found", "text/plain", "",
         "Location: " . string.escape( $path ) .
         "?cookie-test=1\r\nSet-Cookie: DDoS-Test=1" );
   } else {
      # We've redirected them and attempted to set the cookie, but they have not
      # accepted. Either they don't support cookies, or (more likely) they are a bot.

      # Emit the custom event that will trigger the firewall script.
      event.emit( "firewall", request.getremoteip() );

      # Pause the connection for 100 ms to give the firewall time to catch up.
      # Note: this may need tuning.
      connection.sleep( 100 );

      # Close the connection.
      connection.close( "HTTP/1.1 200 OK\n" );
   }
}

Installation

This code will need to be applied to the virtual server as a request rule. To do that, take the following steps:

1. In the traffic manager GUI, navigate to Catalogs → Rule
2. Enter ts-firewaller in the Name field
3. Click the Use TrafficScript radio button
4. Click the Create Rule button
5. Paste the code from the attached ts-firewaller.rts file
6. Click the Save button
7. Navigate to the Virtual Server that you want to protect (Services → <Service Name>)
8. Click the Rules link
9. In the Request Rules section, select ts-firewaller from the drop-down box
10. Click the Add Rule button

Your virtual server should now be configured to execute the rule.

Shell Script Code

#!/bin/bash

# Use getopt to collect parameters.
params=`getopt -o e:,d: -l eventtype:,duration: -- "$@"`

# Evaluate the set of parameters.
eval set -- "$params"
while true; do
  case "$1" in
  --duration ) DURATION="$2"; shift 2 ;;
  --eventtype ) EVENTTYPE="$2"; shift 2 ;;
  -- ) shift; break ;;
  * ) break ;;
  esac
done

# Awk the IP address out of ARGV
IP=$(echo "${BASH_ARGV}" | awk '{ print ( $(NF) ) }')

# Add a new rule to the INPUT chain.
iptables -A INPUT -s ${IP} -j DROP &&

# Queue a new job to delete the rule after DURATION minutes.
# Prevents warning about executing the command using /bin/sh from
# going in the traffic manager event log.
echo "iptables -D INPUT -s ${IP} -j DROP" | at -M now + ${DURATION} minutes &> /dev/null

Installation

To use this script as an action program, you'll need to upload it via the GUI. To do that, take the following steps:

1. Open a new file with the editor of your choice (depends on what OS you're using)
2. Copy and paste the script code into the editor
3. Save the file as ts-firewaller.sh
4. In the traffic manager UI, navigate to Catalogs → Extra Files → Action Programs
5. Click the Choose File button
6. Select the ts-firewaller.sh file that you just created
7. Click the Upload Program button

Event Handler

Now that we have a rule that emits a custom event, and a script that we can use as an action program, we can configure the event handler that ties the two together.

First, we need to create a new event type:

1. In the traffic manager's UI, navigate to System → Alerting
2. Click the Manage Event Types button
3. Enter Firewall in the Name field
4. Click the Add Event Type button
5. Click the + next to the Custom Events item in the event tree
6. Click the Some custom events... radio button
7. Enter firewall in the empty field
8. Click the Update button

Now that we have an event type, we need to create a new action:

1. In the traffic manager UI, navigate to System → Alerting
2. Click on the Manage Actions button
3. In the Create New Action section, enter firewall in the Name field
4. Click the Program radio button
5. Click the Add Action button
6. In the Program Arguments section, enter duration in the Name field
7. Enter "Determines the length of time in minutes that an IP will be blocked for" in the Description field
8. Click the Update button
9. Enter 10 in the newly-appeared arg!duration field
10. Click the Update button

Now that we have an action configured, the only thing left to do is to connect the custom event to the new action:

1. In the traffic manager UI, navigate to System → Alerting
2. In the Event Type column, select firewall from the drop-down box
3. In the Actions column, select firewall from the drop-down box
4. Click the Update button

That concludes the installation steps; this solution should now be live!

Testing

Testing the functionality is pretty simple for this solution. Basically, you can monitor the state of iptables while you run specific commands from a command line. To do this, ssh into your traffic manager and execute iptables -L as root. You should check this after each of the upcoming tests.

Since I'm using a Linux machine for testing, I'm going to use the curl command to send crafted requests to my traffic manager. The 3 scenarios that I want to test are:

1. Initial visit: the user-agent has no query string and no cookie
2. Successful second visit: the user-agent has a query string and has provided the correct cookie
3. Failed second visit: the user-agent has a query string (indicating that it was redirected), but hasn't provided a cookie

The respective curl commands that need to be run are:

curl -v http://<tmhost>/
curl -v http://<tmhost>/?cookie-test=1 -b "DDoS-Test=1"
curl -v http://<tmhost>/?cookie-test=1

Note: if you run these commands from your workstation, you will be unable to connect to the traffic manager in any way for a period of 10 minutes!
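As noted in 'Before you Proceed', this rule will also lock out legitimate cookie-less clients such as search-engine crawlers. One possible mitigation (a sketch, not part of the original solution; the User-Agent substrings are illustrative, and User-Agent headers can be spoofed, so this weakens the defence) is to skip the cookie test for known crawlers at the top of the request rule:

# Hypothetical allow-list of crawler User-Agent substrings; extend to taste.
# Note: User-Agent values can be spoofed, so consider verifying crawlers
# by other means (e.g. reverse DNS) before relying on this in production.
$ua = http.getHeader( "User-Agent" );
if( string.contains( $ua, "Googlebot" )
    || string.contains( $ua, "bingbot" ) ) {
   # Skip the cookie test for these clients
   break;
}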
Stingray's TrafficScript rules can inspect and modify an entire request and response stream. This provides many opportunities for securing content against unauthorized breaches. For example, over a period of 9 months, a hacker named Nicolas Jacobsen used a compromised customer account on T-Mobile's servers to exploit a vulnerability and leach a large amount of sensitive information (see http://www.securityfocus.com/news/10271). This information included US Secret Service documents and customer records including their Social Security numbers.

This article describes how to use a simple TrafficScript rule to detect and mask out suspicious data in a response.

The TrafficScript rule

Here is a simple rule to remove social security numbers from any web documents served from a CGI script:

if( string.contains( http.getPath(), "/cgi-bin/" ) ) {
   $payload = http.getResponseBody();
   $new_response = string.regexsub( $payload, "\\d{3}-\\d{2}-\\d{4}",
                                    "xxx-xx-xxxx", "g" );
   if( $new_response != $payload )
      http.setResponseBody( $new_response );
}

Configure this rule as a 'response rule' for a virtual server that handles HTTP traffic.

How it works

How does this simple-looking TrafficScript rule work? The specification for the rule is: if the request is for a resource in /cgi-bin/, then mask anything in the response that looks like a social security number. In this case, we recognize social security numbers as sequences of digits and '-' (for example, '123-45-6789') and we replace them with 'xxx-xx-xxxx'.

1. If the request is for a resource in /cgi-bin/:

if( string.contains( http.getPath(), "/cgi-bin/" ) ) {

The http.getPath() function returns the name of the HTTP request, having removed any %-encoding which obscures the request. You can use this function in a request or response rule. The string.contains() test checks whether the request is for a resource in /cgi-bin/.

2. Get the entire response:

$payload = http.getResponseBody();

The http.getResponseBody() function reads the entire HTTP response. It seamlessly handles cases where no content length is provided, and it dechunks a chunked-transfer-encoded response; these are common cases when handling responses from dynamic web pages and applications. It interoperates perfectly with performance features like HTTP keepalive connections and pipelined requests.

3. Replace any social security numbers:

$new_response = string.regexsub( $payload, "\\d{3}-\\d{2}-\\d{4}",
                                 "xxx-xx-xxxx", "g" );

The string.regexsub() function applies a regular expression substitution to the $payload data, replacing potential social security numbers with anonymous data. Regular expressions are commonly used to inspect and manipulate textual data, and Stingray supports the full POSIX regular expression specification.

4. Change the response:

if( $new_response != $payload )
   http.setResponseBody( $new_response );

The http.setResponseBody() function replaces the HTTP response with the supplied data. You can safely replace the response with a message of a different length; Stingray will take care of the Content-Length header, as well as compressing and SSL-encrypting the response as required. http.setResponseBody() interoperates with keepalives and pipelined requests.

In action...

Here is the vulnerable application, before (left) and after (right) the TrafficScript rule is applied:

[Figure: masking social security numbers with a string of 'xxx']

Summary

Although Stingray is not a total application security solution (look to the Stingray Application Firewall for this), this example demonstrates how Stingray Traffic Manager can be used as one layer in a larger belt-and-braces system. Stingray is one location where security measures can be added very easily, perhaps as a rapid reaction to a vulnerability elsewhere in the network, patching over the problem until a more permanent solution can be deployed.

In a real deployment, you might do something firmer than masking content. For example, if a web page contains unexpected, sensitive data, it might be best to forcibly redirect the client to the home page of your application, to avoid the risk of any sensitive content being leaked. A sketch of this approach follows.
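Here is a minimal sketch of that firmer approach, reusing the same /cgi-bin/ scope and SSN pattern as the rule above; the redirect target '/' is illustrative, and the http.sendResponse() pattern follows the redirect technique shown elsewhere on this page:

# Response rule: if a response unexpectedly contains something that looks
# like an SSN, log it and send the client to the home page instead of
# serving (or masking) the sensitive content
if( string.contains( http.getPath(), "/cgi-bin/" ) ) {
   $payload = http.getResponseBody();
   if( string.regexmatch( $payload, "\\d{3}-\\d{2}-\\d{4}" ) ) {
      log.warn( "Possible SSN leak in " . http.getPath() );
      http.sendResponse( "302 Found", "text/plain", "", "Location: /" );
   }
}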
Google Analytics is a great tool for monitoring and tracking visitors to your web sites. Perhaps best of all, it's entirely web-based: you only need a web browser to access the analysis services it provides.

To enable tracking for your web sites, you need to embed a small fragment of JavaScript code in every web page. This extension makes this easy, by inspecting all outgoing content and inserting the code into each HTML page, while honoring the user's 'Do Not Track' preference.

Installing the Extension

Requirements

This extension has been tested against Stingray Traffic Manager 9.1, and should function with all versions from 7.0 onwards.

Installation

Copy the contents of the User Analytics rule below, open it in an editor, and paste the contents into a new response rule.

Verify that the extension is functioning correctly by accessing a page through the traffic manager and using 'View Source' to check that the Google Analytics code has been added near the top of the document, just before the closing </head> tag.

User Analytics rule

# Edit the following to set your profile ID
$defaultProfile = "UA-123456-1";

# You may override the profile ID on a site-by-site basis here
$overrideProfile = [
   "support.mysite.com" => "UA-123456-2",
   "secure.mysite.com" => "UA-123456-3"
];

# End of configuration settings

# Only process text/html responses
$contentType = http.getResponseHeader( "Content-Type" );
if( !string.startsWith( $contentType, "text/html" ) ) break;

# Honor any Do-Not-Track preference
$dnt = http.getHeader( "DNT" );
if( $dnt == "1" ) break;

# Determine the correct $uacct profile ID
$uacct = $overrideProfile[ http.getHostHeader() ];
if( !$uacct ) $uacct = $defaultProfile;

# See http://www.google.com/support/googleanalytics/bin/answer.py?answer=174090
$script = '
<script type="text/javascript">
   var _gaq = _gaq || [];
   _gaq.push(["_setAccount", "' . $uacct . '"]);
   _gaq.push(["_trackPageview"]);
   (function() {
      var ga = document.createElement("script");
      ga.type = "text/javascript";
      ga.async = true;
      ga.src = ("https:" == document.location.protocol ? "https://ssl" : "http://www") +
               ".google-analytics.com/ga.js";
      var s = document.getElementsByTagName("script")[0];
      s.parentNode.insertBefore(ga, s);
   })();
</script>';

$body = http.getResponseBody();

# Find the location of the closing '</head>' tag
$i = string.find( $body, "</head>" );
if( $i == -1 ) $i = string.findI( $body, "</head>" );
if( $i == -1 ) break; # Give up

http.setResponseBody( string.left( $body, $i ) . $script . string.skip( $body, $i ) );

For some extensions to this rule, check out Faisal Memon's article Google Analytics revisited.
Introduction

While I was thinking of writing an article on how to use the traffic manager to satisfy EU cookie regulations, I figured "somebody else has probably done all the hard work". Sure enough, a quick search turned up an excellent and (more importantly) free utility called cookiesDirective.js. In addition to cookiesDirective.js being patently nifty, its website left me with a nostalgic craving for a short, wide glass of milk.

Background

If you're reading this article, you probably have a good idea of why you might want (or need) to disclose to your users that your site uses cookies. You should visit http://cookiesdirective.com to gain a richer understanding of what the cookiesDirective script actually does and why you might want to use it. For the impatient, let's just assume that you're perfectly happy for random code to run in your visitors' browsers.

Requirements

- A website.
- A TrafficScript-enabled traffic manager, configured to forward traffic to your web servers.

Preparation

According to the directions, one must follow "just a simple 3-step process" in order to use cookiesDirective.js:

1. Move cookie-generating JavaScript in your page (such as Google Analytics) into a separate file, and pass the name of that file to a function that loads it before the closing </head> tag in the HTML body. Basically, this makes it possible to display the cookie disclosure message before the cookie-generating code gets run by the browser. That much moving of code around is not within the scope of this article. For now, let's assume that displaying the message to the user is "good enough".
2. Add a snippet of code to the end of your HTML body that causes the browser to download cookiesDirective.js. In the example code, it gets downloaded directly from cookiesdirective.com, but you should really download it and host it on your own web server if you're going to be using it in production.
3. Add another snippet of code that runs the JavaScript. This is the bit that causes the popup to appear.

The Goods

# The path to your home page
$homepath = '/us/';

# The location on the page where the cookie notification should appear (top or bottom)
$noticelocation = 'bottom';

# The URL that contains your privacy statement
$privacyURL = 'http://www.riverbed.com/us/privacy_policy.php';

# ==== DO NOT EDIT BELOW THIS LINE! (unless you really want to) ====

sub insert_before_endbody($response, $payload){
   # Search from the end of the document for the closing body tag.
   $idx = string.findr($response, "</body>");
   # Insert the payload.
   $response = string.substring($response, 0, $idx-1) . $payload .
               string.substring($response, $idx, -1);
   # Return the response.
   return $response;
}

$path = http.getpath();
if ( $path == $homepath ){
   # Initialize the response body.
   $response = http.getresponsebody();

   # Cookie-generating JavaScript gets loaded in this function.
   $wrapper = '<script type="text/javascript">function cookiesDirectiveScriptWrapper(){}</script>';

   # Imports the cookiesdirective code. (This line was garbled in the
   # original posting; as described above, it most likely loaded the
   # script directly from cookiesdirective.com.)
   # FIXME: Download the package and host it locally!
   $code = '<script type="text/javascript" src="http://cookiesdirective.com/cookiesdirective.js"></script>';

   # Executes the cookiesdirective code, providing the appropriate arguments.
   $run = '<script type="text/javascript">cookiesDirective(\'' . $noticelocation .
          '\',0,\'' . $privacyURL . '\',\'\');</script>';

   # Insert everything into the response body.
   foreach($snippet in [$wrapper, $code, $run]){
      $response = insert_before_endbody($response, $snippet);
   }

   # Update the response data.
   http.setresponsebody($response);
}

This particular example works on the main Riverbed site. To get the code to work, you'll need to change at least the $homepath and $privacyURL variables. If you want the notice to appear at the top of the page, you can change the $noticelocation variable.

NOTE: Remember to apply this rule to your virtual server as a response rule!
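One refinement worth considering (a sketch, not part of the original article): like the Google Analytics rule earlier on this page, the rule could bail out early for non-HTML responses before doing any body processing, rather than relying solely on the home-page path check:

# Only attempt to modify HTML documents; pass everything else through
$contentType = http.getResponseHeader( "Content-Type" );
if( !string.startsWith( $contentType, "text/html" ) ) break;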
Top examples of Pulse vADC in action

Examples of how SteelApp can be deployed to address a range of application delivery challenges.

Modifying Content

- Simple web page changes - updating a copyright date
- Adding meta-tags to a website with Traffic Manager
- Tracking user activity with Google Analytics and Google Analytics revisited
- Embedding RSS data into web content using Traffic Manager
- Add a Countdown Timer
- Using TrafficScript to add a Twitter feed to your web site
- Embedded Twitter Timeline
- Embedded Google Maps
- Watermarking PDF documents with Traffic Manager and Java Extensions
- Watermarking Images with Traffic Manager and Java Extensions
- Watermarking web content with Pulse vADC and TrafficScript

Prioritizing Traffic

- Evaluating and Prioritizing Traffic with Traffic Manager
- HowTo: Control Bandwidth Management
- Detecting and Managing Abusive Referers
- Using Pulse vADC to Catch Spiders
- Dynamic rate shaping slow applications
- Stop hot-linking and bandwidth theft!
- Slowing down busy users - driving the REST API from TrafficScript

Performance Optimization

- Cache your website - just for one second?
- HowTo: Monitor the response time of slow services
- HowTo: Use low-bandwidth content during periods of high load

Fixing Application Problems

- No more 404 Not Found...?
- Hiding Application Errors
- Sending custom error pages

Compliance Problems

- Satisfying EU cookie regulations using the cookiesDirective.js and TrafficScript

Security problems

- The "Contact Us" attack against mail servers
- Protecting against Java and PHP floating point bugs
- Managing DDoS attacks with Traffic Manager
- Enhanced anti-DDoS using TrafficScript, Event Handlers and iptables
- How to stop 'login abuse', using TrafficScript
- Bind9 Exploit in the Wild...
- Protecting against the range header denial-of-service in Apache HTTPD
- Checking IP addresses against a DNS blacklist with Traffic Manager
- Heartbleed: Using TrafficScript to detect TLS heartbeat records
- TrafficScript rule to protect against "Shellshock" bash vulnerability (CVE-2014-6271)
- SAML 2.0 Protocol Validation with TrafficScript
- Disabling SSL v3.0 for SteelApp

Infrastructure

- Transparent Load Balancing with Traffic Manager
- HowTo: Launch a website at 5am
- Using Stingray Traffic Manager as a Forward Proxy
- Tunnelling multiple protocols through the same port
- AutoScaling Docker applications with Traffic Manager
- Elastic Application Delivery - Demo
- How to deploy Traffic Manager Cluster in AWS VPC

Other solutions

- Building a load-balancing MySQL proxy with TrafficScript
- Serving Web Content from Traffic Manager using Python and Serving Web Content from Traffic Manager using Java
- Virtual Hosting FTP services
- Managing WebSockets traffic with Traffic Manager
- TrafficScript can Tweet Too
- Instrument web content with Traffic Manager
- Antivirus Protection for Web Applications
- Generating Mandelbrot sets using TrafficScript
- Content Optimization across Equatorial Boundaries
This article presents a TrafficScript library that gives you easy and efficient access to tables of data stored as files in the Stingray configuration: libTable.rts.

Download the following TrafficScript library from github and import it into your Rules Catalog, naming it libTable.rts:

libTable.rts

# libTable.rts
#
# Efficient lookups of key/value data in large resource files (>100 lines)
# Use getFirst() and getNext() to iterate through the table

sub lookup( $filename, $key ) {
   update( $filename );
   $pid = sys.getPid();
   return data.get( "resourcetable" . $pid . $filename . "::" . $key );
}

sub getFirst( $filename ) {
   update( $filename );
   $pid = sys.getPid();
   return data.get( "resourcetable" . $pid . $filename . ":first" );
}

sub getNext( $filename, $key ) {
   update( $filename );
   $pid = sys.getPid();
   return data.get( "resourcetable" . $pid . $filename . ":next:" . $key );
}

# Internal functions

sub update( $filename ) {
   $pid = sys.getPid();
   $md5 = resource.getMD5( $filename );
   if( $md5 == data.get( "resourcetable" . $pid . $filename . ":md5" ) ) return;

   data.reset( "resourcetable" . $pid . $filename . ":" );
   data.set( "resourcetable" . $pid . $filename . ":md5", $md5 );

   $contents = resource.get( $filename );
   $pkey = "";
   foreach( $l in string.split( $contents, "\n" ) ) {
      if( ! string.regexmatch( $l, "(.*?)\\s+(.*)" ) ) continue;
      $key = string.trim( $1 );
      $value = string.trim( $2 );
      data.set( "resourcetable" . $pid . $filename . "::" . $key, $value );
      if( !$pkey ) {
         data.set( "resourcetable" . $pid . $filename . ":first", $key );
      } else {
         data.set( "resourcetable" . $pid . $filename . ":next:" . $pkey, $key );
      }
      $pkey = $key;
   }
}

Usage:

import libTable.rts as table;
$filename = "data.txt";

# Look up a key/value pair
$value = table.lookup( $filename, $key );

# Iterate through the table
for( $key = table.getFirst( $filename ); $key != "";
     $key = table.getNext( $filename, $key ) ) {
   $value = table.lookup( $filename, $key );
}

The library caches the contents of the file internally, and is very efficient for large files. For smaller files, it may be slightly more efficient to search the file using a regular expression, but the convenience of this library may outweigh the small performance gains.

Data file format

This library provides access to files stored in the Stingray conf/extra folder (by way of the Extra Files > Miscellaneous Files section of the catalog). These files can be uploaded using the UI, the SOAP or REST API, or by manually copying them in place and initiating a configuration replication.

Files should contain key/value pairs, one per line, space-separated:

key1 value1
key2 value2
key3 value3

Preservation of order

The lookup operation uses an open hash table, so it is efficient for large files. The getFirst() and getNext() operations will iterate through the data table in order, returning the keys in the order they appear in the file.

Performance and alternative implementations

The performance of this library is investigated in the article Investigating the performance of TrafficScript - storing tables of data. It is very efficient for large tables of data, and marginally less efficient than a simple regular-expression string search for small files.

If performance is a concern and you only need to work with small datasets, then you could use the following library instead:

libTableSmall.rts

# libTableSmall.rts: Efficient lookups of key/value data in a small resource file (<100 lines)
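The body of libTableSmall.rts was cut off in this copy of the article. A minimal version consistent with 'Implementation 1' in the following article might look like this (a reconstruction, not the original code):

# libTableSmall.rts (reconstruction): search the resource file on each
# lookup with a regular expression, which is fast enough for small files
sub lookup( $filename, $key ) {
   $contents = resource.get( $filename );
   # Match the key at the start of a line, then capture the rest of the line
   if( string.regexmatch( $contents, '\n' . $key . '\s+([^\n]+)' ) )
      return $1;
   if( string.regexmatch( $contents, '^' . $key . '\s+([^\n]+)' ) )
      return $1;
   return "";
}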
TrafficScript rules often need to refer to tables of data - redirect mappings, user lists, IP black lists and the like. For small tables that are not updated frequently, you can place these inline in the TrafficScript rule:

$redirect = [
   "/widgets" => "/sales/widgets",
   "/login" => "/cgi-bin/login.cgi"
];

$path = http.getPath();
if( $redirect[ $path ] ) http.redirect( $redirect[ $path ] );

This approach becomes difficult to manage if the table becomes large, or if you want to update it without having to edit the TrafficScript rule. In this case, you can store the table externally (in a resource file) and reference it from the rule.

The following examples consider a file that follows a standard space-separated 'key value' pattern, and we'll look at alternative TrafficScript approaches to efficiently handle the data and look up key/value pairs:

# cat /opt/zeus/zxtm/conf/extra/redirects.txt
/widgets /sales/widgets
/login   /cgi-bin/login.cgi
/support http://support.site.com

We'll propose several 'ResourceTable' TrafficScript library implementations that each expose a lookup() function, used in the following fashion:

# ResourceTable provides a lookup( filename, key ) function
import ResourceTable as table;

$path = http.getPath();
$redirect = table.lookup( "redirects.txt", $path );

We'll then look at the performance of each to see which is best. For a summary of the solutions in this article, jump straight to libTable.rts: Interrogating tables of data in TrafficScript.

Implementation 1: Search the file on each lookup

ResourceTable1

sub lookup( $filename, $key ) {
   $contents = resource.get( $filename );
   if( string.regexmatch( $contents, '\n'.$key.'\s+([^\n]+)' ) )
      return $1;
   if( string.regexmatch( $contents, '^'.$key.'\s+([^\n]+)' ) )
      return $1;
   return "";
}

This simple implementation searches the file on each and every lookup, using a regular expression to locate the key and capture the text on the remainder of the line. It pins the key to the start of a line so that it does not mistakenly match lines where $key is a substring (suffix) of another key.

The implementation is simple and effective, but we would reasonably expect it to become less and less efficient as the resource file grows.

Implementation 2: Store the table in a TrafficScript hash table for easy lookup

The following code builds a TrafficScript hash table from the contents of the resource file:

$contents = resource.get( $filename );
$h = [ ];
foreach( $l in string.split( $contents, "\n" ) ) {
   if( ! string.regexmatch( $l, '(.*?)\s+(.*)' ) ) continue;
   $key = string.trim( $1 );
   $value = string.trim( $2 );
   $h[$key] = $value;
}

You can then quickly look up values in the hash table using $h[ $key ]. However, we don't want to have to create the hash table every time we call the lookup function; we would like to create it once and then cache it somewhere. We can use the global data table to store persistent data, and we can verify that the data is still current by checking that the MD5 of the resource file has not changed:

ResourceTable2a

sub update( $filename ) {
   # Store the md5 of the resource file we have cached. No need to update
   # if the file has not changed
   $md5 = resource.getMD5( $filename );
   if( $md5 == data.get( "resourcetable:".$filename.":md5" ) ) return;

   # Do the update
   $contents = resource.get( $filename );
   $h = [ ];
   foreach( $l in string.split( $contents, "\n" ) ) {
      if( ! string.regexmatch( $l, "(.*?)\\s+(.*)" ) ) continue;
      $key = string.trim( $1 );
      $value = string.trim( $2 );
      $h[$key] = $value;
   }
   data.set( "resourcetable:".$filename.":data", $h );
   data.set( "resourcetable:".$filename.":md5", $md5 );
}

sub lookup( $filename, $key ) {
   # Check to see if the file has been updated, and update our table if necessary
   update( $filename );
   $h = data.get( "resourcetable:".$filename.":data" );
   return $h[$key];
}

Version 2a: we store the MD5 of the file in the global key 'resourcetable:filename:md5', and the hash table in the global key 'resourcetable:filename:data'.

This implementation has one significant fault. If two TrafficScript rules are running concurrently, they may both try to update the keys in the global data table, and a race condition may result in inconsistent data. This situation is not possible on a single-core system with one zeus.zxtm process, because rules are run serially and only pre-empted if they invoke a blocking operation, but it's entirely possible on a multi-core system, and TrafficScript does not implement mutexes or locks to protect against this.

The simplest solution is to give each core its own private copy of the data. Because system memory should be scaled with the number of cores, the additional overhead of these copies is generally acceptable:

ResourceTable2b

sub update( $filename ) {
   $pid = sys.getPid();
   $md5 = resource.getMD5( $filename );
   if( $md5 == data.get( "resourcetable:".$pid.$filename.":md5" ) ) return;

   $contents = resource.get( $filename );
   $h = [ ];
   foreach( $l in string.split( $contents, "\n" ) ) {
      if( ! string.regexmatch( $l, "(.*?)\\s+(.*)" ) ) continue;
      $key = string.trim( $1 );
      $value = string.trim( $2 );
      $h[$key] = $value;
   }
   data.set( "resourcetable:".$pid.$filename.":data", $h );
   data.set( "resourcetable:".$pid.$filename.":md5", $md5 );
}

sub lookup( $filename, $key ) {
   update( $filename );
   $pid = sys.getPid();
   $h = data.get( "resourcetable:".$pid.$filename.":data" );
   return $h[$key];
}

Version 2b: by including the pid in the name of the key, we avoid multi-core race conditions at the expense of multiple copies of the data.

Implementation 3: Store the key/value data directly in the global hash table

data.set() and data.get() address a global key/value table. We could use that directly, rather than constructing a TrafficScript hash:

sub update( $filename ) {
   $pid = sys.getPid();
   $md5 = resource.getMD5( $filename );
   if( $md5 == data.get( "resourcetable".$pid.$filename.":md5" ) ) return;

   data.reset( "resourcetable".$pid.$filename.":" );
   data.set( "resourcetable".$pid.$filename.":md5", $md5 );

   $contents = resource.get( $filename );
   foreach( $l in string.split( $contents, "\n" ) ) {
      if( ! string.regexmatch( $l, "(.*?)\\s+(.*)" ) ) continue;
      $key = string.trim( $1 );
      $value = string.trim( $2 );
      data.set( "resourcetable".$pid.$filename."::".$key, $value );
   }
}

sub lookup( $filename, $key ) {
   update( $filename );
   $pid = sys.getPid();
   return data.get( "resourcetable".$pid.$filename."::".$key );
}

Version 3: key/value pairs are stored in the global data table. Keys begin with the string "resourcetable:pid:filename:", so it's easy to delete all of the key/value pairs using data.reset() before rebuilding the dataset.

How do these implementations compare?

We tested the number of lookups per second that each implementation could achieve (using a single-core virtual machine running on a laptop Core2 processor) to investigate performance for different dataset sizes (a sketch of a similar test harness appears at the end of this article):

Resource file size (entries)                          10       100     1,000    10,000
Implementation 1: simple search                  300,000   100,000    17,500     1,000
Implementation 2: TrafficScript hash,
                  cached in global data table     27,000     2,000       250        10
Implementation 3: key/value pairs in
                  the global data table          200,000   200,000   200,000   200,000

ResourceTable lookups per second (single core, lightweight processor)

The test just exercised the rate of lookups in resource files of various sizes; building the cached data structures (implementations 2 and 3) is a one-off cost that is not included in the tests.

Interpreting the results

The degradation of performance in implementation 1 as the file size increases was to be expected.

The constant performance of implementation 3 was also as expected: hash tables generally give O(1) lookup speed, unaffected by the number of entries.

The abysmal performance of implementation 2 is surprising, until you note that on every lookup we retrieve the entire hash table from the global data table:

$h = data.get( "resourcetable:".$pid.$filename.":data" );
return $h[$key];

The global data table is a key/value store; all keys and values are serialized as strings. The data.get() operation reads the serialized version of the hash table and reconstructs the entire table (up to 10,000 entries) before the O(1) lookup operation.

What is perhaps most surprising is the speed at which you can search and extract data from a string using regular expressions (implementation 1). For small and medium datasets (up to approximately 50 entries), this is the simplest and fastest method; it's only worth considering the more complex data.get() key/value implementation for large datasets.

Read more

Check out the article How is memory managed in TrafficScript? for more detail on the ways that TrafficScript handles data and memory.
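If you want to sanity-check these figures on your own hardware, a rough harness along the following lines can be dropped into a request rule on a test virtual server. It is a minimal sketch: it assumes the libTable.rts library and redirects.txt file from above, that sys.time.highres() is available in your release, and it makes no attempt to subtract loop overhead.

import libTable.rts as table;

# Time a batch of lookups against the example redirects.txt file
$n = 10000;
$start = sys.time.highres();

for( $i = 0; $i < $n; $i = $i + 1 ) {
   $value = table.lookup( "redirects.txt", "/widgets" );
}

$elapsed = sys.time.highres() - $start;
log.info( "Approximate lookups/sec: " . sprintf( "%.0f", $n / $elapsed ) );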
This article describes how to gather activity statistics across a cluster of traffic managers using Perl, SOAP::Lite and Stingray's SOAP Control API.

Overview

Each local Stingray Traffic Manager tracks a very wide range of activity statistics. These may be exported using SNMP or retrieved using the System/Stats interface in Stingray's SOAP Control API.

When you use the Activity monitoring in Stingray's Administration Interface, a collector process communicates with each of the Traffic Managers in your cluster, gathering the local statistics from each and merging them before plotting them on the activity chart ('Aggregate data across all traffic managers').

However, when you use the SNMP or Control API interfaces directly, you will only receive the statistics from the Traffic Manager machine you have connected to. If you want a cluster-wide view of activity using SNMP or the Control API, you will need to poll each machine and merge the results yourself.

Using Perl and SOAP::Lite to query the traffic managers and merge activity statistics

The following code sample determines the total TCP connection rate across the cluster as follows:

Connect to the named traffic manager and use the getAllClusterMachines() method to retrieve a list of all of the machines in the cluster;
Poll each machine in the cluster for its current value of TotalConn (the total number of TCP connections processed since startup);
Sleep for 10 seconds, then poll each machine again;
Calculate the number of connections processed by each traffic manager in the 10-second window, and calculate the per-second rate accurately using high-resolution time.

The code:

#!/usr/bin/perl -w

use SOAP::Lite 0.6;
use Time::HiRes qw( time sleep );

$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;

my $userpass    = "admin:admin";    # SOAP-capable authentication credentials
my $adminserver = "stingray:9090";  # Details of an admin server in the cluster
my $sampletime  = 10;               # Sample time (seconds)

sub getAllClusterMembers( $$ );
sub makeConnections( $$$ );
sub makeRequest( $$ );

my $machines = getAllClusterMembers( $adminserver, $userpass );
print "Discovered cluster members " . ( join ", ", @$machines ) . "\n";

my $connections = makeConnections( $machines, $userpass,
   "http://soap.zeus.com/zxtm/1.0/System/Stats/" );

# Sample the value of getTotalConn
my $start = time();
my $res1 = makeRequest( $connections, "getTotalConn" );
sleep( $sampletime - (time() - $start) );
my $res2 = makeRequest( $connections, "getTotalConn" );

# Determine connection rate per traffic manager
my $totalrate = 0;
foreach my $z ( keys %{$res1} ) {
   my $conns   = $res2->{$z}->result - $res1->{$z}->result;
   my $elapsed = $res2->{$z}->{time} - $res1->{$z}->{time};
   my $rate    = $conns / $elapsed;
   $totalrate += $rate;
}

print "Total connection rate across all machines: " .
      sprintf( '%.2f', $totalrate ) . "\n";

sub getAllClusterMembers( $$ ) {
    my( $adminserver, $userpass ) = @_;

    # Discover cluster members
    my $mconn = SOAP::Lite
         -> ns( 'http://soap.zeus.com/zxtm/1.0/System/MachineInfo/' )
         -> proxy( "https://$userpass\@$adminserver/soap" )
         -> on_fault( sub {
              my( $conn, $res ) = @_;
              die ref $res ? $res->faultstring : $conn->transport->status; } );
    $mconn->proxy->ssl_opts( SSL_verify_mode => 0 );

    my $res = $mconn->getAllClusterMachines();

    # $res->result is a reference to an array of System.MachineInfo.Machine objects.
    # Pull out the name:port of the traffic managers in our cluster
    my @machines = grep s@https://(.*?)/@$1@,
       map { $_->{admin_server}; } @{$res->result};

    return \@machines;
}

sub makeConnections( $$$ ) {
    my( $machines, $userpass, $ns ) = @_;

    my %conns;
    foreach my $z ( @$machines ) {
       $conns{ $z } = SOAP::Lite
         -> ns( $ns )
         -> proxy( "https://$userpass\@$z/soap" )
         -> on_fault( sub {
              my( $conn, $res ) = @_;
              die ref $res ? $res->faultstring : $conn->transport->status; } );
       $conns{ $z }->proxy->ssl_opts( SSL_verify_mode => 0 );
    }
    return \%conns;
}

sub makeRequest( $$ ) {
    my( $conns, $req ) = @_;

    my %res;
    foreach my $z ( keys %$conns ) {
       my $r = $conns->{$z}->$req();
       $r->{time} = time();
       $res{$z} = $r;
    }
    return \%res;
}

Running the script

$ ./getConnections.pl
Discovered cluster members stingray1-ny:9090, stingray1-sf:9090
Total connection rate across all machines: 5.02
What is Direct Server Return?

Layer 2/3 Direct Server Return (DSR), also referred to as 'triangulation', is a network routing technique used in some load balancing situations:

Incoming traffic from the client is received by the load balancer and forwarded to a back-end node
Outgoing (return) traffic from the back-end node is sent directly to the client and bypasses the load balancer completely

Incoming traffic (blue) is routed through the load balancer, and return traffic (red) bypasses the load balancer

Direct Server Return is fundamentally different from the normal modes of operation, in which the load balancer observes and manages both inbound and outbound traffic. There are two such common load balancing modes of operation:

NAT (Network Address Translation): layer-4 load balancers and simple layer-7 application delivery controllers use NAT to rewrite the destination value of individual network packets. Network connections are load-balanced by the choice of destination value. They often use a technique called 'delayed binding' to delay and inspect a new network connection before sending the packets to a back-end node; this allows them to perform content-based routing. NAT-based load balancers can switch TCP streams, but have limited capabilities to inspect and rewrite network traffic.

Proxy: Modern general-purpose load balancers like Stingray Traffic Manager operate as full proxies. The proxy mode of operation is the most compute-intensive, but current general-purpose hardware is more than powerful enough to manage traffic at multi-gigabit speeds. Whereas NAT-based load balancers manage traffic on a packet-by-packet basis, proxy-based load balancers can read entire requests and responses. They can manage and manipulate the traffic based on a full understanding of the transaction between the client and the application server.

Note that some load balancers can operate in a dual-mode fashion - a service can be handled either in a NAT-like fashion or in a Proxy-like fashion. This introduces a trade-off between hardware performance and software sophistication - see SOL4707 - Choosing appropriate profiles for HTTP traffic for an example. Stingray Traffic Manager functions only in a Proxy-like fashion.

This article describes how the benefits of direct server return can be applied to a layer-7 traffic management device such as Stingray Traffic Manager.

Why use Direct Server Return?

Layer 2/3 Direct Server Return was very popular from 1995 to about 2000 because the load balancers of the time were seriously limited in performance and compute power; DSR uses fewer compute resources than a full NAT or Proxy load balancer. DSR is no longer necessary for high-performance services, as modern load balancers on modern hardware can easily handle multiple gigabits of traffic without requiring DSR.

DSR is still an appealing option for organizations who serve large media files, or who have very large volumes of traffic. Stingray Traffic Manager does not support a traditional DSR mode of operation, but it is straightforward to manage traffic to obtain a similar layer-7 DSR effect.

Disadvantages of Layer 2/3 Direct Server Return

There are a number of distinct limitations and disadvantages with DSR:

1. The load balancer does not observe the response traffic

The load balancer has no way of knowing if a back-end server has responded correctly to the remote client. The server may have failed, or it may have returned a server error message.
An external monitoring service is necessary to verify the health and correct operation of each back-end server.

2. Proper load balancing is not possible

The load balancer has no idea of service response times, so it is difficult for it to perform effective, performance-sensitive load balancing.

3. Session persistence is severely limited

Because the load balancer only observes the initial 'SYN' packet before it makes a load balancing decision, it can only perform session persistence based on the source IP address and port of the packet, i.e. the IP address of the remote client.

The load balancer cannot perform cookie-based session persistence, SSL session ID persistence, or any of the many other session persistence methods offered by other load balancers.

4. Content-based routing is not possible

Again, because the load balancer does not observe the initial request, it cannot perform content-based routing.

5. Limited traffic management and reporting

The load balancer cannot manage traffic, performing operations like SSL decryption, content compression, security checking, SYN cookies, bandwidth management, etc. It cannot retry failed requests, or perform any traffic rewriting. The load balancer cannot report on traffic statistics such as bandwidth sent.

6. DSR can only be used within a datacenter

There is no way to perform DSR between datacenters (other than proprietary tunnelling, which may be limited by ISP egress filtering).

In addition, many of the advanced capabilities of an application delivery controller that depend on inspection and modification (security, acceleration, caching, compression, scrubbing etc.) cannot be deployed when a DSR mode is in use.

Performance of Direct Server Return

The performance benefits of DSR are often assumed to be greater than they really are. Central to this doubt is the observation that client applications will send TCP 'ACK' packets via the load balancer in response to the data they receive from the server, and the volume of the ACK packets can overwhelm the load balancer.

Although ACK packets are small, in many cases the rated capacities of network hardware assume that all packets are the size of the maximum MTU (typically 1,500 bytes). A load balancer running on a 100 Mbit network could therefore receive a little over 8,000 ACK packets per second (100,000,000 bits/s divided by 1,500 bytes x 8 bits is approximately 8,333 packets/s).

On a low-latency network, ACK packets are relatively infrequent (1 ACK packet for every 4 data packets), but for large downloads over a high-latency network (8 hops) the ratio of ACK packets to data packets closely approaches 1:1 as the server and client attempt to optimize the TCP session. Therefore, over high-latency networks, a DSR-equipped load balancer will receive a similar volume of ACK packets to the volume of outgoing data packets (and the difference in size between the ACK and data packets has little effect on packet-based load balancers).

Stingray alternatives to Layer 2/3 DSR

There are two alternatives to direct server return:

Use Stingray Traffic Manager in its usual full proxy mode

Stingray Traffic Manager is comfortably able to manage many gigabits of traffic in its normal 'proxy' mode on appropriate hardware, and can be scaled horizontally for increased capacity. In benchmarks, modern Intel- and AMD-based systems can achieve multiple tens of gigabits of fully load-balanced traffic, and up to twice as much when serving content from Stingray Traffic Manager's content cache.

Redirect requests to the chosen origin server (a.k.a. Layer 7 DSR)

For the most common protocols (HTTP and RTSP), it is possible to handle them in 'proxy' mode, and then redirect the client to the chosen server node once the load balancing and session persistence decision has been made. For a large file download, the client communicates directly with the server node, bypassing Stingray Traffic Manager completely:

Client issues HTTP or RTSP request to Stingray Traffic Manager
Stingray Traffic Manager issues 'probe' request via pool to back-end server
Stingray Traffic Manager verifies that the back-end server returns a correct response
Stingray Traffic Manager sends a 302 redirect to the client, telling it to retry the request against the chosen back-end server

Requests for small objects (blue) are proxied directly to the origin. Requests for large objects (red) elicit a lightweight probe to locate the resource, and then the client is instructed (green) to retrieve the resource directly from the origin.

This technique would generally be used selectively. Small file downloads (web pages, images, etc.) would be managed through Stingray Traffic Manager. Only large files - embedded media, for example - would be handled in this redirect mode. For this reason, the HTTP session will always run through Stingray Traffic Manager.

Layer 7 DSR with HTTP

Layer 7 DSR with HTTP is fairly straightforward. In the following example, incoming requests that begin "/media" will be converted into simple probe requests and sent to the 'Media Servers' pool. Stingray Traffic Manager will determine which node was chosen, and send the client an explicit redirect to retrieve the requested content from the chosen node:

Request rule: Deploy the following TrafficScript request rule:

$path = http.getPath();
if( string.startsWith( $path, "/media/" ) ) {
   # Store the real path
   connection.data.set( "path", $path );

   # Convert the request to a lightweight HEAD for '/'
   http.setMethod( "HEAD" );
   http.setPath( "/" );

   pool.use( "Media Servers" );
}

Response rule: This rule reads the response from the server; load balancing and session persistence (if relevant) will ensure that we've connected with the optimal server node. The rule only takes effect if we rewrote the request: in that case the $saved_path value will begin with '/media/', so we can issue the redirect.

$saved_path = connection.data.get( "path" );
if( string.startsWith( $saved_path, "/media" ) ) {
   $chosen_node = connection.getNode();
   http.redirect( "http://".$chosen_node.$saved_path );
}

Layer 7 DSR with RTSP

An RTSP connection is a persistent TCP connection. The client and server communicate with HTTP-like requests and responses. In this example, Stingray Traffic Manager will receive initial RTSP connections from remote clients and load-balance them onto a pool of media servers. In the RTSP protocol, a media download is always preceded by a 'DESCRIBE' request from the client; Stingray Traffic Manager will replace the 'DESCRIBE' response with a 302 Redirect response that tells the client to connect directly to the back-end media server.

This code example has been tested with the QuickTime, Real and Windows media clients, and against pools of QuickTime, Helix (Real) and Windows media servers.

The details

Create a virtual server listening on port 554 (the standard port for RTSP traffic). Set the protocol type to "RTSP".

In this example, we have three pools of media servers, and we're going to select the pool based on the User-Agent field in the RTSP request.
The pools are named "Helix Servers", "QuickTime Servers" and "Windows Media Servers".

Request rule: Deploy the following TrafficScript request rule:

$client = rtsp.getRequestHeader( "User-Agent" );

# Choose the pool based on the User-Agent
if( string.contains( $client, "RealMedia" ) ) {
   pool.select( "Helix Servers" );
} else if( string.contains( $client, "QuickTime" ) ) {
   pool.select( "QuickTime Servers" );
} else if( string.contains( $client, "WMPlayer" ) ) {
   pool.select( "Windows Media Servers" );
}

This rule uses pool.select() to specify which pool to use when Stingray is ready to forward the request to a back-end server.

Response rule: All of the work takes place in the response rule. This rule reads the response from the server. If the request was a 'DESCRIBE' method, the rule then replaces the response with a 302 redirect, telling the client to connect directly to the chosen back-end server. Add this rule as a response rule, setting it to run every time (not once).

# Wait for a DESCRIBE response, since this contains the stream description
$method = rtsp.getMethod();
if( $method != "DESCRIBE" ) break;

# Get the chosen node
$node = connection.getNode();

# Instruct the client to retry directly against the chosen node.
# Note: $path is assumed to hold the path of the requested stream;
# capture it in the request rule (e.g. with connection.data.set())
rtsp.redirect( "rtsp://" . $node . "/" . $path );

Appendix: How does DSR work?

It's useful to have an appreciation of how DSR (and delayed binding) function in order to understand some of their limitations (such as content inspection).

TCP overview

A simplified overview of a TCP connection is as follows:

Connection setup

The client initiates a connection with a server by sending a 'SYN' packet. The SYN packet contains a randomly generated client sequence number (along with other data).
The server replies with a 'SYN ACK' packet, acknowledging the client's SYN and sending its own randomly generated server sequence number.
The client completes the TCP connection setup by sending an ACK packet to acknowledge the server's SYN.

The TCP connection setup is often referred to as a 3-way TCP handshake. Think of it as the following conversation:

Client: "Can you hear me?" (SYN)
Server: "Yes. Can you hear me?" (ACK, SYN)
Client: "Yes" (ACK)

Data transfer

Once the connection has been established by the 3-way handshake, the client and server exchange data packets with each other. Because packets may be dropped or re-ordered, each packet contains a sequence number; the sequence number is incremented for each packet sent.

When a client receives intact data packets from the server, it sends back an ACK (acknowledgement) with the packet sequence number. When a client acknowledges a sequence number, it is acknowledging that it received all packets up to that number, so ACKs may be sent less frequently than data packets. The server may send several packets in sequence before it receives an ACK (determined by the 'window size'), and will resend packets if they are not ACK'd rapidly enough.

Simple NAT-based Load Balancing

There are many variants of IP and MAC rewriting used in simple NAT-based load balancing. The simplest NAT-based load balancing technique uses Destination NAT (DNAT) and works as follows:

The client initiates a connection by sending a SYN packet to the Virtual IP (VIP) that the load balancer is listening on.
The load balancer makes a load balancing decision and forwards the SYN packet to the chosen node. It rewrites the destination IP address in the packet to the IP address of the node. The load balancer also remembers the load balancing decision it made.
The node replies with a SYN/ACK. The load balancer rewrites the source IP address to be the VIP and forwards the packet on to the remote client.
As more packets flow between the client and the server, the load balancer checks its internal NAT table to determine how the IP addresses should be rewritten.

This implementation is very amenable to a hardware (ASIC) implementation. The TCP connection is load-balanced on the first SYN packet; one of the implications is that the load balancer cannot inspect the content in the TCP connection before making the routing decision.

Delayed Binding

Delayed binding is a variant of the DNAT load balancing method. It allows the load balancer to inspect a limited amount of the content before making the load balancing decision.

When the load balancer receives the initial SYN, it chooses a server sequence number and returns a SYN/ACK response.
The load balancer completes the TCP handshake with the remote client and reads the initial few data packets in the client's request.
The load balancer reassembles the request, inspects it and makes the load balancing decision. It then makes a TCP connection to the chosen server, using DNAT (i.e., presenting the client's source IP address), and writes the request to the server.
Once the request has been written, the load balancer must splice the client-side and server-side connections together. It does this by using DNAT to forward packets between the two endpoints, and by rewriting the sequence numbers chosen by the server so that they match the initial sequence numbers that the load balancer used.

This implementation is still amenable to hardware (ASIC) implementation. However, layer 4-7 tasks such as detailed content inspection and content rewriting are beyond implementation in specialized hardware alone, and are often implemented using software approaches (such as F5's FastHTTP profile), albeit with significant functional limitations.

Direct Server Return

Direct Server Return is most commonly implemented using MAC address translation (layer 2).

A MAC (Media Access Control) address is a unique, unchanging hardware address that is bound to a network card. Network devices will read all network packets destined for their MAC address.

Network devices use ARP (Address Resolution Protocol) to announce the MAC address that is hosting a particular IP address. In a Direct Server Return configuration, the load balancer and the server nodes all listen on the same VIP. However, only the load balancer makes ARP broadcasts to tell the upstream router that the VIP maps to its MAC address.

When a packet destined for the VIP arrives at the router, the router places it on the local network, addressed to the load balancer's MAC address. The load balancer picks that packet up.
The load balancer then makes a load balancing decision, choosing which node to send it to. The load balancer rewrites the MAC address in the packet and puts it back on the wire.
The chosen node picks the packet up just as if it were addressed directly to it.
When the node replies, it sends its packets directly to the remote client. They are immediately picked up by the upstream router and forwarded on.

In this way, reply packets completely bypass the load balancer machine.

Why content inspection is not possible

Content inspection (delayed binding) is not possible because it requires that the load balancer first completes the three-way handshake with the remote client, and possibly ACKs some of the data packets.
When the load balancer then sends the first SYN to the chosen node, the node will respond with a SYN/ACK packet directly back to the remote client. The load balancer is out of line and cannot suppress this SYN/ACK. Additionally, the sequence number that the node selects cannot be translated to the one that the remote client is expecting. There is no way to persuade the node to pick up the TCP connection from where the load balancer left off.

For similar reasons, SYN cookies cannot be used by the load balancer to offload SYN floods from the server nodes.

Alternative Implementations of Direct Server Return

There are two alternative implementations of DSR (see this 2002 paper entitled 'The State of the Art'), but neither is widely used any more:

TCP Tunnelling: IP tunnelling (aka IP encapsulation) can be used to tunnel the client IP packets from the load balancer to the server. All client IP packets are encapsulated within IP datagrams, and the server runs a tunnel device (an OS driver and configuration) to strip off the datagram header before sending the client IP packet up the network stack. This configuration does not support delayed binding, or any equivalent means of inspecting content before making the load balancing decision.

TCP Connection Hopping: Resonate have implemented a proprietary protocol (Resonate Exchange Protocol, RXP) which interfaces deeply with the server node's TCP stack. Once a TCP connection has been established with the Resonate Central Dispatch load balancer and the initial data has been read, the load balancer can hand the response side of the connection off to the selected server node using RXP. The RXP driver on the server suppresses the initial TCP handshake packets, and forces the use of the correct TCP sequence number. This uniquely allows for content-based routing and direct server return in one solution.

Neither of these methods is in wide use now.
When you need to scale out your MySQL database, replication is a good way to proceed. Database writes (UPDATEs) go to a 'master' server and are replicated across a set of 'slave' servers. Reads (SELECTs) are load-balanced across the slaves.

Overview

MySQL's replication documentation describes how to configure replication: MySQL Replication

A quick solution...

If you can modify your MySQL client application to direct 'Write' (i.e. 'UPDATE') connections to one IP address/port and 'Read' (i.e. 'SELECT') connections to another, then this problem is trivial to solve. This generally needs a code update (Using Replication for Scale-Out).

You will need to direct the 'Update' connections to the master database (or through a dedicated Traffic Manager virtual server), and direct the 'Read' connections to a Traffic Manager virtual server (in 'generic server first' mode) that load-balances the connections across the pool of MySQL slave servers using the 'least connections' load-balancing method:

Routing connections from the application

However, in most cases, you probably don't have that degree of control over how your client application issues MySQL connections; all connections are directed to a single IP:port. A load balancer will need to discriminate between different connection types and route them accordingly.

Routing MySQL traffic

A MySQL database connection is authenticated by a username and password. In most database designs, multiple users with different access rights are used; less privileged user accounts can only read data (issuing 'SELECT' statements), and more privileged users can also perform updates (issuing 'UPDATE' statements). A well-architected application with sound security boundaries will take advantage of these multiple user accounts, using the account with least privilege to perform each operation. This reduces the opportunities for attacks like SQL injection to subvert database transactions and perform undesired updates.

This article describes how to use Traffic Manager to inspect and manage MySQL connections, routing connections authenticated with privileged users to the master database and load-balancing other connections to the slaves:

Load-balancing MySQL connections

Designing a MySQL proxy

Stingray Traffic Manager functions as an application-level (layer-7) proxy. Most protocols are relatively easy for layer-7 proxies like Traffic Manager to inspect and load-balance, and work 'out of the box' or with relatively little configuration. For more information, refer to the article Server First, Client First and Generic Streaming Protocols.

Proxying MySQL connections

MySQL is much more complicated to proxy and load-balance. When a MySQL client connects, the server immediately responds with a randomly generated challenge string (the 'salt'). The client then authenticates itself by responding with the username for the connection and a copy of the 'salt' encrypted using the corresponding password:

Connect and Authenticate in MySQL

If the proxy is to route and load-balance based on the username in the connection, it needs to correctly authenticate the client connection first. When it finally connects to the chosen MySQL server, it will then have to re-authenticate the connection with the back-end server using a different salt.
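Before looking at the implementation, it's worth sketching the arithmetic of the MySQL 4.1+ challenge-response scheme that the rules below reproduce (informal notation; '.' denotes concatenation):

stored = SHA1( SHA1( password ) )                     # the value held in mysql.user
token  = SHA1( password ) XOR SHA1( salt . stored )   # what the client sends

# Verification (performed by the server, and by the request rule below):
recovered = token XOR SHA1( salt . stored )           # yields SHA1( password )
check:      SHA1( recovered ) == stored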
Implementing a MySQL proxy in TrafficScript

In this example, we're going to proxy MySQL connections from two users - 'mysqlmaster' and 'mysqlslave' - directing connections to the 'SQL Master' and 'SQL Slaves' pools as appropriate.

The proxy is implemented using two TrafficScript rules ('mysql-request' and 'mysql-response') on a 'server-first' Virtual Server listening on port 3306 for MySQL client connections. Together, the rules implement a simple state machine that mediates between the client and server:

Implementing a MySQL proxy in TrafficScript

The state machine authenticates and inspects the client connection before deciding which pool to direct the connection to. The rule needs to know the encrypted password and desired pool for each user. The virtual server should be configured to send traffic to the built-in 'discard' pool by default.

The request rule:

Configure the following request rule on a 'server first' virtual server. Edit the values at the top to reflect the encrypted passwords (copied from the MySQL users table) and desired pools:

sub encpassword( $user ) {
   # From the mysql users table - double-SHA1 of the password
   # Do not include the leading '*' in the long 40-byte encoded password
   if( $user == "mysqlmaster" ) return "B17453F89631AE57EFC1B401AD1C7A59EFD547E5";
   if( $user == "mysqlslave" ) return "14521EA7B4C66AE94E6CFF753453F89631AE57EF";
}

sub pool( $user ) {
   if( $user == "mysqlmaster" ) return "SQL Master";
   if( $user == "mysqlslave" ) return "SQL Slaves";
}

$state = connection.data.get( "state" );

if( !$state ) {
   # First time in; we've just received a fresh connection
   $salt1 = randomBytes( 8 );
   $salt2 = randomBytes( 12 );
   connection.data.set( "salt", $salt1.$salt2 );

   $server_hs = "\0\0\0\0" .              # length - fill in below
      "\012" .                            # protocol version
      "Stingray Proxy v0.9\0" .           # server version
      "\01\0\0\0" .                       # thread 1
      $salt1."\0" .                       # salt(1)
      "\054\242" .                        # Capabilities
      "\010\02\0" .                       # Lang and status
      "\0\0\0\0\0\0\0\0\0\0\0\0\0" .      # Unused
      $salt2."\0";                        # salt(2)

   $l = string.length( $server_hs )-4;    # Will be <= 255
   $server_hs = string.replaceBytes( $server_hs, string.intToBytes( $l, 1 ), 0 );

   connection.data.set( "state", "wait for clienths" );
   request.sendResponse( $server_hs );
   break;
}

if( $state == "wait for clienths" ) {
   # We've received the client handshake.
   $chs = request.get( 1 );
   $chs_len = string.bytesToInt( $chs );
   $chs = request.get( $chs_len + 4 );

   # user starts at byte 36; password follows after
   $i = string.find( $chs, "\0", 36 );
   $user = string.subString( $chs, 36, $i-1 );
   $encpasswd = string.subString( $chs, $i+2, $i+21 );

   $passwd2 = string.hexDecode( encpassword( $user ) );
   $salt = connection.data.get( "salt" );
   $passwd1 = string_xor( $encpasswd, string.hashSHA1( $salt.$passwd2 ) );

   if( string.hashSHA1( $passwd1 ) != $passwd2 ) {
      log.warn( "User '" . $user . "': authentication failure" );
      connection.data.set( "state", "authentication failed" );
      connection.discard();
   }

   connection.data.set( "user", $user );
   connection.data.set( "passwd1", $passwd1 );
   connection.data.set( "clienths", $chs );
   connection.data.set( "state", "wait for serverhs" );
   request.set( "" );

   # Select pool based on user
   pool.select( pool( $user ) );
   break;
}

if( $state == "wait for client data" ) {
   # Write the client handshake we remembered from earlier to the server,
   # and piggyback the request we've just received on the end
   $req = request.get();
   $chs = connection.data.get( "clienths" );
   $passwd1 = connection.data.get( "passwd1" );
   $salt = connection.data.get( "salt" );

   $encpasswd = string_xor( $passwd1,
      string.hashSHA1( $salt . string.hashSHA1( $passwd1 ) ) );

   $i = string.find( $chs, "\0", 36 );
   $chs = string.replaceBytes( $chs, $encpasswd, $i+2 );

   connection.data.set( "state", "do authentication" );
   request.set( $chs.$req );
   break;
}

# Helper function
sub string_xor( $a, $b ) {
   $r = "";
   while( string.length( $a ) ) {
      $a1 = string.left( $a, 1 ); $a = string.skip( $a, 1 );
      $b1 = string.left( $b, 1 ); $b = string.skip( $b, 1 );
      $r = $r . chr( ord( $a1 ) ^ ord( $b1 ) );
   }
   return $r;
}

The response rule

Configure the following as a response rule, set to run every time, for the MySQL virtual server.

$state = connection.data.get( "state" );
$authok = "\07\0\0\2\0\0\0\02\0\0\0";

if( $state == "wait for serverhs" ) {
   # Read server handshake, remember the salt
   $shs = response.get( 1 );
   $shs_len = string.bytesToInt( $shs )+4;
   $shs = response.get( $shs_len );

   $salt1 = string.substring( $shs, $shs_len-40, $shs_len-33 );
   $salt2 = string.substring( $shs, $shs_len-13, $shs_len-2 );
   connection.data.set( "salt", $salt1.$salt2 );

   # Write an authentication confirmation now to provoke the client
   # to send us more data (the first query). This will prepare the
   # state machine to write the authentication to the server
   connection.data.set( "state", "wait for client data" );
   response.set( $authok );
   break;
}

if( $state == "do authentication" ) {
   # We're expecting two responses.
   # The first is the authentication confirmation, which we discard.
   $res = response.get();
   $res1 = string.left( $res, 11 );
   $res2 = string.skip( $res, 11 );

   if( $res1 != $authok ) {
      $user = connection.data.get( "user" );
      log.info( "Unexpected authentication failure for " . $user );
      connection.discard();
   }

   connection.data.set( "state", "complete" );
   response.set( $res2 );
   break;
}

Testing your configuration

If you have several MySQL databases to test against, testing this configuration is straightforward. Edit the request rule to add the correct passwords and pools, and use the mysql command-line client to make connections:

$ mysql -h zeus -u username -p
Enter password: *******

Check the 'current connections' list in the Traffic Manager UI to see how it has connected each session to a back-end database server.

If you encounter problems, try the following steps:

Ensure that trafficscript!variable_pool_use is set to 'Yes' on the Global Settings page in the UI. This setting allows you to use non-literal values in the pool.use() and pool.select() TrafficScript functions.
Turn on the log!client_connection_failures and log!server_connection_failures settings in the Virtual Server > Connection Management configuration page; these settings configure the traffic manager to write detailed debug messages to the Event Log whenever a connection fails.
Then review your Traffic Manager Event Log and your mysql logs in the event of an error.

Traffic Manager's access logging can be used to record every connection. You can use the special *{name}d log macro to record information stored using connection.data.set(), such as the username used in each connection.

Conclusion

This article has demonstrated how to build a fairly sophisticated protocol parser, where the Traffic Manager-based proxy performs full authentication and inspection before making a load-balancing decision. The protocol parser then performs the authentication again against the chosen back-end server.

Once the client-side and server-side handshakes are complete, Traffic Manager will simply forward data back and forth between the client and the server.

This example addresses the problem of scaling out your MySQL database, giving load balancing and redundancy for database reads ('SELECTs'). It does not address the problem of scaling out your master 'write' server - you need to address that by investing in a sufficiently powerful server, architecting your database and application to minimise the number and impact of write operations, or by selecting a full clustering solution.

The solution leaves a single point of failure, in the form of the master database. This problem could be effectively dealt with by creating a monitor that tests the master database for correct operation. If it detects a failure, the monitor could promote one of the slave databases to master status and reconfigure the 'SQL Master' pool to direct write (UPDATE) traffic to the new MySQL master server.

Acknowledgements

Ian Redfern's MySQL protocol description was invaluable in developing the proxy code.

Appendix - Password Problems?

This example assumes that you are using MySQL 4.1.x or later (it was tested with MySQL 5 clients and servers), and that your database has passwords in the 'long' 41-byte MySQL 4.1 (and later) format (see http://dev.mysql.com/doc/refman/5.0/en/password-hashing.html).

If you upgrade a pre-4.1 MySQL database to 4.1 or later, your passwords will remain in the pre-4.1 'short' format.

You can verify which password format your MySQL database is using as follows:

mysql> select password from mysql.user where user='username';
+------------------+
| password         |
+------------------+
| 6a4ba5f42d7d4f51 |
+------------------+
1 row in set (0.00 sec)

mysql> update mysql.user set password=PASSWORD('password') where user='username';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select password from mysql.user where user='username';
+-------------------------------------------+
| password                                  |
+-------------------------------------------+
| *14521EA7B4C66AE94E6CFF753453F89631AE57EF |
+-------------------------------------------+
1 row in set (0.00 sec)

If you can't create 'long' passwords, your database may be stuck in 'short' password mode. Run the following command to resize the password table if necessary:

$ mysql_fix_privilege_tables --password=admin password

Check that 'old_passwords' is not set to '1' in your my.cnf configuration file.

Check that the mysqld process isn't running with the --old-passwords option.

Finally, ensure that the privileges you have configured apply to connections from the Stingray proxy. You may need to GRANT ... TO 'user'@'%', for example.
FTP is an example of a 'server-first' protocol: the back-end server sends a greeting message before the client sends its first request. This means that the traffic manager must establish the connection to the back-end node before it can inspect the client's request.

Fortunately, it's possible to implement a full protocol proxy in Stingray's TrafficScript language. This article (dating from 2005) explains how.

FTP Virtual Hosting scenario

We're going to manage the following scenario:

A service provider is hosting FTP services for organizations - ferrarif1.com, sauberf1.com and minardif1.com. Each organization has its own cluster of FTP servers:

Ferrari have 3 Sun E15Ks in a pool named 'ferrari ftp'
Sauber have a couple of old, ex-Ferrari servers in Switzerland, in a pool named 'sauber ftp'
Minardi have a capable and cost-effective pair of pizza-box servers in a pool named 'minardi ftp'

The service provider hosts the FTP services through Stingray, and requires that users log in with their email address. If a user logs in as 'rbraun@ferrarif1.com', Stingray will connect the user to the 'ferrari ftp' pool and log in with username 'rbraun'.

This is made complicated because an FTP connection begins with a 'server hello' message as follows:

220 ftp.zeus.com FTP server (Version wu-2.6.1-0.6x.21) ready.

... before reading the data from the client.

Configuration

Create the virtual server (FTP, listening on port 21) and the three pools ('ferrari ftp' etc.). Configure the default pool for the virtual server to be the discard pool.

Configure the virtual server connection management settings, setting the FTP serverfirst_banner to:

220 F1 FTP server ready.

Add the following TrafficScript request rule to the virtual server, setting it to run every time:

$req = string.trim( request.endswith( "\n" ) );

if( !string.regexmatch( $req, "USER (.*)", "i" ) ) {
   # if we're connected, forward the message; otherwise
   # return a login prompt
   if( connection.getNode() ) {
      break;
   } else {
      request.sendresponse( "530 Please log in!!\r\n" );
      break;
   }
}
$loginname = $1;

# The login name should look like 'user@host'
if( ! string.regexmatch( $loginname, "(.*)@(.*)" ) ) {
   request.sendresponse( "530 Incorrect user or password!!\r\n" );
   break;
}
$user = $1;
$domain = string.lowercase( $2 );

request.set( "USER ".$user."\r\n" );

# select the pool we want...
if( $domain == "ferrarif1.com" ) {
   pool.use( "ferrari ftp" );
} else if( $domain == "sauberf1.com" ) {
   pool.use( "sauber ftp" );
} else if( $domain == "minardif1.com" ) {
   pool.use( "minardi ftp" );
} else {
   request.sendresponse( "530 Incorrect user or password!!\r\n" );
}

And that's it! Stingray automatically slurps and discards the serverfirst banner message from the back-end FTP servers when it connects on the first request.

More...

Here's a more sophisticated example which reads the username and password from the client before attempting to connect. You could add your own authentication at this stage (for example, using http.request.get or auth.query to query an external server; see the sketch after the response rule below) before initiating the connection to the back-end FTP server:

TrafficScript request rule

$req = string.trim( request.endswith( "\n" ) );

if( string.regexmatch( $req, "USER (.*)" ) ) {
   connection.data.set( "user", $1 );
   $msg = "331 Password required for ".$1."!!\r\n";
   request.sendresponse( $msg );
   break;
}

if( !string.regexmatch( $req, "PASS (.*)" ) ) {
   # if we're connected, forward the message; otherwise
   # return a login prompt
   if( connection.getNode() ) {
      break;
   } else {
      request.sendresponse( "530 Please log in!!\r\n" );
      break;
   }
}

$loginname = connection.data.get( "user" );
$pass = $1;

# The login name should look like 'user@host'
if( ! string.regexmatch( $loginname, "(.*)@(.*)" ) ) {
   request.sendresponse( "530 Incorrect user or password!!\r\n" );
   break;
}
$user = $1;
$domain = string.lowercase( $2 );

# You could add your own authentication at this stage.
# If the username and password is invalid, do the following:
#
# if( $badpassword ) {
#    request.sendresponse( "530 Incorrect user or password!!\r\n" );
#    break;
# }

# now, replay the correct request against a new
# server instance
connection.data.set( "state", "connecting" );
request.set( "USER ".$user."\r\nPASS ".$pass."\r\n" );

# select the pool we want...
if( $domain == "ferrarif1.com" ) {
   pool.use( "ferrari ftp" );
} else if( $domain == "sauberf1.com" ) {
   pool.use( "sauber ftp" );
} else if( $domain == "minardif1.com" ) {
   pool.use( "minardi ftp" );
} else {
   request.sendresponse( "530 Incorrect user or password!!\r\n" );
}

TrafficScript response rule

if( connection.data.get("state") == "connecting" ) {
   # We've just connected, but Stingray doesn't slurp the serverfirst
   # banner until after this rule has run.
   # Slurp the first line (the serverfirst banner), the second line
   # (the 331 need password) and then replace the serverfirst banner
   $first = response.getLine();
   $second = response.getLine( "\n", $1 );
   $remainder = string.skip( response.get(), $1 );
   response.set( $first.$remainder );
   connection.data.set( "state", "" );
}

Remember that both rules must be set to 'run every time'.
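For the authentication hook flagged in the comments above, one possibility is to ask an external web service to validate the credentials. The fragment below is only a sketch: the URL and the 'OK' response convention are invented for illustration, and a real deployment would need to URL-escape $user and $pass before building the query string.

# Hypothetical external check: an internal web service that returns
# the body "OK" when the username/password pair is valid
$resp = http.request.get( "http://auth.internal/ftpcheck?user=" . $user .
                          "&pass=" . $pass );
if( $resp != "OK" ) {
   request.sendresponse( "530 Incorrect user or password!!\r\n" );
   break;
}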
Lots of websites provide a protected area for authorized users to log in to. For instance, you might have a downloads section on your site where customers can access the software that they have bought.

There are many different ways to protect web pages with a user name and password, but whichever you choose, a user's login details can quickly be spread around. Once the details are common knowledge, anyone can log in and access the site without paying.

Stingray and TrafficScript to the rescue!

Did you know that TrafficScript can be used to detect when a username and password are used from several different locations? You can then choose whether to disable the account or give the user a new password. All this can be done without replacing any of your current authentication systems on your website.

Looks like the login details for user 'ben99' have been leaked! How can we stop people leeching from this account?

For this example, we'll use a website where the entire site is protected with a PHP script that handles the authentication. It will check a user's password, and then set a USER cookie filled in with the user name. The details of the authentication scheme are not important. In this instance, all that matters is that TrafficScript can discover the user name of the account.

Writing the TrafficScript rule

First of all, TrafficScript needs to ignore any requests that aren't authenticated:

$user = http.getCookie( "USER" );
if( $user == "" ) break;

Next, we'll need to discover where the user is coming from. We'll use the IP address of their machine. However, they may also be connecting via a proxy, in which case we'll use the address supplied by the proxy.

$from = request.getRemoteIP();
$proxy = http.getHeader( "X-Forwarded-For" );
if( $proxy != "" ) $from = $proxy;

TrafficScript needs to keep track of which IP addresses have been used for each account. We will have to store a list of the IP addresses used. TrafficScript provides persistent storage with the data.get() and data.set() functions.

$list = data.get( $user );
if( !string.contains( $list, $from )) {
   # Add this entry in, padding list with spaces
   $list = sprintf( "%19s %s", $from, $list );
   ...

Now we need to know how many unique IP addresses have been used to access this account. If the list has grown too large, then don't let this person fetch any more pages.

   # Count the number of entries in the list. Each entry is 20
   # characters long (the 19 in the sprintf plus a space)
   $entries = string.length( $list ) / 20;
   if( $entries > 4 ) {
      # Kick the user out with an error message
      http.sendResponse( "403 Permission denied", "text/plain",
                         "Account locked", "" );
   } else {
      # Update the list of IP addresses
      data.set( $user, $list );
   }
}

That's it! If a single account on your site is accessed from more than four different locations, the account will be locked out, preventing abuse.

As this is powered by TrafficScript, further improvements can be made. We can extend the protection in many ways, without having to touch the code that runs your actual site. Remember, this can be deployed with any kind of authentication being used - TrafficScript just needs the user name.

A more advanced example

This has a few new improvements. First of all, the account limits are given a timeout, enabling someone to access the site from different locations (e.g. home and office), while still catching abuse if the account is used simultaneously in different locations. Secondly, any abuse is logged, so that an administrator can check up on leaked accounts and take appropriate action. Finally, to show that we can work with other login schemes, this example uses HTTP Basic Authentication to get the user name.

# How long to keep data for each userid (seconds)
$timelimit = 3600;
# Maximum number of different IP addresses to allow a client
# to connect from
$maxips = 4;

# Only interested in HTTP Basic authentication
$h = http.getHeader( "Authorization" );
if( !string.startsWith( $h, "Basic " )) continue;

# Extract and decode the username:password combination
$enc = string.skip( $h, 6 );
$userpasswd = string.base64decode( $enc );

# Work out where the user came from. If they came via a proxy,
# then ensure that we don't log the proxy's IP address(es)
$from = request.getRemoteIP();
$proxy = http.getHeader( "X-Forwarded-For" );
if( $proxy != "" ) $from = $proxy;

# Have we seen this user before? We will store a space separated
# list of all the IPs that we have seen the user connect from
$list = data.get( $userpasswd );

# Also check the timings. Only keep the records for a fixed period
# of time, then delete them.
$time = data.get( "time-" . $userpasswd );
$now = sys.time();

if(( $time == "" ) || (( $now - $time ) > $timelimit )) {
   # Entry expired (or hasn't been created yet). Start with a new
   # list and timestamp.
   $list = "";
   $time = $now;
   data.set( "time-" . $userpasswd, $time );
}

if( !string.contains( $list, $from )) {
   # Pad each entry in the list with spaces
   $list = sprintf( "%19s %s", $from, $list );

   # Count the number of entries in the list. Each entry is 20
   # characters long (the 19 in the sprintf plus a space)
   $entries = string.length( $list ) / 20;

   # Check if the list of used IP addresses is too large - if so,
   # send back an error page!
   if( $entries > $maxips ) {
      # Put a message in the logs so the admins can see the abuse
      # (Ensure that we show the username but not the password)
      $user = string.substring( $userpasswd, 0,
                                string.find( $userpasswd, ":" ) - 1 );
      log.info( "Login abuse for account: " . $user . " from " . $list );
      http.sendResponse( "403 Permission denied", "text/html",
                         "Your account is being accessed by too many users", "" );
   } else {
      # Update the list and let the user through
      data.set( $userpasswd, $list );
   }
}

This article was originally written by Ben Mansell in March 2007.
"The 'contact us' feature on many websites is often insecure, and makes it easy to launch denial of service (DoS) attacks on corporate mail servers," according to UK-based security consultancy SecureTest, as reported in The Register.   This article describes how such an attack might be launched, and how it can be easily mitigated against by using a traffic manager like Stingray.   Mounting an Attack   Many websites contain a "Contact Us" web-based form that generates an internal email. An attacker can use a benchmarking program like ApacheBench to easily submit a large number of requests to the form, bombarding the target organization's mail servers with large volumes of traffic.   Step 1. Identify the target web form and deduce the POST request     An appropriate POST request for the http://www.site.com/cgi-bin/mail.aspx page would contain the form parameters and desired values as an application/x-www-form-urlencoded file (ignore line breaks):   email_subject=Site+Feedback&mailto=target%40noname.org& email_name=John+Doe&email_from=target%40noname.org&email_country=US& email_comments=Ha%2C+Ha%2C+Ha%21%0D%0ADon%27t+try+this+at+home   Step 2. Mount the attack   The following example uses ApacheBench to submit the POST request data in postfile.txt. ApacheBench creates 10 users who send 10,000 requests to the target system.   # ab -p postfile.txt -c 10 -n 10000 -T application/x-www-form-urlencoded http://www.site.com/cgi-bin/mail.aspx   The attack is worsened because the web server typically resides inside the trusted DMZ and is not subject to the filtering that untrusted external clients must face. Additionally, this direct attack bypasses any security or validation that is built into the web form.   Ken Munro of SecureTest described the results of the firm's penetration testing work with clients. "By explicit agreement we conduct a 'contact us' DoS, and in every case we've tried so far, the client's mail server stops responding during the test window."   Defending against the Attack   There is a variety of ways to defend against this form of attack, but one of the easiest ways would be to rate-limit requests to the web-based form.   In Stingray, you can create a 'Rate Shaping Class'; we'll create one named 'mail limit' that restricts traffic to 2 requests per minute:     Using TrafficScript, we rate-limit traffic to the mail.aspx page to 2 requests per minute in total:   if( http.getPath() == "/cgi-bin/mail.aspx" ) { rate.use( "mail limit" ); }   In this case, one attacker could dominate the form and prevent other legitimate users from using it. So, we could instead limit each individual user (identified by source IP address) to 2 requests per minute:   if( http.getPath() == "/cgi-bin/mail.aspx" ) { rate.use( "mail limit", request.getRemoteIP() ); }   In the case of a distributed denial of service attack, we can rate limit on other criteria. For example, we could extract the 'name' field from the submitted data and rate-shape on that basis:   if( http.getPath() == "/cgi-bin/mail.aspx" ) { $name = http.getFormParam( "name" ); rate.use( "mail limit", $name ); }   Stingray gives you a very quick, simple and non-disruptive method to limit accesses to a vulnerable or resource-heavy web-based form like this. This solution illustrates one of the many ways that Stingray's traffic inspection and control can be used to help secure your public facing services.
...Riverbed customers protected!   When I got into the office this morning, I wasn't expecting to read about a new BIND 9 exploit! So as soon as I'd had my first cup of tea, I sat down to put together a little TrafficScript magic to protect our customers.

BIND Dynamic Update DoS

The exploit works by sending a specially crafted DNS Update packet to a zone's master server. Upon receiving the packet, the DNS server will shut down. ISC, the creators of BIND, have this to say about the new exploit:

"Receipt of a specially-crafted dynamic update message to a zone for which the server is the master may cause BIND 9 servers to exit. Testing indicates that the attack packet has to be formulated against a zone for which that machine is a master. Launching the attack against slave zones does not trigger the assert."

"This vulnerability affects all servers that are masters for one or more zones – it is not limited to those that are configured to allow dynamic updates. Access controls will not provide an effective workaround."

Sounds nasty, but how easy is it to get access to code to exploit this vulnerability? Well, the guy who found the bug posted a fully functional perl script with the Debian Bug Report.

TrafficScript to the Rescue

I often talk to customers about how TrafficScript can be used to quickly patch bugs and vulnerabilities while they wait for a fix from the vendor or their own development teams. It's time to put my money where my mouth is, so here's the workaround for this particular vulnerability:

$data = request.get( 3 );
if ( string.regexmatch($data, "..[()]" ) ) {
   log.warn("FOUND UPDATE PACKET");
   connection.discard();
}

The above TrafficScript reads the first three bytes of the request and checks the Opcode field in the DNS header: '(' and ')' are the two values (0x28 and 0x29) that the third byte takes when the packet is an UPDATE, and if we see either, we discard the connection. Obviously you could extend this script to add a white list of servers which you want to allow updates from if necessary. However, you should only have this script in place while your servers are vulnerable, and you should apply patches as soon as you can.

Be safe!
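For completeness, here is a minimal sketch of the white-list variant mentioned above. The addresses in $trusted are placeholders for your own update sources, and the rule only uses functions shown elsewhere in these articles:

# Trusted servers that may send dynamic updates (placeholder IPs).
# string.contains is a simple substring test, so each address is
# followed by a space and we match "$ip " to avoid partial matches
# (e.g. 10.0.0.5 matching 10.0.0.50).
$trusted = "10.0.0.5 10.0.0.6 ";

$data = request.get( 3 );
if( string.regexmatch( $data, "..[()]" ) ) {
   $ip = request.getRemoteIP();
   if( ! string.contains( $trusted, $ip . " " ) ) {
      log.warn( "Dropping UPDATE from untrusted source " . $ip );
      connection.discard();
   }
}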
If you are unfortunate enough to suffer a total failure of all of your webservers, all is not lost. Stingray Traffic Manager can host custom error pages for you, and this article shows you how! If all of the servers in a pool have failed, you have several options:

Each pool can be configured to have a 'Failure Pool'. This is used when all of the nodes in the primary pool have completely failed.
You may configure the traffic manager to send an HTTP Redirect message, directing your visitors to an alternate website.

However, you may reach the point where you've got nowhere to send your users. All your servers are down, so failure pools are not an option, and you can't redirect a visitor to a different site for the same reason. In this case, you can use a third option: you may configure a custom error page which is returned with every request.

Custom error pages

Use the error_file setting in the Virtual Server configuration to specify the response if the back-end servers are not functioning. The error file should be placed in your Extra Files catalog (in the 'miscellaneous files' class):

<html>
<head>
<title>Sorry</title>
<link rel="stylesheet" href="main.css" type="text/css" media="screen" >
</head>
<body>
<img src="logo.gif">
<h1>Our apologies</h1>
We're sorry. All of our operators are busy. Please try again later.
</body>
</html>

This HTML error page will now be returned whenever an HTTP request is received and all of your servers are down.

Embedding images and other resources

Note that the HTML page has embedded images and stylesheets. Where are these files hosted? With the current configuration, the error page will be returned for every request.

You can use a little TrafficScript to detect requests for files referenced by the error page, and serve content directly from the conf/extra/ directory.

First, we'll modify the error page slightly to make it easier to recognize requests for files used by the error page:

<link rel="stylesheet" href="/.extra/main.css" type="text/css" media="screen">

and

<img src="/.extra/logo.gif">

Then, upload the main.css and logo.gif files, and any others you use, to the Extra Files catalog.

Finally, the following TrafficScript request rule can detect requests for those files and will make the traffic manager serve the response directly:

# Only process requests that begin '/.extra/'
$url = http.getPath();
if( ! string.regexmatch( $url, "^/\\.extra/(.*)$" ) ) {
   break;
} else {
   $file = $1;
}

# If the file does not exist, stop
if( ! resource.exists( $file ) ) break;

# Work out the MIME type of the file from its extension
$mimes = [
   "html" => "text/html",
   "jpg"  => "image/jpeg",
   "jpeg" => "image/jpeg",
   "png"  => "image/png",
   "gif"  => "image/gif",
   "js"   => "application/x-javascript",
   "css"  => "text/css"
];
if( string.regexmatch( $file, ".*\\.([^.]+)$" ) ) {
   $mime = $mimes[ $1 ];
}
if( ! $mime ) $mime = "text/plain";

# Serve the file from the conf/extra directory
$contents = resource.get( $file );
http.sendResponse( "200 OK", $mime, $contents, "" );

Copy and paste this TrafficScript into the Rules Catalog, and assign it as a request rule to the virtual server. Images (and css or js files) that are placed in the Extra Files catalog can be referred to using /.extra/imagename.png. You will also be able to test your error page by browsing to /.extra/errorpage.html (assuming the file is called errorpage.html in the extra directory).
What can you do if an isolated problem causes one or more of your application servers to fail? How can you prevent visitors to your website from seeing the error, and instead send them a valid response?

This article shows how to use TrafficScript to inspect responses from your application servers and retry the requests against several different machines if a failure is detected.

The Scenario

Consider the following scenario. You're running a web-based service on a cluster of four application servers, running .NET, Java, PHP, or some other application environment. An occasional error on one of the machines means that one particular application sometimes fails on that one machine. It might be caused by a runaway process, a race condition when you update configuration, or by failing system memory.

With Stingray, you can check the responses coming back from your application servers. For example, application errors may be identified by a '500 Internal Error' or '502 Bad Gateway' message (refer to the HTTP spec for a full list of error codes).

You can then write a response rule that retries the request a certain number of times against different servers, to see if it gets a better response before sending it back to the remote user:

$code = http.getResponseCode();
if( $code >= 500 && $code != 503 ) {
   # Not retrying 503s here, because they get retried
   # automatically before response rules are run
   if( request.getRetries() < 3 ) {
      # Avoid the current node when we retry,
      # if possible
      request.avoidNode( connection.getNode() );
      log.warn( "Request " . http.getPath() .
                " to site " . http.getHostHeader() .
                " from " . request.getRemoteIP() .
                " caused error " . http.getResponseCode() .
                " on node " . connection.getNode() );
      request.retry();
   }
}

How does the rule work?

The rule does a few checks before telling Stingray to retry the request:

1. Did an error occur?

First of all, the rule checks to see if the response code indicated that an error occurred:

if( $code >= 500 && $code != 503 ) { ... }

If your service was prone to other types of error - for example, Java backtraces might be found in the middle of a response page - you could write a TrafficScript test for those errors instead (see the sketch at the end of this article).

2. Have we retried this request before?

Some requests may always generate an error response. We don't want to keep retrying a request in this case - we've got to stop at some point:

if( request.getRetries() < 3 ) { ... }

request.getRetries() returns the number of times that this request has been resent to a back-end node. It's initially 0; each time you call request.retry(), it is incremented.

This code will retry a request 3 times, in addition to the first time that it was processed.

3. Don't use the same node again!

When you retry a request, the load-balancing decision is recalculated to select the target node. However, you will probably want to avoid the node that generated the error before, as it may be likely to generate the error again:

request.avoidNode( connection.getNode() );

connection.getNode() returns the name of the node that was last used to process the request. request.avoidNode() gives the load-balancing algorithm a hint that it should avoid that node. The hint is just advisory - if there are no other available nodes in the pool, that node will be used anyway.

4. Log what we're about to do.

This rule conceals problems with the service so that the end user does not see them. If it works well, these problems may never be found!

log.warn( "Request " . http.getPath() .
" to site " . http.getHostHeader() . " from " . request.getRemoteAddr() . " caused error " . http.getResponseCode() . " on node " . connection.getNode() );   It's a sensible idea to log the fact that a request caused an unexpected error so that the problem can be investigated later.   5. Retry the request   Finally, tell Stingray to resubmit the request again, in the hope that this time we'll get a better response:   request.retry();   And that's it.   Notes   If a malicious user finds an HTTP request that always causes an error, perhaps because of an application bug, then this rule will replay the malicious request against 3 additional machines in your cluster. This makes it easier for the user to mount a DoS-style attack against your site, because he only needs to send 1/4 of the number of requests.   However, the rule explicitly logs that a failure occured, and logs both the request that caused the failure and the source of the request. This information is vital when performing triage, i.e., rapid fault fixing. Once you have noticed that the problem exists, you can very quickly add a request rule to drop the bad request before it is ever processed:   if( http.getPath() == "/known/bad/request" ) connection.discard();
When you move content around a web site, links break. Even if you've patched up all your internal links, visitors arriving from external links, outdated search results and stale bookmarks will still hit missing pages and receive a '404 Not Found' error.

Rather than giving each user a sorry "404 Not Found" apology page, how about trying to send them to a useful page? The following TrafficScript example shows you exactly how to do that, without having to modify any of your web site content or configuration.

The TrafficScript rule works by inspecting the response from the webserver before it's sent back to the remote user. If the status code of the response is '404', the rule sends back a redirect to a higher-level page:

http://www.site.com/products/does/not/exist.html returns 404, so try:
http://www.site.com/products/does/not/ returns 404, so try:
http://www.site.com/products/does/ returns 404, so try:
http://www.site.com/products/ which works fine!

Here is the code (it's a Stingray response rule):

if( http.getResponseCode() == 404 ) {
   $path = http.getPath();

   # If the home page gives a 404, nothing we can do!
   if( $path == "/" ) http.redirect( "http://www.google.com/" );

   if( string.endsWith( $path, "/" ) ) $path = string.drop( $path, 1 );
   $i = string.findr( $path, "/" );
   $path = string.substring( $path, 0, $i-1 )."/";
   http.redirect( $path );
}

Your users will never get a 404 Not Found message for any web page on your site; Stingray will try higher and higher pages until it finds one that exists.

Of course, you could use a similar strategy for other application errors, such as 503 Too Busy (see the sketch at the end of this article).

The same for images...

This strategy works fine for web pages, but it's not appropriate for embedded content such as missing images, stylesheets or javascript files.

For some content types, a 404 response is not user-visible and is acceptable. For images, it may not be: some browsers will display a broken-image icon, where a simple transparent GIF image would be more appropriate:

if( http.getResponseCode() == 404 ) {
   $path = http.getPath();

   # If the home page gives a 404, nothing we can do!
   if( $path == "/" ) http.redirect( "http://www.google.com/" );

   # Is it an image? Serve a 1x1 transparent GIF instead,
   # and stop so we don't fall through to the redirect below
   if( string.endsWith( $path, ".gif" ) ||
       string.endsWith( $path, ".jpg" ) ||
       string.endsWith( $path, ".png" ) ) {
      http.sendResponse( "200 OK", "image/gif",
         "GIF89a\x01\x00\x01\x00\x80\xff\x00\xff\xff\xff\x00\x00\x00\x2c\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02\x44\x01\x00\x3b", "" );
      break;
   }

   # Is it a stylesheet (.css) or javascript file (.js)?
   if( string.endsWith( $path, ".css" ) ||
       string.endsWith( $path, ".js" ) ) {
      http.sendResponse( "404 Not Found", "text/plain", "", "" );
      break;
   }

   if( string.endsWith( $path, "/" ) ) $path = string.drop( $path, 1 );
   $i = string.findr( $path, "/" );
   $path = string.substring( $path, 0, $i-1 )."/";
   http.redirect( $path );
}
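Here is the sketch promised above for the 503 case: a minimal response rule that assumes a static '/busy.html' page exists on your site (the page name is illustrative; it could equally be hosted from the Extra Files catalog using the technique from the custom error pages article above):

if( http.getResponseCode() == 503 &&
    http.getPath() != "/busy.html" ) {
   # Send the visitor to a friendly 'please try later' page
   # instead of a raw 503 error
   http.redirect( "/busy.html" );
}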
Popular news and blogging sites such as Slashdot and Digg have huge readerships. They are community-driven and allow their members to post articles on various topics ranging from hazelnut chocolate bars to global warming. These sites, due to their massive readership, have the power to generate huge spikes in the web traffic of those (un)fortunate enough to get mentioned in their articles. Fortunately, Traffic Manager and TrafficScript can help.

If the referenced site happens to be yours, you are faced with dealing with this sudden and unpredictable spike in bandwidth and request rate, causing:

a large proportion or all of your available bandwidth to be consumed by visitors referred to you by this popular site; and
in extreme cases, a cascade failure across your web servers as each one becomes overloaded, fails and, in doing so, adds further load onto the remaining web servers.

Bandwidth Management and Rate Shaping

Traffic Manager has the ability to shape traffic in two important ways. Firstly, you can restrict the amount of bandwidth any client or group of clients is allowed to consume. This is commonly known as "Bandwidth Management" and in Traffic Manager it's configured by using a bandwidth class. Bandwidth classes are used to specify the maximum bits per second to make available. The alternative method is to limit the number of requests that those clients or group of clients can make per second and/or per minute. This is commonly known as "Rate Shaping" and is configured within a rate class.

Both Rate Shaping and Bandwidth Management classes are configured and stored within the catalog section of Traffic Manager. Once you have created a class it is ready for use and can be applied to one or more of your Virtual Servers. However, the true power of these traffic shaping features really becomes apparent when you make use of them with TrafficScript.

What is an Abusive Referer?

I would class an abusive referer as any site on the internet that refers enough traffic to your server to overwhelm it and effectively deny service to other users. This abuse is usually unintentional; the problem lies in the sheer number of people wanting to visit your site at that one time. This slashdot effect can be detected and dealt with by a TrafficScript rule and either a bandwidth or a rate class.

Detecting and Managing Abusive Referers

Example One

Take a look at the TrafficScript below for an example of how you could stop a site (in this instance Slashdot) from using a large proportion or all of your available bandwidth:

$referrer = http.getHeader( "Referer" );
if( string.contains( $referrer, "slashdot" ) ) {
   http.addResponseHeader( "Set-Cookie", "slashdot=1" );
   response.setBandwidthClass( "slashdot" );
}
if( http.getCookie( "slashdot" ) ) {
   response.setBandwidthClass( "slashdot" );
}

In this example we are specifically targeting Slashdot users and preventing them from using more bandwidth than we have allotted them in our "slashdot" bandwidth class. This rule requires you to know the name of the site you want protection from, but it could be modified to defend against other high-traffic sites.

Example Two

The next example is a little more complicated, but will automatically limit all requests from any referer. I've chosen to use two rate classes here: BusyReferer for those sites I allow to send a large amount of traffic, and StandardReferer for those I don't.
At the top I specify a $whitelist, which contains sites I never want to rate-shape, and $highTraffic, which is a list of sites I'm going to shape with my BusyReferer class. By default, all traffic not in the white list is sent through one of my rate classes, but only on entry to the site; that's because subsequent requests will have my own site as the referer, which is whitelisted. In times of high load, when a referer is sending more traffic than the rate class allows, a backlog will build up; at that point we will also start issuing cookies to put the offending referers into a bandwidth class.

# Referer whitelist. These referers are never rate limited.
$whitelist = "localhost 172.16.121.100";

# Referers that are allowed to pass a higher number of clients.
$highTraffic = "google mypartner.com";

# How many queued requests are allowed before we track users.
$shapeQueue = 2;

# Retrieve the referer and strip out the domain name part.
$referer = http.getHeader( "Referer" );
$referer = string.regexsub( $referer, ".*?://(.*?)/.*", "$1", "i" );

# Check to see if this user has already been given an abuse cookie.
# If they have, we'll force them into a bandwidth class
if ( $cookie = http.getCookie( "AbusiveReferer" ) ) {
   response.setBandwidthClass( "AbusiveReferer" );
}

# If the referer is whitelisted then exit.
if ( string.contains( $whitelist, $referer ) ) {
   break;
}

# Put the incoming users through the busy or standard rate classes
# and check the queue length for their referer.
if ( string.contains( $highTraffic, $referer ) ) {
   $backlog = rate.getBacklog( "BusyReferer", $referer );
   rate.use( "BusyReferer", $referer );
} else {
   $backlog = rate.getBacklog( "StandardReferer", $referer );
   rate.use( "StandardReferer", $referer );
}

# If we have exceeded our backlog limit, then give them a cookie;
# this will enforce bandwidth shaping for subsequent requests.
if ( $backlog > $shapeQueue ) {
   http.setResponseCookie( "AbusiveReferer", $referer );
   response.setBandwidthClass( "AbusiveReferer" );
}

In order for the TrafficScript to function optimally, you must enter your server's own domain name(s) into the white list. If you do not, then the script will perform rate shaping on everyone surfing your website!

You also need to set appropriate values for the BusyReferer and StandardReferer shaping classes. Remember, we're only counting each client's entry to the site, so perhaps you want to set 10/minute as the maximum standard rate and 20/minute for your BusyReferer rate.

In this script we also use a bandwidth class for when things get busy. You will need to create this class, called "AbusiveReferer", and assign it an appropriate amount of bandwidth. Users are only put into this class when their referer is exceeding the rate of referrals set by the relevant rate class.

Shaping with Context

Rate shaping classes can be given a context, so you can apply the class to a subset of users based on a piece of key data. The second script uses context to create an instance of the rate shaping class for each referer. If you do not use context, then all referers will share the same instance of the rate class.

Conclusion

Traffic Manager can use bandwidth and rate shaping classes to control the number of requests that can be made by any group of clients. In this article, we have covered choosing the class based on the referer, which has allowed us to restrict the rate at which any one site can refer visitors to us.
These examples could be modified to base the restrictions on other data, such as cookies, or even extended to work with other protocols. A good example would be FTP, where you could extract the username from the FTP logon data and apply a bandwidth class based on the username (a rough sketch follows below).
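As a parting illustration, here is a very rough sketch of the FTP idea, for a rule attached to a generic-protocol virtual server. The parsing is deliberately simplistic, the assumption that the USER command has already been buffered may not hold on all connections, and the 'ftp-restricted' class name and 'bulkuser' account are illustrative only:

# Read the data received so far on the FTP control connection
$data = request.get();

# Pick out the 'USER <name>' command from the logon exchange
if( string.regexmatch( $data, "USER ([^\r\n]+)" ) ) {
   $user = $1;
   # Apply a restrictive bandwidth class to a known heavy user
   if( $user == "bulkuser" ) {
      response.setBandwidthClass( "ftp-restricted" );
   }
}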