cancel
Showing results for 
Search instead for 
Did you mean: 

Consistent Hash Load Balancing Algorithm - A TrafficScript Library to save energy, money, and polar bears

sameh
Contributor

Consistent Hash Load Balancing Algorithm - A TrafficScript Library to save energy, money, and polar bears

 

In a previous post I was offered to share my TS code to achieve a consistently hashed load balancing.

 

DISCLAIMER: I am definitely not a developer, this code is ugly, naive, shy and I am pretty sure there are off-by-one errors imbalancing the share of requests each node will receive, feel free to debug them, but I'm too lazy to do it now it works decently Smiley Happy So don't judge me too hard: you can laugh at the code, but please make sure I can't hear you.

 

Obviously it would be much better if handled internally as one of the current load balancing algorithms, but it still isn't, and I believe this feature is very helpful, so here I post it.

 

An archetypical scenario: when vTM is running in front of a bunch of caching servers. You will probably want to optimize your caching surface by not fragmenting it, e.g. by sending 2 identical requests to the same server.

This could apply to a lesser extent to dynamic backends, 

 

 

That's where consistent hashes come handy: they can balance requests equally (not accurately here, see disclaimer) among a pool of nodes and have the awesome property to only rebalance 1/n of the traffic when a node comes down (where n = pool size).

 

Read more about them here.

 

 

Not going through details, the code is pretty straightforward and written naively, following general chash principles.

 

One library. One rule to call it.

 

The CHashLib:

 

# blame sameh


sub abs ($num) {
   if ($num < 0) {
      return (-$num);
   } else {
      return ($num);
   }
}


sub haHash($str) {
   # Cheap but enough I guess
    return (abs(string.bytesToInt(string.substring(string.hashMD5($str), 0, 3))));
}


sub buildCHashRing($pool) {     #builds the in-memory ring

   # **filtered** Feel free to update those arbitrary numbers to your needs depending on your SLAs
   $delayBeforeRefresh = 10 + math.random(14); # refresh the ring every 10-23 seconds pe
   $nbReplicas = 10;         # how many entries on ring per node in pool

   $lastUpdate = data.get($pool . "_lastRingUpdate");
   if ($lastUpdate == "") { $lastUpdate = 0; }
   $currentTime = sys.time();

   if ($lastUpdate + $delayBeforeRefresh < $currentTime) { # Time to refresh!
      data.set($pool . "_lastRingUpdate", $currentTime);  # Mark the ring as updated + is a way to reserve the computing slot when concurrent checks occur
      $ring = [];
      $indexList = [];
      foreach ($node in pool.listactivenodes($pool)) {    # Very important to notice when a failed node comes back up (i.e. active health checking is MANDATORY)
         for ($replica=0; $replica < $nbReplicas; $replica++) {
            $hashedIndex = haHash($node . "_" . $replica);
            $ring[$hashedIndex] = $node;
            array.push($indexList, $hashedIndex);
         }
      }
      array.sortNumerical($indexList);
      data.set($pool . "_CHashRing", $ring);
      data.set($pool . "_indexList", $indexList);
      return ($ring);
   } else {
      return (data.get($pool . "_CHashRing"));
   }
}


# returns the name of the node we want to use according to haHash($str)
sub getNodeFromRing($pool, $str) { # pool name and string to hash for match (beginning of URIs as of now)

   # shall we buildCHashRing from here? probably smarter to do it from rule in case we query multiple times
   buildCHashRing($pool);
   $searchedHash = haHash($str);
   $indexList = data.get($pool . "_indexList");

   if (!lang.isArray($indexList) || array.length($indexList) < 1) { return (""); }

   $currentIndex = array.shift($indexList);
   $electedIndex = $currentIndex;

   while (array.length($indexList) > 0 && $searchedHash > $currentIndex) {
      $electedIndex = $currentIndex;
      $currentIndex = array.shift($indexList);
   }

   if (array.length($indexList) == 0 && $searchedHash > $currentIndex) { # fin du ring
      $electedIndex = $currentIndex;
   }
   return (data.get($pool . "_CHashRing")[$electedIndex]);
}


# load balances a request if the elected node is alive, else let ZXTM fallback to regular load balancing
sub poolSelectFromRing($pool, $str) {
   $node = getNodeFromRing($pool, $str);
   if ($node != "" && array.contains(pool.listactivenodes($pool), $node)) {
     $node = string.split($node, ":");
     pool.select($pool, $node[0], $node[1]);
   } else {
     pool.select($pool);
   }
}

 

A request rule calling the library to balance a request according to arbitrary criterias:

 

# blame sameh

import CHashLib;
# Hash considering only the first 4 characters in the URI CHashLib.poolSelectFromRing("poolname", string.left(http.getPath(), 4));

 

Hope this helps, have fun!

 

Tags (2)