There's one important class of applications where ADCs make a very significant performance difference through TCP offload, request/response buffering and HTTP keepalive optimization.
A number of application frameworks have fixed concurrency limits. Apache is the most notable (the prefork MPM has a default limit of 256 concurrent processes); Mongrel (Ruby) and others have a fixed number of worker processes, and some Java app servers have an equivalent limit. The reason for these fixed limits is a pragmatic one: each TCP connection takes a concurrency slot, which corresponds to a heavyweight process or thread. Too many concurrent processes or threads will bring the server to its knees, and this can easily be exploited remotely if the limit is not low enough.
The implication of this limit is that the server cannot service more than a certain number of TCP connections concurrently. Additional connections are queued in the OS's listen queue until a concurrency slot is released. In most cases, an idle client keepalive connection also occupies a concurrency slot, which is why common Apache tuning advice recommends that keepalives be disabled or limited.
When you benchmark a concurrency-limited server over a fast local network, connections are established, serviced and closed rapidly. Concurrency slots are only occupied for a short period of time, connections are not queued for long, so the performance achieved is high.
However, when you place the same server in a production environment, connections last much longer (slow, lossy TCP; client keepalives), so concurrency slots are held for much longer. I have often discussed scenarios with customers where their server runs at less than 10% utilization, yet they struggle to achieve 10% of the performance they measured in the lab.
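A rough way to see why this happens is Little's law: with a fixed number of slots, the throughput ceiling is the slot count divided by the time each connection holds a slot. The sketch below uses the 256-slot figure quoted above; the two hold times (10 ms in the lab, 2 s in production) are illustrative assumptions, not measurements.

```python
def max_throughput(slots: int, hold_seconds: float) -> float:
    """Upper bound on requests/second from Little's law (L = lambda * W)."""
    return slots / hold_seconds


SLOTS = 256            # concurrency limit quoted in the text
LAB_HOLD = 0.010       # assumed: 10 ms per connection on a fast local network
PROD_HOLD = 2.0        # assumed: 2 s including slow clients and idle keepalive

lab = max_throughput(SLOTS, LAB_HOLD)
prod = max_throughput(SLOTS, PROD_HOLD)

print(f"lab ceiling:        {lab:.0f} req/s")
print(f"production ceiling: {prod:.0f} req/s")
print(f"production / lab:   {prod / lab:.1%}")
```

Under these assumed numbers the production ceiling is a small fraction of the lab figure even though the CPU is mostly idle, because the slots are occupied by waiting rather than by work.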
The solution is to put a scalable proxy in front of the concurrency-limited server. The proxy offloads the client TCP connection, buffers the request data, uses connections to the server efficiently, buffers the response so that the concurrency slot is freed up quickly, and holds the lingering keepalive connection on the server's behalf. Stingray is a great solution; proxies like nginx do the job too, but the scheduling, connection limiting and load balancing control that Stingray gives you improves significantly on generic offload proxies.
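The effect of a buffering proxy can be illustrated with a small simulation. This is only a toy model, not how Stingray or nginx is implemented: the server's concurrency limit is modelled as a semaphore, and slow client I/O as a sleep. All constants and function names are hypothetical values chosen for illustration.

```python
import asyncio

SERVER_SLOTS = 4      # hypothetical concurrency limit on the back-end server
CLIENTS = 20          # hypothetical number of simultaneous slow clients
CLIENT_IO = 0.05      # assumed seconds a slow client needs to send or receive
SERVER_WORK = 0.005   # assumed seconds of real server work per request


async def direct(slots: asyncio.Semaphore) -> None:
    # No proxy: the slot is held during the slow client I/O as well.
    async with slots:
        await asyncio.sleep(CLIENT_IO)    # request trickles in
        await asyncio.sleep(SERVER_WORK)  # server generates the response
        await asyncio.sleep(CLIENT_IO)    # response trickles out


async def proxied(slots: asyncio.Semaphore) -> None:
    # Buffering proxy: slow client I/O happens outside the slot.
    await asyncio.sleep(CLIENT_IO)        # proxy buffers the full request
    async with slots:
        await asyncio.sleep(SERVER_WORK)  # fast exchange with the server
    await asyncio.sleep(CLIENT_IO)        # proxy drains the response


async def run(handler) -> float:
    """Run all clients through one handler and return elapsed seconds."""
    slots = asyncio.Semaphore(SERVER_SLOTS)
    loop = asyncio.get_running_loop()
    start = loop.time()
    await asyncio.gather(*(handler(slots) for _ in range(CLIENTS)))
    return loop.time() - start


def compare():
    """Return (direct_seconds, proxied_seconds) for the same workload."""
    return asyncio.run(run(direct)), asyncio.run(run(proxied))


if __name__ == "__main__":
    d, p = compare()
    print(f"direct: {d:.3f}s  proxied: {p:.3f}s")
```

In this model the proxied run finishes sooner because each slot is held only for the server's own work, so the same four slots serve all twenty clients with far less queueing.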