I'm troubleshooting an issue with load balancing connections from the NC client to the Juniper VPN servers. First off, I don't have access to the servers, so I can't provide versions, cluster details, or similar information. I do know that I have two issues with load balancing.
The first is with load balancing between two VPN servers in geographically different locations. The admins use DNS round robin to direct users between the two servers. I know there is a knowledge base article about this (KB14355), but it does not seem to match what I am seeing. Users get a session timeout error when building the VPN connection. I have seen this happen when the user receives a different IP while building the VPN connection. The connection does not bounce (connect and reconnect) as the KB article states.
What is it about the Windows Vista NC client that causes round-robin DNS to fail? Why doesn't the Vista DNS cache keep the first IP it receives from the server? The VPN session builds multiple TCP connections with the server; what are they all for? Will clustering the servers solve the problem? What other load balancing techniques are there?
The other issue causes the same symptom. When the user tries to start a VPN session, it errors out because the session times out. In this case the load balancing of the proxy servers causes the connection to switch proxy servers, and the session then times out. Any comments or solutions would be helpful.
Are you sure the issue is the DNS round-robin?
Have you attempted to log on to each SA using either its IP address or a DNS name which points only to it? Do these work as expected?
If so, I'd do a packet trace from the PC to see if the PC is doing a DNS query at the time the NC session is starting. If this always occurs, and if the address always changes from one SA to the other, I think you'll be certain of what is going on.
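Before setting up a full packet trace, a quick check is to resolve the name repeatedly from the affected PC and see whether the first address returned alternates. The following is a minimal sketch, not a replacement for the trace: the function name `sample_resolution` and the `resolve` hook are hypothetical, port 443 is an assumption, and it goes through the operating system resolver, so whatever the NC client does internally may differ.

```python
import socket
import time
from collections import Counter


def sample_resolution(hostname, attempts=10, delay=1.0, resolve=None):
    """Resolve `hostname` several times and count which IPv4 address is listed first.

    `resolve` is an optional callable (hostname -> list of IP strings); it
    exists only so the logic can be exercised without a network.
    """
    if resolve is None:
        def resolve(name):
            # getaddrinfo uses the OS resolver, so any local DNS cache applies.
            # Port 443 is an assumption (HTTPS to the SA).
            infos = socket.getaddrinfo(name, 443, socket.AF_INET, socket.SOCK_STREAM)
            return [info[4][0] for info in infos]

    first_seen = Counter()
    for attempt in range(attempts):
        addresses = resolve(hostname)
        if addresses:
            # A client that simply takes the first answer would connect here.
            first_seen[addresses[0]] += 1
        if attempt < attempts - 1:
            time.sleep(delay)
    return first_seen
```

If the counter returned for the VPN host name shows more than one address, the PC really is being handed different SAs between lookups, which is consistent with round robin being the trigger.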
In your last paragraph, I don't understand what you mean by "proxy servers". If the source of the traffic coming into the SA changes (because the PC switches which proxy server it is talking to), you may want to ensure that roaming is configured for the role in question.
I'm sure that the issue is caused by the DNS round-robin; I just don't know why. If I use the IP address instead of the URL, I don't have the problem.
Using netstat, I can view the TCP connections while the client is setting up the VPN tunnel. Multiple connections are made, and if one of them goes to the other SA due to load balancing, the connections are dropped and the client displays the error message that the "session has timed out".
In the last paragraph I attempted to describe my other issue, where web traffic is load balanced between multiple different proxy servers. As above, if any of the connections made while building the VPN tunnel switches between proxy servers, the connections are closed and the session times out. I have worked around this by having users specify the IP of a proxy server in the browser settings.
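The netstat check described above can be sketched in a few lines. This is only an illustration under assumptions: it expects the column layout of Windows `netstat -n` (Proto, Local Address, Foreign Address, State), it assumes the tunnel setup connections go to TCP port 443, and the function names are hypothetical.

```python
from collections import defaultdict


def remote_endpoints(netstat_text, port=443):
    """Group established TCP connections to `port` by remote IP address."""
    by_remote = defaultdict(list)
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected row shape: TCP  local_ip:port  remote_ip:port  STATE
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        remote_ip, _, remote_port = remote.rpartition(":")
        if remote_port == str(port) and state == "ESTABLISHED":
            by_remote[remote_ip].append(local)
    return dict(by_remote)


def spans_multiple_servers(netstat_text, port=443):
    """True if connections to `port` are spread over more than one remote IP."""
    return len(remote_endpoints(netstat_text, port)) > 1
```

Feeding it the saved output of `netstat -n` taken during tunnel setup shows at a glance whether the connections were split across more than one remote address.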
In both cases the client seems to be querying DNS for a proxy and VPN server multiple times and getting different IP addresses. I have checked and DNS caching service is running on the PC.
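For comparison, the behaviour I expected from the cache is roughly "resolve once, then reuse that address for every later connection". A minimal sketch of that idea follows; the class `PinnedResolver` and its `resolve` hook are hypothetical and say nothing about how the NC client or the Windows DNS cache actually work.

```python
import socket


class PinnedResolver:
    """Remember the first address returned for a host name and keep returning it."""

    def __init__(self, resolve=None):
        # `resolve` (hostname -> IP string) can be swapped out for testing.
        self._resolve = resolve or self._system_resolve
        self._pinned = {}

    @staticmethod
    def _system_resolve(hostname):
        # Take the first IPv4 address the OS resolver returns.
        infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
        return infos[0][4][0]

    def lookup(self, hostname):
        key = hostname.lower()
        if key not in self._pinned:
            self._pinned[key] = self._resolve(hostname)
        return self._pinned[key]
```

With pinning like this, every connection in one session would land on the same SA or proxy even if DNS keeps rotating its answers, which is effectively what entering the IP address by hand achieves.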
I will attempt to set up a packet capture to figure out what is going on.
Can you confirm this is a supported solution? I was led to believe (it's been a few years...) that geographically dispersed active/active clusters would not perform well, so we run two standalone systems with global server load balancing (based on time-to-closest-server). But I would love to run a cluster because it would make administration so much easier.
We've run geographically dispersed, active/active SA-6000 clusters for several years with minimal problems. We've used both DNS round-robin and GSLB load balancing over the years. I think clustering is key to getting this to work successfully, and it makes management about 10 times easier.