Current setup:
3 SA6500 appliances in active/active cluster located in 3 separate states, all locations connected by circuits
7.2R4 (build 21697)
Pulse 3.0 (build 22.214.171.12405)
I am seeing large numbers of Pulse users drop at the same time, seemingly at random, and have to reconnect. This seems to happen on 2 of our 3 appliances. One attached screenshot shows the graph with the big drop, and the other shows users reconnecting at the same time. The event logs show nothing. Any ideas? I do see the following in the release notes:
7.2 R7 -pulse-connmgr- If the SA is heavily loaded, Pulse users randomly get disconnected. (831242)
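To pin down exactly when the mass reconnects happen, one option is to bucket login timestamps per minute and flag the minutes with an unusually high count. This is only a rough Python sketch: it assumes the login times have already been exported from the user access log into a list of `datetime` objects, and the function name `find_reconnect_spikes` and the threshold of 20 are made up for illustration, not part of any SA tooling.

```python
from collections import Counter
from datetime import datetime


def find_reconnect_spikes(login_times, threshold=20):
    """Return (minute, count) pairs for minutes with at least `threshold` logins.

    login_times: iterable of datetime objects (assumed to be exported
    from the user access log by some other means).
    threshold:   hypothetical cutoff; tune it to the normal login rate.
    """
    # Truncate each timestamp to the minute so logins group together.
    per_minute = Counter(
        t.replace(second=0, microsecond=0) for t in login_times
    )
    # Keep only the minutes that look like a mass reconnect, oldest first.
    return sorted((m, c) for m, c in per_minute.items() if c >= threshold)
```

The resulting list of minutes can then be compared against the graph and against anything else that was happening on the network at those times.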
What is the latency between the nodes?
What are the Synchronization Settings for the cluster?
System --> Clustering --> Properties
Which of these are enabled?
Synchronize log messages
Synchronize user sessions
Synchronize last access time for user sessions
Latency: 28 ms average
Enabled: Synchronize user sessions and Synchronize last access time for user sessions
Not sure if this is related, but another weird symptom is that if we lose circuit connectivity between the nodes, it drops people from all 3 nodes. I don't think that is supposed to happen; the SA devices should work independently of each other.
Please disable "Synchronize user sessions" and "Synchronize last access time for user sessions". In an A/A cluster we do not need these. Test and update us.
Have you been able to correlate any other connectivity issues with the time frame in which your users experience problems? This may be a long shot, but what about the firewalls at each of the 2 or 3 locations you mentioned? Having dealt with Check Point firewalls for some time, I have seen scenarios where they occasionally drop traffic if you push policy to them while they are under heavy load. If multiple firewalls are managed using the same policy package, that could account for simultaneous drops at 2 locations. Again, this is a bit of a stretch, but something to consider.