PRTG Manual: Clustering
A PRTG Cluster consists of two or more installations of PRTG that work together to form a high-availability monitoring system. The objective is to reach true 100% uptime for the monitoring tool. With clustering, uptime is no longer degraded by an internet outage at a PRTG server's location, by failing hardware, or by downtime caused by a software update for the operating system or for PRTG itself.
How a PRTG Cluster Works
A PRTG cluster consists of one Primary Master Node and one or more Failover Nodes. Each node is a full installation of PRTG that could perform all monitoring and alerting on its own. Nodes are connected to each other using two TCP/IP connections. They communicate in both directions, and a single node only needs to connect to one other node to integrate into the cluster.
During normal operation, the Primary Master is used to configure devices and sensors (using the web interface or Enterprise Console). The master automatically distributes the configuration to all other nodes in real time. All nodes permanently monitor the network according to this common configuration, and each node stores its results in its own database. This way, the storage of monitoring results is also distributed across the cluster. The downside of this concept is that monitoring traffic and load on the network are multiplied by the number of cluster nodes, but this will not be a problem for most usage scenarios. The user can review the monitoring results by logging in to the web interface of any cluster node in read-only mode. Because the monitoring configuration is centrally managed, however, it can only be changed on the master node.
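The behavior described above can be illustrated with a small model. This is only a sketch for reasoning about the concept, not PRTG code: the Node and Cluster classes, their method names, and the probe callback are all hypothetical. It shows a configuration change being accepted on the master only, copied to every node, and then checked by every node independently, so the number of checks grows with the number of nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one PRTG installation in the cluster."""
    name: str
    is_master: bool = False
    config: dict = field(default_factory=dict)   # sensor name -> target
    results: list = field(default_factory=list)  # this node's own result store


class Cluster:
    """Toy model: one shared configuration, one result store per node."""

    def __init__(self, nodes):
        self.nodes = {node.name: node for node in nodes}

    def change_config(self, node_name, sensor, target):
        """Accept a change only on the master, then copy it to every node."""
        node = self.nodes[node_name]
        if not node.is_master:
            raise PermissionError(
                f"{node_name} is read-only; change the configuration on the master"
            )
        for member in self.nodes.values():
            member.config[sensor] = target

    def scan_once(self, probe):
        """Every node checks every sensor and stores the result locally.

        Returns the total number of checks performed in this scan.
        """
        checks = 0
        for member in self.nodes.values():
            for sensor, target in member.config.items():
                member.results.append((sensor, probe(target)))
                checks += 1
        return checks


cluster = Cluster([Node("master", is_master=True), Node("failover-1"), Node("failover-2")])
cluster.change_config("master", "ping", "10.0.0.1")
total_checks = cluster.scan_once(lambda target: "up")  # 1 sensor x 3 nodes
```

With one sensor and three nodes, the scan performs three checks against the same target, which is the multiplied load mentioned above.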
By default, all devices created on the Cluster Probe are monitored by all nodes in the cluster, so data from different perspectives is available and monitoring for these devices always continues, even if one of the nodes fails. If the Primary Master fails, one of the Failover Nodes takes over the master role and controls the cluster until the master node is back. This ensures fail-safe monitoring with gapless data for the cluster as a whole. Note: While a node is down, it cannot collect monitoring data, so the data of this single node will show gaps. However, monitoring data for this time span is still available on the other node(s). There is no functionality to fill these gaps with data from other nodes.
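The following sketch models the failover and the resulting data gaps. It is not PRTG code, and all function and node names are hypothetical. The text above does not say which Failover Node takes over, so the sketch assumes the first reachable node in a fixed order, with the Primary Master listed first.

```python
def current_master(node_order, reachable):
    """Return the first reachable node; the primary is listed first.

    Assumption: takeover follows a fixed order. The manual only states
    that one of the Failover Nodes takes over.
    """
    for name in node_order:
        if name in reachable:
            return name
    return None


def record(stores, reachable, timestamp, value):
    """Only nodes that are up store a result for this interval."""
    for name, store in stores.items():
        if name in reachable:
            store[timestamp] = value


def gaps(store, timestamps):
    """Intervals for which a node has no data of its own."""
    return [t for t in timestamps if t not in store]


node_order = ["primary", "failover-1"]
stores = {name: {} for name in node_order}

# The primary is down during interval 2 and returns in interval 3.
availability = {
    1: {"primary", "failover-1"},
    2: {"failover-1"},
    3: {"primary", "failover-1"},
}

masters = {}
for timestamp, reachable in availability.items():
    masters[timestamp] = current_master(node_order, reachable)
    record(stores, reachable, timestamp, "up")

primary_gaps = gaps(stores["primary"], [1, 2, 3])
failover_gaps = gaps(stores["failover-1"], [1, 2, 3])
```

In this run the failover node holds the master role during interval 2, the primary's own store is missing that interval, and the failover node's store is complete. Nothing copies the missing interval back to the primary, matching the note above.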
If downtimes or threshold breaches are discovered by one or more nodes, only one installation, either the Primary Master or the Failover Master, will send out notifications (via email, SMS text message, etc.). Thus, the administrator will not be flooded with notifications from all cluster nodes in case of failures.
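As a rough illustration of this rule, the sketch below merges what each node has detected and lets only the current master send, once per failing sensor. The function name and data shapes are hypothetical and do not reflect PRTG's internal logic. In particular, it assumes that a failure seen by any single node is enough to trigger a notification.

```python
def plan_notifications(detections, master):
    """Return (sender, sensor) pairs: one per failing sensor, all from the master.

    detections maps a node name to the set of sensors that node sees failing.
    Assumption: a failure seen by at least one node triggers a notification.
    """
    failing = set().union(*detections.values())
    return [(master, sensor) for sensor in sorted(failing)]


detections = {
    "primary": {"ping"},
    "failover-1": {"ping", "http"},
}
planned = plan_notifications(detections, master="primary")
```

Although "ping" was detected by both nodes, it appears only once in the plan, and every entry is sent by the same installation.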
For detailed information, please see Failover Cluster Configuration.
Knowledge Base: What's the Clustering Feature in PRTG?