Siemplify supports several deployment modes that use high-availability (HA) clusters to keep its services available.
The Siemplify HA mechanism has three layers:
- Application HA cluster
- Database HA cluster
- Disaster Recovery (DR)
Disaster Recovery is required to provide system availability when the main organization site is not available. This option should only be used in extreme cases when all HA protections on the primary site have failed or are inaccessible.
Switching the system to DR mode or back to Primary mode is not an automatic process and requires manual actions.
To restore Siemplify from DR:
- Connect to the primary database server via SSH as the root user.
- Once connected, switch to the postgres user by running the command:
"su - postgres"
- Check the status of the cluster by running the command:
"repmgr cluster show"
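| ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string |
|----|------|------|--------|----------|----------|----------|----------|-------------------|
| 1 | node1 | primary | ? unreachable | | default | 100 | | host=db_node1 dbname=repmgr user=repmgr |
| 2 | node2 | standby | ? unreachable | ? node1 | default | 100 | 2 | host=db_node2 dbname=repmgr user=repmgr |
| 3 | node3 | standby | running | ? node1 | default | 100 | 1 | host=db_node3 dbname=repmgr user=repmgr |
| 4 | node4 | standby | running | ? node1 | default | 100 | | host=db_node4 dbname=repmgr user=repmgr |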
- Promote an active standby server to primary with the following command:
"repmgr standby promote --siblings-follow --force"
- After the standby server is successfully promoted, remove the old primary node from the cluster (ID 1 in the example) with the following command: "repmgr primary unregister --node-id 1 --force"
- In the example above, the server with ID 2 has also failed. Remove it with the following command: "repmgr standby unregister --node-id 2"
- Exit from the postgres user back to root and check that repmgrd is running with the command: "systemctl status repmgr10"
- If the repmgrd service is not running, restart it with the command: "systemctl restart repmgr10"
- After completing the above steps, switch back to the postgres user, run "repmgr cluster show", and check that the following status is displayed:
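| ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string |
|----|------|------|--------|----------|----------|----------|----------|-------------------|
| 3 | node3 | primary | running | | default | 100 | 3 | host=db_node3 dbname=repmgr user=repmgr |
| 4 | node4 | standby | running | node3 | default | 100 | | host=db_node4 dbname=repmgr user=repmgr |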
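The clean-up part of the procedure above (unregistering nodes that remain unreachable after the promotion) can be sketched as a small shell helper. This is a minimal sketch and not part of Siemplify or repmgr: the function name plan_unregister is hypothetical, and it assumes that "repmgr cluster show" prints its usual pipe-separated table with the node ID in the first column, the role in the third, and the status in the fourth. It only prints the commands for review and executes nothing.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Siemplify or repmgr).
# Reads `repmgr cluster show` output on stdin and prints the unregister
# command for every node whose Status column contains "unreachable".
# Assumption: columns are separated by "|" in the order
# ID | Name | Role | Status | ...
plan_unregister() {
  awk -F'|' '
    # Data rows start with a numeric node ID; header and separator rows do not.
    $1 ~ /^[ \t]*[0-9]+[ \t]*$/ && $4 ~ /unreachable/ {
      id = $1; role = $3
      gsub(/[ \t]/, "", id)
      gsub(/[ \t]/, "", role)
      if (role == "primary")
        printf "repmgr primary unregister --node-id %s --force\n", id
      else
        printf "repmgr standby unregister --node-id %s\n", id
    }'
}
```

Possible usage, as the postgres user after the promotion has succeeded: run "repmgr cluster show | plan_unregister", review the printed commands, and then run them manually.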
To restore the Siemplify application on DR:
Verify that the service configuration files point to the correct database IP address or hostname:
- Connect to the Siemplify application server via SSH as root and run the command: "cat /etc/systemd/system.conf"
- At the very end of the file, locate the DB_IP variable, for example DB_IP="node3". This variable should point to the new primary DB. When HA is in use, check this variable in all Siemplify service files in the /etc/systemd/system/ folder.
- If you edited DB_IP in /etc/systemd/system.conf, reboot the server so that the system rereads this configuration file. If you edited the variable in the service files, run "systemctl daemon-reload" to reload systemd and apply the changes before starting Siemplify.
- After the variable is set correctly, enable and start the Siemplify services with the following command (this applies only to a single application server): "cd /etc/systemd/system && systemctl enable --now Siemplify.* nginx"
- If HA is running on the application servers, run the following command on any Siemplify application server: "systemctl enable --now pcsd pacemaker corosync". After Pacemaker finishes starting up, run the next command: "pcs cluster start --all"
- Check the status of the application server cluster by running the following command as root on any application server:
"pcs status"
If Siemplify does not start, verify that the DB_IP variable is set properly.
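The DB_IP verification can be sketched as a shell helper that compares every DB_IP assignment in a set of files against the expected primary DB. This is a minimal sketch under stated assumptions: the function name check_db_ip is hypothetical, and it assumes that each relevant file contains the text DB_IP= followed by the value, optionally in double quotes. The exact layout of the Siemplify service files is not described here, so adjust the pattern if they differ.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Siemplify).
# Usage: check_db_ip EXPECTED_VALUE FILE...
# Exit status: 0 if every DB_IP value found equals EXPECTED_VALUE,
#              1 if at least one value differs (each is printed),
#              2 if no DB_IP assignment was found at all.
check_db_ip() {
  expected="$1"; shift
  # -H prefixes each match with its file name; -s hides errors for missing files.
  grep -Hs 'DB_IP=' "$@" | awk -v want="$expected" '
    {
      value = $0
      sub(/.*DB_IP=/, "", value)      # keep only what follows DB_IP=
      gsub(/"/, "", value)            # drop surrounding double quotes
      gsub(/[ \t\r]+$/, "", value)    # drop trailing whitespace
      if (value != want) { print "MISMATCH " $0; bad = 1 }
    }
    END { if (NR == 0) exit 2; exit bad }'
}
```

Possible usage as root, with node3 as the new primary DB from the example: "check_db_ip node3 /etc/systemd/system.conf /etc/systemd/system/Siemplify.*". Any line reported as MISMATCH still points to a different database host.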