What is the correct procedure for upgrading the OS and application server on a two-node failover HA Pacemaker cluster?
I've scoured plenty of search engine results before coming here. I'm looking for the correct procedure to upgrade an Ubuntu server and an application server on a two-node failover HA Pacemaker/Corosync/DRBD cluster. I saw a procedure (but can't find it now) that said to upgrade the secondary node first, then promote it to primary (demoting the original primary to secondary), and then perform the upgrade on the former primary.
I am upgrading: Ubuntu 14.04 to 18.04, which will require the HA stack to be upgraded too, and Zimbra Collaboration Server (Network Edition) 8.7 to 9.x (the new owner, Synacor, offers no support for the HA stack; they only support Red Hat).
Please advise. Thank you in advance.
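For concreteness, here is a minimal sketch of the rolling approach described above, assuming pcs is the cluster shell (crm node standby/online is the crmsh equivalent); node1/node2 are placeholders for the current primary and secondary:

# 1. Take the secondary out of service and upgrade it first.
pcs node standby node2
# ... on node2: do-release-upgrade (one LTS at a time: 14.04 -> 16.04 -> 18.04),
#     then upgrade Pacemaker/Corosync/DRBD and the application ...
pcs node unstandby node2

# 2. Move the workload to the upgraded node; node1 becomes secondary.
pcs node standby node1

# 3. Upgrade node1 the same way, then bring it back into the cluster.
pcs node unstandby node1

Depending on how far apart the Corosync/Pacemaker/DRBD versions end up between 14.04 and 18.04, the two nodes may not be able to form a cluster together mid-upgrade, so plan for a window in which only one node carries the services.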
See also questions close to this topic
-
ColdFusion 2018 Standard 2 node cluster with J2EE shared sessions for failover
Why do we want this setup?
We would like to have a Blue/Green zero downtime setup for our CF2018 App.
We currently have a basic CF server install (IIS + CF2018) on one server, which connects to another server for the DB (we are using CF2018 Standard).
Our app uses J2EE sessions
There are posts that explain how to use the External Session Storage feature included in CF (Redis), but that won't work with J2EE sessions; the CF admin interface won't allow it.
How can I set up 2 servers in a cluster (behind a load balancer) with J2EE session failover functionality using CF2018 Standard Edition?
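Not true J2EE session replication, but for reference: if replication turns out not to be possible on Standard Edition, a sticky-session load balancer at least keeps each user pinned to one node, so sessions only die with that node. A sketch assuming nginx as the balancer; the host names are placeholders:

upstream cf_nodes {
    ip_hash;                           # pin each client IP to the same CF node
    server cf-node1.internal:80;
    server cf-node2.internal:80;
}
server {
    listen 80;
    location / {
        proxy_pass http://cf_nodes;    # requests keep hitting the node that owns the session
    }
}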
-
Cassandra cluster vs cassandra ring
If I have one Cassandra cluster set up across 5 data centers (3 private DCs and 2 public Azure DCs), can I say I have 5 rings, or is this 1 cluster and 1 ring?
Can someone help me understand the term "ring" in this context?
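For what it's worth, the cluster's own tooling treats this as a single cluster; a quick way to see how the nodes are grouped (commands only, output omitted):

nodetool describecluster   # prints the one cluster name shared by all nodes
nodetool status            # lists that cluster grouped into one "Datacenter: <name>" section per DC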
-
HBase fails to create table in Cloudera
I am a beginner in Hadoop. I am facing a problem when I try to create a simple table in HBase. These are the errors I get:
21/02/26 11:36:38 ERROR client.ConnectionManager$HConnectionImplementation: Can't get connection to ZooKeeper: KeeperErrorCode = ConnectionLoss for /hbase
21/02/26 11:36:56 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 4 attempts
21/02/26 11:36:56 ERROR zookeeper.ZooKeeperWatcher: hconnection-0x4844cdb60x0, quorum=quickstart.cloudera:2181, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
-
How to have highly available Moodle in Kubernetes?
I want to set up a highly available Moodle in K8s (on-prem). I'm using Bitnami Moodle with Helm charts.
After a successful Moodle installation, it works. But when a K8s node goes down, the Moodle web page reverts/redirects to the Moodle installation page, like a loop.
Persistent storage is rook-ceph. The Moodle PVC is ReadWriteMany, while the MySQL one is ReadWriteOnce.
The following command was used to deploy Moodle.
helm install moodle \
  --set global.storageClass=rook-cephfs,replicaCount=3,persistence.accessMode=ReadWriteMany,allowEmptyPassword=false,moodlePassword=Moodle123,mariadb.architecture=replication \
  bitnami/moodle
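As a quick sanity check (not part of the chart invocation itself), it may be worth confirming what actually got bound and where the replicas landed; plain kubectl in the release's namespace, no resource names assumed:

kubectl get pvc            # ACCESS MODES for the Moodle data PVC should show RWX, the database one RWO
kubectl get pods -o wide   # confirm the 3 Moodle replicas are spread across different nodes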
Any help on this is appreciated.
Thanks.
-
High-Availability not working in Hadoop cluster
I am trying to move my non-HA namenode to HA. After setting up all the configurations for the JournalNodes by following the Apache Hadoop documentation, I was able to bring the namenodes up. However, the namenodes crash immediately and throw the following error.
ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode. java.io.IOException: There appears to be a gap in the edit log. We expected txid 43891997, but got txid 45321534.
I tried to recover the edit logs, initialize the shared edits, etc., but nothing works. I am not sure how to fix this problem without formatting the namenode, since I do not want to lose any data.
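For reference, the commands usually behind the approaches mentioned above (back up dfs.namenode.name.dir and the JournalNode directories before trying any of them):

hdfs namenode -recover                 # interactive recovery over a damaged or gapped edit log
hdfs namenode -initializeSharedEdits   # re-seed the JournalNodes from the local namenode's edits
hdfs namenode -bootstrapStandby        # on the other namenode: copy the namespace from the active one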
Any help is greatly appreciated. Thanks in advance.
-
Apache Kafka Consume from Slave/ISR node
I understand the concept of master/slave and data replication in Kafka, but I don't understand why consumers and producers are always routed to the master node when writing to or reading from a partition, instead of being able to read from any ISR (in-sync replica)/slave.
The way I think about it, if all consumers are directed to one single master node, then more hardware is required to handle read/write operations from large consumer groups/producers.
Is it possible to read and write on slave nodes, or will consumers/producers always reach out to the master node of that partition?
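For what it's worth, since Kafka 2.4 (KIP-392) consumers can fetch from a follower replica when the brokers expose rack information and a rack-aware selector; producers still always write to the partition leader. A minimal sketch, with the rack names as placeholders:

# broker: server.properties
broker.rack=dc-a
replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector

# consumer configuration
client.rack=dc-a   # the consumer is served by an in-sync replica in its own "rack"/DC when one exists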
-
Pacemaker Postgresql Master-Master State
I am using Pacemaker with CentOS 8 and PostgreSQL 12. Failover between master/slave states runs successfully. But if all nodes become masters, Pacemaker can't repair this unless I send the command 'pcs resource cleanup', whereas I have set 60s in the resource config. How can I fix it?
State video: https://kerteriz.net/depo/Pacemaker-Fail-1.m4v
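A sketch of the two knobs usually involved here, assuming pcs and a placeholder resource name pgsql-ha (note the command is 'pcs resource cleanup', singular):

pcs resource meta pgsql-ha failure-timeout=60s   # let the cluster expire recorded failures on its own
pcs resource cleanup pgsql-ha                    # manual, one-off clearing of the failed state

A genuine multi-master state usually also needs working fencing so the cluster can safely demote one side on its own.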
-
Corosync/Pacemaker: kill node on resource timeout
Is there any option to kill the node when a resource times out on the stop action (or, alternatively, on failure of any action)? I do not have any fencing/STONITH device.
Thanks in advance
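A hedged sketch of the usual answer: Pacemaker can escalate a failed stop to fencing (on-fail=fence), but that only has teeth if some form of fencing exists; without a fence device, watchdog-only SBD can make the node kill itself. The resource name and timings below are placeholders:

pcs resource update my_res op stop timeout=60s on-fail=fence   # a stop failure/timeout now triggers fencing

# watchdog-only (diskless) SBD as the fencing mechanism; needs a watchdog module (e.g. softdog) on every node
pcs stonith sbd enable
pcs property set stonith-watchdog-timeout=10s
pcs property set stonith-enabled=true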
-
repmgr managed by pacemaker
I'm looking for a solution for how to connect repmgr to Pacemaker on Red Hat. There are applications configured in Pacemaker that send data to Postgres managed by repmgr. When an issue occurs, Pacemaker switches the virtual IP (and the applications) over to the standby server, but repmgr does not switch the DB to the standby node. And if there is an issue on the repmgr side, repmgr should also be able to call Pacemaker to switch everything over to the standby node. Is there a way to manage repmgr with Pacemaker?
Thanks
-
Failed to install DRBD9 on Debian 9
I need to upgrade DRBD 8 to DRBD 9. For that I am following this documentation:
https://www.linbit.com/drbd-user-guide/drbd-guide-9_0-en/#s-upgrading-drbd
Step 1
root@oreo:~# add-apt-repository ppa:linbit/linbit-drbd9-stack
 This PPA contains DRBD9, drbd-utils, LINSTOR (client, python API, server).
 This differs from official, production grade LINBIT repositories in several ways, including:
 - We push RCs immediately to the PPA
 - We don't push hotfixes, these usually have to wait until the next RC/release
 - We only keep 2 LTS versions up to date (Bionic and Focal, but not Xenial)
 For support and access to official repositories see: https://www.linbit.com or write an email to: sales AT linbit.com
 More info: https://launchpad.net/~linbit/+archive/ubuntu/linbit-drbd9-stack
Press [ENTER] to continue or ctrl-c to cancel adding it
gpg: keybox '/tmp/tmp68jovxd3/pubring.gpg' created
gpg: keyserver receive failed: No keyserver available
Step 2: Next, add the DRBD signing key to your trusted keys:
wget -O- http://packages.linbit.com/gpg-pubkey-53B3B037-282B6E23.asc | apt-key add -
Step 3: Lastly, perform an apt update so Debian recognizes the updated repository:
apt update
I got an error like this:
apt-get update
Ign:1 http://ftp.debian.org/debian stretch InRelease
Hit:2 http://security.debian.org stretch/updates InRelease
Hit:3 http://download.virtualbox.org/virtualbox/debian stretch InRelease
Hit:4 http://apt.postgresql.org/pub/repos/apt stretch-pgdg InRelease
Hit:5 http://ftp.debian.org/debian stretch-updates InRelease
Ign:6 http://ppa.launchpad.net/linbit/linbit-drbd9-stack/ubuntu hirsute InRelease
Hit:7 http://ftp.debian.org/debian stretch-backports InRelease
Hit:8 http://ftp.debian.org/debian stretch Release
Ign:9 http://ppa.launchpad.net/linbit/linbit-drbd9-stack/ubuntu hirsute Release
Ign:10 http://ppa.launchpad.net/linbit/linbit-drbd9-stack/ubuntu hirsute/main amd64 Packages
Ign:11 http://ppa.launchpad.net/linbit/linbit-drbd9-stack/ubuntu hirsute/main all Packages
Ign:12 http://ppa.launchpad.net/linbit/linbit-drbd9-stack/ubuntu hirsute/main Translation-en
Err:10 http://ppa.launchpad.net/linbit/linbit-drbd9-stack/ubuntu hirsute/main amd64 Packages
  404 Not Found
Ign:11 http://ppa.launchpad.net/linbit/linbit-drbd9-stack/ubuntu hirsute/main all Packages
Ign:12 http://ppa.launchpad.net/linbit/linbit-drbd9-stack/ubuntu hirsute/main Translation-en
Reading package lists... Done
W: The repository 'http://ppa.launchpad.net/linbit/linbit-drbd9-stack/ubuntu hirsute Release' does not have a Release file.
N: Data from such a repository can't be authenticated and is therefore potentially dangerous to use.
N: See apt-secure(8) manpage for repository creation and user configuration details.
E: Failed to fetch http://ppa.launchpad.net/linbit/linbit-drbd9-stack/ubuntu/dists/hirsute/main/binary-amd64/Packages  404 Not Found
E: Some index files failed to download. They have been ignored, or old ones used instead.
please help me
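The 404s above happen because ppa:linbit/linbit-drbd9-stack is an Ubuntu PPA (built for releases like Bionic and Focal), so it has no packages for Debian stretch. A sketch of backing that change out, assuming the PPA is the only LINBIT source that was added:

add-apt-repository --remove ppa:linbit/linbit-drbd9-stack   # or delete the matching file under /etc/apt/sources.list.d/
apt update

For Debian, DRBD 9 packages generally come from LINBIT's own customer repositories or from building the module and utilities from source, not from this PPA.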
-
Set up DRBD with 3 servers (master/slave)
I'm currently working on a Pacemaker DRBD setup. I have 3 Debian 9 servers and I want to synchronize a drive across all of them. Unfortunately I get an error while setting DRBD up on my servers.
drbd configuration
global {
    usage-count no;
}
common {
    protocol C;
}
resource r0 {
    on oreo {
        device    /dev/drbd0;
        disk      /dev/sda3;
        address   10.xxx.xxx.205:7788;
        meta-disk internal;
    }
    on cupcone {
        device    /dev/drbd0;
        disk      /dev/sda3;
        address   10.xxx.xxx.206:7788;
        meta-disk internal;
    }
    on twinkie {
        device    /dev/drbd0;
        disk      /dev/sda3;
        address   10.xxx.xxx.207:7788;
        meta-disk internal;
    }
}
Steps I followed:
root@oreo:~# sudo modprobe drbd
root@oreo:~# sudo drbdadm create-md r0
You want me to create a v08 style flexible-size internal meta data block.
There appears to be a v08 flexible-size internal meta data block
already in place on /dev/sda3 at byte offset 6291453898752
Do you really want to overwrite the existing meta-data?
[need to type 'yes' to confirm] yes
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
Error I get
root@oreo:~# sudo drbdadm up r0
/etc/drbd.conf:3: in resource r0:
    There are multiple host sections for the peer node.
    Use the --peer option to select which peer section to use.
resource r0: cannot configure network without knowing my peer.
I think the error is in the config file, but I don't know how to fix it.
please help me
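The "multiple host sections for the peer node" error is what drbd-utils prints when a resource lists more than two hosts without describing how they connect. With DRBD 9 a three-node resource is normally written with per-host node-id values plus a connection-mesh section; a hedged sketch reusing the values from the config above (the xxx parts are kept as in the original):

resource r0 {
    device    /dev/drbd0;
    disk      /dev/sda3;
    meta-disk internal;

    on oreo    { node-id 0; address 10.xxx.xxx.205:7788; }
    on cupcone { node-id 1; address 10.xxx.xxx.206:7788; }
    on twinkie { node-id 2; address 10.xxx.xxx.207:7788; }

    connection-mesh { hosts oreo cupcone twinkie; }
}

Note that this only works with the DRBD 9 module and utilities; the DRBD 8.4 module shipped with Debian 9 supports only two peers per resource (three-node setups there require stacking).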
-
Stacked DRBD (DR Node) not Syncing Back to Primary Site
We have the following situation here:
We have 3 nodes with a DRBD setup. 2 nodes are part of a Pacemaker cluster and 1 node is a DR node. As part of DR testing we shut down the 2 Pacemaker nodes and mounted the DRBD drive on the DR node. We could see all the data that was on the primary site, and we also wrote new data on the DR node. However, after a few days we brought one node on the primary site back up, but the data written on the DR node was not replicated back to the primary node.
Is replication back to the primary site automatic, or do we need to follow any additional steps?
We are using a SLES 12.4 trial version for this POC.
-
How can I prevent resources from moving after recovery?
I have a two-node Pacemaker cluster running on CentOS 8. I set resource-stickiness to INFINITY. When I reboot the node that hosts my resources, all resources migrate to the second node as expected, but when node one comes back online we get a multi-active resources problem: the cluster detects that my resources are active on both nodes, then stops all resources on the second node and keeps them active on the first one. My expectation is that when the first node comes back online, the resources do not migrate back to it but stay active on the second node.
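A multi-active state right after a reboot usually means the services are also enabled in systemd and start at boot outside Pacemaker's control. A sketch, assuming pcs; the service name httpd is only a placeholder for whatever the cluster resources manage:

# on every node: let only Pacemaker start the managed services
systemctl disable httpd                              # placeholder for the real service(s) behind the resources

# optional: confirm stickiness is in place as a cluster-wide default
pcs resource defaults resource-stickiness=INFINITY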
-
Pacemaker configuration for xCAT
I'm looking to set up an xCAT HA solution using this guide: https://xcat-docs.readthedocs.io/en/stable/advanced/hamn/setup_ha_mgmt_node_with_drbd_pacemaker_corosync.html
Unfortunately this guide is fairly old and outdated. I've got most of it working, but I'm having issues with the pcs commands that should sort out the ordering, such as:
pcs -f xcat_cfg constraint order list ip_xCAT db_xCAT
There are several of these lines in the config, but this command does not seem to be valid. Is anyone able to advise what this/these line(s) should be?
TIA Pete
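In case it helps: the old 'constraint order list' form used by that guide is no longer accepted by current pcs. The same ordering can usually be expressed as an ordered resource set or as pairwise constraints, keeping the guide's resource names:

pcs -f xcat_cfg constraint order set ip_xCAT db_xCAT
# or, pairwise:
pcs -f xcat_cfg constraint order start ip_xCAT then start db_xCAT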