User-defined node states in corosync cluster in SLES
At the moment we only see a bare minimum of states for the nodes in the cluster, such as online, offline, or under maintenance. Is there a way to have user-defined states for each node, which any other node in the cluster can change or manage? Can it be done by adding some specific resource? We have a requirement where we run the nodes in the cluster as a state machine and want to manage that state machine through corosync. All nodes should be able to read the other nodes' states and change them based on some logic. Any help is appreciated! Thanks
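The transition logic described above can be sketched independently of where the state is stored. This is only a minimal illustration: the state names (standby, active, draining) and the helpers is_valid_transition and next_state are hypothetical, and the sketch does not talk to corosync or Pacemaker at all. How the per-node state is stored and shared across the cluster is left open.

```shell
# Hypothetical state machine: standby -> active -> draining -> standby.
# Returns 0 when moving from state $1 to state $2 is allowed, 1 otherwise.
is_valid_transition() {
    case "$1:$2" in
        standby:active|active:draining|draining:standby) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the state to record for a node: the requested state ($2) when the
# transition is allowed, otherwise the unchanged current state ($1).
next_state() {
    if is_valid_transition "$1" "$2"; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "$1"
    fi
}

next_state standby active
next_state standby draining
```

Keeping the transition check in one function means every node applies the same rules before it changes its own state or another node's state.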
See also questions close to this topic
ColdFusion 2018 Standard 2 node cluster with J2EE shared sessions for failover
Why do we want to configure this setup?
We would like to have a Blue/Green zero downtime setup for our CF2018 App.
We currently have a basic CF Server Install (IIS + CF2018) in one server that connects to another Server for the DB (we are using CF2018 Standard).
Our app uses J2EE sessions.
There are posts that explain how to use the External Session Storage feature included in CF (Redis), but that won't work with J2EE sessions; the CF admin interface won't allow it.
How can I set up 2 servers in a cluster (behind a load balancer) with J2EE session failover functionality using CF2018 Standard Edition?
Cassandra cluster vs cassandra ring
If I have one Cassandra cluster set up across 5 data centers (3 private DCs and 2 public Azure DCs), can I say I have 5 rings, or is this 1 cluster and 1 ring?
Can someone help me understand the term "ring" in this context?
HBase fails to create table in Cloudera
I am a beginner in Hadoop. I am facing a problem when I try to create a simple table in HBase. These are the errors:
21/02/26 11:36:38 ERROR client.ConnectionManager$HConnectionImplementation: Can't get connection to ZooKeeper: KeeperErrorCode = ConnectionLoss for /hbase
21/02/26 11:36:56 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 4 attempts
21/02/26 11:36:56 ERROR zookeeper.ZooKeeperWatcher: hconnection-0x4844cdb60x0, quorum=quickstart.cloudera:2181, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
How to have highly available Moodle in Kubernetes?
I want to set up highly available Moodle in K8s (on-prem). I'm using Bitnami Moodle with Helm charts.
After a successful Moodle installation, it works. But when a K8s node goes down, the Moodle web page reverts (redirects) to the Moodle installation page. It's like a loop.
Persistent storage is rook-ceph. The Moodle PVC is ReadWriteMany, whereas the MySQL PVC is ReadWriteOnce.
The following command was used to deploy Moodle.
helm install moodle --set global.storageClass=rook-cephfs,replicaCount=3,persistence.accessMode=ReadWriteMany,allowEmptyPassword=false,moodlePassword=Moodle123,mariadb.architecture=replication bitnami/moodle
Any help on this is appreciated.
High-Availability not working in Hadoop cluster
I am trying to move my non-HA namenode to HA. After setting up all the configurations for JournalNode by following the Apache Hadoop documentation, I was able to bring the namenodes up. However, the namenodes crash immediately and throw the following error.
ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode. java.io.IOException: There appears to be a gap in the edit log. We expected txid 43891997, but got txid 45321534.
I tried to recover the edit logs, initialize the shared edits, etc., but nothing works. I am not sure how to fix this problem without formatting the namenode, since I do not want to lose any data.
Any help is greatly appreciated. Thanks in advance.
Apache Kafka Consume from Slave/ISR node
I understand the concept of master/slave and data replication in Kafka, but I don't understand why consumers and producers are always routed to a master node when writing to or reading from a partition, instead of being able to read from any ISR (in-sync replica)/slave.
The way I think about it, if all consumers are redirected to one single master node, then more hardware is required to handle read/write operations from large consumer groups/producers.
Is it possible to read and write on slave nodes, or will the consumers/producers always reach out to the master node of that partition?
How to install singularity on SuSE?
I'm using SUSE 12:
> cat /etc/SuSE-release
SUSE LINUX Enterprise Server 12 (x86_64)
VERSION = 12
PATCHLEVEL = 3
I can also see in my /etc/os-release file that I use SLES12-SP3. I'm trying to install singularity. From the docs:
VERSION=2.5.2
wget https://github.com/singularityware/singularity/releases/download/$VERSION/singularity-$VERSION.tar.gz
tar xvf singularity-$VERSION.tar.gz
cd singularity-$VERSION
./configure --prefix=/usr/local
make
sudo make install
At the ./configure --prefix=/usr/local step I get:
configure: error: Unable to find the libarchive headers, need package libarchive-devel (libarchive-dev on Debian/Ubuntu)
I tried sudo zypper libarchive-devel, but there is no libarchive-devel. How can I install singularity on SLES12?
SLES License Migration?
Recently I migrated a SLES image from AWS to GCP, and when I tried to update the repositories using the command zypper ref, I realized that zypper was not working since the instance wasn't properly registered as a Cloud SLES.
I created a fresh SLES instance in GCP to check the zypper configuration and realised that in /etc/hosts there was an entry for the GCP SMT servers.
I went back to the migrated compute engine in GCP, updated the /etc/hosts and ran the following commands:
And it didn't work.
I even tried replicating the machine image with the --licenses flag via gcloud (https://cloud.google.com/sdk/gcloud/reference/compute/images/create), but still no success.
Does anyone have any suggestions?
Error creating user groups from text file
I have a problem with a script:
#!/bin/bash
# The tipo argument is either "usuario" (user) or "grupo" (group).
# The accion argument is either "crear" (create) or "eliminar" (delete), applied to either of the above.
tipo=$1
accion=$2
if [ $tipo = "usuario" ]; then
    while IFS=, read -r nombre usuario grupo
    do
        if [ $accion = "crear" ]; then
            useradd "$usuario" -c "$nombre" -m -G "$grupo"
        else
            sudo userdel "$usuario" -r -f
        fi
    done < /tmp/usuarios_fles.txt
    echo "----- Usuarios procesados ----"
    exit 0
else
    while IFS= read -r grupo
    do
        if [ $accion = "crear" ]; then
            groupadd "$grupo"
        else
            groupdel "$grupo" -f
        fi
    done < /tmp/grupos_fles.txt
    exit 0
    echo "----- Grupos procesados ----"
fi
The groups file (/tmp/grupos_fles.txt) only contains these text strings:
adminuser
readwriteuser
readuser
It gives me the following error on SUSE 11.3; when I test on Fedora 33, the error does not appear:
'.oupadd: Invalid group name `adminuser
'.oupadd: Invalid group name `readwriteuser
'.oupadd: Invalid group name `readuser
I don't know what else to do to make it run
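One possible explanation, offered as an assumption rather than a confirmed diagnosis: the error lines above begin with '. where "gr" would be expected. A terminal shows this pattern when the group name ends with a carriage return, for example when the text file was saved with Windows (CRLF) line endings. The sketch below shows how such characters could be stripped before the names reach groupadd. The helper name strip_cr is hypothetical, and the sample data is an in-memory stand-in for the real file.

```shell
# Hypothetical helper: delete every carriage return from the text on stdin.
strip_cr() {
    tr -d '\r'
}

# In-memory sample that mimics a file saved with CRLF line endings.
sample=$(printf 'adminuser\r\nreadwriteuser\r\nreaduser\r\n')

# Print each cleaned name in brackets so stray characters would be visible.
printf '%s\n' "$sample" | strip_cr | while IFS= read -r grupo; do
    printf 'would create group: [%s]\n' "$grupo"
done
```

If the assumption holds, feeding the loop in the script from the cleaned stream instead of the raw file would pass names without the trailing character.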
Pacemaker Postgresql Master-Master State
I am using Pacemaker with CentOS 8 and PostgreSQL 12. Master/slave failover runs successfully. But if all nodes become masters, Pacemaker can't repair the cluster unless I send the command 'pcs resource cleanup', even though I set 60s in the resource config. How can I fix it?
State video: https://kerteriz.net/depo/Pacemaker-Fail-1.m4v
on resource timeout corosync/pacemaker node kill
Is there an option to kill the node when a resource times out on its stop action (or, alternatively, on failure of any action)? I do not have any fencing/STONITH device.
Thanks in advance
repmgr managed by pacemaker
I'm looking for a way to connect repmgr to Pacemaker on Red Hat. There are applications configured for Pacemaker that send data to Postgres managed by repmgr. When an issue occurs, Pacemaker switches the virtual IP and the applications to the standby server, but repmgr does not switch the DB to the standby node. If there is an issue with repmgr, then repmgr can also call Pacemaker to switch everything over to the standby node. Is there a way to manage repmgr with Pacemaker?
How can I prevent resources from moving after recovery
I have a two-node Pacemaker cluster running on CentOS 8. I set resource-stickiness to INFINITY. When I reboot the node that hosts my resources, all resources migrate to the second node as expected. But when node one comes back online, we get a multi-active resources problem: the cluster detects that my resources are active on both nodes, then stops all resources on the second node and keeps them active on the first one. My expectation is that when the first node comes back online, the resources do not migrate back to it but stay active on the second node.
Pacemaker configuration for xCAT
I'm looking to set up an xCAT HA solution using this guide: https://xcat-docs.readthedocs.io/en/stable/advanced/hamn/setup_ha_mgmt_node_with_drbd_pacemaker_corosync.html
Unfortunately this guide is fairly old and outdated. I've got most of it working but I'm having issues with the pcs commands that should sort out the ordering such as:
pcs -f xcat_cfg constraint order list ip_xCAT db_xCAT
There are several of these lines in the config but this command does not seem to be valid. Is anyone able to assist with what this/these line(s) should be?