Node-RED High Availability/Fault Tolerance

I'm very interested in using Node-RED for some projects, but having a strong background in architecture, I always think about scalability and HA first.

I know that Node-RED instances can be scaled if we design the flows to be "stateless", but I'm worried about what happens if an instance dies in the middle of a flow execution. The impact can be mitigated by using a queue (like SQS) with an ACK timeout, but this would create other problems, such as having to make each step idempotent. I really like Apache NiFi's approach of putting a queue before each step: if the instance dies, the message is processed once it restarts. Any ideas on how to do something similar in Node-RED?

1 answer

  • answered 2020-12-03 16:05 hardillb

    Recent releases added the pluggable wiring API which allows the code that passes messages between nodes to be swapped out.

    This was done to solve this sort of problem and to allow things like a distributed instance (e.g. nodes running on different hosts).

    To work properly, it does require all the nodes to be updated to signal when they are done with a message. The core nodes have been done iirc, but there are a large number of 3rd-party nodes that still need updating.

    https://nodered.org/blog/2019/09/20/node-done
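
    As a rough sketch of what "signalling done" looks like inside a node, assuming the `send`/`done` arguments that Node-RED 1.0+ passes to the `input` handler as described in the post linked above. The `lower-case` node type and the `registerLowerCase` function name are made up for illustration; a real node file would normally assign this function to `module.exports`.

```javascript
// Hypothetical node, for illustration only: lower-cases msg.payload and
// tells the runtime when it has finished handling each message.
function registerLowerCase(RED) {
    function LowerCaseNode(config) {
        RED.nodes.createNode(this, config);
        const node = this;

        node.on('input', function (msg, send, done) {
            // Older runtimes (pre-1.0) do not pass send/done, so fall back
            // to node.send to stay backwards compatible.
            send = send || function () { node.send.apply(node, arguments); };

            try {
                msg.payload = String(msg.payload).toLowerCase();
                send(msg);
                // Signal that this node is finished with the message.
                if (done) { done(); }
            } catch (err) {
                // Report the failure through done when available,
                // otherwise through the older node.error call.
                if (done) { done(err); } else { node.error(err, msg); }
            }
        });
    }

    RED.nodes.registerType('lower-case', LowerCaseNode);
}

// In a real node file this would usually be: module.exports = registerLowerCase;
if (typeof module !== 'undefined') { module.exports = registerLowerCase; }
```

    A node that never calls `done` gives the runtime no way to know the message has been fully handled, which is why the 3rd-party nodes need updating before this kind of tracking can be relied on.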