What is the best way to select more than 4 GiB (multiple selects concurrently) of data from PostgreSQL DB?

I am working a app to monitor CNC machines. I use Nest.js + TypeScript + MikroORM + PostgreSQL.

A customer needs to have real-time timeline graph with some state data (6 Boolean values, only one can be true) from start of the first workshift (6am) until the end of the last workshift (10pm) in a particular day.

I get the current state (6 boolean values) every second, then compute intervals for consecutive records with the same state (all 6 boolean values have same values in each state record during a single interval).

Example data:

// State record
{
  created: Date,
  v1: boolean,
  v2: boolean,
  v3: boolean,
  v4: boolean,
  v5: boolean,
  v6: boolean
}

// Interval
{
  startTime: Date,
  endTime: Date,
  length: number
  v1: boolean,
  v2: boolean,
  v3: boolean,
  v4: boolean,
  v5: boolean,
  v6: boolean
}

One interval has usually a length of 10-100 minutes, but it can be several days long (e.g. on holiday company shutdown).

As we monitor many machines (60+ machines), when there are many intervals between 6am and 10pm, the data is too much to hold in memory (it causes JS heap to go out of memory).

I was thinking about publishing the data to the frontend using MQTT with all data from 6am to now in a machine/$id/data/current topic and then only the changes in machine/$id/data topic, however:

  1. the last interval (latest) can change (endTime and length will be incremented by a second every second and when the state changes, the previous interval still is incremented by a second and a new interval is inserted with 0 length), therefore I cannot simply publish the latest interval, but occasionally I would need to publish latest two interval;
  2. bigger problem is that initial (current) data: it will be huge e.g. at 9pm (15 hours in total; of course, it depends on the number of intervals).

With the first problem, the frontend could deal with it comparing the IDs of the intervals, but I have no idea how to solve the issues with the initial (current) data: I store the intervals in the DB, therefore I need to select * from mx.intervals where start_time < now() and end_time > '2022-01-25T06:00:00.00+01' order by start_time asc on every machine on app initialisation, which fails (fatal error: reached heap limit allocation failed - JS heap out of memory).

Would anyone help me how could I implement this? I need an algorithm, not necessarily a working code.


At one other customer with 35 machines, yesterday (24 January 2022) from 6am to 10pm, there are 103 intervals per machine on average. Maximum number of intervals was 567 on a machine, however, it could be higher, as some operations (i.e. producing a piece/product) on some CNC machines take about 30 seconds.

How many English words
do you know?
Test your English vocabulary size, and measure
how many words do you know
Online Test
Powered by Examplum