Region/boundary detection in time series analysis

I'm trying to find the limits (start and end positions) of regions in a time series.

For example:

Here we have three (flat) peaks, and the limits are well defined:

  |     aaaaaaaaaaaaaaaa                 
  |     aaaaaaaaaaaaaaaa      aaaaaaaaaaaaaaaa           
  |     aaaaaaaaaaaaaaaa      aaaaaaaaaaaaaaaa           aaaaaaaaaaaaaaaaaaaaaa
  |     aaaaaaaaaaaaaaaa      aaaaaaaaaaaaaaaa           aaaaaaaaaaaaaaaaaaaaaa
  |aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
  |---------------------------------------------------------------------------------
   |    +    |    +    |    +    |    +    |    +    |    +    |    +    |    +    |
   1         10        20        30        40        50        60        70        80

Where the limits are:

1) 5 to 20
2) 26 to 42
3) 54 to 75

But sometimes there is noise, as in the following example:

  |       aa aa a   aa                 
  |      aaa aaaaa  aaa          a  a  a  aa              aa a a a aaaa aa a a
  |     aaaa aaaaaaaaaaa       aaaa aaaaaaaaa      a     aaaaaaaaa aaaaaaaaa aa
  |     aaaaaaaaaaaaaaaa   a  aaaaaaaaaaaaaaaa  aa a  a  aaaaaaaaaaaaaaaaaaaaaa
  |aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
  |---------------------------------------------------------------------------------
   |    +    |    +    |    +    |    +    |    +    |    +    |    +    |    +    |
   1         10        20        30        40        50        60        70        80

Here the limits are not well defined, but there may be three or four peaks and associated regions.

To solve this problem, I tried using a smoothed z-score (to find peaks) and then looking for the longest region above a threshold.

Even though I was able to find inversions (peaks) within a defined lag (sliding window), I couldn't define the regions of interest.
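
As a rough illustration of the "region above a threshold" idea, here is a minimal sketch in plain Python. The function name `find_regions` and the parameters `threshold`, `max_gap`, and `min_len` are hypothetical, chosen only for this example, and the sample heights are read off the first (clean) plot using the limits listed above.

```python
def find_regions(values, threshold, max_gap=2, min_len=3):
    """Return 1-based inclusive (start, end) runs where values > threshold.

    Runs separated by at most `max_gap` below-threshold samples are merged,
    then runs shorter than `min_len` are dropped as noise.
    All parameter names and defaults are illustrative assumptions.
    """
    # Step 1: collect raw runs of consecutive samples above the threshold.
    runs = []
    start = None
    for i, v in enumerate(values, start=1):
        if v > threshold:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(values)))

    # Step 2: merge runs that are separated only by a short dip.
    merged = []
    for s, e in runs:
        if merged and s - merged[-1][1] - 1 <= max_gap:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))

    # Step 3: discard runs that are too short to count as a region.
    return [(s, e) for s, e in merged if e - s + 1 >= min_len]


# Clean example: baseline height 1, three plateaus of heights 5, 4 and 3.
clean = [1] * 80
for s, e, h in [(5, 20, 5), (26, 42, 4), (54, 75, 3)]:
    for i in range(s, e + 1):
        clean[i - 1] = h

print(find_regions(clean, threshold=1))  # [(5, 20), (26, 42), (54, 75)]
```

With this approach the number of regions is simply the length of the returned list, but the result on noisy data depends heavily on how `threshold`, `max_gap`, and `min_len` are chosen.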

I'm wondering whether Gaussian mixture models (GMM) or Otsu thresholding are good enough for this sort of data.
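
For context, Otsu's method picks the threshold that maximizes the between-class variance when the samples are split into a "low" and a "high" class. A minimal plain-Python sketch follows; the function name `otsu_threshold` and the sample data are assumptions made for illustration, and the candidate thresholds are taken directly from the distinct sample values rather than from a histogram.

```python
def otsu_threshold(values):
    """Return the level t that maximizes between-class variance
    when samples are split into (<= t) and (> t)."""
    levels = sorted(set(values))
    n = len(values)
    total = sum(values)
    best_t, best_var = levels[0], -1.0
    # The largest level is skipped: it would leave the "high" class empty.
    for t in levels[:-1]:
        low = [v for v in values if v <= t]
        w0 = len(low) / n                      # weight of the low class
        w1 = 1.0 - w0                          # weight of the high class
        mu0 = sum(low) / len(low)              # mean of the low class
        mu1 = (total - sum(low)) / (n - len(low))  # mean of the high class
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Hypothetical noisy samples: a baseline near 1 and a plateau near 9.
noisy = [1, 1, 2, 1, 9, 8, 9, 9, 1, 1]
print(otsu_threshold(noisy))  # 2
```

Note that this only separates two classes (baseline versus everything else), so plateaus of different heights all end up on the same side of the threshold, and the boundaries still have to be extracted from the thresholded signal afterwards.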

Also, how can I determine the number of regions if it is unknown (I don't know the number of components)?

Is there any other good algorithm for separating multimodal distributions?