Dataproc secondary workers not used

I have a Dataproc cluster configured this way:

{
    "worker_config": {
        "num_instances": 20
    },
    "secondary_worker_config": {
        "num_instances": 10,
        "is_preemptible": True
    }
    # no autoscaling set
}

I've purposely omitted the master node details, machine types, etc.

The problem is that at run time, Dataproc doesn't seem to use the secondary nodes at all:

  • Secondary nodes don't get a green mark in the VM list of the cluster
  • The sum of the available+allocated memory does not include the secondary nodes
  • The number of YARN node managers is 20, which is the number of primary nodes only
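
For anyone wanting to reproduce the last check, here is a minimal sketch that counts the running node managers reported by the YARN ResourceManager REST API (`/ws/v1/cluster/nodes`). It assumes the script runs on the master node, that the ResourceManager listens on the default port 8088, and that secondary workers can be recognised by `-sw-` in their hostname; the function names are my own and purely illustrative.

```python
import json
from collections import Counter
from urllib.request import urlopen


def count_running_nodes(payload):
    """Count RUNNING YARN node managers, split into primary and secondary workers."""
    # The ResourceManager answers with {"nodes": {"node": [...]}};
    # "nodes" can be null when no node manager is registered.
    nodes = (payload.get("nodes") or {}).get("node", [])
    counts = Counter()
    for node in nodes:
        if node.get("state") != "RUNNING":
            continue
        host = node.get("nodeHostName", "")
        # Assumption: secondary worker hostnames contain "-sw-".
        role = "secondary" if "-sw-" in host else "primary"
        counts[role] += 1
    return dict(counts)


def fetch_nodes(rm_host="localhost", port=8088):
    """Fetch the node list from the ResourceManager (assumed host and port)."""
    with urlopen(f"http://{rm_host}:{port}/ws/v1/cluster/nodes") as resp:
        return json.load(resp)
```

Calling `count_running_nodes(fetch_nodes())` on the master should list both roles if the secondary workers had registered with YARN; in my case only the 20 primary nodes show up.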

More importantly, the underlying Spark job's execution time is comparable to that of a 20-machine cluster, so I see no benefit from using the secondary preemptible nodes.

Thank you!