How to query nested fields in MongoDB using Presto

I'm setting up a Presto cluster which I'd like to use to query a MongoDB instance. Data in my Mongo instance has the following structure:

  _id: <value>
  somefield: <value>
  otherfield: <value>
  nesting_1: {
    nested_field_1_1: <value>
    nested_field_1_2: <value>
  }
  nesting_2: {
    nesting_2_1: {
      nested_field_2_1_1: <value>
      nested_field_2_1_2: <value>
    }
    nesting_2_2: {
      nested_field_2_2_1: <value>
      nested_field_2_2_2: <value>
    }
  }

Just by plugging it in, Presto correctly identifies and creates columns for the top-level values (e.g. somefield, otherfield) and for the first nesting level -- that is, it creates a column for nesting_1 whose content is a row(nested_field_1_1 <type>, nested_field_1_2 <type>, ...), and I can query table.nesting_1.nested_field_1_1.

However, fields with an extra nesting layer (e.g. nesting_2 and everything within it) are missing from the table schema. Presto's documentation for the MongoDB connector does mention that:

At startup, this connector tries guessing fields’ types, but it might not be correct for your collection. In that case, you need to modify it manually. CREATE TABLE and CREATE TABLE AS SELECT will create an entry for you.

While that seems to explain my use case, it's not very clear on how to "modify it manually" -- a CREATE TABLE statement doesn't seem appropriate, as the table is already there. The documentation also has a section on how to declare fields and their types, but it is similarly unclear on how to deal with multiple nesting levels.
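
For concreteness, here is a rough sketch of what I understand a manual declaration might look like. It assumes the table definition is a document of the form { table, fields: [{ name, type, hidden }] } stored in the connector's schema collection, as the documentation describes, and that a field's type string may contain row(...) types nested inside one another. That second point is exactly what I have not been able to confirm. The table name mycollection, the LAYOUT dict, and the varchar leaf types are placeholders of mine, not my real data.

```python
import json

# Hypothetical description of the collection layout. Leaf types are
# placeholders ("varchar"); the real types would have to be filled in.
LAYOUT = {
    "somefield": "varchar",
    "otherfield": "varchar",
    "nesting_1": {
        "nested_field_1_1": "varchar",
        "nested_field_1_2": "varchar",
    },
    "nesting_2": {
        "nesting_2_1": {
            "nested_field_2_1_1": "varchar",
            "nested_field_2_1_2": "varchar",
        },
        "nesting_2_2": {
            "nested_field_2_2_1": "varchar",
            "nested_field_2_2_2": "varchar",
        },
    },
}


def presto_type(node):
    """Return a type string; a dict becomes a (possibly nested) row(...) type."""
    if isinstance(node, dict):
        inner = ", ".join(
            f"{name} {presto_type(child)}" for name, child in node.items()
        )
        return f"row({inner})"
    return node


def schema_entry(table, layout):
    """Build a document shaped like the documented table definition."""
    fields = [
        {"name": name, "type": presto_type(child), "hidden": False}
        for name, child in layout.items()
    ]
    return {"table": table, "fields": fields}


# "mycollection" is a made-up table name used only for illustration.
entry = schema_entry("mycollection", LAYOUT)
print(json.dumps(entry, indent=2))
```

If this is the right idea, the printed document would be what goes into the schema collection in place of the guessed one, and generating it from a description of the layout would at least spare me from typing 1500 field declarations by hand.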

My question is: how do I set up Presto's MongoDB connector so that I can query fields in the third nesting layer?

Answers can assume that:

  • all nested fields' names are known;
  • there are only 3 layers;
  • there is no need to preserve the layered table layout (i.e. I don't mind if my resulting Presto table has all nested fields as unique columns like somefield, rather than one field with rows like nesting_1 in the above example);
  • extra points if the solution doesn't require me to explicitly declare the names and types of all columns in the third layer, as I have over 1500 of them -- but this is not a hard requirement.