How to normalize a nested .json?

I am using the Mapbox web API and get a JSON response back. I've been having trouble parsing it; one of the challenges is that the returned JSON is nested. Here is the response:

{
   "type":"FeatureCollection",
   "query":[
      -73.989,
      40.733
   ],
   "features":[
      {
         "id":"locality.12696928000137850",
         "type":"Feature",
         "place_type":[
            "locality"
         ],
         "relevance":1,
         "properties":{
            "wikidata":"Q11299"
         },
         "text":"Manhattan",
         "place_name":"Manhattan, New York, United States",
         "bbox":[
            -74.047313153061,
            40.679573,
            -73.907,
            40.8820749648427
         ],
         "center":[
            -73.9597,
            40.7903
         ],
         "geometry":{
            "type":"Point",
            "coordinates":[
               -73.9597,
               40.7903
            ]
         },
         "context":[
            {
               "id":"place.2618194975964500",
               "wikidata":"Q60",
               "text":"New York"
            },
            {
               "id":"district.12113562209855570",
               "wikidata":"Q500416",
               "text":"New York County"
            },
            {
               "id":"region.17349986251855570",
               "wikidata":"Q1384",
               "short_code":"US-NY",
               "text":"New York"
            },
            {
               "id":"country.19678805456372290",
               "wikidata":"Q30",
               "short_code":"us",
               "text":"United States"
            }
         ]
      },
      {
         "id":"region.17349986251855570",
         "type":"Feature",
         "place_type":[
            "region"
         ],
         "relevance":1,
         "properties":{
            "wikidata":"Q1384",
            "short_code":"US-NY"
         },
         "text":"New York",
         "place_name":"New York, United States",
         "bbox":[
            -79.8578350999901,
            40.4771391062446,
            -71.7564918092633,
            45.0239286969073
         ],
         "center":[
            -75.4652471468304,
            42.751210955
         ],
         "geometry":{
            "type":"Point",
            "coordinates":[
               -75.4652471468304,
               42.751210955
            ]
         },
         "context":[
            {
               "id":"country.19678805456372290",
               "wikidata":"Q30",
               "short_code":"us",
               "text":"United States"
            }
         ]
      },
      {
         "id":"country.19678805456372290",
         "type":"Feature",
         "place_type":[
            "country"
         ],
         "relevance":1,
         "properties":{
            "wikidata":"Q30",
            "short_code":"us"
         },
         "text":"United States",
         "place_name":"United States",
         "bbox":[
            -179.9,
            18.8163608007951,
            -66.8847646185949,
            71.4202919997506
         ],
         "center":[
            -97.9222112121185,
            39.3812661305678
         ],
         "geometry":{
            "type":"Point",
            "coordinates":[
               -97.9222112121185,
               39.3812661305678
            ]
         }
      }
   ],
   "attribution":"NOTICE: © 2021 Mapbox and its suppliers. All rights reserved. Use of this data is subject to the Mapbox Terms of Service (https://www.mapbox.com/about/maps/). This response and the information it contains may not be retained. POI(s) provided by Foursquare."
}

I was able to load it into a dataframe using the following code snippet:

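import json
import requests
from pandas import json_normalize

url = "https://api.mapbox.com/geocoding/v5/mapbox.places/-73.989,40.733.json?types=country,region,locality&access_token=MY_KEY_HERE"

# Fetch the response and parse the JSON body into a dict
data = json.loads(requests.get(url).text)

# One row per entry in the "features" list
df = json_normalize(data, 'features')

return df  # this snippet is the body of a function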

However, I see that I need to add [query] to it, so I modified the relevant portion to look like:

url = "https://api.mapbox.com/geocoding/v5/mapbox.places/-73.989,40.733.json?types=country,region,locality&access_token=MY_KEY_HERE"

data = json.loads(requests.get(url).text)

df = json_normalize(data, 'features', ['query'])

return df

(The syntax I am following comes from the documentation)

The error I get states:

ValueError: Length of values does not match length of index.

The query field looks like this: [image: query field that needs to be added to the dataframe]

I'm not sure what the error means or how to resolve it.

Here is my desired output dataframe: [image: Desired Output]

I can do the cleaning and dropping of unneeded fields but I am having trouble getting the [query] field to appear.

1 answer

  • answered 2021-07-24 05:16 Corralien

    Add the column query after json_normalize:

    df.insert(0, 'query', [data['query']] * len(df))
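
A minimal offline sketch of the line above, assuming a trimmed-down copy of the response from the question (only `query` and a few `features` fields are kept, so no API call is needed). `query` holds two values while the frame has three rows, which is presumably what the length mismatch in the error refers to; repeating the whole list once per row avoids that. The `query_lon` and `query_lat` column names are hypothetical, added only to show a flat alternative.

```python
import pandas as pd

# Trimmed-down stand-in for the Mapbox response shown in the question.
data = {
    "type": "FeatureCollection",
    "query": [-73.989, 40.733],
    "features": [
        {"id": "locality.12696928000137850", "text": "Manhattan"},
        {"id": "region.17349986251855570", "text": "New York"},
        {"id": "country.19678805456372290", "text": "United States"},
    ],
}

# One row per feature; the top-level "query" key is not included yet.
df = pd.json_normalize(data, "features")

# Repeat the query list once per row and put it in the first column.
df.insert(0, "query", [data["query"]] * len(df))

# Optional flat alternative (hypothetical column names): a scalar is
# broadcast to every row, so each coordinate gets its own column.
df.insert(1, "query_lon", data["query"][0])
df.insert(2, "query_lat", data["query"][1])

print(df)
```

Every cell in the `query` column refers to the same list object, which is fine for display or export; if the cells will be modified individually later, build the column with `[list(data["query"]) for _ in range(len(df))]` instead.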
    
