Applying a custom function to each row in a column in a dataframe

I have a bit of code which pulls the latitude and longitude for a location. It is here:

address = 'New York University'
url = 'https://nominatim.openstreetmap.org/search/' + urllib.parse.quote(address) +'?format=json'

response = requests.get(url).json()
print(response[0]["lat"])
print(response[0]["lon"])

I want to apply this as a function to a long column named "address".

I've seen loads of questions about 'apply' and 'map', but they're almost all simple math examples.

Here is what I tried last night:

def locate (address):
    response = requests.get(url).json()
    print(response[0]["lat"])
    print(response[0]["lon"])
    return

df['lat'] = df['lat'].map(locate)
df['lon'] = df['lon'].map(locate)

This ended up just applying the first row lat / lon to the entire csv.

What is the best method to turn the code into a custom function and apply it to each row?

Thanks in advance.

EDIT: Thank you @PacketLoss for your assistance. I'm getting "IndexError: list index out of range" on my data, but the code does work on his sample dataframe.

Here is the read_csv I used to pull in the data:

df = pd.read_csv('C:\\Users\\CIHAnalyst1\\Desktop\\InstitutionLocations.csv', sep=',', error_bad_lines=False, index_col=False, dtype='unicode', encoding = "utf-8",  warn_bad_lines=False)

Here is a text copy of the rows from the dataframe:

      address
0     GRAND CANYON UNIVERSITY
1     SOUTHERN NEW HAMPSHIRE UNIVERSITY
2     WESTERN GOVERNORS UNIVERSITY
3     FLORIDA INTERNATIONAL UNIVERSITY - UNIVERSITY ...
4     PENN STATE UNIVERSITY UNIVERSITY PARK
...   ...
4292  THE ART INSTITUTES INTERNATIONAL LLC
4293  INTERCOAST - ONLINE
4294  CAROLINAS COLLEGE OF HEALTH SCIENCES
4295  DYERSBURG STATE COMMUNITY COLLEGE COVINGTON
4296  ULTIMATE MEDICAL ACADEMY - NY

1 answer

  • answered 2021-05-13 11:12 PacketLoss

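    You need to return your values from your function. As written, locate prints the coordinates and returns None, so map has nothing to store in the column. It also never uses its address argument: url is built once outside the function, so every call queries the same location.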
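To illustrate the point about return values without touching the network, here is a minimal sketch. The functions `shout_without_return` and `shout` are hypothetical stand-ins for `locate`, and the two-row frame is made-up sample data.

```python
import pandas as pd

# Made-up sample data, for illustration only.
df = pd.DataFrame({'address': ['Paris', 'Sydney Opera House']})


def shout_without_return(address):
    # Prints the value but implicitly returns None.
    print(address.upper())


def shout(address):
    # Returns the value, so map() has something to store.
    return address.upper()


no_return = df['address'].map(shout_without_return)
with_return = df['address'].map(shout)

print(no_return.isna().all())   # True: every entry is missing
print(with_return.tolist())     # ['PARIS', 'SYDNEY OPERA HOUSE']
```

Whatever the function returns is what ends up in the resulting Series, which is why a print-only function leaves the column empty.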

    We can use apply with axis=1 here, so the function receives each row of the df and can read that row's address.

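    import urllib.parse

    import pandas as pd
    import requests

    data = {'address': ['New York University', 'Sydney Opera House', 'Paris', 'SupeRduperFakeAddress']}

    df = pd.DataFrame(data)

    def locate(row):
        # Build the query URL from this row's address.
        url = 'https://nominatim.openstreetmap.org/search/' + urllib.parse.quote(row['address']) +'?format=json'
        response = requests.get(url).json()
        # An empty list means no match; lat/lon are then left unset (NaN) for this row.
        if response:
            row['lat'] = response[0]['lat']
            row['lon'] = response[0]['lon']
        return row

    # axis=1 passes each row to locate as a Series.
    df = df.apply(locate, axis=1)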
    

    Outputs

                     address           lat                 lon
    0    New York University   40.72925325  -73.99625393609625
    1     Sydney Opera House  -33.85719805  151.21512338473752
    2                  Paris    48.8566969           2.3514616
    3  SupeRduperFakeAddress           NaN                 NaN
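
If only the two coordinate columns are needed, one possible variant is to apply a function to the address column alone and have it return a pair of values. The sketch below assumes that variant and is not the only way to do it. The names `fetch_json`, `first_lat_lon`, `locate_pair`, `add_coordinates`, and the `fetch` parameter are all hypothetical. An empty result list, as with the fake address above, yields missing values instead of an IndexError.

```python
import urllib.parse

import pandas as pd


def fetch_json(address):
    # Same request as in the answer above. requests is imported lazily so the
    # parsing logic below can be tried offline with a stand-in fetch function.
    import requests
    url = ('https://nominatim.openstreetmap.org/search/'
           + urllib.parse.quote(address) + '?format=json')
    return requests.get(url).json()


def first_lat_lon(results):
    # An empty list means no match; indexing [0] on it would raise IndexError.
    if not results:
        return (None, None)
    return (results[0]['lat'], results[0]['lon'])


def locate_pair(address, fetch=fetch_json):
    # Hypothetical wrapper: fetch the results, then keep the first hit.
    return pd.Series(first_lat_lon(fetch(address)), index=['lat', 'lon'])


def add_coordinates(df, fetch=fetch_json):
    # apply() on one column calls locate_pair once per address. Because each
    # call returns a Series, the result is a DataFrame with lat/lon columns.
    coords = df['address'].apply(lambda a: locate_pair(a, fetch))
    return df.join(coords)
```

Usage would be `df = add_coordinates(df)`. Passing a stand-in such as `fetch=lambda a: []` makes it possible to check the missing-value path without sending any requests.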