How to make a namedtuple when column headers contain spaces

I'm trying to make a namedtuple from a DictReader object. The problem I'm struggling with is that the CSV file I'm working with has some really long and ugly column headers. For the sake of this example, one of them is:

"What is typically the main dish at your Thanksgiving dinner?".

What is throwing me off is that there are a bunch of spaces in this title, so if I understand correctly, namedtuple thinks each word is a separate argument. How would you recommend solving this? I have read several threads and feel like I almost got there with this one: What is the pythonic way to read CSV file data as rows of namedtuples?

I am just using one column header as an example. Here is some code I have so far:

import csv
import collections

filename = 'thanksgiving2015.csv'
with open(filename, 'r', encoding='utf-8') as f:
    reader = csv.DictReader(f)
    columns = collections.namedtuple(
        'columns',
        'What is typically the main dish at your Thanksgiving dinner?')

Should I strip the spaces from all these column headers before making the namedtuple? I could do this in Excel before I even import the CSV, but I assume there is a nice solution in Python.

1 answer

  • answered 2018-10-11 19:13 chepner

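    namedtuple treats a single string as a whitespace- or comma-delimited list of field names, so each word of the header becomes its own field. Pass an explicit list of column names instead. Field names must also be valid Python identifiers, so names that contain spaces are rejected with a ValueError unless you pass rename=True, which replaces each invalid name with a positional one (_0, _1, ...).

    namedtuple('columns', ['What is...', 'some other absurd column name'], rename=True)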
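
To make the behaviour concrete, here is a minimal sketch of the three cases. The second column name, "Age", and the type name "Columns" are assumptions made up for the illustration; only the long header comes from the question.

```python
from collections import namedtuple

header = "What is typically the main dish at your Thanksgiving dinner?"

# A single string is split into separate field names, one per word.
# Some of those words ("is", "dinner?") are not usable as field names,
# so namedtuple raises ValueError.
try:
    namedtuple("Columns", header)
    string_form_error = None
except ValueError as exc:
    string_form_error = exc

# A list keeps the header as ONE field name, but that name still contains
# spaces, so it is rejected as well unless rename=True is given.
try:
    namedtuple("Columns", [header, "Age"])  # "Age" is a made-up column
    list_form_error = None
except ValueError as exc:
    list_form_error = exc

# rename=True replaces each invalid name with its position: _0, _1, ...
Columns = namedtuple("Columns", [header, "Age"], rename=True)
print(Columns._fields)
```

With rename=True the long header is only reachable as `_0`, which is not much more readable than an index.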
    

    I would rethink using the header values directly as field names, though. Ignore the header, and pass a list of shorter names that you can use as attributes later.
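
One way the "ignore the header" suggestion might look in practice is sketched below. The short names `main_dish` and `age`, the second column, and the sample rows are all hypothetical; an in-memory string stands in for the real file so that the snippet runs on its own.

```python
import csv
import io
from collections import namedtuple

# Made-up stand-in for the CSV file; the second column and both data rows
# are invented for illustration.
sample = io.StringIO(
    "What is typically the main dish at your Thanksgiving dinner?,Age\n"
    "Turkey,30 - 44\n"
    "Tofurkey,18 - 29\n"
)

# Hypothetical short names, one per column, in the file's column order.
Row = namedtuple("Row", ["main_dish", "age"])

# Use csv.reader rather than DictReader: the header row is skipped and the
# short names above take its place.
reader = csv.reader(sample)
next(reader)
rows = [Row._make(record) for record in reader]

print(rows[0].main_dish)
```

For a real file, `sample` would be replaced by the object returned from `open(filename, newline='', encoding='utf-8')`. The list of short names has to match the number and order of columns in the file, otherwise `Row._make` raises a TypeError.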