How to delete dictionary keys that have a blank value?

I'm working on a Python 3.9 script that reads a .csv file and writes a .json file from it. The original code is as follows:

import csv
import json
from time import time

start = time()


def make_json(csv_path, json_path):
    data = []

    with open(csv_path, encoding='utf-8') as csvp:
        csv_reader = csv.DictReader(csvp)

        for row in csv_reader:
            data.append(row)

    with open(json_path, 'w', encoding='utf-8') as jsonp:
        jsonp.write(json.dumps(data, indent=4))


csv_path = r'elearningtest.csv'
json_path = r'elearningtest.json'

make_json(csv_path, json_path)

print(f'Time taken to run: {time() - start} seconds')

This works as expected. Here's a sample of the output:

[
    {
        "username": "username0001",
        "course_code": "coursecode_001",
        "enrollment_date": "7/28/2015 16:16",
        "completion_date": "8/28/2019 11:20",
        "score": ""
    },
    {
        "username": "username0002",
        "course_code": "coursecode_001",
        "enrollment_date": "7/28/2015 16:18",
        "completion_date": "8/7/2019 17:20",
        "score": "78"
    },
    {
        "username": "username0003",

However, what I'd like is the following:

[
    {
        "username": "username0001",
        "course_code": "coursecode_001",
        "enrollment_date": "7/28/2015 16:16",
        "completion_date": "8/28/2019 11:20"
    },
    {
        "username": "username0002",
        "course_code": "coursecode_001",
        "enrollment_date": "7/28/2015 16:18",
        "completion_date": "8/7/2019 17:20",
        "score": "78"
    },
    {
        "username": "username0003",

As you can see, the value of the "score" key in the first entry was "". What I'm aiming for is this: if a value is "", that key should be left out of that row entirely. This is the code I wrote to try to accomplish that:

for row in csv_reader:
    for k, v in row.items():
        if v == "":
            del row[k]
        else:
            data.append(row)  

When I run this, I get RuntimeError: dictionary changed size during iteration.

I've looked all over and found some promising leads, but nothing that solves this. Any input from the community would be appreciated.

2 answers

  • answered 2021-04-21 15:25 Henry Ecker

    You can do this with the built-in filter, keeping only the items whose value is not an empty string:

    import json
    
    data = json.loads('''
    [
        {
            "username": "username0001",
            "course_code": "coursecode_001",
            "enrollment_date": "7/28/2015 16:16",
            "completion_date": "8/28/2019 11:20",
            "score": ""
        },
        {
            "username": "username0002",
            "course_code": "coursecode_001",
            "enrollment_date": "7/28/2015 16:18",
            "completion_date": "8/7/2019 17:20",
            "score": "78"
        },
        {
            "username": "username0003"
        }
    ]''')
    
    print([dict(filter(lambda item: item[1] != '', d.items())) for d in data])
    

    Output (wrapped here for readability):

    [{'username': 'username0001', 'course_code': 'coursecode_001',
      'enrollment_date': '7/28/2015 16:16', 'completion_date': '8/28/2019 11:20'},
     {'username': 'username0002', 'course_code': 'coursecode_001',
      'enrollment_date': '7/28/2015 16:18',
      'completion_date': '8/7/2019 17:20', 'score': '78'},
     {'username': 'username0003'}]
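
    The same filter can be applied to the rows coming out of csv.DictReader instead of to parsed JSON. The following is only a minimal sketch: the in-memory sample_csv and the helper name drop_blank are assumptions made for illustration and are not part of the original script.

```python
import csv
import io

# Hypothetical in-memory CSV standing in for elearningtest.csv
sample_csv = io.StringIO(
    "username,course_code,score\n"
    "username0001,coursecode_001,\n"
    "username0002,coursecode_001,78\n"
)


def drop_blank(row):
    """Return a new dict without the keys whose value is an empty string."""
    return dict(filter(lambda item: item[1] != "", row.items()))


# Build the filtered list row by row; the original rows are never mutated
data = [drop_blank(row) for row in csv.DictReader(sample_csv)]
print(data)
```

    Because drop_blank builds a new dictionary rather than deleting from the one being iterated, it avoids the RuntimeError from the question.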
    

  • answered 2021-04-21 17:00 Pranav Hosangadi

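    You get that error because you're deleting keys from row while iterating over row.items(), which is a live view of the dictionary, and Python doesn't allow a dictionary to change size while it is being iterated. Your loop also appends row once for every non-blank value instead of once per row. There are a couple of ways to get around this: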

    1. Create a copy of row.items() before iterating over it:
    for row in csv_reader:
        row_items = list(row.items())
        for k, v in row_items:
            if v == "":
                del row[k]
        # Append row to data after checking all keys
        data.append(row) 
    
    2. Create a new dictionary containing only those values that aren't blank instead of deleting keys from the old one:
    for row in csv_reader:
        row_filtered = {k: v for k, v in row.items() if v != ""}
        # The line above can also be written as:
        # row_filtered = dict(item for item in row.items() if item[1] != "")
    
        # Then append the filtered row
        data.append(row_filtered)
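
    Putting the second option back into the make_json function from the question could look roughly like the sketch below. It makes two assumptions that go beyond the original code: it also drops None values (which csv.DictReader uses by default for fields missing from a short row), and it opens the CSV with newline='' as the csv module documentation recommends.

```python
import csv
import json


def make_json(csv_path, json_path):
    """Convert a CSV file to JSON, leaving out keys whose value is blank."""
    data = []

    # newline='' lets the csv module handle line endings itself
    with open(csv_path, newline='', encoding='utf-8') as csvp:
        for row in csv.DictReader(csvp):
            # Assumption: None (DictReader's default for fields missing
            # from a short row) should be dropped just like ""
            row_filtered = {k: v for k, v in row.items() if v not in ("", None)}
            # Append exactly once per row, after filtering
            data.append(row_filtered)

    with open(json_path, 'w', encoding='utf-8') as jsonp:
        json.dump(data, jsonp, indent=4)
```

    It would be called the same way as before, for example make_json(r'elearningtest.csv', r'elearningtest.json').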