Python: map each word to the notes that contain it

I have a list of words like this:

 word_list=[{"word": "python",
    "repeted": 4},
    {"word": "awsome",
    "repeted": 3},
    {"word": "frameworks",
    "repeted": 2},
    {"word": "programing",
    "repeted": 2},
    {"word": "stackoverflow",
    "repeted": 2},
    {"word": "work",
    "repeted": 1},
    {"word": "error",
    "repeted": 1},
    {"word": "teach",
    "repeted": 1}
    ]

It comes from another list of notes:

note_list = [{"note_id":1,
"note_txt":"A curated list of awesome Python frameworks"},
{"note_id":2,
"note_txt":"what is awesome Python frameworks"},
{"note_id":3,
"note_txt":"awesome Python is good to wok with it"},
{"note_id":4,
"note_txt":"use stackoverflow to lern programing with python is awsome"},
{"note_id":5,
"note_txt":"error in programing is good to learn"},
{"note_id":6,
"note_txt":"stackoverflow is very useful to share our knoloedge"},
{"note_id":7,
"note_txt":"teach, work"},
  ]

I want to know how I can map every word to the notes that contain it:

maped_list=[{"word": "python",
        "notes_ids": [1,2,3,4]},
        {"word": "awsome",
        "notes_ids": [1,2,3]},
        {"word": "frameworks",
        "notes_ids": [1,2]},
        {"word": "programing",
        "notes_ids": [4,5]},
        {"word": "stackoverflow",
        "notes_ids": [4,6]},
        {"word": "work",
        "notes_ids": [7]},
        {"word": "error",
        "notes_ids": [5]},
        {"word": "teach",
        "notes_ids": [7]}
        ]

My work:

import re

# I started by collecting all the note texts into one list
notes_test = []
for note in note_list:
    notes_test.append(note['note_txt'])
# count the repetitions of each word
dict = {}
for sentence in notes_test:
    for word in re.split(r'\s', sentence): # split on whitespace
        try:
            dict[word] += 1
        except KeyError:
            dict[word] = 1
word_list= []
for key in dict.keys():
    word = {}
    word['word'] = key
    word['repeted'] = dict[key]
    word_list.append(word)

My questions:

  1. How can I map the word list and the note list to get the mapped list?
  2. What do you think of the quality of my code? Any remarks?

2 answers

  • answered 2021-11-23 01:45 Selcuk

    You can use a list comprehension:

    mapped_list = [{"word": w_dict["word"],
                    "notes_ids": [n_dict["note_id"] for n_dict in note_list
                                  if w_dict["word"].lower() in n_dict["note_txt"].lower()]
                    } for w_dict in word_list]
    

    The result would be:

    [{'word': 'python', 'notes_ids': [1, 2, 3, 4]},
     {'word': 'awsome', 'notes_ids': [4]},
     {'word': 'frameworks', 'notes_ids': [1, 2]},
     {'word': 'programing', 'notes_ids': [4, 5]},
     {'word': 'stackoverflow', 'notes_ids': [4, 6]},
     {'word': 'work', 'notes_ids': [1, 2, 7]},
     {'word': 'error', 'notes_ids': [5]},
     {'word': 'teach', 'notes_ids': [7]}]
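
Note that "in" on strings is a substring test, which is why "work" also matches notes 1 and 2 (through "frameworks") in the result above. If whole-word matches are wanted instead, one possible sketch is to tokenize each note first. The helper name tokenize and the \w+ tokenization rule are assumptions made for illustration; they are not part of the original answer.

```python
import re

# Data copied from the question (only the keys used here are kept).
word_list = [{"word": w} for w in (
    "python", "awsome", "frameworks", "programing",
    "stackoverflow", "work", "error", "teach")]
note_list = [
    {"note_id": 1, "note_txt": "A curated list of awesome Python frameworks"},
    {"note_id": 2, "note_txt": "what is awesome Python frameworks"},
    {"note_id": 3, "note_txt": "awesome Python is good to wok with it"},
    {"note_id": 4, "note_txt": "use stackoverflow to lern programing with python is awsome"},
    {"note_id": 5, "note_txt": "error in programing is good to learn"},
    {"note_id": 6, "note_txt": "stackoverflow is very useful to share our knoloedge"},
    {"note_id": 7, "note_txt": "teach, work"},
]


def tokenize(text):
    """Hypothetical helper: the set of lowercase words in text, punctuation dropped."""
    return set(re.findall(r"\w+", text.lower()))


# Tokenize every note once, so each lookup is a set membership test.
note_tokens = [(note["note_id"], tokenize(note["note_txt"])) for note in note_list]

mapped_list = [
    {"word": w["word"],
     "notes_ids": [note_id for note_id, tokens in note_tokens
                   if w["word"].lower() in tokens]}
    for w in word_list
]
print(mapped_list)
```

With the data from the question this gives [7] for "work" and still [4] for "awsome": the notes that spell the word "awesome" would only match if the spelling were normalized first.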
    

  • answered 2021-11-23 01:55 riquefr

    1. Try to build maped_list while you build the dict, adding the id of the current note for each word as you iterate.
    2. Do not use dict as a variable name: it is the name of Python's built-in dict type (as in dict()), and assigning to it shadows the built-in. Also, your input doesn't contain any whitespace other than spaces, so you can use sentence.split(). You can also convert all words to lowercase so that they don't differ by capitalization.
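
The suggestions above could be combined roughly as follows. This is only a sketch under a few assumptions: the names counts and note_ids_by_word are made up for illustration, and stripping punctuation with string.punctuation is an addition (so that "teach," is counted as "teach") that the answer itself does not mention.

```python
import string
from collections import defaultdict

# Data copied from the question.
note_list = [
    {"note_id": 1, "note_txt": "A curated list of awesome Python frameworks"},
    {"note_id": 2, "note_txt": "what is awesome Python frameworks"},
    {"note_id": 3, "note_txt": "awesome Python is good to wok with it"},
    {"note_id": 4, "note_txt": "use stackoverflow to lern programing with python is awsome"},
    {"note_id": 5, "note_txt": "error in programing is good to learn"},
    {"note_id": 6, "note_txt": "stackoverflow is very useful to share our knoloedge"},
    {"note_id": 7, "note_txt": "teach, work"},
]

counts = defaultdict(int)              # word -> occurrences across all notes
note_ids_by_word = defaultdict(list)   # word -> ids of the notes containing it

for note in note_list:
    # Lowercase first, then split on whitespace, as suggested in the answer.
    for raw_word in note["note_txt"].lower().split():
        word = raw_word.strip(string.punctuation)  # assumption: drop surrounding punctuation
        if not word:
            continue
        counts[word] += 1
        # Record each note id only once per word.
        if note["note_id"] not in note_ids_by_word[word]:
            note_ids_by_word[word].append(note["note_id"])

# Same shapes as in the question ("repeted" is the key name used there).
word_list = [{"word": w, "repeted": n} for w, n in counts.items()]
maped_list = [{"word": w, "notes_ids": ids} for w, ids in note_ids_by_word.items()]
print(maped_list)
```

Both lists are filled in a single pass over the notes, and no variable shadows the built-in dict.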
