Suggestions on full-text search or existing search algorithms

Can someone suggest an easy way to solve the search problem below? Is there an existing algorithm for it, or will full-text search suffice?

The items are classified as follows:

ItemCategory  ItemCluster      ItemSubCluster     SubCluster   Items
Vegetable     Root vegetables  Root               WithOutSkin  potato, sweet potato, yam
Vegetable     Root vegetables  Root               WithSkin     onion, garlic, shallot
Vegetable     Greens           Leafy green        Leaf         lettuce, spinach, silverbeet
Vegetable     Greens           Cruciferous        Flower       cabbage, cauliflower, Brussels sprouts, broccoli
Vegetable     Greens           Edible plant stem  Stem         celery, asparagus

The inputs will be something like:

sweet potato, yam
Yam, Potato
garlik, onion
lettuce, spinach, silverbeet
lettuce, silverbeet
lettuce, silverbeet, spinach

For each input, I want to find which ItemCategory, ItemCluster, ItemSubCluster, and SubCluster its items belong to.

Any help will be much appreciated.

1 answer

  • answered 2022-05-07 10:48 Deepak Tatyaji Ahire

    You are nearly following the right approach.

    You don't need full text searching here.

    What you can create here is a kind of inverted index as follows:

    Taking potato as an example, create a map entry for potato that stores its ItemCategory, ItemCluster, ItemSubCluster, and SubCluster.

    For example -

    "potato": {
        "ItemCategory": "Vegetable",
        "ItemCluster": "Root vegetables",
        "ItemSubCluster": "Root",
        "SubCluster": "WithOutSkin"
    }
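
As a rough sketch of how such an index could be built from the table in the question, the Python below maps every item name to its four classification fields. The names `ROWS`, `FIELDS`, `build_index`, and `lookup` are made up for illustration. Lower-casing the input and falling back to `difflib.get_close_matches` for misspellings such as "garlik" are assumptions about what the inputs need, not something the question states, and the 0.8 cutoff is an arbitrary starting point.

```python
from difflib import get_close_matches

# Rows copied from the classification table in the question.
ROWS = [
    ("Vegetable", "Root vegetables", "Root", "WithOutSkin", "potato, sweet potato, yam"),
    ("Vegetable", "Root vegetables", "Root", "WithSkin", "onion, garlic, shallot"),
    ("Vegetable", "Greens", "Leafy green", "Leaf", "lettuce, spinach, silverbeet"),
    ("Vegetable", "Greens", "Cruciferous", "Flower",
     "cabbage, cauliflower, Brussels sprouts, broccoli"),
    ("Vegetable", "Greens", "Edible plant stem", "Stem", "celery, asparagus"),
]
FIELDS = ("ItemCategory", "ItemCluster", "ItemSubCluster", "SubCluster")


def build_index(rows):
    """Map each lower-cased item name to its classification record."""
    index = {}
    for *values, items in rows:
        record = dict(zip(FIELDS, values))
        for item in items.split(","):
            index[item.strip().lower()] = record
    return index


def lookup(index, text, cutoff=0.8):
    """Classify each comma-separated item in `text`; unknown items map to None."""
    result = {}
    for raw in text.split(","):
        key = raw.strip().lower()
        if key not in index:
            # Assumed fallback: take the closest known name, if any is close enough.
            close = get_close_matches(key, list(index), n=1, cutoff=cutoff)
            key = close[0] if close else None
        result[raw.strip()] = index.get(key)
    return result


index = build_index(ROWS)
for line in ("sweet potato, yam", "Yam, Potato", "garlik, onion"):
    print(line, "->", lookup(index, line))
```

Exact lookups are a single dictionary access; the fuzzy fallback scans every known name, so it is only reasonable for a small item list like this one.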
    

    Storing the full strings like this for every vegetable would be expensive, because the same field names and values are repeated for each item.

    You can optimise the storage by using an encoding scheme:

    For example -

    let ItemCategory be denoted by 0, ItemCluster by 1, ItemSubCluster by 2, and SubCluster by 3

    and the values be denoted by a similar encoding scheme:

    let Vegetable be denoted by 0, Root vegetables by 1, Root by 2, and WithOutSkin by 3

    Now, your mapping becomes:

    "potato": {
        "0": "0",
        "1": "1",
        "2": "2",
        "3": "3"
    }
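
One possible way to produce such codes automatically is a small codebook that hands out integers in order of first use. This is only a sketch: the `Codebook` class and `decode_record` helper are hypothetical names, and the tuple position stands in for the field code (0 to 3) instead of storing "0".."3" as keys.

```python
class Codebook:
    """Assigns each distinct string a small integer, in order of first use."""

    def __init__(self):
        self.code_of = {}   # string -> code
        self.strings = []   # code -> string

    def encode(self, s):
        if s not in self.code_of:
            self.code_of[s] = len(self.strings)
            self.strings.append(s)
        return self.code_of[s]

    def decode(self, code):
        return self.strings[code]


FIELDS = ("ItemCategory", "ItemCluster", "ItemSubCluster", "SubCluster")
values = Codebook()

# Position i in each tuple corresponds to field code i, so field names
# are not stored per item at all.
potato = tuple(values.encode(v)
               for v in ("Vegetable", "Root vegetables", "Root", "WithOutSkin"))
onion = tuple(values.encode(v)
              for v in ("Vegetable", "Root vegetables", "Root", "WithSkin"))


def decode_record(encoded):
    """Turn a tuple of value codes back into a readable record."""
    return {field: values.decode(code) for field, code in zip(FIELDS, encoded)}


print(potato, onion)
print(decode_record(onion))
```

Values shared between items, such as "Vegetable", get the same code, so each distinct string is stored once in the codebook.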
    

    To optimise this further, you can also maintain an index of the vegetables themselves. For example, potato can be denoted by 0.

    So your final index becomes:

    "0": {
        "0": "0",
        "1": "1",
        "2": "2",
        "3": "3"
    }
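
A fully encoded index still needs a name-to-id table so that an input string can be resolved to its numeric id. The following is a minimal sketch of that final layout, assuming item ids are list positions; `item_ids`, `value_names`, `records`, and `classify` are illustrative names, and only three items are filled in.

```python
FIELDS = ("ItemCategory", "ItemCluster", "ItemSubCluster", "SubCluster")

# name -> item id (the only place item names are stored)
item_ids = {"potato": 0, "sweet potato": 1, "yam": 2}

# value code -> value string
value_names = ["Vegetable", "Root vegetables", "Root", "WithOutSkin"]

# item id -> tuple of value codes, one per field
records = [
    (0, 1, 2, 3),  # potato
    (0, 1, 2, 3),  # sweet potato
    (0, 1, 2, 3),  # yam
]


def classify(name):
    """Resolve an item name to its decoded classification, or None if unknown."""
    item_id = item_ids.get(name.strip().lower())
    if item_id is None:
        return None
    return {FIELDS[i]: value_names[code]
            for i, code in enumerate(records[item_id])}


print(classify("Potato"))
print(classify("garlic"))  # not in this small sample
```

For a table as small as the one in the question, the plain string-keyed map is probably simpler to maintain; the encoded form mainly pays off when there are many items sharing few distinct values.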
    
