Extracting only some keys from a JSON file in Bash

I have a JSON file that looks like this:

    "data": { stuff that i need }
    "stuff not needed": {more stuff not needed}
    "data": {more stuff that i need}

In short, the content I need is inside the curly braces of each "data" key. How can I print it with a Linux shell command? Note that there are several "data" objects in my file, and I would like to extract the data from each of them, one at a time.

The intended output would look like this:

    data {...}
    data {...}

3 answers

  • answered 2019-03-14 08:10 funkyjelly

    As others suggested, you should really use the jq tool for parsing JSON. However, if you don't have access to it or can't install it, below is a very simple way that treats the JSON as raw text (not recommended) and produces something close to the output you want:

     grep "\"data\":" json_file | tr -d \"

  • answered 2019-03-14 08:31 David C. Rankin

    You can simply use awk with a field separator of "{", then use substr with length($2) - 1 to trim the closing "}".

    For example with your data:

    $ awk -F"{" '/^[ ]*"data"/{print substr($2, 1, length($2)-1)}' json
     stuff that i need
    more stuff that i need

    (note: you can trim the leading space before "stuff" in the 1st line if needed)

    Quick Explanation

    • awk -F"{" invoke awk with a field-separator of '{',
    • /^[ ]*"data"/ locate only lines beginning with zero-or-more spaces followed by "data",
    • print substr($2, 1, length($2)-1) print the substring of the 2nd field from the first character to the length-1 character removing the closing '}'.
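
    The note about trimming the leading space can be sketched as follows. This is a minimal example, assuming input laid out like the sample in the question; the temporary file and the function name extract_data are illustrative and not part of the original answer.

```shell
#!/bin/sh
# Hypothetical sample file mirroring the layout shown in the question.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
    "data": { stuff that i need }
    "stuff not needed": {more stuff not needed}
    "data": {more stuff that i need}
EOF

# Same match and field split as the one-liner above, plus a gsub() that
# strips leading and trailing blanks from the extracted text.
extract_data() {
    awk -F"{" '/^[ ]*"data"/ {
        s = substr($2, 1, length($2) - 1)   # drop the closing brace
        gsub(/^[ \t]+|[ \t]+$/, "", s)      # trim surrounding whitespace
        print s
    }' "$1"
}

extract_data "$sample"
```

    On the sample above this prints the two "data" payloads without the surrounding spaces.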

    bash Solution

    With bash you can loop over each line looking for a line beginning with "data" and then use a couple of simple parameter expansions to remove the unwanted parts of the line from each end. For instance:

    $ while read -r line; do
        [[ $line =~ ^\ *\"data\" ]] && {
            line="${line#*\{}"
            line="${line%\}*}"
            echo $line
        }
    done <json

    (With your data in a file named json, you can just copy/paste the loop into a terminal)

    Example Use/Output

    $ while read -r line; do
    >     [[ $line =~ ^\ *\"data\" ]] && {
    >         line="${line#*\{}"
    >         line="${line%\}*}"
    >         echo $line
    >     }
    > done <json
    stuff that i need
    more stuff that i need

    (note: because $line is left unquoted in echo, bash's default word splitting trims the leading and trailing whitespace for you)

    While you can do it with awk and bash, any serious JSON manipulation should be done with the jq utility.
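
    For comparison, here is a rough sketch of the jq approach. It rests on an assumption the question does not confirm: that the real file is valid JSON shaped as an array of objects, some of which have a "data" key. The sample contents and the function name extract_with_jq are hypothetical.

```shell
#!/bin/sh
# Hypothetical, valid-JSON stand-in for the file in the question
# (assumed shape: an array of single-key objects).
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
[
  {"data": {"id": 1}},
  {"stuff not needed": {"id": 2}},
  {"data": {"id": 3}}
]
EOF

# Iterate over the array, keep only objects that have a "data" key,
# and print each "data" value on its own line (-c = compact output).
extract_with_jq() {
    jq -c '.[] | select(has("data")) | .data' "$1"
}

if command -v jq >/dev/null 2>&1; then
    extract_with_jq "$sample"
else
    echo "jq is not installed" >&2
fi
```

    If the real file has a different shape, the filter needs adjusting; the select(has("data")) step is the part that skips the objects that are not needed.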

  • answered 2019-03-14 12:00 Walter A

    With the given input, you can use

    sed -rn 's/.*"(data)": (.*)/\1 \2/p' inputfile
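
    A small sketch of what that substitution does, assuming input laid out like the sample in the question. The temporary file and the function name print_data are illustrative; -E is used here instead of -r because both GNU and BSD sed accept it.

```shell
#!/bin/sh
# Hypothetical sample file mirroring the layout shown in the question.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
    "data": { stuff that i need }
    "stuff not needed": {more stuff not needed}
    "data": {more stuff that i need}
EOF

# -n suppresses default printing; the trailing p prints only lines where
# the substitution succeeded. Group 1 captures the key name, group 2
# captures everything after the colon and space.
print_data() {
    sed -En 's/.*"(data)": (.*)/\1 \2/p' "$1"
}

print_data "$sample"
# prints:
# data { stuff that i need }
# data {more stuff that i need}
```

    Lines whose key is not "data" never match the pattern, so they are dropped.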