Extract data and save in different output files

I have a data file with the following format:

aaa     0
bbb     1
ccc     2
ddd     ?
eee     0
fff     1
ggg     2
hhh     3
iii     ?
   ...

What I want to do is quite simple: split the data into separate files, where each file holds the lines from a 0 up to and including the next '?', so that I would obtain:

output_1.txt >

aaa     0
bbb     1
ccc     2
ddd     ?

output_2.txt >

eee     0
fff     1
ggg     2
hhh     3
iii     ?

And so on until the end of the input file is reached. I've looked into the awk command, but I'm not sure how to specify the conditions or how to name each output file according to the number of splits made so far.

2 answers

  • answered 2018-10-11 20:30 glenn jackman

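    You can redirect print statements in awk. Parenthesize the filename expression, because an unparenthesized concatenation after > is not parsed the same way by every awk:

    awk -v n=1 '{print > ("output_" n ".txt")} $2 == "?" {n++}' file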
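
    To see what the redirect does, here is a small check on the sample data from the question. It is only a sketch: the scratch directory and the input name `file` are assumptions made for the demonstration, not part of the original command.

    ```shell
    # Work in a scratch directory so the output files do not clutter anything.
    dir=$(mktemp -d)
    cd "$dir"

    # Recreate the sample input from the question (whitespace-separated columns).
    printf '%s\n' 'aaa     0' 'bbb     1' 'ccc     2' 'ddd     ?' \
                  'eee     0' 'fff     1' 'ggg     2' 'hhh     3' 'iii     ?' > file

    # Every line goes to the current file; a "?" in column 2 bumps the counter
    # afterwards, so the "?" line itself still lands in the file it terminates.
    awk -v n=1 '{print > ("output_" n ".txt")} $2 == "?" {n++}' file

    # Show each output file under a header line.
    for f in output_*.txt; do echo "== $f"; cat "$f"; done
    ```

    The order of the two rules matters: printing first and incrementing second is what keeps each '?' line at the end of its own block.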
    

    If the input produces many output files, awk can run out of file descriptors, so you may have to explicitly close each file once its block is finished:

    awk -v n=1 '
        {print > ("output_" n ".txt")}
        $2 == "?" {close("output_" n ".txt"); n++}
    ' file
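
    A rough way to exercise the closing variant is to feed it an input that yields more output files than a typical per-process open-file limit. The generated input below (3000 two-line blocks) is a made-up stress case, and the scratch directory is likewise an assumption for the demonstration.

    ```shell
    # Scratch directory for the generated input and the output files.
    dir=$(mktemp -d)
    cd "$dir"

    # Hypothetical stress input: 3000 two-line blocks, each ending in "?".
    awk 'BEGIN {for (i = 1; i <= 3000; i++) {print "x", 0; print "y", "?"}}' > file

    # Closing variant: at most one output file is open at any moment.
    awk -v n=1 '
        {print > ("output_" n ".txt")}
        $2 == "?" {close("output_" n ".txt"); n++}
    ' file

    # Count the output files that were produced.
    ls | grep -c '^output_'
    ```

    The final command should report 3000. Without the close() call, the result depends on the awk in use: some implementations stop with a "too many open files" style error, while others work around the limit at a cost in speed.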
    

    If I was feeling really DRY, I would write

    awk -v n=1 '
        function filename(n) {return "output_" n ".txt"} 
        {print > filename(n)} 
        $2 == "?" {close(filename(n++))}  # important, post-increment
    ' file
    

  • answered 2018-10-12 05:08 Ed Morton

    All you need is:

    awk 'NR==1 || prev=="?"{close(out); out="output_" (++cnt) ".txt"} {print > out; prev=$NF}' file
    

    The above will work with any awk in any shell on any UNIX system for any size of input file.