How to remove the path from awk's input FILENAME variable - is basename available somehow?

The following command

gawk '{print $0, FILENAME}' input.txt > result.txt

where input.txt is:

FIXED3 LENGTH7      FILE FORMAT     00001
FIXED2 LENGTH8      FILE FORMAT     00002
FIXED2 LENGTH20     FILE FORMAT     00003
FIXED1 LENGTH20     FILE FORMAT     00004

produces the following desired result:

FIXED3 LENGTH7      FILE FORMAT     00001 input.txt
FIXED2 LENGTH8      FILE FORMAT     00002 input.txt
FIXED2 LENGTH20     FILE FORMAT     00003 input.txt
FIXED1 LENGTH20     FILE FORMAT     00004 input.txt

However, if I use a path to the file, as below:

gawk '{print $0, FILENAME}' /cygdrive/c/dev/data/input.txt > result.txt

Then the FILENAME appended to each line also includes the path, as shown below. This is what I want to correct: I would like the same result as in the first scenario above.

FIXED3 LENGTH7      FILE FORMAT     00001 /cygdrive/c/dev/data/input.txt
FIXED2 LENGTH8      FILE FORMAT     00002 /cygdrive/c/dev/data/input.txt
FIXED2 LENGTH20     FILE FORMAT     00003 /cygdrive/c/dev/data/input.txt
FIXED1 LENGTH20     FILE FORMAT     00004 /cygdrive/c/dev/data/input.txt

5 answers

  • answered 2019-03-13 19:06 James Brown

    This is one way:

    $ gawk '{f=FILENAME; sub(/^.*\//,"",f); print $0, f}' ../here/file
    FIXED3 LENGTH7      FILE FORMAT     00001 file
    FIXED2 LENGTH8      FILE FORMAT     00002 file
    FIXED2 LENGTH20     FILE FORMAT     00003 file
    FIXED1 LENGTH20     FILE FORMAT     00004 file
    

    Explained:

    $ gawk '{
        f=FILENAME          # copy the filename to f
        sub(/^.*\//,"",f)   # edit the copy: remove everything from the beginning to the last /
        print $0, f         # print the record followed by the stripped filename
    }' ../here/file
    

    Or, since you mentioned gawk, use gensub, which returns the modified string instead of editing a variable:

    $ gawk '{print $0, gensub(/^.*\//,"",1,FILENAME)}' ../here/file
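    As a quick check of how that regex behaves on different kinds of paths, here is a minimal sketch. The helper name strip_dir is hypothetical and only used for illustration; it applies the same sub() call to each path read from standard input, using plain awk so it should behave the same under gawk.

```shell
# strip_dir is a hypothetical helper: reads one path per line,
# prints the part after the last "/" (or the whole line if there is none).
strip_dir() {
    awk '{ f = $0; sub(/^.*\//, "", f); print f }'
}

stripped=$(printf '%s\n' '../here/file' '/cygdrive/c/dev/data/input.txt' 'input.txt' | strip_dir)
printf '%s\n' "$stripped"
```

    Because .* is greedy, the match runs to the last /, and a name without any / is left untouched.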
    

  • answered 2019-03-13 19:11 Cyrus

    Split FILENAME on / into an array and output the last element of the array:

    awk '{n=split(FILENAME,array,"/"); print $0, array[n]}' /cygdrive/c/dev/data/input.txt
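    To see what split returns, here is a small sketch that runs it on two literal strings instead of FILENAME (the array name parts is just an illustration). It shows that the last element is the bare filename both for an absolute path and for a name with no / at all.

```shell
# Run split() on literal paths inside BEGIN, so no input file is needed.
split_demo=$(awk 'BEGIN {
    # A leading "/" produces an empty first element, so n is 6 here.
    n = split("/cygdrive/c/dev/data/input.txt", parts, "/")
    print n, parts[n]
    # With no "/" present, the whole string is the single element.
    n = split("input.txt", parts, "/")
    print n, parts[n]
}')
printf '%s\n' "$split_demo"
```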
    

  • answered 2019-03-14 02:26 Tiw

    A little tweak for efficiency and for conciseness:

    gawk 'FNR==1{f=gensub(".*/","",1,FILENAME)} $(NF+1)=f'
    

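    This extracts the filename (f here) only once per file, by running that part on the first line of each file.
    Since the filename won't be empty, the assignment $(NF+1)=f is always true as a pattern, so it appends f to the line and the implied {print $0} prints it.

    However, assigning to a field makes awk rebuild the line with OFS, so the separators in the output change wherever the input does not use a single space.
    Use the one below if that's not what you want: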

    gawk 'FNR==1{f=gensub(".*/","",1,FILENAME)}{print $0 OFS f}'
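    The difference between the two variants is easiest to see on a line with runs of spaces, like the ones in the question. The sketch below is an illustration only: it uses a temporary directory from mktemp and swaps gensub for a portable sub() on a copy, so it does not require gawk.

```shell
tmpdir=$(mktemp -d)
printf 'FIXED3 LENGTH7      FILE FORMAT     00001\n' > "$tmpdir/input.txt"

# Variant 1: assigning to $(NF+1) rebuilds $0, joining the fields with OFS.
rebuilt=$(awk 'FNR==1 { f = FILENAME; sub(/.*\//, "", f) } { $(NF+1) = f; print }' "$tmpdir/input.txt")

# Variant 2: $0 is never modified, so the original spacing is kept.
kept=$(awk 'FNR==1 { f = FILENAME; sub(/.*\//, "", f) } { print $0 OFS f }' "$tmpdir/input.txt")

printf '%s\n%s\n' "$rebuilt" "$kept"
rm -rf "$tmpdir"
```

    The first line printed has single spaces between all fields, while the second keeps the column alignment of the input.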
    

  • answered 2019-03-14 07:31 RavinderSingh13

    Could you please try the following. It extracts the exact Input_file name only on the first line of each file and does NOT run on every line of the file.

    awk 'FNR==1{if(FILENAME~/\//){sub(/.*\//,"",FILENAME)}} {print $0,FILENAME}' Input_file
    

    Possible benefits of this approach:

    1- It does NOT generate the edited filename on each line; it gets it on the first line and simply prints it on all other lines.

    2- NO array/memory placeholder is created, so this should be FAST on huge files too.

    3- Since I am simply printing it and not creating an additional column with the filename values, that will also save time when running this code.
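    Because FNR restarts at 1 for every input file, the same idea also works when several files are passed at once. The following is a rough sketch under a few assumptions: the directory and file names are made up, and it copies FILENAME into a separate variable (name) instead of editing FILENAME in place, which is only a cautious choice and not a requirement of the approach.

```shell
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/a" "$tmpdir/b"
printf 'x\ny\n' > "$tmpdir/a/one.txt"
printf 'z\n' > "$tmpdir/b/two.txt"

# FNR==1 fires once per file, so the name is recomputed only when the file changes.
multi=$(awk 'FNR==1 { name = FILENAME; sub(/.*\//, "", name) } { print $0, name }' \
    "$tmpdir/a/one.txt" "$tmpdir/b/two.txt")

printf '%s\n' "$multi"
rm -rf "$tmpdir"
```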



    EDIT: I just had another thought: you could simply change to the directory where Input_file is located and come back within the one-liner itself, as in the following example. IMHO, I hope this will be the FASTEST among all the solutions mentioned here (since we are NOT doing any data manipulation, and moreover we are using the same command you used previously :) )

    cd  /cygdrive/c/dev/data/ && awk '{print $0,FILENAME}' input.txt && cd -
    

    The speciality of this command is that it returns you to the original directory where you were running the code, so you will never feel like you navigated anywhere :)
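    One thing to keep in mind is that cd - prints the directory it switches back to on standard output, and it is skipped if awk fails. A possible alternative, sketched here with a temporary directory standing in for the real data directory, is to run the cd inside a subshell, so the calling shell never changes directory at all.

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/input.txt"

before=$PWD
# The parentheses start a subshell; its cd does not affect the calling shell.
(cd "$tmpdir" && awk '{print $0, FILENAME}' input.txt) > "$tmpdir/result.txt"
after=$PWD

result=$(cat "$tmpdir/result.txt")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

    Here before and after hold the same directory, and the result file contains only the awk output.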

  • answered 2019-03-14 08:55 stack0114106

    Another awk, using / as the field separator:

    gawk -F"/"  ' { printf("%s ",$0) ; $0=FILENAME } { print $NF } ' /home/full/path/input.txt
    

    With your given inputs:

    $ cat /cygdrive/c/dev/data/input.txt
    FIXED3 LENGTH7      FILE FORMAT     00001
    FIXED2 LENGTH8      FILE FORMAT     00002
    FIXED2 LENGTH20     FILE FORMAT     00003
    FIXED1 LENGTH20     FILE FORMAT     00004
    
    $ gawk -F"/"  ' { printf("%s ",$0) ; $0=FILENAME } { print $NF } ' /cygdrive/c/dev/data/input.txt
    FIXED3 LENGTH7      FILE FORMAT     00001 input.txt
    FIXED2 LENGTH8      FILE FORMAT     00002 input.txt
    FIXED2 LENGTH20     FILE FORMAT     00003 input.txt
    FIXED1 LENGTH20     FILE FORMAT     00004 input.txt
    