Linux find xargs command grep showing path and filename
find /folder/202205??/ -type f | xargs head -50| grep '^Starting'
The folders are named 20220501, 20220502, 20220503, and so on. This command searches the first 50 lines of every file under '/folder/202205??/' and shows the lines that begin with the text "Starting".
The output does not include the path and filename of the files matched by grep. How can I get the path, the filename, and the matched line with a simple command?
3 answers
-
answered 2022-05-06 21:18
Gordon Davisson
The main problem here is that head doesn't pass on the info about what lines came from which file, so grep can pick out the matching lines but not show the file name or path. awk can do the matching and trimming to 50 lines, and you can control exactly what gets printed for each match. So something like this:
find /folder/202205??/ -type f -exec awk '/^Starting/ {print FILENAME ": " $0}; (FNR>=50) {nextfile}' {} +
Explanation: the first clause in the awk script prints matching lines (prefixed by the FILENAME, which'll actually include the path as well), and the second skips to the next file when it gets to line 50. Also, I used find's -exec ... + feature instead of xargs, just because it's a bit cleaner (and won't run into trouble with weird filenames). Terminating the -exec command with + instead of \; makes it run the files in batches (like xargs) rather than one at a time.
-
answered 2022-05-07 03:48
RARE Kpop Manifesto
A relatively portable awk-based solution that provides built-in realpath variant detection, shell-safe single-quotation (and escaping) for filenames, and grep-like output format:
file-full-realpath:line-number:[matched line contents..]
————————————————————————————————————————
gfind 202…………/ -mindepth 1 -type f -not -empty -not -name ".*" -print0 |
xargs -0 -n 20 -P 16 dash -c 'nice [mg]awk -e '\''
# gawk profile, created Fri May 6 23:26:31 2022
# BEGIN rule(s)
BEGIN {
1 __=substr("grealpath", 2^0^system("exit \140 which "\
"grealpath | grep -m 1 -ce . \140 "))
1 FS="^Starting"
}
# Rule(s)
1020 50 < FNR { # 20
20 nextfile
}
1000 FNR == 1 { # 20
20 _ = getpath(FILENAME, __)
}
1000 -NF < -sub("^",(_)":"(FNR)":",$0) {
print
}
20 function getpath(_,____,__,___) {
20 return "-"==_ \
? "/dev/stdin" \
: substr((___=RS)*(RS="\0")*gsub(/\47/,"\47\134&\47",_), \
((__=(____)" -zePq \47"(_)"\47 ")|getline _)~"",
+__*close(__)^(RS=___))(_)
}'\'' "${@}" ' _
-
answered 2022-05-07 05:35
Marco
I am sure this is not perfect. But it might give some new ideas.
Be aware that filenames containing special characters such as newlines are not handled correctly by this solution!
while IFS=: read -r -a a; do [[ ${a[1]} -gt 50 ]] && break; printf "%s\n" "${a[0]}"; done < <( grep -rnH '^Starting' /folder/202205??/ | sort -t":" -k2,2n )
This bash snippet is written on one line, but pretty-printed it takes more than one:
while IFS=: read -r -a a; do
[[ ${a[1]} -gt 50 ]] && break
printf "%s\n" "${a[0]}"
done < <( grep -rnH '^Starting' /folder/202205??/ | sort -t":" -k2,2n )
grep can recurse through directories using -r, and shows the line number with -n and the filename with -H. The sort is done on the line number. The loop stops at the first line number greater than 50. Until then it prints the filename.
Depending on what you want, you can output the line number and/or the string found.
If you need the information inside something else, where the line number can be handled, the simple grep might lead you to a better solution:
grep -rnH '^Starting' /folder/202205??/
I am sure the output can be piped to something like awk, which stops the output if the number in the second field is greater than 50. Unfortunately I am no awk expert.
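One way that awk idea might look, as a minimal sketch: split each grep output line on colons and keep only the rows whose second field (the line number) is at most 50. The temporary directory and the file names a.log and b.log are invented for the demo, and the approach assumes that paths contain no colons. Note that grep still reads every file in full here; the awk stage only filters what it prints.

```shell
# Hypothetical demo tree standing in for /folder/202205??/
demo=$(mktemp -d)
mkdir -p "$demo/20220501"
printf 'Starting job A\nother line\n' > "$demo/20220501/a.log"
# In b.log the match sits on line 60, beyond the 50-line limit.
{ seq 1 59; echo 'Starting job B'; } > "$demo/20220501/b.log"

# grep prints path:line:text; awk keeps rows whose 2nd field is <= 50.
filtered=$(grep -rnH '^Starting' "$demo"/202205??/ | awk -F: '$2 <= 50')
printf '%s\n' "$filtered"
```

Only the a.log match should be printed, because the match in b.log is past line 50.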