The task is to find the number of lines in each file in the current directory, order the result and write to the file.
The problem is that I do not know how to identify all the text files in the folder.
I would suggest this somewhat perverse construction:
find . -maxdepth 1 -type f \
    -exec sh -c "file -bi '{}' | grep -q ^text/ && echo '{}'" \; \
    | xargs wc -l | head -n -1 | sort -gk 1 > line_counts.txt
What is going on here:
find . -maxdepth 1 -type f
- searches all regular files (-type f) in the current directory, without descending into subdirectories (-maxdepth 1)

-exec sh -c "…" \;
- for each file found, executes the command sh -c "…" (the "{}" is replaced with the file name). The point of this is that we cannot simply put a pipe inside find, it would not understand it, so we have to invoke a shell.

file -bi '{}'
- prints the MIME type of the file (-i), without printing the file name itself (-b, "brief"). This detection is not always accurate; see the notes below.

grep -q ^text/
- selects lines that start with "text/", but prints nothing (-q); only the exit code reports whether anything matched

&& echo '{}'
- if a match was found, the right-hand side of && runs and the file name is printed

xargs wc -l
- all incoming file names are passed as arguments to wc, which counts the lines (-l)

head -n -1
- cuts off the last line, which holds wc's grand total

sort -gk 1
- sorts numerically (-g) by the first field (-k 1)

Variations are possible. In particular, I think that matching only ^text/ is too restrictive (some files that are perfectly valid UTF-8 text have MIME types under application/*, while there is also application/octet-stream, which, generally speaking, need not be text at all), so something in the spirit of

file -b '{}' | grep -Fq ' text'

may work better. Also, if there are many files with long names, xargs will split them across several wc invocations, each of which prints its own totals line; in that case call wc once per file: "xargs -I '{}' wc -l '{}'".
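The variation above can be sketched end to end. This is a demo under assumptions, not the answer's exact command: it uses the looser "text" match on the `file -b` description, passes the name to sh as a positional argument instead of embedding '{}' in the command string (which sidesteps quoting problems), and relies on GNU head's negative -n count. All file names are invented for the demo.

```shell
# Demo in a throwaway directory (names here are illustrative).
dir=$(mktemp -d); cd "$dir" || exit 1
printf 'one\ntwo\n' > a.txt      # 2-line text file
printf 'x\ny\nz\n'  > b.txt      # 3-line text file
printf '\000\001\002' > blob.bin # binary junk, should be skipped

# Looser match: accept any file whose `file -b` description mentions "text".
# The name is passed as "$1" rather than spliced into the sh -c string.
find . -maxdepth 1 -type f \
    -exec sh -c 'file -b "$1" | grep -q text && printf "%s\n" "$1"' sh {} \; \
    | xargs wc -l \
    | head -n -1 \
    | sort -gk 1 > line_counts.txt   # head -n -1 (GNU) drops wc's totals line

cat line_counts.txt
```

With more than one matching file, wc emits a grand-total line, which is exactly what head -n -1 removes before sorting.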
Yes, I used mostly GNU utilities (GNU findutils, GNU coreutils, GNU grep), the exception being the BSD-derived file. Non-GNU systems may ship other implementations of these utilities, which may lack some of the options used here. In general, YMMV; when in doubt, consult the documentation.
All this, however, will break if some lover of the strange creates a file whose name contains a newline character (\n): the part of the pipe starting with xargs will then fall apart. To fix it, you would have to terminate each name with a NUL byte instead, say with && echo -e '\x00' (or something along those lines), and pass xargs the -0 (--null) option.
file * | grep text | awk '{ print $1 }' | tr -d : | sort -gk 1

— avp

Source: https://ru.stackoverflow.com/questions/163448/
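As quoted, that one-liner only produces the text-file names; to get the line counts the task asks for, one could feed them into wc -l, the same way as in the first answer. That extra step is my own addition, not part of the quoted answer, and the demo files are invented.

```shell
# Demo in a throwaway directory (names here are illustrative).
dir=$(mktemp -d); cd "$dir" || exit 1
printf 'a\nb\n' > one.txt    # 2-line text file
printf 'c\n'    > two.txt    # 1-line text file
printf '\000'   > junk.bin   # binary, should be skipped

# file prints "name: description"; awk takes the first field and
# tr strips the trailing colon, leaving bare file names for wc.
file * | grep text | awk '{ print $1 }' | tr -d : \
    | xargs wc -l | head -n -1 | sort -gk 1 > counts.txt

cat counts.txt
```

Note this variant inherits the same fragility as the first answer: file names with spaces or newlines will break the awk/xargs stages.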