Code and comments

Practical and theoretical aspects of software development

Using find and grep.

Back in August of last year, for one of the first posts on my brand-new blog, I put up the slides from a lunch-time presentation that I gave on find, grep, sed, and awk. I put them up as a reference for those who attended the talk, even though the slides weren’t designed for stand-alone use, and slides make a poor Internet reference.

Given the surprising popularity of that post, it seemed reasonable to repackage the information in a format that is more fitting for a blog post.

find and grep are incredibly powerful. But many never learn a fraction of what they can do. Did you know that grep can print the surrounding lines? Or that you can search for files based on permissions? Here’s a collection of examples that might expand your notion of what can be done with find and grep.

Most find commands follow the pattern:
$ find [PATH] [expression]

where the expression consists of one or more of the options shown below. find is recursive by default so the command $ find . would output the entire directory tree at the current location.

Find files named in src directory:
$ find src -name

Find case insensitive in current directory — matches README, readme, ReadMe, etc.
$ find . -iname readme

Find by wildcard name. NB: quotes keep the shell from expanding wildcards, so it is best to always include them.
$ find /etc -name '*xml'

Some exclusions (-not can be replaced with !). Options such as -maxdepth belong before the tests; GNU find prints a warning otherwise.
$ find . -maxdepth 4 -not -name '*java'

Find with or
$ find . -name '*java' -o -name '*xml'
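One thing to watch when -o is combined with other tests: the implicit "and" binds more tightly than -o, so the alternatives usually need escaped parentheses. A small sketch on a throwaway directory (the file names are invented for the demo):

```shell
# Build a scratch tree; every name here is made up for illustration.
demo=$(mktemp -d)
mkdir "$demo/pkg.java"                       # a directory whose name ends in java
touch "$demo/App.java" "$demo/pom.xml" "$demo/notes.txt"

# Without the parentheses this would parse as
#   (-type f -name '*java') -o (-name '*xml')
# With them, -type f applies to both name tests.
or_matches=$(cd "$demo" && find . -type f \( -name '*java' -o -name '*xml' \) | sort)
echo "$or_matches"

rm -rf "$demo"
```

The directory pkg.java is filtered out by -type f, and notes.txt matches neither pattern.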

There are several ways to act on the results of your find. I prefer to use the more intuitive xargs rather than -exec, as the -exec syntax is strange. But -exec lets you choose whether the second command is executed once for all results (+) or once per result (\;).
$ find . -name '*.java' | xargs wc -l | sort
$ find . -name '*.java' -exec wc -l {} \; | sort
$ find . -name '*.java' -exec wc -l {} + | sort
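One caveat with the xargs form: a file name containing a space is split into several arguments. Both GNU and BSD versions of find and xargs offer -print0 and -0, which pass names separated by NUL bytes instead. A sketch, assuming a made-up file name with a space in it:

```shell
demo=$(mktemp -d)
printf 'one\ntwo\n' > "$demo/My File.java"   # hypothetical name containing a space

# -print0 and -0 hand names over NUL-separated, so the space survives.
xargs_lines=$(find "$demo" -name '*.java' -print0 | xargs -0 wc -l | awk '{print $1}')

# -exec ... + needs no extra care: find passes each name as a single argument.
exec_lines=$(find "$demo" -name '*.java' -exec wc -l {} + | awk '{print $1}')

echo "$xargs_lines $exec_lines"
rm -rf "$demo"
```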

It is often useful to filter by type. Using find is more reliable than parsing ls.
$ find . -type f
$ find . -type d
$ find . -type l

To find all directories in the current directory:
$ find . -maxdepth 1 -type d
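Note that a depth limit of 1 still includes the starting point itself, so "." shows up in the output. If that is unwanted, -mindepth 1 (supported by GNU and BSD find) skips it. A sketch with invented directory names:

```shell
demo=$(mktemp -d)
mkdir -p "$demo/a/deep" "$demo/b"            # made-up directories
touch "$demo/file.txt"

# -mindepth 1 drops "." itself; -maxdepth 1 stops find from descending into a/deep.
top_dirs=$(cd "$demo" && find . -mindepth 1 -maxdepth 1 -type d | sort)
echo "$top_dirs"

rm -rf "$demo"
```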

Do you want to see the files that have changed in the last 24 hours?
$ find . -mtime -1

Or the files that have changed in the last 15 minutes?
$ find . -mmin -15

Sometimes it is convenient to find the files that are newer (or older) than a certain file.
$ find . -newer foo.txt
$ find . ! -newer foo.txt

Or the files that have been modified after a certain date, or between two dates.

$ find . -type f -newermt '2010-01-01'
$ find . -type f -newermt '2010-01-01' ! -newermt '2010-06-01'
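To try the time-based tests without waiting for files to age, touch -t can set a modification time by hand. A sketch with three invented files and arbitrary dates, assuming a find that supports -newermt (GNU and BSD find both do):

```shell
demo=$(mktemp -d)
# touch -t takes [[CC]YY]MMDDhhmm; the dates below are arbitrary.
touch -t 200912150000 "$demo/old.txt"
touch -t 201003010000 "$demo/mid.txt"
touch -t 201107040000 "$demo/new.txt"

# Modified on or after 2010-01-01 but before 2010-06-01.
in_range=$(cd "$demo" && find . -type f -newermt '2010-01-01' ! -newermt '2010-06-01')

# Modified more recently than a reference file.
newer_than_mid=$(cd "$demo" && find . -type f -newer mid.txt)

echo "$in_range"
echo "$newer_than_mid"
rm -rf "$demo"
```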

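You can also filter by size. Note that - finds smaller files and + finds larger files. GNU find rounds sizes up to the given unit before comparing, so -size -1k matches only empty files; a byte count (the c suffix) finds everything under 1 KiB.
$ find . -size -1024c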
$ find . -size +100M
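A quick way to check a size filter is to build files of known sizes. The sketch below uses three invented files, a byte count for the "smaller than" test, and a kilobyte unit for the "larger than" test:

```shell
demo=$(mktemp -d)
: > "$demo/empty.dat"                        # 0 bytes
printf 'hello' > "$demo/small.dat"           # 5 bytes
head -c 4096 /dev/zero > "$demo/big.dat"     # 4096 bytes

# c means bytes: everything strictly smaller than 1024 bytes.
under_1k=$(cd "$demo" && find . -type f -size -1024c | sort)

# k means 1024-byte units: everything larger than 2 of them.
over_2k=$(cd "$demo" && find . -type f -size +2k)

echo "$under_1k"
echo "$over_2k"
rm -rf "$demo"
```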

I don’t know that I’ve ever included permissions in a search, but it could be useful.
$ find . -perm 644
$ find . -perm -u=w
$ find . -perm -ug=x
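The difference between the forms: a bare mode such as 644 must match exactly, while a leading - means "at least these bits are set". A sketch with three invented files and hand-picked modes:

```shell
demo=$(mktemp -d)
touch "$demo/exact.txt" "$demo/script.sh" "$demo/readonly.txt"   # made-up names
chmod 644 "$demo/exact.txt"
chmod 755 "$demo/script.sh"
chmod 444 "$demo/readonly.txt"

# Exactly rw-r--r--
exact_644=$(cd "$demo" && find . -type f -perm 644)

# Writable by the owner, whatever the other bits are.
user_writable=$(cd "$demo" && find . -type f -perm -u=w | sort)

# Executable by both owner and group.
ug_exec=$(cd "$demo" && find . -type f -perm -ug=x)

echo "$exact_644"; echo "$user_writable"; echo "$ug_exec"
rm -rf "$demo"
```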

A grep command typically takes the form
$ grep [options] [string/regex] [file or path]

For example:
$ grep 'new Account' *.java

Note that you do not quote the file wildcard for grep; in this case we want the shell to expand it into a list of files.

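grep is not recursive by default; the -r flag changes that. Alternation needs parentheses and the -E flag (extended regular expressions), since square brackets match only a single character from a set.
$ grep -rE 'Dao(Impl|Mock)' src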
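To see how alternation differs from a bracket expression, here is a sketch on an invented source tree (the class and file names are made up). It also uses -l, which lists only the names of files that contain a match:

```shell
demo=$(mktemp -d)
mkdir -p "$demo/src/dao"
printf 'class AccountDaoImpl {}\n'    > "$demo/src/dao/AccountDaoImpl.java"
printf 'class AccountDaoMock {}\n'    > "$demo/src/dao/AccountDaoMock.java"
printf 'class AccountDaoManager {}\n' > "$demo/src/dao/AccountDaoManager.java"

# Extended regex: Dao followed by the whole word Impl or Mock.
alt_files=$(cd "$demo" && grep -rlE 'Dao(Impl|Mock)' src | sort)

# Bracket expression: Dao followed by any ONE of the characters I m p l | M o c k,
# so DaoManager matches too.
bracket_files=$(cd "$demo" && grep -rl 'Dao[Impl|Mock]' src | sort)

echo "$alt_files"
echo "$bracket_files"
rm -rf "$demo"
```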

I commonly use the following flags:
-i ignores case
-w restricts to whole-word matches
-n outputs line numbers
-c outputs a count of matching lines
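These flags combine freely. A sketch on an invented four-line file:

```shell
demo=$(mktemp -d)
cat > "$demo/notes.txt" <<'EOF'
Account created
account closed
accounting report
no match here
EOF

# -i with -c: number of lines containing "account" in any case.
ci_count=$(grep -ic 'account' "$demo/notes.txt")

# -i, -w and -n: whole-word matches only, prefixed with line numbers.
# "accounting" is skipped because the match must be a complete word.
word_lines=$(grep -iwn 'account' "$demo/notes.txt")

echo "$ci_count"
echo "$word_lines"
rm -rf "$demo"
```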

Grepping for multiple terms is useful in theory, but I rarely do it:
$ grep -e foo -e bar baz.txt
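Multiple -e patterns are combined as "or": a line matches if it contains any of them. For "and", one option is to pipe one grep into another. A sketch with an invented baz.txt:

```shell
demo=$(mktemp -d)
printf 'foo one\nbar two\nqux three\nfoo and bar\n' > "$demo/baz.txt"

# Lines containing foo OR bar.
either=$(grep -c -e foo -e bar "$demo/baz.txt")

# Lines containing foo AND bar: filter twice.
both=$(grep foo "$demo/baz.txt" | grep -c bar)

echo "$either $both"
rm -rf "$demo"
```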

Whether searching through source code or log files, the ability to display the surrounding lines is often useful. For example:
$ grep -r -A 2 foo src

will display the lines that contain foo and the two subsequent lines. Similarly, you can use
-B for lines before the match
-C for context (lines both before and after the match)
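The three context flags side by side, as a sketch on an invented six-line log file:

```shell
demo=$(mktemp -d)
printf 'line1\nline2\nfoo here\nline4\nline5\nline6\n' > "$demo/app.log"

after=$(grep -A 2 foo "$demo/app.log")    # the match plus 2 lines after it
before=$(grep -B 1 foo "$demo/app.log")   # 1 line before, then the match
around=$(grep -C 1 foo "$demo/app.log")   # 1 line on each side of the match

echo "$after"
rm -rf "$demo"
```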

That’s all for now. I hope that this helps you find something soon.


Written by Eric Wilson

January 31, 2012 at 12:15 pm

Posted in how-to

