Few Unix commands are as famous as sed, grep, and awk. They get grouped together often, possibly because they have strange names and are powerful tools for parsing text. They also share some syntactical and logical similarities. And while they're all useful for parsing text, each has its specialties. This article examines the sed command, which is a stream editor.
I've written before about sed, as well as its distant relative ed. To get comfortable with sed, it helps to have some familiarity with ed because that helps you get used to the idea of buffers. This article assumes that you're familiar with the very basics of sed, meaning you've at least run the classic s/foo/bar/ style find-and-replace command.
Installing sed
If you're using Linux, BSD, or macOS, you already have GNU or BSD sed installed. These are independent reimplementations of the original sed command, and while they're similar, there are minor differences. This article has been tested on the Linux and NetBSD versions, so you can use whatever sed you find on your computer in this case, although for BSD sed you must use only short options (-n instead of --quiet, for instance).
GNU sed is generally regarded as the most feature-rich sed available, so you might want to try it whether or not you're running Linux. If you can't find GNU sed (often called gsed on non-Linux systems) in your ports tree, then you can download its source code from the GNU website. The nice thing about installing GNU sed is that you can use its extra functions but also constrain it to conform to the POSIX specification of sed (with its --posix option), should you require portability.
macOS users can find GNU sed on MacPorts or Homebrew.
On Windows, you can install GNU sed with Chocolatey.
Understanding pattern space and hold space
Sed works on exactly one line at a time. Because it has no visual display, it creates a pattern space, a space in memory containing the current line from the input stream (with any trailing newline character removed). Once the pattern space is populated, sed executes your instructions. When it reaches the end of the commands, sed prints the pattern space's contents to the output stream. The default output stream is stdout, but the output can be redirected to a file or even back into the same file using the --in-place=.bak option.
Then the cycle begins again with the next input line.
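To make the cycle concrete, here is a small sketch comparing sed's automatic end-of-cycle print with the -n option, which suppresses it. The input text and the variable names (input, doubled, single) are only illustrative. The p command prints the pattern space explicitly, so without -n every line appears twice, and with -n it appears once.

```shell
# Illustrative three-line input; any text works.
input=$(printf 'Line one\nLine three\nLine two')

# Default behavior: p prints the pattern space, and sed prints it again
# automatically at the end of each cycle, so each line appears twice.
doubled=$(printf '%s\n' "$input" | sed -e 'p')
printf '%s\n' "$doubled"

# With -n, the automatic print is suppressed, so only p produces output.
single=$(printf '%s\n' "$input" | sed -n -e 'p')
printf '%s\n' "$single"
```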
To provide a little flexibility as you scrub through files with sed, sed also provides a hold space (sometimes also called a hold buffer), a space in sed's memory reserved for temporary data storage. You can think of hold space as a clipboard, and in fact, that's exactly what this article demonstrates: how to copy/cut and paste with sed.
First, create a sample text file with this text as its contents:
Line one
Line three
Line two
Copying data to hold space
To place something in sed's hold space, use the h or H command. A lower-case h tells sed to overwrite the current contents of hold space, while a capital H tells it to append data to whatever's already in hold space.
Used on its own, there's not much to see:
$ sed --quiet -e '/three/ h' example.txt
$
The --quiet (-n for short) option suppresses sed's automatic printing of the pattern space, so the only output is what you explicitly tell sed to print. In this case, sed selects any line containing the string three and copies it to hold space. I've not told sed to print anything, so no output is produced.
Copying data from hold space
To get some insight into hold space, you can copy its contents into pattern space with the g command, which overwrites whatever is currently in pattern space. Watch what happens:
$ sed -n -e '/three/h' -e 'g;p' example.txt

Line three
Line three
The first blank line prints because the hold space is empty when it's first copied into pattern space.
The next two lines contain Line three because that's what's in hold space from line two onward.
This command uses two separate scripts (-e) purely to help with readability and organization. It can be useful to divide steps into individual scripts, but technically this command works just as well as one script statement:
$ sed -n -e '/three/h ; g ; p' example.txt

Line three
Line three
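The difference between h and H is easier to see when hold space is printed only once, at the end of the input. The following sketch assumes the example.txt file from earlier (recreated here so the snippet is self-contained) and uses the $ address, which matches the last input line. The variable names overwritten and appended are just for illustration.

```shell
# Recreate the article's sample file.
printf 'Line one\nLine three\nLine two\n' > example.txt

# h overwrites hold space on every line, so only the last line survives.
overwritten=$(sed -n -e 'h' -e '$ { g; p; }' example.txt)
printf '%s\n' "$overwritten"

# H appends a newline plus the current line each time. Hold space starts
# out empty, so the accumulated text begins with a blank line.
appended=$(sed -n -e 'H' -e '$ { g; p; }' example.txt)
printf '%s\n' "$appended"
```

The first command prints only Line two. The second prints a blank line followed by all three lines in their original order.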
Appending data to pattern space
The G command appends a newline character and the contents of the hold space to the pattern space.
$ sed -n -e '/three/h' -e 'G;p' example.txt
Line one

Line three
Line three
Line two
Line three
The first two lines of this output contain both the contents of the pattern space (Line one) and the empty hold space. The next two lines both read Line three: that input line matches the search text (three), so it is copied to hold space, and then G appends hold space to the pattern space. The hold space doesn't change for the third pair of lines, so the pattern space (Line two) prints with the hold space (still Line three) trailing at the end.
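A side effect worth knowing: if nothing has ever been copied to hold space, G appends just a newline. A common one-liner relies on this to double-space a file. The sketch below assumes the article's example.txt (recreated for completeness) and writes to a hypothetical output file named spaced.txt.

```shell
# Recreate the article's sample file.
printf 'Line one\nLine three\nLine two\n' > example.txt

# Hold space stays empty throughout, so G only adds a newline to each line.
# Without -n, sed auto-prints the result, which yields a blank line after
# every input line.
sed -e 'G' example.txt > spaced.txt
cat spaced.txt
```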
Doing cut and paste with sed
Now that you know how to juggle a string from pattern to hold space and back again, you can devise a sed script that copies, then deletes, and then pastes a line within a document. For example, the example file for this article has Line three out of order. Sed can fix that:
$ sed -n -e '/three/ h' -e '/three/ d' \
-e '/two/ G;p' example.txt
Line one
Line two
Line three
- The first script finds a line containing the string three and copies it from pattern space to hold space, replacing anything currently in hold space.
- The second script deletes any line containing the string three. This completes the equivalent of a cut action in a word processor or text editor.
- The final script finds a line containing two and appends the contents of hold space to pattern space. The p command that follows is not restricted to that line, so it prints the pattern space for every line that wasn't deleted.
Job done.
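The same cut and paste can be sketched with the two /three/ commands grouped in braces, so that the address is written only once. This is just an alternative spelling of the command above, not a different technique. The sample file is recreated so the snippet stands alone, and the variable name fixed is illustrative.

```shell
# Recreate the article's sample file.
printf 'Line one\nLine three\nLine two\n' > example.txt

# { h; d; } runs both commands on lines matching /three/:
# copy to hold space, then delete (which also starts the next cycle).
# On the /two/ line, G pastes hold space after it. The final p prints
# every line that was not deleted.
fixed=$(sed -n -e '/three/ { h; d; }' -e '/two/ G' -e 'p' example.txt)
printf '%s\n' "$fixed"
```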
Scripting with sed
Once again, the use of separate script statements is purely for visual and mental simplicity. The cut-and-paste command works as one script:
$ sed -n -e '/three/ h ; /three/ d ; /two/ G ; p' example.txt
Line one
Line two
Line three
It can even be written as a dedicated script file:
#!/usr/bin/sed -nf
/three/h
/three/d
/two/ G
p
To run the script, mark it executable and try it on your sample file:
$ chmod +x myscript.sed
$ ./myscript.sed example.txt
Line one
Line two
Line three
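The shebang line assumes that sed lives at /usr/bin/sed, which is not true on every system. As a hedged alternative, you can pass the script to whatever sed is on your PATH with the -f option. The sketch below recreates both files under the names used above; the variable name result is illustrative.

```shell
# Recreate the article's sample file.
printf 'Line one\nLine three\nLine two\n' > example.txt

# Write the script file. Quoting 'EOF' keeps the shell from expanding anything.
cat > myscript.sed <<'EOF'
#!/usr/bin/sed -nf
/three/h
/three/d
/two/ G
p
EOF

# -n must be given on the command line here, because the shebang line
# is only a comment when the file is read with -f.
result=$(sed -n -f myscript.sed example.txt)
printf '%s\n' "$result"
```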
Of course, the more predictable the text you need to parse, the easier it is to solve your problem with sed. It's usually not practical to invent "recipes" for sed actions (such as a copy and paste) because the condition to trigger the action is probably different from file to file. However, the more fluent you become with sed's commands, the easier it is to devise complex actions based on the input you need to parse.
The important things are recognizing distinct actions, understanding when sed moves to the next line, and predicting what the pattern and hold space can be expected to contain.
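One way to check such predictions is sed's l command, which prints the pattern space in an unambiguous form: embedded newlines appear as \n, and the end of the pattern space is marked with $. A minimal sketch, again assuming the article's example.txt and an illustrative variable name (seen):

```shell
# Recreate the article's sample file.
printf 'Line one\nLine three\nLine two\n' > example.txt

# After G runs on the /two/ line, the pattern space holds two lines
# joined by a newline. The l command makes that newline visible.
seen=$(sed -n -e '/three/ h' -e '/two/ { G; l; }' example.txt)
printf '%s\n' "$seen"
```

This prints Line two\nLine three$, confirming that the pattern space contains both lines before anything is written to the output stream.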
Download the cheat sheet
Sed is complex. It has only a couple dozen commands, yet its flexible syntax and raw power mean it's full of endless potential. I used to reference pages of clever one-liners in an attempt to get the most use out of sed, but it wasn't until I started inventing (and sometimes reinventing) my own solutions that I felt like I was starting to actually learn sed. If you're looking for gentle reminders of commands and helpful tips on syntax, download our sed cheat sheet, and start learning sed once and for all!