Recently, a friend innocently asked me how many file formats there are. My semi-serious response was, "Think of a soup bowl filled with beach sand."
OK, there aren't quite that many file formats. That said, you've probably never heard of many of the formats that are commonly used enough to warrant listing on Wikipedia. Chances are, you'll never see and never use most of them. If, however, you want or need to convert between file formats, then there are quite a few applications for the job.
Let's take a look at three solid file conversion tools for the Linux command line.
Pandoc
Everyone I know who works with markup languages says Pandoc is the go-to utility for converting between those languages. And for good reason: Pandoc not only does some pretty nifty conversions, it's fast, too.
Have a file formatted with Markdown that you want to convert to a LibreOffice Writer document? How about a LaTeX document that you want to turn into an EPUB? Or maybe you have an HTML file that you want to turn into a slide deck. Pandoc is up to all of those tasks. And more.
Here's how to use Pandoc for a simple conversion (in this case, from HTML to reStructuredText):
pandoc -t rst myFile.html -o myFile.rst
You're not just limited to straight conversions. You can, for example, add a table of contents, typographic quotes, custom headers, and syntax highlighting to the resulting file. Take a peek at Pandoc's documentation for details.
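To make one of those extras concrete, here's a minimal sketch that converts a Markdown file to a standalone HTML page with a table of contents. The function name, the file names, and the DRY_RUN switch are my own illustrative assumptions, not part of Pandoc; only --standalone, --toc, and -o are actual Pandoc options.

```shell
# Hypothetical helper: Markdown in, standalone HTML with a table of contents out.
md_to_html_with_toc() {
    input="$1"
    output="${input%.md}.html"   # swap the .md extension for .html
    # --standalone wraps the result in a complete document (head and body);
    # --toc asks Pandoc to build a table of contents from the headings.
    # DRY_RUN is this sketch's own convention: when it is set to a non-empty
    # value, the command is printed instead of being run.
    ${DRY_RUN:+echo} pandoc --standalone --toc "$input" -o "$output"
}
```

Calling md_to_html_with_toc notes.md would write notes.html next to the source file. Setting DRY_RUN=1 first lets you check the command before anything gets converted.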
Pandoc, however, is built mainly for text-based markup. What happens if you need to convert a binary office file, such as a word processor document or a slide deck? Help at the command line comes from an unexpected source.
LibreOffice
You're probably thinking, "Hold on! LibreOffice is a GUI application." Yes, it is. But what many people don't know is that you can run LibreOffice from the command line to quickly convert one or more files.
How? To transform a LibreOffice Impress slide deck to PDF, for example, you'd type the following:
soffice --headless --convert-to pdf mySlides.odp
You'd just replace pdf with the extension of whatever file format you want to convert to. The --headless option, in case you're wondering, stops an empty LibreOffice window from opening on your desktop.
Using LibreOffice at the command line to convert a single file is overkill. However, turning to the command line is a great way to convert several files at once. If, say, you want to convert all of the Microsoft Word documents in a folder to LibreOffice Writer format, you'd type:
soffice --headless --convert-to odt *.docx
The conversion takes far less time than opening all of those files in LibreOffice Writer and doing the conversion manually.
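If you'd rather keep the converted files apart from the originals, a small wrapper along these lines might help. It's only a sketch: the function name, the folder arguments, and the DRY_RUN switch are assumptions of mine, while --outdir is LibreOffice's own option for choosing where the output goes.

```shell
# Hypothetical helper: convert every .docx file in one folder to .odt and
# write the results to a separate folder.
docx_to_odt() {
    src_dir="$1"
    out_dir="$2"
    # Create the output folder up front (this happens even in a dry run).
    mkdir -p "$out_dir"
    # --outdir tells LibreOffice where to put the converted files.
    # DRY_RUN is this sketch's own convention: when it is set to a non-empty
    # value, the command is printed instead of being run.
    ${DRY_RUN:+echo} soffice --headless --convert-to odt --outdir "$out_dir" "$src_dir"/*.docx
}
```

For example, docx_to_odt reports converted would read the Word documents in a folder named reports and write the Writer versions to a folder named converted. Note that the sketch assumes the source folder contains at least one .docx file.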
FFmpeg
Whereas Pandoc is the Swiss Army Knife for converting between markup languages, FFmpeg is Pandoc's opposite number for audio and video formats.
FFmpeg is a set of libraries and executables that give you the ability to convert seamlessly between nearly any audio and video formats.
Here's an example of a simple conversion of a video from AVI to Ogg Theora:
ffmpeg -i myVideo.avi myVideo.ogg
FFmpeg can do a lot more than that. You can set the frame rate of videos and add subtitles to them, change the aspect ratio, change the quality of audio, and more.
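As a rough illustration of a few of those options in one place, here's a sketch that sets the frame rate, the aspect ratio, and the audio bitrate in a single conversion. The function name, the specific values, and the DRY_RUN switch are assumptions picked for the example; -r, -aspect, and -b:a are FFmpeg's own options.

```shell
# Hypothetical helper: convert a video while adjusting a few common settings.
convert_video() {
    input="$1"
    output="$2"
    # -r 24         sets the output frame rate to 24 frames per second
    # -aspect 16:9  sets the display aspect ratio
    # -b:a 128k     sets the audio bitrate, one lever for audio quality
    # DRY_RUN is this sketch's own convention: when it is set to a non-empty
    # value, the command is printed instead of being run.
    ${DRY_RUN:+echo} ffmpeg -i "$input" -r 24 -aspect 16:9 -b:a 128k "$output"
}
```

You'd call it as convert_video myVideo.avi myVideo.ogg. The values are placeholders, so adjust them to suit your own files.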
The command line can get quite crowded with those options, should you choose to use more than a couple of them. It's easy to forget the options, especially if you only use FFmpeg every so often. Take it from an old technical writer: There's no shame in reading the documentation.
Do you have a favorite command-line file conversion tool? Feel free to share it by leaving a comment below.