There was a time, in the ancient history of computing, when a computer had only one CPU. Today, your computer may still have only one physical CPU, but that CPU has multiple cores for processing data. When you run a command, you owe it to the brave system administrators of the past to put all those cores to good use. One way to honor those who suffered on single-core machines is to use GNU Parallel, the seemingly magical command that can execute a task on multiple files simultaneously.
Install parallel
On CentOS, RHEL, and Fedora, you can install GNU Parallel from your software repository:
$ sudo dnf install parallel
On CentOS and RHEL, you can sometimes find the latest version in EPEL.
Start Parallel for the first time
The first time you use GNU Parallel, it asks you to agree to cite it when you use Parallel in scientific research. Academic tradition requires you to cite the works on which you base your article. If you use programs that use GNU Parallel to process data for an article in a scientific journal, please cite:
Tange, O. (2022, August 22). GNU Parallel 20220822 (‘Rushdie’). Zenodo. https://doi.org/10.5281/zenodo.7015730
This citation helps fund further development, and it won’t cost you a dime. If you pay 10,000 EUR, you should feel free to use GNU Parallel without citing. See the GNU website for more information on GNU Parallel funding and the citation notice.
To silence the citation notice, run parallel --citation once. Read the notice and follow its instructions to silence it.
Pipe output to Parallel
Assuming you’re already familiar with essential Linux commands like find and ls, one of the easiest ways to get started with GNU Parallel is to feed it the results of a command you already understand. For example, suppose you want to move some log files (ignoring, for the moment, that you may be using logrotate or a similar tool in real life).
$ sudo find /var/log/ -type f -name "*.log" | \
sudo parallel mv {} ~/log-stash
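In this code, the curly braces ({}) are a placeholder for each search result: Parallel runs mv once for every file that find reports, substituting that file's path for {}.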
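If you would rather not experiment on /var/log, the same pattern can be tried in a throwaway directory. The following is a minimal sketch, not a recommendation for production log handling: the directory layout and file names are made up for illustration, --dry-run previews the commands before anything is moved, and find -print0 together with parallel -0 passes NUL-separated names so that unusual file names survive the pipe.

```shell
# Work in a temporary directory so nothing under /var/log is touched.
# The file names below are hypothetical examples.
workdir=$(mktemp -d)
mkdir -p "$workdir/logs" "$workdir/log-stash"
touch "$workdir/logs/app.log" "$workdir/logs/web server.log" "$workdir/logs/notes.txt"

# Preview: --dry-run prints each mv command instead of running it.
find "$workdir/logs" -type f -name "*.log" -print0 |
    parallel -0 --dry-run mv {} "$workdir/log-stash"

# Move the files for real. Only the two .log files match; notes.txt stays put.
find "$workdir/logs" -type f -name "*.log" -print0 |
    parallel -0 mv {} "$workdir/log-stash"

ls "$workdir/log-stash"
```

Parallel quotes whatever it substitutes for {}, so a name such as "web server.log" arrives at mv as a single argument.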
Learn parallel syntax
While existing Linux utilities can act as a convenient “front-end” for Parallel, you can also use the parallel command to build processes. The concept is straightforward, though the logic can sometimes get complex, depending on how many tasks you want to execute. Starting simply, here is a basic parallel command:
$ parallel echo {} ::: hello world
hello
world
Notice that the instruction is divided by three colons (:::), with the command on the left and the arguments on the right. If you try that command, you might get hello followed by world, or world followed by hello, depending on which process completes first.
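When the order of results matters, for example when a script consumes them, the -k (--keep-order) option makes Parallel print results in the order of the arguments rather than the order in which jobs finish. A short sketch, which also shows that the arguments can arrive on standard input instead of after the three colons:

```shell
# -k keeps the output in argument order, whichever job finishes first.
parallel -k echo {} ::: hello world

# The same arguments, one per line on standard input, give the same result.
printf '%s\n' hello world | parallel -k echo {}
```

Both commands print hello and then world, each on its own line.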
Suppose you want to convert some large media files. Instead of encoding the files one after the other, you can use GNU Parallel to launch separate instances of your encoder, each targeting a different codec:
$ parallel ffmpeg -i ~/Audio/file.flac ~/Audio/file.{} ::: ogg m4a opus
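Encoding jobs are expensive, so it can be worth previewing them before committing your cores. The sketch below uses --dry-run to print the ffmpeg commands without running them, so it works even without ffmpeg or the audio file present. The second command uses the {.} replacement string, which is the input with its extension removed, to convert several files to one format; one.flac and two.flac are hypothetical file names. The -j 2 limit in the commented-out line is an arbitrary choice for illustration.

```shell
# Print the three ffmpeg commands instead of running them.
parallel -k --dry-run ffmpeg -i ~/Audio/file.flac ~/Audio/file.{} ::: ogg m4a opus

# {.} is the argument without its extension: one.flac becomes one.ogg.
parallel -k --dry-run ffmpeg -i {} {.}.ogg ::: one.flac two.flac

# To run for real with at most two encodes at a time, drop --dry-run and add -j:
# parallel -j 2 ffmpeg -i ~/Audio/file.flac ~/Audio/file.{} ::: ogg m4a opus
```

Without -j, Parallel decides how many jobs to run at once based on the number of CPU cores it detects.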
Use multiple variables
Parallel is not limited to a single {} variable. You can provide several input sources and then refer to them by an index number that reflects the order in which they are listed. Compare this output:
$ parallel echo {1} {2} ::: hello linux ::: world sysadmin
hello world
hello sysadmin
linux world
linux sysadmin
In this code example, {1} indicates the first input “block” (hello and linux), while {2} indicates the second “block” (world and sysadmin). They do not have to appear in that order, nor are they limited to a single use:
$ parallel echo {2} {1} {2} ::: hello linux ::: world sysadmin
world hello world
sysadmin hello sysadmin
world linux world
sysadmin linux sysadmin
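By default, Parallel runs every combination of the input sources, and the lines can come back in a different order from one run to the next. The sketch below adds -k for a stable order, then uses --link to pair the inputs by position instead of combining them. It assumes a reasonably recent GNU Parallel release that supports --link.

```shell
# Every combination of the two sources, in a stable order:
# hello world, hello sysadmin, linux world, linux sysadmin
parallel -k echo {1} {2} ::: hello linux ::: world sysadmin

# --link pairs the first items together, then the second items:
# hello world, linux sysadmin
parallel -k --link echo {1} {2} ::: hello linux ::: world sysadmin
```

Pairing is handy when the sources are related lists, such as input files and their matching output names.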
Parallel processing
They say that with great power comes great responsibility, but ideally, with great power also comes great parallelization. The computer in front of you is probably more powerful than you need most of the time, so you can make your daily commands faster by taking advantage of cycles that would otherwise go to waste. Use GNU Parallel.