AWK command in Unix/Linux with examples – GeeksforGeeks

Awk is a scripting language used for manipulating data and generating reports. The awk programming language requires no compilation and lets the user work with variables, numeric functions, string functions, and logical operators.

Awk is a utility that lets a programmer write small but effective programs in the form of statements that define the text patterns to search for in each line of a document and the action to take when a match is found. Awk is mainly used for pattern scanning and processing. It searches one or more files for lines that match the specified patterns and then performs the associated actions.

Awk is abbreviated from the names of its developers: Aho, Weinberger, and Kernighan.

WHAT CAN WE DO WITH AWK?

1. AWK operations: (a) Scans a file line by line (b) Divides each input line into fields (c) Compares lines/input fields with the pattern (d) Performs actions on matching lines

2. Useful for: (a) Transforming data files (b) Producing formatted reports

3. Programming constructs: (a) Format output lines (b) Arithmetic and string operations (c) Conditionals and loops
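As a rough illustration of these constructs working together, the sketch below builds a small formatted report with printf, arithmetic on a field, and a conditional. The sample records, the path /tmp/awk_report_demo.txt, the helper name salary_report, and the 40000 threshold are all assumptions made up for this illustration.

```shell
# Hypothetical sample data: name, role, department, salary
printf '%s\n' \
  'ajay manager account 45000' \
  'sunil clerk account 25000' \
  'varun manager sales 50000' > /tmp/awk_report_demo.txt

# Hypothetical helper: label each salary and print a total at the end
salary_report() {
  awk '{
         total += $4                                                # arithmetic on a field
         if ($4 > 40000) { band = "high" } else { band = "low" }    # conditional
         printf "%-6s %6d %s\n", $1, $4, band                       # formatted output
       }
       END { printf "%-6s %6d\n", "total", total }' "$1"
}

salary_report /tmp/awk_report_demo.txt
```

With this sample data the last line of the report is the total, 120000.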

Syntax:

awk options 'selection_criteria {action}' input-file > output-file

Options:

-f program-file : Reads the AWK program source from the file program-file, instead of from the first command-line argument.

-F fs : Uses fs as the input field separator.
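A minimal sketch of both options follows. The colon-separated sample data and the file names /tmp/awk_users_demo.txt and /tmp/awk_first_field.awk are assumptions invented for this illustration.

```shell
# Hypothetical colon-separated input
printf '%s\n' 'root:x:0' 'daemon:x:1' > /tmp/awk_users_demo.txt

# -F sets the input field separator to a colon
awk -F ':' '{print $1}' /tmp/awk_users_demo.txt

# -f reads the program from a file instead of the command line
printf '%s\n' '{ print $1, $3 }' > /tmp/awk_first_field.awk
awk -F ':' -f /tmp/awk_first_field.awk /tmp/awk_users_demo.txt
```

The first command prints only the first field of each line; the second prints the first and third fields, separated by a space.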

Example commands

Example: Consider the following text file as the input file for all of the cases below:

$ cat > employee.txt
ajay manager account 45000
sunil clerk account 25000
varun manager sales 50000
amit manager account 47000
tarun peon sales 15000
deepak clerk sales 23000
sunil peon sales 13000
satvik director purchase 80000

1. Default Awk behavior: By default, Awk prints each line of data in the specified file.

$ awk '{print}' employee.txt

Output:

ajay manager account 45000
sunil clerk account 25000
varun manager sales 50000
amit manager account 47000
tarun peon sales 15000
deepak clerk sales 23000
sunil peon sales 13000
satvik director purchase 80000

In the example above, no pattern is given, so the action applies to every line. The print action without any argument prints the whole line by default, so every line of the file is printed.

2. Print the lines that match the given pattern.

$ awk '/manager/ {print}' employee.txt

Output:

ajay manager account 45000
varun manager sales 50000
amit manager account 47000

In the example above, the awk command prints every line that contains 'manager'.

3. Splitting a line into fields: For each record (that is, each line), awk splits the record on whitespace by default and stores the pieces in the $n variables. If the line has 4 words, they are stored in $1, $2, $3, and $4 respectively. Also, $0 represents the whole line.

$ awk '{print $1,$4}' employee.txt

Output:

ajay 45000
sunil 25000
varun 50000
amit 47000
tarun 15000
deepak 23000
sunil 13000
satvik 80000

In the example above, $1 and $4 represent the Name and Salary fields respectively.

Awk's built-in variables include the field variables ($1, $2, $3, and so on; $0 is the entire line), which split a line of text into individual words or pieces called fields.

  • NR: NR keeps a running count of the number of input records. Remember that records are usually lines. Awk runs the pattern/action statements once for each record in a file.
  • NF: NF holds the number of fields in the current input record.
  • FS: FS holds the field separator used to split the input line into fields. The default is whitespace, that is, space and tab characters. FS can be reassigned to another character (typically in BEGIN) to change the field separator.
  • RS: RS holds the current record separator. Because an input line is the input record by default, the default record separator is a newline.
  • OFS: OFS holds the output field separator, which separates the fields when Awk prints them. The default is a single space. Whenever print has several arguments separated by commas, it prints the value of OFS between them.
  • ORS: ORS holds the output record separator, which separates the output lines when Awk prints them. The default is a newline character. print automatically appends the contents of ORS to whatever it is given to print.
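The variables above can be tried out with a short sketch like the one below. The comma-separated sample records and the path /tmp/awk_vars_demo.txt are assumptions made up for this illustration.

```shell
# Hypothetical comma-separated input
printf '%s\n' 'ajay,manager,45000' 'sunil,clerk,25000' > /tmp/awk_vars_demo.txt

# FS is set in BEGIN so that it applies from the first record onward;
# OFS is printed between the comma-separated arguments of print
awk 'BEGIN { FS = ","; OFS = " | " }
     { print NR, NF, $1, $NF }' /tmp/awk_vars_demo.txt

# ORS replaces the newline that print normally appends
awk 'BEGIN { FS = ","; ORS = ";" } { print $1 }' /tmp/awk_vars_demo.txt
```

The first command prints the record number, the field count, the first field, and the last field of each line, joined by " | ". The second prints the names on a single line, each followed by a semicolon.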

Examples:

Using the built-in variable NR (display line numbers)

$ awk '{print NR,$0}' employee.txt

Output:

1 ajay manager account 45000
2 sunil clerk account 25000
3 varun manager sales 50000
4 amit manager account 47000
5 tarun peon sales 15000
6 deepak clerk sales 23000
7 sunil peon sales 13000
8 satvik director purchase 80000

In the example above, the awk command with NR prints every line along with its line number.

Using the built-in variable NF (display the last field)

$ awk '{print $1,$NF}' employee.txt

Output:

ajay 45000
sunil 25000
varun 50000
amit 47000
tarun 15000
deepak 23000
sunil 13000
satvik 80000

In the example above, $1 represents the Name and $NF represents the Salary. Because NF holds the number of fields in the record, $NF always refers to the last field.

Another use of the built-in variable NR (display lines 3 to 6)

$ awk 'NR==3, NR==6 {print NR,$0}' employee.txt

Output:

3 varun manager sales 50000
4 amit manager account 47000
5 tarun peon sales 15000
6 deepak clerk sales 23000
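The comma between the two patterns in NR==3, NR==6 forms a range: the action applies from the first record that matches the left pattern through the next record that matches the right pattern. As a small sketch, the same form also works with regular expressions. The input lines and the START/END markers here are assumptions made up for this illustration.

```shell
# Print only the records from the START marker through the END marker
printf '%s\n' 'a' 'START' 'b' 'c' 'END' 'd' |
  awk '/START/, /END/ { print NR ": " $0 }'
```

This prints records 2 through 5, each prefixed with its record number, and skips the lines outside the range.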

More examples

For the given text file:

$ cat > geeksforgeeks.txt
A B C
Tarun A12 1
Man B6 2
Praveen M42 3

1) To print the first item along with the row number (NR), separated by " - ", from each line in geeksforgeeks.txt:

$ awk '{print NR " - " $1}' geeksforgeeks.txt

1 - A
2 - Tarun
3 - Man
4 - Praveen

2) To return the second column/item from geeksforgeeks.txt:

$ awk '{print $2}' geeksforgeeks.txt

B
A12
B6
M42

3) To print the line number of any empty line, if present:

$ awk 'NF == 0 {print NR}' geeksforgeeks.txt

The condition must be NF == 0 (or, equivalently, NF <= 0) rather than NF < 0, because NF is never negative. The sample file contains no empty lines, so this command prints nothing for it.

4) To find the length of the longest line present in the file:

$ awk '{ if (length($0) > max) max = length($0) } END { print max }' geeksforgeeks.txt

13

5) To count the lines in a file:

$ awk 'END { print NR }' geeksforgeeks.txt

4

6) To print the lines with more than 10 characters:

$ awk 'length($0) > 10' geeksforgeeks.txt

Tarun A12 1
Praveen M42 3

7) To search for a string in a specific column:

$ awk '{ if ($2 == "B6") print $0; }' geeksforgeeks.txt

Man B6 2

8) To print the squares of the numbers from 1 to n, say 6:

$ awk 'BEGIN { for (i = 1; i <= 6; i++) print "square of", i, "is", i*i; }'

square of 1 is 1
square of 2 is 4
square of 3 is 9
square of 4 is 16
square of 5 is 25
square of 6 is 36

This article is contributed by Anshika Goyal and Praveen Negi.