Linux Study Notes: Detailed awk Usage

I. Basic Usage

awk is a report-generation tool: it reads its input line by line, splits each line into fields, formats those fields, and prints the result.

        [Linux85]#awk -h
        Usage: awk [POSIX or GNU style options] -f progfile [--] file ...
        Usage: awk [POSIX or GNU style options] [--] 'program' file ...
        POSIX options:      GNU long options:
            -f progfile     --file=progfile
            -F fs           --field-separator=fs    #field separator
            -v var=val      --assign=var=val
            -m[fr] val

        awk [options] 'script' FILE ...
        awk [options] '/pattern/{action}' FILE ...

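awk works with four separators: a record (line) separator and a field separator, each for input and for output.

Record (line) separator: end of line ($), that is, a newline by default

Field separator: whitespace by default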

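Patterns

Address range	/pattern1/,/pattern2/
/pattern/	Regular-expression match; can be negated with !
expression	An expression; the comparison operators are >, >=, <, <=, ==, !=, and ~ (regexp match)
BEGIN{}	Runs once, before any input is read
END{}	Runs once, after all input has been read and before the command exits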
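The pattern types in the table above can be tried without touching system files. The following is a minimal sketch that feeds awk a few made-up lines laid out like /etc/passwd; the sample data and the shell variable names are assumptions for illustration only.

```shell
# Sample data is invented; it only mimics the layout of /etc/passwd.
sample='root:x:0:0
daemon:x:2:2
soul:x:501:501
gentoo:x:502:502'

# BEGIN runs once before any input, END once after the last line.
# The middle rule uses an expression pattern on the third field (the UID).
pattern_demo=$(printf '%s\n' "$sample" | awk -F: '
    BEGIN   { print "UserName" }
    $3>=500 { print $1 }
    END     { print "done" }')
echo "$pattern_demo"

# A regexp pattern negated with !: every line that does not start with "soul".
negated_demo=$(printf '%s\n' "$sample" | awk -F: '!/^soul/{print $1}')
echo "$negated_demo"
```

The first command prints the header, then soul and gentoo, then done; the second prints root, daemon, and gentoo.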

 
  
    
      
      
        [Linux85]#awk '/^soul/{print $0}' /etc/passwd /etc/shadow /etc/group
        soul:x:501:501::/home/soul:/bin/bash
        soul:!!:16166:0:99999:7:::
        soul:x:501:
        [Linux85]#

 
        #Users whose UID is greater than or equal to 500
        [Linux85]#awk -F : '$3>=500{print $1}' /etc/passwd
        nfsnobody
        gentoo
        soul
        [Linux85]#

 
        #BEGIN runs its action before any input is read
        [Linux85]#awk -F : 'BEGIN{print "UserName\n***********"}$3>=500{print $1}' /etc/passwd
        UserName
        ***********
        nfsnobody
        gentoo
        soul
        [Linux85]#

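Built-in awk variables:

NF	Number of fields in the current input record
FS	Input field separator, used to split each record as it is read
RS	Input record separator, which marks the end of each record (a newline by default)
OFS	Output field separator, a single space by default
ORS	Output record separator, a newline by default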
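To make the four separator variables concrete, here is a small sketch on inline sample input. The input strings and the shell variable names are made up for illustration.

```shell
# FS splits each input record into fields; OFS joins the items of print.
fs_demo=$(printf 'soul:x:501\n' | awk 'BEGIN{FS=":"; OFS="-"} {print $1, $3}')
echo "$fs_demo"

# RS changes what counts as a record: here records end at ";" instead of a newline.
# ORS is written after every print: here " | " instead of a newline.
rs_demo=$(printf 'a;b;c' | awk 'BEGIN{RS=";"; ORS=" | "} {print NR ":" $0}')
echo "$rs_demo"
```

The first command prints soul-501. The second prints the three records on one line, each followed by " | ", because ORS replaces the usual newline.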

 
        [Linux85]#awk -F : '/^soul/{print $1,$7}' /etc/passwd
        soul /bin/bash
        [Linux85]#awk 'BEGIN{FS=":"}/^soul/{print $1,$7}' /etc/passwd
        soul /bin/bash
        [Linux85]#awk 'BEGIN{FS=":";OFS=":"}/^soul/{print $1,$7}' /etc/passwd
        soul:/bin/bash
        [Linux85]#

 
        [Linux85]#awk '!/^$|^#/{print $1}' /etc/sysctl.conf
        net.ipv4.ip_forward
        net.ipv4.conf.default.rp_filter
        net.ipv4.conf.default.accept_source_route
        kernel.sysrq
        kernel.core_uses_pid
        net.ipv4.tcp_syncookies
        net.bridge.bridge-nf-call-ip6tables
        net.bridge.bridge-nf-call-iptables
        net.bridge.bridge-nf-call-arptables
        kernel.msgmnb
        kernel.msgmax
        kernel.shmmax
        kernel.shmall
        [Linux85]#

 
        [Linux85]#ifconfig | awk '/inet addr/{print $2}' | awk -F : '!/127/{print $2}'
        172.16.251.85
        [Linux85]#

II. Advanced awk Usage

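1. print output: print item1, item2, ...

Items are separated by commas in the program; on output they are joined by the output field separator (OFS, a single space by default);
An item can be a string or number, a field of the current record (such as $1), a variable, or an awk expression; numbers are converted to strings before they are printed;
The items after print can be omitted, in which case it behaves like print $0; to print an empty line, use print "";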
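The three rules above can be seen side by side in a short sketch. The input line and the shell variable names are invented for illustration.

```shell
# Comma-separated items are joined with OFS (a single space by default).
comma_demo=$(printf 'soul:x:501\n' | awk -F: '{print $1, $3}')

# Items written side by side without a comma are concatenated with nothing in between.
concat_demo=$(printf 'soul:x:501\n' | awk -F: '{print $1 $3}')

# A bare print outputs $0; print "" outputs an empty line.
bare_demo=$(printf 'soul:x:501\n' | awk '{print; print ""; print "end"}')

echo "$comma_demo"
echo "$concat_demo"
echo "$bare_demo"
```

The comma form yields soul 501 and the concatenation yields soul501, which is a common source of confusion when a comma is forgotten.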

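2. printf output: printf format, item1, item2, ...

The main difference from print is that printf requires a format;
The format specifies how each of the following items is printed;
printf does not append a newline automatically; write \n in the format where one is needed.

Every format specifier starts with % and ends with a conversion character:

%c	A single character (a numeric argument is treated as a character code);
%d, %i	Decimal integer;
%e, %E	Number in scientific notation;
%f	Floating-point number;
%g, %G	Number in scientific or floating-point notation, whichever is shorter;
%s	String;
%u	Unsigned integer;
%%	A literal %;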

 
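        [Linux85]#awk 'BEGIN{num1=20;num2=30; printf "%d %d\n",num1,num2}'
        20 30
        [Linux85]#
        #The items are not printed directly; the format string is printed, with each specifier replaced by the corresponding item, so specifiers and items must match one-to-one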

Modifiers

N	Display width
-	Left-justify
+	Always show the sign of a number, positive or negative
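A short sketch of how the width, - and + modifiers change the output; the brackets are only there to make the padding visible, and the values and the shell variable name are arbitrary.

```shell
printf_demo=$(awk 'BEGIN{
    printf "[%5d]\n", 42         # width 5, right-justified by default
    printf "[%-5d]\n", 42        # width 5, left-justified
    printf "[%+d]\n", 42         # sign is always shown
    printf "[%-8s|%s]\n", "soul", "/bin/bash"
    printf "[%c%c]\n", "A", 66   # %c: first character of a string, or the character with that code
    printf "100%%\n"             # %% prints a literal percent sign
}')
echo "$printf_demo"
```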

 
        [Linux85]#awk -F: '{printf "%-14s %s\n",$1,$NF}' /etc/passwd
        root           /bin/bash
        bin            /sbin/nologin
        daemon         /sbin/nologin
        adm            /sbin/nologin
        lp             /sbin/nologin
        sync           /bin/sync

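3. Built-in awk variables: data variables

NR	Number of input records processed so far; with several input files, the count runs across all of them;
NF	Number of fields in the current record;
FNR	Unlike NR, FNR is the record number within the current file; it restarts at 1 for each input file;
ARGV	Array of command-line arguments, not counting options or the program text; for awk '{print $0}' a.txt b.txt, ARGV[0] is awk, ARGV[1] is a.txt, and ARGV[2] is b.txt;
ARGC	Number of elements in ARGV, that is, the operands plus ARGV[0];
FILENAME	Name of the file awk is currently reading;
ENVIRON	Associative array of the current shell environment variables and their values;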
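The data variables can be checked together on two throwaway files. This is a sketch under a few assumptions: the temporary directory, the file names a.txt and b.txt, and the environment variable DEMO_VAR are all made up for the demonstration.

```shell
# Two small input files in a scratch directory (names are arbitrary).
tmpdir=$(mktemp -d)
printf 'one\ntwo\n' > "$tmpdir/a.txt"
printf 'three\n'    > "$tmpdir/b.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
count_demo=$(cd "$tmpdir" && awk '{print NR, FNR, FILENAME, NF}' a.txt b.txt)
echo "$count_demo"

# ARGC counts ARGV[0] plus the operands; the program text is not part of ARGV.
argc_demo=$(cd "$tmpdir" && awk 'BEGIN{print ARGC; for(i=1;i<ARGC;i++) print i, ARGV[i]}' a.txt b.txt)
echo "$argc_demo"

# ENVIRON is indexed by environment variable name.
env_demo=$(DEMO_VAR=hello awk 'BEGIN{print ENVIRON["DEMO_VAR"]}')
echo "$env_demo"

rm -rf "$tmpdir"
```

The loop starts at index 1 on purpose: the exact string in ARGV[0] depends on how the awk binary was invoked.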

 
        [Linux85]#awk '{print NR,$0}' 1.txt
        1 one line
        2 two line
        3 three line
        4 four line
        5 five line
        [Linux85]#awk '{print NR,$0}' 2.txt
        1 six line
        2 seven line
        3 eight line
        4 nine line
        5 ten line
        [Linux85]#awk '{print NR,$0}' 1.txt 2.txt
        1 one line
        2 two line
        3 three line
        4 four line
        5 five line
        6 six line
        7 seven line
        8 eight line
        9 nine line
        10 ten line
        [Linux85]#
        [Linux85]#awk '{print FNR,$0}' 1.txt 2.txt
        1 one line
        2 two line
        3 three line
        4 four line
        5 five line
        1 six line
        2 seven line
        3 eight line
        4 nine line
        5 ten line
        [Linux85]#

 
        [Linux85]#awk -F: '/root/{print $1,"is a user in",ARGV[1]}' /etc/passwd
        root is a user in /etc/passwd
        operator is a user in /etc/passwd
        [Linux85]#

 
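        [Linux85]#awk 'BEGIN{print ARGC}' /etc/passwd /etc/group /etc/shadow
        4
        [Linux85]#
        #ARGC is 4 because ARGV[0] (awk itself) is counted along with the three file names; the program text 'BEGIN{print ARGC}' is not an element of ARGV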

 
        [Linux85]#awk '{print $0,"in",  FILENAME}' 1.txt 2.txt
        one line in 1.txt
        two line in 1.txt
        three line in 1.txt
        four line in 1.txt
        five line  in 1.txt
        six line in 2.txt
        seven line in 2.txt
        eight line in 2.txt
        nine line in 2.txt
        ten line in 2.txt
        [Linux85]#

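4. Output redirection

print items > output-file

print items >> output-file

print items | command

In an awk program, output-file and command are string expressions, so a literal name must be quoted, as in print $1 > "out.txt".

Special file names:

/dev/stdin: standard input
/dev/stdout: standard output
/dev/stderr: standard error
/dev/fd/N: the file descriptor N; for example, /dev/stdin is equivalent to /dev/fd/0;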
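The redirection forms above can be sketched as follows. The scratch directory, the file names, and the choice of sort -n as the piped command are assumptions made for this illustration.

```shell
tmpdir=$(mktemp -d)   # scratch directory; all file names below are made up

# > truncates the file when awk first opens it, then keeps writing to it.
# | sends every printed item to one running instance of the command.
printf 'b 2\na 1\nc 3\n' | awk -v out="$tmpdir/names.txt" '
    { print $1 > out }
    { print $2 | "sort -n" }
' > "$tmpdir/sorted.txt"

# >> appends to an existing file instead of truncating it.
printf 'd 4\n' | awk -v out="$tmpdir/names.txt" '{ print $1 >> out }'

names=$(cat "$tmpdir/names.txt")
sorted=$(cat "$tmpdir/sorted.txt")

# Writing to /dev/stderr keeps diagnostics out of the captured standard output.
stdout_only=$(awk 'BEGIN{ print "warning" > "/dev/stderr"; print "data" }' 2>/dev/null)

echo "$names"; echo "$sorted"; echo "$stdout_only"
rm -rf "$tmpdir"
```

Note that the target of > is the awk variable out, which holds a string; writing a bare, unquoted file name there would be read as an (empty) variable.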

5. awk operators

Arithmetic operators	Assignment operators	Comparison operators
-x: negation	=	x < y True if x is less than y.
+x: conversion to a number	+=	x <= y True if x is less than or equal to y.
x^y: exponentiation	-=	x > y True if x is greater than y.
x**y: exponentiation	*=	x >= y True if x is greater than or equal to y.
x*y	/=	x == y True if x is equal to y.
x/y	%=	x != y True if x is not equal to y.
x+y	^=	x ~ y True if the string x matches the regexp denoted by y.
x-y	**=	x !~ y True if the string x does not match the regexp denoted by y.
x%y	++	subscript in array True if the array array has an element with the subscript subscript.
	--

In awk, any nonzero number or non-empty string is true; zero and the empty string are false.

Conditional expression:

selector ? if-true-exp : if-false-exp
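A minimal sketch of the truth rule and the conditional expression. The sample lines, the labels "regular" and "system", and the shell variable names are made up for illustration.

```shell
# Zero and the empty string are false; a nonzero number is true.
truth_demo=$(awk 'BEGIN{
    n = 0; s = ""; u = 5
    print (n ? "true" : "false")
    print (s ? "true" : "false")
    print (u ? "true" : "false")
}')
echo "$truth_demo"

# The conditional expression picks one of two values per record.
ternary_demo=$(printf 'root:x:0\nsoul:x:501\n' | awk -F: '{print $1, ($3>=500 ? "regular" : "system")}')
echo "$ternary_demo"
```

The parentheses around the conditional expression are not strictly required here, but they make the grouping obvious inside a print statement.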

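6. Patterns and common pattern types

General form:

awk 'program' input-file1 input-file2 ...

program:

pattern { action }
pattern { action }
....

Common patterns:

Regexp	A regular expression, written /regular expression/
Expression	An expression; the pattern matches when its value is nonzero or a non-empty string, for example $1 ~ /foo/ or $1 == "soul"; the operators ~ (matches) and !~ (does not match) test a string against a regexp.
Ranges	A range of lines, written pat1,pat2
BEGIN/END	Special patterns: BEGIN runs once before the first input record is read, END once after the last one has been processed
Empty	The empty pattern matches every input line;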
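The range, expression, and empty patterns listed above can be sketched on inline data. The marker words START and STOP, the sample lines, and the shell variable names are invented for this example.

```shell
lines='alpha
START
beta
gamma
STOP
delta'

# Range pattern: from a line matching /START/ through the next line matching /STOP/, inclusive.
# With no action, the default action prints the whole line.
range_demo=$(printf '%s\n' "$lines" | awk '/START/,/STOP/')

# Expression patterns using ~ and !~ on a single field.
match_demo=$(printf 'soul /bin/bash\nbin /sbin/nologin\n' | awk '$2 ~ /bash$/ {print $1}')
nomatch_demo=$(printf 'soul /bin/bash\nbin /sbin/nologin\n' | awk '$2 !~ /bash$/ {print $1}')

# Empty pattern: the action runs for every input line.
empty_demo=$(printf 'x\ny\n' | awk '{print NR}')

echo "$range_demo"; echo "$match_demo"; echo "$nomatch_demo"; echo "$empty_demo"
```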

Common actions:

Expressions
Control statements
Compound statements
Input statements
Output statements

7. Control statements

if-else

Syntax: if (condition) {then-body} [else {else-body}]

 
        [Linux85]#awk -F : 'BEGIN{OFS=":"}{if ($3==0) {print $1,"Administrator";} else {print $1,"Common User"}}' /etc/passwd
        root:Administrator
        bin:Common User
        daemon:Common User
        adm:Common User
        lp:Common User
        sync:Common User
        shutdown:Common User

 
        [Linux85]#awk -F: '{if ($1=="root") printf "%-15s: %s\n",$1,"Admin";else printf "%-15s: %s\n",$1,"Common User"}' /etc/passwd
        root           : Admin
        bin            : Common User
        daemon         : Common User
        adm            : Common User
        lp             : Common User
        sync           : Common User
        shutdown       : Common User
        halt           : Common User
        mail           : Common User
        uucp           : Common User
        operator       : Common User
        games          : Common User
        gopher         : Common User
        ftp            : Common User
        nobody         : Common User
        dbus           : Common User
        usbmuxd        : Common User

 
        [Linux85]#awk -F: -v sum=0 '{if ($3>=500) sum++}END{print sum}' /etc/passwd
        3
        [Linux85]#
        #Counts the users whose UID is 500 or greater

while

Syntax: while (condition) {statement1; statement2; ...}

        [Linux85]#awk -F : '{i=1;while (i<=3) {print $i;i++}}' /etc/passwd
        root
        x
        0
        bin
        x
        1
        #Prints the first three fields of each line of /etc/passwd

 
        [Linux85]#awk -F: '{i=1;while (i<=NF) { if (length($i)>=4) {print $i}; i++ }}' /etc/passwd
        root
        root
        /root
        /bin/bash
        /bin
        /sbin/nologin

do-while: the loop body runs at least once, whether or not the condition holds

Syntax: do {statement1; statement2; ...} while (condition)

        [Linux85]#awk -F: '{i=1;do {print $i;i++}while(i<=3)}' /etc/passwd
        root
        x
        0
        bin
        x
        1
        daemon
        x
        2

 
        [Linux85]#awk -F: '{i=4;do {print $i;i--}while(i>4)}' /etc/passwd
        0
        1
        2
        4
        7
        0
        0
        0
        12
        #The condition i>4 is false from the start, yet the body still runs once per line and prints $4

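for

Syntax: for (variable assignment; condition; iteration process) {statement1; statement2; ...}

        [Linux85]#awk -F: '{for(i=1;i<=3;i++) if (i<3){printf "%s:",$i} print $i}' /etc/passwd
        root:x:0
        bin:x:1
        daemon:x:2
        adm:x:4
        lp:x:7
        sync:x:0
        shutdown:x:0
        #The loop body is only the if statement; print $i runs after the loop has finished, when i is 4, so the last column is $4 (the GID), not $3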

Iterating over array elements with for

Syntax: for (i in array) {statement1; statement2; ...}

        [Linux85]#awk -F: '$NF!~/^$/{BASH[$NF]++}END{for(A in BASH){printf "%15s:%i\n",A,BASH[A]}}' /etc/passwd
         /sbin/shutdown:1
               /bin/csh:1
              /bin/bash:2
          /sbin/nologin:29
             /sbin/halt:1
              /bin/sync:1
        [Linux85]#
        #Counts how many times each value of the last field (the login shell) occurs

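switch-case

Syntax: switch (expression) { case VALUE or /REGEXP/: statement1; statement2; ... default: statement1; ...}

switch is a gawk extension and is not available in every awk implementation.

break and continue

break leaves the enclosing loop; continue skips to its next iteration.

next

Ends processing of the current line early and moves on to the next line;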
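A brief sketch of break, continue, and next on made-up input. switch is left out here because it needs gawk; the loop bounds, the sample lines, and the shell variable names are arbitrary choices for the demonstration.

```shell
# continue skips the rest of one iteration; break ends the loop.
loop_demo=$(awk 'BEGIN{
    for (i = 1; i <= 10; i++) {
        if (i % 2 == 0) continue   # skip even numbers
        if (i > 7) break           # stop once i exceeds 7
        printf "%d ", i
    }
    printf "\n"
}')
echo "$loop_demo"

# next abandons the current line, so later rules never see comments or blank lines.
next_demo=$(printf '# comment\nkey=value\n\nother=1\n' | awk '/^#/ || /^$/ {next} {print}')
echo "$next_demo"
```

The difference to keep in mind is scope: break and continue act on the enclosing loop, while next acts on awk's implicit loop over input records.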

 
        [Linux85]#awk -F: '{if($3%2==0) next;print $1,$3}' /etc/passwd
        bin 1
        adm 3
        sync 5
        halt 7
        operator 11
        gopher 13
        nobody 99
        dbus 81
        usbmuxd 113
        vcsa 69
        rtkit 499
        abrt 173
        postfix 89
        rpcuser 29
        pulse 497
        soul 501
        [Linux85]#

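8. Arrays

array[index-expression]

index-expression can be any string. Note that if an array element does not exist yet, referencing it makes awk create it automatically and initialize it to the empty string; to test whether an element exists, use the form index in array instead.
To loop over every element of an array, use this special construct:

for (var in array) { statement1; ... }

Here var takes each array subscript in turn, not the element value;

To delete an element from an array: delete array[index]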
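The difference between testing with in and simply referencing an element is easy to miss, so here is a small sketch of it. The array name count and its keys are made up for illustration.

```shell
array_demo=$(awk 'BEGIN{
    count["bash"] = 2; count["nologin"] = 29
    if ("bash" in count) print "bash present"
    if (!("zsh" in count)) print "zsh absent"     # the in test does not create the element
    x = count["csh"]                              # a plain reference creates it, as an empty string
    if ("csh" in count) print "csh created by reference"
    delete count["csh"]                           # remove the accidental element again
    n = 0; for (k in count) n++                   # the order of for-in traversal is unspecified
    print n
}')
echo "$array_demo"
```

Because the traversal order is unspecified, the sketch only counts the remaining elements instead of printing them in sequence.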

 
        [Linux85]#netstat -ant | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
        ESTABLISHED 2
        LISTEN 10
        [Linux85]#

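9. Built-in awk functions

split(string, array [, fieldsep [, seps ] ])

Splits string into pieces at each occurrence of fieldsep and stores the pieces in the array named array; the subscripts form a sequence starting at 1, and the return value is the number of pieces (the optional seps argument is a gawk extension);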

 
        [Linux85]#df -lh | awk '!/^File/{split($5,percent,"%");if(percent[1]>=10){print $1}}'
        /dev/sda1
        /dev/mapper/vg0-usr
        [Linux85]#
        #Shows the filesystems whose disk usage is 10% or more

length([string]): returns the number of characters in string; with no argument, the length of $0;

 
        [Linux85]#awk -F: '{for(i=1;i<=NF;i++) { if (length($i)>=4) {print $i}}}' /etc/passwd
        root
        root
        /root
        /bin/bash
        /bin
        /sbin/nologin
        daemon
        daemon
        /sbin
        /sbin/nologin

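substr(string, start [, length ])

Returns the substring of string that begins at position start and is length characters long; positions are counted from 1;
system(command): runs the system command and returns its exit status to awk
systime(): returns the current system time as seconds since the epoch (a gawk extension)
tolower(s): converts all letters in s to lowercase
toupper(s): converts all letters in s to uppercase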
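The string functions above can be exercised in one BEGIN block. This is only a sketch: the test string, the array name parts, and the command passed to system() are arbitrary, and systime() is omitted because it requires gawk.

```shell
string_demo=$(awk 'BEGIN{
    s = "Hello awk"
    print length(s)              # number of characters in s
    print substr(s, 1, 5)        # five characters starting at position 1
    print substr(s, 7)           # from position 7 to the end
    print toupper(s)
    print tolower(s)
    n = split("a:b:c", parts, ":")
    print n, parts[1], parts[3]  # split returns the number of pieces
    status = system("exit 3")    # system returns the exit status of the command
    print status
}')
echo "$string_demo"
```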

10. User-defined functions

A user-defined function is declared with the function keyword, in this form:

function F_NAME([variable])
{
statements
}
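As a minimal sketch of the form above, the following defines and calls one function. The function name role, its labels, and the two sample lines are hypothetical and exist only for this illustration.

```shell
func_demo=$(printf 'root:x:0\nsoul:x:501\n' | awk -F: '
    # role() is a hypothetical helper that maps a UID to a label.
    function role(uid) {
        return (uid == 0) ? "Administrator" : "Common User"
    }
    { printf "%s:%s\n", $1, role($3) }')
echo "$func_demo"
```

Function definitions sit at the top level of the program, next to the pattern { action } rules, and can be called from any rule.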

Examples:

 
        #Counts the connections in the ESTABLISHED state for each client IP on the current system;
        [Linux85]#netstat -tn | awk '/ESTABLISHED\>/{split($5,ip,":");num[ip[1]]++}END{for (i in num) printf "%s %d\n", i, num[i]}'
        172.16.254.28 2
        [Linux85]#

 
        #Counts the processes in each state on the current system at the time ps aux runs;
        [Linux85]#ps aux | awk '!/^USER/{state[$8]++}END{for (i in state) printf "%s %d\n",i,state[i]}'
        S< 2
        S<sl 1
        Ss 18
        SN 1
        S 69
        Ss+ 6
        Ssl 2
        R+ 1
        S+ 2
        Sl 2
        S<s 1
        [Linux85]#

 
        #Counts the processes owned by each user on the current system at the time ps aux runs;
        [Linux85]#ps aux | awk '!/^USER/{state[$1]++}END{for (i in state) printf "%s %d\n",i,state[i]}'
        rpc 1
        dbus 1
        68 2
        postfix 2
        rpcuser 1
        root 96
        gentoo 2
        [Linux85]#

 
        #Shows the processes whose VSZ (virtual memory size) is greater than 10000, together with their PIDs, at the time ps aux runs;
        [Linux85]#ps aux | awk '!/USER/{if($5>10000) print $2,$11}'
        1 /sbin/init
        397 /sbin/udevd
        1184 auditd
        1209 /sbin/rsyslogd
        1251 rpcbind
        1282 dbus-daemon
        1292 NetworkManager
        1297 /usr/sbin/modem-manager
        1311 rpc.statd
        1344 cupsd
        1354 /usr/sbin/wpa_supplicant
        1392 hald
 
  This article is reposted from Mr_陈's 51CTO blog. Original link: http://blog.51cto.com/chenpipi/1391178. To reprint it, please contact the original author.
