I. Basic Usage
awk is a report-generation tool: it reads input line by line, splits each line into fields, and formats and prints those fields.
[Linux85]#awk -h
Usage: awk [POSIX or GNU style options] -f progfile [--] file ...
Usage: awk [POSIX or GNU style options] [--] 'program' file ...
POSIX options:          GNU long options:
        -f progfile             --file=progfile
        -F fs                   --field-separator=fs    # field separator
        -v var=val              --assign=var=val
        -m[fr] val
awk [options] 'script' FILE ...
awk [options] '/pattern/{action}' FILE ...
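Four separators: input and output each have a record (line) separator and a field separator.
Record separator: newline by default
Field separator: whitespace by default
Patterns
/pattern1/,/pattern2/ | address range: from a line matching pattern1 through the next line matching pattern2 |
/pattern/ | lines matching the regular expression; prefix with ! to negate |
expression | comparison expression using >, >=, <, <=, ==, !=, ~ |
BEGIN{} | runs once, before any input is read |
END{} | runs once, after all input has been processed and before awk exits |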
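The pattern types in the table above can be tried without touching real system files. The following is a minimal sketch; the sample data and the variable names (`sample`, `range_out`, `negate_out`, `count_out`) are assumptions made for illustration, not output from the original host.

```shell
# Hypothetical sample input in /etc/passwd format.
sample='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
soul:x:501:501::/home/soul:/bin/bash'

# Range pattern: user names from the line starting with "bin"
# through the line starting with "daemon".
range_out=$(printf '%s\n' "$sample" | awk -F: '/^bin/,/^daemon/{print $1}')

# Negated regexp pattern: user names whose line does not end in "nologin".
negate_out=$(printf '%s\n' "$sample" | awk -F: '!/nologin$/{print $1}')

# Expression pattern between BEGIN and END: count users with UID >= 500.
count_out=$(printf '%s\n' "$sample" | awk -F: 'BEGIN{n=0} $3>=500{n++} END{print n}')

printf '%s\n' "$range_out" "$negate_out" "$count_out"
```

With this sample the range selects bin and daemon, the negated pattern selects root and soul, and the count is 1.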
[Linux85]#awk '/^soul/{print $0}' /etc/passwd /etc/shadow /etc/group
soul:x:501:501::/home/soul:/bin/bash
soul:!!:16166:0:99999:7:::
soul:x:501:
[Linux85]#
# Users whose UID is greater than or equal to 500
[Linux85]#awk -F : '$3>=500{print $1}' /etc/passwd
nfsnobody
gentoo
soul
[Linux85]#
# BEGIN: an action performed before any input is read
[Linux85]#awk -F : 'BEGIN{print "UserName\n***********"}$3>=500{print $1}' /etc/passwd
UserName
***********
nfsnobody
gentoo
soul
[Linux85]#
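awk built-in variables:
NF | number of fields in the current input record |
FS | field separator used to split each record into fields when reading input; whitespace by default |
RS | record separator used when reading input; newline by default |
OFS | output field separator; a single space by default |
ORS | output record separator; newline by default |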
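To see how the separators interact, here is a small sketch that changes all four of them. The input strings and the variable names (`rs_out`, `ors_out`, `nf_out`) are made up for illustration.

```shell
# Records separated by ";" and fields by ","; fields are re-joined with "=".
rs_out=$(printf 'a,1;b,2;c,3' | awk 'BEGIN{RS=";"; FS=","; OFS="="} {print $1, $2}')

# ORS replaces the newline that print normally appends after each record.
ors_out=$(printf 'x\ny\n' | awk 'BEGIN{ORS="|"} {print}')

# NF is the number of fields in the current record.
nf_out=$(printf 'one two three\n' | awk '{print NF}')

printf '%s\n' "$rs_out" "$ors_out" "$nf_out"
```

Setting the variables in BEGIN ensures they are in effect before the first record is read.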
[Linux85]#awk -F : '/^soul/{print $1,$7}' /etc/passwd
soul /bin/bash
[Linux85]#awk 'BEGIN{FS=":"}/^soul/{print $1,$7}' /etc/passwd
soul /bin/bash
[Linux85]#awk 'BEGIN{FS=":";OFS=":"}/^soul/{print $1,$7}' /etc/passwd
soul:/bin/bash
[Linux85]#

[Linux85]#awk '!/^$|^#/{print $1}' /etc/sysctl.conf
net.ipv4.ip_forward
net.ipv4.conf.default.rp_filter
net.ipv4.conf.default.accept_source_route
kernel.sysrq
kernel.core_uses_pid
net.ipv4.tcp_syncookies
net.bridge.bridge-nf-call-ip6tables
net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-arptables
kernel.msgmnb
kernel.msgmax
kernel.shmmax
kernel.shmall
[Linux85]#

[Linux85]#ifconfig | awk '/inet addr/{print $2}' | awk -F : '!/127/{print $2}'
172.16.251.85
[Linux85]#
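II. Advanced awk Usage
1. print output: print item1, item2, ...
- Items are separated by commas in the statement and by the output field separator (a single space by default) in the output;
- An item can be a string or a number, a field of the current record (such as $1), a variable, or an awk expression; numbers are converted to strings before being printed;
- The items after print may be omitted, in which case print is equivalent to print $0; to output a blank line, use print "".
2. printf output: printf format, item1, item2, ...
- The main difference from print is that printf requires a format;
- The format specifies how each of the following items is printed;
- printf does not print a newline automatically; add \n to the format when one is needed.
Format specifiers all begin with % followed by a conversion character:
%c | a single character (a numeric argument is treated as a character code) |
%d, %i | decimal integer |
%e, %E | number in scientific notation |
%f | floating-point number |
%g, %G | number in scientific or floating-point notation, whichever is shorter |
%s | string |
%u | unsigned integer |
%% | a literal % |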
[Linux85]#awk 'BEGIN{num1=20;num2=30; printf "%d %d\n",num1,num2}'
20 30
[Linux85]#
# printf prints only what the format describes; each specifier consumes the next item in order, so specifiers and items must correspond one to one
Modifiers
N | field width |
- | left-align |
+ | always show the sign of a number (+ or -) |
[Linux85]#awk -F: '{printf "%-14s %s\n",$1,$NF}' /etc/passwd
root           /bin/bash
bin            /sbin/nologin
daemon         /sbin/nologin
adm            /sbin/nologin
lp             /sbin/nologin
sync           /bin/sync
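The specifiers and modifiers listed above can be combined. The sketch below wraps each result in brackets so that the padding is visible; the values and the variable name `printf_out` are arbitrary choices for illustration.

```shell
printf_out=$(awk 'BEGIN{
    printf "[%5d]\n", 42       # width 5, right-aligned by default
    printf "[%-5d]\n", 42      # width 5, left-aligned
    printf "[%+d]\n", 42       # sign always shown
    printf "[%c]\n", "A"       # a single character
    printf "[%s%%]\n", "50"    # string followed by a literal percent sign
}')
printf '%s\n' "$printf_out"
# [   42]
# [42   ]
# [+42]
# [A]
# [50%]
```

Every format ends in \n here because printf, unlike print, never adds a newline on its own.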
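3. awk built-in variables: data variables
NR | number of input records processed so far; when several files are given, the count runs continuously across all of them |
NF | number of fields in the current record |
FNR | unlike NR, counts records within the current file only, restarting at 1 for each file |
ARGV | array holding the command-line arguments; for awk '{print $0}' a.txt b.txt, ARGV[0] is awk, ARGV[1] is a.txt, and ARGV[2] is b.txt (options and the program text are not stored) |
ARGC | number of elements in ARGV, that is, the command name plus the remaining arguments |
FILENAME | name of the file currently being processed |
ENVIRON | associative array of the environment variables and their values |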
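The data variables are easiest to compare side by side. This sketch creates two throwaway files; the file names, the DEMO_USER variable, and the shell variable names are all hypothetical.

```shell
# Hypothetical scratch files, created only for this sketch.
tmpdir=$(mktemp -d)
printf 'one\ntwo\n' > "$tmpdir/a.txt"
printf 'three\n'    > "$tmpdir/b.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
counts_out=$(cd "$tmpdir" && awk '{print FILENAME, NR, FNR, NF}' a.txt b.txt)

# ARGC counts ARGV[0] (the command name) plus the two file operands.
argc_out=$(cd "$tmpdir" && awk 'BEGIN{print ARGC, ARGV[1], ARGV[2]}' a.txt b.txt)

# ENVIRON exposes environment variables by name.
env_out=$(DEMO_USER=soul awk 'BEGIN{print ENVIRON["DEMO_USER"]}')

rm -rf "$tmpdir"
printf '%s\n' "$counts_out" "$argc_out" "$env_out"
```

The first command prints one line per record in the order FILENAME, NR, FNR, NF, so the third column drops back to 1 when b.txt starts while the second keeps rising.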
[Linux85]#awk '{print NR,$0}' 1.txt
1 one line
2 two line
3 three line
4 four line
5 five line
[Linux85]#awk '{print NR,$0}' 2.txt
1 six line
2 seven line
3 eight line
4 nine line
5 ten line
[Linux85]#awk '{print NR,$0}' 1.txt 2.txt
1 one line
2 two line
3 three line
4 four line
5 five line
6 six line
7 seven line
8 eight line
9 nine line
10 ten line
[Linux85]#
[Linux85]#awk '{print FNR,$0}' 1.txt 2.txt
1 one line
2 two line
3 three line
4 four line
5 five line
1 six line
2 seven line
3 eight line
4 nine line
5 ten line
[Linux85]#

[Linux85]#awk -F: '/root/{print $1,"is a user in",ARGV[1]}' /etc/passwd
root is a user in /etc/passwd
operator is a user in /etc/passwd
[Linux85]#
[Linux85]#awk 'BEGIN{print ARGC}' /etc/passwd /etc/group /etc/shadow
4
[Linux85]#
# ARGC is 4 because ARGV[0] (awk) is counted along with the three file names; the program text 'BEGIN{print ARGC}' is not an element of ARGV
[Linux85]#awk '{print $0,"in", FILENAME}' 1.txt 2.txt
one line in 1.txt
two line in 1.txt
three line in 1.txt
four line in 1.txt
five line in 1.txt
six line in 2.txt
seven line in 2.txt
eight line in 2.txt
nine line in 2.txt
ten line in 2.txt
[Linux85]#
4. Output redirection
print items > output-file
print items >> output-file
print items | command
Special file names:
- /dev/stdin: standard input
- /dev/stdout: standard output
- /dev/stderr: standard error
- /dev/fd/N: the file descriptor N; for example, /dev/stdin is equivalent to /dev/fd/0.
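The three redirection forms can be sketched as follows. The temporary file, the awk variable `out`, and the shell variable names are assumptions for illustration; the `/dev/stderr` line relies on that name being usable on the system, as it is on Linux.

```shell
tmpfile=$(mktemp)

# Even numbers are written to the file named by the awk variable "out";
# the remaining lines are piped through "sort -n".
piped_out=$(printf '3\n1\n4\n2\n' | awk -v out="$tmpfile" '
    $1 % 2 == 0 { print $1 > out; next }
                { print $1 | "sort -n" }')
file_out=$(cat "$tmpfile")

# Diagnostics can go to standard error, leaving standard output clean.
stderr_out=$(printf 'x\n' | awk '{ print "warning: " $0 > "/dev/stderr" }' 2>&1)

rm -f "$tmpfile"
printf '%s\n' "$piped_out" "$file_out" "$stderr_out"
```

Within one awk run, > truncates the file only when it is first opened, so later print statements to the same name append to it.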
5. awk operators
Arithmetic operators | Assignment operators | Comparison operators |
-x: negation | = | x < y: true if x is less than y |
+x: conversion to a number | += | x <= y: true if x is less than or equal to y |
x^y: exponentiation | -= | x > y: true if x is greater than y |
x**y: exponentiation | *= | x >= y: true if x is greater than or equal to y |
x*y | /= | x == y: true if x is equal to y |
x/y | %= | x != y: true if x is not equal to y |
x+y | ^= | x ~ y: true if the string x matches the regexp denoted by y |
x-y | **= | x !~ y: true if the string x does not match the regexp denoted by y |
x%y | ++ | subscript in array: true if the array has an element with the given subscript |
 | -- | |
In awk, any nonzero number or non-empty string is true; zero and the empty string are false.
Conditional expression:
selector ? if-true-exp : if-false-exp
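A short sketch of the conditional expression together with a few of the operators above. The two-field input and the names `role`, `starts`, and `ops_out` are invented for the example.

```shell
ops_out=$(printf 'root:0\nsoul:501\n' | awk -F: '{
    role   = ($2 == 0)    ? "admin" : "user"   # numeric comparison
    starts = ($1 ~ /^so/) ? "yes"   : "no"     # regexp match
    print $1, role, starts, 2^3 + 7 % 4        # 8 + 3
}')
printf '%s\n' "$ops_out"
# root admin no 11
# soul user yes 11
```

Parenthesizing the selector is optional, but it keeps the expression readable when comparisons are involved.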
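6. Patterns and common pattern types
Program structure:
awk 'program' input-file1 input-file2 ...
program:
- pattern { action }
- pattern { action }
- ....
Common patterns:
Regexp | regular expression, written /regular expression/ |
Expression | expression; the pattern matches when its value is nonzero or a non-empty string, for example $1 ~ /foo/ or $1 == "soul"; the operators ~ (matches) and !~ (does not match) test a string against a regexp |
Ranges | range of lines, written pat1,pat2 |
BEGIN/END | special patterns; the action runs once before any input is read (BEGIN) or once after all input has been read (END) |
Empty (empty pattern) | matches every input line |
Common actions
- Expressions
- Control statements
- Compound statements
- Input statements
- Output statements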
7. Control statements
- if-else
Syntax: if (condition) {then-body} [else {else-body}]
[Linux85]#awk -F : 'BEGIN{OFS=":"}{if ($3==0) {print $1,"Administrator";} else {print $1,"Common User"}}' /etc/passwd
root:Administrator
bin:Common User
daemon:Common User
adm:Common User
lp:Common User
sync:Common User
shutdown:Common User

[Linux85]#awk -F: '{if ($1=="root") printf "%-15s: %s\n",$1,"Admin";else printf "%-15s: %s\n",$1,"Common User"}' /etc/passwd
root           : Admin
bin            : Common User
daemon         : Common User
adm            : Common User
lp             : Common User
sync           : Common User
shutdown       : Common User
halt           : Common User
mail           : Common User
uucp           : Common User
operator       : Common User
games          : Common User
gopher         : Common User
ftp            : Common User
nobody         : Common User
dbus           : Common User
usbmuxd        : Common User
[Linux85]#awk -F: -v sum=0 '{if ($3>=500) sum++}END{print sum}' /etc/passwd
3
[Linux85]#
# Counts the users whose UID is greater than or equal to 500
- while
Syntax: while (condition) {statement1; statement2; ...}
[Linux85]#awk -F : '{i=1;while (i<=3) {print $i;i++}}' /etc/passwd
root
x
0
bin
x
1
# Prints the first three fields of each line of /etc/passwd, one per line
[Linux85]#awk -F: '{i=1;while (i<=NF) { if (length($i)>=4) {print $i}; i++ }}' /etc/passwd
root
root
/root
/bin/bash
/bin
/sbin/nologin
- do-while: the loop body runs at least once, whether or not the condition holds
Syntax: do {statement1; statement2; ...} while (condition)
[Linux85]#awk -F: '{i=1;do {print $i;i++}while(i<=3)}' /etc/passwd
root
x
0
bin
x
1
daemon
x
2

[Linux85]#awk -F: '{i=4;do {print $i;i--}while(i>4)}' /etc/passwd
0
1
2
4
7
0
0
0
12
- for
Syntax: for (variable assignment; condition; iteration process) {statement1; statement2; ...}
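[Linux85]#awk -F: '{for(i=1;i<=3;i++) if (i<3){printf "%s:",$i} print $i}' /etc/passwd
root:x:0
bin:x:1
daemon:x:2
adm:x:4
lp:x:7
sync:x:0
shutdown:x:0
# Only the if statement belongs to the loop body; print $i runs after the loop has ended with i equal to 4, so the last column is the fourth field (GID)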
- for loop over the elements of an array
Syntax: for (i in array) {statement1; statement2; ...}
[Linux85]#awk -F: '$NF!~/^$/{BASH[$NF]++}END{for(A in BASH){printf "%15s:%i\n",A,BASH[A]}}' /etc/passwd
 /sbin/shutdown:1
       /bin/csh:1
      /bin/bash:2
  /sbin/nologin:29
     /sbin/halt:1
      /bin/sync:1
[Linux85]#
# Counts how many times each value of the last field (the login shell) occurs
- switch-case (a gawk extension)
Syntax: switch (expression) { case VALUE or /REGEXP/: statement1; statement2; ... default: statement1; ... }
- break and continue
- next
Stops processing the current line immediately and moves on to the next line;
[Linux85]#awk -F: '{if($3%2==0) next;print $1,$3}' /etc/passwd
bin 1
adm 3
sync 5
halt 7
operator 11
gopher 13
nobody 99
dbus 81
usbmuxd 113
vcsa 69
rtkit 499
abrt 173
postfix 89
rpcuser 29
pulse 497
soul 501
[Linux85]#
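break, continue, and next are listed above, but only next comes with an example. The following sketch uses all three in one program; the input lines and the variable name `loop_out` are invented for illustration.

```shell
loop_out=$(printf '1 2 -1 4\n# skipped\n5 0 6\n' | awk '
    /^#/ { next }                   # next: skip comment lines entirely
    {
        sum = 0
        for (i = 1; i <= NF; i++) {
            if ($i == 0) continue   # continue: ignore zero fields
            if ($i < 0)  break      # break: stop at the first negative field
            sum += $i
        }
        print sum
    }')
printf '%s\n' "$loop_out"
# 3
# 11
```

next acts on the whole record, whereas break and continue only affect the innermost enclosing loop.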
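8. Arrays
array[index-expression]
- index-expression can be any string. Note that referencing an element that does not exist yet makes awk create it and initialize it to the empty string; to test whether an element exists, therefore, use the form index in array.
- To iterate over every element of an array, use the following special construct:
for (var in array) { statement1; ... }
Here var takes each array subscript in turn, not the element value;
To delete an array element: delete array[index]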
[Linux85]#netstat -ant | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 2
LISTEN 10
[Linux85]#
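The point about elements being created on reference is easy to miss, so here is a minimal sketch of it. The array name `count`, its keys, and the variable `array_out` are arbitrary.

```shell
array_out=$(awk 'BEGIN{
    count["bash"] = 2
    # The "in" operator tests membership without creating the element.
    if (!("zsh" in count))  print "zsh: absent"
    # Merely referencing count["csh"] creates it with an empty value.
    if (count["csh"] == "") print "csh: empty"
    if ("csh" in count)     print "csh: now present"
    delete count["csh"]
    if (!("csh" in count))  print "csh: deleted"
}')
printf '%s\n' "$array_out"
```

All four messages are printed, which shows that a plain comparison silently adds a key that only delete removes again.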
9. awk built-in functions
- split(string, array [, fieldsep [, seps ] ])
Splits string on the separator fieldsep and stores the pieces in the array named array; the array subscripts are a sequence starting at 1;
[Linux85]#df -lh | awk '!/^File/{split($5,percent,"%");if(percent[1]>=10){print $1}}'
/dev/sda1
/dev/mapper/vg0-usr
[Linux85]#
# Shows the file systems whose usage is greater than or equal to 10%
- length([string]): returns the number of characters in string;
[Linux85]#awk -F: '{for(i=1;i<=NF;i++) { if (length($i)>=4) {print $i}}}' /etc/passwd
root
root
/root
/bin/bash
/bin
/sbin/nologin
daemon
daemon
/sbin
/sbin/nologin
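- substr(string, start [, length ])
Returns the substring of string that begins at position start and is length characters long; positions are counted from 1;
- system(command): runs command through the shell and returns its exit status; the command's output goes to standard output and is not returned to awk
- systime(): returns the current system time as a number of seconds since the epoch (a gawk extension)
- tolower(s): converts all letters in s to lowercase
- toupper(s): converts all letters in s to uppercase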
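Several of the functions above have no example in the text, so here is a brief sketch of them. The strings and the names `s`, `parts`, and `func_out` are made up; systime() is left out because it is not available in every awk.

```shell
func_out=$(awk 'BEGIN{
    s = "Hello, awk"
    print length(s)
    print substr(s, 1, 5)        # five characters starting at position 1
    print toupper(s)
    print tolower(s)
    n = split("172.16.251.85:22", parts, ":")
    print n, parts[1], parts[2]
    print system("true")         # exit status of the command, 0 on success
}')
printf '%s\n' "$func_out"
```

system() returns the exit status rather than the command's output; capturing output requires a different mechanism, such as reading from a pipe with getline.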
10. User-defined functions
User-defined functions are declared with the function keyword, in the following form:
function F_NAME([variable])
{
statements
}
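As an illustration of the form above, the sketch below defines one small function. The function name `role`, its parameters, and the sample input are hypothetical and serve only to show the syntax.

```shell
udf_out=$(printf 'root:0\nsoul:501\n' | awk -F: '
    # Hypothetical helper that classifies a UID. The extra parameter
    # "label" is never passed by callers and so acts as a local variable.
    function role(uid,    label) {
        label = (uid == 0) ? "Administrator" : "Common User"
        return label
    }
    { print $1 ": " role($2) }')
printf '%s\n' "$udf_out"
# root: Administrator
# soul: Common User
```

Listing extra parameters is the usual awk way to get local variables, since any other variable used in a function is global.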
example:
# Count, for each client IP, the connections on this system that are in the ESTABLISHED state
[Linux85]#netstat -tn | awk '/ESTABLISHED\>/{split($5,ip,":");num[ip[1]]++}END{for (i in num) printf "%s %d\n", i, num[i]}'
172.16.254.28 2
[Linux85]#
# Count the processes in each state, as reported by ps aux at the time it runs
[Linux85]#ps aux | awk '!/^USER/{state[$8]++}END{for (i in state) printf "%s %d\n",i,state[i]}'
S< 2
S<sl 1
Ss 18
SN 1
S 69
Ss+ 6
Ssl 2
R+ 1
S+ 2
Sl 2
S<s 1
[Linux85]#
# Count the processes owned by each user, as reported by ps aux at the time it runs
[Linux85]#ps aux | awk '!/^USER/{state[$1]++}END{for (i in state) printf "%s %d\n",i,state[i]}'
rpc 1
dbus 1
68 2
postfix 2
rpcuser 1
root 96
gentoo 2
[Linux85]#
# Show the PID and command of every process whose VSZ (virtual memory size) is greater than 10000, as reported by ps aux at the time it runs
[Linux85]#ps aux | awk '!/^USER/{if($5>10000) print $2,$11}'
1 /sbin/init
397 /sbin/udevd
1184 auditd
1209 /sbin/rsyslogd
1251 rpcbind
1282 dbus-daemon
1292 NetworkManager
1297 /usr/sbin/modem-manager
1311 rpc.statd
1344 cupsd
1354 /usr/sbin/wpa_supplicant
1392 hald