Smartctl: A Hard Disk Monitoring and Analysis Tool


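Smartctl is a command-line utility for Unix-like systems that carries out S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology) tasks: printing the SMART self-test and error logs, enabling and disabling SMART automatic testing, and initiating device self-tests.

Smartctl is especially useful on physical Linux servers, where it can check SMART-capable disks for errors and extract information about disks attached to a hardware RAID controller.

In this post we walk through some practical examples of the smartctl command. If smartctl is not yet installed on your Linux system, install it as follows.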

Installing Smartctl

On Ubuntu

    $ sudo apt-get install smartmontools

On CentOS & RHEL

    # yum install smartmontools

Starting the Smartctl Service (smartd)

On Ubuntu

    $ sudo /etc/init.d/smartmontools start

On CentOS & RHEL

    # service smartd start ; chkconfig smartd on

Examples

Example 1: Check whether SMART is enabled on a disk

    root@linuxtechi:~# smartctl -i /dev/sdb
    smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic] (local build)
    Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Model Family:     Seagate Momentus 5400.6
    Device Model:     ST9320325AS
    Serial Number:    5VD2V59T
    LU WWN Device Id: 5 000c50 020a37ec4
    Firmware Version: 0002BSM1
    User Capacity:    320,072,933,376 bytes [320 GB]
    Sector Size:      512 bytes logical/physical
    Rotation Rate:    5400 rpm
    Device is:        In smartctl database [for details use: -P show]
    ATA Version is:   ATA8-ACS T13/1699-D revision 4
    SATA Version is:  SATA 2.6, 1.5 Gb/s
    Local Time is:    Sun Nov 16 12:32:09 2014 IST
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled

Here '/dev/sdb' is your hard disk. The last two lines of the output above show that the drive supports SMART and that SMART is enabled.
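For scripting, the same check can be made without reading the output by eye. The following is a minimal sketch, assuming the `SMART support is: Enabled` line appears exactly as in the output above; the function name `smart_is_enabled` is hypothetical and not part of smartmontools.

```shell
# Hypothetical helper (not part of smartmontools): reads `smartctl -i`
# output on stdin and succeeds only if SMART is reported as enabled.
smart_is_enabled() {
    grep -Eq '^SMART support is:[[:space:]]+Enabled'
}

# Intended use on a real system (needs root and a real device):
#   smartctl -i /dev/sdb | smart_is_enabled && echo "SMART is enabled"

# Demonstration on the last two lines of the output shown above:
printf '%s\n' \
    'SMART support is: Available - device has SMART capability.' \
    'SMART support is: Enabled' | smart_is_enabled && echo "SMART is enabled"
```

Reading from stdin keeps the helper independent of the device name, so it can be tried on saved output before being pointed at a real disk.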

Example 2: Enable SMART on a disk

    root@linuxtechi:~# smartctl -s on /dev/sdb
    smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic] (local build)
    Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF ENABLE/DISABLE COMMANDS SECTION ===
    SMART Enabled.

Example 3: Disable SMART on a disk

    root@linuxtechi:~# smartctl -s off /dev/sdb
    smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic] (local build)
    Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF ENABLE/DISABLE COMMANDS SECTION ===
    SMART Disabled. Use option -s with argument 'on' to enable it.

Example 4: Display detailed SMART information for a disk

    root@linuxtechi:~# smartctl -a /dev/sdb           # for an IDE drive
    root@linuxtechi:~# smartctl -a -d ata /dev/sdb    # for a SATA drive

Example 5: Display the overall health of a disk

    root@linuxtechi:~# smartctl -H /dev/sdb
    smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic] (local build)
    Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    Warning: This result is based on an Attribute check.
    Please note the following marginal Attributes:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    190 Airflow_Temperature_Cel 0x0022   067   045   045    Old_age   Always   In_the_past 33 (Min/Max 25/33)
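The health verdict and any marginal attributes can also be pulled out of this output by a script. The sketch below assumes the ATA-style wording and the ten-column attribute table shown above; other device types may word the result line differently. The function names `smart_health` and `marginal_attributes` are hypothetical.

```shell
# Hypothetical helpers (not part of smartmontools); both read smartctl
# output on stdin.

# Print the overall-health verdict, e.g. PASSED.
smart_health() {
    sed -n 's/^SMART overall-health self-assessment test result:[[:space:]]*//p'
}

# Print ID, name and WHEN_FAILED for attribute rows whose WHEN_FAILED
# column (field 9) is not "-". A row is recognised by a numeric ID in
# field 1 and a hex FLAG in field 3.
marginal_attributes() {
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^0x[0-9a-fA-F]+$/ && NF >= 10 && $9 != "-" {
        print $1, $2, $9
    }'
}

# Intended use on a real system (needs root):
#   smartctl -H /dev/sdb | smart_health

# Demonstration on lines taken from the output shown above:
sample='SMART overall-health self-assessment test result: PASSED
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
190 Airflow_Temperature_Cel 0x0022 067 045 045 Old_age Always In_the_past 33 (Min/Max 25/33)'
printf '%s\n' "$sample" | smart_health
printf '%s\n' "$sample" | marginal_attributes
```

On the sample this prints `PASSED` followed by `190 Airflow_Temperature_Cel In_the_past`, which matches the warning in the output: the drive passes overall, but the temperature attribute crossed its threshold at some point in the past.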

Example 6: Test the disk with the long and short options

Long test

    root@linuxtechi:~# smartctl --test=long /dev/sdb
    smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic] (local build)
    Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
    Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
    Testing has begun.
    Please wait 102 minutes for test to complete.
    Test will complete after Sun Nov 16 14:29:43 2014

    Use smartctl -X to abort test.

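Alternatively, we can redirect the command's output to a log file, as shown below. Note that this captures only the start-up message shown above; the result of the test itself is read later from the drive's self-test log.

    root@linuxtechi:~# smartctl --test=long /dev/sdb > /var/log/long.text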

Short test

    root@linuxtechi:~# smartctl --test=short /dev/sdb
    smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic] (local build)
    Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
    Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
    Testing has begun.
    Please wait 1 minutes for test to complete.
    Test will complete after Sun Nov 16 12:51:45 2014

    Use smartctl -X to abort test.

 
 
    root@linuxtechi:~# smartctl --test=short /dev/sdb > /var/log/short.text

Note: a short test takes at most 2 minutes, whereas a long test has no fixed time limit, because it reads and verifies every segment of the disk.
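Because the drive itself reports how long each test should take, a script can read that estimate from the "Please wait N minutes" line instead of hard-coding a delay. This is a minimal sketch, assuming the message is worded as in the outputs above; `selftest_wait_minutes` is a hypothetical name.

```shell
# Hypothetical helper (not part of smartmontools): reads the output of
# `smartctl --test=...` on stdin and prints the estimated wait in minutes.
selftest_wait_minutes() {
    sed -n 's/^Please wait \([0-9][0-9]*\) minutes.*/\1/p'
}

# Intended use on a real system (needs root), sketched as comments:
#   mins=$(smartctl --test=short /dev/sdb | selftest_wait_minutes)
#   sleep $(( mins * 60 ))
#   smartctl -l selftest /dev/sdb

# Demonstration on the line printed by the long test above:
printf '%s\n' 'Please wait 102 minutes for test to complete.' | selftest_wait_minutes
```

The demonstration prints `102`. If the line is missing (for example because the drive rejected the command), the helper prints nothing, so a calling script should check for an empty value before sleeping.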

Example 7: View the drive's self-test results

    root@linuxtechi:~# smartctl -l selftest /dev/sdb
    smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic] (local build)
    Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF READ SMART DATA SECTION ===
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline       Completed: read failure        90%        492        210841222
    # 2  Extended offline    Completed: read failure        90%        492        210841222
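Both entries in the log above ended in a read failure at the same LBA. A script can pick out such entries as sketched below, assuming failed tests carry the word "failure" in the Status column as they do here; `failed_selftests` is a hypothetical name.

```shell
# Hypothetical helper (not part of smartmontools): reads
# `smartctl -l selftest` output on stdin and prints the
# LBA_of_first_error (last column) of every failed entry.
failed_selftests() {
    # Log entries start with "#" followed by the entry number.
    awk '/^#[ \t]*[0-9]+/ && /failure/ { print $NF }'
}

# Intended use on a real system (needs root):
#   smartctl -l selftest /dev/sdb | failed_selftests

# Demonstration on the two entries shown above:
failed_selftests <<'EOF'
# 1  Short offline       Completed: read failure        90%        492        210841222
# 2  Extended offline    Completed: read failure        90%        492        210841222
EOF
```

The demonstration prints `210841222` twice, once per failed entry. Entries that completed without error contain no "failure" text and are skipped.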

Example 8: Estimate how long the tests will take

    root@linuxtechi:~# smartctl -c /dev/sdb
    smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic] (local build)
    Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF READ SMART DATA SECTION ===
    General SMART Values:
    Offline data collection status:  (0x00) Offline data collection activity
                                            was never started.
                                            Auto Offline Data Collection: Disabled.
    Self-test execution status:      ( 121) The previous self-test completed having
                                            the read element of the test failed.
    Total time to complete Offline
    data collection:                 (   0) seconds.
    Offline data collection
    capabilities:                    (0x73) SMART execute Offline immediate.
                                            Auto Offline data collection on/off support.
                                            Suspend Offline collection upon new
                                            command.
                                            No Offline surface scan supported.
                                            Self-test supported.
                                            Conveyance Self-test supported.
                                            Selective Self-test supported.
    SMART capabilities:            (0x0003) Saves SMART data before entering
                                            power-saving mode.
                                            Supports SMART auto save timer.
    Error logging capability:        (0x01) Error logging supported.
                                            General Purpose Logging supported.
    Short self-test routine
    recommended polling time:        (   1) minutes.
    Extended self-test routine
    recommended polling time:        ( 102) minutes.
    Conveyance self-test routine
    recommended polling time:        (   2) minutes.
    SCT capabilities:              (0x103b) SCT Status supported.
                                            SCT Error Recovery Control supported.
                                            SCT Feature Control supported.
                                            SCT Data Table supported.
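The time estimates sit in the "recommended polling time" lines, each preceded by a line naming the test. The following sketch pairs the two, assuming the two-line layout shown above; `polling_times` is a hypothetical name.

```shell
# Hypothetical helper (not part of smartmontools): reads `smartctl -c`
# output on stdin and prints "<test name> <minutes>" for each self-test.
polling_times() {
    awk '
        # Remember the test name from lines such as "Short self-test routine".
        /self-test routine[ \t]*$/ { name = $1 }
        # The following line carries the estimate in parentheses.
        /recommended polling time:/ {
            if (match($0, /\([ \t]*[0-9]+\)/)) {
                n = substr($0, RSTART + 1, RLENGTH - 2)
                gsub(/[ \t]/, "", n)
                print name, n
            }
        }
    '
}

# Intended use on a real system (needs root):
#   smartctl -c /dev/sdb | polling_times

# Demonstration on the relevant lines of the output shown above:
polling_times <<'EOF'
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 102) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
EOF
```

For the sample this prints `Short 1`, `Extended 102` and `Conveyance 2`, which agrees with the 1-minute and 102-minute waits announced when the tests were started in Example 6.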

Example 9: Display the disk error log

    root@linuxtechi:~# smartctl -l error /dev/sdb

Sample output:

    smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic] (local build)
    Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF READ SMART DATA SECTION ===
    SMART Error Log Version: 1
    ATA Error Count: 5
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
    Powered_Up_Time is measured from power on, and printed as
    DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
    SS=sec, and sss=millisec. It "wraps" after 49.710 days.

    Commands leading to the command that caused the error were:
    CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
    -- -- -- -- -- -- -- --  ----------------  --------------------
    25 da 08 e7 e5 a5 4c 00      00:30:44.515  READ DMA EXT
    25 da 08 df e5 a5 4c 00      00:30:44.514  READ DMA EXT
    25 da 80 5f e5 a5 4c 00      00:30:44.502  READ DMA EXT
    25 da f0 5f e6 a5 4c 00      00:30:44.496  READ DMA EXT
    25 da 10 4f e6 a5 4c 00      00:30:44.383  READ DMA EXT
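For routine monitoring, the single most useful number in this log is usually the error count. A minimal sketch for extracting it, assuming the `ATA Error Count:` line is present as in the output above; `ata_error_count` is a hypothetical name.

```shell
# Hypothetical helper (not part of smartmontools): reads
# `smartctl -l error` output on stdin and prints the ATA error count.
# Prints nothing if the line is absent.
ata_error_count() {
    sed -n 's/^ATA Error Count:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Intended use on a real system (needs root):
#   smartctl -l error /dev/sdb | ata_error_count

# Demonstration on the line from the output shown above:
printf '%s\n' 'SMART Error Log Version: 1' 'ATA Error Count: 5' | ata_error_count
```

The demonstration prints `5`. Recording this value over time and watching for increases is one simple way to notice a drive that keeps logging new errors.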

Originally published: 2015-01-16

This article comes from the Yunqi partner "Linux China" (linux中国).