
smart checkin #2 for 3ware / twe


Well, just because I needed it today, here’s my SMART check-in script, enhanced to also check discs behind a 3ware controller, which shows up as /dev/twe devices… one day I’ll add Adaptec.

Here’s the script again, and don’t forget to install smartmontools and bc:

#!/bin/bash
while read -r disk; do
    ret=0
    echo "~~ checking ${disk} ~~"
    # overall verdict from "smartctl -H", e.g. PASSED
    health=$(smartctl -H "${disk}" | awk '/result: /{print $6}')
    if [ "$health" != "PASSED" ]; then
        echo "Check the disc, it failed the overall smart health check..."
        ret=1
    fi
    while read -r line; do
        # attribute table columns: 1=ID 2=name 4=value 5=worst 6=thresh 10=raw
        id=$(echo "$line" | awk '{print $1}' | bc)
        title=$(echo "$line" | awk '{print $2}')
        thresh=$(echo "$line" | awk '{print $6}' | bc)
        worst=$(echo "$line" | awk '{print $5}' | bc)
        value=$(echo "$line" | awk '{print $4}' | bc)
        raw=$(echo "$line" | awk '{print $10}' | bc)
        if [ $value -lt $thresh ]; then
            echo "$title: value($value) is less than thresh($thresh)"
            ret=1
        fi
        # error counters that should stay at zero
        if [ $id -eq 5 ] || [ $id -eq 183 ] || [ $id -eq 187 ] || [ $id -eq 197 ] || [ $id -eq 198 ]; then
            if [ $raw -gt 0 ]; then
                # (the message prints the normalized value, not $raw)
                echo "$title: raw value($value) is greater than zero"
                ret=1
            fi
        fi
        # 9 = Power_On_Hours
        if [ $id -eq 9 ]; then
            years=$(echo "scale=0; $raw / 24 / 365" | bc)
            if [ $years -ge 4 ]; then
                echo "$title: disk is older($years) than 4 years"
                ret=1
            fi
        fi
        # 194 = Temperature_Celsius
        if [ $id -eq 194 ]; then
            if [ $raw -ge 45 ]; then
                echo "$title: disk is hotter($raw°C) than 45°C"
                ret=1
            elif [ $raw -le 25 ]; then
                echo "$title: disk is colder($raw°C) than 25°C"
                ret=1
            fi
        fi
    done < <(smartctl -A "${disk}" | tail -n +8 | head -n -1)
    if [ $ret -eq 1 ]; then
        echo -e " - \e[91mcheck ${disk} manually and monitor it closely.\e[39m"
    else
        echo -e " + \e[92meverything is fine with ${disk}\e[39m"
    fi
done < <(ls /dev/sd[a-z])
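Since the script silently misbehaves when smartctl or bc is missing, a small guard at the top can help. This is only a sketch; missing_tools is a made-up helper name, not part of the original script.

```shell
#!/bin/bash
# missing_tools (hypothetical helper): print every named command that is
# not found in PATH, one per line.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Warn about missing dependencies before running any checks.
missing=$(missing_tools smartctl bc)
if [ -n "$missing" ]; then
    echo "missing tools:" $missing >&2
fi
```

Whether to abort or just warn at that point is a matter of taste; the sketch only warns.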

The 3ware RAID controller's kernel driver creates device nodes in /dev called twe0 to twe15. I just need a quick and dirty solution, so I could simply issue smartctl -H -d 3ware,N for N from 0 to 9 on twe0 to twe15 and check whether $? is 0. Something that might work just as well is:

tw_cli /c0 show | awk '/p[0-9]/{if($2 != "NOT-PRESENT") print $1}' | sed 's_p__g'
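The awk and sed stages can also be folded into a single awk call. A minimal sketch, assuming (as the pipeline above already does) that port rows start with p followed by a number and carry the status in the second column; present_ports is a made-up name. Unlike the original pattern, it only matches when the first column is the port, which is slightly stricter.

```shell
#!/bin/bash
# present_ports (hypothetical helper): read "tw_cli /cX show" style output on
# stdin and print the number of every port whose status is not NOT-PRESENT.
present_ports() {
    awk '$1 ~ /^p[0-9]+$/ && $2 != "NOT-PRESENT" { sub(/^p/, "", $1); print $1 }'
}

# Usage, assuming tw_cli is installed and the controller is /c0:
#   tw_cli /c0 show | present_ports
```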

This will tell me the port numbers where a disc is located. I'm not sure how well this works if you have multiple controllers. However, for the usual case this should suffice. So, let’s go for the following:

DEVICES=($(ls /dev/sd[a-z]))
if [ -x "/usr/sbin/tw_cli" ]; then
    while read -r PORT; do
        # append one entry per populated port; the options travel with the device node
        DEVICES+=("/dev/twe0 -d 3ware,$PORT")
    done < <(/usr/sbin/tw_cli /c0 show | awk '/p[0-9]/{if($2 != "NOT-PRESENT") print $1}' | sed 's_p__g')
fi

This results in:

root@psv1:~# for i in "${DEVICES[@]}"; do echo $i; done
/dev/sda
/dev/twe0 -d 3ware,1

Note the quotes which I’ve put around ${DEVICES[@]}. Without them, -d and 3ware,1 end up on their own lines. Now, let’s adjust our script from above:

#!/bin/bash
DEVICES=($(ls /dev/sd[a-z]))
if [ -x "/usr/sbin/tw_cli" ]; then
    while read -r PORT; do
        DEVICES+=("/dev/twe0 -d 3ware,$PORT")
    done < <(/usr/sbin/tw_cli /c0 show | awk '/p[0-9]/{if($2 != "NOT-PRESENT") print $1}' | sed 's_p__g')
fi
for disk in "${DEVICES[@]}"; do
    ret=0
    echo "~~ checking ${disk} ~~"
    # ${disk} stays unquoted on purpose so that "-d 3ware,N" reaches smartctl as separate arguments
    health=$(smartctl -H ${disk} | awk '/result: /{print $6}')
    if [ "$health" != "PASSED" ]; then
        echo "Check the disc, it failed the overall smart health check..."
        ret=1
    fi
    while read -r line; do
        # attribute table columns: 1=ID 2=name 4=value 5=worst 6=thresh 10=raw
        id=$(echo "$line" | awk '{print $1}' | bc)
        title=$(echo "$line" | awk '{print $2}')
        thresh=$(echo "$line" | awk '{print $6}' | bc)
        worst=$(echo "$line" | awk '{print $5}' | bc)
        value=$(echo "$line" | awk '{print $4}' | bc)
        raw=$(echo "$line" | awk '{print $10}' | bc)
        if [ $value -lt $thresh ]; then
            echo "$title: value($value) is less than thresh($thresh)"
            ret=1
        fi
        # error counters that should stay at zero
        if [ $id -eq 5 ] || [ $id -eq 183 ] || [ $id -eq 187 ] || [ $id -eq 197 ] || [ $id -eq 198 ]; then
            if [ $raw -gt 0 ]; then
                # (the message prints the normalized value, not $raw)
                echo "$title: raw value($value) is greater than zero"
                ret=1
            fi
        fi
        # 9 = Power_On_Hours
        if [ $id -eq 9 ]; then
            years=$(echo "scale=0; $raw / 24 / 365" | bc)
            if [ $years -ge 4 ]; then
                echo "$title: disk is older($years) than 4 years"
                ret=1
            fi
        fi
        # 194 = Temperature_Celsius
        if [ $id -eq 194 ]; then
            if [ $raw -ge 45 ]; then
                echo "$title: disk is hotter($raw°C) than 45°C"
                ret=1
            elif [ $raw -le 25 ]; then
                echo "$title: disk is colder($raw°C) than 25°C"
                ret=1
            fi
        fi
    done < <(smartctl -A ${disk} | tail -n +8 | head -n -1)
    if [ $ret -eq 1 ]; then
        echo -e " - \e[91mcheck ${disk} manually and monitor it closely.\e[39m"
    else
        echo -e " + \e[92meverything is fine with ${disk}\e[39m"
    fi
done
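To make the array handling a bit more tangible, here is a small stand-alone sketch. The port numbers and the count_args helper are purely illustrative. It shows that += appends one element at a time (re-expanding plain $DEVICES would only yield the first element of an array), and how quoting changes the number of words the shell produces.

```shell
#!/bin/bash
# Illustrative values only; the real list comes from /dev/sd? and tw_cli.
DEVICES=("/dev/sda")
for PORT in 1 3; do
    DEVICES+=("/dev/twe0 -d 3ware,$PORT")   # += appends one element
done

# count_args (hypothetical helper): print how many arguments it received.
count_args() { echo "$#"; }

count_args "${DEVICES[@]}"   # quoted: one word per element, prints 3
count_args ${DEVICES[@]}     # unquoted: split at spaces, prints 7
```

The same splitting is what lets an entry like "/dev/twe0 -d 3ware,1" be handed to smartctl as three separate arguments when it is expanded without quotes.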

And a test run…

root@psv1:~# ./test.sh
~~ checking /dev/sda ~~
Check the disc, it failed the overall smart health check...
 - check /dev/sda manually and monitor it closely.
~~ checking /dev/twe0 -d 3ware,1 ~~
Power_On_Hours: disk is older(8) than 4 years
Reported_Uncorrect: raw value(100) is greater than zero
Temperature_Celsius: disk is colder(23°C) than 25°C
Current_Pending_Sector: raw value(95) is greater than zero
 - check /dev/twe0 -d 3ware,1 manually and monitor it closely.
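The age in the Power_On_Hours line comes from dividing the raw hour count by 24 and then by 365, truncating at each step. The same rule can be sketched with plain bash arithmetic; hours_to_years is a made-up name and 70080 is just an example input, not the value of the disc above.

```shell
#!/bin/bash
# hours_to_years (hypothetical helper): whole years for a Power_On_Hours
# raw value, using the same hours / 24 / 365 rule as the script.
hours_to_years() {
    echo $(( $1 / 24 / 365 ))
}

hours_to_years 70080   # 70080 h = 2920 days = 8 years, prints 8
```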

sda fails because that's the drive exported by the RAID controller. That is fine and can be ignored.
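For completeness, the brute-force idea from earlier (probing every port and looking at $?) could look roughly like the sketch below. probe_ports and the SMARTCTL override are made-up names. Keep in mind that smartctl's exit status is a bit mask, so a non-zero status can also mean a disc that is present but failing; this therefore only lists ports that pass the health check.

```shell
#!/bin/bash
# probe_ports (hypothetical helper): try ports 0..9 on one device node and
# print each port for which the health check exits with status 0.
# SMARTCTL can be overridden for testing; it defaults to smartctl.
probe_ports() {
    local node=$1 port
    for port in 0 1 2 3 4 5 6 7 8 9; do
        if "${SMARTCTL:-smartctl}" -H -d "3ware,${port}" "$node" >/dev/null 2>&1; then
            echo "$port"
        fi
    done
}

# Usage (needs root and a 3ware controller):
#   probe_ports /dev/twe0
```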

