Well, just because I needed it today, here’s my SMART check script, extended to also check discs behind a 3ware controller, which shows up as /dev/twe devices… one day I’ll add Adaptec.
Here’s the script again, and don’t forget to install smartmontools and bc:

    #!/bin/bash
    while read -r disk; do
        ret=0
        echo "~~ checking ${disk} ~~"
        health=$(smartctl -H ${disk} | awk '/result: /{print $6}')
        # quoted, so an empty result does not break the test
        if [ "$health" != "PASSED" ]; then
            echo "Check the disc, it failed the overall smart health check..."
            ret=1
        fi
        # walk the attribute table, one attribute per line
        while read -r line; do
            # piping through bc turns numbers such as 005 into 5
            id=$(echo $line | awk '{print $1}' | bc)
            title=$(echo $line | awk '{print $2}')
            thresh=$(echo $line | awk '{print $6}' | bc)
            worst=$(echo $line | awk '{print $5}' | bc)
            value=$(echo $line | awk '{print $4}' | bc)
            raw=$(echo $line | awk '{print $10}' | bc)
            if [ $value -lt $thresh ]; then
                echo "$title: value($value) is less than thresh($thresh)"
                ret=1
            fi
            # error counters that should stay at zero
            if [ $id -eq 5 ] || [ $id -eq 183 ] || [ $id -eq 187 ] || [ $id -eq 197 ] || [ $id -eq 198 ]; then
                if [ $raw -gt 0 ]; then
                    echo "$title: raw value($value) is greater than zero"
                    ret=1
                fi
            fi
            # power-on hours
            if [ $id -eq 9 ]; then
                years=$(echo "scale=0; $raw / 24 / 365" | bc)
                if [ $years -ge 4 ]; then
                    echo "$title: disk is older($years) than 4 years"
                    ret=1
                fi
            fi
            # temperature
            if [ $id -eq 194 ]; then
                if [ $raw -ge 45 ]; then
                    echo "$title: disk is hotter($raw°C) than 45°C"
                    ret=1
                elif [ $raw -le 25 ]; then
                    echo "$title: disk is colder($raw°C) than 25°C"
                    ret=1
                fi
            fi
        done< <(smartctl -A ${disk} | tail -n +8 | head -n -1)
        if [ $ret -eq 1 ]; then
            echo -e " - \e[91mcheck ${disk} manually and monitor it closely.\e[39m"
        else
            echo -e " + \e[92meverything is fine with ${disk}\e[39m"
        fi
    done< <(ls /dev/sd[a-z])

The 3ware RAID controller’s kernel driver creates device nodes in /dev named twe0 to twe15. I just need a quick and dirty solution, so I could simply run smartctl -H -d 3ware,N (with N from 0 to 9) against twe0 to twe15 and check whether $? is 0. Something that might work as well is:

    tw_cli /c0 show | awk '/p[0-9]/{if($2 != "NOT-PRESENT") print $1}' | sed 's_p__g'

This tells me the port numbers where a disc is attached. I’m not sure how well this works with multiple controllers, since /c0 is hardcoded. However, for the usual case this should suffice. So, let’s go for the following:
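
    # start with a real array so each entry stays a single element
    DEVICES=(/dev/sd[a-z])
    if [ -x "/usr/sbin/tw_cli" ]; then
        while read -r PORT; do
            # += appends; rebuilding with ($DEVICES ...) would keep only
            # the first element of the existing array
            DEVICES+=("/dev/twe0 -d 3ware,$PORT")
        done< <(/usr/sbin/tw_cli /c0 show | awk '/p[0-9]/{if($2 != "NOT-PRESENT") print $1}' | sed 's_p__g')
    fi

This results in: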

    root@psv1:~# for i in "${DEVICES[@]}"; do echo $i; done
    /dev/sda
    /dev/twe0 -d 3ware,1

Note the double quotes I’ve put around ${DEVICES[@]}. Without them, -d and 3ware,1 end up on their own lines. Now, let’s adjust our script from above:

    #!/bin/bash
    # start with a real array so each entry stays a single element
    DEVICES=(/dev/sd[a-z])
    if [ -x "/usr/sbin/tw_cli" ]; then
        while read -r PORT; do
            # += appends; rebuilding with ($DEVICES ...) would keep only
            # the first element of the existing array
            DEVICES+=("/dev/twe0 -d 3ware,$PORT")
        done< <(/usr/sbin/tw_cli /c0 show | awk '/p[0-9]/{if($2 != "NOT-PRESENT") print $1}' | sed 's_p__g')
    fi
    for disk in "${DEVICES[@]}"; do
        ret=0
        echo "~~ checking ${disk} ~~"
        # ${disk} is left unquoted on purpose so that "-d 3ware,N"
        # reaches smartctl as separate arguments
        health=$(smartctl -H ${disk} | awk '/result: /{print $6}')
        if [ "$health" != "PASSED" ]; then
            echo "Check the disc, it failed the overall smart health check..."
            ret=1
        fi
        # walk the attribute table, one attribute per line
        while read -r line; do
            # piping through bc turns numbers such as 005 into 5
            id=$(echo $line | awk '{print $1}' | bc)
            title=$(echo $line | awk '{print $2}')
            thresh=$(echo $line | awk '{print $6}' | bc)
            worst=$(echo $line | awk '{print $5}' | bc)
            value=$(echo $line | awk '{print $4}' | bc)
            raw=$(echo $line | awk '{print $10}' | bc)
            if [ $value -lt $thresh ]; then
                echo "$title: value($value) is less than thresh($thresh)"
                ret=1
            fi
            # error counters that should stay at zero
            if [ $id -eq 5 ] || [ $id -eq 183 ] || [ $id -eq 187 ] || [ $id -eq 197 ] || [ $id -eq 198 ]; then
                if [ $raw -gt 0 ]; then
                    echo "$title: raw value($value) is greater than zero"
                    ret=1
                fi
            fi
            # power-on hours
            if [ $id -eq 9 ]; then
                years=$(echo "scale=0; $raw / 24 / 365" | bc)
                if [ $years -ge 4 ]; then
                    echo "$title: disk is older($years) than 4 years"
                    ret=1
                fi
            fi
            # temperature
            if [ $id -eq 194 ]; then
                if [ $raw -ge 45 ]; then
                    echo "$title: disk is hotter($raw°C) than 45°C"
                    ret=1
                elif [ $raw -le 25 ]; then
                    echo "$title: disk is colder($raw°C) than 25°C"
                    ret=1
                fi
            fi
        done< <(smartctl -A ${disk} | tail -n +8 | head -n -1)
        if [ $ret -eq 1 ]; then
            echo -e " - \e[91mcheck ${disk} manually and monitor it closely.\e[39m"
        else
            echo -e " + \e[92meverything is fine with ${disk}\e[39m"
        fi
    done

And a test run…

    root@psv1:~# ./test.sh
    ~~ checking /dev/sda ~~
    Check the disc, it failed the overall smart health check...
     - check /dev/sda manually and monitor it closely.
    ~~ checking /dev/twe0 -d 3ware,1 ~~
    Power_On_Hours: disk is older(8) than 4 years
    Reported_Uncorrect: raw value(100) is greater than zero
    Temperature_Celsius: disk is colder(23°C) than 25°C
    Current_Pending_Sector: raw value(95) is greater than zero
     - check /dev/twe0 -d 3ware,1 manually and monitor it closely.

sda fails because that’s the drive exported by the RAID controller. That is fine and can be ignored. Also note that the number in parentheses after “raw value” is the normalized value, since that message prints $value rather than $raw.
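The quoting behaviour around ${DEVICES[@]} noted earlier can be checked without any disks attached. The following is only a minimal sketch: the device list is hard-coded for illustration, and count_args is a hypothetical stand-in for smartctl that is not part of the script. It shows why the array is expanded in quotes in the for loop, while ${disk} is deliberately left unquoted when it is handed to the command.

```shell
#!/bin/bash
# Hypothetical device list, shaped like the one the script builds.
DEVICES=("/dev/sda" "/dev/twe0 -d 3ware,1")

# Quoted expansion: one loop iteration per array element.
quoted=()
for d in "${DEVICES[@]}"; do
    quoted+=("$d")
done

# Unquoted expansion: bash splits every element on whitespace.
unquoted=()
for d in ${DEVICES[@]}; do
    unquoted+=("$d")
done

echo "quoted:   ${#quoted[@]} entries"
echo "unquoted: ${#unquoted[@]} entries"

# count_args is a hypothetical stand-in for smartctl;
# it only reports how many arguments it received.
count_args() { echo "$#"; }

# Unquoted, the entry splits into device, -d and 3ware,1.
# Quoted, it would arrive as one single argument.
disk="${DEVICES[1]}"
split_count=$(count_args ${disk})
whole_count=$(count_args "${disk}")
echo "unquoted call: ${split_count} args, quoted call: ${whole_count} arg"
```

So the array keeps “/dev/twe0 -d 3ware,1” together as one entry, and the split into separate arguments only happens at the point where the command is called.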