SMART data collection broken for one of my drives (SAMSUNG SpinPoint F2) #657

Closed
therealprof opened this issue May 22, 2015 · 5 comments

Comments

@therealprof

My system has three drives. SMART data collection is broken for one of them, while the other two work just fine:

Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 40, in _handle_exception
    yield
  File "/opt/rockstor/src/rockstor/storageadmin/views/disk_smart.py", line 114, in post
    return self._info(disk)
  File "/opt/rockstor/eggs/Django-1.6.2-py2.7.egg/django/db/transaction.py", line 339, in inner
    return func(*args, **kwargs)
  File "/opt/rockstor/src/rockstor/storageadmin/views/disk_smart.py", line 68, in _info
    attributes = smart.extended_info(disk.name)
  File "/opt/rockstor/src/rockstor/system/smart.py", line 48, in extended_info
    o, e, rc = run_command([SMART, '-a', '/dev/%s' % device])
  File "/opt/rockstor/src/rockstor/system/osi.py", line 78, in run_command
    raise CommandException(out, err, rc)
CommandException: 'smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.0.2-1.el7.elrepo.x86_64] (local build) Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org  === START OF INFORMATION SECTION === Model Family:     SAMSUNG SpinPoint F2 EG Device Model:     SAMSUNG HD154UI Serial Number:    S1XWJ1LSB03973 LU WWN Device Id: 5 0024e9 002690eb8 Firmware Version: 1AG01118 User Capacity:    1,500,301,910,016 bytes [1.50 TB] Sector Size:      512 bytes logical/physical Device is:        In smartctl database [for details use: -P show] ATA Version is:   ATA/ATAPI-7, ATA8-ACS T13/1699-D revision 3b Local Time is:    Fri May 22 03:25:41 2015 CDT SMART support is: Available - device has SMART capability. SMART support is: Enabled  === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED  General SMART Values: Offline data collection status:  (0x00)\tOffline data collection activity \t\t\t\t\twas never started. \t\t\t\t\tAuto Offline Data Collection: Disabled. Self-test execution status:      (  25)\tThe self-test routine was aborted by \t\t\t\t\tthe host. Total time to complete Offline  data collection: \t\t(19704) seconds. Offline data collection capabilities: \t\t\t (0x7f) SMART execute Offline immediate. \t\t\t\t\tAuto Offline data collection on/off support. \t\t\t\t\tAbort Offline collection upon new \t\t\t\t\tcommand. \t\t\t\t\tOffline surface scan supported. \t\t\t\t\tSelf-test supported. \t\t\t\t\tConveyance Self-test supported. \t\t\t\t\tSelective Self-test supported. SMART capabilities:            (0x0003)\tSaves SMART data before entering \t\t\t\t\tpower-saving mode. \t\t\t\t\tSupports SMART auto save timer. Error logging capability:        (0x01)\tError logging supported. \t\t\t\t\tGeneral Purpose Logging supported. Short self-test routine  recommended polling time: \t (   2) minutes. Extended self-test routine recommended polling time: \t ( 329) minutes. 
Conveyance self-test routine recommended polling time: \t (  34) minutes. SCT capabilities: \t       (0x003f)\tSCT Status supported. \t\t\t\t\tSCT Error Recovery Control supported. \t\t\t\t\tSCT Feature Control supported. \t\t\t\t\tSCT Data Table supported.  SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE   1 Raw_Read_Error_Rate     0x000f   253   253   051    Pre-fail  Always       -       1   3 Spin_Up_Time            0x0007   073   073   011    Pre-fail  Always       -       8970   4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       300   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0   7 Seek_Error_Rate         0x000f   253   253   051    Pre-fail  Always       -       0   8 Seek_Time_Performance   0x0025   100   100   015    Pre-fail  Offline      -       12876   9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       751  10 Spin_Retry_Count        0x0033   100   100   051    Pre-fail  Always       -       0  11 Calibration_Retry_Count 0x0012   100   100   000    Old_age   Always       -       0  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       83  13 Read_Soft_Error_Rate    0x000e   253   253   000    Old_age   Always       -       1 183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0 184 End-to-End_Error        0x0033   100   100   000    Pre-fail  Always       -       0 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       1 188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 190 Airflow_Temperature_Cel 0x0022   082   078   000    Old_age   Always       -       18 (Min/Max 12/18) 194 Temperature_Celsius     0x0022   079   076   000    Old_age   Always       -       
21 (Min/Max 12/21) 195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       14283 196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1 198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0 199 UDMA_CRC_Error_Count    0x003e   253   253   000    Old_age   Always       -       0 200 Multi_Zone_Error_Rate   0x000a   253   253   000    Old_age   Always       -       0 201 Soft_Read_Error_Rate    0x000a   253   253   000    Old_age   Always       -       0  SMART Error Log Version: 1 No Errors Logged  SMART Self-test log structure revision number 1 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error # 1  Extended captive    Aborted by host               90%       747         - # 2  Extended offline    Aborted by host               90%       747         - # 3  Extended offline    Completed: read failure       90%       746         2111383925  SMART Selective self-test log data structure revision number 1  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS     1        0        0  Not_testing     2        0        0  Not_testing     3        0        0  Not_testing     4        0        0  Not_testing     5        0        0  Not_testing Selective self-test flags (0x0):   After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay.   '
@phillxnet
Member

A forum member, KarstenV, has also reported SMART info errors:
http://forum.rockstor.com/t/question-about-rockstor-install-with-issues-im-seing/204
In their case, 3 out of 4 drives cause the SMART system to throw errors.

@phillxnet
Member

The linked forum report lists errors with the following drives:

Western Digital Blue WD5000AAKS 500GB
Western Digital Caviar GP WD5000AACS 500GB
Maxtor DiamondMax 10 6B200M0 200GB

However, another, older Western Digital Blue WD5000AAKS 500GB that they have is reported and parsed correctly.

@phillxnet
Member

Thanks to the forum poster KarstenV linked above, we now have in hand both a working output and an output that upsets Rockstor, taken from two identical drives. I have also found an old drive of mine that triggers this behaviour, an ST3160021AS.
To reproduce: go to the Disks page, click on the drive name, then click the Update button in the SMART window. The result is a whole load of red text, as in the forum post.
I can't look into this at the moment; I'm just noting that we have another example to hand.

schakrava added this to the Toulumne milestone Aug 15, 2015
@phillxnet
Member

I will try and have a look at this soon.

@phillxnet
Member

I am looking at this again today, no promises though.

phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Oct 12, 2015
…kstor#657

This is at least better than a wall of red text and in tests so far produces
viable results, but we now need to deal with the return code in an informative
manner.
phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Oct 13, 2015
…kstor#657

This is at least better than a wall of red text and in tests so far produces
viable results, but we now need to deal with the return code in an informative
manner.
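
One way to make the return code informative is to decode it. The smartctl(8) man page documents the exit status as a bit mask, so a non-zero value does not necessarily mean the command failed to produce output. The sketch below is only an illustration and is not the code from the commits referenced here; the names `SMARTCTL_STATUS_BITS` and `decode_smartctl_status` are hypothetical, and the messages are paraphrased from the man page.

```python
# Meaning of each bit in smartctl's exit status, paraphrased from the
# "EXIT STATUS" section of the smartctl(8) man page.
SMARTCTL_STATUS_BITS = {
    0: "command line did not parse",
    1: "device open failed or device did not identify itself",
    2: "a SMART or ATA command failed, or a checksum error was found",
    3: "SMART status check returned 'DISK FAILING'",
    4: "prefail attributes found at or below threshold",
    5: "attributes have been at or below threshold in the past",
    6: "device error log contains records of errors",
    7: "device self-test log contains records of errors",
}


def decode_smartctl_status(rc):
    """Return one message per bit set in rc (hypothetical helper)."""
    return [
        message
        for bit, message in sorted(SMARTCTL_STATUS_BITS.items())
        if rc & (1 << bit)
    ]


if __name__ == "__main__":
    # 64 is bit 6 on its own; 192 is bits 6 and 7 together.
    for code in (0, 64, 192):
        print(code, decode_smartctl_status(code))
```

With a decoder like this, the caller can show the SMART output it did receive and list the reasons for the non-zero code next to it, instead of raising on any non-zero value.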
phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Oct 14, 2015
A function to email the root user with a message; used to access
the email notification system as all root's email is forwarded to
whatever external address is defined in email notifications. Should
work without email notifications enabled as root always exists.
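
A minimal sketch of what such a function could look like follows. It is not the function added by the commit: the names `build_root_message` and `email_root` are hypothetical, and it assumes a local SMTP server is listening on localhost port 25 and forwards root's mail as described.

```python
import smtplib
from email.message import EmailMessage


def build_root_message(subject, body, sender="root@localhost"):
    """Build a plain-text message addressed to the local root user."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "root@localhost"
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def email_root(subject, body, host="localhost", port=25):
    """Hand the message to a local SMTP server.

    Assumption: an MTA is listening on host:port and delivers (or
    forwards) mail for root. Nothing here configures that forwarding.
    """
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(build_root_message(subject, body))
```

Building the message separately from sending it keeps the formatting testable on a machine that has no mail server running.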
phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Oct 14, 2015
the command noted returns a code of 64 when producing drive error logs
successfully. As this error report is otherwise muted we notify root via
email so as not to bury the error.
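
The approach in this commit message can be sketched roughly as below: accept a known non-zero return code, keep the output, and report the condition through a callback rather than dropping it. This is a hedged illustration, not Rockstor's `run_command`; `run_tolerant`, `ok_codes` and `notify` are hypothetical names, and the only value taken from the thread is the return code 64.

```python
import subprocess
import sys

SMART_ERROR_LOG_RC = 64  # return code named in the commit message above


def run_tolerant(cmd, ok_codes=(0, SMART_ERROR_LOG_RC), notify=None):
    """Run cmd and treat any code in ok_codes as success (hypothetical).

    notify, if given, is called with (cmd, rc) for every accepted
    non-zero code so the condition is reported rather than hidden.
    """
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if proc.returncode not in ok_codes:
        raise RuntimeError(
            "%r failed with rc=%d: %s" % (cmd, proc.returncode, proc.stderr)
        )
    if proc.returncode != 0 and notify is not None:
        notify(cmd, proc.returncode)
    return proc.stdout.splitlines(), proc.stderr.splitlines(), proc.returncode


if __name__ == "__main__":
    # Stand-in for smartctl: prints one line, then exits with status 64.
    fake = [sys.executable, "-c", "print('error log'); raise SystemExit(64)"]
    out, err, rc = run_tolerant(fake, notify=lambda c, r: print("notify:", r))
    print(out, rc)
```

The `notify` callback is where a message to root could be sent, so the caller still gets parseable output while the drive's logged errors are not silently ignored.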
phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Oct 14, 2015
phillxnet added a commit to phillxnet/rockstor-core that referenced this issue Oct 14, 2015