© 2025 Direct Cursus Technology L.L.C.

Analysis of BareMetal server disk status

Written by
Yandex Cloud
Updated on June 19, 2025

If you encounter disk read/write errors or failures of a disk or RAID array while using a BareMetal server, you can run server diagnostics to identify the source of the problem and generate a report for support.

Disk errors are analyzed using SMART, a technology for disk self-diagnostics. The HW Watcher utility collects and processes the data and generates the report. You can only use HW Watcher on Linux servers.

Information on server disk status is saved in the report’s drive directory, and reports for each of the server’s disks are saved in separate files. A report on the disk’s SMART attribute values is formatted as a table:

HDDs
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   083   063   044    -    203094696
  3 Spin_Up_Time            PO----   093   093   000    -    0
  4 Start_Stop_Count        -O--CK   100   100   020    -    224
  5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
  7 Seek_Error_Rate         POSR--   084   060   030    -    293695131
  9 Power_On_Hours          -O--CK   074   011   000    -    23513
 10 Spin_Retry_Count        PO--C-   100   100   097    -    0
 12 Power_Cycle_Count       -O--CK   100   100   020    -    225
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   100   100   000    -    0
188 Command_Timeout         -O--CK   100   099   000    -    65537
189 High_Fly_Writes         -O-RCK   093   093   000    -    7
190 Airflow_Temperature_Cel -O---K   068   051   045    -    32 (Min/Max 31/32)
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    0
192 Power-Off_Retract_Count -O--CK   100   100   000    -    187
193 Load_Cycle_Count        -O--CK   100   100   000    -    1816
194 Temperature_Celsius     -O---K   032   049   000    -    32 (0 18 0 0 0)
195 Hardware_ECC_Recovered  -O-RC-   023   003   000    -    203094696
197 Current_Pending_Sector  -O--C-   100   100   000    -    0
198 Offline_Uncorrectable   ----C-   100   100   000    -    0
199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0

Where:

  • ID#: Attribute ID.

  • ATTRIBUTE_NAME: Attribute name.

    • Raw_Read_Error_Rate: Frequency of errors caused by the disk’s hardware when reading data.
    • Spin_Up_Time: Disk spin-up time from an idle state to an operational speed. It increases as the disk’s mechanical parts wear out or may indicate problems with the disk’s power supply.
    • Start_Stop_Count: Total number of disk start/stop cycles.
    • Reallocated_Sector_Ct: Total number of sectors with read/write errors reallocated to the reserve area.
    • Seek_Error_Rate: Frequency of magnetic head positioning errors. The more errors there are, the worse the disk's condition. Overheating and external vibrations may affect this parameter.
    • Power_On_Hours: Total number of disk power-on hours.
    • Spin_Retry_Count: Total number of retry attempts to spin up the disk to its operational speed in cases when the previous attempt failed. If this attribute’s value increases, there are likely to be problems with the disk’s mechanical parts.
    • Power_Cycle_Count: Total number of disk power cycles.
    • End-to-End_Error: Total number of errors caused by a mismatch between the host and disk parity data transferred through the cache.
    • Reported_Uncorrect: Total number of errors that could not be recovered using hardware error correction mechanisms.
    • Command_Timeout: Total number of operations interrupted by the disk timeout.
    • High_Fly_Writes: Total number of write operations during which the head was detected flying higher above the disk surface than the calculated range allows.
    • Airflow_Temperature_Cel: Air temperature inside the disk case.
    • G-Sense_Error_Rate: Total number of errors caused by impact loads.
    • Power-Off_Retract_Count: Total number of disk emergency shutdown or power failure cycles.
    • Load_Cycle_Count: Total number of cycles when the magnetic head was moved to the parking position.
    • Temperature_Celsius: Disk temperature.
    • Hardware_ECC_Recovered: Total number of times the disk controller has corrected ECC errors.
    • Current_Pending_Sector: Total number of suspicious sectors, i.e., sectors that are not yet marked as bad but whose read behavior deviates from that of stable sectors. If such a sector is read successfully on a later attempt, it is no longer counted as suspicious. If read errors persist, the disk will attempt to restore the sector by reallocating it.
    • Offline_Uncorrectable: Total number of suspicious (Current_Pending_Sector) sectors the disk could not restore.
    • UDMA_CRC_Error_Count: Total number of data transmission errors on the external interface in UltraDMA mode, e.g., packet integrity errors.
  • FLAGS: Attribute flags set by the disk manufacturer characterizing the attribute type:

    • P (prefailure warning): When these attributes reach their thresholds, the disk needs to be replaced.
    • O (updated online): These attributes are updated both during normal disk operation (online) and during offline testing.
    • S (speed/performance): These attributes characterize disk performance.
    • R (error rate): These attributes reflect disk error counter values.
    • C (event count): These attributes reflect event counter values.
    • K (auto-keep): Auto-keep attributes.
  • VALUE: Current attribute value.

  • WORST: Worst attribute value throughout the disk's lifetime.

  • THRESH: Threshold value of the attribute. If VALUE drops to or below it, the disk is considered to be in critical condition and prone to failure.

  • FAIL: State signaling that the attribute has reached its THRESH value. A dash (-) means it has not.

  • RAW_VALUE: Absolute value of the attribute.
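The flag letters listed above can be decoded programmatically when reading a report row. The following is a minimal sketch, assuming the FLAGS column uses the letters shown in the sample table and a dash for a flag that is not set. The `FLAG_NAMES` mapping and the `decode_flags` helper are illustrative names, not part of HW Watcher.

```python
# Flag letters and their meanings, as listed in the FLAGS description.
FLAG_NAMES = {
    "P": "prefailure warning",
    "O": "updated online",
    "S": "speed/performance",
    "R": "error rate",
    "C": "event count",
    "K": "auto-keep",
}


def decode_flags(flags: str) -> list[str]:
    """Return the meanings of the flags set in a FLAGS value; '-' is skipped."""
    return [FLAG_NAMES[ch] for ch in flags if ch in FLAG_NAMES]


# FLAGS values taken from the sample table rows.
print(decode_flags("POSR--"))  # Raw_Read_Error_Rate
print(decode_flags("-O--CK"))  # Power_On_Hours
```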

If any of the table attributes with the P flag (prefailure warning) has FAILING_NOW in the FAIL field, the disk's service life has expired and you need to replace it.
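The replacement rule above can be checked automatically. The sketch below assumes a report row has the eight whitespace-separated columns shown in the sample table, with RAW_VALUE possibly containing spaces. The `SmartAttr`, `parse_row`, and `failing_prefail_attrs` names are hypothetical helpers, and the failing row is invented purely for illustration.

```python
from typing import NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    flags: str
    value: int
    worst: int
    thresh: int
    fail: str
    raw: str


def parse_row(line: str) -> SmartAttr:
    """Split one table row into its eight columns; RAW_VALUE may contain spaces."""
    attr_id, name, flags, value, worst, thresh, fail, raw = line.split(None, 7)
    return SmartAttr(int(attr_id), name, flags, int(value), int(worst),
                     int(thresh), fail, raw.strip())


def failing_prefail_attrs(rows: list[SmartAttr]) -> list[str]:
    """Names of attributes with the P flag whose FAIL field is FAILING_NOW."""
    return [a.name for a in rows if "P" in a.flags and a.fail == "FAILING_NOW"]


# Two rows copied from the sample HDD table.
report = """\
  5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
190 Airflow_Temperature_Cel -O---K   068   051   045    -    32 (Min/Max 31/32)
"""
# Hypothetical failing row, invented for illustration only (not from a real report).
hypothetical = " 10 Spin_Retry_Count        PO--C-   090   090   097    FAILING_NOW 12"

rows = [parse_row(line) for line in report.splitlines() if line.strip()]
print(failing_prefail_attrs(rows))
print(failing_prefail_attrs(rows + [parse_row(hypothetical)]))
```

An empty list means no prefailure attribute is currently failing. Any name in the list points to a disk that should be replaced according to the rule above.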

SSDs

ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0
  9 Power_On_Hours          -O--CK   086   086   000    -    67710
 12 Power_Cycle_Count       -O--CK   099   099   000    -    108
177 Wear_Leveling_Count     PO--C-   062   062   005    -    1182
179 Used_Rsvd_Blk_Cnt_Tot   PO--C-   100   100   010    -    0
180 Unused_Rsvd_Blk_Cnt_Tot PO--C-   100   100   010    -    17618
181 Program_Fail_Cnt_Total  -O--CK   100   100   000    -    0
182 Erase_Fail_Count_Total  -O--CK   100   100   000    -    0
183 Runtime_Bad_Block       PO--C-   100   100   010    -    0
184 End-to-End_Error        PO--CK   100   100   097    -    0
187 Reported_Uncorrect      -O--CK   100   100   000    -    0
190 Airflow_Temperature_Cel -O--CK   073   049   000    -    27
195 Hardware_ECC_Recovered  -O-RC-   200   200   000    -    0
199 UDMA_CRC_Error_Count    -OSRCK   100   100   000    -    0
202 Unknown_SSD_Attribute   PO--CK   100   100   010    -    0
235 Unknown_Attribute       -O--C-   099   099   000    -    68
241 Total_LBAs_Written      -O--CK   099   099   000    -    2179262941271

Where:

  • ID#: Attribute ID.

  • ATTRIBUTE_NAME: Attribute name.

    • Reallocated_Sector_Ct: Total number of blocks with read/write errors reallocated to the reserve area.
    • Power_On_Hours: Total number of disk power-on hours.
    • Power_Cycle_Count: Total number of disk power cycles.
    • Wear_Leveling_Count: Maximum number of erase operations performed on a single flash memory block.
    • Used_Rsvd_Blk_Cnt_Tot: Total number of used flash memory blocks in the reserve area.
    • Unused_Rsvd_Blk_Cnt_Tot: Total number of available flash memory blocks in the reserve area.
    • Program_Fail_Cnt_Total: Total number of failures when attempting to write data to a flash memory block.
    • Erase_Fail_Count_Total: Total number of failures when attempting to erase data from a flash memory block.
    • Runtime_Bad_Block: Total number of flash memory blocks with unfixable errors detected during the disk’s operation time.
    • End-to-End_Error: Total number of errors caused by a mismatch between the host and disk parity data transferred through the cache.
    • Reported_Uncorrect: Total number of errors that could not be recovered using hardware error correction mechanisms.
    • Airflow_Temperature_Cel: Air temperature inside the disk case.
    • Hardware_ECC_Recovered: Total number of times the disk controller has corrected ECC errors.
    • UDMA_CRC_Error_Count: Total number of data transmission errors on the external interface in UltraDMA mode, e.g., packet integrity errors.
    • Total_LBAs_Written: Total number of data blocks written to the disk over its lifespan.
    • Unknown_SSD_Attribute and Unknown_Attribute: Manufacturer-specific attributes.
  • FLAGS: Attribute flags set by the disk manufacturer characterizing the attribute type:

    • P (prefailure warning): When these attributes reach their thresholds, the disk needs to be replaced.
    • O (updated online): These attributes are updated both during normal disk operation (online) and during offline testing.
    • S (speed/performance): These attributes characterize disk performance.
    • R (error rate): These attributes reflect disk error counter values.
    • C (event count): These attributes reflect event counter values.
    • K (auto-keep): Auto-keep attributes.
  • VALUE: Current attribute value.

  • WORST: Worst attribute value throughout the disk's lifetime.

  • THRESH: Threshold value of the attribute. If VALUE drops to or below it, the disk is considered to be in critical condition and prone to failure.

  • FAIL: State signaling that the attribute has reached its THRESH value. A dash (-) means it has not.

  • RAW_VALUE: Absolute value of the attribute.
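The RAW_VALUE of Total_LBAs_Written is a block count, so it helps to convert it into a data volume when estimating SSD wear. The sketch below assumes one block (LBA) is 512 bytes. This is an assumption, not something the report states, and some disks use a different block size or count in other units, so check the disk's specification. The `lbas_to_terabytes` helper is a hypothetical name.

```python
# Assumption: one LBA is 512 bytes. Verify this against the disk's specification.
ASSUMED_LBA_SIZE_BYTES = 512


def lbas_to_terabytes(total_lbas: int, lba_size: int = ASSUMED_LBA_SIZE_BYTES) -> float:
    """Convert a Total_LBAs_Written raw value into decimal terabytes (10**12 bytes)."""
    return total_lbas * lba_size / 10**12


# RAW_VALUE of Total_LBAs_Written from the sample SSD table.
written_tb = lbas_to_terabytes(2179262941271)
print(f"{written_tb:.1f} TB written")
```

Comparing the result with the endurance rating published by the disk manufacturer gives a rough idea of how much of the disk's write resource has been used.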

If any of the table attributes with the P flag (prefailure warning) has FAILING_NOW in the FAIL field, the disk's service life has expired and you need to replace it.

See also

  • Analyzing the status of BareMetal server disks using HW Watcher
