Thread: 24.04 causes my harddrive to say BRAAAPPP

  1. #1
    Join Date
    May 2018
    Beans
    41

    24.04 causes my harddrive to say BRAAAPPP

    I have a 4TB mechanical 2.5" archive drive (alongside the NVMe OS drive) with an ext4 partition, and it now rasps intermittently, which it did not do with 23.10, 23.04, etc., back to 19.10 or thereabouts.

    Brrpp..........brrpp..........brrpp.......*silence for a long time*...... brrpp.

    Thing is, I have had this once before, on another computer and with a completely different drive. That one was a 3.5" and the sound was really loud and annoying. It began right away, when the drive was new and I had formatted it as ext4, but for some reason it went silent after I reformatted it as NTFS. Incidentally, it died after a year even though none of the S.M.A.R.T. diagnostics showed any error. Then the replacement did the same a year later, and I replaced that with an SSD.

    So naturally I wonder if there is something software-related killing my drives, although I cannot for the life of me understand what or why?

    And IF this is some sort of precursor to failure, why would it appear instantly after upgrading to 24.04? That drive had nothing to do with the installation, I reconnected it later.

    Is there some way to diagnose what causes that sound?

  2. #2
    Join Date
    Mar 2010
    Location
    Squidbilly-Land
    Beans
    Hidden!
    Distro
    Ubuntu

    Re: 24.04 causes my harddrive to say BRAAAPPP

    Use the drive manufacturer's diagnostic tools.

    The only thing I can think of is that your partitions aren't properly aligned on 4K boundaries. That seems to make things hard for HDDs. Also, SMART isn't 100% accurate. I monitor my drives by running short tests weekly and long tests monthly, then saving the reports and looking for changes in them over time. That's really the only way I know to predict a failure. A single test run yearly isn't sufficient to see the changes.
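    To test the alignment guess, here is a minimal Python sketch. It assumes a Linux system where sysfs exposes each partition's first sector, in 512-byte units, at /sys/block/<disk>/<partition>/start. The helper names and the sda/sda1 device names are placeholders for illustration, not anything taken from the original poster's setup.

```python
from pathlib import Path

SYSFS_SECTOR_BYTES = 512  # sysfs reports partition starts in 512-byte units


def is_aligned(start_sector: int, boundary_bytes: int = 4096) -> bool:
    """Return True if the partition's byte offset is a multiple of boundary_bytes."""
    return (start_sector * SYSFS_SECTOR_BYTES) % boundary_bytes == 0


def partition_start(disk: str, partition: str) -> int:
    """Read a partition's starting sector from sysfs (Linux only).

    Example (placeholder names): partition_start("sda", "sda1")
    """
    path = Path(f"/sys/block/{disk}/{partition}/start")
    return int(path.read_text().strip())
```

    A start sector of 2048 (the usual 1 MiB offset) passes the check, while the old DOS-style start at sector 63 does not.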

    For example, the usual indicators in the SMART reports, pending and reallocated sectors, were all fine with the last HDD that failed here. But here's the last report, which I used to get the vendor to approve an RMA:
    Code:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x002f   001   001   051    Pre-fail  Always   FAILING_NOW 87106
      3 Spin_Up_Time            0x0027   161   115   021    Pre-fail  Always       -       10933           
      4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       20              
      5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0               
      7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
      9 Power_On_Hours          0x0032   092   092   000    Old_age   Always       -       6133
     10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
     11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       20
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       14
    193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       7
    194 Temperature_Celsius     0x0022   105   091   000    Old_age   Always       -       47
    196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
    200 Multi_Zone_Error_Rate   0x0008   198   198   000    Old_age   Offline      -       1819
    Nothing in those made me worry about data corruption, at least not when this first started. Over time, though, the picture changed:
    Code:
    $ egrep Raw_Read_Error_Rate smart.202*sda
    smart.2023-10-10.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
    smart.2023-10-17.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
    smart.2023-10-24.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
    smart.2023-10-31.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
    smart.2023-11-07.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1
    smart.2023-11-14.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1
    smart.2023-11-21.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1
    smart.2023-11-28.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1
    smart.2023-12-05.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
    smart.2023-12-12.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
    smart.2023-12-19.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
    smart.2023-12-26.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
    smart.2024-01-02.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       5
    smart.2024-01-09.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0 
    smart.2024-01-16.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       3
    smart.2024-01-23.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
    smart.2024-01-30.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
    smart.2024-02-06.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       5
    smart.2024-02-13.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       2
    smart.2024-02-20.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       2
    smart.2024-02-27.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       2 
    smart.2024-03-05.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       7
    smart.2024-03-12.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       10
    smart.2024-03-19.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       20
    smart.2024-03-26.sda:  1 Raw_Read_Error_Rate     0x002f   199   199   051    Pre-fail  Always       -       61
    smart.2024-04-02.sda:  1 Raw_Read_Error_Rate     0x002f   117   117   051    Pre-fail  Always       -       3184
    smart.2024-04-07.sda:  1 Raw_Read_Error_Rate     0x002f   001   001   051    Pre-fail  Always   FAILING_NOW 87106
    smart.2024-04-09.sda:  1 Raw_Read_Error_Rate     0x002f   001   001   051    Pre-fail  Always   FAILING_NOW 87095
    smart.2024-04-16.sda:  1 Raw_Read_Error_Rate     0x002f   200   001   051    Pre-fail  Always   In_the_past 5
    smart.2024-04-23.sda:  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
    That last line is from a new HDD.
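    Eyeballing a listing like that works, but the same comparison can be scripted. The following is a minimal Python sketch, assuming the lines look exactly like the grep output above (file name, colon, then the ten smartctl attribute columns) and that RAW_VALUE is a plain integer, which is true for this attribute here but not for every attribute on every drive. The names Sample, parse_grep_line and worsening are made up for this example.

```python
from typing import Iterable, List, NamedTuple, Optional


class Sample(NamedTuple):
    source: str       # report file name, e.g. "smart.2023-11-07.sda"
    value: int        # normalized VALUE column
    thresh: int       # THRESH column
    when_failed: str  # "-", "FAILING_NOW" or "In_the_past"
    raw: int          # RAW_VALUE column


def parse_grep_line(line: str) -> Sample:
    """Parse one 'file: attribute row' line as printed by grep over saved reports."""
    source, row = line.split(":", 1)
    fields = row.split()
    # Columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    return Sample(source.strip(), int(fields[3]), int(fields[5]),
                  fields[8], int(fields[9]))


def worsening(samples: Iterable[Sample]) -> List[Sample]:
    """Return samples whose raw count rose, whose VALUE dropped,
    or whose WHEN_FAILED column is not '-'."""
    flagged: List[Sample] = []
    previous: Optional[Sample] = None
    for sample in samples:
        rose = previous is not None and sample.raw > previous.raw
        dropped = previous is not None and sample.value < previous.value
        if rose or dropped or sample.when_failed != "-":
            flagged.append(sample)
        previous = sample
    return flagged
```

    Fed the listing above, this would have flagged the 2023-11-07 report, the first one with a non-zero raw value, months before FAILING_NOW appeared.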

    It kept getting worse and worse until the drive failed. By the time it did fail, it had already been replaced and was sitting in a USB2 dock, being wiped with random data. It has been shipped for RMA.

    Initially, writes became slow, very slow. Then reading the files that had been slow to write was also REALLY slow. Other files were fine. That made me look at the SMART data more carefully, since the HDD was only 8 months old and came new with a 5 yr warranty. I stopped buying HDDs with less than 5 yr warranties about 3-4 yrs ago. Dealing with data issues more than about once a decade is just too much hassle for me. Anyway, since there weren't any reallocated events or pending sectors, I made sure all the data was backed up to other disks, reformatted the drive with a fresh ext4, then moved all the data back. To get the data moved off in the first place, a simple copy was failing, so I used ddrescue on a file-by-file basis. If there were 100 files, then over 99 of them moved quickly, but that last 1% ran overnight.
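    For anyone wanting to repeat the file-by-file ddrescue approach, here is a rough Python sketch of one way to drive it. It is not the exact procedure used above: it assumes GNU ddrescue is installed, handles only the files directly inside one directory (no recursion), and keeps each map file next to the copied file. The function names and the dry_run switch are invented for this example.

```python
import subprocess
from pathlib import Path
from typing import List


def ddrescue_command(src: Path, dst_dir: Path) -> List[str]:
    """Build a 'ddrescue infile outfile mapfile' command for one file."""
    dst = dst_dir / src.name
    mapfile = dst_dir / (src.name + ".map")  # lets an interrupted copy resume
    return ["ddrescue", str(src), str(dst), str(mapfile)]


def rescue_files(src_dir: Path, dst_dir: Path, dry_run: bool = True) -> List[List[str]]:
    """Build one ddrescue command per regular file in src_dir.

    With dry_run=True (the default) nothing is executed; the commands are
    only returned so they can be reviewed first.
    """
    commands = [ddrescue_command(p, dst_dir)
                for p in sorted(src_dir.iterdir()) if p.is_file()]
    if not dry_run:
        dst_dir.mkdir(parents=True, exist_ok=True)
        for cmd in commands:
            # check=False: one unreadable file should not abort the rest
            subprocess.run(cmd, check=False)
    return commands
```

    Running each file as its own ddrescue job means the healthy files finish quickly and only the damaged ones sit retrying, which matches the "99 fast, 1 overnight" behavior described above.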

    I also was monitoring the drive temperature. It was warm, but not hot.

    BTW, I really do run those SMART tests weekly:
    Code:
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Extended offline    Completed without error       00%      6013         -
    # 2  Short offline       Completed without error       00%      5832         -
    # 3  Short offline       Completed without error       00%      5664         -
    # 4  Short offline       Completed without error       00%      5497         -
    # 5  Extended offline    Completed without error       00%      5342         -
    # 6  Short offline       Completed without error       00%      5162         -
    # 7  Short offline       Completed without error       00%      4995         -
    # 8  Short offline       Completed without error       00%      4827         -
    # 9  Extended offline    Completed without error       00%      4670         -
    #10  Short offline       Completed without error       00%      4491         -
    #11  Short offline       Completed without error       00%      4323         -
    #12  Short offline       Completed without error       00%      4155         -
    #13  Short offline       Completed without error       00%      3988         -
    #14  Extended offline    Completed without error       00%      3832         -
    #15  Short offline       Completed without error       00%      3652         -
    #16  Short offline       Completed without error       00%      3484         -
    #17  Short offline       Completed without error       00%      3316         -
    #18  Extended offline    Completed without error       00%      3160         -
    #19  Short offline       Completed without error       00%      2981         -
    #20  Short offline       Completed without error       00%      2813         -
    #21  Short offline       Completed without error       00%      2645         -
    That's a long test the first Monday of every month and short tests every other Monday.
    See how looking at the data over time let me be proactive? In the end, I didn't lose any data, even though 1 new file was inaccessible when the problem first began.
    I should also mention that the disk was for scratch use, not archival, so I didn't have great daily backups of it like I do for all my other data. Most of the data was being migrated from an old RAID setup to this drive and I just got bogged down. I didn't delete the RAID data, which is why almost nothing was lost that wasn't in the "scratch" area.
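    The Monday schedule and the dated report files are easy to automate. Below is a minimal Python sketch of the decision logic only, meant to be called from a daily cron job or similar. It assumes the smartctl options -t short and -t long and the smart.<date>.<disk> naming seen in the listing earlier; the function names and the /dev/sda default are placeholders.

```python
import datetime
from typing import List, Optional


def smart_test_for(day: datetime.date) -> Optional[str]:
    """Return 'long' on the first Monday of a month, 'short' on any
    other Monday, and None on every other day."""
    if day.weekday() != 0:  # Monday is 0
        return None
    return "long" if day.day <= 7 else "short"


def smartctl_command(day: datetime.date, device: str = "/dev/sda") -> Optional[List[str]]:
    """Build the smartctl self-test command for the given day, if any."""
    kind = smart_test_for(day)
    return None if kind is None else ["smartctl", "-t", kind, device]


def report_name(day: datetime.date, disk: str = "sda") -> str:
    """File name for that day's saved attribute report."""
    return f"smart.{day.isoformat()}.{disk}"
```

    The commands are only built here, not run; actually starting a self-test needs root and a real device. smartd can also schedule tests itself through the -s directive in smartd.conf, which may be simpler than a custom script.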

    A drive making noise is never good. Start looking more closely at the SMART reports and testing weekly. You may not know what the problem is until it becomes an emergency, so you need to be prepared. If it were me, I'd move a noisy disk that still worked to backup duty and put in a new, quiet disk as the primary.

  3. #3
    Join Date
    May 2018
    Beans
    41

    Re: 24.04 causes my harddrive to say BRAAAPPP

    Okay. I will dump this thing and get a new one.

    The thing is, it does show suspicious slowdowns at times, aside from the weird sound.

    Thanks!

  4. #4
    Join Date
    Mar 2010
    Location
    USA
    Beans
    Hidden!
    Distro
    Ubuntu Development Release

    Re: 24.04 causes my harddrive to say BRAAAPPP

    Your description of that sound makes me 'cringe'. It reminds me of a sound I have heard from time to time over the years on failing drives, like when a magnetic HDD has a seek error and the drive heads vibrate while trying to correct themselves.

    RE: https://www.lacie.com/support/kb/ide...hat-they-mean/
