bug

Discuss development on Bubba

Post by Rawhead » 01 Feb 2008, 12:31

Does Bubba have this bug?
Bug description

You can scan through the bug for links to the Ubuntu forums where many, many different questions have been asked, answered, and re-answered. The temporary workaround is just below, but you may need to use '254', or a bit lower, as opposed to '255'. If HD temperature gets high, you may want to set it all the way "down" to 200 or so. ~1 click every 2.5-3 minutes is fine.

Temporary workaround: https://bugs.launchpad.net/ubuntu/+sour ... omments/14
A more extensive description of the workaround: http://ubuntuforums.org/showthread.php?t=591503
Note: Some disks are unresponsive to having their APM changed by hdparm, and therefore the workaround doesn't work. It would be a good idea, in such cases, to disable APM in the BIOS if possible.
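
As a rough illustration of the workaround described above, the sketch below only builds the hdparm command line for a given APM level; it does not run anything. It assumes hdparm's -B option (APM level 1-255, where 255 turns APM off on drives that support it). The helper name apm_command and the device /dev/sda are placeholders, not something taken from the linked comment.

```python
def apm_command(device, level=254):
    """Build (but do not run) an hdparm command that sets the APM level.

    Assumption: hdparm -B accepts 1-255; 254 is the least aggressive
    setting that still leaves APM on, 255 disables APM where supported.
    """
    if not 1 <= level <= 255:
        raise ValueError("APM level must be between 1 and 255")
    return ["hdparm", "-B", str(level), device]


# /dev/sda is a placeholder; substitute your own disk device.
print(" ".join(apm_command("/dev/sda")))
```

The resulting command has to be run as root, and as the note above says, some disks ignore it entirely.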

Following is a summary of the issue:
It is confirmed that some systems are seeing an unusually high number of load/unload cycles on their hard disks, as evidenced by smartctl. It was originally surmised that this was related to laptop-mode being enabled, but this affects systems *regardless* of whether or not laptop-mode has been enabled. In fact, aggressive APM is not a bad idea while a system is not on AC, as that system is much more likely to encounter a physical impact. But unfortunately, the heads are only parked for a very short period of time, making impact protection much less effective (and wearing out the drive as well).

This problem has been confirmed in Ubuntu as well as in other distributions.

Symptoms of this bug are:
* Frequent HD clicks -- more than one per 3 minutes while idle, louder than the typical access sounds. Often more than twice per minute. On some disks, the click is very quiet.
* Rapidly increasing Load_Cycle_Count as displayed in the final number in "sudo smartctl -a /dev/hda | grep Load_Cycle_Count" (where /dev/hda is replaced with your own hard disk device)
* Early hard disk failure -- the heads never stay parked for long, due to very frequent disk activity. The load/unload cycle therefore occurs often, wearing out the drive, and any comparative benefit is negligible. Some disks are cut down to less than a year of actual uptime.

The problem is only present due to the existence of *all three* of the following factors:
* Hardware is set (default or otherwise) to aggressive power management, causing heads to park. (default behaviour of many drives)
* Disk is touched often, causing heads to unpark. (default behaviour of many distributions)
* Drives are spec'd to a limited number of these cycles. (600,000 is the most common, although some may be spec'd higher or lower).

Reasonable Limits / Criteria for a fix:
* There should be fewer than ~15 load cycles per hour, except during heavy usage while on battery.
* This provides a life expectancy of over four years, which is reasonable for a hard disk.
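
The four-year figure can be checked with a little arithmetic. This is only a sketch: it assumes the 600,000-cycle rating mentioned above, counts powered-on hours only, and the function name is made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365.25


def years_to_rated_limit(cycles_per_hour, rated_cycles=600_000):
    """Years of continuous uptime before the rated load-cycle count is reached."""
    return rated_cycles / cycles_per_hour / HOURS_PER_YEAR


# 15/hour is the proposed limit; 44 and 70 echo the example counts in this post.
for rate in (15, 44, 70):
    print(f"{rate:>3} cycles/hour -> {years_to_rated_limit(rate):.1f} years")
```

At 15 cycles per hour this comes out at roughly 4.6 years of uptime; at 70 per hour it is about one year.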

Temporary Workaround:
* Follow the above link.

Some hardware with this issue:
WD1200VE -- http://www.wdc.com/en/library/portable/2879-001121.pdf -- This aggressive parking is a feature of this disk, but that feature relies on behaviour that allows for significant amounts of (truly) idle time without the disk being touched. Notice the "Load/unload cycles" of 600,000.

Example Load_Cycle_Counts:
* Thinkpad Z60m/Hitachi HTS541080G9SA00 with well over 7000 load cycles in only 100 hours. That's >70 per hour.
* Gateway MT6451/Western Digital WD1200VE with 164762 load cycles in 3747 hours (156 days) of uptime. That's ~44 per hour -- except that the system was patched during the initial third of its life, which puts it at ~63/hour since Gutsy was installed (and left unpatched, unlike Feisty, which I had patched).

Please see for yourself how often your drive is load cycling:
smartctl -d ata -a /dev/sda
(This command is for an SATA drive; you'll need to install the smartmontools package first.)

You can get the average per hour by the following division:
Load_Cycle_Count / Power_On_Hours
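
If you would rather not do the division by hand, something like the following could pull both numbers out of saved smartctl output. It is a sketch under assumptions: it expects the usual ten-column attribute table with the raw value in the last column, the SAMPLE text is a hypothetical excerpt (only the two raw values come from the Gateway example above), and the function names are invented here.

```python
import re


def raw_value(smart_output, attribute):
    """Return the leading integer of an attribute's raw value, or None."""
    for line in smart_output.splitlines():
        fields = line.split()
        # Assumed row layout:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[1] == attribute:
            match = re.match(r"\d+", fields[9])
            return int(match.group()) if match else None
    return None


def load_cycles_per_hour(smart_output):
    """Load_Cycle_Count / Power_On_Hours, or None if either is missing or zero hours."""
    cycles = raw_value(smart_output, "Load_Cycle_Count")
    hours = raw_value(smart_output, "Power_On_Hours")
    if cycles is None or not hours:
        return None
    return cycles / hours


# Hypothetical excerpt; only the raw values are taken from the post.
SAMPLE = """\
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       3747
193 Load_Cycle_Count        0x0032   146   146   000    Old_age   Always       -       164762
"""

print(f"{load_cycles_per_hour(SAMPLE):.1f} load cycles per hour")
```

Save the output of the smartctl command above to a file and pass its contents in place of SAMPLE. Some drives report the raw values in a different format, in which case the parsing would need adjusting.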

See also http://paul.luon.net/journal/hacking/BrokenHDDs.html for a rather dramatic account of the effects the current default values may have.
https://bugs.launchpad.net/debian/+sour ... viewstatus
