Bug 386555

Summary: High number of hard drive load cycles on notebooks
Product: [openSUSE] openSUSE 11.0 Reporter: Alberto Passalacqua <alberto.passalacqua>
Component: Kernel Assignee: Tejun Heo <teheo>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P5 - None CC: aj, ciaran.farrell, cmoess, coolo, elchevive68, forgotten_qMyteedNxa, forgotten_sLJ7K2dvxj, noiano, pacho, pearson45j, sbrabec, sontek, sshaw, wm
Version: Beta 2   
Target Milestone: ---   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: storage-fixup script
storage-fixup slightly updated.
storage-fixup.conf
storage-fixup.conf
dmidecode
hdparm -I
storage-fixup.conf
storage-fixup
storage-fixup
storage-fixup.conf
patch against OBS package (rev. 9)

Description Alberto Passalacqua 2008-05-05 03:03:42 UTC
Following what I read about some users' experiences with the latest releases of Fedora and Ubuntu, where certain laptops produce "clicks" and high load cycle counts (see https://bugs.launchpad.net/ubuntu/+source/acpi-support/+bug/59695 for example), I asked for some testing in Beta 2.

One hundred load cycles per hour were reported on a Lenovo R61 using openSUSE 11 beta 2, as you can read here: http://lists.opensuse.org/opensuse-factory/2008-05/msg00066.html

I think further investigation is required, as this issue is quite serious considering its impact on disk life.

Regards,
Alberto
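To put the reported rate in perspective, here is a rough back-of-the-envelope sketch. The 600,000-cycle rating is an assumption (a figure often quoted for notebook drives, not something stated in this report); the datasheet of the actual drive should be checked.

```python
# Rough estimate of how long a drive lasts at a given load-cycle rate.
# ASSUMPTION: rated_cycles=600_000 is a typical datasheet figure for
# notebook drives; it is not taken from this bug report.

def hours_until_rated_limit(cycles_per_hour, rated_cycles=600_000):
    """Return powered-on hours until the rated load-cycle count is reached."""
    if cycles_per_hour <= 0:
        raise ValueError("cycles_per_hour must be positive")
    return rated_cycles / cycles_per_hour


if __name__ == "__main__":
    hours = hours_until_rated_limit(100)  # rate reported for the Lenovo R61
    print(f"{hours:.0f} powered-on hours, about {hours / 24:.0f} days of continuous use")
```

Under that assumed rating, 100 cycles per hour uses up the budget after 6000 powered-on hours.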
Comment 1 Tejun Heo 2008-05-06 02:41:24 UTC
Aieeeeeeeee.......... This is nasty.  The BIOS / drive vendors are setting APM to aggressive values with the idle IO access pattern of Windows in mind.  The setting is so aggressive that it is fragile: a different idle IO pattern causes the drive to unload like crazy, and you can't really expect Linux or any other operating system to have the same idle IO pattern as Windows.  :-(

The interesting part is that such aggressive settings can cause crazy unloads even on Windows.  My friend's tp tablet does crazy unloadings under Windows but not under SL103.  Maybe the antivirus program my friend is running makes the idle IO pattern different or something.

So, this basically is the vendor excessively fine-tuning their hardware to achieve a slightly better battery benchmark, and the 'excessive' part, unsurprisingly, breaks down when the circumstances change.

This is tough to solve.  The ATA APM feature set is vague about which values mean what and, as expected, not all drive manufacturers follow even the vague outlines.  Vendors tuning their hardware 'excessively' know which drives they're going to use and can determine their settings.  Value 255 is supposed to disable the feature but apparently it just wraps around to 127/8 on certain drives.  Value 254 means maximum performance but we're never sure what it will do on odd ones, and there are drives which seemingly overheat if APM is set to 255 or 254 - these drives use APM to control other aspects of the drive in addition to head loading/unloading, be it drive RPM, electric circuit power mode or whatever, which is legit as the ATA spec doesn't dictate how the power should be saved when APM is configured.

So, there's no default golden value we can use to solve it.  One way to solve it is to issue IO every 1 sec if no actual IO is in progress and idle timeout hasn't expired, which is a seriously ugly solution and will require considerable time and effort to get right.  Another more realistic approach is to develop a blacklist of machines / drives which have this problem and turn it off during boot or resume.  Hmmm... if the BIOS does it right and implements proper SATA ACPI and turns on APM via _GTF after hardresets, the blacklist somehow needs to be invoked after each reset or the driver should remember the setting and restore it.  Eeeewwww........

BIOS shouldn't configure it w/ such fragile values.  Please take a look at the following wiki page for more information.

  http://www.thinkwiki.org/wiki/Problem_with_hard_drive_clicking

Ideally, this should be solved by updating the firmware or disabling APM from the BIOS, but this problem will kill drives pretty fast and we'd better have some kind of workaround, although I can't think of a good one at this point.

Alberto, can you please cc other people who are seeing this problem here?  Thanks.
Comment 3 Alberto Passalacqua 2008-05-06 03:20:21 UTC
Added three users to CC. They discussed the issue with me on the Factory ML.

Regards,
Alberto
Comment 4 Alberto Passalacqua 2008-05-06 03:23:16 UTC
Changing status to opened.
Comment 5 Pavel Machek 2008-05-06 08:36:16 UTC
I'd say you have buggy hardware, too bad. Report it to the system vendor.

(Plus, we should be careful not to certify such broken machines).

A blacklist is certainly a good solution for now... but I don't think it is a "major" problem, at least it is not a _Linux_ major problem.

The long-term solution would be to generate less disk activity by default. Maybe 5-second commits on ext3 are not such a good idea? ...It would also allow longer spindowns and save power...



Comment 6 Tejun Heo 2008-05-06 08:46:55 UTC
Pavel, I pretty much agree with you here, but this can actually kill disks, which can make people genuinely miserable.  I think adding a minimal workaround is a good idea.  Say, invoking hdparm -B with the correct value on certain configurations during boot and resume.  Where would be the correct place to do that?
Comment 7 Alberto Passalacqua 2008-05-06 13:13:42 UTC
I agree it depends on the hardware configuration, but saying it's not a Linux problem misses the point. Calling that hardware buggy when it works OK with other operating systems like Windows and Mac OS X (a MacBook is also affected; the user is in CC) is odd, to say the least.

If I have to decide between the life of my hard disk and Linux, without a doubt I won't use Linux and will keep my hardware safe. That's why, for me and other users, it is an issue of major severity, if not higher.

Regards,
Alberto
Comment 8 Tejun Heo 2008-05-06 13:44:58 UTC
Lemme play a bat here.

Alberto, I see where you're coming from too and I _am_ considering it seriously (w/ due amount of cursing of course).  Pavel is just pointing out where the problem lies and that's usually the place where the fix should go.  Solving it at a different layer, especially in this case, is nasty, and cleaning up a mess someone else made for a nickel is annoying, so please understand our frustration.

Anyways, I'm working on a script which can be called during boot and resume and can issue some hdparm and/or smartctl commands to handle this, so we should be able to handle these weirdos soon.  Please stand by a bit.

Thanks.
Comment 9 Alberto Passalacqua 2008-05-06 14:08:40 UTC
I perfectly understand your point of view, but I just represent the average user who behaves as I described. If I read that my brand new laptop will have a damaged hard drive under Linux, with all the consequences of the case (read: void warranty), I surely won't use it. That's my point.

It's not a question of how seriously you (at Novell) take the problem. I never wrote you're not considering it seriously. But I also think that if this hardware works correctly under Windows and Mac OS X, it should work correctly without ugly hacks under Linux too.

Just my two cents.
Comment 10 Tejun Heo 2008-05-06 14:40:59 UTC
(In reply to comment #9 from Alberto Passalacqua)
> It's not a question of how seriously you (at Novell) take the problem. I never
> wrote you're not considering it seriously. But I also think that if this
> hardware works correctly under Windows and Mac OS X, it should work
> correctly without ugly hacks under Linux too.

Other points I can agree with, but it working under Mac OS X is just dumb luck.  The setting just isn't healthy.  Let's say the vendor sets the unload timeout to 10 seconds and under nominal laptop circumstances Windows issues IOs every 8 secs unless it's completely idle.  It will work fine there.  Let's say Mac does so every 7 to 9 seconds, which will work fine too.  Now, let's say Linux does so every 10-12 seconds.  Now you have a problem.  Being idle for longer periods of time is a good thing which we should strive for, but in this case it's fast death for the drive.  This is why some people are reporting the problem goes away when laptop mode is disabled, because then IOs will be issued more frequently.

Such short fixed timeouts just can't work generically.  They're bound to break.  They should have gone with a longer timeout or an adaptive one.  So, it's not like something is wrong with Linux, it's just different and the setting is way too aggressive for the real world.  Please think of my friend's laptop I mentioned earlier.  That drive is going to be toast pretty soon even though it's just a differently configured Windows.
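The timing argument in this comment can be sketched as a tiny model. It assumes the heads park once the idle gap strictly exceeds the unload timeout and that each such gap costs exactly one load cycle; the 10-second timeout and the IO intervals are the hypothetical figures from the comment, not measured values.

```python
# Toy model of the unload-timeout argument.
# ASSUMPTIONS: one unload per idle gap longer than the timeout; the
# timeout and intervals are the illustrative numbers from the comment.

def gaps_for_one_hour(io_interval_s):
    """Return the idle gaps seen in one hour with IO every io_interval_s seconds."""
    return [io_interval_s] * (3600 // io_interval_s)


def count_unloads(io_gaps_s, unload_timeout_s):
    """Count the gaps that are long enough for the heads to park."""
    return sum(1 for gap in io_gaps_s if gap > unload_timeout_s)


if __name__ == "__main__":
    timeout = 10  # hypothetical vendor unload timeout in seconds
    for name, interval in (("Windows", 8), ("Mac OS X", 7), ("Linux", 12)):
        unloads = count_unloads(gaps_for_one_hour(interval), timeout)
        print(f"{name}: IO every {interval}s -> {unloads} unloads per hour")
```

In this model an IO every 8 seconds never lets the heads park, while an IO every 12 seconds parks them 300 times per hour, even though the second system is the quieter one.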
Comment 11 Thomas Renninger 2008-05-06 16:56:20 UTC
Can we increase write back time (especially of the journal) by default, say by factor 4-10?
This would lower the severity of the issue, at least a bit.

BTW: Windows probably also wakes up the disk really often over some weeks or months...
Comment 12 Forgotten User ZhJd0F0L3x 2008-05-06 16:58:20 UTC
pm-utils is the right place. The question is: "what is the correct value"?

My observations with different drives usually show the following (all hdparm -B values):

1-127: Drive unloads heads and does auto-spindown, depending on access pattern
128-191: drive unloads heads if no access for a second or so
>=192: drive does not unload heads
from the specs, 255 disables APM

On my machine, I usually set it to 192 when on AC power and to 128 when on battery power.
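The observed ranges above can be written down as a small lookup. This is only a sketch of one commenter's observations, not a statement of what any given drive does; treating 192 itself as "no unload" is an assumption based on the AC-power setting mentioned above, and the function name is made up for illustration.

```python
# Classify an `hdparm -B` value according to the observations in this comment.
# ASSUMPTION: 192 counts as "no head unload"; real drives may differ.

def describe_apm_level(level):
    """Return the behaviour observed for an APM level between 1 and 255."""
    if not 1 <= level <= 255:
        raise ValueError("APM level must be between 1 and 255")
    if level == 255:
        return "APM disabled"
    if level >= 192:
        return "no head unload"
    if level >= 128:
        return "head unload after about a second of no access"
    return "head unload and auto-spindown"


if __name__ == "__main__":
    for level in (127, 128, 192, 254, 255):
        print(level, "->", describe_apm_level(level))
```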
Comment 13 Tejun Heo 2008-05-06 17:08:34 UTC
As written above, I don't think there's one magic value we can use.  We'll have to build a blacklist for problematic laptops (we should be able to take some from the thinkwiki page).  I'm almost done with a bash script to do it.  Will post it soon after adding some comments.
Comment 14 Tejun Heo 2008-05-06 17:33:59 UTC
Created attachment 212826 [details]
storage-fixup script

Okay, here's the script.  The comment on top of it should explain most things.  I'll attach a sample rule file soon.
Comment 15 Tejun Heo 2008-05-06 17:44:06 UTC
Created attachment 212833 [details]
storage-fixup slightly updated.

Turned off debug mode and misc fix up.
Comment 16 Tejun Heo 2008-05-06 17:44:37 UTC
Created attachment 212834 [details]
storage-fixup.conf

Sample configuration file.
Comment 17 Tejun Heo 2008-05-06 17:51:53 UTC
Stefan, does this look good enough?  Or does pm-utils have better facility to do this?

Alberto and other owners of troubled machines, can you guys please post the result of "dmidecode" and "hdparm -I /dev/sda" and which APM value works for the machine?

Thanks.
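For anyone collecting the requested data, the current APM value can be pulled out of saved "hdparm -I" output with a few lines. This is a sketch that assumes the "Advanced power management level:" line looks like the one hdparm prints for drives that report a numeric level; the function name is hypothetical.

```python
import re

# Matches the APM line of `hdparm -I` output and captures the rest of the line.
_APM_RE = re.compile(r"Advanced power management level:\s*(.+)")


def apm_level_from_hdparm(output):
    """Return the APM level as an int, or None if it is absent or not numeric."""
    match = _APM_RE.search(output)
    if match is None:
        return None
    value = match.group(1).strip()
    return int(value) if value.isdigit() else None


if __name__ == "__main__":
    sample = (
        "\tR/W multiple sector transfer: Max = 16\tCurrent = 16\n"
        "\tAdvanced power management level: 128\n"
        "\tRecommended acoustic management value: 128, current value: 254\n"
    )
    print(apm_level_from_hdparm(sample))
```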
Comment 18 Forgotten User ZhJd0F0L3x 2008-05-06 18:24:01 UTC
No, I think that pm-utils does not have a better facility.
We should just run your script at boot and after resume.

Additionally, I'll add a pm-utils hook that gets executed whenever the "set low power" function is called (with "true" or "false" as an argument, to set low power mode or to unset low power mode).

By default this script will do nothing, but it will have a configuration file where the user can enter values that will be used for AAM and APM settings. The config file will probably look something like:

LOW_POWER_APM="128"
HI_POWER_APM="192"
LOW_POWER_AAM=""
HI_POWER_AAM=""

But I have to think about that.
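How such a hook might turn the config file into an hdparm call can be sketched as follows. This is not the pm-utils implementation: the function names, the default device path and the parsing rules are assumptions made for illustration. Only the four variable names come from the proposal above, and hdparm's -B (APM) and -M (AAM) options are the standard ones.

```python
# Sketch: derive an hdparm command line from the proposed config variables.
# ASSUMPTIONS: simple KEY="value" lines, empty values mean "leave untouched",
# and /dev/sda as the default device. None of this is taken from pm-utils.

SAMPLE_CONFIG = '''
LOW_POWER_APM="128"
HI_POWER_APM="192"
LOW_POWER_AAM=""
HI_POWER_AAM=""
'''


def parse_config(text):
    """Parse KEY="value" lines into a dict, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def hdparm_args(settings, low_power, device="/dev/sda"):
    """Return the hdparm argument list, or [] if nothing is configured."""
    prefix = "LOW_POWER" if low_power else "HI_POWER"
    args = []
    apm = settings.get(prefix + "_APM", "")
    aam = settings.get(prefix + "_AAM", "")
    if apm:
        args += ["-B", apm]  # -B sets Advanced Power Management
    if aam:
        args += ["-M", aam]  # -M sets Automatic Acoustic Management
    return ["hdparm", *args, device] if args else []


if __name__ == "__main__":
    settings = parse_config(SAMPLE_CONFIG)
    print(hdparm_args(settings, low_power=True))
    print(hdparm_args(settings, low_power=False))
```

Returning an empty list when both values are blank mirrors the stated intent that the hook does nothing by default.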
Comment 19 Carlos Robinson 2008-05-06 19:13:02 UTC
I don't have that hardware, I remove myself from the CC list.
Comment 20 James PEARSON 2008-05-07 05:37:36 UTC
Hello, here are the results of "dmidecode" and "hdparm -I /dev/sda" from my ThinkPad T60:

Linux linux 2.6.25-26-pae #1 SMP 2008-04-30 07:56:05 +0200 i686 i686 i386 GNU/Linux
 
 
Running hdparm -I /dev/sda....

/dev/sda:

ATA device, with non-removable media
	Model Number:       Hitachi HTS722020K9SA00                 
	Serial Number:      071006DP0410DTG4VSSP
	Firmware Revision:  DC4OC54P
	Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5; Revision: ATA8-AST T13 Project D1697 Revision 0b
Standards:
	Used: ATA-8-ACS revision 3f 
	Supported: 8 7 6 5 
Configuration:
	Logical		max	current
	cylinders	16383	16383
	heads		16	16
	sectors/track	63	63
	--
	CHS current addressable sectors:   16514064
	LBA    user addressable sectors:  268435455
	LBA48  user addressable sectors:  390721968
	device size with M = 1024*1024:      190782 MBytes
	device size with M = 1000*1000:      200049 MBytes (200 GB)
Capabilities:
	LBA, IORDY(can be disabled)
	Queue depth: 32
	Standby timer values: spec'd by Vendor, no device specific minimum
	R/W multiple sector transfer: Max = 16	Current = 16
	Advanced power management level: 128
	Recommended acoustic management value: 128, current value: 254
	DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6 
	     Cycle time: min=120ns recommended=120ns
	PIO: pio0 pio1 pio2 pio3 pio4 
	     Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
	Enabled	Supported:
	   *	SMART feature set
	    	Security Mode feature set
	   *	Power Management feature set
	   *	Write cache
	   *	Look-ahead
	   *	Host Protected Area feature set
	   *	WRITE_BUFFER command
	   *	READ_BUFFER command
	   *	NOP cmd
	   *	DOWNLOAD_MICROCODE
	   *	Advanced Power Management feature set
	    	Power-Up In Standby feature set
	   *	SET_FEATURES required to spinup after power up
	    	SET_MAX security extension
	    	Automatic Acoustic Management feature set
	   *	48-bit Address feature set
	   *	Device Configuration Overlay feature set
	   *	Mandatory FLUSH_CACHE
	   *	FLUSH_CACHE_EXT
	   *	SMART error logging
	   *	SMART self-test
	   *	General Purpose Logging feature set
	   *	WRITE_{DMA|MULTIPLE}_FUA_EXT
	   *	64-bit World wide name
	   *	IDLE_IMMEDIATE with UNLOAD
	   *	WRITE_UNCORRECTABLE_EXT command
	   *	SATA-I signaling speed (1.5Gb/s)
	   *	Native Command Queueing (NCQ)
	   *	Host-initiated interface power management
	   *	Phy event counters
	   *	unknown 76[12]
	    	Non-Zero buffer offsets in DMA Setup FIS
	    	DMA Setup Auto-Activate optimization
	   *	Device-initiated interface power management
	    	In-order data delivery
	   *	Software settings preservation
	   *	SMART Command Transport (SCT) feature set
	   *	SCT LBA Segment Access (AC2)
	   *	SCT Error Recovery Control (AC3)
	   *	SCT Features Control (AC4)
	   *	SCT Data Tables (AC5)
Security: 
	Master password revision code = 65534
		supported
	not	enabled
	not	locked
		frozen
	not	expired: security count
		supported: enhanced erase
	72min for SECURITY ERASE UNIT. 74min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 5000cca53ec235f6
	NAA		: 5
	IEEE OUI	: cca
	Unique ID	: 53ec235f6
Checksum: correct
 
 
Running dmidecode....
# dmidecode 2.9
SMBIOS 2.4 present.
68 structures occupying 2250 bytes.
Table at 0x000E0010.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
	Vendor: LENOVO
	Version: 79ETE1WW (2.21 )
	Release Date: 02/05/2008
	Address: 0xDC000
	Runtime Size: 144 kB
	ROM Size: 2048 kB
	Characteristics:
		PCI is supported
		PC Card (PCMCIA) is supported
		PNP is supported
		BIOS is upgradeable
		BIOS shadowing is allowed
		ESCD support is available
		Boot from CD is supported
		Selectable boot is supported
		BIOS ROM is socketed
		EDD is supported
		ACPI is supported
		USB legacy is supported
		BIOS boot specification is supported
		Targeted content distribution is supported
	BIOS Revision: 2.33
	Firmware Revision: 1.7

Handle 0x0001, DMI type 1, 27 bytes
System Information
	Manufacturer: LENOVO
	Product Name: 1952W5R
	Version: ThinkPad T60
	Serial Number: L3C8047
	UUID: 81E12281-482B-11CB-B37C-91674F2F476C
	Wake-up Type: Power Switch
	SKU Number: Not Specified
	Family: ThinkPad T60

Handle 0x0002, DMI type 2, 8 bytes
Base Board Information
	Manufacturer: LENOVO
	Product Name: 1952W5R
	Version: Not Available
	Serial Number: VF0BC6730XA

Handle 0x0003, DMI type 3, 13 bytes
Chassis Information
	Manufacturer: LENOVO
	Type: Notebook
	Lock: Not Present
	Version: Not Available
	Serial Number: Not Available
	Asset Tag: No Asset Information
	Boot-up State: Unknown
	Power Supply State: Unknown
	Thermal State: Unknown
	Security Status: Unknown

Handle 0x0004, DMI type 126, 13 bytes
Inactive

Handle 0x0005, DMI type 126, 13 bytes
Inactive

Handle 0x0006, DMI type 4, 35 bytes
Processor Information
	Socket Designation: None
	Type: Central Processor
	Family: Other
	Manufacturer: GenuineIntel
	ID: F6 06 00 00 FF FB EB BF
	Version: Intel(R) Core(TM)2 CPU         
	Voltage: 1.3 V
	External Clock: 167 MHz
	Max Speed: 2333 MHz
	Current Speed: 2333 MHz
	Status: Populated, Enabled
	Upgrade: None
	L1 Cache Handle: 0x000A
	L2 Cache Handle: 0x000C
	L3 Cache Handle: Not Provided
	Serial Number: Not Specified
	Asset Tag: Not Specified
	Part Number: Not Specified

Handle 0x0007, DMI type 5, 20 bytes
Memory Controller Information
	Error Detecting Method: None
	Error Correcting Capabilities:
		None
	Supported Interleave: One-way Interleave
	Current Interleave: One-way Interleave
	Maximum Memory Module Size: 2048 MB
	Maximum Total Memory Size: 4096 MB
	Supported Speeds:
		Other
	Supported Memory Types:
		DIMM
		SDRAM
	Memory Module Voltage: 2.9 V
	Associated Memory Slots: 2
		0x0008
		0x0009
	Enabled Error Correcting Capabilities:
		Unknown

Handle 0x0008, DMI type 6, 12 bytes
Memory Module Information
	Socket Designation: DIMM Slot 1
	Bank Connections: 0 3
	Current Speed: Unknown
	Type: DIMM SDRAM
	Installed Size: 2048 MB (Double-bank Connection)
	Enabled Size: 2048 MB (Double-bank Connection)
	Error Status: OK

Handle 0x0009, DMI type 6, 12 bytes
Memory Module Information
	Socket Designation: DIMM Slot 2
	Bank Connections: 4 7
	Current Speed: Unknown
	Type: DIMM SDRAM
	Installed Size: 1024 MB (Double-bank Connection)
	Enabled Size: 1024 MB (Double-bank Connection)
	Error Status: OK

Handle 0x000A, DMI type 7, 19 bytes
Cache Information
	Socket Designation: Internal L1 Cache
	Configuration: Enabled, Socketed, Level 1
	Operational Mode: Write Back
	Location: Internal
	Installed Size: 64 KB
	Maximum Size: 64 KB
	Supported SRAM Types:
		Synchronous
	Installed SRAM Type: Synchronous
	Speed: Unknown
	Error Correction Type: Single-bit ECC
	System Type: Instruction
	Associativity: 8-way Set-associative

Handle 0x000B, DMI type 7, 19 bytes
Cache Information
	Socket Designation: Internal L1 Cache
	Configuration: Enabled, Socketed, Level 1
	Operational Mode: Write Back
	Location: Internal
	Installed Size: 64 KB
	Maximum Size: 64 KB
	Supported SRAM Types:
		Synchronous
	Installed SRAM Type: Synchronous
	Speed: Unknown
	Error Correction Type: Single-bit ECC
	System Type: Data
	Associativity: 8-way Set-associative

Handle 0x000C, DMI type 7, 19 bytes
Cache Information
	Socket Designation: Internal L2 Cache
	Configuration: Enabled, Socketed, Level 2
	Operational Mode: Write Back
	Location: Internal
	Installed Size: 4096 KB
	Maximum Size: 4096 KB
	Supported SRAM Types:
		Burst
	Installed SRAM Type: Burst
	Speed: Unknown
	Error Correction Type: Single-bit ECC
	System Type: Unified
	Associativity: 8-way Set-associative

Handle 0x000D, DMI type 8, 9 bytes
Port Connector Information
	Internal Reference Designator: Not Available
	Internal Connector Type: None
	External Reference Designator: Infrared
	External Connector Type: Infrared
	Port Type: Other

Handle 0x000E, DMI type 8, 9 bytes
Port Connector Information
	Internal Reference Designator: Not Available
	Internal Connector Type: None
	External Reference Designator: External Monitor
	External Connector Type: DB-15 female
	Port Type: Video Port

Handle 0x000F, DMI type 8, 9 bytes
Port Connector Information
	Internal Reference Designator: Not Available
	Internal Connector Type: None
	External Reference Designator: Microphone Jack
	External Connector Type: Mini Jack (headphones)
	Port Type: Audio Port

Handle 0x0010, DMI type 8, 9 bytes
Port Connector Information
	Internal Reference Designator: Not Available
	Internal Connector Type: None
	External Reference Designator: Headphone Jack
	External Connector Type: Mini Jack (headphones)
	Port Type: Audio Port

Handle 0x0011, DMI type 126, 9 bytes
Inactive

Handle 0x0012, DMI type 126, 9 bytes
Inactive

Handle 0x0013, DMI type 8, 9 bytes
Port Connector Information
	Internal Reference Designator: Not Available
	Internal Connector Type: None
	External Reference Designator: Modem
	External Connector Type: RJ-11
	Port Type: Modem Port

Handle 0x0014, DMI type 8, 9 bytes
Port Connector Information
	Internal Reference Designator: Not Available
	Internal Connector Type: None
	External Reference Designator: Ethernet
	External Connector Type: RJ-45
	Port Type: Network Port

Handle 0x0015, DMI type 8, 9 bytes
Port Connector Information
	Internal Reference Designator: Not Available
	Internal Connector Type: None
	External Reference Designator: USB 1
	External Connector Type: Access Bus (USB)
	Port Type: USB

Handle 0x0016, DMI type 8, 9 bytes
Port Connector Information
	Internal Reference Designator: Not Available
	Internal Connector Type: None
	External Reference Designator: USB 2
	External Connector Type: Access Bus (USB)
	Port Type: USB

Handle 0x0017, DMI type 8, 9 bytes
Port Connector Information
	Internal Reference Designator: Not Available
	Internal Connector Type: None
	External Reference Designator: USB 3
	External Connector Type: Access Bus (USB)
	Port Type: USB

Handle 0x0018, DMI type 126, 9 bytes
Inactive

Handle 0x0019, DMI type 126, 9 bytes
Inactive

Handle 0x001A, DMI type 126, 9 bytes
Inactive

Handle 0x001B, DMI type 126, 9 bytes
Inactive

Handle 0x001C, DMI type 126, 9 bytes
Inactive

Handle 0x001D, DMI type 126, 9 bytes
Inactive

Handle 0x001E, DMI type 126, 9 bytes
Inactive

Handle 0x001F, DMI type 126, 9 bytes
Inactive

Handle 0x0020, DMI type 9, 13 bytes
System Slot Information
	Designation: ExpressCard Slot 1
	Type: x1 PCI Express
	Current Usage: Available
	Length: Other
	ID: 0
	Characteristics:
		Hot-plug devices are supported

Handle 0x0021, DMI type 9, 13 bytes
System Slot Information
	Designation: CardBus Slot 1
	Type: 32-bit PC Card (PCMCIA)
	Current Usage: Available
	Length: Other
	ID: Adapter 1, Socket 0
	Characteristics:
		5.0 V is provided
		3.3 V is provided
		PC Card-16 is supported
		Cardbus is supported
		Zoom Video is supported
		Modem ring resume is supported
		PME signal is supported
		Hot-plug devices are supported

Handle 0x0022, DMI type 126, 13 bytes
Inactive

Handle 0x0023, DMI type 126, 13 bytes
Inactive

Handle 0x0024, DMI type 126, 13 bytes
Inactive

Handle 0x0025, DMI type 10, 6 bytes
On Board Device Information
	Type: Other
	Status: Disabled
	Description: IBM Embedded Security hardware

Handle 0x0026, DMI type 11, 5 bytes
OEM Strings
	String 1: IBM ThinkPad Embedded Controller -[79HT50WW-1.07    ]-

Handle 0x0027, DMI type 13, 22 bytes
BIOS Language Information
	Installable Languages: 1
		enUS
	Currently Installed Language: enUS

Handle 0x0028, DMI type 15, 25 bytes
System Event Log
	Area Length: 0 bytes
	Header Start Offset: 0x0000
	Header Length: 16 bytes
	Data Start Offset: 0x0010
	Access Method: General-purpose non-volatile data functions
	Access Address: 0x0000
	Status: Invalid, Full
	Change Token: 0x000000EF
	Header Format: Type 1
	Supported Log Type Descriptors: 1
	Descriptor 1: POST error
	Data Format 1: POST results bitmap

Handle 0x0029, DMI type 16, 15 bytes
Physical Memory Array
	Location: System Board Or Motherboard
	Use: System Memory
	Error Correction Type: None
	Maximum Capacity: 2 GB
	Error Information Handle: Not Provided
	Number Of Devices: 2

Handle 0x002A, DMI type 17, 27 bytes
Memory Device
	Array Handle: 0x0029
	Error Information Handle: No Error
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 2048 MB
	Form Factor: SODIMM
	Set: None
	Locator: DIMM 1
	Bank Locator: Bank 0/1
	Type: DDR2
	Type Detail: Synchronous
	Speed: Unknown
	Manufacturer: Not Specified
	Serial Number: Not Specified
	Asset Tag: Not Specified
	Part Number: Not Specified

Handle 0x002B, DMI type 17, 27 bytes
Memory Device
	Array Handle: 0x0029
	Error Information Handle: No Error
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 1024 MB
	Form Factor: SODIMM
	Set: None
	Locator: DIMM 2
	Bank Locator: Bank 2/3
	Type: DDR2
	Type Detail: Synchronous
	Speed: Unknown
	Manufacturer: Not Specified
	Serial Number: Not Specified
	Asset Tag: Not Specified
	Part Number: Not Specified

Handle 0x002C, DMI type 18, 23 bytes
32-bit Memory Error Information
	Type: OK
	Granularity: Unknown
	Operation: Unknown
	Vendor Syndrome: Unknown
	Memory Array Address: Unknown
	Device Address: Unknown
	Resolution: Unknown

Handle 0x002D, DMI type 19, 15 bytes
Memory Array Mapped Address
	Starting Address: 0x00000000000
	Ending Address: 0x000BFFFFFFF
	Range Size: 3 GB
	Physical Array Handle: 0x0029
	Partition Width: 0

Handle 0x002E, DMI type 20, 19 bytes
Memory Device Mapped Address
	Starting Address: 0x00000000000
	Ending Address: 0x0007FFFFFFF
	Range Size: 2 GB
	Physical Device Handle: 0x002A
	Memory Array Mapped Address Handle: 0x002D
	Partition Row Position: 1

Handle 0x002F, DMI type 20, 19 bytes
Memory Device Mapped Address
	Starting Address: 0x00080000000
	Ending Address: 0x000BFFFFFFF
	Range Size: 1 GB
	Physical Device Handle: 0x002B
	Memory Array Mapped Address Handle: 0x002D
	Partition Row Position: 1

Handle 0x0030, DMI type 21, 7 bytes
Built-in Pointing Device
	Type: Track Point
	Interface: PS/2
	Buttons: 3

Handle 0x0031, DMI type 21, 7 bytes
Built-in Pointing Device
	Type: Touch Pad
	Interface: PS/2
	Buttons: 0

Handle 0x0032, DMI type 24, 5 bytes
Hardware Security
	Power-On Password Status: Disabled
	Keyboard Password Status: Disabled
	Administrator Password Status: Disabled
	Front Panel Reset Status: Unknown

Handle 0x0033, DMI type 32, 11 bytes
System Boot Information
	Status: No errors detected

Handle 0x0034, DMI type 131, 17 bytes
OEM-specific Type
	Header and Data:
		83 11 34 00 01 02 03 FF FF 1F 00 00 00 00 00 02
		00
	Strings:
		BOOTINF 20h
		BOOTDEV 21h
		KEYPTRS 23h

Handle 0x0035, DMI type 131, 11 bytes
OEM-specific Type
	Header and Data:
		83 0B 35 00 00 00 E8 FF C5 01 01
	Strings:
		IBM System Metrics

Handle 0x0036, DMI type 131, 22 bytes
OEM-specific Type
	Header and Data:
		83 16 36 00 01 00 00 00 00 00 00 00 00 00 00 00
		00 00 00 00 00 01
	Strings:
		TVT-Enablement

Handle 0x0037, DMI type 132, 7 bytes
OEM-specific Type
	Header and Data:
		84 07 37 00 01 D8 36

Handle 0x0038, DMI type 133, 5 bytes
OEM-specific Type
	Header and Data:
		85 05 38 00 01
	Strings:
		KHOIHGIUCCHHII

Handle 0x0039, DMI type 133, 17 bytes
OEM-specific Type
	Header and Data:
		85 11 39 00 30 30 2E 35 00 60 6F BF 00 00 00 00
		01
	Strings:
		Audit Boot History

Handle 0x003A, DMI type 134, 13 bytes
OEM-specific Type
	Header and Data:
		86 0D 3A 00 04 07 06 20 00 00 00 00 00

Handle 0x003B, DMI type 134, 16 bytes
OEM-specific Type
	Header and Data:
		86 10 3B 00 00 41 54 4D 4C 01 01 00 00 02 01 02
	Strings:
		TPM INFO
		System Reserved

Handle 0x003C, DMI type 135, 13 bytes
OEM-specific Type
	Header and Data:
		87 0D 3C 00 54 50 07 00 01 00 00 00 00

Handle 0x003D, DMI type 135, 18 bytes
OEM-specific Type
	Header and Data:
		87 12 3D 00 54 50 07 01 01 A3 07 00 00 00 01 00
		00 00

Handle 0x003E, DMI type 135, 35 bytes
OEM-specific Type
	Header and Data:
		87 23 3E 00 54 50 07 02 42 41 59 20 49 2F 4F 20
		01 00 02 00 00 0E 00 F0 01 F6 03 02 00 0F 00 70
		01 76 03

Handle 0x003F, DMI type 136, 6 bytes
OEM-specific Type
	Header and Data:
		88 06 3F 00 5A 5A

Handle 0x0040, DMI type 137, 28 bytes
OEM-specific Type
	Header and Data:
		89 1C 40 00 0C 02 00 01 01 00 00 01 50 57 4D 53
		20 49 6E 66 6F 72 6D 61 74 69 6F 6E

Handle 0x0041, DMI type 138, 40 bytes
OEM-specific Type
	Header and Data:
		8A 28 41 00 14 01 01 01 07 01 01 0C 01 01 0C 01
		01 0C 00 00 42 49 4F 53 20 50 61 73 73 77 6F 72
		64 20 46 6F 72 6D 61 74

Handle 0x0042, DMI type 139, 37 bytes
OEM-specific Type
	Header and Data:
		8B 25 42 00 11 01 0A 00 00 00 00 00 00 00 00 00
		00 50 57 4D 53 20 4B 65 79 20 49 6E 66 6F 72 6D
		61 74 69 6F 6E

Handle 0x0043, DMI type 127, 4 bytes
End Of Table

Comment 21 Tejun Heo 2008-05-07 05:47:09 UTC
So, for your machine, the rule would be something like the following.

rule tp-t60
dmi system-manufacturer   LENOVO
dmi system-product-name   1952W5R
dmi system-version        ThinkPad T60
hal storage.model         Hitachi HTS722020K9SA00
act hdparm -B 255 $DEV

Can you please put the above into /etc/storage-fixup.conf, run the attached storage-fixup and verify it makes the quick unloads go away?

Also, please attach large outputs instead of pasting them inline.  Pasting makes the bug report difficult to follow.  Thanks.
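The matching idea behind the rule above can be sketched in a few lines. This is not the attached storage-fixup script (which is a bash script and is not reproduced in this report); the parser below only assumes the "rule / dmi / hal / act" layout visible in the sample rule, requires exact string matches, and uses made-up function names.

```python
# Sketch of blacklist-rule parsing and matching.
# ASSUMPTIONS: keyword-prefixed lines as in the sample rule, exact string
# comparison, and caller-supplied dicts of DMI and HAL values.

SAMPLE_RULES = """
rule tp-t60
dmi system-manufacturer   LENOVO
dmi system-product-name   1952W5R
dmi system-version        ThinkPad T60
hal storage.model         Hitachi HTS722020K9SA00
act hdparm -B 255 $DEV
"""


def parse_rules(text):
    """Parse rule blocks into a list of dicts with dmi, hal and act entries."""
    rules = []
    for line in text.splitlines():
        parts = line.split(None, 1)
        if not parts or parts[0].startswith("#"):
            continue
        keyword = parts[0]
        rest = parts[1].strip() if len(parts) > 1 else ""
        if keyword == "rule":
            rules.append({"name": rest, "dmi": {}, "hal": {}, "act": []})
        elif not rules:
            raise ValueError("line outside of a rule: " + line)
        elif keyword in ("dmi", "hal"):
            key, value = rest.split(None, 1)
            rules[-1][keyword][key] = value.strip()
        elif keyword == "act":
            rules[-1]["act"].append(rest)
        else:
            raise ValueError("unknown keyword: " + keyword)
    return rules


def rule_matches(rule, dmi, hal):
    """Return True if every dmi and hal condition of the rule is satisfied."""
    return (all(dmi.get(k) == v for k, v in rule["dmi"].items())
            and all(hal.get(k) == v for k, v in rule["hal"].items()))


if __name__ == "__main__":
    rule = parse_rules(SAMPLE_RULES)[0]
    dmi = {"system-manufacturer": "LENOVO",
           "system-product-name": "1952W5R",
           "system-version": "ThinkPad T60"}
    hal = {"storage.model": "Hitachi HTS722020K9SA00"}
    if rule_matches(rule, dmi, hal):
        print(rule["name"], "->", rule["act"])
```

Exact comparison is the simplest choice; DMI strings in real dmidecode output can carry trailing spaces, so a real matcher would probably need to strip or pattern-match them.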
Comment 22 Alberto Passalacqua 2008-05-07 14:20:56 UTC
This is the output from the HP Pavilion dv6500 of a friend who actually found the problem.

# dmidecode 2.9
SMBIOS 2.4 present.
19 structures occupying 731 bytes.
Table at 0x000DC010.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
        Vendor: Hewlett-Packard
        Version: F.53     
        Release Date: 04/02/2008
        Address: 0xE6F30
        Runtime Size: 102608 bytes
        ROM Size: 1024 kB
        Characteristics:
                ISA is supported
                PCI is supported
                PNP is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                ESCD support is available
                Boot from CD is supported
                Selectable boot is supported
                Print screen service is supported (int 5h)
                8042 keyboard services are supported (int 9h)
                Serial services are supported (int 14h)
                Printer services are supported (int 17h)
                ACPI is supported
                USB legacy is supported
                AGP is supported
                Smart battery is supported
                BIOS boot specification is supported
                Targeted content distribution is supported

Handle 0x0001, DMI type 1, 27 bytes
System Information
        Manufacturer: Hewlett-Packard
        Product Name: HP Pavilion dv6500 Notebook PC    
        Version: Rev 1
        Serial Number: CNF73565FT
        UUID: 434E4637-3335-3635-4654-001B249711B6
        Wake-up Type: Power Switch
        SKU Number: GT460EA#ABZ 
        Family: 103C_5335KV

Handle 0x0002, DMI type 2, 8 bytes
Base Board Information
        Manufacturer: Quanta
        Product Name: 30D2
        Version: 79.2B
        Serial Number: None

Handle 0x0003, DMI type 3, 17 bytes
Chassis Information
        Manufacturer: Quanta
        Type: Notebook
        Lock: Not Present
        Version: N/A
        Serial Number: None
        Asset Tag:                     
        Boot-up State: Safe
        Power Supply State: Safe
        Thermal State: Safe
        Security Status: None
        OEM Information: 0x00000004

Handle 0x0004, DMI type 4, 35 bytes
Processor Information
        Socket Designation: U2E1
        Type: Central Processor
        Family: Other
        Manufacturer: Intel
        ID: FA 06 00 00 FF FB EB BF
        Version: Intel(R) Core(TM)2 Duo CPU T7300 
        Voltage: 3.3 V
        External Clock: 800 MHz
        Max Speed: 2000 MHz
        Current Speed: 2000 MHz
        Status: Populated, Enabled
        Upgrade: Socket 478
        L1 Cache Handle: 0x0005
        L2 Cache Handle: 0x0006
        L3 Cache Handle: Not Provided
        Serial Number: Not Specified
        Asset Tag: Not Specified
        Part Number: Not Specified

Handle 0x0005, DMI type 7, 19 bytes
Cache Information
        Socket Designation: L1 Cache
        Configuration: Enabled, Socketed, Level 1
        Operational Mode: Write Back
        Location: Internal
        Installed Size: 64 KB
        Maximum Size: 64 KB
        Supported SRAM Types:
                Burst
                Pipeline Burst
                Asynchronous
        Installed SRAM Type: Asynchronous
        Speed: Unknown
        Error Correction Type: Unknown
        System Type: Unknown
        Associativity: Unknown

Handle 0x0006, DMI type 7, 19 bytes
Cache Information
        Socket Designation: L2 Cache
        Configuration: Enabled, Socketed, Level 2
        Operational Mode: Write Back
        Location: External
        Installed Size: 4096 KB
        Maximum Size: 4096 KB
        Supported SRAM Types:
                Burst
                Pipeline Burst
                Asynchronous
        Installed SRAM Type: Burst
        Speed: Unknown
        Error Correction Type: Unknown
        System Type: Unknown
        Associativity: Unknown

Handle 0x0007, DMI type 9, 13 bytes
System Slot Information
        Designation: PCI Express Slot 1
        Type: 64-bit PCI Express
        Current Usage: Available
        Length: Long
        ID: 0
        Characteristics:
                5.0 V is provided
                3.3 V is provided

Handle 0x0008, DMI type 9, 13 bytes
System Slot Information
        Designation: PCI Express Slot 2
        Type: 64-bit PCI Express
        Current Usage: Available
        Length: Long
        ID: 0
        Characteristics:
                5.0 V is provided
                3.3 V is provided
                PME signal is supported
                Hot-plug devices are supported

Handle 0x0009, DMI type 9, 13 bytes
System Slot Information
        Designation: PCI Express Slot 6
        Type: 64-bit PCI Express
        Current Usage: Available
        Length: Long
        ID: 0
        Characteristics:
                5.0 V is provided
                3.3 V is provided
                PME signal is supported

Handle 0x000A, DMI type 10, 6 bytes
On Board Device Information
        Type: Video
        Status: Enabled
        Description:    

Handle 0x000B, DMI type 11, 5 bytes
OEM Strings
        String 1: $HP$
        String 2: LOC#ABZ
        String 3: ABS 70/71 79 7A 7B 7C

Handle 0x000C, DMI type 16, 15 bytes
Physical Memory Array
        Location: System Board Or Motherboard
        Use: System Memory
        Error Correction Type: None
        Maximum Capacity: 4 GB
        Error Information Handle: Not Provided
        Number Of Devices: 2

Handle 0x000D, DMI type 17, 27 bytes
Memory Device
        Array Handle: 0x000C
        Error Information Handle: No Error
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 1024 MB
        Form Factor: DIMM
        Set: 1
        Locator: DIMM 1
        Bank Locator: Bank 0,1
        Type: DDR2
        Type Detail: Synchronous
        Speed: 667 MHz (1.5 ns)
        Manufacturer: Not Specified
        Serial Number: Not Specified
        Asset Tag: Not Specified
        Part Number: Not Specified

Handle 0x000E, DMI type 17, 27 bytes
Memory Device
        Array Handle: 0x000C
        Error Information Handle: No Error
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 1024 MB
        Form Factor: DIMM
        Set: 1
        Locator: DIMM 2
        Bank Locator: Bank 2,3
        Type: DDR2
        Type Detail: Synchronous
        Speed: 667 MHz (1.5 ns)
        Manufacturer: Not Specified
        Serial Number: Not Specified
        Asset Tag: Not Specified
        Part Number: Not Specified

Handle 0x000F, DMI type 19, 15 bytes
Memory Array Mapped Address
        Starting Address: 0x00000000000
        Ending Address: 0x0007FFFFFFF
        Range Size: 2 GB
        Physical Array Handle: 0x000C
        Partition Width: 0

Handle 0x0010, DMI type 20, 19 bytes
Memory Device Mapped Address
        Starting Address: 0x00000000000
        Ending Address: 0x0003FFFFFFF
        Range Size: 1 GB
        Physical Device Handle: 0x000D
        Memory Array Mapped Address Handle: 0x000F
        Partition Row Position: 2
        Interleave Position: 2
        Interleaved Data Depth: 2

Handle 0x0011, DMI type 32, 20 bytes
System Boot Information
        Status: <OUT OF SPEC>

--------------------

hdparm -i /dev/sda

/dev/sda:

 Model=SAMSUNG HM250JI                         , FwRev=HS100-10, SerialNo=S15YJD0P821334      
 Config={ Fixed DTR>10Mbs }
 RawCHS=16383/16/63, TrkSize=34902, SectSize=554, ECCbytes=4
 BuffType=DualPortCache, BuffSize=8192kB, MaxMultSect=16, MultSect=?16?
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=268435455
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4 
 DMA modes:  mdma0 mdma1 mdma2 
 UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5 
 AdvancedPM=yes: unknown setting WriteCache=enabled
 Drive conforms to: ATA/ATAPI-7 T13 1532D revision 0:  ATA/ATAPI-1,2,3,4,5,6,7

 * signifies the current active mode

Regards,
Alberto
Comment 23 Forgotten User ZhJd0F0L3x 2008-05-09 10:19:05 UTC
*** Bug 338230 has been marked as a duplicate of this bug. ***
Comment 24 Tejun Heo 2008-05-13 13:10:38 UTC
Created attachment 214783 [details]
storage-fixup.conf

Okay, here's storage-fixup.conf for the above two machines.  Can you guys please put the attached file under /etc and verify that the storage-fixup script does the right thing?  Also, please attach long outputs instead of pasting them inline.
Comment 25 Forgotten User qMyteedNxa 2008-05-18 19:25:14 UTC
this doesn't only happen on notebooks but also with enterprise disks meant for 24/7 usage.

>Aieeeeeeeee.......... This is nasty.  The BIOS / drive vendors are setting APM
>to aggressive values w/ idle IO access pattern of windows on mind.  The setting
>is too aggressive to the point of being fragile and different idle IO pattern
>causes the drive to unload like crazy && you can't really expect Linux or any
>other operating system to have similar idle IO pattern as windows.  :-(

even worse - there are disks on the market which don't let you tune the timeouts with standard methods. you need to get a proprietary DOS *sigh* tool to tune the unload interval - and WD support may give that to you - or not....

see these threads:
http://marc.info/?l=linux-kernel&m=120777293511872&w=2
http://marc.info/?l=linux-kernel&m=121071269907588&w=2
Comment 26 John Anderson 2008-05-18 19:58:45 UTC
Created attachment 216248 [details]
dmidecode

dmidecode for dell e1505
Comment 27 John Anderson 2008-05-18 19:59:15 UTC
Created attachment 216249 [details]
hdparm -I

hdparm -I /dev/sda for dell e1505
Comment 28 John Anderson 2008-05-18 20:00:05 UTC
inspidell:~ # smartctl -a /dev/sda | grep Load_Cycle_Count 
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       769211


This is for a dell e1505
Comment 29 Tejun Heo 2008-05-19 05:35:16 UTC
(In reply to comment #25 from roland kletzing)
> this doesn`t only happen on notebooks but also with enterprise disks for
> 24/7hrs useage.
> 
> even worse - there are disks on the market which don`t let you tune the
> timeouts with standard methods. you need to get a proprietary DOS *sigh* tool
> to tune the unload interval - and WD support may give that to you - or not....
> 
> see these threads:
> http://marc.info/?l=linux-kernel&m=120777293511872&w=2
> http://marc.info/?l=linux-kernel&m=121071269907588&w=2
> 

Yeah, I recall the thread.  I don't really think we can do something about it tho.  The only possible solution is to periodically issue commands to the drive to keep its head from unloading.  Yuk..

Comment 30 Tejun Heo 2008-05-19 05:38:35 UTC
Created attachment 216281 [details]
storage-fixup.conf

Okay, here's the updated storage-fixup.conf.

You can test this by

1. Saving https://bugzilla.novell.com/attachment.cgi?id=212833 as ~root/bin/storage-fixup

2. Saving this attachment as /etc/storage-fixup.conf

3. Execute storage-fixup and verify it did the right thing.

Please verify it works.  It needs to be tested before shipped.
Comment 31 Tejun Heo 2008-05-22 14:23:20 UTC
Can we please get this tested?  We *really* need to get it tested before RC1 if this problem is going to be worked around in SL110.
Comment 32 Alberto Passalacqua 2008-05-22 15:28:49 UTC
It works on the HP Pavilion. Moreover, it seems the manufacturer is also addressing the problem with a new firmware.

Regards,
Alberto
Comment 33 Forgotten User qMyteedNxa 2008-05-22 17:00:36 UTC
>This is tough to solve.

yes, probably. 

so why not create better "transparency" for the end user here?

smartctl can read the power-on-hours and the load_cycle_count.
each value for itself doesn't tell much...

you won't even need the power_on_hours value if you just check how much the load cycle count increases over a specific time interval.

if we want to stop harddisk vendors doing such dumb things, more users must raise an eyebrow. the more noise about this, the better the chances that vendors will do something about it.
but how should a technically inexperienced user know that he actually suffers from this issue?

one step into that direction would be adding a feature to smartmontools to give out a warning message, when it detects excessive load cycle count number.

see http://sourceforge.net/mailarchive/forum.php?thread_name=261451287%40web.de&forum_name=smartmontools-support
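
The interval check described above can be sketched in a few lines of shell. This is only an illustration, not a smartmontools feature: the function names, the 30 cycles/hour limit and the second sample value are assumptions chosen for the example.

```shell
#!/bin/sh
# Sketch only: function names, threshold and the second sample are assumptions.

# Read `smartctl -A` style output on stdin and print the raw value
# (last column) of attribute 193, Load_Cycle_Count.
load_cycle_count() {
    awk '$1 == 193 { print $NF }'
}

# cycles_per_hour OLD NEW SECONDS: integer rate of increase between two samples.
cycles_per_hour() {
    echo $(( ($2 - $1) * 3600 / $3 ))
}

# warn_if_excessive RATE LIMIT: print a warning when RATE exceeds LIMIT.
warn_if_excessive() {
    if [ "$1" -gt "$2" ]; then
        echo "WARNING: $1 load cycles/hour exceeds limit of $2"
    else
        echo "OK: $1 load cycles/hour"
    fi
}

# Demo with the line reported in comment 28 and a made-up second
# reading taken 1800 seconds later.
first=$(echo '193 Load_Cycle_Count 0x0032 001 001 000 Old_age Always - 769211' | load_cycle_count)
second=769261
warn_if_excessive "$(cycles_per_hour "$first" "$second" 1800)" 30
```

On a real system the two samples would come from running `smartctl -A /dev/sda | load_cycle_count` as root at the start and end of the interval.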

---

>It works on the HP Pavillion. Moreover, it seems the problem is being 
>addressed also by the manufacturer with a new firmware.

any link/information for that? WD support was pretty ignorant about this issue and it wouldn't hurt to tell them that other vendors are doing something about the problem.

Comment 34 Tejun Heo 2008-05-23 01:18:15 UTC
Yeah, that could help.  I'll ping Bruce Allen with the idea and cc you.

Stefan, can you please include the script and config file?  We'll need to continue updating the config file and look for other ways to improve the situation but for now the dirty little script seems all we've got.
Comment 35 Tejun Heo 2008-05-23 01:20:19 UTC
Created attachment 217657 [details]
storage-fixup

storage-fixup with typo fixed.
Comment 36 Forgotten User ZhJd0F0L3x 2008-05-23 06:04:25 UTC
Where do we include it? In the hdparm package?

And the matching, at least for thinkpads, is much too fine.
Generally, the first 4 digits are the model number, the last 3 characters are the specific variety (different keyboard, different CD burner, etc), so you'll need a very huge list to match all the thinkpads.

Or do we really want to only do this on the (very few) machines reported as having problems?
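
One way to match a whole ThinkPad type instead of listing every variety is a shell glob on the product string. The sketch below is a hypothetical illustration, not the storage-fixup implementation: the function name and the product string "1234AB1" are made up for the example.

```shell
#!/bin/sh
# Sketch only: matches_model and the sample product strings are hypothetical.

# matches_model PRODUCT PATTERN: succeed when PRODUCT matches the glob PATTERN.
# The unquoted $2 in the case pattern is what makes it act as a glob.
matches_model() {
    case "$1" in
        $2) return 0 ;;
        *)  return 1 ;;
    esac
}

# A 4-digit type followed by any 3-character variety.
product="1234AB1"
if matches_model "$product" "1234???"; then
    echo "fixup applies to $product"
fi
```

With a pattern such as `1234???`, one entry covers every keyboard or drive variant of that type, at the cost of also matching variants nobody has tested.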
Comment 37 Tejun Heo 2008-05-23 06:25:53 UTC
It should depend on hal, dmidecode, hdparm and smartctl but doesn't really belong to any of them.  pm-utils?

As for the matching, yeah, that can be a problem.  I'll see if I can extend the current script to do globbing.

Thanks.
Comment 38 Forgotten User ZhJd0F0L3x 2008-05-23 08:56:05 UTC
At least HAL and smartctl already have init scripts where it could be called, which pm-utils does not ;-), so i think it should go in one of those two packages.

Since we will need very frequent updates of the blacklist, HAL is probably also a bad idea, since a HAL update pulls in a huge QA effort every time.
Comment 39 Tejun Heo 2008-05-23 09:05:48 UTC
Hmmm... Maybe a separate package then?  I'm almost done implementing glob matching w/ hal info caching.  Man.. bash is a great programming language. :-(

If separate package is the way to go, I'll package it and put it on OBS.  I've never submitted a package before.  What else do I need to do?

Thanks.
Comment 40 Tejun Heo 2008-05-23 11:27:34 UTC
Created attachment 217758 [details]
storage-fixup

storage-fixup v0.1
Comment 41 Tejun Heo 2008-05-23 11:28:09 UTC
Created attachment 217759 [details]
storage-fixup.conf
Comment 42 Tejun Heo 2008-05-23 11:32:51 UTC
Posted are the updated versions of storage-fixup and configuration file which can do glob matching and matches drives in the same family.  It's also improved in many aspects including documentation.  git tree is set up.

  http://git.kernel.org/?p=linux/kernel/git/tj/storage-fixup.git

I started upstream discussion regarding this problem and we'll probably set up a page on linux-ata.org so that we can track things better and other distros can use it too.
Comment 43 Forgotten User qMyteedNxa 2008-05-23 12:40:57 UTC
here are some more for storage-fixup.conf

Western Digital:
WD5000AACS
WD7500AYPS
WD1000FYPS

act sendmsgtouser "please check your disk for load_cycle_count numbers. you may need wdidle3.exe on bootable dos disk/cd to fix it. this is not available for download, so you may need to ask WD support for that tool or for a firmware upgrade."

TOSHIBA:
MK1032GAX

Comment 44 Forgotten User ZhJd0F0L3x 2008-05-23 14:23:09 UTC
(In reply to comment #39 from Tejun Heo)
> Hmmm... Maybe a separate package then?  I'm almost done implementing glob
> matching w/ hal info caching.  Man.. bash is a great programming language. :-(
> 
> If separate package is the way to go, I'll package it and put it on OBS.  I've
> never submitted a package before.  What else do I need to do?

Add it in the PDB and submit it to autobuild. I can do that for you. If you already have it in OBS, please put the package "source dir" somewhere accessible (hm, i should be able to check it out from the OBS...).

Make sure you select a good license, so that we can get past legal quickly ;-)

Then we need management's approval to get that still into 11.0, so i'll add relevant people to cc now.
Comment 45 Tejun Heo 2008-05-26 07:02:53 UTC
Here's the OBS project.  The license is beerware.  Is that something legal can agree with?

https://build.opensuse.org/package/show?package=storage-fixup&project=home%3Ateheo

Every file is accessible through OBS but just in case, the git tree is at...

http://git.kernel.org/?p=linux/kernel/git/tj/storage-fixup.git;a=shortlog;h=suse

I don't know anything about PDB so I would appreciate if you can help me with that.

Thanks.
Comment 46 Tejun Heo 2008-05-26 07:05:39 UTC
Roland, thanks for the info.  Hmmm... Echoing the message to console won't do too much good for most users.  It'll just pass by and in most cases the pretty boot splash won't even show them.

I'll ping the research mailing list on the best way to notify the user about dying but not-workaroundable hard disks.
Comment 47 Forgotten User ZhJd0F0L3x 2008-05-26 07:24:03 UTC
(In reply to comment #45 from Tejun Heo)
> Here's the OBS project.  The license is beerware.  Is that something legal can
> agree with?

Let's hope so, i added Ciaran and Jürgen to CC. Feel free to remove yourself again ;-)

If this is not "good enough", we can probably also dual-license it "beerware / BSD 3-clause". BSD is known to go easily through legal ;)

> https://build.opensuse.org/package/show?package=storage-fixup&project=home%3Ateheo
> 
> Every file is accessible through OBS but just in case, the git tree is at...

Yes, i was able to check it out with OSC. Everything is easy if you have a BS account, it's only hard for people without.

> http://git.kernel.org/?p=linux/kernel/git/tj/storage-fixup.git;a=shortlog;h=suse
> 
> I don't know anything about PDB so I would appreciate if you can help me with
> that.

I'll take care of it.
Comment 48 Pavel Machek 2008-05-26 08:14:43 UTC
Please just use GPL or BSD3clause. Having beerware in default install is a nice joke, but...
Comment 50 Forgotten User ZhJd0F0L3x 2008-05-26 08:27:44 UTC
Created attachment 218052 [details]
patch against OBS package (rev. 9)

I submitted a package to autobuild with the attached diff (pm-utils hooks now belong in /usr/lib/pm-utils, plus fixes for some rpmlint warnings).

PDB seems not to be unhappy about the license btw ;)
Comment 51 Tejun Heo 2008-05-27 03:54:51 UTC
Thanks Stefan.

I'll change the license to BSD and incorporate your changes into the repo.  It also seems we'll need a different way to get storage.model, as the hal output is truncated.
Comment 52 Forgotten User ZhJd0F0L3x 2008-05-27 07:04:14 UTC
Please update your buildservice repo once you are finished, i will pick up the changes from there.
Comment 55 Tejun Heo 2008-05-28 06:07:39 UTC
License switched to BSD and the dependency on HAL removed.  It now uses hdparm and sg_inq to acquire storage device information.  Also, the rc script has been moved to the boot stage, after boot.localfs.  OBS project updated.

Thanks.
Comment 56 Forgotten User ZhJd0F0L3x 2008-05-28 07:49:51 UTC
Thanks, i submitted a package to the build system and changed the license information in PDB => everything should be fine ;-)
Comment 57 Tejun Heo 2008-05-29 00:25:17 UTC
As workaround has gone in, lowering priority to normal.  I'm gonna keep this bug entry open to keep track of this problem.

Thanks.
Comment 58 Alberto Passalacqua 2008-05-29 00:59:52 UTC
Thank you for the excellent work!

Alberto
Comment 59 Tejun Heo 2008-06-20 03:05:55 UTC
Stefan Seyfried, there's no storage-fixup in SL110.  Do you know where it went?  Thanks.
Comment 60 Forgotten User ZhJd0F0L3x 2008-06-21 08:26:40 UTC
No. Everybody necessary is in CC, but maybe it was not severe enough to warrant late inclusion.
Coolo, can we push it out via online update?
Comment 61 Stephan Kulow 2008-06-23 08:14:32 UTC
please read mails I send around.
Comment 62 Pavel Machek 2008-06-23 08:23:51 UTC
coolo: I do not remember being Cced on those mails. I'd still like to know what is going on. Could you attach short summary to the bug?
Comment 63 Stephan Kulow 2008-06-23 09:08:22 UTC
In short: I have nothing to do with maintenance of 11.0 unless the maintenance coordinator needs me.
Comment 70 Pacho Ramos 2008-06-26 07:24:30 UTC
So, is "storage-fixup" the package that fixes this problem? According to http://software.opensuse.org , it has versions for factory, SLED 10 and openSUSE 10.3. Is it safe to use the factory rpm in 11.0?

Thanks a lot for info
Comment 71 Pacho Ramos 2008-06-26 07:39:19 UTC
After reading storage-fixup.conf, it seems that -B 255 is being used, but according to https://wiki.ubuntu.com/DanielHahler/Bug59695 some drives need -B 254 instead. Also, laptop-mode-tools has used 254 instead of 255 since version 1.35, as can be read in http://samwel.tk/laptop_mode/changelog

Also, is fully disabling power management really needed? Wouldn't it cause problems with overheating or shorter battery life?

It seems that my hard drive is currently using the value 128:
# hdparm -I /dev/sda | grep Advanced
	Advanced power management level: 128
	   *	Advanced Power Management feature set
# smartctl -a /dev/sda -d ata | grep Load
193 Load_Cycle_Count        0x0032   090   090   000    Old_age   Always  -       218785
# smartctl -a /dev/sda -d ata | grep Power
  9 Power_On_Seconds        0x0032   076   076   000    Old_age   Always       -       12338h+14m+13s
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       5498
192 Power-Off_Retract_Count 0x0032   098   098   000    Old_age   Always       -       569
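
As a rough sanity check on these numbers, the lifetime average can be worked out by dividing the load cycle count by the power-on hours. The helper name below is hypothetical and awk is used only for the decimal division.

```shell
#!/bin/sh
# Sketch only: avg_cycles_per_hour is a hypothetical helper name.

# avg_cycles_per_hour CYCLES HOURS: lifetime average load cycles per power-on hour.
avg_cycles_per_hour() {
    awk -v c="$1" -v h="$2" 'BEGIN { printf "%.1f\n", c / h }'
}

# Values from the smartctl output above: 218785 load cycles, 12338 power-on hours.
avg_cycles_per_hour 218785 12338
```

That works out to about 17.7 load cycles per power-on hour, well below the one hundred per hour reported in the description, though an average over the drive's lifetime says nothing about the current rate.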

I am not sure if 218785 is a good value, but, after reading "man hdparm":

       -B     Set Advanced Power Management feature, if the drive supports it.
              A  low  value means aggressive power management and a high value
              means better performance.  Possible settings range from values 1
              through 127 (which permit spin-down), and values 128 through 254
              (which do not permit spin-down).  The highest  degree  of  power
              management  is attained with a setting of 1, and the highest I/O
              performance with a setting of 254.  A value of 255 tells  hdparm
              to  disable  Advanced  Power  Management altogether on the drive
              (not all drives support disabling it, but most do).

So it seems that 128 doesn't permit spin-down.

Anyway, the hdparm man page also indicates that 254 could be needed instead of 255 in some cases.
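
The ranges from the man page excerpt can be written down as a small shell function; the function name and message wording are assumptions made for this sketch. Note that the man page talks about spin-down, whereas Load_Cycle_Count counts head unloads, so a level of 128 does not by itself rule out frequent unloads.

```shell
#!/bin/sh
# Sketch only: describe_apm is a hypothetical helper; the ranges follow
# the hdparm man page text quoted above.

# describe_apm LEVEL: classify an `hdparm -B` value.
describe_apm() {
    level=$1
    if [ "$level" -ge 1 ] && [ "$level" -le 127 ]; then
        echo "$level: power management, spin-down permitted"
    elif [ "$level" -ge 128 ] && [ "$level" -le 254 ]; then
        echo "$level: power management, spin-down not permitted"
    elif [ "$level" -eq 255 ]; then
        echo "$level: APM disabled"
    else
        echo "$level: out of range"
    fi
}

# The level reported by `hdparm -I` above, and the two candidate fixes.
for l in 128 254 255; do
    describe_apm "$l"
done
```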

Thanks a lot
Comment 72 Forgotten User ZhJd0F0L3x 2008-06-26 08:04:02 UTC
(In reply to comment #70 from Pacho Ramos)
> Then, Is "storage-fixup" the package who fixes this problem? From
> http://software.opensuse.org , seems that it has versions for factory, SLED 10
> and opensuse-10.3, Is save use factory rpm in 11.0 ?

Yes. But there will also be an online update for 11.0 delivered soon, so it should pop up in YOU shortly without the need for adding any buildservice repositories.
Comment 73 Tejun Heo 2008-06-27 13:38:54 UTC
(In reply to comment #71 from Pacho Ramos)
> After reading storage-fixup.conf, seems that -B 255 is being used but, after
> reading https://wiki.ubuntu.com/DanielHahler/Bug59695 , seems that some drives
> would need -B 254 instead. Also laptop-mode-tools now uses 254 instead of 255
> from 1.35 version as can be read in http://samwel.tk/laptop_mode/changelog
> 
> Also, Is fully disabling Power Manager really needed? Wouldn't it cause some
> problems related with overheat or short battery lifetime? 

As discussed above, no one value suits every drive.  I wish it were like that, but the standard isn't too specific about which value means exactly what.  Furthermore, this is ATA and vendors often forget to follow the spec.  So, for some drives 255 is a good value, for others 254, and for yet others 128.  That's why storage-fixup matches specific machines and uses the appropriate commands.  We'll need to find out which value is the appropriate one for each specific machine and add it machine-by-machine.
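
A minimal sketch of the machine-by-machine idea follows. It is not the real storage-fixup.conf format: the function names, the table entries and the values assigned to them are assumptions for illustration. It prints the hdparm command instead of running it.

```shell
#!/bin/sh
# Sketch only: the table entries and their -B values are illustrative
# assumptions, not the contents of the real storage-fixup.conf.

# apm_value_for VENDOR PRODUCT: print the -B value for a known machine,
# or nothing when the machine is not listed.
apm_value_for() {
    case "$1|$2" in
        # Trailing * tolerates the padding seen in dmidecode product names.
        "Hewlett-Packard|HP Pavilion dv6500"*) echo 255 ;;
        # Hypothetical vendor with a 4-digit type and 3-character variety.
        "EXAMPLE-VENDOR|1234"???)              echo 254 ;;
    esac
}

# fixup_command VENDOR PRODUCT DEVICE: print the command that would be run.
fixup_command() {
    value=$(apm_value_for "$1" "$2")
    if [ -n "$value" ]; then
        echo "hdparm -B $value $3"
    fi
}

# Dry run using the DMI strings from the dmidecode output earlier in this bug.
fixup_command "Hewlett-Packard" "HP Pavilion dv6500 Notebook PC    " /dev/sda
```

Unlisted machines produce no output, so nothing is changed on hardware that has not been reported and tested.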
Comment 74 Pacho Ramos 2008-06-27 16:49:22 UTC
OK, thanks a lot for explanation
Comment 75 Anja Stock 2008-07-04 14:21:09 UTC
released
Comment 76 Tejun Heo 2008-07-04 14:26:51 UTC
From now on, please use the following wiki page to track this problem and report new ones to linux-ide@vger.kernel.org as this problem is not specific to SUSE.

  http://ata.wiki.kernel.org/index.php/Known_issues

Thanks.