Difference between revisions of "Talk:How to hotswap Ultrabay devices"

From ThinkWiki
  
 

Revision as of 14:44, 5 October 2006

I recently tried the libata-tj patch tarball for 2.6.16.16, applying it to the newly released 2.6.16.18 kernel (released today). The patch applied cleanly. Upon boot, I immediately got a multitude of "weird" errors: strange lockups, programs dying with segmentation faults (running "top" resulted in a seg fault), and ultimately a hard lockup.

I booted back into my vanilla 2.6.16.16, ran fsck (it appeared to just replay a few transactions, no major damage), and am back to normal. However, it successfully scared me off; unfortunately I can't risk too much downtime (or worse, subtle fs corruption) right now on my main system. Does anybody have experience with this on a T43p using the piix driver?

--gsmenden 00:00, 23 May 2006 (EST)


The 2.6.16.16 patch works fine on my T43. There's a git tree (mentioned on the patch's webpage) which is closer to 2.6.18, but AFAIK no simple unified patch was prepared.

--Thinker 08:37, 23 May 2006 (CEST)


Cool. If I get brave I'll try it again on the 43p against 2.6.16.16 proper and report back.

--gsmenden 15:29, 23 May 2006 (EST)


Works fine here on 2.6.16. I got only one crash with suspend to RAM, which I have been unable to reproduce so far. I renamed the acpi event files because at least my acpid doesn't read files that end with .conf.

--Defiant 21:09, 28 May 2006 (CEST)
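
In case it helps others whose acpid behaves the same way, here is a minimal sketch of that rename. The function name strip_conf_suffix and the default directory /etc/acpi/events are assumptions, not something taken from the how-to; pass your own events directory as the first argument.

```shell
#!/bin/sh
# Strip a trailing ".conf" from every file in an acpid events directory.
# /etc/acpi/events is only an assumed default; pass the real directory as $1.
strip_conf_suffix() {
    dir="${1:-/etc/acpi/events}"
    for f in "$dir"/*.conf; do
        [ -e "$f" ] || continue            # the glob matched nothing
        target="${f%.conf}"                # same name without the suffix
        if [ -e "$target" ]; then
            echo "skipping $f: $target already exists" >&2
            continue
        fi
        mv -- "$f" "$target"
    done
}
```

After renaming, acpid probably has to be restarted or told to reload its rules before it picks up the files.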


Update - patched against 2.6.16.19, works fine. It appears my previous problems were due to a disk error unrelated to the patch. Excellent!

--gsmenden 00:57, 31 May 2006 (EST)

Anybody have time to make a patch of the libata(-tj) .git tree against the recently released 2.6.17? I hope to make one in the future if not...

--gsmenden 22:08, 19 Jun 2006 (EST)

one nit about ultrabay_close script / patch against 2.6.17 available

Howdy,

In ultrabay_close, there is 'sleep 3' for disk spinup, which isn't necessary. libata itself waits for disk spinup and if something breaks (e.g. first reset fails w/ timeout or something), it's libata's fault. Please remove that line and see if anything breaks.

Also, I've uploaded patch against 2.6.17/2.6.17.1 today.

http://home-tj.org/files/libata-tj-stable/libata-tj-2.6.17-20060625-1.tar.bz2

Hmmm... My post looks different from others. This wasn't intentional. Just don't know how to add normal discussion entry. Sorry.

--tj


Right, it works fine without "sleep 3" using the new patches. Sleep removed.

--Thinker 12:35, 1 July 2006 (CEST)
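
For reference, a close-side handler without the sleep can be as small as the sketch below. This is not the ultrabay_close script from the article, just an illustration of a bare rescan. The function name rescan_scsi_host and the host number in the default path (host1) are assumptions; check which SCSI host your bay drive actually sits on.

```shell
#!/bin/sh
# Ask the SCSI layer to rescan one host after the bay has been closed.
# No sleep beforehand: libata is expected to wait for spin-up by itself.
# The default path assumes the bay is on host1, which may differ per machine.
rescan_scsi_host() {
    scan_file="${1:-/sys/class/scsi_host/host1/scan}"
    if [ ! -w "$scan_file" ]; then
        echo "cannot write $scan_file" >&2
        return 1
    fi
    # "- - -" is the wildcard form: any channel, any target, any LUN
    echo "- - -" > "$scan_file"
}
```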


Is it correct that the ata_piix driver in kernel 2.6.18 RC4 now supports hot swapping as described in the howto and as announced at http://lwn.net/Articles/183734/ ?

--cob 15:53, 23 August 2006

T42 freezing up when trying to hot swap ultrabay.

Hi,

Please bear with me. I am totally new at this and I am making my best effort to understand and learn.

My problem is that when typing "# echo eject > /proc/acpi/ibm/bay" to eject my ultrabay and put another in, I see the power going off in the ultrabay LED, but then my PC freezes completely.

I am running Fedora 6 Test 3, kernel 2.6.17-1.2647 and my notebook is a ThinkPad T42.

Please help! I constantly have to swap the drive in my bay to use information on other hard drives, and at the moment I have to shut down the system completely to avoid problems.

Thanks,

--Barny 09/21/2006@7:46PM EST
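
One thing that may be worth ruling out is ejecting while the bay drive is still mounted or still registered with the kernel. The sketch below refuses to eject if anything on the disk is mounted, removes the SCSI device, and only then writes eject to the bay file. It assumes the drive is handled by libata and shows up as a SCSI disk; with the older IDE driver (hdX devices) the sysfs delete file does not exist and that step is skipped. The function name safe_bay_eject, the disk name, and the 1:0:0:0 address are all assumptions to adapt, and nothing here is confirmed to cure the freeze.

```shell
#!/bin/sh
# Eject the bay only after checking mounts and detaching the SCSI device.
# Arguments (all optional, defaults are assumptions):
#   $1 disk node, $2 mounts table, $3 sysfs delete file, $4 ACPI bay file
safe_bay_eject() {
    disk="${1:-/dev/sdb}"
    mounts="${2:-/proc/mounts}"
    delete_file="${3:-/sys/class/scsi_device/1:0:0:0/device/delete}"
    bay_file="${4:-/proc/acpi/ibm/bay}"

    # Refuse if the disk or any of its partitions is still mounted.
    if grep -q "^${disk}[0-9]* " "$mounts"; then
        echo "$disk is still mounted, unmount it first" >&2
        return 1
    fi

    sync                                   # flush pending writes

    # Unregister the disk from the SCSI layer when the sysfs file exists.
    if [ -e "$delete_file" ]; then
        echo 1 > "$delete_file" || return 1
    fi

    echo eject > "$bay_file"
}
```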

Second disk not seen correctly on reinsert (T43p)

I have followed the instructions on my T43p running Gentoo with kernel 2.6.18. I have a second hard disk in the UltraBay, using ata_piix, so it is seen as /dev/sdb (as described in Problems with SATA and Linux). The eject works fine. When I reinsert the disk and issue the rescan command, only the main /dev/sdb device reappears, but not the devices corresponding to the partitions (/dev/sdb1, etc.), so I cannot mount them, and fdisk /dev/sdb says that it cannot open the device.

In dmesg, I see a bunch of errors like these, repeated multiple times:

sd 1:0:0:0: SCSI error: return code = 0x08000002
sdb: Current: sense key=0xb
    ASC=0x0 ASCQ=0x0
end_request: I/O error, dev sdb, sector 0
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)

And at the end:

sdb: Current: sense key=0xb
    ASC=0x0 ASCQ=0x0
end_request: I/O error, dev sdb, sector 0
ata2: EH complete
SCSI device sdb: 117210240 512-byte hdwr sectors (60012 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 117210240 512-byte hdwr sectors (60012 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back

The situation is not cured by a reboot (I still see only /dev/sdb); I have to power cycle to get the devices back.

Thanks for any ideas.
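
The log shows reads of sector 0 failing (cmd 0x20 is a PIO read, and err 0x4 means the drive aborted the command), which would explain the missing partition devices: the kernel cannot read the partition table. A small sketch for checking what the kernel currently knows about the disk follows. The function name list_partitions is made up for illustration, and the disk name is passed without the /dev/ prefix.

```shell
#!/bin/sh
# Print the partitions the kernel has registered for one disk.
# /proc/partitions columns are: major minor #blocks name
list_partitions() {
    disk="$1"                              # e.g. sdb
    table="${2:-/proc/partitions}"
    awk -v d="$disk" '$4 ~ ("^" d "[0-9]+$") { print $4 }' "$table"
}
```

If `list_partitions sdb` prints nothing after the rescan, `blockdev --rereadpt /dev/sdb` asks the kernel to read the partition table again. While sector 0 still returns I/O errors, that will most likely fail in the same way.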