Talk:How to hotswap Ultrabay devices

From ThinkWiki
Revision as of 16:02, 17 August 2007 by PeterJordan (Talk | contribs) (problem with umount_rdev)

I recently tried the libata-tj patch tarball for 2.6.16.16, applying it against the newly released 2.6.16.18 kernel (released today). The patch applied cleanly. Upon boot, I immediately got a multitude of "weird" errors: strange lockups, programs segfaulting (running "top" resulted in a seg fault), and ultimately a hard lockup.

I booted back to my vanilla 2.6.16.16, ran fsck (appeared to just replay a few transactions, no major damage), and am back to normal. However, it successfully scared me off - unfortunately can't risk too much downtime (or worse, subtle fs corruption) right now on my main system. Anybody have experiences with this on a T43p using piix driver?

--gsmenden 00:00, 23 May 2006 (EST)


The 2.6.16.16 patch works fine on my T43. There's a git tree (mentioned on the patch's webpage) which is closer to 2.6.18, but AFAIK no simple unified patch was prepared.

--Thinker 08:37, 23 May 2006 (CEST)


Cool. If I get brave I'll try it again on the 43p against 2.6.16.16 proper and report back.

--gsmenden 15:29, 23 May 2006 (EST)


Works fine here on 2.6.16. I got only one crash with suspend to RAM, which I have been unable to reproduce so far. I renamed the acpi event files because at least my acpid doesn't read files that end with .conf

--Defiant 21:09, 28 May 2006 (CEST)


Update - patched against 2.6.16.19, works fine. It appears my previous problems were due to a disk error unrelated to the patch. Excellent!

--gsmenden 00:57, 31 May 2006 (EST)

Anybody have time to make a patch of the libata(-tj) .git tree against the recently released 2.6.17? I hope to make one in the future if not...

--gsmenden 22:08, 19 Jun 2006 (EST)

one nit about ultrabay_close script / patch against 2.6.17 available

Howdy,

In ultrabay_close, there is 'sleep 3' for disk spinup, which isn't necessary. libata itself waits for disk spinup and if something breaks (e.g. first reset fails w/ timeout or something), it's libata's fault. Please remove that line and see if anything breaks.

Also, I've uploaded a patch against 2.6.17/2.6.17.1 today.

http://home-tj.org/files/libata-tj-stable/libata-tj-2.6.17-20060625-1.tar.bz2

Hmmm... My post looks different from others. This wasn't intentional. I just don't know how to add a normal discussion entry. Sorry.

--tj


Right, it works fine without "sleep 3" using the new patches. Sleep removed.

--Thinker 12:35, 1 July 2006 (CEST)


Is it correct that the ata_piix driver in kernel 2.6.18 RC4 now supports hot swapping as described in the howto and announced here: http://lwn.net/Articles/183734/ ?

--cob 15:53, 23 August 2006

T42 freezing up when trying to hot swap ultrabay.

Hi,

Please bear with me. I am totally new at this and I am making my best effort to understand and learn.

My problem is that when typing "# echo eject > /proc/acpi/ibm/bay" to eject my ultrabay and put another in, I see the power going off in the ultrabay LED, but then my PC freezes completely.

I am running Fedora 6 Test 3, kernel 2.6.17-1.2647 and my notebook is a ThinkPad T42.

Please help! I have to constantly be changing my bay to use information in other hard drives, and I have to shutdown the system completely to not have any problems.

Thanks,

--Barny 09/21/2006@7:46PM EST


I have the same problem on a T40p running SuSE 10.1. The lt_hotswap module is of no help either. Keep me informed in case you have a solution! Thanks, --Ays 19:49, 5 October 2006 (CEST)


I have no problems with kernel 2.6.17-1.2187_1.fc5.cu from suspend2 on my T42p running Fedora Core 5. I have compiled the lt_hotswap module and everything works fine. Since kernel 2.6.18-1.2200.fc5, my system freezes on loading the module or on calling "echo eject > /proc/acpi/ibm/bay". Any ideas what has changed in the kernel?

--CoolMischa 2006-11-06@13:24 CET

Second disk not seen correctly on reinsert (T43p) [solved]

(update: see below for solution)

I have followed the instructions on my T43p running Gentoo using 2.6.18. I have a second hard disk in the UltraBay, using ata_piix, so it is seen as /dev/sdb (as described in Problems with SATA and Linux). The eject works fine. When I reinsert it and issue the rescan command, only the main /dev/sdb device reappears, but not the ones corresponding to the partitions (/dev/sdb1, etc.), so I cannot mount them, and fdisk /dev/sdb says that it cannot open the device.

In dmesg, I see a bunch of errors like these, repeated multiple times:

sd 1:0:0:0: SCSI error: return code = 0x08000002
sdb: Current: sense key=0xb
    ASC=0x0 ASCQ=0x0
end_request: I/O error, dev sdb, sector 0
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)
ata2: EH complete
ata2.00: speed down requested but no transfer mode left
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: tag 0 cmd 0x20 Emask 0x1 stat 0x51 err 0x4 (device error)

And at the end:

sdb: Current: sense key=0xb
    ASC=0x0 ASCQ=0x0
end_request: I/O error, dev sdb, sector 0
ata2: EH complete
SCSI device sdb: 117210240 512-byte hdwr sectors (60012 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 117210240 512-byte hdwr sectors (60012 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back

The situation is not cured by a reboot (I still see only /dev/sdb); I have to power cycle to get the devices back.

Thanks for any ideas.


(2006-10-10) As a followup to my note above, I have noticed that the DVD-RW drive works perfectly after hot-swapping it; it's just the second hard disk that does not get recognized properly. I can "scsiping" the /dev/sdb device and it seems to respond OK. I have tried restarting udevd without success, and I'm at a loss as to what to try next.

---

It turned out to be an obvious problem - I had a disk password set on my second disk, so on reinsert it could not be accessed. I turned off the disk password, and now it works perfectly.
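For anyone hitting the same symptom, the lock state can probably be checked from Linux before digging further. The following is only a sketch: the function name ata_security_locked is made up for illustration, and it assumes that the Security section printed by "hdparm -I" contains a line ending in "locked" that is prefixed with "not" when the disk is unlocked. Check the output of your own hdparm version before relying on it.

```shell
# Hypothetical helper: reads "hdparm -I <device>" output on stdin and
# succeeds (exit 0) only if the Security section reports the disk as locked.
# Assumption: the relevant line is "not locked" when unlocked and just
# "locked" when locked, with "locked" as the last field.
ata_security_locked() {
    awk '
        $NF == "locked" { found = 1; locked = ($1 != "not") }
        END { exit !(found && locked) }
    '
}
```

Usage would be something like "hdparm -I /dev/sdb | ata_security_locked && echo locked" after reinserting the disk.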

ultrabay_open: Problem when using /proc/mounts

I am just working on a perl-free version of the ultrabay_open script. When the script reads the currently mounted devices from /proc/mounts, it may not find all the relevant device files. A file system mounted with a relative device path given to the mount command doesn't show up with the absolute device path in /proc/mounts. Example:

Running "cd /dev/; mount sdb5 /mnt" as root results in the following line in /proc/mounts:

sdb5 /mnt ext3 rw 0 0

/etc/mtab contains the needed information:

/dev/sdb5 /mnt ext3 rw 0 0

However, /proc/mounts is the more reliable source of information IMHO. The absolute device path is needed to find out its major and minor numbers.

Any suggestions?

--MinioN 01:03, 28 December 2006 (CET)
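One possible workaround, sketched below, is to treat a non-absolute device field as a name relative to /dev and use it only if such a node exists. The function names resolve_mount_dev and list_mounted_devs are hypothetical, and the guess is wrong if the mount was issued from a directory other than /dev, so this is a heuristic rather than a real fix.

```shell
# Hypothetical helper: turn the device field of a /proc/mounts line into an
# absolute path. The second argument (default /dev) is where relative names
# are looked up; it is a parameter only so the function can be tried out
# against a scratch directory.
resolve_mount_dev() {
    dev=$1
    dev_dir=${2:-/dev}
    case $dev in
        /*) printf '%s\n' "$dev" ;;                # already absolute
        *)  if [ -e "$dev_dir/$dev" ]; then
                printf '%s\n' "$dev_dir/$dev"      # relative name found under /dev
            else
                printf '%s\n' "$dev"               # proc, sysfs, etc.: leave unchanged
            fi ;;
    esac
}

# Print one resolved device path per line of a mounts-format file.
list_mounted_devs() {
    mounts_file=${1:-/proc/mounts}
    dev_dir=${2:-/dev}
    while read -r dev mnt rest; do
        resolve_mount_dev "$dev" "$dev_dir"
    done < "$mounts_file"
}
```

If only the major and minor numbers are needed, "mountpoint -d /mnt" may be an alternative worth checking, since it reports the device numbers of the mounted file system without going through the device path at all.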

Can't mount CD after reinsert on X41

I did all the steps described on this page, and the drive ejects fine. When I reinsert it, the /dev/scd0 entry reappears, but when I insert a CD, Gnome won't mount it automatically, and when I try manually I get this message:

   mount: wrong fs type, bad option, bad superblock on /dev/scd0,
   missing codepage or other error

and dmesg says:

   isofs_fill_super: bread failed, dev=sr0, iso_blknum=16, block=16

I have to reboot to use the drive again.

P.S. I discovered the following in dmesg when I boot:

   ata2.01: qc timeout (cmd 0xa1)
   ata2.01: failed to IDENTIFY (I/O error, err_mask=0x4)
   ata2: failed to recover some devices, retrying in 5 secs
   ata2.01: qc timeout (cmd 0xa1)
   ata2.01: failed to IDENTIFY (I/O error, err_mask=0x4)
   ata2: failed to recover some devices, retrying in 5 secs
   ata2.01: qc timeout (cmd 0xa1)
   ata2.01: failed to IDENTIFY (I/O error, err_mask=0x4)
   ata2: failed to recover some devices, retrying in 5 secs
   ata2.00: configured for UDMA/33

Is that relevant?

hdparm -Y /dev/<devnode>

Is the hdparm part in the ultrabay_eject script really necessary?

It does not work with my DVD-RAM drive (R60):

hdparm -Y /dev/sdb

/dev/sdb:
issuing sleep command
HDIO_DRIVE_CMD(sleep) failed: Input/output error

thanks,
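One way the script could avoid this error is to issue the sleep command only for hard disks. The sketch below assumes the SCSI "type" attribute in sysfs (for example /sys/class/scsi_device/1:0:0:0/device/type) reads 0 for a disk and 5 for a CD/DVD drive; the function names are hypothetical and this is not what the ultrabay_eject script on the page currently does.

```shell
# Hypothetical helper: succeed only if the given sysfs "type" file
# exists and contains 0 (SCSI direct-access device, i.e. a disk).
scsi_type_is_disk() {
    [ -r "$1" ] && [ "$(cat "$1")" = "0" ]
}

# Put the device to sleep with hdparm -Y only when it is a disk;
# optical drives are skipped with a message instead of an I/O error.
spin_down_if_disk() {
    type_file=$1
    devnode=$2
    if scsi_type_is_disk "$type_file"; then
        hdparm -Y "$devnode"
    else
        echo "skipping hdparm -Y for $devnode (not a disk)"
    fi
}
```

For the drive above this would be called as "spin_down_if_disk /sys/class/scsi_device/1:0:0:0/device/type /dev/sdb", with the SCSI address adjusted to match your system.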

problem with umount_rdev

I tried out the ultrabay_eject script and got this error (Debian lenny with kernel 2.6.22.3):

cat: /sys/class/scsi_device/1:0:0:0/device/block:*/*/dev: No such file or directory

What is wrong? Why do I need the output of $ULTRABAY_SYSDIR/block\:*/*/dev in

unmount_rdev `cat $ULTRABAY_SYSDIR/block\:*/dev    \
$ULTRABAY_SYSDIR/block\:*/*/dev`  \

?
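A likely explanation, though not verified on that system: the second pattern is meant to match the "dev" files of the partitions, so that they get unmounted as well. When the device in the bay has no partitions (an optical drive, for instance), the pattern matches nothing, the shell passes it to cat unexpanded, and cat reports the literal pattern as a missing file. The sketch below, with the hypothetical name list_dev_numbers, reads only the files that actually exist:

```shell
# Hypothetical helper: print the major:minor numbers of the device and of
# any partitions found under the given sysfs directory. Patterns that match
# nothing are skipped instead of being handed to cat unexpanded.
list_dev_numbers() {
    sysdir=$1
    for f in "$sysdir"/block:*/dev "$sysdir"/block:*/*/dev; do
        if [ -e "$f" ]; then
            cat "$f"
        fi
    done
    return 0
}
```

The call in the script could then become something like unmount_rdev `list_dev_numbers $ULTRABAY_SYSDIR`, assuming unmount_rdev copes with a shorter argument list.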