Thinkpad G41 overheating
This is a short description of what happened today (27th Dec 2006) with my overheating G41. It's not very technical, as I don't exactly understand what happened. Anyway, I'm placing it here; perhaps someone with better hardware knowledge than me can improve it and tell me what happened.
The problem: my G41 (equipped with a P4 3.4 GHz and NVidia graphics) started to overheat some time ago when working at full CPU speed. Temperatures reached 85 deg Celsius, causing the kernel to shut down the system (with a "temperature above threshold" message). After some time, even when I scaled the CPU down to 433 MHz using cpu-freq, the CPU temperature could still reach 85 deg Celsius and the system shut down.
What I noticed is that the fan was always pretty quiet. It never reached higher RPM speeds. Temporary solution? I used an external cooling fan (a 40 Watt one can give you quite a lot of cool air); unfortunately, the CPU could still overheat. Perfect solution? Learn how to control the CPU cooling fan from Linux.
I've looked in many places to achieve that. Unfortunately, ibm-acpi (0.12a, even with the patch) doesn't seem to support fan speed control on the G41. I have no idea about 0.13 in 2.6.20.
I upgraded the BIOS, as described in BIOS Upgrade. During the actual upgrade (when the new BIOS was being written to flash) I heard... fan noise, more powerful and louder than I had ever heard from this machine before.
Unfortunately, Linux (2.6.18) was still running with the same slow fan speed, even when I booted the kernel with the "acpi=off apm=off" parameters.
So, my plan was:
1) modprobe ibm-acpi experimental=1
2) find the correct register in /proc/acpi/ibm/ecdump
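One way to approach step 2 is to compare two snapshots of ecdump, taken before and after something changes (for example, fan noise), and list the registers whose values differ. The sketch below is only an illustration: parse_ecdump and changed_registers are hypothetical helper names, and the assumed row layout ("EC 0x20:" followed by hex bytes, with an optional "*" marking a changed byte) should be checked against the real output on your machine.

```python
import re

# Assumed row shape: "EC 0x20:  00 *07  80 ..." (hex bytes, '*' marks a change).
ROW = re.compile(r"^EC 0x([0-9a-fA-F]{2}):(.*)$")


def parse_ecdump(text):
    """Return {register: value} parsed from an ecdump-style text dump."""
    regs = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # column header or unrelated line
        base = int(match.group(1), 16)
        # Drop '*' markers, then read the whitespace-separated hex bytes.
        tokens = match.group(2).replace("*", " ").split()
        for offset, token in enumerate(tokens):
            regs[base + offset] = int(token, 16)
    return regs


def changed_registers(before, after):
    """Return {register: (old, new)} for registers present in both dumps."""
    return {
        reg: (before[reg], after[reg])
        for reg in sorted(before)
        if reg in after and before[reg] != after[reg]
    }


# Made-up sample data, not taken from a real G41.
sample_before = "EC       +00 +01 +02\nEC 0x20:  00  07  80\n"
sample_after = "EC       +00 +01 +02\nEC 0x20:  00 *03  80\n"
diff = changed_registers(parse_ecdump(sample_before), parse_ecdump(sample_after))
for reg, (old, new) in diff.items():
    print("0x%02x: 0x%02x -> 0x%02x" % (reg, old, new))
```

Reading the dump this way is far less risky than writing to it, so it is worth trying first.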
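I started to write values into the registers using a simple Python script. It writes 0x06 and then 0x07 to each of the 16 registers starting at baseReg; changing baseReg moves on to another block of 16 registers: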
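# Walk one block of 16 EC registers, writing 0x06 and then 0x07 to each.
baseReg = 16 * 1  # first register of the block; change the multiplier for other blocks

for reg in range(0, 16):
    for value in range(6, 8):
        command = "0x%x 0x0%x" % (baseReg + reg, value)
        print(command)
        with open("/proc/acpi/ibm/ecdump", "w") as ec:
            ec.write(command + "\n")
        # Pause so the effect of each write can be observed.
        print("Press ENTER")
        input()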
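Since a stray write can hang the machine, it may help to build and check each command line before it goes anywhere near the embedded controller. This is a minimal sketch, not part of the original script: format_ec_write and write_ec are hypothetical names, and the "0xRR 0xVV" format is simply the one the script above produces. The dry run writes to an in-memory buffer, not to /proc/acpi/ibm/ecdump.

```python
import io


def format_ec_write(register, value):
    """Build a "0xRR 0xVV" command line, rejecting anything outside one byte."""
    for name, number in (("register", register), ("value", value)):
        if not 0 <= number <= 0xFF:
            raise ValueError("%s out of byte range: %r" % (name, number))
    return "0x%02x 0x%02x" % (register, value)


def write_ec(stream, register, value):
    """Write one command line to an already opened, writable text stream."""
    stream.write(format_ec_write(register, value) + "\n")


# Dry run against an in-memory buffer instead of the real ecdump file.
buf = io.StringIO()
write_ec(buf, 0x10, 0x07)
print(buf.getvalue(), end="")
```

To write for real, you would pass a file opened on /proc/acpi/ibm/ecdump in place of the buffer, with all the risks described on this page.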
A few times I had to reboot, as the machine froze after some writes. Some of those writes seem to have hung the whole OS; some seem only to have disconnected the keyboard (because Linux continued to run); and some of the hangs required disconnecting the AC cable and taking out the battery, as I was not able to turn the machine off even by holding the power button for more than 4 seconds.
What I found out:
- by writing to 0x57 I was able to do various display voodoo (turn off display scaling, change the backlight)
- 0x9a (I believe) caused the machine to freeze; there were some more registers that caused the machine to freeze
- 0xea seemed to speed up the fan a bit, but after that the fan went slow again
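If you repeat the experiment, it makes sense to record such findings and leave out registers already suspected of freezing the machine. The following is only a sketch under those assumptions: FINDINGS, SKIP and sweep_plan are hypothetical names, the notes just restate the list above, and 0x9a is included only because I believe it was one of the freezing registers.

```python
# Notes from the list above; the 0x9a entry is uncertain.
FINDINGS = {
    0x57: "display effects (scaling off, backlight change)",
    0x9A: "machine froze (uncertain)",
    0xEA: "fan sped up briefly, then slowed down again",
}

# Registers to leave alone on the next pass.
SKIP = {0x9A}


def sweep_plan(start, stop, values=(0x06, 0x07), skip=SKIP):
    """List (register, value) pairs for start <= register < stop, minus skipped ones."""
    return [
        (reg, val)
        for reg in range(start, stop)
        if reg not in skip
        for val in values
    ]


# Print the planned writes around 0x9a without performing any of them.
for reg, val in sweep_plan(0x99, 0x9C):
    print("0x%02x 0x%02x" % (reg, val))
```

This only prints the plan; nothing is written to the embedded controller.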
After some more freezes, the machine... refused to boot from the hard disk. The BIOS started correctly, but then the machine was unable to proceed any further for a noticeable amount of time. Neither the blue Access key nor F12 pressed during boot made any progress; it merely displayed a message, then nothing.
After some more tries (reboots) and giving it some more time... the machine finally booted. It seems that I must have written to some really strange ecdump registers: my boot-up and BIOS passwords were gone (and I suppose some more BIOS settings were lost).
Surprisingly... the fan started to work properly. I can hear it speed up when the machine is getting warm. It keeps my CPU temperature around 75 deg Celsius at 3.4 GHz; as I found out, with the fan finally working, it never went above 82 deg Celsius.
To get the fan working properly on your Linux-powered G41, make sure to:
- upgrade the BIOS
- reset all the BIOS data somehow (perhaps by writing directly to ecdump, just like I did by accident; but beware, this could be dangerous to your hardware)
I understand that the above description is not really technical or precise, and that this may not be the best place to publish it. Anyway, my G41 overheating problems were such a pain in the arse that I even thought about replacing my computer, and I was unable to find a working solution. A BIOS upgrade and some random writes to ecdump (and some hang-ups) seem to have solved that. Perhaps someone else will be able to describe more technically what really happened.