
ACPI (Advanced Configuration and Power Interface): Suspend and Resume

August 23, 2013

http://www.advogato.org/article/913.html

 

Back in the APM days, everything was easy. You called an ioctl on /dev/apm,
and the kernel made a BIOS call. After that, it was all up to the hardware.
Sure, it never really worked properly, and it was basically impossible to debug
what the hardware actually did. And then ACPI came along, and nothing worked at
all. Several years later, we're almost back to where we were with APM. But
what's actually happening when you hit that sleep key?

 

Without the ability to suspend and resume, laptop users are doomed to spend
several hours of their lives waiting for machines to boot and shut down. This is,
clearly, suboptimal. APM made it fairly easy to implement this, because almost
everything was handled by the BIOS. And that, in a nutshell, is one of the
primary reasons why ACPI ended up in charge.

 

The biggest problem with APM is that it left policy in hardware. Don't want
to suspend on lid closure? The OS doesn't get any say in the matter, though if
you're lucky there might be a BIOS option to control it. Would prefer it if the
BIOS didn't scribble all over the contents of your video registers while it
tries to reprogram them (probably back to the defaults of the Windows
drivers...)? Sucks to be you. Want the sleep button to trigger suspend to disk,
not suspend to RAM? A-ha ha ha.

 

ACPI deals with that problem, by moving almost all the useful functionality
out of hardware. The downside of this is that the functionality needs to be
reimplemented in the OS. Which, given that the ACPI spec is around 600 pages
long, has taken a little time.

 

(Of course, it turns out that most of the ACPI spec is entirely uninteresting
for suspend and resume purposes, but that's not really the point right now.)

 

So, firstly, let's have some ACPI jargon. ACPI itself stands for "Advanced
Configuration and Power Interface". It's not just a power management spec - it
provides the OS with a description of all the built-in hardware in your system,
along with a certain degree of abstraction. It gives you information about
interrupt routing, tells you if someone's just removed a hot-pluggable DVD drive
from a laptop and may even let you control which video output is being used.

 

This information is provided in a table called the DSDT (Differentiated System
Description Table). The DSDT is in a bytecode called AML (ACPI Machine Language),
compiled from a simple language called ASL (ACPI Source Language, shockingly
enough). At boot time, the system reads the DSDT, parses it and executes various
methods. These can do pretty much anything, but on the bright side they're being
executed in kernel context and (in principle) you can filter out anything that
you really don't want to do (such as scribbling all over CMOS or something).

 

The final relevant piece of ACPI information is something called the FADT, or
Fixed ACPI Description Table. This gives the OS information about various
register addresses. It's a static structure, and doesn't contain any executable
code.

 

So, how does all of this stuff actually work?

 

First of all, the user hits the sleep key. This triggers a hardware
interrupt, which is caught by the embedded controller. That pokes a register in
the southbridge, which flags that a general purpose event has just occurred. The
OS notices this, and checks the DSDT for what's supposed to happen next.
Generally, this just calls a notification event. This is bounced back out to
userspace via /proc/acpi/event (currently, though it's going to be moved to the
input layer in future) and userspace gets to choose what happens next.
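
To make the userspace side a little more concrete, here is a small Python sketch of how a daemon might decode one event line and apply a policy. It assumes the four-field text form that acpid-style tools report (device class, device name, type, data, e.g. "button/sleep SBTN 00000080 00000001"); the names `AcpiEvent`, `parse_event` and `wants_suspend` are invented for the example, and the policy is deliberately trivial.

```python
from typing import NamedTuple


class AcpiEvent(NamedTuple):  # hypothetical name, for illustration only
    device_class: str  # e.g. "button/sleep"
    bus_id: str        # device name taken from the DSDT, e.g. "SBTN"
    event_type: int
    data: int


def parse_event(line: str) -> AcpiEvent:
    """Split one event line into its four fields (assumed format)."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"unexpected ACPI event line: {line!r}")
    device_class, bus_id, type_hex, data_hex = fields
    return AcpiEvent(device_class, bus_id, int(type_hex, 16), int(data_hex, 16))


def wants_suspend(event: AcpiEvent) -> bool:
    """Example policy only: any sleep-button event means suspend to RAM."""
    return event.device_class == "button/sleep"
```

The point is simply that the decision lives in userspace: swapping `wants_suspend` for something that hibernates, or ignores the key entirely, needs no help from the BIOS.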

 

Let's concentrate on the common scenario, which is that someone hitting the
sleep button wants to suspend to RAM. Via some abstraction (either acpid,
gnome-power-manager or kpowersave or something), userspace makes that decision
and initiates the suspend to RAM process by either calling a suspend script
directly or bouncing via HAL.

 

Depending on distribution, this ends up running a shell script or binary
which attempts to prepare the system for suspend. Right now, this tends to
involve a bunch of bandaids around various broken drivers - unloading modules
and reloading them is one of the easiest workarounds for breakage. Finally, the
string "mem" is written to /sys/power/state.
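
The final step of such a script can be sketched in a few lines of Python. This is a minimal sketch, not what any particular distribution ships: `request_sleep` is a made-up helper, and it relies on the fact that reading /sys/power/state lists the states the kernel supports.

```python
from pathlib import Path


def request_sleep(state: str = "mem", state_file: str = "/sys/power/state") -> None:
    """Ask the kernel to enter a sleep state by writing its name to sysfs.

    state_file is a parameter only so the function can be exercised
    against an ordinary file instead of the real sysfs node.
    """
    path = Path(state_file)
    # Reading the file yields the supported states, e.g. "standby mem disk".
    supported = path.read_text().split()
    if state not in supported:
        raise ValueError(f"{state!r} not supported (kernel offers {supported})")
    # On the real sysfs node this write blocks until the machine resumes.
    path.write_text(state)
```

Calling `request_sleep()` with the defaults needs root and really will suspend the machine, so try it against a scratch file first.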

 

This jumps back into the kernel. First, userspace is stopped. This stops it
getting horribly confused when a load of hardware mysteriously stops working.
Then the kernel goes through the device tree and calls suspend methods on each
bound driver. Individual drivers have responsibility for storing enough state in
order to be able to reprogram the device on resume - ACPI doesn't make
guarantees about what the hardware state is going to be when we come back. Once
the kernel-side suspend code has been run, we execute a couple of ACPI methods -
_PTS (Prepare To Sleep) and _GTS (Going To Sleep). These tend to poke various
things that the kernel knows nothing about, and so a certain amount of magic may
be involved.
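
The driver side of this can be modelled with a toy in Python. This is not kernel code and the names (`Device`, `suspend_tree`, `power_off_tree`, `resume_tree`) are invented; it only illustrates the contract described above: each driver saves its own state, children are suspended before their parents, and nothing is assumed about register contents after power is lost.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Device:  # toy stand-in for a device with a bound driver
    name: str
    registers: Dict[str, int] = field(default_factory=dict)
    children: List["Device"] = field(default_factory=list)
    saved: Optional[Dict[str, int]] = None

    def suspend(self) -> None:
        # The driver must keep enough state to reprogram the device later.
        self.saved = dict(self.registers)

    def resume(self) -> None:
        if self.saved is None:
            raise RuntimeError(f"{self.name}: resume without suspend")
        self.registers = dict(self.saved)
        self.saved = None


def suspend_tree(root: Device, log: List[str]) -> None:
    """Suspend children first so a parent is never quiesced under them."""
    for child in root.children:
        suspend_tree(child, log)
    root.suspend()
    log.append(root.name)


def power_off_tree(root: Device) -> None:
    """Simulate S3: hardware state is lost (modelled here as zeroed)."""
    root.registers = {key: 0 for key in root.registers}
    for child in root.children:
        power_off_tree(child)


def resume_tree(root: Device, log: List[str]) -> None:
    """Resume in the reverse order: parents first, then their children."""
    root.resume()
    log.append(root.name)
    for child in reversed(root.children):
        resume_tree(child, log)
```

A driver whose `suspend` forgets a register is exactly the kind of bug that only shows up after the simulated power loss, which is a fair picture of how these failures look in practice.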

 

At this point, the system should be fairly quiescent. Only two things to do
now. Firstly, the address of the kernel wakeup code is written to the firmware
waking vector, in a table (the FACS) whose address is contained in the FADT.
Secondly, two magic values from the DSDT are written to registers described in
the FADT. This usually causes some sort of system
management trap, which makes sure that the memory is put in self-refresh mode
and actually sequences the machine into suspend. For the S3 power state, this
basically involves shutting the machine (other than the RAM) down completely.
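
For the curious, the "magic values" are the sleep type codes the DSDT supplies for the target state (the \_S3 object for suspend to RAM), and the registers are the PM1 control registers. The Python sketch below only computes the value that would be written, using the bit layout from the ACPI spec (sleep type in bits 10-12, sleep enable in bit 13). `pm1_sleep_value` is a made-up helper, and the sleep type used in the example is illustrative since the real one is machine-specific.

```python
SLP_TYP_SHIFT = 10
SLP_TYP_MASK = 0x7 << SLP_TYP_SHIFT  # bits 10-12 select the sleep type
SLP_EN = 1 << 13                     # setting this bit triggers the transition


def pm1_sleep_value(current: int, slp_typ: int) -> int:
    """Return the 16-bit PM1 control value that requests a sleep state.

    current is the register's present contents; bits outside the sleep
    type field are preserved. slp_typ comes from the DSDT for the target
    state and differs between machines.
    """
    if not 0 <= slp_typ <= 7:
        raise ValueError("sleep type must fit in three bits")
    cleared = current & 0xFFFF & ~SLP_TYP_MASK
    return cleared | (slp_typ << SLP_TYP_SHIFT) | SLP_EN


# Illustrative only: SCI_EN (bit 0) already set, sleep type 5 assumed.
value = pm1_sleep_value(0x0001, 5)
```

There are two values because the hardware may expose two register blocks (PM1a and PM1b), each with its own sleep type code.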

 

Time passes.

 

The user presses the power button. The system switches on, jumps to the BIOS
start address, does a certain amount of setup (programming the memory controller
and so on) and then looks at the ACPI status register. This tells it that the
machine was previously suspended to RAM, so it then jumps to the wakeup address
programmed earlier. This leads it to a bunch of real-mode x86 code provided by
the kernel, which programs the CPU back into protected mode and restores
register state. Suddenly we're running kernel code again.

 

 

From this point onwards, it's much the reverse of the suspend process. We
call the ACPI _WAK method, resume all the drivers and restart userspace. The
shell script suddenly starts running again and cleans up after itself, reloading
any drivers that were unloaded before suspend. As far as userspace is concerned,
the only thing that's happened is that the clock has jumped forward.
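
If a program does want to notice that jump, one approach on reasonably recent Linux kernels is to compare two clocks: CLOCK_MONOTONIC stops while the machine is suspended, whereas CLOCK_BOOTTIME keeps counting. The Python sketch below assumes a kernel and Python version that expose CLOCK_BOOTTIME; the function names are invented for the example.

```python
import time


def suspended_seconds(boottime: float, monotonic: float) -> float:
    """Time spent suspended since boot, given readings of both clocks."""
    # Clamp at zero: the two clocks are read a moment apart, so a tiny
    # negative difference is possible on a machine that never slept.
    return max(0.0, boottime - monotonic)


def read_suspended_seconds() -> float:
    """Sample both clocks now (Linux only; assumes CLOCK_BOOTTIME exists)."""
    return suspended_seconds(
        time.clock_gettime(time.CLOCK_BOOTTIME),
        time.clock_gettime(time.CLOCK_MONOTONIC),
    )
```

Polling `read_suspended_seconds()` and watching for an increase is a crude but serviceable way to detect that a suspend/resume cycle happened behind the program's back.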

 

So why is this difficult?

 

In a lot of cases, it's just down to bugs in the drivers. Restoring hardware
state can be hard, especially if you don't actually have all the documentation
for the hardware to start with - traditionally, many Linux drivers have ended up
depending on the BIOS to have programmed the hardware into a semi-sane state,
and there's no guarantee that that will happen with ACPI. Other cases can just
be oversights - for instance, the bug in the APIC (not to be confused with ACPI)
code that meant a single register wasn't restored, resulting in some machines
resuming without any interrupts being delivered.

 

The single biggest problem is video hardware. The spec doesn't require the
BIOS to reprogram the video hardware at all, and so often it'll come back in an
entirely unprogrammed state. This is an issue, since we (in general) have
absolutely no idea how to bring a video card up from scratch. One of the easiest
workarounds is to execute code from the video BIOS in the same way that the
system BIOS does on machine startup. vbetool lets you do this from userspace,
and it works a surprisingly large amount of the time. However, there's no
guarantee that it'll be successful. Vendors often unmap that section of BIOS
after the system has been brought up, since they've got far more BIOS code than
will fit in the BIOS region of the legacy address space. In the long run, the
only solution is drivers that know how to program an entirely uninitialised
chip. The new modesetting branch of the Intel driver aims to do this, as do the
developers of nouveau.

 

Despite all this misery, ACPI support is generally improving. Most machines
can now suspend and resume once more. The next big challenge is improving
run-time power management in order to get battery life to at least the level it
is under Windows, and ideally beyond that.
