Booting an Intel Architecture System, Part II: Advanced Initialization
Once the processor is running and memory has been initialized, timers and devices must be started and a memory map laid out. Only then can the OS be loaded.
January 17, 2012
In the first part of this article, I covered hardware power-up issues, processor operating-mode selection, early initialization steps, memory initialization, and application processor initialization. This article, the final installment, covers advanced device initialization, memory-map configuration, and all the other steps required to prepare the hardware for loading the operating system.
Advanced Device Initialization
Advanced initialization follows early initialization, as you might expect. This second stage focuses on device-specific initialization. In a UEFI-based BIOS solution, advanced initialization tasks are performed in the Driver Execution Environment (DXE) and Boot Device Selection (BDS) phases. The following devices must be initialized to enable a system. Not all are applicable to all embedded systems, but the list is prescriptive for most. This list applies specifically to SOCs (systems on a chip) based on Intel architecture:
General purpose I/O (GPIO)
Interrupt controller
Timers
Cache initialization (this could also be accomplished during early initialization)
Serial ports, console in/out
Clocking and overclocking
PCI bus initialization
Graphics (optional)
Universal Serial Bus (USB)
Serial Advanced Technology Attachment (SATA)
General-Purpose I/O: GPIOs are key to platform extensibility. GPIOs can be configured for either input or output, but can also be configured to enable native functionality. Depending on weak or strong pull-up or pull-down resistors, some GPIOs can function as strapping pins that are sampled at reset by the chipset and then take on a second function during boot-up and at run-time. GPIOs may also act as sideband signals to allow for system wakes.
SOC devices are designed to be used in a large number of configurations. A device often has more capabilities than it can expose on its I/O pins concurrently, because multiple functions may be multiplexed onto an individual I/O pin. Before the pins are used, they must be configured to implement a specific function or to serve as general-purpose I/O pins. The system firmware developer must work through 64 to 256 GPIOs and their individual options with the board designer of each platform to ensure that this feature is properly enabled.
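One common way to manage that many pins is a per-board table that firmware walks at start-up. The sketch below is only an illustration of that idea: the `pad_cfg` structure, the `pad_func` values, and the sample table are hypothetical and do not correspond to any particular SOC's registers. It shows a sanity check that catches a pin assigned two functions before anything is written to hardware.

```c
#include <stddef.h>

/* Hypothetical pad functions; real SOCs define their own mux encodings. */
enum pad_func { PAD_NATIVE, PAD_GPIO_IN, PAD_GPIO_OUT };

/* Hypothetical per-pin board configuration entry. */
struct pad_cfg {
    unsigned pin;        /* pin number as named on the schematic */
    enum pad_func func;  /* native function or general-purpose I/O */
    int out_level;       /* initial level, used only for PAD_GPIO_OUT */
};

/* Illustrative table; the last entry deliberately reuses pin 1. */
const struct pad_cfg sample_pads[] = {
    { 0, PAD_NATIVE,   0 },
    { 1, PAD_GPIO_IN,  0 },
    { 2, PAD_GPIO_OUT, 1 },
    { 1, PAD_GPIO_OUT, 0 },
};
const size_t sample_pads_count = sizeof sample_pads / sizeof sample_pads[0];

/* Returns the index of the first entry whose pin was already configured
 * by an earlier entry, or -1 if every pin appears at most once. */
int pad_table_first_conflict(const struct pad_cfg *t, size_t n)
{
    for (size_t i = 1; i < n; i++)
        for (size_t j = 0; j < i; j++)
            if (t[i].pin == t[j].pin)
                return (int)i;
    return -1;
}
```

Running a check like this at build time or early in POST turns a board-design mistake into an explicit error instead of a pin that silently takes whichever setting was applied last.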
Interrupt Controllers: The Intel Architecture supports several different methods of handling interrupts. No matter which method is chosen, all interrupt controllers must be initialized at start-up.
When the Programmable Interrupt Controller (PIC) is the only enabled interrupt device, the system is in PIC mode. This is the simplest mode. All APIC components are bypassed and the system operates in single-thread mode using LINT0. The BIOS must set the IRQs per board configuration for all onboard, integrated, and add-in PCI devices.
The PIC contains two cascaded 8259s with fifteen available IRQs. IRQ2 is not available because it is used to connect the 8259s. On mainstream components, there are eight PIRQ pins supported by the PCH, named PIRQ[A#:H#]. These route PCI interrupts to IRQs of the 8259 PIC. PIRQ[A#:D#] routing is controlled by PIRQ routing registers 60h-63h (D31:F0:Reg 60h-63h). PIRQ[E#:H#] routing is controlled by PIRQ routing registers 68h-6Bh (D31:F0:Reg 68h-6Bh). This arrangement is illustrated in Figure 1. The PCH also connects the eight PIRQ[A#:H#] pins to eight individual I/O Advanced Programmable Interrupt Controller input pins, as shown in Table 1.
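The register layout described above can be captured in a small helper. This is a minimal sketch: the function names are invented for illustration, and the value encoding (IRQ number in bits 3:0, bit 7 set meaning "not routed to the 8259") is an assumption based on typical ICH/PCH datasheets that should be checked against the datasheet for the part in use.

```c
#include <stdint.h>

/* Assumed encoding: bit 7 set means the PIRQ is not routed to the 8259. */
#define PIRQ_ROUT_DISABLED 0x80u

/* Returns the D31:F0 configuration-space offset of the routing register
 * for a PIRQ pin, where pirq is 0 for PIRQA# through 7 for PIRQH#.
 * PIRQA#-PIRQD# map to 60h-63h and PIRQE#-PIRQH# map to 68h-6Bh.
 * Returns -1 for an out-of-range pin. */
int pirq_rout_offset(unsigned pirq)
{
    if (pirq > 7)
        return -1;
    return (pirq < 4) ? (int)(0x60 + pirq) : (int)(0x68 + (pirq - 4));
}

/* Builds a routing-register value that routes the pin to the given
 * 8259 IRQ (assumed: IRQ number in bits 3:0, bit 7 clear to enable). */
uint8_t pirq_rout_value(unsigned irq)
{
    return (uint8_t)(irq & 0x0Fu);
}
```

Firmware would write `pirq_rout_value(irq)` to the offset returned by `pirq_rout_offset()` using its own PCI configuration-space accessors, which are not shown here.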
Figure 1: Platform controller hub PIRQ-to-IRQ routing.
Table 1: Platform controller hub PIRQ routing table.
The Local Advanced Programmable Interrupt Controller (LAPIC) is inside the processor. It controls interrupt delivery to the processor. Each LAPIC has its own set of associated registers as well as a Local Vector Table (LVT). The LVT specifies the manner in which the interrupts are delivered to each processor core.
The I/O Advanced Programmable Interrupt Controller (IOxAPIC) is contained in the I/O Controller Hub (ICH) or the I/O Hub (IOH). It expands the number of IRQs available to 24. Each IRQ's entry in the redirection table may be enabled or disabled. The redirection table selects the IDT vector for the associated IRQ. This mode is available only when running in protected mode.
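As a rough illustration of how a redirection-table entry ties an IRQ to an IDT vector, the sketch below follows the classic 82093AA-style register layout, in which each entry occupies two 32-bit indirect registers starting at index 10h. The helper names are invented here, and the remaining fields (delivery mode, destination, polarity, trigger mode) are simply left at zero, which a real platform may need to change.

```c
#include <stdint.h>

#define IOAPIC_REDTBL_BASE 0x10u        /* index of entry 0, low dword */
#define IOAPIC_RTE_MASKED  (1u << 16)   /* mask bit in the low dword */

/* Indirect register index of the low dword of an IRQ's entry. */
uint32_t ioapic_rte_index_lo(unsigned irq)
{
    return IOAPIC_REDTBL_BASE + 2u * irq;
}

/* Indirect register index of the high dword of an IRQ's entry. */
uint32_t ioapic_rte_index_hi(unsigned irq)
{
    return IOAPIC_REDTBL_BASE + 2u * irq + 1u;
}

/* Low dword of an entry: vector in bits 7:0, optional mask bit.
 * All other fields are zero (fixed delivery, physical destination,
 * edge triggered, active high). */
uint32_t ioapic_rte_low(uint8_t vector, int masked)
{
    return (uint32_t)vector | (masked ? IOAPIC_RTE_MASKED : 0u);
}
```

With 24 inputs, the entries span indirect indices 10h through 3Fh; setting the mask bit disables an individual IRQ without losing its vector assignment.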
The boot loader typically does not use Message Signaled Interrupts (MSIs) for interrupt handling.
The Interrupt Vector Table (IVT) is located at physical address 0 (00000h). It contains 256 interrupt vectors. The IVT is used in real mode. Each vector is a 32-bit address consisting of the CS:IP of the interrupt handler.
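The arithmetic behind the IVT is simple enough to show directly. The following sketch, with helper names chosen for illustration, computes where a vector's entry lives, how a CS:IP pair is packed into the 32-bit entry (offset in the low word, segment in the high word), and the linear address that a real-mode segment:offset pair refers to.

```c
#include <stdint.h>

/* Each IVT entry is 4 bytes, so vector n lives at address n * 4. */
uint32_t ivt_entry_address(uint8_t vector)
{
    return (uint32_t)vector * 4u;
}

/* Packs a handler's CS:IP into the 32-bit value stored in the IVT:
 * IP (offset) in the low 16 bits, CS (segment) in the high 16 bits. */
uint32_t ivt_pack(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 16) | offset;
}

/* Real-mode linear address: segment * 16 + offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

For example, the INT 10h entry sits at address 40h, and a handler at F000:FFF0 resolves to linear address FFFF0h.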
The Interrupt Descriptor Table (IDT) contains the exceptions and interrupts in protected mode. There are 256 interrupt vectors, and the exceptions and interrupts are defined in the same locations as in the IVT. Exceptions are routines that handle error conditions such as page faults and general protection faults.
Real-mode Interrupt Service Routines (ISRs) communicate information between the boot loader and the OS. For example, INT 10h is used for video services such as changing video mode and resolution. Some legacy programs and drivers, assuming these real-mode ISRs are available, call INT routines directly.
Timers: A variety of timers can be employed in an Intel Architecture system:
The Programmable Interval Timer (PIT) resides in the IOH or ICH and contains the system timer, also referred to as IRQ0.
The High Precision Event Timer (HPET) resides in the IOH or ICH. It contains three timers. Typically, the boot loader need not initialize the HPET, and the functionality is used only by the OS.
The Real Time Clock (RTC) resides in the IOH or ICH. It keeps the system time, and its values are stored in CMOS. The RTC also contains a timer that can be used by firmware.
The System Management Total Cost of Ownership (TCO) timers reside in the IOH or ICH. They include the Watchdog Timer (WDT), which can be used to detect system hangs and reset the system.
The LAPIC contains a timer that can be used by firmware.
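To give a feel for what programming the system timer involves, here is a minimal sketch of the divisor calculation for an 8254-compatible PIT. It assumes the conventional PC input clock of 1,193,182 Hz, which is not stated in this article; the function name is invented for illustration, and the actual writes to the PIT's I/O ports are omitted.

```c
#include <stdint.h>

/* Conventional input clock of the PC-compatible PIT (assumption). */
#define PIT_INPUT_HZ 1193182u

/* Returns the 16-bit reload value that makes the PIT fire IRQ0 at
 * approximately the requested rate. The result is rounded to the
 * nearest divisor and clamped to 1..65535. A request of 0 Hz yields
 * the slowest rate this helper supports. */
uint16_t pit_divisor(uint32_t hz)
{
    if (hz == 0)
        return 0xFFFF;

    uint32_t d = (PIT_INPUT_HZ + hz / 2) / hz;
    if (d < 1)
        d = 1;
    if (d > 0xFFFF)
        d = 0xFFFF;
    return (uint16_t)d;
}
```

A 1 kHz tick, for instance, needs a divisor of 1193, while any rate below roughly 18.2 Hz cannot be reached with a 16-bit divisor and is clamped.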
Memory Caching Control: Memory regions that must have different caching behaviors will vary from design to design. In the absence of detailed caching requirements for a platform, the following guidelines provide a safe caching environment for typical systems:
Default Cache Rule: Uncached.
00000000-0009FFFF: Write Back.
000A0000-000BFFFF: Write Combined/Uncached
000C0000-000FFFFF: Write Back/Write Protect
00100000-TopOfMemory: Write Back.
Top of Memory Segment (TSEG): Cached on newer processors.
Graphics Memory: Write Combined or Uncached.
Hardware Memory-Mapped I/O: Uncached.
While Memory Type Range Registers (MTRRs) are programmed by the BIOS, Page Attribute Tables (PATs) are used primarily by the OS to control caching down to the page level.
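The caching guidelines above eventually become values written to the MTRRs. The sketch below shows how those values could be built, using the memory-type encodings from the Intel Software Developer's Manual (UC = 0, WC = 1, WT = 4, WP = 5, WB = 6). The helper names are invented here, and the physical address width is passed in as a parameter because it varies by processor; writing the resulting values with WRMSR is left out.

```c
#include <stdint.h>

/* MTRR memory-type encodings. */
enum mtrr_type {
    MTRR_UC = 0,  /* uncached */
    MTRR_WC = 1,  /* write combined */
    MTRR_WT = 4,  /* write through */
    MTRR_WP = 5,  /* write protect */
    MTRR_WB = 6   /* write back */
};

/* A fixed-range MTRR holds eight one-byte type fields. This fills all
 * eight sub-ranges with the same type. */
uint64_t mtrr_fixed_fill(enum mtrr_type type)
{
    return 0x0101010101010101ULL * (uint64_t)type;
}

/* PHYSBASE value for a variable-range MTRR: 4 KB-aligned base address
 * with the memory type in bits 7:0. */
uint64_t mtrr_phys_base(uint64_t base, enum mtrr_type type)
{
    return (base & ~0xFFFULL) | (uint64_t)type;
}

/* PHYSMASK value for a power-of-two sized range: address mask limited
 * to the processor's physical address width, plus the valid bit (11). */
uint64_t mtrr_phys_mask(uint64_t size, unsigned phys_bits)
{
    uint64_t addr_mask = (phys_bits >= 64) ? ~0ULL
                                           : ((1ULL << phys_bits) - 1);
    return (~(size - 1) & addr_mask & ~0xFFFULL) | (1ULL << 11);
}
```

Marking the whole 00000000-0007FFFF fixed range as write back, for example, means writing 0606060606060606h to the corresponding fixed-range register.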
Serial Ports: An RS-232 serial port or 16550 UART is initialized for either run-time or debug use. Unlike USB ports, which require considerable initialization and a large software stack, serial ports have minimal register-level interface requirements. A serial port can be enabled very early in POST to provide serial output support.
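To illustrate how small that register-level interface is, the sketch below computes the baud-rate divisor for a 16550-compatible UART. It assumes the conventional PC clocking that gives a maximum rate of 115200 baud; the function and macro names are chosen for illustration, and the port writes themselves are not shown.

```c
#include <stdint.h>

/* Maximum baud rate with the conventional 1.8432 MHz UART clock
 * (assumption: standard PC-compatible clocking). */
#define UART_MAX_BAUD 115200u

/* Line Control Register values on a 16550. */
#define UART_LCR_8N1  0x03u  /* 8 data bits, no parity, 1 stop bit */
#define UART_LCR_DLAB 0x80u  /* exposes the divisor latch registers */

/* Returns the divisor latch value for a baud rate, or 0 if the rate
 * is zero or faster than the UART clock allows. */
uint16_t uart_divisor(uint32_t baud)
{
    if (baud == 0 || baud > UART_MAX_BAUD)
        return 0;
    return (uint16_t)(UART_MAX_BAUD / baud);
}

/* Low and high bytes, written to the DLL and DLM registers while
 * the DLAB bit is set. */
uint8_t uart_dll(uint16_t divisor) { return (uint8_t)(divisor & 0xFFu); }
uint8_t uart_dlm(uint16_t divisor) { return (uint8_t)(divisor >> 8); }
```

A typical bring-up sets DLAB, writes the two divisor bytes, then writes `UART_LCR_8N1` to clear DLAB and select the frame format, after which characters can be sent by polling the line status.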
Console In/Console Out: During the DXE portion of the UEFI phase, the boot services include console in and console out protocols.
Clock and Overclock Programming: Depending on the clocking solution of the platform, the BIOS may have to enable the clocking of the system. It is possible that a subsystem such as the ME or a server platform's Baseboard Management Controller (BMC) has this responsibility. It is also possible that beyond the basic clock programming, there are expanded configuration options for overclocking, such as:
Based on enumeration, enable or disable clock-output enables.
Adjust clock spread settings. Enable, disable, and adjust amount. Note that settings are provided as fixed register values determined from expected usages.
Under-clock CPU for adaptive clocking support.
Lock out clock registers prior to transitioning to host OS.
PCI Device Enumeration: PCI device enumeration is a generic term that refers to detecting and assigning resources to PCI-compliant devices in the system. The discovery process assigns the resources needed by each device, including the following:
Memory, prefetchable memory, I/O space.
Memory mapped I/O space.
IRQ assignment.
Expansion ROM detection and execution.
PCI device discovery applies to all newer interfaces such as PCIe root ports, USB controllers, SATA controllers, audio controllers, LAN controllers, and various add-in devices. These newer interfaces all comply with the PCI specification.
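Two small calculations sit at the heart of enumeration, and the sketch below shows both. The first builds the address written to I/O port CF8h under the legacy PCI configuration mechanism; the second derives the size of a memory BAR from the value read back after writing all ones to it. The function names are illustrative, only 32-bit memory BARs are handled, and the port accesses themselves are not shown.

```c
#include <stdint.h>

/* Address for the legacy configuration mechanism (port CF8h):
 * enable bit 31, bus in 23:16, device in 15:11, function in 10:8,
 * and a dword-aligned register offset in 7:2. */
uint32_t pci_cfg_address(uint8_t bus, uint8_t dev, uint8_t func,
                         uint8_t reg)
{
    return 0x80000000u
         | ((uint32_t)bus << 16)
         | ((uint32_t)(dev & 0x1Fu) << 11)
         | ((uint32_t)(func & 0x07u) << 8)
         | ((uint32_t)reg & 0xFCu);
}

/* Size in bytes of a 32-bit memory BAR, given the value read back
 * after writing FFFFFFFFh to it. The low four bits hold type and
 * prefetchable flags and are ignored. Returns 0 if the BAR is not
 * implemented. */
uint32_t pci_mem_bar_size(uint32_t readback)
{
    uint32_t mask = readback & ~0xFu;
    return mask ? (~mask + 1u) : 0u;
}
```

For example, the SATA controller at D31:F2 register 90h is reached through configuration address 8000FA90h, and a BAR that reads back FFF00000h is requesting 1 MB of memory space.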
Note that in UEFI-compliant systems, most required drivers are loaded during BDS rather than during the DXE phase.
Graphics Initialization: The video BIOS or Graphics Output Protocol (GOP) UEFI driver is normally the first option ROM to be executed. Once the main console-out is up and running, the console-in line can be configured.
Input Devices: Refer to schematics to determine which I/O devices are in the system. Typically, a system will contain one or more of the following devices:
Embedded Controller (EC): An EC is typically used in mobile or low-power systems. The EC contains separate firmware that controls power-management functions as well as PS/2 keyboard functionality.
Super I/O (SIO): An SIO typically controls the PS/2, serial, and parallel interfaces. Most systems still support some of the legacy interfaces.
Legacy-Free Systems: Legacy-free systems use USB as the input device. If pre-OS keyboard support is required, then the legacy keyboard interfaces must be trapped. Refer to the IOH/ICH BIOS Specification for more details on legacy-free systems.
USB Initialization: The USB controller supports both Enhanced Host Controller Interface (EHCI) and Extensible Host Controller Interface (xHCI) hardware. Enabling the host controller for standard PCI resources is relatively easy. It is possible to delay USB support until the OS drivers take over. If pre-OS support for EHCI or xHCI is required, then the tasks associated with the USB subsystem become substantially more complex. Legacy USB requires an SMI handler to trap accesses to I/O ports 60h and 64h and convert them to the proper keyboard or mouse commands. This pre-OS USB support is required if booting from USB is desired.
SATA Initialization: A SATA controller supports the ATA/IDE programming interface as well as the Advanced Host Controller Interface (AHCI). In the following discussion, the term "ATA-IDE Mode" refers to the ATA/IDE programming interface that uses standard task file I/O registers or PCI IDE Bus Master I/O block registers. The term "AHCI Mode" refers to the AHCI programming interface that uses memory-mapped register and buffer space and a command-list-based model.
The general guidelines for initializing the SATA controller during POST and S3 resume are described in the following sections. Upon resuming from S3, system BIOS is responsible for restoring all the registers that it initialized during POST.
The system BIOS must program the SATA controller mode prior to beginning other initialization steps. The SATA controller mode is set by programming the SATA Mode Select (SMS) field of the port mapping register (D31:F2:Reg 90h[7:6]). The system BIOS must never change the SATA controller mode during run-time. Please note that the availability of the following modes is dependent on which PCH is in use. If system BIOS is enabling AHCI Mode or RAID Mode, system BIOS must disable D31:F5 by setting the SAD2 bit, RCBA + 3418h[25]. The BIOS must ensure that it has not enabled memory space, I/O space, or interrupts for this device prior to disabling the device.
IDE mode is selected by programming the SMS field, D31:F2:Reg 90h[7:6], to 00b. In this mode, the SATA controller is set up to use the ATA/IDE programming interface. The 6/4 SATA ports are controlled by two SATA functions. One function routes up to four SATA ports, D31:F2, and the other routes up to two SATA ports, D31:F5. In IDE mode, the Sub Class Code, D31:F2:Reg 0Ah and D31:F5:Reg 0Ah, is set to 01h. This mode may also be referred to as "compatibility mode," as it does not have any special OS driver requirements.
AHCI mode is selected by programming the SMS field, D31:F2:Reg 90h[7:6], to 01b. In this mode, the SATA controller is set up to use the AHCI programming interface. The six SATA ports are controlled by a single SATA function, D31:F2. In AHCI mode, the Sub Class Code, D31:F2:Reg 0Ah, is set to 06h. This mode does require specific OS driver support.
RAID mode is selected by programming the SMS field, D31:F2:Reg 90h[7:6] to 10b. In this mode, the SATA controller is set up to use the AHCI programming interface. The 6/4 SATA ports are controlled by a single SATA function, D31:F2. In RAID mode, the Sub Class Code, D31:F2:Reg 0Ah, is set to 04h. This mode does require specific OS driver support.
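The three mode descriptions above reduce to a two-bit field update plus an expected Sub Class Code. The sketch below captures just that: the field position and code values come from the text, the names are invented for illustration, and the port mapping register is treated as 16 bits wide, which is an assumption to confirm in the PCH datasheet. Reading and writing D31:F2 configuration space is left to the platform's own accessors.

```c
#include <stdint.h>

/* SATA Mode Select (SMS) encodings for D31:F2:Reg 90h[7:6]. */
enum sata_mode {
    SATA_MODE_IDE  = 0,  /* 00b */
    SATA_MODE_AHCI = 1,  /* 01b */
    SATA_MODE_RAID = 2   /* 10b */
};

#define SATA_MAP_SMS_SHIFT 6
#define SATA_MAP_SMS_MASK  (0x3u << SATA_MAP_SMS_SHIFT)

/* Returns the port mapping register value with the SMS field replaced
 * and every other bit preserved. */
uint16_t sata_map_set_mode(uint16_t map, enum sata_mode mode)
{
    return (uint16_t)((map & ~SATA_MAP_SMS_MASK)
                      | (((unsigned)mode << SATA_MAP_SMS_SHIFT)
                         & SATA_MAP_SMS_MASK));
}

/* Sub Class Code (Reg 0Ah) expected after the mode is selected.
 * Returns FFh for an unknown mode. */
uint8_t sata_expected_subclass(enum sata_mode mode)
{
    switch (mode) {
    case SATA_MODE_IDE:  return 0x01;
    case SATA_MODE_AHCI: return 0x06;
    case SATA_MODE_RAID: return 0x04;
    default:             return 0xFF;
    }
}
```

Comparing the Sub Class Code read back from the controller against `sata_expected_subclass()` is a cheap way to confirm that the mode change took effect before continuing with initialization.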
To allow the RAID option ROM to access all 6/4 SATA ports, the RAID option ROM enables and uses the AHCI programming interface by setting the AE bit, ABAR + 04h[31]. One consequence is that all register settings applicable to AHCI mode set by the BIOS have to be set in RAID as well. The other consequence is that the BIOS is required to provide AHCI support to ATAPI SATA devices, which the RAID option ROM does not handle.
The PCH supports a stable image-compatible ID. When the alternative ID enable bit, D31:F2:Reg 9Ch[7], is not set, the PCH SATA controller reports the Device ID as 2822h.
It has been observed that some SATA drives will not start spin-up until the SATA port is enabled by the controller. In order to reduce drive detection time, and hence the total boot time, system BIOS should enable the SATA port early during POST (for example, immediately after memory initialization) by setting the Port x Enable (PxE) bits of the Port Control and Status register, D31:F2:Reg 92h and D31:F5:Reg 92h, to initiate spin-up.
Defining the Memory Map
In addition to defining the caching behavior of different regions of memory for consumption by the OS, it is also firmware's responsibility to provide a "map" of system memory to the OS so that it knows what regions are available for use.
The most widely used mechanism for a boot loader or an OS to determine the system memory map is to use real mode interrupt service 15h, function E8h, sub-function 20h (INT15/E820), which must be implemented in firmware.
Region Types: There are several general types of memory regions that are described by this interface:
Memory (1): General DRAM available for OS consumption.
Reserved (2): DRAM addresses not for OS consumption.
ACPI Reclaim (3): Memory that contains ACPI tables to which firmware does not require run-time access.
ACPI NVS (4): Memory that contains all ACPI tables to which firmware requires run-time access. See the applicable ACPI specification for details.
ROM (5): Memory that decodes to nonvolatile storage (for example, flash).
IOAPIC (6): Memory that is decoded by IOAPICs in the system (must also be uncached).
LAPIC (7): Memory that is decoded by local APICs in the system (must also be uncached).
Region Locations: The following regions are typically reserved in a system memory map:
00000000-0009FFFF: Memory
000A0000-000FFFFF: Reserved
00100000-xxxxxxxx: Memory (The xxxxxxxx indicates that the top of memory changes based on "reserved" items listed below and any other design-based reserved regions.)
TSEG: Reserved
Graphics Stolen Memory: Reserved
FEC00000-FEC01000*: IOAPIC
FEE00000-FEE01000*: LAPIC
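To make the shape of such a map concrete, the sketch below models it as an array of base, length, and type entries using the region types listed earlier. Everything specific in it is illustrative: the top of usable memory (7F000000h), the single 16 MB reserved block standing in for TSEG and graphics stolen memory, and all of the names are assumptions made for the example. The structure is an in-memory model and is not packed to match the 20-byte entries returned by INT15/E820.

```c
#include <stddef.h>
#include <stdint.h>

/* Region types as numbered in this article. */
enum {
    E820_MEMORY       = 1,
    E820_RESERVED     = 2,
    E820_ACPI_RECLAIM = 3,
    E820_ACPI_NVS     = 4,
    E820_ROM          = 5,
    E820_IOAPIC       = 6,
    E820_LAPIC        = 7
};

struct e820_entry {
    uint64_t base;    /* first byte of the region */
    uint64_t length;  /* size of the region in bytes */
    uint32_t type;    /* one of the E820_* values above */
};

/* Hypothetical map for a system with just under 2 GB of usable DRAM. */
const struct e820_entry sample_map[] = {
    { 0x00000000ULL, 0x000A0000ULL, E820_MEMORY   },
    { 0x000A0000ULL, 0x00060000ULL, E820_RESERVED },
    { 0x00100000ULL, 0x7EF00000ULL, E820_MEMORY   },
    { 0x7F000000ULL, 0x01000000ULL, E820_RESERVED }, /* TSEG, graphics */
    { 0xFEC00000ULL, 0x00001000ULL, E820_IOAPIC   },
    { 0xFEE00000ULL, 0x00001000ULL, E820_LAPIC    },
};
const size_t sample_map_count = sizeof sample_map / sizeof sample_map[0];

/* Sums the bytes the OS may use as general-purpose memory. */
uint64_t e820_usable_bytes(const struct e820_entry *map, size_t count)
{
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++)
        if (map[i].type == E820_MEMORY)
            total += map[i].length;
    return total;
}
```

An OS loader walks a table like this one entry at a time, treating only type 1 regions as available and leaving everything else alone.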
Loading the Operating System
Following configuration of the memory map, a boot device is selected from a prioritized list of potential bootable partitions. The "Load Image" command, or Int 19h, is used to call the OS loader, which in turn loads the OS. And off we go.
This article is adapted from material in Intel Technology Journal (March 2011) "UEFI Today: Bootstrapping the Continuum," and portions of it are copyright Intel Corp.
White Paper: Minimum Steps Necessary to Boot an Intel Architecture Platform
Pete Dice is a software architect in Intel's chip-set architecture group.
Related Article
Booting an Intel Architecture System, Part I: Early Initialization