Linux SMP Boot Process: Study Notes

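1. SMP hardware architecture:

In its simplest form, SMP means the system contains several identical CPUs that share a bus, while each CPU has its own registers. Because the bus is shared, memory and peripheral devices are shared as well. Under Linux, all CPUs use the same kernel-space mappings and are completely equal peers.

With several CPUs in the system, a question arises: when a peripheral device raises an interrupt, which CPU handles it?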

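To answer this, Intel introduced the I/O APIC and local APIC architecture.

The I/O APIC is connected to the peripheral devices and can be programmed with a delivery mode. Based on the configured delivery mode, an interrupt signal is sent to the local APIC of the corresponding CPU.

The local APIC handles interrupts for its own CPU. Besides accepting interrupts from the I/O APIC, it also handles interrupt sources that are local to that CPU. The local APIC also provides a timer.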

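How is the boot CPU chosen?

According to Intel's documentation, after power-on the MP Initialization Protocol selects one CPU as the bootstrap processor (BSP). The selection is made dynamically by the hardware, so software should not assume that a particular CPU will be chosen. Only the BSP runs the BIOS code. All other CPUs, the application processors (APs), enter a wait state and start running only after the BSP sends them an IPI. For the details of the MP Initialization Protocol, see Chapter 8 of the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, Part 1.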

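How does the boot CPU make the other CPUs start running?

The BSP uses IPI messages to make an AP begin executing at a specified start address. This capability is provided by the local APIC integrated in each CPU: writing the relevant local APIC registers sends an IPI message to the chosen CPU.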

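How does the kernel obtain the CPU hardware information?

After system initialization, the firmware places information about the CPUs, buses, I/O APICs and so on at a defined location in memory: the SMP MP table. During initialization, Linux reads that location to obtain the hardware information.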

2. Overview of the Linux SMP boot flow

setup_arch()
    setup_memory();
        reserve_bootmem(PAGE_SIZE, PAGE_SIZE);
        find_smp_config(); // locate the SMP MP table
    smp_alloc_memory();
        trampoline_base = (void *) alloc_bootmem_low_pages(PAGE_SIZE); // allocate the trampoline page that holds the AP bootstrap code
    get_smp_config(); // read the hardware details from the SMP MP table

trap_init()
    init_apic_mappings();

mem_init()
    zap_low_mappings(); // only when SMP is not configured: clear the user-space address mappings

rest_init();
    kernel_thread(init, NULL, CLONE_FS | CLONE_SIGHAND);
        init();
            set_cpus_allowed(current, CPU_MASK_ALL);
            smp_prepare_cpus(max_cpus);
                smp_boot_cpus(max_cpus);
                    connect_bsp_APIC();
                    setup_local_APIC(); // initialize the BSP's local APIC
                    map_cpu_to_logical_apicid();
                    do_boot_cpu(apicid, cpu) // called once for each CPU
            smp_init(); // every CPU starts scheduling

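trampoline.S: the AP bootstrap code. It is 16-bit real-mode code that switches on protected mode.

head.S: sets up paging for the AP.

initialize_secondary: using the state that was set up when the idle task was forked, jumps to start_secondary.

start_secondary: checks whether the BSP has released the APs; if so, the AP starts scheduling tasks.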

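3. Code walkthrough summary

find_smp_config(): locates the MP table in memory. For the format, see Chapter 4 of the MP specification.

This table describes the system hardware: CPUs, buses, I/O APICs and so on.

Two related global variables: smp_found_config records whether an SMP MP table was found, and mpf_found holds the linear address of the SMP MP table.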
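
The search described above can be sketched in user-space C. This is only an illustration of the idea, not the kernel's code: the MP specification defines a 16-byte "floating pointer" structure that starts with the signature "_MP_", sits on a 16-byte boundary, and whose bytes sum to zero. The names mp_floating, mp_checksum and mp_scan are hypothetical, and the buffer passed in stands in for one of the physical memory regions the firmware may use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 16-byte MP floating pointer structure (MP specification, chapter 4). */
struct mp_floating {
    char     signature[4];  /* "_MP_" */
    uint32_t physptr;       /* physical address of the MP configuration table */
    uint8_t  length;        /* structure length in 16-byte units, i.e. 1 */
    uint8_t  spec_rev;      /* MP specification revision */
    uint8_t  checksum;      /* chosen so that all 16 bytes sum to zero */
    uint8_t  feature[5];    /* MP feature information bytes */
};

/* Sum of all bytes modulo 256; a valid structure sums to 0. */
uint8_t mp_checksum(const uint8_t *p, size_t len)
{
    uint8_t sum = 0;
    while (len--)
        sum += *p++;
    return sum;
}

/*
 * Scan [base, base + len) on 16-byte boundaries for a valid structure.
 * Returns the byte offset of the first match, or -1 if there is none.
 * Hypothetical helper for illustration only.
 */
long mp_scan(const uint8_t *base, size_t len)
{
    struct mp_floating mpf;

    assert(base != NULL);
    for (size_t off = 0; off + sizeof mpf <= len; off += 16) {
        memcpy(&mpf, base + off, sizeof mpf);   /* avoids unaligned access */
        if (memcmp(mpf.signature, "_MP_", 4) == 0 &&
            mpf.length == 1 &&
            mp_checksum(base + off, sizeof mpf) == 0)
            return (long)off;
    }
    return -1;
}
```

Checking the length field and the checksum in addition to the signature keeps a stray "_MP_" byte sequence in memory from being mistaken for the real structure.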

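smp_alloc_memory(): allocates memory for the bootstrap code that starts the APs. Related global variable: trampoline_base, the linear address of the allocated start area.

get_smp_config(): reads the hardware information from the contents of the MP table.

init_apic_mappings(): sets up the mapped addresses of the I/O APIC and the local APIC.

zap_low_mappings(): if SMP is not configured, clears the user-space address mappings by zeroing the corresponding entries in swapper_pg_dir.

setup_local_APIC(): initializes the BSP's local APIC.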

do_boot_cpu(apicid, cpu)

idle = alloc_idle_task(cpu);

task = copy_process(CLONE_VM, 0, idle_regs(&regs), 0, NULL, NULL, 0);

init_idle(task, cpu);

The init process is duplicated with copy_process(), and init_idle() is then called to set the CPU on which the new task may run.

idle->thread.eip = (unsigned long) start_secondary;

thread.eip in the task_struct is modified so that the AP runs start_secondary() as soon as its initialization is complete.

start_eip = setup_trampoline();

setup_trampoline() copies the code between trampoline_data and trampoline_end to trampoline_base, which is the memory allocated earlier in setup_arch(). The return value stored in start_eip is the physical address that corresponds to trampoline_base.
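
The difference between the linear address held in trampoline_base and the physical address returned in start_eip can be shown with a small user-space sketch. It assumes the default i386 3G/1G split, where directly mapped kernel memory lives at physical address plus 0xC0000000; that constant and the *_sketch names are assumptions for illustration, not taken from the notes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed default i386 kernel offset; not stated in the notes. */
#define PAGE_OFFSET_SKETCH 0xC0000000u
#define PAGE_SIZE_SKETCH   4096u

/* Kernel linear address -> physical address for directly mapped memory. */
uint32_t virt_to_phys_sketch(uint32_t vaddr)
{
    return vaddr - PAGE_OFFSET_SKETCH;
}

/*
 * Copy the trampoline image into the reserved page and return the physical
 * address the AP should start at. `page` stands in for the memory behind
 * trampoline_base and `base_vaddr` for its kernel linear address.
 */
uint32_t setup_trampoline_sketch(uint8_t *page, uint32_t base_vaddr,
                                 const uint8_t *image, size_t image_len)
{
    assert(image_len <= PAGE_SIZE_SKETCH);  /* only one page was allocated */
    memcpy(page, image, image_len);
    return virt_to_phys_sketch(base_vaddr);
}
```

The AP starts in real mode with paging off, so it can only be given a physical address; this is also why the trampoline later jumps to startup_32_smp-__PAGE_OFFSET rather than to the linked address.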

smpboot_setup_warm_reset_vector(start_eip); stores start_eip as the start address in the warm-reset vector at 40:67h.
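
The location 40:67h is a real-mode segment:offset pair, i.e. physical address 0x467, and the vector stored there is itself a real-mode far pointer. A minimal sketch of how a physical start address below 1 MiB can be split into that form follows; the struct and function names are hypothetical, and the split shown (offset = low 4 bits, segment = address >> 4) is one valid encoding, the one I believe the kernel uses.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode far pointer: offset word first, then segment word. */
struct far_ptr {
    uint16_t offset;   /* would be stored at physical 0x467 */
    uint16_t segment;  /* would be stored at physical 0x469 */
};

/* Split a physical address below 1 MiB into segment:offset. */
struct far_ptr warm_reset_vector(uint32_t start_eip)
{
    struct far_ptr v;

    assert(start_eip < 0x100000u);          /* must be reachable in real mode */
    v.segment = (uint16_t)(start_eip >> 4);
    v.offset  = (uint16_t)(start_eip & 0xfu);
    return v;
}

/* Address a CPU in real mode computes from a segment:offset pair. */
uint32_t far_ptr_to_phys(struct far_ptr v)
{
    return ((uint32_t)v.segment << 4) + v.offset;
}
```

For a page-aligned trampoline such as 0x9F000 the offset is 0 and the segment is 0x9F00, so the AP begins with %cs pointing at the start of the trampoline page.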

wakeup_secondary_cpu(apicid, start_eip); in this function the BSP writes the APIC_ICR register to send IPI messages to the target AP, which makes that AP start executing at start_eip in real mode.
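
The values written to the interrupt command register can be sketched as plain bit arithmetic. The bit positions below follow the local APIC ICR layout (delivery mode in bits 8 to 10, level in bit 14, trigger mode in bit 15, destination in bits 24 to 31 of the high word). The function names are hypothetical and nothing here touches real hardware; it only shows how the start address ends up in the 8-bit vector field of a STARTUP IPI.

```c
#include <assert.h>
#include <stdint.h>

#define ICR_DM_INIT       0x00500u  /* delivery mode: INIT */
#define ICR_DM_STARTUP    0x00600u  /* delivery mode: STARTUP (SIPI) */
#define ICR_INT_ASSERT    0x04000u  /* level: assert */
#define ICR_INT_LEVELTRIG 0x08000u  /* trigger mode: level */

/* High ICR word: destination APIC ID in bits 24..31. */
uint32_t icr_high(uint8_t apicid)
{
    return (uint32_t)apicid << 24;
}

/* Low ICR word for the asserting INIT IPI. */
uint32_t icr_init_assert(void)
{
    return ICR_INT_LEVELTRIG | ICR_INT_ASSERT | ICR_DM_INIT;
}

/* A SIPI can only name a 4 KiB-aligned address below 1 MiB. */
int sipi_addr_ok(uint32_t start_eip)
{
    return (start_eip & 0xfffu) == 0 && start_eip < 0x100000u;
}

/* Low ICR word for a STARTUP IPI: the vector is the page number. */
uint32_t icr_startup(uint32_t start_eip)
{
    assert(sipi_addr_ok(start_eip));
    return ICR_DM_STARTUP | (start_eip >> 12);
}

/* Real-mode address at which the AP begins for a given STARTUP ICR value. */
uint32_t sipi_entry(uint32_t icr_low)
{
    return (icr_low & 0xffu) << 12;
}
```

Because the vector field is only 8 bits wide, the trampoline has to live in a page-aligned location in low memory, which matches the use of alloc_bootmem_low_pages(PAGE_SIZE) earlier.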

trampoline.S

ENTRY(trampoline_data)

r_base = .

wbinvd # Needed for NUMA-Q should be harmless for others

mov %cs, %ax # Code and data in the same place

mov %ax, %ds

cli # We should be safe anyway

movl $0xA5A5A5A5, trampoline_data - r_base

This writes a marker so that the BSP can tell the AP has reached this point.

lidtl boot_idt - r_base # load idt with 0, 0

lgdtl boot_gdt - r_base # load gdt with whatever is appropriate

Load the IDT and the GDT.

xor %ax, %ax

inc %ax # protected mode (PE) bit

lmsw %ax # into protected mode

# flush prefetch and jump to startup_32_smp in arch/i386/kernel/head.S

ljmpl $__BOOT_CS, $(startup_32_smp-__PAGE_OFFSET)

Switch on protected mode and jump to startup_32_smp.

# These need to be in the same 64K segment as the above;

# hence we don't use the boot_gdt_descr defined in head.S

boot_gdt:

.word __BOOT_DS + 7 # gdt limit

.long boot_gdt_table-__PAGE_OFFSET # gdt base

boot_idt:

.word 0 # idt limit = 0

.long 0 # idt base = 0L

.globl trampoline_end

trampoline_end:

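This code writes a marker so that the BSP knows the AP has reached it, and loads the base addresses of the GDT and IDT.

It then switches on protected mode and jumps to startup_32_smp.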
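
The GDT limit of __BOOT_DS + 7 in boot_gdt follows from how selectors work: a selector with its low three bits clear is the byte offset of an 8-byte descriptor, so adding 7 gives the last valid byte of that descriptor. The sketch below works this out with the entry numbers I recall from 2.6-era i386 headers (boot code segment in entry 2, boot data segment in entry 3); treat those two numbers as assumptions rather than as something these notes state.

```c
#include <assert.h>

/* Assumed 2.6-era i386 values; verify against your kernel's segment.h. */
#define GDT_ENTRY_BOOT_CS 2
#define GDT_ENTRY_BOOT_DS (GDT_ENTRY_BOOT_CS + 1)
#define BOOT_CS (GDT_ENTRY_BOOT_CS * 8)   /* selector = index * 8 */
#define BOOT_DS (GDT_ENTRY_BOOT_DS * 8)

/* The data descriptor must directly follow the code descriptor. */
static_assert(BOOT_DS == BOOT_CS + 8, "boot descriptors must be adjacent");

/* GDT limit: offset of the last valid byte, as in ".word __BOOT_DS + 7". */
unsigned boot_gdt_limit(void)
{
    return BOOT_DS + 7;
}

/* Number of whole 8-byte descriptors covered by a given limit. */
unsigned gdt_entries_covered(unsigned limit)
{
    return (limit + 1) / 8;
}
```

Under these assumptions the limit is 31, which covers descriptors 0 through 3 and therefore just reaches the boot data segment used right after the jump to protected mode.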

Excerpt from head.S:

ENTRY(startup_32_smp)

cld

movl $(__BOOT_DS),%eax

movl %eax,%ds

movl %eax,%es

movl %eax,%fs

movl %eax,%gs

xorl %ebx,%ebx

incl %ebx

For an AP, %ebx is set to 1.

movl $swapper_pg_dir-__PAGE_OFFSET,%eax

movl %eax,%cr3 /* set the page table pointer.. */

movl %cr0,%eax

orl $0x80000000,%eax

movl %eax,%cr0 /* ..and set paging (PG) bit */

ljmp $__BOOT_CS,$1f /* Clear prefetch and normalize %eip */

Enable paging.

lss stack_start,%esp

Point %esp at the kernel stack of the task created by fork, in preparation for the later jump to start_secondary.

#ifdef CONFIG_SMP

movb ready, %cl

movb $1, ready

cmpb $0,%cl

je 1f # the first CPU calls start_kernel

# all other CPUs call initialize_secondary

call initialize_secondary

jmp L6

1:

#endif /* CONFIG_SMP */

call start_kernel

When an AP is booting, initialize_secondary() is called.
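
The branch on the ready byte can be restated in C to make the logic easier to follow. This is only a model of the three instructions movb ready, %cl; movb $1, ready; cmpb $0, %cl; the enum and function names are hypothetical.

```c
#include <assert.h>

/* Mirrors the `ready` byte in head.S: 0 until the first CPU passes through. */
unsigned char ready;

enum boot_path {
    PATH_START_KERNEL,          /* the first CPU (the BSP) */
    PATH_INITIALIZE_SECONDARY   /* every later CPU (an AP) */
};

/* Decide which path the CPU executing this code takes. */
enum boot_path startup_32_path(void)
{
    unsigned char cl = ready;   /* movb ready, %cl */

    assert(cl == 0 || cl == 1); /* the byte only ever holds 0 or 1 */
    ready = 1;                  /* movb $1, ready */
    return cl == 0 ? PATH_START_KERNEL : PATH_INITIALIZE_SECONDARY;
}
```

The read and the write are not atomic. In this sketch, as in the flow described in these notes, that is acceptable because the BSP passes through long before it wakes any AP.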

void __devinit initialize_secondary(void)
{
	/*
	 * We don't actually need to load the full TSS,
	 * basically just the stack pointer and the eip.
	 */
	asm volatile(
		"movl %0,%%esp\n\t"
		"jmp *%1"
		:
		:"r" (current->thread.esp),"r" (current->thread.eip));
}

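The stack is set to the stack created at fork time and the instruction pointer to the value saved at fork time, so execution jumps to start_secondary.

start_secondary() proceeds as follows: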

while (!cpu_isset(smp_processor_id(), smp_commenced_mask))

rep_nop();

The loop tests smp_commenced_mask to see whether this AP has been released to run. smp_commenced_mask is set in smp_init().

cpu_idle();

Once released, the AP calls cpu_idle() and begins scheduling tasks.
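
The handshake through smp_commenced_mask amounts to one bit per CPU that the BSP sets and the AP polls. A single-threaded sketch of that idea follows; the names are hypothetical, the mask is a plain unsigned long rather than the kernel's cpumask type, and the spin is bounded so that the example always terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* One bit per CPU; set by the releasing side, polled by the waiting side. */
volatile unsigned long commenced_mask_sketch;

static bool cpu_in_range(int cpu)
{
    return cpu >= 0 && cpu < (int)(sizeof(unsigned long) * 8);
}

/* BSP side: allow the given CPU to proceed. */
void commence_cpu(int cpu)
{
    assert(cpu_in_range(cpu));
    commenced_mask_sketch |= 1UL << cpu;
}

/* Equivalent of cpu_isset(cpu, smp_commenced_mask). */
bool cpu_commenced(int cpu)
{
    assert(cpu_in_range(cpu));
    return (commenced_mask_sketch >> cpu) & 1UL;
}

/*
 * AP side: spin until released. Returns false if max_spins polls pass
 * without the bit being set (the real loop has no such bound).
 */
bool wait_for_commence(int cpu, unsigned long max_spins)
{
    while (!cpu_commenced(cpu)) {
        if (max_spins-- == 0)
            return false;
    }
    return true;
}
```

In the real loop the body is rep_nop(), which tells the CPU it is in a spin-wait, and the bit is set from another processor, so the wait eventually ends without any bound.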
