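In the earlier first look at protected mode, Linux initially entered protected mode with a purely segmented memory mapping, which provided little real protection. That is clearly not the final form of Linux memory management. Linux does not rely on segment-level protection; it uses page-level protection, so it still needs to enable paging.

Paging is simple in principle: the page global directory locates a page table, and the page table locates a page. However, Intel supports several page sizes (4 KB, 2 MB, and 4 MB), and each size is mapped differently, which is where the complexity comes from. The complicated cases are left for a later analysis.

To keep things simple, the analysis below assumes 32-bit x86 without PAE. As covered earlier, in pure segmentation mode an address is mapped by using the segment selector together with GDTR/LDTR to find the segment descriptor, and adding the offset to the base address in that descriptor gives the physical address. Once paging is enabled, the address produced by segmentation is no longer a physical address but a linear address, which must go through the page mapping to reach physical memory. So in segment-plus-page mapping, paging is built on top of segmentation; in other words, there is no pure paging mode.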

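The Intel manual gives an overview figure of segmentation combined with paging:

> [Figure: segmentation and paging overview, Intel manual]

The segmentation part needs no further comment, so look next at what the paging part does:

> [Figure: linear-address translation with 4 KB pages, Intel manual]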

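This is paging on 32-bit x86 with 4 KB pages. The 32-bit linear address is split into three fields of 10, 10, and 12 bits: the page global directory index, the page table index, and the offset within the physical page. The CR3 register locates the page global directory; the directory index selects a directory entry; the page table base recorded in that entry, combined with the page table index, selects a page table entry; and the page base recorded in that entry plus the offset gives the physical address. Now work out how much space the page table entries need in order to map 4 GB on a 32-bit (non-PAE) system:

> (4G / 4k) * 4 = (0x100000000 / 0x1000) * 4 = 0x400000

Dividing the 4 GB space by the 4 KB page size (the range covered by the 12-bit offset) gives the number of page table entries; multiplying by 4 bytes per entry shows that a complete mapping needs 4 MB of page tables.

The space needed for the page global directory follows in the same way:

> (0x400000 / 4k) * 4 = 0x1000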

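That is exactly one 4 KB page. Mapping 4 GB of memory therefore gives the following table:

| Page global directory space | Page table space | Physical memory space |
| --- | --- | --- |
| 0x1000 | 0x400000 | 0x100000000 |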
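
As a rough check of the numbers above, the following Python sketch recomputes the table and splits a linear address into its three fields. It assumes 4 KB pages, 4-byte entries, and the 10/10/12 split; the function names and the sample address 0xC0101234 are made up for illustration and do not come from the kernel.

```python
PAGE_SIZE = 0x1000        # 4 KB page, i.e. a 12-bit offset
ENTRY_SIZE = 4            # each directory / page table entry is 4 bytes
ENTRIES_PER_TABLE = 1024  # 10 index bits give 1024 entries per table


def split_linear_address(addr):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    pgd_index = (addr >> 22) & 0x3FF   # top 10 bits
    pte_index = (addr >> 12) & 0x3FF   # middle 10 bits
    offset = addr & 0xFFF              # low 12 bits
    return pgd_index, pte_index, offset


def paging_overhead(mapped_bytes):
    """Return (page table bytes, page directory bytes) needed to map mapped_bytes."""
    pte_count = mapped_bytes // PAGE_SIZE
    page_table_bytes = pte_count * ENTRY_SIZE
    page_table_count = pte_count // ENTRIES_PER_TABLE
    page_directory_bytes = page_table_count * ENTRY_SIZE
    return page_table_bytes, page_directory_bytes


tables, directory = paging_overhead(0x100000000)
print(hex(tables), hex(directory))        # 0x400000 0x1000
print(split_linear_address(0xC0101234))   # (768, 257, 564)
```

The second call shows that an address just above 0xC0000000 falls in directory entry 768, a number that comes up again when the kernel's directory entries are filled in.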



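Now consider what enabling page mapping should require. Presumably:

> 1. Establish the relationships among the page global directory, the page tables, and physical memory;
>
> 2. Set the CR3 register.

Now look at the implementation. The main code that enables page mapping is in /arch/x86/kernel/head_32.s. The file is a little over 700 lines, and even less once comments are removed, but it does a lot: it sets up the stack, builds the page mapping directory, enables paging, initializes the interrupt descriptor table, handles x86 virtual machine startup, initializes process 0, and so on. The focus here is how the Linux kernel enables paging, so take a shortcut: according to the Intel manual, paging is switched on by the PG bit of the CR0 register, so search for the places where CR0 is set.

A quick search finds three places in head_32.s that set CR0.

Here is the first (line 317):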



    【file:/arch/x86/kernel/head_32.s】
    default_entry:
    #define CR0_STATE (X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
                 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
                 X86_CR0_PG)
        movl $(CR0_STATE & ~X86_CR0_PG),%eax
        movl %eax,%cr0

    /*
     * We want to start out with EFLAGS unambiguously cleared. Some BIOSes leave
     * bits like NT set. This would confuse the debugger if this code is traced. So
     * initialize them properly now before switching to protected mode. That means
     * DF in particular (even though we have cleared it earlier after copying the
     * command line) because GCC expects it.
     */
        pushl $0
        popfl

CR0 is loaded from eax, and eax holds $(CR0_STATE & ~X86_CR0_PG). The PG bit is masked out before the value is written to CR0, so PG is not set and this is not where paging is enabled. What does this code do, then, and is it a key step toward paging? From the code, CR0_STATE is 0x80050033; with the PG bit removed it is 0x00050033, or 00000000000001010000000000110011 in binary. Compare that with the CR0 bit layout in the manual:

> ![](http://blog.chinaunix.net/attachment/201409/5/26859697_1409846935yM9r.png)

The bits that are set are AM, WP, NE, ET, MP, and PE. PE was already enabled when segmentation was turned on. A rough summary from the manual:

> AM: alignment mask;
>
> WP: write protect;
>
> NE: numeric error;
>
> ET: extension type, used for math coprocessor instructions;
>
> MP: monitor coprocessor.

None of these appears to have anything to do with paging.
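
The flag arithmetic above can be reproduced with a small Python sketch. The bit positions are the ones documented for CR0 in the Intel manual; the helper name `decode_cr0` is invented for this example and is not a kernel function.

```python
# Bit positions of the CR0 flags used by CR0_STATE.
CR0_FLAGS = {
    "PE": 1 << 0,    # protection enable
    "MP": 1 << 1,    # monitor coprocessor
    "ET": 1 << 4,    # extension type
    "NE": 1 << 5,    # numeric error
    "WP": 1 << 16,   # write protect
    "AM": 1 << 18,   # alignment mask
    "PG": 1 << 31,   # paging
}


def decode_cr0(value):
    """Return the names of the known flags set in value, lowest bit first."""
    return [name for name, bit in CR0_FLAGS.items() if value & bit]


CR0_STATE = sum(CR0_FLAGS.values())          # all seven flags set
without_pg = CR0_STATE & ~CR0_FLAGS["PG"]    # the value the first write loads

print(hex(CR0_STATE))          # 0x80050033
print(hex(without_pg))         # 0x50033
print(decode_cr0(without_pg))  # ['PE', 'MP', 'ET', 'NE', 'WP', 'AM']
```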

Now look at the second place where CR0 is set:

<div class="codeText" id="codeText" style="background: rgb(255, 255, 255); font: 12px/normal Consolas, monospace; margin: 0px 0px 1.1em; padding: 0px; border: 1px solid rgb(221, 221, 221); border-image: none; width: 1252.19px; letter-spacing: 0.1px; overflow: auto; -ms-word-break: break-all; -ms-word-wrap: break-word; font-size-adjust: none; font-stretch: normal;">

    【file:/arch/x86/kernel/head_32.s】
    enable_paging:

    /*
     * Enable paging
     */
        movl $pa(initial_page_table), %eax
        movl %eax,%cr3 /* set the page table pointer.. */
        movl $CR0_STATE,%eax
        movl %eax,%cr0 /* ..and set paging (PG) bit */
        ljmp $__BOOT_CS,$1f /* Clear prefetch and normalize %eip */

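Good: this is clearly where paging is enabled. But the step that ties the page global directory and the page tables to physical memory seems to have been skipped, so work backwards. In the snippet above, CR3 is set just before CR0, and CR3 holds the address of the page global directory, so initial_page_table must be the page global directory. Without further detours, here is how the page global directory and the page tables are built: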

    【file:/arch/x86/kernel/head_32.s】
    page_pde_offset = (__PAGE_OFFSET >> 20);

        movl $pa(__brk_base), %edi
        movl $pa(initial_page_table), %edx
        movl $PTE_IDENT_ATTR, %eax
    10:
        leal PDE_IDENT_ATTR(%edi),%ecx /* Create PDE entry */
        movl %ecx,(%edx) /* Store identity PDE entry */
        movl %ecx,page_pde_offset(%edx) /* Store kernel PDE entry */
        addl $4,%edx
        movl $1024, %ecx
    11:
        stosl
        addl $0x1000,%eax
        loop 11b
        /*
         * End condition: we must map up to the end + MAPPING_BEYOND_END.
         */
        movl $pa(_end) + MAPPING_BEYOND_END + PTE_IDENT_ATTR, %ebp
        cmpl %ebp,%eax
        jb 10b
        addl $__PAGE_OFFSET, %edi
        movl %edi, pa(_brk_end)
        shrl $12, %eax
        movl %eax, pa(max_pfn_mapped)

        /* Do early initialization of the fixmap area */
        movl $pa(initial_pg_fixmap)+PDE_IDENT_ATTR,%eax
        movl %eax,pa(initial_page_table+0xffc)

First comes:

    movl $pa(__brk_base), %edi
    movl $pa(initial_page_table), %edx
    movl $PTE_IDENT_ATTR, %eax

This initializes edi, edx, and eax. Next:

    leal PDE_IDENT_ATTR(%edi),%ecx      /* Create PDE entry */
    movl %ecx,(%edx)                    /* Store identity PDE entry */

Together with these definitions:

    【file:/arch/x86/include/asm/pgtable_types.h】
    #define PTE_IDENT_ATTR 0x003 /* PRESENT+RW */
    #define PDE_IDENT_ATTR 0x067 /* PRESENT+RW+USER+DIRTY+ACCESSED */
    #define PGD_IDENT_ATTR 0x001 /* PRESENT (no other attributes) */
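
To see what these attribute constants encode, here is a minimal Python sketch that decodes the low bits of an entry. The bit names follow the comments in the definitions above; `decode_attrs`, `make_entry`, and the sample base address 0x01234000 are hypothetical and exist only for this illustration.

```python
# Low attribute bits of a 32-bit directory or page table entry (4 KB pages, non-PAE).
ENTRY_BITS = {
    "PRESENT": 0x001,
    "RW": 0x002,
    "USER": 0x004,
    "ACCESSED": 0x020,
    "DIRTY": 0x040,
}

PTE_IDENT_ATTR = 0x003
PDE_IDENT_ATTR = 0x067


def decode_attrs(entry):
    """Return the names of the attribute bits set in an entry, lowest bit first."""
    return [name for name, bit in ENTRY_BITS.items() if entry & bit]


def make_entry(base, attrs):
    """Combine a 4 KB-aligned physical base address with attribute bits."""
    if base & 0xFFF:
        raise ValueError("base address must be page aligned")
    return base | attrs


print(decode_attrs(PTE_IDENT_ATTR))   # ['PRESENT', 'RW']
print(decode_attrs(PDE_IDENT_ATTR))   # ['PRESENT', 'RW', 'USER', 'ACCESSED', 'DIRTY']
print(hex(make_entry(0x01234000, PDE_IDENT_ATTR)))   # 0x1234067 (hypothetical base)
```

Because the base address is page aligned, adding the attribute value (as `leal` does) and OR-ing it in give the same result.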

The `leal` computes pa(__brk_base) + PDE_IDENT_ATTR, that is __brk_base + 0x067, and the result is stored into initial_page_table as a page global directory entry. Why add the attribute value? Look at the format of a page global directory entry:

> [Figure: page directory entry format, Intel manual]

The low 12 bits of an entry hold flags rather than address bits. The lowest two bits, binary 11, are the R/W bit and the P bit, which mark the entry as writable and present; 0x067 additionally sets the user, accessed, and dirty bits, as the comment on PDE_IDENT_ATTR says. PTE_IDENT_ATTR (0x003) sets only those two low bits and is used for the page table entries. __brk_base itself is the page table. Continuing:

    movl %ecx,page_pde_offset(%edx)     /* Store kernel PDE entry */

This stores the same page global directory entry at offset page_pde_offset from edx, that is, at __PAGE_OFFSET >> 20. __PAGE_OFFSET is defined as:

    【file:/arch/x86/include/asm/page_32_types.h】
    /*
     * This handles the memory map.
     *
     * A __PAGE_OFFSET of 0xC0000000 means that the kernel has
     * a virtual address space of one gigabyte, which limits the
     * amount of physical memory you can use to about 950MB.
     *
     * If you want more physical memory than this then see the CONFIG_HIGHMEM4G
     * and CONFIG_HIGHMEM64G options in the kernel configuration.
     */
    #define __PAGE_OFFSET _AC(CONFIG_PAGE_OFFSET, UL)

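It depends on the CONFIG_PAGE_OFFSET setting. For convenience, assume the default PAGE_OFFSET of 0xC0000000. Then PAGE_OFFSET >> 20 is 0xC00. Which entry is that? 0xC00 / 4 is entry 768. So entries 0 and 768 of the page global directory point to the same page table, which means they map the same physical memory. Why do this? Entry 768 is 0x300 in hexadecimal; shifted left by 22 bits it gives the address 0xC0000000, which is clearly where the kernel's virtual address space begins once paging is on. The purpose is therefore to make the kernel addresses starting at 0xC0000000 and the addresses starting at 0x00000000 map the same physical memory, mainly so that data stays easy to reach under this temporary mapping.

This also follows a requirement in the Intel manual for enabling paging:

> 6. If paging is enabled, the code for the MOV CR0 instruction and the JMP or CALL instruction must come from a page that is identity mapped (that is, the linear address before the jump is the same as the physical address after paging and protected mode is enabled). The target instruction for the JMP or CALL instruction does not need to be identity mapped.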
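
The offset arithmetic for the two directory slots can be sketched in Python as follows. This assumes the default PAGE_OFFSET of 0xC0000000 and 4-byte entries; the list `directory` and the entry value 0x01234067 are hypothetical stand-ins, not values taken from a running kernel.

```python
PAGE_OFFSET = 0xC0000000   # assumed default CONFIG_PAGE_OFFSET
ENTRY_SIZE = 4             # bytes per page global directory entry

page_pde_offset = PAGE_OFFSET >> 20          # byte offset inside the directory
kernel_slot = page_pde_offset // ENTRY_SIZE  # index of the kernel entry
slot_base = kernel_slot << 22                # first linear address that slot covers

# Toy directory: both slots receive the same entry, as the two stores do.
directory = [0] * 1024
pde = 0x01234067                             # hypothetical entry value
directory[0] = pde                           # identity mapping, from address 0
directory[kernel_slot] = pde                 # kernel mapping, from PAGE_OFFSET

print(hex(page_pde_offset))   # 0xc00
print(kernel_slot)            # 768
print(hex(slot_base))         # 0xc0000000
```

With both slots pointing at the same page table, the instruction stream keeps executing at its low (identity-mapped) address across the write to CR0, while kernel virtual addresses already resolve to the same memory.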

The next instruction is:

    addl $4,%edx

edx is advanced by 4 bytes, ready for the next page global directory entry to be written.

The next instructions are:

        movl $1024, %ecx
    11:
        stosl
        addl $0x1000,%eax
        loop 11b

First 1024 is loaded into ecx as the iteration count for the loop instruction, so the body runs 1024 times. stosl stores the value of eax at the address es:edi and then advances edi by 4. After stosl, eax is increased by 0x1000, and loop jumps back to stosl to store eax again. Recall that edi starts at __brk_base, which is the page table, and page table entries hold the base address and attributes of physical pages. So eax is the page table entry value. The first entry gets the initial value of eax, PTE_IDENT_ATTR (0x003): as its comment says, the RW and PRESENT attributes are set, and the base address is 0. Each added 0x1000 is exactly the size of a 4 KB physical page, so the next entry maps the next physical page. Compare the page table entry format:

> [Figure: page table entry format, Intel manual]

When the loop finishes, it has mapped 1024 * 0x1000 bytes, that is 4 MB of physical memory, and eax has advanced to 0x400000 (plus the attribute bits in its low bits, 0x400003 in total). Continuing:
    

    movl $pa(_end) + MAPPING_BEYOND_END + PTE_IDENT_ATTR, %ebp
    cmpl %ebp,%eax
    jb 10b

This compares eax with the value of "$pa(_end) + MAPPING_BEYOND_END + PTE_IDENT_ATTR"; if eax is below it, execution jumps back to label 10 and repeats. Setting aside PTE_IDENT_ATTR, which is just the page table entry attributes, the mapping covers physical memory from 0 up to MAPPING_BEYOND_END past the _end symbol. (Not exactly: the comparison runs only after a whole page table has been filled, so when the range ends partway through a page table, a further full page table is added and mapped, which guarantees that the whole range is covered.)
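
The two nested loops can be mimicked with a short Python simulation. This is only a sketch of the control flow described above, not kernel code: the function `build_early_tables` is invented, the directory and page tables are plain lists, and the addresses in the example call (brk base 0x01000000, _end 0x01200000) are hypothetical.

```python
PAGE_SIZE = 0x1000
PTE_IDENT_ATTR = 0x003
PDE_IDENT_ATTR = 0x067
PAGE_OFFSET = 0xC0000000
KERNEL_SLOT = PAGE_OFFSET >> 22      # directory index 768


def build_early_tables(brk_base, end, mapping_beyond_end):
    """Mimic labels 10 and 11; all arguments are physical addresses or sizes."""
    directory = [0] * 1024
    ptes = []                        # page table entries in the order stosl writes them
    edi = brk_base                   # where the next page table entry is written
    eax = PTE_IDENT_ATTR             # next entry value: frame address plus attributes
    slot = 0                         # plays the role of edx / 4
    limit = end + mapping_beyond_end + PTE_IDENT_ATTR
    while True:
        pde = edi + PDE_IDENT_ATTR
        directory[slot] = pde                  # identity entry
        directory[KERNEL_SLOT + slot] = pde    # kernel entry
        slot += 1
        for _ in range(1024):                  # inner loop at label 11
            ptes.append(eax)
            edi += 4
            eax += PAGE_SIZE
        if eax >= limit:                       # jb 10b is not taken
            break
    return directory, ptes, eax >> 12, edi + PAGE_OFFSET


# Hypothetical layout: page tables start at 16 MB, _end is at 18 MB.
directory, ptes, max_pfn_mapped, brk_end = build_early_tables(
    0x01000000, 0x01200000, 0x100000)
print(len(ptes) // 1024)      # 5 page tables were filled
print(hex(max_pfn_mapped))    # 0x1400
print(hex(brk_end))           # 0xc1005000
```

In this example the requested range ends at 19 MB, but the mapping runs to 20 MB because the end check is made only once per 4 MB page table.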
    
Clearly this mapping does not follow the e820 map, so it must be a temporary one. How much memory does it cover? Look at the size of the page global directory:

    【file:/arch/x86/kernel/head_32.s】
    ENTRY(initial_page_table)
        .fill 1024,4,0

This fills 1024 entries of 4 bytes each with 0, so initial_page_table occupies 4 KB and is all zero. As computed earlier, a directory of that size is enough to map the full 4 GB.

Next, the page table space, which starts at __brk_base. The symbol comes from /arch/x86/kernel/vmlinux.lds.s:

    【file: /arch/x86/kernel/vmlinux.lds.s】
        .brk : AT(ADDR(.brk) - LOAD_OFFSET) {
            __brk_base = .;
            . += 64 * 1024; /* 64k alignment slop space */
            *(.brk_reservation) /* areas brk users have reserved */
            __brk_limit = .;
        }

        _end = .;

This is only an excerpt; the ld manual explains the syntax. In brief, the `. += 64 * 1024` after __brk_base reserves 64 KB, followed by the brk reservations, which come from definitions such as RESERVE_BRK(pagetables, INIT_MAP_SIZE), and finally the _end symbol. So the distance from __brk_base to _end is clearly more than 64 KB. 64 KB of page tables can map 64 MB of physical memory, so more than 64 MB can be mapped in the end. _end marks the end of the kernel image, and a kernel image is usually only a few megabytes, so the space is certainly sufficient. The size of this initial mapping is therefore not limited by the page table space.

So back to "$pa(_end) + MAPPING_BEYOND_END + PTE_IDENT_ATTR": the mapping stops MAPPING_BEYOND_END past the _end symbol. Work out the size of MAPPING_BEYOND_END:

    LOWMEM_PAGES = (((1<<32) - __PAGE_OFFSET) >> PAGE_SHIFT)
    --> ((1<<32) - 0xC0000000) >> 12 = 0x40000

    #define PAGE_TABLE_SIZE(pages) ((pages) / PTRS_PER_PGD)
    --> 0x40000 / 1024 = 0x100

    MAPPING_BEYOND_END = PAGE_TABLE_SIZE(LOWMEM_PAGES) << PAGE_SHIFT
    --> 0x100 << 12 = 0x100000

So an extra 0x100000 bytes, that is 1 MB or 256 pages, of physical memory are mapped. The size of the initial mapping therefore depends mainly on the size of the kernel image. Mapping up to _end would seem to be enough, so why map 1 MB more? It is reserved for the page tables that will be built later for the full memory mapping: 256 page tables are what it takes to map the 0x40000 low-memory pages, the e820 map has not been put to use yet, and the memory between __brk_base and _end is not enough when the kernel builds the direct mapping of physical memory.
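
The three-step calculation can be replayed in Python to confirm the result. The constants assume the default 0xC0000000 page offset, 4 KB pages, and non-PAE paging; the names simply mirror the macros quoted above.

```python
PAGE_SHIFT = 12
PTRS_PER_PGD = 1024
PAGE_OFFSET = 0xC0000000   # assumed default CONFIG_PAGE_OFFSET

# Number of 4 KB pages in the kernel's 1 GB of low memory.
LOWMEM_PAGES = ((1 << 32) - PAGE_OFFSET) >> PAGE_SHIFT


def page_table_size(pages):
    """Number of page-table pages needed to map the given number of pages."""
    return pages // PTRS_PER_PGD


MAPPING_BEYOND_END = page_table_size(LOWMEM_PAGES) << PAGE_SHIFT

print(hex(LOWMEM_PAGES))                    # 0x40000
print(hex(page_table_size(LOWMEM_PAGES)))   # 0x100
print(hex(MAPPING_BEYOND_END))              # 0x100000
print(MAPPING_BEYOND_END // 1024, "KB")     # 1024 KB
```

0x100000 bytes is 1 MB, which is 256 pages, one page table for each 4 MB of the 1 GB low-memory range.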
    
At this point it is known where the page global directory is, where the page tables are, and how much memory is mapped, and all of them have been set up.

Now look at the rest of the assembly:

    addl $__PAGE_OFFSET, %edi
    movl %edi, pa(_brk_end)

This stores the virtual address of the end of the page tables in _brk_end.

    shrl $12, %eax
    movl %eax, pa(max_pfn_mapped)

This writes the number of mapped page frames to the max_pfn_mapped variable.

    /* Do early initialization of the fixmap area */
    movl $pa(initial_pg_fixmap)+PDE_IDENT_ATTR,%eax
    movl %eax,pa(initial_page_table+0xffc)

Finally, the last page global directory entry in the pgd is set up as the fixed-mapping (fixmap) entry.
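
To see which directory entry the offset 0xffc selects and which linear addresses it governs, here is a small Python sketch. It assumes 4-byte entries that each cover 4 MB of linear address space; the helper `slot_range` is made up for this illustration.

```python
ENTRY_SIZE = 4
SLOT_SPAN = 1 << 22   # each directory entry covers 4 MB of linear addresses


def slot_range(byte_offset):
    """Map a byte offset in the directory to (index, first address, last address)."""
    index = byte_offset // ENTRY_SIZE
    start = index * SLOT_SPAN
    return index, start, start + SLOT_SPAN - 1


index, start, last = slot_range(0xFFC)
print(index, hex(start), hex(last))   # 1023 0xffc00000 0xffffffff
```

Offset 0xffc is therefore entry 1023, the last one, covering the top 4 MB of the 32-bit linear address space.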
    
What are these for? They are not analyzed here; they all prepare for the later global memory initialization. Then, with the CR0 write seen earlier, paging is turned on.

In fact, during initialization Linux does not enable paging right after enabling segmentation; it waits until decompress_kernel has run, that is, until the kernel has been decompressed. Otherwise there would be no way to know where the kernel symbols are.

Back to the CR0 writes found in head_32.s: there are three in total, and the second one enables paging. What does the last one do?

        movl $0x50022,%ecx # set AM, WP, NE and MP
        movl %cr0,%eax
        andl $0x80000011,%eax # Save PG,PE,ET
        orl %ecx,%eax
        movl %eax,%cr0

        lgdt early_gdt_descr
        lidt idt_descr

This updates CR0 and reloads the locations of the GDT and IDT. CR0 is set to 0x50022 | 0x80000011 = 0x80050033. The value written when paging was enabled (X86_CR0_PE|X86_CR0_MP|X86_CR0_ET|X86_CR0_NE|X86_CR0_WP|X86_CR0_AM|X86_CR0_PG) is also 0x80050033. This presumably just makes sure the CPU state is correct for the code that follows; it has nothing to do with page mapping.
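
The mask-and-set sequence can be checked with a few lines of Python. This is a sketch of the `andl`/`orl` arithmetic only; the function name `update_cr0` and the second sample input (a CR0 value with the TS bit, 0x8, also set) are hypothetical.

```python
PRESERVE_MASK = 0x80000011   # PG | ET | PE: kept from the current CR0
SET_BITS = 0x50022           # AM | WP | NE | MP: forced on


def update_cr0(current):
    """Keep only PG, ET and PE from the current value, then set AM, WP, NE and MP."""
    return (current & PRESERVE_MASK) | SET_BITS


print(hex(update_cr0(0x80050033)))   # 0x80050033
print(hex(update_cr0(0x8005003B)))   # 0x80050033
```

Any bit outside the two masks is cleared, so the result is 0x80050033 whenever PG, ET, and PE were already set.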

In fact, the whole procedure for enabling paging follows the steps the Intel manual describes for entering protected mode:

> Switching to Protected Mode
>
> Before switching to protected mode from real mode, a minimum set of system data structures and code modules must be loaded into memory, as described in Section 9.8, “Software Initialization for Protected-Mode Operation.” Once these tables are created, software initialization code can switch into protected mode.
>
> Protected mode is entered by executing a MOV CR0 instruction that sets the PE flag in the CR0 register. (In the same instruction, the PG flag in register CR0 can be set to enable paging.) Execution in protected mode begins with a CPL of 0.
>
> Intel 64 and IA-32 processors have slightly different requirements for switching to protected mode. To insure upwards and downwards code compatibility with Intel 64 and IA-32 processors, we recommend that you follow these steps:
>
> 1. Disable interrupts. A CLI instruction disables maskable hardware interrupts. NMI interrupts can be disabled with external circuitry. (Software must guarantee that no exceptions or interrupts are generated during the mode switching operation.)
> 2. Execute the LGDT instruction to load the GDTR register with the base address of the GDT.
> 3. Execute a MOV CR0 instruction that sets the PE flag (and optionally the PG flag) in control register CR0.
> 4. Immediately following the MOV CR0 instruction, execute a far JMP or far CALL instruction. (This operation is typically a far jump or call to the next instruction in the instruction stream.)
> 5. The JMP or CALL instruction immediately after the MOV CR0 instruction changes the flow of execution and serializes the processor.
> 6. If paging is enabled, the code for the MOV CR0 instruction and the JMP or CALL instruction must come from a page that is identity mapped (that is, the linear address before the jump is the same as the physical address after paging and protected mode is enabled). The target instruction for the JMP or CALL instruction does not need to be identity mapped.
> 7. If a local descriptor table is going to be used, execute the LLDT instruction to load the segment selector for the LDT in the LDTR register.
> 8. Execute the LTR instruction to load the task register with a segment selector to the initial protected-mode task or to a writable area of memory that can be used to store TSS information on a task switch.
> 9. After entering protected mode, the segment registers continue to hold the contents they had in real-address mode. The JMP or CALL instruction in step 4 resets the CS register. Perform one of the following operations to update the contents of the remaining segment registers.
>    - Reload segment registers DS, SS, ES, FS, and GS. If the ES, FS, and/or GS registers are not going to be used, load them with a null selector.
>    - Perform a JMP or CALL instruction to a new task, which automatically resets the values of the segment registers and branches to a new code segment.
> 10. Execute the LIDT instruction to load the IDTR register with the address and limit of the protected-mode IDT.
> 11. Execute the STI instruction to enable maskable hardware interrupts and perform the necessary hardware operation to enable NMI interrupts.
>
> Random failures can occur if other instructions exist between steps 3 and 4 above. Failures will be readily seen in some situations, such as when instructions that reference memory are inserted between steps 3 and 4 while in system management mode.