http://blog.chinaunix.net/uid-26859697-id-5473255.html

We begin the analysis with the initialization of the SLUB allocator.

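Back in mm_init(), right after mem_init() initializes the buddy allocator, the next call, kmem_cache_init(), is the entry point of the SLUB allocator. The function has three implementations under /mm, in slab.c, slob.c and slub.c, one per allocator, each with its own initialization. Since this analysis covers SLUB, we focus on the implementation in slub.c.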

The function is implemented as follows:


【file:/mm/slub.c】

    void __init kmem_cache_init(void)
    {
        static __initdata struct kmem_cache boot_kmem_cache,
            boot_kmem_cache_node;

        if (debug_guardpage_minorder())
            slub_max_order = 0;

        kmem_cache_node = &boot_kmem_cache_node;
        kmem_cache = &boot_kmem_cache;

        create_boot_cache(kmem_cache_node, "kmem_cache_node",
            sizeof(struct kmem_cache_node), SLAB_HWCACHE_ALIGN);

        register_hotmemory_notifier(&slab_memory_callback_nb);

        /* Able to allocate the per node structures */
        slab_state = PARTIAL;

        create_boot_cache(kmem_cache, "kmem_cache",
                offsetof(struct kmem_cache, node) +
                    nr_node_ids * sizeof(struct kmem_cache_node *),
                   SLAB_HWCACHE_ALIGN);

        kmem_cache = bootstrap(&boot_kmem_cache);

        /*
         * Allocate kmem_cache_node properly from the kmem_cache slab.
         * kmem_cache_node is separately allocated so no need to
         * update any list pointers.
         */
        kmem_cache_node = bootstrap(&boot_kmem_cache_node);

        /* Now we can use the kmem_cache to allocate kmalloc slabs */
        create_kmalloc_caches(0);

    #ifdef CONFIG_SMP
        register_cpu_notifier(&slab_notifier);
    #endif

        printk(KERN_INFO
            "SLUB: HWalign=%d, Order=%d-%d, MinObjects=%d,"
            " CPUs=%d, Nodes=%d\n",
            cache_line_size(),
            slub_min_order, slub_max_order, slub_min_objects,
            nr_cpu_ids, nr_node_ids);
    }

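Looking at the overall structure of the function, register_hotmemory_notifier() and register_cpu_notifier() are easy to recognize: they register callbacks on kernel notifier chains. Apart from those, the main functions involved are create_boot_cache(), bootstrap() and create_kmalloc_caches(). To understand the details, we analyze these three functions in turn.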
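For readers unfamiliar with notifier chains, the idea can be illustrated with a small user-space sketch: a chain is a priority-ordered linked list of callbacks, and an event is delivered by walking the list. This is only a simplified model; the sketch_ names and the structure layout are assumptions made for illustration and are not the kernel's notifier API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a kernel notifier block. */
struct sketch_notifier {
    int (*call)(struct sketch_notifier *nb, unsigned long event, void *data);
    struct sketch_notifier *next;
    int priority;               /* higher priority runs first */
};

/* Insert nb into the chain, keeping it sorted by descending priority. */
static void sketch_register(struct sketch_notifier **head,
                            struct sketch_notifier *nb)
{
    while (*head && (*head)->priority >= nb->priority)
        head = &(*head)->next;
    nb->next = *head;
    *head = nb;
}

/* Deliver one event to every registered callback; returns how many ran. */
static int sketch_call_chain(struct sketch_notifier *head,
                             unsigned long event, void *data)
{
    int calls = 0;

    for (; head; head = head->next) {
        head->call(head, event, data);
        calls++;
    }
    return calls;
}

/* Example callback: stores the event it received in *data. */
static int sketch_record_event(struct sketch_notifier *nb,
                               unsigned long event, void *data)
{
    (void)nb;
    *(unsigned long *)data = event;
    return 0;
}
```

In this model, registering a callback corresponds to what register_hotmemory_notifier() and register_cpu_notifier() do for the slab code: the allocator asks to be told when memory or CPUs come and go, so that it can adjust its per-node and per-CPU structures.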

First, create_boot_cache():


【file:/mm/slab_common.c】

    /* Create a cache during boot when no slab services are available yet */
    void __init create_boot_cache(struct kmem_cache *s, const char *name, size_t size,
            unsigned long flags)
    {
        int err;

        s->name = name;
        s->size = s->object_size = size;
        s->align = calculate_alignment(flags, ARCH_KMALLOC_MINALIGN, size);
        err = __kmem_cache_create(s, flags);

        if (err)
            panic("Creation of kmalloc slab %s size=%zu failed. Reason %d\n",
                        name, size, err);

        s->refcount = -1;    /* Exempt from merging for now */
    }

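This function creates a cache for the allocator; in this first call it initializes the boot_kmem_cache_node structure. Inside it, calculate_alignment() computes the memory alignment value, while __kmem_cache_create() is the core cache-creation function, which initializes the kmem_cache structure. The implementation of __kmem_cache_create() is analyzed in detail later, in the part on slab creation.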
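To make the role of calculate_alignment() more concrete, here is a minimal user-space sketch of the kind of computation it performs. It is an approximation of the behaviour as we understand it for kernels of this era, not a copy of the kernel source. The cache line size of 64, the minimum alignment of 8, the flag value and all sketch_ names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_HWCACHE_ALIGN  0x1UL  /* hypothetical flag, stands in for SLAB_HWCACHE_ALIGN */
#define SKETCH_CACHE_LINE     64UL   /* assumed cache line size */
#define SKETCH_SLAB_MINALIGN  8UL    /* assumed architecture minimum alignment */

static unsigned long sketch_calculate_alignment(unsigned long flags,
                                                unsigned long align,
                                                unsigned long size)
{
    if (flags & SKETCH_HWCACHE_ALIGN) {
        unsigned long ralign = SKETCH_CACHE_LINE;

        /* Small objects do not need a whole cache line each. */
        while (ralign > 1 && size <= ralign / 2)
            ralign /= 2;
        /* The cache-line hint never lowers a stricter requested alignment. */
        if (ralign > align)
            align = ralign;
    }

    if (align < SKETCH_SLAB_MINALIGN)
        align = SKETCH_SLAB_MINALIGN;

    /* Round up to a multiple of the pointer size. */
    return (align + sizeof(void *) - 1) & ~(sizeof(void *) - 1);
}
```

Under these assumptions a 192-byte object requested with the cache-alignment flag gets 64-byte alignment, while a 24-byte object only gets 32, because two such objects can share a cache line without straddling it.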

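At this point create_boot_cache() has finished creating the cache for kmem_cache_node objects. Next, after register_hotmemory_notifier() registers its notifier-chain callback, create_boot_cache() is used again, this time to create the cache for kmem_cache objects. Continuing down, we reach the bootstrap() calls: bootstrap(&boot_kmem_cache) and bootstrap(&boot_kmem_cache_node).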
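The object size passed for the kmem_cache cache in kmem_cache_init() is computed with offsetof() rather than sizeof(): the descriptor ends in a per-node pointer array, and only as many entries as there are node IDs need to be allocated. The following user-space sketch shows the same arithmetic on a hypothetical structure; the structure layout, the array bound of 64 and the sketch_ names are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_node;                 /* opaque per-node bookkeeping */

/* Hypothetical descriptor whose last member is a per-node pointer array. */
struct sketch_cache {
    const char *name;
    size_t object_size;
    struct sketch_node *node[64];   /* declared for the worst case */
};

/* Bytes actually needed when only nr_nodes entries of node[] are used. */
static size_t sketch_cache_bytes(unsigned int nr_nodes)
{
    return offsetof(struct sketch_cache, node) +
           nr_nodes * sizeof(struct sketch_node *);
}
```

On a machine with a single node, sketch_cache_bytes(1) is far smaller than sizeof(struct sketch_cache), which is the saving the kernel expression is after.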

bootstrap() migrates a temporary kmem_cache to its final kmem_cache and fixes up the related pointers so that they point to the final one. Its implementation:


【file:/mm/slub.c】

    /********************************************************************
     *            Basic setup of slabs
     *******************************************************************/

    /*
     * Used for early kmem_cache structures that were allocated using
     * the page allocator. Allocate them properly then fix up the pointers
     * that may be pointing to the wrong kmem_cache structure.
     */

    static struct kmem_cache * __init bootstrap(struct kmem_cache *static_cache)
    {
        int node;
        struct kmem_cache *s = kmem_cache_zalloc(kmem_cache, GFP_NOWAIT);

        memcpy(s, static_cache, kmem_cache->object_size);

        /*
         * This runs very early, and only the boot processor is supposed to be
         * up. Even if it weren't true, IRQs are not up so we couldn't fire
         * IPIs around.
         */
        __flush_cpu_slab(s, smp_processor_id());
        for_each_node_state(node, N_NORMAL_MEMORY) {
            struct kmem_cache_node *n = get_node(s, node);
            struct page *p;

            if (n) {
                list_for_each_entry(p, &n->partial, lru)
                    p->slab_cache = s;

    #ifdef CONFIG_SLUB_DEBUG
                list_for_each_entry(p, &n->full, lru)
                    p->slab_cache = s;
    #endif
            }
        }
        list_add(&s->list, &slab_caches);
        return s;
    }

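First, kmem_cache_zalloc() allocates space for a kmem_cache. Note the call chain kmem_cache_zalloc() -> kmem_cache_alloc() -> slab_alloc(): the allocation is ultimately served from the kmem_cache cache that create_boot_cache() set up earlier. Next, the kmem_cache structure passed to bootstrap() is copied into the new space with memcpy(), and __flush_cpu_slab() flushes the current CPU's slab. Then for_each_node_state() iterates over the memory nodes and get_node() fetches the kmem_cache_node for each one. If it is not NULL, the partial slab list is walked and each slab's pointer to its kmem_cache is corrected; if CONFIG_SLUB_DEBUG is enabled, the full slab list is walked in the same way. Finally, the new kmem_cache is added to the global slab_caches list.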
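The copy-then-fix-up pattern described above can be shown in a few lines of user-space C. This is a minimal sketch, not the kernel code: calloc() stands in for kmem_cache_zalloc(), a singly linked list stands in for the per-node partial list, and all sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sketch_cache;

/* A slab page remembers which cache descriptor owns it. */
struct sketch_page {
    struct sketch_cache *owner;
    struct sketch_page *next;
};

struct sketch_cache {
    const char *name;
    struct sketch_page *partial;    /* singly linked list of partial slabs */
};

/*
 * Copy a statically allocated descriptor into dynamically allocated
 * storage, then re-point every page that still refers to the static copy.
 * Returns NULL if the allocation fails.
 */
static struct sketch_cache *sketch_bootstrap(const struct sketch_cache *static_cache)
{
    struct sketch_cache *s = calloc(1, sizeof(*s));
    struct sketch_page *p;

    if (!s)
        return NULL;

    memcpy(s, static_cache, sizeof(*s));
    for (p = s->partial; p; p = p->next)
        p->owner = s;
    return s;
}
```

After the call, nothing refers to the static descriptor any more, so its storage can be reclaimed, which is why the kernel can mark the boot descriptors as __initdata.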

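This shows how economical the Linux kernel's design is: the slab management framework is first initialized in temporary space, and the data in that temporary space is then migrated into the framework itself. The allocator ends up managing its own metadata and wastes almost no memory.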

Further down in kmem_cache_init() comes create_kmalloc_caches():


【file:/mm/slab_common.c】

    /*
     * Create the kmalloc array. Some of the regular kmalloc arrays
     * may already have been created because they were needed to
     * enable allocations for slab creation.
     */
    void __init create_kmalloc_caches(unsigned long flags)
    {
        int i;

        /*
         * Patch up the size_index table if we have strange large alignment
         * requirements for the kmalloc array. This is only the case for
         * MIPS it seems. The standard arches will not generate any code here.
         *
         * Largest permitted alignment is 256 bytes due to the way we
         * handle the index determination for the smaller caches.
         *
         * Make sure that nothing crazy happens if someone starts tinkering
         * around with ARCH_KMALLOC_MINALIGN
         */
        BUILD_BUG_ON(KMALLOC_MIN_SIZE > 256 ||
            (KMALLOC_MIN_SIZE & (KMALLOC_MIN_SIZE - 1)));

        for (i = 8; i < KMALLOC_MIN_SIZE; i += 8) {
            int elem = size_index_elem(i);

            if (elem >= ARRAY_SIZE(size_index))
                break;
            size_index[elem] = KMALLOC_SHIFT_LOW;
        }

        if (KMALLOC_MIN_SIZE >= 64) {
            /*
             * The 96 byte size cache is not used if the alignment
             * is 64 byte.
             */
            for (i = 64 + 8; i <= 96; i += 8)
                size_index[size_index_elem(i)] = 7;

        }

        if (KMALLOC_MIN_SIZE >= 128) {
            /*
             * The 192 byte sized cache is not used if the alignment
             * is 128 byte. Redirect kmalloc to use the 256 byte cache
             * instead.
             */
            for (i = 128 + 8; i <= 192; i += 8)
                size_index[size_index_elem(i)] = 8;
        }
        for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
            if (!kmalloc_caches[i]) {
                kmalloc_caches[i] = create_kmalloc_cache(NULL,
                                1 << i, flags);
            }

            /*
             * Caches that are not of the two-to-the-power-of size.
             * These have to be created immediately after the
             * earlier power of two caches
             */
            if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6)
                kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags);

            if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7)
                kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags);
        }

        /* Kmalloc array is now usable */
        slab_state = UP;

        for (i = 0; i <= KMALLOC_SHIFT_HIGH; i++) {
            struct kmem_cache *s = kmalloc_caches[i];
            char *n;

            if (s) {
                n = kasprintf(GFP_NOWAIT, "kmalloc-%d", kmalloc_size(i));

                BUG_ON(!n);
                s->name = n;
            }
        }

    #ifdef CONFIG_ZONE_DMA
        for (i = 0; i <= KMALLOC_SHIFT_HIGH; i++) {
            struct kmem_cache *s = kmalloc_caches[i];

            if (s) {
                int size = kmalloc_size(i);
                char *n = kasprintf(GFP_NOWAIT,
                     "dma-kmalloc-%d", size);

                BUG_ON(!n);
                kmalloc_dma_caches[i] = create_kmalloc_cache(n,
                    size, SLAB_CACHE_DMA | flags);
            }
        }
    #endif
    }

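The BUILD_BUG_ON() at the top of the function is a sanity check: the minimum object size that kmalloc supports must not exceed 256 and must be a power of two. The for loop that follows handles object sizes from 8 bytes up to (but not including) KMALLOC_MIN_SIZE, setting their entries in the size_index array to KMALLOC_SHIFT_LOW. The two branches that compare KMALLOC_MIN_SIZE against 64 and 128 adjust the size_index entries for sizes in the 64-to-96-byte and 128-to-192-byte ranges respectively. For SLUB, KMALLOC_MIN_SIZE is 1 << KMALLOC_SHIFT_LOW; with KMALLOC_SHIFT_LOW equal to 3, KMALLOC_MIN_SIZE is 8, so none of these branches is taken and size_index keeps the values from its original definition in /mm/slab_common.c. The next for loop, from KMALLOC_SHIFT_LOW to KMALLOC_SHIFT_HIGH, calls create_kmalloc_cache() to populate the kmalloc_caches table. The resulting table holds slab caches of sizes {0, 96, 192, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192}, indexed from 0, where slot 0 is unused. Once the caches are created, slab_state is set to UP and the name member of each kmem_cache is initialized. Finally, if CONFIG_ZONE_DMA is configured, the kmalloc_dma_caches table is created as well.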
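The slot layout and the creation order described above can be reproduced with a short user-space sketch. It assumes the configuration discussed in the text (smallest cache 8 bytes, largest 8192 bytes); the constants and sketch_ names are assumptions for illustration, and the functions only model the kernel loop, they do not create any caches.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SHIFT_LOW   3    /* assumed: smallest cache is 8 bytes */
#define SKETCH_SHIFT_HIGH  13   /* assumed: largest cache is 8192 bytes */

/* Object size served by slot n of the table (0 means "no cache"). */
static size_t sketch_kmalloc_size(int n)
{
    if (n > 2)
        return (size_t)1 << n;
    if (n == 1)
        return 96;
    if (n == 2)
        return 192;
    return 0;
}

/*
 * Record the order in which slots would be created: powers of two in
 * ascending order, with slot 1 (96) right after 64 and slot 2 (192)
 * right after 128. Returns the number of slots written to order[].
 */
static int sketch_creation_order(int order[], int capacity)
{
    int i, count = 0;

    for (i = SKETCH_SHIFT_LOW; i <= SKETCH_SHIFT_HIGH; i++) {
        if (count < capacity)
            order[count++] = i;
        if (i == 6 && count < capacity)
            order[count++] = 1;
        if (i == 7 && count < capacity)
            order[count++] = 2;
    }
    return count;
}
```

Mapping each recorded slot through sketch_kmalloc_size() gives the sizes in creation order: 8, 16, 32, 64, 96, 128, 192, 256 and so on up to 8192, which is why the 96-byte and 192-byte caches sit in slots 1 and 2 even though they are not the smallest.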

From this information we can derive the correspondence between size_index and kmalloc_caches:

![](http://blog.chinaunix.net/attachment/201511/18/26859697_1447780750j9QJ.png)
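The way a request size is turned into a kmalloc_caches slot can also be sketched in user-space C. The table values below are the defaults as we understand them for kernels of this era (one entry per 8-byte step up to 192 bytes) and should be treated as an assumption; the sketch_ names are hypothetical, and sketch_fls() is a plain-C stand-in for the kernel's fls().

```c
#include <assert.h>
#include <stddef.h>

/* One entry per 8-byte step from 8 to 192; values are kmalloc_caches slots. */
static const signed char sketch_size_index[24] = {
    3, 4, 5, 5, 6, 6, 6, 6,     /*   8 ..  64 */
    1, 1, 1, 1, 7, 7, 7, 7,     /*  72 .. 128 */
    2, 2, 2, 2, 2, 2, 2, 2      /* 136 .. 192 */
};

/* Position of the highest set bit, counting from 1 (0 for x == 0). */
static int sketch_fls(size_t x)
{
    int bit = 0;

    while (x) {
        bit++;
        x >>= 1;
    }
    return bit;
}

/* Slot that would serve a request of `size` bytes, or -1 for size 0. */
static int sketch_kmalloc_index(size_t size)
{
    if (size == 0)
        return -1;
    if (size <= 192)
        return sketch_size_index[(size - 1) / 8];
    return sketch_fls(size - 1);
}
```

Small sizes go through the table so that the two odd-sized caches can be reached: a 65-byte request lands in slot 1 (96 bytes) instead of being rounded up to 128. Above 192 bytes the slot is simply the next power of two, so a 193-byte request lands in slot 8 (256 bytes).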

Next, we look at the implementation of create_kmalloc_cache():


【file:/mm/slab_common.c】

    struct kmem_cache *__init create_kmalloc_cache(const char *name, size_t size,
                    unsigned long flags)
    {
        struct kmem_cache *s = kmem_cache_zalloc(kmem_cache, GFP_NOWAIT);

        if (!s)
            panic("Out of memory when creating slab %s\n", name);

        create_boot_cache(s, name, size, flags);
        list_add(&s->list, &slab_caches);
        s->refcount = 1;
        return s;
    }

It first allocates a kmem_cache object with kmem_cache_zalloc(), then calls create_boot_cache() to create the slab cache, and adds it to the slab_caches list. The implementation of create_boot_cache() is:


【file:/mm/slab_common.c】

    /* Create a cache during boot when no slab services are available yet */
    void __init create_boot_cache(struct kmem_cache *s, const char *name, size_t size,
            unsigned long flags)
    {
        int err;

        s->name = name;
        s->size = s->object_size = size;
        s->align = calculate_alignment(flags, ARCH_KMALLOC_MINALIGN, size);
        err = __kmem_cache_create(s, flags);

        if (err)
            panic("Creation of kmalloc slab %s size=%zu failed. Reason %d\n",
                        name, size, err);

        s->refcount = -1;    /* Exempt from merging for now */
    }

Clearly, this function mainly relies on __kmem_cache_create() to create slab caches of various sizes for later memory allocations.

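At this point the SLUB allocation framework is fully initialized. To summarize the flow of kmem_cache_init(): it first calls create_boot_cache() to set up the SLUB management structures for kmem_cache_node objects, then calls register_hotmemory_notifier() to register the notifier-chain callback that handles memory hotplug. Note that slab_state is set to PARTIAL at this point, which means kmem_cache_node objects can now be allocated. Next, create_boot_cache() sets up the SLUB management structures for kmem_cache objects, which completes the slabs for all the management structures the allocator itself needs. Because much of this early management lives in temporary variables, bootstrap() then moves the kmem_cache and kmem_cache_node management structures into objects allocated from the SLUB framework itself, making it self-managing. Finally, create_kmalloc_caches() initializes a set of slab caches of different sizes for later memory allocations.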
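The ordering constraints in this summary can be captured in a small user-space model in which each step checks that its prerequisites are in place. This is only an illustration: the three-state enum is a simplified subset of the states mentioned above, the bootstrap() migration step is left out, and every sketch_ name is hypothetical.

```c
#include <assert.h>

/* Simplified subset of the boot states discussed above. */
enum sketch_state { SKETCH_DOWN, SKETCH_PARTIAL, SKETCH_UP };

struct sketch_boot {
    enum sketch_state state;
    int node_cache_ready;       /* cache for per-node structures exists */
    int cache_cache_ready;      /* cache for cache descriptors exists */
    int kmalloc_ready;          /* kmalloc size classes exist */
};

/* Step 1: create the cache for per-node structures. */
static int sketch_create_node_cache(struct sketch_boot *b)
{
    if (b->state != SKETCH_DOWN)
        return -1;
    b->node_cache_ready = 1;
    b->state = SKETCH_PARTIAL;
    return 0;
}

/* Step 2: create the cache for descriptors; needs per-node structures. */
static int sketch_create_cache_cache(struct sketch_boot *b)
{
    if (b->state != SKETCH_PARTIAL || !b->node_cache_ready)
        return -1;
    b->cache_cache_ready = 1;
    return 0;
}

/* Step 3: create the kmalloc size classes; needs both caches above. */
static int sketch_create_kmalloc(struct sketch_boot *b)
{
    if (!b->node_cache_ready || !b->cache_cache_ready)
        return -1;
    b->kmalloc_ready = 1;
    b->state = SKETCH_UP;
    return 0;
}
```

Calling the steps out of order fails in this model, which mirrors why kmem_cache_init() must create the kmem_cache_node cache before the kmem_cache cache, and both before the kmalloc caches.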