http://blog.chinaunix.net/uid-26859697-id-5573776.html



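kmalloc() is built on top of the slab/slob/slub allocators. Many sources treat it as the entry point to those allocators, but the two are in fact slightly different.

Let's walk through its implementation: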



【file:/include/linux/slab.h】
/**
 * kmalloc - allocate memory
 * @size: how many bytes of memory are required.
 * @flags: the type of memory to allocate.
 *
 * kmalloc is the normal method of allocating memory
 * for objects smaller than page size in the kernel.
 *
 * The @flags argument may be one of:
 *
 * %GFP_USER - Allocate memory on behalf of user. May sleep.
 *
 * %GFP_KERNEL - Allocate normal kernel ram. May sleep.
 *
 * %GFP_ATOMIC - Allocation will not sleep. May use emergency pools.
 *   For example, use this inside interrupt handlers.
 *
 * %GFP_HIGHUSER - Allocate pages from high memory.
 *
 * %GFP_NOIO - Do not do any I/O at all while trying to get memory.
 *
 * %GFP_NOFS - Do not make any fs calls while trying to get memory.
 *
 * %GFP_NOWAIT - Allocation will not sleep.
 *
 * %__GFP_THISNODE - Allocate node-local memory only.
 *
 * %GFP_DMA - Allocation suitable for DMA.
 *   Should only be used for kmalloc() caches. Otherwise, use a
 *   slab created with SLAB_DMA.
 *
 * Also it is possible to set different flags by OR'ing
 * in one or more of the following additional @flags:
 *
 * %__GFP_COLD - Request cache-cold pages instead of
 *   trying to return cache-warm pages.
 *
 * %__GFP_HIGH - This allocation has high priority and may use emergency pools.
 *
 * %__GFP_NOFAIL - Indicate that this allocation is in no way allowed to fail
 *   (think twice before using).
 *
 * %__GFP_NORETRY - If memory is not immediately available,
 *   then give up at once.
 *
 * %__GFP_NOWARN - If allocation fails, don't issue any warnings.
 *
 * %__GFP_REPEAT - If allocation fails initially, try once more before failing.
 *
 * There are other flags available as well, but these are not intended
 * for general use, and so are not documented here. For a full list of
 * potential flags, always refer to linux/gfp.h.
 */
static __always_inline void *kmalloc(size_t size, gfp_t flags)
{
    if (__builtin_constant_p(size)) {
        if (size > KMALLOC_MAX_CACHE_SIZE)
            return kmalloc_large(size, flags);
#ifndef CONFIG_SLOB
        if (!(flags & GFP_DMA)) {
            int index = kmalloc_index(size);

            if (!index)
                return ZERO_SIZE_PTR;

            return kmem_cache_alloc_trace(kmalloc_caches[index],
                    flags, size);
        }
#endif
    }
    return __kmalloc(size, flags);
}
     

    kmalloc() takes two parameters: size is the number of bytes requested and flags holds the allocation flags. kmalloc has a large number of allocation flags; each one occupies specific bits, so they can be OR'ed together in many combinations.

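    GFP_USER: allocate memory on behalf of user space; may sleep.

    GFP_KERNEL: the normal kernel allocation; may sleep.

    GFP_ATOMIC: the allocation will not sleep but may use emergency reserves; typically used in interrupt handlers.

    GFP_HIGHUSER: allocate from high memory.

    GFP_NOIO: do not perform any I/O while allocating.

    GFP_NOFS: do not perform any filesystem operations while allocating.

    GFP_NOWAIT: do not sleep while allocating.

    __GFP_THISNODE: allocate only from the local node's memory.

    GFP_DMA: allocate memory suitable for DMA; should only be used with the kmalloc caches.

    __GFP_COLD: request cache-cold pages rather than cache-warm pages.

    __GFP_HIGH: the allocation has high priority and may use emergency reserves.

    __GFP_NOFAIL: the allocation is not allowed to fail; use with great care.

    __GFP_NORETRY: if memory is not immediately available, give up at once instead of retrying.

    __GFP_NOWARN: do not issue any warning if the allocation fails.

    __GFP_REPEAT: if the allocation fails at first, try once more before failing.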
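Because each flag occupies its own bits, combining and testing flags is plain bit arithmetic. The following is a minimal userspace sketch of that idea. All `DEMO_` names and their bit values are hypothetical and chosen only for illustration; the real definitions live in linux/gfp.h and vary between kernel versions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for gfp_t; the values below are NOT the kernel's. */
typedef unsigned int demo_gfp_t;

#define DEMO_GFP_DMA     0x001u  /* want DMA-capable memory */
#define DEMO_GFP_WAIT    0x010u  /* caller may sleep */
#define DEMO_GFP_IO      0x040u  /* allocator may start I/O */
#define DEMO_GFP_FS      0x080u  /* allocator may call into the filesystem */
#define DEMO_GFP_NOWARN  0x200u  /* stay quiet on failure */

/* A composite flag is simply the OR of several single-bit flags. */
#define DEMO_GFP_KERNEL  (DEMO_GFP_WAIT | DEMO_GFP_IO | DEMO_GFP_FS)

/* Testing a flag is a bitwise AND against its bit. */
static bool demo_wants_dma(demo_gfp_t flags)
{
    return (flags & DEMO_GFP_DMA) != 0;
}

static bool demo_may_sleep(demo_gfp_t flags)
{
    return (flags & DEMO_GFP_WAIT) != 0;
}
```

A caller would write something like `DEMO_GFP_KERNEL | DEMO_GFP_NOWARN`, which is the same pattern as the `flags & GFP_DMA` test inside kmalloc().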

    The __builtin_constant_p in the if test at the top of the function is a GCC built-in that reports whether a value is a compile-time constant: it returns true if so and false otherwise. So when kmalloc() is called with a constant size larger than KMALLOC_MAX_CACHE_SIZE (that is, the request exceeds the largest cache kmalloc() can allocate from), the allocation goes through kmalloc_large(). When the size is a constant within that limit, CONFIG_SLOB is not set and GFP_DMA is not requested, kmalloc_index() selects the cache and the object comes straight from kmem_cache_alloc_trace() (a zero size yields ZERO_SIZE_PTR). Every other case falls through to __kmalloc(). An allocation made via kmalloc_large() follows kmalloc_large()->kmalloc_order()->__get_free_pages() and ultimately obtains its memory from the buddy allocator.

    The buddy algorithm was analyzed earlier and is not repeated here. Next comes the implementation of __kmalloc():
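To make the compile-time dispatch concrete, here is a small userspace sketch, assuming GCC or Clang. `DEMO_MAX_CACHE_SIZE`, `DEMO_PICK_PATH` and the `demo_path` enum are hypothetical names invented for this illustration, and the sketch ignores the CONFIG_SLOB and GFP_DMA conditions that the real kmalloc() also checks.

```c
#include <stddef.h>

/* Hypothetical threshold standing in for KMALLOC_MAX_CACHE_SIZE. */
#define DEMO_MAX_CACHE_SIZE 8192

enum demo_path { DEMO_PATH_LARGE, DEMO_PATH_CACHE, DEMO_PATH_GENERIC };

/*
 * Written as a macro so that __builtin_constant_p sees the caller's
 * expression directly, whether or not the compiler inlines or optimizes.
 * Constant sizes are classified right here; anything else is deferred
 * to a generic run-time path.
 */
#define DEMO_PICK_PATH(size)                                        \
    (__builtin_constant_p(size)                                     \
         ? ((size_t)(size) > (size_t)DEMO_MAX_CACHE_SIZE            \
                ? DEMO_PATH_LARGE                                   \
                : DEMO_PATH_CACHE)                                  \
         : DEMO_PATH_GENERIC)

/* A size read from a volatile object is never a compile-time constant. */
static enum demo_path demo_path_for_runtime_size(void)
{
    volatile size_t n = 64;

    return DEMO_PICK_PATH(n);
}
```

With a literal such as `DEMO_PICK_PATH(64)` the whole decision folds away at compile time, which is the reason kmalloc() is marked always-inline: the constant check only pays off when the call site's argument is visible.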

【file:/mm/slub.c】
void *__kmalloc(size_t size, gfp_t flags)
{
    struct kmem_cache *s;
    void *ret;

    if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
        return kmalloc_large(size, flags);

    s = kmalloc_slab(size, flags);

    if (unlikely(ZERO_OR_NULL_PTR(s)))
        return s;

    ret = slab_alloc(s, flags, _RET_IP_);

    trace_kmalloc(_RET_IP_, ret, size, s->size, flags);

    return ret;
}

 

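Like kmalloc(), this function first checks whether the request exceeds the largest cache size and, if so, allocates through kmalloc_large(). Otherwise it calls kmalloc_slab() with the requested size and flags to find a suitable kmem_cache; if that returns NULL or ZERO_SIZE_PTR, the value is handed back to the caller as is. Finally the object is allocated from the cache with slab_alloc().

Here is the implementation of kmalloc_slab():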
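Requests that are too big for any cache end up at the page allocator, which hands out blocks of 2^order pages, so the byte count has to be rounded up to an order first. The sketch below shows one way to do that rounding in userspace. It assumes 4 KiB pages, and `demo_get_order` is a hypothetical stand-in written for this post, not the kernel's own helper.

```c
#include <stddef.h>

#define DEMO_PAGE_SHIFT 12u   /* assumption: 4 KiB pages */

/*
 * Smallest order such that (page size << order) >= size.
 * Only meaningful for size > 0.
 */
static unsigned int demo_get_order(size_t size)
{
    unsigned int order = 0;
    /* Index of the page that holds the last requested byte. */
    size_t pages = (size - 1) >> DEMO_PAGE_SHIFT;

    /* Count how many doublings are needed to cover that page index. */
    while (pages) {
        order++;
        pages >>= 1;
    }
    return order;
}
```

For example, a request one byte larger than two pages needs an order-2 block (four pages), so large kmalloc() requests can waste a noticeable amount of memory when the size is just past a power of two.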

【file:/mm/slab_common.c】
/*
 * Find the kmem_cache structure that serves a given size of
 * allocation
 */
struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
{
    int index;

    if (unlikely(size > KMALLOC_MAX_SIZE)) {
        WARN_ON_ONCE(!(flags & __GFP_NOWARN));
        return NULL;
    }

    if (size <= 192) {
        if (!size)
            return ZERO_SIZE_PTR;

        index = size_index[size_index_elem(size)];
    } else
        index = fls(size - 1);

#ifdef CONFIG_ZONE_DMA
    if (unlikely((flags & GFP_DMA)))
        return kmalloc_dma_caches[index];

#endif
    return kmalloc_caches[index];
}


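    If the requested size exceeds KMALLOC_MAX_SIZE, the function returns NULL to signal failure (warning once unless __GFP_NOWARN is set). A size of zero returns ZERO_SIZE_PTR. For a non-zero size of at most 192 bytes, size_index_elem() converts the size to a subscript and the global size_index array supplies the cache index; larger sizes take the index directly from fls(size - 1). Finally, if DMA zone support is configured (CONFIG_ZONE_DMA) and GFP_DMA is set, the kmem_cache is taken from kmalloc_dma_caches at that index; otherwise it comes from kmalloc_caches.

    As this shows, kmalloc() is fairly simple. The memory it returns is contiguous not only in virtual address space but also in physical address space, which sets it apart from memory obtained through vmalloc(), to be analyzed later.

    Now a quick look at kfree(). It is implemented in several places, mainly slab.c, slob.c and slub.c, which is another reason kmalloc() and kfree() are said to be built on slab/slob/slub. Continuing from the earlier slub analysis, the focus here is the kfree() in slub.c: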
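The two-tier index lookup is easier to follow with concrete numbers. The userspace sketch below mimics it under stated assumptions: the table contents and the `(size - 1) / 8` subscript are modeled on my recollection of the default size_index table and size_index_elem() and should be checked against your own kernel tree (the kernel may also adjust the table at boot). All `demo_` names are hypothetical. Index 1 and 2 stand for the 96-byte and 192-byte caches, any other index n for the 2^n-byte cache, and 0 here stands in for the ZERO_SIZE_PTR case.

```c
#include <stddef.h>

/* Assumed default table: one entry per 8-byte step from 8 to 192 bytes. */
static const signed char demo_size_index[24] = {
    3, 4, 5, 5, 6, 6, 6, 6,   /*   8 ..  64 bytes */
    1, 1, 1, 1, 7, 7, 7, 7,   /*  72 .. 128 bytes */
    2, 2, 2, 2, 2, 2, 2, 2    /* 136 .. 192 bytes */
};

/* Position of the highest set bit, counting from 1; 0 when x == 0. */
static int demo_fls(unsigned long x)
{
    int bit = 0;

    while (x) {
        bit++;
        x >>= 1;
    }
    return bit;
}

/* Map a request size to a cache index, small sizes via the table. */
static int demo_kmalloc_index(size_t size)
{
    if (size == 0)
        return 0;                               /* zero-size request */
    if (size <= 192)
        return demo_size_index[(size - 1) / 8]; /* table lookup */
    return demo_fls((unsigned long)(size - 1)); /* next power of two */
}
```

The table exists because of the two odd-sized caches: a 65-byte request fits the 96-byte cache (index 1) better than the 128-byte one, which a pure power-of-two rule would have picked.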

【file:/mm/slub.c】
void kfree(const void *x)
{
    struct page *page;
    void *object = (void *)x;

    trace_kfree(_RET_IP_, x);

    if (unlikely(ZERO_OR_NULL_PTR(x)))
        return;

    page = virt_to_head_page(x);
    if (unlikely(!PageSlab(page))) {
        BUG_ON(!PageCompound(page));
        kfree_hook(x);
        __free_memcg_kmem_pages(page, compound_order(page));
        return;
    }
    slab_free(page->slab_cache, page, object, _RET_IP_);
}


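    The function is simple. It first records the call with trace_kfree(), then if (unlikely(ZERO_OR_NULL_PTR(x))) returns early for a NULL or zero-size pointer. Next, virt_to_head_page(x) maps the virtual address to its head page, and if (unlikely(!PageSlab(page))) checks whether that page is managed by the slab allocator. If it is, the object is released through slab_free(). If it is not, the if branch runs: kfree_hook() performs the pre-free kmemleak processing (it mainly wraps kmemleak_free()), and then __free_memcg_kmem_pages() frees the pages and also takes care of the cgroup side of the release.

    And that is all there is to kmalloc() and kfree().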
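The early-return check covers two different pointers with a single comparison. Here is a userspace sketch of that trick. The sentinel value 16 matches what I recall of the kernel's ZERO_SIZE_PTR definition, but treat it as an assumption; the `demo_` names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed sentinel for zero-size allocations: a small non-NULL address
 * that can never be a valid object, so dereferencing it would fault.
 */
#define DEMO_ZERO_SIZE_PTR ((void *)16)

/*
 * True for NULL and for the zero-size sentinel. Both sit at or below
 * the sentinel address, so one unsigned comparison covers both cases.
 */
static int demo_zero_or_null_ptr(const void *p)
{
    return (uintptr_t)p <= (uintptr_t)DEMO_ZERO_SIZE_PTR;
}
```

This is why passing either NULL or the result of a zero-byte kmalloc() to kfree() is harmless: both are filtered out before any page lookup happens.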