现在的位置: 首页 > 综合 > 正文

再读内核存储管理(5):buddy算法

2013年09月22日 ⁄ 综合 ⁄ 共 6579字 ⁄ 字号 评论关闭
 

快乐虾
http://blog.csdn.net/lights_joy/
lights@hb165.com
  
 
本文适用于
ADI bf561 DSP
uclinux-2008r1-rc8 (移植到vdsp5)
Visual DSP++ 5.0
  
 
欢迎转载,但请保留作者信息
 

buddy算法是用来做内存管理的经典算法,目的是为了解决内存的外碎片。
避免外碎片的方法有两种:
1,利用分页单元把一组非连续的空闲页框映射到非连续的线性地址区间。
2,开发适当的技术来记录现存的空闲连续页框块的情况,以尽量避免为满足对小块的请求而把大块的空闲块进行分割。
内核选择第二种避免方法。
buddy算法将所有空闲页框分组为11个块链表,每个块链表的每个块元素分别包含1,2,4,8,16,32,64,128,256,512,1024个连续的页框,每个块的第一个页框的物理地址是该块大小的整数倍。如,大小为16个页框的块,其起始地址是16*2^12(一个页框的大小为4k,16个页框的大小为16*4K,1k=1024=2的10次方,4k=2的12次方)的倍数。
例,假设要请求一个128个页框的块,算法先检查128个页框的链表是否有空闲块,如果没有则查256个页框的链表,有则将256个页框的块分裂两份,一份使用,一份插入128个页框的链表。如果还没有,就查512个页框的链表,有的话就分裂为128,128,256,一个128使用,剩余两个插入对应链表。如果在512还没查到,则返回出错信号。
回收过程相反,内核试图把大小为b的空闲伙伴合并为一个大小为2b的单独快,满足以下条件的两个块称为伙伴:1,两个块具有相同的大小,记做b;2,它们的物理地址是连续的,3,第一个块的第一个页框的物理地址是2*b*2^12的倍数,该算法迭代,如果成功合并所释放的块,会试图合并2b的块来形成更大的块。
 
buddy算法的内存回收由__free_pagefree_page两个宏及两个回收函数完成。其定义为:
#define __free_page(page) __free_pages((page), 0)
#define free_page(addr) free_pages((addr),0)
extern void FASTCALL(__free_pages(struct page *page, unsigned int order));
extern void FASTCALL(free_pages(unsigned long addr, unsigned int order));
这两个函数的实现为:
fastcall void __free_pages(struct page *page, unsigned int order)
{
     if (put_page_testzero(page)) {
         if (order == 0)
              free_hot_page(page);
         else
              __free_pages_ok(page, order);
     }
}
fastcall void free_pages(unsigned long addr, unsigned int order)
{
     if (addr != 0) {
         VM_BUG_ON(!virt_addr_valid((void *)addr));
         __free_pages(virt_to_page((void *)addr), order);
     }
}
从这两个函数的实现可以看出,它们只是接收的参数不同而已,最后的处理方法则是一样的。free_pages函数接收一个绝对地址做为参数,而__free_pages函数则直接接收page这个结构体的指针做为参数。
这其中put_page_testzero函数用于将page->_count减一并判断其是否为0,如果为0则说明此页已经没有用户使用,此时就释放这个页。
从这里还可以看出,当order为0时,此函数试图先将内存页放到所谓的热高速缓存中,否则就将其回收到指定阶数的链表中。
这个函数位于mm/page_alloc.c:
static void __free_pages_ok(struct page *page, unsigned int order)
{
     unsigned long flags;
     int i;
     int reserved = 0;
 
     // 判断是否应该释放这些连续的页,通常reserved为0
     for (i = 0 ; i < (1 << order) ; ++i)
         reserved += free_pages_check(page + i);
     if (reserved)
         return;
 
     if (!PageHighMem(page)) // 恒为true
         debug_check_no_locks_freed(page_address(page),PAGE_SIZE<<order); // 空语句
     arch_free_page(page, order);     // 空语句
     kernel_map_pages(page, 1 << order, 0);    // 空语句
 
     local_irq_save(flags);
     __count_vm_events(PGFREE, 1 << order);    // 空语句
     free_one_page(page_zone(page), page, order);
     local_irq_restore(flags);
}
很简单,就是调用free_one_page函数进行释放工作。
static void free_one_page(struct zone *zone, struct page *page, int order)
{
     spin_lock(&zone->lock);
     zone->all_unreclaimable = 0;
     zone->pages_scanned = 0;
     __free_one_page(page, zone, order);
     spin_unlock(&zone->lock);
}
简单调用__free_one_page函数,继续跟踪。
/*
 * Freeing function for a buddy system allocator.
 *
 * The concept of a buddy system is to maintain direct-mapped table
 * (containing bit values) for memory blocks of various "orders".
 * The bottom level table contains the map for the smallest allocatable
 * units of memory (here, pages), and each level above it describes
 * pairs of units from the levels below, hence, "buddies".
 * At a high level, all that happens here is marking the table entry
 * at the bottom level available, and propagating the changes upward
 * as necessary, plus some accounting needed to play nicely with other
 * parts of the VM system.
 * At each level, we keep a list of pages, which are heads of continuous
 * free pages of length of (1 << order) and marked with PG_buddy. Page's
 * order is recorded in page_private(page) field.
 * So when we are allocating or freeing one, we can derive the state of the
 * other. That is, if we allocate a small block, and both were  
 * free, the remainder of the region must be split into blocks.  
 * If a block is freed, and its buddy is also free, then this
 * triggers coalescing into a block of larger size.           
 *
 * -- wli
 */
 
static inline void __free_one_page(struct page *page,
         struct zone *zone, unsigned int order)
{
     unsigned long page_idx;
     int order_size = 1 << order;
 
     if (unlikely(PageCompound(page)))
         destroy_compound_page(page, order);
 
     page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1);
 
     VM_BUG_ON(page_idx & (order_size - 1));
     VM_BUG_ON(bad_range(zone, page));
 
     __mod_zone_page_state(zone, NR_FREE_PAGES, order_size);
     while (order < MAX_ORDER-1) {
         unsigned long combined_idx;
         struct free_area *area;
         struct page *buddy;
 
         buddy = __page_find_buddy(page, page_idx, order);
         if (!page_is_buddy(page, buddy, order))
              break;        /* Move the buddy up one level. */
 
         list_del(&buddy->lru);
         area = zone->free_area + order;
         area->nr_free--;
         rmv_page_order(buddy);
         combined_idx = __find_combined_index(page_idx, order);
         page = page + (combined_idx - page_idx);
         page_idx = combined_idx;
         order++;
     }
     set_page_order(page, order);
     list_add(&page->lru, &zone->free_area[order].free_list);
     zone->free_area[order].nr_free++;
}
这个注释中已经很清楚地说明了buddy算法的核心,它就是将内存分成不同大小的块,每个块都由数量不等的页组成。最小的块只有一个页,其次是2个页、4个页、8个页…组成的块。最多由1024个页组成。
在上述函数中,while循环尽可能地将页面放在更大的块中,在查找到最大的块后,将这个内存块放到相应的块链表中。
在buddy算法中,每一页都有一个buddy页与之相对应,使用__page_find_buddy函数可以找到指定页面对应的buddy页面:
/*
 * Locate the struct page for both the matching buddy in our
 * pair (buddy1) and the combined O(n+1) page they form (page).
 *
 * 1) Any buddy B1 will have an order O twin B2 which satisfies
 * the following equation:
 *     B2 = B1 ^ (1 << O)
 * For example, if the starting buddy (buddy2) is #8 its order
 * 1 buddy is #10:
 *     B2 = 8 ^ (1 << 1) = 8 ^ 2 = 10
 *
 * 2) Any buddy B will have an order O+1 parent P which
 * satisfies the following equation:
 *     P = B & ~(1 << O)
 *
 * Assumption: *_mem_map is contiguous at least up to MAX_ORDER
 */
static inline struct page *
__page_find_buddy(struct page *page, unsigned long page_idx, unsigned int order)
{
     unsigned long buddy_idx = page_idx ^ (1 << order);
 
     return page + (buddy_idx - page_idx);
}
 
要确认两个指定的页面是否互为buddy,可以使用page_is_buddy函数:
/*
 * This function checks whether a page is free && is the buddy
 * we can do coalesce a page and its buddy if
 * (a) the buddy is not in a hole &&
 * (b) the buddy is in the buddy system &&
 * (c) a page and its buddy have the same order &&
 * (d) a page and its buddy are in the same zone.
 *
 * For recording whether a page is in the buddy system, we use PG_buddy.
 * Setting, clearing, and testing PG_buddy is serialized by zone->lock.
 *
 * For recording page's order, we use page_private(page).
 */
static inline int page_is_buddy(struct page *page, struct page *buddy,
                                     int order)
{
     if (!pfn_valid_within(page_to_pfn(buddy)))     // 恒为false
         return 0;
 
     if (page_zone_id(page) != page_zone_id(buddy))
         return 0;
 
     if (PageBuddy(buddy) && page_order(buddy) == order) {
         BUG_ON(page_count(buddy) != 0);
         return 1;
     }
     return 0;
}
在这里PageBuddy用于判断一个页面是否带有PG_buddy标记。注意:在初始化的时候,所有的页面都只有PG_reserved标记。
#define PageBuddy(page)     test_bit(PG_buddy, &(page)->flags)

抱歉!评论已关闭.