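I just went through the 2.6.23 changelog and found a new feature called "lumpy" reclaim. It can ease the memory fragmentation problem described in the previous post. It does not tackle the allocation side (I once saw a patch that sorted allocated pages into three classes, Kernel, UserSpace and Reserve, and managed them separately). Instead it changes the page reclaim algorithm. The change sits in the try_to_free_pages() path: where reclaim used to scan only a given number of inactive pages on the LRU list, it now does the following: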
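1) Look at each page on the LRU that the scan was asked to cover; if it can be reclaimed, put it on the reclaim queue.
2) Walk the physically contiguous pages around that page (the size of the area is set by the caller, through the allocation order); every page there that is on the LRU and can be reclaimed is put on the reclaim queue as well.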
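Whether the pages on the reclaim queue actually get reclaimed, though, still depends on a fairly involved pass through shrink_page_list(). I don't know how well it works in practice; when I have time I'll grab a board and try it. My guess is that allocating large contiguous blocks will be slow.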
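The two steps above can be sketched as a small user-space simulation. This is not kernel code: toy_page, mem_map, isolate_page and lumpy_isolate are names made up for illustration, and the sketch assumes that the area walked in step 2 is the 2^order aligned block containing the page, which is how I read the patch. It only models "is this page on the LRU"; everything shrink_page_list() does afterwards is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES 16

/* Toy stand-in for struct page: only the two bits this sketch needs. */
struct toy_page {
    bool on_lru;    /* page sits on an LRU list and may be isolated */
    bool isolated;  /* page was already moved to the reclaim queue */
};

static struct toy_page mem_map[NR_PAGES];   /* indexed by pfn */

/* Take one page off the LRU if possible; returns true on success. */
static bool isolate_page(unsigned long pfn)
{
    struct toy_page *p = &mem_map[pfn];

    if (!p->on_lru || p->isolated)
        return false;
    p->isolated = true;
    p->on_lru = false;
    return true;
}

/*
 * lru_tail[] holds pfns in the order a plain LRU scan would visit them.
 * For each one: (1) isolate it, (2) walk the 2^order aligned block
 * around it and isolate whatever else is still on the LRU there.
 * Returns the number of pfns written to out[].
 */
static size_t lumpy_isolate(const unsigned long *lru_tail, size_t nr_scan,
                            unsigned int order, unsigned long *out)
{
    size_t taken = 0;

    for (size_t i = 0; i < nr_scan; i++) {
        unsigned long tag = lru_tail[i];

        assert(tag < NR_PAGES);
        if (!isolate_page(tag))
            continue;
        out[taken++] = tag;

        if (order == 0)
            continue;   /* order-0 request: plain LRU behaviour */

        /* Round down to the start of the aligned block (assumption). */
        unsigned long start = tag & ~((1UL << order) - 1);
        unsigned long end = start + (1UL << order);

        for (unsigned long pfn = start; pfn < end && pfn < NR_PAGES; pfn++) {
            if (pfn == tag)
                continue;
            if (isolate_page(pfn))
                out[taken++] = pfn;
        }
    }
    return taken;
}
```

As a usage example: mark all 16 pages as on the LRU except pfn 6 (say, a pinned page), then call lumpy_isolate() with lru_tail = {5, 10} and order = 2. The queue comes back as 5, 4, 7, 10, 8, 9, 11: the block 8..11 is completely covered, while the block 4..7 still has a hole at pfn 6, so it cannot become a free order-2 page. That hole is why the patch description says lumpy reclaim works best together with an anti-fragmentation scheme.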
The description of the patch is attached below:
Lumpy Reclaim (V3)

When we are out of memory of a suitable size we enter reclaim. The current reclaim algorithm targets pages in LRU order, which is great for fairness but highly unsuitable if you desire pages at higher orders. To get pages of higher order we must shoot down a very high proportion of memory; >95% in a lot of cases.

This patch set adds a lumpy reclaim algorithm to the allocator. It targets groups of pages at the specified order anchored at the end of the active and inactive lists. This encourages groups of pages at the requested orders to move from active to inactive, and active to free lists. This behaviour is only triggered out of direct reclaim when higher order pages have been requested.

This patch set is particularly effective when utilised with an anti-fragmentation scheme which groups pages of similar reclaimability together.

This patch set (against 2.6.19-rc5-mm2) is based on Peter Zijlstra's lumpy reclaim V2 patch which forms the foundation. It comprises the following patches:

lumpy-reclaim-v2 -- Peter Zijlstra's lumpy reclaim prototype,

lumpy-cleanup-a-missplaced-comment-and-simplify-some-code -- cleanups to move a comment back to where it came from, to make the area edge selection more comprehensible and also cleans up the switch coding style to match the concensus in mm/*.c,

lumpy-ensure-we-respect-zone-boundaries -- bug fix to ensure we do not attempt to take pages from adjacent zones, and

lumpy-take-the-other-active-inactive-pages-in-the-area -- patch to increase aggression over the targetted order.

Testing of this patch set under high fragmentation high allocation load conditions shows significantly improved high order reclaim rates than a standard kernel. The stack here it is now within 5% of the best case linear-reclaim figures.

It would be interesting to see if this setup is also successful in reducing order-2 allocation failures that you have been seeing with jumbo frames.

Please consider for -mm.