Test Case 1
Regression testing for memory fragmentation.
Last updated: 10/10/05

Issues to detect:
1- Overall memory fragmentation that significantly affects performance by increasing the time cost of memory allocation, or that will eventually cause memory allocations to fail.
2- Kernel memory pools (e.g. slab cache) that become fragmented even if the rest of memory is not.

Background:
The memory infrastructure changes that had to be made before any hotplug memory patches could be accepted also had the effect of reducing memory fragmentation (or should have). So, as a regression test, we want to check that memory does not get fragmented.

Unfortunately, it can take a very long time for memory to become fragmented, and it is hard to say exactly what workload will cause it. Another challenge is how to know that memory is becoming fragmented before memory requests start being denied, an event which is not necessarily due to fragmentation in the first place. One possible indication is the time a memory allocation call takes to complete, as that should increase as fragmentation increases. Better yet would be some way of measuring fragmentation directly: a tool, perhaps, or data that could be collected and analyzed after a test.

Besides the reduction in fragmentation expected from the infrastructure changes, there are patches that directly defragment memory. The developers of those patches likely have a way to test the effectiveness of their code, so we hope to leverage the tests they have (and the measures they use) to detect any regressions that may occur.

Finally, as memory is not all created equal, we need to investigate ways to examine pools of memory, like the slab cache, to be sure that memory of a particular type is not fragmented. For kernel structures like the slab cache, this will likely require writing a test specific to the slab cache and instrumenting the kernel to gather the information we need to determine the state of fragmentation.

Measuring Fragmentation:
There may be less invasive ways to approach the testing than instrumenting the kernel.

Slab fragmentation can be measured by looking at the first two numeric columns of /proc/slabinfo (the columns that follow the cache name). The first is the number of objects that have been handed out by the slab. The second is the total number of objects for which the slab already has memory allocated, whether or not they have been handed out. The first column can never be larger than the second. The farther apart the values in the two columns, the more slab fragmentation you have.

Measuring fragmentation of the buddy allocator requires a different method: look at /proc/buddyinfo. After the node and zone name, the first column is the number of free blocks of size (2^0 * PAGE_SIZE). The second column is the number of free blocks of size (2^1 * PAGE_SIZE). The columns continue all the way up to order MAX_ORDER-1. Force the system to free up as many pages as it can, then look at buddyinfo again. A perfectly non-fragmented system will have no column with a value greater than 1 (except the last one, since blocks of the highest order cannot be merged any further). Any other column with a larger value is an indicator of fragmentation. Larger numbers in columns farther to the left indicate higher levels of fragmentation.

There are a couple of ways to force the system to free up pages, but the following is probably one of the most realistic: write an application that allocates an amount of memory close to the total amount resident in the system and repeatedly touches it. Then, just kill the application.

Test Plan.
TBD
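
As a possible starting point, the measurements described under Measuring Fragmentation could be collected with a small script along the following lines. This is only a sketch under stated assumptions: the function names (slab_usage, slab_unused_fraction, buddy_free_blocks, suspect_orders, touch_memory) are hypothetical helpers invented for illustration, the parsers assume the usual text layout of /proc/slabinfo and /proc/buddyinfo, and skipping the highest order in suspect_orders is an assumption based on the fact that blocks of that order cannot merge further. The touch_memory helper takes the size as a parameter; a real run would pass a size close to total memory and kill the process rather than let it return.

```python
import mmap


def slab_usage(slabinfo_text):
    """Return {cache_name: (active_objs, num_objs)} from /proc/slabinfo text."""
    usage = {}
    for line in slabinfo_text.splitlines():
        # Skip blank lines, the version banner and the '#' column-header line.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        # Assumed layout: name, then active_objs, then num_objs.
        usage[fields[0]] = (int(fields[1]), int(fields[2]))
    return usage


def slab_unused_fraction(active_objs, num_objs):
    """Fraction of a cache's objects that have memory but are not handed out."""
    if num_objs == 0:
        return 0.0
    return (num_objs - active_objs) / num_objs


def buddy_free_blocks(buddyinfo_text):
    """Return {(node, zone): [free block count per order, order 0 first]}."""
    result = {}
    for line in buddyinfo_text.splitlines():
        fields = line.split()
        # Assumed layout: "Node 0, zone DMA 3 1 0 2 5"
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        node = int(fields[1].rstrip(","))
        result[(node, fields[3])] = [int(f) for f in fields[4:]]
    return result


def suspect_orders(counts):
    """Orders below the highest one that have more than one free block."""
    # The last entry is skipped: highest-order blocks cannot merge further.
    return [order for order, n in enumerate(counts[:-1]) if n > 1]


def touch_memory(n_bytes, passes=2):
    """Allocate n_bytes of anonymous memory and write to every page.

    Returns the number of pages touched per pass.
    """
    buf = mmap.mmap(-1, n_bytes)
    offsets = range(0, n_bytes, mmap.PAGESIZE)
    try:
        for _ in range(passes):
            for off in offsets:
                buf[off] = 1  # one write per page forces it to be resident
    finally:
        buf.close()
    return len(offsets)
```

On a test machine the parsers would be fed the live files, for example slab_usage(open("/proc/slabinfo").read()), which may require root privileges, and buddy_free_blocks(open("/proc/buddyinfo").read()) taken before and after the memory-touching run.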