Part 1, Memory overview and tuning memory parameters (AIX 7)

Part one of a three-part series

by Ken Milberg, Martin Brown | Published November 2, 2010

Introduction

As a systems administrator, you should already be familiar with the basics of memory, such as the differences between physical and virtual memory. What you might not fully understand is how the Virtual Memory Manager (VMM) works in AIX® 7 and how it relates to performance tuning. In AIX 7, it is also worth considering how virtual memory is used and applied within workload partitions (WPAR). This article looks at some of the monitoring tools you can use to tune your systems and outlines some of the more important AIX 7 memory management mechanisms, including how the VMM operates and the effects of the dynamic variable page size. Applying these enhancements to your systems environment can optimize memory performance on your box.

While you might find tuning your memory to be more difficult than other subsystems, the reward is often greater. Depending on the type of system you are running, there might also be specific tuning recommendations that should be set on your systems. To help validate the findings, let’s use a specific example and discuss some best practices for setting these parameters. Tuning one or two parameters on the fly, in some cases, can make a significant difference in the overall performance of your system.

One thing does not change, regardless of which subsystem you are looking to tune: you should always think of tuning as an ongoing process. The best time to start monitoring your systems is when you have first put your system in production, and it is running well (rather than when your users are screaming about slow performance). You can never really be sure if there is a problem without a real baseline of what the system looked like when it was behaving normally. Further, only one change should be made at a time, and data should be captured and analyzed as quickly as possible after that change to determine what difference, if any, the change really made.

Memory overview

This section gives an overview of memory as it relates to AIX 7. We discuss how AIX 7 uses virtual memory to address more memory than is physically on your system. We also explain how the Virtual Memory Manager (VMM) actually works and how it services requests.

Any discussion of memory and AIX 7 must start with a description of the VMM. AIX newbies are sometimes surprised to hear that the VMM services all memory requests from the system, not just virtual memory. When RAM is accessed, the VMM needs to allocate space, even when there is plenty of physical memory left on the system. It implements a process of early allocation of paging space. Using this method, the VMM plays a vital role in helping manage real memory, not just virtual memory.

Here is how it works. In AIX 7, all virtual memory segments are partitioned into pages. The default page size is 4KB, but other sizes are available depending on the processor environment being used. POWER5+ or later can also use 64KB, 16MB, and 16GB page sizes. The POWER4 architecture can also support the 16MB page size. The 16MB page size is known as large and the 16GB page size as huge; both have use cases for very large memory applications.

For POWER6, variable page size support (VPSS) was introduced, which means that the system will use larger pages as the application requests larger chunks of memory. The different page sizes can be mixed within the OS concurrently, with different applications making use of different page sizes. In addition, pages can be dynamically resized from 4KB to 64KB by collecting groups of 4KB pages into 64KB pages. This improves performance by allowing the application to access the memory in larger single chunks, instead of many smaller chunks. Tuning of VPSS can be managed using the vmo tuning tool.

Allocated pages can be either RAM or paging space (virtual memory stored on disk). VMM also maintains what is referred to as a free list, which is defined as unallocated page frames. These are used to satisfy page faults. There is usually a very small number of unallocated page frames (which you configure), and the VMM replenishes them by freeing up space and reassigning page frames. The virtual memory pages whose page frames are to be reassigned are selected using the VMM’s page replacement algorithm. This paging algorithm determines which virtual memory pages currently in RAM ultimately have their page frames returned to the free list. AIX 7 uses all available memory, except the memory that is configured to remain unallocated, which is the free list.

To reiterate, the purpose of VMM is to manage the allocation of both RAM and virtual pages. From here, you can determine that its objectives are to minimize the response time of page faults and to decrease the use of virtual memory where it can. Obviously, given the choice between RAM and paging space, most people would prefer to use physical memory, if the RAM is available. What VMM also does is classify virtual memory segments into two distinct categories: working segments, which use computational memory, and persistent segments, which use file memory. It is extremely important to understand the distinction between the two, as this helps you tune your systems to their optimum capabilities.

Computational memory

Computational memory is used while your processes are actually working on computing information. These working segments are temporary (transitory) and only exist up until the time a process terminates or the page is stolen. They have no real permanent disk storage location. When a process terminates, both the physical memory and the paging space it used are released in many cases. When there is a large spike in available pages, you can actually see this happening while monitoring your system. When free physical memory starts getting low, pages belonging to programs that have not been used recently are moved from RAM to paging space to help release physical memory for more real work.

File memory

File memory (unlike computational memory) uses persistent segments and has a permanent storage location on the disk. Data files or executable programs are mapped to persistent segments rather than working segments. The data files can reside on filesystems such as JFS, JFS2, or NFS. They remain in memory until the filesystem is unmounted, a page is stolen, or a file is unlinked. After the data file is copied into RAM, VMM controls when these pages are overwritten or used to store other data. Given the alternative, most people would much rather have file memory paged to disk than computational memory.

When a process references a page that is on disk, the page must be paged in, which could cause other pages to be paged out again. VMM is constantly working in the background trying to steal frames that have not been recently referenced, using the page replacement algorithm discussed earlier. It also helps detect thrashing, which can occur when memory is extremely low and pages are constantly being paged in and out to support processing. VMM has a memory load control algorithm, which can detect if the system is thrashing and tries to remedy the situation. Unchecked thrashing can literally cause a system to come to a standstill, as the kernel becomes more concerned with making room for pages than with actually doing anything productive.

Active memory expansion

In addition to the core memory settings and environment, AIX 7 can take advantage of the power of the POWER7 CPU to provide active memory expansion (AME).

AME compresses data within memory, allowing you to keep more data in memory and reduce the amount of page swapping to disk as data is loaded. The configuration of AME is based on individual LPARs, so you can enable it for your database partition and keep more data read from disk in memory, but disable it for web servers, where the information stored in memory is swapped regularly.

To prevent all information being compressed, memory is split into two pools, a compressed pool and an uncompressed pool. AIX 7 automatically adjusts the size of the two pools according to the workload and configuration of the logical partition. The compression amount is defined using a compression ratio. For example, if your LPAR has been granted 2048MB, you can specify a compression ratio of 2.0 and be given an effective memory capacity of 4096MB.
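The arithmetic behind the compression ratio can be sketched in a few lines of Python. This is an illustration only, not an AIX interface; the function name expanded_memory_mb is made up for this example.

```python
def expanded_memory_mb(true_memory_mb: float, compression_ratio: float) -> float:
    """Effective memory an LPAR sees if the requested ratio is achieved.

    Hypothetical helper for illustration; AIX exposes no such function.
    """
    if compression_ratio < 1.0:
        raise ValueError("compression ratio must be at least 1.0")
    return true_memory_mb * compression_ratio


# The example from the text: 2048MB granted, ratio of 2.0.
print(expanded_memory_mb(2048, 2.0))  # 4096.0
```

The result is only a target. Whether the workload can actually be compressed that far is what amepat helps you estimate.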

Because different applications and environments are capable of different compression ratios (for example, heavy text applications may benefit from higher ratios), you can use the amepat command to monitor and determine the possible compression ratio with your given workload.

You should run amepat with a given interval (in minutes) and the number of iterations, while you run your normal applications in the background to collect the necessary information. This will lead to a recommendation for the compression ratio to be used within the LPAR. You can see a sample of this in Listing 1.

Listing 1. Getting Active Memory Expansion statistics


Command Invoked                : amepat 1 1 

Date/Time of invocation        : Fri Aug 13 11:43:45 CDT 2010
Total Monitored time           : 1 mins 5 secs
Total Samples Collected        : 1

System Configuration:
‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑
Partition Name                 : l488pp065_pub
Processor Implementation Mode  : POWER7
Number Of Logical CPUs         : 4
Processor Entitled Capacity    : 0.25
Processor Max. Capacity        : 1.00
True Memory                    : 2.00 GB
SMT Threads                    : 4
Shared Processor Mode          : Enabled‑Uncapped
Active Memory Sharing          : Disabled
Active Memory Expansion        : Disabled

System Resource Statistics:               Current
‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑          ‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑
CPU Util (Phys. Processors)               0.04   4%
Virtual Memory Size (MB)                  1628  79%
True Memory In-Use (MB)                   1895  93%
Pinned Memory (MB)                        1285  63%
File Cache Size (MB)                       243  12%
Available Memory (MB)                      337  16%

Active Memory Expansion Modeled Statistics         :
‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑
Modeled Expanded Memory Size :   2.00 GB
Achievable Compression ratio   :2.10

Expansion    Modeled True      Modeled              CPU Usage  
Factor       Memory Size       Memory Gain          Estimate     
‑‑‑‑‑‑‑‑‑    ‑‑‑‑‑‑‑‑‑‑‑‑‑     ‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑   ‑‑‑‑‑‑‑‑‑‑‑     
     1.00          2.00 GB         0.00 KB [  0%]   0.00   0%
     1.14          1.75 GB       256.00 MB [ 14%]   0.00   0%

Active Memory Expansion Recommendation:
‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑‑
The recommended AME configuration for this workload is to configure the LPAR
with a memory size of 1.75 GB and to configure a memory expansion factor
of 1.14.  This will result in a memory gain of 14%. With this 
configuration, the estimated CPU usage due to AME is approximately 0.00 
physical processors, and the estimated overall peak CPU resource required for
the LPAR is 0.04 physical processors.

NOTE: amepat's recommendations are based on the workload's utilization level
during the monitored period. If there is a change in the workload's utilization
level or a change in workload itself, amepat should be run again.

The modeled Active Memory Expansion CPU usage reported by amepat is just an estimate. The actual CPU usage used for AME may be lower or higher depending on the workload.
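You can sanity-check the figures in Listing 1 with a little arithmetic. The sketch below, in Python, recomputes the memory gain and expansion factor from the modeled sizes in the listing. The function name memory_gain is an invention for this example, and the formulas are inferred from the numbers shown, not taken from the amepat documentation.

```python
MB_PER_GB = 1024


def memory_gain(modeled_true_gb: float, expanded_gb: float) -> tuple[float, float]:
    """Return (gain in MB, gain as a percentage of the modeled true memory)."""
    gain_mb = (expanded_gb - modeled_true_gb) * MB_PER_GB
    gain_pct = gain_mb / (modeled_true_gb * MB_PER_GB) * 100
    return gain_mb, gain_pct


# Values from Listing 1: 1.75 GB modeled true memory, 2.00 GB expanded size.
gain_mb, gain_pct = memory_gain(1.75, 2.00)
print(f"{gain_mb:.2f} MB, {gain_pct:.0f}%")      # 256.00 MB, 14%
print(f"expansion factor {2.00 / 1.75:.2f}")     # expansion factor 1.14
```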

You can monitor the current compression within a configured LPAR using the svmon tool, as shown here in Listing 2.

Listing 2. Using svmon to get compression stats


# svmon -G -O summary=longame,unit=MB
Unit: MB
Active Memory Expansion
--------------------------------------------------------------------
Size     Inuse   Free    DXMSz   UCMInuse  CMInuse  TMSz    TMFr
1024.00  607.91  142.82  274.96  388.56    219.35   512.00  17.4
CPSz     CPFr    txf     cxf     CR
106.07   18.7    2.00    1.46    2.50

The DXMSz column is the important one here, as it shows the deficit in expanded memory. Deficits occur when the specified compression ratio cannot be achieved, so the system cannot create all of the expanded memory it has promised through compression. Therefore, you need to be careful about overspecifying the compression ratio.
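The numbers in Listing 2 appear to fit together as follows: the expanded size minus the deficit, divided by the true memory size (TMSz), gives the current expansion factor (cxf), while the expanded size divided by TMSz gives the target factor (txf). This relationship is inferred from the sample output and is an assumption, not a documented svmon formula. A Python sketch with a made-up helper name:

```python
def expansion_factor(expanded_mb: float, true_mb: float, deficit_mb: float = 0.0) -> float:
    """Expansion factor actually delivered once the deficit is subtracted.

    Hypothetical helper; the formula is inferred from the sample output.
    """
    return (expanded_mb - deficit_mb) / true_mb


# Values from Listing 2: Size=1024.00, DXMSz=274.96, TMSz=512.00.
target = expansion_factor(1024.00, 512.00)
current = expansion_factor(1024.00, 512.00, deficit_mb=274.96)
print(f"target {target:.2f}, current {current:.2f}")  # target 2.00, current 1.46
```

A current factor well below the target is the sign that the configured ratio is too ambitious for the workload.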

One other artifact of AME is that the memory sizes displayed by most tools, including vmstat and others, typically show the expanded memory size (that is, the configured memory multiplied by the compression ratio), not the actual memory size. Look for the true memory size in the output of different tools to determine the actual memory available without compression.

Tuning

Let’s examine the tools that allow you to tune the VMM to optimize performance for your system. This section works through an example environment, the methodology you would use to tune it, and some of the key parameters you need to be aware of.

In AIX 7, the vmo tool is responsible for all of the configuration of the tunable parameters of the VMM system. This replaces the old vmtune tool available in AIX 5.

Altering the page size provides the most immediate performance improvement. The gain comes from a reduction in Translation Lookaside Buffer (TLB) misses, because each TLB entry can now map a much larger virtual memory range. For example, high performance computing (HPC) applications or an Oracle® database, whether running Online Transaction Processing (OLTP) or a Data Warehouse application, can benefit from large pages. This is because Oracle uses a lot of virtual memory, particularly for its System Global Area (SGA), which is used to cache table data, among other things.
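To see why larger pages reduce TLB misses, consider how much memory a fixed number of TLB entries can cover. The Python sketch below is a rough illustration; the entry count of 1024 is a hypothetical figure chosen for the arithmetic, not the TLB size of any particular POWER processor.

```python
KB = 1024
MB = 1024 * KB


def tlb_reach_bytes(entries: int, page_size_bytes: int) -> int:
    """Memory that can be mapped without a TLB miss (one page per entry)."""
    return entries * page_size_bytes


TLB_ENTRIES = 1024  # hypothetical entry count, for illustration only

for label, page_size in (("4KB", 4 * KB), ("64KB", 64 * KB), ("16MB", 16 * MB)):
    reach_mb = tlb_reach_bytes(TLB_ENTRIES, page_size) // MB
    print(f"{label:>4} pages: {reach_mb} MB covered")
```

With these assumed numbers, the same TLB covers 4 MB with 4KB pages, 64 MB with 64KB pages, and 16384 MB with 16MB pages, which is why a large SGA benefits so much.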

The command in Listing 3 sets the large page size to 16777216 bytes (16MB) and reserves 128 large pages.

Listing 3. Allocating bytes

                
# vmo -r -o lgpg_size=16777216 -o lgpg_regions=128
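Before running a command like this, it helps to work out how much memory the large page pool will take. A small Python sketch of the arithmetic, using the two values from Listing 3 (the function name is made up for this example):

```python
MB = 1024 * 1024


def large_page_pool_bytes(lgpg_size: int, lgpg_regions: int) -> int:
    """Total memory set aside for large pages: page size times page count."""
    return lgpg_size * lgpg_regions


lgpg_size = 16777216   # bytes per large page, from Listing 3
lgpg_regions = 128     # number of large pages, from Listing 3

print(lgpg_size // MB)                                       # 16
print(large_page_pool_bytes(lgpg_size, lgpg_regions) // MB)  # 2048
```

In other words, the listing sets aside 2048MB for large pages, so size the pool against what the application will actually use.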

If you want to use large pages in combination with shared memory, which is often used in HPC and database applications, you will also need to set the v_pinshm value: # vmo -p -o v_pinshm=1.

The most important vmo settings are minperm and maxperm. Setting these parameters appropriately ensures that your system is tuned to favor either computational memory or file memory. In most cases, you do not want to page working segments, as doing so causes your system to page unnecessarily and decreases performance. The way it worked in the past was actually quite simple: If the percentage of file pages (numperm%) was greater than maxperm%, then page replacement would steal only file pages. If it fell below minperm%, page replacement could steal both file and computational pages. If it was between the two, then page replacement would steal only file pages, unless the number of file repages was greater than the number of computational repages. Another way of looking at this is that if your numperm is greater than maxperm, then you start to steal from persistent storage. Based on this methodology, the old approach to tuning your minperm and maxperm parameters was to bring maxperm down to less than 20 and minperm to 10 or less. This is how you would have normally tuned your database server.

That has all changed. The new approach sets maxperm to a high value (for example, >80) and makes sure the lru_file_repage parameter is set to 0. lru_file_repage was first introduced in AIX Version 5.2 with ML4 and on ML1 of AIX Version 5.3. This parameter indicates whether or not the VMM repage counts should be considered and what type of memory it should steal. On those releases the default setting is 1, so you need to change it. When you set the parameter to 0, it tells VMM that you prefer that it steal only file pages rather than computational pages. This can change if your numperm is less than the minperm or greater than the maxperm, which is why you would now want maxperm to be high and minperm to be low. Let’s not lose sight of the fact that the primary reason you need this value tuned is that you want to protect the computational memory. Getting back to the example, Oracle uses its own cache, and caching the same data again in the AIX 7 file cache only duplicates work, so you want to avoid it. If you were to reduce maxperm in this scenario, you would make the mistake of also preventing the caching of the application programs that are running.
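The decision rules described above can be summarized in a short Python sketch. It is a simplified model of the behavior as this article describes it, not the actual VMM algorithm; the function name and return strings are made up for illustration.

```python
def pages_to_steal(numperm: float, minperm: float, maxperm: float,
                   file_repages: int, comp_repages: int,
                   lru_file_repage: int = 1) -> str:
    """Simplified model of which page types page replacement targets.

    numperm, minperm, and maxperm are percentages of real memory.
    """
    if numperm > maxperm:
        return "file"                    # too much file cache: steal file pages only
    if numperm < minperm:
        return "file+computational"      # file cache already small: steal from both
    # Between minperm and maxperm, repage counts matter only when
    # lru_file_repage is 1.
    if lru_file_repage == 1 and file_repages > comp_repages:
        return "file+computational"
    return "file"


# Old-style tuning (maxperm 20, minperm 10) with 30% file pages.
print(pages_to_steal(30, 10, 20, file_repages=100, comp_repages=10))  # file

# New-style tuning (minperm 5, maxperm 90) with 40% file pages.
print(pages_to_steal(40, 5, 90, 100, 10, lru_file_repage=1))  # file+computational
print(pages_to_steal(40, 5, 90, 100, 10, lru_file_repage=0))  # file
```

The last two calls show the point of the newer approach: with a wide gap between minperm and maxperm and lru_file_repage set to 0, computational pages are protected across almost the whole range.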

Listing 4 sets these critical tuning parameters.

Listing 4. Setting tuning parameters

                
vmo -p -o minperm%=5
vmo -p -o maxperm%=90
vmo -p -o maxclient%=90

Although you used to have to change these parameters, you now leave strict_maxperm and strict_maxclient at their default values. If strict_maxperm were changed to 1, it would place a hard limit on the amount of memory that could be used for persistent file cache. This is done by making the maxperm value the upper limit for the cache. These days it is unnecessary, because changing the lru_file_repage parameter is a far more effective way of tuning, as you would prefer not to use AIX 7 file caching.

Two other important parameters worth noting here are minfree and maxfree. If the number of pages on your free list falls below the minfree parameter, VMM starts to steal pages (just to add to the free list), which is not good. It continues to do this until the free list has at least the number of pages in the maxfree parameter.

In older versions of AIX when the default minfree was set at 120, you would commonly see your free list at 120 or lower, which led to more paging than was necessary, and worse, threads needing free frames were actually getting blocked because the value would be so low. To address this issue, the default values of minfree and maxfree were bumped up in AIX Version 5.3 to 960 and 1088, respectively. If you are running AIX Version 5.2 or lower, we recommend these settings, which you can manually change using the commands in Listing 5.

Listing 5. Setting the minfree and maxfree parameters manually

                
vmo -p -o minfree=960
vmo -p -o maxfree=1088
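The interaction between minfree and maxfree described above works like a low-water and high-water mark. The Python sketch below models that behavior with the values from Listing 5; it is a simplified illustration with a made-up function name, not the VMM's actual implementation.

```python
def frames_to_steal(free_frames: int, minfree: int = 960, maxfree: int = 1088) -> int:
    """Frames page replacement must reclaim, per the rule in the text.

    Stealing starts only when the free list drops below minfree and
    continues until the free list holds at least maxfree frames.
    """
    if free_frames >= minfree:
        return 0
    return maxfree - free_frames


print(frames_to_steal(1000))  # 0
print(frames_to_steal(900))   # 188
print(frames_to_steal(100, minfree=120, maxfree=128))  # 28
```

The gap between the two values determines how much work each stealing pass does once it is triggered.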

Configuring variable page size support

VPSS works by using the default 4KB page size. Once the application has been allocated 16 4KB blocks, assuming all the blocks are in current use, they are promoted to be a single 64KB block. This process is repeated for as many 16-count sequences of 4KB blocks as the application is using.

Two configurable parameters control how VPSS operates. The first simply enables multiple page size support. The vmm_mpsize_support tunable, configured with vmo, sets how the VMM operates. A value of 0 indicates that only the 4KB and 16MB page sizes are supported. A value of 1 allows the VMM to use all the page sizes supported by the processor. A value of 2 allows the VMM to use multiple page sizes per segment and is the default for all new installations.

With the multiple page size support enabled, the vmm_default_pspa parameter controls how many sequential pages are required for the smaller 4KB pages to be promoted to the 64KB page size. Some applications, particularly those that use a lot of memory, may perform better with a 64KB page size even though they don’t use full 64KB pages.

In this case, you can use the vmm_default_pspa parameter to specify that fewer than 16 4KB pages are required for promotion, expressed as a percentage. The default value of 0 indicates that all 16 pages are required. A value of 50 indicates that only 8 pages are required. A value of 100 has the effect of promoting all 4KB pages to 64KB pages.
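The promotion rule can be modeled in a few lines of Python. The sketch reads vmm_default_pspa as a percentage between 0 and 100, where 0 requires all 16 pages and 50 requires 8. The function names, the linear scaling between those two points, and the integer rounding are assumptions made for illustration, not documented VMM behavior.

```python
SMALL_PAGES_PER_64KB = 16  # sixteen 4KB pages make up one 64KB page


def pages_required(pspa: int) -> int:
    """4KB pages that must be in use before a 64KB group is promoted."""
    if not 0 <= pspa <= 100:
        raise ValueError("this sketch only models values from 0 to 100")
    return SMALL_PAGES_PER_64KB * (100 - pspa) // 100


def promotable_groups(in_use_pages: set[int], pspa: int = 0) -> list[int]:
    """Return the 64KB-aligned groups with enough 4KB pages in use."""
    needed = pages_required(pspa)
    counts: dict[int, int] = {}
    for page in in_use_pages:
        group = page // SMALL_PAGES_PER_64KB
        counts[group] = counts.get(group, 0) + 1
    return sorted(g for g, n in counts.items() if n >= needed)


# Group 0 is fully used (pages 0-15); group 1 is half used (pages 16-23).
in_use = set(range(16)) | set(range(16, 24))
print(promotable_groups(in_use, pspa=0))   # [0]
print(promotable_groups(in_use, pspa=50))  # [0, 1]
```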

Summary

As discussed, before you tune or even start monitoring AIX 7, you must establish a baseline. After you tune, you must capture data and analyze the results of your changes. Without this type of information, you never really understand the true impact of tuning. In Part 1 of this series, we covered how the VMM works and, where appropriate, the effect of using AME to squeeze more memory out of your systems. You also tuned an Oracle system to optimize utilization of the memory subsystem. You examined some important kernel parameters, what they do, and how to tune them, including how to make the best use of the variable page size support.

Part 2 focuses much more on the detail of systems monitoring for the purposes of determining memory bottlenecks, along with analyzing trends and results. Part 3 focuses primarily on swap space and other methods to tune your VMM to maximize performance.