Storage Subsystem Analysis

Storage Analysis

Storage (or memory) is one of the most important resources in the system. Understanding storage utilization is critical, and utilization depends greatly on the workload. Storage is defined differently for traditional CMS workloads and for Linux workloads. There are three types of storage on the z/VM system:

Basic Storage Management:

For more information about IBR, see IBR Analysis.

Paging happens when storage (memory) is not available and data has to be written to disk. Keeping data in storage is much more efficient than going to disk and improves response times, so heavy paging can cause severe system performance problems. The amount of total storage, the size of the Dynamic Paging Area (DPA), and the size and type of paging disks all affect how efficiently the system performs. One of the top reasons a first-time installation has a z/VM outage is a lack of proper page space planning: if the system runs out of page space, it takes a PGT004 abend. If the system starts paging at a high rate, check ESASTR1 to see what changed.

For more information about paging, see System Page analysis.

Linux and Storage:

For a presentation on the Storage/Paging environment and utilization, see z/VM Storage Analysis and Tuning.

Helpful configuration settings:

The storage subsystem analysis should be done top down:

Helpful ESAMON screens/ESAMAP reports:

Using zVPS to find information for solving issues with the storage/paging level:


  • ESASTRC - Shows storage configuration information.


  • Offline storage - This shows whether there are main storage frames that are not online. If this number is not zero, investigation is needed.

  • ESASTR1 - Shows storage analysis information. This shows all of the system storage with 99% accuracy. If the system is paging, look here first.


  • System Storage - This shows the amount of storage in the DPA area.
  • Available <2gb - This shows the amount of storage available below the 2GB line. Many critical z/VM system tasks need to use storage below the 2GB line. Watch for large depletions. If available storage is consistently high, the system may be over-configured; in that case the storage could be used by other workloads or moved to another LPAR.
  • Systm ExSpc - This is system execution space allocated (CP's virtual address space). Look for any large fluctuations. Fluctuations could be caused by multiple large virtual machines logging on. To drill down - use ESASXS screen.
  • User Resdnt - This shows the megabytes of user resident storage. This should be the major use. Look for any large fluctuations. To drill down - use ESAUSPG/ESAUSP2 screens.
  • NSS/DCSS Resident - This shows the megabytes of NSS and DCSS resident in storage. Again, look for large fluctuations. To drill down - use ESADCSS screen.
  • AddSpace - This shows the System and User owned resident space, which contains page tables. To drill down - use ESAASPC screen. This will go up if a server with large storage (say 2G-5G) logs on and the system is not set up to handle it.
  • VDISK Rsdnt - This shows the virtual disk storage in use. To drill down - use ESAVDSK/ESAUSPG screens.
  • MDC Rsdnt - This shows the minidisk cache resident storage. To drill down - use ESAMDC screen. The MDC limits should always be set (see setting recommendations above).

  • ESASTR1 Example - This shows an example of when a large server logged on and caused a storage problem.


  • User Resdnt - User Resident storage went down
  • NSS/DCSS Resident - Storage used for NSS/DCSS storage went down.
  • AddSpace Systm - System address space storage went up. User page tables live in this area. Since a very large user was logging on, many page tables had to be built. Storage was stolen from other areas to compensate.
  • VDISK Rsdnt - Storage for VDISK went down.
  • MDC Rsdnt - Storage for MiniDisk Cache went down.
  • It was very easy to see where the storage went and why the system would start to behave poorly. With this information, go to ESAUSP2 to see which user/server just logged on (which is what happened in this case), or to ESALNXP to see which Linux process may have ramped up (such as a database server), which is also a common issue.
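The comparison described in the example above can be sketched as a simple diff between two intervals. This is only an illustration: the category names mirror the ESASTR1 fields, but the megabyte values are invented and the function `storage_deltas` is hypothetical, not part of zVPS.

```python
# Hypothetical ESASTR1-style snapshots: resident megabytes per storage category.
# The values are invented for illustration; they are not taken from a real screen.
before = {
    "User Resdnt": 6000,
    "NSS/DCSS Resident": 300,
    "AddSpace Systm": 200,
    "VDISK Rsdnt": 400,
    "MDC Rsdnt": 500,
}
after = {
    "User Resdnt": 5200,
    "NSS/DCSS Resident": 250,
    "AddSpace Systm": 1500,
    "VDISK Rsdnt": 300,
    "MDC Rsdnt": 350,
}


def storage_deltas(before, after):
    """Return (category, change_mb) pairs, largest absolute change first."""
    deltas = {name: after[name] - before[name] for name in before if name in after}
    return sorted(deltas.items(), key=lambda item: abs(item[1]), reverse=True)


for name, change in storage_deltas(before, after):
    print(f"{name:<18} {change:+6d} MB")
```

The category at the top of the output is the one that took the storage; the negative entries show which areas it was stolen from.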

  • ESASTR2 - Shows additional storage analysis information.


  • Avail List Empty/Sec - This shows times the available list was empty. This means there were no pages in storage that were available to satisfy new storage requirements.
  • Demand Scans - This shows the number of times a demand scan could not obtain sufficient page frames. Both Avail List Empty/Sec and Demand Scans can indicate storage problems.
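As a rough sketch of how these two fields might be screened, the snippet below flags intervals where either counter is nonzero. Treating any nonzero value as worth a look is an assumption made here, not a documented threshold, and the sample records and field names (`avail_empty_per_sec`, `demand_scans`) are hypothetical.

```python
# Hypothetical per-interval samples modeled on the two ESASTR2 fields above.
samples = [
    {"time": "09:00", "avail_empty_per_sec": 0.0, "demand_scans": 0},
    {"time": "09:01", "avail_empty_per_sec": 2.5, "demand_scans": 0},
    {"time": "09:02", "avail_empty_per_sec": 0.0, "demand_scans": 3},
]


def constrained_intervals(samples):
    """Return the timestamps where either counter is nonzero (assumed threshold)."""
    return [
        s["time"]
        for s in samples
        if s["avail_empty_per_sec"] > 0 or s["demand_scans"] > 0
    ]


print(constrained_intervals(samples))
```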

  • ESAXACT - Transaction delay analysis. This can show if users are waiting on paging operations. Helpful storage/paging information:


  • Pag - This shows the percentage of time a user is in page wait. A high value would indicate a storage issue.

  • ESAUSPG - Shows user storage information. Both screen and report samples:


  • Total >2GB <2GB - This shows total storage above and below the 2GB line by group, then by members of the group. (This is user storage only, so it will show less than ESASTR1.) All of the Total storage is backed by z/VM in real storage. This is helpful to show which users are consuming the most storage (especially <2G).
  • Paged out - This shows the amount of storage that is paged out, which means the data had to be written out to disk. Look for large fluctuations in these numbers.
  • VirtDisk - This is storage resident in virtual disk address spaces. Look for spikes, which mean that swap is being used more. (This is only a problem if the system is storage constrained.) If this number goes up, the workload may be growing and a server may need to be reconfigured, or a workload may be out of control. It is advantageous to get a baseline for the usual amount of VirtDisk usage, then set an alert if the number jumps significantly.
  • Locked MegaBytes - This shows locked storage in MB both above and below 2GB. If the combined number is over 4000, there are too many pages locked. Use reserved pages instead.
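The two ESAUSPG checks above (the 4000 MB locked-storage limit and a VirtDisk alert against a baseline) could be sketched as follows. The 4000 MB limit comes from the text; the use of a median baseline and the 1.5x jump factor are assumptions chosen for illustration, and the function names are hypothetical.

```python
import statistics

LOCKED_LIMIT_MB = 4000  # combined limit stated in the text above


def locked_too_high(locked_above_mb, locked_below_mb):
    """True when combined locked storage exceeds the 4000 MB guideline."""
    return locked_above_mb + locked_below_mb > LOCKED_LIMIT_MB


def virtdisk_alert(current_mb, baseline_samples_mb, jump_factor=1.5):
    """True when VirtDisk usage exceeds the baseline median by jump_factor.

    jump_factor is an assumed value; tune it to what counts as a
    significant jump in a given installation.
    """
    baseline = statistics.median(baseline_samples_mb)
    return current_mb > baseline * jump_factor


print(locked_too_high(3500, 600))
print(virtdisk_alert(400, [200, 210, 190, 205]))
```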

  • Conclusions

    Review the options for reducing storage requirements before analyzing or enhancing the paging subsystem. Storage requirements can often be reduced enough that paging requirements drop significantly, in which case any time spent on the paging subsystem would be wasted. Spooling rarely impacts performance. Have a sufficient number of spool volumes to keep each device less than 20% busy during peak periods.
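    The spool guideline above can be checked with a few lines of code. This is a minimal sketch: the 20% limit is from the text, while the volume names, the busy percentages, and the function `busy_spool_volumes` are hypothetical.

```python
SPOOL_BUSY_LIMIT_PCT = 20.0  # peak device-busy guideline from the text

# Hypothetical peak-period device busy percentages per spool volume.
peak_busy_pct = {"SPOOL1": 12.0, "SPOOL2": 27.5, "SPOOL3": 20.0}


def busy_spool_volumes(peak_busy_pct):
    """Return the volumes that are not below the 20% busy guideline."""
    return sorted(
        volume
        for volume, pct in peak_busy_pct.items()
        if pct >= SPOOL_BUSY_LIMIT_PCT
    )


print(busy_spool_volumes(peak_busy_pct))
```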

