Velocity Software, Inc. is recognized as
a leader in the performance measurement of z/VM and Linux on z.
The Velocity Performance Suite consists of a set of tools that
enable installations running z/VM to manage Linux and z/VM
performance.
In addition, many components of server farms can be
measured and analyzed. Performance data can be viewed in real time
using either a 3270 session or a browser.
The CLOUD Implementation (zPRO) component is designed
for full cloud PaaS implementation as well as
to extend the capabilities of the z/VM sysprog (system programmer)
to the browser world. This feature moves system management to
the point-and-click crowd. Archived data and reports can be
kept available for long-term review and reporting using zMAP.
The zVPS (formerly ESALPS) components consist of:
zMON (formerly ESAMON - real-time display of performance data),
zTCP (formerly ESATCP - SNMP data collection),
zMAP (formerly ESAMAP - historical reporting and archiving),
zVWS (formerly ESAWEB - z/VM based web server),
zTUNE (a subscription service),
zVIEW (formerly SHOWCASE - web-based viewing of performance data),
zPRO (new to the quality line of Velocity Software products).
Velocity continues to work with other software vendors to ensure
a smooth interface to or from other products such as
VM:Webgateway, CA-Webgateway, EnterpriseWeb, MXG, and MICS.
Velocity Software remains the leader and innovator in the
z/VM performance, Linux performance, and cloud computing
management arenas.
The information and suggestions contained in this
document are provided on an as-is basis without
any warranty either expressed or implied. The use
of this information or the implementation of any of
the suggestions is at the reader's own risk. The
evaluation of the information for applicability is
the reader's responsibility. Velocity Software may
make improvements and/or changes to this publication
at any time.
Overview
This reference guide is extracted from Velocity Software's
presentation on configuration guidelines. It provides
high-level configuration and tuning recommendations whose
results can be measured using zVPS. Most installations
will see significant benefit from the recommendations
suggested here. Installations with more
complex requirements will want to evaluate the recommendations
based on measurements of their particular systems.
This abbreviated Tuning Guide discusses each subsystem from
both a traditional z/VM perspective and a
Linux server farm perspective.
For performance questions or further information
about evaluating these recommendations in your z/VM and
Linux environment, please contact Barton Robinson
of Velocity Software at:
Velocity Software, Inc.
PO Box 390640
Mountain View, CA 94039-0640
650-964-8867
DASD Subsystem Performance Summary
DASD Configuration Guidelines for z/VM:
Do NOT combine spool, paging, TDISK and minidisks
at the volume level. (This means dedicated volumes!)
Page and spool use algorithms designed to minimize
seeks and overhead within the file. Putting either
page or spool on the same volume as any other active
area will result in contention and overhead.
Furthermore, multiple page or spool allocations should not
reside on the same volume.
TDISK is formatted regularly and should not be assigned
to the same volume as data with a performance requirement.
z/OS and VM data should be segregated at the control unit
level to avoid error recovery complications and to reduce
performance spikes when z/OS runs I/O intensive batch jobs.
DASD Planning Using Access Density
When allocating DASD using the access density methodology,
consider the following planning guidelines. Note that these are
rules of thumb, not hard-and-fast rules. Installations should always
review performance to ensure that the user's needs have been
met. Access Density is defined as the number of I/O expected
per Gigabyte of data. The ESADSD6 report provides data access
densities at a device level. The following recommendations for
current DASD technology are intended to keep device busy below
10%. This number is intentionally conservative as a guideline
to provide positive results when estimates are wrong.
For some Linux volumes and z/VM paging volumes, the I/Os are
larger and longer in duration. Plan for Linux volumes and
page volumes to have service times of 1-2 ms per I/O; with the
10% busy target, a device should be targeted at 50-100 I/O per
second. Traditional 4K I/O also has service times in the 1-2 ms
range, which means 50 I/O per volume is a reasonable target.
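As a planning sketch (the workload figures here are hypothetical),
suppose an application holds 200GB of data and is expected to drive
1000 I/O per second at peak. Its access density is 1000 / 200 = 5 I/O
per second per gigabyte. At a target of 50 I/O per second per volume,
the data should be spread across at least 1000 / 50 = 20 volumes, or
roughly 10GB of this data per volume.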
SCSI is currently not suited for high access data or paging
due to reduced performance.
Control Unit Planning
He with the most cache wins. In a Linux environment, the issue
is often the non-volatile write cache, as Linux will buffer writes
and then write out data in large bursts, overflowing the write
cache. Ensure there is a mechanism for detecting NVS-full
conditions. Minimizing Linux server storage sizes also minimizes
the potential of this problem by reducing the available storage
to cache write data.
Channels
Channels today rarely impact performance. PAV/HiperPAV is always good.
Measuring the DASD Subsystem
Each of the above tuning recommendations can be evaluated using
the following zVPS reports:
ESADSD1: Device Configuration
ESADSDC: Cache Configuration
ESADSD2/6: DASD Performance and access rates
ESADSD5: DASD Cache performance
ESAPSDV: Page/Spool Device Performance
ESACHAN: Channel Performance
ESACHNH: HiperSockets Performance
Storage Subsystem Performance
Storage requirements should be reduced as much as
possible to avoid unnecessary paging delays. Linux
adds several guidelines. Plan on 2GB of storage for
z/VM, MDC, and the infrastructure (TCPIP, DIRMAINT,
zVPS).
Linux Storage Planning Guidelines
With Linux, the over-commit ratio is the planning target.
If you plan for 20 Linux servers of 1GB each,
with a target over-commit ratio of 2, then 12GB is
required (20 servers times 1GB, divided by 2, plus
2GB for z/VM and infrastructure). For WAS and Domino environments,
an over-commit target of 1.5 is reasonable.
For Oracle and other virtualization-friendly applications,
an over-commit ratio of 3 is reasonable.
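Stated as a general planning sketch (the 2GB base for z/VM, MDC and
infrastructure comes from the guideline above; the server counts and
sizes are hypothetical):
  required storage = (number of servers x server size) / over-commit ratio
                     + 2GB for z/VM, MDC and infrastructure
For example, 40 servers of 1GB each at an over-commit ratio of 2.5
would plan for 40GB / 2.5 + 2GB = 18GB.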
To put more servers into existing storage, decrease
Linux server storage sizes until they start to swap.
Repeat.
This is the largest tuning knob available to
improve storage utilization.
System Settings
Many SRM settings are no longer useful:
SET SRM STORBUF
SET SRM LDUBUF
SET SRM DSPBUF
Storage Analysis
Use the following reports to evaluate storage.
ESASTRC: Storage Configuration
ESASTR1: Storage Analysis
ESASTR2: Storage Analysis Details
ESADCSS: NSS/DCSS Analysis
ESAASPC: Address Space Analysis
ESAUSR2: User Resource Utilization
ESAUSPG: User Paging Analysis
Paging Subsystem Performance
Review the options for reducing storage requirements
BEFORE analyzing or enhancing the paging subsystem.
Many times, storage requirements can be reduced so that
paging requirements drop significantly. If this is the
case, any time spent on the paging subsystem will be
wasted.
Paging Configuration Requirements
The requirements for the paging subsystem are as follows.
Ensure page packs are dedicated.
Page space requirements: Page space must be
no more than 50% allocated. More than that, and blocking
factors drop, and Page I/O goes up. This is the most
critical consideration.
Page device requirements: Ensure that paging
devices do not exceed 20 percent busy.
Allocate the same page space to each paging
device to ensure the load is balanced at peak intervals.
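As a sizing sketch (the in-use figure is hypothetical), if peak in-use
page space is expected to be 60GB, then at least 120GB of page space
should be allocated to stay under the 50% threshold; spread across six
equally sized paging volumes, that is 20GB of page space per volume.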
Spooling Configuration Requirements
Spool very rarely impacts performance. Have a sufficient
number of spool volumes to keep each device below
20 percent busy during peak periods. Maintain sufficient space to ensure
console logs are available for problem determination.
Paging/Spooling Analysis
The following reports should be used for analyzing the
paging and spooling subsystems:
ESAPAGE: Page/Spool requirements (system level)
ESABLKP: Block Paging analysis
ESAPSDV: Page/Spool (Device level)
ESAUSPG: User requirements
Processor Subsystem Performance
Moore's law is dead, long live the mainframe. Processor/cycle
speeds have not significantly changed in several generations.
Now the objective is to get more work done with fewer cycles.
Reducing CPU requirements from a system tuning perspective
can be done with the following actions.
Minimize Linux Virtual Processors: Linux should not
be allowed to have multiple virtual processors when the workload does
not need them. Giving a server an extra virtual processor may
provide a few milliseconds improvement in performance, but
will result in spin locks, consuming processor unnecessarily.
Minimize polling: Linux hertz time is just one
example of polling within a Linux server. This can and
should be corrected using the timer patch. Note that WAS,
Domino, SAP and some other applications have since
implemented polling.
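As a quick check of the virtual processor recommendation (assuming the
guest has the usual class G privileges), the number of virtual CPUs
defined for a server can be displayed from within the guest and
compared with what the workload actually needs:
CP QUERY VIRTUAL CPUS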
System Settings
Many guidelines for SRM settings have changed over the years;
installations that have carried forward their own "SRMSET EXEC"
for years may find some new recommendations here.
Many guidelines had to do with controlling access to
the dispatch list, and when resources were constrained, virtual
machines would be delayed on the eligible list. This function
no longer exists.
SET SRM DSPBUF | LDUBUF | STORBUF are no longer
useful
SET SRM IABIAS has no meaning anymore
SET SRM DSPSLICE minslice can be useful for
systems with few processors and CPU-intensive workloads.
For Linux workloads, use the default of 5 (ms).
SET SRM MAXWSS has always been a useless setting
SET SRM VERTICAL | HORIZONTAL - VERTICAL is required
if using SMT. HORIZONTAL has helped performance of many
systems where SMT was not a good option.
Processor Performance Analysis
Processor performance and the impact of these recommendations can be
evaluated with the zVPS processor and user reports.
Minidisk Cache (MDC) Guidelines
For CMS and shared Linux disk workloads,
analysis of several systems has shown a pattern of diminishing
returns from MDC. The largest gain is from the first 100MB.
Note that Linux servers sharing one or two disks can avoid I/O
with MDC. In no case should the z/VM control program (CP) be
allowed to determine how much storage is to be used for MDC. Many
case studies have shown that CP will cause paging spikes by allocating
too much storage to MDC. The following commands should be issued to
control MDC storage, where the maximum sets a reasonable limit
on the size of MDC.
SET MDC STORAGE 0M 256M
For VSE systems that will benefit from the MDC track
read, meaning that for every I/O to disk, the full track is
read and cached, ensure that MIN and MAX are the same to
maintain consistent performance:
SET MDC STORAGE 1024M 1024M
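To confirm the values in effect after issuing SET MDC, the
corresponding query command can be used (assumed available at
current z/VM levels):
QUERY MDCACHE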
Measuring MDC
The following reports can be used for analyzing different
aspects of MDC performance:
ESAMDC: Mini-Disk Cache Analysis
ESADSD6: Analyze MDC impact on disks
ESAUSR3: Analyze MDC impact on users and Linux servers
Linux Configuration Recommendations
These are recommendations that are considered best
practices and have been validated in hundreds of Linux
installations.
Swap to virtual disk - swapping to virtual
disk does not impact response time and means swap is
not a bad thing. Swap to vdisk is measured in microseconds,
not milliseconds.
Multiple and Small Swap Disks - Configure multiple small
swap disks, prioritized so that one is fully used
before the next is touched. Alerts should be set for when the
second vdisk becomes needed (see the sketch following these
recommendations).
Minimize virtual machine size until Linux swaps:
The only way to reduce storage requirements is to stop
Linux from caching unnecessary data and programs.
Minimize number of virtual CPUs - if the workload
requires only 1 virtual CPU, providing a second CPU will
waste CPU by creating spin locks.
These are best practices. All have been validated many
times in many installations.
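As a minimal sketch of the swap disk recommendation above (the device
names and priority values are hypothetical; actual vdisk device numbers
depend on the installation), two virtual disk swap devices can be given
descending priorities in /etc/fstab so the first is filled before the
second is touched:
/dev/dasdb1   swap   swap   pri=10   0 0
/dev/dasdc1   swap   swap   pri=5    0 0
The swapon -s command (or /proc/swaps) shows which swap devices are
actually in use, which is what an alert on the second vdisk would key on.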
Service Machine Performance Summary
Configuring TCPIP for z/VM
TCPIP should have the following option set to provide
optimum service. The Share setting can be modified
later to fit requirements if TCPIP's requirements are very large.
SET SHARE TCPIP ABSOLUTE 5%
This ensures TCPIP is not over-prioritized the way the
default of RELATIVE 3000 is.
Tuning z/VM Database Service Machines
Database service machines such as SQL are a shared resource.
They should have the following options set to provide optimum
service. This does not include Linux servers, unless they are
shared by many other servers as a resource.
SET SHARE SQLxxx RELATIVE 300
Measuring Service Machines
Each of the above tuning recommendations can be evaluated
using the following zVPS reports:
ESAUSR1: User Configuration to validate settings
ESAXACT: Transaction Analysis - understand
server delays
ESAUSR2/3/4: User Resource Utilization
ESAUSRQ: User Queue Analysis (to understand queue sizes)
Tuning Traditional CMS Workloads
The following guidelines are for traditional CMS workloads,
and have no impact on Linux server farm workloads.
File Directories in Storage:
Use the SAVEFD facility to
save file directories in saved segments for disks that are often
accessed, for example the HELP disk and the tools disks. This
eliminates the I/O needed to access the minidisk.
EXECs in storage
Put often-used EXECs from the S, Y, and
tools disk into the installation saved segment. This reduces both
I/O and storage use, since the EXEC is not loaded into user
storage but instead executes from a single shared copy. IBM
provides 'SAMPNSS EXEC' to define the segment and a sample
'CMSINST EXECLIST' that contains the list of EXECs and XEDIT
macros that will be loaded into the saved CMSINST segment.
Installations
should add all EXECs and XEDIT macros that are likely to
be used frequently.
Help Disk
Blocking the Help disk at 1K requires 25% less
DASD space without changing the I/O rate. Always save the file
directory with SAVEFD to reduce directory access I/O and time.
Installations with heavy use of Help should force users to directly
access it by modifying the SYSPROF EXEC.
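As an illustrative sketch (the virtual device address and file mode are
hypothetical, and reblocking is normally done when the disk is rebuilt,
since FORMAT erases its contents), a minidisk is reblocked to 1K with
the CMS FORMAT command:
FORMAT 19D Z (BLKSIZE 1024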
Performance Analysis
Use the following reports to evaluate impacts of these functions
on performance:
ESAWKLD: Determine impact on user workloads
ESAUSPG: User Paging Analysis
ESAPAGE: Determine impact on system paging
Functional Requirements for Managing Linux Performance Under z/VM:
What Every IT Professional Should Know
Performance Measurement skills and tools to
ensure current service levels are met. This includes
current performance measurements and the ability to
analyze performance from previous time frames.
Capacity Planning skills and a performance
database to ensure future needs are met, including
the ability to transfer data to MICS or MXG.
Operational alerts implemented to allow
operations to detect current issues such as looping
processes, exceeding disk capacity, etc., for hundreds
of servers. Alerts can be sent to any SNMP based
management console, 3270, or a browser on a workstation.
Chargeback and accounting capability to
provide data used in a mainframe business model
to charge for resources consumed, using either
zVPS facilities or MICS.
Achieving these results introduces the following challenges:
Accuracy of the Data - The CPU data provided
by Linux in a virtual environment prior to SLES10 was
wrong. Velocity Software was the first to understand
this issue and offered the ONLY product to correct the
results. The same is now true for Linux in an SMT environment.
Complete Data Collection: Multi-platform
Data Collection - Through the use of a standard
interface (SNMP and NETSNMP) an installation using
zVPS may monitor many different platforms (NT,
Linux, Sun, HP).
Complete Data Collection: Ability to
collect data from hundreds or even thousands of servers.
A 100% capture ratio ensures that you know
exactly how much system resource is being used and
by whom - down to the Linux process level.
Cost of Data Collection - Cost of collecting
data should be kept to a minimum. Some management
tools require as much as 5% of the processor resource.
Velocity's target is 0.1% or less of ONE processor
at 1-minute data granularity per Linux server.
Velocity Software, Inc. Products and Services
Velocity Software's focus is to provide performance
products and services for z/VM. Velocity Software
offerings currently include:
zVPS: The Velocity Performance Suite is
designed for installations using Linux under z/VM.
It includes the standard z/VM measurement facilities
(zMAP and zMON) as well as Linux and network data
collection (zTCP), and a full-function z/VM-based
web server (zVWS).
zMON,
the z/VM Real-Time Monitor, analyzes system performance
monitor data produced by z/VM. zMAP generates reports
for use in performance analysis activities, and stores,
retrieves and reports from history files to facilitate
capacity planning and long-term performance trend
analysis. zMON generates real-time displays that show
z/VM system performance measurements. zMON captures
system performance data and records it on disk, as well
as creating history files that can be employed with
zMAP. Together, zMAP and zMON provide a complete
z/VM performance monitoring system.
zVWS (Velocity Web Server) provides a full-function
web server allowing a browser-based interface
to z/VM functions and performance data.
z/VM performance workshops are offered
regularly. See the Velocity Software website for
details.
zTUNE is Velocity Software's service to ensure
performance problems are resolved quickly and to provide
access to Velocity Software's 100 years of
experience in solving performance problems. This
includes system performance reviews whenever requested.
zPRO is Velocity Software's solution for implementing
private clouds, as well as providing an easy-to-use web page
for managing your z/VM environments.