The DASD Level
Planning considerations:
- Do NOT combine spool, paging, TDISK and minidisks at the volume level to avoid contention and overhead.
- Do NOT have multiple page or spool allocations on the same volume.
- Do NOT put TDISK on the same volume as other data that has a performance requirement.
- Use dedicated volumes for SFS File Pools.
- Use dedicated volumes for Linux shared disks and Linux LVM (Logical Volume Manager).
- Do NOT share z/OS and z/VM data at the control unit level to avoid error recovery complications and performance
issues if z/OS runs I/O intensive batch jobs.
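The volume-level rules above can be sketched as a small check. This is only an illustration: the allocation type names (PAGE, SPOOL, TDISK, PERM for minidisk space) and the function name check_volume are assumptions for the example, not zVPS or CP output.

```python
# Hypothetical allocation type names, used only for this illustration.
EXCLUSIVE = {"PAGE", "SPOOL", "TDISK"}

def check_volume(volser, allocations):
    """Return a list of warnings for one volume.

    allocations: list of allocation type strings found on the volume,
    e.g. ["PAGE", "PERM"] (PERM standing in for minidisk space).
    """
    warnings = []
    kinds = set(allocations)
    # Page, spool or TDISK space sharing a volume with any other type.
    if len(kinds) > 1 and kinds & EXCLUSIVE:
        warnings.append(f"{volser}: mixed allocation types {sorted(kinds)}")
    # More than one page or spool allocation on the same volume.
    for kind in ("PAGE", "SPOOL"):
        if allocations.count(kind) > 1:
            warnings.append(f"{volser}: multiple {kind} allocations")
    return warnings

print(check_volume("VOL001", ["PAGE", "PERM"]))
print(check_volume("VOL002", ["SPOOL", "SPOOL"]))
```

A volume holding a single page allocation, or only minidisk space, returns no warnings.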
For some Linux volumes and z/VM paging volumes, the I/Os are larger and longer in duration. Plan for Linux volumes and page
volumes to have service times of 1-2 ms per I/O, thus a device should be targeted at 50-100 I/Os per second. Traditional I/Os
at 4K per I/O have service times in the 1-2 ms range, which means 50 I/Os per second per volume is a reasonable target.
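To relate these targets to device busy, the standard utilization relationship (busy = I/O rate x service time) can be applied. The sketch below is a rough planning aid only; the function name device_busy_pct is made up for the example, and it assumes one I/O is serviced at a time.

```python
def device_busy_pct(ios_per_second, service_time_ms):
    """Utilization = I/O rate x service time, expressed as a percentage."""
    return ios_per_second * (service_time_ms / 1000.0) * 100.0

# Rates and service times taken from the planning targets in the text.
for rate in (50, 100):
    for svc in (1.0, 2.0):
        busy = device_busy_pct(rate, svc)
        print(f"{rate} I/Os per second at {svc} ms -> {busy:.0f}% busy")
```

For the rates and service times quoted above, this works out to between 5% and 20% device busy.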
SCSI is currently not suited for high access data or paging due to reduced performance.
For control units in a Linux environment, the issue is often the non-volatile write cache (NVS), because Linux buffers writes
and then writes the data out in large bursts that overflow the write cache. Ensure there is a mechanism for detecting NVS full
conditions. Minimizing Linux server storage sizes also reduces the likelihood of this problem by reducing the
storage available to cache write data.
For a presentation about the DASD environment and utilization, see
DASD Performance
Helpful ESAMON screens/ESAMAP reports:
- ESADSD1 - DASD configuration - shows how the current DASD is configured.
- ESADSD2 - DASD performance analysis Part 1 - shows how the current DASD is performing.
- ESADSD6 - DASD performance analysis Part 2 - shows different information about DASD performance.
- ESADSDC - Cache control unit configuration - shows how the current DASD cache is configured.
- ESADSD5 - Cache (3990) analysis - shows cache activity and effectiveness.
- ESACHAN - Channel performance analysis - shows how the current DASD channels are performing.
- ESASEEK - DASD seeks analysis - shows the DASD arm movement per volume.
- ESAUSEK - User minidisk seek analysis - shows each user's minidisk activity.
- ESAXACT - Transaction delay analysis - shows an analysis of virtual machine states and wait states.
Using zVPS to find information for solving issues with the DASD level:
ESADSD1 - Shows the current DASD configuration and characteristics.
Volser - This is the volume serial (volser). Often the name indicates what kind of data is on the device.
For example, VM4P19 is a paging device on system VM4.
Device Type - This shows the DASD device type. For example, a 3390-3 is a "Mod 3" device with 3339 cylinders.
Online CHPIDs - Shows what CHPIDs are associated with each device. Depending on the device type, there should
be at least two CHPIDs per device.
Some device types are ok to have one CHPID or can architecturally only handle one
CHPID. Verify the expected number of CHPIDs are present/online.
UserID (if ded) - This shows if a device is dedicated. If it is dedicated and the device is over 50% busy,
it would be good to evaluate defining it as a minidisk so it will utilize MDC (minidisk caching).
MDisk links - This shows how many minidisks are currently on each device. Certain devices, such as paging or
spooling devices, should not have minidisks defined.
Extent Type - If the device is a page/spool device, it will show it here. This is good to verify that page
and spool devices aren't being shared with minidisks.
MDC Elg - Indicates if this device is eligible for minidisk caching. MDC needs to be enabled for certain
processes. Enabled is appropriate.
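The ESADSD1 checks above can be expressed as a short routine. This is a sketch under stated assumptions: the dictionary keys (online_chpids, extent_type, mdisk_links) and the function name check_dasd_config are hypothetical and do not correspond to an actual zVPS export format.

```python
def check_dasd_config(dev):
    """Return configuration findings for one device.

    dev is a dict with hypothetical keys:
      online_chpids - list of CHPIDs online to the device
      extent_type   - "PAGE", "SPOOL" or "" for other devices
      mdisk_links   - number of minidisks on the device
    """
    findings = []
    # Most device types should have at least two CHPIDs.
    if len(dev["online_chpids"]) < 2:
        findings.append("fewer than two CHPIDs online; verify this is expected")
    # Page and spool devices should not be shared with minidisks.
    if dev["extent_type"] in ("PAGE", "SPOOL") and dev["mdisk_links"] > 0:
        findings.append("page/spool device also has minidisks defined")
    return findings

paging = {"online_chpids": ["40", "41"], "extent_type": "PAGE", "mdisk_links": 3}
print(check_dasd_config(paging))
```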
ESADSD2 - Shows DASD performance.
Device Number - This shows the device number and model number for the head of string.
Can click (zview) or zoom (z/VM) to see all of the devices on the string.
This will first show all the devices that have activity.
%Device Busy - This shows the elapsed time a device was busy (if not seeing the whole string, the head of string
will show the total for that string).
Look for out of pattern busy numbers which can show a disk that is overworked or
may be failing. The ESADSD2 report will show the top DASD by Device busy.
If the device busy is over 50%, there is very high utilization.
SSCH Average/Peak - This shows the number of Start Subchannel (SSCH) instructions issued per second, on average and
at the peak. This indicates which DASD are the busiest.
Response Times - This shows different aspects of how the devices are functioning.
When Response times do not equal Service times, there is queueing.
High Response/Service times can indicate a malfunctioning or overworked device,
that PAV/HyperPAV is turned off or not working, or that secondary channels are needed.
Service times of 2.4 ms are high by today's standards.
High Pending/Disconnect times can be an indication of a cache problem.
High Disconnect times can also indicate the need for solid state DASD. A number over
10 is high.
High Connect time may indicate faster channels are required or there are very large
block data transfers.
Queueing - This shows the different ways a device can queue. It shows where the queuing is happening -
in the device vs the control unit vs I/O throttling
(where multiple entities are after the same data). Queueing over 10 is high - evaluate
the controller details.
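The ESADSD2 rules of thumb above can be combined into one pass over a device's numbers. The sketch below is illustrative only: the dictionary keys and the function name check_dasd_performance are hypothetical, and it assumes the times are reported in milliseconds.

```python
def check_dasd_performance(dev):
    """Return findings for one device; dev is a dict with hypothetical keys.

    busy_pct, response_ms, service_ms, disconnect_ms and queued stand in
    for the %Device Busy, Response Times and Queueing columns.
    """
    findings = []
    if dev["busy_pct"] > 50:
        findings.append("device busy over 50%: very high utilization")
    # Response time above service time means I/Os are waiting in a queue.
    if dev["response_ms"] > dev["service_ms"]:
        findings.append("response time exceeds service time: queueing")
    if dev["disconnect_ms"] > 10:
        findings.append("disconnect time over 10: check cache or consider solid state DASD")
    if dev["queued"] > 10:
        findings.append("queueing over 10: evaluate the controller details")
    return findings

busy_dev = {"busy_pct": 62.0, "response_ms": 3.1, "service_ms": 2.4,
            "disconnect_ms": 0.5, "queued": 0}
for finding in check_dasd_performance(busy_dev):
    print(finding)
```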
ESADSD6 - DASD performance analysis Part 2 - shows different information about DASD performance
Device Addr - This shows the device number and model number for the head of string.
Can click (zview) or zoom (z/VM) to see all of the devices on the string.
%Device Busy - This shows the elapsed time a device was busy (if not seeing the whole string, the head of string
will show the total for that string).
Look for out of pattern busy numbers which can show a disk that is overworked or
may be failing. If the device busy is over 50%, there is very high utilization.
ESADSDC - Shows the current cache control unit configuration and characteristics
Control Unit - This shows the device number and model number.
Storage Director - This shows the storage director id and its status. Verify all are online.
Available Cache - This shows the available cache. It may be less than the total size if other LPARs are
also using that control unit.
Cache Fast Write - This shows that cache fast write is active. Active is good.
Channel Paths Online - This shows the channel paths online to each device. Verify each expected device is
online.
ESADSD5 - Shows the cache control unit performance
Device Number - This shows the device number and model number for the head of string.
Can click (zview) or zoom (z/VM) to see all of the devices on the string.
Total I/O Cache Hit% - This shows how well cache is being utilized. A low hit% (below 80%) may indicate
more cache is needed, or the workload might be batch (like backups).
NVS Full - This indicates the Non-volatile storage is full. This stops fast write (caching), which is a
big problem as it will cause the disks to be more highly utilized and slow down.
Cache Inhib/Bypass - This also will indicate that caching is not working.
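The cache indicators above lend themselves to a simple check. As a minimal sketch, assuming hypothetical field names (hit_pct, nvs_full, inhibit_bypass) rather than any real ESADSD5 export layout:

```python
def check_cache(dev):
    """Return cache findings for one device; keys are hypothetical."""
    findings = []
    if dev["hit_pct"] < 80:
        findings.append("cache hit% below 80: more cache may be needed, or the workload is batch")
    # Any NVS full condition stops fast write and pushes work to the disks.
    if dev["nvs_full"] > 0:
        findings.append("NVS full: fast write is stopped")
    if dev["inhibit_bypass"] > 0:
        findings.append("cache inhibit/bypass: caching is not being used")
    return findings

print(check_cache({"hit_pct": 72.5, "nvs_full": 4, "inhibit_bypass": 0}))
```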
ESACHAN - Shows the channel performance
Channel Util% - This shows the channel utilization for the LPAR and all the LPARs to which it is connected. If
the LPAR utilization is over 50% (ESCON) or 40% (FICON), consider faster channels or moving data.
If the total utilization for all of the connected LPARs is over 50% (ESCON) or 40% (FICON),
consider adding channels, moving to faster channels or moving data to other channels.
Data Unit/Work Unit/Bus Cycles Pct - This shows the percent busy for the channel. This should
really stay under 40%.
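The channel utilization thresholds above can be written as a lookup by channel type. This is a sketch only; the function name channel_advice and its parameters are hypothetical, and the thresholds are the ones stated in the text.

```python
# Utilization thresholds (percent) by channel type, as given in the text.
THRESHOLDS = {"ESCON": 50.0, "FICON": 40.0}

def channel_advice(channel_type, lpar_util_pct, total_util_pct):
    """Return advice strings for one channel path."""
    limit = THRESHOLDS[channel_type]
    advice = []
    if lpar_util_pct > limit:
        advice.append("LPAR utilization high: consider faster channels or moving data")
    if total_util_pct > limit:
        advice.append("total utilization high: consider adding channels, "
                      "faster channels or moving data to other channels")
    return advice

print(channel_advice("FICON", 25.0, 45.0))
```

Keeping the thresholds in one table makes it easy to adjust them if local experience suggests different limits.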
ESASEEK - Shows the DASD arm movement per volume
Device address/volser/type - This shows DASD that is active. Seeks are no longer a good way of showing
performance,
however this information can be helpful when looking for where servers are using data
(if data needs to be moved, a device needs to be taken offline, etc).
ESAUSEK - Shows the DASD activity by minidisk
Volume/Minidisk Ownerid - This shows the volume and owner of the minidisk. As with ESASEEK, seeks are no
longer a good way of showing performance,
however this information can be helpful when looking for where
servers are using data (if data needs to be moved, a device needs to be taken offline, etc).
ESAXACT - Transaction delay analysis. This can show if users are waiting on I/O operations.
Helpful DASD information:
UserID/Class - This shows the userid/class.
SIO - This shows if a user is waiting for input/output operation start.
Async I/O - This shows if a user is waiting with an asynchronous input/output operation outstanding.
Can click (zview) or zoom (z/VM) to see all of the users in a class.
Conclusions
It is very important that users/servers can get to their data. If there are DASD/cache/channel issues, this can cause
major performance issues. Servers using the same disks/volumes can cause contention which will also cause performance
problems.