DASD Utilization Analysis
Specifics - DASD Utilization analysis:
Calculations:
DASD response time = service time + queue time
DASD service time = pend time + disconnect + connect
Device busy = I/O rate * service time
Estimated DASD response time = (service time) / (1 - device busy)
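The calculations above can be sketched in a few lines of Python. This is only an illustration: the function name and parameters are hypothetical, times are assumed to be in milliseconds, and the I/O rate is assumed to be operations per second.

```python
def dasd_times(pend_ms, disconnect_ms, connect_ms, io_per_sec):
    """Apply the formulas above; the ms and per-second units are assumptions."""
    # Service time = pend time + disconnect + connect
    service_ms = pend_ms + disconnect_ms + connect_ms
    # Device busy = rate * service time (service time converted to seconds)
    device_busy = io_per_sec * service_ms / 1000.0
    if device_busy >= 1.0:
        raise ValueError("device busy is 100% or more; the estimate is undefined")
    # Response time = service time / (1 - device busy)
    response_ms = service_ms / (1.0 - device_busy)
    # Response time = service time + queue time, so queue time is the difference
    queue_ms = response_ms - service_ms
    return {
        "service_ms": service_ms,
        "device_busy": device_busy,
        "response_ms": response_ms,
        "queue_ms": queue_ms,
    }


if __name__ == "__main__":
    # Example: 0.2 ms pend, 0.3 ms disconnect, 0.5 ms connect at 200 I/Os per second
    for name, value in dasd_times(0.2, 0.3, 0.5, 200.0).items():
        print(f"{name}: {value:.3f}")
```

With these inputs the device is 20% busy, so the estimated response time is one quarter higher than the service time.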
Helpful ESAMON screens/ESAMAP reports:
- ESADSD2 - DASD performance analysis Part 1 - shows how the current DASD is performing.
- ESADSD6 - DASD performance analysis Part 2 - shows different information about DASD performance.
- ESACHAN - Channel performance analysis - shows how the current DASD channels are performing.
ESADSD2 - Shows DASD performance. Both screen and report samples:
Device Number - This shows the device number and model number for the head of string.
Can click (zview) or zoom (z/VM) to see all of the devices on the string.
This will first show all the devices that have activity.
%Device Busy - This shows the percentage of elapsed time a device was busy (if not seeing the whole string, the head
of string will show the total for that string).
If the device busy is over 50%, there is very high utilization.
Look for out-of-pattern busy numbers, which can show a disk that is overworked or
may be failing. The ESADSD2 report will show the top DASD by Device busy.
If a device is consistently among the busiest DASD, it may be necessary to move data
to another device (spread highly accessed information across multiple devices).
SSCH Average/Peak - This shows the number of start subchannel (SSCH) commands issued per second, on average and
at peak. This indicates which DASD are the busiest.
Peak shows the 1-minute peak for the device. The report will show this for 15-minute
intervals over the course of a day, for trending and for determining problem times.
Response Times - This shows different aspects of how the devices are functioning.
When Response times do not equal Service times, there is queueing.
High Response/Service times can show a dysfunctional or overworked device,
that PAV/HiperPAV is turned off or not working, or that there is a need for secondary channels.
Service times of 2.4 are high by today's standards.
High Pending/Disconnect times can be an indication of a cache problem.
High Disconnect times can also indicate the need for solid state DASD.
High Connect time may indicate faster channels are required or there are very large
block data transfers.
Queueing - This shows the different ways a device can queue. It shows where the queueing is happening:
in the device, in the control unit, or in I/O throttling
(where multiple entities are after the same data). Queueing over 10 is high; evaluate
the controller details.
Note: Compare control unit activity to get a baseline for 'normal' performance and to determine if any control
units are utilized more than others.
If this happens, volumes may need to be reorganized to better equalize controller
usage.
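The ESADSD2 rules of thumb above can be collected into a small check. This is a minimal sketch, not part of ESAMON or ESAMAP: the function and parameter names are hypothetical, the values are assumed to be typed in from the screen or report, and the thresholds are the ones quoted in the text.

```python
def flag_device(busy_pct, response_time, service_time, queueing):
    """Return the rule-of-thumb warnings that apply to one device.

    All arguments are hypothetical names for values read from ESADSD2;
    times are assumed to use the same unit as the report.
    """
    flags = []
    if busy_pct > 50:
        flags.append("very high utilization (device busy over 50)")
    if service_time >= 2.4:
        flags.append("high service time (2.4 or more)")
    if response_time > service_time:
        flags.append("queueing present (response time exceeds service time)")
    if queueing > 10:
        flags.append("high queueing (over 10): evaluate the controller details")
    return flags


if __name__ == "__main__":
    # Made-up sample values for illustration only
    for warning in flag_device(busy_pct=62.0, response_time=3.1,
                               service_time=2.6, queueing=4.0):
        print(warning)
```

A device that returns an empty list is not necessarily healthy; the checks only cover the thresholds stated here, so out-of-pattern values still need to be compared against a baseline.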
ESADSD6 - DASD performance analysis Part 2 - shows different information about DASD performance. Both screen and
report samples:
Device Addr - This shows the device number and model number for the head of string.
Can click (zview) or zoom (z/VM) to see all of the devices on the string.
%Device Busy - This shows the percentage of elapsed time a device was busy (if not seeing the whole string, the head
of string will show the total for that string).
If the device busy is over 50%, there is very high utilization.
Look for out-of-pattern busy numbers, which can show a disk that is overworked or
may be failing. The ESADSD6 report will show the top DASD by Device busy.
Access Density - This shows the number of I/O operations per gigabyte of capacity. Look for numbers that may be
above the threshold.
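As an illustration of access density, the sketch below divides an I/O rate by device capacity and ranks devices by the result. The function names and dictionary keys are hypothetical, and treating the I/O figure as operations per second is an assumption; no threshold is applied because none is stated here.

```python
def access_density(io_per_sec, capacity_gb):
    """I/O operations per gigabyte of capacity (per-second rate is an assumption)."""
    if capacity_gb <= 0:
        raise ValueError("capacity must be positive")
    return io_per_sec / capacity_gb


def rank_by_access_density(devices):
    """Sort device records (hypothetical dict layout) from densest to least dense."""
    return sorted(
        devices,
        key=lambda d: access_density(d["io_per_sec"], d["capacity_gb"]),
        reverse=True,
    )


if __name__ == "__main__":
    # Made-up devices for illustration only
    sample = [
        {"device": "A", "io_per_sec": 120.0, "capacity_gb": 54.0},
        {"device": "B", "io_per_sec": 300.0, "capacity_gb": 27.0},
    ]
    for dev in rank_by_access_density(sample):
        density = access_density(dev["io_per_sec"], dev["capacity_gb"])
        print(dev["device"], round(density, 2))
```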
ESACHAN - Shows the channel performance
Channel Util% - This shows the channel utilization for the LPAR and for all the LPARs to which it is connected. If
the LPAR utilization is over 50% (ESCON) or 40% (FICON), consider faster channels or moving data.
If the total utilization for all of the connected LPARs is over 50% (ESCON) or 40% (FICON),
consider adding channels, moving to faster channels or moving data to other channels.
Data Unit/Work Unit/Bus Cycles Pct - This shows the percent busy for the channel. This should
really stay under 40%.
The Total Reads and Writes per second shows the total for the CEC so only one LPAR on each
CEC needs to provide the information.
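The channel utilization guidance above can be expressed as a simple lookup. This is a sketch under stated assumptions: the names are hypothetical, the utilization values are assumed to be copied from ESACHAN, and the limits are the 50% (ESCON) and 40% (FICON) figures quoted in the text.

```python
# Utilization limits quoted above, in percent
CHANNEL_LIMIT_PCT = {"ESCON": 50.0, "FICON": 40.0}


def channel_advice(channel_type, lpar_util_pct, total_util_pct):
    """Return the suggestions from the text that apply to one channel.

    channel_type is "ESCON" or "FICON"; the two utilization arguments are
    hypothetical names for the LPAR and all-LPAR values shown by ESACHAN.
    """
    limit = CHANNEL_LIMIT_PCT[channel_type.upper()]
    advice = []
    if lpar_util_pct > limit:
        advice.append("LPAR over limit: consider faster channels or moving data")
    if total_util_pct > limit:
        advice.append(
            "total over limit: consider adding channels, moving to faster "
            "channels or moving data to other channels"
        )
    return advice


if __name__ == "__main__":
    # Made-up utilization values for illustration only
    for line in channel_advice("FICON", lpar_util_pct=25.0, total_util_pct=46.0):
        print(line)
```

Keeping the limits in one dictionary makes it easy to adjust them if a site uses different targets.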