Understanding SMT

SMT - Simultaneous MultiThreading

The following information has come from z14-z15 machine data. More information to come for the z16 machines!


SMT Introduction:

Understanding how to view CPU utilization with SMT

  • When SMT is active, a system with x vCPUs has x*2 threads. When viewing from a hardware perspective (ESALPARx/ESAUSP5), the numbers shown are the number of vCPUs. When viewing from a z/VM perspective (ESACPUx), the numbers shown are the number of threads.
  • For example, consider a system with 7 vCPUs and thus 14 threads. ESAUSP5 shows the percentage of CPU used as 682.9 (out of 700, for 7 vCPUs), while ESACPUU shows it as 1220 (out of 1400, for 14 threads).
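
The two perspectives can be put on a common footing by dividing each figure by its own ceiling. The following is a minimal Python sketch using the numbers from the example; the function name `utilization_fraction` is illustrative and is not part of any report or tool.

```python
def utilization_fraction(busy_pct: float, unit_count: int) -> float:
    """Return a summed busy percentage as a fraction of total capacity.

    busy_pct   -- summed busy percentage as reported (e.g. 682.9)
    unit_count -- number of vCPUs (hardware view) or threads (z/VM view)
    """
    if unit_count <= 0:
        raise ValueError("unit_count must be positive")
    return busy_pct / (unit_count * 100.0)


vcpus = 7
threads = vcpus * 2  # two threads per vCPU when SMT is active

core_view = utilization_fraction(682.9, vcpus)        # ESAUSP5, out of 700
thread_view = utilization_fraction(1220.0, threads)   # ESACPUU, out of 1400

print(f"core view:   {core_view:.1%}")
print(f"thread view: {thread_view:.1%}")
```

The same interval reads as roughly 98% busy by core and roughly 87% busy by thread, so any utilization figure should be quoted together with the perspective it came from.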

    Helpful Screens/Reports:

    ESAHDR - Shows SMT configuration information.

  • Multithreading status - This shows if SMT is enabled or disabled.
  • Core Thread count - This shows the number of threads available.
  • Enabled count - This shows the number of threads that are enabled.
  • Operating on IFL processor(s) - This verifies that IFLs are in use. SMT can only be used on IFL engines.
  • Horizontal/Vertical Scheduling - This shows the polarization information and, if polarization is vertical, the SRM parking settings. Polarization must be vertical to run SMT. For best use of SMT, parking should be set to Large and Excessuse to High. See CPU/LPAR Parking for more information on Parking.

  • ESALPAR - Shows the LPAR weight/polarization (and other information that will be discussed later).

  • Logical Processor Weight/Polar - This shows the weight for the LPAR and the polarization of each processor. Note that the polarization designation is determined by a fixed algorithm and the weights/entitlement. Changing the polarization designation means changing the weight and thus the entitlement. See LPAR weights/overhead analysis for more information on how entitlement is calculated.

    SMT Performance vs Capacity:

    Presentation - SMT for z/VM Understanding Capacity Planning and Chargeback

    Helpful Screens/Reports:

    ESASMT - Shows the Simultaneous Multi-Threading report.

  • CPU ID/CPU - This shows the CPU ID, or Tot for the sum across all CPUs. With SMT, each CPU ID is a "thread", so this report shows 6 threads (CPU ID) vs 3 cores (Cor Cnt).
  • SMT Core Productivity Busy Pct - This shows how busy the full core has been (the next line is zeros because the number is for the full core).
  • SMT Core Productivity Mt Util - This shows the thread utilization for that core (the next line is zeros because the number is for the full core).
  • SMT Core Productivity Thread Density - This shows a ratio expressing the average number of threads that were active when the core was dispatched and at least one thread was active (the next line is zeros because the number is for the full core).
  • Cor Cnt - This shows the number of cores.
  • Capability Factor Intervl - This shows the ratio of work rate with only one thread busy. This is indicative of the capacity gain.
  • Capability Factor Max - This shows the ratio of the work rate with all threads busy as opposed to the interval rate which is the ratio of only one thread busy.

  • ESALPAR - Shows logical partition characteristics and processor utilization for each and the system as a whole.

  • %Assigned Total - This shows the total percentage of time a physical processor was assigned to the logical processor.
  • %Assigned Ovhd - This shows the percentage of overhead time for a physical processor assigned to the logical processor.
  • Weight/Polar - This shows the weight and polarization for each vCPU. The weight is defined in the LPAR. Vxx shows vertical polarization, which is needed for SMT. Lo/Me (or Hi) are Low, Medium or High designations. These are determined by a fixed algorithm and can only be changed by changing the weight/entitlement. These designations will affect parking.
  • CPU Total Util - This shows the total CPU utilization.
  • CPU Emul time - This shows the total amount of CPU problem state time. This is 'real work' time.
  • CPU User ovrhd - This shows the total CPU overhead time. This is the time the Control Program is doing work on behalf of the user.
  • CPU Sys ovrhd - This shows the total system time. This is the time the Control Program is doing other things not associated directly with a user.
  • Multi-thread Idle Time - This shows the time that an individual dispatched CPU of a core is in any of the following states: enabled wait, disabled wait, stopped, check-stop or program interrupt loop. This number can be used to determine whether utilization and capacity are efficiently balanced, but thread idle is not one for one for capacity planning. (%Assigned Total time - %Assigned Overhead time) * 2 = "thread assignment time". Thread Idle Time then shows what was idle out of that assignment time.
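
As a worked example of the thread assignment arithmetic, the sketch below assumes the formula is read as (%Assigned Total - %Assigned Ovhd) multiplied by the number of threads per core, and that Thread Idle Time is expressed in the same percent units. The function names are illustrative. The %Assigned inputs are the ESALPAR figures quoted elsewhere on this page (828.6 and 12.6); the thread idle value is made up purely for illustration.

```python
def thread_assignment_time(assigned_total: float,
                           assigned_ovhd: float,
                           threads_per_core: int = 2) -> float:
    """(%Assigned Total - %Assigned Ovhd) * threads per core."""
    return (assigned_total - assigned_ovhd) * threads_per_core


def idle_share(thread_idle: float, assignment_time: float) -> float:
    """Fraction of the thread assignment time that was idle."""
    if assignment_time <= 0:
        raise ValueError("assignment_time must be positive")
    return thread_idle / assignment_time


assignment = thread_assignment_time(828.6, 12.6)  # 816 core-percent times 2 threads
share = idle_share(612.0, assignment)             # 612.0 is a hypothetical idle value

print(f"thread assignment time: {assignment:.1f}%")
print(f"idle share of assignment: {share:.1%}")
```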

  • ESALPARS - Shows a summary of the logical partition configuration and utilization for each partition.

    For the LXB5 LPAR:
    ESALPMGS - Shows how the hardware/processing resources are distributed in the box.

  • CPU Type - This shows the different types of CPUs.
  • Ovhd/Mgmt - This shows the Logical (Ovhd) overhead and Physical (Mgmt) overhead. Currently on this system, the overhead is low, below 2%. If this rises into double digits, there could be too many vcpus defined. The ESALPARS report shows the totals for the day averaged every 15 minutes.
  • The "Totals by Processor type:" section of the ESALPARS report above contains the ESALPMGS report information.
  • For the box that holds the LXB5 LPAR:
    • The total busy is 1073%
    • Of which 18.7% was used for (logical) Overhead
    • And 18.4% was used for (physical) Management Overhead
  • (Approximately 37% combined overhead for 23 shared IFLs at 1073% busy is slightly high and should be investigated.)
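
The overhead figures above can be combined as in the following sketch. The function name is illustrative, and expressing the overhead as a share of total busy and as an average per IFL are two interpretations chosen here for comparison; they are not definitions taken from the ESALPMGS report.

```python
def overhead_summary(total_busy: float, logical_ovhd: float,
                     physical_mgmt: float, ifl_count: int) -> dict:
    """Combine logical and physical overhead (inputs in percent of one CPU)."""
    if total_busy <= 0 or ifl_count <= 0:
        raise ValueError("total_busy and ifl_count must be positive")
    combined = logical_ovhd + physical_mgmt
    return {
        "combined_pct": combined,                # percent of one CPU
        "share_of_busy": combined / total_busy,  # fraction of total busy
        "per_ifl_pct": combined / ifl_count,     # average percent per IFL
    }


summary = overhead_summary(total_busy=1073.0, logical_ovhd=18.7,
                           physical_mgmt=18.4, ifl_count=23)

print(f"combined overhead: {summary['combined_pct']:.1f}%")
print(f"share of busy:     {summary['share_of_busy']:.2%}")
print(f"per IFL:           {summary['per_ifl_pct']:.2f}%")
```
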
    ESALPARS/ESACPUU (screen) - Shows processor count from a hardware perspective (ESALPARS) vs from a zVM perspective (ESACPUU).


    ESALPAR/ESACPUU (report) - Shows processor utilization from hardware perspective (ESALPAR) vs from a zVM perspective (ESACPUU).

  • ESALPAR shows 816% busy (%Assigned Total (828.6) - %Assigned Ovhd (12.6)).
  • ESACPUU shows 1019% "total thread" busy time (CPU Total Util).
  • With 20 cores/40 threads there are two different utilization numbers:
    • Core busy - 828% out of 20 cores (ESALPAR)
    • Thread busy - 1019% out of 40 threads (ESACPUU)
  • Both of these are important from a performance analysis perspective.

  • ESAMFC - Shows processor cache use and instruction information.

  • Processor Rate/Sec Cycles - This shows the CPU cycles used.
  • Processor Rate/Sec Instr - This shows the rate of instructions executed by the CPU.
  • Processor Rate/Sec Ratio - This shows the average number of cycles required to process an instruction. These are the numbers that are important. If the instruction rate goes up, there is more capacity and more work being done. If the workload changes and utilization goes up but instruction count goes down, that is not good. If the ratio number goes down, the cache is being used more effectively.
  • Level 1 Cache/Second Instruction Cost/Data Cost - Shows the cost of cache misses.
  • TLB CPU Cost/Cycles Lost - Also shows the cost of cache misses: cycles used for 'non-work' (such as address translation) or 'idle' due to time lost moving data from a higher level of cache/memory. Watch for changes in each of these numbers, especially when changing parking settings and/or LPAR weighting.
  • Note - In the above report:
    • This report shows 6 threads for 3 vcpus.
    • There were 16.5G cycles consumed for the day.
    • Of that 1468M were used for Instruction cache load and
    • Another 3226M were used for data cache load
    • So of the 16.5G cycles consumed,
    • 11.8G were used for executing instructions,
    • But there is also the TLB Cycles Lost that must also be added
  • The equation is then 16.5G (Processor Cycles) - (1468M (Level 1 Instruction Cost) + 3226M (Level 1 Data Cost) + 1038M (TLB Cycles Lost)) = 10.8G cycles actually used for work.
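
The arithmetic in the note above can be checked with a short sketch. The variable and function names are illustrative; the values are the ones quoted in the note, held as integers so the subtraction is exact.

```python
G = 1_000_000_000
M = 1_000_000

processor_cycles = 16_500 * M     # 16.5G cycles consumed
l1_instruction_cost = 1_468 * M   # Level 1 Instruction Cost
l1_data_cost = 3_226 * M          # Level 1 Data Cost
tlb_cycles_lost = 1_038 * M       # TLB Cycles Lost


def productive_cycles(total: int, *lost: int) -> int:
    """Subtract every category of lost cycles from the total."""
    return total - sum(lost)


after_cache = productive_cycles(processor_cycles,
                                l1_instruction_cost, l1_data_cost)
after_tlb = productive_cycles(processor_cycles, l1_instruction_cost,
                              l1_data_cost, tlb_cycles_lost)

print(f"after L1 cache cost: {after_cache / G:.1f}G")
print(f"after TLB loss too:  {after_tlb / G:.1f}G")
print(f"share used for work: {after_tlb / processor_cycles:.1%}")
```

Under these figures, roughly two thirds of the consumed cycles went to executing instructions; the remainder was spent on L1 cache loads and TLB misses.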

    SMT and Cache:

    ESAMFC - Shows processor instruction information.

  • Processor Rate/Sec Cycles/Instr/Ratio - Shows processor cache effectiveness. The lower the ratio (the average number of cycles required to process an instruction) the more work is being accomplished. When turning on SMT, watch to see if this number changes. The ratio will fluctuate with different workloads but if it goes down on average, this is a good thing.
  • Level 1 Cache/Second Instruction Cost/Data Cost - Shows the cost of cache misses. If one thread is consuming a lot of DAT, don't turn on SMT or it will get worse.
  • TLB CPU Cost/Cycles Lost - Also shows the cost of cache misses: cycles used for 'non-work' (such as address translation) or 'idle' due to time lost moving data from a higher level of cache/memory. Watch for changes in each of these numbers, especially when changing parking settings and/or LPAR weighting.
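
One way to watch the cycles-per-instruction ratio when turning on SMT is to compare a baseline sample with a later one. The sketch below assumes the ratio is simply cycles divided by instructions, as described above; the function names are illustrative and the sample rates are hypothetical, not taken from any report on this page.

```python
def cycles_per_instruction(cycles_per_sec: float, instr_per_sec: float) -> float:
    """Average number of cycles needed to process one instruction."""
    if instr_per_sec <= 0:
        raise ValueError("instr_per_sec must be positive")
    return cycles_per_sec / instr_per_sec


def ratio_change(before: float, after: float) -> float:
    """Relative change in the ratio; negative means fewer cycles per instruction."""
    return (after - before) / before


# Hypothetical samples: one taken before enabling SMT, one after.
baseline = cycles_per_instruction(cycles_per_sec=4.0e9, instr_per_sec=1.0e9)
with_smt = cycles_per_instruction(cycles_per_sec=4.5e9, instr_per_sec=1.25e9)

change = ratio_change(baseline, with_smt)
verdict = "improved" if change < 0 else "not improved"
print(f"ratio {baseline:.2f} -> {with_smt:.2f} ({change:+.1%}): {verdict}")
```

Because the ratio fluctuates with workload, a comparison like this is more meaningful on averages over comparable intervals than on single samples.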

  • ESAMFCA - Shows processor cache hit information.

  • Processor Rate/Sec Cycles/Instr/Ratio - Shows processor cache effectiveness. The lower the ratio (the average number of cycles required to process an instruction) the more work is being accomplished.
  • Data source read from L1/L2/L3/L4L/L4R/Mem - Shows the cache hits from the different levels of cache. The farther the system has to go to get the information, the higher the cost.
  • TLB Miss Instr/Data - This shows the Translation Lookaside Buffer misses for both instructions and data. The higher the number, the less actual work is being accomplished.
  • Overhead Pct Cycles Used TLB%/Total - Shows the amount of overhead caused by TLB misses.
  • RNI From Burg - Shows the Relative Nesting Intensity from the Burg formula. This is a calculation of how long it takes to load L1 cache from the different levels of cache. The smaller the number, the faster L1 cache is being refreshed and the more work is being done. RNI goes up when SMT is enabled as cache is being affected.

  • ESAMFCC - Shows processor L1 cache write analysis.

  • L2 Cache Inst/Data - Shows L1 cache writes from L2 cache. The closer the cache is to L1, the more effective it is and the less time it will take to be able to execute the instruction.
  • L3 Cache Data OnChip/OnBook/Offbk - Shows L1 cache writes from L3 cache - on the same chip, on the same book or on a different book for data.
  • L3 Cache Inst OnChip/OnBook/Offbk - Shows L1 cache writes from L3 cache - on the same chip, on the same book or on a different book for instructions.
  • L4 Cache OnBook/Offbk - Shows L1 cache writes from L4 cache - on the same book or on a different book.
  • Memory OnChip/OnBook/OffBook/OffDrawer - Shows L1 cache writes from memory - on the same chip, on/off the same book or on a different drawer. This would be the most costly.
  • SIIS - This shows the Store Into Instruction Stream (from Burg) percentage. Anything over 5% will cause impact.
  • The farther away the L1 write has to go, the more time it takes and performance will suffer. This is a good place to see cache efficiency.

  • ESAPLDV - Shows processor local dispatch vector activity

  • VMDBK Moves - Shows the number of VMDBKs that moved to a different processor, either from processor to processor or from a slave processor to the master processor. Watch for any large fluctuations. If the number of VMDBKs moved to the master starts to climb or increases sharply, investigate what is being run that must run on the master.
  • CPU Steals from Other CPUs - This shows when VMDBKs were moved from all the different levels of cache. The farther out a steal goes, the more time it takes and the worse the performance. This is another way to determine if SMT is working for a system. (Be sure to get benchmark numbers before turning on SMT).

    Chargeback with SMT:

    Presentation - SMT for z/VM Understanding Capacity Planning and Chargeback

    ESAMAIN - Shows an overview of the system.

  • Processor Utilization - This shows the current system activity.
  • SMT Prort Ratio - When SMT is enabled, this is the thread to core ratio. A number of 0.5x is excellent. It says the hardware is supporting two threads without loss. Any number under one shows SMT is providing value. Above one, SMT may still be providing capacity, but may start to impact performance.
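
The guidance on the ratio can be written as a simple classification. This is a sketch only: the function name and the returned labels are illustrative, and reading "0.5x" as any value below 0.6 is an assumption made here.

```python
def interpret_prorate_ratio(ratio: float) -> str:
    """Map an SMT thread-to-core ratio to the guidance given above."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if ratio < 0.6:   # assumption: "0.5x" is read as 0.50 to 0.59
        return "excellent"
    if ratio < 1.0:   # any number under one shows SMT is providing value
        return "providing value"
    # At or above one, SMT may still add capacity but may impact performance.
    return "check performance impact"


for sample in (0.52, 0.85, 1.10):
    print(f"{sample:.2f}: {interpret_prorate_ratio(sample)}")
```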

  • ESACPUU - Shows CPU utilization. These are z/VM times and are accurate.

  • CPU Total util - This shows the total CPU processor time for each IFL core.
  • CPU Emul time - This shows the processor utilization for virtual emulation, which is actually work done by a machine for a user.
  • CPU Overhd User - This is the overhead time that is attributed to users or user tasks.
  • Note: For the chart above, out of 365.6% total utilization, really only 353.9% (345.5 (Emul time) + 8.4 (User ovrhd)) could be charged to the users.
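
The split described in the note can be reproduced as follows. The names are illustrative; the values are the ones quoted in the note.

```python
def chargeable_utilization(emul_time: float, user_ovhd: float) -> float:
    """CPU percent attributable to users: emulation time plus user overhead."""
    return emul_time + user_ovhd


total_util = 365.6  # CPU Total util
chargeable = chargeable_utilization(emul_time=345.5, user_ovhd=8.4)
not_chargeable = total_util - chargeable  # time not attributed to any user

print(f"chargeable:     {chargeable:.1f}%")
print(f"not chargeable: {not_chargeable:.1f}%")
print(f"chargeable share of total: {chargeable / total_util:.1%}")
```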

  • ESAUSP5 - Shows user SMT CPU percent utilization by user.

  • CPU Percent Consumed Traditional Total/Virtual - This shows the time the CPU core was assigned and dispatched on a thread.
  • CPU Percent Consumed MT-Equivalent Total/Virtual - This shows the time that would have been consumed if SMT were not enabled. This also shows the cost in response time.
  • CPU Percent Consumed IBM Prorated Total/Virtual - This shows (approximately) the cycles that were really used. If the MT-Equivalent number is higher than the IBM-Prorated number, there was NO gain in capacity. Look at the response time loss vs the amount of gained capacity to see if SMT is helpful.
  • Total CPU VSI Prorated Total/Virtual - This shows the total prorated CPU busy percent as computed by Velocity and is the best number to use for chargeback.
  • Example of a chargeback scenario:

  • ESALPAR - Physical LPAR core metrics say:
    • 816% (%Assigned Total of 828.6% minus %Assigned Ovhd of 12.6%)
    • 999% (Total CPU util of 1020% minus User ovrhd of 20.9%)
  • ESACPUU - z/VM thread metrics say:
    • 1019% (Total util or 'thread busy time')
    • 972.6% (Total util of 1019% minus Sys ovrhd of 46.4%)
  • ESAUSP5 - SMT thread metrics say:
    • 951% (Traditional virtual, which excludes overhead - CPU dispatched on a thread)
    • 813% (MT-Equivalent virtual, which excludes overhead - simulated CPU with no SMT)
    • 951% (MT Prorated virtual, which excludes overhead - 'best guess' algorithm to do chargeback)
  • Which metric is the best for chargeback? Usually it would be MT Prorated. However, whatever metric is chosen, stay consistent.
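
To see why consistency matters, the candidate totals from the scenario above can be placed side by side. The sketch below only restates the figures quoted in the scenario; the dictionary keys are descriptive labels chosen here, not report field names.

```python
# Candidate chargeback totals (percent), as quoted in the scenario.
candidates = {
    "ESALPAR %Assigned minus Ovhd": 828.6 - 12.6,
    "ESALPAR CPU util minus User ovrhd": 1020.0 - 20.9,
    "ESACPUU Total util": 1019.0,
    "ESACPUU Total util minus Sys ovrhd": 1019.0 - 46.4,
    "ESAUSP5 Traditional virtual": 951.0,
    "ESAUSP5 MT-Equivalent virtual": 813.0,
    "ESAUSP5 MT Prorated virtual": 951.0,
}

lowest = min(candidates, key=candidates.get)
highest = max(candidates, key=candidates.get)
spread = candidates[highest] / candidates[lowest]

# List the candidates from smallest to largest total.
for name, pct in sorted(candidates.items(), key=lambda kv: kv[1]):
    print(f"{pct:7.1f}%  {name}")
print(f"highest / lowest = {spread:.2f}")
```

The highest and lowest totals differ by roughly 25%, so switching metrics between billing periods would change charges even if the workload did not.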

    Tips when running with SMT:

    The following are suggestions or things to remember when using SMT:


    Conclusions:

    With SMT, many of the measurements are not in agreement. In general, follow the advice below to determine whether SMT is beneficial for your system. If you have interesting data or need further assistance, contact Velocity Software.

