The Problem: Too much paging
With one Linux server running a standard profile, our
system paged so badly that performance for all users was
impacted.
Given a virtual machine size, Linux will reference most of
it, such that the working set is very close to the virtual
machine size. This makes it very difficult to share real
storage, and limits the number of Linux servers you can run
efficiently.
If Linux storage requirements exceed the real storage
available, you will page.
Linux on Intel servers does not need to 'share' storage, so this
'feature' of Linux is not a problem on dedicated
servers. But the benefit of operating many servers under
VM is the ability to share storage, cycles and disk space.
I really don't think David Boyes ran 97,000 images with
dedicated storage, especially at 128MB per image.
The problem, then, is how to make
Linux share its storage using standard features of Linux and
VM.
The solution(s): Reduce Virtual Machine Size
Minidisk cache?
Linux uses storage to cache. So does VM.
Multiple Linux servers should share disk, not cache it
privately, and use VM's ability to cache minidisks. This
provides one shared cache, not umpteen private caches.
Reducing the Linux cache size and utilizing MDC instead would
be a good experiment, but this was not the solution for us (yet).
Use swap instead of real storage?
Nobody seems to have actually published any research on the
real storage requirements of a Linux system.
The real solution is to cut the virtual machine size down
very significantly and make up the difference with swap. Linux
design considers a swap device slow, and will not use that
space unless necessary.
The choices are limited:
- Expanded storage: must be dedicated. Not a good solution.
- Cached DASD: better, but only as a last resort.
- Virtual disk: perfect....
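A minimal sketch of what setting this up might look like, assuming a 100MB virtual disk: the virtual device address 0200, the block count, and the Linux DASD device name below are illustrative assumptions and will differ per installation.

```shell
# On VM, define a virtual disk to the guest (512-byte blocks;
# 200000 blocks is roughly 100MB). This could equally be a
# V-DISK MDISK statement in the user directory:
#   CP DEFINE VFB-512 AS 0200 BLK 200000

# Inside the Linux guest, once the virtual disk is visible as a
# DASD device (device node is an assumption -- check your system):
mkswap /dev/dasdb1        # write a swap signature on the virtual disk
swapon /dev/dasdb1        # enable it as swap space
swapon -s                 # verify: list active swap devices and usage
```

Because the virtual disk is backed by VM paging space, the swap area costs real storage only if Linux actually references it.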
Going from a 128MB virtual machine to a 32MB virtual machine
with 100MB of virtual disk for swap worked. Linux still ran the
tar successfully, only quicker. Instead of an 80MB working set,
the working set was closer to 20MB. And better yet, very
little of the swap space (virtual disk) was actually
referenced. Linux requirements dropped because
Linux didn't see the resource as something that should be
used. This cut VM's real storage requirement by
60MB. And we may try this again with an even smaller
server....
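The arithmetic behind these figures can be checked directly. A small sketch, using the page counts reported in the analysis section and the standard 4KB page size:

```python
PAGE_SIZE_KB = 4  # standard page size on this architecture

def pages_to_mb(pages):
    """Convert a count of 4KB pages to megabytes (1MB = 1024KB)."""
    return pages * PAGE_SIZE_KB / 1024

base_ws = pages_to_mb(20_000)   # 128MB machine: 20,000-page working set
small_ws = pages_to_mb(4_800)   # 32MB machine:   4,800-page working set

print(f"base working set:  {base_ws:.1f}MB")   # ~78.1MB, reported as 80MB
print(f"small working set: {small_ws:.1f}MB")  # ~18.8MB, reported as 20MB
print(f"real storage saved: {base_ws - small_ws:.0f}MB")  # ~59MB, the ~60MB drop
```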
The High Level Analysis
Base case, using ROT (rule of thumb) figures. Note that the
system was thrashing hard enough to impact the performance
measurements.
Linux with 128MB: 3/8/01, 15:19-16:00 (0308DAY OUT02)
Queue and state analysis (ESAXACT report):
  PageWait:  53%
  CPUTime:   25%
  CPUWait:   15%
  Eligible:   0%  (did not use or need QUICKDSP)
Storage analysis:
  Linux001: 80MB working set (20,000 pages) (ESAUSP2 report)
  MDC:      30MB (ESAMDC report)
Paging analysis:
  Linux001:     20/second (ESAUSR2 report)
  SystemBlkPgs: 6/sec (normally .1)
System responsiveness:
  80% trivial < .2 seconds (normally 98%)
  User average response: 3 seconds (normally subsecond)
I/O analysis (DASD I/O per second):
  Total:    .8
  Vdisk:    0
  MDC:      .1
  BlockI/O: .7
2nd case, using Virtual Disk for Swap (0310NITE OUT02)
Linux with 32MB: 3/10/01, 07:00-08:00
State analysis:
  PageWait:    3%
  CPUTime:    52%
  CPUWait:    21%
  Asynch I/O: 24%
Storage analysis:
  Linux001: 20MB working set (4,800 pages)
  MDC:      70MB
Paging analysis:
  Linux001:     2/second
  SystemBlkPgs: 1/sec (normally .1)
System responsiveness:
  96% trivial < .2 seconds (normally 98%)
  User average response: 1 second (normally subsecond)
I/O analysis (DASD I/O per second):
  Total:    1.6
  Vdisk:    .1
  MDC:      .2
  BlockI/O: 1.3
3rd case, not using Virtual Disk for Swap
Linux with 32MB: 3/13/01, 10:00-11:00
State analysis:
  PageWait:    3%
  CPUTime:    60%
  CPUWait:    20%
  Asynch I/O: 18%
Storage analysis:
  Linux001: 20MB working set (4,800 pages)
  MDC:      70MB
Paging analysis:
  Linux001:     2/second
  SystemBlkPgs: 1/sec (normally .1)
System responsiveness:
  99% trivial < .2 seconds (normally 98%)
  User average response: 1.2 seconds (normally subsecond)
I/O analysis (DASD I/O per second):
  Total:    2
  Vdisk:    0
  MDC:      .4
  BlockI/O: 1.4
Reading the reports
System data:
  ESAHDR:  System configuration
  ESASSUM: Subsystem overview
  ESAMDC:  Minidisk cache size, hit rate
  ESAUSLA: System responsiveness
  ESASTR1: Storage functional requirements
  ESABLKP: Block paging analysis
Linux server data:
  ESAUSR2: Linux server CPU time, working set, page rate
  ESAWKLD: Linux wait state analysis
  ESAVDSK: Virtual disk storage, page rate
  ESAUSR3: I/O rate to MDC, VDISK
  ESAUSPx: Resource rates