The Issue:
With dedicated servers, running even a gigabyte of RAM, storage
reference patterns were never an issue. But with VM, there is a
significant advantage to running many servers with shared
memory. If one server does not need its storage, then that storage
should be available to other servers.
Unfortunately, Linux has not been taught to share. Analysis of
both idle and non-idle servers shows that Linux references most of
its storage on a frequent basis. If you define a virtual machine
size of 128MB, a Linux server will address all of it. This makes
its working set close to 128MB. Put 10 Linux servers on a 512MB
box and, even though idle, these servers will page.
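The arithmetic behind that example can be sketched as follows. This is
only an illustration: the function name and its fields are hypothetical,
and it assumes, as the text describes, that each idle guest references
nearly all of its defined virtual machine size.

```python
def paging_pressure(guests: int, guest_mb: int, real_mb: int) -> dict:
    """Compare the combined working set of the guests with real storage."""
    # Assumption taken from the text: a Linux guest references almost all
    # of its defined size, so its working set is roughly that size.
    demand_mb = guests * guest_mb
    return {
        "demand_mb": demand_mb,
        "overcommit_ratio": demand_mb / real_mb,
        # Storage that cannot stay resident and so has to be paged.
        "shortfall_mb": max(0, demand_mb - real_mb),
    }


# The example from the text: 10 guests of 128MB on a 512MB box.
result = paging_pressure(guests=10, guest_mb=128, real_mb=512)
print(result)
```

With these numbers the guests together reference 1280MB, 2.5 times the
real storage, so roughly 768MB cannot stay resident even when the
servers are idle.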
On a dedicated server that must do I/O to potentially very slow
disk drives, Linux and Unix make good use of storage as a cache.
This, however, is detrimental in a VM environment, where
minidisk cache (MDC) may already be caching data for several
guests. One shared cache is certainly more cost effective than
many guests (hundreds or thousands) all caching the same data
internally.
So from a storage perspective, there are two issues: 1) many
servers caching identical data and using up storage, and 2) Linux
referencing every 'real page' often enough that paging on VM is required.
The Solution:
Linux does NOT need all of its storage. Many Linux
administrators will tell you not to use swap if you can
avoid it. But what if that swap was not a slow SCSI device
but a very fast Virtual Disk? This turns out to be an
optimal solution. Drop the storage size of your Linux
server to 32MB or even 24MB if you can, and define a virtual
disk as a swap disk to make up the difference.
The result is that Linux greatly reduces the amount of storage
referenced, your storage requirements go down, your paging
goes down, your virtual disk takes up the slack, and you've
taught Linux to share its storage.
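As a rough sizing sketch of that approach, the difference between the
original and the reduced server size gives the swap virtual disk size.
The function and field names below are hypothetical, and the sketch
assumes that an idle guest references only its reduced size; it ignores
any real storage VM needs to back virtual disk pages that are actually
in use.

```python
def vdisk_plan(guests: int, original_mb: int, reduced_mb: int, real_mb: int) -> dict:
    """Size a swap virtual disk that makes up for a smaller guest."""
    # The virtual disk makes up the difference between the old and new size.
    swap_vdisk_mb = original_mb - reduced_mb
    # Assumption: each idle guest now references only its reduced size.
    referenced_mb = guests * reduced_mb
    return {
        "swap_vdisk_mb": swap_vdisk_mb,
        "referenced_mb": referenced_mb,
        "fits_in_real": referenced_mb <= real_mb,
    }


# 10 guests reduced from 128MB to 32MB on a 512MB box.
plan = vdisk_plan(guests=10, original_mb=128, reduced_mb=32, real_mb=512)
print(plan)
```

Under these assumptions each guest gets a 96MB swap virtual disk, and
the ten guests together reference 320MB, which fits within the 512MB
of real storage.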
See the VDISK case study for the associated case study,
and VDISK implementation for implementation tips.