| Date: | Fri, 29 Apr 2005 10:41:00 -0700 |
| Reply-To: | cassell.david@EPAMAIL.EPA.GOV |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV> |
| Subject: | Re: Bad SAS AIX performance |
| In-Reply-To: | <200504291546.j3TFkTlZ023743@listserv.cc.uga.edu> |
| Content-type: | text/plain; charset=UTF-8 |
Charles Harbour <charles.harbour@PEARSON.COM> posted for Gunnar:
> The p550 is running the new Advanced POWER™ Virtualization with dynamic
> LPAR and micropartitioning managed from an HMC. All this was set up by a
> consultant. At the moment there are only 2 partitions, the virtual i/o
> server and the main running partition. This main partition has the right
> to use all available resources minus what the i/o server uses. I have 100%
> control of what's running except for what AIX and the i/o server do behind
> the curtain, so no other users. memsize = 2GB, bufno=default=1. SASWORK is
> on a FastT900 SAN system. Yes, I'm using fullstimer; this is the result
> from both p-series machines:
>
> p550:
>
> PROCEDURE GENMOD used (Total process time):
> real time 6:21:19.92
> user cpu time 3:52:36.61
> system cpu time 5.54 seconds
> Memory 801645k
> Page Faults 350
> Page Reclaims 1041561
> Page Swaps 0
> Voluntary Context Switches 457314
> Involuntary Context Switches 1152405
> Block Input Operations 0
> Block Output Operations 0
>
> p615:
>
> PROCEDURE GENMOD used (Total process time):
> real time 9:24:59.58
> user cpu time 9:13:53.78
> system cpu time 18.63 seconds
> Memory 801645k
> Page Faults 858
> Page Reclaims 693320
> Page Swaps 0
> Voluntary Context Switches 676930
> Involuntary Context Switches 2166843
> Block Input Operations 0
> Block Output Operations 0
>
> My first thought was also that some disk i/o was going on, but we have
> carefully monitored the disk with nmon64 and after loading the dataset to
> RAM it hardly uses the disk at all; look at the low page faults. We have
> also watched what the i/o partition does and its resource use is very low.
> As you can see, the p615 has a normal difference between user cpu time and
> elapsed time. We also had an issue with proc phreg on the p550, where we
> got the same big difference, but there it actually had to swap a 40GB
> working file in/out; running with the multipass option reduced the real
> time by 3 hours and there was a normal difference between user cpu time
> and elapsed time.
In both of these, the count on involuntary context switches is
massive, and page faults are really low. Could there be a problem
with the tuning of the dynamic LPAR? Could there be a problem
with underlying system processes? I'd want an AIX expert to look
at these numbers, and I'd want SAS Tech Support to try out your
cases on their AIX boxes.
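
To put numbers on the gap between CPU time and real time, here is a
minimal Python sketch that only re-uses the FULLSTIMER figures quoted
above. The helper names (to_seconds, summarize) and the dictionary
layout are illustrative assumptions, not anything from SAS or AIX.

```python
# Sketch only: figures are copied from the quoted FULLSTIMER output;
# the function names and dict layout are made up for illustration.

def to_seconds(stamp):
    """Convert 'H:MM:SS.ss', 'MM:SS.ss' or 'SS.ss' to seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

runs = {
    "p550": {"real": "6:21:19.92", "user": "3:52:36.61",
             "system": "5.54", "involuntary": 1152405},
    "p615": {"real": "9:24:59.58", "user": "9:13:53.78",
             "system": "18.63", "involuntary": 2166843},
}

def summarize(run):
    """Return CPU share of real time and context-switch rate for one run."""
    real = to_seconds(run["real"])
    cpu = to_seconds(run["user"]) + to_seconds(run["system"])
    return {
        "real_s": real,
        "cpu_s": cpu,
        "cpu_share": cpu / real,          # fraction of wall time on CPU
        "unaccounted_s": real - cpu,      # wall time not spent on CPU
        "involuntary_per_s": run["involuntary"] / real,
    }

if __name__ == "__main__":
    for name, run in runs.items():
        s = summarize(run)
        print(f"{name}: cpu share {s['cpu_share']:.1%}, "
              f"unaccounted {s['unaccounted_s'] / 3600:.2f} h, "
              f"involuntary switches/s {s['involuntary_per_s']:.1f}")
```

On these figures the p550 step spends roughly 61% of its real time on
the CPU, against roughly 98% on the p615. That is the gap to explain;
the sketch says nothing about where the missing time goes.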
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist