Hi Larry,
I hope I'm not wasting your bandwidth, but
"there's always the delete key" ...
I'd like to support the more technical articles, demonstrating SAS
programming outside of the purely applications/user program area.
Where the straightforward "normal standard SAS program" works fine,
there is *no need whatever* for exotic methods.
So tell me, Larry (Technical Manager), that in your 14 years' exposure
to SAS those "works fine" boundaries have *never* been crossed for you!
The procedures of the SAS System assume little, or no, pre-knowledge of
the information in the data you analyse.
You, or your business users, or even your application development SAS
programmers, will probably recognise more of the simple facts about your
data than a "normal standard SAS" procedure does when it starts.
It is this pre-knowledge which enables the more exotic solutions to beat
the pants off standard SAS.
Some examples might give you cause to be more supportive.
1 proc tabulate
This provides fantastic formatting and labeling support for cross-tab
analyses.
But have you ever applied it where the deepest cross has, say, more
than 5 class vars, some of which have over 200 unique values (within
the input dataset) and others have, say, 50 - 100 unique values?
This can't be atypical in the data warehousing analysis field. But I
first suffered this problem in a data set of fewer than 5000 obs.
I doubt that this problem is one you have been exposed to.
It eats run-time *till next week*, once real memory has been exhausted.
When offering a real problem to SI tech support, I usually have to
distil out the business side for confidentiality and clarification.
The problem which overwhelms tabulate can be demo-ed with data reduced
to 3 obs but using an increasing number of cross classes. Let me know if
you want to try this yourself, to be aware of the limits of standard
SAS in just one context.
2 merging in a data step
How often is a sort step used in a program when only a few obs are out
of order? This may be crucial when the data volumes are large. When
the design results in this scenario, the normal standard SAS program
solution makes us *all* wait. It is not just (or even) the programmer
who will wait, but the program user, and all other users of the
processor which has been hogged by proc sort.
Yet a "hardly complicated" design consideration would eliminate the
problem before it arose, because the designer of the system knows more
about the information than a normal standard SAS program does.
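
To make the point concrete, here is one possible sketch (not necessarily
the only design, and the names are hypothetical). It assumes a data set
"big" that is nearly in order by a numeric key "id". One pass splits off
the few stragglers, only those are sorted, and an interleave puts them
back, so the full data set never goes through proc sort.

```sas
/* Hypothetical sample: nearly sorted by id, one obs out of order */
data big;
  input id value;
  datalines;
1 10
2 20
5 50
3 30
6 60
7 70
;
run;

/* One pass: keep obs that continue the ascending run,
   divert the ones that break it */
data inorder outoforder;
  set big;
  retain _last;                 /* highest id written to inorder so far */
  if id >= _last then do;       /* a missing _last sorts low, so obs 1 passes */
    output inorder;
    _last = id;
  end;
  else output outoforder;
  drop _last;
run;

/* Sort only the stragglers */
proc sort data=outoforder;
  by id;
run;

/* Interleave two sorted streams; no sort of the full data */
data big_sorted;
  set inorder outoforder;
  by id;
run;
```

This only pays off when the stragglers really are few. A single early obs
with a very high key would push nearly everything after it into
"outoforder", so the designer's knowledge of the data still matters.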
3 lookup tables / hashing
There are some enormous sets of data around, challenging the
statisticians' model solutions, many of which are historically based
(score card model theory is over 40 years old!).
Data volumes are higher now. That brings non-statistical problems.
Computer systems have advanced enormously in even just the last third of
that time (as you can testify). The expectation of users may have
crossed from "over-eager anticipation" into the "reality of waiting",
but competitive business decision information will always be needed
"like yesterday, or even sooner". Sometimes, I just can't "wait a week"
for the normal standard sas program to tagsort very large sets of data,
even if I could obtain all the resources (disk space and processor
power) needed.
Model solutions involving table-lookup have been included among the
sample programs offered by the SAS Institute with each installation.
These demonstrate using arrays and formats for look-up tables as
alternatives to data step merge. There are a great many other demos too.
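
As a minimal sketch of the format technique (this is my own small
illustration, not one of the Institute's sample programs, and the data
set, variable and format names are hypothetical): build a format from the
lookup table with CNTLIN=, then use PUT() in a single pass of the large
data set. Neither data set needs sorting.

```sas
/* Hypothetical lookup table: custid -> region */
data lookup;
  length region $12;
  input custid region $;
  datalines;
101 North
102 South
103 East
;
run;

/* Turn the lookup table into a CNTLIN control data set */
data cntl;
  set lookup(rename=(custid=start region=label)) end=last;
  retain fmtname 'regfmt' type 'N';   /* numeric format REGFMT. */
  output;
  if last then do;                    /* catch-all row for unmatched keys */
    hlo   = 'O';
    label = 'UNKNOWN';
    output;
  end;
run;

proc format cntlin=cntl;
run;

/* Hypothetical "large" data set, deliberately unsorted */
data trans;
  input custid amount;
  datalines;
102 50
999 20
101 75
;
run;

/* The lookup itself: one pass, no sort, no merge */
data result;
  set trans;
  length region $12;
  region = put(custid, regfmt.);      /* 102 -> South, 999 -> UNKNOWN */
run;
```

The format lives in memory, so this trades memory for the sort. Whether
that trade is worthwhile depends on the size of the lookup table, which is
exactly the kind of fact the designer knows and the procedure does not.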
I think it's not only the explosion of data volumes, but also the offer
of "possible solutions" from competitors, which drives the search for
faster solutions for large sets of data.
Another (final) reason I want technical programming theories and
examples to feature on SAS-L is as an antidote to questions for which
the correct response might be
RTFM
In article <914199038.2121222.0@vm121.akh-wien.ac.at>, david pider
<dpider@HOTMAIL.COM> writes
>I wonder if somebody else on the list is annoyed by certain academic
>types polluting the list with their homegrown "routines". For instance,
>what is this 'hahsing' BS?? To code something like that one must've
>never worked in the industry. I've been programing for years and never
>even heard the term! Anybody ever saw it in any SAS manual? Who'd use it
>in the real world? No sane manager will allow this kind of stuff in
>production. I'd fire anyone on the team who'd have the audacity to code
>a monstrosity like that and claim it works better than merge or sql.
>Even if it ran 10% faster, so what? Who'd be able to understand it after
>that self appointed guru is let go (or rather fired)? Plus in all my
>years in SAS I never seen anything "explicitly coded" work faster than
>what is already there in SAS. I'm not so gullible to trust so called
>"test results" in those posts, it can be anything, how do we know they
>aren't concocted?
>
>Or what about this 'big format' thing? Who in his right mind would use a
>5 million format just to run out of memory, and then the the one who
>"coded" it gets called and paid big bucks to fix it. If simple merge was
>used nobedy would have no problem in the first place. What I'm saying is
>stick to normal standard SAS program. Formats weren't intended for this
>kind of (ab)use so it's not standard.
>
>BTW, I've noticed that those posting those extravagant "methods" are
>never on the money trying to answer a normal question about standard SAS
>coding. Anybody thinks it's a coincidence?
>
>Don Pider,
>Technical Manager,
>14 years of SAS
wishing you all the compliments of the season
--
Peter Crawford