Date: Thu, 26 Jan 2006 17:14:34 -0800
Reply-To: David L Cassell <davidlcassell@MSN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: David L Cassell <davidlcassell@MSN.COM>
Subject: Re: To SAS or not To SAS (or whatever else)!!,
In-Reply-To: <200601262040.k0QKIfnR001122@mailgw.cc.uga.edu>
Content-Type: text/plain; format=flowed
neilanessa@MSN.COM wrote:
>This message is directed primarily to individuals in these groups who have
>long experience using both SPSS and SAS (or any other statistically
>oriented/data intensive analytic tools) with a heavy data volume and
>frequent reporting requirements. I am pretty open to any and all
>suggestions.
>
>In any normal context I wouldn't be caught dead with a SAS manual,
>and my obit will probably read, "We had to pull the SPSS manual from
>his cold dead hands, and needed to break some fingers to boot."
Why is it that I am thinking of the scene from "Men In Black" where Edgar
looks down into the crater? :-)
>BUT, I recently accepted a position where part of my responsibilities
>will involve selecting off-the-shelf database tools, analytical software,
>and programming tools, and using them to build rather large-scale
>solutions.
>
>
>Questions I need to answer to management:
>
>Relative efficiency and ease of data access via a Database (ODBC
>connection).
>-We will be building a fairly large data base containing both
>contemporaneous and historical data. What I mean by fairly
>large most people would consider unfathomably HUGE.
>Multivariate Data feeds every ten minutes over a year. Often there will be
>multiple years from multiple sites.
In my experience, SAS is a lot better here. It has a lot of facilities for
data warehousing and large-scale database management that I have not found
in SPSS. Data sets in petabytes? No problem, if you have the disk space.
SAS also has a lot of data manipulation methodologies which scale
reasonably well to levels that you will face. As you know, simple merge and
sort are not our friends when we have gigabytes to address.
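To make the merge-and-sort point concrete, here is a minimal sketch of the
general idea of replacing a sort-then-merge with a keyed lookup. It is
written in Python rather than SAS, and the site IDs, site names, and
readings are all made up for illustration. It is not a description of any
particular SAS facility, just the shape of the technique: keep the small
table in memory, keyed by ID, and stream the big table past it once.

```python
# Hypothetical small lookup table: site ID -> site name.
site_info = {"A01": "North Plant", "B02": "South Plant"}

# Hypothetical large, unsorted stream of readings: (site ID, timestamp, value).
readings = [
    ("B02", "2006-01-26 17:10", 3.2),
    ("A01", "2006-01-26 17:10", 1.7),
    ("B02", "2006-01-26 17:20", 3.4),
    ("C03", "2006-01-26 17:20", 9.9),  # site with no entry in the lookup table
]


def join_by_lookup(readings, site_info):
    """Attach the site name to each reading in one pass, without sorting."""
    for site_id, timestamp, value in readings:
        name = site_info.get(site_id)
        if name is not None:  # drop readings whose site is unknown
            yield (site_id, name, timestamp, value)


joined = list(join_by_lookup(readings, site_info))
for row in joined:
    print(row)
```

The large table is never sorted and is read only once, so the cost grows
with the number of readings rather than with the cost of sorting them. The
trade-off is that the lookup table has to fit in memory.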
>Ease of use and training/learning curve for new users (I am a seasoned
>SPSS professional with 20 years of experience with SPSS, I used SAS
>in Graduate school, but it seemed like having a root canal without a
>local).
I didn't think there was much difference between learning SAS and learning
SPSS, in terms of learning curve. But I'm probably an outlier anyway. They
are different.
Now that SAS has Enterprise Guide, new users don't have to learn how to
do 50 things in a DATA step and 180 different procs. Look into it. And let
me put in a plug for Delwiche and Slaughter's new book on it. Their "The
Little SAS Book: A Primer" is already a classic.
>Cost of licensing.
SPSS wins. Unless Paul Allen is throwing money at you to build a new sports
team using SAS.
>Flexibility of Output (Is SAS still text based output?, what do other
>statistical/reporting software solutions generate).
SAS has not been text based since about SAS 7. SAS uses ODS and separates
the process from the output delivery. And if you have SAS/ACCESS for PC
File Formats (or a willingness to use XML or HTML or RTF as your output
format), then you have an easy way to get the output into the apps users
want.
>Graphical capability (mostly sequence charts, histograms, bar charts) .
Improved in SAS. A lot. Still not up to what you can do in a lot of other
tools. But if you face having to generate hundreds or thousands of plots
(say, for drilldown on your website) then I would prefer to build the plots
in SAS over SPSS. ODS Statistical Graphics (new) also lets you get at a lot
of plots that *I* care about without killing yourself to crank out
diagnostics by hand.
>Output Exports (Word, Excel)?
CSV, XML, HTML, RTF, LaTeX, PDF, etc. by default.
MS formats at extra cost, as noted above.
Graphics to be pulled into a Word document: seriously improved.
>Customizability, External Programmability. AUTOMATION!!!
>(Preferably from VB.Net or C#.Net -yeah, I'm tossing VB6 into the
>history bin!-)
I like Perl for this. But that's me. Look at the SAS-L posts by Alan
Churchill on this subject, or look at his website. .NET is not only usable
as an external tool, but C# code can actually be accessed from *inside*
SAS now.
In terms of automation, I feel that SAS is better structured than SPSS
(or Stata or ...). With massive amounts of data, you'll want to stress this
aspect.
>Quality and ease of use/customizability of the User Interface.
See my comments on Enterprise Guide. If you have big bucks to make
executives happy, there's also a good EIS product.
>Data Export capability/flexibility
Way better than it used to be. Look up PROC EXPORT. Also look up stuff
like the ExcelXP tagset. Paul Choate is giving a paper at SUGI on exporting
to Excel, and he was talking about that earlier today in SAS-L. Check out
that thread. The whole ODS thing has totally opened up what used to be a
fairly closed system.
One really key feature is that you can now get a single table that you care
about out of a proc like PROC CORR or PROC REG, instead of getting a ton of
output and then having to hack it all up to get the chunk you really wanted.
Plus, you can automatically have the single table put in the format you
wanted, without having to jump through additional hoops.
>The analytical/statistical reporting side is not terribly complex but the
>data volume will be immense and multiple people will be using the
>data at any given moment. I hope this is clear wrt our requirements.
SAS does now have facilities for managing data table access. You'll want
that.
>BTW,
>I read the Comparisons document produced by Michael Mitchell at UCLA
>and have even posted my own comments on one of the SPSS lists
>(possibly this one *WHEN IS SPSS inc going to respond???*). His report
>might be useful to university students/professors, but fails to address
>the needs of people trying to make decisions/recommendations in the
>context I currently find myself, so please DO NOT SUGGEST I use that
>information as a guide (too much of it is simply incorrect -read my post
>for some examples!)
Even Mike Mitchell would not suggest that. :-) His report is specifically
aimed at his target audience at UCLA. And he has plenty of errors on the
SAS and Stata sides too. It's still a working document.
>Thanks, Neila
>Feel free to email me directly, but I believe it will be useful to have
>this discussion in the public forum
I agree with you.
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330