Date: Thu, 11 Jan 2007 17:10:33 -0800
Reply-To: Robert <callingrw@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Robert <callingrw@YAHOO.COM>
Organization: http://groups.google.com
Subject: A new statistical programming language
Vilno is a new data-crunching programming language. It is available as
a file attachment to the August 31 blog-cast at
www.my.opera.com/datahelper . More information is at
www.xanga.com/datahelper and datahelper.blogspot.com .
The positive: The syntax of Vilno is more innovative than that of SAS
or SPSS, which lets you achieve more data crunching with less code.
The resulting productivity gap between the Vilno data processing
function and the SAS datastep (or SPSS data crunching) will only widen
over time, because the internal architecture of Vilno gives it a lot of
room to grow (for versions 2.0, 3.0, etc.). Also, the source code for
Vilno is probably tiny compared to the accumulated source at SAS or
SPSS, which should make Vilno much easier to enhance and extend.
The negative: Vilno is not yet ported to Apple/Windows. It still needs
a library of mathematical functions and date/time functions
(particularly important for data crunching). It is not yet extended and
integrated with a library of statistical functions (regression, ANOVA,
etc.).
DATA ANALYSIS = DATA CRUNCHING + STATISTICAL ANALYSIS
Data crunching has many names: data cleansing, data preparation, data
munging. It is the less glamorous of the two halves, but by far the
more time-consuming. You cannot do proper data analysis without it.
Statistical analysis is the application of mathematical procedures to
produce analysis statistics and p-values. The choice and interpretation
of these statistical procedures require some knowledge of applied
mathematics (i.e. statistics). Many people find this activity to be
far more interesting than data crunching (I, however, find data
crunching to be a fascinating challenge).
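To make the split between the two halves concrete, here is a minimal
sketch in Python (not Vilno syntax, which is not shown in this post).
The records, the crunch() and welch_t() helpers, and the group labels
are all hypothetical, invented only for illustration. The crunching
step normalizes labels and drops unusable values; the analysis step
computes a Welch t statistic. Looking up the p-value is left out,
since the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical raw records: (subject id, treatment group, measurement as text).
RAW = [
    ("001", "A ", "5.1"),
    ("002", "a", "4.9"),
    ("003", "B", "6.2"),
    ("004", "b", ""),       # missing measurement
    ("005", "B", "6.0"),
    ("006", "A", "n/a"),    # unparseable measurement
    ("007", "A", "5.3"),
    ("008", "B", "6.4"),
]


def crunch(rows):
    """Data crunching: normalize group labels and drop unusable values."""
    groups = {}
    for _subject, group, value in rows:
        try:
            x = float(value)
        except ValueError:
            continue  # skip rows whose measurement cannot be parsed
        groups.setdefault(group.strip().upper(), []).append(x)
    return groups


def welch_t(a, b):
    """Statistical analysis: Welch's t statistic for two independent samples."""
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / sqrt(var_a / len(a) + var_b / len(b))


clean = crunch(RAW)
t_stat = welch_t(clean["A"], clean["B"])
print(clean)
print(round(t_stat, 3))
```

Even in this toy case the crunching function is longer than the
analysis function, which is the point made above about where the time
goes.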
S-Plus (or R) is good at statistical analysis, but not at data
crunching.
Vilno is excellent at data crunching (date/time functions aside), but
does not yet do statistical analysis.
In data crunching/preparation, there has been a dramatic slowdown in
productivity growth over the last 20 years. I attribute this to a
software monopoly: a lack of competition leads to a slowdown in
creativity and innovation.
All three major statistical programming languages (S, SPSS, SAS) are
at least three decades old. It's time to shake things up a bit.