Date: Tue, 9 Feb 2010 23:41:49 -0800
Reply-To: xlr82sas <xlr82sas@AOL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: xlr82sas <xlr82sas@AOL.COM>
Organization: http://groups.google.com
Subject: Response to why do we need a drop down to perl or R in SAS
Content-Type: text/plain; charset=ISO-8859-1
Hi Sas_Lers,
I was asked in a previous post about why a drop down to perl or R in
the datastep would be useful.
For a cleaner version of this message see:
http://homepage.mac.com/magdelina/.Public/utl.html
/* T004900 CREATING A BINARY EXCEL SHEET FROM SAS WITHOUT SAS ACCESS
TO PCFILES
Below is an example where a create an excel sheet without SAS-
Access to PC-Files. This works on all platforms supported by perl.
You don't even need SAS. The interesting point is perl can create
almost any file/datbase format from any other file/datbase format.
Richard Devenezia shows how a slight change to my simplified code can
export an xml file to excel. I believe the XML support in R and perl
is simpler to use than the SAS counterpart. Also perl and R have
better JAVA/CORBA/COM/ODBC/OLE-DB uou name it support. This is a very
big deal.
For other perl excel functions check out
PERL http://www.cpan.org/modules/index.html (over 2000 modules?)
R http://cran.r-project.org/web/packages/ (over 2000?)
Areas where perl and R would be useful for SAS statisticians and
programmers
perl
1. I feel perl is the best language for scripting objects like table
drive batch programs
2. Because perl allows an entire file to be a scalar you can seach
strange file structures very easily. And the text functions can be
applied to the entire file. None of this breaking the file into 32k
chunks.
3. Although I am somewhat new to hashing, it looks to me that perl
hashes are more powerful than SAS hashes. For instance perl allows
mixed type arrays ie array mxd[3] 'roger',1.32,'dick'.
Before you go out and buy some SAS components, see if what you need is
in the 2000 packages?
R
1. New statistical techniques tend to pop up in the R packages sooner
than SAS. So if you need exact confidence intervals on a binary risk
difference(not avail in 9.1.3) go to R
2. If you need meta analysis software - check out the three packages
in R
3. R seems to have a some very nice XML tools
4. If you need extended precision functionality go to R
Obviously I cannot summarize 4000 packages in this message
As a side note. I don't think SAS should worry about R and perl.
However, SAS should integrate these languages into the datastep. I had
some major probelms with a couple of R packages. On one package
repeated communiations with the author were not answered, this was a
key package and it was broken on at least one popular platform. I
would like to see SAS provide support for key perl and R packages.
SAS should focus on supporting R, perl and SAS. Even to the extent of
supporting some of the key R and perl packages. Also SAS must
differentiate its product from perl and R. SAS should charge for
'state of the art' consulting with experts in each industry. I think
is the future. Software revenue may go the same route as mainframes 15
years ago.
1. Add fuctionality to proc sql - at least add all the univariate
statistics
2. 128bit numerics
3. Drop downs to R and perl into SAS proper
4. Beef up proc report by merging cells so that graphic objects can be
imbeded in proc report, and fix the row scambling issue in proc
report, this cost me hours of labor.
5. Move IML into base. If SAS does not move IML, IML will die as
soon as SAS beefs up connectivity to R and perl through Java or some
other interface. R and perl already have excellent connectivity so the
ball is in the SAS court.
SAS needs to refocus on the fundamentals and integrate the Starship
Enterprise and Warner Brother Studios products back into base. Proc
matrix was in base SAS in the 80s.
data tst;
set xyz;
array .................
perl on;
perl code
perl off;
R on;
R code
R off;
run;
==================================================================
Here is my example
The perl script and log follows
/* create the file you want to put in excel (input dbf/mdb/oracle/
sybase/mysql.......) */
Data _null_;
file "c:\tip\csv.txt";
put '1,2,3,4,5,"Hello"';
put '1,2,3,4,5,"World"';
run;
/* the last file is the output excel file */
filename xeq pipe "c:\strawberry\perl\bin\perl c:\tip\PerlScript.pl c:
\tip\csv.txt c:\tip\PerlExcel.xls";
data _null_;
infile xeq;
input;
put _infile_;
run;
Here is the entire program and log
/* create the perl script */
data _null_;
file "c:\tip\PerlScript.pl";
input;
put _infile_;
cards4;
###############################################################################
#
# Example of how to use the WriteExcel module
#
# Program to convert a CSV comma-separated value file into an Excel
file.
# This is more or less an non-op since Excel can read CSV files.
# The program uses Text::CSV_XS to parse the CSV.
#
# Usage: csv2xls.pl file.csv newfile.xls
#
# reverse('©'), March 2001, John McNamara, jmcnamara@cpan.org
#
use strict;
use Spreadsheet::WriteExcel;
use Text::CSV_XS;
use Win32;
use Win32::Pipe;
sleep(5);
# Check for valid number of arguments
if (($#ARGV < 1) || ($#ARGV > 2)) {
die("Usage: csv2xls csvfile.txt newfile.xls\n");
};
# Open the Comma Seperated Variable file
open (CSVFILE, "$ARGV[0]" ) or die "$ARGV[0]: $!";
# Create a new Excel workbook
my $workbook = Spreadsheet::WriteExcel->new($ARGV[1]);
my $worksheet = $workbook->addworksheet();
# Create a new CSV parsing ojbect
my $csv = Text::CSV_XS->new;
# Row and column are zero indexed
my $row = 0;
while (<CSVFILE>) {
if ($csv->parse($_)) {
my @Fld = $csv->fields;
my $col = 0;
foreach my $token (@Fld) {
$worksheet->write($row, $col, $token);
$col++;
}
$row++;
}
else {
my $err = $csv->error_input;
print "Text::CSV_XS parse() failed on argument: ", $err, "\n";
}
}
;;;;
run;
Data _null_;
file "c:\tip\csv.txt";
put '1,2,3,4,5,"Hello"';
put '1,2,3,4,5,"World"';
run;
filename xeq pipe "c:\strawberry\perl\bin\perl c:\tip\PerlScript.pl c:
\tip\csv.txt c:\tip\PerlExcel.xls";
data _null_;
infile xeq;
input;
put _infile_;
run;
/* THE LOG
MLOGIC(UTLOPTS): Beginning execution.
MPRINT(UTLOPTS): OPTIONS VALIDVARNAME=UPCASE NOQUOTELENMAX OBS=MAX
FIRSTOBS=1 NOFMTERR SOURCE MACROGEN SYMBOLG
NOOVP CMDMAC ERRORS=2 SOURCE2 MLOGIC MPRINT MRECALL MERROR NOCENTER
DETAILS SERROR NONUMBER NODATE ;
MPRINT(UTLOPTS): run;
MLOGIC(UTLOPTS): Ending execution.
1942 /* create the perl script */
1943 data _null_;
1944 file "c:\tip\PerlScript.pl";
1945 input;
1946 put _infile_;
1947 cards4;
NOTE: The file "c:\tip\PerlScript.pl" is:
File Name=c:\tip\PerlScript.pl,
RECFM=V,LRECL=256
NOTE: 46 records were written to the file "c:\tip\PerlScript.pl".
The minimum record length was 80.
The maximum record length was 80.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
1994 ;;;;
1995 run;
1996 Data _null_;
1997 file "c:\tip\csv.txt";
1998 put '1,2,3,4,5,"Hello"';
1999 put '1,2,3,4,5,"World"';
2000 run;
NOTE: The file "c:\tip\csv.txt" is:
File Name=c:\tip\csv.txt,
RECFM=V,LRECL=256
NOTE: 2 records were written to the file "c:\tip\csv.txt".
The minimum record length was 17.
The maximum record length was 17.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
2001 filename xeq pipe "c:\strawberry\perl\bin\perl c:\tip
\PerlScript.pl c:\tip\csv.txt c:\tip\PerlExcel.xls";
2002 data _null_;
2003 infile xeq;
2004 input;
2005 put _infile_;
2006 run;
NOTE: The infile XEQ is:
Unnamed Pipe Access Device,
PROCESS=c:\strawberry\perl\bin\perl c:\tip\PerlScript.pl c:\tip
\csv.txt c:\tip\PerlExcel.xls,
RECFM=V,LRECL=256
NOTE: 0 records were read from the infile XEQ. (IT DID WORK EVEN
THOUGH ZERO RECORDS)
NOTE: DATA statement used (Total process time):
real time 5.15 seconds
cpu time 0.00 seconds