Date: Mon, 10 Jan 2011 13:51:45 -0600
Reply-To: Bhupinder Farmaha <bhupi80singh@YAHOO.CO.IN>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Bhupinder Farmaha <bhupi80singh@YAHOO.CO.IN>
Subject: Re: Creating subset of data and calculate weighted means for depth
In-Reply-To: <8353172EA8F14C05B09AC2F4A1B46C8B@D1871RB1>
Content-Type: text/plain; charset="us-ascii"
Nat,
I noticed that you have asterisked the following variable as
* format Loc best12. ;
By looking at the format type it gives no specification. Is it okay?
Thanks
Bhupinder
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Nat
Wooding
Sent: Monday, January 10, 2011 1:17 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Creating subset of data and calculate weighted means for depth
Buphinder
In this case, depth has to be a character since 5-10 is not a number. SAS
keeps data as either floating point numeric or as characters. Art's post
shows how to convert the ranges into a numeric code.
I converted your xls file to a CSV and was able to run Import on it.
However, there were errors since Import did not recognize that Loc was a
character variable. I copied the log and removed the line numbers ( I was
too lazy to look up the way that you can save the generated code). I also
removed the statements that make the job fail when there is an error in the
data. I now have
data WORK.SOILS ;
infile 'c:\natsas\PKCSSoildata200709.csv' delimiter = ',' MISSOVER DSD
lrecl=32767
firstobs=2 ;
informat Year best32. ;
informat Field $2. ;
informat Crop $10. ;
informat Sampling_stage $8. ;
informat Plot best32. ;
informat Block best32. ;
informat Tillplc $4. ;
informat Prate best32. ;
informat Krate best32. ;
informat Fert $7. ;
informat Loc $7. ;***;
informat Depth $6. ;
informat Phos best32. ;
informat Potas best32. ;
informat Calc best32. ;
informat Magn best32. ;
format Year best12. ;
format Field $2. ;
format Crop $10. ;**;
format Sampling_stage $8. ;
format Plot best12. ;
format Block best12. ;
format Tillplc $4. ;
format Prate best12. ;
format Krate best12. ;
format Fert $7. ;
* format Loc best12. ;
format Depth $6. ;
format Phos best12. ;
format Potas best12. ;
format Calc best12. ;
format Magn best12. ;
input
Year
Field $
Crop $
Sampling_stage $
Plot
Block
Tillplc $
Prate
Krate
Fert $
Loc $
Depth $
Phos
Potas
Calc
Magn
;
run;
which are the base SAS statements that will read your file. You will note
that I have altered the a couple informat and formats as well as Loc in the
Input statement. This code will run a lot faster than Proc Import since
Import first has to generate these statements.
If I run a subsetting If crop = "Corn" then delete;
The log shows
533 Data crops;
534 set soils;
535 if crop ='Corn' then delete;
536 run;
NOTE: There were 1500 observations read from the data set WORK.SOILS.
NOTE: The data set WORK.CROPS has 1152 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
which means observations were deleted.
Nat
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
Bhupinder Farmaha
Sent: Monday, January 10, 2011 1:16 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Creating subset of data and calculate weighted means for depth
Nat,
The following variable are listed as character variable and their type in
the form of "$" sign.
Crop,
Depth (it should not be character or num because it has values like 0-5,
5-10,..), Fert Field Loc Sampling_stage Tillplc
If I use the "if - then statement" as if "crop = 'Corn' then delete;" then
it won't delete that dataset.
Thanks
Bhupinder
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Nat
Wooding
Sent: Monday, January 10, 2011 12:03 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Creating subset of data and calculate weighted means for depth
Bhupinder
Which many of the data variables that your are reading are coming in as
character values which is why new numeric variables are being created
through multiplication by 1. This, by the way, is not the best coding
practice but we can ignore it for now.
If you write
If crop = 'Corn' then delete;
Are the values for corn removed?
Nat
_____
From: Bhupinder Farmaha [mailto:bhupi80singh@yahoo.co.in]
Sent: Monday, January 10, 2011 1:00 PM
To: 'Nat Wooding'; SAS-L@LISTSERV.UGA.EDU; Arthur Tabachneck
Subject: RE: Creating subset of data and calculate weighted means for depth
Hi,
I ran "data contents" to see if there is any trouble with the format. I saw
that variable listed as character type have format something like "$8.",
"$4.".. May be it is causing trouble in the data?
Nat: I did try to imbed data as it is in SAS (10,440 values) but it still
didn't resolve the issue. I am running SAS ver 9.2.
Thanks
Bhupinder
From: Bhupinder Farmaha [mailto:bhupi80singh@yahoo.co.in]
Sent: Monday, January 10, 2011 10:55 AM
To: 'Nat Wooding'; 'SAS-L@LISTSERV.UGA.EDU'; Arthur Tabachneck
(art297@ROGERS.COM)
Subject: RE: Creating subset of data and calculate weighted means for depth
Hello Art and Nat,
Thanks for quick reply. I am hereby posting the part of my dataset along
with the question that I have on this data. When crop is "none", it shows
initial phosphorus, potassium, calcium, and magnesium levels of the field at
the start of experiment (2007). Then at the end of each year (2007, 2008,
and 2009) we measured these parameters again to quantify the effect of
treatments at different depths and different parts of the plots, aka, "loc"
(IR-at the planting row; BR- between two planting rows). My second question
in the original post came from the idea to quantify how much soil phosphorus
increased or decreased for top 18 cm (weighted means of 0-5, 5-10, and
10-18) at the end of each year from initial values as well as from the first
year. This can be calculate for locations separately and also for overall
each plot.
The file also has data from other half part of the field for corn crop which
won't get utilized in this analysis.
Soybean experiment (2007 - 09) was planned as RCBD with split split plot
arrangements of treatments with three replications. Three tillage treatments
(NTBC, NTDP, and STDP) were done in three blocks as main treatments which I
am calling as whole plots. Each block (whole plot) was divided into four
parts to apply four phosphorus levels (0, 12, 24, and 36 kg/ha) which I am
calling as sub-plots. These sub-plots were further divided into four parts
to apply four potassium levels (0, 42, 84, and 168 kg/ha) which I am calling
as sub-sub-plots or "PLOTS". For year 2008 and 2009, we sampled all the
plots but in the third year, we sampled plots only receiving 5 particular
fertility treatments out of 16 combinations of P and K. Therefore, I have
fertility (P-K) as category.
We sampled soil at two particular "locations" (within two planting rows and
at the planting rows) and four depths (0-5, 5-10, 10-20, and 20-40) within
PLOTS.
Year
Field
Crop
Sampling_stage
Plot
Block
Tillplc
Prate
Krate
Fert
Loc
Depth
Phos
Potas
Calc
Magn
2007
F2
none
Spring07
401
4
NTBC
0
84
P0K84
.
0-5
18
286
1570
219
2007
F2
none
Spring07
401
4
NTBC
0
84
P0K84
.
5-10
9
130
1734
182
2007
F2
none
Spring07
401
4
NTBC
0
84
P0K84
.
10-18
6
70
1503
164
2007
F2
none
Spring07
401
4
NTBC
0
84
P0K84
.
18-50
3
81
1885
300
2007
F2
none
Spring07
402
4
NTBC
0
0
P0K0
.
0-5
19
169
1429
204
2007
F2
none
Spring07
402
4
NTBC
0
0
P0K0
.
5-10
7
66
1791
187
2007
F2
none
Spring07
402
4
NTBC
0
0
P0K0
.
10-18
6
56
1647
162
2007
F2
none
Spring07
402
4
NTBC
0
0
P0K0
.
18-50
3
76
2092
303
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
2007
F2
Soybean
Fall07
401
4
NTBC
0
84
P0K84
BR
0-5
10
251
1528
207
2007
F2
Soybean
Fall07
401
4
NTBC
0
84
P0K84
BR
5-10
7
99
1524
166
2007
F2
Soybean
Fall07
401
4
NTBC
0
84
P0K84
BR
10-18
3
66
1406
151
2007
F2
Soybean
Fall07
401
4
NTBC
0
84
P0K84
BR
18-50
2
109
2355
367
2007
F2
Soybean
Fall07
401
4
NTBC
0
84
P0K84
IR
0-5
5
78
1530
158
2007
F2
Soybean
Fall07
401
4
NTBC
0
84
P0K84
IR
5-10
6
118
1617
176
2007
F2
Soybean
Fall07
401
4
NTBC
0
84
P0K84
IR
10-18
10
259
1463
218
2007
F2
Soybean
Fall07
401
4
NTBC
0
84
P0K84
IR
18-50
3
99
2001
330
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
2008
F1
Soybean
Fall08
101
1
NTDP
36
168
P36K168
BR
0-5
39
349
1664
291
2008
F1
Soybean
Fall08
101
1
NTDP
36
168
P36K168
BR
5-10
37
238
1710
237
2008
F1
Soybean
Fall08
101
1
NTDP
36
168
P36K168
BR
10-18
101
304
1941
233
2008
F1
Soybean
Fall08
101
1
NTDP
36
168
P36K168
BR
18-50
6
185
3074
543
2008
F1
Soybean
Fall08
101
1
NTDP
36
168
P36K168
IR
0-5
36
303
1740
258
2008
F1
Soybean
Fall08
101
1
NTDP
36
168
P36K168
IR
5-10
22
180
1829
242
2008
F1
Soybean
Fall08
101
1
NTDP
36
168
P36K168
IR
10-18
65
236
2167
253
2008
F1
Soybean
Fall08
101
1
NTDP
36
168
P36K168
IR
18-50
7
165
2631
491
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
2009
F2
Soybean
Fall09
402
4
NTBC
0
0
P0K0
BR
0-5
8
190
1481
266
2009
F2
Soybean
Fall09
402
4
NTBC
0
0
P0K0
BR
5-10
5
116
1374
211
2009
F2
Soybean
Fall09
402
4
NTBC
0
0
P0K0
BR
10-18
5
97
1646
204
2009
F2
Soybean
Fall09
402
4
NTBC
0
0
P0K0
BR
18-50
3
94
1867
299
2009
F2
Soybean
Fall09
402
4
NTBC
0
0
P0K0
IR
0-5
9
195
1288
237
2009
F2
Soybean
Fall09
402
4
NTBC
0
0
P0K0
IR
5-10
5
117
1297
177
2009
F2
Soybean
Fall09
402
4
NTBC
0
0
P0K0
IR
10-18
4
81
1543
172
2009
F2
Soybean
Fall09
402
4
NTBC
0
0
P0K0
IR
18-50
3
85
1866
266
To handle this analysis, I need to subset this data for each MIXED model. I
just want to keep only one data file to avoid confusion.
In log of both the datasets, it doesn't give any error but also not do
sub-setting of data. I am sorry if I confused you guys more.
In the code I posted in previous email, first dataset is to import data
successfully and remove error if there is any due to different character
specified in the soil variable columns.
Thanks
Bhupinder
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Nat
Wooding
Sent: Monday, January 10, 2011 9:57 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Creating subset of data and calculate weighted means for depth
Bhupinder
Thank you for posting the code. Are you running V9.2? If so, you can
eliminate the Proc Import and use the excel file as if it were a SAS data
set.
I seldom used Proc Import and my SAS copy will not read the Excel engine so
I cannot test a variation of your code.
Please post the log for your two data steps. That would tell us a lot. Also,
how may crops are in the soils data set?
I will be away from the computer for a while but the log would really help
anyone else who is on line.
Nat
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
Bhupinder Farmaha
Sent: Monday, January 10, 2011 10:40 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Creating subset of data and calculate weighted means for depth
Hi Nat,
I made code with " " and the one your suggested. But, unfortunately, none of
them not worked. I am posting my code complete code here to see if there is
something wrong.
PROC IMPORT OUT= WORK.Soils
DATAFILE= "C:\Users\Bhupinder\Desktop\Roots, soil, and soil
water\PKCSSoildata200709.xls"
DBMS=EXCEL REPLACE;
RANGE="CompleteCS$";
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
data Soils;
set Soils;
Phos1=Phos*1;
Potas1=Potas*1;
Calc1=Calc*1;
Magn1=Magn*1;
drop Phos;
drop Potas;
drop Calc;
drop Magn;
rename Phos1=Phos;
rename Potas1=Potas;
rename Calc1=Calc;
rename Magn1=Magn;
run;
data soils; set soils;
if crop = "corn" then delete;
if crop = "soybean" then delete;
run;
proc print data = soils;
run;
Thanks
Bhupinder
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Nat
Wooding
Sent: Monday, January 10, 2011 9:19 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Creating subset of data and calculate weighted means for depth
Bhupinder
In your code, the line
If crop = corn then delete;
Will not work as you wish for at least one reason, that corn is not enclosed
in quotation marks. This assumes that you want to select the crop called
"corn". As the code stands, the statement will attempt to test the value of
the variable Crop with the value of another variable, Corn which I presume
does note exist in set Soils. Take a look at the log of the following code.
data crops;
informat crop $10.;
input crop @@;
cards;
corn
corn
wheat
rye
barley
run;
Data wanted;
set crops;
if crop = corn;
run;
If you enclose Corn in quotation marks, then the If statement will work.
Your If statement, with quotations, would delete all data for Corn. Is this
what you want or do you wish to select the values for corn? Without a sample
of your data, I can only guess.
I am going to ignore your second question for the moment until we get the
first one resolved.
Nat Wooding
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
Bhupinder Farmaha
Sent: Monday, January 10, 2011 10:00 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Creating subset of data and calculate weighted means for depth
Hi,
I want to create subset of data based on one level of independent variable.
I have used the following code to create subset of the data but it didn't
work.
data spring07; set soils;
if crop = corn then delete;
run;
Is there any way to model with variable ne (^=). I did try but it is not
working.
My second question is to calculate weighted means based on depths from that
particular subset of data. I want to get weighted means of Phos variable
based on depth as per following formula
Phos(weighted0-40) = (5*Phos1 + 5*Phos2 + 10*Phos3 + 20*Phos4)/40.
Phos(weighted0-20) = (5*Phos1 + 5*Phos2 + 10*Phos3)/20.
I don't know how I can do that because data has different other variables
like Tillplc, Prate, Krate, and sampling location (in-row and between row).
Thanks, in advance!
Bhupinder