Date: Mon, 22 Nov 2004 02:24:26 -0500
Reply-To: Richard Ristow <wrristow@mindspring.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Richard Ristow <wrristow@mindspring.com>
Subject: Re: Several hundreds IF lines
In-Reply-To: <20041122041336.69871.qmail@web50905.mail.yahoo.com>
Content-Type: text/plain; charset="us-ascii"; format=flowed
At 11:13 PM 11/21/2004, LAI Man Kin wrote:
>I have to assign different values to 30000+ cases of different
>backgrounds (e.g., sex, age in year, district, etc.) and I use the
>command "IF (condition) var=(value)". Since the combination of the
>background variables are over several hundreds (two sexes times 10
>ages in year times 18 districts ...) and running the syntax takes me
>several minutes. Is there a faster way to accomplish this (in syntax
>writing and in running)?
At 01:39 AM 11/22/2004, Martins Liberts wrote:
>Probably it will be enough if you will remove all "exe" or "execute"
>commands from the syntax. Try it.
Very much after my own heart! And certainly, "EXECUTE" commands could
slow your syntax to a crawl. (Remember, every one of them means reading
all 30,000+ cases again.) But that may not be the whole problem.
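To be concrete about the EXECUTE point first, here is a small sketch. The conditions and the values 1 and 2 are invented for illustration. Transformations like IF are only queued until something reads the data, so one EXECUTE at the end (or any procedure) runs them all in a single pass.

```spss
* Slow: each EXECUTE forces a separate pass through all the cases.
IF (SEX = 'F' AND AGE = 21 AND DISTRICT = 1) BKGND = 1.
EXECUTE.
IF (SEX = 'M' AND AGE = 21 AND DISTRICT = 1) BKGND = 2.
EXECUTE.

* Better: the IFs are queued and run together in a single pass.
IF (SEX = 'F' AND AGE = 21 AND DISTRICT = 1) BKGND = 1.
IF (SEX = 'M' AND AGE = 21 AND DISTRICT = 1) BKGND = 2.
EXECUTE.
```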
What you're writing is called "data in the code":
IF (SEX = 'F' AND AGE=21 AND DISTRICT=01) BKGND = <value1>.
IF (SEX = 'M' AND AGE=21 AND DISTRICT=01) BKGND = <value2>.
.... etc., for your 2*10*18=360 conditions.
It's worse if you're assigning more than one variable; for two of
them (say, BKGND1 and BKGND2), you have 720 IFs instead of 360.
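Nested DO IFs would probably be faster. (360 IFs means every one of the
360 tests is performed for every case. With nesting, testing stops at
the first match on each level, so a case goes through at most
2+18+10=30 tests.)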
DO IF SEX = 'F'.
-  DO IF DISTRICT = 01.
.     DO IF AGE = 21.
.        COMPUTE BKGND1 = <value1 for F,dstr.01,age21>.
.        COMPUTE BKGND2 = <value2 for F,dstr.01,age21>.
.     ELSE IF AGE = 22.
.        ...
.     ELSE.
.        <error message for bad AGE value>
.     END IF.
-  ELSE IF DISTRICT = 02.
   ...
-  ELSE IF DISTRICT = 18.
-  ELSE.
.     <error message for bad DISTRICT value>
-  END IF.
ELSE IF SEX = 'M'.
   <another bunch like the above>
ELSE.
.  <print error message: bad SEX code>.
END IF.
The preferred way, much the easiest to write, is a separate lookup file
that you merge in with MATCH FILES. It should look like this:
SEX  DISTRICT  AGE  BKGND1  BKGND2  ...
F    01        21   <val1>  <val2>
M    01        21   .......
.....
Then, if that file is saved as c:\MY_SPSS\BACKGRND.SAV and is sorted by
SEX, DISTRICT, and AGE, and the file you're adding to is the working file,
SORT CASES BY SEX DISTRICT AGE.
MATCH FILES /FILE=*
   /TABLE='c:\MY_SPSS\BACKGRND.SAV'
   /BY SEX DISTRICT AGE.
Much easier and more reliable to code, almost certainly faster.
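For completeness, here is one way the lookup file might be built and the match checked. Treat it as a sketch, not tested against your data. The BKGND values are placeholders, the path MAINDATA.SAV and the flag variable INTABLE are names I made up, and I'm assuming SEX is a one-character string and DISTRICT and AGE are numeric in both files (the key variables must have the same type in each).

```spss
* Build the lookup table: one row per SEX/DISTRICT/AGE combination.
* The BKGND values below are placeholders, not real data.
DATA LIST LIST / SEX (A1) DISTRICT AGE BKGND1 BKGND2.
BEGIN DATA
F 1 21 1.5 10
M 1 21 2.5 20
F 1 22 1.7 11
M 1 22 2.7 21
END DATA.
SORT CASES BY SEX DISTRICT AGE.
SAVE OUTFILE='c:\MY_SPSS\BACKGRND.SAV'.

* Reload the main data (hypothetical path), then match.
GET FILE='c:\MY_SPSS\MAINDATA.SAV'.
SORT CASES BY SEX DISTRICT AGE.
MATCH FILES /FILE=*
   /TABLE='c:\MY_SPSS\BACKGRND.SAV' /IN=INTABLE
   /BY SEX DISTRICT AGE.

* INTABLE is 1 if the case found a row in the table, 0 if not.
FREQUENCIES VARIABLES=INTABLE.
```

The /IN flag stands in for the error branches in the nested DO IF version: any case with INTABLE = 0 has a SEX, DISTRICT, or AGE combination that is not in the table.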