```
Date: Wed, 30 May 2007 16:24:09 -0400
Reply-To: Richard Ristow
Sender: "SPSSX(r) Discussion"
From: Richard Ristow
Subject: Re: Rounding Issues (v.14)
Comments: To: Matthew Reeder
Comments: cc: Melissa Ives, "Beadle, ViAnn", "Peck, Jon"
In-Reply-To: <630080.62193.qm@web37005.mail.mud.yahoo.com>
Content-Type: text/plain; charset="us-ascii"; format=flowed

At 09:22 AM 5/30/2007, Matthew Reeder wrote:

>I've created a weighted composite from 11 variables (we'll call the
>composite comp1, which ranges from 1-10). I then create a binary
>variable (binvar) that assigns each case in the dataset a 0 or a 1,
>depending on whether or not the case reaches a certain minimum on
>comp1 (such as below).
>
>   DO IF (comp1>=4.5) .
>   COMPUTE binvar=1 .
>   ELSE .
>   COMPUTE binvar=0 .
>   END IF .
>   EXECUTE .
>
>Nothing complicated. Binvar will be my filter variable for subsequent
>analyses.

As you've found, so far, so good. By the way, a good replacement for
the above syntax is

RECODE comp1 (4.5 THRU HI = 1) (ELSE = 0) INTO binvar.

>Let's say that there are 5 people in the dataset with a value of 4.5
>on comp1. Binvar is being assigned a 0 for some of these people, and a
>1 for others. In other words, even though they all are equal to 4.5,
>SPSS views them differently.

You've had the answer: that the values *display as* 4.5 does not mean
they are *equal to* 4.5.

At 11:03 AM 5/30/2007, Melissa Ives wrote:

>It is likely due to the 2nd digit post-decimal.

Actually, the minimum difference cannot be guaranteed to appear in a
modest fixed number of decimal places. SPSS numbers are represented
with 53 bits of precision, which is about 16 decimal digits. But even
that doesn't characterize the representation: the representable numbers
are spaced about as closely as 16-digit decimal numbers, but they aren't
the same set of numbers.

>Further, when I run frequencies on comp1, 4.5 appears twice, with
>different counts next to it. Why is it doing this?
For, of course, the same reason: the *numbers* are different, though
their *display forms* are the same, in the format (F.1) that you are
using.

You've had a couple of suggestions (ViAnn Beadle, Jon Peck) for
producing display forms that will show all differences.

To put it a little differently, but close to ViAnn's suggestion: if, as
I assume, your "weighted composite" is a weighted average, then
guarantee the result is integral by (a) using only integer weights, and
(b) taking the *sum* rather than the *average*, using those weights.

But you probably won't like the result. Depending on your weights, you
may have to multiply them by a large number to convert them to integers
while maintaining their relative magnitudes; and while you will see the
exact values of the weighted sums, those values may be integers with a
lot of digits.

Normally, when you've taken a weighted average like that, it's best to
treat it as a continuous quantity, whose magnitude is important to the
appropriate precision, but whose exact values are not relevant. It's
rarely illuminating to take FREQUENCIES for such a quantity. It can be
useful to use RECODE to classify the values into ranges, and take
FREQUENCIES of the result; you'll have to decide about that.

If you're particularly interested in what's happening near the cutpoint
value of 4.5, I'd try something like this (not tested):

a.) Use the code you already have, to calculate 'comp1'.

b.) Assuming you have an ID variable called CaseNum, and your 11 data
    variables are Datum1 to Datum11, inspect the data by

TEMPORARY                   /* If desired         */.
NUMERIC   Delta (E10.3).
VAR LABEL Delta 'Difference from 4.5'.
COMPUTE   Delta = comp1 - 4.5.
SELECT IF ABS(Delta) LE 0.2 /* Or other threshold */.
LIST VARIABLES= CaseNum comp1 Delta Datum1 TO Datum11.
```
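The distinction between a number and its display form can be reproduced outside SPSS. The following is a minimal sketch in Python (not SPSS syntax), on the assumption that Python floats are the same 53-bit doubles the message describes. The three `comp1_values` are hypothetical, chosen as the cut point and its two nearest representable neighbours, and `math.nextafter` requires Python 3.9 or later.

```python
import math

CUTOFF = 4.5  # the cut point used in the thread

# Hypothetical comp1 values: the exact cut point and its two nearest
# representable neighbours in 64-bit floating point.
comp1_values = [
    math.nextafter(CUTOFF, 0.0),   # largest double below 4.5
    CUTOFF,                        # exactly 4.5
    math.nextafter(CUTOFF, 10.0),  # smallest double above 4.5
]


def inspect(value):
    """Return the one-decimal display, the full repr, binvar, and Delta."""
    shown = f"{value:.1f}"         # what a one-decimal format displays
    binvar = int(value >= CUTOFF)  # same comparison as the DO IF above
    delta = value - CUTOFF         # same idea as the Delta variable above
    return shown, repr(value), binvar, delta


rows = [inspect(v) for v in comp1_values]
for shown, exact, binvar, delta in rows:
    print(f"{shown:>5}  {exact:<20}  binvar={binvar}  Delta={delta:.3e}")
```

All three values display as 4.5 with one decimal place, yet the first falls below the cut point and gets binvar = 0. A Delta column in exponential format makes the difference visible, which is the purpose of the E10.3 format in the suggested listing.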

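The integer-weight suggestion in the message (integer weights, and a sum rather than an average) can be sketched as follows, again in Python rather than SPSS. The three weights, the scale factor of 100, and the item scores are all hypothetical and serve only to show the arithmetic; the thread does not give the actual 11 weights.

```python
from fractions import Fraction

# Hypothetical decimal weights (summing to 1) and the cut point, kept as
# strings so they can be converted exactly.
WEIGHTS = ["0.15", "0.25", "0.60"]
CUTOFF = "4.5"
SCALE = 100  # assumed: smallest power of ten that makes every weight whole

# Integer weights that keep the same relative magnitudes.
int_weights = [int(Fraction(w) * SCALE) for w in WEIGHTS]

# The cut point expressed in the units of the integer-weighted sum.
int_cutoff = int(Fraction(CUTOFF) * sum(int_weights))


def weighted_sum(scores):
    """Integer-weighted sum of integer item scores; no rounding can occur."""
    return sum(w * s for w, s in zip(int_weights, scores))


# Hypothetical cases, each with three integer item scores.
cases = [(5, 3, 5), (5, 3, 4), (6, 3, 5)]
sums = [weighted_sum(c) for c in cases]
flags = [int(s >= int_cutoff) for s in sums]

for case, total, flag in zip(cases, sums, flags):
    print(case, total, flag)
```

Because every quantity is an integer, a case that sits exactly on the cut point (the first one here, with a sum of 450) always compares as equal to it. The cost is the one the message warns about: with less convenient weights the scale factor, and therefore the sums, can become large.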