LISTSERV at the University of Georgia
Date:         Mon, 1 Oct 2007 11:19:47 -0300
Reply-To:     Hector Maletta <hmaletta@fibertel.com.ar>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Hector Maletta <hmaletta@fibertel.com.ar>
Subject:      Re: "Match files" with duplicates in table
Comments: To: ViAnn Beadle <vab88011@gmail.com>
In-Reply-To:  <000301c8042c$18288540$48798fc0$@com>
Content-Type: text/plain; charset="iso-8859-1"

The problem with Catherine's situation is that she wants first to assign student data to all course registrations, and then to add student data for all students not registered in any course.

The first problem is well solved with a TABLE subcommand, using the general student list as a TABLE and the course registration list as a FILE. Of course, the latter may contain duplicate ID records, since the same student may have registered in more than one course.

For the second problem, I responded in haste and late at night, sorry. Before matching or adding, one needs to exclude from the student list all students already included, i.e. all students registered for courses, and then ADD the remaining students from the student list to the course file. This complicates things a bit. The complete process may look like this:

*Assign student data to course registration records.
MATCH FILES /TABLE='UNDUPCOMPLETE.SAV' /FILE='DupPartial.SAV' /BY ID.
SAVE OUTFILE='COURSEFILE.SAV'.
*Flag records with course registration.
COMPUTE REGCOURSE=1.
*Aggregate registration records by student.
AGGREGATE OUTFILE=* /PRESORTED /BREAK=ID /REGCOURSE=MAX(REGCOURSE).
*Match records of students registered in courses with the student list.
MATCH FILES /FILE='UNDUPCOMPLETE.SAV' /FILE=* /BY ID.
*Exclude from the list all students registered in courses.
SELECT IF SYSMIS(REGCOURSE) OR REGCOURSE=0.
*Add registered and non-registered students in a single list.
ADD FILES /FILE='COURSEFILE.SAV' /FILE=*.
SAVE OUTFILE='FINAL.SAV'.

This is untested. Hope it works.
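Since the syntax above is untested, the intended result can at least be checked outside SPSS. The following is a minimal Python sketch of the same logic, not SPSS syntax and not a substitute for running the job. The records are reconstructed from the LIST output quoted further down in this thread; the function name `combine` and the field names `id`, `name` and `major` are assumptions made for illustration. The final sort by ID is added only for readability, since ADD FILES as written would simply append the unregistered students at the end.

```python
# Stand-in for UNDUPCOMPLETE.SAV: one record per student (unique IDs).
students = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bill"},
    {"id": 3, "name": "Chris"},
    {"id": 4, "name": "David"},
    {"id": 5, "name": "Ethel"},
    {"id": 6, "name": "Frank"},
]

# Stand-in for DupPartial.SAV: an ID may appear more than once.
registrations = [
    {"id": 2, "major": "ENGL"},
    {"id": 3, "major": "HIST"},
    {"id": 3, "major": "THEO"},
    {"id": 5, "major": "ENGL"},
]


def combine(students, registrations):
    """Attach student data to every registration, then append the
    students who have no registration at all."""
    lookup = {s["id"]: s for s in students}

    # Step 1 (MATCH FILES with TABLE): each registration record picks up
    # the data of the matching student.
    course_file = [{**lookup.get(r["id"], {}), **r} for r in registrations]

    # Steps 2-4 (flag, AGGREGATE, MATCH, SELECT IF): keep only the
    # students whose ID never appears among the registrations.
    registered_ids = {r["id"] for r in registrations}
    unregistered = [
        {**s, "major": None} for s in students if s["id"] not in registered_ids
    ]

    # Step 5 (ADD FILES): stack both sets of records. The sort is an
    # addition for readability; it is stable, so duplicates keep their order.
    return sorted(course_file + unregistered, key=lambda rec: rec["id"])


final = combine(students, registrations)
for rec in final:
    print(rec["id"], rec["name"], rec["major"] or "")
```

On this data the sketch yields seven records, with Chris appearing twice (once per registration) and Ann, David and Frank carrying no major, which is the same set of cases shown in the listing below.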

Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of ViAnn Beadle
Sent: 01 October 2007 10:08
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: "Match files" with duplicates in table

That "warning" message is more of a note and can usually be ignored for a FILE file. It is, however, a fatal error for a TABLE file.
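Because duplicate keys matter so much more in the lookup (TABLE) file than in the case (FILE) file, it can help to check the keys before matching. The following is a rough Python sketch of such a check, not SPSS syntax; the function name `duplicate_keys` and the field name `id` are assumptions made for illustration.

```python
from collections import Counter


def duplicate_keys(records, key="id"):
    """Return the sorted key values that occur more than once."""
    counts = Counter(rec[key] for rec in records)
    return sorted(k for k, n in counts.items() if n > 1)


# Hypothetical lookup file: unique IDs, so it is safe to use as a table.
student_list = [{"id": 1}, {"id": 2}, {"id": 3}]

# Hypothetical registration file: ID 3 repeats, so it should be the
# case file rather than the table.
registration_list = [{"id": 2}, {"id": 3}, {"id": 3}]

print(duplicate_keys(student_list))
print(duplicate_keys(registration_list))
```

An empty result means every key is unique; any value returned identifies an ID that would trigger the duplicate-key message.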

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Richard Ristow
Sent: Sunday, September 30, 2007 4:40 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: "Match files" with duplicates in table

At 02:30 PM 9/28/2007, Hector Maletta wrote:

>My final result is thus:

SPSS 14 draft output, test data as before:

MATCH FILES /TABLE = UNDUPCOMPLETE /FILE = DupPartial /BY ID.
MATCH FILES /FILE = * /FILE = UNDUPCOMPLETE /BY ID.

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |30-SEP-2007 17:47:58       |
|-----------------------------|---------------------------|
id name     Major

 1 Ann
 2 Bill     ENGL
File #1
KEY: 3

>Warning # 5132
>Duplicate key in a file.  The BY variables do not uniquely identify each
>case on the indicated file.  Please check the results carefully.

 3 Chris    HIST
 3 Chris    THEO
 4 David
 5 Ethel    ENGL
 6 Frank

Number of cases read:  7    Number of cases listed:  7

(Gary Moser, <9000.hal@gmail.com>, tested earlier, and reported the same error message that's here.)

Hector wrote,

>There might be a more elegant solution, but it
>is late here and I cannot think of anything better right now.

THERE'S a question of aesthetics. I expected, myself, that a neat double MATCH FILES would do it. I rejected it without writing the code because of the problem of non-unique keys, as reported in the warning message above. But Hector's code gives the right result, and I think will do so reliably.

I don't like to put code in production that gives warning messages. The last thing you want is to leave users either blasé or confused about whether to pay attention to warning messages. But Hector's code does work, and it's simpler and clearer than the MATCH FILES/ADD FILES logic I posted.

Now we can be like one of those TV shows where viewers vote on which singer, or something, is the best. Or maybe, simply, de gustibus non disputandum est.

