LISTSERV at the University of Georgia
Date:         Fri, 12 Jan 2007 15:21:52 -0500
Reply-To:     Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject:      Re: how to replace SSNs with fake
Comments: To: Jen <plessthanpointohfive@GMAIL.COM>
In-Reply-To:  <200701121739.l0CFWkou015172@mailgw.cc.uga.edu>
Content-Type: text/plain; charset="us-ascii"

Jen: Best to do this very carefully.... What you call a fake ID will likely become a surrogate ID for each person's SSN. Someone needs to take responsibility for maintaining a 'key ring' or crosswalk dataset that has a surrogate ID and its corresponding SSN in each row.

I'd construct the key ring in two steps. First, create a column of distinct instances of SSN. In SAS SQL,

create table key as select distinct SSN from <dataset>;

Second, create a non-informative surrogate ID for each distinct SSN:

create table keyRing as select put(ranuni(1773)*100000000,z9.) as ID,SSN from key;

One can then join the key ring to a dataset on SSN and substitute the ID for the SSN in a new dataset. Reversing the process restores the SSN when required for identification of subjects. For the ID I have used a purely random number, which can produce duplicates when applied to a large number of SSNs. (Check for duplicated IDs.) In the event of duplicates, it will take a somewhat more complicated process to guarantee distinct IDs.
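A minimal sketch of the duplicate check and the substitution join, assuming the key and keyRing tables created above. The names people, people_deid, dupIDs, shuffled, keyRing2, and _r are hypothetical, chosen only for illustration. The last part shows one possible way to guarantee distinct IDs (numbering the SSNs after sorting them in random order); it is not necessarily the "more complicated process" the message has in mind.

```sas
/* Assumes WORK.KEY and WORK.KEYRING exist as created above, and that */
/* WORK.PEOPLE (hypothetical name) is the source dataset holding SSN. */
proc sql;
  /* Check for duplicated IDs: any row returned here is a collision. */
  create table dupIDs as
    select ID, count(*) as n
    from keyRing
    group by ID
    having count(*) > 1;

  /* Substitute the surrogate ID for SSN in a new dataset. */
  create table people_deid(drop=SSN) as
    select k.ID, p.*
    from people as p inner join keyRing as k
      on p.SSN = k.SSN;

  /* One way to avoid duplicates entirely: put the distinct SSNs in */
  /* random order, then number them sequentially in the next step.  */
  create table shuffled as
    select SSN, ranuni(1773) as _r
    from key
    order by _r;
quit;

data keyRing2(keep=ID SSN);
  set shuffled;
  length ID $9;
  ID = put(_n_, z9.);  /* row number in shuffled order, zero-padded */
run;
```

If dupIDs comes back empty, keyRing can be used as is; otherwise a key ring built like keyRing2 has distinct IDs by construction, since each row receives its own row number.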

I've mentioned a Data Privacy By Design paper that a colleague of mine and I wrote some time back for a CDC conference. It illustrates some of the uses of surrogate key IDs.

Sig

-----Original Message-----
From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu] On Behalf Of Jen
Sent: Friday, January 12, 2007 12:39 PM
To: SAS-L@LISTSERV.UGA.EDU
Cc: Jennifer Sabatier
Subject: how to replace SSNs with fake

Hello,

I have a file of information about people and I want to create an ID variable to replace SSN. In this file people can have multiple rows, i.e., some SSNs have multiple rows, others don't.

I know this is probably a simple request but I couldn't find something similar in a search.

Thanks,

Jen

