LISTSERV at the University of Georgia
Date:         Mon, 21 Oct 2002 08:38:28 +1000
Reply-To:     Tim Churches <tchur@OPTUSHOME.COM.AU>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Tim Churches <tchur@OPTUSHOME.COM.AU>
Subject:      Re: Record linkage
Comments: To: John Whittington <John.W@mediscience.co.uk>
Content-Type: text/plain; charset=us-ascii

John Whittington wrote:
>
> At 05:30 20/10/02 +1000, Tim Churches wrote (in part):
>
> >Tim B wrote:
> > > Franck, The simplest way is to use an actual code: for each
> > > identifier in your data, assign a meaningless value and keep
> > > a list. Just do not lose the list.
> >
> >Yes, but those values need to be **really** meaningless i.e. as
> >random as possible. Such a list of random numbers, used only once to
> >encrypt another list, is called a one-time pad. Don't use any
> >software-based (pseudo-)random number generator to generate your one-time pad.
>
> Tim, whilst that's obviously very correct advice in relation to
> cryptography, isn't it really rather 'over the top' in the present
> context. If, as I understand it, the intent is simply to 'anonymise' the
> true identity of data (SAS observations), then even the 'list of true IDs'
> is presumably not going to be known to third parties - so even sequential
> numbering (with a securely stored 'look up list') would probably suffice,
> and any form of 'erratic' (even if not truely 'random') numbers would be
> more than good enough. ... or am I (as so often!) missing something?

The effort which should be put into the protection of privacy and confidentiality really depends on the hazard (i.e. the consequences) associated with loss of that protection. It is dangerous to make judgements about the correct degree of protection to afford other people's personal information, at least without asking them first (and you can guess what the answer will be), so the best policy is to employ the best protection which is feasible. That does not necessarily mean huge expense or complexity.

My point was that the protection offered by XORing with a one-time pad depends entirely on the quality (randomness) of that one-time pad. If the one-time pad is predictable, then the XOR encryption can be broken. How easily depends on how predictable the one-time pad is.
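
To make the XOR step concrete, here is a minimal sketch in Python. The identifier value and the function name xor_bytes are made up for illustration, and the pad is drawn from the operating system's random source via the standard secrets module, which is my assumption rather than anything prescribed above.

```python
import secrets


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """XOR data with a pad of exactly the same length."""
    if len(pad) != len(data):
        raise ValueError("pad must be exactly as long as the data")
    return bytes(d ^ p for d, p in zip(data, pad))


# Made-up identifier, for illustration only.
identifier = b"PATIENT-000123"

# The pad comes from the operating system's random source and must be
# used once only, then stored securely alongside the look-up list.
pad = secrets.token_bytes(len(identifier))

ciphertext = xor_bytes(identifier, pad)

# XORing again with the same pad recovers the original identifier.
recovered = xor_bytes(ciphertext, pad)
```

The same function encrypts and decrypts, so anyone who can reproduce the pad can reverse the encryption, which is why the pad's unpredictability is the whole of the protection.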

The use of a high-quality software-based pseudo-random number generator, such as the Mersenne Twister or similar, may well be good enough. I don't think the algorithms used by the random number generators in SAS are documented, are they? If they are not, you should not assume they are of cryptographic quality.
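
A rough sketch of what "predictable" can mean in practice, again in Python, whose standard random module happens to use the Mersenne Twister. Everything here is hypothetical: the seed, the identifier, the helper names, and the assumption that an attacker knows identifiers start with "PATIENT-" and that the seed lies in a small range. The weak point shown is the small, guessable seed, not any particular SAS generator.

```python
import random


def pad_from_seed(seed: int, length: int) -> bytes:
    """Generate a 'pad' from a seeded Mersenne Twister (not secure)."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(length))


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    return bytes(d ^ p for d, p in zip(data, pad))


def guess_seed(ciphertext: bytes, known_prefix: bytes, candidates):
    """Try each candidate seed until the decryption shows the known prefix."""
    for seed in candidates:
        pad = pad_from_seed(seed, len(ciphertext))
        if xor_bytes(ciphertext, pad).startswith(known_prefix):
            return seed
    return None


# Hypothetical values: a date-like seed and a made-up identifier.
secret_seed = 20021021
identifier = b"PATIENT-000123"
ciphertext = xor_bytes(identifier, pad_from_seed(secret_seed, len(identifier)))

# The attacker only searches a small range of plausible seeds.
found = guess_seed(ciphertext, b"PATIENT-", range(20021000, 20021100))
broken = xor_bytes(ciphertext, pad_from_seed(found, len(ciphertext)))
```

If the seed can be reproduced, the whole pad can be regenerated, however good the generator's statistical properties are.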

But high-quality sources of random numbers, like those provided by /dev/random in Linux, are readily available, and thus you need to be able to justify any decision not to use them when protecting other people's privacy and confidentiality.
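
As an illustration of how little effort this takes, here is a minimal sketch that builds a look-up list of meaningless codes from the operating system's random source. It uses Python's standard secrets module, which draws on the kernel's generator rather than reading /dev/random directly; the function name build_lookup, the 8-byte code length and the sample IDs are all assumptions made for the example.

```python
import secrets


def build_lookup(true_ids, nbytes: int = 8) -> dict:
    """Map each distinct true ID to a unique random hex code."""
    lookup = {}
    used = set()
    for true_id in true_ids:
        if true_id in lookup:
            continue  # same person, same code
        code = secrets.token_hex(nbytes)
        while code in used:  # extremely unlikely, but guarantee uniqueness
            code = secrets.token_hex(nbytes)
        used.add(code)
        lookup[true_id] = code
    return lookup


# Made-up IDs; the repeated one must map to a single code.
lookup = build_lookup(["A1001", "B2002", "A1001"])
```

The resulting dictionary is the list which must not be lost, and which must be stored securely and separately from the anonymised data.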

Tim C

>
> Kind Regards
>
> John
>
> ----------------------------------------------------------------
> Dr John Whittington,       Voice:    +44 (0) 1296 730225
> Mediscience Services       Fax:      +44 (0) 1296 738893
> Twyford Manor, Twyford,    E-mail:   John.W@mediscience.co.uk
> Buckingham  MK18 4EL, UK             mediscience@compuserve.com
> ----------------------------------------------------------------

