>Most of the time the field contains useful consistent information.
>However, on occasion the person that entered the data appended a phone
>number in the format XXX XXX XXXX.
>
>The real kicker is that the location of the phone number is not
>consistent and can be either in the middle or the end of the string.
>
>The phone number is of no value to my analysis and I would like to remove
>it from the string or split out the phone number into another field. Is
>there a way of identifying the phone number pattern and removing it?

William, if the telephone number is ALWAYS EXACTLY in the format XXX XXX XXXX, with every X a numeric digit (i.e. with the two embedded spaces always there and without, say, brackets around the area code or dashes between the components), and always with a space before and a space after the phone number (unless it is at the start or end of the string), then something like the following 'brute force' method should work:

data test ;
  input var $57. ;
cards ;
this one has no numbers 123 456 7890 other than telephone
this has 1234 and 345 as well as 456 789 0123
this has 123 456 numbers but no telephone
123 456 7890 number comes first
;
run ;

data doit (drop = i var) ;
  length var2 $ 57 ;
  set test ;
  var2 = var ;
  * slide a 12-character window along the string ;
  do i = 1 to length(var) - 11 ;
    * a match has blanks in window positions 4 and 8 and digits elsewhere ;
    if substr(var, i+3, 1) = ' '
       and substr(var, i+7, 1) = ' '
       and compress(substr(var, i, 12), '1234567890') = ''
    then do ;
      if i = 1 then var2 = substr(var, i+13) ; * special case: number at start of string ;
      else var2 = substr(var, 1, i-2) || substr(var, i+12) ;
    end ;
  end ;
run ;

... if your phone numbers could take any other formats, the code could be modified accordingly.
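As one possible modification, and only if you are on SAS 9 or later (an assumption on my part), the Perl regular expression functions can express the same pattern more compactly and are easier to adapt to other formats. The sketch below reuses the TEST data set and VAR from above; the names DOIT_PRX, PHONE and RE are made up for illustration. It removes every match of the pattern and also copies the first match into a separate field, which covers the "split out" option. I have not run it against your data, so treat it as a starting point rather than a finished solution.

```sas
data doit_prx (drop = var re) ;
  length var2 $ 57 phone $ 12 ;
  set test ;
  retain re ;
  /* compile the pattern once: 3 digits, blank, 3 digits, blank, 4 digits, */
  /* with a word boundary (\b) on each side so longer digit runs are skipped */
  if _n_ = 1 then re = prxparse('/\b\d{3} \d{3} \d{4}\b/') ;

  /* copy the first matching phone number into its own field */
  if prxmatch(re, var) then phone = prxposn(re, 0, var) ;

  /* remove every match (-1 = replace all occurrences) */
  var2 = prxchange('s/\b\d{3} \d{3} \d{4}\b//', -1, var) ;

  /* tidy up the blanks left behind by the removal */
  var2 = left(compbl(var2)) ;
run ;
```

Note that COMPBL collapses every run of blanks in the string to a single blank, not just the one left by the removed number, so skip that step if multiple blanks elsewhere in the field are meaningful. To allow dashes or brackets, only the pattern would need to change.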

