LISTSERV at the University of Georgia
Date:   Thu, 21 Dec 2006 20:27:04 +0000
Reply-To:   toby dunn <tobydunn@HOTMAIL.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   toby dunn <tobydunn@HOTMAIL.COM>
Subject:   Re: Breaking down text by sentences and creating new observations
Comments:   To: hakanener99@YAHOO.COM
In-Reply-To:   <200612212001.kBLJYir8002377@mailgw.cc.uga.edu>
Content-Type:   text/plain; format=flowed

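Data Need ( Drop = Text Count I ) ;
  Set One ;

  /* Each sentence ends in a period, so the period count is the sentence count */
  Count = CountC( Text , '.' ) ;

  /* Write one observation per sentence; dropping only the work variables keeps the name */
  Do I = 1 To Count ;
    NewText = Left( Scan( Text , I , '.' ) ) ;
    Output ;
  End ;
Run ;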
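
To try the approach above on a small case, here is a minimal sketch. The sample rows, the variable name Name, the counter SentenceNo, and the 200-character lengths are assumptions made for illustration; only One, Need, Text, and NewText come from the step above.

```sas
/* Hypothetical sample data; the names Name and Text are assumptions */
Data One ;
  Length Name $ 20 Text $ 200 ;
  Infile Datalines Dlm = '|' Truncover ;
  Input Name $ Text $ ;
  Datalines ;
Ann|Born in Ohio. Studied law. Moved to Texas.
Bob|Grew up in Maine. Became a pilot.
;
Run ;

/* One observation per sentence, with Name repeated on every row */
Data Need ( Keep = Name SentenceNo NewText ) ;
  Set One ;
  Length NewText $ 200 ;
  Do SentenceNo = 1 To CountC( Text , '.' ) ;
    /* Scan returns the piece between periods; Left removes the leading blank */
    NewText = Left( Scan( Text , SentenceNo , '.' ) ) ;
    Output ;
  End ;
Run ;

Proc Print Data = Need NoObs ;
Run ;
```

With these two sample rows, Need should hold three observations for Ann and two for Bob. Keeping SentenceNo is optional, but it preserves the original sentence order if the data are sorted later. The whole approach relies on periods appearing only at the ends of sentences, as stated in the question.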

Toby Dunn

To sensible men, every day is a day of reckoning. ~John W. Gardner

The important thing is this: To be able at any moment to sacrifice that which we are for what we could become. ~Charles DuBois

Don't get your knickers in a knot. Nothing is solved and it just makes you walk funny. ~Kathryn Carpenter

From: Hakan ENER <hakanener99@YAHOO.COM>
Reply-To: Hakan ENER <hakanener99@YAHOO.COM>
To: SAS-L@LISTSERV.UGA.EDU
Subject: Breaking down text by sentences and creating new observations
Date: Thu, 21 Dec 2006 15:01:41 -0500

Following up on an earlier post that deals with text, I was wondering if anyone knows a way to break a text variable into sentences (taking the period at the end of each sentence as the breaking point) and write each resulting sentence out as a new observation. For example, imagine a two-column dataset, where the first column is a person's name and the second column is that person's biography, containing 5 sentences, with a period at the end of each sentence (and no periods anywhere else). Could these 5 sentences be split up so that each one becomes a new observation associated with the same person's name? Each person would then have 5 observations in the dataset, with one sentence in the biography column. This is part of a more complicated task, so I know it looks like a strange thing to do, but it may save a lot of hassle.

Thanks for any tips!

Hakan
