Date: Tue, 25 Jul 2006 14:39:48 -0400
Reply-To: Peter Flom <Flom@NDRI.ORG>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter Flom <Flom@NDRI.ORG>
Subject: Re: Language Seperation
Content-Type: text/plain; charset=US-ASCII
>>> Sigurd Hermansen <HERMANS1@WESTAT.COM> 7/25/2006 2:25 pm >>> wrote
Could be easier than it might seem at first....
My knowledge of German doesn't extend much beyond 'Gesundheit' (and I
had to use the Google 'Do you mean ..' feature to spell it correctly),
but what I know about syntax suggests that frequencies of special
symbols and capitalized words should be higher in German than in English.
You may be able to compute differences in relatively small phrases.
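
To make the frequency idea above concrete, here is a rough sketch in
Python (not SAS, and not anything from the thread). The function names
and the choice to skip the first word of a phrase are assumptions for
illustration only; real text would need a proper tokenizer and some
thought about proper nouns.

```python
import re

# Hypothetical helper: share of words, after the first one, that start
# with a capital letter. The first word is skipped because it is usually
# capitalized in both languages.
def capitalized_ratio(phrase: str) -> float:
    words = re.findall(r"[^\W\d_]+", phrase)  # runs of letters only
    rest = words[1:]
    if not rest:
        return 0.0
    return sum(w[0].isupper() for w in rest) / len(rest)

# Hypothetical helper: count of umlauts and sharp s, taken here as the
# "special symbols" (an assumption; the post does not list them).
def special_symbol_count(phrase: str) -> int:
    return sum(ch in "äöüÄÖÜß" for ch in phrase)

german = "Der Hund jagt die Katze im Garten"
english = "The dog chases the cat in the garden"
print(capitalized_ratio(german), capitalized_ratio(english))
```

A higher ratio or a nonzero symbol count would only be a hint, since
English phrases with several proper nouns would also score high.
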
If you have phrases, then I think it would be easy to identify articles
and prepositions that occur only in German and only in English. If a
phrase has "the" or "a" or "one" in it, it's English. If it has "der"
or "das" or "en" in it, it's German; if neither, it's something else.
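
A minimal sketch of that rule, in Python rather than SAS. The set
names, the function name, and the "ambiguous" result for phrases that
hit both lists are assumptions added for illustration; only the
markers "the", "a", "one", "der" and "das" are taken from the lists
above.

```python
import re

# Deliberately tiny marker lists; illustrative only.
ENGLISH_MARKERS = {"the", "a", "one"}
GERMAN_MARKERS = {"der", "das"}

def guess_language(phrase: str) -> str:
    # Lowercase, split into runs of letters, and compare whole words
    # so that "a" does not match inside "das".
    words = set(re.findall(r"[^\W\d_]+", phrase.lower()))
    has_english = bool(words & ENGLISH_MARKERS)
    has_german = bool(words & GERMAN_MARKERS)
    if has_english and has_german:
        return "ambiguous"  # assumption: this case is not covered above
    if has_english:
        return "English"
    if has_german:
        return "German"
    return "other"

for p in ["the cat sat on a mat", "Das ist der Hund", "bonjour tout le monde"]:
    print(p, "->", guess_language(p))
```
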
Obviously, you'd have to make both lists longer, but that should be