LISTSERV at the University of Georgia
Date:         Fri, 20 Dec 2002 09:39:05 -0500
Reply-To:     Quentin McMullen <Quentin_McMullen@BROWN.EDU>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Quentin McMullen <Quentin_McMullen@BROWN.EDU>
Subject:      Re: SQLheads (was RE: new "clashvars" macro)
In-Reply-To:  <08B08C9FA5EBD311A2CC009027D5BF8104BFC20C@remailnt2-re01.westat.com>
Content-Type: text/plain; charset="iso-8859-1"

Many thanks to Sig, Ian, Dianne, Ed, and David for your comments. (4 out of 5 responses to my post came from SAS Mecca, that's what attracted me to Westat in the first place. Why did I leave? : )

I am happy to accept Ian's point that I gave up my imagined control long ago. Indeed, even as I wrote that the SAS DATA step language allowed me to communicate "explicit control of the process", it seemed a bit silly to me, because of course there is so much happening implicitly (implicit loop of the DATA step, initialization of *some* variables at top of the loop, etc.).

My exploration of SAS started a few years back as a user, with no programming background. At the time I thought of the SAS data step as a communication of a "goal" rather than "instructions". That is to say, I recognized I was writing instructions, but I had no idea how they were being carried out. And 6 months into my SAS programming career, when I went to an SI training course, I was confused as to why the instructor spent so much time talking about the Program Data Vector, which seemed like unnecessary minutiae to me.

And since then I have come to believe that to understand a SAS data step, I need to think in terms of the sequential processing of instructions. So when I'm debugging a data step, I might write out the PDV on paper (still haven't learned the data step debugger), and work line by line through the data step. And when I do this, I can (hopefully) understand the wonderful results that come from a step with a DOW-loop or two, or a merge with multiple by-values in different data sets, or....

So I suppose my real mistake was hoping that the path I took in learning SAS (moving from believing in an automagic data step to thinking about individual processes), would serve me well in other languages. I suppose if I want to embrace SQL, I need to give up my (imagined) control over processes, and accept the higher level of abstraction that communication via SQL allows. And I will admit that the few times I have played with setting up some data structures in a manner that would make them useful for SQL, it has always been an educational/enjoyable experience. That is to say, focussing on logical processing of the data seems to encourage logical structuring of the data, and vice-versa. Since Sig's bookcase is now more than 1 staircase away, I suppose I'll have to visit Amazon for a little Christmas reading.
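As a rough illustration of the "directions" versus "goal" distinction discussed here, the sketch below totals a small made-up table two ways in Python: a row-by-row loop (loosely analogous to BY-group processing on sorted input) and a single SQL query run through the standard-library sqlite3 module. The table name, column names, and values are all hypothetical, and Python stands in for SAS only so the example can be run anywhere; it is not PROC SQL and makes no claim about how SAS executes either form.

```python
import sqlite3

# Hypothetical data: (id, cost) pairs, already sorted by id.
rows = [("a", 10), ("a", 5), ("b", 7), ("c", 1), ("c", 2)]


def totals_by_loop(sorted_rows):
    """'Directions': walk the rows in order, keeping a running total per id.

    Relies on the rows arriving grouped by id.
    """
    out = []
    current_id, total = None, 0
    for pid, cost in sorted_rows:
        if pid != current_id:
            # A new id starts: emit the finished group, then reset the total.
            if current_id is not None:
                out.append((current_id, total))
            current_id, total = pid, 0
        total += cost
    if current_id is not None:
        out.append((current_id, total))  # emit the last group
    return out


def totals_by_sql(any_rows):
    """'Goal': describe the result and let the SQL engine pick the method."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (id TEXT, cost INTEGER)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", any_rows)
    result = conn.execute(
        "SELECT id, SUM(cost) FROM visits GROUP BY id ORDER BY id"
    ).fetchall()
    conn.close()
    return result


loop_result = totals_by_loop(rows)
sql_result = totals_by_sql(rows)

# Checking the yield of each program rather than tracing its steps.
print(loop_result == sql_result)
```

Comparing the two outputs is a small instance of checking a program by its yield: if both produce the same table, either can be swapped for the other without looking at how it got there.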

Kind Regards, --Quentin

> -----Original Message-----
> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Sigurd Hermansen
> Sent: Thursday, December 19, 2002 6:16 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Re: SQLheads (was RE: new "clashvars" macro)
>
> Bravo, Ian. Despite my penchant for needling a certain SAS-L icon every so often, I recognize your expertise in SQL. To what you have stated so well already, I would add one simple but important idea. That is, if two programs produce the same tabular data object, one can interchange the two in a function that yields that same tabular data object. I recall overhearing this specific instance of a general principle some years ago when Harlan Mills was holding court on the subject of program verification. If we design a program as a composition of functions, some provided by the programming environment and others by a programmer, the program becomes a verifiable composite function. A logic programming language such as SQL puts functions that yield tabular data objects in an abstract framework that embeds these functions properly (in the sense of a logic diagram) within a composite function. SQL queries, like SAS data steps before them, encapsulate the underlying operations of devices so each query simply yields a tabular data object.
>
> While any one SQL compiler might fail to implement a composite function properly, a tested SQL compiler will fail far less often than an average programmer working under deadline pressure. This generalization does not apply to Quentin, in that we know him to be an exceptionally perceptive and thoughtful programmer, but it does have general implications. One can improve the accuracy of programming by raising the level of abstraction from variables in a sequence of records to tabular data objects (or even abstract relations). Rather than trace operations on data items through a sequence of assignments, conditional branches, and loops (the Turing-complete language of Paul Dorfman), Quentin might test the implications of a logical process (a query) applied to a set of tabular data objects (SAS datasets) by testing the yield of that query (another SAS dataset). While this method of checking a program ignores information in the sequencing of rows of data and makes it difficult to view intermediate results, it does focus on the relation of the final result to the data sources.
>
> Sig
>
> -----Original Message-----
> From: Ian Whitlock
> Sent: Wednesday, December 18, 2002 2:09 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Re: SQLheads (was RE: new "clashvars" macro)
>
> Quentin,
>
> In part you wrote:
>
> >>>>>>>>>>>>
> While I appreciate Sig's compliment, I think in my case the emphasis should be placed on "SQLhead **in the making** ". In the fraternity of SQLheads (which of course includes too many listers to mention), I'm nothing but a pledge, or even just a potential pledge.
>
> I think part of what holds me back from pursuing SQL more (besides laziness), is that I'm still troubled that I don't give "directions" to SQL. That is, when I write a DATA step I feel like I am communicating directions to SAS. When I write SQL, it feels less like I'm writing directions, and more like I'm writing a "goal". That is, in SQL, the communication seems to be "here is what I want, now do whatever it takes to make it". So in SQL I may not know what is actually happening (a sort? a hash table? a ...?) to give me what I want. So I imagine it would be hard for me to debug SQL code, since each line is not really an instruction (in some awkward sense).
>
> I can understand how many folks enjoy this aspect of SQL, i.e. you don't *have* to know what it's doing under the hood, and if you really want to know, there are ways to find out. But there's some part of me (inner control freak?) that likes communicating explicit control of the process, rather than just describing the desired outcome.
>
> That said, when every once in a while I play with a SQL step to replace a handful of DATA/PROC steps, or see some of the SQL solutions posted on the L, I can definitely see the attraction. So I imagine I won't make it too long without making a serious effort to expand my toolbox accordingly.
> <<<<<<<<<<<<<<
>
> You gave up control a long time ago, when you decided to write programs and get your meat in a grocery store, instead of running it down and killing it with a rock.
>
> When you write
>
>    x = 2 * x ;
>
> do you care which register(s) the work is done in? Do you care whether it was accomplished by shifting the bits or some other means? Do you care whether the number was stored in data memory or instruction memory? Whatever control you feel is an illusion. Moreover, it doesn't matter, because you are more interested in the fact X is doubled than how it got that way. Are you not?
>
> You have pinpointed the greatest roadblock to learning SQL, your history and the feeling of control. Why do you care whether a sort or a hash table was used? Well maybe the files are too big for your machine, then you have to care. Otherwise, why? How often is the speed of your machine the limiting factor in the problems you solve? The method may be interesting, but why should you care in terms of solving the problem? Are you more interested in knowing how a problem got solved, or in knowing how to solve it? You may have to give up one to make progress with the other.
>
> You know that it is dangerous in solving a problem to take advantage of accidents in the data. Now think how much easier it is to control that urge when the solution doesn't involve the method and therefore cannot take advantage of data accidents.
>
> Just as the DATA step has rules, so SQL has rules. In SQL, it is still a matter of knowing what rules produce what results. Notice I say "produce what results" not how the results were produced. Some day you will say, "What I like about SQL, is that I have control over the results." But then the question may be, why do you care about the details of the results? It will then be knowing how to manipulate the agents that determine the results that will be important.
>
> At every stage you have to give up control of the details of the method to get better control over solving harder problems. Striving to understand the details is only important in that it helps to provide the rules for obtaining the solution. When the knowledge fails to help in providing the rules, consider it intellectual entertainment or decide whether you want it.
>
> If it helps, look at history. The first programming engineers felt like they lost control when they could no longer flip the switches setting the program. The 0/1 programmers felt like they lost control with the simple mnemonics of assembly language. The assembly programmers fought losing control to the Fortran and COBOL compilers. The C programmer fights losing control to the SAS compiler. Now you fight losing control to an SQL compiler that will decide the method of solution. You belong to an illustrious line of losers, but I doubt if you stay there much longer. When you start to solve SAS-L problems with SQL you have already committed yourself. You just don't know it yet.
>
> IanWhitlock@westat.com
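Ian's remark about "accidents in the data" can be made concrete with a small sketch, again in Python with the standard-library sqlite3 module standing in for SAS. The function names, table, and values are hypothetical. The first function gets the lowest cost per id only when the rows happen to arrive in a convenient order; the query states the result wanted and so does not depend on row order.

```python
import sqlite3


def first_cost_per_id(rows):
    """Treats the first row seen for each id as its lowest cost.

    Correct only by accident: the rows must already be ordered by cost within id.
    """
    seen = {}
    for pid, cost in rows:
        if pid not in seen:
            seen[pid] = cost
    return sorted(seen.items())


def min_cost_by_sql(rows):
    """States the goal (minimum cost per id); row order is irrelevant."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (id TEXT, cost INTEGER)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT id, MIN(cost) FROM visits GROUP BY id ORDER BY id"
    ).fetchall()
    conn.close()
    return result


tidy = [("a", 5), ("a", 10), ("b", 7)]      # happens to be sorted by cost within id
shuffled = [("a", 10), ("b", 7), ("a", 5)]  # same rows, different order

print(first_cost_per_id(tidy) == min_cost_by_sql(tidy))          # agree on tidy input
print(first_cost_per_id(shuffled) == min_cost_by_sql(shuffled))  # disagree once the accident is gone
```

The loop is not wrong in itself; it simply encodes an assumption about the input that the query never needed to make.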

