LISTSERV at the University of Georgia
Date:         Thu, 11 Sep 2008 18:42:14 -0400
Reply-To:     Arthur Tabachneck <art297@NETSCAPE.NET>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Arthur Tabachneck <art297@NETSCAPE.NET>
Subject:      Re: KEEP DROP array variables
Comments: To: Mary H <mlhoward@AVALON.NET>

I haven't kept up with this thread and, upon trying to review it, quickly saw that the underlying subject has changed at least once.

I'm responding to the issue of whether a short-wide file is better than a tall-narrow one.

It was surprising to see the number of responses that addressed the increased size of tall-narrow files since, unless I've missed something, tall-narrow files have to be larger by definition.

But, as for utility, I have to agree with Mary. That is, why store your data in a different format than the one you usually need? Enquiring minds want to know.

Art
-------
On Thu, 11 Sep 2008 17:07:44 -0500, Mary <mlhoward@AVALON.NET> wrote:

>Yes, I halved the number of rows by using the two matching variables on the
>same row, but note that the varID variable has to be as large as the largest
>variable name, such as 30 characters, not 8 characters as in your
>calculation. I just looked at that variable by itself by creating a data
>set with just the varID variable, and even with half the rows it normally
>would have been it was 100MB, which by itself is 3 times the size of the
>original file!
>
>-Mary
>----- Original Message -----
>From: Nordlund, Dan (DSHS/RDA)
>To: SAS-L@LISTSERV.UGA.EDU
>Sent: Thursday, September 11, 2008 4:48 PM
>Subject: Re: KEEP DROP array variables
>
>
>>> > ----- Original Message -----
>> > From: Mary
>> > To: ./ ADD NAME=Data _null_, ; SAS-L@LISTSERV.UGA.EDU
>> > Sent: Thursday, September 11, 2008 9:12 AM
>> > Subject: Re: Re: KEEP DROP array variables
>> >
><<<snip>>>
>
>I am not going to comment on the appropriateness of wide vs. narrow. But
>the fact that Mary found a big increase in size when going to narrow does
>not surprise me. Let's use numbers "like" Mary is giving: 1000 rows, 6000
>data variables with let's say 1 additional ID variable (all numeric). This
>is 8 * 1000 * 6001 = 48008000 ~ 48MB.
>
>In the narrow file we will have an obsID, a varID variable, and a data Value
>variable. There will be 6000*1000 rows of 3 variables. This is
>8*3*6000*1000 = 1.44 X 10^8 = 144MB. It may be that Mary had extraneous
>variables that didn't need to be there, but the narrow file will be
>substantially larger than the same data stored in a wide format.
>
>Dan
>
>Daniel J. Nordlund
>Washington State Department of Social and Health Services
>Planning, Performance, and Accountability
>Research and Data Analysis Division
>Olympia, WA 98504-5204
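
For anyone who wants to rerun the size arithmetic quoted above, here is a rough sketch in Python. It is not from the thread: the function names are made up for illustration, and it assumes uncompressed storage with 8 bytes per numeric variable and no page or header overhead, so it only approximates what SAS would actually write to disk. The third figure applies Mary's point about a 30-character varID to the full (unhalved) row count.

```python
BYTES_PER_NUMERIC = 8  # assumption: every numeric variable takes 8 bytes


def wide_bytes(n_obs, n_vars, n_id_vars=1):
    """Approximate size of the short-wide layout: one row per observation."""
    return BYTES_PER_NUMERIC * n_obs * (n_vars + n_id_vars)


def narrow_bytes(n_obs, n_vars, var_id_len=BYTES_PER_NUMERIC):
    """Approximate size of the tall-narrow layout: one row per value.

    Each row holds obsID (numeric), varID (var_id_len bytes), and Value (numeric).
    """
    row_len = BYTES_PER_NUMERIC + var_id_len + BYTES_PER_NUMERIC
    return n_obs * n_vars * row_len


# Numbers "like" the ones in the quoted message: 1000 rows, 6000 data variables.
wide = wide_bytes(1000, 6000)
narrow_numeric_id = narrow_bytes(1000, 6000)              # varID stored in 8 bytes
narrow_char30_id = narrow_bytes(1000, 6000, var_id_len=30)  # varID as $30

print(f"wide:                   {wide:>12,d} bytes")
print(f"narrow, 8-byte varID:   {narrow_numeric_id:>12,d} bytes")
print(f"narrow, 30-byte varID:  {narrow_char30_id:>12,d} bytes")
print(f"narrow/wide ratio:      {narrow_numeric_id / wide:.2f}")
```

Under these assumptions the narrow layout comes out at roughly three times the wide one with an 8-byte varID, and larger still once varID has to hold 30-character variable names.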

