Date:     Fri, 21 Jun 1996 17:07:07 +0000
Reply-To: "Bruce A. Rayton"
Sender:   "SAS(r) Discussion"
From:     "Bruce A. Rayton"
Subject:  SIC codes -- Leading zeros

I need to break down some government data into component industries for merging into another dataset.

This government file looks something like this:

INDUSTRY  VALUE
07           12
007          30
0007          7
7            99
70          200
700         300
7000        400

The firm data file looks like this:

FIRM   SIC  SIC3  SIC2  SIC1
A     7000   700    70     7
B     7001   700    70     7
C      721    72     7     0
D      728    72     7     0
D       79     7     0     0
E        7     0     0     0

I need to construct SIC, SIC3, SIC2, and SIC1 in the government dataset so that I can merge values into the firm dataset by the appropriate industry grouping. The goal is to put the least aggregated government data available into the firm dataset.

E.g.,

    if (4-digit data available) then merge it in;
    else if (3-digit data available) then merge it in;
    else if (2-digit data available) then merge it in;

    ** The data is available for every 2-digit industry.
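To make the goal concrete, here is a rough sketch of that fallback written in Python rather than SAS. It assumes the leading-zero problem is already solved, so each level sits in its own lookup table. The function name and the three tables are made up for illustration; the values are the ones from the sample government file above.

```python
def least_aggregated(firm, by_sic4, by_sic3, by_sic2):
    """Return the value from the most detailed level that has data."""
    if firm["SIC"] in by_sic4:
        return by_sic4[firm["SIC"]]
    if firm["SIC3"] in by_sic3:
        return by_sic3[firm["SIC3"]]
    # Two-digit data exists for every industry, so this is the last resort.
    return by_sic2[firm["SIC2"]]

# Hypothetical lookup tables, keyed the same way as the firm file.
by_sic4 = {7000: 400}
by_sic3 = {700: 300}
by_sic2 = {70: 200, 7: 12}

firm_a = {"FIRM": "A", "SIC": 7000, "SIC3": 700, "SIC2": 70}
firm_b = {"FIRM": "B", "SIC": 7001, "SIC3": 700, "SIC2": 70}
firm_c = {"FIRM": "C", "SIC": 721, "SIC3": 72, "SIC2": 7}

value_a = least_aggregated(firm_a, by_sic4, by_sic3, by_sic2)  # 4-digit match
value_b = least_aggregated(firm_b, by_sic4, by_sic3, by_sic2)  # falls back to 3-digit
value_c = least_aggregated(firm_c, by_sic4, by_sic3, by_sic2)  # falls back to 2-digit
```

Firm A picks up the four-digit value, firm B falls back to the three-digit value, and firm C falls back to the two-digit value for industry 07.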

I need the first observation to be associated with the two-digit industry and the second with the three-digit industry. Suppose I read in the INDUSTRY variable as numeric. This would properly classify observations 4-7, but it would fail on observations 1-3: the first four observations in the government dataset would all get an industry number of 7. (Perhaps I've discovered the REAL reason everyone focuses on manufacturing firms -- they have SIC codes 2000-3900 <g>)
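The collapse is easy to reproduce outside SAS. This is only an illustration in Python, using the seven INDUSTRY codes from the sample government file; the variable names are my own.

```python
# The seven INDUSTRY codes exactly as they appear in the government file.
raw = ["07", "007", "0007", "7", "70", "700", "7000"]

# Reading them as numbers drops the leading zeros.
as_numeric = [int(code) for code in raw]

# The first four codes are now indistinguishable from one another.
distinct_among_first_four = set(as_numeric[:4])
```

After the conversion the first four entries are all 7, so the information about which aggregation level each one belongs to is gone.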

Notice that this isn't a problem for industries that don't start with a zero. The numeric representation works just fine in that instance, and I can separate the data based on the range of the INDUSTRY variable. For example, all industry numbers between 100 and 999 are three-digit industries (if we ignore this problem). The trick is distinguishing the industries that start with zero.
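As a sketch of the range idea (again in Python, with a function name I made up), the level can be guessed from the size of the number. It gives the right answer for 70, 700 and 7000, and the wrong one for anything that started with a zero.

```python
def digits_from_range(industry):
    """Guess the aggregation level from the size of a numeric code."""
    if industry >= 1000:
        return 4
    if industry >= 100:
        return 3
    if industry >= 10:
        return 2
    return 1

# Codes without a leading zero are classified correctly.
ok = [digits_from_range(n) for n in (70, 700, 7000)]

# "07" is a two-digit industry, but as a number it is 7 and looks one-digit.
wrong = digits_from_range(int("07"))
```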

If I read INDUSTRY in as a character variable, then it doesn't match the numeric SIC codes I have in the main dataset. I can pull SUBSTRings of these variables, but I can't figure out a way to make this help me.

Help greatly appreciated.

Bruce

rayton@wuecona.wustl.edu

***********************************************************************
| Dr. Bruce A. Rayton                      Office: (0115) 941-8418    |
| Nottingham Trent University, Dept of EPA Home:   (0115) 985-6821    |
| Burton Street                            Fax:    (0115) 948-6808    |
| Nottingham NG1 4BU                       rayton@wuecona.wustl.edu   |
| England                         http://wuecon.wustl.edu/~bruceray   |
***********************************************************************
<<< I prefer electronic deliveries of manuscripts >>>
