Date: Sun, 18 Mar 2012 23:13:36 +0100
Reply-To: John F Hall <johnfhall@orange.fr>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: John F Hall <johnfhall@orange.fr>
Subject: Re: In support of CTABLES
In-Reply-To: <OF6F6C02C2.1924E119-ON872579C4.0049BD2F-872579C4.004AFB55@us.ibm.com>
Jon
Thanks for this and apologies for the lengthy reply. I’ve not tried David’s solution yet, but that’s on my to-do list.
I’ve spent the last couple of days experimenting with CTABLES on a larger data set (my 1975 Quality of Life in Britain survey), as I’m more familiar with it and the variables are a close match for the example in the CSR (sex, marital status, and feeling very, fairly or not at all happy). I’ve done this for “happy” using percentages, both in marital by happy by sex format using CROSSTABS and in my elaboration summary table format, which involves recoding to 0,100 and using MEANS. I’ve then attempted to repeat the analysis using CTABLES.
As it happens I use almost the same example in tutorial 3.1.3 to introduce the simultaneous tabulation of two variables. This is the first time in the course that students will come across joint frequency distributions.
[Extract from tutorial]
_____
We'll be using data from this survey to explore the relationship:
Marital status → Feeling happy
. . . or is it the other way round?
Is this the true story, or are there any other variables (related or unrelated to marital status) which might influence feeling happy? What might they be? How do they affect the relationship between marital status and feeling happy? Thus, as well as dependent and independent variables, we also need to think of test variables to examine the initial relationship between marital status and feeling happy by controlling for the test variables. Does marital status affect feeling happy at all? These are the kind of questions which make survey research so interesting.
_____
[The following examples use CROSSTABS and are from the SSRC Survey Unit Quality of Life in Britain survey, 1975]
Exercise 1: The sequence followed in class would produce output like this:
Q.53 How [happy] are you these days? * Marital status of respondent Crosstabulation
Count
               Marital status of respondent
               Single  Married  Widowed  Divorced or separated    Total
Not too happy       7       29       17                      4       57
Fairly Happy      105      337       58                     16      516
Very happy         38      283       23                      9      353
Total             150      649       98                     29      926
Q.53 How [happy] are you these days? * Marital status of respondent Crosstabulation
                                         Marital status of respondent
                                         Single  Married  Widowed  Divorced or separated    Total
Not too happy
  Count                                       7       29       17                      4       57
  % within Marital status of respondent    4.7%     4.5%    17.3%                  13.8%     6.2%
Fairly Happy
  Count                                     105      337       58                     16      516
  % within Marital status of respondent   70.0%    51.9%    59.2%                  55.2%    55.7%
Very happy
  Count                                      38      283       23                      9      353
  % within Marital status of respondent   25.3%    43.6%    23.5%                  31.0%    38.1%
Total
  Count                                     150      649       98                     29      926
  % within Marital status of respondent  100.0%   100.0%   100.0%                 100.0%   100.0%
This is a bit cluttered, so just use col%:
Q.53 How [happy] are you these days? * Marital status of respondent Crosstabulation
% within Marital status of respondent
               Marital status of respondent
               Single  Married  Widowed  Divorced or separated    Total
Not too happy    4.7%     4.5%    17.3%                  13.8%     6.2%
Fairly Happy    70.0%    51.9%    59.2%                  55.2%    55.7%
Very happy      25.3%    43.6%    23.5%                  31.0%    38.1%
Total          100.0%   100.0%   100.0%                 100.0%   100.0%
. . . but it's much easier to compare figures visually down columns rather than across rows.
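As a cross-check outside SPSS, the column percentages in the tables above can be reproduced from the cell counts of the first crosstab. This is only a minimal Python sketch; Python, the function name column_percentages and the variable names are illustrative assumptions and not part of the tutorial or of SPSS.

```python
# Cell counts copied from the first crosstab
# (rows: Q.53 happy; columns: marital status).
marital = ["Single", "Married", "Widowed", "Divorced or separated"]
counts = {
    "Not too happy": [7, 29, 17, 4],
    "Fairly Happy": [105, 337, 58, 16],
    "Very happy": [38, 283, 23, 9],
}


def column_percentages(table):
    """Return (percentages, bases): each cell as a % of its column total."""
    n_cols = len(next(iter(table.values())))
    bases = [sum(row[j] for row in table.values()) for j in range(n_cols)]
    pct = {
        label: [100.0 * row[j] / bases[j] for j in range(n_cols)]
        for label, row in table.items()
    }
    return pct, bases


pct, bases = column_percentages(counts)
for label, row in pct.items():
    print(f"{label:<14}" + "".join(f"{p:8.1f}" for p in row))
# The base for percentaging (N = 100%) printed as the last row.
print(f"{'N = 100%':<14}" + "".join(f"{n:8d}" for n in bases))
```

Printing the column bases alongside the percentages keeps the base for percentaging visible, which is the same idea as replacing the 100% totals with n.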
I prefer to have the dependent variable across the top of the table and the independent variable(s) down the side, viz:
marital * happy Crosstabulation
% within marital
                       happy
marital                Not too happy  Fairly happy  Very happy   Total
Married or cohabiting           6.5%         68.6%       24.8%  100.0%
Single                          3.2%         46.9%       49.9%  100.0%
Widowed                        11.7%         62.8%       25.5%  100.0%
Separated or divorced          35.3%         55.9%        8.8%  100.0%
Total                           5.7%         52.2%       42.1%  100.0%
This table is easier to interpret, but we have lost the base for percentaging at the end of each row. Without special programming beyond the scope of this tutorial, SPSS cannot produce a table with n instead of 100%. A more useful table would look like this:
marital * happy Crosstabulation
% within marital
                       happy
marital                Not too happy  Fairly happy  Very happy  N = 100%
Married or cohabiting           6.5%         68.6%       24.8%       150
Single                          3.2%         46.9%       49.9%       649
Widowed                        11.7%         62.8%       25.5%        98
Separated or divorced          35.3%         55.9%        8.8%        29
Total                           5.7%         52.2%       42.1%       926
The 3-way table for elaboration:
Marital status of respondent * Q.53 How [happy] are you these days? * Sex of Respondent Crosstabulation
% within Marital status of respondent
                                   Q.53 How [happy] are you these days?
Sex of      Marital status of
Respondent  respondent             Not too happy  Fairly Happy  Very happy   Total
Men         Single                          5.3%         75.0%       19.7%  100.0%
            Married                         5.2%         55.6%       39.2%  100.0%
            Widowed                        26.7%         60.0%       13.3%  100.0%
            Divorced or separated          16.7%         66.7%       16.7%  100.0%
            Total                           6.2%         59.7%       34.0%  100.0%
Women       Single                          4.1%         64.9%       31.1%  100.0%
            Married                         3.9%         49.0%       47.1%  100.0%
            Widowed                        15.7%         59.0%       25.3%  100.0%
            Divorced or separated          13.0%         52.2%       34.8%  100.0%
            Total                           6.1%         52.9%       41.0%  100.0%
Total       Single                          4.7%         70.0%       25.3%  100.0%
            Married                         4.5%         51.9%       43.6%  100.0%
            Widowed                        17.3%         59.2%       23.5%  100.0%
            Divorced or separated          13.8%         55.2%       31.0%  100.0%
            Total                           6.2%         55.7%       38.1%  100.0%
Again, it would be more useful if it looked like this:
Sex of Respondent * Q.53 How [happy] are you these days? * Marital status of respondent Crosstabulation
% within Sex of Respondent
                                   Q.53 How [happy] are you these days?
Marital status of      Sex of
respondent             Respondent  Not too happy  Fairly Happy  Very happy  N = 100%
Single                 Men                  5.3%         75.0%       19.7%        76
                       Women                4.1%         64.9%       31.1%        74
                       Total                4.7%         70.0%       25.3%       150
Married                Men                  5.2%         55.6%       39.2%       288
                       Women                3.9%         49.0%       47.1%       361
                       Total                4.5%         51.9%       43.6%       649
Widowed                Men                 26.7%         60.0%       13.3%        15
                       Women               15.7%         59.0%       25.3%        83
                       Total               17.3%         59.2%       23.5%        98
Divorced or separated  Men                 16.7%         66.7%       16.7%         6
                       Women               13.0%         52.2%       34.8%        23
                       Total               13.8%         55.2%       31.0%        29
Total                  Men                  6.2%         59.7%       34.0%       385
                       Women                6.1%         52.9%       41.0%       541
                       Total                6.2%         55.7%       38.1%       926
This is my starting point for CTABLES.
I’ve replicated the above exercises using CTABLES, but it took me a very long time to get used to the displays and routing. I’m beginning to get the hang of it, but I still think it’s far too complicated for the kind of students I taught and the timetable constraints they (and I) faced. Full-time postgrads and early career research staff have fewer constraints on their time. I took many a wrong path on the way, as would many of my students. This was one reason we abandoned a teaching experiment using SPSS PC+ on PCs: the students finished up all over the place and we reverted to SPSS-X on the mainframe using syntax on VDUs.
That said I have started on a draft tutorial for CTABLES with much more explanation than the CSR or the on-line help, and with the addition of step-by-step screenshots. On the way I discovered some really nice features, such as the table layout previews, especially the ability to drag variables around from these to rows or columns to see what the table would look like. I found it irritating that the display reverted to the top of the file when starting a new analysis rather than going back to where I left off.
I have yet to work out how to get row and column totals (and %%) into tables.
CTABLES
/VLABELS VARIABLES=marital sex happy DISPLAY=DEFAULT
/TABLE marital > sex BY happy [C] [rowpct f3.1]
/CATEGORIES VARIABLES=marital sex happy ORDER=A KEY=VALUE EMPTY=INCLUDE.
                                   Q.53 How [happy] are you these days?
Marital status of      Sex of      Not too happy  Fairly Happy  Very happy
respondent             Respondent        Row N %       Row N %     Row N %
Single                 Men                   5.3          75.0        19.7
                       Women                 4.1          64.9        31.1
Married                Men                   5.2          55.6        39.2
                       Women                 3.9          49.0        47.1
Widowed                Men                  26.7          60.0        13.3
                       Women                15.7          59.0        25.3
Divorced or separated  Men                  16.7          66.7        16.7
                       Women                13.0          52.2        34.8
This table needs an additional column giving the base N for the percentages in each row, which I can do by hand, but I’ll try to do it in CTABLES.
Example 2: Summary table
recode happy (3 = 100) (1,2 =0)(else = sysmis) into happy2.
var level happy2 (scale).
* Custom Tables.
CTABLES
/VLABELS VARIABLES=sex marital happy2 DISPLAY=DEFAULT
/TABLE sex [C] BY marital [C] > happy2 [S][MEAN]
/CATEGORIES VARIABLES=sex [1, 2, OTHERNM] EMPTY=INCLUDE
/CATEGORIES VARIABLES=marital [1, 2, 3, 4, OTHERNM] EMPTY=INCLUDE.
                     Marital status of respondent
                     Single  Married  Widowed  Divorced or separated
                     happy2   happy2   happy2                 happy2
Sex of Respondent      Mean     Mean     Mean                   Mean
Men                   19.74    39.24    13.33                  16.67
Women                 31.08    47.09    25.30                  34.78
So far, so good, but I want a table that looks something like this (edited rather clumsily in Word):
                               Marital status of respondent
                               All  Single  Married  Widowed  Divorced or separated
All                %         38.12   25.33    43.61    23.47                  31.03
                   n=100%      926     150      649       98                     29
Sex of Respondent
Men                %         34.03   19.74    39.24    13.33                  16.67
                   n=100%      385      76      288       15                      6
Women              %         41.04   31.08    47.09    25.30                  34.78
                   n=100%      541      74      361       83                     23
I’d prefer the percentages in f3.1 format.
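The 0,100 recode works because the mean of a variable coded 100 for “very happy” and 0 otherwise is simply the percentage very happy. The “All” row of the table above can be checked this way from the two-way counts given earlier in this message. This is a rough Python sketch, not SPSS output; the helper mean_of_recode and the dictionary names are made up for illustration.

```python
# Counts of "Very happy" and column bases, copied from the first crosstab.
very_happy = {"Single": 38, "Married": 283, "Widowed": 23,
              "Divorced or separated": 9}
base_n = {"Single": 150, "Married": 649, "Widowed": 98,
          "Divorced or separated": 29}


def mean_of_recode(n_hundred, n_total):
    """Mean of a variable coded 100 for n_hundred cases and 0 for the rest."""
    values = [100] * n_hundred + [0] * (n_total - n_hundred)
    return sum(values) / len(values)


# "All" pools every marital status; the other entries are one column each.
means = {"All": mean_of_recode(sum(very_happy.values()), sum(base_n.values()))}
for status in base_n:
    means[status] = mean_of_recode(very_happy[status], base_n[status])

for label, value in means.items():
    # Two decimals as in the MEANS output, one decimal as in an f3.1 layout.
    print(f"{label:<22}{value:7.2f}{value:7.1f}")
```

The second printed column shows what the same figures look like at one decimal place.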
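From this table we can calculate epsilons (percentage-point differences, men minus women):
  All  Single  Married  Widowed  Divorced or separated
 -7.0   -11.3     -7.9    -12.0                  -18.1
. . . and begin to discuss how to interpret these figures and what other variables might be included.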
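For anyone wanting to check the epsilons, they are the men's percentage minus the women's percentage in each column of the summary table. The following Python sketch is only an illustration: the names are invented, and Decimal with half-up rounding is my assumption, chosen so that a difference such as -7.85 is not disturbed by binary floating point.

```python
from decimal import Decimal, ROUND_HALF_UP

columns = ["All", "Single", "Married", "Widowed", "Divorced or separated"]
# Percent "very happy" (mean of happy2), copied from the summary table.
men = ["34.03", "19.74", "39.24", "13.33", "16.67"]
women = ["41.04", "31.08", "47.09", "25.30", "34.78"]


def epsilon(pct_a, pct_b):
    """Percentage-point difference pct_a - pct_b, rounded to one decimal."""
    diff = Decimal(pct_a) - Decimal(pct_b)
    return diff.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


epsilons = {col: epsilon(m, w) for col, m, w in zip(columns, men, women)}
for col, eps in epsilons.items():
    print(f"{col:<22}{eps:>7}")
```

Every epsilon is negative: in this table a smaller share of men than of women report being very happy within each marital status, with the gap largest among the divorced or separated (on a base of only 6 men).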
A much better example is the difference in earnings between men and women at zero order and when controlling for other variables such as qualifications, full-time or part-time, employee or self-employed etc., but that is a lot of work.
Like I said, sorry for the length of this, but it gives you a better idea of what I’m trying to do.
John
Email: johnfhall@orange.fr
Website: www.surveyresearch.weebly.com <http://surveyresearch.weebly.com/>
Skype: surveyresearcher1
Phone: (+33) (0) 2.33.45.91.47
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Jon K Peck
Sent: 17 March 2012 14:39
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: In support of CTABLES
Using employee data.sav
This produces almost the table you want (labelling suppressed as in your example)
CTABLES
/VLABELS VARIABLES=salary DISPLAY=NONE
/TABLE gender > salary [MEAN, COUNT PAREN40.0] BY minority
/SLABELS POSITION=ROW VISIBLE=NO
/CATEGORIES VARIABLES=gender minority TOTAL=YES POSITION=BEFORE.
If you want to align the counts right, you can select those cells in the table editor and change the alignment. This could be automated with the SPSSINC MODIFY TABLES extension command. You could also add striping every other row via a tableLook or preference setting.
Alternatively, put the counts in adjacent cells like this.
CTABLES
/VLABELS VARIABLES=salary DISPLAY=NONE
/TABLE gender [C] > salary [S][MEAN, COUNT PAREN40.0] BY minority [C]
/SLABELS VISIBLE=NO
/CATEGORIES VARIABLES=gender minority TOTAL=YES POSITION=BEFORE .
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
peck@us.ibm.com
new phone: 720-342-5621
From: John F Hall <johnfhall@orange.fr>
To: SPSSX-L@listserv.uga.edu
Date: 03/17/2012 04:38 AM
Subject: Re: [SPSSX-L] In support of CTABLES
Sent by: "SPSSX(r) Discussion" <SPSSX-L@listserv.uga.edu>
_____
This exchange (and pouring rain) has prompted me to explore CTABLES with some of the examples in my tutorials. The CSR, whilst thorough in listing everything CT can do, is less than helpful for working out how to do what the user wants: the Help tutorials for CT inside SPSS are uninformative and sparse.
Looks like 1972 all over again, when Jim Ring and I had to write a series of new handouts (effectively a manual to the manual) for researchers who came to the SSRC Survey Unit for advice and assistance. These detailed, in simple language, the various stages of survey data capture, file management and statistical analysis, and explained how to do this with SPSS. We even joked about writing a “Clod’s Guide to Survey Analysis Using SPSS”. They formed the basis for teaching notes on our Summer Schools in Survey Methods (1970-76) and, eventually, for my Survey Analysis Workshop (1976-92) at the then Polytechnic of North London. Apart from workshop exercises and supplementary explanations, we used Marija Norusis’s wonderful book from 1987 onwards (bought in bulk and sold at cost to students).
Unless I can find a downloadable CT explanation and workthrough for the kind of tables I need, it seems I’ll have to write something myself. I’ve been playing with CT on some of my course data and noted that it is very, very fast. What I need to do now is work backwards from the output I used to get using BREAKDOWN (which always caused a laugh from my students when it was first mentioned) . . . /CROSSBREAK (now superseded) to see if CT can produce it. I’m looking for a way to get this table:
sexism2 * sex * ethnic
sexism2
         ethnic
         White          Other          Total
sex       Mean     N     Mean     N     Mean     N
Boys     13.41    22    11.90    20    12.69    42
Girls     9.17    30     8.64    14     9.00    44
Total    10.96    52    10.56    34    10.80    86
Into a format which matches this blank table:
Sexism
Mean (n)      All          White        Other
All                ( )          ( )          ( )
Boys               ( )          ( )          ( )
Girls              ( )          ( )          ( )
[This one is for means, but it applies equally to percentages for elaboration.]
A quick search for SPSS CROSSBREAK produced this 2009 correspondence from Jon Peck:
<http://listserv.uga.edu/cgi-bin/wa?A2=ind0909&L=spssx-l&P=24774>, from which I hope to be able to produce:
Sexism
Mean (n)      All          White        Other
All           10.8 (86)    11.0 (52)    10.6 (34)
Boys          12.7 (42)    13.4 (22)    11.9 (20)
Girls          9.0 (44)     9.2 (30)     8.6 (14)
The main point about such tables is that the sample statistic appears top left, first order statistics in the 1st row/ column and second order statistics in the 2nd and 3rd rows/columns: right-to-left language users may prefer the table to be flipped horizontally. I would normally do this with percentages, following on from CROSSTABS, as it is a relatively simple way of demonstrating analysis by breaking down a statistic into constituent parts, forcing students to think about explanations for the emerging pattern and about other variables which might be introduced as 3rd order controls. Once the concept of a mean is introduced and understood, using similar summary tables, students can progress to further statistical tests.
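The idea of breaking a statistic down into constituent parts can be checked by hand: each marginal mean in the sexism2 table is the n-weighted average of the cell means it pools. Below is a small Python sketch of that arithmetic, using the four cell means and Ns from the BREAKDOWN-style table; the function pooled and the layout are illustrative only, and because the cell means are already rounded to two decimals the pooled values are approximations that happen to agree with the printed margins here.

```python
# (mean, n) for each sex-by-ethnic cell, copied from the sexism2 table.
cells = {
    ("Boys", "White"): (13.41, 22),
    ("Boys", "Other"): (11.90, 20),
    ("Girls", "White"): (9.17, 30),
    ("Girls", "Other"): (8.64, 14),
}


def pooled(sex=None, ethnic=None):
    """n-weighted mean and total n over cells matching sex and/or ethnic."""
    chosen = [
        (mean, n)
        for (s, e), (mean, n) in cells.items()
        if sex in (None, s) and ethnic in (None, e)
    ]
    total_n = sum(n for _, n in chosen)
    return sum(mean * n for mean, n in chosen) / total_n, total_n


# Print in the summary layout: overall statistic top left, then margins.
print(f"{'':<8}{'All':>12}{'White':>12}{'Other':>12}")
for sex in (None, "Boys", "Girls"):
    line = f"{sex or 'All':<8}"
    for ethnic in (None, "White", "Other"):
        mean, n = pooled(sex, ethnic)
        line += f"{mean:>7.1f} ({n})"
    print(line)
```

The printed layout follows the summary table above, with the sample statistic top left and the first-order breakdowns in the first row and column.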
There are three international rugby matches on BBC this afternoon, so I’ve got less than two hours to modify Jon’s syntax and see what I come up with (and fit a sandwich in for lunch: just like being back at work!).
John Hall
Email: johnfhall@orange.fr
Website: www.surveyresearch.weebly.com <http://surveyresearch.weebly.com/>
Skype: surveyresearcher1
Phone: (+33) (0) 2.33.45.91.47