Date: Mon, 17 Nov 2008 15:48:41 -0600
Reply-To: aldi@wustl.edu
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Aldi Kraja <aldi@WUSTL.EDU>
Subject: Re: Probchi and large Chi-squares
Hi Dale,
I see that you are familiar with genetic tests.
My point was that with so many tests a small p-value is expected, and I
was arguing that the SAS format should not be left at the default, which
displays p < 0.0001.
But as you know, p-values also depend on the power available in the
analysis. The larger the data set (number of subjects) and the larger
the contribution of the genetic polymorphism studied, the smaller the
p-value will be, which indicates stronger evidence for that particular
polymorphism as a contributor in the model.
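To make the dependence on sample size concrete, here is a minimal Python
sketch. It assumes a deliberately simplified model that is not taken
from any of the papers discussed here: a two-sided z-test whose
statistic is z = effect * sqrt(n), where `effect` is a hypothetical
standardized effect size. The function name and the numbers are
illustrative only.

```python
import math

def two_sided_p(effect: float, n: int) -> float:
    """Two-sided p-value of a z-test with z = effect * sqrt(n).

    `effect` is a hypothetical standardized effect size (in SD units);
    this is a simplifying assumption, not a genetic association model.
    """
    z = abs(effect) * math.sqrt(n)
    # P(|Z| > z) for a standard normal Z, computed without 1 - CDF
    return math.erfc(z / math.sqrt(2.0))

# Same effect, more subjects: the p-value shrinks quickly.
for n in (1_000, 10_000, 100_000):
    for effect in (0.02, 0.05):
        print(f"n={n:>7}  effect={effect:.2f}  p={two_sided_p(effect, n):.2e}")
```

Under this assumption the p-value falls both when n grows with the
effect held fixed and when the effect grows with n held fixed.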
Here is a paragraph from a recent paper where n = 60,352 and the
p-value is 2.8 x 10^(-15):
"...mapped 188 kb downstream of MC4R (melanocortin-4 receptor),
mutations of which are the leading cause of monogenic severe
childhood-onset obesity. We confirmed the BMI association in 60,352
adults (per-allele effect = 0.05 Z-score units; P = 2.8 x 10^(-15)) and
5,988 children aged 7-11 (0.13 Z-score units; P = 1.5 x 10^(-8)). In
case-control analyses (n = 10,583), the odds for severe childhood
obesity reached 1.30 (P = 8.0 x 10^(-11))."
The trend is that different studies pool their data (in meta-analyses)
to increase the power of the analysis.
In my previous email I claimed that very low p-values are a recent
fact of life in genetic analysis.
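On the Probchi question that started this thread, a minimal Python
sketch (not SAS) may help show why 1 - CDF stops at about 1E-16 while
the upper tail itself is still representable. It relies on the fact
that a chi-square variable with 1 DF is the square of a standard
normal, so the upper tail equals erfc(sqrt(x/2)). The function names
and the test values are illustrative assumptions. In SAS, the analogous
approach would presumably be an upper-tail function such as
SDF('CHISQ', x, 1); I have not run that here.

```python
import math

def chisq1_upper_tail(x: float) -> float:
    """P(X > x) for a chi-square with 1 degree of freedom.

    Uses P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)), which keeps
    full relative precision for very small tail probabilities.
    """
    return math.erfc(math.sqrt(x / 2.0))

def chisq1_one_minus_cdf(x: float) -> float:
    """Same quantity computed as 1 - CDF.

    Once the CDF rounds to exactly 1.0 in double precision, the
    subtraction returns 0.
    """
    return 1.0 - math.erf(math.sqrt(x / 2.0))

# Bonferroni per-test threshold for alpha = 0.05 over 1 million tests.
bonferroni = 0.05 / 1_000_000

for x in (30.0, 100.0):
    direct = chisq1_upper_tail(x)
    naive = chisq1_one_minus_cdf(x)
    print(f"chisq={x:6.1f}  upper tail={direct:.3e}  1-CDF={naive:.3e}  "
          f"below Bonferroni threshold={direct < bonferroni}")
```

For a chi-square of 100 the 1 - CDF form returns exactly 0, while the
direct upper tail is a small positive number far below the 5e-8
threshold. So the zero is a floating-point effect of the subtraction
rather than only a display format issue.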
Thanks,
Aldi
Dale McLerran wrote:
> Aldi,
>
> Please elaborate on how a p-value less than 1E-16 would be useful
> in any analysis. You suggest that small p-values have utility
> in genetic analyses, and I don't disagree. However, for a
> p-value smaller than 1E-16 to be of value in a situation where
> multiple testing is an issue, it seems to me that one would
> need to be examining 1E14 or more tests. I have yet to see any
> analysis where there were anywhere near that many tests performed.
>
> Dale
>
> --- On Mon, 11/17/08, Aldi Kraja <aldi@WUSTL.EDU> wrote:
>
>
>> From: Aldi Kraja <aldi@WUSTL.EDU>
>> Subject: Re: Probchi and large Chi-squares
>> To: SAS-L@LISTSERV.UGA.EDU
>> Date: Monday, November 17, 2008, 7:19 AM
>> Mary has given you a correct answer: play with the format.
>> In my case I do not define any format, because the ODS tables will
>> give you the right p-values even when they are very small, as long as
>> you copy them into your own tables and remove the SAS format.
>>
>> For example, I capture a table with
>>
>>    ods output SAStablename=mytablename;
>>
>> and when I read my table, I put the following before the SET
>> statement, which removes the default SAS format from ProbF:
>>
>>    format ProbF ;
>>
>> Regarding such low p-values: they are meaningful when p (the number
>> of variables tested) is large. For example, if one runs 1 million
>> independent tests at alpha = 0.05, one would expect 5% false
>> positives by chance alone. That is why multiple-testing corrections
>> have been developed in genetic analysis and in other sciences (see
>> PROC MULTTEST). The simplest and most conservative one is the
>> Bonferroni correction: for a family-wise alpha = 0.05, the per-test
>> threshold is 0.05/(# of tests = 1M) = 5e-8.
>>
>> HTH,
>>
>> Aldi
>>
>>
>> Mary wrote:
>>
>>> To answer the question: save the result with ODS and then you can set
>>> the format of the resulting value however you want; it is a format
>>> question rather than a value question. I haven't actually tried this
>>> big a format in SAS, but I know that it saves the value, as I can
>>> reformat it over in Excel, so I know it does save the decimals.
>>>
>>> ods select ChiSq;
>>> data chisq_set;
>>>   informat chi_square 20.14;
>>>   format chi_square 20.14 table $256.;
>>>   stop;
>>> run;
>>> ods output chisq=chisq_set;
>>> proc freq data=set2;
>>>   tables amd_flag * &snp / chisq nocol norow nopercent;
>>>   title "&snp";
>>> run;
>>>
>>> data chisq_set;
>>>   informat chi_square 20.14;
>>>   format chi_square 20.14 table $256.;
>>>   set chisq_set;
>>>   where Statistic='Chi-Square';
>>>   /* keep only the SNP name that follows the '*' in the table label */
>>>   table=strip(substr(table,index(table,'*')+1));
>>>   /* despite its name, chi_square holds the p-value (Prob) */
>>>   chi_square=prob;
>>>   keep table chi_square;
>>> run;
>>>
>>> To answer Dale's question, WHY?, this may be due to many thousands of
>>> multiple tests, for instance in genetics.
>>>
>>> In my field, ophthalmology (macular degeneration), a major paper was
>>> published in which all but 2 of over 116,000 SNPs were discarded:
>>>
>>> Complement factor H polymorphism in age-related macular degeneration.
>>> Klein et al. Science. 2005 Apr 15;308(5720):385-9.
>>>
>>> In this, they set a Bonferroni acceptance level of 10^-7.
>>>
>>> http://www.sciencemag.org/cgi/content/full/308/5720/385
>>>
>>> "Single-marker associations. For each SNP, we tested for allelic
>>> association with disease status. To account for multiple testing, we
>>> used the Bonferroni correction and considered significant only those
>>> SNPs for which P < 0.05/103,611 = 4.8 x 10^-7. This correction is
>>> known to be conservative and thus "over-corrected" the raw P values
>>> (14). Of the autosomal SNPs, only two, rs380390 and rs10272438, are
>>> significantly associated with disease status (Bonferroni-corrected
>>> P = 0.0043 and P = 0.0080, respectively) (Fig. 1A)."
>>>
>>> -Mary
>>>
>>> ----- Original Message -----
>>> From: baxtefer@GMAIL.COM
>>> To: SAS-L@LISTSERV.UGA.EDU
>>> Sent: Monday, November 10, 2008 3:28 PM
>>> Subject: Probchi and large Chi-squares
>>>
>>>
>>> This might be a really stupid question...
>>> I need to get the exact probabilities associated with a number of
>>> large chi-square values (i.e. >100, 1 DF).
>>> Using
>>>    P = 1-Probchi(ChiSq,1);
>>> gives me values of P down to around 1E-16, but anything less than
>>> that gets represented as 0.
>>> Is this a real issue, or am I dealing with a number formatting
>>> problem?
>>>
>>> Thanks in advance!