Find sets of discoveries for a range of effect sizes, controlling the False Discovery Rate (FDR) for each set.

nest_confects(n, pfunc, fdr = 0.05, step = 0.001, full = FALSE)

Arguments

n

Number of items being tested.

pfunc

A function(indices, effect_size) to calculate p-values. indices is a subset of 1:n identifying the items for which p-values are to be computed. Should return a numeric vector of length length(indices), with one p-value per requested index.

fdr

False Discovery Rate to control for.

step

Granularity of effect sizes to test.

full

If TRUE, also include an FDR-adjusted p-value for the test that the effect size is non-zero. Note that this is against the spirit of the topconfects approach.
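As an illustration of the pfunc argument, here is a minimal sketch of a p-value function for estimates with known standard errors. The names est, se and pfunc_z are hypothetical and not part of the package, and a one-sided normal test of the null hypothesis that the true effect is at most effect_size is assumed.

```r
# Hypothetical data: effect estimates and their standard errors.
est <- c(0.5, 1.2, 2.0, 3.1)
se  <- c(0.4, 0.5, 0.5, 0.6)

# One-sided p-value for H0: true effect <= effect_size,
# computed only for the requested subset of items.
pfunc_z <- function(indices, effect_size) {
    pnorm((est[indices] - effect_size) / se[indices], lower.tail = FALSE)
}

# One p-value per requested index, in the same order as indices.
p <- pfunc_z(c(2, 4), effect_size = 1)
```

A function of this shape could then be passed as the pfunc argument, with n set to length(est).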

Value

A "Topconfects" object, containing a table of results and various associated information.

The most important part of this object is the $table element, a data frame with the following columns:

  • rank - Ranking by confect and, for equal confect, by p-value at that effect size.

  • index - Number of the test, between 1 and n.

  • confect - CONfident efFECT size.

The usage is as follows: To find a set of tests which have effect size greater than x with the specified FDR, take the rows with abs(confect) >= x. Once the set is selected, the confect values provide confidence bounds on the effect size with False Coverage-statement Rate (FCR) at the same level as the FDR.
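The selection rule above can be sketched in base R. The data frame below is a hand-built stand-in for the $table element, using the values shown in the Examples section; the threshold x and the choice to drop rows with NA confect are assumptions made for illustration.

```r
# Stand-in for result$table, copied from the Examples output.
tab <- data.frame(
    rank    = 1:5,
    index   = 5:1,
    confect = c(2.673, 1.946, 1.119, 0.249, NA)
)

# Effect size threshold of interest (illustrative value).
x <- 1

# Keep rows whose confect is at least x in absolute value.
# Rows with NA confect are excluded explicitly.
selected <- tab[!is.na(tab$confect) & abs(tab$confect) >= x, ]
selected$index
```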

One may essentially take the top k rows of the data frame, for any k, and these will be the best set of k results to dependably have an effect size that is as large as possible. However, if some tests have the same abs(confect), either all or none of them should be selected.
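The tie rule can be enforced with a small helper. This is only a sketch: top_without_split is a hypothetical name, not a package function, the table is made up, and it assumes the rows are already sorted by rank. It takes the "none" option, shrinking the selection until the cut no longer falls inside a group of equal abs(confect).

```r
# Made-up table with three rows tied at abs(confect) == 2.
tab <- data.frame(
    rank    = 1:5,
    index   = c(4, 2, 5, 1, 3),
    confect = c(3, 2, 2, 2, 1)
)

# Return at most k top rows without splitting a group of tied abs(confect).
top_without_split <- function(tab, k) {
    a <- abs(tab$confect)
    # Never select rows with NA confect.
    k <- min(k, sum(!is.na(a)))
    # Step back while the next row is tied with the last selected row.
    while (k > 0 && k < length(a) && !is.na(a[k + 1]) && a[k + 1] == a[k]) {
        k <- k - 1
    }
    tab[seq_len(k), , drop = FALSE]
}

top3 <- top_without_split(tab, 3)  # cut would split the tie, so shrinks
top4 <- top_without_split(tab, 4)  # cut falls after the tie, so kept
```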

Some rows in the output may be given the same confect, even if step is made small. This is an expected behaviour of the algorithm. (This is similar to FDR adjustment of p-values sometimes resulting in a run of the same adjusted p-value, even if all the input p-values are distinct.)
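The analogy with p-value adjustment can be seen with base R's p.adjust. The input values below are made up for illustration; they are all distinct, yet the Benjamini-Hochberg adjustment assigns the three largest the same adjusted value.

```r
# Four distinct p-values (illustrative).
p <- c(0.001, 0.030, 0.031, 0.032)

# Benjamini-Hochberg adjustment takes a running minimum over the
# scaled p-values, which can produce runs of identical results.
adj <- p.adjust(p, method = "BH")
adj
```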

Some wrappers around this function may add a sign to the confect column, if it makes sense to do so. They will also generally add an effect column, containing an estimate of the effect size that aims to be unbiased rather than a conservative lower bound.

Details

This is a general-purpose function, which can be applied to any method of calculating p-values (supplied as a function argument) for the null hypothesis that the effect size is smaller than a given amount.
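As one possible instance of such a p-value method, the sketch below tests the null hypothesis that the absolute effect is at most effect_size, for z-scores assumed to be normally distributed with unit variance. The name pfunc_abs and the data are hypothetical, and this is not necessarily how the package's own wrappers compute their p-values. The p-value is the probability of a z-score at least as extreme as the observed one when the true effect sits on the boundary of the null.

```r
# Hypothetical z-scores, assumed N(effect, 1).
z <- c(-3.5, 0.4, 2.2, 5.0)

# p-value for H0: |effect| <= effect_size.
# Evaluated at the boundary effect = effect_size, where the
# probability of an extreme |z| is largest under the null.
pfunc_abs <- function(indices, effect_size) {
    a <- abs(z[indices])
    pnorm(a - effect_size, lower.tail = FALSE) + pnorm(-a - effect_size)
}

# At effect_size = 0 this reduces to the usual two-sided p-value.
p0 <- pfunc_abs(seq_along(z), 0)
p1 <- pfunc_abs(seq_along(z), 1)
```

Larger values of effect_size make the null hypothesis easier to satisfy, so the p-values for a given item do not decrease as effect_size grows.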

Examples

# Find largest positive z-scores in a collection,
# and place confidence bounds on them that maintain FDR 0.05.
z <- c(1,2,3,4,5)
pfunc <- function(i, effect_size) {
    pnorm(z[i], mean=effect_size, lower.tail=FALSE)
}
nest_confects(length(z), pfunc, fdr=0.05)
#> $table
#>  rank index confect
#>  1    5     2.673
#>  2    4     1.946
#>  3    3     1.119
#>  4    2     0.249
#>  5    1     NA
#> 4 of 5 non-zero effect size at FDR 0.05