`weitrix_sd_confects.Rd`

Find rows with confident excess standard deviation beyond what is expected based on the weights of a calibrated weitrix. This may be used, for example, to find potential marker genes.

weitrix_sd_confects( weitrix, design = ~1, fdr = 0.05, step = 0.001, assume_normal = TRUE )

weitrix | A weitrix object, or an object that can be converted to a weitrix with |
---|---|

design | A formula in terms of |

fdr | False Discovery Rate to control for. |

step | Granularity of effect sizes to test. |

assume_normal | Assume weighted residuals are normally distributed? Normality is quite a strong assumption here. If TRUE, tests are based on the weighted squared residuals following a chi-squared distribution. If FALSE, tests are based on the dispersion following an asymptotically normal distribution, with variance estimated from the weighted squared residuals. If FALSE, a reasonably large number of columns is required. Defaults to TRUE. |

A topconfects result. The `$table` data frame contains columns:

effect Estimated excess standard deviation, in the same units as the observations themselves. 0 if the dispersion is less than 1.

confect A lower confidence bound on effect.

row_mean Weighted mean of observations in this row.

typical_obs_err Typical accuracy of each observation.

dispersion Dispersion. Weighted sum of squared residuals divided by residual degrees of freedom.

n_present Number of observations with non-zero weight.

df Degrees of freedom. n minus the number of coefficients in the model.

fdr_zero FDR-adjusted p-value for the null hypothesis that effect is zero.

Note that `dispersion = effect^2/typical_obs_err^2 + 1` for non-zero effect values.
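The relationship can be checked against the example output shown at the end of this page. The following is a small base R sketch, using the first three rows of that table (the ones with non-zero effect); the variable names are chosen here for illustration, and small differences are expected because the printed values are rounded.

```r
# Values copied from the first three rows of the example output on this page.
effect          <- c(1.8901, 0.5925, 0.4948)
typical_obs_err <- c(1.416, 1.416, 1.416)
reported        <- c(2.7825, 1.1751, 1.1222)   # the dispersion column

# Dispersion implied by the stated relationship.
implied <- effect^2 / typical_obs_err^2 + 1

# Inverting the relationship recovers effect from dispersion.
# pmax() reflects that effect is reported as 0 when dispersion is below 1.
recovered <- typical_obs_err * sqrt(pmax(reported - 1, 0))

print(round(cbind(reported, implied, effect, recovered), 4))
```

The implied and reported dispersions should agree to about three decimal places.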

Important note: With the default setting of `assume_normal=TRUE`, the "confect" values produced by this method are only valid if the weighted residuals are close to normally distributed. If you have a reasonably large number of columns (e.g. single cell data), you can and should relax this assumption by specifying `assume_normal=FALSE`.
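To make the difference between the two settings concrete, here is a rough base R sketch of the two styles of test for a single row, testing the null hypothesis that the dispersion is 1. It is an illustration of the idea described in the arguments, not the package's implementation; the simulated residuals and all variable names are assumptions made for this example.

```r
set.seed(2)

# Hypothetical weighted residuals for one row: heavier-tailed than normal,
# scaled to unit variance (a t distribution with 5 df has variance 5/3).
n  <- 500
df <- n - 1                        # residual df for an intercept-only model
r  <- rt(n, df = 5) / sqrt(5 / 3)
r  <- r - mean(r)
sq <- r^2

# assume_normal = TRUE style: compare the sum of squares to a chi-squared
# distribution. This is only valid if the residuals are close to normal.
p_chisq <- pchisq(sum(sq), df = df, lower.tail = FALSE)

# assume_normal = FALSE style: treat the dispersion as approximately normal,
# with its standard error estimated from the squared residuals themselves.
dispersion <- sum(sq) / df
se         <- sd(sq) * sqrt(n) / df
p_normal   <- pnorm((dispersion - 1) / se, lower.tail = FALSE)

print(c(dispersion = dispersion, p_chisq = p_chisq, p_normal = p_normal))
```

The second approach estimates the variability of the squared residuals from the data, which is why it needs a reasonably large number of columns to be reliable.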

This is a conversion of the "dispersion" statistic for each row into units that are more readily interpretable, accompanied by confidence bounds with a multiple testing correction.

We are looking for further perturbation of observed values beyond what is accounted for by a linear model and, further, beyond what is expected based on the observation weights (assumed to be calibrated and so interpreted as 1/variance). We are seeking to estimate the standard deviation of this further perturbation.
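As a rough illustration of this idea, the base R sketch below simulates a single row with known observation variances plus an extra perturbation, then recovers the standard deviation of that perturbation from the dispersion. It is a simplified sketch and not the package's code. In particular, taking the typical observation error to be `1/sqrt(mean(w))` is an assumption made here, and all variable names are hypothetical.

```r
set.seed(1)

# Hypothetical single row: 200 observations with known, unequal variances.
n       <- 200
obs_var <- runif(n, 0.5, 2)        # assumed known measurement variances
w       <- 1 / obs_var             # calibrated weights, interpreted as 1/variance
extra   <- 1.5                     # true sd of the further perturbation
x       <- 10 + rnorm(n, sd = extra) + rnorm(n, sd = sqrt(obs_var))

# Weighted fit of an intercept-only linear model (design = ~1).
fit <- lm(x ~ 1, weights = w)
df  <- df.residual(fit)

# Dispersion: weighted sum of squared residuals over residual df.
dispersion <- sum(w * residuals(fit)^2) / df

# Assumption for this sketch: typical observation error is 1/sqrt(mean weight).
typical_obs_err <- 1 / sqrt(mean(w))

# Convert back to the units of the observations; 0 if dispersion is below 1.
effect <- typical_obs_err * sqrt(max(dispersion - 1, 0))

print(c(dispersion = dispersion, typical_obs_err = typical_obs_err, effect = effect))
```

With weights that are unrelated to the values, as here, the estimated effect should land near the simulated value of 1.5.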

The weitrix must have been calibrated for results to make sense.

Top confident effect sizes are found using the `topconfects` method, based on the model that the observed weighted sum of squared residuals is non-central chi-square distributed.
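One way to picture this model is the base R sketch below, which tests a grid of candidate effect sizes for a single row. The non-centrality parameter used here, `df * e^2 / typical_obs_err^2`, is an assumption chosen so that the expected dispersion equals `e^2/typical_obs_err^2 + 1`; the actual parameterization inside the package may differ. The input values and the helper `p_exceeds` are hypothetical.

```r
# Hypothetical values for one row.
df              <- 20
typical_obs_err <- 1.4
dispersion      <- 3.0
ss              <- dispersion * df   # weighted sum of squared residuals

# Assumption: an excess sd of e gives ss a non-central chi-squared
# distribution with non-centrality df * e^2 / typical_obs_err^2.
# The p-value is for the null hypothesis that the excess sd is at most e.
p_exceeds <- function(e) {
  pchisq(ss, df = df, ncp = df * e^2 / typical_obs_err^2, lower.tail = FALSE)
}

# Test a grid of candidate effect sizes, in the spirit of the step argument.
grid <- seq(0, 3, by = 0.001)
p    <- p_exceeds(grid)

# Largest candidate effect size rejected at an unadjusted 0.05 level.
# The real method additionally applies a multiple testing correction.
bound <- max(grid[p <= 0.05])

point_estimate <- typical_obs_err * sqrt(dispersion - 1)
print(c(point_estimate = point_estimate, lower_bound = bound))
```

The resulting bound plays the role of a "confect" for this one row: a smaller effect size that the data confidently exceed, sitting below the point estimate.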

Note that all calculations are based on weighted residuals, with a rescaling to place results on the original scale. When a row has highly variable weights, this is an approximation that is only sensible if the weights are unrelated to the values themselves.

    # weitrix_sd_confects should only be used with a calibrated weitrix
    calwei <- weitrix_calibrate_all(simwei, ~1, ~1)
    weitrix_sd_confects(calwei, ~1)
    #> $table
    #>   confect effect row_mean   typical_obs_err dispersion n_present df fdr_zero
    #> 1 NA      1.8901  2.715e+00 1.416           2.7825     5         4  0.1760
    #> 2 NA      0.5925 -3.310e+00 1.416           1.1751     5         4  0.7083
    #> 3 NA      0.4948 -2.700e+00 1.416           1.1222     4         3  0.7083
    #> 4 NA      0.0000  2.748e+00 1.416           0.9720     4         3  0.7083
    #> 5 NA      0.0000 -9.565e-02 1.416           0.3228     5         4  0.9544
    #> 6 NA      0.0000  9.153e-05 1.416           0.2445     5         4  0.9544
    #> 7 NA      0.0000 -8.430e-02 1.416           0.1686     5         4  0.9544
    #>   name
    #> 1 3
    #> 2 5
    #> 3 1
    #> 4 7
    #> 5 4
    #> 6 2
    #> 7 6
    #> 0 of 7 non-zero excess standard deviation at FDR 0.05