rnalysis.filtering.CountFilter.differential_expression_deseq2
- CountFilter.differential_expression_deseq2(design_matrix: str | Path, comparisons: Iterable[Tuple[str, str, str]], covariates: Iterable[str] = (), lrt_factors: Iterable[str] = (), model_factors: Literal['auto'] | Iterable[str] = 'auto', r_installation_folder: str | Path | Literal['auto'] = 'auto', output_folder: str | Path | None = None, return_design_matrix: bool = False, scaling_factors: str | Path | None = None, cooks_cutoff: bool = True, return_code: bool = False, return_log: bool = False) → Tuple[DESeqFilter, ...]
Run differential expression analysis on the count matrix using the DESeq2 algorithm. The count matrix you are analyzing should be unnormalized, that is, it should contain raw read counts. The analysis is based on a design matrix supplied by the user. The design matrix should contain at least two columns: the first column contains all the sample names, and each of the following columns contains an experimental design factor (e.g. ‘condition’, ‘replicate’, etc.). See the User Guide and Tutorial for a complete example. By default, the analysis formula will contain all the factors in the design matrix (see model_factors). To run this function, a version of R must be installed.
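As a rough illustration of the layout described above, the following sketch writes a small design matrix to a CSV file using only the Python standard library. The sample names, factor levels, file name, and the header of the first column are all made up for this example and are not prescribed by RNAlysis.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical samples and factors, chosen only for illustration.
# First column: sample names. Remaining columns: one design factor each.
rows = [
    ("sample", "condition", "replicate"),
    ("ctrl_1", "control", "rep1"),
    ("ctrl_2", "control", "rep2"),
    ("treat_1", "treated", "rep1"),
    ("treat_2", "treated", "rep2"),
]

# Write into a temporary folder so the sketch does not touch existing files.
design_path = Path(tempfile.mkdtemp()) / "design_matrix.csv"
with design_path.open("w", newline="") as handle:
    csv.writer(handle).writerows(rows)

print(design_path.read_text())
```

The resulting path could then be passed as the design_matrix argument.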
- Parameters:
design_matrix (str or Path) – path to a CSV file containing the experiment’s design matrix. The design matrix should contain at least two columns: the first column contains all the sample names, and each of the following columns contains an experimental design factor (e.g. ‘condition’, ‘replicate’, etc.). See the User Guide and Tutorial for a complete example. By default, the analysis formula will contain all the factors in the design matrix (see model_factors).
comparisons (Iterable of tuple(factor, numerator_value, denominator_value)) – specifies which comparisons to build results tables for. Each individual comparison should be a tuple with exactly three elements: the name of a factor in the design formula, the name of the numerator level for the fold change, and the name of the denominator level for the fold change.
lrt_factors (Iterable of factor names (default=tuple())) – optionally, specify factors to be tested using the likelihood ratio test (LRT). If a factor is a continuous variable, you can also specify the polynomial degree to fit.
covariates (Iterable of covariate names (default=tuple())) – optionally, specify a list of continuous covariates to include in the analysis. The covariates should be column names in the design matrix. The reported fold change values correspond to the expected fold change for every increase of 1 unit in the covariate.
model_factors (Iterable of factor names or 'auto' (default='auto')) – optionally, specify a list of factors to include in the differential expression model. If ‘auto’, all factors in the design matrix will be included.
r_installation_folder (str, Path, or 'auto' (default='auto')) – Path to the installation folder of R. For example: ‘C:/Program Files/R/R-4.2.1’
output_folder (str, Path, or None) – Path to a folder in which the analysis results, as well as the log files and R script used to generate them, will be saved. If output_folder is None, the results will not be saved to a specified directory.
return_design_matrix (bool (default=False)) – if True, the function will return the sanitized design matrix used in the analysis.
return_code (bool (default=False)) – if True, the function will return the path to the R script used to generate the analysis results.
return_log (bool (default=False)) – if True, the function will return the path to the analysis logfile, which includes session info.
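Because each comparison must name a factor from the design matrix and two of its levels, it can help to check the tuples before starting a long analysis. The following is a minimal sketch of such a check; check_comparisons and the inlined design matrix are hypothetical and are not part of RNAlysis.

```python
import csv
import io

# Hypothetical design matrix, inlined so the sketch is self-contained.
DESIGN_CSV = """sample,condition,replicate
ctrl_1,control,rep1
ctrl_2,control,rep2
treat_1,treated,rep1
treat_2,treated,rep2
"""


def check_comparisons(design_csv, comparisons):
    """Return a list of problems found; an empty list means none were found.

    Hypothetical helper for illustration only, not an RNAlysis function.
    """
    reader = csv.DictReader(io.StringIO(design_csv))
    rows = list(reader)
    factors = reader.fieldnames[1:]  # the first column holds the sample names
    problems = []
    for comparison in comparisons:
        if len(comparison) != 3:
            problems.append(f"{comparison!r}: expected exactly three elements")
            continue
        factor, numerator, denominator = comparison
        if factor not in factors:
            problems.append(f"{comparison!r}: unknown factor {factor!r}")
            continue
        # Collect the levels that actually occur for this factor.
        levels = {row[factor] for row in rows}
        for level in (numerator, denominator):
            if level not in levels:
                problems.append(
                    f"{comparison!r}: {level!r} is not a level of {factor!r}"
                )
    return problems


# (factor, numerator level, denominator level)
comparisons = [("condition", "treated", "control")]
print(check_comparisons(DESIGN_CSV, comparisons))
```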
- Returns:
a tuple of DESeqFilter objects, one for each comparison
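A call might look roughly like the sketch below. The file paths, the factor and level names, and the wrapper function run_deseq2_example are hypothetical. The sketch also assumes that CountFilter can be constructed from the path to a count matrix file, and that the results come back in the same order as the comparisons; neither is stated in this section. Running it requires RNAlysis and a working R installation.

```python
# Hypothetical comparison: (factor, numerator level, denominator level).
COMPARISONS = [("condition", "treated", "control")]


def run_deseq2_example(counts_csv, design_csv):
    """Sketch of a call; needs RNAlysis and R to actually run."""
    # Imported lazily so that defining this function does not require RNAlysis.
    from rnalysis import filtering

    # Assumption: CountFilter accepts the path to a table of raw read counts.
    counts = filtering.CountFilter(counts_csv)

    results = counts.differential_expression_deseq2(
        design_matrix=design_csv,
        comparisons=COMPARISONS,
        r_installation_folder="auto",
        output_folder=None,  # do not save results to a specified directory
    )
    # With the default return_* flags, the documented return value is a tuple
    # holding one DESeqFilter object per comparison.
    return results


# Example invocation (paths are placeholders):
# results = run_deseq2_example("counts.csv", "design_matrix.csv")
```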