rnalysis.filtering.CountFilter.normalize_median_of_ratios

CountFilter.normalize_median_of_ratios(sample_grouping: GroupedColumns, reference_group: NonNegativeInt = 0, inplace: bool = True, return_scaling_factors: bool = False)

Normalizes the count matrix using the ‘Median of Ratios Normalization’ (MRN) method (Maza et al. 2013). This method uses information about the experimental condition of each sample. To calculate the MRN scaling factors, first calculate the weighted mean expression of each gene across the replicates of each experimental condition. Then, for each gene, calculate the ratio between its weighted mean in each experimental condition and its weighted mean in the reference condition. Next, take the median of these ratios for each experimental condition, and calculate the scaling factor of each sample by multiplying the median ratio of the sample’s condition by the sample’s total number of reads. Finally, the scaling factors are adjusted, for symmetry, so that they multiply to 1.
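The steps above can be sketched in a few lines of pandas. This is only an illustrative sketch, not the RNAlysis implementation: the function name mrn_scaling_factors and the toy count table are hypothetical, the "weighted mean" is assumed to be the mean of each replicate's counts divided by that replicate's total reads, genes with a zero mean in either condition are assumed to be skipped, and normalization is assumed to mean dividing each sample's counts by its scaling factor.

```python
import numpy as np
import pandas as pd


def mrn_scaling_factors(counts, sample_grouping, reference_group=0):
    """Illustrative MRN scaling factors; not the RNAlysis implementation."""
    lib_sizes = counts.sum(axis=0)  # total number of reads per sample
    weighted = counts / lib_sizes   # each sample's counts divided by its total reads

    # Weighted mean expression of each gene within each condition.
    cond_means = [weighted[list(group)].mean(axis=1) for group in sample_grouping]
    ref_mean = cond_means[reference_group]

    factors = {}
    for group, cond_mean in zip(sample_grouping, cond_means):
        # Assumption: genes with a zero mean in either condition are skipped,
        # since their ratio is undefined or uninformative.
        usable = (ref_mean > 0) & (cond_mean > 0)
        median_ratio = (cond_mean[usable] / ref_mean[usable]).median()
        for sample in group:
            # Scaling factor = condition's median ratio * sample's total reads.
            factors[sample] = median_ratio * lib_sizes[sample]

    factors = pd.Series(factors, dtype=float)
    # Divide by the geometric mean so that the factors multiply to 1.
    return factors / np.exp(np.log(factors).mean())


# Hypothetical toy data: two conditions with two replicates each.
counts = pd.DataFrame(
    {"a1": [10, 20, 30, 40], "a2": [20, 40, 60, 80],
     "b1": [10, 20, 30, 40], "b2": [30, 60, 90, 120]},
    index=["gene1", "gene2", "gene3", "gene4"],
)
factors = mrn_scaling_factors(counts, [["a1", "a2"], ["b1", "b2"]])
normalized = counts / factors  # assumption: counts are divided by the factors
```

In this toy table every sample has the same relative expression profile, so both median ratios equal 1 and the scaling factors simply track the library sizes, rescaled so that their product is 1.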

Parameters
  • sample_grouping (nested list of column names) – grouping of the samples into conditions. Each group should contain all replicates of the same condition.

  • reference_group (int (default=0)) – the index of the sample group to be used as the reference condition. Must be an integer between 0 and the number of sample groups minus 1.

  • inplace (bool (default=True)) – If True (default), normalization will be applied to the current CountFilter object. If False, the function will return a new CountFilter instance and the current instance will not be affected.

  • return_scaling_factors (bool (default=False)) – if True, return a DataFrame containing the calculated scaling factors.

Returns

If inplace is False, returns a new instance of the Filter object. If return_scaling_factors is True, returns a DataFrame containing the calculated scaling factors.

Examples
>>> from rnalysis import filtering
>>> c = filtering.CountFilter("tests/test_files/counted.csv")
>>> c.normalize_median_of_ratios([['cond1','cond2'],['cond3','cond4']])

Normalized 22 features. Normalized inplace.