rnalysis.filtering.CountFilter.filter_low_reads

CountFilter.filter_low_reads(threshold: float = 5, n_samples: PositiveInt = 1, opposite: bool = False, inplace: bool = True)

Filter out lowly expressed features, keeping only features with at least ‘threshold’ reads in at least ‘n_samples’ columns. With the default n_samples=1, this removes the features whose reads are below ‘threshold’ in every column.

Parameters:

threshold (float (default=5)) – The minimal number of reads (counts, RPM, RPKM, TPM, etc.) a feature should have in at least ‘n_samples’ samples in order not to be filtered out.
n_samples (positive integer (default=1)) – The minimal number of samples in which a feature should have at least ‘threshold’ reads in order not to be filtered out.
opposite (bool (default=False)) – If True, the output of the filtering will be the opposite of the specified: instead of filtering out X, the function will filter out anything but X. If False (default), the function will filter as described above.
inplace (bool (default=True)) – If True (default), filtering will be applied to the current CountFilter object. If False, the function will return a new CountFilter instance and the current instance will not be affected.

Returns:

If ‘inplace’ is False, returns a new instance of CountFilter.

Examples:

>>> from rnalysis import filtering
>>> c = filtering.CountFilter('tests/test_files/counted.csv')
>>> c.filter_low_reads(5) # remove all rows whose values are <5 in every column
Filtered 6 features, leaving 16 of the original 22 features. Filtered inplace.
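
The way ‘threshold’, ‘n_samples’ and ‘opposite’ interact can be illustrated with a small standalone sketch. This is not RNAlysis's implementation: the function filter_low_reads_sketch and the toy_counts data are hypothetical, written in plain Python only to show the rule described above (a feature is kept when at least ‘n_samples’ of its values are at least ‘threshold’).

```python
def filter_low_reads_sketch(counts, threshold=5, n_samples=1, opposite=False):
    """Hypothetical illustration of the filtering rule, not the library code.

    counts: dict mapping feature name -> list of per-sample read values.
    Returns a new dict containing only the features that are kept.
    """
    kept = {}
    for feature, values in counts.items():
        # Count the samples in which this feature reaches the threshold.
        n_passing = sum(1 for v in values if v >= threshold)
        passes = n_passing >= n_samples
        # With opposite=True, keep exactly the features that would
        # otherwise have been filtered out.
        if passes != opposite:
            kept[feature] = values
    return kept


# Made-up toy data: three features measured in three samples.
toy_counts = {
    "gene_a": [0, 2, 4],    # below 5 in every sample
    "gene_b": [7, 1, 0],    # reaches 5 in exactly one sample
    "gene_c": [12, 9, 30],  # reaches 5 in all three samples
}

default_kept = filter_low_reads_sketch(toy_counts)               # gene_b, gene_c
strict_kept = filter_low_reads_sketch(toy_counts, n_samples=2)   # gene_c only
removed = filter_low_reads_sketch(toy_counts, opposite=True)     # gene_a only
```

Raising ‘n_samples’ makes the filter stricter, since a single high value is no longer enough to keep a feature, while ‘opposite’ returns the complementary set of features for the same settings.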