rnalysis.filtering.CountFilter.split_by_reads

CountFilter.split_by_reads(threshold: float = 5) → tuple

Splits the features in the CountFilter object into two non-overlapping CountFilter objects, based on their maximum expression level. The first object will contain only highly-expressed features (which have reads over the specified threshold in at least one sample). The second object will contain only lowly-expressed features (which have reads below the specified threshold in all samples).

Parameters:

threshold (float (default=5)) – The minimal number of reads (counts, RPM, RPKM, TPM, etc.) a feature needs to have in at least one sample in order to be included in the “highly expressed” object and not the “lowly expressed” object.

Return type:

Tuple[filtering.CountFilter, filtering.CountFilter]

Returns:

A tuple containing two CountFilter objects: the first has only highly-expressed features, and the second has only lowly-expressed features.

Examples:

>>> from rnalysis import filtering
>>> c = filtering.CountFilter('tests/test_files/counted.csv')
>>> high_expression, low_expression = c.split_by_reads(5)
Filtered 6 features, leaving 16 of the original 22 features. Filtering result saved to new object.
Filtered 16 features, leaving 6 of the original 22 features. Filtering result saved to new object.
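
To make the splitting rule concrete, here is a minimal sketch of the same idea in plain pandas. It is not the RNAlysis implementation: `split_by_max_reads` is a hypothetical helper, and the toy table is made-up illustrative data. The sketch also assumes that a feature whose maximum is exactly equal to the threshold falls into the lowly-expressed group; the description above does not state how ties are handled, so check this against the library before relying on it.

```python
import pandas as pd


def split_by_max_reads(counts: pd.DataFrame, threshold: float = 5):
    """Hypothetical helper (not part of RNAlysis): split rows by their maximum value."""
    # Highest value of each feature (row) across all samples (columns).
    row_max = counts.max(axis=1)
    # Assumption: "over the threshold" means strictly greater than it.
    is_high = row_max > threshold
    # Return (highly expressed, lowly expressed), mirroring the documented order.
    return counts[is_high], counts[~is_high]


# Toy count table for illustration only: rows are features, columns are samples.
counts = pd.DataFrame(
    {"sample1": [0, 12, 3], "sample2": [1, 4, 5], "sample3": [2, 30, 0]},
    index=["geneA", "geneB", "geneC"],
)

high, low = split_by_max_reads(counts, threshold=5)
print(list(high.index))
print(list(low.index))
```

Because the two masks are complements of each other, every feature lands in exactly one of the two tables, which is the non-overlapping property described for the two returned CountFilter objects.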