rnalysis.filtering.CountFilter.normalize_to_rpm_htseqcount

CountFilter.normalize_to_rpm_htseqcount(special_counter_fname: Union[str, Path], inplace: bool = True, return_scaling_factors: bool = False)

Normalizes the count matrix to Reads Per Million (RPM), using the table of special counters (ambiguous, no feature, not aligned, etc.) from HTSeq-count’s output. Each column in the CountFilter object is divided by its scaling factor: (total reads + ambiguous + no feature) * 10^-6.
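The arithmetic described above can be sketched with plain pandas. This is an illustrative sketch only, not RNAlysis's actual implementation: the helper name rpm_scaling_factors, the sample data, and the table layout (special counters as rows, samples as columns) are assumptions made for the example, and "total reads" is taken here to mean the per-sample sum of the count matrix. The row labels `__ambiguous` and `__no_feature` follow HTSeq-count's naming of its special counters.

```python
import pandas as pd


def rpm_scaling_factors(counts: pd.DataFrame, special: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: one scaling factor per sample (column).

    Assumes `special` has HTSeq-count special counters as its row index
    and the same sample names as `counts` in its columns.
    """
    # Per-sample sum of feature counts, plus ambiguous and no-feature reads
    total = counts.sum(axis=0) + special.loc["__ambiguous"] + special.loc["__no_feature"]
    # Express the total in millions of reads
    return total / 1e6


# Made-up data for two samples and two features
counts = pd.DataFrame(
    {"cond1": [500_000, 300_000], "cond2": [100_000, 50_000]},
    index=["gene1", "gene2"],
)
special = pd.DataFrame(
    {"cond1": [150_000, 50_000, 20_000], "cond2": [30_000, 20_000, 5_000]},
    index=["__no_feature", "__ambiguous", "__not_aligned"],
)

factors = rpm_scaling_factors(counts, special)
# Dividing a DataFrame by a Series aligns the Series on the column labels
rpm = counts / factors
print(factors)
print(rpm)
```

Note that `__not_aligned` does not enter the scaling factor in this sketch, matching the formula above, which sums only the total, ambiguous, and no-feature reads.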

Parameters
  • special_counter_fname (str or pathlib.Path) – path to the .csv file containing HTSeq-count's special counters for the RNA library (ambiguous, no feature, not aligned, etc.).

  • inplace (bool (default=True)) – If True (default), normalization will be applied to the current CountFilter object. If False, the function will return a new CountFilter instance and the current instance will not be affected.

  • return_scaling_factors (bool (default=False)) – If True, return a DataFrame containing the calculated scaling factors.

Returns

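If inplace is False, returns a new instance of the CountFilter object. If return_scaling_factors is True, a DataFrame containing the calculated scaling factors is also returned.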

Examples
>>> from rnalysis import filtering
>>> c = filtering.CountFilter("tests/test_files/counted.csv")
>>> c.normalize_to_rpm_htseqcount("tests/test_files/uncounted.csv")
Normalized 22 features. Normalized inplace.