rnalysis.filtering.CountFilter.filter_top_n

CountFilter.filter_top_n(by: ColumnNames, n: PositiveInt = 100, ascending: bool | List[bool] = True, na_position: str = 'last', opposite: bool = False, inplace: bool = True)

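Sort the rows by the values of the specified column or columns, then keep only the first ‘n’ rows of the sorted table. With the default ascending=True, these are the rows with the lowest values; pass ascending=False to keep the rows with the highest values.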

Parameters:

by (name of column/columns (str/List[str])) – Names of the column or columns to sort and then filter by.
n (positive int (default=100)) – How many features to keep in the Filter object.
ascending (bool or list of bools (default=True)) – Sort ascending vs. descending. Specify list for multiple sort orders. If this is a list of bools, it must have the same length as ‘by’.
na_position ('first' or 'last', default 'last') – If ‘first’, puts NaNs at the beginning; if ‘last’, puts NaNs at the end.
opposite (bool (default=False)) – If True, the filter is inverted: instead of filtering out X, the function will filter out everything BUT X. If False (default), the function will filter as expected.
inplace (bool (default=True)) – If True (default), filtering will be applied to the current Filter object. If False, the function will return a new Filter instance and the current instance will not be affected.

Returns:

If ‘inplace’ is False, returns a new instance of Filter.

Examples:

>>> from rnalysis import filtering
>>> counts = filtering.Filter('tests/test_files/counted.csv')
>>> # keep only the 10 rows with the highest values in the column 'cond1'
>>> counts.filter_top_n(by='cond1', n=10, ascending=False)
Filtered 12 features, leaving 10 of the original 22 features. Filtered inplace.

>>> counts = filtering.Filter('tests/test_files/counted.csv')
>>> # keep only the 10 rows which have the lowest values in the column 'cond1',
>>> # breaking ties by the highest values in the column 'cond2'
>>> counts.filter_top_n(by=['cond1', 'cond2'], n=10, ascending=[True, False])
Filtered 12 features, leaving 10 of the original 22 features. Filtered inplace.
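
The way ‘by’, ‘ascending’, ‘na_position’ and ‘opposite’ interact can be easier to follow on a toy table. The block below is a minimal pure-Python sketch of the sort-then-keep-n behaviour described above, not the RNAlysis implementation. The helper top_n_rows and the sample rows are hypothetical, and the order in which the leftover rows are returned when opposite=True is an assumption made for this sketch.

```python
import math


def top_n_rows(rows, by, n=100, ascending=True, na_position="last", opposite=False):
    """Hypothetical helper: sort dict-rows by the columns in `by`, keep the first n."""
    by = [by] if isinstance(by, str) else list(by)
    orders = [ascending] * len(by) if isinstance(ascending, bool) else list(ascending)
    if len(orders) != len(by):
        raise ValueError("'ascending' must have the same length as 'by'")

    def is_nan(value):
        return isinstance(value, float) and math.isnan(value)

    ordered = list(rows)
    # Stable sorts applied from the last key to the first yield a multi-key sort.
    for col, asc in reversed(list(zip(by, orders))):
        present = [r for r in ordered if not is_nan(r[col])]
        missing = [r for r in ordered if is_nan(r[col])]
        present.sort(key=lambda r: r[col], reverse=not asc)
        # NaN rows go to the start or the end, regardless of sort direction.
        ordered = missing + present if na_position == "first" else present + missing

    # opposite=True keeps everything EXCEPT the first n rows (assumed order: sorted).
    return ordered[n:] if opposite else ordered[:n]


# Made-up sample data, loosely mirroring a count table with two conditions.
rows = [
    {"name": "a", "cond1": 5.0, "cond2": 1.0},
    {"name": "b", "cond1": 2.0, "cond2": 7.0},
    {"name": "c", "cond1": 2.0, "cond2": 9.0},
    {"name": "d", "cond1": float("nan"), "cond2": 3.0},
    {"name": "e", "cond1": 8.0, "cond2": 4.0},
]

highest = top_n_rows(rows, by="cond1", n=2, ascending=False)
print([r["name"] for r in highest])  # ['e', 'a']

tie_broken = top_n_rows(rows, by=["cond1", "cond2"], n=2, ascending=[True, False])
print([r["name"] for r in tie_broken])  # ['c', 'b']
```

In the second call, rows ‘b’ and ‘c’ tie on ‘cond1’, so the descending sort on ‘cond2’ decides their order. The row with a NaN in ‘cond1’ is only kept if ‘n’ is large enough to reach it, or if na_position='first' is used.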