rnalysis.filtering.FoldChangeFilter
- class rnalysis.filtering.FoldChangeFilter(fname: str | Path | tuple, numerator_name: str, denominator_name: str, suppress_warnings: bool = False)
- A class that contains a single column, representing the gene-specific fold change between two conditions.
This class does not support 'inf' and '0' values, and importing a file with such values could lead to incorrect filtering and statistical analyses.
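One plausible reading of the note above is that degenerate fold changes arise whenever one of the two conditions has zero expression. The sketch below uses only the Python standard library and is not RNAlysis code; the `log2_fold_change` helper and the expression values are hypothetical. It shows how such values appear and how they could be screened out before a table is saved for loading.

```python
import math

def log2_fold_change(numerator: float, denominator: float) -> float:
    """Return log2(numerator / denominator) for non-negative inputs.

    Degenerate inputs give nan (0/0), +inf (x/0) or -inf (0/x).
    """
    if numerator == 0 and denominator == 0:
        return math.nan
    if denominator == 0:
        return math.inf
    if numerator == 0:
        return -math.inf
    return math.log2(numerator / denominator)

# Hypothetical mean expression per gene: (numerator condition, denominator condition)
expression = {
    "geneA": (8.0, 2.0),  # ordinary case
    "geneB": (0.0, 5.0),  # fold change of 0, so log2 is -inf
    "geneC": (3.0, 0.0),  # fold change of inf, so log2 is +inf
}

log2fc = {gene: log2_fold_change(n, d) for gene, (n, d) in expression.items()}

# Keep only genes whose log2 fold change is a finite number.
finite_only = {gene: value for gene, value in log2fc.items() if math.isfinite(value)}
```

Whether to drop such genes or handle them another way (for example with a pseudocount) is an analysis decision that this page does not prescribe.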
Attributes
- df: pandas Series
A Series that contains the fold change values. The Series is modified upon usage of filter operations.
- shape: tuple (rows, columns)
The dimensions of df.
- columns: list
The columns of df.
- fname: pathlib.Path
The path and filename for the purpose of saving df as a csv file. Updates automatically when filter operations are applied.
- index_set: set
All of the indices in the current DataFrame (which were not removed by previously used filter methods) as a set.
- index_string: string
A string of all feature indices in the current DataFrame separated by newline.
- numerator: str
Name of the numerator used to calculate the fold change.
- denominator: str
Name of the denominator used to calculate the fold change.
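As a rough illustration of how index_set and index_string relate to the underlying data, the sketch below models the fold-change values as a plain dict keyed by gene ID. This is an assumption made for illustration only: the real object stores a pandas Series, and the gene IDs and values here are made up.

```python
# Hypothetical log2 fold change values, keyed by gene ID (stand-in for df)
log2fc = {
    "WBGene00000002": -0.75,
    "WBGene00000001": 1.5,
    "WBGene00000003": 2.0,
}

# index_set: the remaining indices as an unordered set
index_set = set(log2fc)

# index_string: the same indices in their current order, one per line
index_string = "\n".join(log2fc)
```

The set form suits membership tests and set operations, while the newline-separated string is convenient for pasting into other tools.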
- __init__(fname: str | Path | tuple, numerator_name: str, denominator_name: str, suppress_warnings: bool = False)
Load a fold-change table. Valid fold-change tables should contain exactly two columns: the first column containing gene names/indices, and the second column containing log2(fold change) values.
- Parameters:
fname (Union[str, Path]) – full path/filename of the .csv file to be loaded into the Filter object
numerator_name (str) – name of the numerator condition in the fold-change table
denominator_name (str) – name of the denominator condition in the fold-change table
suppress_warnings (bool (default=False)) – if True, RNAlysis will not issue warnings about the loaded table's structure or content.
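To make the expected input concrete, here is a minimal sketch that writes a two-column table of the kind described above using only the standard library. The header names, gene IDs, values, file name and condition names are all hypothetical. The commented-out lines at the end show how the documented signature suggests the file would be loaded, assuming RNAlysis is installed.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical data: gene IDs in the first column, log2(fold change) in the second
rows = [
    ("WBGene00000001", 1.5),
    ("WBGene00000002", -0.75),
    ("WBGene00000003", 2.0),
]

path = Path(tempfile.mkdtemp()) / "fold_change_example.csv"
with path.open("w", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(["gene", "log2_fold_change"])  # hypothetical header names
    writer.writerows(rows)

# Read the file back to confirm that every row has exactly two fields.
with path.open(newline="") as handle:
    loaded = list(csv.reader(handle))

# With RNAlysis installed, the signature documented above suggests:
# from rnalysis import filtering
# fc = filtering.FoldChangeFilter(path, "condition_A", "condition_B")
```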
Methods
Returns a DataFrame describing the biotypes in the table and their count.
Returns a DataFrame describing the biotypes in the table and their count.
Generate descriptive statistics that summarize the central tendency, dispersion and shape of the dataset's distribution, excluding NaN values.
Keep only the features that exist in the first Filter object/set but NOT in the others.
Drop specific columns from the table.
Filters out all features whose absolute log2 fold change is below the indicated threshold.
Filters out all features that do not match the indicated biotype/biotypes (for example: 'protein_coding', 'ncRNA', etc).
Filters out all features that do not match the indicated biotype/biotypes (for example: 'protein_coding', 'ncRNA', etc).
Filters features according to user-defined attributes from an Attribute Reference Table.
Filters genes according to GO annotations, keeping only genes that are annotated with a specific GO term.
Filters genes according to KEGG pathways, keeping only genes that belong to a specific KEGG pathway.
Filter out specific rows from the table by their name (index).
Filter out rows with duplicate names/IDs (index).
Filters out features according to the direction in which they changed between the two conditions.
Remove all rows with missing values.
Removes all entries above the specified percentile in the specified column.
Sort the rows by the values of specified column or columns, then keep only the top 'n' rows.
Find paralogs within the same species using the Ensembl database.
Find paralogs within the same species using the PantherDB database.
Return the first n rows of the Filter object.
Keep only the features that exist in ALL of the given Filter objects/sets.
Returns a set/string of the features that appear in at least (majority_threshold * 100)% of the given Filter objects/sets.
Map genes to their nearest orthologs in a different species using the Ensembl database.
Map genes to their nearest orthologs in a different species using the OrthoInspector database.
Map genes to their nearest orthologs in a different species using the PantherDB database.
Map genes to their nearest orthologs in a different species using the PhylomeDB database. This function generates a table describing all matching discovered ortholog pairs (both unique and non-unique) and returns it, and can also translate the genes in this data table into their nearest ortholog, as well as remove unmapped genes.
Apply a number filter (greater than, equal, less than) on a particular column in the Filter object.
Print the feature indices in the Filter object, sorted by their current order in the Filter object, and separated by newline.
Perform a randomization test to examine whether the fold change of a group of specific genomic features is significantly different from the fold change of a background set of genomic features.
Saves the current filtered data to a .csv file.
Saves the current filtered data to a .parquet file.
Save the current filtered data table.
Sort the rows by the values of specified column or columns.
Splits the features in the Filter object into multiple Filter objects, each corresponding to one of the specified Attribute Reference Table attributes.
Splits the features in the Filter object into two non-overlapping Filter objects: one containing features below the specified percentile in the specified column, and the other containing features above the specified percentile in the specified column.
Splits the features in the FoldChangeFilter object into two non-overlapping FoldChangeFilter objects, based on the direction of their log2(fold change).
Returns a set/string of the WBGene indices that exist either in the first Filter object/set OR the second, but NOT in both (set symmetric difference).
Return the last n rows of the Filter object.
Apply a text filter (equals, contains, starts with, ends with) on a particular column in the Filter object.
Transform the values in the Filter object with the specified function.
Translates gene names/IDs from one type to another.
Returns a set/string of the union of features between multiple Filter objects/sets (the features that exist in at least one of the Filter objects/sets).
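Several of the summaries above (difference, intersection, majority-vote intersection, symmetric difference, union) describe ordinary set operations on feature indices. The sketch below restates them with built-in Python sets. It is not RNAlysis's implementation: the gene names are invented, and `majority_vote` is a hypothetical helper whose `majority_threshold` parameter is named after the wording of the summary.

```python
from collections import Counter

# Hypothetical feature indices from three filtered tables
a = {"gene1", "gene2", "gene3"}
b = {"gene2", "gene3", "gene4"}
c = {"gene3", "gene5"}

union = a | b | c             # features present in at least one set
intersection = a & b & c      # features present in ALL sets
difference = a - b - c        # features in the first set but NOT in the others
symmetric_difference = a ^ b  # features in exactly one of two sets

def majority_vote(sets, majority_threshold=0.5):
    """Return features present in at least majority_threshold * 100% of the sets."""
    counts = Counter(feature for s in sets for feature in s)
    return {f for f, n in counts.items() if n / len(sets) >= majority_threshold}

majority = majority_vote([a, b, c])
```

With a threshold of 0.5 and three sets, a feature must appear in at least two of them to be kept, so the majority vote sits between the strict intersection and the permissive union.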