rnalysis.general.parse_gene_name_string

rnalysis.general.parse_gene_name_string(string)

Receives a string that contains gene names (like ‘daf-2’ or ‘lin-15B’) and parses it into a set of gene names. A gene name is a sequence consisting of the expression ‘[a-z]{3,4}’, the character ‘-’, and the expression ‘[A-Z,0-9]{1,4}’.

Parameters: string (str): The string to be parsed. It may have any format; gene names do not need to be separated by a particular delimiter.

Returns: a set of the gene names that appear in the given string.

Examples

>>> from rnalysis import general
>>> string = 'saeg-2 lin-15B cyp-23A1lin-15A WBGene12345678%GHF5H.3'
>>> parsed = general.parse_gene_name_string(string)
>>> print(sorted(parsed))  # sets are unordered, so sort for a stable display
['cyp-23A1', 'lin-15A', 'lin-15B', 'saeg-2']
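
The gene-name format described above can be sketched as a single regular expression. The following is a minimal illustration rather than the RNAlysis implementation: the name `parse_gene_names_sketch` and the constant `GENE_NAME_PATTERN` are hypothetical, and the pattern is simply assembled from the three parts listed in the description.

```python
import re

# Assumed pattern: built from the documented format
# ('[a-z]{3,4}', then '-', then '[A-Z,0-9]{1,4}').
# It is not copied from the RNAlysis source code.
GENE_NAME_PATTERN = re.compile(r'[a-z]{3,4}-[A-Z,0-9]{1,4}')


def parse_gene_names_sketch(string: str) -> set:
    """Hypothetical stand-in: return every substring matching the pattern."""
    # findall scans left to right and returns non-overlapping matches,
    # so adjacent names such as 'cyp-23A1lin-15A' are split correctly.
    return set(GENE_NAME_PATTERN.findall(string))


text = 'saeg-2 lin-15B cyp-23A1lin-15A WBGene12345678%GHF5H.3'
print(sorted(parse_gene_names_sketch(text)))
# ['cyp-23A1', 'lin-15A', 'lin-15B', 'saeg-2']
```

Note that inside a character class the comma is a literal character, so in this sketch the pattern also accepts commas in the suffix: parsing 'daf-2,lin-15B' yields 'daf-2,' rather than 'daf-2'. Whether the library behaves the same way is worth checking before passing comma-separated input.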