rnalysis.general.parse_sequence_name_string

rnalysis.general.parse_sequence_name_string(string)

Receives a string that contains sequence names (such as ‘Y55D5A.5’) and parses it into a set of those sequence names. A sequence name consists of a match for the expression ‘[A-Z,0-9]{5,6}’, followed by the character ‘.’ and a single digit.

:type string: str
:param string: The string to be parsed. It can be in any format.
:return: a set of the sequence names that appear in the given string.

Examples

>>> from rnalysis import general
>>> string = 'CELE_Y55D5A.5T23G5.6WBGene00000000 daf-16^^ZK662.4 '
>>> parsed = general.parse_sequence_name_string(string)
>>> print(parsed)
{'Y55D5A.5', 'T23G5.6', 'ZK662.4'}
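
The format described above can be illustrated with Python's standard `re` module. The following is a minimal sketch, not RNAlysis's own implementation: the helper name `find_sequence_names` and the constant `SEQUENCE_NAME_PATTERN` are hypothetical. The documented expression includes a comma inside the brackets, which a regular expression would treat as a literal comma; the sketch assumes that is unintended and leaves it out.

```python
import re

# Assumed pattern: 5-6 uppercase letters or digits, then '.', then one digit.
# Illustration of the documented format only, not the library's own code.
SEQUENCE_NAME_PATTERN = re.compile(r"[A-Z0-9]{5,6}\.\d")


def find_sequence_names(text: str) -> set:
    """Return every substring of `text` that matches the assumed pattern."""
    return set(SEQUENCE_NAME_PATTERN.findall(text))


names = find_sequence_names("CELE_Y55D5A.5T23G5.6WBGene00000000 daf-16^^ZK662.4 ")
print(sorted(names))  # ['T23G5.6', 'Y55D5A.5', 'ZK662.4']
```

Sets are unordered, so the sketch sorts the result before printing to get stable output. Names need no separators between them: 'Y55D5A.5' and 'T23G5.6' are both found even though they are adjacent in the input.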