rnalysis.fastq.bowtie2_align_paired_end

rnalysis.fastq.bowtie2_align_paired_end(r1_files: List[str], r2_files: List[str], output_folder: Union[str, Path], index_file: Union[str, Path], bowtie2_installation_folder: Union[str, Path, Literal['auto']] = 'auto', new_sample_names: Union[List[str], Literal['auto']] = 'auto', mode: Literal['end-to-end', 'local'] = 'end-to-end', settings_preset: Literal['very-fast', 'fast', 'sensitive', 'very-sensitive'] = 'very-sensitive', ignore_qualities: bool = False, quality_score_type: Literal['phred33', 'phred64', 'solexa-quals', 'int-quals'] = 'phred33', mate_orientations: Literal['fwd-rev', 'rev-fwd', 'fwd-fwd'] = 'fwd-rev', min_fragment_length: NonNegativeInt = 0, max_fragment_length: PositiveInt = 500, allow_individual_alignment: bool = True, allow_disconcordant_alignment: bool = True, random_seed: NonNegativeInt = 0, threads: PositiveInt = 1)

Align paired-end reads from FASTQ files to a reference sequence using the bowtie2 aligner. The FASTQ file pairs will be individually aligned, and the aligned SAM files will be saved in the output folder. You can read more about how bowtie2 works in the bowtie2 manual.
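As a rough illustration of the call described above, the sketch below collects the arguments in one place and then hands them to the function. The helper names `build_alignment_kwargs` and `run_alignment` are hypothetical and not part of RNAlysis; the keyword names and values come from the signature above. The RNAlysis import is deferred so that the argument-building step can be tried without bowtie2 or RNAlysis installed.

```python
from pathlib import Path


def build_alignment_kwargs(r1_files, r2_files, output_folder, index_file, threads=1):
    """Collect keyword arguments for bowtie2_align_paired_end (hypothetical helper)."""
    # Each R1 file needs a matching R2 file at the same position in the list.
    if len(r1_files) != len(r2_files):
        raise ValueError('r1_files and r2_files must have the same length')
    return dict(
        r1_files=[str(p) for p in r1_files],
        r2_files=[str(p) for p in r2_files],
        output_folder=Path(output_folder),  # documented as an existing folder
        index_file=Path(index_file),        # index name, or one of its .bt2 files
        mode='end-to-end',
        settings_preset='very-sensitive',
        threads=threads,
    )


def run_alignment(kwargs):
    """Run the alignment; needs RNAlysis and a bowtie2 installation."""
    # Imported lazily so the helper above works without RNAlysis installed.
    from rnalysis import fastq

    fastq.bowtie2_align_paired_end(**kwargs)
```

Calling `run_alignment(build_alignment_kwargs(r1, r2, 'alignments', 'path/to/index', threads=4))` would then align every file pair, provided the output folder already exists.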

Parameters
  • r1_files (list of str/Path to existing FASTQ files) – a list of paths to your Read#1 files. The files should be sorted in tandem with r2_files, so that they line up to form pairs of R1 and R2 files.

  • r2_files (list of str/Path to existing FASTQ files) – a list of paths to your Read#2 files. The files should be sorted in tandem with r1_files, so that they line up to form pairs of R1 and R2 files.

  • output_folder (str/Path to an existing folder) – Path to a folder in which the aligned reads, as well as the log files, will be saved.

  • index_file (str or Path) – Path to a pre-built bowtie2 index of the target genome. Can either be downloaded from the bowtie2 website (menu on the right), or generated manually from FASTA files using the function ‘bowtie2_create_index’. Note that bowtie2 indices are composed of multiple files ending with the ‘.bt2’ suffix. All of those files should be in the same location. It is enough to specify the path to one of those files (for example, ‘path/to/index.1.bt2’), or to the main name of the index (for example, ‘path/to/index’).

  • bowtie2_installation_folder (str, Path, or 'auto' (default='auto')) – Path to the installation folder of bowtie2. For example: ‘C:/Program Files/bowtie2-2.5.1’. If the installation folder is set to ‘auto’, RNAlysis will attempt to find it automatically.

  • new_sample_names (list of str or 'auto' (default='auto')) – Give a new name to each aligned sample (optional). If new_sample_names=‘auto’, sample names will be assigned automatically. Otherwise, new_sample_names should be a list of new names, with the order of the names matching the order of the file pairs.

  • mode ('end-to-end' or 'local' (default='end-to-end')) – determines the alignment mode of bowtie2. end-to-end mode will look for alignments involving all the read characters. local mode will allow ‘clipping’ of nucleotides from both sides of the read, if that maximizes the alignment score.

  • settings_preset ('very-sensitive', 'sensitive', 'fast', or 'very-fast' (default='very-sensitive')) – determines the alignment sensitivity preset. Higher sensitivity will result in more accurate alignments, but will take longer to calculate. You can read more about the settings presets in the bowtie2 manual.

  • ignore_qualities (bool (default=False)) – if True, bowtie2 will ignore the qualities of the reads and treat them all as maximum quality.

  • quality_score_type ('phred33', 'phred64', 'solexa-quals', or 'int-quals' (default='phred33')) – determines the encoding type of the read quality scores. Most modern sequencing setups use phred+33.

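  • mate_orientations ('fwd-rev', 'rev-fwd', or 'fwd-fwd' (default='fwd-rev')) – the upstream/downstream mate orientations required for a valid paired-end alignment against the forward reference strand.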

  • min_fragment_length (int >= 0 (default=0)) – The minimum fragment length for valid paired-end alignments.

  • max_fragment_length (int > 0 (default=500)) – The maximum fragment length for valid paired-end alignments.

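  • allow_individual_alignment (bool (default=True)) – if True, bowtie2 may align the mates of a pair individually when it cannot find a paired alignment for them.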

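  • allow_disconcordant_alignment (bool (default=True)) – if True, bowtie2 may report alignments of a pair that do not satisfy the paired-end constraints (for example, the expected mate orientations or the fragment length range).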

  • random_seed (int >= 0 (default=0)) – the seed for bowtie2's pseudo-random number generator.

  • threads (int > 0 (default=1)) – number of threads to run bowtie2 on. More threads will generally make alignment faster.
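Because r1_files and r2_files must line up pair by pair, it can help to build both lists from the file names instead of sorting them by hand. The sketch below is one possible way to do that. It assumes a hypothetical naming convention in which mate files are called `<sample>_R1.fastq` and `<sample>_R2.fastq` (or `.fq`); `pair_fastq_files` is an illustrative helper, not part of RNAlysis.

```python
import re
from pathlib import Path

# Assumed (hypothetical) naming convention: <sample>_R1.fastq / <sample>_R2.fastq
MATE_PATTERN = re.compile(r'(.+)_R([12])\.(?:fastq|fq)')


def pair_fastq_files(paths):
    """Return (r1_files, r2_files, sample_names), all sorted by sample name."""
    r1, r2 = {}, {}
    for path in map(Path, paths):
        match = MATE_PATTERN.fullmatch(path.name)
        if match is None:
            continue  # skip files that do not follow the assumed convention
        sample, read = match.group(1), match.group(2)
        (r1 if read == '1' else r2)[sample] = str(path)

    # Every sample must have both mates, otherwise the lists would not line up.
    unpaired = sorted(set(r1) ^ set(r2))
    if unpaired:
        raise ValueError(f'samples missing a mate file: {unpaired}')

    samples = sorted(r1)
    return [r1[s] for s in samples], [r2[s] for s in samples], samples
```

The three returned lists can be passed as r1_files, r2_files, and new_sample_names respectively, which keeps the file pairs and their names in the same order.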