i) Installation environment requirements
Operating system: Linux, Mac or Windows
Software requirements: Golang compilation environment (required only when compiling from source code; not required when using the precompiled programs directly).
ii) Installation
The compiled executable programs do not need to be installed and can be run directly from the command line. To build from the source package, first install and configure the Golang compilation environment, then compile the source code of each module to obtain its executable file and save it in the bin subfolder. On Windows, you can also compile all modules at once by double-clicking the build.bat file in the ASIIQT.v.1.0 folder, which automatically generates all executable programs in the bin subfolder. On any system, modules can be compiled one at a time using commands like the following:
Linux / Mac system:
$ cd bin
$ go build ../codes/[module].go
Windows system:
$ cd bin
$ go build ..\codes\[module].go
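On Linux or Mac, the module-by-module build can also be scripted, similar to what build.bat does on Windows. The following is a minimal sketch and not part of the package: the function name build_all is hypothetical, and it assumes the layout implied by the commands above (sources in codes, executables in bin) and that every .go file in codes is a standalone module.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the package): compile every
# .go file in a source directory into a bin directory.
# Assumes each .go file is a standalone module, as in the commands above.
build_all() {
    codes_dir=$1
    bin_dir=$2
    mkdir -p "$bin_dir" || return 1
    for src in "$codes_dir"/*.go; do
        [ -e "$src" ] || continue            # no .go files matched the pattern
        name=$(basename "$src" .go)
        # -o sets the output path explicitly, so no 'cd bin' is needed
        go build -o "$bin_dir/$name" "$src" || return 1
    done
}
```

For example, from the folder that contains both codes and bin: build_all codes bin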
i) Identification of ASIs:
Given the CCS reads of the single-molecule RNA-Seq: sample_raw.longReads.fasta;
High-quality isoforms assembled using IsoSeq: sample_hq_isoform.fastq;
NGS RNA-Seq reads: sample_NGS.reads.fasta;
Reference genome: ref.fa
Mapping 'sample_NGS.reads.fasta' against 'sample_pseudo.longReads.fasta';
Calling SNPs with SAMtools, generating a BCF-derived VCF file: sample_bcf.txt;
Low-quality filtering of 'sample_bcf.txt', generating: sample_filtered.vcf.txt
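The filtering criterion is not stated here. As one possible illustration only, the sketch below keeps VCF records whose QUAL value (column 6) is at or above a cutoff. The function name filter_vcf and the cutoff of 20 in the usage line are assumptions, not the settings used by the authors.

```shell
#!/bin/sh
# Hypothetical helper: keep header lines and records with QUAL >= cutoff.
# Assumes a tab-separated text VCF in which column 6 is QUAL.
filter_vcf() {
    min_qual=$1
    in_vcf=$2
    awk -F '\t' -v min="$min_qual" '
        /^#/ { print; next }                  # pass header lines through unchanged
        $6 != "." && $6 + 0 >= min + 0        # print records that meet the cutoff
    ' "$in_vcf"
}
```

For example: filter_vcf 20 sample_bcf.txt > sample_filtered.vcf.txt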
./modifiedLongReadsCorrect sample_pseudo.longReads.coord.txt sample_modified.longReads.fasta sample_filtered.vcf.txt sample_SNP_marked.NGS_corrected.pseudo.longReads.fasta
#Automatically generating two files: SNP_marked.NGS_corrected.modified.longReads.fasta; modified.longReads.SNP.annotation.txt;
Renaming the files to 'sample_SNP_marked.NGS_corrected.modified.longReads.fasta' and 'sample_modified.longReads.SNP.annotation.txt'
./snpSet sample_pseudo.longReads.coord.txt sample_filtered.vcf.txt
#Automatically generating a file: chr_index_SNP.txt;
Renaming the file to 'sample_chr_index_SNP.txt'
./phasing sample_modified.longReads.SNP.annotation.txt sample_chr_index_SNP.txt sample_SNP_marked.NGS_corrected.modified.longReads.fasta
#Automatically generating 9 files: (1) to_be_phased.txt; (2) to_be_phased_2.txt; (3) to_be_phased_2_ann.txt; (4) to_be_phased_2_stat.txt; (5) phased_raw.txt; (6) phased_correction.txt; (7) read_phased.txt; (8) corrected.modified.longReads.fasta; (9) phased.modified.longReads.fasta;
Renaming the files with a prefix 'sample_' added.
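The renaming steps above can be done by hand, or with a small shell helper on Linux or Mac. The following is a minimal sketch: the function name add_prefix is hypothetical, and it simply prepends the given prefix to each listed file name, leaving the file in its original directory.

```shell
#!/bin/sh
# Hypothetical helper: rename each listed file by prepending a prefix
# to its base name. Missing files are reported and skipped.
add_prefix() {
    prefix=$1
    shift
    for f in "$@"; do
        if [ -e "$f" ]; then
            mv "$f" "$(dirname "$f")/${prefix}$(basename "$f")"
        else
            echo "skipping missing file: $f" >&2
        fi
    done
}
```

For example, after running ./phasing: add_prefix sample_ to_be_phased.txt to_be_phased_2.txt to_be_phased_2_ann.txt to_be_phased_2_stat.txt phased_raw.txt phased_correction.txt read_phased.txt corrected.modified.longReads.fasta phased.modified.longReads.fasta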
ii) Quantification of ASIs:
./phasedIso sample_chr_index_SNP.txt sample_hq_isoform.fastq sample_phased_raw.txt sample_hq_isoform_transcripts.txt
#Automatically generating 2 files: isoform_phased.txt; phased.hq_isoform.fasta;
Renaming the files to 'sample_isoform_phased.txt' and 'sample_phased.hq_isoform.fasta', respectively