Workflow of De Novo Sequencing
De Novo sequencing determines a novel genome sequence by sequencing a sample directly and assembling the reads without any reference genome information, which is why it is often equated with whole genome assembly. It is widely used in genomics research to characterize previously unsequenced genomes, including those of newly studied species, and to resolve complex genomic structures.
Sample Preparation
Sample preparation is the first step in De Novo sequencing, and its quality directly affects the sequencing and assembly steps that follow. Genomic DNA is extracted and purified so that the sample is intact (not degraded), free of contaminants, and available in sufficient quantity and concentration.
Library Construction
Library construction entails fragmenting the purified genomic DNA and attaching adapters to create DNA libraries suitable for high-throughput sequencing platforms. There are two primary methods: constructing short-fragment libraries for platforms like Illumina and long-fragment libraries for platforms such as PacBio and Oxford Nanopore.
Sequencing
In this step, the constructed DNA libraries are run on a high-throughput sequencer to generate large volumes of read data. Each platform has distinct strengths: Illumina is known for high accuracy and throughput, while PacBio and Oxford Nanopore excel at reading long DNA fragments. Matching the platform to the project's specific needs is essential for obtaining high-quality data.
Data Preprocessing
Data preprocessing covers quality control and filtering of the raw sequencing data. This step commonly removes adapter sequences, low-quality reads, and redundant (duplicate) sequences. The resulting high-quality reads are then used for genome assembly.
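The filtering described above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a dedicated trimming tool: the adapter string, the thresholds `min_mean_q` and `min_len`, and the function names are all assumptions chosen for the example, and reads are assumed to be (sequence, quality) pairs with Phred+33 quality encoding.

```python
from statistics import mean

# Assumed adapter prefix for illustration; real projects should use the
# adapter sequence documented for their library kit.
ADAPTER = "AGATCGGAAGAGC"


def phred_scores(quality, offset=33):
    """Convert a Phred+33 quality string into a list of integer scores."""
    return [ord(ch) - offset for ch in quality]


def trim_adapter(seq, qual, adapter=ADAPTER):
    """Cut the read at the first exact adapter match, if there is one."""
    idx = seq.find(adapter)
    if idx == -1:
        return seq, qual
    return seq[:idx], qual[:idx]


def preprocess(reads, min_mean_q=20, min_len=20):
    """Trim adapters, then drop short, low-quality, and duplicate reads."""
    seen = set()
    kept = []
    for seq, qual in reads:
        seq, qual = trim_adapter(seq, qual)
        if len(seq) < min_len:
            continue  # too short after trimming
        if mean(phred_scores(qual)) < min_mean_q:
            continue  # mean base quality below the threshold
        if seq in seen:
            continue  # exact duplicate of a read already kept
        seen.add(seq)
        kept.append((seq, qual))
    return kept
```

Exact string matching keeps the sketch short; production tools also handle partial adapters at read ends, sequencing errors inside the adapter, and paired-end reads.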
Genome Assembly
Genome assembly pieces the high-quality reads together into longer contiguous sequences (contigs) and, ideally, a complete genome sequence. Two common assembly methods are Overlap-Layout-Consensus (OLC) and De Bruijn graph methods. OLC suits long-read data, where reads long enough to span repeats help resolve highly repetitive regions, whereas De Bruijn graph methods, which break reads into fixed-length k-mers, are more efficient for short-read data. Choosing an assembly algorithm that matches the data type improves the accuracy and completeness of the genome assembly.
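To make the De Bruijn graph idea concrete, the sketch below builds a graph whose nodes are (k-1)-mers and whose edges are the k-mers observed in the reads, then follows the path from a start node for as long as there is exactly one way forward. It is a toy illustration under strong assumptions: error-free reads, a single strand (no reverse complements), and no coverage information. The function names are hypothetical and are not taken from any assembler.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_unambiguous(graph, start):
    """Extend a contig from `start` while each node has exactly one successor."""
    contig = start
    node = start
    visited = {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in visited:
            break  # stop rather than loop forever on a cycle
        contig += nxt[-1]  # each step adds one new base
        visited.add(nxt)
        node = nxt
    return contig


reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
graph = build_de_bruijn(reads, k=4)
contig = walk_unambiguous(graph, "ATG")
print(contig)  # ATGGCGTGCA
```

The walk stops at the first branch or dead end, which is where real assemblers apply extra logic such as error correction, bubble removal, and repeat resolution.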
Evaluation and Annotation
After assembly, the genome sequence must be evaluated and annotated. Evaluation typically includes calculating the N50 value (the contig length at which contigs of that length or longer contain at least half of the assembled bases) and assessing genome completeness and accuracy. Annotation identifies functional elements within the genome, such as genes, regulatory regions, and repetitive sequences. Bioinformatics tools and databases are used to provide comprehensive functional information for the assembled genome.
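As a worked example of the N50 metric, the sketch below sorts contig lengths from longest to shortest and returns the length at which the running total first reaches half of the assembly size. The function name `n50` and the example lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running total to avoid fractional halves.
        if running * 2 >= total:
            return length


contig_lengths = [80, 70, 50, 40, 30, 20]  # total = 290, half = 145
print(n50(contig_lengths))  # 70, because 80 + 70 = 150 >= 145
```

N50 describes contiguity only; a high value does not by itself indicate that the assembly is complete or correct, which is why it is reported alongside completeness and accuracy assessments.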
By adhering to these steps, De Novo sequencing enables the comprehensive analysis of novel genomes, contributing significantly to advancements in genomic research.