Pennsylvania State University
Topic: De novo Assembly
A whole day of assembly!
There is no single program right now that is considered 'the assembler'. Different assemblers have their own advantages and disadvantages, and each tends to suit some jobs better than others. One thing most of today's assemblers have in common is that they take a lot of time and memory to run, especially when doing de novo assembly. One exception is the program Minia, developed by Dr. Chikhi, which was designed to run efficiently with low memory requirements.
One of the important things you need to know for assembly is what a k-mer is. A k-mer is any sequence of length k.
AGC is a k-mer with k=3
AGCT is a k-mer with k=4
AGCTT is a k-mer with k=5
You hopefully get the idea.
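If it helps to see it in code, here is a minimal Python sketch of pulling every k-mer out of a sequence with a sliding window. The function name `kmers` and the toy read `AGCTT` are just my own illustration (not taken from any particular assembler), and real tools do this far more efficiently.

```python
def kmers(sequence, k):
    """Return every k-mer in `sequence`, left to right, using a sliding window."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence of length n contains n - k + 1 overlapping k-mers
    # (and none at all if k is longer than the sequence).
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


read = "AGCTT"  # toy sequence, chosen to match the examples above

print(kmers(read, 3))  # ['AGC', 'GCT', 'CTT']
print(kmers(read, 4))  # ['AGCT', 'GCTT']
print(kmers(read, 5))  # ['AGCTT']
```

Notice that neighbouring k-mers overlap by k-1 letters, and that a larger k gives you fewer k-mers from the same read.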
There are two essential approaches that assemblers use to assemble: de Bruijn graphs and overlap/string graphs. We sort of covered this in the Assembly prep blog... let's see if I can explain it better here.