v1.1.alpha07: SQL overhaul, multiplicity and fixes
23 May 2011, by Erick
Work continues on tuning up the core code.
The biggest change is that guppy now makes actual SQLite databases rather than a collection of SQLite commands.
This means database building is much much faster.
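As a rough illustration of the difference, here is a minimal Python sketch that contrasts emitting SQL text with writing rows straight into a SQLite database. This is not guppy's code or schema: the table name `placement_names`, its columns, and the sample rows are hypothetical and chosen only for the example.

```python
import sqlite3

# Hypothetical rows: (placement_id, sequence_name, edge_number).
# These are illustration values, not guppy's actual schema or output.
rows = [(1, "seq_a", 12), (1, "seq_b", 12), (2, "seq_c", 40)]


def as_sql_script(rows):
    """Old style: emit SQL text that something else must parse and run later."""
    lines = [
        "CREATE TABLE placement_names "
        "(placement_id INTEGER, name TEXT, edge INTEGER);"
    ]
    for pid, name, edge in rows:
        lines.append(
            f"INSERT INTO placement_names VALUES ({pid}, '{name}', {edge});"
        )
    return "\n".join(lines)


def build_database(conn, rows):
    """New style: write rows through the SQLite library in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE placement_names "
            "(placement_id INTEGER, name TEXT, edge INTEGER)"
        )
        conn.executemany("INSERT INTO placement_names VALUES (?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
build_database(conn, rows)
count = conn.execute("SELECT COUNT(*) FROM placement_names").fetchone()[0]
print(count)
```

Writing through the library with a single transaction avoids generating and re-parsing one statement per row, which is one plausible reason a direct build would be faster.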
For those of you compiling the code, you will now need
We have also finished full support for multiplicity of placements (i.e. more than one sequence name per placement).
Multiplicities are also supported in the database code.
There is also a guppy redup command for re-adding duplicate sequences to placefiles generated from deduplicated sequence files.
Deduplication will make your pipeline much faster, and it’s easy with seqmagick (the guppy redup documentation has some details).
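The idea behind deduplicating and then re-adding names can be sketched in a few lines of Python. This is only a conceptual illustration, not how seqmagick or guppy redup are implemented: the functions `deduplicate` and `redup`, and the dictionary used to stand in for a placement, are hypothetical.

```python
from collections import OrderedDict


def deduplicate(records):
    """Group names by identical sequence.

    Returns one representative (name, sequence) pair per distinct sequence,
    plus a map from each representative name to all names sharing its sequence.
    """
    groups = OrderedDict()
    for name, seq in records:
        groups.setdefault(seq, []).append(name)
    representatives = [(names[0], seq) for seq, names in groups.items()]
    dup_groups = {names[0]: names for names in groups.values()}
    return representatives, dup_groups


def redup(placements, dup_groups):
    """Expand each placement's name list with the names its representative stood for."""
    expanded = []
    for placement in placements:
        names = []
        for name in placement["names"]:
            names.extend(dup_groups.get(name, [name]))
        expanded.append({**placement, "names": names})
    return expanded


records = [("a", "ACGT"), ("b", "ACGT"), ("c", "TTGA")]
reps, dup_groups = deduplicate(records)

# Stand-in for placing only the representatives; "edge" is a made-up field.
placements = [{"names": [rep], "edge": i} for i, (rep, _) in enumerate(reps)]
expanded = redup(placements, dup_groups)
print(expanded)
```

Only the two distinct sequences need to be placed here, and the placement for `a` ends up carrying both `a` and `b`, which is the more-than-one-name-per-placement case described above.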
- fixed all of the sequence parsers to be tail-recursive, so parsing large files no longer causes segfaults.
- better consistency of output flags across all guppy commands.
- renamed the --gaussian to avoid confusion with normalization.
- shuffling for guppy kr is now much more memory efficient, and fixed a bug that was throwing off significance estimation.
- guppy pca now defaults to scaling eigenvalues to percent variance.
- re-added JTT, which had been mysteriously dropped.
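For the guppy pca item in the list above, scaling eigenvalues to percent variance can be sketched as follows. This assumes "percent variance" means each eigenvalue divided by the sum of all eigenvalues, times 100; the function name `percent_variance` is hypothetical, and guppy's own computation may differ in detail.

```python
def percent_variance(eigenvalues):
    """Scale eigenvalues so that they sum to 100 (assumed definition)."""
    total = sum(eigenvalues)
    if total == 0:
        raise ValueError("eigenvalues sum to zero; cannot scale")
    return [100.0 * ev / total for ev in eigenvalues]


scaled = percent_variance([6.0, 3.0, 1.0])
print(scaled)
```

With this scaling, each value can be read directly as the share of total variance explained by the corresponding principal component.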