Work continues on tuning up the core code.
The biggest change is that guppy now makes actual SQLite databases rather than a collection of SQLite commands.
This makes database building much faster.
For those of you compiling the code, you will now need godi-sqlite3.
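Since the output is now a real SQLite database file rather than a list of commands, it can be opened directly by any SQLite client. As a rough sketch, here is how one might list the tables in such a file from Python. The table name below is a made-up stand-in (the actual schema is not described here), and an in-memory database takes the place of a real file.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in an open SQLite connection, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Stand-in for a real database file; with an actual file you would call
# sqlite3.connect("path/to/your.db") instead.
conn = sqlite3.connect(":memory:")
# Hypothetical table, created only so the example has something to list.
conn.execute("CREATE TABLE example_placements (name TEXT, edge INTEGER)")

print(list_tables(conn))
```

Querying `sqlite_master` works on any SQLite database, so this is a reasonable first step for exploring a file whose schema you do not know yet.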
We have also finished full support for placement multiplicity (i.e. more than one sequence name per placement), including in the database code.
There is also a guppy redup command for re-adding duplicate sequences to placefiles generated from deduplicated sequence files.
Deduplication will make your pipeline much faster, and it’s easy with seqmagick (the guppy redup documentation has some details).
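To make the idea concrete, here is a minimal sketch of what deduplicating and then re-adding duplicates amounts to. It is only an illustration of the concept: the function names and data layout are hypothetical, and this is not the file format or algorithm that seqmagick or guppy redup actually uses.

```python
def deduplicate(records):
    """Collapse identical sequences.

    records: iterable of (name, sequence) pairs.
    Returns (unique_records, names_by_rep), where the first name seen for
    each distinct sequence acts as its representative.
    """
    rep_for_seq = {}   # sequence -> representative name
    names_by_rep = {}  # representative name -> all names with that sequence
    for name, seq in records:
        rep = rep_for_seq.setdefault(seq, name)
        names_by_rep.setdefault(rep, []).append(name)
    unique_records = [(rep, seq) for seq, rep in rep_for_seq.items()]
    return unique_records, names_by_rep


def reduplicate(placed_names, names_by_rep):
    """Expand each placed representative back to every name it stands for."""
    return {rep: names_by_rep.get(rep, [rep]) for rep in placed_names}


records = [("a", "ACGT"), ("b", "ACGT"), ("c", "GGCC")]
unique_records, names_by_rep = deduplicate(records)
# Only the unique sequences would be placed; afterwards each placement
# gets all of its sequence names back.
expanded = reduplicate([name for name, _ in unique_records], names_by_rep)
print(unique_records)
print(expanded)
```

The speedup comes from placing each distinct sequence only once; a placement that ends up with several names is exactly the multiplicity case mentioned above.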
Also:
- fixed all of the sequence parsers to be tail-recursive, so parsing large files no longer causes segfaults.
- better consistency of output flags across all guppy commands.
- renamed the --normal flag for guppy kr to --gaussian to avoid confusion with normalization.
- shuffling for guppy kr is now much more memory efficient, and fixed bug that was throwing off significance estimation.
- guppy pca now defaults to scaling eigenvalues to percent variance.
- Re-added in JTT which had been mysteriously dropped.
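
For the eigenvalue scaling item above, here is a small sketch of what "percent variance" usually means: each eigenvalue divided by the sum of the eigenvalues, times 100. This is an assumption about the convention, not a copy of what guppy pca does internally; in particular, whether the sum runs over all eigenvalues or only the reported ones is not stated here. The function name is hypothetical.

```python
def to_percent_variance(eigenvalues):
    """Scale eigenvalues so they sum to 100 (assumed convention)."""
    total = sum(eigenvalues)
    if total == 0:
        raise ValueError("eigenvalues sum to zero; cannot scale")
    return [100.0 * ev / total for ev in eigenvalues]


scaled = to_percent_variance([6.0, 3.0, 1.0])
print(scaled)
```

Scaled this way, the values can be read directly as the share of total variance explained by each principal component.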