Expanding support for euks in PhyloSift

PhyloSift genome data lives on Merlot in /share/eisen-z2/phylosift/

Talked with Guillaume about euk markers

  • marker updates – ONLY the tree and its associated taxonomy are updated (not the HMM, because we don’t want the HMM to shift over time). We throw away all the old reference sequences and start afresh by scanning the downloaded genomes
  • For euks – I’m worried that we’re losing a lot of taxonomic diversity with this marker update method. The original Parfrey sequences definitely seem to be getting thrown away. Need to figure out whether all the deep protist lineages are represented after euk marker updates (99% tree pruning won’t do much harm if we already have these taxa present).

To investigate:

  • How many euk genomes do we have in the PhyloSift directories (draft, ebi, WGS, etc)? How many bacterial/archaeal genomes in comparison?
  • Work with Guillaume to get full NCBI taxonomic hierarchies placed into the trees. This will help to evaluate which lineages are present/absent in our reference markers.
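
Rough sketch of how both checks could work once we have a lineage for every genome. This is not PhyloSift code. It assumes we can build a dict mapping each genome ID to its list of NCBI taxon names (root to leaf). The function names, genome IDs, shortened lineages and the list of target groups below are all made up for illustration.

```python
from collections import Counter

# The three domain names as they appear in NCBI lineages.
DOMAINS = ("Eukaryota", "Bacteria", "Archaea")


def count_by_domain(lineages):
    """Tally genomes per domain.

    lineages: dict of genome ID -> list of taxon names, root to leaf
    (hypothetical input format, not something PhyloSift produces).
    """
    counts = Counter()
    for names in lineages.values():
        # First domain name found in the lineage, else "unclassified".
        domain = next((n for n in names if n in DOMAINS), "unclassified")
        counts[domain] += 1
    return counts


def lineage_coverage(lineages, target_groups):
    """Report which target groups occur in at least one genome's lineage."""
    seen = {name for names in lineages.values() for name in names}
    return {group: group in seen for group in target_groups}


# Illustrative data only: IDs are invented and lineages are shortened.
example_lineages = {
    "genome_A": ["Eukaryota", "Alveolata", "Apicomplexa"],
    "genome_B": ["Eukaryota", "Amoebozoa"],
    "genome_C": ["Bacteria", "Gammaproteobacteria"],
    "genome_D": ["Archaea", "Euryarchaeota"],
}

# Example target list; the real one would come from the deep protist
# lineages we care about (e.g. those in the Parfrey data).
example_targets = ["Alveolata", "Amoebozoa", "Rhizaria"]

domain_counts = count_by_domain(example_lineages)
coverage = lineage_coverage(example_lineages, example_targets)
missing = [group for group, present in coverage.items() if not present]
```

Running the coverage check on the reference set before and after a marker update would show directly whether the update drops any deep lineages.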