Lots more alignment edits

Busy few days of alignment edits:

On June 10th I got the first full Nematode RAxML tree back.  It was a complete mess, so I’ve gone back to the alignment and made some major edits.

First, I removed taxa from the ARB tree that had extremely long branches on the best ML tree from RAxML job #418200.  The removed sequences were:

  • MlgInc16, MLgHisp3, PyncZea2, UncNe311, UncNe482, UncNe469, HtaGly20, WucBanc6, Pd6Strong, Pd6Punct, Pd6Cylin

Next I edited alignments for the following groups:

  • Bunonema, Trichuris, Daptonema, Theristus, Pelodera, Rhabditis, Ditylenchus

The tree ‘SSU_MajorReduc_11June’ reflects the following changes:

  1. I removed all the duplicate sequences that RAxML job #418200 flagged in the tree.  See file ‘Arb_ident_Seq_7June’ [or something similar] for the list of duplicate sequences with ARB ID/taxonomy information.  After this step there were 2969 sequences in the NJ tree. [I believe this step is also reflected in ARB database ‘SSU_10June’]
  2. The uncultured nematodes were really clogging up the tree and didn’t seem to be adding anything informative.  They added long branches to clades, and a lot of them were just short barcode sequences anyway.  I went back to the tree and removed all the ‘uncultured nematodes’ and ‘uncultured nematodes sourhope farm’ entries.  After this step there were 2836 sequences in the NJ tree.
  3. I removed lots of Acrobeloides sequences; the ones I kept in the tree are highlighted in yellow on the printout of the ARB taxa in the database.  After this step there were 2510 sequences in the NJ tree.
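
As a rough illustration of the duplicate check in step 1, here is a minimal Python sketch that groups sequences which are identical once gap characters are stripped.  This is not how RAxML or ARB does it internally; the function name `find_duplicate_groups` and the toy sequence names are made up for the example, and it assumes the alignment has already been read into a name-to-sequence dictionary.

```python
from collections import defaultdict

def find_duplicate_groups(records):
    """Return lists of names whose ungapped, upper-cased sequences are identical.

    records: dict mapping sequence name -> aligned sequence string.
    Gap characters '-' and '.' are ignored (an assumption about the export format).
    """
    groups = defaultdict(list)
    for name, seq in records.items():
        key = seq.replace("-", "").replace(".", "").upper()
        groups[key].append(name)
    # Only groups with more than one member are duplicates
    return [names for names in groups.values() if len(names) > 1]

# Toy data (hypothetical names, not real database entries)
records = {
    "seqA": "ACGT--ACGT",
    "seqB": "acgtacgt",
    "seqC": "ACGTTACGT",
}
duplicates = find_duplicate_groups(records)
print(duplicates)
```

One name from each group would be kept and the rest dropped; which one to keep is still a manual decision.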

I exported the resulting 2510 taxa into a Phylip file named ‘NewSSU_2510taxa.phy’.  I ran the following jobs in RAxML:

  • Job #514028 was run with GTR + G
  • Job #514044 was run with GTR + I + G


I kept going with the reductionist edits, next removing my own list of duplicate SSU sequences.  Only sequences that were duplicates for both SSU AND D2D3 were removed from the SSU database (so we can still see the topology differences when seqs are identical for SSU but different for D2D3).  The following sequences were removed:

  • AUK 10, 35
  • HUK 1
  • LUK 3
  • OUS 1, 14, 21, 9, 3, 5, 7, 8
  • HCL 10, 11, 12, 2, 27, 5, 7, 9, 20, 32, 23
  • BUS 1, 2, 3, 4, 5, 7
  • NUS 2, 4, 5, 6, 14, 40
  • DBA 4, 2, 3, 5, 6, 7
  • SBA 3, 5, 1, 12
  • NAR 1, 5, 15, 16, 7, 2, 8
  • SUS 2, 21, 10, 15, 6
  • WUS 5, 2, 4
  • LCL 2, 3, 4, 7, 8, 5, 9, 19
  • BCA 10, 1, 31, 47, 5, 6, 21, 23
  • SBN 2
  • Cr 55, 73a, 83a, 84b
  • TCR 1, 3, 12, 130, 139, 188, 202, 158

I also went through the following genera and removed sequences shorter than 1000 bp.  Generally for these groups, the short sequences were outliers that attached to other clades (presumably because of long-branch attraction, LBA, due to their shortness), and there were plenty of full SSU sequences available in the database to represent these genera.  I’m now aiming to limit representative sequences to 15 or fewer per genus (excluding Enoplids).  Groups edited were as follows:

  • Strongyloides
  • Rhabdolaimus
  • Bursaphelenchus
  • Longidorus
  • Pratylenchus
  • Trichodorus
  • Paratrichodorus
  • Rotylenchus/Rotylenchulus
  • Ditylenchus
  • Globodera
  • Heterodera
  • Pristionchus
  • Pellioditis
  • Aphelenchoides
  • Steinernema
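
The length cutoff and per-genus cap described above could be sketched roughly as follows.  This is only an illustration under stated assumptions: I picked the sequences to keep by hand in ARB, whereas this sketch simply keeps the longest ones, and the names `select_representatives`, `genus_of`, and `uncapped` are hypothetical.

```python
from collections import defaultdict

def ungapped_length(seq):
    """Count characters that are not alignment gaps ('-' or '.')."""
    return sum(1 for c in seq if c not in "-.")

def select_representatives(records, genus_of, min_len=1000,
                           max_per_genus=15, uncapped=frozenset()):
    """Drop sequences shorter than min_len, then keep at most max_per_genus
    per genus (longest first; ties broken by name).

    records:  dict name -> aligned sequence
    genus_of: dict name -> genus label
    uncapped: genera exempt from the cap (assumption: a stand-in for
              groups such as the Enoplids that are not being limited)
    """
    by_genus = defaultdict(list)
    for name, seq in records.items():
        n = ungapped_length(seq)
        if n >= min_len:
            by_genus[genus_of[name]].append((n, name))
    kept = []
    for genus, members in by_genus.items():
        members.sort(key=lambda t: (-t[0], t[1]))
        limit = None if genus in uncapped else max_per_genus
        kept.extend(name for _, name in members[:limit])
    return kept

# Toy data with a tiny cutoff so the behaviour is easy to follow
toy = {"g1_a": "ACGTAC", "g1_b": "ACG-", "g1_c": "ACGTACGT",
       "g1_d": "ACGTA", "g2_a": "ACGT"}
toy_genus = {"g1_a": "Gen1", "g1_b": "Gen1", "g1_c": "Gen1",
             "g1_d": "Gen1", "g2_a": "Gen2"}
print(select_representatives(toy, toy_genus, min_len=4, max_per_genus=2))
```

Keeping the longest sequences is just one possible rule; in practice taxonomic coverage within the genus matters as well, which is why the real selection was done by eye.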

After all these edits, I rebuilt the NJ tree early morning on June 12.  This NJ tree had 1876 sequences.

I also built the initial LSU database in ARB.  I downloaded the alignment for Phylum Nematoda from SILVA (containing 1544 taxa).  I imported this into ARB and edited the following groups:

  • Caenorhabditis
  • Strongyloides
  • Bursaphelenchus
  • Pratylenchus

The LSU alignments are a MESS.  The conserved regions are OK, but the less conserved regions are not aligned at all.  I’ve been using fast aligner a lot to rectify things within genera, as oftentimes it seems that things align OK if you use this method, although it doesn’t work for everything and some regions are impossible to work with.


