Cox1 ML and Bayesian Treebuilding

 

Model choice for ML –> Use Modeltest 3.7 (chooses between 56 models)

  • Cox1_28May_aln.nex –> Cox1 Enoplids only with 6 Pellioditis spp. (105 taxa  total)

Ran on modeltest:

  1. First execute data file (Nexus) in PAUP
  2. Execute the file ‘modelblock PAUP b10’ to compare models. This makes an output file called ‘model.scores’; you can rename it afterward to reflect your data
  3. Deposit the ‘model.scores’ file into the C:\bin folder and then run modeltest < inputfile.scores > outputfile.out (the scores file is redirected in and the results are redirected out)
  4. If successful, the command line will revert back to C:\bin and the outfile is written to the bin folder.

For Cox1_28May_aln.nex the AIC selected GTR +I +G as the appropriate model for Cox1 data
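
For reference, the AIC comparison can be sketched as below. This is only an illustration of the criterion (AIC = 2k - 2 lnL, lowest value wins), not Modeltest's own code. The log-likelihoods are made-up placeholders, not values from my ‘model.scores’ file, and the parameter counts assume only substitution-model parameters are counted (branch lengths excluded).

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = 2k - 2 lnL (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# HYPOTHETICAL scores for illustration only (not from model.scores).
# Free parameters, branch lengths excluded:
#   HKY = 3 base frequencies + 1 ts/tv ratio = 4
#   GTR = 3 base frequencies + 5 rate parameters = 8
#   +G adds 1 (gamma shape), +I adds 1 (proportion of invariable sites)
candidates = {
    "HKY+G": (-11400.0, 5),
    "GTR+G": (-11290.0, 9),
    "GTR+I+G": (-11260.0, 10),
}

scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
best = min(scores, key=scores.get)

for name in sorted(scores, key=scores.get):
    print(f"{name:<8} AIC = {scores[name]:.1f}")
print("Selected model:", best)
```

The extra parameters in a richer model are only "paid for" if the likelihood improves by more than one log unit per added parameter.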

 

Modeltest for MrBayes –> Use MrModelTest2.3 (chooses between 24 models)

  1. Execute data file (Nexus) in PAUP
  2. Execute ‘mrmodelblock’ in PAUP to compare models. This makes an output file called ‘mrmodel.scores’
  3. Deposit the ‘mrmodel.scores’ file in the bin folder, and run the command mrmodel2 < inputfile.scores > outputfile.scores
  4. If successful, the command line will revert back to C:\bin and the outfile is written to the bin folder.

For Cox1_28May_aln.nex the AIC selected GTR +I +G as the appropriate model for Cox1 data

 

PhyML run 29 May

  • Used file ‘Cox1_29May.phy’ (sequential  file format, 105 taxa)
  • GTR + I +G model, estimated invariable sites, estimated gamma distribution, 4 nt substitution categories, base frequency estimate was empirical
  • Loglikelihood from stat file = -11260.10740

Running files in MrBayes:

  • Save as a sequential Nexus file (PAUP 4.0) from Mega
  • No need for alteration to MrBayes block—use the default one from the Barry Hall Book (chapter 8 folder).  For Cox1 I used the one with codon partitions.
  • Ran file ‘Cox1_28May_aln.nex’ using GTR +G +I with a codon partition model. Ran for 1,000,000 generations; the final average standard deviation of split frequencies was 0.042370.
  • burnin=1250 (should have increased this for 1,000,000 gen), samplefreq=100 (records tree/parameters every 100 gen), printfreq=1000 (prints to screen every 1000 gen), nchains=4, temp=0.2
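
A quick check of how much of the run the burn-in actually discarded. This is a rough sketch that assumes the burnin value is counted in sampled trees (not generations) and that the starting state at generation 0 is also recorded; the 25% figure is just a commonly used rule of thumb, not something taken from the run output.

```python
def mcmc_samples(ngen, samplefreq):
    """Samples recorded per run, assuming generation 0 is also sampled."""
    return ngen // samplefreq + 1


ngen, samplefreq, burnin = 1_000_000, 100, 1250

total = mcmc_samples(ngen, samplefreq)
discarded_fraction = burnin / total
burnin_25pct = total // 4  # rule-of-thumb alternative: discard the first quarter

print(f"Samples per run:      {total}")
print(f"Discarded by burnin:  {discarded_fraction:.1%}")
print(f"25% burn-in would be: {burnin_25pct} samples")
```

Under these assumptions, burnin=1250 throws away only about an eighth of the samples, which is why it should have been raised for a run of this length.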

 

Bayes_datafile_changes

Cox1 Duplicates eliminated

If duplicate (and bad) sequences are being eliminated from the Enoplid Cox1 data set, then the alignment should contain all the following sequences:

AUK 1, 13, 18, 23, 7
BCA 25, 26, 37, 40, 42
BUS 2, 3, 5
Cr 1, 17b, 24b, 3, 4, 59, 66, 68, 72b, 82b
DBA 1, 2
HCL 23, 8
JCC 37, 4, 52, 79
LCL 19, 7
NAR 1, 11, 14, 16, 2, 5, 6, 8, 9
NUS 10, 2, 21
OUS 1, 2, 3, 5, 6
PPA 1, 7
SBA 13, 14, 7
SUS 10, 15
TCR 197, 41, 70, 75, 78, 81, 87, 89, 95
WUS 1, 2, 5
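
The list above can be checked against an alignment's taxon names with a short script. This is a minimal sketch: it assumes taxon labels are written as site code, space, specimen number (e.g. ‘AUK 1’), which may not match the truncated names in the actual files, and the `check_alignment` helper and the example taxa are hypothetical.

```python
EXPECTED = """\
AUK 1, 13, 18, 23, 7
BCA 25, 26, 37, 40, 42
BUS 2, 3, 5
Cr 1, 17b, 24b, 3, 4, 59, 66, 68, 72b, 82b
DBA 1, 2
HCL 23, 8
JCC 37, 4, 52, 79
LCL 19, 7
NAR 1, 11, 14, 16, 2, 5, 6, 8, 9
NUS 10, 2, 21
OUS 1, 2, 3, 5, 6
PPA 1, 7
SBA 13, 14, 7
SUS 10, 15
TCR 197, 41, 70, 75, 78, 81, 87, 89, 95
WUS 1, 2, 5
"""


def expand(listing):
    """Turn 'AUK 1, 13' style lines into a set of names like 'AUK 1', 'AUK 13'."""
    names = set()
    for line in listing.strip().splitlines():
        prefix, numbers = line.split(maxsplit=1)
        for number in numbers.split(","):
            names.add(f"{prefix} {number.strip()}")
    return names


def check_alignment(taxa, expected):
    """Return (missing from alignment, unexpected in alignment), both sorted."""
    taxa = set(taxa)
    return sorted(expected - taxa), sorted(taxa - expected)


expected_names = expand(EXPECTED)
print(len(expected_names), "sequences expected")

# Hypothetical example: an alignment holding one good and one eliminated sequence.
missing, unexpected = check_alignment(["AUK 1", "BUS 4"], expected_names)
print("Unexpected:", unexpected)
```

In practice the taxa list would come from the alignment file itself; anything reported as unexpected is a duplicate or bad sequence that still needs removing.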

Cox1 Treebuilding and SSU Alignment Edits

Cox1 Tree building—23 May 09

Note: in previous Cox1 treebuilding (files from 4th-11th May), the correct Cox1 alignment to be used is named ‘Align_4May_nts_truncnames’.

Today I imported all the Cox1 sequences I could find from GenBank. The full list of species and accession numbers is given in the file ‘Cox1_Genbank_Acc_nos’.  The files I worked on are as follows:

  • Genbank_cox1nems.mas (protein sequence alignments—you CANNOT back translate this file to nts in MEGA for some reason)
  • Genbank_cox1nems_ntseq.mas (nucleotide sequence alignment—all sequences had to be re-downloaded from Genbank)
  • Cox1_GBplusEnop_23May.mas (combines all my Enoplid sequences with all the Genbank files I downloaded)
  • Cox1_GBplusEnop_23Maya.mas (removed the following bad sequences: BUS 4, JCC 59, LCL 3, OUS 14, TCR 69, and WUS 3)
  • Cox1_GBplusEnop_23Mayc.mas (added more Pellioditis sequences from Genbank, and also deleted more sequences as follows: B. ruftipennis, S. lupi, H. muscae, S. sp, R sp., P. sp., C. nassatus, T. skrjabini, G. binucleatum [2 seqs], T. native, C. briggsae, R. iyengari.  These sequences were either too short or looked too divergent to be fitted into the alignment.)
  • Cox1_GBplusEnop_23Mayd.mas (additionally removed the following taxa, which were a bit divergent: O. volvulus, D. immitis, B. malayi, R. lichstensfelsi, B. debrae)

I ran likelihood analyses in PhyML using the divergent and tight alignments (23Mayc and 23Mayd, respectively). I first aligned the translated protein sequences in each .mas file using CLUSTAL in MEGA (default parameters).  The output nucleotide alignments are saved as 23Mayc_DNAaln and 23Mayd_DNAaln.

Alignments had to be saved as sequential PHYLIP files to be run through PhyML. Altered parameters were as follows: HKY model, Ts/Tv ratio estimated, invariable sites estimated, 4 categories of nt substitution, sequential input sequences.
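
For reference, the sequential PHYLIP layout is a header line with the number of taxa and the alignment length, then one line per taxon with the name followed by the whole sequence. The sketch below is only an illustration of that layout, not the MEGA export; the `format_sequential_phylip` helper and the two toy sequences are hypothetical, and replacing spaces in names with underscores is my own assumption about what keeps the names unambiguous.

```python
def format_sequential_phylip(records):
    """Format (name, aligned_sequence) pairs as sequential PHYLIP text."""
    # Names must not contain whitespace, so replace spaces with underscores.
    records = [(name.replace(" ", "_"), seq) for name, seq in records]
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")

    lines = [f"{len(records)} {lengths.pop()}"]
    for name, seq in records:
        # Pad short names to 10 characters, then a space, then the full sequence.
        lines.append(f"{name:<10} {seq}")
    return "\n".join(lines) + "\n"


# Toy alignment for illustration only.
demo = format_sequential_phylip([("AUK 1", "ATGGCA"), ("BUS 2", "ATG-CA")])
print(demo, end="")
```

The length check catches the most common reason an export fails to load: one sequence that was not padded out to the full alignment length.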

 

Nematode groups edited in the ARB alignment on 22 May 08:

  • Trichura
  • Trichinella
  • Bathyeurystomina
  • Enchelidiidae
  • Pareurystomina
  • Thoracostomopsidae
  • Enoploides
  • Calyptronema
  • Oxystomina
  • Thalassoalaimus
  • Litinium

 The species Birgit used in her paper (Meldal 2007, A revised phylogeny of the phylum Nematoda) were annotated with the phrase ‘Meldal’ under the field ‘publication_doi’. I searched for all the accession numbers listed in the appendix (saved as ‘Meldal_appendix_acc_nos.pdf’). Two species weren’t in my ARB database: Trigulla aluta and Belondira apitica. However, although Birgit listed 214 species, I only have 211 marked in ARB; after subtracting the two not found, I must have missed one somewhere.
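
The bookkeeping can be double-checked with a set difference between the appendix accession numbers and the ones marked in ARB. This is only a sketch: the `unaccounted` helper is hypothetical, and the accession IDs in the example are placeholders, not real entries from the appendix.

```python
def unaccounted(listed, marked, known_absent):
    """Accessions in the appendix that are neither marked nor known to be absent."""
    return sorted(set(listed) - set(marked) - set(known_absent))


# Counts from the notes above.
listed_total, not_in_db, marked_total = 214, 2, 211
missing_count = listed_total - not_in_db - marked_total
print("Species still unaccounted for:", missing_count)

# Placeholder IDs for illustration only (not real accession numbers).
appendix = ["ACC001", "ACC002", "ACC003", "ACC004"]
marked_in_arb = ["ACC001", "ACC002"]
absent_from_db = ["ACC004"]
print("To look up again:", unaccounted(appendix, marked_in_arb, absent_from_db))
```

With the real lists exported from the appendix and from ARB, the function would name the one species that was missed.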