Parfrey Euk Markers – Accessions to GIs to Taxon IDs

Spent today wrestling with scripts to convert these damn accessions to GI numbers; Guillaume modified some of the BioPerl scripts that use the Bio::DB::Eutils modules. All are listed in Edhar under /home/hollybik/eukmarkers_taxonids/accs_from_aligns/

Run the following commands to convert accessions to GIs and then pull down the NCBI taxon IDs (the script prints text output as Accession GI NCBI_taxon, with a space between each field):

perl Accs_14_3_3 Taxon_gi_14_3_3
perl Accs_40S Taxon_gi_40S
perl Accs_Actin Taxon_gi_Actin
perl Accs_Atub Taxon_gi_Atub
perl Accs_Btub Taxon_gi_Btub
perl Accs_ef1aLike Taxon_gi_ef1aLike
perl Accs_ef2 Taxon_gi_ef2
perl Accs_enolase Taxon_gi_enolase
perl Accs_gamma Taxon_gi_gamma
perl Accs_grc5 Taxon_gi_grc5
perl Accs_hsp70 Taxon_gi_hsp70
perl Accs_hsp70cyt Taxon_gi_hsp70cyt
perl Accs_hsp70er Taxon_gi_hsp70er
perl Accs_Hsp90 Taxon_gi_Hsp90
perl Accs_metk Taxon_gi_metk
perl Accs_Rad51 Taxon_gi_Rad51
perl Accs_rps22 Taxon_gi_rps22
perl Accs_Rps23a Taxon_gi_Rps23a
perl Accs_TFIIH Taxon_gi_TFIIH
perl Accs_Tsec61 Taxon_gi_Tsec61
perl Accs_U5 Taxon_gi_U5
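Since the 21 commands above differ only in the marker name, they could be generated from a loop. What follows is a rough sketch rather than what was actually run: the marker names are copied from the list above, while the MARKERS variable, the run_markers function and the RUN switch are names made up here for illustration. By default it only prints the commands so they can be checked before anything is executed.

```shell
#!/bin/sh
# Marker names copied from the command list above.
MARKERS="14_3_3 40S Actin Atub Btub ef1aLike ef2 enolase gamma grc5 hsp70 hsp70cyt hsp70er Hsp90 metk Rad51 rps22 Rps23a TFIIH Tsec61 U5"

# Print one command per marker; with RUN=1 in the environment, execute it instead.
# RUN is a hypothetical switch introduced for this sketch only.
run_markers() {
    for m in $MARKERS; do
        if [ "${RUN:-0}" = "1" ]; then
            perl "Accs_$m" "Taxon_gi_$m"
        else
            echo "perl Accs_$m Taxon_gi_$m"
        fi
    done
}

# Dry run: list the commands without executing them.
run_markers
```

Setting RUN=1 before calling run_markers would execute each command in turn, in the same form as the lines listed above.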


Taxon IDs for Parfrey Eukaryotic Markers

Getting down to work this morning. First order of business was to grep the FASTA headers from the original Parfrey alignment files, using the following commands:

grep XXX 14-3-3.fasta >headers_14_3_3
grep XXX 40S.fasta >headers_40S
grep XXX Actin_noOuts.fasta >headers_Actin
grep XXX Atub_noOuts.fasta >headers_Atub
grep XXX Btub_noOuts.fasta >headers_Btub
grep XXX ef1aLike.fasta >headers_ef1aLike
grep XXX ef2_noOuts.fasta >headers_ef2
grep XXX enolase.fasta >headers_enolase
grep XXX gamma_noOuts.fasta >headers_gamma
grep XXX grc5.fasta >headers_grc5
grep XXX hsp70cyt.fasta >headers_hsp70cyt
grep XXX hsp70er.fasta >headers_hsp70er
grep XXX hsp70.fasta >headers_hsp70
grep XXX Hsp90.fasta >headers_Hsp90
grep XXX metk_noOuts.fasta >headers_metk
grep XXX Rad51_noOuts.fasta >headers_Rad51
grep XXX rps22.fasta >headers_rps22
grep XXX Rps23a_noOuts.fasta >headers_Rps23a
grep XXX SSU_noOuts.fasta >headers_SSU
grep XXX TFIIH.fasta >headers_TFIIH
grep XXX Tsec61.fasta >headers_Tsec61
grep XXX U5.fasta >headers_U5
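The grep commands above follow one naming rule: drop the .fasta extension and any _noOuts suffix, turn hyphens into underscores, and prefix the result with headers_. A minimal sketch of that rule as a loop is given below, assuming the alignment files sit in the current directory. The function names header_name and extract_headers are invented for this sketch, and the search pattern is passed in as an argument and used exactly as given.

```shell
#!/bin/sh
# Derive the output file name from an alignment file name,
# e.g. 14-3-3.fasta -> headers_14_3_3, Actin_noOuts.fasta -> headers_Actin.
header_name() {
    base=$(basename "$1" .fasta)   # drop directory and .fasta extension
    base=${base%_noOuts}           # drop the _noOuts suffix if present
    printf 'headers_%s\n' "$(printf '%s' "$base" | sed 's/-/_/g')"
}

# Grep the given pattern out of each file into its headers_* file
# (written to the current directory).
# Usage: extract_headers PATTERN file1.fasta [file2.fasta ...]
extract_headers() {
    pattern=$1
    shift
    for f in "$@"; do
        grep -e "$pattern" "$f" > "$(header_name "$f")"
    done
}
```

Calling extract_headers with the same pattern as in the commands above, followed by *.fasta, should produce the same headers_* files in one pass, provided every alignment file follows the naming rule.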