Rerunning GSL guppy analyses

Guppy labels its outputs using the input file names, and every PhyloSift run writes the same one (concat.updated.1.jplace), so I copied the files to new names to make sure the PCAs have informative labels:

hollybik@edhar:~/GSL_analyses/concat_files$ cp ~/phylosift_v1.0.0_01/PS_temp/2051774008_NA_RozPt_all_assembled.fna.gz/treeDir/concat.updated.1.jplace GSL_concat_NA_RozPt_allassembled.jplace
hollybik@edhar:~/GSL_analyses/concat_files$ cp ~/phylosift_v1.0.0_01/PS_temp/2058419001_SA_AntIs_all_assembled.fna.gz/treeDir/concat.updated.1.jplace GSL_concat_SA_AntIs_allassembled.jplace
hollybik@edhar:~/GSL_analyses/concat_files$ cp ~/phylosift_v1.0.0_01/PS_temp/2058419003_NA_Stram_all_assembled.fna.gz/treeDir/concat.updated.1.jplace GSL_concat_NA_Stram_allassembled.jplace
hollybik@edhar:~/GSL_analyses/concat_files$ cp ~/phylosift_v1.0.0_01/PS_temp/2077657010_SA_Stram_all_assembled.fna.gz/treeDir/concat.updated.1.jplace GSL_concat_SA_Stram_allassembled.jplace
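The four copies above follow one naming pattern, so they could be scripted. This is only a sketch: the jplace_label helper and the PS_TEMP override are names I made up for illustration, and the glob assumes every *_all_assembled.fna.gz directory under PS_temp is a GSL sample (it would also pick up any others that match).

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn a PS_temp directory name into the label used above,
# e.g. 2051774008_NA_RozPt_all_assembled.fna.gz -> GSL_concat_NA_RozPt_allassembled.jplace
jplace_label() {
    local name="${1%.fna.gz}"                    # drop the .fna.gz suffix
    name="${name#*_}"                            # drop the leading numeric ID and its underscore
    name="${name/all_assembled/allassembled}"    # match the naming used in the cp commands
    printf 'GSL_concat_%s.jplace\n' "$name"
}

# Assumed PhyloSift temp directory; override with PS_TEMP if it lives elsewhere.
ps_temp="${PS_TEMP:-$HOME/phylosift_v1.0.0_01/PS_temp}"

# Copy each placement file into the current directory under its new label.
for dir in "$ps_temp"/*_all_assembled.fna.gz; do
    src="$dir/treeDir/concat.updated.1.jplace"
    [ -f "$src" ] || continue                    # skip runs with no placement file
    cp "$src" "./$(jplace_label "$(basename "$dir")")"
done
```

Run it from ~/GSL_analyses/concat_files so the copies land in the same place as before.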

And then reran all the guppy analyses:

./guppy pca --out-dir ~/GSL_analyses/guppy/ ~/GSL_analyses/concat_files/GSL_concat_NA_RozPt_allassembled.jplace ~/GSL_analyses/concat_files/GSL_concat_NA_Stram_allassembled.jplace ~/GSL_analyses/concat_files/GSL_concat_SA_AntIs_allassembled.jplace ~/GSL_analyses/concat_files/GSL_concat_SA_Stram_allassembled.jplace --prefix GSL_concat_guppyPCA

 

./guppy squash --out-dir ~/GSL_analyses/guppy/ ~/GSL_analyses/concat_files/GSL_concat_NA_RozPt_allassembled.jplace ~/GSL_analyses/concat_files/GSL_concat_NA_Stram_allassembled.jplace ~/GSL_analyses/concat_files/GSL_concat_SA_AntIs_allassembled.jplace ~/GSL_analyses/concat_files/GSL_concat_SA_Stram_allassembled.jplace --prefix GSL_concat_guppySquash

 

./guppy kr --out-dir ~/GSL_analyses/guppy/ ~/GSL_analyses/concat_files/GSL_concat_NA_RozPt_allassembled.jplace ~/GSL_analyses/concat_files/GSL_concat_NA_Stram_allassembled.jplace ~/GSL_analyses/concat_files/GSL_concat_SA_AntIs_allassembled.jplace ~/GSL_analyses/concat_files/GSL_concat_SA_Stram_allassembled.jplace -o GSL_concat_guppyKRdist
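Since all three guppy commands take the same four inputs, the file list can be built once and reused. A minimal sketch, assuming the renamed files sit in ~/GSL_analyses/concat_files as above; the run wrapper and the DRY_RUN variable are my own additions (not part of guppy), and by default the script only prints the commands.

```bash
#!/usr/bin/env bash
concat_dir="$HOME/GSL_analyses/concat_files"
out_dir="$HOME/GSL_analyses/guppy/"
samples=(NA_RozPt NA_Stram SA_AntIs SA_Stram)

# Build the shared list of .jplace inputs from the sample labels.
inputs=()
for s in "${samples[@]}"; do
    inputs+=("$concat_dir/GSL_concat_${s}_allassembled.jplace")
done

# Hypothetical wrapper: print the command unless DRY_RUN=0, in which case run it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

run ./guppy pca    --out-dir "$out_dir" "${inputs[@]}" --prefix GSL_concat_guppyPCA
run ./guppy squash --out-dir "$out_dir" "${inputs[@]}" --prefix GSL_concat_guppySquash
run ./guppy kr     --out-dir "$out_dir" "${inputs[@]}" -o GSL_concat_guppyKRdist
```

Check the printed commands first, then rerun with DRY_RUN=0 from the directory that holds the guppy binary.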


Simulating Euk Reads + GSL analyses

Using the ART simulator (art_illumina) to simulate some more data from Euk genomes. Commands for paired-end (PE) reads:

./art_illumina -i /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/C_elegans_genome.fasta -p -l 100 -f 2 -m 200 -s 10 -o /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/artsim_euks/artsim_2x

./art_illumina -i /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/C_elegans_mito_genome.fasta -p -l 100 -f 2 -m 200 -s 10 -o /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/artsim_euks/artsim_2x_C_elegans_mt_genome_sim

./art_illumina -i /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/Saccharomyces_cerevisiae_genome.fasta -p -l 100 -f 2 -m 200 -s 10 -o /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/artsim_euks/artsim_2x_Saccharomyces_genome_sim

./art_illumina -i /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/Saccharomyces_cerevisiae_mt_genome.fasta -p -l 100 -f 2 -m 200 -s 10 -o /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/artsim_euks/artsim_2x_Saccharomyces_mt_genome_sim

  • -p is for paired end simulation
  • -l is for read length
  • -f is for fold coverage
  • -m is the mean size of DNA fragments for PE simulations
  • -s is the standard deviation of DNA fragment size for PE simulations
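A rough sanity check on what -f and -l imply for the number of read pairs. This assumes fold coverage means total simulated bases divided by genome length (the usual definition; I have not checked how ART rounds internally). The read_pairs function is a made-up helper, and the 100000000 bp genome length is a round placeholder, not the real size of any of the genomes above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: approximate number of read pairs for a paired-end run.
# pairs = fold * genome_length / (2 * read_length), since each pair contributes two reads.
# Integer arithmetic, so the result is rounded down.
read_pairs() {
    local fold="$1" genome_len="$2" read_len="$3"
    echo $(( fold * genome_len / (2 * read_len) ))
}

# Placeholder genome length (illustrative only), with -f 2 and -l 100 as in the commands above.
read_pairs 2 100000000 100
```

With those placeholder numbers the estimate is 1000000 pairs, i.e. 2000000 reads of 100 bp.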

Running artsim data on phylosift_v1.0.0_01

./phylosift all --paired /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/artsim_euks/artsim_2x_C_elegans_genome_sim1.fq /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/artsim_euks/artsim_2x_C_elegans_genome_sim2.fq --debug

./phylosift all --paired /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/artsim_euks/artsim_2x_Saccharomyces_genome_sim1.fq /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/artsim_euks/artsim_2x_Saccharomyces_genome_sim2.fq --debug

./phylosift all --paired /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/artsim_euks/artsim_2x_Saccharomyces_mt_genome_sim1.fq /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/artsim_euks/artsim_2x_Saccharomyces_mt_genome_sim2.fq --debug

./phylosift all --paired /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/artsim_euks/artsim_2x_C_elegans_mt_genome_sim1.fq /Users/hollybik/Dropbox/UC\ Davis\ Projects/PhyloSift/Euk\ NCBI\ Test\ data/artsim_euks/artsim_2x_C_elegans_mt_genome_sim2.fq --debug
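The four PhyloSift runs differ only in which pair of simulated read files they take, so they could be driven by a loop that pairs each ...sim1.fq with its ...sim2.fq. A sketch only: mate_of and SIM_DIR are names invented here, and the default directory assumes $HOME is /Users/hollybik as in the paths above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: given the first-mate file (..._sim1.fq),
# return the matching second-mate file (..._sim2.fq).
mate_of() {
    printf '%s\n' "${1%1.fq}2.fq"
}

# Assumed location of the simulated reads; override with SIM_DIR.
sim_dir="${SIM_DIR:-$HOME/Dropbox/UC Davis Projects/PhyloSift/Euk NCBI Test data/artsim_euks}"

for r1 in "$sim_dir"/*_sim1.fq; do
    [ -f "$r1" ] || continue          # nothing matched, or not a regular file
    r2="$(mate_of "$r1")"
    [ -f "$r2" ] || { echo "missing mate for $r1" >&2; continue; }
    ./phylosift all --paired "$r1" "$r2" --debug
done
```

Quoting "$r1" and "$r2" takes care of the spaces in the Dropbox path, so no backslash escapes are needed.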

GSL Data – running Guppy analyses on the finished PhyloSift runs

./guppy pca --out-dir ~/GSL_analyses/guppy/ ~/phylosift_v1.0.0_01/PS_temp/2051774008_NA_RozPt_all_assembled.fna.gz/treeDir/concat.updated.1.jplace ~/phylosift_v1.0.0_01/PS_temp/2058419001_SA_AntIs_all_assembled.fna.gz/treeDir/concat.updated.1.jplace ~/phylosift_v1.0.0_01/PS_temp/2058419003_NA_Stram_all_assembled.fna.gz/treeDir/concat.updated.1.jplace ~/phylosift_v1.0.0_01/PS_temp/2077657010_SA_Stram_all_assembled.fna.gz/treeDir/concat.updated.1.jplace --prefix GSL_concat_guppyPCA

./guppy squash --out-dir ~/GSL_analyses/guppy/ ~/phylosift_v1.0.0_01/PS_temp/2051774008_NA_RozPt_all_assembled.fna.gz/treeDir/concat.updated.1.jplace ~/phylosift_v1.0.0_01/PS_temp/2058419001_SA_AntIs_all_assembled.fna.gz/treeDir/concat.updated.1.jplace ~/phylosift_v1.0.0_01/PS_temp/2058419003_NA_Stram_all_assembled.fna.gz/treeDir/concat.updated.1.jplace ~/phylosift_v1.0.0_01/PS_temp/2077657010_SA_Stram_all_assembled.fna.gz/treeDir/concat.updated.1.jplace --prefix GSL_concat_guppySquash

./guppy kr --out-dir ~/GSL_analyses/guppy/ ~/phylosift_v1.0.0_01/PS_temp/2051774008_NA_RozPt_all_assembled.fna.gz/treeDir/concat.updated.1.jplace ~/phylosift_v1.0.0_01/PS_temp/2058419001_SA_AntIs_all_assembled.fna.gz/treeDir/concat.updated.1.jplace ~/phylosift_v1.0.0_01/PS_temp/2058419003_NA_Stram_all_assembled.fna.gz/treeDir/concat.updated.1.jplace ~/phylosift_v1.0.0_01/PS_temp/2077657010_SA_Stram_all_assembled.fna.gz/treeDir/concat.updated.1.jplace -o GSL_concat_guppyKRdist

PhyloSift analysis of Deepsea OTUs

Prepping for lab meeting tomorrow, so looking at the results of the PhyloSift runs for the Deepsea OTU data.

Edge PCA (produces an .xml tree file):

./guppy pca --out-dir ~/phylosift_v1.0.0_01/ ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_ShallowCalif.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_ShallowGulf.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Atlantic22.1.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Atlantic25.2.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Atlantic29.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Atlantic43.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Atlantic45.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Pacific128.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Pacific237.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Pacific321.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Pacific422.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Pacific528.fna/treeDir/18s_reps.1.jplace --prefix guppyDS

Squash clustering:

./guppy squash --out-dir ~/phylosift_v1.0.0_01/ ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_ShallowCalif.fna/treeDir/ShallowCalif_18Sreps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_ShallowGulf.fna/treeDir/ShallowGulf_18Sreps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Atlantic22.1.fna/treeDir/Atlantic22.1_18Sreps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Atlantic25.2.fna/treeDir/Atlantic25.2_18Sreps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Atlantic29.fna/treeDir/Atlantic29_18Sreps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Atlantic43.fna/treeDir/Atlantic43_18Sreps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Atlantic45.fna/treeDir/Atlantic45_18Sreps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Pacific128.fna/treeDir/Pacific128_18Sreps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Pacific237.fna/treeDir/Pacific237_18Sreps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Pacific321.fna/treeDir/Pacific321_18Sreps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Pacific422.fna/treeDir/Pacific422_18Sreps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Pacific528.fna/treeDir/Pacific528_18Sreps.1.jplace --prefix guppyDeepsea_squash

Kantorovich-Rubinstein Distance:

~/phylosift_v1.0.0_01/bin$ ./guppy kr --out-dir ~/phylosift_v1.0.0_01/ ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_ShallowCalif.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_ShallowGulf.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Atlantic22.1.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Atlantic25.2.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Atlantic29.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Atlantic43.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Atlantic45.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Pacific128.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Pacific237.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Pacific321.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Pacific422.fna/treeDir/18s_reps.1.jplace ~/phylosift_v1.0.0_01/PS_temp/QiimeSplit_F04_Pacific528.fna/treeDir/18s_reps.1.jplace -o guppy_Deepsea_KRdistance