Metagenomes (xeno, gom_fungi) and rerunning Open Ref OTUs

Xeno HiSeq data – Talked with David, still trying to figure out if we have xeno in there (David will run the raw reads through PhyloSift). What I’m doing:

  • Running Xeno data through MG-RAST to get an initial overview of the shotgun data.
  • Running Xeno data through QIIME (prefiltering, reference-based picking only at 60%) to pull out any rRNA reads that might be in there. Hopefully this gives us a better picture of the microbial community. Command run:
!pick_closed_reference_otus.py -i /Users/hollybik/Desktop/Data/metagenomes/HB_RN_March2013_XENO_unzip.fasta -r /macqiime/silva_111/rep_set/Silva_111_full_unique.fasta -o /Users/hollybik/Desktop/Data/metagenomes/xeno_qiime60prefilter -p /Users/hollybik/Dropbox/QIIME/qiime_parameters_filterMGforrRNA.txt --parallel -O 2
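The parameters file passed with -p isn't shown above; for the record, a minimal sketch of what it might contain, assuming the 60% threshold is set through the pick_otus similarity option (QIIME parameters files use one script_name:parameter value pair per line):

```
# Hypothetical contents of qiime_parameters_filterMGforrRNA.txt (assumption,
# not the actual file): relax similarity to 60% to catch divergent rRNA reads
pick_otus:similarity 0.60
pick_otus:enable_rev_strand_match True
```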

Also uploaded GOM_Fungi data to MG-RAST to get an idea of what’s in the sample – data is processing through the pipeline now.

Made some final tweaks to the open reference OTU picking protocol on StarCluster. After changing the SC script path in qiime_config, this should hopefully be the final command and it should run to completion:

!pick_open_reference_otus.py -i /gom_data/GOM_concat1.7_rev_demulti_1to12_2.fna -o /gom_data/uclust_openref96_ref_22Aug -r /gom_data/99_Silva_111_rep_set_euk.fasta --parallel -O 8 -s 0.1 -p /gom_data/qiime_parameters_18Sopenref96_GOMamazon.txt --prefilter_percent_id 0.0 -f
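Since earlier runs were dying partway through, a quick way to confirm a run like this actually finished is to check for the final BIOM table in the output directory. A sketch, assuming the open-reference workflow's final table is named otu_table_mc2_w_tax.biom (the QIIME 1.x naming; adjust if your version differs):

```python
import os

def run_finished(outdir, expected="otu_table_mc2_w_tax.biom"):
    """Return True if the final OTU table exists in the output directory.

    `expected` is an assumption based on QIIME 1.x open-reference output
    naming; swap in whatever filename your workflow actually writes.
    """
    return os.path.isfile(os.path.join(outdir, expected))
```

For example, run_finished("/gom_data/uclust_openref96_ref_22Aug") after the job exits.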

QIIME on Starcluster – configure start_parallel_jobs_sc.py script

Some of my jobs weren’t finishing via the IPython Notebook running on StarCluster. After looking through the QIIME documentation, I realized I needed to change the main qiime_config file – apparently StarCluster uses a different method to distribute jobs in parallel, which requires a change to the script filepath.

The qiime_config file needs to look like this, pointing to start_parallel_jobs_sc.py (see the bottom of the QIIME AWS tutorial page):

cluster_jobs_fp /home/ubuntu/qiime_software/qiime-1.7.0-release/bin/start_parallel_jobs_sc.py
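As a sanity check that the edit took, qiime_config can be parsed directly; a sketch assuming the QIIME 1.x format of whitespace/tab-separated key–value lines:

```python
def parse_qiime_config(text):
    """Parse qiime_config-style text into a dict.

    Assumes each non-comment line is `key<whitespace>value`, as in
    QIIME 1.x config files.
    """
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 1)  # split on the first run of whitespace
        if len(fields) == 2:
            config[fields[0]] = fields[1]
    return config

sample = ("cluster_jobs_fp\t/home/ubuntu/qiime_software/"
          "qiime-1.7.0-release/bin/start_parallel_jobs_sc.py")
cfg = parse_qiime_config(sample)
assert cfg["cluster_jobs_fp"].endswith("start_parallel_jobs_sc.py")
```

If cluster_jobs_fp still points at the default start_parallel_jobs.py, the parallel jobs will keep silently failing on StarCluster.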