Converting .biom files and replacing metadata

There’s no “remove metadata” command for the .biom files, so what I’ve been doing is converting to Classic OTU tables (and pulling down taxonomy information in this conversion), and then converting *back* to .biom to add the new set of metadata:

convert_biom.py -i /Users/hollybik/Dropbox/Projects/Visualization/Test\ data\ from\ Greg/biom_for_holly/global-gut/study_850_closed_reference_otu_table_w_sample_md.biom -o /Users/hollybik/Dropbox/Projects/Visualization/Test\ data\ from\ Greg/biom_for_holly/global-gut/study_850_closed_reference_otu_table_classic.txt --biom_to_classic_table --header_key taxonomy

Then I edited the sample mapping file to condense the metadata to only the columns relevant for visualization, and re-converted the classic OTU table to .biom using the new condensed mapping file:

convert_biom.py -i /Users/hollybik/Dropbox/Projects/Visualization/Test\ data\ from\ Greg/biom_for_holly/global-gut/study_850_closed_reference_otu_table_classic.txt -o /Users/hollybik/Dropbox/Projects/Visualization/Test\ data\ from\ Greg/biom_for_holly/global-gut/study_850_closed_reference_otu_table_condensed_metadata.biom -m /Users/hollybik/Dropbox/Projects/Visualization/Test\ data\ from\ Greg/biom_for_holly/global-gut/study_850_mapping_file_condensed_metadata.txt --biom_table_type="otu table"

Also converting my published Deepsea data and GOM data into new .biom files

First, rename the final column, “Consensus Lineage”, to “taxonomy”. Next, run the command:

convert_biom.py -i /Users/hollybik/Dropbox/Projects/Visualization/my_test_data/Deepsea_OTUtable_uclust99_F04NF1.txt -o /Users/hollybik/Dropbox/Projects/Visualization/my_test_data/Deepsea_OTUtable_uclust99_F04NF1.biom --process_obs_metadata taxonomy -m /Users/hollybik/Dropbox/Projects/Visualization/my_test_data/QIIMEmappingfile_Deepsea_MASTER.txt --biom_table_type="otu table"
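The header rename that precedes this conversion can also be done from the shell rather than by hand. Below is a minimal sketch, assuming the classic OTU table is tab-delimited with a header line that starts with # and ends in “Consensus Lineage”. The function name rename_lineage_header and the file names in the usage line are made up for illustration.

```shell
# Hypothetical helper: copy a classic OTU table from $1 to $2, renaming a
# trailing "Consensus Lineage" header to "taxonomy".
# Only lines starting with '#' (the header/comment lines) are touched.
# [[:space:]]* also swallows a stray carriage return at the end of that line.
rename_lineage_header() {
    sed '/^#/ s/Consensus Lineage[[:space:]]*$/taxonomy/' "$1" > "$2"
}

# Example call (file names are placeholders):
# rename_lineage_header OTUtable_original.txt OTUtable_renamed.txt
```

The renamed file would then be the -i input to the convert_biom.py command above.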

However, there are a few problems with this conversion:

  1. Taxonomy strings are the old comma-delimited version. They aren’t separated by the new k__;p__;c__;o__;f__;g__;s__ delimiters
  2. I can’t figure out whether the sample names (OTU table column headers) got pulled through. It doesn’t look like they did; only the QIIME mapping file info got pasted in at the bottom. Maybe I’m wrong about this, since it was just based on a quick glance through the .biom table

So what I think I’ll do for these data:

  1. Re-run taxonomy assignment using the new SILVA database and the RDP classifier. This will put the taxonomy strings in the correct format
  2. Just construct a new .biom table from the OTU mapping file and taxonomy assignments, instead of trying to convert old classic OTU tables into .biom files.

Illumina GOM data – prepping for QIIME

Going through QIIME to figure out how you have to process Illumina FASTQ data. The process seems to take a lot of prep work, especially for paired-end files. I’m using the FastX toolkit on Merlot to generate the barcode files that QIIME needs as input for demultiplexing. The commands I ran (a whole lotta them) are as follows:

——————–
./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-1_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-1_1_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-1_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-1_1_barcode.txt

./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-1_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-1_2_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-1_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-1_2_barcode.txt

—————————-
./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-2_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-2_1_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-2_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-2_1_barcode.txt

./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-2_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-2_2_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-2_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-2_2_barcode.txt

—————————-
./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-3_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-3_1_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-3_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-3_1_barcode.txt

./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-3_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-3_2_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-3_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-3_2_barcode.txt

—————————-
./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-4_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-4_1_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-4_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-4_1_barcode.txt

./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-4_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-4_2_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-4_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-4_2_barcode.txt

—————————-
./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-5_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-5_1_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-5_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-5_1_barcode.txt

./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-5_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-5_2_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-5_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-5_2_barcode.txt

—————————-
./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-6_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-6_1_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-6_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-6_1_barcode.txt

./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-6_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-6_2_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-6_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-6_2_barcode.txt

—————————-
./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-7_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-7_1_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-7_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-7_1_barcode.txt

./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-7_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-7_2_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-7_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-7_2_barcode.txt

—————————-
./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-8_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-8_1_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-8_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-8_1_barcode.txt

./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-8_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-8_2_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-8_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-8_2_barcode.txt

—————————-
./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-9_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-9_1_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-9_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-9_1_barcode.txt

./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-9_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-9_2_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-9_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-9_2_barcode.txt

—————————-
./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-10_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-10_1_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-10_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-10_1_barcode.txt

./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-10_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-10_2_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-10_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-10_2_barcode.txt

—————————-
./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-11_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-11_1_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-11_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-11_1_barcode.txt

./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-11_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-11_2_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-11_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-11_2_barcode.txt

—————————-
./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-12_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-12_1_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-12_1_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-12_1_barcode.txt

./fastx_trimmer -Q33 -f 6 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-12_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-12_2_trimmed.txt

./fastx_trimmer -Q33 -f 1 -l 5 -i /home/hbik/Illumina_GOM/raw_data/1926-KO-12_2_sequence.txt -o /home/hbik/Illumina_GOM/fastx_trimmed_files/1926-KO-12_2_barcode.txt
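All of the commands above follow one pattern: for each of the 12 samples and each of the two read files, keep bases 6 onward as the trimmed read and bases 1 to 5 as the barcode. Here is a rough sketch of how the same list could be generated with a loop instead of typed out. The variable names RAW and OUT and the function name trim_commands are my own choices for this example, not anything FastX or QIIME requires. By default it only prints the commands so they can be checked first.

```shell
# Directories taken from the commands above
RAW=/home/hbik/Illumina_GOM/raw_data
OUT=/home/hbik/Illumina_GOM/fastx_trimmed_files

# Print one "trimmed" and one "barcode" command per input file
# (12 samples x 2 read files x 2 commands = 48 lines).
trim_commands() {
    for sample in 1 2 3 4 5 6 7 8 9 10 11 12; do
        for mate in 1 2; do
            prefix="1926-KO-${sample}_${mate}"
            # Keep bases 6 to the end of the read
            printf '%s\n' "./fastx_trimmer -Q33 -f 6 -i $RAW/${prefix}_sequence.txt -o $OUT/${prefix}_trimmed.txt"
            # Keep bases 1-5 only
            printf '%s\n' "./fastx_trimmer -Q33 -f 1 -l 5 -i $RAW/${prefix}_sequence.txt -o $OUT/${prefix}_barcode.txt"
        done
    done
}

# Dry run: just list the commands
trim_commands
```

Once the printed list looks right, running `trim_commands | sh` from the directory that contains fastx_trimmer would execute them in order. This works here because none of the paths contain spaces.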