Up and running on the Amazon Cloud

This morning I played around with EC2 instances and tried to mount S3 buckets as storage. I followed this s3fs tutorial (s3fs is an external program that allows S3 mounting) and managed to get the program and its dependencies installed, but for some reason I couldn’t mount any volumes in the end. (Side note: the program had dumb rules, like S3 buckets couldn’t have capitalized names…!) After a lot of frustration I decided to look further into alternative storage.

So I think I’ve finally sorted out the storage issues with Amazon Cloud: it turns out you can just increase the size of the default root volume for each instance (an auto-mounted EBS volume) to get more storage space, so there’s no need for S3 and all that bucket-mounting malarkey.
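A quick sanity check before downloading anything big is to confirm the instance actually sees the larger root volume. This is only a minimal sketch, assuming a Linux instance with the standard df and awk tools; the function name root_avail_gb is my own.

```shell
# Print the space available on the root filesystem, in whole GiB.
# df -Pk gives POSIX-format output in 1024-byte blocks; column 4 is "Available".
root_avail_gb() {
    df -Pk / | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

echo "Root volume has $(root_avail_gb) GiB free"
```

If the number is much smaller than the volume size chosen at launch, the filesystem probably hasn't picked up the extra space yet.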

Time to get started with QIIME – first focused on the microBEnet genomes.

Used wget to fetch David’s error-corrected THU files from Edhar

Convert to FASTA for QIIME

cat THU.r1.fastq.pp.ec.fastq | perl -e '$i=0;while(<>){if(/^\@/&&$i==0){s/^\@/\>/;print;}elsif($i==1){print;$i=-3}$i++;}' > THU.r1.fastq.pp.ec.converted.fasta

cat THU.r2.fastq.pp.ec.fastq | perl -e '$i=0;while(<>){if(/^\@/&&$i==0){s/^\@/\>/;print;}elsif($i==1){print;$i=-3}$i++;}' > THU.r2.fastq.pp.ec.converted.fasta
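For what it's worth, the same conversion can be sketched with awk, which I find easier to read than the Perl counter trick. This assumes every FASTQ record is exactly four lines (header, sequence, +, quality), which I haven't verified for every file; the function name fastq_to_fasta is just something I made up for the example.

```shell
# Convert a 4-line-per-record FASTQ file to FASTA on stdout.
# Usage: fastq_to_fasta THU.r1.fastq.pp.ec.fastq > THU.r1.fastq.pp.ec.converted.fasta
fastq_to_fasta() {
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }   # header line: swap the leading @ for >
         NR % 4 == 2 { print }                      # sequence line: keep as-is
        ' "$1"
}
```

Because it keys on line position rather than on a leading @, a quality line that happens to start with @ can't be mistaken for a header.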

Downloaded/unzipped greengenes via wget and ran closed-reference OTU picking:

pick_reference_otus_through_otu_table.py -i THU.r1.fastq.pp.ec.converted.fasta -r /home/ubuntu/microbenet_data/gg_otus_4feb2011/rep_set/gg_99_otus_4feb2011.fasta -o /home/ubuntu/microbenet_data/16S_gg_THU_ECreads_r1/ -t /home/ubuntu/microbenet_data/gg_otus_4feb2011/taxonomies/greengenes_tax.txt

.py -i THU.r2.fastq.pp.ec.converted.fasta -r /home/ubuntu/microbenet_data/gg_otus_4feb2011/rep_set/gg_99_otus_4feb2011.fasta -o /home/ubuntu/microbenet_data/16S_gg_THU_ECreads_r2/ -t /home/ubuntu/microbenet_data/gg_otus_4feb2011/taxonomies/
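Since the two runs differ only in the read label, they could be generated from one template. The sketch below assumes the r2 run takes the same reference and taxonomy files as the r1 command (my r2 line got cut off, so that part is an assumption). The helper name otu_cmd and the DATA and GG variables are mine. It only prints the commands, so nothing runs by accident.

```shell
DATA=/home/ubuntu/microbenet_data
GG=$DATA/gg_otus_4feb2011

# Print the closed-reference OTU picking command for one read set (r1 or r2).
otu_cmd() {
    echo pick_reference_otus_through_otu_table.py \
        -i "THU.$1.fastq.pp.ec.converted.fasta" \
        -r "$GG/rep_set/gg_99_otus_4feb2011.fasta" \
        -o "$DATA/16S_gg_THU_ECreads_$1/" \
        -t "$GG/taxonomies/greengenes_tax.txt"
}

# Dry run: show both commands; pipe the output to sh to actually execute them.
for r in r1 r2; do
    otu_cmd "$r"
done
```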


