Wednesday, March 2, 2011

Started downloading all bacterial sequences

Using sha:
3ba1cd6720cd59c6c5da9da18fcfaf3653b5cffd
I started downloading sequences for all bacterial species that have at least 7 completed genomes (50 species in total). The list is now in the repository.
Used command:
python SequenceDownload.py --file DownloadData.txt
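
For reference, the selection rule described above (keep only species with at least 7 completed genomes) could be sketched roughly as follows. This is not the code in SequenceDownload.py; the `(species, status)` record layout, the `"complete"` status label, and the function name are all assumptions made for illustration, and the toy data is invented.

```python
from collections import Counter

MIN_GENOMES = 7  # threshold stated in the post


def species_with_enough_genomes(records, min_genomes=MIN_GENOMES):
    """Return the sorted species names that have at least `min_genomes`
    completed genomes.

    `records` is an iterable of (species, status) pairs. This layout and the
    "complete" status label are hypothetical, not taken from the repository.
    """
    # Count only genomes marked as completed, per species.
    counts = Counter(
        species for species, status in records if status == "complete"
    )
    # Keep the species that reach the threshold.
    return sorted(name for name, n in counts.items() if n >= min_genomes)


# Invented toy data, only to exercise the function.
records = (
    [("Escherichia coli", "complete")] * 9
    + [("Bacillus subtilis", "complete")] * 3
    + [("Bacillus subtilis", "draft")] * 10
    + [("Helicobacter pylori", "complete")] * 7
)

selected = species_with_enough_genomes(records)
print(selected)
```

Draft genomes are ignored in the count, so a species with many drafts but few completed genomes is left out.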

Should only take ~30-60 minutes. This is where I will also start using the project tag "BacterialLinkage" to describe how I plan to look for the evolution of linkage across many bacterial species.

Edit: It actually takes about 4 hours to download, unzip, and aggregate all of the data.
