Thursday, November 19, 2009

Bacterial Genome Sequence Links

I keep meaning to put up these links so I can find them in the future. They point to NCBI's genome FTP site. Click on the genome you want and you get a directory listing with the same genome in a number of file formats. For the complete genomes, the FastA of the complete sequence is the .fna file. The other formats make it easier to get correctly formatted data for application X:

Bacterial Genomes FTP
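
Once a .fna file is on disk, reading it takes only a few lines. What follows is a minimal sketch, assuming the file is ordinary FastA (a ">" header line followed by one or more sequence lines); the function name read_fasta is my own and not part of any NCBI tool.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FastA-formatted lines.

    Assumes each record starts with a '>' header line and that the
    sequence may be wrapped over several lines. Blank lines are skipped.
    """
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Emit the final record once the input is exhausted.
    if header is not None:
        yield header, "".join(chunks)
```

To use it on a downloaded file, pass an open file handle, for example `for header, seq in read_fasta(open("genome.fna")): print(header, len(seq))`, where genome.fna is a placeholder name.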

For the draft assemblies, the data is at this link:
Whole Genome Shotgun FTP

Finding out which accession you want is a little irritating, though, and the best way is to search this list:
Whole Genome Shotgun HTML

The links supplied in this last list are less useful for downloading the data than the FTP link, since you would have to click a maddening number of times to get all the contigs from a particular whole genome shotgun sequencing project.
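
The FTP route can also be scripted, which avoids the clicking altogether. Here is a rough sketch using Python's standard ftplib. The function names, the directory path and the file suffix are all placeholders of mine, so check the actual directory listing for your project before relying on any of them.

```python
from ftplib import FTP
from pathlib import Path
from typing import List


def select_files(names: List[str], suffix: str) -> List[str]:
    """Return the names ending in `suffix`, sorted for a stable order."""
    return sorted(name for name in names if name.endswith(suffix))


def download_matching(ftp: FTP, remote_dir: str, suffix: str, dest: Path) -> List[Path]:
    """Download every file in `remote_dir` whose name ends with `suffix`.

    `ftp` is an already connected and logged-in ftplib.FTP object.
    `remote_dir` and `suffix` are assumptions supplied by the caller;
    nothing here knows how NCBI lays out its directories.
    """
    ftp.cwd(remote_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for name in select_files(ftp.nlst(), suffix):
        target = dest / name
        with open(target, "wb") as out:
            # retrbinary hands each received block to the callback.
            ftp.retrbinary(f"RETR {name}", out.write)
        saved.append(target)
    return saved
```

A typical call would connect with `FTP("ftp.ncbi.nlm.nih.gov")`, call `login()` with no arguments for anonymous access, and then run `download_matching(ftp, remote_dir, suffix, Path("contigs"))` with the directory and suffix you found by browsing the FTP listing for your accession.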

