Friday, July 24, 2009

Pile-ups

As I try to address the critiques of my original NIH postdoc fellowship application in my resubmission, I've started fleshing out ways in which our planned periplasmic donor DNA sequencing experiments will investigate the mechanism of DNA uptake.

I’m playing around with different figures that illustrate what I’ll do and what it might reveal. Here’s the basic idea:

(1) Incubate sheared chromosomal donor DNA with recipient competent cell preparations.
(2) Recover the DNA that is taken up into the periplasm.
(3) Obtain paired-end sequence data for periplasmic and input DNA libraries.
(4) Compare the abundance of different sequences between the periplasmic and input DNA libraries to calculate the periplasmic uptake efficiency for sequences across the genome.
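
To make step (4) concrete, here is a minimal sketch of one way the comparison could be computed. It assumes the two libraries have already been reduced to per-position counts over the same coordinates; the function name uptake_ratio and the pseudocount parameter are my own placeholders, not part of any existing pipeline. Each library is scaled by its own total so that differences in sequencing depth cancel out.

```python
def uptake_ratio(periplasm_counts, input_counts, pseudocount=1.0):
    """Per-position ratio of periplasmic to input abundance.

    Each library is converted to fractions of its own total first, so a
    ratio of 1.0 means "taken up in proportion to its input abundance".
    The pseudocount (an assumption of this sketch) keeps positions with
    zero input coverage from causing a division by zero.
    """
    if len(periplasm_counts) != len(input_counts):
        raise ValueError("both libraries must cover the same positions")
    n = len(input_counts)
    peri_total = sum(periplasm_counts) + pseudocount * n
    input_total = sum(input_counts) + pseudocount * n
    ratios = []
    for peri, inp in zip(periplasm_counts, input_counts):
        peri_frac = (peri + pseudocount) / peri_total
        input_frac = (inp + pseudocount) / input_total
        ratios.append(peri_frac / input_frac)
    return ratios


# Toy example: flat input, one position enriched in the periplasm.
print(uptake_ratio([5, 5, 25, 5], [10, 10, 10, 10], pseudocount=0))
```

A ratio above 1 marks positions over-represented in the periplasmic library relative to the input.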

What would these data look like?

Once mapped back to the genome, each paired-end read will define the span of an individual fragment from the sequenced pool.

Below, I illustrate what a few dozen of these spans would look like mapped to a short stretch of chromosome containing an uptake signal sequence in the center.

The first diagram shows what the input DNA pool would look like. The blue spans do not contain USS, while the red spans do.
The second diagram shows what a similar amount of sequencing of the periplasmic uptake DNA pool might look like. I assume that the presence of a USS motif is sufficient and necessary to strongly stimulate uptake (“all-or-nothing” model of uptake). Thus, spans containing USS would be much more abundant than spans that do not.
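
A toy simulation of the two pools might look something like the sketch below. Everything in it is an assumption made for illustration: fragments all have the same length, the USS is treated as a single position rather than a motif, and background_uptake is a made-up parameter for how often a fragment without USS gets taken up (1.0 mimics the input pool, a small value mimics the "all-or-nothing" model).

```python
import random


def contains_uss(start, end, uss_pos):
    """True if the half-open span [start, end) covers the USS position."""
    return start <= uss_pos < end


def simulate_pool(n_spans, region_len, frag_len, uss_pos,
                  background_uptake=0.01, rng=None):
    """Sample fragment spans, keeping USS-free spans only rarely.

    Fragment starts are drawn uniformly across the region. A span that
    contains the USS is always kept; any other span is kept with
    probability background_uptake.
    """
    rng = rng or random.Random()
    spans = []
    while len(spans) < n_spans:
        start = rng.randrange(0, region_len - frag_len + 1)
        end = start + frag_len
        keep_prob = 1.0 if contains_uss(start, end, uss_pos) else background_uptake
        if rng.random() < keep_prob:
            spans.append((start, end))
    return spans


rng = random.Random(0)
# Input pool: no selection, so USS-containing spans are a minority.
input_pool = simulate_pool(40, 2000, 500, 1000, background_uptake=1.0, rng=rng)
# Periplasmic pool: strong selection for spans containing the USS.
periplasm_pool = simulate_pool(40, 2000, 500, 1000, background_uptake=0.01, rng=rng)

for name, pool in (("input", input_pool), ("periplasm", periplasm_pool)):
    with_uss = sum(contains_uss(s, e, 1000) for s, e in pool)
    print(name, "spans with USS:", with_uss, "of", len(pool))
```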
These sequence data could be plotted in several ways:

(1) Spanning coverage at each genomic position: The input DNA is expected to have roughly equal spanning coverage of each genomic position, but spanning coverage in the periplasmic uptake library is expected to be higher closer to USS motifs. As the distance between a genomic position and USS increases, fewer spans will contain both. Peaks will indicate USS, and peak height will indicate the effectiveness of individual USS loci. I estimate that one lane of sequencing the input will provide ~2500X spanning coverage per nucleotide for 500 bp donor DNA fragments.
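
Spanning coverage is simple to compute once each fragment is reduced to a (start, end) pair. The sketch below is one way to do it, assuming half-open, 0-based coordinates and a hypothetical function name spanning_coverage. It marks +1 where a span begins and -1 where it ends, then takes a running sum, so the cost scales with the number of spans plus the region length rather than with their product.

```python
def spanning_coverage(spans, region_len):
    """Number of fragment spans covering each position in the region.

    spans are half-open (start, end) pairs with
    0 <= start < end <= region_len.
    """
    diff = [0] * (region_len + 1)
    for start, end in spans:
        diff[start] += 1   # coverage rises where a span begins
        diff[end] -= 1     # and falls just past its last base
    coverage = []
    running = 0
    for delta in diff[:region_len]:
        running += delta
        coverage.append(running)
    return coverage


# Two overlapping toy spans on a 6 bp region.
print(spanning_coverage([(0, 3), (2, 5)], 6))
```

As a sanity check, the coverage values should sum to the total length of all spans, so the mean coverage is the number of fragments times the fragment length divided by the region length.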

(2) End coverage at each genomic position: If the position of USS in a fragment is irrelevant to the uptake mechanism, then end coverage would have a different shape than spanning coverage around USS. Since any fragment containing USS will be effectively taken up, I expect a more sawtooth-shaped distribution of end reads at USS determined by the spanning fragment length. I estimate that one lane of sequencing the input will provide 100X end coverage per nucleotide for 500 bp donor DNA fragments.
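
End coverage can be tallied from the same (start, end) pairs. In this sketch I take "end coverage" to mean the bases actually sequenced at the two ends of each fragment; the read_len parameter and the function name end_coverage are assumptions of the example, and with read_len=1 it reduces to counting fragment endpoints. Coordinates are half-open and 0-based.

```python
def end_coverage(spans, region_len, read_len=1):
    """Coverage contributed by the reads at both ends of each fragment.

    Each (start, end) span contributes read_len bases from its left end
    and read_len bases from its right end. If the fragment is shorter
    than two reads, overlapping bases are counted once per fragment.
    Spans must satisfy 0 <= start < end <= region_len.
    """
    coverage = [0] * region_len
    for start, end in spans:
        left_read = range(start, min(start + read_len, end))
        right_read = range(max(end - read_len, start), end)
        for pos in set(left_read) | set(right_read):
            coverage[pos] += 1
    return coverage


# One 10 bp fragment with 3 bp reads: only its two ends are covered.
print(end_coverage([(0, 10)], 12, read_len=3))
```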

Below, I illustrate an idealized case of spanning and end coverage.
Different USS loci may behave in different fashions. Here is what two other scenarios might look like:

(1) If uptake is polarized by USS, such that the position of USS on a fragment is important, or if there were an uptake-blocking sequence near a USS, the distribution might be skewed:
(2) If a fragment’s uptake is equally efficient with one or two USS motifs (a USS interference model), then coverage around two nearby USS might look like this:
I wonder what other mechanistic details might be found in the data...
