Monday, July 27, 2009

Food-to-sex ratio?

In my recent experiments using radiolabeled USS-1 donor DNAs, I was impressed by just how well naturally competent H. influenzae will slurp up USS-containing DNA. I’d read about it, but observing it myself was really something. (It reminds me a bit of the first time I “saw” Mendelian segregation when I dissected my first yeast tetrads.)

The other thing that struck me, reading the antecedents of my uptake experiments, was that a large majority of taken-up DNA is simply degraded, with the nucleotide subunits used for DNA replication. In my experiments, while donor DNA remained intact in rec-2 cells, there was no intact donor DNA after an hour in wild-type cells; all the radiolabel was found in the chromosomal DNA.

Using linearized USS-containing plasmid donor DNA, Barouki and Smith (1985) nicely show that this chromosomal labeling is NOT dependent on recombination, as rec-1 mutant competent cells obtain similar levels of chromosomal labeling (rec-1 is the recA homolog in H. influenzae). Using restriction digestion, they also nicely show that in wild-type cells, only some of the donor DNA manages to recombine into the recipient chromosome.

The most important lanes in the above autoradiogram from Figure 3 of their paper are Lane E and Lane F, which show restriction-digested DNA after uptake in wild-type and rec-1. In Lane E (wild-type), the two arrowheads indicate the chromosomal restriction fragments indicative of transformation by the linear plasmid donor. In Lane F (rec-1), there is no appreciable transformation of the donor DNA into the chromosome (those restriction fragments are gone).

These results raise an interesting prospect: Could we interpret the amount of recombination-independent radiolabeling relative to the recombination-dependent radiolabeling as a “food-to-sex” ratio? Undeniably, DNA is taken up and used by competent cells, but it’s clearly used in two different ways: subunit recycling (food) and recombination (sex). By my eye, it seems like a lot more of the donor DNA, even in wild-type cells, is used for food than for sex.
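As a back-of-the-envelope sketch: if the rec-1 chromosomal labeling approximates the recombination-independent (“food”) signal, and the extra chromosomal labeling in wild-type approximates the recombination-dependent (“sex”) signal, the ratio could be computed like this (all counts below are hypothetical placeholders, not measurements):

```python
# A rough sketch of the "food-to-sex" ratio idea.
# All counts are hypothetical, not real scintillation data.

def food_to_sex_ratio(chromosomal_label_wt, chromosomal_label_rec1):
    """Estimate food-to-sex ratio from chromosomal radiolabel counts.

    chromosomal_label_wt:   counts in wild-type chromosomal DNA
                            (recombination + nucleotide recycling)
    chromosomal_label_rec1: counts in rec-1 chromosomal DNA
                            (nucleotide recycling only, since rec-1
                            cannot recombine donor DNA)
    """
    sex = chromosomal_label_wt - chromosomal_label_rec1  # recombination-dependent
    food = chromosomal_label_rec1                        # recombination-independent
    if sex <= 0:
        raise ValueError("no detectable recombination-dependent labeling")
    return food / sex

# Hypothetical counts: most label ends up as recycled nucleotides.
print(food_to_sex_ratio(10000, 9000))  # -> 9.0
```

Of course, in the Barouki and Smith data the wild-type and rec-1 totals were similar, and the “sex” signal was really in the specific restriction fragments, so this simple subtraction is at best an approximation.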

Of course the amount of taken-up DNA a cell could use for “sex” would be highly dependent on what DNA was taken up. In the case of the experiment above, if homologous portions of the donor plasmid are removed, then the “sex” fragments disappear (Lanes I and J for wt and rec-1, respectively). Furthermore, the length of homologous DNA fragments taken up by competent cells is likely to matter, due to degradation during translocation and in the cytosol.

In Maughan and Redfield (2009), they show extensive natural variation among H. influenzae strains in the amount of uptake and transformation that competent cells will undertake. Do strains that can take up DNA well but fail to transform have a high food to sex ratio?

The assay that Barouki and Smith use doesn’t seem like the best way to measure a food-to-sex ratio, since the “sex” signal is somewhat buried behind the “food” signal. I wonder if there’s an experimental scheme that would allow one to measure such a ratio more accurately. Is there a way I could feed one strain’s chromosomes to another recipient strain and figure out how much DNA incorporated into chromosomes is recombination-dependent and how much is recombination-independent?

Anyway, thinking about this has led me to having a slightly clearer idea about the food versus sex hypotheses for the maintenance of natural competence by natural selection. Things are rarely black-and-white, so perhaps both models have their merits, but it seems like it might be possible to experimentally measure how much naturally competent cells use DNA for food or sex. Would this help in understanding these arguments?

Friday, July 24, 2009


As I try to address the critiques of my original NIH postdoc fellowship application in my resubmission, I’ve started fleshing out ways in which our planned periplasmic donor DNA sequencing experiments will investigate the mechanism of DNA uptake.

I’m playing around with different figures that illustrate what I’ll do and what it might reveal. Here’s the basic idea:

(1) Incubate sheared chromosomal donor DNA with recipient competent cell preparations.
(2) Recover the DNA that is taken up into the periplasm.
(3) Obtain paired-end sequence data for periplasmic and input DNA libraries.
(4) Compare the abundance of different sequences between the periplasmic and input DNA libraries to calculate the periplasmic uptake efficiency for sequences across the genome.
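Step (4) boils down to comparing normalized abundances between the two libraries. Here's a minimal sketch with hypothetical fragment counts (the fragment names and numbers are made up for illustration):

```python
# Minimal sketch of step (4): per-fragment uptake efficiency as the
# ratio of normalized abundances in the periplasmic vs. input libraries.
# Fragment names and counts are hypothetical.

def uptake_efficiency(periplasmic_counts, input_counts):
    """Return {fragment: efficiency}, where efficiency is the fragment's
    periplasmic fraction divided by its input fraction (1.0 = taken up
    at the average rate; >1 = enriched in the periplasm)."""
    peri_total = sum(periplasmic_counts.values())
    input_total = sum(input_counts.values())
    eff = {}
    for frag, n_in in input_counts.items():
        n_peri = periplasmic_counts.get(frag, 0)
        eff[frag] = (n_peri / peri_total) / (n_in / input_total)
    return eff

input_lib = {"USS_fragment": 100, "no_USS_fragment": 100}
peri_lib = {"USS_fragment": 180, "no_USS_fragment": 20}
print(uptake_efficiency(peri_lib, input_lib))
# USS-containing fragments come out strongly enriched.
```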

What would this data look like?

Once mapped back to the genome, each paired-end read will define the span of an individual fragment from the sequenced pool.

Below, I illustrate what a few dozen of these spans would look like mapped to a short stretch of chromosome containing an uptake signal sequence in the center.

The first diagram shows what the input DNA pool would look like. The blue spans do not contain USS, while the red spans do.
The second diagram shows what a similar amount of sequencing of the periplasmic uptake DNA pool might look like. I assume that the presence of a USS motif is necessary and sufficient to strongly stimulate uptake (“all-or-nothing” model of uptake). Thus, spans containing USS would be much more abundant than spans that do not.
These sequence data could be plotted in several ways:

(1) Spanning coverage at each genomic position: The input DNA is expected to have roughly equal spanning coverage of each genomic position, but spanning coverage in the periplasmic uptake library is expected to be higher closer to USS motifs. As the distance between a genomic position and USS increases, fewer spans will contain both. Peaks will indicate USS, and peak height will indicate the effectiveness of individual USS loci. I estimate that one lane of sequencing the input will provide ~2500X spanning coverage per nucleotide for 500 bp donor DNA fragments.

(2) End coverage at each genomic position: If the position of USS in a fragment is irrelevant to the uptake mechanism, then a plot of end coverage would have a different shape than the spanning coverage around each USS. Since any fragment containing USS will be effectively taken up, I expect a more sawtooth-shaped distribution of end reads at USS, determined by the spanning fragment length. I estimate that one lane of sequencing the input will provide 100X end coverage per nucleotide for 500 bp donor DNA fragments.
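Both statistics are straightforward to compute from mapped spans. Here's a toy sketch with made-up coordinates (real mapping output would of course come from an aligner, not a hand-written list):

```python
# Toy sketch of the two coverage statistics from mapped paired-end spans.
# Each span is a (start, end) tuple in genome coordinates, end-exclusive.
# Coordinates are illustrative only.

def spanning_and_end_coverage(spans, genome_length):
    """Spanning coverage: how many fragments cover each position.
    End coverage: how many fragment ends fall at each position."""
    spanning = [0] * genome_length
    ends = [0] * genome_length
    for start, end in spans:
        for pos in range(start, end):
            spanning[pos] += 1
        ends[start] += 1      # left end of the fragment
        ends[end - 1] += 1    # right end of the fragment
    return spanning, ends

spans = [(0, 5), (2, 8), (3, 9)]
spanning, ends = spanning_and_end_coverage(spans, 10)
print(spanning)  # peaks where fragments overlap
print(ends)      # spikes at fragment boundaries
```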

Below, I illustrate an idealized case of spanning and end coverage.
Different USS loci may behave in different fashions. Here are what two other scenarios might look like:

(1) If uptake is polarized by USS, such that the position of USS on a fragment is important, or if there was an uptake blocking sequence nearby a USS, the distribution might be skewed:
(2) If a fragment’s uptake is equally efficient with one or two USS motifs (a USS interference model), then coverage around two nearby USS might look like this:
I wonder what other mechanistic details might be found in the data...

Friday, July 17, 2009

Dose Response

How hungry are competent cells for DNA? I know that about a billion cells will consume ~65% of 20 nanograms of tasty USS-1 fragment, but what if I offer the cells different amounts of USS-1?

To get a better hands-on feel for the DNA uptake process in wild-type and rec-2 mutant competent cells, I did a dose response experiment, where I incubated competent cells with different amounts of USS-1 DNA.

For this first experiment, I used 0.5 ml of competent cell cultures for each sample and did 6 different amounts of USS-1 DNA (12 samples total for wt and rec-2). I didn’t have enough radiolabeled fragment for all of my desired concentrations, so I mixed in some cold USS-1 DNA to make up the difference. I let the DNA and cells incubate for 30 mins, then I washed the cells several times and determined the total radioactive counts in the cell pellet and washes to determine the % uptake and total uptake.
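For reference, the % uptake arithmetic is just pellet counts over total recovered counts. A minimal sketch, with made-up scintillation counts:

```python
# Sketch of the % uptake calculation from scintillation counts.
# Counts below are placeholders, not real data.

def percent_uptake(pellet_counts, wash_counts):
    """Fraction of the offered label retained by the washed cell pellet,
    as a percentage of total recovered counts (pellet + all washes)."""
    total = pellet_counts + sum(wash_counts)
    return 100.0 * pellet_counts / total

print(percent_uptake(6500, [3000, 400, 100]))  # -> 65.0
```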

Here’s the results:

Total Uptake:

Percent uptake:

Interestingly, rec-2 does better at low concentrations of DNA than wild-type, but worse with high concentrations. The latter could be due to the periplasm getting too clogged with DNA, such that the outer membrane uptake machinery has to work too hard to get more DNA through, while in wild-type translocation of DNA frees up space in the periplasm. But the former (higher uptake in rec-2 at low DNA concentrations) doesn't really make much sense to me. Maybe not all free nucleotides created during degradation at the inner membrane remain in the cell, so that at low concentrations, rec-2 simply holds more label?

Sunday, July 12, 2009

Standing upon the shoulders of giants

It really is gratifying to have things work the way they're supposed to. Some kind of bug bit me on Saturday and I came in to see if the periplasmic DNA preparation reported by Kahn et al 1983 would work in my hands. And sure enough it did!

The experiment was much the same as before. I added radiolabeled USS-1 fragments to either wild-type or rec-2 competent cell preps, incubated for 5 minutes, and then either prepared total DNA or did the periplasmic extraction (TE/1.5M CsCl + phenol/acetone, 1:1).

Since wild-type cells will take up the fragment, but also incorporate labeled subunits from degradation of taken up DNA, I can tell if the periplasmic DNA prep managed to exclude chromosomal DNA. But first, I counted the radiolabel present in the different cellular fractions...

This time, about a quarter of the USS-1 fragment added was taken up within five minutes (wild-type: 26%; rec-2: 28%). I suspect these numbers are lower than the last time I did it, because my five minutes was really five minutes (whereas the first time, I think I was 2-3 minutes late).

The extraction: When I collected the aqueous phase, I also collected the organic phase, and the interface between the phases (which should contain the cells minus their outer membranes). I counted the radiolabel in these different fractions as before:
Wild-type cells had label both in the aqueous extract and in the interface containing the cells, while rec-2 had nearly all the label in the aqueous extract. The organic phase contained less than 1% of the rec-2 label.

But here's the important bit:
Lane 1: Input (1/3, or 4 ng)
Lane 2: Total DNA, wild-type + USS-1 for 5 min.
Lane 3: Total DNA, rec-2 +USS-1 for 5 min.
Lane 4: Peri DNA, wild-type + USS-1 for 5 min.
Lane 5: Peri DNA, rec-2 +USS-1 for 5 min.

The important point here is that in the total DNA extract of wild-type, both intact donor USS-1 and chromosomal labeling are evident, while in the peri-extract of wild-type, there is no chromosomal label.

This means that the extraction I did successfully purified periplasmic DNA over chromosomal DNA. Fabulous!

Now I need to scale this protocol up and get cleaner DNA (i.e. use RNase), so hopefully I can see this without using radiolabel. If I can really get clean periplasmic DNA with little or no chromosomal contamination, I will move on to doing the "real" experiment with donor DNA made up from sheared genomic DNA of another isolate.


Friday, July 10, 2009

Building a periplasm prep...

After my failed attempts at doing a large-scale periplasm prep right off the bat, I decided to spend this week going a bit more slowly. I repeated what others have already done successfully using radio-labeled DNA fragments as donors. This means that I can do smaller scale experiments and don't need particularly pure DNA.

And this time the experiments all worked. Here's what I did:

(1) I made competent cells of KW20 (RR722) and KW20 rec-2 (RR622). I confirmed that the wild-type strain transformed normally and the rec-2 strain not at all (or at least below my limit of detection). This confirmed that my competent cell preps were okay, and that the rec-2 strain seems to be correct.

(2) Following the DNA uptake assay protocol of Maughan and Redfield, 2009, I showed that USS-1 is taken up very well, but USS-R is only poorly taken up. To do this, I simply incubated ~12 ng of either radio-labeled USS-1 or USS-R with 0.5 ml of competent cells for 20 minutes, and then compared the radioactive counts in a washed cell pellet compared to the total counts:
Wild type and rec-2 both take up USS-1 well, but USS-R poorly, as expected. But rec-2 seems to take up USS-1 slightly better than wild type. This is also true in the next experiment. This may be significant but could also reflect slight differences in the competent cell prep of the two strains.

Possibly the coolest part of this for me was that I got numbers that were spot-on the former post-doc's numbers (found in her notebook) and older papers describing % uptake. That is: ~65% uptake for ~20 ng / ml of cells. This was very encouraging to me.

(3) I repeated the uptake assay described above using ~12 ng USS-1 donor DNA and incubated wild-type and rec-2 cells for either 5 or 60 minutes. This gave results similar to those shown above:
Most uptake was finished after only 5 mins, though additional incubation increased the level of uptake. The results for the 60 min incubation were nearly identical to those for the 20 min incubation, so I don't need to incubate for so long. The rec-2 strain again showed slightly more uptake at all time points.

After this, I took it a step further: I also extracted the DNA from the cell pellets and ran them out on a gel. I also included the input donor DNA as a control. I dried down the gel and exposed it to a phosphor screen. This is what the gel looked like:

Lane 1: Donor DNA (50% of input; ~6 ng)
Lane 2: Wild-type + USS-1 for 5 min. Total DNA.
Lane 3: Wild-type + USS-1 for 60 min. Total DNA.
Lane 4: rec-2 + USS-1 for 5 min. Total DNA.
Lane 5: rec-2 + USS-1 for 60 min. Total DNA.

Alright! That's exactly what I hoped for! (Well, not quantitatively between lanes: this was a sloppy first experiment.) The gel shows the natural competence phenotypes of the two strains, wild-type and rec-2.

Intact uptake DNA is the smaller (lower) band, while chromosomal DNA is the high molecular weight species. In wild type, donor DNA gets degraded and nucleotides can be incorporated into the chromosome over time. (Importantly, the labeling of the chromosome is NOT from transformation, but from incorporation of degraded nucleotides into the genome by DNA replication.) In rec-2, the radio-labeled donor DNA is trapped in the periplasm and isn't degraded. So there is no chromosomal labeling in this case.

This is effectively a repeat of an experiment from Barouki and Smith, 1985.

Next, I'll try exactly the same thing, but I'll also try the extraction from Kahn et al., 1983. If this successfully yields pure periplasmic DNA, then I expect that the extraction will not yield radiolabeled chromosome, even for the wild type sample. If that works, I can work on scaling the protocol up to do a real purification of uptake DNA.


Tuesday, July 7, 2009

Imagine it exists, and maybe it does!

Our proposed experiments involve capturing DNA molecules at the different stages of natural transformation. One of the technical challenges we face will be producing a library of DNA molecules that have been translocated into the cytosol. We have some schemes for how we’ll do the purification of donor DNA from the cytosol, but even assuming that this works wonderfully, we still need to turn these into double-stranded DNA. We can’t use a specific primer to the 3’ends of translocated ssDNAs, because (a) we don’t know the exact 3’ ends and (b) it will be a complex mixture.

What to do?

Until now the only thing that had occurred to me is to use random priming to convert cytosolic ssDNA into dsDNA (shown schematically above), but this approach has several limitations. The biggest problem is that we would only be able to accurately identify the 5’-end of translocated DNA: with random primers, the 3’-end of the final dsDNA would not represent the 3’-ends of the original ssDNA molecules. Furthermore, we would not know which end of our dsDNA was the original ssDNA’s 5’ or 3’ end. And finally, we would end up with a highly heterogeneous size distribution, which might complicate sequencing.

How can I circumvent this, get both ends, and know which is which? I need a strategy like RACE. I decided to imagine that a certain enzyme existed that might help me in this endeavor and then see if it actually existed and was already commercially available. This strategy has worked for me in the past: Once, I’d wanted to know if there were restriction enzymes that only nicked at their recognition sites, so I typed “nickase neb” into Google, and sure enough NEB carries nickases! Go biotechnologists!

This time, I want to tack some type of single-stranded adaptor sequence onto the 3’ ends of my putative cytosolic ssDNAs, so I typed “ssDNA ligase” into Google, and Presto!... Epicentre produces a single-stranded ligase that they call CircLigase. Sweet!

This doesn’t fully solve the problem, since the ligase will normally take an ssDNA and circularize it (since the intra-molecular ligation will usually be favored). This is useful to plenty of folks who are interested in doing rolling circle amplification and rolling circle transcription, but I would rather not circularize my ssDNAs; instead, I’d like to favor ligation of an ssDNA adaptor specifically to the 3’-end. This will require a couple of bells and whistles.

If we take our ssDNA and then treat it with a phosphatase, we can rid the 5’-end of its terminal phosphate and both block circular ligation, as well as ligation of our adaptor to the 5’end. If our adaptor oligonucleotide also has a protected 3’end (not sure how to do this... an oligo with a terminal dideoxy nucleotide?), then we’d block the ligation of the adaptors to each other and force ligation only in the orientation we want (5’ of the adaptor to 3’ of the target).

Then, using a primer complementary to the adaptor, we can convert full-length ssDNA into dsDNA. Furthermore, the adaptor marks the original 3’ end of the fragment, so we can give a polarity to our cytosolic fragments. Here’s the scheme:

Afterwards, of course, we’d need to either amplify this product or de-protect both ends, so that we could ligate sequencing adaptors to the mixture.

This plan just might work, and I could make sure it works using defined substrates, rather than precious (as well as non-existent) cytosolic DNA fractions. The main thing I can’t think of off the top of my head is getting a hold of an oligo with a protected 3’-end (preferably reversibly so).

UPDATE: Looks like at least some oligo companies can include dideoxy bases in oligos. Awesome. This is not quite as ideal as a reversible protection of the 3' end...

UPDATE 2: Uh oh. How to amplify the product? There's no primer sequence at one end... This could involve a phosphorylation step and a ligation of a normal adaptor to that end? That's an extra unfortunate step. Also, the adaptor sequences will eat into the sequence read length, but that should be acceptable, since we only need to get tag sequences.

Thursday, July 2, 2009

Mutation versus Transformation: Structural Variation

Here's an idea: Compare structural changes (indels and rearrangements) in transformed and untransformed cultures.

One major goal of our planned experiments is to measure the transformation rate across the genome. This is a fairly ambitious prospect, mainly because of the amount of sequencing it will require.

If we assume that the average donor allele transforms recipient chromosomes 1% of the time, then we would need to sequence the average locus 100 times to see the donor allele just one time. But to get a reliable measurement of transformation rate would require considerably more... perhaps 10,000 times. This would give us an average of 100 donor alleles / 10,000 alleles sequenced. Even using the Illumina platform would get fairly expensive to measure the transformation rate for every SNP.
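The depth arithmetic above can be sketched in a couple of lines (the 1% rate and the target of ~100 donor observations are the assumed values from the paragraph, not measurements):

```python
# Sketch of the sequencing-depth arithmetic: at per-allele transformation
# rate p, coverage N yields about N * p donor reads at a locus, so seeing
# ~k donor reads requires N = k / p. The inputs are assumptions from the
# text, not measured values.

def required_coverage(transformation_rate, donor_reads_wanted):
    """Coverage needed to expect `donor_reads_wanted` donor-allele reads."""
    return donor_reads_wanted / transformation_rate

print(required_coverage(0.01, 1))    # -> 100.0 (see the donor allele once)
print(required_coverage(0.01, 100))  # -> 10000.0 (a usable estimate)
```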

One way we could do a preliminary sequencing experiment would be to ignore single-nucleotide differences between donor and recipient and focus solely on indel and rearrangement differences (a.k.a. structural variation). This data would speak to interesting hypotheses regarding the role of natural competence in maintaining the core genome and diversifying the accessory genome, and it would require considerably less sequencing.

So the goal of the preliminary experiment would be to measure structural variation due to mutation versus structural variation due to transformation.

How would this work, and why would it be cheap and easy?

First of all, the experiment-side is extremely easy. Naturally competent recipient cultures would be split in two. One part would be used to prepare untransformed recipient chromosomes; the other part would be incubated with donor DNA for a while, allowed to recover, and then transformed chromosomes would be purified. (We could also include a selection for a donor marker at this point to increase the relative transformation rate.) That’s it.

The sequencing side would also be comparatively easy. Untransformed and transformed chromosome preparations would be sheared, end-repaired, and size-fractionated by gel to 500 bp (as precisely as possible). This DNA would be ready for sequencing library construction and paired-end sequencing.

For a 500 bp library, I previously estimated ~2500X “spanning coverage” in one lane of Illumina sequencing using conservative estimates of the sequencing parameters. “Spanning coverage” is defined as how many times a particular genomic position is found between two mapped paired-end reads. So we’d get to 10,000X spanning coverage in ~1/2 a full run.
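For what it's worth, the ~2500X figure can be roughly reconstructed under assumed parameters. The mapped-pairs-per-lane and genome-size numbers below are my stand-ins for illustration, not the parameters from the original estimate:

```python
# Rough reconstruction of the ~2500X spanning-coverage estimate.
# PAIRS_PER_LANE and GENOME_LENGTH are assumed values for illustration.

FRAGMENT_LENGTH = 500          # bp, size-selected library
GENOME_LENGTH = 1_830_000      # bp, approximate H. influenzae KW20 genome
PAIRS_PER_LANE = 9_150_000     # assumed mapped pairs per Illumina lane

# Each mapped pair spans ~FRAGMENT_LENGTH bp, so spanning coverage is
# total spanned bases divided by genome length.
spanning_coverage = PAIRS_PER_LANE * FRAGMENT_LENGTH / GENOME_LENGTH
print(round(spanning_coverage))  # -> 2500
```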

How would this help us measure the transformation of donor structural variants into the recipient?

Let’s use an example to illustrate. In the alignment shown above (using GenomeMatcher), the donor genome (86-028NP) is shown on top, while the recipient genome (KW20) is shown on bottom. As is pretty clear, genes Hi_0512 and Hi_0513 are absent from the donor genome, indicating a deletion of those genes from the donor (or possibly an insertion into the recipient). These genes happen to encode the HindII restriction enzyme and methylase. The flanking genes are syntenic (so Hi_0511 = tchA, etc.).

Since we know the size-distribution of the library (500 bp), it would be quite simple to spot the deletion allele. Paired-end reads with one end in tchA and the other in rpoC would define the deletion. By contrast the insertion allele would always have tchA and rpoC mappings on different fragments (with the other ends in the HindII methylase or restriction enzyme).

Here’s a way of illustrating what paired-end reads of different kinds of alleles relative to the recipient would look like:
So for our deletion, we’d see paired-ends that mapped to positions further apart than they should be. By having extremely high spanning coverage, we could count how often these kinds of mappings occurred versus the recipient mappings. This would give us the rate of deletion.
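A minimal sketch of this classification, with illustrative coordinates and thresholds (a real pipeline would work from aligner output, and the tolerance would come from the measured insert-size distribution):

```python
# Sketch of calling the deletion allele from mapped pair distances.
# A pair mapping ~500 bp apart matches the recipient; a pair mapping
# much further apart (by roughly the deletion size) supports the donor
# deletion allele. Thresholds and coordinates are illustrative.

EXPECTED_INSERT = 500   # bp, size-selected library
TOLERANCE = 100         # bp, assumed spread around the expected size

def classify_pair(left_pos, right_pos):
    """Classify one mapped read pair by its apparent insert size."""
    distance = right_pos - left_pos
    if abs(distance - EXPECTED_INSERT) <= TOLERANCE:
        return "recipient-like"
    if distance > EXPECTED_INSERT + TOLERANCE:
        return "deletion-supporting"
    return "other"

pairs = [(1000, 1480), (1010, 1520), (1005, 3495)]  # last spans the deletion
calls = [classify_pair(left, right) for left, right in pairs]
print(calls)
deletion_rate = calls.count("deletion-supporting") / len(calls)
print(deletion_rate)
```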

In our untransformed library, if we saw the deletion allele, we’d be seeing mutation, while in the transformed library we’d be seeing mutation and/or transformation. Since we know the donor genome sequence, we can distinguish paired-end reads that look like the donor sequence versus de novo mutations that occurred when we grew out the cells. Better still, we could spot transformation-induced de novo mutations by comparing to the untransformed library.

Why else do the untransformed chromosomes at all? Well, structural mutations are likely to occur at a much higher rate than single-nucleotide mutations in many instances. Independent of our interest in natural transformation, doing the control experiment may reveal regions of the genome that are unstable, along with the mutation rate of different types of structural variation. This last part is non-trivial. If, for example, we see a particular deletion that occurred on 50% of the untransformed chromosomes we looked at, it could be that this is an extremely frequent mutation, but it could also have simply occurred early in the grow-out of the culture.

As a control for transformation rates, we don’t have to worry about that, but doing the untransformed control would make us confident that changes we saw were induced by transformation and not simply due to such kinds of mutation.

Happy (Belated) Canada Day!

(Image: The Canadian-built robot arm attaching the space shuttle docked to the Hubble Space Telescope with the Earth in the background.)

Oh yeah, and my second attempt at preparing periplasm DNA was... inconclusive. But it was pretty interesting to try out. In particular, the TE/CsCl/phenol/acetone extraction was quite compelling visually, involving small bubbles breaking up and reforming. I need to start these experiments out at a smaller scale.

And luckily, we've received radiolabeled dATP, so I can do some more sensitive and controlled experiments next week. The use of radiolabeled uptake fragments will be significantly more sensitive and allow me to use small cultures and follow small amounts of uptake DNA.

Once I've got a functioning uptake assay, I can work out the best purification method and scale up from there.