data:image/s3,"s3://crabby-images/bfbd1/bfbd12edd356c3670fcfccb0708f79bd81871381" alt=""
Besides practice, the main motivation for doing this is that I am fairly sure we should be ordering degenerate oligos with more degeneracy than we have previously considered. I won't make that argument here; I'll just reproduce some analytical graphs I'd previously made.
It took a while (since I’m learning), but it was still much more straightforward than doing it in a spreadsheet. The exercise was extremely useful: I learned a lot (especially about plots in R) while working through the following:
Problem #1: Given a percentage of degeneracy per base, d, in an n length oligo, what is the proportion of oligos with k mismatches?
Answer #1: Use the binomial distribution. For a 32mer with different levels of degeneracy (shown in legend):
data:image/s3,"s3://crabby-images/5536d/5536da4a9cf8ca2740c1e3908dac2f7c0500135f" alt=""
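A minimal sketch of how such a plot could be produced in R, not necessarily the code behind the figure above. The oligo length n = 32 comes from the post; the degeneracy levels in `d_levels` are hypothetical stand-ins for the legend values (only 0.12 appears elsewhere in the post).

```r
# Oligo length, from the post.
n <- 32
# Per-base degeneracy levels: hypothetical, except 0.12 which the post uses later.
d_levels <- c(0.03, 0.06, 0.12, 0.24)
k <- 0:n

# One column per degeneracy level; rows are k = 0..n mismatches.
p_k <- sapply(d_levels, function(d) dbinom(k, size = n, prob = d))
colnames(p_k) <- paste0("d=", d_levels)
rownames(p_k) <- k

# Each column is a full probability distribution over k.
print(colSums(p_k))

# One curve per degeneracy level, with a legend.
matplot(k, p_k, type = "b", pch = 1, lty = 1, col = seq_along(d_levels),
        xlab = "mismatches (k)", ylab = "proportion of oligos")
legend("topright", legend = colnames(p_k),
       col = seq_along(d_levels), lty = 1, pch = 1)
```

`dbinom(k, size = n, prob = d)` treats each of the n positions as an independent trial that mismatches with probability d, which is the assumption behind using the binomial here.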
Answer #2: Simply adjust each of the above values by dividing by the number of classes within each of the k mismatches (i.e. choose(n, k)):
data:image/s3,"s3://crabby-images/bbc4d/bbc4d6b7d2f4bf91f40b228b1a9ed087a2736cbd" alt=""
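The adjustment in Answer #2 can be sketched as follows. I am assuming that a "class" means one particular set of k mismatched positions, and I use d = 0.12 only as an example value. Under that reading, the division has a simple closed form, which the last line checks.

```r
n <- 32    # oligo length, from the post
d <- 0.12  # example degeneracy level
k <- 0:n

# Proportion of oligos with exactly k mismatches (binomial).
p_k <- dbinom(k, size = n, prob = d)

# Divide by the number of ways to place k mismatches among n positions.
p_per_class <- p_k / choose(n, k)

# The same quantity in closed form: d^k * (1 - d)^(n - k).
closed_form <- d^k * (1 - d)^(n - k)
print(all.equal(p_per_class, closed_form))
```

Because the per-class values fall off very quickly with k, a log-scaled y axis (`log = "y"` in `plot`) may make them easier to read.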
Answer #3: Use the hypergeometric distribution. The below plot is as for Problem #1 for 0.12 degeneracy, but with the # of hits broken down for each k:
data:image/s3,"s3://crabby-images/e7727/e77278082397f2766667777b66c6731554eacf98" alt=""
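A rough sketch of the breakdown in Answer #3, assuming m = 8 important positions out of 32 (the value given in the comments below) and that a "hit" is a mismatch falling in one of those positions. The variable names are mine, and this is one way to get such a breakdown, not necessarily how the figure was made.

```r
n <- 32    # oligo length
d <- 0.12  # per-base degeneracy, from the post
m <- 8     # important positions; value taken from the comments
k <- 0:n   # total mismatches
j <- 0:m   # mismatches landing in important positions ("hits")

# joint[j, k] = P(k mismatches) * P(j hits | k mismatches).
# dhyper(x, m, n, k): x "white" draws, m white and n black in the urn, k draws.
joint <- outer(j, k, function(j, k) dbinom(k, n, d) * dhyper(j, m, n - m, k))
dimnames(joint) <- list(hits = j, mismatches = k)

# Fraction of oligos with at least one mismatch anywhere.
p_any <- 1 - sum(joint[, "0"])
# Fraction with at least one mismatch in an important position.
p_important <- 1 - sum(joint["0", ])
print(c(any = p_any, important = p_important))

# Stacked bars: each bar is P(k), split by the number of hits.
barplot(joint, xlab = "mismatches (k)", ylab = "proportion of oligos",
        legend.text = paste(j, "hits"))
```

With these values, `p_any` equals 1 - 0.88^32 (about 0.98) and `p_important` equals 1 - 0.88^8 (about 0.64), which is in line with the approximations discussed in the comments.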
In #1 and #2, is it possible to have R draw the Y axis going through zero? That would make values easier to estimate.
And in #3, what's the value of m? Am I right in thinking that the graph shows that 98% of oligos will have at least one mismatch to the consensus but only about 65% will have at least one of these in an important position?
Yeah. I figured out how to add lines using the "abline" function. For #2, I should probably be focused on only a part of the displayed graph too.
As for #3, m=8, so 1/4 of the 32 positions are presumed to be important. And your approximations are about right...
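For anyone wanting to try the `abline` approach mentioned above, here is a small sketch under my own assumptions: d = 0.12 as the example curve, an x range cut off at 12 mismatches, and guide lines every 0.05. None of these choices come from the original plots.

```r
n <- 32
d <- 0.12
k <- 0:n
p_k <- dbinom(k, size = n, prob = d)

# Show only the part of the x range where the curve is visibly non-zero.
plot(k, p_k, type = "b", xlim = c(0, 12),
     xlab = "mismatches (k)", ylab = "proportion of oligos")

# Horizontal reference line at y = 0, plus light guide lines for reading values.
abline(h = 0)
abline(h = seq(0.05, 0.25, by = 0.05), col = "grey", lty = 3)
```

`abline(h = ...)` draws horizontal lines at the given heights; `abline(v = ...)` does the same for vertical lines.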