The most exciting phrase to hear in science, the one that heralds new discoveries, is not 'Eureka!' but 'That's funny...' --Isaac Asimov

## Predicting the height of a saturated peak on an electropherogram

#### May 6th, 2009 by eric

One way to assess the microbial community structure in an environment is to use a ‘fingerprinting’ technique, like T-RFLP or ARISA, to interrogate the ‘species’ living there as determined from their 16S rRNA genes or some functional gene like amoA. Here’s an example of a T-RFLP electropherogram from sea ice:

You can see that most of the signal in this sample is contained within a few peaks. Sometimes those peaks saturate (max out, overblow) the detector, which is a problem if I want to compare peak heights (a controversial practice; I should note that I am only making bulk comparisons, not comparing individual peaks). Of course, I could just load less DNA and run the sample again, but then I would be liable to lose some of the smaller peaks (also, it’s not practical for me to re-run these specific samples). So I’ve written a script in the open-source statistical package R to estimate the heights of the saturated peaks by fitting a Gaussian function of the form

$$f(x) = y_0 + \dfrac{b\sqrt{2/\pi}}{d}\,e^{-2\left(\dfrac{x-x_0}{d}\right)^2}$$

where ‘y_0’ is the baseline (the y-minimum), ‘x_0’ is the center of the peak, ‘b’ is a scaling factor equal to the area under the peak above the baseline, and ‘d’ is a width parameter equal to twice the standard deviation of the distribution.
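As a rough illustration of the idea (this is not the R script, which isn't shown here), the sketch below fits the function above to only the unsaturated points of a clipped peak and then evaluates the fit at ‘x_0’ to estimate the true height. It is written in plain Python and swaps in a simpler technique than a general nonlinear fit: it assumes the baseline ‘y_0’ is already known, in which case ln(y - y_0) is a parabola in x and an ordinary least-squares quadratic fit recovers ‘x_0’, ‘b’, and ‘d’. The function names, the detector ceiling of 4000, and the synthetic peak are all made up for the example.

```python
import math

def gaussian(x, y0, x0, b, d):
    """Peak model from the text: baseline y0, center x0, area b, width d."""
    return y0 + (b * math.sqrt(2 / math.pi) / d) * math.exp(-2 * ((x - x0) / d) ** 2)

def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for i in range(3):
        p = max(range(i, 3), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(i + 1, 3):
            f = a[r][i] / a[i][i]
            for c in range(i, 4):
                a[r][c] -= f * a[i][c]
    x = [0.0] * 3
    for i in (2, 1, 0):
        x[i] = (a[i][3] - sum(a[i][j] * x[j] for j in range(i + 1, 3))) / a[i][i]
    return x

def fit_clipped_peak(xs, ys, y0, ceiling):
    """Estimate (x0, b, d) using only points strictly below the detector ceiling.

    Assumes the baseline y0 is known. Fits ln(y - y0) = c0 + c1*u + c2*u^2
    by least squares, with u = x - mean(x), then converts back.
    """
    pts = [(x, math.log(y - y0)) for x, y in zip(xs, ys) if y0 < y < ceiling]
    xm = sum(x for x, _ in pts) / len(pts)  # centering improves conditioning
    s = [sum((x - xm) ** k for x, _ in pts) for k in range(5)]
    t = [sum(z * (x - xm) ** k for x, z in pts) for k in range(3)]
    c0, c1, c2 = solve3(
        [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]], t
    )
    d = math.sqrt(-2 / c2)                       # c2 = -2 / d^2
    x0 = xm - c1 / (2 * c2)                      # vertex of the parabola
    height = math.exp(c0 - c1 ** 2 / (4 * c2))   # peak height above baseline
    b = height * d / math.sqrt(2 / math.pi)
    return x0, b, d

# Synthetic, noise-free peak (made-up values), clipped at a hypothetical ceiling.
truth = dict(y0=50.0, x0=100.0, b=30000.0, d=4.0)
ceiling = 4000.0
xs = [90 + 0.5 * i for i in range(41)]
ys = [min(gaussian(x, **truth), ceiling) for x in xs]

x0_fit, b_fit, d_fit = fit_clipped_peak(xs, ys, truth["y0"], ceiling)
est_peak = gaussian(x0_fit, truth["y0"], x0_fit, b_fit, d_fit)
print("observed max:", max(ys), "estimated true max:", est_peak)
```

With real, noisy traces the log transform amplifies noise in the tails, so a weighted fit or a proper nonlinear least-squares routine would likely behave better; this version is only meant to show why the unsaturated flanks carry enough information to recover the clipped top.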

The figures below show (A) a fitted regular-sized peak, and (B) a fitted saturated peak. In my case, the maximum of the fitted function differs from the observed maximum by 1.6 ± 2.5% for regular-sized peaks.
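A check of this kind can be sketched in a few lines of Python. The sketch assumes the figure is reported as the mean ± sample standard deviation of the signed percent difference between fitted and observed maxima, which the post does not state; the peak values below are hypothetical and are not the sea-ice data.

```python
import statistics

def percent_difference(fitted_max, observed_max):
    """Signed difference of the fitted maximum from the observed one, in percent."""
    return 100.0 * (fitted_max - observed_max) / observed_max

# Hypothetical (observed, fitted) maxima for unsaturated peaks; not data from the post.
pairs = [(1200.0, 1230.0), (850.0, 845.0), (3100.0, 3190.0), (400.0, 410.0)]

diffs = [percent_difference(fitted, observed) for observed, fitted in pairs]
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)  # sample standard deviation
print(f"{mean_diff:.1f} ± {sd_diff:.1f}%")
```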
