1/X Weighting works for MS peak area data. Any good references as to why?

During an installation of an Acquity/mass spec system, I selected a 1/X data transformation of peak area in order to maximize the coefficient of determination of my standard curve (R squared) and bring it toward 1.000. My customer is a long-time user of UV data and was uncomfortable with the notion of massaging data. As one who came from that school, I can share that misgiving; as an empiricist, though, I tend to use what appears to be useful, and this certainly does. I recall picking up a statistics text by Stuckey back in the 80s that argued that an elegant transformation of data, to make it amenable to Gaussian statistical analysis, is a valid and powerful tool for understanding the data. I pointed out that UV absorbance data is itself a log(1/Transmittance) transformation of photon data, making it amenable to Gaussian analysis, but my customer was unsatisfied. Sadly, I am unable to find that text by Stuckey, or even to remember its title exactly. Does anyone have a good basic text or monograph reference on this point? A propagation-of-error discussion would be icing on the cake. Matt B the FSE


  • Hello Matt,

    There is a paper that treats this topic; I found it very useful.

    See you, Manuel


    N.V. Nagaraja, J.K. Paliwal, R.C. Gupta, "Choosing the calibration model in assay validation", Journal of Pharmaceutical and Biomedical Analysis 20 (1999) 433-438.

  • Hello:

    I have been collecting replies on this as I am travelling and could not add to the thread until this AM. This was an interesting question, and I learned again why MS data and UV data differ, so thank you for the question.

    One chemist explains that this is part of making MS data linear over a wider dynamic range.




    Using 1/X curve weighting for MS work (particularly bioanalysis work, where the expected concentrations can vary widely from sample to sample) is the normal, accepted way to do this. The regression weighting parameters of 1/x or 1/x squared are often needed to make the low end of the standard curve fit correctly: otherwise the higher points skew the curve far more than the lower points.

    This is because at higher concentrations, the standard curve may deviate from a linear response due to ionization factors or instrumental detection limits.
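    As a rough numerical illustration of that skew (the concentrations, response factor, and 4% error figure below are all invented for the sketch, not data from this thread), a constant relative error means the absolute scatter at the top standard dwarfs that at the bottom, so an unweighted line chases the high points and can badly misestimate the low standards:

    ```python
    import numpy as np

    # Hypothetical calibration: 1-1000 ng/mL standards, true response of
    # 250 counts per ng/mL, each point off by a constant 4% relative error
    # (alternating sign so the errors are deterministic).
    conc = np.array([1.0, 5.0, 10.0, 50.0, 100.0, 500.0, 1000.0])
    rel_err = np.array([0.04, -0.04, 0.04, -0.04, 0.04, -0.04, 0.04])
    area = 250.0 * conc * (1.0 + rel_err)

    # Unweighted fit: every residual counts equally in the sum of squares.
    m_u, b_u = np.polyfit(conc, area, 1)

    # 1/x^2-weighted fit: np.polyfit's weights multiply the residuals BEFORE
    # squaring, so w = 1/x contributes 1/x^2 to the sum of squared residuals.
    m_w, b_w = np.polyfit(conc, area, 1, w=1.0 / conc)

    # Back-calculate the lowest (1 ng/mL) standard from each line.
    low_u = (area[0] - b_u) / m_u
    low_w = (area[0] - b_w) / m_w
    print(f"unweighted: {low_u:.2f} ng/mL; 1/x^2 weighted: {low_w:.2f} ng/mL")
    ```

    With these made-up numbers, the unweighted line back-calculates the bottom standard several-fold high, while the 1/x^2-weighted line recovers it to within a few percent.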

    To this point, this recent paper uses the method without reference or explanation:

    "Development and validation of a LC/MS/MS method for quantifying the next generation calcineurin inhibitor, voclosporin, in human whole blood"

    Russell Handy, Dan Trepanier, Grace Scott, Robert Foster, Derrick Freitag

    Journal of Chromatography B, Volume 874, Issues 1-2, 15 October 2008, Pages 57-63

    Further, from another of our scientists:

    The point of confusion is about what weighting does in a regression analysis.

    First -- it does not address the problem of curve fitting. Data can be weighted or not and still be highly linear, quadratic, etc.

    Second -- what it does address is the distribution of precision over the calibration range. The usual assumption in so-called unweighted regression analysis is that all of the calibration levels are known with equal precision; this is called homoscedastic.

    The more frequent observation is that the data are not uniformly precise; this is known as heteroscedastic. When this is the case, non-uniform weighting is called for, because the summation being minimized in a regression is:

    Summation (individual residual)^2

    The point with the largest residual drives the summation, so the least precise data will drive the apparent value of R squared.

    When weighting is added, the summation becomes:

    Summation [individual weight * (individual residual)^2]

    and the best weighting factors are 1/variance at each calibration level.

    (see Chemometrics, Sharaf, Illman, and Kowalski, Wiley-Interscience, 1986)
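    The difference between the two summations can be made concrete with a toy calculation (all numbers invented): with a constant relative error, the residual at the top standard is a thousand times that at the bottom, so its square overwhelms the unweighted sum, while 1/x^2 weights (i.e., 1/variance when the SD is proportional to x) equalize every level's contribution:

    ```python
    import numpy as np

    # Four hypothetical levels with a constant 4% relative residual on a
    # 250 counts-per-unit response, so |residual| grows linearly with concentration.
    conc = np.array([1.0, 10.0, 100.0, 1000.0])
    resid = 0.04 * 250.0 * conc

    sse_terms = resid**2                       # Summation (individual residual)^2
    wsse_terms = (1.0 / conc**2) * resid**2    # Summation [weight * residual^2], w = 1/x^2

    print("unweighted shares:", np.round(sse_terms / sse_terms.sum(), 4))
    print("1/x^2 shares:     ", np.round(wsse_terms / wsse_terms.sum(), 4))
    ```

    Here the top standard contributes about 99% of the unweighted sum of squares, so the fit effectively serves that one point; after 1/x^2 weighting each level contributes exactly one quarter.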

    All calibration data is weighted -- the so-called unweighted case simply uses 1 as the weighting factor. The selection of calibration levels and the number of repeats are also forms of weighting.

    For MS data, the typical experience is that the RSDs run between 3 and 5% independently of the size of the peak. This reflects atomization and ionization variations, which apply with equal effect to small and large peaks. Consequently, the standard deviation TENDS TO INCREASE LINEARLY ACROSS THE CALIBRATION CURVE. The appropriate compensation for this non-uniform standard deviation is to use 1/x or 1/x^2 weighting.
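    That linear growth of the standard deviation is easy to check by simulation (the 4% RSD, response factor, and levels below are assumptions for the sketch, not data from this thread): replicate responses with a constant RSD give an SD that scales with concentration, which is exactly the condition under which 1/variance weighting reduces to 1/x^2:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    levels = np.array([1.0, 10.0, 100.0, 1000.0])

    # 10,000 simulated replicate responses per level: constant 4% RSD on a
    # hypothetical response factor of 250 counts per concentration unit.
    reps = 250.0 * levels[:, None] * (1.0 + 0.04 * rng.standard_normal((levels.size, 10_000)))

    sd = reps.std(axis=1)
    print("SD per level:", np.round(sd, 1))
    print("SD / conc   :", np.round(sd / levels, 2))  # roughly constant (~ 250 * 0.04 = 10)
    ```

    The SD per level climbs by a factor of ten with each decade of concentration, while the ratio SD/concentration stays essentially flat -- the heteroscedastic pattern the 1/x and 1/x^2 weights are meant to compensate.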

    For UV data, the precision is generally much better (RSDs around 0.5%) and the standard deviations do not show such strong variation. Consequently, the "improvement" in R squared from weighting is generally small for UV data.

  • Thanks for the cogent summary of the issue.

    It was the failure to understand the heteroscedasticity of the data that was the source of my misgivings.
