(Hitachi Software Engineering America Ltd. MiraiBio Group.)

The MiraiBio Group Blog

What's going on at MiraiBio

Posted by Allen Liu under MasterPlex QT, MasterPlex ReaderFit

Have you ever wondered how curve-fitting works and what is going on beneath the hood of software algorithms? Well, put on your work gloves and prepare to get greasy because we’re going under the hood =)

The goal of curve fitting is to find the values for the parameters of a given model equation that are most likely to be correct, based on the data provided. Please note that last part: based on the data provided. A curve fit will only be as good as the input data. There is no magical formula to “fix” bad data.

For this article, I will use the 4 parameter logistic or 4PL model equation:

F(x) = ((A-D)/(1+((x/C)^B))) + D

This is a popular model used to fit standard curves in bioassays or immunoassays. It is characterized by its classic “S” or sigmoidal shape that is symmetrical around its inflection point:

[Figure: 4 Parameter Logistic Model Equation]

Model equations such as the 4PL have parameters in the equation that represent significant parts of the curve. Not surprisingly, the 4PL model equation has 4 such parameters (A, B, C, and D).
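To make the equation concrete, here is a minimal Python sketch of the 4PL model. The function name four_pl and the parameter values are hypothetical, chosen only for illustration. Reading the equation directly: at x = 0 the function returns A, at x = C it returns the midpoint between A and D (the inflection point), and for positive B it approaches D as x grows very large. B controls how steep the curve is around C.

```python
def four_pl(x, a, b, c, d):
    """4PL model: F(x) = ((A - D) / (1 + (x / C) ** B)) + D."""
    return (a - d) / (1.0 + (x / c) ** b) + d


# Hypothetical parameter values, not taken from any real assay.
A, B, C, D = 10.0, 1.5, 50.0, 2000.0

# Evaluate at zero, at the inflection point C, and at a very large x.
for x in (0.0, C, 1e9):
    print(x, four_pl(x, A, B, C, D))
```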

A curve-fitting algorithm will make an initial guess at the values of these parameters and attempt the curve fit accordingly. This initial guess could be based on historical curve-fitting trials or on the input data points, but that is a topic for a separate discussion.

Once the curve fit is done, a method of measuring the goodness of fit is required to see how closely the curve fits the input data points. One popular method is the least squares method, which minimizes the sum of squares. Here is the sum of squares equation, followed by a quick example of what it really translates to.

Sum of Squares = Σ (Yi data – Yi curve) ^ 2, summed over all data points i

Minimizing Sum of Squares - Least Squares

Minimizing the sum of squares is a simple concept. Let’s use the curve above as an example, starting with the first point on the lower left-hand side.

[Figure: First point]

The Y1 data point is your original data point on the lower end of the curve, and the Y1 curve point is the value the curve gives at the same X in this first iteration of the fit. In this example:

Y1 data = 15
Y1 curve = 10
Y1 data – Y1 curve = 15 – 10 = 5 (This difference between your measured value [Y data] and the value calculated from the curve [Y curve] is the residual.)

Next, we will need to square this value:

(Y1 data – Y1 curve) ^ 2 = 5 ^ 2 = 25

Now that we have calculated the squared difference for the first point (Y1), the same calculation is repeated for Y2 through Y6. Once the squared differences are calculated for all the points, you simply sum them up and you end up with the sum of squares =)
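The same arithmetic can be sketched in a few lines of Python. Only the first pair of values (15 and 10) comes from the example above. The other five points are made up purely to complete the illustration, and the function name sum_of_squares is likewise hypothetical.

```python
def sum_of_squares(y_data, y_curve):
    """Sum of squared residuals between measured and fitted Y values."""
    return sum((yd - yc) ** 2 for yd, yc in zip(y_data, y_curve))


# Y1 (15 vs. 10) is from the article; Y2 through Y6 are hypothetical.
y_data = [15.0, 40.0, 120.0, 300.0, 520.0, 610.0]
y_curve = [10.0, 42.0, 117.0, 304.0, 519.0, 612.0]

# Residuals are 5, -2, 3, -4, 1, -2, so the squares are 25, 4, 9, 16, 1, 4.
print(sum_of_squares(y_data, y_curve))  # 59.0
```

Note how the first point alone contributes 25 of the 59: squaring makes the points farthest from the curve dominate the total.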

The sum of squares is an indication of how good the curve fit is. As can be seen in the example given, the larger the residual, the farther the point is from the curve. This will ultimately result in a larger sum of squares value. The goal here is to minimize the sum of squares, with the ideal value being zero, but that is very rarely obtained.

The next step of the algorithm is to adjust some of the parameters and calculate the sum of squares again for the next iteration of the curve fit. If the goodness of fit is worse (i.e. a larger sum of squares), the algorithm has most likely overcompensated for one or more of the parameters. If the goodness of fit is better, the algorithm should keep adjusting the parameter(s) in the same direction. This process can go on for hundreds of iterations until the algorithm reaches a predefined stopping threshold, for example when the sum of squares can no longer be improved. It then reports the final parameter values as the best-fit curve.
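The loop described above can be sketched in plain Python. This is a deliberately simple step-halving search written for illustration, not the algorithm MasterPlex uses. Fitting software typically relies on more sophisticated methods such as Levenberg-Marquardt. The names fit_4pl, max_iter and tol, the synthetic data, and the starting guess are all assumptions. The sketch also assumes every x value is positive.

```python
def four_pl(x, a, b, c, d):
    return (a - d) / (1.0 + (x / c) ** b) + d


def sum_of_squares(params, xs, ys):
    a, b, c, d = params
    if c <= 0:  # keep x / c positive so the power stays real-valued
        return float("inf")
    try:
        return sum((y - four_pl(x, a, b, c, d)) ** 2 for x, y in zip(xs, ys))
    except OverflowError:
        return float("inf")


def fit_4pl(xs, ys, guess, max_iter=5000, tol=1e-9):
    params = list(guess)
    steps = [0.1 * abs(p) for p in params]  # start with 10% steps
    directions = [1.0] * len(params)
    best = sum_of_squares(params, xs, ys)
    for _ in range(max_iter):
        for i in range(len(params)):
            trial = list(params)
            trial[i] += directions[i] * steps[i]
            ss = sum_of_squares(trial, xs, ys)
            if ss < best:
                # Better fit: accept it and keep moving in the same direction.
                params, best = trial, ss
            else:
                # Worse fit: we overcompensated, so turn around with a smaller step.
                directions[i] = -directions[i]
                steps[i] *= 0.5
        if max(steps) < tol:  # stopping threshold: no meaningful step is left
            break
    return params, best


# Synthetic standards generated from known (hypothetical) parameters.
true_params = (10.0, 1.5, 50.0, 2000.0)
xs = [1.0, 3.0, 10.0, 30.0, 100.0, 300.0, 1000.0]
ys = [four_pl(x, *true_params) for x in xs]

guess = (20.0, 1.0, 30.0, 1500.0)
start_ss = sum_of_squares(guess, xs, ys)
fitted, final_ss = fit_4pl(xs, ys, guess)
print("sum of squares before:", start_ss)
print("sum of squares after: ", final_ss)
print("fitted A, B, C, D:", fitted)
```

A search like this can stall or settle in a poor local minimum when the starting guess is far off, which is one reason the quality of the initial guess matters.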

That is curve-fitting in a nutshell! Please feel free to comment below if you have any questions or comments =)

  1. Question on weighting factor for 4PL curve Said,

    Hi
    What is the best method of curve fitting for the 4PL? Is it the least squares method or a robust method? What are the best weighting factors?

    Cheers
    Chamindie


    aliu reply on February 8th, 2011 1:11 pm:

    Hi Chamindie,

    Thank you for your question!

    From what I understand of the robust method, it is better suited to data that contain outliers, and it can handle heteroscedastic data (although the least squares method can also do this with the use of weighting).

    In general, I find that 1/Y^2 weighting usually results in the best percent recoveries of standard data for immunoassays. The best way to find out is to simply try both 1/Y and 1/Y^2 weighting and see whether your percent recoveries improve.

    Allen Liu
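As a rough illustration of what the weighting discussed in this thread does, the sketch below multiplies each squared residual by 1/Y^power before summing. It assumes the weight is computed from the measured Y value. Some software uses the fitted value instead. The function name and the two data points are hypothetical.

```python
def weighted_sum_of_squares(y_data, y_curve, power):
    """Sum of w * residual^2 with w = 1 / y_data ** power.

    power = 0 is unweighted, 1 is 1/Y weighting, 2 is 1/Y^2 weighting.
    """
    return sum((yd - yc) ** 2 / yd ** power for yd, yc in zip(y_data, y_curve))


# Two hypothetical points with the same residual (5) at very different signal levels.
y_data = [15.0, 1500.0]
y_curve = [10.0, 1495.0]

for power in (0, 1, 2):
    print(power, weighted_sum_of_squares(y_data, y_curve, power))
```

Unweighted, both points contribute equally (25 each). With 1/Y^2 weighting, the low-signal point contributes 25/225 while the high-signal point contributes only 25/2250000, so the fit is pulled toward matching the low end of the curve in relative terms.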

