1. eladar
    Joined
    12 Jul '08
    Moves
    13814
    02 Apr '20 19:24 (1 edit)
    @DeepThought

    Looks like you are confusing SD with Margin of Error.

    You use 1.96 SD to calculate the Margin of Error for a 95 percent confidence interval.
  2. Standard member Removed
    Joined
    10 Dec '06
    Moves
    8528
    02 Apr '20 19:26 (2 edits)
    @ponderable said
    you buy it now?
    I'm not trying to pick a fight, but no, I still don't buy it. Your prediction of 38,000 deaths by April 11th is nearly double the current projection. Just 2 days further and it's well past double.
  3. Standard member DeepThought
    Losing the Thread
    Quarantined World
    Joined
    27 Oct '04
    Moves
    87415
    02 Apr '20 19:35
    @joe-shmo said
    The online regression software predicts 75,800 deaths by the inflection point (Day 56 - April 21), totaling 161,100. So they seem to be in agreement, with the exception of the inflection point date.
    A major problem with my method is that I can change the answer drastically by changing the start date for the linear regression. I get a quarter of a million deaths if I start the regression from 13/3 (this is in the US) - although that's 109,000 if I include negative numbers of deaths in the averaging. I don't know if there's a way of stabilizing it against that.
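    To make the instability concrete, here's a minimal sketch (Python, with made-up counts that grow slightly sub-exponentially - not the worldometers series):

    import numpy as np

    days = np.arange(0, 21)                             # day 0 = first report
    deaths = 5 * np.exp(0.25 * days - 0.003 * days**2)  # growth that slows a little

    for start in (0, 5, 10):                            # three candidate start dates
        x, y = days[start:], np.log(deaths[start:])
        m, c = np.polyfit(x, y, 1)                      # slope and intercept of the log-data fit
        print(f"start day {start}: m = {m:.3f}, "
              f"day-40 extrapolation = {np.exp(m * 40 + c):,.0f}")

    Shifting the start of the window changes the fitted slope m, and the extrapolation amplifies that difference enormously.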
  4. Standard member DeepThought
    Losing the Thread
    Quarantined World
    Joined
    27 Oct '04
    Moves
    87415
    02 Apr '20 19:39
    @eladar said
    @DeepThought

    Looks like you are confusing SD with Margin of Error.

    You use 1.96 SD to calculate the Margin of Error for a 95 percent confidence interval.
    You didn't read my post very carefully, did you?
  5. eladar
    Joined
    12 Jul '08
    Moves
    13814
    02 Apr '20 19:50 (3 edits)
    @deepthought said
    Using the data from worldometers.com I get 76,500 deaths in the US with a standard deviation of 92,900 and the top of the 95% confidence interval at 186,000. I need to have a think about how to integrate errors in the linear regression into this.
    76,500 + 1.96(92,900) would put the max for your 95 percent confidence interval at 258,584


    Your low end would be smaller than -100,000. Your virus is actually making people, or raising the dead.
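    Checking that arithmetic (a quick Python sketch with the figures quoted above):

    mean, sd = 76_500, 92_900
    print(mean + 1.96 * sd)    # 258584.0, the upper bound
    print(mean - 1.96 * sd)    # -105584.0, negative deaths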
  6. Standard member Removed
    Joined
    10 Dec '06
    Moves
    8528
    02 Apr '20 21:26
    @deepthought said
    Hi joe, I've got a way of doing this on a spreadsheet. The logistic function is:

    f(x) = a/(1 + b exp(-mx)) + d

    Notice that if we send x to minus infinity we get f = d. Since we expect the total number of deaths far in the past to be zero I'm setting d = 0. This makes the analysis a lot easier. Also I've replaced c with m for reasons that'll become clear below. ...[text shortened]...
    So my 95% confidence interval is (1,800 to 42,750)

    I'll repeat the calculation with the US data.
    I have some questions so I can get a handle on what you are doing:

    "The next thing to notice is that for small x we have:

    f(x) ~ exp(mx)

    So we can get m by taking the log of our data and doing linear regression; m is the slope of the linear regression, which is why I renamed it. We can always write:

    b = exp(m*x0) = exp(c) where c is the intercept from linear regression.

    In other words b just determines when our zero in time is. Let's choose b so that f(0) = 1. In other words the date of the first case. Then we can write:

    f(0) = 1 = a/(1 + b) "

    Is the above bit in reverse order? If it isn't, I'm missing how we can go directly to f(x) ≈ e^(mx).

    It seems like first we do some algebra:

    f(x) = a*e^(mx)/(e^(mx) + b)

    Then we constrain b:

    f(0) = 1 = a/(1+b)

    Then it's clear that for small values of x:

    a/(e^(mx) + b) ≈ a/(1 + b) = 1

    Thus, f(x) ≈ e^(mx)

    Guess I'll start with that, but I'll have more questions to follow. I hope you don't think I'm being a pain; it's just that I don't do these manipulations of approximating functions over certain subsets that often (I feel like that's more physicist business), so I require pretty strict logical flow to follow along. 🙂
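    A quick numerical check of that approximation (a Python sketch with illustrative values of a and m, not fitted ones):

    import numpy as np

    a, m = 1000.0, 0.3
    b = a - 1.0                                  # from the constraint f(0) = a/(1 + b) = 1
    f = lambda x: a / (1.0 + b * np.exp(-m * x))

    for x in (0.0, 1.0, 5.0, 10.0):
        print(x, f(x), np.exp(m * x))            # f(x) tracks e^(mx) while f(x) << a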
  7. Standard member DeepThought
    Losing the Thread
    Quarantined World
    Joined
    27 Oct '04
    Moves
    87415
    02 Apr '20 21:45
    @eladar said
    76,500 + 1.96(92,900) would put the max for your 95 percent confidence interval at 258,584


    Your low end would be smaller than -100,000. Your virus is actually making people, or raising the dead.
    I think we can safely constrain the low end of the confidence interval at the current fatality figure.
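    In code form (a sketch; the current-deaths figure is a hypothetical placeholder, not the real count):

    # Clamp the interval's lower end at the deaths already recorded,
    # since the cumulative toll cannot go down.
    current_deaths = 5_000                               # hypothetical placeholder value
    lower = max(76_500 - 1.96 * 92_900, current_deaths)
    print(lower)                                         # 5000 rather than -105,584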
  8. eladar
    Joined
    12 Jul '08
    Moves
    13814
    02 Apr '20 22:32 (1 edit)
    @deepthought said
    I think we can safely constrain the low end of the confidence interval at the current fatality figure.
    A 95 percent confidence interval assumes the underlying distribution is normal.

    It is impossible that your mean and SD could come from a normal distribution.
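    To put a number on that (a sketch, assuming the quoted mean and SD really did describe a normal distribution):

    from scipy.stats import norm

    # probability mass a Normal(76500, 92900) distribution puts below zero deaths
    print(norm.cdf(0, loc=76_500, scale=92_900))    # ≈ 0.205, about a 1-in-5 chance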
  9. Standard member DeepThought
    Losing the Thread
    Quarantined World
    Joined
    27 Oct '04
    Moves
    87415
    02 Apr '20 23:50 (1 edit)
    @joe-shmo said
    I have some questions so I can get a handle on what you are doing:

    "The next thing to notice is that for small x we have:

    f(x) ~ exp(mx)

    So we can get m by taking the log of our data and doing linear regression; m is the slope of the linear regression, which is why I renamed it. We can always write:

    b = exp(m*x0) = exp(c) where c is the intercept from linear ...[text shortened]... l like that's more physicist business ), so I require pretty strict logical flow to follow along. 🙂
    I was just taking the exponential approximation as a given; the easiest way to see it is to start with the standard logistic function:

    f(x) = 1/(1 + exp(-x))

    Its rate of change is given by:

    f'(x) = exp(-x)/(1 + exp(-x))² = (1/f(x) - 1)f(x)² = f(x)(1 - f(x))

    which makes sense for an epidemic: the rate of change of the fraction of the population infected is proportional to the number of carriers and to the fraction of people left uninfected. For the full function we have:

    f(x) = a/(1 + b exp(-mx))

    so

    f'(x) = abm exp(-mx)/(1 + b exp(-mx))² = (m/a) (1 + b exp(-mx) - 1) f(x)² = (m/a) (a/f(x) - 1) f(x)²

    Giving our final result:

    f'(x) = (m/a) f(x) (a - f(x))

    Note first that b does not appear in this, which is my justification for treating it as an initial condition. Since we expect a to be large compared with f(x) for the initial part of the curve we have the approximation:

    f'(x) ≈ m f(x)

    giving us:

    f(x) ~ exp(mx)
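    A finite-difference check of that growth law (a sketch with arbitrary parameter values):

    import numpy as np

    a, b, m = 1000.0, 999.0, 0.3
    f = lambda x: a / (1.0 + b * np.exp(-m * x))

    x, h = 12.0, 1e-6
    numeric = (f(x + h) - f(x - h)) / (2 * h)    # central-difference derivative
    analytic = (m / a) * f(x) * (a - f(x))       # the growth law derived above
    print(numeric, analytic)                     # the two agree to several decimals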
  10. Standard member DeepThought
    Losing the Thread
    Quarantined World
    Joined
    27 Oct '04
    Moves
    87415
    03 Apr '20 01:16 (1 edit)
    The problem with the above is that it seems to be numerically unstable. An alternative approach might be to use the tanh function. The standard logistic function can be written in terms of tanh:

    f(x) = 1/(1 + exp(-x)) = (1 + tanh(x/2))/2

    So for the whole function, writing b = exp(m·x₀) and μ = m/2, we have:

    f(x) = a/(1 + b exp(-mx)) + d = a/(1 + exp[-m(x - x₀)]) + d
    f(x) = a (1 + tanh(μ(x - x₀)))/2 + d
    f(x) = (d + a/2) + (a/2) tanh(μ(x - x₀))

    rationalising the constants we get:

    f(x) = A + B tanh(μ(x - x₀))

    At four reference points the function simplifies:

    f(-∞) = A - B
    f(0) = A + B tanh(-μx₀)
    f(x₀) = A
    f(∞) = A + B

    Now, we can either select A = B, which guarantees that the inflection point comes when exactly half of the eventual death toll has been reached - this is an assumption of my version of the model above. Or we could select f(0) = 0, when:

    A + B tanh(-μx₀) = 0 => A = B tanh(μx₀)

    The advantage of this form is that we can expand around x = 0 to get the initial part of the curve. I might have a go at implementing it tomorrow.
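    For anyone who wants to try it first, here's a sketch of what the fit might look like with scipy's curve_fit - synthetic data and my own helper names, not anything from the spreadsheet:

    import numpy as np
    from scipy.optimize import curve_fit

    def f(x, A, B, mu, x0):
        return A + B * np.tanh(mu * (x - x0))

    days = np.arange(0, 40)
    true = f(days, 500.0, 500.0, 0.15, 25.0)     # A = B, so the curve starts near zero
    rng = np.random.default_rng(0)
    noisy = true + rng.normal(0.0, 10.0, days.size)

    params, cov = curve_fit(f, days, noisy, p0=(400, 400, 0.1, 20))
    print("A, B, mu, x0 =", params)
    print("std errors   =", np.sqrt(np.diag(cov)))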
  11. eladar
    Joined
    12 Jul '08
    Moves
    13814
    03 Apr '20 07:42
    Ponderable,

    How did you calculate the margin of error for your 95 percent confidence interval? I do not think it is the same technique as in standard statistics.
  12. Standard member Removed
    Joined
    10 Dec '06
    Moves
    8528
    03 Apr '20 12:42 (3 edits)
    @deepthought said
    I was just taking the exponential approximation as a given; the easiest way to see it is to start with the standard logistic function:

    f(x) = 1/(1 + exp(-x))

    Its rate of change is given by:

    f'(x) = exp(-x)/(1 + exp(-x))² = (1/f(x) - 1)f(x)² = f(x)(1 - f(x))

    which makes sense for an epidemic, the rate of change of the fraction of the population infected is prop ...[text shortened]... itial part of the curve we have the approximation:

    f'(x) = m f(x)

    giving us:

    f(x) ~ exp(mx)
    Thank you, quite clever (at least by my standards) algebraic manipulations there!

    For anyone operating at my speed:

    f(x) = a/(1 + b*e^(-mx))

    b*e^(-mx) = a/f(x) - 1

    Then:

    f'(x) = m/a * [a²/(1 + b*e^(-mx))²]*[b*e^(-mx)] = m/a*[f(x)]²*[a/f(x) - 1]
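    A symbolic check of that chain of equalities (a sympy sketch):

    import sympy as sp

    x, a, b, m = sp.symbols('x a b m', positive=True)
    f = a / (1 + b * sp.exp(-m * x))
    lhs = sp.diff(f, x)
    rhs = (m / a) * f**2 * (a / f - 1)
    print(sp.simplify(lhs - rhs))    # prints 0, so the two sides agree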

    OK, so when you solved for "a" we were expecting a constant, but the results for it varied substantially? What is causing the variation? I guess, why wasn't it scrapped at that point, when "a" wasn't shown to be a well-behaved constant? Is the variability of "a" instead a hint at the failure of the logistic regression at this point in time?
  13. eladar
    Joined
    12 Jul '08
    Moves
    13814
    03 Apr '20 17:31 (1 edit)

    Removed by poster

  14. eladar
    Joined
    12 Jul '08
    Moves
    13814
    03 Apr '20 17:49 (1 edit)
    Take 2, bad data entry in round 1.

    TI-84 ExpReg equation:

    .95845*1.2478^t, with r² = .985

    where t is measured in days after March 13 and the output is in tens of deaths, so 156.0 means 1,560 deaths.

    OK, the residuals are more random now, but there is still a pattern and the residual distance from the model is growing.

    Predicted deaths for today (day 23): 1,560.
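    Roughly what the TI-84's ExpReg does, for anyone without the calculator (a Python sketch; the day span is illustrative and the points are taken from the fitted curve itself):

    import numpy as np

    t = np.arange(1, 11)                         # days after March 13 (illustrative span)
    y = 0.95845 * 1.2478 ** t                    # points from the fitted curve
    slope, intercept = np.polyfit(t, np.log(y), 1)
    a, b = np.exp(intercept), np.exp(slope)      # ExpReg's y = a * b^t form
    print(a, b)                                  # recovers 0.95845 and 1.2478
    print(10 * a * b ** 23)                      # day-23 prediction, ≈ 1,560 deaths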
  15. Standard member DeepThought
    Losing the Thread
    Quarantined World
    Joined
    27 Oct '04
    Moves
    87415
    03 Apr '20 17:57
    @joe-shmo said
    Thank you, quite clever (at least by my standards) algebraic manipulations there!

    For anyone operating at my speed:

    f(x) = a/(1 + b*e^(-mx))

    b*e^(-mx) = a/f(x) - 1

    Then:

    f'(x) = m/a * [a²/(1+b*e^(-mx))²]*[b*e^(-mx)] = m/a*[f(x)]² *[a/f(x) - 1]

    Ok, so when you solved for "a" we were expecting a constant, but the results for it varied substantially? What ...[text shortened]... variability of "a" instead a hint to the failures of the Logistic Regression in this point of time?
    There are a few potential sources of the problem. The data's noisy at the start of the curve, as the particular circumstances of the first 100 or so people to die affect the data. We're trying to impose a logistic curve onto nature, but the actual dynamics are a lot more complex; this can't be helped without using a very detailed model that goes way beyond what we can do on a spreadsheet.

    Methodologically, part of the problem is that I'm getting values for m and x₀/b from the same data I'm trying to estimate "a" from. The time series for the UK is longer, so I've stopped doing that for the UK data as of the 1st of this month: estimating m and b from the data up to the 1st and a from the data from the 1st onwards, I get a = 24,500 +/- 210 - but that's because there's only three data points. For the US data we don't have that luxury yet.

    One thing: the "logest" function on OpenOffice gives a 1.4% error for its estimate of m but a 10% error for the estimate of b (US data). This means there's a significant source of error in our estimate of where the function crosses the x-axis.
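    A rough Python equivalent of what logest reports, showing where error figures like those can come from (synthetic data, not the US series):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    x = np.arange(0, 15)
    y = 2.0 * np.exp(0.22 * x + rng.normal(0.0, 0.1, x.size))    # noisy exponential

    res = stats.linregress(x, np.log(y))
    print("m    =", res.slope, "+/-", res.stderr)                # small slope error
    print("ln b =", res.intercept, "+/-", res.intercept_stderr)  # larger intercept error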