Thursday, April 23, 2020

Published 8:50 PM

Why Are Errors in Exponential Projections so Large?

You've likely seen huge errors or uncertainty ranges in exponential projections...
Working with the numbers, it's pretty easy to see what's going on. To start with, let's assume the following:
  • 100 people have a virus today
  • every day, between 15 and 25 more people get that virus; our best guess is 20, and we assume this holds steady for the next month
Let's see what this projection yields:


After 30 days (counting today as day 1, so 29 daily increases), our projection is that 680 people will have the virus and that the number is likely between 535 and 825. The total spread in projections is about 43% of the projection itself.
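Here is a minimal sketch of the linear model exactly as the bullets describe it (100 people today, 15 to 25 new people per day). The function name `linear_cases` and the choice of 29 daily increases (day 30 when today counts as day 1) are assumptions made for illustration.

```python
def linear_cases(start, new_per_day, steps):
    """Cases after `steps` daily increases of a fixed number of people."""
    return start + new_per_day * steps


START = 100  # people with the virus today
STEPS = 29   # assumption: today is day 1, so day 30 is 29 increases later

low = linear_cases(START, 15, STEPS)
best = linear_cases(START, 20, STEPS)
high = linear_cases(START, 25, STEPS)

print(low, best, high)  # 535 680 825
print(f"spread / best guess = {(high - low) / best:.0%}")
```

The gap between the low and high estimates is just 10 people per day times the number of days, so it can never run away from the best guess.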

Now let's assume the following instead:

  • 100 people have a virus today
  • every day, between 15% and 25% more people get that virus; our best guess is 20%, and we assume this holds steady for the next month (15% per day is doubling roughly every 5 days, and 25% per day is doubling roughly every 3 days)
Let's see what this projection yields:


After 30 days (counting today as day 1, so 29 daily increases), our projection is that 19,800 people will have the virus and that the number is likely between 5,760 and 64,600. The total spread in projections is enormous: the high estimate is more than a factor of 10 above the low one. All of this comes from a seemingly small uncertainty in the assumption (cases double every 3 to 5 days).
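The figures above can be reproduced with a short sketch. The helper names `exponential_cases` and `doubling_time` are made up for illustration, and the 29 compounding steps are an assumption (day 30 when today counts as day 1); that choice is what matches the rounded numbers quoted here.

```python
import math


def exponential_cases(start, daily_growth, steps):
    """Cases after `steps` daily increases of a fixed percentage."""
    return start * (1 + daily_growth) ** steps


def doubling_time(daily_growth):
    """Days needed for cases to double at a fixed daily growth rate."""
    return math.log(2) / math.log(1 + daily_growth)


START = 100  # people with the virus today
STEPS = 29   # assumption: today is day 1, so day 30 is 29 increases later

exp_low = exponential_cases(START, 0.15, STEPS)
exp_best = exponential_cases(START, 0.20, STEPS)
exp_high = exponential_cases(START, 0.25, STEPS)

print(f"low {exp_low:,.0f}, best {exp_best:,.0f}, high {exp_high:,.0f}")
print(f"high / low = {exp_high / exp_low:.1f}")
for rate in (0.15, 0.20, 0.25):
    print(f"{rate:.0%} per day doubles every {doubling_time(rate):.1f} days")
```

Rounded to three significant figures, the three projections come out to 5,760, 19,800 and 64,600.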

Why is that second case so much less certain?

The first model is linear: the same number of new people get the virus each day, so the error in that model grows linearly too. When you see something phrased like 'every day, between 15% and 25% more people get...', that's exponential growth. The error here is in the assumed exponential growth rate, so the error itself grows exponentially.
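One way to write this down, as a sketch using notation that is not in the original post: let N_0 = 100 be today's cases, t the number of days, a the number of new people per day in the linear model, and r the daily growth rate in the exponential model.

```latex
% Linear model: the gap between high and low estimates grows in proportion to t
N_{\text{hi}}(t) - N_{\text{lo}}(t)
  = (N_0 + 25\,t) - (N_0 + 15\,t)
  = 10\,t

% Exponential model: the ratio of high to low estimates grows geometrically in t
\frac{N_{\text{hi}}(t)}{N_{\text{lo}}(t)}
  = \frac{N_0\,(1.25)^t}{N_0\,(1.15)^t}
  = \left(\frac{1.25}{1.15}\right)^{t}
  \approx 1.087^{\,t}
```

At t = 29 that ratio is about 11.2, which is the "more than a factor of 10" spread in the second projection.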

To take a specific example, consider the first 3 days in each model:

linear:
  • day 0: cases between 100 + (15)*0 and 100 + (25)*0
  • day 1: cases between 100 + (15)*1 and 100 + (25)*1
  • day 2: cases between 100 + (15)*2 and 100 + (25)*2
exponential:
  • day 0: cases between 100*(1.15)^0 and 100*(1.25)^0
  • day 1: cases between 100*(1.15)^1 and 100*(1.25)^1
  • day 2: cases between 100*(1.15)^2 and 100*(1.25)^2
The exponential model raises the assumed daily growth factor to the power of the number of days, so the gap between the low and high estimates explodes.
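To watch the gap open up beyond the first 3 days, here is a small sketch that evaluates the same bounds on a few later days. The function names `linear_bounds` and `exponential_bounds` and the particular days printed are assumptions chosen for illustration.

```python
START = 100  # people with the virus today


def linear_bounds(day):
    """Low and high case counts when 15 to 25 people are added per day."""
    return START + 15 * day, START + 25 * day


def exponential_bounds(day):
    """Low and high case counts when cases grow 15% to 25% per day."""
    return START * 1.15 ** day, START * 1.25 ** day


for day in (0, 1, 2, 10, 20, 29):
    lin_lo, lin_hi = linear_bounds(day)
    exp_lo, exp_hi = exponential_bounds(day)
    print(
        f"day {day:2d}: linear gap {lin_hi - lin_lo:4d}, "
        f"exponential gap {exp_hi - exp_lo:9.0f}"
    )
```

The linear gap climbs by 10 every day, while the exponential gap starts out just as small and ends up in the tens of thousands.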

The model is really built on an assumption about the growth rate. Converting that rate into a number of cases blows up the uncertainty, and that is why you see such giant spreads in projections for things like the spread of a virus.

