Embedded Systems November 2000 Vol13_12

In the language of statistics, the lightning has hung a fat tail on our probability distribution.

... adding up the values and dividing by N. We will deal with such practical details in a moment. Another, more fundamental, reason calls for a short diversion into theory.

If the data had an underlying statistical distribution with thin tails, something like the classic normal density function shown in Figure 1 (red trace), the ordinary mean would work better. In this ideal case, the median would be noisier than the mean by a factor near $\sqrt{\pi/2} \approx 1.25$.

However, distributions in the real world often have fat tails. Consider the distribution shown in Figure 1 (blue trace). It is 90% the same as before, but 10% of the probability follows a normal distribution 10 times wider. What I'm trying to represent is a composite process: something well-behaved is making the narrow central peak, but something else is going on at a low level and making outliers.

The mean allows the outliers, represented by those fat tails, to affect its estimate unreasonably strongly, and is 3.3 times as noisy with the new data. In contrast, the median focuses its attention on the largely undamaged probability peak in the center, and is noisier than it was before by a factor near 1.1. This time, the median is quieter than the mean by a factor near 2.5. Quite a difference for such a subtle change in the shape of the distribution!

So it matters what the underlying statistical distribution is. As is usual, I have used the normal distribution for convenience and familiarity when doing theory. Some real-world situations, like our lightning storm, have even longer tails that are harder to treat analytically, but show off the median to even better advantage. I once saw a really good example at very close range: a pair of ...

Confidence intervals

So how did I get those theoretical results? Confidence intervals for the mean are based on the variance, $\sigma^2$, of the composite distribution.
To form that, you form the appropriate weighted average of the variances of the two distributions that make it up. As always when averaging squared quantities, the larger $\sigma$ quickly takes control of the result.

In sharp contrast, confidence intervals for the median are based on the central density instead [1], and that is where we win. For the two normal distributions of our example, the central density goes like $1/\sigma$, and we take the appropriate weighted average of that quantity. As always when averaging reciprocals, the smaller $\sigma$ takes control! In calculating the median, the most common, organized data behavior dominates any rare large excursions, just as we wish it to.

1. The median $m$ is approximately normal with $\sigma^2 = 1/[4(N-1)f^2(m)]$, where $f(m)$ is the central density and $N$ is odd. For the normal distribution, $f(m) = f(\mu) = 1/(\sigma\sqrt{2\pi})$. Mathematical Statistics, Freund, Prentice-Hall (1962).
2. I neglect a factor of $N/(N-1)$ in the noise comparisons.

FIGURE 1 Density functions
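The sidebar's recipe can be checked numerically. The following is a minimal Python sketch, not the author's code: the function names `noise_factors` and `simulated_ratio` are hypothetical, the mixture is assumed to be centered at zero with a narrow component of $\sigma = 1$, and the $N/(N-1)$ factor is neglected as in the footnotes. It computes the noise of the mean from the weighted average of variances, and the noise of the median from the weighted average of central densities, then compares the two by simulation.

```python
import math
import random
import statistics


def noise_factors(weights, sigmas):
    """Analytic noise of the mean and the median for a zero-centered
    mixture of normals (illustrative helper, not from the article).

    Both results are relative to the noise of the mean on a unit-sigma
    normal, with the N/(N-1) factor neglected.
    """
    # Mean: the mixture variance is the weighted average of the variances.
    variance = sum(w * s * s for w, s in zip(weights, sigmas))
    mean_factor = math.sqrt(variance)
    # Median: the central density is the weighted average of
    # 1 / (sigma * sqrt(2 * pi)) over the components.
    density = sum(w / (s * math.sqrt(2.0 * math.pi))
                  for w, s in zip(weights, sigmas))
    # sd(median) ~ 1 / (2 f(m) sqrt(N)); the sqrt(N) cancels against the
    # sigma / sqrt(N) of the reference mean.
    median_factor = 1.0 / (2.0 * density)
    return mean_factor, median_factor


def simulated_ratio(n=101, trials=2000, seed=1):
    """Monte Carlo estimate of sd(mean) / sd(median) for the assumed
    90% sigma=1, 10% sigma=10 mixture."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(trials):
        sample = [rng.gauss(0.0, 10.0 if rng.random() < 0.1 else 1.0)
                  for _ in range(n)]
        means.append(sum(sample) / n)
        medians.append(statistics.median(sample))
    return statistics.pstdev(means) / statistics.pstdev(medians)


if __name__ == "__main__":
    thin_mean, thin_median = noise_factors([1.0], [1.0])
    fat_mean, fat_median = noise_factors([0.9, 0.1], [1.0, 10.0])
    print("thin tails, median vs mean:", thin_median / thin_mean)
    print("fat tails, mean vs thin mean:", fat_mean / thin_mean)
    print("fat tails, median vs thin median:", fat_median / thin_median)
    print("fat tails, mean vs median:", fat_mean / fat_median)
    print("simulated mean vs median:", simulated_ratio())
```

Under these assumptions the analytic factors come out near 1.25, 3.3, and 1.1, matching the article, while the mean-to-median ratio comes out at roughly 2.4, a little under the article's "near 2.5". The simulation is only a sanity check with a fixed seed and a modest number of trials, so its result should be read as approximate.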
