Sunday, 17 March 2019

From Five to Fifteen?


Statistic Chances for Correct Statistics from Insecure Items ... · Zero Polarity on Five Item Statistics · One Lifespan Too Short in Five · One Lifespan Too Long in Five · Method for Good Choice · Two LifeSpans Too Short in Five Items · Two Lifespans Too Long in Five · The More Extreme Variations or Deviations · Summary for Five Lifespans · From Five to Fifteen? · From Five to Ten, the Long Way · From Five and Ten to Fifteen, the Long Way · From Fifteen to Thirty

As a general method (you recall what this was about: verifying that, if each Wikipedia article has a 95 % likelihood of giving a correct lifespan, then after five, or fifteen, articles have been put together for a statistic, the overall statistic has either no excess or a very marginal one), one would multiply the likelihoods of the first five articles having excess, deficit or neither with those of the next five. This would give 121 multiplications, for the following results of overall polarity:

      -5   -4   -3   -2   -1    0   +1   +2   +3   +4   +5

+5     0   +1   +2   +3   +4   +5   +6   +7   +8   +9  +10
+4    -1    0   +1   +2   +3   +4   +5   +6   +7   +8   +9
+3    -2   -1    0   +1   +2   +3   +4   +5   +6   +7   +8
+2    -3   -2   -1    0   +1   +2   +3   +4   +5   +6   +7
+1    -4   -3   -2   -1    0   +1   +2   +3   +4   +5   +6
 0    -5   -4   -3   -2   -1    0   +1   +2   +3   +4   +5
-1    -6   -5   -4   -3   -2   -1    0   +1   +2   +3   +4
-2    -7   -6   -5   -4   -3   -2   -1    0   +1   +2   +3
-3    -8   -7   -6   -5   -4   -3   -2   -1    0   +1   +2
-4    -9   -8   -7   -6   -5   -4   -3   -2   -1    0   +1
-5   -10   -9   -8   -7   -6   -5   -4   -3   -2   -1    0


Then one would add up all products with polarity +10 (only one), all with +9 (two of them) and so on down to +1 (ten of them), 0 (eleven), -1 (again ten), and then decreasing to a single multiplication for -10.
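For those who would rather let a machine do the counting, here is a minimal Python sketch of the bookkeeping just described. It only counts how many of the 121 products land on each overall polarity; the name `cells` is chosen for illustration and is not part of the original calculation.

```python
from collections import defaultdict
from itertools import product

# Polarity of each batch of five articles runs from -5 to +5
polarities = range(-5, 6)

# cells[p] = number of (first batch, second batch) pairs whose polarities sum to p
cells = defaultdict(int)
for first, second in product(polarities, polarities):
    cells[first + second] += 1

for total in range(10, -11, -1):
    print(f"{total:+d}: {cells[total]} multiplication(s)")
```

The counts should rise from one product at +10 to eleven at 0 and fall back to one at -10, 121 in all.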

This process would no doubt take some time, so I am saving it for tomorrow (if I finish this today).

However, one can simplify this a bit, seeing that, from polarities +2 or -2 on, the total probabilities within five are very small. One can stylise the 78.45 % (zero polarity within five) to 80 % or 8 in 10, and the 10.22 % each (polarity +1 or -1 within five) to 10 % or 1 in 10.
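The 78.45 % and 10.22 % can be recomputed from the 95 % / 2.5 % / 2.5 % assumption. The following is a sketch only, assuming the articles are independent and summing the multinomial terms directly; the function name `polarity_probability` and its parameters are illustrative.

```python
from math import factorial

def polarity_probability(n, net, p_err=0.025):
    """Chance that (articles in excess) - (articles in deficit) == net among n articles.

    Assumes independent articles, each in excess with probability p_err,
    in deficit with probability p_err, and correct otherwise.
    """
    p_ok = 1 - 2 * p_err
    total = 0.0
    for plus in range(n + 1):
        minus = plus - net           # deficits needed to reach this net polarity
        ok = n - plus - minus        # remaining articles are correct
        if minus < 0 or ok < 0:
            continue
        ways = factorial(n) // (factorial(plus) * factorial(minus) * factorial(ok))
        total += ways * p_err ** (plus + minus) * p_ok ** ok
    return total

for net in range(-5, 6):
    print(f"{net:+d}: {100 * polarity_probability(5, net):.2f} %")
```

Polarity 0 comes out at about 78.45 % and polarities +1 and -1 at about 10.22 % each, which is what gets rounded to 8 + 1 + 1 in ten.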

I have a little confession to make here about my initial assumption of each article having a 95 % likelihood of being correct in lifespan, and 2.5 % in each direction of excess or deficit. I was making a quick calculation on it and took 8 + 1 + 1 in ten as the values for each. Then I relented after seeing that the likelihood was going low a bit too quickly.

One can of course give me credit for having chosen the 8 + 1 + 1 ratio simply for ease of calculation, so my higher evaluation of likelihood now reflects a compensation for my initial laziness too.

Now, I have already done the multiplications for two and for three "articles" (according to my initial evaluation), that is, batches of five articles (according to this stylised version of the newer evaluation). I can tell you straight away that, while the likelihood of all three being correct or of errors balancing in pairs is 560 in 1000, the likelihood of there being exactly one in excess or in deficit is 195 in 1000 each, meaning that zero or one marginal error, taken together, comes to 950 in 1000. For two in excess or deficit, it is 24 each, and for three in excess or deficit it is 1 each.
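These figures can be checked by brute force. The sketch below simply lists all 27 outcomes for three batches, weighting each batch 8 (neutral), 1 (excess), 1 (deficit) in ten; the names `WEIGHTS` and `tally` are illustrative, not taken from the original working.

```python
from collections import Counter
from itertools import product

# Stylised weights per batch, in tenths: 8 neutral, 1 in excess, 1 in deficit
WEIGHTS = {0: 8, +1: 1, -1: 1}

def tally(batches):
    """Weight (out of 10**batches) of each net polarity over that many batches."""
    counts = Counter()
    for outcome in product(WEIGHTS, repeat=batches):
        weight = 1
        for polarity in outcome:
            weight *= WEIGHTS[polarity]
        counts[sum(outcome)] += weight
    return counts

three = tally(3)
print([three[k] for k in range(-3, 4)])   # [1, 24, 195, 560, 195, 24, 1]
print(three[0] + three[1] + three[-1])    # 950
```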

Now, suppose we take this 95 % relative correctness value as valid for 3 (my initial take) or for 15 (my take now, with approximation). If we multiply the results crosswise, we get a set of values valid for 6 or for 30 articles.

      560       195+      195-      24++      24--      1+++      1---

560   313600    109200+   109200-   13440++   13440--   560+++    560---
195+  109200+   38025++   38025     4680+++   4680-     195++++   195--
195-  109200-   38025     38025--   4680+     4680---   195++     195----
24++  13440++   4680+++   4680+     576++++   576       24+++++   24-
24--  13440--   4680-     4680---   576       576----   24+       24-----
1+++  560+++    195++++   195++     24+++++   24+       1++++++   1
1---  560---    195--     195----   24-       24-----   1         1------


According to my additions, the neutral ones (no overall excess, no overall deficit) add up to 390804 in 1000000.

Those with one in excess, like those with one in deficit, add up to 227808 each, meaning that zero or one in excess or deficit is a clear majority:

390804 ppm
227808 ppm
227808 ppm
______________
846420 ppm

The plus or minus two are 64895 ppm each. The plus or minus three are 10480 ppm each. The plus or minus four are 966 ppm each. The plus or minus five are 48 ppm each and the plus or minus all six articles excess or deficit are 1 ppm each.

390804 + 227808 + 227808 + 64895 + 64895 + 10480 + 10480 + 966 + 966 + 48 + 48 + 1 + 1 = 999200

And I have made a mistake somewhere, since I am 800 ppm short.

Know what? It is Sunday, I'll publish it as is anyway, and make the correction later or tomorrow in an update.

Hans Georg Lundahl
Nanterre
II Sunday in Lent
also St Patrick
17.III.2019

Update, next day.

Neither excess nor deficit : 390 804 ppm
One either way : + 227 808 + 227 808 ppm
Two either way : + 65 295 + 65 295 ppm
Three either way : + 10 480 + 10 480 ppm
Four either way : + 966 + 966 ppm
Five either way : + 48 + 48 ppm
Six either way : + 1 + 1 ppm.

390 804 + 227 808 + 227 808 + 65 295 + 65 295 + 10 480 + 10 480 + 966 + 966 + 48 + 48 + 1 + 1 = 1 000 000
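The crosswise multiplication and these totals can also be checked mechanically. This is a minimal sketch that starts from the per-1000 values quoted earlier for three (560, 195, 24, 1); the names `THREE` and `cross_multiply` are illustrative.

```python
# Weight per 1000 for each net polarity over three, as quoted in the text
THREE = {-3: 1, -2: 24, -1: 195, 0: 560, +1: 195, +2: 24, +3: 1}

def cross_multiply(first, second):
    """Multiply every entry of one table with every entry of the other,
    adding the products up by their combined polarity."""
    totals = {}
    for a, weight_a in first.items():
        for b, weight_b in second.items():
            totals[a + b] = totals.get(a + b, 0) + weight_a * weight_b
    return totals

six = cross_multiply(THREE, THREE)
for polarity in range(0, 7):
    print(f"{polarity} either way: {six[polarity]} ppm")
print("total:", sum(six.values()))
```

The line for two either way should read 65295 ppm, and the total should be exactly 1000000.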
