Several R90 estimators are compared using Monte Carlo simulations. L-estimators based on second worst miss radius or R6:10+R9:10 found to perform well on wide variety of input distributions.
Based on observed shots, estimate R90 (the radius of a circle that is expected to contain 90% of all impacts).
5x5 (NRA): Measure average extreme spread of five 5-shot groups. This common method is used by NRA magazine American Rifleman to report firearm accuracy. Multiply by 0.72 to get estimate of R90. This assumes perfect zero.
5x5 (4th): Measure 4th-smallest miss radius R4:5 in each of the five 5 shot groups, take the average, multiply by 1.42 to get estimate of R90.
4x5 (4th): Measure 4th-smallest miss radius R4:5 in each of the four 5 shot groups, take the average, multiply by 1.43 to get estimate of R90.
2x10 (9th): Measure 9th-smallest miss radius R9:10 in each of the two 10 shot groups, take the average, multiply by 1.15 to get estimate of R90.
2x10 (6&9th): In a 10 shot group, add 6th-smallest miss radius R6:10 and 9th-smallest miss radius R9:10. Do it again with the second group and take the average. Multiply by 0.69 to get estimate of R90.
Rayleigh 20: Take the average of 20 miss radiuses, multiply by 1.76 to get estimate of R90.
MLE 20: Take square root of the sum of squares of 20 miss radiuses, multiply by 0.35 to get estimate of R90.
Start with establishing the baseline. Pull impact coordinates from standard bivariate normal distibution, estimate R90, pull another impact, see if it lands inside R90. Compare root mean square error (RMSE) and mean absolute error (MAE) of different estimators.
Example groups:
Distribution of R90 estimates based on \(10^{4}\) simulations:
Estimator | Hits within estimated R90 | RMSE | MAE |
---|---|---|---|
5x5 (NRA) | 90.4% | 0.276 | 0.468 |
5x5 (4th) | 90.2% | 0.27 | 0.463 |
4x5 (4th) | 90.3% | 0.304 | 0.49 |
2x10 (9th) | 90.4% | 0.31 | 0.494 |
2x10 (6&9th) | 90.4% | 0.278 | 0.469 |
Rayleigh 20 | 90.6% | 0.266 | 0.46 |
MLE 20 | 90.6% | 0.255 | 0.45 |
6&9 estimator with 20 shots and 4 measurements works about as well as 5x5 estimator with 25 shots and 5 measurements.
Rayleigh estimator is just slightly better, but it requires 20 measurments. MLE is noticeably better, but requires even more work.
Now let’s shake the ladder - see how well the estimators work when underlying assumptions are violated.
Start by introducing some vertical offset (constant point of impact shift). Not much, just one \(\sigma\). In practice this is pretty common. Constant wind, change in temperature and therefore muzzle velocity, or errors in determining range can all introduce systemic error.
Estimator | Hits within estimated R90 | RMSE | MAE |
---|---|---|---|
5x5 (NRA) | 79.1% | 0.464 | 0.634 |
5x5 (4th) | 91.1% | 0.332 | 0.513 |
4x5 (4th) | 91% | 0.372 | 0.543 |
2x10 (9th) | 90.8% | 0.363 | 0.535 |
2x10 (6&9th) | 91.1% | 0.341 | 0.52 |
Rayleigh 20 | 91.2% | 0.337 | 0.516 |
MLE 20 | 90.9% | 0.306 | 0.493 |
5x5 estimator now underestimates R90. This is expected because group size did not change, the group just moved away from the center of the target so average miss radius increased accordingly.
Other estimators still work ok.
Now let’s see what happens when instead of shifting point of impact we add some fliers. They are common in practice, and typical advice is to simply discard fliers. There are some problems, though.
It’s hard to tell with any sort of consistency whether any particular shot is a flier. In Experiment 0, bottom left shot in the third group looks like a flier, but isn’t. All shots came from the same bivariate normal distribution. Excluding only called fliers depends a lot on the shooter. Some can do it way better than others.
Re-shooting entire group after a flier is expensive and time-consuming, and group size without flier is biased down.
A robust estimator that is not particularly sensitive to occasional flier is preferable.
In this experiment we will simulate fliers by increasing standard deviation by a factor of 5 with 5% probability.
Estimator | Hits within estimated R90 | RMSE | MAE |
---|---|---|---|
5x5 (NRA) | 91.7% | 1.02 | 0.857 |
5x5 (4th) | 88.3% | 0.452 | 0.558 |
4x5 (4th) | 88.2% | 0.51 | 0.586 |
2x10 (9th) | 89% | 0.906 | 0.713 |
2x10 (6&9th) | 89.1% | 0.601 | 0.616 |
Rayleigh 20 | 90.6% | 0.631 | 0.677 |
MLE 20 | 91.7% | 1.31 | 0.941 |
Now 6&9 estimator outperforms others, and MLE is the worst.
Now let’s see what happens when vertical and horizontal dispersions are different. In this experiment we flatten the groups - increase horizontal dispersion and decrease vertical dispersion.
This is common in practice. For example, ammo-induced variance in muzzle velocity affects primarily vertical impact coordinate, while wind affects mostly horizontal coordinate.
Estimator | Hits within estimated R90 | RMSE | MAE |
---|---|---|---|
5x5 (NRA) | 89.6% | 0.399 | 0.563 |
5x5 (4th) | 87% | 0.375 | 0.551 |
4x5 (4th) | 86.9% | 0.414 | 0.578 |
2x10 (9th) | 88.6% | 0.437 | 0.589 |
2x10 (6&9th) | 87.6% | 0.384 | 0.556 |
Rayleigh 20 | 87.3% | 0.359 | 0.538 |
MLE 20 | 88.3% | 0.353 | 0.531 |
Now combine all factors together: increase standard deviation by a factor of 5 with 5% probability, add one \(\sigma\) vertical offset, and flatten the groups.
Estimator | Hits within estimated R90 | RMSE | MAE |
---|---|---|---|
5x5 (NRA) | 88.9% | 1.11 | 0.899 |
5x5 (4th) | 89.3% | 0.533 | 0.602 |
4x5 (4th) | 89.2% | 0.606 | 0.635 |
2x10 (9th) | 89.3% | 1 | 0.769 |
2x10 (6&9th) | 89.5% | 0.678 | 0.663 |
Rayleigh 20 | 91.5% | 0.792 | 0.756 |
MLE 20 | 91.9% | 1.53 | 0.998 |
Finally let’s look at some real-world data: instead of generating impact coordinates with pseudorandom number generator, sample with replacement from 341 actual shots. Only miss radiuses were recorded, so 5x5 estimator is not shown.
Estimator | Hits within estimated R90 | RMSE | MAE |
---|---|---|---|
5x5 (4th) | 87.4% | 6.87 | 2.3 |
4x5 (4th) | 87.2% | 7.69 | 2.44 |
2x10 (9th) | 89% | 10.3 | 2.63 |
2x10 (6&9th) | 88% | 7.92 | 2.41 |
Rayleigh 20 | 87.9% | 8.27 | 2.49 |
MLE 20 | 89.6% | 14.8 | 3.05 |
This section describes where the magic numbers came from.
5x5 factor 0.72 was determined approximately using Monte Carlo simulations because there does not seem to be another way to do it.
2x10 (9) uses 9th smallest miss radius in 10 shot group because according to formula 3.13 in [1] optimum unbiased estimator from single order statistic Rm:n is
\[m \approx 0.79681(n+1)-0.39841+\frac{1.16312}{n+1}\]
For \(n=10\) it falls right between 8 and 9, and R9:10 (second worst miss radius) is easier to find on paper.
Factor 1.15 can be obtained analytically. Assume miss radiuses of individual shots follow Rayleigh distribution with \(\sigma = 1\). Its probability density function is
\[f(x)=x e^{-\frac{x^2}{2}}\]
and cumulative distribution function is
\[F(x)=1-e^{-\frac{x^2}{2}}\]
Probability density of \(m\)th miss radius in a group of \(n\) shots is
\[f_{m:n}(x)=\frac{n!}{(m-1)!(n-m)!}(F(x))^{m-1}(1-F(x))^{n-m}f(x)\]
For \(m=9\) and \(n=10\)
\[f_{9:10}(x)=\frac{10!}{(9-1)!(10-9)!}(F(x))^{9-1}(1-F(x))^{10-9}f(x)=90(F(x))^{8}(1-F(x))f(x)\] \[=90\left(1-e^{-\frac{x^2}{2}}\right)^{8}e^{-\frac{x^2}{2}}x e^{-\frac{x^2}{2}}=90\left(1-e^{-\frac{x^2}{2}}\right)^{8}x e^{-x^2}\]
Let \(t\) be average of two miss radiuses \(R_{9:10}\). PDF of \(t\) is convolution of PDFs \(f_{9:10}\)
\[p_2(t)=2\int_{0}^{2t}f_{9:10}(u)f_{9:10}(2t-u) du\]
Probability that miss radius of the next shot is greater than \(y\) (complementary cumulative distribution function) is
\[p(y)=1-F(y)=e^{-\frac{y^2}{2}}\]
\(y=R_{90}\) should be proportional to \(t\) with some yet unknown coefficient \(k\):
\[p(y)=p(kt)=e^{-\frac{k^2 t^2}{2}}\]
Miss radius \(y\) is greater than \(R_{90}\) with 10% probability:
\[cdf_{9:10}(t)=\int_{0}^{\infty}p_2(t)e^{-\frac{t^2 k^2}{2}} dt=0.1\]
The following Maxima code can solve the above equation to get \(k \approx 1.149216\). Upper integration limit of 10 was chosen to be high enough to not affect the result much, but low enough so that numerical integration does not underflow.
assume(x>0,t>0,u>0,k>0);
f(x):=90*x*(1-exp(-x^2/2))^8*exp(-x^2);
p2(t):=romberg(2*f(u)*f(2*t-u),u,0,2*t);
cdf(k):=romberg(p2(t)*exp(-t^2*k^2/2),t,0,10);
find_root(cdf(k)=0.1,k,1.1,1.2);
2x10 (6&9) factor 0.69 can also be derived from the formulas for order statistics.
Joint probability distribution of 6th miss radius \(x=R_{6:10}\) and 9th miss radius \(y=R_{9:10}\) is
\[f(x,y)=\frac{n!}{(j-1)!(k-j-1)!(n-k)!}[F(x)]^{j-1}(F(y)-F(x))^{k-j-1}(1-F(y))^{n-k}f(x)f(y)\] \[=15120 \left[1-e^{-\frac{x^2}{2}}\right]^5 \left(e^{-\frac{x^2}{2}}-e^{-\frac{y^2}{2}}\right)^2 e^{-\frac{y^2}{2}} x e^{-\frac{x^2}{2}} y e^{-\frac{y^2}{2}}, x \le y\]
Since we only care about \(x+y\), we can rotate axes by 45° and integrate over \(x-y\):
\[p(u)=\frac{1}{2}\int_{0}^{u}f\left(\frac{u-v}{2},\frac{u+v}{2}\right)dv\]
Applying convolution to get PDF of the average of the two measurement:
\[p_2(t)=\int_{0}^{2t}2p(u)p(2t-u)du\]
Integrating to get cumulative distribution:
\[cdf(k)=\int_{0}^{\infty}p_2(z)e^{\frac{-z^2 k^2}{2}}dz = 0.1\]
The following Maxima code can solve the above equation to get \(k \approx 0.68860849\)
assume(x>0,y>0,x<=y,u>0,v>0,v<=u,z>0);
f(x,y):=15120*(1-exp(-x^2/2))^5*(exp(-x^2/2)-exp(-y^2/2))^2*exp(-y^2/2)*x*exp(-x^2/2)*y*exp(-y^2/2);
p(u):=romberg(f((u-v)/2,(u+v)/2)/2,v,0,u);
p2(t):=romberg(2*p(u)*p(2*t-u),u,0,2*t);
cdf(k):=romberg(p2(z)*exp(-z^2*k^2/2),z,0,10);
find_root(cdf(k)=0.1,k,0.6,0.8);
Rayleigh 20 factor 1.76 was determined approximately using Monte Carlo simulations, but we can also get lower bound in closed from. With infinite number of shots, expected value of miss radius (Rayleigh estimator) is
\[r=E(x)=\int_{0}^{\infty}f(x) x dx = \sqrt{\frac{\pi}{2}}\]
From cumulative distribution function of miss radius
\[F(R_{90})=1-e^{-\frac{R_{90}^2}{2}} = 0.9, R_{90}=\sqrt{2ln(10)}\]
therefore the factor is
\[k = \frac{R_{90}}{r} = \sqrt{\frac{4 ln(10)}{\pi}} \approx 1.71223316\]
Rayleigh distribution is asymmetric so this is a lower bound on \(k\).
MLE 20 factor 0.35 was determined approximately using Monte Carlo simulations, but we can similarly get lower bound in closed form, which for \(N=20\) is
\[k=\frac{4^N N! (N-1)! \sqrt{N}}{(2N)!\sqrt{\pi}}\sqrt{2ln(10)}\sqrt{\frac{1}{2N}} \approx 0.3414341\]
Maxima expression:
float(4^20*20!*(20-1)!*sqrt(20)*sqrt(2*log(10))/((2*20)!*sqrt(%pi)*sqrt(2*20)));
[1] M. M. Siddiqui (1964): “Statistical Inference for Rayleigh Distributions”, Radio Science Journal of Research Vol 68D, No.9 http://ballistipedia.com/images/7/7a/Statistical_Inference_for_Rayleigh_Distributions_-_Siddiqui%2C_1964.pdf