<div dir="ltr"><div>The biggest problem with statistics is misuse caused by invalid assumptions of randomness.</div><div><br></div><div>This applies particularly to testing components that are built in batches: sampling a large number of items from the same batch can give misleading results.</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Oct 18, 2020 at 7:26 PM Helmut Walle <<a href="mailto:helmut.walle@gmail.com">helmut.walle@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">You can use a normal distribution for your random variable (the number of failures within a<br>
sample) if the pass-fail outcomes of the individual elements of that sample (the units) are<br>
statistically independent of each other, and identically distributed (but these individual<br>
outcomes need not follow a normal distribution themselves).<br>
In practice this means that you cannot easily analyse the situation if your process is drifting<br>
throughout the production of one batch - in that case you would have to stabilise your process<br>
first. But as long as there are just some random variations, you just have to select the units<br>
that go into the sample as randomly as possible.<br>
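<br>
As a minimal sketch of random selection, assuming every unit in the batch has a serial number<br>
(the function name and the numbering 1 to 10000 are made up for illustration), Python's standard<br>
library can draw the sample so that every unit is equally likely to be picked:<br>

```python
import random


def pick_sample(serial_numbers, sample_size, seed=None):
    """Draw sample_size distinct units uniformly at random from the batch."""
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    return rng.sample(list(serial_numbers), sample_size)


# Hypothetical batch of 10_000 units numbered 1..10000.
batch = range(1, 10_001)
sample = pick_sample(batch, 100, seed=1)
print(sorted(sample)[:10])
```

This only randomises which units are tested; it does not help if the process itself drifts during<br>
the batch.<br>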
<br>
With that given, the number of failures in a sample approximately follows a normal distribution for<br>
large sample sizes. Strictly speaking, it is a binomial distribution for any finite sample size,<br>
but in practice it can be approximated by a normal distribution for sample sizes as low as 100,<br>
provided the failure rate is not too close to 0 or 1 (a common rule of thumb is at least 5 to 10<br>
expected failures and at least as many expected passes). You can calculate the mean and standard<br>
deviation of the distribution. Confidence intervals for a given confidence level can then be<br>
expressed as multiples of the standard deviation. If you want higher confidence, your confidence<br>
intervals will be wider. Conversely, if you want a narrower interval, your confidence will be lower.<br>
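<br>
The calculation described above might look roughly like this in Python. This is only a sketch of<br>
the normal approximation: the function name is made up for illustration, the batch is assumed to<br>
be much larger than the sample, and the interval is simply clipped to the range 0 to 1.<br>

```python
import math
from statistics import NormalDist


def normal_approx_interval(failures, sample_size, confidence=0.95):
    """Estimate the failure rate and a two-sided normal-approximation interval."""
    p_hat = failures / sample_size  # observed failure rate
    # Standard deviation of the estimated failure rate (binomial model).
    std_dev = math.sqrt(p_hat * (1 - p_hat) / sample_size)
    # Number of standard deviations for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    low = max(0.0, p_hat - z * std_dev)
    high = min(1.0, p_hat + z * std_dev)
    return p_hat, std_dev, (low, high)


# 36 failures out of 50, and 3 failures out of 100, both at 95% confidence.
print(normal_approx_interval(36, 50))
print(normal_approx_interval(3, 100))
```

For 36 failures out of 50 this gives an interval of roughly 0.60 to 0.84. For 3 failures out of<br>
100 the lower end has to be clipped at 0, which is a sign that the normal approximation is not<br>
reliable with so few failures.<br>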
<br>
A few references that may help for your use case:<br>
<br>
<a href="https://www.qualtrics.com/au/experience-management/research/determine-sample-size/?rid=ip&prevsite=en&newsite=au&geo=NZ&geomatch=au" rel="noreferrer" target="_blank">https://www.qualtrics.com/au/experience-management/research/determine-sample-size/?rid=ip&prevsite=en&newsite=au&geo=NZ&geomatch=au</a><br>
<a href="https://www.dummies.com/education/math/statistics/choosing-a-confidence-level-for-a-population-sample/" rel="noreferrer" target="_blank">https://www.dummies.com/education/math/statistics/choosing-a-confidence-level-for-a-population-sample/</a><br>
<a href="https://www.quanterion.com/test-samples-how-many-are-needed/" rel="noreferrer" target="_blank">https://www.quanterion.com/test-samples-how-many-are-needed/</a><br>
<br>
Kind regards,<br>
<br>
Helmut.<br>
<br>
<br>
On 18/10/2020 17:57, Stephen Irons wrote:<br>
> I seem to remember doing calculations like this a long time ago...there are a number of<br>
> variations which are probably all related. I have not been able to find any Google search terms<br>
> that give me anything useful.<br>
> <br>
> A factory produces a batch of 10_000 units.<br>
> <br>
> * I test 100 units; there are 3 failures. What failure rate can I expect from the whole batch?<br>
> What is my confidence in that estimate?<br>
> * I test 100 units; there are 0 failures. What failure rate can I expect from the whole batch?<br>
> What is my confidence in that estimate?<br>
> * How many units do I need to test to have 99% confidence that there will be less than 1%<br>
> failure rate from the whole batch?<br>
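<br>
For the zero-failure and sample-size questions above, one possible sketch uses the binomial model<br>
directly: if the true failure rate is p, the chance that n randomly chosen units all pass is<br>
(1 - p)^n. The function names are made up for illustration, and the sketch assumes independent<br>
units, a batch much larger than the sample, and a one-sided confidence statement.<br>

```python
import math


def zero_failure_upper_bound(sample_size, confidence=0.95):
    """Upper bound on the failure rate after sample_size units with 0 failures.

    Solves (1 - p)^n = 1 - confidence for p.
    """
    return 1 - (1 - confidence) ** (1 / sample_size)


def units_needed(max_failure_rate, confidence):
    """Smallest n such that n passes in a row support the claimed failure rate.

    Solves (1 - max_failure_rate)^n = 1 - confidence for n and rounds up.
    """
    return math.ceil(math.log(1 - confidence) / math.log(1 - max_failure_rate))


print(zero_failure_upper_bound(100, 0.95))  # bound after 100 units, 0 failures
print(units_needed(0.01, 0.99))             # units for 99% confidence of under 1%
```

Under these assumptions, 100 units with no failures support a failure rate below about 3% at 95%<br>
confidence, and 459 units must all pass to support a failure rate below 1% at 99% confidence.<br>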
> <br>
> Can someone tell me what you call this type of calculation? Point me to a suitable reference site?<br>
> <br>
> All of the examples I find online are of the form: a factory produces widgets with x% failure<br>
> rate; out of a sample of y units, what is the probability of finding z defective units...this is<br>
> probably the same calculation from the other direction.<br>
> <br>
> This is just for interest. In my specific case, I had 36 failures out of a sample of 50 taken<br>
> from a batch of a few thousand -- this is clearly not acceptable. But we now have a repeatable<br>
> test that causes the failure.<br>
> <br>
> Stephen Irons<br>
> <br>
> _______________________________________________<br>
> Chchrobotics mailing list <a href="mailto:Chchrobotics@lists.ourshack.com" target="_blank">Chchrobotics@lists.ourshack.com</a><br>
> <a href="https://lists.ourshack.com/mailman/listinfo/chchrobotics" rel="noreferrer" target="_blank">https://lists.ourshack.com/mailman/listinfo/chchrobotics</a><br>
> Mail Archives: <a href="http://lists.ourshack.com/pipermail/chchrobotics/" rel="noreferrer" target="_blank">http://lists.ourshack.com/pipermail/chchrobotics/</a><br>
> Meetings usually 3rd Monday each month. See <a href="http://kiwibots.org" rel="noreferrer" target="_blank">http://kiwibots.org</a> for venue, directions and dates.<br>
> When replying, please edit your Subject line to reflect new subjects.<br>
> <br>
<br>
</blockquote></div>