tex: NFBP stats

Paul ALNET 2023-06-04 18:39:18 +02:00
parent 1c6db889a6
commit 4a4531a413


@ -180,14 +180,75 @@ Mathematically, the NFBP algorithm imposes the following constraint on the first
We implemented the NFBP algorithm in Python\footnotemark{} for its ease of use
and widespread adoption. We used the \texttt{random} library to generate
random numbers between $ 0 $ and $ 1 $ and \texttt{matplotlib} to plot the
results in the form of histograms.
\footnotetext{The code is available in Annex \ref{annex:probabilistic}.}
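As an illustration of the simulation described above, here is a minimal sketch of one Next Fit run. It is not the code from the annex: the names \texttt{next\_fit} and \texttt{simulate} are hypothetical, and a bin capacity of $ 1 $ is assumed, matching item sizes drawn between $ 0 $ and $ 1 $. For each bin it records the number of items $ T_i $ and the size of the first item $ V_i $.

```python
import random


def next_fit(items, capacity=1.0):
    """Pack items with the Next Fit rule; return the bins as lists of sizes.

    Assumption: every item size is at most `capacity`.
    """
    bins = []
    free = 0.0
    for size in items:
        # Open a new bin when there is none yet or the item does not fit.
        if not bins or size > free:
            bins.append([])
            free = capacity
        bins[-1].append(size)
        free -= size
    return bins


def simulate(n_items, rng):
    """Run one simulation with n_items uniform item sizes in [0, 1)."""
    items = [rng.random() for _ in range(n_items)]
    bins = next_fit(items)
    t = [len(b) for b in bins]   # T_i: number of items in bin i
    v = [b[0] for b in bins]     # V_i: size of the first item in bin i
    return len(bins), t, v


if __name__ == "__main__":
    n_bins, t, v = simulate(50, random.Random())
    print(n_bins, t[0], v[0])
```

Repeating \texttt{simulate} $ R $ times and collecting, for instance, the first entry of \texttt{t} gives the sample of $ T_1 $ whose mean and variance are estimated next.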
We approximate $ \mathbb{E}[X] $ and $ \mathbb{E}[V] $ with the empirical mean
$ \overline{X_N} $, and estimate the variance with $ {S_N}^2 $. We do this for
both $ R = 2 $ and $ R = 10^6 $ simulations.
\[
\overline{X_N} = \frac{1}{N} \sum_{i=1}^{N} X_i
\]
Since the true variance is unknown, we estimate it with the sample variance
$ {S_N}^2 $ and use it to build a 95 \% confidence interval for the mean $ m $
(that is, $ \alpha = 0.05 $).
\begin{align*}
{S_N}^2 & = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \overline{X_N})^2 \\
IC_{95\%}(m) & = \left[ \overline{X_N} \pm \frac{S_N}{\sqrt{N}} \cdot t_{1 - \frac{\alpha}{2}, N-1} \right] \\
\end{align*}
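The two formulas above can be sketched in Python with the standard library only. The function name \texttt{confidence\_interval} is hypothetical, and the Student quantile is passed in by the caller (read from a table) rather than computed, since the standard library has no Student distribution.

```python
import math
import statistics


def confidence_interval(samples, quantile):
    """Return (mean, sample variance, (low, high)) for the given samples.

    `quantile` is the Student coefficient t_{1 - alpha/2, N - 1},
    supplied by the caller (e.g. read from a table).
    """
    n = len(samples)
    mean = statistics.fmean(samples)
    s2 = statistics.variance(samples)        # unbiased: divides by N - 1
    half_width = quantile * math.sqrt(s2 / n)
    return mean, s2, (mean - half_width, mean + half_width)


if __name__ == "__main__":
    # Three samples, so N - 1 = 2 degrees of freedom: t_{0.975, 2} = 4.303.
    print(confidence_interval([1.0, 2.0, 3.0], 4.303))
```

For a very large number of samples the Student quantile is close to the normal one, which \texttt{statistics.NormalDist().inv\_cdf(0.975)} returns (about $ 1.96 $).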
\paragraph{2 simulations} We first ran $ R = 2 $ simulations to observe the
behavior of the algorithm and the low precision of the results.
% TODO graph T_i 2 sim
On this graph, we can see each value of $ T_i $. Our calculations have yielded
$ \overline{T_1} = 1.0 $ and $ {S_N}^2 = 2.7 $. The Student coefficient is
$ t_{0.975, 2} = 4.303 $.
\begin{align*}
\overline{T_1} = \frac{1}{2} \sum_{k=1}^{2} {T_1}_k & = 1.0 \\
IC_{95\%}(T_1) & = \left[ 1.0 \pm \frac{\sqrt{2.7}}{\sqrt{2}} \cdot 4.303 \right] \\
& = \left[ 1.0 \pm 5.0 \right]
\end{align*}
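With only two simulations, the confidence interval is much wider than the
estimate itself, which illustrates the low precision of the results.
% TODO observed confidence interval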
We then ran $ R = 10^6 $ simulations with $ N = 50 $ different items each.
% TODO with $ 10^6 $ simulations, report $ \overline{X_N} $ (see graph)
% TODO compute $ {S_N}^2 $
% TODO observed confidence interval
% TODO same for $ V $
% TODO graph of $ H $
\paragraph{Distribution of $ T_i $} We first studied how many items were
present per bin.
% TODO sim of T_i
We determined the empirical mean to be
\[
\overline{T_i} = \frac{1}{20} \sum_{k=1}^{20} T_k = 1.5 \qquad \forall 1 \leq i \leq 20
\]
We can show
\paragraph{Distribution of $ V_i $} We then looked at the size of the first
item in each bin.