tex: NFBP stats

Paul ALNET 2023-06-04 18:39:18 +02:00
parent 1c6db889a6
commit 4a4531a413


@ -180,14 +180,75 @@ Mathematically, the NFBP algorithm imposes the following constraint on the first
We implemented the NFBP algorithm in Python \footnotemark, for its ease of use
and broad recommendation. We used the \texttt{random} library to generate
random numbers between $ 0 $ and $ 1 $ and \texttt{matplotlib} to plot the
results in the form of histograms.
\footnotetext{The code is available in Annex \ref{annex:probabilistic}}
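To make the simulated quantities concrete, here is a minimal sketch of one NFBP run. It is not the code from the annex: the function names \texttt{nfbp} and \texttt{simulate}, the bin capacity of $ 1 $ and the fixed seed are assumptions made for illustration. It returns the number of items per bin ($ T_i $) and the size of the first item in each bin ($ V_i $).

```python
import random


def nfbp(sizes, capacity=1.0):
    """Pack items with Next Fit: only the most recently opened bin stays open.

    Hypothetical helper for illustration; capacity=1.0 is an assumption.
    """
    bins = []
    load = 0.0  # total size already placed in the current (last) bin
    for size in sizes:
        if bins and load + size <= capacity:
            # The item fits in the current bin.
            bins[-1].append(size)
            load += size
        else:
            # Close the current bin for good and open a new one.
            bins.append([size])
            load = size
    return bins


def simulate(n_items, rng):
    """Run one simulation with n_items sizes drawn uniformly in [0, 1)."""
    sizes = [rng.random() for _ in range(n_items)]
    bins = nfbp(sizes)
    t = [len(b) for b in bins]  # T_i: number of items in bin i
    v = [b[0] for b in bins]    # V_i: size of the first item in bin i
    return t, v


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, only to make the run repeatable
    t, v = simulate(10, rng)
    print("T_i:", t)
    print("V_i:", v)
```

Keeping a running \texttt{load} avoids re-summing the current bin for every item, which matters when the number of simulations is large.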
We approximate the expectations $ \mathbb{E}[X] $ and $ \mathbb{E}[V] $ with
the empirical mean $ \overline{X_N} $, and measure the precision of this
estimate with the sample variance $ {S_N}^2 $. We do this for both $ R = 2 $
and $ R = 10^6 $ simulations.
\[
\overline{X_N} = \frac{1}{N} \sum_{i=1}^{N} X_i
\]
As the variance is unknown, we estimate it with the sample variance
$ {S_N}^2 $ and use it to build a confidence interval at the 95 \% level
($ \alpha = 0.05 $).
\begin{align*}
{S_N}^2 & = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \overline{X_N})^2 \\
IC_{95\%}(m) & = \left[ \overline{X_N} \pm \frac{S_N}{\sqrt{N}} \cdot t_{1 - \frac{\alpha}{2}, N-1} \right] \\
\end{align*}
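The estimator and interval above can be sketched in Python with the standard library only. This is an illustrative sketch, not the annex code: the function names and the small quantile table are assumptions. The table holds standard values of $ t_{0.975, \nu} $ for $ \nu \leq 10 $; for $ \nu \geq 30 $ the sketch falls back to the normal quantile, which is only an approximation.

```python
import math
import statistics

# Standard two-sided 95 % Student quantiles t_{0.975, df} for small df.
T_975 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
}


def t_quantile_975(df):
    """Return t_{0.975, df} from the table, or a normal approximation."""
    if df in T_975:
        return T_975[df]
    if df >= 30:
        # Assumption: for large df the Student law is close to N(0, 1).
        return statistics.NormalDist().inv_cdf(0.975)
    raise ValueError(f"no tabulated quantile for df={df}")


def confidence_interval_95(samples):
    """Return (mean, S_N^2, lower, upper) for a list of observations."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = statistics.fmean(samples)
    s2 = statistics.variance(samples)  # unbiased: divides by n - 1
    half_width = t_quantile_975(n - 1) * math.sqrt(s2 / n)
    return mean, s2, mean - half_width, mean + half_width


if __name__ == "__main__":
    # Two made-up observations, only to exercise the function.
    print(confidence_interval_95([1.0, 3.0]))
```

With only two observations there is a single degree of freedom, so the quantile $ 12.706 $ applies and the interval is very wide.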
\paragraph{2 simulations} We first ran $ R = 2 $ simulations to observe the
behavior of the algorithm and the low precision of the results.
% TODO graph T_i 2 sim
On this graph, we can see each value of $ T_i $. Our calculations have yielded
that $ \overline{T_1} = 1.0 $ and $ {S_N}^2 = 2.7 $. With $ N = 2 $
observations, the Student quantile has $ N - 1 = 1 $ degree of freedom:
$ t_{0.975, 1} = 12.706 $.
\begin{align*}
\overline{T_1} = \frac{1}{2} \sum_{k=1}^{2} {T_1}_k & = 1.0 \\
IC_{95\%}(T_1) & = \left[ 1.0 \pm \frac{\sqrt{2.7}}{\sqrt{2}} \cdot 12.706 \right] \\
& \approx \left[ 1.0 \pm 14.8 \right]
\end{align*}
The interval is far too wide to be informative, which illustrates the low
precision obtained from only two simulations.
% TODO observed IC
We then ran $ R = 10^6 $ simulations with $ N = 50 $ different items each.
% TODO report \overline{X_N} for 10^6 simulations (see graph)
% TODO compute {S_N}^2
% TODO observed IC
% TODO same for V
% TODO graph of H
\paragraph{Distribution of $ T_i $} We first studied how many items were
present per bin.
% TODO sim of T_i
We determined the empirical mean to be
\[
\overline{T_i} = \frac{1}{20} \sum_{k=1}^{20} {T_i}_k = 1.5 \qquad \forall 1 \leq i \leq 20
\]
% TODO finish this sentence: We can show
\paragraph{Distribution of $ V_i $} We then looked at the size of the first
item in each bin.