From 4a4531a41301409da70cfae6989eafb380dd3b82 Mon Sep 17 00:00:00 2001
From: Paul ALNET
Date: Sun, 4 Jun 2023 18:39:18 +0200
Subject: [PATCH] tex: NFBP stats

---
 latex/content.tex | 65 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 63 insertions(+), 2 deletions(-)

diff --git a/latex/content.tex b/latex/content.tex
index 7d6d5cd..0e3e7e3 100644
--- a/latex/content.tex
+++ b/latex/content.tex
@@ -180,14 +180,75 @@ Mathematically, the NFBP algorithm imposes the following constraint on the first
 We implemented the NFBP algorithm in Python \footnotemark, for its ease of use
 and broad recommendation. We used the \texttt{random} library to generate
 random numbers between $ 0 $ and $ 1 $ and \texttt{matplotlib} to plot the
-results in the form of histograms. We ran $ R = 10^6 $ simulations with
-$ N = 10 $ different items each.
+results in the form of histograms.
 
 \footnotetext{The code is available in Annex \ref{annex:probabilistic}}
 
+We estimate $ \mathbb{E}[X] $ and $ \mathbb{E}[V] $ with the empirical mean
+$ \overline{X_N} $ and measure its precision with the sample variance
+$ {S_N}^2 $. We do this for both $ R = 2 $ and $ R = 10^6 $ simulations.
+
+\[
+  \overline{X_N} = \frac{1}{N} \sum_{i=1}^{N} X_i
+\]
+
+As the true variance is unknown, we estimate it with $ {S_N}^2 $ and build a
+95 \% confidence interval using Student's $ t $ distribution ($ \alpha = 0.05 $).
+
+\begin{align*}
+  {S_N}^2 & = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \overline{X_N})^2 \\
+  IC_{95\%}(m) & = \left[ \overline{X_N} \pm \frac{S_N}{\sqrt{N}} \cdot t_{1 - \frac{\alpha}{2}, N-1} \right]
+\end{align*}
+
+
+
+
+\paragraph{2 simulations} We first ran $ R = 2 $ simulations to observe the
+behavior of the algorithm and the low precision of the results.
+
+% TODO graph T_i 2 sim
+
+On this graph, we can see each value of $ T_i $. Our calculations have yielded
+that $ \overline{T_1} = 1.0 $ and $ {S_N}^2 = 2.7 $. With two samples, hence one
+degree of freedom, our Student coefficient is $ t_{0.975, 1} = 12.706 $.
+
+\begin{align*}
+  \overline{T_1} = \frac{1}{2} \sum_{k=1}^{2} {T_1}_k & = 1.0 \\
+  IC_{95\%}(T_1) & = \left[ 1.0 \pm \frac{\sqrt{2.7}}{\sqrt{2}} \cdot 12.706 \right] \\
+  & \approx \left[ 1.0 \pm 14.8 \right]
+\end{align*}
+
+With two simulations, we obtain $ \overline{T_1} = 1.0 $, but the
+interval is too wide to be informative.
+
+
+Observed IC
+
+We then ran $ R = 10^6 $ simulations with $ N = 50 $ different items each.
+With $ 10^6 $ simulations, we obtain $ \overline{X_N} $ (see graph)
+Computation of $ {S_N}^2 $
+Observed IC
+
+
+Same for $ V $.
+
+
+Graph H
+
 \paragraph{Distribution of $ T_i $} We first studied how many items were present
 per bin.
 
+% TODO sim of T_i
+
+We determined the empirical mean to be
+
+\[
+  \overline{T_i} = \frac{1}{20} \sum_{k=1}^{20} {T_i}_k = 1.5 \qquad \forall \, 1 \leq i \leq 20
+\]
+
+
+We can show
+
 \paragraph{Distribution of $ V_i $} We then looked at the size of the first item
 in each bin.
 