tex: NFBP stats
parent
1c6db889a6
commit
4a4531a413
1 changed files with 63 additions and 2 deletions
@ -180,14 +180,75 @@ Mathematically, the NFBP algorithm imposes the following constraint on the first
We implemented the NFBP algorithm in Python\footnotemark{} because it is easy
to use and widely recommended. We used the \texttt{random} library to generate
random numbers between $ 0 $ and $ 1 $ and \texttt{matplotlib} to plot the
results as histograms.

\footnotetext{The code is available in Annex \ref{annex:probabilistic}.}

We estimate $ \mathbb{E}[X] $ and $ \mathbb{E}[V] $ with the empirical mean
$ \overline{X_N} $ and measure the precision of that estimate with the
empirical variance $ {S_N}^2 $. We do this for both $ R = 2 $ and
$ R = 10^6 $ simulations.

\[
    \overline{X_N} = \frac{1}{N} \sum_{i=1}^{N} X_i
\]

As the true variance is unknown, we estimate it with $ {S_N}^2 $ and use this
estimate to build a 95\% confidence interval for the true mean $ m $, with
$ \alpha = 0.05 $.

\begin{align*}
    {S_N}^2 & = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \overline{X_N})^2 \\
    IC_{95\%}(m) & = \left[ \overline{X_N} \pm \frac{S_N}{\sqrt{N}} \cdot t_{1 - \frac{\alpha}{2}, N-1} \right]
\end{align*}
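
As a rough illustration of the large-sample case, assume $ \alpha = 0.05 $ and
take the sample size to be the number of simulations, $ N = R = 10^6 $ (an
assumption made here for the sake of the example). The Student quantile is then
practically equal to the standard normal quantile,
$ t_{0.975, N-1} \approx 1.96 $, and $ \sqrt{N} = 10^3 $, so the interval
reduces to approximately

\[
    IC_{95\%}(m) \approx \left[ \overline{X_N} \pm 1.96 \cdot \frac{S_N}{10^3} \right]
\]

The half-width shrinks as $ 1 / \sqrt{N} $: dividing it by $ 10 $ requires
$ 100 $ times more simulations.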

\paragraph{2 simulations} We first ran $ R = 2 $ simulations to observe the
behavior of the algorithm and to see how imprecise the estimates are with so
few samples.
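
To see why two runs give so little precision, the general formulas can be
specialized to $ N = 2 $ samples $ X_1, X_2 $ (a sketch, assuming the sample
size is the number of simulations):

\begin{align*}
    \overline{X_2} & = \frac{X_1 + X_2}{2} \\
    {S_2}^2 & = \left( X_1 - \overline{X_2} \right)^2 + \left( X_2 - \overline{X_2} \right)^2 = \frac{(X_1 - X_2)^2}{2} \\
    \frac{S_2}{\sqrt{2}} \cdot t_{1 - \frac{\alpha}{2}, 1} & = \frac{|X_1 - X_2|}{2} \cdot t_{1 - \frac{\alpha}{2}, 1}
\end{align*}

With a single degree of freedom the Student quantile is large, so the interval
is wide even when the two observations are close to each other.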

% TODO graph T_i 2 sim

On this graph, we can see each value of $ T_i $. Our calculations yield
$ \overline{T_1} = 1.0 $ and $ {S_N}^2 = 2.7 $. With $ N = 2 $ samples, the
Student coefficient is $ t_{0.975, 1} = 12.706 $.

\begin{align*}
    \overline{T_1} = \frac{1}{2} \sum_{k=1}^{2} T_{1,k} & = 1.0 \\
    IC_{95\%}(T_1) & = \left[ 1.0 \pm \frac{\sqrt{2.7}}{\sqrt{2}} \cdot 12.706 \right] \\
                   & = \left[ 1.0 \pm 14.8 \right]
\end{align*}

With two simulations, we obtain $ \overline{T_1} = 1.0 $.

Observed IC

We then ran $ R = 10^6 $ simulations with $ N = 50 $ different items each.
With $ 10^6 $ simulations, we obtain $ \overline{X_N} $ (see graph).
Computation of $ {S_N}^2 $
Observed IC

Same for $ V $.

Graph of $ H $

\paragraph{Distribution of $ T_i $} We first studied how many items were
present per bin.

% TODO sim of T_i

We determined the empirical mean to be

\[
    \overline{T_i} = \frac{1}{20} \sum_{k=1}^{20} T_k = 1.5 \qquad \forall 1 \leq i \leq 20
\]

We can show

\paragraph{Distribution of $ V_i $} We then looked at the size of the first
item in each bin.