tex: performance analysis

feat: add performance analysis code
tex: add listlistings for code colouring
2023-06-03 16:13:19 +02:00 · 2023-06-03 15:45:37 +02:00 · 2023-06-03 15:45:16 +02:00 · 2023-06-03 15:45:04 +02:00 · 2023-06-03 14:22:45 +02:00 · 2023-06-03 14:00:00 +02:00
4 changed files with 214 additions and 6 deletions
--- a/complexity-analysis/direct.py
+++ b/complexity-analysis/direct.py
@@ -0,0 +1,41 @@
+# importing the memory tracking module
+import tracemalloc
+from random import random
+from math import floor, sqrt
+#from statistics import mean, variance
+from time import perf_counter
+
+# starting the monitoring
+tracemalloc.start()
+
+start_time = perf_counter()
+
+# store memory consumption before
+current_before, peak_before = tracemalloc.get_traced_memory()
+
+N = 10**6
+Tot = 0
+Tot2 = 0
+for _ in range(N):
+    item = random()
+    Tot  += item
+    Tot2 += item ** 2
+mean = Tot / N
+variance = (Tot2 - N * mean**2) / (N - 1)  # unbiased sample variance
+
+# store memory after
+current_after, peak_after = tracemalloc.get_traced_memory()
+
+end_time = perf_counter()
+
+print("mean     :", mean)
+print("variance :", variance)
+
+# displaying the memory usage
+print("Used memory before : {} B (current), {} B (peak)".format(current_before,peak_before))
+print("Used memory after : {} B (current), {} B (peak)".format(current_after,peak_after))
+print("Used memory : {} B".format(peak_after - current_before))
+print("Time : {} ms".format((end_time - start_time) * 1000))
+
+# stopping the library
+tracemalloc.stop()
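
Note on direct.py: the sketch below is one way to check the running-sum approach against the standard library. It is not part of the commit. The function name `streaming_mean_variance`, the fixed seed and the sample size are assumptions made for illustration. It uses the identity from content.tex, $ S_N^2 = \frac{1}{N-1} \sum X_i^2 - \frac{N}{N-1} \overline{X}^2 $, and compares the result with `statistics.variance` on the same data.

```python
from math import isclose
from random import random, seed
from statistics import variance as lib_variance


def streaming_mean_variance(samples):
    """One pass over samples, keeping only the count, sum and sum of squares."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for x in samples:
        n += 1
        total += x
        total_sq += x * x
    mean = total / n
    # Unbiased sample variance: (sum of squares - n * mean^2) / (n - 1)
    variance = (total_sq - n * mean ** 2) / (n - 1)
    return mean, variance


seed(0)  # hypothetical fixed seed, only so the check is repeatable
values = [random() for _ in range(10_000)]
m, v = streaming_mean_variance(values)
print("streaming variance:", v)
print("library variance  :", lib_variance(values))
print("agree:", isclose(v, lib_variance(values), rel_tol=1e-9))
```

The list is built here only so that both methods see the same numbers; in the streaming use case the samples would be consumed one at a time and never stored.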
--- a/complexity-analysis/using_libs.py
+++ b/complexity-analysis/using_libs.py
@@ -0,0 +1,36 @@
+# importing the memory tracking module
+import tracemalloc
+from random import random
+from math import floor, sqrt
+from statistics import mean, variance
+from time import perf_counter
+
+# starting the monitoring
+tracemalloc.start()
+
+start_time = perf_counter()
+
+# store memory consumption before
+current_before, peak_before = tracemalloc.get_traced_memory()
+
+N = 10**6
+values = [random() for _ in range(N)]
+mean = mean(values)
+variance = variance(values)
+
+# store memory after
+current_after, peak_after = tracemalloc.get_traced_memory()
+
+end_time = perf_counter()
+
+print("mean     :", mean)
+print("variance :", variance)
+
+# displaying the memory usage
+print("Used memory before : {} B (current), {} B (peak)".format(current_before,peak_before))
+print("Used memory after : {} B (current), {} B (peak)".format(current_after,peak_after))
+print("Used memory : {} B".format(peak_after - current_before))
+print("Time : {} ms".format((end_time - start_time) * 1000))
+
+# stopping the library
+tracemalloc.stop()
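
Note on both scripts: the measurement boilerplate is duplicated between direct.py and using_libs.py. A minimal sketch of a shared helper is shown below, assuming only the standard `tracemalloc` and `time.perf_counter` calls already used in the commit. The names `measure`, `stored` and `streamed`, and the smaller `N`, are hypothetical and chosen for illustration.

```python
import tracemalloc
from random import random
from time import perf_counter


def measure(fn):
    """Run fn() and return (result, peak traced bytes, elapsed seconds)."""
    tracemalloc.start()
    baseline, _ = tracemalloc.get_traced_memory()
    start = perf_counter()
    result = fn()
    elapsed = perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, peak - baseline, elapsed


def stored(n):
    # Store every value first, then compute the mean.
    values = [random() for _ in range(n)]
    return sum(values) / n


def streamed(n):
    # Keep only a running total; no value is stored.
    total = 0.0
    for _ in range(n):
        total += random()
    return total / n


N = 10**5  # smaller than the 10**6 used in the commit, to keep the sketch quick
_, peak_stored, time_stored = measure(lambda: stored(N))
_, peak_streamed, time_streamed = measure(lambda: streamed(N))
print("stored   : {} B peak, {:.1f} ms".format(peak_stored, time_stored * 1000))
print("streamed : {} B peak, {:.1f} ms".format(peak_streamed, time_streamed * 1000))
```

Timings taken while `tracemalloc` is active include its tracing overhead, so they are only meaningful relative to each other.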
--- a/latex/content.tex
+++ b/latex/content.tex
@@ -80,21 +80,124 @@ by various algorithms minimizing the waste of material.

 \subsection{1D : Networking}

-on which humans
-have decreasing control.
+When managing network traffic at scale, efficiently routing packets is
+necessary to avoid congestion, which leads to lower bandwidth and higher
+latency. Say you are an Internet service provider and your users are watching
+videos on popular streaming platforms. You want to ensure that the traffic is
+balanced between the different routes to minimize throttling and energy
+consumption.

-In this paper, we will focus on one-dimensional bin packing, where we try to
-store items of different heights in a linear container.
+\paragraph{} We can consider the different routes as bins and the users'
+bandwidth as the items. If a bin overflows, we can redirect the traffic to
+another route. Using fewer bins means lower energy consumption and decreased
+operating costs. This is a good example of bin packing in a dynamic
+environment, where the items are constantly changing. Humans are not involved
+in the process, as it is fast-paced and requires a high level of automation.
+
+\vspace{0.4cm}
+
+\paragraph{} We have seen multiple examples of how bin packing algorithms can
+be used in various technical fields. In each example, an algorithm was chosen
+by evaluating the effectiveness and reliability of the process through a
+probabilistic analysis, which allows the algorithm to be adapted to the use
+case. We will now conduct our own analysis and study various algorithms and
+their probabilistic advantages, focusing on one-dimensional bin packing, where
+we try to store items of different heights in a linear bin.
+
+\section{Next Fit Bin Packing algorithm (NFBP)}
+
+Our goal is to study the number of bins $ H_n $ required to store $ n $ items
+for each algorithm. We first consider the Next Fit Bin Packing algorithm, where
+we store each item in the current bin if it fits, otherwise we close it and open a new bin.
+
+\paragraph{} Each bin will have a fixed capacity of $ 1 $ and items
+will be of random sizes between $ 0 $ and $ 1 $. We will run X simulations % TODO
+with 10 packets.
+
+\subsubsection{Variables used in models}


-\section{Next Fit Bin Packing algorithm}

 \cite{hofri:1987}
 % TODO mettre de l'Histoire

 \section{Next Fit Dual Bin Packing algorithm}

-\section{Algorithm comparisons and optimization}
+\section{Complexity and implementation optimization}
+
+The NFBP algorithm has a linear complexity $ O(n) $, as we only need to iterate
+over the items once.
+
+\subsection{Performance optimization}
+
+When implementing the statistical analysis, the intuitive way to do it is to
+run $ R $ simulations, store the results, then conduct the analysis. However,
+when running a large number of simulations, this can be very memory-intensive.
+We can optimize the process by computing the statistics on the fly
+using sum formulae. This uses nearly constant memory, as we only need to
+store the current sum and the current sum of squares for different variables.
+
+While the mean can easily be calculated by summing then dividing, the variance
+can be calculated using the following formula:
+
+\begin{align}
+	{S_N}^2 & = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \overline{X})^2               \\
+	        & = \frac{1}{N-1} \sum_{i=1}^{N} X_i^2 - \frac{N}{N-1} \overline{X}^2
+\end{align}
+
+The sum $ \frac{1}{N-1} \sum_{i=1}^{N} X_i^2 $ can be calculated iteratively
+after each simulation.
+
+\subsection{Effective resource consumption}
+
+We set out to study the resource consumption of the algorithms. We implemented
+the above formulae to calculate the mean and variance of $ N = 10^6 $ random
+numbers. We wrote the following algorithms \footnotemark :
+
+\footnotetext{The full code used to measure performance can be found in Annex X.}
+% TODO annex
+
+\paragraph{Intuitive algorithm} Store values first, calculate later
+
+\begin{lstlisting}[language=python]
+N = 10**6
+values = [random() for _ in range(N)]
+mean = mean(values)
+variance = variance(values)
+\end{lstlisting}
+
+Execution time : $ \sim 4.8 $ seconds
+
+Memory usage : $ \sim 32 $ MB
+
+\paragraph{Improved algorithm} Continuous calculation
+
+\begin{lstlisting}[language=python]
+N = 10**6
+Tot = 0
+Tot2 = 0
+for _ in range(N):
+    item = random()
+    Tot  += item
+    Tot2 += item ** 2
+mean = Tot / N
+variance = (Tot2 - N * mean**2) / (N - 1)
+\end{lstlisting}
+
+Execution time : $ \sim 530 $ milliseconds
+
+Memory usage : $ \sim 1.3 $ kB
+
+\paragraph{Analysis} Memory usage is, as expected, much lower when calculating
+the statistics on the fly. What we had not anticipated is the difference in
+execution time: the improved algorithm is nearly 10 times faster than the
+intuitive one. This can be explained by the time taken to allocate memory and
+then calculate the statistics (which iterates multiple times over the array).
+\footnotemark
+
+\footnotetext{Performance was measured on a single computer and will vary
+	between devices. Execution time and memory usage do not include the import of
+	libraries.}

 \subsection{NFBP vs NFDBP}
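
Note on the NFBP section of content.tex: the text describes Next Fit only in prose and states that it runs in $ O(n) $. A minimal sketch of that rule is given below, assuming bins of capacity 1 and item sizes in $ [0, 1] $ as stated in the text. The function name `next_fit`, the seed and the sample of 10 items are hypothetical and not taken from the repository.

```python
from random import random, seed


def next_fit(items, capacity=1.0):
    """Return the number of bins Next Fit uses for the given item sizes."""
    bins = 0
    remaining = 0.0  # free space in the currently open bin
    for size in items:
        # Open a new bin if none is open yet or the item does not fit.
        if bins == 0 or size > remaining:
            bins += 1
            remaining = capacity
        remaining -= size
    return bins


seed(1)  # hypothetical seed, only to make the run repeatable
items = [random() for _ in range(10)]
print("items :", [round(x, 2) for x in items])
print("bins  :", next_fit(items))
```

Each item is examined once and only the free space of the open bin is kept, which is where the linear running time and constant extra memory come from.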

--- a/latex/main.tex
+++ b/latex/main.tex
@ -13,6 +13,34 @@
 \usepackage{eurosym}
 \usepackage[english]{babel}
 \usepackage{eso-pic} % for background on cover
+\usepackage{listings}
+
+% Define colors for code
+\definecolor{codegreen}{rgb}{0,0.4,0}
+\definecolor{codegray}{rgb}{0.5,0.5,0.5}
+\definecolor{codepurple}{rgb}{0.58,0,0.82}
+\definecolor{backcolour}{rgb}{0.95,0.95,0.92}
+
+\lstdefinestyle{mystyle}{
+    backgroundcolor=\color{backcolour},
+    commentstyle=\color{codegreen},
+    keywordstyle=\color{magenta},
+    numberstyle=\tiny\color{codegray},
+    stringstyle=\color{codepurple},
+    basicstyle=\ttfamily\small,
+    breakatwhitespace=false,
+    breaklines=true,
+    captionpos=b,
+    keepspaces=true,
+    numbers=left,
+    numbersep=5pt,
+    showspaces=false,
+    showstringspaces=false,
+    showtabs=false,
+    tabsize=2
+}
+
+\lstset{style=mystyle}


 % table des annexes
Author	SHA1	Message	Date
Paul ALNET	c536e0b28b	tex: performance analysis	2023-06-03 16:13:19 +02:00
Paul ALNET	184f4ff491	feat: add performance analysis code	2023-06-03 15:45:37 +02:00
Paul ALNET	78617e6130	tex: add listlistings for code colouring	2023-06-03 15:45:16 +02:00
Paul ALNET	c0abf64ee0	tex: move and write performance part	2023-06-03 15:45:04 +02:00
Paul ALNET	4c74dd7877	tex: NFBP	2023-06-03 14:22:45 +02:00
Paul ALNET	29851204fe	tex: add networking part	2023-06-03 14:00:00 +02:00