\sectionnn{Introduction} Bin packing is the process of packing a set of items of different sizes into containers of a fixed capacity in a way that minimizes the number of containers used. This has applications in many fields, such as logistics, where we want to optimize the storage and transport of items in boxes, containers, trucks, etc. Building mathematical models for bin packing is useful for understanding the problem and for designing better algorithms, depending on the use case. An algorithm optimized for packing cubes into boxes will not perform as well when packing long items into trucks. Studying the mathematics behind algorithms gives us a better understanding of what works best. When operating at scale, every small detail can have a huge impact on overall efficiency and cost. Therefore, carefully developing algorithms based on solid mathematical models is crucial. As we have seen in our Automatics class, a small logic flaw can become a serious issue in the long run in systems that are supposed to run autonomously. This can be avoided by using mathematical models during the design process, which leads to better choices reconciling economic and reliability concerns. We will conduct a probabilistic analysis of multiple algorithms and compare the results to theoretical values. We will also consider the algorithms' complexity and performance, both in resource consumption and in box usage. \clearpage \section{Bin packing use cases} Before studying the mathematics behind bin packing algorithms, let us look at the motivations behind this project. Bin packing has applications in many fields and allows us to automate and optimize complex systems. We will illustrate this with examples focusing on two use cases: logistics and computer science. We will consider examples in multiple dimensions to show the versatility of bin packing algorithms.
\paragraph{} Nowadays, an effective supply chain relies on automated production, thanks to sensors and actuators installed along conveyor belts, and it is often necessary to implement a packing procedure. All of this is controlled by a computer system running continuously. \subsection{3D: Containers} Storing items in containers is a prime application of bin packing. These three-dimensional objects of standardized size are used to transport goods. While the dimensions of the containers are predictable, those of the transported items are not. Storage is further complicated by the fact that there can be voids between items, allowing them to move around. Multiple types of items can also be stored in the same container. There are many ways to optimize the storage of items in containers: for example, by ensuring items are of an optimal standardized size, or by dedicating each container to a specific item, both of which eliminate the randomness in item size. In these settings, it is easy to fill a container by treating the items as rectangular blocks. However, when items come in pseudo-random dimensions, it is intuitive to start filling the container with larger items and then fill the remaining gaps with smaller items. As containers must be closed, in the event of an overflow the remaining items must be stored in another container. \subsection{2D: Cutting stock problem} In industries such as woodworking, bin packing algorithms are used to minimize material waste when cutting large planks into smaller pieces of desired sizes. Many tools use this two-dimensional cutting process. For example, at the Fabric'INSA Fablab, the milling machine, the laser cutter and many other tools are used to cut large planks of wood into smaller pieces for student projects. In this scenario, we try to arrange the desired cuts in a way that minimizes the unusable excess wood.
\begin{figure}[ht] \centering \includegraphics[width=0.65\linewidth]{graphics/fraiseuse.jpg} \caption[]{Milling machine at the Fabric'INSA Fablab \footnotemark} \label{fig:fraiseuse} \end{figure} \footnotetext{Photo courtesy of Inés Bafaluy} The placement of items of complex shapes can be optimized by various algorithms that minimize the waste of material. \subsection{1D: Networking} When managing network traffic at scale, efficient packet routing is necessary to avoid congestion, which leads to lower bandwidth and higher latency. Say you are an internet service provider and your users are watching videos on popular streaming platforms. You want to ensure that the traffic is balanced across the different routes to minimize throttling and energy consumption. \paragraph{} We can consider the different routes as bins and the users' bandwidth demands as the items. If a bin overflows, we can redirect the traffic to another route. Using fewer bins means lower energy consumption and decreased operating costs. This is a good example of bin packing in a dynamic environment, where the items are constantly changing. Humans are not involved in the process, as it is fast-paced and requires a high level of automation. \vspace{0.4cm} \paragraph{} We have seen multiple examples of how bin packing algorithms can be used in various technical fields. In each of these examples, a choice was made by evaluating the effectiveness and reliability of the process, based on a probabilistic analysis that allows the algorithm to be adapted to the use case. We will now conduct our own analysis and study various algorithms and their probabilistic properties, focusing on one-dimensional bin packing, where we try to store items of different heights in a linear bin. \section{Next Fit Bin Packing algorithm (NFBP)} Our goal is to study the number of bins $ H_n $ required to store $ n $ items for each algorithm.
We first consider the Next Fit Bin Packing algorithm, where we store each item in the current bin if it fits, and otherwise open a new bin. \paragraph{} Each bin has a fixed capacity of $ 1 $, and items have random sizes between $ 0 $ and $ 1 $. We will run X simulations with 10 packets. % TODO
\subsubsection{Variables used in models} \cite{hofri:1987} % TODO add some historical background
\section{Next Fit Dual Bin Packing algorithm} \section{Complexity and implementation optimization} The NFBP algorithm has linear complexity $ O(n) $, as we only need to iterate over the items once. \subsection{Performance optimization} When implementing the statistical analysis, the intuitive approach is to run $ R $ simulations, store the results, then conduct the analysis. However, when running a large number of simulations, this can consume a lot of memory. We can optimize the process by computing the statistics on the fly, using sum formulae. This uses nearly constant memory, as we only need to store the current sum and the current sum of squares for the different variables. While the mean can easily be calculated by summing then dividing, the variance can be calculated using the following formula:
\begin{align}
{S_N}^2 & = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \overline{X})^2 \\
& = \frac{1}{N-1} \left( \sum_{i=1}^{N} X_i^2 - N \overline{X}^2 \right) \\
& = \frac{1}{N-1} \sum_{i=1}^{N} X_i^2 - \frac{N}{N-1} \overline{X}^2
\end{align}
The sum $ \frac{1}{N-1} \sum_{i=1}^{N} X_i^2 $ can be updated iteratively after each simulation. \subsection{Effective resource consumption} We set out to study the resource consumption of the algorithms. We implemented the above formulae to calculate the mean and variance of $ N = 10^6 $ random numbers.
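\paragraph{} The Next Fit rule introduced at the beginning of this section can itself be sketched in a few lines of Python. This is an illustrative sketch of the rule only: the function name \texttt{next\_fit} and its structure are ours, not the exact simulation code used for the measurements below.
\begin{lstlisting}[language=python]
from random import random

def next_fit(items, capacity=1.0):
    # Count the bins used by the Next Fit rule: keep a single open
    # bin, and open a new one whenever the next item does not fit.
    bins = 0
    level = capacity  # forces a bin to be opened for the first item
    for size in items:
        if level + size <= capacity:
            level += size  # the item fits in the currently open bin
        else:
            bins += 1      # open a new bin for this item
            level = size
    return bins

# H_n for n = 10 items of random size between 0 and 1
items = [random() for _ in range(10)]
print(next_fit(items))
\end{lstlisting}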
We wrote the following algorithms\footnotemark: \footnotetext{The full code used to measure performance can be found in Annex \ref{annex:performance}.} % TODO annex
\paragraph{Intuitive algorithm} Store all values first, then calculate:
\begin{lstlisting}[language=python]
from random import random
from statistics import mean, variance

N = 10**6
values = [random() for _ in range(N)]  # stores all N values in memory
m = mean(values)      # each call iterates over the whole list again
v = variance(values)
\end{lstlisting}
Execution time: $ 4.8 $ seconds. Memory usage: $ 32 $ MB. \paragraph{Improved algorithm} Continuous calculation:
\begin{lstlisting}[language=python]
from random import random

N = 10**6
Tot = 0.0   # running sum of the values
Tot2 = 0.0  # running sum of the squared values
for _ in range(N):
    item = random()
    Tot += item
    Tot2 += item ** 2
mean = Tot / N
variance = Tot2 / (N-1) - N / (N-1) * mean**2
\end{lstlisting}
Execution time: $ 530 $ milliseconds. Memory usage: $ 1.3 $ kB. \paragraph{Analysis} Memory usage is, as expected, much lower when calculating the statistics on the fly. Furthermore, something we had not anticipated is the execution time: the improved algorithm is nearly 10 times faster than the intuitive one. This can be explained by the time taken to allocate memory and then calculate the statistics (which iterates multiple times over the array). \footnotemark \footnotetext{Performance was measured on a single computer and will vary between devices. Execution time and memory usage do not include the import of libraries.} \subsection{NFBP vs NFDBP} \subsection{Optimal algorithm} \cite{bin-packing-approximation:2022} \sectionnn{Conclusion}