{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Data analysis: Velib" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Author: O. Roustant, INSA Toulouse. February 2022.\n", "\n", "\n", "We consider the ‘Vélib’ data set, related to the bike sharing system of Paris. The data are loading profiles of the bike stations over one week, collected every hour, from the period Monday 1st Sept. - Sunday 7th Sept., 2014. The loading profile of a station, or simply loading, is defined as the ratio of the number of available bikes to the number of bike docks. A loading of 1 means that the station is fully loaded, i.e. all bikes are available. A loading of 0 means that the station is empty, i.e. all bikes have been rented.\n", "\n", "From the viewpoint of data analysis, the individuals are the stations. The variables are the 168 time steps (hours in the week). The aim is to detect clusters in the data, corresponding to common customer usages. This clustering should then be used to predict the loading profile." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Loading required package: MASS\n", "\n", "Loading required package: fda\n", "\n", "Loading required package: splines\n", "\n", "Loading required package: Matrix\n", "\n", "Loading required package: fds\n", "\n", "Loading required package: rainbow\n", "\n", "Loading required package: pcaPP\n", "\n", "Loading required package: RCurl\n", "\n", "Loading required package: deSolve\n", "\n", "\n", "Attaching package: ‘fda’\n", "\n", "\n", "The following object is masked from ‘package:graphics’:\n", "\n", " matplot\n", "\n", "\n", "Loading required package: elasticnet\n", "\n", "Loading required package: lars\n", "\n", "Loaded lars 1.2\n", "\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "velib package:funFEM R Documentation\n", "\n", "The Vélib data set\n", "\n", "Description:\n", "\n", " This data set contains data from the bike sharing system of Paris,\n", " called Vélib. The data are loading profiles of the bike stations\n", " over one week. The data were collected every hour during the\n", " period Sunday 1st Sept. 
- Sunday 7th Sept., 2014.\n", "\n", "Usage:\n", "\n", " data(velib)\n", " \n", "Format:\n", "\n", " The format is:\n", "\n", " - data: the loading profiles (nb of available bikes / nb of bike\n", " docks) of the 1189 stations at 181 time points.\n", "\n", " - position: the longitude and latitude of the 1189 bike stations.\n", "\n", " - dates: the download dates.\n", "\n", " - bonus: indicates if the station is on a hill (bonus = 1).\n", "\n", " - names: the names of the stations.\n", "\n", "Source:\n", "\n", " The real time data are available at\n", " https://developer.jcdecaux.com/ (with an api key).\n", "\n", "References:\n", "\n", " The data were first used in C. Bouveyron, E. Côme and J. Jacques,\n", " The discriminative functional mixture model for the analysis of\n", " bike sharing systems, Preprint HAL n.01024186, University Paris\n", " Descartes, 2014.\n", "\n", "Examples:\n", "\n", " data(velib)\n", " matplot(t(velib$data[1:5,]),type='l',lty=1,col=2:5,xaxt='n',lwd=2,ylim=c(0,1))\n", " axis(1,at=seq(5,181,6),labels=velib$dates[seq(5,181,6)],las=2)\n", " " ] } ], "source": [ "rm(list = ls()) # erase everything, start from scratch!\n", "\n", "# load the data from package funFEM\n", "\n", "library(\"funFEM\")\n", "data(velib)\n", "help(\"velib\")\n", "#str(velib)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Station | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | ⋯ | 159 | 160 | 161 | 162 | 163 | 164 | 165 | 166 | 167 | 168 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
EURYALE DEHAYNIN | 0.03846154 | 0.03846154 | 0.07692308 | 0.03846154 | 0.03846154 | 0.03846154 | 0.03846154 | 0.03846154 | 0.10714286 | 0.00000000 | ⋯ | 0.29629630 | 0.11111111 | 0.1111111 | 0.14814815 | 0.30769231 | 0.07692308 | 0.11538462 | 0.07692308 | 0.1538462 | 0.1538462 |
LEMERCIER | 0.47826087 | 0.47826087 | 0.47826087 | 0.43478261 | 0.43478261 | 0.43478261 | 0.43478261 | 0.43478261 | 0.26086957 | 0.04347826 | ⋯ | 0.04347826 | 0.00000000 | 0.2173913 | 0.13043478 | 0.04545455 | 0.17391304 | 0.17391304 | 0.17391304 | 0.2608696 | 0.3913043 |
MEZIERES RENNES | 0.21818182 | 0.14545455 | 0.12727273 | 0.10909091 | 0.10909091 | 0.10909091 | 0.09090909 | 0.09090909 | 0.05454545 | 0.10909091 | ⋯ | 0.25925926 | 0.25925926 | 0.2037037 | 0.12962963 | 0.14814815 | 0.29629630 | 0.31481481 | 0.37037037 | 0.3703704 | 0.4074074 |
FARMAN | 0.95238095 | 0.95238095 | 0.95238095 | 0.95238095 | 0.95238095 | 0.95238095 | 0.95238095 | 1.00000000 | 1.00000000 | 1.00000000 | ⋯ | 1.00000000 | 1.00000000 | 0.9047619 | 0.85714286 | 0.85714286 | 0.85714286 | 0.76190476 | 0.76190476 | 0.7619048 | 0.7619048 |
QUAI DE LA RAPEE | 0.92753623 | 0.81159420 | 0.73913043 | 0.72463768 | 0.72463768 | 0.72463768 | 0.72463768 | 0.72463768 | 0.75362319 | 0.97101449 | ⋯ | 0.22727273 | 0.45454545 | 0.5909091 | 0.83333333 | 1.00000000 | 0.81818182 | 0.63636364 | 0.71212121 | 0.6212121 | 0.5757576 |
CHOISY POINT D'IVRY | 0.16666667 | 0.16666667 | 0.16666667 | 0.16666667 | 0.16666667 | 0.16666667 | 0.16666667 | 0.16666667 | 0.08333333 | 0.00000000 | ⋯ | 0.34782609 | 0.08695652 | 0.1153846 | 0.08695652 | 0.13043478 | 0.08695652 | 0.08695652 | 0.43478261 | 0.3913043 | 0.5217391 |