Publications
Papers
2023
- Combinatorial optimization enriched machine learning to solve the dynamic vehicle routing problem with time windowsTransportation Science, 2023
With the rise of e-commerce and increasing customer requirements, logistics service providers face a new complexity in their daily planning, mainly due to efficiently handling same day deliveries. Existing multi-stage stochastic optimization approaches that allow to solve the underlying dynamic vehicle routing problem are either computationally too expensive for an application in online settings, or – in the case of reinforcement learning – struggle to perform well on high-dimensional combinatorial problems. To mitigate these drawbacks, we propose a novel machine learning pipeline that incorporates a combinatorial optimization layer. We apply this general pipeline to a dynamic vehicle routing problem with dispatching waves, which was recently promoted in the EURO Meets NeurIPS Vehicle Routing Competition at NeurIPS 2022. Our methodology ranked first in this competition, outperforming all other approaches in solving the proposed dynamic vehicle routing problem. With this work, we provide a comprehensive numerical study that further highlights the efficacy and benefits of the proposed pipeline beyond the results achieved in the competition, e.g., by showcasing the robustness of the encoded policy against unseen instances and scenarios.
2022
- Learning with Combinatorial Optimization Layers: A Probabilistic ApproachJul 2022
Combinatorial optimization (CO) layers in machine learning (ML) pipelines are a powerful tool to tackle data-driven decision tasks, but they come with two main challenges. First, the solution of a CO problem often behaves as a piecewise constant function of its objective parameters. Given that ML pipelines are typically trained using stochastic gradient descent, the absence of slope information is very detrimental. Second, standard ML losses do not work well in combinatorial settings. A growing body of research addresses these challenges through diverse methods. Unfortunately, the lack of well-maintained implementations slows down the adoption of CO layers. In this paper, building upon previous works, we introduce a probabilistic perspective on CO layers, which lends itself naturally to approximate differentiation and the construction of structured losses. We recover many approaches from the literature as special cases, and we also derive new ones. Based on this unifying perspective, we present InferOpt.jl, an open-source Julia package that 1) allows turning any CO oracle with a linear objective into a differentiable layer, and 2) defines adequate losses to train pipelines containing such layers. Our library works with arbitrary optimization algorithms, and it is fully compatible with Julia’s ML ecosystem. We demonstrate its abilities using a pathfinding problem on video game maps.
Conference talks
2023
- PGMO 2023Decision-Aware Learning with REINFORCE enables new Architectures and ApplicationsLéo Baty, Louis Bouvier, Matieu Blondel, and 4 more authorsDec 2023
ABSTRACT. Using Combinatorial Optimization (CO) layers in Machine Learning (ML) pipelines is a recent area of interest (Kotary et al. 2021). These new hybrid methods provide fast differentiable heuristics for difficult CO problems. For instance, it can be used to tackle hard stochastic transportation problems, such as the stochastic vehicle scheduling or the dynamic vehicle routing problems. However, current state-of-the-art methods face several learning limitations: most of them only support CO layers with linear objective, or need (near) optimal solutions to imitate in a supervised learning way. This limits a lot the possible applications and pipeline architectures for these methods. We introduce what we call perturbed oracles, which can transform any CO oracle f into a distribution on the feasible solution space. This is done by perturbing the input θ of f using a distribution p(η|θ). It then turns the learning problem into a moment problem with non-zero gradients. Meaningful gradients can be computed by using the log-derivative/ REINFORCE trick, a well-known technique in ML and Reinforcement Learning. This flexible approach unlocks new applications and architectures, such as learning constraints for the two-stage knapsack, and learning to sparsify graphs to scale the stochastic vehicle scheduling problem on large instances. Our numerical experiments show the efficiency of the approach.
- ROADEF 2023Winning Approach for the EURO-NeurIPS 2022 Dynamic Vehicle Routing CompetitionFeb 2023
The EURO meets NeurIPS 2022 Vehicle Routing Competition (see https://euro-neurips-vrp-2022.challenges.ortec.com/) focuses on the usual static Capacitated Vehicle Routing Problem with Time Windows (VRPTW), as well as a dynamic variant which is the focus of this paper. The objective of the dynamic VRPTW is to build routes for a fleet of capacitated vehicles, in order to serve all customer requests within their given time windows. The objective is to minimize the total travel time. However, requests are not known in advance: they arrive continuously during the day. Every hour, the decision maker chooses which requests to dispatch and builds routes to serve them. Each request must be served before the end of its time window. A request whose time window allows serving it during the next hour can be postponed. Once a route has been built, it cannot be modified. The static VRPTW has been extensively studied in the literature, including exact approaches, as well as heuristic ones. This can be seen in the results of the challenge: the objective of the algorithm proposed by the 5th team on the static problem is 0.04% worse than the best solution found. The question of building an efficient heuristic for the dynamic problem remains far more open: the policy proposed by the 5th team is 4% worse than the best proposed. Our main contribution is a policy for the dynamic VRPTW. It relies on a Deep Learning pipeline with a prize collecting VRPTW combinatorial optimization layer. It ranked first of the competition. This pipeline requires a subroutine for solving the prize collecting VRPTW, for which we introduce the prize collecting hybrid genetic search, a variant of the hybrid genetic search adapted for the prize collecting VRPTW. As a side contribution, we implemented an extension of the InferOpt.jl Julia library with a generic support for generalized combinatorial optimization oracles, and in particular for training our pipeline.
- AIRO Young 2023Winning Approach for the EURO-NeurIPS 2022 Dynamic Vehicle Routing CompetitionFeb 2023
2022
- PGMO 2022Structured learning for vehicle routing problemsNov 2022
ABSTRACT. Vehicle routing and vehicle scheduling problems are extensively studied in the literature, and extremely efficient algorithms are now available to tackle them. Both exact and heuristic approaches exist to solve these pure vehicle routing and scheduling problems. For example, the Vehicle Scheduling Problem (VSP) can be exactly solved by a flow linear program, and the Capacitated Vehicle Routing Problem (CVRP) can be exactly solved using Branch-and-Price, or heuristically on larger instances using the state-of-the-art Hybrid Genetic Search (HGS). However, algorithms for variations of these pure problems are often not as efficient, and have difficulty to scale on large instances. An example of this setting is the Stochastic Vehicle Scheduling Problem (StoVSP), a variation of the VSP where travel and task times are randomly perturbed after the vehicle routes are chosen, the objective beeing to minimize the average total delay of the tasks. StoVSP can be solved with column generation on instances with up to 25 tasks, but then fails to scale and generalize to larger instance sizes. Another example is the Dynamic Capacitated Vehicle Routing Problem with Time Windows (Dynamic-CVRPTW), a multi-stage variation of the CVRP, for which the HGS does not apply anymore. Using pure vehicle routing and scheduling problem algorithms as Combinatorial Optimization (CO) layers in Machine Learning (ML) pipelines can be a way to generate good algorithms for their variations. We present such hybrid pipelines for the two variations presented above, implemented by leveraging InferOpt.jl, our open source Julia package gathering several state-of-the-art methods for integrating CO algorithms in ML pipelines. For the StoVSP, we learn a pipeline containing a VSP flow algorithm, and are able to generalize and obtain solutions with small gaps on instances with up to 1000 tasks. For the Dynamic-CVRPTW, we learn a pipeline containing a modified HGS as a CO layer, and use it at each stage to retrieve a good policy.
- JuliaCon 2022InferOpt.jl: Combinatorial Optimization in ML PipelinesGuillaume Dalle, Louis Bouvier, and Léo BatyJul 2022
We present InferOpt.jl, a generic package for combining combinatorial optimization algorithms with machine learning models. It has two purposes: 1) Increasing the expressivity of learning models thanks to new types of structured layers. 2) Increasing the efficiency of optimization algorithms thanks to an additional inference step. Our library provides wrappers for several state-of-the-art methods in order to make them compatible with Julia’s automatic differentiation ecosystem.
- ROADEF 2022Replanification ferroviaire en temps réel par résolution hybride IA/ROLéo BatyIn 23ème congrès annuel de la Société Française de Recherche Opérationnelle et d’Aide à la Décision, Feb 2022
Mon stage de master 2 et de fin d’études s’est déroulé entre le 6 avril et le 1er octobre 2021 à SNCF Direction Innovation et Recherche, sous la direction de Hugo Belhomme. Ce stage s’inscrit dans le cadre d’un projet avec Transilien, qui a pour but de fournir un outil d’aide à la décision qui propose des scénarios de replanification lors des perturbations sur le réseau. L’objectif de mon stage était de s’intéresser à l’utilisation des données d’historiques à disposition afin d’expérimenter des méthodes hybrides entre apprentissage et recherche opérationnelle, afin de déterminer si cela peut aider à l’amélioration des algorithmes d’optimisation de l’outil. Nous avons expérimenté des modèles de Graph Neural Networks utilisant les données d’historiques de plans de transport, afin de guider les heuristiques d’optimisation.