This project will apply deep reinforcement learning to the vehicle routing problem (VRP). The VRP is a classic problem in combinatorial optimization and operations research, with direct applications in logistics and supply chain management. A solution to the VRP is a set of routes for a fleet of vehicles that serves a given set of customers, each with its own needs and requirements, while minimizing an objective such as transportation cost or total distance. Each vehicle must start and finish at its own depot. The question this project addresses is how deep reinforcement learning methods can be used to improve the performance of existing heuristic algorithms for the VRP and to achieve comparable or better results on the VRP benchmark dataset.
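To make the problem statement concrete, here is a minimal Python sketch of how a candidate solution could be scored. It is an illustration under stated assumptions, not the project's method: locations are assumed to be 2D points, cost is assumed to be Euclidean distance, and the function names (`route_length`, `total_cost`, `serves_all_customers`) are hypothetical. Customer requirements such as demand or time windows are not modeled here.

```python
import math

# Assumption: every location is a 2D point (x, y) and travel cost is
# Euclidean distance. A real benchmark may define cost differently.

def route_length(depot, stops):
    """Length of one closed route: depot -> stops in order -> depot."""
    path = [depot, *stops, depot]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

def total_cost(depots, routes):
    """Total distance over the fleet; routes[k] belongs to the vehicle at depots[k]."""
    return sum(route_length(d, r) for d, r in zip(depots, routes))

def serves_all_customers(routes, customers):
    """True if every customer appears on exactly one route."""
    visited = [c for route in routes for c in route]
    return sorted(visited) == sorted(customers)

# Hypothetical toy instance: two vehicles, each with its own depot.
depots = [(0, 0), (10, 0)]
customers = [(0, 3), (4, 3), (10, 4)]
routes = [[(0, 3), (4, 3)], [(10, 4)]]

cost = total_cost(depots, routes)
feasible = serves_all_customers(routes, customers)
```

A heuristic or a learned policy would search over `routes` to make `total_cost` as small as possible while keeping the solution feasible.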