Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning

We study a joint pricing and inventory control problem for perishables with positive lead time in a finite-horizon periodic-review system. Unlike most studies, which assume a continuous demand density, we model customer demand as depending on the current period's price and arriving acco...

Bibliographic Details
Main Authors: Rui Wang, Xianghua Gan, Qing Li, Xiao Yan
Format: Article
Language: English
Published: Hindawi-Wiley 2021-01-01
Series: Complexity
Online Access: http://dx.doi.org/10.1155/2021/6643131
id doaj-4b932af868c64a67b17c839af22dce1f
record_format Article
spelling doaj-4b932af868c64a67b17c839af22dce1f 2021-02-15T12:52:52Z eng Hindawi-Wiley Complexity 1076-2787 1099-0526 2021-01-01 2021 10.1155/2021/6643131 6643131
Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning
Rui Wang (0), Xianghua Gan (1), Qing Li (2), Xiao Yan (3)
(0) School of Business Administration, The Southwestern University of Finance and Economics, Chengdu, Sichuan, China
(1) School of Business Administration, The Southwestern University of Finance and Economics, Chengdu, Sichuan, China
(2) China Construction Bank, Hengshui Branch, Hengshui, Hebei, China
(3) School of Business Administration, The Southwestern University of Finance and Economics, Chengdu, Sichuan, China
http://dx.doi.org/10.1155/2021/6643131
collection DOAJ
language English
format Article
sources DOAJ
author Rui Wang
Xianghua Gan
Qing Li
Xiao Yan
title Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning
publisher Hindawi-Wiley
series Complexity
issn 1076-2787
1099-0526
publishDate 2021-01-01
description We study a joint pricing and inventory control problem for perishables with positive lead time in a finite-horizon periodic-review system. Unlike most studies, which assume a continuous demand density, we model customer demand as depending on the current period's price and arriving according to a homogeneous Poisson process. We consider both the backlogging and lost-sales cases, and our goal is to find a joint ordering and pricing policy that maximizes the expected discounted profit over the planning horizon. When there is no fixed ordering cost, we design a deep reinforcement learning algorithm to obtain a near-optimal ordering policy and show that the learned policy exhibits some monotonicity properties. We also show that our deep reinforcement learning algorithm outperforms tabular Q-learning algorithms. When a fixed ordering cost is involved, we show that our deep reinforcement learning algorithm is effective and efficient and circumvents the “curse of dimensionality”.
url http://dx.doi.org/10.1155/2021/6643131
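
To make the setting in the description field concrete, the following is a minimal Python sketch of the kind of environment such a study could simulate: one review period of a perishable inventory system with positive lead time, price-dependent Poisson demand, and lost sales. It is not the authors' model or algorithm. The linear demand curve, the cost parameters, the oldest-first issuing rule, the order of events within a period, and all function names (`poisson_sample`, `demand_rate`, `step`, `simulate`) are assumptions chosen only for illustration.

```python
import math
import random
from collections import deque


def poisson_sample(rate, rng):
    """Draw one Poisson(rate) variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def demand_rate(price, base=20.0, slope=1.5):
    """Hypothetical linear price response: mean demand falls as price rises."""
    return max(base - slope * price, 0.0)


def step(on_hand, pipeline, order_qty, price, rng,
         unit_cost=2.0, holding_cost=0.1, disposal_cost=0.5):
    """Advance one review period in the lost-sales case (assumed event order).

    on_hand  : list of units by remaining shelf life, index 0 expires soonest
    pipeline : deque of outstanding orders, leftmost arrives next
               (its length is the lead time, which must be at least 1)
    Returns (new_on_hand, new_pipeline, profit) without mutating the inputs.
    """
    stock = list(on_hand)
    orders = deque(pipeline)

    # The oldest outstanding order arrives with full shelf life,
    # and the newly placed order joins the end of the pipeline.
    stock[-1] += orders.popleft()
    orders.append(order_qty)

    # Demand is Poisson with a price-dependent mean; unmet demand is lost.
    demand = poisson_sample(demand_rate(price), rng)
    sales = 0
    for i in range(len(stock)):  # issue the oldest units first
        served = min(stock[i], demand - sales)
        stock[i] -= served
        sales += served

    # Units left in the oldest slot expire; everything else ages by one period.
    expired = stock[0]
    stock = stock[1:] + [0]

    profit = (price * sales
              - unit_cost * order_qty
              - holding_cost * sum(stock)
              - disposal_cost * expired)
    return stock, orders, profit


def simulate(horizon, order_qty, price, seed=0,
             shelf_life=3, lead_time=2, gamma=0.95):
    """Discounted profit of a fixed (order_qty, price) policy over the horizon."""
    rng = random.Random(seed)
    on_hand = [0] * shelf_life
    pipeline = deque([0] * lead_time)
    total = 0.0
    for t in range(horizon):
        on_hand, pipeline, profit = step(on_hand, pipeline, order_qty, price, rng)
        total += (gamma ** t) * profit
    return total
```

A call such as `simulate(horizon=20, order_qty=8, price=6.0)` returns the discounted profit of one sample path under a fixed policy. A learning agent would instead choose `order_qty` and `price` each period from the state `(on_hand, pipeline)`, whose size grows quickly with shelf life and lead time; this is the dimensionality issue the description mentions.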