Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning
We study a joint pricing and inventory control problem for perishables with positive lead time in a finite horizon periodic-review system. Unlike most studies, which assume a continuous demand density, we let customer demand depend on the current period's price and arrive acco...
Main Authors: | Rui Wang, Xianghua Gan, Qing Li, Xiao Yan |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi-Wiley 2021-01-01 |
Series: | Complexity |
Online Access: | http://dx.doi.org/10.1155/2021/6643131 |
id |
doaj-4b932af868c64a67b17c839af22dce1f |
---|---|
record_format |
Article |
spelling |
doaj-4b932af868c64a67b17c839af22dce1f; 2021-02-15T12:52:52Z; eng; Hindawi-Wiley; Complexity; 1076-2787; 1099-0526; 2021-01-01; 2021; 10.1155/2021/6643131; 6643131; Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning; Rui Wang (0), Xianghua Gan (1), Qing Li (2), Xiao Yan (3); School of Business Administration, The Southwestern University of Finance and Economics, Chengdu, Sichuan, China; School of Business Administration, The Southwestern University of Finance and Economics, Chengdu, Sichuan, China; China Construction Bank, Hengshui Branch, Hengshui, Hebei, China; School of Business Administration, The Southwestern University of Finance and Economics, Chengdu, Sichuan, China; We study a joint pricing and inventory control problem for perishables with positive lead time in a finite horizon periodic-review system. Unlike most studies, which assume a continuous demand density, we let customer demand depend on the current period's price and arrive according to a homogeneous Poisson process. We consider both backlogging and lost-sales cases, and our goal is to find a joint ordering and pricing policy that maximizes the expected discounted profit over the planning horizon. When no fixed ordering cost is involved, we design a deep reinforcement learning algorithm to obtain a near-optimal ordering policy and show that the learned policy exhibits some monotonicity properties. We also show that our deep reinforcement learning algorithm performs better than tabular Q-learning algorithms. When a fixed ordering cost is involved, we show that our deep reinforcement learning algorithm is effective and efficient and circumvents the “curse of dimensionality”; http://dx.doi.org/10.1155/2021/6643131 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Rui Wang Xianghua Gan Qing Li Xiao Yan |
title |
Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning |
publisher |
Hindawi-Wiley |
series |
Complexity |
issn |
1076-2787 1099-0526 |
publishDate |
2021-01-01 |
description |
We study a joint pricing and inventory control problem for perishables with positive lead time in a finite horizon periodic-review system. Unlike most studies, which assume a continuous demand density, we let customer demand depend on the current period's price and arrive according to a homogeneous Poisson process. We consider both backlogging and lost-sales cases, and our goal is to find a joint ordering and pricing policy that maximizes the expected discounted profit over the planning horizon. When no fixed ordering cost is involved, we design a deep reinforcement learning algorithm to obtain a near-optimal ordering policy and show that the learned policy exhibits some monotonicity properties. We also show that our deep reinforcement learning algorithm performs better than tabular Q-learning algorithms. When a fixed ordering cost is involved, we show that our deep reinforcement learning algorithm is effective and efficient and circumvents the “curse of dimensionality”. |
url |
http://dx.doi.org/10.1155/2021/6643131 |
_version_ |
1714866987717361664 |
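
The system described in the abstract (perishable stock, positive lead time, price-dependent Poisson demand, lost sales or backlogging) can be illustrated with a minimal one-period simulation. This is only a sketch under stated assumptions, not the authors' model or algorithm: the linear demand rate `a - b * price`, FIFO issuing, the cost parameters, the choice to earn revenue when units are served, and all names (`State`, `demand_rate`, `poisson`, `step`) are hypothetical and do not come from the record.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class State:
    on_hand: list      # on_hand[0] = oldest units (expire this period), on_hand[-1] = freshest
    pipeline: list     # pipeline[0] arrives now; its length equals the lead time
    backlog: int = 0   # unmet demand carried over (backlogging case only)


def demand_rate(price, a=20.0, b=2.0):
    # Assumed linear price response; the record does not give the functional form.
    return max(a - b * price, 0.0)


def poisson(rate, rng):
    # Knuth's multiplication method; adequate for the small rates used here.
    threshold = math.exp(-rate)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def step(state, order_qty, price, demand, lost_sales=True,
         unit_cost=1.0, holding=0.1, shortage=2.0, outdate=0.5):
    """Advance one review period; returns (next_state, one_period_profit)."""
    on_hand = list(state.on_hand)
    # The order placed one lead time ago arrives with full shelf life.
    on_hand[-1] += state.pipeline[0]
    pipeline = list(state.pipeline[1:]) + [order_qty]

    # Serve outstanding backlog plus new demand, oldest units first (FIFO).
    need = demand + state.backlog
    sold = 0
    for i in range(len(on_hand)):
        used = min(on_hand[i], need)
        on_hand[i] -= used
        need -= used
        sold += used

    unmet = need
    backlog = 0 if lost_sales else unmet

    # Units left in the oldest age class expire; everything else ages by one period.
    outdated = on_hand[0]
    carried = on_hand[1:] + [0]

    profit = (price * sold - unit_cost * order_qty - holding * sum(carried)
              - shortage * unmet - outdate * outdated)
    return State(carried, pipeline, backlog), profit


if __name__ == "__main__":
    rng = random.Random(0)
    s = State(on_hand=[2, 3], pipeline=[4])
    price = 3.0
    d = poisson(demand_rate(price), rng)
    s, profit = step(s, order_qty=5, price=price, demand=d)
    print(d, s, profit)
```

A reinforcement learning agent of the kind the abstract mentions would repeatedly choose `(order_qty, price)` from the current `State` and observe the returned profit as its reward. Because the state holds one entry per age class and per pipeline slot, a lookup table over states grows quickly with shelf life and lead time.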