Your browser doesn't support javascript.
A deep reinforcement learning algorithm for the rectangular strip packing problem.
Fang, Jie; Rao, Yunqing; Shi, Mingliang.
  • Fang J; State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, China.
  • Rao Y; State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, China.
  • Shi M; Laboratory of Multimedia Stream Computing and Storage, Huazhong University of Science and Technology, Wuhan, China.
PLoS One ; 18(3): e0282598, 2023.
Article in English | MEDLINE | ID: covidwho-2258705
ABSTRACT
As a branch of the two-dimensional (2D) optimal blanking problem, rectangular strip packing is a typical non-deterministic polynomial (NP-hard) problem. The classical packing solution method relies on heuristic and metaheuristic algorithms. Usually, it needs to be designed with manual decisions to guide the solution, resulting in a small solution scale, weak generalization, and low solution efficiency. Inspired by deep learning and reinforcement learning, combined with the characteristics of rectangular piece packing, a novel algorithm based on deep reinforcement learning is proposed in this work to solve the rectangular strip packing problem. The pointer network with an encoder and decoder structure is taken as the basic network for the deep reinforcement learning algorithm. A model-free reinforcement learning algorithm is designed to train network parameters to optimize the packing sequence. This design can not only avoid designing heuristic rules separately for different problems but also use the deep networks with self-learning characteristics to solve different instances more widely. At the same time, a piece positioning algorithm based on the maximum rectangles bottom-left (Maxrects-BL) is designed to determine the placement position of pieces on the plate and calculate model rewards and packing parameters. Finally, instances are used to analyze the optimization effect of the algorithm. The experimental results show that the proposed algorithm can produce three better and five comparable results compared with some classical heuristic algorithms. In addition, the calculation time of the proposed algorithm is less than 1 second in all test instances, which shows a good generalization, solution efficiency, and practical application potential.
Subject(s)

Full text: Available Collection: International databases Database: MEDLINE Main subject: Reinforcement, Psychology / Algorithms Type of study: Guideline / Prognostic study Language: English Journal: PLoS One Journal subject: Science / Medicine Year: 2023 Document Type: Article Affiliation country: Journal.pone.0282598

Similar

MEDLINE

...
LILACS

LIS


Full text: Available Collection: International databases Database: MEDLINE Main subject: Reinforcement, Psychology / Algorithms Type of study: Guideline / Prognostic study Language: English Journal: PLoS One Journal subject: Science / Medicine Year: 2023 Document Type: Article Affiliation country: Journal.pone.0282598