Powerful-IoU: More straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism

doi:10.1016/j.neunet.2023.11.041

Title	Powerful-IoU: More straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism
Authors	Liu, Can 1,2; Wang, Kaige 3; Li, Qing 1; Zhao, Fazhan 1; Zhao, Kun 4; Ma, Hongtu 4
Journal	NEURAL NETWORKS
Publication date	2024-02-01
Volume	170; Pages: 276-284
Keywords	Object detection; Bounding box regression; Loss function design; Focusing mechanism
ISSN	0893-6080
DOI	10.1016/j.neunet.2023.11.041
Corresponding author	Liu, Can (canliu@whu.edu.cn)
Abstract	Bounding box regression (BBR) is one of the core tasks in object detection, and the BBR loss function significantly impacts its performance. However, we have observed that existing IoU-based loss functions suffer from unreasonable penalty factors, leading to anchor boxes expanding during regression and significantly slowing down convergence. To address this issue, we intensively analyzed the reasons for anchor box enlargement. In response, we propose a Powerful-IoU (PIoU) loss function, which combines a target size-adaptive penalty factor and a gradient-adjusting function based on anchor box quality. The PIoU loss guides anchor boxes to regress along efficient paths, resulting in faster convergence than existing IoU-based losses. Additionally, we investigate the focusing mechanism and introduce a non-monotonic attention layer that was combined with PIoU to obtain a new loss function PIoU v2. PIoU v2 loss enhances the capability to focus on anchor boxes of medium quality. By incorporating PIoU v2 into popular object detectors such as YOLOv8 and DINO, we achieved an increase in average precision (AP) and improved performance compared to their original loss functions on the MS COCO and PASCAL VOC datasets, thus validating the effectiveness of our proposed improvement strategies.
Funding project	National Key Research and Development Program of China [2021YFB3100904]
WOS keywords	OBJECT DETECTION
WOS research areas	Computer Science; Neurosciences & Neurology
Language	English
Publisher	PERGAMON-ELSEVIER SCIENCE LTD
WOS accession number	WOS:001125595300001
Funding organization	National Key Research and Development Program of China
Content type	Journal article
Source URL	[http://ir.ia.ac.cn/handle/173211/54937]
Collection	Laboratory of Brain Atlas and Brain-Inspired Intelligence
Author affiliations	1. Chinese Acad Sci, Inst Microelect, Beijing 100029, Peoples R China; 2. Univ Chinese Acad Sci, Sch Integrated Circuits, Beijing 100020, Peoples R China; 3. China Acad Aerosp Sci & Innovat, Beijing 100048, Peoples R China; 4. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Recommended citation (GB/T 7714)	Liu, Can,Wang, Kaige,Li, Qing,et al. Powerful-IoU: More straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism[J]. NEURAL NETWORKS,2024,170:276-284.
APA	Liu, Can,Wang, Kaige,Li, Qing,Zhao, Fazhan,Zhao, Kun,&Ma, Hongtu.(2024).Powerful-IoU: More straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism.NEURAL NETWORKS,170,276-284.
MLA	Liu, Can,et al."Powerful-IoU: More straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism".NEURAL NETWORKS 170(2024):276-284.