Advancing Fine-Grained Ship Detection: An Enhanced End-to-End Approach
Research article
Authors: Shuo Mao, Pengfei Leng, and Chunhui Liu
CVIPPR '24: Proceedings of the 2024 2nd Asia Conference on Computer Vision, Image Processing and Pattern Recognition
April 2024
Article No.: 12, Pages 1–5
Published: 27 June 2024
Abstract
ShipRSImageNet is a large-scale, fine-grained ship detection dataset designed specifically for high-resolution optical remote sensing images. Most existing methods rely on traditional object detection pipelines that require Non-Maximum Suppression (NMS) post-processing. DETR (DEtection TRansformer) introduced an end-to-end approach to object detection, and Sparse R-CNN improved upon it, but it still suffers from low accuracy and slow convergence.
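The NMS post-processing that end-to-end detectors such as DETR and Sparse R-CNN avoid can be sketched as follows; this is a minimal greedy implementation for illustration only, not code from the paper:

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS over [x1, y1, x2, y2] boxes; returns kept indices.

    Repeatedly keeps the highest-scoring box and suppresses all
    remaining boxes whose IoU with it exceeds the threshold.
    """
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # Intersection rectangle between the kept box and the rest
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thresh]  # drop heavily overlapping boxes
    return keep
```

For densely packed targets such as ships in harbors, this greedy suppression can discard true detections whose boxes overlap, which is one motivation for NMS-free, end-to-end designs.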
In this paper, we propose an Enhanced Sparse R-CNN that is fully end-to-end. We introduce an Adaptive Proposal Network that generates high-quality proposal boxes and proposal features, and a Prediction Update Network that refines the detection boxes produced by Sparse R-CNN. Experiments on the ShipRSImageNet dataset demonstrate the superiority of our method for fine-grained ship detection in high-resolution optical remote sensing images. Our approach not only improves detection accuracy but also handles complex environments, small ships, and fine-grained classification effectively. We believe this research has practical implications for deep-learning-based object detection in high-resolution optical remote sensing images.
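The Adaptive Proposal Network builds on Sparse R-CNN's learnable proposals. Its implementation is not given here, so the following is only a rough NumPy sketch of the underlying Sparse R-CNN idea it extends: a fixed set of learnable proposal boxes with paired proposal features, refined stage by stage. All sizes and the delta parameterization are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; Sparse R-CNN uses e.g. 100-300 proposals of dim 256.
num_proposals, feat_dim = 100, 256

# Learnable proposal boxes in normalized (cx, cy, w, h) form, initialized
# to cover the whole image, plus one learnable feature vector per box.
proposal_boxes = np.tile([0.5, 0.5, 1.0, 1.0], (num_proposals, 1))
proposal_feats = rng.normal(size=(num_proposals, feat_dim))

def refine(boxes, deltas):
    """Apply (dx, dy, dw, dh) deltas to (cx, cy, w, h) boxes, as in the
    iterative refinement stages of a Sparse R-CNN-style cascade."""
    cx = boxes[:, 0] + deltas[:, 0] * boxes[:, 2]
    cy = boxes[:, 1] + deltas[:, 1] * boxes[:, 3]
    w = boxes[:, 2] * np.exp(deltas[:, 2])  # exp keeps width/height positive
    h = boxes[:, 3] * np.exp(deltas[:, 3])
    return np.stack([cx, cy, w, h], axis=1)

# One toy refinement stage: in the real model the deltas come from a head
# that fuses pooled image features with the per-proposal features.
deltas = 0.01 * rng.normal(size=(num_proposals, 4))
refined = refine(proposal_boxes, deltas)
```

Because every proposal keeps its own feature vector through the cascade, each stage can specialize its refinement per box, which is the property the paper's proposal- and prediction-update networks appear to exploit.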
Index Terms
Advancing Fine-Grained Ship Detection: An Enhanced End-to-End Approach
Computing methodologies
Artificial intelligence
Computer vision
Computer vision tasks
Visual content-based indexing and retrieval
Published In
CVIPPR '24: Proceedings of the 2024 2nd Asia Conference on Computer Vision, Image Processing and Pattern Recognition, April 2024, 373 pages
ISBN: 9798400716607
DOI: 10.1145/3663976
Copyright © 2024 ACM.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
- end-to-end detection
- fine-grained detection
Qualifiers
- Research-article
- Research
- Refereed limited
Conference
CVIPPR 2024
CVIPPR 2024: 2024 2nd Asia Conference on Computer Vision, Image Processing and Pattern Recognition
April 26 - 28, 2024
Xiamen, China
Acceptance Rates
Overall Acceptance Rate 14 of 38 submissions, 37%