3DGTN: 3D Dual-Attention GLocal Transformer Network for Point Cloud Classification and Segmentation

Lu, Dening; Gao, Kyle; Xie, Qian; Xu, Linlin; Li, Jonathan

arXiv:2209.11255 (cs)

[Submitted on 21 Sep 2022 (v1), last revised 31 May 2023 (this version, v2)]

Abstract: Although Transformers have achieved significant progress in 3D point cloud processing, existing 3D Transformer methods still struggle to learn both global and local features efficiently and accurately. This paper presents a point cloud representation learning network, called the 3D Dual Self-attention Global Local (GLocal) Transformer Network (3DGTN), for improved feature learning in both classification and segmentation tasks, with the following key contributions. First, a GLocal Feature Learning (GFL) block with a dual self-attention mechanism (i.e., a novel Point-Patch Self-Attention, called PPSA, and a channel-wise self-attention) is designed to efficiently learn GLocal context information. Second, the GFL block is integrated with a multi-scale Graph Convolution-based Local Feature Aggregation (LFA) block, yielding a Global-Local (GLocal) information extraction module that can efficiently capture critical information. Third, a series of GLocal modules is used to construct a new hierarchical encoder-decoder structure that learns GLocal information at different scales. The proposed framework is evaluated on both classification and segmentation datasets, demonstrating that it can outperform many state-of-the-art methods on both tasks.
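The abstract names two attention branches but does not give their formulas. The following is only a rough sketch of what a dual attention over point features could look like, not the authors' implementation. It assumes, hypothetically, that the point-patch branch lets each of N points attend to M patch features, and that the channel-wise branch builds a C x C attention map over feature channels. The function names, the absence of learned projections, and the residual sum are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def point_patch_attention(points, patches):
    """Hypothetical point-to-patch attention.

    points:  (N, C) per-point features, used as queries.
    patches: (M, C) per-patch features, used as keys and values.
    Returns (N, C): each point becomes a weighted mix of patch features.
    """
    c = points.shape[1]
    scores = points @ patches.T / np.sqrt(c)   # (N, M) similarity map
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ patches

def channel_attention(feats):
    """Hypothetical channel-wise self-attention.

    feats: (N, C). The attention map is C x C, so its size is
    independent of the number of points N.
    """
    n = feats.shape[0]
    scores = feats.T @ feats / np.sqrt(n)      # (C, C) channel similarity
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return feats @ weights.T                   # (N, C)

def dual_attention(points, patches):
    # Assumption: the two branches are combined with a residual sum.
    return points + point_patch_attention(points, patches) + channel_attention(points)
```

For example, with `points` of shape (8, 4) and `patches` of shape (3, 4), `dual_attention(points, patches)` returns an array of shape (8, 4). The point-patch map costs O(N·M) rather than the O(N²) of full point-to-point attention, which is one plausible reading of why pairing points with patches would be described as efficient.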

Comments:	10 pages, 6 figures, 4 tables
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2209.11255 [cs.CV]
	(or arXiv:2209.11255v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2209.11255

Submission history

From: Dening Lu
[v1] Wed, 21 Sep 2022 14:34:21 UTC (4,475 KB)
[v2] Wed, 31 May 2023 02:20:58 UTC (2,237 KB)
