XU Gao, ZHOU Wujie, YE Lü. Robot driving road scene parsing based on boundary-graph convolution bidirectional supervised network[J]. Journal of Zhejiang University of Science and Technology, 2023(05): 402-411. [doi:10.3969/j.issn.1671-8798.2023.05.006]

Robot driving road scene parsing based on boundary-graph convolution bidirectional supervised network

Journal of Zhejiang University of Science and Technology (浙江科技学院学报) [ISSN: 1671-8798]

Issue:
2023, No. 05
Pages:
402-411
Publication date:
2023-10-31

Article Info

Title:
Robot driving road scene parsing based on boundary-graph convolution bidirectional supervised network
Article number:
1671-8798(2023)05-0402-10
Author(s):
XU Gao, ZHOU Wujie, YE Lü
(School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, Zhejiang, China)
Keywords:
scene parsing; boundary supervision; multi-scale context; graph convolution
CLC number:
TP389.1
DOI:
10.3969/j.issn.1671-8798.2023.05.006
Document code:
A
Abstract:
[Objective] To enable ground mobile robots to recognize road obstacles accurately while driving and thus avoid collisions, a scene parsing algorithm based on a boundary-graph convolution bidirectional supervised network (B-GCBSNet) is proposed. [Method] First, a unified transformer (UniFormer) is used as the backbone to extract features from the input RGB (red, green, blue) images and depth images separately. Second, the designed multimodal context fusion module (MCFM) supplements the RGB features with the rich spatial information carried by the depth map so as to extract richer semantic features, and a bidirectional supervision module (BSM) is designed for the decoding stage. Third, the low-level features, which carry more global information, undergo edge processing to obtain boundary information and are supervised with a binary cross-entropy loss (BCELoss), while the high-level features, which carry more local information, pass through graph convolution to supplement their global context, compensating for the tendency of conventional convolutional neural networks (CNN) to neglect local position information when extracting high-level features, and are supervised with a multi-class cross-entropy loss (CELoss). Finally, the boundary features and segmentation features are integrated to produce the final scene parsing result. [Result] In experiments on the ground mobile robot perception (GMRP) dataset, B-GCBSNet reached a mean intersection over union (MIoU) of 93.54%, a mean classification accuracy (mAcc) of 98.89%, and a pixel accuracy (PA) of 98.85%, outperforming existing state-of-the-art methods. [Conclusion] B-GCBSNet identifies obstacles and drivable roads fairly accurately, providing a reference for research on obstacle recognition during ground robot driving.
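To make the supervision scheme summarized above concrete, below is a minimal PyTorch sketch of the bidirectional supervision idea: a boundary head on low-level features trained with binary cross-entropy, and high-level features passed through a simple graph-convolution block before a segmentation head trained with multi-class cross-entropy, with the two outputs fused. All module names, channel widths, the GloRe-style node projection, and the fusion rule are illustrative assumptions, not the authors' B-GCBSNet implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleGraphConv(nn.Module):
    """Projects spatial features onto a few graph nodes, mixes the nodes with
    one graph-convolution step, and projects back (a common GloRe-style
    pattern; an assumption here, not taken from the paper)."""

    def __init__(self, channels: int, num_nodes: int = 16):
        super().__init__()
        self.to_nodes = nn.Conv2d(channels, num_nodes, kernel_size=1)  # soft node assignment
        self.node_mix = nn.Linear(num_nodes, num_nodes)                # propagation between nodes
        self.feat_mix = nn.Linear(channels, channels)                  # node feature update

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        assign = self.to_nodes(x).flatten(2).softmax(dim=-1)           # (b, k, h*w)
        feats = x.flatten(2)                                           # (b, c, h*w)
        nodes = torch.einsum("bkn,bcn->bkc", assign, feats)            # pool pixels into nodes
        nodes = self.node_mix(nodes.transpose(1, 2)).transpose(1, 2)   # mix along the node axis
        nodes = F.relu(self.feat_mix(nodes))                           # mix along the channel axis
        out = torch.einsum("bkn,bkc->bcn", assign, nodes).reshape(b, c, h, w)
        return x + out                                                 # residual global context


class BidirectionalSupervision(nn.Module):
    """Boundary head on low-level features (BCE-supervised) plus a
    graph-convolution segmentation head on high-level features
    (CE-supervised); the two outputs are fused for the final parsing map."""

    def __init__(self, low_channels: int, high_channels: int, num_classes: int):
        super().__init__()
        self.boundary_head = nn.Conv2d(low_channels, 1, kernel_size=1)
        self.graph = SimpleGraphConv(high_channels)
        self.seg_head = nn.Conv2d(high_channels, num_classes, kernel_size=1)

    def forward(self, low_feat, high_feat, boundary_gt=None, seg_gt=None):
        boundary = self.boundary_head(low_feat)                        # (b, 1, H, W) logits
        seg = self.seg_head(self.graph(high_feat))                     # (b, C, h, w) logits
        seg = F.interpolate(seg, size=boundary.shape[-2:],
                            mode="bilinear", align_corners=False)
        fused = seg + seg * torch.sigmoid(boundary)                    # boundary-reweighted fusion
        losses = {}
        if boundary_gt is not None:                                    # BCELoss branch
            losses["boundary"] = F.binary_cross_entropy_with_logits(boundary, boundary_gt)
        if seg_gt is not None:                                         # CELoss branch
            losses["seg"] = F.cross_entropy(fused, seg_gt)
        return fused, losses


# Usage with made-up shapes: 64-channel low-level and 256-channel
# high-level features, 3 classes (e.g., background / road / obstacle).
model = BidirectionalSupervision(64, 256, num_classes=3)
fused, losses = model(torch.randn(2, 64, 96, 96), torch.randn(2, 256, 24, 24),
                      boundary_gt=torch.rand(2, 1, 96, 96).round(),
                      seg_gt=torch.randint(0, 3, (2, 96, 96)))
```

The sketch supervises the two branches with the two loss types named in the abstract (BCELoss for boundaries, CELoss for segmentation) and adds the fused losses during training; the specific node count and sigmoid-gated fusion are design choices made here for brevity.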

Memo

Received: 2022-09-19
Funding: National Key Research and Development Program of China (2022YFE0196000); National Natural Science Foundation of China (62371422)
Corresponding author: ZHOU Wujie (b. 1983), male, a native of Linhai, Zhejiang Province; associate professor, Ph.D.; research interests: artificial intelligence and visual big data. E-mail: wujiezhou@163.com
Last update: 2023-10-31