CORC

Browse/Search results: 64 items in total, showing items 1-10

GraphMLLM: A Graph-based Multi-level Layout Language-independent Model for Document Understanding | Conference paper
Athens, Greece, 2024-09
Authors:  He-Sen Dai;  Xiao-Hui Li;  Fei Yin;  Xudong Yan;  Shuqi Mei
Views/Downloads: 0/0  |  Submitted: 2024/06/05
Investigating Compositional Challenges in Vision-Language Models for Visual Grounding | Conference paper
Seattle WA, USA, 17-21 June 2024
Authors:  Yunan Zeng;  Yan Huang;  Jinjin Zhang;  Zequn Jie;  Zhenhua Chai
Views/Downloads: 0/0  |  Submitted: 2024/06/05
AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models | Conference paper
Vancouver, Canada, 2024-2-20 to 2024-2-27
Authors:  Zhaopeng Gu;  Bingke Zhu;  Guibo Zhu;  Yingying Chen;  Ming Tang
Views/Downloads: 0/0  |  Submitted: 2024/06/06
Text-to-Image Vehicle Re-Identification: Multi-Scale Multi-View Cross-Modal Alignment Network and a Unified Benchmark | Journal article
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, Pages: 14
Authors:  Ding, Leqi;  Liu, Lei;  Huang, Yan;  Li, Chenglong;  Zhang, Cheng
Views/Downloads: 0/0  |  Submitted: 2024/03/27
VLP2MSA: Expanding vision-language pre-training to multimodal sentiment analysis | Journal article
KNOWLEDGE-BASED SYSTEMS, 2024, Volume: 283, Pages: 9
Authors:  Yi, Guofeng;  Fan, Cunhang;  Zhu, Kang;  Lv, Zhao;  Liang, Shan
Views/Downloads: 3/0  |  Submitted: 2024/02/22
CLIP-VG: Self-Paced Curriculum Adapting of CLIP for Visual Grounding | Journal article
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, Volume: 26, Pages: 4334-4347
Authors:  Xiao, Linhui;  Yang, Xiaoshan;  Peng, Fang;  Yan, Ming;  Wang, Yaowei
Views/Downloads: 0/0  |  Submitted: 2024/05/30
Multi-modal spatial relational attention networks for visual question answering | Journal article
IMAGE AND VISION COMPUTING, 2023, Volume: 140, Pages: 13
Authors:  Yao, Haibo;  Wang, Lipeng;  Cai, Chengtao;  Sun, Yuxin;  Zhang, Zhi
Views/Downloads: 5/0  |  Submitted: 2024/02/22
A New Lightweight Script Independent Scene Text Style Transfer Network | Journal article
INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2023, Pages: 29
Authors:  Shivakumara, Palaiahnakote;  Roy, Ayush;  Nandanwar, Lokesh;  Pal, Umapada;  Lu, Yue
Views/Downloads: 1/0  |  Submitted: 2024/02/22
So Many Heads, So Many Wits: Multimodal Graph Reasoning for Text-Based Visual Question Answering | Journal article
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, Pages: 12
Authors:  Zheng, Wenbo;  Yan, Lan;  Wang, Fei-Yue
Views/Downloads: 4/0  |  Submitted: 2023/12/21
BEVBert: Multimodal Map Pre-training for Language-guided Navigation | Conference paper
Paris, France, 2023-10-2
Authors:  Dong An;  Yuankai Qi;  Yangguang Li;  Yan Huang;  Liang Wang
Views/Downloads: 0/0  |  Submitted: 2024/05/28

