Browse/search results: 64 records in total, showing 1-10
GraphMLLM: A Graph-based Multi-level Layout Language-independent Model for Document Understanding
Conference paper
Athens, Greece, 2024-09
Authors: He-Sen Dai; Xiao-Hui Li; Fei Yin; Xudong Yan; Shuqi Mei
Views/downloads: 0/0 | Submitted: 2024/06/05
Visual information extraction
Self-supervised pre-training
Multi-level page layouts
Investigating Compositional Challenges in Vision-Language Models for Visual Grounding
Conference paper
Seattle WA, USA, 17-21 June 2024
Authors: Yunan Zeng; Yan Huang; Jinjin Zhang; Zequn Jie; Zhenhua Chai
Views/downloads: 0/0 | Submitted: 2024/06/05
AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models
Conference paper
Vancouver, Canada, 2024-2-20 to 2024-2-27
Authors: Zhaopeng Gu; Bingke Zhu; Guibo Zhu; Yingying Chen; Ming Tang
Views/downloads: 0/0 | Submitted: 2024/06/06
Text-to-Image Vehicle Re-Identification: Multi-Scale Multi-View Cross-Modal Alignment Network and a Unified Benchmark
Journal article
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, pages: 14
Authors: Ding, Leqi; Liu, Lei; Huang, Yan; Li, Chenglong; Zhang, Cheng
Views/downloads: 0/0 | Submitted: 2024/03/27
Task analysis
Feature extraction
Visualization
Training
Electronic mail
Benchmark testing
Trajectory
Text-to-image vehicle re-identification
cross-modal alignment
multi-scale multi-view analysis
benchmark dataset
VLP2MSA: Expanding vision-language pre-training to multimodal sentiment analysis
Journal article
KNOWLEDGE-BASED SYSTEMS, 2024, volume: 283, pages: 9
Authors: Yi, Guofeng; Fan, Cunhang; Zhu, Kang; Lv, Zhao; Liang, Shan
Views/downloads: 3/0 | Submitted: 2024/02/22
Multimodal sentiment analysis
Vision-language
Multimodal fusion
CLIP-VG: Self-Paced Curriculum Adapting of CLIP for Visual Grounding
Journal article
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, volume: 26, pages: 4334-4347
Authors: Xiao, Linhui; Yang, Xiaoshan; Peng, Fang; Yan, Ming; Wang, Yaowei
Views/downloads: 0/0 | Submitted: 2024/05/30
Grounding
Reliability
Adaptation models
Task analysis
Visualization
Data models
Annotations
Visual grounding
curriculum learning
pseudo-language label
vision-language models
Multi-modal spatial relational attention networks for visual question answering
Journal article
IMAGE AND VISION COMPUTING, 2023, volume: 140, pages: 13
Authors: Yao, Haibo; Wang, Lipeng; Cai, Chengtao; Sun, Yuxin; Zhang, Zhi
Views/downloads: 5/0 | Submitted: 2024/02/22
Visual question answering
Spatial relation
Attention mechanism
Pre-training strategy
A New Lightweight Script Independent Scene Text Style Transfer Network
Journal article
INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2023, pages: 29
Authors: Shivakumara, Palaiahnakote; Roy, Ayush; Nandanwar, Lokesh; Pal, Umapada; Lu, Yue
Views/downloads: 1/0 | Submitted: 2024/02/22
Text detection
style transfer
CNN models
multi-lingual transfer
So Many Heads, So Many Wits: Multimodal Graph Reasoning for Text-Based Visual Question Answering
Journal article
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, pages: 12
Authors: Zheng, Wenbo; Yan, Lan; Wang, Fei-Yue
Views/downloads: 4/0 | Submitted: 2023/12/21
Graph attention
graph reasoning
multimodal graph
self-attention
text-based visual question answering
BEVBert: Multimodal Map Pre-training for Language-guided Navigation
Conference paper
Paris, France, 2023-10-2
Authors: Dong An; Yuankai Qi; Yangguang Li; Yan Huang; Liang Wang
Views/downloads: 0/0 | Submitted: 2024/05/28