#arxiv #artificialintelligence #datascience #machinelearning #deeplearning #conversationalAI #VisualChatGPT
Link to paper: arxiv.org/pdf/2303.04671.pdf
Paper by: Chenfei Wu, Shengming Yin, Weizhen Qi, Xiaodong Wang, Zecheng Tang, Nan Duan from Microsoft Research Asia.
Presentation by Fellowship.ai team: www.fellowship.ai/
Fellowship.ai is brought to you by Launchpad.ai: www.launchpad.ai/
Launchpad brings cutting-edge technologies and AI applications to organizations. To learn more about our products and services, check: www.launchpad.ai/ai-developme...
Abstract: This paper introduces Visual ChatGPT, a system that extends ChatGPT to work with images as well as language. It incorporates Visual Foundation Models so that users can interact with ChatGPT through complex visual questions or visual editing instructions that require several AI models to collaborate over multiple steps. Users can also provide feedback and ask for corrected results. According to the authors, experiments show that Visual ChatGPT opens the door to investigating the visual roles of ChatGPT with the help of Visual Foundation Models.
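To make the "multiple AI models over multiple steps" idea concrete, here is a minimal Python sketch of chaining tools over an image. It is only an illustration under stated assumptions, not the paper's implementation: the functions caption_image and edit_image, the TOOLS registry, and run_plan are hypothetical stand-ins, and the plan is hard-coded here, whereas in Visual ChatGPT the language model decides which Visual Foundation Model to call.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for Visual Foundation Models. Real ones would run
# actual image models; these only return descriptive strings.
def caption_image(image: str) -> str:
    return f"caption({image})"

def edit_image(image: str, instruction: str) -> str:
    return f"edited({image}, {instruction})"

# Hypothetical tool registry mapping a tool name to a callable.
TOOLS: Dict[str, Callable[..., str]] = {
    "caption": caption_image,
    "edit": edit_image,
}

def run_plan(image: str, plan: List[Tuple[str, tuple]]) -> List[str]:
    """Run each (tool_name, extra_args) step in order on the latest image."""
    outputs = []
    current = image
    for tool_name, extra_args in plan:
        result = TOOLS[tool_name](current, *extra_args)
        outputs.append(result)
        # Only the editing tool produces a new image for later steps.
        if tool_name == "edit":
            current = result
    return outputs

# Assumed two-step request: edit the image, then describe the edited result.
history = run_plan("cat.png", [("edit", ("make it red",)), ("caption", ())])
print(history)
```

The second step operates on the output of the first, which is the kind of multi-step collaboration the abstract describes.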
Code and demo are available at github.com/microsoft/visual-c....