Alibaba Cloud launches open-source vision language model

Alibaba Cloud has launched two open-source large vision language models (LVLMs), Qwen-VL and Qwen-VL-Chat, which can comprehend images and text, answer questions, and more in both English and Chinese.

Facts

  • Qwen-VL is a multimodal LVLM that can handle image inputs and text prompts in English and Chinese, performing tasks like answering questions related to images and generating image captions.
  • Qwen-VL-Chat is designed for complex interactions, including comparing multiple images and engaging in multi-round question answering, with creative capabilities like writing poetry and summarizing image content.
  • Alibaba Cloud has shared the models’ code, weights, and documentation with the open-source community via ModelScope and Hugging Face. Companies with over 100 million monthly active users need to request a license from Alibaba Cloud for commercial use.
  • These models could potentially assist visually impaired individuals during online shopping by providing information based on image comprehension.
  • According to Alibaba Cloud, Qwen-VL outperforms other open-source large vision language models on several visual language tasks, including captioning, question answering, and object detection.
  • Qwen-VL-Chat achieves leading results in both Chinese and English text-image dialogue and alignment tests, according to Alibaba Cloud’s benchmark.
  • Earlier, Alibaba Cloud open-sourced its 7-billion-parameter LLMs, Qwen-7B and Qwen-7B-Chat, which drew over 400,000 downloads within a month of launch.
Laura M
Laura is a financial reporter, editor, and researcher with a particular interest in fintech innovation, capital markets, and the evolving global banking landscape.
