Wednesday, October 15, 2025

Alibaba Cloud launches open-source vision language model

Alibaba Cloud has launched two open-source large vision language models (LVLMs), Qwen-VL and Qwen-VL-Chat, which can comprehend images and text, answer questions, and perform related tasks in both English and Chinese.

Facts

  • Qwen-VL is a multimodal LVLM that can handle image inputs and text prompts in English and Chinese, performing tasks like answering questions related to images and generating image captions.
  • Qwen-VL-Chat is designed for complex interactions, including comparing multiple images and engaging in multi-round question answering, with creative capabilities like writing poetry and summarizing image content.
  • Alibaba Cloud has shared the models’ code, weights, and documentation with the open-source community via ModelScope and Hugging Face. Companies with over 100 million monthly users can request a license for commercial use.
  • These models could assist visually impaired individuals during online shopping by providing information based on image comprehension.
  • Qwen-VL outperforms other large vision language models in various visual language tasks, including captioning, question answering, and object detection.
  • Qwen-VL-Chat achieves leading results in both Chinese and English text-image dialogue and alignment tests, according to Alibaba Cloud’s benchmark.
  • Earlier, Alibaba Cloud open-sourced its 7-billion-parameter LLMs, Qwen-7B and Qwen-7B-Chat, contributing to the open-source community with over 400,000 downloads within a month of launch.
