UI-UG: A Unified MLLM for UI Understanding and Generation

Although Multimodal Large Language Models (MLLMs) have been widely applied across domains, they still face challenges in domain-specific tasks, such as User Interface (UI) understanding accuracy and UI generation quality. In this paper, we introduce UI-UG (a unified MLLM for UI Understanding and Generation), which integrates both capabilities. For understanding tasks, we employ Supervised Fine-tuning (SFT) combined with Group Relative Policy Optimization (GRPO) to enhance fine-grained understanding of modern, complex UI data. For generation tasks, we further use Direct Preference Optimization (DPO) so that our model generates human-preferred UIs. In addition, we propose an industrially effective workflow, including the design of an LLM-friendly domain-specific language (DSL), training strategies, rendering processes, and evaluation metrics. In experiments, our model achieves state-of-the-art (SOTA) performance on understanding tasks, outperforming both larger general-purpose MLLMs and similarly sized UI-specialized models. Our model is also on par with these larger MLLMs in UI generation performance at a fraction of the computational cost. We also demonstrate that integrating understanding and generation tasks can improve accuracy and quality for both tasks. Code and model: https://github.com/neovateai/UI-UG
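
For readers unfamiliar with the two optimization methods named in the abstract, here is a minimal sketch of their core quantities in their standard textbook form: the group-relative advantage used by GRPO and the per-pair DPO loss. This is not the paper's implementation. The function names, the `eps` and `beta` defaults, and the example numbers are assumptions for illustration, and the abstract does not state which rewards or hyperparameters UI-UG actually uses.

```python
import math
from statistics import mean, pstdev


def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages (standard GRPO form, assumed here).

    Each sampled response is scored against the other responses drawn
    for the same prompt: subtract the group mean, divide by the group
    standard deviation. `eps` (hypothetical default) avoids division
    by zero when all rewards in the group are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for a single preference pair (standard form, assumed here).

    Inputs are summed sequence log-probabilities under the policy and
    under a frozen reference model. `beta` (hypothetical default)
    scales how strongly the policy is pushed away from the reference.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)) computed as a numerically stable softplus(-margin)
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


if __name__ == "__main__":
    # Four sampled answers to one understanding prompt, scored 0 or 1
    # (illustrative numbers only).
    print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))
    # One preferred/rejected UI pair (illustrative log-probabilities).
    print(dpo_loss(-12.0, -15.0, -13.0, -14.0))
```

In this sketch, responses that beat their group's average receive a positive advantage, and the DPO loss falls below log 2 once the policy favors the preferred UI more strongly than the reference model does.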

arxiv.org