DocDancer is a smart document helper that answers questions by exploring and reading long, mixed-media PDFs using just two tools: Search and Read.
VerseCrafter is a video world model that lets you steer both the camera and multiple moving objects by editing a single 4D world state.
Big all-in-one language models are powerful but too expensive to run everywhere, while small specialists are cheaper but narrow.
The paper shows that the weights of big language models often settle at magnitudes set by training hyperparameters instead of by the data, which quietly hurts performance.
SmartSearch teaches search agents to fix their own bad search queries while they are still thinking, instead of only correcting their final answers.
Mixture-of-Experts (MoE) models use many small specialist networks and only activate a few per token, but classic LoRA fine-tuning gives every expert the same rank, wasting parameters on the wrong experts.
AgentOCR turns an agent’s long text history into pictures so it can remember more using fewer tokens.
AT2PO is a new way to train AI agents that work in several turns, like asking the web a question, reading the result, and trying again.
KnowMe-Bench is a new test that checks if AI helpers truly understand a person, not just remember facts.
This paper turns an AI agent’s memory from a flat list of notes into a logic map of events connected by causal and temporal links.
This paper builds two models that work as a team, Qwen3-VL-Embedding and Qwen3-VL-Reranker, which understand text, images, visual documents, and videos in one shared space so search works across all of them.
TourPlanner is a travel-planning system that first gathers the right places, then lets multiple expert ‘voices’ debate plans, and finally polishes the winner with a learning method that follows rules before style.