How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (2)

#structured captions

BBQ-to-Image: Numeric Bounding Box and Qolor Control in Large-Scale Text-to-Image Models

Beginner
Eliran Kachlon, Alexander Visheratin et al. | Feb 24 | arXiv

BBQ is a text-to-image model that lets you place objects exactly where you want using numeric bounding boxes and color them with exact RGB values.

#text-to-image, #bounding boxes, #RGB control


DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation

Intermediate
Xu Guo, Fulong Ye et al. | Feb 12 | arXiv

DreamID-Omni is a single model that can create, edit, and animate human-centered videos with matching, synchronized voices.

#audio-video generation, #diffusion transformer, #identity preservation
