🎓How I Study AIHISA

📖Read

📄Papers 📰Blogs 🎬Courses

💡Learn

🛤️Paths 📚Topics 💡Concepts 🎴Shorts

🎯Practice

📝Daily Log 🎯Prompts 🧠Review

Search Settings

How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers2

All Beginner Intermediate Advanced

All Sources arXiv

#structured captions

BBQ-to-Image: Numeric Bounding Box and Qolor Control in Large-Scale Text-to-Image Models

Eliran Kachlon, Alexander Visheratin et al.Feb 24arXiv

BBQ is a text-to-image model that lets you place objects exactly where you want using numeric bounding boxes and color them with exact RGB values.

#text-to-image#bounding boxes#RGB control

Not triaged yet

DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation

Xu Guo, Fulong Ye et al.Feb 12arXiv

DreamID-Omni is one model that can create, edit, and animate human-centered videos with matching voices, all in sync.

#audio-video generation#diffusion transformer#identity preservation

Not triaged yet