AR-Omni is a single autoregressive model that can take in and produce text, images, and speech without extra expert decoders.
XR is a new, training-free team of AI helpers that finds images using both a reference picture and a short text edit (like "same jacket but red").