ObjEmbed teaches an AI to understand not just whole pictures, but each object inside them, and to link those objects to the right words.
VG-Refiner is a new method that lets an AI find the object a description refers to in a picture, even when the helper tools it uses make mistakes.