Search is not the same as research; real research needs planning, checking many sources, fixing mistakes, and writing a clear report.
Big vision-language models are super smart, but they are far too large to run on phones and other small devices.
SlideTailor is an AI system that turns a scientific paper into personalized presentation slides that match what a specific user likes.
Large language models can say things that sound right but aren’t supported by the given document; this is called a faithfulness hallucination.
This paper builds DiRL, a fast and careful post-training recipe that helps diffusion language models reason better.
This paper adds a tiny but powerful step called Early Knowledge Alignment (EKA) to multi-step retrieval systems so the model takes a quick, smart look at relevant information before it starts planning.
Memory-T1 teaches conversational AI agents to keep track of when things happened across many conversations.
This paper turns messy chains of thought from language models into clear, named steps so we can see how they really think through math problems.
This paper asks a simple question: do video AI models trained only on 2D videos secretly learn about 3D worlds?
The paper proposes the Prism Hypothesis: meanings (semantics) mainly live in low frequencies, while fine picture details live in high frequencies.
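The low-versus-high frequency split behind this idea can be illustrated with a generic FFT decomposition. This is a toy 1-D sketch, not the paper's method: the signal, the cutoff of 0.05 cycles/sample, and the function name `prism_split` are all arbitrary choices made for illustration.

```python
import numpy as np

def prism_split(signal, cutoff=0.05):
    """Split a 1-D signal into low- and high-frequency parts via the FFT.

    `cutoff` is an arbitrary fraction of the sampling rate chosen for
    illustration; it is not a value from the paper.
    """
    spectrum = np.fft.fft(signal)
    freqs = np.fft.fftfreq(len(signal))
    low_mask = np.abs(freqs) < cutoff           # coarse, slow-changing band
    low = np.fft.ifft(spectrum * low_mask).real
    high = np.fft.ifft(spectrum * ~low_mask).real
    return low, high

# A slow trend (stand-in for "meaning") plus fast wiggles (stand-in for "detail").
t = np.linspace(0.0, 1.0, 256, endpoint=False)
signal = np.sin(2 * np.pi * 2 * t) + 0.2 * np.sin(2 * np.pi * 60 * t)
low, high = prism_split(signal)
# By linearity of the FFT, the two parts add back up to the original signal.
```

Because the decomposition is linear, `low + high` reconstructs the original exactly, which is what makes this kind of frequency split a clean way to separate the two kinds of information.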
GenEnv is a training system where a student AI and a teacher simulator grow together by exchanging tasks and feedback.
Autoregressive (AR) image models make pictures by choosing tokens one-by-one, but they have traditionally been judged only on how well they pick likely tokens, not on how good the final picture looks in pixels.
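The "choosing tokens one-by-one" idea can be sketched as a toy sampling loop. Everything here is made up for illustration: the vocabulary, the hand-written `next_token_probs` (a stand-in for a real neural network's output), and the function names are not from the paper.

```python
import random

# Toy vocabulary; a real AR image model would use thousands of visual tokens.
VOCAB = ["sky", "sea", "sun", "<eos>"]

def next_token_probs(prev):
    # Hypothetical stand-in for a model's conditional distribution:
    # it likes repeating the previous token or moving to the next one.
    if prev is None:
        return [0.5, 0.3, 0.2, 0.0]
    i = VOCAB.index(prev)
    probs = [0.1] * len(VOCAB)
    probs[i] = 0.4
    probs[(i + 1) % len(VOCAB)] = 0.4
    total = sum(probs)
    return [p / total for p in probs]

def sample_image_tokens(max_len=8, seed=0):
    """Sample a token sequence one token at a time (greedy loop, no batching)."""
    rng = random.Random(seed)
    tokens, prev = [], None
    for _ in range(max_len):
        probs = next_token_probs(prev)
        tok = rng.choices(VOCAB, weights=probs, k=1)[0]
        if tok == "<eos>":
            break
        tokens.append(tok)
        prev = tok
    return tokens
```

The key point the line above makes is that a loop like this is scored on how likely each chosen token is, which says nothing directly about the rendered pixels.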