VisionTrim makes picture-and-text AI models run much faster by keeping only the most useful visual pieces (tokens) and smartly merging the rest.
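A minimal sketch of the keep-and-merge idea: score each visual token, keep the top-k, and collapse the rest into one summary token. The scoring rule and function names here are illustrative assumptions, not VisionTrim's actual method.

```python
import numpy as np

def trim_tokens(tokens: np.ndarray, scores: np.ndarray, keep: int) -> np.ndarray:
    """Keep the `keep` highest-scoring visual tokens and merge the rest.

    tokens: (n, d) array of visual token embeddings
    scores: (n,) importance scores (e.g., attention received from text tokens)
    keep:   number of tokens to keep unmerged
    """
    order = np.argsort(scores)[::-1]        # highest score first
    kept = tokens[order[:keep]]             # survivors pass through unchanged
    rest = tokens[order[keep:]]
    if len(rest) == 0:
        return kept
    # Collapse all pruned tokens into one score-weighted average token,
    # so their information is summarized rather than discarded.
    w = scores[order[keep:]]
    merged = (w[:, None] * rest).sum(axis=0) / (w.sum() + 1e-8)
    return np.vstack([kept, merged[None, :]])

# Example: 196 image-patch tokens of width 64, keep only 32.
rng = np.random.default_rng(0)
out = trim_tokens(rng.normal(size=(196, 64)), rng.random(196), keep=32)
print(out.shape)  # (33, 64): 32 kept + 1 merged summary token
```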
Large language models sometimes reach the right answer for the wrong reasons, which is risky and confusing.
Real attackers can try many prompts in parallel until a model slips, so testing safety with only one try badly underestimates risk.
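Why one try underestimates risk: if a single prompt slips past the model with probability p, then at least one of n independent parallel tries succeeds with probability 1 - (1-p)^n. A quick illustration with a made-up per-attempt rate:

```python
# If one attack prompt succeeds with probability p, the chance that at
# least one of n independent parallel attempts succeeds is 1 - (1-p)**n.
p = 0.01  # hypothetical per-attempt success rate
for n in (1, 10, 100, 1000):
    print(n, round(1 - (1 - p) ** n, 3))
# n=1 -> 0.01, n=10 -> 0.096, n=100 -> 0.634, n=1000 -> ~1.0
```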
TTCS is a way for a model to teach itself at test time: it first makes easier practice questions similar to the real hard question, then learns from solving them before attempting the real one.
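A rough sketch of that test-time loop. Every method name below (`generate`, `answer`, `finetune`) is a placeholder for whatever interface the model exposes; the paper's actual procedure will differ in its details.

```python
def test_time_curriculum(model, hard_question, n_practice=8, n_rounds=2):
    """Sketch: self-train on easier cousins of a hard question at test time.

    `model` is assumed to expose generate(), answer(), and finetune();
    these are stand-ins, not a real API.
    """
    for _ in range(n_rounds):
        # 1. Ask the model to write simpler questions resembling the hard one.
        practice = [model.generate(f"Write an easier variant of: {hard_question}")
                    for _ in range(n_practice)]
        # 2. Solve the practice questions; the (question, answer) pairs
        #    become the training signal.
        examples = [(q, model.answer(q)) for q in practice]
        # 3. Briefly update the model on its own practice work.
        model.finetune(examples)
    # 4. Only now attempt the real question.
    return model.answer(hard_question)
```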
Big models are often used to grade AI answers, but they are expensive, slow, and depend too much on tricky prompts.
RAPTOR is a simple, fast way to find a direction (a concept vector) inside a frozen language model that points toward a concept like 'sarcasm' or 'positivity.'
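One standard way to find such a direction is a difference-of-means over hidden states from texts with and without the concept; whether RAPTOR does exactly this is an assumption here, but it shows what "a concept vector inside a frozen model" means concretely.

```python
import numpy as np

def concept_vector(pos_states: np.ndarray, neg_states: np.ndarray) -> np.ndarray:
    """Direction pointing from 'concept absent' toward 'concept present'.

    pos_states: (n_pos, d) hidden states from texts showing the concept
    neg_states: (n_neg, d) hidden states from texts that do not
    """
    v = pos_states.mean(axis=0) - neg_states.mean(axis=0)
    return v / (np.linalg.norm(v) + 1e-8)   # unit-length concept vector

# Score a new activation by projecting onto the direction:
# a higher score means more of the concept.
rng = np.random.default_rng(1)
v = concept_vector(rng.normal(1.0, 1, (50, 16)), rng.normal(0.0, 1, (50, 16)))
print(rng.normal(size=16) @ v)  # scalar concept score for one activation
```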
Millions of public AI models exist, but downloads are concentrated on a tiny set of “official” checkpoints, which are not always the best performers.
This paper shows how to turn a big Transformer model into a faster hybrid model that mixes attention and RNN layers, using far less training data (about 2.3B tokens).
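The conversion idea in miniature: keep some attention layers and swap the rest for recurrent ones, then retrain on the small token budget. This sketch only shows a plausible layer schedule; the ratio and placement are assumptions, not the paper's recipe.

```python
def hybrid_schedule(n_layers: int, keep_every: int = 4) -> list[str]:
    """Replace most attention layers with RNN layers, keeping every
    `keep_every`-th layer as full attention (placement is illustrative)."""
    return ["attention" if i % keep_every == 0 else "rnn"
            for i in range(n_layers)]

print(hybrid_schedule(12))
# ['attention', 'rnn', 'rnn', 'rnn', 'attention', 'rnn', ...]
```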
This paper trains AI agents better by grading not just their final answers, but also how they think and use tools along the way.
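In reward terms, that means scoring the trajectory, not just the outcome. A toy blend (the weights and step scores are hypothetical):

```python
def trajectory_reward(step_scores, final_correct, w_process=0.3):
    """Blend step-level grades (reasoning, tool use) with the final outcome.

    step_scores:   per-step grades in [0, 1] for the agent's reasoning and
                   tool calls (hypothetical judge output)
    final_correct: 1.0 if the final answer is right, else 0.0
    w_process:     how much the process counts versus the outcome
    """
    process = sum(step_scores) / max(len(step_scores), 1)
    return w_process * process + (1 - w_process) * final_correct

# A right answer with sloppy steps scores lower than one with clean steps.
print(trajectory_reward([0.2, 0.3, 0.1], final_correct=1.0))  # 0.76
print(trajectory_reward([0.9, 1.0, 0.8], final_correct=1.0))  # 0.97
```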
DynamicVLA is a small and fast robot brain that sees, reads, and acts while things are moving.
Large language models usually learn by guessing the next word, then get a tiny bit of instruction-following practice; this paper flips that by turning web documents into instruction-and-answer pairs at massive scale.
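The flip, in pipeline form: instead of training on raw documents, each document is rewritten into (instruction, answer) pairs. A hedged sketch; `lm` and the prompt are placeholders, not the paper's pipeline.

```python
import json

PROMPT = ("Read the document below and write 3 instruction-answer pairs "
          "grounded in it, as a JSON list of objects with keys "
          "\"instruction\" and \"answer\".\n\nDocument:\n{doc}")

def document_to_pairs(lm, doc: str) -> list[dict]:
    """Turn one raw web document into supervised training pairs.
    `lm` is any callable text-in/text-out model (a stand-in, not a real API)."""
    raw = lm(PROMPT.format(doc=doc))
    return json.loads(raw)  # [{"instruction": ..., "answer": ...}, ...]

def build_corpus(lm, documents):
    """Map the synthesis step over a whole crawl, yielding instruction data
    at the same scale as next-word pretraining."""
    for doc in documents:
        yield from document_to_pairs(lm, doc)
```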
This paper shows a simple, one-model way to dub videos that keeps the new voice and the on-screen lips naturally in sync.