How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (5)

#large language models

On-Policy Self-Distillation for Reasoning Compression

Beginner
Hejian Sang, Yuanda Xu et al. · Mar 5 · arXiv

Reasoning models often talk too much, and those extra words can actually make them more wrong.

#on-policy self-distillation, #reasoning compression, #conciseness instruction

MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios

Beginner
Zhiheng Song, Jingshuai Zhang et al. · Feb 26 · arXiv

MobilityBench is a large, carefully built benchmark that checks how well AI agents can plan real-world routes using natural language and map tools.

#MobilityBench, #route-planning agents, #large language models

Learning Rate Matters: Vanilla LoRA May Suffice for LLM Fine-tuning

Beginner
Yu-Ang Lee, Ching-Yun Ko et al. · Feb 4 · arXiv

When you tune the learning rate carefully, plain old LoRA fine-tuning works about as well as fancy new versions.

#LoRA, #parameter-efficient fine-tuning, #learning rate tuning

Enhancing Sentiment Classification and Irony Detection in Large Language Models through Advanced Prompt Engineering Techniques

Beginner
Marvin Schmitt, Anne Schwerk et al. · Jan 13 · arXiv

Giving large language models a few good examples and step-by-step instructions can make them much better at spotting sentiment and irony in text.

#prompt engineering, #few-shot learning, #chain-of-thought

Same Claim, Different Judgment: Benchmarking Scenario-Induced Bias in Multilingual Financial Misinformation Detection

Beginner
Zhiwei Liu, Yupen Cao et al. · Jan 8 · arXiv

This paper builds MFMD-Scen, a large benchmark for testing how an AI model's true/false judgment of the same financial claim changes when the scenario around it changes.

#financial misinformation detection, #scenario-induced bias, #multilingual benchmark
