DataChef teaches a large language model to be a smart data chef: it plans and codes full data pipelines that turn messy datasets into great training meals for other models.
This paper introduces SecCoderX, a way to teach code-writing AIs to write secure code without breaking what that code is supposed to do.
TRIP-Bench is a new test that checks whether AI travel agents can plan real trips over many chat turns while following strict rules and adapting to changing user requests.