This paper teaches AI to learn how-to steps on the fly from demonstrations, the way people do.
Action100M is a gigantic video dataset: about 100 million labeled action moments, built automatically from 1.2 million instructional videos.
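
To get a feel for that scale, here is a quick back-of-the-envelope calculation of how many labeled moments each video contributes on average. It uses only the two rounded figures quoted above, so the result is a rough estimate rather than a reported statistic, and the variable names are made up for illustration.

```python
# Rounded figures from the summary above; both are approximate, not exact counts.
labeled_moments = 100_000_000   # "about 100 million labeled action moments"
source_videos = 1_200_000       # "1.2 million instructional videos"

# Average number of labeled moments per source video.
moments_per_video = labeled_moments / source_videos

print(f"roughly {moments_per_video:.0f} labeled moments per video on average")
```

Under those rounded numbers, each video yields on the order of 80 labeled moments, which suggests the labels describe many short steps within a video, not one label per video.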