Looking times and gaze behavior indicate that infants can predict the goal state of a simple observed action event (e.g., object‐directed grasping) within the first year of life. The present paper focuses mainly on infants’ predictive gaze‐shifts toward the goal of an ongoing action. To produce such gaze‐shifts, infants need to generate a forward model of the to‐be‐obtained goal state and to disengage their gaze from the moving agent at a time when information about the action event is still incomplete. By about 6 months of age, infants show goal‐predictive gaze‐shifts, but mainly for familiar actions that they can perform themselves (e.g., grasping) and for familiar agents (e.g., a human hand). Therefore, some theoretical models have highlighted close relations between infants’ ability to predict action goals and their motor development and/or emerging action experience. Recent research indicates that infants can also predict the goals of familiar simple actions performed by non‐human agents (e.g., object‐directed grasping by a mechanical claw) when these agents display agency cues, such as self‐propelled movement, equifinality of goal approach, or production of a salient action effect. This paper reviews the relevant findings and theoretical models and proposes that the impacts of action experience and of agency cues can be explained from an action‐event perspective. In particular, infants’ goal‐predictive gaze‐shifts are seen as resulting from an interplay between bottom‐up processing of perceptual information and top‐down influences exerted by event schemata that store information about previously executed or observed actions.