Events have beginnings and ends, and they often overlap in time. A major question is how perceivers come to parse a stream of multimodal information into meaningful units and how different event boundaries may affect event processing. This work investigates the roles of three types of event boundaries (beginnings, ends, and overlaps) in constructing temporal relations between events. Predictions about how people would err were derived from the beginning-state, end-state, and overlap heuristic hypotheses. Participants viewed animated events that covered all the logically possible temporal relations between events and then made temporal relation judgments. The results showed that people make use of the overlap between events and take into account both ends and beginnings, but weight ends more heavily than beginnings. In neural network simulations, a distinction between events with overlap and events without overlap self-organized during the learning of temporal relations.