Abstract
An essential part of text understanding is making explicit the implicit parts of discourse, namely its presuppositions and implications. In mathematical discourse, the explicitly stated portions of a text do not make sense in the absence of this implicit content. A reader, human or machine, must be able to “read between the lines”.

In this paper, we present a computational framework for understanding informal mathematical discourse, extending and integrating state-of-the-art technologies from natural language processing and automated reasoning in a novel and promising way. For representing mathematical discourse, we introduce proof representation structures (PRSs) as the central data structure. PRSs considerably extend discourse representation structures, accommodating our need to represent discourse structure and to handle substructures and mathematical sentences as first-class citizens. For constructing PRSs, we propose a discourse update algorithm that is powered by pragmatics. It incorporates an underspecified semantic representation into the proof context by making use of mathematical and meta-mathematical knowledge, employing a proof planner. Proof plans, which capture common patterns of reasoning in mathematical proofs, enable us to gain a high-level discourse understanding, allowing us to follow the proof author's main line of argument. Given such high-level discourse understanding, we can then compute the parts “between the lines”, those pieces of information that the proof author has taken for granted. We demonstrate that much inference is required to compute this implicit information.