While rules and exemplars are usually viewed as opposites, this paper argues that they form the end points of the same distribution. By representing both rules and exemplars as (partial) trees, we can take into account the fluid middle ground between the two extremes. This insight is the starting point for a new theory of language learning based on the following idea: if a language learner does not know which phrase‐structure trees should be assigned to initial sentences, s/he implicitly allows for all possible trees and lets linguistic experience decide which is the “best” tree for each sentence. The best tree is obtained by maximizing “structural analogy” between a sentence and previous sentences, which is formalized as the most probable shortest combination of subtrees from all trees of previous sentences. Corpus‐based experiments with this model on the Penn Treebank and the CHILDES database indicate that it can learn both exemplar‐based and rule‐based aspects of language, ranging from phrasal verbs to auxiliary fronting. By learning the syntactic structures of sentences, we have also learned the grammar implicit in those structures, which can in turn be used to produce new sentences. We show that our model mimics children’s language development from item‐based constructions to abstract constructions, and that it can simulate some of the errors children make when producing complex questions.
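The “shortest combination of subtrees” idea can be illustrated with a toy sketch. This is hypothetical code, not the authors’ implementation: probabilities are omitted (only derivation length is minimized), and all names and the tuple encoding are invented for illustration.

```python
from itertools import product

# A tree is a nested tuple (label, child1, ...); a string is a lexical
# leaf; a 1-tuple like ("NP",) marks an open substitution slot.

def frags_rooted(node):
    """All fragments rooted at `node`: each child is either cut off
    at its label (leaving an open slot) or expanded recursively."""
    label, *children = node
    opts = []
    for c in children:
        if isinstance(c, str):
            opts.append([c])                       # lexical leaf: keep
        else:
            opts.append([(c[0],)] + frags_rooted(c))
    return [(label, *choice) for choice in product(*opts)]

def all_fragments(tree):
    """Fragments rooted at every internal node of `tree`."""
    acc, stack = set(), [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, str):
            continue
        acc.update(frags_rooted(node))
        stack.extend(node[1:])
    return acc

def match(frag, tree):
    """If `frag` fits the top of `tree`, return the subtrees left
    under its open slots; otherwise return None."""
    if isinstance(frag, str):
        return [] if frag == tree else None
    if isinstance(tree, str) or frag[0] != tree[0]:
        return None
    if len(frag) == 1:                             # open slot
        return [tree]
    if len(frag) != len(tree):
        return None
    slots = []
    for f, t in zip(frag[1:], tree[1:]):
        s = match(f, t)
        if s is None:
            return None
        slots += s
    return slots

def shortest_derivation(target, frags):
    """Fewest corpus fragments whose composition yields `target`
    (None if no derivation exists)."""
    best = None
    for frag in frags:
        slots = match(frag, target)
        if slots is None:
            continue
        subcosts = [shortest_derivation(s, frags) for s in slots]
        if all(c is not None for c in subcosts):
            cost = 1 + sum(subcosts)
            best = cost if best is None or cost < best else best
    return best

T1 = ("S", ("NP", "Mary"), ("VP", ("V", "likes"), ("NP", "John")))
frags = all_fragments(T1)
novel = ("S", ("NP", "John"), ("VP", ("V", "likes"), ("NP", "Mary")))
print(shortest_derivation(novel, frags))  # 3: one large subtree + two NPs
```

A novel sentence is thus built from the largest reusable pieces of previously seen trees; a sentence identical to an exemplar is derived in a single step, while a fully novel one falls back on small, rule-like fragments.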
We develop an approach to automatically identify the most probable multiword constructions used in children’s utterances, given syntactically annotated utterances from the Brown corpus of CHILDES. The constructions found cover many interesting linguistic phenomena from the language-acquisition literature and show a progression from very concrete toward abstract constructions. We show quantitatively that, for all children in the Brown corpus, grammatical abstraction, defined as the relative number of variable slots in the productive units of their grammar, increases globally with age.
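The abstraction measure can be sketched in a few lines. This is a toy illustration with a hypothetical encoding, not the paper’s implementation: productive units are nested tuples whose frontier contains lexical leaves (strings) and variable slots (1-tuples such as `("NP",)`).

```python
# Toy version of the abstraction measure: the proportion of variable
# slots among all frontier nodes of a set of productive units.
# Encoding (assumed for illustration): a unit is a nested tuple,
# strings are lexical leaves, 1-tuples like ("NP",) are variable slots.

def frontier(unit):
    """Frontier of a unit: its lexical leaves and variable slots."""
    if isinstance(unit, str) or len(unit) == 1:
        return [unit]
    out = []
    for child in unit[1:]:
        out += frontier(child)
    return out

def abstraction(units):
    """Relative number of variable slots among all frontier nodes."""
    nodes = [n for u in units for n in frontier(u)]
    slots = sum(1 for n in nodes if isinstance(n, tuple))
    return slots / len(nodes)

item_based = [("S", ("NP", "I"), ("VP", ("V", "want"), ("NP",)))]
schematic  = [("S", ("NP",), ("VP", ("V",), ("NP",)))]
print(abstraction(item_based))  # one slot out of three frontier nodes
print(abstraction(schematic))   # fully schematic: every frontier node a slot
```

An item-based unit like “I want X” scores low; as lexical material is replaced by category slots, the measure rises toward 1.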
How is scientific knowledge used, adapted, and extended in deriving phenomena and real-world systems? This paper aims at developing a general account of 'applying science' within the exemplar-based framework of Data-Oriented Processing (DOP), which is also known as Exemplar-Based Explanation (EBE). According to the exemplar-based paradigm, phenomena are explained not by deriving them all the way down from theoretical laws and boundary conditions but by modelling them on previously derived phenomena that function as exemplars. To accomplish this, DOP proposes to maintain a corpus of derivation trees of previous phenomena together with a matching algorithm that combines subtrees from the corpus to derive new phenomena. By using a notion of derivational similarity, a new phenomenon can be modelled as closely as possible on previously explained phenomena. I will propose an instantiation of DOP which integrates theoretical and phenomenological modelling and which generalises over various disciplines, from fluid mechanics to language technology. I argue that DOP provides a solution for what I call Kuhn's problem and that it redresses Kitcher's account of explanation.
Though fields such as art history, the history of philosophy, and intellectual history have been around for a long time, the author's interest is in the history of what scholars in all of these fields have been doing in common. This book looks beyond the humanities to the practice of disciplined inquiry more generally, bringing together the history of the humanities and the sciences in a unified search for patterns.
Unlike basic sciences, scientific research in advanced technologies aims to explain, predict, and describe not phenomena in nature, but phenomena in technological artefacts, thereby producing knowledge that is utilized in technological design. This article first explains why the covering‐law view of applying science is inadequate for characterizing this research practice. Instead, the covering‐law approach and causal explanation are integrated in this practice. Ludwig Prandtl’s approach to concrete fluid flows is used as an example of scientific research in the engineering sciences. A methodology of distinguishing between regions in space and/or phases in time that show distinct physical behaviours is specific to this research practice. Accordingly, two types of models specific to the engineering sciences are introduced. The diagrammatic model represents the causal explanation of physical behaviour in distinct spatial regions or time phases; the nomo‐mathematical model represents the phenomenon in terms of a set of mathematically formulated laws.
This paper deals with the problem of derivational redundancy in scientific explanation, i.e. the problem that there can be very many different explanatory derivations of a natural phenomenon, while students and experts mostly come up with one and the same derivation for a phenomenon (modulo the order of applying laws). Given this agreement among humans, we need an account of how the derivation that humans come up with is selected from the space of possible derivations of a phenomenon. In this paper we argue that the problem of derivational redundancy can be solved by a new notion of “shortest derivation”, by which we mean the derivation that can be constructed from the fewest (and therefore largest) partial derivations of previously derived phenomena that function as “exemplars”. We show how the exemplar-based framework known as “Data-Oriented Parsing”, or “DOP”, can be employed to select the shortest derivation in scientific explanation. DOP’s shortest derivation of a phenomenon maximizes what is called the “derivational similarity” between a phenomenon and a corpus of exemplars. A preliminary investigation with exemplars from classical and fluid mechanics shows that the shortest derivation closely corresponds to the derivations that humans construct. Our approach also proposes a concrete solution to Kuhn’s problem of how we know on which exemplar a phenomenon can be modeled. We argue that humans model a phenomenon on the exemplar that is derivationally most similar to it, i.e. the exemplar from which the largest subtree(s) can be used to derive the phenomenon.
We argue that van der Velde and de Kamps's model does not solve the binding problem but merely shifts the burden of constructing appropriate neural representations of sentence structure to unexplained preprocessing of the linguistic input. As a consequence, their model cannot explain how different neural representations can be assigned to sentences that are structurally ambiguous.