A Classifier Learning Method via Data Summaries

Transactions of the Japanese Society for Artificial Intelligence 17 (5):565-575 (2002)

Abstract

Knowledge discovery in databases (KDD) has been studied intensively in recent years. In KDD, inductive classifier learning methods developed in statistics and machine learning have been used to extract classification rules from databases. Although KDD often involves large databases, many earlier classifier learning methods are not suited to them. They were designed under the assumption that any datum in the database is accessible on demand, and they usually need to access a datum several times during learning. As a result, they require either a huge memory space or a high I/O cost for accessing storage devices. In this paper, we propose a classifier learning method, which we call CIDRE, in which data summaries are constructed and classifiers are learned from the summaries. The method is realized using a clustering method, which we call MCF-tree, an extension of the CF-tree proposed by Zhang et al. With this method, the size of the memory space occupied by the data summaries can be specified, and the database is scanned only once to construct the summaries. In addition, new instances can be inserted into the summaries incrementally. The method thus has properties that are important for dealing with large databases. We also present empirical results indicating that our method performs very well in comparison with C4.5 and naive Bayes, and that the extension from CF-tree to MCF-tree is indispensable for achieving high classification accuracy.
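The idea of learning a classifier from summaries rather than from raw data can be illustrated with a minimal sketch. It is not CIDRE and does not reproduce the MCF-tree, whose details the abstract does not give. It assumes numeric attributes, keeps a single clustering feature (count, linear sum, sum of squares, as in the CF-tree of Zhang et al.) per class, and derives a Gaussian naive-Bayes-style classifier from those summaries. The names `ClusteringFeature`, `summarize`, and `predict` are hypothetical.

```python
import math
from collections import defaultdict


class ClusteringFeature:
    """Additive summary (N, LS, SS) of a set of numeric vectors."""

    def __init__(self, dim):
        self.n = 0               # number of instances absorbed
        self.ls = [0.0] * dim    # per-dimension linear sum
        self.ss = [0.0] * dim    # per-dimension sum of squares

    def insert(self, x):
        # Incremental update: the instance itself is not stored.
        self.n += 1
        for i, v in enumerate(x):
            self.ls[i] += v
            self.ss[i] += v * v

    def mean(self):
        return [s / self.n for s in self.ls]

    def variance(self, floor=1e-6):
        # Var = E[x^2] - E[x]^2, floored to avoid division by zero.
        return [max(q / self.n - (s / self.n) ** 2, floor)
                for s, q in zip(self.ls, self.ss)]


def summarize(stream, dim):
    """Scan (vector, label) pairs once, keeping one summary per class.

    Assumption of this sketch: one summary per class. A tree of many
    summaries under a memory budget is not modeled here.
    """
    summaries = defaultdict(lambda: ClusteringFeature(dim))
    for x, label in stream:
        summaries[label].insert(x)
    return summaries


def predict(summaries, x):
    """Pick the class with the highest Gaussian log-likelihood plus log-prior."""
    total = sum(cf.n for cf in summaries.values())
    best_label, best_score = None, -math.inf
    for label, cf in summaries.items():
        score = math.log(cf.n / total)
        for v, m, var in zip(x, cf.mean(), cf.variance()):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

In this sketch memory use depends only on the number of classes and attributes, the data are read once, and further instances can be absorbed later by calling `insert` on the existing summaries, which mirrors the properties the abstract lists.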

Similar books and articles

Machine Learning and Job Posting Classification: A Comparative Study. Ibrahim M. Nasser & Amjad H. Alzaanin - 2020 - International Journal of Engineering and Information Systems (IJEAIS) 4 (9):06-14.
A Sentence Compression Method Formulated as a Combinatorial Optimization Problem Using Discriminative Learning. 鈴木 潤 平尾 努 - 2007 - Transactions of the Japanese Society for Artificial Intelligence 22 (6):574-584.
Detection of Minority Sets Based on an Information-Theoretic Framework. 佐久間 淳 安藤 晋 - 2007 - Transactions of the Japanese Society for Artificial Intelligence 22 (3):311-321.
Generation of New Concepts Using the Relational Structure of Knowledge. 延澤 志保 金盛 克俊 - 2006 - Transactions of the Japanese Society for Artificial Intelligence 21 (5):450-458.

Analytics

Added to PP
2014-03-24
