Finding lists of people on the web

ACM SIGCAS Computers and Society 34 (1 special issue):1-21 (2004)

Abstract

Among the vast amounts of personal information published on the World Wide Web (“Web”) and indexed by search engines are lists of names of people. Examples include employees at companies, students enrolled in universities, officers in the military, law enforcement personnel, members of social organizations, and lists of acquaintances. Knowing who works where, attends what, or affiliates with whom provides strategic knowledge to competitors, marketers, and government surveillance efforts. However, finding online rosters of people does not lend itself to keyword lookup on search engines because the keywords tend to be common expressions such as “employees” or “students.” A typical search often retrieves hundreds of Web pages, requiring many hours of human inspection to locate a page containing a list of names. As a result, people may falsely believe online rosters provide more privacy than they do. This paper presents RosterFinder, a set of simple algorithms for locating Web pages that consist predominantly of a list of names. The specific names are not known beforehand. RosterFinder works by identifying rosters from candidate Web pages based on the ratio of distinct known names to distinct words appearing in the page. Accurate classification by RosterFinder depends on the set of names used. Results are reported on real Web pages using: (1) dictionary lookup employing a limited set of known names; and (2) dictionary lookup utilizing an extensive set of known names. Privacy implications are discussed using the example of FERPA and online student rosters.
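
The ratio test described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the name dictionary `KNOWN_NAMES`, the function names, the tokenization rule, and the 0.5 threshold are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import re

# Hypothetical, tiny dictionary of known names (an assumption for this
# sketch; the abstract says only that a limited or an extensive set is used).
KNOWN_NAMES = {"alice", "bob", "carol", "david", "smith", "jones", "lee"}


def name_ratio(text, known_names):
    """Return the ratio of distinct known names to distinct words in text."""
    # Assumed tokenization: runs of ASCII letters, compared case-insensitively.
    words = {w.lower() for w in re.findall(r"[A-Za-z]+", text)}
    if not words:
        return 0.0
    return len(words & known_names) / len(words)


def looks_like_roster(text, known_names, threshold=0.5):
    """Flag a page as a likely roster when the ratio reaches the threshold.

    The threshold value is an illustrative assumption, not a figure
    taken from the paper.
    """
    return name_ratio(text, known_names) >= threshold


roster_page = "Alice Smith\nBob Jones\nCarol Lee\nDavid Smith"
prose_page = "Our employees and students enjoy the campus"

print(looks_like_roster(roster_page, KNOWN_NAMES))
print(looks_like_roster(prose_page, KNOWN_NAMES))
```

As the abstract notes, classification accuracy depends on the set of names used: with a dictionary this small, a roster made up of names outside the set would score low and be missed.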


Similar books and articles

ハイパーリンクのグラフ構造に基づく Web コミュニティの洗練 [Refinement of Web communities based on the graph structure of hyperlinks]. Murata Tsuyoshi - 2002 - Transactions of the Japanese Society for Artificial Intelligence 17:322-329.
Sense and Reference on the Web. Harry Halpin - 2011 - Minds and Machines 21 (2):153-178.
参照の共起性に基づく Web コミュニティの発見 [Discovery of Web communities based on co-occurrence of references]. Murata Tsuyoshi - 2001 - Transactions of the Japanese Society for Artificial Intelligence 16 (3):316-323.
Defining web ethics. Marsha Woodbury - 1998 - Science and Engineering Ethics 4 (2):203-212.
Social web and identity: a likely encounter. [REVIEW] Thierry Nabeth - 2009 - Identity in the Information Society 2 (1):1-5.
画像検索のための Web テキストによる画像クラスタリング [Image clustering using Web text for image retrieval]. Nagata Akiko & Sunayama Wataru - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:580-588.


