Combining prompt-based language models and weak supervision for labeling named entity recognition on legal documents

Artificial Intelligence and Law:1-21 (forthcoming)

Abstract

Named entity recognition (NER) is a key task for text information retrieval in natural language processing (NLP). Most recent state-of-the-art NER methods require humans to annotate data for model training. However, having humans identify, delimit, and label entities manually can be very expensive in terms of time, money, and effort. This paper investigates the use of prompt-based language models (OpenAI’s GPT-3) and weak supervision in the legal domain. We apply both strategies as alternatives to the traditional human-based annotation method, relying on computer power instead of human effort for labeling, and then compare the performance of models trained on computer-generated and human-generated data. We also introduce combinations of all three methods (prompt-based, weak supervision, and human annotation), aiming to find ways to maintain high model efficiency at low annotation cost. We show that, although human labeling still yields the best overall performance, the alternative strategies and their combinations are valid options, producing similar model scores at lower cost. In the final results, the alternative approaches preserve, on average, the following shares of the human-trained models' scores: 74.0% for GPT-3, 95.6% for weak supervision, 90.7% for the GPT + weak supervision combination, and 83.9% for the GPT + 30% human-labeling combination.
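
To make the "preservation" figures concrete, the following is a minimal sketch of one plausible reading: the score of a model trained on machine-generated labels expressed as a percentage of the score of the human-trained model, averaged over entity types. The abstract does not define the metric, so the formula, the function names, and every number below are assumptions chosen purely for illustration, not values from the paper.

```python
from statistics import mean


def preservation(alt_score: float, human_score: float) -> float:
    """Return alt_score as a percentage of human_score (assumed definition)."""
    if human_score <= 0:
        raise ValueError("human_score must be positive")
    return 100.0 * alt_score / human_score


def average_preservation(pairs: list[tuple[float, float]]) -> float:
    """Average the per-entity preservation over (alt_score, human_score) pairs."""
    return mean(preservation(alt, human) for alt, human in pairs)


# Hypothetical per-entity F1 scores (alternative labeling, human labeling).
# These numbers are invented for illustration and do not come from the paper.
hypothetical_scores = [(0.76, 0.80), (0.45, 0.90)]

print(f"{average_preservation(hypothetical_scores):.1f}%")
```

Under this reading, a value close to 100% would mean that the cheaper labeling strategy loses little relative to human annotation.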

Analytics

Added to PP
2024-02-15
