Abstract
This paper describes new methods of automatically extracting documents for screening purposes, i.e. the computer selection of sentences having the greatest potential for conveying to the reader the substance of the document. While previous work has focused on one component of sentence significance, namely, the presence of high-frequency content words (key words), the methods described here also treat three additional components: pragmatic words (cue words); title and heading words; and structural indicators (sentence location).
The research has resulted in an operating system and a research methodology. The extracting system is parameterized to control and vary the influence of the above four components. The research methodology includes procedures for the compilation of the required dictionaries, the setting of the control parameters, and the comparative evaluation of the automatic extracts with manually produced extracts. The results indicate that the three newly proposed components dominate the frequency component in the production of better extracts.
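The idea of a parameterized extractor that weighs the four components can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything below is an assumption made for illustration: the linear combination, the way each component is scored, and the names `Weights`, `score_sentences`, and `extract` are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Weights:
    """Hypothetical control parameters, one per component."""
    cue: float = 1.0
    key: float = 1.0
    title: float = 1.0
    location: float = 1.0


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [w for w in words if w]


def score_sentences(sentences, title, cue_words, stop_words, weights):
    """Score each sentence as a weighted sum of four component scores.

    The linear form is an assumption; the abstract only says the
    influence of the components is controlled by parameters.
    """
    tokens = [tokenize(s) for s in sentences]
    # Key component: document-wide frequency of content words.
    freq = Counter(w for t in tokens for w in t if w not in stop_words)
    title_words = set(tokenize(title)) - set(stop_words)
    last = len(tokens) - 1
    scores = []
    for i, t in enumerate(tokens):
        cue = sum(1 for w in t if w in cue_words)
        # Only words that recur in the document count as key words here.
        key = sum(freq[w] for w in t if w not in stop_words and freq[w] > 1)
        ttl = sum(1 for w in t if w in title_words)
        # Location component: favor the first and last sentence.
        loc = 1 if i in (0, last) else 0
        scores.append(weights.cue * cue + weights.key * key
                      + weights.title * ttl + weights.location * loc)
    return scores


def extract(sentences, scores, n):
    """Return the n highest-scoring sentences in document order."""
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:n])]
```

Setting a weight to zero switches that component off, which is one simple way to compare the frequency component against the other three on the same document.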