Automatically Generated Keywords: A Comparison to Author-Generated Keywords in the Sciences

Charlie D. Hurt


This paper examines the differences between author-generated keywords and automatically generated keywords in one area of scientific and technical literature. Keywords produced by both methods are examined using inverse frequency and a maximum likelihood algorithm. By reducing the scope and size of the corpus of literature examined, this study more closely emulates the information-gathering processes of scientists and technologists. The sample was developed with care, balancing statistical factors to allow interpretable outcomes and replication. The results indicate no statistically significant differences between the two techniques.


keywords; autogenerated; author-generated; NLP

