New Synonyms Extraction Model Based on a Novel Terms Weighting Scheme
Keywords: Automatic Synonyms Extraction, Cosine Similarity, Orbit Weighting Scheme, Semantic Context Analysis, Vector Space-based Extraction
The traditional statistical approach to synonyms extraction is time-consuming, so a new method is needed to improve both efficiency and accuracy. This research presents a new synonyms extraction method called Noun Based Distinctive Verbs (NBDV), which replaces the traditional tf-idf weighting scheme with a new weighting scheme called the Orbit Weighting Scheme (OWS). The OWS links nouns to their semantic space by examining the singular verbs in each context. The new method was compared with important models in the field: Skip-Gram, Continuous Bag of Words, and GloVe. The NBDV model was applied to the Arabic and English languages, and the results showed 47% recall and 51% precision in the dictionary-based evaluation and 57.5% precision in the human experts' evaluation. Compared with synonyms extraction based on tf-idf, NBDV obtained 11% higher recall and 10% higher precision. Regarding efficiency, we found that, on average, extracting the synonyms of a single noun requires processing 186 verbs, and in 63% of the runs the number of singular verbs was less than 200. We conclude that the developed method is efficient and processes a single run in linear time.
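For orientation, the kind of tf-idf baseline that NBDV is compared against can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated in this research: each noun is represented by the verbs observed in its contexts, the verbs are weighted with a standard tf-idf formula, and candidate synonyms are ranked by cosine similarity. The toy data and the names `noun_verbs`, `tfidf_vectors`, `cosine`, and `rank_candidates` are hypothetical, and the sketch does not implement the OWS.

```python
import math
from collections import Counter

# Hypothetical toy data: the verbs observed in the contexts of each noun.
noun_verbs = {
    "car": ["drive", "park", "repair", "drive"],
    "automobile": ["drive", "park", "sell"],
    "banana": ["eat", "peel", "sell"],
}

def tfidf_vectors(noun_verbs):
    """Build one sparse tf-idf vector (verb -> weight) per noun.

    Each noun plays the role of a 'document' and each verb the role of a 'term'.
    """
    n_nouns = len(noun_verbs)
    # Document frequency: number of nouns each verb occurs with.
    df = Counter()
    for verbs in noun_verbs.values():
        df.update(set(verbs))
    vectors = {}
    for noun, verbs in noun_verbs.items():
        tf = Counter(verbs)
        vectors[noun] = {
            verb: (count / len(verbs)) * math.log(n_nouns / df[verb])
            for verb, count in tf.items()
        }
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(verb, 0.0) for verb, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_candidates(target, vectors):
    """Return the other nouns sorted by decreasing similarity to the target."""
    scores = [
        (noun, cosine(vectors[target], vec))
        for noun, vec in vectors.items()
        if noun != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

vectors = tfidf_vectors(noun_verbs)
ranking = rank_candidates("car", vectors)
print(ranking)
```

On this toy data, "automobile" ranks above "banana" as a candidate synonym of "car", because the two nouns share the verbs "drive" and "park", whereas "car" and "banana" share no verbs at all.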