A Modified Boosted Ensemble Classifier on Location Based Social Networking
Keywords:Social Network Services (SNS), Location Based Social Media (LBSM), Random Forest, Imbalanced data learning
One of the research issues that researchers are interested in is unbalanced data classification techniques. Boosting approaches like Wang's Boosting and Modified Boosted SVM (MBSVM) have been demonstrated to be more effective for unbalanced data. Our proposal The Modified Boosted Random Forest (MBRF) classifier is a Random Forest classifier that uses the Boosting approach. The main motivation of the study is to analyze sentiment of geotagged tweets understanding the state of mind of people at FIFA and Olympics datasets. Tree based model Random Forest algorithm using boosting approach classifies the tweets to build a recommendation system with an idea of providing commercial suggestions to participants, recommending local places to visit or perform activities. MBRF employs various strategies: i) a distance-based weight-update method based on K-Medoids ii) a sign-based classifier elimination technique. We have equally partitioned the datasets as 70% of data allocated for training and the remaining 30% data as test data. Our imbalanced data ratio measured 3.1666 and 4.6 for FIFA and Olympics datasets. We looked at accuracy, precision, recall and ROC curves for each event. The average AUC achieved by MBRF on FIFA dataset is 0.96 and Olympics is 0.97. A comparison of MBRF and Decision tree model using 'Entropy' proved MBRF better.