SMS Spam Detection
EasyChair Preprint 5166
35 pages • Date: March 16, 2021

Abstract
Over recent years, as the popularity of mobile phone devices has increased, Short Message Service (SMS) has grown into a multi-billion-dollar industry. In this project, a database of real SMS spam messages from the UCI Machine Learning Repository is used, and after preprocessing and feature extraction, different machine learning techniques are applied to the database. Finally, the results are compared and the best algorithm for spam filtering of text messages is introduced. Final simulation results using 10-fold cross-validation show that the best classifier in this work reduces the overall error rate of the best model in the original paper citing this dataset by more than half. The algorithms used for classifying spam messages in mobile device communication are logistic regression (LR), K-nearest neighbour (K-NN), and decision tree (DT). The SMS Spam Collection set is used for testing the method.

Keyphrases: Bayes Theorem, Count Vectorization, Preprocessing
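The pipeline named in the abstract (count vectorization, a classifier, and 10-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the tokenizer, the use of cosine similarity for K-NN, the choice of k = 3, and all function names are assumptions made for illustration, and the toy messages are invented rather than taken from the UCI SMS Spam Collection.

```python
import math
import re
from collections import Counter

def count_vectorize(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (assumed metric)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors most similar to the query."""
    nearest = sorted(train, key=lambda item: cosine(item[0], query),
                     reverse=True)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def cross_val_error(data, n_folds=10, k=3):
    """Overall error rate of K-NN under n-fold cross-validation."""
    vectors = [(count_vectorize(text), label) for text, label in data]
    errors = 0
    for fold in range(n_folds):
        # Every n_folds-th example goes to the test fold; the rest train.
        test = vectors[fold::n_folds]
        train = [v for i, v in enumerate(vectors) if i % n_folds != fold]
        errors += sum(knn_predict(train, vec, k) != label
                      for vec, label in test)
    return errors / len(vectors)

# Invented toy messages for illustration only (not from the UCI dataset).
messages = [
    ("win a free prize now", "spam"), ("see you at lunch", "ham"),
    ("free cash prize claim now", "spam"), ("lunch at noon tomorrow", "ham"),
    ("claim your free ringtone", "spam"), ("are you coming home", "ham"),
    ("urgent you won cash", "spam"), ("call me when you are home", "ham"),
    ("free entry win cash now", "spam"), ("see you tomorrow at work", "ham"),
    ("you won a free holiday", "spam"), ("running late see you soon", "ham"),
    ("text now to claim prize", "spam"), ("what time is lunch", "ham"),
    ("win cash text now", "spam"), ("home soon call you later", "ham"),
    ("free prize urgent claim", "spam"), ("coming to work tomorrow", "ham"),
    ("urgent free cash offer", "spam"), ("see you at home later", "ham"),
]

print("10-fold error rate:", cross_val_error(messages))
```

In practice a library such as scikit-learn would typically replace the hand-written pieces, and logistic regression or a decision tree could be swapped in for the K-NN step under the same cross-validation loop.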