MULTI-STAGE FILTERING ALGORITHM FOR DETECTING OBFUSCATED SPAM MESSAGES IN THE UZBEK LANGUAGE

Authors

  • Muzaffar Atajonov

DOI:

https://doi.org/10.47390/ts-v4i4y2026N01

Keywords:

spam, SMS spam, obfuscated spam, spam filtering, machine learning, text classification, Uzbek language.

Abstract

This paper addresses the detection of obfuscated spam messages written in the Uzbek language. It proposes a multi-stage filtering algorithm with four stages: text preprocessing, obfuscation normalization, feature extraction, and classification. According to the reported experimental results, the proposed algorithm improves spam detection performance.
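
The four stages named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the look-alike character table, the separator and repetition rules, the use of character trigrams, the Naive Bayes classifier, and the four toy training messages are all assumptions chosen only to make the staged structure concrete.

```python
import math
import re
from collections import Counter

# Hypothetical look-alike table (assumption); a real filter would need a
# curated, Uzbek-specific mapping.
LOOKALIKES = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)


def preprocess(text):
    """Stage 1: lowercase the message and collapse whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def normalize_obfuscation(text):
    """Stage 2: undo simple character-level disguises."""
    text = text.translate(LOOKALIKES)
    # Drop separators inserted between letters, e.g. "b.e.p.u.l" -> "bepul".
    text = re.sub(r"(?<=[a-z])[.\-_*](?=[a-z])", "", text)
    # Collapse runs of three or more identical characters to one.
    return re.sub(r"(.)\1{2,}", r"\1", text)


def extract_features(text, n=3):
    """Stage 3: bag of character n-grams."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


class NaiveBayes:
    """Stage 4: multinomial Naive Bayes with add-one smoothing."""

    def fit(self, docs, labels):
        self.priors = Counter(labels)
        self.counts = {}
        for feats, label in zip(docs, labels):
            self.counts.setdefault(label, Counter()).update(feats)
        self.vocab = {g for c in self.counts.values() for g in c}
        return self

    def predict(self, feats):
        n_docs = sum(self.priors.values())

        def score(label):
            total = sum(self.counts[label].values()) + len(self.vocab)
            s = math.log(self.priors[label] / n_docs)
            for gram, k in feats.items():
                s += k * math.log((self.counts[label][gram] + 1) / total)
            return s

        return max(self.counts, key=score)


def pipeline(text):
    """Run stages 1 to 3 and return the feature vector."""
    return extract_features(normalize_obfuscation(preprocess(text)))


# Toy, hypothetical training messages (assumption), far too few for real use.
TRAIN = [
    ("bepul yutuq pul oling", "spam"),
    ("yutuq bepul sovga", "spam"),
    ("salom ertaga uchrashamiz", "ham"),
    ("salom ishlar yaxshimi", "ham"),
]
model = NaiveBayes().fit(
    [pipeline(text) for text, _ in TRAIN], [label for _, label in TRAIN]
)

print(normalize_obfuscation(preprocess("B.3.P.U.L  yuuutuq")))
print(model.predict(pipeline("B.3.P.U.L  yuuutuq")))
```

In this sketch the disguised input "B.3.P.U.L  yuuutuq" is normalized to "bepul yutuq" before features are extracted, so the classifier sees the same trigrams as in the plain training messages. Note that mapping digits to letters would also rewrite phone numbers and prices, so a practical normalizer would need to restrict the substitution to word-internal positions.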

Submitted

2026-04-24

Published

2026-04-25

How to Cite

Atajonov, M. (2026). MULTI-STAGE FILTERING ALGORITHM FOR DETECTING OBFUSCATED SPAM MESSAGES IN THE UZBEK LANGUAGE. Techscience Uz - Topical Issues of Technical Sciences, 4(4), 5–10. https://doi.org/10.47390/ts-v4i4y2026N01
