Embedding-Based Machine Learning Approach for Automatic Classification of Turkish News Articles

Authors

DOI:

https://doi.org/10.58190/icisna.2025.149

Keywords:

natural language processing, text embeddings, language model, text classification, Gemma language model

Abstract

In this study, an automatic text classification approach for Turkish news articles is presented. The savasy/ttc4900 dataset from HuggingFace, consisting of seven news categories, was used. News texts were converted into 768-dimensional vector representations using the embeddinggemma model on the Ollama framework. These embeddings were then used to evaluate the performance of several machine learning algorithms. Seven models were tested: Support Vector Classifier (SVC), Logistic Regression, Multilayer Perceptron, K-Nearest Neighbors, Random Forest, Gaussian Naive Bayes, and Decision Tree. Model performance was assessed using accuracy, precision, recall, and F1-score metrics. Results showed that SVC and Logistic Regression achieved the highest accuracy in the high-dimensional embedding space. The findings demonstrate that embedding-based representations offer strong discriminative capability for Turkish news classification and that deep learning–derived vector embeddings can be effectively combined with traditional machine learning methods. These results emphasize the importance of vectorized text representations in natural language processing research.
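The four evaluation metrics named in the abstract can be illustrated with a short sketch. The abstract does not say how per-class scores were aggregated across the seven categories, so macro averaging is assumed here. The function name `macro_metrics` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging is an assumption made for illustration; the abstract
    does not state which averaging scheme the authors used.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example with two illustrative category names (not real predictions).
y_true = ["spor", "spor", "ekonomi", "ekonomi"]
y_pred = ["spor", "ekonomi", "ekonomi", "ekonomi"]
scores = macro_metrics(y_true, y_pred)
print(scores)
```

In this toy case, one of the two "spor" articles is misclassified, so recall for that class drops to 0.5 while its precision stays at 1.0. This shows why accuracy alone can hide per-class weaknesses.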

Author Biography

Yavuz Selim TASPINAR, Mechatronics Engineering Department, Selcuk University, Konya

YAVUZ SELIM TASPINAR was born in Konya, Turkey, in 1984. He received B.Sc. degrees in Computer System Teaching in 2008 and in Computer Engineering in 2017 from Selcuk University, Konya, Turkey. He received an M.Sc. degree in Electronic and Computer Education in 2012 from Selcuk University and a Ph.D. degree in Mechatronic Engineering in 2021. He served as a computer teacher at the Ministry of Education from 2008 to 2017. He is currently a lecturer in the Transportation and Traffic Services Department, Selcuk University Doganhisar Vocational School. His research interests are neural networks, image processing, signal processing, and mechatronics.

Published

2025-12-12

How to Cite

ATASOGLU, A., & TASPINAR, Y. S. (2025). Embedding-Based Machine Learning Approach for Automatic Classification of Turkish News Articles. Proceedings of International Conference on Intelligent Systems and New Applications, 3, 103–108. https://doi.org/10.58190/icisna.2025.149