Embedding-Based Machine Learning Approach for Automatic Classification of Turkish News Articles

Ahmet ATASOGLU; Yavuz Selim TASPINAR

doi:10.58190/icisna.2025.149

Authors

Ahmet ATASOGLU, Mechatronics Engineering Department, Selcuk University, Konya, https://orcid.org/0009-0008-8178-2177
Yavuz Selim TASPINAR, Mechatronics Engineering Department, Selcuk University, Konya, https://orcid.org/0000-0002-7278-4241

DOI:

https://doi.org/10.58190/icisna.2025.149

Keywords:

natural language processing, text embeddings, language model, text classification, Gemma language model

Abstract

In this study, an automatic text classification approach for Turkish news articles is presented. The savasy/ttc4900 dataset from HuggingFace, consisting of seven news categories, was used. News texts were converted into 768-dimensional vector representations using the embeddinggemma model on the Ollama framework. These embeddings were then used to evaluate the performance of several machine learning algorithms. Seven models were tested: Support Vector Classifier (SVC), Logistic Regression, Multilayer Perceptron, K-Nearest Neighbors, Random Forest, Gaussian Naive Bayes, and Decision Tree. Model performance was assessed using accuracy, precision, recall, and F1-score metrics. Results showed that SVC and Logistic Regression achieved the highest accuracy in the high-dimensional embedding space. The findings demonstrate that embedding-based representations offer strong discriminative capability for Turkish news classification and that deep learning–derived vector embeddings can be effectively combined with traditional machine learning methods. These results emphasize the importance of vectorized text representations in natural language processing research.
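The evaluation step described above can be illustrated with a minimal sketch in plain Python. It computes accuracy together with macro-averaged precision, recall, and F1-score for a multi-class problem. The abstract does not say which averaging scheme was used, so macro averaging is an assumption here; the function name `macro_metrics` and the toy labels are hypothetical placeholders, not the study's data or code.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Macro averaging (an assumption of this sketch) gives every class
    the same weight, regardless of how many samples it has.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gets a false positive
            fn[truth] += 1   # true class gets a false negative

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for illustration only (three hypothetical categories).
y_true = ["spor", "spor", "ekonomi", "ekonomi", "saglik", "saglik"]
y_pred = ["spor", "spor", "ekonomi", "saglik", "saglik", "saglik"]
report = macro_metrics(y_true, y_pred)
print(report)
```

In a setup like the one the abstract describes, `y_pred` would come from a classifier (for example SVC or Logistic Regression) trained on the 768-dimensional embeddings, and the same function could be applied to each of the seven models for comparison.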

Author Biography

Yavuz Selim TASPINAR, Mechatronics Engineering Department, Selcuk University, Konya

YAVUZ SELIM TASPINAR was born in Konya, Turkey, in 1984. He received B.Sc. degrees in Computer System Teaching in 2008 and in Computer Engineering in 2017 from Selcuk University, Konya, Turkey. He received an M.Sc. degree in Electronic and Computer Education from Selcuk University in 2012 and a Ph.D. degree in Mechatronic Engineering in 2021. He served as a Computer Teacher at the Ministry of Education from 2008 to 2017. He is currently a lecturer in the Transportation and Traffic Services Department, Selcuk University Doganhisar Vocational School. His research interests are neural networks, image processing, signal processing, and mechatronics.
