A multifactor model for detecting propaganda in textual data

Authors

Olena Gavrilenko, Kyryl Feshchenko

DOI:

https://doi.org/10.20535/2786-8729.7.2025.342630

Keywords:

information technology, propaganda, publication, multifactor model, statistical analysis, data mining, machine learning, text mining, recommendations

Abstract

Detecting propaganda in large volumes of textual data is a key tool for countering information warfare worldwide. This paper presents a multifactor model for determining the level of propaganda in a publication. The analyzed publications were text-based news articles and social media posts, processed with both quantitative and semantic text analysis methods. The model was constructed by linear convolution, which combines multiple heterogeneous indicators into a single value reflecting the degree of propaganda.

The proposed model uses thirteen indicators, each of which signals the potential presence of propaganda in a text when its value is high. The indicators cover lexical, syntactic, and semantic characteristics such as emotional tone, subjective evaluation, the presence of manipulative triggers, and calls to action. The value of each indicator was calculated using statistical analysis, data mining, and machine learning methods. The paper proposes an algorithm for determining the influence level of each factor and a scale for assessing the overall level of propaganda. For every analyzed publication, a utility function value was computed to quantify its propaganda intensity. The threshold value of this utility function, above which a publication is considered propagandistic, was defined as the sample mean across the dataset. This approach allows textual materials to be classified objectively without expert labeling. The advantage of the method is that each indicator is derived exclusively from empirical statistical data and validated computational procedures, which removes human subjectivity from the scoring. The study demonstrates that the modified multifactor model can serve as a universal analytical tool for detecting propaganda in various types of textual data, thereby enhancing the transparency and reliability of media content analysis.
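The scoring scheme described in the abstract can be illustrated with a minimal Python sketch. It assumes that linear convolution means a weighted sum of indicator values and that the indicators are already normalized to the range [0, 1]. The function names, the three-indicator example, the weights, and the indicator values are all hypothetical; the abstract does not give the paper's thirteen indicators in computable form or its algorithm for determining the influence levels.

```python
from statistics import mean


def linear_convolution(indicators, weights):
    """Weighted sum of indicator values (assumed normalized to [0, 1])."""
    if len(indicators) != len(weights):
        raise ValueError("indicators and weights must have the same length")
    return sum(w * x for w, x in zip(weights, indicators))


def classify(publications, weights):
    """Score each publication and flag those above the sample mean."""
    scores = [linear_convolution(p, weights) for p in publications]
    threshold = mean(scores)  # threshold = sample mean across the dataset
    labels = [s > threshold for s in scores]
    return scores, threshold, labels


# Hypothetical data: three indicators per publication instead of thirteen.
weights = [0.5, 0.3, 0.2]      # assumed influence levels, summing to 1
publications = [
    [0.9, 0.8, 0.7],           # strongly emotive, many triggers
    [0.1, 0.2, 0.1],           # neutral reporting
    [0.5, 0.4, 0.6],           # mixed
]

scores, threshold, labels = classify(publications, weights)
print(labels)  # -> [True, False, True]
```

Because the threshold is the mean of the scores in the sample, the resulting labels are relative to the dataset: the same publication may fall on either side of the threshold depending on which other texts it is scored alongside.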

Author Biographies

Olena Gavrilenko, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv

Associate Professor of the Department of Information Systems and Technologies of the Faculty of Informatics and Computer Technique, Candidate of Science (Mathematics), Associate Professor

 

Kyryl Feshchenko, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv

PhD student of the Department of Information Systems and Technologies of the Faculty of Informatics and Computer Technique

Published

2025-12-27

How to Cite

[1]
O. Gavrilenko and K. Feshchenko, “A multifactor model for detecting propaganda in textual data”, Inf. Comput. and Intell. syst. j., no. 7, pp. 160–179, Dec. 2025.