Methodological protocol for developing reliable and valid AI-assisted content analysis: a practical guide with ChatGPT

Authors

Goyanes, M.; De-Marcos, L.

DOI:

https://doi.org/10.3145/thinkepi.2025.e19a07

Keywords:

Content analysis, Artificial intelligence, ChatGPT, Methodological protocol

Abstract

Since the emergence of generative artificial intelligence, with ChatGPT as one of its leading examples, automated coding of text for content analysis has become significantly more accessible. Despite these advances, no established methodological protocol yet examines this shift or offers recommendations for the rigorous use of artificial intelligence in content analysis. This study proposes a user-friendly methodological guide for implementing AI-assisted automated content analysis that aims to ensure the reliability and validity of the resulting data. The protocol is designed to be practical and accessible to researchers without specialized training in programming or computational social science, and it offers reasoned guidance on the most relevant processes.

Published

2025-04-01

How to Cite

Goyanes, M., & De-Marcos, L. (2025). Methodological protocol for developing reliable and valid AI-assisted content analysis: a practical guide with ChatGPT. Anuario ThinkEPI, 19. https://doi.org/10.3145/thinkepi.2025.e19a07

Section

Information and communication technologies