THE COMPARISON BETWEEN LOGISTIC REGRESSION AND CONVOLUTIONAL NEURAL NETWORK FOR MULTI-DRUG RESISTANT TUBERCULOSIS PREDICTION
Abstract
Multi-drug resistant tuberculosis (MDR-TB) is caused by Mycobacterium tuberculosis strains that are resistant to at least isoniazid and rifampicin, the two most important first-line anti-TB drugs. It is a major global health challenge, particularly in low- and middle-income countries, where affordable and rapid diagnostic tools are urgently needed. One approach under investigation combines whole-genome sequencing with machine learning to predict drug resistance. In this study, Logistic Regression (LR) and Convolutional Neural Network (CNN) models were trained on Mycobacterium tuberculosis genomic data obtained from databases to predict drug resistance. In accuracy and specificity, CNN slightly outperformed LR for rifampicin and pyrazinamide, while LR performed better for isoniazid and ethambutol. In sensitivity, LR performed better for most drugs, with the exception of ethambutol, where CNN was superior. Although the computational complexity assessment could not be completed because of hardware limitations, each model showed distinct advantages in predicting resistance to first-line anti-TB drugs.
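The three metrics used in the comparison (accuracy, sensitivity, specificity) can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function name `resistance_metrics`, the label encoding (1 = resistant, 0 = susceptible), and the toy labels are assumptions made purely for illustration.

```python
def resistance_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity and specificity for binary labels.

    Assumed encoding (hypothetical): 1 = resistant, 0 = susceptible.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # resistant, predicted resistant
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # susceptible, predicted susceptible
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # susceptible, predicted resistant
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # resistant, predicted susceptible
    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        # share of all isolates classified correctly
        "accuracy": (tp + tn) / total if total else nan,
        # share of truly resistant isolates that were detected
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # share of truly susceptible isolates that were cleared
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Toy labels, invented for illustration only (not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = resistance_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters here because a model can rank higher on one metric and lower on another for the same drug, which is the pattern the abstract describes.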
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.