{"id":2714,"date":"2024-05-18T11:21:35","date_gmt":"2024-05-18T08:21:35","guid":{"rendered":"https:\/\/fti.dp.ua\/conf\/?p=2714"},"modified":"2024-05-18T11:21:37","modified_gmt":"2024-05-18T08:21:37","slug":"05187-1118","status":"publish","type":"post","link":"https:\/\/fti.dp.ua\/conf\/2024\/05187-1118\/","title":{"rendered":"Comparison of Multiprocessor and Multi-Threaded Implementations of the Entropy Approach to Impute Gaps in Data in Python"},"content":{"rendered":"\n<h1 class=\"wp-block-heading citation_title\">Comparison of Multiprocessor and Multi-Threaded Implementations of the Entropy Approach to Impute Gaps in Data in Python<\/h1>\n\n\n\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<h5 class=\"wp-block-heading citation_author\"><strong><strong><strong><strong><strong><strong><strong><strong>Oleksii Zemlianyi<\/strong><\/strong><\/strong><\/strong><\/strong><\/strong><\/strong><\/strong><\/h5>\n\n\n\n<p class=\"citation_author_url\"><em>ORCID: <a href=\"https:\/\/orcid.org\/0009-0001-6157-8725\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/orcid.org\/0009-0001-6157-8725<\/a><\/em><\/p>\n\n\n\n<p><em>Oles Honchar Dnipro National University<\/em><\/p>\n<\/div><\/div>\n\n\n\n<div style=\"height:1em\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<h5 class=\"wp-block-heading citation_author\"><strong><strong><strong><strong><strong><strong><strong><strong>Oleh Baibuz<\/strong><\/strong><\/strong><\/strong><\/strong><\/strong><\/strong><\/strong><\/h5>\n\n\n\n<p class=\"citation_author_url\"><em>ORCID: <a href=\"https:\/\/orcid.org\/0000-0001-7489-6952\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/orcid.org\/0000-0001-7489-6952<\/a><\/em><\/p>\n\n\n\n<p><em>Oles Honchar Dnipro National 
University<\/em><\/p>\n<\/div><\/div>\n\n\n\n<div style=\"height:1em\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>This work compares sequential, multi-processor, and multi-threaded implementations of the entropy-based approach to data imputation in the Python programming language. The main goal is to investigate ways of optimizing computations when implementing this approach. The authors explain the limitations that the Global Interpreter Lock (GIL) imposes on multi-threading in the Python interpreter: it prevents fully parallel data processing in a multi-threaded environment. Instead, they propose multi-processor computation, where each process has its own Python interpreter and GIL, allowing computational tasks to be distributed efficiently across multiple processor cores. The experimental part uses the UCI Heart Disease Data dataset hosted on the Kaggle platform. Gaps are introduced artificially and then imputed using the different implementations of the entropy approach, and both the imputation accuracy and the runtime of the algorithms are assessed. The authors consider three approaches (sequential, multi-threaded, and multi-processor) and compare their efficiency. The results show that the multi-threaded approach provides no speed advantage over the sequential approach and sometimes even degrades performance because of the time spent on thread switching. By contrast, the multi-processor approach reduces computation time, confirming its effectiveness for data imputation tasks. In the conclusions, the authors note that optimizing computations in Python requires accounting for the peculiarities of the GIL and recommend multi-processor computation to achieve better performance. 
Recommendations for further optimization are provided, including the use of vectorized computations and avoiding excessive input-output operations. This work is of practical importance for data science researchers working with Python and facing challenges in parallel data processing.<\/p>\n\n\n\n<div style=\"height:18px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-right is-layout-flex wp-container-core-buttons-is-layout-765c4724 wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/cims.fti.dp.ua\/j\/article\/view\/131\" target=\"_blank\" rel=\"noreferrer noopener\">FULL TEXT<\/a><\/div>\n<\/div>\n\n\n\n<div style=\"height:18px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-default\"\/>\n\n\n\n<div style=\"height:1em\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-group is-vertical is-content-justification-right is-layout-flex wp-container-core-group-is-layout-b6c475e2 wp-block-group-is-layout-flex\">\n<div class=\"wp-block-group is-content-justification-right is-nowrap is-layout-flex wp-container-core-group-is-layout-fd526d70 wp-block-group-is-layout-flex\"><div class=\"taxonomy-post_tag wp-block-post-terms\"><a href=\"https:\/\/fti.dp.ua\/conf\/tag\/cims-2024-vernal\/\" rel=\"tag\">CIMS 2024 Vernal<\/a><\/div>\n\n<div class=\"wp-block-post-date\"><time datetime=\"2024-05-18T11:21:35+03:00\">May 18, 2024<\/time><\/div><\/div>\n\n\n<div class=\"taxonomy-category wp-block-post-terms\"><a href=\"https:\/\/fti.dp.ua\/conf\/session\/info-tech-2\/\" rel=\"tag\">Information Technology and Cybersecurity<\/a><\/div><\/div>\n\n\n\n<div style=\"height:1em\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity 
is-style-default\"\/>\n","protected":false},"excerpt":{"rendered":"<p>Comparison of Multiprocessor and Multi-Threaded Implementations of the Entropy Approach to Impute Gaps in Data in Python Oleksii Zemlianyi ORCID: https:\/\/orcid.org\/0009-0001-6157-8725 Oles Honchar Dnipro National University Oleh Baibuz ORCID: https:\/\/orcid.org\/0000-0001-7489-6952 Oles Honchar Dnipro National University This work compares sequential, multi-processor, and multi-threaded implementations of the entropy-based approach to data imputation in the Python programming language. The main goal is to investigate ways of optimizing computations when implementing this approach. The authors &hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[35],"tags":[29],"class_list":["post-2714","post","type-post","status-publish","format-standard","hentry","category-info-tech-2","tag-cims-2024-vernal"],"_links":{"self":[{"href":"https:\/\/fti.dp.ua\/conf\/wp-json\/wp\/v2\/posts\/2714","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fti.dp.ua\/conf\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fti.dp.ua\/conf\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/fti.dp.ua\/conf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/fti.dp.ua\/conf\/wp-json\/wp\/v2\/comments?post=2714"}],"version-history":[{"count":1,"href":"https:\/\/fti.dp.ua\/conf\/wp-json\/wp\/v2\/posts\/2714\/revisions"}],"predecessor-version":[{"id":2715,"href":"https:\/\/fti.dp.ua\/conf\/wp-json\/wp\/v2\/posts\/2714\/revisions\/2715"}],"wp:attachment":[{"href":"https:\/\/fti.dp.ua\/conf\/wp-json\/wp\/v2\/media?parent=2714"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fti.dp.ua\/conf\/wp-json\/wp\/v2\/categories?post=2
714"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fti.dp.ua\/conf\/wp-json\/wp\/v2\/tags?post=2714"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}