Show simple item record

dc.contributor.author: İslamoğlu, Ertuğrul
dc.contributor.author: Adalı, Ömer Kerem
dc.contributor.author: Aydın, Musa
dc.contributor.author: Nizam, Ali
dc.date.accessioned: 2025-02-24T08:41:52Z
dc.date.available: 2025-02-24T08:41:52Z
dc.date.issued: 2025 [en_US]
dc.identifier.citation: NİZAM, Ali, Ertuğrul İSLAMOĞLU, Ömer Kerem ADALI & Musa AYDIN. "Optimizing Pre-Trained Code Embeddings With Triplet Loss for Code Smell Detection." IEEE Access, 13 (2025): 1-16. [en_US]
dc.identifier.uri: https://ieeexplore.ieee.org/document/10890964
dc.identifier.uri: https://hdl.handle.net/11352/5189
dc.description.abstract: Code embedding represents code semantics in vector form. Although code embedding-based systems have been successfully applied to various source code analysis tasks, further research is required to enhance code embedding for better code analysis capabilities, aiming to surpass the performance and functionality of static code analysis tools. In addition, standard methods for improving code embedding are essential to develop more effective embedding-based systems, similar to augmentation techniques in the image processing domain. This study aims to create a contrastive learning-based system to explore the potential of a generic method for enhancing code embedding for code classification tasks. A triplet loss-based deep learning network is designed to optimize in-class similarity and increase the distance between classes. An experimental dataset that contains code from the Java, Python, and PHP programming languages and 4 different code smells is created by collecting code from open-source repositories on GitHub. We evaluate the proposed system's effectiveness with the widely used BERT, CodeBERT, and GraphCodeBERT pre-trained models to create code embedding for the code classification task of code smell detection. Our findings indicate that the proposed system may offer improvements in accuracy of 8% on average and up to 13% across models. These results suggest that incorporating contrastive learning techniques into the generation process of code representation as a preprocessing step can enhance performance in code analysis. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: IEEE [en_US]
dc.relation.isversionof: 10.1109/ACCESS.2025.3542566 [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Code embedding [en_US]
dc.subject: Contrastive learning [en_US]
dc.subject: Triplet loss [en_US]
dc.subject: Code smell detection [en_US]
dc.title: Optimizing Pre-Trained Code Embeddings With Triplet Loss for Code Smell Detection [en_US]
dc.type: article [en_US]
dc.relation.journal: IEEE Access [en_US]
dc.contributor.department: FSM Vakıf Üniversitesi [en_US]
dc.contributor.authorID: https://orcid.org/0000-0002-5613-0686 [en_US]
dc.contributor.authorID: https://orcid.org/0009-0005-8400-5611 [en_US]
dc.contributor.authorID: https://orcid.org/0000-0002-5825-2230 [en_US]
dc.identifier.volume: 13 [en_US]
dc.identifier.startpage: 1 [en_US]
dc.identifier.endpage: 16 [en_US]
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member [en_US]
dc.contributor.institutionauthor: Nizam, Ali
dc.contributor.institutionauthor: İslamoğlu, Ertuğrul
dc.contributor.institutionauthor: Adalı, Ömer Kerem
dc.contributor.institutionauthor: Aydın, Musa
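
The abstract above describes a triplet-loss network that pulls same-class code embeddings together and pushes different-class embeddings apart before classification. The sketch below is a minimal illustration of that general idea, assuming a PyTorch/Hugging Face setup with the public microsoft/codebert-base checkpoint; the projection-head sizes, margin, learning rate, and toy code snippets are illustrative assumptions, not the authors' configuration.

```python
# Minimal triplet-loss sketch over pre-trained code embeddings (illustrative only).
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

device = "cuda" if torch.cuda.is_available() else "cpu"

# Pre-trained encoder (CodeBERT); BERT or GraphCodeBERT checkpoints plug in the same way.
tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
encoder = AutoModel.from_pretrained("microsoft/codebert-base").to(device)

# Small projection head trained with triplet loss; the 768 -> 256 -> 128 sizes are assumed.
projection = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 128)).to(device)

def embed(snippets):
    """Return L2-normalized projected [CLS] embeddings for a batch of code snippets."""
    batch = tokenizer(snippets, return_tensors="pt", padding=True,
                      truncation=True, max_length=256).to(device)
    cls = encoder(**batch).last_hidden_state[:, 0]  # [CLS] token vectors
    return nn.functional.normalize(projection(cls), dim=-1)

# Toy triplets: anchor and positive share a code-smell class, the negative does not.
anchors   = ["def f(a, b, c, d, e, f, g): return a + b + c + d + e + f + g"]  # e.g. long parameter list
positives = ["def g(p1, p2, p3, p4, p5, p6): return p1 * p2"]
negatives = ["def h(x): return x * 2"]  # clean snippet

triplet_loss = nn.TripletMarginLoss(margin=1.0)
optimizer = torch.optim.AdamW(
    list(encoder.parameters()) + list(projection.parameters()), lr=2e-5)

encoder.train(); projection.train()
for _ in range(3):  # a few illustrative optimization steps
    loss = triplet_loss(embed(anchors), embed(positives), embed(negatives))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"triplet loss: {loss.item():.4f}")

# The tuned embed() output can then feed any downstream code-smell classifier.
```

In this kind of setup the triplet objective acts as a preprocessing step for the embeddings, so any classifier trained on the projected vectors afterwards sees tighter in-class clusters and larger between-class distances.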


Files in this item:


This item appears in the following Collection(s).
