dc.contributor.author | Abdellatif, Abdelrahman Taha Abdeltawab | |
dc.contributor.author | İslamoğlu, Ertuğrul | |
dc.contributor.author | Nizam, Ali | |
dc.date.accessioned | 2024-12-17T06:22:58Z | |
dc.date.available | 2024-12-17T06:22:58Z | |
dc.date.issued | 2024 | en_US |
dc.identifier.citation | ABDELLATİF, Abdelrahman Taha Abdeltawab. "Analysis of Code Similarity with Triplet Loss-Based Deep Learning System". Recent Trends and Advances in Artificial Intelligence, 1138 (2024): 351-361. | en_US |
dc.identifier.uri | https://hdl.handle.net/11352/5121 | |
dc.description.abstract | Nowadays, several plagiarism detection tools based on static code features are available for code similarity detection. The application of deep learning in this domain represents an emerging area of research. This research proposes an innovative deep learning system based on triplet loss for detecting code similarity. Our training approach involves generating embeddings for pairs of code snippets to increase the detection accuracy. The system uses a tokenization and embedding mechanism specifically tailored for Java code snippets using CodeBERT, a pre-trained model that combines programming language and natural language processing. After the learning phase, we employed transfer learning with a classifier to detect code similarity. The effectiveness of the proposed system is evaluated by a reduction in loss values and an improvement in accuracy compared to models without the integration of triplet loss. The results indicate that our model can identify code similarities and distinguish between snippets with high accuracy, improving the capability of code similarity detection, clone detection, and source code analysis. | en_US |
dc.language.iso | eng | en_US |
dc.publisher | Springer | en_US |
dc.relation.isversionof | 10.1007/978-3-031-70924-1_26 | en_US |
dc.rights | info:eu-repo/semantics/embargoedAccess | en_US |
dc.subject | Deep Learning | en_US |
dc.subject | Code Embedding | en_US |
dc.subject | Code Similarity Analysis | en_US |
dc.subject | Contrastive Learning | en_US |
dc.subject | Triplet Loss | en_US |
dc.title | Analysis of Code Similarity with Triplet Loss-Based Deep Learning System | en_US |
dc.type | article | en_US |
dc.relation.journal | Recent Trends and Advances in Artificial Intelligence | en_US |
dc.contributor.department | FSM Vakıf Üniversitesi, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü | en_US |
dc.identifier.volume | 1138 | en_US |
dc.identifier.startpage | 351 | en_US |
dc.identifier.endpage | 361 | en_US |
dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US |
dc.contributor.institutionauthor | Abdellatif, Abdelrahman Taha Abdeltawab | |
dc.contributor.institutionauthor | İslamoğlu, Ertuğrul | |
dc.contributor.institutionauthor | Nizam, Ali | |