Building a Text Collection for Urdu Information Retrieval

dc.contributor.author: Rasheed, Imran
dc.contributor.author: Banka, Haider
dc.contributor.author: Khan, Hamaid M.
dc.date.accessioned: 2021-08-06T08:10:52Z
dc.date.available: 2021-08-06T08:10:52Z
dc.date.issued: 2021
dc.department: FSM Vakıf Üniversitesi, Rektörlük, Alüminyum Test Eğitim ve Araştırma Merkezi (ALUTEAM)
dc.description.abstract: Urdu is a widely spoken language in the Indian subcontinent with over 300 million speakers worldwide. However, linguistic advancements in Urdu are rare compared to those in other European and Asian languages. Therefore, by following Text Retrieval Conference standards, we attempted to construct an extensive text collection of 85 304 documents from diverse categories covering over 52 topics with relevance judgment sets at 100 pool depth. We also present several applications to demonstrate the effectiveness of our collection. Although this collection is primarily intended for text retrieval, it can also be used for named entity recognition, text summarization, and other linguistic applications with suitable modifications. Ours is the most extensive existing collection for the Urdu language, and it will be freely available for future research and academic education.
dc.identifier.citation: RASHEED, Imran, Haider BANKA & Hamaid M. KHAN. "Building a Text Collection for Urdu Information Retrieval". ETRI Journal, 43.5 (2021): 856-868.
dc.identifier.doi: 10.4218/etrij.2019-0458
dc.identifier.endpage: 868
dc.identifier.issn: 1225-6463
dc.identifier.issn: 2233-7326
dc.identifier.issue: 5
dc.identifier.orcid: https://orcid.org/0000-0001-9550-1294
dc.identifier.scopus: 2-s2.0-85111102186
dc.identifier.scopusquality: Q2
dc.identifier.startpage: 856
dc.identifier.uri: https://onlinelibrary.wiley.com/doi/full/10.4218/etrij.2019-0458
dc.identifier.uri: https://hdl.handle.net/11352/3793
dc.identifier.volume: 43
dc.identifier.wos: WOS:000678808000001
dc.identifier.wosquality: Q4
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.institutionauthor: Khan, Hamaid M.
dc.language.iso: en
dc.publisher: Wiley Online Library
dc.relation.ispartof: ETRI Journal
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Assessors Agreement
dc.subject: Relevance Judgment
dc.subject: Text Collection Construction and Evaluation
dc.subject: Urdu Corpus
dc.subject: Urdu Information Retrieval
dc.title: Building a Text Collection for Urdu Information Retrieval
dc.type: Article

Files

Original bundle

Name: Rasheed.pdf
Size: 8.94 MB
Format: Adobe Portable Document Format
Description: Main article
