Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14365/824
Title: Description of Turkish Paraphrase Corpus Structure and Generation Method
Authors: Karaoglan, Bahar
Kisla, Tarik
Metin, Senem Kumova
Keywords: Turkish
Paraphrase
Corpus generation
Publisher: Springer International Publishing AG
Abstract: Because developing a corpus takes a long time and considerable human effort, it is desirable to make it as resourceful as possible: rich in coverage, flexible, multipurpose and expandable. Here we describe the steps we took in developing a Turkish paraphrase corpus, the factors we considered, the problems we faced and how we dealt with them. The corpus currently contains nearly 4000 sentences, with 60% paraphrase and 40% non-paraphrase sentence pairs. The sentence pairs are annotated on a five-level scale: paraphrase, encapsulating, encapsulated, non-paraphrase and opposite. The corpus is organized as a database integrated with a Turkish dictionary. The sources used so far are news texts from the Bilcon 2005 corpus, a set of professionally translated sentence pairs from the MSRP corpus, multiple Turkish translations from different languages included in the Tatoeba corpus, and user-generated paraphrases.
Description: 17th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing) -- APR 03-09, 2016 -- Mevlana Univ, Konya, TURKEY
URI: https://doi.org/10.1007/978-3-319-75477-2_13
https://hdl.handle.net/20.500.14365/824
ISBN: 978-3-319-75477-2
978-3-319-75476-5
ISSN: 0302-9743
1611-3349
Appears in Collections: Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection

Files in This Item:
File: 824.pdf (Until 2030-01-01)
Size: 293.44 kB
Format: Adobe PDF

Scopus citations: 2 (checked on Nov 20, 2024)
Web of Science citations: 1 (checked on Nov 20, 2024)
Page view(s): 64 (checked on Nov 18, 2024)
Download(s): 6 (checked on Nov 18, 2024)

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.