Description of Turkish Paraphrase Corpus Structure and Generation Method
| dc.contributor.author | Karaoglan, Bahar | |
| dc.contributor.author | Kisla, Tarik | |
| dc.contributor.author | Metin, Senem Kumova | |
| dc.date.accessioned | 2023-06-16T12:47:39Z | |
| dc.date.available | 2023-06-16T12:47:39Z | |
| dc.date.issued | 2018 | |
| dc.description | 17th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing) -- APR 03-09, 2016 -- Mevlana Univ, Konya, TURKEY | en_US |
| dc.description.abstract | Because developing a corpus requires a long time and a great deal of human effort, it is desirable to make it as resourceful as possible: rich in coverage, flexible, multipurpose and expandable. Here we describe the steps we took in developing a Turkish paraphrase corpus, the factors we considered, the problems we faced and how we dealt with them. Currently our corpus contains nearly 4000 sentences, with a ratio of 60% paraphrase to 40% non-paraphrase sentence pairs. The sentence pairs are annotated on a five-level scale: paraphrase, encapsulating, encapsulated, non-paraphrase and opposite. The corpus is organized as a database integrated with a Turkish dictionary. The sources used so far are news texts from the Bilcon 2005 corpus, a set of professionally translated sentence pairs from the MSRP corpus, multiple Turkish translations from different languages included in the Tatoeba corpus, and user-generated paraphrases. | en_US |
| dc.description.sponsorship | TUBITAK - The Scientific and Technological Research Council of Turkey [114E126]; Ege University Scientific Research Council [2015/BIL/034] | en_US |
| dc.description.sponsorship | This work was carried out under a grant from TUBITAK - The Scientific and Technological Research Council of Turkey, Project No: 114E126, "Using Certainty Factor Approach and Creating Paraphrase Corpus for Measuring Similarity of Short Turkish Texts", and under Ege University Scientific Research Council Project No 2015/BIL/034, "Developing a Paraphrase Corpus for Turkish Short Text Similarity Studies". | en_US |
| dc.identifier.doi | 10.1007/978-3-319-75477-2_13 | |
| dc.identifier.isbn | 978-3-319-75477-2 | |
| dc.identifier.isbn | 978-3-319-75476-5 | |
| dc.identifier.issn | 0302-9743 | |
| dc.identifier.issn | 1611-3349 | |
| dc.identifier.scopus | 2-s2.0-85044430201 | |
| dc.identifier.uri | https://doi.org/10.1007/978-3-319-75477-2_13 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.14365/824 | |
| dc.language.iso | en | en_US |
| dc.publisher | Springer International Publishing AG | en_US |
| dc.relation.ispartof | Computational Linguistics and Intelligent Text Processing (CICLing 2016), Pt I | en_US |
| dc.rights | info:eu-repo/semantics/closedAccess | en_US |
| dc.subject | Turkish | en_US |
| dc.subject | Paraphrase | en_US |
| dc.subject | Corpus generation | en_US |
| dc.title | Description of Turkish Paraphrase Corpus Structure and Generation Method | en_US |
| dc.type | Conference Object | en_US |
| dspace.entity.type | Publication | |
| gdc.author.id | KARAOGLAN, BAHAR/0000-0001-9338-7491 | |
| gdc.author.id | KISLA, TARIK/0000-0001-9007-7455 | |
| gdc.author.scopusid | 22334152300 | |
| gdc.author.scopusid | 24314851200 | |
| gdc.author.scopusid | 24471923700 | |
| gdc.bip.impulseclass | C5 | |
| gdc.bip.influenceclass | C5 | |
| gdc.bip.popularityclass | C5 | |
| gdc.coar.access | metadata only access | |
| gdc.coar.type | text::conference output | |
| gdc.collaboration.industrial | false | |
| gdc.description.department | İzmir Ekonomi Üniversitesi | en_US |
| gdc.description.departmenttemp | [Karaoglan, Bahar; Kisla, Tarik] Ege Univ, Izmir, Turkey; [Metin, Senem Kumova] Izmir Univ Econ, Izmir, Turkey | en_US |
| gdc.description.endpage | 217 | en_US |
| gdc.description.publicationcategory | Conference Item - International - Institutional Faculty Member | en_US |
| gdc.description.scopusquality | Q3 | |
| gdc.description.startpage | 208 | en_US |
| gdc.description.volume | 9623 | en_US |
| gdc.description.wosquality | N/A | |
| gdc.identifier.openalex | W2793324584 | |
| gdc.identifier.wos | WOS:000540380100013 | |
| gdc.index.type | WoS | |
| gdc.index.type | Scopus | |
| gdc.oaire.diamondjournal | false | |
| gdc.oaire.downloads | 13 | |
| gdc.oaire.impulse | 2.0 | |
| gdc.oaire.influence | 2.881022E-9 | |
| gdc.oaire.isgreen | true | |
| gdc.oaire.keywords | Turkish | |
| gdc.oaire.keywords | Corpus generation | |
| gdc.oaire.keywords | Paraphrase | |
| gdc.oaire.popularity | 1.894736E-9 | |
| gdc.oaire.publicfunded | false | |
| gdc.oaire.sciencefields | 05 social sciences | |
| gdc.oaire.sciencefields | 0202 electrical engineering, electronic engineering, information engineering | |
| gdc.oaire.sciencefields | 0501 psychology and cognitive sciences | |
| gdc.oaire.sciencefields | 02 engineering and technology | |
| gdc.oaire.views | 2 | |
| gdc.openalex.collaboration | National | |
| gdc.openalex.fwci | 0.8156 | |
| gdc.openalex.normalizedpercentile | 0.74 | |
| gdc.opencitations.count | 2 | |
| gdc.plumx.crossrefcites | 2 | |
| gdc.plumx.mendeley | 2 | |
| gdc.plumx.scopuscites | 3 | |
| gdc.scopus.citedcount | 3 | |
| gdc.virtual.author | Kumova Metin, Senem | |
| gdc.wos.citedcount | 1 | |
| relation.isAuthorOfPublication | 81d6fcea-c590-42aa-8443-7459c9eab7fa | |
| relation.isAuthorOfPublication.latestForDiscovery | 81d6fcea-c590-42aa-8443-7459c9eab7fa | |
| relation.isOrgUnitOfPublication | 805c60d5-b806-4645-8214-dd40524c388f | |
| relation.isOrgUnitOfPublication | 26a7372c-1a5e-42d9-90b6-a3f7d14cad44 | |
| relation.isOrgUnitOfPublication | e9e77e3e-bc94-40a7-9b24-b807b2cd0319 | |
| relation.isOrgUnitOfPublication.latestForDiscovery | 805c60d5-b806-4645-8214-dd40524c388f |
