Browsing by Author "Kisla, Tarik"
Article
Certainty Factor Model in Paraphrase Detection (Pamukkale Univ, 2021)
Metin, Senem Kumova; Karaoglan, Bahar; Kisla, Tarik; Soleymanzadeh, Katira
In this paper, we address the problem of uncertainty management in the identification of paraphrase sentence pairs. Paraphrase sentences are simply sets/pairs of sentences that express the same facts and/or opinions using different words or a different word order. We propose the use of the certainty factor (CF) model in paraphrase detection. A set of successful paraphrase detection features (generic and distance-based features) is built by filtering, and this set is used as evidence in the CF model. The CF model is evaluated by F1 and accuracy measures on the Microsoft Research Paraphrase corpus. The results are compared to the well-known Bayesian reasoning. The experimental results showed that the CF model is an alternative paraphrase detection method to the Bayes model.

Conference Object. Citation - WoS: 1; Citation - Scopus: 3
Description of Turkish Paraphrase Corpus Structure and Generation Method (Springer International Publishing Ag, 2018)
Karaoglan, Bahar; Kisla, Tarik; Metin, Senem Kumova
Because developing a corpus requires a long time and a lot of human effort, it is desirable to make it as resourceful as possible: rich in coverage, flexible, multipurpose and expandable. Here we describe the steps we took in the development of a Turkish paraphrase corpus, the factors we considered, the problems we faced and how we dealt with them. Currently our corpus contains nearly 4000 sentences with a ratio of 60% paraphrase and 40% non-paraphrase sentence pairs. The sentence pairs are annotated on a 5-point scale: paraphrase, encapsulating, encapsulated, non-paraphrase and opposite. The corpus is formulated in a database structure integrated with a Turkish dictionary.
The sources we have used so far are news texts from the Bilcon 2005 corpus, a set of professionally translated sentence pairs from the MSRP corpus, multiple Turkish translations from different languages included in the Tatoeba corpus, and user-generated paraphrases.

Conference Object
Examining the Views of Engineering Students About Massive Open Online Courses (IEEE, 2021)
Kisla, Tarik; Metin, Senem Kumova; Karaoglan, Bahar; Demir, Elif Kubra
Massive Open Online Courses (MOOCs), today one of the most important tools of open and distance learning, emerged as a continuation of the Open Educational Resources approach. The Open Educational Resources (OER) approach was first introduced in 1999 and gained great momentum in 2002 with the help of the Massachusetts Institute of Technology (MIT). A MOOC is defined as a scalable, free online course with open access that supports learning in different fields. Thanks to these platforms, learner-learner and learner-instructor interaction is established and a flexible learning environment is provided. MOOCs, which first appeared in 2008, have grown in popularity through platforms that include hundreds of courses, such as edX, Coursera, Udemy and Udacity, with the support of many important universities in the world such as MIT and Harvard. In this study, the views of engineering students about MOOCs were collected with a questionnaire developed by the researchers as a data collection tool.
A total of 307 students studying Electrical and Electronics Engineering, Civil Engineering, Industrial Engineering, Aerospace Engineering, Mechanical Engineering, and Food Engineering participated in the study.

Conference Object. Citation - WoS: 2; Citation - Scopus: 3
Extracting the Features of Similarity in Short Texts (IEEE, 2015)
Kisla, Tarik; Metin, Senem Kumova; Karaoglan, Bahar
Automatic identification of text similarity has found applications in information retrieval, text summarization, assessment of machine translation, assessment of question answering, word sense disambiguation and many more. In this work, we report the results of a discriminant analysis applied to determine the cumulative effect of the attributes used in the literature so far (ratio of common words, text lengths, common word sequences, synonyms, hypernyms, hyponyms) in detecting word similarity.

Conference Object
A Proposal for Corpus Normalization (IEEE, 2013)
Karaoglan, Bahar; Kisla, Tarik; Dincer, Bekir Taner; Metin, Senem Kumova
In order to compare work done in natural language processing, the corpora involved in different studies should be standardized/normalized. Entropy, used as a language model performance metric, depends entirely on signal information. When language is considered, however, semantic information should also be taken into account. Here we propose a metric that exploits Zipf's and Heaps' power laws to represent semantic information in terms of signal information and estimates the amount of information anticipated from a corpus of a given length in words. The proposed metric is tested on sub-corpora of 20 different lengths drawn from a major Turkish corpus (METU). While the entropy changed depending on the length of the corpus, the value of our proposed metric stayed almost constant, which supports our claim about normalizing the corpus.
