KLASIFIKACIJE TEKSTA NA PRIMENE U OBRADI PRIRODNIH JEZIKA

dc.contributor.advisor Kartelj, Aleksandar
dc.contributor.author Šandrih, Branislava
dc.date.accessioned 2021-01-12T17:02:52Z
dc.date.available 2021-01-12T17:02:52Z
dc.date.issued 2020
dc.identifier.uri http://hdl.handle.net/123456789/5090
dc.description.abstract The main goal of this dissertation is to put different text classification tasks in the same frame by mapping the input data into a common vector space of linguistic attributes. Subsequently, several classification problems of great importance for natural language processing are solved by applying appropriate classification algorithms. The dissertation deals with the problem of validating bilingual translation pairs, where the final goal is to construct a classifier that provides a substitute for human evaluation and decides, by applying a variety of linguistic information and methods, whether a pair is a proper translation between the given languages. In dictionaries, it is useful to have a sentence that demonstrates the use of a particular dictionary entry. This task is called the classification of good dictionary examples. In this thesis, a method is developed that automatically estimates whether an example is good or bad for a specific dictionary entry. Two cases of short message classification are also discussed in this dissertation. In the first case, the classes are the authors of the messages, and the task is to assign each message to its author from that fixed set. This task is called authorship identification. The other observed classification of short messages is called opinion mining, or sentiment analysis. Starting from the assumption that a short message carries a positive or negative attitude about a thing, or is purely informative, the classes can be positive, negative and neutral. These tasks are of great importance in the field of natural language processing, and the proposed solutions are language-independent, based on machine learning methods: support vector machines, decision trees and gradient boosting. For all of these tasks, the effectiveness of the proposed methods is demonstrated on the Serbian language. en_US
dc.description.provenance Submitted by Slavisha Milisavljevic (slavisha) on 2021-01-12T17:02:52Z No. of bitstreams: 1 BranislavaSandrihPhd_konacno.pdf: 9053055 bytes, checksum: 7a0b7a1be004f8cd531163a78a62d30c (MD5) en
dc.description.provenance Made available in DSpace on 2021-01-12T17:02:52Z (GMT). No. of bitstreams: 1 BranislavaSandrihPhd_konacno.pdf: 9053055 bytes, checksum: 7a0b7a1be004f8cd531163a78a62d30c (MD5) Previous issue date: 2020 en
dc.language.iso sr en_US
dc.publisher Beograd en_US
dc.title KLASIFIKACIJE TEKSTA NA PRIMENE U OBRADI PRIRODNIH JEZIKA en_US
mf.author.birth-date 1991-08-19
mf.author.birth-place Pančevo en_US
mf.author.birth-country Serbia en_US
mf.author.residence-state Serbia en_US
mf.author.citizenship Serbian en_US
mf.author.nationality Serbian en_US
mf.subject.area Computer Science en_US
mf.subject.keywords natural language processing, machine learning, computational linguistics, text classification, terminology extraction, authorship identification, sentiment classification, classification of good dictionary examples en_US
mf.subject.subarea Natural Language Processing en_US
mf.contributor.committee Pavlović-Lažetić, Gordana
mf.contributor.committee Filipović, Vladimir
mf.contributor.committee Krstev, Cvetana
mf.contributor.committee Mitkov, Ruslan
mf.university.faculty Faculty of Mathematics en_US
mf.document.pages 145 en_US
mf.document.location Belgrade en_US
mf.document.genealogy-project No en_US
mf.university University of Belgrade en_US

Files in this item

File Size Format
BranislavaSandrihPhd_konacno.pdf 9.053Mb PDF