Computer Science
Recent Submissions
-
Šošić, Milena (Beograd, 2025)
Abstract: Conversational text messages represent an important form of digital communication in modern society. With the development of information technologies, various communication tools have emerged, such as email, social media, instant messaging tools, and automated response systems. Messages generated within these tools, unlike standard texts, have a specific structure that allows for the classification of individual messages or sets of messages that form a conversation. Classification labels are defined by the specific task being addressed and can be either single-label or multi-label, which enables the recognition of complex interrelationships between the categories. Introducing moral and emotional dimensions of language into research is crucial for understanding the complex patterns of human communication, particularly in the context of digital platforms and social media. Machine learning (ML) methods, such as deep neural networks (DNN), facilitate the utilization and more precise recognition of these aspects while simultaneously providing an efficient way to classify emotions and moral values expressed in texts. The noticeable complexity in the expression of human emotions and moral values, which are often conveyed implicitly and depend heavily on context, makes their recognition particularly challenging. One of the major challenges is the lack of or limited availability of resources in terms of size and diversity for low-resource languages, including Serbian. The development of linguistic resources, such as annotated lexicons and corpora, plays a crucial role in this process by providing the necessary knowledge sources for building and improving existing ML models. Linguistic resources enable models to learn how different emotional expressions and moral values influence the tone and meaning of communication. 
To support this, a semantic lexicon for sentiment intensity, SentiWords.SR, containing approximately 15k words, was developed for the Serbian language, along with the associated tool SRPOL for measuring sentiment intensity in textual sequences in Serbian. Additionally, a semantic lexicon for emotional affect, EmoLex.SR, comprising around 9.8k words with assigned emotional intensity values, and a semantic lexicon for moral values, MFD.SR, consisting of approximately 4.3k words with associated moral value weights, were developed. Significant efforts were also made in annotating the first conversational corpora from social media with emotional and moral categories. In this regard, the Social-Emo.SR corpus (∼34.6k messages) was developed, consisting of the Twitter-Emo.SR subcorpus (∼16.7k messages) and the Reddit-Emo.SR subcorpus (∼17.9k messages), collected from Twitter and Reddit, respectively. Furthermore, by searching for key moral-related terms, a subset of messages expressing potential moral stances was extracted from Social-Emo.SR. This subset, named Social-Mor.SR (∼13.6k messages), was manually verified and annotated, and consists of the Twitter-Mor.SR subcorpus (∼6.1k Twitter messages) and the Reddit-Mor.SR subcorpus (∼7.5k Reddit messages). In the context of DNN architectures, models based on recurrent networks or transformers, trained on these resources, enable the recognition and utilization of emotional and moral aspects of language in various contexts. The combination of advanced algorithms, such as Bidirectional Long Short-Term Memory (BiLSTM) networks and the attention mechanism, with linguistically and culturally adapted resources (Meta) opens new possibilities for analyzing moral and emotional aspects of language. This has broad applications in classification tasks such as recognizing personal context, truthfulness of posts, or types of engagement in digital communication.
For personal context recognition, i.e. classifying corporate emails as either business-related or personal, results show that using a carefully designed hybrid approach (BiLSTM-Att+Meta) across entire conversation branches yields the best results, comparable to published benchmarks on the same task. In experiments related to rumor veracity classification and identifying engagement types in response to rumors, it was demonstrated that moral and emotional attributes derived from semantic lexicons (EmoAttr, MorAttr ⊆ Meta) improve classification accuracy by 4.2% and 3.8%, respectively, compared to methods without these attributes. For emotion recognition in Serbian conversational texts, experiments revealed that transformer-based models fine-tuned on the task achieved F1-scores of approximately 53%, reaching performance levels reported for multi-label classification on the same emotional category set. Additionally, experiments showed that further data preprocessing and balancing improved model performance. In moral value and moral sentiment classification tasks, using the Social-Mor.SR corpus and its subcorpora, an F1-score of ∼46% was achieved for moral value recognition and ∼38% for moral sentiment recognition, indicating acceptable results but also the need for further model optimization. Fine-tuning LLaMA models yielded reasonable but slightly lower performance compared to BERT-based architectures. Since model performance depends directly on the training data, there is potential for further improvements by refining and balancing the initial annotations in the utilized corpora. URI: http://hdl.handle.net/123456789/5774 Files in this item: 1
Doktorski_rad_Milena_Sosic.pdf ( 6.206Mb ) -
Srdanović, Vladimir (University of Belgrade, 1987)
Abstract: The dissertation addresses the elements of medical decision-making, modeled by a consultative expert system, that are characteristic of the domain of rheumatology and potentially of other domains of medicine with a similar structure. URI: http://hdl.handle.net/123456789/5764 Files in this item: 1
Konsultativni ekspertni sistem.pdf ( 7.218Mb ) -
Mrkela, Lazar (Beograd, 2024)
Abstract: This dissertation examines two discrete location problems and their bi-objective variants. The first problem under consideration is the maximal covering location problem with user preferences and budget constraints imposed on facility opening. This variant of the maximal covering problem has not been previously studied in the literature. Unlike the classical maximal covering problem, the variant proposed in this dissertation includes user preferences for locations, where each user is assigned to the most preferred location among those with an open facility. Additionally, different locations have different costs for establishing facilities, and the available budget for opening facilities is limited. This problem is solved using the Variable Neighborhood Search (VNS) method, and the results are compared with those obtained by an exact solver on modified instances from the literature. Furthermore, an existing variant of the maximal covering problem is also addressed, which imposes a limit on the number of opened facilities instead of limiting the budget for opening facilities. The second problem examined is regenerator placement in optical networks. In optical networks, signal quality degrades with distance, necessitating the placement of costly devices to restore the signal. This dissertation studies an existing model where the set of possible regenerator locations and the set of user nodes are different, defining the problem as generalized. The generalized regenerator placement problem in optical networks is also solved using the Variable Neighborhood Search method, with results compared to the best available solutions from the literature. Bi-objective variants of these problems are defined as well. For the maximal covering location problem, user preferences are included as weighted factors in the total covered demand, forming the first objective function. 
The second objective function represents the number of uncovered users and aims to ensure fairness in the model. In the regenerator placement problem for optical networks, it is assumed that, due to budget constraints, uninterrupted communication between all pairs of user nodes may not be feasible. Each pair is assigned a weight, and the sum of the weights of connected pairs constitutes the first objective function, while the second objective function represents the cost of placing regenerators. These bi-objective variants are solved using an adapted multi-objective version of the Variable Neighborhood Search method, and the results are compared with general evolutionary algorithms. URI: http://hdl.handle.net/123456789/5750 Files in this item: 1
lazar_mrkela_doktorska_disertacija.pdf ( 17.56Mb ) -
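Illustrative note on the Mrkela abstract above: the single-objective covering problem can be read as evaluating a set of opened locations against a budget and the users' preference rankings. The Python sketch below shows only that evaluation step, which a search method such as VNS would call repeatedly. It is not taken from the dissertation. The function name, the data layout, and in particular the rule that a user's demand counts only when the most preferred open location also covers that user are assumptions made for illustration.

```python
def covered_demand(open_sites, cost, budget, demand, prefs, covers):
    """Evaluate one candidate solution (assumed model, not the author's code).

    Returns the total covered demand, or None if the opening costs
    exceed the budget.
    """
    if sum(cost[s] for s in open_sites) > budget:
        return None  # infeasible: budget constraint violated
    total = 0
    for user, ranking in prefs.items():
        # The user is assigned to the most preferred site among the open ones.
        chosen = next((s for s in ranking if s in open_sites), None)
        # Assumption: demand is covered only if that site covers the user.
        if chosen is not None and chosen in covers[user]:
            total += demand[user]
    return total


# Hypothetical toy instance: three candidate sites, three users.
cost = {"A": 4, "B": 3, "C": 2}
demand = {"u1": 10, "u2": 5, "u3": 8}
prefs = {"u1": ["A", "B", "C"], "u2": ["C", "A", "B"], "u3": ["B", "C", "A"]}
covers = {"u1": {"A", "B"}, "u2": {"A"}, "u3": {"B", "C"}}

print(covered_demand({"A", "C"}, cost, 6, demand, prefs, covers))  # 18
print(covered_demand({"A"}, cost, 6, demand, prefs, covers))       # 15
print(covered_demand({"A", "B"}, cost, 6, demand, prefs, covers))  # None
```

In the toy instance, opening A and C loses the demand of u2: that user prefers C, which does not cover them, even though A is open and would. This is the kind of effect that separates a preference-aware model from the classical maximal covering problem.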
Protić, Danijela (Beograd, 2023)
Abstract: Anomaly detection is the recognition of suspicious computer network behavior by comparing unknown network traffic to a statistical model of normal network behavior. Binary classifiers based on supervised machine learning are good candidates for normality detection. This thesis presents five standard binary classifiers: k-nearest neighbors, weighted k-nearest neighbors, decision trees, support vector machines, and feedforward neural networks. The main problem with supervised learning is that it takes a lot of data to train high-precision classifiers. To reduce the training time with minimal degradation of model accuracy, a two-phase pre-processing step is performed. In the first phase, numeric attributes are selected to reduce the dataset. The second phase is a novel normalization method based on the hyperbolic tangent function and the damping strategy of the Levenberg-Marquardt algorithm. The Kyoto 2006+ dataset, the only publicly available dataset of real-world network traffic intended solely for anomaly detection research in computer networks, was used to demonstrate the positive impact of such pre-processing on classifier training time and accuracy. Of all the selected classifiers, the feedforward neural network has the highest processing speed, while the weighted k-nearest neighbor model proved to be the most accurate. The assumption is that when the classifiers work concurrently, they should all detect either an anomaly or normal network traffic; occasionally this is not the case, and the classifiers reach different decisions about the anomaly, i.e. a conflict arises. The conflicting decision detector performs a logical exclusive OR (XOR) operation on the outputs of the classifiers. If both classifiers simultaneously detect an anomaly or both recognize the traffic as normal, no conflict has occurred. Otherwise, a conflict is detected. 
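Illustrative note: the XOR-based conflict detection described above can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not the thesis implementation: the function name, the 0/1 encoding (1 = anomaly, 0 = normal traffic), and the sample outputs are hypothetical.

```python
from typing import Sequence


def detect_conflicts(pred_a: Sequence[int], pred_b: Sequence[int]) -> list[int]:
    """Return 1 where two binary classifiers disagree (XOR), 0 where they agree."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction sequences must have equal length")
    # XOR of two 0/1 labels is 1 exactly when the labels differ.
    return [a ^ b for a, b in zip(pred_a, pred_b)]


# Hypothetical per-record outputs of two classifiers (1 = anomaly, 0 = normal).
classifier_a = [0, 1, 1, 0, 1]
classifier_b = [0, 1, 0, 0, 0]

conflicts = detect_conflicts(classifier_a, classifier_b)
print(conflicts)       # [0, 0, 1, 0, 1]
print(sum(conflicts))  # 2 conflicting decisions
```
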
The number of conflicts detected provides an opportunity for additional detection of changes in computer network behavior. URI: http://hdl.handle.net/123456789/5599 Files in this item: 1
Danijela Protic - Doktorska Disertacija.pdf ( 3.143Mb ) -
Radosavljević, Jovan (Beograd, 2023)
Abstract: A graph G = (V, E) is an ordered pair consisting of a set of nodes V and a set of edges E. The order of the graph G is the number of nodes |V|, and its size is the number of edges |E|. Nodes u, v ∈ V are adjacent if there is an edge uv ∈ E between them. The distance dist(u, v) between nodes u and v in G is the length of the shortest path from u to v. The diameter of the graph G is the largest distance dist(u, v) between two nodes u, v. The dissertation studies graphs of diameter 2. Intuitively, graphs of diameter 2 appear to be simple structures; however, it is known that asymptotically almost all graphs have diameter 2. That is why a narrower class is of interest: the class D2C of critical graphs of diameter 2, i.e. graphs in which the removal of any edge increases the diameter. In addition, the narrower class of primitive D2C (PD2C) graphs is considered, i.e. D2C graphs that do not have two nodes with the same set of neighbors. The introductory Chapter 2 presents the basic concepts, algorithms, and statements used in the dissertation. The following chapters present original results on graphs of diameter 2. Chapter 3 describes the procedure for obtaining a list of D2C graphs of order up to 13. With built-in parallelization, creating this list took a month. This was a step forward, because previously there was a list of all graphs of diameter 2 of order up to 10. The obtained results were used to test several known hypotheses about graphs of diameter 2. Chapter 4 shows that, for every m ⩾ 3, a D2C graph containing a clique of size m must have at least 2m nodes. Moreover, up to isomorphism, there is exactly one such graph with 2m nodes that contains a clique of size m. Chapter 5 discusses PD2C graphs with the smallest number of edges. From the list of all PD2C graphs of order n ⩽ 13, the PD2C graphs of size at most 2n − 4 are selected. 
Only three of the selected graphs are of size 2n − 5, which is in accordance with the Erdős-Rényi theorem on the lower bound for the size of graphs of diameter 2 that do not contain a node adjacent to all other nodes (that bound is 2n − 5). The PD2C graphs of size 2n − 4 and order up to 13 are sorted into three groups:
• The first group belongs to the Z family, defined in the dissertation, which for each n ⩾ 6 contains exactly one PD2C graph of order n and size 2n − 4.
• The second group consists of seven Hamiltonian PD2C graphs of order at most 9 and size 2n − 4. The dissertation proves that there is no such Hamiltonian graph of order greater than 11, i.e. that the seven graphs found are the only Hamiltonian PD2C graphs of size 2n − 4.
• The third group consists of a unique graph that does not belong to either of the first two groups.
Based on these results, the hypothesis was formulated that all PD2C graphs of order n ⩾ 10 and size 2n − 4 belong to the family Z.
Keywords: graphs, critical graphs of diameter 2, primitive graphs
Scientific field: Computing and informatics
Narrower scientific field: Graph theory
UDC number: 004.415.5(519.1
URI: http://hdl.handle.net/123456789/5594 Files in this item: 1
disertacijaJovanRadosavljevic.pdf ( 746.0Kb )
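Illustrative note on the Radosavljević abstract above: the definitions of diameter, D2C, and primitive graphs translate directly into a brute-force check. The Python sketch below is an assumption-laden illustration, not the enumeration procedure from the dissertation. The function names are hypothetical, nodes are assumed to be numbered 0..n-1, and a disconnected graph is treated as having infinite diameter.

```python
from collections import deque


def _adjacency(n, edges):
    """Neighbor sets for an undirected graph on nodes 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def diameter(n, edges):
    """Largest shortest-path distance, via BFS from every node (inf if disconnected)."""
    adj = _adjacency(n, edges)
    best = 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        if len(dist) < n:
            return float("inf")
        best = max(best, max(dist.values()))
    return best


def is_d2c(n, edges):
    """Diameter is 2 and removing any single edge increases it."""
    edges = list(edges)
    if diameter(n, edges) != 2:
        return False
    return all(diameter(n, edges[:i] + edges[i + 1:]) > 2 for i in range(len(edges)))


def is_primitive(n, edges):
    """No two nodes share the same set of neighbors."""
    adj = _adjacency(n, edges)
    return len({frozenset(adj[v]) for v in range(n)}) == n


cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]  # 5-cycle
star4 = [(0, 1), (0, 2), (0, 3)]                   # star with three leaves

print(is_d2c(5, cycle5), is_primitive(5, cycle5))  # True True
print(is_d2c(4, star4), is_primitive(4, star4))    # True False
```

The 5-cycle is a PD2C graph under these definitions, while the star is D2C but not primitive because its leaves all have the same neighbor set. Checking every edge removal with a full BFS is only practical for small graphs.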