Browsing Computer Science by Issue Date

Now showing items 1-20 of 45

OPERATORI SELEKCIJE I MIGRACIJE I WEB SERVISI KOD PARALELNIH EVOLUTIVNIH ALGORITAMA

Filipović, Vladimir (Beograd , 2006)

[more]

[less]

URI: http://hdl.handle.net/123456789/2478

Files in this item: 1

phdFilipovicVladimir.pdf ( 2.145Mb )

Generalizacija fenomenološkog Hilovog modela u cilju izučavanja zamora mišića

Stojanović, Boban (Kragujevac, Serbia , 2007)

[more]

Abstract:

Muscles are organs whose primary function is to produce force and motion. There are three types of muscles: skeletal (striated), smooth and hart muscles. Skeletal muscles are attached to bones and can move them voluntarily. There are many daily activities which occur over an extended period of time and during which performances of muscles can be reduced (reduction of maximal force, contraction speed, movement control, etc). Although numerous mathematical models of muscles have been developed, there are only few models which take into account muscle fatigue. Most of the existing muscle fatigue models consider muscle fatigue under specific conditions only. Motivated by the fact that the existing muscle fatigue models are very limited under arbitrary conditions of activation and loading, we here present a new model including muscle fatigue. The proposed model is based on Hill’s phenomenological model consisting of contractile, serial and parallel elastic elements, but now using a fatigue curve under maximal activation and recovery curve as input parameters, in order to predict muscle response under arbitrary loading conditions. Furthermore, an extension of Hill’s model is introduced, in order to take into account different fiber types. Various types of muscle fibers can have very different physiological and mechanical properties, significantly affecting their resistance to fatigue. The developed models are incorporated into the finite element software PAK. The proposed models are verified by comparing the calculated results with experimental measurements and data from literature. By computer modeling of human biceps and triceps muscles, as well as the frog gastrocnemius muscle, it is shown that the models can predict behavior of real muscles with satisfactory precision. Besides application to single muscles, the proposed models can be used for computer simulations of complex musculoskeletal systems. In order to provide efficient modeling of muscles and musculoskeletal systems, a software for automatic muscle generation using medical images has been developed, as well as a module for result post-processing by employing various types of graphs. The proposed models and the developed software can be used as a very powerful tool in designing medical and sport equipment, planning trainings and analyzing exercises. Computer simulations based on the muscle mechanical models can prevent work injuries and significantly reduce costs for individuals and society.

URI:

http://hdl.handle.net/123456789/1843

Files in this item: 1

Boban Stojanovic - Doktorska disertacija.pdf ( 12.75Mb )

Glatka čestična hidrodinamika - paralelizacija algoritama i primena u dinamici fluida

Ivanović, Miloš (Kragujevac, Srbija , 2010)

[more]

[less]

URI: http://hdl.handle.net/123456789/1838

Files in this item: 1

mivanovic-disertacija-pun-tekst.pdf ( 11.57Mb )

AUTOMATSKO GENERISANJE I PROVERAVANJE USLOVA ISPRAVNOSTI PROGRAMA

Vujošević Janičić, Milena (Beograd , 2013)

[more]

Abstract:

LAV is a system for statically verifying program assertions and locating bugs such as buffer overflows, pointer errors and division by zero. LAV is primarily aimed at analyzing programs written in the programming language C. Since LAV uses the popular LLVM intermediate code representation, it can also analyze programs written in other procedural languages. Also, the proposed approach can be used with any other similar intermediate low level code representation. System combines symbolic execution, SAT encoding of program’s control-flow, and elements of bounded model checking. LAV represents the program meaning using first-order logic (FOL) formulas and generates final verification conditions as FOL formulas. Each block of the code (blocks have no internal branchings and no loops) is represented by a FOL formula obtained through symbolic execution. Symbolic execution, however, is not performed between different blocks. Instead, relationships between blocks are modeled by propositional variables encoding transitions between blocks. LAV constructs formulas that encode block semantics once for each block. Then, it combines these formulas with propositional formulas encoding the transitions between the blocks. The resulting compound FOL formulas describe correctness and incorrectness conditions of individual instructions. These formulas are checked by an SMT solver which covers suitable combination of theories. Theories that can be used for modeling correctness conditions are: theory of linear arithmetic, theory of bit-vectors, theory of uninterpreted functions, and theory of arrays. Based on the results obtained from the solver, the analyzed command may be given the status safe (the command does not lead to an error), flawed (the command always leads to an error), unsafe (the command may lead to an error) or unreachable (the command will never be executed). If a command cannot be proved to be safe, LAV translates a potential counterexample from the solver into a program trace that exhibits this error. It also extracts the values of relevant program variables along this trace. The proposed system is implemented in the programming language Ñ++, as a publicly available and open source tool named LAV. LAV has support for several SMT solvers (Boolector, MathSAT, Yices, and Z3). Experimental evaluation on a corpus of C programs, which are designed to demonstrate areas of strengths and weaknesses of different verification techniques, suggests that LAV is competitive with related tools. Also, experimental results show a big advantage of the proposed system compared to symbolic execution applied to programs containing a big number of possible execution paths. The proposed approach allows determining status of commands in programs which are beyond the scope of analysis that can be done by symbolic execution tools. LAV is successfully applied in educational context where it was used for finding bugs in programs written by students at introductory programming course. This application showed that in these programs there is a large number of bugs that a verification tool can efficiently find. Experimental results on a corpus of students’ programs showed that LAV can find bugs that cannot be found by commonly used automated testing techniques. Also, it is shown that LAV can improve evaluation of students’s assignments: (i) by providing useful and helpful feedback to students, which is important in the learning process, and (ii) by improving automated grading process, which is especially important to teachers.

URI:

http://hdl.handle.net/123456789/4231

Files in this item: 1

phdVujosevicJanicicMilena.pdf ( 1.748Mb )

Usmeravanje pretrage u automatskom dokazivanju teorema

Nikolić, Mladen (Belgrade , 2013)

[more]

Abstract:

In this thesis the problem of guiding search in automated theorem proving is considered. The thesis consists of two parts that have the CDCL search system, the system intensively used by modern SAT solvers, as their common topic. In the rst part of the thesis a simple approach to guiding search is considered | guiding by the selection of the solver, its heuristics, and their parameters, based on the properties of an instance to be solved. The basis of the proposed methods for algorithm selection is syntactical similarity of formulae which is re ected in their graph structure. This graph similarity is established and analyzed by using an original graph similarity measure (which turned out to be useful in other contexts, too). Yet, practical approaches to measuring similarity of formulae are based on their numerical features due to the computational complexity issues. Two simple methods for algorithm selection, based on k nearest neighbors, were proposed. The rst technique, ArgoSmArT is based on classi cation of instance in one of the prede ned families for which the e cient algorithms are known. The instance is solved by algorithm corresponding to the family to which the instance was classi ed. The second technique, ArgoSmArT k-NN is based on nding several similar instances in the training set for which the solving times by all considered algorithms are known. The instance is solved by the algorithm that behaves the best on those instances. ArgoSmArT technique is better suited for con guration selection of a SAT solver, and ArgoSmArT k-NN for SAT solver selection. ArgoSmArT k-NN technique showed to be more e cient than the most important and very complex system for SAT solver selection | SATzilla system. Apart from CNF SAT solver selection, the problem of non-CNF SAT solver selection is considered. The focus was not on solver selection techniques, since the proposed techniques are directly applicable, but on the attributes that can be used to describe non-CNF SAT instances, which have not been proposed earlier. The results in this domain are positive, but still limited. The main reason for that is the lack of greater number of non-CNF SAT solver of di erent behaviour, which is not surprising, having in mind that this kind of solvers is in its early stage of development. Apart from construction of e cient SAT solver selection system, the methodology of SAT solver comparison, based on statistical hypothesis testing is proposed. The need for such a methodology comes from great run time variations of single instance solving by a solver, which can result in di erent SAT solver orderings when one tries to compare their performance or rank them, as experimentally demonstrated. The proposed methodology gives the estimate of statistical signi cance of the performed test and the estimate of the e ect size, for instance the probability of a solver being faster than another. The second part of the thesis is concerned with generalizing CDCL search system to fragments of rst order logic. The proposed system can be used as a basis for e cient proving in some fragment if the rules of resolution and factoring are speci ed for that fragment. These rules are de ned for an extension of coherent logic. The soundness and completeness of the system are proved. The system has several distinguishing features which are a consequence of previously performed analysis of challenges in coherent logic theorem proving. The system enables rst order reasoning, instead of ground one characteristic for all existing coherent logic provers. Moreover, it introduces backjumps and lemma learning. The special attention in system design was paid to the possibility of generating readable proofs by the prover implementing the system. This possibility is one of the greatest qualities of coherent logic, but it is not easy to achieve if CDCL search system is used. One of the properties of the system that came from the need for generation of readable proofs is preservation of quanti ers in proving process which is rather unusual for existing CDCL systems. Another advantage of the proposed CDCL system is the possibility of transfer of heuristics which are already successfully employed in SAT solving to other domains. Based on the proposed system, the proof procedure Calypso for extended coherent logic was de ned which can also be used in standard coherent logic. The extension of Rete algorithm which enables detection of con icts and literals to be propagated or decided is proposed. Procedure Claypso is implemented in C++. It was evaluated on a representative coherent logic problems and it showed superior to other coherent logic provers and also the prover Vampire, the most e cient prover for rst order logic. Based on the results presented in this thesis, it can be concluded that the main hypothesis of this work, that the search system used in CDCL SAT solvers can be signi cantly improved by simple guiding and that it can be successfully formulated for fragments of rst order logic such as coherent logic, was con rmed and that the concrete answers on how to do that were provided.

URI:

http://hdl.handle.net/123456789/2584

Files in this item: 1

nikolic_mladen.pdf ( 1.448Mb )

MODIFIKACIJE METODE PROMENLJIVIH OKOLINA I NJIHOVE PRIMENE ZA REŠAVANJE PROBLEMA RASPOREĐIVANJA PRENOSA DATOTEKA

Dražić, Zorica (Beograd , 2014)

[more]

Abstract:

The Variable neighborhood search method proved to be very successful for solving discrete and continuous optimization problems. The basic idea is a systematic change of neighborhood structures in search for the better solution. For optimiza- tion of multiple variable functions, methods for obtaining the local minimum starting from certain initial point are used. In case when the continuous function has many local minima, nding the global minimum is usually not an easy task since the obta- ined local minima in most cases are not optimal. In typical implementations with bounded neighborhoods of various diameters it is not possible, from arbitrary point, to reach all points in solution space. Consequently, the strategy of using the nite number of neighborhoods is suitable for problems with solutions belonging to some known bounded subset of IRn. In order to overcome the previously mentioned limitation the new variant of the method is proposed, Gaussian Variable neighborhood search method. Instead of de ning the sequence of di erent neighborhoods from which the random point will be chosen, all neighborhoods coincide with the whole solution space, but with di e- rent probability distributions of Gaussian type. With this approach, from arbitrary point another more distant point is theoretically reachable, although with smaller probability. In basic version of Variable neighborhood search method one must de ne in advance the neighborhood structure system, their number and size, as well as the type of random distribution to be used for obtaining the random point from it. Gaussian Variable neighborhood search method has less parameters since all the neighborhoods are theoretically the same (equal to the solution space), and uses only one distribution family - Gaussian multivariate distribution with variable dispersion. File transfer scheduling problem (FTSP) is an optimization problem widely appli- cable to many areas such as Wide Area computer Networks (WAN), Local Area Ne- v tworks (LAN), telecommunications, multiprocessor scheduling in a MIMD machines, task assignments in companies, etc. As it belongs to the NP-hard class of problems, heuristic methods are usually used for solving this kind of problems. The problem is to minimize the overall time needed to transfer all les to their destinations for a given collection of various sized les in a computer network, i.e. to nd the le transfer schedule with minimal length. In order to obtain the exact solution of the FTS problem, integer linear pro- gramming formulations are proposed and their correctness is proved. In this way optimal solutions can be found for small and medium size test instances. For large test instances, the Variable neighborhood search method is proposed using the "permutation" representation and typical neighborhood structures. More- over, the same method is used for obtaining the upper bounds of the solutions which are used in proposed integer linear programming formulations. For obtaining be- tter solutions in the small neighborhood of the current solution, three di erent local search procedures are implemented: 2-swap, 2-swap adjacent and variable neighbo- rhood descent. In order to apply the continuous optimization methods for solving FTSP, the weighted solution representation is developed. Such representation enables the co- ntinuous optimization methods to be used, which do not require the di erentiability of objective function. Since Gauss Variable neighborhood search method proved to be successful in continuous optimization problems, it was applied to FTSP. Pre- viously described local search procedures can also be used with weighted solution representation. Using the proposed methods optimal solutions for all small and medium size test instances are found. For large size instances, which are beyond the reach of exact methods, metaheuristic methods obtained good solutions in reasonable time.

URI:

http://hdl.handle.net/123456789/4246

Files in this item: 1

phdDrazic_Zorica.pdf ( 4.739Mb )

Primene metaheuristike zasnovane na elektromagnetizmu u rešavanju problema klasifikacije

Kartelj, Aleksandar (Beograd , 2014)

[more]

Abstract:

This work investigates the potential of improving the classi cation process through solving three classi cation-related problems: feature selection, feature weighting and parameter selection. All three problems are challenging and currently in the focus of scienti c researches in the eld of machine learning. Each problem is solved by using population-based metaheruistic method called electromagnetismlike method. This method is used for combinatorial and global optimization. It is inspired by laws of attraction and repulsion among charged particles. Each particle is represented by a vector of real values. The solution of the problem of interest is then obtained by mapping these real-valued vectors to the feasible solution domain. Particles representing better solutions achieve higher level of charge, which consequently produces greater impact on other particles. The search process is performed by iterating the particle movement, induced by charges. Through implementing the methods, two key aspects are managed: 1) the classi cation quality obtained after applying the optimization method and 2) the e ciency of the proposed methods from the perspective of time and space resources. All methods are equiped with problem-speci c local search procedures which tend to increase the solution quality. The bene t of applying feature selection for the classi cation process is twofold. Firstly, the elimination of unnecessary features decreases the data set noise, which degrades the quality of the classi cation model. Secondly, the problem dimension is decreased, thus the e ciency is increased. Feature selection problem is very e - ciently solved by the proposed method. The classi cation quality is in the majority of cases (instances) improved relative to the methods from literature. For some of the instances, computational times are up to several hundred times smaller than those of the competing methods. Feature weighting and parameter selection problem share similar underlying solution representation, based on the vectors of real values. Since the representation of charged particles is based on the same underlying domain, the transition from the particle to the solution domain behaves smoothly. The quality of the method for iv feature weighting is demonstrated through nearest neighbors classi cation model. The testing of the method is conducted on di erent collection of instances, and after that, the comparison to several methods from literature is made. In the majority of cases, the proposed method outperformed the comparison methods. The parameter selection, in classi cation, has a great impact on the classi cation quality. The proposed method for parameter selection is applied on the support vector machihe, which has a complex parametric structure when the number of parameters and the size of their domains is in question. By using heuristic initialization procedure, the detection of high quality regions for parameter combinations is accelerated. Exhaustive tests are performed on various instances in terms of their dimension and feature structure: homogenous and heterogeneous. Single kernel learning is adopted for homogenous, and multiple kernel learning for heterogeneous instances. The comparison with methods from literature showed superiority of the proposed method when single and multiple kernel learning based on radial basis function is considered. The method shows to be competitive in other cases. All proposed methods improved the classi cation quality. Because of the way, the problem is being solved, all three methods can be generalized and applied to a wide class of classi cation models and/or classi cation problem.

URI:

http://hdl.handle.net/123456789/4234

Files in this item: 1

phdAleksandarKartelj.pdf ( 2.121Mb )

DIGITALNA FORENZIKA U FUNKCIJI ZAŠTITE INFORMACIONOG SISTEMA BAZIRANOG NA LINUX I WINDOWS PLATFORMAMA

Korać, Vanja (Beograd , 2014)

[more]

Abstract:

Digitalna forenzika je multidisciplinarna nauka koja podrazumeva spoj razlicitih nauĉnih disciplina (raĉunarske nauke, pravo, kriminologija) sa brojnim izazovima u uslovima masovnog generisanja digitalnih podataka (Big Data), virtuelizacije klijentske i serverske strane (Cloud Computng), neusaglašenosti standardizacionih tela i opšteg nedostatka brojnih standarda i eksperata u svim disciplinama. Kako se digitalna forenzika odnosi na sve digitalne uraĊaje, uţa nauĉna oblast uklјuĉuje brojne aplikacije digitalne forenzike, kao što su raĉunarska forenzika, forenzika mobilnih ureĊaja, forenzika na sistemima savremenih automobila, senzorskih mreţa itd. U ovom radu je analizirana i primenjena uţa nauĉna oblast raĉunarske forenzike. Opisana je digitalna forenzika raĉunarskih sistema baziranih na Windows i Linux platformi, sa fokusom na odreĊena mesta u implementiranom sistemu proaktivne digitalne forenzike koja mogu ukazati na forenziĉki relevantne dogaĊaje kritiĉne za bezbednost sistema. Opisane su brojne metodologije, tehnologije i tehnike istrage visokotehnološkog kriminala. Proces prikuplјanja podataka i digitalne forenziĉke analize „uţivo―, detalјno je razmatran. Izvršena je kratka revizija karakteristika i tipiĉno zahtevanih funkcionalnosti softverskih forenziĉkih alata, za inicijalni odgovor i oporavak podataka i particija magnetnih diskova. Opisani su i najvaţniji digitalni forenziĉki kompleti alata i njihove osnovne funkcionalnosti. U radu se istiĉu i najznaĉajniji elementi kojima treba posvetiti posebnu paţnju prilikom digitalne forenziĉke analize u virtuelnom okruţenju. TakoĊe su objašnjeni i najvaţniji segmenti samog virtuelnog okruţenja i naĉin na koji oni mogu biti znaĉajni alati, za postupak digitalne forenziĉke analize. U poslednjem delu ovog rada, fokus je usmeren na ranjivosti Windows i Linux platformi sa prikazanim naĉinima zlonamernog proboja sistema. Opisane su opšte ranjivosti i specifiĉne ranjivosti koje se odnose samo na Windows, odnosno samo na Linux platforme. TakoĊe, navedeni su i najĉešći naĉini zlonamernog iskorišćavanja sistema. Ranjivosti raĉunarskih sistema i mreţa mogu se odnositi na programe, hardver, konfiguraciju i lјude. Isklјuĉujući lјude kao najznaĉajniji i istovremeno najkritiĉniji faktor u zaštiti informacija, programske ranjivosti se tipiĉno koriste za online direktne napade, ili napade malicioznim programima. Otkrivanje i otklanjanje ranjivosti sistemskih programa je jedan od glavnih cilјeva digitalne forenzike. Pored skuplјanja forenziĉki relevantnih digitalnih podataka i izgradnje ĉvrstih digitalnih dokaza o kompjuterskom incidentu ili kriminalu za potrebe pravosudnog sistema, cilј digitalne forenziĉke analize je da se iskorišćene ranjivosti trajno otklone i da se incident/protivpravna aktivnost takve vrste više nikada ne ponovi. U tom smislu je doprinos ovog rada veoma znaĉajan. Praktiĉan primer ispitivanja ranjivosti servisa na Windows i Linux platformama obuhvatio je 80 operativnih sistema. Od tog broja, 51 se odnosi na Windows operativne sisteme, a 29 na Linux operativne sisteme. Dobijeni rezultati su rezultat dvogodišnjeg istraţivanja, jer je ispitivanje sistema vršeno u 2011. i 2013. godini. Kroz skeniranje i prikaz ranjivosti difoltno instaliranih Windows i Linux sistema preventivno se otkrivaju ranjivosti koje potencijalno mogu biti iskorišćene od strane bezbednosnih pretnji (maliciozni programi ili zlonamerni napadaĉi) i time ugroziti raĉunarske sisteme i informacije. Proaktivnim otklanjanjem ovih ranjivosti realizuje se preventivna zaštita. Uspostavlјanjem sistema proaktivne forenzike, obezbeĊuje se logovanje forenziĉki relevantnih dogaĊaja, tj. tragova pokušaja napada u realnom vremenu, ĉime se bitno olakšava forenziĉka istraga u sluĉaju incidenta ili protivpravne aktivnosti.

URI:

http://hdl.handle.net/123456789/3869

Files in this item: 1

doktorat_Vanja_Korac.pdf ( 9.093Mb )

STRUKTURNA PREDIKCIJA FUNKCIJE PROTEINA I ODNOS FUNKCIONALNIH KATEGORIJA PROTEINA I NJIHOVE NEUREÐENOSTI

Kovačević, Jovana (Beograd , 2015)

[more]

Abstract:

Proteins represent the most important groups of biomoleculs. Di erent functions that they carry out in each organism are unique and irreplaceable, including versatile cellular processes, structural role of proteins, catalytic function, a number of metabolic functions and so on. Knowing and under- standing protein function is therefore essential in investigation of any biolo- gical process, especially of human diseases since a lot of them are caused by functional mutations. In this paper, we represent investigation of protein function domain through two di erent approaches. In the rst one, protein function is represented by GO ontologies with the structure of a directed acyclic graph. There are three GO ontologies: one for functions regarding biological processes, one for functions regarding cellular components and one for molecular functions. Each ontology contains several thousands of nodes, where every node deter- mines more speci c function than his ascendants. The task of this part of research was to develop a software for predicting protein function from its primary sequence based on structural support vector machines method which represents generalization of well-known support vector machines method on structural output. Structure-function paradigm is one of basic concepts in molecular biology, stating that 3D proten structure is closely connected to its role in organism. It has been detected that disordered proteins (the ones that lack 3D struc- ture) and disordered regions of proteins are related with severe contemporary illnesses, which contributed to their popularity in modern research. In an- other aspect, we investigated the relationship between proteins' functional categories and their disorder, as well ad with other physico-chemical char- acteristics of proteins. Here, protein function has been observed through 25 elementary functions grouped in 4 functional groups. In this work, we present results of thorough analysis over large protein dataset where dis- order has been determined computationally, using publicly available tools.

URI:

http://hdl.handle.net/123456789/4451

Files in this item: 1

DoktoratJK2015.pdf ( 1.116Mb )

AUTOMATSKO REŠAVANJE KONSTRUKTIVNIH PROBLEMA U GEOMETRIJI

Marinković, Vesna (Beograd , 2015)

[more]

Abstract:

The problems of geometry constructions using ruler and compass are one of the oldest and most challenging problems in elementary mathematics. A solution of construction problem is not an illustration, but a procedure that, using given construction primitives, gives a “recipe” how to construct the object sought. The main problem in solving construction problems, both for a human and a computer, is a combinatorial explosion that occurs along the solving process, as well as a procedure of proving solutions correct. In this dissertation a method for automated solving of one class of construction problems, so-called location problems, is proposed. These are the problems in which the task is to construct a triangle if locations of three characteristic points are given. This method successfully proved most of the solvable problems from Wernick’s and Connelly’s list. For each of the problems it is checked if it is symmetric to some of the already solved problems, and then its status is determined: the problem can be found redundant, locus dependent, solvable, or not solvable using existing knowledge. Also, a description of the construction in natural-language form and in language GCLC is automatically generated, accompanied by a corresponding illustration, and correctness proof of the generated construction, followed by the list of conditions when the construction is correct. The proposed method is implemented within the tool ArgoTriCS. For proving generated constructions correct, the system ArgoTriCS uses a newly developed prover ArgoCLP, the automated theorem provers integrated within tools GCLC and OpenGeoProved, as well as the interactive theorem prover Isabelle. It is demonstrated that the proofs obtained can be machine verifiable. Within this dissertation, the system ArgoCLP (developed in collaboration with Sana Stojanovi´c) which is capable of automatically proving theorems in coherent logic is described. This prover is successfully applied to different axiomatic systems. It automatically generates proofs in natural-language form, as well as machineverifiable proofs, whose correctness can be checked using interactive theorem prover Isabelle. The important part of this system is a module for simplification of generated proofs whereby shorter and readable proofs are obtained.

URI:

http://hdl.handle.net/123456789/4406

Files in this item: 1

tezaVesnaMarinkovic.pdf ( 2.233Mb )

Istraživanje obrazaca u određivanju karakterisitika proteina

Marovac, Ulfeta (Beograd , 2015)

[more]

Abstract:

Proteins are signi cant biological macromolecules of polymeric nature (polypeptides), which contain amino acids and are basic structural units of each cell. Their contents include 20+3 amino acids and, as a consequence, they are presented in biological databases as sequences formed from 23 di erent characters. Proteins can be classi ed based on their primary structure, secondary structure, function etc. One of possible classi cations of proteins by their function is related to their contents in a certain cluster of ortholologous groups (COGs). This classi cation is based on the previous comparison of proteins by their similarities in their primary structures, which is most often a result of homology, i.e. their mutual (evolutionary) origin. COG database is obtained by comparison of the known and predicted proteins encoded in the completely sequenced prokaryotic (archaea and bacteria) genomes and their classi cation by orthology. The proteins are classi ed in 25 categories which can be ordered in three basic functional groups (the proteins responsible for: (1) information storage and processing; (2) cellular processes and signaling; and (3) metabolism), or in a group of poorly characterized proteins. Classi cation of proteins by their contents in certain COG category (euKaryote Orthologous Groups- KOG for eukaryotic organisms) is signi cant for better understanding of biological processes and various pathological conditions in people and other organisms. The dissertation proposed the model for classi cation of proteins in COG categories based on amino acid n-grams (sequences of n- length). The set of data contains protein sequences of genomes from 8 di erent taxonomic classes [TKL97] of bacteria (Aqui cales, Bacteroidia, Chlamydiales, Chlorobia, Chloro exia, Cytophagia, Deinococci, Prochlorales), which are known to have been classi ed by COG categories. The new method is presented, based on the generalized systems of Boolean equations, used for separation of n-grams characteristic for proteins of corresponding COG categories. The presented method signi cantly reduces the number of processed n-grams in comparison to previously used methods of n-gram analysis, iv thus more memory space is provided and less time for protein procession is necessary. The previously known methods for classi cation of proteins by functional categories compared each new protein (whose function had to be determined) to the set of all proteins which had already been classi ed by functions in order to determine the group which contained most similar proteins to the one which was to be classi ed. In relation to the previous, the advantage of the new method is in its avoidance of sequence-sequence comparison and in search for those patterns (n-grams, up to 10 long) in a protein which are characteristic of the corresponding COG category. The selected patterns are added to a corresponding COG category and describe sequences of certain length, which have previously appeared in that COG category only, not in the proteins of other COG categories. On the basis of the proposed method, the predictor for determination of the corresponding COG category for a new protein is implemented. Minimal precision of the prediction is one of the predictors arguments. During the test phase the constructed predictor shown excellent results, with maximal precision of 99% reached for some proteins. According to its properties and relatively simple construction, the proposed method can be applied in similar domains where the solution of problem is based on n-gram sequence analysis.

URI:

http://hdl.handle.net/123456789/4308

Files in this item: 1

phdUlfetaMarovac.pdf ( 7.954Mb )

Konvejeva notacija u teoriji čvorova i njena primena u metodima za određivanje rastojanja čvorova

Zeković, Ana (Beograd , 2015)

[more]

Abstract:

A main focus of the paper is construction of new methods for defining diverse knot distance types - the distance of knots made by crossing changes (Gordian distance) and the distance among knots made by crossing smoothing (smoothing distance). Different ways of knots presentation are introduced, with objective to a mirror curve model. It is presented a purpose of the model, coding of knots, by using the model preferences, as well as introduction of a method to determinate a knots presented by the model and derived all the knots that could be placed to a nets dimensions p×q (p ≤ 4, q ≤ 4). Diverse knot notations are described into details, with a focus to Conway’s notation and its topological characteristics. As it is known, a present algorithms are based on an algebra of chain fractions, that are in close relation with a presentation of rational knots, which results in an absence of a huge number of non-rational knots, in an existing Gordian’s distance tables. The subject of the paper is an implementation of methods with bases on determination of new distances equal 1. The methods are based on a non-minimal presentation of rational and non-rational knots, generation of algorithms established on geometrical characteristics of Conway’s notation and a weighted graph search. The results are organized into Gordian’s distance knots tables up to 9 crossings, and have been enclosed with the paper. In order to append the table with knots having a bigger number of crossings, it has been suggested a method for extension of results for knot families. Using facts of relation among Gordian’s numbers and smoothing numbers, a new method for smoothing number determination is presented, and results in a form of lists for knots not having more then 11 crossings. In conjunction with Conway’s notation concept and the method, algorithms for a smoothing distance are generated. New results are organized in knot tables, up to 9 crossings, combined with previous results, and enclosed with the paper. A changes and smoothing to a knot crossing could be applied for modeling topoisomerase and recombinase actions of DNA chains. It is presented the method for studying changes introduced by the enzymes. A main contribution to the paper is the concept of Conways notation, used for all relevant results and methods, which led to introduction of a method for derivation a new knots in Conways notation by extending C-links. In a lack of an adequat pattern for an existing knot tables in DT-notation, there is usage of a structure based on topological knot concepts. It is proposed a method for knot classification based on Conways notation, tables of all knots with 13 crossings and alternated knots with 14 crossings has been generated and enclosed. The subject of the paper takes into consideration Bernhard-Jablan’s hypothesis for a determination of unknotting number using minimal knot diagrams. The determination is crucial in computation of diverse knot distances. The paper covers one of main problems in knot theory and contains a new method of knot minimization. The method is based on relevance of local and global minimization. 5 There are defined new terms such as a maximum and a mixed unknotting number. The knots that do not change a minimum crossing number, after only one crossing change are taken into consideration for the analyzes. Three classes of the knots are recognized, and called by authors . Kauffman’s knots, Zekovic knots and Taniyama’s knots. The most interesting conclusion correlated with Zekovic knots is that all derived Perko’s knots (for n ≤ 13 crossings) are actually Zekovic knots. Defining this class of knots provides opportunity to emphasize new definitions of specifis featured for well-known Perko’s knots.

URI:

http://hdl.handle.net/123456789/4255

Files in this item: 1

phdZekovicAna.pdf ( 5.246Mb )

REŠAVANJE PROBLEMA CSP TEHNIKAMA SVOÐENJA NA PROBLEM SAT

Stojadinović, Mirko (Beograd , 2016)

[more]

Abstract:

Many real-world problems can be modeled as constraint satisfaction problems (CSPs) and then solved by one of many available techniques for solving these problems. One of the techniques is reduction to SAT, i.e. Boolean Satisfiability Problem. Variables and constraints of CSP are translated (encoded) to SAT instance, that is then solved by state-of-the-art SAT solvers and solution, if exists, is translated to the solution of the original CSP. The main aim of this thesis is to improve CSP solving techniques that are using reduction to SAT. Two new hybrid encodings of CSPs to SAT are presented and they combine good sides of the existing encodings. We give the proof of correctness of one encoding that did not exist in literature. We developed system meSAT that enables reduction of CSPs to SAT by using 4 basic and 2 hybrid encodings. The system also enables solving of CSPs by reduction to two problems related to SAT, SMT and PB. We developed a portfolio for automated selection of encoding/solver to be used on some new instance that needs to be solved. The developed portfolio is comparable with the state-of-the-art portfolios. We developed a hybrid approach based on short solving timeouts with the aim of significantly reducing the preparation time of a portfolio. By using this approach, we got results comparable to the ones obtained by using preparation time of usual length. We made comparison between several machine learning techniques with the aim to find out which one is the best suited for the short training approach. The problem of assigning air traffic controllers to shifts is described and three models of this problem are presented. We used a large number of different solving methods and a diverse set of solvers for solving this problem. We developed optimization techniques that aim to find optimal solutions of the problem. A hybrid technique combining reduction to SAT and local search is shown to be the most efficient one. We also considered sudoku puzzles and the existing techniques of solving the puzzles of greater size than 9 9. Amongst the used techniques, the existing reduction to SAT is the most efficient in solving these puzzles. We improved the existing algorithm for generating large sudoku puzzles. It is shown that simple preprocessing rules additionally improve speed of generating large sudokus.

URI:

http://hdl.handle.net/123456789/4427

Files in this item: 1

MirkoStojadinovicTeza.pdf ( 2.030Mb )

REŠAVANJE KLASE MIN-MAX PROBLEMA ROBUSNE DISKRETNE OPTIMIZACIJE SA PRIMENAMA

Mišković, Stefan (Beograd , 2016)

[more]

Abstract:

In this dissertation, three NP-hard min-max discrete optimization problems are considered. The rst considered problem is multi-period emergency service location problem, the second one is dynamic maximal covering location problem with multiple covering radii, and the third one is uncapacitated multiple allocation p-hub center problem. In many practical situations, input parameters (such as user demands, transportation time or cost) often vary with unknown distributions. Therefore, it is necessary to involve these uncertainties in the deterministic variants of the problems by applying robust optimization approach. Mathematical models for the deterministic and non-deterministic variants of all three problems are developed, except for the deterministic uncapacitated multiple allocation p-hub center problem, which has already been addressed in the literature. In addition, for the rst time in the literature, it was proven that the emergency service location problem is NP-hard. The considered problems and their robust variants have numerous applications, due to the fact that in real-life situations input parameters are often subject to uncertainty. Multi-period emergency service location problem may be used when determining optimal locations for police stations, re brigades, ambulances, and other emergency units in the given region. The dynamic maximal covering location problem with multiple covering radii is useful when choosing the optimal strategy for establishing resources (service centers, suppliers, facilities, etc.) with maximal satisfaction of customer demands in a certain region, by assuming that the service e ciency directly depends on the distance between customer and service center (i.e., the selected coverage radius). The uncapacitated multiple allocation p-hub center problem has signi cant applications in designing telecommunication and transportation networks, postal delivery systems, emergency systems, supply networks, etc. Since exact methods provide optimal solutions only for problem instances of small dimensions, hybrid metaheuristic algorithms are developed to solve both deterministic and robust variants of the considered problems. The proposed hybrid algorithms are obtained by combining particle swarm optimization, with local search heuristic { classical local search or variable neighborhood search method. For dynamic maximal covering location problem with multiple covering radii, a hybridization of metaheuristic algorithm with exact method based on linear programming is developed. All elements of the proposed algorithms are adopted to the problems under consideration. Di erent strategies are implemented for improving the e ciency of proposed algorithms, especially for the calculation of the objective function value and the local search part. The in uence of di erent parameters of hybrid algorithms on the solution quality is analyzed in detail. All parameters are adjusted by using analysis of variance. For all considered problems (both deterministic and robust variant), the performance of the proposed hybrid algorithms is evaluated on adequate test data sets. The proposed algorithms are compared with existing heuristic from the literature and exact methods incorporated in commercial CPLEX solver. The obtained experimental results indicate the e ciency of proposed algorithms in obtaining high quality solutions for all considered test instances. The presented comparative analysis indicates the advantages of the proposed hybrid algorithms over existing methods in the sense of solution quality and/or required computational time, especially in the case of large problem dimensions. The results presented in this paper represent a contribution to the eld of discrete optimization, robust optimization and metaheuristic methods.

URI:

http://hdl.handle.net/123456789/4423

Files in this item: 1

Miskovic_Stefan_teza.pdf ( 1.773Mb )

INFORMATIČKI MODELI U ANALIZI OSEĆANJA ZASNOVANI NA JEZIČKIM RESURSIMA

Mladenović, Miljana (Beograd , 2016)

[more]

Abstract:

The beginning of the new millennium was marked by huge development of social networks, internet technologies in the cloud and applications of artificial intelligence tools on the web. Extremely rapid growth in the number of articles on the Internet (blogs, e-commerce websites, forums, discussion groups, and systems for transmission of short messages, social networks and portals for publishing news) has increased the need for developing methods of rapid, comprehensive and accurate analysis of the text. Therefore, remarkable development of language technologies has enabled their applying in processes of document classification, document clustering, information retrieval, word sense disambiguation, text extraction, machine translation, computer speech recognition, natural language generation, sentiment analysis, etc. In computational linguistics, several different names for the area concerning processing of emotions in text are in use: sentiment classification, opinion mining, sentiment analysis, sentiment extraction. According to the nature and the methods used, sentiment analysis in text belongs to the field of computational linguistics that deals with the classification of text. In the process of analysing of emotions we generally speak of three kinds of text classification: • identification of subjectivity (opinion classification or subjectivity identification) used to divide texts into those that carry emotional content and those that only have factual content • sentiment classification (polarity identification) of texts that carry emotional content into those with positive and those with negative emotional content • determining the strength or intensity of emotional polarity (strength of orientation). In terms of the level at which the analysis of feelings is carried out, there are three methodologies: an analysis at the document level, at the sentence level and at the level of attributes. Standardized methods of text classification usually use machine learning methods or rule-based techniques. Sentiment analysis, as a specific type of classification of documents, also uses these methods. This doctoral thesis, whose main task is the analysis of emotions in text, presents research related to the sentiment classification of texts in Serbian language, using a probabilistic method of machine learning of multinomial logistic regression i.e. maximum entropy method. The aim of this research is to create the first comprehensive, flexible, modular system for sentiment analysis of Serbian language texts, with the help of digital resources such as: semantic networks, specialized lexicons and domain ontologies. This research is divided into two phases. The first phase is related to the development of methods and tools for detecting sentiment polarity of literal meaning of the text. In this part of the work, a new method of reducing the feature vector space for sentiment classification is proposed, implemented and evaluated. The proposed method for reduction is applied in the classification model of maximum entropy, and relies on the use of lexical-semantic network WordNet and a specialized sentiment lexicon. The proposed method consists of two successive processes. The first process is related to the expansion of feature vector space by the inflectional forms of features. The study has shown that usage of stemming in sentiment analysis as a standard method of reducing feature vector space in text classification, can lead to incomplete or incorrect sentiment-polarity feature labelling, and with the introduction of inflectional feature forms, this problem can be avoided. The paper shows that a feature vector space, increased due to the introduction of inflectional forms, can be successfully reduced using the other proposed procedure – semantic mapping of all predictors with the same sentiment-polarity into a small number of semantic classes. In this way, the feature vector space is reduced compared to the initial one, and it also retains the semantic precision. The second phase of the dissertation describes the design and implementation of formal ontologies of Serbian language rhetorical figures – the domain ontology and the task ontology. Usage of the task ontology in generating features representing figurative speech is presented. The research aim of the second phase is to recognize figurative speech to be used in improving of the existing set of predictors generated in the first phase of the research. The research results in this phase show that some classes of figures of speech can be recognized automatically. In the course of working on this dissertation, a software tool SAFOS (Sentiment Analysis Framework for Serbian), as an integrated system for sentiment classification of text in Serbian language, has been developed, implemented and statistically evaluated. Results of the research within the scope of this thesis are shown in papers (Mladenović & Mitrović, 2013; Mladenović & Mitrović, 2014; Mladenović, Mitrović & Krstev, 2014; Mladenović, Mitrović, Krstev & Vitas, 2015; Mladenović, Mitrović & Krstev, 2016). The dissertation consists of seven chapters with the following structure. Chapter 1 introduces and defines methods, resources and concepts used in the first phase of research: text classification, sentiment classification, machine learning, supervised machine learning, probabilistic supervised machine learning, and language models. At the end of the introductory section, the tasks and objectives of the research have been defined. Chapter 2 presents a mathematical model of text classification methods and classification of sentiment methods. A mathematical model of a probabilistic classification and an application of the probabilistic classification in regression models are presented. At the end of the chapter it is shown that the method using the mathematical model of maximum entropy, as one of the regression models, has been successfully applied to natural language processing tasks. Chapter 3 presents the lexical resources of the Serbian language and the methods and tools of their processing. Chapter 4 deals with the comprehensive research on the currently available types and methods of sentiment classification. It shows the current work and research in sentiment classification of texts. It also presents a comparative overview of research in sentiment classification of texts using the method of maximum entropy. Chapter 5 discusses the contribution of this thesis to methods of feature space reduction for maximum entropy classification. First, a feature space reduction method is analysed. A new feature space reduction method which improves sentiment classification is proposed. А mathematical model containing proposed method is defined. Learning and testing sets and lexical-semantic resources that are used in the proposed method are introduced. Chapter 5 also describes building and evaluation of a system for sentiment classification – SAFOS, which applies and evaluates the proposed method of a feature vector space reduction. The parameters and the functions of SAFOS are defined. Also, measures for evaluation of the system were discussed – precision, recall, F1-measure and accuracy. A description of the method for assessing the statistical significance of a system is given. Also, implementation of the statistical test in the system SAFOS is discussed. The chapter provides an overview of the presented experiments, results and evaluation of the system. Chapter 6 deals with methods of recognizing figurative speech which can improve sentiment classification. The notion of domain ontology is introduced, the role of rhetorical figures and domain ontology of rhetorical figures. The importance of figurative speech in the sentiment classification has been explored. The description of the construction and structure of the first domain ontology of rhetorical figures in Serbian language, RetFig.owl, is given. Also, the description of the construction and structure of the corresponding task ontology that contains rules for identification of some classes of rhetorical figures is given. At the end of this chapter, an overview of the performed experiments, results and evaluation of the SAFOS system plugin that improved the recognition of figurative speech is given. The final chapter of this study deals with the achievemnts, problems and disadvantages of the SAFOS system. The conclusion of this thesis points to the great technological, social, educational and scientific importance of the sentiment analysis and recognition of the figurative speech and gives some routes in further development of the SAFOS system.

URI:

http://hdl.handle.net/123456789/4422

Files in this item: 1

Mladenovic_Miljana.pdf ( 13.60Mb )

Now showing items 1-20 of 45

Browsing Computer Science by Issue Date

eLibrary

Browsing Computer Science by Issue Date

Search eLibrary

Browse

All of eLibrary

This Collection

My Account

Relited sites

COPYRIGHT STATEMENT