Primena pravila pridruživanja i metoda podržavajućih vektora za predviđanje T - ćelijskih epitopa


dc.contributor.advisor	Mitić, Nenad
dc.contributor.author	Jandrlić, Davorka
dc.date.accessioned	2017-04-25T15:34:24Z
dc.date.available	2017-04-25T15:34:24Z
dc.date.issued	2016
dc.identifier.uri	http://hdl.handle.net/123456789/4457
dc.description.abstract	Application of association rule and support vector machine technique for T cell epitope prediction. Abstract: Data mining is an interdisciplinary subfield of computer science that draws on various scientific disciplines, such as database systems, statistics, machine learning, and artificial intelligence. The main task of data mining is the automatic and semi-automatic analysis of large quantities of data to extract previously unknown, nontrivial, and interesting patterns. Rapid development in the fields of immunology, genomics, proteomics, molecular biology, and other related areas has caused a large increase in biological data. Drawing conclusions from these data requires sophisticated computational analyses; without automatic methods for extracting information, it is almost impossible to investigate and analyze them. Currently, one of the most active problems in immunoinformatics is T-cell epitope identification. Identification of T-cell epitopes, especially dominant T-cell epitopes widely represented in the population, is of immense relevance in vaccine development and in detecting immunological patterns characteristic of autoimmune diseases. Epitope-based vaccines are of great importance in combating infectious and chronic diseases and various types of cancer. Experimental methods for identification of T-cell epitopes are expensive and time consuming, and are not applicable to large-scale research (especially not to choosing the optimal group of epitopes for developing vaccines that cover the whole population, or personalized vaccines). Computational and mathematical models for T-cell epitope prediction, based on MHC-peptide binding, are crucial to enable the systematic investigation and identification of T-cell epitopes on large datasets and to complement expensive and time-consuming experimentation [16].
T-cells (T-lymphocytes) recognize protein antigens only when they are degraded to peptide fragments and complexed with Major Histocompatibility Complex (MHC) molecules on the surface of antigen-presenting cells [1]. The binding of these peptides (potential epitopes) to MHC molecules and their presentation to T-cells is a crucial (and the most selective) step in both cellular and humoral adaptive immunity. Numerous methodologies for identifying these epitopes currently exist. The methods discussed in this PhD thesis are based exclusively on peptide sequence binding to MHC molecules. The thesis describes existing methodologies for T-cell epitope prediction, their shortcomings, and some of the available databases of experimentally determined linear T-cell epitopes. New models for T-cell epitope prediction are developed using data mining techniques, and an extensive analysis is carried out of whether disorder and hydropathy prediction methods could help in understanding epitope processing and presentation. Accurate computational prediction of T-cell epitopes, which is the aim of this thesis, can greatly expedite epitope screening by reducing costs and experimental effort. The thesis deals with predictive data mining tasks (classification and regression) and descriptive data mining tasks (clustering, association rules, and sequence analysis). The newly developed models, which are the main contribution of the dissertation, are comparable in performance with the best currently existing methods, and in some cases better. The developed models are based on the support vector machine technique for classification and regression problems. A new approach to extracting the most important physicochemical properties that influence the classification of MHC-binding ligands is also presented. For that purpose, new clustering-based classification models are developed, based on the k-means clustering technique.
The second part of the thesis concerns the establishment of rules and associations for T-cell epitopes that belong to different protein structures. The task of this part of the research was to find out whether disorder and hydropathy prediction methods could help in understanding epitope processing and presentation. The results of applying an association rule technique, and of a thorough analysis of a large protein dataset in which T-cell epitopes, protein structure, and hydropathy were determined computationally using publicly available tools, are presented. During the research for this thesis, a new extendable open-source software system was developed that supports bioinformatics research and has wide applications in predicting various protein characteristics. Parts of this thesis are described in the works [71][82][45][42][43][44][72][73], which are published or submitted for publication in several journals. The dissertation is organized as follows. Section 1 introduces the problem of identifying T-cell epitopes, the importance of mathematical and computational methods in this area, the importance of T-cell epitopes to the immune system, and the basis for the functioning of the immune system. Section 2 describes in detail the data mining techniques used in the thesis for the development of new models. Section 3 provides an overview of existing methods for predicting T-cell epitopes and explains the methodologies of existing models and methods. It points out the shortcomings of existing methods, which motivated the development of new models for T-cell epitope prediction. Some of the publicly available databases of experimentally determined MHC-binding peptides and T-cell epitopes are also described. Section 4 presents the newly developed models for epitope prediction.
The developed models include three new encoding schemes that represent peptide sequences as vectors, a form more suitable as input to models based on data mining techniques. Section 5 reports the results of the new classification and regression models. The new models are compared with each other as well as with currently existing methods for T-cell epitope prediction. Section 6 presents the research results on the relationship of T-cell epitopes with ordered and disordered regions in proteins. This chapter presents summary results that are shown in more detail in the published works [71][82][45][44]. Section 7 concludes the dissertation with a discussion of the potential significance of the obtained results and some directions for future work.	en_US
dc.description.provenance	Submitted by Slavisha Milisavljevic (slavisha) on 2017-04-25T15:34:24Z No. of bitstreams: 1 doktorskaTezaDavorkaJandrlic.pdf: 7938943 bytes, checksum: db033339d434ceab596b0b0052181924 (MD5)	en
dc.description.provenance	Made available in DSpace on 2017-04-25T15:34:24Z (GMT). No. of bitstreams: 1 doktorskaTezaDavorkaJandrlic.pdf: 7938943 bytes, checksum: db033339d434ceab596b0b0052181924 (MD5) Previous issue date: 2016	en
dc.language.iso	sr	en_US
dc.publisher	Beograd	en_US
dc.title	Primena pravila pridruživanja i metoda podržavajućih vektora za predviđanje T - ćelijskih epitopa	en_US
mf.author.birth-date	1981-02-12
mf.author.birth-place	Dubrovnik	en_US
mf.author.birth-country	Croatia	en_US
mf.author.residence-state	Serbia	en_US
mf.author.citizenship	Serbian	en_US
mf.author.nationality	Serbian	en_US
mf.subject.area	Computer science	en_US
mf.subject.keywords	Support vector machine, classification, regression, k-means clustering, association rules, T cell epitopes	en_US
mf.subject.subarea	Data mining	en_US
mf.contributor.committee	Mitić, Nenad
mf.contributor.committee	Pavlović - Lažetić, Gordana
mf.contributor.committee	Pavlović, Mirjana
mf.university.faculty	Faculty of Mathematics	en_US
mf.document.references	119	en_US
mf.document.pages	146	en_US
mf.document.location	Beograd	en_US
mf.document.genealogy-project	No	en_US
mf.university	University of Belgrade	en_US

Files in this item

Files	Size	Format	View
doktorskaTezaDavorkaJandrlic.pdf	7.938Mb	PDF	View/Open

This item appears in the following Collection(s)

Mathematics
