The structure of bacterial intergenic sequences: relation with regulation and denaturation / Leonardo Lenzini. - (2020).
The structure of bacterial intergenic sequences: relation with regulation and denaturation
Leonardo Lenzini
2020
Abstract
The research problems tackled in this Ph.D. thesis concern four different items. Our work took inspiration from a previous contribution [1] that focused on the study of noncoding (promoter) sequences in eukaryotes. Through a collaboration with the genetics group of Professor Renato Fani, we moved the object of our research towards noncoding sequences in some bacterial species. The first goal was to define a method able to uniquely identify noncoding DNA sequences located near genes in bacteria which, like promoters in eukaryotes, are not themselves transcribed by RNA polymerase. More precisely, following the approach proposed in [1], the sequences obtained in this way were compared pairwise and then divided, by a clustering procedure, into groups based on structural similarities. Making use of up-to-date databases, we found a correspondence between structural features and biological properties. Two databases were used for this purpose. The first one, STRING, allowed us to build biological networks among the genes regulated by the corresponding noncoding sequences. We built co-expression and co-occurrence networks and, with a statistical procedure, determined which clusters gave rise to networks whose features differ from those of networks built from randomly chosen noncoding sequences. The second database, COG, allowed us to unambiguously associate a biological function with each noncoding sequence; a functional enrichment analysis then showed which functional categories are over- or under-represented in a cluster with respect to the genome background. This analysis produced the following publication: Lenzini L, Di Patti F, Livi R, Fondi M, Fani R, Mengoni A. A Method for the Structure-Based, Genome-Wide Analysis of Bacterial Intergenic Sequences Identifies Shared Compositional and Functional Features. Genes. 2019; 10(10):834.
It would be interesting to investigate biological correlations further using other databases, and to extend the study to other prokaryotic organisms by looking for possible similarities between noncoding sequences belonging to different species. The next goal was to study the thermodynamics of denaturation for most of the noncoding bacterial DNA sequences identified above. Taking advantage of the availability of real noncoding sequences, we investigated the correspondence between structural and thermodynamic properties. To model the nucleotide chain we used the Dauxois-Peyrard-Bishop model, in which one of the contributions to the potential energy is a Morse potential that accounts for the transverse bonds between the two opposite strands and distinguishes between weak and strong bases. The study of the denaturation properties of clusters of noncoding sequences in a given organism had the merit of highlighting the importance of the changes we made to the Dauxois-Peyrard-Bishop model. Whereas in the original Dauxois-Peyrard-Bishop model the four nucleotides have the same mass, we introduced a "degeneration" in the masses in the symplectic algorithm that reproduces the dynamics of the chain coupled with a thermostat. The effect of this adaptation can be seen by studying the denaturation process. The results of this study were reported in the following paper: Leonardo Lenzini, Francesca Di Patti, Stefano Lepri, Roberto Livi, and Stefano Luccioli. Thermodynamics of DNA denaturation in a model of bacterial intergenic sequences. Chaos, Solitons & Fractals, 130:109446, 2020. A natural development of this line of research would be to apply the study of denaturation dynamics to eukaryotic intergenic sequences (IGSs), although this would involve a much higher computational cost.
The last part of this work concerns the re-examination of the intergenic eukaryotic sequences previously studied in [1]. In particular, we analyzed the compositional features of different eukaryotic species along the phylogenetic tree. This systematic investigation yielded empirical observations about the presence of structural constraints characterizing such sequences. The analysis revealed a correlation between evolutionary trends and the compositional structure of noncoding sequences, in analogy with what has been observed for coding components. Despite the interest and novelty of this result, a clear interpretation of the presence of these constraints is still lacking, and further investigations are necessary to reach a convincing biological interpretation. From this perspective, the results can be viewed as a preliminary step, still insufficient for producing a publication. It would be important to improve and extend this analysis in order to reveal whether and how the constraints are correlated with evolution or depend on a combination of other causes. At the present stage of the research this is a fully open problem.
[1] Pettinato L, Calistri E, Di Patti F, Livi R, Luccioli S (2014) Genome-Wide Analysis of Promoters: Clustering by Alignment and Analysis of Regular Patterns. PLoS ONE 9(1): e85260. https://doi.org/10.1371/journal.pone.0085260
File | Size | Format | |
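As an illustration of the functional enrichment step mentioned above, the following is a minimal sketch of how over- or under-representation of a category in a cluster could be tested against the genome background. It assumes a hypergeometric test, which is a common choice but is not stated in this abstract; the function names and the example counts are hypothetical.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """Return P(X >= k) for a hypergeometric variable X.

    N: sequences in the genome background
    K: background sequences annotated with the category
    n: sequences in the cluster
    k: cluster sequences annotated with the category
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def enrichment_p_values(k, n, K, N):
    """Return (p_over, p_under) for one category in one cluster.

    p_over  = P(X >= k): small when the category is over-represented.
    p_under = P(X <= k): small when the category is under-represented.
    """
    p_over = hypergeom_tail(k, n, K, N)
    p_under = 1.0 - hypergeom_tail(k + 1, n, K, N)
    return p_over, p_under


if __name__ == "__main__":
    # Hypothetical counts: 40 of 200 cluster sequences carry a category
    # that annotates 300 of the 4000 sequences in the background.
    p_over, p_under = enrichment_p_values(k=40, n=200, K=300, N=4000)
    print(f"p_over = {p_over:.3g}, p_under = {p_under:.3g}")
```

In practice one such test per category and per cluster would call for a multiple-testing correction, whose choice is likewise left open here.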
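For reference, the Dauxois-Peyrard-Bishop Hamiltonian is usually written in the literature in the form below, where y_n is the transverse stretching of the n-th base pair and p_n its conjugate momentum. The site-dependent symbols m_n, D_n and a_n are an assumed notation used here to express the modified masses and the weak/strong distinction in the Morse term; the exact parameterization adopted in the thesis is not given in this abstract.

```latex
H = \sum_{n} \left[
      \frac{p_n^{2}}{2 m_n}
      + D_n \left( e^{-a_n y_n} - 1 \right)^{2}
      + \frac{K}{2} \left( 1 + \rho\, e^{-\alpha \left( y_n + y_{n-1} \right)} \right)
        \left( y_n - y_{n-1} \right)^{2}
    \right]
```

The second term is the Morse potential for the transverse bond between the two strands, and the third is the anharmonic stacking interaction between neighbouring base pairs; setting all m_n equal recovers the equal-mass case described in the text.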
---|---|---|---|
Tesi_corretta.pdf (open access; type: doctoral thesis; license: Open Access) | 12.55 MB | Adobe PDF | |
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.