Identification of possible differences in coding and non coding fragments of DNA sequences by using the method of the Recurrence Quantification Analysis

Journal of Research and Review in Applied Science 13 (2):1-28 (2012)

Abstract

Since the results of Li et al. in 1992, there has been considerable interest in finding long-range correlations in DNA sequences, because they raise questions about the role of introns and intron-containing genes. In the present paper we study two sequences: the human T-cell receptor alpha/delta locus (GenBank name HUMTCRADCV), a mostly noncoding chromosomal fragment of M = 97630 bases (less than 10% coding regions), and Escherichia coli K12 (GenBank name ECO110K), a genomic fragment of M = 111401 bases consisting of more than 80% coding regions. We assigned the value +1 to the purines and the value -1 to the pyrimidines, and to the random walk reconstructed in this way we applied Recurrence Quantification Analysis (RQA), introduced by Zbilut and Webber in 1994. Using dimension D=1 and embedding dimensions D=3 and D=5, we obtain some indicative results. Even on simple visual inspection of the reconstructed maps, the differences between coding and noncoding regions are evident and striking: noncoding regions show long patches of the same colour that are absent in the coding sequence. At first sight this suggests a simple explanation of the concept of "long-range" correlation. On the quantitative side, we used %Rec., %Det., the Ratio, the Entropy, %Lam., and Lmax, which, as explained in detail in the text, are the basic variables of RQA. The significant result is that both Lmax and Laminarity take very large values in HUMTCRADCV, clearly different from ECO110K, where these variables take more modest values. We therefore suggest that this is where the observed difference between HUMTCRADCV and ECO110K lies. The higher long-range correlations of introns with respect to exons claimed by many authors may be explained by the higher values of Lmax and Laminarity found in HUMTCRADCV compared with ECO110K.
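
The pipeline described in the abstract (purine/pyrimidine coding, random walk, embedding, recurrence measures) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the unit delay, the Euclidean distance, the radius and the minimum line length `lmin` are all assumptions made for illustration, since the abstract does not state these choices. Only %Rec, %Det and Lmax are computed; Laminarity, Ratio and Entropy are omitted.

```python
from itertools import accumulate
from math import dist

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def to_steps(seq):
    """Map purines to +1 and pyrimidines to -1 (any other symbol raises)."""
    steps = []
    for base in seq.upper():
        if base in PURINES:
            steps.append(1)
        elif base in PYRIMIDINES:
            steps.append(-1)
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return steps

def random_walk(steps):
    """Cumulative sum of the +1/-1 steps."""
    return list(accumulate(steps))

def embed(series, dim, delay=1):
    """Time-delay embedding; delay=1 is an assumption, not taken from the paper."""
    n = len(series) - (dim - 1) * delay
    return [tuple(series[i + k * delay] for k in range(dim)) for i in range(n)]

def diagonal_lengths(vectors, radius):
    """Lengths of diagonal runs of recurrent points above the main diagonal."""
    n = len(vectors)
    lengths = []
    for offset in range(1, n):
        run = 0
        for i in range(n - offset):
            if dist(vectors[i], vectors[i + offset]) <= radius:
                run += 1
            else:
                if run:
                    lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return lengths

def rqa_summary(series, dim=3, delay=1, radius=1.0, lmin=2):
    """%Rec, %Det and Lmax from the upper triangle of the recurrence matrix."""
    vectors = embed(series, dim, delay)
    n = len(vectors)
    lengths = diagonal_lengths(vectors, radius)
    recurrent = sum(lengths)              # recurrent points off the main diagonal
    pairs = n * (n - 1) // 2              # all point pairs in the upper triangle
    in_lines = sum(l for l in lengths if l >= lmin)
    return {
        "rec_percent": 100.0 * recurrent / pairs if pairs else 0.0,
        "det_percent": 100.0 * in_lines / recurrent if recurrent else 0.0,
        "lmax": max(lengths, default=0),
    }

if __name__ == "__main__":
    toy = "AGGCTTAGCAGGTTCA"              # made-up toy sequence, not real data
    walk = random_walk(to_steps(toy))
    print(rqa_summary(walk, dim=3, radius=1.0))
```

The pairwise comparison is quadratic in the number of embedded points, so for fragments of about 10^5 bases such as the two studied here, a sketch like this would have to be run on windows of the sequence or replaced by an optimized RQA tool.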
