Scripting Tcl 大小事: DNA Sequence

Below is an attempt at solving the DNA Sequence problem in Tcl:

#!/usr/bin/env tclsh
#
# DNA is a long, chainlike molecule which has two strands twisted 
# into a double helix. The two strands are made up of simpler molecules 
# called nucleotides. Each nucleotide is composed of one of the 
# four nitrogen-containing nucleobases cytosine (C), guanine (G), 
# adenine (A) and thymine (T).
# Write a script to print nucleobase count in the given DNA sequence. 
# Also print the complementary sequence where Thymine (T) on one strand 
# is always facing an adenine (A) and vice versa; guanine (G) is always 
# facing a cytosine (C) and vice versa.
#

set dna "GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG"
# regexp -all returns the number of matches, so no -inline list is needed
set ccount [regexp -all "C" $dna]
set gcount [regexp -all "G" $dna]
set acount [regexp -all "A" $dna]
set tcount [regexp -all "T" $dna]
puts "C: $ccount"
puts "G: $gcount"
puts "A: $acount"
puts "T: $tcount"
set complement ""
set length [string length $dna]
for {set i 0} {$i < $length} {incr i} {
    set substring [string index $dna $i]
    if {[string compare $substring "C"]==0} {
        append complement "G"
    } elseif {[string compare $substring "G"]==0} {
        append complement "C"
    } elseif {[string compare $substring "A"]==0} {
        append complement "T"
    } elseif {[string compare $substring "T"]==0} {
        append complement "A"
    }
}
puts ""
puts "Complement:"
puts $complement
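
The if/elseif chain above swaps each base for its partner one character at a time. As a minimal sketch of the same idea, assuming the input contains only the four bases, a single string map call can do the substitution. string map does not rescan text it has already replaced, so mapping C to G and G to C in the same call is safe:

```tcl
#!/usr/bin/env tclsh
# Sketch: build the complement with one [string map] call.
# Assumption: every character in dna is one of C, G, A, T.
set dna "GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG"

# The map is a flat list of {old new old new ...} pairs.
set complement [string map {C G G C A T T A} $dna]

puts "Complement:"
puts $complement
```

Unlike the loop version, any character outside the map is copied through unchanged rather than dropped, so the two are only equivalent for clean input.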

It can also be written so that the counting and the complement are handled in a single loop:

#!/usr/bin/env tclsh
#
# DNA is a long, chainlike molecule which has two strands twisted 
# into a double helix. The two strands are made up of simpler molecules 
# called nucleotides. Each nucleotide is composed of one of the 
# four nitrogen-containing nucleobases cytosine (C), guanine (G), 
# adenine (A) and thymine (T).
# Write a script to print nucleobase count in the given DNA sequence. 
# Also print the complementary sequence where Thymine (T) on one strand 
# is always facing an adenine (A) and vice versa; guanine (G) is always 
# facing a cytosine (C) and vice versa.
#

set dna "GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG"
set ccount 0
set gcount 0
set acount 0
set tcount 0
set complement ""
set length [string length $dna]
for {set i 0} {$i < $length} {incr i} {
    set substring [string index $dna $i]
    if {[string compare $substring "C"]==0} {
        incr ccount
        append complement "G"
    } elseif {[string compare $substring "G"]==0} {
        incr gcount
        append complement "C"
    } elseif {[string compare $substring "A"]==0} {
        incr acount
        append complement "T"
    } elseif {[string compare $substring "T"]==0} {
        incr tcount
        append complement "A"
    }
}
puts "C: $ccount"
puts "G: $gcount"
puts "A: $acount"
puts "T: $tcount"
puts ""
puts "Complement:"
puts $complement

An implementation using an array looks like this:

#!/usr/bin/env tclsh
#
# DNA is a long, chainlike molecule which has two strands twisted 
# into a double helix. The two strands are made up of simpler molecules 
# called nucleotides. Each nucleotide is composed of one of the 
# four nitrogen-containing nucleobases cytosine (C), guanine (G), 
# adenine (A) and thymine (T).
# Write a script to print nucleobase count in the given DNA sequence. 
# Also print the complementary sequence where Thymine (T) on one strand 
# is always facing an adenine (A) and vice versa; guanine (G) is always 
# facing a cytosine (C) and vice versa.
#

set dna "GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG"
set complement ""
array set count {}
set length [string length $dna]
for {set i 0} {$i < $length} {incr i} {
    set substring [string index $dna $i]
    incr count($substring)
    switch -- $substring {
        "C" {
            append complement "G"
        }
        "G" {
            append complement "C"
        }
        "A" {
            append complement "T"
        } 
        "T" {
            append complement "A"
        }
    }
}
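# array get returns entries in no guaranteed order, so sort the keys
# to get the same output on every run
foreach key [lsort [array names count]] {
    puts "$key: $count($key)"
}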
puts ""
puts "Complement:"
puts $complement
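
The array version can also be sketched with dicts, assuming Tcl 8.5 or later. A dict keeps its keys in insertion order, so the counts print as C, G, A, T without sorting. The names pair and count are chosen for this illustration, and skipping unknown characters is an assumption that mirrors how the scripts above ignore them when building the complement:

```tcl
#!/usr/bin/env tclsh
# Sketch: count bases and build the complement using dicts (Tcl 8.5+).
set dna "GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG"

# Lookup table (illustrative name): base -> complementary base.
set pair {C G G C A T T A}

# Pre-seed the counters so every base is printed, even with a count of 0.
set count [dict create C 0 G 0 A 0 T 0]
set complement ""

foreach base [split $dna ""] {
    # Assumption: characters that are not C, G, A or T are skipped.
    if {![dict exists $pair $base]} {
        continue
    }
    dict incr count $base
    append complement [dict get $pair $base]
}

dict for {base n} $count {
    puts "$base: $n"
}
puts ""
puts "Complement:"
puts $complement
```

Keeping the base pairs in one table means the counting loop no longer needs a switch, and adding input validation only touches the dict exists check.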

2020-12-12
