Scripting Tcl 大小事: December 2020

2020-12-28

tcl-tidy: Tcl bindings for libtidy

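While trying HTML Tidy again recently, I noticed that since its rewrite a few years ago the command-line tool is really just a wrapper around libtidy. That suggested the same work could be done by calling libtidy from Tcl directly, rather than running the tidy tool through exec, so I tried writing a Tcl extension. The basic functionality and option settings appear to work, so I have put a copy of the code on GitHub.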

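Update, 2021/01/01
Being able to set the encoding seems to cause problems, so I have removed that option for now. The default encoding is utf8, and the version is now v0.2.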

2020-12-26

Replace e with E

Write a script to replace the character ‘e’ with ‘E’ in the string ‘Weekly Challenge’. Also print the number of times the character ‘e’ is found in the string.

#!/usr/bin/env tclsh

set times [regsub -all "e" "Weekly Challenge" "E" result]
puts "Find e $times times"
puts "Output: $result"

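This uses a regular expression for the substitution. When given a result variable, regsub returns the number of replacements it made, so with -all the return value is also the count of 'e'.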

string map can also do the replacement. Here is another way to write Replace e with E:

#!/usr/bin/env tclsh

set str "Weekly Challenge"
set count 0

foreach x [split $str {}] {
    if {$x eq "e"} {
        incr count
    }
}
puts "Find e $count times"
set result [string map {e E} $str]
puts "Output: $result"

The result string can also be built while counting the occurrences of e:

#!/usr/bin/env tclsh

set str "Weekly Challenge"
set count 0
set result {}

foreach x [split $str {}] {
    if {$x eq "e"} {
        incr count
        append result "E"
    } else {
        append result $x
    }
}
puts "Find e $count times"
puts "Output: $result"
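
The counting and the replacement can also be folded into one small reusable proc. The following is only a sketch: the proc name replaceAndCount is hypothetical and does not come from the original post. It assumes a non-empty search string and counts occurrences by comparing string lengths before and after removing them with string map.

```tcl
#!/usr/bin/env tclsh

# Hypothetical helper (not from the original post): returns a two-element
# list {count result} for replacing every occurrence of $from with $to.
proc replaceAndCount {str from to} {
    if {$from eq ""} {
        error "search string must not be empty"
    }
    # Removing every occurrence of $from shortens the string by
    # count * [string length $from] characters.
    set stripped [string map [list $from ""] $str]
    set count [expr {([string length $str] - [string length $stripped]) / [string length $from]}]
    return [list $count [string map [list $from $to] $str]]
}

lassign [replaceAndCount "Weekly Challenge" e E] count result
puts "Find e $count times"
puts "Output: $result"
```

Because string map treats its arguments as literal text, this variant needs no escaping even when the search string contains regular expression metacharacters.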

2020-12-21

Isomorphic Strings

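Here is a solution to Isomorphic Strings using string first. Two strings of equal length are isomorphic when, at every position, the character in each string first appeared at the same index: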

#!/usr/bin/env tclsh
#
# You are given two strings $A and $B. 
# Write a script to check if the given strings are Isomorphic. 
# Print 1 if they are otherwise 0.
#

proc isIsomorphic {str1 str2} {
    set len [string length $str1]
    if {$len != [string length $str2]} {
        return 0
    }

    for {set i 0} {$i < $len} {incr i} {
        set floc1 [string first [string index $str1 $i] $str1]
        set floc2 [string first [string index $str2 $i] $str2]

        if {$floc1 != $floc2} {
            return 0
        }
    }

    return 1
}

if {$argc == 2} {
    set string1 [lindex $argv 0]
    set string2 [lindex $argv 1]
} else {
    puts stderr "Usage: $argv0 string1 string2"
    exit 1
}

puts [isIsomorphic $string1 $string2]
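
As a cross-check on the first-occurrence approach, the same question can be answered with an explicit two-way character mapping. This is a sketch under the assumption that "isomorphic" means a one-to-one pairing of characters in both directions; the proc name isIsomorphicByMap is hypothetical and not part of the original post.

```tcl
#!/usr/bin/env tclsh

# Hypothetical cross-check (not from the original post): two strings are
# isomorphic when their characters can be paired one-to-one in both
# directions.
proc isIsomorphicByMap {str1 str2} {
    if {[string length $str1] != [string length $str2]} {
        return 0
    }

    set forward [dict create]   ;# character in str1 -> character in str2
    set backward [dict create]  ;# character in str2 -> character in str1

    foreach a [split $str1 {}] b [split $str2 {}] {
        if {[dict exists $forward $a] && [dict get $forward $a] ne $b} {
            return 0
        }
        if {[dict exists $backward $b] && [dict get $backward $b] ne $a} {
            return 0
        }
        dict set forward $a $b
        dict set backward $b $a
    }
    return 1
}

foreach {s1 s2} {abc xyz abb xyz sum add} {
    puts "$s1 $s2 -> [isIsomorphicByMap $s1 $s2]"
}
```

The backward dictionary is what rejects a pair such as sum and add, where two different characters would otherwise map to the same target.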

2020-12-15

Count Number

Here is an attempt at solving Count Number in Tcl:

#!/usr/bin/tclsh
#
# You are given a positive number $N.
# Write a script to count number and display as you read it.
#
# For example,
# Input: $N = 1122234
# Output: 21321314
# as we read "two 1 three 2 one 3 one 4"
#

puts -nonewline "Please input a number: "
flush stdout
gets stdin number
if {![string is digit -strict $number] || $number <= 0} {
    puts "The number must be a positive integer."
    exit
}

array set mapping [list 1 one 2 two 3 three 4 four 5 five 6 six 7 seven 8 eight 9 nine]

set last [string index $number 0]
set index 0
set results [list]
lset results $index [list 1 $last]
for {set i 1} {$i < [string length $number]} {incr i} {
    set current [string index $number $i]

    if {$current == $last} {
        set indexlist [lindex $results $index]
        set curval [lindex $indexlist 0]
        incr curval
        set indexlist [list $curval $current]
    } else {
        incr index
        set indexlist [list 1 $current]
    }

    lset results $index $indexlist
    set last $current        
}

set answer {}
foreach r $results {
   append answer [join $r ""]
}
puts "\nOutput: $answer"

puts -nonewline "as we read \""
set nresults [list]
for {set index 0} {$index < [llength $results]} {incr index} {
   set r [lindex $results $index] 
   set key [lindex $r 0]
   set value [lindex $r 1]
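   # Runs longer than nine digits have no word in the mapping; fall back to the number.
   set word [expr {[info exists mapping($key)] ? $mapping($key) : $key}]
   lappend nresults "$word $value"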
}
puts -nonewline [join $nresults " "]
puts "\""
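
The grouping of repeated digits can also be delegated to a regular expression with a back reference. The following is a minimal sketch, not the author's method: the proc name countNumber is hypothetical, and it assumes the input consists only of digits.

```tcl
#!/usr/bin/env tclsh

# Hypothetical helper (not from the original post): encodes a digit string
# as "run length, digit" pairs, the way the number would be read aloud.
proc countNumber {number} {
    set answer ""
    # (\d)\1* matches one digit followed by repeats of that same digit.
    # With -all -inline the result alternates: full run, captured digit.
    foreach {run digit} [regexp -all -inline {(\d)\1*} $number] {
        append answer [string length $run] $digit
    }
    return $answer
}

puts [countNumber 1122234]
```

For the input 1122234 from the task description this prints 21321314, matching the expected output.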

2020-12-12

DNA Sequence

Here is an attempt at solving the DNA Sequence problem in Tcl:

#!/usr/bin/env tclsh
#
# DNA is a long, chainlike molecule which has two strands twisted 
# into a double helix. The two strands are made up of simpler molecules 
# called nucleotides. Each nucleotide is composed of one of the 
# four nitrogen-containing nucleobases cytosine (C), guanine (G), 
# adenine (A) and thymine (T).
# Write a script to print nucleobase count in the given DNA sequence. 
# Also print the complementary sequence where Thymine (T) on one strand 
# is always facing an adenine (A) and vice versa; guanine (G) is always 
# facing a cytosine (C) and vice versa.
#

set dna "GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG"
set ccount [regexp -all "C" $dna]
set gcount [regexp -all "G" $dna]
set acount [regexp -all "A" $dna]
set tcount [regexp -all "T" $dna]
puts "C: $ccount"
puts "G: $gcount"
puts "A: $acount"
puts "T: $tcount"
set complement ""
set length [string length $dna]
for {set i 0} {$i < $length} {incr i} {
    set substring [string index $dna $i]
    if {[string compare $substring "C"]==0} {
        append complement "G"
    } elseif {[string compare $substring "G"]==0} {
        append complement "C"
    } elseif {[string compare $substring "A"]==0} {
        append complement "T"
    } elseif {[string compare $substring "T"]==0} {
        append complement "A"
    }
}
puts ""
puts "Complement:"
puts $complement

It can also be written so that the counting and the complement are handled in a single pass:

#!/usr/bin/env tclsh
#
# DNA is a long, chainlike molecule which has two strands twisted 
# into a double helix. The two strands are made up of simpler molecules 
# called nucleotides. Each nucleotide is composed of one of the 
# four nitrogen-containing nucleobases cytosine (C), guanine (G), 
# adenine (A) and thymine (T).
# Write a script to print nucleobase count in the given DNA sequence. 
# Also print the complementary sequence where Thymine (T) on one strand 
# is always facing an adenine (A) and vice versa; guanine (G) is always 
# facing a cytosine (C) and vice versa.
#

set dna "GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG"
set ccount 0
set gcount 0
set acount 0
set tcount 0
set complement ""
set length [string length $dna]
for {set i 0} {$i < $length} {incr i} {
    set substring [string index $dna $i]
    if {[string compare $substring "C"]==0} {
        incr ccount
        append complement "G"
    } elseif {[string compare $substring "G"]==0} {
        incr gcount
        append complement "C"
    } elseif {[string compare $substring "A"]==0} {
        incr acount
        append complement "T"
    } elseif {[string compare $substring "T"]==0} {
        incr tcount
        append complement "A"
    }
}
puts "C: $ccount"
puts "G: $gcount"
puts "A: $acount"
puts "T: $tcount"
puts ""
puts "Complement:"
puts $complement

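Implemented with an array, it looks like the following. Note that array get returns the keys in no particular order, so the counts may print in a different order from the earlier versions: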

#!/usr/bin/env tclsh
#
# DNA is a long, chainlike molecule which has two strands twisted 
# into a double helix. The two strands are made up of simpler molecules 
# called nucleotides. Each nucleotide is composed of one of the 
# four nitrogen-containing nucleobases cytosine (C), guanine (G), 
# adenine (A) and thymine (T).
# Write a script to print nucleobase count in the given DNA sequence. 
# Also print the complementary sequence where Thymine (T) on one strand 
# is always facing an adenine (A) and vice versa; guanine (G) is always 
# facing a cytosine (C) and vice versa.
#

set dna "GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG"
set complement ""
array set count {}
set length [string length $dna]
for {set i 0} {$i < $length} {incr i} {
    set substring [string index $dna $i]
    incr count($substring)
    switch $substring {
        "C" {
            append complement "G"
        }
        "G" {
            append complement "C"
        }
        "A" {
            append complement "T"
        } 
        "T" {
            append complement "A"
        }
    }
}
foreach {key value} [array get count] {
    puts "$key: $value"
}
puts ""
puts "Complement:"
puts $complement
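
The complement step is a fixed character-for-character substitution, so it can also be expressed with string map, and the counts can be kept in a dict. This is only a sketch: the proc names countBases and complement are hypothetical, and the short sequence GATTACA is an illustrative input rather than the one used in the post.

```tcl
#!/usr/bin/env tclsh

# Hypothetical helpers (not from the original post).

# Count each nucleobase and return the counts as a dict keyed by base.
proc countBases {dna} {
    set counts [dict create]
    foreach base [split $dna {}] {
        dict incr counts $base
    }
    return $counts
}

# string map scans the input once, so A->T and T->A do not undo each other.
proc complement {dna} {
    return [string map {A T T A C G G C} $dna]
}

set dna "GATTACA"   ;# short illustrative sequence, not the one from the post
set counts [countBases $dna]
foreach base {C G A T} {
    # Bases that never occur are reported as 0.
    set n [expr {[dict exists $counts $base] ? [dict get $counts $base] : 0}]
    puts "$base: $n"
}
puts ""
puts "Complement:"
puts [complement $dna]
```

Iterating over the fixed list {C G A T} keeps the output order stable, which the array get loop above does not guarantee.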