Usage examples of commonly used string methods in Ruby

November 30, 2014. Author: junjie

1. Returning the length of a string

str.length => integer
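
As a small illustration (a sketch; the variable names are made up for this example), length counts characters rather than bytes, which matters for non-ASCII text. bytesize is the String method that reports the byte count:

```ruby
ascii    = "hello"
accented = "h\u00e9llo"   # "héllo"; the \u escape yields a UTF-8 string

ascii_length   = ascii.length        # 5 characters
accented_chars = accented.length     # still 5 characters
accented_bytes = accented.bytesize   # 6 bytes, because "é" takes two bytes in UTF-8
```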

2. Checking whether a string contains another string

str.include? other_str => true or false

"hello".include? "lo"   #=> true

"hello".include? "ol"   #=> false

"hello".include? ?h     #=> true

3. Inserting into a string

str.insert(index, other_str) => str

"abcd".insert(0, 'X')    #=> "Xabcd"

"abcd".insert(3, 'X')    #=> "abcXd"

"abcd".insert(4, 'X')    #=> "abcdX"

"abcd".insert(-3, 'X')   #=> "abXcd"

"abcd".insert(-1, 'X')   #=> "abcdX"
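
One point the examples above do not show is that insert modifies the receiver in place and returns that same object. A minimal sketch, with illustrative variable names, that works on a duplicate so the original stays intact:

```ruby
original = "abcd"
copy     = original.dup           # work on a copy so the original is left alone
result   = copy.insert(2, "X")    # inserts before index 2 and returns copy itself

same_object = result.equal?(copy) # true: no new string was created
```

If the original must not change, call dup first as shown here.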

4. Splitting a string (by default on whitespace)

str.split(pattern=$;, [limit]) => anArray

" now's the time".split        #=> ["now's", "the", "time"]

"1, 2.34,56, 7".split(%r{,\s*}) #=> ["1", "2.34", "56", "7"]

"hello".split(//)               #=> ["h", "e", "l", "l", "o"]

"hello".split(//, 3)            #=> ["h", "e", "llo"]

"hi mom".split(%r{\s*})         #=> ["h", "i", "m", "o", "m"]

"mellow yellow".split("ello")   #=> ["m", "w y", "w"]

"1,2,,3,4,,".split(',')         #=> ["1", "2", "", "3", "4"]

"1,2,,3,4,,".split(',', 4)      #=> ["1", "2", "", "3,4,,"]

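5. String substitution

str.gsub(pattern, replacement) => new_str

str.gsub(pattern) {|match| block } => new_str

"hello".gsub(/[aeiou]/, '*')              #=> "h*ll*"     # replace each vowel with *

"hello".gsub(/([aeiou])/, '<\1>')         #=> "h<e>ll<o>"   # wrap each vowel in angle brackets; \1 refers to the text captured by the first group

"hello".gsub(/./) {|s| s.ord.to_s + ' '}   #=> "104 101 108 108 111 "   # s.ord gives the character code (s[0] did so only in Ruby 1.8)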
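
The three ways of supplying a replacement to gsub can be compared side by side. This is only a sketch with made-up variable names; the hash form is a standard gsub feature that the list above does not mention:

```ruby
word = "hello"

# Replacement string with a backreference: \1 is the first capture group.
bracketed = word.gsub(/([aeiou])/, '<\1>')

# Block form: the value returned by the block replaces each match.
upcased = word.gsub(/[aeiou]/) { |m| m.upcase }

# Hash form: each matched text is looked up as a key in the hash.
mapped = word.gsub(/[eo]/, { "e" => "3", "o" => "0" })
```

Single quotes are used around '<\1>' so that the backslash reaches gsub unchanged.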

String substitution, part two:

str.replace(other_str) => str

s = "hello"         #=> "hello"

s.replace "world"   #=> "world"
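
replace differs from plain assignment in that it changes the contents of the existing object, so every variable that refers to that object sees the new text. A short sketch (variable names are illustrative):

```ruby
s         = "hello".dup
alias_ref = s                 # a second reference to the same String object
s.replace("world")            # contents change in place
replaced_alias = alias_ref    # "world", because it is the same object

t     = "hello".dup
other = t
t     = "world"               # assignment rebinds t to a new object
assigned_other = other        # still "hello"
```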

6. Deleting characters from a string

str.delete([other_str]+) => new_str

"hello".delete "l","lo"        #=> "heo"

"hello".delete "lo"            #=> "he"

"hello".delete "aeiou", "^e"   #=> "hell"

"hello".delete "ej-m"          #=> "ho"
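
The arguments to delete are character sets, not substrings: a leading ^ negates a set, a-z style ranges are allowed, and when several sets are given only characters in their intersection are removed. A hedged sketch using a made-up phone number:

```ruby
phone = "(555) 123-4567"

digits_only = phone.delete("^0-9")   # remove everything that is NOT a digit
non_digits  = phone.delete("0-9")    # remove the digits themselves

# Two sets: delete characters that are in a-z AND are not "l".
only_l = "hello world".delete("a-z", "^l")
```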

7. Removing leading and trailing whitespace

str.lstrip => new_str

" hello ".lstrip   #=> "hello "

"hello".lstrip       #=> "hello"
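
lstrip only removes whitespace on the left. Its companions rstrip and strip handle the right side and both sides respectively, as this small sketch shows:

```ruby
padded = "  hello  "

left_stripped  = padded.lstrip   # leading whitespace removed
right_stripped = padded.rstrip   # trailing whitespace removed
both_stripped  = padded.strip    # both ends removed
```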

8. Matching a string against a pattern

str.match(pattern) => matchdata or nil
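
Since no example is given for match, here is a minimal sketch. The date string and variable names are assumptions chosen for illustration:

```ruby
m = "2014-11-30".match(/(\d+)-(\d+)-(\d+)/)

whole = m[0]   # the entire matched text
year  = m[1]   # first capture group
day   = m[3]   # third capture group

# Named groups can be read by name from the MatchData object.
named      = "2014-11-30".match(/(?<year>\d{4})-(?<month>\d{2})/)
month_name = named[:month]

# When nothing matches, match returns nil, so check before indexing.
no_match = "hello".match(/\d+/)
```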

9. Reversing a string

str.reverse => new_str

"stressed".reverse   #=> "desserts"

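10. Collapsing repeated characters

str.squeeze([other_str]*) => new_str

"yellow moon".squeeze                  #=> "yelow mon" # by default, every run of a repeated character is collapsed to one

" now   is the".squeeze(" ")         #=> " now is the" # collapse only runs of spaces

"putters shoot balls".squeeze("m-z")   #=> "puters shot balls" # collapse only runs of characters in the given range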

11. Converting to a number

str.to_i(base=10) => integer

"12345".to_i             #=> 12345
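
to_i is lenient: it reads as many leading digits as it can and never raises. The sketch below contrasts it with Kernel#Integer, which rejects malformed input (variable names are illustrative):

```ruby
plain    = "12345".to_i     # all digits
trailing = "123abc".to_i    # stops at the first non-digit
garbage  = "abc".to_i       # no leading digits at all, so the result is 0
binary   = "1010".to_i(2)   # optional base argument

# Integer() raises ArgumentError instead of silently returning a partial result.
strict = begin
  Integer("123abc")
rescue ArgumentError
  nil
end
```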

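The difference between chomp and chop:

chomp: removes a trailing line ending (\n, \r, or \r\n) from the string
chop: removes the last character of the string, whether it is \n, \r, or an ordinary character (a trailing \r\n is removed as a unit)

"hello".chomp            #=> "hello"

"hello\n".chomp          #=> "hello"

"hello\r\n".chomp        #=> "hello"

"hello\n\r".chomp        #=> "hello\n"

"hello\r".chomp          #=> "hello"

"hello".chomp("llo")     #=> "he"

"string\r\n".chop   #=> "string"

"string\n\r".chop   #=> "string\n"

"string\n".chop     #=> "string"

"string".chop       #=> "strin"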
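
In practice the difference shows up when cleaning lines of input, some of which may lack a line ending. A small sketch with made-up data:

```ruby
raw_lines = ["alpha\n", "beta\r\n", "gamma"]   # the last line has no newline

chomped = raw_lines.map(&:chomp)   # only line endings are removed
chopped = raw_lines.map(&:chop)    # the last line loses a real character
```

This is why chomp is usually the safer choice for stripping newlines.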

split is an instance method of the String class. Below is the output of ri String.split, with a brief paraphrase after each paragraph.
----------------------------------------------------------- String#split
str.split(pattern=$;, [limit]) => anArray
------------------------------------------------------------------------
Divides _str_ into substrings based on a delimiter, returning an
array of these substrings.
In other words, the string is cut into substrings at each delimiter, and an array containing those substrings is returned.

If _pattern_ is a +String+, then its contents are used as the
delimiter when splitting _str_. If _pattern_ is a single space,
_str_ is split on whitespace, with leading whitespace and runs of
contiguous whitespace characters ignored.
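That is, when pattern is a string it is used as the delimiter; when pattern is a single space, the string is split on whitespace, and leading whitespace and runs of adjacent whitespace are ignored.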

If _pattern_ is a +Regexp+, _str_ is divided where the pattern
matches. Whenever the pattern matches a zero-length string, _str_
is split into individual characters.
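That is, when pattern is a regular expression the string is split wherever it matches; when the pattern matches an empty string, the string is split into individual characters.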

If _pattern_ is omitted, the value of +$;+ is used. If +$;+ is
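+nil+ (which is the default), _str_ is split on whitespace as if
` ' were specified.
That is, when pattern is omitted the value of $; is used; when $; has not been set (the default), split behaves as if a single space ' ' had been given.
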
If the _limit_ parameter is omitted, trailing null fields are
suppressed. If _limit_ is a positive number, at most that number of
fields will be returned (if _limit_ is +1+, the entire string is
returned as the only entry in an array). If negative, there is no
limit to the number of fields returned, and trailing null fields
are not suppressed.
That is, when limit is omitted, trailing empty fields are dropped; when limit is positive, at most limit fields are returned (a limit of 1 returns the whole string as a single field); when limit is negative, the number of fields is unlimited and trailing empty fields are kept.

" now's  the time".split #=> ["now's", "the", "time"]
" now's  the time".split(' ') #=> ["now's", "the", "time"]
" now's  the time".split(/ /) #=> ["", "now's", "", "the", "time"]
"1, 2.34,56, 7".split(%r{,\s*}) #=> ["1", "2.34", "56", "7"]
"hello".split(//) #=> ["h", "e", "l", "l", "o"]
"hello".split(//, 3) #=> ["h", "e", "llo"]
"hi mom".split(%r{\s*}) #=> ["h", "i", "m", "o", "m"]

"mellow yellow".split("ello") #=> ["m", "w y", "w"]
"1,2,,3,4,,".split(',') #=> ["1", "2", "", "3", "4"]
"1,2,,3,4,,".split(',', 4) #=> ["1", "2", "", "3,4,,"]
"1,2,,3,4,,".split(',', -4) #=> ["1", "2", "", "3", "4", "", ""]
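
Two common uses of the limit argument, sketched with invented input: a positive limit keeps the remainder of the string intact when the delimiter may also occur in the value, and a negative limit preserves trailing empty fields.

```ruby
setting = "path=/usr/bin=x"

# Limit 2: split only at the first "=", leave the rest untouched.
key, value = setting.split("=", 2)

row = "a,b,,"
without_limit  = row.split(",")       # trailing empty fields are dropped
negative_limit = row.split(",", -1)   # trailing empty fields are kept
```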

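If the pattern contains special characters, remember to escape them:
"wo | shi | yi | ge | bing".split(/\s*\|\s*/) #=> ["wo", "shi", "yi", "ge", "bing"]   # don't forget to escape the vertical bar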

Note also how split differs from String#scan: in split the pattern describes the delimiter, whereas in scan the pattern describes the text to be matched.

"123=342=4234=523421=6424".scan(/\d+/) #=> ["123", "342", "4234", "523421", "6424"]
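
To make the contrast concrete, here is a sketch on invented data. On clean input both approaches give the same array, but scan is often easier when the text around the values is irregular, and with capture groups it returns one sub-array per match:

```ruby
record   = "123=342=4234"
by_split = record.split("=")     # describe the delimiter
by_scan  = record.scan(/\d+/)    # describe the wanted text

messy   = "id: 12, qty: 7 pcs"
numbers = messy.scan(/\d+/)              # just the digit runs
pairs   = messy.scan(/(\w+): (\d+)/)     # one [name, number] pair per match
```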

If the pattern is wrapped in a capture group, the delimiters are kept in the result, for example:

"Three little words".split(/\s+/) #=> ["Three", "little", "words"]
"Three little words".split(/(\s+)/) #=> ["Three", " ", "little", " ", "words"]   # the spaces are kept
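
Keeping the delimiters is handy for simple tokenizing, because the original string can be rebuilt from the pieces. A minimal sketch on a made-up arithmetic expression:

```ruby
expression = "1+2-3"

# The capture group makes split return the operators as well as the numbers.
tokens  = expression.split(/([+-])/)
rebuilt = tokens.join   # joining the tokens restores the input
```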