Tuesday, July 7, 2015

Python String Handling

This post collects examples of string handling in Python.

  • Joining strings: join() combines the items of a list or tuple of strings into a single string
  >>> country = ["China", "Taiwan", "Japan"]
  >>> " ".join(country)
  'China Taiwan Japan'
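One detail worth knowing is that join() only accepts strings, so a sequence with numbers in it has to be converted first. A minimal sketch, where the mixed-type tuple `values` is a made-up example:

```python
# join() raises TypeError on non-string items, so convert each item with str().
values = ("Taiwan", 2016, 3.5)  # hypothetical mixed-type tuple
joined = ", ".join(str(v) for v in values)
print(joined)  # Taiwan, 2016, 3.5
```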
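  • Searching within a string, illustrated here with rindex(), the right-hand counterpart of index():
    def is_image_extension(file_name):
        image_extension = [".jpg", ".bmp", ".ico", ".png"]
        try:
            # rindex() finds the last ".", so "my.photo.jpg" yields ".jpg";
            # index() would stop at the first "." and yield ".photo.jpg"
            i = file_name.rindex(".")
        except ValueError:
            # no "." in the name, so there is no extension to check
            return None
        # True if the text from the last "." onward is a known image extension
        return file_name[i:] in image_extension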
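For comparison, the standard library's os.path.splitext() does the dot-searching itself. The sketch below is an alternative to the function above, not a replacement for it; the name `has_image_extension` and the case-insensitive match are my own assumptions.

```python
import os.path

IMAGE_EXTENSIONS = {".jpg", ".bmp", ".ico", ".png"}

def has_image_extension(file_name):
    """Return True if file_name ends with a known image extension."""
    # splitext() splits at the last dot, so "a.b.png" yields ".png".
    _, extension = os.path.splitext(file_name)
    # Lower-casing is an assumption: it makes ".PNG" match ".png".
    return extension.lower() in IMAGE_EXTENSIONS

print(has_image_extension("photo.backup.PNG"))  # True
print(has_image_extension("README"))            # False
```

Unlike the version above, this one returns False rather than None when the name has no extension.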
  • Stripping unwanted leading and trailing whitespace from a string
    >>> s = "\t test code "
    >>> s.lstrip(), s.rstrip(), s.strip()
    ('test code ', '\t test code', 'test code')
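The strip methods also take an optional argument listing the characters to remove, which is handy when the padding is not whitespace. A small sketch with a made-up input string:

```python
raw = "***[ test code ]***"  # hypothetical input with decorative padding

# The argument is a set of characters, not a prefix or suffix to match.
stars_removed = raw.strip("*")
all_removed = raw.strip("*[] ")

print(stars_removed)  # [ test code ]
print(all_removed)    # test code
```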
    
  • Splitting a string
>>> person = "Tom Lin*1985-7-15*2036-6-3"
>>> fields = person.split("*")
>>> fields
['Tom Lin', '1985-7-15', '2036-6-3']
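Since split() returns a list, the fields can be unpacked directly and split again. The sketch below reuses the record above; the variable names `first_date` and `second_date` are placeholders, because the post does not say what the two dates mean.

```python
person = "Tom Lin*1985-7-15*2036-6-3"

# Unpack the three fields, then split the first date into integers.
name, first_date, second_date = person.split("*")
year, month, day = (int(part) for part in first_date.split("-"))
print(name, year, month, day)  # Tom Lin 1985 7 15

# The second argument (maxsplit) limits the number of splits;
# the remainder of the string is left intact.
head_and_rest = person.split("*", 1)
print(head_and_rest)  # ['Tom Lin', '1985-7-15*2036-6-3']
```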
  • String formatting
>>> "The election of '{0}' will be held in '{1}'".format("Taiwan", "2016")
"The election of 'Taiwan' will be held in '2016'"

>>> s = "This is a sentence"
>>> "{0:20}".format(s)  # minimum width of 20
'This is a sentence  '

>>> "{0:>20}".format(s)  # right-aligned, minimum width of 20
'  This is a sentence'

>>> "{0:*>20}".format(s)  # padded with *, right-aligned, minimum width of 20
'**This is a sentence'
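The same format specification also supports centering, left alignment with a fill character, and truncation. A short sketch using the same 18-character sentence; the width of 24 is an arbitrary choice for illustration:

```python
s = "This is a sentence"

centered = "[{0:^24}]".format(s)      # ^ centers within the width
left_filled = "[{0:-<24}]".format(s)  # < aligns left, "-" is the fill character
truncated = "[{0:.7}]".format(s)      # .7 keeps at most 7 characters

print(centered)     # [   This is a sentence   ]
print(left_filled)  # [This is a sentence------]
print(truncated)    # [This is]
```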

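  • Formatting integers and floating-point numbers, using locale.setlocale() to select the locale. Note: on Windows, you can first look up the supported locale names in locale.locale_alias
>>> import locale
>>> x, y = (1234567890, 1234.56)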
>>> locale.setlocale(locale.LC_ALL, "english_united-states.437")
'English_United States.437'
>>> en = "{0:n} {1:n}".format(x,y)
>>> print(en)
1,234,567,890 1,234.56

>>> import math
>>> data = (10 ** 4) * math.e
>>> "[{0:12.2e}] [{0:12.2f}]".format(data)
'[    2.72e+04] [    27182.82]'
>>> "[{0:*>12.2e}] [{0:*>12.2f}]".format(data)
'[****2.72e+04] [****27182.82]'
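When the locale does not matter, the "," option in the format specification produces thousands separators without calling setlocale(), and the "%" type formats a ratio as a percentage. A minimal sketch; the values 0.256 and 3.14159 are arbitrary sample inputs:

```python
x, y = 1234567890, 1234.56

# "," inserts thousands separators regardless of the current locale.
grouped = "{0:,} {1:,.2f}".format(x, y)
# "%" multiplies by 100 and appends a percent sign; "+" forces the sign.
percent = "{0:+.1%}".format(0.256)
# A leading 0 in the width pads with zeros instead of spaces.
zero_padded = "{0:08.3f}".format(3.14159)

print(grouped)      # 1,234,567,890 1,234.56
print(percent)      # +25.6%
print(zero_padded)  # 0003.142
```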

  • Formatting decimal.Decimal, using a comma as the thousands separator
>>> import decimal
>>> "{:,.4f}".format(decimal.Decimal("123456.1234"))
'123,456.1234'
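Decimal values accept the same width, alignment, and precision options as floats, so the comma separator can be combined with them. A short sketch reusing the value above; the width of 15 is an arbitrary choice:

```python
import decimal

amount = decimal.Decimal("123456.1234")

# Fewer decimal places than the value holds: the result is rounded.
two_places = "{:,.2f}".format(amount)
# Right-align within a minimum width of 15 characters.
right_aligned = "{:>15,.2f}".format(amount)

print(two_places)     # 123,456.12
print(right_aligned)  #      123,456.12
```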