How do I count occurrences of a specific word in Python?

Question:

I want to count how often a specific word occurs in a file.

For example, how many times "apple" appears in the file. I tried this:

#!/usr/bin/env python
import re

logfile = open("log_file", "r")

wordcount={}
for word in logfile.read().split():
    if word not in wordcount:
        wordcount[word] = 1
    else:
        wordcount[word] += 1
# Print each word with its count (print() call form, as required by Python 3)
for k, v in wordcount.items():
    print(k, v)

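I replaced "word" with "apple", but it still counts every word in the file.

Any suggestions would be greatly appreciated. :)

Answer:

Since you only care about occurrences of a single word, you can use str.count():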

with open("log_file") as f:
    contents = f.read()
    count = contents.count("apple")
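
As a quick check of what str.count() returns, here is a minimal sketch that runs it on a small in-memory string rather than a file. The sample text is made up for illustration. Note that str.count() counts non-overlapping occurrences of the substring, wherever they appear:

```python
# Made-up sample text standing in for the contents of log_file.
sample = "apple pie, applejack, pineapple and one more apple"

# str.count() counts non-overlapping occurrences of the substring,
# so "applejack" and "pineapple" each add one to the total.
substring_count = sample.count("apple")
print(substring_count)  # 4
```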

However, to avoid corner cases such as mistakenly counting longer words like "applejack", I suggest using a regular expression:

import re

with open("log_file") as f:
    contents = f.read()
    count = sum(1 for match in re.finditer(r"\bapple\b", contents))

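In a regular expression, \b ensures that the pattern starts and ends at a word boundary, so it matches "apple" only as a whole word and not as a substring of a longer string.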
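
The word-boundary approach can be wrapped in a small helper. This is only a sketch: the function name count_word, its ignore_case parameter, and the sample text are assumptions for illustration, not part of the original answer. It also assumes the word being searched for consists of ordinary word characters (letters, digits, underscore), since \b behaves differently next to punctuation.

```python
import re

def count_word(text, word, ignore_case=False):
    """Count whole-word occurrences of `word` in `text` (hypothetical helper)."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape() keeps characters such as "." or "+" in `word` literal.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", flags)
    return sum(1 for _ in pattern.finditer(text))

# Made-up sample text standing in for the contents of log_file.
sample = "Apple pie, applejack, pineapple and one more apple."

print(count_word(sample, "apple"))                    # 1
print(count_word(sample, "apple", ignore_case=True))  # 2
```

For a file, the same helper could be called on f.read() inside the with block shown earlier. Whether to ignore case depends on whether "Apple" and "apple" should count as the same word in your log.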
