https://codertw.com/%E7%A8%8B%E5%BC%8F%E8%AA%9E%E8%A8%80/363018/
Copied from the link above and kept here as a note.
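import math
from collections import Counter

def tf(word, count):
    # Term frequency: occurrences of word divided by the total count in this document
    return count[word] / sum(count.values())

def n_containing(word, count_list):
    # Number of documents (Counters) that contain word
    return sum(1 for count in count_list if word in count)

def idf(word, count_list):
    # Inverse document frequency; the 1 + ... keeps the denominator from being zero
    return math.log(len(count_list) / (1 + n_containing(word, count_list)))

def tfidf(word, count, count_list):
    return tf(word, count) * idf(word, count_list)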
# count: a Counter built from a list full of elements, i.e. Counter(some_list)
# count_list: a list full of Counter() objects, one per document
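A minimal sketch of how count and count_list might be built and passed to the functions above. The toy corpus docs and the helper rank are made up for illustration and are not part of the original post; the four functions are repeated so the snippet runs on its own.

```python
import math
from collections import Counter

def tf(word, count):
    return count[word] / sum(count.values())

def n_containing(word, count_list):
    return sum(1 for count in count_list if word in count)

def idf(word, count_list):
    return math.log(len(count_list) / (1 + n_containing(word, count_list)))

def tfidf(word, count, count_list):
    return tf(word, count) * idf(word, count_list)

# Hypothetical toy corpus: each document is already split into tokens.
docs = [
    ["apple", "banana", "apple"],
    ["banana", "cherry"],
    ["cherry", "durian", "durian", "durian"],
    ["eggplant"],
]

# One Counter per document; together they form count_list.
count_list = [Counter(tokens) for tokens in docs]

def rank(count, count_list):
    # Hypothetical helper: score every word of one document, highest first.
    scores = {word: tfidf(word, count, count_list) for word in count}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranked = [rank(count, count_list) for count in count_list]
for i, pairs in enumerate(ranked):
    print(i, pairs[:2])
```

Words that are frequent in one document and rare in the others end up at the top of that document's ranking.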
See also:
https://docs.python.org/zh-tw/3/library/collections.html#counter-objects
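type(count)
-----------> collections.Counter
It is a subclass of dict: a collection that stores elements as keys and their counts as values (closer to a multiset than to a plain set).
Available methods: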
clear
copy
elements
fromkeys (inherited from dict, but not implemented for Counter: calling it raises NotImplementedError)
get
items
keys
most_common
pop
popitem
setdefault
subtract
update
values
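A short sketch of a few of the methods listed above, using a made-up list of letters. It only illustrates standard Counter behaviour and is not taken from the original post.

```python
from collections import Counter

# Hypothetical sample data for illustration.
count = Counter(["a", "b", "a", "c", "a", "b"])

top_two = count.most_common(2)        # the two most frequent elements with their counts
missing = count["z"]                  # a missing key gives 0 instead of raising KeyError
has_z = "z" in count                  # the lookup above does not add the key

count.update(["c", "c"])              # add counts from another iterable
count.subtract({"a": 1})              # subtract counts (results may go to zero or below)
expanded = sorted(count.elements())   # each element repeated as many times as its count

# fromkeys exists on the class but is deliberately not implemented for Counter.
try:
    Counter.fromkeys(["a", "b"])
    fromkeys_supported = True
except NotImplementedError:
    fromkeys_supported = False

print(top_two, missing, has_z, dict(count), expanded, fromkeys_supported)
```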
https://nlp.stanford.edu/IR-book/html/htmledition/document-and-query-weighting-schemes-1.html
Reference for tf-idf (the link above).
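One thing worth noting when comparing against references: the idf in the code above divides by 1 + (number of documents containing the word), while the common textbook form is log(N / df). The sketch below compares the two; the function names idf_plain and idf_plus_one are made up for this note.

```python
import math

def idf_plain(n_docs, df):
    # Textbook form: log(N / df). Fails when df is 0.
    return math.log(n_docs / df)

def idf_plus_one(n_docs, df):
    # Form used in the copied code: log(N / (1 + df)).
    return math.log(n_docs / (1 + df))

n_docs = 4

# A word that appears in every document.
everywhere_plain = idf_plain(n_docs, 4)        # log(4 / 4)
everywhere_plus_one = idf_plus_one(n_docs, 4)  # log(4 / 5), slightly negative

# A word that appears in no document.
unseen_plus_one = idf_plus_one(n_docs, 0)      # log(4 / 1), still defined
try:
    idf_plain(n_docs, 0)
    plain_handles_unseen = True
except ZeroDivisionError:
    plain_handles_unseen = False

print(everywhere_plain, everywhere_plus_one, unseen_plus_one, plain_handles_unseen)
```

So the +1 variant avoids the division by zero for unseen words, at the cost of giving a small negative weight to words that occur in every document.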