字典树,或者叫trie树,或者叫单词查找树。前缀匹配上比较方便,词频统计也常常用到。当然实际有更好的替代品double array,不过trie树比较易于实现。
用python实现了一把trie树的建树和查询:
root = {}
def build():
for i in s:
tmp = root
for c in i:
tmp = tmp.setdefault(c, {})
def query(prefix):
tmp = root
for c in prefix:
if c in tmp:
tmp = tmp.get(c)
else:
return []
res = []
chars = [prefix]
stack = [tmp.iteritems()]
while len(stack) > 0:
it = stack[-1]
try:
c, v = it.next()
if len(v) == 0:
res.append(''.join(chars) + c)
continue
stack.append(v.iteritems())
chars.append(c)
except StopIteration:
stack.pop()
chars.pop()
return res
s = [ 'abcd', 'abd', 'bcd', 'efg', 'hij' ]
build()
res = query('ab')
print(res)