Skip to the content.

leetcode [面试题17.13] 恢复空格


Contact me:

Blog -> https://cugtyt.github.io/blog/index
Email -> cugtyt@qq.com
GitHub -> Cugtyt@GitHub


哦,不!你不小心把一个长篇文章中的空格、标点都删掉了,并且大写也弄成了小写。像句子”I reset the computer. It still didn’t boot!”已经变成了”iresetthecomputeritstilldidntboot”。在处理标点符号和大小写之前,你得先把它断成词语。当然了,你有一本厚厚的词典dictionary,不过,有些词没在词典里。假设文章用sentence表示,设计一个算法,把文章断开,要求未识别的字符最少,返回未识别的字符数。

注意:本题相对原题稍作改动,只需返回未识别的字符数

示例:

输入:
dictionary = ["looked","just","like","her","brother"]
sentence = "jesslookedjustliketimherbrother"
输出: 7
解释: 断句后为"jess looked just like tim her brother",共7个未识别字符。

提示:

0 <= len(sentence) <= 1000
dictionary中总字符数不超过 150000。
你可以认为dictionary和sentence中只包含小写字母。

动态规划,遍历字符串,如果字符串出现在字典中,那么结束处的dp值和前一个dp值相等。

from collections import defaultdict
class Solution:
    def respace(self, dictionary: List[str], sentence: str) -> int:
        if len(dictionary) == 0 or len(sentence) == 0: return 0
        dic = defaultdict(list)
        for d in dictionary:
            dic[d[0]].append(d)

        dp = list(range(len(sentence) + 1))
        for i, s in enumerate(sentence):
            flag = True
            if s in dic:
                for d in dic[s]:
                    if i + len(d) <= len(sentence) and sentence[i: i + len(d)] == d:
                        dp[i + len(d)] = min(dp[i], dp[i + len(d)])
                        flag = False
            dp[i + 1] = min(dp[i + 1], dp[i] + 1)
        return dp[-1]