※ 引述《stimim (qqaa)》之铭言:
: 刚刚和 Leon 讨论了一下,发现我们对题目的了解有一点不同
: 原题目:
: Given a document and a query of K words, how do u find the smallest window
: that covers all the words at least once in that document? (given you know the
: inverted lists of all K words, that is, for each word, you have a list of all
: its occurrrences). This one is really hard. Could someone propose an
: algorithm in O(n)?
: 我们可以确定:
: 1. document is given
: 2. 知道我们关心的是哪 K 个字
: 3. 这 K 个字的 occurrence list 也是给定的
: 问题是,n 指的是什么?
: 1. n = document size (in number of words)
: 在这种情况下, O(n) 是可能的,方法基本上就是 Leon 提出来的。
恕删.
我觉得那个 N 的定义比较有可能是 document length..
不过下面那个定义, 我想我的方法应该也可以啦.
Here is the modification.
: 2. n = 这 K 个字出现的总次数
: Ex:
: K = 3, 要找出包含 {a, b, c} 的最小 window
: occurrences:
: a = { 1, 1000, 2500}
: b = {400, 1001, 2000}
: c = {500, 1002, 1500}
: document size = 2500 甚至更高 (其他的位置是其他的字)
: n = 3 + 3 + 3 = 9 而不是 2500
: 在这种情况下,已知可以做到 O(n logK) (RockLee 写的方法)
: 不过,在这种状况下,给定原本的 document 似乎就没什么用了
: 而且在读入 input 时,
: 读入原本的 document 会花掉 O(document size) 的时间
: 所以在和 Leon 讨论时,他对这种解释方法有一点怀疑
So, when you build the occurance list from document,
actually you will have scaned this array first..
{1, 400, 500, 1000, .... 2500} which represents {a, b, c, a..}
We need to keep this too. Call it "Show_list"
Then, use my algorithm, the left boundary are choosen sequentally
from "Show_list".
The first step will be, choosing [1] as left,
and use max-min{b, c} as right.
The second step is, drop [1] and use "Show_list(2)" as a new
left boudary.
Now, you drop at most one element in the window.
The need O(1) to find the right boundary and get back this element.
Hope it helps.