Re: [问题] Google Interview Question (4)

楼主: Leon (Achilles)   2013-03-09 15:15:33
※ 引述《stimim (qqaa)》之铭言:
: 刚刚和 Leon 讨论了一下,发现我们对题目的了解有一点不同
: 原题目:
: Given a document and a query of K words, how do u find the smallest window
: that covers all the words at least once in that document? (given you know the
: inverted lists of all K words, that is, for each word, you have a list of all
: its occurrrences). This one is really hard. Could someone propose an
: algorithm in O(n)?
: 我们可以确定:
: 1. document is given
: 2. 知道我们关心的是哪 K 个字
: 3. 这 K 个字的 occurrence list 也是给定的
: 问题是,n 指的是什么?
: 1. n = document size (in number of words)
: 在这种情况下, O(n) 是可能的,方法基本上就是 Leon 提出来的。
恕删.
我觉得那个 N 的定义比较有可能是 document length..
不过下面那个定义, 我想我的方法应该也可以啦.
Here is the modification.
: 2. n = 这 K 个字出现的总次数
: Ex:
: K = 3, 要找出包含 {a, b, c} 的最小 window
: occurrences:
: a = { 1, 1000, 2500}
: b = {400, 1001, 2000}
: c = {500, 1002, 1500}
: document size = 2500 甚至更高 (其他的位置是其他的字)
: n = 3 + 3 + 3 = 9 而不是 2500
: 在这种情况下,已知可以做到 O(n logK) (RockLee 写的方法)
: 不过,在这种状况下,给定原本的 document 似乎就没什么用了
: 而且在读入 input 时,
: 读入原本的 document 会花掉 O(document size) 的时间
: 所以在和 Leon 讨论时,他对这种解释方法有一点怀疑
So, when you build the occurance list from document,
actually you will have scaned this array first..
{1, 400, 500, 1000, .... 2500} which represents {a, b, c, a..}
We need to keep this too. Call it "Show_list"
Then, use my algorithm, the left boundary are choosen sequentally
from "Show_list".
The first step will be, choosing [1] as left,
and use max-min{b, c} as right.
The second step is, drop [1] and use "Show_list(2)" as a new
left boudary.
Now, you drop at most one element in the window.
The need O(1) to find the right boundary and get back this element.
Hope it helps.

Links booklink

Contact Us: admin [ a t ] ucptt.com