# An expectation question

11 Jan 2019

During the winter break, my friend XYZ asked me to help him improving his driving techniques in order to get his driver’s license as soon as possible. One day, after we finished our lunch at an Indian buffet, he gave me a question which was claimed by him to be an interview question from a company F.

Q: There are $n$ distinct integers in a pool. Every time you draw one integer (with replacement) from the pool uniformly at random. If this integer has not been drawn before, then you add it to the list of previously drawn integers. Now, what is the expected number of draws needed to obtain a list of $k$ integers?

A: $O(n\log \frac{n}{n - k})$.

Define $T_k$ as the number of draws when we obtain a length $k$ list of drawn intgers and $D_i$ as the number of draws we need given a length $i-1$ list of drawn integers. Due to the linearity of expectation, $E[T_k] = \sum_i^k E[D_i]$. And we can see that $E[D_i] = (1 - \frac{i-1}{n}) \cdot 1 + \frac{i-1}{n} (1 - \frac{i-1}{n}) \cdot 2 + \cdots = \frac{n}{n - i+1}$. Therefore, $E[T_k] = n \sum_i^k \frac{1}{n-i+1} \approx n (\log n - \log (n - k))$.