Problem · Trie

Design Search Autocomplete System

HardRobloxPHONE SCREEN
See Roblox hiring insights

You are given three arrays:

  • queries[i] is a search query issued by a user.
  • timestamps[i] is the time when queries[i] was issued.
  • prefixes[j] is a prefix typed into an autocomplete box.

For each prefix, return every distinct query that starts with that prefix, ordered by:

  1. Higher occurrence count first.
  2. If counts tie, smaller earliest timestamp first (the earliest time that query was ever issued).
  3. If both still tie, lexicographically smaller query first.

Return one ranked list per prefix, in the same order as prefixes. If a prefix has no matching query, return an empty list for that prefix. The empty prefix matches every distinct query.

This is the batch version of the classic search-autocomplete design question. A Trie keyed on query strings (each terminal node carrying the query's count and earliest timestamp) lets each prefix lookup collect its subtree's queries, which are then ranked by the comparator above. In a live interview, clarify whether the output should include all matching queries or only the top k suggestions.

Examples
01 · Example 1
queries = ["roblox studio", "roblox", "roadblocks", "roblox studio", "roblox game", "roblox"]
timestamps = [5, 1, 7, 3, 4, 2]
prefixes = ["rob", "robl", "road", "x"]
return = [["roblox", "roblox studio", "roblox game"], ["roblox", "roblox studio", "roblox game"], ["roadblocks"], []]
"roblox" and "roblox studio" each appear twice, so they rank ahead of "roblox game" (once). Between the two count-2 queries, "roblox" has earliest timestamp 1 versus "roblox studio" earliest timestamp 3, so "roblox" comes first. Prefixes "rob" and "robl" match the same three queries; "road" matches only "roadblocks"; "x" matches nothing.
02 · Example 2
queries = ["alpha", "alpine", "alpha", "alpine"]
timestamps = [10, 3, 2, 8]
prefixes = ["al"]
return = [["alpha", "alpine"]]
Both queries appear twice, so the earliest-timestamp tie-break applies: "alpha" has earliest timestamp 2 while "alpine" has earliest timestamp 3, so "alpha" ranks first.
03 · Example 3
queries = ["aba", "abb", "aba", "abb"]
timestamps = [1, 1, 5, 7]
prefixes = ["ab"]
return = [["aba", "abb"]]
Both queries appear twice and share earliest timestamp 1, so the lexicographic tie-break orders "aba" before "abb".
Constraints
  • queries.length == timestamps.length
  • 0 <= queries.length <= 10^5
  • 1 <= prefixes.length <= 10^4
  • 1 <= queries[i].length <= 100
  • 0 <= prefixes[i].length <= 100
  • 0 <= timestamps[i] <= 10^9
  • queries[i] contains printable ASCII characters.
  • prefixes[i] is empty or contains printable ASCII characters.
More Roblox problems
drafts saved locally
public List<List<String>> autocomplete(String[] queries, int[] timestamps, String[] prefixes) {
  // write your code here
}
queries["roblox studio", "roblox", "roadblocks", "roblox studio", "roblox game", "roblox"]
timestamps[5, 1, 7, 3, 4, 2]
prefixes["rob", "robl", "road", "x"]
expected[["roblox", "roblox studio", "roblox game"], ["roblox", "roblox studio", "roblox game"], ["roadblocks"], []]
sign in to submit