How should I practice Longest Dictionary Tokenization?

Read the full statement, solve the visible examples by hand, and then compare the pattern against related Google phone screen problems in FastPrep.

Is Longest Dictionary Tokenization associated with Google?

FastPrep tags this as a Google phone screen coding practice problem when its problem catalog includes that company and stage context.

Problem · String

Longest Dictionary Tokenization


● Medium

Google · Full-time · Phone screen


Problem statement

You are given a text string and a dictionary of token-to-id mappings. Starting from the beginning of text, repeatedly choose the longest dictionary token that matches the current position and output its id. If no dictionary token matches, output the current character itself and advance by one character.

Return the sequence of emitted ids and literal characters.
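The greedy rule above can be sketched in Python as follows. This is one possible approach, not a reference solution: the name `encode_with_dictionary` is a snake_case stand-in for the stated signature, and skipping empty tokens is an assumption, since the statement does not say whether they can appear. At each position the sketch probes candidate lengths from longest to shortest and emits the literal character if none match.

```python
def encode_with_dictionary(text, dictionary):
    # Build token -> id, keeping the first id seen for a repeated token
    # (the constraints say the first mapping provided wins).
    token_to_id = {}
    for token, token_id in dictionary:
        if token:  # assumption: empty tokens are ignored, they could never advance
            token_to_id.setdefault(token, token_id)

    # Longest token length bounds how far we need to look ahead.
    max_len = max(map(len, token_to_id), default=0)

    out = []
    i = 0
    while i < len(text):
        # Try the longest possible candidate first, then shrink.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in token_to_id:
                out.append(token_to_id[piece])
                i += length
                break
        else:
            # No dictionary token starts here: emit the character itself.
            out.append(text[i])
            i += 1
    return out
```

With n the length of `text` and m the longest token length, this version does O(n · m) substring probes, each of which costs up to O(m) to slice and hash.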

Function

encodeWithDictionary(text: String, dictionary: String[][]) → String[]

Examples

Example 1

text = "applepie"
dictionary = [["app","B"],["apple","A"],["pie","P"]]
return = ["A","P"]

apple is preferred over app because it is the longest match at index 0.

Example 2

text = "xabcd"
dictionary = [["ab","1"],["abc","2"],["bc","3"]]
return = ["x","2","d"]

The first character has no match, then abc is the longest token starting at index 1.

Constraints

If multiple dictionary entries have the same token, use the first mapping provided. The tokenization is greedy and scans left to right.
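One way to honor both constraints without re-slicing the text is a trie built from the dictionary. The sketch below is illustrative only: `encode_with_trie` and the `_ID` sentinel are hypothetical names, the nested-dict trie is just one convenient representation, and ignoring empty tokens is an assumption. Each walk from position i follows matching characters and remembers the deepest node that ends a token, which is the longest match.

```python
_ID = object()  # sentinel key marking "a token ends at this node"


def encode_with_trie(text, dictionary):
    # Build a trie of nested dicts keyed by character.
    root = {}
    for token, token_id in dictionary:
        if not token:  # assumption: empty tokens are ignored
            continue
        node = root
        for ch in token:
            node = node.setdefault(ch, {})
        # setdefault keeps the first id when the same token appears again.
        node.setdefault(_ID, token_id)

    out = []
    i = 0
    while i < len(text):
        node = root
        best_id, best_end = None, i
        j = i
        # Walk as far as the trie allows, tracking the longest complete token.
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            if _ID in node:
                best_id, best_end = node[_ID], j
        if best_end > i:
            out.append(best_id)
            i = best_end
        else:
            # No token matched at i: emit the literal character.
            out.append(text[i])
            i += 1
    return out
```

For example, `encode_with_trie("abab", [["ab", "1"], ["ab", "2"]])` emits the id `"1"` twice, because the earlier mapping for `ab` is kept.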