Problem Brief
Most Unknown Elements (Part 3 :)
OA
If time allows, the company would like us to finish Part 3 :)
Implement the jaccard function.
The Jaccard Similarity is a pair-wise comparison calculated with the formula:
jaccard(A, B) = len(intersection(A, B)) / len(union(A, B))
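The intersection is the multiset of characters present in both strings, including repeats: each character appears as many times as the smaller of its counts in the two strings.
The union is the smallest multiset that contains all characters found in either string, including repeats: each character appears as many times as the larger of its counts in the two strings.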
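A minimal sketch of these definitions in Python, assuming "including repeats" means multiset semantics (a character counts once per occurrence) and assuming two empty strings score 0.0, a case the brief does not cover. Only the name `jaccard` comes from the brief; the parameter names and the use of `collections.Counter` are illustrative choices.

```python
from collections import Counter


def jaccard(string1: str, string2: str) -> float:
    """Multiset Jaccard similarity of the characters in two strings."""
    counts1 = Counter(string1)
    counts2 = Counter(string2)

    # Counter & Counter keeps the minimum count per character (multiset intersection).
    intersection_size = sum((counts1 & counts2).values())
    # Counter | Counter keeps the maximum count per character (multiset union).
    union_size = sum((counts1 | counts2).values())

    # Assumption: two empty strings give 0.0; the brief leaves this case undefined.
    if union_size == 0:
        return 0.0
    return intersection_size / union_size
```

Counting characters once per string keeps the work linear in the combined length of the inputs, which is likely fast enough for an assessment of this kind.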
Thanks a ton to the kind soul who shared the source!
Example 1
Input
string1 = "baa", string2 = "abbc"
Output
0.4
Explanation
intersection(S1, S2) = "ab"
union(S1, S2) = "aabbc"
J(S1, S2) = len("ab") / len("aabbc") = 2/5 = 0.4
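The arithmetic in this explanation can be reproduced step by step. The snippet below is an illustrative check, not part of the original brief: it rebuilds the intersection and union as sorted strings with `collections.Counter`, assuming the multiset reading of "including repeats". The variable names are hypothetical.

```python
from collections import Counter

string1, string2 = "baa", "abbc"
counts1, counts2 = Counter(string1), Counter(string2)

# elements() repeats each character by its count; sorting gives a stable display order.
intersection = "".join(sorted((counts1 & counts2).elements()))
union = "".join(sorted((counts1 | counts2).elements()))

print(intersection)                    # ab
print(union)                           # aabbc
print(len(intersection) / len(union))  # 0.4
```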