Difference between revisions of "TADM2E 3.29"
Revision as of 18:13, 11 September 2014
Scan the web page word by word and for each one, except the first, take the pair formed by the previous word + the current one. Use that pair as the key to a hashtable, where you store the frequency of that pair. If the entry is not already in the hash table store it with value 1, if it is there increment the current value by one. Keep the highest frequency seen and update it wherever you see a greater one. In the end return the greatest frequency seen.
Complexity, where N = number of words in the text:
Time: O(N) (you only iterate over each word once)
Space: O(N) (Worst case is when every word is distinct and thus you have no repeated pairs, and that would take N storage on the hash table)