Crawling with Python and the wordfreq library

Today I published a mini-repo that came out of an internal training session at Atrápalo: https://github.com/raimonbosch/wordfreq.crawler

The goal of the training was to build a crawler that finds a specific word as fast as possible. To do so, we used the wordfreq library to estimate which links are likely to lead closer to the word we are looking for. The library's zipf_frequency function was also very useful for spotting high-value links that are likely to hold interesting content, and for avoiding infinite loops through low-value pages.
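To make the idea concrete, here is a minimal sketch of how links could be ranked with zipf_frequency. It is not the code from the repo: the helper names (zipf_en, link_score, order_links) and the heuristic itself (prefer links whose anchor text contains a word with a Zipf frequency close to the target's) are assumptions made for illustration only.

```python
import heapq
import re
from typing import Callable, Iterable, List, Tuple

FreqFn = Callable[[str], float]


def zipf_en(word: str) -> float:
    """English Zipf frequency of a word, via wordfreq."""
    # Imported lazily so the ranking logic can be exercised with a stub
    # frequency function when wordfreq is not installed.
    from wordfreq import zipf_frequency
    return zipf_frequency(word, "en")


def link_score(anchor_text: str, target: str, freq: FreqFn = zipf_en) -> float:
    """Hypothetical heuristic: lower scores are crawled first.

    The score is the smallest Zipf distance between the target word and
    any known word in the anchor text. Words with frequency 0.0 are
    treated as unknown and ignored.
    """
    words = re.findall(r"[a-z]+", anchor_text.lower())
    if target.lower() in words:
        return 0.0  # the anchor text already mentions the target
    target_freq = freq(target)
    distances = [abs(f - target_freq) for f in map(freq, words) if f > 0]
    return min(distances) if distances else float("inf")


def order_links(
    links: Iterable[Tuple[str, str]], target: str, freq: FreqFn = zipf_en
) -> List[str]:
    """Return URLs best-first, visiting each URL at most once."""
    heap: List[Tuple[float, str]] = []
    seen = set()  # guards against loops caused by repeated links
    for url, anchor_text in links:
        if url in seen:
            continue
        seen.add(url)
        heapq.heappush(heap, (link_score(anchor_text, target, freq), url))
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]
```

With wordfreq installed, order_links(links, "paella") would use real frequencies through the default zipf_en. Passing the frequency function as a parameter is a design choice of this sketch, so the ordering can be checked with made-up values.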
