Poulami Dutta
2 min read · Jul 8, 2019

RECIPE OF SCRAPING

What’s so special about the recipe for scraping data from a website? Let’s find out!

The most joyous moment for a scraper is when they try to extract data and get exactly the results they wanted. When it comes to scraping, Python is widely considered the most convenient language, although other languages such as JavaScript (Node.js), C, C++ and PHP can be used for scraping as well.

Frankly speaking, scraping a website is often considered an easy task, but I want to emphasise that it is not nearly as easy as people perceive it to be. Scraping needs a well-defined “recipe” for every website, i.e., the way to extract data differs from one website to the next.

The reason scraping is difficult is that even if you succeed in fetching the data from the current page, the front end may change over time, and the code has to be modified accordingly. So keeping the right data flowing to whoever consumes it turns out to be a really cumbersome job.

As we all must be aware, there are multiple ways of scraping data.
A commonly used way to scrape is with “BeautifulSoup”. It’s good to start with and is easy to learn and use on any basic website. To put it simply, BeautifulSoup is a library for parsing HTML and XML documents. But in the long run, I needed to explore something more powerful, something more generic.
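
Before moving on, here is a minimal sketch of what parsing with BeautifulSoup can look like. The HTML snippet, the `recipe` class name and the `extract_recipes` helper are all made up for illustration; a real website will need its own recipe.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page. The tag and class
# names are assumptions for illustration, not taken from any real site.
SAMPLE_HTML = """
<html><body>
  <h1>Recipes</h1>
  <ul>
    <li class="recipe"><a href="/pasta">Pasta</a></li>
    <li class="recipe"><a href="/curry">Curry</a></li>
  </ul>
</body></html>
"""


def extract_recipes(html):
    """Return a list of (title, link) pairs found in the given HTML."""
    # "html.parser" is the parser bundled with Python, so no extra install.
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for item in soup.find_all("li", class_="recipe"):
        link = item.find("a")
        if link is None:
            # Skip list items that do not contain a link.
            continue
        results.append((link.get_text(strip=True), link.get("href")))
    return results


if __name__ == "__main__":
    for title, href in extract_recipes(SAMPLE_HTML):
        print(title, href)
```

In practice the HTML would come from a downloaded page rather than a string, and the selectors would have to be revisited whenever the site’s front end changes.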

That is when I came across an alternative to BeautifulSoup: a well-defined framework called “Scrapy”, which can handle almost all kinds of websites. It’s a very interesting framework in which Spiders (yes, you heard that right!) are made to crawl in order to scrape a particular website. It’s super easy to learn and can be of great help to people in the scraping domain. You can get started with the documentation at https://docs.scrapy.org/en/latest/

For someone dealing with an advanced level of scraping, “Selenium” is usually the best choice. It is basically a package that automates a real browser. It’s the preferable option for very tough websites, as it gives you multiple ways to tailor your code and drive the browser from Python.

So guys and gals, these are a few recipes for scraping. It is not an easy task to deal with, but with sheer determination and practice you will surely succeed at it! Stay tuned to my blog for more such recipes, and I assure you that you won’t be disappointed!

Poulami Dutta

Software Developer Engineer (Python Developer, Ex-ISRO)