A classified website is an online advertising platform for promoting products and services. It brings buyers and sellers together in a single place. Classified portals have sections devoted to jobs, resumes, housing, personals, services, items wanted, community, and more.
Data, whether structured or unstructured, plays a significant role in generating growth and innovation for your business. In this era of Big Data, web scraping is valuable in nearly every industry, and classified sites are no exception.
Classified portals are usually organized into categories and sub-categories so that users can search for relevant listings. Scraping data from popular classified sites extracts information that can benefit both business owners and buyers.
Typical fields extracted from classified sites include ad title, ad ID, description, current price, posting date, specification, category, subcategory, image, city, and state.
With an automated scraper, businesses can find classifieds at a fair price and focus on other business activities instead of collecting information manually. Scraping also lets you extract competitor details such as pricing, contact numbers, and emails, which can support marketing outreach to customers.
By scraping popular classified websites, businesses can gather large volumes of data quickly and save the time and effort of collecting information manually. The extracted information is useful in several other business areas.
With the Classified extractor services from iWeb Data Scraping, you can quickly obtain competitors’ contact information and connect with them. You can also promote your business to your customers and target consumers.
As with other internet marketing methods, legality is the primary consideration in data scraping.
The reasons for extracting data from classified websites vary. The most popular ones are:
Here, we are scraping the OLX classified site. We will choose a random category, i.e., Children’s clothing. First, open the category page in a Chrome tab and enable the developer tools. Now, go to the Elements tab, select the tool for inspecting page items, click on the section with the first item, and look at the selected HTML node in the ‘Elements’ tab.
Here, we can use td.offer as the CSS selector, but first we need to verify it. Press CTRL + F while you are in the HTML code of the ‘Elements’ tab and type the selector into the search bar. If everything is right, you will see 44 matching elements.
Next, we expand the HTML element and find the link to the ad page. The selector for links to ad pages is td.offer a.link.detailsLink. Check that it also matches 44 links. For better compatibility, we can use the shorter a.link.detailsLink selector.
Now, check the paginator and find the link to the next page. The selector we obtained is a[data-cy="page-link-next"]. Ensure that exactly one element on the page matches this selector.
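The same counts can be checked outside the browser. The sketch below is one way to do it in Python with only the standard library; it handles simple tag-plus-class and tag-plus-attribute selectors, not descendant selectors such as td.offer a.link.detailsLink. The names MatchCounter and count_matches and the SAMPLE markup are hypothetical stand-ins, not OLX's actual page.

```python
from html.parser import HTMLParser


class MatchCounter(HTMLParser):
    """Counts start tags with a given name, required classes, and required attributes."""

    def __init__(self, tag, classes=(), attrs=None):
        super().__init__()
        self.tag = tag
        self.classes = set(classes)
        self.required = dict(attrs or {})
        self.count = 0

    def handle_starttag(self, tag, attrs):
        found = dict(attrs)
        # An attribute without a value is reported as None, hence the "or".
        tag_classes = set((found.get("class") or "").split())
        if (
            tag == self.tag
            and self.classes <= tag_classes
            and all(found.get(k) == v for k, v in self.required.items())
        ):
            self.count += 1


def count_matches(html, tag, classes=(), attrs=None):
    counter = MatchCounter(tag, classes, attrs)
    counter.feed(html)
    return counter.count


# Hypothetical, heavily trimmed stand-in for a category page.
SAMPLE = """
<table>
  <tr><td class="offer"><a class="link detailsLink" href="/ad-1.html">One</a></td></tr>
  <tr><td class="offer promoted"><a class="link detailsLink" href="/ad-2.html">Two</a></td></tr>
</table>
<a data-cy="page-link-next" href="/category/?page=2">Next</a>
"""

print(count_matches(SAMPLE, "td", classes=["offer"]))                    # td.offer -> 2
print(count_matches(SAMPLE, "a", classes=["link", "detailsLink"]))       # a.link.detailsLink -> 2
print(count_matches(SAMPLE, "a", attrs={"data-cy": "page-link-next"}))   # next-page link -> 1
```

On a real category page the first two counts should agree with what the browser search shows, and the last one should be exactly 1.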
To navigate through the category pages, we will use a link pool. The scraper will look like this:
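As a rough illustration of the link pool idea in Python, the sketch below keeps a queue of category page URLs, collects the a.link.detailsLink links from each page, and adds the a[data-cy="page-link-next"] link back to the pool. The names CategoryPage, crawl_category, and FAKE_SITE are hypothetical, and fetch is an assumed callable that returns the HTML for a URL; here it is backed by an in-memory dictionary so the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class CategoryPage(HTMLParser):
    """Collects ad links and the next-page link from one category page."""

    def __init__(self):
        super().__init__()
        self.ad_links = []
        self.next_link = None

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        found = dict(attrs)
        href = found.get("href")
        if not href:
            return
        classes = set((found.get("class") or "").split())
        if {"link", "detailsLink"} <= classes:
            self.ad_links.append(href)
        elif found.get("data-cy") == "page-link-next":
            self.next_link = href


def crawl_category(start_url, fetch, max_pages=5):
    """Walk category pages through a pool of URLs; return ad URLs in discovery order."""
    pool = deque([start_url])
    seen = {start_url}
    ads = []
    pages = 0
    while pool and pages < max_pages:
        url = pool.popleft()
        pages += 1
        page = CategoryPage()
        page.feed(fetch(url))
        for href in page.ad_links:
            ad_url = urljoin(url, href)
            if ad_url not in ads:
                ads.append(ad_url)
        if page.next_link:
            next_url = urljoin(url, page.next_link)
            if next_url not in seen:  # avoid looping over the same page
                seen.add(next_url)
                pool.append(next_url)
    return ads


# Hypothetical two-page category served from memory instead of the network.
FAKE_SITE = {
    "https://example.com/cat/": (
        '<a class="link detailsLink" href="/ad-1.html">1</a>'
        '<a data-cy="page-link-next" href="/cat/?page=2">next</a>'
    ),
    "https://example.com/cat/?page=2": '<a class="link detailsLink" href="/ad-2.html">2</a>',
}

print(crawl_category("https://example.com/cat/", FAKE_SITE.__getitem__))
# ['https://example.com/ad-1.html', 'https://example.com/ad-2.html']
```

The max_pages limit is a safety cap for experiments; a real run would replace the in-memory fetch with an HTTP client and add delays between requests.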
Next, we describe the data collection logic for the ad page. To do this, we first open any ad and find the CSS selectors:
We then write the part of the scraper that collects the actual data from the ad page.
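A minimal Python sketch of this step might look as follows. It fills only three fields of a record, and both the markup and the URL pattern are placeholders rather than OLX's actual structure: it assumes the title sits in the first h1 element and that ad URLs end in "-ID" followed by the ad ID and ".html". AdRecord, TitleParser, and parse_ad are hypothetical names.

```python
import re
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class AdRecord:
    ad_id: str
    url: str
    title: str


class TitleParser(HTMLParser):
    """Captures the text of the first <h1> on the page."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self._done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self._done:
            self._in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_h1:
            self._in_h1 = False
            self._done = True

    def handle_data(self, data):
        if self._in_h1:
            self.parts.append(data)


def parse_ad(url, html):
    # Assumption: ad URLs end in "-ID<ad id>.html".
    match = re.search(r"-ID(\w+)\.html", url)
    parser = TitleParser()
    parser.feed(html)
    # Collapse runs of whitespace inside the heading.
    title = " ".join("".join(parser.parts).split())
    return AdRecord(ad_id=match.group(1) if match else "", url=url, title=title)


# Hypothetical ad page and URL.
record = parse_ad(
    "https://example.com/item/sample-jacket-IDqsKeK.html",
    "<html><body><h1>\n  Sample jacket\n</h1><p>Details</p></body></html>",
)
print(record)
```

Other fields such as price, description, or posting date would be added the same way once their selectors have been confirmed in the developer tools.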
We get the following dataset record:
Now, we want to collect the phone numbers. To do this, open the page with the ad, open the developer tools, and go to the Network tab. Within the tab, filter the view so that only XHR requests are shown. Click the button that clears all requests, and then click the ‘Show Phone’ button.
Now, open the requests and check the address and the type of data they sent.
The URL that we have is
https://www.olx.ua/ajax/misc/contact/phone/qsKeK/
and the parameter pt is
cda38f1d74d6e50f6f5a248ea2578ba04d44b58ccb6648718ce825a15dd1c036494b2cd1c6cb27762a8de30f5f58676149a11ee8a228998fd7f6b8cde5bb83a9
From the above, we see that we need the ad ID qsKeK and the parameter pt to imitate such requests. The parameter is embedded in the page's JavaScript, and we can extract it using a regular expression. We will change the scraper to collect the phone number and add a snippet for this.
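The extraction step could be sketched in Python as below. Two things are assumptions here, not facts taken from the page: that the token is assigned to a JavaScript variable named phoneToken as a quoted hexadecimal string, and that pt is passed as a query-string parameter. The function name phone_request_url and the sample page are hypothetical; the endpoint path is the one shown above.

```python
import re
from urllib.parse import urlencode

PHONE_ENDPOINT = "https://www.olx.ua/ajax/misc/contact/phone/{ad_id}/"

# Assumption: the token appears as  phoneToken = '<hex string>'  in the page's JavaScript.
TOKEN_RE = re.compile(r"phoneToken\s*=\s*['\"]([0-9a-f]+)['\"]")


def phone_request_url(ad_id, page_html):
    """Build the phone request URL, or return None if no token is found."""
    match = TOKEN_RE.search(page_html)
    if match is None:
        return None
    # Assumption: pt travels in the query string.
    return PHONE_ENDPOINT.format(ad_id=ad_id) + "?" + urlencode({"pt": match.group(1)})


# Hypothetical page fragment with a shortened token.
page = "<script>var phoneToken = '0123abcd';</script>"
print(phone_request_url("qsKeK", page))
# https://www.olx.ua/ajax/misc/contact/phone/qsKeK/?pt=0123abcd
```

If the real page uses a different variable name or quoting style, only TOKEN_RE needs to change.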
If we run the scraper in debug mode, we see the following structure:
We will use the body_safe > value CSS selector to collect the phone number. Adding it to the web scraper gives the following:
Conclusion: The scraper collects the data we require from the OLX classified site.
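Outside a selector-based scraper, the equivalent step in Python could look like the sketch below. It assumes the endpoint answers with a JSON object that has a top-level value field, which is what the body_safe > value selector suggests but is not confirmed here. The function name phone_from_response and the sample body are hypothetical.

```python
import json


def phone_from_response(body):
    """Return the stripped `value` field of a JSON response body, or None."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    value = payload.get("value")
    return value.strip() if isinstance(value, str) else None


# Hypothetical response body with a placeholder number.
print(phone_from_response('{"value": " 000 000 00 00 "}'))  # 000 000 00 00
print(phone_from_response("not json"))                      # None
```

Returning None for anything unexpected keeps a single bad response from stopping the whole run.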
CTA: For more information, contact iWeb Data Scraping now! You can also reach us for all your web scraping service and mobile app data scraping service requirements.