How to scrape websites using advanced CSS selectors


Web scraping has changed the way millions of people and businesses work by automating time-consuming, labor-intensive data collection.

No-code web scraping tools like Hexomatic provide a point-and-click interface that lets users scrape just about any website.

However, some websites require you to choose CSS selectors manually in order to capture data correctly.

The good news is that, in addition to its point-and-click interface, Hexomatic offers more advanced features for tackling more complex projects.

In particular, Hexomatic lets you use Google Chrome’s developer tools to isolate the elements you need to capture, copy their CSS selectors, and use them natively inside Hexomatic.

This way you can scrape elements that cannot be captured with a simple point-and-click.

In this tutorial, you will find step-by-step instructions on how to use advanced CSS selectors for scraping just about any website. 


To begin, you need to have a Hexomatic account.

Step 1: Create a new scraping recipe

Go to your dashboard and create a blank scraping recipe.

Step 2: Add the page URL

Go to the website you want to scrape and capture its URL.

Then, paste the URL into the Hexomatic scraping recipe builder and click Preview.

Step 3 (A): Select elements to scrape normally

Normally, when you scrape websites with Hexomatic, you can use the automatic selectors.
For example, click an article title, choose Select All to capture all the article titles on the page, and select Link as the element type.

This is how Hexomatic automatically detects and scrapes elements from the page.

Step 3 (B): Select elements using CSS selectors

However, the HTML and JavaScript used on some pages can prevent the automatic selectors from capturing the elements you need.

For such cases, you can use CSS selectors instead. 

To do that, open the page you want to scrape in a browser such as Google Chrome.

Right-click on the page and select Inspect. 

Next, choose the element you want to scrape, in this case the title. Then right-click it again and choose Copy selector.

Let’s get back to our scraping recipe. Click Add actions to insert the copied CSS selector.

Next, paste the selector in the Selector field, choose Link URL as the type, and save the element. 

The matched element appears as the sample value. In this case, the selector finds only one element.

Next, let’s see how to edit the selector so that it detects all the matching elements on the page.
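
To illustrate what choosing Link URL as the type means conceptually, here is a minimal Python sketch that reads both the visible text and the link URL from the same element. The HTML snippet, the example.com address, and the names LinkReader and read_links are hypothetical and used only for illustration; this is not how Hexomatic itself is implemented.

```python
from html.parser import HTMLParser


class LinkReader(HTMLParser):
    """Collects the visible text and the href of every <a> element."""

    def __init__(self):
        super().__init__()
        self.in_link = False
        self.results = []  # one {"text": ..., "url": ...} dict per link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link = True
            self.results.append({"text": "", "url": dict(attrs).get("href")})

    def handle_data(self, data):
        # Text inside an open <a> belongs to the most recent link.
        if self.in_link:
            self.results[-1]["text"] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False


def read_links(html):
    reader = LinkReader()
    reader.feed(html)
    reader.close()
    return reader.results


# Hypothetical markup for a single article title.
sample = '<h2><a href="https://example.com/post-1">First post</a></h2>'

link = read_links(sample)[0]
print("Text type:    ", link["text"])
print("Link URL type:", link["url"])
```

The same element yields two different values depending on the type you pick: the title text or the address it links to.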

Step 4: Edit CSS selectors

To capture every element of the same kind on the page, click Edit and remove the part of the selector that points to one specific div. The selector then matches all the targeted elements on the page instead of a single one.

So now all 10 article URLs on the page are detected.

Now that we have all the necessary elements scraped, let’s Save the recipe.
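
To see why trimming a selector widens its match, here is a minimal Python sketch. It assumes the copied selector pins one div with a positional part such as :nth-child(1), which is a common shape for selectors copied from Chrome, though your selector may look different. The HTML snippet, the selector string, and the helper names (PathCollector, select_hrefs, generalize) are hypothetical, and the tiny matcher only understands tag names, #id, :nth-child(n) and the child combinator (>). It is not Hexomatic's matching engine.

```python
import re
from html.parser import HTMLParser

# Hypothetical page fragment with three article links.
HTML = """
<div id="posts">
  <div><h2><a href="/post-1">First post</a></h2></div>
  <div><h2><a href="/post-2">Second post</a></h2></div>
  <div><h2><a href="/post-3">Third post</a></h2></div>
</div>
"""

STEP = re.compile(
    r"^(?P<tag>[a-z0-9]+)?(?:#(?P<id>[\w-]+))?(?::nth-child\((?P<n>\d+)\))?$"
)


class PathCollector(HTMLParser):
    """Records each <a> with its chain of (tag, id, sibling position) ancestors."""

    def __init__(self):
        super().__init__()
        # Each open element: [tag, id, position among siblings, children seen].
        self.stack = [["#document", None, 1, 0]]
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        parent = self.stack[-1]
        parent[3] += 1
        self.stack.append([tag, attrs.get("id"), parent[3], 0])
        if tag == "a":
            path = [(t, i, n) for t, i, n, _ in self.stack[1:]]
            self.links.append((path, attrs.get("href")))

    def handle_endtag(self, tag):
        # Good enough for well-nested sample markup like the one above.
        if len(self.stack) > 1 and self.stack[-1][0] == tag:
            self.stack.pop()


def step_matches(step, node):
    match = STEP.match(step)
    if match is None:
        raise ValueError(f"unsupported selector step: {step!r}")
    tag, element_id, position = node
    if match["tag"] and match["tag"] != tag:
        return False
    if match["id"] and match["id"] != element_id:
        return False
    if match["n"] and int(match["n"]) != position:
        return False
    return True


def select_hrefs(selector, html):
    """Return the href of every <a> whose trailing ancestors match the selector."""
    steps = [s.strip() for s in selector.split(">")]
    collector = PathCollector()
    collector.feed(html)
    hrefs = []
    for path, href in collector.links:
        tail = path[-len(steps):]
        if len(tail) == len(steps) and all(
            step_matches(s, node) for s, node in zip(steps, tail)
        ):
            hrefs.append(href)
    return hrefs


def generalize(selector):
    """Drop positional parts so the selector is no longer tied to one element."""
    return re.sub(r":nth-child\(\d+\)", "", selector)


copied = "#posts > div:nth-child(1) > h2 > a"  # hypothetical copied selector
print(select_hrefs(copied, HTML))              # matches a single link
print(generalize(copied))                      # selector without the position
print(select_hrefs(generalize(copied), HTML))  # matches every article link
```

The copied selector matches only the link inside the first div, while the generalized one matches the link in every div under the same parent.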

Step 5: Use the recipe in the workflow

To view the results of the recipe, you need to run it in a workflow.
To do that, go to your Scraping recipes, choose the recipe, and click Use in a workflow.

Once the workflow has been created, click Continue.

Step 6: Run the workflow

Click Run now to proceed with the workflow.

Step 7: View and save the results

Once the workflow has finished running, you can view the results and save them in a convenient spreadsheet format, such as Google Sheets or a CSV file.
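
For readers unfamiliar with CSV, here is a minimal Python sketch of what scraped rows look like once written in that format. The column names (title, url), the sample rows, and the to_csv helper are hypothetical; the columns in your own export depend on the elements you defined in the recipe.

```python
import csv
import io

# Hypothetical scraped rows: one dict per article.
rows = [
    {"title": "First post", "url": "https://example.com/post-1"},
    {"title": "Second post", "url": "https://example.com/post-2"},
]


def to_csv(rows):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "url"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_csv(rows))
```

Each scraped element becomes a column and each matched item becomes a row, which is why the file opens cleanly in any spreadsheet tool.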

