Password-protected websites can sometimes be an obstacle when scraping data, especially if they use non-standard login mechanisms such as JS forms or pop-ups.
Before we get started, you may wonder whether cookies store your passwords. The short answer is no. So how can they help you log in to a password-protected website?
Instead of your password, a cookie stores a unique identifier that lets the website remember you. By copying the session cookie from a browser where you are already logged in, you effectively reuse the same authenticated session when you scrape.
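To make the idea concrete, here is a minimal Python sketch of what reusing a session cookie looks like at the HTTP level. It is not how Hexomatic works internally; the function name `build_authenticated_request`, the URL, and the cookie value are placeholders made up for illustration. The sketch only builds the request and does not send it.

```python
from urllib.request import Request


def build_authenticated_request(url: str, cookie_name: str, cookie_value: str) -> Request:
    """Attach a session cookie copied from a logged-in browser to a request."""
    request = Request(url)
    # The Cookie header carries only "name=value"; no password is included.
    request.add_header("Cookie", f"{cookie_name}={cookie_value}")
    return request


# Placeholder values for illustration only (hypothetical URL and cookie value).
request = build_authenticated_request(
    "https://example.com/account", "_session", "PASTE_COOKIE_VALUE_HERE"
)
print(request.get_header("Cookie"))
```

The server sees the same identifier your browser would send, which is why it treats the request as coming from your logged-in session.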
If you haven’t got a Hexomatic account yet, go to https://hexomatic.com to get started.
Step 1: Create a new scraping recipe
To get started, create a blank scraping recipe from your dashboard.
Step 2: Add the web page URL
Next, add the web page URL and click Preview. Be sure to set Full-stack as your browser mode, as the Cookie option works only in this mode.
Step 3: Capture cookie details
Go to the website you are going to scrape and capture the cookie details. Note that you need to be logged in to your account in this window in order to get the cookie details.
To get the cookie details, inspect the website and find Cookies in the Application section of your browser's developer tools. Then copy the cookie name (_session in this example), its value, and the domain.
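The three values you copy can be thought of as one small record. The Python sketch below is only an illustration under stated assumptions: `CookieDetails` and `cookie_applies_to` are hypothetical names, and the domain check is a simplified version of standard cookie domain matching that ignores IP addresses. It can serve as a quick sanity check that the domain you copied actually covers the page you plan to scrape.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class CookieDetails:
    """The three fields copied from the browser's cookie list."""
    name: str
    value: str
    domain: str


def cookie_applies_to(cookie: CookieDetails, url: str) -> bool:
    """Return True if the cookie's domain covers the URL's host (simplified check)."""
    host = (urlparse(url).hostname or "").lower()
    # A leading dot (".example.com") is ignored when comparing domains.
    domain = cookie.domain.lstrip(".").lower()
    if not host or not domain:
        return False
    # Match the domain itself or any of its subdomains.
    return host == domain or host.endswith("." + domain)


# Placeholder values for illustration only.
cookie = CookieDetails("_session", "PASTE_COOKIE_VALUE_HERE", ".example.com")
print(cookie_applies_to(cookie, "https://app.example.com/dashboard"))
```

If the check fails for your target URL, the cookie was probably copied from a different site or subdomain than the one you are scraping.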
Step 4: Add actions in your scraping recipe builder
In your scraping recipe builder, click Add Actions and select Cookie from the pop-up window.
Step 5: Paste Cookie details in the pop-up window
Now, you can paste the captured Cookie Name, Value, and Domain into the matching fields in the pop-up window.
After filling in the fields, click Proceed.
And voilà, after you proceed, you will be logged in to your account.
Marketing Specialist | Content Writer
Experienced in SaaS content writing, helps customers automate time-consuming tasks and solve complex scraping cases with step-by-step tutorials and in-depth articles.