Scrapy can't find form on page
I'm trying to write a spider that will automatically log in to this website. However, when I try using scrapy.FormRequest.from_response in the shell, I get the error:
No <form> element found in <200 https://www.athletic.net/account/login/?ReturnUrl=%2Fdefault.aspx>
I can definitely see the form when I inspect the element on the site, but it did not show up in Scrapy when I tried finding it with response.xpath() either. Is it possible for the form content to be hidden from my spider somehow? If so, how do I fix it?
1 Answer
The form is created using JavaScript; it is not part of the static HTML source code. Scrapy does not execute JavaScript, so the form cannot be found.
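A quick way to confirm this kind of problem is to count the form tags in the raw HTML the server returns and compare that with what the browser shows after rendering. The sketch below uses only the standard library; the two HTML strings are hypothetical stand-ins for illustration, not the real page source.

```python
from html.parser import HTMLParser


class FormCounter(HTMLParser):
    """Counts opening <form> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.forms = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag == "form":
            self.forms += 1


def count_forms(html):
    """Return the number of <form> elements found in the given HTML string."""
    parser = FormCounter()
    parser.feed(html)
    parser.close()
    return parser.forms


# Hypothetical static source: only a placeholder that a script fills in later.
STATIC_HTML = '<html><body><div id="app"></div><script src="app.js"></script></body></html>'

# Hypothetical DOM after the browser has run the script.
RENDERED_HTML = '<html><body><div id="app"><form><input name="e"></form></div></body></html>'

print(count_forms(STATIC_HTML))
print(count_forms(RENDERED_HTML))
```

In the Scrapy shell, the equivalent check is response.xpath('//form'), which returns an empty list when the form is not in the downloaded HTML.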
The relevant part of the static HTML (where they inject the form using JavaScript) is:
To find issues like this, I would either:
In this case, you have to manually create your FormRequest for this web page. I was not able to spot any form of CSRF protection on their form, so it might be as simple as:
FormRequest(url='https://www.athletic.net/account/auth.ashx',
            formdata={"e": "foo@example.com", "pw": "secret"})
However, I don't think you can use formdata here; they appear to expect JSON instead. FormRequest sends its formdata URL-encoded (application/x-www-form-urlencoded), so you probably want a standard Request with a JSON body.
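A minimal sketch of that approach, assuming the endpoint accepts a JSON body with the e and pw keys seen in the browser's request payload. The helper names login_request_kwargs and make_login_request are hypothetical, and I have not verified whether the server needs additional headers or cookies.

```python
import json

LOGIN_URL = "https://www.athletic.net/account/auth.ashx"


def login_request_kwargs(email, password):
    """Build the keyword arguments for a JSON login request (hypothetical helper)."""
    return {
        "url": LOGIN_URL,
        "method": "POST",
        # Assumption: the server wants a JSON content type for this payload.
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"e": email, "pw": password}),
    }


def make_login_request(email, password, callback=None):
    """Wrap the arguments above in a plain scrapy.Request."""
    # Imported here so the payload helper can be used without Scrapy installed.
    from scrapy import Request

    return Request(callback=callback, **login_request_kwargs(email, password))
```

Inside a spider, you could yield make_login_request("foo@example.com", "secret", callback=self.after_login) from start_requests, where after_login is whatever callback checks the response.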
Since they heavily use JavaScript on their front end, you cannot use the page source to find these parameters either. Instead, I used my browser's developer console and inspected the request and response that occurred when I tried to log in with invalid credentials.
This gave me:
General:
Request URL: https://www.athletic.net/account/auth.ashx
[...]
Request Payload:
{e: "foo@example.com", pw: "secret"}