Is it possible to check if a certain URL is redirecting without opening a request to the site in Python? [duplicate]








I know that it is possible to check if a URL redirects, as mentioned in the following question and its answer.



How to check if the url redirect to another url using Python



using the following code:


import urllib2  # Python 2 standard library

req = urllib2.Request(url=url, headers=headers)
resp = urllib2.urlopen(req, timeout=3)
redirected = resp.geturl() != url  # True if the final URL differs from the requested one



However, I have a list of millions of URLs. It is currently being discussed whether one of them is a harmful URL or redirects to a harmful URL.



I want to know whether it is possible to check for a redirect without opening a direct connection to the redirecting website, so as to avoid connecting to a harmful website.








Do you mean "connect to the original website but not to the redirected website", or "not create a connection at all"? The latter is impossible.
– Sraw
Jun 29 at 9:43





“to avoid creating a connection with a harmful website” - why? In what way do you imagine any “harmful website” could actually do any damage to your Python script?
– CBroe
Jun 29 at 9:44





Some websites will automatically download binary code to your PC upon creating a connection @CBroe
– Kev1n91
Jun 29 at 9:46





@Sraw - the first one; I will edit the question.
– Kev1n91
Jun 29 at 9:46





The only way to do this would be to connect to every URL and check whether it redirects. You can check this without connecting to the other website. Redirects are most often done through 3xx status codes together with a Location header. Of course, JavaScript on the website may also perform a redirect, but this would be harder to detect without actually running it.
– Quaisaq Anderson
Jun 29 at 9:46




1 Answer



You can do a HEAD request and check the status code. If you are using the third-party requests library, you can do that like this:




import requests

original_url = '...' # your original url here
response = requests.head(original_url)

if response.is_redirect:
    print("Redirecting")
else:
    print("Not redirecting")
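
If you also want to know where the redirect points without ever contacting that target, the same idea can be sketched with the Python 3 standard library only: send a HEAD request to the original host and read the Location header of the response. This is a minimal sketch, not a hardened implementation. The names `resolve_redirect`, `check_redirect` and `REDIRECT_STATUSES` are made up for illustration, and it assumes the server answers HEAD requests the same way it answers GET. It only detects HTTP-level redirects, not ones performed by JavaScript.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Status codes treated as redirects in this sketch (an assumption, chosen
# to cover the common permanent and temporary redirect responses).
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def resolve_redirect(request_url, status, location):
    """Return the absolute redirect target, or None if this is not a redirect.

    Pure function: it only inspects values already received from the
    original server and opens no connection itself.
    """
    if status in REDIRECT_STATUSES and location:
        # Location may be relative, so resolve it against the requested URL.
        return urljoin(request_url, location)
    return None


def check_redirect(url, timeout=3):
    """Send one HEAD request to the original host; never follow the redirect."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        # http.client does not follow redirects, so only the original
        # host has been contacted at this point.
        return resolve_redirect(url, resp.status, resp.getheader("Location"))
    finally:
        conn.close()
```

Calling `check_redirect(original_url)` would return the target URL as a string, or `None` when the response is not a redirect. You still have to connect to the original URL, as noted in the comments, but the target can then be compared against a blocklist before anything connects to it.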
