Output for scraped data with text values in a CSV file



I am new to web scraping with Python and need help extracting the sub-category name (title) and the page title (main category header) along with the URLs being scraped by my code. I tried .text with BeautifulSoup, but I get an error and no output, so I think there must be a better way to do this.



Help would be appreciated. Please take a look at the code and help me store the output in a CSV file as URL \t Sub category title \t Main Category header (tab-separated).



Example (required: one row per subcategory URL):

http://www.medicalexpo.com/medical-manufacturer/neonatal-incubator-2963.html	Neonatal incubators	Pediatrics
http://www.medicalexpo.com/medical-manufacturer/infant-radiant-warmer-13522.html	Infant radiant warmers	Pediatrics
http://www.medicalexpo.com/medical-manufacturer/infant-phototherapy-lamp-44327.html	Infant phototherapy lamps	Pediatrics



Something like this



Code:


from bs4 import BeautifulSoup
import requests
import unicodecsv
import time
import random

def get_soup(url):
    return BeautifulSoup(requests.get(url).content, "lxml")

url = 'http://www.medicalexpo.com/'
soup = get_soup(url)
raw_categories = soup.select('div.univers-main li.category-group-item a')
print(raw_categories)
category_links = {}

for cat in raw_categories:
    time.sleep(random.randint(2, 5))    # random polite delay between pages
    t0 = time.time()
    soup = get_soup(cat['href'])
    response_delay = time.time() - t0   # back off ~10x the server's response time
    time.sleep(10 * response_delay)
    links = soup.select('#category-group li a')

    category_links[cat.text] = [link['href'] for link in links]
    print(category_links)
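To illustrate the kind of output I want, here is a minimal, self-contained sketch of the extraction step. The HTML snippet below is a stand-in for one category page (the real selectors and title format on medicalexpo.com may differ): the link text gives the sub-category title, the page `<title>` gives the main category header, and the rows are written tab-separated with the standard `csv` module.

```python
import csv
from bs4 import BeautifulSoup

# Stand-in HTML for one category page; the real page structure is assumed,
# not verified, so the selectors below may need adjusting.
html = """
<html><head><title>Pediatrics - MedicalExpo</title></head>
<body><ul id="category-group">
<li><a href="http://www.medicalexpo.com/medical-manufacturer/neonatal-incubator-2963.html">Neonatal incubators</a></li>
<li><a href="http://www.medicalexpo.com/medical-manufacturer/infant-radiant-warmer-13522.html">Infant radiant warmers</a></li>
</ul></body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# The page <title> carries the main category; strip the site-name suffix.
main_category = soup.title.get_text().split(" - ")[0].strip()

rows = []
for link in soup.select("#category-group li a"):
    # link.get_text() is the sub-category title; link["href"] is its URL.
    rows.append([link["href"], link.get_text(strip=True), main_category])

with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerow(["URL", "Sub category title", "Main Category header"])
    writer.writerows(rows)
```

In the real loop each `cat['href']` page would be fetched with `get_soup` instead of the inline snippet, and the rows accumulated across all categories before writing the file once at the end.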








