Remove empty strings from XPath results in Scrapy

I use Scrapy to extract data, and the field typeFacture comes back containing empty strings (''). I want to keep only the text and drop the '' entries before inserting the values into a database, and I would like to do this with XPath.

HTML code:

<td class="tNorm tSmall-xs"> <b>FACTURE</b> <br> '' Commission '' </td>

My code:

item['typeFacture'] = [item.strip() for item in sel.xpath('//tbody/tr/td[5]/text()').extract()]

Result:

'typeFacture': ['', '', 'Commission', '', '', 'Commission', '', '', 'Commission', '', '', 'Commission', '', '', 'Abonnement']}

Don't put images, put the code.
– Mathieu
Jun 26 at 10:39

Please paste your HTML code and the desired and actual results as text instead of links to images.
– running.t
Jun 26 at 10:42

Thank you for the advice. I have updated the question. Do you have any idea how to solve this problem?
– user_1330
Jun 26 at 10:49

Something like this, maybe: item['typeFacture'] = [item.strip() for item in sel.xpath('//tbody/tr/td[5]/text()').extract() if item]
– bobrobbob
Jun 26 at 12:16

Why is "Commission" repeating? Your HTML doesn't contain it multiple times.
– Tarun Lalwani
Jun 26 at 12:43

1 Answer

I found a solution, but not with XPath.
I filter the list in plain Python before inserting it into the database:

item['typeFacture'] = list(filter(None, item['typeFacture']))
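For reference, here is a minimal sketch of the same idea as a single helper that strips and filters in one pass. The function name clean_text_nodes and the sample raw list are hypothetical, chosen only for illustration; the real input would be whatever .extract() returns for the td[5] text nodes. Note that the order matters: a whitespace-only string such as " " is truthy, so filtering before calling strip() would not remove it.

```python
def clean_text_nodes(raw_nodes):
    """Strip each text node and drop the ones that end up empty."""
    # Strip first, then test truthiness: " ".strip() == "" is falsy,
    # whereas the unstripped " " would pass an `if node` check.
    stripped = (node.strip() for node in raw_nodes)
    return [text for text in stripped if text]


# Hypothetical raw text nodes for two table rows (assumed, not real output).
raw = [" ", " ", " Commission ", "\n", " ", " Abonnement "]
cleaned = clean_text_nodes(raw)
print(cleaned)
```

If filtering on the XPath side is preferred, a predicate such as text()[normalize-space()] should select only the non-blank text nodes, although that has not been tested against this page.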
