Remove characters in XPath / Scrapy





I use Scrapy to extract data, and the field (typeFacture) comes back with empty strings ('') mixed in with the text. I want to keep only the text and drop the ('') entries before inserting the result into a database, and I would like to do that with XPath if possible.



HTML code:


<td class="tNorm tSmall-xs">
<b>FACTURE</b>
<br>
''
Commission
''
</td>



My code:


item['typeFacture'] = [item.strip() for item in sel.xpath('//tbody/tr/td[5]/text()').extract()]



Result:


'typeFacture': ['',
'',
'Commission',
'',
'',
'Commission',
'',
'',
'Commission',
'',
'',
'Commission',
'',
'',
'Abonnement']}





Don't put images, put the code.
– Mathieu
Jun 26 at 10:39





Please paste your HTML code and the desired and actual results as text instead of links to images.
– running.t
Jun 26 at 10:42





Thank you for the advice, I have updated the question. Do you have any idea about this problem?
– user_1330
Jun 26 at 10:49





Something like this, maybe: item['typeFacture'] = [item.strip() for item in sel.xpath('//tbody/tr/td[5]/text()').extract() if item]
– bobrobbob
Jun 26 at 12:16







Why is "Commission" repeating? Your HTML doesn't contain it multiple times.
– Tarun Lalwani
Jun 26 at 12:43




1 Answer



I found a solution, though not with XPath.
I filter the list in plain Python before inserting it into the database:


item['typeFacture'] = list(filter(None, item['typeFacture']))
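
To illustrate what that line does, here is a minimal sketch on stand-in data. The list `raw_text_nodes` is a hypothetical sample, assumed to resemble the whitespace-only text nodes that `text()` returns around the `<b>` and `<br>` tags; it is not taken from the real page. The values are stripped first, and `filter(None, ...)` then drops every string that ended up empty.

```python
# Hypothetical sample (an assumption, not scraped data): each table cell
# yields whitespace-only text nodes before the actual label.
raw_text_nodes = [
    "\n  ", "\n  ", "\n  Commission\n",
    "\n  ", "\n  ", "\n  Abonnement\n",
]

# Step 1: strip surrounding whitespace, as the list comprehension in the
# question does. Whitespace-only nodes become empty strings.
stripped = [text.strip() for text in raw_text_nodes]

# Step 2: filter(None, ...) keeps only truthy values, so '' is dropped.
type_facture = list(filter(None, stripped))

# Same result in one pass: test the stripped value, not the raw one,
# because a whitespace-only string is still truthy before stripping.
type_facture_one_pass = [text.strip() for text in raw_text_nodes if text.strip()]

print(type_facture)
```

If the filtering has to happen in the XPath expression itself, XPath 1.0 allows a predicate such as //tbody/tr/td[5]/text()[normalize-space()], which selects only text nodes that contain non-whitespace characters. The selected values would still need .strip() afterwards, and this expression has not been checked against the page from the question.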





