Google Cloud BigQuery Library Error

I am receiving this error:

Cannot set destination table in jobs with DDL statements

when I try to resubmit a job built from the job._build_resource() function in the google.cloud.bigquery library.


It seems that the destination table is set to something like this after that function call:

'destinationTable': {'projectId': 'xxx', 'datasetId': 'xxx', 'tableId': 'xxx'},



Am I doing something wrong here? Thanks to anyone who can give me any guidance.



EDIT:



The job is initially being triggered by this


query = bq.query(sql_rendered)



We store the job id and use it later to check the status.



We get the job like this


job = bq.get_job(job_id=job_id)



If it meets a certain condition (in this case, the job failed because of rate limiting), we retry it like this


di = job._build_resource()
jo = bigquery.Client(project=self.project_client).job_from_resource(di)
jo._begin()
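Following the suggestion in the comments, one workaround is to strip the server-populated fields from the resource before resubmitting it. This is a minimal sketch, not confirmed against the library: the helper name is made up, and it assumes the destination table is only present because BigQuery filled it in on the completed job. The key names follow the BigQuery REST API jobs resource layout.

```python
import copy

# Hypothetical helper (not from the question): remove fields the service
# fills in on a finished job, so the resource can be resubmitted.
def strip_server_fields(resource):
    cleaned = copy.deepcopy(resource)
    # Drop server-populated, job-instance-specific fields.
    for key in ("id", "etag", "selfLink", "user_email", "statistics", "status"):
        cleaned.pop(key, None)
    # A DDL job may not carry a destination table; BigQuery adds one to the
    # completed job, which is what triggers the error on resubmission.
    cleaned.get("configuration", {}).get("query", {}).pop("destinationTable", None)
    return cleaned

# Example resource shaped like the one in the question (values illustrative).
resource = {
    "id": "my-project:US.job_abc",
    "statistics": {"totalBytesProcessed": "0"},
    "configuration": {
        "query": {
            "query": "CREATE TABLE dataset.t (x INT64)",
            "destinationTable": {
                "projectId": "xxx", "datasetId": "xxx", "tableId": "xxx"
            },
        }
    },
}

cleaned = strip_server_fields(resource)
```

The resubmission path would then pass the cleaned dict to job_from_resource() instead of the raw output of _build_resource().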



I think that's pretty much all of the code you need, but happy to provide more if needed.





Please share more of the code that you are using. It isn't possible to tell what is happening to cause the destination table to be set just from the error message.
– Elliott Brossard
2 days ago





Is there a reason that you don't resubmit the original job? It's kind of weird to submit the one returned by bq.get_job, since that contains unrelated attributes like query statistics and so on (and in this case the destination table).
– Elliott Brossard
2 days ago






The reason is that we are using Airflow, so the job_id is passed through XComs: we store it, grab it later, and then use it. Should we change that workflow?
– dillon
2 days ago






The (easy) alternative would be to strip down the job resource that you get back just to the relevant fields, but a better long-term solution may be to propagate the original job or query that you want to run. It's hard to say which is more reasonable given your current setup, though.
– Elliott Brossard
2 days ago
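The longer-term alternative described above, propagating the original query rather than the job resource, could be sketched like this. Everything here is hypothetical and illustrative: the payload keys, the job id, and the rateLimitExceeded check are assumptions, not taken from the question.

```python
# Hypothetical sketch: store the rendered SQL alongside the job id (e.g. in
# the Airflow XCom payload) so a retry can resubmit the query itself instead
# of rebuilding the failed job's resource.
payload = {"job_id": "job_abc", "sql": "CREATE TABLE dataset.t (x INT64)"}

# Later, on the retry path (commented out; requires a live BigQuery client):
#   job = bq.get_job(job_id=payload["job_id"])
#   if job.error_result and job.error_result.get("reason") == "rateLimitExceeded":
#       bq.query(payload["sql"])  # fresh job, no stale destinationTable
```

Because bq.query() builds a brand-new job configuration, none of the server-populated fields from the failed job (including the destination table) are carried over.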








