
"Please look into the error stream for more details."

ampkeegan opened this issue · 5 comments

I've been working on loading large data sets into BigQuery from a CSV in GCS. It works fine for some tables, but for others I get the following error:

bigquery.errors.JobExecutingException: Reason:invalid. Message:Error while reading data, error message: CSV table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the error stream for more details.

I have a feeling it's schema related, but I can't tell. How can I 'look into the error stream' to see the details? insertErrors isn't in my job object after it fails, and I don't see any error message other than what I get from printing the exception itself.

    # Kick off the load job from the GCS URI; no explicit schema is passed
    job = bqClient.client.import_data_from_uris([gsFile], dataset, table, schema=None)
    try:
        # Block until the job completes; raises on failure
        job_id, _results = bqClient.client.wait_for_job(job)
        print("Job ID: " + str(job_id))
        print("Results: " + str(_results))
    except Exception as e:
        print(str(e))
        print(str(job))

My code works great for some tables but not others, so I'm trying to find out what's wrong with this one particular table.

ampkeegan · Oct 05 '18

This sounds like a similar issue. There should be an errors property on the job.
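
In the raw BigQuery v2 job resource, those details live under status.errorResult and status.errors, so a minimal sketch along these lines should surface them (assuming job here is the job dict the client hands back):

    status = job.get('status', {})
    if 'errorResult' in status:
        # The fatal error that failed the job
        print(status['errorResult'])
        # The full "error stream" the message refers to, one entry per problem
        for err in status.get('errors', []):
            print(err)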

tylertreat · Oct 05 '18

I'm not seeing errors or insertErrors on my job.

I have another piece of code which does have errors on occasion:

    job = bqClient.client.push_rows(
        to_bq_jobs[path]['dataset'],
        to_bq_jobs[path]['table'],
        rowsToAdd[rowCount:topCount],
        #insert_id_key=to_bq_jobs[path]['id_to_match']
    )
    #print(str(job))
    # Streaming inserts report per-row failures under 'insertErrors'
    if 'insertErrors' in job:

And that lets me print out the errors. However, on this load job there is no errors or insertErrors key in the job dict.

The status dict has 'state': "RUNNING", and there isn't any error listed.

I tried changing to job_id, _results = bqClient.client.wait_for_job(job), which still throws the same error, and _results doesn't contain anything.

ampkeegan · Oct 08 '18

Can you dump the contents of the dict that gets returned?

tylertreat · Oct 08 '18

I've been testing with a newline-delimited JSON import, which fails with a similar error:

Reason:invalid. Message:Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the error stream for more details.
{
    "configuration": {
        "jobType": "LOAD",
        "load": {
            "destinationTable": {
                "datasetId": "redacted",
                "projectId": "redacted",
                "tableId": "stages_copy"
            },
            "schema": {
                "fields": [
                    {
                        "mode": "NULLABLE",
                        "name": "Stage_id",
                        "type": "INTEGER"
                    },
                    {
                        "mode": "NULLABLE",
                        "name": "Stage_Order",
                        "type": "INTEGER"
                    },
                    {
                        "mode": "NULLABLE",
                        "name": "Stage_Name",
                        "type": "STRING"
                    },
                    {
                        "mode": "NULLABLE",
                        "name": "Stage_Pipeline_id",
                        "type": "INTEGER"
                    }
                ]
            },
            "sourceFormat": "NEWLINE_DELIMITED_JSON",
            "sourceUris": [
                "gs://redacted/stages.json"
            ]
        }
    },
    "etag": "\"redacted/29pKp--d60WMuqds86QyFCCo47Q\"",
    "id": "redacted",
    "jobReference": {
        "jobId": "redacted",
        "location": "US",
        "projectId": "redacted"
    },
    "kind": "bigquery#job",
    "selfLink": "https://www.googleapis.com/bigquery/v2/projects/redacted?location=US",
    "statistics": {
        "creationTime": "1539023656058",
        "startTime": "1539023656543"
    },
    "status": {
        "state": "RUNNING"
    },
    "user_email": "redacted"
}
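
Possibly that dump is just the snapshot from when the job was inserted, since status.state is still RUNNING; the error stream would only show up once a later jobs.get reports the job as DONE. A minimal sketch of re-fetching the finished job with the raw API client (project and job IDs hypothetical, and assuming application default credentials are available):

    from googleapiclient.discovery import build

    # Re-fetch the job resource after the load has finished
    service = build('bigquery', 'v2')
    resource = service.jobs().get(projectId='my-project', jobId='my-job-id').execute()

    status = resource['status']
    if status['state'] == 'DONE' and 'errorResult' in status:
        # 'errors' is the error stream the failure message points to
        for err in status.get('errors', []):
            print(err)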


ampkeegan · Oct 08 '18

Our workaround is to surround all our job.result() calls with a try/except that prints out job.errors, but it would be really nice if the errors were just printed out so that we didn't have to do that!
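
For reference, a minimal sketch of that pattern with google-cloud-bigquery (bucket, dataset, and table names hypothetical):

    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON)
    job = client.load_table_from_uri(
        'gs://my-bucket/stages.json', 'my_dataset.stages_copy', job_config=job_config)
    try:
        job.result()  # blocks until done; raises if the load fails
    except Exception:
        # job.errors holds the full error stream, one dict per bad row/field
        for err in job.errors or []:
            print(err)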

bencaine1 · Oct 24 '18