
EOF within quoted string error

Open mwolfe07 opened this issue 7 years ago • 22 comments

I'm getting the "EOF within quoted string error" while trying to bulk query information from Salesforce.

query <- "SELECT Id, Name FROM Account"

AccountDF <- rforcecom.bulkQuery(session, 
                                 soqlQuery = query,
                                 object = 'Account')

So far it only does this when I use the Name field. It also reads in only 650k records, instead of the 2.5M I get when I omit the Name field.

Any help appreciated!

mwolfe07 avatar Apr 03 '17 15:04 mwolfe07

Any update on this? I still haven't been able to get around this error. Stack Overflow directed me here since it seems to be a package-specific issue. This is blocking me from successfully querying against the Account object, so figuring it out would help me greatly. I would essentially like to escape the single quotes. Please help!

mwolfe07 avatar Jun 07 '17 22:06 mwolfe07

@mwolfe07 With Salesforce it's always a challenge to reproduce errors, because we can't access your system, so you'll have to help troubleshoot. I suspect the error is raised at this line: https://github.com/hiratake55/RForcecom/blob/master/R/rforcecom.getBulkQueryResult.R#L57

The issue is probably a # symbol, unmatched quotes, or some other weird anomaly in the data. It would help if you break the bulk call down into its components and then fiddle with the function rforcecom.getBulkQueryResult().

In other words run the code below, then mess with rforcecom.getBulkQueryResult() until things work, and then report back your findings. Thanks!

# Login
my_username <- "PUT YOURS HERE"
my_password <- "PUT YOURS HERE"
session <- rforcecom.login(my_username, my_password)

# Specify Query
my_query <- "SELECT Id, Name FROM Account"
target_object <- "Account"

# Submit the Query
job_info <- rforcecom.createBulkJob(session, operation='query', object= target_object)
batch_query_info <- rforcecom.submitBulkQuery(session,
                                              jobId = job_info$id,
                                              query = my_query)

# Keep Checking Status. Wait until it's done!
batch_query_status <- rforcecom.checkBatchStatus(session,
                                                 jobId = batch_query_info$jobId,
                                                 batchId = batch_query_info$id)

# Once it's done, grab the details
batch_query_details <- rforcecom.getBatchDetails(session,
                                                 jobId = batch_query_info$jobId,
                                                 batchId = batch_query_info$id)

# THIS IS THE FUNCTION YOU NEED TO MODIFY AND PLAY AROUND WITH
batch_query_recordset <- rforcecom.getBulkQueryResult(session,
                                                      jobId = batch_query_info$jobId,
                                                      batchId = batch_query_info$id,
                                                      resultId = batch_query_details$result)

StevenMMortimer avatar Jun 08 '17 00:06 StevenMMortimer

Hey @ReportMort, thanks so much for responding. I'm running what you sent over. One quick question: is it normal for the state to be "InProgress" for a long time? The number of records processed and everything else is 0.

 > str(batch_query_status)
 List of 10
 $ id                     : chr "75113000009hntvAAA"
 $ jobId                  : chr "75013000005D7xQAAS"
 $ state                  : chr "InProgress"
 $ createdDate            : chr "2017-06-08T13:22:22.000Z"
 $ systemModstamp         : chr "2017-06-08T13:22:22.000Z"
 $ numberRecordsProcessed : chr "0"
 $ numberRecordsFailed    : chr "0"
 $ totalProcessingTime    : chr "0"
 $ apiActiveProcessingTime: chr "0"
 $ apexProcessingTime     : chr "0"

mwolfe07 avatar Jun 08 '17 13:06 mwolfe07

In addition, is rforcecom.bulkQuery the correct function to be using for this? If I just adjusted that one line in the function to read 'res <- read.csv(con, stringsAsFactors=FALSE, na.strings="", quote="")', would that potentially fix it?

mwolfe07 avatar Jun 08 '17 14:06 mwolfe07

Yes, your query might take a while, especially if the recordset is large. You can monitor the progress of all your API calls in the Salesforce UI for more details. Just go to your home link and add 750 at the end. For example, https://na26.salesforce.com/750

With regards to your second question, yes, I believe the issue could be fixed by a simple change to read.csv(). We will see...
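For intuition, here is a tiny reproduction with made-up data of how one stray double quote in a Name value makes read.csv warn "EOF within quoted string" and silently drop rows, and how quote = "" sidesteps it:

```r
# Row 001 contains an unmatched quote, so the default parser opens a quoted
# string that swallows the rest of the input and never closes.
raw <- 'Id,Name\n001,Acme "Premium Supply\n002,Beta Corp\n003,Gamma LLC'

bad  <- suppressWarnings(read.csv(textConnection(raw), stringsAsFactors = FALSE))
good <- read.csv(textConnection(raw), stringsAsFactors = FALSE,
                 na.strings = "", quote = "")

nrow(bad)   # fewer rows than the file actually contains
nrow(good)  # 3: every record survives, with the quote kept as a literal
```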

StevenMMortimer avatar Jun 08 '17 15:06 StevenMMortimer


Hey Mort, I'm still not sure what I'm doing wrong here. It spits out the ~300k lines that it read into batch_query_recordset, but omits the remaining 3.8M records. As probably expected, I got the same "EOF within quoted string" error when running getBulkQueryResult. The bulk data job in SF shows that 4.1M records were processed, no records failed, and the status was Completed. So it looks like the bulk job was successful but the reading in of it was not.

I then tried adjusting the function code you linked to, adding the quote="" argument to read.csv. That prompted a "could not find function rforcecom.api.getBulkEndPoint" error; that call is one of the first commands in the function, so maybe taking it and just running it in the console did something to it. I'm not sure.

I'm trying to run bulkQuery on other objects in SF that I need, and it's working about 75% of the time. I'm also sometimes running into an "Issue with batches submitted" error, and am trying to find a way around that.

Let me know if bulkQuery is the correct function for just querying a table and pulling it into an object.

Thanks again for your help!

mwolfe07 avatar Jun 08 '17 18:06 mwolfe07

rforcecom.api.getBulkEndPoint is not an exported function. Just add RForcecom::: in front of it like this:

RForcecom:::rforcecom.api.getBulkEndPoint

As far as things working 75% of the time, I'd recommend checking everything you submit to the API using the Salesforce UI so you can determine whether the issue is on the R side of things or on the Salesforce side of things.

StevenMMortimer avatar Jun 09 '17 17:06 StevenMMortimer

I think the problem is that I'm not sure how to interpret the data to see what is wrong. It looks like the API is working correctly and the issue is in the function, but I'm not really sure. Any ideas?

Also, if I were to go into that line of code within the RForcecom package loaded on my machine and add the quote = "" argument to read.csv, do you think that would fix it? And would you be able to guide me through trying that?

mwolfe07 avatar Jun 10 '17 23:06 mwolfe07

Try this:

res <- read.csv(con, stringsAsFactors=FALSE, na.strings="", quote="", comment.char="")

Please send back any error or warning messages after running that on the data. If you can get the data into a single text file and you don't mind emailing me that file ([email protected]), I can assist with figuring out how to read it into the R environment and why it's behaving a little funny; otherwise I'd suggest reading about the various options for reading data into R and how to address quirks in files.
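One session-only way to experiment with that change, without touching the installed package files, is base R's trace(). A sketch (the toy function f is made up; the commented lines show usage against the package function):

```r
# trace() can inject code into any function for the current session;
# untrace() restores the original. Demonstrated here on a toy function.
f <- function(x) x + 1
trace("f", tracer = quote(message("f was called")), print = FALSE)
f(1)          # still returns 2, but also emits the message
untrace("f")

# Against the package function, edit = TRUE opens your editor on its body
# so the read.csv line can be swapped for this session only:
# trace("rforcecom.getBulkQueryResult", edit = TRUE)
# ...rerun the query...
# untrace("rforcecom.getBulkQueryResult")
```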

StevenMMortimer avatar Jun 11 '17 04:06 StevenMMortimer

Hi Mort,

Thanks again for helping me out! Where exactly do I go in order to edit the function? I'm looking through the package files that downloaded to my local machine but can't seem to find any .R file to edit. And re: emailing you the file, I don't think I'll be able to, as it's my company's sensitive customer data :/. Otherwise I would jump at the chance of finding out what is causing this error.

mwolfe07 avatar Jun 12 '17 13:06 mwolfe07

I used trace to edit the rforcecom.getBulkQueryResult function and replaced the res line with the one above that you indicated. Here is the error message I received:

" Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
  duplicate 'row.names' are not allowed "

mwolfe07 avatar Jun 12 '17 15:06 mwolfe07

Add row.names=NULL to read.csv
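For intuition on why that helps (made-up data, not the real Account extract): with quote = "", a comma inside a Name now splits into an extra field, the header then names one fewer column than the data rows supply, and read.csv falls back to using column 1 as row names; duplicated values there raise exactly the error above.

```r
# The header has two names but each data row has three fields, so column 1
# becomes row names; the duplicated "Acme" values abort the read.
raw <- 'Name,Industry\nAcme,Software,CA\nAcme,Retail,NY'

# read.csv(textConnection(raw))  # Error: duplicate 'row.names' are not allowed
ok <- read.csv(textConnection(raw), row.names = NULL, stringsAsFactors = FALSE)
nrow(ok)  # 2: rows are simply numbered and column 1 stays ordinary data
```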

StevenMMortimer avatar Jun 12 '17 15:06 StevenMMortimer

This time it gave me no errors, but it still stopped at 299,902 records out of 4.1M

mwolfe07 avatar Jun 12 '17 16:06 mwolfe07

Great, no errors is a good thing. I suggest you continue interrogating that file and checking against the Salesforce UI to figure out where the records are being lost.

StevenMMortimer avatar Jun 12 '17 16:06 StevenMMortimer

I generally read in CSVs with fread from data.table.

Also, I saved the CSV file of just the names from SF (what I'm trying to read in), and was able to bring in the whole 4.1M by using the read.csv function.

mwolfe07 avatar Jun 12 '17 16:06 mwolfe07

Would you mind sharing some code of what finally worked? Something where you set the col.names argument based on the expected names?

StevenMMortimer avatar Jun 12 '17 16:06 StevenMMortimer

So I'm not really out of the woods, as it hasn't fully worked: it's not reading in the whole 4.1M. But the normal read.csv function does work when I download the data elsewhere and read it in from a CSV.

The call below doesn't give any errors but stops at 299k/4.1M; the goal is to figure out what's stopping it and get the whole 4.1M through this method.

res <- read.csv(con, stringsAsFactors = FALSE, na.strings = "", 
      quote = "", comment.char = "", row.names = NULL)

When I save the .csv file of 4.1M names, I can read it in via:

Accountsdf <- read.csv("name.csv", stringsAsFactors = FALSE, na.strings = "", 
         quote = "", comment.char = "", row.names = NULL)

I'm not sure what's missing or why the connection is stopping where it's stopping.

mwolfe07 avatar Jun 12 '17 17:06 mwolfe07

If you go directly to the result URL in your browser, do you get the full CSV? The URL I'm talking about is formed in the R code like this:

URL <- paste(session['instanceURL'], endpointPath, '/job/', jobId, '/batch/', batchId, '/result/', resultId, sep="")

If you get the full CSV, then it's some issue with how curlPerform reads from the URL. Maybe a max limit or something.
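If the browser does return the full file, one workaround sketch is to download the result to disk first and then parse the file, so any truncation shows up as a short file rather than silently lost rows. This assumes the httr package and the Bulk API's X-SFDC-Session header; the live lines are commented because they need the session and batch objects from earlier in the thread, and build_result_url is a hypothetical helper mirroring the paste() call above:

```r
# Hypothetical helper that rebuilds the result URL shown above.
build_result_url <- function(session, endpointPath, jobId, batchId, resultId) {
  paste(session['instanceURL'], endpointPath, '/job/', jobId,
        '/batch/', batchId, '/result/', resultId, sep = "")
}

# Live usage (requires a logged-in session):
# library(httr)
# endpointPath <- RForcecom:::rforcecom.api.getBulkEndpoint(session['apiVersion'])
# URL <- build_result_url(session, endpointPath,
#                         batch_query_info$jobId,
#                         batch_query_info$id,
#                         batch_query_details$result)
# resp <- GET(URL, add_headers('X-SFDC-Session' = session['sessionID']))
# writeBin(content(resp, as = "raw"), "bulk_result.csv")
# res <- read.csv("bulk_result.csv", stringsAsFactors = FALSE, na.strings = "",
#                 quote = "", row.names = NULL)
```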

StevenMMortimer avatar Jun 12 '17 18:06 StevenMMortimer

What is the endpointPath? Apologies for all of the questions.

mwolfe07 avatar Jun 12 '17 20:06 mwolfe07

You should be able to follow the code back through.

endpointPath <- RForcecom:::rforcecom.api.getBulkEndpoint(session['apiVersion'])
URL <- paste(session['instanceURL'], endpointPath, '/job/', jobId, '/batch/', batchId, '/result/', resultId, sep="")

session is the variable you passed into the function originally that you created using rforcecom.login

StevenMMortimer avatar Jun 12 '17 20:06 StevenMMortimer

Hi, sorry to resurrect a zombie thread. However, I'm having the same experience as the OP.

To answer your question, yes, the full file downloads when I visit the URL.

Pro tip: you will want your browser logged into Salesforce in order to use the download link.

Regarding the browser-downloaded file, if I read it with either read.csv() or fread() from data.table, it works fine. So maybe it's a problem with the way the data is being downloaded by httr::GET.

MichaelGriebe avatar Feb 04 '19 20:02 MichaelGriebe