Strings returned as factors

Open ax42 opened this issue 9 years ago • 5 comments

> columnTest <- rforcecom.query(session, "select Id, Name, Amount, CloseDate, IsWon from Opportunity limit 1")
> sapply(columnTest, class)
       Id      Name    Amount CloseDate     IsWon 
 "factor"  "factor" "numeric"  "factor"  "factor" 

I would expect:

> sapply(columnTest, class)
         Id        Name      Amount   CloseDate       IsWon 
"character" "character"   "numeric" "character"    "factor" 

And absolute bonus marks for actually getting CloseDate as a date.

Given how nasty "stringsAsFactors = TRUE" is in read.csv, I would suggest at least mimicking the behaviour of "stringsAsFactors = FALSE". Unfortunately, it will probably need a new parameter that defaults to the current behaviour (otherwise it will break old code).

ax42, Oct 21 '15 21:10

The issue is the call to type.convert() at https://github.com/hiratake55/RForcecom/blob/master/R/rforcecom.query.R#L59. The documentation for that function says:

Given a character vector, it attempts to convert it to logical, integer, numeric or complex, and failing that converts it to factor unless as.is = TRUE. The first type that can accept all the non-missing values is chosen
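
A minimal illustration of that documented behaviour, using made-up values rather than real Salesforce data (the variable names are only for this example):

```r
# Made-up values standing in for one text column and one numeric column
# of a query result; both arrive as character vectors.
names_raw   <- c("Acme", "Globex", "Initech")
amounts_raw <- c("100.5", "2000", "35")

# as.is = FALSE: text that cannot be parsed as a number becomes a factor.
names_as_factor <- type.convert(names_raw, as.is = FALSE)
class(names_as_factor)   # "factor"

# as.is = TRUE: the same text stays character.
names_as_char <- type.convert(names_raw, as.is = TRUE)
class(names_as_char)     # "character"

# Numeric-looking strings are converted to numeric either way.
amounts <- type.convert(amounts_raw, as.is = TRUE)
class(amounts)           # "numeric"
```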

There are a few options, I guess:

  1. Set the as.is argument of type.convert() equal to !default.stringsAsFactors() and then it's controlled globally
  2. Add a new argument to rforcecom.query that would allow user to specify function behavior
  3. Pull down Salesforce metadata on the fields and then format each column according to its Salesforce type (this would require a lot more overhead in the function but ensures consistency between the two systems)
  4. Use something other than type.convert
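
As a rough sketch of what option 2 might look like: the helper below is hypothetical (convert_columns is not part of RForcecom, and the sample row is made up). It only shows how a stringsAsFactors argument could be passed through to the as.is argument of type.convert() while keeping the current behaviour as the default.

```r
# Hypothetical helper, not part of RForcecom: converts every column of an
# all-character data frame, with the factor behaviour chosen by the caller.
convert_columns <- function(df, stringsAsFactors = TRUE) {
  df[] <- lapply(df, function(col) {
    type.convert(as.character(col), as.is = !stringsAsFactors)
  })
  df
}

# Made-up single-row result in which every value is still a string.
raw <- data.frame(
  Id     = "006A000000XyZ1",
  Name   = "Sample Opportunity",
  Amount = "1500.75",
  stringsAsFactors = FALSE
)

# Default keeps the behaviour reported in this issue (text becomes factor).
old_style <- convert_columns(raw)
sapply(old_style, class)

# Opting out keeps text columns as character.
new_style <- convert_columns(raw, stringsAsFactors = FALSE)
sapply(new_style, class)
```

Defaulting the argument to TRUE is what would keep existing code working; callers who want character columns would have to ask for them explicitly.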

StevenMMortimer, Oct 22 '15 04:10

This isn't a solution, but my workaround is to pipe the output of rforcecom.query into mutate_if(is.factor, as.character), which turns all of the factors into characters.
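
For anyone without dplyr, the same idea can be sketched in base R. The data frame here is a made-up stand-in for real query output, and the last step assumes CloseDate holds YYYY-MM-DD strings:

```r
# Made-up stand-in for a query result that came back with factor columns.
result <- data.frame(
  Name      = factor("Sample Opportunity"),
  Amount    = 1500.75,
  CloseDate = factor("2015-10-21")
)

# Turn every factor column into character; leave other columns untouched.
is_fct <- vapply(result, is.factor, logical(1))
result[is_fct] <- lapply(result[is_fct], as.character)

# Assumption: CloseDate holds YYYY-MM-DD strings, so it can be parsed directly.
result$CloseDate <- as.Date(result$CloseDate, format = "%Y-%m-%d")

sapply(result, class)
```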

Breza, Jul 21 '17 21:07

@Breza This may or may not be fixed depending on your version of RForcecom. Looking at current GitHub source code, it automatically returns strings as characters: https://github.com/hiratake55/RForcecom/blob/master/R/rforcecom.utils.R#L54

StevenMMortimer, Jul 22 '17 00:07

@StevenMMortimer This issue is not solved in the current CRAN and GitHub versions of RForcecom.

I can confirm that rforcecom.query() does return factors instead of character vectors when options(stringsAsFactors = TRUE) is set, which is the case in a default R environment.

"Add a new argument to rforcecom.query that would allow user to specify function behavior"

This would be a fantastic solution imho that does not require mucking around with global settings.

rrmn, May 13 '19 13:05

@RomanAbashin I have created my own fork of the package and the functionality you mention is available there: https://github.com/StevenMMortimer/salesforcer

More specifically, there is an argument in the sf_query() function called guess_types that is a logical allowing you to keep all of the columns as character datatypes or use the col_guess() magic of the readr package. That functionality is not yet on CRAN, but will be soon.

StevenMMortimer, May 15 '19 03:05