csonv.js

Request: Guessing data types

Open wandernauta opened this issue 13 years ago • 8 comments

The requirement that the second row of the CSV file has to contain data types means that you can't use csonv.js for existing CSV sources (legacy APIs you can't change, etc.). Wouldn't it be nice if Csonv could guess which data type you want based on the CSV file's contents?

For example...

name;books_owned;achievements
Alice;3.0;avidreader,commentator,spectator
Bob;1.0;ubermeister

The first column would be turned into a string, the second into a float (matches (\d+)\.(\d+)), and the third into an array (as it is a string with commas and no spaces). This guessing is what Excel does as well, I believe. It's not pretty, but it works most of the time.
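A minimal sketch of the guessing described above, assuming a semicolon-delimited file with a single header row. The `guessType` helper and its rules are illustrative only and are not part of csonv.js:

```javascript
// Hypothetical helper, not part of csonv.js: guess a type from one cell.
function guessType(cell) {
  if (/^\d+\.\d+$/.test(cell)) return "float"; // e.g. "3.0"
  // commas and no whitespace: treat as an array
  if (cell.indexOf(",") !== -1 && !/\s/.test(cell)) return "array";
  return "string";
}

var csv = "name;books_owned;achievements\n" +
          "Alice;3.0;avidreader,commentator,spectator\n" +
          "Bob;1.0;ubermeister";

var rows = csv.split("\n").map(function (line) { return line.split(";"); });

// rows[0] is the header; guess from the first data row only.
var guessed = rows[1].map(guessType);
console.log(guessed); // [ 'string', 'float', 'array' ]
```

Note that Bob's `ubermeister` contains no comma, so guessing from his row alone would give a string for the third column. Which rows are inspected matters.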

What do you think?

wandernauta, Jun 25 '11 07:06

Would you guess based only on the first row?

arextar, Aug 18 '11 15:08

Yes, either that or some kind of best-fitting type logic, i.e. '1.3', 'foo' becoming strings and '1', '4.2' becoming floats, if that makes any sense. The latter seems harder to implement.
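A rough sketch of that best-fitting logic, assuming only integers, floats and strings need to be told apart. `scalarType` and `bestFit` are hypothetical names, not existing csonv.js functions:

```javascript
// Hypothetical sketch: classify one value, then pick the narrowest
// type that fits every value in a column.
function scalarType(value) {
  if (/^\d+$/.test(value)) return "integer";
  if (/^\d*\.\d+$/.test(value)) return "float";
  return "string";
}

function bestFit(values) {
  var types = values.map(scalarType);
  if (types.indexOf("string") !== -1) return "string"; // anything non-numeric wins
  if (types.indexOf("float") !== -1) return "float";   // one float widens the column
  return "integer";
}

console.log(bestFit(["1.3", "foo"])); // string
console.log(bestFit(["1", "4.2"]));   // float
```

The whole column has to be read before a type can be chosen, which is the extra cost compared with looking at the first row only.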

On Thursday, 18 August 2011, arexkun wrote:

Would you guess based only on the first row?

wandernauta, Aug 26 '11 12:08

One major problem I see with detecting the type is the fact that booleans are integers. The only way to tell them apart is if there is an empty entry. Complete type guessing is not possible when two types are indistinguishable. Other than the booleans it would actually be quite easy.

arextar, Aug 26 '11 12:08

Using 'true' and 'false'/'' like YAML does instead of 1 and 0 for booleans would solve that, but it's not what the README says.
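To illustrate why this removes the ambiguity, here is a small sketch assuming booleans are written as 'true', 'false' or an empty cell. The function names are made up for the example:

```javascript
// Hypothetical sketch: a column is boolean only if every cell is
// "true", "false" or empty.
function isBooleanColumn(values) {
  return values.every(function (v) {
    return v === "true" || v === "false" || v === "";
  });
}

// "false" and "" both become false.
function toBoolean(cell) {
  return cell === "true";
}

console.log(isBooleanColumn(["true", "", "false"])); // true
console.log(isBooleanColumn(["1", "0", "1"]));       // false
```

With 1 and 0, the second column could equally be a list of integers. One remaining edge case under this assumption: a column made up entirely of empty cells would also be classified as boolean.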

wandernauta, Aug 26 '11 13:08

If the author wants to switch to using true/false then it's very possible to guess types.

@archan937 : Would you be willing to switch to true/false?

arextar, Aug 26 '11 14:08

If they do decide to switch to true/false, here's a quick function I wrote up to test data types:

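var r_integer=/^-?\d+$/,      // whole string is digits, optional sign
    r_float=/^-?\d*\.\d+$/    // dot escaped and pattern anchored: "3.0", ".5"

// Returns "integer", "float", "boolean" or "string". For a comma-separated
// list it returns the type shared by the items, or "string" if they don't mix.
function type(str){
    var t,ty,x,l;

    if(~str.indexOf(",")){
        l=(str=str.split(",")).length;

        for(x=0;x<l;x++){
            t=type(str[x]);
            if(t=="string") return "string";
            // booleans can't be mixed with numbers
            if(ty&&(t=="boolean")!=(ty=="boolean")) return "string";
            // a float anywhere widens the list to float
            if(t=="float"||!ty) ty=t;
        }

        return ty;
    }

    if(r_integer.test(str)){
        t="integer";
    }
    else if(r_float.test(str)){
        t="float";
    }
    else if(str=="true"||str=="false"){
        t="boolean";
    }
    else
    {
        t="string";
    }

    return t;
}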

arextar, Aug 26 '11 14:08

Inferring is nifty, but I wouldn't mind if I could just pass in that line of definitions as a parameter. That would still allow me to do relational lookups and specify plurals.

chrishaff, Nov 23 '12 16:11

Being able to pass in an array as a parameter would make for a quick improvement. Something like:

var keyTypes = ["integer","string","string","strings","boolean"];

Although having the script guess the type would be the best option, being able to pass in a parameter to override the guess would be handy in the event that you want to pass an integer as a string or whatnot.
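One way the override could work, sketched under the assumption that guessed types and the `keyTypes` array are both indexed by column. `resolveTypes` is a hypothetical helper and does not reflect csonv.js's actual API:

```javascript
// Hypothetical sketch, not csonv.js's API: an explicit type, where given,
// wins over the guessed one for that column.
function resolveTypes(guessedTypes, keyTypes) {
  return guessedTypes.map(function (guess, i) {
    return (keyTypes && keyTypes[i]) || guess;
  });
}

var guessedTypes = ["integer", "string", "float"];

// Force the first column to be read as a string; keep the other guesses.
console.log(resolveTypes(guessedTypes, ["string"])); // [ 'string', 'string', 'float' ]

// No overrides: the guesses are used as they are.
console.log(resolveTypes(guessedTypes)); // [ 'integer', 'string', 'float' ]
```

Leaving an entry out of `keyTypes` falls back to the guess, so a caller only has to list the columns where guessing gets it wrong.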

ghost, Jan 08 '14 21:01