
Add ability to format the result

Open gajus opened this issue 8 years ago • 4 comments

There has been a request to add a "formatting" ability like the one in the scrape-it library.

It's documented as:

convert (Function): An optional function to change the value.

Example:

{
   articles: {
       listItem: ".article"
     , data: {
           createdAt: {
               selector: ".date"
+             , convert: x => new Date(x)
           }
         , title: "a.article-title"
         , tags: {
               listItem: ".tags > span"
           }
         , content: {
               selector: ".article-content"
             , how: "html"
           }
       }
   }
}

Considerations:

  • Need to consider how this integrates with validation (does formatting happen before or after validation?)
  • What's the API?
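
To make the ordering question concrete, here is a minimal sketch of the two options for a date field. All function names are hypothetical and exist only for illustration; none of them are part of surgeon or scrape-it.

```javascript
// Hypothetical validators and formatter, for illustration only.
const isIsoDateString = value => /^\d{4}-\d{2}-\d{2}$/.test(value);
const toDate = value => new Date(value);
const isValidDate = value => value instanceof Date && !Number.isNaN(value.getTime());

// Option A: validate the raw extracted string, then format it.
function validateThenFormat(raw) {
  if (!isIsoDateString(raw)) {
    throw new Error('Unexpected date format: ' + raw);
  }
  return toDate(raw);
}

// Option B: format first, then validate the formatted value.
function formatThenValidate(raw) {
  const formatted = toDate(raw);
  if (!isValidDate(formatted)) {
    throw new Error('Invalid date: ' + raw);
  }
  return formatted;
}
```

With option A the validator sees the text exactly as scraped, so it can enforce the page's format. With option B the validator only sees what the formatter produced, so anything the formatter can parse passes.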

gajus · Jan 17 '17 16:01

Re: API, I've toyed with the idea of passing arrays to indicate selector + transforms, a la createdAt: ['.date', x => new Date(x)]. IMO, it's easier to read than createdAt: { selector: ".date", convert: x => new Date(x) }, especially when you have many transforms in your schema.
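
As a rough sketch of how both notations could coexist, the snippet below normalizes a plain selector string, the array shorthand, and the object form into one internal shape. The helpers normalizeField and applyTransforms are hypothetical names invented for this example, not part of surgeon or scrape-it.

```javascript
// Hypothetical helper: normalizes the three field notations into
// { selector, transforms }. Not an existing surgeon or scrape-it API.
function normalizeField(field) {
  if (typeof field === 'string') {
    return { selector: field, transforms: [] };
  }
  if (Array.isArray(field)) {
    // Array shorthand: first element is the selector, the rest are transforms.
    const [selector, ...transforms] = field;
    return { selector, transforms };
  }
  // Object form with an optional `convert` function.
  const { selector, convert } = field;
  return { selector, transforms: convert ? [convert] : [] };
}

// Hypothetical helper: applies transforms left to right to an extracted value.
function applyTransforms(value, transforms) {
  return transforms.reduce((acc, fn) => fn(acc), value);
}

const shorthand = normalizeField(['.date', x => x.trim(), x => new Date(x)]);
const verbose = normalizeField({ selector: '.date', convert: x => new Date(x) });
const created = applyTransforms(' 2017-01-17 ', shorthand.transforms);
```

One advantage of the array form in this sketch is that chaining several transforms needs no extra syntax, whereas the object form would need convert to accept an array or a composed function.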

sllvn · Jan 18 '17 06:01

Let's say you select all links in a document and want to filter out duplicates: sm a|ra href

Any user-defined subroutine is called once per item in the array, not on the array as a whole, right? (Nor is it called as a reducer?) So I cannot make a subroutine to sort and remove duplicates from the array. Or a subroutine to flatten the array.
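
The distinction being asked about can be shown in plain JavaScript, independent of how surgeon actually dispatches subroutines (which this sketch does not assume). The hrefs array is made-up sample data.

```javascript
// Made-up sample data standing in for extracted href values.
const hrefs = ['/b', '/a', '/b'];

// Per-item call: the function sees one value at a time, so it can
// transform each value but cannot know about duplicates elsewhere.
const perItem = hrefs.map(href => href.toUpperCase());

// Whole-array call: the function sees every value at once, so it can
// deduplicate and sort.
const dedupeAndSort = values => [...new Set(values)].sort();
const wholeArray = dedupeAndSort(hrefs);
```

Here perItem still has three entries, while wholeArray has two, which is the behavior a per-item subroutine cannot provide on its own.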

ComLock · Oct 16 '18 11:10

It can be done if the subroutine combines select and read :)

sl: (subject, v, b) => selectSubroutine(subject, ['a', '{0,}'], b).map(match => readSubroutine(match, ['attribute', 'href'], b))

ComLock · Oct 16 '18 11:10

Wow, powerful stuff:

function sortAndRemoveDups(arr) {
	const sorted = [...arr].sort(); // copy first so the caller's array is not mutated
	const uniq = [];
	let prev = null;
	for (let i = 0; i < sorted.length; i += 1) {
		if (sorted[i] !== prev) { uniq.push(sorted[i]); }
		prev = sorted[i];
	}
	return uniq;
}

...

slb: (s, v, b) => sortAndRemoveDups(selectSubroutine(s, [v.concat('a:not([href^="#"])').join(' '), '{0,}'], b).map(m => readSubroutine(m, ['attribute', 'href'], b)))

...

allRealLinksUnderBody: slb body

ComLock · Oct 16 '18 12:10