
Pull Request

Open quink opened this issue 14 years ago • 3 comments

Some fixes and stuff.

Aug 07 '11 22:08 quink

This looks like a win. @arshaw: Merge?

Dec 25 '11 00:12 EmilStenstrom

I don't know which update is the cause, but this code does not work with patterns involving lists of dictionaries. For example, the following pattern

{* <a href='{{ [links].url }}'>{{ [links].title }}</a> *}

yields this result when run on http://www.google.com/ :

{'links': [{'title': u'SearchImagesMapsPlayYouTubeNewsGmailDriveMore \u25bcCalendarTranslateMobileBooksOffersWalletShoppingBloggerReaderFinancePhotosVideosEven more \xbbWeb HistorySign inSearch settingsInstall Google ChromeAdvanced searchLanguage toolsAdvertising\xa0ProgramsBusiness Solutions+GoogleAbout GooglePrivacy & Terms',
   'url': u'http://www.google.com/webhp?hl=en&tab=wwhttp://www.google.com/imghp?hl=en&tab=wihttp://maps.google.com/maps?hl=en&tab=wlhttps://play.google.com/?hl=en&tab=w8http://www.youtube.com/?tab=w1http://news.google.com/nwshp?hl=en&tab=wnhttps://mail.google.com/mail/?tab=wmhttps://drive.google.com/?tab=wohttp://www.google.com/intl/en/options/https://www.google.com/calendar?tab=wchttp://translate.google.com/?hl=en&tab=wThttp://www.google.com/mobile/?tab=wDhttp://books.google.com/bkshp?hl=en&tab=wphttps://www.google.com/offers/home?utm_source=xsell&utm_medium=products&utm_campaign=sandbar&tab=wG#!detailshttps://wallet.google.com/manage/?tab=wahttp://www.google.com/shopping?hl=en&tab=wfhttp://www.blogger.com/?tab=wjhttp://www.google.com/reader/?hl=en&tab=wyhttp://www.google.com/finance?tab=wehttp://picasaweb.google.com/home?hl=en&tab=wqhttp://video.google.com/?hl=en&tab=wvhttp://www.google.com/intl/en/options/http://www.google.com/history/optout?hl=enhttps://accounts.google.com/ServiceLogin?hl=en&continue=http://www.google.com//preferences?hl=en/chrome/index.html?hl=en&brand=CHNG&utm_source=en-hpp&utm_medium=hpp&utm_campaign=en/advanced_search?hl=en&authuser=0/language_tools?hl=en&authuser=0/intl/en/ads//services/intl/en/about.html/intl/en/policies/'}]}

With a pattern where each dictionary has only one key

{* <a href='{{ [links].url }}'></a> *}

it returns the expected result:

{'links': [{'url': u'http://www.google.com/webhp?hl=en&tab=ww'},
  {'url': u'http://www.google.com/imghp?hl=en&tab=wi'},
  {'url': u'http://maps.google.com/maps?hl=en&tab=wl'},
  {'url': u'https://play.google.com/?hl=en&tab=w8'},
  {'url': u'http://www.youtube.com/?tab=w1'},
  {'url': u'http://news.google.com/nwshp?hl=en&tab=wn'},
  {'url': u'https://mail.google.com/mail/?tab=wm'},
  {'url': u'https://drive.google.com/?tab=wo'},
  {'url': u'http://www.google.com/intl/en/options/'},
  {'url': u'https://www.google.com/calendar?tab=wc'},
  {'url': u'http://translate.google.com/?hl=en&tab=wT'},
  {'url': u'http://www.google.com/mobile/?tab=wD'},
  {'url': u'http://books.google.com/bkshp?hl=en&tab=wp'},
  {'url': u'https://www.google.com/offers/home?utm_source=xsell&utm_medium=products&utm_campaign=sandbar&tab=wG#!details'},
  {'url': u'https://wallet.google.com/manage/?tab=wa'},
  {'url': u'http://www.google.com/shopping?hl=en&tab=wf'},
  {'url': u'http://www.blogger.com/?tab=wj'},
  {'url': u'http://www.google.com/reader/?hl=en&tab=wy'},
  {'url': u'http://www.google.com/finance?tab=we'},
  {'url': u'http://picasaweb.google.com/home?hl=en&tab=wq'},
  {'url': u'http://video.google.com/?hl=en&tab=wv'},
  {'url': u'http://www.google.com/intl/en/options/'},
  {'url': u'http://www.google.com/history/optout?hl=en'},
  {'url': u'https://accounts.google.com/ServiceLogin?hl=en&continue=http://www.google.com/'},
  {'url': u'/preferences?hl=en'},
  {'url': u'/chrome/index.html?hl=en&brand=CHNG&utm_source=en-hpp&utm_medium=hpp&utm_campaign=en'},
  {'url': u'/advanced_search?hl=en&authuser=0'},
  {'url': u'/language_tools?hl=en&authuser=0'},
  {'url': u'/intl/en/ads/'},
  {'url': u'/services/'},
  {'url': u'/intl/en/about.html'},
  {'url': u'/intl/en/policies/'}]}
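
For comparison, the shape I would expect from the two-key pattern is one dictionary per link, each with its own url and title. Here is a minimal sketch of that expected shape using only Python 3's standard html.parser. It is not scrapemark's implementation; LinkCollector and expected_links are hypothetical names, and the sample HTML is made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: builds one {'url', 'title'} dict per <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # dict being filled while inside an <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Start a fresh dict for every anchor instead of appending to one.
            self._current = {"url": dict(attrs).get("href", ""), "title": ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.links.append(self._current)
            self._current = None


def expected_links(html):
    """Return the result in the same outer shape as the output above."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return {"links": parser.links}


# Made-up sample input, not the Google homepage.
sample = "<a href='/a'>First</a> <a href='/b'>Second</a>"
result = expected_links(sample)
print(result)
```

The point is only that each anchor should produce its own entry, rather than every url and title being concatenated into a single dictionary as in the first result.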

Dec 18 '12 21:12 bsidhom

I know this thread is ancient, and I'm sorry for being so uninvolved with this project since its launch more than 3 years ago, but I just wanted to announce that Scrapemark is no longer being maintained. I wrote a blog post reflecting on it: http://blog.arshaw.com/1/post/2013/03/reflecting-on-scrapemark.html

@quink, looking back, this should have been test-driven development; you got that right. Looks like you got to know the code pretty well. Thanks for all these changes, and sorry I never merged them.

Mar 24 '13 09:03 arshaw