how to handle events that require registration?

Open eads opened this issue 7 years ago • 8 comments

@mwgalloway raised this issue. @diaholliday might have some thoughts about how to structure this, as the Open Civic Data specification doesn't account for it.

eads avatar Aug 09 '17 00:08 eads

Have we come across an example that can be posted here?

diaholliday avatar Aug 09 '17 00:08 diaholliday

@diaholliday The CPS Board of Education page, which I'm building a scraper for, is what brought up the question. Below the meeting schedule there are registration instructions.

mwgalloway avatar Aug 09 '17 00:08 mwgalloway

I think that particular registration may be for speakers (not general attendance), but I'd like to log it in either case. Can you mark it down in the notes section of the Excel sheet? If we see it pop up in more events, I may want to create a new column on the Excel sheet to keep track of them (and eventually include a button/box on the calendar).

diaholliday avatar Aug 09 '17 00:08 diaholliday

Is it possible to use the HttpAuthMiddleware class in Scrapy? It lets a spider authenticate with an HTTP username and password.

https://doc.scrapy.org/en/latest/topics/downloader-middleware.html#scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware

Alternatively, BeautifulSoup is compatible with Scrapy, but it only parses HTML, so any HTTP auth would still have to be handled by whatever makes the request.

https://doc.scrapy.org/en/latest/faq.html?highlight=beautifulsoup#faq-scrapy-bs-cmp

csethna avatar Aug 09 '17 04:08 csethna

@csethna In this case the registration isn't for scraping the site; it's for events that require some form of registration to attend. That seems relevant to Documenters and reporters, as the Board of Ed site says "Advance registration is available for speakers and observers."

eads avatar Aug 09 '17 04:08 eads

My instinct is to extend the Open Civic Data format (and suggest they add this as an optional field) with a boolean "registration required" field. But registration almost always involves some strange and curious process, so we need a place to put more detailed, human-readable registration instructions. That could just be part of the description field or something separate.

eads avatar Aug 09 '17 04:08 eads

How about we extend the OCD model with a new object, registration:

{
    '_type': 'event',

    # other fields

    'registration': {
        'required': True,
        'registration_url': 'https://domain.test/events/register',
        'info': 'Info about registration requirements',
        'info_url': 'https://domain.test/how_to_register_for_events',
    },
}

For events that don't need registration we could leave out most of the new fields:

{
    '_type': 'event',

    # other fields

    'registration': {
        'required': False,
    },
}
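
A minimal sketch of how a scraper might fill in this object, assuming a hypothetical helper named `build_registration` (it is not part of the project or of the OCD spec). It keeps `required` on every event and leaves out any optional field the scraper could not find, which matches the two shapes above:

```python
from typing import Optional


def build_registration(
    required: bool,
    registration_url: Optional[str] = None,
    info: Optional[str] = None,
    info_url: Optional[str] = None,
) -> dict:
    """Return a 'registration' object shaped like the proposal above.

    Hypothetical helper for illustration only.
    """
    registration = {'required': required}
    optional_fields = {
        'registration_url': registration_url,
        'info': info,
        'info_url': info_url,
    }
    # Only include the optional fields that were actually found on the page.
    registration.update(
        {key: value for key, value in optional_fields.items() if value is not None}
    )
    return registration


# An event with registration instructions but no dedicated sign-up URL.
event = {
    '_type': 'event',
    # other fields
    'registration': build_registration(
        required=True,
        info='Advance registration is available for speakers and observers.',
    ),
}

# An event that needs no registration carries only the 'required' flag.
open_event = {
    '_type': 'event',
    # other fields
    'registration': build_registration(required=False),
}
```

Leaving absent fields out rather than setting them to `None` is a design choice in this sketch; consumers would then need to use `.get()` when reading the optional keys.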

jim avatar Aug 23 '17 15:08 jim

@eads This has come up with the ward night scraper as well. What do you think of adding fields as described above?

jim avatar Oct 13 '17 01:10 jim