kg-covid-19
Panacea Lab COVID19 Twitter Chatter
I'd like to work on mapping social media chatter from Twitter to the OBOs, which we could then parse into edge lists. I think it would be interesting to capture things like symptoms and map them to exposures and phenotypes.
I'd like to prioritize the data mined by Juan Banda's Lab: https://github.com/thepanacealab/covid19_twitter
This doesn't have to be a priority, but I do think it could be really interesting!
@callahantiff This just popped on my radar. Happy to help with this. Let me know how you are planning on processing this.
I would love to work together on this! I'll email you to see if there is a time we could chat. I think it would be good to pow-wow about existing tools for normalizing to OBOs and maybe outline a plan for what to focus on first!
Is this similar to what we'd do for CORD-19, i.e. information entity (tweet) mentions entity (OBO class or gene or ...)?
There are specific properties we might like to include on the tweet, e.g. time. Not sure how these properties would be used in ML, but they are certainly useful for display/querying.
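To make the "tweet mentions entity" idea concrete, here is a minimal sketch of how tweets could be turned into node and edge rows, with the timestamp kept as a node property. Everything in it is an assumption for illustration: the tweet IDs, the timestamps, the column names (`id`, `category`, `created_at`, `subject`, `predicate`, `object`) and the `mentions` predicate label are hypothetical and not the project's actual schema.

```python
import csv
import io

# Hypothetical example data: IDs and timestamps are invented for illustration.
# HP:0001945 (Fever) and HP:0012735 (Cough) stand in for normalized OBO classes.
tweets = [
    {"id": "tweet:1", "created_at": "2020-04-07T11:22:00Z",
     "mentions": ["HP:0001945", "HP:0012735"]},
    {"id": "tweet:2", "created_at": "2020-04-08T09:05:00Z",
     "mentions": ["HP:0001945"]},
]


def to_nodes_and_edges(tweets):
    """Build one node per tweet (with its time) and one 'mentions' edge per entity."""
    nodes, edges = [], []
    for tweet in tweets:
        nodes.append({
            "id": tweet["id"],
            "category": "InformationContentEntity",  # assumed category label
            "created_at": tweet["created_at"],       # time kept as a node property
        })
        for curie in tweet["mentions"]:
            edges.append({
                "subject": tweet["id"],
                "predicate": "mentions",             # assumed predicate label
                "object": curie,
            })
    return nodes, edges


def to_tsv(rows, fieldnames):
    """Serialize a list of dicts as tab-separated text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


nodes, edges = to_nodes_and_edges(tweets)
nodes_tsv = to_tsv(nodes, ["id", "category", "created_at"])
edges_tsv = to_tsv(edges, ["subject", "predicate", "object"])
```

Keeping the timestamp on the tweet node rather than on the edge means it is available for display and querying without affecting the edge list used for graph learning.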
That's what I was initially thinking, perhaps with some emphasis on things like symptoms. It would be great if we can draw some correlation to reported outcomes as well (when/if they exist).
There are multiple ways we can parse this:
- Information Content Entity -> OBO terms
- Tweet -> Phenotypes -> Phenopackets
- Tweets over time -> time series (might be brittle)
It would rely on some form of NER first
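As a rough illustration of the NER step feeding the first and third options above, here is a minimal sketch using plain dictionary lookup. This is a stand-in for a real NER/normalization tool, not a proposal for one: the lexicon, the example tweets and the function names `tag` and `daily_counts` are all hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical mini-lexicon mapping surface forms to HPO classes.
# A real pipeline would load labels and synonyms from the ontologies.
LEXICON = {
    "fever": "HP:0001945",
    "cough": "HP:0012735",
    "loss of smell": "HP:0000458",
}

# Longest terms first so multi-word phrases win over shorter overlaps.
PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(term) for term in sorted(LEXICON, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def tag(text):
    """Return the sorted, de-duplicated CURIEs whose surface forms occur in text."""
    return sorted({LEXICON[m.group(1).lower()] for m in PATTERN.finditer(text)})


def daily_counts(tweets):
    """Count tweets per (day, CURIE) from (ISO timestamp, text) pairs."""
    counts = Counter()
    for created_at, text in tweets:
        day = datetime.strptime(created_at, "%Y-%m-%dT%H:%M:%SZ").date().isoformat()
        for curie in tag(text):
            counts[(day, curie)] += 1
    return counts


# Invented example tweets, for illustration only.
example = [
    ("2020-04-07T11:22:00Z", "Fever and a dry COUGH since Monday"),
    ("2020-04-07T18:40:00Z", "Still have a fever, and now loss of smell"),
    ("2020-04-08T09:05:00Z", "Feeling feverish but fine"),
]
counts = daily_counts(example)
```

Exact string matching like this misses misspellings, slang and negation ("no fever"), which is part of why the time-series option might be brittle without a stronger NER step.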
Interesting discussion. I can harp on time modeling here as well :) That topic is already being discussed in another thread, but at a minimum, storing the tweet's source time point could help: less for graph learning, more for search and data science/modeling ...
covidscholar.com is also interested in tweets, but they have their hands full and I haven't shared the Twitter corpus that Tiffany shared ... In theory we can use their NLP effort to enrich our graph, but let's take a stab first to see how this plays out.
best, marcin
On Tue, Apr 7, 2020 at 11:22 AM Deepak wrote:
> There are multiple ways we can parse this:
> - Information Content Entity -> OBO terms