ehr_deidentification
Shifting dates in the notes
Hi! Thanks for open-sourcing the code and your models! It's very useful!
I want to automatically replace the dates found in the notes with shifted dates (e.g. shifted by 1 day). I tried adding a custom deid_strategy that returns the original date text and nothing for other entity types, then converting the date string to a datetime, adding 1 day, and converting it back to a string (in TextDeid.__get_deid_text), using the code below. The model seems to find all types of dates, including days of the week, e.g. "Friday".
Is there a better way to replace the dates automatically using your code?
from datetime import datetime, timedelta

def string2dates(date_text):
    # Try each known date format in turn; return None if none of them match.
    formats = ['%m/%d/%Y', '%m.%d.%Y', '%m/%d/%y', '%m.%d.%y',
               '%Y-%m-%d', '%y-%m-%d', '%m-%d-%Y', '%m-%d-%y',
               '%b %d %Y', '%B %d, %Y', '%b %d %y', '%B %d, %y']
    parsed_date = None
    for fmt in formats:
        try:
            parsed_date = datetime.strptime(date_text, fmt)
            break
        except ValueError:
            pass
    return parsed_date
# Inside TextDeid.__get_deid_text:
if deid_strategy == 'replace_informative' and not age_unchanged:
    deid_text = deid_text[:start_pos] + deid_tag.format(note_text[start_pos:end_pos]) + deid_text[end_pos:]
elif deid_strategy == 'shift_dates':
    if tag == "DATE":
        date_dt = string2dates(deid_tag.format(note_text[start_pos:end_pos]))
        if date_dt:
            # Parsed successfully: shift by one day and write it back as YYYY-MM-DD.
            new_date_dt = date_dt + timedelta(days=1)
            deid_text = deid_text[:start_pos] + str(new_date_dt.date()) + deid_text[end_pos:]
        else:
            # No format matched (e.g. "Friday"): fall back to the deid_tag formatting.
            deid_text = deid_text[:start_pos] + deid_tag.format(note_text[start_pos:end_pos]) + deid_text[end_pos:]
    else:
        deid_text = deid_text[:start_pos] + deid_tag.format(note_text[start_pos:end_pos]) + deid_text[end_pos:]
else:
    deid_text = deid_text[:start_pos] + deid_tag + deid_text[end_pos:]
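
One possible refinement, as a minimal sketch and not something the repository provides: remember which format matched so the shifted date can be written back in that same format instead of always as YYYY-MM-DD, and handle bare weekday names such as "Friday" separately. The names `DATE_FORMATS`, `WEEKDAYS`, `shift_date_text` and `shift_weekday` are hypothetical and exist only for this illustration; the format list is the one from string2dates.

```python
from datetime import datetime, timedelta

# Same formats as in string2dates.
DATE_FORMATS = ['%m/%d/%Y', '%m.%d.%Y', '%m/%d/%y', '%m.%d.%y',
                '%Y-%m-%d', '%y-%m-%d', '%m-%d-%Y', '%m-%d-%y',
                '%b %d %Y', '%B %d, %Y', '%b %d %y', '%B %d, %y']

WEEKDAYS = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
            'Friday', 'Saturday', 'Sunday']


def shift_date_text(date_text, days=1):
    """Hypothetical helper: shift a date string, keeping the format it was written in.

    Returns None when no format in DATE_FORMATS matches.
    """
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(date_text, fmt)
        except ValueError:
            continue
        # Re-serialize with the format that matched.
        return (parsed + timedelta(days=days)).strftime(fmt)
    return None


def shift_weekday(text, days=1):
    """Hypothetical helper: shift a bare English weekday name, e.g. "Friday".

    Returns None when the text is not a full weekday name.
    """
    name = text.strip().capitalize()
    if name not in WEEKDAYS:
        return None
    return WEEKDAYS[(WEEKDAYS.index(name) + days) % 7]
```

In the shift_dates branch, one could try `shift_date_text(...)` first, then `shift_weekday(...)`, and fall back to the tag only when both return None. Note that `strftime` zero-pads, so an input like "1/5/2020" would come back as "01/06/2020", and abbreviated weekdays ("Fri") are not covered by this sketch.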