exp run: date values are not treated as strings
Bug Report
Description
Parameters stored in a JSON file that look like ISO dates (yyyy-mm-dd) appear to be parsed automatically as date values by exp run, which then fails when writing them back to JSON.
Reproduce
Clone this test repo and try the following:
$ dvc repro dostuff --force # originally, the parameter is "2021-01-01"
Running stage 'dostuff':
> python dostuff.py
Param is: 2021-01-01
Use `dvc push` to send your updates to remote storage.
$ dvc exp run -S params.json:testparam="2021" # set to something else -- works
Running stage 'dostuff':
> python dostuff.py
Param is: 2021
Updating lock file 'dvc.lock'
...
$ dvc exp run -S params.json:testparam="2021-02-01" # set to a date -- fails
ERROR: unexpected error - Object of type date is not JSON serializable
...
Both 2021 and 2021-02 work and are parsed as an integer and a string, respectively.
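For anyone trying to picture the failure, here is a minimal Python sketch of the behaviour described above. The `guess_type` and `try_dump` helpers are hypothetical stand-ins written for this illustration, not DVC's actual parser; they only assume that a bare value is typed implicitly (integer, date, or string) before being written out with the standard `json` module.

```python
import datetime
import json
import re

# Hypothetical stand-in for implicit typing of a bare scalar.
# This is NOT DVC's parser; it only mirrors the behaviour reported above.
_DATE_RE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")


def guess_type(raw):
    """Return an int, a date, or the unchanged string for a bare scalar."""
    if raw.isdigit():
        return int(raw)
    match = _DATE_RE.match(raw)
    if match:
        year, month, day = (int(part) for part in match.groups())
        return datetime.date(year, month, day)
    return raw


def try_dump(raw):
    """Serialize {"testparam": <guessed value>} and report the outcome."""
    try:
        return json.dumps({"testparam": guess_type(raw)})
    except TypeError as exc:
        return f"ERROR: {exc}"


for raw in ("2021", "2021-02", "2021-02-01"):
    print(raw, "->", try_dump(raw))
```

Under these assumptions the first two values serialize fine (as an integer and a string), while the full date reproduces the "Object of type date is not JSON serializable" message, because the `json` module has no default encoding for `datetime.date`.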
Expected
Things should not be parsed as dates when I didn't say so; or at least, if they are parsed, this should work. Otherwise, I'd expect everything to be passed as an uninterpreted string.
Environment information
Output of dvc doctor:
DVC version: 2.0.18 (deb)
---------------------------------
Platform: Python 3.8.9 on Linux-5.4.0-72-generic-x86_64-with-glibc2.4
Supports: All remotes
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: None
Workspace directory: ext4 on /dev/mapper/vg0-root
Repo: dvc, git
Seems that even "normal" run suffers from this problem:
#!/bin/bash
rm -rf wspace
mkdir wspace
pushd wspace
set -ex
main=$(pwd)
mkdir repo
pushd repo
git init --quiet
dvc init --quiet
echo "date: 2020-02-01" >> params.yaml
git add -A
git commit -am "init"
dvc run -n train -p date -o out "cp params.yaml out"
results in:
ERROR: unexpected error - object of type 'datetime.date' has no len()
Any date or datetime inside params.yaml leads to the error: object of type 'datetime.date' has no len().
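The message itself is plain Python behaviour and can be reproduced without DVC. The sketch below assumes some code path calls `len()` on a parameter value as if it were a string; `describe_length` is a hypothetical helper for illustration and says nothing about where DVC actually makes that call.

```python
import datetime


def describe_length(value):
    """Call len() on a parameter value the way string-oriented code might."""
    try:
        return len(value)
    except TypeError as exc:
        return f"ERROR: {exc}"


# A quoted value such as date: "2020-02-01" stays a str and has a length.
print(describe_length("2020-02-01"))

# A bare date: 2020-02-01 loaded as datetime.date does not.
print(describe_length(datetime.date(2020, 2, 1)))
```

The second call yields the same "object of type 'datetime.date' has no len()" text as the report, which is consistent with the YAML loader handing back a date object rather than a string.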
Should I open a new issue and report it as a regression? I'm getting ERROR: failed to reproduce 'pipeline': Object of type date is not JSON serializable even in DVC version 2.27.2 on Python 3.10.7 and Windows 10.
dvc doctor
DVC version: 2.27.2 (pip)
---------------------------------
Platform: Python 3.10.7 on Windows-10-10.0.19044-SP0
Subprojects:
dvc_data = 0.10.0
dvc_objects = 0.4.0
dvc_render = 0.0.11
dvc_task = 0.1.2
dvclive = 0.10.0
scmrepo = 0.1.1
Supports:
gs (gcsfs = 2022.8.2),
http (aiohttp = 3.8.1, aiohttp-retry = 2.8.3),
https (aiohttp = 3.8.1, aiohttp-retry = 2.8.3)
Cache types: hardlink, symlink
Cache directory: NTFS on C:\
Caches: local
Remotes: gs
Workspace directory: NTFS on C:\
Repo: dvc, git
dvc.yaml
stages:
  pipeline:
    cmd: "python -m boilerdata.pipeline"
    deps:
      - "data/curves"
    params:
      - "src/boilerdata/config/axes.yaml":
      - "src/boilerdata/config/project.yaml":
      - "src/boilerdata/config/trials.yaml":
    outs:
      - "data/results"
src/boilerdata/config/trials.yaml
trials:
  # snip
  - date: 2022-09-14
    group: "control"
    rod: "R"
    coupon: "A0"
    joint: "none"
    comment: "A descriptive comment."
    good: true
  # snip
Sorry, @blakeNaccarato, I think it was a mistake to close this one.
I see this has been closed as not planned. The reasoning in https://github.com/iterative/dvc/pull/9473#issuecomment-1552541485 goes into more detail (for future readers landing here). Reminds me of that old yarn, falsehoods programmers believe about time.
So yeah, be sure to stringify your datetimes in params.yaml!
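If the params file is generated by your own tooling, one way to follow that advice is to convert dates to ISO strings before writing anything out. This is a rough sketch under that assumption; `stringify_dates` is a hypothetical helper, not part of DVC, and the sample record just echoes a few fields from the trials.yaml entry above.

```python
import datetime
import json


def stringify_dates(value):
    """Recursively replace date/datetime objects with ISO 8601 strings."""
    # datetime.datetime is a subclass of datetime.date, so one check covers both.
    if isinstance(value, datetime.date):
        return value.isoformat()
    if isinstance(value, dict):
        return {key: stringify_dates(item) for key, item in value.items()}
    if isinstance(value, list):
        return [stringify_dates(item) for item in value]
    return value


# Sample data shaped like the trials.yaml entry above, as a YAML loader
# that resolves bare dates might return it.
trials = {
    "trials": [
        {"date": datetime.date(2022, 9, 14), "group": "control", "good": True}
    ]
}

print(json.dumps(stringify_dates(trials)))
```

For a hand-written params.yaml the equivalent is simply quoting the value, e.g. date: "2022-09-14", so it is loaded as a string in the first place.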
Looks like the PR discussion also covers intent to clarify the error when a bare datetime is in your params.yaml, and to document the unsupported YAML datetime type clearly. That could probably go in this part of the docs, if it's not already planned to be documented elsewhere.
If you're gonna triage some issues as out of scope, datetime-related ones are probably some of the best candidates! 😅