raven
[DEFECT] PointSet does not load a variable in the first column of a CSV.
The issue is that the system does not load the variable Time in the first column of the CSV.
The error is below. It reports Tb as missing and also lists Tb as unused.
RAVEN Python dependencies located and checked.
( 2.99 sec) SIMULATION : Message -> Simulation started at 2021-05-19 15:59:39
InputData: Using param spec "DataSet" to read XML node "PointSet.
( 3.05 sec) PointSet : ERROR -> Not all variables requested for data object "data" were found in csv "C:\Users\FRICKL\projects\TEDS_VandV\V_V\csv\TEDS_startup.csv"! Needed: {'Tb', 'Flow Rate'}; Unused: {'Type K TC #1', 'Tb', 'Omega Pressure', 'Date/Time', 'Type K TC #3', 'Type K TC #2'}; Missing: {'Tb'}
OSError: Not all variables requested for data object "data" were found in csv "C:\Users\FRICKL\projects\TEDS_VandV\V_V\csv\TEDS_startup.csv"! Needed: {'Tb', 'Flow Rate'}; Unused: {'Type K TC #1', 'Tb', 'Omega Pressure', 'Date/Time', 'Type K TC #3', 'Type K TC #2'}; Missing: {'Tb'}
Describe how to Reproduce
Steps to reproduce the behavior:
- Unzip the attached archive, issue.zip.
- Make a csv folder and place the .csv file in it.
- Run the .xml file from the directory above the csv folder.
Screenshots and Input Files
Please attach the input file(s) that generate this error. The simpler the input, the faster we can find the issue.
Platform (please complete the following information):
- OS: [e.g. iOS]
- Version: [e.g. 22]
- Dependencies Installation: [CONDA or PIP]
For Change Control Board: Issue Review
This review should occur before any development is performed as a response to this issue.
- [x] 1. Is it tagged with a type: defect or task?
- [x] 2. Is it tagged with a priority: critical, normal or minor?
- [x] 3. If it will impact requirements or requirements tests, is it tagged with requirements?
- [x] 4. If it is a defect, can it cause wrong results for users? If so an email needs to be sent to the users.
- [x] 5. Is a rationale provided? (Such as explaining why the improvement is needed or why current code is wrong.)
For Change Control Board: Issue Closure
This review should occur when the issue is imminently going to be closed.
- [x] 1. If the issue is a defect, is the defect fixed?
- [x] 2. If the issue is a defect, is the defect tested for in the regression test system? (If not explain why not.)
- [x] 3. If the issue can impact users, has an email to the users group been written (the email should specify if the defect impacts stable or master)?
- [x] 4. If the issue is a defect, does it impact the latest release branch? If yes, is there any issue tagged with release (create if needed)?
- [x] 5. If the issue is being closed without a pull request, has an explanation of why it is being closed been provided?
It looks like the issue is that this data was scraped from txt into csv using Excel, and Excel encodes using the \ufeff BOM (see https://stackoverflow.com/questions/17912307/u-ufeff-in-python-string/17912811#17912811). There are some encoding options in Python that will correctly remove it; I'm looking into whether we can do this automagically.
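As a minimal sketch of the kind of encoding option mentioned above (this is not the actual RAVEN fix), Python's built-in `utf-8-sig` codec drops a leading BOM when one is present and otherwise behaves like `utf-8`. The helper name `read_header` and the two-column sample file are hypothetical and exist only for illustration.

```python
import csv
import os
import tempfile


def read_header(path, encoding):
    """Return the first row of a CSV file decoded with the given encoding."""
    with open(path, newline="", encoding=encoding) as handle:
        return next(csv.reader(handle))


# Build a small sample CSV that starts with a UTF-8 BOM (hypothetical data).
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"\xef\xbb\xbfTb,Flow Rate\n1.0,2.0\n")
    sample_path = tmp.name

raw_header = read_header(sample_path, "utf-8")        # BOM stays attached to the first name
clean_header = read_header(sample_path, "utf-8-sig")  # BOM is stripped on decode
os.remove(sample_path)

# Same kind of set comparison as in the error message above.
needed = {"Tb", "Flow Rate"}
missing_raw = needed - set(raw_header)
missing_clean = needed - set(clean_header)

print(repr(raw_header[0]), missing_raw)
print(repr(clean_header[0]), missing_clean)
```

With plain `utf-8`, the first column name is read as `'\ufeffTb'`, so `'Tb'` ends up in the missing set even though the header looks correct when printed. That is consistent with the error listing Tb as both missing and unused.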
It turns out the fix in #1565 didn't work for Windows but passed our tests somehow. I'm working on it.
@PaulTalbot-INL Could you send an email to the users regarding this fix?
I ran into this issue again today with the most recent version of RAVEN. It still appears to sneak in sometimes.
Looks like there may be other encoding problems we don't catch yet, then?
I'm not sure; I'm looking into it more. Today, it was the BOM character \ufeff appearing at the beginning of the file after the person I was working with had opened the CSV in Excel on Windows.
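Because U+FEFF is invisible when printed, a quick check like the sketch below can confirm whether a file or a column name carries it. The helper names `starts_with_utf8_bom` and `strip_bom` are hypothetical and are not part of RAVEN; only `codecs.BOM_UTF8` comes from the standard library.

```python
import codecs


def starts_with_utf8_bom(data: bytes) -> bool:
    """Return True if the raw bytes begin with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


def strip_bom(name: str) -> str:
    """Remove a single leading U+FEFF from a decoded string, if present."""
    return name[1:] if name.startswith("\ufeff") else name


# Hypothetical first line of a CSV saved with a BOM.
raw_bytes = codecs.BOM_UTF8 + b"Tb,Flow Rate\n"
first_name = raw_bytes.decode("utf-8").split(",")[0]

has_bom = starts_with_utf8_bom(raw_bytes)
cleaned = strip_bom(first_name)

# repr() makes the hidden character visible; print() alone would not.
print(has_bom, repr(first_name), repr(cleaned))
```

Comparing `repr(first_name)` with `repr(cleaned)` shows the difference that a plain `print` hides.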
Reopening issue because this bug was found in several RAVEN DataObject methods.
@PaulTalbot-INL @dylanjm Do you have time to look into this issue?
@wangcj05 I will have some time to take a look at this again.