rpg_e2vid
Event Camera Dataset Evaluation
Hello there,
Thank you for your great work! This is a follow-up question to #17. I would really appreciate it if you would provide some further clarification.
- I tried to cut the Event Camera sequences with the timestamps in that issue. I have:
dynamic_6dof count: 319
boxes_6dof count: 326
poster_6dof count: 341
shapes_6dof count: 340
office_zigzag count: 134
slider_depth count: 39
calibration count: 357
In total, this gives 1856 frames, which does not match the expected 1670 frames. Here is how I count the frames:
import os

import numpy as np
import pandas as pd

# Per-sequence time windows, in seconds, applied below as offsets
# from the timestamp of the first event.
seq_config = [
    {"name": "dynamic_6dof", "config": {"start_time_s": 5.0, "stop_time_s": 20.0}},
    {"name": "boxes_6dof", "config": {"start_time_s": 5.0, "stop_time_s": 20.0}},
    {"name": "poster_6dof", "config": {"start_time_s": 5.0, "stop_time_s": 20.0}},
    {"name": "shapes_6dof", "config": {"start_time_s": 5.0, "stop_time_s": 20.0}},
    {"name": "office_zigzag", "config": {"start_time_s": 5.0, "stop_time_s": 12.0}},
    {"name": "slider_depth", "config": {"start_time_s": 1.0, "stop_time_s": 2.5}},
    {"name": "calibration", "config": {"start_time_s": 5.0, "stop_time_s": 20.0}},
]

for seq in seq_config:
    data_dir = os.path.join('data', 'EventCamera', 'test', seq['name'])

    # images.txt holds one "<timestamp> <image path>" pair per line.
    target_list_name = os.path.join(data_dir, 'images.txt')
    with open(target_list_name) as f:
        target_list = [(float(line.split()[0]), line.split()[1]) for line in f]
    target_list.sort()  # It should be sorted by default. Just be safe

    # Only the first event is needed, to get the start time of the event stream.
    # sep=r'\s+' replaces the deprecated delim_whitespace=True.
    event_name = os.path.join(data_dir, 'events.txt')
    event_pd = pd.read_csv(event_name, sep=r'\s+', header=None,
                           names=['t', 'x', 'y', 'pol'],
                           dtype={'t': np.float64, 'x': np.int16, 'y': np.int16, 'pol': np.int16},
                           engine='c',
                           nrows=1)
    event_start = event_pd['t'].iloc[0]

    # Count the frames whose timestamps fall inside [start, stop], both ends inclusive.
    count = 0
    for target_t, target_name in target_list:
        if target_t < event_start + seq['config']['start_time_s']:
            continue
        elif target_t > event_start + seq['config']['stop_time_s']:
            break
        count += 1
    print('{} count: {}'.format(seq['name'], count))
- What is the expected normalization before calculating MSE? I was thinking about scaling the pixel values to [0, 1], but would like to make sure here.
- The definition of SSIM appears to have several parameters. I used the default ones from skimage. Is that also what you did?
- For LPIPS, my understanding is that the metric value is affected by the pretrained network weights. I used the VGG network from this link. It would be helpful if you could let me know which network you used.
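Returning to the frame counts in the first item: the count depends on which timestamp the start and stop offsets are measured from. The sketch below uses a hypothetical helper, `count_in_window`, and made-up frame timestamps (neither is from the repository) to show that the same offsets select a different number of frames once the reference time shifts. Whether the offsets in #17 are meant relative to the first event is an assumption carried over from my script.

```python
from bisect import bisect_left, bisect_right


def count_in_window(timestamps, t0, start_s, stop_s):
    """Count sorted timestamps t with t0 + start_s <= t <= t0 + stop_s."""
    lo = bisect_left(timestamps, t0 + start_s)
    hi = bisect_right(timestamps, t0 + stop_s)
    return hi - lo


# Hypothetical frame timestamps: 0.0, 0.5, ..., 4.5 (not real dataset values).
frame_ts = [i * 0.5 for i in range(10)]

# Same offsets, two different reference times.
count_ref_a = count_in_window(frame_ts, 0.0, 1.0, 2.5)   # window [1.0, 2.5]
count_ref_b = count_in_window(frame_ts, 0.25, 1.0, 2.5)  # window [1.25, 2.75]
print(count_ref_a, count_ref_b)  # 4 3
```

The helper uses the same inclusive bounds as the loop in my script, so it should give the same counts when `t0` is the first event timestamp.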
Thanks in advance!
@chensong1995 Hello, I ran into the same problem when computing the evaluation metrics.
When I calculated SSIM the way you described, my results were far from the numbers in the paper. Did you get results similar to those in the paper?
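One thing worth checking for SSIM is how much the result moves with the parameters passed to skimage. The sketch below runs on synthetic images (not the dataset) and compares three settings of `skimage.metrics.structural_similarity`: the default uniform window, the Gaussian-weighted variant, and a `data_range` that does not match the actual pixel range. Which of these, if any, the paper used is not known here. The images, noise level, and variable names are illustrative assumptions.

```python
import numpy as np
from skimage.metrics import structural_similarity

# Synthetic float images in [0, 1]; stand-ins for a reconstruction and its target.
rng = np.random.default_rng(0)
ref = rng.random((32, 32))
noisy = np.clip(ref + 0.2 * rng.standard_normal((32, 32)), 0.0, 1.0)

# Default uniform 7x7 window, with data_range matching the [0, 1] pixel scale.
ssim_default = structural_similarity(ref, noisy, data_range=1.0)

# Gaussian-weighted variant of the same metric.
ssim_gauss = structural_similarity(
    ref, noisy, data_range=1.0,
    gaussian_weights=True, sigma=1.5, use_sample_covariance=False,
)

# Mismatched data_range: pixels are in [0, 1] but the metric is told [0, 255].
ssim_wrong_range = structural_similarity(ref, noisy, data_range=255.0)

print(ssim_default, ssim_gauss, ssim_wrong_range)
```

With a mismatched `data_range`, the stabilizing constants dominate the formula and the score is pushed toward 1, so a scale mismatch alone can change the reported number considerably.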
@midofalasol Negative. I think there is definitely some data normalization issue going on here.
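To illustrate how much the pixel scale matters for MSE, here is a minimal sketch assuming 8-bit grayscale images. The helper `mse` and the two small arrays are hypothetical, and this is not necessarily the normalization used in the paper.

```python
import numpy as np


def mse(pred, target, max_value=255.0):
    """MSE after scaling both images from [0, max_value] to [0, 1]."""
    # Convert to float first so uint8 subtraction cannot wrap around.
    pred = np.asarray(pred, dtype=np.float64) / max_value
    target = np.asarray(target, dtype=np.float64) / max_value
    return float(np.mean((pred - target) ** 2))


# Hypothetical 8-bit images differing by 10 gray levels in two pixels.
a = np.array([[0, 128], [255, 64]], dtype=np.uint8)
b = np.array([[10, 128], [245, 64]], dtype=np.uint8)

mse_unit = mse(a, b)                # pixels scaled to [0, 1]
mse_raw = mse(a, b, max_value=1.0)  # pixels left in [0, 255]
print(mse_unit, mse_raw, mse_raw / mse_unit)
```

The two values differ by a factor of 255² = 65025, so MSE numbers are only comparable when both sides use the same pixel range.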
For anyone seeking quantitative evaluation, you can utilize EVREAL, our library designed to evaluate and analyze PyTorch-based event-based video reconstruction methods (including E2VID).