[Question] How to sample an event time delta range within [0, 10,000]
In your example data, time_since_last_event is always within the range [0, 10]. If my sampled time_since_last_event can range from [0, 10,000], can you guide me on how to sample it?
Hi, there is a 'dtime_max' in the thinning algo params that determines the range of the sampled time delta:
model_config:
  .....
  thinning:
    .....
    dtime_max: 5. <-------------------- HERE
    ....
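To illustrate why a small dtime_max limits the sampled range, here is a generic thinning sketch. It is not the EasyTPP sampler: the function name `sample_next_dtime`, the `intensity_upper_bound` argument, and the choice to return the cap when no candidate is accepted are all assumptions made for illustration.

```python
import random


def sample_next_dtime(intensity, intensity_upper_bound, dtime_max, rng):
    """Sample the delay to the next event by thinning, capped at dtime_max."""
    t = 0.0
    while True:
        # Propose the next candidate time from a homogeneous Poisson process.
        t += rng.expovariate(intensity_upper_bound)
        if t >= dtime_max:
            # Assumption: no event inside the window, so return the cap.
            return dtime_max
        # Accept the candidate with probability intensity(t) / upper bound.
        if rng.random() * intensity_upper_bound <= intensity(t):
            return t


rng = random.Random(0)


def slow_process(t):
    # Constant intensity: roughly one event per 1,000 time units.
    return 0.001


capped = [sample_next_dtime(slow_process, 0.001, 5.0, rng) for _ in range(200)]
wide = [sample_next_dtime(slow_process, 0.001, 10_000.0, rng) for _ in range(200)]

print(sum(dt == 5.0 for dt in capped), "of 200 samples hit the cap of 5")
print("largest sample with dtime_max=10000:", round(max(wide), 1))
```

In this sketch, a cap of 5 collapses almost every sample onto the cap when true gaps are in the thousands, so under these assumptions dtime_max has to be at least as large as the deltas you expect to see.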
Thank you for your answer. In fact, I adjusted this dtime_max but it didn't help.
Even after adjusting dtime_max in reproducing retweet results #49, the event type prediction accuracy remains lower than when normalizing the data delta time to the range [0, 10].
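For reference, the normalization mentioned above could look roughly like the sketch below. It assumes a plain linear rescaling by the largest observed delta; `rescale_time_deltas` and `target_max` are hypothetical names, not part of EasyTPP.

```python
def rescale_time_deltas(sequences, target_max=10.0):
    """Linearly rescale time deltas so the largest one maps to target_max.

    Returns the rescaled sequences and the scale factor, so that predictions
    can be mapped back with original_dt = scaled_dt / scale.
    """
    observed_max = max((dt for seq in sequences for dt in seq), default=0.0)
    if observed_max <= 0:
        # Nothing to rescale (empty input or all-zero deltas).
        return [list(seq) for seq in sequences], 1.0
    scale = target_max / observed_max
    scaled = [[dt / observed_max * target_max for dt in seq] for seq in sequences]
    return scaled, scale


raw = [[0.0, 2500.0, 10000.0], [0.0, 40.0]]
scaled, scale = rescale_time_deltas(raw)
print(scaled[0])   # deltas now lie in [0, 10]
print(scale)       # factor to undo the rescaling on predictions
```

Any time-based metric computed on the rescaled data (for example RMSE) is in rescaled units and needs to be divided by `scale` to compare against results on the raw data.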
Let me have a look.
I'm not sure I understand your code correctly. Here I found you used pad_token_id to pad time_delta_sequence. Suppose we have 10 event types, but the delta time can be as large as 100. Should we use 100 to pad the time_delta_sequence?
Hi,
Ideally, time_delta_sequence would indeed use its own pad token.
Reusing the type pad token is a simple workaround. When computing the loss, we use masks derived from type_sequence to eliminate padded events, so the pad values in time_delta_sequence never contribute to it.
see https://github.com/ant-research/EasyTemporalPointProcess/blob/main/easy_tpp/model/torch_model/torch_basemodel.py#L110
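The masking idea can be shown with a small stand-alone sketch. This is not the code at the link above: `masked_mean_abs_error` is a hypothetical helper, and a mean absolute error stands in for the real loss purely to keep the example short.

```python
def masked_mean_abs_error(pred_dtimes, true_dtimes, type_seq, pad_token_id):
    """Average |pred - true| over real events only, using a type-based mask."""
    # True for real events, False for padded positions.
    keep = [t != pad_token_id for t in type_seq]
    errors = [abs(p - y) for p, y, k in zip(pred_dtimes, true_dtimes, keep) if k]
    return sum(errors) / len(errors) if errors else 0.0


PAD = 10  # with 10 event types (0..9), the type pad token is 10
type_seq = [1, 3, 2, PAD, PAD]
pred = [0.5, 2.0, 40.0, 7.0, 7.0]

# Same real deltas, two different pad values in the padded positions.
padded_with_type_pad = [1.0, 2.5, 38.0, float(PAD), float(PAD)]
padded_with_large_value = [1.0, 2.5, 38.0, 100.0, 100.0]

loss_a = masked_mean_abs_error(pred, padded_with_type_pad, type_seq, PAD)
loss_b = masked_mean_abs_error(pred, padded_with_large_value, type_seq, PAD)
print(loss_a, loss_b)
```

Both calls give the same value because the padded positions are dropped before averaging, whatever number was used to fill them.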
Another reason is that there is no straightforward way to determine a pad value for time sequences. One option is to compute statistics over the time delta sequences and then choose a sufficiently large number, but this adds computation and is not very user-friendly.
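If you do want a dedicated pad value, the statistics-based option could be sketched as follows. The helper names and the `margin` factor of 10 are arbitrary assumptions for illustration, not something EasyTPP provides.

```python
def choose_time_pad_value(sequences, margin=10.0):
    """Pick a pad value clearly larger than any observed time delta."""
    max_dt = max((dt for seq in sequences for dt in seq), default=0.0)
    # Guard against all-zero or empty data so the pad value stays positive.
    return max(max_dt, 1.0) * margin


def pad_time_deltas(sequences, pad_value):
    """Right-pad every sequence to the length of the longest one."""
    max_len = max(len(seq) for seq in sequences)
    return [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]


time_delta_seqs = [[0.0, 12.5, 100.0], [0.0, 3.0]]
pad_value = choose_time_pad_value(time_delta_seqs)
padded = pad_time_deltas(time_delta_seqs, pad_value)
print(pad_value)
print(padded)
```

This requires one extra pass over the dataset before batching, which is the extra computation mentioned above.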