Date2Vec
Is Date2Vec a general time representation? How do I generate my own embeddings?
Hello Surya,
Firstly, thank you for the nice job. I am trying to understand the usefulness of time2vec and whether I can use it in some way in my use case. In general, the time2vec paper claims that "...developing a general-purpose model-agnostic representation for time that can be potentially used in any architecture. In particular, we develop a learnable vector representation (or embedding) for time as a vector representation can be easily combined with many models or architectures..."
In my case, I have 12 values for each new acquisition, and a new acquisition every 3 to 5 days. Instead of using the acquisition date as a simple column feature (e.g. day of year), I want to transform it with time2vec into a continuous embedding.
But what is the proper way to create this useful time representation for my case? You provide some pretrained date2vec representations (based on the time2vec approach), and I am thinking that this is a kind of general time representation trained on a big dataset, like word2vec pretrained (by Google) on the English Wikipedia corpus. Is that true? On what dataset and with what "pretext" task did you pretrain it? Are these time representations general enough?
Also, if it is better to train my own time2vec embeddings on my own dataset, how would I do it? Would I train a NN with one time2vec layer + one LSTM layer, with the task of, say, predicting one of the 12 values per acquisition, and in the end keep the time2vec layer as the time embedding?
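Concretely, I imagine something like the sketch below. This is only my rough idea: the Time2Vec layer follows the formula from the paper (one linear component plus k sine components), and all names and sizes here (AcquisitionModel, time_dim, hidden) are placeholders I made up.

```python
import torch
import torch.nn as nn

class Time2Vec(nn.Module):
    """time2vec layer: one linear component plus k periodic (sine)
    components of a scalar time input, as in the time2vec paper."""
    def __init__(self, k: int):
        super().__init__()
        self.linear = nn.Linear(1, 1)    # w0 * t + b0
        self.periodic = nn.Linear(1, k)  # sin(wi * t + bi), i = 1..k

    def forward(self, t):                # t: (batch, seq_len, 1)
        return torch.cat([self.linear(t), torch.sin(self.periodic(t))], dim=-1)

class AcquisitionModel(nn.Module):
    """Sketch: embed the acquisition time with time2vec, concatenate it
    with the 12 measured values, run an LSTM, predict one target value."""
    def __init__(self, time_dim=15, n_features=12, hidden=64):
        super().__init__()
        self.t2v = Time2Vec(time_dim)    # output dim = time_dim + 1
        self.lstm = nn.LSTM(n_features + time_dim + 1, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x, t):             # x: (B, T, 12), t: (B, T, 1)
        z = torch.cat([x, self.t2v(t)], dim=-1)
        out, _ = self.lstm(z)
        return self.head(out[:, -1])     # predict from the last time step
```

After training on such a task, I would throw away the LSTM and the head and keep only the trained t2v layer as the time embedding.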
Sorry for bombarding you with questions.
BR, Ilias
Hi! Thank you for taking interest in my project.
- The tasks that I used to train the date2vec model on are given in the readme. In summary, I have given code for two tasks:
- Next date prediction: given a random timestamp, predict the exact timestamp after 24 hours (see the sketch after this list).
- Date reconstruction (autoencoding): given a random timestamp, predict the same timestamp.
- The pretrained model is trained on the first task on a very large dataset of randomly generated timestamps. These models learn general representations because the training task is unsupervised.
- If you want to use time2vec with your own model, simply import the layer and use it as an embedding. You could even finetune my pretrained model; that might help.
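To make the next-date-prediction task concrete, here is a minimal sketch of how such training pairs can be generated. Encoding each timestamp as six integers (year, month, day, hour, minute, second) is an assumption of this sketch; check the data code in the repo for the exact format.

```python
import random
from datetime import datetime, timedelta

def encode(d):
    # Assumed encoding: six integers per timestamp (the repo's format may differ)
    return [d.year, d.month, d.day, d.hour, d.minute, d.second]

def random_pair():
    """One (input, target) pair for next date prediction:
    a random timestamp and the timestamp exactly 24 hours later."""
    t = datetime(2000, 1, 1) + timedelta(seconds=random.randrange(10**9))
    return encode(t), encode(t + timedelta(hours=24))
```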
Hope that helps you!
Thank you for your answer. But have you checked whether this representation of time gives better results than other simple representations of time, like a plain datetime feature or a one-hot representation? In the case of text, word2vec captures the context of each word and injects it into the representation, taking into account the words around the target word. But what is the equivalent high-level interpretation of time2vec? A datetime, unlike a word, always has the same neighborhood of datetimes. There is a physical order in time which dominates the similarity across dates: 17/8 always comes after 16/8 and before 18/8. There is nothing similar in text. So, what is the usefulness of a representation of time as an embedding like date2vec, instead of a simple feature?