Xing Han Lu
@nicolaskruchten how much effort would it be to rework the CSS, and how much expertise would be needed? Even with one banner the problem seems to be present, but not...
> Hey! Is there anything new you guys are working on? More data? I love this because I think multion is actually doing decent work on this kind of...
Yes, we are interested in building better DMR variants! We are still looking into different ways we can approach the candidate selection problem. Regarding Discord, I think it's a great...
Hey! We are all actively working on improving weblinx. Llama 3.2 is definitely on our radar, but we are waiting to streamline our new eval pipeline and augment the training...
I'm a bit confused: why does it say that it is true by default, and then immediately afterward say that it is false by default?
Would True or False be better in this case? Perhaps it is useful to have a reversible string, but that would be bad if it affects the final results (regression).
Hmm, I see. In this case, I'm happy to merge a PR if you are interested in creating one!
No, this issue is specific to the Colab notebook we used to go over the weblinx tutorial.
Would probably make sense to have a Tokenizer class at this point to allow for generator/streaming, i.e.:

```python
class Tokenizer:
    def __init__(self):
        self.vocab_dict = {}

    def __call__(self, texts, stream=False):
        for...
```
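
To make the streaming idea concrete, here is a minimal sketch of how such a class could behave. Everything beyond `vocab_dict`, `texts` and `stream` is an assumption made for illustration: the lowercase whitespace splitting and the `_encode`/`_stream` helpers are hypothetical and not part of any existing API. The generator lives in a separate helper because putting `yield` directly in `__call__` would make every call return a generator, even with `stream=False`.

```python
from typing import Dict, Iterable, Iterator, List, Union


class Tokenizer:
    """Hypothetical sketch: maps whitespace-separated tokens to integer IDs."""

    def __init__(self) -> None:
        # Token string -> integer ID, grown as new tokens are seen.
        self.vocab_dict: Dict[str, int] = {}

    def _encode(self, text: str) -> List[int]:
        # Assumption: lowercase + whitespace split stands in for real tokenization.
        ids = []
        for token in text.lower().split():
            if token not in self.vocab_dict:
                self.vocab_dict[token] = len(self.vocab_dict)
            ids.append(self.vocab_dict[token])
        return ids

    def _stream(self, texts: Iterable[str]) -> Iterator[List[int]]:
        # Lazily encode one text at a time, so the corpus never has to fit in memory.
        for text in texts:
            yield self._encode(text)

    def __call__(
        self, texts: Iterable[str], stream: bool = False
    ) -> Union[List[List[int]], Iterator[List[int]]]:
        if stream:
            return self._stream(texts)
        return [self._encode(text) for text in texts]
```

With `stream=True` the vocabulary only grows as the generator is consumed, so anything that needs the full `vocab_dict` would have to exhaust the stream first.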
@dl423 Thank you for the suggestion! I think it's an interesting idea, however I'm not sure how that could be incorporated into the current API without changing how the corpus...