BruceZhao comments

Results: 22 comments by BruceZhao

Can header-img be loaded in some other way?

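From what I can tell, you can simply point it at a link. I think it works on the same principle as referencing an image inside a post. The host I have tested is 又拍网 (Yupoo), and the loading speed is acceptable.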

Using jiebaR package (SimHash algorithm)

@remibacha `jiebaR::distance` first uses TF-IDF to compute the keywords, then uses these keywords to generate a 64-bit hash code, and finally calculates the Hamming distance between the hash codes. Here is an example:...
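The three steps described above (keywords, 64-bit fingerprint, Hamming distance) can be tried with a short script. This is only a minimal sketch, not the example elided from the comment: it assumes jiebaR is installed, and the two sample sentences and the `topn = 2` setting are illustrative choices.

```r
library(jiebaR)

# Simhash worker; topn is the number of TF-IDF keywords used to build the hash.
# (topn = 2 is an arbitrary choice for this sketch.)
simhasher <- worker("simhash", topn = 2)

# Illustrative inputs, not taken from the original thread.
text_a <- "江州市长江大桥参加了长江大桥的通车仪式"
text_b <- "hello world!"

# Steps 1 and 2: extract keywords and compute the fingerprint of one text.
fingerprint <- simhash(text_a, simhasher)
print(fingerprint$keyword)   # keywords, named by their TF-IDF weights
print(fingerprint$simhash)   # the hash code

# Step 3: Hamming distance between the fingerprints of the two texts.
result <- distance(text_a, text_b, simhasher)
print(result$distance)
```

A smaller distance means the two fingerprints share more bits, so the texts have more similar weighted keywords. Raising `topn` lets more keywords contribute to each fingerprint.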

Using jiebaR package (SimHash algorithm)

@remibacha jiebaR is designed for Chinese text segmentation; it ships with a default IDF dictionary that contains only Chinese words. The default IDF weight for an English word may be `11.7392`. So,...

Part-of-speech tagging fails when segmenting text files

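@Hz-EMW 1. Make sure the file path contains no Chinese characters (you can also use `normalizePath(fs::dir_ls("E:/201803D/0910ontosim/texttest",glob = "*.txt"))`). 2. Make sure the files are encoded as UTF-8, or specify the encoding when reading them. 3. Domain-specific terms can be added to the user-defined dictionary.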
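The checklist above can be exercised end to end with a small script. This is a hedged sketch under stated assumptions: jiebaR is installed, and the file name `sample.txt` and its contents are hypothetical stand-ins written to a temporary directory so that the example is self-contained.

```r
library(jiebaR)

# Hypothetical input file: a short UTF-8 text with an ASCII-only file name.
txt_path <- file.path(tempdir(), "sample.txt")
writeLines(enc2utf8("我爱北京天安门"), txt_path, useBytes = TRUE)

# Point 2: state the encoding explicitly when reading the file.
lines <- readLines(txt_path, encoding = "UTF-8")
text  <- paste(lines, collapse = "\n")

# Part-of-speech tagging worker.
# Point 3: a user dictionary could be supplied via worker("tag", user = <path>);
# that path is omitted here because it depends on your own dictionary file.
tagger <- worker("tag")

# Tag the text; the result is a character vector of words named by their tags.
tags <- tagging(text, tagger)
print(tags)
```

Reading the file into a string first, rather than passing the path to the worker, keeps the encoding step under your control and makes it easier to see whether a failure comes from file access or from tagging.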

I am planning to rewrite the documentation and tutorial with bookdown; comments and suggestions are welcome

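@qinwf 1. I think it is enough to document the usage of the current functions with examples; could the history of earlier versions be left out? 2. Could you briefly introduce how it works, including its strengths, weaknesses, and where the bottlenecks are? 3. Could the description of the dictionaries be more detailed? I was completely lost the first time I used them and even asked some silly questions. 4. Which parts can users optimize themselves, for example by training their own IDF dictionary? It would be perfect if a training method were provided. 5. I will put together an FAQ this weekend by going through and summarizing all the issues; there will probably be about 5 to 8 entries. Please add whatever I miss. 6. Examples of downstream analysis could be added. I have seen many articles that segment text with jiebaR and then carry out all kinds of impressive text analyses; linking to them as references might help people get started faster. I will keep an eye out and add links when I come across them. _Finally, thank you very much for developing this package; it has benefited a great many text-analysis enthusiasts!_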

Can the TextRank algorithm be added to keyword extraction?

Missing input file: 'xecjk.sty' from line
