chatglm.cpp

web_demo.py bug fixed

Open emikeliu opened this issue 2 years ago • 4 comments

With this patch, web_demo.py no longer displays raw HTML tags in the gradio chatbot.

emikeliu, Jul 03 '23 10:07

Could you provide some failing cases with the current web_demo.py? They might be caused by truncated or low-quality model outputs. The mdtex2html processing was taken from the official ChatGLM demo. I think it is meant to convert markdown code blocks and formulas to HTML, and I would prefer to keep this feature.

li-plus, Jul 04 '23 16:07

I tried web_demo.py and found that the dialog is displayed as raw HTML code. My guess is that the gradio chatbot can now convert markdown and TeX formulas to HTML automatically, so mdtex2html is no longer needed. For example:

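<p>Write an HTML document</p>

<p>Sure, here is a simple HTML document:<pre><code class="language-"><br><!DOCTYPE html><br><html><br> <head><br>   <title>My HTML Document</title><br> </head><br> <body><br>   <h1>Welcome to my HTML Document</h1><br>   <p>This is my first HTML document.</p><br>   <ul><br>     <li>Item 1</li><br>     <li>Item 2</li><br>     <li>Item 3</li><br>   </ul><br>   <p>This is my second paragraph.</p><br> </body><br></html><br></code></pre><br>This document contains the following elements:<br>- <code><!DOCTYPE html></code>: declares that the document follows the HTML5 specification.<br>- <code><html></code>: the opening tag of the document.<br>- <code><head></code>: the document head, containing metadata and style information.<br>- <code><title></code>: the document title, shown on the browser tab.<br>- <code><h1></code>: heading element, used to define the document's heading.<br>- <code><p></code>: paragraph element, used to define the document's paragraphs.<br>- <code><ul></code>: unordered list element, used to define an unordered list.<br>- <code><li></code>: list item element, used to define list items.<br>- <code></ul></code>: the closing tag of the unordered list element.<br>- <code></head></code>: the closing tag of the document head.<br>- <code><body></code>: the document body, containing the document's content.<br>- <code><h1></code>: heading element, defines the document's heading.<br>- <code><p></code>: paragraph element, defines the document's paragraphs.<br>This document contains basic elements such as text, headings, paragraphs, and unordered list items, and can be used to create a simple HTML document.</p>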

After applying the patch, it should be:

Write an HTML document

Sure, here is a simple HTML document containing a heading, a paragraph, and an image link:

<!DOCTYPE html>
<html>
 <head>
   <title>My HTML Document</title>
 </head>
 <body>
   <h1>Welcome to my HTML document</h1>
   <p>This is a sample paragraph.</p>
   <a href="https://www.w3schools.com/w3css/img_lights.jpg">Click on this link to see a cool light image.</a>
 </body>
</html>

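This document uses <!DOCTYPE html> to declare the document type, the <html> tag to mark the start and end of the document, the <head> tag to hold the document's metadata such as the title, styles, and scripts, and the <body> tag to hold the document's main content. Inside the <head> tag, we define a <title> element to set the document's title, a <meta> element to define the document's description, and a <link> element to reference the image on w3schools.com. Inside the <body> tag, we create a heading, a paragraph, and an image link.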
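One way to picture the symptom above is the following minimal sketch. It assumes the chat component treats the incoming message as plain text and escapes it before display. That is an assumption about the cause, not something confirmed in this thread. `render_as_text` is a hypothetical stand-in for the component and is not a gradio function.

```python
import html


def render_as_text(message: str) -> str:
    """Hypothetical stand-in for a chat widget that escapes its input.

    This is NOT a gradio API; it only imitates a component that shows
    the message string literally instead of interpreting it as HTML.
    """
    return html.escape(message)


# Message that was already converted to HTML before reaching the widget.
pre_rendered = "<p>Write an HTML document</p>"

# The widget escapes the markup, so the browser shows the tags literally.
shown = render_as_text(pre_rendered)
print(shown)
```

Under that assumption, any HTML produced by post-processing reaches the page as escaped entities, which is why the user sees `<p>` and `<br>` in the dialog rather than formatted text.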

emikeliu, Jul 06 '23 03:07

That's weird. I'm not seeing this problem with gradio==3.35.2 and Chrome 114.0.5735.198 on macOS. Could you provide more information about your environment?

li-plus, Jul 07 '23 05:07

I figured out what was really happening after upgrading gradio from 3.28 to 3.35. Gradio supports native HTML rendering in the chatbot component after a certain version, so mdtex2html may work there but is not really needed. So although my patch did not fix anything, it did remove some unnecessary code. In short, after a little research on gradio, I found that the code from the official ChatGLM demo did not work with gradio<=3.3x (uncertain).
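The version dependence described here could be checked with a small guard like the sketch below. This is an illustration only and not what the patch does, since the patch simply removes the post-processing. The cutoff is a placeholder: the thread establishes only that 3.28 showed raw tags and 3.35.2 did not, so the real boundary is unknown. `major_minor`, `html_renders_natively`, and `ASSUMED_CUTOFF` are hypothetical names.

```python
from typing import Tuple

# Placeholder cutoff: the thread only shows 3.28 misbehaving and 3.35.2
# working; the exact version where the behavior changed is NOT known.
ASSUMED_CUTOFF = "3.35"


def major_minor(version: str) -> Tuple[int, int]:
    """Return the (major, minor) pair of a dotted version string."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def html_renders_natively(installed: str, cutoff: str = ASSUMED_CUTOFF) -> bool:
    """True if the installed version is at or above the assumed cutoff."""
    return major_minor(installed) >= major_minor(cutoff)


# In a real script the installed version would come from gradio.__version__.
for version in ("3.28.0", "3.35.2"):
    print(version, html_renders_natively(version))
```

A check like this could at most warn users on older versions that pre-rendered HTML may appear as raw tags.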

emikeliu, Jul 07 '23 12:07

You're right. In the end I found it unnecessary to use mdtex2html or any other post-processing: the gradio chatbot handles markdown syntax automatically. I'll merge this PR. Thanks for your contribution!
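A handler without post-processing might look like the sketch below. It is a minimal illustration, not the actual web_demo.py code. `fake_stream` is a hypothetical stand-in for the model's streaming generator, and `predict` and `History` are illustrative names. The history is kept as a list of (user, bot) pairs, which is assumed to be the shape the gradio 3.x chatbot component expects.

```python
from typing import Iterator, List, Tuple

# Chat history as (user message, bot message) pairs.
History = List[Tuple[str, str]]


def fake_stream(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for the model's streaming output."""
    yield from ["Here is ", "some `code` ", "and $x^2$."]


def predict(user_input: str, history: History) -> Iterator[History]:
    """Stream the reply into the history as raw markdown text."""
    history = history + [(user_input, "")]
    response = ""
    for piece in fake_stream(user_input):
        response += piece
        # No mdtex2html or other conversion: the text is passed through
        # unchanged and rendering is left to the chat component.
        history[-1] = (user_input, response)
        yield list(history)


final = list(predict("hi", []))[-1]
print(final)
```

Each yielded value is a fresh copy of the history, so a caller can keep intermediate states without later updates changing them.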

li-plus, Aug 05 '23 14:08