semantic-kernel
The TXT file was uploaded successfully, but Copilot indicates that the file cannot be accessed
I uploaded the TXT file, but when I asked Copilot Chat to summarize and analyze its content, it replied that it could not analyze the file because it has no ability to connect to the Internet. At other times it replies that it has no ability to access the contents of the file.
Hello!
Thanks for your feedback!
Can you confirm with me that the file has been successfully uploaded before you ask the questions?
Note: you will see a notification banner show up on top when the file is uploaded successfully.
Yes, it reported that the upload was successful. I updated to the latest code and the upload seems to work now, but the bot still can't help me analyze the contents of the file. It may be that the request/response token limit is too short, which makes it impossible to fully analyze the log file I gave it.
Hello!
Yes, you're right. Copilot Chat has limited capacity and cannot analyze an entire file if it is very large. You might want to try asking the bot more specific questions. For example, you can ask "Please find as many dangerous entries as possible in the log file I provided and summarize them for me."
Maybe so, I'll give it a try. Today I gave it a 220k TXT file, hoping it could tell me which log entries in the file are safe and which are not. It can print some of the log entries for me, but only some of them: if I ask for the second entry it answers, but if I ask for the 10th entry it doesn't know. I would appreciate your tips and help: how can I work better in this situation? Does it analyze logs using OpenAI? What does Copilot Chat do in this process? Can the default response language be changed to Chinese?
I also tried a 20k text log file, and it doesn't parse it well or give useful results, which I think is a pity. We need this kind of scenario very much. Can it be optimized?
Hello!
The bot will not be able to fit the entire document in one response due to the token limit. A general rule is to ask more specific questions about the document. In addition, the default chunking strategy for document import may not work well with logs (I am assuming your logs have relatively structured entries compared to other types of documents, such as a company annual report). I encourage you to experiment with other chunking strategies in the document import controller.
The bot uses the AI services of your choice. They can be OpenAI, Azure OpenAI or a combination of both.
Unfortunately, the bot will respond in English by default because the prompts are written in English. We have not tested the bot's ability to chat in other languages. I highly encourage you to experiment with prompts in Chinese to see how well it works :)
How can I try other chunking strategies in the document import controller? My logs are HTTP request messages, which cannot be put into the format of a document like Microsoft's annual report. I tried separating entries with dividing lines and other methods, but it didn't work. Is there room for optimization in this area?
How can I change the prompts to Chinese? Do I need to modify the source code? When I ask questions in Chinese, the response is still in English. Usually I then ask Copilot to answer in Chinese, and that works.
Here is where the chunking currently takes place: https://github.com/microsoft/semantic-kernel/blob/main/samples/apps/copilot-chat-app/webapi/CopilotChat/Controllers/DocumentImportController.cs#L184. Feel free to experiment with other strategies that suit your needs.
You can find the prompts here: https://github.com/microsoft/semantic-kernel/blob/main/samples/apps/copilot-chat-app/webapi/appsettings.json#L152. You will need to modify the source code since some parts of the prompts are hardcoded in the ChatSkill: https://github.com/microsoft/semantic-kernel/tree/main/samples/apps/copilot-chat-app/webapi/CopilotChat/Skills/ChatSkills
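On the chunking question, one direction that may be worth trying is to split the log on entry boundaries instead of on paragraphs or fixed lengths, so that no single entry is cut in half. The sketch below is written in Python only to illustrate the idea. It is not the code in DocumentImportController.cs, and the names `split_entries` and `chunk_entries`, the timestamp pattern, and the character budget are all assumptions made for this example. Character count stands in for a token count here, and each entry is labeled with its position in the file, which might help with questions such as "show me the 10th entry".

```python
import re
from typing import List

# Assumption: every log entry begins with a timestamp such as
# "2023-06-01 12:00:00". Adjust the pattern to your own log format.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}")


def split_entries(text: str) -> List[str]:
    """Group lines into entries; continuation lines stay with their entry."""
    entries: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        # A new timestamp closes the entry collected so far.
        if ENTRY_START.match(line) and current:
            entries.append("\n".join(current))
            current = []
        # Skip blank lines that appear before any entry has started.
        if line.strip() or current:
            current.append(line)
    if current:
        entries.append("\n".join(current))
    return entries


def chunk_entries(entries: List[str], max_chars: int = 2000) -> List[str]:
    """Pack whole entries into chunks of roughly max_chars characters.

    An entry is never split; an entry longer than max_chars becomes
    its own oversized chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    size = 0
    for number, entry in enumerate(entries, start=1):
        labeled = f"[entry {number}] {entry}"
        # Start a new chunk when adding this entry would exceed the budget.
        if current and size + len(labeled) + 1 > max_chars:
            chunks.append("\n".join(current))
            current, size = [], 0
        current.append(labeled)
        size += len(labeled) + 1  # +1 for the joining newline
    if current:
        chunks.append("\n".join(current))
    return chunks
```

With chunks like these, each piece stored for retrieval contains only complete, numbered entries. Whether this improves the bot's answers on a real log file is something you would need to test.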
Closing now as it appears this issue has been resolved. Please reopen as needed.