Golden Grape comments

Golden Grape

lxml module

+1. With lxml available, I could use python-pptx to process PowerPoint files.

> https://develobile.com/pyto has lxml if you need it.

"@ColdGrub1384 Remove lxml (see `lxml` branch and #25)" https://github.com/ColdGrub1384/Pyto/issues/25

@cclauss they removed lxml 9 hours after your post. Maybe it is...

OpenAIEmbeddings Unsupported OpenAI-Version header provided: 2022-12-01

I had the same problem when I deployed on Streamlit. I then pinned `langchain == 0.0.157` and added the following to the code:

```python
import os

import openai

openai.api_version = '2020-11-07'
os.environ["OPENAI_API_VERSION"] = '2020-11-07'
```

But...

OpenAIEmbeddings Unsupported OpenAI-Version header provided: 2022-12-01

Here is a temporary workaround: set `openai.api_version = '2020-11-07'` just before the call to `openai.ChatCompletion.create`, like this:

```python
messages = [{"role": "user", "content": prompt}]
openai.api_version = '2020-11-07'
response = openai.ChatCompletion.create(
    model=model,
    messages=messages,
    #...
```

Will PDF be supported in the future? Being able to read papers with this would be much faster

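Online tools can convert PDF to EPUB directly, but the output is not necessarily marked up with \. I could not find a Python library that does this directly. Opening the PDF in Word actually seems to give the best result, and calibre can then convert the Word file to EPUB. Following that idea, there is a library called pdf2docx that can convert PDFs: https://github.com/dothinking/pdf2docx

After that, producing a bilingual docx directly is worth considering; docx is also a format worth supporting.

The troublesome part of PDF is layout: inserting a bilingual paragraph can throw the whole format off. From a layout point of view, processing the PDF page by page rather than paragraph by paragraph might work better, but that would call for a very wide monitor. PDF really is an evil format.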

Will PDF be supported in the future? Being able to read papers with this would be much faster

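Supporting formats one at a time also seems complicated. How about this instead: extract all the text first, then feed it back one segment at a time, so the various markup symbols no longer matter. The text extraction step can use a loader taken straight from llamahub. Feeding the text back segment by segment comes down to this: take a substring from the start of one string (the plain text), allow a substring from anywhere in the other string (the formatted file), and find the longest contiguous substring the two share. Once it is selected, drop that part from the plain-text string. Then loop:

```python
# file_to_text, find_max_same_sub_string, translate and the chunk size n
# are placeholders; none of them is defined here.
A = file_to_text(B)      # plain text extracted from the formatted file B
new_file = B
while len(A) > 0:
    max_sub = find_max_same_sub_string(A[:n], B)
    trans_sub = translate(max_sub)
    # str.replace returns a new string, so the result must be assigned back
    new_file = new_file.replace(max_sub, max_sub + trans_sub)
    A = A[n:]
```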
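The "longest contiguous shared substring" step could be sketched with the standard library's `difflib.SequenceMatcher`. This is only one possible reading of the idea, not a tested implementation for real documents: the names `longest_shared_run` and `add_translations`, the default chunk size, and the `translate` callback are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import Callable


def longest_shared_run(chunk: str, formatted: str) -> str:
    """Return the longest contiguous substring shared by chunk and formatted."""
    # autojunk=False turns off difflib's heuristic that ignores very frequent
    # characters in long inputs.
    matcher = SequenceMatcher(None, chunk, formatted, autojunk=False)
    match = matcher.find_longest_match(0, len(chunk), 0, len(formatted))
    return chunk[match.a:match.a + match.size]


def add_translations(plain: str, formatted: str,
                     translate: Callable[[str], str], n: int = 200) -> str:
    """Walk plain in chunks of n characters and insert a translation after each matched run."""
    result = formatted
    for start in range(0, len(plain), n):
        run = longest_shared_run(plain[start:start + n], formatted)
        if run.strip():  # skip empty or whitespace-only matches
            # Only the first occurrence is replaced; a run that repeats
            # elsewhere in the file would need position tracking.
            result = result.replace(run, run + translate(run), 1)
    return result
```

For instance, `add_translations("cat", "<p>cat</p>", lambda s: "(" + s.upper() + ")")` places the stand-in "translation" right after the matched text while leaving the surrounding tags alone. A fixed chunk size can split a sentence in the middle, so a real version would probably cut on paragraph or sentence boundaries instead.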
