importing nagisa gives error "source code string cannot contain null bytes"

Open worthy7 opened this issue 3 months ago • 4 comments

Full output

nagisa-0.2.11-cp310-cp310-manylinux_2_5_x86_64

ValueError                                Traceback (most recent call last)
Cell In[10], line 6
      1 # now, tokenizing the data
      2 #Text preprocessing, tokenizing and filtering of stopwords are all included in CountVectorizer, which builds a dictionary of features and transforms documents to feature vectors:
      3 
      4 # custom tokenization, this also removes common words
      5 from keyword_extraction import extract_keyword
----> 6 import nagisa
      7 # Takes in a document, returns the list of words
      8 def tokenize_jp(doc):

File ~/.local/lib/python3.10/site-packages/nagisa/__init__.py:4
      1 import nagisa_utils as utils
      3 from nagisa.tagger import Tagger
----> 4 from nagisa.train import fit
      6 version = '0.2.11'
      7 # Initialize instance

File ~/.local/lib/python3.10/site-packages/nagisa/train.py:11
      7 import logging
      8 from collections import OrderedDict
---> 11 import model
     12 import prepro
     13 import mecab_system_eval

worthy7, Mar 28 '24 06:03

Hi @worthy7. Thank you for using nagisa and sending us a bug report. I apologize for any inconvenience caused. I would like to investigate the cause of the error, so please let me know the version of the OS of your environment.

taishi-i, Mar 28 '24 09:03

Actually, this was just inside GitHub Codespaces. I think it's Ubuntu, but it should be easy to reproduce.

worthy7, Mar 28 '24 13:03

Thank you for your response. I will now try to reproduce tokenizing texts with nagisa in GitHub codespaces, so please wait a moment. I will respond as soon as I identify the cause.

taishi-i, Mar 28 '24 15:03

I have finished trying to reproduce the issue, and nagisa appears to work without any problems in the simplest GitHub Codespaces configuration.

Here's the configuration:

  1. Create a new project in a new codespace and select 2-core 8GB RAM 32GB.
  2. Next, install the Python extension (Python3.10.13 v2024.2.1).
  3. Perform the installation with pip install nagisa.
  4. Check the operation in the terminal.

I would like to identify the cause of the error. First, could you try running import nagisa in a terminal in your GitHub codespace to see whether it imports without any problems?
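A further check that may help narrow this down: Python raises "source code string cannot contain null bytes" when a source file it tries to compile contains a NUL byte. The following is only a minimal diagnostic sketch, not part of nagisa; the helper names find_null_byte_files and package_dir are hypothetical. It locates the installed nagisa directory without importing it and lists any .py files in it that contain null bytes.

```python
import importlib.util
from pathlib import Path


def find_null_byte_files(root):
    """Return sorted paths of .py files under root that contain a null byte."""
    bad = []
    for path in Path(root).rglob("*.py"):
        if b"\x00" in path.read_bytes():
            bad.append(path)
    return sorted(bad)


def package_dir(name):
    """Locate an installed package's directory without importing it.

    Returns None if the package cannot be found.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


if __name__ == "__main__":
    root = package_dir("nagisa")
    if root is None:
        print("nagisa is not installed in this environment")
    else:
        # Hypothetical usage: report any corrupted source files in the package.
        for path in find_null_byte_files(root):
            print("null bytes found in:", path)
```

The same function can be pointed at any other directory, for example find_null_byte_files("."), in case the affected file lives in the workspace rather than in the installed package. If a file inside the package is reported, reinstalling with pip install --force-reinstall nagisa may be worth trying.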

taishi-i, Mar 28 '24 16:03