GraphScope
Support wildcards when loading from files
For example:
graph.add_edges('/data/person/*.csv')
Wildcards are already supported for OSS and HDFS (through fsspec), but not for local files.
In addition, this feature needs to be well documented.
Hi, I'd like to attempt to solve this issue. Could it be assigned to me? Thank you.
Hi, if you are interested in this issue, you are welcome to open a PR to solve it. Discussion about how it should be addressed is also welcome. You may need an understanding of how the loading process works here, and you will need to set up a development environment for this project.
Thank you for your response. I have read part of the project documentation and have set up the development environment using the dev docker container provided by the community. I have a few questions:
- Should I open a PR now or at a later stage?
- Where in the documentation can I learn about the loading process? I noticed that the file python/graphscope/framework/loader.py seems to be responsible for this task.
- Could you explain why this issue is tagged with component:gae and component:vineyard?
Looking forward to your guidance.
i. It's better to open a PR once you have a working version. ii. I'm afraid you will have to go through the source code. iii. Because the loading process is in the C++ code, which calls functions in the vineyard library; you can trace through it.
This is not a hard task, but it involves a rather long call chain.
Thank you for your detailed explanation. I will try to understand it and complete the task.
Hello,
I noticed in the developer's guide that I can use the make minitest(unitest) command for testing. However, I didn't find this command in the Makefile. I have noticed in other issues that the developer's guide may be a bit outdated. Could you please guide me on how to handle testing for this issue?
Regarding the solution to the issue, I have found that the load_from method of the Graph class is responsible for file loading. Could you please confirm if my understanding is correct?
For the specific solution, I plan to use the glob library to achieve the goal.
Looking forward to your guidance.
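The glob-based plan mentioned above could be sketched at the Python level roughly as follows. This is only an illustration: expand_local_pattern is a hypothetical helper name, not part of GraphScope, and the thread does not settle whether the expansion belongs in Python or further down in the loader.

```python
import glob
import os
from typing import List


def expand_local_pattern(pattern: str) -> List[str]:
    """Expand a local wildcard pattern into a sorted list of matching files.

    Hypothetical helper for illustration only; not a GraphScope API.
    """
    # Expand "~" so that patterns such as "~/data/*.csv" also work.
    pattern = os.path.expanduser(pattern)
    # glob.glob returns matches in arbitrary order, so sort for determinism.
    # Directories that happen to match the pattern are skipped.
    matches = sorted(p for p in glob.glob(pattern) if os.path.isfile(p))
    if not matches:
        raise FileNotFoundError(f"No files match pattern: {pattern}")
    return matches
```

With this helper, a call such as expand_local_pattern('/data/person/*.csv') would return the matching CSV paths, which could then be passed on one by one or as a list. A pattern without wildcards simply resolves to the file itself if it exists.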
- You could add a test in test_create_graph.py, and refer to this python test workflow to test.
- load_from is for gathering the necessary information, such as labels, properties, file locations, etc. The actual file reading happens in arrow_fragment_loader in v6d.
Hello,
Firstly, I want to express my respect for your time. I understand that you must be busy, so I greatly appreciate you taking the time to assist me.
I've encountered some issues while trying to run the test_create_graph.py test. The command I used is:
python3 -m pytest -d --tx popen//python=python3 \
-s -v \
--cov=graphscope --cov-config=python/.coveragerc --cov-report=xml --cov-report=term \
/workspaces/GraphScope/python/graphscope/tests/unittest/test_create_graph.py
Running this test took me about 4 hours, and about 80% of the test cases reported errors. From the output, the problem seems to be related to gRPC. The specific error type is grpc._channel._InactiveRpcError, the status code is StatusCode.ABORTED, and the error details are "Launch analytical engine failed:", indicating that an error occurred when launching the analytical engine.
This error occurs during the initialization of graphscope.client.session, when it tries to create an analytical instance via a gRPC connection. The exception is thrown when calling the create_analytical_instance method in graphscope.client.rpc.
Is this a normal situation, or could there be network issues with my Linux server?
I greatly appreciate any assistance you can provide, and I respect your time, so if you need more information to help solve this problem, I will provide it as soon as possible.
Thank you again for your help.
The program can't find or can't launch the analytical_engine (a.k.a. grape_engine). The installation probably did not succeed.
Thanks for your reply. I'll try to reinstall it.
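One quick way to narrow down a "can't find" failure like this is to check whether an engine executable is visible on the search path at all. The sketch below assumes the executable is named grape_engine (taken from the alias above) and that it would be looked up via PATH; both are assumptions, and find_engine_binary is a hypothetical helper, not a GraphScope function.

```python
import os
import shutil
from typing import Iterable, Optional


def find_engine_binary(
    name: str = "grape_engine",  # assumed executable name, see lead-in
    extra_dirs: Iterable[str] = (),
) -> Optional[str]:
    """Return the full path of the executable, or None if it is not found.

    Hypothetical diagnostic helper; searches extra_dirs first, then PATH.
    """
    search_path = os.pathsep.join([*extra_dirs, os.environ.get("PATH", "")])
    # shutil.which only returns files that exist and are executable.
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    location = find_engine_binary()
    print(location if location else "engine binary not found on PATH")
```

If this reports that nothing was found, the installation step that builds or installs the engine most likely did not complete. If a path is reported, the launch failure has a different cause and the engine's own logs are the next place to look.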
Hello,
Sorry to bother you at night. I've been following the suggestions provided in this issue thread and have tried reinstalling GraphScope on my machine. Unfortunately, I'm still encountering issues when trying to test it. I attached the raw output below. Could you please provide further guidance on how to resolve this? Any help would be greatly appreciated.
Thank you in advance.
It seems the previous message was not accepted.
You might want to try using devcontainer to get rid of the environment issue. We have a devcontainer.json provided.