Does the HDFS loader support the Parquet and ORC file formats? #2264
Replies: 6 comments · 25 replies
-
Yes, but it is not included in the current stable release yet. You could install our nightly build via
-
Is there a usage example?
-
Closing, as this is moving to a discussion.
-
Based on the implementation details, I think a prefix should work as well. I haven't tested it, though. Feel free to paste the error logs if it doesn't work. Thanks!
-
I load vertices using a path and load edges using a path with a file prefix. Loading the edge data fails with this error log:
-
@sighingnow can you help me fix this error? Thank you!
-
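I looked at the log carefully. With a file-name prefix, the ORC files do not seem to be read through adaptors/read_orc.py. They appear to be read as CSV instead, which causes the read to fail.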
-
Could it be a version problem? Try updating graphscope: pip3 install -U graphscope
-
Internally, graphscope uses fsspec to list files; see https://github.com/v6d-io/v6d/blob/main/modules/io/python/drivers/io/adaptors/read_orc.py#L126. Could you please try installing fsspec (https://github.com/fsspec/filesystem_spec) and check whether fs = ....
fs.glob("hdfs:///user/hive/warehouse/community.db/transaction_edges/job_id=112/part") yields any output? Thanks!
-
It successfully lists the files with the prefix:
result:
-
Can I access these two files somewhere?
-
This is internal data. Could you simulate two ORC files?
-
Which version are you using now?
-
pip3 install -U graphscope --pre
-
I see. We currently have no way to infer that "hdfs:///user/hive/warehouse/community.db/transaction_edges/job_id=112/part" refers to ORC files. I think we need to add some extra arguments for such cases.
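To illustrate why a bare prefix ends up on the CSV path, here is a minimal sketch of suffix-based format inference with an explicit override. It is not GraphScope's or vineyard's actual code: `infer_format`, `KNOWN_SUFFIXES`, and the `explicit` parameter are hypothetical names, and the CSV default is an assumption based on the behaviour reported in this thread.

```python
from pathlib import PurePosixPath

# Hypothetical suffix table for illustration only; not the real dispatch logic.
KNOWN_SUFFIXES = {".orc": "orc", ".parquet": "parquet", ".csv": "csv"}


def infer_format(path, default="csv", explicit=None):
    """Guess a file format from the last path component's suffix.

    An explicitly supplied format always wins, which is the role an
    extra loader argument would play for extension-less prefixes.
    """
    if explicit is not None:
        return explicit
    suffix = PurePosixPath(path).suffix.lower()
    return KNOWN_SUFFIXES.get(suffix, default)


prefix = "hdfs:///user/hive/warehouse/community.db/transaction_edges/job_id=112/part"

# The last component is "part", which has no suffix, so the default applies.
print(infer_format(prefix))
# With an explicit hint the prefix is treated as ORC.
print(infer_format(prefix, explicit="orc"))
# A path that carries its own extension needs no hint.
print(infer_format("hdfs:///tmp/edges.orc"))
```

An explicit hint of this kind is one shape the "extra arguments" could take; the option name in the actual fix may differ.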
-
full log:
-
As the error message indicates, these files are still treated as CSV. The newly added I have just uploaded the wheel to PyPI. It looks like you are using local sessions, so
and try again; it should work. Please confirm you have installed
-
Thank you very much, it works with a local session!
-
When running on k8s:
In the engine pod, vineyard-io is 0.11.1. How can I update to vineyard-io>=0.11.2 on the k8s cluster?
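The version gap described here (0.11.1 found, 0.11.2 or newer needed) can be checked from Python in whichever environment runs the engine. This is a rough sketch using only the standard library; the helper names are made up for illustration, and the comparison assumes plain dotted numeric versions (pre-release tags such as `rc1` are ignored rather than ordered properly).

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn '0.11.2' into (0, 11, 2), keeping only leading digits per part."""
    parts = []
    for piece in text.split("."):
        digits = re.match(r"\d+", piece)
        if digits is None:
            break
        parts.append(int(digits.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


found = installed_version("vineyard-io")
if found is None:
    print("vineyard-io is not installed in this environment")
elif meets_minimum(found, "0.11.2"):
    print(f"vineyard-io {found} satisfies >=0.11.2")
else:
    print(f"vineyard-io {found} is older than 0.11.2")
```

On a cluster, the script would have to run inside the engine pod itself, since the client machine and the pod can have different versions installed.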
-
Our nightly CI seems to have failed to publish images on recent nights. You could launch the session, then
-
Is loading GraphAR format files supported now?
This discussion was converted from issue #2263 on December 01, 2022 07:02.
-
Is your feature request related to a problem? Please describe.
Currently, the loader supports the CSV file format for vertex/edge data.
Describe the solution you'd like
Could the loader support Parquet and ORC files?