filecoin-plus-large-datasets

[DataCap Application] <AIOCP> - <GPU Cloud computing & Cloud Storage>

Open aiocp opened this issue 3 years ago • 73 comments

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

  • Organization Name: AIOCP
  • Website / Social Media: https://www.aiocp.co.kr/
  • Total amount of DataCap being requested (between 500 TiB and 5 PiB): 2 PiB
  • Weekly allocation of DataCap requested (usually between 1-100TiB): 100 TiB
  • On-chain address for first allocation: f1dj23xokyovdqnbgx3nis3ygk73szanzlwb2kduy

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

AIOCP, established in 2012, started by offering network equipment such as GPUs, servers, switches, and storage, and later extended its business to cloud computing services, particularly one-on-one B2B consulting on network infrastructure.

Now, we are about to release a brand-new cloud computing service called “BIGBANGCLOUD”, which can use GPU resources in a virtual environment. Unlike typical cloud services, its infrastructure is built on IPFS technology, which we consider a strength.

The resources of the basic cloud services we currently offer are deployed in an on-premise environment. However, we realized that decentralization and high availability are the core values in the era of Web 3.0. We have been investing in IPFS development for over two years, and now is the time to get started.

BIGBANGCLOUD will be the right choice for start-ups and medical/educational institutions that run 4th industrial revolution technologies such as autonomous driving, mobility, big data, and AI.

What is the primary source of funding for this project?

Business income

What other projects/ecosystem stakeholders is this project associated with?

None.

Use-case details

Describe the data being stored onto Filecoin

Currently, we offer cloud computing as one of our main services, and we are about to release an innovative hosting service using GPU resources. In the meantime, we have been looking for virtual storage that can store and retrieve data for our GPU hosting services. Our answer is to build the infrastructure on IPFS technology.

Clients who use cloud computing services require space to store their data, that is, cloud storage.
The data we plan to put into Filecoin is mostly our clients' public data, plus R&D data and container images from our own resources.

The R&D data refers to test results and videos from developing GPU cloud hosting, and the container images are tools with TensorFlow, PyTorch, Keras, and cuDNN installed that we would provide to clients.

Where was the data in this dataset sourced from?

Mostly our clients' public data running in cloud services, and container images built from open-source software by our R&D team.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

https://drive.google.com/drive/folders/1MPFQo2FGqOS6AuQGkDvs0Uo1-patTOTd?usp=share_link

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes. 

What is the expected retrieval frequency for this data?

Whenever our clients are provisioning new services and storing some applications or their own data. 

For how long do you plan to keep this dataset stored on Filecoin?

540 days at least. 

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

Prefer regions in Asia

How will you be distributing your data to storage providers? Is there an offline data transfer process?

We will use both online and offline transfer, depending on the SP's request.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

Please answer here.

How will you be distributing deals across storage providers?

GitHub and Slack can help us find more SPs with a good reputation and sufficient resources. We would like to contact SPs from different regions for distributed storage.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

yes. 

aiocp avatar Nov 18 '22 02:11 aiocp

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

Datacap Request Trigger

Total DataCap requested

2PiB

Expected weekly DataCap usage rate

100TiB

Client address

f1dj23xokyovdqnbgx3nis3ygk73szanzlwb2kduy

simonkim0515 avatar Nov 29 '22 17:11 simonkim0515

To sign the DC allocation, I would like to ask you a few questions first.

  1. It is difficult to verify 5 PiB with sample data. Please provide the basis for applying for 5 PiB, for example a screenshot of the data size you have.
  2. Is the customer data stored in the cloud publicly available?
  3. Do you own a Filecoin node? How would you respond to concerns about self-dealing during customer validation?

psh0691 avatar Dec 07 '22 05:12 psh0691

Hello, @psh0691

  1. We have 400 TB for one copy. I would like to make 5 copies and distribute one copy to 5 SPs. So the total is 2 PiB. I have just revised the application.
  2. Yes. They are all public data.
  3. No, we do not have a Filecoin node.

aiocp avatar Dec 08 '22 06:12 aiocp
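
The sizing in the reply above (400 TB per copy, 5 copies, 2 PiB in total) can be checked with a few lines of arithmetic. This is only a sketch: whether "400TB" means decimal terabytes or binary tebibytes is not stated in the thread, so both readings are computed, and the variable names are illustrative.

```python
# Sanity check of the requested DataCap: 5 copies of a 400 TB dataset.
# Assumption: the thread does not say whether "400TB" is decimal (TB)
# or binary (TiB), so both interpretations are shown.

TB = 10**12    # decimal terabyte, in bytes
TIB = 2**40    # tebibyte, in bytes
PIB = 2**50    # pebibyte, in bytes

copy_size = 400   # size of one copy, in TB or TiB depending on the reading
copies = 5        # one copy per storage provider

total_if_decimal_pib = copy_size * copies * TB / PIB
total_if_binary_pib = copy_size * copies * TIB / PIB

print(f"400 TB  x 5 = {total_if_decimal_pib:.2f} PiB")
print(f"400 TiB x 5 = {total_if_binary_pib:.2f} PiB")
```

Under either reading the total comes out slightly below the 2 PiB requested, so the revised figure is a rounded-up estimate rather than an exact product.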

@simonkim0515 I have revised the total DataCap request from 5 PiB to 2 PiB. It was already approved two weeks ago. Can you please check and pull the trigger again? @raghavrmadya @galen-mcandrew @Kevin-FF-USA

aiocp avatar Dec 12 '22 07:12 aiocp

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1dj23xokyovdqnbgx3nis3ygk73szanzlwb2kduy

DataCap allocation requested

50TiB

Id

097bf9c9-d573-4283-b7d0-cad7c834bc5f

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacedgduadxfy34alg2zlw67qxiviavjn6w5aewcwjop7jl34wfhic4i

Address

f1dj23xokyovdqnbgx3nis3ygk73szanzlwb2kduy

Datacap Allocated

50.00TiB

Signer Address

f1qdko4jg25vo35qmyvcrw4ak4fmuu3f5rif2kc7i

Id

097bf9c9-d573-4283-b7d0-cad7c834bc5f

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedgduadxfy34alg2zlw67qxiviavjn6w5aewcwjop7jl34wfhic4i

psh0691 avatar Dec 15 '22 13:12 psh0691

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedvmrrrkyokz3nc7sdxwzxc3sg356ixfnjhypedrqcmfo5yrjwi5s

Address

f1dj23xokyovdqnbgx3nis3ygk73szanzlwb2kduy

Datacap Allocated

50.00TiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

Id

097bf9c9-d573-4283-b7d0-cad7c834bc5f

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedvmrrrkyokz3nc7sdxwzxc3sg356ixfnjhypedrqcmfo5yrjwi5s

kernelogic avatar Dec 20 '22 08:12 kernelogic

@aiocp

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others. Please answer here.

1. Please answer the question above.

I would like to make 5 copies and distribute one copy to 5 SPs.

2. It seems you intend to make deals with 5 SPs. Can you list the SPs you have contacted so far?

IreneYoung avatar Dec 24 '22 05:12 IreneYoung

Hi, @IreneYoung

  1. We plan to choose SPs that meet our requirements: a stable network connection, specifically a line over a 10G port, and storage capacity in Asia. We think these conditions will keep this large volume of data stably retrievable in the future. We also plan to select SPs that are interested in the retrieval market, which is expected to grow in 2023.

To that end, we participated in some community and network events in Seoul hosted by Protocol Labs and met SPs to discuss the Filecoin roadmap. We are still deciding on suitable SPs that can support us.

  2. We have found 3 SPs that meet our requirements, but we are still deciding on the other 2 from the list of SPs we have contacted.

aiocp avatar Dec 27 '22 08:12 aiocp

DataCap Allocation requested

Request number 2

Multisig Notary address

f02049625

Client address

f1dj23xokyovdqnbgx3nis3ygk73szanzlwb2kduy

DataCap allocation requested

100TiB

Id

c159d966-9f96-49b1-a104-6951c9a62a40

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1dj23xokyovdqnbgx3nis3ygk73szanzlwb2kduy

Last two approvers

kernelogic & psh0691

Rule to calculate the allocation request amount

100% of weekly dc amount requested

DataCap allocation requested

100TiB

Total DataCap granted for client so far

50TiB

Datacap to be granted to reach the total amount requested by the client (2 PiB)

1.95PiB
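
The figures reported above can be reproduced with simple arithmetic. The sketch below assumes 1 PiB = 1024 TiB and uses illustrative variable names; it only restates the numbers in this report and is not the bot's actual logic.

```python
# Reproduce the remaining-DataCap figure and the size of the next tranche.
TIB_PER_PIB = 1024

total_requested_tib = 2 * TIB_PER_PIB   # 2 PiB requested in total
granted_so_far_tib = 50                 # first allocation, already granted
weekly_request_tib = 100                # weekly allocation from the application

# Rule stated in the report: 100% of the weekly DataCap amount requested.
next_tranche_tib = 1.00 * weekly_request_tib

remaining_tib = total_requested_tib - granted_so_far_tib
remaining_pib = remaining_tib / TIB_PER_PIB

print(f"next tranche: {next_tranche_tib:.0f} TiB")
print(f"remaining: {remaining_tib} TiB = {remaining_pib:.2f} PiB")
```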

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 3378 | 3 | 50TiB | 42.12 | 12.24TiB |

DataCap and CID Checker Report[^1]

  • Organization: AIOCP
  • Client: f1dj23xokyovdqnbgx3nis3ygk73szanzlwb2kduy

Approvers

1 kernelogic
1 psh0691

Storage Provider Distribution

The below table shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes a verified deal, it will be marked as new.

For most DataCap applications, the restrictions below should apply.

Since this is the 3rd allocation, the following restrictions have been relaxed:

  • Storage provider should not exceed 70% of total datacap.
  • Storage provider should not be storing duplicate data for more than 20%.
  • Storage provider should have published its public IP address.
  • All storage providers should be located in different regions.

✔️ Storage provider distribution looks healthy.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01956198 (new) | Seoul, Seoul, KR (EHOSTICT) | 15.20 TiB | 44.47% | 15.18 TiB | 0.10% |
| f01873489 (new) | Seoul, Seoul, KR (EHOSTICT) | 14.30 TiB | 41.86% | 14.24 TiB | 0.44% |
| f0521569 | Seoul, Seoul, KR (Korea Telecom) | 4.67 TiB | 13.67% | 4.63 TiB | 0.84% |
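
The two numeric thresholds listed above (70% per provider, 20% duplicates) can be checked against the table with a short script. This is a sketch under assumptions: the dictionary layout and constant names are hypothetical, the checker's real implementation is not shown in this thread, and the region and public-IP rules are not covered here.

```python
# Recompute each provider's share of sealed deals from the table above and
# check the two numeric thresholds from the report.
# Data layout and names are illustrative assumptions, not the checker's code.

MAX_PROVIDER_SHARE = 70.0    # max percent of total DataCap per provider
MAX_DUPLICATE_SHARE = 20.0   # max percent of duplicate data per provider

providers = {
    "f01956198": {"sealed_tib": 15.20, "duplicate_pct": 0.10},
    "f01873489": {"sealed_tib": 14.30, "duplicate_pct": 0.44},
    "f0521569":  {"sealed_tib": 4.67,  "duplicate_pct": 0.84},
}

total_tib = sum(p["sealed_tib"] for p in providers.values())

violations = []
for provider_id, p in providers.items():
    share_pct = 100 * p["sealed_tib"] / total_tib
    p["share_pct"] = share_pct
    if share_pct > MAX_PROVIDER_SHARE:
        violations.append((provider_id, "share"))
    if p["duplicate_pct"] > MAX_DUPLICATE_SHARE:
        violations.append((provider_id, "duplicates"))
    print(f"{provider_id}: {share_pct:.2f}% of sealed deals")

print("violations:", violations)
```

The recomputed shares can differ from the report in the second decimal place because the table values are already rounded. All three providers are listed in Seoul, which this sketch does not evaluate.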

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

Since this is the 3rd allocation, the following restrictions have been relaxed:

  • No more than 70% of unique data are stored with less than 3 providers.

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 17.62 TiB | 17.65 TiB | 1 | 51.65% |
| 8.22 TiB | 16.52 TiB | 2 | 48.35% |
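
The warning above follows directly from the table: every row has fewer than 3 providers. A minimal sketch of that calculation, with hypothetical names and the table rows typed in by hand:

```python
# Share of deal volume whose data is replicated across fewer than 3 providers.
# Rows are (unique data TiB, total deals TiB, number of providers), copied
# from the table above; names are illustrative assumptions.

MIN_PROVIDERS = 3

rows = [
    (17.62, 17.65, 1),
    (8.22, 16.52, 2),
]

total_deals = sum(deals for _, deals, _ in rows)

# Per-row deal percentage, rounded as in the report.
shares = [round(100 * deals / total_deals, 2) for _, deals, _ in rows]

under_replicated = sum(deals for _, deals, n in rows if n < MIN_PROVIDERS)
under_pct = 100 * under_replicated / total_deals

print("deal percentages:", shares)
print(f"{under_pct:.2f}% of deals are replicated across fewer than {MIN_PROVIDERS} providers")
```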

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually different applications own different data and should not resolve to the same CID.

However, this could be possible if all of the clients below use the same software to prepare the exact same dataset or they belong to a series of LDN applications for the same dataset.

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

Hello,

I see that you have only stored in one location. What is the reason that the rules of Fil+ are not followed?

cryptowhizzard avatar Feb 09 '23 13:02 cryptowhizzard

Hi, @cryptowhizzard The main reason we stored in only one location is the delivery time of the data.

We have a lot of data to store in this first transaction and need to do it as fast as we can, so we decided to choose SPs located near us. Also, I prefer to have business meetings with SPs in person so that I can trust how they run their miner nodes. I met them in the SP community.

Only one location, Seoul, is shown, but we did not allocate everything to one company; the SPs are located in different areas of Seoul. To follow the Fil+ rules, I have also allocated data to two SPs in China. I think the CID checker has not been updated yet.

aiocp avatar Feb 13 '23 07:02 aiocp

@aiocp - You cannot simply choose the SPs located nearest to you when the Fil+ rules clearly state that data must be distributed among multiple regions and SPs. When do you expect to start following the Fil+ rules?

herrehesse avatar Feb 13 '23 08:02 herrehesse

@herrehesse When the CID Checker report was generated, we had only 3 SPs in Korea, but we have since added 2 SPs in China to follow the rules. Please check the details at the URL below. https://filplus.d.interplanetary.one/clients/f01936354/breakdown So, I think I have followed the rules so far.

Delivering data abroad takes far too long, so I preferred Seoul, but apparently storing in one city should be avoided (I got some advice from the Slack channel).

My customers are mostly in Asia, so I have to think about the download time from the client's perspective. But to follow the allocation rules, I will find more SPs in Asia.

Would that be okay?

aiocp avatar Feb 15 '23 04:02 aiocp

I am not supportive of reducing the regional distribution to China (Asia) only. I will support this if you are able to spread the data to the EU/USA too.

Storage prices are negative at this point in time, so I am sure you can find ways to distribute the data.

All good applications do this, and in my opinion it is needed to grant DataCap.

herrehesse avatar Feb 15 '23 06:02 herrehesse

If you can show details about more SPs and your plan for cooperating with them, I could consider signing the next batch of DC for you. Thanks.

GaryGJG avatar Feb 15 '23 07:02 GaryGJG

@herrehesse @GaryGJG SG (f01777785), JP (f01153105), AUS (f01777777), US (f0717969), KR (f01956198, f01873489). I found them on Filgram. I am planning to have 6 SPs for the next batch.

aiocp avatar Feb 17 '23 04:02 aiocp

Hi,

I am sorry, but f01777785, f01153105, and f0717969 are involved in abuse. If you want me to sign, I won't let you store on these.

f01777777, f01956198 and f01873489 are ok.

Can you find some SPs with a good standing reputation in the US? GreaterHeat or PikNik might be an option for you.

cryptowhizzard avatar Feb 17 '23 11:02 cryptowhizzard

@cryptowhizzard Oh, I didn't know that they were involved in abuse. How can I find out? By searching for their miner addresses in Slack?

Anyway, I found another two, f01971600 and f01992630, from GreaterHeat in the USA. I have discussed this with them in Slack.

So, if possible, it will be these 5 SPs for the second batch: f01777777, f01956198, f01873489, f01971600, f01992630

aiocp avatar Feb 21 '23 08:02 aiocp
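
The revised SP list in the reply above can be derived from the earlier proposal by dropping the providers the notary flagged and adding the two replacements. The sketch below only restates IDs and region labels from this thread; the variable names are illustrative, and the labels (including "AUS") are kept exactly as the applicant wrote them.

```python
# Derive the second-batch SP list from the thread's own data.
# Region labels are copied as written by the applicant; names are illustrative.

proposed = {
    "f01777785": "SG",
    "f01153105": "JP",
    "f01777777": "AUS",
    "f0717969": "US",
    "f01956198": "KR",
    "f01873489": "KR",
}

# Providers the notary declined to sign for.
flagged = {"f01777785", "f01153105", "f0717969"}

# Replacements the applicant found afterwards (GreaterHeat, USA).
replacements = {"f01971600": "US", "f01992630": "US"}

kept = {sp: region for sp, region in proposed.items() if sp not in flagged}
final = {**kept, **replacements}
regions = sorted(set(final.values()))

print("second batch:", ", ".join(final))
print("regions covered:", regions)
```

The result is 5 providers across 3 region labels, which matches the list given for the second batch.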

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacebp5bcf74ql7dpqog4qddm5gp45yfihajulbm3g7776qjkosaibwe

Address

f1dj23xokyovdqnbgx3nis3ygk73szanzlwb2kduy

Datacap Allocated

100.00TiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

Id

c159d966-9f96-49b1-a104-6951c9a62a40

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebp5bcf74ql7dpqog4qddm5gp45yfihajulbm3g7776qjkosaibwe

cryptowhizzard avatar Feb 21 '23 13:02 cryptowhizzard

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacebx54prcfves557p2clyuf7ely5ph2lpj33q354hvbutzv2pgio3c

Address

f1dj23xokyovdqnbgx3nis3ygk73szanzlwb2kduy

Datacap Allocated

100.00TiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

Id

c159d966-9f96-49b1-a104-6951c9a62a40

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebx54prcfves557p2clyuf7ely5ph2lpj33q354hvbutzv2pgio3c

kernelogic avatar Feb 22 '23 08:02 kernelogic

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] avatar Jul 23 '23 01:07 github-actions[bot]

Keep this LDN open. It's in progress.

aiocp avatar Jul 24 '23 01:07 aiocp

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] avatar Aug 04 '23 01:08 github-actions[bot]

Keep this open. thanks

aiocp avatar Aug 04 '23 03:08 aiocp

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] avatar Aug 16 '23 01:08 github-actions[bot]