filecoin-plus-large-datasets
[DataCap Application] <Edwardext> - <SDSS Dataset >
Data Owner Name
Sloan Digital Sky Survey
Data Owner Country/Region
United States
Data Owner Industry
Environment
Website
https://www.sdss.org
Social Media
https://twitter.com/sdssurveys
https://www.youtube.com/user/sdssurveys
Total amount of DataCap being requested
5PiB
Weekly allocation of DataCap requested
600TiB
On-chain address for first allocation
f15dqakgac2j2keky2up6oz2qidxhm3fssqnbghoy
Data Type of Application
Public, Open Dataset (Research/Non-Profit)
Custom multisig
- [ ] Use Custom Multisig
Identifier
No response
Share a brief history of your project and organization
I am a researcher and developer in blockchain technology. In 2020, I participated in the Filecoin Space Race as a member of a technical support team, providing technical assistance to other storage providers (SPs).
I am deeply interested in Filecoin's complex storage system and have a solid understanding of mining technology, as well as extensive industry experience. Through system optimization, hardware matching, and process optimization, I have developed a Lotus-based mining system suitable for large-scale production.
I have a team of three developers in China, and we are currently developing an ecosystem application based on FVM.
In line with Filecoin's emphasis on valid data, I would like to onboard this public dataset as valid data and recommend it to the SPs I know or work with.
I will supervise and guide the SPs I collaborate with so that they strictly follow the community's rules for sealing valid data.
Is this project associated with other projects/ecosystem stakeholders?
Yes
If answered yes, what are the other projects/ecosystem stakeholders
issue 1827/1841
Describe the data being stored onto Filecoin
The SDSS (Sloan Digital Sky Survey) dataset is a publicly available astronomical observation dataset of the night sky. It has been obtained over the years by the Sloan Foundation Telescope located in New Mexico, USA.
The SDSS dataset covers more than 500 million astronomical objects, such as stars, galaxies, quasars, and asteroids, collected across five survey phases, and is about 407T in size. The data include the positions, brightness, and colors of celestial objects. The dataset also includes spectra of about 4 million objects, which provide information about their chemical composition, distance, and velocity.
Where was the data currently stored in this dataset sourced from
AWS Cloud
If you answered "Other" in the previous question, enter the details here
The SDSS dataset is stored in multiple locations: AWS and the SDSS Science Archive Server (SAS).
It can be publicly accessed and downloaded from the SDSS website.
How do you plan to prepare the dataset
others/custom tool
If you answered "other/custom tool" in the previous question, enter the details here
Self-developed tool
Please share a sample of the data
https://www.sdss4.org/dr17
https://www.sdss.org/dr18/
Confirm that this is a public dataset that can be retrieved by anyone on the Network
- [X] I confirm
If you chose not to confirm, what was the reason
No response
What is the expected retrieval frequency for this data
Sporadic
For how long do you plan to keep this dataset stored on Filecoin
1 to 1.5 years
In which geographies do you plan on making storage deals
Greater China, Asia other than Greater China, Africa, North America, South America, Europe, Australia (continent), Antarctica
How will you be distributing your data to storage providers
HTTP or FTP server, Shipping hard drives, Others
How do you plan to choose storage providers
Slack, Partners
If you answered "Others" in the previous question, what is the tool or platform you plan to use
No response
If you already have a list of storage providers to work with, fill out their names and provider IDs below
f01980952 CN
f01943941 CN
f0150816 CN
f02031264 SGP
f02052244 US
f02052252 US
How do you plan to make deals to your storage providers
Lotus client
If you answered "Others/custom tool" in the previous question, enter the details here
No response
Can you confirm that you will follow the Fil+ guideline
Yes
Thanks for your request!
Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!
Thanks for your request! Everything looks good. :ok_hand:
A Governance Team member will review the information provided and contact you back pretty soon.
Datacap Request Trigger
Total DataCap requested
5PiB
Expected weekly DataCap usage rate
600TiB
Client address
f15dqakgac2j2keky2up6oz2qidxhm3fssqnbghoy
DataCap Allocation requested
Multisig Notary address
f02049625
Client address
f15dqakgac2j2keky2up6oz2qidxhm3fssqnbghoy
DataCap allocation requested
256TiB
Id
6c2a5269-259f-455b-80d5-016f4e36f8cc
DataCap Allocation requested
Multisig Notary address
f02049625
Client address
f15dqakgac2j2keky2up6oz2qidxhm3fssqnbghoy
DataCap allocation requested
256TiB
Id
d7568554-f3d3-4ef6-b98e-f2f9b841c7e9
DataCap and CID Checker Report[^1]
No application info found for this issue on https://filplus.d.interplanetary.one/clients.
[^1]: To manually trigger this report, add a comment with text checker:manualTrigger
Does downloading from "https://www.sdss.org" require permission? What data do you plan to download? Can you explain?
Hi @a1991car, thank you for your attention and questions. All the data provided for download on SDSS are publicly available.
The data submitted this time consists of celestial object data and spectra.
Request Proposed
Your Datacap Allocation Request has been proposed by the Notary
Message sent to Filecoin Network
bafy2bzacedroq2zsrstbazejdwbrvr2d3vemvtdhgum7lks3ih4mwuer26ra2
Address
f15dqakgac2j2keky2up6oz2qidxhm3fssqnbghoy
Datacap Allocated
256.00TiB
Signer Address
f1qnumecdypgrbaebtkdfjnwt5ndacadcuas3deiq
Id
d7568554-f3d3-4ef6-b98e-f2f9b841c7e9
You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedroq2zsrstbazejdwbrvr2d3vemvtdhgum7lks3ih4mwuer26ra2
DataCap and CID Checker Report[^1]
No application info found for this issue on https://filplus.d.interplanetary.one/clients.
[^1]: To manually trigger this report, add a comment with text checker:manualTrigger
Request Approved
Your Datacap Allocation Request has been approved by the Notary
Message sent to Filecoin Network
bafy2bzaceb63qubp4pgw64cfpob6be4tudays7f642gbmzuaqn6oehxzcacui
Address
f15dqakgac2j2keky2up6oz2qidxhm3fssqnbghoy
Datacap Allocated
256.00TiB
Signer Address
f16karfxq7lxdy7izqrzrk75jf3not34k6sg6zvcy
Id
You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceb63qubp4pgw64cfpob6be4tudays7f642gbmzuaqn6oehxzcacui
checker:manualTrigger
DataCap and CID Checker Report[^1]
No application info found for this issue on https://filplus.d.interplanetary.one/clients.
[^1]: To manually trigger this report, add a comment with text checker:manualTrigger
@NewHuoPool
Can you let us (the community) know what due diligence has been done for this client? What is the data onboarding plan for this client? Where is he/she going to store the data, and who is he/she? What is their internet capacity? On what basis did you approve this application, and why?
From our own experience, I know that downloading this dataset is a pain. I wonder what data is going to be stored, where, and when. Can you enlighten us, @edwardext?
We have been downloading this dataset ever since we submitted the application. As the official SDSS website shows, the dataset has gone through five phases and 18 data releases. The data we plan to store for this project exceeds 700T, and we plan to store five copies. Allowing for the redundancy involved in sealing, we applied for 5PiB of DataCap. Because downloading and transfer take time, we will download the dataset in five stages and then ship the data to each SP on hard disks. We expect to start sealing by the end of April at the earliest and in May at the latest.
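As a rough check of the sizing in the reply above, the following sketch multiplies the stated dataset size by the stated number of copies and compares the result with the requested total. It assumes that "700T" means 700 TiB and treats it as a lower bound, since the reply only says the data exceeds that figure. The variable names are illustrative and do not come from the thread.

```python
TIB_PER_PIB = 1024

# Figures quoted in the reply; the TiB unit is an assumption ("700T" in the thread).
dataset_tib = 700            # lower bound: the data "exceeds 700T"
copies = 5                   # planned number of replicas
requested_tib = 5 * TIB_PER_PIB  # 5PiB total DataCap requested

# Raw capacity needed to hold every copy, before any sealing overhead.
raw_need_tib = dataset_tib * copies
raw_need_pib = raw_need_tib / TIB_PER_PIB

# DataCap left over after the raw copies are accounted for.
headroom_tib = requested_tib - raw_need_tib

print(f"raw need: {raw_need_tib} TiB ({raw_need_pib:.2f} PiB)")
print(f"headroom within 5 PiB: {headroom_tib} TiB")
```

Under these assumptions the five copies account for roughly 3.4 PiB, so the remainder of the 5PiB request would have to be explained by data beyond the 700T lower bound or by the sealing redundancy the applicant mentions.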
DataCap Allocation requested
Request number 2
Multisig Notary address
f02049625
Client address
f15dqakgac2j2keky2up6oz2qidxhm3fssqnbghoy
DataCap allocation requested
512TiB
Id
0edafe77-d669-46ad-a72d-90256c925f20
Stats & Info for DataCap Allocation
Multisig Notary address
f02049625
Client address
f15dqakgac2j2keky2up6oz2qidxhm3fssqnbghoy
Rule to calculate the allocation request amount
10% of total dc amount requested
DataCap allocation requested
512TiB
Total DataCap granted for client so far
256TiB
Datacap to be granted to reach the total amount requested by the client (5PiB)
4.75PiB
Stats
| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
|---|---|---|---|---|
| null | null | 256TiB | null | 44.87TiB |
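The figures in this allocation summary can be reproduced with simple arithmetic. The sketch below applies only the rule quoted above (10% of the total amount requested) and the totals reported by the bot. It assumes 1 PiB = 1024 TiB, and the variable names are illustrative.

```python
TIB_PER_PIB = 1024

total_requested_tib = 5 * TIB_PER_PIB  # 5PiB requested in total
granted_so_far_tib = 256               # first allocation already granted

# Rule quoted in the summary: 10% of the total DataCap amount requested.
next_allocation_tib = total_requested_tib * 10 // 100

# DataCap still to be granted to reach the requested total.
remaining_to_grant_tib = total_requested_tib - granted_so_far_tib
remaining_to_grant_pib = remaining_to_grant_tib / TIB_PER_PIB

print(f"next allocation: {next_allocation_tib} TiB")
print(f"remaining to grant: {remaining_to_grant_pib} PiB")
```

Both results agree with the summary: a 512TiB allocation request, and 4.75PiB left to grant before the client reaches 5PiB.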
checker:manualTrigger
DataCap and CID Checker Report Summary[^1]
Storage Provider Distribution
✔️ Storage provider distribution looks healthy.
Deal Data Replication
⚠️ 100.00% of deals are for data replicated across less than 4 storage providers.
Deal Data Shared with other Clients[^3]
✔️ No CID sharing has been observed.
[^1]: To manually trigger this report, add a comment with text checker:manualTrigger
[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger
[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...
Full report
Click here to view the full report.
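To illustrate what the replication warning in this report measures, here is a minimal sketch that computes the share of deals whose data sits with fewer than 4 distinct storage providers. It is not the checker's actual implementation: the function name, the `(piece_cid, provider)` tuple layout, and the sample deals are hypothetical, and the real tool may weight deals differently (for example by size).

```python
from collections import defaultdict


def low_replication_share(deals, min_providers=4):
    """Percentage of deals whose piece is held by fewer than min_providers SPs.

    deals is a list of (piece_cid, provider_id) tuples (hypothetical layout).
    """
    if not deals:
        return 0.0

    # Collect the distinct providers that store each piece.
    providers_by_piece = defaultdict(set)
    for piece_cid, provider_id in deals:
        providers_by_piece[piece_cid].add(provider_id)

    # Count the deals whose piece has too few distinct providers.
    low = sum(
        1
        for piece_cid, _ in deals
        if len(providers_by_piece[piece_cid]) < min_providers
    )
    return 100.0 * low / len(deals)


# Hypothetical first-round deals spread over only three providers.
sample_deals = [
    ("piece-a", "sp-1"),
    ("piece-a", "sp-2"),
    ("piece-a", "sp-3"),
    ("piece-b", "sp-1"),
    ("piece-b", "sp-2"),
]
share = low_replication_share(sample_deals)
print(f"{share:.2f}% of deals are replicated across fewer than 4 providers")
```

With only three providers in play, no piece can reach four replicas, so every deal falls under the threshold, which matches the 100.00% figure the report shows for a first round.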
Request Proposed
Your Datacap Allocation Request has been proposed by the Notary
Message sent to Filecoin Network
bafy2bzacebfstxu4dedrp7omhhcdv6qpt6twwask6kpwd6aongnn6o4wwc6no
Address
f15dqakgac2j2keky2up6oz2qidxhm3fssqnbghoy
Datacap Allocated
512.00TiB
Signer Address
f1nwjsd2mc6hu4qrwnmd6ukrfkuu4h5fhs7u3exii
Id
0edafe77-d669-46ad-a72d-90256c925f20
You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebfstxu4dedrp7omhhcdv6qpt6twwask6kpwd6aongnn6o4wwc6no
checker:manualTrigger
DataCap and CID Checker Report Summary[^1]
Storage Provider Distribution
✔️ Storage provider distribution looks healthy.
Deal Data Replication
⚠️ 100.00% of deals are for data replicated across less than 4 storage providers.
Deal Data Shared with other Clients[^3]
✔️ No CID sharing has been observed.
[^1]: To manually trigger this report, add a comment with text checker:manualTrigger
[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger
[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...
Full report
Click here to view the full report.
The CID Checker results look acceptable for a first round, although the data is currently distributed among only 3 SPs, which is why the report warns: "⚠️ 100.00% of deals are for data replicated across less than 4 storage providers." I hope to see more SPs in subsequent rounds.
A few randomly selected CIDs were retrieved and verified with the CID Tool, and all retrievals succeeded.
https://filecoin.tools/f02114878
Request Approved
Your Datacap Allocation Request has been approved by the Notary
Message sent to Filecoin Network
bafy2bzaceadebdkrnuoo7uhack2r5apzjiiilhezf3cwltjs4ismslh4w6kxk
Address
f15dqakgac2j2keky2up6oz2qidxhm3fssqnbghoy
Datacap Allocated
512.00TiB
Signer Address
f1e77zuityhvvw6u2t6tb5qlnsegy2s67qs4lbbbq
Id
0edafe77-d669-46ad-a72d-90256c925f20
You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceadebdkrnuoo7uhack2r5apzjiiilhezf3cwltjs4ismslh4w6kxk
checker:manualTrigger
DataCap and CID Checker Report Summary[^1]
Storage Provider Distribution
✔️ Storage provider distribution looks healthy.
Deal Data Replication
⚠️ 100.00% of deals are for data replicated across less than 4 storage providers.
Deal Data Shared with other Clients[^3]
✔️ No CID sharing has been observed.
[^1]: To manually trigger this report, add a comment with text checker:manualTrigger
[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger
[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...
Full report
Click here to view the full report.
checker:manualTrigger