Comparison: Decentralized storage

Layer 1 and Layer 2 market movers of decentralized cloud storage

Today I would like to bring some attention to a web3 discipline heavily overshadowed by the DeFi and NFT sectors. Borrowing the voice of critics, one could say the current state of DeFi and NFTs is just a synthetic economy pouring capital into a limited group of people without solving any real-world problem. On the other hand, there is a technology with the potential to optimize the billions of dollars spent every month on maintaining internet content and data, all in a decentralized manner where anyone can benefit from participating.

Peer-to-peer data protocols have been around for more than 20 years, since the introduction of BitTorrent, which let regular users share any data with each other using just the internet. Blockchain technology adds the next step to the picture with architectural properties that make it possible to:

  • Create a decentralized shared economy in any industry, using consensus mechanisms like Proof of Work (PoW) or Proof of Stake (PoS)
  • Support smart contracts, self-executing business logic triggered by given conditions

Thanks to these concepts, existing protocols can evolve from sharing a favorite movie or song to letting users rent out part of their computer's storage to others in need, helping them with:

  • Enterprise data backups
  • Media storage for dApps operating with pictures/audio/video
  • Website Hosting

This article briefly introduces the key protocols that play a role in the decentralized storage market right now or will in the near future. They complement the already massively growing IT services market that includes Amazon Web Services (AWS), Google Cloud and Microsoft Azure, projected to grow from $78.6b in 2022 to $183.7b by 2027.

Summarized protocols

Decentralized cloud storage — Comparison


1. IPFS

IPFS, the InterPlanetary File System, is a distributed system for storing and accessing files, websites, applications, and data. The most widely used L1 decentralized peer-to-peer storage combines three key concepts:

  • Addressing by Content Identifier (CID) instead of physical location. The CID is derived from a cryptographic (SHA) hash of the content, so if the content is altered, validation fails because the hash won't match
  • Linking content via Directed Acyclic Graphs (DAGs), mainly the Merkle DAG, which Git also uses to connect all the commits in a repository. Check dag.ipfs.io to visualize a DAG
  • Content discovery using Distributed Hash Tables (DHTs), whose key-value pairs record which peers are hosting the requested content, with libp2p as a core component.
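The three concepts above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the IPFS implementation: the identifier is a bare SHA-256 hex digest (real CIDs add multihash and codec metadata), the DAG node encoding is made up for the example, and the `providers` dictionary only stands in for the distributed hash table. All function and peer names are hypothetical.

```python
import hashlib
import json


def content_id(data: bytes) -> str:
    """Toy content identifier: the hex SHA-256 digest of the bytes.

    Assumption: real IPFS CIDs wrap the digest in extra metadata; this
    sketch keeps only the "address = hash of content" idea.
    """
    return hashlib.sha256(data).hexdigest()


def dag_node(payload: bytes, links: list[str]) -> tuple[str, bytes]:
    """Build a toy Merkle DAG node that references children by their ids."""
    encoded = json.dumps({"payload": payload.hex(), "links": links}).encode()
    return content_id(encoded), encoded


def verify(expected_id: str, data: bytes) -> bool:
    """A downloader re-hashes what it received and compares it to the id."""
    return content_id(data) == expected_id


# Two chunks of a file, then a root node that links to both in order.
chunk_a, chunk_b = b"hello ", b"world"
id_a, id_b = content_id(chunk_a), content_id(chunk_b)
root_id, root_bytes = dag_node(b"", [id_a, id_b])

# Toy stand-in for the DHT: a key-value map from content id to hosting peers.
providers: dict[str, set[str]] = {
    id_a: {"peer-1", "peer-2"},
    id_b: {"peer-2"},
    root_id: {"peer-3"},
}

print(verify(id_a, chunk_a))    # True: content matches its address
print(verify(id_a, b"hello!"))  # False: altered content, hash won't match
print(sorted(providers[id_a]))  # peers to ask for the first chunk
```

Because the root node embeds the ids of its children, changing any chunk changes the root id as well, which is the property Git relies on for commits.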

Interplanetary FileSystem

A large network of nodes (200 000+) is designed to spread pieces of individual files across multiple devices and to manage data flow effectively based on demand.

  • Using IPFS is absolutely FREE. As with BitTorrent, you just need to be part of the network and share any amount of data. The important thing to remember is that data won't persist by default unless you run a node 24/7 or use a third-party service to do that for you.
  • Security: IPFS is a public protocol, and that includes the contents of the files themselves unless they are encrypted separately. CIDs are public, and additional measures are required to keep data private.

Applications

Luckily, to mitigate these IPFS issues and advance its usage, peripheral protocols built on top of IPFS have developed services that do the hard stuff for you:

  • Ceramic, a permissionless data streaming network for data storage, decentralized identity, and much more
  • Fission, a WebNative file system with encrypted-at-rest storage capabilities
  • Fleek/Spheron, decentralized hosting for your frontend. React/Next is primarily supported, with the possibility to deploy other frontend frameworks.
  • Lit Protocol, an L2 privacy layer to securely store IPFS data
  • OrbitDB, a peer-to-peer, serverless, decentralized database leveraging IPFS Pubsub
  • Pinata, a CDN with 200+ caching locations all over the world. It simplifies management of IPFS files and lets you manage premium content via its Submarine product. Ideal for storing and maintaining an NFT business.
  • Filebase, which, like Pinata, offers a UI for pinning files, in this case to IPFS, Storj, Sia or Skynet. It also provides tooling to sync with AWS S3, including JavaScript and Python SDKs

1.1 Filecoin

Filecoin is one of the missing incentive layers (L2) for IPFS. It verifies that data is being stored while maintaining the efficiency, authenticity and resiliency provided by IPFS.

Filecoin enhances IPFS with:

  • Storage market, which sets the price of data storage according to market conditions.
  • Filgram, the first storage provider discovery and client marketplace tool
  • Data indexer to improve search quality and performance

Developers can interact with Filecoin in multiple ways:

  • Powergate exposes a higher-level API that makes it easier to interact with IPFS and Filecoin nodes via a CLI or gRPC API endpoints with JavaScript or Go
  • Lotus CLI is a more powerful, though more complicated, toolset for managing Filecoin data through Lotus nodes.

Filecoin vs IPFS

  • IPFS allows peers to store, request, and transfer verifiable data with each other (like BitTorrent)
  • Filecoin is designed to provide persistent data storage, guaranteeing that miners have correctly stored the data they committed to maintain.

Filecoin Economy

Contract-based storage can be thought of more simply as a pay-as-you-go model. Over 4000 miners participate in the Filecoin network, compensated in the FIL token at a price agreed in data marketplaces, right now:

  • $0.0000002 per GB per month

In comparison, Amazon S3 costs:

  • $0.013 per GB per month
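To put the two quoted prices side by side, here is a rough back-of-the-envelope sketch. The 1 TB dataset size is a hypothetical example, and the calculation assumes storage is the only cost; it ignores retrieval fees, deal overhead and any replication you might pay for, so treat it as an illustration rather than a quote.

```python
# Prices as quoted in this article; both are snapshots and will drift.
FILECOIN_USD_PER_GB_MONTH = 0.0000002
S3_USD_PER_GB_MONTH = 0.013


def monthly_cost(gigabytes: float, usd_per_gb_month: float) -> float:
    """Storage-only monthly cost in USD for a dataset of the given size."""
    return gigabytes * usd_per_gb_month


dataset_gb = 1_000  # hypothetical dataset of 1 TB (decimal units)

filecoin = monthly_cost(dataset_gb, FILECOIN_USD_PER_GB_MONTH)
s3 = monthly_cost(dataset_gb, S3_USD_PER_GB_MONTH)
ratio = S3_USD_PER_GB_MONTH / FILECOIN_USD_PER_GB_MONTH

print(f"Filecoin: ${filecoin:.4f} per month")
print(f"S3:       ${s3:.2f} per month")
print(f"S3 costs roughly {ratio:,.0f}x more per GB at these prices")
```

At these numbers the gap is about four to five orders of magnitude, which is why the quoted Filecoin price is best read as a marketplace snapshot rather than a stable list price.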

Further reading: Ultimate Guide to Filecoin: Breaking Down Filecoin Whitepaper & Economics (medium.com)

FVM

The Filecoin Virtual Machine (FVM), coming in 2023, could be one of the game-changing capabilities, extending the protocol beyond plain data storage with additional business logic. EVM-compatible smart contracts with unlimited storage might enable use cases like:

  • NFTs with on-chain media
  • Decentralized computation
  • Data DAOs

More on FVM at fvm.filecoin.io, which describes the FVM as a WASM-based polyglot execution environment for IPLD data.

1.2 Crust

Crust Network is essentially an IPFS incentive layer protocol and a Substrate-based blockchain (Polkadot ecosystem). Similarly to Filecoin, Crust:

  • Settles the data storage market between IPFS node providers and data providers
  • Offers personal, "Google Drive"-like Web3.0 storage, encrypted for private or public use

Its key consensus concepts, backed by 2000+ nodes, consist of 3 layers:

  • MPOW (Meaningful Proof of Work): a low-trust/zero-trust storage proof layer based on a TEE (Trusted Execution Environment) that serves as off-chain messaging to inspect and prove the storage work of miners.
  • GPOS (Guaranteed Proof of Stake): a PoS-derived consensus layer that requires nodes to provide storage proof in order to get a staking quota, which acts as the incentive.
  • DSM: the Decentralized Storage Market, where buyers and sellers agree on pricing conditions.

All supported by a set of tooling for:

  • Auto-deploying a DApp/website through a GitHub Action
  • Decentralized pinning through a GitHub Action, a Node.js package or the Crust CLI

As a winner of a Polkadot parachain auction, Crust aims to leverage its interoperability with other parachains like Astar or Moonbeam and become the home of data storage for smart contracts built in the Polkadot or Kusama ecosystem.


2. Sia

Sia is a contract-based layer 1 decentralized cloud storage (similar to Filecoin) using blockchain technology, with an economy powered by the Siacoin (SC) token. Each file is split into 30 pieces distributed across multiple nodes, any 10 of which are enough to reconstruct the file if some nodes are unavailable (Reed-Solomon erasure coding). A key aspect to remember is its focus on privacy.

  • Max individual file size = 300GB
  • All files are private by default, encrypted with the Threefish algorithm for high performance and security
  • Pricing moves dynamically. Right now the monthly storage price is around $1/TB, the upload price $0.50/TB and the download price $2/TB
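The 10-of-30 split described above trades extra raw storage for resilience. The sketch below does not implement Reed-Solomon itself; it only works out the arithmetic of any k-of-n scheme. The independence of host failures and the 80% uptime figure are assumptions made for illustration, not measured Sia values.

```python
from math import comb


def expansion_factor(k: int, n: int) -> float:
    """Raw storage consumed per byte of user data in a k-of-n scheme."""
    return n / k


def availability(k: int, n: int, p_online: float) -> float:
    """Probability that at least k of n pieces are reachable.

    Assumption: each piece sits on a different host and hosts fail
    independently, each being online with probability p_online.
    """
    return sum(
        comb(n, i) * p_online**i * (1 - p_online) ** (n - i)
        for i in range(k, n + 1)
    )


K, N = 10, 30  # the Sia parameters quoted above

print(expansion_factor(K, N))    # 3.0, i.e. 3 bytes stored per byte uploaded
print(N - K)                     # 20 pieces can be lost before data is at risk
print(availability(K, N, 0.80))  # hypothetical 80% uptime per host
```

Even with fairly unreliable hosts the file stays recoverable with very high probability, because losing the file requires more than 20 of the 30 hosts to be offline at once.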

Sia average pricing in $$$

The combination of privacy and pricing makes Sia suitable for storing large long-term backups more cheaply than on centralized clouds like Google Drive. The unlimited inflation of Siacoin also suggests that its price won't skyrocket, which keeps storage cheap.
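For a feel of what a long-term backup might cost at the prices quoted above, here is a small estimate. The backup size, duration and number of restores are hypothetical, and since Sia pricing moves dynamically the constants are only a snapshot.

```python
# Approximate Sia prices quoted in this article (USD, per TB).
STORAGE_USD_PER_TB_MONTH = 1.00
UPLOAD_USD_PER_TB = 0.50
DOWNLOAD_USD_PER_TB = 2.00


def sia_backup_cost(size_tb: float, months: int, restores: int = 1) -> float:
    """Estimated cost of one upload, `months` of storage and full restores."""
    storage = size_tb * STORAGE_USD_PER_TB_MONTH * months
    upload = size_tb * UPLOAD_USD_PER_TB
    download = size_tb * DOWNLOAD_USD_PER_TB * restores
    return storage + upload + download


# Hypothetical scenario: a 10 TB backup kept for a year and restored once.
print(sia_backup_cost(size_tb=10, months=12))  # 120 + 5 + 20 = 145.0
```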

2.1 Skynet

I did prepare a few sentences about Skynet, a layer on top of Sia, but at the time of writing Skynet Labs announced it is shutting down due to a lack of funding…so I will leave it at that.

Announcement: "Skynet Labs is shutting down. Skynet remains online." (blog.sia.tech)

Although the decentralized storage market does not see the same competitiveness and excitement as NFTs or DeFi, it takes huge effort, creativity and discipline to offer something special enough to attract users and developers away from centralized solutions. This is what happens in startups: visions and personalities can clash, mistakes can be made, and sometimes it is all just too much.


3. Storj

Since its creation in 2014, Storj has become quite a powerful layer 1 contract-based storage solution with a developed network and interesting features, including:

Data

Files are divided into 80 or more bundles and distributed across multiple nodes around the world (14k+ in total). Only 29 bundles are needed to reconstruct a file, which leaves enough buffer in case of node failures (again using Reed-Solomon coding) and gives high availability of over 99.9%.

All with rich FREE and PRO pricing models:

  • FREE: 150GB storage and bandwidth monthly
  • PRO: Storage $4/TB, Bandwidth $7/TB, monthly
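The two tiers above translate into a simple estimate. This sketch makes two assumptions that the figures do not spell out: the FREE allowance is read as 150 GB of storage and 150 GB of bandwidth each, and the PRO price is applied to the full usage without deducting any free allowance. The streaming workload is hypothetical.

```python
FREE_STORAGE_GB = 150    # assumption: 150 GB applies to storage...
FREE_BANDWIDTH_GB = 150  # ...and separately to bandwidth
PRO_STORAGE_USD_PER_TB = 4.0
PRO_BANDWIDTH_USD_PER_TB = 7.0


def fits_free_tier(storage_gb: float, bandwidth_gb: float) -> bool:
    """True if the monthly usage stays within the FREE allowance."""
    return storage_gb <= FREE_STORAGE_GB and bandwidth_gb <= FREE_BANDWIDTH_GB


def pro_monthly_cost(storage_tb: float, bandwidth_tb: float) -> float:
    """Monthly PRO bill in USD, with no free allowance deducted (assumption)."""
    return (
        storage_tb * PRO_STORAGE_USD_PER_TB
        + bandwidth_tb * PRO_BANDWIDTH_USD_PER_TB
    )


# Hypothetical streaming service: 2 TB stored, 10 TB served per month.
print(fits_free_tier(storage_gb=2_000, bandwidth_gb=10_000))  # False
print(pro_monthly_cost(storage_tb=2, bandwidth_tb=10))        # 8 + 70 = 78.0
```

For streaming-style workloads the bandwidth term dominates, so the egress estimate matters more than the stored volume.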

With Storj, users could consider use cases like audio/video streaming services, calculate costs with a dedicated team and explore a continuously improving technology.


4. Arweave

Arweave's approach is somewhat different. Its consensus, called PoA (Proof of Access), aims for all files to be stored permanently. It motivates nodes to randomly split the block history among themselves; the algorithm identifies which data is replicated less and gives a larger reward to miners covering rare files.

On one hand, miners' capacity can be used very efficiently and no file is forgotten or lost by mistake. On the other hand, there is a concern about its content moderation rule: some users could struggle with the idea of paying a high price to store data on a censored protocol, where files identified as abusive can be discarded from the archive by democratic voting.

  • Users pay a single fee to store data permanently. The price is calculated dynamically and is right now about $2/GB
  • The network comprises over 1000 blockchain (SPoRA) nodes with a transaction throughput of over 5000 TPS.
  • Data on Arweave is always immutable
  • ArDrive serves as an encrypted file storage and organization platform
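A one-time fee is hard to compare with monthly billing, so a break-even calculation helps. The sketch below uses the roughly $2/GB Arweave fee quoted above and, as the recurring comparison, the Amazon S3 price of $0.013 per GB per month quoted earlier in this article. It assumes both prices stay constant, which they will not, so the result is only an order-of-magnitude guide.

```python
ARWEAVE_ONE_TIME_USD_PER_GB = 2.0   # quoted above, calculated dynamically
RECURRING_USD_PER_GB_MONTH = 0.013  # Amazon S3 figure quoted in this article


def break_even_months(one_time_usd: float, monthly_usd: float) -> float:
    """Months after which recurring payments exceed the one-time fee."""
    return one_time_usd / monthly_usd


months = break_even_months(
    ARWEAVE_ONE_TIME_USD_PER_GB, RECURRING_USD_PER_GB_MONTH
)
print(f"Break-even after about {months:.0f} months ({months / 12:.1f} years)")
```

Under these assumptions permanent storage only pays off for data that needs to live for well over a decade, which matches the archival use case.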

Although direct interaction with Arweave might be challenging, there is a solution making life easier called Bundlr…

Bundlr

Bundlr increases the number of transactions conducted on Arweave by 4,000% without sacrificing security or usability and is around 3,000x faster at uploading data.

Bundlr is actually a Proof-of-Stake network that sits on top of Arweave and currently accounts for over 90% of data uploaded to Arweave. It is a multichain solution compatible with leading blockchains including Ethereum, Solana, Avalanche, Polygon, and many more.

Validators are chosen randomly every day to be in charge of making sure transactions pass to Arweave. In addition, Bundlr provides:

  • Caching layer
  • Infinite scalability
  • Guaranteed instant transaction finality
  • Free data uploads under 100 KB

The bundle spec, designed by Bundlr, is open-sourced and is currently implemented in JavaScript and Go.


Other references:

  • CESS, Cumulus Encrypted Storage System (cess.cloud)
  • The Decentralized Storage War: Filecoin vs. Arweave | CoinMarketCap (coinmarketcap.com)

Summary

Sharing computing resources will play a huge role in the future, although its impact does not match the blockchain use cases many would like to see, like ending world hunger, fixing economic flaws or achieving political fairness. Its benefits are straightforward, and anyone involved in IT will appreciate cost optimization for data and throughput, which at some scale have become unaffordable for smaller businesses and dreamers.

In addition, cheap data storage paired with high-performance, scalable blockchains could unlock new ways of interaction I can't even think of now.

If I summarize what I saw, there are several base protocols for storing data in a decentralized manner:

  • IPFS, which is not exactly a blockchain but has grown a rich ecosystem of third-party projects that use a blockchain economy to bring nice, cheap data services to everyone.
  • Storj, a very well developed contract-based cloud storage, continuously improving, with the possibility to sync with AWS S3
  • Sia, on paper a secure protocol for storing large backups privately and more cheaply than on Google, although with a questionable reputation and a doubtful future due to the lack of ecosystem incentives.
  • Arweave, truly permanent storage useful for archiving, boosted by the L2 Bundlr. However, with its built-in content moderation mechanism, one could be afraid of potential censorship if the content is controversial for any reason.

To be frank with anyone who labels a project an AWS/GCP killer: all the protocols mentioned above still do not contain even 1% of the features and gadgets GCP or AWS offer right now. I am rather pessimistic about the idea that blockchain alternatives could eat a significant slice of this market pie worth hundreds of billions.

If the market proves that blockchain works for data storage and cloud computing, with both providers and consumers profiting, at some point the world's largest companies, with decades of experience in IT, would probably notice, adapt, buy teams and gain some advantage. But that's not all bad: it would mean this path was meaningful and will generate business and benefits for many in the coming years or decades.