The Orange Pill Storage System Manifesto

This was originally written as a post on Larry Sanger's Decentralizers Forum, titled The Orange Pill Storage System: A point-by-point and exhaustively detailed answer to "What Decentralization Requires".


This is a detailed response to Larry's article, What Decentralization Requires.

In brief, I agree with most of what Dr. Kumar wrote in his response. This post contains more or less the same points, with more reference to technical details.

Dr. Kumar wrote the following:

By analogy, I came up with the following: that article is proposing a set of traffic laws for flying cars. It's not possible for me to meaningfully agree or disagree with it, because I don't know what constraints flying car technology imposes, because flying car technology doesn't exist yet.

Postulating about high-level rules for what a decentralized internet might look like is not a useful exercise until we know what constraints the low-level technology imposes. That insight cannot come until the technology exists and is widely used.

The basic technology required to create a truly decentralized internet does not currently exist.

Many of my responses to each bullet point include the phrase "application-level issue", which I believe is what Dr. Kumar meant by "traffic laws for flying cars". OPSS is that low-level basic technology that Dr. Kumar wrote about.

"Application-level issues" are issues that may be important but that we can't discuss meaningfully at this point: either we do not have adequate knowledge about the constraints of the low-level technology, or the issue is something that is decided on an application-by-application basis. OPSS is infrastructure; it can't and shouldn't solve application-level issues.

All this being said, I have a lot of domain knowledge about what constraints OPSS is likely to impose, and which constraints are independent of OPSS.

So without further ado,

Bullet points from the article

Here I will address each bullet point laid out in that article, and later circle around to re-articulate the central point.


Circling back

The most important of the principles is data-ownership. Enforcing that forces all of the other design choices.

Here's the data-ownership principle, as far as OPSS is concerned:

If you like your data, you can keep your data. But as soon as you put your data on someone else's computer, it's no longer your data.

Now let us walk through some examples to illustrate how this forces design choices.

Usage case 1: sharing a graduation video

Your kid is graduating from high school. You don't want to make the video public, but you do want to share it with your family.

At the moment, the only way to do that is to put the video on a social media site like YouTube or Facebook and then share the link. That means Facebook or YouTube owns the data.

From the technological standpoint, there is currently no way to share data without surrendering ownership of the data. We can't completely fix that. However, we can certainly improve the situation a lot.

What we can do is give you a way to streamtorrent the video on demand to your friends, with access control, and without ever having to upload it to some central system like YouTube or Facebook. A distributed caching system keeps your home internet connection from getting overloaded.

We can use a pretty straightforward cryptography scheme to implement access control. So even if someone intercepts the stream, if they don't have the correct key, they can't read it.
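To make that concrete, here is a minimal Python sketch of key-based access control on a single chunk of the stream. It is an illustration under stated assumptions, not the OPSS scheme: the function names (`encrypt_chunk`, `decrypt_chunk`) are hypothetical, and the cipher is a toy keystream built from HMAC-SHA256 so that the example runs with only the standard library. A real system would use a vetted authenticated cipher such as AES-GCM or ChaCha20-Poly1305.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16
TAG_LEN = 32


def _subkey(key: bytes, label: bytes) -> bytes:
    """Derive separate encryption and authentication keys from one shared key."""
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (assumption, for illustration only): HMAC-SHA256 in counter mode."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_chunk(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt one chunk, then authenticate it. Returns nonce + ciphertext + tag."""
    nonce = secrets.token_bytes(NONCE_LEN)
    stream = _keystream(_subkey(key, b"enc"), nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def decrypt_chunk(key: bytes, blob: bytes) -> bytes:
    """Check the tag first; only a holder of the right key gets the plaintext."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        raise ValueError("chunk too short")
    nonce = blob[:NONCE_LEN]
    ciphertext = blob[NONCE_LEN:-TAG_LEN]
    tag = blob[-TAG_LEN:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise PermissionError("wrong key or tampered chunk")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


if __name__ == "__main__":
    family_key = secrets.token_bytes(32)  # handed only to the intended audience
    blob = encrypt_chunk(family_key, b"graduation video, chunk 0")
    print(decrypt_chunk(family_key, blob))
```

Anyone can cache or relay `blob`, but without `family_key` the tag check fails and nothing is decrypted. How the key reaches your family in the first place is a separate problem that this sketch does not address.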

We can't stop people on the receiving end from recording the stream, or compel them to delete the stream. Nor should we attempt to, as the only way to do so is to implement decentralized DRM.

There are a ton of issues in making a system like this user-friendly, such as letting your audience know that the video exists. But it's doable.

Usage case 2: Joe Rogan Podcast Stream

Joe Rogan wants to stream his podcast, and he doesn't want to worry about censorship.

In this case, what OPSS can offer is streamtorrenting. Rogan streams from his lair. Viewers get data both from Rogan's lair and from each other.
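Here is a rough Python sketch of the "from Rogan's lair and from each other" idea. Everything in it is an assumption made for illustration (the `Node` class, the `fetch` function, and the choice of SHA-256 content hashes as chunk identifiers); it is not the OPSS wire protocol. The point it shows is that when chunks are named by their hash, a viewer can accept data from any peer without trusting that peer, and only falls back to the origin when no peer has a valid copy.

```python
import hashlib
from typing import Dict, List, Optional, Tuple


def chunk_id(data: bytes) -> str:
    """Name a chunk by the SHA-256 hash of its contents."""
    return hashlib.sha256(data).hexdigest()


class Node:
    """Hypothetical swarm participant: the origin or a viewer with a local cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        cid = chunk_id(data)
        self.store[cid] = data
        return cid

    def get(self, cid: str) -> Optional[bytes]:
        return self.store.get(cid)


def fetch(cid: str, peers: List[Node], origin: Node) -> Tuple[bytes, str]:
    """Try peers first, then the origin. Reject anything whose hash does not match."""
    for node in [*peers, origin]:
        data = node.get(cid)
        if data is not None and chunk_id(data) == cid:
            return data, node.name
    raise LookupError(f"no valid copy of chunk {cid[:8]} found")


if __name__ == "__main__":
    origin = Node("origin")
    manifest = [origin.add(part) for part in (b"intro", b"interview", b"outro")]

    # The first viewer has no peers yet, so every chunk comes from the origin.
    alice = Node("alice")
    for cid in manifest:
        data, source = fetch(cid, [], origin)
        alice.store[cid] = data

    # The second viewer is served by alice, which takes load off the origin.
    for cid in manifest:
        data, source = fetch(cid, [alice], origin)
        print(cid[:8], "served by", source)
```

The hash check is what makes peer-to-peer delivery safe here: a peer that returns altered data is simply skipped.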

Usage case 3: The Decentralizers Forum

This case also covers something like Twitter or Facebook.

These types of situations are more challenging, because it's not as clear who should own what data. There are tradeoffs to be made. One goal of OPSS is to make these tradeoffs plainly visible, and allow an application developer to cleanly choose which side he wants to take.

The tradeoff here is performance versus privacy. Here are the two extreme ends of the tradeoff:

What these all have in common (the problem that OPSS solves)

At the core of all of these systems, there are some very basic problems:

  1. Where do bits live?
  2. How do you get bits from point A to point B?
  3. How do you handle the Slashdot effect (servers melt from going viral)?
  4. How do you implement access control?
  5. How do you make the system user-friendly?
  6. How do you implement social network features like aggregation, content recommendation, and "you might like this person/channel/account"?

OPSS solves issues 1-4. No existing open-source distributed system can make that claim. Issues 5 and 6 need to be solved on an application-by-application basis.

The best OPSS can do on #5 is to make OPSS as invisible to end users as possible, and make itself as easy to use by application developers as possible.

Issues 1-4 share the following properties:

  1. They are difficult to do correctly
  2. Most developers find them boring
  3. Making a mistake is very costly
  4. The requirements are exactly the same in every application
  5. Everyone using the same solution is superadditive: the whole is greater than the sum of its parts.

Property (5) is true of OPSS for the same reason that it's true of BitTorrent: the more people who are torrenting something, the faster it is to torrent it. The more people who are using OPSS, the faster it is. It would be quite the waste for each application to implement its own distributed data store, because it's faster if they're all using the same system.
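A back-of-the-envelope way to see why pooled upload capacity matters is the standard textbook lower bound on file distribution time. The Python sketch below is a simplified model, not a measurement of OPSS or BitTorrent: it assumes every viewer has the same upload and download rate and that bits can be relayed the moment they arrive. The function names and the example numbers are made up for illustration.

```python
def single_origin_time(file_bits: float, n_viewers: int,
                       origin_up: float, viewer_down: float) -> float:
    """Lower bound in seconds when the origin alone uploads to every viewer."""
    return max(n_viewers * file_bits / origin_up, file_bits / viewer_down)


def swarm_time(file_bits: float, n_viewers: int, origin_up: float,
               viewer_up: float, viewer_down: float) -> float:
    """Lower bound in seconds when viewers also upload to each other."""
    total_up = origin_up + n_viewers * viewer_up
    return max(
        file_bits / origin_up,             # the origin must send each bit at least once
        file_bits / viewer_down,           # each viewer must download the whole file
        n_viewers * file_bits / total_up,  # all copies must fit through the pooled upload
    )


if __name__ == "__main__":
    # Assumed numbers: 1 Gbit file, 100 Mbit/s origin, 10 Mbit/s up and 50 Mbit/s down per viewer.
    for n in (10, 100, 1000):
        alone = single_origin_time(1e9, n, 1e8, 5e7)
        pooled = swarm_time(1e9, n, 1e8, 1e7, 5e7)
        print(f"{n:5d} viewers: single origin {alone:8.1f} s, swarm {pooled:6.1f} s")
```

In this model the single-origin time grows in proportion to the audience, which is the Slashdot effect, while the swarm time levels off because every new viewer brings upload capacity along. The same reasoning suggests why one shared swarm across applications can beat many small ones: the pooled upload term is larger.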

There's a lot of ways to skin a cat. We're selling knives.