LBRY: A Decentralized Digital Content Marketplace

Alex Grintsvayg, Jeremy Kauffman, Kay Kurokawa
{alex,jeremy,kay}@lbry.io

Note

Please excuse the unfinished state of this paper. It is being actively worked on. The content here is made available early because it contains useful information for developers.

For more technical information about LBRY, visit lbry.tech.

The LBRY Blockchain

The LBRY blockchain serves three key purposes: it’s an index of the content available on the network, it’s a payment system for purchased content, and it provides decentralized publisher identity. LBRY uses a fork of the Bitcoin (https://bitcoin.org/bitcoin.pdf) blockchain, with several key modifications.

The claimtrie. Bitcoin handles payments, but isn’t suitable for storing information about content or identities. To do that, LBRY adds a second, parallel Merkle tree for storing metadata. A claim is a single metadata entry. The claimtrie is the set of these claims. LBRY also adds opcodes for manipulating this information (more on claims later). The root hash of the claimtrie is stored in the block header. This allows SPV clients to validate claim resolutions without downloading the full blockchain, which is a key requirement for the application layer.

Faster blocks. The target block time was lowered from 10 to 2.5 minutes to facilitate faster transaction confirmation.

Continuous difficulty adjustment. The proof-of-work target is adjusted every block. This protects against sudden changes in hashrate. Bitcoin did not need this because mining increased slowly at the beginning and the high current hashrate is not affected by relatively small hashrate movements.

Hash algorithm. The mining hash algorithm was chosen to delay the development of a GPU miner and give early adopters a chance to mine without specialized hardware. The algorithm is a combination of SHA256, SHA512, and RIPEMD160.

Address format. The address version byte is set to 0x55 for standard (pay-to-public-key-hash) addresses and 0x7a for multisig (pay-to-script-hash) addresses. P2PKH addresses start with the letter b, and P2SH addresses start with r.
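
As a rough illustration of how the version byte determines the leading character of an address, the sketch below applies Base58Check encoding to a 20-byte hash. It assumes LBRY keeps Bitcoin's Base58Check scheme unchanged apart from the version byte; the function name base58check and the all-zero placeholder hash are illustrative and not taken from the LBRY codebase.

```python
import hashlib

# Bitcoin's Base58 alphabet (no 0, O, I, or l).
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(version: int, payload: bytes) -> str:
    """Encode version byte + payload with a 4-byte double-SHA256 checksum."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum

    # Convert the byte string to a base-58 number.
    n = int.from_bytes(raw, "big")
    encoded = ""
    while n > 0:
        n, rem = divmod(n, 58)
        encoded = B58_ALPHABET[rem] + encoded

    # Each leading zero byte is represented by a leading "1".
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


placeholder_hash = bytes(20)  # stands in for a real 20-byte public key hash
print(base58check(0x55, placeholder_hash))  # P2PKH-style address
print(base58check(0x7A, placeholder_hash))  # P2SH-style address
```

Because the version byte is the most significant byte of the encoded number, it fixes the first Base58 digit for every possible 20-byte hash.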

The Data Network

LBRY’s core purpose is to provide access to content without middlemen. A robust data network is crucial to a good experience. LBRY aims to achieve distributed availability, searchability, and pseudonymous publisher identity.

DHT

Distributed hash tables have proven to be an effective way to build a decentralized content network. Our DHT implementation follows the Kademlia (https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf) spec fairly closely, with some modifications.

A distributed hash table is a key-value store that is spread over multiple host nodes in a network. Nodes may join or leave the network anytime, with no central coordination necessary. Nodes communicate with each other using a peer-to-peer protocol to advertise what data they have and what they are able to store.

The unit of content in our network is called a blob. A blob is an encrypted chunk of data up to 2MB in size. Each blob is indexed by its blob hash, which is a SHA384 hash of the blob contents. Addressing blobs by their hashes simultaneously protects against naming collisions and ensures that the content you get is what you expect. When a host connects to the DHT, it advertises the blob hash for every blob it wishes to share. Downloading a blob from the network requires querying the DHT for a list of hosts that advertised that blob’s hash (called peers), then requesting the blob from the peers directly.

Multiple blobs may be combined into a stream. A stream may be a book, a movie, a CAD file, etc. All content on the network is shared as streams. Every stream begins with the stream descriptor blob, which contains a JSON list of the hashes and keys of the content blobs. The content blobs hold the actual content of the stream. Every stream ends with an empty content blob, to signify that the stream has finished (this is similar to a null-terminated string, and is necessary to support streaming content).

To review, a piece of content in LBRY is represented by a group of blobs dispersed throughout the network. The blobs are identified by their blob hashes, which are grouped together in the stream descriptor blob (or SD blob). The SD blob is identified by its own blob hash, which is called the SD hash, and is the key to finding and downloading any content. It is stored directly in the blockchain, as part of the content metadata.
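
The relationship between blobs, the SD blob, and the SD hash can be sketched as follows. This is a simplified illustration, not the network's actual format: encryption and per-blob keys are omitted, "2MB" is read as 2 MiB, and the descriptor field names (blobs, blob_num, length, blob_hash) and the function names are hypothetical.

```python
import hashlib
import json

MAX_BLOB_SIZE = 2 * 1024 * 1024  # assumption: "2MB" interpreted as 2 MiB


def blob_hash(blob: bytes) -> str:
    """Return the SHA-384 hex digest that addresses a blob."""
    return hashlib.sha384(blob).hexdigest()


def build_stream(content: bytes, blob_size: int = MAX_BLOB_SIZE):
    """Split content into blobs and build a toy stream descriptor.

    Real blobs are encrypted and the descriptor also carries keys;
    both are left out here. Field names are hypothetical.
    """
    blobs = [content[i:i + blob_size] for i in range(0, len(content), blob_size)]
    entries = [
        {"blob_num": n, "length": len(b), "blob_hash": blob_hash(b)}
        for n, b in enumerate(blobs)
    ]
    # Terminator entry standing in for the empty blob that ends the stream.
    entries.append({"blob_num": len(blobs), "length": 0})

    sd_blob = json.dumps({"blobs": entries}, sort_keys=True).encode()
    sd_hash = blob_hash(sd_blob)  # the value a claim would point to
    return blobs, sd_blob, sd_hash


blobs, sd_blob, sd_hash = build_stream(b"example content" * 1000, blob_size=4096)
print(len(blobs), "content blobs, SD hash:", sd_hash)
```

A downloader would work in the opposite direction: fetch the SD blob by its SD hash, read the list of blob hashes, then fetch each content blob until it reaches the terminator.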

Names and Claims

Most existing public name schemes are first-come, first-served. This leads to several bad outcomes. When the system is young, users are incentivized to register common names even if they don't intend to use them, in hopes of selling them to the proper owner in the future for an exorbitant price. In a centralized system, the authority may allow for appeals to reassign names based on trademark or other common use reasons. There may also be a process to "verify" that a name belongs to the entity you think it does (e.g. Twitter's verified accounts). Such processes are often arbitrary, change over time, and may still lead to names being used in ways that are contrary to user expectation (e.g. nissan.com is not what you’d expect).

In a decentralized system, such approaches are not possible, so name squatting is especially dangerous (see Namecoin). Instead, LBRY aims to create an efficient allocation of names via a market. Following Coase (https://en.wikipedia.org/wiki/Coase_theorem), we believe that if the rules for name ownership and exchange are clearly defined, transaction costs are low, and there is no information asymmetry, bargaining will lead to efficient outcomes.

LBRY names are assigned via a continuous auction, where bids for control of the name are called claims. When a claim is created, an amount of credits is set aside to back it. When resolving a name, the claim with the most credits wins. Claims can be supported by the original author or by third parties, which enables a group consensus around the ideal claim for a name. Finally, claims and supports can be withdrawn at any time to reclaim the frozen credits, so there’s no risk in supporting claims that are useful.

Claim

A claim is the basic building block of name ownership. It consists of four primary pieces of information:

ID - a unique identifier of this claim
Name - the name being claimed
Amount - how many credits will be set aside to back the claim
Data - information associated with the name (e.g. SD hash, content metadata, etc.)

Here is an example claim:

{
  "claimId": "fa3d002b67c4ff439463fcc0d4c80758e38a0aed",
  "name": "lbry",
  "amount": 100000000,
  "value": "{\"ver\": \"0.0.3\", \"description\": \"What is LBRY? An introduction with Alex Tabarrok\",
            \"license\": \"LBRY inc\", \"title\": \"What is LBRY?\", \"author\": \"Samuel Bryan\",
            \"language\": \"en\", \"sources\": {\"lbry_sd_hash\":
            \"e1e324bce7437540fac6707fa142cca44d76fc4e8e65060139a88ff7cdb218b4540cb9cff8bb3d5e06157ae6b08e5cb5\"},
            \"content_type\": \"video/mp4\", \"nsfw\": false, \"thumbnail\":
            \"https://s3.amazonaws.com/files.lbry.io/logo.png\"}",
  "txid": "53ed05d9dfd728a94bedf952d67783bbe9da5d2ab436a84338bb53f0b85301b5",
  "n": 0,
  "height": 146117
}

The value field contains the claim contents, including the source and the metadata (discussed later). The structure of the metadata is flexible to allow storing different kinds of content. As the protocol grows and matures, standard structures will be defined and used (similar to Bitcoin’s standard and non-standard transactions).

Claimtrie

The current state of the claims in the blockchain is stored in a data structure called the claimtrie. The claimtrie is a Merkle Patricia Trie (https://github.com/ethereum/wiki/wiki/Patricia-Tree) where the keys are the claimed names and the values are the claims for that name (sorted in decreasing order by total amount). The root hash of the claimtrie is stored in the block header of each LBRY block, enabling nodes in the LBRY network to efficiently and securely validate the state of the claimtrie without downloading the whole blockchain.

Operations and Opcodes

There are four claim operations: create, support, update, and abandon.

The create operation is used to make a new claim for a name, or to submit a competing claim on an existing name. A support is a claim that adds to the credit total of an existing claim. A support does not have its own claim ID or data. Instead, it has the claim ID of the claim to which its amount will be added. An update changes the data or the amount stored in an existing claim or support. Updates do not change the claim ID, so an updated claim retains any supports attached to it. An abandon withdraws a claim or support, freeing the associated credits to be used for other purposes.

To enable the above operations, three new opcodes were added to the blockchain scripting language: OP_CLAIM_NAME, OP_SUPPORT_CLAIM, and OP_UPDATE_CLAIM (in Bitcoin they are respectively OP_NOP6, OP_NOP7, and OP_NOP8). Each opcode will push a zero onto the execution stack, and will trigger the claimtrie to perform the calculations necessary for each bid type. Below are the three supported transaction scripts using these opcodes.

OP_CLAIM_NAME <name> <value> OP_2DROP OP_DROP <pubKey>
OP_UPDATE_CLAIM <name> <claimId> <value> OP_2DROP OP_2DROP <pubKey>
OP_SUPPORT_CLAIM <name> <claimId> OP_2DROP OP_DROP <pubKey>

<pubKey> can be any valid Bitcoin payout script, so a claimtrie script is also a pay-to-pubkey script to a user-controlled address. Note that the zeros pushed onto the stack by the claimtrie opcodes and vectors are all dropped by OP_2DROP and OP_DROP. This means that claimtrie transactions exist as prefixes to Bitcoin payout scripts and can be spent just like standard transactions.

For example, a claim transaction setting the name “Fruit” to “Apple” and using a pay-to-pubkey script will have the following payout script:

OP_CLAIM_NAME Fruit Apple OP_2DROP OP_DROP OP_DUP OP_HASH160 <addressOne>
OP_EQUALVERIFY OP_CHECKSIG

Like any standard Bitcoin transaction output script, it will be associated with a transaction hash and output index. The transaction hash and output index are concatenated and hashed to create the claimID for this claim. For the example above, let's say the above transaction hash is 7560111513bea7ec38e2ce58a58c1880726b1515497515fd3f470d827669ed43 and the output index is 1. Then the claimID would be 529357c3422c6046d3fec76be2358004ba22e323.

A support for this bid will have the following payout script:

OP_SUPPORT_CLAIM Fruit 529357c3422c6046d3fec76be2358004ba22e323 OP_2DROP OP_DROP
OP_DUP OP_HASH160 <addressTwo> OP_EQUALVERIFY OP_CHECKSIG

And now let's say we want to update the original claim to change the value to “Banana”. An update transaction has a special requirement that it must spend the existing claim that it wishes to update in its redeem script. Otherwise, it will be considered invalid and will not make it into the claimtrie. Thus it will have the following redeem script:

<signature> <pubKeyForAddressOne>

This is identical to the standard way of redeeming a pay-to-pubkey script in Bitcoin.

The payout script for the update transaction is:

OP_UPDATE_CLAIM Fruit 529357c3422c6046d3fec76be2358004ba22e323 Banana OP_2DROP
OP_2DROP OP_DUP OP_HASH160 <addressThree> OP_EQUALVERIFY OP_CHECKSIG
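
The script templates above are claimtrie prefixes followed by an ordinary payout script, so they can be assembled mechanically. The sketch below builds only the human-readable form used in this section; it does not serialize opcodes to bytes, and the helper names are illustrative rather than part of any LBRY API.

```python
def p2pkh(address_hash: str) -> str:
    """Standard pay-to-pubkey-hash payout script, in textual form."""
    return f"OP_DUP OP_HASH160 {address_hash} OP_EQUALVERIFY OP_CHECKSIG"


def claim_script(name: str, value: str, payout: str) -> str:
    # Two pushed items (name, value) plus the opcode's zero: 2DROP + DROP.
    return f"OP_CLAIM_NAME {name} {value} OP_2DROP OP_DROP {payout}"


def support_script(name: str, claim_id: str, payout: str) -> str:
    # Two pushed items (name, claim ID) plus the opcode's zero: 2DROP + DROP.
    return f"OP_SUPPORT_CLAIM {name} {claim_id} OP_2DROP OP_DROP {payout}"


def update_script(name: str, claim_id: str, value: str, payout: str) -> str:
    # Three pushed items plus the opcode's zero: 2DROP + 2DROP.
    return f"OP_UPDATE_CLAIM {name} {claim_id} {value} OP_2DROP OP_2DROP {payout}"


claim_id = "529357c3422c6046d3fec76be2358004ba22e323"
print(claim_script("Fruit", "Apple", p2pkh("addressOne")))
print(support_script("Fruit", claim_id, p2pkh("addressTwo")))
print(update_script("Fruit", claim_id, "Banana", p2pkh("addressThree")))
```

In every template the drop opcodes remove exactly the items the prefix added, which is why the remainder behaves like a standard payout script.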

Claim States

A claim can have the following states at a given block:

Accepted

An accepted claim or support is simply one that has been entered into the blockchain. This happens when the transaction containing the claim is included in a block.

Abandoned

An abandoned claim or support is one that was withdrawn by its creator. It is no longer in contention to control a name. Spending the transaction that contains the claim will also cause the claim to become abandoned.

Active

A claim is active when it is in contention for controlling a name (or a support for such a claim). An active claim must be accepted and not abandoned. The time it takes an accepted claim to become active is called the activation delay, and it depends on the claim type, the height of the current block, and the height at which the last takeover occurred for the claim’s name.

If the claim is an update or support to the current controlling claim, or if it is the first claim for a name (T = 0), the claim becomes active as soon as it is accepted. Otherwise it becomes active at height A, where $A = C + D$, $D = \min(4032, \lfloor (H - T) / 32 \rfloor)$, and

A = activation height
D = activation delay
C = claim height (height when the claim was accepted)
H = current height
T = takeover height (the most recent height at which the controlling claim for the name changed)

In plain English, the delay before a claim becomes active is equal to the claim’s height minus the height of the last takeover, divided by 32 and rounded down. The delay is capped at 4032 blocks, which is 7 days of blocks at 2.5 minutes per block (our target block time). The max delay is reached 224 (7x32) days after the last takeover. The goal of this delay function is to give long-standing claimants time to respond to takeover attempts, while still keeping takeover times reasonable and allowing recent or contentious claims to be taken over quickly.
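
The delay rule described above can be written as a pair of small functions. This is a minimal sketch that covers only the delayed case (not the immediate activation of first claims or of updates and supports to the controlling claim); the function names are illustrative.

```python
MAX_DELAY = 4032  # cap on the activation delay, in blocks


def activation_delay(height: int, takeover_height: int) -> int:
    """Blocks to wait: (height - takeover height) / 32, rounded down, capped."""
    return min(MAX_DELAY, (height - takeover_height) // 32)


def activation_height(claim_height: int, takeover_height: int) -> int:
    """Height at which a claim accepted at claim_height becomes active."""
    return claim_height + activation_delay(claim_height, takeover_height)


def blocks_to_days(blocks: int, minutes_per_block: float = 2.5) -> float:
    """Convert a block count to days at the target block time."""
    return blocks * minutes_per_block / (60 * 24)


# A claim accepted at height 1001 when the last takeover was at height 13.
print(activation_height(1001, 13))
# The cap expressed in days, and the time needed to reach it.
print(blocks_to_days(MAX_DELAY), blocks_to_days(MAX_DELAY * 32))
```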

Controlling

The controlling claim is the claim that is returned when a name is resolved. The controlling claim must be active and cannot be a support. Only one claim can be controlling for a given name at a given block. To determine which claim is controlling for a given name in a given block, the following algorithm is used:

  1. For each active claim for the name, add up the amount of the claim and the amount of all the active supports for that claim.

  2. Determine if a takeover is happening.

    1. If the claim with the greatest total is the controlling claim from the previous block, then nothing changes. That claim is still controlling at this block.
    2. Otherwise, a takeover is occurring. Set the takeover height for this name to the current height, recalculate which claims and supports are now active, and then perform step 1 again.

  3. At this point, the claim with the greatest total is the controlling claim at this block.

The purpose of step 2.2 is to handle the case when multiple competing claims are made on the same name in different blocks, and one of those claims becomes active but another still-inactive claim has the greatest amount. Step 2.2 will cause this still-inactive claim to also activate and become the controlling claim.

Here is a step-by-step example to illustrate the different scenarios. All claims are for the same name.

Block 13: Claim A for 10LBC is accepted. It is the first claim, so it immediately becomes active and controlling.
State: A(10) is controlling.

Block 1001: Claim B for 20LBC is accepted. Its activation height is $1001 + \min(4032, \lfloor (1001 - 13) / 32 \rfloor) = 1001 + 30 = 1031$.
State: A(10) is controlling, B(20) is accepted.

Block 1010: Support X for 14LBC for claim A is accepted. Since it is a support for the controlling claim, it activates immediately.
State: A(10+14) is controlling, B(20) is accepted.

Block 1020: Claim C for 50LBC is accepted. The activation height is $1020 + \min(4032, \lfloor (1020 - 13) / 32 \rfloor) = 1020 + 31 = 1051$.
State: A(10+14) is controlling, B(20) is accepted, C(50) is accepted.

Block 1031: Claim B activates. It has 20LBC, while claim A has 24LBC (10 original + 14 from support X). There is no takeover, and claim A remains controlling.
State: A(10+14) is controlling, B(20) is active, C(50) is accepted.

Block 1040: Claim D for 300LBC is accepted. The activation height is $1040 + \min(4032, \lfloor (1040 - 13) / 32 \rfloor) = 1040 + 32 = 1072$.
State: A(10+14) is controlling, B(20) is active, C(50) is accepted, D(300) is accepted.

Block 1051: Claim C activates. It has 50LBC, while claim A has 24LBC, so a takeover is initiated. The takeover height for this name is set to 1051, and therefore the activation delay for all the claims becomes $\min(4032, \lfloor (1051 - 1051) / 32 \rfloor) = 0$. All the claims become active. The totals for each claim are recalculated, and claim D becomes controlling because it has the highest total.
State: A(10+14) is active, B(20) is active, C(50) is active, D(300) is controlling.
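
The walkthrough above can be replayed with a small simulation of the controlling-claim algorithm. This is a sketch under stated assumptions rather than the reference implementation: abandons and updates are not modelled, ties between equal totals are broken arbitrarily, "recalculate which claims and supports are now active" is interpreted as applying a delay of zero to everything already accepted, and the class and method names are invented for illustration.

```python
from dataclasses import dataclass, field

MAX_DELAY = 4032  # cap on the activation delay, in blocks


@dataclass
class Claim:
    amount: int
    activates: int                                # activation height
    supports: list = field(default_factory=list)  # [amount, activation height]


class NameState:
    """Claims competing for a single name."""

    def __init__(self):
        self.claims = {}          # claim label -> Claim
        self.controlling = None   # label of the controlling claim
        self.takeover_height = 0  # T

    def _delay(self, height):
        return min(MAX_DELAY, (height - self.takeover_height) // 32)

    def add_claim(self, label, amount, height):
        # The first claim for a name activates immediately.
        delay = 0 if not self.claims else self._delay(height)
        self.claims[label] = Claim(amount, height + delay)

    def add_support(self, label, amount, height):
        # Supports for the controlling claim activate immediately.
        delay = 0 if label == self.controlling else self._delay(height)
        self.claims[label].supports.append([amount, height + delay])

    def _best(self, height):
        def total(item):
            claim = item[1]
            return claim.amount + sum(a for a, h in claim.supports if h <= height)

        active = [(k, c) for k, c in self.claims.items() if c.activates <= height]
        return max(active, key=total)[0] if active else None

    def process_block(self, height):
        best = self._best(height)                          # step 1
        if best is not None and best != self.controlling:  # step 2.2: takeover
            self.takeover_height = height
            # With T equal to the current height the delay is zero, so
            # every accepted claim and support becomes active now.
            for claim in self.claims.values():
                claim.activates = min(claim.activates, height)
                for support in claim.supports:
                    support[1] = min(support[1], height)
            best = self._best(height)                      # step 1 again
        self.controlling = best                            # step 3
        return self.controlling


def replay():
    """Replay the example blocks and record the controlling claim at each."""
    events = [
        (13, "claim", "A", 10),
        (1001, "claim", "B", 20),
        (1010, "support", "A", 14),
        (1020, "claim", "C", 50),
        (1031, None, None, 0),
        (1040, "claim", "D", 300),
        (1051, None, None, 0),
    ]
    state, history = NameState(), []
    for height, kind, label, amount in events:
        if kind == "claim":
            state.add_claim(label, amount, height)
        elif kind == "support":
            state.add_support(label, amount, height)
        history.append((height, state.process_block(height)))
    return state, history


state, history = replay()
for height, controlling in history:
    print(height, controlling)
```

Only the heights at which something is accepted or activates are processed, since the controlling claim cannot change at any other height in this model.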

Claim Contents

A useful index of LBRY’s content must be succinct yet meaningful. It should be machine-readable, comprehensive, and should ideally point you toward the content you’re looking for. LBRY achieves this by defining a set of common properties for streams. Two key properties are the metadata and the source.

The metadata contains structured information describing the content, such as the title, author, language, and so on. Here’s an example:

"metadata": {
  "author": "",
  "description": "All proceeds go to holly for buying toys, i will post the video with those toys on Xmas day",
  "language": "en",
  "license": "All rights reserved.",
  "licenseUrl": "",
  "nsfw": false,
  "preview": "",
  "thumbnail": "http://www.thetoydiscounter.com/happy.jpg",
  "title": "Holly singing The Happy Working Song",
  "version": "_0_1_0"
}

The source data that goes with the above metadata is below. This source has the type “lbry_sd_hash”, meaning the source is a hash of a stream descriptor blob in the LBRY DHT. We intend to support other source types, such as HTTP URLs, BitTorrent infohashes, or IPFS paths.

"source": {
  "contentType": "video/mp4",
  "source": "92b8aae7a901c56901fd5602c1f1acc0e63fb5492ef2a3cd5b9c631d92cab2e060e2a908baa922c24dee6c5229a98136",
  "sourceType": "lbry_sd_hash",
  "version": "_0_0_1"
}

The structure of the source and the metadata is defined as Protocol Buffers to facilitate interoperability across systems/languages and to make it easy to add properties in a backwards-compatible way. We anticipate this structure will evolve along with the network, so these definitions are subject to change.

URLs

LBRY has URLs that can be resolved to claims. A URL is generally a name with one or more modifiers. A bare name on its own will resolve to the controlling claim at the latest block height. The available modifiers are:

lbry://meet-LBRY
Name: a basic claim for a name

lbry://@lbry
Channel: a claim for a channel

lbry://@lbry/meet-LBRY
Claim in Channel: a claim for this name that has been signed with a key connected to the controlling claim for this channel

lbry://meet-LBRY#7a0aa95c5023c21c098
lbry://meet-LBRY#7a
Claim ID: a claim for this name with this claim ID (does not have to be the controlling claim). Partial prefix matches are allowed.

lbry://meet-LBRY:1
Claim Sequence: the Nth claim for this name, in the order the claims entered the blockchain. N must be a positive number. This can be used to determine which claim came first, rather than which claim has the most support.

lbry://meet-LBRY$2
lbry://meet-LBRY$3
Bid Position: the Nth claim for this name, in order of most support to least support. N must be a positive number. This is useful for resolving non-winning bids in bid order if you, for example, want to list the top three winning claims in a voting contest or want to ignore the activation delay.

lbry://meet-LBRY?arg=value
Query Params: extra parameters (reserved for future use)

In consequence, the symbols @, #, :, $, ?, and / are not allowed in name claims. The full URL schema can be defined as a regex:

(?P<uri>
  ^
  (?P<protocol>lbry\:\/\/)?
  (?P<content_or_channel_name>
    (?P<content_name>[^@#$?/:]+)
    |
    (?P<channel_name>\@[^@#$?/:]+)
  )
  (?P<modifier>
    (?:\#(?P<claim_id>[0-9a-f]{1,40}))
    |
    (?:\$(?P<bid_position>\-?[1-9][0-9]*))
    |
    (?:\:(?P<claim_sequence>\-?[1-9][0-9]*))
  )?
  (?:\/(?P<path>[^@#$?/:]+))?
  $
)
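
The regex above uses Python-style named groups and relies on whitespace being ignored, so one way to try it out is with Python's re module in verbose mode. The sketch below is illustrative only; the parse_url helper is not part of any LBRY library.

```python
import re

# The pattern from this section, compiled in verbose mode so that the
# indentation and line breaks inside it are ignored.
LBRY_URL = re.compile(r"""
(?P<uri>
  ^
  (?P<protocol>lbry\:\/\/)?
  (?P<content_or_channel_name>
    (?P<content_name>[^@#$?/:]+)
    |
    (?P<channel_name>\@[^@#$?/:]+)
  )
  (?P<modifier>
    (?:\#(?P<claim_id>[0-9a-f]{1,40}))
    |
    (?:\$(?P<bid_position>\-?[1-9][0-9]*))
    |
    (?:\:(?P<claim_sequence>\-?[1-9][0-9]*))
  )?
  (?:\/(?P<path>[^@#$?/:]+))?
  $
)
""", re.VERBOSE)


def parse_url(url: str):
    """Return the named groups that matched, or None if the URL is invalid."""
    match = LBRY_URL.match(url)
    if match is None:
        return None
    return {k: v for k, v in match.groupdict().items() if v is not None}


for url in ("lbry://meet-LBRY#7a", "lbry://@lbry/meet-LBRY", "lbry://meet-LBRY$2"):
    print(url, parse_url(url))
```

Note that the pattern as written has no group for query parameters, so a URL such as lbry://meet-LBRY?arg=value does not match it.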

Channels and Identity

Channels are the unit of identity in the LBRY system. A channel is simply a claim whose name starts with @ and whose value contains a public key. Once a channel claim is accepted on the blockchain, content claims that are signed with the channel’s private key will appear in lists under that channel. The purpose of channels is to collect content into a list under a pseudonymous name. There are no usernames in LBRY, but channels fulfill the same function.

Here’s the value of an example channel claim:

"certificate": {
    "keyType": "SECP256k1",
    "publicKey": "3056301006072a8648ce3d020106052b8104000a0342
                  0004180488ffcb3d1825af538b0b952f0eba6933faa6
                  d8229609ac0aeadfdbcf49C59363aa5d77ff2b7ff06c
                  ddc07116b335a4a0849b1b524a4a69d908d69f1bcebb",
    "version": "_0_0_1"
}

When a claim is published into a channel, the claim data is signed and the following is added to the claim:

"publisherSignature": {
    "channelClaimID": "2996b9a087c18456402b57cba6085b2a8fcc136d", 
    "signature": "bf82d53143155bb0cac1fd3d917c000322244b5aD17
                  e7865124db2ed33812ea66c9b0c3f390a65a9E2d452
                  e315e91ae695642847d88e90348ef3c1fa283a36a8", 
    "signatureType": "SECP256k1", 
    "version": "_0_0_1"
}

Conclusion

TODO