gemini://sunshinegardens.org/~xj9/wiki/off-system/

> Today we announce a massively distributed copy-less file system. A place where all content is available instantly, anonymously and to everyone, without breaking any laws. Today we announce the Owner-Free File System. An island of sanity in your sea of madness.

# the owner-free filesystem. a brightnet.
#
# FIXME represent the tuple size in a more natural and consistent way.

# to store, choose the tuple size `t` (default 3), split the source file `s` into blocks
# `s(i)` of size 56 KiB (pad with random data to fit).
#
#   t ← 3
#
# for each source block `s(i)`, do the following: select `t-1` blocks for use as randomizer
# blocks, or for short, randomizers, from the existing OFF cache, which have not been used
# in previous iterations. if not enough randomizers exist in the cache, generate them using
# a random number generator.
#
#   Xor ←{≠˝𝕩}
#
# calculate `o(i) = s(i) XOR r(1) XOR r(2) XOR ... XOR r(t-1)` and store the resulting
# block `o(i)` in the cache.
#
#   𝕩∾𝕨
#   ┌─
#   ╵ 0 0 0 1 0 0 1 1 0 1 0 0 1 0   # s(i)
#     0 1 0 1 1 0 0 0 1 0 1 1 1 0   # r(1)
#     1 0 0 0 1 1 1 0 1 0 0 0 1 1   # r(t-1)
#                                 ┘
#   ≠˝𝕩∾𝕨
#   ⟨ 1 1 0 0 0 1 0 1 0 1 1 1 1 1 ⟩
#
# update the descriptor list, which contains the information on how to restore each source
# block `s(i)`, with a new entry, which is a set of size `t`: `{o(i), r(1), r(2) ... r(t-1)}`.
#
#   (≠˝𝕩∾𝕨)∾𝕨
#   ┌─
#   ╵ 1 1 0 0 0 1 0 1 0 1 1 1 1 1   # o(i)
#     0 1 0 1 1 0 0 0 1 0 1 1 1 0   # r(1)
#     1 0 0 0 1 1 1 0 1 0 0 0 1 1   # r(t-1)
#                                 ┘
#
# finally, store the descriptor list in its own block (or blocks, if the list is larger than
# 56 KiB), insert these blocks into the block cache, generate an OFF URL for referencing
# the source file, and output it to the user or into the local OFF URL database.
#
#   •Show R(t-1) EncodeBlockDescriptor block
EncodeBlockDescriptor ⇐{(≠˝𝕩∾𝕨)∾𝕨}

# to retrieve, obtain the descriptor block or blocks and for each contained set of size `t`,
# do the following:
#
# obtain the listed blocks `b(1), b(2) ... b(t)`. though they have no identity any more at
# this point, they could be called `o(i), r(1), r(2) ... r(t-1)`.
#
# perform `s(i) = b(1) XOR b(2) XOR ... XOR b(t)` and output the resulting source data block
# `s(i)` to a viewer program or to storage.
#
#   •Show 1 = block ≡ ≠˝R(t-1) BlockDescriptor block
DecodeBlockDescriptor ⇐{≠˝𝕩}

# by comparison, a tunnel-based forwarding method requires that a data block is uploaded and
# downloaded several times before it reaches its destination, typically between 5 and 15
# times. the resulting overhead is given by the formula s × (hi + ho + 1) × 2 - s, where s
# is the source file size, hi the inbound tunnel length, and ho the outbound tunnel length,
# plus one for the hop between outbound endpoint and inbound gateway. this is equivalent to
# an overhead of 900 to 2900%, while the overhead of OFFS without optimizations is about 200%.
#
# Re-use some of the result blocks `o(i)` and randomizer blocks, which reduces the overhead to
# s × ( t − 1 ) × e ÷ 100, where s is the source file size, t the tuple size and e the
# percentage of unrelated blocks used in the randomizing step during the store procedure.
# For example, if e is chosen 75 (and t chosen 3), this leads to an overhead of 150%.
#
# If some of the blocks required to fully retrieve (that is, re-assemble) the source file are
# already present in the block cache from other transfers or stored files, because of the
# multi-use nature of OFFS blocks, the degree of efficiency is increased further.
#
# Use of a targeted store procedure causes the blocks of specific other files to be chosen
# as randomizers for this file with higher probability, and therefore reduces overhead.
# This is especially useful when storing a group of related files.
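the store and retrieve procedures above can be sketched in python (chosen here only because it is easy to run; it is not the notation the listing uses). this is a minimal sketch under assumptions: the names `xor_blocks`, `split_blocks`, `store` and `retrieve` are hypothetical, randomizers are always freshly generated instead of being selected from an existing cache, and the descriptor holds the blocks themselves rather than their names.

```python
import secrets

BLOCK_SIZE = 56 * 1024  # 56 KiB, as in the description above


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length blocks together."""
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    acc = 0
    for block in blocks:
        acc ^= int.from_bytes(block, "big")
    return acc.to_bytes(size, "big")


def split_blocks(source: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Split the source into fixed-size blocks; pad the last one with random data."""
    blocks = [source[i:i + size] for i in range(0, len(source), size)]
    if blocks and len(blocks[-1]) < size:
        blocks[-1] += secrets.token_bytes(size - len(blocks[-1]))
    return blocks


def store(source: bytes, t: int = 3, size: int = BLOCK_SIZE) -> list[tuple[bytes, ...]]:
    """Return a descriptor list with one entry (o(i), r(1), ..., r(t-1)) per source block."""
    descriptor = []
    for s_i in split_blocks(source, size):
        # assumption: no cache to draw from, so every randomizer is generated
        randomizers = [secrets.token_bytes(size) for _ in range(t - 1)]
        o_i = xor_blocks(s_i, *randomizers)
        descriptor.append((o_i, *randomizers))
    return descriptor


def retrieve(descriptor: list[tuple[bytes, ...]], length: int) -> bytes:
    """XOR each entry back into its source block, then drop the random padding."""
    return b"".join(xor_blocks(*entry) for entry in descriptor)[:length]
```

the original length has to be carried alongside the descriptor in this sketch, since the random padding on the last block cannot be told apart from data.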

remember that we break every input file into 56 KiB blocks before applying `(≠˝𝕩∾𝕨)∾𝕨` to each block to build the descriptor list, which is itself split and stored in the same way if it exceeds the block size limit. `𝕨` represents the random blocks that are used to obfuscate the source block. all random blocks and descriptor blocks are stored in the same content-addressed storage. clients select some blocks from the network to use as random blocks, generate some, and choose others from the local cache. in the `•OffSystem` examples this block store is represented as `R𝕩`.
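a content-addressed block store with randomizer selection might look roughly like the following python sketch. everything here is an assumption made for illustration: the `BlockStore` class and its methods are hypothetical, SHA-256 is an arbitrary choice of naming hash, and the store is an in-memory dict with no network selection.

```python
import hashlib
import secrets


class BlockStore:
    """In-memory content-addressed store: a block's name is the hash of its bytes."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def __len__(self) -> int:
        return len(self._blocks)

    def put(self, block: bytes) -> str:
        block_id = hashlib.sha256(block).hexdigest()  # assumption: SHA-256 names
        self._blocks[block_id] = block
        return block_id

    def get(self, block_id: str) -> bytes:
        return self._blocks[block_id]

    def pick_randomizers(self, count: int, size: int,
                         exclude: frozenset = frozenset()) -> list[str]:
        """Prefer cached blocks not in `exclude`; generate new ones if too few exist."""
        rng = secrets.SystemRandom()
        candidates = [block_id for block_id, block in self._blocks.items()
                      if len(block) == size and block_id not in exclude]
        chosen = rng.sample(candidates, min(count, len(candidates)))
        while len(chosen) < count:
            chosen.append(self.put(secrets.token_bytes(size)))
        return chosen
```

because names are derived from content, storing the same block twice is a no-op, which is what lets one block serve as a randomizer for many unrelated files.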

in a brightnet, rather than hide all correlations, we attempt to entangle all insertions with as much of the present state as possible. source blocks can only be retrieved if you know the name of the descriptor block that can unravel the entangled states. the required overhead for obfuscating blocks with random data exceeds any potential efficiency gains from deduplication, but according to the literature only runs around 150% overall when re-using blocks.
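the overhead figures quoted above follow directly from the two formulas. a small python sketch of the arithmetic, with hypothetical function names, and with results expressed as a percentage of the source file size `s` so that `s` cancels out:

```python
def forwarding_overhead_pct(hi: int, ho: int) -> int:
    """Overhead of tunnel forwarding, from s × (hi + ho + 1) × 2 - s, as a percent of s."""
    return ((hi + ho + 1) * 2 - 1) * 100


def off_overhead_pct(t: int = 3, e: int = 100) -> int:
    """Overhead of an OFF store, from s × (t - 1) × e ÷ 100, as a percent of s.

    t is the tuple size, e the percentage of unrelated blocks used as randomizers.
    """
    return (t - 1) * e


print(forwarding_overhead_pct(2, 2))   # 5 transfers  -> 900
print(forwarding_overhead_pct(7, 7))   # 15 transfers -> 2900
print(off_overhead_pct(t=3, e=100))    # no re-use    -> 200
print(off_overhead_pct(t=3, e=75))     # with re-use  -> 150
```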

ultimately this can't defeat copyright, since sharing the descriptor block would legally constitute infringement. still, i think there are useful applications for this design. despite the apparent intent of the original developers, more recent work clarifies this updated perspective:

> The Owner Free File System (OFF System / OFFS) is the world's first "brightnet". It facilitates legal data sharing activity over its network through the use of its ingenious data storage mechanisms. This allows it to perform its operations in the open without divulging the privacy, intent, or security of its network participants. The storage mechanism is unique in that it never stores whole files but instead stores completely random data blocks which contain large randomly generated numbers. These blocks have no discrete mapping to any single file but instead are shared by infinite combinations of data representations that are arbitrarily created by its users. This creates a universal public storage cloud with similar properties as national public radio or public broadcasting. Unlike torrents who’s life span stems from the popularity of individual files, files represented in the off system are constantly extending their availability as new files are being represented. Because no one can own mathematics or numbers neither can users own the blocks of data thus making the file system "owner free". The key to retrieving a file from the network is the URL for the data representation created by its importer. No tangible works, files, or copyrighted materials cross the network boundaries nor are they contained by the local storage maintained by the application nodes.

given a complete block cache and enough storage, it's likely possible to brute-force the listing to discover all descriptor blocks and generate the source files that they represent. this is potentially mitigated by "enough storage", but more robust obfuscation for these entrypoints is certainly warranted depending on the application. for example: restricting access to a random block that is used in all store procedures, using a secret hmac key to locally remap block ids, or encrypting descriptor blocks using a network access key.
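of these mitigations, the hmac remapping is the simplest to illustrate. a minimal python sketch, assuming block ids are hex strings and that the node keeps a local secret key; `remap_block_id` is a hypothetical name and HMAC-SHA-256 an arbitrary choice:

```python
import hashlib
import hmac


def remap_block_id(secret_key: bytes, block_id: str) -> str:
    """Derive the local storage name for a block from its public id and a secret key."""
    return hmac.new(secret_key, block_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

without the key, someone holding a copy of the local cache cannot map a public block id to its stored name, so the entries of a descriptor block cannot be looked up directly. this only protects a cache at rest; it does nothing for blocks requested over the network by their public ids.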

preferably we'd use a "bright" method to make it difficult to single out correlations, but it may prove impossible for this case. maybe it's better not to store descriptors in the block cache in the first place. if non-descriptor blocks are difficult-to-impossible to unravel from the entangled state of the block cache without the descriptor, this may be sufficient. if not, the entire design is broken anyway.
