index

2021-06-18

2021-06-18


dali as package repository

Could Dali be used as a package repository for Linux?

Package repository provides:

Dali data segments should be signed to make p2p distribution possible.

Versioning is done naturally using temporary indexes.

Large files are refered using hashes in the metadata.

describing enumeration value types

Enumeration is a record. It could consist for example one single byte field. Some or all possible values of that single byte could be then given labels to describe their meaning.

{:dali.id tid.boolean-number-field
 :dali.type 
 :dali.label "number"
 :dali.value-type "ui8"}
 
{:dali.id tid.boolean-record
 :dali.type 
 :dali.label "boolean"
 :dali.record.fields [tid.boolean-number-field]}

{:dali.type 
 :dali.record-type tid.boolean-record
 :dali.label "false"
 :dali.value [0]}

{:dali.type 
 :dali.record-type tid.boolean-record
 :dali.label "true"
 :dali.value [1]}

why plaintext formats are used?

Tooling for visualization and eiditing utf-8 encoded text is ubiquitous. If there was such tooling for dali datastructures, they could be used in place of utf-8 where structured data is needed.

One approach could be a bi directional mapping between and utf-8. i.e. a plain text serialization syntax. The files on disk could be stored as dali structures but editing could be done with a plain text editor.

A specialized spreadsheet like editor would be much prefered for example when dealing with large data structures. One could open the wikipedia as multi terabyte dali datastructure and browse and edit it with the generic dali editor.

versioned binary blobs

Large values are stored in a key value store and addressed by hashes. It should be possible to store versions of the same value as patches. For example wikipedia page revisions would be stored as patches.

When the number of patches to a value exceedes a certain threshold, the value would be stored fully to make reading more efficient.

Git uses delta compression for packed objects:

linux - Can git use patch/diff based storage? - Stack Overflow

Git automatically discovers optimal patch sequences based on the file contents instead of the commit history.

The patch files should have headers that refer to all of the patches that are needed to form the whole file. This would make retrieving all of the patches more efficient over the internet.

Delta compression could be applied also to index data segments.

delta compressed blob storage


delta compressed blob storage
content addressed functions
cross language typing
managing files
temporary index

This site is generated with zetgen