Condensation vs. file systems


Among the most prevalent data systems in use today are file systems, such as ext4, btrfs, NTFS, FAT32, or HFS+. While Condensation can be used for the same purpose (data storage), it differs in some key aspects from typical file systems.

Immutable vs. mutable

File systems are mutable data systems, i.e., directories, files, or parts thereof can be modified in place.

Condensation is an immutable storage system. Once a piece of data is written, it can never be modified again. Any change results in a new piece of data, while existing data remains untouched.

Mutable data is simple and straightforward to use. If some piece of information changes, the corresponding file is modified by replacing the old content with the new content. Modifying several files as a single atomic operation (transaction), however, is substantially more difficult and error-prone. Replicating those changes atomically, or synchronizing them across several devices, is more difficult still.
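On a mutable file system, the usual way to make even a single-file update atomic is to write the new content to a temporary file and rename it over the old one. The following is a minimal sketch of that technique, assuming a POSIX-like system; the function name atomic_write is illustrative and not part of any file system or Condensation API.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace the content of `path` so readers see either the old or the new bytes."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory, so the rename
    # does not have to cross a file system boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # the rename is atomic on POSIX
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This covers one file only. Updating two or more files together would need an additional mechanism (a journal, a lock, or a directory swap), which is where the difficulty mentioned above comes from.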

With immutable data, small modifications are a bit more difficult, but transactions, replication, and synchronization become substantially easier. All operations are inherently atomic, and data synchronization or replication is very simple. In addition, the actual pieces of data (objects) can easily be spread across multiple disks.
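To illustrate why immutability makes replication simple, here is a toy in-memory store. It assumes that objects are addressed by the SHA-256 hash of their content, which this section does not state; the names ObjectStore and replicate are hypothetical and do not reflect the actual Condensation API.

```python
import hashlib


class ObjectStore:
    """Toy immutable store: each object is addressed by the hash of its bytes."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Storing the same bytes twice is a no-op; nothing is ever overwritten.
        self._objects.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def keys(self) -> set[str]:
        return set(self._objects)


def replicate(source: ObjectStore, destination: ObjectStore) -> int:
    """Copy every object the destination lacks; return how many were copied."""
    missing = source.keys() - destination.keys()
    for key in missing:
        destination.put(source.get(key))
    return len(missing)


store = ObjectStore()
v1 = store.put(b"hello")
v2 = store.put(b"hello, world")  # a "change" yields a new object; v1 stays intact

backup = ObjectStore()
copied = replicate(store, backup)
```

Because an object never changes after it is written, replication reduces to copying the objects that are missing on the other side. There are no conflicting versions of the same object to reconcile.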

Data structures

File systems are based on folders and files. Each file holds a byte sequence, serialized by the application.

At the lowest level, Condensation stores object trees. Each object holds a byte sequence, serialized by the application.

Most applications, however, use a higher-level data structure, such as documents, on top of Condensation.
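The object tree mentioned above can be sketched as follows. The layout used here (a count, then the child references, then the payload) and the use of SHA-256 hashes as references are assumptions made for illustration; TreeObject and object_hash are hypothetical names, not the actual Condensation object format or API.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class TreeObject:
    children: tuple[str, ...]  # references (hex hashes) to child objects
    data: bytes                # byte sequence serialized by the application

    def serialize(self) -> bytes:
        # Assumed layout: 4-byte child count, child hashes, then the payload.
        header = len(self.children).to_bytes(4, "big")
        refs = b"".join(bytes.fromhex(h) for h in self.children)
        return header + refs + self.data

    def object_hash(self) -> str:
        return hashlib.sha256(self.serialize()).hexdigest()


leaf = TreeObject((), b"leaf")
root = TreeObject((leaf.object_hash(),), b"root")

# Changing the leaf means creating a new leaf, and therefore a new root
# that points to it. The old leaf and the old root remain valid.
new_leaf = TreeObject((), b"leaf v2")
new_root = TreeObject((new_leaf.object_hash(),), root.data)
```

A higher-level structure such as a document would be built on top of trees like this one, with the application deciding what goes into each object's payload.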

Access rights management

File systems keep access rights as access control lists or owner/group/world bitmasks. Files of different users may be stored next to each other in the same folder. To share a file or folder, access is granted to multiple users.

Condensation uses encryption and message-passing instead. Each user keeps and encrypts their own data, so other users cannot read it. To share data, users send small messages to each other, containing a link to the data (object reference) and the corresponding encryption key.
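The sharing model described above can be sketched as follows. Everything here is illustrative: ShareMessage, store_encrypted, and open_shared are hypothetical names, the store is a plain dictionary, and toy_cipher is a deliberately simple stand-in that must not be used as real encryption.

```python
import hashlib
import secrets
from dataclasses import dataclass


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHAKE-256 keystream. Illustrative stand-in, NOT secure encryption."""
    keystream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, keystream))


@dataclass(frozen=True)
class ShareMessage:
    object_hash: str  # link to the encrypted object (object reference)
    key: bytes        # key needed to decrypt that object


def store_encrypted(store: dict, plaintext: bytes) -> ShareMessage:
    """Encrypt with a fresh random key, store the ciphertext, return what to share."""
    key = secrets.token_bytes(32)
    ciphertext = toy_cipher(key, plaintext)
    object_hash = hashlib.sha256(ciphertext).hexdigest()
    store[object_hash] = ciphertext
    return ShareMessage(object_hash, key)


def open_shared(store: dict, message: ShareMessage) -> bytes:
    """Fetch the referenced object and decrypt it with the key from the message."""
    return toy_cipher(message.key, store[message.object_hash])


alice_store = {}  # hypothetical store, mapping object hash to ciphertext
message = store_encrypted(alice_store, b"quarterly report")
received = open_shared(alice_store, message)
```

Anyone who can read the store sees only ciphertext. Access is granted by handing over the small message, not by changing permissions on the stored data. In a real system the message itself would also have to be protected in transit, which this sketch leaves out.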

Storage and messaging

While file systems are purely intended for storage, Condensation offers both storage and messaging.

Summary

| | Condensation | File system (typical) |
|---|---|---|
| Server-client protocol | get object, put object, list, add, remove | open, read, write, rename, mkdir, fcntl, ... |
| Backend | disk, partition, file system, database, network storage, cloud service | physical or virtual partition, network storage |
| Mutability | immutable | mutable |
| Main data structure | object tree | folders and files |
| Distributed | yes, by design | no |
| Storage | yes | yes |
| Messaging | yes | no |
| Transactions | inherent, as box updates are atomic | atomic mv (POSIX) or special transaction API, requires careful design |
| Replication (one-way) | inherent, efficient | good tools exist; limited efficiency |
| Synchronization (two-way) | fairly easy | difficult |
| Versioning | inherent, lightweight | difficult in general; some file systems offer snapshots |