I've been experimenting with taking this self-description paradigm even further, for a file format I've cooked up for ephemeral data in my search engine.
Since I ended up building a custom library for this, I wanted to solve the portability problem by making the format stupidly simple to reverse engineer. The convention I came up with is that each column (and each supporting column) is its own file, with a file name that describes its format and role.
So a real-world production table looks like this if you ls in the directory (omitting a few columns for brevity):
The design goal is that just based on an ls output, someone who has never seen the code of the library producing the files should be able to trivially write code that reads it.
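To make the idea concrete, here is a rough sketch of what such a reader could look like. The naming scheme in it (`<column>.<role>.<type>.<byteorder>.bin`), the type names, and the helpers `parse_name` and `read_values` are all made up for illustration; they are not the actual convention or API of the library described above.

```python
import struct
from dataclasses import dataclass
from pathlib import Path

# Hypothetical scheme, NOT the real format: <column>.<role>.<type>.<byteorder>.bin
STRUCT_CODES = {"u8": "B", "i32": "i", "i64": "q", "f32": "f", "f64": "d"}
BYTE_ORDERS = {"le": "<", "be": ">"}


@dataclass(frozen=True)
class ColumnFile:
    column: str      # logical column name
    role: str        # e.g. "data", or a supporting role such as "len"
    dtype: str       # key into STRUCT_CODES
    byte_order: str  # key into BYTE_ORDERS


def parse_name(file_name: str) -> ColumnFile:
    """Recover everything needed to decode the file from its name alone."""
    parts = file_name.split(".")
    if len(parts) != 5 or parts[4] != "bin":
        raise ValueError(f"unrecognised file name: {file_name}")
    column, role, dtype, byte_order, _ = parts
    if dtype not in STRUCT_CODES or byte_order not in BYTE_ORDERS:
        raise ValueError(f"unknown type or byte order in: {file_name}")
    return ColumnFile(column, role, dtype, byte_order)


def read_values(path: Path) -> list:
    """Decode a fixed-width column file using only what its name says."""
    spec = parse_name(path.name)
    fmt = BYTE_ORDERS[spec.byte_order] + STRUCT_CODES[spec.dtype]
    # iter_unpack yields one-element tuples; flatten them into a plain list.
    return [value for (value,) in struct.iter_unpack(fmt, path.read_bytes())]
```

Under this made-up scheme, a file called `year.data.i32.le.bin` would be read as little-endian 32-bit integers for the `year` column, with no schema file or library documentation involved.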
Internally the design of Vortex is very similar. The file consists of a whole bunch of "messages" (your files), which then have some metadata attached, and the read logic decides which messages it needs when.
Not yet, but I will compile one at some point. I'm in the middle of moving right now so I don't quite have the time to sit down and finish the write-up...