Based on the feedback we got for our initial alpha release, we worked on improving Skizze and moving the project forward.
To recap, Skizze is a sketch data store for counting and sketching problems, built on probabilistic data structures.
My longtime hacking buddy Neil Patel, who is also the Technical Lead and Architect of Xamarin Insights, blogged about the latest release and provided some background on why Skizze exists and how to get started.
This second alpha focuses mainly on improving the development and operations experience. It is still an early alpha, so don’t expect much, but hopefully it is now easier to run and experiment with.
In this post I wanted to highlight a couple of features and ideas we have for Skizze:
Domains
As Neil mentioned in his post, we have a clear definition of Domains now:
One of the first pieces of feedback we got with the initial release was: "My data needs all four Sketches and I'm sending the values four times to the server!". Which is a fair point, in most cases the stream of values you have, you almost always want all four kinds of questions answered. Domains help solve this problem.
When you create a Domain, behind the scenes, Skizze creates all four sketches of the same name. From then on, as you add values to that domain, they automatically get multiplexed into each Sketch that belongs to it.
Here is a small mockup of what the flow looks like.
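To make the multiplexing idea concrete, here is a rough Python sketch of a domain that fans each value out to four structures. It is an illustration only, not Skizze's actual code or API: the `Domain` class, its method names, and the four sketch kinds (cardinality, membership, frequency, rankings) are assumptions, and exact structures (`set`, `Counter`) stand in for the probabilistic ones.

```python
from collections import Counter


class Domain:
    """Hypothetical stand-in for a domain: one name, four sketches, one add call."""

    def __init__(self, name):
        self.name = name
        # Exact structures replace the probabilistic sketches in this toy
        # version, so it shows the fan-out only, not the space savings.
        self.sketches = {
            "cardinality": set(),    # how many distinct values?
            "membership": set(),     # have we seen this value?
            "frequency": Counter(),  # how often did this value occur?
            "rankings": Counter(),   # which values occur most often?
        }

    def add(self, *values):
        # The client sends each value once; the domain multiplexes it
        # into every sketch that belongs to it.
        for value in values:
            self.sketches["cardinality"].add(value)
            self.sketches["membership"].add(value)
            self.sketches["frequency"][value] += 1
            self.sketches["rankings"][value] += 1

    def cardinality(self):
        return len(self.sketches["cardinality"])

    def contains(self, value):
        return value in self.sketches["membership"]

    def frequency(self, value):
        return self.sketches["frequency"][value]

    def top(self, k):
        return self.sketches["rankings"].most_common(k)


clicks = Domain("clicks")
clicks.add("alice", "bob", "alice", "carol", "alice")
```

After the single `add` call, all four kinds of questions can be asked of the same domain, which is the point of the feature: the values travel to the server once instead of four times.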
Persistence
One of the main new features is initial working persistence, implemented as an AOF (Append Only File) inspired by Redis:
In basic terms, append-only log files keep a record of data changes that occur by writing each change to the end of the file. In doing this, anyone could recover the entire dataset by replaying the append-only log from the beginning to the end -
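The replay idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Skizze or Redis lay out their files: the `AppendOnlyStore` class, the JSON-lines entry format, and the `"add"` operation name are all made up for the example.

```python
import json
import os
import tempfile


class AppendOnlyStore:
    """Hypothetical counter store that persists every change to an append-only log."""

    def __init__(self, path):
        self.path = path
        self.counts = {}
        self._replay()  # rebuild in-memory state from the existing log, if any
        self._log = open(path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as log:
            for line in log:
                self._apply(json.loads(line))

    def _apply(self, entry):
        # Assumed entry format: {"op": "add", "value": ...}
        if entry["op"] == "add":
            value = entry["value"]
            self.counts[value] = self.counts.get(value, 0) + 1

    def add(self, value):
        entry = {"op": "add", "value": value}
        # Write the change to the end of the file before applying it in memory.
        self._log.write(json.dumps(entry) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        self._apply(entry)

    def close(self):
        self._log.close()


def demo():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "demo.aof")
        store = AppendOnlyStore(path)
        for value in ["a", "b", "a"]:
            store.add(value)
        store.close()

        # A fresh instance recovers the dataset by replaying the log from the start.
        recovered = AppendOnlyStore(path)
        state = dict(recovered.counts)
        recovered.close()
        return state


recovered_counts = demo()
```

Calling `fsync` on every write is the most conservative choice and also the slowest; a real store would likely batch or schedule syncs, which this sketch leaves out.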