Note that the headline is from Langfuse, not ClickHouse. Reading the announcement from ClickHouse[0], the headline is "ClickHouse welcomes Langfuse: The future of open-source LLM observability". I think the Langfuse team is suggesting that they will be continuing to do the same work within ClickHouse, not that the entire ClickHouse organization has a goal of building the best LLM engineering platform.
Your notes aren't very good. They're not a time series database company, they're a columnar database company. But yeah, the LLM bit is weird; database companies _always_ feel like charlatans when it comes to LLMs.
ClickHouse effectively has a number of personas. Time series is one of them, and ClickHouse has steadily absorbed market share from pure play time series databases over the last few years. Other personas include real-time observability backend (the single biggest use case in my experience) as well as real-time data lake engine. Time series support, column storage, and real-time response are key underlying capabilities. It's quite versatile and fun to use.
Disclosure: I run Altinity, a vendor in this space.
"Berkshire Hathaway Inc. is an American multinational conglomerate holding company" is a weird thing for a textile manufacturer to call itself. Almost like...businesses expand and evolve?
(they've never been a time series database company either lol)
Could you elaborate? Because that sentence made my brow wrinkle with confusion. I have thought to myself before that all business data problems eventually become time series problems. I'd like to understand your point of view on how LLMs fit into that.
Time series just means that the order of features matters. Feature 1 occurs before feature 2.
E.g., fitting a model to house prices, you don't care if feature 1 is square meters and feature 2 is time on market, or vice versa, but in a time series, your model changes if you reverse the order of features.
With text, the meaning of word 2 is dependent on the meaning of word 1. With stock prices, you expect the price at time 2 to be dependent on time 1.
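The distinction is easy to see with a toy sketch (my own illustration, not from the thread): a bag-of-words count plays the role of the order-free "house prices" features, while bigrams stand in for order-aware ones.

```python
from collections import Counter

a = "man bites dog".split()
b = "dog bites man".split()

# Order-free view: identical word counts either way,
# so an order-blind model cannot tell the sentences apart.
assert Counter(a) == Counter(b)

# Sequential view: adjacent-pair (bigram) features distinguish them.
def bigrams(words):
    return list(zip(words, words[1:]))

assert bigrams(a) != bigrams(b)
print(bigrams(a))  # [('man', 'bites'), ('bites', 'dog')]
```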
Text can be modeled as a time series.
A language model tells you the next character/token/word depending on the previous input.
Language models are time series.
It’s not an audacious claim.
Any student of NLP should have encountered a paper modeling text as a time series before writing their thesis. How could you not?
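To make the autoregressive view concrete, here is a toy bigram predictor (a hypothetical corpus and a caricature of a real LLM, but it has the same conditional structure: next token given previous input).

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

# Count next-token frequencies conditioned on the previous token,
# exactly like estimating a one-step transition in a time series.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(token):
    # Most frequent continuation of `token` in the corpus.
    return following[token].most_common(1)[0][0]

print(predict_next("the"))  # → "cat" (seen twice, vs "mat" once)
```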
It is possible for LLMs to learn Benford's law implicitly. So they will be better than a null predictor on time series data, because time series data is also Benford-distributed [4].
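The Benford check is easy to reproduce on a sequence known to follow the law (powers of 2 here; the expected probability of leading digit d is log10(1 + 1/d)):

```python
import math
from collections import Counter

# Leading digits of powers of 2 famously follow Benford's law.
values = [2 ** n for n in range(1, 1001)]
counts = Counter(int(str(v)[0]) for v in values)

for d in range(1, 10):
    observed = counts[d] / len(values)
    expected = math.log10(1 + 1 / d)   # Benford probability for digit d
    print(d, round(observed, 3), round(expected, 3))
```

Digit 1 comes out near 0.301, digit 9 near 0.046, matching the Benford curve.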
Interesting headline for a _checks notes_ time series database company.