Note that the headline is from Langfuse, not ClickHouse. Reading the announcement from ClickHouse[0], the headline is "ClickHouse welcomes Langfuse: The future of open-source LLM observability". I think the Langfuse team is suggesting that they will be continuing to do the same work within ClickHouse, not that the entire ClickHouse organization has a goal of building the best LLM engineering platform.
Your notes aren't very good. They're not a time series database company, they're a columnar database company. But yeah, the LLM bit is weird; database companies _always_ feel like charlatans when it comes to LLMs.
ClickHouse effectively has a number of personas. Time series is one of them, and ClickHouse has steadily absorbed market share from pure play time series databases over the last few years. Other personas include real-time observability backend (the single biggest use case in my experience) as well as real-time data lake engine. Time series support, column storage, and real-time response are key underlying capabilities. It's quite versatile and fun to use.
Disclosure: I run Altinity, a vendor in this space.
"Berkshire Hathaway Inc. is an American multinational conglomerate holding company" is a weird thing for a textile manufacturer to call itself. Almost like...businesses expand and evolve?
(they've never been a time series database company either lol)
Could you elaborate? Because that sentence made my brow wrinkle with confusion. I have thought to myself before that all business data problems eventually become time series problems. I'd like to understand your point of view on how LLMs fit into that.
Time series just means that the order of features matters. Feature 1 occurs before feature 2.
E.g., fitting a model to house prices, you don't care if feature 1 is square meters and feature 2 is time on market, or vice versa, but in a time series, your model changes if you reverse the order of features.
With text, the meaning of word 2 is dependent on the meaning of word 1. With stock prices, you expect the price at time 2 to be dependent on time 1.
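The distinction is easy to see with a toy sketch (my own illustration, not from the thread): a bag-of-words count plays the role of the order-free "house prices" features, while bigrams stand in for order-aware ones.

```python
from collections import Counter

a = "man bites dog".split()
b = "dog bites man".split()

# Order-free view: identical word counts either way,
# so an order-blind model cannot tell the sentences apart.
assert Counter(a) == Counter(b)

# Sequential view: adjacent-pair (bigram) features distinguish them.
def bigrams(words):
    return list(zip(words, words[1:]))

assert bigrams(a) != bigrams(b)
print(bigrams(a))  # [('man', 'bites'), ('bites', 'dog')]
```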
Text can be modeled as a time series.
A language model tells you the next character/token/word depending on the previous input.
Language models are time series.
It’s not an audacious claim.
Any student of NLP should have encountered a paper modeling text as a time series before writing their thesis. How could you not?
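To make the autoregressive view concrete, here is a toy bigram predictor (a hypothetical corpus and a caricature of a real LLM, but it has the same conditional structure: next token given previous input).

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

# Count next-token frequencies conditioned on the previous token,
# exactly like estimating a one-step transition in a time series.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(token):
    # Most frequent continuation of `token` in the corpus.
    return following[token].most_common(1)[0][0]

print(predict_next("the"))  # → "cat" (seen twice, vs "mat" once)
```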
It is possible for LLMs to learn Benford's law implicitly. So they will be better than a null predictor on time series data, because time series data is also Benford-distributed [4].
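The Benford check is easy to reproduce on a sequence known to follow the law (powers of 2 here; the expected probability of leading digit d is log10(1 + 1/d)):

```python
import math
from collections import Counter

# Leading digits of powers of 2 famously follow Benford's law.
values = [2 ** n for n in range(1, 1001)]
counts = Counter(int(str(v)[0]) for v in values)

for d in range(1, 10):
    observed = counts[d] / len(values)
    expected = math.log10(1 + 1 / d)   # Benford probability for digit d
    print(d, round(observed, 3), round(expected, 3))
```

Digit 1 comes out near 0.301, digit 9 near 0.046, matching the Benford curve.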
Interesting headline for a _checks notes_ time series database company.