I don’t think OpenAI train on data processed via the API, unless there’s an exce...

dpoloncsak · 2025-08-13T14:47:03 1755096423

Maybe I misunderstand, but I'm pretty sure they offer an option for cheaper API costs (or maybe it's credits?) if you allow them to train on your API requests.

To your point, pretty sure it's off by default, though

Edit: From https://platform.openai.com/settings/organization/data-contr...

Share inputs and outputs with OpenAI

"Turn on sharing with OpenAI for inputs and outputs from your organization to help us develop and improve our services, including for improving and training our models. Only traffic sent after turning this setting on will be shared. You can change your settings at any time to disable sharing inputs and outputs."

And I am 'enrolled for complimentary daily tokens.'

trhway · 2025-08-13T09:50:35 1755078635

I wouldn't rule out an approach where, instead of training directly on the data, they train on a very high-dimensional embedding of it (or some other similarly "anonymized", yet still very semantically rich, representation of the data).

dannyw · 2025-08-13T06:20:13 1755066013

Can you truly trust them though?

cedws · 2025-08-13T06:58:34 1755068314

Yes, it would be disastrous for OpenAI if it got out they are training on B2B data despite saying they don’t.

dweinus · 2025-08-13T12:34:52 1755088492

We're both talking about the company whose entire business model is built on top of large-scale copyright infringement, right?

dymk · 2025-08-13T19:42:53 1755114173

Not the same when the people you infringe on can sue you into the dirt

reasonableklout · 2025-08-13T09:15:06 1755076506

Have they said they don't? (actually curious)

gkbrk · 2025-08-13T09:46:42 1755078402

Yes, they have. [1]

> Your data is your data. As of March 1, 2023, data sent to the OpenAI API is not used to train or improve OpenAI models (unless you explicitly opt in to share data with us).

[1]: https://platform.openai.com/docs/guides/your-data

mattigames · 2025-08-13T07:40:56 1755070856

Yeah, so many companies have been completely ruined after similar PR disasters /s

j33zusjuice · 2025-08-13T20:03:58 1755115438

Their terms of service say they won’t use the data for training, so it wouldn’t just be a PR disaster; it’d be a breach of contract. They’d be sued into oblivion.

johnthescott · 2025-08-13T03:37:13 1755056233

I am too lazy to ask OpenAI.