How We Chose Our AI Provider: Why Formaly Runs on Nebius Token Factory

When you build an AI product, it is easy to obsess over the model and quietly skip the more important question: who sees the data you send it?

For a lot of apps, that question stays abstract. For Formaly, it is the whole thing.

People use Formaly to collect survey responses. The text flowing through our system is some of the most sensitive data a company holds: why a customer churned, what an employee actually thinks about their manager, what frustrated someone enough to write a paragraph about it, sometimes health or financial details typed into an open-ended box. Every one of those answers gets sent to a language model to be summarized, scored, or turned into structured data.

So before I picked a model, I had to pick something that matters more: an inference provider I would trust with that data.

This is the story of how I chose, and why Formaly runs on Nebius Token Factory.

The question most AI products skip

The moment your app sends text to a third-party model, that provider becomes a sub-processor of your users' data. You have quietly extended your trust boundary to include them. Whatever promises you make to your customers about privacy, you can only keep if the company behind your API keeps them too.

Most teams never look closely at this. They grab the most famous API, ship, and assume the data just evaporates after the response comes back. It often does not. With many providers, prompts and outputs are retained by default, sometimes for abuse monitoring, sometimes to improve the service, sometimes for training. For a weekend project, fine. For a product that holds other companies' customer feedback, that default is a dealbreaker.

Why this matters more for a feedback tool

If I were building a code assistant, retained prompts would be a concern. Building a survey platform, it is existential.

A churn response can contain a customer's name, their company, their reasons for leaving, and their opinion of a competitor. An employee engagement survey can contain things people would never say to their manager's face. If a customer runs a patient-experience survey or a financial-services NPS, the open text can include regulated data. The whole promise of a survey is that it is safe to be honest. If the infrastructure underneath leaks or reuses that honesty, the promise is broken before the first response comes in.

So my evaluation was not "which model is smartest." It was "which provider lets me make a real, defensible promise about this data."

What I was actually evaluating

I wrote down what I needed before I looked at any logos:

Data handling. Can I guarantee prompts and outputs are not stored or used for training?
Compliance. Real certifications I can point a security-conscious customer to, not a vibe.
Model flexibility. Access to strong open models, and the freedom to switch without a rewrite.
A compatible API. I did not want to maintain a bespoke integration.
Production performance. Predictable latency and an SLA, not best-effort.
Room to grow. Fine-tuning and dedicated capacity for when Formaly needs them.

Nebius Token Factory was the option that cleared all six, and it cleared the first one in a way I had not seen stated so plainly.

The one that mattered most: zero data retention

Nebius offers Zero Data Retention as a control you can turn on. When it is enabled, your inputs (the prompts) and your outputs (the model's responses) are not stored on their systems after the request is processed, and that data is never used for model training or any other purpose. The request runs, the answer comes back, and nothing about it lingers.

That is exactly the guarantee a feedback platform needs. It means I can tell a customer, truthfully, that the open-ended answer someone typed into a survey is processed to generate their summary and then it is gone. Not retained, not mined, not turned into training data for a model someone else will use later.

It is worth being precise about how it works, because honesty about your infrastructure is part of the point. ZDR is a setting, not the silent default. By default, Nebius keeps inputs and outputs to speed up inference (a technique called speculative decoding), and under the standard terms they act as a data processor governed by a data processing agreement. The reason I value the feature is that the stricter behavior is available and explicit. Privacy you can switch on and verify is worth more than privacy you have to assume.

Compliance is the unglamorous part that earns trust

Zero data retention is the headline, but a promise is only as good as the audit behind it. This is where Nebius made the decision easy. The platform carries the certifications that security reviews actually ask about: SOC 2 Type II, HIPAA, ISO 27001, and ISO 27799. It operates in line with GDPR and CCPA, and it is certified under the EU-US Data Privacy Framework, which matters for moving data between regions cleanly. Inference runs in data centers in Finland, France, and the US, so EU and US data-residency requirements can both be met.

None of that is exciting to write about. All of it is what lets a careful customer say yes.

The features that made it an easy yes

Once the data story checked out, the rest of the platform made the choice comfortable rather than just principled:

OpenAI-compatible API. This was a genuinely pleasant surprise. Switching to Nebius meant pointing the OpenAI client at a different base URL and using my key. No new SDK, no rewrite. If I ever need to move, the same compatibility works in reverse, so I am not locked in.
A real catalog of open models. Nebius Token Factory serves 60+ open-source models across text, code, and vision (the Llama, Qwen, and DeepSeek families and more). Formaly runs on an open model today, and I can swap models per task without changing providers.
Production-grade serving. Dedicated endpoints offer isolation, autoscaling throughput, and a 99.9% SLA with predictable latency, which is what you want once real traffic depends on it.
A path forward. Batch inference, fine-tuning, and their Data Lab tooling are there for when Formaly's analysis layer needs a model tuned on our own patterns. I did not need it on day one, but I liked knowing the runway existed on the same platform.

The honest tradeoff

I will be straight about the tension. Building on open models hosted by an inference provider is a different bet than wiring straight into a single frontier lab's closed model. You are trading a sliver of raw frontier capability for control, portability, and a much clearer data story.

For Formaly, that trade is obvious. The work our models do (summarizing responses, scoring sentiment, turning answers into structured data and analytics) is well within the reach of strong open models, and the privacy posture is worth far more to my customers than the last few percent of benchmark performance. I would rather give them data they can trust than a marginally cleverer summary they cannot.

What this means if you use Formaly

Most of this happens where you never see it, which is the point. When you run a churn survey or a sensitive employee feedback round through Formaly, the responses are processed on infrastructure I chose specifically because it can be told to forget them.

Your respondents are doing you the favor of being honest. The least we can do is build on a stack that treats that honesty as theirs, not ours to keep.

Choosing a model is a product decision. Choosing who holds the data is a trust decision. I think the second one matters more, and it is the one I spent the most time getting right.

When you build an AI product, it is easy to obsess over the model and quietly skip the more important question: who sees the data you send it?

For a lot of apps, that question stays abstract. For Formaly, it is the whole thing.

So before I picked a model, I had to pick something that matters more: an inference provider I would trust with that data.

This is the story of how I chose, and why Formaly runs on Nebius Token Factory.

The question most AI products skip

Why this matters more for a feedback tool

If I were building a code assistant, retained prompts would be a concern. Building a survey platform, it is existential.

So my evaluation was not "which model is smartest." It was "which provider lets me make a real, defensible promise about this data."

What I was actually evaluating

I wrote down what I needed before I looked at any logos:

Data handling. Can I guarantee prompts and outputs are not stored or used for training?
Compliance. Real certifications I can point a security-conscious customer to, not a vibe.
Model flexibility. Access to strong open models, and the freedom to switch without a rewrite.
A compatible API. I did not want to maintain a bespoke integration.
Production performance. Predictable latency and an SLA, not best-effort.
Room to grow. Fine-tuning and dedicated capacity for when Formaly needs them.

Nebius Token Factory was the option that cleared all six, and it cleared the first one in a way I had not seen stated so plainly.

The one that mattered most: zero data retention

Compliance is the unglamorous part that earns trust

None of that is exciting to write about. All of it is what lets a careful customer say yes.

The features that made it an easy yes

Once the data story checked out, the rest of the platform made the choice comfortable rather than just principled:

OpenAI-compatible API. This was a genuinely pleasant surprise. Switching to Nebius meant pointing the OpenAI client at a different base URL and using my key. No new SDK, no rewrite. If I ever need to move, the same compatibility works in reverse, so I am not locked in.
A real catalog of open models. Nebius Token Factory serves 60+ open-source models across text, code, and vision (the Llama, Qwen, and DeepSeek families and more). Formaly runs on an open model today, and I can swap models per task without changing providers.
Production-grade serving. Dedicated endpoints offer isolation, autoscaling throughput, and a 99.9% SLA with predictable latency, which is what you want once real traffic depends on it.
A path forward. Batch inference, fine-tuning, and their Data Lab tooling are there for when Formaly's analysis layer needs a model tuned on our own patterns. I did not need it on day one, but I liked knowing the runway existed on the same platform.

The honest tradeoff

What this means if you use Formaly

Your respondents are doing you the favor of being honest. The least we can do is build on a stack that treats that honesty as theirs, not ours to keep.

Choosing a model is a product decision. Choosing who holds the data is a trust decision. I think the second one matters more, and it is the one I spent the most time getting right.

How We Chose Our AI Provider: Why Formaly Runs on Nebius Token Factory

The question most AI products skip

Why this matters more for a feedback tool

What I was actually evaluating

The one that mattered most: zero data retention

Compliance is the unglamorous part that earns trust

The features that made it an easy yes

The honest tradeoff

What this means if you use Formaly

Completion Rate Is a Vanity Metric

The Form Is Dying. The Interview Is Replacing It.

How to Run a Churn Survey That Actually Tells You Why People Left

How We Chose Our AI Provider: Why Formaly Runs on Nebius Token Factory

The question most AI products skip

Why this matters more for a feedback tool

What I was actually evaluating

The one that mattered most: zero data retention

Compliance is the unglamorous part that earns trust

The features that made it an easy yes

The honest tradeoff

What this means if you use Formaly

Completion Rate Is a Vanity Metric

The Form Is Dying. The Interview Is Replacing It.

How to Run a Churn Survey That Actually Tells You Why People Left