Szymon Smagowski · AI Lab · est. 2024
Node · 01 · USER
The question.
Any input, any language. Previous user context and chat history travel with the turn — and the graph can enter from any node, not only here.
Node · 02 · VOICE
Steer with your voice.
Everything text can do, done by voice instead. Wake-word detection runs in the browser, and no audio leaves the device until the user wants it to. Hands free, eyes on the work, "in the room with the assistant".
Node · 03 · ROUTER
The router.
A fast LLM classifies the turn — small-talk, retrieval, or tool call. Everything downstream branches here.
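A minimal sketch of the routing step, assuming the three labels named above. The `keyword_classifier` function is a hypothetical stand-in for the fast LLM, and the trigger words are made up for illustration; a real router would call a model and parse its label.

```python
from enum import Enum
from typing import Callable


class Route(Enum):
    SMALL_TALK = "small_talk"
    RETRIEVAL = "retrieval"
    TOOL_CALL = "tool_call"


def keyword_classifier(turn: str) -> str:
    """Hypothetical stand-in for the fast LLM: returns a raw label string."""
    text = turn.lower()
    if any(word in text for word in ("create", "send", "book", "schedule")):
        return "tool_call"
    if text.rstrip("!. ") in ("hi", "hello", "thanks", "thank you"):
        return "small_talk"
    return "retrieval"


def route(turn: str, classify: Callable[[str], str] = keyword_classifier) -> Route:
    """Map the classifier's label to a Route, falling back to retrieval."""
    label = classify(turn).strip().lower()
    try:
        return Route(label)
    except ValueError:
        # Unknown label from the model: fall back to the retrieval branch.
        return Route.RETRIEVAL
```

Falling back to retrieval on an unrecognised label is one possible design choice; it keeps a malformed model output from triggering a tool call.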
Node · 04 · EMBEDDING
Text becomes vectors.
Each piece of text is encoded as a high-dimensional vector. Texts with the same meaning land close together, across languages and across typos.
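To make "close together" concrete, here is a small sketch of cosine similarity. The three-dimensional vectors are invented for illustration; a real embedding model would produce hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up toy vectors, not the output of any real model.
vectors = {
    "reset my password": [0.9, 0.1, 0.0],
    "resetear mi contraseña": [0.85, 0.15, 0.05],
    "quarterly revenue report": [0.0, 0.2, 0.95],
}

same_meaning = cosine_similarity(
    vectors["reset my password"], vectors["resetear mi contraseña"]
)
different_meaning = cosine_similarity(
    vectors["reset my password"], vectors["quarterly revenue report"]
)
```

With these toy values, the English and Spanish phrasings score far higher against each other than either does against the unrelated sentence.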
Node · 05 · VECTOR DB
The vector database.
Chunks are indexed as dense vectors. ACLs stored in each chunk's payload filter results at query time, with no separate auth pass. A sparse index runs in parallel for keyword recall.
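A rough in-memory sketch of payload-level ACL filtering at query time. The `Chunk` class, the `allowed_groups` field, and the `search` function are hypothetical names, not the API of any particular vector database, and the sparse index is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    vector: list
    # ACL stored in the payload: groups allowed to see this chunk.
    allowed_groups: set = field(default_factory=set)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def search(index, query_vector, user_groups, top_k=3):
    """Filter by ACL first, then rank the visible chunks by dot product."""
    visible = [c for c in index if c.allowed_groups & user_groups]
    visible.sort(key=lambda c: dot(c.vector, query_vector), reverse=True)
    return visible[:top_k]
```

Because the filter runs inside the query, a chunk the user cannot see never enters the ranking, however well it scores.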
Node · 06 · RERANK
An LLM judge.
Vector similarity gives a coarse ranking. A smaller LLM rescores the candidates and trims them to the most relevant, removing the noise that cosine distance can't see.
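The rescore-and-trim step might look like the sketch below. The `overlap_judge` function is a hypothetical stand-in for the LLM judge (it only counts shared words), and `keep` and `min_score` are illustrative parameters.

```python
from typing import Callable, List


def overlap_judge(query: str, passage: str) -> float:
    """Hypothetical stand-in for the LLM judge: share of query words found."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & passage_words) / len(query_words)


def rerank(
    query: str,
    candidates: List[str],
    judge: Callable[[str, str], float] = overlap_judge,
    keep: int = 2,
    min_score: float = 0.0,
) -> List[str]:
    """Rescore every candidate, drop those at or below min_score, keep the top few."""
    scored = [(judge(query, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored if score > min_score][:keep]
```

Swapping `overlap_judge` for a call to a model leaves the trimming logic unchanged.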
Node · 07 · TOOLS
When the agent acts.
Some questions need actions, not retrieval. The agent picks a tool, runs it, and brings the output back. Arguments are scoped and validated per tool, so an injected prompt cannot reach beyond what the tool allows.
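One way to read "scoped args" is a per-tool schema that is checked before anything runs. The sketch below assumes that reading; the tool name `get_order_status`, its schema, and its return string are hypothetical.

```python
def get_order_status(order_id):
    """Hypothetical tool: a real one would query an order system."""
    return f"order {order_id}: shipped"


# Each tool declares exactly which arguments it accepts, and their types.
TOOL_SCHEMAS = {"get_order_status": {"order_id": int}}
TOOL_HANDLERS = {"get_order_status": get_order_status}


def call_tool(name, args):
    """Run a tool only if its name and arguments match the declared schema."""
    if name not in TOOL_SCHEMAS:
        raise ValueError(f"unknown tool: {name}")
    schema = TOOL_SCHEMAS[name]
    if set(args) != set(schema):
        raise ValueError(f"unexpected arguments for {name}: {sorted(args)}")
    for key, expected_type in schema.items():
        # Exact type match, so a bool or a string cannot pass as an int.
        if type(args[key]) is not expected_type:
            raise TypeError(f"{key} must be {expected_type.__name__}")
    return TOOL_HANDLERS[name](**args)
```

Whatever text the model was fed, the only thing that reaches the tool is a known name plus arguments of the declared shape.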
Node · 08 · ANSWER LLM
The generator.
Retrieved chunks plus tool output become the prompt. A large LLM streams the response back to the user.
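A small sketch of prompt assembly and streaming, under stated assumptions: the prompt layout is invented for illustration, and `fake_llm_stream` is a hypothetical stand-in for a streaming LLM client that yields fixed tokens.

```python
from typing import Iterator, List, Optional


def build_prompt(
    question: str, chunks: List[str], tool_output: Optional[str] = None
) -> str:
    """Combine retrieved chunks and optional tool output into one prompt."""
    parts = ["Answer using only the context below.", "", "Context:"]
    for number, chunk in enumerate(chunks, start=1):
        parts.append(f"[{number}] {chunk}")
    if tool_output:
        parts += ["", "Tool output:", tool_output]
    parts += ["", f"Question: {question}"]
    return "\n".join(parts)


def fake_llm_stream(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for a streaming LLM: yields tokens one by one."""
    for token in ["Refunds ", "take ", "30 ", "days."]:
        yield token
```

Numbering the chunks gives the model something to cite, and consuming the generator token by token is what lets the interface render the answer as it arrives.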
Node · 09 · OBSERVABILITY
Every step traced.
Every step lands as a trace span — scores, args, latency, cost. The same trace ID flows from the user turn to the eval harness.
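A minimal sketch of span recording with a shared trace ID, using only the standard library. The `span` helper, the in-memory `SPANS` list, and the attribute names are hypothetical; a production setup would more likely export spans through a tracing library.

```python
import time
import uuid
from contextlib import contextmanager

SPANS = []  # In-memory sink; a real system would export these.


@contextmanager
def span(trace_id, name, **attributes):
    """Record one pipeline step with its attributes and latency."""
    start = time.perf_counter()
    record = {"trace_id": trace_id, "name": name, "attributes": attributes}
    try:
        yield record
    finally:
        record["latency_ms"] = (time.perf_counter() - start) * 1000
        SPANS.append(record)


# One trace ID per user turn, shared by every step in that turn.
trace_id = uuid.uuid4().hex
with span(trace_id, "rerank", candidates=20) as current:
    current["attributes"]["top_score"] = 0.91
with span(trace_id, "answer", model="large-llm"):
    pass
```

Because both spans carry the same `trace_id`, anything recorded later for this turn can be joined back to the exact steps that produced the answer.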
Node · 10 · FEEDBACK
The human signal.
Thumbs up / down on the answer or the retrieval, free-text follow-ups, escalations. Real users telling the system what's good and what's wrong — captured per turn, joined to the trace.
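Joining feedback to the trace could be as simple as the sketch below. The `Feedback` fields and the shape of the `traces` dictionary are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Feedback:
    trace_id: str
    target: str  # "answer" or "retrieval"
    rating: int  # +1 for thumbs up, -1 for thumbs down
    comment: Optional[str] = None


def join_feedback(traces: Dict[str, dict], feedback: List[Feedback]) -> List[dict]:
    """Attach each feedback item to the trace of the turn it refers to."""
    joined = []
    for item in feedback:
        trace = traces.get(item.trace_id)
        if trace is None:
            continue  # Feedback without a matching trace is skipped.
        joined.append(
            {
                "trace_id": item.trace_id,
                "question": trace["question"],
                "target": item.target,
                "rating": item.rating,
                "comment": item.comment,
            }
        )
    return joined
```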
Node · 11 · DATASET + EVAL
A labelled ground truth.
Business subject-matter experts plus the user feedback stream — distilled into a labelled eval set. RAG and LLM-answer quality scored against it on every build. The dataset grows, the bar moves with it.
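Scoring retrieval against a labelled set on every build might look like this sketch. Hit rate at k is one possible metric, chosen here for illustration; the `question` and `expected_doc` field names and the retriever passed in are hypothetical.

```python
from typing import Callable, List


def hit_rate_at_k(
    eval_set: List[dict], retrieve: Callable[[str], List[str]], k: int = 3
) -> float:
    """Share of questions whose expected document appears in the top k results."""
    if not eval_set:
        return 0.0
    hits = 0
    for example in eval_set:
        retrieved = retrieve(example["question"])[:k]
        if example["expected_doc"] in retrieved:
            hits += 1
    return hits / len(eval_set)


def passes_build_gate(score: float, threshold: float) -> bool:
    """Fail the build when retrieval quality drops below the agreed bar."""
    return score >= threshold
```

As the eval set grows, raising `threshold` is how the bar moves with it.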
The loop closes
The system learns.
Findings from feedback and eval flow back to retrieval and the model — re-ranked priors, re-weighted retrieval, tuned prompts, fine-tunes. Today's bug becomes tomorrow's regression test.
Talk to the lab
Got something to build?
A short email is fine. Tell me what you're trying to do and where you're stuck — I'll write back.