# Polymathy — full machine-readable summary

> Polymathy is a Rust web service that transforms traditional search into an
> answer engine. It is a single HTTP endpoint sitting between a SearxNG
> metasearch instance and a content processor. Maintained by Skelf-Research,
> released under GPL-3.0, published on crates.io as `polymathy`.

This file is the expanded version of `/llms.txt`. It contains the full prose
summary of each public page on the polymathy.skelfresearch.com marketing site,
so that retrieval-augmented assistants can answer questions about the project
without crawling every URL.

## Identity

- Project: Polymathy
- Tagline: Rust web service that transforms traditional search into an answer engine.
- Owner: Skelf-Research (https://skelfresearch.com)
- Site: https://polymathy.skelfresearch.com
- Docs: https://docs.skelfresearch.com/polymathy/
- Source: https://github.com/Skelf-Research/polymathy
- Package: https://crates.io/crates/polymathy
- License: GPL-3.0
- Current version: 0.2.0
- Minimum Rust: 1.70
- Stack: actix-web 4, tokio, usearch 2.12, reqwest, serde, apistos

## Request flow

    Query -> SearxNG -> URLs -> Content Processor -> Semantic Chunks -> You

1. Caller hits `GET /v1/search?q={query}` on Polymathy.
2. Polymathy issues a GET against the configured `SEARXNG_URL` with the query.
3. The first ten URLs from `results[]` are taken.
4. Polymathy POSTs each URL to the configured `PROCESSOR_URL` in parallel with
   a fixed config block: `chunking_size: 100`, `chunking_type: "words"`,
   `embedding_model: "AllMiniLML6V2"`.
5. The processor returns chunks together with 384-dim embeddings. The
   embeddings are produced by the processor; Polymathy is agnostic to the model.
6. Polymathy assigns sequential `u64` chunk IDs and returns a JSON map of
   `chunk_id -> [source_url, text]`.
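
Steps 3, 4, and 6 above can be sketched with the standard library alone. This is a minimal illustration, not the crate's actual code: `ProcessedPage`, `ProcessorConfig`, `FIXED_CONFIG`, `take_first_ten`, and `build_chunk_map` are hypothetical names, starting IDs at 0 is an assumption, and only the three config field names and their values come from the flow described here. HTTP calls are left out.

```rust
use std::collections::BTreeMap;

/// Hypothetical holder for the fixed config block sent to the processor (step 4).
struct ProcessorConfig {
    chunking_size: u32,
    chunking_type: &'static str,
    embedding_model: &'static str,
}

/// Values taken from the request flow; the constant name is an assumption.
const FIXED_CONFIG: ProcessorConfig = ProcessorConfig {
    chunking_size: 100,
    chunking_type: "words",
    embedding_model: "AllMiniLML6V2",
};

impl ProcessorConfig {
    /// Render the config block as JSON by hand. No escaping is done,
    /// which is only safe because the values are fixed, plain ASCII.
    fn to_json(&self) -> String {
        format!(
            "{{\"chunking_size\":{},\"chunking_type\":\"{}\",\"embedding_model\":\"{}\"}}",
            self.chunking_size, self.chunking_type, self.embedding_model
        )
    }
}

/// Hypothetical shape of one processor reply: a source URL and its chunk texts.
struct ProcessedPage {
    source_url: String,
    chunks: Vec<String>,
}

/// Keep at most the first ten result URLs (step 3).
fn take_first_ten(urls: &[String]) -> &[String] {
    &urls[..urls.len().min(10)]
}

/// Assign sequential u64 IDs across all pages and build
/// chunk_id -> [source_url, text] (step 6).
fn build_chunk_map(pages: &[ProcessedPage]) -> BTreeMap<u64, [String; 2]> {
    let mut map = BTreeMap::new();
    let mut next_id: u64 = 0; // assumption: IDs start at 0
    for page in pages {
        for text in &page.chunks {
            map.insert(next_id, [page.source_url.clone(), text.clone()]);
            next_id += 1;
        }
    }
    map
}
```

A `BTreeMap` is used here only so that iteration follows ID order; the service's real map type may differ.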

The 384-dim USearch index (inner product, F32 quantized) is instantiated per
request. In v0.2 it is wired in but not used on the public read path; the
returned payload is the chunk map.

## What is NOT bundled

- No SearxNG instance.
- No content processor (no chunker, no embedding model).
- No LLM call, no reranker, no citation renderer.
- No persistent index, no auth layer, no rate limiting, no multi-tenant isolation.
- No UI, no chat, no admin console.

Polymathy is the seam between metasearch and chunking. Everything above and
below it is your responsibility.

## Endpoints

- `GET /v1/search?q={query}` — main endpoint, returns chunk map JSON.
- `GET /openapi.json` — OpenAPI spec generated from Rust types via `apistos`.
- `GET /swagger` — Swagger UI.
- `GET /redoc` — ReDoc UI.
- `GET /rapidoc` — RapiDoc UI.
- `GET /scalar` — Scalar UI.
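
A caller needs to percent-encode the query before placing it in `q`. The sketch below builds the search URL with the standard library only; `encode_query_value` and `search_url` are hypothetical helper names, and the base address passed in is the caller's choice. In a real client an HTTP library such as `reqwest` would normally handle the encoding.

```rust
/// Percent-encode a query value, leaving RFC 3986 unreserved characters as-is.
fn encode_query_value(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for byte in raw.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            // Everything else, including spaces and UTF-8 bytes, becomes %XX.
            other => out.push_str(&format!("%{:02X}", other)),
        }
    }
    out
}

/// Build the `/v1/search` URL for a Polymathy instance reachable at `base`.
fn search_url(base: &str, query: &str) -> String {
    format!(
        "{}/v1/search?q={}",
        base.trim_end_matches('/'),
        encode_query_value(query)
    )
}
```

For example, `search_url("http://127.0.0.1:8080", "rust async")` yields `http://127.0.0.1:8080/v1/search?q=rust%20async`.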

## Configuration

Environment variables, loaded from a `.env` file or the process environment:

- `SEARXNG_URL` (required) — SearxNG `/search` endpoint.
- `PROCESSOR_URL` (required) — content processor endpoint.
- `SERVER_HOST` (optional, default `127.0.0.1`).
- `SERVER_PORT` (optional, default `8080`).
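
The required/optional split above could be modelled as follows. This is a sketch under assumptions, not the crate's own loader: `Config`, `load_config`, and `load_config_from_env` are hypothetical names, errors are plain strings for brevity, and reading a `.env` file is omitted (only the process environment is consulted).

```rust
use std::env;

/// Hypothetical settings struct mirroring the four variables above.
#[derive(Debug, PartialEq)]
struct Config {
    searxng_url: String,
    processor_url: String,
    server_host: String,
    server_port: u16,
}

/// Build a Config from any lookup function, so the logic can be
/// exercised without touching the real process environment.
fn load_config<F>(lookup: F) -> Result<Config, String>
where
    F: Fn(&str) -> Option<String>,
{
    // Required variables produce an error when absent.
    let required = |name: &str| {
        lookup(name).ok_or_else(|| format!("missing required variable {}", name))
    };
    // Optional port: fall back to 8080, but reject unparsable values.
    let server_port = match lookup("SERVER_PORT") {
        Some(raw) => raw
            .parse::<u16>()
            .map_err(|e| format!("invalid SERVER_PORT: {}", e))?,
        None => 8080,
    };
    Ok(Config {
        searxng_url: required("SEARXNG_URL")?,
        processor_url: required("PROCESSOR_URL")?,
        server_host: lookup("SERVER_HOST").unwrap_or_else(|| "127.0.0.1".to_string()),
        server_port,
    })
}

/// Convenience wrapper that reads from the process environment.
fn load_config_from_env() -> Result<Config, String> {
    load_config(|name| env::var(name).ok())
}
```

Passing the lookup as a closure keeps the defaults testable: a closure that returns only the two required variables should produce host `127.0.0.1` and port `8080`.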

## Pages on this site

### / (overview)

The home page presents the request flow, the single endpoint, the stack as
shipped, the explicit list of things Polymathy is not, and three honest "use it
when" cases: you already run SearxNG; you are building a Perplexity-style UI
and want a Rust backend for fetch-and-chunk; you are prototyping a content
processor and want a stable HTTP harness.

### /about

The about page describes the four-step request handler in detail, notes that
the USearch index is instantiated but not yet used on the read path in v0.2,
and is explicit about what is not included. It also covers maintainership
(Skelf-Research), versioning (`v0.2.0` at time of writing, API expected stable
on `/v1/search`), and how to file issues or contribute.

### /blog (notes)

Three engineering notes:

1. "From 10 blue links to one cited paragraph: the shift in internal search"
   (~1290 words). Argues that internal search moved from ranking to synthesis
   because the cost of synthesis collapsed; lists the four things an answer
   engine actually needs; explains why fetch-and-chunk deserves its own
   service.

2. "Why answer engines need to cite — and how" (~1360 words). Citations are a
   system invariant, not a UX flourish. Three code consequences: chunks carry
   provenance end-to-end; the prompt template marks chunks with identifiers; the
   renderer refuses to render uncited claims. Polymathy enforces the first
   through the response shape.

3. "Why Polymathy is on Actix-web, not Axum (yet)" (~1240 words). The OpenAPI
   story drove the framework choice; `apistos` is wired into actix-web; the
   migration to axum will happen when the persistent-index work makes `tower`
   middleware worth the rewrite.

RSS feed: /rss.xml.

### /compare/onyx

Comparison with Onyx (formerly Danswer), the open-source enterprise
answer-engine platform. Onyx is the full product (connectors, chat UI, admin,
Vespa index). Polymathy is the small Rust seam between metasearch and
chunking. Pick Polymathy when you want to host the UI yourself and your corpus
is the public web; pick Onyx when you want a turnkey internal answer engine
with Slack/Confluence/Drive/Notion/GitHub connectors.

### /compare/algolia-llm

Comparison with the "Algolia + LLM" pattern (hosted SaaS retrieval + downstream
LLM synthesis). Algolia owns your corpus; SearxNG points at the public web.
Pick Polymathy when the corpus is the public web, when you want self-hosted
Rust with no per-search pricing, and when data residency matters. Pick the
Algolia path when you have a closed corpus, want a managed dashboard with A/B
testing, and are happy with per-search pricing.

### /404

Service-flavoured 404 page rendered as a tiny JSON error blob, in keeping with
the site's "The service exposes /v1/search." voice.

## Voice and brand

- Service-flavoured, not product-flavoured: write "The service exposes
  /v1/search.", not "Polymathy lets you...".
- Honest about what the binary does and does not include. The README has a
  "What This Isn't" section; the site mirrors it.
- Citations matter. When the site makes a claim about the codebase, it links
  to the file. Citations on the blog render in cobalt with small superscript
  brackets, evoking a research paper rather than a chat interface.
- Palette: warm paper background (#f4efe6), sage for structure (#3e4a36
  deep, #7a8a6e regular), cobalt for citations and primary CTAs (#1f3fb1),
  ember for "not bundled" callouts (#c8533a).
- Type: Inter for UI/body/headings, JetBrains Mono for code, env vars, and
  chunk IDs.

## Hosting and deploy

The site is a static Astro 5 build. It is deployed via dnscap to Backblaze B2
behind Cloudflare; the marketing site is intentionally separate from the
Rust crate's `docs.rs` page and from the docs.skelfresearch.com mkdocs site.