Introduction

Rheme is a deterministic semantic runtime. It reads prose as you write and resolves what the prose means: a language server for meaning rather than for code.

Editors have long had a language server watching the code, naming every symbol and catching every slip. Rheme does that for sentences. It runs in real time and marks each span of text along the axis of meaning it carries, so software always knows who said what, what is claimed, what is given, and what is merely felt.

These pages are themselves a Rheme surface. Every colored span is annotated live. Hover one to see the axis, the source, and the confidence behind it, or flip annotations off in the top bar to read the plain prose.

What Rheme reads

A sentence does many things at once. It names a source, takes a stance, hedges or commits, points back to something earlier, and carries a feeling. Rheme separates those threads and keeps each one labeled. The claim that a result may shift is held apart from the claim that it held in every test, because one is a hedge and the other is a warrant.

How it reads it

Rheme is deterministic. The same input returns the same output every time, with no sampling and no drift between runs. It does this without turning text into an embedding, so nothing is averaged away. Because the words stay words, attribution survives the parse and you can always trace a phrase back to the source that uttered it.

// the runtime hands back spans, not a single vector
let doc = rheme::resolve("According to Dr. Reyes, the result holds.");
doc.spans()[0].axis    // => Attribution
doc.spans()[0].source  // => "Dr. Reyes"

Where it runs

The core is built in Rust and ships as both a WASM bundle and a native C-ABI library, so it drops into a browser, a server, or a native app behind the same interface. It clocks around 12.6 ns per token, and per-token latency holds flat to about 35,000 tokens before it starts to climb.

What you build with it

The flagship surface is the Inspector, an editor that feels like an IDE for prose. The same runtime powers review tools, compliance checks, and any place where it matters that meaning is never flattened into a single blurred score.

The ten axes

Every mark Rheme makes is a span placed on one axis of meaning. Ten axes cover the load a sentence carries, with an eleventh for time.

Reading along axes

An axis is a single question Rheme asks of every span. Who is the source? How sure are they? What is the sentence doing? Each answer is a labeled, colored mark you can hover for its source and confidence. The schema below is sampled from the Inspector and stands until the canonical taxonomy lands.

attribution

Who a claim is sourced to. In “according to Dr. Reyes,” the source is fixed and carried forward.

modality

How firmly something is asserted. A runtime that may shift is hedged, not promised.

speech act

What the sentence does. To propose a question is a different act from answering one.

negation

Where meaning is reversed. Rheme marks that attribution is never flattened.

evidence

The grounds for a claim. A result that holds in every test carries its warrant with it.

causal

A cause linked to an effect. Meaning survives because it is never averaged.

sentiment

The positive or negative charge. A result can read as genuinely remarkable.

emotion

A felt state coloring the words. An aside like “honestly, a relief” carries feeling, not fact.

register

How formal or casual the wording is. “Plainly put” sits lower than a formal clause.

sense

Which meaning of an ambiguous word holds. Read in the strict sense, the word resolves one way.

temporal

When something holds, and for how long. A claim true only as the corpus grows is bound in time.
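
The axis list above maps naturally onto a closed enum. What follows is a minimal sketch, not the crate's published type: the variant names are assumed to follow the `Attribution`, `Modality`, and `Temporal` spellings that appear in example output elsewhere in these pages, and `ALL`, `label`, and `from_label` are hypothetical helpers added for illustration.

```rust
/// Hypothetical mirror of the axis list above; not the crate's own type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Axis {
    Attribution,
    Modality,
    SpeechAct,
    Negation,
    Evidence,
    Causal,
    Sentiment,
    Emotion,
    Register,
    Sense,
    Temporal,
}

impl Axis {
    /// Every axis, in the order the list above gives them.
    pub const ALL: [Axis; 11] = [
        Axis::Attribution,
        Axis::Modality,
        Axis::SpeechAct,
        Axis::Negation,
        Axis::Evidence,
        Axis::Causal,
        Axis::Sentiment,
        Axis::Emotion,
        Axis::Register,
        Axis::Sense,
        Axis::Temporal,
    ];

    /// The lowercase label used as the heading for each axis.
    pub fn label(self) -> &'static str {
        match self {
            Axis::Attribution => "attribution",
            Axis::Modality => "modality",
            Axis::SpeechAct => "speech act",
            Axis::Negation => "negation",
            Axis::Evidence => "evidence",
            Axis::Causal => "causal",
            Axis::Sentiment => "sentiment",
            Axis::Emotion => "emotion",
            Axis::Register => "register",
            Axis::Sense => "sense",
            Axis::Temporal => "temporal",
        }
    }

    /// Look an axis up by its label; returns None for an unknown label.
    pub fn from_label(label: &str) -> Option<Axis> {
        Axis::ALL.iter().copied().find(|axis| axis.label() == label)
    }
}
```

Keeping the set closed means a `match` over `Axis` fails to compile when a variant is added, which is useful while the taxonomy is still provisional.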

Confidence

Every span carries a confidence, the runtime stating how sure it is of the read. The hover card shows it as a bar and a number. Because Rheme is deterministic, that number is stable. The same sentence resolved 12.6 ns later returns the same confidence, not a fresh guess.

Diagnostics

When a reading will not settle, Rheme marks a diagnostic with a wavy underline rather than a clean span. In “the committee told the engineers that they had failed,” the pronoun has two plausible antecedents, so the runtime holds both candidates instead of picking one.

A diagnostic is a held question, surfaced for you to resolve. The runtime states the candidates and its confidence rather than guessing on your behalf.

Installation

Rheme ships as a single crate for Rust, a WASM bundle for the browser, and a C header for everything else.

Each target wraps the same core, so a span resolved in the browser matches the same span resolved on a server, byte for byte.

Rust

Add the crate and you have the whole runtime (the parser, the schema, and the diagnostics), with no model files to download and nothing to fetch at startup.

# add it to the project, or edit Cargo.toml by hand
cargo add rheme

The crate is free of native dependencies, so the same cargo build works on Linux, macOS, and Windows with no extra tooling.

Browser

The browser target is the same core compiled to WASM, shipped as an ES module with the .wasm binary inlined.

npm install @rheme/wasm

// resolve a string the moment the module loads
import { resolve } from "@rheme/wasm";
const doc = await resolve("According to Reyes, the result holds.");

C and native

For everything else there is a C header and a static or shared library, the C-ABI build of the identical core.

/* link against librheme, include the one header */
#include "rheme.h"

rheme_doc *doc = rheme_resolve("The result holds.");

Verify

Resolve a known sentence and check that the first span comes back as attribution. Because the runtime is deterministic, the source and confidence are fixed values you can assert against.

A span resolved here matches the same span resolved in the browser or behind the C interface, byte for byte. If your verify step passes on one target it passes on all three.
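
The verify step can be written as a plain check over the first span. This is a minimal sketch under stated assumptions: `Span`, `Axis`, and `verify_first_span` are local stand-ins so the block compiles on its own, shaped after the `axis`, `source`, and `confidence` fields used in the Quickstart. The expected source and confidence you pass in are whatever your pinned sentence returns; the values in these pages are examples, not guarantees.

```rust
/// Local stand-ins for the runtime's types, reduced to what the check needs.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Axis {
    Attribution,
    Modality,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Span {
    pub axis: Axis,
    pub source: String,
    pub confidence: f64,
}

/// Check that the first span is an attribution with the expected source and
/// confidence. Confidence is compared within a small tolerance so the check
/// does not depend on how the float literal was written.
pub fn verify_first_span(
    spans: &[Span],
    source: &str,
    confidence: f64,
) -> Result<(), String> {
    let first = spans
        .first()
        .ok_or_else(|| "no spans returned".to_string())?;
    if first.axis != Axis::Attribution {
        return Err(format!("expected Attribution, got {:?}", first.axis));
    }
    if first.source != source {
        return Err(format!("expected source {:?}, got {:?}", source, first.source));
    }
    if (first.confidence - confidence).abs() > 1e-9 {
        return Err(format!(
            "expected confidence {}, got {}",
            confidence, first.confidence
        ));
    }
    Ok(())
}
```

In a real project the slice would come from the runtime's span list rather than being built by hand, and the same check would run unchanged against each target.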

Quickstart

Hand Rheme a string and read back the spans. There is no compile step: annotations update live as the text changes.

Resolve a string

One call hands back a document, a list of spans with the diagnostics attached. There is no session to open and no model to warm.

let doc = rheme::resolve("According to Dr. Reyes, the result may shift as the corpus grows.");
doc.spans().len()   // => 3

Walk the spans

Each span names its axis, its source, and how sure the runtime is. The claim that a result may shift is held apart from who said it and from when it holds.

for span in doc.spans() {
  println!("{:?}  {}  {:.2}", span.axis, span.source, span.confidence);
}
// Attribution  Dr. Reyes    0.98
// Modality     Dr. Reyes    0.90
// Temporal     the runtime  0.90

Catch diagnostics

When a reading will not settle, it surfaces as a diagnostic rather than a clean span. In “the committee told the engineers they had failed,” the pronoun has two antecedents, so the runtime holds both.

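// resolve the ambiguous sentence, then list what the runtime is holding
let doc = rheme::resolve("The committee told the engineers they had failed.");
for d in doc.diagnostics() {
  println!("{:?}: {}", d.kind, d.candidates.join(", "));
}
// Coreference: the committee, the engineers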

Read it live

Nothing here is a batch step. Feed the runtime text as it changes and the spans recolor the moment the meaning shifts; that loop is exactly what these pages and the Inspector run.

Flip annotations off in the top bar to read this page as plain prose, then back on to watch every span return to its axis. Same text, same spans, every time.

Theme and rheme

Every sentence splits into a theme, the part already in play, and a rheme, the new part that is the point.

The runtime takes its name from the second one. The rheme is the new information, the claim being asserted, and it is what most software throws away first.

The given and the new

Linguists call the part of a sentence already in play the theme, and the part that advances the point the rheme. In “the result, however, may shift,” “the result” is given; that it may shift is the news.

Why the rheme

The runtime takes its name from the second half because the rheme is what most software discards first. Flatten a sentence to a single vector and the new claim averages into the topic around it. Rheme keeps the claim separate and unflattened, so the point survives the parse.

Marking the split

Every span sits on the rheme side of some clause, the move the sentence is making. The axis says which move it is (a hedge, a citation, a reversal), and the source says whose move it is.

Theme and rheme are a lens, not a tag. The runtime does not label a span “rheme”; it reads the new information and places it on whichever axis of meaning it carries.

The span

The span is the one move Rheme makes, a stretch of text marked along a single axis.

It is the underline in the wordmark made literal: prose carrying a colored mark, attributed and classified, drawn in as the meaning resolves.

The fields

A span is a stretch of text with a few fixed fields. It is the whole unit Rheme returns, nothing larger and nothing smaller.

range, the start and end offsets in the source text.
axis, which of the ten axes the span sits on.
source, who the span is attributed to, carried forward from earlier in the text.
confidence, how sure the runtime is of the read, a stable number between 0 and 1.

Reading a span

Because the words stay words, a span points straight back at the text it marks. You can slice the original string with its range and get the exact phrase the axis describes.

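// resolve a sentence, then slice the original text with the first span's range
let text = "According to Dr. Reyes, the result holds.";
let doc = rheme::resolve(text);
let s = &doc.spans()[0];
&text[s.range]      // => "According to Dr. Reyes"
s.axis              // => Attribution
s.confidence        // => 0.98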

Spans overlap

One phrase can carry more than one axis at once. A clause may be attributed and hedged in the same breath, and the runtime returns both spans rather than forcing a single label.
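
Because overlapping spans come back side by side, a client that wants to know which ones share text has to compare ranges itself. Below is a minimal sketch, assuming ranges are non-empty, half-open byte ranges; the `Span` struct here is a local stand-in with only the two fields the check needs, and `overlapping_pairs` is a hypothetical helper, not part of the runtime.

```rust
use std::ops::Range;

/// Local stand-in for a span: a byte range plus the axis it sits on.
#[derive(Debug, Clone, PartialEq)]
pub struct Span {
    pub range: Range<usize>,
    pub axis: &'static str,
}

/// Two non-empty half-open ranges overlap when each starts before the other ends.
pub fn overlaps(a: &Range<usize>, b: &Range<usize>) -> bool {
    a.start < b.end && b.start < a.end
}

/// Return the index pairs (i, j), with i < j, of spans whose ranges overlap.
pub fn overlapping_pairs(spans: &[Span]) -> Vec<(usize, usize)> {
    let mut pairs = Vec::new();
    for i in 0..spans.len() {
        for j in (i + 1)..spans.len() {
            if overlaps(&spans[i].range, &spans[j].range) {
                pairs.push((i, j));
            }
        }
    }
    pairs
}
```

The pairwise scan is quadratic, which is fine for a sentence or a paragraph; a sorted sweep would be the usual choice for whole documents.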

The span is the underline in the wordmark made literal, a stretch of prose drawn in as its meaning resolves. Everything else Rheme reports is a list of these.

Attribution

Attribution is the thread Rheme refuses to drop. It traces every claim back to whoever said it.

Turn prose into a single vector and the sources blur into one average. Rheme keeps each one separate, so a quote stays a quote and a claim stays its author's.

The chain

Once a source is named it carries forward. According to Dr. Reyes, the result holds, and it held in every test, and the next clause is still hers until the text hands off to someone else.
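
The carry-forward rule can be sketched as a single pass that remembers the last named source. This illustrates the behavior described above and is not the runtime's actual procedure: `Clause` and `carry_forward` are hypothetical, and the clause boundaries and named sources are assumed to be known already.

```rust
/// One clause of text, with a source only when the clause itself names one.
#[derive(Debug, Clone, PartialEq)]
pub struct Clause<'a> {
    pub text: &'a str,
    pub named_source: Option<&'a str>,
}

/// Assign each clause the most recently named source, so an attribution
/// persists until the text hands off to someone else. Clauses that come
/// before any source is named stay unattributed (None).
pub fn carry_forward<'a>(clauses: &[Clause<'a>]) -> Vec<Option<&'a str>> {
    let mut current = None;
    clauses
        .iter()
        .map(|clause| {
            if clause.named_source.is_some() {
                current = clause.named_source;
            }
            current
        })
        .collect()
}
```

Given three clauses where only the first and third name a source, the middle clause inherits the first clause's source and the third switches to its own.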

What survives the parse

Turn prose into one vector and every source blurs into a single average. Rheme keeps each one distinct and never merged, so a quote stays a quote and a claim stays its author's. This is the one thread the runtime refuses to drop.

let doc = rheme::resolve("Reyes says the result holds. Critics call it premature.");
doc.spans()[0].source   // => "Reyes"
doc.spans()[1].source   // => "Critics"

Nested attribution

Sources stack. When the report claims Reyes said the result holds, the claim belongs to the report and the result to Reyes, two layers the runtime keeps distinct rather than collapsing into one voice.
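
One way to picture the stacking is a recursive type in which each reporting layer wraps the statement it reports. This is only a sketch of the idea; `Statement` and `attribution_chain` are hypothetical names, and the source does not say how the runtime represents nesting internally.

```rust
/// A statement is either a bare claim or a claim reported by a source.
#[derive(Debug, Clone, PartialEq)]
pub enum Statement {
    Claim(String),
    Reported { source: String, inner: Box<Statement> },
}

/// List the sources from the outermost layer inward.
pub fn attribution_chain(statement: &Statement) -> Vec<&str> {
    let mut chain = Vec::new();
    let mut current = statement;
    while let Statement::Reported { source, inner } = current {
        chain.push(source.as_str());
        // Step through the Box to the statement it wraps.
        current = &**inner;
    }
    chain
}
```

For the example in the paragraph above, the chain would list the report first and Reyes second, with the claim that the result holds at the center.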

Attribution is why Rheme suits review and compliance work. Every assertion traces back to a source, so nothing arrives unsigned.

Diagnostics

A diagnostic is a linter note for meaning. It flags where a naive reading would mislead, marked with a wavy underline rather than a clean span.

Every diagnostic answers four questions: what signal was found, where, what it could distort, and how severe. They are evidence-grounded, never a verdict. The rule is strict: a diagnostic is only worth raising if it changes what a careful reader would do.

Severity

Four levels, ordered by how badly acting on the raw text would mislead, not by how “bad” the source is.

error — a category mistake if stored naively: a denial read as a fact, a hedge recorded as certainty.
warning — a real hazard that wants attention: a causal overclaim, an unresolved contradiction, an unsourced assertion.
info / hint — caution, not correction: loaded wording, a missing time, a single-source dependency.
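
The levels above order naturally, so a client can filter on a threshold. A minimal sketch, assuming hint ranks below info (the list groups the two without ordering them); `Severity`, `Diagnostic`, and `at_least` are local illustrations, not the runtime's types.

```rust
/// Severity levels, declared from least to most severe so the derived
/// ordering can be used directly. Placing Hint below Info is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Hint,
    Info,
    Warning,
    Error,
}

/// A reduced diagnostic: just a code and its severity.
#[derive(Debug, Clone, PartialEq)]
pub struct Diagnostic {
    pub code: &'static str,
    pub severity: Severity,
}

/// Keep only the diagnostics at or above the given threshold.
pub fn at_least(diagnostics: &[Diagnostic], threshold: Severity) -> Vec<&Diagnostic> {
    diagnostics
        .iter()
        .filter(|d| d.severity >= threshold)
        .collect()
}
```

Filtering at `Severity::Warning` keeps errors and warnings and drops the cautionary levels, which suits a review gate; an editor view would more likely show everything.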

The never-flatten guards

Three errors exist to stop the runtime from collapsing distinct meanings into a plain fact. A reported denial like “Acme denied selling drones” raises denial_misparsed: the embedded event stays disputed, never stored as false. A line like “no evidence shows the shipment occurred” raises absence_of_evidence_misparsed: absence is not negation. And a claim recorded more firmly than its hedged source raises modality_erased.
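
The modality_erased guard reduces to a comparison between how firmly the source asserted something and how firmly it was recorded. A minimal sketch, assuming a three-step scale: `Possible` appears in the API examples, while `Probable` and `Certain` are placeholder names for firmer levels, and the function is an illustration rather than the runtime's check.

```rust
/// Firmness of an assertion, declared from weakest to strongest.
/// Only `Possible` is taken from the docs; the other names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Modality {
    Possible,
    Probable,
    Certain,
}

/// True when a claim was recorded more firmly than its source stated it,
/// which is the condition the modality_erased guard describes.
pub fn modality_erased(source: Modality, recorded: Modality) -> bool {
    recorded > source
}
```

Recording a hedge as a hedge, or more cautiously than the source, passes; only an upgrade in firmness trips the guard.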

Warnings

The warnings flag hazards a reader should resolve. causal_overclaim fires when causal language outruns its evidence, but stays silent on mere sequence (after, not because). missing_attribution fires on a factual assertion made in the document's own voice with no source. A pronoun with zero or several antecedents, as in “they told them they had failed,” raises ambiguous_referent.

Reading a diagnostic

Each diagnostic carries a message, the spans it covers, the objects it may distort, and suggested actions. It states the hazard and the fix and leaves the call to you.

The runtime never says “true,” “false,” or “biased.” A diagnostic shows the structure of the risk, with evidence, and leaves the judgment to the reader.

Structure, not verdict

Rheme surfaces the move a sentence makes and stops there. It will show you that a sentence hedges, denies, or attributes a claim — but it never says true, false, or biased.

This is the product's one law, and the reason it stays trustworthy enough to keep open all day. The structure of what is said is made legible; the judgment is left to the reader.

The move, not the verdict

People rarely state plain facts. They hedge, attribute a claim to a source, deny, qualify, and report. Rheme marks each of those moves, in whatever register it happens in, and leaves them as they are. It does not rate them.

Why restraint is the point

The moment a tool declares a verdict — true, false, biased — it becomes partisan and brittle, and you stop leaving it open. By reporting only the structure, with evidence and a source, Rheme stays a neutral instrument. It never renders the call for you.

Even the warnings hold back

Diagnostics follow the same rule. A diagnostic states a hazard and a fix — a denial that must not be stored as false, a causal claim that outruns its evidence — but it never says the source is wrong. It changes what a careful reader would do, and nothing more.

Show the structure; never the verdict. It is the line that keeps Rheme an instrument rather than an opinion.

Determinism

Rheme returns the same output for the same input, every run, with no sampling.

Determinism is what makes the annotations something you can build on. A test that passes today passes tomorrow, and a span you cite will read the same when someone else opens it.

The guarantee

The same input returns the same output every run, the same spans, the same sources, the same confidence to the digit. There is no seed to set and no warm-up that changes the result.

No sampling

Rheme reads text through a fixed procedure, not a sampled model, so there is no temperature and no drift between runs. Confidence is a measured property of the read, not a fresh guess each time you ask.

What you can build on it

Determinism is what turns an annotation into something you can cite. A test that passes today passes tomorrow, a span you reference reads the same when a colleague opens it, and a diff over the output reflects a change in the text, not in the runtime.

Pin a sentence in a test and assert on its spans. Because the read is stable, the assertion is too, even across machines and across the WASM and native targets.
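
A pinned test can also confirm stability directly by reading the same input several times and comparing the results. The harness below is a generic sketch: `assert_stable` is a hypothetical helper, and its `read` parameter stands in for whatever call produces the output you want to pin.

```rust
use std::fmt::Debug;

/// Run `read` on the same input `runs` times and panic if any run differs
/// from the first. Returns the first output so the caller can assert on it.
pub fn assert_stable<T, F>(input: &str, runs: usize, read: F) -> T
where
    T: PartialEq + Debug,
    F: Fn(&str) -> T,
{
    let first = read(input);
    for run in 1..runs {
        let again = read(input);
        assert_eq!(first, again, "run {} differed from run 0", run);
    }
    first
}
```

Any pure function of the text works as `read`. With the runtime in place, the closure would resolve the sentence and return its spans, provided the span type implements `PartialEq` and `Debug` (an assumption).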

Performance

Rheme clocks about 12.6 ns per token and holds a flat line to roughly 35,000 tokens.

The numbers are measured on everyday inputs, not a staged best case. Throughput sits near 79.1 M tokens per second before latency begins to climb.

The numbers

The runtime clocks about 12.6 ns per token and sustains roughly 79.1 M tokens per second on a single core. Resolving a full page of prose finishes in well under a millisecond.
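
The two headline figures are the same measurement seen from both sides, and the page-level claim follows from simple arithmetic. A small sketch of the conversion, assuming the flat per-token cost applies to the whole input; the helper names are illustrative only.

```rust
/// Per-token cost quoted above, in nanoseconds.
pub const NS_PER_TOKEN: f64 = 12.6;

/// Convert a throughput in tokens per second to nanoseconds per token.
pub fn ns_per_token(tokens_per_second: f64) -> f64 {
    1.0e9 / tokens_per_second
}

/// Estimated time to resolve `tokens` tokens, in microseconds, assuming
/// the flat per-token cost holds across the whole input.
pub fn estimate_micros(tokens: u64) -> f64 {
    tokens as f64 * NS_PER_TOKEN / 1000.0
}
```

A throughput of 79.1 M tokens per second works out to roughly 12.64 ns per token, in line with the rounded 12.6 figure. At 12.6 ns per token, a 600-token page (an assumed page length) comes to about 7.56 µs, and 35,000 tokens to about 441 µs, both under a millisecond.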

Scaling

Per-token latency holds a flat line to about 35,000 tokens in a single pass before it begins to climb. Up to that point total time grows linearly: doubling the input doubles the time and nothing worse, so cost stays easy to predict.

Methodology

The figures are measured on everyday inputs, mixed prose at a typical length, not a staged best case. Because the runtime is deterministic, a benchmark rerun reports the same work, so the numbers are stable enough to regress against.

The status bar at the bottom of these pages shows the same per-token figure the benchmark reports. It is the runtime's real cost, not a rounded headline.

WASM and C-ABI

One core, built in Rust, shipped as WASM and as a native C-ABI library.

The browser bundle and the native library share one implementation, so a result resolved in a web app matches the same result resolved behind a C interface.

One core

The runtime is written once, in Rust, and shipped to every target from that single implementation. There is no second port to drift out of sync.

WASM

The browser build is the core compiled to WebAssembly, exposed as an ES module. It runs in the page with no server round-trip, which is what lets these docs annotate live.

import { resolve } from "@rheme/wasm";
const doc = await resolve(text);   // same spans as the native build

C-ABI

The native build exposes a flat C ABI, so any language that can call C can call Rheme, whether Python, Go, Swift, or C itself.

#include "rheme.h"
rheme_doc *doc = rheme_resolve("The result holds.");
size_t n = rheme_span_count(doc);

Parity

The targets agree by construction. A span resolved in a web app matches the same span behind the C interface, byte for byte, because both run the identical core.

Pick the target by where the text lives, not by what you can afford to lose. The schema, the axes, and the diagnostics are the same on all three.

The Inspector

The Inspector is the flagship surface, an editor that feels like an IDE while the document is prose.

A metrics rail, an axis legend, a problems panel, and a command palette sit around the page, and the runtime underlines spans as you type.

The surface

The Inspector is an editor that feels like an IDE while the document is prose. You write in the center and the runtime underlines spans as you type, recoloring them the moment a meaning shifts.

The panels

Four panels sit around the page, each reading the same runtime output.

Metrics rail: span, axis, and diagnostic counts, with the live per-token cost.
Axis legend: the ten axes and their colors, sampled from the same schema these docs use.
Problems panel: every diagnostic in the document, each a held question you can jump to.
Command palette: keyboard-first access to every action, filtering as you type.

The annotation layer

The colored underlines are a layer you can lift. Turn them off and the prose reads plain; turn them back on and every span returns to its axis. These pages run the same toggle in the top bar.

The Inspector is the flagship surface, but it is only a client of the runtime. Anything it shows, the API returns, so you can build the same view yourself.

API

The API is JSON-RPC in the spirit of LSP. You open a document and query it, the way a language server answers a code editor.

The unit of work is the live document, not a one-shot call. Open it once, edit it as it changes, and ask what sits at any position; the runtime re-reads incrementally and keeps every object pinned to its source span.

Open a document

One message loads the text and returns a version. Edits arrive the same way and the document re-extracts in place.

{ "method": "textDocument/didOpen",
  "params": { "documentId": "d1",
    "text": "The review board suggested the café may have violated health codes." } }

Query a position

Ask what object covers a byte offset or a line-and-character position. The runtime answers with the typed object and its epistemic axes already separated.

{ "method": "textDocument/objectAtPosition",
  "params": { "documentId": "d1", "byte": 27 } }

// result
{ "id": "claim_a3f2…", "type": "Claim",
  "summary": "ReportedClaim (Possible/Reported)",
  "modality": "Possible", "polarity": "Positive",
  "verificationStatus": "Reported" }
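
Since the query takes a byte offset, text with non-ASCII characters needs care: in the example document, “café” makes character and byte positions diverge after the é. The sketch below converts a character index to a byte offset and builds the request in the shape shown above. Both helpers are hypothetical client-side code; the request builder assumes the document id needs no JSON escaping, and it adds no envelope fields beyond those the example shows.

```rust
/// Byte offset of the character at `char_index`, or None if out of range.
pub fn char_index_to_byte(text: &str, char_index: usize) -> Option<usize> {
    text.char_indices().nth(char_index).map(|(byte, _)| byte)
}

/// Build the position query in the shape shown above. Assumes `document_id`
/// contains no characters that need JSON escaping.
pub fn object_at_position_request(document_id: &str, byte: usize) -> String {
    format!(
        r#"{{"method":"textDocument/objectAtPosition","params":{{"documentId":"{}","byte":{}}}}}"#,
        document_id, byte
    )
}
```

In the sample sentence, byte 27 is also character 27 (the start of “the café”), but the word “may” starts at character 36 and byte 37, because é occupies two bytes in UTF-8.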

The methods

The surface mirrors a language server. Objects are read by position or by type, and the workspace methods answer across edits and documents.

textDocument/didOpen · didChange · didClose — lifecycle; each edit re-extracts.
objectAtPosition · hover · references — what is here, and where else it appears.
claims · entities · events · contradictions · ambiguities · diagnostics — typed object lists.
provenance — the extractor, version, and validation behind any object.
workspace/semanticDiff · query — what changed between versions, and search across documents.

Edits and diff

Because object ids are derived from content, an edit to one clause leaves the ids of every other clause untouched. A semantic diff is an id-set difference: a reworded clause shows up as removed plus added, while an object that merely gained a span shows up as changed.
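
The id-set difference can be sketched in a few lines. Everything here is illustrative: the source does not say how ids are derived, so 64-bit FNV-1a stands in as an arbitrary content hash, and `Objects`, `SemanticDiff`, and `semantic_diff` are hypothetical names modeling an object as an id plus the byte ranges where it appears.

```rust
use std::collections::BTreeMap;

/// 64-bit FNV-1a, used only as a stand-in for whatever content hash the
/// runtime actually uses to derive ids.
pub fn content_id(content: &str) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for byte in content.bytes() {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Objects keyed by id, each with the (start, end) byte ranges it covers.
pub type Objects = BTreeMap<u64, Vec<(usize, usize)>>;

#[derive(Debug, Default, PartialEq)]
pub struct SemanticDiff {
    pub added: Vec<u64>,
    pub removed: Vec<u64>,
    pub changed: Vec<u64>,
}

/// Compare two versions: ids only in `old` are removed, ids only in `new`
/// are added, and ids in both whose span lists differ are changed.
pub fn semantic_diff(old: &Objects, new: &Objects) -> SemanticDiff {
    let mut diff = SemanticDiff::default();
    for (id, old_spans) in old {
        match new.get(id) {
            None => diff.removed.push(*id),
            Some(new_spans) if new_spans != old_spans => diff.changed.push(*id),
            Some(_) => {}
        }
    }
    for id in new.keys() {
        if !old.contains_key(id) {
            diff.added.push(*id);
        }
    }
    diff
}
```

Rewording a clause changes its content and therefore its id, so it appears once under removed and once under added; a clause that gains a second occurrence keeps its id and lands under changed.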

Raw object lists serialize the canonical snake_case schema; the convenience views (objectAtPosition, hover, provenance) use camelCase. The split is documented so a client is never surprised.

CLI

The command line wraps the same runtime in four verbs: parse, diagnose, objects, and diff.

It reads from a file or a pipe and prints typed objects as plain, stable text. There is no daemon and no config; it reads and exits.

parse

Read a document into its object graph, the claims, events, sources, and their axes.

# a file, or anything on stdin
rheme parse report.txt
echo "Acme denied selling drones." | rheme parse

diagnose

Print the diagnostics alone, one per line, each with its severity. The reported denial above raises a denial_misparsed error rather than recording the sale as a fact.

rheme diagnose report.txt
# error    denial_misparsed     keep the embedded event disputed
# warning  missing_attribution  attach a source to the assertion

objects and diff

Filter the graph by type with objects, or compare two versions with diff. Because ids are content-derived, the diff reports only what actually changed.

rheme objects --type claim report.txt
rheme diff report.v1.txt report.v2.txt

In a pipeline

The output is deterministic, so it diffs cleanly and fits straight into a build or a review step.

Drop rheme diagnose into a pre-commit hook to fail on a new denial misparse or causal overclaim, or to assert that an attribution still points where it should.
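
The gate itself can be a small filter over the diagnose output. This sketch assumes each output line is whitespace-separated as severity, code, then message, matching the sample lines shown earlier; the real output format may differ, and `BLOCKING` and `blocking_codes` are hypothetical names.

```rust
/// Codes that should block a commit, per the paragraph above.
pub const BLOCKING: [&str; 2] = ["denial_misparsed", "causal_overclaim"];

/// Scan diagnose output and return the blocking codes found, in order.
/// Assumes each line reads: severity, code, message (whitespace-separated).
pub fn blocking_codes(output: &str) -> Vec<&str> {
    output
        .lines()
        // The code is taken to be the second whitespace-separated field.
        .filter_map(|line| line.split_whitespace().nth(1))
        .filter(|code| BLOCKING.contains(code))
        .collect()
}
```

A hook would capture the command's standard output, pass it to `blocking_codes`, and exit non-zero when the returned list is not empty.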