Semantic interlingua experiment

Language Converter

Use this workbench to test whether a mutable source expression can preserve its task-relevant concept, relation, and validation evidence after conversion. IOTA output is an approximate semantic neighbor, not an exact translation.

Mutable languages

Surface forms may drift

Translations, paraphrases, scripts, idioms, register, and locale can move the surface expression without preserving every implication. The converter treats that mutability as evidence to inspect, not noise to hide.

Concept layer

Meaning is registry-bound

Successful conversion converges on reviewed concept evidence: opaque concept IDs, canonical vector hashes, labels, examples, negative examples, provenance, confidence, and drift warnings.

ISO/IEC 10646

Public text substrate

Unicode and ISO/IEC 10646 provide assigned characters, normalization obligations, grapheme behavior, and public metadata. IOTA-1 uses those facts as inspectable expression evidence, not as a universal ontology.
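
As a rough illustration of "inspectable expression evidence", the sketch below collects public Unicode facts about a source string using only Python's standard `unicodedata` module. The function name `expression_evidence`, the returned field names, and the choice of NFC as the normalization form are assumptions made for this example; the page does not state which form IOTA-1 applies.

```python
import unicodedata

def expression_evidence(text: str) -> dict:
    """Collect public Unicode facts about a source expression (illustrative only)."""
    # Assumption: NFC is used here; the actual profile may require another form.
    nfc = unicodedata.normalize("NFC", text)
    return {
        "input": text,
        "nfc": nfc,
        "changed_by_nfc": nfc != text,
        "code_points": [f"U+{ord(ch):04X}" for ch in nfc],
        "names": [unicodedata.name(ch, "<unnamed>") for ch in nfc],
        # General category "Cn" marks code points that are not assigned characters.
        "unassigned": [f"U+{ord(ch):04X}" for ch in nfc
                       if unicodedata.category(ch) == "Cn"],
    }

# "e" followed by a combining acute accent composes to a single code point under NFC.
evidence = expression_evidence("cafe\u0301")
```

The result records what the text is made of (code points, names, whether normalization changed it) without claiming anything about what the expression means.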

Centroid evidence

Translation averaging stays conditional

Translation centroids can help estimate a language-neutral concept prototype only when translations are sense-aligned, model-tagged, normalized, and tested against near misses and round-trip drift.
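
A minimal sketch of the conditional centroid idea, assuming each translation has already been sense-aligned and embedded by the same model. The helper names, the cosine-based near-miss check, and the `margin` value are hypothetical choices for illustration, not the converter's actual policy.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def centroid(vectors):
    """Mean of unit-normalized vectors, re-normalized to unit length."""
    unit = [normalize(v) for v in vectors]
    dim = len(unit[0])
    mean = [sum(v[i] for v in unit) / len(unit) for i in range(dim)]
    return normalize(mean)

def centroid_is_usable(translations, near_misses, margin=0.1):
    """Accept the centroid only if every translation sits closer to it
    than the closest near miss does, by at least `margin` (assumed value)."""
    c = centroid(translations)
    worst_member = min(cosine(c, v) for v in translations)
    best_near_miss = max((cosine(c, v) for v in near_misses), default=-1.0)
    return worst_member - best_near_miss >= margin

# Toy 2-D vectors standing in for real embeddings.
translations = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0]]
usable = centroid_is_usable(translations, near_misses=[[0.0, 1.0]])
too_close = centroid_is_usable(translations, near_misses=[[1.0, 0.1]])
```

In this toy data the first check passes because the near miss points in a different direction, while the second fails because the near miss is indistinguishable from a genuine translation. Round-trip drift would need a separate test.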

Multilingual concept canonicalization

Evidence-first conversion

Choose a language tag, convert a source expression, and inspect the interlingua packet: normalized input, phrase segments, selected concept, ranked public-symbol candidates, vector evidence summaries, drift, and public Unicode safety.
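
The packet can be pictured as a small record. The sketch below is a hypothetical shape only: the class name, field names, and types are assumptions for illustration and are not the service's actual JSON schema. It renders the same summary line the Output panel shows, with `--` for values that are not yet available.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class InterlinguaPacket:
    """Hypothetical packet shape; field names are illustrative assumptions."""
    normalized_input: str
    phrase_segments: list[str] = field(default_factory=list)
    selected_concept: Optional[str] = None   # opaque concept ID; None means abstain
    confidence: Optional[float] = None       # review signal, not a certificate
    unknown_rate: Optional[float] = None
    unicode_safety: Optional[str] = None
    warnings: list[str] = field(default_factory=list)

def summary_line(p: InterlinguaPacket) -> str:
    """Render the one-line summary, using '--' for missing values."""
    def show(value, fmt="{}"):
        return "--" if value is None else fmt.format(value)
    return (f"Concept: {show(p.selected_concept)} "
            f"Confidence: {show(p.confidence, '{:.2f}')} "
            f"Unknown rate: {show(p.unknown_rate, '{:.0%}')} "
            f"Unicode safety: {show(p.unicode_safety)}")

empty = InterlinguaPacket(normalized_input="")
filled = InterlinguaPacket("water", ["water"], "C-0001", 0.82, 0.25, "ok")
```

Keeping missing values as `None` rather than defaults such as `0.0` makes an abstention visibly different from a low score.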

For experimentation, read confidence as a review signal. A high score suggests preserved retrieval intent inside this profile; it does not certify exact semantic equivalence across all languages or models.

DatabaseOnly

Seed and SQL evidence

Uses packaged public seed records and stored corpus vectors when available. It must work without LM Studio and should report SQL offline states instead of failing the page.

Hybrid fallback

Stored evidence first

Combines deterministic registry evidence with hosted corpus search when configured, while keeping unresolved words visible instead of inventing private meanings.
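
The stored-evidence-first order can be sketched as a simple lookup chain. Everything named here is hypothetical: `registry` is a plain dictionary standing in for stored evidence, `hosted_search` is an optional callable standing in for hosted corpus search, and the unknown rate is assumed to be the fraction of words left unresolved.

```python
def resolve_words(words, registry, hosted_search=None):
    """Resolve words against stored evidence first, then an optional hosted search.
    Words that neither source can resolve are reported, never guessed."""
    resolved, unresolved = {}, []
    for word in words:
        concept = registry.get(word)
        if concept is None and hosted_search is not None:
            concept = hosted_search(word)
        if concept is None:
            unresolved.append(word)   # keep the gap visible
        else:
            resolved[word] = concept
    # Assumption: unknown rate = unresolved words / all words.
    unknown_rate = len(unresolved) / len(words) if words else 0.0
    return {"resolved": resolved, "unresolved": unresolved,
            "unknown_rate": unknown_rate}

registry = {"water": "C-0001"}
offline = resolve_words(["water", "zzyx"], registry)
online = resolve_words(["water", "zzyx"], registry,
                       hosted_search=lambda w: "C-0002" if w == "zzyx" else None)
```

When no hosted search is configured, the same call still returns a complete report; only the unresolved list and unknown rate change.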

LLM assisted

Optional local AI

LM Studio is an optional local helper for embedding or review workflows. Public conversion endpoints remain read-only and do not mutate the stored corpus.

Output

Public-symbol rendering

Run a conversion.
Concept: -- Confidence: -- Unknown rate: -- Unicode safety: --

Interlingua packet

Normalization: --
Source: --
Registry: --
Neutralizer: --
No phrase segments yet.
No glyph candidates yet.
No warnings yet.

Vectors

Raw, neutralized, canonical

Candidates

Concept ranking

No candidates yet.

Language-neutral retrieval

Residue and corpus readiness

No language-neutral retrieval report yet.

Drift

Round-trip visibility

No drift report yet.

Developer JSON

No API response yet.

Experiment doctrine

How to read a conversion result

Protocol5 uses semantic isomorphism as an engineering target: preserve the useful relation between expression, concept, candidate glyph, validator behavior, and provenance under a declared Unicode, locale, registry, vector, and model policy.

Accept

Approximate preservation

Use top-K concept agreement, vector evidence, source provenance, unknown rate, and round-trip drift to decide whether a result is good enough for review.
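
One way to combine these signals is a gate that either forwards a result to review or abstains with stated reasons. This is a sketch under assumptions: the function name, the numeric thresholds, and the reduction of provenance to a single boolean are all hypothetical and would need tuning against the declared profile.

```python
def review_decision(top_k_agreement, unknown_rate, round_trip_drift,
                    has_provenance, *, min_agreement=0.6,
                    max_unknown=0.2, max_drift=0.3):
    """Return ("review", []) when all signals pass, else ("abstain", reasons).
    All thresholds are illustrative placeholders."""
    reasons = []
    if top_k_agreement < min_agreement:
        reasons.append("low top-K concept agreement")
    if unknown_rate > max_unknown:
        reasons.append("high unknown rate")
    if round_trip_drift > max_drift:
        reasons.append("high round-trip drift")
    if not has_provenance:
        reasons.append("missing provenance")
    return ("review" if not reasons else "abstain", reasons)

good = review_decision(0.8, 0.1, 0.05, True)
bad = review_decision(0.8, 0.5, 0.05, False)
```

Passing the gate means only "good enough for review"; it does not certify semantic equivalence, and the reasons list keeps each abstention explainable.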

Reject

False certainty

Do not treat one glyph, code point, token ID, embedding coordinate, centroid, or quantized code as meaning authority without the registry and evidence packet.

Measure

Universal representation pursuit

The goal is a better shared semantic representation over time: lower drift, clearer abstentions, stronger provenance, and stable concept selection across mutable language surfaces.