Quark-v0.1 · Now Available

Introducing Quark.

A family of sovereign language models designed for local efficiency, privacy, and deep conversational intelligence. Built from scratch by ThingsAI & OvercastLab.

quark-135m · local

$ quark run --model 135m
✓ Model loaded (135M params)
✓ Running on RTX 3070 · BF16
─────────────────────────
› Explain quantum entanglement
Quantum entanglement occurs when two particles become correlated regardless of distance. Measuring one instantly affects the other: no information travels, yet the correlation is preserved.

135M

Parameters in the primary chat model, optimized for RTX 3070

15B

Tokens used during pretraining on sovereign hardware

2048

Context window in tokens per sequence

100%

Local inference. No telemetry, no data retention.

Release Timeline

The Quark Family.

From lightweight CPU models to full-scale instruction-tuned flagships. Each model is designed for a specific compute tier.

Model        Parameters   Status        Notes
Quark-50M    50M          Live          Lightweight. Runs on CPU.
Quark-135M   135M         Live          Primary chat model. RTX 3070 optimized.
Quark-270M   270M         Coming Soon   Enhanced reasoning depth.
Quark-540M   540M         Coming Soon   Extended context. Multi-domain.
Quark-1.4B   1.4B         Coming Soon   Flagship. Full instruction suite.
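To make the compute tiers concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy at each size. It assumes 2 bytes per parameter (BF16) for every model, although the page only states BF16 for Quark-135M, and it takes the nominal parameter counts from the model names. The function name `weight_footprint_mb` is illustrative and not part of any Quark tooling. Activations, the KV cache, and runtime overhead are not included.

```python
# Rough weight-memory estimate for each family member.
# Assumption: every model is stored in BF16 (2 bytes per parameter);
# the source only confirms BF16 for Quark-135M.
BYTES_PER_BF16 = 2

# Nominal parameter counts, taken from the model names in the table.
FAMILY = {
    "Quark-50M": 50_000_000,
    "Quark-135M": 135_000_000,
    "Quark-270M": 270_000_000,
    "Quark-540M": 540_000_000,
    "Quark-1.4B": 1_400_000_000,
}


def weight_footprint_mb(n_params, bytes_per_param=BYTES_PER_BF16):
    """Return the size of the weights in megabytes (10^6 bytes)."""
    return n_params * bytes_per_param / 1e6


if __name__ == "__main__":
    for name, n_params in FAMILY.items():
        print(f"{name}: ~{weight_footprint_mb(n_params):.0f} MB of weights in BF16")
```

Under these assumptions the 135M model needs roughly 270 MB for its weights, which leaves most of an RTX 3070's 8 GB of VRAM free for activations and the attention cache.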

Architecture · Quark-135M

Built different.

Every architectural decision in Quark-135M is made for efficiency, stability, and local deployment.

Architecture      GQA       Grouped Query Attention
Activation        SwiGLU    Feed-forward gate
Normalization     RMSNorm   Pre-attention & pre-FFN
Positional Enc.   RoPE      Rotary Embeddings
Context Window    2048      Tokens per sequence
Training Data     15B       Tokens pretrained
Precision         BF16      Ampere optimized
Weight Tying      Yes       Embed ↔ LM Head
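For readers unfamiliar with the components listed above, the following is a minimal pure-Python sketch of three of them: RMSNorm, a SwiGLU feed-forward layer applied after normalization, and the query-to-KV head mapping used by grouped query attention. It is not Quark's implementation. The dimensions, head counts, toy weights, and the residual connection are assumptions chosen for illustration, and all function names are hypothetical.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a learned gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(matrix, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: down(silu(gate(x)) * up(x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


def pre_norm_ffn_block(x, norm_weight, w_gate, w_up, w_down):
    """Pre-FFN normalization: x + FFN(RMSNorm(x)).

    The residual connection is an assumption (standard in pre-norm
    transformers); the source only states where the norm is applied.
    """
    out = swiglu_ffn(rms_norm(x, norm_weight), w_gate, w_up, w_down)
    return [a + b for a, b in zip(x, out)]


def kv_head_for(query_head, n_query_heads, n_kv_heads):
    """GQA: consecutive query heads share a single key/value head."""
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size


# Toy sizes and weights, hypothetical and far smaller than any real model.
dim, hidden = 4, 8
x = [0.5, -1.0, 2.0, 0.0]
norm_weight = [1.0] * dim
w_gate = [[0.01 * (i + j) for j in range(dim)] for i in range(hidden)]
w_up = [[0.02 * (i - j) for j in range(dim)] for i in range(hidden)]
w_down = [[0.03 * (i + j) for j in range(hidden)] for i in range(dim)]

y = pre_norm_ffn_block(x, norm_weight, w_gate, w_up, w_down)

if __name__ == "__main__":
    print("block output:", y)
    # With 8 query heads and 2 KV heads (assumed counts), heads 0-3 share
    # KV head 0 and heads 4-7 share KV head 1.
    print("KV head per query head:", [kv_head_for(h, 8, 2) for h in range(8)])
```

Sharing key/value heads across groups of query heads reduces the size of the attention cache, which is one reason GQA suits models meant for local deployment.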

Things Chat · Beta

Talk to Quark.

Every conversation is processed locally on our hardware. No telemetry, no data retention. Your words stay yours.

Quark-135M

Local · RTX 3070 · BF16

What makes Quark different from other language models?

Quark is built entirely from scratch on sovereign hardware, with no cloud APIs and no external data pipelines. Every weight and every token is processed locally. The architecture prioritizes efficiency over scale: grouped query attention (GQA), SwiGLU activations, and RMSNorm for stability. You get a capable model that runs on your own machine.

Is my conversation stored anywhere?

No. Zero telemetry, zero retention. Your session exists only in memory during the conversation and is discarded when you close the tab. ThingsAI has no access to what you say.

Explore

Go deeper.

Read the research or try the model yourself.

Research · Architecture

How Quark Works

Deep dive into the pretraining pipeline, dataset mix, and the architectural decisions behind the Quark model family — trained entirely on sovereign hardware.

Beta · Open Access

Things Chat

Talk directly with Quark-135M. Every conversation is processed locally. No telemetry, no data retention. Your words stay yours. Help shape the next release.

Beta Program

Contribute.

Share your experience with Quark. Every report helps improve the next version.
