Quark-v0.1 · Now Available

Introducing
Quark.

A family of sovereign language models designed for local efficiency, privacy, and deep conversational intelligence. Built from scratch by ThingsAI & OvercastLab.

quark-135m · local
$ quark run --model 135m
✓ Model loaded (135M params)
✓ Running on RTX 3070 · BF16
─────────────────────────
Explain quantum entanglement
Quantum entanglement occurs when
two particles become correlated
regardless of distance. Measuring
one instantly affects the other
— no information travels, yet
the correlation is preserved.
135M · Parameters in the primary chat model, optimized for RTX 3070
15B · Tokens used during pretraining on sovereign hardware
2048 · Context window in tokens per sequence
100% · Local inference. No telemetry, no data retention.
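To put the headline numbers in proportion, here is a small back-of-the-envelope sketch. It uses only the figures quoted above (135M parameters, 15B pretraining tokens, a 2048-token context); the variable names are illustrative, and the sequence count assumes every training sequence is packed to the full context length, which the page does not state.

```python
# Figures quoted on this page.
PARAMS = 135_000_000              # 135M parameters
PRETRAIN_TOKENS = 15_000_000_000  # 15B pretraining tokens
CONTEXT = 2048                    # tokens per sequence

# Ratio of training tokens to model parameters.
tokens_per_param = PRETRAIN_TOKENS / PARAMS

# Number of full-length sequences, ASSUMING every sequence is packed to 2048 tokens.
full_sequences = PRETRAIN_TOKENS // CONTEXT

print(f"tokens per parameter: {tokens_per_param:.1f}")
print(f"full 2048-token sequences: {full_sequences:,}")
```

Under those assumptions the model saw roughly 111 tokens per parameter, or a little over 7.3 million full-length sequences.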
Release Timeline

The Quark Family.

From lightweight CPU models to full-scale instruction-tuned flagships. Each model is designed for a specific compute tier.

| Model | Parameters | Status | Notes |
| --- | --- | --- | --- |
| Quark-50M | 50M | Live | Lightweight. Runs on CPU. |
| Quark-135M | 135M | Live | Primary chat model. RTX 3070 optimized. |
| Quark-270M | 270M | Coming Soon | Enhanced reasoning depth. |
| Quark-540M | 540M | Coming Soon | Extended context. Multi-domain. |
| Quark-1.4B | 1.4B | Coming Soon | Flagship. Full instruction suite. |
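For a rough sense of what each compute tier means in practice, the sketch below estimates the size of the weights alone at 2 bytes per parameter. This is an assumption-laden estimate: BF16 is stated on this page only for Quark-135M and is assumed here for the whole family, and activations, the KV cache, and runtime overhead are not counted. The names `FAMILY` and `weight_megabytes` are illustrative.

```python
BYTES_PER_BF16 = 2  # a BF16 value occupies 16 bits

# Parameter counts from the family listing above.
FAMILY = {
    "Quark-50M": 50_000_000,
    "Quark-135M": 135_000_000,
    "Quark-270M": 270_000_000,
    "Quark-540M": 540_000_000,
    "Quark-1.4B": 1_400_000_000,
}

def weight_megabytes(params: int) -> float:
    """Weights-only footprint in decimal megabytes, ASSUMING BF16 storage."""
    return params * BYTES_PER_BF16 / 1e6

for name, params in FAMILY.items():
    print(f"{name}: ~{weight_megabytes(params):,.0f} MB of weights")
```

On these assumptions Quark-135M needs about 270 MB for its weights and Quark-1.4B about 2.8 GB.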

Architecture · Quark-135M

Built different.

Every architectural decision in Quark-135M is made for efficiency, stability, and local deployment.

Architecture · GQA · Grouped Query Attention
Activation · SwiGLU · Feed-forward gate
Normalization · RMSNorm · Pre-attention & pre-FFN
Positional Enc. · RoPE · Rotary Embeddings
Context Window · 2048 · Tokens per sequence
Training Data · 15B · Tokens pretrained
Precision · BF16 · Ampere optimized
Weight Tying · Yes · Embed ↔ LM Head
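For readers unfamiliar with the building blocks named above, here is a minimal pure-Python sketch of the textbook form of each one. It is not Quark's implementation: the epsilon, the RoPE base of 10000, the interleaved-pair rotation layout, and the head counts in the example are common defaults or illustrative values chosen here, not published Quark settings. Weight tying is not shown.

```python
import math

def rms_norm(x, gain, eps=1e-6):
    """RMSNorm: divide x by its root-mean-square, then apply a learned per-element gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [g * v / rms for v, g in zip(x, gain)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def matvec(matrix, x):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]

def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(v) for v in matvec(w_gate, x)]
    up = matvec(w_up, x)
    return matvec(w_down, [g * u for g, u in zip(gate, up)])

def rope(x, position, base=10000.0):
    """Rotary embedding: rotate each pair (x[i], x[i+1]) by position * base**(-i/d)."""
    d = len(x)
    out = list(x)
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out[i] = x[i] * c - x[i + 1] * s
        out[i + 1] = x[i] * s + x[i + 1] * c
    return out

def kv_head_for(query_head, n_query_heads, n_kv_heads):
    """GQA: consecutive query heads share a single key/value head."""
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size

# Illustrative usage with made-up sizes (8 query heads sharing 2 KV heads).
print(kv_head_for(5, n_query_heads=8, n_kv_heads=2))
print(rope([1.0, 0.0, 1.0, 0.0], position=3))
```

The grouped-query mapping is where the efficiency comes from: with fewer key/value heads than query heads, the KV cache shrinks by the same factor, which matters most on memory-limited local hardware.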

Things Chat · Beta

Talk to Quark.

Every conversation is processed locally on our hardware. No telemetry, no data retention. Your words stay yours.

Quark-135M
Local · RTX 3070 · BF16
U: What makes Quark different from other language models?
Q: Quark is built entirely from scratch on sovereign hardware: no cloud APIs, no external data pipelines. Every weight, every token, processed locally. The architecture prioritizes efficiency over scale: GQA attention, SwiGLU activations, RMSNorm stability. You get a capable model that runs on your own machine.
U: Is my conversation stored anywhere?
Q: No. Zero telemetry, zero retention. Your session exists only in memory during the conversation and is discarded when you close the tab. ThingsAI has no access to what you say.

Explore

Go deeper.

Read the research or try the model yourself.

Research · Architecture

How Quark Works

Deep dive into the pretraining pipeline, dataset mix, and the architectural decisions behind the Quark model family — trained entirely on sovereign hardware.

Read More →
Beta · Open Access

Things Chat

Talk directly with Quark-135M. Every conversation is processed locally on our hardware, with no telemetry and no data retention. Your words stay yours. Help shape the next release.

Open Chat →

Beta Program

Contribute.

Share your experience with Quark. Every report helps improve the next version.

© 2026 ThingsAI Laboratories & OvercastLab. All rights reserved. QUARK-V0.1 · BUILD_135M