lmxlab

Transformer language models on Apple Silicon, built with MLX.

A single LanguageModel class handles all architectures (GPT, LLaMA, DeepSeek, Gemma, Qwen, Mixtral, and more). Switching architectures is a config change — no subclassing needed.

from lmxlab.models.llama import llama_config
from lmxlab.models.base import LanguageModel

# a compact LLaMA-style config: 4 KV heads against 8 query heads enables grouped-query attention
model = LanguageModel(llama_config(d_model=512, n_heads=8, n_kv_heads=4, n_layers=6))
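
Swapping in another architecture is the same one-liner with a different config factory. A minimal sketch, assuming a gpt_config factory exists alongside llama_config and takes analogous keyword arguments:

from lmxlab.models.gpt import gpt_config  # assumed module path, mirroring lmxlab.models.llama
from lmxlab.models.base import LanguageModel

# same LanguageModel class; only the config changes
model = LanguageModel(gpt_config(d_model=512, n_heads=8, n_layers=6))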

Getting started

API Reference

  • Core — blocks, attention, FFN, normalization, position encodings, LoRA
  • Models — LanguageModel, config factories, generation
  • Training — trainer, optimizers, checkpoints, callbacks
  • Data — tokenizers, datasets, batching
  • Eval — perplexity, bits-per-byte
  • Inference — sampling, speculative decoding (see the sampling sketch below)
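
The pieces above compose into a plain decoding loop. A minimal sketch of temperature sampling in raw MLX; the model call signature, its (batch, seq, vocab) logits shape, and prompt_ids are assumptions for illustration, not confirmed lmxlab API:

import mlx.core as mx

def sample_next(logits, temperature=0.8):
    # temperature == 0 falls back to greedy decoding
    if temperature == 0:
        return mx.argmax(logits, axis=-1)
    # mx.random.categorical draws from softmax(logits / temperature)
    return mx.random.categorical(logits / temperature)

# assumed: model(ids) returns logits of shape (batch, seq, vocab),
# and prompt_ids is a 1-D mx.array of token ids
tokens = prompt_ids
for _ in range(32):
    logits = model(tokens[None])[:, -1, :]  # logits at the last position
    tokens = mx.concatenate([tokens, sample_next(logits)])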