Paste text, upload files, or drop a folder or .zip archive. Instantly see token counts, model compatibility, and estimated costs for 33 AI models.
Supported file types: .txt, .md, .csv, .json, .xml, .html, .css, .js, .ts, .py, .java, .c, .cpp, .log, .yaml, .yml, .toml, .ini, .sql, .sh, .rb, .go, .rs, .php, .swift, .kt, .env, .zip
In English, 1,000 words is roughly 1,300 tokens, though the exact count depends on the model and text complexity. Common English text averages about 1.3 tokens per word; the inverse rule of thumb is that 1,000 tokens is about 750 words. Code and technical text tend to use more tokens per word.
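For a quick estimate in code, you can apply that average directly. A minimal sketch, assuming the ~1.3 tokens-per-word figure quoted above:

```python
# Rough token estimate for English text, using the ~1.3 tokens-per-word
# average quoted above. A heuristic, not an exact count.
def estimate_tokens(text: str) -> int:
    return round(len(text.split()) * 1.3)

print(estimate_tokens("The quick brown fox jumps over the lazy dog."))  # ~12
```

Code and non-English text usually land above this average, so treat the result as a lower-bound ballpark.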
Paste your text into the box above to get an instant token estimate. For exact counts, use OpenAI's tiktoken library in Python. This tool also shows which models your text fits within and the estimated API cost.
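For example, an exact count with tiktoken looks like this (tiktoken is exact only for OpenAI models; other vendors ship their own tokenizers):

```python
# pip install tiktoken
import tiktoken

text = "Paste any text here to count its tokens."

# Look up the tokenizer that a given OpenAI model uses.
enc = tiktoken.encoding_for_model("gpt-4o")
print(len(enc.encode(text)))  # exact token count for that tokenizer
```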
A token is a subword unit that AI language models use to process text. Common words like "the" are single tokens, while longer or rarer words get split into multiple tokens. On average, 1 token equals about 4 characters in English.
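You can see this splitting directly by decoding each token ID back into its text piece. A small sketch using tiktoken's o200k_base vocabulary (the sample words are arbitrary):

```python
import tiktoken

enc = tiktoken.get_encoding("o200k_base")  # the tokenizer GPT-4o uses

for word in ["the", "tokenizer", "antidisestablishmentarianism"]:
    ids = enc.encode(word)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{word!r} -> {len(ids)} token(s): {pieces}")
```

Short common words typically come back as a single token, while long or rare words split into several subword pieces.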
As of April 2026, Llama 4 Scout supports up to 10 million tokens (Llama 4 Maverick supports 1 million). Grok 4.20 supports 2 million tokens. Google Gemini models and GPT-4.1 support 1 million tokens. Claude Opus 4.7 and Sonnet 4.6 also support 1 million tokens.
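A sketch of how a tool like this one can check whether text fits a model, using the context limits quoted above (the model identifiers are illustrative, and the limits change frequently):

```python
# Context-window limits as quoted above (April 2026); illustrative only.
CONTEXT_WINDOWS = {
    "llama-4-scout": 10_000_000,
    "grok-4.20": 2_000_000,
    "gpt-4.1": 1_000_000,
    "claude-sonnet-4.6": 1_000_000,
}

def fits(model: str, token_count: int) -> bool:
    """True if token_count fits within the model's context window."""
    return token_count <= CONTEXT_WINDOWS[model]

print(fits("gpt-4.1", 850_000))      # True
print(fits("grok-4.20", 3_000_000))  # False
```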
Each model uses a different tokenizer with its own vocabulary. GPT-4o uses a roughly 200K-token vocabulary, while older OpenAI models use a 100K one. The same text can produce different token counts across models; this tool gives a universal estimate based on average tokenization.
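You can observe the difference between vocabularies directly, for example with tiktoken's two main encodings (cl100k_base is the older ~100K vocabulary, o200k_base is GPT-4o's ~200K one):

```python
import tiktoken

text = "Tokenizers disagree: café, naïveté, and emoji 🚀 all split differently."

for name in ("cl100k_base", "o200k_base"):
    enc = tiktoken.get_encoding(name)
    print(name, len(enc.encode(text)))
```

Larger vocabularies generally need fewer tokens for the same text, which is why any single universal estimate can only be approximate.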
The tool is completely free. Everything runs in your browser, so your text never leaves your device. There's no signup, no API key needed, and no usage limits.