The portable device can run 120-billion-parameter LLMs, roughly two-thirds the size of OpenAI's 175-billion-parameter GPT-3, without requiring an internet or cloud connection.

(Image credit: Tiiny AI)
A U.S. startup has built what it claims is the world's smallest artificial intelligence (AI) supercomputer. Packed with advanced hardware and an unusually large amount of RAM, the device can run AI models with what company representatives describe as "Ph.D.-level intelligence," all while being small enough to fit in a pocket. That capability enables autonomous problem-solving, abstract reasoning and strategic planning.
The "AI Pocket Lab," as its developers at Tiiny AI have named it, can run a sophisticated 120-billion-parameter large language model (LLM) entirely locally, with no internet connection. Running models of this size typically requires data-center-grade infrastructure, but this device opens the door to on-device expert-level coding, document analysis and editing, and multi-stage reasoning.
It is built around a 12-core ARM processor, the kind commonly found in smartphones, laptops and tablets. Despite its tiny footprint — the device measures just 5.59 by 3.15 by 1.00 inches (14.2 by 8 by 2.54 centimeters) — it packs 80 GB of LPDDR5X RAM. For context, most contemporary laptops ship with 8 GB to 32 GB of RAM.
Of that RAM, 48 GB is dedicated solely to the neural processing unit (NPU), a specialized chip designed for AI-centric calculations. For several years, both Intel and AMD have been shipping processors with dedicated NPUs to handle AI tasks and to meet Microsoft's requirement of 40 trillion operations per second (TOPS) for AI features on Windows 11.
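These memory figures invite a quick sanity check: how much space do the weights of a 120-billion-parameter model actually occupy? Tiiny AI has not published the Pocket Lab's quantization scheme, so the precision levels below are purely illustrative:

```python
# Back-of-envelope: memory needed just to hold 120B model weights at
# different numeric precisions. Quantization details for the Pocket Lab
# are not public; these are generic figures, not the device's actual setup.

PARAMS = 120e9   # 120 billion parameters
GIB = 1024**3    # bytes per GiB

for bits in (16, 8, 4):
    gib = PARAMS * bits / 8 / GIB
    print(f"{bits:2d}-bit weights: ~{gib:.0f} GiB")
```

At 4-bit precision the weights alone come to roughly 56 GiB, which is consistent with the device's 80 GB of RAM; at full 16-bit precision they would need about 224 GiB and could not fit.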
What qualifies the Pocket Lab as a supercomputer, and distinguishes it from a typical mini-PC or workstation, is its processing capability: it can run workloads, specifically local inference on language models exceeding 100 billion parameters, that traditionally demand multi-GPU, data-center-scale systems. Models currently supported by the device include GPT-OSS 120B, larger Phi models and high-parameter Llama-family models.
This advancement fits a broader trend toward edge computing for AI, which aims to ease some of the power constraints and environmental costs associated with cloud-based AI processing.
Pocket power
While it does not rival the world's most powerful supercomputers, the AI Pocket Lab delivers 190 TOPS of compute, split between its NPU and CPU. It marks another step in miniaturization, following Nvidia's recent unveiling of the Project Digits mini PC; the Pocket Lab doesn't match that machine's computational might, but it is significantly smaller.
To fit this much capability into such a small enclosure, the Tiiny AI team relied on several technologies and optimizations. A key one, which the company calls TurboSparse, lets massive LLMs run faster on modest hardware by ensuring the system touches only the parts of a model it needs at any given moment. Unlike conventional dense models, which use every parameter for each processing step, a TurboSparse model activates only a subset of its parameters at each stage.
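TurboSparse's internals are not public, but the general idea of activating only a subset of parameters per step can be sketched with a toy feed-forward layer, where a cheap low-rank predictor guesses which hidden units are worth computing. All names, sizes and the predictor itself are invented for illustration:

```python
import numpy as np

# Toy sparse-activation sketch: compute only the hidden units a cheap
# predictor flags as likely active, instead of the full dense product.
# This is NOT TurboSparse itself, just the generic technique it resembles.

rng = np.random.default_rng(0)
d_model, d_ff, top_k, rank = 64, 256, 32, 8

W_in = rng.standard_normal((d_ff, d_model))   # full layer weights
W_out = rng.standard_normal((d_model, d_ff))
x = rng.standard_normal(d_model)

# A small low-rank predictor (far cheaper than the full layer) guesses
# which hidden units will fire. Here its weights are random placeholders;
# in real systems such predictors are trained.
A = rng.standard_normal((d_ff, rank))
B = rng.standard_normal((rank, d_model))
scores = A @ (B @ x)
active = np.argsort(np.abs(scores))[-top_k:]  # keep the top-k units

# Sparse path: touch only the active rows/columns of the weights.
hidden = np.maximum(W_in[active] @ x, 0.0)    # ReLU on active units only
y_sparse = W_out[:, active] @ hidden

print(f"computed {top_k}/{d_ff} hidden units "
      f"({100 * top_k / d_ff:.1f}% of the layer touched)")
```

The payoff is that only `top_k` of the layer's rows ever leave memory for this step, which is what lets a model far larger than the working set still run at usable speed.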
Another key feature is PowerInfer, which handles heterogeneous scheduling across the device's CPU, GPU and NPU, assigning each workload to the processor best suited to it. This improves overall system efficiency and reduces energy consumption. PowerInfer also includes intelligent power management, determining when full power is needed and when the device can throttle back, partly by eliminating redundant computations.
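PowerInfer's actual scheduling policy is not public either, but the core idea of routing each operation to whichever processor handles it best can be sketched in a few lines. The device names and throughput numbers below are invented for illustration:

```python
# Toy heterogeneous scheduler: send each operation type to the processor
# with the highest (hypothetical) throughput for it. The numbers are made
# up; PowerInfer's real cost model and policy are not published.

DEVICE_TOPS = {
    "npu": {"matmul": 150, "attention": 120, "elementwise": 10},
    "gpu": {"matmul": 30,  "attention": 40,  "elementwise": 25},
    "cpu": {"matmul": 2,   "attention": 2,   "elementwise": 8},
}

def schedule(op_type: str) -> str:
    """Pick the device with the best throughput for this op type."""
    return max(DEVICE_TOPS, key=lambda dev: DEVICE_TOPS[dev][op_type])

for op in ("matmul", "attention", "elementwise"):
    print(f"{op:11s} -> {schedule(op)}")
```

In this sketch the big matrix multiplies land on the NPU while small elementwise work goes to the GPU, mirroring the article's point: each processor gets the work it is best at, rather than one chip doing everything.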
The implications of a compact AI supercomputer extend beyond mitigating our dependence on energy-intensive data centers. It offers enhanced privacy, allowing users to leverage the power of advanced LLMs without an internet connection and without their data being processed in the cloud by external entities. Furthermore, it enables AI accessibility in remote operational contexts, such as field research stations, or aboard vessels or aircraft beyond connectivity range.
Source: www.livescience.com
