Welcome to the Hugging-Verse

@stevhliu|Mar 10, 2025 (11 months ago)59 views

I often see people ask, what is Hugging Face? The answer is usually some variant of "Hugging Face is the GitHub of machine learning".

It's not a bad answer, but it hides a lot of depth. The Hugging Face ecosystem, or Hugging-Verse, is expansive and encompasses nearly every aspect of machine learning. For this reason, it can be overwhelming if you're just getting started.

This is my Hugging-Verse walkthrough, inspired by games like Baldur's Gate 3, Elden Ring, and Kingdom Come Deliverance II. These games are enormous, and you can easily spend 100+ hours on a single playthrough in each of them. Through side quests and lore, they add a ton of richness and worldbuilding that create immersive gameplay.

But I've also had to look certain things up in guides because I was overwhelmed. So if you're feeling lost, I hope this helps.

Classes are the core libraries you build with.
Skill trees are optional specialized libraries.
The Hub is a hosted platform where you can get ML services like compute and storage.
Buffs are learning resources.
Companions are interactive products.

#starting classes

There are many libraries in the Hugging-Verse, each dedicated to a specific topic like transformer models, diffusion models, pretraining/finetuning, robotics, evaluation, and more.

Choose a starter class, Transformers or Diffusers, depending on whether you're interested in large language models or image/video generation. These give you access to models and the APIs to train or run inference with them. It's important to level up these class skills first, like learning how to finetune a model with the Trainer API, because later on, you'll find that some of the more specialized libraries build on top of Transformers.

As an example, TRL trainers extend the Transformers Trainer. If you're already familiar with Trainer, then you'll get a +x% faster learning bonus.

#skill tree

As you level up, you can start exploring the skill tree and decide whether and where you want to spend your points on more specialized skills.

For example, if you're interested in a "training" build, put some points in Accelerate or nanotron. If you want to do reinforcement learning, invest a point in TRL. Or if you want to do an "optimization" build, check out kernels to build and load faster compute operations or bitsandbytes to quantize models to use less memory.

The reason you should consider whether you want to invest in these skills is that some features from these specialized libraries get added directly to Transformers over time. That's why it's such a powerful class. It can scale to the late-game, where you may need more specific abilities.

This isn't to say you shouldn't learn a specialized library, because not all abilities are integrated into Transformers. You may find that you need something in TRL that isn't available in Transformers.

#the hub

The Hub is where you go for shops and services to get things done. Most of these services are free to use, but a PRO subscription unlocks access to higher limits and more features.

#storage

One of the main services the Hub offers is storage for models and datasets you create. It pairs the Git workflow with Xet's storage system. Xet is faster and more efficient than Git LFS because it uses content-defined chunking (CDC) to deduplicate data. Only the parts of a file that changed are uploaded, unlike Git LFS, which uploads the entire file again.

The starter storage is pretty generous and comes with around 8TB of public storage and 100GB of private storage.

#spaces

Spaces lets you turn your models into ML apps with frameworks like Gradio, Docker, and React. You can even turn your ML app into a callable tool for agents with MCP to integrate a Space into a workflow.

Arena (formerly LMArena) is a $1.7B startup that started off as a Gradio app comparing how models perform on different tasks. If you can imagine it, you can build it with Spaces.

A free Space runs on a CPU with 16GB of RAM, and you can upgrade to bigger GPUs if you need additional compute. For a more unique and powerful hardware option, try , a shared cluster of H200s. The H200s are dynamically allocated to a Space to complete a workload, then released for the next Space. This ensures you only use compute when you need it and aren't leaving GPUs idle.

#inference providers and endpoints

Serverless inference lets you run your model through an API without having to manage the infrastructure on your own. This is designed to help you deploy models to production. There are two options:

Inference Providers connect models on the Hub to companies like Cerebras and Hyperbolic to let you make on-demand inference calls to their hardware. Make sure you use the comparison tool to help you select a provider based on price or speed.
Inference Endpoints is Hugging Face's dedicated and managed inference infrastructure ("endpoint") that runs continuously. There are more choices to make about the deployment, such as hardware (AWS/GCP/Azure), inference engine (vLLM/SGLang), and autoscaling.

#jobs

Jobs provides access to Hugging Face's hardware (CPUs/GPUs) for temporary compute. It stops once the task is complete, or you can schedule a task to run periodically.

hf jobs uv run --flavor a100-large --timeout 6h --with trl --secrets HF_TOKEN train.py

Launch a finetuning job on an A100 GPU.

This is useful for one-off or scheduled workloads like finetuning and data processing.

#data studio

Data Studio is available in dataset repositories for exploring data in the browser without downloading it. Ask the agent questions about a dataset or use the built-in SQL console to query it.

#buffs

Acquire these permanent buffs to increase your research and knowledge skills.

#courses

Take the courses at hf.co/learn, which cover topics like agents, reinforcement learning, diffusion, and more.

This is a good early-game buff because you get more in-depth explanations about how things work.

#papers

Browse Papers, a curated daily selection to help you stay on top of the latest research.

#research

Follow the science team, which produces and shares research you can learn from and build on.

FineData has several clean and high-quality datasets for large-scale pretraining, and they've also shared their recipe for extracting and refining data.
Smol Models Research releases small but competitive models, and has also written a playbook for training them.

This is a good late-game buff.

#companions

Add a companion to your party to help you out.

#reachy

Reachy is a desktop robot for experimenting with human-robot interactions. The robot is built on open-source software, so you can program new "behaviors" for it.

NVIDIA AI Developer

@NVIDIAAIDev

·Follow

Turn @HuggingFace Reachy Mini into your own AI assistant. 🤖 New step-by-step guide walks you through creating an interactive agent using Nemotron 3 open models, Brev, and DGX Spark. Bring Reachy Mini to life running locally and in the cloud. 🤗 Read the tutorial:

clem 🤗

@ClementDelangue

Super cool to see Jensen @nvidia showcasing Reachy Mini at his #CES26 keynote. Paired with a DGX Spark & Brev, it can make the perfect local home AI robotics setup!

Watch on Twitter

127

Read 11 replies

#huggingchat

HuggingChat is an open version of ChatGPT with support for many models like Kimi-K2.5 and gpt-oss-120B. If you don't know which model to use, its Omni router automatically selects the best model for your message.

HuggingChat is also available in the Hugging Face docs like Transformers, as well as in Papers, so you can ask it questions directly to level up even faster.