Your Agent Can Now Train Models — Merve Noyan, Hugging Face

May 13, 2026
Merve Noyan from Hugging Face's open source team discusses the open agent ecosystem, highlighting new features like local coding agents, MCP servers, and skills that allow AI agents to train models and perform complex tasks.

Open Source in Machine Learning ⏱ 0:15

  • Open source models ship under commercially usable licenses (MIT, Apache 2.0), while open-weight models carry non-commercial license terms.
  • Open source ensures no hidden performance degradation.
  • Access to weights allows shrinking, quantizing, and fine-tuning models with guaranteed privacy for end users.
  • Open models are now competitive: GLM 5.1 is an example; the Artificial Analysis Intelligence Index shows green (open) models catching up to black (closed) models.
Hugging Face Hub and Agentic Models ⏱ 2:48

  • The Hugging Face Hub hosts nearly 3 million models, datasets, and spaces.
  • Users can filter for agentic models, most of which are trending.
  • Two types: vision-language models (VLMs) that can act as computer-use agents, and text-only LLMs.
  • New models like Gemma 4 (omni) and Qwen 2.5 are released with vision capabilities on day zero.
  • Running VLMs is easy with vLLM or a llama.cpp server in a few lines of code.
  • New benchmark datasets feature: click "benchmark" on a dataset to see rankings (e.g., SWE-bench Pro, Humanity's Last Exam, AIME).
  • The Inference Providers service routes to the best models/providers; it shows the cheapest/fastest options and a tool-use column.
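The "few lines of code" for serving a VLM locally with a llama.cpp server might look like the following; the repository name is illustrative, so substitute any GGUF vision model from the Hub (recent llama.cpp builds can pull GGUFs directly with the `-hf` flag):

```shell
# Launch a local OpenAI-compatible server from a GGUF repo on the Hub.
# The repo name below is an example placeholder, not one from the talk.
llama-server -hf ggml-org/gemma-3-4b-it-GGUF --port 8080
```

Once running, the server exposes standard `/v1/chat/completions` endpoints, so any OpenAI-compatible client can talk to it.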
Local Coding Agents and Tools ⏱ 6:57

  • Options for local coding agents: Pi (simple setup, can use remote inference providers or local llama.cpp server), llama-agent (binary in llama.cpp), and Hermes agents (memory management, easy setup via wizard, integrates with Slack/WhatsApp).
  • Recommended open model: GLM 5.1; Merve used it to fix Slack integration with Hermes agent.
  • Traces: a new dataset repository type for code execution traces (Codex, Claude Code, Pi). Traces can be viewed as parsed trees and later used to train models.
  • To serve LLMs locally: filter by the Apps tab (LM Studio, Jan, llama.cpp) on Hugging Face; model repos show a GGUF section with hardware compatibility (e.g., Gemma 4 at 4-bit fits on an L4 GPU with 24 GB VRAM).
  • "Use this model" button on model repos provides one-click commands for local serving.
Supercharging with Hugging Face Skills and MCP ⏱ 12:07

  • Skills: CLI skill (manage repos, jobs, demos), LLM trainer skill (train LLMs and VLMs remotely or locally), Gradio skill (build demos), dataset skill (explore via API).
  • Example: Merve asked Claude Code to train Qwen2-VL on LLaVA-Instruct-Mix; the agent calculated VRAM, asked for validation split, launched job, and published model on Hub.
  • Skills also support training object detectors and segmentation models.
  • MCP server for Hugging Face: provides models, datasets, spaces (with semantic search), jobs (pay per uptime). Dynamic spaces setting enables querying all spaces.
  • Example: query "generate image of baklava made of yarn" → MCP calls Hugging Face Qwen image generation model.
Real-World Application: OCR on AI Papers ⏱ 16:44

  • Colleague Niels OCR'd 30,000 AI papers using Codex, open OCR models, and Jobs, all through prompting.
  • Steps: pick an OCR model via the OCR benchmark dataset (first result: Chandra OCR; a new skill can also recommend models for fine-tuning).
  • Then ask the LLM to write the script and kick off a processing job on Hugging Face infrastructure; the skill sets up the instance and hosting automatically.
  • Jobs use the new "buckets" storage product (cheaper and faster than S3 buckets).
Key Takeaways

  • Open source models are now competitive with closed models; GLM 5.1 tops SWE-bench Pro.
  • Hugging Face Hub hosts nearly 3 million models and provides easy filtering for agentic models.
  • Local coding agents like Hermes and Pi can be set up quickly with open models via llama.cpp.
  • Hugging Face Skills enable agents to train models (LLMs, VLMs, object detectors) by just describing the task.
  • The MCP server integrates Hugging Face Hub into agents, allowing tasks like image generation and paper OCR.
Conclusion

    The open agent ecosystem is evolving rapidly, making it easier than ever to run, train, and deploy open models with minimal friction.
