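Hugging Face Usage Guide

Source

Original document: Hugging Face Use.md

Core Content

Hugging Face is a platform for sharing AI models and datasets, and it hosts a large collection of pretrained models.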

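Platform Overview

Model Hub: hundreds of thousands of pretrained models
Datasets: a collection of open-source datasets
Spaces: hosting for deploying and sharing ML applications
Transformers library: a Python library for loading, running, and fine-tuning pretrained models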

Using Transformers

from transformers import pipeline

# Text classification (uses the task's default model when none is given)
classifier = pipeline("sentiment-analysis")
result = classifier("I love this product!")

# Named entity recognition; aggregation_strategy replaces the deprecated grouped_entities flag
ner = pipeline("ner", aggregation_strategy="simple")
result = ner("Hugging Face is a company based in New York.")

# Text generation
generator = pipeline("text-generation", model="gpt2")
result = generator("Once upon a time")

# Translation (English to German)
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")
result = translator("Hello, how are you?")
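
The pipelines above return plain Python structures. When a text-classification pipeline is given a list of inputs, it returns one dict per input, each holding a `label` and a `score`. The sketch below shows one way to post-process that shape. The `sample` list is hand-written illustration data in that shape, not real model output, and `confident_labels` and its threshold are hypothetical choices.

```python
from typing import Dict, List, Optional, Union

Prediction = Dict[str, Union[str, float]]


def confident_labels(results: List[Prediction], threshold: float = 0.9) -> List[Optional[str]]:
    """Keep each input's label only if its score reaches the threshold, else None."""
    return [
        str(r["label"]) if float(r["score"]) >= threshold else None
        for r in results
    ]


# Hand-written sample in the shape returned for two input texts (not real model output).
sample = [
    {"label": "POSITIVE", "score": 0.98},
    {"label": "NEGATIVE", "score": 0.55},
]
print(confident_labels(sample))
```

Filtering on the score this way lets low-confidence predictions be routed to a fallback instead of being trusted blindly.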

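Downloading Models

# Download with huggingface-cli (files go to the local Hugging Face cache)
huggingface-cli download gpt2

# Download into a specific local directory
huggingface-cli download gpt2 --local-dir ./models/gpt2

# Clone with git-lfs
git lfs install
git clone https://huggingface.co/gpt2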
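
The same download can be scripted from Python. A minimal sketch, assuming the `huggingface_hub` package is installed: `snapshot_download` accepts an `allow_patterns` argument, so a script can fetch only the small config and tokenizer files before committing to the full weights. The helper name `download_light_files`, the pattern list, and the target directory are illustrative assumptions.

```python
from typing import List, Optional

# Glob patterns for small metadata files; an illustrative choice, adjust per repo.
LIGHT_PATTERNS: List[str] = ["*.json", "*.txt"]


def download_light_files(repo_id: str, local_dir: str,
                         patterns: Optional[List[str]] = None) -> str:
    """Download only the files matching the patterns and return the local folder path."""
    # Imported lazily so this module still loads where huggingface_hub is not installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        local_dir=local_dir,
        allow_patterns=patterns or LIGHT_PATTERNS,
    )
```

Calling `download_light_files("gpt2", "./models/gpt2")` needs network access and should leave only the matching JSON and text files in the target directory.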

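Using a Mirror or Proxy

# Point the Hugging Face tools at a mirror site
export HF_ENDPOINT=https://hf-mirror.com
# Or route HTTPS traffic through a local proxy
export https_proxy=http://127.0.0.1:7890

# Usage from Python
from huggingface_hub import snapshot_download
snapshot_download(repo_id="gpt2", local_dir="./models")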
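
To see what `HF_ENDPOINT` changes, it helps to look at the file URLs involved. The Hub serves model files at `<endpoint>/<repo_id>/resolve/<revision>/<filename>`, and the mirror setting swaps only the endpoint part. The sketch below rebuilds that URL by hand for illustration; `resolve_url` is a hypothetical helper, not a `huggingface_hub` function (the library provides its own `hf_hub_url`).

```python
import os
from typing import Optional

DEFAULT_ENDPOINT = "https://huggingface.co"


def resolve_url(repo_id: str, filename: str, revision: str = "main",
                endpoint: Optional[str] = None) -> str:
    """Build the download URL for one file in a model repo.

    Falls back to the HF_ENDPOINT environment variable, then to the default Hub.
    """
    base = (endpoint or os.environ.get("HF_ENDPOINT", DEFAULT_ENDPOINT)).rstrip("/")
    return f"{base}/{repo_id}/resolve/{revision}/{filename}"


# Same file, default Hub versus the mirror.
print(resolve_url("gpt2", "config.json", endpoint=DEFAULT_ENDPOINT))
print(resolve_url("gpt2", "config.json", endpoint="https://hf-mirror.com"))
```

Note that `huggingface_hub` reads `HF_ENDPOINT` when the package is imported, so in a Python script the variable should be set (for example through `os.environ`) before the import statement.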

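Model Inference

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Generate text; max_length counts the prompt tokens plus the generated tokens
inputs = tokenizer("Hello, I'm a language model", return_tensors="pt")
# gpt2 has no pad token, so reuse the end-of-sequence token to avoid a warning
outputs = model.generate(**inputs, max_length=50, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))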
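
Because `max_length` in `generate` covers the prompt as well as the continuation, a long prompt leaves little room for new text. `generate` also accepts `max_new_tokens`, which counts only the generated tokens. The arithmetic is sketched below in plain Python; `new_token_budget` is a hypothetical helper and the token counts are made-up illustration values.

```python
def new_token_budget(prompt_tokens: int, max_length: int) -> int:
    """Number of tokens that fit after the prompt when the total is capped at max_length."""
    if prompt_tokens < 0 or max_length < 0:
        raise ValueError("token counts must be non-negative")
    return max(0, max_length - prompt_tokens)


# An 8-token prompt with max_length=50 leaves room for 42 new tokens.
print(new_token_budget(prompt_tokens=8, max_length=50))
# A 60-token prompt already exceeds max_length=50, so no room is left.
print(new_token_budget(prompt_tokens=60, max_length=50))
```

Passing `max_new_tokens` to `generate` expresses the limit directly, independent of how long the prompt is.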

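Key Takeaways

Hugging Face is one of the largest model communities in AI
The Transformers library supports PyTorch, and its 4.x releases also support TensorFlow and JAX (Flax)
Setting HF_ENDPOINT switches downloads to a mirror site (such as hf-mirror.com)
Some models are gated: you must accept their license terms on the model page and authenticate before downloading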

LoongLee's blog
