huggingface_inference
Run inference on Hugging Face models via the Inference API.
Overview
This step provides access to thousands of pre-trained models hosted on Hugging Face for tasks such as text classification, named entity recognition, summarization, translation, question answering, and text generation. You can use any public model from the Hugging Face Hub without managing infrastructure: specify the model ID, provide your API token, and configure task-specific parameters. The step handles the API communication and returns parsed results, making it well suited to ML tasks that don't justify deploying your own models.
Quick Start
steps:
  - type: huggingface_inference
    api_token: ${env:huggingface_token}
    model: bert-base-uncased
Configuration
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (for example `bert-base-uncased`). Values starting with http(s) are treated as full endpoint URLs. |
| api_token | string | Yes | Hugging Face API token sent as a Bearer Authorization header. |
| input_from | string | No | Dot path selecting the payload to send. When omitted, the entire event dictionary is posted. |
| input_key | string | No | DEPRECATED: use `input_from` instead. Dot path selecting the payload. |
| output_to | string | No | Event key that receives the parsed response payload. Default: `"huggingface"` |
| output_key | string | No | DEPRECATED: use `output_to` instead. Event key for the response. |
| payload_field | string | No | JSON key used to wrap the payload (defaults to `inputs` to match Hugging Face conventions). |
| raw_on_error | boolean | No | When true, store the raw response body under `<output_to>_raw` if JSON parsing fails. Default: `true` |
| swallow_on_error | boolean | No | If true, skip injecting error details and return the original event on failures. Default: `false` |
| timeout | integer | No | Request timeout in seconds for the inference call. Default: `10` |
| extra_headers | string | No | Additional HTTP headers merged with the defaults for each request (Authorization, Content-Type, Accept, User-Agent). |
| task_params | string | No | Additional top-level parameters (for example `temperature`) merged into the request body alongside the payload field. |
Examples
Sentiment analysis
Classify text sentiment (positive/negative) using DistilBERT
type: huggingface_inference
model: distilbert-base-uncased-finetuned-sst-2-english
api_token: ${env:huggingface_token}
input_from: review.text
output_to: review.sentiment
Text summarization
Generate concise summaries of long documents
type: huggingface_inference
model: facebook/bart-large-cnn
api_token: ${env:huggingface_token}
input_from: article.full_text
output_to: article.summary
task_params:
  max_length: 150
  min_length: 50
timeout: 20
Named entity recognition
Extract people, organizations, and locations from text
type: huggingface_inference
model: dslim/bert-base-NER
api_token: ${env:huggingface_token}
input_from: document.content
output_to: document.entities
timeout: 15
Question answering
Answer questions based on provided context
type: huggingface_inference
model: deepset/roberta-base-squad2
api_token: ${env:huggingface_token}
input_from: qa_pair
output_to: answer
task_params:
  question: ${qa_pair.question}
  context: ${qa_pair.context}
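The dot paths used by `input_from` and `output_to` in the examples above resolve keys in nested event dictionaries. A rough sketch of that lookup and write-back, under the assumption that intermediate keys are plain dicts (the helper names are illustrative):

```python
def get_path(event: dict, path: str):
    """Follow a dot path like 'qa_pair.question' into a nested event dict."""
    value = event
    for key in path.split("."):
        value = value[key]
    return value

def set_path(event: dict, path: str, value) -> dict:
    """Write `value` at a dot path, creating intermediate dicts as needed."""
    parts = path.split(".")
    target = event
    for key in parts[:-1]:
        target = target.setdefault(key, {})
    target[parts[-1]] = value
    return event
```

With `input_from: qa_pair` the whole `qa_pair` sub-dictionary is posted as the payload, and with `output_to: answer` the parsed response is written back under the top-level `answer` key.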
Advanced Options
These options are available on all steps for error handling and retry logic:
| Parameter | Type | Default | Description |
|---|---|---|---|
| retries | integer | 0 | Number of retry attempts (0-10) |
| backoff_seconds | number | 0 | Backoff (seconds) applied between retry attempts |
| retry_propagate | boolean | false | If true, raise the last exception after exhausting retries; otherwise swallow it. |
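The retry semantics described in the table can be sketched as a small wrapper: the step runs at most `retries + 1` times, sleeps `backoff_seconds` between attempts, and on final failure either re-raises (`retry_propagate: true`) or swallows the error. This is an illustrative sketch of the documented behavior, not the pipeline's actual implementation:

```python
import time

def run_with_retries(step, retries: int = 0, backoff_seconds: float = 0,
                     retry_propagate: bool = False):
    """Call `step` up to retries + 1 times, sleeping between attempts."""
    last_exc = None
    for attempt in range(retries + 1):
        try:
            return step()
        except Exception as exc:
            last_exc = exc
            if attempt < retries:
                time.sleep(backoff_seconds)
    if retry_propagate:
        raise last_exc
    return None  # swallow the failure and let the caller keep the original event
```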