huggingface_inference

Run inference on Hugging Face models via the Inference API.

Overview

This step provides access to thousands of pre-trained models hosted on the Hugging Face Hub for tasks such as text classification, named entity recognition, summarization, translation, question answering, and text generation. You can use any public model without deploying or managing your own infrastructure: specify the model ID, provide your API token, and configure any task-specific parameters. The step handles the API communication and returns the parsed results.

Quick Start

steps:
- type: huggingface_inference
  api_token: ${env:huggingface_token}
  model: bert-base-uncased

Configuration

Parameter Type Required Description
model string Yes Model ID (for example 'bert-base-uncased'). Values starting with http(s):// are treated as full endpoint URLs.
api_token string Yes Hugging Face API token sent as a Bearer Authorization header.
input_from string No Dot path selecting the payload to send. When omitted, the entire event dictionary is posted.
input_key string No DEPRECATED: Use 'input_from' instead. Dot path selecting the payload.
output_to string No Event key that receives the parsed response payload.
Default: "huggingface"
output_key string No DEPRECATED: Use 'output_to' instead. Event key for response.
payload_field string No JSON key used to wrap the payload (defaults to 'inputs' to match Hugging Face conventions).
Default: "inputs"
raw_on_error boolean No When True, store the raw response body under '<output_to>_raw' if JSON parsing fails.
Default: true
swallow_on_error boolean No If True, skip injecting error details and return the original event on failures.
Default: false
timeout integer No Request timeout in seconds for the inference call (default 10).
Default: 10
extra_headers object No Additional HTTP headers merged with the defaults for each request (Authorization, Content-Type, Accept, User-Agent).
task_params object No Additional top-level parameters (for example temperature) merged into the request body alongside the payload field.
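The parameter descriptions above can be summarized with a short sketch of how a request is assembled: the payload is wrapped under payload_field, task_params are merged top-level, and extra_headers override the defaults. This is an illustrative Python snippet based on the table, not the step's actual implementation; build_request is a hypothetical helper name.

```python
# Illustrative sketch of request assembly, based on the Configuration
# table above (not the step's actual implementation).

def build_request(payload, api_token, payload_field="inputs",
                  task_params=None, extra_headers=None):
    """Wrap the payload under `payload_field` and merge `task_params`
    top-level, as the parameter descriptions specify."""
    body = {payload_field: payload}
    body.update(task_params or {})  # e.g. temperature, max_length

    headers = {
        "Authorization": f"Bearer {api_token}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    headers.update(extra_headers or {})  # extra_headers win on conflict
    return body, headers

body, headers = build_request(
    "A long article...",
    "hf_xxx",
    task_params={"max_length": 150, "min_length": 50},
)
```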

Examples

Sentiment analysis

Classify text sentiment (positive/negative) using DistilBERT

type: huggingface_inference
model: distilbert-base-uncased-finetuned-sst-2-english
api_token: ${env:huggingface_token}
input_from: review.text
output_to: review.sentiment
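In this example, input_from and output_to are dot paths into the event dictionary: 'review.text' is read as the payload, and the parsed response is written under 'review.sentiment'. As a rough sketch of those semantics (the step's actual resolver may differ in edge cases, and these helper names are hypothetical):

```python
# Hypothetical dot-path helpers illustrating input_from / output_to
# semantics; the step's real resolver may handle edge cases differently.

def get_path(event, path):
    """Follow a dot path such as 'review.text' into nested dicts."""
    value = event
    for key in path.split("."):
        value = value[key]
    return value

def set_path(event, path, value):
    """Create intermediate dicts as needed, then set the leaf key."""
    *parents, leaf = path.split(".")
    node = event
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value

event = {"review": {"text": "Great product!"}}
payload = get_path(event, "review.text")         # sent to the model
set_path(event, "review.sentiment", "POSITIVE")  # parsed response stored here
```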

Text summarization

Generate concise summaries of long documents

type: huggingface_inference
model: facebook/bart-large-cnn
api_token: ${env:huggingface_token}
input_from: article.full_text
output_to: article.summary
task_params:
  max_length: 150
  min_length: 50
timeout: 20

Named entity recognition

Extract people, organizations, and locations from text

type: huggingface_inference
model: dslim/bert-base-NER
api_token: ${env:huggingface_token}
input_from: document.content
output_to: document.entities
timeout: 15

Question answering

Answer questions based on provided context

type: huggingface_inference
model: deepset/roberta-base-squad2
api_token: ${env:huggingface_token}
input_from: qa_pair
output_to: answer
task_params:
  question: ${qa_pair.question}
  context: ${qa_pair.context}
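Per the task_params description in the Configuration table, the question and context here are merged top-level alongside the wrapped payload. Assuming qa_pair resolves to a dict with those two keys (the sample values below are invented for illustration), the posted JSON body would look roughly like:

```python
# Rough sketch of the JSON body the QA example above would post,
# assuming task_params are merged top-level as documented.
qa_pair = {
    "question": "Who wrote the report?",
    "context": "The report was written by the audit team in March.",
}
body = {"inputs": qa_pair}   # payload_field defaults to 'inputs'
body.update({                # task_params merged alongside the payload
    "question": qa_pair["question"],
    "context": qa_pair["context"],
})
```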

Advanced Options

These options are available on all steps for error handling and retry logic:

Parameter Type Default Description
retries integer 0 Number of retry attempts (0-10)
backoff_seconds number 0 Backoff (seconds) applied between retry attempts
retry_propagate boolean false If True, re-raise the last exception after retries are exhausted; otherwise swallow it.
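The retry semantics above can be sketched as a simple loop: up to `retries` extra attempts, with `backoff_seconds` of sleep between them, and the final failure either re-raised or swallowed depending on `retry_propagate`. This is an illustrative sketch, not the runner's actual code; run_with_retries is a hypothetical name.

```python
import time

def run_with_retries(step, event, retries=0, backoff_seconds=0,
                     retry_propagate=False):
    """Illustrative retry loop matching the Advanced Options table:
    one initial attempt plus up to `retries` retries."""
    last_exc = None
    for attempt in range(retries + 1):
        try:
            return step(event)
        except Exception as exc:
            last_exc = exc
            if attempt < retries and backoff_seconds:
                time.sleep(backoff_seconds)
    if retry_propagate and last_exc is not None:
        raise last_exc
    return event  # swallow: hand the original event back
```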