AgriBERT-Triton-Configuration by Ansh-Sarkar

AgriBERT Triton Configuration

This repository contains the NVIDIA Triton Inference Server configuration for serving the AgriBERT model (recobo/agriculture-bert-uncased) using the PyTorch LibTorch backend. The repository currently focuses on the model configuration (config.pbtxt), which defines inputs, outputs, and execution settings required to deploy the model on Triton.

📌 Model Overview

AgriBERT is a BERT-based language model trained specifically for the agriculture domain.

  • Base model: SciBERT
  • Further pre-trained using Masked Language Modeling (MLM)
  • Domain: Agriculture research and practical agricultural knowledge
  • Language: English

Training Data

The model was trained on a large agriculture-focused corpus consisting of:

  • 1.2 million paragraphs from the US National Agricultural Library (NAL)
  • 5.3 million paragraphs from agriculture-related books and common literature

This balanced dataset includes both scientific and general agricultural content, enabling strong domain-specific language understanding.

Training Objective

  • Masked Language Modeling (MLM)
  • 15% of input tokens are randomly masked
  • The model predicts the masked tokens using bidirectional context

This allows AgriBERT to learn deep contextual representations relevant to agricultural text.
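The masking step above can be sketched in plain Python. This is a minimal illustration of the idea, not the actual pretraining code; the token IDs and the [MASK] id are invented placeholders (103 happens to be the [MASK] id in the standard BERT vocabulary):

```python
import random

MASK_TOKEN_ID = 103   # [MASK] id in the standard BERT vocab (placeholder here)
MASK_FRACTION = 0.15  # 15% of input tokens are masked

def mask_tokens(input_ids, seed=0):
    """Randomly replace ~15% of token IDs with the [MASK] id.

    Returns the masked sequence and the positions the model must predict.
    """
    rng = random.Random(seed)
    n_mask = max(1, int(len(input_ids) * MASK_FRACTION))
    positions = rng.sample(range(len(input_ids)), n_mask)
    masked = list(input_ids)
    for pos in positions:
        masked[pos] = MASK_TOKEN_ID
    return masked, sorted(positions)

tokens = [2023, 4895, 2003, 1037, 10729, 17404]  # invented token IDs
masked, positions = mask_tokens(tokens)
```

For simplicity this always substitutes [MASK]; the original BERT recipe actually replaces a masked position with [MASK] only 80% of the time, using a random token or the original token for the remainder.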

🧠 Use Case

While AgriBERT is commonly used for fill-mask tasks, this Triton configuration exposes the model as a general-purpose inference endpoint, suitable for:

  • Text classification
  • Feature extraction
  • Downstream agriculture NLP tasks

The output shape indicates a 7-class FP32 output, suggesting use in a classification or scoring setup.
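Since the endpoint returns a [batch_size, 7] FP32 tensor, a client would typically turn each row of logits into class probabilities with a softmax. A minimal sketch, with invented logit values standing in for one OUTPUT__0 row:

```python
import math

def softmax(logits):
    """Convert a row of raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

row = [2.1, -0.3, 0.7, 4.5, -1.2, 0.0, 1.8]  # one OUTPUT__0 row (invented)
probs = softmax(row)
predicted_class = probs.index(max(probs))    # index of the largest logit
```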

⚙️ Triton Model Configuration

The config.pbtxt defines how Triton serves the model.

Model Details

  • Model Name: agribert
  • Backend: pytorch_libtorch
  • Execution: GPU-based inference

Inputs

  • input_ids (INT32, [batch_size, seq_len]): token IDs from the tokenizer
  • token_type_ids (INT32, [batch_size, seq_len]): segment IDs for sentence pairs
  • attention_mask (INT32, [batch_size, seq_len]): attention mask marking padding tokens

All inputs support dynamic batch size and dynamic sequence length.

Outputs

  • OUTPUT__0 (FP32, [batch_size, 7]): model logits or class scores

Instance Group

  • GPU Instances: 1
  • Kind: KIND_GPU
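Putting the fields above together, the config.pbtxt is expected to look roughly like the following. This is a reconstruction from the values described in this README, not a verbatim copy of the file; in particular, max_batch_size: 8 is an assumed value, and with a nonzero max_batch_size Triton treats the batch dimension as implicit, so only the sequence dimension appears in dims:

```
name: "agribert"
platform: "pytorch_libtorch"
max_batch_size: 8            # assumed value; the real file may differ
input [
  {
    name: "input_ids"
    data_type: TYPE_INT32
    dims: [ -1 ]             # dynamic sequence length
  },
  {
    name: "token_type_ids"
    data_type: TYPE_INT32
    dims: [ -1 ]
  },
  {
    name: "attention_mask"
    data_type: TYPE_INT32
    dims: [ -1 ]
  }
]
output [
  {
    name: "OUTPUT__0"
    data_type: TYPE_FP32
    dims: [ 7 ]              # 7-class logits or scores
  }
]
instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]
```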

📂 Repository Structure

.
├── agribert/
│   ├── config.pbtxt
│   └── 1/
│       └── model.pt  (expected)
└── README.md

⚠️ Note: The actual serialized TorchScript model (model.pt) must be placed under a versioned directory (e.g., 1/) for Triton to load the model successfully.

🚀 Deployment Notes

  • Ensure the model is exported to TorchScript (torch.jit.trace or torch.jit.script)
  • Tokenization must be handled client-side using the same tokenizer:
    • recobo/agriculture-bert-uncased
  • Input tensors must be provided as INT32
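To make the client-side contract concrete, the sketch below builds the three input tensors for a padded batch using only the standard library. The token IDs are invented placeholders standing in for real output of the recobo/agriculture-bert-uncased tokenizer, and PAD_ID = 0 is the usual BERT padding id:

```python
PAD_ID = 0  # BERT's conventional [PAD] token id

def build_batch(sequences):
    """Pad a batch of token-ID lists and build the three Triton inputs.

    Returns (input_ids, token_type_ids, attention_mask), each a
    [batch_size, seq_len] list of ints to be sent as INT32 tensors.
    """
    seq_len = max(len(s) for s in sequences)
    input_ids = [s + [PAD_ID] * (seq_len - len(s)) for s in sequences]
    # single-sentence inputs: every segment id is 0
    token_type_ids = [[0] * seq_len for _ in sequences]
    # 1 for real tokens, 0 for padding
    attention_mask = [[1] * len(s) + [0] * (seq_len - len(s)) for s in sequences]
    return input_ids, token_type_ids, attention_mask

batch = [[101, 7592, 102], [101, 2129, 2024, 2017, 102]]  # invented token IDs
ids, segs, mask = build_batch(batch)
```

In a real client these nested lists would be converted to int32 NumPy arrays and attached to the request via a Triton client library such as tritonclient.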

🔗 Model Reference

  • Hugging Face Model: recobo/agriculture-bert-uncased
  • Task: Masked Language Modeling (MLM)
  • Framework: PyTorch / Transformers

📄 License & Attribution: All credit for the model architecture and pretraining goes to Recobo.ai and the original AgriBERT authors. This repository only contains deployment configuration for NVIDIA Triton Inference Server.
