AgriBERT-Triton-Configuration by Ansh-Sarkar

AgriBERT Triton Configuration

This repository contains the NVIDIA Triton Inference Server configuration for serving the AgriBERT model (recobo/agriculture-bert-uncased) using the PyTorch LibTorch backend. The repository currently focuses on the model configuration (config.pbtxt), which defines inputs, outputs, and execution settings required to deploy the model on Triton.

📌 Model Overview

AgriBERT is a BERT-based language model trained specifically for the agriculture domain.

  • Base model: SciBERT
  • Further pre-trained using Masked Language Modeling (MLM)
  • Domain: Agriculture research and practical agricultural knowledge
  • Language: English

Training Data

The model was trained on a large agriculture-focused corpus consisting of:

  • 1.2 million paragraphs from the US National Agricultural Library (NAL)
  • 5.3 million paragraphs from agriculture-related books and common literature

This balanced dataset includes both scientific and general agricultural content, enabling strong domain-specific language understanding.

Training Objective

  • Masked Language Modeling (MLM)
  • 15% of input tokens are randomly masked
  • The model predicts the masked tokens using bidirectional context

This allows AgriBERT to learn deep contextual representations relevant to agricultural text.
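The masking step above can be sketched in plain Python. This is a minimal illustration of the idea, not the actual pretraining code; the token IDs and the [MASK] id are invented placeholders (103 happens to be the [MASK] id in the standard BERT vocabulary):

```python
import random

MASK_TOKEN_ID = 103   # [MASK] id in the standard BERT vocab (placeholder here)
MASK_FRACTION = 0.15  # 15% of input tokens are masked

def mask_tokens(input_ids, seed=0):
    """Randomly replace ~15% of token IDs with the [MASK] id.

    Returns the masked sequence and the positions the model must predict.
    """
    rng = random.Random(seed)
    n_mask = max(1, int(len(input_ids) * MASK_FRACTION))
    positions = rng.sample(range(len(input_ids)), n_mask)
    masked = list(input_ids)
    for pos in positions:
        masked[pos] = MASK_TOKEN_ID
    return masked, sorted(positions)

tokens = [2023, 4895, 2003, 1037, 10729, 17404]  # invented token IDs
masked, positions = mask_tokens(tokens)
```

For simplicity this always substitutes [MASK]; the original BERT recipe actually replaces a masked position with [MASK] only 80% of the time, using a random token or the original token for the remainder.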

🧠 Use Case

While AgriBERT is commonly used for fill-mask tasks, this Triton configuration exposes the model as a general-purpose inference endpoint, suitable for:

  • Text classification
  • Feature extraction
  • Downstream agriculture NLP tasks

The output shape indicates a 7-class FP32 output, suggesting use in a classification or scoring setup.
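Since the endpoint returns a [batch_size, 7] FP32 tensor, a client would typically turn each row of logits into class probabilities with a softmax. A minimal sketch, with invented logit values standing in for one OUTPUT__0 row:

```python
import math

def softmax(logits):
    """Convert a row of raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

row = [2.1, -0.3, 0.7, 4.5, -1.2, 0.0, 1.8]  # one OUTPUT__0 row (invented)
probs = softmax(row)
predicted_class = probs.index(max(probs))    # index of the largest logit
```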

⚙️ Triton Model Configuration

The config.pbtxt defines how Triton serves the model.

Model Details

  • Model Name: agribert
  • Backend: pytorch_libtorch
  • Execution: GPU-based inference

Inputs

  • input_ids (INT32, [batch_size, seq_len]): token IDs from the tokenizer
  • token_type_ids (INT32, [batch_size, seq_len]): segment IDs for sentence pairs
  • attention_mask (INT32, [batch_size, seq_len]): attention mask marking padding tokens

All inputs support dynamic batch size and dynamic sequence length.

Outputs

  • OUTPUT__0 (FP32, [batch_size, 7]): model logits or class scores

Instance Group

  • GPU Instances: 1
  • Kind: KIND_GPU
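Putting the fields above together, the config.pbtxt is expected to look roughly like the following. This is a reconstruction from the values described in this README, not a verbatim copy of the file; in particular, max_batch_size: 8 is an assumed value, and with a nonzero max_batch_size Triton treats the batch dimension as implicit, so only the sequence dimension appears in dims:

```
name: "agribert"
platform: "pytorch_libtorch"
max_batch_size: 8            # assumed value; the real file may differ
input [
  {
    name: "input_ids"
    data_type: TYPE_INT32
    dims: [ -1 ]             # dynamic sequence length
  },
  {
    name: "token_type_ids"
    data_type: TYPE_INT32
    dims: [ -1 ]
  },
  {
    name: "attention_mask"
    data_type: TYPE_INT32
    dims: [ -1 ]
  }
]
output [
  {
    name: "OUTPUT__0"
    data_type: TYPE_FP32
    dims: [ 7 ]              # 7-class logits or scores
  }
]
instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]
```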

📂 Repository Structure

.
├── agribert/
│   ├── config.pbtxt
│   └── 1/
│       └── model.pt  (expected)
└── README.md

⚠️ Note: The actual serialized TorchScript model (model.pt) must be placed under a versioned directory (e.g., 1/) for Triton to load the model successfully.

🚀 Deployment Notes

  • Ensure the model is exported to TorchScript (torch.jit.trace or torch.jit.script)
  • Tokenization must be handled client-side using the same tokenizer:
    • recobo/agriculture-bert-uncased
  • Input tensors must be provided as INT32
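To make the client-side contract concrete, the sketch below builds the three input tensors for a padded batch using only the standard library. The token IDs are invented placeholders standing in for real output of the recobo/agriculture-bert-uncased tokenizer, and PAD_ID = 0 is the usual BERT padding id:

```python
PAD_ID = 0  # BERT's conventional [PAD] token id

def build_batch(sequences):
    """Pad a batch of token-ID lists and build the three Triton inputs.

    Returns (input_ids, token_type_ids, attention_mask), each a
    [batch_size, seq_len] list of ints to be sent as INT32 tensors.
    """
    seq_len = max(len(s) for s in sequences)
    input_ids = [s + [PAD_ID] * (seq_len - len(s)) for s in sequences]
    # single-sentence inputs: every segment id is 0
    token_type_ids = [[0] * seq_len for _ in sequences]
    # 1 for real tokens, 0 for padding
    attention_mask = [[1] * len(s) + [0] * (seq_len - len(s)) for s in sequences]
    return input_ids, token_type_ids, attention_mask

batch = [[101, 7592, 102], [101, 2129, 2024, 2017, 102]]  # invented token IDs
ids, segs, mask = build_batch(batch)
```

In a real client these nested lists would be converted to int32 NumPy arrays and attached to the request via a Triton client library such as tritonclient.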

🔗 Model Reference

  • Hugging Face Model: recobo/agriculture-bert-uncased
  • Task: Masked Language Modeling (MLM)
  • Framework: PyTorch / Transformers

📄 License & Attribution: All credit for the model architecture and pretraining goes to Recobo.ai and the original AgriBERT authors. This repository only contains deployment configuration for NVIDIA Triton Inference Server.
