# Gateway Layer

## **AI Gateway (Orchestration Layer)**

The AI Gateway is the central coordinator of the iG3 Edge Network. It manages device interactions, workload assignment, and dynamic routing. Every edge device connects through this gateway to receive tasks, report results, and verify its identity.

**Key Responsibilities:**

* **Device Registration & DID Verification**: Each device is authenticated using a Decentralized Identifier (DID) on the peaq network.
* **Workload Distribution**: Distributes AI inference tasks to the most suitable nodes, preferring edge devices first and falling back to the cloud if needed.
* **Task Queueing & Retry**: Ensures fault tolerance with automatic re-queuing of failed jobs.
* **Topology-Aware Load Balancing**: Routes tasks based on proximity, device capability, and current load to optimize performance.
* **Edge Model Discovery**: Detects which models are available on which edge nodes for efficient dispatching.
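
The dispatching rules above (model availability, edge-first preference, proximity, and load) can be illustrated with a minimal sketch. The `Node` fields, the scoring formula, and the `max_load` cutoff are all assumptions made for illustration; the actual gateway's data model and weighting are not described in this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    node_id: str
    is_edge: bool
    distance_km: float  # hypothetical proximity metric
    load: float         # 0.0 (idle) to 1.0 (saturated)
    models: set = field(default_factory=set)  # models discovered on this node


def score(node: Node) -> float:
    # Lower is better. The weighting is an illustrative assumption.
    return node.distance_km / 1000.0 + node.load


def pick_node(nodes: List[Node], model: str, max_load: float = 0.9) -> Optional[Node]:
    # Keep only nodes that host the model and still have headroom.
    candidates = [n for n in nodes if model in n.models and n.load < max_load]
    # Prefer edge devices; fall back to cloud nodes only if no edge node qualifies.
    edge = [n for n in candidates if n.is_edge]
    pool = edge or candidates
    return min(pool, key=score) if pool else None
```

With this policy, a lightly loaded edge node wins over a closer but busier one, and a cloud node is chosen only when no edge node hosts the requested model.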

**Tech Stack:**

* **Kubernetes** for service orchestration and mesh networking.
* **Kafka** as an event bus for task scheduling and status updates.
* **gRPC APIs** for high-performance communication between services.
* **Pub/Sub System** to stream results and updates back to users and dashboards.
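
The re-queuing behavior listed under Task Queueing & Retry can be sketched as follows. This is not the gateway's implementation: an in-memory `deque` stands in for the Kafka event bus, and `run_with_retry`, `max_attempts`, and the dead-letter list are hypothetical names chosen for the example.

```python
from collections import deque


def run_with_retry(tasks, handler, max_attempts=3):
    # Each queue entry carries the task and its attempt number.
    queue = deque((task, 1) for task in tasks)
    done, dead = {}, []
    while queue:
        task, attempt = queue.popleft()
        try:
            done[task] = handler(task)
        except Exception:
            if attempt < max_attempts:
                # Re-queue at the back so other tasks are not blocked.
                queue.append((task, attempt + 1))
            else:
                # Give up after max_attempts and record the failure.
                dead.append(task)
    return done, dead
```

Re-queuing at the back of the queue keeps one failing job from starving the rest; a production system would typically also add a delay between attempts.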

## **LLM Gateway**

The LLM Gateway handles natural-language tasks and large-model interactions. It abstracts away model complexity and gives edge users a streamlined interface for interacting with LLMs in real time, whether from a desktop client or a Telegram bot.

**Key Responsibilities:**

* **Token-Based Authentication**: Secures access using wallet signatures or DID verification.
* **Text Generation & Chat Handling**: Receives prompts and routes them to the most appropriate model.
* **Model Sharding & Orchestration**: Supports parallelism across multiple GPUs or devices for scalability.
* **Local Edge or Cloud Inference**: Automatically selects between lightweight edge LLMs and cloud-hosted large models (e.g., H100-backed) depending on complexity and urgency.
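
As a rough illustration of the edge-or-cloud decision, the sketch below routes a prompt by a crude complexity proxy (word count) and an urgency flag. The `Prompt` type, the `edge_word_limit` threshold, and the rule that urgent requests tolerate a larger prompt on the edge are assumptions; the page does not specify how complexity or urgency are measured.

```python
from dataclasses import dataclass


@dataclass
class Prompt:
    text: str
    urgent: bool = False


def choose_backend(prompt: Prompt, edge_word_limit: int = 200) -> str:
    # Word count is a stand-in for a real complexity estimate.
    words = len(prompt.text.split())
    # Assumption: urgent requests favor low-latency edge inference,
    # so they are allowed twice the usual edge budget.
    limit = edge_word_limit * 2 if prompt.urgent else edge_word_limit
    return "edge" if words <= limit else "cloud"
```

A real router would likely also consider which models are loaded on nearby devices and the current queue depth of each backend.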

**Tech Stack:**

* **LiteLLM** for API compatibility with major LLMs and simplified routing.
* **vLLM / NVIDIA Triton / text-generation-webui** for efficient model hosting and scaling.
* **Faiss / Qdrant** to support retrieval-augmented generation (RAG) pipelines using vector search.
* **Redis** for caching frequent responses and reducing redundant compute.
* **Billing & Metering Hooks** to track usage and apply token-based pricing models.
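
To show how response caching and metering hooks might fit together, here is a minimal sketch. A plain dictionary stands in for Redis, whitespace-separated words stand in for model tokens, and the choice not to bill cache hits is an assumption; `CachedGateway` and its methods are hypothetical names, not part of any iG3 API.

```python
import hashlib


class CachedGateway:
    def __init__(self, generate):
        self._generate = generate  # callable(model, prompt) -> str
        self._cache = {}           # stand-in for Redis
        self.usage = {}            # user_id -> metered token count

    def complete(self, user_id, model, prompt):
        # Key on model and prompt so different models never share entries.
        key = hashlib.sha256(f"{model}\n{prompt}".encode()).hexdigest()
        if key in self._cache:
            # Assumption: cache hits cost no compute and are not metered.
            return self._cache[key]
        text = self._generate(model, prompt)
        self._cache[key] = text
        # Crude token estimate: whitespace-separated words in and out.
        tokens = len(prompt.split()) + len(text.split())
        self.usage[user_id] = self.usage.get(user_id, 0) + tokens
        return text
```

Repeating the same prompt against the same model returns the cached text without invoking the model again, which is the redundant compute the cache is meant to avoid.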

