API Documentation Overview

This document gives an overview of the endpoints available in our minimal API. It covers chat completions, general text completions, token counting, and model listing.

1. Understanding the POST Method

POST is one of the main HTTP methods used in API interactions. It sends data to a server to create or update a resource. Its key aspects are:

  • Purpose: Submits data for the server to process, such as a prompt for a model.
  • Data Submission: Data travels in the body of the request, in a format such as JSON, form data, or XML.
  • Idempotency: POST requests are not idempotent, so sending the same request several times may create the same resource several times rather than once.
  • Response: Typically returns a status code of 200 (OK), 201 (Created), or 204 (No Content).
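
As a rough illustration of the points above, the sketch below builds a JSON POST request with Python's standard library without sending it. The host name, the Bearer-token header, and the helper names are assumptions made for illustration; none of them come from this documentation.

```python
import json
import urllib.request

# Placeholder host (assumption); substitute the real API base URL.
BASE_URL = "https://api.example.invalid"


def build_post(path: str, body: dict, api_key: str) -> urllib.request.Request:
    """Build, but do not send, a POST request carrying a JSON body."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url=BASE_URL + path,
        data=data,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Bearer-token auth is an assumption; see the API Key page.
            "Authorization": f"Bearer {api_key}",
        },
    )


def is_success(status: int) -> bool:
    """Return True for 2xx status codes such as 200, 201, and 204."""
    return 200 <= status < 300


request = build_post("/v1/completions", {"prompt": "Hello"}, api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Because POST is not idempotent, a client built this way should avoid blindly resending a request after a timeout, since the server may already have processed the first attempt.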

Structure of a POST request:

For these models:

  • Mixtral-8x7B-Instruct-v0.1, Qwen2-72B-Instruct, deepseek-coder-33b-instruct
{
    "model": "[Name of the model]",
    "prompt": "[your prompt here]",
    "max_tokens": 7000,
    "temperature": 0.7,
    "top_k": 40,
    "repetition_penalty": 1.2,
    "messages": [
        {
            "role": "user",
            "content": "[your prompt here]"
        }
    ]
}
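
The template above could be filled in programmatically along these lines. This is only a sketch: the function name `build_request_body` is hypothetical, and the defaults simply repeat the values shown in the template.

```python
import json


def build_request_body(model: str, prompt: str, max_tokens: int = 7000,
                       temperature: float = 0.7, top_k: int = 40,
                       repetition_penalty: float = 1.2) -> dict:
    """Return a dict with the same fields as the request template."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_k": top_k,
        "repetition_penalty": repetition_penalty,
        # The template repeats the prompt as a single user message.
        "messages": [{"role": "user", "content": prompt}],
    }


body = build_request_body("Qwen2-72B-Instruct", "Write a haiku about queues.")
print(json.dumps(body, indent=4))
```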

2. Chat Completions

Endpoints designed to handle chat completion requests:

  • POST /queue/chat/completions: Handles asynchronous queue requests for chat completions.
  • POST /openai/deployments/{model}/chat/completions: Submits chat completion requests for a specific model.
  • POST /engines/{model}/chat/completions: Engages a specific model for chat completion.
  • POST /chat/completions: General endpoint for chat completion requests.
  • POST /v1/chat/completions: Version 1 endpoint for chat completion requests.
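
A call to one of these endpoints might look like the sketch below, which targets `/v1/chat/completions`. The host name, the Bearer-token header, and the OpenAI-style response shape (`choices[0].message.content`) are assumptions; this page does not document them, so check the Swagger documentation before relying on them.

```python
import json
import urllib.request

BASE_URL = "https://api.example.invalid"  # placeholder host (assumption)


def chat_request(model: str, user_text: str, api_key: str) -> urllib.request.Request:
    """Build a chat completion request for a single user message."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (performs a network call)."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def extract_reply(response_json: dict) -> str:
    """Pull the assistant text out of an assumed OpenAI-style response."""
    return response_json["choices"][0]["message"]["content"]


# Hand-written sample in the assumed shape, not real API output.
sample = {"choices": [{"message": {"role": "assistant", "content": "Hi there"}}]}
print(extract_reply(sample))
```

In real use, `extract_reply(send(chat_request(...)))` would chain the three helpers; `send` is left uncalled here so the example runs offline.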

3. General Completions

Endpoints for general text completions:

  • POST /openai/deployments/{model}/completions: Handles completion requests for specific deployments.
  • POST /engines/{model}/completions: Directly submits text completion requests to a specified engine.
  • POST /completions: General endpoint for obtaining text completions.
  • POST /v1/completions: Version 1 of the text completions endpoint.
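
Two of the paths above contain a `{model}` placeholder. One way to fill it in is sketched below; the style labels (`deployment`, `engine`, `general`, `v1`) and the function name are hypothetical, while the path templates are copied from the list.

```python
from urllib.parse import quote

# Path templates copied from the endpoint list; the keys are made-up labels.
COMPLETION_PATHS = {
    "deployment": "/openai/deployments/{model}/completions",
    "engine": "/engines/{model}/completions",
    "general": "/completions",
    "v1": "/v1/completions",
}


def completion_path(style: str, model: str = "") -> str:
    """Return the completions path for a style, substituting the model name."""
    template = COMPLETION_PATHS[style]
    if "{model}" not in template:
        return template
    if not model:
        raise ValueError(f"style {style!r} needs a model name")
    # Percent-encode the name so characters such as "/" stay in one path segment.
    return template.replace("{model}", quote(model, safe=""))


print(completion_path("engine", "Mixtral-8x7B-Instruct-v0.1"))
print(completion_path("v1"))
```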

4. Token Counting

Endpoint that helps keep requests within a model's token limits:

  • POST /utils/token_counter: Counts the number of tokens for a given input.
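
The request and response formats for this endpoint are not described on this page, so the following is only a sketch under stated assumptions: the body fields `model` and `prompt`, the host name, the auth header, and the idea of a fixed `context_window` per model are all hypothetical. It shows how a token count, once obtained, could be checked against a budget before a completion request is sent.

```python
import json
import urllib.request

BASE_URL = "https://api.example.invalid"  # placeholder host (assumption)


def token_count_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build a token counting request; the body field names are assumed."""
    body = {"model": model, "prompt": prompt}
    return urllib.request.Request(
        BASE_URL + "/utils/token_counter",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )


def fits_budget(prompt_tokens: int, max_tokens: int, context_window: int) -> bool:
    """Check that the prompt plus the requested output fits the context window."""
    return prompt_tokens + max_tokens <= context_window


# Illustrative numbers only; real limits depend on the model.
print(fits_budget(prompt_tokens=1000, max_tokens=7000, context_window=8192))
```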

5. Model Management

Operations that can be performed on models via the API:

  • GET /models or GET /v1/models: Retrieves a list of all available models in the system.
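
Listing models is a plain GET request with no body. In the sketch below, the host name, the auth header, and the OpenAI-style response shape (a `data` list of objects with an `id` field) are assumptions rather than documented behavior.

```python
import urllib.request

BASE_URL = "https://api.example.invalid"  # placeholder host (assumption)


def models_request(api_key: str, versioned: bool = True) -> urllib.request.Request:
    """Build a GET request for /v1/models, or /models when versioned is False."""
    path = "/v1/models" if versioned else "/models"
    return urllib.request.Request(
        BASE_URL + path,
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
    )


def model_ids(response_json: dict) -> list:
    """Collect model identifiers from an assumed OpenAI-style listing."""
    return [entry["id"] for entry in response_json.get("data", [])]


# Hand-written sample in the assumed shape, not real API output.
sample_listing = {"data": [{"id": "Qwen2-72B-Instruct"}, {"id": "Mixtral-8x7B-Instruct-v0.1"}]}
print(model_ids(sample_listing))
```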

For more detailed information, please refer to our Swagger documentation.

Updated on August 28, 2024

Copyright 2025 Infermatic. All rights reserved.