Unleash your creativity
Check out our currently supported models below. We support the best, highest-rated models available and update the selection frequently. Sign up to receive a monthly update about our latest models and other news, and join our Discord server to ask questions or request additional models.
TOP LLMS
Sao10K/
L3.1-70B-Euryale-v2.2
Coherent, emotional, and very creative.
- RP, Storywriting.
- FP8 Dynamic
- L3.1-70B-Euryale-v2.2-FP8-Dynamic
- Context: 16K
- RP Instruct: https://files.catbox.moe/1c9sp0.json
- RP Context: https://files.catbox.moe/5wwpin.json
Settings provided by: ShotMisser64
rAIfle/
SorcererLM-8x22b-bf16
- RP
- BF16
- rAIfle-SorcererLM-8x22b-bf16
- Context: 16K
- Recommended Settings: https://files.catbox.moe/9tj7m0.json
Anthracite-org/
Magnum-72b-v4
This model has strong spatial awareness, memory, and detailed descriptions that keep generations entertaining, with very good creativity and NSFW capability.
- RP, Storywriting.
- FP8 Dynamic
- anthracite-org-magnum-v4-72b-FP8-Dynamic
- Context: 32K
- Preset: https://files.catbox.moe/rqei05.json
- RP Instruct: https://files.catbox.moe/btnhau.json
- RP Context: https://files.catbox.moe/7kct3f.json
Settings provided by: GERGE
ALL MODELS
meta-llama/
Llama-3.2-11B-Vision-Instruct
- BF16
- GP
- Llama-3.2-11B-Vision-Instruct-Turbo
- Context: 128K
Mistralai/
Mixtral-8x7B-Instruct-v0.1
- BF16
- GP
- Mixtral-8x7B-Instruct-v0.1
- Context: 32K
Qwen/
Qwen2-72B-Instruct
- BF16
- GP
- Qwen2-72B-Instruct
- Context: 32K
TheDrummer/
Rocinante-12B-v1.1
- RP
- BF16
- TheDrummer-Rocinante-12B-v1.1
- Context: 32K
Sao10K/
L3.1 70B Hanami x1
- GP, RP
- FP16
- Sao10K-L3.1-70B-Hanami-x1
- Context: 32K
Envoid/
Llama3 TenyxChat DaybreakStorywriter 70B
- FP8 Dynamic
- RP
- Llama-3-TenyxChat-DaybreakStorywriter-70B-fp8-dynamic
- Context: 16K
- RP Instruct: https://files.catbox.moe/i3z4wv.json
- RP Context: https://files.catbox.moe/1k8p5b.json
Settings provided by: ShotMisser64
Alpindale/
WizardLM 2 8x22B
- RP
- Storywriting
- General purpose
- WizardLM-2-8x22B
- Context: 16K
- Context template: Alpaca
- RP Instruct: https://files.catbox.moe/q0a07u.json
- RP Context: https://files.catbox.moe/68194o.json
Settings provided by: GERGE
Sophosympatheia/
Midnight Miqu 70B v1.5
- RP
- FP16
- Midnight-Miqu-70B-v1.5
- Context: 18K
- Preset: https://files.catbox.moe/l8e5zt.json
- RP Instruct: https://files.catbox.moe/eaj6gy.json
- RP Context: https://files.catbox.moe/mvn3jo.json
Settings provided by: ShadingCrawler
nvidia/
Llama-3.1-Nemotron-70B-Instruct-HF
- GP
- BF16
- nvidia-Llama-3.1-Nemotron-70B-Instruct-HF
- Context: 32K
- Recommended Settings: https://files.catbox.moe/7e6zjo.json
Qwen/
Qwen2.5-72B-Instruct
- GP
- FP8
- Qwen2.5-72B-Instruct-Turbo
- Context: 32K
TheDrummer/
UnslopNemo-12B-v4.1
- RP
- BF16
- Context: 32K
- Recommended Settings: https://files.catbox.moe/7e6zjo.json
Infermatic/
MN-12B-Inferor-v0.0
All the qualities of the best models merged into one.
- RP
- More info: https://huggingface.co/Infermatic/MN-12B-Inferor-v0.0
- Context: 32K
- Settings & Review: https://infermatic.ai/infermatic-mn-12b-inferor-v0-0/
Qwen/
QwQ-32B-Preview
An experimental, reasoning-focused preview model from the Qwen team.
- RP
- More info: https://huggingface.co/Qwen/QwQ-32B-Preview
- Context: 32K
NousResearch/
Hermes-3-Llama-3.1-70B
A generalist Llama 3.1 70B fine-tune from Nous Research, with improved roleplay, reasoning, and instruction following.
- RP
- More info: https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-70B-FP8
- Context: 64K
Anthracite-org/
magnum-v2-72b
This model is fine-tuned on top of Qwen-2 72B Instruct.
- More info: https://huggingface.co/anthracite-org/magnum-v2-72b
- FP8: https://huggingface.co/Infermatic/magnum-v2-72b-FP8-Dynamic
- Context: 32K
meta-llama/
Llama-3.3-70B-Instruct
Meta's Llama 3.3 is a multilingual large language model (LLM): a pretrained, instruction-tuned generative model at the 70B size (text in / text out).
Sao10K/
L3.3-70B-Euryale-v2.3
A direct replacement for and successor to Euryale v2.2.
Sao10K/
72B-Qwen2.5-Kunou-v1
Another version of Euryale, this time built on a Qwen2.5 base model.
New!
inflatebot/
MN-12B-Mag-Mell-R1
A merge of pre-trained Mistral Nemo language models.
- More info: https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1
- Context: 32K
New!
EVA-UNIT-01/
EVA-Qwen2.5-72B-v0.1
An RP/storywriting specialist model: a full-parameter finetune of Qwen2.5-72B on a mixture of synthetic and natural data.
- Context: 32K
- Settings (see the example request below):
  - Temperature: 1
  - Min-P: 0.05
  - Top-A: 0.2
  - Repetition Penalty: 1.03
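For reference, here is a minimal sketch of how these sampler values could be sent to an OpenAI-compatible chat completions endpoint. The base URL, API key, and model identifier are placeholders, and the extended sampler fields (min_p, top_a, repetition_penalty) are assumptions that only work if the serving backend accepts them; check the API Docs below and your frontend's sampler panel for the authoritative names.

```python
# Illustrative only: the endpoint, key, and model name are placeholders,
# and the extended sampler fields depend on backend support.
import requests

API_BASE = "https://api.example.com/v1"  # placeholder: see the API Docs for the real base URL
API_KEY = "YOUR_API_KEY"                 # placeholder API key

payload = {
    "model": "EVA-Qwen2.5-72B-v0.1",  # model name as listed above (may differ on the backend)
    "messages": [
        {"role": "system", "content": "You are a creative storywriting assistant."},
        {"role": "user", "content": "Continue the scene at the abandoned lighthouse."},
    ],
    # Recommended settings from the entry above
    "temperature": 1.0,
    "min_p": 0.05,
    "top_a": 0.2,
    "repetition_penalty": 1.03,
    "max_tokens": 512,
}

resp = requests.post(
    f"{API_BASE}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```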
GUIDES & SETTINGS
Models
Meet L3 70B Euryale v2.1: Your New Creative Companion
What is L3 70B Euryale v2.1 [...]
Aug 20
Guides
Using Infermatic.ai API with SillyTavern
SillyTavern is one of the most popular interfaces to interact with LLMs. We have been [...]
Jun 21
Docs
API Docs
Frequently Asked Questions from Geek to Geek
- What is prompt engineering, and why is it critical in working with LLMs?
- How can I design effective prompts for LLMs?
- What are some standard techniques used in prompt engineering?
- How does prompt length impact the output of an LLM?
- How do LLMs understand and generate human-like text?
- What is the difference between Llama, Mixtral, and Qwen?
- What are some examples of advanced use cases of prompt engineering with LLMs?
- How do I choose the best LLM model for my project?
- What are large language models, and how do they differ from traditional NLP models?
- Can LLMs write code well?