DeepSeek & MiniMax LLM API

DeepSeek R1 and MiniMax-Text-01 • API Showcase

3 min read • March 11, 2025

Introduction
Real-time Web Search
Image recognition and analysis
Document processing using a 1M context
Conclusion

Introduction

This article showcases functionality provided by the MiniMax API v1 endpoint https://api.useapi.net/v1/minimax/llm.
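As a rough illustration of what a call to this endpoint could look like, the sketch below prepares a POST request without sending it. Only the URL comes from this article. The JSON field names (`prompt`, `model`), the Bearer authorization header, and the placeholder model id are assumptions made for the example, so check the API documentation for the actual request schema.

```python
import json
import urllib.request

API_URL = "https://api.useapi.net/v1/minimax/llm"  # endpoint named in this article


def build_llm_request(token: str, prompt: str, model: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for the LLM endpoint.

    ASSUMPTION: the "prompt" and "model" field names and the Bearer
    authorization header are illustrative guesses, not confirmed here.
    """
    body = json.dumps({"prompt": prompt, "model": model}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values only; nothing is sent over the network here.
request = build_llm_request("YOUR_API_TOKEN", "Hello!", "hypothetical-model-id")
print(request.get_method(), request.full_url)
```

Passing the prepared object to `urllib.request.urlopen(request)` would send it. Keeping construction separate from sending makes the payload easy to inspect first.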

Visit 🚀 Chat with AI to try a full-featured chat demo that exercises the complete LLM functionality available via our API. Its full source code is included.

You can also refer to the source code of our 🤖 Ask AI support bot.

Real-time Web Search

POST minimax/llm
Prompt

Compose JSON {percent, useragent, system}[] of top 10 user agents published at https://techblog.willshouse.com/2012/01/03/most-common-user-agents/

Response: DeepSeek R1

💡Thinking… Okay, I need to compose a JSON array of the top 10 user agents from the provided document. The user specified that each entry should have percent, useragent, and system fields. Let me check the document content carefully.

Looking at the HTML content, there’s a table with columns percent, useragent, system, and user_agent_string_md5. The top entries start at 18.2% for Chrome 133.0 on Win10. I should list them in order as they appear in the table. The first row is 18.2%, then 11.1%, 7.8%, and so on until the tenth row, which is 1.6% for Firefox 135.0 on Linux (X11 Ubuntu). Each entry in the JSON should include the exact useragent string, the percentage, and the system description from the “system” column. I need to make sure that the JSON formatting is correct, using quotes properly and escaping any necessary characters. Also, check that there are exactly 10 entries. Let me count them from the table. Yes, rows 1 to 10 in the table are the top 10. Now, I’ll structure each JSON object accordingly, ensuring the fields match the required keys. Need to verify the useragent strings to ensure they are correctly copied, especially any underscores or special characters. For example, the macOS entries have underscores in OS X versions. Also, check for any trailing spaces or formatting issues in the system descriptions. Once all entries are correctly transcribed, compile them into a JSON array. That should do it.

[
  {
    "percent": "18.2%",
    "useragent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36",
    "system": "Chrome 133.0 Win10"
  },
  {
    "percent": "11.1%",
    "useragent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36",
    "system": "Chrome 133.0 macOS"
  },
  {
    "percent": "7.8%",
    "useragent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/132.0.0.0 Safari/537.36",
    "system": "Chrome 132.0 Win10"
  },
  {
    "percent": "7.1%",
    "useragent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:135.0) Gecko/20100101 Firefox/135.0",
    "system": "Firefox 135.0 Win10"
  },
  {
    "percent": "4.1%",
    "useragent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36 Edg/133.0.0.0",
    "system": "Edge 133.0 Win10"
  },
  {
    "percent": "3.5%",
    "useragent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36",
    "system": "Chrome 133.0 Linux"
  },
  {
    "percent": "3.2%",
    "useragent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/132.0.0.0 Safari/537.36",
    "system": "Chrome 132.0 macOS"
  },
  {
    "percent": "3.0%",
    "useragent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:135.0) Gecko/20100101 Firefox/135.0",
    "system": "Firefox 135.0 macOS"
  },
  {
    "percent": "2.8%",
    "useragent": "Mozilla/5.0 (X11; Linux x86_64; rv:135.0) Gecko/20100101 Firefox/135.0",
    "system": "Firefox 135.0 Linux"
  },
  {
    "percent": "2.4%",
    "useragent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/18.3 Safari/605.1.15",
    "system": "Safari Generic macOS"
  }
]
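Because the prompt asks for a specific JSON shape, it can be worth checking the reply before using it. The sketch below is one way to do that, assuming the JSON array has already been separated from the model's thinking text. The function names are hypothetical helpers, not part of the API.

```python
import json

REQUIRED_KEYS = {"percent", "useragent", "system"}


def validate_user_agents(raw: str, expected_count: int = 10) -> list:
    """Parse the reply and check it matches the shape the prompt asked for."""
    entries = json.loads(raw)
    if not isinstance(entries, list):
        raise ValueError("expected a JSON array")
    if len(entries) != expected_count:
        raise ValueError(f"expected {expected_count} entries, got {len(entries)}")
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {index} is not a JSON object")
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            raise ValueError(f"entry {index} is missing keys: {sorted(missing)}")
    return entries


def percent_to_float(value: str) -> float:
    """Convert a string such as "18.2%" to the number 18.2."""
    return float(value.rstrip("%"))


# Two entries in the same shape as the response above, shortened for brevity.
sample = json.dumps([
    {"percent": "18.2%", "useragent": "Mozilla/5.0 (example)", "system": "Chrome 133.0 Win10"},
    {"percent": "11.1%", "useragent": "Mozilla/5.0 (example)", "system": "Chrome 133.0 macOS"},
])
entries = validate_user_agents(sample, expected_count=2)
shares = [percent_to_float(entry["percent"]) for entry in entries]
```

A check like this catches truncated or malformed output early, which matters when the model is transcribing data from a live web page.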

Image recognition and analysis

The LLM API offers advanced RAG (Retrieval-Augmented Generation) functionality. The following file formats are currently supported: txt, docx, doc, pdf, ppt, pptx, xls, xlsx, png, jpeg, jpg, webp, svg, heif, and tiff. The maximum file size is 100 MB.
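The format and size limits above can be checked on the client before uploading anything. The following is a minimal sketch. The helper name is hypothetical, and reading "100 MB" as 100 × 1024 × 1024 bytes is an assumption, since the exact byte count is not stated here.

```python
from pathlib import Path

# Extensions listed in this article as supported.
SUPPORTED_EXTENSIONS = {
    "txt", "docx", "doc", "pdf", "ppt", "pptx", "xls", "xlsx",
    "png", "jpeg", "jpg", "webp", "svg", "heif", "tiff",
}

# ASSUMPTION: "100 MB" is interpreted as 100 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 100 * 1024 * 1024


def check_attachment(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file would not meet the documented limits."""
    extension = Path(filename).suffix.lstrip(".").lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported file format: {extension or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError(f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES}")


check_attachment("photo.JPG", 2_500_000)  # passes: extension match is case-insensitive
```

For a file on disk, `Path(path).stat().st_size` gives the value to pass as `size_bytes`.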

POST minimax/llm
Prompt

Provide detailed description of both images. List similarities and differences.

[Attachments: two image files]

Response: DeepSeek R1

💡Thinking… Okay, let me start by looking at the two images the user provided. First, I need to parse both descriptions carefully. The first image is a person holding a sign in a mountainous outdoor setting. The background has mountains with greenery and rocks, a clear blue sky. The person is wearing a black hoodie, has a blonde ponytail, hoop earrings, and holds a cardboard sign with “I ❤️ Riffussion API” in black text and a red heart.

The second image is a person in ancient Greek attire between two classical columns. The attire is a cream gown with gold details, gold accessories including a headpiece, necklace, and bracelets. The pose has one hand on a column, the other relaxed. The columns are fluted and weathered. Lighting is dramatic with a dark background creating contrast.

Now, I need to list similarities and differences. Let me start with similarities. Both images feature a single person as the main subject. They both have the person holding or interacting with an object: the first holds a sign, the second touches a column. The color gold appears in both: the earrings in the first image (hoop earrings, probably gold?) and the accessories and details in the second. Wait, the first image’s earrings are described as “small hoop earrings,” but the color isn’t mentioned. The user specified the text on the sign: red heart and black text, but maybe the earrings could be assumed as gold, but I should check if that’s explicitly stated. Looking back: no, the first image’s earrings aren’t described as gold. So maybe that’s not a similarity. Let me correct that.

Wait, the first image’s person has “small hoop earrings” but no color mentioned. The second has gold accessories. So that’s a difference. Maybe another similarity: both have outdoor or architectural elements in the background—mountains vs. columns. Both images have a clear focus on the person and their attire or message. Perhaps both have a contrast in their backgrounds: the first is a bright mountain scene, the second a dark background, but the contrast is used in both to highlight the subject.

Now, differences. Setting: first is mountainous and natural, the second is classical columns in a dark, possibly indoor or staged environment. Attire: casual modern clothing (hoodie) vs. ancient Greek-style elaborate gown. The person’s pose: holding a sign vs. leaning on a column. The text on the sign vs. no text in the second image. Lighting: natural light in the first, dramatic artificial lighting in the second. Accessories: minimal (earrings) vs. extensive gold jewelry. The sign in the first is a message, while the second is more about aesthetic and classical representation. The columns show wear and tear, whereas the mountain setting is more pristine. The colors in the first image are black, red, and blue skies, while the second uses cream, gold, and dark backgrounds. Textures: the first has the texture of cardboard and natural elements, the second has flowing fabric and weathered stone.

Let me make sure I don’t miss any points. Also, gender isn’t specified for either person, so I should avoid assuming. The first image’s person has long blonde hair, the second’s hair is covered or styled under the headpiece. The purposes might differ: one with a personal message, the other perhaps artistic or historical reenactment.

Now, compiling all that into structured similarities and differences, ensuring clarity and that I don’t mix up details. Need to check if all points are accurate and not conflated. For example, the background in the first is outdoor natural, the second maybe an indoor set or studio with columns. The key elements to compare are setting, subject’s attire, objects they interact with, color schemes, lighting, accessories, and overall theme.

Image Descriptions

Image 1:
A person in casual modern attire (black hoodie, blonde ponytail, hoop earrings) stands outdoors in a mountainous setting holding a cardboard sign reading “I ❤️ Riffussion API.” The background features rugged, greenery-dotted mountains under a clear blue sky.

Image 2:
A person in elaborate ancient Greek-inspired attire (cream gown with gold accents, gold headpiece, necklace, and bracelets) poses between weathered fluted columns. The scene has dramatic lighting against a dark background, emphasizing classical aesthetics.

Similarities

Central Subject: Both images focus on a single person as the primary subject.
Interaction with Objects:
- Image 1: Holding a sign.
- Image 2: Hand resting on a column.
Contrasting Backgrounds:
- Image 1: Bright natural backdrop (sky, mountains).
- Image 2: Dark, shadowy backdrop.
Metallic Accents:
- Image 1: Small hoop earrings (implied metallic sheen).
- Image 2: Gold jewelry and embellishments.

Differences

Category	Image 1	Image 2
Setting	Outdoor mountain landscape	Indoor/staged classical architecture
Attire	Modern casual (hoodie, ponytail)	Ancient Greek-style gown and regalia
Lighting	Natural daylight	Dramatic artificial lighting
Color Palette	Bright (blue sky, red heart, black)	Earthy tones (cream, gold, dark)
Text/Message	Explicit text on sign (“Riffussion”)	No text; purely aesthetic focus
Accessories	Minimal (earrings)	Elaborate (headpiece, necklace, bracelets)
Mood	Informal, personal expression	Theatrical, historical elegance
Textures/Details	Cardboard sign, rocky terrain	Fluted columns, flowing fabric

Key Contrasts

Theme: Image 1 emphasizes individuality and modernity (text message), while Image 2 evokes classical artistry and grandeur.
Environment: Natural wilderness vs. constructed architectural drama.
Visual Intent: Image 1 feels candid and relatable; Image 2 is stylized and symbolic.

Document processing using a 1M context

We created an altered version of The Iliad by Homer.

The text is about 800 KB in size and contains just over 200K tokens. We can analyze it with the MiniMax-Text-01 model, which has a 1M-token context window.
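The figures above allow a quick back-of-the-envelope check of how much of the context window the document occupies. The sketch below uses the rounded numbers from this article (800 KB, 200K tokens, 1M context), so the results are approximate. The `reserved_for_reply` parameter is a hypothetical allowance for the prompt and the model's answer.

```python
TEXT_BYTES = 800 * 1024       # "about 800 KB"
TEXT_TOKENS = 200_000         # "just over 200K tokens", rounded down
CONTEXT_WINDOW = 1_000_000    # 1M-token context window

bytes_per_token = TEXT_BYTES / TEXT_TOKENS    # roughly 4 bytes per token
context_share = TEXT_TOKENS / CONTEXT_WINDOW  # fraction of the window used


def fits_in_context(token_count: int,
                    context_window: int = CONTEXT_WINDOW,
                    reserved_for_reply: int = 0) -> bool:
    """Return True if the document leaves room for the reserved token budget."""
    return token_count + reserved_for_reply <= context_window


print(f"{bytes_per_token:.2f} bytes/token, {context_share:.0%} of the window")
```

By this estimate the altered text uses about a fifth of the window, leaving ample room for the prompt and the response.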

Alterations include:

Renaming all occurrences of Achilles to Sarrah.
Adding the following text at line #11000:

The TERMINATOR arrives from the future year 2029.
The TERMINATOR hunts and finds Sarrah.
The long-lasting battle between Sarrah and TERMINATOR takes place.
The TERMINATOR kills Sarrah and assumes his identity.
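The two alterations described above could be reproduced with a short script such as the one below. This is a sketch rather than the script actually used. It assumes line numbers are 1-based, that the new text should begin on the given line, and that a shorter input simply gets the text appended at the end.

```python
INSERTED_LINES = [
    "The TERMINATOR arrives from the future year 2029.",
    "The TERMINATOR hunts and finds Sarrah.",
    "The long-lasting battle between Sarrah and TERMINATOR takes place.",
    "The TERMINATOR kills Sarrah and assumes his identity.",
]


def alter_text(text: str, insert_at_line: int = 11000) -> str:
    """Rename Achilles to Sarrah and insert the extra lines.

    ASSUMPTION: insert_at_line is 1-based and the inserted text starts on
    that line; if the text is shorter, the lines are appended at the end.
    """
    lines = text.replace("Achilles", "Sarrah").split("\n")
    index = min(insert_at_line - 1, len(lines))
    lines[index:index] = INSERTED_LINES
    return "\n".join(lines)


# Tiny stand-in for the real text, with the insertion point moved to line 3.
sample = "\n".join(["Achilles fights."] * 5)
altered = alter_text(sample, insert_at_line=3)
```

Planting facts that cannot appear in any published copy of the poem makes it possible to tell whether the model is reading the supplied document or answering from its training data.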

POST minimax/llm
Prompt