Rate Limits & Costs

Understanding and managing your API usage is crucial for a smooth and cost-effective experience with VJSP. This section explains how to track your Token usage and costs, and how to configure rate limits.

Token Usage

VJSP uses Tokens to interact with AI models. A Token is a small unit of text, typically a whole word or a fragment of one. The number of Tokens in requests and responses affects both processing time and cost.

Input Tokens: These are the Tokens in your prompt, including system prompts, your instructions, and any context provided (e.g., file contents).
Output Tokens: These are the Tokens generated by the AI model in its response.

You can view the number of input and output Tokens for each interaction in your chat history.
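
As a rough illustration of how input and output Tokens combine into a cost, the sketch below multiplies each count by a per-Token price. The `TokenPricing` type, the `estimate_cost` function, and the prices are all hypothetical placeholders chosen for this example; they are not VJSP settings or any provider's actual rates, so substitute the figures published by your model provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Hypothetical prices in USD per one million Tokens (placeholders)."""
    input_per_million: float
    output_per_million: float


def estimate_cost(input_tokens: int, output_tokens: int, pricing: TokenPricing) -> float:
    """Estimate the cost of one interaction from its Token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("Token counts must be non-negative")
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# Placeholder prices: input and output Tokens are assumed to be billed differently.
pricing = TokenPricing(input_per_million=3.0, output_per_million=15.0)

# Token counts as you might read them from one entry in the chat history.
cost = estimate_cost(input_tokens=12_000, output_tokens=800, pricing=pricing)
print(f"Estimated cost: ${cost:.4f}")
```

With these placeholder prices, the input side accounts for most of the total, which is why trimming prompt context tends to matter for cost.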

Configuring Rate Limits

To prevent accidental overuse of the API and help you manage costs, VJSP allows you to set rate limits. A rate limit defines the minimum time (in seconds) between API requests.

How to configure:

Open VJSP Settings (⚙️ icon in the top-right corner).
Navigate to the Model Configuration > Advanced Settings section.
Locate the API Request Frequency Limit setting.
Select your desired delay in seconds. A value of 0 disables the rate limit.

Example:

If you set the rate limit to 10 seconds, VJSP will wait at least 10 seconds after one API request completes before sending the next.
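
The behavior described above (a minimum delay measured from the completion of one request, with 0 disabling the limit) can be sketched as follows. This is an illustrative model only, not VJSP's actual implementation; the `MinDelayLimiter` class and its parameters are names invented for this example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class MinDelayLimiter:
    """Enforce a minimum delay between the end of one request and the start of the next."""

    def __init__(
        self,
        delay_seconds: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if delay_seconds < 0:
            raise ValueError("delay_seconds must be non-negative")
        self.delay_seconds = delay_seconds
        self._clock = clock
        self._sleep = sleep
        self._last_completed: Optional[float] = None

    def call(self, request: Callable[[], T]) -> T:
        # A delay of 0 disables the limit; the first request never waits.
        if self.delay_seconds > 0 and self._last_completed is not None:
            elapsed = self._clock() - self._last_completed
            remaining = self.delay_seconds - elapsed
            if remaining > 0:
                self._sleep(remaining)
        try:
            return request()
        finally:
            # Record the completion time even if the request raised.
            self._last_completed = self._clock()


# Short delay so the demonstration finishes quickly.
limiter = MinDelayLimiter(delay_seconds=0.05)
for i in range(3):
    limiter.call(lambda: print(f"request {i} sent"))
```

Because the wait is measured from when a request completes rather than when it starts, a long-running request does not shorten the pause before the next one.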

Tips for Optimizing Token Usage

Be Concise: Use clear and concise language in your prompts. Avoid unnecessary words or details.
Provide Only Relevant Context: Use context mentions (@file.ts, @folder/) selectively. Only include files directly related to the task.
Break Down Tasks: Split large tasks into smaller, focused sub-tasks.
Use Custom Instructions: Provide custom instructions to guide VJSP's behavior, reducing the need for lengthy explanations in every prompt.
Choose the Right Model: Some models are more cost-effective than others. For tasks that don't require the full capabilities of a larger model, consider using a smaller, faster one.
Use Modes: Different modes grant access to different tools. For example, Architect mode cannot modify code, making it a safe choice for analyzing complex codebases without risking expensive operations.
Disable MCP if Unused: If you are not using MCP (Model Context Protocol) features, consider disabling it in the MCP settings to significantly reduce the size of the system prompt and save Tokens.
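
To judge which pieces of context are worth including, it can help to estimate their size before attaching them. The sketch below uses a rough rule of thumb of about four characters per Token for English text; that ratio is an assumption that varies by model and content, and the function names are hypothetical. The Token counts VJSP shows in your chat history remain the authoritative figures.

```python
import math

# Assumption: a rough rule of thumb for English text, not an exact tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough Token estimate for a piece of text."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def rank_context(files: dict[str, str]) -> list[tuple[str, int]]:
    """List candidate context files from largest to smallest estimated Token count."""
    estimates = [(name, estimate_tokens(body)) for name, body in files.items()]
    return sorted(estimates, key=lambda item: item[1], reverse=True)


# Hypothetical file contents standing in for files you might mention in a prompt.
candidates = {
    "utils.ts": "export const add = (a: number, b: number) => a + b;\n" * 40,
    "README.md": "# Project\n\nShort description.\n",
}
for name, tokens in rank_context(candidates):
    print(f"{name}: ~{tokens} Tokens")
```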
