Official Developer Guide & Specification

Welcome to the complete language reference and learning portal for the INTHON programming language layer.

INTHON (Intelligent + Python) is a domain-specific language layer designed for AI-native workflows. By representing agent execution intent as structured, deterministic code rather than unstructured natural language or verbose JSON/XML, INTHON reduces token footprint, validates tool schemas statically, and confines execution to a capability sandbox.

This guide serves as both a step-by-step learning course and a technical manual, detailing language mechanics, compiler behavior, scoping rules, and safety boundaries.

The Learning Path

Follow these modules in sequence to master INTHON:

1. Getting Started & Tooling

Prerequisites, environment setup, CLI reference, and configuring project settings inside inthon.toml.

2. Scoping, Operators & Closures

Variable scoping, typing details, mathematical operators, lexical closures, and implicit return pipelines.

3. Agents, Tools & Policies

Creating agent blocks, configuring goals, static tool validation, and capability policy constraints.

4. PyBridge Sandbox & Security

Module allowlists, custom import hook interception, and secure proxy wrapper mechanics.

5. Advanced Primitives & Resiliency

Human-in-the-loop approval gates, semantic memory stores, and exponential backoff retry math.

6. Production Templates & Patterns

Complete, production-ready boilerplates for web scrapers, data pipelines, payment gates, and memory-driven agents.

Why Learn INTHON?

Traditional agent architectures rely on Large Language Models (LLMs) outputting fragile JSON schemas or raw code blocks to trigger actions. These approaches lead to:

  1. Token Bloat: Verbose JSON syntax consumes excessive tokens, increasing prompt costs.
  2. Side-Effect Risks: Running raw Python or shell code exposes the host filesystem and network to compromise.
  3. Audit Difficulty: Non-deterministic agent loops cannot easily be replayed or restricted.

INTHON solves these issues by introducing an optimized domain grammar, capability sandboxing, and deterministic JSON trace logging.

Part 1: Getting Started & Tooling

Set up your local environment, run your first program, configure project metadata, and explore CLI compiler commands.

1. Prerequisites

Ensure your local host has Python version 3.11 or higher and the `pip` package manager installed. Verify your environment:

python --version
pip --version

2. Setup & Installation

Clone the INTHON repository and install it in editable (developer) mode inside a virtual environment, which makes the inthon CLI available whenever that environment is active:

# Clone the repository
git clone https://github.com/harvatechs/inthon.git
cd inthon

# Set up virtual environment
python -m venv .venv
# On Windows:
.venv\Scripts\activate
# On Unix/macOS:
source .venv/bin/activate

# Install with development dependencies (the quotes keep shells such as zsh from globbing the brackets)
pip install -e ".[dev,data,ml]"

Verify that the command line tool is available on the PATH of the active environment:

inthon --help

3. Configuring inthon.toml

A project is defined by an inthon.toml file in the working directory. This file configures type checking sensitivity, default sandbox quotas, and safety permissions:

# inthon.toml
[project]
name = "inthon-default"
version = "0.1.0"
description = "Default INTHON configuration"
entry = "main.inth"

[inthon]
runtime = "python"
version = ">=0.1.0"

[type_checking]
mode = "strict" # options: "strict", "warn", "none"

[permissions]
network = false        # Controls default raw network socket operations
filesystem = "read_only" # options: "read_write", "read_only", "none"
shell = false          # Blocks subprocess execution
payment = false        # Blocks financial transactions by default
memory_persist = true  # Enables episodic SQL memory persistence

[pybridge]
allowed_modules = ["numpy", "pandas"] # Allowlisted Python modules for imports

[sandbox]
max_runtime_sec = 300  # Hard timeout duration
max_cost_usd = 1.0     # Maximum LLM cost cap allowed
max_tool_calls = 50    # Call volume quota

[trace]
enabled = true
output_dir = ".inthon/traces"

4. Your First Program: hello.inth

Create a file named hello.inth in your text editor:

// hello.inth
// INTHON supports single-line comments using double-slashes

fn greet(name: str) -> str {
    return "Hello, " + name + "!"
}

let message = greet("INTHON Developer")
message

5. Executing code

Use the CLI to run the program via the AST-walking interpreter:

inthon run hello.inth

Output:

Hello, INTHON Developer!

6. CLI Tooling Reference

The INTHON CLI provides a complete set of utilities for development, linting, formatting, and inspecting compilation steps.

Command  Usage                            Description
run      inthon run <file.inth>           Executes the file in the sandboxed runtime. Supports cost caps and trace exports.
check    inthon check <file.inth>         Lints and statically verifies type safety and tool references without executing. Displays compile-time errors.
fmt      inthon fmt <file.inth> --write   Formats spacing, brackets, and newlines. --write updates the file in place.
ast      inthon ast <file.inth>           Prints the parsed Abstract Syntax Tree as a structured JSON object.
ir       inthon ir <file.inth>            Prints the lowered bytecode Intermediate Representation serialized as JSON.

Statically Checking Variables & Types

Before executing in production, always validate your scripts using the linter:

inthon check hello.inth

Specifying Execution Budgets

You can override default interpreter cost limits directly from the CLI:

inthon run hello.inth --max-cost 0.50 --trace-out trace_log.json

AST and Bytecode IR Diagnostics

To inspect what the compiler produces, run the ir compiler pass:

inthon ir hello.inth

The lowered IR is serialized as JSON: a constants pool plus an ordered list of instructions, where each arg either indexes into the pool (LOAD_CONST) or names a variable (STORE_NAME, LOAD_NAME):

{
  "constants": ["Hello, ", "!", "INTHON Developer"],
  "instructions": [
    {"op": "LOAD_CONST", "arg": 2},
    {"op": "STORE_NAME", "arg": "name"},
    {"op": "LOAD_CONST", "arg": 0},
    {"op": "LOAD_NAME", "arg": "name"},
    {"op": "BINARY_ADD"},
    {"op": "LOAD_CONST", "arg": 1},
    {"op": "BINARY_ADD"},
    {"op": "RETURN_VALUE"}
  ]
}
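To make the instruction listing concrete, here is a minimal stack-machine sketch in Python that executes exactly the JSON shown above. It is not the INTHON VM; the run_ir function and its handling of the five opcodes are assumptions inferred from the opcode names alone.

```python
import json

IR_JSON = """
{
  "constants": ["Hello, ", "!", "INTHON Developer"],
  "instructions": [
    {"op": "LOAD_CONST", "arg": 2},
    {"op": "STORE_NAME", "arg": "name"},
    {"op": "LOAD_CONST", "arg": 0},
    {"op": "LOAD_NAME", "arg": "name"},
    {"op": "BINARY_ADD"},
    {"op": "LOAD_CONST", "arg": 1},
    {"op": "BINARY_ADD"},
    {"op": "RETURN_VALUE"}
  ]
}
"""


def run_ir(program: dict):
    """Illustrative interpreter: evaluate the instruction list on a value stack."""
    constants = program["constants"]
    stack = []   # evaluation stack
    names = {}   # variable bindings for this frame
    for instruction in program["instructions"]:
        op = instruction["op"]
        arg = instruction.get("arg")
        if op == "LOAD_CONST":
            stack.append(constants[arg])   # arg is an index into the constants pool
        elif op == "STORE_NAME":
            names[arg] = stack.pop()       # arg is a variable name
        elif op == "LOAD_NAME":
            stack.append(names[arg])
        elif op == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return None


result = run_ir(json.loads(IR_JSON))
print(result)  # Hello, INTHON Developer!
```

Tracing the stack by hand gives the same string that inthon run hello.inth prints, which is a quick way to sanity-check an IR dump.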

Part 2: Scoping, Operators & Closures

Deep-dive into INTHON scoping rules, data types, operator precedence, control flow statements, and call frame stacks.

1. Block Scoping (let and const)

INTHON implements strict block scoping. Variables declared within a block {} are invisible outside of it.

Variable Scope Lifespans

  • Mutable binding (let): Can be declared with or without a type annotation. The bound value can be reassigned.
  • Immutable binding (const): Must be initialized at declaration. The binding is read-only; assigning a new value to a constant fails at compile time.

let x = 10
const y: float = 3.14

if x > 5 {
    let z = "inner scope"
    x = x + 1 // OK: reassigning let variable
    // y = 2.71 // ERROR: Cannot reassign const variable 'y'
}
// z // ERROR: Variable 'z' is undefined in this scope

2. Complete Data Types Reference

INTHON features static type checking with structural support for primitive structures, functional objects, and generic containers.

Type             Description                                      Example Declaration
int              Signed 64-bit integer.                           let a: int = -42
float            64-bit double-precision floating-point number.   let b: float = 0.005
str              UTF-8 text string.                               let c: str = "agent-1"
bool             Boolean value (true or false).                   let d: bool = true
list[T]          Dynamic list whose elements all have type T.     let e: list[str] = ["url1", "url2"]
dict[K, V]       Map from keys of type K to values of type V.     let f: dict[str, float] = {"rate": 0.82}
fn(A1, A2) -> R  First-class callable function type.              let g: fn(int) -> int = doubler

3. Operators & Precedence Table

Operators are evaluated according to a strict mathematical hierarchy. Parentheses () should be used to override defaults:

Level        Operators      Description                             Associativity
1 (Highest)  !, - (unary)   Logical negation, numeric negation      Right-to-left
2            *, /           Multiplication, division                Left-to-right
3            +, -           Addition / concatenation, subtraction   Left-to-right
4            <, <=, >, >=   Relational comparison                   Left-to-right
5            ==, !=         Equality, inequality                    Left-to-right
6            &&             Logical AND                             Left-to-right
7 (Lowest)   ||             Logical OR                              Left-to-right
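The table can be read as a recipe for a precedence-climbing parser. The Python sketch below is an illustration only, not the INTHON parser: it tokenizes a small expression and returns a fully parenthesized string so the grouping implied by each level is visible. The tokenizer, the group function, and the restriction to numeric and boolean literals are all assumptions.

```python
import re

# Binary operator levels copied from the table: a smaller number binds tighter.
BINARY_LEVELS = {
    "*": 2, "/": 2,
    "+": 3, "-": 3,
    "<": 4, "<=": 4, ">": 4, ">=": 4,
    "==": 5, "!=": 5,
    "&&": 6,
    "||": 7,
}
LOWEST_LEVEL = 7

TOKEN_RE = re.compile(r"\s*(\d+\.\d+|\d+|true|false|&&|\|\||[<>=!]=|[-+*/<>!()])")


def tokenize(source: str) -> list:
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_unary(tokens: list) -> str:
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token in ("!", "-"):
        # Level 1 is right-to-left: the operand may itself start with a unary operator.
        return f"({token}{parse_unary(tokens)})"
    if token == "(":
        inner = parse(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
        return inner
    return token  # numeric or boolean literal


def parse(tokens: list, level: int = LOWEST_LEVEL) -> str:
    """Parse operators at `level` and tighter, returning a parenthesized string."""
    if level == 1:
        return parse_unary(tokens)
    left = parse(tokens, level - 1)
    # Left-to-right associativity: keep folding while the next operator is at this level.
    while tokens and BINARY_LEVELS.get(tokens[0]) == level:
        op = tokens.pop(0)
        right = parse(tokens, level - 1)
        left = f"({left} {op} {right})"
    return left


def group(source: str) -> str:
    tokens = tokenize(source)
    result = parse(tokens)
    if tokens:
        raise SyntaxError(f"unexpected token {tokens[0]!r}")
    return result


print(group("1 + 2 * 3"))                 # (1 + (2 * 3))
print(group("10 - 4 - 3"))                # ((10 - 4) - 3)
print(group("!true || 1 < 2 && 3 == 3"))  # ((!true) || ((1 < 2) && (3 == 3)))
```

The second line shows left-to-right associativity at level 3, and the third shows && grouping before || even though || appears first in the source.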

4. Control Flow Structures

Control flow statements select which branch of code executes. Blocks are delimited by braces {}, and conditions do not need surrounding parentheses.

Branching (if, elif, else)

let score = 85
let grade = ""

if score >= 90 {
    grade = "A"
} elif score >= 80 {
    grade = "B"
} else {
    grade = "C"
}

Iteration (while)

A while loop repeats its body for as long as the condition evaluates to true:

let i = 0
let sum = 0
while i < 10 {
    sum = sum + i
    i = i + 1
}

5. Functions & Closures

Functions are declared with fn, parameter names, type signatures, and an optional return type.

Activation Call Frames

When a function is called, the VM spawns a new frame containing the parameters. Variables in the outer lexical scope remain accessible to nested helper functions, forming closures.

Implicit Return Pipeline

If the last statement in a function body is an expression (not terminated by a semicolon or keyword), the VM automatically pops it from the evaluation stack and returns it as the function's output.

fn multiplier(factor: int) -> fn(int) -> int {
    fn inner(x: int) -> int {
        x * factor // Implicit return of evaluation
    }
    return inner
}

let double = multiplier(2)
let result = double(10) // Returns 20

Part 3: Agents, Tools & Policies

Learn how to declare structured agent containers, import capabilities, write custom tools, and apply sandbox constraints.

1. Structured Agent Container

The agent block defines an execution lifecycle. It encapsulates a goal directive, input and output interfaces, a sandboxing policy, and the plan containing the tool workflow code.

[Diagram: Policy Guard Enforcement. Agent execution plan instructions -> Policy Guard (validates cost and quotas) -> approved action -> protected OS / host.]

2. Declaring and Importing Tools

Tools are external service bindings registered on the host machine. To invoke them, you must import them at the file header using use tool:

use tool web.search
use tool file.write

This imports the structural parameter schema of the tool. The semantic analyzer uses these schemas to verify argument counts, types, and names prior to compilation, ensuring the LLM cannot emit malformed API structures.
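The kind of check described above can be sketched in a few lines of Python. Everything here is hypothetical: the ArgSpec class, the schema given for web.search (a required query string and an optional integer limit), and the check_call function are illustrative stand-ins, not the semantic analyzer's actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArgSpec:
    type_name: str
    required: bool = True


# Hypothetical schema for web.search, assumed for this example.
WEB_SEARCH_SCHEMA = {
    "query": ArgSpec("str"),
    "limit": ArgSpec("int", required=False),
}

PYTHON_TYPES = {"str": str, "int": int, "float": float, "bool": bool}


def check_call(schema: dict, args: dict) -> list:
    """Return a list of error strings; an empty list means the call is well formed."""
    errors = []
    for name in args:
        if name not in schema:
            errors.append(f"unknown argument '{name}'")
    for name, spec in schema.items():
        if name not in args:
            if spec.required:
                errors.append(f"missing required argument '{name}'")
            continue
        value = args[name]
        expected = PYTHON_TYPES[spec.type_name]
        # bool is a subclass of int in Python, so reject it explicitly for non-bool slots.
        bool_in_wrong_slot = isinstance(value, bool) and expected is not bool
        if bool_in_wrong_slot or not isinstance(value, expected):
            errors.append(
                f"argument '{name}' expects {spec.type_name}, got {type(value).__name__}"
            )
    return errors


print(check_call(WEB_SEARCH_SCHEMA, {"query": "superconductor", "limit": 3}))
print(check_call(WEB_SEARCH_SCHEMA, {"limit": "3"}))
```

Because the check needs only the schema and the call site, it can run before any tool is invoked, which is the point of importing schemas with use tool.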

3. Writing & Registering Custom Tools

Developers can build their own custom tool integrations. This involves two steps: writing the execution code in Python and registering it with the compiler's tool schema registry.

Step A: Write the Python Tool Logic

Write a standard Python function that handles inputs and executes the actions:

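# tools/custom_io.py
import os


def file_writer_impl(filepath: str, content: str) -> dict:
    """Writes content to a file inside the sandbox (current working) directory."""
    try:
        # Resolve symlinks and ".." segments before checking containment
        sandbox_root = os.path.realpath(os.getcwd())
        safe_path = os.path.realpath(filepath)
        # commonpath avoids the prefix bug where "/sandbox-evil" passes startswith("/sandbox")
        if os.path.commonpath([sandbox_root, safe_path]) != sandbox_root:
            return {"status": "error", "message": "Sandbox path violation"}

        with open(safe_path, "w", encoding="utf-8") as f:
            f.write(content)
        # Report the UTF-8 encoded size, not the character count
        return {"status": "success", "written_bytes": len(content.encode("utf-8"))}
    except Exception as e:
        return {"status": "error", "message": str(e)}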

Step B: Register Tool Schema & Bind Callbacks

Define specifications and register them using the ToolRegistry API in your application bootstrap routine:

# app.py
from inthon.tools.registry import ToolRegistry
from inthon.tools.schema import ToolSpec, ToolArgSchema, ToolCostModel
from tools.custom_io import file_writer_impl

def init_registry() -> ToolRegistry:
    registry = ToolRegistry()
    
    registry.register(
        spec=ToolSpec(
            name="file.write",
            description="Write text contents securely to local disk",
            input_schema={
                "filepath": ToolArgSchema(type="str", description="Target path relative to project"),
                "content": ToolArgSchema(type="str", description="UTF-8 plain text data to write"),
            },
            output_schema={"status": "str", "written_bytes": "int"},
            side_effects=["filesystem"],
            required_permissions=["allow_fs"],
            cost_model=ToolCostModel(base_usd=0.0, per_call_usd=0.001)
        ),
        impl=file_writer_impl
    )
    return registry

4. Complete Policy Reference

The policy block contains the execution boundaries. If code attempts to cross these limits, the runtime raises a PolicyViolationError.

Key                   Type   Default  Description
allow_network         bool   false    Controls outbound API calls made by tools and PyBridge.
max_tool_calls        int    0        Cap on total tool calls allowed per session.
max_cost_usd          float  0.00     Cap on cumulative LLM token cost, in USD.
allow_memory_persist  bool   false    Controls inserts into the SQLite episodic database.
allow_fs              bool   false    Controls local disk read and write privileges.
allow_payment         bool   false    Controls execution of transaction charges or Stripe API hooks.
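One way to picture how these keys interact is a small guard object that every tool call must pass through. The Python sketch below is an assumption-laden illustration: the Policy and PolicyGuard classes and the authorize method are invented here, and PolicyViolationError is redefined locally as a stand-in for the runtime error of the same name.

```python
from dataclasses import dataclass


class PolicyViolationError(Exception):
    """Local stand-in for the runtime error named above."""


@dataclass
class Policy:
    # Defaults mirror the reference table: everything is denied unless granted.
    allow_network: bool = False
    max_tool_calls: int = 0
    max_cost_usd: float = 0.0
    allow_memory_persist: bool = False
    allow_fs: bool = False
    allow_payment: bool = False


class PolicyGuard:
    """Hypothetical guard that tracks quotas across one session."""

    def __init__(self, policy: Policy):
        self.policy = policy
        self.tool_calls = 0
        self.cost_usd = 0.0

    def authorize(self, tool_name: str, required_permissions=(), cost_usd: float = 0.0):
        for permission in required_permissions:
            if not getattr(self.policy, permission):
                raise PolicyViolationError(f"{tool_name}: '{permission}' is not granted")
        if self.tool_calls + 1 > self.policy.max_tool_calls:
            raise PolicyViolationError(f"{tool_name}: max_tool_calls exceeded")
        if self.cost_usd + cost_usd > self.policy.max_cost_usd:
            raise PolicyViolationError(f"{tool_name}: max_cost_usd exceeded")
        # Only count the call once every check has passed.
        self.tool_calls += 1
        self.cost_usd += cost_usd


guard = PolicyGuard(Policy(allow_network=True, max_tool_calls=3, max_cost_usd=0.05))
for _ in range(3):
    guard.authorize("web.search", ["allow_network"], cost_usd=0.01)

try:
    guard.authorize("web.search", ["allow_network"], cost_usd=0.01)
except PolicyViolationError as err:
    print(err)  # web.search: max_tool_calls exceeded
```

Note that with the table's defaults (max_tool_calls of 0), the very first tool call is rejected, so an agent must raise the limits it needs explicitly.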

Agent Implementation Example

use tool web.search

agent Researcher {
    goal "Find room-temperature superconductor papers on arxiv"
    inputs {
        count: int
    }
    outputs {
        papers: list[dict]
    }
    policy {
        allow_network: true
        max_tool_calls: 3
        max_cost_usd: 0.05
    }
    plan {
        let results = web.search("superconductor", limit: count)
        return results
    }
}

Part 4: PyBridge Sandbox & Security

Import standard Python libraries safely. Leverage NumPy and Pandas under strict capability validation checks.

1. The Import Hook Filter (sys.meta_path)

To enforce sandbox safety, PyBridge intercepts module requests before they are loaded into the Python process. It installs a custom meta path finder in sys.meta_path. When a script contains use py.numpy, the hook validates the module name before the import is allowed to proceed.
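The mechanism can be sketched with the standard importlib machinery. The example below is a simplified illustration, not PyBridge's implementation: BlocklistFinder and sandboxed_imports are hypothetical names, PyBridgeError is redefined locally, and the finder vetoes a blocklist rather than enforcing a full allowlist, because an allowlist on the global sys.meta_path would also reject the standard library's own internal imports.

```python
import importlib
import importlib.abc
import sys
from contextlib import contextmanager


class PyBridgeError(ImportError):
    """Illustrative stand-in for the error raised on a blocked import."""


class BlocklistFinder(importlib.abc.MetaPathFinder):
    """Meta path finder that vetoes blocked modules and defers everything else."""

    def __init__(self, blocked):
        self.blocked = frozenset(blocked)

    def find_spec(self, fullname, path=None, target=None):
        top_level = fullname.partition(".")[0]
        if top_level in self.blocked:
            raise PyBridgeError(f"module '{fullname}' is not allowed in the sandbox")
        return None  # None lets the remaining finders handle the import


@contextmanager
def sandboxed_imports(blocked):
    """Install the finder at the front of sys.meta_path for the duration of a block."""
    finder = BlocklistFinder(blocked)
    sys.meta_path.insert(0, finder)
    try:
        yield finder
    finally:
        sys.meta_path.remove(finder)


# "forbidden_demo_module" is a made-up name, used so the demo never hits the sys.modules cache.
with sandboxed_imports({"subprocess", "forbidden_demo_module"}):
    try:
        importlib.import_module("forbidden_demo_module")
        outcome = "imported"
    except PyBridgeError as err:
        outcome = f"blocked: {err}"

print(outcome)  # blocked: module 'forbidden_demo_module' is not allowed in the sandbox
```

One caveat worth knowing: meta path finders are consulted only for modules that are not already in sys.modules, so a hook like this must be installed before any blocked module has been imported, or the cache must be handled separately.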

[Diagram: PyBridge Sandbox Interception Flow. use py.subprocess (malicious request) -> sys.meta_path hook -> blocked, not allowed (PyBridgeError). use py.numpy (allowed) -> sys.meta_path hook -> InthonPyObject proxy (safe attribute wrapper).]

2. Permitted vs Blocked Packages

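Pre-approved libraries are those not built primarily for process control or raw system access; risky entry points that remain inside them are restricted at the attribute level, as described in section 3: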

  • Approved Standard: numpy, pandas, math, json, datetime, collections, itertools, functools, re, pathlib.
  • Approved Advanced: torch, transformers, sklearn, scipy, matplotlib, seaborn, plotly.
  • Blocked System Modules: os, sys, subprocess, ctypes, shutil, socket, multiprocessing, importlib.

If you need to permit an additional library for a custom data environment, add it to the allowed_modules list in your project's inthon.toml:

[pybridge]
allowed_modules = ["scipy", "statsmodels"]

3. Attribute Level Protection (InthonPyObject)

Even when a module is allowed, some of its functions can still be exploited. PyBridge therefore wraps every imported module in a proxy object called InthonPyObject.

This proxy class overrides the Python attribute resolution dunder methods (__getattribute__, __setattr__). If code attempts to access private attributes (e.g. np.__dict__) or traverse namespaces to reach system commands (e.g. np.__config__.__builtins__['eval']), the proxy raises a SecurityViolation exception and terminates execution.

Blocked Specific API Functions

Certain allowlisted libraries have specific features restricted for execution safety. The BLOCKED_ATTRIBUTES configuration blocks the following access points:

Module Name  Blocked Attribute Path  Safety Rationale
pandas       pandas.eval             Prevents executing arbitrary Python code strings.
pandas       pandas.read_clipboard   Prevents access to the host operating system clipboard.
numpy        numpy.frompyfunc        Prevents wrapping arbitrary Python callables.
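A proxy of this kind can be approximated in plain Python. The sketch below is not InthonPyObject: SafeModuleProxy is a hypothetical class, SecurityViolation is redefined locally, and the blocked entry on the standard math module is invented purely so the example runs without third-party packages.

```python
import math


class SecurityViolation(Exception):
    """Local stand-in for the exception named above."""


# Hypothetical entry for demonstration; the real table lists pandas and numpy paths.
BLOCKED_ATTRIBUTES = {"math": {"factorial"}}


class SafeModuleProxy:
    """Wrap a module and veto private or explicitly blocked attribute access."""

    __slots__ = ("_module", "_blocked")

    def __init__(self, module, blocked=()):
        # Bypass our own __setattr__, which rejects every assignment.
        object.__setattr__(self, "_module", module)
        object.__setattr__(self, "_blocked", frozenset(blocked))

    def __getattribute__(self, name):
        if name.startswith("_"):
            raise SecurityViolation(f"access to private attribute '{name}' is blocked")
        if name in object.__getattribute__(self, "_blocked"):
            raise SecurityViolation(f"attribute '{name}' is blocked for this module")
        return getattr(object.__getattribute__(self, "_module"), name)

    def __setattr__(self, name, value):
        raise SecurityViolation(f"assignment to '{name}' is blocked")


safe_math = SafeModuleProxy(math, BLOCKED_ATTRIBUTES["math"])
print(safe_math.sqrt(16.0))  # 4.0

for attribute in ("__dict__", "factorial"):
    try:
        getattr(safe_math, attribute)
    except SecurityViolation as err:
        print(err)
```

This sketch guards only the first hop. Values returned from the wrapped module are handed back unwrapped, so a production proxy would also need to wrap submodules and other returned objects to stop namespace traversal.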

Part 5: Advanced Primitives & Resiliency

Master human verification gateways, relational memory storage, and automatic retry algorithms.

1. Human-in-the-Loop Approval Gates (approve)

For critical side effects, INTHON requires human approval. The compiler registers the target action, and at run time the interpreter suspends the executing thread and emits a structured request event:

approve subscription_charge before stripe.charge(amount: 49.00)
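The control flow behind an approval gate can be sketched as a function that refuses to run an action until an approver callback says yes. This is a synchronous simplification under stated assumptions: approve_before, ApprovalDenied, the shape of the request event, and fake_charge are all hypothetical, and the real runtime suspends a thread rather than calling a function inline.

```python
class ApprovalDenied(Exception):
    """Raised when the human reviewer rejects the action (hypothetical name)."""


def approve_before(label, action, approver, **action_args):
    """Emit a request event, and run `action` only if `approver` accepts it."""
    request = {"event": "approval_request", "label": label, "args": action_args}
    if not approver(request):
        raise ApprovalDenied(f"'{label}' was denied; action aborted")
    return action(**action_args)


charges = []  # records every charge that actually executed


def fake_charge(amount: float) -> dict:
    """Stand-in for a payment tool; it never contacts a real service."""
    charges.append(amount)
    return {"status": "succeeded", "amount": amount}


# Approved: the action runs and its result is returned.
receipt = approve_before("subscription_charge", fake_charge, lambda req: True, amount=49.00)
print(receipt)

# Denied: the action is never invoked.
try:
    approve_before("subscription_charge", fake_charge, lambda req: False, amount=49.00)
except ApprovalDenied as err:
    print(err)

print(charges)  # [49.0]
```

The important property is visible in the last line: the denied request left no trace in charges, because the side effect sits strictly behind the gate.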

[Diagram: HITL Execution Suspend Flow. Execute plan (interpreter thread) -> halt and emit hook -> approval prompted -> approved: resume execution, or denied: abort transaction.]

2. Episodic Memory Primitives (SQLite Backed)

Semantic variables can be persisted across sessions using the remember and recall primitives:

remember "Superconductors display zero electrical resistance" in session
let val = recall "electrical resistance" from session

Storage Architecture

Under the hood, memory spaces are backed by a local SQLite relational store. When remember is called, the statement text is written to the database along with its text embedding vector. When recall is invoked, the engine searches the database for the statements whose stored embedding B has the highest cosine similarity with the query embedding A:

\(\text{Similarity} = \frac{\mathbf{A} \cdot \mathbf{B}}{\|\mathbf{A}\| \|\mathbf{B}\|}\)

This allows the agent to dynamically recall relevant contextual statements without querying raw external indexes on every step.
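The storage and lookup steps can be sketched with the standard sqlite3 module. This is a toy model, not the INTHON memory engine: the MemoryStore class and table layout are assumptions, and the embed function is a bag-of-words counter standing in for a real text embedding model.

```python
import json
import math
import sqlite3
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class MemoryStore:
    """Hypothetical store: one SQLite table of (text, embedding as JSON)."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memories (text TEXT NOT NULL, embedding TEXT NOT NULL)"
        )

    def remember(self, text: str) -> None:
        self.db.execute(
            "INSERT INTO memories VALUES (?, ?)", (text, json.dumps(embed(text)))
        )
        self.db.commit()

    def recall(self, query: str) -> str:
        """Return the most similar stored text, or "" when nothing overlaps."""
        query_vector = embed(query)
        best_text, best_score = "", 0.0
        for text, embedding_json in self.db.execute("SELECT text, embedding FROM memories"):
            score = cosine(query_vector, Counter(json.loads(embedding_json)))
            if score > best_score:
                best_text, best_score = text, score
        return best_text


session = MemoryStore()
session.remember("Superconductors display zero electrical resistance")
session.remember("Python was first released in 1991")

print(session.recall("electrical resistance"))
# Superconductors display zero electrical resistance
```

Scanning every row is fine for a sketch; a store with many entries would normally use a vector index instead of a full table scan.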

3. Resilient Retries & Backoff Mathematics

External APIs often fail due to network instability. INTHON includes structured retry loops with automatic backoff options:

retry 3 with backoff exponential {
    let raw = web.search("AI")
    guard raw.status == 200
} catch err {
    return "Failed: " + err.message
}

Exponential Backoff Formula

The time interval (in seconds) between retry attempts is computed by the interpreter using the following formula:

\(t_{\text{backoff}} = \text{base} \times 2^{\text{attempt}} \pm \text{jitter}\)

Where base defaults to 1.0 second, attempt is the zero-indexed retry number, and jitter is a random offset that keeps clients from retrying in lockstep and overloading the server. With the default base and no jitter, the first three waits are 1, 2, and 4 seconds.
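The formula translates directly into a few lines of Python. The backoff_delay function below is a sketch: the guide does not specify how jitter is sampled, so drawing it uniformly from [-jitter, +jitter] and clamping the result at zero are assumptions.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, jitter: float = 0.0) -> float:
    """Seconds to wait before retry number `attempt` (zero-indexed)."""
    # Assumption: jitter is sampled uniformly from [-jitter, +jitter].
    offset = random.uniform(-jitter, jitter) if jitter else 0.0
    return max(0.0, base * 2 ** attempt + offset)


# Deterministic schedule for "retry 3" with the default base and no jitter.
schedule = [backoff_delay(attempt) for attempt in range(3)]
print(schedule)  # [1.0, 2.0, 4.0]

# With jitter, each wait lands somewhere in a band around the deterministic value.
print(backoff_delay(2, jitter=0.5))  # a value between 3.5 and 4.5
```

Doubling keeps early retries fast while quickly backing away from a struggling service, and the jitter band spreads out clients that failed at the same moment.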

Part 6: Production Templates & Design Patterns

Ready-to-use, copy-pasteable boilerplates and design patterns to solve common challenges in professional agent systems.

Template 1: Web Scraper & Summarizer Agent

This template demonstrates how to search the web, fetch the page content of matching links, and structure a unified research response under safety restrictions.

// scraper_summarizer.inth
use tool web.search
use tool web.read

agent ScraperAgent {
    goal "Locate and summarize the key findings about new energy technologies"
    inputs {
        topic: str,
        limit: int
    }
    outputs {
        query_topic: str,
        summary_report: dict[str, str]
    }
    policy {
        allow_network: true
        max_tool_calls: 5
        max_cost_usd: 0.05
    }
    plan {
        let results = web.search(topic, limit: limit)
        let report = {}
        let i = 0
        
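        // Tool budget: 1 web.search + `limit` web.read calls, so limit must not exceed 4 under max_tool_calls: 5
        while i < limit {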
            let item = results[i]
            let link = item["url"]
            let title = item["title"]
            
            // Read content safely
            let content = web.read(link)
            
            // Perform basic parsing and extract snippets
            let snippet = content
            if content.length > 300 {
                snippet = content.slice(0, 300) + "..."
            }
            
            report[title] = snippet
            i = i + 1
        }
        
        return {
            "query_topic": topic,
            "summary_report": report
        }
    }
}

Template 2: Data Analysis & CSV Analytics Pipeline

This template demonstrates how to leverage PyBridge to load dataset arrays, perform mathematical analysis using Pandas and NumPy, and return structured statistical results.

// data_analytics.inth
use py.pandas as pd
use py.numpy as np

fn process_metrics(data_points: list[dict]) -> dict {
    // Convert input list of dictionaries to Pandas DataFrame safely
    let df = pd.DataFrame(data_points)
    
    // Calculate statistics using pandas wrappers
    let prices = df["price"]
    let average_price = prices.mean()
    let median_price = prices.median()
    let deviation = prices.std()
    
    // Flag items priced more than one standard deviation above the mean
    let threshold = average_price + deviation
    let outliers = df[prices > threshold]
    let outlier_items = outliers["item"].to_list()
    
    return {
        "mean": average_price,
        "median": median_price,
        "std_dev": deviation,
        "anomalous_outliers": outlier_items
    }
}

// Execute logic with mock database values
let dataset = [
    {"item": "Server Unit A", "price": 1500.0},
    {"item": "Server Unit B", "price": 1600.0},
    {"item": "GPU Adapter X", "price": 4500.0}, // Outlier price point
    {"item": "Cabling Kit", "price": 150.0},
    {"item": "Network Switch", "price": 800.0}
]

let stats = process_metrics(dataset)
stats
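To check what this template should return for the mock dataset, the same arithmetic can be reproduced with Python's standard statistics module. This is a cross-check written for this guide, not output captured from INTHON; it relies on statistics.stdev computing the sample standard deviation, which matches the pandas default (ddof=1) used by prices.std().

```python
import statistics

dataset = [
    {"item": "Server Unit A", "price": 1500.0},
    {"item": "Server Unit B", "price": 1600.0},
    {"item": "GPU Adapter X", "price": 4500.0},
    {"item": "Cabling Kit", "price": 150.0},
    {"item": "Network Switch", "price": 800.0},
]

prices = [row["price"] for row in dataset]

mean_price = statistics.mean(prices)      # 8550 / 5 = 1710.0
median_price = statistics.median(prices)  # middle of the sorted prices = 1500.0
std_dev = statistics.stdev(prices)        # sample standard deviation, about 1665.98

# Same rule as the template: anything above mean + one standard deviation.
threshold = mean_price + std_dev
outliers = [row["item"] for row in dataset if row["price"] > threshold]

print(mean_price, median_price, round(std_dev, 2))
print(outliers)  # ['GPU Adapter X']
```

The threshold works out to roughly 3375.98, so only the 4500.0 GPU adapter is flagged.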

Template 3: HITL Stripe Payment Gate

This template sets up a safe financial coordinator that calculates tier pricing and routes execution control through the human approval gateway before charging payments.

// billing_agent.inth
use tool stripe.charge

agent BillingAgent {
    goal "Securely charge clients based on execution limits"
    inputs {
        client_id: str,
        compute_seconds: int,
        tier: str
    }
    outputs {
        charge_status: str,
        transaction_id: str
    }
    policy {
        allow_payment: true
        max_cost_usd: 0.10
    }
    plan {
        // Calculate price dynamically
        let base_rate = 0.05
        if tier == "enterprise" {
            base_rate = 0.02
        }
        
        let calculated_amount = compute_seconds * base_rate
        
        // Enforce human validation block prior to payment API call
        approve client_payment before stripe.charge(
            customer: client_id, 
            amount: calculated_amount,
            currency: "USD"
        )
        
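        // Call with the same arguments that were submitted for approval
        let response = stripe.charge(
            customer: client_id,
            amount: calculated_amount,
            currency: "USD"
        )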
        return {
            "charge_status": response["status"],
            "transaction_id": response["id"]
        }
    }
}

Template 4: Episodic Memory-Driven Q&A Agent

This template showcases how an agent can store facts in its episodic memory store during interactions, and query it later via semantic search. If memory is empty, the agent falls back to external search.

// memory_agent.inth
use tool web.search

agent MemoryAgent {
    goal "Answer queries using historical context when available"
    inputs {
        user_query: str
    }
    outputs {
        response_text: str,
        context_source: str
    }
    policy {
        allow_network: true
        allow_memory_persist: true
    }
    plan {
        // Attempt to recall historical facts semantically matching the query
        let matched_fact = recall user_query from session
        
        if matched_fact != "" {
            return {
                "response_text": "Recalled context: " + matched_fact,
                "context_source": "episodic_memory"
            }
        }
        
        // Fallback to searching the web if no relevant facts are remembered
        let search_results = web.search(user_query, limit: 1)
        let first_result = search_results[0]
        let snippet = first_result["snippet"]
        
        // Remember this fact for future queries in the user session
        remember snippet in session
        
        return {
            "response_text": snippet,
            "context_source": "web_search_fallback"
        }
    }
}