Agent 101

LLM vs. LLM-with-tools — same model on both sides, the only difference is the agent loop on the right. Watch what happens on questions the LLM can't answer from memory.

Built for CS 203 at IIT Gandhinagar · companion Colab

Ask something

Model

Try an example — click a pill to auto-run it:

🚫 Without tools

Plain LLM — no calculator, no weather, no notes.

🛠️ With tools (agent loop)

Same model, but it can call functions.

The three pieces of an agent

Piece	What it does
LLM	Decides what to do next
Tools	Python functions the LLM can call
Loop	Keep asking the LLM until it's done

Pseudocode

while True:
    response = llm.chat(messages, tools=schemas)
    if not response.tool_calls:
        return response.text          # done
    for call in response.tool_calls:
        result = TOOLS[call.name](**call.args)
        messages.append(result)       # feed back, loop again

The LLM never runs code. It just names a tool and your code runs it.

System prompt

You are a helpful assistant with access to tools.

Rules you must follow:
1. For ANY arithmetic — even simple multiplication like 5 * 7 — call the
   `calculate` tool. Never compute numbers in your head.
2. For real-time facts like weather, current time, or stock prices,
   call the matching tool (`get_weather`, `get_time`, `get_stock_price`).
   Do not guess or say you don't know.
3. For unit conversions, call `convert_units` — don't approximate.
4. For currency conversion, call `get_exchange_rate`.
5. For distances between cities, call `get_distance`. For city
   populations, call `get_population`.
6. For questions about the CS 203 course ("what week did we cover X?"),
   call `search_notes`.
7. For definitions of CS / ML terms, call `define_word`.
8. Multi-step questions need multiple tool calls in sequence. Examples:
   - "5 km/day for a week in miles" → `convert_units` then `calculate`.
   - "Delhi-Mumbai distance in miles" → `get_distance` then
     `convert_units`.
   - "Shares of AAPL for 5000 INR" → `get_exchange_rate` →
     `get_stock_price` → `calculate`.
9. After every tool result, decide: do I need another tool, or can I
   write the final answer? Only answer once you have ALL the numbers.
10. If the question genuinely doesn't need any tool (e.g. "capital of
    France"), answer directly.

CRITICAL: when you ARE calling a tool, use the provider's structured
tool_calls interface — do NOT write tool calls as Python-style text like
`[get_weather(city="Delhi")]` or as JSON in your reply. If you want to
call a tool, emit a real tool_call; otherwise write a final natural-
English answer with the numbers spelled out.

Tools the agent has

`calculate`

Schema (what the model sees):

{
  "type": "function",
  "function": {
    "name": "calculate",
    "description": "Evaluate a math expression. Use Python syntax: + - * / ** (). Example: '4729 * 8314'. ALWAYS use this for arithmetic.",
    "parameters": {
      "type": "object",
      "properties": {
        "expression": {
          "type": "string",
          "description": "e.g. '2 + 3 * 4'"
        }
      },
      "required": [
        "expression"
      ]
    }
  }
}

Python implementation (what actually runs):

def calculate(expression: str) -> str:
    allowed = set("0123456789+-*/.() ")
    if not all(c in allowed for c in expression):
        return "Error: only numbers and +-*/() are allowed"
    try:
        return str(round(eval(expression), 10))
    except Exception as e:
        return f"Error: {e}"

`get_weather`

Schema (what the model sees):

{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Current weather (temperature in Celsius, condition, humidity) for a city. Cities: Gandhinagar, Mumbai, Bangalore, Delhi, Chennai, Kolkata, Paris, New York.",
    "parameters": {
      "type": "object",
      "properties": {
        "city": {
          "type": "string",
          "description": "City name, e.g. 'Mumbai'"
        }
      },
      "required": [
        "city"
      ]
    }
  }
}

Python implementation (what actually runs):

def get_weather(city: str) -> str:
    data = {
        "gandhinagar": {"temp_c": 38, "condition": "Sunny",  "humidity": 25},
        "mumbai":      {"temp_c": 32, "condition": "Humid",  "humidity": 80},
        "bangalore":   {"temp_c": 28, "condition": "Rainy",  "humidity": 65},
        "delhi":       {"temp_c": 40, "condition": "Haze",   "humidity": 30},
        "chennai":     {"temp_c": 35, "condition": "Cloudy", "humidity": 70},
        "kolkata":     {"temp_c": 33, "condition": "Humid",  "humidity": 75},
        "paris":       {"temp_c": 18, "condition": "Cloudy", "humidity": 60},
        "new york":    {"temp_c": 22, "condition": "Clear",  "humidity": 45},
    }
    return json.dumps(data.get(city.lower(), {"error": f"No data for {city}"}))

`convert_units`

Schema (what the model sees):

{
  "type": "function",
  "function": {
    "name": "convert_units",
    "description": "Convert a value between units. Supports celsius/fahrenheit, kg/pounds, km/miles, meters/feet, liters/gallons.",
    "parameters": {
      "type": "object",
      "properties": {
        "value": {
          "type": "number",
          "description": "The numeric value"
        },
        "from_unit": {
          "type": "string",
          "description": "Source unit, e.g. 'celsius'"
        },
        "to_unit": {
          "type": "string",
          "description": "Target unit, e.g. 'fahrenheit'"
        }
      },
      "required": [
        "value",
        "from_unit",
        "to_unit"
      ]
    }
  }
}

Python implementation (what actually runs):

def convert_units(value: float, from_unit: str, to_unit: str) -> str:
    conversions = {
        ("celsius", "fahrenheit"):  lambda v: v * 9 / 5 + 32,
        ("fahrenheit", "celsius"):  lambda v: (v - 32) * 5 / 9,
        ("kg", "pounds"):           lambda v: v * 2.20462,
        ("pounds", "kg"):           lambda v: v / 2.20462,
        ("km", "miles"):            lambda v: v * 0.621371,
        ("miles", "km"):            lambda v: v / 0.621371,
        ("meters", "feet"):         lambda v: v * 3.28084,
        ("feet", "meters"):         lambda v: v / 3.28084,
        ("liters", "gallons"):      lambda v: v * 0.264172,
        ("gallons", "liters"):      lambda v: v / 0.264172,
    }
    try:
        value = float(value)
    except (TypeError, ValueError):
        return f"Error: value '{value}' is not a number"
    fn = conversions.get((str(from_unit).lower(), str(to_unit).lower()))
    if fn is None:
        return f"Cannot convert {from_unit} to {to_unit}"
    return f"{fn(value):.4f} {to_unit}"

`search_notes`

Schema (what the model sees):

{
  "type": "function",
  "function": {
    "name": "search_notes",
    "description": "Search the CS 203 (IIT Gandhinagar) course notes by keyword. Returns which week a topic was covered.",
    "parameters": {
      "type": "object",
      "properties": {
        "query": {
          "type": "string",
          "description": "Topic keyword, e.g. 'docker'"
        }
      },
      "required": [
        "query"
      ]
    }
  }
}

Python implementation (what actually runs):

def search_notes(query: str) -> str:
    topics = {
        "data drift":          "Week 10: detecting distribution shift with KS test, PSI, chi-squared test",
        "profiling":           "Week 11: cProfile, timeit, finding bottlenecks in ML code",
        "quantization":        "Week 11: INT8 / FP16, dynamic quantization, ONNX, model compression",
        "pruning":             "Week 11: unstructured and structured pruning, removing weights",
        "distillation":        "Week 11: teacher-student training, soft labels, knowledge transfer",
        "docker":              "Week 10: containerization, Dockerfiles, reproducible environments",
        "fastapi":             "Week 12: building REST APIs, Pydantic validation, /predict endpoints",
        "agents":              "Week 12: tool calling, Gemma 4, the agent loop, function calling",
        "git":                 "Week 9: version control, commits, branches, merge conflicts",
        "experiment tracking": "Week 8: MLflow, Weights & Biases, hyperparameter tuning",
        "cross validation":    "Week 7: k-fold CV, train/val/test splits, bias-variance tradeoff",
        "gradio":              "Week 12: building demo UIs for ML models, share=True for public link",
        "streamlit":           "Week 12: building dashboard-style ML apps with Python",
        "batching":            "Week 11: processing multiple inputs at once for throughput",
        "onnx":                "Week 11: portable model format, export once run anywhere",
    }
    q = query.lower()
    matches = {k: v for k, v in topics.items() if q in k or q in v.lower()}
    if matches:
        return json.dumps(matches, indent=2)
    return json.dumps({"message": f"No results for '{query}'. Try: {', '.join(list(topics.keys())[:5])}"})

`define_word`

Schema (what the model sees):

{
  "type": "function",
  "function": {
    "name": "define_word",
    "description": "Look up the definition of a common CS / ML term.",
    "parameters": {
      "type": "object",
      "properties": {
        "word": {
          "type": "string",
          "description": "Term, e.g. 'overfitting'"
        }
      },
      "required": [
        "word"
      ]
    }
  }
}

Python implementation (what actually runs):

def define_word(word: str) -> str:
    definitions = {
        "overfitting":    "When a model learns training data too well (including noise) and performs poorly on new data.",
        "gradient":       "The vector of partial derivatives of the loss w.r.t. each parameter.",
        "epoch":          "One full pass through the entire training dataset.",
        "batch size":     "The number of training examples in one forward/backward pass.",
        "learning rate":  "A hyperparameter controlling how much to adjust weights per update.",
        "regularization": "Techniques to prevent overfitting, e.g. L1/L2 penalties, dropout.",
        "transformer":    "A neural-network architecture based on self-attention (GPT, BERT, etc.).",
        "tokenizer":      "Converts text into a sequence of integer IDs a model can process.",
    }
    r = definitions.get(word.lower())
    if r:
        return json.dumps({"word": word, "definition": r})
    return json.dumps({"error": f"'{word}' not found. Available: {', '.join(definitions.keys())}"})

`get_time`

Schema (what the model sees):

{
  "type": "function",
  "function": {
    "name": "get_time",
    "description": "Get the current wall-clock time in a given timezone or city. Accepts UTC, IST, JST, EST, PDT, 'Tokyo', 'New York', etc.",
    "parameters": {
      "type": "object",
      "properties": {
        "timezone": {
          "type": "string",
          "description": "Timezone code or city, e.g. 'IST' or 'Tokyo'"
        }
      },
      "required": [
        "timezone"
      ]
    }
  }
}

Python implementation (what actually runs):

def get_time(timezone: str = "UTC") -> str:
    """Return the current wall-clock time in a named timezone or UTC offset."""
    offsets = {
        "utc": 0, "ist": 5.5, "gmt": 0, "bst": 1,
        "edt": -4, "est": -5, "pdt": -7, "pst": -8,
        "jst": 9, "kst": 9, "cst": 8, "cet": 1, "eet": 2,
        "sgt": 8, "hkt": 8, "aest": 10,
        "gandhinagar": 5.5, "mumbai": 5.5, "delhi": 5.5, "bangalore": 5.5,
        "tokyo": 9, "seoul": 9, "singapore": 8, "london": 0, "paris": 1,
        "new york": -4, "san francisco": -7, "sydney": 10,
    }
    key = timezone.lower().strip()
    offset_h = offsets.get(key)
    if offset_h is None:
        m = re.match(r"^utc([+-])(\d+(?:\.\d+)?)$", key)
        if m:
            offset_h = float(m.group(2)) * (1 if m.group(1) == "+" else -1)
    if offset_h is None:
        return json.dumps({"error": f"Unknown timezone '{timezone}'. Try UTC, IST, JST, 'Tokyo', 'New York', etc."})
    now_utc = _dt.datetime.utcnow()
    local = now_utc + _dt.timedelta(hours=offset_h)
    return json.dumps({
        "timezone": timezone,
        "utc_offset_hours": offset_h,
        "iso": local.strftime("%Y-%m-%d %H:%M:%S"),
        "day_of_week": local.strftime("%A"),
    })

`get_exchange_rate`

Schema (what the model sees):

{
  "type": "function",
  "function": {
    "name": "get_exchange_rate",
    "description": "Convert a currency amount using fixed demo rates. Supports USD, EUR, GBP, JPY, INR, CNY, AUD, CAD, SGD, CHF, KRW.",
    "parameters": {
      "type": "object",
      "properties": {
        "from_currency": {
          "type": "string",
          "description": "Source ISO code, e.g. 'USD'"
        },
        "to_currency": {
          "type": "string",
          "description": "Target ISO code, e.g. 'INR'"
        },
        "amount": {
          "type": "number",
          "description": "Amount to convert"
        }
      },
      "required": [
        "from_currency",
        "to_currency",
        "amount"
      ]
    }
  }
}

Python implementation (what actually runs):

def get_exchange_rate(from_currency: str, to_currency: str, amount: float = 1.0) -> str:
    """Mock currency converter — rates are approximate and fixed for the demo."""
    # Rates expressed as "1 unit of X in USD".
    usd_rates = {
        "usd": 1.0, "eur": 1.08, "gbp": 1.27, "jpy": 0.0067,
        "inr": 0.012, "cny": 0.14, "aud": 0.66, "cad": 0.73,
        "sgd": 0.74, "chf": 1.13, "krw": 0.00075,
    }
    try:
        amount = float(amount)
    except (TypeError, ValueError):
        return json.dumps({"error": f"amount '{amount}' is not a number"})
    f = str(from_currency).lower().strip()
    t = str(to_currency).lower().strip()
    if f not in usd_rates or t not in usd_rates:
        return json.dumps({
            "error": f"Unsupported currency pair {from_currency}->{to_currency}",
            "supported": sorted(usd_rates.keys()),
        })
    usd = amount * usd_rates[f]
    converted = usd / usd_rates[t]
    return json.dumps({
        "from": f.upper(), "to": t.upper(),
        "amount": amount, "converted": round(converted, 2),
        "note": "rates are approximate demo values, not live market data",
    })

`get_distance`

Schema (what the model sees):

{
  "type": "function",
  "function": {
    "name": "get_distance",
    "description": "Great-circle distance in kilometres between two cities. Supports major Indian and global cities (Gandhinagar, Mumbai, Delhi, Bangalore, Chennai, Kolkata, Hyderabad, Pune, Tokyo, Seoul, Singapore, Hong Kong, London, Paris, New York, San Francisco, Sydney).",
    "parameters": {
      "type": "object",
      "properties": {
        "city_a": {
          "type": "string",
          "description": "First city, e.g. 'Delhi'"
        },
        "city_b": {
          "type": "string",
          "description": "Second city, e.g. 'Mumbai'"
        }
      },
      "required": [
        "city_a",
        "city_b"
      ]
    }
  }
}

Python implementation (what actually runs):

def get_distance(city_a: str, city_b: str) -> str:
    """Great-circle distance between two cities in kilometres (demo data)."""
    # Stored as city -> (lat, lon) so we can list any pair.
    coords = {
        "gandhinagar": (23.22, 72.65), "mumbai": (19.08, 72.88),
        "delhi":       (28.61, 77.21), "bangalore": (12.97, 77.59),
        "chennai":     (13.08, 80.27), "kolkata": (22.57, 88.36),
        "hyderabad":   (17.38, 78.49), "pune": (18.52, 73.86),
        "tokyo":       (35.68, 139.69), "seoul": (37.57, 126.98),
        "singapore":   (1.35, 103.82),  "hong kong": (22.32, 114.17),
        "london":      (51.51, -0.13),  "paris": (48.86, 2.35),
        "new york":    (40.71, -74.00), "san francisco": (37.77, -122.42),
        "sydney":      (-33.87, 151.21),
    }
    a = coords.get(city_a.lower().strip())
    b = coords.get(city_b.lower().strip())
    if a is None or b is None:
        return json.dumps({"error": f"Unknown city. Supported: {sorted(coords.keys())}"})
    import math
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    km = 2 * 6371.0 * math.asin(math.sqrt(h))
    return json.dumps({"from": city_a, "to": city_b, "distance_km": round(km, 1)})

`get_population`

Schema (what the model sees):

{
  "type": "function",
  "function": {
    "name": "get_population",
    "description": "Approximate metro-area population (in millions). Same city list as get_distance.",
    "parameters": {
      "type": "object",
      "properties": {
        "city": {
          "type": "string",
          "description": "City name"
        }
      },
      "required": [
        "city"
      ]
    }
  }
}

Python implementation (what actually runs):

def get_population(city: str) -> str:
    """Approximate metro-area population (demo data, in millions)."""
    data = {
        "gandhinagar": 0.36, "mumbai": 20.96, "delhi": 32.07, "bangalore": 13.61,
        "chennai": 11.56,    "kolkata": 15.33, "hyderabad": 10.53, "pune": 6.81,
        "tokyo": 37.44,      "seoul": 9.97,    "singapore": 5.94,  "hong kong": 7.49,
        "london": 9.54,      "paris": 11.21,   "new york": 18.87,  "san francisco": 3.32,
        "sydney": 5.37,
    }
    v = data.get(city.lower().strip())
    if v is None:
        return json.dumps({"error": f"No population data for {city}"})
    return json.dumps({"city": city, "population_millions": v})

`get_stock_price`

Schema (what the model sees):

{
  "type": "function",
  "function": {
    "name": "get_stock_price",
    "description": "Current share price in USD for a handful of tickers (AAPL, MSFT, GOOGL, AMZN, META, TSLA, NVDA, AMD, IBM, INTC, NFLX, ORCL).",
    "parameters": {
      "type": "object",
      "properties": {
        "ticker": {
          "type": "string",
          "description": "Ticker symbol, e.g. 'AAPL'"
        }
      },
      "required": [
        "ticker"
      ]
    }
  }
}

Python implementation (what actually runs):

def get_stock_price(ticker: str) -> str:
    """Mock share price in USD for a handful of tickers."""
    prices = {
        "aapl": 228.45, "msft": 417.12, "googl": 178.32, "amzn": 212.91,
        "meta": 574.80, "tsla": 249.07, "nvda": 135.60, "amd": 162.33,
        "ibm": 224.18,  "intc":  21.40, "nflx": 745.20, "orcl": 176.55,
    }
    p = prices.get(ticker.lower().strip())
    if p is None:
        return json.dumps({"error": f"Unknown ticker '{ticker}'. Try: {sorted(prices.keys())}"})
    return json.dumps({"ticker": ticker.upper(), "price_usd": p, "note": "demo price, not live market data"})