Introduction to LLMs

This page is an accessible guide to large language models (LLMs), from basics to applications, for AI enthusiasts.

Total of 31 articles available.

Chapter 15 — Building a Secure AI Organization

Fifteenth post of the LLM Primer VII walkthrough. Security culture for AI teams, red teams and internal audits, vendor risk (SOC 2, ISO 42001), and the emerging AI BOM.

2026-05-24

Chapter 9 — Model Integrity and Supply Chain Risks

Ninth post of the LLM Primer VII walkthrough. Open-source model dependency risk, Sleeper Agents (Hubinger et al.), safetensors vs pickle, CVE-2024-3568, and the SLSA / Sigstore artifact-signing discipline.

2026-05-18

Chapter 8 — Adversarial Attacks on Models

Eighth post of the LLM Primer VII walkthrough. Adversarial examples in NLP (HotFlip, TextFooler), model extraction (Tramèr et al., Carlini et al.), and the defensive strategies for API-boundary abuse.

2026-05-17

Chapter 5 — Input Validation and Output Filtering

Fifth post of the LLM Primer VII walkthrough. Input sanitization, structured guardrails (NeMo, Llama Guard 3, Lakera, Bedrock), and red teaming with Garak, PyRIT, and promptfoo.

2026-05-14

Chapter 3 — Data Security and Privacy

Third post of the LLM Primer VII walkthrough. Training-data risks, memorization and extraction (Carlini et al., Nasr et al.), and the encryption, isolation, and retention disciplines that keep sensitive prompts contained.

2026-05-12

Chapter 2 — Threat Modeling for LLM Systems

Second post of the LLM Primer VII walkthrough. Adapting STRIDE, PASTA, and attack trees to LLM systems — model, prompt, data, and infrastructure as assets, and MITRE ATLAS as the LLM-specific adversary catalog.

2026-05-11

Chapter 12 — Disaggregated Serving and Kubernetes

Twelfth post of the LLM Primer VI walkthrough. Why aggregating prefill and decode wastes compute, and how LeaderWorkerSet, NVIDIA Grove, and KAI Scheduler split them apart on Kubernetes.

2026-05-04

Chapter 10 — The LLM Engine Layer

Tenth post of the LLM Primer VI walkthrough. vLLM as the safe default, TensorRT-LLM for peak NVIDIA-only throughput, SGLang for structured and agentic outputs, and TGI/Ollama for the rest.

2026-05-02

Chapter 8 — Next-Generation KV Cache Management

Eighth post of the LLM Primer VI walkthrough. PagedAttention, KV eviction algorithms (H2O, InfiniGen), and prefix caching for multi-turn conversations and multi-agent RAG.

2026-04-30

Chapter 5 — Evaluating LLM Applications

Fifth post of the LLM Primer V walkthrough. The offline-online eval distinction, LLM-as-judge patterns, the RAG Triad, and trajectory tests for agents.

2026-04-18

Chapter 2 — Foundation Models & Prompt Engineering

Second post of the LLM Primer V walkthrough. Model tiering, sampling parameters, defensive prompt patterns, and structured outputs as engineering surfaces — the layer just inside the deterministic wrapper.

2026-04-15

Chapter 1 — The Discipline of AI Engineering

First post of the LLM Primer V walkthrough. Why the demo works and production doesn't — the deterministic wrapper around the probabilistic core, and the five pillars (reliability, quality, performance, cost, evolution) that keep the wrapper honest.

2026-04-14

LLM Primer V — Series Introduction & Index

Kicking off the chapter-by-chapter walkthrough of Book V in the LLM Primer series — Building Real-World LLM Applications. Why AI engineering is a discipline of its own, who this book is for, and the schedule for the eight posts that follow, April 14 through April 21.

2026-04-13

Chapter 14 — Benchmarking, Testing, and Performance

Fourteenth and final post of the LLM Primer IV walkthrough. The MCP-Universe Benchmark on real servers, the two systemic failure modes it exposed, the ten-times throughput gap between session-per-request and shared session pools, and the bridge to Volume V.

2026-04-12

Chapter 10 — Long-Horizon Task Memory

Tenth post of the LLM Primer IV walkthrough. Short-term memory through windows and ReAct scratchpads, long-term memory through episodic vectors and semantic stores, and the compaction techniques that keep an agent productive over hours and days.

2026-04-08

Chapter 9 — Managing the Attention Budget

Ninth post of the LLM Primer IV walkthrough. Context rot, the lost-in-the-middle cliff, tool-loadout rot, and the three architectural answers — MCP, RAG, fine-tuning — to the question of where a model's missing knowledge actually belongs.

2026-04-07

Chapter 6 — Fundamental Orchestration Strategies

Sixth post of the LLM Primer IV walkthrough. The two foundational orchestration shapes — sequential pipelines and concurrent scatter-gather — and the prior question every team should ask: is a multi-agent system the right answer at all?

2026-04-04

Chapter 4 — Client Primitives: Agentic Behaviors and Control

Fourth post of the LLM Primer IV walkthrough. Sampling, Roots, and Elicitation are the three small, controlled holes MCP punches through the host-server wall — each a capability granted back, each a risk accepted on the user's behalf.

2026-04-02

Chapter 3 — Server Primitives: Exposing Context and Capabilities

Third post of the LLM Primer IV walkthrough. The three nouns an MCP server can offer — Resources (read state), Prompts (reusable scaffolding), Tools (write actions) — their schemas, their lifecycles, their error models, and the discipline of choosing the right primitive.

2026-04-01

Chapter 2 — Unveiling the Model Context Protocol (MCP)

Second post of the LLM Primer IV walkthrough. What MCP actually standardizes, the three-role split of Host, Client, and Server, why dynamic discovery and bidirectional messaging differ from REST in the cases that matter, and the session lifecycle that opens with capability negotiation.

2026-03-31

Chapter 1 — The AI Integration Crisis and the Rise of Agentic Architecture

First post of the LLM Primer IV walkthrough. Why monolithic agents fray as system prompts grow, the N times M integration problem hiding underneath, and the move from prompt engineering to context engineering that MCP was built to enable.

2026-03-30

LLM Primer IV — Series Introduction & Index

Kicking off the chapter-by-chapter walkthrough of Book IV in the LLM Primer series — Designing AI Cognition with MCP. Why agents need a protocol layer to scale past demoware, who this book is for, and the schedule for the fourteen posts that follow, March 30 through April 12.

2026-03-29

Chapter 10 — Leading Evaluation Frameworks

Tenth post of the LLM Primer III walkthrough. A field guide to the frameworks that turn the Evaluation Triad into something a team can actually run — RAGAS, TruLens, DeepEval on one side, Braintrust, LangSmith, Phoenix, Galileo, Opik on the other, and the Evaluation Gap none of them has yet closed.

2026-03-27

Chapter 9 — The RAG Evaluation Triad

Ninth post of the LLM Primer III walkthrough. A RAG system can fail in three different places and the failures look identical from the outside — the Evaluation Triad of Context Relevance, Groundedness, and Answer Relevance is the small vocabulary that prevents fixing one bug while measuring another.

2026-03-26

Chapter 2 — Intelligent Document Parsing

Second post of the LLM Primer III walkthrough. Why a PDF is not a text file, what layout-aware parsers actually preserve, the current tool landscape (LlamaParse, Docling, Unstructured, Marker-PDF, Firecrawl, DeepSeek-OCR), and the multimodal track that retrieves over page images directly.

2026-03-19

LLM Primer III — Series Introduction & Index

Kicking off the chapter-by-chapter walkthrough of Book III in the LLM Primer series — Enhancing Enterprise AI with RAG. Why retrieval-augmented generation looks simple from the outside and is a stack of disciplines underneath, who this book is for, and the schedule for the eleven posts that follow, March 18 through March 28.

2026-03-17

Chapter 12 — Real-World Applications of LLMs

Twelfth post of the LLM Primer II walkthrough. Text generation, summarization, QA, translation, reasoning — and the constrained decoding, agent loops, and multimodal generalization that turn one next-token machine into a dozen kinds of product.

2026-03-14

Chapter 10 — Safety, Ethics, & Trust: Beyond the Marketing

Chapter 10 of the LLM Primer I series. The honest picture of LLM safety — why hallucinations happen mechanistically, where bias actually lives, how layered guardrails work, and why governance is the institutional layer that technical controls can't replace. For practitioners who need to ship safely.

2026-02-27

Chapter 7 — Beyond Next-Token Prediction: Embeddings, Retrieval, and Multimodality

Chapter 7 of the LLM Primer I series. The capabilities that turn a next-token predictor into something much more — embeddings, semantic search, retrieval-augmented generation, and the move into multimodal inputs. How RAG actually keeps an LLM grounded in real documents instead of confabulating.

2026-02-24

The LLM Primer Series — A Field Guide to Generative AI, Built One Volume at a Time

The LLM Primer Series is a completed seven-volume field guide to generative AI by Sho Shimoda, running from foundations to security, with Physical AI as a sister volume. All seven volumes are available on Amazon.

2026-02-15

4.4 How LLMs Write Code: The Rise of AI-Powered Programming Assistants

Explore how large language models (LLMs) generate and complete code from natural-language prompts, and what it means for the future of software development.

2024-09-27