<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[ArataX — Java, Spring Boot & Cloud-Native Engineering]]></title><description><![CDATA[Hands-on Java & Spring Boot — with diagrams, code samples, and real-world demos.]]></description><link>https://code.aratax.dev</link><generator>RSS for Node</generator><lastBuildDate>Wed, 15 Apr 2026 10:05:10 GMT</lastBuildDate><atom:link href="https://code.aratax.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Real-Time Communication in Quarkus: SSE or WebSocket?]]></title><description><![CDATA[Preface
There is a moment every backend developer encounters.
You refresh the page. Nothing changes. You refresh again. Still nothing.
And then the question appears:
“Why does my app only speak when I ask?”
Modern users expect more. Dashboards update...]]></description><link>https://code.aratax.dev/real-time-communication-in-quarkus-sse-or-websocket</link><guid isPermaLink="true">https://code.aratax.dev/real-time-communication-in-quarkus-sse-or-websocket</guid><category><![CDATA[SSE]]></category><category><![CDATA[websocket]]></category><category><![CDATA[quarkus]]></category><dc:creator><![CDATA[Arata]]></dc:creator><pubDate>Tue, 16 Dec 2025 15:14:51 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1765902588617/f53bb355-1596-485e-a95b-c5dcdf79bffd.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-preface">Preface</h2>
<p>There is a moment every backend developer encounters.</p>
<p>You refresh the page. Nothing changes. You refresh again. Still nothing.</p>
<p>And then the question appears:</p>
<p>“Why does my app only speak when I ask?”</p>
<p>Modern users expect more. Dashboards update by themselves. Notifications arrive without clicking refresh. Systems feel alive.</p>
<p>That moment is usually when developers discover Server-Sent Events (SSE) and WebSocket.</p>
<p>This article is a beginner-friendly journey through both technologies, grounded in a Quarkus-based project, and focused on why they exist—not just how to use them.</p>
<hr />
<h2 id="heading-1-before-real-time-the-problem-we-all-had">1. Before Real-Time: The Problem We All Had</h2>
<p>The web was built on a simple promise:</p>
<pre><code class="lang-plaintext">Client  →  Request  →  Server
Client  ←  Response ←  Server
</code></pre>
<p>It worked beautifully—for documents.</p>
<p>But problems emerged when applications needed to:</p>
<ul>
<li><p>show live metrics</p>
</li>
<li><p>display notifications</p>
</li>
<li><p>update dashboards continuously</p>
</li>
<li><p>support interactive collaboration</p>
</li>
</ul>
<p>The workaround was ugly:</p>
<ul>
<li><p>refresh loops</p>
</li>
<li><p>aggressive polling</p>
</li>
<li><p>long-polling hacks</p>
</li>
</ul>
<p><em>Bandwidth wasted. Servers overloaded. UX suffered. Real-time communication wasn’t a luxury anymore—it was survival.</em></p>
<hr />
<h2 id="heading-2-enter-sse-and-websocket-the-heroes-we-needed">2. Enter SSE and WebSocket: The Heroes We Needed</h2>
<p>To solve these problems, two technologies emerged: Server-Sent Events (SSE) and WebSocket.</p>
<p>SSE became part of HTML5 around 2011, designed for:</p>
<ul>
<li><p>live feeds</p>
</li>
<li><p>notifications</p>
</li>
<li><p>monitoring dashboards</p>
</li>
</ul>
<p>It intentionally avoided complexity.</p>
<p>WebSocket, standardized in 2011 as well, offered a full-duplex communication channel, ideal for:</p>
<ul>
<li><p>chat applications</p>
</li>
<li><p>multiplayer games</p>
</li>
<li><p>collaborative tools</p>
</li>
</ul>
<p>It allowed both client and server to send messages independently.</p>
<hr />
<h2 id="heading-3-understanding-server-sent-events-sse">3. Understanding Server-Sent Events (SSE)</h2>
<p>SSE is a unidirectional protocol in which the server pushes updates to the client over a single HTTP connection. Because it is built on top of standard HTTP, it is easy to implement and compatible with existing infrastructure.</p>
<p>The flow looks like this:</p>
<pre><code class="lang-plaintext">Browser
  |
  |  (HTTP request for SSE)
  v
Server
    |=====&gt; (streaming events) =====&gt;
    |
Browser
</code></pre>
<p>Key ideas:</p>
<ul>
<li><p>One-way communication (server → client)</p>
</li>
<li><p>Built on plain HTTP</p>
</li>
<li><p>Automatic reconnection</p>
</li>
<li><p>Very little client-side code</p>
</li>
</ul>
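The "built on plain HTTP" point is literal: an SSE response is just an ordinary HTTP response whose body never ends. Each event is a block of `field: value` lines terminated by a blank line, roughly like this:

```plaintext
HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache

data: {"event":"tick","time":"2025-12-16T15:14:51Z"}

retry: 5000
data: {"event":"tick","time":"2025-12-16T15:14:52Z"}
```

Besides `data:`, the format defines `event:` (a named event type), `id:` (used by the browser to resume after a reconnect via the `Last-Event-ID` header), and `retry:` (reconnection delay in milliseconds).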
<hr />
<h2 id="heading-4-understanding-websocket">4. Understanding WebSocket</h2>
<p>WebSocket is a full-duplex protocol that allows both the client and server to send messages independently over a single, long-lived connection.</p>
<p>The flow looks like this:</p>
<pre><code class="lang-plaintext">Browser
  |\
  | \  (WebSocket handshake)
  v  \
Server
    |&lt;==== bidirectional messages ====&gt;
    |
Browser
</code></pre>
<p>Key ideas:</p>
<ul>
<li><p>Two-way communication (client ↔ server)</p>
</li>
<li><p>Requires a handshake to upgrade from HTTP</p>
</li>
<li><p>Low latency, real-time interaction</p>
</li>
<li><p>More complex client and server implementations</p>
</li>
</ul>
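The handshake mentioned above is a one-time HTTP exchange: the client asks to upgrade, the server answers `101 Switching Protocols`, and from then on the same TCP connection carries WebSocket frames instead of HTTP. Adapted to this article's endpoint (the key/accept pair is the sample from RFC 6455), it looks like:

```plaintext
GET /chatEndPoint HTTP/1.1
Host: localhost:8080
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The `Sec-WebSocket-Accept` value is derived from the client's key, which lets both sides verify they are really speaking WebSocket and not ordinary HTTP.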
<hr />
<h2 id="heading-5-when-to-use-sse-vs-websocket">5. When to Use SSE vs. WebSocket</h2>
<p>Choosing between SSE and WebSocket depends on your application's needs:</p>
<ul>
<li><p>Use SSE when:</p>
<ul>
<li><p>You need simple, one-way updates from server to client.</p>
</li>
<li><p>Your application is read-heavy (e.g., live news feeds, stock tickers).</p>
</li>
<li><p>You want to leverage existing HTTP infrastructure.</p>
</li>
</ul>
</li>
<li><p>Use WebSocket when:</p>
<ul>
<li><p>You need two-way communication.</p>
</li>
<li><p>Your application is interactive (e.g., chat apps, multiplayer games).</p>
</li>
<li><p>Low latency is critical.</p>
</li>
</ul>
</li>
</ul>
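It is worth noting that these options are not mutually exclusive. A common middle ground is to use SSE for the downstream push and ordinary HTTP requests for the occasional upstream message, avoiding WebSocket's extra complexity when the client rarely talks back:

```plaintext
Browser  --- POST /api/commands --->  Server   (client → server: ordinary fetch/Ajax)
Browser  &lt;=== GET /sse/stream ======  Server   (server → client: SSE push)
```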
<hr />
<h2 id="heading-6-comparison-table">6. Comparison Table</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Feature / Aspect</td><td>Server-Sent Events (SSE)</td><td>WebSockets</td></tr>
</thead>
<tbody>
<tr>
<td>Communication Direction</td><td>One-way (Server → Client only)</td><td>Two-way (Client ↔ Server)</td></tr>
<tr>
<td>Protocol Base</td><td>Standard HTTP/HTTPS (works with HTTP/1.1 &amp; HTTP/2)</td><td>Custom WebSocket protocol (after HTTP upgrade handshake)</td></tr>
<tr>
<td>Connection Establishment</td><td>Simple HTTP GET request</td><td>Requires handshake to upgrade from HTTP to WebSocket</td></tr>
<tr>
<td>Data Format</td><td>Text only (UTF-8, event/data fields)</td><td>Text or binary (flexible framing)</td></tr>
<tr>
<td>Automatic Reconnection</td><td>Built-in (EventSource retries automatically)</td><td>Must be implemented manually (heartbeat/reconnect logic)</td></tr>
<tr>
<td>Browser Support</td><td>Widely supported (except legacy IE/Edge)</td><td>Widely supported in modern browsers</td></tr>
<tr>
<td>Firewall/Proxy Friendliness</td><td>Very high (uses standard ports 80/443, HTTP semantics)</td><td>Can be blocked by strict firewalls/proxies (non-HTTP protocol)</td></tr>
<tr>
<td>Client → Server Messaging</td><td>Not supported (needs separate HTTP calls like fetch/Ajax)</td><td>Natively supported (<code>send()</code> method)</td></tr>
<tr>
<td>Complexity</td><td>Low (simple API, browser handles reconnection)</td><td>Higher (manage state, heartbeats, message framing)</td></tr>
<tr>
<td>Performance</td><td>Efficient for server push, but limited to text</td><td>Very efficient, supports high-frequency, low-latency data exchange</td></tr>
<tr>
<td>Typical Use Cases</td><td>Notifications, stock tickers, news feeds, dashboards</td><td>Chat apps, online games, collaborative editing, IoT, real-time trading</td></tr>
</tbody>
</table>
</div><hr />
<h2 id="heading-7-implementing-sse-and-websocket-in-quarkus">7. Implementing SSE and WebSocket in Quarkus</h2>
<h3 id="heading-sse-example-in-quarkus">SSE Example in Quarkus</h3>
<p>This design is a Quarkus SSE (Server‑Sent Events) resource that continuously streams JSON events to connected clients.</p>
<p><strong>The architecture looks like this:</strong></p>
<pre><code class="lang-plaintext">+--------------------+     HTTP (text/event-stream)       +-----------------------+
| Browser            |  -------------------------------&gt;  | Quarkus SSE Resource  |
| Web Component      |   GET /sse/stream                  | streams events        |
| EventSource()      |  &lt;-------------------------------  | every 1s              |
+--------------------+         continuous stream          +-----------------------+
</code></pre>
<p><strong>The server implementation:</strong></p>
<p>This Quarkus resource exposes an SSE endpoint at /sse/stream. It uses a reactive Multi to emit a tick every second, mapping each tick into a JSON object with an event name and a timestamp. Because the method produces text/event-stream and declares application/json as the stream element type, clients receive a continuous stream of JSON messages over a single HTTP connection.</p>
<pre><code class="lang-java"><span class="hljs-meta">@Path("/sse")</span>
<span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">SseResource</span> </span>{

    <span class="hljs-meta">@GET</span>
    <span class="hljs-meta">@Path("/stream")</span>
    <span class="hljs-meta">@Produces("text/event-stream")</span>
    <span class="hljs-meta">@RestStreamElementType("application/json")</span>
    <span class="hljs-keyword">public</span> Multi&lt;Map&lt;String, String&gt;&gt; streamEvents() {
        <span class="hljs-keyword">return</span> Multi.createFrom().ticks().every(Duration.ofSeconds(<span class="hljs-number">1</span>))
                .map(tick -&gt; Map.of(
                        <span class="hljs-string">"event"</span>, <span class="hljs-string">"tick"</span>,
                        <span class="hljs-string">"time"</span>, DateTimeFormatter.ISO_INSTANT.format(Instant.now())
                ));
    }
}
</code></pre>
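Before wiring up a browser client, you can watch the raw stream from a terminal with curl (`-N` disables output buffering so events appear as they arrive). The output will look roughly like:

```plaintext
$ curl -N http://localhost:8080/sse/stream
data: {"event":"tick","time":"2025-12-16T15:14:51Z"}

data: {"event":"tick","time":"2025-12-16T15:14:52Z"}
```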
<p><strong>The client implementation:</strong></p>
<p>The following HTML snippet defines a custom web component that connects to the SSE endpoint and updates its content with the received time every second.</p>
<pre><code class="lang-html"><span class="hljs-tag">&lt;<span class="hljs-name">script</span>&gt;</span><span class="javascript">
    <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">TimeStream</span> <span class="hljs-keyword">extends</span> <span class="hljs-title">HTMLElement</span> </span>{
        connectedCallback() {
            <span class="hljs-built_in">this</span>.innerHTML = <span class="hljs-string">`&lt;p&gt;Waiting for time...&lt;/p&gt;`</span>;
            <span class="hljs-built_in">this</span>.es = <span class="hljs-keyword">new</span> EventSource(<span class="hljs-string">"http://localhost:8080/sse/stream"</span>);
            <span class="hljs-built_in">this</span>.es.onmessage = <span class="hljs-function">(<span class="hljs-params">event</span>) =&gt;</span> {
                <span class="hljs-keyword">const</span> data = <span class="hljs-built_in">JSON</span>.parse(event.data);
                <span class="hljs-built_in">this</span>.querySelector(<span class="hljs-string">"p"</span>).textContent = data.time;
            };
        }

        disconnectedCallback() {
            <span class="hljs-comment">// Close the stream when the element is removed from the DOM,</span>
            <span class="hljs-comment">// otherwise the connection (and its automatic retries) leaks.</span>
            <span class="hljs-built_in">this</span>.es?.close();
        }
    }
    customElements.define(<span class="hljs-string">"time-stream"</span>, TimeStream);
</span><span class="hljs-tag">&lt;/<span class="hljs-name">script</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">body</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">h1</span>&gt;</span>Hello {name}!<span class="hljs-tag">&lt;/<span class="hljs-name">h1</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">time-stream</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">time-stream</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">body</span>&gt;</span>
</code></pre>
<p><strong>Demo page:</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1765893066327/a1f4c6f3-75b2-4a92-916f-41f9d4c3538a.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-websocket-example-in-quarkus">WebSocket Example in Quarkus</h3>
<p>This design is a Quarkus WebSocket resource that enables bidirectional communication between clients and the server.</p>
<p>The architecture looks like this:</p>
<pre><code class="lang-plaintext">+--------------------+      WebSocket (Client ↔ Server)   +-----------------------+
| Browser            |  -------------------------------&gt;  | Quarkus WebSocket     |
| Web Component      |   GET /websocket                   | endpoint              |
| WebSocket()        |  &lt;-------------------------------  |                       |
+--------------------+         bidirectional stream       +-----------------------+
</code></pre>
<p><strong>The server implementation:</strong></p>
<p>The following Java class defines a WebSocket endpoint at /chatEndPoint. It handles connection events, incoming messages, and errors, logging relevant information for each event.</p>
<pre><code class="lang-java"><span class="hljs-meta">@Slf4j</span>
<span class="hljs-meta">@ServerEndpoint("/chatEndPoint")</span>
<span class="hljs-meta">@ApplicationScoped</span>
<span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ChatEndPoint</span> </span>{
    <span class="hljs-meta">@OnOpen</span>
    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">onOpen</span><span class="hljs-params">(Session session)</span> </span>{
        log.info(<span class="hljs-string">"WebSocket opened: {}"</span>, session.getId());
    }

    <span class="hljs-meta">@OnMessage</span>
    <span class="hljs-function"><span class="hljs-keyword">public</span> String <span class="hljs-title">onMessage</span><span class="hljs-params">(String message, Session session)</span> </span>{
        log.info(<span class="hljs-string">"Received from {}: {}"</span>, session.getId(), message);
        <span class="hljs-keyword">return</span> <span class="hljs-string">"Echo: "</span> + message;
    }

    <span class="hljs-meta">@OnClose</span>
    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">onClose</span><span class="hljs-params">(Session session)</span> </span>{
        log.info(<span class="hljs-string">"WebSocket closed: {}"</span>, session.getId());
    }

    <span class="hljs-meta">@OnError</span>
    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">onError</span><span class="hljs-params">(Session session, Throwable throwable)</span> </span>{
        log.error(<span class="hljs-string">"Error in session {}: {}"</span>, session.getId(), throwable.getMessage(), throwable);
    }
}
</code></pre>
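If you want to poke the endpoint before writing any UI, a command-line WebSocket client such as websocat (or wscat) works; each line you type is sent as a message and the echo comes straight back:

```plaintext
$ websocat ws://localhost:8080/chatEndPoint
hello
Echo: hello
```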
<p><strong>The client implementation:</strong></p>
<p>The following HTML snippet defines a custom web component that connects to the WebSocket endpoint, allowing users to send messages and receive echoed responses from the server.</p>
<pre><code class="lang-html"><span class="hljs-tag">&lt;<span class="hljs-name">html</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">script</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"module"</span>&gt;</span><span class="javascript">
        <span class="hljs-keyword">import</span> <span class="hljs-string">"/ws-chat/ws-chat.js"</span>;
    </span><span class="hljs-tag">&lt;/<span class="hljs-name">script</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">body</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">ws-chat</span> <span class="hljs-attr">url</span>=<span class="hljs-string">"ws://localhost:8080/chatEndPoint"</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">ws-chat</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">body</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">html</span>&gt;</span>
</code></pre>
<p><strong>The custom web component implementation (using the Lit framework):</strong></p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> {
  LitElement,
  html,
  css,
} <span class="hljs-keyword">from</span> <span class="hljs-string">"https://cdn.jsdelivr.net/gh/lit/dist@3/core/lit-core.min.js"</span>;

<span class="hljs-keyword">export</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">WsChat</span> <span class="hljs-keyword">extends</span> <span class="hljs-title">LitElement</span> </span>{
  <span class="hljs-keyword">static</span> properties = {
    <span class="hljs-attr">url</span>: { <span class="hljs-attr">type</span>: <span class="hljs-built_in">String</span> }, <span class="hljs-comment">// &lt;ws-chat url="ws://..."&gt;</span>
    <span class="hljs-attr">connected</span>: { <span class="hljs-attr">type</span>: <span class="hljs-built_in">Boolean</span>, <span class="hljs-attr">state</span>: <span class="hljs-literal">true</span> },
    <span class="hljs-attr">messages</span>: { <span class="hljs-attr">type</span>: <span class="hljs-built_in">Array</span>, <span class="hljs-attr">state</span>: <span class="hljs-literal">true</span> },
  };

  <span class="hljs-keyword">constructor</span>() {
    <span class="hljs-built_in">super</span>();
    <span class="hljs-built_in">this</span>.url = <span class="hljs-string">""</span>; <span class="hljs-comment">// default computed in connect()</span>
    <span class="hljs-built_in">this</span>.connected = <span class="hljs-literal">false</span>;
    <span class="hljs-built_in">this</span>.messages = [];
    <span class="hljs-built_in">this</span>._ws = <span class="hljs-literal">null</span>;
  }

  connectedCallback() {
    <span class="hljs-built_in">super</span>.connectedCallback();
    <span class="hljs-built_in">this</span>._connect();
  }

  disconnectedCallback() {
    <span class="hljs-built_in">super</span>.disconnectedCallback();
    <span class="hljs-built_in">this</span>._close();
  }

  updated(changed) {
    <span class="hljs-comment">// If url attribute changes at runtime, reconnect</span>
    <span class="hljs-keyword">if</span> (changed.has(<span class="hljs-string">"url"</span>)) {
      <span class="hljs-built_in">this</span>._connect(<span class="hljs-literal">true</span>);
    }
  }

  render() {
    <span class="hljs-keyword">return</span> html`<span class="xml">
      <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"chat"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"output"</span>&gt;</span></span><span class="hljs-subst">${<span class="hljs-built_in">this</span>.messages.map((m) =&gt; html`<span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span></span><span class="hljs-subst">${m}</span><span class="xml"><span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>`</span>)}</span><span class="xml"><span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"input-row"</span>&gt;</span>
          <span class="hljs-tag">&lt;<span class="hljs-name">input</span> @<span class="hljs-attr">keydown</span>=</span></span><span class="hljs-subst">${<span class="hljs-built_in">this</span>._onKeyDown}</span><span class="xml"><span class="hljs-tag"> <span class="hljs-attr">placeholder</span>=<span class="hljs-string">"Type a message..."</span> /&gt;</span>
          <span class="hljs-tag">&lt;<span class="hljs-name">button</span> ?<span class="hljs-attr">disabled</span>=</span></span><span class="hljs-subst">${!<span class="hljs-built_in">this</span>.connected}</span><span class="xml"><span class="hljs-tag"> @<span class="hljs-attr">click</span>=</span></span><span class="hljs-subst">${<span class="hljs-built_in">this</span>.sendMessage}</span><span class="xml"><span class="hljs-tag">&gt;</span>
            Send
          <span class="hljs-tag">&lt;/<span class="hljs-name">button</span>&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
      <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
    `</span>;
  }

  _onKeyDown = <span class="hljs-function">(<span class="hljs-params">e</span>) =&gt;</span> {
    <span class="hljs-keyword">if</span> (e.key === <span class="hljs-string">"Enter"</span>) <span class="hljs-built_in">this</span>.sendMessage();
  };

  _add(msg) {
    <span class="hljs-built_in">this</span>.messages = [...this.messages, msg];

    <span class="hljs-comment">// auto-scroll after render</span>
    <span class="hljs-built_in">this</span>.updateComplete.then(<span class="hljs-function">() =&gt;</span> {
      <span class="hljs-keyword">const</span> out = <span class="hljs-built_in">this</span>.renderRoot.querySelector(<span class="hljs-string">".output"</span>);
      out.scrollTop = out.scrollHeight;
    });
  }

  _defaultUrl() {
    <span class="hljs-keyword">const</span> proto = location.protocol === <span class="hljs-string">"https:"</span> ? <span class="hljs-string">"wss"</span> : <span class="hljs-string">"ws"</span>;
    <span class="hljs-keyword">return</span> <span class="hljs-string">`<span class="hljs-subst">${proto}</span>://<span class="hljs-subst">${location.host}</span>/ws`</span>;
  }

  _connect(force = <span class="hljs-literal">false</span>) {
    <span class="hljs-keyword">const</span> target = <span class="hljs-built_in">this</span>.url?.trim() || <span class="hljs-built_in">this</span>._defaultUrl();
    <span class="hljs-keyword">if</span> (!force &amp;&amp; <span class="hljs-built_in">this</span>._ws &amp;&amp; <span class="hljs-built_in">this</span>._ws.readyState === WebSocket.OPEN) <span class="hljs-keyword">return</span>;

    <span class="hljs-built_in">this</span>._close();
    <span class="hljs-built_in">this</span>.connected = <span class="hljs-literal">false</span>;

    <span class="hljs-built_in">this</span>._ws = <span class="hljs-keyword">new</span> WebSocket(target);
    <span class="hljs-built_in">this</span>._ws.onopen = <span class="hljs-function">() =&gt;</span> {
      <span class="hljs-built_in">this</span>.connected = <span class="hljs-literal">true</span>;
      <span class="hljs-built_in">this</span>._add(<span class="hljs-string">"Connected to server."</span>);
    };
    <span class="hljs-built_in">this</span>._ws.onclose = <span class="hljs-function">() =&gt;</span> {
      <span class="hljs-built_in">this</span>.connected = <span class="hljs-literal">false</span>;
      <span class="hljs-built_in">this</span>._add(<span class="hljs-string">"Disconnected from server."</span>);
    };
    <span class="hljs-built_in">this</span>._ws.onmessage = <span class="hljs-function">(<span class="hljs-params">e</span>) =&gt;</span> <span class="hljs-built_in">this</span>._add(<span class="hljs-string">"Server: "</span> + e.data);
  }

  _close() {
    <span class="hljs-keyword">if</span> (!<span class="hljs-built_in">this</span>._ws) <span class="hljs-keyword">return</span>;
    <span class="hljs-keyword">try</span> {
      <span class="hljs-built_in">this</span>._ws.close(<span class="hljs-number">1000</span>);
    } <span class="hljs-keyword">catch</span> {}
    <span class="hljs-built_in">this</span>._ws = <span class="hljs-literal">null</span>;
  }

  sendMessage = <span class="hljs-function">() =&gt;</span> {
    <span class="hljs-keyword">const</span> input = <span class="hljs-built_in">this</span>.renderRoot.querySelector(<span class="hljs-string">"input"</span>);
    <span class="hljs-keyword">const</span> msg = input.value.trim();
    <span class="hljs-keyword">if</span> (!msg || !<span class="hljs-built_in">this</span>._ws || <span class="hljs-built_in">this</span>._ws.readyState !== WebSocket.OPEN) <span class="hljs-keyword">return</span>;

    <span class="hljs-built_in">this</span>._ws.send(msg);
    <span class="hljs-built_in">this</span>._add(<span class="hljs-string">"You: "</span> + msg);
    input.value = <span class="hljs-string">""</span>;
    input.focus();
  };
}

customElements.define(<span class="hljs-string">"ws-chat"</span>, WsChat);
</code></pre>
<p><strong>Demo page:</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1765893129580/5494bb77-e8eb-44f2-aa2c-f94c134b1d16.png" alt class="image--center mx-auto" /></p>
<hr />
<h2 id="heading-8-key-takeaways">8. Key Takeaways</h2>
<ul>
<li><p>SSE and WebSocket are powerful tools for real-time web applications, each with its strengths and ideal use cases.</p>
</li>
<li><p>SSE is perfect for simple, one-way server-to-client updates, while WebSocket excels in interactive, two-way communication.</p>
</li>
<li><p>Quarkus makes it easy to implement both SSE and WebSocket, allowing developers to build modern, responsive applications that meet user expectations for real-time interactivity.</p>
</li>
</ul>
<hr />
<h2 id="heading-9-conclusion">9. Conclusion</h2>
<p>Real‑time communication is no longer a luxury — it has become an expectation. The challenge is not to reach for the most powerful tool by default, but to select the one that best fits the need.<br />When your system primarily delivers information to users, <strong>Server‑Sent Events (SSE)</strong> offer an elegant and efficient solution. When interaction flows both ways and users need to respond in real time, <strong>WebSockets</strong> provide the right abstraction.<br />With Quarkus, both approaches are accessible, performant, and ready for production, giving you the flexibility to choose wisely without compromise.</p>
<hr />
<h2 id="heading-10-resources">10. Resources</h2>
<ul>
<li><p><a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events">Server-sent events</a></p>
</li>
<li><p><a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/API/WebSocket">WebSocket</a></p>
</li>
<li><p><a target="_blank" href="https://quarkus.io/">Quarkus</a></p>
</li>
<li><p><a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/Web_Components">Web Components</a></p>
</li>
<li><p><a target="_blank" href="https://lit.dev/">Lit Framework</a></p>
</li>
</ul>
<p>You can find the full code examples in my GitHub repository: <a target="_blank" href="https://github.com/arata-x/sse-and-websocket-demo">sse-and-websocket-demo</a></p>
]]></content:encoded></item><item><title><![CDATA[Building a Multi-Node Kubernetes Cluster with Vagrant]]></title><description><![CDATA["In distributed systems, consistency isn’t just a property — it’s a promise."
Why This Article
Imagine you’re building a small banking application. Users can deposit and withdraw money, check their balances, and expect data accuracy every single time...]]></description><link>https://code.aratax.dev/building-a-multi-node-kubernetes-cluster-with-vagrant</link><guid isPermaLink="true">https://code.aratax.dev/building-a-multi-node-kubernetes-cluster-with-vagrant</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[k8s]]></category><category><![CDATA[vagrant]]></category><category><![CDATA[Springboot]]></category><category><![CDATA[PostgreSQL]]></category><dc:creator><![CDATA[Arata]]></dc:creator><pubDate>Mon, 10 Nov 2025 07:37:30 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1762752770612/c59bc5fe-ad32-4232-b2cc-3b43d4477576.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>"In distributed systems, consistency isn’t just a property — it’s a promise."</p>
<h2 id="heading-why-this-article">Why This Article</h2>
<p>Imagine you’re building a small banking application. Users can deposit and withdraw money, check their balances, and expect data accuracy every single time — even if multiple requests hit the system simultaneously. But the moment you deploy it across containers, networks, and replicas, one question starts haunting every architect:</p>
<p>How do we keep data consistent when everything is happening everywhere?</p>
<p>In this tutorial, we’ll explore that question through a hands-on story — from concept to infrastructure — and deploy a Spring Boot + PostgreSQL banking demo across a five-node Kubernetes lab, fully automated with Vagrant. Our goal isn’t to ship production code, but to understand the design thinking behind consistency, locking, and automation.</p>
<p>Most tutorials use <strong>Minikube</strong> or <strong>kind</strong>, which are great for learning but limited to single-node simulations.<br />What if you could spin up a <strong>full Kubernetes cluster</strong> — control plane, multiple worker nodes, real networking, storage, and ingress — entirely automated and reproducible?</p>
<p>It’s a perfect local lab for experimenting with deployments, storage, and load testing — without relying on cloud services.</p>
<hr />
<h2 id="heading-what-youll-learn">What You’ll Learn</h2>
<ul>
<li><p>Build a <strong>5-node Kubernetes cluster</strong> using Vagrant and VirtualBox</p>
</li>
<li><p>Automate provisioning with <strong>Bash scripts</strong></p>
</li>
<li><p>Deploy a real <strong>Spring Boot + Postgres</strong> application</p>
</li>
<li><p>Load-test the REST endpoints with <strong>k6</strong></p>
</li>
</ul>
<blockquote>
<p>Build, deploy, and test a real multi-node Kubernetes cluster from scratch — all on your local machine.</p>
</blockquote>
<hr />
<h2 id="heading-design">Design</h2>
<p>The system provides RESTful endpoints for withdrawal and deposit operations, served by a Spring Boot–based API backend. When the API receives a client request, it updates the account balance in a PostgreSQL relational database. To ensure data consistency under concurrent transactions, the system supports both optimistic and pessimistic locking mechanisms.</p>
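The optimistic variant mentioned above boils down to a compare-and-set on a version column: read the row, compute the new balance, and make the UPDATE conditional on the version you read. Below is a minimal, framework-free Java sketch of that rule (class and method names are illustrative, not taken from the demo project):

```java
public class OptimisticLockDemo {

    // Immutable snapshot of an account row; `version` plays the role of a JPA @Version column.
    record Account(long balance, int version) {}

    // Models "UPDATE account SET balance = ?, version = version + 1 WHERE version = ?".
    // Returns the updated row, or null when another writer committed first (stale version),
    // in which case the caller must re-read and retry.
    static Account tryWithdraw(Account current, int versionRead, long amount) {
        if (current.version() != versionRead) {
            return null; // optimistic conflict: our snapshot is stale
        }
        return new Account(current.balance() - amount, current.version() + 1);
    }

    public static void main(String[] args) {
        Account row = new Account(100, 0);

        // Two concurrent requests both read version 0.
        Account first = tryWithdraw(row, 0, 30);  // succeeds, row moves to version 1
        row = first;
        Account second = tryWithdraw(row, 0, 30); // rejected: version is now 1, not 0

        System.out.println("row after first withdraw = " + row);
        System.out.println("second withdraw rejected = " + (second == null));
    }
}
```

Pessimistic locking inverts the trade-off: a `SELECT ... FOR UPDATE` makes the second writer wait instead of being rejected, which is simpler for heavily contended rows but holds database locks for the whole transaction.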
<p>Both the API backend and the PostgreSQL database are deployed on a Kubernetes cluster comprising five virtual machines:</p>
<ul>
<li><p>One control plane node for cluster management</p>
</li>
<li><p>One edge node for network routing and ingress</p>
</li>
<li><p>Two worker nodes hosting the Spring Boot web applications</p>
</li>
<li><p>One database node running PostgreSQL</p>
</li>
</ul>
<p>This is a visual representation of the cluster setup:</p>
<h3 id="heading-k8s-architecture-overview">K8S Architecture Overview</h3>
<pre><code class="lang-plaintext">┌─────────────────────────────────────────────────────────────┐
│                    Kubernetes Cluster                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Control Plane (k8s-cp-01)                                  │
│  └─ IP: 192.168.56.10                                       │
│  └─ Role: Master node, API server, scheduler, controller    │
│                                                             │
│  Worker Nodes:                                              │
│  ├─ k8s-node-01 (192.168.56.11) - tier: edge                │
│  │  └─ Ingress Controller, Local Path Provisioner, MetalLB  │
│  ├─ k8s-node-02 (192.168.56.12) - tier: backend             │
│  │  └─ Spring Boot Application Pods                         │
│  ├─ k8s-node-03 (192.168.56.13) - tier: backend             │
│  │  └─ Spring Boot Application Pods                         │
│  └─ k8s-node-04 (192.168.56.14) - tier: database            │
│     └─ PostgreSQL Database                                  │
│                                                             │
│  LoadBalancer IP Pool: 192.168.56.240-250                   │
└─────────────────────────────────────────────────────────────┘
</code></pre>
<hr />
<h2 id="heading-building-the-complete-environment">Building the Complete Environment</h2>
<p>The environment is provisioned with Vagrant, which automates the creation of the virtual machines and the setup of the Kubernetes cluster. Once the infrastructure is ready, the provisioning scripts deploy the prebuilt cloud-native Spring Boot web application and a fully configured PostgreSQL database, completing the application service environment.</p>
<h3 id="heading-prerequisites">Prerequisites</h3>
<p>The following tools are required on your host machine:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Requirement</td><td>Minimum</td><td>Notes</td></tr>
</thead>
<tbody>
<tr>
<td>VirtualBox</td><td>≥ 7.1.6</td><td>See install checklist</td></tr>
<tr>
<td>Vagrant</td><td>≥ 2.4.9</td><td>See install checklist</td></tr>
<tr>
<td>RAM</td><td>≥ 13 GB</td><td>3 GB for the control plane + 2 GB per node</td></tr>
<tr>
<td>CPU</td><td>≥ 4 cores</td><td>Recommended</td></tr>
<tr>
<td>Network</td><td>192.168.56.0/24</td><td>VirtualBox Host-Only</td></tr>
</tbody>
</table>
</div><p>Below are the installation commands:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Windows (Chocolatey, run admin PowerShell)</span>
choco install virtualbox vagrant

<span class="hljs-comment"># Ubuntu/Debian</span>
sudo apt-get update &amp;&amp; sudo apt-get install -y virtualbox
<span class="hljs-comment"># Get latest vagrant from HashiCorp website or apt repo</span>
<span class="hljs-comment"># Install HashiCorp GPG key</span>
wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
<span class="hljs-comment"># Add HashiCorp repository</span>
<span class="hljs-built_in">echo</span> <span class="hljs-string">"deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] \
https://apt.releases.hashicorp.com <span class="hljs-subst">$(lsb_release -cs)</span> main"</span> | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt-get update &amp;&amp; sudo apt-get install -y vagrant
</code></pre>
<h2 id="heading-step-1-vagrant">Step 1: Vagrant</h2>
<p>Clone the project</p>
<pre><code class="lang-bash">git <span class="hljs-built_in">clone</span> https://github.com/arata-x/vagrant-k8s-bank-demo.git
</code></pre>
<h3 id="heading-project-structure">Project Structure</h3>
<p>The outline of the project structure is as follows:</p>
<pre><code class="lang-plaintext">Root:
│  Vagrantfile
│
└─provision
    ├─deployment
    │  │
    │  └─standard
    │      ├─app
    │      │      10-config.yml
    │      │      20-rbac.yml
    │      │      30-db-deploy.yml
    │      │      40-app-deploy.yml
    │      │      50-services.yml
    │      │      60-network-policy.yml
    │      │      70-utilities.yml
    │      │
    │      └─infra
    │              10-storage-class.yml
    │              20-metallb.yaml
    │
    └─foundation
            10-common.sh
            20-node-network.sh
            30-control-panel.sh
            40-join-node.sh
            50-after-vagrant-setup.sh
            join-command.sh
</code></pre>
<p>A <strong>Vagrantfile</strong> is a configuration file written in Ruby syntax that defines how Vagrant should provision and manage a virtual machine (VM). It’s the heart of any Vagrant project—used to automate the setup of reproducible development environments.</p>
<p>A Vagrantfile specifies:</p>
<ul>
<li><p>Base OS image (e.g., ubuntu/jammy64)</p>
</li>
<li><p>Resources (CPU, memory, disk)</p>
</li>
<li><p>Networking (port forwarding, private/public networks)</p>
</li>
<li><p>Provisioning scripts (e.g., install Java, Maven, Docker)</p>
</li>
<li><p>Shared folders between host and VM</p>
</li>
</ul>
<p>Below is the content of the Vagrantfile:</p>
<pre><code class="lang-ruby">Vagrant.configure(<span class="hljs-string">"2"</span>) <span class="hljs-keyword">do</span> <span class="hljs-params">|config|</span>
  config.vm.box = <span class="hljs-string">"ubuntu/jammy64"</span>
  config.vm.synced_folder <span class="hljs-string">"."</span>, <span class="hljs-string">"/vagrant"</span>
  root_path = <span class="hljs-string">"provision/foundation/"</span>;
  <span class="hljs-comment"># Common setup for all nodes</span>
  config.vm.provision <span class="hljs-string">"shell"</span>, <span class="hljs-symbol">path:</span> root_path + <span class="hljs-string">"10-common.sh"</span>
  <span class="hljs-comment"># Node definitions</span>
  nodes = [
    { <span class="hljs-symbol">name:</span> <span class="hljs-string">"k8s-cp-01"</span>,  <span class="hljs-symbol">ip:</span> <span class="hljs-string">"192.168.56.10"</span>, <span class="hljs-symbol">script:</span> <span class="hljs-string">"30-control-panel.sh"</span>, <span class="hljs-symbol">memory:</span> <span class="hljs-number">3072</span> },
    { <span class="hljs-symbol">name:</span> <span class="hljs-string">"k8s-node-01"</span>, <span class="hljs-symbol">ip:</span> <span class="hljs-string">"192.168.56.11"</span>, <span class="hljs-symbol">script:</span> <span class="hljs-string">"40-join-node.sh"</span>,    <span class="hljs-symbol">memory:</span> <span class="hljs-number">2048</span> },
    { <span class="hljs-symbol">name:</span> <span class="hljs-string">"k8s-node-02"</span>, <span class="hljs-symbol">ip:</span> <span class="hljs-string">"192.168.56.12"</span>, <span class="hljs-symbol">script:</span> <span class="hljs-string">"40-join-node.sh"</span>,    <span class="hljs-symbol">memory:</span> <span class="hljs-number">2048</span> },
    { <span class="hljs-symbol">name:</span> <span class="hljs-string">"k8s-node-03"</span>, <span class="hljs-symbol">ip:</span> <span class="hljs-string">"192.168.56.13"</span>, <span class="hljs-symbol">script:</span> <span class="hljs-string">"40-join-node.sh"</span>,    <span class="hljs-symbol">memory:</span> <span class="hljs-number">2048</span> },
    { <span class="hljs-symbol">name:</span> <span class="hljs-string">"k8s-node-04"</span>, <span class="hljs-symbol">ip:</span> <span class="hljs-string">"192.168.56.14"</span>, <span class="hljs-symbol">script:</span> <span class="hljs-string">"40-join-node.sh"</span>,    <span class="hljs-symbol">memory:</span> <span class="hljs-number">2048</span> },
  ]
  <span class="hljs-comment"># Create VMs</span>
  nodes.each <span class="hljs-keyword">do</span> <span class="hljs-params">|node|</span>
    config.vm.define node[<span class="hljs-symbol">:name</span>] <span class="hljs-keyword">do</span> <span class="hljs-params">|node_vm|</span>
      node_vm.vm.provider <span class="hljs-string">"virtualbox"</span> <span class="hljs-keyword">do</span> <span class="hljs-params">|vb|</span>
        vb.cpus = <span class="hljs-number">2</span>
        vb.memory = node[<span class="hljs-symbol">:memory</span>]
      <span class="hljs-keyword">end</span>
      node_vm.vm.hostname = node[<span class="hljs-symbol">:name</span>]
      node_vm.vm.network <span class="hljs-string">"private_network"</span>, <span class="hljs-symbol">ip:</span> node[<span class="hljs-symbol">:ip</span>]
      node_vm.vm.provision <span class="hljs-string">"shell"</span>, <span class="hljs-symbol">path:</span> root_path + <span class="hljs-string">"20-node-network.sh"</span>, <span class="hljs-symbol">args:</span> node[<span class="hljs-symbol">:ip</span>]      
      node_vm.vm.provision <span class="hljs-string">"shell"</span>, <span class="hljs-symbol">path:</span> root_path + <span class="hljs-string">"<span class="hljs-subst">#{node[<span class="hljs-symbol">:script</span>]}</span>"</span>
    <span class="hljs-keyword">end</span>
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span>
</code></pre>
<h3 id="heading-1-overview">1️⃣ Overview</h3>
<p>This Vagrantfile:</p>
<ul>
<li><p>Creates 5 Ubuntu 22.04 VMs</p>
</li>
<li><p>Installs containerd, kubelet, and kubeadm (kubectl is added on the control plane)</p>
</li>
<li><p>Initializes Kubernetes control plane</p>
</li>
<li><p>Joins 4 worker nodes</p>
</li>
<li><p>Configures Calico CNI networking</p>
</li>
</ul>
<h3 id="heading-2-global-configuration">2️⃣ Global Configuration</h3>
<pre><code class="lang-ruby">Vagrant.configure(<span class="hljs-string">"2"</span>) <span class="hljs-keyword">do</span> <span class="hljs-params">|config|</span>
  config.vm.box = <span class="hljs-string">"ubuntu/jammy64"</span>
  config.vm.synced_folder <span class="hljs-string">"."</span>, <span class="hljs-string">"/vagrant"</span>
</code></pre>
<ul>
<li><p><code>Vagrant.configure("2")</code> → Uses configuration syntax version 2.</p>
</li>
<li><p><code>config.vm.box</code> → Every VM uses Ubuntu 22.04 LTS (“jammy64”).</p>
</li>
<li><p><code>config.vm.synced_folder</code> → Shares your project folder on the host with each guest VM at <code>/vagrant</code>.</p>
</li>
</ul>
<h3 id="heading-3-common-provisioning">3️⃣ Common Provisioning</h3>
<pre><code class="lang-ruby">  root_path = <span class="hljs-string">"provision/foundation/"</span>
  config.vm.provision <span class="hljs-string">"shell"</span>, <span class="hljs-symbol">path:</span> root_path + <span class="hljs-string">"10-common.sh"</span>
</code></pre>
<p>Declared once at the top level, this provisioner runs on <strong>every VM</strong> to install baseline packages, set up host files, etc.</p>
<h3 id="heading-4-cluster-node-setup">4️⃣ Cluster node setup</h3>
<pre><code class="lang-ruby">nodes = [
  { <span class="hljs-symbol">name:</span> <span class="hljs-string">"k8s-cp-01"</span>,  <span class="hljs-symbol">ip:</span> <span class="hljs-string">"192.168.56.10"</span>, <span class="hljs-symbol">script:</span> <span class="hljs-string">"30-control-panel.sh"</span>, <span class="hljs-symbol">memory:</span> <span class="hljs-number">3072</span> },
  { <span class="hljs-symbol">name:</span> <span class="hljs-string">"k8s-node-01"</span>, <span class="hljs-symbol">ip:</span> <span class="hljs-string">"192.168.56.11"</span>, <span class="hljs-symbol">script:</span> <span class="hljs-string">"40-join-node.sh"</span>,    <span class="hljs-symbol">memory:</span> <span class="hljs-number">2048</span> },
  { <span class="hljs-symbol">name:</span> <span class="hljs-string">"k8s-node-02"</span>, <span class="hljs-symbol">ip:</span> <span class="hljs-string">"192.168.56.12"</span>, <span class="hljs-symbol">script:</span> <span class="hljs-string">"40-join-node.sh"</span>,    <span class="hljs-symbol">memory:</span> <span class="hljs-number">2048</span> },
  { <span class="hljs-symbol">name:</span> <span class="hljs-string">"k8s-node-03"</span>, <span class="hljs-symbol">ip:</span> <span class="hljs-string">"192.168.56.13"</span>, <span class="hljs-symbol">script:</span> <span class="hljs-string">"40-join-node.sh"</span>,    <span class="hljs-symbol">memory:</span> <span class="hljs-number">2048</span> },
  { <span class="hljs-symbol">name:</span> <span class="hljs-string">"k8s-node-04"</span>, <span class="hljs-symbol">ip:</span> <span class="hljs-string">"192.168.56.14"</span>, <span class="hljs-symbol">script:</span> <span class="hljs-string">"40-join-node.sh"</span>,    <span class="hljs-symbol">memory:</span> <span class="hljs-number">2048</span> },
]
</code></pre>
<p>Defines the five nodes consumed by the creation loop below: the control plane gets 3 GB of memory, each worker 2 GB.</p>
<h3 id="heading-5-node-creation-loop">5️⃣ Node Creation Loop</h3>
<pre><code class="lang-ruby">nodes.each <span class="hljs-keyword">do</span> <span class="hljs-params">|node|</span>
  config.vm.define node[<span class="hljs-symbol">:name</span>] <span class="hljs-keyword">do</span> <span class="hljs-params">|node_vm|</span>
    node_vm.vm.provider <span class="hljs-string">"virtualbox"</span> <span class="hljs-keyword">do</span> <span class="hljs-params">|vb|</span>
      vb.cpus = <span class="hljs-number">2</span>
      vb.memory = node[<span class="hljs-symbol">:memory</span>]
    <span class="hljs-keyword">end</span>
    node_vm.vm.hostname = node[<span class="hljs-symbol">:name</span>]
    node_vm.vm.network <span class="hljs-string">"private_network"</span>, <span class="hljs-symbol">ip:</span> node[<span class="hljs-symbol">:ip</span>]
    node_vm.vm.provision <span class="hljs-string">"shell"</span>, <span class="hljs-symbol">path:</span> root_path + <span class="hljs-string">"20-node-network.sh"</span>, <span class="hljs-symbol">args:</span> node[<span class="hljs-symbol">:ip</span>]      
    node_vm.vm.provision <span class="hljs-string">"shell"</span>, <span class="hljs-symbol">path:</span> root_path + <span class="hljs-string">"<span class="hljs-subst">#{node[<span class="hljs-symbol">:script</span>]}</span>"</span>
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span>
</code></pre>
<p>For each node:</p>
<ol>
<li><p><strong>Defines</strong> a named VM.</p>
</li>
<li><p><strong>Allocates</strong> 2 CPUs and the per-node memory.</p>
</li>
<li><p><strong>Sets the hostname</strong> inside the guest.</p>
</li>
<li><p><strong>Configures a private network</strong> on <code>192.168.56.0/24</code>.</p>
</li>
<li><p><strong>Runs</strong> <code>20-node-network.sh</code> to configure the node IP, <code>/etc/hosts</code>, etc.</p>
</li>
<li><p><strong>Runs the role-specific script</strong> (<code>30-control-panel.sh</code> or <code>40-join-node.sh</code>).</p>
</li>
</ol>
<h3 id="heading-6-build-the-cluster">6️⃣ Build the Cluster</h3>
<p>Run <code>vagrant up</code> to start provisioning:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">cd</span> vagrant-k8s-bank-demo
vagrant up
</code></pre>
<p>🕒 <strong>Expected duration:</strong> 10–15 minutes.</p>
<p>Verify all VMs are running:</p>
<pre><code class="lang-bash">vagrant status
</code></pre>
<h3 id="heading-7-provisioning-scripts-deep-dive">7️⃣ <strong>Provisioning Scripts Deep Dive</strong></h3>
<p>Let's take a closer look at the shell scripts used during provisioning.</p>
<p><strong>10-common.sh</strong></p>
<p>The script disables swap, installs the required Kubernetes components, and prepares the kernel for Kubernetes networking.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Disable Swap</span>
sudo swapoff -a
sudo sed -i <span class="hljs-string">'/ swap / s/^/#/'</span> /etc/fstab

<span class="hljs-comment"># Install Core Dependencies (Kubernetes APT repository setup elided for brevity)</span>
sudo apt-get install -y kubelet kubeadm containerd

cat &lt;&lt;EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter

cat &lt;&lt;<span class="hljs-string">'EOF'</span> | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
</code></pre>
<p><strong>Purpose</strong></p>
<p><strong>1. Disable Swap</strong></p>
<p>Ensures Kubernetes can accurately manage memory resources.</p>
<p><strong>2. Add Kubernetes Repository and Install Components</strong></p>
<p>Configures the official Kubernetes APT repo and installs kubelet, kubeadm, and containerd.</p>
<p><strong>3. Setup Kernel Modules</strong></p>
<p>Enables overlay and br_netfilter modules required for container networking and storage layers.</p>
<p><strong>4. Set Kernel Parameters</strong></p>
<p>Adjusts sysctl settings to enable IP forwarding and proper packet handling between bridged interfaces.</p>
<p><strong>20-node-network.sh</strong></p>
<p>This script configures the Kubernetes node’s network identity by explicitly assigning its IP address to the kubelet service, ensuring proper communication and cluster registration.</p>
<pre><code class="lang-bash">NODE_IP=$(ip -4 addr show enp0s8 | grep -oP <span class="hljs-string">'(?&lt;=inet\s)\d+(\.\d+){3}'</span>)

DROPIN_FILE=/lib/systemd/system/kubelet.service.d/10-kubeadm.conf

<span class="hljs-keyword">if</span> ! grep -q -- <span class="hljs-string">"--node-ip=<span class="hljs-variable">$NODE_IP</span>"</span> <span class="hljs-string">"<span class="hljs-variable">$DROPIN_FILE</span>"</span>; <span class="hljs-keyword">then</span>
  sudo sed -i <span class="hljs-string">"0,/^Environment=\"KUBELET_KUBECONFIG_ARGS=/s|\"$| --node-ip=<span class="hljs-variable">$NODE_IP</span>\"|"</span> <span class="hljs-string">"<span class="hljs-variable">$DROPIN_FILE</span>"</span>
<span class="hljs-keyword">fi</span>

sudo systemctl daemon-reexec
sudo systemctl daemon-reload
sudo systemctl restart kubelet
</code></pre>
<p>When using Vagrant, a default NAT interface (enp0s3) is created for outbound network access. A second, user-defined network interface (enp0s8) is typically added for internal cluster communication. However, Kubernetes may fail to correctly resolve the node’s IP address in this setup, requiring manual configuration.</p>
<p>After testing, the following approach proves effective: explicitly assign the node IP to the enp0s8 interface and configure the kubelet to use this IP. Once applied, the kubelet service starts with the correct node IP address, ensuring reliable communication between cluster components and accurate node registration within the Kubernetes control plane.</p>
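<p>The sed edit in that script is easy to try in isolation. The following self-contained sketch uses a temporary file instead of the real kubelet drop-in; the sample file content is an assumption modeled on kubeadm's default drop-in, but the substitution is the same one the provisioning script runs:</p>

```shell
# Stand-alone demo of the provisioning script's sed edit on a sample drop-in.
DROPIN=$(mktemp)
cat > "$DROPIN" <<'EOF'
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
EOF
NODE_IP=192.168.56.12
# Append --node-ip just before the closing quote of the first
# KUBELET_KUBECONFIG_ARGS line (GNU sed "0,/pat/" range).
sed -i "0,/^Environment=\"KUBELET_KUBECONFIG_ARGS=/s|\"\$| --node-ip=$NODE_IP\"|" "$DROPIN"
grep -- "--node-ip=$NODE_IP" "$DROPIN"
```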
<p><strong>30-control-panel.sh</strong></p>
<p>This script automates the setup of a Kubernetes control plane node in a virtualized environment. It also takes care of installing and configuring essential tools like kubectl, the Container Network Interface (CNI), and a monitoring stack to give you full visibility into your cluster.</p>
<pre><code class="lang-bash">sudo apt-get install -y kubectl

<span class="hljs-comment"># Initialize cluster</span>
sudo kubeadm init --apiserver-advertise-address=192.168.56.10 --pod-network-cidr=10.224.0.0/16

<span class="hljs-comment"># Setup kubeconfig</span>
mkdir -p /home/vagrant/.kube
cp /etc/kubernetes/admin.conf /home/vagrant/.kube/config
chown vagrant:vagrant /home/vagrant/.kube/config
mkdir -p ~/.kube
cp /etc/kubernetes/admin.conf ~/.kube/config

<span class="hljs-comment"># Create join command</span>
sudo kubeadm token create --print-join-command &gt; /vagrant/provision/foundation/join-command.sh

<span class="hljs-comment"># Install Calico network plugin</span>
AUTO_METHOD=<span class="hljs-string">"cidr=192.168.56.0/24"</span>
curl -O https://raw.githubusercontent.com/projectcalico/calico/v3.30.3/manifests/calico.yaml
kubectl apply -f calico.yaml
kubectl <span class="hljs-built_in">set</span> env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=<span class="hljs-string">"<span class="hljs-variable">$AUTO_METHOD</span>"</span>
sudo systemctl restart kubelet

<span class="hljs-comment"># Install K9s</span>
wget https://github.com/derailed/k9s/releases/latest/download/k9s_linux_amd64.deb
sudo apt install ./k9s_linux_amd64.deb
</code></pre>
<p><strong>1. kubectl Installation and Cluster Initialization</strong></p>
<p>Installs kubectl, then initializes the Kubernetes cluster with kubeadm, specifying the API server advertise address (192.168.56.10) and the Pod network CIDR (10.224.0.0/16).</p>
<p><strong>2. Kubeconfig Configuration</strong></p>
<p>Sets up the Kubernetes configuration (admin.conf) for both the vagrant user and the root user, enabling access to cluster management commands.</p>
<p><strong>3. Node Join Command Generation</strong></p>
<p>Creates and stores the cluster join command in /vagrant/provision/foundation/join-command.sh for worker nodes to join the cluster.</p>
<p><strong>4. Calico Network Plugin Setup</strong></p>
<p>Downloads and applies the Calico manifest to enable networking between pods. Configures Calico’s IP autodetection method to use the local network (cidr=192.168.56.0/24).</p>
<p><strong>5. Kubernetes Management Tool Installation</strong></p>
<p>Installs k9s, a terminal-based Kubernetes cluster management tool.</p>
<hr />
<h2 id="heading-step-2-post-init-setup">Step 2: Post-Init Setup</h2>
<p>SSH into the control plane:</p>
<pre><code class="lang-bash">vagrant ssh k8s-cp-01
</code></pre>
<p>Run the post-initialization script:</p>
<pre><code class="lang-bash">sudo /vagrant/provision/foundation/50-after-vagrant-setup.sh
</code></pre>
<p><strong><em>50-after-vagrant-setup.sh</em></strong></p>
<p>This script applies functional labels that define node roles (edge, backend, database) and deploys essential Kubernetes components with targeted scheduling.</p>
<pre><code class="lang-bash">NODES=(k8s-node-01 k8s-node-02 k8s-node-03 k8s-node-04)
kubectl label node <span class="hljs-string">"<span class="hljs-variable">${NODES[0]}</span>"</span> node-role.kubernetes.io/worker-node=<span class="hljs-string">""</span> tier=edge --overwrite
kubectl label node <span class="hljs-string">"<span class="hljs-variable">${NODES[1]}</span>"</span> node-role.kubernetes.io/worker-node=<span class="hljs-string">""</span> tier=backend --overwrite
kubectl label node <span class="hljs-string">"<span class="hljs-variable">${NODES[2]}</span>"</span> node-role.kubernetes.io/worker-node=<span class="hljs-string">""</span> tier=backend --overwrite
kubectl label node <span class="hljs-string">"<span class="hljs-variable">${NODES[3]}</span>"</span> node-role.kubernetes.io/worker-node=<span class="hljs-string">""</span> tier=database --overwrite
<span class="hljs-comment"># Install Local Path Provisioner for dynamic storage provisioning</span>
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
<span class="hljs-comment"># Install NGINX Ingress Controller</span>
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/cloud/deploy.yaml
<span class="hljs-comment"># Install MetalLB</span>
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.15.2/config/manifests/metallb-native.yaml
</code></pre>
<p>After labeling, it deploys several core infrastructure components:</p>
<ol>
<li><p>Local Path Provisioner – Enables dynamic storage provisioning</p>
</li>
<li><p>NGINX Ingress Controller – Provides ingress routing</p>
</li>
<li><p>MetalLB – Implements Layer 2 load balancing with controller deployment</p>
</li>
</ol>
<p>Verify:</p>
<pre><code class="lang-bash">kubectl get nodes --show-labels
kubectl get pods -A
</code></pre>
<hr />
<h2 id="heading-step-3-deploy-infrastructure">Step 3: Deploy Infrastructure</h2>
<p>Apply storage and networking configuration.</p>
<p><strong>Run Deployment</strong></p>
<pre><code class="lang-bash">kubectl apply -f /vagrant/provision/deployment/standard/infra
</code></pre>
<h3 id="heading-whats-inside">What’s Inside</h3>
<ul>
<li><p><code>10-storage-class.yml</code> — Local path dynamic PV provisioning</p>
</li>
<li><p><code>20-metallb.yaml</code> — IP pool and L2Advertisement setup</p>
</li>
</ul>
<h3 id="heading-confiuration-review">Configuration Review</h3>
<p><strong>10-storage-class.yml</strong></p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">storage.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">StorageClass</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">vm-storage</span>
<span class="hljs-attr">provisioner:</span> <span class="hljs-string">rancher.io/local-path</span>
<span class="hljs-attr">volumeBindingMode:</span> <span class="hljs-string">WaitForFirstConsumer</span>
<span class="hljs-attr">reclaimPolicy:</span> <span class="hljs-string">Delete</span>
</code></pre>
<p>This manifest defines a StorageClass named <code>vm-storage</code> that uses the Rancher Local Path Provisioner to dynamically create node-local PersistentVolumes. It sets <code>volumeBindingMode: WaitForFirstConsumer</code> so volume provisioning is deferred until a pod is scheduled, ensuring the PV is created on the same node as the workload. The <code>reclaimPolicy: Delete</code> cleans up underlying storage when the PersistentVolumeClaim is removed.</p>
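<p>To use this class, a workload claims storage through a PersistentVolumeClaim. A hypothetical claim is sketched below; the names <code>demo-data</code> and the 1Gi size are illustrative, not taken from the project manifests:</p>

```yaml
# Hypothetical PVC against vm-storage; the PV is only provisioned once a pod
# that mounts this claim is scheduled (WaitForFirstConsumer).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data
  namespace: demo
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: vm-storage
  resources:
    requests:
      storage: 1Gi
```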
<p><strong>20-metallb.yaml</strong></p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">metallb.io/v1beta1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">IPAddressPool</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">default-address-pool</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">metallb-system</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">addresses:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-number">192.168</span><span class="hljs-number">.56</span><span class="hljs-number">.240</span><span class="hljs-number">-192.168</span><span class="hljs-number">.56</span><span class="hljs-number">.250</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">metallb.io/v1beta1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">L2Advertisement</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">default-l2-advert</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">metallb-system</span>
</code></pre>
<p>In cloud environments, services like AWS or GCP automatically provide load balancers to expose your applications to the outside world. But on bare-metal or virtualized Kubernetes clusters, you don’t get that luxury out of the box — and that’s where MetalLB steps in.</p>
<p>The manifest configures <code>MetalLB</code> to handle external traffic just like a cloud load balancer would. It defines an IPAddressPool that allocates IPs from <code>192.168.56.240–192.168.56.250</code>, and an <code>L2Advertisement</code> that announces those addresses at Layer 2 so other devices on the network can reach your services directly.</p>
<p>The result is seamless, cloud-like load balancing for your on-premises or Vagrant-based Kubernetes setups — giving your local cluster the same networking power as a managed one.</p>
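<p>With the pool in place, any Service of type <code>LoadBalancer</code> automatically receives an external IP from that range. A hedged sketch follows; the service name and selector label are assumptions, not taken from the project manifests:</p>

```yaml
# Hypothetical Service; MetalLB assigns an external IP from 192.168.56.240-250
# and answers ARP for it on the local network (L2 mode).
apiVersion: v1
kind: Service
metadata:
  name: app-lb
  namespace: demo
spec:
  type: LoadBalancer
  selector:
    app: bank-app   # assumed pod label
  ports:
    - port: 80
      targetPort: 8080
```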
<p>Verify:</p>
<pre><code class="lang-bash">kubectl get storageclass
kubectl get ipaddresspool -n metallb-system
</code></pre>
<hr />
<h2 id="heading-step-4-deploy-the-application">Step 4: Deploy the Application</h2>
<p>In this step, we’re bringing everything together — deploying a complete multi-tier application stack on Kubernetes. These manifests set up dedicated namespaces, inject configuration data, and apply the necessary RBAC permissions for secure access control. They also provision a PostgreSQL database backed by persistent storage, then deploy a Spring Boot application with multiple replicas for scalability and resilience.</p>
<p>To make the services accessible, it exposes them through ClusterIP and NodePort, and strengthens cluster security with NetworkPolicies that control how pods communicate. Optionally, it can also install monitoring and maintenance utilities, giving you full visibility and manageability of your application stack — all running seamlessly inside Kubernetes.</p>
<p><strong>Run Deployment</strong></p>
<pre><code class="lang-bash">kubectl apply -f /vagrant/provision/deployment/standard/app
</code></pre>
<h3 id="heading-whats-inside-1">What’s Inside</h3>
<ul>
<li><p><code>10-config.yml</code> — Namespaces, ConfigMaps, Secrets</p>
</li>
<li><p><code>20-rbac.yml</code> — RBAC setup</p>
</li>
<li><p><code>30-db-deploy.yml</code> — PostgreSQL with PVC</p>
</li>
<li><p><code>40-app-deploy.yml</code> — Spring Boot app (2 replicas)</p>
</li>
<li><p><code>50-services.yml</code> — ClusterIP and NodePort</p>
</li>
<li><p><code>60-network-policy.yml</code> — Secure traffic rules</p>
</li>
<li><p><code>70-utilities.yml</code> — Optional utilities</p>
</li>
</ul>
<h3 id="heading-confiuration-review-1">Configuration Review</h3>
<p><strong>10-config.yml</strong></p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Namespace</span>
<span class="hljs-attr">metadata:</span> { <span class="hljs-attr">name:</span> <span class="hljs-string">demo</span> }
</code></pre>
<p>Creates a <code>demo</code> namespace for isolating application resources.</p>
<p><strong>20-rbac.yml</strong></p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">rbac.authorization.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">RoleBinding</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">database-rolebinding</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">subjects:</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">kind:</span> <span class="hljs-string">ServiceAccount</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-sa</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">roleRef:</span>
  <span class="hljs-attr">kind:</span> <span class="hljs-string">Role</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">database-role</span>
  <span class="hljs-attr">apiGroup:</span> <span class="hljs-string">rbac.authorization.k8s.io</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">rbac.authorization.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">RoleBinding</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">app-rolebinding</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">subjects:</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">kind:</span> <span class="hljs-string">ServiceAccount</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">app-sa</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">roleRef:</span>
  <span class="hljs-attr">kind:</span> <span class="hljs-string">Role</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">app-role</span>
  <span class="hljs-attr">apiGroup:</span> <span class="hljs-string">rbac.authorization.k8s.io</span>
</code></pre>
<p>The manifest binds two service accounts, <code>postgres-sa</code> for PostgreSQL and <code>app-sa</code> for the Spring Boot service, to their namespace-scoped roles, enabling least-privilege access and a clear separation of duties.</p>
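<p>The RoleBindings above refer to ServiceAccounts and Roles defined elsewhere in <code>20-rbac.yml</code>. A minimal sketch of what the app-side pair could look like is shown below; the rule set is an assumption, and the real rules live in the project manifest:</p>

```yaml
# Hypothetical ServiceAccount/Role pair matching the app-rolebinding above.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-sa
  namespace: demo
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-role
  namespace: demo
rules:
  - apiGroups: [""]
    resources: ["configmaps", "secrets"]
    verbs: ["get", "list"]   # assumed least-privilege rule set
```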
<p><strong>30-db-deploy.yml</strong></p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">ConfigMap</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">db-cm</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">data:</span>
  <span class="hljs-attr">POSTGRES_USER:</span> <span class="hljs-string">postgres</span>
  <span class="hljs-attr">POSTGRES_DB:</span> <span class="hljs-string">appdb</span>
  <span class="hljs-attr">APP_USER:</span> <span class="hljs-string">appuser</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Secret</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">db-secret</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">type:</span> <span class="hljs-string">Opaque</span>
<span class="hljs-attr">stringData:</span>
  <span class="hljs-attr">POSTGRES_PASSWORD:</span> <span class="hljs-string">strong-password</span>
  <span class="hljs-attr">APP_PASSWORD:</span> <span class="hljs-string">strong-password</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">ConfigMap</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">db-init</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">data:</span>
  <span class="hljs-attr">00-roles.sql:</span> <span class="hljs-string">(skip)</span>
  <span class="hljs-attr">01-db.sql:</span> <span class="hljs-string">(skip)</span>
  <span class="hljs-attr">02-schema.sql:</span> <span class="hljs-string">(skip)</span>
  <span class="hljs-attr">03-comments.sql:</span> <span class="hljs-string">(skip)</span>
  <span class="hljs-attr">04-table.sql:</span>  <span class="hljs-string">(skip)</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Service</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-headless</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">postgres</span>      
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">clusterIP:</span> <span class="hljs-string">None</span>
  <span class="hljs-attr">ports:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">pg</span>
      <span class="hljs-attr">port:</span> <span class="hljs-number">5432</span>
      <span class="hljs-attr">targetPort:</span> <span class="hljs-number">5432</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">postgres</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Service</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">postgres</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">postgres</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">type:</span> <span class="hljs-string">ClusterIP</span>
  <span class="hljs-attr">ports:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">pg</span>
      <span class="hljs-attr">port:</span> <span class="hljs-number">5432</span>
      <span class="hljs-attr">targetPort:</span> <span class="hljs-number">5432</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">postgres</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">StatefulSet</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">postgres</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">serviceName:</span> <span class="hljs-string">postgres-headless</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">1</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">app:</span> <span class="hljs-string">postgres</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">app:</span> <span class="hljs-string">postgres</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">serviceAccountName:</span> <span class="hljs-string">database-sa</span>
      <span class="hljs-attr">nodeSelector:</span>
        <span class="hljs-attr">tier:</span> <span class="hljs-string">database</span>
      <span class="hljs-attr">containers:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">postgres</span>
          <span class="hljs-attr">image:</span> <span class="hljs-string">postgres:18</span>
          <span class="hljs-attr">ports:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">containerPort:</span> <span class="hljs-number">5432</span>
              <span class="hljs-attr">name:</span> <span class="hljs-string">pg</span>
          <span class="hljs-attr">envFrom:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">secretRef:</span>
                <span class="hljs-attr">name:</span> <span class="hljs-string">db-secret</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">configMapRef:</span>
                <span class="hljs-attr">name:</span> <span class="hljs-string">db-cm</span>
          <span class="hljs-attr">volumeMounts:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">db-data</span>
              <span class="hljs-attr">mountPath:</span> <span class="hljs-string">/var/lib/postgresql/</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">run-socket</span>
              <span class="hljs-attr">mountPath:</span> <span class="hljs-string">/var/run/postgresql</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">db-init</span>
              <span class="hljs-attr">mountPath:</span> <span class="hljs-string">/docker-entrypoint-initdb.d</span>
      <span class="hljs-attr">volumes:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">run-socket</span>
          <span class="hljs-attr">emptyDir:</span> {}
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">db-init</span>
          <span class="hljs-attr">configMap:</span>
            <span class="hljs-attr">name:</span> <span class="hljs-string">db-init</span>
  <span class="hljs-attr">volumeClaimTemplates:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">metadata:</span>
        <span class="hljs-attr">name:</span> <span class="hljs-string">db-data</span>
      <span class="hljs-attr">spec:</span>
        <span class="hljs-attr">accessModes:</span> [<span class="hljs-string">"ReadWriteOnce"</span>]
        <span class="hljs-attr">storageClassName:</span> <span class="hljs-string">vm-storage</span>
        <span class="hljs-attr">resources:</span>
          <span class="hljs-attr">requests:</span>
            <span class="hljs-attr">storage:</span> <span class="hljs-string">5Gi</span>
</code></pre>
<p>The manifest provisions a <code>single-replica</code> PostgreSQL database as a <code>StatefulSet</code> on the <code>database</code> tier. It uses the <code>database-sa</code> service account, loads environment variables and credentials from the ConfigMap and Secret, runs optional init SQL from a ConfigMap, and persists data through a PersistentVolumeClaim backed by the <code>vm-storage</code> StorageClass. The database is exposed through two Kubernetes Services: a standard ClusterIP Service <code>postgres</code> for in-cluster access on port <code>5432</code>, and a headless Service <code>postgres-headless</code> that gives the pod a stable DNS name for direct pod-to-pod communication.</p>
<p><strong>40-app-deploy.yml</strong></p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">ConfigMap</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">springboot-cm</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">environment:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">data:</span>
  <span class="hljs-attr">BPL_JVM_THREAD_COUNT:</span> <span class="hljs-string">"100"</span>
  <span class="hljs-attr">JAVA_TOOL_OPTIONS:</span> <span class="hljs-string">"-XX:InitialRAMPercentage=25.0 -XX:MaxRAMPercentage=75.0"</span>
  <span class="hljs-attr">LOGGING_LEVEL_ROOT:</span> <span class="hljs-string">INFO</span>
  <span class="hljs-attr">SPRING_PROFILES_ACTIVE:</span> <span class="hljs-string">prod</span>
  <span class="hljs-attr">SPRING_DATASOURCE_URL:</span> <span class="hljs-string">"jdbc:postgresql://postgres.demo.svc.cluster.local:5432/appdb"</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Secret</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">springboot-secret</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">environment:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">type:</span> <span class="hljs-string">Opaque</span>  
<span class="hljs-attr">stringData:</span>
  <span class="hljs-attr">spring.datasource.username:</span> <span class="hljs-string">appuser</span>
  <span class="hljs-attr">spring.datasource.password:</span> <span class="hljs-string">strong-password</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Service</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">api-svc</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">environment:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">api</span>
  <span class="hljs-attr">ports:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">port:</span> <span class="hljs-number">80</span>
      <span class="hljs-attr">targetPort:</span> <span class="hljs-number">8080</span>
  <span class="hljs-attr">type:</span> <span class="hljs-string">ClusterIP</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">bank-account-demo</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">2</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">app:</span> <span class="hljs-string">api</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">app:</span> <span class="hljs-string">api</span>     
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">serviceAccountName:</span> <span class="hljs-string">app-sa</span>
      <span class="hljs-attr">affinity:</span>
        <span class="hljs-attr">nodeAffinity:</span>
          <span class="hljs-attr">requiredDuringSchedulingIgnoredDuringExecution:</span>
            <span class="hljs-attr">nodeSelectorTerms:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">matchExpressions:</span>
              <span class="hljs-bullet">-</span> <span class="hljs-attr">key:</span> <span class="hljs-string">tier</span>
                <span class="hljs-attr">operator:</span> <span class="hljs-string">In</span>
                <span class="hljs-attr">values:</span>
                <span class="hljs-bullet">-</span> <span class="hljs-string">backend</span>
      <span class="hljs-attr">topologySpreadConstraints:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">maxSkew:</span> <span class="hljs-number">1</span>
        <span class="hljs-attr">topologyKey:</span> <span class="hljs-string">kubernetes.io/hostname</span>
        <span class="hljs-attr">whenUnsatisfiable:</span> <span class="hljs-string">DoNotSchedule</span>
        <span class="hljs-attr">labelSelector:</span>
          <span class="hljs-attr">matchLabels:</span>
            <span class="hljs-attr">app:</span> <span class="hljs-string">api</span>
      <span class="hljs-attr">containers:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">bank-account-demo</span>
          <span class="hljs-attr">image:</span> <span class="hljs-string">docker.io/aratax/bank-account-demo:1.0</span>
          <span class="hljs-attr">ports:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">containerPort:</span> <span class="hljs-number">8080</span>
          <span class="hljs-attr">envFrom:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">configMapRef:</span>
                <span class="hljs-attr">name:</span> <span class="hljs-string">springboot-cm</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">secretRef:</span>
                <span class="hljs-attr">name:</span> <span class="hljs-string">springboot-secret</span>
      <span class="hljs-attr">initContainers:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">wait-for-database</span>
          <span class="hljs-attr">image:</span> <span class="hljs-string">busybox</span>
          <span class="hljs-attr">command:</span> [<span class="hljs-string">'sh'</span>, <span class="hljs-string">'-c'</span>, <span class="hljs-string">'until nc -z postgres.demo.svc.cluster.local 5432; do echo waiting; sleep 2; done;'</span>]
</code></pre>
<p>This manifest provisions a <code>two-replica</code> Spring Boot deployment on the <code>backend</code> tier. It uses the <code>app-sa</code> service account, loads runtime configuration and credentials from the ConfigMap and Secret, and connects to PostgreSQL via the internal DNS endpoint <code>postgres.demo.svc.cluster.local:5432</code>. An init container, <code>wait-for-database</code>, blocks application startup until the database is reachable. The application is exposed through a ClusterIP Service named <code>api-svc</code> on port <code>80</code>. Node affinity restricts the pods to <code>backend</code> nodes, and a topology spread constraint distributes them evenly across nodes for better reliability and load balancing.</p>
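<p>The init container's wait loop can also be sketched in plain Java as a simple TCP probe. This is a minimal illustration, not part of the project's code; the host and port in <code>main</code> are placeholders rather than the in-cluster values:</p>
<pre><code class="lang-java">import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class WaitForDatabase {

    // Polls host:port until a TCP connection succeeds or the deadline passes,
    // mirroring the busybox `nc -z` loop used by the init container.
    public static boolean waitForPort(String host, int port, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() &lt; deadline) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), 250);
                return true; // port is accepting connections
            } catch (IOException e) {
                try {
                    Thread.sleep(100); // brief pause before the next probe
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false; // deadline passed before the port opened
    }

    public static void main(String[] args) {
        // Placeholder endpoint; in-cluster this would be
        // postgres.demo.svc.cluster.local:5432.
        System.out.println(waitForPort("localhost", 5432, 500));
    }
}
</code></pre>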
<p><strong>50-services.yml</strong></p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Service</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">database-nodeport</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">environment:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">type:</span> <span class="hljs-string">NodePort</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">postgres</span>
  <span class="hljs-attr">ports:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">pg</span>
      <span class="hljs-attr">port:</span> <span class="hljs-number">5432</span>         
      <span class="hljs-attr">targetPort:</span> <span class="hljs-number">5432</span>    
      <span class="hljs-attr">nodePort:</span> <span class="hljs-number">30000</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">networking.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Ingress</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">webapp-ingress</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">demo</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">environment:</span> <span class="hljs-string">demo</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">ingressClassName:</span> <span class="hljs-string">nginx</span>
  <span class="hljs-attr">rules:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">host:</span> <span class="hljs-string">app.demo.local</span>
      <span class="hljs-attr">http:</span>
        <span class="hljs-attr">paths:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">path:</span> <span class="hljs-string">/</span>
            <span class="hljs-attr">pathType:</span> <span class="hljs-string">Prefix</span>
            <span class="hljs-attr">backend:</span>
              <span class="hljs-attr">service:</span>
                <span class="hljs-attr">name:</span> <span class="hljs-string">api-svc</span>
                <span class="hljs-attr">port:</span>
                  <span class="hljs-attr">number:</span> <span class="hljs-number">80</span>
</code></pre>
<p>The manifest exposes both the application and the database outside the cluster. It defines a NodePort Service named <code>database-nodeport</code> that publishes PostgreSQL on node port <code>30000</code> for external access, typically for development and debugging. It also creates an Ingress resource named <code>webapp-ingress</code> that routes web traffic for <code>app.demo.local</code> to the internal <code>api-svc</code> Service, which fronts the Spring Boot application on port <code>80</code>.</p>
<hr />
<h2 id="heading-step-5-review-application-design">Step 5: Review Application Design</h2>
<p>To implement a simple banking system, two tables were designed:</p>
<ul>
<li><p><code>accounts</code> — stores core account information (owner, currency, balance, etc.).</p>
</li>
<li><p><code>ledger_entries</code> — records all debit/credit transactions linked to each account for auditing and reconciliation.</p>
</li>
</ul>
<p>This schema ensures data integrity, supports concurrent balance updates via versioning, and provides immutable transaction history.</p>
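<p>How the <code>version</code> column prevents lost updates can be illustrated with a small in-memory sketch (a toy model, not the article's code): each writer reads a snapshot of the row, and a commit succeeds only if the version it read is still the current one.</p>
<pre><code class="lang-java">import java.math.BigDecimal;
import java.util.concurrent.atomic.AtomicReference;

// Toy in-memory model of the version-column check: an update commits only if
// nobody else has bumped the version since this writer read the row.
public class VersionedAccount {

    // Immutable snapshot of (balance, version), like one row of `accounts`.
    record Snapshot(BigDecimal balance, long version) {}

    private final AtomicReference&lt;Snapshot&gt; row =
            new AtomicReference&lt;&gt;(new Snapshot(BigDecimal.ZERO, 0));

    public Snapshot read() { return row.get(); }

    // Commit succeeds only if the snapshot we read is still current.
    public boolean tryCommit(Snapshot read, BigDecimal newBalance) {
        return row.compareAndSet(read, new Snapshot(newBalance, read.version() + 1));
    }

    public static void main(String[] args) {
        VersionedAccount account = new VersionedAccount();
        Snapshot s1 = account.read();   // writer A reads version 0
        Snapshot s2 = account.read();   // writer B reads version 0 concurrently
        boolean a = account.tryCommit(s1, new BigDecimal("100")); // commits, version becomes 1
        boolean b = account.tryCommit(s2, new BigDecimal("50"));  // stale version, rejected
        System.out.println(a + " " + b); // writer B must re-read and retry
    }
}
</code></pre>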
<h3 id="heading-database-table-layout">Database Table Layout:</h3>
<pre><code class="lang-sql">+<span class="hljs-comment">------------------------------------------------------------+</span>
|                         accounts                           |
+<span class="hljs-comment">------------------------------------------------------------+</span>
| Column       | Type           | Constraints / Default      |
|<span class="hljs-comment">--------------|----------------|----------------------------|</span>
| id           | UUID (PK)      | DEFAULT uuidv7()           |
| owner_name   | TEXT           | NOT NULL                   |
| currency     | CHAR(3)        | NOT NULL                   |
| balance      | NUMERIC(18,2)  | NOT NULL DEFAULT 0         |
| version      | BIGINT         | NOT NULL DEFAULT 0         |
| updated_at   | TIMESTAMPTZ    | NOT NULL DEFAULT NOW()     |
+<span class="hljs-comment">------------------------------------------------------------+</span>
| INDEX: idx_accounts_owner (owner_name)                     |
+<span class="hljs-comment">------------------------------------------------------------+</span>

                  1
     accounts ─────────────┐
                           │  (fk_ledger_account)
                           ▼

+<span class="hljs-comment">------------------------------------------------------------+</span>
|                     ledger_entries                         |
+<span class="hljs-comment">------------------------------------------------------------+</span>
| Column       | Type           | Constraints / Default      |
|<span class="hljs-comment">--------------|----------------|----------------------------|</span>
| id           | UUID (PK)      | DEFAULT uuidv7()           |
| account_id   | UUID (FK)      | REFERENCES accounts(id)    |
| direction    | TEXT           | NOT NULL                   |
| amount       | NUMERIC(18,2)  | NOT NULL <span class="hljs-keyword">CHECK</span> (amount &gt; <span class="hljs-number">0</span>)|
| reason       | <span class="hljs-built_in">TEXT</span>           |                            |
| created_at   | TIMESTAMPTZ    | <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span> <span class="hljs-keyword">DEFAULT</span> <span class="hljs-keyword">NOW</span>()     |
+<span class="hljs-comment">------------------------------------------------------------+</span>
</code></pre>
<h3 id="heading-springboot">Spring Boot:</h3>
<p>The Java application provides a unified transaction endpoint that processes both deposit and withdrawal operations, allowing clients to specify the locking strategy (OPTIMISTIC or PESSIMISTIC) per request.</p>
<p><strong>Rest Endpoints</strong></p>
<pre><code class="lang-java"><span class="hljs-meta">@RestController</span>
<span class="hljs-meta">@RequestMapping("/api/accounts")</span>
<span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">AccountController</span> </span>{

    <span class="hljs-keyword">private</span> <span class="hljs-keyword">final</span> AccountService accountService;

    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-title">AccountController</span><span class="hljs-params">(AccountService accountService)</span> </span>{ <span class="hljs-keyword">this</span>.accountService = accountService;}

    <span class="hljs-meta">@PostMapping(value = "/{id}/transaction", produces = MediaType.APPLICATION_JSON_VALUE)</span>
    <span class="hljs-function"><span class="hljs-keyword">public</span> ResponseEntity&lt;TransactionResponse&gt; <span class="hljs-title">transaction</span><span class="hljs-params">(
            <span class="hljs-meta">@PathVariable</span> UUID id,
            <span class="hljs-meta">@Valid</span> <span class="hljs-meta">@RequestBody</span> TransactionRequest request)</span> </span>{
        TransactionResponse response = accountService.executeTransaction(id, request);
        <span class="hljs-keyword">return</span> ResponseEntity.ok(response);
    }

}
</code></pre>
<p><strong>Rest Example</strong> POST /api/accounts/3f93c1c2-1c52-4df5-8c6a-9b0c6d7c5c11/transaction</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"type"</span>: <span class="hljs-string">"DEPOSIT"</span>,
  <span class="hljs-attr">"amount"</span>: <span class="hljs-number">500</span>,
  <span class="hljs-attr">"lockingMode"</span>: <span class="hljs-string">"OPTIMISTIC"</span>,
  <span class="hljs-attr">"reason"</span>: <span class="hljs-string">"API_DEPOSIT"</span>
}
# or
{
  <span class="hljs-attr">"type"</span>: <span class="hljs-string">"WITHDRAWAL"</span>,
  <span class="hljs-attr">"amount"</span>: <span class="hljs-number">300</span>,
  <span class="hljs-attr">"lockingMode"</span>: <span class="hljs-string">"PESSIMISTIC"</span>,
  <span class="hljs-attr">"reason"</span>: <span class="hljs-string">"API_WITHDRAWAL"</span>
}
</code></pre>
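<p>The payloads above suggest a small request DTO on the Java side. A hedged sketch follows; the field and enum names are inferred from the JSON example, not taken from the project's source:</p>
<pre><code class="lang-java">import java.math.BigDecimal;

// Hypothetical DTO mirroring the JSON payload above; names are inferred from
// the example requests, not copied from the article's repository.
public record TransactionRequest(
        TransactionType type,
        BigDecimal amount,
        LockingMode lockingMode,
        String reason) {

    public enum TransactionType { DEPOSIT, WITHDRAWAL }
    public enum LockingMode { OPTIMISTIC, PESSIMISTIC }
}
</code></pre>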
<p><strong>Concurrency Control Strategy in JPA</strong> The Java application uses JPA (Java Persistence API) — an ORM framework — to interact with a PostgreSQL database while maintaining data integrity during concurrent transactions. It also explores two different locking strategies, described below, to demonstrate how JPA handles concurrency in real-world scenarios.</p>
<p><strong>Optimistic Locking Strategy</strong></p>
<p>The <code>@Version</code> field provides optimistic concurrency control — each update automatically increments the version. When two transactions modify the same Account, the second commit detects a version mismatch and throws an OptimisticLockException, preventing lost updates without requiring database locks. A retry strategy with controlled backoff (<code>5</code> attempts) can be applied to gracefully handle these transient conflicts.</p>
<p>Entity</p>
<pre><code class="lang-java"><span class="hljs-meta">@Data</span>
<span class="hljs-meta">@Entity</span>
<span class="hljs-meta">@Table(name = "accounts", schema = "app")</span>
<span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Account</span> </span>{

  <span class="hljs-meta">@Id</span> <span class="hljs-meta">@UuidGenerator</span>
  <span class="hljs-keyword">private</span> UUID id;

  <span class="hljs-meta">@Column(name = "owner_name", nullable = false)</span>
  <span class="hljs-keyword">private</span> String ownerName;

  <span class="hljs-meta">@Column(length = 3, nullable = false)</span>
  <span class="hljs-keyword">private</span> String currency;

  <span class="hljs-meta">@Column(nullable = false, precision = 18, scale = 2)</span>
  <span class="hljs-keyword">private</span> BigDecimal balance = BigDecimal.ZERO;

  <span class="hljs-meta">@Version</span>
  <span class="hljs-meta">@Column(nullable = false)</span>
  <span class="hljs-keyword">private</span> <span class="hljs-keyword">long</span> version;

  <span class="hljs-meta">@Column(name = "updated_at", columnDefinition = "timestamptz", nullable = false)</span>
  <span class="hljs-keyword">private</span> Instant updatedAt = Instant.now();

}
</code></pre>
<p>Service</p>
<pre><code class="lang-java"><span class="hljs-meta">@Transactional(isolation= Isolation.READ_COMMITTED, rollbackFor = Exception.class)</span>
<span class="hljs-meta">@Override</span>
<span class="hljs-function"><span class="hljs-keyword">public</span> TransactionResponse <span class="hljs-title">execute</span><span class="hljs-params">(UUID id, TransactionType type, BigDecimal amt, String reason)</span> </span>{
    <span class="hljs-keyword">var</span> account = accountRepo.findById(id).orElseThrow();
    <span class="hljs-keyword">if</span> (TransactionType.DEPOSIT.equals(type)) 
      account.deposit(amt); 
    <span class="hljs-keyword">else</span> 
      account.withdraw(amt);
    <span class="hljs-keyword">var</span> ledgerEntry = ledgerRepo.save(LedgerEntry.of(account, type, amt, reason));
    <span class="hljs-keyword">return</span> TransactionResponse.success(account, ledgerEntry);
}
</code></pre>
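<p>The retry with controlled backoff mentioned earlier is not shown in the service method. Below is a minimal, framework-free sketch; in the real service the conflict type would be <code>OptimisticLockException</code> (or Spring's <code>OptimisticLockingFailureException</code>), and <code>withRetry</code> is an illustrative name:</p>
<pre><code class="lang-java">import java.util.function.Supplier;

// Generic retry helper: re-runs the action when a conflict exception is
// thrown, with exponential backoff, up to maxAttempts tries.
public class OptimisticRetry {

    public static &lt;T&gt; T withRetry(int maxAttempts, long initialBackoffMs,
                                  Class&lt;? extends RuntimeException&gt; conflictType,
                                  Supplier&lt;T&gt; action) {
        long backoff = initialBackoffMs;
        RuntimeException last = null;
        for (int attempt = 1; attempt &lt;= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (!conflictType.isInstance(e)) {
                    throw e; // only version conflicts are retried
                }
                last = e;
                try {
                    Thread.sleep(backoff); // controlled backoff between attempts
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
                backoff *= 2;
            }
        }
        throw last; // all attempts exhausted
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated transaction: two version conflicts, then success.
        String result = withRetry(5, 1, IllegalStateException.class, () -&gt; {
            if (++calls[0] &lt; 3) throw new IllegalStateException("version conflict");
            return "committed";
        });
        System.out.println(result + " after attempt " + calls[0]);
    }
}
</code></pre>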
<p><strong>Pessimistic Locking Strategy</strong></p>
<p>The <code>@Lock(LockModeType.PESSIMISTIC_WRITE)</code> annotation enforces pessimistic locking by issuing a database-level <code>SELECT ... FOR UPDATE</code> query. This explicitly locks the selected <code>Account</code> row until the current transaction completes.</p>
<pre><code class="lang-java"><span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">interface</span> <span class="hljs-title">AccountRepository</span> <span class="hljs-keyword">extends</span> <span class="hljs-title">JpaRepository</span>&lt;<span class="hljs-title">Account</span>, <span class="hljs-title">UUID</span>&gt; </span>{

  <span class="hljs-comment">// Pessimistic row lock (SELECT ... FOR UPDATE)</span>
  <span class="hljs-meta">@Lock(LockModeType.PESSIMISTIC_WRITE)</span>
  <span class="hljs-meta">@Query("select a from Account a where a.id = :id")</span>
  <span class="hljs-function">Optional&lt;Account&gt; <span class="hljs-title">findForUpdate</span><span class="hljs-params">(<span class="hljs-meta">@Param("id")</span> UUID id)</span></span>;

}
</code></pre>
<hr />
<h2 id="heading-step-6-access-the-app">Step 6: Access the App</h2>
<p>Get the Ingress IP:</p>
<pre><code class="lang-bash">kubectl get ingress -n demo
</code></pre>
<p>Example output:</p>
<pre><code class="lang-bash">NAME             CLASS   HOSTS            ADDRESS          PORTS   AGE
webapp-ingress   nginx   app.demo.local   192.168.56.240   80      21h
</code></pre>
<p>Then test endpoints from your host:</p>
<pre><code class="lang-bash">curl -H <span class="hljs-string">"Host: app.demo.local"</span>  http://192.168.56.240/actuator/health
</code></pre>
<hr />
<h2 id="heading-step-7-testing-with-k6">Step 7: Testing with k6</h2>
<p>In this step, the environment is fully prepared to simulate concurrent bank account transactions and observe how the locking mechanisms behave under load.</p>
<p>You can use the built-in k6 test script to run the simulation. For each test run, you’ll choose an account and a specific locking strategy (for example, optimistic or pessimistic locking). The script then launches <code>50</code> virtual users (VUs) running concurrently, using the shared-iterations executor — a total of <code>100</code> iterations distributed across all VUs. This setup effectively mimics concurrent access to the same account, allowing you to verify how data integrity is preserved during simultaneous transactions.</p>
<p>To get started, install k6 on your host system:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Debian/Ubuntu</span>
sudo gpg -k
sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
<span class="hljs-built_in">echo</span> <span class="hljs-string">"deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main"</span> | sudo tee /etc/apt/sources.list.d/k6.list
sudo apt-get update
sudo apt-get install k6
<span class="hljs-comment"># Windows</span>
choco install k6
</code></pre>
<p>Run the load test:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># host</span>
k6 run -e BASE_URL=http://192.168.56.240 -e ACCOUNT_ID=3f93c1c2-1c52-4df5-8c6a-9b0c6d7c5c11 -e MODE=OPTIMISTIC <span class="hljs-built_in">test</span>/k6-load-test.js
</code></pre>
<p>You will get the output like this:</p>
<pre><code class="lang-javascript">
         /\      Grafana   /‾‾/  
    <span class="hljs-regexp">/\  /</span>  \     |\  __   /  /   
   <span class="hljs-regexp">/  \/    \    | |/</span> <span class="hljs-regexp">/  /</span>   ‾‾\ 
  /          \   |   (  |  (‾)  |
 <span class="hljs-regexp">/ __________ \  |_|\_\  \_____/</span> 

     execution: local
        <span class="hljs-attr">script</span>: k6-load-test.js
        <span class="hljs-attr">output</span>: -

     scenarios: (<span class="hljs-number">100.00</span>%) <span class="hljs-number">1</span> scenario, <span class="hljs-number">50</span> max VUs, <span class="hljs-number">2</span>m30s max duration (incl. graceful stop):
              * concurrent_load: <span class="hljs-number">100</span> iterations shared among <span class="hljs-number">50</span> VUs (maxDuration: <span class="hljs-number">2</span>m0s, <span class="hljs-attr">gracefulStop</span>: <span class="hljs-number">30</span>s)

INFO[<span class="hljs-number">0000</span>] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  source=<span class="hljs-built_in">console</span>
INFO[<span class="hljs-number">0000</span>] ▶ K6 Load Test <span class="hljs-keyword">for</span> AccountController          source=<span class="hljs-built_in">console</span>
INFO[<span class="hljs-number">0000</span>] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  source=<span class="hljs-built_in">console</span>
INFO[<span class="hljs-number">0000</span>] ▶ Target base URL : http:<span class="hljs-comment">//192.168.56.240:8080     source=console</span>
INFO[<span class="hljs-number">0000</span>] ▶ Account ID      : <span class="hljs-number">3</span>f93c1c2<span class="hljs-number">-1</span>c52<span class="hljs-number">-4</span>df5<span class="hljs-number">-8</span>c6a<span class="hljs-number">-9</span>b0c6d7c5c11  source=<span class="hljs-built_in">console</span>
INFO[<span class="hljs-number">0000</span>] ▶ Locking Mode    : OPTIMISTIC                source=<span class="hljs-built_in">console</span>
INFO[<span class="hljs-number">0000</span>] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  source=<span class="hljs-built_in">console</span>
INFO[<span class="hljs-number">0000</span>]                                               source=<span class="hljs-built_in">console</span>
INFO[<span class="hljs-number">0000</span>] 📊 Initial Account State:                      source=<span class="hljs-built_in">console</span>
INFO[<span class="hljs-number">0000</span>]    Balance: <span class="hljs-number">10</span> USD | Version: <span class="hljs-number">494</span> | Owner: Alice  source=<span class="hljs-built_in">console</span>
INFO[<span class="hljs-number">0000</span>] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  source=<span class="hljs-built_in">console</span>
INFO[<span class="hljs-number">0000</span>]                                               source=<span class="hljs-built_in">console</span>
ERRO[<span class="hljs-number">0000</span>] [ERROR <span class="hljs-number">422</span>] WITHDRAWAL <span class="hljs-number">60</span> failed: Business rule violation - insufficient funds  source=<span class="hljs-built_in">console</span>
INFO[<span class="hljs-number">0000</span>] [<span class="hljs-number">2025</span><span class="hljs-number">-11</span><span class="hljs-number">-02</span>T13:<span class="hljs-number">20</span>:<span class="hljs-number">44.705</span>Z] TX:a52d4700-e63a<span class="hljs-number">-463</span>c<span class="hljs-number">-808e-66</span>bf4132ba26 | DEPOSIT <span class="hljs-number">24</span> USD | Balance: <span class="hljs-number">34</span> USD (v494)  source=<span class="hljs-built_in">console</span>                                                                                                                                               
INFO[<span class="hljs-number">0000</span>] [<span class="hljs-number">2025</span><span class="hljs-number">-11</span><span class="hljs-number">-02</span>T13:<span class="hljs-number">20</span>:<span class="hljs-number">44.709</span>Z] TX:<span class="hljs-number">8</span>d001c33-c64a<span class="hljs-number">-44</span>eb-b39f<span class="hljs-number">-93</span>ed854c6e02 | DEPOSIT <span class="hljs-number">46</span> USD | Balance: <span class="hljs-number">159</span> USD (v496)  source=<span class="hljs-built_in">console</span>
INFO[<span class="hljs-number">0000</span>] [<span class="hljs-number">2025</span><span class="hljs-number">-11</span><span class="hljs-number">-02</span>T13:<span class="hljs-number">20</span>:<span class="hljs-number">44.709</span>Z] TX:ba848747-f5ea<span class="hljs-number">-4</span>ebf-b1b2<span class="hljs-number">-3</span>c501c44cd2b | DEPOSIT <span class="hljs-number">79</span> USD | Balance: <span class="hljs-number">113</span> USD (v495)  source=<span class="hljs-built_in">console</span>                                                                                                                                              
INFO[<span class="hljs-number">0000</span>] [<span class="hljs-number">2025</span><span class="hljs-number">-11</span><span class="hljs-number">-02</span>T13:<span class="hljs-number">20</span>:<span class="hljs-number">44.723</span>Z] TX:<span class="hljs-number">87</span>d8874d-af6c<span class="hljs-number">-47</span>a5<span class="hljs-number">-9901</span><span class="hljs-number">-823</span>f616e8add | DEPOSIT <span class="hljs-number">10</span> USD | Balance: <span class="hljs-number">169</span> USD (v497)  source=<span class="hljs-built_in">console</span>
...
skip
...
INFO[<span class="hljs-number">0004</span>] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  
INFO[<span class="hljs-number">0004</span>] 📊 Final Account State:                        source=<span class="hljs-built_in">console</span>
INFO[<span class="hljs-number">0004</span>]    Balance: <span class="hljs-number">501</span> USD | Version: <span class="hljs-number">582</span> | Owner: Alice                                                                                                                                                                                                              
INFO[<span class="hljs-number">0004</span>] 📈 Changes:                                    source=<span class="hljs-built_in">console</span>                                                                                                                                                                                                              
INFO[<span class="hljs-number">0004</span>]    Balance Change: +<span class="hljs-number">491</span> USD                   source=<span class="hljs-built_in">console</span>                                                                                                                                                                                                               
INFO[<span class="hljs-number">0004</span>]    Version Change: +<span class="hljs-number">88</span>                        
INFO[<span class="hljs-number">0004</span>] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  source=<span class="hljs-built_in">console</span>                                                                                                                                                                                         
INFO[<span class="hljs-number">0004</span>] ✅ Test completed successfully!                source=<span class="hljs-built_in">console</span>                                                                                                                                                                                                              


  █ THRESHOLDS

    http_req_duration
    ✓ <span class="hljs-string">'p(95)&lt;2000'</span> p(<span class="hljs-number">95</span>)=<span class="hljs-number">1.02</span>s

    http_req_failed
    ✗ <span class="hljs-string">'rate&lt;0.1'</span> rate=<span class="hljs-number">11.76</span>%

    version_conflicts
    ✓ <span class="hljs-string">'rate&lt;0.3'</span> rate=<span class="hljs-number">0.00</span>%


  █ TOTAL RESULTS

    checks_total.......: <span class="hljs-number">300</span>    <span class="hljs-number">74.544329</span>/s
    checks_succeeded...: <span class="hljs-number">88.00</span>% <span class="hljs-number">264</span> out <span class="hljs-keyword">of</span> <span class="hljs-number">300</span>
    checks_failed......: <span class="hljs-number">12.00</span>% <span class="hljs-number">36</span> out <span class="hljs-keyword">of</span> <span class="hljs-number">300</span>

    ✗ status is <span class="hljs-number">200</span>
      ↳  <span class="hljs-number">88</span>% — ✓ <span class="hljs-number">88</span> / ✗ <span class="hljs-number">12</span>
    ✗ response has account data
      ↳  <span class="hljs-number">88</span>% — ✓ <span class="hljs-number">88</span> / ✗ <span class="hljs-number">12</span>
    ✗ response has transaction data
      ↳  <span class="hljs-number">88</span>% — ✓ <span class="hljs-number">88</span> / ✗ <span class="hljs-number">12</span>

    CUSTOM
    account_balance................: avg=<span class="hljs-number">499.056818</span> min=<span class="hljs-number">30</span>       med=<span class="hljs-number">506</span>     max=<span class="hljs-number">983</span>   p(<span class="hljs-number">90</span>)=<span class="hljs-number">767.9</span>    p(<span class="hljs-number">95</span>)=<span class="hljs-number">815.95</span>
    deposits_total.................: <span class="hljs-number">48</span>     <span class="hljs-number">11.927093</span>/s
    other_errors...................: <span class="hljs-number">12</span>     <span class="hljs-number">2.981773</span>/s
    version_conflicts..............: <span class="hljs-number">0.00</span>%  <span class="hljs-number">0</span> out <span class="hljs-keyword">of</span> <span class="hljs-number">0</span>
    withdraws_total................: <span class="hljs-number">52</span>     <span class="hljs-number">12.921017</span>/s

    HTTP
    http_req_duration..............: avg=<span class="hljs-number">308.79</span>ms   min=<span class="hljs-number">10.07</span>ms  med=<span class="hljs-number">91.88</span>ms max=<span class="hljs-number">2.11</span>s p(<span class="hljs-number">90</span>)=<span class="hljs-number">900.17</span>ms p(<span class="hljs-number">95</span>)=<span class="hljs-number">1.02</span>s
      { <span class="hljs-attr">expected_response</span>:<span class="hljs-literal">true</span> }...: avg=<span class="hljs-number">332.89</span>ms   min=<span class="hljs-number">10.07</span>ms  med=<span class="hljs-number">99.69</span>ms max=<span class="hljs-number">2.11</span>s p(<span class="hljs-number">90</span>)=<span class="hljs-number">932.24</span>ms p(<span class="hljs-number">95</span>)=<span class="hljs-number">1.24</span>s
    http_req_failed................: <span class="hljs-number">11.76</span>% <span class="hljs-number">12</span> out <span class="hljs-keyword">of</span> <span class="hljs-number">102</span>
    http_reqs......................: <span class="hljs-number">102</span>    <span class="hljs-number">25.345072</span>/s

    EXECUTION
    iteration_duration.............: avg=<span class="hljs-number">1.5</span>s       min=<span class="hljs-number">578.54</span>ms med=<span class="hljs-number">1.48</span>s   max=<span class="hljs-number">3.79</span>s p(<span class="hljs-number">90</span>)=<span class="hljs-number">2.12</span>s    p(<span class="hljs-number">95</span>)=<span class="hljs-number">2.51</span>s
    iterations.....................: <span class="hljs-number">100</span>    <span class="hljs-number">24.84811</span>/s
    vus............................: <span class="hljs-number">1</span>      min=<span class="hljs-number">1</span>         max=<span class="hljs-number">50</span>
    vus_max........................: <span class="hljs-number">50</span>     min=<span class="hljs-number">50</span>        max=<span class="hljs-number">50</span>

    NETWORK
    data_received..................: <span class="hljs-number">94</span> kB  <span class="hljs-number">23</span> kB/s
    data_sent......................: <span class="hljs-number">28</span> kB  <span class="hljs-number">6.9</span> kB/s

running (<span class="hljs-number">0</span>m04<span class="hljs-number">.0</span>s), <span class="hljs-number">00</span>/<span class="hljs-number">50</span> VUs, <span class="hljs-number">100</span> complete and <span class="hljs-number">0</span> interrupted iterations                                                                                                                                                                                                               
concurrent_load ✓ [======================================] <span class="hljs-number">50</span> VUs  <span class="hljs-number">0</span>m04<span class="hljs-number">.0</span>s/<span class="hljs-number">2</span>m0s  <span class="hljs-number">100</span>/<span class="hljs-number">100</span> shared iters
</code></pre>
<hr />
<h2 id="heading-step-8-monitoring-kubernetes">Step 8: Monitoring Kubernetes</h2>
<h3 id="heading-using-k9s">Using k9s</h3>
<p>k9s is a terminal-based UI for Kubernetes. Instead of typing dozens of kubectl commands, you get a fast, interactive dashboard right inside your terminal — perfect for developers, DevOps engineers, and operators who live in the CLI.</p>
<pre><code class="lang-bash">k9s
</code></pre>
<p><strong>Useful views:</strong></p>
<ul>
<li><p><code>:node</code> - View all nodes</p>
</li>
<li><p><code>:pod</code> - View all pods</p>
</li>
<li><p><code>:deployment</code> - View deployments</p>
</li>
<li><p><code>:service</code> - View services</p>
</li>
<li><p><code>:ingress</code> - View ingress rules</p>
</li>
<li><p><code>:pv</code> - View persistent volumes</p>
</li>
<li><p><code>:pvc</code> - View persistent volume claims</p>
</li>
<li><p><code>:event</code> - View cluster events</p>
</li>
</ul>
<hr />
<h2 id="heading-step-9-cleanup">Step 9: Cleanup</h2>
<p>Stop all VMs but keep state:</p>
<pre><code class="lang-bash">vagrant halt
</code></pre>
<p>Destroy everything (full reset):</p>
<pre><code class="lang-bash">vagrant destroy -f
</code></pre>
<hr />
<h2 id="heading-key-takeaways">Key Takeaways</h2>
<ul>
<li><p><strong>Infrastructure as Code made simple</strong> — Spin up a complete multi-node Kubernetes cluster with one <code>vagrant up</code>. No cloud required.</p>
</li>
<li><p><strong>Realistic local lab</strong> — Simulate a production-like environment with control plane, workers, networking, storage, and ingress — all from your laptop.</p>
</li>
<li><p><strong>Application + Infrastructure synergy</strong> — Deploy a real Spring Boot + PostgreSQL system to understand how app logic and cluster behavior interact under load.</p>
</li>
<li><p><strong>Data consistency in action</strong> — Experiment hands-on with JPA’s optimistic and pessimistic locking strategies to see how concurrency control works in practice.</p>
</li>
<li><p><strong>Performance validation</strong> — Use k6 to generate concurrent transactions and validate system reliability through real metrics and stress tests.</p>
</li>
<li><p><strong>Full observability from the CLI</strong> — With k9s, monitor nodes, pods, and resources interactively — no GUI required.</p>
</li>
<li><p><strong>Reproducibility and cleanup</strong> — Destroy and rebuild your environment anytime with <code>vagrant destroy -f</code>, ensuring consistent test conditions for every run.</p>
</li>
</ul>
<hr />
<h2 id="heading-conclusion">Conclusion</h2>
<p>We’ve built more than just a demo — we’ve created a fully automated multi-node Kubernetes lab that runs a real Spring Boot + PostgreSQL banking system with live networking, storage, and load testing. From Vagrant provisioning to JPA locking strategies and k6 concurrency simulations, every layer demonstrates how consistency and automation come together in modern systems.</p>
<p>This setup isn’t about production readiness — it’s about understanding. You now have a reproducible playground to experiment with distributed transactions, concurrency control, and cluster operations — all on your own machine. It’s a hands-on way to learn how reliability and scalability emerge when software, data, and infrastructure align.</p>
<hr />
<h2 id="heading-resources">Resources</h2>
<ul>
<li><p><a target="_blank" href="https://developer.hashicorp.com/vagrant/docs">Vagrant Docs</a></p>
</li>
<li><p><a target="_blank" href="https://kubernetes.io/docs/home/">Kubernetes Docs</a></p>
</li>
<li><p><a target="_blank" href="https://k9scli.io/">k9s</a></p>
</li>
<li><p><a target="_blank" href="https://k6.io/docs/">k6 Load Testing</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/arata-x/vagrant-k8s-bank-demo.git"><em>Demo Project</em></a></p>
</li>
</ul>
<hr />
<blockquote>
<p>🧡 <em>“Build it. Break it. Rebuild it — that’s how real engineering insight is forged.”</em><br />— ArataX</p>
</blockquote>
]]></content:encoded></item><item><title><![CDATA[Mastering Kafka: Concept, Architecture, and Deployment]]></title><description><![CDATA[Preface
Before diving into this deep-dive, I encourage you first to read the article “Kafka Made Simple: A Hands-On Quickstart with Docker and Spring Boot”That piece serves as a practical gateway into the Kafka ecosystem, helping you set up a local c...]]></description><link>https://code.aratax.dev/mastering-kafka-concept-architecture-and-deployment</link><guid isPermaLink="true">https://code.aratax.dev/mastering-kafka-concept-architecture-and-deployment</guid><category><![CDATA[kafka]]></category><dc:creator><![CDATA[Arata]]></dc:creator><pubDate>Sun, 05 Oct 2025 13:21:28 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1759670198332/091c33df-eca5-4c7e-91d3-813a6fbe02f8.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-preface">Preface</h2>
<p>Before diving into this deep-dive, I encourage you first to read the article <strong>“</strong><a target="_blank" href="https://dev.to/aratax/kafka-made-simple-a-hands-on-quickstart-with-docker-and-spring-boot-180i"><strong>Kafka Made Simple: A Hands-On Quickstart with Docker and Spring Boot</strong></a><strong>”</strong><br />That piece serves as a practical gateway into the Kafka ecosystem, helping you set up a local cluster, publish your first events, and see how Kafka fits into a real Spring Boot project.</p>
<p>This article builds on that foundation. Instead of focusing only on the <em>how</em>, here we unpack the <em>why</em> and the <em>what</em>:</p>
<ul>
<li><p>The <strong>concepts</strong> that make Kafka more than just a messaging system.</p>
</li>
<li><p>The <strong>architecture</strong> that ensures durability, scalability, and fault tolerance.</p>
</li>
<li><p>The <strong>design principles</strong> behind Kafka’s performance.</p>
</li>
<li><p>A systematic <strong>deep dive</strong> into partitions, logs, replication, producers, consumers, transactions, and rebalancing.</p>
</li>
<li><p>Practical <strong>deployment insights</strong> and configuration guidance.</p>
</li>
</ul>
<p>👉 Think of this as the <strong>conceptual companion</strong> to your hands-on quickstart—helping you see the big picture, design production-ready systems, and apply Kafka confidently in real-world projects.</p>
<hr />
<h2 id="heading-1-core-design-principles">1. Core Design Principles</h2>
<h3 id="heading-distributed-and-scalable-architecture">Distributed and Scalable Architecture</h3>
<ul>
<li><p>Kafka runs as a cluster of brokers, enabling horizontal scalability.</p>
</li>
<li><p>Topics are partitioned across brokers to support parallelism and high throughput.</p>
</li>
</ul>
<h3 id="heading-immutable-append-only-log">Immutable, Append-Only Log</h3>
<ul>
<li><p>Each partition is a structured commit log with sequential message appends.</p>
</li>
<li><p>Simplifies replication, recovery, and stream processing.</p>
</li>
</ul>
<h3 id="heading-decoupled-producers-and-consumers">Decoupled Producers and Consumers</h3>
<ul>
<li><p>Kafka uses a publish-subscribe model with loose coupling.</p>
</li>
<li><p>Consumers read independently without affecting producers.</p>
</li>
</ul>
<h3 id="heading-message-durability-and-fault-tolerance">Message Durability and Fault Tolerance</h3>
<ul>
<li><p>Messages are persisted to disk and replicated across brokers.</p>
</li>
<li><p>Leader-follower replication ensures durability during broker failures.</p>
</li>
</ul>
<h3 id="heading-high-throughput-and-low-latency">High Throughput and Low Latency</h3>
<ul>
<li><p>Kafka handles millions of messages per second with minimal latency.</p>
</li>
<li><p>Batching, compression, and efficient I/O optimize performance.</p>
</li>
</ul>
<h3 id="heading-stream-oriented-processing">Stream-Oriented Processing</h3>
<ul>
<li><p>Kafka Streams and integrations (e.g., Flink, Spark) support real-time processing.</p>
</li>
<li><p>Enables event-driven architectures and stateful computations.</p>
</li>
</ul>
<h3 id="heading-consumer-controlled-offset-management">Consumer-Controlled Offset Management</h3>
<ul>
<li><p>Consumers manage their own offsets for replayability and fault recovery.</p>
</li>
<li><p>Supports exactly-once or at-least-once semantics based on configuration.</p>
</li>
</ul>
<h3 id="heading-pluggable-and-extensible-apis">Pluggable and Extensible APIs</h3>
<ul>
<li><p>Kafka provides Producer, Consumer, Streams, and Connect APIs.</p>
</li>
<li><p>Kafka Connect simplifies integration with external systems like databases and Hadoop.</p>
</li>
</ul>
<hr />
<h2 id="heading-2-partitions">2. Partitions</h2>
<p>Partitions are fundamental to Kafka’s ability to scale horizontally and maintain high availability across distributed systems. Each topic is split into one or more partitions, which serve as independent, ordered logs.</p>
<h3 id="heading-what-is-a-partition">What is a Partition?</h3>
<ul>
<li><p>An <strong>ordered, immutable log</strong> of records.</p>
</li>
<li><p>Each record has a unique <strong>offset</strong> (like a line number).</p>
</li>
<li><p>Ordering is <strong>guaranteed within a partition</strong>, but not across partitions.</p>
</li>
<li><p>Producers append sequentially, consumers read sequentially.</p>
</li>
</ul>
<p>✅ Think of a partition as a “mini-log” that can be processed independently.</p>
<h3 id="heading-partitioning-strategy">Partitioning Strategy</h3>
<ul>
<li><p><strong>Round-robin</strong> → default if no key is provided; balances evenly.</p>
</li>
<li><p><strong>Key-based hashing</strong> → same key always maps to the same partition; ensures per-key ordering.</p>
</li>
<li><p><strong>Custom partitioner</strong> → user-supplied logic for specialized routing.</p>
</li>
</ul>
<p>✅ Use a meaningful key (e.g., customer ID) for predictable ordering.</p>
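<p>To make the key-based strategy concrete, here is a minimal sketch of how a key deterministically maps to a partition. (Illustrative only — Kafka’s default partitioner actually hashes the serialized key bytes with murmur2, not this toy hash.)</p>
<pre><code class="lang-javascript">// Toy key-based partitioner: the same key always maps to the same partition.
// Kafka's real default is roughly murmur2(keyBytes) % numPartitions.
function partitionFor(key, numPartitions) {
  let hash = 0;
  for (const ch of key) {
    hash = (hash * 31 + ch.charCodeAt(0)) | 0; // 32-bit rolling hash
  }
  return Math.abs(hash) % numPartitions;
}

// Every event for the same customer lands in the same partition,
// so per-customer ordering is preserved.
console.log(partitionFor("customer-42", 6) === partitionFor("customer-42", 6)); // true
</code></pre>
<p>Note that the result depends on the partition count — which is exactly why adding partitions later can break per-key ordering.</p>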
<h3 id="heading-ordering-guarantees">Ordering Guarantees</h3>
<ul>
<li><p>Records with the same key always land in the same partition.</p>
</li>
<li><p><strong>Per-key ordering is guaranteed.</strong></p>
</li>
<li><p>Global ordering across partitions is <strong>not provided</strong>.</p>
</li>
</ul>
<p>⚠️ If you need total ordering, use a <strong>single partition</strong> (but this limits throughput).</p>
<h3 id="heading-parallelism-amp-consumer-scaling">Parallelism &amp; Consumer Scaling</h3>
<ul>
<li><p>One consumer in a group reads from one or more partitions.</p>
</li>
<li><p>More partitions → more consumers can share the workload.</p>
</li>
<li><p>This enables Kafka to scale horizontally with <strong>consumer groups</strong>.</p>
</li>
</ul>
<p>✅ Match partition count to expected parallelism (e.g., number of consumer instances).</p>
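<p>As a rough illustration of how a consumer group splits the work (this mimics range-style assignment; the real protocol is negotiated by the group coordinator and is configurable):</p>
<pre><code class="lang-javascript">// Range-style assignment sketch: contiguous blocks of partitions per consumer.
function assignPartitions(numPartitions, consumers) {
  const perConsumer = Math.ceil(numPartitions / consumers.length);
  const assignment = Object.fromEntries(consumers.map((c) => [c, []]));
  for (let p = 0; p !== numPartitions; p++) {
    const owner = consumers[Math.floor(p / perConsumer)];
    assignment[owner].push(p);
  }
  return assignment;
}

// 6 partitions shared by 3 consumers: two each.
console.log(assignPartitions(6, ["c1", "c2", "c3"]));
// { c1: [ 0, 1 ], c2: [ 2, 3 ], c3: [ 4, 5 ] }

// More consumers than partitions: the extras sit idle.
console.log(assignPartitions(2, ["c1", "c2", "c3"]).c3); // []
</code></pre>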
<h3 id="heading-trade-offs">Trade-offs</h3>
<p>Adding partitions boosts throughput and enables horizontal scaling, but also increases metadata, file handles, and controller load—balance performance with operational overhead.</p>
<p>⚠️ Too many partitions per broker can hurt stability (common pitfall in large clusters).</p>
<h3 id="heading-partition-reassignment-amp-expansion">Partition Reassignment &amp; Expansion</h3>
<ul>
<li><p>Kafka supports <strong>rebalancing partitions</strong> across brokers for load balancing.</p>
</li>
<li><p>Adding partitions later increases capacity but may <strong>break key ordering</strong> (keys may re-hash to new partitions).</p>
</li>
</ul>
<p>✅ Plan partition counts in advance. Increase only when unavoidable.</p>
<h3 id="heading-summary">Summary</h3>
<ul>
<li><p>Partitions = <strong>scaling + ordering + parallelism</strong>.</p>
</li>
<li><p>They allow Kafka to distribute work across consumers and brokers.</p>
</li>
<li><p>The number of partitions directly impacts <strong>performance, cost, and design trade-offs</strong>.</p>
</li>
</ul>
<blockquote>
<p>💡 Pick partition counts carefully: balance <strong>parallelism vs overhead</strong>.</p>
</blockquote>
<hr />
<h2 id="heading-3-log">3. Log</h2>
<p>At the core of Kafka is the <strong>log</strong> — an append-only data structure where each topic-partition maintains a sequential list of records. The log underpins durability, ordering, and replayability in Kafka.</p>
<h3 id="heading-log-fundamentals">Log Fundamentals</h3>
<ul>
<li><p><strong>Append-only</strong>: Producers write new records only at the end.</p>
</li>
<li><p><strong>Sequential reads</strong>: Consumers read messages by offset in order.</p>
</li>
<li><p><strong>Immutability</strong>: Records are never modified once written.</p>
</li>
<li><p><strong>Ordering</strong>: Within a partition, offsets guarantee strict ordering.</p>
</li>
<li><p><strong>Durability</strong>: Backed by disk with efficient sequential writes and OS page cache.</p>
</li>
</ul>
<p>✅ Simplifies recovery and replay by ensuring deterministic ordering.</p>
<p>⚠️ Updates or deletes are handled via <strong>compaction</strong> or <strong>tombstones</strong>, not in-place mutation.</p>
<h3 id="heading-partition-as-a-folder">Partition as a Folder</h3>
<ul>
<li>Each partition maps to a <strong>directory</strong> on disk (e.g., <code>/var/lib/kafka/volumes/kafka_data/_data/order-0</code>).</li>
</ul>
<p>✅ Keeps partition data isolated for replication and recovery.</p>
<p><strong>Inside a Partition Directory</strong></p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>File Name</td><td>Purpose</td></tr>
</thead>
<tbody>
<tr>
<td><code>*.log</code></td><td>Stores Kafka records (key-value pairs).</td></tr>
<tr>
<td><code>*.index</code></td><td>Maps offsets to byte positions in the <code>.log</code> file.</td></tr>
<tr>
<td><code>*.timeindex</code></td><td>Maps timestamps to offsets for time-based lookups.</td></tr>
<tr>
<td><code>leader-epoch-checkpoint</code></td><td>Tracks leader epochs for replication consistency.</td></tr>
<tr>
<td><code>partition.metadata</code></td><td>Stores partition-level configuration or state.</td></tr>
</tbody>
</table>
</div><h3 id="heading-log-lifecycle">Log Lifecycle</h3>
<ul>
<li><p>As data grows, Kafka rolls logs into <strong>segments</strong>.</p>
</li>
<li><p>Each segment has a <code>.log</code>, <code>.index</code>, and <code>.timeindex</code> file.</p>
</li>
<li><p>New messages go into the <strong>active segment</strong> (latest <code>.log</code>).</p>
</li>
<li><p>Old segments can be safely deleted or compacted based on retention rules.</p>
</li>
</ul>
<p><strong>Example</strong> (partition <code>order-0</code>):</p>
<pre><code class="lang-text">00000000000000000000.log        → Log segment storing the actual messages
00000000000000000000.index      → Offset index for fast lookup of records
00000000000000000000.timeindex  → Timestamp index for time-based queries
leader-epoch-checkpoint         → Tracks changes in partition leadership
partition.metadata              → Metadata about the partition configuration
</code></pre>
<p>As more data arrives and the first segment grows beyond the configured segment size, Kafka rolls over to create new segments:</p>
<pre><code class="lang-text">00000000000000000001.log
00000000000000000001.index
00000000000000000001.timeindex
</code></pre>
<h3 id="heading-retention-and-compaction">Retention and Compaction</h3>
<ul>
<li>Kafka does not keep logs forever → policies determine retention.</li>
</ul>
<p><strong>Retention Policies</strong>:</p>
<ul>
<li><p><strong>Time-based</strong>: Delete records older than <code>retention.ms</code>.</p>
</li>
<li><p><strong>Size-based</strong>: Delete when total log size exceeds <code>retention.bytes</code>.</p>
</li>
<li><p><strong>Compaction</strong>: Retain only the latest value per key.</p>
</li>
</ul>
<p>✅ Retention prevents unbounded disk usage.</p>
<p>⚠️ Aggressive retention can delete records needed for replay or lagging consumers.</p>
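<p>As a concrete example, a topic combining time- and size-based retention could be configured like this (the values are illustrative, not recommendations):</p>
<pre><code class="lang-properties"># Delete segments once records are older than 7 days...
retention.ms=604800000
# ...or once the partition grows beyond ~1 GiB, whichever comes first
retention.bytes=1073741824
# "delete" is the default cleanup policy (as opposed to "compact")
cleanup.policy=delete
</code></pre>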
<h3 id="heading-performance-considerations">Performance Considerations</h3>
<ul>
<li><p><strong>Segment size</strong> and retention settings impact disk churn and log cleanup frequency.</p>
</li>
<li><p><strong>Disk throughput</strong> and filesystem tuning (XFS recommended) directly affect performance.</p>
</li>
<li><p><strong>Consumer lag</strong> → large replay windows may require higher retention to allow catch-up.</p>
</li>
</ul>
<p>✅ SSDs improve latency, but sequential disk writes mean <strong>HDDs can still perform well</strong>.</p>
<p>⚠️ Misconfigured retention can either exhaust the disk or delete needed data too quickly.</p>
<h3 id="heading-summary-1">Summary</h3>
<p>The Kafka log is:</p>
<ul>
<li><p><strong>Append-only</strong> → simple and efficient for writes.</p>
</li>
<li><p><strong>Segmented</strong> → scalable and manageable on disk.</p>
</li>
<li><p><strong>Retained or compacted</strong> → supports both replayability and bounded storage.</p>
</li>
</ul>
<blockquote>
<p>💡 Proper tuning of <strong>segment size, retention, and compaction</strong> ensures Kafka logs remain durable, performant, and aligned with application needs.</p>
</blockquote>
<hr />
<h2 id="heading-4-key-and-log-compaction">4. Key and Log Compaction</h2>
<p>Kafka topics allow multiple messages with the same <strong>key</strong>, and Kafka provides <strong>log compaction</strong> to keep only the latest value per key. This design supports stateful stream processing, caching, and event sourcing use cases.</p>
<h3 id="heading-keys-in-kafka">Keys in Kafka</h3>
<ul>
<li><p>Kafka does not enforce <strong>uniqueness</strong> of keys.</p>
</li>
<li><p>The key determines <strong>partition placement</strong>:</p>
<ul>
<li><p>Same key → always routed to the same partition.</p>
</li>
<li><p>Ensures <strong>per-key ordering</strong> of events.</p>
</li>
</ul>
</li>
</ul>
<p><strong>Common Use Cases:</strong></p>
<ul>
<li><p>Updates to the same entity (e.g., user profile changes).</p>
</li>
<li><p>Event streams per entity (e.g., customer actions).</p>
</li>
<li><p>Stateful stream processing (aggregates or reducers).</p>
</li>
<li><p>Materialized views (latest state per key).</p>
</li>
<li><p>Caching or event sourcing (replay per entity).</p>
</li>
</ul>
<p>⚠️ Keys don’t guarantee global uniqueness — they only ensure ordering within a partition.</p>
<h3 id="heading-log-compaction">Log Compaction</h3>
<ul>
<li><p><strong>Log compaction</strong> removes older records for a given key, retaining only the most recent value.</p>
</li>
<li><p>Enabled via <code>cleanup.policy=compact</code>.</p>
</li>
</ul>
<p>✅ Benefits:</p>
<ul>
<li><p>Keeps the latest value per key for <strong>stateful applications</strong>.</p>
</li>
<li><p>Reduces disk usage while preserving key-level history.</p>
</li>
</ul>
<p>⚠️ Considerations:</p>
<ul>
<li><p>Compaction is <strong>asynchronous</strong> → old versions may remain temporarily.</p>
</li>
<li><p><strong>Offsets and order are preserved</strong> even after compaction.</p>
</li>
<li><p>Not a replacement for time/size-based retention.</p>
</li>
</ul>
<p><strong>Key Configurations</strong>:</p>
<ul>
<li><p><code>cleanup.policy=compact</code> → enable compaction.</p>
</li>
<li><p><code>min.cleanable.dirty.ratio</code> → % of log dirtiness before cleaning triggers.</p>
</li>
<li><p><code>min.compaction.lag.ms</code> / <code>max.compaction.lag.ms</code> → control delay before segments are compacted.</p>
</li>
<li><p><code>delete.retention.ms</code> → how long tombstones are retained.</p>
</li>
</ul>
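<p>Putting those settings together, a compacted topic might be declared as follows (illustrative values — verify the defaults for your Kafka version):</p>
<pre><code class="lang-properties"># Retain only the latest record per key
cleanup.policy=compact
# Start cleaning once 50% of the log consists of superseded records
min.cleanable.dirty.ratio=0.5
# Never compact records younger than 10 minutes
min.compaction.lag.ms=600000
# Keep tombstones for 24 hours so lagging consumers still see deletes
delete.retention.ms=86400000
</code></pre>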
<h3 id="heading-tombstones">Tombstones</h3>
<ul>
<li><p>A <strong>tombstone</strong> is a message with a key and a <code>null</code> value.</p>
</li>
<li><p>Signals that all previous values for that key should be deleted during compaction.</p>
</li>
</ul>
<p><strong>Example</strong>:</p>
<pre><code class="lang-json">{ <span class="hljs-attr">"key"</span>: <span class="hljs-string">"user123"</span>, <span class="hljs-attr">"value"</span>: <span class="hljs-literal">null</span> }
</code></pre>
<p>How Tombstones Work:</p>
<ol>
<li><p>Marks the key for deletion → tells Kafka “forget this key.”</p>
</li>
<li><p>During compaction, Kafka removes earlier messages with that key.</p>
</li>
<li><p>The tombstone itself is later removed after <code>delete.retention.ms</code>.</p>
</li>
</ol>
<p>✅ Enables explicit <strong>deletes</strong> in a compacted topic.</p>
<p>⚠️ Consumers must be designed to interpret null values correctly.</p>
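<p>Conceptually, compaction collapses a partition’s history to the latest value per key, and a tombstone removes the key entirely. A toy sketch of the end result (the real log cleaner works segment by segment and preserves offsets, which this ignores):</p>
<pre><code class="lang-javascript">// Toy model of the *result* of log compaction:
// keep only the newest value per key; a null value (tombstone) deletes the key.
function compact(records) {
  const latest = new Map();
  for (const rec of records) {
    latest.set(rec.key, rec.value); // later records win
  }
  return [...latest.entries()]
    .filter(([, value]) => value !== null) // drop tombstoned keys
    .map(([key, value]) => ({ key, value }));
}

const log = [
  { key: "user123", value: "alice@old.com" },
  { key: "user456", value: "bob@mail.com" },
  { key: "user123", value: "alice@new.com" }, // supersedes the first record
  { key: "user456", value: null },            // tombstone: forget user456
];

console.log(compact(log)); // [ { key: 'user123', value: 'alice@new.com' } ]
</code></pre>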
<h3 id="heading-summary-2">Summary</h3>
<ul>
<li><p><strong>Keys</strong> define partitioning and enable ordered per-entity streams.</p>
</li>
<li><p><strong>Log compaction</strong> ensures only the latest record per key is retained, reducing log size while preserving correctness.</p>
</li>
<li><p><strong>Tombstones</strong> provide a mechanism for deleting keys in compacted topics.</p>
</li>
</ul>
<blockquote>
<p>💡 Keys + compaction allow Kafka to serve as both a durable event log and a state store for real-time applications.</p>
</blockquote>
<hr />
<h2 id="heading-5-replication">5. Replication</h2>
<p>Replication in Kafka ensures resilience and fault tolerance by distributing partitions across multiple brokers. Each partition has one <strong>leader</strong> and one or more <strong>followers</strong> that maintain synchronized copies.</p>
<h3 id="heading-leader-and-followers">Leader and Followers</h3>
<ul>
<li><p><strong>Leader</strong> → handles all reads and writes for the partition.</p>
</li>
<li><p><strong>Followers</strong> → replicate the leader’s log asynchronously to stay in sync.</p>
</li>
</ul>
<p>✅ Clients always interact with the leader, simplifying producer/consumer logic.</p>
<h3 id="heading-replication-factor">Replication Factor</h3>
<ul>
<li><p>Defines the number of copies per partition.</p>
</li>
<li><p>Common default: <strong>3 (1 leader, 2 followers)</strong>.</p>
</li>
</ul>
<p>✅ Higher replication factor = stronger fault tolerance.</p>
<p>⚠️ Increases storage and network overhead.</p>
<h3 id="heading-in-sync-replicas-isr">In-Sync Replicas (ISR)</h3>
<ul>
<li><p>ISRs are replicas fully caught up with the leader.</p>
</li>
<li><p>Only ISRs are eligible for promotion during failover.</p>
</li>
</ul>
<p>✅ Ensures safe and consistent recovery.</p>
<p>⚠️ Too many out-of-sync replicas weaken durability guarantees.</p>
<h3 id="heading-leader-election-and-failover">Leader Election and Failover</h3>
<ul>
<li><p>If the leader fails, a new one is chosen from the ISR set.</p>
</li>
<li><p>The <strong>Controller</strong> (see Section 6) coordinates this election.</p>
</li>
</ul>
<p>✅ Enables fast recovery and high availability.</p>
<h3 id="heading-consistency-vs-latency-trade-offs">Consistency vs Latency Trade-offs</h3>
<ul>
<li><p><code>acks=all</code> → strongest durability. Leader waits for all ISR acknowledgments.</p>
</li>
<li><p><code>acks=1</code> → leader-only acknowledgment. Faster writes, but less durable.</p>
</li>
</ul>
<p>⚠️ <strong>More replicas = More safety</strong>, but also higher cost and latency.</p>
<h3 id="heading-summary-3">Summary</h3>
<p>Replication provides:</p>
<ul>
<li><p><strong>High availability</strong> through leader/follower design.</p>
</li>
<li><p><strong>Durability</strong> via multiple replicas and ISRs.</p>
</li>
<li><p><strong>Fault tolerance</strong> with automatic leader election.</p>
</li>
</ul>
<blockquote>
<p>💡 Balance <strong>safety</strong> and <strong>performance</strong> by adjusting replication and acknowledgments.</p>
</blockquote>
<hr />
<h2 id="heading-6-controller">6. Controller</h2>
<p>The <strong>Kafka Controller</strong> is a special broker role that manages <strong>cluster-wide metadata and coordination</strong>.<br />In modern <strong>KRaft mode (Kafka Raft)</strong>, controllers form a <strong>quorum</strong> that replaces ZooKeeper, ensuring metadata consistency and high availability.</p>
<h3 id="heading-metadata-management">Metadata Management</h3>
<ul>
<li><p>Tracks topics, partitions, broker registrations, and configurations.</p>
</li>
<li><p>Persists updates in the internal metadata log <code>__cluster_metadata</code>.</p>
</li>
</ul>
<p>✅ Ensures all brokers share a consistent view of the cluster.</p>
<h3 id="heading-leader-election">Leader Election</h3>
<ul>
<li><p>Coordinates <strong>partition leader elections</strong> when brokers fail or join.</p>
</li>
<li><p>Relies on the ISR set maintained by replication (see Section 5).</p>
</li>
</ul>
<p>✅ Keeps partitions highly available with minimal downtime.</p>
<h3 id="heading-partition-assignment">Partition Assignment</h3>
<ul>
<li><p>Distributes partitions across brokers for load balancing.</p>
</li>
<li><p>Reassigns partitions during rebalances, broker failures, or cluster expansion.</p>
</li>
</ul>
<p>⚠️ Frequent reassignments add overhead; prefer stable membership.</p>
<h3 id="heading-quorum-coordination-kraft">Quorum Coordination (KRaft)</h3>
<ul>
<li><p>Controllers form a <strong>Raft quorum</strong>:</p>
<ul>
<li><p>One acts as the <strong>active leader</strong>.</p>
</li>
<li><p>Others are <strong>followers</strong>, replicating metadata changes.</p>
</li>
</ul>
</li>
</ul>
<p>✅ Provides fault tolerance without external ZooKeeper.</p>
<h3 id="heading-cluster-health-and-recovery">Cluster Health and Recovery</h3>
<ul>
<li><p>Detects broker failures and updates cluster state.</p>
</li>
<li><p>Removes failed brokers from the ISR (in coordination with replication).</p>
</li>
<li><p>Triggers <strong>leader re-election</strong> for affected partitions.</p>
</li>
</ul>
<p>✅ Enables rapid self-healing and resilience.</p>
<h3 id="heading-active-vs-follower-controllers">Active vs. Follower Controllers</h3>
<ul>
<li><p><strong>Active Controller (Leader)</strong></p>
<ul>
<li><p>Makes cluster-wide decisions:</p>
<ul>
<li><p>Runs leader elections.</p>
</li>
<li><p>Updates ISR lists.</p>
</li>
<li><p>Tracks broker registrations and failures.</p>
</li>
<li><p>Applies config changes (topics, ACLs, quotas).</p>
</li>
</ul>
</li>
<li><p>Persists changes in <code>__cluster_metadata</code>, replicated to followers.</p>
</li>
</ul>
</li>
</ul>
<p>👉 Functions as the <strong>“cluster brain.”</strong></p>
<ul>
<li><p><strong>Follower Controllers</strong></p>
<ul>
<li><p>Replicate metadata log entries from the active controller.</p>
</li>
<li><p>Do not make independent decisions.</p>
</li>
<li><p>Stay ready to take over if the active controller fails.</p>
</li>
</ul>
</li>
</ul>
<p>👉 Serve as <strong>“standby brains.”</strong></p>
<h3 id="heading-summary-4">Summary</h3>
<p>The Controller is the <strong>control plane</strong> of Kafka:</p>
<ul>
<li><p>Maintains <strong>metadata consistency</strong>.</p>
</li>
<li><p>Runs <strong>leader elections</strong> based on ISR information.</p>
</li>
<li><p>Coordinates <strong>partition assignment</strong> and cluster state changes.</p>
</li>
<li><p>In KRaft mode, controllers use Raft quorum replication, removing ZooKeeper.</p>
</li>
</ul>
<blockquote>
<p>💡 Together with <strong>Replication (Section 5)</strong>, the Controller ensures Kafka remains highly available, consistent, and fault-tolerant.</p>
</blockquote>
<hr />
<h2 id="heading-7-producer">7. Producer</h2>
<p>Producers are responsible for reliable, ordered, and efficient delivery of messages to Kafka topics. Their configuration balances durability, ordering, latency, and resource usage through several key mechanisms.</p>
<h3 id="heading-durability-and-acknowledgments-acks">Durability and Acknowledgments (acks)</h3>
<ul>
<li><p>Producers control how many broker acknowledgments are required before a send is considered successful.</p>
<ul>
<li><p><code>acks=0</code> → fire-and-forget, lowest latency, no durability.</p>
</li>
<li><p><code>acks=1</code> → leader acknowledgment only, balances latency and durability.</p>
</li>
<li><p><code>acks=all</code> → requires leader + ISR acknowledgment, strongest durability.</p>
</li>
</ul>
</li>
</ul>
<p>✅ Use <code>acks=all</code> for critical data.</p>
<h3 id="heading-ordering-and-retries">Ordering and Retries</h3>
<ul>
<li><p>Kafka producers retry failed sends automatically.</p>
</li>
<li><p>Retries can break ordering if multiple requests are in flight.</p>
</li>
<li><p>Set <code>max.in.flight.requests.per.connection=1</code> to strictly preserve order; with idempotence enabled, ordering is also preserved with up to 5 in-flight requests.</p>
</li>
<li><p><strong>Idempotence</strong> (<code>enable.idempotence=true</code>) ensures retries don’t produce duplicates.</p>
</li>
</ul>
<p>✅ Combine retries + idempotence for duplicate-free, ordered delivery within a partition; full exactly-once semantics across topics also require transactions (see Section 11).</p>
<h3 id="heading-batching-and-latency-trade-offs">Batching and Latency Trade-offs</h3>
<ul>
<li><p>Producers buffer messages into batches before sending.</p>
</li>
<li><p><code>batch.size</code> controls max size of a batch in bytes.</p>
</li>
<li><p><code>linger.ms</code> sets how long to wait before sending a partially full batch.</p>
<ul>
<li><p>Larger batches / higher linger → better throughput, higher latency.</p>
</li>
<li><p>Smaller batches / lower linger → lower latency, reduced throughput.</p>
</li>
</ul>
</li>
</ul>
<p>✅ Tune for workload: real-time systems prefer low latency; batch pipelines prefer throughput.</p>
<h3 id="heading-compression">Compression</h3>
<ul>
<li><p>Supported codecs: <code>gzip</code>, <code>snappy</code>, <code>lz4</code>, <code>zstd</code>.</p>
</li>
<li><p>Compression applies per batch, saving bandwidth and storage.</p>
</li>
<li><p>Default is <code>none</code>.</p>
</li>
<li><p><code>gzip</code> incurs higher CPU usage for compression/decompression.</p>
</li>
</ul>
<p>✅ <code>lz4</code> or <code>zstd</code> for good speed/ratio balance.</p>
<h3 id="heading-resource-limits-and-buffering">Resource Limits and Buffering</h3>
<ul>
<li><p><code>buffer.memory</code>: max memory available for unsent records.</p>
</li>
<li><p><code>max.block.ms</code>: how long <code>send()</code> will block when buffer is full.</p>
</li>
<li><p><code>max.request.size</code>: prevents oversized requests.</p>
</li>
<li><p>These settings protect the producer and broker from overload.</p>
</li>
</ul>
<p>✅ Monitor producer metrics (buffer exhaustion, errors) to detect bottlenecks.</p>
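<p>Pulled together, the settings above can be sketched as a plain <code>java.util.Properties</code> map using the standard producer property names. The values are illustrative starting points for a throughput-oriented workload, not recommendations:</p>

```java
import java.util.Properties;

// Illustrative producer settings combining the trade-offs discussed above.
// Property names are standard Kafka producer configs; values are examples.
class ProducerTuningSketch {

    static Properties throughputOrientedConfig() {
        Properties props = new Properties();
        props.setProperty("acks", "all");                  // strongest durability
        props.setProperty("enable.idempotence", "true");   // dedupe retried sends
        props.setProperty("batch.size", "65536");          // up to 64 KB per batch
        props.setProperty("linger.ms", "10");              // wait up to 10 ms to fill a batch
        props.setProperty("compression.type", "lz4");      // good speed/ratio balance
        props.setProperty("buffer.memory", "67108864");    // 64 MB for unsent records
        props.setProperty("max.block.ms", "30000");        // bound send() blocking on full buffer
        return props;
    }

    public static void main(String[] args) {
        System.out.println(throughputOrientedConfig().getProperty("compression.type")); // lz4
    }
}
```

<p>A latency-sensitive service would instead shrink <code>linger.ms</code> toward <code>0</code> and accept smaller batches.</p>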
<h3 id="heading-summary-5">Summary</h3>
<p>Producer tuning is about balancing:</p>
<ul>
<li><p><strong>Durability vs. latency</strong> (<code>acks</code>).</p>
</li>
<li><p><strong>Ordering vs. throughput</strong> (retries, in-flight requests).</p>
</li>
<li><p><strong>CPU vs. I/O efficiency</strong> (compression, batching).</p>
</li>
</ul>
<blockquote>
<p>💡 With correct configuration, producers achieve high throughput without sacrificing reliability.</p>
</blockquote>
<hr />
<h2 id="heading-8-consumer">8. Consumer</h2>
<p>Consumers are responsible for reading messages from topics, tracking their progress, and coordinating with other consumers in a group. Their configuration impacts delivery guarantees, throughput, latency, fault tolerance, and ordering.</p>
<h3 id="heading-offset-management-and-delivery-guarantees">Offset Management and Delivery Guarantees</h3>
<ul>
<li><p><strong>Automatic commits</strong> (<code>enable.auto.commit=true</code>) → simple, but only <em>at-least-once</em> delivery since commits are decoupled from processing.</p>
</li>
<li><p><strong>Manual commits</strong> (<code>commitSync</code> / <code>commitAsync</code>) → give precise control to commit only after successful processing.</p>
</li>
<li><p>For <strong>exactly-once semantics</strong>, bind offset commits to producer transactions (see Section 11); manual synchronous commits alone still yield at-least-once delivery.</p>
</li>
<li><p><code>auto.offset.reset</code> determines startup behavior if no committed offset exists:</p>
<ul>
<li><p><code>earliest</code> → start from the beginning (useful for replays).</p>
</li>
<li><p><code>latest</code> → only consume new records.</p>
</li>
</ul>
</li>
</ul>
<p>✅ Use manual commits or transactional commits in critical pipelines.</p>
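<p>A hedged sketch of what that implies in configuration, using the standard consumer property names (the group id is hypothetical):</p>

```java
import java.util.Properties;

// Settings for a pipeline that commits offsets manually,
// only after records have been fully processed.
class ConsumerCommitSketch {

    static Properties manualCommitConfig() {
        Properties props = new Properties();
        props.setProperty("group.id", "orders-pipeline");   // hypothetical group name
        props.setProperty("enable.auto.commit", "false");   // commit explicitly after processing
        props.setProperty("auto.offset.reset", "earliest"); // replay from the start if no offset exists
        return props;
    }

    public static void main(String[] args) {
        System.out.println(manualCommitConfig().getProperty("enable.auto.commit")); // false
    }
}
```

<p>With auto-commit disabled, the application decides exactly when progress is recorded, trading simplicity for control.</p>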
<h3 id="heading-partition-assignment-and-rebalancing">Partition Assignment and Rebalancing</h3>
<ul>
<li><p>Within one consumer group, each partition is assigned to <strong>at most one member</strong> at a time.</p>
</li>
<li><p>Multiple consumer groups can read the same partition independently.</p>
</li>
<li><p>Assignment strategies:</p>
<ul>
<li><p><strong>Range</strong> → contiguous partition sets.</p>
</li>
<li><p><strong>RoundRobin</strong> → even distribution across members.</p>
</li>
<li><p><strong>Sticky</strong> → minimizes partition movement during rebalances.</p>
</li>
</ul>
</li>
<li><p>Frequent join/leave events → trigger rebalances and pause consumption.</p>
</li>
</ul>
<p>✅ Keep membership stable to reduce churn.<br />⚠️ Tune <code>session.timeout.ms</code> and <code>heartbeat.interval.ms</code>:</p>
<ul>
<li><p>Higher values tolerate long GC pauses or transient work.</p>
</li>
<li><p>Lower values detect failures faster but may cause false positives.</p>
</li>
</ul>
<h3 id="heading-poll-and-fetch-tuning">Poll and Fetch Tuning</h3>
<ul>
<li><p><code>max.poll.records</code>:</p>
<ul>
<li><p>Increase for higher throughput.</p>
</li>
<li><p>Reduce to limit per-iteration processing and avoid long loops.</p>
</li>
</ul>
</li>
<li><p><code>max.partition.fetch.bytes</code> and <code>fetch.max.wait.ms</code>:</p>
<ul>
<li><p>Larger values → better for bulk processing.</p>
</li>
<li><p>Smaller values → better for low-latency use cases.</p>
</li>
</ul>
</li>
<li><p><code>fetch.min.bytes</code>:</p>
<ul>
<li><p>Set higher to batch more data (throughput).</p>
</li>
<li><p>Set to <code>1</code> for immediate returns (latency).</p>
</li>
</ul>
</li>
<li><p>The poll loop must call <code>poll()</code> frequently:</p>
<ul>
<li><p>Long processing requires increasing <code>max.poll.interval.ms</code>.</p>
</li>
<li><p>Handle rebalance callbacks to stay responsive.</p>
</li>
</ul>
</li>
</ul>
<p>✅ Balance throughput vs. latency depending on workload.</p>
<h3 id="heading-summary-6">Summary</h3>
<p>Consumer tuning balances:</p>
<ul>
<li><p><strong>Delivery guarantees vs. simplicity</strong> (auto vs manual commits).</p>
</li>
<li><p><strong>Partition stability vs. flexibility</strong> (assignment and rebalance strategies).</p>
</li>
<li><p><strong>Throughput vs. latency</strong> (poll/fetch tuning).</p>
</li>
</ul>
<blockquote>
<p>💡 Use manual or transactional commits for critical pipelines, keep consumer group membership stable, and tune poll/fetch settings to balance throughput with latency.</p>
</blockquote>
<hr />
<h2 id="heading-9-offset-tracking">9. Offset Tracking</h2>
<p>An <strong>offset</strong> is a position marker that tells a consumer <em>which record it has read up to</em> in a partition, and where to resume on restart or after a failure. Kafka tracks offsets <strong>per partition, per consumer group</strong>, allowing multiple consumers to share work safely.</p>
<h3 id="heading-how-offset-tracking-works">How Offset Tracking Works</h3>
<ul>
<li><p><strong>Consumer Pull Model</strong></p>
<ul>
<li><p>Consumers request data from partitions starting from a specific offset.</p>
</li>
<li><p>They control whether to begin from <code>earliest</code>, <code>latest</code>, or a committed offset.</p>
</li>
</ul>
</li>
<li><p><strong>Offset Commitment</strong></p>
<ul>
<li><p>Consumers save progress by committing offsets, either automatically or manually.</p>
</li>
<li><p>Committed offsets are stored in Kafka’s internal topic <code>__consumer_offsets</code>, which is partitioned and replicated.</p>
</li>
</ul>
</li>
</ul>
<p>✅ Automatic commits are simple for <em>at-least-once</em> delivery.</p>
<p>⚠️ Manual commits are safer for critical processing, but require more application logic.</p>
<h3 id="heading-consumer-position-vs-committed-offset">Consumer Position vs. Committed Offset</h3>
<ul>
<li><p><strong>Consumer Position</strong> → the <strong>next</strong> record the consumer will read (held in memory).</p>
</li>
<li><p><strong>Committed Offset</strong> → the last offset safely stored as a checkpoint.</p>
</li>
</ul>
<pre><code class="lang-text">[00][01][02][03][04][05][06][07][08][09][10][11]
                                    ^-- committed = 09 (resume here)
                                            ^-- position = 11 (next to read)
</code></pre>
<p>👉 If the consumer crashes, it restarts from the <strong>committed offset</strong>, not the in-memory position.<br />This means it may <strong>re-read some records</strong> but won’t skip any.</p>
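<p>That recovery behavior can be simulated in a few lines: everything between the committed offset and the end of the log is delivered again on restart, so duplicates are possible but gaps are not. This is a plain-Java model, not the client API:</p>

```java
import java.util.ArrayList;
import java.util.List;

// Simulates resuming from the last committed offset after a crash.
class OffsetResumeSketch {

    // Records that will be (re)delivered when a consumer restarts:
    // everything from the committed offset up to the log end offset.
    static List<Integer> redeliveredAfterCrash(int committedOffset, int logEndOffset) {
        List<Integer> redelivered = new ArrayList<>();
        for (int offset = committedOffset; offset < logEndOffset; offset++) {
            redelivered.add(offset);
        }
        return redelivered;
    }

    public static void main(String[] args) {
        // Committed = 09, log end = 11: offsets 9 and 10 are
        // processed a second time; nothing is skipped.
        System.out.println(redeliveredAfterCrash(9, 11)); // [9, 10]
    }
}
```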
<h3 id="heading-summary-7">Summary</h3>
<ul>
<li><p>Offsets are <strong>per-partition position markers</strong>.</p>
</li>
<li><p>Kafka persists committed offsets in the <code>__consumer_offsets</code> topic.</p>
</li>
<li><p>The gap between <strong>position vs. committed offset</strong> provides fault tolerance, but may cause duplicates.</p>
</li>
</ul>
<blockquote>
<p>💡 Correct offset management is essential for delivery guarantees (<em>at-least-once, at-most-once, exactly-once</em>).</p>
</blockquote>
<hr />
<h2 id="heading-10-rebalance">10. Rebalance</h2>
<p>Rebalancing is the process where Kafka’s <strong>Group Coordinator</strong> redistributes partitions among consumers in a <strong>consumer group</strong> whenever the workload relationship changes.</p>
<h3 id="heading-when-rebalancing-happens">When Rebalancing Happens</h3>
<ul>
<li><p>A new consumer joins the group (more parallelism).</p>
</li>
<li><p>An existing consumer leaves or fails (load must be reassigned).</p>
</li>
<li><p>A topic’s partitions increase (new partitions must be assigned).</p>
</li>
</ul>
<h3 id="heading-how-rebalancing-works">How Rebalancing Works</h3>
<ol>
<li><p>Group Coordinator detects a change in group membership.</p>
</li>
<li><p>All consumers stop fetching temporarily.</p>
</li>
<li><p>Coordinator calculates a new partition assignment.</p>
</li>
<li><p>Each consumer receives its updated assignment.</p>
</li>
<li><p>Consumers resume reading from their assigned offsets.</p>
</li>
</ol>
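<p>Step 3 above can be sketched as a toy assignment function. This mimics a RoundRobin-style strategy; the real coordinator delegates to whatever <code>partition.assignment.strategy</code> is configured:</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the coordinator computing a fresh partition assignment.
class RebalanceSketch {

    static Map<String, List<Integer>> assign(List<String> consumers, int partitionCount) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String c : consumers) {
            assignment.put(c, new ArrayList<>());
        }
        for (int p = 0; p < partitionCount; p++) {
            // Deal partitions out round-robin across group members.
            assignment.get(consumers.get(p % consumers.size())).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Two consumers, six partitions -> three partitions each.
        System.out.println(assign(Arrays.asList("c1", "c2"), 6)); // {c1=[0, 2, 4], c2=[1, 3, 5]}
        // If c2 leaves, a rebalance hands all six partitions to c1.
        System.out.println(assign(Arrays.asList("c1"), 6));       // {c1=[0, 1, 2, 3, 4, 5]}
    }
}
```

<p>Every membership change recomputes this mapping, which is why stable group membership matters: during the recomputation, consumption is paused.</p>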
<blockquote>
<p>💡 Minimize unnecessary group membership changes and control partition counts carefully to reduce rebalance frequency and consumer downtime.</p>
</blockquote>
<hr />
<h2 id="heading-11-exactly-once-and-transactions">11. Exactly Once and Transactions</h2>
<p>Kafka’s <strong>Exactly-Once Semantics (EOS)</strong> ensures that messages are processed <em>once and only once</em>, even in the face of retries or failures. This combines idempotent production, transactions, and offset commits into a unified model for reliable stream processing.</p>
<h3 id="heading-idempotent-producer">Idempotent Producer</h3>
<ul>
<li><p>When <code>enable.idempotence=true</code>, the producer is assigned a <strong>Producer ID (PID)</strong> and per-partition sequence numbers.</p>
</li>
<li><p>Retries are deduplicated at the broker using these sequence numbers.</p>
</li>
</ul>
<p>✅ Guarantees <em>no duplicates</em> in a single partition, even under retries.</p>
<p>⚠️ Does not guarantee atomicity across multiple partitions or topics by itself.</p>
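<p>The broker-side bookkeeping can be modeled in a few lines. This is a simplified sketch of the dedup idea (the real broker also rejects sequence <em>gaps</em> with an error rather than silently accepting them):</p>

```java
import java.util.HashMap;
import java.util.Map;

// Models broker-side deduplication for an idempotent producer:
// a batch is appended only if its sequence number is newer than
// the last sequence seen for that (producerId, partition).
class IdempotenceSketch {

    private final Map<String, Integer> lastSeq = new HashMap<>(); // "pid-partition" -> last sequence
    private int appended = 0;

    boolean tryAppend(long producerId, int partition, int sequence) {
        String key = producerId + "-" + partition;
        int last = lastSeq.getOrDefault(key, -1);
        if (sequence <= last) {
            return false;              // duplicate retry: dropped, not re-appended
        }
        lastSeq.put(key, sequence);
        appended++;
        return true;
    }

    int appendedCount() { return appended; }

    public static void main(String[] args) {
        IdempotenceSketch broker = new IdempotenceSketch();
        broker.tryAppend(42L, 0, 0);   // first attempt: appended
        broker.tryAppend(42L, 0, 0);   // network retry of the same batch: rejected
        broker.tryAppend(42L, 0, 1);   // next batch: appended
        System.out.println(broker.appendedCount()); // 2
    }
}
```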
<h3 id="heading-transactional-producer">Transactional Producer</h3>
<ul>
<li><p>A <strong>transactional producer</strong> groups multiple writes and offset commits into a single atomic unit.</p>
</li>
<li><p>Either all messages + offset commits succeed, or none do.</p>
</li>
<li><p>Controlled via a stable <code>transactional.id</code>, which enables fencing (old producers with the same ID are invalidated).</p>
</li>
</ul>
<p>✅ Provides atomic <em>read → process → write</em> semantics.</p>
<h3 id="heading-transaction-coordinator">Transaction Coordinator</h3>
<ul>
<li><p>A special broker component that manages transaction state.</p>
</li>
<li><p>Persists transaction metadata in the internal topic <code>__transaction_state</code>.</p>
</li>
<li><p>Ensures commit/abort decisions are coordinated for each <code>transactional.id</code>.</p>
</li>
</ul>
<p>⚠️ Coordinator bottlenecks can occur if too many producers use transactions with wide scope.</p>
<h3 id="heading-consumer-isolation-levels">Consumer Isolation Levels</h3>
<ul>
<li><p>Consumers control visibility into transactional writes via <code>isolation.level</code>:</p>
<ul>
<li><p><code>read_uncommitted</code> → sees all records (including aborted transactions).</p>
</li>
<li><p><code>read_committed</code> → sees only records from successfully committed transactions.</p>
</li>
</ul>
</li>
</ul>
<p>✅ Use <code>read_committed</code> in pipelines that require strict correctness.</p>
<h3 id="heading-offsets-in-transactions">Offsets in Transactions</h3>
<ul>
<li><p>The <code>sendOffsetsToTransaction</code> API binds offset commits to producer transactions.</p>
</li>
<li><p>Offsets are only committed if the producer transaction itself commits.</p>
</li>
</ul>
<p>✅ Ensures <em>exactly-once</em> end-to-end semantics: messages are processed and offsets advanced atomically.</p>
<h3 id="heading-summary-8">Summary</h3>
<ul>
<li><p><strong>Idempotence</strong> removes duplicates per partition.</p>
</li>
<li><p><strong>Transactions</strong> extend atomicity across topics + offsets.</p>
</li>
<li><p><strong>Coordinators</strong> maintain transaction state.</p>
</li>
<li><p><strong>Isolation levels</strong> let consumers choose between speed (<code>read_uncommitted</code>) and safety (<code>read_committed</code>).</p>
</li>
</ul>
<blockquote>
<p>💡 Set <strong>enable.idempotence=true</strong> by default, and use <strong>transactions</strong> (<code>transactional.id</code> + <code>sendOffsetsToTransaction</code>) only when strict exactly-once guarantees across topics and offsets are required.</p>
</blockquote>
<hr />
<h2 id="heading-12-deployment">12. Deployment</h2>
<h3 id="heading-cluster-topology-and-roles">Cluster Topology and Roles</h3>
<ul>
<li><p>Separate <strong>controller</strong> and <strong>broker</strong> roles on dedicated nodes for production-scale clusters.</p>
</li>
<li><p>Run a <strong>controller-only quorum</strong> of 3 or 5 nodes.</p>
<ul>
<li><p>Three controllers are sufficient for moderate clusters.</p>
</li>
<li><p>Five controllers are preferred for larger clusters or higher availability needs.</p>
</li>
</ul>
</li>
<li><p>Use <strong>broker-only nodes</strong> for the data plane (producers and consumers).</p>
</li>
<li><p>Deploy at least three brokers and configure <code>replication.factor ≥ 3</code> for critical topics.</p>
</li>
</ul>
<h3 id="heading-storage-and-disks">Storage and Disks</h3>
<ul>
<li><p>Use <strong>JBOD (Just a Bunch of Disks)</strong> — no RAID. Present disks individually to brokers and let Kafka handle replication.</p>
</li>
<li><p>Prefer the <strong>XFS filesystem</strong> tuned for large files; mount broker volumes with <code>noatime</code> (or <code>relatime</code> if atime tracking is required).</p>
</li>
<li><p>Use <strong>HDDs</strong> on brokers for high sequential throughput and cost efficiency. Consider <strong>SSDs/NVMe</strong> for controller nodes (metadata logs) or if your workloads involve heavy random reads or strict latency SLAs.</p>
</li>
<li><p>Tune <code>log.segment.bytes</code> and retention policies to manage the number of segments and control <code>mmap</code> usage.</p>
</li>
</ul>
<h3 id="heading-memory-heap-and-os-tuning">Memory, Heap, and OS Tuning</h3>
<ul>
<li><p>Keep broker JVM heap <strong>small and fixed</strong> (typically 4–8 GB). Leave the remaining RAM for the OS page cache.</p>
</li>
<li><p>Apply the <strong>RAM sizing rule</strong>: provision enough RAM to buffer approximately 30 seconds of peak ingest throughput in the page cache.</p>
</li>
</ul>
<blockquote>
<p><strong>Example:</strong> If ingest is 300 MB/s, you want ~9 GB of RAM just for cache (300 MB/s × 30 s = 9,000 MB).</p>
<p><strong>Formula:</strong> required RAM for cache ≈ (ingest throughput in MB/s) × 30 seconds</p>
</blockquote>
<ul>
<li>Raise <code>vm.max_map_count</code> for large clusters with many partitions or segments (e.g., set to 262144 or higher when required).</li>
</ul>
<blockquote>
<p><strong>Formula:</strong> required vm.max_map_count ≈ partitions_per_broker × segments_per_partition × 2</p>
</blockquote>
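<p>Both sizing rules above are simple arithmetic; a small sketch with hypothetical numbers (300 MB/s ingest, 4,000 partitions per broker, 40 segments per partition):</p>

```java
// Applies the two capacity-planning rules quoted above.
// All input numbers here are hypothetical examples.
class SizingSketch {

    // RAM needed to buffer ~30 s of peak ingest in the page cache (MB).
    static long pageCacheRamMb(long ingestMbPerSec) {
        return ingestMbPerSec * 30;
    }

    // Rough lower bound for vm.max_map_count given partition/segment counts.
    static long requiredMaxMapCount(long partitionsPerBroker, long segmentsPerPartition) {
        return partitionsPerBroker * segmentsPerPartition * 2;
    }

    public static void main(String[] args) {
        System.out.println(pageCacheRamMb(300));            // 9000 (MB, ~9 GB)
        System.out.println(requiredMaxMapCount(4000, 40));  // 320000
    }
}
```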
<ul>
<li><p>Increase file descriptor limits (<code>ulimit -n</code>) to at least 100k.</p>
</li>
<li><p>For networking, provision <strong>10Gbps NICs</strong> for high-throughput clusters and tune socket buffers for cross–data center replication.</p>
</li>
</ul>
<h3 id="heading-availability-replication-and-durability">Availability, Replication, and Durability</h3>
<ul>
<li><p>Configure <code>min.insync.replicas ≥ 2</code> when <code>replication.factor = 3</code> to ensure durability even if one replica fails.</p>
</li>
<li><p>Require producers to use <code>acks=all</code> for critical topics to ensure writes are fully replicated before acknowledgment.</p>
</li>
<li><p>Enable <strong>rack awareness</strong> (<code>broker.rack</code>) so replicas are distributed across racks or availability zones for better fault tolerance.</p>
</li>
<li><p>Consider <strong>tiered storage</strong> (e.g., S3 or HDFS) for offloading cold data while keeping hot data local to brokers.</p>
</li>
</ul>
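<p>The interplay of these settings reduces to simple arithmetic: with <code>acks=all</code>, a partition keeps accepting writes as long as at least <code>min.insync.replicas</code> replicas remain in sync. A small sketch:</p>

```java
// How many replica failures a partition tolerates while still
// accepting acks=all writes, given the durability settings above.
class DurabilitySketch {

    static int writeFailuresTolerated(int replicationFactor, int minInsyncReplicas) {
        return replicationFactor - minInsyncReplicas;
    }

    public static void main(String[] args) {
        // replication.factor = 3, min.insync.replicas = 2:
        // one replica can be down and writes still succeed.
        System.out.println(writeFailuresTolerated(3, 2)); // 1
    }
}
```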
<h3 id="heading-security-and-networking">Security and Networking</h3>
<ul>
<li><p>Enable <strong>TLS encryption</strong> for both client–broker and inter-broker communication.</p>
</li>
<li><p>Use <strong>SASL authentication</strong> (SCRAM, mTLS, or GSSAPI depending on your environment).</p>
</li>
<li><p>Apply <strong>Kafka ACLs</strong> to enforce least-privilege access control.</p>
</li>
<li><p>Restrict broker ports to trusted networks and place brokers/controllers in <strong>private subnets</strong>.</p>
</li>
</ul>
<h3 id="heading-operations-monitoring-and-alerting">Operations, Monitoring, and Alerting</h3>
<p>Kafka’s monitoring flow begins with JMX exposing internal metrics, which are collected by a Prometheus exporter and visualized through Grafana dashboards for real-time tracking and alerting.</p>
<ul>
<li><p><strong>Key Metrics to Track</strong></p>
<ul>
<li><p>Under-replicated or offline partitions</p>
</li>
<li><p>Request latency across produce and fetch paths</p>
</li>
<li><p>ISR size fluctuations and consumer lag</p>
</li>
<li><p>Disk usage and I/O saturation</p>
</li>
<li><p>GC pause duration and frequency</p>
</li>
</ul>
</li>
<li><p><strong>Critical Alerts</strong></p>
<ul>
<li><p>Shrinking ISR or under-replicated partitions.</p>
</li>
<li><p>Offline or missing replicas.</p>
</li>
<li><p>Disk pressure or high utilization.</p>
</li>
<li><p>Long GC pauses.</p>
</li>
<li><p>Frequent rebalances.</p>
</li>
</ul>
</li>
</ul>
<hr />
<h2 id="heading-13-key-takeaways">13. Key Takeaways</h2>
<ul>
<li><p><strong>Kafka is not just a queue</strong>: it’s a <strong>distributed event streaming platform</strong> for high-throughput, real-time data pipelines.</p>
</li>
<li><p><strong>Core roles</strong>: Producers publish, Consumers subscribe, Topics organize, and Partitions enable horizontal scalability.</p>
</li>
<li><p><strong>Immutable, ordered logs</strong>: guarantee replayable data streams and predictable processing.</p>
</li>
<li><p><strong>Replication and ISR</strong>: leaders handle writes while followers stay synchronized to ensure fault tolerance.</p>
</li>
<li><p><strong>KRaft replaces ZooKeeper</strong>: simplifying cluster metadata management and reducing deployment complexity.</p>
</li>
<li><p><strong>Performance is filesystem-driven</strong>: sequential disk I/O, the OS page cache, and batching give Kafka exceptional throughput.</p>
</li>
<li><p><strong>Exactly-once semantics (EOS)</strong>: achieved through idempotent and transactional producers combined with committed offsets.</p>
</li>
<li><p><strong>Production readiness</strong>: comes from careful tuning of partitions, replication factor, monitoring, and security controls.</p>
</li>
</ul>
<hr />
<h2 id="heading-14-conclusion">14. Conclusion</h2>
<p>Kafka has become the backbone of modern data systems. Its <strong>distributed log architecture</strong> delivers scalability, fault tolerance, and speed—making it ideal for event-driven microservices, real-time analytics, and data pipelines.</p>
<p>By understanding <strong>core concepts</strong> (topics, partitions, logs, replication, controllers) and applying <strong>best practices</strong> in deployment and tuning, you can build <strong>robust, scalable, and future-proof systems</strong> powered by Kafka.</p>
<hr />
<h2 id="heading-appendix-demo-project">Appendix: Demo Project</h2>
<p>To complement the concepts explored in this article, I’ve built a hands-on demo project that puts Kafka’s architecture and transactional patterns into practice.</p>
<p><strong>GitHub Repository:</strong> <a target="_blank" href="https://github.com/arata-x/springboot-kafka-cluster"><em>Spring Boot Kafka Cluster</em></a></p>
<p>This project showcases a production-grade Kafka setup running in <strong>KRaft</strong> mode, integrated with <strong>Spring Boot</strong> and <strong>PostgreSQL</strong>. It includes:</p>
<ul>
<li><p>A multi-node Kafka cluster with <strong>3 controllers</strong> and <strong>3 brokers</strong></p>
</li>
<li><p>A RESTful producer service that publishes events to Kafka</p>
</li>
<li><p>Three consumer services demonstrating:</p>
<ul>
<li><p>Manual acknowledgment</p>
</li>
<li><p>Kafka transactions</p>
</li>
<li><p>Database transactions</p>
</li>
</ul>
</li>
<li><p>A PostgreSQL-backed persistence layer</p>
</li>
<li><p>Docker Compose orchestration for easy startup</p>
</li>
<li><p>Scripts for testing, error simulation, and direct Kafka publishing</p>
</li>
</ul>
<p>Whether you're exploring offset management, transactional guarantees, or deployment strategies, this demo gives you a practical playground to experiment with real-world Kafka patterns.</p>
<blockquote>
<p>💡 Use it as a reference, a starting point, or a sandbox to deepen your Kafka mastery.</p>
</blockquote>
]]></content:encoded></item><item><title><![CDATA[Kafka Made Simple: A Hands-On Quickstart with Docker and Spring Boot]]></title><description><![CDATA[Apache Kafka is a distributed, durable, real-time event streaming platform. It goes beyond a message queue by providing scalability, persistence, and stream processing capabilities.
In this guide, we’ll quickly spin up Kafka with Docker, explore it w...]]></description><link>https://code.aratax.dev/kafka-made-simple-a-hands-on-quickstart-with-docker-and-spring-boot</link><guid isPermaLink="true">https://code.aratax.dev/kafka-made-simple-a-hands-on-quickstart-with-docker-and-spring-boot</guid><category><![CDATA[kafka]]></category><category><![CDATA[Docker]]></category><category><![CDATA[Springboot]]></category><dc:creator><![CDATA[Arata]]></dc:creator><pubDate>Sat, 20 Sep 2025 07:56:16 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1758355241467/270ae084-fb0f-4a50-b434-c6dbba6bf91f.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Apache Kafka is a distributed, durable, real-time event streaming platform. It goes beyond a message queue by providing scalability, persistence, and stream processing capabilities.</p>
<p>In this guide, we’ll quickly spin up Kafka with Docker, explore it with CLI tools, and integrate it into a Spring Boot application.</p>
<hr />
<h2 id="heading-1-what-is-kafka">1. What is Kafka?</h2>
<p>Apache Kafka is a <strong>distributed, durable, real-time event streaming platform</strong>.<br />It was originally developed at LinkedIn and is now part of the Apache Software Foundation.<br />Kafka is designed for <strong>high-throughput, low-latency data pipelines, streaming analytics, and event-driven applications</strong>.</p>
<h3 id="heading-what-is-an-event">What is an Event?</h3>
<p>An <strong>event</strong> is simply a <strong>record of something that happened</strong> in the system.<br />Each event usually includes:</p>
<ul>
<li><p><strong>Key</strong> → identifier (e.g., user ID, order ID).</p>
</li>
<li><p><strong>Value</strong> → the payload (e.g., “order created with total = $50”).</p>
</li>
<li><p><strong>Timestamp</strong> → when the event occurred.</p>
</li>
</ul>
<p>Example event:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"key"</span>: <span class="hljs-string">"order-123"</span>,
  <span class="hljs-attr">"value"</span>: { <span class="hljs-attr">"customer"</span>: <span class="hljs-string">"Alice"</span>, <span class="hljs-attr">"total"</span>: <span class="hljs-number">50</span> },
  <span class="hljs-attr">"timestamp"</span>: <span class="hljs-string">"2025-09-19T10:15:00Z"</span>
}
</code></pre>
<h3 id="heading-what-is-an-event-streaming-platform">What is an Event Streaming Platform?</h3>
<p>An <strong>event streaming platform</strong> is a system designed to handle continuous flows of data — or <em>events</em> — in real time.<br />Instead of working in batches (processing data after the fact), it allows applications to <strong>react as events happen</strong>.</p>
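<p>The event shape above can be modeled in a few lines of Java. This is a hypothetical sketch for illustration (the <code>Event</code> record is my own, not a Kafka API class):</p>

```java
import java.time.Instant;
import java.util.Map;

// Minimal model of a Kafka-style event: key, payload, and timestamp.
record Event(String key, Map<String, Object> value, Instant timestamp) {

    // Convenience factory mirroring the "order created" JSON example above.
    static Event orderCreated(String orderId, String customer, int total) {
        return new Event("order-" + orderId,
                Map.of("customer", customer, "total", total),
                Instant.now());
    }
}
```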
<hr />
<h2 id="heading-2-what-kafka-can-do">2. What Kafka Can Do</h2>
<p>Kafka is more than a message queue—it's a real-time event backbone for modern systems.</p>
<h3 id="heading-messaging-like-a-message-queue">Messaging Like a Message Queue</h3>
<p>Kafka decouples producers and consumers, enabling asynchronous communication between services.</p>
<p><strong>Example</strong>: A banking system publishes transaction events to Kafka. Fraud detection, ledger updates, and notification services consume these events independently.</p>
<h3 id="heading-event-streaming">Event Streaming</h3>
<p>Kafka streams data in real time, allowing systems to react instantly.</p>
<p><strong>Example</strong>: An insurance platform streams claim events to trigger automated validation, underwriting checks, and customer updates in real time.</p>
<h3 id="heading-data-integration">Data Integration</h3>
<p>Kafka Connect bridges Kafka with databases, cloud storage, and analytics platforms.</p>
<p><strong>Example</strong>: A semiconductor company streams sensor data from manufacturing equipment into a data lake for predictive maintenance and yield optimization.</p>
<h3 id="heading-log-aggregation">Log Aggregation</h3>
<p>Kafka centralizes logs from multiple services for monitoring and analysis.</p>
<p><strong>Example</strong>: An industrial automation system sends logs from PLCs and controllers to Kafka, where they’re consumed by a monitoring dashboard for anomaly detection.</p>
<h3 id="heading-replayable-history">Replayable History</h3>
<p>Kafka retains events for reprocessing or backfilling.</p>
<p><strong>Example</strong>: An insurance company replays past policy events to train a model that predicts claim risk or customer churn. This avoids relying on static snapshots and gives the model a dynamic, time-aware view of behavior.</p>
<h3 id="heading-scalable-microservices-communication">Scalable Microservices Communication</h3>
<p>Kafka handles high-throughput messaging across distributed services.</p>
<p><strong>Example</strong>: A financial institution uses Kafka to coordinate customer onboarding, KYC checks, and account provisioning across multiple microservices.</p>
<hr />
<h2 id="heading-3-architecture">3. Architecture</h2>
<p>Apache Kafka’s architecture is built for <strong>high throughput, fault tolerance, and horizontal scalability</strong>. At its core, Kafka relies on a <strong>log-based storage model</strong> and a <strong>distributed broker cluster</strong>.</p>
<p><strong>Core Components</strong></p>
<ul>
<li><p><strong>Producer</strong> → Publishes records (events/messages) to topics. Can be idempotent or transactional.</p>
</li>
<li><p><strong>Topic</strong> → Logical category/feed for messages. Divided into <strong>partitions</strong> for parallelism.</p>
</li>
<li><p><strong>Partition</strong> → Ordered, immutable commit log. Records have sequential <strong>offsets</strong>.</p>
</li>
<li><p><strong>Broker</strong> → A Kafka server that stores partitions. Clusters have multiple brokers.</p>
</li>
<li><p><strong>Consumer</strong> → Subscribes to topics and processes messages. Part of a <strong>consumer group</strong> for scaling.</p>
</li>
<li><p><strong>Controller</strong> → Special broker role that manages metadata, leader election, and partition assignment.</p>
</li>
<li><p><strong>Replication</strong> → Each partition has one <strong>leader</strong> and multiple <strong>followers</strong> in the <strong>ISR</strong> (in-sync replicas).</p>
</li>
</ul>
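<p>One detail worth internalizing from the list above: within a consumer group, each partition is consumed by exactly one member. A toy round-robin assignment (my own illustration, not Kafka's actual assignor) makes the idea concrete:</p>

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy partition assignment: each partition goes to exactly one consumer,
// so a group of N consumers splits the topic's partitions between them.
class GroupAssignment {
    static Map<String, List<Integer>> assign(List<String> consumers, int partitions) {
        Map<String, List<Integer>> out = new LinkedHashMap<>();
        consumers.forEach(c -> out.put(c, new ArrayList<>()));
        for (int p = 0; p < partitions; p++) {
            out.get(consumers.get(p % consumers.size())).add(p);
        }
        return out;
    }
}
```

Note that with more consumers than partitions, the extra consumers sit idle, which matches Kafka's behavior.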
<p> <strong>Data Flow</strong></p>
<ol>
<li><p><strong>Producers</strong> send records to brokers.</p>
</li>
<li><p>Records are appended to the <strong>leader partition log</strong>.</p>
</li>
<li><p>Followers replicate the leader’s log for durability.</p>
</li>
<li><p><strong>Consumers</strong> fetch records from leaders, tracking their <strong>offsets</strong>.</p>
</li>
</ol>
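<p>The append/fetch mechanics above can be sketched as a toy in-memory partition (an illustration of the log model only, not Kafka's storage engine):</p>

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a partition: an ordered, append-only log.
// Producers get back a sequential offset; consumers read by offset,
// and reading never removes data (unlike a traditional queue).
class Partition {
    private final List<String> log = new ArrayList<>();

    long append(String record) {      // producer side
        log.add(record);
        return log.size() - 1;        // offset of the appended record
    }

    String fetch(long offset) {       // consumer side
        return log.get((int) offset);
    }

    long endOffset() {                // next offset to be written
        return log.size();
    }
}
```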
<p> <strong>Architecture Diagram</strong></p>
<pre><code class="lang-plaintext">               +-----------------+
               |    Producers    |
               +-----------------+
                   |    |    |
                   v    v    v
            +------------------------+
            |     Kafka Cluster      |
            |  +---------+           |
            |  | Broker 1|  &lt;--------------- Partition 0 Leader
            |  +---------+           |
            |  | Broker 2|  &lt;--------------- Partition 0 Follower
            |  +---------+           |
            |  | Broker 3|  &lt;--------------- Partition 1 Leader
            |  +---------+           |
            +------------------------+
                   |    |    |
                   v    v    v
              +-------------------+
              |  Consumer Group   |
              |-------------------|
              | Consumer A → P0   |
              | Consumer B → P1   |
              +-------------------+
</code></pre>
<hr />
<h2 id="heading-4-quickstart-with-docker">4. QuickStart with Docker</h2>
<p>This configuration sets up a single-node Kafka broker using KRaft mode, so no separate ZooKeeper is needed. It’s ideal for development and testing scenarios.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">name:</span> <span class="hljs-string">kafka</span>
<span class="hljs-attr">services:</span>
  <span class="hljs-attr">kafka:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">apache/kafka:4.1.0</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">kafka</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">KAFKA_NODE_ID:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">KAFKA_PROCESS_ROLES:</span> <span class="hljs-string">broker,controller</span>
      <span class="hljs-attr">KAFKA_LISTENERS:</span> <span class="hljs-string">BROKER://:9092,CONTROLLER://:9093</span>
      <span class="hljs-attr">KAFKA_CONTROLLER_QUORUM_VOTERS:</span> <span class="hljs-number">1</span><span class="hljs-string">@localhost:9093</span>
      <span class="hljs-attr">KAFKA_CONTROLLER_LISTENER_NAMES:</span> <span class="hljs-string">CONTROLLER</span>
      <span class="hljs-attr">KAFKA_INTER_BROKER_LISTENER_NAME:</span> <span class="hljs-string">BROKER</span>
      <span class="hljs-attr">KAFKA_LISTENER_SECURITY_PROTOCOL_MAP:</span> <span class="hljs-string">BROKER:PLAINTEXT,CONTROLLER:PLAINTEXT</span>
      <span class="hljs-attr">KAFKA_ADVERTISED_LISTENERS:</span> <span class="hljs-string">BROKER://localhost:9092</span>
      <span class="hljs-attr">KAFKA_CLUSTER_ID:</span> <span class="hljs-string">"kafka-1"</span>
      <span class="hljs-attr">KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">KAFKA_TRANSACTION_STATE_LOG_MIN_ISR:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">KAFKA_LOG_DIRS:</span> <span class="hljs-string">/var/lib/kafka/data</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">kafka_data:/var/lib/kafka/data</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"9092:9092"</span>
<span class="hljs-attr">volumes:</span>
  <span class="hljs-attr">kafka_data:</span>
</code></pre>
<h2 id="heading-how-to-run">How to Run</h2>
<p>Start the Kafka container using:</p>
<pre><code class="lang-bash">docker compose up
</code></pre>
<p>Kafka will be available at localhost:9092 for producers and consumers, while port 9093 is used inside the container for controller communication.</p>
<hr />
<h2 id="heading-5-kafka-cli">5. Kafka CLI</h2>
<p>Before running Kafka commands, log into the Kafka container:</p>
<pre><code class="lang-bash">docker container <span class="hljs-built_in">exec</span> -it kafka bash
</code></pre>
<h3 id="heading-create-topic">Create Topic</h3>
<p>Create a topic named quickstart with one partition and a replication factor of 1:</p>
<pre><code class="lang-bash">/opt/kafka/bin/kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --replication-factor 1 \
  --partitions 1 \
  --topic quickstart
</code></pre>
<h3 id="heading-list-topic">List Topic</h3>
<p>Check all existing topics:</p>
<pre><code class="lang-bash">/opt/kafka/bin/kafka-topics.sh --list \
  --bootstrap-server localhost:9092
</code></pre>
<h3 id="heading-consume-message">Consume Message</h3>
<p>Read messages from the quickstart topic starting from the beginning:</p>
<pre><code class="lang-bash">/opt/kafka/bin/kafka-console-consumer.sh \
  --bootstrap-server localhost:9092 \
  --topic quickstart \
  --from-beginning
</code></pre>
<h3 id="heading-send-message">Send Message</h3>
<p>You can send messages to the quickstart topic using either direct input or a file.</p>
<h3 id="heading-option-a-send-a-single-message">Option A: Send a single message</h3>
<pre><code class="lang-bash"><span class="hljs-built_in">echo</span> <span class="hljs-string">'This is Event 1'</span> | \
/opt/kafka/bin/kafka-console-producer.sh \
  --bootstrap-server localhost:9092 \
  --topic quickstart
</code></pre>
<h3 id="heading-option-b-send-multiple-messages-from-a-file">Option B: Send multiple messages from a file</h3>
<pre><code class="lang-bash"><span class="hljs-built_in">echo</span> <span class="hljs-string">'This is Event 2'</span> &gt; messages.txt
<span class="hljs-built_in">echo</span> <span class="hljs-string">'This is Event 3'</span> &gt;&gt; messages.txt
cat messages.txt | \
/opt/kafka/bin/kafka-console-producer.sh \
  --bootstrap-server localhost:9092 \
  --topic quickstart
</code></pre>
<hr />
<h2 id="heading-5-spring-boot-integration">6. Spring Boot Integration</h2>
<p>This configuration enables seamless integration between a Spring Boot application and an Apache Kafka broker. It defines both producer and consumer settings for message serialization, deserialization, and connection behavior.</p>
<h3 id="heading-pomxml">pom.xml</h3>
<pre><code class="lang-xml"><span class="hljs-comment">&lt;!-- spring-web --&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">dependency</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">groupId</span>&gt;</span>org.springframework.boot<span class="hljs-tag">&lt;/<span class="hljs-name">groupId</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">artifactId</span>&gt;</span>spring-boot-starter-web<span class="hljs-tag">&lt;/<span class="hljs-name">artifactId</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">version</span>&gt;</span>3.4.9<span class="hljs-tag">&lt;/<span class="hljs-name">version</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">dependency</span>&gt;</span>
<span class="hljs-comment">&lt;!-- kafka --&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">dependency</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">groupId</span>&gt;</span>org.springframework.kafka<span class="hljs-tag">&lt;/<span class="hljs-name">groupId</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">artifactId</span>&gt;</span>spring-kafka<span class="hljs-tag">&lt;/<span class="hljs-name">artifactId</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">version</span>&gt;</span>3.3.9<span class="hljs-tag">&lt;/<span class="hljs-name">version</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">dependency</span>&gt;</span>
<span class="hljs-comment">&lt;!-- Lombok(optional) --&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">dependency</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">groupId</span>&gt;</span>org.projectlombok<span class="hljs-tag">&lt;/<span class="hljs-name">groupId</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">artifactId</span>&gt;</span>lombok<span class="hljs-tag">&lt;/<span class="hljs-name">artifactId</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">version</span>&gt;</span>1.18.30<span class="hljs-tag">&lt;/<span class="hljs-name">version</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">optional</span>&gt;</span>true<span class="hljs-tag">&lt;/<span class="hljs-name">optional</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">dependency</span>&gt;</span>
</code></pre>
<h3 id="heading-applicaitonyml">application.yml</h3>
<pre><code class="lang-yaml"><span class="hljs-attr">spring:</span>
  <span class="hljs-attr">kafka:</span>
    <span class="hljs-attr">bootstrap-servers:</span> <span class="hljs-string">localhost:9092</span>
    <span class="hljs-attr">template:</span>
      <span class="hljs-attr">default-topic:</span> <span class="hljs-string">orders</span>
    <span class="hljs-attr">consumer:</span>
      <span class="hljs-attr">group-id:</span> <span class="hljs-string">quickstart-group</span>
      <span class="hljs-attr">auto-offset-reset:</span> <span class="hljs-string">latest</span>
      <span class="hljs-attr">key-deserializer:</span> <span class="hljs-string">org.apache.kafka.common.serialization.StringDeserializer</span>
      <span class="hljs-attr">value-deserializer:</span> <span class="hljs-string">org.springframework.kafka.support.serializer.JsonDeserializer</span>
      <span class="hljs-attr">properties:</span>
        <span class="hljs-attr">spring.json.trusted.packages:</span> <span class="hljs-string">"dev.aratax.messaging.kafka.model"</span>
    <span class="hljs-attr">producer:</span>
      <span class="hljs-attr">key-serializer:</span> <span class="hljs-string">org.apache.kafka.common.serialization.StringSerializer</span>
      <span class="hljs-attr">value-serializer:</span> <span class="hljs-string">org.springframework.kafka.support.serializer.JsonSerializer</span>
</code></pre>
<h3 id="heading-topic-setup">Topic Setup</h3>
<pre><code class="lang-java"><span class="hljs-meta">@Bean</span>
<span class="hljs-function"><span class="hljs-keyword">public</span> NewTopic <span class="hljs-title">defaultTopic</span><span class="hljs-params">()</span> </span>{
    <span class="hljs-keyword">return</span> <span class="hljs-keyword">new</span> NewTopic(<span class="hljs-string">"orders"</span>, <span class="hljs-number">1</span>, (<span class="hljs-keyword">short</span>) <span class="hljs-number">1</span>);
}
</code></pre>
<h3 id="heading-event-model">Event Model</h3>
<pre><code class="lang-java"><span class="hljs-comment">// @Data (Lombok) generates the getters/setters the producer relies on</span>
<span class="hljs-meta">@Data</span>
<span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">OrderEvent</span> </span>{
    <span class="hljs-keyword">private</span> String id;
    <span class="hljs-keyword">private</span> Status status;
    <span class="hljs-keyword">private</span> BigDecimal totalAmount;
    <span class="hljs-keyword">private</span> Instant createdAt = Instant.now();
    <span class="hljs-keyword">private</span> String createdBy;

    <span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">enum</span> <span class="hljs-title">Status</span> </span>{
        IN_PROGRESS,
        COMPLETED,
        CANCELLED
    }
}
</code></pre>
<h3 id="heading-producer-example">Producer Example</h3>
<pre><code class="lang-java"><span class="hljs-meta">@RestController</span>
<span class="hljs-meta">@RequestMapping("/api")</span>
<span class="hljs-meta">@RequiredArgsConstructor</span>
<span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">OrderEventController</span> </span>{

    <span class="hljs-keyword">private</span> <span class="hljs-keyword">final</span> KafkaTemplate&lt;String, OrderEvent&gt; kafkaTemplate;

    <span class="hljs-meta">@PostMapping("/orders")</span>
    <span class="hljs-function"><span class="hljs-keyword">public</span> String <span class="hljs-title">create</span><span class="hljs-params">(<span class="hljs-meta">@RequestBody</span> OrderEvent event)</span> </span>{
        event.setId(UUID.randomUUID().toString());
        event.setCreatedAt(Instant.now());
        kafkaTemplate.sendDefault(event.getId(), event);
        <span class="hljs-keyword">return</span> <span class="hljs-string">"Order sent to Kafka"</span>;
    }
}
</code></pre>
<h3 id="heading-consumer-example">Consumer Example</h3>
<pre><code class="lang-java"><span class="hljs-meta">@Component</span>
<span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">OrderEventsListener</span> </span>{

    <span class="hljs-meta">@KafkaListener(topics = "orders")</span>
    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">handle</span><span class="hljs-params">(OrderEvent event)</span> </span>{
        System.out.println(<span class="hljs-string">"Received order: "</span> + event);
    }
}
</code></pre>
<hr />
<h2 id="heading-6-demo-project">7. Demo Project</h2>
<p>I built a demo project using Spring Boot and Kafka to demonstrate basic producer/consumer functionality. Check it out on GitHub: <a target="_blank" href="https://github.com/arata-x/springboot-kafka-quickstart">springboot-kafka-quickstart</a></p>
<hr />
<h2 id="heading-7-key-takeaways">8. Key Takeaways</h2>
<ul>
<li><p>Kafka is more than a message queue—it's a scalable, durable event streaming platform.</p>
</li>
<li><p>Events are central to Kafka’s architecture, enabling real-time data flow across systems.</p>
</li>
<li><p>Docker makes setup easy, allowing you to spin up Kafka locally for development and testing.</p>
</li>
<li><p>Kafka CLI tools help you explore topics, produce messages, and consume events interactively.</p>
</li>
<li><p>Spring Boot integration simplifies Kafka usage with built-in support for producers and consumers.</p>
</li>
<li><p>Real-world use cases span industries like banking, insurance, semiconductor, and automation.</p>
</li>
</ul>
<hr />
<h2 id="heading-8-conclusion">9. Conclusion</h2>
<p>Apache Kafka empowers developers to build reactive, event-driven systems with ease. Whether you're streaming financial transactions, processing insurance claims, or monitoring factory equipment, Kafka provides the backbone for scalable, real-time communication.</p>
<p>With Docker and Spring Boot, you can get started in minutes—no complex setup required. This quickstart gives you everything you need to explore Kafka hands-on and begin building production-grade event pipelines.</p>
<p>Ready to go deeper? Try exploring Kafka’s design and implementation, stream processing, or Kafka Connect integrations next.</p>
]]></content:encoded></item><item><title><![CDATA[Redis Sentinel Made Simple: Hands-On High Availability]]></title><description><![CDATA[High availability is no longer a luxury — it’s a survival kit for modern applications. Databases crash, servers die, containers get killed (sometimes by accident, sometimes by design). In the world of Redis, Sentinel is the quiet guardian that keeps ...]]></description><link>https://code.aratax.dev/redis-sentinel-made-simple-hands-on-high-availability</link><guid isPermaLink="true">https://code.aratax.dev/redis-sentinel-made-simple-hands-on-high-availability</guid><category><![CDATA[Redis]]></category><category><![CDATA[Springboot]]></category><category><![CDATA[Java]]></category><category><![CDATA[high availability]]></category><dc:creator><![CDATA[Arata]]></dc:creator><pubDate>Sun, 24 Aug 2025 12:03:41 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1756036595463/5083a392-6783-44d4-8bdc-5f3e18ce2acb.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>High availability is no longer a luxury — it’s a survival kit for modern applications. Databases crash, servers die, containers get killed (sometimes by accident, sometimes by design). In the world of Redis, <strong>Sentinel</strong> is the quiet guardian that keeps your cache cluster alive when chaos happens.</p>
<p>In this article, I’ll walk you through Redis Sentinel step by step, with a runnable Docker demo and a Spring Boot integration example. By the end, you’ll see failover happening live — and how your application can recover without manual intervention.</p>
<hr />
<h2 id="heading-1-introduction">1. Introduction</h2>
<h3 id="heading-why-does-redis-sentinel-matter">Why does Redis Sentinel matter?</h3>
<p>Picture this: you’ve got Redis set up with one master and a couple of replicas. Everything’s smooth… until the master suddenly crashes. Now what? Who decides which replica should take over? Who makes sure your clients know where to connect? 👉 That’s exactly the job Sentinel handles for you.</p>
<ul>
<li><p><strong>Monitors</strong> your Redis instances.</p>
</li>
<li><p><strong>Notifies</strong> you when something goes wrong.</p>
</li>
<li><p><strong>Automatically promotes</strong> a replica to master.</p>
</li>
<li><p><strong>Redirects clients</strong> to the new master.</p>
</li>
</ul>
<p>Sentinel is the difference between a cache outage and a smooth failover.</p>
<hr />
<h2 id="heading-2-what-is-redis-sentinel">2. What is Redis Sentinel?</h2>
<p>At its core, Redis Sentinel is a distributed system that provides:</p>
<ul>
<li><p>Monitoring – constantly checking whether your master and replicas are alive.</p>
</li>
<li><p>Notification – alerting operators (or systems) when something goes wrong.</p>
</li>
<li><p>Automatic Failover – promoting a replica when the master is unavailable.</p>
</li>
<li><p>Client Redirection – letting apps connect to the new master automatically.</p>
</li>
</ul>
<hr />
<h2 id="heading-3-sentinel-architecture">3 . Sentinel Architecture</h2>
<p>A Sentinel deployment usually includes multiple Sentinel nodes plus your Redis master and replicas. Sentinels work together, reaching <strong>quorum</strong> before deciding a master is truly dead.</p>
<p>Key concepts:</p>
<ul>
<li><p><strong>SDOWN (Subjectively Down):</strong> One Sentinel thinks the master is down.</p>
</li>
<li><p><strong>ODOWN (Objectively Down):</strong> Enough Sentinels agree the master is down.</p>
</li>
<li><p><strong>Replica Priority:</strong> Determines which replica should be promoted first.</p>
</li>
</ul>
<h3 id="heading-deployment-diagram">Deployment Diagram</h3>
<pre><code class="lang-plaintext">+-------------------+       +-------------------+
|   Sentinel #1     |       |   Sentinel #2     |
+-------------------+       +-------------------+
           \                     /
            \                   /
             \   Quorum Vote   /
              \               /
            +-------------------+
            |   Sentinel #3     |
            +-------------------+
                   |
                   v
            +-------------------+
            | Redis Master      |
            +-------------------+
              /          \
             v            v
   +----------------+   +----------------+
   | Redis Replica1 |   | Redis Replica2 |
   +----------------+   +----------------+
</code></pre>
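<p>The SDOWN → ODOWN transition is essentially a counting rule: a master is only declared objectively down once at least <code>quorum</code> Sentinels report it subjectively down. A toy check (my own illustration, not Sentinel's actual protocol):</p>

```java
import java.util.Set;

// Toy quorum rule: SDOWN is one Sentinel's local opinion;
// ODOWN requires agreement from at least `quorum` Sentinels.
class QuorumCheck {
    static boolean isObjectivelyDown(Set<String> sentinelsReportingSdown, int quorum) {
        return sentinelsReportingSdown.size() >= quorum;
    }
}
```

With the demo config below (quorum of 2 across three Sentinels), a single flaky Sentinel cannot trigger a failover on its own.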
<hr />
<h2 id="heading-4-setting-up-redis-sentinel">4. Setting Up Redis Sentinel</h2>
<p>We use Docker Compose with one master, two replicas, and three Sentinels.</p>
<h3 id="heading-redis-sentinel-config">Redis Sentinel Config</h3>
<pre><code class="lang-plaintext">sentinel announce-ip "127.0.0.1"
sentinel announce-port 26379
# Sentinel 6.2 and above can resolve host names, but this is not enabled by default.
sentinel resolve-hostnames yes
# Monitor master named "mymaster" at 127.0.0.1(or domain name):6379 with quorum of 2
sentinel monitor mymaster 127.0.0.1 6379 2
# Master is considered down after 5 seconds of no response
sentinel down-after-milliseconds mymaster 5000
# Failover timeout 18 seconds
sentinel failover-timeout mymaster 18000

## Lines below 'Generated by CONFIG REWRITE' are managed by Redis Sentinel (the config file must be writable).
# Generated by CONFIG REWRITE
</code></pre>
<h3 id="heading-ways-to-run-sentinel">Ways to Run Sentinel:</h3>
<pre><code class="lang-bash">redis-sentinel /etc/redis/sentinel.conf
<span class="hljs-comment"># or</span>
redis-server /etc/redis/sentinel.conf --sentinel
</code></pre>
<h3 id="heading-redis-cli-useful-commands">Redis CLI Useful commands:</h3>
<pre><code class="lang-plaintext">#Start Sentinel's monitoring.
SENTINEL MONITOR &lt;master name&gt; &lt;ip&gt; &lt;port&gt; &lt;quorum&gt;
#Stop Sentinel's monitoring.
SENTINEL REMOVE &lt;master name&gt;
#Set Sentinel's monitoring configuration. 
SENTINEL SET &lt;master name&gt; &lt;option&gt; &lt;value&gt;
#(&gt;= 5.0) Show a list of replicas for this master, and their state.
SENTINEL REPLICAS &lt;master name&gt; 
#Show a list of sentinel instances for this master, and their state.
SENTINEL SENTINELS &lt;master name&gt;
#Force a failover as if the master was not reachable, and without asking for agreement to other Sentinels 
#(however a new version of the configuration will be published so that the other Sentinels will update their configurations.
#That's called 'Configuration propagation'
SENTINEL FAILOVER &lt;master name&gt;
#Display information by Role.
INFO
</code></pre>
<h3 id="heading-docker-compse">Docker Compose:</h3>
<pre><code class="lang-dockerfile">  redis-sentinel-<span class="hljs-number">1</span>:
    image: bitnami/redis-sentinel:<span class="hljs-number">8.0</span>.<span class="hljs-number">3</span>
    container_name: redis-sentinel-<span class="hljs-number">1</span>
    ports:
      <span class="hljs-comment"># Sentinel, Docker, NAT, and possible issues. Set port-mapping 1:1</span>
      - <span class="hljs-string">"26379:26379"</span>
    environment:
      - ALLOW_EMPTY_PASSWORD=yes   
    volumes:
      <span class="hljs-comment"># Use with caution regarding permissions.</span>
      - redis-sentinel-<span class="hljs-number">1</span>-data:/bitnami/redis-sentinel
      - ./redis-sentinel-<span class="hljs-number">1</span>:/usr/local/etc/redis-sentinel
    <span class="hljs-comment"># Sentinel, Docker, NAT, and possible issues. Use host for maximum compatibility.</span>
    network_mode: host
    depends_on:
      - redis-master
      - redis-replica-<span class="hljs-number">1</span>
      - redis-replica-<span class="hljs-number">2</span>
    restart: unless-stopped
    command: [<span class="hljs-string">"redis-sentinel"</span>, <span class="hljs-string">"/usr/local/etc/redis-sentinel/sentinel.conf"</span>]
</code></pre>
<h2 id="heading-5-redis-docker-demo">5. Redis Docker Demo</h2>
<p>Clone the demo project:</p>
<pre><code class="lang-bash">git <span class="hljs-built_in">clone</span> https://github.com/arata-x/redis-ha.git
</code></pre>
<p>Docker Setup/Run</p>
<pre><code class="lang-bash"><span class="hljs-built_in">cd</span> redis-ha/docker/redis/sentinel
docker-compose up
</code></pre>
<p>Simulate master crash:</p>
<pre><code class="lang-bash">docker <span class="hljs-built_in">kill</span> redis-master
</code></pre>
<p>The Sentinels will detect the failure and promote a replica to master, completing the failover.</p>
<hr />
<h2 id="heading-6-spring-boot-integration">6. Spring Boot Integration</h2>
<p>Spring Boot supports Sentinel natively via <code>spring-boot-starter-data-redis</code>. Here’s how to configure it.</p>
<h3 id="heading-pomxml">pom.xml</h3>
<pre><code class="lang-xml"><span class="hljs-tag">&lt;<span class="hljs-name">dependency</span>&gt;</span>
  <span class="hljs-tag">&lt;<span class="hljs-name">groupId</span>&gt;</span>org.springframework.boot<span class="hljs-tag">&lt;/<span class="hljs-name">groupId</span>&gt;</span>
  <span class="hljs-tag">&lt;<span class="hljs-name">artifactId</span>&gt;</span>spring-boot-starter-data-redis-reactive<span class="hljs-tag">&lt;/<span class="hljs-name">artifactId</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">dependency</span>&gt;</span>
</code></pre>
<h3 id="heading-applicationyml">application.yml</h3>
<pre><code class="lang-yaml"><span class="hljs-attr">spring:</span>
  <span class="hljs-attr">data:</span>
    <span class="hljs-attr">redis:</span>
      <span class="hljs-attr">sentinel:</span>
        <span class="hljs-attr">master:</span> <span class="hljs-string">mymaster</span>
        <span class="hljs-attr">nodes:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-string">redis-sentinel-1:26379</span>
          <span class="hljs-bullet">-</span> <span class="hljs-string">redis-sentinel-2:26379</span>
          <span class="hljs-bullet">-</span> <span class="hljs-string">redis-sentinel-3:26379</span>
</code></pre>
<h3 id="heading-spring-boot-config-for-pubsub-messagesoptional">Spring Boot Config for Pub/Sub messages (Optional)</h3>
<pre><code class="lang-java">  <span class="hljs-meta">@Bean(destroyMethod = "shutdown")</span>
  <span class="hljs-function"><span class="hljs-keyword">public</span> RedisClient <span class="hljs-title">sentinelClient</span><span class="hljs-params">()</span> </span>{
    <span class="hljs-keyword">return</span> RedisClient.create(<span class="hljs-string">"redis://127.0.0.1:26379"</span>);
  }

  <span class="hljs-meta">@Bean(destroyMethod = "close")</span>
  <span class="hljs-function"><span class="hljs-keyword">public</span> StatefulRedisPubSubConnection&lt;String, String&gt; <span class="hljs-title">sentinelPubSub</span><span class="hljs-params">(RedisClient client)</span> </span>{
    <span class="hljs-keyword">var</span> conn = client.connectPubSub();
    conn.addListener(<span class="hljs-keyword">new</span> RedisPubSubAdapter&lt;&gt;() {
      <span class="hljs-meta">@Override</span> <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">message</span><span class="hljs-params">(String channel, String message)</span> </span>{
        log.info(<span class="hljs-string">"Sentinel event [{}] {}"</span>, channel, message);
      }
    });

    <span class="hljs-comment">// subscribe to key Sentinel events (or use psubscribe("*") to get all)</span>
    conn.sync().subscribe(
        <span class="hljs-string">"+switch-master"</span>,        <span class="hljs-comment">// master changed</span>
        <span class="hljs-string">"+sdown"</span>, <span class="hljs-string">"-sdown"</span>,      <span class="hljs-comment">// subjective down / cleared</span>
        <span class="hljs-string">"+odown"</span>, <span class="hljs-string">"-odown"</span>,      <span class="hljs-comment">// objective down / cleared (masters only)</span>
        <span class="hljs-string">"+try-failover"</span>,
        <span class="hljs-string">"+failover-state-select-slave"</span>,  <span class="hljs-comment">// SUBSCRIBE matches exact channel names; use psubscribe() for patterns like "+failover-state-*"</span>
        <span class="hljs-string">"+failover-state-send-slaveof-noone"</span>
    );
    <span class="hljs-keyword">return</span> conn;
  }
</code></pre>
<p>This way, clients automatically reconnect after failover, and Sentinel events are logged as they happen.</p>
<hr />
<h2 id="heading-7-testing-failover-amp-logs">7. Testing Failover &amp; Logs</h2>
<h3 id="heading-failover-timeline">Failover Timeline</h3>
<pre><code class="lang-plaintext">t0: Master alive
t1: Master killed  -&gt; SDOWN
t2: Quorum reached -&gt; ODOWN
t3: Leader elected -&gt; VOTE
t4: Master elected -&gt; PROMOTE
t5: New master active -&gt; CLIENTS REDIRECT
t6: Replica detected -&gt; SLAVE
t7: Old master back -&gt; SLAVE
</code></pre>
<h3 id="heading-docker-logs">Docker logs</h3>
<pre><code class="lang-plaintext">redis-sentinel-1  | 1:X 24 Aug 2025 01:29:56.652 * Sentinel ID is 45f2090cc345fd2a0a9afad89d45d3c212816390
redis-sentinel-3  | 1:X 24 Aug 2025 01:29:56.670 * Sentinel ID is 72098a7942ff006106511dbb0db3044b00fa5473
redis-sentinel-2  | 1:X 24 Aug 2025 01:29:56.690 * Sentinel ID is b87c2be6edf6192e03783f1ed1647af7fa2b51f6
# Simulate a master failure via 'docker container kill redis-master' and the failover will start.
redis-sentinel-1  | 1:X 24 Aug 2025 01:30:32.047 # +sdown master mymaster redis-master 6379
redis-sentinel-2  | 1:X 24 Aug 2025 01:30:32.067 # +sdown master mymaster redis-master 6379
redis-sentinel-3  | 1:X 24 Aug 2025 01:30:32.107 # +sdown master mymaster redis-master 6379
redis-sentinel-2  | 1:X 24 Aug 2025 01:30:32.144 # +odown master mymaster redis-master 6379 #quorum 2/2
redis-sentinel-2  | 1:X 24 Aug 2025 01:30:32.144 # +try-failover master mymaster redis-master 6379
redis-sentinel-2  | 1:X 24 Aug 2025 01:30:32.151 # +vote-for-leader b87c2be6edf6192e03783f1ed1647af7fa2b51f6 1
redis-sentinel-3  | 1:X 24 Aug 2025 01:30:32.166 # +vote-for-leader b87c2be6edf6192e03783f1ed1647af7fa2b51f6 1
redis-sentinel-1  | 1:X 24 Aug 2025 01:30:32.167 # +vote-for-leader b87c2be6edf6192e03783f1ed1647af7fa2b51f6 1
redis-sentinel-2  | 1:X 24 Aug 2025 01:30:32.244 # +elected-leader master mymaster redis-master 6379
redis-sentinel-2  | 1:X 24 Aug 2025 01:30:32.244 # +failover-state-select-slave master mymaster redis-master 6379
redis-sentinel-2  | 1:X 24 Aug 2025 01:30:32.299 # +selected-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster redis-master 6379
redis-sentinel-2  | 1:X 24 Aug 2025 01:30:32.299 * +failover-state-send-slaveof-noone slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster redis-master 6379
redis-sentinel-2  | 1:X 24 Aug 2025 01:30:33.215 # +promoted-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster redis-master 6379
redis-sentinel-3  | 1:X 24 Aug 2025 01:30:33.263 # +switch-master mymaster redis-master 6379 127.0.0.1 6381
redis-sentinel-3  | 1:X 24 Aug 2025 01:30:33.264 * +slave slave redis-master:6379 redis-master 6379 @ mymaster 127.0.0.1 6381 
# Restore the master via 'docker container start redis-master' and the old master will rejoin as a replica.
redis-sentinel-2  | 1:X 24 Aug 2025 01:30:34.096 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
redis-master      | 1:S 24 Aug 2025 01:30:34.236 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
redis-master      | 1:S 24 Aug 2025 01:30:34.236 * Connecting to MASTER 127.0.0.1:6380
redis-sentinel-1  | 1:X 24 Aug 2025 01:30:34.236 * +convert-to-slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
redis-replica-1   | 1:M 24 Aug 2025 01:30:34.447 * Synchronization with replica 127.0.0.1:6379 succeeded
redis-master      | 1:S 24 Aug 2025 01:30:34.447 * MASTER &lt;-&gt; REPLICA sync: Successfully streamed replication buffer into the db (0 bytes in total)
</code></pre>
<h3 id="heading-redis-event-list">Redis Event List</h3>
<ul>
<li><p>+slave -- A new replica was detected and attached.</p>
</li>
<li><p>+sdown -- The specified instance is now in Subjectively Down state.</p>
</li>
<li><p>+odown -- The specified instance is now in Objectively Down state.</p>
</li>
<li><p>+try-failover -- New failover in progress, waiting to be elected by the majority.</p>
</li>
<li><p>+elected-leader -- Won the election for the specified epoch, can do the failover.</p>
</li>
<li><p>+failover-state-select-slave -- New failover state is select-slave: we are trying to find a suitable replica for promotion.</p>
</li>
</ul>
<h3 id="heading-spring-boot-log-by-redis-pubsub">Spring Boot log by Redis pub/sub</h3>
<pre><code class="lang-plaintext">2025-08-24T01:34:46.946+08:00  INFO 44256 --- [redis-reactive-demo] [ioEventLoop-7-1] d.a.redis.config.RedisConfigSentinel     : Sentinel event [+sdown] master mymaster 127.0.0.1 6379
2025-08-24T01:34:48.055+08:00  INFO 44256 --- [redis-reactive-demo] [ioEventLoop-7-1] d.a.redis.config.RedisConfigSentinel     : Sentinel event [+odown] master mymaster 127.0.0.1 6379 #quorum 3/2
2025-08-24T01:34:48.176+08:00  INFO 44256 --- [redis-reactive-demo] [ioEventLoop-7-1] d.a.redis.config.RedisConfigSentinel     : Sentinel event [+switch-master] mymaster 127.0.0.1 6379 127.0.0.1 6381
</code></pre>
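<p>The <code>+switch-master</code> payload in the log above carries the old and new master addresses, so instead of only logging the raw string you can turn it into structured data. A minimal sketch (<code>SentinelEventParser</code> and <code>MasterSwitch</code> are illustrative names, not part of Lettuce):</p>
<pre><code class="lang-java">// Hypothetical helper (not part of Lettuce): parses the payload of a
// +switch-master Sentinel message, whose fields are
// "master-name old-ip old-port new-ip new-port".
class SentinelEventParser {

  record MasterSwitch(String masterName,
                      String oldHost, int oldPort,
                      String newHost, int newPort) {}

  static MasterSwitch parseSwitchMaster(String payload) {
    String[] parts = payload.trim().split("\\s+");
    if (parts.length != 5) {
      throw new IllegalArgumentException("Unexpected +switch-master payload: " + payload);
    }
    return new MasterSwitch(parts[0],
        parts[1], Integer.parseInt(parts[2]),
        parts[3], Integer.parseInt(parts[4]));
  }
}
</code></pre>
<p>For the log line above, <code>parseSwitchMaster("mymaster 127.0.0.1 6379 127.0.0.1 6381")</code> yields a <code>MasterSwitch</code> whose <code>newPort()</code> is <code>6381</code>, which you could feed into metrics or alerting.</p>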
<hr />
<h2 id="heading-8-best-practices">8. Best Practices</h2>
<ul>
<li><p>Run <strong>at least 3 Sentinels</strong>.</p>
</li>
<li><p>Distribute Sentinels across nodes for resilience.</p>
</li>
<li><p>Tune <code>failover-timeout</code> and <code>down-after-milliseconds</code>.</p>
</li>
</ul>
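<p>Both timing knobs are plain <code>sentinel.conf</code> directives; a minimal sketch (the values and the master name are illustrative, not recommendations):</p>
<pre><code class="lang-plaintext">sentinel monitor mymaster redis-master 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 10000
</code></pre>
<p>Lower <code>down-after-milliseconds</code> detects failures faster but risks false positives on slow networks; <code>failover-timeout</code> bounds how long a single failover attempt may take before Sentinel retries.</p>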
<hr />
<h2 id="heading-10-final-thoughts">9. Final Thoughts</h2>
<p>🚦 Think of Redis Sentinel as your system’s insurance policy. Most of the time, you’ll never notice it quietly standing guard in the background. But the moment your master node takes a dive, Sentinel steps in to keep traffic flowing — and you’ll be very glad it was there all along.</p>
<p>👉 Use Sentinel when you want simple, lightweight high availability. It doesn’t complicate your setup and gets the job done for most HA needs.</p>
<p>⚡ But if your workload demands both horizontal scaling (sharding) and HA, that’s where Redis Cluster shines. Sentinel won’t replace Cluster — they solve different problems.</p>
<hr />
<h2 id="heading-demo-project-redis-sentinelhttpsgithubcomarata-xredis-ha">🔗Demo project: <a target="_blank" href="https://github.com/arata-x/redis-ha">Redis Sentinel</a></h2>
]]></content:encoded></item><item><title><![CDATA[Redis Replication Made Simple: With Spring Boot Integration]]></title><description><![CDATA[Imagine it’s 3 AM. Your Redis server—yes, the one holding all your app’s session data—just crashed. Your team’s phones are buzzing. Users are locked out, and panic is rising.
What if I told you this nightmare could be avoided with a simple feature bu...]]></description><link>https://code.aratax.dev/redis-replication-made-simple-with-spring-boot-integration</link><guid isPermaLink="true">https://code.aratax.dev/redis-replication-made-simple-with-spring-boot-integration</guid><category><![CDATA[Redis, SpringBoot, Database, Replication]]></category><dc:creator><![CDATA[Arata]]></dc:creator><pubDate>Sat, 09 Aug 2025 16:15:34 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1754828317366/776bb6dd-76d4-4949-905f-74cf9873d9c2.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<hr />
<p>Imagine it’s 3 AM. Your Redis server—yes, the one holding all your app’s session data—just crashed. Your team’s phones are buzzing. Users are locked out, and panic is rising.</p>
<p>What if I told you this nightmare could be avoided with a simple feature built right into Redis? Enter <strong>Redis replication</strong>—your built-in safeguard for data availability, read scaling, and peace of mind.</p>
<hr />
<h2 id="heading-what-is-redis-replication">🔑 What Is Redis Replication?</h2>
<p>At its core, <strong>Redis replication</strong> enables a single Redis instance (the <strong>primary</strong>) to automatically copy its data to one or more <strong>replicas</strong>.</p>
<ul>
<li><p>The <strong>primary</strong> handles all write operations.</p>
</li>
<li><p><strong>Replicas</strong> stay in sync and serve read requests, reducing the load on the primary.</p>
</li>
<li><p>If the primary fails, replicas can quickly take over.</p>
</li>
</ul>
<p>This fundamental setup lays the groundwork for high availability and scaling in Redis environments.</p>
<hr />
<h2 id="heading-how-does-it-work">⚙️ How Does It Work?</h2>
<p>Redis replication works in three key stages:</p>
<ol>
<li><p><strong>Initial Sync:</strong> A replica requests a full snapshot (RDB) from the primary, loads it, and applies any updates.</p>
</li>
<li><p><strong>Command Streaming:</strong> Once synced, the replica continuously receives write commands from the primary to stay current.</p>
</li>
<li><p><strong>Partial Resync (PSYNC2):</strong> If a replica temporarily disconnects, it resumes from where it left off using Redis’s backlog buffer—avoiding a full resync.</p>
</li>
</ol>
<p>This process is <strong>asynchronous</strong>, which means replicas may lag slightly but offer high throughput.</p>
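<p>Because replication is asynchronous, it is worth knowing how far a replica lags. The master's <code>INFO replication</code> output exposes both the master's write offset and each replica's acknowledged offset; a minimal sketch that computes the lag from a captured <code>INFO</code> string (the helper name and the sample text are illustrative, and no live connection is assumed):</p>
<pre><code class="lang-java">// Hypothetical helper: derives replica lag in bytes from the text that
// "INFO replication" returns on the master. master_repl_offset is the
// master's write position; the slave0 line reports the replica's offset.
class ReplicationLag {

  static long lagBytes(String info) {
    long masterOffset = -1;
    long replicaOffset = -1;
    for (String line : info.split("\r?\n")) {
      if (line.startsWith("master_repl_offset:")) {
        masterOffset = Long.parseLong(line.substring("master_repl_offset:".length()).trim());
      } else if (line.startsWith("slave0:")) {
        // Example: slave0:ip=127.0.0.1,port=6380,state=online,offset=1250,lag=0
        for (String field : line.substring("slave0:".length()).split(",")) {
          if (field.startsWith("offset=")) {
            replicaOffset = Long.parseLong(field.substring("offset=".length()));
          }
        }
      }
    }
    if (masterOffset == -1 || replicaOffset == -1) {
      throw new IllegalArgumentException("offsets not found in INFO replication output");
    }
    return masterOffset - replicaOffset;
  }
}
</code></pre>
<p>A lag of zero means the replica has acknowledged everything the master has written; a steadily growing value signals a replica that cannot keep up.</p>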
<hr />
<h2 id="heading-setting-up-replication-primary-two-replicas">🖥 Setting Up Replication (Primary + Two Replicas)</h2>
<p>Here’s how to launch a simple master-replica setup locally:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Start Primary</span>
redis-server --port 6379

<span class="hljs-comment"># Start Replica 1</span>
redis-server --port 6380 --replicaof 127.0.0.1 6379

<span class="hljs-comment"># Start Replica 2</span>
redis-server --port 6381 --replicaof 127.0.0.1 6379
</code></pre>
<p>At this point, reads can be routed to replicas while writes continue to flow to the primary.</p>
<hr />
<h2 id="heading-chained-replication-replica-of-a-replica">🆕 Chained Replication (Replica of a Replica)</h2>
<p>Beyond basic replication, Redis supports <strong>chained replication</strong>, where a replica can act as a source for another replica.</p>
<h3 id="heading-why-use-it">Why Use It?</h3>
<ul>
<li><p><strong>Reduce primary load:</strong> Only one replica pulls directly from the primary.</p>
</li>
<li><p><strong>Regional optimization:</strong> Place replicas closer to users while syncing through a nearer node.</p>
</li>
<li><p><strong>Better bandwidth usage:</strong> Ideal for distributed or high-latency networks.</p>
</li>
</ul>
<p>Example:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Primary</span>
redis-server --port 6379

<span class="hljs-comment"># Replica 1 (syncs from primary)</span>
redis-server --port 6380 --replicaof 127.0.0.1 6379

<span class="hljs-comment"># Replica 2 (syncs from Replica 1)</span>
redis-server --port 6381 --replicaof 127.0.0.1 6380
</code></pre>
<hr />
<h2 id="heading-deployment-diagram">✒Deployment Diagram</h2>
<p>This diagram shows a primary node with two direct replicas and one chained replica.</p>
<pre><code class="lang-plaintext">                                  +--------+
                                  | Server |
                                  +---+----+
                                      |
                                     WRITE
                                      v
                                  +--------+
                                  | Master |
                                  +---+----+
                                  /        \
                             SYNC/          \SYNC
                                v            v
                        +-------+--+    +----+------+
                        | Replica  |    |  Replica  |
                        +----+-----+    +-----+-----+
                             |
                         CHAINED SYNC
                             v
                        +----+-----+
                        | Replica  |
                        +----------+
</code></pre>
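<p>The topology above maps naturally onto <code>docker-compose</code>; a minimal sketch of the idea (service names, image tag, and the chained replica wiring are assumptions for illustration, not the demo project's exact file):</p>
<pre><code class="lang-yaml">services:
  redis-master:
    image: redis:7
  redis-replica-1:
    image: redis:7
    command: ["redis-server", "--replicaof", "redis-master", "6379"]
  redis-replica-2:
    image: redis:7
    command: ["redis-server", "--replicaof", "redis-master", "6379"]
  redis-replica-chained:
    image: redis:7
    command: ["redis-server", "--replicaof", "redis-replica-1", "6379"]
</code></pre>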
<hr />
<h2 id="heading-diskless-replication">⚡ Diskless Replication</h2>
<p>To further speed up initial synchronization, enable <strong>diskless replication</strong>, which streams snapshots directly to replicas:</p>
<p>redis.conf (master)</p>
<pre><code class="lang-plaintext">repl-diskless-sync yes
repl-diskless-sync-delay 5
</code></pre>
<p>redis.conf (replica)</p>
<pre><code class="lang-plaintext">replicaof redis-master 6379
replica-read-only yes
repl-diskless-load on-empty-db
</code></pre>
<p>This avoids writing intermediate files to disk and is ideal for large datasets or high-performance environments.</p>
<hr />
<h2 id="heading-spring-boot-with-redis-replicas">🔧 Spring Boot with Redis Replicas</h2>
<p>Let’s integrate this into a Spring Boot project for practical use.</p>
<h3 id="heading-dependency">Dependency:</h3>
<pre><code class="lang-xml"><span class="hljs-tag">&lt;<span class="hljs-name">dependency</span>&gt;</span>
  <span class="hljs-tag">&lt;<span class="hljs-name">groupId</span>&gt;</span>org.springframework.boot<span class="hljs-tag">&lt;/<span class="hljs-name">groupId</span>&gt;</span>
  <span class="hljs-tag">&lt;<span class="hljs-name">artifactId</span>&gt;</span>spring-boot-starter-data-redis<span class="hljs-tag">&lt;/<span class="hljs-name">artifactId</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">dependency</span>&gt;</span>
</code></pre>
<h3 id="heading-configuration">Configuration:</h3>
<pre><code class="lang-java"><span class="hljs-meta">@Bean</span>
<span class="hljs-function"><span class="hljs-keyword">public</span> LettuceConnectionFactory <span class="hljs-title">redisConnectionFactory</span><span class="hljs-params">()</span> </span>{
    RedisStaticMasterReplicaConfiguration masterReplicaConfig =
            <span class="hljs-keyword">new</span> RedisStaticMasterReplicaConfiguration(<span class="hljs-string">"127.0.0.1"</span>, <span class="hljs-number">6379</span>);
    masterReplicaConfig.addNode(<span class="hljs-string">"127.0.0.1"</span>, <span class="hljs-number">6380</span>);
    masterReplicaConfig.addNode(<span class="hljs-string">"127.0.0.1"</span>, <span class="hljs-number">6381</span>);
    masterReplicaConfig.setPassword(RedisPassword.of(<span class="hljs-string">"myRedisPass"</span>));

    LettuceClientConfiguration clientConfig = LettuceClientConfiguration.builder()
            .readFrom(ReadFrom.ANY_REPLICA)
            .build();

    <span class="hljs-keyword">return</span> <span class="hljs-keyword">new</span> LettuceConnectionFactory(masterReplicaConfig, clientConfig);
}
</code></pre>
<p>This configuration connects Spring Boot to the primary and its replicas, routing reads to replicas automatically via <code>ReadFrom.ANY_REPLICA</code>.</p>
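<p>The <code>RedisTemplate</code> used later in the article sits on top of this factory; a minimal sketch of the bean wiring (the serializer choices are a common default, not something this setup requires):</p>
<pre><code class="lang-java">@Bean
public RedisTemplate&lt;String, Object&gt; redisTemplate(LettuceConnectionFactory factory) {
    RedisTemplate&lt;String, Object&gt; template = new RedisTemplate&lt;&gt;();
    template.setConnectionFactory(factory);
    // Readable string keys; JSON-serialized values
    template.setKeySerializer(new StringRedisSerializer());
    template.setValueSerializer(new GenericJackson2JsonRedisSerializer());
    return template;
}
</code></pre>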
<hr />
<h2 id="heading-verifying-reads-are-hitting-replicas">🛠 Verifying Reads Are Hitting Replicas</h2>
<p>To confirm that reads hit replicas rather than the primary:</p>
<h3 id="heading-1-monitor-replica-activity">1️⃣ Monitor Replica Activity</h3>
<pre><code class="lang-bash">redis-cli -p 6380 MONITOR
</code></pre>
<p>Execute a read query and see it logged on the replica.</p>
<h3 id="heading-2-in-spring-boot">2️⃣ In Spring Boot</h3>
<pre><code class="lang-java"><span class="hljs-meta">@Autowired</span>
<span class="hljs-keyword">private</span> RedisTemplate&lt;String, Object&gt; redisTemplate;

<span class="hljs-meta">@Bean</span>
<span class="hljs-function"><span class="hljs-keyword">public</span> CommandLineRunner <span class="hljs-title">testRedis</span><span class="hljs-params">()</span> </span>{
    <span class="hljs-keyword">return</span> args -&gt; {
        ValueOperations&lt;String, Object&gt; ops = redisTemplate.opsForValue();
        ops.set(<span class="hljs-string">"user:1"</span>, <span class="hljs-string">"Alice"</span>); <span class="hljs-comment">// Write -&gt; Master</span>
        Object value = ops.get(<span class="hljs-string">"user:1"</span>); <span class="hljs-comment">// Read -&gt; Replica</span>
        System.out.println(<span class="hljs-string">"Read value (replica preferred): "</span> + value);
    };
}
</code></pre>
<hr />
<h2 id="heading-limitations-of-replicas-alone">⚠️ Limitations of Replicas Alone</h2>
<p>While replication improves resilience, it doesn’t guarantee full high availability on its own.</p>
<ul>
<li><p><strong>No automatic failover:</strong> Promotion must be done manually without Sentinel or Cluster.</p>
</li>
<li><p><strong>Asynchronous replication:</strong> Recent writes might be lost if the primary fails before syncing.</p>
</li>
<li><p><strong>Single control point:</strong> The primary remains the bottleneck for writes.</p>
</li>
</ul>
<p>These gaps highlight why replication is essential but insufficient for full HA in production environments.</p>
<hr />
<h2 id="heading-final-thoughts">✅ Final Thoughts</h2>
<p>Redis replication is a <strong>simple yet powerful</strong> way to protect against single points of failure, scale reads, and prepare for failover. Its nature—one primary continuously mirrored by one or more replicas—ensures that your data is <strong>redundant, accessible, and performance-optimized</strong>.</p>
<p><strong>Why use replicas?</strong></p>
<ul>
<li><p>Keep a live backup ready for emergencies.</p>
</li>
<li><p>Reduce read load on the primary.</p>
</li>
<li><p>Improve latency with geographically placed replicas.</p>
</li>
</ul>
<p><strong>Key takeaway</strong>: Replication is your <strong>first step</strong> toward high availability. Pair it with <strong>Sentinel</strong> for automatic failover or <strong>Cluster</strong> for sharding to achieve a <strong>production-grade, fault-tolerant Redis deployment</strong>.</p>
<hr />
<h2 id="heading-demo-project-for-readers">🛠 Demo Project for Readers</h2>
<p>I have created a <strong>demo project</strong> that showcases practical usage of Redis replication.</p>
<h3 id="heading-project-includes">Project Includes:</h3>
<ul>
<li><p><strong>Docker</strong>: Pre-configured Redis master &amp; replicas using <code>docker-compose</code>.</p>
</li>
<li><p><strong>Spring Boot</strong>: Example backend service demonstrating Redis read/write splitting.</p>
</li>
</ul>
<h3 id="heading-access-the-project">🔗 Access the Project</h3>
<p>You can clone or explore the project from my repository :</p>
<pre><code class="lang-bash">git <span class="hljs-built_in">clone</span> https://github.com/arata-x/redis-ha.git
<span class="hljs-built_in">cd</span> redis-ha
</code></pre>
<hr />
<h2 id="heading-additional-resources">Additional Resources</h2>
<ul>
<li><p><a target="_blank" href="https://redis.io/docs/latest/operate/oss_and_stack/management/replication/">Official Replication Documentation</a></p>
</li>
<li><p><a target="_blank" href="https://docs.spring.io/spring-data/redis/docs/current/reference/html/">Spring Data Redis Reference Guide</a></p>
</li>
</ul>
]]></content:encoded></item></channel></rss>