<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Sagyam's Blog]]></title><description><![CDATA[Sagyam's Blog]]></description><link>https://blog.sagyamthapa.com.np</link><generator>RSS for Node</generator><lastBuildDate>Tue, 14 Apr 2026 13:04:44 GMT</lastBuildDate><atom:link href="https://blog.sagyamthapa.com.np/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[An Interactive Guide To Count Min Sketch]]></title><description><![CDATA[Introduction
Count min sketch is a probabilistic data structure that can estimate the frequency of items in a stream. It is an improvement over Hyperloglog. While hyperloglog can estimate the number of unique items in a fixed amount of data, count mi...]]></description><link>https://blog.sagyamthapa.com.np/an-interactive-guide-to-count-min-sketch</link><guid isPermaLink="true">https://blog.sagyamthapa.com.np/an-interactive-guide-to-count-min-sketch</guid><category><![CDATA[count-min-sketch]]></category><category><![CDATA[probabilistic data structure]]></category><dc:creator><![CDATA[Sagyam Thapa]]></dc:creator><pubDate>Wed, 25 Jun 2025 17:58:40 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1750873204106/b7240185-5094-44a3-bb6f-e5eef6317dc1.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>Count-min sketch is a probabilistic data structure that can estimate the frequency of items in a stream. It is a natural companion to <a target="_blank" href="https://blog.sagyamthapa.com.np/an-interactive-guide-to-hyperloglog">Hyperloglog</a>: while hyperloglog estimates the number of unique items in a dataset, count-min sketch estimates how often each item appears, even over an unbounded stream. Think of hyperloglog as something that can count the unique items in an image <em>(something that is fixed)</em>, while count-min sketch can track item frequencies in a live video stream <em>(a stream of data)</em>. Even when you don’t know how long the data stream is, you can still estimate the frequency of items in it.</p>
<p>This blog is the third installment of my probabilistic data structure series. I have written similar interactive guides on <a target="_blank" href="https://blog.sagyamthapa.com.np/an-interactive-guide-to-bloom-filter">Bloom Filter</a> and <a target="_blank" href="https://blog.sagyamthapa.com.np/an-interactive-guide-to-hyperloglog">Hyperloglog</a>. If you are unfamiliar with probabilistic data structures, the guide on <a target="_blank" href="https://blog.sagyamthapa.com.np/an-interactive-guide-to-hyperloglog">Hyperloglog</a> is a good place to start.</p>
<h2 id="heading-working-principle">Working principle</h2>
<ul>
<li><p><strong>A Count-Min Sketch is made of</strong></p>
<ul>
<li><p>A 2D array of counters with <code>d</code> rows and <code>w</code> columns.</p>
</li>
<li><p>Each row has its own hash function <em>(h1, h2, h3..)</em>.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750863919322/0160d9fe-f665-48a0-baa3-816d7ec5fa04.png" alt class="image--center mx-auto" /></p>
</li>
</ul>
</li>
</ul>
    <div data-node-type="callout">
    <div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Here we hash the first item from the stream of data (the number 2) and obtain the cell positions that need to be incremented.</div>
    </div>

<ul>
<li><p><strong>Insert Operation (Adding an item):</strong></p>
<ul>
<li><p>For an item <code>x</code>, hash it using all <code>d</code> hash functions.</p>
</li>
<li><p>For each hash, increment the corresponding counter in its row:</p>
<pre><code class="lang-python">  count[i][hash_i(x)] += <span class="hljs-number">1</span>
</code></pre>
</li>
</ul>
</li>
<li><p><strong>Query Operation (Getting frequency estimate of</strong> <code>x</code>):</p>
<ul>
<li><p>Hash <code>x</code> with the same <code>d</code> hash functions.</p>
</li>
<li><p>Fetch the counts from the corresponding cells.</p>
</li>
<li><p><strong>Return the minimum</strong> value among those <code>d</code> counters:</p>
<pre><code class="lang-python">  estimate = min(count[<span class="hljs-number">0</span>][h1(x)], count[<span class="hljs-number">1</span>][h2(x)], ..., count[d<span class="hljs-number">-1</span>][hd(x)])
</code></pre>
</li>
</ul>
</li>
</ul>
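<p>Putting the insert and query operations together, here is a minimal Python sketch of the data structure. Salting a single <code>hashlib</code> digest with the row index, as done below, is just one convenient way to simulate <code>d</code> independent hash functions; production implementations usually use faster pairwise-independent hashes.</p>
<pre><code class="lang-python">import hashlib

class CountMinSketch:
    def __init__(self, d, w):
        self.d = d                                  # rows, one per hash function
        self.w = w                                  # columns per row
        self.count = [[0] * w for _ in range(d)]

    def _hash(self, x, i):
        # Salt one hash function with the row index to get d "independent" hashes
        digest = hashlib.sha256(f"{i}:{x}".encode()).hexdigest()
        return int(digest, 16) % self.w

    def insert(self, x):
        for i in range(self.d):
            self.count[i][self._hash(x, i)] += 1    # bump one cell per row

    def query(self, x):
        # The minimum across rows is the least-inflated estimate
        return min(self.count[i][self._hash(x, i)] for i in range(self.d))
</code></pre>
<p>Inserting <code>"apple"</code> three times makes <code>query("apple")</code> return at least 3, never less — the no-underestimate guarantee in action.</p>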
<p>I have created a fun little <a target="_blank" href="https://tools.sagyamthapa.com.np/cms-working">app</a> that lets you see the working of Count-Min Sketch. Adjust the number of rows and columns, then click to generate a random number. The number is hashed once for each row, and each hash locates a cell whose value is incremented by one. Clicking on a number follows a similar process, except instead of incrementing the values, we take the minimum of all the cells to get the estimate for that item.</p>
<p><strong>Can you get a count-min sketch to always get it right?</strong></p>
<div class="hn-embed-widget" id="cms-working"></div><p> </p>
<h2 id="heading-fun-facts">Fun facts</h2>
<ul>
<li><p>It is called count-min sketch because it counts the minimum from a sketch <em>(sketch is like a compact summary of a large dataset)</em>.</p>
</li>
<li><p>It has sub-linear space complexity, meaning it takes less space than storing an accurate count.</p>
</li>
<li><p>The reason it never underestimates is that counters can only ever be incremented, and the minimum count is taken.</p>
</li>
<li><p>Increasing <code>d</code> (rows) gives a higher probability of an accurate result because there are more independent estimates, but each operation takes more time.</p>
</li>
<li><p>Increasing <code>w</code> (columns) means better accuracy due to less chance of collision but more memory usage.</p>
</li>
</ul>
<h2 id="heading-demo">Demo</h2>
<p>I have created a fun little <a target="_blank" href="https://tools.sagyamthapa.com.np/count-min-sketch">app</a> that puts all the pieces together to show you how count-min sketch works. Here the app guesses the frequency of fruits in a stream of 5000 fruits. Hit start and watch the stream of fruits appear, and see how the hash table is updated in real time. Notice that the count-min sketch never underestimates the real amount.</p>
<div class="hn-embed-widget" id="count-min-sketch"></div><p> </p>
<h2 id="heading-mathematical-relationships">Mathematical Relationships</h2>
<h3 id="heading-error-bounds">Error Bounds</h3>
<ul>
<li><p>Error in frequency estimate ≤ ε × N with probability 1 - δ</p>
</li>
<li><p>Where:</p>
<ul>
<li><p>ε = error factor (e.g., 0.001 means 0.1% error)</p>
</li>
<li><p>N = total number of items processed</p>
</li>
<li><p>δ = failure probability (e.g., 0.01 means 99% confidence)</p>
</li>
</ul>
</li>
</ul>
<h3 id="heading-formula-for-parameters">Formula for Parameters</h3>
<ul>
<li><p>Width: <code>w = ⌈e/ε⌉ (where e ≈ 2.718)</code></p>
</li>
<li><p>Depth: <code>d = ⌈ln(1/δ)⌉</code></p>
</li>
</ul>
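<p>These formulas translate directly into code. As a quick sanity check with the example values above, ε = 0.001 and δ = 0.01 give a table of 5 rows by 2719 columns — only 13,595 counters regardless of how many items the stream contains.</p>
<pre><code class="lang-python">import math

def cms_parameters(epsilon, delta):
    """Width and depth for error bound epsilon with failure probability delta."""
    w = math.ceil(math.e / epsilon)     # w = ceil(e / epsilon)
    d = math.ceil(math.log(1 / delta))  # d = ceil(ln(1 / delta))
    return w, d
</code></pre>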
<h3 id="heading-use-cases"><strong>Use Cases</strong></h3>
<ol>
<li><p>Finding heavy hitters in a stream.</p>
</li>
<li><p>Detecting DDoS attack.</p>
</li>
<li><p>Tracking popular search queries in search engine</p>
</li>
</ol>
<h2 id="heading-references">References</h2>
<ul>
<li><p><a target="_blank" href="https://dsf.berkeley.edu/cs286/papers/countmin-latin2004.pdf">Paper</a></p>
</li>
<li><p><a target="_blank" href="https://www.wikiwand.com/en/articles/Count%E2%80%93min_sketch">Wikipedia</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[An Interactive Guide To Caching Strategies]]></title><description><![CDATA[Introduction
Word cache originates from French word cacher which means to hide. Outside computer science circle it refers to a secret place where you hide things, usually emergency supplies. In computer science though the meaning of the word is flipp...]]></description><link>https://blog.sagyamthapa.com.np/an-interactive-guide-to-caching-strategies</link><guid isPermaLink="true">https://blog.sagyamthapa.com.np/an-interactive-guide-to-caching-strategies</guid><category><![CDATA[caching strategies]]></category><category><![CDATA[interactive guide]]></category><dc:creator><![CDATA[Sagyam Thapa]]></dc:creator><pubDate>Fri, 20 Jun 2025 14:42:44 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1750428353601/a835e7d5-e3d1-4776-9702-a2959671c18c.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>The word cache originates from the French word <code>cacher</code>, which means <code>to hide</code>. Outside computer science circles, it refers to a secret place where you hide things, usually emergency supplies. In computer science, though, the meaning of the word is flipped: a cache is a place where you store your frequently accessed data. Caching is one of the most effective ways to improve application performance, but choosing the right caching strategy can be tricky. Each strategy has its own strengths, trade-offs, and ideal use cases.</p>
<h2 id="heading-terminologies">Terminologies</h2>
<ul>
<li><p><strong>Cache Hit:</strong> When the data you are looking for is found in the cache</p>
</li>
<li><p><strong>Cache Miss:</strong> When the data you are looking for is not found in the cache</p>
</li>
<li><p><strong>Asynchronous Writes:</strong> When you write multiple items to the database back to back without waiting for the last write to complete</p>
</li>
<li><p><strong>Eventual Consistency:</strong> A data syncing model where updates are not propagated immediately, but all copies converge over time</p>
</li>
<li><p><strong>Cache Stampede:</strong> A situation where data that is not in the cache is suddenly in high demand, sending a flood of requests to the database</p>
</li>
<li><p><strong>Pre-Warming cache:</strong> Loading the cache with frequently used data before it’s even requested.</p>
</li>
<li><p><strong>Cache pollution:</strong> When rarely re-used data fills the cache and evicts frequently accessed data, degrading performance</p>
</li>
</li>
</ul>
<p>In this guide, we'll explore six common caching strategies that every developer should understand.</p>
<h2 id="heading-cache-aside">Cache Aside</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750427846671/1bcfccb4-77e4-4c3b-b97d-9f80c35a4db1.png" alt="Diagram showing a user request process involving a server, cache, and database. Steps: 1. User asks server for data. 2. Cache miss occurs. 3. Server reads from database. 4. Database sends data. 5. Server updates cache." class="image--center mx-auto" /></p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Note that at step 2 the cache miss is reported to the server, and the server is responsible for updating the cache.</div>
</div>

<p><strong>Introduction:</strong> The server takes the responsibility of managing the cache</p>
<p><strong>Cache Hit Behavior:</strong> Return data directly from cache</p>
<p><strong>Cache Miss Behavior:</strong> Load from database, update cache, return data</p>
<p><strong>Write Behavior:</strong> Write to database only, invalidate cache entry</p>
<p><strong>Consistency:</strong> Eventual consistency, possible stale data</p>
<p><strong>Performance:</strong> Fast reads on hit, slower on miss</p>
<p><strong>Use Cases:</strong> Read-heavy workloads, unpredictable access patterns</p>
<p><strong>Advantages:</strong> Simple implementation, application has full control</p>
<p><strong>Disadvantages:</strong> Cache stampede risk, code duplication across services</p>
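<p>The hit, miss, and write behaviors above fit in a few lines of Python. This is only an illustrative sketch: the <code>db</code> dict stands in for a real database, and a plain dict plays the role of the cache client.</p>
<pre><code class="lang-python">class CacheAside:
    """Cache-aside sketch: the application manages the cache itself."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def read(self, key):
        if key in self.cache:          # cache hit: return directly
            return self.cache[key]
        value = self.db[key]           # cache miss: load from the database...
        self.cache[key] = value        # ...then populate the cache
        return value

    def write(self, key, value):
        self.db[key] = value           # write to the database only...
        self.cache.pop(key, None)      # ...and invalidate the cache entry
</code></pre>
<p>Invalidating on write (rather than updating the cache) is what makes the strategy eventually consistent: the fresh value only enters the cache on the next read.</p>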
<h3 id="heading-link-to-interactive-apphttpstoolssagyamthapacomnpcachingstrategycache-aside"><a target="_blank" href="https://tools.sagyamthapa.com.np/caching?strategy=cache-aside">Link to interactive app</a></h3>
<div class="hn-embed-widget" id="cache-aside"></div><p> </p>
<h2 id="heading-read-through">Read Through</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750427868738/3495c5aa-5ab5-491f-9e5b-4286e438809d.png" alt="Flowchart illustrating a cache system. The user requests data from the server, which first attempts to read from the cache. If a cache miss occurs, the server reads from the database, retrieves the data, and updates the cache before returning the data to the user." class="image--center mx-auto" /></p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Note that at step 2 the cache does not actually return a cache miss to the server. Instead it transparently fetches the data from the database, updates itself, and sends the data to the server.</div>
</div>

<p><strong>Introduction:</strong> The cache itself takes full responsibility for fetching and storing data</p>
<p><strong>Cache Hit Behavior:</strong> Cache returns data directly</p>
<p><strong>Cache Miss Behavior:</strong> Cache loads from database transparently</p>
<p><strong>Write Behavior:</strong> Typically combined with write-through or write-back</p>
<p><strong>Consistency:</strong> Depends on write strategy used</p>
<p><strong>Performance:</strong> Consistent read performance</p>
<p><strong>Use Cases:</strong> Uniform data access patterns</p>
<p><strong>Advantages:</strong> Simplified application code, centralized cache logic</p>
<p><strong>Disadvantages:</strong> Less flexibility, cache becomes critical component</p>
<h3 id="heading-link-to-interactive-apphttpstoolssagyamthapacomnpcachingstrategyread-through"><a target="_blank" href="https://tools.sagyamthapa.com.np/caching?strategy=read-through">Link to interactive app</a></h3>
<div class="hn-embed-widget" id="read-through"></div><p> </p>
<h2 id="heading-write-through">Write Through</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750428483063/6a3b5207-1993-42ff-88db-cc59acaf7ce3.png" alt="Diagram showing data flow from a user to a server, then writing to a cache and synchronously to a database." class="image--center mx-auto" /></p>
<p><strong>Introduction:</strong> Writes go to the cache first, then are immediately written to the database</p>
<p><strong>Cache Hit Behavior:</strong> Return cached data</p>
<p><strong>Cache Miss Behavior:</strong> Load from database into cache</p>
<p><strong>Write Behavior:</strong> Write to cache and database before confirming</p>
<p><strong>Consistency:</strong> Strong consistency guaranteed</p>
<p><strong>Performance:</strong> Slower writes due to dual updates</p>
<p><strong>Use Cases:</strong> Financial systems, inventory management</p>
<p><strong>Advantages:</strong> No data loss risk, always consistent</p>
<p><strong>Disadvantages:</strong> Higher write latency, no benefit for write-heavy loads</p>
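<p>A minimal Python sketch of the same idea, again with plain dicts standing in for the cache and the database. The write confirms only after both stores are updated, which is exactly where the extra write latency comes from.</p>
<pre><code class="lang-python">class WriteThroughCache:
    """Write-through sketch: every write updates cache and database together,
    so the two never diverge."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value        # update the cache...
        self.db[key] = value           # ...and the database synchronously
        return value                   # confirm only after both succeed

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.db[key]           # miss: load from database into cache
        self.cache[key] = value
        return value
</code></pre>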
<h3 id="heading-link-to-interactive-apphttpstoolssagyamthapacomnpcachingstrategywrite-through"><a target="_blank" href="https://tools.sagyamthapa.com.np/caching?strategy=write-through">Link to interactive app</a></h3>
<div class="hn-embed-widget" id="write-through"></div><p> </p>
<h2 id="heading-write-back">Write Back</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750427984765/1c579fcb-c9d0-4fd6-90d2-b06228b3c446.png" alt="Diagram showing a user interacting with a server, which writes constantly to a cache. The cache asynchronously writes to a database occasionally." class="image--center mx-auto" /></p>
<p><strong>Introduction:</strong> Writes go to the cache first and are eventually written to the database</p>
<p><strong>Cache Hit Behavior:</strong> Return cached data</p>
<p><strong>Cache Miss Behavior:</strong> Load from database if not in write queue</p>
<p><strong>Write Behavior:</strong> Write to cache immediately, batch/delay database writes</p>
<p><strong>Consistency:</strong> Eventual consistency</p>
<p><strong>Performance:</strong> Very fast writes</p>
<p><strong>Use Cases:</strong> Write-heavy workloads, analytics data</p>
<p><strong>Advantages:</strong> Excellent write performance, reduced database load</p>
<p><strong>Disadvantages:</strong> Risk of data loss, complex failure handling</p>
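<p>Here is an illustrative Python sketch. The <code>batch_size</code> threshold is a made-up flushing policy for the demo; real systems flush on timers, size thresholds, or eviction. Note how the database stays empty until a flush happens — that gap is the data-loss risk of this strategy.</p>
<pre><code class="lang-python">class WriteBackCache:
    """Write-back sketch: writes hit the cache immediately and are flushed
    to the database in batches."""

    def __init__(self, db, batch_size=3):
        self.db = db
        self.cache = {}
        self.dirty = set()             # keys written but not yet persisted
        self.batch_size = batch_size

    def write(self, key, value):
        self.cache[key] = value        # fast path: cache only
        self.dirty.add(key)
        if len(self.dirty) >= self.batch_size:
            self.flush()               # batch the slow database writes

    def flush(self):
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()
</code></pre>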
<h3 id="heading-link-to-interactive-apphttpstoolssagyamthapacomnpcachingstrategywrite-back"><a target="_blank" href="https://tools.sagyamthapa.com.np/caching?strategy=write-back">Link to interactive app</a></h3>
<div class="hn-embed-widget" id="write-back"></div><p> </p>
<h2 id="heading-write-around">Write Around</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750428701241/64dd991a-96a5-457d-8534-ac7495f87531.png" alt class="image--center mx-auto" /></p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Note that at step 4 the cache miss is not surfaced to the server; instead the cache fetches the data from the database, keeps a copy for itself, and returns the data. Also note that step 2 can be asynchronous or synchronous.</div>
</div>

<p><strong>Introduction:</strong> Writes bypass the cache and go directly to the database</p>
<p><strong>Cache Hit Behavior:</strong> Return cached data</p>
<p><strong>Cache Miss Behavior:</strong> Load from database, optionally cache</p>
<p><strong>Write Behavior:</strong> Direct to database, don't update cache</p>
<p><strong>Consistency:</strong> Cache can serve stale data after a write until the entry expires or is re-fetched</p>
<p><strong>Performance:</strong> Good for infrequent reads after writes</p>
<p><strong>Use Cases:</strong> Bulk imports, audit logs</p>
<p><strong>Advantages:</strong> Prevents <a target="_blank" href="https://www.wikiwand.com/en/articles/Cache_pollution">cache pollution</a>, simpler write path</p>
<p><strong>Disadvantages:</strong> First read after write is slow</p>
<h3 id="heading-link-to-interactive-apphttpstoolssagyamthapacomnpcachingstrategywrite-around"><a target="_blank" href="https://tools.sagyamthapa.com.np/caching?strategy=write-around">Link to interactive app</a></h3>
<div class="hn-embed-widget" id="write-around"></div><p> </p>
<h2 id="heading-refresh-ahead">Refresh Ahead</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750427956768/89630563-1a9c-41d3-9ce4-17b7f6d1985d.png" alt class="image--center mx-auto" /></p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Note that it’s not wise to re-fetch data just because it’s stale; this causes unnecessary strain on the database. It is better to re-fetch stale data only when it’s requested.</div>
</div>

<p><strong>Introduction:</strong> Proactively refreshes cache entries before they expire; needs pairing with another strategy for writes</p>
<p><strong>Cache Hit Behavior:</strong> Always return fresh data</p>
<p><strong>Cache Miss Behavior:</strong> Rare, only on first access</p>
<p><strong>Write Behavior:</strong> Depends on combined strategy</p>
<p><strong>Consistency:</strong> Near real-time data freshness</p>
<p><strong>Performance:</strong> Consistent fast reads</p>
<p><strong>Use Cases:</strong> Frequently accessed data, predictable patterns</p>
<p><strong>Advantages:</strong> Minimizes cache misses, predictable performance</p>
<p><strong>Disadvantages:</strong> Wastes resources on unused data, complex prediction logic</p>
<h3 id="heading-link-to-interactive-apphttpstoolssagyamthapacomnpcachingstrategyrefresh-ahead"><a target="_blank" href="https://tools.sagyamthapa.com.np/caching?strategy=refresh-ahead">Link to interactive app</a></h3>
<div class="hn-embed-widget" id="refresh-ahead-strategy"></div><p> </p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>As you see from these demos, picking the best caching strategy depends on your use case. Remember that these strategies aren't mutually exclusive. Many production systems combine strategies.</p>
<p>Here are some of my recommendations:</p>
<h3 id="heading-content-management-system">Content Management System</h3>
<p><strong>Recommended:</strong> Write Through + Read Through</p>
<p><strong>Requirements:</strong></p>
<ul>
<li><p>Ensures published content is immediately available</p>
</li>
<li><p>Simplifies application logic</p>
</li>
<li><p>Strong consistency for content updates</p>
</li>
</ul>
<h3 id="heading-e-commerce-product-catalog">E-commerce Product Catalog</h3>
<p><strong>Recommended:</strong> Cache Aside + Refresh Ahead</p>
<p><strong>Requirements:</strong></p>
<ul>
<li><p>Cache Aside for flexibility with varying access patterns</p>
</li>
<li><p>Refresh Ahead for popular items to ensure availability</p>
</li>
<li><p>Handles both predictable (trending) and unpredictable (<a target="_blank" href="https://www.wikiwand.com/en/articles/Long_tail">long-tail</a>) access</p>
</li>
</ul>
<h3 id="heading-financial-trading-system">Financial Trading System</h3>
<p><strong>Recommended:</strong> Write Through</p>
<p><strong>Requirements:</strong></p>
<ul>
<li><p>Strong consistency is a must</p>
</li>
<li><p>Every transaction must be persisted immediately</p>
</li>
<li><p>Cache serves only to reduce read latency</p>
</li>
</ul>
<h3 id="heading-real-time-chat-application">Real-time Chat Application</h3>
<p><strong>Recommended:</strong> Write Back + Read Through</p>
<p><strong>Requirements:</strong></p>
<ul>
<li><p>Write Back for message sending performance</p>
</li>
<li><p>Read Through for message history</p>
</li>
<li><p>Recent messages stay in cache</p>
</li>
</ul>
<h3 id="heading-gaming-leader-boards">Gaming Leaderboards</h3>
<p><strong>Recommended:</strong> Write Back + Refresh Ahead</p>
<p><strong>Requirements:</strong></p>
<ul>
<li><p>Write Back for rapid score updates</p>
</li>
<li><p>Refresh Ahead for top players</p>
</li>
<li><p>Eventual consistency acceptable</p>
</li>
</ul>
<h3 id="heading-api-rate-limiting">API Rate Limiting</h3>
<p><strong>Recommended:</strong> Write Back</p>
<p><strong>Requirements:</strong></p>
<ul>
<li><p>Extremely high update frequency</p>
</li>
<li><p>Small data loss acceptable</p>
</li>
<li><p>Performance critical for API gateway</p>
</li>
</ul>
<p>Start simple with Cache-Aside, measure your performance, and evolve your caching strategy as your application grows. Happy caching!</p>
<h2 id="heading-references">References</h2>
<ul>
<li><p><a target="_blank" href="https://www.prisma.io/dataguide/managing-databases/introduction-database-caching?query=&amp;page=1">Prisma</a></p>
</li>
<li><p><a target="_blank" href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3smq5msfo852zeoej5iz.jpg">ByeByteGo</a></p>
</li>
<li><p><a target="_blank" href="https://www.enjoyalgorithms.com/system-design/">Enjoy Algorithms</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[An Interactive Guide To Rate Limiting]]></title><description><![CDATA[Introduction
Rate limiting is a must-have strategy for every back-end app. It prevents one user from overusing a resource and degrading the quality of service for other users. Here are some benefits of rate limiting

It prevents resource starvation

Re...]]></description><link>https://blog.sagyamthapa.com.np/interactive-guide-to-rate-limiting</link><guid isPermaLink="true">https://blog.sagyamthapa.com.np/interactive-guide-to-rate-limiting</guid><category><![CDATA[rate-limiting]]></category><dc:creator><![CDATA[Sagyam Thapa]]></dc:creator><pubDate>Wed, 04 Jun 2025 22:19:25 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/a-hCmlnehyU/upload/9b95d9fd3e800c60d20595e644eb1e2d.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>Rate limiting is a must-have strategy for every back-end app. It prevents one user from overusing a resource and degrading the quality of service for other users. Here are some benefits of rate limiting:</p>
<ul>
<li><p>It prevents resource starvation</p>
</li>
<li><p>Reduces server hosting cost</p>
</li>
<li><p>Provides basic protection against <a target="_blank" href="https://en.wikipedia.org/wiki/Denial-of-service_attack">DDoS</a></p>
</li>
</ul>
<p>I have made four interactive apps that let you play around with common rate limiting algorithms.</p>
<h2 id="heading-token-bucket">Token bucket</h2>
<h3 id="heading-working">Working:</h3>
<ul>
<li><p>A bucket holds a fixed number of tokens</p>
</li>
<li><p>Tokens are added to the bucket at a fixed rate</p>
</li>
<li><p>When a request comes in:</p>
<ul>
<li><p>If a token is available, it’s removed from the bucket and the request is allowed.</p>
</li>
<li><p>If no tokens are available, the request is rejected or delayed.</p>
</li>
</ul>
</li>
<li><p>Allows occasional short bursts if tokens are available</p>
</li>
</ul>
<p>I have created an <a target="_blank" href="https://tools.sagyamthapa.com.np/token-bucket">app</a> that lets you play with the token bucket algorithm.</p>
<div class="hn-embed-widget" id="token-bucket"></div><p> </p>
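<p>The steps above can be sketched in Python. This version refills lazily whenever a request arrives, rather than running a background timer — a common simplification:</p>
<pre><code class="lang-python">import time

class TokenBucket:
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity          # maximum tokens the bucket can hold
        self.tokens = capacity            # start full
        self.refill_rate = refill_rate    # tokens added per second
        self.last_refill = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill lazily based on elapsed time, capped at capacity
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1              # spend a token: allow the request
            return True
        return False                      # no tokens left: reject
</code></pre>
<p>Because the bucket starts full, a burst of up to <code>capacity</code> requests is allowed before the refill rate takes over.</p>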
<h2 id="heading-leaky-bucket">Leaky bucket</h2>
<h3 id="heading-working-1">Working</h3>
<ul>
<li><p>Think of it as a bucket leaking at a fixed rate</p>
</li>
<li><p>Incoming requests are added to the bucket</p>
</li>
<li><p>Requests are processed (or "leak") at a <strong>constant rate</strong></p>
</li>
<li><p>If the bucket is full when a new request arrives, the request is dropped</p>
</li>
<li><p>Smooths out bursts; outputs requests at a steady rate</p>
<p>  I have made an <a target="_blank" href="https://tools.sagyamthapa.com.np/leaky-bucket">app</a> that lets you play with the leaky bucket algorithm.</p>
</li>
</ul>
<div class="hn-embed-widget" id="leaky-bucket"></div><p> </p>
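<p>A minimal Python sketch of the idea, using an explicit <code>tick()</code> to make the constant-rate leak visible (a real implementation would leak on a timer):</p>
<pre><code class="lang-python">from collections import deque

class LeakyBucket:
    def __init__(self, capacity, leak_rate):
        self.capacity = capacity     # max requests the bucket can queue
        self.leak_rate = leak_rate   # requests processed per tick
        self.queue = deque()

    def add(self, request):
        if len(self.queue) >= self.capacity:
            return False             # bucket full: drop the request
        self.queue.append(request)
        return True

    def tick(self):
        # "Leak" requests at a constant rate, one tick at a time
        processed = []
        for _ in range(min(self.leak_rate, len(self.queue))):
            processed.append(self.queue.popleft())
        return processed
</code></pre>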
<h2 id="heading-fixed-window-counter">Fixed window counter</h2>
<h3 id="heading-working-2">Working:</h3>
<ul>
<li><p>Time is divided into fixed size windows (e.g., 1 minute)</p>
</li>
<li><p>A counter tracks the number of requests per client/IP in the current window</p>
</li>
<li><p>If the count exceeds the limit, further requests are rejected until the next window</p>
</li>
<li><p>Simple and efficient, but allows burst traffic spike at end/start</p>
<p>  I have created an <a target="_blank" href="https://tools.sagyamthapa.com.np/fixed-window">app</a> that lets you play with the fixed window algorithm.</p>
</li>
</ul>
<div class="hn-embed-widget" id="fixed-window"></div><p> </p>
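<p>A small Python sketch of the counter. Bucketing timestamps with <code>now - (now % window)</code> gives the start of the current window, so the per-client count resets implicitly when a new window begins:</p>
<pre><code class="lang-python">class FixedWindowCounter:
    def __init__(self, limit, window_seconds):
        self.limit = limit                 # max requests per window
        self.window = window_seconds
        self.counts = {}                   # (client, window start) -> count

    def allow(self, client, now):
        window_start = now - (now % self.window)
        key = (client, window_start)
        count = self.counts.get(key, 0)
        if count >= self.limit:
            return False                   # limit reached for this window
        self.counts[key] = count + 1
        return True
</code></pre>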
<h2 id="heading-sliding-window-counter">Sliding window log</h2>
<h3 id="heading-working-3">Working:</h3>
<ul>
<li><p>Keeps a timestamped log of each request</p>
</li>
<li><p>When a request comes in, logs are checked to count how many requests were made in the last <code>X</code> seconds</p>
</li>
<li><p>If under the limit, the request is allowed and logged; otherwise, it’s rejected</p>
<p>  I have created an <a target="_blank" href="https://tools.sagyamthapa.com.np/sliding-window">app</a> that lets you play with the sliding window algorithm.</p>
</li>
</ul>
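<p>The log-based approach can be sketched in Python with a <code>deque</code> of timestamps; entries older than the window are evicted before each decision:</p>
<pre><code class="lang-python">from collections import deque

class SlidingWindowLog:
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.log = deque()                 # timestamps of allowed requests

    def allow(self, now):
        # Evict log entries older than the window
        while self.log and now - self.log[0] >= self.window:
            self.log.popleft()
        if len(self.log) >= self.limit:
            return False                   # over the limit: reject
        self.log.append(now)               # under the limit: allow and log
        return True
</code></pre>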
<div class="hn-embed-widget" id="sliding-window"></div>]]></content:encoded></item><item><title><![CDATA[Instrument your NodeJS App With OpenTelemetry]]></title><description><![CDATA[Introduction
Have you ever had a bug that occurred in production with no idea what went wrong because your logs won’t tell you, or a request that takes unusually long to process.
Sometimes debugging these issues without a...]]></description><link>https://blog.sagyamthapa.com.np/distributed-tracing-with-opentelemetry-and-jaeger-for-nest-application</link><guid isPermaLink="true">https://blog.sagyamthapa.com.np/distributed-tracing-with-opentelemetry-and-jaeger-for-nest-application</guid><category><![CDATA[OpenTelemetry]]></category><category><![CDATA[SRE]]></category><category><![CDATA[Node.js]]></category><category><![CDATA[jaeger]]></category><dc:creator><![CDATA[Sagyam Thapa]]></dc:creator><pubDate>Tue, 03 Jun 2025 18:15:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1732552404401/b1da1bec-c093-48e8-bedd-07f8b47d1d5a.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-introduction">Introduction</h3>
<p>Have you ever had a bug occur in production with no idea what went wrong because your logs won’t tell you, or a request that takes unusually long to process?</p>
<p>Sometimes debugging these issues without a tracing system is impossible. A tracing system is like a CCTV camera that captures everything: <em>what happened, when it happened, the order of events, and how long each event took</em>. This information is vital for debugging and identifying performance bottlenecks in complex distributed applications.</p>
<h3 id="heading-prerequisite">Prerequisite</h3>
<ul>
<li><p>NodeJS</p>
</li>
<li><p>Typescript</p>
</li>
<li><p>NestJS</p>
</li>
<li><p>Docker</p>
</li>
</ul>
<h3 id="heading-terminology">Terminology</h3>
<ul>
<li><p><strong>Trace</strong>: A trace is like a complete journey map of a single request as it moves through your entire distributed system. Imagine it as a detailed travel log that follows a request from its starting point to its final destination, capturing every stop and interaction along the way.</p>
<p>  <img src="https://www.jaegertracing.io/img/spans-traces.png" alt="https://www.jaegertracing.io/img/spans-traces.png" /></p>
</li>
<li><p><strong>Instrumentation</strong>: The process of adding code to your application to collect telemetry data. It's like installing GPS trackers in different parts of your system.</p>
</li>
<li><p><strong>Exporter</strong>: A component responsible for sending collected trace data to a back-end system for storage and analysis. Think of it as a postal service that sends your travel logs to a central archive.</p>
</li>
<li><p><strong>Span:</strong></p>
<ul>
<li><p>Root span: The first span in a trace, marking the beginning of the entire request journey. It's like the starting point of your travel log.</p>
</li>
<li><p>Child span: A span that is nested within another span, representing a more specific operation within a broader process.</p>
</li>
</ul>
</li>
<li><p><strong>Context propagation:</strong> The mechanism of transferring trace information between different services and components. It's like passing a traveler's passport that contains their complete journey details.</p>
</li>
<li><p><strong>Metrics</strong>: Metrics are numerical data that tell us about the app’s performance, health, and behavior.</p>
</li>
<li><p><strong>Logs</strong>: Logs are text entries describing usage patterns, activities, and operations within your application.</p>
</li>
</ul>
<h3 id="heading-three-horsemen-of-observability">Three horsemen of observability</h3>
<p>Observability lets you understand a system from the outside by letting you ask questions about that system without knowing its inner workings. It allows you to easily troubleshoot and handle novel problems, that is, <strong>“unknown unknowns”</strong>. It also answers the question <strong>“Why is this happening?”</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1732559314516/82a38ba4-fd0a-4767-a55f-45903d918b47.png" alt="Venn diagram showing realtion between Tracing, Metrics and Logging" class="image--center mx-auto" /></p>
<h3 id="heading-setting-up-the-project">Setting up the project</h3>
<pre><code class="lang-bash">pnpm i -g @nestjs/cli
nest new tracing-app
<span class="hljs-built_in">cd</span> tracing-app
</code></pre>
<h3 id="heading-installing-dependencies"><strong>Installing dependencies</strong></h3>
<p>Install Jaeger and OpenTelemetry related libraries:</p>
<pre><code class="lang-bash">pnpm install @opentelemetry/sdk-trace-node @opentelemetry/resources @opentelemetry/sdk-trace-base 
pnpm install @opentelemetry/instrumentation @prisma/instrumentation @opentelemetry/instrumentation-net @opentelemetry/instrumentation-http @opentelemetry/instrumentation-express
pnpm install @opentelemetry/exporter-trace-otlp-http
pnpm install @opentelemetry/api @opentelemetry/semantic-conventions
</code></pre>
<p>Install Prisma ORM and SQLite:</p>
<pre><code class="lang-bash">pnpm install @prisma/client sqlite3 class-validator
pnpm install prisma --save-dev
pnpm install --save @nestjs/swagger
</code></pre>
<p>Initialize Prisma:</p>
<pre><code class="lang-bash">npx prisma init
</code></pre>
<p>This will create a <code>prisma</code> directory with a <code>schema.prisma</code> file.</p>
<pre><code class="lang-sql">datasource db {
  provider = "sqlite"
  url      = "file:./dev.db"
}

generator client {
  provider = "prisma-client-js"
}

model User {
  id    Int     @id @default(autoincrement())
  name  String
  email String  @unique
}
</code></pre>
<p>Run Prisma migrations:</p>
<pre><code class="lang-bash">npx prisma migrate dev --name init
</code></pre>
<p>Generate Prisma Client:</p>
<pre><code class="lang-bash">npx prisma generate
</code></pre>
<h3 id="heading-setup-a-crud-endpoint"><strong>Setup a CRUD endpoint</strong></h3>
<p>Generate a CRUD module for users:</p>
<pre><code class="lang-bash">pnpm nest generate resource users
</code></pre>
<p>This will create a <code>users</code> module with a controller, service, and DTOs.</p>
<p>Create a <code>prisma.service.ts</code> file in the <code>prisma</code> folder:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Injectable, OnModuleInit, OnModuleDestroy } <span class="hljs-keyword">from</span> <span class="hljs-string">'@nestjs/common'</span>;
<span class="hljs-keyword">import</span> { PrismaClient } <span class="hljs-keyword">from</span> <span class="hljs-string">'@prisma/client'</span>;

<span class="hljs-meta">@Injectable</span>()
<span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> PrismaService <span class="hljs-keyword">extends</span> PrismaClient <span class="hljs-keyword">implements</span> OnModuleInit, OnModuleDestroy {
  <span class="hljs-keyword">async</span> onModuleInit() {
    <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.$connect();
  }

  <span class="hljs-keyword">async</span> onModuleDestroy() {
    <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.$disconnect();
  }
}
</code></pre>
<p>Update the <code>users.module.ts</code> file to include the <code>PrismaService</code>:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Module } <span class="hljs-keyword">from</span> <span class="hljs-string">'@nestjs/common'</span>;
<span class="hljs-keyword">import</span> { UsersService } <span class="hljs-keyword">from</span> <span class="hljs-string">'./users.service'</span>;
<span class="hljs-keyword">import</span> { UsersController } <span class="hljs-keyword">from</span> <span class="hljs-string">'./users.controller'</span>;
<span class="hljs-keyword">import</span> { PrismaService } <span class="hljs-keyword">from</span> <span class="hljs-string">'../../prisma/prisma.service'</span>;

<span class="hljs-meta">@Module</span>({
  controllers: [UsersController],
  providers: [UsersService, PrismaService],
})
<span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> UsersModule { }
</code></pre>
<p>Create a file named <code>create-user.dto.ts</code> in the <code>users/dto</code> directory:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { IsEmail, IsNotEmpty, IsString } <span class="hljs-keyword">from</span> <span class="hljs-string">'class-validator'</span>;
<span class="hljs-keyword">import</span> { ApiProperty } <span class="hljs-keyword">from</span> <span class="hljs-string">'@nestjs/swagger'</span>;

<span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> CreateUserDto {
    <span class="hljs-meta">@ApiProperty</span>({
        description: <span class="hljs-string">'The name of the user'</span>,
        example: <span class="hljs-string">'John Doe'</span>,
    })
    <span class="hljs-meta">@IsNotEmpty</span>()
    <span class="hljs-meta">@IsString</span>()
    name: <span class="hljs-built_in">string</span>;

    <span class="hljs-meta">@ApiProperty</span>({
        description: <span class="hljs-string">'The email of the user'</span>,
        example: <span class="hljs-string">'email@domain.com'</span>,
    })
    <span class="hljs-meta">@IsNotEmpty</span>()
    <span class="hljs-meta">@IsEmail</span>()
    email: <span class="hljs-built_in">string</span>;
}
</code></pre>
<p>The <code>nest generate resource</code> command already created an <code>update-user.dto.ts</code> defining <code>UpdateUserDto</code> as a <code>PartialType</code> of <code>CreateUserDto</code>, which is what the service and controller import.</p>
<p>Update the <code>users.service.ts</code> file to use Prisma:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Injectable } <span class="hljs-keyword">from</span> <span class="hljs-string">'@nestjs/common'</span>;
<span class="hljs-keyword">import</span> { PrismaService } <span class="hljs-keyword">from</span> <span class="hljs-string">'../prisma/prisma.service'</span>;
<span class="hljs-keyword">import</span> { CreateUserDto } <span class="hljs-keyword">from</span> <span class="hljs-string">'./dto/create-user.dto'</span>;
<span class="hljs-keyword">import</span> { UpdateUserDto } <span class="hljs-keyword">from</span> <span class="hljs-string">'./dto/update-user.dto'</span>;

<span class="hljs-meta">@Injectable</span>()
<span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> UsersService {
  <span class="hljs-keyword">constructor</span>(<span class="hljs-params"><span class="hljs-keyword">private</span> prisma: PrismaService</span>) {}

  create(createUserDto: CreateUserDto) {
    <span class="hljs-keyword">return</span> <span class="hljs-built_in">this</span>.prisma.user.create({
      data: createUserDto,
    });
  }

  findAll() {
    <span class="hljs-keyword">return</span> <span class="hljs-built_in">this</span>.prisma.user.findMany();
  }

  findOne(id: <span class="hljs-built_in">number</span>) {
    <span class="hljs-keyword">return</span> <span class="hljs-built_in">this</span>.prisma.user.findUnique({
      where: { id },
    });
  }

  update(id: <span class="hljs-built_in">number</span>, updateUserDto: UpdateUserDto) {
    <span class="hljs-keyword">return</span> <span class="hljs-built_in">this</span>.prisma.user.update({
      where: { id },
      data: updateUserDto,
    });
  }

  remove(id: <span class="hljs-built_in">number</span>) {
    <span class="hljs-keyword">return</span> <span class="hljs-built_in">this</span>.prisma.user.delete({
      where: { id },
    });
  }
}
</code></pre>
<p><strong>Update the</strong> <code>users.controller.ts</code> file:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Controller, Get, Post, Body, Patch, Param, Delete } <span class="hljs-keyword">from</span> <span class="hljs-string">'@nestjs/common'</span>;
<span class="hljs-keyword">import</span> { UsersService } <span class="hljs-keyword">from</span> <span class="hljs-string">'./users.service'</span>;
<span class="hljs-keyword">import</span> { CreateUserDto } <span class="hljs-keyword">from</span> <span class="hljs-string">'./dto/create-user.dto'</span>;
<span class="hljs-keyword">import</span> { UpdateUserDto } <span class="hljs-keyword">from</span> <span class="hljs-string">'./dto/update-user.dto'</span>;
<span class="hljs-keyword">import</span> { ApiGoneResponse, ApiNotFoundResponse, ApiOkResponse, ApiOperation, ApiParam, ApiTags } <span class="hljs-keyword">from</span> <span class="hljs-string">'@nestjs/swagger'</span>;

<span class="hljs-meta">@ApiTags</span>(<span class="hljs-string">'users'</span>)
<span class="hljs-meta">@Controller</span>(<span class="hljs-string">'users'</span>)
<span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> UsersController {
  <span class="hljs-keyword">constructor</span>(<span class="hljs-params"><span class="hljs-keyword">private</span> <span class="hljs-keyword">readonly</span> usersService: UsersService</span>) { }

  <span class="hljs-meta">@ApiOperation</span>({ summary: <span class="hljs-string">'Create user'</span> })
  <span class="hljs-meta">@ApiOkResponse</span>({ description: <span class="hljs-string">'User created'</span> })
  <span class="hljs-meta">@Post</span>()
  create(<span class="hljs-meta">@Body</span>() createUserDto: CreateUserDto) {
    <span class="hljs-keyword">return</span> <span class="hljs-built_in">this</span>.usersService.create(createUserDto);
  }

  <span class="hljs-meta">@ApiOperation</span>({ summary: <span class="hljs-string">'Get all users'</span> })
  <span class="hljs-meta">@ApiOkResponse</span>({ description: <span class="hljs-string">'Users found'</span> })
  <span class="hljs-meta">@Get</span>()
  findAll() {
    <span class="hljs-keyword">return</span> <span class="hljs-built_in">this</span>.usersService.findAll();
  }

  <span class="hljs-meta">@ApiOperation</span>({ summary: <span class="hljs-string">'Get user by id'</span> })
  <span class="hljs-meta">@ApiOkResponse</span>({ description: <span class="hljs-string">'User found'</span> })
  <span class="hljs-meta">@ApiNotFoundResponse</span>({ description: <span class="hljs-string">'User not found'</span> })
  <span class="hljs-meta">@ApiParam</span>({ name: <span class="hljs-string">'id'</span>, description: <span class="hljs-string">'User id'</span> })
  <span class="hljs-meta">@Get</span>(<span class="hljs-string">':id'</span>)
  findOne(<span class="hljs-meta">@Param</span>(<span class="hljs-string">'id'</span>) id: <span class="hljs-built_in">string</span>) {
    <span class="hljs-keyword">return</span> <span class="hljs-built_in">this</span>.usersService.findOne(+id);
  }

  <span class="hljs-meta">@ApiOperation</span>({ summary: <span class="hljs-string">'Update user'</span> })
  <span class="hljs-meta">@ApiOkResponse</span>({ description: <span class="hljs-string">'User updated'</span> })
  <span class="hljs-meta">@ApiNotFoundResponse</span>({ description: <span class="hljs-string">'User not found'</span> })
  <span class="hljs-meta">@ApiParam</span>({ name: <span class="hljs-string">'id'</span>, description: <span class="hljs-string">'User id'</span> })
  <span class="hljs-meta">@Patch</span>(<span class="hljs-string">':id'</span>)
  update(<span class="hljs-meta">@Param</span>(<span class="hljs-string">'id'</span>) id: <span class="hljs-built_in">string</span>, <span class="hljs-meta">@Body</span>() updateUserDto: UpdateUserDto) {
    <span class="hljs-keyword">return</span> <span class="hljs-built_in">this</span>.usersService.update(+id, updateUserDto);
  }

  <span class="hljs-meta">@ApiOperation</span>({ summary: <span class="hljs-string">'Delete user'</span> })
  <span class="hljs-meta">@ApiGoneResponse</span>({ description: <span class="hljs-string">'User deleted'</span> })
  <span class="hljs-meta">@ApiParam</span>({ name: <span class="hljs-string">'id'</span>, description: <span class="hljs-string">'User id'</span> })
  <span class="hljs-meta">@Delete</span>(<span class="hljs-string">':id'</span>)
  remove(<span class="hljs-meta">@Param</span>(<span class="hljs-string">'id'</span>) id: <span class="hljs-built_in">string</span>) {
    <span class="hljs-keyword">return</span> <span class="hljs-built_in">this</span>.usersService.remove(+id);
  }
}
</code></pre>
<h3 id="heading-configuring-exporters"><strong>Configuring exporters</strong></h3>
<p>Create a file <strong><em>tracing.ts</em></strong> in your <strong><em>src</em></strong> directory:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { ATTR_SERVICE_NAME, ATTR_SERVICE_VERSION } <span class="hljs-keyword">from</span> <span class="hljs-string">'@opentelemetry/semantic-conventions'</span>;
<span class="hljs-keyword">import</span> { BatchSpanProcessor } <span class="hljs-keyword">from</span> <span class="hljs-string">'@opentelemetry/sdk-trace-base'</span>;
<span class="hljs-keyword">import</span> { ExpressInstrumentation } <span class="hljs-keyword">from</span> <span class="hljs-string">'@opentelemetry/instrumentation-express'</span>;
<span class="hljs-keyword">import</span> { HttpInstrumentation } <span class="hljs-keyword">from</span> <span class="hljs-string">'@opentelemetry/instrumentation-http'</span>;
<span class="hljs-keyword">import</span> { NetInstrumentation } <span class="hljs-keyword">from</span> <span class="hljs-string">'@opentelemetry/instrumentation-net'</span>;
<span class="hljs-keyword">import</span> { NodeTracerProvider } <span class="hljs-keyword">from</span> <span class="hljs-string">'@opentelemetry/sdk-trace-node'</span>;
<span class="hljs-keyword">import</span> { OTLPTraceExporter } <span class="hljs-keyword">from</span> <span class="hljs-string">'@opentelemetry/exporter-trace-otlp-http'</span>;
<span class="hljs-keyword">import</span> { PrismaInstrumentation } <span class="hljs-keyword">from</span> <span class="hljs-string">'@prisma/instrumentation'</span>;
<span class="hljs-keyword">import</span> { Resource } <span class="hljs-keyword">from</span> <span class="hljs-string">'@opentelemetry/resources'</span>;
<span class="hljs-keyword">import</span> { diag, DiagConsoleLogger, DiagLogLevel } <span class="hljs-keyword">from</span> <span class="hljs-string">'@opentelemetry/api'</span>;
<span class="hljs-keyword">import</span> { registerInstrumentations } <span class="hljs-keyword">from</span> <span class="hljs-string">'@opentelemetry/instrumentation'</span>;

<span class="hljs-keyword">export</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">setupTracing</span>(<span class="hljs-params"></span>) </span>{
    <span class="hljs-comment">// Enable OpenTelemetry diagnostic logging</span>
    diag.setLogger(<span class="hljs-keyword">new</span> DiagConsoleLogger(), DiagLogLevel.INFO);

    <span class="hljs-comment">// Create a resource with service information</span>
    <span class="hljs-keyword">const</span> resource = <span class="hljs-keyword">new</span> Resource({
        [ATTR_SERVICE_NAME]: process.env.SERVICE_NAME || <span class="hljs-string">'tracer-app'</span>,
        [ATTR_SERVICE_VERSION]: process.env.npm_package_version || <span class="hljs-string">'1.0.0'</span>,
    });

    <span class="hljs-keyword">const</span> otlpExporter = <span class="hljs-keyword">new</span> OTLPTraceExporter({
        url: process.env.OTEL_EXPORTER_OTLP_ENDPOINT || <span class="hljs-string">'http://localhost:4318/v1/traces'</span>,
    });


    <span class="hljs-comment">// Create tracer provider with resource and span processors</span>
    <span class="hljs-keyword">const</span> provider = <span class="hljs-keyword">new</span> NodeTracerProvider({
        resource,
        spanProcessors: [
            <span class="hljs-keyword">new</span> BatchSpanProcessor(otlpExporter, {
                maxQueueSize: <span class="hljs-number">100</span>,
                scheduledDelayMillis: <span class="hljs-number">5000</span>,
                exportTimeoutMillis: <span class="hljs-number">30000</span>,
                maxExportBatchSize: <span class="hljs-number">50</span>,
            })
        ]
    });

    <span class="hljs-comment">// Register instrumentations with more comprehensive coverage</span>
    registerInstrumentations({
        tracerProvider: provider,
        instrumentations: [
            <span class="hljs-keyword">new</span> HttpInstrumentation({
                requestHook: <span class="hljs-function">(<span class="hljs-params">span, request</span>) =&gt;</span> {
                    span.setAttribute(<span class="hljs-string">'http.request.method'</span>, request.method);
                },
            }),
            <span class="hljs-keyword">new</span> NetInstrumentation(),
            <span class="hljs-keyword">new</span> ExpressInstrumentation(),
            <span class="hljs-keyword">new</span> PrismaInstrumentation({ middleware: <span class="hljs-literal">true</span> }),
        ],
    });

    <span class="hljs-comment">// Register the provider</span>
    provider.register();

    <span class="hljs-comment">// Return the provider for potential manual instrumentation</span>
    <span class="hljs-keyword">return</span> provider;
}

<span class="hljs-comment">// Call this at application startup</span>
setupTracing();
</code></pre>
<h3 id="heading-explanation"><strong>Explanation</strong></h3>
<ul>
<li><p><strong>Diagnostic Logging:</strong> Enables diagnostic logging using a console logger at the <code>INFO</code> level to debug tracing setup.</p>
</li>
<li><p><strong>Resource Initialization:</strong></p>
<ul>
<li><p>Defines metadata about the service, like <code>SERVICE_NAME</code> and <code>SERVICE_VERSION</code>.</p>
</li>
<li><p>This metadata is attached to every trace and helps identify which service the trace belongs to.</p>
</li>
</ul>
</li>
<li><p><strong>OTLP Trace Exporter:</strong> Configures the OpenTelemetry Protocol (<strong>OTLP</strong>) exporter to send trace data over <strong><em>HTTP</em></strong> to the backend, which for now is Jaeger. Note that Jaeger can be swapped for another backend such as Honeycomb or Zipkin, and the transport can be switched from <strong>HTTP</strong> to the more efficient <strong>gRPC</strong>.</p>
</li>
<li><p><strong>Tracer Provider with Span Processor:</strong> Creates a <code>NodeTracerProvider</code>, which manages tracers and spans:</p>
<ul>
<li><p><strong>Resource</strong>: Includes service metadata.</p>
</li>
<li><p><strong>BatchSpanProcessor</strong>: Buffers spans and exports them in batches to minimize performance impact. Key configurations:</p>
<ul>
<li><p><code>maxQueueSize</code>: Maximum number of spans held in the queue; spans arriving beyond this limit are dropped.</p>
</li>
<li><p><code>scheduledDelayMillis</code>: Interval between scheduled batch exports.</p>
</li>
<li><p><code>exportTimeoutMillis</code>: Max time allowed for export.</p>
</li>
<li><p><code>maxExportBatchSize</code>: Maximum spans per export batch.</p>
</li>
</ul>
</li>
</ul>
</li>
<li><p><strong>Register Instrumentation</strong>: Automatically captures traces for libraries and frameworks:</p>
<ul>
<li><p><code>HttpInstrumentation</code>: Captures time taken by HTTP requests/responses.</p>
</li>
<li><p><code>NetInstrumentation</code>: Captures time taken by low-level networking events.</p>
</li>
<li><p><code>ExpressInstrumentation</code>: Tracks time taken by Express middleware and routes.</p>
</li>
<li><p><code>PrismaInstrumentation</code>: Tracks time taken by SQL queries generated by Prisma</p>
</li>
</ul>
</li>
</ul>
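<p>To make the <code>BatchSpanProcessor</code> settings concrete, here is a tiny dependency-free simulation of the queueing behavior. <code>FakeBatcher</code> is a made-up name for illustration, not an OpenTelemetry API:</p>

```typescript
// Hypothetical mini-batcher mimicking BatchSpanProcessor's queue semantics.
class FakeBatcher {
  private queue: string[] = [];
  exported: string[][] = [];

  constructor(
    private maxQueueSize: number,      // spans beyond this are dropped
    private maxExportBatchSize: number // spans sent per export call
  ) {}

  // Called when a span ends; mirrors the processor's onEnd hook.
  onEnd(span: string): void {
    if (this.queue.length >= this.maxQueueSize) return; // queue full: drop the span
    this.queue.push(span);
  }

  // The real processor runs this every scheduledDelayMillis.
  flush(): void {
    while (this.queue.length > 0) {
      this.exported.push(this.queue.splice(0, this.maxExportBatchSize));
    }
  }
}

const batcher = new FakeBatcher(100, 50);
for (let i = 0; i !== 120; i++) batcher.onEnd(`span-${i}`);
batcher.flush();
console.log(batcher.exported.length); // 2 batches of 50: 100 queued, 20 dropped
```

<p>With these numbers a burst above 100 spans silently loses data, which is why these defaults are worth tuning for high-traffic services.</p>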
<h3 id="heading-inject-instrumenting-code-in-your-application"><strong>Inject instrumenting code in your application</strong></h3>
<p>Import and initialize the tracing configuration in your main application file <strong><em>main.ts</em></strong>:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { NestFactory } <span class="hljs-keyword">from</span> <span class="hljs-string">'@nestjs/core'</span>;
<span class="hljs-keyword">import</span> { SwaggerModule, DocumentBuilder } <span class="hljs-keyword">from</span> <span class="hljs-string">'@nestjs/swagger'</span>;
<span class="hljs-keyword">import</span> { AppModule } <span class="hljs-keyword">from</span> <span class="hljs-string">'./app.module'</span>;
<span class="hljs-keyword">import</span> <span class="hljs-string">'./tracing'</span>;

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">bootstrap</span>(<span class="hljs-params"></span>) </span>{
  <span class="hljs-keyword">const</span> app = <span class="hljs-keyword">await</span> NestFactory.create(AppModule);

  <span class="hljs-keyword">const</span> config = <span class="hljs-keyword">new</span> DocumentBuilder()
    .setTitle(<span class="hljs-string">'Tracing example'</span>)
    .setDescription(<span class="hljs-string">'The tracing API description'</span>)
    .setVersion(<span class="hljs-string">'1.0'</span>)
    .addTag(<span class="hljs-string">'tracing'</span>)
    .build();
  <span class="hljs-keyword">const</span> documentFactory = <span class="hljs-function">() =&gt;</span> SwaggerModule.createDocument(app, config);
  SwaggerModule.setup(<span class="hljs-string">'api-docs'</span>, app, documentFactory);

  <span class="hljs-keyword">await</span> app.listen(process.env.PORT ?? <span class="hljs-number">3000</span>);
}
bootstrap();
</code></pre>
<h3 id="heading-run-your-application">Run Your Application</h3>
<pre><code class="lang-typescript">pnpm run start:dev
</code></pre>
<h3 id="heading-setting-jaeger-for-development-environment"><strong>Setting Jaeger for development environment</strong></h3>
<p>Easiest way to setup Jaeger is with docker-compose winch will work fine a development environment.</p>
<p>Create <code>docker-compose.yaml</code> file:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">services:</span>
  <span class="hljs-attr">jaeger:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">jaegertracing/all-in-one:1.63.0</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">jaeger</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">COLLECTOR_OTLP_ENABLED:</span> <span class="hljs-string">"true"</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"4317:4317"</span> <span class="hljs-comment"># For Jaeger-GRPC</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"4318:4318"</span> <span class="hljs-comment"># For Jaeger-HTTP</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"16686:16686"</span> <span class="hljs-comment"># # Web UI</span>

<span class="hljs-attr">networks:</span>
  <span class="hljs-attr">default:</span>
    <span class="hljs-attr">driver:</span> <span class="hljs-string">bridge</span>
</code></pre>
<h3 id="heading-containerize-app-optional"><strong>Containerize app (optional)</strong></h3>
<p>You can use the <a target="_blank" href="https://docs.docker.com/reference/cli/docker/init/"><strong>docker init</strong></a> command to automatically generate an optimized Dockerfile if you have a newer version of Docker installed.</p>
<pre><code class="lang-dockerfile"><span class="hljs-comment"># Arguments for versions</span>
<span class="hljs-keyword">ARG</span> NODE_VERSION=<span class="hljs-number">20.18</span>.<span class="hljs-number">0</span>
<span class="hljs-keyword">ARG</span> PNPM_VERSION=<span class="hljs-number">9.12</span>.<span class="hljs-number">2</span>
<span class="hljs-keyword">ARG</span> ALPINE_VERSION=<span class="hljs-number">3.20</span>

<span class="hljs-comment">################################################################################</span>
<span class="hljs-comment"># Base stage: Build the application</span>
<span class="hljs-keyword">FROM</span> node:${NODE_VERSION}-alpine${ALPINE_VERSION} AS builder

<span class="hljs-comment"># Set working directory</span>
<span class="hljs-keyword">WORKDIR</span><span class="bash"> /usr/src/app</span>

<span class="hljs-comment"># Install pnpm globally with cache</span>
<span class="hljs-keyword">RUN</span><span class="bash"> --mount=<span class="hljs-built_in">type</span>=cache,target=/root/.npm \
    npm install -g pnpm@<span class="hljs-variable">${PNPM_VERSION}</span></span>

<span class="hljs-comment"># Copy package.json and pnpm-lock.yaml to install dependencies</span>
<span class="hljs-keyword">COPY</span><span class="bash"> ../package.json pnpm-lock.yaml ./</span>

<span class="hljs-comment"># Install dependencies with cache</span>
<span class="hljs-keyword">RUN</span><span class="bash"> --mount=<span class="hljs-built_in">type</span>=cache,target=/root/.pnpm-store \
    pnpm install --frozen-lockfile</span>

<span class="hljs-comment"># Copy the all application code</span>
<span class="hljs-keyword">COPY</span><span class="bash"> .. .</span>

<span class="hljs-comment"># Setup prisma</span>
<span class="hljs-keyword">RUN</span><span class="bash"> pnpm prisma generate</span>

<span class="hljs-comment"># Build the application</span>
<span class="hljs-keyword">RUN</span><span class="bash"> pnpm run build</span>

<span class="hljs-comment"># Runner Stage</span>
<span class="hljs-keyword">FROM</span> node:${NODE_VERSION}-alpine${ALPINE_VERSION} AS runner

<span class="hljs-comment"># Set working directory</span>
<span class="hljs-keyword">WORKDIR</span><span class="bash"> /usr/src/app</span>

<span class="hljs-comment"># Copy the built application from the builder stage</span>
<span class="hljs-keyword">COPY</span><span class="bash"> --from=builder /usr/src/app/dist ./dist</span>
<span class="hljs-keyword">COPY</span><span class="bash"> ../package.json pnpm-lock.yaml ./</span>
<span class="hljs-keyword">COPY</span><span class="bash"> ../prisma/schema.prisma ./prisma/schema.prisma</span>

<span class="hljs-comment"># Install pnpm globally</span>
<span class="hljs-keyword">RUN</span><span class="bash"> --mount=<span class="hljs-built_in">type</span>=cache,target=/root/.npm \
    npm install -g pnpm@<span class="hljs-variable">${PNPM_VERSION}</span></span>

<span class="hljs-comment"># Install dependencies with cache</span>
<span class="hljs-keyword">RUN</span><span class="bash"> --mount=<span class="hljs-built_in">type</span>=cache,target=/root/.pnpm-store \
    pnpm install --frozen-lockfile --prod</span>

<span class="hljs-comment"># Set NODE_ENV to production</span>
<span class="hljs-keyword">ENV</span> NODE_ENV=production

<span class="hljs-comment"># Run the application</span>
<span class="hljs-keyword">CMD</span><span class="bash"> [<span class="hljs-string">"pnpm"</span>, <span class="hljs-string">"run"</span>, <span class="hljs-string">"start:prod"</span>]</span>
</code></pre>
<h3 id="heading-swagger-ui"><strong>Swagger UI</strong></h3>
<p>Visit <a target="_blank" href="http://localhost:3000/api-docs"><code>http://localhost:3000/api-docs</code></a> and make some API calls.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1732656370443/c7db3296-da38-44f2-8724-472032225594.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-visualizing-traces"><strong>Visualizing traces</strong></h3>
<p>Open your browser and go to <a target="_blank" href="http://localhost:16686"><code>http://localhost:16686</code></a> to see the Jaeger UI. Run some requests, click <strong>Find Traces</strong>, then click on a trace.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1732656011574/c85346cc-15f4-4f59-854b-afe3c5dd330e.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1732655235246/92991171-f32d-4e18-8ccd-4646f08dd44f.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1732655291696/f22f8c63-81cb-4ca6-9b14-4fffc3f67b0f.png" alt class="image--center mx-auto" /></p>
]]></content:encoded></item><item><title><![CDATA[Scaling PostgreSQL with Kubernetes]]></title><description><![CDATA[A case for vertical scaling
If you have read any article or a book on system design then you probably know what vertical and horizontal scaling is and benefits of horizontal scaling. Before I explain how to setup proper horizontal scaling with Postgr...]]></description><link>https://blog.sagyamthapa.com.np/scaling-postgresql-with-kubernetes</link><guid isPermaLink="true">https://blog.sagyamthapa.com.np/scaling-postgresql-with-kubernetes</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[PostgreSQL]]></category><category><![CDATA[Distributed Database]]></category><category><![CDATA[high availability]]></category><dc:creator><![CDATA[Sagyam Thapa]]></dc:creator><pubDate>Sun, 25 May 2025 20:09:27 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/YXwt-vJ3szA/upload/a96f88a74a8f84457f1a3af7b94373c9.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-a-case-for-vertical-scaling">A case for vertical scaling</h2>
<p>If you have read any article or book on system design, you probably know what vertical and horizontal scaling are and the benefits of horizontal scaling. Before I explain how to set up proper horizontal scaling with Postgres, let me make a case for when you should not try this.</p>
<ol>
<li><p>Simplicity: A single-node database runs out of the box, although I recommend running <a target="_blank" href="https://pgtune.leopard.in.ua/">PGTune</a> for a quick preset or visiting <a target="_blank" href="http://postgresqlco.nf">postgresqlco.nf</a> for a full breakdown.</p>
</li>
<li><p>Easier backup and recovery: No need to think about state across replicas when creating or restoring a backup.</p>
</li>
<li><p>No network overhead, especially with write-heavy operations.</p>
</li>
<li><p>A temporary fix: If you need a fix right now, scaling up a single node provides instant relief.</p>
</li>
</ol>
<h2 id="heading-prerequisite">Prerequisite</h2>
<p>Make sure you have the following tools installed.</p>
<ul>
<li><p><a target="_blank" href="https://kubernetes.io/docs/reference/kubectl/">kubectl</a></p>
</li>
<li><p><a target="_blank" href="https://helm.sh/">helm</a></p>
</li>
<li><p><a target="_blank" href="https://minikube.sigs.k8s.io/docs/start/?arch=%2Flinux%2Fx86-64%2Fstable%2Fbinary+download">minikube</a></p>
</li>
<li><p><a target="_blank" href="https://k9scli.io/">k9s</a></p>
</li>
</ul>
<p>Following this guide requires a basic understanding of Kubernetes, <a target="_blank" href="https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/">CRDs</a>, and Helm. Nothing deep; a quick AI summary will suffice.</p>
<h2 id="heading-replication">Replication</h2>
<p>Replication means keeping multiple copies of data on multiple machines connected via network. Here is why you might want to do that:</p>
<ul>
<li><p>It keeps your data close to your users.</p>
</li>
<li><p>It acts as a hot backup if a node goes down.</p>
</li>
<li><p>It helps with scaling if most of your workload is read operations (which is the case for most <a target="_blank" href="https://www.wikiwand.com/en/articles/Online_transaction_processing">OLTP</a> systems)</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1747838067614/bce32bd8-7609-4918-843d-72e7626c4cf8.png" alt="Diagram depicting a database architecture with a leader and two followers. The leader handles create, delete, and update queries, while followers handle read queries. Data synchronization is done through WAL sync. User queries are directed through a pg-pool component." /></p>
<blockquote>
<p>Here <a target="_blank" href="https://www.pgpool.net/docs/46/en/html/intro-whatis.html">pg-pool</a> acts as load balancer, it distributes read request evenly among followers and mutation request to the leader. Notice that Leader periodically syncs it <a target="_blank" href="https://www.wikiwand.com/en/articles/Write-ahead_logging">WAL</a> with it’s followers.</p>
</blockquote>
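<p>The read/write split that pg-pool performs can be sketched as a simple router: mutations always go to the leader, reads rotate across the followers. This is an illustrative sketch under those assumptions, not pg-pool's actual algorithm:</p>

```typescript
// Illustrative query router in the spirit of pg-pool's load balancing.
type DbNode = { name: string };

class QueryRouter {
  private next = 0;

  constructor(private leader: DbNode, private followers: DbNode[]) {}

  route(sql: string): DbNode {
    const isRead = /^\s*select\b/i.test(sql);
    if (!isRead || this.followers.length === 0) return this.leader; // mutations hit the leader
    const target = this.followers[this.next % this.followers.length]; // round-robin reads
    this.next += 1;
    return target;
  }
}

const router = new QueryRouter({ name: 'leader' }, [{ name: 'follower-1' }, { name: 'follower-2' }]);
console.log(router.route('SELECT * FROM users').name);        // follower-1
console.log(router.route('UPDATE users SET name = $1').name); // leader
console.log(router.route('select count(*) from users').name); // follower-2
```

<p>Real pg-pool also inspects transactions and session state; inside a transaction even SELECTs may be pinned to the leader to guarantee read-your-writes consistency.</p>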
<h3 id="heading-setup-stackgres-and-enable-load-balancer">Setup StackGres and enable load balancer</h3>
<pre><code class="lang-bash">minikube addons <span class="hljs-built_in">enable</span> metallb
minikube tunnel
</code></pre>
<pre><code class="lang-bash">helm install stackgres-operator stackgres-charts/stackgres-operator \
    --namespace stackgres-operator \
    --create-namespace
</code></pre>
<h3 id="heading-define-crd-for-replicated-cluster">Define CRD for replicated cluster</h3>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">stackgres.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">SGCluster</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">cluster</span>

<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">instances:</span> <span class="hljs-number">3</span> <span class="hljs-comment"># 1 primary + 2 replicas</span>

  <span class="hljs-attr">postgres:</span>
    <span class="hljs-attr">version:</span> <span class="hljs-string">"15"</span>

  <span class="hljs-attr">pods:</span>
    <span class="hljs-attr">persistentVolume:</span>
      <span class="hljs-attr">size:</span> <span class="hljs-string">"1Gi"</span>

  <span class="hljs-attr">profile:</span> <span class="hljs-string">development</span>

  <span class="hljs-attr">postgresServices:</span>
    <span class="hljs-attr">primary:</span>
      <span class="hljs-attr">type:</span> <span class="hljs-string">LoadBalancer</span>
    <span class="hljs-attr">replicas:</span>
      <span class="hljs-attr">type:</span> <span class="hljs-string">LoadBalancer</span>
</code></pre>
<h3 id="heading-apply-the-crd">Apply the CRD</h3>
<pre><code class="lang-bash">kubectl apply -f ./replication.yaml
kubectl get pods -w
</code></pre>
<h3 id="heading-get-credentials">Get credentials</h3>
<pre><code class="lang-bash">PG_PASSWORD=$(kubectl -n default get secret cluster --template <span class="hljs-string">'{{ printf "%s" (index .data "superuser-password" | base64decode) }}'</span>)
<span class="hljs-built_in">echo</span> <span class="hljs-string">"The superuser password is: <span class="hljs-variable">$PG_PASSWORD</span>"</span>
</code></pre>
<h3 id="heading-see-who-is-who">See who is who</h3>
<pre><code class="lang-bash">kubectl <span class="hljs-built_in">exec</span> -it cluster-0  -c patroni -- patronictl list
</code></pre>
<h3 id="heading-kill-the-primary">Kill the primary</h3>
<pre><code class="lang-bash">kubectl delete pod cluster-0
</code></pre>
<h3 id="heading-see-who-is-in-charge-now">See who is in charge now</h3>
<p><a target="_blank" href="https://patroni.readthedocs.io/en/latest/">Patroni</a> should have elected a new leader by now.</p>
<pre><code class="lang-bash">kubectl <span class="hljs-built_in">exec</span> -it cluster-1 -c patroni -- patronictl list
</code></pre>
<h3 id="heading-tell-something-only-to-the-primary">Tell something only to the primary</h3>
<pre><code class="lang-bash">PRIMARY=$(kubectl <span class="hljs-built_in">exec</span> -it cluster-1 -c patroni -- patronictl list | grep Leader | awk <span class="hljs-string">'{print $2}'</span>)
kubectl <span class="hljs-built_in">exec</span> -it <span class="hljs-variable">$PRIMARY</span> -c patroni -- psql -U postgres -c <span class="hljs-string">"CREATE TABLE replication_test_table (id SERIAL PRIMARY KEY, data TEXT);"</span>
kubectl <span class="hljs-built_in">exec</span> -it <span class="hljs-variable">$PRIMARY</span> -c patroni -- psql -U postgres -c <span class="hljs-string">"INSERT INTO replication_test_table (data) VALUES ('Spread the word about our lord savior PostgreSQL!');"</span>
</code></pre>
<h3 id="heading-primary-tell-his-followers">Primary tell his followers</h3>
<pre><code class="lang-bash">kubectl <span class="hljs-built_in">exec</span> -it cluster-0 -c patroni -- psql -U postgres -c <span class="hljs-string">"SELECT * FROM replication_test_table;"</span>
kubectl <span class="hljs-built_in">exec</span> -it cluster-1 -c patroni -- psql -U postgres -c <span class="hljs-string">"SELECT * FROM replication_test_table;"</span>
kubectl <span class="hljs-built_in">exec</span> -it cluster-2 -c patroni -- psql -U postgres -c <span class="hljs-string">"SELECT * FROM replication_test_table;"</span>
</code></pre>
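<p>You can also confirm that WAL streaming is healthy straight from Postgres’ own statistics view, reusing the <code>$PRIMARY</code> variable from the previous step. One row should appear per connected follower:</p>
<pre><code class="lang-bash">kubectl exec -it $PRIMARY -c patroni -- psql -U postgres -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;"
</code></pre>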
<p>As you can see, the word spread quickly. This is possible because StackGres uses <a target="_blank" href="https://patroni.readthedocs.io/en/latest/">Patroni</a> under the hood to coordinate all the replication.</p>
<h2 id="heading-partitioning">Partitioning</h2>
<p>Partitioning splits the data (a table in our case) into smaller, more manageable parts. This is done <strong>within a single database instance</strong>. Postgres supports this out of the box: partitions are defined at the data definition (DDL) layer, and keeping multiple replicas of a partition makes it highly available. It works best for time-series data, logs, or region-based segmentation.</p>
<h3 id="heading-types-of-partitioning">Types of Partitioning</h3>
<ol>
<li><p><strong>Range Partitioning</strong> – Data is partitioned based on value ranges (e.g., date ranges).</p>
</li>
<li><p><strong>List Partitioning</strong> – Partitioning based on a list of values (e.g., regions or categories).</p>
</li>
<li><p><strong>Hash Partitioning</strong> – Data is distributed using a hash function (e.g., <code>MOD(user_id, 4)</code>).</p>
</li>
</ol>
<p>The following code creates an <code>orders</code> table and derives child tables from it using range, list, and hash partitioning in a hierarchical way: the <code>orders</code> table is split by year, each year is further split by region, and each region is finally split by hash.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1747840673080/c6e6cc7b-7f90-4d07-a35a-130eb3ef1c52.png" alt class="image--center mx-auto" /></p>
<blockquote>
<p>Notice that only hash-based partitioning guarantees that all partitions are roughly the same size.</p>
</blockquote>
<h3 id="heading-setup-stackgres-and-enable-load-balancer-1">Setup StackGres and enable load balancer</h3>
<pre><code class="lang-bash">helm install stackgres-operator stackgres-charts/stackgres-operator \
    --namespace stackgres-operator \
    --create-namespace

minikube addons <span class="hljs-built_in">enable</span> metallb
minikube tunnel
</code></pre>
<h3 id="heading-get-credentials-1">Get credentials</h3>
<pre><code class="lang-bash">PG_PASSWORD=$(kubectl -n default get secret cluster --template <span class="hljs-string">'{{ printf "%s" (index .data "superuser-password" | base64decode) }}'</span>)
<span class="hljs-built_in">echo</span> <span class="hljs-string">"The superuser password is: <span class="hljs-variable">$PG_PASSWORD</span>"</span>
</code></pre>
<blockquote>
<p>Your database should now be available at <code>postgresql://postgres:&lt;password&gt;@localhost:5432</code></p>
</blockquote>
<p>Now open an SQL Editor like <a target="_blank" href="https://www.pgadmin.org/">pgAdmin</a>, and run the following.</p>
<pre><code class="lang-pgsql"><span class="hljs-comment">-- Parent table</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> orders (
    order_id    <span class="hljs-type">INT</span>,
    customer_id <span class="hljs-type">INT</span>,
    order_date  <span class="hljs-type">DATE</span>,
    region      <span class="hljs-type">TEXT</span>,
    amount      <span class="hljs-type">INT</span>,
    <span class="hljs-keyword">PRIMARY KEY</span> (order_id, order_date, region, customer_id)
) <span class="hljs-keyword">PARTITION BY RANGE</span> (order_date);


<span class="hljs-comment">-- Range: Year 2024</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> orders_2024 <span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">OF</span> orders
    <span class="hljs-keyword">FOR</span> <span class="hljs-keyword">VALUES</span> <span class="hljs-keyword">FROM</span> (<span class="hljs-string">'2024-01-01'</span>) <span class="hljs-keyword">TO</span> (<span class="hljs-string">'2025-01-01'</span>)
    <span class="hljs-keyword">PARTITION BY LIST</span> (region);

<span class="hljs-comment">-- Range: Year 2025</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> orders_2025 <span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">OF</span> orders
    <span class="hljs-keyword">FOR</span> <span class="hljs-keyword">VALUES</span> <span class="hljs-keyword">FROM</span> (<span class="hljs-string">'2025-01-01'</span>) <span class="hljs-keyword">TO</span> (<span class="hljs-string">'2026-01-01'</span>)
    <span class="hljs-keyword">PARTITION BY LIST</span> (region);

<span class="hljs-comment">-- 2024 - US region</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> orders_2024_us <span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">OF</span> orders_2024
    <span class="hljs-keyword">FOR</span> <span class="hljs-keyword">VALUES</span> <span class="hljs-keyword">IN</span> (<span class="hljs-string">'US'</span>)
    <span class="hljs-keyword">PARTITION BY HASH</span> (customer_id);

<span class="hljs-comment">-- 2024 - EU region</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> orders_2024_eu <span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">OF</span> orders_2024
    <span class="hljs-keyword">FOR</span> <span class="hljs-keyword">VALUES</span> <span class="hljs-keyword">IN</span> (<span class="hljs-string">'EU'</span>)
    <span class="hljs-keyword">PARTITION BY HASH</span> (customer_id);

<span class="hljs-comment">-- 2024 - US - Hash partitions</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> orders_2024_us_0 <span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">OF</span> orders_2024_us <span class="hljs-keyword">FOR</span> <span class="hljs-keyword">VALUES</span> <span class="hljs-keyword">WITH</span> (MODULUS <span class="hljs-number">2</span>, REMAINDER <span class="hljs-number">0</span>);
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> orders_2024_us_1 <span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">OF</span> orders_2024_us <span class="hljs-keyword">FOR</span> <span class="hljs-keyword">VALUES</span> <span class="hljs-keyword">WITH</span> (MODULUS <span class="hljs-number">2</span>, REMAINDER <span class="hljs-number">1</span>);

<span class="hljs-comment">-- 2024 - EU - Hash partitions</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> orders_2024_eu_0 <span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">OF</span> orders_2024_eu <span class="hljs-keyword">FOR</span> <span class="hljs-keyword">VALUES</span> <span class="hljs-keyword">WITH</span> (MODULUS <span class="hljs-number">2</span>, REMAINDER <span class="hljs-number">0</span>);
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> orders_2024_eu_1 <span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">OF</span> orders_2024_eu <span class="hljs-keyword">FOR</span> <span class="hljs-keyword">VALUES</span> <span class="hljs-keyword">WITH</span> (MODULUS <span class="hljs-number">2</span>, REMAINDER <span class="hljs-number">1</span>);
</code></pre>
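<p>Postgres can print the whole partition hierarchy for you (the <code>pg_partition_tree</code> function is built in since Postgres 12, so it is available on the Postgres 15 cluster defined above):</p>
<pre><code class="lang-pgsql">-- List every partition of orders with its parent and depth
SELECT relid, parentrelid, isleaf, level
FROM pg_partition_tree('orders');
</code></pre>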
<h3 id="heading-bulk-insert-synthetic-data">Bulk insert synthetic data</h3>
<pre><code class="lang-sql"><span class="hljs-comment">-- Generate 1000 random orders</span>
<span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> orders (order_id, customer_id, order_date, region, amount)
<span class="hljs-keyword">SELECT</span> 
    <span class="hljs-comment">-- Generate order IDs between 1000 and 9999</span>
    <span class="hljs-number">1000</span> + <span class="hljs-keyword">floor</span>(random() * <span class="hljs-number">9000</span>)::<span class="hljs-built_in">int</span> <span class="hljs-keyword">AS</span> order_id,
    <span class="hljs-comment">-- Generate customer IDs between 1000 and 9999</span>
    <span class="hljs-number">1000</span> + <span class="hljs-keyword">floor</span>(random() * <span class="hljs-number">9000</span>)::<span class="hljs-built_in">int</span> <span class="hljs-keyword">AS</span> customer_id,    
    <span class="hljs-comment">-- Generate dates in 2024 (to fit the 2024 partition)</span>
    <span class="hljs-built_in">DATE</span> <span class="hljs-string">'2024-01-01'</span> + (<span class="hljs-keyword">floor</span>(random() * <span class="hljs-number">366</span>)::<span class="hljs-built_in">int</span> * <span class="hljs-built_in">INTERVAL</span> <span class="hljs-string">'1 day'</span>) <span class="hljs-keyword">AS</span> order_date,    
    <span class="hljs-comment">-- Randomly select region</span>
    (<span class="hljs-built_in">ARRAY</span>[<span class="hljs-string">'US'</span>, <span class="hljs-string">'EU'</span>])[<span class="hljs-number">1</span> + <span class="hljs-keyword">floor</span>(random() * <span class="hljs-number">2</span>)::<span class="hljs-built_in">int</span>] <span class="hljs-keyword">AS</span> region,    
    <span class="hljs-comment">-- Generate random amounts between 10 and 1000</span>
    <span class="hljs-number">10</span> + <span class="hljs-keyword">floor</span>(random() * <span class="hljs-number">990</span>)::<span class="hljs-built_in">int</span> <span class="hljs-keyword">AS</span> amount
<span class="hljs-keyword">FROM</span> 
    generate_series(<span class="hljs-number">1</span>, <span class="hljs-number">1000</span>) <span class="hljs-keyword">AS</span> i;            <span class="hljs-comment">-- 1k rows</span>
</code></pre>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> * <span class="hljs-keyword">FROM</span> orders
<span class="hljs-keyword">WHERE</span> order_date = <span class="hljs-string">'2024-06-10'</span>
  <span class="hljs-keyword">AND</span> region = <span class="hljs-string">'US'</span>;
</code></pre>
<blockquote>
<p>Querying <code>orders</code> does not require you to know which partition holds the data; Postgres routes the query to the right partition automatically.</p>
</blockquote>
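<p>You can watch partition pruning happen by prefixing the query with <code>EXPLAIN</code>; the plan should only touch the partitions that can contain matching rows:</p>
<pre><code class="lang-pgsql">EXPLAIN SELECT * FROM orders
WHERE order_date = '2024-06-10'
  AND region = 'US';
-- Expect scans of the orders_2024_us_* hash partitions only
</code></pre>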
<h2 id="heading-sharding-with-replication">Sharding with replication</h2>
<p>Sharding splits a large database into small pieces called shards. The shards are then spread across multiple machines so that the database can continue to function even if we lose a few machines. Routing of queries to the proper shard is done by a coordinator, and just like in the replication example, <code>pg-pool</code> does the load balancing within each shard.</p>
<h3 id="heading-types-of-sharding">Types of sharding</h3>
<ol>
<li><p><strong>Row based:</strong> Think of it like splitting a very thick book into many volumes <strong><em>(shards)</em></strong> and creating a new volume just to keep track of the table of contents <strong><em>(coordinator)</em></strong>. Picture a table whose schema is simple but whose row count and write volume have gone crazy. With this method both read and write operations can scale per shard as needed.</p>
</li>
<li><p><strong>Schema based:</strong> Just like last time we are still splitting the book, but this time we take a few related chapters and turn them into a book about a sub-topic. Think of how a very thick physics textbook can be split into Optics, Thermodynamics, and Quantum Mechanics. Picture a table with a large number of columns where you don’t need all the columns every time a query is made. So you split the table into shards such that related columns are placed together.</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1748175349647/636859c6-5d9f-47f7-ab08-2eb4f1e83ac8.png" alt class="image--center mx-auto" /></p>
<blockquote>
<p>Notice the resiliency of this architecture: not only do we have multiple replicas for each shard, but also for the coordinator. As long as we have at least 3 machines running our sharded cluster, the failure of a single machine will not bring down our database.</p>
</blockquote>
<h3 id="heading-setup-stackgres-and-enable-load-balancer-2">Setup StackGres and enable load balancer</h3>
<pre><code class="lang-bash">helm install stackgres-operator stackgres-charts/stackgres-operator \
    --namespace stackgres-operator \
    --create-namespace

minikube addons <span class="hljs-built_in">enable</span> metallb
minikube tunnel
</code></pre>
<h3 id="heading-define-crd-for-sharded-cluster">Define CRD for Sharded Cluster</h3>
<pre><code class="lang-yaml"><span class="hljs-comment"># shard.yaml</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">stackgres.io/v1alpha1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">SGShardedCluster</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">cluster</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">type:</span> <span class="hljs-string">citus</span>
  <span class="hljs-attr">database:</span> <span class="hljs-string">mydatabase</span>
  <span class="hljs-attr">postgres:</span>
    <span class="hljs-attr">version:</span> <span class="hljs-string">'latest'</span>
  <span class="hljs-attr">coordinator:</span>
    <span class="hljs-attr">instances:</span> <span class="hljs-number">2</span> <span class="hljs-comment"># Number of coordinator instances</span>
    <span class="hljs-attr">pods:</span>
      <span class="hljs-attr">persistentVolume:</span>
        <span class="hljs-attr">size:</span> <span class="hljs-string">'1Gi'</span>
  <span class="hljs-attr">shards:</span>
    <span class="hljs-attr">clusters:</span> <span class="hljs-number">3</span> <span class="hljs-comment"># Number of shards</span>
    <span class="hljs-attr">instancesPerCluster:</span> <span class="hljs-number">3</span> <span class="hljs-comment"># 1 primary and 2 replicas</span>
    <span class="hljs-attr">pods:</span>
      <span class="hljs-attr">persistentVolume:</span>
        <span class="hljs-attr">size:</span> <span class="hljs-string">'1Gi'</span>
  <span class="hljs-attr">postgresServices:</span>
    <span class="hljs-attr">coordinator:</span>
      <span class="hljs-attr">primary:</span>
        <span class="hljs-attr">type:</span> <span class="hljs-string">LoadBalancer</span>

  <span class="hljs-attr">profile:</span> <span class="hljs-string">development</span>
</code></pre>
<h3 id="heading-apply-citus-crd">Apply Citus CRD</h3>
<pre><code class="lang-bash">kubectl apply -f ./shard.yaml
</code></pre>
<h3 id="heading-get-credentials-2">Get credentials</h3>
<pre><code class="lang-bash">PG_PASSWORD=$(kubectl -n default get secret cluster --template <span class="hljs-string">'{{ printf "%s" (index .data "superuser-password" | base64decode) }}'</span>)
<span class="hljs-built_in">echo</span> <span class="hljs-string">"The superuser password is: <span class="hljs-variable">$PG_PASSWORD</span>"</span>
</code></pre>
<blockquote>
<p>Your database should now be available at <code>postgresql://postgres:&lt;password&gt;@localhost:5432</code></p>
</blockquote>
<p>Now open an SQL editor like <a target="_blank" href="https://www.pgadmin.org/">pgAdmin</a>, and run the following.</p>
<h3 id="heading-create-some-distributed-table">Create some distributed table</h3>
<pre><code class="lang-pgsql"><span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> users (
    id <span class="hljs-type">BIGINT</span> <span class="hljs-keyword">PRIMARY KEY</span>,
    <span class="hljs-type">name</span> <span class="hljs-type">TEXT</span>
);
<span class="hljs-keyword">SELECT</span> create_distributed_table(<span class="hljs-string">'users'</span>, <span class="hljs-string">'id'</span>);

<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> orders (
    id <span class="hljs-type">BIGINT</span>,
    user_id <span class="hljs-type">BIGINT</span>,
    product_id <span class="hljs-type">BIGINT</span>,
    amount <span class="hljs-type">INTEGER</span>,
    <span class="hljs-keyword">PRIMARY KEY</span> (user_id, id)
);
<span class="hljs-keyword">SELECT</span> create_distributed_table(<span class="hljs-string">'orders'</span>, <span class="hljs-string">'user_id'</span>);

<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> products (
    id <span class="hljs-type">BIGINT</span> <span class="hljs-keyword">PRIMARY KEY</span>,
    <span class="hljs-type">name</span> <span class="hljs-type">TEXT</span>,
    price <span class="hljs-type">NUMERIC</span>
);
<span class="hljs-keyword">SELECT</span> create_reference_table(<span class="hljs-string">'products'</span>);
</code></pre>
<h3 id="heading-insert-some-data">Insert some data</h3>
<pre><code class="lang-pgsql"><span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> users (id, <span class="hljs-type">name</span>) <span class="hljs-keyword">VALUES</span>
(<span class="hljs-number">1</span>, <span class="hljs-string">'Alice'</span>),
(<span class="hljs-number">2</span>, <span class="hljs-string">'Bob'</span>),
(<span class="hljs-number">3</span>, <span class="hljs-string">'Charlie'</span>);

<span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> orders (id, user_id, product_id, amount) <span class="hljs-keyword">VALUES</span>
(<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>),
(<span class="hljs-number">2</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>),
(<span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>),
(<span class="hljs-number">4</span>, <span class="hljs-number">3</span>, <span class="hljs-number">3</span>, <span class="hljs-number">5</span>);
<span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> products (id, <span class="hljs-type">name</span>, price) <span class="hljs-keyword">VALUES</span>
(<span class="hljs-number">1</span>, <span class="hljs-string">'Product A'</span>, <span class="hljs-number">10.00</span>),
(<span class="hljs-number">2</span>, <span class="hljs-string">'Product B'</span>, <span class="hljs-number">20.00</span>),
(<span class="hljs-number">3</span>, <span class="hljs-string">'Product C'</span>, <span class="hljs-number">30.00</span>);
</code></pre>
<h3 id="heading-see-how-shards-are-spread">See how shards are spread</h3>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span> * <span class="hljs-keyword">FROM</span> citus_shards
<span class="hljs-keyword">WHERE</span> <span class="hljs-built_in">table_name</span> = <span class="hljs-string">'orders'</span>::<span class="hljs-type">regclass</span>;
</code></pre>
<h3 id="heading-find-which-node-host-which-shard">Find which node host which shard</h3>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span>
  s.shardid,
  n.nodename,
  n.nodeport
<span class="hljs-keyword">FROM</span> pg_dist_shard s
<span class="hljs-keyword">JOIN</span> pg_dist_shard_placement p <span class="hljs-keyword">ON</span> s.shardid = p.shardid
<span class="hljs-keyword">JOIN</span> pg_dist_node n <span class="hljs-keyword">ON</span> p.nodename = n.nodename
<span class="hljs-keyword">WHERE</span> s.logicalrelid = <span class="hljs-string">'orders'</span>::<span class="hljs-type">regclass</span>;
</code></pre>
<h3 id="heading-find-which-has-a-specific-row">Find which shard holds a specific row</h3>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span> get_shard_id_for_distribution_column(<span class="hljs-string">'orders'</span>, <span class="hljs-number">1</span>);
</code></pre>
<h3 id="heading-join-distributed-distributed-co-located">Join distributed-distributed (co-located)</h3>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span>
    o.id <span class="hljs-keyword">AS</span> order_id,
    u.name <span class="hljs-keyword">AS</span> customer,
    o.amount
<span class="hljs-keyword">FROM</span> orders o
<span class="hljs-keyword">JOIN</span> users u <span class="hljs-keyword">ON</span> o.user_id = u.id;
</code></pre>
<p>This join is efficient because <code>orders</code> and <code>users</code> are sharded on the same key (<code>user_id</code> and <code>id</code> respectively), so matching rows are co-located on the same node.</p>
<h3 id="heading-join-distributed-reference">Join distributed-reference</h3>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span>
    o.id <span class="hljs-keyword">AS</span> order_id,
    u.name <span class="hljs-keyword">AS</span> customer,
    p.name <span class="hljs-keyword">AS</span> product,
    o.amount
<span class="hljs-keyword">FROM</span> orders o
<span class="hljs-keyword">JOIN</span> users u <span class="hljs-keyword">ON</span> o.user_id = u.id
<span class="hljs-keyword">JOIN</span> products p <span class="hljs-keyword">ON</span> o.product_id = p.id;
</code></pre>
<p>This works well because <code>products</code> is a reference table that is replicated to every node, so the join never has to move shard data across the network.</p>
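<p>To see how Citus executes a distributed join, run <code>EXPLAIN</code> on the coordinator. The plan is wrapped in a Citus custom scan with one task per shard:</p>
<pre><code class="lang-pgsql">EXPLAIN SELECT o.id, u.name, o.amount
FROM orders o
JOIN users u ON o.user_id = u.id;
-- Look for "Custom Scan (Citus Adaptive)" and the Task Count
</code></pre>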
<h2 id="heading-references">References</h2>
<ul>
<li><p><a target="_blank" href="https://www.postgresql.org/docs/current/ddl-partitioning.html">Postgres partitioning docs</a></p>
</li>
<li><p><a target="_blank" href="https://stackgres.io/doc/1.2/reference/crd/sgcluster/">StackGres docs</a></p>
</li>
<li><p><a target="_blank" href="https://www.citusdata.com/blog/2023/08/04/understanding-partitioning-and-sharding-in-postgres-and-citus/">Citus data</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[An interactive Guide to HyperLogLog]]></title><description><![CDATA[The problem
Imagine you’re running a large scale online store. Thousands of users visit your website every second, and you want to know how many unique users visit each day. This sounds straightforward just track each user by their IP address or logi...]]></description><link>https://blog.sagyamthapa.com.np/an-interactive-guide-to-hyperloglog</link><guid isPermaLink="true">https://blog.sagyamthapa.com.np/an-interactive-guide-to-hyperloglog</guid><category><![CDATA[hyperloglog]]></category><category><![CDATA[interactive_blog]]></category><category><![CDATA[algorithms]]></category><dc:creator><![CDATA[Sagyam Thapa]]></dc:creator><pubDate>Mon, 16 Dec 2024 19:13:28 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1734202494700/1652c5f4-f19a-4338-bc54-54cd6747f374.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-the-problem">The problem</h2>
<p>Imagine you’re running a large scale online store. Thousands of users visit your website every second, and you want to know how many unique users visit each day. This sounds straightforward: just track each user by their IP address or login ID. But here’s the catch: keeping a list of every unique user requires a lot of memory, especially as the number of users grows into the billions.</p>
<p>How do you solve this problem without drowning in memory usage? This is where <strong>HyperLogLog</strong>, a probabilistic data structure, comes into play.</p>
<h2 id="heading-the-solution">The solution</h2>
<p>HyperLogLog (HLL) is a clever algorithm that provides an approximate count of unique items <strong><em>(billions of items)</em></strong> while using a fraction of the memory <strong><em>(about 1.5 kB)</em></strong> required by exact methods. It achieves this by trading off a small amount of accuracy for significant space savings.</p>
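<p>If you just want to use HyperLogLog rather than build it, Redis ships it out of the box. A quick sketch with <code>redis-cli</code> (the key name <code>daily_visitors</code> is just an example):</p>
<pre><code class="lang-bash"># Add elements to the sketch (duplicates are ignored)
redis-cli PFADD daily_visitors 203.0.113.5 203.0.113.6 203.0.113.5

# Approximate number of unique elements seen so far
redis-cli PFCOUNT daily_visitors
</code></pre>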
<h2 id="heading-play-with-hyperloglog">Play with HyperLogLog</h2>
<p>I have created a fun little <a target="_blank" href="https://tools.sagyamthapa.com.np/hyperloglog">app</a> that lets you play with HyperLogLog. Here is how the app works:</p>
<ul>
<li><p><strong>Input IP Address</strong>:</p>
<ul>
<li><p>Click <strong>"Random IP"</strong> to generate IP address automatically.</p>
</li>
<li><p>Add the entered IP to the HyperLogLog by clicking <strong>"Add to HLL"</strong>.</p>
</li>
</ul>
</li>
<li><p><strong>Adjust Bucket Count</strong>:</p>
<ul>
<li><p>Use the slider to adjust the number of buckets.</p>
</li>
<li><p>This resets the HyperLogLog and clears all previous data.</p>
</li>
</ul>
</li>
<li><p><strong>Add Multiple Random IPs</strong>:</p>
<ul>
<li>Use the preset buttons to add 1K, 5K, 10K, 50K, or 100K random IP addresses to the HyperLogLog.</li>
</ul>
</li>
<li><p><strong>View Metrics</strong>:</p>
<ul>
<li><p>Check <strong>Actual Count</strong>, <strong>Estimated Count</strong>, <strong>Difference</strong>, <strong>Margin of Error</strong>, and <strong>Actual Error</strong> in the metrics cards.</p>
</li>
<li><p>Accurate metrics are displayed as the HyperLogLog processes the inputs.</p>
</li>
</ul>
</li>
<li><p><strong>Inspect Buckets</strong>:</p>
<ul>
<li>Scroll through individual buckets to observe how HyperLogLog distributes and calculates run lengths.</li>
</ul>
</li>
</ul>
<div class="hn-embed-widget" id="hyper-log-log"></div><p> </p>
<h2 id="heading-things-to-notice">Things to notice</h2>
<ul>
<li><p>As you increase the number of buckets, the <strong><em>Margin of error</em></strong> shrinks. <em>This is because a few buckets may get unlucky and see a long run early on, but when you spread that luck over a large number of buckets, the chance of such mistakes drops.</em> (Kinda like how insurance works.)</p>
</li>
<li><p>The error never actually reaches zero, because this is a probabilistic algorithm, i.e. unexpected wild swings are always possible.</p>
</li>
<li><p>See what happens to the estimate if you skimp on the number of buckets.</p>
</li>
</ul>
<h2 id="heading-working">Working</h2>
<ol>
<li><h3 id="heading-hash-functions-and-uniform-distribution">Hash Functions and Uniform Distribution</h3>
<p> If you have read this far, I trust you know how a hash function works. To refresh your memory:</p>
<p> Hashing involves using a hash function to convert input data (like an IP address) into a fixed size output, often a number. Good hash functions are deterministic <em>(they always produce the same output for the same input)</em> and uniformly distribute outputs across the possible range.</p>
</li>
<li><h3 id="heading-leading-zeros-and-cardinality">Leading Zeros and Cardinality</h3>
<p> The core insight is that for a uniform random hash value:</p>
<ul>
<li><p>The probability of encountering a hash value with at least \(k\) leading zeros in its binary representation is \(2^{-k}\).<br />  Example: For \(k=3\), the binary representation must start with \(000\), which occurs \(2^{-3} = \frac{1}{8}\) of the time.</p>
</li>
<li><p>The expected maximum run of leading zeros in the hash values increases logarithmically with the number of distinct elements \(n\) in the dataset.</p>
</li>
</ul>
</li>
</ol>
<ol start="3">
<li><h3 id="heading-bucketing-and-parallelism">Bucketing and Parallelism</h3>
</li>
</ol>
<p>To reduce variance and improve accuracy, the HyperLogLog algorithm splits the hash values into \(m=2^p\) buckets (where \(p\) is a tunable parameter).</p>
<ul>
<li><p>Each bucket is determined by the first \(p\) bits of the hash value, which serve as the <strong>bucket index</strong>.</p>
</li>
<li><p>The remaining bits of the hash value are used to compute the number of leading zeros for that bucket.</p>
</li>
<li><p>Each bucket keeps track of the <strong>maximum number of leading zeros</strong> observed for hash values assigned to it.</p>
</li>
</ul>
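<p>A small worked example (illustrative bit pattern, with \(p = 4\)): suppose an IP address hashes to the binary value</p>
<p>$$\underbrace{1011}_{\text{bucket index } = 11}\;\underbrace{0000\,1101\ldots}_{\text{4 leading zeros}}$$</p>
<p>The first 4 bits send this value to bucket 11, and the remaining bits start with a run of 4 zeros, so the bucket stores \(M[11] = \max(M[11], 4)\). (Implementations vary in whether they record the count of leading zeros or the position of the first 1-bit.)</p>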
<ol start="4">
<li><h3 id="heading-harmonic-mean-of-maximum-leading-zeros">Harmonic Mean of Maximum Leading Zeros</h3>
<p> Each bucket contributes an estimate of the cardinality based on the leading zeros it observes. Since these estimates can vary significantly, HyperLogLog uses the <strong>harmonic mean</strong> of these estimates to combine the results:</p>
</li>
</ol>
<p>$$E = α_m \cdot m^2 \cdot \left(\sum_{j=1}^{m} 2^{-M[j]} \right)^{-1}$$</p><p>Where:</p>
<ul>
<li><p>\(m=2^p\): Number of buckets.</p>
</li>
<li><p>\(M[j]\): Maximum number of leading zeros observed in the \(j-th\) bucket.</p>
</li>
<li><p>\(α_m\)​: A bias correction constant dependent on \(m\) derived empirically.</p>
</li>
</ul>
<ol start="5">
<li><h3 id="heading-bias-correction-and-range-adjustment">Bias Correction and Range Adjustment</h3>
<p> The raw estimate \(E\) can be biased for small or large cardinalities. HyperLogLog applies bias correction in the following ways:</p>
 <p> 1. <strong>Small Range Correction</strong>: If \(E\) is small (\(E \leq \frac{5}{2}m\)), it applies a correction to handle the underestimation caused by hash collisions:</p>
</li>
</ol>
<p>$$E_{\text{corrected}} = m \cdot \log \left( \frac{m}{V} \right)$$</p><p>Where \(V\) is the number of empty buckets.</p>
<ol start="2">
<li><p><strong>Large Range Correction</strong>: If \(E\) exceeds a threshold (typically when \(n\) approaches \(2^{32}\)), the algorithm applies the correction \(E^* = -2^{32} \log\left(1 - \frac{E}{2^{32}}\right)\) to compensate for hash collisions in the 32-bit hash space.</p>
</li>
</ol>
<ol start="6">
<li><h3 id="heading-error-and-memory-efficiency">Error and Memory Efficiency</h3>
</li>
</ol>
<p>The relative error of HyperLogLog is approximately:</p>
<p>$$Error \approx \frac{1.04}{\sqrt{m}}$$</p><ul>
<li><p>Larger \(m\) (more buckets) reduces error but increases memory usage.</p>
</li>
<li><p>Memory usage is proportional to \(m \log_2(\log_2(n))\) bits, making HyperLogLog extremely space efficient.</p>
</li>
</ul>
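<p>Plugging in a few values of \(m\) makes the trade-off concrete (example numbers of my own):</p>

```python
import math

# Relative error shrinks as 1.04 / sqrt(m), so quadrupling the bucket
# count halves the error.
for p in (10, 14, 16):
    m = 2 ** p
    print(f"p={p:2d}: m={m:6d} buckets, error ~ {1.04 / math.sqrt(m):.2%}")
```

<p>For instance, \(m = 2^{14} = 16384\) buckets give a standard error of about 0.81%.</p>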
<ol start="7">
<li><h3 id="heading-intuition-behind-logarithmic-behavior">Intuition Behind Logarithmic Behavior</h3>
</li>
</ol>
<p>The logarithmic behavior of leading zeros stems from the exponential relationship between probabilities and cardinalities:</p>
<ul>
<li><p>As the cardinality \(n\) grows, the maximum number of leading zeros observed tends to grow as \(\log_2 n\), since each additional leading zero is half as likely to occur.</p>
</li>
<li><p>HyperLogLog aggregates these local estimates (per bucket) and normalizes them using the harmonic mean, resulting in a robust global estimate.</p>
</li>
</ul>
<h2 id="heading-where-its-used">Where it’s used</h2>
<ul>
<li><p><strong>Web Analytics</strong>: Google Analytics and YouTube use algorithms similar to HyperLogLog to estimate unique visitors.</p>
</li>
<li><p><strong>Databases</strong>: <a target="_blank" href="https://docs.timescale.com/use-timescale/latest/hyperfunctions/approx-count-distincts/hyperloglog/">TimescaleDB</a> and <a target="_blank" href="https://redis.io/docs/latest/develop/data-types/probabilistic/hyperloglogs/">Redis</a> implement HyperLogLog for approximate distinct counts.</p>
</li>
<li><p><strong>Big Data Platforms</strong>: <a target="_blank" href="https://druid.apache.org/docs/latest/querying/hll-old#hyperunique-aggregator">Apache Druid</a> and <a target="_blank" href="https://prestodb.io/docs/current/functions/hyperloglog.html#">Presto</a> use HyperLogLog to provide fast, approximate query results.</p>
</li>
</ul>
<h2 id="heading-downsides">Downsides</h2>
<p>While HyperLogLog is powerful, it’s important to understand its limitations:</p>
<ul>
<li><p><strong>Approximation</strong>: The algorithm provides an estimate, not an exact count. The error rate is about \(\frac{1.04}{\sqrt{m}}\).</p>
</li>
<li><p><strong>Hash Collisions</strong>: The accuracy depends on the quality of the hash function. Poor hashing can lead to inaccuracies.</p>
</li>
</ul>
<h2 id="heading-further-reading">Further reading</h2>
<p><a target="_blank" href="https://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf">HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm</a></p>
<p><a target="_blank" href="https://engineering.fb.com/2018/12/13/data-infrastructure/hyperloglog/">Engineering at Meta: HyperLogLog in Presto: A significantly faster way to handle cardinality estimation</a></p>
<p><a target="_blank" href="https://redis.io/docs/latest/develop/data-types/probabilistic/hyperloglogs/">Redis Docs</a></p>
]]></content:encoded></item><item><title><![CDATA[An Interactive Guide to Bloom Filter]]></title><description><![CDATA[Introduction
Bloom filter is space efficient probabilistic data structure that can tell if a given element is already present in a database. It saves us from doing an expensive query to our database. While Bloom filters can guarantee that an element ...]]></description><link>https://blog.sagyamthapa.com.np/an-interactive-guide-to-bloom-filter</link><guid isPermaLink="true">https://blog.sagyamthapa.com.np/an-interactive-guide-to-bloom-filter</guid><category><![CDATA[interactive_blog]]></category><category><![CDATA[bloom filter]]></category><category><![CDATA[data structures]]></category><dc:creator><![CDATA[Sagyam Thapa]]></dc:creator><pubDate>Sun, 01 Dec 2024 16:56:15 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1732723787386/a448c312-6aa9-4b3d-884a-1b22f1abfc10.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-introduction">Introduction</h1>
<p>A Bloom filter is a <strong>space-efficient probabilistic data structure</strong> that can tell if a given element is already present in a database. It saves us from doing an expensive query to our database. <strong>While Bloom filters can guarantee that an element is not in the set, they cannot guarantee its presence.</strong> Instead, they can sometimes return false positives, indicating an element is in the set when it is not, but they never return false negatives.</p>
<h1 id="heading-problem">Problem</h1>
<p>Before diving into how Bloom filters work, let’s consider the problem they solve. Imagine you run a website that needs to process thousands or even millions of requests every second. One of your tasks is to check whether the IP address making a request is in a list of banned IPs.</p>
<p>If you store this list in a traditional database or an in-memory data structure like a hash table, every lookup will consume resources, and the cost will grow with the size of the list. For every incoming request, you’ll have to query the database or search through the list, which could severely impact the website’s performance.</p>
<p>Wouldn’t it be great if there was a magic solution to quickly determine whether an IP address is banned in constant time without querying the database? Enter Bloom filters.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733064544669/86d95e4f-ecf8-4cbc-9933-27493f2a9255.png" alt class="image--center mx-auto" /></p>
<h1 id="heading-prerequisite">Prerequisite</h1>
<h2 id="heading-hashing">Hashing</h2>
<p>To understand Bloom filters, you need to be familiar with the concept of <strong>hashing</strong>. Hashing involves using a hash function to convert input data (like an IP address) into a fixed size output, often a number. Good hash functions are deterministic (they always produce the same output for the same input) and uniformly distribute outputs across the possible range.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733025523446/548bdb33-99e4-4c0d-b6ed-a53b0a22e3dc.png" alt="Diagram illustrating a hash function mapping keys to hash values. &quot;1.1.1.1&quot; maps to &quot;00,&quot; &quot;2.2.2.2&quot; and &quot;3.3.3.3&quot; map to &quot;05,&quot; showing a hash collision. &quot;4.4.4.4&quot; maps to &quot;05.&quot; Hash values range from &quot;00&quot; to &quot;06.&quot;" class="image--center mx-auto" /></p>
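<p>For intuition, here is a tiny hashing sketch in Python matching the figure above. The function name and table size of 7 are my own choices; any good hash function works:</p>

```python
import hashlib

def hash_ip(ip: str, table_size: int = 7) -> int:
    # Deterministic: the same IP always maps to the same slot;
    # different IPs may collide, as in the figure.
    digest = hashlib.md5(ip.encode()).digest()
    return int.from_bytes(digest[:8], "big") % table_size

print(hash_ip("1.1.1.1") == hash_ip("1.1.1.1"))  # True: deterministic
```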
<h1 id="heading-working">Working</h1>
<p>Bloom filters address the problem of quickly checking membership by using multiple hash functions and a bit array. Here’s how it works:</p>
<ol>
<li><p><strong>Initialization</strong>: A Bloom filter uses a fixed size bit array <code>(m)</code>, initially set to all zeros. It also uses <code>k</code> independent hash functions.</p>
</li>
<li><p><strong>Adding an element</strong>:</p>
<ul>
<li><p>To add an element, it is passed through all <code>k</code> hash functions.</p>
</li>
<li><p>Each hash function maps the element to a position in the bit array, and the corresponding bits at these positions are set to 1.</p>
</li>
</ul>
</li>
<li><p><strong>Checking for membership</strong>:</p>
<ul>
<li><p>To check if an element is in the set, the element is hashed with the same <code>k</code> hash functions.</p>
</li>
<li><p>If all the bits at the positions indicated by the hash functions are set to 1, the filter reports that the element <em>might</em> be in the set.</p>
</li>
<li><p>If any of these bits are 0, the element is definitely not in the set.</p>
</li>
</ul>
</li>
</ol>
<p>This design ensures that the Bloom filter is both space efficient and fast. However, there is a trade off: the possibility of <strong>false positives</strong>, which occurs when the bits set by other elements overlap, making it appear that an element is in the set when it is not.</p>
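<p>The three steps above can be sketched as a minimal Bloom filter in Python. This is an illustrative toy, not a production implementation; the <code>k</code> hash functions are simulated by salting a single SHA-256 hash:</p>

```python
import hashlib

class BloomFilter:
    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.bits = [0] * m          # fixed-size bit array, all zeros

    def _positions(self, item: str):
        # Simulate k independent hash functions by salting one hash.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1       # set all k positions to 1

    def might_contain(self, item: str) -> bool:
        # Any zero bit -> definitely absent; all ones -> maybe present.
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter(m=64, k=3)
bf.add("1.1.1.1")
print(bf.might_contain("1.1.1.1"))   # True: no false negatives
```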
<p>I have created a fun little <a target="_blank" href="https://tools.sagyamthapa.com.np/bloom-filter">app</a> that lets you play with a Bloom filter.</p>
<div class="hn-embed-widget" id="bloom-filter"></div><p> </p>
<ul>
<li><p>See what happens when you fill the filter with all ones.</p>
</li>
<li><p>Can you get the Bloom filter to return a false positive?</p>
</li>
<li><p>Notice how increasing the number of hash functions fills up the filter faster.</p>
</li>
<li><p>Notice how decreasing the number of hash functions affects the false positive probability.</p>
</li>
</ul>
<h1 id="heading-tuning">Tuning</h1>
<p>I have made another fun little <a target="_blank" href="https://tools.sagyamthapa.com.np/bloom-calculator">app</a> that lets you play around with the parameters of a Bloom filter.</p>
<ul>
<li><p>Number of elements <code>N</code></p>
</li>
<li><p>Size of filter <code>M</code></p>
</li>
<li><p>Number of hash functions <code>K</code></p>
</li>
</ul>
<div class="hn-embed-widget" id="bloom-calculator"></div><p> </p>
<p>Some conclusions you can draw from the graphs, for a well designed filter:</p>
<ul>
<li><p>False positive rate vs. number of items follows a logistic curve.</p>
</li>
<li><p>False positive rate vs. number of hash functions follows a J curve.</p>
</li>
<li><p>False positive rate vs. filter size follows a downward sloping line.</p>
</li>
</ul>
<h1 id="heading-formulae">Formulae</h1>
<h3 id="heading-1-probability-of-a-false-positive">1. Probability of a False Positive</h3>
<p>The probability of a false positive in a Bloom Filter is given by:</p>
<p>$$P \approx \left( 1 - e^{- \frac{kn}{m}} \right)^k$$</p><p>Where:</p>
<ul>
<li><p><code>m</code>: Number of bits in the Bloom Filter.</p>
</li>
<li><p><code>k</code>: Number of hash functions.</p>
</li>
<li><p><code>n</code>: Number of elements inserted into the filter.</p>
</li>
</ul>
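<p>These formulas are easy to evaluate directly. A quick sketch in Python, using hypothetical sizing numbers of my own (the optimal-\(k\) rule of thumb \(k = \frac{m}{n} \ln 2\) is used to pick the hash count):</p>

```python
import math

def false_positive_rate(m: int, k: int, n: int) -> float:
    # P ~ (1 - e^(-kn/m))^k
    return (1 - math.exp(-k * n / m)) ** k

def optimal_k(m: int, n: int) -> float:
    # k = (m/n) * ln 2 minimizes the false positive rate.
    return (m / n) * math.log(2)

# Hypothetical sizing: 10,000 banned IPs in a 96,000-bit filter.
m, n = 96_000, 10_000
k = round(optimal_k(m, n))   # ~ 9.6 * ln 2 ~ 6.65 -> 7 hash functions
print(k, false_positive_rate(m, k, n))  # roughly a 1% false positive rate
```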
<h3 id="heading-2-optimal-number-of-hash-functions">2. Optimal Number of Hash Functions</h3>
<p>The optimal number of hash functions <code>k</code>, to minimize the false positive rate, is:</p>
<p>$$k = \frac{m}{n} \ln 2$$</p><h3 id="heading-3-expected-fraction-of-bits-set-to-1">3. Expected Fraction of Bits Set to 1</h3>
<p>The fraction <code>f</code> of bits in the Bloom Filter that are set to 1 after <code>n</code> insertions is:</p>
<p>$$f = 1 - \left( 1 - \frac{1}{m} \right)^{kn}$$</p><h1 id="heading-improvements">Improvements</h1>
<ul>
<li><h3 id="heading-cuckoo-filters">Cuckoo Filters:</h3>
<p>  A lighter and faster version that allows you to delete an inserted item. It collects fingerprint of inserted item and stores it in an array of buckets. Works great for application needing frequency counts or large scale de-duplication.</p>
</li>
<li><h3 id="heading-counting-bloom-filters">Counting Bloom filters</h3>
<p>  Uses a <strong>counter array</strong>, where each position in the array is a small integer. Lookup and delete are performed by incrementing and decrementing the positions in the array. Works great for application with high performance and low memory usage</p>
</li>
</ul>
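<p>A counting Bloom filter along these lines can be sketched in Python (illustrative only; a real implementation would use fixed-width counters and guard against overflow):</p>

```python
import hashlib

class CountingBloomFilter:
    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.counters = [0] * m      # small integers instead of single bits

    def _positions(self, item: str):
        # Simulate k independent hash functions by salting one hash.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.counters[pos] += 1  # increment on insert

    def remove(self, item: str) -> None:
        for pos in self._positions(item):
            if self.counters[pos] > 0:
                self.counters[pos] -= 1  # decrement on delete

    def might_contain(self, item: str) -> bool:
        return all(self.counters[pos] > 0 for pos in self._positions(item))

cbf = CountingBloomFilter(m=64, k=3)
cbf.add("2.2.2.2")
cbf.remove("2.2.2.2")
print(cbf.might_contain("2.2.2.2"))  # False: deletion is now possible
```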
<h1 id="heading-application">Application</h1>
<p>Bloom filters have a wide range of applications, including:</p>
<ul>
<li><p><strong>Database Query Optimization</strong>: Reduce database lookups by quickly discarding queries for non-existent elements.</p>
</li>
<li><p><strong>Web Caching</strong>: Check if a URL is cached before attempting to fetch it.</p>
</li>
<li><p><strong>Spam Detection</strong>: Quickly determine whether an email sender is blacklisted.</p>
</li>
<li><p><strong>Distributed Systems</strong>: Identify duplicate data or requests in distributed storage and processing systems.</p>
</li>
</ul>
<h1 id="heading-further-reading">Further reading</h1>
<ul>
<li><p><a target="_blank" href="https://blog.cloudflare.com/when-bloom-filters-dont-bloom">When Bloom filters don't bloom</a></p>
</li>
<li><p><a target="_blank" href="https://en.wikipedia.org/wiki/Bloom_filter">Wikipedia</a></p>
</li>
<li><p><a target="_blank" href="https://www.jasondavies.com/bloomfilter/">JasonDavies</a></p>
</li>
<li><p><a target="_blank" href="https://samwho.dev/bloom-filters/">SamWhoCode</a></p>
</li>
</ul>
]]></content:encoded></item></channel></rss>