
What is the difference between HugeTLB and THP?

Published in Linux Memory Management

HugeTLB and Transparent Huge Pages (THP) are both Linux mechanisms for using huge pages: memory pages larger than the standard 4KB size, commonly 2MB or 1GB. The fundamental difference lies in how the pages are allocated and managed. HugeTLB requires pre-allocation by the administrator and an explicit request from the application, and in return offers strong guarantees. THP operates transparently: the kernel tries to use huge pages on a best-effort basis, without application changes.

Understanding Huge Pages

Memory management in operating systems relies on dividing physical memory into pages. Standard page sizes are typically 4KB. While efficient for general use, smaller pages can lead to higher Translation Lookaside Buffer (TLB) miss rates for applications with large memory footprints. A TLB is a CPU cache that stores recent virtual-to-physical address translations. More TLB misses mean more time spent looking up addresses in the main page tables, reducing performance.

Huge pages address this by using larger page sizes, reducing the number of TLB entries required for a given memory region, thereby improving TLB hit rates and overall application performance, especially for memory-intensive workloads.
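The reduction is easy to quantify. As a rough illustration, the sketch below counts how many pages (and so how many distinct translations) are needed to map a region at each page size. The function name pages_needed and the 1 GiB region are assumptions chosen for the example, not anything defined by the kernel.

```shell
# pages_needed REGION_BYTES PAGE_BYTES
# Prints how many pages of PAGE_BYTES are needed to cover REGION_BYTES
# (ceiling division). Hypothetical helper for illustration only.
pages_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

region=$(( 1024 * 1024 * 1024 ))                  # example region: 1 GiB

pages_needed "$region" $(( 4 * 1024 ))            # 4KB pages
pages_needed "$region" $(( 2 * 1024 * 1024 ))     # 2MB pages
pages_needed "$region" $(( 1024 * 1024 * 1024 ))  # 1GB pages
```

For a 1 GiB region this works out to 262144 pages at 4KB, 512 pages at 2MB, and a single page at 1GB, which is why a fixed-size TLB covers far more memory when huge pages are in use.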

HugeTLB (Explicit Huge Pages)

HugeTLB pages, also known as static or explicit huge pages, are pre-allocated and reserved by the system administrator, either at boot time or at runtime. Applications must request these pages explicitly, for example by calling mmap with the MAP_HUGETLB flag, by calling shmget with the SHM_HUGETLB flag, or by mapping files from a mounted hugetlbfs filesystem.

Characteristics of HugeTLB:

  • Explicit Allocation: Applications must be modified or configured to specifically request memory from the HugeTLB pool.
  • Pre-allocated: A fixed number of huge pages are reserved in physical RAM by the kernel. This memory cannot be used for anything else.
  • Guaranteed Usage: When an application successfully allocates memory using HugeTLB, it is guaranteed to use huge pages. The allocated memory is always aligned to the huge page size, ensuring optimal performance characteristics.
  • Persistent: Pages in the HugeTLB pool stay reserved until the administrator shrinks the pool or the system is rebooted, and they are not swapped out.
  • Specific Sizes: While the default is often 2MB, HugeTLB can be configured to use different huge page sizes (e.g., 1GB on architectures that support it).
  • Use Cases: Ideal for applications that benefit significantly from predictable memory access patterns and require large, contiguous memory regions, such as high-performance computing (HPC) applications, large databases (e.g., Oracle, SAP HANA), and in-memory caches.
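To see which huge page sizes a given machine offers, and how many pages of each size are reserved, one option is to read the per-size directories under /sys/kernel/mm/hugepages. The following is a minimal sketch that assumes a kernel built with HugeTLB support, where each supported size has a hugepages-<size>kB directory containing nr_hugepages and free_hugepages files. The function name list_hugepage_sizes and its optional directory argument are hypothetical conveniences.

```shell
# list_hugepage_sizes [SYSFS_DIR]
# Prints one line per supported HugeTLB page size with its reserved and
# free page counts. SYSFS_DIR defaults to the usual sysfs location; the
# argument exists only so the function can be pointed at a test directory.
list_hugepage_sizes() {
    base=${1:-/sys/kernel/mm/hugepages}
    for dir in "$base"/hugepages-*kB; do
        [ -d "$dir" ] || continue          # no HugeTLB support or no match
        size_kb=${dir##*hugepages-}        # e.g. "2048kB"
        size_kb=${size_kb%kB}              # e.g. "2048"
        echo "${size_kb} kB: $(cat "$dir/nr_hugepages") reserved, $(cat "$dir/free_hugepages") free"
    done
}

list_hugepage_sizes
```

On a machine without HugeTLB support the function simply prints nothing.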

Configuration Example:

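To configure HugeTLB, you typically set the hugepages= kernel boot parameter or adjust vm.nr_hugepages at runtime, either with sysctl or by writing to /proc/sys/vm/nr_hugepages as root:

# Reserve 100 huge pages of the default size (2MB each here, 200MB total); requires root
echo 100 > /proc/sys/vm/nr_hugepages

# Verify: the kernel may reserve fewer pages than requested if memory is fragmented
grep HugePages /proc/meminfo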
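The verification step can be condensed into a one-line summary. The sketch below reads the HugePages_Total, HugePages_Free, and Hugepagesize fields from /proc/meminfo and derives the pool size in MB. The function name hugetlb_summary and the optional file argument are assumptions made for illustration.

```shell
# hugetlb_summary [MEMINFO_FILE]
# Summarizes the default-size HugeTLB pool from meminfo-style input.
# MEMINFO_FILE defaults to /proc/meminfo; fields that are missing count as 0.
hugetlb_summary() {
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free = $2 }
        /^Hugepagesize:/    { size_kb = $2 }
        END {
            printf "total=%d free=%d page_kb=%d pool_mb=%d\n",
                   total, free, size_kb, total * size_kb / 1024
        }
    ' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    hugetlb_summary
fi
```

If total comes back lower than the number you wrote to nr_hugepages, the kernel could not find enough contiguous memory, which is one reason pools are often reserved at boot.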

Transparent Huge Pages (THP)

Transparent Huge Pages (THP) is a kernel feature designed to automate the use of huge pages without requiring application changes. The kernel attempts to transparently allocate and manage huge pages for application memory, reducing the administrative overhead associated with HugeTLB.

Characteristics of THP:

  • Transparent Allocation: THP operates in the background. The kernel tries to merge smaller pages into huge pages or allocate huge pages directly for applications, often without their knowledge.
  • Dynamic Management: Unlike HugeTLB, THP pages are not pre-allocated in a fixed pool. The kernel dynamically promotes normal 4KB pages to huge pages or allocates huge pages on demand. This means THP can initially use normal pages and convert them into huge pages later if possible and beneficial.
  • Default Size Only: THP primarily uses a single huge page size, typically 2MB on x86-64, rather than the larger sizes (such as 1GB) that HugeTLB can be configured to use.
  • Best-Effort Basis: While THP aims to use huge pages, it might revert to using normal 4KB pages if contiguous memory for a huge page is not available, or if memory fragmentation makes it difficult.
  • Potential for Fragmentation: Because THP dynamically manages pages, it can sometimes contribute to memory fragmentation, or its attempts to create huge pages might fail if sufficient contiguous memory isn't available.
  • Use Cases: Suitable for a broader range of applications, including general-purpose workloads, Java Virtual Machines (JVMs), web servers, and virtual machines, where manual tuning for HugeTLB might be impractical.
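Because THP is best-effort, it is worth checking how much memory is actually backed by huge pages. One way is the AnonHugePages field in /proc/meminfo, which reports anonymous memory currently mapped with transparent huge pages. The sketch below is a minimal example; the function name thp_anon_mb and its optional file argument are hypothetical.

```shell
# thp_anon_mb [MEMINFO_FILE]
# Prints the amount of anonymous memory backed by THP, in MB, taken from
# the AnonHugePages field. MEMINFO_FILE defaults to /proc/meminfo; a
# missing field counts as 0.
thp_anon_mb() {
    awk '
        /^AnonHugePages:/ { kb = $2 }
        END { printf "%d\n", kb / 1024 }
    ' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    thp_anon_mb
fi
```

A value that stays near zero under a memory-heavy workload suggests that THP is disabled, limited to madvise regions, or failing to find contiguous memory.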

Configuration Example:

THP is configured through /sys/kernel/mm/transparent_hugepage/enabled, which accepts one of three values:

  • always: The kernel tries to use THP for any eligible memory region.
  • madvise: The kernel uses THP only for memory regions that the application has marked with madvise(MADV_HUGEPAGE).
  • never: Disables THP.

# Check current THP setting (the active value is shown in brackets)
cat /sys/kernel/mm/transparent_hugepage/enabled

# Disable THP (e.g., for specific database workloads); requires root and does not persist across reboots
echo never > /sys/kernel/mm/transparent_hugepage/enabled
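The enabled file lists all three values and marks the active one with square brackets, for example "always [madvise] never". For scripts that need just the active mode, a minimal sketch follows. The function name thp_mode and its optional file argument are assumptions for illustration.

```shell
# thp_mode [ENABLED_FILE]
# Prints the active THP mode, i.e. the word inside square brackets.
# ENABLED_FILE defaults to the standard sysfs path.
thp_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' \
        "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

if [ -r /sys/kernel/mm/transparent_hugepage/enabled ]; then
    thp_mode
fi
```

A startup script could compare this output against the mode a workload expects and warn on a mismatch, rather than changing the system-wide setting silently.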

Key Differences Summarized

| Feature | HugeTLB (Explicit Huge Pages) | Transparent Huge Pages (THP) |
|---|---|---|
| Management | Manual (administrator) & Explicit (application) | Automatic (kernel) & Transparent (application) |
| Allocation | Pre-allocated, reserved pool | Dynamic, on-demand, attempts to convert or allocate |
| Application Change | Required (e.g., mmap(MAP_HUGETLB)) | Not required |
| Guaranteed Huge Pages | Always uses huge pages if allocation succeeds | Can use normal pages and convert later; best-effort |
| Memory Alignment | Always guaranteed to be aligned to huge page size | Not always guaranteed initially; alignment happens during conversion or allocation |
| Page Sizes | Configurable (e.g., 2MB, 1GB if supported) | Typically uses the default 2MB size only |
| Contiguity | High guarantee of contiguous blocks | Can suffer from fragmentation, may revert to 4KB pages |
| Overhead | Higher setup overhead, minimal runtime overhead | Lower setup overhead, some runtime overhead for compaction |
| Use Cases | HPC, large databases, high-performance caches | General purpose, JVMs, web servers |

Choosing Between HugeTLB and THP

The choice between HugeTLB and THP depends on the specific workload and administrative preferences:

  • Choose HugeTLB for applications that:
    • Are highly sensitive to memory performance and predictability.
    • Have a large, stable memory footprint.
    • Benefit from guaranteed huge page usage and alignment.
    • Can be explicitly configured to use HugeTLB (e.g., database software with mmap options).
    • Justify the administrative cost of managing pre-allocation and the risk of reserved memory going unused.
  • Choose THP for applications that:
    • Do not require explicit huge page allocation or cannot be modified.
    • Benefit from reduced TLB misses but can tolerate dynamic behavior.
    • Are general-purpose and don't require extreme performance guarantees from huge pages.
    • Suit a "set and forget" approach to huge page management.
    • Are experiencing performance issues related to TLB misses without explicit huge page configuration.

In some cases, the documentation for specific software, particularly certain databases, recommends disabling THP because of potential performance variability or interference with the software's own memory management. Understanding the implications for your specific software stack is crucial.