
How to extract a number from a string in Java?


Extracting a number from a string in Java comes down to two cases. If the string is purely numeric, parse it directly with the wrapper class methods. If the number is embedded in other text, which is the more common case, use a regular expression to locate the numeric part first and then parse the match.

1. Parsing a Pure Numeric String

If your string contains only a numeric value (e.g., "123", "45.67") without any other characters, you can directly parse it into the desired numeric type using Java's wrapper class methods.

For Integers: Integer.parseInt()

Use Integer.parseInt() to convert a string representing a whole number into an int.

public class StringToIntExample {
    public static void main(String[] args) {
        String str = "123";
        int num = Integer.parseInt(str);
        System.out.println("Extracted integer: " + num); // Output: Extracted integer: 123
    }
}

Key Points:

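  • This method throws a NumberFormatException if the string is not a valid integer (e.g., "123a" or "abc"). The same applies to null, to strings with leading or trailing whitespace (e.g., " 123"), and to values outside the int range.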
  • For larger whole numbers that might exceed the int range, use Long.parseLong().
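
The two points above can be combined into a small helper. This is only a sketch: the class name SafeWholeNumberParse and the method tryParseLong are hypothetical names chosen for illustration, and returning an OptionalLong instead of throwing is one design choice among several.

```java
import java.util.OptionalLong;

class SafeWholeNumberParse {
    // Hypothetical helper: returns an empty OptionalLong instead of throwing.
    static OptionalLong tryParseLong(String s) {
        if (s == null) {
            return OptionalLong.empty();
        }
        try {
            // trim() first, because parseLong rejects surrounding whitespace
            return OptionalLong.of(Long.parseLong(s.trim()));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParseLong("123"));        // OptionalLong[123]
        System.out.println(tryParseLong("9999999999")); // OptionalLong[9999999999] (too big for int)
        System.out.println(tryParseLong(" 42 "));       // OptionalLong[42]
        System.out.println(tryParseLong("123a"));       // OptionalLong.empty
    }
}
```

Callers can then decide what a missing value means, for example with tryParseLong(input).orElse(0L).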

For Floating-Point Numbers: Double.parseDouble() or Float.parseFloat()

For strings representing decimal numbers, use Double.parseDouble() for double values or Float.parseFloat() for float values. double is generally preferred for its higher precision in most applications.

public class StringToDoubleExample {
    public static void main(String[] args) {
        String decimalStr = "98.76";
        double price = Double.parseDouble(decimalStr);
        System.out.println("Extracted double: " + price); // Output: Extracted double: 98.76

        String floatStr = "123.45";
        float value = Float.parseFloat(floatStr);
        System.out.println("Extracted float: " + value); // Output: Extracted float: 123.45
    }
}
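
To make the precision difference concrete, the sketch below parses the same string as a float and as a double. The class name FloatVsDoubleDemo and the sample value are chosen purely for illustration; the value 16777217 (2^24 + 1) is used because it is the first whole number a float cannot represent exactly.

```java
class FloatVsDoubleDemo {
    public static void main(String[] args) {
        String s = "16777217"; // 2^24 + 1, not exactly representable as a float

        float f = Float.parseFloat(s);
        double d = Double.parseDouble(s);

        System.out.println((long) f); // 16777216 (rounded to the nearest float)
        System.out.println((long) d); // 16777217

        // Unlike Integer.parseInt, parseDouble tolerates surrounding
        // whitespace and accepts exponent notation.
        System.out.println(Double.parseDouble(" 1e3 ")); // 1000.0
    }
}
```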

2. Extracting Numbers from Mixed Strings using Regular Expressions

When a string contains numbers embedded with other characters (e.g., "Item count: 5 units", "Price: $12.99"), regular expressions (regex) provide a powerful and flexible way to find and extract the numeric parts.

Basic Extraction of the First Number

To extract the first sequence of digits, use the regex \d+, which matches one or more digits. In a Java string literal the backslash must itself be escaped, so the pattern is written "\\d+".

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtractFirstNumber {
    public static void main(String[] args) {
        String text = "The item count is 15 units.";
        Pattern pattern = Pattern.compile("\\d+"); // Matches one or more digits
        Matcher matcher = pattern.matcher(text);

        if (matcher.find()) {
            String numberStr = matcher.group();
            int number = Integer.parseInt(numberStr);
            System.out.println("Extracted number: " + number); // Output: Extracted number: 15
        } else {
            System.out.println("No number found in the string.");
        }

        String priceText = "Price: $12.99 only.";
        Pattern decimalPattern = Pattern.compile("\\d+\\.\\d+"); // Matches one or more digits, a dot, then one or more digits
        Matcher decimalMatcher = decimalPattern.matcher(priceText);

        if (decimalMatcher.find()) {
            String decimalNumberStr = decimalMatcher.group();
            double price = Double.parseDouble(decimalNumberStr);
            System.out.println("Extracted price: " + price); // Output: Extracted price: 12.99
        } else {
            System.out.println("No decimal number found.");
        }
    }
}
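
The example above uses one pattern for integers and another for decimals. If you do not know in advance which form the first number takes, a single pattern with an optional fractional part can cover both. The following is a minimal sketch: FirstNumberHelper and firstNumber are hypothetical names, and returning an OptionalDouble is an assumption about what is convenient for the caller.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstNumberHelper {
    // Compiled once and reused: digits with an optional fractional part.
    private static final Pattern NUMBER = Pattern.compile("\\d+(?:\\.\\d+)?");

    // Hypothetical helper: first unsigned number in the text, if any.
    static OptionalDouble firstNumber(String text) {
        Matcher m = NUMBER.matcher(text);
        return m.find()
                ? OptionalDouble.of(Double.parseDouble(m.group()))
                : OptionalDouble.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("The item count is 15 units.")); // OptionalDouble[15.0]
        System.out.println(firstNumber("Price: $12.99 only."));         // OptionalDouble[12.99]
        System.out.println(firstNumber("no digits here"));              // OptionalDouble.empty
    }
}
```

Note that plain \\d+ applied to "Price: $12.99 only." would stop at the dot and return 12, which is why the optional fractional part matters.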

Extracting All Numbers

If a string might contain multiple numbers, you can loop through all matches found by the Matcher.

import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.ArrayList;
import java.util.List;

public class ExtractAllNumbers {
    public static void main(String[] args) {
        String mixedText = "The transaction includes 3 items, totaling 25.50 USD, for order #12345.";

        // Pattern to find integers and decimals:
        // [+-]? (optional sign), \\d+ (one or more digits), (\\.\\d+)? (optional decimal part)
        Pattern pattern = Pattern.compile("[+-]?\\d+(\\.\\d+)?");
        Matcher matcher = pattern.matcher(mixedText);

        List<Double> numbers = new ArrayList<>();
        while (matcher.find()) {
            try {
                numbers.add(Double.parseDouble(matcher.group()));
            } catch (NumberFormatException e) {
                System.err.println("Could not parse '" + matcher.group() + "': " + e.getMessage());
            }
        }
        System.out.println("Extracted numbers: " + numbers); // Output: Extracted numbers: [3.0, 25.5, 12345.0]
    }
}
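
On Java 9 and later, the same loop can be written as a stream using Matcher.results(). The sketch below is an alternative formulation, not a replacement for the loop above; the class name ExtractAllNumbersStream, the method allNumbers, and the sample sentence are made up for illustration.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ExtractAllNumbersStream {
    // Optional sign, digits, optional fractional part (non-capturing group).
    private static final Pattern NUMBER = Pattern.compile("[+-]?\\d+(?:\\.\\d+)?");

    // Hypothetical helper: every number in the text, in order of appearance.
    static List<Double> allNumbers(String text) {
        return NUMBER.matcher(text)
                .results()                 // Stream<MatchResult>, requires Java 9+
                .map(MatchResult::group)   // matched substring
                .map(Double::parseDouble)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(allNumbers("Temperatures: -3.5, +4 and 12 degrees"));
        // [-3.5, 4.0, 12.0]
    }
}
```

Because the pattern accepts a leading sign, a hyphen directly in front of digits (as in "order-12345") is read as a minus sign. Drop the [+-]? part if your input uses hyphens as separators.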

Common Regular Expression Patterns for Numbers

The Pattern class in Java is essential for defining these patterns.

| Pattern (as a Java string literal) | Description | Examples matched |
|---|---|---|
| \\d+ | One or more digits (0-9). | 123, 0, 999 |
| [0-9]+ | Equivalent to \\d+. | 45, 0 |
| \\d+\\.\\d+ | Decimal numbers (e.g., 12.34). | 1.5, 100.0 |
| [+-]?\\d+ | Optional sign (+ or -) followed by one or more digits (integers). | 12, -5, +100 |
| [+-]?\\d+(\\.\\d+)? | Optional sign, one or more digits, and an optional decimal part. | 12, -5.5, +100.0, 0.5 |
| \\b\\d+\\b | Whole numbers (digits surrounded by word boundaries). Useful to avoid matching numbers within words (e.g., "abc123def"). | 123 in "item 123", not 123 in "item123value" |
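
To see how the word-boundary variant differs from plain \\d+, the following sketch runs both against the same input. WordBoundaryDemo and findAll are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // Hypothetical helper: collects every match of regex in text.
    static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "item 123 and item456value";
        System.out.println(findAll("\\d+", text));       // [123, 456]
        System.out.println(findAll("\\b\\d+\\b", text)); // [123]

        // A dot counts as a boundary, so a decimal is split in two.
        System.out.println(findAll("\\b\\d+\\b", "12.99")); // [12, 99]
    }
}
```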

3. Extracting Numbers by Removing Non-Digit Characters (Simpler Cases)

For very simple scenarios where you just want to strip out everything that isn't a digit and parse what remains, you can use String.replaceAll(). It also takes a regex, but it is less flexible than Pattern/Matcher because it discards the context of the original number: separate numbers are merged into one digit string, and signs and decimal points are either dropped or kept indiscriminately.

Removing all non-digits to get an integer

public class RemoveNonDigits {
    public static void main(String[] args) {
        String text = "Order ID: #12345ABC";
        // Retain only digits
        String numberStr = text.replaceAll("[^\\d]", "");
        if (!numberStr.isEmpty()) {
            int orderId = Integer.parseInt(numberStr);
            System.out.println("Extracted Order ID: " + orderId); // Output: Extracted Order ID: 12345
        } else {
            System.out.println("No digits found after cleaning.");
        }

        String anotherText = "Amount is $12.50";
        // To get a decimal number, you need to keep the decimal point
        String cleanNumberStr = anotherText.replaceAll("[^\\d.]", ""); // Keep digits and dots
        try {
            if (!cleanNumberStr.isEmpty()) {
                double amount = Double.parseDouble(cleanNumberStr);
                System.out.println("Extracted Amount: " + amount); // Output: Extracted Amount: 12.5
            } else {
                System.out.println("No valid number found after cleaning.");
            }
        } catch (NumberFormatException e) {
            System.err.println("Error parsing number '" + cleanNumberStr + "': " + e.getMessage());
        }
    }
}

Caution: This approach breaks down when the string contains multiple numbers or multiple dots. For example, "123.45.67" passes through the [^\\d.] filter unchanged and is not a valid double, and a sentence-ending period is kept as well. It also drops negative signs and ignores specific number formats unless the regex is made more complex, at which point Pattern/Matcher becomes the more intuitive and robust solution.
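
The failure modes described in the caution are easy to reproduce. In this sketch, StripPitfalls, digitsOnly, and digitsAndDots are hypothetical names, and the sample strings are invented for illustration.

```java
class StripPitfalls {
    // Keeps digits only (\\D is the negation of \\d).
    static String digitsOnly(String text) {
        return text.replaceAll("\\D", "");
    }

    // Keeps digits and dots.
    static String digitsAndDots(String text) {
        return text.replaceAll("[^\\d.]", "");
    }

    public static void main(String[] args) {
        // Two separate numbers are merged into one.
        System.out.println(digitsOnly("3 items for order 12345")); // 312345

        // The minus sign is silently dropped.
        System.out.println(digitsOnly("Balance: -40"));            // 40

        // Two decimals collapse into a string that is not a valid double.
        System.out.println(digitsAndDots("v1.2 costs 3.50"));      // 1.23.50
    }
}
```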

Important Considerations and Best Practices

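  • Error Handling: Wrap parsing calls like parseInt(), parseDouble(), etc., in a try-catch block for NumberFormatException whenever the input is not guaranteed to be a valid number. Note that Integer.parseInt(null) throws NumberFormatException, while Double.parseDouble(null) throws NullPointerException.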
  • Data Types:
    • Use int for whole numbers within the standard range.
    • Use long for very large whole numbers.
    • Use double for decimal numbers where small binary rounding errors are acceptable.
    • For financial calculations or when exact decimal precision is required, use java.math.BigDecimal.
  • Regular Expression Complexity: While powerful, complex regex patterns can be hard to read and debug. Test your patterns thoroughly with various inputs.
  • Locale: Be mindful of locale-specific decimal separators (e.g., a comma in some European countries). Standard Double.parseDouble() expects a dot (.) as the decimal separator. For locale-aware parsing, consider java.text.NumberFormat.
  • No Match Found: Always check matcher.find() before attempting to retrieve groups (e.g., matcher.group()) to avoid an IllegalStateException.
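
Two of the points above, locale-aware parsing and exact decimal arithmetic, can be sketched as follows. LocaleAndBigDecimalDemo, parseLocalized, and exactSum are hypothetical names introduced for this example, and converting ParseException into IllegalArgumentException is an arbitrary choice. The German-locale results assume the JDK's default locale data.

```java
import java.math.BigDecimal;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

class LocaleAndBigDecimalDemo {
    // Hypothetical helper: parses a number using the given locale's separators.
    static double parseLocalized(String s, Locale locale) {
        try {
            return NumberFormat.getInstance(locale).parse(s).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number for " + locale + ": " + s, e);
        }
    }

    // Hypothetical helper: adds two decimal strings without binary rounding.
    static BigDecimal exactSum(String a, String b) {
        return new BigDecimal(a).add(new BigDecimal(b));
    }

    public static void main(String[] args) {
        // In the German locale the comma is the decimal separator.
        System.out.println(parseLocalized("12,5", Locale.GERMANY));     // 12.5
        System.out.println(parseLocalized("1.234,56", Locale.GERMANY)); // 1234.56

        System.out.println(0.10 + 0.20);              // 0.30000000000000004
        System.out.println(exactSum("0.10", "0.20")); // 0.30
    }
}
```

Be aware that NumberFormat.parse() reads as much of the string as it can and ignores any trailing text, so it is more lenient than Double.parseDouble().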

By understanding these different approaches, you can select the most appropriate method for extracting numbers based on the complexity and format of your input strings.