zaro

What is Intersection in DBMS?

Published in DBMS Set Operations 3 mins read

In a Database Management System (DBMS), INTERSECT is a set operator used to combine the results of two SELECT statements, returning only the rows that are present in both result sets. Essentially, it identifies and displays the common data between two queries.

Understanding the INTERSECT Operation

The INTERSECT operator functions like the mathematical intersection of two sets. When applied to the results of two SQL SELECT queries, it yields a new result set containing only the distinct rows that appear in both of the original query results.

Based on the provided reference, here are the key characteristics and rules for using INTERSECT:

  • Combines Two SELECT Statements: It is specifically designed to merge data from two separate query outputs.
  • Returns Common Rows: The primary function is to retrieve only the rows that exist in the result set of the first SELECT statement AND the result set of the second SELECT statement.
  • Identical Structure Required: A critical rule is that for the INTERSECT operation to work, "the number of datatype and columns must be the same" in both SELECT statements. This means the columns selected must match in count, order, and compatible data types.
  • No Duplicates: The result set of an INTERSECT operation inherently "has no duplicates". If a common row appears multiple times in either or both of the original result sets, it will only appear once in the final INTERSECT output.
  • Default Ordering: By default, INTERSECT "arranges the data in ascending order".

How INTERSECT Differs from Other Set Operators

While INTERSECT finds common rows, other SQL set operators handle combinations differently:

  • UNION: Combines the results of two or more SELECT statements into a single result set, including all distinct rows from all queries. UNION ALL includes duplicates.
  • EXCEPT (or MINUS in some systems): Returns distinct rows from the first SELECT statement that are not found in the second SELECT statement.

Practical Example

Consider two tables, Employees and Projects, where Employees lists all company employees and Projects lists employees currently assigned to a project. We want to find employees who are both in the general employee list and currently assigned to a project.

Employees Table:

EmployeeID Name Department
101 Alice HR
102 Bob IT
103 Carol Sales
104 David IT

ProjectAssignments Table:

EmployeeID ProjectName
102 Alpha
104 Beta
105 Eve

Using INTERSECT:

To find EmployeeIDs that are present in both tables:

SELECT EmployeeID
FROM Employees

INTERSECT

SELECT EmployeeID
FROM ProjectAssignments;

Result of INTERSECT:

EmployeeID
102
104

This result shows that EmployeeID 102 and 104 are the only ones present in both the Employees table and the ProjectAssignments table, indicating they are active employees assigned to projects.

Use Cases for INTERSECT

INTERSECT is particularly useful in scenarios where you need to:

  • Identify Shared Data: Find common customers between two different marketing campaigns.
  • Verify Data Consistency: Check if records exist in multiple related tables, ensuring data integrity.
  • Filter Based on Multiple Criteria: Combine results of complex subqueries to narrow down data to a precise set meeting all conditions.
  • Perform Data Validation: Confirm that a set of users has access rights defined in two separate security tables.

In essence, INTERSECT is a powerful tool for precise data retrieval, allowing database users to pinpoint records that satisfy multiple independent conditions across different data sets.