Blog Zenika

Set, an underrated power of JavaScript

In the world of JavaScript development, we tend to rely on familiar structures like arrays to manipulate collections of data. The Set object remains underrated, even though it offers powerful features, especially for working with unique values. With the arrival of new methods through the TC39 Set Methods proposal, Set becomes even more attractive. Let’s explore its capabilities together through practical examples.

Origin and historical context of Set

Set was introduced in 2015 with ECMAScript 6 (ES6), a major update to JavaScript. Before its introduction, developers primarily used arrays to manage data collections, which posed several issues:

  • Value duplication: Arrays allow duplicates, making uniqueness checks tedious and inefficient.
  • Inefficient existence check: Finding out whether an element exists in an array requires a linear search (e.g. Array.prototype.includes), which can be costly for large collections.

The introduction of Set aimed to solve these limitations by providing a structure dedicated to unique values, with optimized operations for adding, removing, and checking existence. In other words, Set was designed to simplify and accelerate manipulation of collections of unique values while making the code more readable and less prone to errors.

What is a Set?

A Set is a data structure that stores unique values. This means that a value in the Set can only appear once. Here’s a simple example:

const mySet = new Set();
mySet.add(1);
mySet.add(2);
mySet.add(2); // Ignored because 2 already exists

console.log(mySet); // Output: Set(2) { 1, 2 }

The elements in a Set can be of any type:

const mixedSet = new Set([1, 'hello', { key: 'value' }, true]);
console.log(mixedSet); // Output: Set(4) { 1, 'hello', { key: 'value' }, true }
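
One detail worth keeping in mind with mixed-type Sets: uniqueness is decided by the SameValueZero comparison, so primitives are compared by value while objects are compared by reference. A minimal sketch (the variable names are illustrative and not part of the original example):

```javascript
const product = { id: 101 };

const items = new Set();
items.add(product);
items.add(product);      // same reference: ignored
items.add({ id: 101 });  // different object with equal content: kept
items.add(NaN);
items.add(NaN);          // NaN is treated as equal to NaN inside a Set
items.add(1);
items.add('1');          // the number 1 and the string '1' are distinct values

console.log(items.size); // 5
```

If two objects with the same content must count as one entry, a common approach is to store a primitive key (such as an id) instead of the object itself.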

In the context of an application, for example, you could use Set to manage collections, such as recently viewed products, selected product categories, or user lists without duplication.

Why use Set in an application?

Removing duplicates

Imagine a recently viewed products feature. You want to display unique products for each user, even if they have clicked on the same product multiple times:

const recentProducts = [101, 102, 103, 101, 104, 102];
const uniqueRecentProducts = new Set(recentProducts);

console.log(uniqueRecentProducts); // Output: Set(4) {101, 102, 103, 104}

Now the same result with a plain array:

const recentProducts = [101, 102, 103, 101, 104, 102];
const uniqueRecentProducts = [];
for (let i = 0; i < recentProducts.length; i++) {
  if (!uniqueRecentProducts.includes(recentProducts[i])) {
    uniqueRecentProducts.push(recentProducts[i]);
  }
}
console.log(uniqueRecentProducts); // [101, 102, 103, 104]

With an array, it is up to you to ensure the uniqueness of your values, which makes the code longer and less readable.
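
If the rest of the code expects an array, the Set can be converted back with the spread syntax. A short sketch reusing the recentProducts data; a Set iterates in insertion order, so the first occurrence of each product keeps its position:

```javascript
const recentProducts = [101, 102, 103, 101, 104, 102];

// Deduplicate through a Set, then spread the result back into an array
const uniqueRecentProducts = [...new Set(recentProducts)];

console.log(uniqueRecentProducts); // [101, 102, 103, 104]
```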

Quick existence check

Suppose you’re managing a shopping cart. Before adding a product, you need to check if it’s already in the cart:

const cart = new Set([201, 202, 203]);

if (!cart.has(204)) {
  cart.add(204);
}

console.log(cart); // Output: Set(4) { 201, 202, 203, 204 }
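
Since add is already a no-op for existing values, the explicit has check is mainly useful when the application needs to know whether the product was new. One possible sketch, assuming a hypothetical addToCart helper that compares the size before and after the insertion:

```javascript
const cart = new Set([201, 202, 203]);

// Hypothetical helper: returns true only when the product was not yet in the cart
function addToCart(cart, productId) {
  const sizeBefore = cart.size;
  cart.add(productId); // does nothing if productId is already present
  return cart.size > sizeBefore;
}

const firstAttempt = addToCart(cart, 204);
const secondAttempt = addToCart(cart, 204);

console.log(firstAttempt);  // true
console.log(secondAttempt); // false
console.log(cart.size);     // 4
```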

Optimized Performance

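Set offers better performance than arrays for adding, removing, and searching for elements: these operations take roughly constant time on average, whereas an array has to be scanned element by element. The difference matters most for systems handling large amounts of data.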

Basic Set methods

Before the introduction of the new methods by the TC39 proposal, Set already had several practical methods:

add(value)

Adds a value to the Set (if it’s not already present).

const mySet = new Set();
mySet.add(1);
mySet.add(2);
mySet.add(2);
console.log(mySet); // Output: Set(2) { 1, 2 }

delete(value)

Removes a specific value from the Set.

mySet.delete(1);
console.log(mySet); // Output: Set(1) { 2 }

has(value)

Checks if a value is present in the Set.

console.log(mySet.has(2)); // Output: true
console.log(mySet.has(3)); // Output: false

size

Returns the number of elements in the Set.

console.log(mySet.size); // Output: 1

clear()

Removes all elements from the Set.

mySet.clear();
console.log(mySet.size); // Output: 0
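
The return values of these methods are also worth knowing. The sketch below uses an illustrative tags Set (not from the examples above) to show that add returns the Set itself, which allows chaining, and that delete reports whether something was actually removed:

```javascript
const tags = new Set();

// add() returns the Set itself, so calls can be chained
tags.add('new').add('sale').add('new');
const sizeAfterAdds = tags.size;

// delete() returns true if the value was present, false otherwise
const removedSale = tags.delete('sale');
const removedMissing = tags.delete('missing');

console.log(sizeAfterAdds);  // 2
console.log(removedSale);    // true
console.log(removedMissing); // false
console.log(tags.has('new')); // true
```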

Iterating over elements

A Set is iterable: you can loop over it with for...of or with methods like forEach, and values are visited in insertion order.

const mySet = new Set([1, 2, 3]);

for (const value of mySet) {
  console.log(value);
}
// Output:
// 1
// 2
// 3

mySet.forEach(value => console.log(value));
// Output:
// 1
// 2
// 3

These features already provided a solid foundation for manipulating sets of unique values. However, more complex operations such as union, intersection, or difference between Sets had to be implemented manually, often in suboptimal ways.
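
As an illustration, manual implementations often looked like the sketch below. The helper names (manualUnion, manualIntersection, manualDifference) are hypothetical, and each helper builds intermediate arrays, which is part of why such code was suboptimal:

```javascript
// Hypothetical helpers written with only the basic Set methods

function manualUnion(a, b) {
  return new Set([...a, ...b]);
}

function manualIntersection(a, b) {
  return new Set([...a].filter(value => b.has(value)));
}

function manualDifference(a, b) {
  return new Set([...a].filter(value => !b.has(value)));
}

const left = new Set([1, 2, 3]);
const right = new Set([3, 4]);

console.log([...manualUnion(left, right)]);        // [1, 2, 3, 4]
console.log([...manualIntersection(left, right)]); // [3]
console.log([...manualDifference(left, right)]);   // [1, 2]
```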

Recently added features

The Set Methods proposal introduces several new methods to manipulate Sets. Let’s look at how these methods can be used in an e-commerce application.
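
Because these methods are recent, older runtimes may not provide them. A defensive sketch, assuming a hypothetical helper named unionOf, checks for the method before calling it and falls back to a spread-based union otherwise:

```javascript
// Hypothetical helper: uses the native method when available
function unionOf(a, b) {
  if (typeof Set.prototype.union === 'function') {
    return a.union(b);
  }
  // Fallback for runtimes without Set.prototype.union
  return new Set([...a, ...b]);
}

const merged = unionOf(new Set([301, 302]), new Set([302, 303]));
console.log([...merged]); // [301, 302, 303]
```

The same pattern can be applied to the other methods presented in this section.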

union

Use Case: Combining recommended product lists.

Suppose you have two recommendation systems: one based on viewed products and another based on best-selling products. You want to merge them without duplicates:

const viewedRecommendations = new Set([301, 302, 303]);
const topSelling = new Set([303, 304, 305]);

const combinedRecommendations = viewedRecommendations.union(topSelling);
console.log(combinedRecommendations); // Output: Set(5) { 301, 302, 303, 304, 305 }

With arrays, a union looks just as easy:

const viewedRecommendations = [301, 302, 303];
const topSelling = [303, 304, 305];

const combinedRecommendations = [...viewedRecommendations, ...topSelling];
console.log(combinedRecommendations); // [301, 302, 303, 303, 304, 305]

But as you can see, even in this basic example, uniqueness is lost. You have to merge the arrays and then apply a duplicate-removal algorithm.
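
A sketch of what that extra step can look like when staying with plain arrays. removeDuplicates is a hypothetical helper name, and the indexOf check inside filter makes it quadratic in the worst case:

```javascript
// Hypothetical helper: keeps a value only at the index of its first occurrence
function removeDuplicates(values) {
  return values.filter((value, index) => values.indexOf(value) === index);
}

const viewedRecommendations = [301, 302, 303];
const topSelling = [303, 304, 305];

const combinedRecommendations = removeDuplicates([
  ...viewedRecommendations,
  ...topSelling,
]);
console.log(combinedRecommendations); // [301, 302, 303, 304, 305]
```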

intersection

Use Case: Finding products shared between two lists.

You want to identify products that appear both in a promotional list and in the list of products the user has viewed:

const promotionProducts = new Set([401, 402, 403]);
const viewedProducts = new Set([402, 403, 404]);

const productsToHighlight = promotionProducts.intersection(viewedProducts);
console.log(productsToHighlight); // Output: Set(2) { 402, 403 }

With an array, an intersection requires loops or filters:

const promotionProducts = [401, 402, 403];
const viewedProducts = [402, 403, 404];

const productsToHighlight = promotionProducts.filter(x => viewedProducts.includes(x));
console.log(productsToHighlight); // [402, 403]

difference

Use Case: Identifying products not yet added to the cart.

You have a list of products viewed by the user and want to know which ones haven’t been added to the cart:

const viewedProducts = new Set([501, 502, 503]);
const cartProducts = new Set([503]);

const notInCart = viewedProducts.difference(cartProducts);
console.log(notInCart); // Output: Set(2) { 501, 502 }

With an array, a difference also requires loops or filters:

const viewedProducts = [501, 502, 503];
const cartProducts = [503];

const notInCart = viewedProducts.filter(x => !cartProducts.includes(x));
console.log(notInCart); // [501, 502]

symmetricDifference

Use Case: Identifying differences between warehouses.

You manage inventory across two warehouses and want to identify products that are only present in one of them:

const warehouseA = new Set([601, 602, 603]);
const warehouseB = new Set([603, 604, 605]);

const uniqueToEachWarehouse = warehouseA.symmetricDifference(warehouseB);
console.log(uniqueToEachWarehouse); // Output: Set(4) { 601, 602, 604, 605 }

With an array, a symmetricDifference also requires two loops or filters:

const warehouseA = [601, 602, 603];
const warehouseB = [603, 604, 605];

const uniqueToEachWarehouse = [
    ...warehouseA.filter(value => !warehouseB.includes(value)),
    ...warehouseB.filter(value => !warehouseA.includes(value))
];
console.log(uniqueToEachWarehouse); // [601, 602, 604, 605]

isSupersetOf

Use Case: Checking if an order is complete.

Before confirming an order, you want to make sure that all items in the cart are available in your inventory:

const inventory = new Set([801, 802, 803, 804]);
const cart = new Set([802, 803]);

console.log(inventory.isSupersetOf(cart)); // true

With arrays, the equivalent of isSupersetOf requires a loop or every:

const inventory = [801, 802, 803, 804];
const cart = [802, 803];

const isSupersetArray = cart.every(value => inventory.includes(value));
console.log(isSupersetArray); // true

isSubsetOf

Use Case: Checking if an order is complete.

It is the exact converse of isSupersetOf: cart.isSubsetOf(inventory) is true whenever inventory.isSupersetOf(cart) is true:

const inventory = new Set([801, 802, 803, 804]);
const cart = new Set([802, 803]);

console.log(cart.isSubsetOf(inventory)); // true

The array-based algorithm is the same as for isSupersetOf.

isDisjointFrom

Use Case: Checking whether the user’s payment method is accepted by the app:

const acceptedPaymentMethods = new Set(['Paypal', 'Credit Card']);
const userPaymentMethod = new Set(['Gift Card']);
// No value in common: the user's method is not among the accepted ones
if (userPaymentMethod.isDisjointFrom(acceptedPaymentMethods)) {
    console.log('Method not allowed');
} // Method not allowed

With arrays, the equivalent of isDisjointFrom requires a loop or some:

const acceptedPaymentMethods = ['Paypal', 'Credit Card'];
const userPaymentMethod = ['Gift Card'];
if (!userPaymentMethod.some(el => acceptedPaymentMethods.includes(el))) {
    console.log('Method not allowed');
} // Method not allowed

Performance

Beyond readability, the main advantage of Set is that each of these operations has an optimized implementation.

A Set is optimized for search and uniqueness of values, using a hash table-like structure, whereas an Array stores elements in an indexed manner and requires a linear search.

This difference has a direct impact on the algorithmic complexity of the code: operations involving Set generally require fewer conditions and loops, which reduces the workload and improves the maintainability of the code. However, Arrays remain more efficient for sequential manipulations and indexed access, so the right choice depends on the use case.


Algorithmic complexity measures the amount of resources (time or space) needed to execute an algorithm based on the size of its input, using Big-O notation. It describes how the execution time grows with respect to the input size, without considering constants or multiplicative factors. For example, an algorithm with O(n) complexity takes linear time relative to the input size, while an algorithm with O(n × m) will be more expensive because it has to process two sets of sizes n and m in time proportional to their product.

Let’s imagine a function that adds the first n integers:

function sum(n) {
  let total = 0;
  for (let i = 1; i <= n; i++) {
    total += i;
  }
  return total;
}

In this example, the for loop runs n times, meaning the execution time of the function increases linearly with the size of n. Therefore, the time complexity of the sum(n) function is O(n) because the number of operations is directly proportional to the input size.
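
To make the difference between O(n × m) and O(n + m) concrete, the sketch below counts the basic operations performed by two ways of finding the values shared by two lists. The function names and the data are illustrative assumptions, not taken from the benchmark presented in this article:

```javascript
// Nested loops: every element of a may be compared with every element of b
function intersectWithLoops(a, b) {
  let operations = 0;
  const common = [];
  for (const x of a) {
    for (const y of b) {
      operations++;
      if (x === y) {
        common.push(x);
        break;
      }
    }
  }
  return { common, operations };
}

// Set lookup: m insertions to build the Set, then n lookups
function intersectWithSet(a, b) {
  let operations = 0;
  const lookup = new Set();
  for (const y of b) {
    lookup.add(y);
    operations++;
  }
  const common = [];
  for (const x of a) {
    operations++;
    if (lookup.has(x)) common.push(x);
  }
  return { common, operations };
}

const n = 1000;
const first = Array.from({ length: n }, (_, i) => i);      // 0..999
const second = Array.from({ length: n }, (_, i) => i + n); // 1000..1999, no overlap

const loops = intersectWithLoops(first, second);
const viaSet = intersectWithSet(first, second);

console.log(loops.operations);  // 1000000 (n × m)
console.log(viaSet.operations); // 2000 (n + m)
```

The two lists share no value here, which is the worst case for the nested loops: no inner loop can stop early.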


Here is a comparison of the complexity of Set and Array for the main features. The Array complexity includes a uniqueness check where necessary.

I also benchmarked the execution times of the Set and Array implementations. Each comparison uses a very large number of entries, so that the differences are clearly visible.

Feature | Set complexity | Array complexity | Set benchmark | Array benchmark
--- | --- | --- | --- | ---
Add (100k entries) | O(1) | O(n + 1) | 6.9ms | 663.5ms
Has (10M entries) | O(1) | O(n) | 0.1ms | 5.7ms
Delete (10M entries) | O(1) | O(n) + O(n) (search and shift) | 0.1ms | 7.2ms
Size (100k entries) | O(1) | O(1) | 0.1ms | 0.1ms
Union (100k entries) | O(n + m) | O(n * m) | 3.7ms | 1874.2ms
Intersection (100k entries) | O(n) | O(n * m) | 5ms | 802.2ms
Difference (100k entries) | O(n) | O(n * m) | 3.3ms | 918.5ms
SymmetricDifference (100k entries) | O(n + m) | O(n * m) | 4.4ms | 1901.8ms
isSupersetOf (100k entries) | O(m) | O(n * m) | 1.1ms | 134.3ms
isSubsetOf (100k entries) | O(n) | O(n * m) | 1.1ms | 134.3ms
isDisjointFrom (100k entries) | O(n) | O(n * m) | 1.9ms | 1065.1ms

n is the size of the first list and m the size of the second list.
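
The benchmark code itself is not shown in this article. As a rough idea of how such a measurement can be made, here is a minimal sketch that assumes a hypothetical timeIt helper and a runtime where performance.now() is available globally (recent browsers and Node.js versions). Absolute timings depend on the machine and the engine, so no expected values are given:

```javascript
// Hypothetical helper: runs fn once and returns the elapsed time in milliseconds
function timeIt(fn) {
  const start = performance.now();
  fn();
  return performance.now() - start;
}

const size = 100_000;
const values = Array.from({ length: size }, (_, i) => i);
const asArray = values.slice();
const asSet = new Set(values);
const missing = -1; // absent value: worst case for the array, which is fully scanned

let arrayResult;
let setResult;

const arrayMs = timeIt(() => {
  for (let i = 0; i < 1_000; i++) arrayResult = asArray.includes(missing);
});
const setMs = timeIt(() => {
  for (let i = 0; i < 1_000; i++) setResult = asSet.has(missing);
});

console.log(`Array.includes: ${arrayMs.toFixed(2)}ms, Set.has: ${setMs.toFixed(2)}ms`);
```

A single run like this is noisy; repeating the measurement and comparing medians gives more trustworthy numbers.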

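As you can see, for single-element operations such as has, delete and size, the difference is at most a few milliseconds even on millions of entries, so it can be ignored for relatively small collections. The gap becomes significant when an operation is repeated many times (such as 100k additions with a uniqueness check) or when two collections are combined.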

So, why should you use Set?

You should prefer Set over Array when you need to quickly check for the presence of elements, frequently add or remove elements, and ensure uniqueness. Set uses hash tables, making operations like searching, adding, and removing elements much faster (on average O(1)) compared to Array, where searching and removing can be slower (O(n)). Additionally, Set automatically ignores duplicates, making it ideal for collections of unique elements. On the other hand, Array is better suited when you need to reorder elements or access them by index.

With the new methods proposed by TC39, Set becomes an even more powerful and expressive tool for manipulating data collections in JavaScript.

If you often work with arrays and are looking to optimize your operations, it’s time to give Set a try!
