What is a Java set?
Java sets are a fundamental data structure with many practical use cases. But what exactly is a Java set, and how do Java sets work?
What are Java sets?
Java sets are implementations of the set data structure in the Java programming language. A set is an abstract data type that is a collection of objects with two important characteristics:
- The objects in a set are unordered; there is no defined sequence.
- Each element in a set is distinct and unique; there are no repeated values.
These two qualities distinguish Java sets from Java lists, which are ordered collections that may contain the same element multiple times.
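The difference between a set and a list can be seen in a minimal sketch. The class name SetVersusList and its helper methods are hypothetical names chosen for this illustration; only the java.util classes come from the standard library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class: compares how a List and a Set treat duplicates.
class SetVersusList {
    // Copies the values into a list and returns its size (duplicates are kept).
    static int listSize(String... values) {
        List<String> list = new ArrayList<>(Arrays.asList(values));
        return list.size();
    }

    // Copies the values into a set and returns its size (duplicates are dropped).
    static int setSize(String... values) {
        Set<String> set = new HashSet<>(Arrays.asList(values));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(listSize("a", "b", "a")); // prints 3
        System.out.println(setSize("a", "b", "a"));  // prints 2
    }
}
```

The same three values produce a list of size 3 but a set of size 2, because the second "a" is not stored again.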
The set abstract data type models the mathematical concept of a set, which is also an unordered collection of unique values. For example, the set of natural numbers (taken here to start at 1) consists of all integers above 0: {1, 2, 3, ...}.
How do Java sets work?
In Java, sets are defined by the java.util.Set interface. The most important methods of Java sets include:
- add(): Adds the given element to the set (if not already present).
- clear(): Removes all elements from the set.
- contains(): Returns true if the set contains the given element, and false otherwise.
- isEmpty(): Returns true if the set has no elements, and false otherwise.
- remove(): Removes the given element from the set.
- size(): Returns the number of elements in the set (i.e. the set's cardinality).
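The methods listed above can be exercised in a short walk-through. This is only a sketch: the class name SetBasics and the describe() helper are hypothetical, and HashSet is used simply because some concrete implementation is needed.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class: exercises the core java.util.Set methods.
class SetBasics {
    // Summarizes a set using size() and isEmpty().
    static String describe(Set<String> set) {
        return "size=" + set.size() + ", empty=" + set.isEmpty();
    }

    public static void main(String[] args) {
        Set<String> colors = new HashSet<>();
        System.out.println(describe(colors));          // size=0, empty=true

        System.out.println(colors.add("red"));         // true: element was added
        System.out.println(colors.add("red"));         // false: already present
        colors.add("green");

        System.out.println(colors.contains("green"));  // true
        System.out.println(colors.remove("red"));      // true: element was removed
        System.out.println(describe(colors));          // size=1, empty=false

        colors.clear();
        System.out.println(describe(colors));          // size=0, empty=true
    }
}
```

Note that add() and remove() themselves return a boolean indicating whether the set changed, which is often enough to avoid a separate contains() call.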
Because java.util.Set is only an interface, it cannot be instantiated directly in Java. Instead, users need to instantiate one of the Java classes that implement the Set interface, including HashSet and TreeSet:
- HashSet: A HashSet implements an unsorted set backed by a hash table. This gives add, remove, and lookup operations constant time complexity on average, assuming the elements' hash codes are well distributed.
- TreeSet: A TreeSet implements a set backed by a balanced binary search tree (a red-black tree), which keeps the elements sorted, either by their natural ordering or by a Comparator supplied when the set is created. In exchange for sorting the elements, however, you sacrifice some efficiency: add, remove, and lookup operations have logarithmic time complexity (which is still very fast for most use cases).
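The practical difference between the two classes shows up when iterating. The following sketch uses a hypothetical class name (SetImplementations) and helper method (sortedOrder); the input values are arbitrary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class: contrasts HashSet and TreeSet behavior.
class SetImplementations {
    // Returns the distinct values in TreeSet iteration order (ascending).
    static List<Integer> sortedOrder(Collection<Integer> values) {
        return new ArrayList<>(new TreeSet<>(values));
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(42, 7, 19, 7, 3);

        // HashSet: duplicates are dropped, but iteration order is unspecified.
        Set<Integer> hashSet = new HashSet<>(input);
        System.out.println(hashSet.size());      // prints 4

        // TreeSet: duplicates are dropped and elements come back sorted.
        System.out.println(sortedOrder(input));  // prints [3, 7, 19, 42]
    }
}
```

If you only need membership tests, HashSet is usually the default choice; TreeSet is worth its extra cost when you need the elements in sorted order.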
Java sets in Redis
Redis is an open-source, in-memory data structure store used to implement NoSQL key-value databases, caches, and message brokers. Sets are one of the five fundamental data types in Redis, along with lists, hashes, sorted sets, and strings.
In Redis, sets are unordered collections of unique strings. Because Redis strings are binary-safe and can hold arbitrary bytes, in practice you can use sets to store any type of object in serialized form. The Redis platform includes several built-in commands for working with set data structures, including:
- SADD: Adds the specified member(s) to the set.
- SCARD: Returns the number of elements in the set (the cardinality).
- SDIFF: Returns the set difference of two or more sets, i.e. the elements of the first set that are not present in the other set(s).
- SINTER: Returns the intersection of two or more sets, i.e. the elements that are present in all of the sets.
- SISMEMBER: Returns 1 if the set contains the given element, and 0 otherwise.
- SREM: Removes the given element(s) from the set.
- SUNION: Returns the union of two or more sets, i.e. all the elements that are present in at least one of the sets.
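For readers who think in terms of java.util.Set, the set algebra behind SUNION, SINTER, and SDIFF can be sketched with plain in-memory Java collections. This is only an analogy: the code below never talks to Redis, and the class name SetAlgebra and its methods are hypothetical. TreeSet is used here just to make the printed output predictable.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class: in-memory analogues of SUNION, SINTER, and SDIFF.
class SetAlgebra {
    // Elements present in at least one of the sets (compare SUNION).
    static <T extends Comparable<T>> Set<T> union(Set<T> a, Set<T> b) {
        Set<T> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    // Elements present in both sets (compare SINTER).
    static <T extends Comparable<T>> Set<T> intersection(Set<T> a, Set<T> b) {
        Set<T> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Elements of the first set that are not in the second (compare SDIFF).
    static <T extends Comparable<T>> Set<T> difference(Set<T> a, Set<T> b) {
        Set<T> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> a = new TreeSet<>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> b = new TreeSet<>(Arrays.asList(3, 4, 5));

        System.out.println(union(a, b));        // prints [1, 2, 3, 4, 5]
        System.out.println(intersection(a, b)); // prints [3, 4]
        System.out.println(difference(a, b));   // prints [1, 2]
    }
}
```

Each helper copies its first argument before modifying it, so the input sets are left unchanged.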
While these commands allow you to use sets effectively in Redis, many Java developers prefer the familiar methods of the Set interface in Java. Unfortunately, Java's standard library does not include a Redis client, so Java code cannot talk to Redis out of the box. To resolve this issue and lower the Redis learning curve, many Java developers choose to install a third-party Redis Java client such as Redisson.
Redisson includes many familiar Java constructs, including dozens of Java objects, locks, and collections. In Redisson, Java sets are implemented using the RSet interface. Below is an example of how to use RSet in Redisson:
RSet<SomeObject> set = redisson.getSet("anySet");
set.add(new SomeObject());
set.remove(new SomeObject());
RSet includes all of the familiar methods from the standard java.util.Set interface, as well as Redis set functionality such as set difference, intersection, and union. Note that Redis limits each set, and therefore each RSet, to 4,294,967,295 (2^32 - 1) elements. In addition, RSets come with asynchronous, reactive, and RxJava2 interfaces, so that you can choose the programming model that best fits your needs.