
Sets

Learn how sets store unique items and make tasks like removing duplicates and checking membership surprisingly easy.

Imagine you have a guest list for a party. You want to make sure every name appears only once — no duplicates. That is exactly what a set does in Python. A set is a collection of unique items. If you try to add the same item twice, Python just shrugs and keeps only one copy.

Sets are also unordered, which means Python does not remember what order you put things in. In exchange, sets are built for fast lookups: checking whether an item is in a set takes roughly the same short time whether the set holds ten items or millions.

#Creating a Set

You can create a set with curly braces {}, listing items separated by commas. Python automatically removes any duplicates for you.

Duplicates vanish automatically — only unique values survive.
colors = {"red", "green", "blue", "red", "green"}
print(colors)
print(type(colors))
Watch out

Empty {} is a dict, not a set!

This is a very common beginner trap. Writing {} gives you an empty dictionary, not an empty set.

To create an empty set you must write set():

```python
empty_set = set()   # correct
empty_dict = {}     # this is a dictionary!
```

You can also convert any list (or other iterable) into a set using set(). This is a handy way to strip duplicates from a list.

Converting a list to a set removes all duplicate values instantly.
scores = [90, 85, 90, 100, 85, 70]
unique_scores = set(scores)
print(unique_scores)
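
The same call works on other iterables too. As a small sketch (the sample values are made up for illustration), a string is treated as a sequence of characters and a tuple as a sequence of items. sorted() is used here only to make the printed order predictable.

```python
letters = set("hello")              # a string is an iterable of characters
print(sorted(letters))              # ['e', 'h', 'l', 'o']

readings = set((3, 1, 3, 2, 1))     # a tuple works the same way as a list
print(sorted(readings))             # [1, 2, 3]

print(len(letters), len(readings))  # 4 3
```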

#Adding and Removing Items

Sets are mutable — you can change them after creation. Use add() to insert a new item and remove() or discard() to delete one.

add() quietly ignores items that are already in the set.
fruits = {"apple", "banana"}

fruits.add("cherry")
print(fruits)

fruits.add("apple")  # already there — no change
print(fruits)
Common mistake

remove() vs discard() — know the difference

remove(item) raises a KeyError if the item is not found. discard(item) does nothing if the item is missing — no error.

Use discard() when you are not sure whether the item exists.

discard() is the safe choice when the item might not exist.
fruits = {"apple", "banana", "cherry"}

fruits.discard("banana")   # removes it safely
fruits.discard("mango")    # mango was never there, but no crash!

print(fruits)
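
For contrast, here is a sketch of what happens when remove() is called on a missing item. The set contents mirror the example above, and catching the error with try/except is just one way to handle it.

```python
fruits = {"apple", "cherry"}

try:
    fruits.remove("mango")             # "mango" is not in the set
except KeyError as error:
    print("remove() failed:", error)   # remove() failed: 'mango'

fruits.discard("mango")                # same request, but no error this time
print(sorted(fruits))                  # ['apple', 'cherry']
```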

#Fast Membership Testing

The in keyword works with sets just like with lists, but sets are much faster for this check — especially when the collection is large. Think of a set like an index at the back of a book: you jump straight to the answer instead of reading every page.

Membership testing with 'in' is one of sets' superpowers.
allowed_users = {"alice", "bob", "carol"}

if "bob" in allowed_users:
    print("Access granted")
else:
    print("Access denied")

print("dave" in allowed_users)
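
To get a feel for the speed difference, one option is to time the same lookup on a list and on a set with the standard timeit module. This is a rough sketch: the collection size and repeat count are arbitrary choices, and the exact timings depend on your machine.

```python
import timeit

numbers_list = list(range(100_000))
numbers_set = set(numbers_list)
target = 99_999   # the last element, so the list has to scan everything

# Both containers give the same answer...
print(target in numbers_list, target in numbers_set)   # True True

# ...but the list checks its items one by one, while the set jumps to the answer.
list_time = timeit.timeit(lambda: target in numbers_list, number=200)
set_time = timeit.timeit(lambda: target in numbers_set, number=200)

print(f"list: {list_time:.4f} s")
print(f"set:  {set_time:.4f} s")
```

On a typical machine the set lookup should come out far faster, and the gap grows as the collection gets larger.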

#Set Operations — Math Made Practical

Think of it like

Think of two overlapping circles

Picture a Venn diagram with two circles. Union is everything in either circle. Intersection is the overlapping middle bit. Difference is one circle minus the shared part. Symmetric difference is both circles minus the shared part. Python's set operators do exactly that.

Python gives you four key set operations using simple symbols:

  • | (union): every item that appears in either set
  • & (intersection): only items that appear in both sets
  • - (difference): items in the left set but not the right
  • ^ (symmetric difference): items in one set or the other, but not both
All four operations in action with developer team data.
python_devs = {"alice", "bob", "carol"}
js_devs     = {"bob", "carol", "dave"}

print(python_devs | js_devs)   # union
print(python_devs & js_devs)   # intersection
print(python_devs - js_devs)   # difference
print(python_devs ^ js_devs)   # symmetric difference
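
Because sets print in an arbitrary order, the output of the snippet above can look different from run to run. The sketch below wraps each result in sorted() so the expected values are predictable, then checks that each operator matches its method form (union(), intersection(), difference(), symmetric_difference()).

```python
python_devs = {"alice", "bob", "carol"}
js_devs = {"bob", "carol", "dave"}

# sorted() turns each result into a list with a predictable order.
print(sorted(python_devs | js_devs))   # ['alice', 'bob', 'carol', 'dave']
print(sorted(python_devs & js_devs))   # ['bob', 'carol']
print(sorted(python_devs - js_devs))   # ['alice']
print(sorted(python_devs ^ js_devs))   # ['alice', 'dave']

# Each operator also has a method form that returns the same set.
print(python_devs.union(js_devs) == (python_devs | js_devs))                  # True
print(python_devs.intersection(js_devs) == (python_devs & js_devs))           # True
print(python_devs.difference(js_devs) == (python_devs - js_devs))             # True
print(python_devs.symmetric_difference(js_devs) == (python_devs ^ js_devs))   # True
```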

#No Indexing — Sets Are Unordered

Because a set has no guaranteed order, you cannot access items by position: my_set[0] raises a TypeError. If you need a specific item, convert the set to a list first, but remember that the resulting order is arbitrary and can change from one run of the program to the next.
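
A short sketch of what this looks like in practice. The set contents are made up for illustration, and sorting is just one way to get a stable order.

```python
my_set = {"red", "green", "blue"}

try:
    my_set[0]                    # sets do not support indexing
except TypeError as error:
    print("TypeError:", error)

# Converting to a sorted list gives a predictable order you can index.
ordered = sorted(my_set)
print(ordered[0])                # blue

# Looping works without any conversion, but the order is arbitrary.
for color in my_set:
    print(color)
```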

Tip

When to use a set vs a list

  • Use a list when order matters or you need duplicates.
  • Use a set when you need uniqueness or fast membership checks.

A common pattern: collect data in a list, convert to a set to remove duplicates, then convert back to a sorted list if you need order.

list -> set -> sorted list is a classic deduplication recipe.
tags = ["python", "beginner", "python", "tutorial", "beginner"]
unique_tags = sorted(set(tags))
print(unique_tags)
Quick check

What does the following code print?

```python
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
print(a & b)
```

Key takeaways

  • A set stores only unique items — duplicates are silently dropped.
  • Use set() to create an empty set; {} alone creates an empty dictionary.
  • Use discard() instead of remove() when the item might not exist.
  • The |, &, -, and ^ operators let you combine sets like a Venn diagram.
  • Sets have no index — use them when you need uniqueness or fast membership checks, not ordered access.
Practice challenges
Predict the output #1

What does this code print?

nums = [1, 2, 3, 2, 1, 4]
unique = set(nums)
print(len(unique))
Fix the bug #2

This code is supposed to create an empty set and add one item to it. What is wrong?

my_set = {}
my_set.add("hello")
print(my_set)
Fill in the blank #3

Complete the code so it prints only the names that appear in BOTH sets.

morning = {"alice", "bob", "carol"}
evening = {"bob", "carol", "dave"}

print(morning ___ evening)
Reorder the lines #4

Put these lines in the right order to remove duplicates from a list and print the unique values in alphabetical order.

1. words = ["cat", "dog", "cat", "bird", "dog"]
2. print(sorted_words)
3. sorted_words = sorted(unique_words)
4. unique_words = set(words)
Your turn
Practice exercise

You have two lists of students who signed up for different workshops. Find: (1) all students who signed up for at least one workshop, (2) students who signed up for BOTH workshops, and (3) students who signed up for the coding workshop but NOT the design workshop. Print each result on its own line.
