Python has an ordered dictionary. What about an ordered set?

# Does Python have an ordered set?

## The Question :

*519 people think this question is useful*

*The Question Comments :*

- what about the converse, a bag of things? (unordered and non-unique)
- @wim
`collections.Counter`

is Python’s bag. - What if something gets added twice? What should the position be?
- @McKay – if it were to follow the behavior of collections.OrderDict it would still be in the position of the initial addition
- Warning: several answers here are outdated. E.g.,
`dict`

is now insertion-ordered (guaranteed since Python 3.7)

## The Answer 1

*214 people think this answer is useful*

There is an ordered set (possible new link) recipe for this which is referred to from the Python 2 Documentation. This runs on Py2.6 or later and 3.0 or later without any modifications. The interface is almost exactly the same as a normal set, except that initialisation should be done with a list.

OrderedSet([1, 2, 3])

This is a MutableSet, so the signature for `.union`

doesn’t match that of set, but since it includes `__or__`

something similar can easily be added:

@staticmethod def union(*sets): union = OrderedSet() union.union(*sets) return union def union(self, *sets): for set in sets: self |= set

## The Answer 2

*159 people think this answer is useful*

## An ordered set is functionally a special case of an ordered dictionary.

The keys of a dictionary are unique. Thus, if one disregards the values in an ordered dictionary (e.g. by assigning them `None`

), then one has essentially an ordered set.

As of Python 3.1 and 2.7 there is `collections.OrderedDict`

. The following is an example implementation of an OrderedSet. (Note that only few methods need to be defined or overridden: `collections.OrderedDict`

and `collections.MutableSet`

do the heavy lifting.)

import collections class OrderedSet(collections.OrderedDict, collections.MutableSet): def update(self, *args, **kwargs): if kwargs: raise TypeError("update() takes no keyword arguments") for s in args: for e in s: self.add(e) def add(self, elem): self[elem] = None def discard(self, elem): self.pop(elem, None) def __le__(self, other): return all(e in other for e in self) def __lt__(self, other): return self <= other and self != other def __ge__(self, other): return all(e in self for e in other) def __gt__(self, other): return self >= other and self != other def __repr__(self): return 'OrderedSet([%s])' % (', '.join(map(repr, self.keys()))) def __str__(self): return '{%s}' % (', '.join(map(repr, self.keys()))) difference = __sub__ difference_update = __isub__ intersection = __and__ intersection_update = __iand__ issubset = __le__ issuperset = __ge__ symmetric_difference = __xor__ symmetric_difference_update = __ixor__ union = __or__

## The Answer 3

*121 people think this answer is useful*

The answer is no, but you can use `collections.OrderedDict`

from the Python standard library with just keys (and values as `None`

) for the same purpose.

**Update**: As of Python 3.7 (and CPython 3.6), standard `dict`

is guaranteed to preserve order and is more performant than `OrderedDict`

. (For backward compatibility and especially readability, however, you may wish to continue using `OrderedDict`

.)

Here’s an example of how to use `dict`

as an ordered set to filter out duplicate items while preserving order, thereby emulating an ordered set. Use the `dict`

class method `fromkeys()`

to create a dict, then simply ask for the `keys()`

back.

>>> keywords = ['foo', 'bar', 'bar', 'foo', 'baz', 'foo'] >>> list(dict.fromkeys(keywords)) ['foo', 'bar', 'baz']

## The Answer 4

*43 people think this answer is useful*

## Implementations on PyPI

While others have pointed out that there is no built-in implementation of an insertion-order preserving set in Python (yet), I am feeling that this question is missing an answer which states what there is to be found on PyPI.

There are the packages:

- ordered-set (Python based)
- orderedset (Cython based)
- collections-extended
- boltons (under iterutils.IndexedSet, Python-based)
- oset (last updated in 2012)

Some of these implementations are based on the recipe posted by Raymond Hettinger to ActiveState which is also mentioned in other answers here.

### Some differences

- ordered-set (version 1.1)
- advantage: O(1) for lookups by index (e.g.
`my_set[5]`

) - oset (version 0.1.3)
- advantage: O(1) for
`remove(item)`

- disadvantage: apparently O(n) for lookups by index

Both implementations have O(1) for `add(item)`

and `__contains__(item)`

(`item in my_set`

).

## The Answer 5

*41 people think this answer is useful*

I can do you one better than an OrderedSet: boltons has a pure-Python, 2/3-compatible `IndexedSet`

type that is not only an ordered set, but also supports indexing (as with lists).

Simply `pip install boltons`

(or copy `setutils.py`

into your codebase), import the `IndexedSet`

and:

>>> from boltons.setutils import IndexedSet >>> x = IndexedSet(list(range(4)) + list(range(8))) >>> x IndexedSet([0, 1, 2, 3, 4, 5, 6, 7]) >>> x - set(range(2)) IndexedSet([2, 3, 4, 5, 6, 7]) >>> x[-1] 7 >>> fcr = IndexedSet('freecreditreport.com') >>> ''.join(fcr[:fcr.index('.')]) 'frecditpo'

Everything is unique and retained in order. Full disclosure: I wrote the `IndexedSet`

, but that also means you can bug me if there are any issues. ðŸ™‚

## The Answer 6

*17 people think this answer is useful*

If you’re using the ordered set to maintain a sorted order, consider using a sorted set implementation from PyPI. The sortedcontainers module provides a SortedSet for just this purpose. Some benefits: pure-Python, fast-as-C implementations, 100% unit test coverage, hours of stress testing.

Installing from PyPI is easy with pip:

pip install sortedcontainers

Note that if you can’t `pip install`

, simply pull down the sortedlist.py and sortedset.py files from the open-source repository.

Once installed you can simply:

from sortedcontainers import SortedSet help(SortedSet)

The sortedcontainers module also maintains a performance comparison with several alternative implementations.

For the comment that asked about Python’s bag data type, there’s alternatively a SortedList data type which can be used to efficiently implement a bag.

## The Answer 7

*10 people think this answer is useful*

In case you’re already using pandas in your code, its `Index`

object behaves pretty like an ordered set, as shown in this article.

Examples from the article:

indA = pd.Index([1, 3, 5, 7, 9]) indB = pd.Index([2, 3, 5, 7, 11]) indA & indB # intersection indA | indB # union indA - indB # difference indA ^ indB # symmetric difference

## The Answer 8

*7 people think this answer is useful*

A little late to the game, but I’ve written a class `setlist`

as part of `collections-extended`

that fully implements both `Sequence`

and `Set`

>>> from collections_extended import setlist >>> sl = setlist('abracadabra') >>> sl setlist(('a', 'b', 'r', 'c', 'd')) >>> sl[3] 'c' >>> sl[-1] 'd' >>> 'r' in sl # testing for inclusion is fast True >>> sl.index('d') # so is finding the index of an element 4 >>> sl.insert(1, 'd') # inserting an element already in raises a ValueError ValueError >>> sl.index('d') 4

GitHub: https://github.com/mlenzen/collections-extended

Documentation: http://collections-extended.lenzm.net/en/latest/

## The Answer 9

*7 people think this answer is useful*

There’s no `OrderedSet`

in official library.
I make an exhaustive cheatsheet of all the data structure for your reference.

DataStructure = { 'Collections': { 'Map': [ ('dict', 'OrderDict', 'defaultdict'), ('chainmap', 'types.MappingProxyType') ], 'Set': [('set', 'frozenset'), {'multiset': 'collection.Counter'}] }, 'Sequence': { 'Basic': ['list', 'tuple', 'iterator'] }, 'Algorithm': { 'Priority': ['heapq', 'queue.PriorityQueue'], 'Queue': ['queue.Queue', 'multiprocessing.Queue'], 'Stack': ['collection.deque', 'queue.LifeQueue'] }, 'text_sequence': ['str', 'byte', 'bytearray'] }

## The Answer 10

*6 people think this answer is useful*

As other answers mention, as for python 3.7+, the dict is ordered by definition. Instead of subclassing `OrderedDict`

we can subclass `abc.collections.MutableSet`

or `typing.MutableSet`

using the dict’s keys to store our values.

class OrderedSet(typing.MutableSet[T]): """A set that preserves insertion order by internally using a dict.""" def __init__(self, iterable: t.Iterator[T]): self._d = dict.fromkeys(iterable) def add(self, x: T) -> None: self._d[x] = None def discard(self, x: T) -> None: self._d.pop(x) def __contains__(self, x: object) -> bool: return self._d.__contains__(x) def __len__(self) -> int: return self._d.__len__() def __iter__(self) -> t.Iterator[T]: return self._d.__iter__()

Then just:

x = OrderedSet([1, 2, -1, "bar"]) x.add(0) assert list(x) == [1, 2, -1, "bar", 0]

I put this code in a small library, so anyone can just `pip install`

it.

## The Answer 11

*2 people think this answer is useful*

The ParallelRegression package provides a setList( ) ordered set class that is more method-complete than the options based on the ActiveState recipe. It supports all methods available for lists and most if not all methods available for sets.

## The Answer 12

*2 people think this answer is useful*

As others have said, `OrderedDict`

is a superset of an ordered set in terms of functionality, but if you need a set for interacting with an API and *don’t* need it to be mutable, `OrderedDict.keys()`

is actually an implementation `abc.collections.Set`

:

import random from collections import OrderedDict, abc a = list(range(0, 100)) random.shuffle(a) # True a == list(OrderedDict((i, 0) for i in a).keys()) # True isinstance(OrderedDict().keys(), abc.Set)

The caveats are immutability and having to build up the set like a dict, but it’s simple and only uses built-ins.

## The Answer 13

*-4 people think this answer is useful*

For many purposes simply calling sorted will suffice. For example

>>> s = set([0, 1, 2, 99, 4, 40, 3, 20, 24, 100, 60]) >>> sorted(s) [0, 1, 2, 3, 4, 20, 24, 40, 60, 99, 100]

If you are going to use this repeatedly, there will be overhead incurred by calling the sorted function so you might want to save the resulting list, as long as you’re done changing the set. If you need to maintain unique elements and sorted, I agree with the suggestion of using OrderedDict from collections with an arbitrary value such as None.

## The Answer 14

*-4 people think this answer is useful*

So i also had a small list where i clearly had the possibility of introducing non-unique values.

I searched for the existence of a unique list of some sort, but then realized that testing the existence of the element before adding it works just fine.

if(not new_element in my_list): my_list.append(new_element)

I don’t know if there are caveats to this simple approach, but it solves my problem.