Using the heap package in Go
By Yichao Ma at
Heap
is an efficient data structure for retrieving the minimum or
maximum element in constant time. Moreover, modifying operations
(push
, pop
, and update
) only take logarithmic time.
A typical use case is using a heap as the underlying data
structure when implementing a priority queue. Although,
heap is usually visualized using a tree-shaped representation
for easier understanding. But it's common to implement heap
using an array.
It seems that the container/heap
package in the standard library
does not provide a default implementation. Instead, it exposes
an interface that we can implement for any data structure to
make it act as a heap.
/* container/heap/heap.go */
type Interface interface {
sort.Interface
Push(x any) // add x as element Len()
Pop() any // remove and return element Len() - 1.
}
Say we would like to store integers in the heap. We can then
make a type alias of integer slice and implement the methods
required by heap.Interface
.
type minHeap []int
func (h minHeap) Len() int {
return len(h)
}
func (h minHeap) Less(i, j int) bool {
return h[i] < h[j]
}
func (h minHeap) Swap(i, j int) {
h[i], h[j] = h[j], h[i]
}
func (h *minHeap) Push(x any) {
*h = append(*h, x.(int))
}
func (h *minHeap) Pop() any {
last := h.Len() - 1
x := (*h)[last]
*h = (*h)[:last]
return x
}
NOTE: You might need to replace the
any
type with empty interfaceinterface{}
if you are using a version of Go before 1.18.
Now, let's create an instance of our minHeap
and use it to
sort a list of integers.
import (
"container/heap"
"fmt"
)
func main() {
h := minHeap{5, 4, 3, 2, 1}
heap.Init(&h) // heapify
sorted := make([]int, 0, h.Len())
for h.Len() > 0 {
sorted = append(sorted, heap.Pop(&h).(int))
}
fmt.Println(sorted)
}
A frequent mistake I make is when pushing and/or popping the
heap. I sometimes accidentally wrote h.Push(x)
and h.Pop()
instead of heap.Push(&h, x)
and heap.Pop(&h)
. Calling the
Push
and Pop
methods we defined on minHeap
directly will
not cause build errors because they are totally valid Go code.
But the heap will not function correctly because the custom
Push
and Pop
methods are supposed to be called by the heap
package when you call heap.Push
and heap.Pop
.
What if we also need a maxHeap
? Do we need to declare a new
type and implement those methods again? You might have noticed
that all methods would be implemented the same way except for
the Less(i, j int) bool
method used for ordering the entries
in the heap. Therefore, re-implementing every method required
by the heap.Interface
is not ideal. Luckily, Go supports a
mechanism known as type embedding.
We can take advantage of this and "override" the Less
method
defined on minHeap
.
type maxHeap struct {
minHeap
}
func (h maxHeap) Less(i, j int) bool {
return h.minHeap[i] > h.minHeap[j]
}
Now we can sort integers in descending order by replacing
h := minHeap{5, 4, 3, 2, 1}
with
h := maxHeap{[]int{1, 2, 3, 4, 5}}
in our previous example.
Another operation that is often performed is updating the
entries in the heap. If the updated value is exactly what
we used to maintain the order of the heap. Then we need to
adjust the position of the this updated entry if its current
position violates the invariance of a heap. The heap.Fix(h heap.Interface, i int)
method is what we need here. But in order to call Fix
,
we need to know where the entry is within the heap. So this
means you need to keep track of the position of each entry in
the heap. And every time this entry gets moved, we need to
update the index as well to match its current position.
This can be done in the Swap(i, j int)
method we implemented
for the heap.Interface
.
type entry struct {
val, pos int // assume we store position along with value in heap
}
type minHeap []entry
func (h minHeap) Swap(i, j int) {
h[i], h[j] = h[j], h[i]
h[i].pos = i
h[j].pos = j
}
/* modify other methods to work with the `entry` type */
So before you heapify the minHeap
using heap.Init
. You
should initialize the pos
field of each entry to its current
index in the heap. If you are pushing new entries, assign h.Len()
to pos
before you append it to the underlying array. With this,
now you can fix the position of an entry after updating its
value by invoking heap.Fix(&h, entry.pos)
. A good example
of this usage is when implementing the Dijkstra's algorithm
which requires a heap to find the next vertex with the shortest
distance away from the source vertex efficiently.
Then use heap.Fix
to update its neighbors' distance till every
node in the graph is processed.
I will use 2642. Design Graph With Shortest Path Calculator
from LeetCode to demonstrate this in action. Feel free to give
it a try yourself before peeking at the solution below.
Expand to see solution
const INF = math.MaxInt - 1e6
type node struct {
pos, dis int
}
type Graph struct {
hlen int
h []int
nodes []node
adj []*list.List
}
func (g Graph) Len() int {
return g.hlen
}
func (g Graph) Less(i, j int) bool {
return g.nodes[g.h[i]].dis < g.nodes[g.h[j]].dis
}
func (g Graph) Swap(i, j int) {
g.h[i], g.h[j] = g.h[j], g.h[i]
g.nodes[g.h[i]].pos = i
g.nodes[g.h[j]].pos = j
}
func (g *Graph) Push(x any) {
g.h = append(g.h, x.(int))
g.hlen++
}
func (g *Graph) Pop() any {
x := g.h[g.hlen-1]
g.hlen--
return x
}
func Constructor(n int, edges [][]int) Graph {
adj := make([]*list.List, n)
for _, e := range edges {
from := e[0]
if adj[from] == nil {
adj[from] = list.New()
}
adj[from].PushFront(e)
}
return Graph{
hlen: n,
h: make([]int, n),
nodes: make([]node, n),
adj: adj,
}
}
func (g *Graph) AddEdge(edge []int) {
from := edge[0]
if g.adj[from] == nil {
g.adj[from] = list.New()
}
g.adj[from].PushFront(edge)
}
func (g *Graph) ShortestPath(node1 int, node2 int) int {
for i := range g.h {
g.h[i] = i
g.nodes[i].pos = i
g.nodes[i].dis = INF
}
g.nodes[node1].dis = 0
g.hlen = len(g.h)
heap.Init(g)
for g.hlen > 0 {
from := heap.Pop(g).(int)
g.nodes[from].pos = -1
if g.adj[from] == nil || g.nodes[from].dis == INF {
continue
}
elem := g.adj[from].Front()
for elem != nil {
to, cost := elem.Value.([]int)[1], elem.Value.([]int)[2]
if g.nodes[to].pos >= 0 {
g.nodes[to].dis = min(g.nodes[to].dis, g.nodes[from].dis+cost)
heap.Fix(g, g.nodes[to].pos)
}
elem = elem.Next()
}
}
if g.nodes[node2].dis == INF {
return -1
}
return g.nodes[node2].dis
}
func min(x, y int) int {
if x < y {
return x
}
return y
}
That's all I want to share in this post. Hope there's something useful to you. I will definitely update this post in the future if I find more tricks we can use to do interesting things in Go using the heap package.