Golang Sync.Map

Kimi.Tsai 收錄於 Golang

2022-08-03 約 4308 字預計閱讀 9 分鐘

Go sync.Map

source from : https://mp.weixin.qq.com/s/8aufz1IzElaYR43ccuwMyA

在之前的《爲什麼 Go map 和 slice 是非線程安全的？》文章中，我們討論了 Go 語言的 map 和 slice 非線程安全的問題，基於此引申出了 map 的兩種目前在業界使用的最多的併發支持的模式。

分別是：

標準庫 sync.Map（Go1.9 及以後）
原生 map + 互斥鎖或讀寫鎖 mutex

sync.Map 優勢

在 Go 官方文檔中明確指出 Map 類型的一些建議： https://pkg.go.dev/sync#Map

多個 goroutine 的併發使用是安全的，不需要額外的鎖定或協調控制。
大多數代碼應該使用原生的 map，而不是單獨的鎖定或協調控制，以獲得更好的類型安全性和維護性。

同時 Map 類型，還針對以下場景進行了性能優化：

當一個給定的鍵的條目只被寫入一次但被多次讀取時。例如在僅會增長的緩存中，就會有這種業務場景。
當多個 goroutines 讀取、寫入和覆蓋不相干的鍵集合的條目時。

這兩種情況與 Go map 搭配單獨的 Mutex 或 RWMutex 相比較，使用 Map 類型可以大大減少鎖的爭奪。

性能測試

聽官方文檔介紹了一堆好處後，他並沒有講到缺點，所說的性能優化後的優勢又是否真實可信。我們一起來驗證一下。

首先我們定義基本的數據結構：

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
// 代表互斥鎖
type FooMap struct {
 sync.Mutex
 data map[int]int
}

// 代表讀寫鎖
type BarRwMap struct {
 sync.RWMutex
 data map[int]int
}

var fooMap *FooMap
var barRwMap *BarRwMap
var syncMap *sync.Map

// 初始化基本數據結構
func init() {
 fooMap = &FooMap{data: make(map[int]int, 100)}
 barRwMap = &BarRwMap{data: make(map[int]int, 100)}
 syncMap = &sync.Map{}
}

result

可直接使用 go19-examples/benchmark-for-map 項目也可以使用 Go 官方提供的 map_bench_test.go，有興趣的小夥伴可以自己拉下來運行試一下。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
goos: darwin
goarch: amd64
pkg: MyGoNote/Golang/Sync.Map
BenchmarkBuiltinMapStoreParalell
BenchmarkBuiltinMapStoreParalell-8               7730116               196.5 ns/op             9 B/op          0 allocs/op
BenchmarkBuiltinRwMapStoreParalell
BenchmarkBuiltinRwMapStoreParalell-8             7243088               189.2 ns/op            10 B/op          0 allocs/op
BenchmarkBuiltinSyncMapStoreParalell
BenchmarkBuiltinSyncMapStoreParalell-8           3633213               359.0 ns/op            49 B/op          3 allocs/op
BenchmarkBuiltinMapLookupParalell
BenchmarkBuiltinMapLookupParalell-8              8469343               141.5 ns/op             0 B/op          0 allocs/op
BenchmarkBuiltinRwMapLookupParalell
BenchmarkBuiltinRwMapLookupParalell-8           18966241                62.50 ns/op            0 B/op          0 allocs/op
BenchmarkBuiltinSyncMapLookupParalell
BenchmarkBuiltinSyncMapLookupParalell-8         34626170                34.63 ns/op            0 B/op          0 allocs/op
BenchmarkBuiltinMapDeleteParalell
BenchmarkBuiltinMapDeleteParalell-8             10312282               138.6 ns/op             0 B/op          0 allocs/op
BenchmarkBuiltinRwMapDeleteParalell
BenchmarkBuiltinRwMapDeleteParalell-8            8938326               152.8 ns/op             0 B/op          0 allocs/op
BenchmarkBuiltinSyncMapDeleteParalell
BenchmarkBuiltinSyncMapDeleteParalell-8         42648486                39.55 ns/op            0 B/op          0 allocs/op
PASS
ok      MyGoNote/Golang/Sync.Map        18.757s

寫入

function name	method	result	performance
BenchmarkBuiltinMapStoreParalell-8	map+mutex	196.5 ns/op	2
BenchmarkBuiltinRwMapStoreParalell-8	map+rwmutex	189.2 ns/op	1
BenchmarkBuiltinSyncMapStoreParalell-8	sync.map	359.0 ns/op	3

在寫入元素上，最慢的是 sync.map 類型，其次是原生 map + 互斥鎖（Mutex），最快的是原生 map + 讀寫鎖（RwMutex）。

總體的排序（從慢到快）爲：SyncMapStore < MapStore < RwMapStore。

查詢

function name	method	result	performance
BenchmarkBuiltinMapSLookupParalell-8	map+mutex	141.5 ns/op	3
BenchmarkBuiltinRwMapSLookupParalell-8	map+rwmutex	62.50 ns/op	2
BenchmarkBuiltinSyncMapSLookupParalell-8	sync.map	34.63 ns/op	1

在查找元素上，最慢的是原生 map + 互斥鎖，其次是原生 map + 讀寫鎖。最快的是 sync.map 類型。

總體的排序爲：MapLookup < RwMapLookup < SyncMapLookup。

刪除

function name	method	result	performance
BenchmarkBuiltinMapSDeleteParalell-8	map+mutex	138.6 ns/op	2
BenchmarkBuiltinRwMapSDeleteParalell-8	map+rwmutex	152.8 ns/op	3
BenchmarkBuiltinSyncMapSDeleteParalell-8	sync.map	39.55 ns/op	1

在刪除元素上，最慢的是原生 map + 讀寫鎖，其次是原生 map + 互斥鎖，最快的是 sync.map 類型。

總體的排序爲：RwMapDelete < MapDelete < SyncMapDelete。

場景分析

根據上述的壓測結果，我們可以得出 sync.Map 類型：

在讀和刪場景上的性能是最佳的，領先一倍有多。

在寫入場景上的性能非常差，落後原生 map + 鎖接近有一倍之多。

因此在實際的業務場景中。假設是讀多寫少的場景，會更建議使用 sync.Map 類型。

但若是那種寫多的場景，例如多 goroutine 批量的循環寫入，那就建議另闢途徑了，性能不忍直視（無性能要求另當別論）。

example code

main.go

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
package syncmap

import "sync"

// 代表互斥鎖
type FooMap struct {
	sync.Mutex
	data map[int]int
}

// 代表讀寫鎖
type BarRwMap struct {
	sync.RWMutex
	data map[int]int
}

var fooMap *FooMap
var barRwMap *BarRwMap
var syncMap *sync.Map

// 初始化基本數據結構
func init() {
	fooMap = &FooMap{data: make(map[int]int, 100)}
	barRwMap = &BarRwMap{data: make(map[int]int, 100)}
	syncMap = &sync.Map{}
}

// 寫入 map + mutex
func builtinMapStore(k, v int) {
	fooMap.Lock()
	defer fooMap.Unlock()
	fooMap.data[k] = v
}

// 查詢 map + mutex
func builtinMapStoreLookup(k int) int {
	fooMap.Lock()
	defer fooMap.Unlock()
	if v, ok := fooMap.data[k]; !ok {
		return -1
	} else {
		return v
	}
}

// 刪除  map + mutex
func builtinMapDelete(k int) {
	fooMap.Lock()
	defer fooMap.Unlock()
	if _, ok := fooMap.data[k]; !ok {
		return
	} else {
		delete(fooMap.data, k)
	}
}

// 寫入 map + rwmutex
func builtinRwMapStore(k, v int) {
	barRwMap.Lock()
	defer barRwMap.Unlock()
	barRwMap.data[k] = v
}

// 查詢 map + rwmutex
func builtinRwMapLookup(k int) int {
	barRwMap.RLock()
	defer barRwMap.RUnlock()
	if v, ok := barRwMap.data[k]; !ok {
		return -1
	} else {
		return v
	}
}

// 刪除 map + rwmutex
func builtinRwMapDelete(k int) {
	barRwMap.Lock()
	defer barRwMap.Unlock()
	if _, ok := barRwMap.data[k]; !ok {
		return
	} else {
		delete(barRwMap.data, k)
	}
}

// 寫入 syncMap
func builtinSyncMapStore(k, v int) {
	syncMap.Store(k, v)
}

// 查詢 syncMap
func builtinSyncMapLookup(k int) int {
	if v, ok := syncMap.Load(k); !ok {
		return -1
	} else {
		return v.(int)
	}
}

// 刪除 syncMap
func builtinSyncMapDelete(k int) {
	if _, ok := syncMap.Load(k); !ok {
		return
	} else {
		syncMap.Delete(k)
	}
}

main_test.go

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
package syncmap

import (
	"math/rand"
	"testing"
	"time"
)

// 寫入
// map + mutex
func BenchmarkBuiltinMapStoreParalell(b *testing.B) {
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		r := rand.New(rand.NewSource(time.Now().Unix()))
		for pb.Next() {
			k := r.Intn(100000000)
			v := r.Intn(100000000)
			builtinMapStore(k, v)
		}
	})
}

// map + rwmutex
func BenchmarkBuiltinRwMapStoreParalell(b *testing.B) {
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		r := rand.New(rand.NewSource(time.Now().Unix()))
		for pb.Next() {
			k := r.Intn(100000000)
			v := r.Intn(100000000)
			builtinRwMapStore(k, v)
		}
	})
}

func BenchmarkBuiltinSyncMapStoreParalell(b *testing.B) {
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		r := rand.New(rand.NewSource(time.Now().Unix()))
		for pb.Next() {
			k := r.Intn(100000000)
			v := r.Intn(100000000)
			builtinSyncMapStore(k, v)
		}
	})
}

// 查詢
func BenchmarkBuiltinMapLookupParalell(b *testing.B) {
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		r := rand.New(rand.NewSource(time.Now().Unix()))
		for pb.Next() {
			k := r.Intn(100000000)
			builtinMapStoreLookup(k)
		}
	})
}

func BenchmarkBuiltinRwMapLookupParalell(b *testing.B) {
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		r := rand.New(rand.NewSource(time.Now().Unix()))
		for pb.Next() {
			k := r.Intn(100000000)
			builtinRwMapLookup(k)
		}
	})
}

func BenchmarkBuiltinSyncMapLookupParalell(b *testing.B) {
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		r := rand.New(rand.NewSource(time.Now().Unix()))
		for pb.Next() {
			k := r.Intn(100000000)
			builtinSyncMapLookup(k)
		}
	})
}

// 刪除
func BenchmarkBuiltinMapDeleteParalell(b *testing.B) {
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		r := rand.New(rand.NewSource(time.Now().Unix()))
		for pb.Next() {
			k := r.Intn(100000000)
			builtinMapDelete(k)
		}
	})
}

func BenchmarkBuiltinRwMapDeleteParalell(b *testing.B) {
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		r := rand.New(rand.NewSource(time.Now().Unix()))
		for pb.Next() {
			k := r.Intn(100000000)
			builtinRwMapDelete(k)
		}
	})
}

func BenchmarkBuiltinSyncMapDeleteParalell(b *testing.B) {
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		r := rand.New(rand.NewSource(time.Now().Unix()))
		for pb.Next() {
			k := r.Intn(100000000)
			builtinSyncMapDelete(k)
		}
	})
}

sync.Map 剖析

清楚如何測試，測試的結果後。我們需要進一步深挖，知其所以然。

爲什麼 sync.Map 類型的測試結果這麼的 “偏科”，爲什麼讀操作性能這麼高，寫操作性能低的可怕，他是怎麼設計的？

數據結構

sync.Map 類型的底層數據結構如下：

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
type Map struct {
 mu Mutex
 read atomic.Value // readOnly
 dirty map[interface{}]*entry
 misses int
}

// Map.read 屬性實際存儲的是 readOnly。
type readOnly struct {
 m       map[interface{}]*entry
 amended bool
}

mu：互斥鎖，用於保護 read 和 dirty。
read：只讀數據，支持併發讀取（atomic.Value 類型）。如果涉及到更新操作，則只需要加鎖來保證數據安全。
read 實際存儲的是 readOnly 結構體，內部也是一個原生 map，amended 屬性用於標記 read 和 dirty 的數據是否一致。
dirty：讀寫數據，是一個原生 map，也就是非線程安全。操作 dirty 需要加鎖來保證數據安全。
misses：統計有多少次讀取 read 沒有命中。每次 read 中讀取失敗後，misses 的計數值都會加 1。

在 read 和 dirty 中，都有涉及到的結構體：

1
2
3
type entry struct {
 p unsafe.Pointer // *interface{}
}

其包含一個指針 p, 用於指向用戶存儲的元素（key）所指向的 value 值。

在此建議你必須搞懂 read、dirty、entry，再往下看，食用效果會更佳，後續會圍繞着這幾個概念流轉。

查找過程

劃重點，Map 類型本質上是有兩個 “map”。一個叫 read、一個叫 dirty，長的也差不多：

sync.Map 的 2 個 map (read, dirty) 當我們從 sync.Map 類型中讀取數據時，其會先查看 read 中是否包含所需的元素：

若有，則通過 atomic 原子操作讀取數據並返回。
若無，則會判斷 read.readOnly 中的 amended 屬性，他會告訴程序 dirty 是否包含 read.readOnly.m 中沒有的數據；因此若存在，也就是 amended 爲 true，將會進一步到 dirty 中查找數據。

sync.Map 的讀操作性能如此之高的原因，就在於存在 read 這一巧妙的設計，其作爲一個緩存層，提供了快路徑（fast path）的查找。

同時其結合 amended 屬性，配套解決了每次讀取都涉及鎖的問題，實現了讀這一個使用場景的高性能。

寫入過程

我們直接關注 sync.Map 類型的 Store 方法，該方法的作用是新增或更新一個元素。

源碼如下：

1
2
3
4
5
6
7
8
// Store sets the value for a key.
func (m *Map) Store(key, value interface{}) {
 read, _ := m.read.Load().(readOnly)
 if e, ok := read.m[key]; ok && e.tryStore(&value) {
  return
 }
  ...
}

調用 Load 方法檢查 m.read 中是否存在這個元素。若存在，且沒有被標記爲刪除狀態，則嘗試存儲。

若該元素不存在或已經被標記爲刪除狀態，則繼續走到下面流程：

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
// Store sets the value for a key.
func (m *Map) Store(key, value any) {
	read, _ := m.read.Load().(readOnly)
	if e, ok := read.m[key]; ok && e.tryStore(&value) {
		return
	}

	m.mu.Lock()
	read, _ = m.read.Load().(readOnly)
	if e, ok := read.m[key]; ok {
		if e.unexpungeLocked() {
			// The entry was previously expunged, which implies that there is a
			// non-nil dirty map and this entry is not in it.
			m.dirty[key] = e
		}
		e.storeLocked(&value)
	} else if e, ok := m.dirty[key]; ok {
		e.storeLocked(&value)
	} else {
		if !read.amended {
			// We're adding the first new key to the dirty map.
			// Make sure it is allocated and mark the read-only map as incomplete.
			m.dirtyLocked()
			m.read.Store(readOnly{m: read.m, amended: true})
		}
		m.dirty[key] = newEntry(value)
	}
	m.mu.Unlock()
}

由於已經走到了 dirty 的流程，因此開頭就直接調用了 Lock 方法上互斥鎖，保證數據安全，也是凸顯性能變差的第一幕。

其分爲以下三個處理分支：

若發現 read 中存在該元素，但已經被標記爲已刪除（expunged），則說明 dirty 不等於 nil（dirty 中肯定不存在該元素）。其將會執行如下操作。
將元素狀態從已刪除（expunged）更改爲 nil。
將元素插入 dirty 中。
若發現 read 中不存在該元素，但 dirty 中存在該元素，則直接寫入更新 entry 的指向。
若發現 read 和 dirty 都不存在該元素，則從 read 中複製未被標記刪除的數據，並向 dirty 中插入該元素，賦予元素值 entry 的指向。

我們理一理，寫入過程的整體流程就是：

查 read，read 上沒有，或者已標記刪除狀態。
上互斥鎖（Mutex）。
操作 dirty，根據各種數據情況和狀態進行處理。

回到最初的話題，爲什麼他寫入性能差那麼多。究其原因：

寫入一定要會經過 read，無論如何都比別人多一層，後續還要查數據情況和狀態，性能開銷相較更大。
(第三個處理分支)當初始化或者 dirty 被提升後，會從 read 中複製全量的數據，若 read 中數據量大，則會影響性能。

可得知 sync.Map 類型不適合寫多的場景，讀多寫少是比較好的。

若有大數據量的場景，則需要考慮 read 複製數據時的偶然性能抖動是否能夠接受。

刪除過程

這時候可能有小夥伴在想了。寫入過程，理論上和刪除不會差太遠。怎麼 sync.Map 類型的刪除的性能似乎還行，這裏面有什麼貓膩？

源碼如下：

1
2
3
4
5
6
7
8
9
func (m *Map) LoadAndDelete(key any) (value any, loaded bool) {
	read, _ := m.read.Load().(readOnly)
	e, ok := read.m[key]
	if !ok && read.amended {
		...
	}
	if ok {
		return e.delete()
	}

刪除是標準的開場，依然先到 read 檢查該元素是否存在。

若存在，則調用 delete 標記爲 expunged（刪除狀態），非常高效。可以明確在 read 中的元素，被刪除，性能是非常好的。

若不存在，也就是走到 dirty 流程中：

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
// Delete deletes the value for a key.
func (m *Map) Delete(key any) {
	m.LoadAndDelete(key)
}

// LoadAndDelete deletes the value for a key, returning the previous value if any.
// The loaded result reports whether the key was present.
func (m *Map) LoadAndDelete(key any) (value any, loaded bool) {
	read, _ := m.read.Load().(readOnly)
	e, ok := read.m[key]
	if !ok && read.amended {
		m.mu.Lock()
		read, _ = m.read.Load().(readOnly)
		e, ok = read.m[key]
		if !ok && read.amended {
			e, ok = m.dirty[key]
			delete(m.dirty, key)
			// Regardless of whether the entry was present, record a miss: this key
			// will take the slow path until the dirty map is promoted to the read
			// map.
			m.missLocked()
		}
		m.mu.Unlock()
	}
	if ok {
		return e.delete()
	}
	return nil, false
}

若 read 中不存在該元素，dirty 不爲空，read 與 dirty 不一致（利用 amended 判別），則表明要操作 dirty，上互斥鎖。

再重複進行雙重檢查，若 read 仍然不存在該元素。則調用 delete 方法從 dirty 中標記該元素的刪除。

需要注意，出現頻率較高的 delete 方法：

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
func (e *entry) delete() (value any, ok bool) {
	for {
		p := atomic.LoadPointer(&e.p)
		if p == nil || p == expunged {
			return nil, false
		}
		if atomic.CompareAndSwapPointer(&e.p, p, nil) {
			return *(*any)(p), true
		}
	}
}

該方法都是將 entry.p 置爲 nil，並且標記爲 expunged（刪除狀態），而不是真真正正的刪除。

注：不要誤用 sync.Map，前段時間從字節大佬分享的案例來看，他們將一個連接作爲 key 放了進去，於是和這個連接相關的，例如：buffer 的內存就永遠無法釋放了…

總結

通過閱讀本文，我們明確了 sync.Map 和原生 map + 互斥鎖 / 讀寫鎖之間的性能情況。

標準庫 sync.Map 雖說支持併發讀寫 map，但更適用於讀多寫少的場景，因爲他寫入的性能比較差，使用時要考慮清楚這一點。

另外我們針對 sync.Map 的性能差異，進行了深入的源碼剖析，瞭解到了其背後快、慢的原因，實現了知其然知其所以然。

經常看到併發讀寫 map 導致致命錯誤，實在是令人憂心。大家覺得如果本文不錯，歡迎分享給更多的 Go 愛好者：）

目錄