
LRU and LFU in C++

Konstantin Vladimirov poses the following problem in his lecture A scent of C++ (which you can find on YouTube):

Assume we have access to millions of pages, each with a different number. We can only access the pages via a slow access function slow_get_page_int. Ideally we'd like to cache a small portion of the pages (the ones the user accesses frequently) for quicker access. That can be done via multiple caching algorithms.
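
To make the examples below concrete, here is a minimal stand-in for such a slow access function. This is purely hypothetical (the lecture and the repository provide their own); it simulates expensive I/O with a short sleep and just returns the key as the "page":

#include <chrono>
#include <thread>

// Hypothetical stand-in for the slow page access used throughout this post.
int slow_get_page_int(int key) {
	std::this_thread::sleep_for(std::chrono::milliseconds(10));
	return key;
}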

In this blogpost we will discuss the caching algorithm (LRU) Konstantin provides in the lecture and another variant (LFU) he suggests implementing as homework.

LRU

LRU stands for least recently used. That means that when the cache is full, we simply evict the least recently used page from the cache. Let us see how this is implemented in C++.

#include <cstddef>
#include <list>
#include <unordered_map>
#include <utility>

template <typename T, typename KeyT = int> struct lru_cache_t {
	size_t sz_;
	// Pages ordered by recency: the most recently used one sits at the front.
	std::list<std::pair<KeyT, T>> cache_;

	using ListIt = typename std::list<std::pair<KeyT, T>>::iterator;
	// Maps a key to its position in the list for O(1) membership checks.
	std::unordered_map<KeyT, ListIt> hash_;

	lru_cache_t(size_t sz) : sz_(sz) {}

	bool full() const { return (cache_.size() == sz_); }

	template <typename F> bool lookup_update(KeyT key, F slow_get_page) {
		auto hit = hash_.find(key);
		if (hit == hash_.end()) {
			// Miss: evict the least recently used page (the back) if necessary.
			if (full()) {
				hash_.erase(cache_.back().first);
				cache_.pop_back();
			}
			cache_.emplace_front(key, slow_get_page(key));
			hash_.emplace(key, cache_.begin());
			return false;
		}

		// Hit: relink the accessed page to the front without copying it.
		auto eltit = hit->second;
		cache_.splice(cache_.begin(), cache_, eltit);
		return true;
	}
};

We see that the implementation is pretty simple. Let us analyse this small piece of code.

size_t sz_;
std::list<std::pair<KeyT, T>> cache_;

using ListIt = typename std::list<std::pair<KeyT, T>>::iterator;
std::unordered_map<KeyT, ListIt> hash_;

lru_cache_t(size_t sz) : sz_(sz) {}

bool full() const { return (cache_.size() == sz_); }

We see that cache_ is simply a std::list. Each entry of this list is a pair consisting of the key and the corresponding page value. Furthermore, hash_ maps a given key to the corresponding iterator in the linked list. This is a smart choice because it allows us to move the most recent entry to the front of the cache without copying or moving it, just by providing the iterator; see for instance the documentation of splice, which is used to do exactly that. Finally, we have a small helper function that simply indicates whether the cache is full.
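
To illustrate why splice is such a good fit here, consider this small standalone sketch (not part of the cache itself): it moves an element to the front of a list in O(1), without copying it and without invalidating the iterator to it.

#include <cassert>
#include <iterator>
#include <list>

int main() {
	std::list<int> l{1, 2, 3, 4};
	auto it = std::next(l.begin(), 2); // points to 3

	// Relink the node *it to the front of the same list; no element is
	// copied or moved, and it still points to the same node afterwards.
	l.splice(l.begin(), l, it);

	assert(l.front() == 3);
	assert(*it == 3);
}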

template <typename F> bool lookup_update(KeyT key, F slow_get_page) {
	auto hit = hash_.find(key);
	if (hit == hash_.end()) {
		if (full()) {
			hash_.erase(cache_.back().first);
			cache_.pop_back();
		}
		cache_.emplace_front(key, slow_get_page(key));
		hash_.emplace(key, cache_.begin());
		return false;
	}

	auto eltit = hit->second;
	cache_.splice(cache_.begin(), cache_, eltit);
	return true;
}

This is the main logic: it tells us whether we have a hit for the requested key. There are two cases:

  1. The given key is already in the cache (i.e. a hit).
  2. The given key is not in the cache (i.e. no hit).

In the first case we simply access the second element (of type ListIt) and then employ splice on cache_ to move the corresponding element to the front. That is because the order of the list encodes recency: the front always holds the page that was accessed most recently.

In the second case we remove the back of the cache (if it is full) before adding the new element to the front. This is simply the eviction policy in action.

Note that hash_ enables a quick membership check because lookup in an unordered map is O(1) on average.
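
Putting it all together, a minimal driver could look like this (assuming the hypothetical slow_get_page_int stand-in from above):

#include <iostream>

int main() {
	lru_cache_t<int> cache(2); // room for two pages

	std::cout << cache.lookup_update(1, slow_get_page_int) << '\n'; // 0: miss
	std::cout << cache.lookup_update(2, slow_get_page_int) << '\n'; // 0: miss
	std::cout << cache.lookup_update(1, slow_get_page_int) << '\n'; // 1: hit, 1 moves to the front
	std::cout << cache.lookup_update(3, slow_get_page_int) << '\n'; // 0: miss, evicts 2
	std::cout << cache.lookup_update(2, slow_get_page_int) << '\n'; // 0: miss, 2 was evicted
}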

LFU

After thinking about it, the above approach has some problems. Think of a user who has clicked on one page 1,000 times over their whole browsing history. It could then still happen that this page gets evicted and the cache fills up with pages that were each clicked just once. That is because LRU only considers recency when evicting pages from the cache.

LFU stands for least frequently used and is based on the following idea: we introduce a grouping into the cache. Instead of ordering only by access time, we order on a first level by access frequency. If multiple entries were accessed the same number of times we fall back to the time ordering, which makes sense as well. We can implement this using an unordered map where the key is the frequency and the values are linked lists like above.

#include <cstddef>
#include <list>
#include <tuple>
#include <unordered_map>

template <typename T, typename KeyT = int, typename Freq = int> struct lfu_cache_t {
	size_t sz_;
	Freq min_freq_;

	// One "sub-cache" per access frequency, each ordered by recency.
	using List = std::list<std::tuple<KeyT, T, Freq>>;
	std::unordered_map<Freq, List> freq_cache_;

	using ListIt = typename List::iterator;
	std::unordered_map<KeyT, ListIt> hash_;

	lfu_cache_t(size_t sz) : sz_(sz), min_freq_(0) {}

	bool full() const { return (hash_.size() == sz_); }

	template <typename F> bool lookup_update(KeyT key, F slow_get_page) {
		auto hit = hash_.find(key);
		if (hit == hash_.end()) {
			// Miss: evict the least recently used page of the lowest frequency.
			if (full()) {
				hash_.erase(std::get<0>(freq_cache_[min_freq_].back()));
				freq_cache_[min_freq_].pop_back();
			}
			// A new page always starts with frequency one.
			min_freq_ = 1;
			freq_cache_[min_freq_].emplace_front(key, slow_get_page(key), min_freq_);
			hash_.emplace(key, freq_cache_[min_freq_].begin());
			return false;
		}
		// Hit: bump the frequency and move the page one bucket "up".
		auto eltit = hit->second;
		Freq freq = std::get<2>(*eltit);
		++(std::get<2>(*eltit));
		freq_cache_[freq + 1].splice(freq_cache_[freq + 1].begin(), freq_cache_[freq], eltit);
		if (freq_cache_[freq].empty() && min_freq_ == freq) {
			min_freq_ = freq + 1;
		}
		return true;
	}
};

We see that the code is very similar to the LRU code above.

using List = std::list<std::tuple<KeyT, T, Freq>>;
std::unordered_map<Freq, List> freq_cache_;

Here we have a separate "sub-cache" for each frequency. Also, each node is now a tuple with an additional field that stores how often the corresponding page has been accessed.

lfu_cache_t(size_t sz) : sz_(sz), min_freq_(0) {}

We'll also introduce a minimum frequency. It is adjusted whenever the lowest frequency group becomes empty or a new entry enters the cache (which of course starts with a frequency of one).

template <typename F> bool lookup_update(KeyT key, F slow_get_page) {
	auto hit = hash_.find(key);
	if (hit == hash_.end()) {
		if (full()) {
			hash_.erase(std::get<0>(freq_cache_[min_freq_].back()));
			freq_cache_[min_freq_].pop_back();
		}
		min_freq_ = 1;
		freq_cache_[min_freq_].emplace_front(key, slow_get_page(key), min_freq_);
		hash_.emplace(key, freq_cache_[min_freq_].begin());
		return false;
	}
	auto eltit = hit->second;
	Freq freq = std::get<2>(*eltit);
	++(std::get<2>(*eltit));
	freq_cache_[freq + 1].splice(freq_cache_[freq + 1].begin(), freq_cache_[freq], eltit);
	if (freq_cache_[freq].empty() && min_freq_ == freq) {
		min_freq_ = freq + 1;
	}
	return true;
}

We see that the lookup is similar.

In case of a cache miss we now first evict (if the cache is full) the least recently used page from the minimum frequency bucket. Afterwards we insert the current page at the front of the frequency one bucket and initialise its frequency with one.

If we have a hit we simply increase the frequency, move the current page one bucket "up" and potentially adjust the minimum frequency (namely if the hit was in the minimum frequency bucket and left it empty).
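
Running a short sequence against the LFU cache shows the difference in eviction behaviour (again assuming the hypothetical slow_get_page_int stand-in from above):

#include <iostream>

int main() {
	lfu_cache_t<int> cache(2); // room for two pages

	cache.lookup_update(1, slow_get_page_int); // miss, freq of page 1 becomes 1
	cache.lookup_update(1, slow_get_page_int); // hit,  freq of page 1 becomes 2
	cache.lookup_update(2, slow_get_page_int); // miss, freq of page 2 becomes 1
	cache.lookup_update(3, slow_get_page_int); // miss, evicts page 2 (lowest freq), not page 1
	std::cout << cache.lookup_update(1, slow_get_page_int) << '\n'; // 1: hit, page 1 survived
}

Under LRU the same sequence would have evicted page 1 when page 3 arrived, even though page 1 is the most frequently used one.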

GoogleTest

Konstantin suggests using an appropriate build system and unit testing for the homework. Here I will briefly show the organisation of the code.

I added the GoogleTest library as a submodule to my git repository. It can then be used via add_subdirectory in CMake. In the appropriate folder (i.e. the one containing our unit tests) we can then add

add_executable(test_cache test_cache.cc)
target_link_libraries(
        test_cache
        GTest::gtest_main
)
target_include_directories(
        test_cache
        PRIVATE
        ${CMAKE_CURRENT_SOURCE_DIR}/../include
)

include(GoogleTest)
gtest_discover_tests(test_cache)

and simply run our tests.

For example, the number of hits can then be tested like so (where we simulate user input via file streaming):

#include <cassert>
#include <fstream>
#include <string>

#include <gtest/gtest.h>

// Reads a test case of the form: cache size m, number of queries n,
// followed by n page keys. Returns the number of cache hits.
template <typename CacheT>
int get_hits_for_testcase(const std::string& file_name) {
    std::ifstream is(file_name);
    int hits = 0;
    int n;
    size_t m;

    is >> m >> n;
    assert(is.good());
    CacheT c{m};

    for (int i = 0; i < n; ++i) {
        int q;
        is >> q;
        assert(is.good());
        if (c.lookup_update(q, slow_get_page_int))
            hits += 1;
    }

    return hits;
}

TEST(MyTests, CacheSizeOneLRU) {
    const int hits = get_hits_for_testcase<caches::lru_cache_t<int>>("../../src/data/LruTwoHits.txt");
    EXPECT_EQ(hits, 0);
}
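
The data files follow the format read by get_hits_for_testcase: the cache size m, the number of queries n, and then the n page keys. A hypothetical file for a cache of size 2 and six queries could look like this (the queries 1 2 1 2 1 2 produce four hits under LRU):

2 6
1 2 1 2 1 2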

You can check out the full build and test setup I used on my repo.

Conclusion

I hope this blogpost showed the advantages modern C++ offers via its standard library. I suggest you check out Konstantin's lecture to learn more; I think it's a really great lecture. My repository can be found here and if you like we can connect on LinkedIn to exchange ideas about C++, CUDA or GPU programming in general.