pudge: an embedded database in 500 lines of Go

pudge is an embeddable key/value database written using only the Go standard library.

I will focus on the fundamental differences from existing solutions.

Stateless

pudge.Set("../test/test", "Hello", "World")

pudge will automatically create the test database, including any subdirectories, or open it if it already exists. There is no need to keep track of the table's state, and it is safe to store values from multi-threaded applications: pudge is thread-safe.

Typefree

Bytes, strings, numbers, or structs can be written to pudge. You do not need to worry about converting data into its binary representation.

  type Point struct {
      X int
      Y int
  }
  for i := 100; i >= 0; i-- {
      p := &Point{X: i, Y: i}
      db.Set(i, p)
  }
  var point Point
  db.Get(8, &point)
  log.Println(point)
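A minimal sketch of how such type-free storage can be built on the standard library: raw bytes and strings pass through unchanged, integers get a fixed-width encoding, and everything else goes through `encoding/gob`. This is an illustration under those assumptions, not pudge's actual code; `toBinary` and `fromBinary` are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/gob"
	"fmt"
)

// toBinary converts an arbitrary value to bytes.
func toBinary(v interface{}) ([]byte, error) {
	switch t := v.(type) {
	case []byte:
		return t, nil
	case string:
		return []byte(t), nil
	case int:
		// Fixed-width big-endian: non-negative integers compare
		// bytewise in the same order as they compare numerically.
		buf := make([]byte, 8)
		binary.BigEndian.PutUint64(buf, uint64(t))
		return buf, nil
	default:
		// Structs and other types fall back to gob.
		var buf bytes.Buffer
		err := gob.NewEncoder(&buf).Encode(v)
		return buf.Bytes(), err
	}
}

// fromBinary decodes gob-encoded bytes into the value out points to.
func fromBinary(data []byte, out interface{}) error {
	return gob.NewDecoder(bytes.NewReader(data)).Decode(out)
}

type Point struct {
	X int
	Y int
}

func main() {
	raw, err := toBinary(&Point{X: 8, Y: 8})
	if err != nil {
		panic(err)
	}
	var p Point
	if err := fromBinary(raw, &p); err != nil {
		panic(err)
	}
	fmt.Println(p) // prints {8 8}
}
```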

QuerySystem

pudge can return keys in a specific order, with support for a limit, an offset, a sort direction, and selection by prefix.

  keys, _ := db.Keys(7, 2, 0, true)

The above code is analogous to the SQL query:

 select keys from db where key>7 order by keys asc limit 2 offset 0

Note that keys are sorted lazily. On the other hand, the keys are held in memory, so this runs quite fast.

Parallelism

pudge, like most modern databases, uses a non-blocking read model, but writing to a file blocks all other operations. However, you can create and open files on the fly, which minimizes the number of locks. pudge has no "database already opened" error. An example using an HTTP router:

  func write(c *gin.Context) {
      group := c.Param("group")
      counter := c.Param("counter")
      // Open (or reuse) the database file for this group.
      db, err := pudge.Open(group, cfg)
      if err != nil {
          renderError(c, err)
          return
      }
      // Increment the named counter by 1.
      _, err = db.Counter(counter, 1)
      if err != nil {
          renderError(c, err)
          return
      }
      c.String(http.StatusOK, "%s", "ok")
  }

Engines

Despite its small size, pudge supports two storage modes: in memory and on disk. By default, pudge stores data (values) only on disk. If you wish, you can enable in-memory storage; in that case the data is flushed to disk on request or when the database is closed.

Status

pudge is used both in hobby projects and in production. The chart below shows the number of requests to an HTTP server built on pudge, and the number of requests that took longer than 20 ms.

In this case pudge runs in full synchronization mode, and significant delays (more than 20 ms) occur at the moment of fsync. Fortunately, they make up only a small percentage of requests.

On the project page you can find more links with examples of integrating pudge into various projects.

Speed

In the benchmark repository you can compare pudge with other databases:

Test 1

 Number of keys: 1000000
 Minimum key size: 16, maximum key size: 64
 Minimum value size: 128, maximum value size: 512
 Concurrency: 2

Metric	pogreb	goleveldb	bolt	badgerdb	pudge	slowpoke	pudge (mem)
1M (Put + Get), seconds	187	38	126	34	23	23	2
1M Put, ops/sec	5336	34743	8054	33539	47298	46789	439581
1M Get, ops/sec	1782423	98406	499871	220597	499172	445783	1652069
FileSize, Mb	568	357	552	487	358	358	358

pudge is very well balanced in terms of the ratio between write speed and read speed. That is, it is not a highly specialized database optimized for either reading or writing. Alongside a high read speed it maintains a fairly high write speed, which, incidentally, can be increased further by parallelizing writes across different files (as is done in LSM-tree engines).

Links to the databases used in the test:

pogreb Embedded key/value store for read-heavy workloads written in Go
goleveldb LevelDB key / value database in Go.
bolt An embedded key / value database for Go.
badgerdb Fast key-value DB in Go
slowpoke (based on pudge)
pudge

I was asked to compare pudge with memcached and Redis, but since the lion's share of the time in those systems is spent on the network interface, such a comparison would not be entirely fair. On the other hand, pudge benefits from multithreading, even though it writes data to disk.

Further development

Transactions. It would be convenient to combine write requests into a batch, with automatic rollback in case of an error.
Key lifetime. The ability to limit how long a key lives (like TTL in memcached, Cassandra, etc.).
No server. pudge is convenient to embed in existing microservices, but a standalone server will most likely appear, as a separate project.
Mobile version. For use on Android and iOS, and as a plugin for Flutter.

Source: https://habr.com/ru/post/439216/
