Background
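I'm teaching myself databases in my spare time, and I'm trying to learn by implementing a simple one.
The first thing you have to implement is the underlying data format and storage mechanism.
Databases use a structure called a "Slotted Page", which looks like this: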
+----------------------------------------------------------+
| +----------------------+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ |
| |        HEADER        | | | | | | | | | | | | | | | | | |
| |                      | | | | | | | | | | | | | | | | | |
| +----------------------+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ |
|                                    SLOT ARRAY            |
|                                                          |
|                                                          |
|                                                          |
|                +--------------------+ +----------------+ |
|                |      TUPLE #4      | |    TUPLE #3    | |
|                |                    | |                | |
|                +--------------------+ +----------------+ |
|        +--------------------------+ +------------------+ |
|        |         TUPLE #2         | |     TUPLE #1     | |
|        |                          | |                  | |
|        +--------------------------+ +------------------+ |
+----------------------------------------------------------+
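To make the diagram concrete, here is a minimal Java sketch of such a page: the slot array grows forward from the header while tuples are packed backward from the end of the page. The 4096-byte page size, the two-int header (slot count and start of the tuple area) and every name in it are assumptions made for illustration, not a layout taken from any particular database.

```java
import java.nio.ByteBuffer;

// Illustrative only: page size, header fields and names are assumptions.
final class SlottedPage {
    static final int PAGE_SIZE = 4096;
    static final int HEADER_SIZE = 2 * Integer.BYTES; // [slotCount][tupleAreaStart]
    static final int SLOT_SIZE = 2 * Integer.BYTES;   // [offset][length]

    private final ByteBuffer buf = ByteBuffer.allocate(PAGE_SIZE);

    SlottedPage() {
        buf.putInt(0, 0);                     // no slots yet
        buf.putInt(Integer.BYTES, PAGE_SIZE); // tuple area is empty, starts at page end
    }

    int slotCount() {
        return buf.getInt(0);
    }

    // Stores a tuple and returns its slot index, or -1 if it does not fit.
    int insert(byte[] tuple) {
        int count = slotCount();
        int slotArrayEnd = HEADER_SIZE + (count + 1) * SLOT_SIZE;
        int tupleStart = buf.getInt(Integer.BYTES) - tuple.length;
        if (tupleStart < slotArrayEnd) {
            return -1; // slot array and tuple area would collide
        }
        ByteBuffer view = buf.duplicate(); // independent position, shared bytes
        view.position(tupleStart);
        view.put(tuple);

        int slotPos = HEADER_SIZE + count * SLOT_SIZE;
        buf.putInt(slotPos, tupleStart);                   // slot.offset
        buf.putInt(slotPos + Integer.BYTES, tuple.length); // slot.length
        buf.putInt(0, count + 1);
        buf.putInt(Integer.BYTES, tupleStart);
        return count;
    }

    // Looks up the slot, then copies the tuple bytes it points at.
    byte[] read(int slotIdx) {
        if (slotIdx < 0 || slotIdx >= slotCount()) {
            throw new IndexOutOfBoundsException("no such slot: " + slotIdx);
        }
        int slotPos = HEADER_SIZE + slotIdx * SLOT_SIZE;
        byte[] out = new byte[buf.getInt(slotPos + Integer.BYTES)];
        ByteBuffer view = buf.duplicate();
        view.position(buf.getInt(slotPos));
        view.get(out);
        return out;
    }
}
```

Calling insert twice on a fresh page returns slot indexes 0 and 1, and read(0) returns the bytes of the first tuple. A tuple that would make the tuple area overlap the slot array is rejected with -1.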
The page data is stored in a file using binary serialization. The slots are the simplest part; a slot's definition might look like this:
#include <cstdint>

struct Slot {
    uint32_t offset;  // byte offset of the tuple within the page
    uint32_t length;  // tuple length in bytes
};
In C++, reading and writing could be done with std::memcpy:
#include <cstring>  // std::memcpy

// Ignoring the header-size offset below
void write_to_buffer(char *buffer, const Slot& slot, uint32_t slot_idx) {
    // Each slot occupies sizeof(Slot) bytes: offset first, then length
    std::memcpy(buffer + sizeof(Slot) * slot_idx, &slot.offset, sizeof(uint32_t));
    std::memcpy(buffer + sizeof(Slot) * slot_idx + sizeof(uint32_t), &slot.length, sizeof(uint32_t));
}

void read_from_buffer(const char *buffer, Slot& slot, uint32_t slot_idx) {
    std::memcpy(&slot.offset, buffer + sizeof(Slot) * slot_idx, sizeof(uint32_t));
    // length is stored sizeof(uint32_t) bytes after offset within the same slot
    std::memcpy(&slot.length, buffer + sizeof(Slot) * slot_idx + sizeof(uint32_t), sizeof(uint32_t));
}
In Java, as far as I know, you can do either of two things:
- ByteBuffer
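import java.nio.ByteBuffer;

record Slot(int offset, int length) {
    // Relative puts: write at the buffer's current position and advance it
    void write(ByteBuffer buffer) {
        buffer.putInt(offset).putInt(length);
    }

    // Relative gets: read offset, then length, from the current position
    static Slot read(ByteBuffer buffer) {
        return new Slot(buffer.getInt(), buffer.getInt());
    }
}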
- The new Foreign Memory API
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

record Slot(int offset, int length) {
    public static final MemoryLayout LAYOUT = MemoryLayout.structLayout(
        ValueLayout.JAVA_INT.withName("offset"),
        ValueLayout.JAVA_INT.withName("length"));

    // Reads offset at byte 0 and length at byte 4 of the segment
    public static Slot from(MemorySegment memory) {
        return new Slot(
            memory.get(ValueLayout.JAVA_INT, 0),
            memory.get(ValueLayout.JAVA_INT, Integer.BYTES));
    }

    public void to(MemorySegment memory) {
        memory.set(ValueLayout.JAVA_INT, 0, offset);
        memory.set(ValueLayout.JAVA_INT, Integer.BYTES, length);
    }
}
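For a closer match to the C++ functions, which address a slot by its index, the ByteBuffer variant could use absolute instead of relative access. The following is only a sketch: IndexedSlot, writeAt and readAt are hypothetical names, and the header offset is ignored just as in the C++ code. One thing to keep in mind is that ByteBuffer is big-endian by default, whereas memcpy copies bytes in the machine's native order, so the demo sets the order explicitly.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical record for illustration; header offset is ignored.
record IndexedSlot(int offset, int length) {
    static final int BYTES = 2 * Integer.BYTES; // 8 bytes per slot

    // Absolute puts: the buffer's position is left untouched.
    void writeAt(ByteBuffer buffer, int slotIdx) {
        int base = slotIdx * BYTES;
        buffer.putInt(base, offset);
        buffer.putInt(base + Integer.BYTES, length);
    }

    // Absolute gets at the same two positions.
    static IndexedSlot readAt(ByteBuffer buffer, int slotIdx) {
        int base = slotIdx * BYTES;
        return new IndexedSlot(buffer.getInt(base), buffer.getInt(base + Integer.BYTES));
    }
}

class IndexedSlotDemo {
    public static void main(String[] args) {
        // Native order mirrors what memcpy would produce on this machine.
        ByteBuffer page = ByteBuffer.allocate(4096).order(ByteOrder.nativeOrder());
        new IndexedSlot(4000, 96).writeAt(page, 3);
        System.out.println(IndexedSlot.readAt(page, 3));
    }
}
```

Absolute access avoids having to set and restore the buffer's position around every slot read, which is convenient when slots are visited in arbitrary order.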
What would the performance difference between them be?
If it is negligible, I would prefer the ByteBuffer API.