A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
C++
Velox is a nifty C++ library that accelerates database performance, designed to be your go-to for turbocharging query engines and data processing systems. Think of it as a powerhouse toolkit that Meta created, with big names like IBM, Intel, Microsoft, and others backing it up. In the grand scheme of things, Velox takes a fully optimized query plan as input and gets straight to business, executing it faster than your morning cup of joe kicks in. But don't expect it to parse SQL, provide a dataframe layer, or play query optimizer: Velox is the secret sauce you pour over your own compute engines.

Here's the skinny on what makes Velox the cat's pajamas:

- **Type**: A generic typing system that handles scalar, complex, and nested types, including structs, maps, arrays, and tensors.
- **Vector**: An Arrow-compatible columnar memory layout with a bunch of encoding options in one neat module. Whether you're dealing with Flat, Dictionary, or Sequence/RLE, Velox has got you covered.
- **Expression Eval**: A fully vectorized engine that soars through expressions efficiently.
- **Function Packages**: Vectorized functions based on Presto and Spark semantics.
- **Operators**: All the usual suspects (scan, projection, filtering, groupBy, orderBy, shuffle, and more), ready at your command.
- **I/O**: Connectors for different file formats and storage adapters, including ORC/DWRF, Parquet, S3, HDFS, and local files.
- **Network Serializers**: An interface for implementing wire protocols; PrestoPage and Spark's UnsafeRow are in the mix.
- **Resource Management**: Primitives for managing computational resources, from memory and buffer management to tasks, drivers, and thread pools.

The magic of Velox lies in its flexibility and extensibility. Developers can craft their own engine-specific jazz with custom types, functions, operators, file formats, storage adapters, and network serializers.
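To make the Flat, Dictionary, and Sequence/RLE encodings mentioned above a bit more concrete, here is a minimal sketch of the three ideas in plain C++. Every type and function name in it (`FlatColumn`, `DictionaryColumn`, `RleColumn`, `decode`, `addConstant`) is hypothetical and made up for illustration; none of this is Velox's actual API, and real vectors also track nulls, types, and memory pools.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; these are NOT Velox classes.

// Flat: one value per row, stored contiguously.
using FlatColumn = std::vector<int64_t>;

// Dictionary: each row stores an index into a (usually small) set of values.
struct DictionaryColumn {
  std::vector<int64_t> values;    // distinct values
  std::vector<uint32_t> indices;  // one entry per row
};

// Run-length (RLE): each run is a single value repeated `length` times.
struct Run {
  int64_t value;
  uint32_t length;
};
using RleColumn = std::vector<Run>;

// Expand a dictionary-encoded column into a flat one.
FlatColumn decode(const DictionaryColumn& column) {
  FlatColumn flat;
  flat.reserve(column.indices.size());
  for (uint32_t index : column.indices) {
    assert(index < column.values.size());
    flat.push_back(column.values[index]);
  }
  return flat;
}

// Expand a run-length-encoded column into a flat one.
FlatColumn decode(const RleColumn& column) {
  FlatColumn flat;
  for (const Run& run : column) {
    flat.insert(flat.end(), run.length, run.value);
  }
  return flat;
}

// Compute `x + delta` for every row of a dictionary column by updating only
// the distinct values; the per-row indices are reused unchanged.
DictionaryColumn addConstant(DictionaryColumn column, int64_t delta) {
  for (int64_t& value : column.values) {
    value += delta;
  }
  return column;
}
```

The point of `addConstant` is that an encoded column can sometimes be processed without decoding it first: a column with four rows but two distinct values needs only two additions. Whether and how a given engine exploits this depends on the function and the data, so treat it as an illustration of the idea rather than a description of Velox internals.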
Velox doesn't just sit pretty on the shelf. It's designed for folks who like to roll up their sleeves and work directly with CPU and thread execution, resource spilling, and caching. With companies like Meta, IBM, and Intel in the mix, you're in good company when you integrate Velox into your data processing engine.

Got a Mac? Ubuntu? CentOS? Velox sets up on all of them, whether you're on a trusty Intel machine or Apple silicon, with handy setup scripts to get you started. To build it, just run `make` after setup. For the tinkerers, there are separate debug and optimized build commands. If containers are more your jam, Velox also builds with docker-compose.

Jump into the Velox community on Slack or contribute through its open-source channels. The library's governance, components, and maintainers are documented in the open, so anyone can get involved. Velox is licensed under the Apache 2.0 License, so it's ready for you to extend, optimize, and make your own. If optimizing query engines and hefty data processing is your game, Velox might just be the ace up your sleeve.