Basekick Labs Blog

Real stories, technical insights, and product news. Everything about Arc and how we make your data work for you.

Supercharge Arc with the VS Code Database Manager Extension

The other day we shared how you can interact with Arc using Apache Superset for visualization. Today we're even more excited to introduce Arc Database Manager, our Visual Studio Code (VS Code, for friends) extension for working with Arc.

While a standalone UI remains on the roadmap, we wanted to give engineers and developers a better way to manage Arc right now—and what better place than one of the most popular IDEs out there?

In this article I'll walk you through the feature set and show you how to get up and running. Ready? Let's go!

Features

Here's what the extension can do today.

Connection Management

  • Multiple saved connections with secure token storage
  • Quick connection switching
  • Connection health monitoring
  • Visual status indicators in sidebar and status bar

Query Execution

  • SQL IntelliSense with auto-completion for tables, columns, and DuckDB functions
  • Execute queries with Ctrl+Enter / Cmd+Enter
  • Interactive results view with:
    • Export to CSV, JSON, or Markdown
    • Automatic chart visualization for time-series data
    • Table sorting and filtering
    • Execution time and row count statistics

Arc Notebooks

  • Mix SQL and Markdown in a single document (.arcnb files)
  • Execute cells individually or all at once
  • Parameterized queries with variable substitution
  • Export notebooks to Markdown with results
  • Auto-save functionality

Schema Explorer

  • Browse databases and tables in sidebar
  • Right-click context menus for:
    • Show table schema
    • Preview data (first 100 rows)
    • Show table statistics
    • Generate SELECT queries
    • Quick time filters (last hour, today; see the sketch below)
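For example, the last-hour quick filter produces a query along these lines (a sketch only; the table name is illustrative and the exact SQL the extension generates may differ):

SELECT *
FROM telegraf.cpu
WHERE time >= now() - INTERVAL 1 HOUR
ORDER BY time DESC
LIMIT 100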

Data Ingestion

  • CSV Import with guided wizard
    • Auto-detect delimiters and headers
    • Timestamp column selection
    • Batch processing for large files
    • Uses high-performance MessagePack columnar format
  • Bulk Data Generator with 5 presets:
    • CPU Metrics
    • Memory Metrics
    • Network Metrics
    • IoT Sensor Data
    • Custom schemas

Alerting & Monitoring

  • Create alerts based on query results
  • 5 condition types: greater than, less than, equals, not equals, contains
  • Configurable check intervals (minimum 10 seconds)
  • Desktop notifications when alerts trigger
  • Alert history tracking
  • Enable/disable alerts without deletion

Query Management

  • Automatic query history - every query is logged
  • Saved queries - bookmark frequently used queries
  • View execution time, row counts, and errors
  • Quick re-run from history

Token Management

  • Create, rotate, and delete server tokens
  • Verify token validity
  • Secure storage in system keychain
  • Visual token management in sidebar

Dark Mode Support

  • Automatic theme detection - adapts to VS Code theme
  • Works with Light, Dark, and High Contrast themes
  • Theme-aware charts and visualizations
  • No configuration needed

As you can see, we're not taking this lightly—ADM (Arc Database Manager) is packed with features, with more already in the works.

Installation

Arc Database Manager is live in the VS Code Marketplace. Search for Arc Database Manager, install it, and enable auto-updates so you're always running the latest release.

The Arc Database Manager extension in the VS Code Marketplace

Once it's installed, connect Arc Database Manager to your Arc instance.

Connecting ADM to Arc

In the status bar (bottom left of VS Code) you'll see Arc: Not connected. Click it to open the command palette and provide:

  • Name: Arc Server or any label you prefer
  • Host: http://localhost if Arc is running locally, or https://my-arc-server behind a reverse proxy
  • Port: 8000 by default, or 443 for HTTPS
  • Protocol: http or https
  • Authentication Token: Grab this from the Arc logs on first run

If everything checks out, your connections list will look something like this.

The Arc connections list in the VS Code sidebar

Querying Arc

As mentioned earlier, the extension lets you query Arc and browse the schema without leaving your editor.

In this example I used the telegraf database and the cpu measurement/table. (We're working on a native Telegraf integration—stay tuned!)
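The query itself can be as simple as the sketch below. It assumes Telegraf's standard cpu fields (usage_system, usage_user, and the host tag); adjust the columns to whatever your collector actually writes:

SELECT time, host, usage_system, usage_user
FROM telegraf.cpu
ORDER BY time DESC
LIMIT 50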

Databases, tables, and queries in the Arc VS Code extension

Once the query is ready, execute it with:

  • Mac: Cmd+Enter
  • Windows/Linux: Ctrl+Enter

Results open in a new VS Code tab.

Query results in a new VS Code tab

From there you can export to CSV or JSON, copy the results as Markdown, or toggle a chart view—like the one below.

Toggling the chart view on query results

Notebooks

Notebook support was one of the first features I wanted to include—here's how to use it.

Open the command palette (Cmd+Shift+P on macOS, Ctrl+Shift+P on Windows/Linux) and type Arc: New Notebook.

The Arc: New Notebook command in the command palette

Save the file—I named mine new-notebook.arcnb—and the notebook opens automatically.

An Arc notebook open in VS Code

From here the floor is yours. Arc makes it easy to pull data from different databases into a single view. Imagine logs and system utilization side by side in the same notebook to correlate issues in real time. Powerful.
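As a sketch of what that can look like, one SQL cell pulls system utilization while another pulls logs over the same window. Only telegraf.cpu comes from this walkthrough; the logs.app_events table and its columns are hypothetical:

-- Cell 1: system utilization from Telegraf
SELECT time, host, usage_system
FROM telegraf.cpu
WHERE time >= now() - INTERVAL 1 HOUR
ORDER BY time

-- Cell 2: hypothetical application logs over the same window
SELECT time, host, level, message
FROM logs.app_events
WHERE level = 'ERROR' AND time >= now() - INTERVAL 1 HOUR
ORDER BY time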

Correlating data from different databases in an Arc notebook

Alerts

Alerts are one of my favorite features because they surface outliers instantly.

I set up a quick check on CPU usage. Click Active Alerts, give the alert a name, add your query (mine was select usage_system from telegraf.cpu), set the threshold, and choose how often it should run. I went with a 60-second interval.

High CPU Usage
Query: select usage_system from telegraf.cpu
Condition: greater than 0
Check Interval: 60s
Status: Enabled
Last Check: 10/23/2025, 10:12:59 AM
Last Result: 12.008733624627581
Triggered: 1 time

When the alert fires you'll see a notification like this—pretty handy, right?

An alert notification in VS Code

You can review alert history in the same panel and dive into what triggered each event.

The alert history panel

Disabling an alert when you're done takes just a couple of clicks.

A disabled alert
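One last note on alert queries: the greater than 0 condition above is fine for a quick demo, but it will fire on essentially every check. A less noisy pattern is to aggregate over a recent window and alert on the result (illustrative only; tune the window and threshold for your environment):

SELECT AVG(usage_system) AS avg_usage_system
FROM telegraf.cpu
WHERE time >= now() - INTERVAL 5 MINUTE

Pair a query like this with a greater than condition of, say, 80 and the same 60-second check interval.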

To conclude

This walkthrough only scratches the surface of what the Visual Studio Code extension for Arc can do. We're continuing to improve it so working with Arc feels effortless for both administrators and engineers.

Want to collaborate? The project is open source:

Basekick-Labs/arc-vscode-extension: https://github.com/Basekick-Labs/arc-vscode-extension

You'll find full instructions, examples, and release notes in the Marketplace listing:

Arc Database Manager on the Visual Studio Marketplace: https://marketplace.visualstudio.com/items?itemName=basekick-labs.arc-db-manager

And of course, here's the Arc repo itself—can't wait to see what you build with it.

Basekick-Labs/arc: https://github.com/Basekick-Labs/arc

Until next time.

Outrunning the Giants: How Arc Became the Fastest Time-Series Engine on ClickBench

It's been a crazy two weeks since we released the code of Arc Core (OSS) on GitHub on October 7. But as you read in the first blog post, Arc is a project that's been in the works for 3 years, and now, with the release of the code, we felt ready to compare Arc against the giants of the time series industry.

And the way we did that was with ClickBench, a benchmark widely recognized in the industry for measuring the speed of the systems being tested.

Before going into the actual numbers, let me share how I feel about benchmarks: they're a vanity metric, something you can claim, but in reality not many use cases need, say, 2.42M RPS. In my experience, only a very small number of companies, like streaming ones, ingest at that rate.

That said, just because only a small number of projects need that level of performance doesn't mean software should be slow or ignore performance. Competition elevates us and makes us better; we clearly enjoy these results, and they also set the north star for the improvements we still need to make.

With that out of the way, let me introduce the numbers that show Arc is not only the fastest time series database/data warehouse but also, thanks to the decision to use DuckDB as our SQL engine, one of the top 8 fastest systems out there.

Arc is the fastest time series database/data warehouse

Let me start with a graph that shows how we perform on c6a.4xlarge (combined), which is the standard for ClickBench. We outperform QuestDB (they're doing a great job) and TimescaleDB in Combined. In Cold, we win too. In Hot, QuestDB takes the lead; it looks like their caching is very aggressive.

You can see the current values yourself on the ClickBench site.

Arc vs. QuestDB and TimescaleDB on c6a.4xlarge (ClickBench)

Now, if we look at all the system sizes, we lead in every one of them in the Cold run, except c6a.metal, where QuestDB is doing a great job.

Here's a screenshot of those results; if you want the full results and want to dive into the specific queries, see Arc fastest Time series database in ClickBench.

What about comparing to other systems?

Well, Arc does an excellent job there too: if we look at the entire zoo of systems out there, Arc is one of the fastest. We're only behind a few monsters in the industry: DuckDB, ClickHouse, and CedarDB.

Arc's ranking among the fastest database systems on ClickBench

OK, we got it: you're doing pretty well. What's the secret?

Something that gives us an advantage is the design decisions we made when we started building Arc. We didn't try to reinvent the wheel; we built on the shoulders of DuckDB, which we chose as our SQL engine.

We also chose MessagePack, which offers incredible performance. Combined with the zero-copy passthrough into the columnar Parquet format at the core of Arc, it allows us to process 2.42M requests per second. It's much the same at the query level: we use Apache Arrow in the same zero-copy, columnar fashion, so you get your data back blazing fast.

In between, we implemented our own set of procedures to avoid locking the system and to properly multi-thread processes like compaction and the WAL, offering data durability with very little overhead.

This is just the beginning, with the premise of offering something really performant that can scale and isn't a monster (right now Arc is approximately 5,500 lines of code).

What can you build with Arc?

Now, beyond the numbers, let me share what this performance means for real-world applications. Arc's speed isn't just about bragging rights—it's about unlocking use cases that were previously challenging or expensive to implement.

IoT and Smart Devices: Imagine you're managing millions of sensors sending data every second. With Arc's 2.42M RPS ingestion capability, you can handle massive IoT deployments without breaking a sweat. Whether it's smart cities, industrial sensors, or connected vehicles, Arc can ingest, store, and query your telemetry data in real-time without requiring a massive infrastructure investment.

Logistics and Supply Chain: Track your entire fleet, warehouse operations, or shipment movements with millisecond-level precision. Arc's query speed means you can run complex analytics on location data, delivery times, and route optimization in real-time, helping you make decisions faster and keep your operations running smoothly.

Aerospace and Aviation: When you're dealing with flight data, aircraft telemetry, or air traffic patterns, speed and reliability aren't optional. Arc can handle high-frequency aircraft sensor data, ground station telemetry, and flight operations analytics while giving you instant query responses for safety-critical decision-making.

Observability and Monitoring: If you're running modern cloud infrastructure, you know that observability data can quickly become overwhelming. Arc excels at ingesting metrics, logs, and traces at scale. The zero-copy columnar format means you can query across billions of data points to troubleshoot issues, analyze system performance, or detect anomalies without waiting around.

Financial Services and Trading: Market data, transaction logs, and trading signals generate massive time-series datasets. Arc's sub-millisecond query performance means you can run real-time analytics on market trends, backtest trading strategies, or monitor risk across your entire portfolio without compromise.

The key advantage? You get all this performance without the operational complexity. Arc's small footprint (5,500 lines of code) means fewer moving parts, easier debugging, and a system you can actually understand and maintain.

In the future

As I said, this is just the beginning and we're pumped to keep shipping features that will allow Arc to not only be the most performant, but the go-to option for time series/analytics use cases. The community has been responding—as of the time I'm writing this, we have 252 stars on GitHub in only 13 days since the Arc Core repo went live.

We've already answered more than 100 emails and comments all over the place asking about Arc, how it works, and what our plans are (roadmap coming soon).

We want to keep pushing in this direction, keep talking with the community and potential new partners, and shape Arc into something that solves your problems.

Get started with Arc

Ready to try the fastest time series database? Here's how to get started:

Check out the Arc Core repository on GitHub and star it if you find it interesting: https://github.com/Basekick-Labs/arc

Join our growing community, ask questions, share your use cases, or contribute to the project. We're actively responding to issues, discussions, and are excited to hear what you're building.

Again, this is just the beginning. Thank you for the love we're getting, and don't forget to stay tuned to this blog to get news about Basekick Labs and Arc, the fastest time series database.

Building Lightning-Fast Dashboards: Connect Arc with Apache Superset

Hey there! Welcome to our first hands-on tutorial for Arc. Today, I'm going to walk you through connecting Arc with Apache Superset and show you how to visualize your time-series data like a pro.

If you haven't heard of Apache Superset yet, it's basically a super powerful open-source data visualization platform. Think of it as your Swiss Army knife for creating dashboards and exploring data. And the best part? Superset plays really well with Arc through not one, but two different connection methods.

We've got two Arc dialects for Superset:

  • arc-superset-dialect: Uses JSON for data transfer. Perfect for getting started and works great for smaller datasets.
  • arc-superset-arrow: Uses Apache Arrow IPC format. This one's the speed demon; we're talking 28-75% faster queries depending on your dataset size. If you're moving large amounts of data, this is your go-to.

Both options give you full SQL support, API key authentication, and automatic discovery of all your Arc databases and tables. Pretty neat, right?

Want to learn more about Superset? Check out https://superset.apache.org/

Getting Started with Docker

Alright, let's get our hands dirty. The easiest way to get Arc running is with Docker. We'll get Arc up and running first, then set up Superset separately with the Arc dialect.

Setting up Arc

Arc comes with a Docker Compose setup. First, clone the Arc repository:

git clone https://github.com/Basekick-Labs/arc.git
cd arc

The Arc docker-compose.yml already has everything configured, but we need to expose the API port. Modify the arc service in docker-compose.yml to add the ports section:

services:
  arc:
    build: .
    container_name: arc
    restart: unless-stopped
    environment:
      STORAGE_BACKEND: local
      LOCAL_STORAGE_PATH: /data/arc/
      DB_PATH: /data/arc.db
      ARC_QUERY_CACHE_TTL: 60
      ARC_LOG_LEVEL: INFO
      ARC_DISABLE_WAL: "true"
    ports:
      - "8000:8000"  # Add this line to expose Arc API
    volumes:
      - arc-db:/data
      - arc-data:/data/arc
      - arc-logs:/app/logs
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
 
volumes:
  arc-db:
  arc-data:
  arc-logs:

Fire it up:

docker-compose up -d

This will build Arc from source. The first build might take a few minutes, so perfect time for that coffee break!

Check if it's running:

docker-compose ps

You should see the arc container up and healthy.

Getting Your Arc API Token

Arc uses token-based authentication to keep your data secure. The good news? Arc automatically creates a token for you on first startup. The bad news? You'll only see it one time, so get it from the first run.

Check the Arc logs to grab your token:

docker logs arc | grep "Initial admin API token"

You should see something like:

======================================================================
FIRST RUN - INITIAL ADMIN TOKEN GENERATED
======================================================================
Initial admin API token: mytokenisthebest
======================================================================
SAVE THIS TOKEN! It will not be shown again.
Use this token to login to the web UI or API.
You can create additional tokens after logging in.
======================================================================

Save that token! You'll need it for connecting Superset to Arc. Pro tip: export it to your environment for convenience:

export ARC_TOKEN="arc_1234567890abcdef"

Want to see what's happening under the hood? Check the live logs:

docker-compose logs -f arc

Pushing Data to Arc

Before we can visualize anything in Superset, we need some data in Arc. Let me show you a quick example of how to push time-series data to Arc.

Arc has a simple HTTP API for ingesting data. Here's a Python example that pushes some sample metrics:

import msgpack
import requests
from datetime import datetime
import os
 
# Get API token
token = os.getenv("ARC_TOKEN")
 
# All data organized as columns (arrays), not rows
data = {
    "m": "cpu",                    # measurement name
    "columns": {                   # columnar data structure
        "time": [
            int(datetime.now().timestamp() * 1000),
            int(datetime.now().timestamp() * 1000) + 1000,
            int(datetime.now().timestamp() * 1000) + 2000
        ],
        "host": ["server01", "server02", "server03"],
        "region": ["us-east", "us-west", "eu-central"],
        "datacenter": ["aws", "gcp", "azure"],
        "usage_idle": [95.0, 85.0, 92.0],
        "usage_user": [3.2, 10.5, 5.8],
        "usage_system": [1.8, 4.5, 2.2]
    }
}
 
# Send columnar data (2.32M RPS throughput)
response = requests.post(
    "http://localhost:8000/write/v2/msgpack",
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/msgpack",
        "x-arc-database": "system"  # Optional: specify database
    },
    data=msgpack.packb(data)
)
 
# Check response (returns 204 No Content on success)
if response.status_code == 204:
    print(f"Successfully wrote {len(data['columns']['time'])} records!")
else:
    print(f"Error {response.status_code}: {response.text}")

Save this in a file, let's say data.py, and then execute it. (Run it several times so we have some data to graph later.)

python3 data.py

The output should look something like this:

Successfully wrote 3 records!

Arc will automatically create the database and table if they don't exist. Pretty convenient!

Setting up Superset with Arc Dialect

Now that Arc is running, let's get Superset set up with the Arc dialect. You have two main options here: install the dialect into an existing Superset instance, or build a custom Superset Docker image that includes the Arc dialect.

Option 1: Installing into Existing Superset (Easiest)

If you already have Superset running (or want to set it up from scratch), you can simply install the Arc dialect via pip. You have two dialect choices:

JSON Dialect (Good for Getting Started)

This is the straightforward option that works great for most use cases:

pip install arc-superset-dialect

Then restart Superset:

superset run -h 0.0.0.0 -p 8088

Arrow Dialect (High Performance)

Want those 28-75% speed gains? Install the Arrow-based dialect instead:

pip install arc-superset-arrow

Then restart Superset:

superset run -h 0.0.0.0 -p 8088

Pro tip: You can actually have both installed and create separate connections to Arc using each dialect. This lets you test performance differences side-by-side!

Option 2: Building Superset Docker Image with Arc

If you prefer a containerized setup, you can build a custom Superset Docker image with the Arc dialect baked in. Both GitHub repos include Dockerfiles that handle everything for you.

For JSON dialect:

git clone https://github.com/basekick-labs/arc-superset-dialect.git
cd arc-superset-dialect
docker build -t superset-arc:latest .
docker run -d -p 8088:8088 -v superset_home:/app/superset_home \
  --name superset-arc superset-arc:latest

For Arrow dialect:

git clone https://github.com/basekick-labs/arc-superset-arrow.git
cd arc-superset-arrow
docker build -t superset-arc:latest .
docker run -d -p 8088:8088 -v superset_home:/app/superset_home \
  --name superset-arc superset-arc:latest

Both images come with Superset pre-configured, the Arc dialect installed, automatic database initialization, and default admin credentials (admin/admin – definitely change those in production!).

Connecting Superset to Arc

Now for the fun part – let's connect Superset to Arc so we can start building dashboards.

Accessing Superset

Open your browser and head to http://localhost:8088. The default credentials are:

  • Username: admin
  • Password: admin

(Change these in production, obviously!)

Adding Arc as a Data Source

Once you're logged in:

  1. Click the "Settings" menu in the top right, select "Database Connections", and then click + Database

  2. In the Supported Databases dropdown menu, select Other, as shown in the image below:

Superset Arc Connection

  3. Enter your connection string:

    arc://arc_1234567890abcdef@arc:8000/system
    

    Breaking this down:

    • arc:// - the protocol
    • arc_1234567890abcdef - your API key
    • arc:8000 - hostname and port (we use arc because that's the Docker service name, port 8000)
    • system - the database name
  4. Test the connection by clicking the "Test Connection" button. If everything's configured correctly, you should see a success message!

  5. Click "Connect" and give your database a friendly name like "Arc Metrics"

Superset Arc Connection

Exploring Your Arc Data

After connecting, Superset will automatically discover all your tables in Arc. Head over to "Datasets", click + Dataset, and select the Arc Metrics database, the system schema, and the cpu table. The result should look something like this:

Arc Superset Dataset

Click Create dataset and create chart, and let's build a panel to visualize the data we just ingested.

Arc Superset new chart

Once there, select time as the X-axis and pick one of the metrics under Metrics; in our case, let's use usage_system. You can also add a filter for a specific server. In this case, I filtered the data for server02.
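Under the hood, a chart configured this way boils down to something like the following query (just a sketch; Superset generates its own SQL from the chart settings):

SELECT time, usage_system
FROM system.cpu
WHERE host = 'server02'
ORDER BY time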

In my case, the chart looks simple, but it's a good signal that it works.

Arc Superset chart

We can save this chart to use it in a dashboard by clicking the Save button at the top right.

Running Queries in Superset

Now that we've seen how to create a chart to use in a dashboard, let's run some queries against our Arc data! Head to SQL from the top menu, select SQL Lab and let's do it.
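Before anything fancy, a quick sanity check: this count confirms the rows we pushed earlier with data.py (the system database, cpu table) are visible:

SELECT COUNT(*) AS rows_ingested
FROM system.cpu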

Basic Query Example

Here's a simple query to get the average CPU usage per server:

SELECT
    host,
    AVG(usage_system) as avg_usage_system,
    AVG(usage_user) as avg_usage_user,
    AVG(usage_idle) as avg_usage_idle
FROM system.cpu
GROUP BY host
ORDER BY avg_usage_system DESC

Arc's DuckDB-powered SQL engine handles this instantly. You'll see results appear in the query editor.

Arc Superset SQL Lab

Time-Series Query with Arc

Since Arc is built for time-series data, let's do something more interesting with time bucketing:

SELECT
    DATE_TRUNC('hour', time) as time_bucket,
    host,
    AVG(usage_system) as avg_cpu,
    MAX(usage_system) as max_cpu,
    MIN(usage_system) as min_cpu
FROM system.cpu
WHERE time >= NOW() - INTERVAL '24 hours'
GROUP BY time_bucket, host
ORDER BY time_bucket DESC, host

This groups data by hour and gives us nice aggregations. Arc handles time-series queries like this effortlessly.

Arc Superset Time Bucketing

Advanced Query: Moving Averages

Want to calculate a moving average? Arc supports window functions:

SELECT
    time,
    host,
    usage_system,
    AVG(usage_system) OVER (
        PARTITION BY host
        ORDER BY time
        ROWS BETWEEN 5 PRECEDING AND CURRENT ROW
    ) as moving_avg_cpu
FROM system.cpu
ORDER BY time DESC
LIMIT 100

The beauty of using Arc with Superset is that all these queries execute quickly thanks to Arc's columnar storage. And if you're using the Arrow dialect, you'll notice even better performance on large result sets.

Arc Superset Moving Average

Performance: JSON vs Arrow

Quick note on choosing between the two Arc dialects for Superset:

Use arc-superset-dialect (JSON) when:

  • You're just getting started
  • Your result sets are relatively small (< 10k rows)
  • You want the simplest setup

Use arc-superset-arrow (Arrow) when:

  • You're working with large datasets
  • Query performance is critical
  • You need 28-75% faster query execution
  • You're building dashboards with real-time data

Both dialects support the exact same SQL features and Arc functionality. The only difference is how data gets transferred from Arc to Superset. Arrow uses a columnar, zero-copy format that's way more efficient for large result sets.

Wrapping Up

And that's it! You now have Arc connected to Apache Superset and can start building awesome dashboards for your time-series data.

The combination of Arc's lightning-fast columnar storage and Superset's flexible visualization capabilities is pretty powerful. Whether you choose the JSON dialect for simplicity or the Arrow dialect for maximum performance, you're getting a solid stack for time-series analytics.

Both Arc and Superset are under active development, so expect even more features and better integration in the future. If you run into any issues or have questions, feel free to open an issue on the GitHub repos for arc-superset-dialect or arc-superset-arrow.

Happy dashboarding!

Hello World, We are Basekick Labs and We Bring Arc to the Game

Welcome to Basekick Labs, where we're building Arc. But what is Arc?

Arc is a high-performance time-series data warehouse for engineers building observability platforms, IoT systems, and real-time analytics. It's built on DuckDB and Parquet with 2.01M records/sec ingestion, SQL analytics, and flexible storage options for unlimited scale. Most importantly, Arc is the fastest time-series database.

But wait—where did all this start? How did we get here? Those are great questions, and we're going to answer them by telling you a story.

Where It All Started

To understand where we came from, we need to go back to November 2022. While working in the time-series database industry, I bought a separate computer for personal side projects—something I could keep independent for after-hours work. In my spare time, I started building an API that could automate the deployment of various time-series tools across different cloud providers. The idea was to create an API-first platform that offered hosted and managed versions of these technologies.

I worked on that until June 2023, when I decided to pursue this idea full-time. At that moment, I was focused on providing technical coverage in Latin America for time-series software, with a heavy focus on IoT use cases.

We had some customers, but as we evolved the platform to integrate more cloud providers, I knew I needed to work on something we owned. A database? Maybe. But the core idea was simple: stop depending on others to build our business. Eventually, I wanted to migrate customers we were hosting on InfluxDB, TimescaleDB, QuestDB, and other databases to something we controlled end-to-end.

I created a pilot called Wavestream. It was very primitive, offering about 20k writes per second on 8 cores and 16GB of RAM. At that point, it was clear I had a lot of work to do.

That venture ended in 2024, and I started analyzing how I could keep moving forward. I recalled the Wavestream idea, but the code was gone. For me, it was attached to negative moments. If I was going to start something new, it needed to be 100% fresh.

The Birth of Historian (and Eventually Arc)

So I started again. I built something that, one year later in 2025, I called Historian. The platform was created to offer tiered storage for InfluxDB v1.x and 2.x, with eventual support for other databases. The key innovation was storing data in Parquet files on S3.

That worked well. I was able to sell this to specific InfluxDB Enterprise customers who were experiencing scaling issues.

During the development of Historian, I included DuckDB as the engine for running SQL queries on Parquet files. I introduced time-based indexing. And one step at a time, Historian was turning from an archive solution into something much bigger.

So, Today?

Today, that bigger thing is Arc. As I stated earlier, Arc is a time-series data warehouse built for speed, with a record of 2.01M records/sec on local NVMe in a single node. It combines DuckDB, Parquet, and flexible storage (local/MinIO/S3/GCS).

More importantly, Arc is the fastest time-series database, with a cold run time of 36.43 seconds on 99.9 million rows, beating VictoriaLogs (3.3x), QuestDB (6.5x), Timescale Cloud (18.2x), and TimescaleDB (29.7x).

The core version is published on GitHub under the AGPL-3.0 license. We believe in open formats, transparent benchmarks, and building tools that engineers actually want to use.

Arc's Key Features

Here are some of Arc's standout features:

  • High-Performance Ingestion: MessagePack binary protocol (recommended), InfluxDB Line Protocol (drop-in replacement), JSON
  • Multi-Database Architecture: Organize data by environment, tenant, or application with database namespaces
  • Write-Ahead Log (WAL): Optional durability feature for zero data loss (disabled by default)
  • Automatic File Compaction: Merges small Parquet files into larger ones for 10-50x faster queries (enabled by default)
  • DuckDB Query Engine: Fast analytical queries with SQL, cross-database joins, and advanced analytics
  • Flexible Storage Options: Local filesystem (fastest), MinIO (distributed), AWS S3/R2 (cloud), or Google Cloud Storage
  • Data Import: Import data from InfluxDB, TimescaleDB, HTTP endpoints
  • Query Caching: Configurable result caching for improved performance

The Response Has Been Incredible

But this is just the beginning. Six days after publishing the code, the response has been amazing. We hit over 200 stars on GitHub, and I've spent most of the last five or six days responding to questions on Reddit, Hacker News, email, and even Russian blogs that noticed Arc and called us "the ClickHouse killer", something I don't believe for a minute. ClickHouse is a great tool, and while we're doing well, we're focused specifically on the time-series category.

Here's our GitHub star growth:

GitHub star growth chart

We've also already built an integration to connect Apache Superset to Arc, and we're working to get that documented on the Superset documentation website. Over the weekend, we also started building an output plugin for Telegraf, the collector from InfluxData.

Why Are We Here?

That's a great question. We're here to offer a modern and performant time-series technology that doesn't change engines with every version, doesn't confuse people with too many versions, isn't mounted over another database, and isn't pivoting to logs because they're trying to kill Datadog.

We're here for the engineers.

We're here for those who need to collect data from the edge to understand how their trucks are performing and where they are. We're here to give insights to doctors about the health of their patients. We're here to empower teams developing solutions that clean CO2 from the environment. We're here for platform engineers who need to understand how their infrastructure performs.

And with Arc Core, we're here to help students and hobbyists monitor their grills and plants, track planes and vessels, and more.

This Is Just the Start

We're incredibly excited about what we're building, and we want you to test it. Let us know the good, but especially the bad. Over the weekend, we learned about several issues that helped us improve from 1.95M RPS to 2.01M RPS. That's what open source is all about—it's not just code, it's feedback too.

Check out Arc on GitHub:

Basekick-Labs/arc: https://github.com/Basekick-Labs/arc

Welcome to Basekick Labs. Welcome to Arc. Let's make history.

Ready to get started?

Build your next observability platform with Arc

Deploy in minutes with Docker or native mode. Ingest millions of metrics per second, query millions of rows in seconds, and scale from edge to cloud.